path: root/sys/src/cmd/uhtml.c
Age  Commit message  Author
2016-03-13  uhtml: dont trust charset=utf-8 attribute, verify.  cinap_lenrek
when the charset is explicitely specified as utf-8, ignore it for now. we'll assume utf-8 when all bytes have been properly utf-8 encoded.
2015-05-28  uhtml: check if document is valid utf8 even with charset specified  cinap_lenrek
often, documents specify charsets but are really utf-8 encoded. we now try to decode as utf-8 and only if that fails assume the charset specified in the document.
2013-07-14  uhtml: honor default charset -c when not found in document  cinap_lenrek
2013-06-21  uhtml: fix wrong open error handling (fd 0 != fd 1) (thanks BurnZeZ)  cinap_lenrek
2012-08-15  mothra: handle misplaced trailing quotes  cinap_lenrek
2012-07-19  fix strchr \0 bugs  cinap_lenrek
2012-07-16  uhtml: use first match  cinap_lenrek
2012-06-24  mothra/uhtml: properly handle quoting in tags  cinap_lenrek
2012-02-20  uhtml: fix -c override  cinap_lenrek
2012-02-20  uhtml: scan tags only, fix cat fallback, usage, cleanup  cinap_lenrek
2011-10-05  uhtml: assume latin1 if not valid utf8  cinap_lenrek
2011-09-24  html2ms, tcs, mothra, uhtml: threat ' as special entity, add uhtml(1)  cinap_lenrek
2011-09-21  html2ms: table support  cinap_lenrek
2011-09-20  uhtml: remove trailing utf BOM marker, html2ms: fix underline handling and escaping  cinap_lenrek
2011-09-20  uhtml: add html to unicode converter, used by mothra and page/html2ms  cinap_lenrek
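
The charset decision described in the 2011-10-05, 2013-07-14, 2015-05-28 and 2016-03-13 entries can be sketched roughly as below. This is a minimal illustration in portable C, not the code in uhtml.c. The names valid_utf8 and pick_charset are hypothetical. The sketch assumes the declared charset has already been lowercased, treats a sequence truncated at the end of the buffer as invalid, and assumes the -c default is tried before latin1. The log does not confirm any of these details.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if buf[0..n) is well-formed UTF-8, 0 otherwise. */
static int
valid_utf8(const unsigned char *buf, size_t n)
{
	size_t i = 0;

	while(i < n){
		unsigned char c = buf[i];
		unsigned long r;
		size_t len, k;

		if(c < 0x80){			/* plain ASCII byte */
			i++;
			continue;
		}
		if(c >= 0xC2 && c <= 0xDF){	/* 0xC0 and 0xC1 would be overlong */
			len = 2; r = c & 0x1F;
		}else if((c & 0xF0) == 0xE0){
			len = 3; r = c & 0x0F;
		}else if(c >= 0xF0 && c <= 0xF4){
			len = 4; r = c & 0x07;
		}else
			return 0;		/* stray continuation or invalid lead byte */
		if(n - i < len)
			return 0;		/* sequence truncated at end of buffer */
		for(k = 1; k < len; k++){
			if((buf[i+k] & 0xC0) != 0x80)
				return 0;
			r = (r << 6) | (buf[i+k] & 0x3F);
		}
		if(len == 3 && r < 0x800)
			return 0;		/* overlong 3-byte form */
		if(len == 4 && (r < 0x10000 || r > 0x10FFFF))
			return 0;		/* overlong or out of range */
		if(r >= 0xD800 && r <= 0xDFFF)
			return 0;		/* UTF-16 surrogate */
		i += len;
	}
	return 1;
}

/*
 * Choose a charset name for the document bytes.
 * declared: charset found in the document, or NULL (assumed lowercased).
 * fallback: default given on the command line (-c), or NULL.
 */
static const char *
pick_charset(const unsigned char *buf, size_t n,
	const char *declared, const char *fallback)
{
	if(valid_utf8(buf, n))
		return "utf-8";		/* the bytes win over any declaration */
	if(declared != NULL && strcmp(declared, "utf-8") != 0)
		return declared;	/* a utf-8 claim that failed the check is ignored */
	if(fallback != NULL)
		return fallback;
	return "latin1";
}
```

With this ordering, a page that declares iso-8859-1 but contains only valid UTF-8 is treated as UTF-8. A page that declares utf-8 but contains a bare 0xE9 byte falls through to the default or to latin1.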