Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 12 |
1 files changed, 6 insertions, 6 deletions
@@ -1,11 +1,11 @@
-html5lib has its own Unicode, Dammit-like system. Converting the input
-to Unicode should be up to the builder. The lxml builder would use
-Unicode, Dammit, and the html5lib builder would be a no-op.
-
 Bare ampersands should be converted to HTML entities upon output.
 
-It should also be possible to convert certain Unicode characters to
-HTML entities upon output.
+It should also be possible to, on output, convert to HTML entities any
+Unicode characters found in htmlentitydefs.codepoint2name. (This
+algorithm would allow me to simplify Unicode, Dammit--convert
+everything to Unicode, and then convert to entities upon output, not
+treating smart quotes differently from any other Unicode character
+that can be represented as an entity.)
 
 XML handling:
 
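The added TODO item describes an output step: any character with an entry in htmlentitydefs.codepoint2name is written as its named entity. A minimal sketch of that idea follows. It is not code from this commit. It assumes Python 3, where the same table lives in html.entities, and the function name substitute_entities is hypothetical.

```python
# Python 3 location of the table the TODO calls htmlentitydefs.codepoint2name.
from html.entities import codepoint2name


def substitute_entities(text):
    """Replace each character that has a named HTML entity with that entity.

    Hypothetical helper for illustration only. It expects already-decoded
    text (str), not markup: the table also maps '&', '<', '>' and '"', so
    running it over a whole document would escape the tags themselves.
    """
    pieces = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        # Characters with no named entity pass through unchanged.
        pieces.append("&%s;" % name if name else ch)
    return "".join(pieces)


if __name__ == "__main__":
    # Smart quotes get no special treatment; they are just two more
    # code points that happen to have named entities.
    print(substitute_entities("\u201cCaf\u00e9\u201d"))
```

Because the lookup is uniform, the smart-quote case mentioned in the TODO needs no separate branch, which is the simplification the note is after. Note that this sketch escapes every ampersand it sees, so it only covers the "bare ampersands" item when applied to text that contains no existing entity references.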