author Leonard Richardson <leonard.richardson@canonical.com> 2011-02-18 12:53:33 -0500
committer Leonard Richardson <leonard.richardson@canonical.com> 2011-02-18 12:53:33 -0500
commit b5fa9d7f5579f22f5fe0f7c9dc63e0aa7d29262f
tree f089e9dee8109e0fdfae2589cd8228d4ddee5939
parent 5962a409b04b8a78d78e9186da97bedbb67df8e6
By default, Unicode Dammit converts smart quotes to Unicode characters, not XML entities.
Diffstat (limited to 'TODO')
-rw-r--r-- TODO 12
1 files changed, 6 insertions, 6 deletions
diff --git a/TODO b/TODO
index ea32bbb..887c426 100644
--- a/TODO
+++ b/TODO
@@ -1,11 +1,11 @@
-html5lib has its own Unicode, Dammit-like system. Converting the input
-to Unicode should be up to the builder. The lxml builder would use
-Unicode, Dammit, and the html5lib builder would be a no-op.
-
 Bare ampersands should be converted to HTML entities upon output.
-It should also be possible to convert certain Unicode characters to
-HTML entities upon output.
+It should also be possible to, on output, convert to HTML entities any
+Unicode characters found in htmlentitydefs.codepoint2name. (This
+algorithm would allow me to simplify Unicode, Dammit--convert
+everything to Unicode, and then convert to entities upon output, not
+treating smart quotes differently from any other Unicode character
+that can be represented as an entity.)
 XML handling:
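
The output-time conversion proposed in the added TODO paragraph could look roughly like the sketch below. It assumes Python 3, where the htmlentitydefs.codepoint2name table is available as html.entities.codepoint2name. The function name substitute_entities is hypothetical and is not taken from the project's code.

```python
# Python 3 location of the table the TODO calls htmlentitydefs.codepoint2name.
from html.entities import codepoint2name


def substitute_entities(text):
    """Replace each character that has a named HTML entity with that entity.

    Hypothetical helper: smart quotes get no special handling, they are
    simply looked up in the same table as every other character.
    """
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        # Characters without a named entity pass through unchanged.
        out.append("&%s;" % name if name else ch)
    return "".join(out)


if __name__ == "__main__":
    print(substitute_entities("\u201cHi\u201d & caf\u00e9"))
```

Because codepoint2name also maps &, <, > and ", a single pass like this covers the bare-ampersand item as well. It would double-escape input that already contains entities, so it is only suitable for text that has been fully converted to Unicode first.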