author | Leonard Richardson <leonard.richardson@canonical.com> | 2011-02-10 11:52:30 -0500 |
---|---|---|
committer | Leonard Richardson <leonard.richardson@canonical.com> | 2011-02-10 11:52:30 -0500 |
commit | bb9d9c5dc0af0deefc1a77542c007b7040aa55bb (patch) | |
tree | 1873ec97e3684c4676d1c62177b60e42aa4f1f2b /TODO | |
parent | 749f01e2b664dcbf4f58dfbdcaa4d314f6e3b9ef (diff) |
Ported some more tests demonstrating that entities are converted to Unicode characters on the way in.
Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 5 |
1 files changed, 5 insertions, 0 deletions
@@ -2,6 +2,11 @@ html5lib has its own Unicode, Dammit-like system. Converting the input
 to Unicode should be up to the builder. The lxml builder would use
 Unicode, Dammit, and the html5lib builder would be a no-op.
 
+Bare ampersands should be converted to HTML entities upon output.
+
+It should also be possible to convert certain Unicode characters to
+HTML entities upon output.
+
 ---
 
 Here are some unit tests that fail with HTMLParser.
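
The two TODO items added in this hunk could be sketched roughly as follows. This is only an illustration built on the Python standard library, not the project's implementation: the names `BARE_AMPERSAND`, `escape_bare_ampersands`, and `substitute_named_entities` are hypothetical, and the sketch assumes that "certain Unicode characters" means the non-ASCII characters that have a named HTML entity.

```python
import re
from html.entities import codepoint2name

# Hypothetical helper pattern: an ampersand that does not already start a
# named entity (&amp;) or a numeric character reference (&#38; or &#x26;).
BARE_AMPERSAND = re.compile(r"&(?!#\d+;|#[xX][0-9a-fA-F]+;|\w+;)")


def escape_bare_ampersands(text):
    """Turn bare ampersands into &amp;, leaving existing entities alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


def substitute_named_entities(text):
    """Replace non-ASCII characters with a named HTML entity when one exists.

    Assumption: characters without a named entity are left unchanged.
    """
    def replace(match):
        name = codepoint2name.get(ord(match.group(0)))
        return "&%s;" % name if name else match.group(0)

    return re.sub(r"[^\x00-\x7f]", replace, text)


print(escape_bare_ampersands("AT&T &amp; sons"))
print(substitute_named_entities("caf\u00e9"))
```

Running both steps in that order on output keeps entities that were already well-formed and only touches the characters the TODO items describe.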