author    Leonard Richardson <leonard.richardson@canonical.com>  2012-02-09 16:15:56 -0500
committer Leonard Richardson <leonard.richardson@canonical.com>  2012-02-09 16:15:56 -0500
commit    4aff2ee4d6f077e06159c92ab05c0f2ea527c6fa
tree      40951a60046f184794a011a498187053e8ad2a92
parent    caeb168dc47470607b3cd091e1d35db45c089385
As a last-ditch attempt to turn data into Unicode, use errors=replace instead of errors=strict.
Diffstat (limited to 'NEWS.txt')
-rw-r--r-- NEWS.txt 6
1 files changed, 6 insertions, 0 deletions
diff --git a/NEWS.txt b/NEWS.txt
index b1df902..7084cde 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -20,6 +20,12 @@
* Unicode, Dammit now detects the encoding in HTML 5-style <meta> tags
like <meta charset="utf-8" />. [bug=837268]
+* If Unicode, Dammit can't figure out a consistent encoding for a
+ page, it will try each of its guesses again, with errors="replace"
+ instead of errors="strict". This may mean that some data gets
+ replaced with REPLACEMENT CHARACTER, but at least most of it will
+ get turned into Unicode. [bug=754903]
+
* Patched over a bug in html5lib (?) that was crashing Beautiful Soup
on certain kinds of markup. [bug=838800]
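
The fallback described in the new NEWS entry can be sketched roughly as follows. This is only an illustration of the two-pass idea (every guess with errors="strict" first, then every guess again with errors="replace"); it is not the code from this commit. The function name `decode_with_fallback`, its parameters, and its return shape are hypothetical and do not come from Beautiful Soup.

```python
def decode_with_fallback(data, candidate_encodings):
    """Hypothetical helper illustrating the strict-then-replace strategy.

    Returns (text, encoding, contains_replacement_characters), or
    (None, None, False) if no candidate encoding could be used at all.
    """
    # First pass: strict decoding. Second pass: the same guesses again,
    # but undecodable bytes become U+FFFD REPLACEMENT CHARACTER.
    for errors in ("strict", "replace"):
        for encoding in candidate_encodings:
            try:
                text = data.decode(encoding, errors)
            except (UnicodeDecodeError, LookupError):
                # UnicodeDecodeError: the bytes are invalid in this encoding.
                # LookupError: Python does not know this encoding name.
                continue
            return text, encoding, errors == "replace"
    return None, None, False


# Valid UTF-8 is decoded on the strict pass, so nothing is replaced.
clean = decode_with_fallback(b"caf\xc3\xa9", ["ascii", "utf-8"])

# A stray 0xE9 byte is invalid as both ASCII and UTF-8, so the strict
# pass fails and the first guess is retried with errors="replace".
lossy = decode_with_fallback(b"caf\xe9", ["ascii", "utf-8"])
```

Note that in this sketch the lossy pass returns the first guess that decodes at all, which with errors="replace" is usually the very first known encoding. Most of the text survives, at the cost of some characters being replaced.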