Diffstat (limited to 'NEWS.txt')
-rw-r--r-- NEWS.txt 25
1 files changed, 21 insertions, 4 deletions
diff --git a/NEWS.txt b/NEWS.txt
index a3485e7..3d0846f 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -1,9 +1,22 @@
= 4.3.0 (Unreleased) =
-* A NavigableString object now has an immutable '.name' property whose
- value is always None. This makes it easier to iterate over a mixed
- list of tags and strings without having to check whether each
- element is a tag or a string.
+* Instead of converting incoming data to Unicode and feeding it to the
+ lxml tree builder, Beautiful Soup now makes successive guesses at
+ the encoding of the incoming data, and tells lxml to parse the data
+ as that encoding. This improves performance and avoids an issue in
+ which lxml was refusing to parse strings because they were Unicode
+ strings.
+
+ This required a major overhaul of the tree builder architecture. If
+ you wrote your own tree builder and didn't tell me, you'll need to
+ modify your prepare_markup() method.
+
+* The UnicodeDammit code that makes guesses at encodings has been
+ split into its own class, EncodingDetector. A lot of apparently
+ redundant code has been removed from Unicode, Dammit, and some
+ undocumented features have also been removed.
+
+= 4.2.1 (20130531) =
* The default XML formatter will now replace ampersands even if they
appear to be part of entities. That is, "&lt;" will become
@@ -29,6 +42,10 @@
* html5lib now supports Python 3. Fixed some Python 2-specific
code in the html5lib test suite. [bug=1181624]
+* The html.parser treebuilder can now handle numeric attributes in
+ text when the hexadecimal name of the attribute starts with a
+ capital X. Patch by Tim Shirley. [bug=1186242]
+
= 4.2.0 (20130514) =
* The Tag.select() method now supports a much wider variety of CSS