Diffstat (limited to 'NEWS.txt')
-rw-r--r-- | NEWS.txt | 25 |
1 files changed, 21 insertions, 4 deletions
@@ -1,9 +1,22 @@
 = 4.3.0 (Unreleased) =
 
-* A NavigableString object now has an immutable '.name' property whose
-  value is always None. This makes it easier to iterate over a mixed
-  list of tags and strings without having to check whether each
-  element is a tag or a string.
+* Instead of converting incoming data to Unicode and feeding it to the
+  lxml tree builder, Beautiful Soup now makes successive guesses at
+  the encoding of the incoming data, and tells lxml to parse the data
+  as that encoding. This improves performance and avoids an issue in
+  which lxml was refusing to parse strings because they were Unicode
+  strings.
+
+  This required a major overhaul of the tree builder architecture. If
+  you wrote your own tree builder and didn't tell me, you'll need to
+  modify your prepare_markup() method.
+
+* The UnicodeDammit code that makes guesses at encodings has been
+  split into its own class, EncodingDetector. A lot of apparently
+  redundant code has been removed from Unicode, Dammit, and some
+  undocumented features have also been removed.
+
+= 4.2.1 (20130531) =
 
 * The default XML formatter will now replace ampersands even if they
   appear to be part of entities. That is, "&lt;" will become
@@ -29,6 +42,10 @@
 * html5lib now supports Python 3. Fixed some Python 2-specific
   code in the html5lib test suite. [bug=1181624]
 
+* The html.parser treebuilder can now handle numeric attributes in
+  text when the hexidecimal name of the attribute starts with a
+  capital X. Patch by Tim Shirley. [bug=1186242]
+
 = 4.2.0 (20130514) =
 
 * The Tag.select() method now supports a much wider variety of CSS
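
The "successive guesses" idea in the 4.3.0 entry can be illustrated with a minimal sketch. This is not Beautiful Soup's code: the function name guess_and_decode and the candidate list are hypothetical, and bytes.decode() stands in for the step where the real tree builder hands each guessed encoding to lxml.

```python
from typing import Iterable, Optional, Tuple


def guess_and_decode(
    data: bytes,
    candidates: Iterable[str] = ("ascii", "utf-8", "windows-1252"),
) -> Tuple[str, Optional[str]]:
    """Try each candidate encoding in order and return (text, encoding).

    Hypothetical helper for illustration only. The encoding is None
    when no candidate fits and a lossy fallback was used.
    """
    for encoding in candidates:
        try:
            # First guess that decodes cleanly wins.
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong guess (or unknown codec name): move on to the next one.
            continue
    # No guess worked: decode lossily so the caller still gets text.
    return data.decode("utf-8", errors="replace"), None


if __name__ == "__main__":
    text, encoding = guess_and_decode(b"caf\xe9")
    print(encoding, text)
```

The order of the candidates matters: strict encodings such as ASCII and UTF-8 go first because they reject most mismatched input, while permissive single-byte encodings accept almost anything and so belong at the end.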