Age | Commit message | Author | |
---|---|---|---|
2015-09-28 | Fixed a parse bug with the html5lib tree-builder. Thanks to Roel | Leonard Richardson | |
Kramer for the patch. [bug=1483781] | |||
2015-09-28 | Improved the implementation of CSS selector grouping. Thanks to Orangain for | Leonard Richardson | |
the patch. [bug=1484543] | |||
2015-09-28 | Corrected the output of Declaration objects. [bug=1477847] | Leonard Richardson | |
2015-09-28 | Fixed a bug that deranged the tree when part of it was | Leonard Richardson | |
removed. Thanks to Eric Weiser for the patch and John Wiseman for a test. [bug=1481520] | |||
2015-07-05 | Fixed the test_detect_utf8 test so that it works when chardet is | Leonard Richardson | |
installed. [bug=1471359] | |||
2015-06-28 | It's now possible to pickle a BeautifulSoup object no matter which | Leonard Richardson | |
tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545] | |||
2015-06-28 | Changed the way soup objects work under copy.copy(). Copying a | Leonard Richardson | |
NavigableString or a Tag will give you a new object that's equal to the old one but not connected to the parse tree. Patch by Martijn Peters. [bug=1307490] | |||
2015-06-28 | Copying a NavigableString will give you a new NavigableString that is not | Leonard Richardson | |
connected to the parse tree. | |||
2015-06-28 | Raise a NotImplementedError whenever an unsupported CSS pseudoclass | Leonard Richardson | |
is used in select(). Previously some cases did not result in a NotImplementedError. | |||
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the | Leonard Richardson | |
Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | |||
2015-06-25 | Introduced the select_one() method, which uses a CSS selector but | Leonard Richardson | |
only returns the first match, instead of a list of matches. [bug=1349367] | |||
2015-06-25 | The text argument to the find_* methods is now called string, | Leonard Richardson | |
which is more accurate. text still works, but string is the argument described in the documentation. text may eventually change its meaning, but not for a very long time. [bug=1366856] | |||
2015-06-25 | Improved the exception raised when you call .unwrap() or | Leonard Richardson | |
.replace_with() on an element that's not attached to a tree. | |||
2015-06-25 | __repr__ now returns an ASCII bytestring in Python 2, and a Unicode string | Leonard Richardson | |
in Python 3, instead of a UTF-8-encoded bytestring in both versions. [bug=1420131] | |||
2015-06-25 | Fixed a crash in Unicode, Dammit's encoding detector when the name | Leonard Richardson | |
of the encoding itself contained invalid bytes. [bug=1360913] | |||
2015-06-24 | The select() method can now find tags with attributes whose names | Leonard Richardson | |
contain dashes. Patch by Marek Kapolka. [bug=1304007] | |||
2015-06-23 | Force object_was_parsed() to keep the tree intact even when an element | Leonard Richardson | |
from later in the document is moved into place. [bug=1430633] | |||
2014-12-11 | Improved the lxml tree builder's handling of processing | Leonard Richardson | |
instructions. [bug=1294645] | |||
2014-12-11 | The select() method can now find tags whose names contain | Leonard Richardson | |
dashes. Patch by Francisco Canas. [bug=1276211] | |||
2014-12-10 | The select() method now supports selector grouping. Patch by | Leonard Richardson | |
Francisco Canas. [bug=1191917] | |||
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly | Leonard Richardson | |
name a parser. | |||
2013-10-02 | Fixed a bug that caused Unicode data put into UnicodeDammit to | Leonard Richardson | |
return None instead of the original data. [bug=1214983] | |||
2013-10-01 | Fixed a crash when a short input contains data not valid in | Leonard Richardson | |
filenames. [bug=1232604] | |||
2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII | Leonard Richardson | |
when checking whether or not it was a file on disk. [bug=1227016] | |||
2013-08-19 | Combined two tests to stop a spurious test failure when tests are | Leonard Richardson | |
run by nosetests. [bug=1212445] | |||
2013-08-15 | Make sure the optimized find_all() ResultSets actually contain the right data. | Leonard Richardson | |
2013-08-13 | Fixed yet another problem with the html5lib tree builder, caused by | Leonard Richardson | |
html5lib's tendency to rearrange the tree during parsing. [bug=1189267] | |||
2013-08-12 | All find_all calls should now return a ResultSet object. Patch by | Leonard Richardson | |
Aaron DeVore. [bug=1194034] | |||
2013-06-03 | A NavigableString object now has an immutable '.name' property whose | Leonard Richardson | |
value is always None. This makes it easier to iterate over a mixed list of tags and strings without having to check whether each element is a tag or a string. | |||
2013-06-03 | Let's get some profiling going. | Leonard Richardson | |
2013-06-03 | Test that the filename warning isn't given unless the file actually exists | Leonard Richardson | |
on disk. | |||
2013-06-03 | Beautiful Soup will issue a warning if instead of markup you pass it | Leonard Richardson | |
a URL or the name of a file on disk (a common beginner mistake). | |||
2013-06-02 | Merged in big encoding-detection refactoring branch. | Leonard Richardson | |
2013-06-02 | Turns out we had two bits of code to strip byte-order marks. | Leonard Richardson | |
2013-06-02 | It turns out most of the untested code wasn't doing anything useful. | Leonard Richardson | |
2013-05-31 | Reverted the patch that gives NavigableString a .name property, because | Leonard Richardson | |
that's too big an API change for a bugfix release. | |||
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson | |
2013-05-30 | Split out the code that guesses at encodings from the code that tries to | Leonard Richardson | |
decode a bytestring based on those encodings. This is necessary because lxml wants to do the decoding itself. | |||
2013-05-20 | The default XML formatter will now replace ampersands even if they appear to | Leonard Richardson | |
be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183] | |||
2013-05-20 | A NavigableString object now has an immutable '.name' property whose | Leonard Richardson | |
value is always None. This makes it easier to iterate over a mixed list of tags and strings without having to check whether each element is a tag or a string. | |||
2013-05-20 | Gave new_string() the ability to create subclasses of | Leonard Richardson | |
NavigableString. [bug=1181986] | |||
2013-05-20 | html5lib now supports Python 3. Fixed some Python 2-specific | Leonard Richardson | |
code in the html5lib test suite. [bug=1181624] | |||
2013-05-20 | Fixed test failures when lxml is not installed. | Leonard Richardson | |
2013-05-15 | Merge. | Leonard Richardson | |
2013-05-14 | Added a deprecation warning to has_key(). | Leonard Richardson | |
2013-05-09 | Changed lxml.feed() to handle the eventuality that it may be given a bytestring. | Leonard Richardson | |
2013-05-08 | A CSS selector should never match the same tag twice. | Leonard Richardson | |
2013-05-08 | Added tests. | Leonard Richardson | |
2013-05-08 | Aaand... it's now trivial to implement sibling selectors. | Leonard Richardson | |
2013-05-08 | OK, the tests pass. | Leonard Richardson | |