path: root/bs4
Age | Commit message | Author
2015-06-28 | Accept 'xml' as an unambiguous identifier for the lxml XML parser, since it's the only XML parser supported at the moment. | Leonard Richardson
2015-06-28 | Raise a NotImplementedError whenever an unsupported CSS pseudoclass is used in select(). Previously some cases did not result in a NotImplementedError. | Leonard Richardson
2015-06-27 | Added another layer of security to catch cases where lxml and html5lib are not installed. | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson
2015-06-26 | Added a sanity check helper method that makes sure all the elements of a tree are properly connected via .next_element and .previous_element. | Leonard Richardson
2015-06-25 | Introduced the select_one() method, which uses a CSS selector but only returns the first match, instead of a list of matches. [bug=1349367] | Leonard Richardson
2015-06-25 | The text argument to the find_* methods is now called string, which is more accurate. text still works, but string is the argument described in the documentation. text may eventually change its meaning, but not for a very long time. [bug=1366856] | Leonard Richardson
2015-06-25 | Make it possible to invoke the Tag() constructor without providing a builder. [bug=1307471] | Leonard Richardson
2015-06-25 | You can now create a NavigableString or a subclass just by invoking the constructor. [bug=1294315] | Leonard Richardson
2015-06-25 | Improved the exception raised when you call .unwrap() or .replace_with() on an element that's not attached to a tree. | Leonard Richardson
2015-06-25 | __repr__ now returns an ASCII bytestring in Python 2, and a Unicode string in Python 3, instead of a UTF8-encoded bytestring in both versions. [bug=1420131] | Leonard Richardson
2015-06-25 | Fixed a crash in Unicode, Dammit's encoding detector when the name of the encoding itself contained invalid bytes. [bug=1360913] | Leonard Richardson
2015-06-24 | Fixed an import error in Python 3.5 caused by the removal of the | Leonard Richardson
2015-06-24 | Made double sure that we don't use the 'strict' constructor argument when it's deprecated. [bug=1341055] | Leonard Richardson
2015-06-24 | If the initial <html> tag contains a CDATA list attribute such as 'class', the html5lib tree builder will now turn its value into a list, as it would with any other tag. [bug=1296481] | Leonard Richardson
2015-06-24 | The select() method can now find tags with attributes whose names contain dashes. Patch by Marek Kapolka. [bug=1304007] | Leonard Richardson
2015-06-24 | Improved docstring for encode_contents() and decode_contents(). [bug=1441543] | Leonard Richardson
2015-06-23 | Made the previous fix nicer by adding arguments to setup() that let us preserve a tag's existing place in the tree. | Leonard Richardson
2015-06-23 | Got a hacky fix for the latest html5lib problem. | Leonard Richardson
2015-06-23 | Force object_was_parsed() to keep the tree intact even when an element from later in the document is moved into place. [bug=1430633] | Leonard Richardson
2014-12-11 | Improved the lxml tree builder's handling of processing instructions. [bug=1294645] | Leonard Richardson
2014-12-11 | The select() method can now find tags whose names contain dashes. Patch by Francisco Canas [bug=1276211] | Leonard Richardson
2014-12-10 | The warning when you pass in a filename or URL as markup will now be displayed correctly even if the filename or URL is a Unicode string. [bug=1268888] | Leonard Richardson
2014-12-10 | The select() method now supports selector grouping. Patch by Francisco Canas [bug=1191917] | Leonard Richardson
2014-12-07 | In Python 3.4 and above, set the new convert_charrefs argument to the html.parser constructor to avoid a warning and future failures. Patch by Stefano Revera. [bug=1375721] | Leonard Richardson
2014-12-07 | Tweaked the parser warning. | Leonard Richardson
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson
2013-10-18 | Fixed yet another problem that caused the html5lib tree builder to create a disconnected parse tree. [bug=1237763] | Leonard Richardson
2013-10-02 | Restored the helpful syntax error that happens when you try to import the Python 2 edition of Beautiful Soup under Python 3. [bug=1213387] | Leonard Richardson
2013-10-02 | Prep for release. | Leonard Richardson
2013-10-02 | Fixed a bug that caused Unicode data put into UnicodeDammit to return None instead of the original data. [bug=1214983] | Leonard Richardson
2013-10-01 | Fixed a crash when a short input contains data not valid in filenames. [bug=1232604] | Leonard Richardson
2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII when checking whether or not it was a file on disk. [bug=1227016] | Leonard Richardson
2013-08-19 | Combined two tests to stop a spurious test failure when tests are run by nosetests. [bug=1212445] | Leonard Richardson
2013-08-15 | Bumped version number. | Leonard Richardson
2013-08-15 | Make sure the optimized find_all() ResultSets actually contain the right data. | Leonard Richardson
2013-08-13 | Fixed yet another problem with the html5lib tree builder, caused by html5lib's tendency to rearrange the tree during parsing. [bug=1189267] | Leonard Richardson
2013-08-12 | Fixed incorrect superclass in super() call. | Leonard Richardson
2013-08-12 | All find_all calls should now return a ResultSet object. Patch by Aaron DeVore. [bug=1194034] | Leonard Richardson
2013-08-12 | A little cleanup. | Leonard Richardson
2013-06-03 | Updated NEWS. | Leonard Richardson
2013-06-03 | A NavigableString object now has an immutable '.name' property whose value is always None. This makes it easier to iterate over a mixed list of tags and strings without having to check whether each element is a tag or a string. | Leonard Richardson
2013-06-03 | _last_descendant can be optimized in some cases. | Leonard Richardson
2013-06-03 | Save another Element creation. | Leonard Richardson
2013-06-03 | Improved performance for html5lib. | Leonard Richardson
2013-06-03 | Added raw html5lib to the list of parsers that get tested. | Leonard Richardson
2013-06-03 | Changed _popToTag to run through a single range instead of two. | Leonard Richardson
2013-06-03 | Improved _popToTag a tiny bit. | Leonard Richardson
2013-06-03 | Inlined some commonly called code to save a function call. | Leonard Richardson
2013-06-03 | Limit how much of the document is searched via regular expression for a declared encoding. | Leonard Richardson