Age | Commit message | Author
2016-07-17 | Use a dedicated logger instead of the root logger. [bug=1511661] | Leonard Richardson
2016-07-17 | Use a dedicated logger instead of the root logger. [bug=1511661] | Leonard Richardson
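A minimal sketch of the pattern this entry describes, using only the standard library. The logger name "example_parser" and the parse() function are hypothetical and are not taken from the Beautiful Soup source.

```python
import logging

# Hypothetical dedicated logger; the name "example_parser" is an assumption.
logger = logging.getLogger("example_parser")
# A NullHandler keeps the library quiet unless the application configures logging.
logger.addHandler(logging.NullHandler())


def parse(markup):
    """Hypothetical function that logs through the dedicated logger."""
    if not markup:
        # Goes to "example_parser", not to the root logger via logging.warning().
        logger.warning("Empty markup passed to parse()")
    return markup.strip()
```

With a named logger, an application can silence or redirect this library's messages without touching the root logger's configuration.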
2016-07-17 | When a BeautifulSoup object is pickled but its tree builder cannot be pickled, its .builder attribute is set to None instead of being destroyed. This avoids a performance problem once the object is unpickled. [bug=1523629] | Leonard Richardson
2016-07-17 | Although the previously fixed problem only occurs when using the html5lib tree builder, it's not actually a problem with the tree builder itself. | Leonard Richardson
2016-07-17 | Fixed a bug in the html5lib treebuilder that deranged the tree when a whitespace element was reparented into a tag that contained an identical whitespace element. [bug=1505351] | Leonard Richardson
2016-07-17 | Use known_xml instead of continually adding underscores to is_xml. | Leonard Richardson
2016-07-17 | Whenever possible, keep track ahead of time whether a PageElement is HTML or XML. | Leonard Richardson
2016-07-16 | Beautiful Soup will now work with versions of html5lib greater than 0.99999999. [bug=1603299] | Leonard Richardson
2016-07-16 | We don't run the check for a filename passed in as markup if the 'filename' contains a less-than character; the less-than character indicates it's most likely a very small document. [bug=1577864] | Leonard Richardson
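The guard described in this entry might look roughly like the following. The helper name looks_like_filename is hypothetical, and the newline and 256-character conditions are assumptions added for illustration. Only the less-than rule comes from the entry itself.

```python
import os


def looks_like_filename(markup):
    """Hypothetical helper: should the 'markup is really a filename' check fire?"""
    if "<" in markup:
        # A less-than sign suggests a (possibly tiny) document, so skip the check.
        return False
    if "\n" in markup or len(markup) > 256:
        # Assumed extra conditions: long or multi-line input is unlikely to be a path.
        return False
    return os.path.exists(markup)
```

Checking for "<" first avoids a filesystem call for short inputs such as "<p>hi</p>".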
2016-07-16 | Removed imports of pdb, since pdb is not available in some environments. [bug=1491700] | Leonard Richardson
2016-07-16 | Corrected typo. [bug=1561510] | Leonard Richardson
2016-07-16 | Specify the file and line number when warning about a BeautifulSoup object being instantiated without a parser being specified. [bug=1574647] | Leonard Richardson
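One standard-library way to make a warning point at the caller's file and line is the stacklevel argument of warnings.warn. This is a sketch of that general technique, not necessarily how the commit implements it. The make_soup function and its default parser are hypothetical.

```python
import warnings


def make_soup(markup, parser=None):
    """Hypothetical constructor stand-in that warns when no parser is named."""
    if parser is None:
        # stacklevel=2 attributes the warning to the code that called make_soup().
        warnings.warn(
            "No parser was explicitly specified.", UserWarning, stacklevel=2
        )
        parser = "html.parser"  # assumed fallback for this sketch
    return (markup, parser)
```

Without stacklevel=2 the warning would be reported against the line inside make_soup(), which tells the user nothing about where their own code needs changing.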
2016-07-16 | The contents of <textarea> tags will no longer be modified when the tree is prettified. [bug=1555829] | Leonard Richardson
2016-07-16 | Fixed a Python 3 BytesWarning when a URL was passed in as though it were markup. Thanks to James Salter for a patch and test. [bug=1533762] | Leonard Richardson
2016-07-16 | Added a separate class for XML processing instructions, which have a slightly different format from SGML processing instructions. [bug=1504383] | Leonard Richardson
2016-07-16 | Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file. | Leonard Richardson
2015-11-24 | Fixed a typo in the documentation, fixed by Gene Wood. | Leonard Richardson
2015-11-23 | Fixing typo in example of nth-of-type css selector | Gene Wood
2015-09-28 | Add a __license__ statement to all source files. | Leonard Richardson
2015-09-28 | Fixed a parse bug with the html5lib tree-builder. Thanks to Roel Kramer for the patch. [bug=1483781] | Leonard Richardson
2015-09-28 | Improved the implementation of CSS selector grouping. Thanks to Orangain for the patch. [bug=1484543] | Leonard Richardson
2015-09-28 | Corrected the output of Declaration objects. [bug=1477847] | Leonard Richardson
2015-09-28 | Fixed a bug that deranged the tree when part of it was removed. Thanks to Eric Wieser for the patch and John Wiseman for a test. [bug=1481520] | Leonard Richardson
2015-09-28 | Don't allow inserting None into a tag. | Leonard Richardson
2015-08-06 | Use identity comparisons for tree traversal. Otherwise, different NavigableStrings compare equal. Fixes Bug #1481520 | Eric Wieser
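The difference between equality and identity lookups can be shown with a plain str subclass. The Node class and both helper functions below are hypothetical illustrations, not code from the project.

```python
class Node(str):
    """Hypothetical string-like tree node: equal text compares equal under ==."""


def index_by_equality(children, target):
    # Finds the first child whose text matches, which may be the wrong node.
    for i, child in enumerate(children):
        if child == target:
            return i
    raise ValueError("node not found")


def index_by_identity(children, target):
    # Finds the exact object that was asked for.
    for i, child in enumerate(children):
        if child is target:
            return i
    raise ValueError("node not found")


first, second = Node("\n"), Node("\n")
children = [first, second]
```

Here index_by_equality(children, second) lands on the first whitespace node, while index_by_identity(children, second) lands on the second. That mismatch is the kind of mix-up the entry describes.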
2015-07-05 | Fixed the test_detect_utf8 test so that it works when chardet is installed. [bug=1471359] | Leonard Richardson
2015-07-05 | Added reference to old 'text' name to documentation. | Leonard Richardson
2015-07-03 | Added instructions for final post-release test. | Leonard Richardson
2015-07-03 | Turns out setup.py requiring lxml was never in a released version which is a big relief as we don't need that anymore. | Leonard Richardson
2015-07-03 | Change setup.py to focus on creating wheels. | Leonard Richardson
2015-07-03 | Unicode data cannot have a byte-order mark. Returning early stops a warning from happening. | Leonard Richardson
2015-06-28 | Also include convert-py3k in source distributions. [bug=1304006] | Leonard Richardson
2015-06-28 | Added test-all-versions and the Chinese docs to the manifest. | Leonard Richardson
2015-06-28 | It's now possible to pickle a BeautifulSoup object no matter which tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545] | Leonard Richardson
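A rough sketch of how an object can drop an unpicklable attribute while keeping the attribute name, using the standard __getstate__ hook. Document and UnpicklableBuilder are hypothetical stand-ins. The real classes and their pickling logic may differ.

```python
import pickle


class UnpicklableBuilder:
    """Hypothetical builder that refuses to be pickled."""

    def __reduce__(self):
        raise pickle.PicklingError("this builder cannot be pickled")


class Document:
    """Hypothetical stand-in for a parsed document that holds a builder."""

    def __init__(self, text, builder):
        self.text = text
        self.builder = builder

    def __getstate__(self):
        state = dict(self.__dict__)
        try:
            pickle.dumps(state["builder"])
        except Exception:
            # Keep the attribute, but store None in place of the builder.
            state["builder"] = None
        return state


state = Document("<p>hi</p>", UnpicklableBuilder()).__getstate__()
```

Because the key stays in the state, code that reads .builder after unpickling gets None rather than an AttributeError.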
2015-06-28 | Changed the way soup objects work under copy.copy(). Copying a NavigableString or a Tag will give you a new object that's equal to the old one but not connected to the parse tree. Patch by Martijn Pieters. [bug=1307490] | Leonard Richardson
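The copy behaviour in this entry can be approximated with the __copy__ hook. TextNode and its parent attribute are hypothetical, and this is only a sketch of the idea.

```python
import copy


class TextNode(str):
    """Hypothetical string node that remembers its parent in a tree."""

    parent = None

    def __copy__(self):
        # Build a fresh node with the same text; it starts with no parent.
        return type(self)(str(self))


node = TextNode("hello")
node.parent = "root"  # stand-in for a real parent element
duplicate = copy.copy(node)
```

The duplicate compares equal to the original but is a separate object whose parent is None, so modifying it cannot disturb the original tree.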
2015-06-28 | Copying a NavigableString will give you a new NavigableString that is not connected to the parse tree. | Leonard Richardson
2015-06-28 | Reorganized changelog. | Leonard Richardson
2015-06-28 | Fixed a bug where Element.extract() could create an infinite loop in the remaining tree. | Leonard Richardson
2015-06-28 | Accept 'xml' as an unambiguous identifier for the lxml XML parser, since it's the only XML parser supported at the moment. | Leonard Richardson
2015-06-28 | Raise a NotImplementedError whenever an unsupported CSS pseudoclass is used in select(). Previously some cases did not result in a NotImplementedError. | Leonard Richardson
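A small sketch of rejecting unsupported pseudoclasses up front. The set of supported names and the check_selector_token helper are assumptions for illustration and do not reflect the real selector parser.

```python
# Assumed for this sketch: only one pseudoclass is supported.
SUPPORTED_PSEUDOCLASSES = {"nth-of-type"}


def check_selector_token(token):
    """Hypothetical validator for one selector token such as 'p:nth-of-type(2)'."""
    if ":" in token:
        _tag, _, pseudo = token.partition(":")
        name = pseudo.split("(", 1)[0]
        if name not in SUPPORTED_PSEUDOCLASSES:
            # Fail loudly instead of silently returning no matches.
            raise NotImplementedError(
                "Pseudoclass %r is not supported in this sketch." % name
            )
    return token
```

Raising consistently means a caller can tell "no matches" apart from "this selector is not understood".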
2015-06-27 | Added an example of using a function on an attribute value/using a function to invert a normal search. | Leonard Richardson
2015-06-27 | Added another layer of security to catch cases where lxml and html5lib are not installed. | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson
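The idea behind an exclusion list can be sketched with a toy detector that tries candidate encodings in order. detect_encoding and its candidates parameter are hypothetical. Real encoding detection is considerably more involved.

```python
def detect_encoding(data, candidates, exclude_encodings=()):
    """Hypothetical detector: first candidate that decodes cleanly and is allowed."""
    excluded = {name.lower() for name in exclude_encodings}
    for encoding in candidates:
        if encoding.lower() in excluded:
            continue  # the caller knows this guess is wrong
        try:
            data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
        return encoding
    return None


data = "caf\u00e9".encode("latin-1")
candidates = ["utf-8", "latin-1", "windows-1252"]
```

Excluding an encoding forces the detector to fall through to its next plausible guess rather than stopping at one the caller has ruled out.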
2015-06-26 | Added a sanity check helper method that makes sure all the elements of a tree are properly connected via .next_element and .previous_element. | Leonard Richardson
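A sanity check of this kind could be sketched as a walk along the forward links that verifies each backward link. The Element class and check_links function are hypothetical. Only the attribute names .next_element and .previous_element come from the entry.

```python
class Element:
    """Hypothetical node with the two traversal links named in the entry."""

    def __init__(self, name):
        self.name = name
        self.next_element = None
        self.previous_element = None


def check_links(first):
    """Return True if every next_element link is mirrored by previous_element."""
    if first.previous_element is not None:
        return False
    seen = set()
    current = first
    while current is not None:
        if id(current) in seen:
            return False  # a cycle would otherwise loop forever
        seen.add(id(current))
        following = current.next_element
        if following is not None and following.previous_element is not current:
            return False
        current = following
    return True


a, b, c = Element("a"), Element("b"), Element("c")
a.next_element, b.previous_element = b, a
b.next_element, c.previous_element = c, b
```

The comparisons use `is` rather than `==`, so two nodes with equal content are never mistaken for each other.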
2015-06-25 | Introduced the select_one() method, which uses a CSS selector but only returns the first match, instead of a list of matches. [bug=1349367] | Leonard Richardson
2015-06-25 | The text argument to the find_* methods is now called string, which is more accurate. text still works, but string is the argument described in the documentation. text may eventually change its meaning, but not for a very long time. [bug=1366856] | Leonard Richardson
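Keeping an old keyword working after a rename can be done by accepting it through **kwargs. The find_all function below is a hypothetical stand-in operating on a plain list, not the library's method.

```python
def find_all(nodes, string=None, **kwargs):
    """Hypothetical search: `string` is the documented name, `text` the old alias."""
    if string is None and "text" in kwargs:
        # Honour the legacy keyword only when the new one was not given.
        string = kwargs.pop("text")
    if kwargs:
        raise TypeError("unexpected arguments: %s" % ", ".join(sorted(kwargs)))
    return [node for node in nodes if string is None or node == string]


nodes = ["a", "b", "a"]
```

Both find_all(nodes, string="a") and find_all(nodes, text="a") return the same result, so existing callers keep working while new code uses the documented name.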
2015-06-25 | Make it possible to invoke the Tag() constructor without providing a builder. [bug=1307471] | Leonard Richardson
2015-06-25 | You can now create a NavigableString or a subclass just by invoking the constructor. [bug=1294315] | Leonard Richardson
2015-06-25 | Improved the exception raised when you call .unwrap() or .replace_with() on an element that's not attached to a tree. | Leonard Richardson
2015-06-25 | In Python 3, __str__ now returns a Unicode string instead of a bytestring. [bug=1420131] | Leonard Richardson