path: root/bs4/tests/test_tree.py
Age | Commit message | Author
2021-02-13 | The behavior of methods like .get_text() and .strings now differs depending on the type of tag. The change is visible with HTML tags like <script>, <style>, and <template>. Starting in 4.9.0, methods like get_text() returned no results on such tags, because the contents of those tags are not considered 'text' within the document as a whole. But a user who calls script.get_text() is working from a different definition of 'text' than a user who calls div.get_text()--otherwise there would be no need to call script.get_text() at all. In 4.10.0, the contents of (e.g.) a <script> tag are considered 'text' during a get_text() call on the tag itself, but not considered 'text' during a get_text() call on the tag's parent. Because of this change, calling get_text() on each child of a tag may now return a different result than calling get_text() on the tag itself. That's because different tags now have different understandings of what counts as 'text'. [bug=1906226] [bug=1868861] | Leonard Richardson
2021-02-13 | Corrected the use of special string container classes in cases when a single tag may contain strings with different containers, such as the <template> tag, which may contain both TemplateString objects and Comment objects. [bug=1913406] | Leonard Richardson
2020-09-26 | Fixed a bug that inconsistently moved elements over when passing a Tag, rather than a list, into Tag.extend(). [bug=1885710] | Leonard Richardson
2020-04-12 | Fixed test failures when run against soupselect 2.0. Patch by Tomáš Chvátal. [bug=1872279] | Leonard Richardson
2020-04-05 | Embedded CSS and Javascript are now stored in distinct Stylesheet and Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861] | Leonard Richardson
2020-01-01 | API CHANGE - Added PageElement.decomposed, a new property which lets you check whether you've already called decompose() on a Tag or NavigableString. | Leonard Richardson
2019-12-29 | Fixed an unhandled exception when formatting a Tag that had been decomposed. [bug=1857767] | Leonard Richardson
2019-08-21 | Copying a Tag preserves information that was originally obtained from the TreeBuilder used to build the original Tag. [bug=1838903] | Leonard Richardson
2019-08-21 | Fixed a crash when pretty-printing tags that were not created during initial parsing. [bug=1838903] | Leonard Richardson
2019-07-15 | Implemented Tag.smooth. | Leonard Richardson
2019-07-15 | Moved the formatter to its own class and updated its documentation. | Leonard Richardson
2019-07-15 | Improved comments in tests. | Leonard Richardson
2019-07-14 | Give the Formatter class more control over formatting decisions. | Leonard Richardson
2019-07-07 | A Formatter can now decide how (or whether) to order the attributes inside a tag. [bug=1812422] | Leonard Richardson
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978] | Leonard Richardson
2019-01-06 | Fixed an incorrectly raised exception when inserting a tag before or after an identical tag. [bug=1810692] | Leonard Richardson
2018-12-31 | Improved and tested error checking for insert_before and insert_after. | Leonard Richardson
2018-12-30 | Add conveniences for inserting multiple tags. Add extend method to append a list of tags. Make insert_before and insert_after accept multiple arguments. | Isaac Muse
2018-12-19 | Add Soup Sieve support | Isaac Muse
2018-07-30 | Fix an exception when a custom formatter was asked to format a void element. [bug=1784408] | Leonard Richardson
2018-07-28 | When markup contains duplicate elements, a select() call that includes multiple match clauses will match all relevant elements. [bug=1770596] | Leonard Richardson
2018-07-28 | Correctly handle invalid HTML numeric character entities like &#147; which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml. [bug=1782933] | Leonard Richardson
2018-07-15 | You can pass a dictionary of attrs into BeautifulSoup.new_tag. This makes it possible to create a tag with an attribute like 'name' that would otherwise be masked by another argument of new_tag. [bug=1779276] | Leonard Richardson
2018-07-15 | Introduced the Formatter system. [bug=1716272]. | Leonard Richardson
2018-07-14 | Fixed a disconnected parse tree when one BeautifulSoup object was inserted into another. [bug=1105148] | Leonard Richardson
2018-07-14 | Fixed code that was causing deprecation warnings in recent Python 3 versions. Includes a patch from Ville Skyttä. [bug=1778909] [bug=1689496] | Leonard Richardson
2017-05-06 | Replace get_attribute_text with get_attribute_list. | Leonard Richardson
2017-05-06 | Renamed convenience method to get_attribute_text. | Leonard Richardson
2017-05-06 | Added a convenience method, which acts like Tag.get for getting the value of an attribute, but which joins attribute multi-values into a single string value. [bug=1678589] | Leonard Richardson
2017-05-06 | It's now possible to use a tag's namespace prefix when searching, e.g. soup.find('namespace:tag') [bug=1655332] | Leonard Richardson
2016-07-26 | Spelling fixes | Ville Skyttä
2016-07-19 | Fixed test that fails in Python 3.5. | Leonard Richardson
2016-07-18 | Pass in bytes so that the BeautifulSoup object always has an original_encoding. | Leonard Richardson
2016-07-18 | If a search against each individual value of a multi-valued attribute fails, the search will be run one final time against the complete attribute value considered as a single string. [bug=1476868] | Leonard Richardson
2016-07-18 | Corrected an encoding error that happened when a BeautifulSoup object was copied. [bug=1554439] | Leonard Richardson
2016-07-18 | Added support for CSS selector values that contain quoted spaces, such as tag[style="display: foo"]. [bug=1540588] | Leonard Richardson
2016-07-18 | The limit argument to select() now works correctly, though it's not implemented very efficiently. [bug=1520530] | Leonard Richardson
2015-09-28 | Improved the implementation of CSS selector grouping. Thanks to Orangain for the patch. [bug=1484543] | Leonard Richardson
2015-09-28 | Corrected the output of Declaration objects. [bug=1477847] | Leonard Richardson
2015-09-28 | Fixed a bug that deranged the tree when part of it was removed. Thanks to Eric Weiser for the patch and John Wiseman for a test. [bug=1481520] | Leonard Richardson
2015-06-28 | Changed the way soup objects work under copy.copy(). Copying a NavigableString or a Tag will give you a new NavigableString or Tag that's equal to the old one but not connected to the parse tree. Patch by Martijn Peters. [bug=1307490] | Leonard Richardson
2015-06-28 | Copying a NavigableString will give you a new NavigableString that is not connected to the parse tree. | Leonard Richardson
2015-06-28 | Raise a NotImplementedError whenever an unsupported CSS pseudoclass is used in select(). Previously some cases did not result in a NotImplementedError. | Leonard Richardson
2015-06-25 | Introduced the select_one() method, which uses a CSS selector but only returns the first match, instead of a list of matches. [bug=1349367] | Leonard Richardson
2015-06-25 | The text argument to the find_* methods is now called string, which is more accurate. text still works, but string is the argument described in the documentation. text may eventually change its meaning, but not for a very long time. [bug=1366856] | Leonard Richardson
2015-06-25 | Improved the exception raised when you call .unwrap() or .replace_with() on an element that's not attached to a tree. | Leonard Richardson
2015-06-25 | __repr__ now returns an ASCII bytestring in Python 2, and a Unicode string in Python 3, instead of a UTF8-encoded bytestring in both versions. [bug=1420131] | Leonard Richardson
2015-06-24 | The select() method can now find tags with attributes whose names contain dashes. Patch by Marek Kapolka. [bug=1304007] | Leonard Richardson
2015-06-23 | Force object_was_parsed() to keep the tree intact even when an element from later in the document is moved into place. [bug=1430633] | Leonard Richardson
2014-12-11 | The select() method can now find tags whose names contain dashes. Patch by Francisco Canas [bug=1276211] | Leonard Richardson
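
Several of the behaviors recorded in this log can be tried out with a short script. The following is a minimal sketch and is not taken from test_tree.py. It assumes a recent Beautiful Soup 4 release (the log mentions 4.9.0 and 4.10.0 for the get_text() changes) and the built-in html.parser. The markup and all variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, chosen only to exercise the features below.
markup = (
    "<div><p class='a'>one</p><p class='a'>two</p>"
    "<script>alert('hi');</script></div>"
)
soup = BeautifulSoup(markup, "html.parser")
div = soup.div

# select() returns a list of all matches; select_one() returns only the first.
all_matches = soup.select("p.a")
first_match = soup.select_one("p.a")
first_text = first_match.get_text()

# The contents of <script> are not treated as text of the parent tag.
# What script.get_text() itself returns depends on the installed version,
# so it is not shown here.
div_text = div.get_text()

# new_tag() accepts an attrs dictionary, so an attribute called 'name'
# is not confused with the tag-name argument.
anchor = soup.new_tag("a", attrs={"name": "top"})

# extend() appends every item in a list of tags.
div.extend([anchor, soup.new_tag("b")])

# decompose() destroys a tag; .decomposed reports that this has happened.
first_match.decompose()
child_names = [child.name for child in div.children]

print(len(all_matches), first_text, repr(div_text), child_names,
      first_match.decomposed)
```

On older releases some of these calls may behave differently or be missing, which is the reason for the version assumption stated above.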