path: root/bs4/tests/test_tree.py
Age | Commit message | Author
2023-01-31 | Consistently use pytest.mark.skipif to skip tests when the corresponding libraries are not installed. | Leonard Richardson
2023-01-27 | Parametrize the 'string is deprecated' warning test so we can test all of the relevant methods. | Leonard Richardson
2023-01-27 | Check the associated filename for more warnings. | Leonard Richardson
2023-01-25 | The HTMLFormatter and XMLFormatter constructors no longer return a value. [bug=1992693] | Leonard Richardson
2023-01-25 | Passing a Tag's .contents into PageElement.extend() now works the same way as passing the Tag itself. | Leonard Richardson
2021-10-24 | Used a warning to formally deprecate the 'text' argument in favor of 'string'. | Leonard Richardson
2021-10-23 | Changing find* tests to use string instead of text, except for one test that specifically checks that text is an alias for string. | Leonard Richardson
2021-10-11 | More test refactoring. | Leonard Richardson
2021-10-11 | Broke up some monolithic unit test files. | Leonard Richardson
2021-10-11 | Moved the test classes to tests/__init__.py. | Leonard Richardson
2021-10-09 | Moved testing.py into the same package as the tests. | Leonard Richardson
2021-09-12 | Ported unit tests to use pytest. | Leonard Richardson
2021-09-07 | Goodbye, Python 2. [bug=1942919] | Leonard Richardson
2021-06-01 | The 'replace_with()' method now takes a variable number of arguments, and can be used to replace a single element with a sequence of elements. Patch by Bill Chandos. | Leonard Richardson
2021-02-14 | NavigableString and its subclasses now implement the get_text() method, as well as the properties .strings and .stripped_strings. These methods will either return the string itself, or nothing, so the only reason to use this is when iterating over a list of mixed Tag and NavigableString objects. [bug=1904309] | Leonard Richardson
2021-02-14 | The 'html5' formatter now treats attributes whose values are the empty string as HTML boolean attributes. Previously (and in other formatters), an attribute value must be set as None to be treated as a boolean attribute. In a future release, I plan to also give this behavior to the 'html' formatter. Patch by Isaac Muse. [bug=1915424] | Leonard Richardson
2021-02-13 | The behavior of methods like .get_text() and .strings now differs depending on the type of tag. The change is visible with HTML tags like <script>, <style>, and <template>. Starting in 4.9.0, methods like get_text() returned no results on such tags, because the contents of those tags are not considered 'text' within the document as a whole. But a user who calls script.get_text() is working from a different definition of 'text' than a user who calls div.get_text()--otherwise there would be no need to call script.get_text() at all. In 4.10.0, the contents of (e.g.) a <script> tag are considered 'text' during a get_text() call on the tag itself, but not considered 'text' during a get_text() call on the tag's parent. Because of this change, calling get_text() on each child of a tag may now return a different result than calling get_text() on the tag itself. That's because different tags now have different understandings of what counts as 'text'. [bug=1906226] [bug=1868861] | Leonard Richardson
2021-02-13 | Corrected the use of special string container classes in cases when a single tag may contain strings with different containers; such as the <template> tag, which may contain both TemplateString objects and Comment objects. [bug=1913406] | Leonard Richardson
2020-09-26 | Fixed a bug that inconsistently moved elements over when passing a Tag, rather than a list, into Tag.extend(). [bug=1885710] | Leonard Richardson
2020-04-12 | Fixed test failures when run against soupselect 2.0. Patch by Tomáš Chvátal. [bug=1872279] | Leonard Richardson
2020-04-05 | Embedded CSS and Javascript is now stored in distinct Stylesheet and Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861] | Leonard Richardson
2020-01-01 | API CHANGE - Added PageElement.decomposed, a new property which lets you check whether you've already called decompose() on a Tag or NavigableString. | Leonard Richardson
2019-12-29 | Fixed an unhandled exception when formatting a Tag that had been decomposed. [bug=1857767] | Leonard Richardson
2019-08-21 | Copying a Tag preserves information that was originally obtained from the TreeBuilder used to build the original Tag. [bug=1838903] | Leonard Richardson
2019-08-21 | Fixed a crash when pretty-printing tags that were not created during initial parsing. [bug=1838903] | Leonard Richardson
2019-07-15 | Implemented Tag.smooth. | Leonard Richardson
2019-07-15 | Moved the formatter to its own class and updated its documentation. | Leonard Richardson
2019-07-15 | Improved comments in tests. | Leonard Richardson
2019-07-14 | Give the Formatter class more control over formatting decisions. | Leonard Richardson
2019-07-07 | A Formatter can now decide how (or whether) to order the attributes inside a tag. [bug=1812422] | Leonard Richardson
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978] | Leonard Richardson
2019-01-06 | Fixed an incorrectly raised exception when inserting a tag before or after an identical tag. [bug=1810692] | Leonard Richardson
2018-12-31 | Improved and tested error checking for insert_before and insert_after. | Leonard Richardson
2018-12-30 | Add conveniences for inserting multiple tags. Add extend method to append a list of tags. Make insert_before and insert_after accept multiple arguments. | Isaac Muse
2018-12-19 | Add Soup Sieve support | Isaac Muse
2018-07-30 | Fix an exception when a custom formatter was asked to format a void element. [bug=1784408] | Leonard Richardson
2018-07-28 | When markup contains duplicate elements, a select() call that includes multiple match clauses will match all relevant elements. [bug=1770596] | Leonard Richardson
2018-07-28 | Correctly handle invalid HTML numeric character entities like &#147; which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml. [bug=1782933] | Leonard Richardson
2018-07-15 | You can pass a dictionary of attrs into BeautifulSoup.new_tag. This makes it possible to create a tag with an attribute like 'name' that would otherwise be masked by another argument of new_tag. [bug=1779276] | Leonard Richardson
2018-07-15 | Introduced the Formatter system. [bug=1716272]. | Leonard Richardson
2018-07-14 | Fixed a disconnected parse tree when one BeautifulSoup object was inserted into another. [bug=1105148] | Leonard Richardson
2018-07-14 | Fixed code that was causing deprecation warnings in recent Python 3 versions. Includes a patch from Ville Skyttä. [bug=1778909] [bug=1689496] | Leonard Richardson
2017-05-06 | Replace get_attribute_text with get_attribute_list. | Leonard Richardson
2017-05-06 | Renamed convenience method to get_attribute_text. | Leonard Richardson
2017-05-06 | Added the method, which acts like for getting the value of an attribute, but which joins attribute multi-values into a single string value. [bug=1678589] | Leonard Richardson
2017-05-06 | It's now possible to use a tag's namespace prefix when searching, e.g. soup.find('namespace:tag') [bug=1655332] | Leonard Richardson
2016-07-26 | Spelling fixes | Ville Skyttä
2016-07-19 | Fixed test that fails in Python 3.5. | Leonard Richardson
2016-07-18 | Pass in bytes so that the BeautifulSoup object always has an original_encoding. | Leonard Richardson
2016-07-18 | If a search against each individual value of a multi-valued attribute fails, the search will be run one final time against the complete attribute value considered as a single string. [bug=1476868] | Leonard Richardson
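
Several of the entries above describe behavior that is easier to see in a short script. The following is a minimal sketch, not code taken from test_tree.py. It assumes a Beautiful Soup release that already includes all four changes (the log names 4.10.0 for the get_text() change) and uses the built-in html.parser tree builder. The markup and variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption, not from the test suite).
soup = BeautifulSoup(
    '<div><p class="a b">one <b>two</b> three</p>'
    "<script>var x = 1;</script></div>",
    "html.parser",
)

# 2016-07-18 entry: neither "a" nor "b" equals "a b", so the search is
# retried against the complete attribute value as a single string.
whole_value_matches = soup.find_all("p", class_="a b")

# 2021-02-13 entry: the contents of <script> count as text when
# get_text() is called on the tag itself, but not on its parent.
script_text = soup.script.get_text()
div_text = soup.div.get_text()

# 2021-06-01 entry: replace_with() accepts several replacements at once.
old_b = soup.b
old_b.replace_with("2", soup.new_tag("i"))
paragraph_markup = str(soup.p)

# 2020-01-01 entry: .decomposed reports whether decompose() was called.
decomposed_before = old_b.decomposed
old_b.decompose()
decomposed_after = old_b.decomposed

print(whole_value_matches)
print(repr(script_text), repr(div_text))
print(paragraph_markup)
print(decomposed_before, decomposed_after)
```

On older releases some of these calls behave differently or do not exist, so treat the script as a way to probe the installed version rather than as a specification.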