AgeCommit messageAuthor
2018-12-26Remove dead line of codeIsaac Muse
2018-12-25Ensure html5lib always has valid internal linkageIsaac Muse
html5lib, with malformed HTML, can end up with detached linkage internally. Improve the current code to ensure html5lib always has proper linkage.
2018-12-24Issue a warning and raise a more useful exception if someone tries to callLeonard Richardson
Tag.select() without SoupSieve installed.
2018-12-24Keep track of the namespace abbreviations found while parsing the document.Leonard Richardson
This makes select() work most of the time without requiring a value for 'namespaces'.
2018-12-24Rewrote select() documentation and namespace example.Leonard Richardson
2018-12-23Merging Isaac Muse's Soup Sieve branch as-is before making some modifications.Leonard Richardson
2018-12-23Merged in next_previous_fixes from Isaac Muse. [bug=1782928,1798699]Leonard Richardson
2018-12-22Fix next and previous linkage issues. Fixes issues #1806598 and #1782928.Isaac Muse
2018-12-20Pass flags to soupsieve.Isaac Muse
2018-12-19Add Soup Sieve supportIsaac Muse
2018-08-12Bump up to version 4.6.3 so I can re-release.Leonard Richardson
2018-08-12Converted README to Markdown format.Leonard Richardson
2018-07-30Fix an exception when a custom formatter was asked to format a voidLeonard Richardson
element. [bug=1784408]
2018-07-28Prep for release.Leonard Richardson
2018-07-28When markup contains duplicate elements, a select() call thatLeonard Richardson
includes multiple match clauses will match all relevant elements. [bug=1770596]
2018-07-28Correctly handle invalid HTML numeric character entities like &#147;Leonard Richardson
which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml. [bug=1782933]
2018-07-21Clarified the deprecation warning when accessing tag.fooTag, to coverLeonard Richardson
the possibility that you might really have been looking for a tag called 'fooTag'.
2018-07-21Fixed a problem where the html.parser tree builder interpretedLeonard Richardson
a string like '&foo ' as the character entity '&foo;' [bug=1728706]
2018-07-21Include LICENSE in the manifest. [bug=1736563]Leonard Richardson
2018-07-19Clarified phrasing.Leonard Richardson
2018-07-18Fixed a bug where find_all() was not working when asked to find aLeonard Richardson
tag with a namespaced name in an XML document that was parsed as HTML. [bug=1723783]
2018-07-18Preserve XML namespaces when they are introduced inside an XMLLeonard Richardson
document, not just the ones introduced at the top level. [bug=1718787]
2018-07-15You can pass a dictionary of attrs intoLeonard Richardson
BeautifulSoup.new_tag. This makes it possible to create a tag with an attribute like 'name' that would otherwise be masked by another argument of new_tag. [bug=1779276]
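
The masking problem described in this entry can be sketched as follows. This is a minimal illustration, assuming a Beautiful Soup release that includes this change and the built-in html.parser; the markup and attribute values are made up for the example.

```python
from bs4 import BeautifulSoup

# Illustrative markup; any document would do.
soup = BeautifulSoup("<form></form>", "html.parser")

# The first argument of new_tag() is the tag's own name, so an HTML
# attribute that is also called 'name' cannot be passed as a keyword
# argument. Passing it through the attrs dictionary avoids the clash.
field = soup.new_tag("input", attrs={"name": "email", "type": "text"})
soup.form.append(field)

print(field["name"])
print(field.name)
```

Here field.name is the tag name, while field["name"] is the attribute supplied through attrs.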
2018-07-15Corrected some typos in the documentation.Leonard Richardson
2018-07-15Introduced the Formatter system. [bug=1716272].Leonard Richardson
2018-07-15It's possible for a TreeBuilder subclass to specify that voidLeonard Richardson
elements should be represented as <element> rather than <element/>, by setting TreeBuilder.void_element_close_prefix to the empty string. [bug=1716272]
2018-07-15Improved the 'no parser specified' warning so it doesn't show up in a REPL.Leonard Richardson
2018-07-15Stop data loss when encountering an empty numeric entity, andLeonard Richardson
possibly in other cases. Thanks to tos.kamiya for the fix. [bug=1698503]
2018-07-14Fixed a disconnected parse tree when one BeautifulSoup object wasLeonard Richardson
inserted into another. [bug=1105148]
2018-07-14Fix an error in the warning when run from REPL.Leonard Richardson
2018-07-14Bring in some more code from warnings.py.Leonard Richardson
2018-07-14Improve the technique for finding the line number with the problematicLeonard Richardson
method call.
2018-07-14Stopped HTMLParser from raising an exception in very rare cases ofLeonard Richardson
bad markup. [bug=1708831]
2018-07-14Fixed a Windows crash in diagnose() when checking whether a longLeonard Richardson
markup string is a filename. [bug=1737121]
2018-07-14Fixed code that was causing deprecation warnings in recent Python 3Leonard Richardson
versions. Includes a patch from Ville Skyttä. [bug=1778909] [bug=1689496]
2018-07-14Improve the warning given when no parser is specified. [bug=1780571]Leonard Richardson
2017-10-01Fix two typos in docstt.
2017-05-07Prep for 4.6.0 release.Leonard Richardson
2017-05-07Namespace prefix is preserved when an XML tag is copied. ThanksLeonard Richardson
to Vikas for a patch and test. [bug=1685172]
2017-05-07Corrected formatting of warning.Leonard Richardson
2017-05-06Replace get_attribute_text with get_attribute_list.Leonard Richardson
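
As a rough sketch of what the replacement method offers, the snippet below compares Tag.get() with Tag.get_attribute_list(). It assumes the built-in html.parser, and the markup is invented for illustration.

```python
from bs4 import BeautifulSoup

# Illustrative markup with one multi-valued attribute (class)
# and one single-valued attribute (id).
soup = BeautifulSoup('<p class="lead intro" id="first">Hi</p>', "html.parser")
p = soup.p

# Tag.get() returns a list for a multi-valued attribute such as class,
# but a plain string for a single-valued attribute such as id.
class_via_get = p.get("class")
id_via_get = p.get("id")

# Tag.get_attribute_list() wraps a single value in a list, so the
# caller can treat both kinds of attribute the same way.
class_as_list = p.get_attribute_list("class")
id_as_list = p.get_attribute_list("id")

print(class_via_get, id_via_get)
print(class_as_list, id_as_list)
```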
2017-05-06Improved the handling of empty-element tags like <br> when using theLeonard Richardson
html.parser parser. [bug=1676935]
2017-05-06Renamed convenience method to get_attribute_text.Leonard Richardson
2017-05-06Added the method, which acts like forLeonard Richardson
getting the value of an attribute, but which joins attribute multi-values into a single string value. [bug=1678589]
2017-05-06HTML parsers treat all HTML4 and HTML5 empty element tags (aka void elementLeonard Richardson
tags) correctly. [bug=1656909]
2017-05-06It's now possible to use a tag's namespace prefix when searching,Leonard Richardson
e.g. soup.find('namespace:tag') [bug=1655332]
2017-05-06Implement ResultSet.__getattr__ to give a helpful message in a common errorLeonard Richardson
scenario.
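
The error scenario this entry probably has in mind is treating the list returned by find_all() as if it were a single element. A minimal sketch, assuming the built-in html.parser and invented markup; the exact wording of the message may differ between releases, so it is only printed here.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    "<a href='/one'>one</a><a href='/two'>two</a>", "html.parser"
)
links = soup.find_all("a")  # a ResultSet (a list of tags), not a single tag

message = ""
try:
    # Mistake: asking the whole result set for a per-tag attribute.
    links.text
except AttributeError as error:
    message = str(error)

print(message)

# What was most likely intended: handle each matched tag in turn.
texts = [link.text for link in links]
print(texts)
```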
2017-05-06Change no-parser-specified warning to avoid the implication that you shouldLeonard Richardson
put your markup into square brackets.
2017-01-02I need to do another release because of an error while running the releaseLeonard Richardson
script.
2017-01-02Prep for 4.5.2 release.Leonard Richardson