path: root/bs4/tests/test_htmlparser.py
Age | Commit message | Author
2023-02-15 | When the html.parser parser decides it can't parse a document, Beautiful Soup now consistently propagates this fact by raising a ParserRejectedMarkup error. [bug=2007343] | Leonard Richardson
2023-01-27 | Got rid of some more warnings by removing code that's not relevant anymore, now that the minimum supported Python version is 3.6. | Leonard Richardson
2021-10-24 | Issue a warning when an HTML parser is used to parse a document that looks like XML but not XHTML. [bug=1939121] | Leonard Richardson
2021-10-11 | Moved the test classes to tests/__init__.py. | Leonard Richardson
2021-10-09 | Moved testing.py into the same package as the tests. | Leonard Richardson
2021-09-12 | Ported unit tests to use pytest. | Leonard Richardson
2021-09-07 | Goodbye, Python 2. [bug=1942919] | Leonard Richardson
2021-05-31 | The html.parser tree builder can now handle named entities found in the HTML5 spec in much the same way that the html5lib tree builder does. Note that the lxml tree builder still handles named entities differently. [bug=1924908] | Leonard Richardson
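The difference between the HTML 4 and HTML5 entity tables that this commit refers to can be seen with the standard library alone. This is a minimal sketch, not Beautiful Soup's own code: it assumes that "named entities found in the HTML5 spec" corresponds to the table exposed as `html.entities.html5`, and it uses `apos` only as a convenient example of a name that is absent from the HTML 4 table.

```python
from html import unescape
from html.entities import html5, name2codepoint

# name2codepoint holds the HTML 4 entity names (no trailing semicolon);
# html5 holds the larger HTML5 table, keyed by names such as "apos;".
entity = "apos"
in_html4 = entity in name2codepoint
in_html5 = (entity + ";") in html5

# html.unescape() resolves named references using the HTML5 table.
decoded = unescape("It&apos;s &lt;b&gt;")

print(in_html4, in_html5, decoded)
```

A tree builder that only consults the HTML 4 table would leave `&apos;` unresolved, which is the kind of gap the commit message describes closing.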
2021-04-08 | Brought in fuzz tests from the oss-project into Beautiful Soup's unit test suite. | Leonard Richardson
2020-05-17 | Documented some recently added customization features. | Leonard Richardson
2020-05-17 | Added a keyword argument on_duplicate_attribute to the BeautifulSoupHTMLParser constructor (used by the html.parser tree builder) which lets you customize the handling of markup that contains the same attribute more than once, as in: <a href="url1" href="url2"> [bug=1878209] | Leonard Richardson
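The idea behind on_duplicate_attribute can be sketched with the standard library's `html.parser`, which reports repeated attributes as separate `(name, value)` pairs. This is an illustration under stated assumptions, not Beautiful Soup's implementation: the class `DuplicateAttributeDemo` and the helpers `accumulate` and `href_under` are hypothetical names, and the three policies shown ("replace", "ignore", or a callable) follow the options described in Beautiful Soup's documentation as I understand them.

```python
from html.parser import HTMLParser


class DuplicateAttributeDemo(HTMLParser):
    """Hypothetical helper: merges each start tag's attributes under a policy."""

    def __init__(self, policy="replace"):
        super().__init__()
        self.policy = policy  # "replace", "ignore", or a callable
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, so duplicates are visible.
        merged = {}
        for key, value in attrs:
            if key not in merged:
                merged[key] = value
            elif self.policy == "replace":
                merged[key] = value  # the last value wins
            elif self.policy == "ignore":
                continue  # the first value wins
            else:
                self.policy(merged, key, value)  # caller decides
        self.tags.append((tag, merged))


def accumulate(attributes_so_far, key, value):
    """Collect every value of a repeated attribute into a list."""
    if not isinstance(attributes_so_far[key], list):
        attributes_so_far[key] = [attributes_so_far[key]]
    attributes_so_far[key].append(value)


def href_under(policy):
    parser = DuplicateAttributeDemo(policy)
    parser.feed('<a href="url1" href="url2">link</a>')
    parser.close()
    return parser.tags[0][1]["href"]


replaced = href_under("replace")
ignored = href_under("ignore")
collected = href_under(accumulate)
print(replaced, ignored, collected)
```

Passing a callable keeps the policy decision out of the parser, which is presumably why the keyword argument accepts one.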
2019-07-21 | Implemented line number tracking for html5lib. | Leonard Richardson
2019-07-21 | Adapt Chris Mayo's code to track line number and position when using html.parser. | Leonard Richardson
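Line and position tracking with html.parser most likely builds on `HTMLParser.getpos()`, which the standard library provides. The sketch below shows only that underlying call; `PositionRecorder` is a hypothetical class, and how Beautiful Soup stores the result on its own objects is not shown here.

```python
from html.parser import HTMLParser


class PositionRecorder(HTMLParser):
    """Hypothetical helper: records where each start tag begins."""

    def __init__(self):
        super().__init__()
        self.positions = []

    def handle_starttag(self, tag, attrs):
        # getpos() returns (line number, column offset) for the tag being
        # handled: lines are counted from 1, offsets from 0.
        line, offset = self.getpos()
        self.positions.append((tag, line, offset))


markup = "<html>\n<body>\n  <p>text</p>\n</body>\n</html>"
recorder = PositionRecorder()
recorder.feed(markup)
recorder.close()

for tag, line, offset in recorder.positions:
    print(tag, line, offset)
```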
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978] | Leonard Richardson
2018-07-15 | Stop data loss when encountering an empty numeric entity, and possibly in other cases. Thanks to tos.kamiya for the fix. [bug=1698503] | Leonard Richardson
2018-07-14 | Stopped HTMLParser from raising an exception in very rare cases of bad markup. [bug=1708831] | Leonard Richardson
2017-05-06 | Improved the handling of empty-element tags like <br> when using the html.parser parser. [bug=1676935] | Leonard Richardson
2015-06-28 | It's now possible to pickle a BeautifulSoup object no matter which tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545] | Leonard Richardson
2012-04-18 | Fixed a bug that made the HTMLParser treebuilder generate XML definitions ending with two question marks instead of one. [bug=984258] | Leonard Richardson
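One plausible mechanism for a doubled question mark is that `HTMLParser.handle_pi()` hands over XML-style processing instructions with the trailing `?` still attached. The commit message does not say this was the cause, so treat the following as an assumption-laden sketch; `PIRecorder` and the `naive`/`fixed` variables are hypothetical.

```python
from html.parser import HTMLParser


class PIRecorder(HTMLParser):
    """Hypothetical helper: records processing instructions as received."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def handle_pi(self, data):
        self.instructions.append(data)


recorder = PIRecorder()
recorder.feed('<?xml version="1.0" encoding="utf-8"?><p>text</p>')
recorder.close()

raw = recorder.instructions[0]

# Wrapping the raw data in "<?" and "?>" duplicates the question mark.
naive = "<?" + raw + "?>"

# Dropping one trailing "?" before wrapping restores the original form.
fixed = "<?" + (raw[:-1] if raw.endswith("?") else raw) + "?>"

print(naive)
print(fixed)
```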
2012-02-23 | Added basic namespace tests. | Leonard Richardson
2012-02-22 | Removed tests that merely illustrated parser behavior, behavior that wouldn't break Beautiful Soup if it changed. | Leonard Richardson
2012-02-20 | It's now possible to copy a BeautifulSoup object created with the html.parser treebuilder. | Leonard Richardson
2012-02-20 | Temporarily skip the deepcopy test when lxml is not installed. | Leonard Richardson
2012-02-20 | lxml tests are once again run and pass when lxml is installed. | Leonard Richardson
2012-02-20 | Changed the class structure so that the default parser test class uses html.parser. | Leonard Richardson
2012-02-15 | Added a kind of hacky way to interpret the restriction class='foo bar'. Stop generating a space before the slash that closes an empty-element tag. | Leonard Richardson
2012-02-06 | Monkeypatch Python 3.2 versions prior to 3.2.3 to solve some major HTMLParser bugs. | Leonard Richardson
2012-01-20 | Made it easier to convert BS3 code to BS4. | Leonard Richardson
2012-01-20 | Replaced assertEquals with assertEqual to get rid of deprecation notice. | Leonard Richardson