path: root/bs4/builder/_html5lib.py
Age | Commit message | Author
2024-01-17 | Added the correct stacklevel to instances of the XMLParsedAsHTMLWarning. [bug=2034451] | Leonard Richardson
2023-01-27 | Warnings now do their best to provide an appropriate stacklevel, improving the usefulness of the message. [bug=1978744] | Leonard Richardson
2022-04-10 | Fixed another crash when overriding multi_valued_attributes and using the html5lib parser. [bug=1948488] | Leonard Richardson
2021-10-24 | Issue a warning when an HTML parser is used to parse a document that looks like XML but not XHTML. [bug=1939121] | Leonard Richardson
2021-10-23 | Fixed a crash when overriding multi_valued_attributes and using the html5lib parser. [bug=1948488] | Leonard Richardson
2021-09-07 | Goodbye, Python 2. [bug=1942919] | Leonard Richardson
2020-05-17 | Switch entirely to Python 3-style print statements, even in Python 2. | Leonard Richardson
2020-04-05 | Embedded CSS and Javascript is now stored in distinct Stylesheet and Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861] | Leonard Richardson
2019-11-11 | Added a Brazilian Portuguese translation by Cezar Peixeiro. | Leonard Richardson
2019-07-21 | Implemented line number tracking for html5lib. | Leonard Richardson
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978] | Leonard Richardson
2018-12-30 | Fixed a problem with multi-valued attributes where the value contained whitespace. Thanks to Jens Svalgaard for the fix. [bug=1787453] | Leonard Richardson
2018-12-24 | Clarified the software license. | Leonard Richardson
2018-12-22 | Fix next and previous linkage issues. Fixes issues #1806598 and #1782928. | Isaac Muse
2016-12-19 | Fixed foster parenting when html5lib is the tree builder. Thanks to Geoffrey Sneddon for a patch and test. | Leonard Richardson
2016-12-19 | Fixed yet another problem that caused the html5lib tree builder to | Leonard Richardson
2016-07-16 | Beautiful Soup will now work with versions of html5lib greater than 0.99999999. [bug=1603299] | Leonard Richardson
2016-07-16 | Removed imports to pdb, since pdb is not available in some environments. [bug=1491700] | Leonard Richardson
2016-07-16 | Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file. | Leonard Richardson
2015-12-08 | Fix foster parenting with html5lib. This makes all of the html5lib tests pass. Yay! | Geoffrey Sneddon
2015-12-08 | Make TreeBuilderForHtml5lib strictly follow the html5lib API. This slightly changes the constructor (to make soup optional), and adds a testSerializer method so the tests can be run against it. | Geoffrey Sneddon
2015-09-28 | Fixed a parse bug with the html5lib tree-builder. Thanks to Roel Kramer for the patch. [bug=1483781] | Leonard Richardson
2015-06-28 | Changed the way soup objects work under copy.copy(). Copying a NavigableString or a Tag will give you a new NavigableString that's equal to the old one but not connected to the parse tree. Patch by Martijn Peters. [bug=1307490] | Leonard Richardson
2015-06-28 | Fixed a bug where Element.extract() could create an infinite loop in the remaining tree. | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson
2015-06-26 | Added a sanity check helper method that makes sure all the elements of a tree are properly connected via .next_element and .previous_element. | Leonard Richardson
2015-06-24 | If the initial <html> tag contains a CDATA list attribute such as 'class', the html5lib tree builder will now turn its value into a list, as it would with any other tag. [bug=1296481] | Leonard Richardson
2015-06-23 | Got a hacky fix for the latest html5lib problem. | Leonard Richardson
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson
2013-10-18 | Fixed yet another problem that caused the html5lib tree builder to create a disconnected parse tree. [bug=1237763] | Leonard Richardson
2013-08-13 | * Fixed yet another problem with the html5lib tree builder, caused by html5lib's tendency to rearrange the tree during parsing. [bug=1189267] | Leonard Richardson
2013-06-03 | Save another Element creation. | Leonard Richardson
2013-06-03 | Improved performance for html5lib. | Leonard Richardson
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson
2013-05-20 | The .next_element attribute used during parsing was confusingly similar to the .next_element navigation attribute. Renamed the former to _most_recent_element. | Leonard Richardson
2013-05-20 | Fixed another bug by which the html5lib tree builder could create a disconnected tree. [bug=1182089] | Leonard Richardson
2012-08-21 | We don't need a special insertComment method, we just need to make Element.appendChild call object_was_parsed. | Leonard Richardson
2012-08-21 | Fixed a problem with the html5lib builder not handling comments correctly. | Leonard Richardson
2012-04-26 | The test suite now passes when lxml is not installed, whether or not html5lib is installed. [bug=987004] | Leonard Richardson
2012-04-18 | Got rid of contains_substitutions. | Leonard Richardson
2012-03-01 | Added missing __len__ method that stopped html5lib tree builder from working on nested formatting elements. [bug=943246] | Leonard Richardson
2012-02-24 | Warn when SoupStrainer is used with the html5lib tree builder. | Leonard Richardson
2012-02-23 | Bumped version number. | Leonard Richardson
2012-02-16 | It's a start, at least. | Leonard Richardson
2012-02-15 | Clarified comment. | Leonard Richardson
2012-02-15 | Removed _nodeIndex, because the misfeature it works around is now gone. | Leonard Richardson
2012-02-15 | Minor cleanup. | Leonard Richardson
2012-02-15 | Tested and cleaned up html5lib insertBefore. | Leonard Richardson
2012-02-15 | Use append instead of insert. | Leonard Richardson
2012-02-15 | Minor cleanup. | Leonard Richardson