path: root/bs4/builder/_html5lib.py
Age | Commit message | Author
2016-07-16 | Beautiful Soup will now work with versions of html5lib greater than 0.99999999. [bug=1603299] | Leonard Richardson
2016-07-16 | Removed imports to pdb, since pdb is not available in some environments. [bug=1491700] | Leonard Richardson
2016-07-16 | Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file. | Leonard Richardson
2015-09-28 | Fixed a parse bug with the html5lib tree-builder. Thanks to Roel Kramer for the patch. [bug=1483781] | Leonard Richardson
2015-06-28 | Changed the way soup objects work under copy.copy(). Copying a NavigableString or a Tag will give you a new NavigableString that's equal to the old one but not connected to the parse tree. Patch by Martijn Peters. [bug=1307490] | Leonard Richardson
2015-06-28 | Fixed a bug where Element.extract() could create an infinite loop in the remaining tree. | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson
2015-06-26 | Added a sanity check helper method that makes sure all the elements of a tree are properly connected via .next_element and .previous_element. | Leonard Richardson
2015-06-24 | If the initial <html> tag contains a CDATA list attribute such as 'class', the html5lib tree builder will now turn its value into a list, as it would with any other tag. [bug=1296481] | Leonard Richardson
2015-06-23 | Got a hacky fix for the latest html5lib problem. | Leonard Richardson
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson
2013-10-18 | Fixed yet another problem that caused the html5lib tree builder to create a disconnected parse tree. [bug=1237763] | Leonard Richardson
2013-08-13 | * Fixed yet another problem with the html5lib tree builder, caused by html5lib's tendency to rearrange the tree during parsing. [bug=1189267] | Leonard Richardson
2013-06-03 | Save another Element creation. | Leonard Richardson
2013-06-03 | Improved performance for html5lib. | Leonard Richardson
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson
2013-05-20 | The .next_element attribute used during parsing was confusingly similar to the .next_element navigation attribute. Renamed the former to _most_recent_element. | Leonard Richardson
2013-05-20 | Fixed another bug by which the html5lib tree builder could create a disconnected tree. [bug=1182089] | Leonard Richardson
2012-08-21 | We don't need a special insertComment method, we just need to make Element.appendChild call object_was_parsed. | Leonard Richardson
2012-08-21 | Fixed a problem with the html5lib builder not handling comments correctly. | Leonard Richardson
2012-04-26 | The test suite now passes when lxml is not installed, whether or not html5lib is installed. [bug=987004] | Leonard Richardson
2012-04-18 | Got rid of contains_substitutions. | Leonard Richardson
2012-03-01 | Added missing __len__ method that stopped html5lib tree builder from working on nested formatting elements. [bug=943246] | Leonard Richardson
2012-02-24 | Warn when SoupStrainer is used with the html5lib tree builder. | Leonard Richardson
2012-02-23 | Bumped version number. | Leonard Richardson
2012-02-16 | It's a start, at least. | Leonard Richardson
2012-02-15 | Clarified comment. | Leonard Richardson
2012-02-15 | Removed _nodeIndex, because the misfeature it works around is now gone. | Leonard Richardson
2012-02-15 | Minor cleanup. | Leonard Richardson
2012-02-15 | Tested and cleaned up html5lib insertBefore. | Leonard Richardson
2012-02-15 | Use append instead of insert. | Leonard Richardson
2012-02-15 | Minor cleanup. | Leonard Richardson
2012-02-09 | As a last-ditch attempt to turn data into Unicode, use errors=replace instead of errors=strict. | Leonard Richardson
2012-02-09 | Patched over a bug in html5lib (?) that was crashing Beautiful Soup on certain kinds of markup. [bug=838800] | Leonard Richardson
2012-02-08 | Added missing import. | Leonard Richardson
2011-05-21 | More Python 3 compatibility. | Leonard Richardson
2011-02-27 | Renamed the beautifulsoup module to bs4 to save typing. | Leonard Richardson
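
The 2012-02-09 entry about errors=replace refers to a standard Python decoding option. The following is a minimal standard-library sketch of that idea, not the Beautiful Soup code itself: the helper name decode_with_fallback and its encodings parameter are hypothetical, chosen only for illustration. It tries each candidate encoding strictly and, only if all of them fail, decodes with errors="replace" so that undecodable bytes become U+FFFD instead of raising.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8",)) -> str:
    """Hypothetical helper: strict decoding first, errors='replace' as a last resort."""
    for encoding in encodings:
        try:
            # errors="strict" raises UnicodeDecodeError on any invalid byte.
            return data.decode(encoding, errors="strict")
        except UnicodeDecodeError:
            continue
    # Last-ditch attempt: never raises; each undecodable sequence
    # is replaced with U+FFFD (the Unicode replacement character).
    return data.decode(encodings[0], errors="replace")


raw = b"caf\xe9"  # Latin-1 bytes; not valid UTF-8
text = decode_with_fallback(raw)
```

The trade-off is that the replacement step loses information: the caller gets usable text, but the original byte cannot be recovered from the result.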