path: root/bs4/builder/_htmlparser.py
| Age | Commit message | Author |
| --- | --- | --- |
| 2016-07-16 | Removed imports to pdb, since pdb is not available in some environments. [bug=1491700] | Leonard Richardson |
| 2016-07-16 | Added a separate class for XML processing instructions, which have a slightly different format from SGML processing instructions. [bug=1504383] | Leonard Richardson |
| 2016-07-16 | Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file. | Leonard Richardson |
| 2015-06-28 | It's now possible to pickle a BeautifulSoup object no matter which tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545] | Leonard Richardson |
| 2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson |
| 2015-06-24 | Fixed an import error in Python 3.5 caused by the removal of the | Leonard Richardson |
| 2015-06-24 | Made double sure that we don't use the 'strict' constructor argument when it's deprecated. [bug=1341055] | Leonard Richardson |
| 2014-12-11 | Improved the lxml tree builder's handling of processing instructions. [bug=1294645] | Leonard Richardson |
| 2014-12-07 | In Python 3.4 and above, set the new convert_charrefs argument to the html.parser constructor to avoid a warning and future failures. Patch by Stefano Revera. [bug=1375721] | Leonard Richardson |
| 2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson |
| 2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII when checking whether or not it was a file on disk. [bug=1227016] | Leonard Richardson |
| 2013-06-02 | Merged in big encoding-detection refactoring branch. | Leonard Richardson |
| 2013-05-31 | The html.parser treebuilder can now handle numeric attributes in text when the hexadecimal name of the attribute starts with a capital X. | Leonard Richardson |
| 2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson |
| 2013-05-07 | Now that lxml's segfault on invalid doctype has been fixed, fix a corresponding problem on the Beautiful Soup end that was previously invisible. [bug=984936] | Leonard Richardson |
| 2012-04-18 | Changed wording slightly. | Leonard Richardson |
| 2012-04-18 | Print a warning on HTMLParseErrors to let people know they should install an external parser. | Leonard Richardson |
| 2012-04-18 | Fixed a bug that made the HTMLParser treebuilder generate XML definitions ending with two question marks instead of one. [bug=984258] | Leonard Richardson |
| 2012-02-21 | Added nsprefix argument to the tag class. | Leonard Richardson |
| 2012-02-21 | Merged from trunk. | Leonard Richardson |
| 2012-02-20 | It's now possible to copy a BeautifulSoup object created with the html.parser treebuilder. | Leonard Richardson |
| 2012-02-20 | Changed the class structure so that the default parser test class uses html.parser. | Leonard Richardson |
| 2012-02-16 | It's a start, at least. | Leonard Richardson |
| 2012-02-09 | As a last-ditch attempt to turn data into Unicode, use errors=replace instead of errors=strict. | Leonard Richardson |
| 2012-02-09 | Minor Unicode, Dammit cleanup. | Leonard Richardson |
| 2012-02-06 | Monkeypatch Python 3.2 versions prior to 3.2.3 to solve some major HTMLParser bugs. | Leonard Richardson |
| 2012-01-20 | Made it easier to convert BS3 code to BS4. | Leonard Richardson |
| 2012-01-20 | Got the test suite to pass on Python 3.2 (skipping the html5lib stuff, which doesn't seem to have Python 3 support yet.) | Leonard Richardson |
| 2011-02-27 | Added a tree builder for the built-in HTMLParser, and tests. | Leonard Richardson |
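
Two entries in this log concern how the standard-library parser reports character references: the 2014-12-07 entry on the `convert_charrefs` constructor argument, and the 2013-05-31 entry on a hexadecimal name starting with a capital X (which appears to refer to numeric character references such as `&#X41;`). The sketch below illustrates both using only the standard library's `html.parser.HTMLParser`. It is not the code in `_htmlparser.py`; the class `EventCollector` and the function `parse_text` are hypothetical names introduced only for this illustration.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper that gathers the text reported by the parser."""

    def __init__(self, convert_charrefs):
        # convert_charrefs is a keyword argument of HTMLParser in Python 3.4+.
        # Passing it explicitly keeps the behaviour independent of the default.
        super().__init__(convert_charrefs=convert_charrefs)
        self.text = []

    def handle_data(self, data):
        self.text.append(data)

    def handle_charref(self, name):
        # Called only when convert_charrefs is False.
        # name arrives without "&#" and ";", e.g. "67", "x42" or "X41",
        # so the hex prefix has to be accepted in either case.
        if name[0] in "xX":
            codepoint = int(name[1:], 16)
        else:
            codepoint = int(name)
        self.text.append(chr(codepoint))


def parse_text(markup, convert_charrefs):
    """Return the text content of markup with character references decoded."""
    parser = EventCollector(convert_charrefs)
    parser.feed(markup)
    parser.close()
    return "".join(parser.text)


if __name__ == "__main__":
    sample = "<p>&#X41;&#x42;&#67;</p>"
    print(parse_text(sample, convert_charrefs=False))
    print(parse_text(sample, convert_charrefs=True))
```

With `convert_charrefs=False` the subclass decodes each reference itself in `handle_charref`; with `convert_charrefs=True` the parser decodes them before calling `handle_data`, so `handle_charref` is never invoked. Either way the sample yields the same text.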