path: root/bs4/builder/_htmlparser.py
Age | Commit message | Author
2023-06-04 | Fixed a case found by Mengyuhan where html.parser giving up on | Leonard Richardson
2023-02-15 | When the html.parser parser decides it can't parse a document, Beautiful | Leonard Richardson
2023-01-27 | Got rid of some more warnings by removing code that's not relevant anymore, n... | Leonard Richardson
2023-01-27 | Warnings now do their best to provide an appropriate stacklevel, | Leonard Richardson
2021-10-24 | Issue a warning when an HTML parser is used to parse a document that | Leonard Richardson
2021-09-07 | Goodbye, Python 2. [bug=1942919] | Leonard Richardson
2021-05-31 | The html.parser tree builder can now handle named entities | Leonard Richardson
2021-02-13 | Added a second way to specify encodings to UnicodeDammit and | Leonard Richardson
2020-05-30 | Remove explicit reference to the module name within the module, replacing it ... | Leonard Richardson
2020-05-17 | Switch entirely to Python 3-style print statements, even in Python 2. | Leonard Richardson
2020-05-17 | Documented some recently added customization features. | Leonard Richardson
2020-05-17 | Added a keyword argument on_duplicate_attribute to the | Leonard Richardson
2019-12-24 | Added docstrings for some but not all tree builders. | Leonard Richardson
2019-11-11 | Simplified code. | Leonard Richardson
2019-11-11 | The html.parser tree builder now correctly handles DOCTYPEs that are | Leonard Richardson
2019-07-21 | Implemented line number tracking for html5lib. | Leonard Richardson
2019-07-21 | Adapt Chris Mayo's code to track line number and position when using html.par... | Leonard Richardson
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionar... | Leonard Richardson
2018-12-24 | Clarified the software license. | Leonard Richardson
2018-07-28 | Correctly handle invalid HTML numeric character entities like “ | Leonard Richardson
2018-07-21 | Fixed a problem where the html.parser tree builder interpreted | Leonard Richardson
2018-07-15 | Stop data loss when encountering an empty numeric entity, and | Leonard Richardson
2018-07-14 | Stopped HTMLParser from raising an exception in very rare cases of | Leonard Richardson
2017-05-06 | Improved the handling of empty-element tags like <br> when using the | Leonard Richardson
2016-07-16 | Removed imports to pdb, since pdb is not available in some environments. [bug... | Leonard Richardson
2016-07-16 | Added a separate class for XML processing instructions, which have a slightly... | Leonard Richardson
2016-07-16 | Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file. | Leonard Richardson
2015-06-28 | It's now possible to pickle a BeautifulSoup object no matter which | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the | Leonard Richardson
2015-06-24 | Fixed an import error in Python 3.5 caused by the removal of the | Leonard Richardson
2015-06-24 | Made double sure that we don't use the 'strict' constructor argument when it'... | Leonard Richardson
2014-12-11 | Improved the lxml tree builder's handling of processing | Leonard Richardson
2014-12-07 | In Python 3.4 and above, set the new convert_charrefs argument to | Leonard Richardson
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly ... | Leonard Richardson
2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII when... | Leonard Richardson
2013-06-02 | Merged in big encoding-detection refactoring branch. | Leonard Richardson
2013-05-31 | The html.parser treebuilder can now handle numeric attributes in | Leonard Richardson
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson
2013-05-07 | Now that lxml's segfault on invalid doctype has been fixed, fix a | Leonard Richardson
2012-04-18 | Changed wording slightly. | Leonard Richardson
2012-04-18 | Print a warning on HTMLParseErrors to let people know they should install an ... | Leonard Richardson
2012-04-18 | Fixed a bug that made the HTMLParser treebuilder generate XML definitions end... | Leonard Richardson
2012-02-21 | Added nsprefix argument to the tag class. | Leonard Richardson
2012-02-21 | Merged from trunk. | Leonard Richardson
2012-02-20 | It's now possible to copy a BeautifulSoup object created with the html.parser... | Leonard Richardson
2012-02-20 | Changed the class structure so that the default parser test class uses html.pa... | Leonard Richardson
2012-02-16 | It's a start, at least. | Leonard Richardson
2012-02-09 | As a last-ditch attempt to turn data into Unicode, use errors=replace instead... | Leonard Richardson
2012-02-09 | Minor Unicode, Dammit cleanup. | Leonard Richardson
2012-02-06 | Monkeypatch Python 3.2 versions prior to 3.2.3 to solve some major HTMLParser... | Leonard Richardson