Age | Commit message | Author | |
---|---|---|---|
2019-07-21 | Implemented line number tracking for html5lib. | Leonard Richardson | |
2019-07-21 | Adapt Chris Mayo's code to track line number and position when using html.parser. | Leonard Richardson | |
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978] | Leonard Richardson | |
2018-12-24 | Clarified the software license. | Leonard Richardson | |
2018-07-28 | Correctly handle invalid HTML numeric character entities like &#147; which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml. [bug=1782933] | Leonard Richardson | |
2018-07-21 | Fixed a problem where the html.parser tree builder interpreted a string like '&foo ' as the character entity '&foo;' [bug=1728706] | Leonard Richardson | |
2018-07-15 | Stop data loss when encountering an empty numeric entity, and possibly in other cases. Thanks to tos.kamiya for the fix. [bug=1698503] | Leonard Richardson | |
2018-07-14 | Stopped HTMLParser from raising an exception in very rare cases of bad markup. [bug=1708831] | Leonard Richardson | |
2017-05-06 | Improved the handling of empty-element tags like <br> when using the html.parser parser. [bug=1676935] | Leonard Richardson | |
2016-07-16 | Removed imports to pdb, since pdb is not available in some environments. [bug=1491700] | Leonard Richardson | |
2016-07-16 | Added a separate class for XML processing instructions, which have a slightly different format from SGML processing instructions. [bug=1504383] | Leonard Richardson | |
2016-07-16 | Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file. | Leonard Richardson | |
2015-06-28 | It's now possible to pickle a BeautifulSoup object no matter which tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545] | Leonard Richardson | |
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson | |
2015-06-24 | Fixed an import error in Python 3.5 caused by the removal of the | Leonard Richardson | |
2015-06-24 | Made double sure that we don't use the 'strict' constructor argument when it's deprecated. [bug=1341055] | Leonard Richardson | |
2014-12-11 | Improved the lxml tree builder's handling of processing instructions. [bug=1294645] | Leonard Richardson | |
2014-12-07 | In Python 3.4 and above, set the new convert_charrefs argument to the html.parser constructor to avoid a warning and future failures. Patch by Stefano Revera. [bug=1375721] | Leonard Richardson | |
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson | |
2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII when checking whether or not it was a file on disk. [bug=1227016] | Leonard Richardson | |
2013-06-02 | Merged in big encoding-detection refactoring branch. | Leonard Richardson | |
2013-05-31 | The html.parser treebuilder can now handle numeric attributes in text when the hexadecimal name of the attribute starts with a capital X. | Leonard Richardson | |
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson | |
2013-05-07 | Now that lxml's segfault on invalid doctype has been fixed, fix a corresponding problem on the Beautiful Soup end that was previously invisible. [bug=984936] | Leonard Richardson | |
2012-04-18 | Changed wording slightly. | Leonard Richardson | |
2012-04-18 | Print a warning on HTMLParseErrors to let people know they should install an external parser. | Leonard Richardson | |
2012-04-18 | Fixed a bug that made the HTMLParser treebuilder generate XML definitions ending with two question marks instead of one. [bug=984258] | Leonard Richardson | |
2012-02-21 | Added nsprefix argument to the tag class. | Leonard Richardson | |
2012-02-21 | Merged from trunk. | Leonard Richardson | |
2012-02-20 | It's now possible to copy a BeautifulSoup object created with the html.parser treebuilder. | Leonard Richardson | |
2012-02-20 | Changed the class structure so that the default parser test class uses html.parser. | Leonard Richardson | |
2012-02-16 | It's a start, at least. | Leonard Richardson | |
2012-02-09 | As a last-ditch attempt to turn data into Unicode, use errors=replace instead of errors=strict. | Leonard Richardson | |
2012-02-09 | Minor Unicode, Dammit cleanup. | Leonard Richardson | |
2012-02-06 | Monkeypatch Python 3.2 versions prior to 3.2.3 to solve some major HTMLParser bugs. | Leonard Richardson | |
2012-01-20 | Made it easier to convert BS3 code to BS4. | Leonard Richardson | |
2012-01-20 | Got the test suite to pass on Python 3.2 (skipping the html5lib stuff, which doesn't seem to have Python 3 support yet.) | Leonard Richardson | |
2011-02-27 | Added a tree builder for the built-in HTMLParser, and tests. | Leonard Richardson | |
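
The 2019-07-21 entries mention tracking line number and position when using html.parser. The log does not show how this is done, so the following is only a minimal sketch of the standard-library hook that makes such tracking possible: `HTMLParser.getpos()`. The `PositionRecorder` class and its `positions` list are hypothetical names invented for this illustration and are not part of Beautiful Soup.

```python
from html.parser import HTMLParser


class PositionRecorder(HTMLParser):
    """Hypothetical helper (not Beautiful Soup code): records where each start tag begins."""

    def __init__(self):
        # convert_charrefs is the constructor argument named in the 2014-12-07 entry.
        super().__init__(convert_charrefs=True)
        self.positions = []

    def handle_starttag(self, tag, attrs):
        # getpos() returns (line number, offset); lines are 1-based, offsets 0-based.
        line, offset = self.getpos()
        self.positions.append((tag, line, offset))


recorder = PositionRecorder()
recorder.feed("<html>\n  <p>one</p>\n  <p>two</p>\n</html>")
recorder.close()

for tag, line, offset in recorder.positions:
    print(f"<{tag}> starts at line {line}, offset {offset}")
```

A tree builder built on html.parser could plausibly read the same value inside its own start-tag handler and attach it to the element it creates; that design is an assumption here, not something this log confirms.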