beautifulsoup.git

Age	Commit message	Author
2021-10-09	Moved testing.py into the same package as the tests.	Leonard Richardson

2021-09-12	Ported unit tests to use pytest.	Leonard Richardson

2021-09-07	Goodbye, Python 2. [bug=1942919]	Leonard Richardson

2021-05-31	The html.parser tree builder can now handle named entities	Leonard Richardson
	found in the HTML5 spec in much the same way that the html5lib tree builder does. Note that the lxml tree builder still handles named entities differently. [bug=1924908]
2021-04-08	Brought in fuzz tests from the oss-project into Beautiful Soup's unit test	Leonard Richardson
	suite.
2020-05-30	Fixed a bug that caused too many tags to be popped from the tag	Leonard Richardson
	stack during tree building, when encountering a closing tag that had no matching opening tag. [bug=1880420]
2020-04-24	If you encode a document with a Python-specific encoding like	Leonard Richardson
	'unicode_escape', that encoding is no longer mentioned in the final XML or HTML document. Instead, encoding information is omitted or left blank. [bug=1874955]
2020-04-05	Embedded CSS and JavaScript are now stored in distinct Stylesheet and	Leonard Richardson
	Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861]
2019-11-11	The html.parser tree builder now correctly handles DOCTYPEs that are	Leonard Richardson
	not uppercase. [bug=1848401]
2019-07-21	Implemented line number tracking for html5lib.	Leonard Richardson

2019-07-21	Adapt Chris Mayo's code to track line number and position when using	Leonard Richardson
	html.parser.
2019-07-07	&apos; (which is valid in XML and XHTML, but not HTML 4) is now	Leonard Richardson
	recognized as a named entity and converted to a single quote. [bug=1818721]
2019-07-07	It's now possible to override a TreeBuilder's cdata_list_attributes	Leonard Richardson
	dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978]
2018-12-30	Fixed a problem with multi-valued attributes where the value	Leonard Richardson
	contained whitespace. Thanks to Jens Svalgaard for the fix. [bug=1787453]
2018-12-30	Merging the linkage checker and html5lib fixes by Isaac Muse found in	Leonard Richardson
	https://code.launchpad.net/~facelessuser/beautifulsoup/html5lib-fix/+merge/361282. [bug=1809910]
2018-12-26	Remove dead line of code	Isaac Muse

2018-12-25	Ensure html5lib always has valid internal linkage	Isaac Muse
	html5lib, with malformed HTML, can end up with detached linkage internally. Improve the current code to ensure html5lib always has proper linkage.
2018-12-24	Clarified the software license.	Leonard Richardson

2018-07-28	Correctly handle invalid HTML numeric character entities like &#147;	Leonard Richardson
	which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml. [bug=1782933]
2018-07-21	Fixed a problem where the html.parser tree builder interpreted	Leonard Richardson
	a string like '&foo ' as the character entity '&foo;' [bug=1728706]
2018-07-18	Fixed a bug where find_all() was not working when asked to find a	Leonard Richardson
	tag with a namespaced name in an XML document that was parsed as HTML. [bug=1723783]
2018-07-18	Preserve XML namespaces when they are introduced inside an XML	Leonard Richardson
	document, not just the ones introduced at the top level. [bug=1718787]
2018-07-15	Stop data loss when encountering an empty numeric entity, and	Leonard Richardson
	possibly in other cases. Thanks to tos.kamiya for the fix. [bug=1698503]
2017-05-07	Namespace prefix is preserved when an XML tag is copied. Thanks	Leonard Richardson
	to Vikas for a patch and test. [bug=1685172]
2017-05-06	Improved the handling of empty-element tags like <br> when using the	Leonard Richardson
	html.parser parser. [bug=1676935]
2017-05-06	HTML parsers treat all HTML4 and HTML5 empty element tags (aka void element	Leonard Richardson
	tags) correctly. [bug=1656909]
2017-05-06	It's now possible to use a tag's namespace prefix when searching,	Leonard Richardson
	e.g. soup.find('namespace:tag') [bug=1655332]
2016-07-30	Explained why we test both unicode and bytestring processing instructions.	Leonard Richardson

2016-07-16	Beautiful Soup will now work with versions of html5lib greater than	Leonard Richardson
	0.99999999. [bug=1603299]
2016-07-16	The contents of <textarea> tags will no longer be modified when the	Leonard Richardson
	tree is prettified. [bug=1555829]
2016-07-16	Added a separate class for XML processing instructions, which have a	Leonard Richardson
	slightly different format from SGML processing instructions. [bug=1504383]
2016-07-16	Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file.	Leonard Richardson

2015-09-28	Add a __license__ statement to all source files.	Leonard Richardson

2015-09-28	Corrected the output of Declaration objects. [bug=1477847]	Leonard Richardson

2015-06-28	It's now possible to pickle a BeautifulSoup object no matter which	Leonard Richardson
	tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545]
2015-06-26	Added a sanity check helper method that makes sure all the elements of a	Leonard Richardson
	tree are properly connected via .next_element and .previous_element.
2015-06-24	If the initial <html> tag contains a CDATA list attribute such as	Leonard Richardson
	'class', the html5lib tree builder will now turn its value into a list, as it would with any other tag. [bug=1296481]
2015-06-23	Got a hacky fix for the latest html5lib problem.	Leonard Richardson

2015-06-23	Force object_was_parsed() to keep the tree intact even when an element	Leonard Richardson
	from later in the document is moved into place. [bug=1430633]
2014-12-11	Improved the lxml tree builder's handling of processing	Leonard Richardson
	instructions. [bug=1294645]
2014-12-07	Issue a warning if the BeautifulSoup constructor arguments do not explicitly	Leonard Richardson
	name a parser.
2013-10-18	Fixed yet another problem that caused the html5lib tree builder to	Leonard Richardson
	create a disconnected parse tree. [bug=1237763]
2013-06-02	Merged in big encoding-detection refactoring branch.	Leonard Richardson

2013-05-31	The html.parser treebuilder can now handle numeric attributes in	Leonard Richardson
	text when the hexadecimal name of the attribute starts with a capital X.
2013-05-31	Create a new lxml parser object for every new parsing strategy.	Leonard Richardson

2013-05-20	Fixed another bug by which the html5lib tree builder could create a	Leonard Richardson
	disconnected tree. [bug=1182089]
2013-05-20	Fixed test failures when lxml is not installed.	Leonard Richardson

2013-05-07	Now that lxml's segfault on invalid doctype has been fixed, fix a	Leonard Richardson
	corresponding problem on the Beautiful Soup end that was previously invisible. [bug=984936]
2013-05-06	Added failing test.	Leonard Richardson

2012-10-11	Fix a bug in the lxml treebuilder which crashed when a tag included	Leonard Richardson
	an attribute from the predefined xml: namespace. [bug=1065617]