beautifulsoup.git

Age	Commit message	Author
2021-11-29	Do a better job of keeping track of namespaces as an XML document is	Leonard Richardson
	parsed, so that CSS selectors that use namespaces will do the right thing more often. [bug=1946243]
2021-10-24	Issue a warning when an HTML parser is used to parse a document that	Leonard Richardson
	looks like XML but not XHTML. [bug=1939121]
2021-10-23	Added a workaround for an lxml bug	Leonard Richardson
	(https://bugs.launchpad.net/lxml/+bug/1948551) that caused problems when parsing a Unicode string beginning with BYTE ORDER MARK. [bug=1947768]
2021-09-07	Goodbye, Python 2. [bug=1942919]	Leonard Richardson

2021-02-13	Added a second way to specify encodings to UnicodeDammit and	Leonard Richardson
	EncodingDetector, based on the order of precedence defined in the HTML5 spec, starting at: https://html.spec.whatwg.org/multipage/parsing.html#parsing-with-a-known-character-encoding Encodings in 'known_definite_encodings' are tried first, then byte-order-mark sniffing is run, then encodings in 'user_encodings' are tried. The old argument, 'override_encodings', is now a deprecated alias for 'known_definite_encodings'. This changes the default behavior of the html.parser and lxml tree builders, in a way that may slightly improve encoding detection but will probably have no effect. [bug=1889014]
2019-12-24	Added docstrings for some but not all tree builders.	Leonard Richardson

2019-11-11	Added a Brazilian Portuguese translation by Cezar Peixeiro.	Leonard Richardson

2019-09-02	Avoid a crash when trying to detect the declared encoding of a	Leonard Richardson
	Unicode document. Raise an explanatory exception when the underlying parser completely rejects the incoming markup. [bug=1838877]
2019-07-21	Implemented line number tracking for html5lib.	Leonard Richardson

2019-07-07	It's now possible to override a TreeBuilder's cdata_list_attributes	Leonard Richardson
	dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978]
2019-01-06	Don't track un-prefixed namespaces	Isaac Muse

2018-12-24	Clarified the software license.	Leonard Richardson

2018-12-24	Keep track of the namespace abbreviations found while parsing the document.	Leonard Richardson
	This makes select() work most of the time without requiring a value for 'namespaces'.
2018-07-18	Preserve XML namespaces when they are introduced inside an XML	Leonard Richardson
	document, not just the ones introduced at the top level. [bug=1718787]
2018-07-14	Stopped HTMLParser from raising an exception in very rare cases of	Leonard Richardson
	bad markup. [bug=1708831]
2016-07-30	Explained why we test both unicode and bytestring processing instructions.	Leonard Richardson

2016-07-26	Fixed a reported (but not duplicated) bug involving processing instructions	Leonard Richardson
	fed into the lxml HTML parser.
2016-07-16	Removed imports to pdb, since pdb is not available in some environments.	Leonard Richardson
	[bug=1491700]
2016-07-16	Added a separate class for XML processing instructions, which have a	Leonard Richardson
	slightly different format from SGML processing instructions. [bug=1504383]
2016-07-16	Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file.	Leonard Richardson

2015-06-28	Accept 'xml' as an unambiguous identifier for the lxml XML parser, since	Leonard Richardson
	it's the only XML parser supported at the moment.
2015-06-27	Added an exclude_encodings argument to UnicodeDammit and to the	Leonard Richardson
	Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408]
2014-12-11	Improved the lxml tree builder's handling of processing	Leonard Richardson
	instructions. [bug=1294645]
2014-12-07	Tweaked the parser warning.	Leonard Richardson

2014-12-07	Issue a warning if the BeautifulSoup constructor arguments do not explicitly	Leonard Richardson
	name a parser.
2013-06-02	Turns out we had two bits of code to strip byte-order marks.	Leonard Richardson

2013-06-02	It turns out most of the untested code wasn't doing anything useful.	Leonard Richardson

2013-06-02	Treat an lxml ParserError as a ParserRejectedMarkup.	Leonard Richardson

2013-05-31	Create a new lxml parser object for every new parsing strategy.	Leonard Richardson

2013-05-09	Changed lxml.feed() to handle the eventuality that it may be given a bytestring.	Leonard Richardson

2013-05-09	Added a diagnostic function for randomly generating a simple, invalid HTML	Leonard Richardson
	document.
2012-10-11	Fix a bug in the lxml treebuilder which crashed when a tag included	Leonard Richardson
	an attribute from the predefined xml: namespace. [bug=1065617]
2012-09-28	Fixed package name.	Leonard Richardson

2012-08-16	Use namespace prefixes for namespaced attribute names, instead of	Leonard Richardson
	the fully-qualified names given by the lxml parser. [bug=1037597]
2012-05-29	Removed breakpoints.	Leonard Richardson

2012-05-29	Prep for release.	Leonard Richardson

2012-05-24	Fixed a bug with the lxml treebuilder that prevented the user from adding	Leonard Richardson
	attributes to a tag that didn't originally have any. [bug=1002378] Thanks to Oliver Beattie for the patch.
2012-04-03	Got rid of the 4.0.2 workaround for HTML documents--it was unnecessary and	Leonard Richardson
	the workaround was triggering a (possibly different, but related) bug in lxml. [bug=972466]
2012-04-03	Don't split up the markup into chunks when using the lxml HTML parser, which	Leonard Richardson
	doesn't have the problems of the XML parser.
2012-03-24	Pass data into XMLParser.feed() in chunks. [bug=963880]	Leonard Richardson

2012-02-28	Fixed the generated XML declaration.	Leonard Richardson

2012-02-23	Fixed handling of the closing of namespaced tags.	Leonard Richardson

2012-02-23	Merge from trunk and added tests.	Leonard Richardson

2012-02-22	Added comments.	Leonard Richardson

2012-02-22	Treat a new namespace mapping as a set of attributes on the tag that defines	Leonard Richardson
	it, so we don't lose the mappings.
2012-02-21	Have lxml invert namespace maps as they come in and set each tag's prefix	Leonard Richardson
	appropriately.
2012-02-21	Added nsprefix argument to the tag class.	Leonard Richardson

2012-02-16	It's a start, at least.	Leonard Richardson

2012-02-09	As a last-ditch attempt to turn data into Unicode, use errors=replace	Leonard Richardson
	instead of errors=strict.
2012-02-09	Minor Unicode, Dammit cleanup.	Leonard Richardson
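
The 2012-02-09 entry about falling back from errors=strict to errors=replace can be illustrated with a minimal standard-library sketch. This is not the code from that commit: the helper name to_unicode, its candidate_encodings parameter, and its return shape are hypothetical and exist only to show the idea of trying strict decoding first and replacing undecodable bytes only as a last resort.

```python
def to_unicode(data, candidate_encodings=("ascii", "utf-8")):
    """Hypothetical helper, not part of Beautiful Soup.

    Returns (text, encoding_used, lossy). `lossy` is True when the
    last-ditch errors="replace" pass was needed.
    """
    # First pass: strict decoding, so a wrong guess fails loudly
    # instead of silently producing bad text.
    for encoding in candidate_encodings:
        try:
            return data.decode(encoding, errors="strict"), encoding, False
        except UnicodeDecodeError:
            continue

    # Last-ditch pass: reuse the final candidate, but substitute
    # U+FFFD for every byte sequence that cannot be decoded.
    fallback = candidate_encodings[-1]
    return data.decode(fallback, errors="replace"), fallback, True


# Valid UTF-8 decodes cleanly on the strict pass.
clean = to_unicode("café".encode("utf-8"))

# A Latin-1 byte string is invalid as UTF-8, so the fallback is used.
damaged = to_unicode(b"caf\xe9")
```

The lossy flag is a design choice of this sketch: it lets the caller tell text that decoded cleanly apart from text in which some characters were replaced.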