beautifulsoup.git

Age	Commit message	Author
2024-02-12	Applied patch from Marc Müller to add a stacklevel to a warning that was	Leonard Richardson
	missing it.
2024-01-17	Added the correct stacklevel to instances of the XMLParsedAsHTMLWarning.	Leonard Richardson
	[bug=2034451]
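
For context, a minimal sketch of what a stacklevel argument does, using only the standard warnings module. The names MarkupWarning, parse and collect_warnings are hypothetical and are not Beautiful Soup's code; the point is only that stacklevel=2 attributes a warning to the caller rather than to the library line that raised it.

```python
import warnings


class MarkupWarning(UserWarning):
    """Hypothetical warning category, used only for this illustration."""


def parse(markup):
    # Hypothetical library function. stacklevel=2 makes the reported
    # location the caller of parse(), not this line inside the library.
    if markup.lstrip().startswith("<?xml"):
        warnings.warn(
            "Markup looks like XML but is being parsed as HTML.",
            MarkupWarning,
            stacklevel=2,
        )
    return markup


def collect_warnings(markup):
    """Call parse() and return the list of warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        parse(markup)  # any warning is attributed to this line
    return caught


if __name__ == "__main__":
    for w in collect_warnings('<?xml version="1.0"?><root/>'):
        print(w.category.__name__, w.filename, w.lineno)
```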
2022-04-10	Fixed another crash when overriding multi_valued_attributes and using the	Leonard Richardson
	html5lib parser. [bug=1948488]
2021-10-24	Issue a warning when an HTML parser is used to parse a document that	Leonard Richardson
	looks like XML but not XHTML. [bug=1939121]
2021-10-11	Added special string classes, RubyParenthesisString and RubyTextString,	Leonard Richardson
	to make it possible to treat ruby text specially in get_text() calls. [bug=1941980]
2021-09-07	Goodbye, Python 2. [bug=1942919]	Leonard Richardson

2021-02-13	Added a second way to specify encodings to UnicodeDammit and	Leonard Richardson
	EncodingDetector, based on the order of precedence defined in the HTML5 spec, starting at: https://html.spec.whatwg.org/multipage/parsing.html#parsing-with-a-known-character-encoding Encodings in 'known_definite_encodings' are tried first, then byte-order-mark sniffing is run, then encodings in 'user_encodings' are tried. The old argument, 'override_encodings', is now a deprecated alias for 'known_definite_encodings'. This changes the default behavior of the html.parser and lxml tree builders, in a way that may slightly improve encoding detection but will probably have no effect. [bug=1889014]
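
The order of precedence described in this entry can be sketched with the standard library alone. This is a simplified illustration, not the EncodingDetector implementation: sniff_bom and candidate_encodings are hypothetical helpers, and the real detector may consult further sources that are not modelled here.

```python
import codecs

# Byte-order marks checked during sniffing. The UTF-32 marks come first
# because the UTF-32-LE mark begins with the same bytes as the UTF-16-LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding implied by a leading byte-order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def candidate_encodings(data, known_definite_encodings=(), user_encodings=()):
    """Yield encodings to try: definite ones, then the BOM, then user guesses."""
    ordered = list(known_definite_encodings)
    bom_encoding = sniff_bom(data)
    if bom_encoding is not None:
        ordered.append(bom_encoding)
    ordered.extend(user_encodings)

    seen = set()
    for name in ordered:
        key = name.lower()
        if key not in seen:  # skip duplicates, keeping the first position
            seen.add(key)
            yield name
```

With these assumptions, a document that starts with a UTF-8 byte-order mark is tried against the definite encodings first, then UTF-8, and only then against the user's guesses.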
2020-05-30	Remove explicit reference to the module name within the module, replacing it	Leonard Richardson
	with __name__.
2020-05-17	Switch entirely to Python 3-style print statements, even in Python 2.	Leonard Richardson

2020-04-05	Embedded CSS and Javascript is now stored in distinct Stylesheet and	Leonard Richardson
	Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861]
2019-12-24	Added docstrings for some but not all tree builders.	Leonard Richardson

2019-09-02	Avoid a crash when trying to detect the declared encoding of a	Leonard Richardson
	Unicode document. Raise an explanatory exception when the underlying parser completely rejects the incoming markup. [bug=1838877]
2019-07-21	Adapt Chris Mayo's code to track line number and position when using	Leonard Richardson
	html.parser.
2019-07-14	Give the Formatter class more control over formatting decisions.	Leonard Richardson

2019-07-07	Renamed the cdata_list_attributes argument to multi_valued_attributes since	Leonard Richardson
	it's facing the end-user and that's a more easily understandable name.
2019-07-07	It's now possible to override a TreeBuilder's cdata_list_attributes	Leonard Richardson
	dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978]
2018-12-30	Fixed a problem with multi-valued attributes where the value	Leonard Richardson
	contained whitespace. Thanks to Jens Svalgaard for the fix. [bug=1787453]
2018-12-24	Clarified the software license.	Leonard Richardson

2018-12-24	Keep track of the namespace abbreviations found while parsing the document.	Leonard Richardson
	This makes select() work most of the time without requiring a value for 'namespaces'.
2018-08-12	Converted README to Markdown format.	Leonard Richardson

2018-07-15	Introduced the Formatter system. [bug=1716272].	Leonard Richardson

2018-07-15	It's possible for a TreeBuilder subclass to specify that void	Leonard Richardson
	elements should be represented as <element> rather than <element/>, by setting TreeBuilder.void_element_close_prefix to the empty string. [bug=1716272]
2017-05-06	HTML parsers treat all HTML4 and HTML5 empty element tags (aka void element	Leonard Richardson
	tags) correctly. [bug=1656909]
2016-07-16	The contents of <textarea> tags will no longer be modified when the	Leonard Richardson
	tree is prettified. [bug=1555829]
2016-07-16	Rename COPYING.txt to LICENSE. Add a reference to LICENSE in every source file.	Leonard Richardson

2015-06-28	It's now possible to pickle a BeautifulSoup object no matter which	Leonard Richardson
	tree builder was used to create it. However, the only tree builder that survives the pickling process is the HTMLParserTreeBuilder ('html.parser'). If you unpickle a BeautifulSoup object created with some other tree builder, soup.builder will be None. [bug=1231545]
2014-12-07	Tweaked the parser warning.	Leonard Richardson

2014-12-07	Issue a warning if the BeautifulSoup constructor arguments do not explicitly	Leonard Richardson
	name a parser.
2013-06-03	Improved performance of _replace_cdata_list_attribute_values, and greatly	Leonard Richardson
	reduced the number of times it is called.
2013-05-31	Create a new lxml parser object for every new parsing strategy.	Leonard Richardson

2013-05-20	The default XML formatter will now replace ampersands even if they appear to	Leonard Richardson
	be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183]
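
A minimal sketch of the escaping behaviour this entry describes, in which an ampersand is always escaped, even when it already looks like the start of an entity. The function escape_xml_text is hypothetical and is not the formatter's actual code.

```python
def escape_xml_text(text):
    """Escape &, < and > without trying to recognise existing entities."""
    # "&" must be replaced first; otherwise the ampersands introduced by
    # the "<" and ">" replacements would themselves be escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


if __name__ == "__main__":
    print(escape_xml_text("&lt;"))
    print(escape_xml_text("a < b & c"))
```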
2012-06-30	Fixed an html5lib tree builder crash which happened when html5lib	Leonard Richardson
	moved a tag with a multivalued attribute from one part of the tree to another. [bug=1019603]
2012-04-26	The test suite now passes when lxml is not installed, whether or not	Leonard Richardson
	html5lib is installed. [bug=987004]
2012-04-18	Made encoding substitution in <meta> tags completely transparent (no more	Leonard Richardson
	%SOUP-ENCODING%).
2012-03-30	Fixed a typo that caused some versions of Python 3 to convert the Beautiful	Leonard Richardson
	Soup codebase incorrectly.
2012-03-01	In HTML5-style <meta charset="foo"> tags, the value of the "charset"	Leonard Richardson
	attribute is now replaced with the appropriate encoding on output. [bug=942714]
2012-02-15	Some cdata-list attributes are only cdata lists for certain tags.	Leonard Richardson

2012-02-09	As a last-ditch attempt to turn data into Unicode, use errors=replace	Leonard Richardson
	instead of errors=strict.
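
The difference between the two error modes can be shown with plain bytes.decode(). This is a sketch under stated assumptions: to_unicode is a hypothetical helper, not the UnicodeDammit code path, and it reuses the last listed encoding for the fallback.

```python
def to_unicode(data, encodings=("utf-8",)):
    """Try each encoding strictly, then fall back to a lossy decode."""
    for encoding in encodings:
        try:
            return data.decode(encoding, errors="strict")
        except UnicodeDecodeError:
            continue
    # Last-ditch attempt: undecodable bytes become U+FFFD (the Unicode
    # replacement character) instead of raising an exception.
    return data.decode(encodings[-1], errors="replace")


if __name__ == "__main__":
    print(to_unicode(b"caf\xc3\xa9"))      # valid UTF-8, decoded strictly
    print(repr(to_unicode(b"caf\xe9")))    # invalid UTF-8, decoded lossily
```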
2012-02-08	Rationalized the treatment of multi-valued HTML attributes such as 'class'	Leonard Richardson

2012-02-07	Newly created tags use the same empty-element rules as the builder used to	Leonard Richardson
	originally create the soup.
2011-05-21	More Python 3 compatibility.	Leonard Richardson

2011-05-21	More Python 3 compatibility.	Leonard Richardson

2011-02-27	Got rid of __package__; hopefully this is the only thing holding up 2.5 support.	Leonard Richardson

2011-02-27	Added a tree builder for the built-in HTMLParser, and tests.	Leonard Richardson