path: root/bs4/__init__.py
Age | Commit message | Author
2023-03-26 | Implement a proper BeautifulSoup.deepcopy rather than parsing the document again. | Leonard Richardson
2023-03-23 | Found and removed accidental calls to find(), greatly improving performance. | Leonard Richardson
2023-03-23 | Bump version number preemptively. | Leonard Richardson
2023-03-20 | Increased version number in __version__. | Leonard Richardson
2023-02-03 | Move the Soup Sieve proxy and its tests into separate files. | Leonard Richardson
2023-01-28 | Incremented version number. | Leonard Richardson
2023-01-27 | Change the tests that check warnings to also (indirectly) verify that the stacklevel associated with the warning is more or less correct. | Leonard Richardson
2023-01-27 | Warnings now do their best to provide an appropriate stacklevel, improving the usefulness of the message. [bug=1978744] | Leonard Richardson
2022-04-08 | Some cleanup work to get more consistent and complete about what gets packaged with the Beautiful Soup release. | Leonard Richardson
2022-04-07 | Omit untrusted input when issuing warnings. | Leonard Richardson
2021-12-21 | Standardized the wording of the MarkupResemblesLocatorWarning warnings to make them less judgemental about what you ought to be doing. [bug=1955450] | Leonard Richardson
2021-12-17 | Fix a crash when pickling a BeautifulSoup object that has no tree builder. [bug=1934003] | Leonard Richardson
2021-11-29 | Do a better job of keeping track of namespaces as an XML document is parsed, so that CSS selectors that use namespaces will do the right thing more often. [bug=1946243] | Leonard Richardson
2021-10-24 | Issue a warning when an HTML parser is used to parse a document that looks like XML but not XHTML. [bug=1939121] | Leonard Richardson
2021-10-24 | Used a warning to formally deprecate the 'text' argument in favor of 'string'. | Leonard Richardson
2021-09-07 | Goodbye, Python 2. [bug=1942919] | Leonard Richardson
2021-02-13 | Corrected the use of special string container classes in cases when a single tag may contain strings with different containers; such as the <template> tag, which may contain both TemplateString objects and Comment objects. [bug=1913406] | Leonard Richardson
2021-02-13 | Performance improvement when processing tags that speeds up overall tree construction by 2%. Patch by Morotti. [bug=1899358] | Leonard Richardson
2021-02-13 | Improve the warning issued when a directory name (as opposed to the name of a regular file) is passed as markup into the BeautifulSoup constructor. [bug=1913628] | Leonard Richardson
2020-10-03 | Prepare for release. | Leonard Richardson
2020-09-26 | Increment version number. | Leonard Richardson
2020-09-26 | Fixed a bug that inconsistently moved elements over when passing a Tag, rather than a list, into Tag.extend(). [bug=1885710] | Leonard Richardson
2020-09-26 | Change the signatures for BeautifulSoup.insert_before and insert_after (which are not implemented) to match PageElement.insert_before and insert_after, quieting warnings in some IDEs. [bug=1897120] | Leonard Richardson
2020-05-30 | Fixed a bug that caused too many tags to be popped from the tag stack during tree building, when encountering a closing tag that had no matching opening tag. [bug=1880420] | Leonard Richardson
2020-05-17 | Switch entirely to Python 3-style print statements, even in Python 2. | Leonard Richardson
2020-05-17 | Added docstring for BeautifulSoup.new_tag. | Leonard Richardson
2020-04-24 | If you encode a document with a Python-specific encoding like 'unicode_escape', that encoding is no longer mentioned in the final XML or HTML document. Instead, encoding information is omitted or left blank. [bug=1874955] | Leonard Richardson
2020-04-21 | Added two distinct UserWarning subclasses for warnings issued from the BeautifulSoup constructor which a caller may want to filter out. [bug=1873787] | Leonard Richardson
2020-04-07 | Add Script, Stylesheet, and TemplateString to the 'bs4' namespace. | Leonard Richardson
2020-04-05 | Embedded CSS and Javascript is now stored in distinct Stylesheet and Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861] | Leonard Richardson
2020-03-10 | Fixed a bug that happened when passing a Unicode filename containing non-ASCII characters as markup into Beautiful Soup, on a system that allows Unicode filenames. [bug=1866717] | Leonard Richardson
2019-12-24 | Bumped version number. | Leonard Richardson
2019-12-24 | Minor changes to docstrings. | Leonard Richardson
2019-12-24 | Added docstrings to all public methods in dammit.py. | Leonard Richardson
2019-12-20 | Added docstrings to all methods in __init__.py | Leonard Richardson
2019-10-06 | Added section on Python 2 sunsetting. | Leonard Richardson
2019-09-02 | Avoid a crash when trying to detect the declared encoding of a Unicode document. Raise an explanatory exception when the underlying parser completely rejects the incoming markup. [bug=1838877] | Leonard Richardson
2019-08-26 | It's now possible to override any of the element classes. | Leonard Richardson
2019-08-21 | When instantiating a BeautifulSoup object, it's now possible to provide replacement classes to be instantiated for every tag ('tag_class') or string ('string_class') encountered during parsing, rather than using the default Tag and NavigableString objects. | Leonard Richardson
2019-07-21 | Implemented line number tracking for html5lib. | Leonard Richardson
2019-07-21 | Adapt Chris Mayo's code to track line number and position when using html.parser. | Leonard Richardson
2019-07-16 | Prep for release. | Leonard Richardson
2019-07-07 | It's now possible to customize the TreeBuilder object by passing keyword arguments into the BeautifulSoup constructor. The main reason to do this right now is to change how multi-valued attributes are treated. [bug=1832978] | Leonard Richardson
2019-01-06 | Prep for release. | Leonard Richardson
2019-01-05 | Fix for performance with the linkage fix. The exact situations have been pinned down, and now solve current known issues without excessive and aggressive recursion. | Isaac Muse
2018-12-31 | Prep for release. | Leonard Richardson
2018-12-30 | Merging the linkage checker and html5lib fixes by Isaac Muse found in https://code.launchpad.net/~facelessuser/beautifulsoup/html5lib-fix/+merge/361282. [bug=1809910] | Leonard Richardson
2018-12-25 | Ensure html5lib always has valid internal linkage. html5lib, with malformed HTML, can end up with detached linkage internally. Improve the current code to ensure html5lib always has proper linkage. | Isaac Muse
2018-12-24 | Clarified the software license. | Leonard Richardson
2018-12-24 | Keep track of the namespace abbreviations found while parsing the document. This makes select() work most of the time without requiring a value for 'namespaces'. | Leonard Richardson
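
Note on the 2023-01-27 stacklevel entries and the 2020-04-21 UserWarning-subclass entry: the sketch below uses only the standard-library `warnings` module to show what "an appropriate stacklevel" means in general. It is not Beautiful Soup's code; `HypotheticalMarkupWarning`, `parse_markup`, and `caller` are made-up names for illustration. The assumption is that a library wants its warning attributed to the user's call site rather than to a line inside the library.

```python
import warnings


class HypotheticalMarkupWarning(UserWarning):
    """Illustrative UserWarning subclass that callers could filter on."""


def parse_markup(markup):
    """Hypothetical library function that warns about suspicious input."""
    if "<" not in markup:
        # stacklevel=2 attributes the warning to whoever called
        # parse_markup(), not to this line inside the "library".
        warnings.warn(
            "Input does not look like markup.",
            HypotheticalMarkupWarning,
            stacklevel=2,
        )
    return markup


def caller():
    return parse_markup("not-markup")


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    caller()

recorded = caught[0]
print(recorded.category.__name__, recorded.lineno)
```

Because the warning has its own subclass, a caller who finds it noisy could silence just that category with `warnings.filterwarnings("ignore", category=HypotheticalMarkupWarning)` and leave other warnings untouched.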