path: root/bs4/tests/test_soup.py
Age | Commit message | Author
2023-04-07 | Fixed an unhandled exception in BeautifulSoup.decode_contents and methods that call it. [bug=2015545] | Leonard Richardson
2023-01-31 | Consistently use pytest.mark.skipif to skip tests when the corresponding libraries are not installed. | Leonard Richardson
2023-01-29 | Reworded the 'multi-valued attributes' portion of the documentation to make it more clear. [bug=1970767] | Leonard Richardson
2023-01-27 | Change the tests that check warnings to also (indirectly) verify that the stacklevel associated with the warning is more or less correct. | Leonard Richardson
2022-04-07 | Omit untrusted input when issuing warnings. | Leonard Richardson
2021-12-17 | Fix a crash when pickling a BeautifulSoup object that has no tree builder. [bug=1934003] | Leonard Richardson
2021-10-11 | Broke up some monolithic unit test files. | Leonard Richardson
2021-10-11 | Moved the test classes to tests/__init__.py. | Leonard Richardson
2021-10-09 | Moved testing.py into the same package as the tests. | Leonard Richardson
2021-09-12 | Ported unit tests to use pytest. | Leonard Richardson
2021-09-07 | Goodbye, Python 2. [bug=1942919] | Leonard Richardson
2021-05-31 | The html.parser tree builder can now handle named entities found in the HTML5 spec in much the same way that the html5lib tree builder does. Note that the lxml tree builder still handles named entities differently. [bug=1924908] | Leonard Richardson
2021-02-13 | Added a second way to specify encodings to UnicodeDammit and EncodingDetector, based on the order of precedence defined in the HTML5 spec, starting at: https://html.spec.whatwg.org/multipage/parsing.html#parsing-with-a-known-character-encoding Encodings in 'known_definite_encodings' are tried first, then byte-order-mark sniffing is run, then encodings in 'user_encodings' are tried. The old argument, 'override_encodings', is now a deprecated alias for 'known_definite_encodings'. This changes the default behavior of the html.parser and lxml tree builders, in a way that may slightly improve encoding detection but will probably have no effect. [bug=1889014] | Leonard Richardson
2021-02-13 | Improve the warning issued when a directory name (as opposed to the name of a regular file) is passed as markup into the BeautifulSoup constructor. [bug=1913628] | Leonard Richardson
2021-02-13 | Corrected output when the namespace prefix associated with a namespaced attribute is the empty string, as opposed to None. [bug=1915583] | Leonard Richardson
2020-04-21 | Added two distinct UserWarning subclasses for warnings issued from the BeautifulSoup constructor which a caller may want to filter out. [bug=1873787] | Leonard Richardson
2020-04-05 | Embedded CSS and Javascript is now stored in distinct Stylesheet and Script tags, which are ignored by methods like get_text(). This feature is not supported by the html5lib treebuilder. [bug=1868861] | Leonard Richardson
2019-10-05 | Avoid a crash when unpickling certain parse trees generated using html5lib on Python 3. [bug=1843545] | Leonard Richardson
2019-09-02 | Avoid a crash when trying to detect the declared encoding of a Unicode document. Raise an explanatory exception when the underlying parser completely rejects the incoming markup. [bug=1838877] | Leonard Richardson
2019-08-26 | It's now possible to override any of the element classes. | Leonard Richardson
2019-08-22 | Test the ability to build a tree using objects other than Tag and NavigableString. | Leonard Richardson
2019-07-16 | Suppressed warnings during tests that aren't about the warnings. | Leonard Richardson
2019-07-07 | &apos; (which is valid in XML and XHTML, but not HTML 4) is now recognized as a named entity and converted to a single quote. [bug=1818721] | Leonard Richardson
2019-07-07 | Renamed the cdata_list_attributes argument to multi_valued_attributes since it's facing the end-user and that's a more easily understandable name. | Leonard Richardson
2019-07-07 | It's now possible to override a TreeBuilder's cdata_list_attributes dictionary by passing in a replacement. None will disable the feature altogether. [bug=1832978] | Leonard Richardson
2019-07-07 | It's now possible to customize the TreeBuilder object by passing keyword arguments into the BeautifulSoup constructor. The main reason to do this right now is to change how multi-valued attributes are treated. [bug=1832978] | Leonard Richardson
2016-07-26 | Clarify that Beautiful Soup is no longer compatible with versions of Python pre-2.7. Contributed by Ville Skyttä. | Leonard Richardson
2016-07-26 | Use assertEqual instead of deprecated assertEquals | Ville Skyttä
2016-07-26 | Clarify Python 2(.7) support status | Ville Skyttä
2016-07-16 | Fixed a Python 3 BytesWarning when a URL was passed in as though it were markup. Thanks to James Salter for a patch and test. [bug=1533762] | Leonard Richardson
2015-07-05 | Fixed the test_detect_utf8 test so that it works when chardet is installed. [bug=1471359] | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson
2015-06-25 | Fixed a crash in Unicode, Dammit's encoding detector when the name of the encoding itself contained invalid bytes. [bug=1360913] | Leonard Richardson
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson
2013-10-02 | Fixed a bug that caused Unicode data put into UnicodeDammit to return None instead of the original data. [bug=1214983] | Leonard Richardson
2013-10-01 | Fixed a crash when a short input contains data not valid in filenames. [bug=1232604] | Leonard Richardson
2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII when checking whether or not it was a file on disk. [bug=1227016] | Leonard Richardson
2013-08-19 | Combined two tests to stop a spurious test failure when tests are run by nosetests. [bug=1212445] | Leonard Richardson
2013-06-03 | Let's get some profiling going. | Leonard Richardson
2013-06-03 | Test that the filename warning isn't given unless the file actually exists on disk. | Leonard Richardson
2013-06-03 | Beautiful Soup will issue a warning if instead of markup you pass it a URL or the name of a file on disk (a common beginner mistake). | Leonard Richardson
2013-06-02 | Turns out we had two bits of code to strip byte-order marks. | Leonard Richardson
2013-06-02 | It turns out most of the untested code wasn't doing anything useful. | Leonard Richardson
2013-05-30 | Split out the code that guesses at encodings from the code that tries to decode a bytestring based on those encodings. This is necessary because lxml wants to do the decoding itself. | Leonard Richardson
2013-05-20 | The default XML formatter will now replace ampersands even if they appear to be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183] | Leonard Richardson
2012-08-20 | Python 3.1 also needs to skip the unicode attribute name test. | Leonard Richardson
2012-08-20 | Skipped a test under Python 2.6 to avoid a spurious test failure. [bug=1038503] | Leonard Richardson
2012-08-17 | Okay, I'll use assertEqual instead. | Leonard Richardson
2012-08-17 | Fixed a crash on encoding when an attribute name contained non-ASCII characters. | Leonard Richardson
2012-07-03 | Mentioned cchardet in docs. | Leonard Richardson