path: root/bs4/tests/test_soup.py
Age | Commit message | Author
2016-07-26 | Clarify that Beautiful Soup is no longer compatible with versions of Python pre-2.7. Contributed by Ville Skyttä. | Leonard Richardson
2016-07-26 | Use assertEqual instead of deprecated assertEquals | Ville Skyttä
2016-07-26 | Clarify Python 2(.7) support status | Ville Skyttä
2016-07-16 | Fixed a Python 3 BytesWarning when a URL was passed in as though it were markup. Thanks to James Salter for a patch and test. [bug=1533762] | Leonard Richardson
2015-07-05 | Fixed the test_detect_utf8 test so that it works when chardet is installed. [bug=1471359] | Leonard Richardson
2015-06-27 | Added an exclude_encodings argument to UnicodeDammit and to the Beautiful Soup constructor, which lets you prohibit the detection of an encoding that you know is wrong. [bug=1469408] | Leonard Richardson
2015-06-25 | Fixed a crash in Unicode, Dammit's encoding detector when the name of the encoding itself contained invalid bytes. [bug=1360913] | Leonard Richardson
2014-12-07 | Issue a warning if the BeautifulSoup constructor arguments do not explicitly name a parser. | Leonard Richardson
2013-10-02 | Fixed a bug that caused Unicode data put into UnicodeDammit to return None instead of the original data. [bug=1214983] | Leonard Richardson
2013-10-01 | Fixed a crash when a short input contains data not valid in filenames. [bug=1232604] | Leonard Richardson
2013-10-01 | Fixed a bug in which short Unicode input was improperly encoded to ASCII when checking whether or not it was a file on disk. [bug=1227016] | Leonard Richardson
2013-08-19 | Combined two tests to stop a spurious test failure when tests are run by nosetests. [bug=1212445] | Leonard Richardson
2013-06-03 | Let's get some profiling going. | Leonard Richardson
2013-06-03 | Test that the filename warning isn't given unless the file actually exists on disk. | Leonard Richardson
2013-06-03 | Beautiful Soup will issue a warning if instead of markup you pass it a URL or the name of a file on disk (a common beginner mistake). | Leonard Richardson
2013-06-02 | Turns out we had two bits of code to strip byte-order marks. | Leonard Richardson
2013-06-02 | It turns out most of the untested code wasn't doing anything useful. | Leonard Richardson
2013-05-30 | Split out the code that guesses at encodings from the code that tries to decode a bytestring based on those encodings. This is necessary because lxml wants to do the decoding itself. | Leonard Richardson
2013-05-20 | The default XML formatter will now replace ampersands even if they appear to be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183] | Leonard Richardson
2012-08-20 | Python 3.1 also needs to skip the unicode attribute name test. | Leonard Richardson
2012-08-20 | Skipped a test under Python 2.6 to avoid a spurious test failure. [bug=1038503] | Leonard Richardson
2012-08-17 | Okay, I'll use assertEqual instead. | Leonard Richardson
2012-08-17 | Fixed a crash on encoding when an attribute name contained non-ASCII characters. | Leonard Richardson
2012-07-03 | Mentioned cchardet in docs. | Leonard Richardson
2012-07-03 | When sniffing encodings, if the cchardet library is installed, use it instead of chardet. It's much faster. [bug=1020748] | Leonard Richardson
2012-07-03 | Use logging.warning() instead of warnings.warn() to notify the user that characters were replaced with REPLACEMENT CHARACTER. [bug=1013862] | Leonard Richardson
2012-05-24 | Fixed a bug with the lxml treebuilder that prevented the user from adding attributes to a tag that didn't originally have any. [bug=1002378] Thanks to Oliver Beattie for the patch. | Leonard Richardson
2012-04-27 | Added experimental support for fixing Windows-1252 characters embedded in UTF-8 documents. | Leonard Richardson
2012-04-26 | Fixed a bug in decoding data that contained a byte-order mark, such as data encoded in UTF-16LE. [bug=988980] | Leonard Richardson
2012-04-26 | Fixed test failure when lxml is not installed. | Leonard Richardson
2012-04-18 | Made encoding substitution in <meta> tags completely transparent (no more %SOUP-ENCODING%). | Leonard Richardson
2012-04-16 | Unicode, Dammit now has an option to turn MS smart quotes into ASCII characters. | Leonard Richardson
2012-03-01 | For backwards compatibility, brought back the BeautifulStoneSoup class as a deprecated wrapper around BeautifulSoup. | Leonard Richardson
2012-02-26 | Fixed DOCTYPE handling. | Leonard Richardson
2012-02-24 | Fixed a test failure that occurred on Python 3.x when chardet was installed. | Leonard Richardson
2012-02-23 | Fixed handling of the closing of namespaced tags. | Leonard Richardson
2012-02-23 | Bumped version number. | Leonard Richardson
2012-02-23 | Namespaced attributes are equal if they correspond to the same string. | Leonard Richardson
2012-02-22 | Removed tests that merely illustrated parser behavior, behavior that wouldn't break Beautiful Soup if it changed. | Leonard Richardson
2012-02-20 | Changed the class structure so that the default parser test class uses html.parser. | Leonard Richardson
2012-02-16 | Issue a warning if characters were replaced with REPLACEMENT CHARACTER during Unicode conversion. | Leonard Richardson
2012-02-09 | As a last-ditch attempt to turn data into Unicode, use errors=replace instead of errors=strict. | Leonard Richardson
2012-02-09 | Unicode, Dammit now detects the encoding in HTML 5-style <meta> tags like <meta charset="utf-8" />. [bug=837268] | Leonard Richardson
2012-01-20 | Tests now work in both versions, and it's possible to test both versions by running one command. | Leonard Richardson
2012-01-20 | Made it easier to convert BS3 code to BS4. | Leonard Richardson
2012-01-20 | Replaced assertEquals with assertEqual to get rid of deprecation notice. | Leonard Richardson
2011-06-29 | Various changes so most tests pass on Python 3. | Thomas Kluyver