path: root/bs4/tests
Age | Commit message | Author
2013-06-02 | Turns out we had two bits of code to strip byte-order marks. | Leonard Richardson
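Stripping a byte-order mark amounts to checking a bytestring for a known prefix. The following is a minimal standard-library sketch of that idea; `strip_byte_order_mark` is a hypothetical helper written for illustration and is not taken from the Beautiful Soup code base.

```python
import codecs

# Longer marks come first: the UTF-32 LE mark begins with the UTF-16 LE mark,
# so checking UTF-16 first would misidentify UTF-32 data.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def strip_byte_order_mark(data):
    """Return (data_without_bom, encoding_implied_by_bom_or_None)."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):], encoding
    return data, None
```

Keeping this logic in one place means every caller sees the same answer about whether a mark was present and which encoding it implies.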
2013-06-02 | It turns out most of the untested code wasn't doing anything useful. | Leonard Richardson
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson
2013-05-30 | Split out the code that guesses at encodings from the code that tries to decode a bytestring based on those encodings. This is necessary because lxml wants to do the decoding itself. | Leonard Richardson
2013-05-20 | The default XML formatter will now replace ampersands even if they appear to be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183] | Leonard Richardson
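The same "always escape the ampersand" behavior can be seen in Python's standard library. This sketch uses `xml.sax.saxutils.escape` purely as an illustration of the rule; it is not Beautiful Soup's formatter.

```python
from xml.sax.saxutils import escape

# escape() rewrites every "&" before anything else, so text that already
# looks like an entity is escaped again rather than passed through.
plain = escape("AT&T")
already_entity = escape("&lt;")

print(plain)           # AT&amp;T
print(already_entity)  # &amp;lt;
```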
2013-05-20 | A NavigableString object now has an immutable '.name' property whose value is always None. This makes it easier to iterate over a mixed list of tags and strings without having to check whether each element is a tag or a string. | Leonard Richardson
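A rough sketch of why this helps, using two hypothetical stand-in classes (`Element` and `Text`) rather than the real Beautiful Soup types: once every node answers `.name`, a loop over mixed content needs no type checks.

```python
class Element:
    """Hypothetical stand-in for a tag: it carries a real name."""

    def __init__(self, name):
        self.name = name


class Text(str):
    """Hypothetical stand-in for a string node."""

    @property
    def name(self):
        # No setter is defined, so assigning to .name raises AttributeError.
        return None


def tag_names(nodes):
    """Collect tag names from a mixed list without isinstance() checks."""
    return [node.name for node in nodes if node.name is not None]


mixed = [Element("p"), Text("hello"), Element("b"), Text("world")]
print(tag_names(mixed))  # ['p', 'b']
```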
2013-05-20 | Gave new_string() the ability to create subclasses of NavigableString. [bug=1181986] | Leonard Richardson
2013-05-20 | html5lib now supports Python 3. Fixed some Python 2-specific code in the html5lib test suite. [bug=1181624] | Leonard Richardson
2013-05-20 | Fixed test failures when lxml is not installed. | Leonard Richardson
2013-05-15 | Merge. | Leonard Richardson
2013-05-14 | Added a deprecation warning to has_key(). | Leonard Richardson
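The usual Python pattern for deprecating a method looks roughly like the sketch below. `AttributeBag` is a hypothetical container, and the message text and the `DeprecationWarning` category are assumptions; the commit does not say which were used.

```python
import warnings


class AttributeBag:
    """Hypothetical container, used only to illustrate the pattern."""

    def __init__(self, attrs):
        self.attrs = dict(attrs)

    def __contains__(self, key):
        return key in self.attrs

    def has_key(self, key):
        # stacklevel=2 attributes the warning to the caller's line,
        # which is where the deprecated call needs to be changed.
        warnings.warn(
            "has_key() is deprecated; use the 'in' operator instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return key in self
```

The old method keeps working, so existing callers are not broken while they migrate.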
2013-05-09 | Changed lxml.feed() to handle the eventuality that it may be given a bytestring. | Leonard Richardson
2013-05-08 | A CSS selector should never match the same tag twice. | Leonard Richardson
2013-05-08 | Added tests. | Leonard Richardson
2013-05-08 | Aaand... it's now trivial to implement sibling selectors. | Leonard Richardson
2013-05-08 | OK, the tests pass. | Leonard Richardson
2013-05-08 | We're getting there. | Leonard Richardson
2013-05-07 | Fixed an exception when an overspecified CSS selector didn't match anything. Code by Stefaan Lippens. [bug=1168167] | Leonard Richardson
2013-05-07 | Added support for the "nth-of-type" CSS selector. The CSS selector ">" can now find a tag by means other than the tag name. Code by Sven Slootweg. | Leonard Richardson
2013-05-07 | The prettify() method now leaves the contents of <pre> tags alone. [bug=1095654] | Leonard Richardson
2013-05-07 | Improved detection of lxml version number. | Leonard Richardson
2013-05-07 | Now that lxml's segfault on invalid doctype has been fixed, fix a corresponding problem on the Beautiful Soup end that was previously invisible. [bug=984936] | Leonard Richardson
2013-05-06 | Methods like get_text() and properties like .strings now only give you strings that are visible in the document--no comments or processing commands. [bug=1050164] | Leonard Richardson
2013-05-06 | Fix a bug by which keyword arguments to find_parent() were not being passed on. [bug=1126734] | Leonard Richardson
2013-05-06 | In an HTML document, the contents of a <script> or <style> tag will no longer undergo entity substitution by default. XML documents work the same way they did before. [bug=1085953] | Leonard Richardson
2013-05-06 | Added failing test. | Leonard Richardson
2012-08-21 | Fixed a problem with the html5lib builder not handling comments correctly. | Leonard Richardson
2012-08-20 | Python 3.1 also needs to skip the unicode attribute name test. | Leonard Richardson
2012-08-20 | Raise a more specific error (FeatureNotFound) when a requested parser or parser feature is not installed. Raise NotImplementedError instead of ValueError when the user calls insert_before() or insert_after() on the BeautifulSoup object itself. Patch by Aaron Devore. [bug=1038301] | Leonard Richardson
2012-08-20 | Skipped a test under Python 2.6 to avoid a spurious test failure. [bug=1038503] | Leonard Richardson
2012-08-17 | Okay, I'll use assertEqual instead. | Leonard Richardson
2012-08-17 | Fixed a crash on encoding when an attribute name contained non-ASCII characters. | Leonard Richardson
2012-08-16 | As per PEP-8, allow searching by CSS class using the 'class_' keyword argument. [bug=1037624] | Leonard Richardson
2012-07-03 | Mentioned cchardet in docs. | Leonard Richardson
2012-07-03 | When sniffing encodings, if the cchardet library is installed, use it instead of chardet. It's much faster. [bug=1020748] | Leonard Richardson
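A "prefer the fast library, fall back to the slow one" lookup can be sketched as follows. `load_detector` and `sniff_encoding` are hypothetical names chosen for this example; the sketch relies only on both libraries exposing a `detect()` function that returns a dict with an `"encoding"` key.

```python
import importlib


def load_detector():
    """Return (module_name, detect_function), or (None, None) if neither
    library is installed. cchardet is tried first because it is faster."""
    for module_name in ("cchardet", "chardet"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        return module_name, module.detect
    return None, None


def sniff_encoding(data):
    """Guess the encoding of a bytestring, or return None when no
    detection library is available."""
    _, detect = load_detector()
    if detect is None:
        return None
    return detect(data)["encoding"]
```

Because both libraries are optional, callers must be prepared for `None` and fall back to some other strategy.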
2012-07-03 | Use logging.warning() instead of warnings.warn() to notify the user that characters were replaced with REPLACEMENT CHARACTER. [bug=1013862] | Leonard Richardson
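The notification pattern described here might look like the sketch below. `decode_with_notice` is a hypothetical helper and the log message wording is an assumption, not the text Beautiful Soup emits.

```python
import logging

REPLACEMENT_CHARACTER = "\ufffd"


def decode_with_notice(data, encoding):
    """Decode a bytestring, logging a warning if any byte sequence had to
    be replaced with U+FFFD REPLACEMENT CHARACTER."""
    text = data.decode(encoding, errors="replace")
    # Note: this also fires if the input legitimately contained U+FFFD.
    if REPLACEMENT_CHARACTER in text:
        logging.warning(
            "Some characters could not be decoded and were replaced "
            "with REPLACEMENT CHARACTER."
        )
    return text
```

One practical difference between the two mechanisms: messages sent through `logging` follow the application's logging configuration, whereas `warnings` are subject to warning filters, which can suppress repeats or turn them into exceptions.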
2012-05-24 | Fixed the inability to search for non-ASCII attribute values. [bug=1003974] This caused a major refactoring of the search code. All the tests pass, but it's possible that some searches will behave differently. | Leonard Richardson
2012-05-24 | Fixed the basic failure in [bug=1003974], but not more advanced cases. | Leonard Richardson
2012-05-24 | Fixed some edge-case bugs having to do with inserting an element into a tag it's already inside, and replacing one of a tag's children with another. [bug=997529] | Leonard Richardson
2012-05-24 | Fixed a bug with the lxml treebuilder that prevented the user from adding attributes to a tag that didn't originally have any. [bug=1002378] Thanks to Oliver Beattie for the patch. | Leonard Richardson
2012-05-03 | Fixed the handling of &quot; with the built-in parser. [bug=993871] | Leonard Richardson
2012-04-27 | Added experimental support for fixing Windows-1252 characters embedded in UTF-8 documents. | Leonard Richardson
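One way to approach mixed-encoding input with only the standard library is a custom codec error handler that reinterprets invalid UTF-8 bytes as Windows-1252. This is a sketch of the general idea, not necessarily the technique this commit uses; the handler name `cp1252_fallback` and the function `decode_mixed` are invented for the example.

```python
import codecs


def _cp1252_fallback(error):
    """Decode bytes that are not valid UTF-8 as Windows-1252 instead."""
    if not isinstance(error, UnicodeDecodeError):
        raise error
    bad_bytes = error.object[error.start:error.end]
    # A few byte values are undefined in Windows-1252; replace those.
    return bad_bytes.decode("cp1252", errors="replace"), error.end


# "cp1252_fallback" is a name invented for this sketch.
codecs.register_error("cp1252_fallback", _cp1252_fallback)


def decode_mixed(data):
    """Decode mostly-UTF-8 data that has stray Windows-1252 bytes in it."""
    return data.decode("utf-8", errors="cp1252_fallback")


mixed = "\u2603 ".encode("utf-8") + b"\x93quoted\x94"
print(decode_mixed(mixed))
```

The approach is a heuristic: a run of Windows-1252 bytes that happens to form a valid UTF-8 sequence will still be read as UTF-8.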
2012-04-26 | Added a new method, wrap(). | Leonard Richardson
2012-04-26 | Renamed replace_with_children() to the jQuery name, unwrap(). | Leonard Richardson
2012-04-26 | Fixed a bug in decoding data that contained a byte-order mark, such as data encoded in UTF-16LE. [bug=988980] | Leonard Richardson
2012-04-26 | Upon document generation, CData objects are no longer run through the formatter. [bug=988905] | Leonard Richardson
2012-04-26 | Fixed test failure when lxml is not installed. | Leonard Richardson
2012-04-18 | Made encoding substitution in <meta> tags completely transparent (no more %SOUP-ENCODING%). | Leonard Richardson
2012-04-18 | Fixed a bug that made the HTMLParser treebuilder generate XML definitions ending with two question marks instead of one. [bug=984258] | Leonard Richardson
2012-04-16 | Unicode, Dammit now has an option to turn MS smart quotes into ASCII characters. | Leonard Richardson
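The conversion itself is simple to sketch with the standard library: decode the Windows-1252 bytes, then map the four curly-quote code points to their ASCII counterparts. `decode_with_ascii_quotes` is a hypothetical helper for illustration and is not the Unicode, Dammit interface.

```python
# Map the Unicode curly quotes (bytes 0x91-0x94 in Windows-1252)
# to plain ASCII quote characters.
SMART_QUOTES_TO_ASCII = {
    0x2018: "'",  # left single quotation mark
    0x2019: "'",  # right single quotation mark
    0x201C: '"',  # left double quotation mark
    0x201D: '"',  # right double quotation mark
}


def decode_with_ascii_quotes(data):
    """Decode Windows-1252 bytes, flattening smart quotes to ASCII."""
    return data.decode("windows-1252").translate(SMART_QUOTES_TO_ASCII)


print(decode_with_ascii_quotes(b"\x93It\x92s fine\x94"))  # "It's fine"
```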