Age | Commit message | Author
2012-02-23 | Merge from trunk and added tests. | Leonard Richardson
2012-02-23 | Removed unit tests that test different parsers' behavior on invalid markup, and replaced them with informative comparisons generated by demonstrate_parser_differences.py. | Leonard Richardson
2012-02-23 | Updated NEWS. | Leonard Richardson
2012-02-23 | Cleaned up script and added it to the MANIFEST.in. | Leonard Richardson
2012-02-22 | Added scripts. | Leonard Richardson
2012-02-22 | Minor cleanup. | Leonard Richardson
2012-02-22 | Bare strings are not HTML-escaped by default, but tags are. | Leonard Richardson
2012-02-22 | Removed tests that merely illustrated parser behavior, behavior that wouldn't break Beautiful Soup if it changed. | Leonard Richardson
2012-02-22 | Added comments. | Leonard Richardson
2012-02-22 | Treat a new namespace mapping as a set of attributes on the tag that defines it, so we don't lose the mappings. | Leonard Richardson
2012-02-21 | Have lxml invert namespace maps as they come in and set each tag's prefix appropriately. | Leonard Richardson
2012-02-21 | Added nsprefix argument to the tag class. | Leonard Richardson
2012-02-21 | Removed unused test data. | Leonard Richardson
2012-02-21 | Merged from trunk. | Leonard Richardson
2012-02-20 | It's now possible to copy a BeautifulSoup object created with the html.parser treebuilder. | Leonard Richardson
2012-02-20 | Use MANIFEST.in instead of setup.py to hold the docs and text files. | Leonard Richardson
2012-02-20 | Remove *.txt and doc from setup.py until I can figure out how to include them in the tarball without installing them. | Leonard Richardson
2012-02-20 | Temporarily skip the deepcopy test when lxml is not installed. | Leonard Richardson
2012-02-20 | lxml tests are once again run and pass when lxml is installed. | Leonard Richardson
2012-02-20 | Copied skipIf didn't work, so made a smaller one. | Leonard Richardson
2012-02-20 | Tests now pass if neither lxml nor html5lib is installed. | Leonard Richardson
2012-02-20 | Changed the class structure so that the default parser test class uses html.parser. | Leonard Richardson
2012-02-20 | Updated testing instructions. | Leonard Richardson
2012-02-20 | Added code from 2.7's standard library so that the tests will run on Python 2.6. | Leonard Richardson
2012-02-17 | Doctests don't work because the package name is wrong and README.txt isn't in any package, so comment them out for now. | Leonard Richardson
2012-02-16 | It's a start, at least. | Leonard Richardson
2012-02-16 | By default, turn unrecognized characters into numeric XML entity refs. | Leonard Richardson
2012-02-16 | Issue a warning if characters were replaced with REPLACEMENT CHARACTER during Unicode conversion. | Leonard Richardson
2012-02-16 | Prep for release. | Leonard Richardson
2012-02-15 | Added a kind of hacky way to interpret the restriction class='foo bar'. Stop generating a space before the slash that closes an empty-element tag. | Leonard Richardson
2012-02-15 | The value of a multi-valued attribute like class is always turned into a list, even if there's only one value. | Leonard Richardson
2012-02-15 | Some cdata-list attributes are only cdata lists for certain tags. | Leonard Richardson
2012-02-15 | Better defined behavior when the user wants to search for a combination of text and tag-specific arguments. [bug=695312] | Leonard Richardson
2012-02-15 | Fixed up html5lib tree builder. | Leonard Richardson
2012-02-15 | Added to NEWS. | Leonard Richardson
2012-02-15 | Clarified comment. | Leonard Richardson
2012-02-15 | Removed _nodeIndex, because the misfeature it works around is now gone. | Leonard Richardson
2012-02-15 | Minor cleanup. | Leonard Richardson
2012-02-15 | Tested and cleaned up html5lib insertBefore. | Leonard Richardson
2012-02-15 | Use append instead of insert. | Leonard Richardson
2012-02-15 | Tested improvements to html5lib treebuilder. | Leonard Richardson
2012-02-15 | Minor cleanup. | Leonard Richardson
2012-02-15 | Tested that extract() distinguishes between identical strings. | Leonard Richardson
2012-02-09 | Bumped version number. | Leonard Richardson
2012-02-09 | Added bug reference. | Leonard Richardson
2012-02-09 | Corrected documentation. | Leonard Richardson
2012-02-09 | As a last-ditch attempt to turn data into Unicode, use errors=replace instead of errors=strict. | Leonard Richardson
2012-02-09 | Patched over a bug in html5lib (?) that was crashing Beautiful Soup on certain kinds of markup. [bug=838800] | Leonard Richardson
2012-02-09 | Unicode, Dammit now detects the encoding in HTML 5-style <meta> tags like <meta charset="utf-8" />. [bug=837268] | Leonard Richardson
2012-02-09 | Minor Unicode, Dammit cleanup. | Leonard Richardson
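
The 2012-02-09 entry about using errors=replace as a last-ditch fallback, together with the 2012-02-16 entry about REPLACEMENT CHARACTER, can be illustrated with a small standard-library sketch. This is not Beautiful Soup's actual code: the function name `decode_with_fallback`, its parameters, and its return shape are hypothetical and chosen only for illustration. It assumes a list of candidate encodings is tried strictly first, and only then is a lossy decode attempted.

```python
def decode_with_fallback(data, candidate_encodings=("utf-8",), last_resort="utf-8"):
    """Hypothetical helper (not part of Beautiful Soup).

    Returns (text, encoding_used, contains_replacement_characters).
    """
    # First pass: strict decoding, so a wrong guess fails loudly
    # instead of silently corrupting the text.
    for encoding in candidate_encodings:
        try:
            return data.decode(encoding, errors="strict"), encoding, False
        except UnicodeDecodeError:
            continue

    # Last-ditch pass: errors="replace" never raises; undecodable bytes
    # become U+FFFD REPLACEMENT CHARACTER.
    text = data.decode(last_resort, errors="replace")
    # Report whether any replacement happened so a caller could warn.
    return text, last_resort, "\ufffd" in text
```

A caller could use the third element of the result to decide whether to issue a warning about lossy conversion.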