Age | Commit message | Author
2011-02-19 | Preliminary work for getting XML parsing to work. | Leonard Richardson
2011-02-19 | Oh, good, html5lib correctly handles literals in <textarea> tags. | Leonard Richardson
2011-02-19 | Set up an lxml parser that only parses XML, though it's not very functional yet. | Leonard Richardson
2011-02-18 | Ported the rest of the HTML tests, including tests of broken HTML from the TODO. Made Unicode, Dammit PEP-8 compliant. | Leonard Richardson
2011-02-18 | Moved in the last of the tests from TODO. | Leonard Richardson
2011-02-18 | Ported tests of bad markup that were lying around the TODO. | Leonard Richardson
2011-02-18 | By default, Unicode Dammit converts smart quotes to Unicode characters, not XML entities. | Leonard Richardson
2011-02-18 | OK, that should do it. | Leonard Richardson
2011-02-18 | Made Unicode, Dammit more PEP-8 compliant. | Leonard Richardson
2011-02-18 | Made Unicode, Dammit more PEP-8 compliant. | Leonard Richardson
2011-02-18 | Defer to html5lib's Unicode converter rather than using Unicode, Dammit. The lxml treebuilder still uses UD. | Leonard Richardson
2011-02-18 | Fixed the test by giving it more data to sniff. | Leonard Richardson
2011-02-18 | Don't let html5lib set the original encoding to UTF-8 if the input was actually Unicode. | Leonard Richardson
2011-02-18 | Pass the user-specified encoding in to html5lib rather than dropping it on the floor. | Leonard Richardson
2011-02-18 | Have the html5lib builder set the sniffed encoding after parsing, rather than before as happens with lxml. | Leonard Richardson
2011-02-18 | Added failing encoding conversion tests for html5lib. | Leonard Richardson
2011-02-18 | Made conversion of markup to Unicode the responsibility of the builder, not the BeautifulSoup class itself. lxml uses Unicode, Dammit; html5lib uses its internal algorithms. | Leonard Richardson
2011-02-18 | Refactored the code that sets up substitutions in attribute values, and made content-type substitution work with html5lib. | Leonard Richardson
2011-02-18 | Yay, meta tag rewrites now work with html5lib. | Leonard Richardson
2011-02-18 | Still trying to get html5lib to rewrite the META tag. | Leonard Richardson
2011-02-18 | Moved the substitution code to the Tag constructor so that we don't have to rely on handle_starttag to trigger it. | Leonard Richardson
2011-02-18 | Added tests for META tag rewriting and encoding smoke tests. | Leonard Richardson
2011-02-18 | Clarified wording. | Leonard Richardson
2011-02-18 | Removed partially ported test that's now completely ported. | Leonard Richardson
2011-02-18 | Ported the encoding tests, and split them up into logical chunks. The html5lib writer isn't setting up the charset substitution. | Leonard Richardson
2011-02-18 | Renamed all the search methods to conform with PEP 8. | Leonard Richardson
2011-02-18 | Clarified CHANGELOG. | Leonard Richardson
2011-02-18 | Reminisced in the CHANGELOG. | Leonard Richardson
2011-02-18 | Fixed the findAll backwards compatibility alias. | Leonard Richardson
2011-02-18 | Fixed test failures that were masked by the compatibility methods. | Leonard Richardson
2011-02-18 | Did a bunch more renames--they're listed in the CHANGELOG. | Leonard Richardson
2011-02-18 | Renamed findAllPrevious to find_all_previous. | Leonard Richardson
2011-02-18 | Renamed findAllNext to find_all_next. | Leonard Richardson
2011-02-18 | Renamed findNext to find_next. | Leonard Richardson
2011-02-18 | Renamed a straggler generator. | Leonard Richardson
2011-02-18 | Renamed the generators and made them properties. | Leonard Richardson
2011-02-18 | Renamed recursiveChildGenerator to recursive_children. | Leonard Richardson
2011-02-18 | Renamed findAll to find_all. | Leonard Richardson
2011-02-18 | Got rid of _convertEntities altogether (though I'll probably have to introduce a method that does the opposite.) | Leonard Richardson
2011-02-18 | Removed builder argument from _convertEntities. | Leonard Richardson
2011-02-18 | Convert all entities to Unicode, don't make it configurable. | Leonard Richardson
2011-02-18 | Got rid of now-useless builder configuration. | Leonard Richardson
2011-02-13 | Ported more tests of bad declarations. | Leonard Richardson
2011-02-13 | Ported more tests of bad declarations. | Leonard Richardson
2011-02-13 | Fixed handling of doctypes and added tests for nonsensical declarations. | Leonard Richardson
2011-02-13 | Added tests of nonsensical declarations. | Leonard Richardson
2011-02-13 | Got the doctype tests to work for html5lib. | Leonard Richardson
2011-02-13 | Got a variety of doctype tests working. | Leonard Richardson
2011-02-13 | Added tests for namespaced doctypes. | Leonard Richardson
2011-02-13 | Clarified lxml's behavior w/r/t CDATA sections. | Leonard Richardson