Age  Commit message  Author
2011-02-26  Emit an XML declaration when appropriate.  Leonard Richardson
2011-02-22  Fixed comment.  Leonard Richardson
2011-02-22  Solved the question of how to decide between &apos; (XML) and &squot; (HTML) by cutting the Gordian knot: quote the *double* quotes, which are always &quot;.  Leonard Richardson
2011-02-22  Don't turn " into &quot; except in attribute values.  Leonard Richardson
2011-02-21  Added a class for converting certain characters into XML or HTML entities, though it's not usable by the end-user yet.  Leonard Richardson
2011-02-21  Removed useless code.  Leonard Richardson
2011-02-21  PEP-8-ified more argument names.  Leonard Richardson
2011-02-21  Renamed prettyPrint to pretty_print.  Leonard Richardson
2011-02-21  Minor cleanup.  Leonard Richardson
2011-02-21  Removed the now-useless Entities class.  Leonard Richardson
2011-02-21  Got rid of isString.  Leonard Richardson
2011-02-21  Switched Tag.decode to use EntitySubstitution.substitute_xml.  Leonard Richardson
2011-02-21  Created an EntitySubstitution class that's going to take code away from UnicodeDammit, Entities, and BeautifulSoup.  Leonard Richardson
2011-02-20  Added a registry for tree builders and made it possible to find a tree builder that has the features you want from the BeautifulSoup constructor.  Leonard Richardson
2011-02-20  Fixed bug in the BS constructor lookup, and added the test file I've been working on this whole time.  Leonard Richardson
2011-02-20  Renamed the registry variable to builder_registry.  Leonard Richardson
2011-02-20  Started using the builder registry.  Leonard Richardson
2011-02-20  Renamed constructor arguments to comply with PEP 8.  Leonard Richardson
2011-02-20  Added tests for the default builder registry.  Leonard Richardson
2011-02-20  Tree builders now advertise their features.  Leonard Richardson
2011-02-20  Started work on a tagging system that should make it easy to find a tree builder that meets your needs.  Leonard Richardson
2011-02-20  Started work on a tagging system that should make it easy to find a tree builder that meets your needs.  Leonard Richardson
2011-02-20  Created a function that puts all tree-builders in a module into beautifulsoup.builders.  Leonard Richardson
2011-02-20  Simplified the builder registration.  Leonard Richardson
2011-02-20  Greatly simplified the module import code by making it take a module, not a module name.  Leonard Richardson
2011-02-20  Fixed up the code to register builders from a module.  Leonard Richardson
2011-02-20  Use registration code to register builders. The registration code will be expanded later.  Leonard Richardson
2011-02-20  Discovered that html5lib can't be made to support SoupStrainers, and changed the test suite appropriately.  Leonard Richardson
2011-02-20  Discovered that html5lib can't be made to support SoupStrainers, and changed the test suite appropriately.  Leonard Richardson
2011-02-20  Removed extraneous newlines.  Leonard Richardson
2011-02-20  I couldn't get the XML parser to parse CDATA as CData objects, but at least I documented the current behavior.  Leonard Richardson
2011-02-20  Since we can't parse in CData objects ATM, added a test for CData objects created manually, to keep the bits from rotting.  Leonard Richardson
2011-02-20  Made the XML treebuilder able to handle basic invalid XML.  Leonard Richardson
2011-02-20  Greatly improved the handling of empty-element tags.  Leonard Richardson
2011-02-20  Added a test showing weird behavior when you .insert contents into an empty-element tag.  Leonard Richardson
2011-02-20  Refactored some empty-element tests and added more.  Leonard Richardson
2011-02-20  Test that empty-element tags that get children stop being empty-element tags.  Leonard Richardson
2011-02-20  Added tests of custom lists of empty-element tags.  Leonard Richardson
2011-02-20  Added an empty-element tag test.  Leonard Richardson
2011-02-20  Tag.is_empty_element is determined dynamically, based on a) whether the builder used to create the tag had an explicit list of empty-element tags, and b) whether the tag actually contains anything.  Leonard Richardson
2011-02-20  Why is the test failing? Because I'm asserting the wrong thing.  Leonard Richardson
2011-02-19  You can now parse XML fairly reasonably.  Leonard Richardson
2011-02-19  Hacked in something to get lxml's behavior where any empty tag is treated as self-closing. This may or may not stay as is.  Leonard Richardson
2011-02-19  Made it easier to pass a custom lxml parser object into the treebuilder.  Leonard Richardson
2011-02-19  Preliminary work for getting XML parsing to work.  Leonard Richardson
2011-02-19  Oh, good, html5lib correctly handles literals in <textarea> tags.  Leonard Richardson
2011-02-19  Set up an lxml parser that only parses XML, though it's not very functional yet.  Leonard Richardson
2011-02-18  Ported the rest of the HTML tests, including tests of broken HTML from the TODO. Made Unicode, Dammit PEP-8 compliant.  Leonard Richardson
2011-02-18  Moved in the last of the tests from TODO.  Leonard Richardson
2011-02-18  Ported tests of bad markup that were lying around the TODO.  Leonard Richardson
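
The 2011-02-22 entries settle attribute quoting by escaping double quotes, and only inside attribute values. A minimal sketch of that rule follows. The names escape_markup and format_attribute are hypothetical and are not the library's API, and the sketch assumes attribute values are always wrapped in double quotes.

```python
def escape_markup(value, in_attribute=False):
    """Hypothetical helper: escape characters that are unsafe in XML or HTML."""
    # Escape ampersands first so the entities added below are not re-escaped.
    value = value.replace("&", "&amp;")
    value = value.replace("<", "&lt;").replace(">", "&gt;")
    if in_attribute:
        # &quot; is defined in both XML and HTML, so no dialect check is needed.
        value = value.replace('"', "&quot;")
    return value


def format_attribute(name, value):
    """Hypothetical helper: render name="value" with a double-quoted value."""
    # Single quotes inside the value need no escaping under this scheme.
    return '%s="%s"' % (name, escape_markup(value, in_attribute=True))


print(escape_markup('say "hi"'))
print(format_attribute("title", 'it\'s "ok" & <b>'))
```

Because single quotes never need escaping under this scheme, the choice between the XML and HTML spellings of the apostrophe entity never comes up.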