Age | Commit message | Author
2013-05-20 | The default XML formatter will now replace ampersands even if they appear to be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183] | Leonard Richardson
2013-05-20 | A NavigableString object now has an immutable '.name' property whose value is always None. This makes it easier to iterate over a mixed list of tags and strings without having to check whether each element is a tag or a string. | Leonard Richardson
2013-05-20 | The .previous_element of a BeautifulSoup object is now always None, | Leonard Richardson
2013-05-20 | The .next_element attribute used during parsing was confusingly similar to the .next_element navigation attribute. Renamed the former to _most_recent_element. | Leonard Richardson
2013-05-20 | Fixed another bug by which the html5lib tree builder could create a disconnected tree. [bug=1182089] | Leonard Richardson
2013-05-20 | Gave new_string() the ability to create subclasses of NavigableString. [bug=1181986] | Leonard Richardson
2013-05-20 | html5lib now supports Python 3. Fixed some Python 2-specific code in the html5lib test suite. [bug=1181624] | Leonard Richardson
2013-05-20 | Fixed test failures when lxml is not installed. | Leonard Richardson
2013-05-15 | How about actually parsing the same markup with different parsers. | Leonard Richardson
2013-05-15 | Merge. | Leonard Richardson
2013-05-14 | Prep for release. | Leonard Richardson
2013-05-14 | Added diagnostic case for attempting to parse a URL as HTML. | Leonard Richardson
2013-05-14 | Added warning about using NavigableString outside of Beautiful Soup. | Leonard Richardson
2013-05-14 | Added a deprecation warning to has_key(). | Leonard Richardson
2013-05-09 | Changed lxml.feed() to handle the eventuality that it may be given a bytestring. | Leonard Richardson
2013-05-09 | Added a basic benchmark function to the diagnose module. | Leonard Richardson
2013-05-09 | Added a diagnostic function for randomly generating a simple, invalid HTML document. | Leonard Richardson
2013-05-09 | Thanks to data-*, there's now a good use for attrs again. This lets me clean up the docs quite a bit. | Leonard Richardson
2013-05-09 | Added <body> tag to sample doc so it will work the same on all parsers. | Leonard Richardson
2013-05-08 | Updated docs with new examples. | Leonard Richardson
2013-05-08 | Updated docs with new examples. | Leonard Richardson
2013-05-08 | A CSS selector should never match the same tag twice. | Leonard Richardson
2013-05-08 | Refactored the CSS selector support, and added the sibling combinators. | Leonard Richardson
2013-05-08 | Minor cleanup. | Leonard Richardson
2013-05-08 | Added tests. | Leonard Richardson
2013-05-08 | Fixed terminology. | Leonard Richardson
2013-05-08 | Updated news. | Leonard Richardson
2013-05-08 | Moved select() to Tag. It was always an error to call select() on a string, so there's no reason for it to be in PageElement. | Leonard Richardson
2013-05-08 | Give the checker the ability to stop the iteration over the generator by raising StopIteration. | Leonard Richardson
2013-05-08 | Aaand... it's now trivial to implement sibling selectors. | Leonard Richardson
2013-05-08 | Once again, we're back to the steady state. | Leonard Richardson
2013-05-08 | Got it all working again except for nth_child_of_type. | Leonard Richardson
2013-05-08 | Refactored again to use iterators instead of calling find_all(). | Leonard Richardson
2013-05-08 | OK, the tests pass. | Leonard Richardson
2013-05-08 | Almost there. | Leonard Richardson
2013-05-08 | We're getting there. | Leonard Richardson
2013-05-08 | Fixing test failures. | Leonard Richardson
2013-05-08 | Initial refactoring. | Leonard Richardson
2013-05-07 | Fixed up diagnose() and added it to the docs. | Leonard Richardson
2013-05-07 | Since the string part of a NavigableString is immutable, gave it a simpler __copy__ implementation. [bug=682685] | Leonard Richardson
2013-05-07 | Mentioned the CSS selector solution to searching for a tag by multiple CSS classes. | Leonard Richardson
2013-05-07 | Fixed an exception when an overspecified CSS selector didn't match anything. Code by Stefaan Lippens. [bug=1168167] | Leonard Richardson
2013-05-07 | Added support for the "nth-of-type" CSS selector. The CSS selector ">" can now find a tag by means other than the tag name. Code by Sven Slootweg. | Leonard Richardson
2013-05-07 | The prettify() method now leaves the contents of <pre> tags alone. [bug=1095654] | Leonard Richardson
2013-05-07 | Merged. | Leonard Richardson
2013-05-07 | Aliased the BeautifulSoup class to the easier-to-type "_s" and "_soup". | Leonard Richardson
2013-05-07 | Improved detection of lxml version number. | Leonard Richardson
2013-05-07 | Now that lxml's segfault on invalid doctype has been fixed, fix a corresponding problem on the Beautiful Soup end that was previously invisible. [bug=984936] | Leonard Richardson
2013-05-06 | Stop a crash when unwisely messing with a tag that's been decomposed. [bug=1097699] | Leonard Richardson
2013-05-06 | Methods like get_text() and properties like .strings now only give you strings that are visible in the document--no comments or processing commands. [bug=1050164] | Leonard Richardson
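
The 2013-05-20 entry about the default XML formatter (bug 1182183) describes a change in how ampersands are escaped on output. The sketch below is one way to picture the difference, using only the Python standard library. It is not Beautiful Soup's code: both function names and the regular expression are hypothetical, and the "entity-preserving" variant is only an assumption about what the earlier behaviour looked like.

```python
import re

# Hypothetical pattern: an ampersand that does NOT start something
# entity-shaped (&name;, &#123; or &#x1F;).
_BARE_AMPERSAND = re.compile(r"&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;)")


def escape_preserving_entities(text):
    """Assumed earlier style: leave entity-looking sequences untouched."""
    text = _BARE_AMPERSAND.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")


def escape_all_ampersands(text):
    """Style described in the entry: every ampersand is escaped."""
    # Ampersands must be handled first so the "&" introduced by
    # "&lt;" and "&gt;" below is not escaped a second time.
    text = text.replace("&", "&amp;")
    return text.replace("<", "&lt;").replace(">", "&gt;")


sample = "AT&T says 1 &lt; 2"
old_style = escape_preserving_entities(sample)
new_style = escape_all_ampersands(sample)
print(old_style)
print(new_style)
```

With the second function, a string that literally contains the characters `&lt;` survives a round trip through an XML parser, which is presumably the motivation for the change; with the first, that literal text would be read back as `<`.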