Age | Commit message | Author | |
---|---|---|---|
2013-05-31 | Create a new lxml parser object for every new parsing strategy. | Leonard Richardson | |
2013-05-30 | Refactored code a bit. | Leonard Richardson | |
2013-05-30 | Split out the code that guesses at encodings from the code that tries to decode a bytestring based on those encodings. This is necessary because lxml wants to do the decoding itself. | Leonard Richardson | |
2013-05-20 | The default XML formatter will now replace ampersands even if they appear to be part of entities. That is, "&lt;" will become "&amp;lt;". [bug=1182183] | Leonard Richardson | |
2013-05-20 | A NavigableString object now has an immutable '.name' property whose value is always None. This makes it easier to iterate over a mixed list of tags and strings without having to check whether each element is a tag or a string. | Leonard Richardson | |
2013-05-20 | The .next_element attribute used during parsing was confusingly similar to the .next_element navigation attribute. Renamed the former to _most_recent_element. | Leonard Richardson | |
2013-05-20 | Fixed another bug by which the html5lib tree builder could create a disconnected tree. [bug=1182089] | Leonard Richardson | |
2013-05-20 | Gave new_string() the ability to create subclasses of NavigableString. [bug=1181986] | Leonard Richardson | |
2013-05-20 | html5lib now supports Python 3. Fixed some Python 2-specific code in the html5lib test suite. [bug=1181624] | Leonard Richardson | |
2013-05-20 | Fixed test failures when lxml is not installed. | Leonard Richardson | |
2013-05-15 | How about actually parsing the same markup with different parsers. | Leonard Richardson | |
2013-05-15 | Merge. | Leonard Richardson | |
2013-05-14 | Added diagnostic case for attempting to parse a URL as HTML. | Leonard Richardson | |
2013-05-14 | Added a deprecation warning to has_key(). | Leonard Richardson | |
2013-05-09 | Changed lxml.feed() to handle the eventuality that it may be given a bytestring. | Leonard Richardson | |
2013-05-09 | Added a basic benchmark function to the diagnose module. | Leonard Richardson | |
2013-05-09 | Added a diagnostic function for randomly generating a simple, invalid HTML document. | Leonard Richardson | |
2013-05-08 | A CSS selector should never match the same tag twice. | Leonard Richardson | |
2013-05-08 | Minor cleanup. | Leonard Richardson | |
2013-05-08 | Added tests. | Leonard Richardson | |
2013-05-08 | Fixed terminology. | Leonard Richardson | |
2013-05-08 | Moved select() to Tag. It was always an error to call select() on a string, so there's no reason for it to be in PageElement. | Leonard Richardson | |
2013-05-08 | Give the checker the ability to stop the iteration over the generator by raising StopIteration. | Leonard Richardson | |
2013-05-08 | Aaand... it's now trivial to implement sibling selectors. | Leonard Richardson | |
2013-05-08 | Once again, we're back to the steady state. | Leonard Richardson | |
2013-05-08 | Got it all working again except for nth_child_of_type. | Leonard Richardson | |
2013-05-08 | Refactored again to use iterators instead of calling find_all(). | Leonard Richardson | |
2013-05-08 | OK, the tests pass. | Leonard Richardson | |
2013-05-08 | Almost there. | Leonard Richardson | |
2013-05-08 | We're getting there. | Leonard Richardson | |
2013-05-08 | Fixing test failures. | Leonard Richardson | |
2013-05-08 | Initial refactoring. | Leonard Richardson | |
2013-05-07 | Fixed up diagnose() and added it to the docs. | Leonard Richardson | |
2013-05-07 | Since the string part of a NavigableString is immutable, gave it a simpler __copy__ implementation. [bug=682685] | Leonard Richardson | |
2013-05-07 | Fixed an exception when an overspecified CSS selector didn't match anything. Code by Stefaan Lippens. [bug=1168167] | Leonard Richardson | |
2013-05-07 | Added support for the "nth-of-type" CSS selector. The CSS selector ">" can now find a tag by means other than the tag name. Code by Sven Slootweg. | Leonard Richardson | |
2013-05-07 | The prettify() method now leaves the contents of <pre> tags alone. [bug=1095654] | Leonard Richardson | |
2013-05-07 | Merged. | Leonard Richardson | |
2013-05-07 | Aliased the BeautifulSoup class to the easier-to-type "_s" and "_soup". | Leonard Richardson | |
2013-05-07 | Improved detection of lxml version number. | Leonard Richardson | |
2013-05-07 | Now that lxml's segfault on invalid doctype has been fixed, fix a corresponding problem on the Beautiful Soup end that was previously invisible. [bug=984936] | Leonard Richardson | |
2013-05-06 | Stop a crash when unwisely messing with a tag that's been decomposed. [bug=1097699] | Leonard Richardson | |
2013-05-06 | Methods like get_text() and properties like .strings now only give you strings that are visible in the document--no comments or processing commands. [bug=1050164] | Leonard Richardson | |
2013-05-06 | Fix a bug by which keyword arguments to find_parent() were not being passed on. [bug=1126734] | Leonard Richardson | |
2013-05-06 | In an HTML document, the contents of a <script> or <style> tag will no longer undergo entity substitution by default. XML documents work the same way they did before. [bug=1085953] | Leonard Richardson | |
2013-05-06 | Added failing test. | Leonard Richardson | |
2013-05-06 | Added a library full of diagnostics to make tech support easier. | Leonard Richardson | |
2012-11-03 | Merged in changes made elsewhere. | Leonard Richardson | |
2012-11-03 | Doc fixes. | Leonard Richardson | |
2012-10-11 | Fix a bug in the lxml treebuilder which crashed when a tag included an attribute from the predefined xml: namespace. [bug=1065617] | Leonard Richardson | |