Age | Commit message (Collapse) | Author |
|
|
|
from happening.
|
|
Beautiful Soup constructor, which lets you prohibit the detection of
an encoding that you know is wrong. [bug=1469408]
|
|
tree are properly connected via .next_element and .previous_element.
|
|
of the encoding itself contained invalid bytes. [bug=1360913]
|
|
return None instead of the original data. [bug=1214983]
|
|
declared encoding.
|
|
decode a bytestring based on those encodings. This is necessary because lxml wants to do the decoding itself.
|
|
be part of entities. That is, "<" will become "&lt;". [bug=1182183]
|
|
instead of chardet. It's much faster. [bug=1020748]
|
|
characters were replaced with REPLACEMENT CHARACTER. [bug=1013862]
|
|
declarations are now treated as preformatted strings, the way CData blocks are. [bug=1001025] Also in this commit: renamed detwingle method to detwingle().
|
|
UTF-8 documents.
|
|
encoded in UTF-16LE. [bug=988980]
|
|
Previously they were always run through the 'minimal' formatter. [bug=980237]
|
|
during Unicode conversion.
|
|
instead of errors=strict.
|
|
<meta charset="utf-8" />. [bug=837268]