diff options
Diffstat (limited to 'TODO.txt')
-rw-r--r-- | TODO.txt | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/TODO.txt b/TODO.txt new file mode 100644 index 0000000..2f03dd2 --- /dev/null +++ b/TODO.txt @@ -0,0 +1,42 @@ +Bugs +---- + +* I think whitespace may not be processed correctly. + +* html5lib doesn't support SoupStrainers, which is OK, but there + should be a warning about it. + +Big features +------------ + +* Add namespace support. + +Optimizations +------------- + +markup_attr_map can be optimized since it's always a map now. + +BS3 features not yet ported +--------------------------- + +* In BS3, "soup.aTag" is the same as 'soup.find("a")'. This lets you +locate a tag called (let's say) "find" with attribute +access. "soup.find" won't do what you want, but "soup.findTag" will. + +This still works In BS4 but it's deprecated. I could make +"soup.find_tag" work the same way as "soup.find('find')", but I don't +think it's worth it. + +CDATA +----- + +The elementtree XMLParser has a strip_cdata argument that, when set to +False, should allow Beautiful Soup to preserve CDATA sections instead +of treating them as text. Except it doesn't. (This argument is also +present for HTMLParser, and also does nothing there.) + +Currently, htm5lib converts CDATA sections into comments. An +as-yet-unreleased version of html5lib changes the parser's handling of +CDATA sections to allow CDATA sections in tags like <svg> and +<math>. The HTML5TreeBuilder will need to be updated to create CData +objects instead of Comment objects in this situation. |