Diffstat (limited to 'TODO')
-rw-r--r-- TODO 41
1 files changed, 29 insertions, 12 deletions
diff --git a/TODO b/TODO
index b40fb18..19dbd30 100644
--- a/TODO
+++ b/TODO
@@ -1,26 +1,43 @@
-soup.new_tar("<br>") should create an empty-element tag if the soup
+Bugs
+----
+
+* I think whitespace may not be processed correctly.
+
+* Characters like & < > should always be converted to HTML entities on
+ output, even if substitute_html_entities is False.
+
+Big features
+------------
+
+* Add namespace support.
+
+* soup.new_tag("<br>") should create an empty-element tag if the soup
was created with an HTML-aware builder, but not otherwise. This
requires keeping around information about the builder.
-Is whitespace being processed correctly?
+Optimizations
+-------------
-if len(tag) > 3 and tag.endswith('Tag'): -> endswith('_tag')
markup_attr_map can be optimized since it's always a map now.
-Can we get rid of isList?
-Split self.assertRaises(ValueError, tree.index, 1) into a separate test
-Bare ampersands should be converted to HTML entities upon output.
+BS3 features not yet ported
+---------------------------
+
+* In BS3, "soup.aTag" is the same as 'soup.find("a")'. This lets you
+locate a tag called (let's say) "find" with attribute
+access. "soup.find" won't do what you want, but "soup.findTag" will.
-Add namespace support.
+This still works In BS4 but it's deprecated. I could make
+"soup.find_tag" work the same way as "soup.find('find')", but I don't
+think it's worth it.
-XML handling:
+CDATA
+-----
The elementtree XMLParser has a strip_cdata argument that, when set to
False, should allow Beautiful Soup to preserve CDATA sections instead
-of treating them as text. (This argument is also present for
-HTMLParser, but does nothing.)
-
-Later:
+of treating them as text. Except it doesn't. (This argument is also
+present for HTMLParser, and also does nothing there.)
Currently, htm5lib converts CDATA sections into comments. An
as-yet-unreleased version of html5lib changes the parser's handling of