-rw-r--r-- CHANGELOG 30
-rw-r--r-- doc/source/index.rst 34
2 files changed, 55 insertions, 9 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 019ace4..c7eb9a7 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,16 +1,28 @@
-= Unreleased
+= 4.8.0 (Unreleased)
* It's now possible to customize the TreeBuilder object by passing
- keyword arguments into the BeautifulSoup constructor. The main
- reason to do this right now is to change how multi-valued
- attributes are treated -- you can do this with the
- `multi_valued_attributes` argument. [bug=1832978]
+ keyword arguments into the BeautifulSoup constructor. The main
+ reason to do this right now is to change how multi-valued
+ attributes are treated -- you can do this with the
+ `multi_valued_attributes` argument. [bug=1832978]
-* A Formatter can now decide how (or whether) to order the attributes
- inside a tag. [bug=1812422]
+* The role of Formatter objects has been greatly expanded. It now contains
+ consolidated code for controlling the following:
-* &apos; (which is valid in XML and XHTML, but not HTML 4) is now
- recognized as a named entity and converted to a single quote. [bug=1818721]
+ - The function to call to perform entity substitution. (This was
+ previously Formatter's only job.)
+ - Which tags should be treated as containing CDATA and have their
+ contents exempt from entity substitution.
+ - The order in which a tag's attributes are output. [bug=1812422]
+ - Whether or not to put a '/' inside a void element, e.g. '<br/>' vs '<br>'
+
+ All preexisting code should work as before.
+
+* Added a new method to the API, Tag.smooth(), which consolidates
+ multiple adjacent NavigableString elements.
+
+* &apos; (which is valid in XML, XHTML, and HTML 5, but not HTML 4) is now
+ recognized as a named entity and converted to a single quote. [bug=1818721]
= 4.7.1 (20190106)
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 4bca0ae..9ef8ef4 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2112,6 +2112,40 @@ whatever's inside that tag. It's good for stripping out markup::
Like ``replace_with()``, ``unwrap()`` returns the tag
that was replaced.
+``smooth()``
+---------------------------
+
+After calling a bunch of methods that modify the parse tree, you may end up with two or more ``NavigableString`` objects next to each other. Beautiful Soup doesn't have any problems with this, but since it can't happen in a freshly parsed document, you might not expect behavior like the following::
+
+ soup = BeautifulSoup("<p>A one</p>")
+ soup.p.append(", a two")
+
+ soup.p.contents
+ # [u'A one', u', a two']
+
+ print(soup.p.encode())
+ # <p>A one, a two</p>
+
+ print(soup.p.prettify())
+ # <p>
+ # A one
+ # , a two
+ # </p>
+
+You can call ``Tag.smooth()`` to clean up the parse tree by consolidating adjacent strings::
+
+ soup.smooth()
+
+ soup.p.contents
+ # [u'A one, a two']
+
+ print(soup.p.prettify())
+ # <p>
+ # A one, a two
+ # </p>
+
+The ``smooth()`` method is new in Beautiful Soup 4.8.0.
+
Output
======
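
Reading aid for the ``smooth()`` hunk above: the sketch below illustrates what "consolidating adjacent strings" means, using plain Python lists instead of a real parse tree. It is not Beautiful Soup's implementation. The helper name ``smooth_children`` and the list-based model (a ``str`` stands in for a ``NavigableString``, a nested ``list`` stands in for a child tag) are assumptions made purely for illustration.

```python
def smooth_children(children):
    """Return a new list in which runs of adjacent str items are merged.

    Hypothetical model: a str stands in for a NavigableString and a
    nested list stands in for a child tag, which is smoothed recursively.
    """
    smoothed = []
    for child in children:
        if isinstance(child, list):
            # A "child tag": smooth its own children, keep it as one item.
            smoothed.append(smooth_children(child))
        elif isinstance(child, str) and smoothed and isinstance(smoothed[-1], str):
            # Previous item is also a string: merge instead of appending.
            smoothed[-1] += child
        else:
            smoothed.append(child)
    return smoothed


# Mirrors the documentation example: two adjacent strings become one.
print(smooth_children(["A one", ", a two"]))
```

Strings separated by a child tag are left apart, since they are not adjacent; only uninterrupted runs of strings are merged.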