author Leonard Richardson <leonardr@segfault.org> 2021-02-14 16:53:14 -0500
committer Leonard Richardson <leonardr@segfault.org> 2021-02-14 16:53:14 -0500
commit 34e0ce8a9dd43ada1c55b50a156fbce63b1e2ebb
tree fdeb487c1f52e32c6eb4761cd2a530a24c10b8b0 /doc
parent 7201eecc09b51df5a0fb704670aa66bcc9d8e635
NavigableString and its subclasses now implement the get_text()
method, as well as the properties .strings and .stripped_strings. These will either return the string itself or nothing, so the only reason to use them is when iterating over a list of mixed Tag and NavigableString objects. [bug=1904309]
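As a rough illustration of the mixed-list case the commit message describes, the sketch below calls get_text() on every child of a tag without checking its type first. It assumes Beautiful Soup 4.10.0 or later is installed and uses the standard-library html.parser; the sample markup and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; "html.parser" avoids a dependency on lxml.
soup = BeautifulSoup("<p>Hello, <b>world</b>!</p>", "html.parser")

# soup.p.children mixes NavigableString and Tag objects.
# Assuming 4.10.0+, both kinds accept get_text(), so no isinstance()
# check is needed inside the loop.
pieces = [child.get_text() for child in soup.p.children]
print(pieces)

# .strings on a plain NavigableString yields just the string itself.
first = soup.p.contents[0]
print(list(first.strings))
```

On earlier versions the same loop would presumably have to branch on the object type, which is the inconvenience this change removes.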
Diffstat (limited to 'doc')
-rw-r--r-- doc/source/index.rst 11
1 files changed, 9 insertions, 2 deletions
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 2b5843d..63e74e2 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2312,7 +2312,7 @@ omit the closing slash in HTML void tags like "br"::
# b'<br>'
In addition, any attributes whose values are the empty string
-will become HTML-style boolean attributes:
+will become HTML-style boolean attributes::
option = BeautifulSoup('<option selected=""></option>').option
print(option.encode(formatter="html"))
@@ -2321,6 +2321,8 @@ will become HTML-style boolean attributes:
print(option.encode(formatter="html5"))
# b'<option selected></option>'
+*(This behavior is new as of Beautiful Soup 4.10.0.)*
+
If you pass in ``formatter=None``, Beautiful Soup will not modify
strings at all on output. This is the fastest option, but it may lead
to Beautiful Soup generating invalid HTML/XML, as in these examples::
@@ -2429,9 +2431,14 @@ generator instead, and process the text yourself::
*As of Beautiful Soup version 4.9.0, when lxml or html.parser are in
use, the contents of <script>, <style>, and <template>
-tags are not considered to be 'text', since those tags are not part of
+tags are generally not considered to be 'text', since those tags are not part of
the human-visible content of the page.*
+*As of Beautiful Soup version 4.10.0, you can call get_text(),
+.strings, or .stripped_strings on a NavigableString object. It will
+either return the object itself, or nothing, so the only reason to do
+this is when you're iterating over a mixed list.*
+
Specifying the parser to use
============================