author | Leonard Richardson <leonardr@segfault.org> | 2021-02-14 16:53:14 -0500 |
---|---|---|
committer | Leonard Richardson <leonardr@segfault.org> | 2021-02-14 16:53:14 -0500 |
commit | 34e0ce8a9dd43ada1c55b50a156fbce63b1e2ebb (patch) | |
tree | fdeb487c1f52e32c6eb4761cd2a530a24c10b8b0 /doc | |
parent | 7201eecc09b51df5a0fb704670aa66bcc9d8e635 (diff) |
NavigableString and its subclasses now implement the get_text()
method, as well as the properties .strings and
.stripped_strings. Each of these returns either the string itself
or nothing, so the only reason to use them is when iterating
over a list of mixed Tag and NavigableString objects. [bug=1904309]
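A minimal sketch of the mixed-list case described above, assuming Beautiful Soup 4.10.0 or later is installed. The sample markup and the variable names are illustrative and are not part of this commit.

```python
from bs4 import BeautifulSoup

# Parse with the built-in html.parser so no extra parser is required.
soup = BeautifulSoup("<p>Hello, <b>world</b>!</p>", "html.parser")

# .contents is a mixed list: NavigableString, Tag, NavigableString.
children = soup.p.contents

# Assumption: Beautiful Soup >= 4.10.0, where NavigableString also
# implements get_text(), so no isinstance() check is needed here.
pieces = [child.get_text() for child in children]

# .stripped_strings works on both kinds of object as well; a string
# that is only whitespace would contribute nothing.
stripped = [s for child in children for s in child.stripped_strings]

print(pieces)
print(stripped)
```

Before this change, a loop like this would typically need to branch on the object type, since only Tag objects offered these methods.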
Diffstat (limited to 'doc')
-rw-r--r-- | doc/source/index.rst | 11 |
1 files changed, 9 insertions, 2 deletions
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 2b5843d..63e74e2 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2312,7 +2312,7 @@ omit the closing slash in HTML void tags like "br"::
  # b'<br>'
 
 In addition, any attributes whose values are the empty string
-will become HTML-style boolean attributes:
+will become HTML-style boolean attributes::
 
  option = BeautifulSoup('<option selected=""></option>').option
  print(option.encode(formatter="html"))
@@ -2321,6 +2321,8 @@ will become HTML-style boolean attributes:
  print(option.encode(formatter="html5"))
  # b'<option selected></option>'
 
+*(This behavior is new as of Beautiful Soup 4.10.0.)*
+
 If you pass in ``formatter=None``, Beautiful Soup will not modify
 strings at all on output. This is the fastest option, but it may lead
 to Beautiful Soup generating invalid HTML/XML, as in these examples::
@@ -2429,9 +2431,14 @@ generator instead, and process the text yourself::
 
 *As of Beautiful Soup version 4.9.0, when lxml or html.parser are in
 use, the contents of <script>, <style>, and <template>
-tags are not considered to be 'text', since those tags are not part of
+tags are generally not considered to be 'text', since those tags are not part of
 the human-visible content of the page.*
 
+*As of Beautiful Soup version 4.10.0, you can call get_text(),
+.strings, or .stripped_strings on a NavigableString object. It will
+either return the object itself, or nothing, so the only reason to do
+this is when you're iterating over a mixed list.*
+
 Specifying the parser to use
 ============================