author: Leonard Richardson <leonardr@segfault.org> 2020-04-07 08:16:56 -0400
committer: Leonard Richardson <leonardr@segfault.org> 2020-04-07 08:16:56 -0400
commit: a5e762fbcbc882dffad22a53823f6e8be12c6583
tree: 124f03d1e71d5152df2fd459b15eb74b2c4104a5
parent: 783bdc774f0148fc900b7318bf069e33fbab4b67
Added a notice about the new behavior of .text to the documentation.
-rw-r--r-- CHANGELOG 3
-rw-r--r-- doc/source/index.rst 29
2 files changed, 19 insertions, 13 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 1c7d57d..e3b1a8d 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,7 +5,8 @@
NavigableString.
* Embedded CSS and Javascript is now stored in distinct Stylesheet and
- Script tags, which are ignored by methods like get_text(). This
+ Script tags, which are ignored by methods like get_text() since most
+ people don't consider this sort of content to be 'text'. This
feature is not supported by the html5lib treebuilder. [bug=1868861]
* Added a Russian translation by 'authoress' to the repository.
diff --git a/doc/source/index.rst b/doc/source/index.rst
index a233e89..dbc8c15 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -18,7 +18,7 @@ with examples. I show you what the library is good for, how it works,
how to use it, how to make it do what you want, and what to do when it
violates your expectations.
-This document covers Beautiful Soup version 4.8.1. The examples in
+This document covers Beautiful Soup version 4.9.0. The examples in
this documentation should work the same way in Python 2.7 and Python
3.2.
@@ -1708,18 +1708,17 @@ tag it contains.
CSS selectors
-------------
-As of version 4.7.0, Beautiful Soup supports most CSS4 selectors via
-the `SoupSieve <https://facelessuser.github.io/soupsieve/>`_
-project. If you installed Beautiful Soup through ``pip``, SoupSieve
-was installed at the same time, so you don't have to do anything extra.
+``BeautifulSoup`` has a ``.select()`` method which uses the `SoupSieve
+<https://facelessuser.github.io/soupsieve/>`_ package to run a CSS
+selector against a parsed document and return all the matching
+elements. ``Tag`` has a similar method which runs a CSS selector
+against the contents of a single tag.
-``BeautifulSoup`` has a ``.select()`` method which uses SoupSieve to
-run a CSS selector against a parsed document and return all the
-matching elements. ``Tag`` has a similar method which runs a CSS
-selector against the contents of a single tag.
-
-(Earlier versions of Beautiful Soup also have the ``.select()``
-method, but only the most commonly-used CSS selectors are supported.)
+(The SoupSieve integration was added in Beautiful Soup 4.7.0. Earlier
+versions also have the ``.select()`` method, but only the most
+commonly-used CSS selectors are supported. If you installed Beautiful
+Soup through ``pip``, SoupSieve was installed at the same time, so you
+don't have to do anything extra.)
The SoupSieve `documentation
<https://facelessuser.github.io/soupsieve/>`_ lists all the currently
@@ -2436,6 +2435,12 @@ generator instead, and process the text yourself::
[text for text in soup.stripped_strings]
# [u'I linked to', u'example.com']
+*As of Beautiful Soup version 4.9.0, when lxml or html.parser are in
+use, the contents of <script>, <style>, and <template>
+tags are not considered to be 'text', since those tags are not part of
+the human-visible content of the page.*
+
+
Specifying the parser to use
============================
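
Illustration (not part of the patch): the note added to doc/source/index.rst in the last hunk says that the contents of <script>, <style>, and <template> tags are no longer counted as text. The sketch below mimics that idea using only the Python standard library. It is not Beautiful Soup's implementation, and the names NON_TEXT_TAGS, VisibleTextExtractor, and visible_text are hypothetical, chosen for this example only.

```python
from html.parser import HTMLParser

# Tags whose contents the 4.9.0 note says are not considered 'text'.
NON_TEXT_TAGS = {"script", "style", "template"}


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: collects text that is not inside NON_TEXT_TAGS."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside a non-text tag

    def handle_starttag(self, tag, attrs):
        if tag in NON_TEXT_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in NON_TEXT_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside every non-text tag.
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(markup):
    """Return the concatenated text of markup, skipping non-text tags."""
    parser = VisibleTextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


markup = (
    "<p>I linked to <i>example.com</i></p>"
    "<script>var tracking = 1;</script>"
    "<style>p { color: red; }</style>"
)
print(visible_text(markup))
```

The depth counter is one simple way to skip nested markup inside a <template>. Per the CHANGELOG hunk above, Beautiful Soup itself stores embedded CSS and JavaScript in distinct Stylesheet and Script objects, which methods like get_text() ignore.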