author Leonard Richardson <leonard.richardson@canonical.com> 2012-04-26 07:32:53 -0400
committer Leonard Richardson <leonard.richardson@canonical.com> 2012-04-26 07:32:53 -0400
commit 3ff7bde5d320fbec4c16e7f245c345e8455ca887
tree 97f12ee78940b9a9e70b560182e6d3d8f33f0549 /doc/source
parent 2261264fb5b4cc8a9095a1b14a92b52258e8029e
Fixed test failure when lxml is not installed.
Diffstat (limited to 'doc/source')
-rw-r--r-- doc/source/index.rst 7
1 files changed, 7 insertions, 0 deletions
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 734851d..5b65354 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2670,6 +2670,13 @@ deprecated and removed in Python 3.0. Beautiful Soup 4 uses
``html.parser`` by default, but you can plug in lxml or html5lib and
use that instead. See `Installing a parser`_ for a comparison.
+Since ``html.parser`` is not the same parser as ``SGMLParser``, it
+will treat invalid markup differently. Usually the "difference" is
+that ``html.parser`` crashes. In that case, you'll need to install
+another parser. But sometimes ``html.parser`` just creates a different
+parse tree than ``SGMLParser`` would. If this happens, you may need to
+update your BS3 scraping code to deal with the new tree.
+
Method names
^^^^^^^^^^^^
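
The paragraph added by this commit says that ``html.parser`` handles invalid markup in its own way. As a rough illustration, and not part of the commit, the sketch below uses only the standard-library ``html.parser`` module to record the events it reports for a deliberately malformed snippet. ``EventRecorder`` and ``record_events`` are hypothetical names introduced here. Beautiful Soup's own tree builder decides how events like these become a parse tree, so this shows the parser's raw view only.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: collects the events html.parser emits."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def record_events(markup):
    """Feed markup to the parser and return the recorded event list."""
    recorder = EventRecorder()
    recorder.feed(markup)
    recorder.close()  # flush any buffered text
    return recorder.events


# Invalid markup: two unclosed <p> tags and a stray </b>.
events = record_events("<p>one<p>two</b>")
for event in events:
    print(event)
```

Comparing an event list like this against what older BS3 scraping code expected may help locate where the two parse trees diverge.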