author Leonard Richardson <leonard.richardson@canonical.com> 2012-04-16 10:46:36 -0400
committer Leonard Richardson <leonard.richardson@canonical.com> 2012-04-16 10:46:36 -0400
commit bb02cc186306b946faaff474ce738acefa9f9ab1 (patch)
tree db5a272651c21acbe39d55810486f791f9a4edf0 /doc/source
parent 3793495c8ea91243f9689d9788d30b9c6e0740d7 (diff)
Doc update.
Diffstat (limited to 'doc/source')
-rw-r--r-- doc/source/index.rst 112
1 files changed, 59 insertions, 53 deletions
diff --git a/doc/source/index.rst b/doc/source/index.rst
index a7757d6..5abc597 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2505,59 +2505,26 @@ thought I'd mention it::
Troubleshooting
===============
-Common Problems
----------------
+Version mismatch problems
+-------------------------
+
+* ``SyntaxError: Invalid syntax`` (on the line ``ROOT_TAG_NAME =
+ u'[document]'``): Caused by running the Python 2 version of
+ Beautiful Soup under Python 3, without converting the code.
+
+* ``ImportError: No module named HTMLParser`` - Caused by running the
+ Python 2 version of Beautiful Soup under Python 3.
-If your script works on one computer but not another, it's probably
-because the two computers have different parser libraries
-available. For example, you may have developed the script on a
-computer that has lxml installed, and then tried to run it on a
-computer that only has html5lib installed. See `Differences between
-parsers`_ for why this matters, and fix the problem by mentioning a
-specific parser library in the ``BeautifulSoup`` constructor.
-
-If you can't find a tag that you know is in the document (that is,
-``find_all()`` returned ``[]`` or ``find()`` returned ``None``), you're
-probably using Python's built-in HTML parser, which sometimes skips
-tags it doesn't understand. Solution: :ref:`Install lxml or
-html5lib. <parser-installation>`
-
-``SyntaxError: Invalid syntax`` (on the line ``ROOT_TAG_NAME =
-u'[document]'``): Caused by the Python 2 version of Beautiful Soup
-under Python 3.
-
-``ImportError: No module named HTMLParser`` - Caused by running the
-Python 2 version of Beautiful Soup under Python 3.
-
-``ImportError: No module named html.parser`` - Caused by running the
-Python 3 version of Beautiful Soup under Python 2.
-
-``ImportError: No module named BeautifulSoup`` - Caused by running
-Beautiful Soup 3 code on a system that doesn't have BS3 installed. Or,
-by writing Beautiful Soup 4 code without knowing that the package name
-has changed to ``bs4``.
-
-``ImportError: No module named bs4`` - Caused by running Beautiful
-Soup 4 code on a system that doesn't have BS4 installed.
-
-``HTMLParser.HTMLParseError: malformed start tag`` - Caused by giving
-Python's built-in HTML parser a document it can't handle. Any other
-``HTMLParseError`` is probably the same problem. Solution:
-:ref:`Install lxml or html5lib. <parser-installation>`
-
-``KeyError: [attr]`` - Caused by accessing ``tag['attr']`` when the
-tag in question doesn't define the ``attr`` attribute. The most common
-errors are ``KeyError: 'href'`` and ``KeyError: 'class'``. Use
-``tag.get('attr')`` if you're not sure ``attr`` is defined, just as
-you would with a Python dictionary.
-
-``UnicodeEncodeError: 'charmap' codec can't encode character u'\xfoo'
-in position bar`` (or just about any other ``UnicodeEncodeError``) -
-This is not a problem with Beautiful Soup: you're trying to print a
-Unicode character that your console doesn't know how to display. See
-`this page on the Python wiki
-<http://wiki.python.org/moin/PrintFails>`_ for help. One easy solution
-is to write the text to a file and then look at the file.
+* ``ImportError: No module named html.parser`` - Caused by running the
+ Python 3 version of Beautiful Soup under Python 2.
+
+* ``ImportError: No module named BeautifulSoup`` - Caused by running
+ Beautiful Soup 3 code on a system that doesn't have BS3
+ installed. Or, by writing Beautiful Soup 4 code without knowing that
+ the package name has changed to ``bs4``.
+
+* ``ImportError: No module named bs4`` - Caused by running Beautiful
+ Soup 4 code on a system that doesn't have BS4 installed.
Parsing XML
-----------
@@ -2566,10 +2533,49 @@ By default, Beautiful Soup parses documents as HTML. To parse a
document as XML, pass in "xml" as the second argument to the
``BeautifulSoup`` constructor::
- soup = BeautifulSoup(markup, "xml")
+soup = BeautifulSoup(markup, "xml")
You'll need to :ref:`have lxml installed <parser-installation>`.
+Other parser problems
+---------------------
+
+* If your script works on one computer but not another, it's probably
+ because the two computers have different parser libraries
+ available. For example, you may have developed the script on a
+ computer that has lxml installed, and then tried to run it on a
+ computer that only has html5lib installed. See `Differences between
+ parsers`_ for why this matters, and fix the problem by mentioning a
+ specific parser library in the ``BeautifulSoup`` constructor.
+
+* ``HTMLParser.HTMLParseError: malformed start tag`` - Caused by
+ giving Python's built-in HTML parser a document it can't handle. Any
+ other ``HTMLParseError`` is probably the same problem. Solution:
+ :ref:`Install lxml or html5lib. <parser-installation>`
+
+* If you can't find a tag that you know is in the document (that is,
+ ``find_all()`` returned ``[]`` or ``find()`` returned ``None``),
+ you're probably using Python's built-in HTML parser, which sometimes
+ skips tags it doesn't understand. Solution: :ref:`Install lxml or
+ html5lib. <parser-installation>`
+
+Miscellaneous
+-------------
+
+* ``KeyError: [attr]`` - Caused by accessing ``tag['attr']`` when the
+ tag in question doesn't define the ``attr`` attribute. The most
+ common errors are ``KeyError: 'href'`` and ``KeyError:
+ 'class'``. Use ``tag.get('attr')`` if you're not sure ``attr`` is
+ defined, just as you would with a Python dictionary.
+
+* ``UnicodeEncodeError: 'charmap' codec can't encode character
+ u'\xfoo' in position bar`` (or just about any other
+ ``UnicodeEncodeError``) - This is not a problem with Beautiful Soup:
+ you're trying to print a Unicode character that your console doesn't
+ know how to display. See `this page on the Python wiki
+ <http://wiki.python.org/moin/PrintFails>`_ for help. One easy
+ solution is to write the text to a file and then look at the file.
+
Improving Performance
---------------------
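
Two of the troubleshooting entries reorganized in this diff can be checked with the standard library alone. The sketch below is illustrative and not part of the commit. It assumes Python 3, and the helper names ``installed_beautiful_soup`` and ``write_text_for_inspection`` are hypothetical. The first helper reports whether the ``bs4`` (Beautiful Soup 4) or ``BeautifulSoup`` (Beautiful Soup 3) package is importable, which is what the two ``ImportError`` entries come down to. The second follows the ``UnicodeEncodeError`` advice by writing text to a UTF-8 file instead of printing it to the console.

```python
import importlib.util
import os
import tempfile


def installed_beautiful_soup():
    """Report which Beautiful Soup import names are available.

    Hypothetical helper: returns a dict mapping each import name
    to True (importable) or False (not installed).
    """
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("bs4", "BeautifulSoup")  # BS4 and BS3 import names
    }


def write_text_for_inspection(text, path):
    """Write text to a UTF-8 file rather than printing it.

    This sidesteps consoles that cannot encode some characters.
    """
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


if __name__ == "__main__":
    print(installed_beautiful_soup())
    target = os.path.join(tempfile.gettempdir(), "soup_output.txt")
    print(write_text_for_inspection("caf\u00e9 \u2603", target))
```

If ``bs4`` is reported as missing, code that does ``import bs4`` will fail with the ``ImportError`` described above. If only ``BeautifulSoup`` is present, the script needs either Beautiful Soup 3 code or an installation of Beautiful Soup 4.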