author | Leonard Richardson <leonard.richardson@canonical.com> | 2012-04-11 18:59:43 -0400 |
---|---|---|
committer | Leonard Richardson <leonard.richardson@canonical.com> | 2012-04-11 18:59:43 -0400 |
commit | 233cc621d768654ae86e74b753da02bd138cf2d1 (patch) | |
tree | 51bbf776997a96ee8a886a4e7e2cba9ea95f6d7a /doc/source | |
parent | 69a40882e7dcbee8cca9ad17a43c4488601f7f82 (diff) |
Added more common errors to doc.
Diffstat (limited to 'doc/source')
-rw-r--r-- | doc/source/index.rst | 35 |
1 files changed, 21 insertions, 14 deletions
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 5016fb0..9a29b0f 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2478,6 +2478,20 @@ Troubleshooting
 Common Problems
 ---------------
 
+If your script works on one computer but not another, it's probably
+because the two computers have different parser libraries
+available. For instance, you may have developed the script on a
+computer that has lxml installed, and then tried to run it on a
+computer that only has html5lib installed. See `Differences between
+parsers`_ for why this matters, and fix the problem by mentioning a
+specific parser library in the ``BeautifulSoup`` constructor.
+
+If you can't find a tag that you know is in the document (that is,
+``find_all()`` returned ``[]`` or ``find()`` returned ``None``), you're
+probably using Python's built-in HTML parser, which sometimes skips
+tags it doesn't understand. Solution: :ref:`Install lxml or
+html5lib. <parser-installation>`
+
 ``ImportError: No module named HTMLParser`` - Caused by running the
 Python 2 version of Beautiful Soup under Python 3.
 
@@ -2497,26 +2511,19 @@ Python's built-in HTML parser a document it can't handle. Any other
 ``HTMLParseError`` is probably the same problem. Solution:
 :ref:`Install lxml or html5lib. <parser-installation>`
 
-If you can't find a tag that you know is in the document (that is,
-``find_all()`` returned ``[]`` or ``find()`` returned ``None``), you're
-probably using Python's built-in HTML parser, which sometimes skips
-tags it doesn't understand. Solution: :ref:`Install lxml or
-html5lib. <parser-installation>`
-
-If your script works on one computer but not another, it's probably
-because the two computers have different sets of parser libraries
-available. For instance, you may have developed the script on a
-computer that has lxml installed, and then tried to run it on a
-computer that only has html5lib installed. See `Differences between
-parsers`_ for why this matters, and fix the problem by mentioning a
-specific parser library in the ``BeautifulSoup`` constructor.
-
 ``KeyError: [attr]`` - Caused by accessing ``tag['attr']`` when the
 tag in question doesn't define the ``attr`` attribute. The most common
 errors are ``KeyError: 'href'`` and ``KeyError: 'class'``. Use
 ``tag.get('attr')`` if you're not sure ``attr`` is defined, just as
 you would with a Python dictionary.
 
+``UnicodeEncodeError: 'charmap' codec can't encode character u'\xfoo'
+in position bar`` (or just about any other ``UnicodeEncodeError``) -
+This is not a problem with Beautiful Soup: you're trying to print a
+Unicode character that your console doesn't know how to display. See
+`this page on the Python wiki
+<http://wiki.python.org/moin/PrintFails>`_ for help. One easy solution
+is to write the text to a file and then look at the file.
 
 Parsing XML
 -----------
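
The ``UnicodeEncodeError`` entry added in this commit suggests writing the text to a file instead of printing it. Below is a minimal sketch of that workaround using only the Python standard library. The sample string, the ``save_text`` helper, and the output path are illustrative assumptions and are not part of Beautiful Soup or of this commit.

```python
import io
import os
import tempfile

# Hypothetical sample text: "\u2603" (a snowman) stands in for any character
# that a limited console encoding such as cp1252 cannot represent.
text = u"Snowman: \u2603"


def save_text(text, path):
    """Write text to path as UTF-8 instead of printing it to the console."""
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


# Encoding to a narrow codec reproduces the kind of failure that printing
# to a console with that encoding can trigger.
try:
    text.encode("cp1252")
    console_could_show_it = True
except UnicodeEncodeError:
    console_could_show_it = False

# Write the text to a temporary file (assumed location) and read it back.
path = os.path.join(tempfile.mkdtemp(), "output.txt")
save_text(text, path)

with io.open(path, encoding="utf-8") as handle:
    round_tripped = handle.read()
```

The explicit ``encoding="utf-8"`` is the design choice that matters here: it keeps the write from falling back to the platform default encoding, which may be the same one that caused the console error.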