From 233cc621d768654ae86e74b753da02bd138cf2d1 Mon Sep 17 00:00:00 2001
From: Leonard Richardson
Date: Wed, 11 Apr 2012 18:59:43 -0400
Subject: Added more common errors to doc.

---
 doc/source/index.rst | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/doc/source/index.rst b/doc/source/index.rst
index 5016fb0..9a29b0f 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -2478,6 +2478,20 @@ Troubleshooting
 Common Problems
 ---------------
 
+If your script works on one computer but not another, it's probably
+because the two computers have different parser libraries
+available. For instance, you may have developed the script on a
+computer that has lxml installed, and then tried to run it on a
+computer that only has html5lib installed. See `Differences between
+parsers`_ for why this matters, and fix the problem by mentioning a
+specific parser library in the ``BeautifulSoup`` constructor.
+
+If you can't find a tag that you know is in the document (that is,
+``find_all()`` returned ``[]`` or ``find()`` returned ``None``), you're
+probably using Python's built-in HTML parser, which sometimes skips
+tags it doesn't understand. Solution: :ref:`Install lxml or
+html5lib. `
+
 ``ImportError: No module named HTMLParser`` - Caused by running the
 Python 2 version of Beautiful Soup under Python 3.
 
@@ -2497,26 +2511,19 @@ Python's built-in HTML parser a document it can't handle. Any other
 ``HTMLParseError`` is probably the same problem. Solution:
 :ref:`Install lxml or html5lib. `
 
-If you can't find a tag that you know is in the document (that is,
-``find_all()`` returned ``[]`` or ``find()`` returned ``None``), you're
-probably using Python's built-in HTML parser, which sometimes skips
-tags it doesn't understand. Solution: :ref:`Install lxml or
-html5lib. `
-
-If your script works on one computer but not another, it's probably
-because the two computers have different sets of parser libraries
-available. For instance, you may have developed the script on a
-computer that has lxml installed, and then tried to run it on a
-computer that only has html5lib installed. See `Differences between
-parsers`_ for why this matters, and fix the problem by mentioning a
-specific parser library in the ``BeautifulSoup`` constructor.
-
 ``KeyError: [attr]`` - Caused by accessing ``tag['attr']`` when the
 tag in question doesn't define the ``attr`` attribute. The most
 common errors are ``KeyError: 'href'`` and ``KeyError:
 'class'``. Use ``tag.get('attr')`` if you're not sure ``attr`` is
 defined, just as you would with a Python dictionary.
 
+``UnicodeEncodeError: 'charmap' codec can't encode character u'\xfoo'
+in position bar`` (or just about any other ``UnicodeEncodeError``) -
+This is not a problem with Beautiful Soup: you're trying to print a
+Unicode character that your console doesn't know how to display. See
+`this page on the Python wiki
+`_ for help. One easy solution
+is to write the text to a file and then look at the file.
 
 Parsing XML
 -----------
-- 
cgit v1.2.3
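
Note on the ``UnicodeEncodeError`` entry added by this patch: its "write the text to a file" workaround can be sketched with the standard library alone. This is a minimal illustration, not part of the commit. The sample string, the output path, and the use of ``cp437`` as a stand-in for a legacy console encoding are all assumptions made for the example.

```python
import io
import os
import tempfile

# Hypothetical text standing in for whatever you extracted from a document;
# it contains a character (U+2603) that many legacy console encodings lack.
text = u"Sacr\xe9 bleu! \u2603"

# cp437 is used here only as an assumed example of a console encoding.
# Encoding to it fails in the same way that printing to such a console would.
try:
    text.encode("cp437")
    console_can_encode = True
except UnicodeEncodeError:
    console_can_encode = False

# Hypothetical output path, chosen only for this sketch.
path = os.path.join(tempfile.gettempdir(), "soup_output.txt")

# Writing with an explicit encoding bypasses the console's codec entirely.
with io.open(path, "w", encoding="utf-8") as out:
    out.write(text)

# Reading the file back with the same encoding recovers the original text.
with io.open(path, encoding="utf-8") as src:
    round_tripped = src.read()
```

Opening the resulting file in an editor that understands UTF-8 should show the text intact, which confirms that the data was fine and only the console's encoding was the limitation.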