author | Leonard Richardson <leonard.richardson@canonical.com> | 2012-02-09 16:22:52 -0500 |
---|---|---|
committer | Leonard Richardson <leonard.richardson@canonical.com> | 2012-02-09 16:22:52 -0500 |
commit | 822092c0edc77289103d7fa9c8a93c59e45f082e (patch) | |
tree | d30a9471c4a8b4635b841582fc190b8fd743bd1a | |
parent | 4aff2ee4d6f077e06159c92ab05c0f2ea527c6fa (diff) |
Corrected documentation.
-rw-r--r-- | bs4/doc/source/index.rst | 9 |
1 files changed, 6 insertions, 3 deletions
diff --git a/bs4/doc/source/index.rst b/bs4/doc/source/index.rst
index d28787b..1ad6449 100644
--- a/bs4/doc/source/index.rst
+++ b/bs4/doc/source/index.rst
@@ -2080,10 +2080,13 @@ In rare cases (usually when a UTF-8 document contains text written in
 a completely different encoding), the only way to get Unicode may be
 to replace some characters with the special Unicode character
 "REPLACEMENT CHARACTER" (U+FFFD, �). If Unicode, Dammit needs to do
-this, it will set the ``.characters_were_replaced`` attribute to
-``True`` on the ``UnicodeDammit`` or ``BeautifulSoup`` object. This
+this, it will set the ``.contains_replacement_characters`` attribute
+to ``True`` on the ``UnicodeDammit`` or ``BeautifulSoup`` object. This
 lets you know that the Unicode representation is not an exact
-representation of the original--some data was lost.
+representation of the original--some data was lost. If a document
+contains �, but ``.contains_replacement_characters`` if ``False``,
+you'll know that the � was there originally (as it is in this
+paragrpah) and doesn't stand in for missing data.
 
 Output encoding
 ---------------
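
The distinction the corrected paragraph draws (a U+FFFD that was already in the source versus one substituted during decoding) can be sketched with the standard library alone. This is not Beautiful Soup's implementation: `decode_with_flag` is a hypothetical helper, and its boolean only plays the role that the diff assigns to `.contains_replacement_characters`.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_with_flag(data: bytes, encoding: str = "utf-8"):
    """Hypothetical helper: return (text, replaced).

    `replaced` is True only when strict decoding failed and bytes had to be
    swapped for U+FFFD, i.e. when data was actually lost.
    """
    try:
        # Strict decode succeeds: any U+FFFD in the result was in the source.
        return data.decode(encoding), False
    except UnicodeDecodeError:
        # Fall back to lossy decoding and record that substitution happened.
        return data.decode(encoding, errors="replace"), True


# U+FFFD is legitimately present in the original document.
intact_text, intact_flag = decode_with_flag("caf\ufffd".encode("utf-8"))

# A stray Latin-1 byte (0xE9) inside otherwise UTF-8 data forces a replacement.
damaged_text, damaged_flag = decode_with_flag(b"caf\xe9")

print(REPLACEMENT in intact_text, intact_flag)
print(REPLACEMENT in damaged_text, damaged_flag)
```

Both results contain the same character, so the text alone cannot tell the two cases apart; only the flag records whether data was lost.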