Issue a warning if characters were replaced with REPLACEMENT CHARACTER during Unicode conversion.

author: Leonard Richardson <leonard.richardson@canonical.com> 2012-02-16 13:31:20 -0500
committer: Leonard Richardson <leonard.richardson@canonical.com> 2012-02-16 13:31:20 -0500
commit: ffcebc274b84b85a0b8c93c2aca8756df4baa236 (patch)
tree: 29dab20e3176c47b37d8a133fd9d4fee52f75b63 /bs4/doc/source
parent: 97ac0bc1947b3c5ea7d262d268f42ab629117441 (diff)
1 files changed, 21 insertions, 8 deletions
diff --git a/bs4/doc/source/index.rst b/bs4/doc/source/index.rst
index 8328ed7..200317a 100644
--- a/bs4/doc/source/index.rst
+++ b/bs4/doc/source/index.rst
@@ -303,19 +303,24 @@ done by treating the tag as a dictionary::
 Multi-valued attributes
 &&&&&&&&&&&&&&&&&&&&&&&
 
-HTML defines a few attributes that can have multiple values. The most
-common is ``class`` (a tag can have more than one CSS class), but
-there are a few others: ``rel``, ``rev``, ``archive``,
-``accept-charset``, and ``headers``. If one of these attributes has
-more than one value, Beautiful Soup will turn its values into a list::
+HTML 4 defines a few attributes that can have multiple values. HTML 5
+removes a couple of them, but defines a few more. The most common
+multi-valued attribute is ``class`` (that is, a tag can have more than
+one CSS class). Others include ``rel``, ``rev``, ``accept-charset``,
+``headers``, and ``accesskey``. Beautiful Soup presents the value(s)
+of a multi-valued attribute as a list::
 
  css_soup = BeautifulSoup('<p class="body strikeout"></p>')
  css_soup.p['class']
  # ["body", "strikeout"]
 
+ css_soup = BeautifulSoup('<p class="body"></p>')
+ css_soup.p['class']
+ # ["body"]
+
 If an attribute `looks` like it has more than one value, but it's not
-one of the special attributes listed above, Beautiful Soup will leave
-the attribute alone::
+a multi-valued attribute as defined by any version of the HTML
+standard, Beautiful Soup will leave the attribute alone::
 
  id_soup = BeautifulSoup('<p id="my id"></p>')
  id_soup.p['id']
@@ -326,11 +331,19 @@ consolidated::
 
  rel_soup = BeautifulSoup('<p>Back to the <a rel="index">homepage</a></p>')
  rel_soup.a['rel']
- # 'index'
+ # ['index']
  rel_soup.a['rel'] = ['index', 'contents']
  print(rel_soup.p)
  # <p>Back to the <a rel="index contents">homepage</a></p>
 
+If you parse a document as XML, there are no multi-valued attributes::
+
+ xml_soup = BeautifulSoup('<p class="body strikeout"></p>', 'xml')
+ xml_soup.p['class']
+ # u'body strikeout'
+
+
+
 ``NavigableString``
 -------------------
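
Note on the documented behaviour: the rules this patch describes can be sketched in plain Python without Beautiful Soup installed. This is only an illustration under stated assumptions, not the library's implementation. The names `MULTI_VALUED`, `present_attribute` and `serialize_attribute` are hypothetical, and the attribute set is simply the list given in the new documentation text, which may not match the set Beautiful Soup actually uses.

```python
# Hypothetical illustration of the documented rules; NOT Beautiful Soup's code.
# The names below are made up for this sketch, and the set is copied from the
# attribute list in the documentation text (assumed, possibly incomplete).
MULTI_VALUED = {"class", "rel", "rev", "accept-charset", "headers", "accesskey"}


def present_attribute(name, raw_value, is_xml=False):
    """Return an attribute value the way the documentation describes it.

    In HTML mode a multi-valued attribute is always presented as a list,
    even when it holds a single value. Any other attribute, and every
    attribute in XML mode, is left as the original string.
    """
    if not is_xml and name in MULTI_VALUED:
        return raw_value.split()
    return raw_value


def serialize_attribute(value):
    """Consolidate a list of values into one space-separated string."""
    if isinstance(value, list):
        return " ".join(value)
    return value


print(present_attribute("class", "body strikeout"))
print(present_attribute("id", "my id"))
print(present_attribute("class", "body strikeout", is_xml=True))
print(serialize_attribute(["index", "contents"]))
```

The single-value case is the one that changes in this patch: ``rel="index"`` is now shown as ``['index']`` rather than ``'index'``, so callers can treat a multi-valued attribute as a list without first checking its type.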