From 0afe0af7cd8240ab790ccbcea6ecbdf69f21461e Mon Sep 17 00:00:00 2001
From: Leonard Richardson
Date: Mon, 16 Apr 2012 10:06:26 -0400
Subject: Attribute values are now run through the provided output formatter.
 Previously they were always run through the 'minimal' formatter. [bug=980237]

---
 doc/source/index.rst | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

(limited to 'doc/source')

diff --git a/doc/source/index.rst b/doc/source/index.rst
index 1ebcb5c..d4dabb1 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -1996,6 +1996,10 @@ invalid HTML or XML::
  soup.p
  # The law firm of Dewey, Cheatem, & Howe
 
+ soup = BeautifulSoup('A link')
+ soup.a
+ # A link
+
 You can change this behavior by providing a value for the
 ``formatter`` argument to ``prettify()``, ``encode()``, or
 ``decode()``. Beautiful Soup recognizes four possible values for
@@ -2029,7 +2033,7 @@ Unicode characters to HTML entities whenever possible::
 
 If you pass in ``formatter=None``, Beautiful Soup will not modify
 strings at all on output. This is the fastest option, but it may lead
-to Beautiful Soup generating invalid HTML/XML, as in this example::
+to Beautiful Soup generating invalid HTML/XML, as in these examples::
 
  print(soup.prettify(formatter=None))
  #
@@ -2040,11 +2044,16 @@ to Beautiful Soup generating invalid HTML/XML, as in this example::
  #
  #
 
+ link_soup = BeautifulSoup('A link')
+ print(link_soup.a.encode(formatter=None))
+ # A link
+
 
 Finally, if you pass in a function for ``formatter``, Beautiful Soup
-will call that function once for every string in the document. You can
-do whatever you want in this function. Here's a formatter that
-converts strings to uppercase and does absolutely nothing else::
+will call that function once for every string and attribute value in
+the document. You can do whatever you want in this function. Here's a
+formatter that converts strings to uppercase and does absolutely
+nothing else::
 
  def uppercase(str):
   return str.upper()
@@ -2058,6 +2067,11 @@ converts strings to uppercase and does absolutely nothing else::
  #
  #
 
+ print(link_soup.a.prettify(formatter=uppercase))
+ #
+ # A LINK
+ #
+
 If you're writing your own function, you should know about the
 ``EntitySubstitution`` class in the ``bs4.dammit`` module. This class
 implements Beautiful Soup's standard formatters as class methods: the
-- 
cgit v1.2.3
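
Note on the behavior this patch documents: the markup in the examples above did not survive extraction, so here is a rough, standard-library-only sketch of the idea that one formatter callable is applied to both text and attribute values, and that ``formatter=None`` leaves both untouched. It is not Beautiful Soup's implementation. The names ``render_tag``, ``minimal`` and ``uppercase`` and the example URL are assumptions made up for illustration.

```python
from html import escape


def minimal(text):
    # Stand-in for a "minimal" formatter: escape only &, < and >.
    return escape(text, quote=False)


def uppercase(text):
    # Mirrors the uppercase formatter from the documentation text.
    return text.upper()


def render_tag(name, attrs, text, formatter=minimal):
    """Hypothetical serializer, not a Beautiful Soup API.

    Runs the tag's text and every attribute value through `formatter`.
    Passing formatter=None leaves both unmodified. Quote characters
    inside attribute values are not handled in this sketch.
    """
    fmt = formatter if formatter is not None else (lambda s: s)
    attr_str = "".join(f' {key}="{fmt(value)}"' for key, value in attrs.items())
    return f"<{name}{attr_str}>{fmt(text)}</{name}>"


# Hypothetical attribute value containing a bare ampersand.
attrs = {"href": "http://example.com/?a=1&b=2"}

print(render_tag("a", attrs, "A link"))
print(render_tag("a", attrs, "A link", formatter=None))
print(render_tag("a", attrs, "A link", formatter=uppercase))
```

With the default formatter the ampersand in the attribute value is escaped. With ``formatter=None`` it is emitted as-is, which is how invalid markup can result. With ``uppercase`` the attribute value is transformed along with the text, which is the change the commit message describes.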