= About Beautiful Soup 4 =

Earlier versions of Beautiful Soup included a custom HTML parser. Beautiful Soup 4 uses Python's default HTMLParser, which does fairly poorly on real-world HTML. By installing lxml or html5lib you can get more accurate parsing and possibly better performance as well.

= Introduction =

  >>> from bs4 import BeautifulSoup
  >>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML")
  >>> print soup.prettify()
  <p>
   Some
   <b>
    bad
    <i>
     HTML
    </i>
   </b>
  </p>
  >>> soup.find(text="bad")
  u'bad'
  >>> soup.i
  <i>HTML</i>
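The note about lxml and html5lib can be made concrete by naming the parser explicitly when building the soup. The following is a minimal sketch in Python 3 syntax, not part of the original session: the helper `parse_with_each` and the constant `MARKUP` are hypothetical names chosen for illustration, and the sketch assumes that only `html.parser` is guaranteed to be available, so the optional parsers are skipped when they are not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Unclosed tags, similar to the "bad HTML" used in the introduction.
MARKUP = "<p>Some<b>bad<i>HTML"


def parse_with_each(markup, parsers=("html.parser", "lxml", "html5lib")):
    """Parse `markup` with each named parser and return {name: serialized tree}.

    Parsers that are not installed are skipped rather than treated as errors.
    """
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(markup, name)
        except FeatureNotFound:
            # Raised by Beautiful Soup when the requested parser is unavailable.
            continue
        results[name] = str(soup)
    return results


if __name__ == "__main__":
    for name, serialized in parse_with_each(MARKUP).items():
        print(name, "->", serialized)
```

Each parser may repair the unclosed tags differently, so comparing the serialized trees side by side is a quick way to decide whether installing lxml or html5lib is worthwhile for a given document.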