= About Beautiful Soup 4 =

Earlier versions of Beautiful Soup included a custom HTML
parser. Beautiful Soup 4 does not include a parser. You'll need to
install either lxml or html5lib.
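
To check which of the two parsers is available before building a soup, something like the following sketch may help. It uses only the standard library. The helper name pick_parser is hypothetical and not part of Beautiful Soup, and the sketch assumes the installed modules are importable as "lxml" and "html5lib".

```python
import importlib.util


def pick_parser(candidates=("lxml", "html5lib")):
    """Hypothetical helper (not part of Beautiful Soup).

    Return the first name in `candidates` that can be imported,
    or None if none of them is installed.
    """
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


parser = pick_parser()
if parser is None:
    print("No parser found; install lxml or html5lib first.")
else:
    print("Parser available for Beautiful Soup:", parser)
```

In current Beautiful Soup 4 releases the returned name can be passed as the second argument, as in BeautifulSoup(markup, parser), so the choice of parser is explicit rather than left to whatever happens to be installed.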

= Introduction =

  >>> from bs4 import BeautifulSoup
  >>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML")
  >>> print(soup.prettify())
  <html>
   <body>
    <p>
     Some
     <b>
      bad
      <i>
       HTML
      </i>
     </b>
    </p>
   </body>
  </html>
  >>> soup.find(text="bad")
  u'bad'

  >>> soup.i
  <i>HTML</i>