From 23ec4e144b4d737e8fb8712e35532bb9f5e67cbf Mon Sep 17 00:00:00 2001 From: Leonard Richardson Date: Wed, 8 Feb 2012 09:21:39 -0500 Subject: Moved around a bunch of metadata. --- AUTHORS | 39 ---------- AUTHORS.txt | 39 ++++++++++ CHANGELOG | 229 ----------------------------------------------------------- COPYING | 26 ------- COPYING.txt | 26 +++++++ NEWS.txt | 231 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ README.txt | 25 ++++--- TODO | 42 ----------- TODO.txt | 42 +++++++++++ setup.py | 25 ++++--- 10 files changed, 365 insertions(+), 359 deletions(-) delete mode 100644 AUTHORS create mode 100644 AUTHORS.txt delete mode 100644 CHANGELOG delete mode 100644 COPYING create mode 100644 COPYING.txt create mode 100644 NEWS.txt delete mode 100644 TODO create mode 100644 TODO.txt diff --git a/AUTHORS b/AUTHORS deleted file mode 100644 index 9623a7c..0000000 --- a/AUTHORS +++ /dev/null @@ -1,39 +0,0 @@ -Behold, mortal, the origins of Beautiful Soup... -================================================ - -Leonard Richardson is the primary programmer. - -Aaron DeVore is awesome. - -Mark Pilgrim provided the encoding detection code that forms the base -of UnicodeDammit. - -Thomas Kluyver and Ezio Melotti finished the work of getting Beautiful -Soup 4 working under Python 3. - -Sam Ruby helped with a lot of edge cases. - -Jonathan Ellis was awarded the prestigous Beau Potage D'Or for his -work in solving the nestable tags conundrum. 
- -The following people have contributed patches to Beautiful Soup: - - Istvan Albert, Andrew Lin, Anthony Baxter, Andrew Boyko, Tony Chang, - Zephyr Fang, Fuzzy, Roman Gaufman, Yoni Gilad, Richie Hindle, Peteris - Krumins, Kent Johnson, Ben Last, Robert Leftwich, Staffan Malmgren, - Ksenia Marasanova, JP Moins, Adam Monsen, John Nagle, "Jon", Ed - Oskiewicz, Greg Phillips, Giles Radford, Arthur Rudolph, Marko - Samastur, Jouni Seppänen, Alexander Schmolck, Andy Theyers, Glyn - Webster, Paul Wright, Danny Yoo - -The following people made suggestions or found bugs or found ways to -break Beautiful Soup: - - Hanno Böck, Matteo Bertini, Chris Curvey, Simon Cusack, Matt Ernst, - Michael Foord, Tom Harris, Bill de hOra, Donald Howes, Matt - Patterson, Scott Roberts, Steve Strassmann, Mike Williams, warchild - at redho dot com, Sami Kuisma, Carlos Rocha, Bob Hutchison, Joren Mc, - Michal Migurski, John Kleven, Tim Heaney, Tripp Lilley, Ed Summers, - Dennis Sutch, Chris Smith, Aaron Sweep^W Swartz, Stuart Turner, Greg - Edwards, Kevin J Kalupson, Nikos Kouremenos, Artur de Sousa Rocha, - Yichun Wei, Per Vognsen diff --git a/AUTHORS.txt b/AUTHORS.txt new file mode 100644 index 0000000..e093cd6 --- /dev/null +++ b/AUTHORS.txt @@ -0,0 +1,39 @@ +Behold, mortal, the origins of Beautiful Soup... +================================================ + +Leonard Richardson is the primary programmer. + +Aaron DeVore is awesome. + +Mark Pilgrim provided the encoding detection code that forms the base +of UnicodeDammit. + +Thomas Kluyver and Ezio Melotti finished the work of getting Beautiful +Soup 4 working under Python 3. + +Sam Ruby helped with a lot of edge cases. + +Jonathan Ellis was awarded the prestigous Beau Potage D'Or for his +work in solving the nestable tags conundrum. 
+ +The following people have contributed patches to Beautiful Soup: + + Istvan Albert, Andrew Lin, Anthony Baxter, Andrew Boyko, Tony Chang, + Zephyr Fang, Fuzzy, Roman Gaufman, Yoni Gilad, Richie Hindle, Peteris + Krumins, Kent Johnson, Ben Last, Robert Leftwich, Staffan Malmgren, + Ksenia Marasanova, JP Moins, Adam Monsen, John Nagle, "Jon", Ed + Oskiewicz, Greg Phillips, Giles Radford, Arthur Rudolph, Marko + Samastur, Jouni Seppänen, Alexander Schmolck, Andy Theyers, Glyn + Webster, Paul Wright, Danny Yoo + +The following people made suggestions or found bugs or found ways to +break Beautiful Soup: + + Hanno Böck, Matteo Bertini, Chris Curvey, Simon Cusack, Bruce Eckel, + Matt Ernst, Michael Foord, Tom Harris, Bill de hOra, Donald Howes, + Matt Patterson, Scott Roberts, Steve Strassmann, Mike Williams, + warchild at redho dot com, Sami Kuisma, Carlos Rocha, Bob Hutchison, + Joren Mc, Michal Migurski, John Kleven, Tim Heaney, Tripp Lilley, Ed + Summers, Dennis Sutch, Chris Smith, Aaron Sweep^W Swartz, Stuart + Turner, Greg Edwards, Kevin J Kalupson, Nikos Kouremenos, Artur de + Sousa Rocha, Yichun Wei, Per Vognsen diff --git a/CHANGELOG b/CHANGELOG deleted file mode 100644 index b0ad7be..0000000 --- a/CHANGELOG +++ /dev/null @@ -1,229 +0,0 @@ -= 4.0 beta 4 = - -Added BeautifulSoup.new_string() to go along with BeautifulSoup.new_tag() - -BeautifulSoup.new_tag() will follow the rules of whatever tree-builder -was used to create the original BeautifulSoup object. A new

<p> tag -will look like "<p />" if the soup object was created to parse XML, -but it will look like "<p></p>
" if the soup object was created to -parse HTML. - -We pass in strict=False to html.parser on Python 3, greatly improving -html.parser's ability to handle bad HTML. - -Monkeypatch a serious bug in html.parser that made strict=False -disastrous on Python 3.2.2. - -Replaced the "substitute_html_entities" argument with the "formatter" argument. - -Bare ampersands and angle brackets are always converted to XML -entities unless the user prevents it. - -Added PageElement.insert_before(). - -Added PageElement.insert_after(). - -Raise an exception when the user tries to do something nonsensical -like insert a tag into itself. - -= 4.0.0b3 = - -Beautiful Soup 4 is a nearly-complete rewrite that removes Beautiful -Soup's custom HTML parser in favor of a system that lets you write a -little glue code and plug in any HTML or XML parser you want. - -Beautiful Soup 4.0 comes with glue code for four parsers: - - * Python's standard HTMLParser (html.parser in Python 3) - * lxml's HTML and XML parsers - * html5lib's HTML parser - -HTMLParser is the default, but I recommend you install lxml if you -can. - -For complete documentation, see the Sphinx documentation in -bs4/doc/source/. What follows is a summary of the changes from -Beautiful Soup 3. - -=== The module name has changed === - -Previously you imported the BeautifulSoup class from a module also -called BeautifulSoup. To save keystrokes and make it clear which -version of the API is in use, the module is now called 'bs4': - - >>> from bs4 import BeautifulSoup - -=== It works with Python 3 === - -Beautiful Soup 3.1.0 worked with Python 3, but the parser it used was -so bad that it barely worked at all. Beautiful Soup 4 works with -Python 3, and since its parser is pluggable, you don't sacrifice -quality. - -Special thanks to Thomas Kluyver and Ezio Melotti for getting Python 3 -support to the finish line. Ezio Melotti is also to thank for greatly -improving the HTML parser that comes with Python 3.2. 
- -=== CDATA sections are normal text, if they're understood at all. === - -Currently, the lxml and html5lib HTML parsers ignore CDATA sections in -markup: - -

 <p><![CDATA[foo]]></p> => <p></p> - -A future version of html5lib will turn CDATA sections into text nodes, -but only within tags like <svg> and <math>: - - <svg><![CDATA[foo]]></svg> => <p>foo</p> - -The default XML parser (which uses lxml behind the scenes) turns CDATA -sections into ordinary text elements: - - <p><![CDATA[foo]]></p> => <p>foo</p>

- -In theory it's possible to preserve the CDATA sections when using the -XML parser, but I don't see how to get it to work in practice. - -=== Miscellaneous other stuff === - -If the BeautifulSoup instance has .is_xml set to True, an appropriate -XML declaration will be emitted when the tree is transformed into a -string: - - - - ... - - -The ['lxml', 'xml'] tree builder sets .is_xml to True; the other tree -builders set it to False. If you want to parse XHTML with an HTML -parser, you can set it manually. - - -= 3.2.0 = - -The 3.1 series wasn't very useful, so I renamed the 3.0 series to 3.2 -to make it obvious which one you should use. - -= 3.1.0 = - -A hybrid version that supports 2.4 and can be automatically converted -to run under Python 3.0. There are three backwards-incompatible -changes you should be aware of, but no new features or deliberate -behavior changes. - -1. str() may no longer do what you want. This is because the meaning -of str() inverts between Python 2 and 3; in Python 2 it gives you a -byte string, in Python 3 it gives you a Unicode string. - -The effect of this is that you can't pass an encoding to .__str__ -anymore. Use encode() to get a string and decode() to get Unicode, and -you'll be ready (well, readier) for Python 3. - -2. Beautiful Soup is now based on HTMLParser rather than SGMLParser, -which is gone in Python 3. There's some bad HTML that SGMLParser -handled but HTMLParser doesn't, usually to do with attribute values -that aren't closed or have brackets inside them: - - baz - ', '"> - -A later version of Beautiful Soup will allow you to plug in different -parsers to make tradeoffs between speed and the ability to handle bad -HTML. - -3. In Python 3 (but not Python 2), HTMLParser converts entities within -attributes to the corresponding Unicode characters. In Python 2 it's -possible to parse this string and leave the é intact. - - - -In Python 3, the é is always converted to \xe9 during -parsing. 
- - -= 3.0.7a = - -Added an import that makes BS work in Python 2.3. - - -= 3.0.7 = - -Fixed a UnicodeDecodeError when unpickling documents that contain -non-ASCII characters. - -Fixed a TypeError that occured in some circumstances when a tag -contained no text. - -Jump through hoops to avoid the use of chardet, which can be extremely -slow in some circumstances. UTF-8 documents should never trigger the -use of chardet. - -Whitespace is preserved inside
<pre> and