- urllib2: extensible library for opening URLs.
- PyQuery: jQuery-like traversing and selecting for Python.
- mechanize: stateful programmatic web browsing in Python.
- Beautiful Soup: doesn't seem to be actively maintained anymore, and the latest versions are rather slow and buggy.
- Scrapy: looks nice. It also handles the URL requesting part itself, including cookie support.
- lxml.html: the HTML-handling package of lxml, which is a Pythonic binding for the libxml2 and libxslt libraries.
Probably going with Scrapy.
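
For reference, the list above splits roughly into a requesting part (urllib2, mechanize) and a parsing part (PyQuery, Beautiful Soup, lxml.html), with Scrapy covering both. Here is a minimal sketch of just the parsing half. To keep it dependency-free it uses the standard library's `html.parser` as a stand-in for the parsers listed, not any of those libraries' own APIs. `SAMPLE_HTML`, `LinkCollector`, and `extract_links` are names made up for this illustration, and the page content is invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical sample page. In a real scraper this string would come from
# the requesting step (urllib2 in Python 2, urllib.request in Python 3).
SAMPLE_HTML = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.com/blog">Blog</a>
  <a>No href here</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect absolute link targets from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Parse an HTML string and return the links found in it."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML, "https://example.com/"):
        print(link)
```

A framework like Scrapy would wrap this kind of extraction together with request scheduling and cookie handling, which is the part this sketch leaves out.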