
2 Posted Topics

Member Avatar for amrutraj

I am building a crawler and parser in Python that has to run for about 20 hours. How can I modify the code so that execution pauses (before the next urllib2.urlopen call) when the internet connection drops, and automatically resumes with the same variable values once the connection is back …
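One possible approach, sketched here under assumptions rather than taken from the thread: wrap the fetch in a loop that sleeps and retries whenever the connection fails. Because the loop blocks inside the same function call, every variable in the crawler keeps its value while it waits. The function name `fetch_when_online` and the `opener`, `sleep`, `delay`, and `max_delay` parameters are hypothetical, and the sketch uses Python 3's `urllib.request`; on Python 2 the equivalent names would be `urllib2.urlopen`, `urllib2.URLError`, and `urllib2.HTTPError`.

```python
import time
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_when_online(url, opener=urlopen, sleep=time.sleep,
                      delay=30, max_delay=600):
    """Block until `url` can be fetched, then return the response body.

    `opener` and `sleep` are injectable so the retry logic can be
    exercised without a real network (an assumption of this sketch).
    """
    wait = delay
    while True:
        try:
            return opener(url).read()
        except HTTPError:
            # The server answered (e.g. 404 or 500), so the connection is up.
            # Retrying forever would not help; let the caller decide.
            raise
        except URLError:
            # No route to the host, DNS failure, and similar: treat this as
            # "internet is down", wait, then try the same URL again.
            sleep(wait)
            wait = min(wait * 2, max_delay)  # back off, capped at max_delay
```

In the crawler, each direct `urlopen(address)` call would become `fetch_when_online(address)`. Note that a URL whose host never resolves would block indefinitely in this sketch, so a real crawler might also want a cap on the number of attempts.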


Parsing a page with 8000+ URLs with BeautifulSoup

This is the page:
[CODE]http://www.thehindubusinessline.com/cgi-bin/bl2002.pl?mainclass=03[/CODE]

This is my code:
[CODE]
from urllib2 import URLError, urlopen
import re
from BeautifulSoup import BeautifulSoup, SoupStrainer

def gethtml(address):
    try:
        raw = urlopen(address)
        raw = raw.read()
    except URLError:
        raw = 'Error occured'
    return raw

dat = gethtml("http://www.thehindubusinessline.com/cgi-bin/bl2002.pl?mainclass=03")
print 'got html'
a_tag = SoupStrainer('a')
html_atag = BeautifulSoup(dat, parseOnlyThese=a_tag)
print …

