Web Page Parsing/Downloading

TheRandomPast wishingforsam at gmail.com
Fri Nov 22 05:10:33 EST 2013


Hi. I'm self taught at Python and I used http://www.codecademy.com/ to learn which was great help i must say but now, I'm attempting it all on my own and need a little help? 

I have three scripts and this is what I'm trying to do with them;


Download from webpage
Parse Links from Page
Output summary of total links
Format a list of matched links
Parse and Print Email addresses
Crach Hash Passwords
Exception Handling
Parsing and Print links to image files/.doc 
Save file into specified folder and alert when files don't save

Can anyone help because I've become a little stuck? None of the scripts are running for me and I can't see where I'm having issues


WebPage script;
import sys, urllib
def getWebpage(url):
    print '[*] getWebpage()'
    url_file = urllib.urlopen(url)
    page = url_file.read()
    return page
def main():
    sys.argv.append('http://www.funeralformyfat.tumblr.com')
    if len(sys.argv) != 2:
        print '[-] Usage: webpage_get URL'
        return
    
print getWebpage(sys.argv[1])

if __name__ == '__main__':
    main()

getLinks

def print_links(page):
    print '[*] print_links()'
    links = re.findall(r'\<a.*href\=.*http\:.+', page)
    links.sort()
    print '[+]', str(len(links)), 'HyperLinks Found:'
    
for link in links:
    print link
    
def main():
    sys.argv.append('http://www.funeralformyfat.tumblr.com')
    if len(sys.argv) != 2:
        print '[-] Usage: webpage_getlinks URL'
        return
        page = webpage_get.wget(sys.argv[1])
        print_links(page)

from os.path import join

    directory = join('/home/', y, '/newdir/')
    file_name = url.split('/')[-1]
    file_name = join(directory, file_name)



        
if __name__ == '__main__':
    main()

getParser 

 import md5

 oldpasswd_byuser=str("tom")
 oldpasswd_db="sha1$c60da$1835a9c3ccb1cc436ccaa577679b5d0321234c6f"
 opw=     md5.new(oldpasswd_byuser)
 #opw=     md5.new(oldpasswd_byuser).hexdigest()
 if(opw ==      oldpasswd_db):
    print "same password"
 else:
     print "Invalid password"

from email.parser import Parser


#headers = Parser().parse(open(messagefile, 'r'))


headers = Parser().parsestr('From: <user at example.com>\n'
        'To: <someone_else at example.com>\n'
        'Subject: Test message\n'
        '\n'
        'Body would go here\n')
print 'To: %s' % headers['to']
print 'From: %s' % headers['from']
print 'Subject: %s' % headers['subject']



Thanks for any help! 




More information about the Python-list mailing list