Web Scraping - Output File

Prasad, Ramit ramit.prasad at jpmorgan.com
Thu Apr 26 18:05:16 EDT 2012


> > I am having some difficulty generating the output I want from web
> > scraping. Specifically, the script I wrote, while it runs without any
> > errors, is not writing to the output file correctly. It runs, and
> > creates the output .txt file; however, the file is blank (ideally it
> > should be populated with a list of names).
> >
> > I took the base of a program that I had before for a different data
> > gathering task, which worked beautifully, and edited it for my
> > purposes here. Any insight as to what I might be doing wrong would be
> > highly appreciated. Code is included below. Thanks!

> Your code is bound to break over and over (you should do some smarter
> parsing), but here's a working version:

Take a look at Beautiful Soup or lxml. Considering the number of malformed web pages out there, Beautiful Soup will probably work better than lxml.
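
A minimal sketch of that approach, under some loud assumptions: the original script and target page are not shown in this thread, so the sample markup, the "name" class, and the helper names extract_names and write_names are all hypothetical. It uses the bs4 package (pip install beautifulsoup4) with Python's built-in "html.parser" backend, and writes the result inside a with block so the file is flushed and closed. An unclosed file handle is one possible cause of an empty output file, though I can't say it is the cause here.

```python
import os
import tempfile

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Hypothetical sample page, deliberately sloppy: an unclosed <b> and <p>.
SAMPLE_HTML = """
<html><body>
<ul id="names">
  <li class="name">Ada Lovelace</li>
  <li class="name"><b>Grace Hopper</li>
  <li class="name">Alan Turing</li>
</ul>
<p>unclosed paragraph
</body></html>
"""


def extract_names(html):
    """Return the text of every <li class="name"> element (assumed layout)."""
    soup = BeautifulSoup(html, "html.parser")
    return [li.get_text(" ", strip=True)
            for li in soup.find_all("li", class_="name")]


def write_names(names, path):
    """Write one name per line; the with block closes and flushes the file."""
    with open(path, "w", encoding="utf-8") as out:
        for name in names:
            out.write(name + "\n")


if __name__ == "__main__":
    names = extract_names(SAMPLE_HTML)
    out_path = os.path.join(tempfile.mkdtemp(), "names.txt")
    write_names(names, out_path)
    print("wrote", len(names), "names to", out_path)
```

For a real page you would fetch the HTML first and adjust the find_all() arguments to match the tags that actually hold the names.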

Ramit


