[Tutor] parse text file

Norman Khine norman at khine.net
Fri Jan 22 14:11:42 CET 2010


Hello,
I have the following text file: http://paste.lisp.org/display/93732
From this I would like to extract

...
            '<strong>ACP</strong>' +
            '<br /><a href="/acp.html">En savoir plus</a>'
      		);
...
		  map.addOverlay(marqueur[1]);var latlng = new GLatLng(9.696333,
                                  122.985992);

so that I get a CSV file:

"ACP", "acp.html", "9.696333", "122.985992"

This is what I have so far:

>>> file=open('google_map_code.txt', 'r')
>>> data =  repr( file.read().decode('utf-8') )
>>> from BeautifulSoup import BeautifulStoneSoup
>>> soup = BeautifulStoneSoup(data)
>>> strongs = soup.findAll('strong')
>>> strongs
[<strong>ALTER TRADE CORPORATION</strong>, <strong>ANAPQUI</strong>,
<strong>APICOOP / VALVIDIA</strong>, <strong>APIKRI</strong>,
...

>>> path = soup.findAll('a')
>>> path
[<a href="/acp.html">En savoir plus</a>, <a
href="/alter-trade-corporation.html">En savoir plus</a>,
...

But my problem comes when I try to list the GLatLng values:

GLatLng(9.696333, 122.985992);

>>> import re
>>> StartingWithGLatLng = soup.findAll(re.compile('GLatLng'))
>>> StartingWithGLatLng
[]
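
A possible explanation for the empty list: a compiled pattern passed as the first argument to findAll() is matched against tag names, and GLatLng only appears inside the JavaScript text, not as a tag. Below is a minimal sketch of one way around this, using plain regular expressions on the raw text instead of the parser. The SAMPLE string is copied from the excerpt above, and the names extract_rows and to_csv are hypothetical helpers made up for illustration. It assumes every marker has exactly one <strong>, one link and one GLatLng call, in the same order, which may not hold for the whole file.

```python
import csv
import io
import re

# Sample copied from the excerpt above; the real file is assumed
# to repeat this shape once per marker.
SAMPLE = '''
            '<strong>ACP</strong>' +
            '<br /><a href="/acp.html">En savoir plus</a>'
      		);
		  map.addOverlay(marqueur[1]);var latlng = new GLatLng(9.696333,
                                  122.985992);
'''

NAME_RE = re.compile(r'<strong>(.*?)</strong>')
# The optional "/?" drops the leading slash from the href.
HREF_RE = re.compile(r'<a href="/?([^"]+)">')
# \s* also matches the line break between the two coordinates.
LATLNG_RE = re.compile(r'GLatLng\(\s*(-?[\d.]+)\s*,\s*(-?[\d.]+)\s*\)')


def extract_rows(text):
    """Pair names, links and coordinates by position (an assumption)."""
    names = NAME_RE.findall(text)
    hrefs = HREF_RE.findall(text)
    coords = LATLNG_RE.findall(text)
    return [(name, href, lat, lng)
            for name, href, (lat, lng) in zip(names, hrefs, coords)]


def to_csv(rows):
    """Return the rows as CSV text with every field quoted."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL).writerows(rows)
    return buf.getvalue()


rows = extract_rows(SAMPLE)
print(to_csv(rows))
```

Because the three lists are zipped by position, a single missing <strong> or link would shift every later row, so the lengths of the three lists are worth comparing before trusting the output.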

Thanks
Norman

