Python text file fetch specific part of line

Gordon Levi gordon at address.invalid
Fri Jul 29 04:42:14 EDT 2016


cs at zip.com.au wrote:

>On 28Jul2016 19:28, Gordon Levi <gordon at address.invalid> wrote:
>>Arshpreet Singh <arsh840 at gmail.com> wrote:
>>>I am writing an IMDB scraper and getting the available list of titles from
>>>the IMDB website, which provides a txt file in a very raw format. Here is one
>>>part of the file (http://pastebin.com/fpMgBAjc). The file provides tags like
>>>Distribution, Votes, Rank, Title, and I want to parse the title names. I
>>>tried the readlines() method, but it only returns a list of quite
>>>heterogeneous lines. Is it possible to parse each value that comes under the
>>>Title section?
>>
>>Beautiful Soup will make your task much easier
>><https://www.crummy.com/software/BeautifulSoup/>.
>
>Did you look at his sample data?

No. I read he was "writing an IMDB scraper, and getting the available
list of titles from the IMDB web site". It's here
<http://www.imdb.com/>.

>Plain text, not HTML or XML. Beautiful Soup is not what he needs here.

Fortunately the OP told us his application rather than just telling us
his current problem. His life would be much easier if he ignored the
plain text he has obtained so far and started again using a Beautiful
Soup tutorial. 
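
For comparison, the plain-text route would amount to splitting each line
into its four columns. This is only a minimal sketch, assuming each data
line holds a distribution string, a vote count, a rank and then the title,
separated by whitespace. I have not checked that against the pastebin
sample; the sample line below is made up for illustration, and
parse_rating_line is a hypothetical helper name.

```python
def parse_rating_line(line):
    """Split one data line into (distribution, votes, rank, title).

    Returns None for lines that do not fit the assumed four-column layout,
    such as blank lines or the header row.
    """
    # maxsplit=3 keeps the title in one piece even though it contains spaces.
    parts = line.split(None, 3)
    if len(parts) != 4:
        return None
    distribution, votes, rank, title = parts
    try:
        return distribution, int(votes), float(rank), title.strip()
    except ValueError:
        # Non-numeric votes or rank, e.g. the column heading line.
        return None


# Invented sample line in the assumed layout, not taken from the real file.
sample = "      0000000125  1728818   9.2  The Shawshank Redemption (1994)"
record = parse_rating_line(sample)
if record is not None:
    print(record[3])  # the title column
```

Applied to a whole file, that means calling the function on every line and
keeping the results that are not None.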


