[Tutor] Has anyone already done a python parser for the IMDB datafiles?

R. Alan Monroe amonroe at columbus.rr.com
Sat Dec 20 10:59:23 EST 2003


I recently discovered that the IMDB data files are available for download:
ftp://ftp.fu-berlin.de/pub/misc/movies/database/

But they're not (I was VERY surprised to find) in any kind of
well-defined format. It looks like someone basically typed them into a
text editor. Just glancing at it, it looks like:

actor 1<bunch of tabs> movie 1 (year) (media) [role] <billing>
<bunch of tabs>        movie 2 (year) [role] <billing>
<bunch of tabs>        movie 3 etc.

actor 2<bunch of tabs> movie 1
<bunch of tabs>        movie 2
<bunch of tabs>        movie 3

The fields after the movie title appear to be optional, and there may
be others I overlooked. I didn't get any matches searching for "imdb"
on Parnassus or Uselesspython. It probably wouldn't be too hard to
cobble something together, but maybe it's already been done somewhere?
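
For reference, here is a rough sketch of what such a parser might look
like. It assumes the layout guessed at above is accurate: the actor name
is followed by tabs, continuation lines start with a tab, a blank line
separates actors, and each credit reads "title (year) (media) [role]
<billing>" with everything after the title optional. The names
parse_credit and parse_actors are made up for illustration, and the real
files may well contain fields or quirks this does not handle.

```python
import re
from collections import defaultdict

# Assumed credit layout: title (year) (media) [role] <billing>.
# Every field after the title is treated as optional.
CREDIT_RE = re.compile(
    r"""^(?P<title>.+?)
        (?:\s+\((?P<year>[^)]*)\))?      # optional (year)
        (?:\s+\((?P<media>[^)]*)\))?     # optional (media)
        (?:\s+\[(?P<role>[^\]]*)\])?     # optional [role]
        (?:\s+<(?P<billing>[^>]*)>)?     # optional <billing>
        \s*$""",
    re.VERBOSE,
)


def parse_credit(text):
    """Split one credit string into a dict holding only the fields present."""
    text = text.strip()
    match = CREDIT_RE.match(text)
    if match is None:
        # Unrecognised layout: keep the raw text rather than lose it.
        return {"title": text}
    return {k: v for k, v in match.groupdict().items() if v is not None}


def parse_actors(lines):
    """Map each actor name to a list of parsed credits."""
    credits = defaultdict(list)
    actor = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            actor = None  # a blank line ends the current actor's block
            continue
        if line[0] == "\t":
            # Continuation line: belongs to the most recent actor.
            if actor is None:
                continue  # stray line with no actor above it; skip it
            credit_text = line
        else:
            # New actor: the name runs up to the first tab.
            name, _, credit_text = line.partition("\t")
            actor = name.strip()
        credit_text = credit_text.strip()
        if credit_text:
            credits[actor].append(parse_credit(credit_text))
    return dict(credits)
```

Something like parse_actors(open(filename)) should then give a dict
keyed by actor name, provided the file really follows this layout. Any
header or footer text in the real files would need to be skipped first.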

Alan



