Wikipedia XML Dump

Burak Arslan burak.arslan at arskom.com.tr
Tue Jan 28 17:47:47 EST 2014


hi,

On 01/29/14 00:31, Kevin Glover wrote:
> Thanks for the comments, guys. The Wikipedia download is a single XML document, 43.1GB. Any further thoughts?
>
>

in that case, event-driven parsing seems to be your only option, since
a single 43.1GB document is unlikely to fit in memory as a parsed tree:
http://lxml.de/tutorial.html#event-driven-parsing
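
a minimal sketch of the idea, assuming you only want one record at a
time. to keep it runnable without extra packages it uses the standard
library's xml.etree.ElementTree.iterparse instead of lxml (a swap on my
part; lxml.etree.iterparse takes the same events argument). the helper
names, the sample document and its namespace are made up for
illustration, and the page/title element names are an assumption about
the dump layout, so check them against your file.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    # Drop a leading "{namespace}" prefix, if the tag has one.
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(source):
    """Yield the <title> text of each <page>, one page at a time."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == "page":
            title = None
            for child in elem:
                if local_name(child.tag) == "title":
                    title = child.text
                    break
            yield title
            # Drop the pages parsed so far so the tree does not keep growing.
            root.clear()


# Tiny stand-in document (hypothetical; the real dump uses its own namespace).
SAMPLE = b"""<mediawiki xmlns="http://example.org/export">
  <page><title>Alpha</title><revision><text>a</text></revision></page>
  <page><title>Beta</title><revision><text>b</text></revision></page>
</mediawiki>"""

titles = list(iter_page_titles(io.BytesIO(SAMPLE)))
print(titles)  # ['Alpha', 'Beta']
```

for the real file you would pass the dump's path (or an open file
object) instead of the BytesIO. the root.clear() call is what keeps
memory use roughly constant; without it every finished page stays
attached to the root.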

hth,
burak


