Best approach to create humongous amount of files

Tim Chase python.list at tim.thechases.com
Wed May 20 12:12:11 EDT 2015


On 2015-05-20 17:59, Peter Otten wrote:
> Tim Chase wrote:
> >   wordlist[:] = [ # just lowercase all-alpha words
> >     word
> >     for word in wordlist
> >     if word.isalpha() and word.islower()
> >     ]
> 
> Just a quick reminder: if the data is user-provided you have to
> sanitize it:

Hence the filter in my sample: it keeps only words that pass both
isalpha() and islower().
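
To make the effect of that filter concrete, here is a minimal sketch.
The helper name keep_safe_words and the sample inputs are made up for
illustration and are not from the original thread. One caveat: on
Python 3, str.isalpha() is Unicode-aware, so accented lowercase words
also pass.

```python
def keep_safe_words(words):
    """Return only the words made up entirely of lowercase letters."""
    return [w for w in words if w.isalpha() and w.islower()]

# Hypothetical inputs: anything with path separators, dots, digits,
# uppercase letters, or no characters at all is dropped.
candidates = ["apple", "Banana", "../etc/passwd", "foo.txt", "con1", "", "pear"]
safe = keep_safe_words(candidates)
print(safe)  # ['apple', 'pear']
```

Because "/" and "." never pass isalpha(), no surviving word can name a
different directory or carry a file extension.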

> I expect that performance will be dominated by I/O; if that's
> correct the extra work of serializing the JSON should not do much
> harm.

I seem to recall that there was a change-over at some point: an older
JSON library was particularly slow, and a later replacement sped it
up immensely.  So performance may depend heavily on which Python
version you're running.
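
Rather than relying on memory, a rough timing sketch like the one
below could be run on whichever interpreter is in use. The sample
payload and the helper name time_dumps are assumptions for
illustration, and the numbers will vary by machine and version.

```python
import json
import sys
import timeit

# Hypothetical payload: a few thousand short words, loosely like a wordlist.
sample = {"words": ["apple", "pear", "plum"] * 1000}

def time_dumps(data, repeats=20):
    """Return the average number of seconds per json.dumps(data) call."""
    total = timeit.timeit(lambda: json.dumps(data), number=repeats)
    return total / repeats

per_call = time_dumps(sample)
print("Python %s: %.6f s per json.dumps call"
      % (sys.version.split()[0], per_call))
```

Comparing that figure with the time taken to write the same data to
disk would show whether serialization or I/O dominates.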

[to the OP] But yes, if you're trusting unsanitized data, Peter's
suggestion would be the way to go.

-tkc




