Too many open files

Gary Herron gherron at islandtraining.com
Mon Feb 4 11:27:08 EST 2008


AMD wrote:
> Hello,
>
> I need to split a very big file (10 gigabytes) into several thousand 
> smaller files according to a hash algorithm; I do this one line at a 
> time. The problem I have is that opening a file in append mode, writing 
> the line, and closing the file is very time-consuming. I'd rather have 
> the files all open for the duration, do all the writes, and then close them 
> all at the end.
> The problem I have under Windows is that as soon as I get to 500 files I 
> get the "Too many open files" message. I tried the same thing in Delphi 
> and I can get to 3000 files. How can I increase the number of open files 
> in Python?
>
> Thanks in advance for any answers!
>
> Andre M. Descombes
>   
Try something like this:

Instead of opening several thousand files:

* Create several thousand lists.

* Open the input file and process each line, dropping it into the
correct list.

* Whenever a single list passes some size threshold, open its file,
write the batch, and immediately close the file. 

* Similarly at the end (or when the total of all lists passes some size
threshold), loop through the several thousand lists, opening, writing,
and closing.

This will keep the open/write/close operations to a minimum, and you'll
never have more than 2 files open at a time.  Both of those are wins for
you.
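
For example, here is a minimal sketch of that approach in Python.  The
bucket count, the flush threshold, the output file names, and the
bucket_for() helper are all placeholders -- substitute whatever your real
hash algorithm and naming scheme require.

    # Buffer lines in per-bucket lists and only open each output file
    # briefly when a batch is ready to be appended.

    NUM_BUCKETS = 3000        # number of output files (assumed)
    FLUSH_THRESHOLD = 1000    # lines to buffer per bucket before writing (assumed)

    def bucket_for(line):
        """Map a line to a bucket index; stands in for your hash algorithm."""
        return hash(line) % NUM_BUCKETS

    def flush(index, lines):
        """Append the buffered lines to the bucket's file, then close it."""
        if not lines:
            return
        out = open('bucket_%04d.txt' % index, 'a')
        try:
            out.writelines(lines)
        finally:
            out.close()
        del lines[:]          # empty the list in place

    buckets = [[] for _ in range(NUM_BUCKETS)]

    infile = open('bigfile.txt')
    try:
        for line in infile:
            i = bucket_for(line)
            buckets[i].append(line)
            if len(buckets[i]) >= FLUSH_THRESHOLD:
                flush(i, buckets[i])
    finally:
        infile.close()

    # Final pass: write out whatever is still buffered.
    for i, lines in enumerate(buckets):
        flush(i, lines)

At any moment only the input file and, briefly, one output file are open,
so the 500-file limit never comes into play.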

Gary Herron



