Text processing and file creation

Ginger luojiang2 at tom.com
Wed Sep 5 21:30:52 EDT 2007


File-reading latency here is mostly a matter of how many read calls you make, so reducing the read frequency should solve your problem.
You can pass a byte count to the file object's read() method; a fairly large value (such as 65536) is reasonable, depending on your memory budget, and you can then parse lines out of the read buffer yourself.
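
A minimal sketch of that idea (the chunk-size value is the one mentioned
above; the helper name and the handling of a trailing partial line are
just illustrative choices):

CHUNK_SIZE = 65536  # illustrative value; tune to your memory budget

def iter_lines(path, chunk_size=CHUNK_SIZE):
    # Read the file in large chunks instead of line by line,
    # carrying any incomplete trailing line over to the next chunk.
    leftover = ''
    f = open(path)
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        chunk = leftover + chunk
        lines = chunk.split('\n')
        leftover = lines.pop()   # last piece may be an incomplete line
        for line in lines:
            yield line
    f.close()
    if leftover:
        yield leftover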

have fun!

----- Original Message ----- 
From: "Shawn Milochik" <Shawn at Milochik.com>
To: <python-list at python.org>
Sent: Thursday, September 06, 2007 1:03 AM
Subject: Re: Text processing and file creation


> On 9/5/07, malibuster at gmail.com <malibuster at gmail.com> wrote:
>> I have a text source file of about 20,000 lines.
>> From this file, I'd like to write the first 5 lines to a new file. Close
>> that file, grab the next 5 lines, write these to a new file... grabbing
>> 5 lines and creating new files until processing of all 20,000 lines is
>> done.
>> Is there an efficient way to do this in Python?
>> In advance, thanks for your help.
>>
> 
> 
> I have written a working test of this. Here's the basic setup:
> 
> 
> 
> 
> open the input file
> 
> function newFileName:
>    generate a filename (starting with 00001.tmp).
>    If the filename exists, increment and test again (00002.tmp and so on).
>    return fileName
> 
> read lines until the input file is empty:
> 
>    test whether I have written five lines; if so, get a new file
> name, close the current file, and open the new one
> 
>    write the line to the file
> 
> close the output file a final time
> 
> 
> Once you get some code running, feel free to post it and we'll help.
> 
>
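
For what it's worth, here is a rough sketch of the outline quoted above
(the input file name 'input.txt' is just a placeholder; the 00001.tmp
naming and the five-line limit come from the quoted description):

import os

LINES_PER_FILE = 5

def new_file_name(start=1):
    # Return the next unused name of the form 00001.tmp, 00002.tmp, ...
    n = start
    while os.path.exists('%05d.tmp' % n):
        n += 1
    return '%05d.tmp' % n

infile = open('input.txt')            # placeholder input file name
outfile = open(new_file_name(), 'w')
count = 0
for line in infile:
    if count == LINES_PER_FILE:
        # Five lines written: close this file and start a new one.
        outfile.close()
        outfile = open(new_file_name(), 'w')
        count = 0
    outfile.write(line)
    count += 1
outfile.close()
infile.close()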


More information about the Python-list mailing list