[Tutor] in as a stream, change stuff, out as a stream?

Wesley Chun wesc@deirdre.org
Wed, 11 Jul 2001 20:33:13 -0700 (PDT)


On Wed, 11 Jul 2001 sill@optonline.net wrote:
> On Wed, Jul 11, 2001 at 05:42:16PM -0700, Israel Evans wrote:
> >
> > I'm in the middle of reading a bunch of rather large files in order to
> > change one string to another.
> >
> > I know I should be able to open up a file, read all of its contents into a
> > list with readlines() or read one line at a time with readline(), and then
> > write all of that out to another file.  I think that since the files are
> > rather large, it might be best to avoid the speed of readlines() and go with
> > readline() repeatedly.
> there's also xreadlines()

xreadlines() will work here.  it was invented in a similar
vein to xrange().  in other words, it implements a "lazy"
reading scheme whereby you *do* get to loop over every line,
but it does just enough I/O to hand you the next one rather
than reading everything in at once and filling up memory.
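
a minimal sketch of the lazy approach, assuming a current Python
where xreadlines() no longer exists (it was removed in Python 3)
and iterating over the file object itself gives the same
line-at-a-time behaviour.  the function name replaced_lines is
made up for illustration.

```python
def replaced_lines(path, old, new):
    """Yield each line of `path` with `old` replaced by `new`, lazily."""
    with open(path, "r") as src:
        # iterating the file object reads one line at a time, so the
        # whole file is never held in memory as a list
        for line in src:
            yield line.replace(old, new)
```

nothing is read until the caller starts looping over the
generator, and the file is closed once the loop runs to the end.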


> > At any rate, I was wondering if it would be possible to read a line, change
> > it in the same file I'm reading and move on to the next line, when I'm done.
> > Is this possible?  Or should I read everything at once, change the name of
> > the old file and write everything out to a file with the same name as the
> > old one.
>
> I'd read a line, change it, write it to a temp file, when done overwrite old
> file with new.

this is definitely the way most people do it.  both files
are on disk only while the data is being converted; once
that's done, the old file goes away before your app ends.
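
the read/change/write-to-temp/overwrite cycle could look roughly
like this in a current Python.  this is a sketch, not the only
way to do it: replace_in_file is a hypothetical helper name, and
putting the temp file in the same directory as the original is
an assumption made so that the final rename stays on one
filesystem.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite `path` with `old` replaced by `new`, going through a temp file."""
    dirname = os.path.dirname(os.path.abspath(path))
    # temp file lives next to the original (assumption: that
    # directory is writable)
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        # newline="" keeps the original line endings untouched
        with os.fdopen(fd, "w", newline="") as dst, \
             open(path, "r", newline="") as src:
            for line in src:
                dst.write(line.replace(old, new))
        # the old file is replaced only after the new one is complete
        os.replace(tmp_path, path)
    except BaseException:
        # something went wrong: drop the temp file, keep the original
        os.remove(tmp_path)
        raise
```

note that the new file gets the temp file's permissions, not the
original's, so copy those over separately if they matter.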


> > Is it possible to open a file as a stream  and output it back into itself?
> > or is that just plain goofy.

in C, you can mmap() a large file into memory and manipulate
it that way.  at a higher level, you can open your file for
both read *and* write, but unless your records are fixed-length,
you may end up corrupting things... this can also happen with
mmap().  in short, it's safer to use a temp file, so that if
the new file is bad for some reason, you can fall back on the
old, unmanipulated data file.
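
to show why the fixed-length caveat matters, here is a sketch of
an in-place edit using Python's own mmap module.  it only works
when the old and new strings are exactly the same length, and it
refuses anything else.  patch_in_place is a hypothetical name,
and the strings are handled as bytes.

```python
import mmap

def patch_in_place(path, old, new):
    """Overwrite each occurrence of bytes `old` with `new` inside `path`.

    Returns the number of occurrences patched.
    """
    if not old or len(old) != len(new):
        # a different length would shift every byte after the match,
        # which an in-place overwrite cannot do
        raise ValueError("old and new must be non-empty and equal in length")
    count = 0
    with open(path, "r+b") as f:
        # length 0 maps the whole file; an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0) as mm:
            pos = mm.find(old)
            while pos != -1:
                mm[pos:pos + len(new)] = new
                count += 1
                pos = mm.find(old, pos + len(new))
            mm.flush()
    return count
```

there is no fallback copy here: if the process dies halfway,
the file is left partly patched, which is the risk the temp
file avoids.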


> > I'm trying to do this bit with a number of fairly large files in multiple
> > directories, and so it would be ideal if I didn't have to fill up my hard
> > drive with old copies.

(see comment up above)

hope this helps!

-wesley

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Silicon Valley-SF Bay Area Python users group:  http://baypiggies.org

"Core Python Programming", Prentice Hall PTR, December 2000
    http://starship.python.net/crew/wesc/cpp/

wesley.j.chun :: wesc@baypiggies.org
cyberweb.consulting :: silicon.valley, ca
http://www.roadkill.com/~wesc/cyberweb/