a couple of newbie questions

Robin Munn rmunn at pobox.com
Mon Mar 24 10:56:09 EST 2003


Joseph Paish <jpaish at freenet.edmonton.ab.ca> wrote:
> exactly what i was looking for
> 
> thank you
> 
> joe
> 
> ps.  presently using Python 2.0.  hopefully, anything 2.2 specific i should be able to easily
> convert using my (older) books as a reference.  for example, i used the following code to do what
> you described above :
> >>> datadict = {}
> >>> filename = "/path/to/filename"
> >>> inputfile = open(filename, 'r')
> >>> for line in inputfile :
> ...     fields = line.split()
> ...     datadict[fields[1]] = line

You're welcome. Next time, though, please trim the message you're
replying to. Usenet newsgroups get propagated all over the world, and 80
lines of unnecessary text (the part of my original post you weren't
responding to) multiplied by umpteen Usenet servers worldwide adds up to
a lot of wasted bandwidth.

Anyway, you've changed one part of my code correctly to work with Python
2.0: replacing the file() call with open(). (BTW, don't *ever* use
"file" as a variable name: in Python 2.2 it is the name of the built-in
file type, so shadowing it *will* give you confusing bugs when you
upgrade!) But there's another part of that code that won't work properly
in Python 2.0: the "for line in inputfile:" loop. The ability to use
file objects as iterators is new in Python 2.2; in Python 2.1 or
earlier, you'll have to do:

    for line in inputfile.readlines():
        pass  # Process line here
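
Applied to your dictionary-building snippet, that might look like the
sketch below. The build_index name and the check that skips lines with
fewer than two fields are my own additions for illustration, not
something from your code.

```python
def build_index(inputfile):
    # Map the second whitespace-separated field of each line to the
    # whole line, using only features available in Python 2.0.
    datadict = {}
    for line in inputfile.readlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: ignore lines with no second field
        datadict[fields[1]] = line
    return datadict

# Usage (the path is a placeholder):
#     inputfile = open("/path/to/filename", 'r')
#     datadict = build_index(inputfile)
#     inputfile.close()
```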

A word of warning, though: the readlines() method will read the *entire*
file into memory. If you have a pretty large file to process, that is
not what you want, and you'll want to use the xreadlines module:

    import xreadlines
    for line in xreadlines.xreadlines(inputfile):
        pass  # Process line here

The xreadlines module lets you "simulate" a readlines() call, but
instead of reading the entire file into memory, it will read chunks (I
think 8K-sized chunks) of the file as needed, so you can iterate through
even a huge file (on the order of a gigabyte) without using excessive
RAM. If a gigabyte-sized file sounds excessive, you've obviously never
done any programming on things like genetics projects. :-) Read more
about the xreadlines module here:

    http://www.python.org/doc/current/lib/module-xreadlines.html

Oh; I just looked at that URL and saw that the xreadlines module was
introduced in Python 2.1. Well, that's a good reason to upgrade, if you
ask me! But on that same page is a code snippet (the one beginning with
"while 1:") that should show you a way to get the same functionality in
Python 2.0.
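
For what it's worth, here is a rough sketch of that kind of loop. It
assumes the file object's readlines() accepts a size hint;
process_in_chunks and handle_line are names I made up for illustration,
and the 8192 default is just my guess at a reasonable chunk size.

```python
def process_in_chunks(inputfile, handle_line, sizehint=8192):
    # Read roughly sizehint bytes' worth of complete lines at a time,
    # so the whole file never has to sit in memory at once.
    while 1:
        lines = inputfile.readlines(sizehint)
        if not lines:
            break  # an empty list means end of file
        for line in lines:
            handle_line(line)

# Usage (the path is a placeholder):
#     collected = []
#     inputfile = open("/path/to/filename", 'r')
#     process_in_chunks(inputfile, collected.append)
#     inputfile.close()
```

Passing the per-line work in as a function keeps the chunking logic in
one place, so the same loop can build a dictionary, count lines, or do
whatever else you need.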

-- 
Robin Munn <rmunn at pobox.com>
http://www.rmunn.com/
PGP key ID: 0x6AFB6838    50FF 2478 CFFB 081A 8338  54F7 845D ACFD 6AFB 6838



