multiple file objects for some file?

Tim Peters tim.one at comcast.net
Sun Jul 27 13:17:50 EDT 2003


[Gary Robinson]
> For some code I'm writing, it's convenient for me to have multiple
> file objects (say, 2 or 3) open on the same file in the same process
> for reading different locations.
>
> As far as I can tell, there is no problem with that but I thought it
> might be a good idea to ask here just in case I'm wrong.

Provided they're "ordinary" on-disk files (not, e.g., sockets or pipes
wrapped in a Python file object), that should be fine.  Each file object has
its own idea of the current file position, and they'll all (of course) see
the same data on disk.
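
A minimal sketch of the "own idea of the current file position" point.  It
assumes present-day Python 3 rather than the Python of this post, and the
helper names read_two_positions and demo_positions (and the temporary file)
are made up for illustration:

```python
import os
import tempfile


def read_two_positions(path):
    """Read from two offsets of one file via two separate file objects."""
    with open(path, 'rb') as f1, open(path, 'rb') as f2:
        f1.seek(0)
        f2.seek(100)
        a = f1.read(4)   # advances only f1's position
        b = f2.read(4)   # advances only f2's position
        return a, b, f1.tell(), f2.tell()


def demo_positions():
    """Build a 256-byte scratch file (byte i holds value i) and read it twice."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(bytes(range(256)))
        return read_two_positions(path)
    finally:
        os.remove(path)
```

Nothing writes to the file while the two readers are open, so this is the
safe case: each object just tracks its own offset.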

You can get in deep trouble if the file mutates, though.  Python's file
objects are wrappers around C's streams, and each C stream has its own
buffer.  So, for example, a mutation (a file write) made via one file
object won't necessarily update the buffers held by the other file objects.
That depends on how your platform's C library works, but I don't know of any
that automagically tries to update buffers across multiple streams open on
the same file.  Reading via some other file object may then continue to see
stale data (left over in its buffer).

For example, I bet this program will fail (with an AssertionError) on your
box before a minute passes:

"""
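# Python 2 syntax (print statement, xrange).
import random

# Create a 256,000-byte test file: every byte value 0-255, repeated 1000 times.
f = open('temp.dat', 'wb')
chars = ''.join(map(chr, range(256)))
chars = chars * 1000
f.write(chars)
f.close()

# Open two independent read/write file objects on the same file.
f1 = open('temp.dat', 'r+b')
f2 = open('temp.dat', 'r+b')
f1.seek(0, 2)   # seek to end of file
n = f1.tell()
assert n == len(chars)

while True:
    f = random.choice((f1, f2))
    start = random.randrange(n)
    f.seek(start)
    size = random.randrange(500)   # named size so the builtin len isn't shadowed
    data = f.read(size)
    # chars mirrors what was last written; a mismatch means f returned other data.
    assert data == chars[start : start+size]
    print '.',
    if random.random() < 0.1:
        # About 10% of the time, overwrite the bytes just read with random ones.
        print size,
        data = ''.join([random.choice(chars) for dummy in xrange(size)])
        f.seek(start)
        f.write(data)
        chars = chars[:start] + data + chars[start+size:]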
"""

If you change

    f = random.choice((f1, f2))

to

    f = f1

it should never fail (then there's only one file object getting mucked
with).

If instead you change

    if random.random() < 0.1:

to

    if 0:

it should also never fail (then there are multiple file objects, but no file
mutations).
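
The stale-buffer effect can also be sketched without the randomness.  This is
only an illustration under stated assumptions: it is written for Python 3,
whose file objects do their own buffering instead of wrapping C streams, and
the name demo_stale_read and the temporary file are made up.  Whether the
second read actually returns stale bytes depends on the implementation, so
the sketch reports what it saw rather than promising a result:

```python
import os
import tempfile


def demo_stale_read():
    """Write via one file object, then re-read the same bytes via another."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(b'a' * 1000)

        writer = open(path, 'r+b')
        reader = open(path, 'rb')
        try:
            reader.seek(0)
            before = reader.read(10)   # may pull more than 10 bytes into reader's buffer

            writer.seek(0)
            writer.write(b'b' * 10)
            writer.flush()             # hand the new bytes to the OS

            reader.seek(0)
            after = reader.read(10)    # may be served from reader's old buffer
        finally:
            writer.close()
            reader.close()

        # A file object opened after the write starts with an empty buffer.
        with open(path, 'rb') as fresh:
            reopened = fresh.read(10)
        return before, after, reopened
    finally:
        os.remove(path)


if __name__ == '__main__':
    before, after, reopened = demo_stale_read()
    print('reader saw stale data:', after != reopened)
```

Here writer plays the role of the file object doing the mutation and reader
the one left holding a buffer filled before it.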






More information about the Python-list mailing list