[Python-ideas] Add a context manager to keep stream position unchanged

Andrew Barnert abarnert at yahoo.com
Mon Mar 30 17:36:40 CEST 2015


On Mar 30, 2015, at 06:59, Chris Angelico <rosuav at gmail.com> wrote:
> 
> On Tue, Mar 31, 2015 at 12:52 AM, Andrew Barnert
> <abarnert at yahoo.com.dmarc.invalid> wrote:
>>> If that is being done, anyway, I propose an optional parameter that
>>> would allow the context to internally use itertools.tee if the stream
>>> can't seek itself.
>> 
>> That sounds cool, but... If you have an actual file object with seek and tell, you're often using other methods besides iterating lines (like read), and tee won't help with those, so it might be a little misleading.
> 
> That shouldn't be an issue, though, as those will just use seek/tell.

The whole point of this context manager is to tell at enter and seek at exit. If it's useful to people using files as iterators, it's at least as useful to people using them as files. More so, in fact, because most code that asks for a file-like iterator isn't going to want to assume it's actually a file, while code that asks for something that has file methods doesn't have that problem.
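To make that concrete, here is a minimal sketch of the plain, seekable-only version. The name restoring_position is just the one used in this thread, not an existing stdlib function, and the BytesIO object stands in for a real file:

```python
import io
from contextlib import contextmanager

@contextmanager
def restoring_position(f):
    # Hypothetical helper: remember the position on entry...
    pos = f.tell()
    try:
        yield f
    finally:
        # ...and seek back to it on exit, even if the body raised.
        f.seek(pos)

f = io.BytesIO(b"header\nbody\n")
with restoring_position(f):
    magic = f.read(6)   # any file method works here, not just iteration
rest = f.read()         # reads from the original position again
```

Nothing in the body is restricted to line iteration, which is why read() and friends benefit at least as much as next().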

> I'm more wondering about the ones that _do_ need the tee'ing; how,
> after the context manager completes, do you have the file object
> return something from the other half of the tee?

You're right; tee doesn't do anything to the iterable you give it, it creates two new iterators that share the iterable and a cache. So the caller has to use one of those new iterators, which would be clumsy. Something like:

    with restoring_position(f, tee_if_needed=True) as (f, flocal):
        # do stuff with flocal as an iterator of lines
    # do stuff with f, which is an iterator of the same lines

Of course this only works if f only lives within this scope, or is a closure or global variable (if it was a normal parameter, then as soon as you return, the caller still has the original file object, left at the wrong position). On top of the issue with f now being an iterator rather than whatever kind of file-like object you started with, the potential for confusion and misuse is so high that I don't think I'd do this even internally to my own project, much less suggest it for the stdlib... But at least the implementation is still pretty simple:

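    from contextlib import contextmanager
    from itertools import tee

    @contextmanager
    def restoring_position(f, tee_if_needed=False):
        try:
            pos = f.tell()
            f.seek(pos)
        # io.UnsupportedOperation is a subclass of OSError, so this also
        # covers streams that have tell/seek but aren't seekable.
        except (AttributeError, OSError):
            if not tee_if_needed:
                raise
            # Not seekable: hand back two iterators sharing one cache.
            f, flocal = tee(f)
            frestore = None
        else:
            flocal = frestore = f
        try:
            yield f, flocal
        finally:
            # Only a real seekable file has a position to restore.
            if frestore is not None:
                frestore.seek(pos)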
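
The reason the caller has to switch to one of the new iterators shows up in a small experiment. This is only an illustration; the list-backed iterator is a stand-in for a stream that can't seek:

```python
from itertools import tee

lines = iter(["a\n", "b\n", "c\n"])     # stands in for a non-seekable stream
f, flocal = tee(lines)                  # two new iterators over `lines`

peeked = [next(flocal), next(flocal)]   # look ahead on one branch
replayed = list(f)                      # the other branch still sees everything
leftover = list(lines)                  # the original was drained through tee
```

Here replayed contains all three lines while leftover is empty, so anyone still holding the original object has lost the data that the tee branches consumed.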

A better solution might be to build an actual file tee--a BufferedReader or TextIOWrapper that wraps another object of the same type, doesn't discard used data from the buffer, and can therefore seek back to the start of its buffer without seeking the wrapped file.

But even then, I don't think I'd use it this way. I'd construct the wrapper explicitly, and give it a method that returns a pos-restoring context manager.
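A very rough sketch of what I mean, bytes-only and nowhere near a real BufferedReader. ReplayReader and OneWay are made-up names for illustration, and the sketch assumes the wrapped read() returns as many bytes as it can:

```python
import io
from contextlib import contextmanager

class ReplayReader:
    """Hypothetical wrapper: keeps everything it has read, so it can
    rewind without ever seeking the underlying stream."""

    def __init__(self, raw):
        self._raw = raw           # anything with read(); need not be seekable
        self._buf = bytearray()   # every byte read so far, never discarded
        self._pos = 0             # current position within _buf

    def read(self, n=-1):
        if n is None or n < 0:
            self._buf += self._raw.read()
            end = len(self._buf)
        else:
            end = self._pos + n
            missing = end - len(self._buf)
            if missing > 0:
                self._buf += self._raw.read(missing)
            end = min(end, len(self._buf))
        data = bytes(self._buf[self._pos:end])
        self._pos = end
        return data

    @contextmanager
    def restoring_position(self):
        pos = self._pos
        try:
            yield self
        finally:
            self._pos = pos       # rewind inside the buffer only

class OneWay(io.BytesIO):
    """Stand-in for a pipe or socket: reading works, seeking does not."""
    def seekable(self):
        return False
    def seek(self, *args):
        raise io.UnsupportedOperation("not seekable")

r = ReplayReader(OneWay(b"GIF89a...pixels"))
with r.restoring_position():
    magic = r.read(6)
whole = r.read()
```

The obvious cost is that the buffer grows without bound, which is one more reason to make the caller construct the wrapper explicitly instead of hiding it behind a flag.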

> I'm not sure this
> mode will work. If the stream can't use seek(), this context manager
> should simply let the exception bubble IMO.

Agreed. If you really want to, just try the context manager, and fall back to tee if it fails. A couple lines of code, and a lot more explicit and readable, and probably not that commonly needed...
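Roughly like this, as a sketch. restoring_position is the hypothetical plain version with no tee logic inside, and peek_first_line is a made-up caller:

```python
import io
from contextlib import contextmanager
from itertools import tee

@contextmanager
def restoring_position(f):
    # Hypothetical helper: tell on entry, seek back on exit, nothing else.
    pos = f.tell()
    try:
        yield f
    finally:
        f.seek(pos)

def peek_first_line(f):
    """Return (first_line, stream_to_keep_using)."""
    try:
        with restoring_position(f):
            return next(iter(f), ""), f
    except (AttributeError, OSError):
        # No usable tell/seek: fall back to tee. The caller has to
        # switch to `rest`, and that is now visible at the call site.
        rest, local = tee(f)
        return next(local, ""), rest

first_a, s = peek_first_line(io.StringIO("one\ntwo\n"))
first_b, g = peek_first_line(iter(["one\n", "two\n"]))
```

In the fallback case the function hands back a different object, so the caller can see exactly when that happens instead of having it buried in an optional parameter.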


