python 3 and stringio.seek

Terry Reedy tjreedy at udel.edu
Wed Jul 29 11:43:12 EDT 2009


Miles Kaufmann wrote:
> 
> On Jul 28, 2009, at 6:30 AM, Michele Petrazzo wrote:
> 
>> Hi list,
>> I'm trying to port a library of mine to Python 3, but I have a problem
>> with StringIO.seek:
>> the method no longer accepts a value like pos=-6 mode=1, but the "old"
>> (2.X) version does...
>>
>> The error:
>>  File "/home/devel/Py3/lib/python3.0/io.py", line 2031, in seek
>>    return self._seek(pos, whence)
>> IOError: Can't do nonzero cur-relative seeks
>>
>>
>> How can I solve this?
> 
> In Python 2, StringIO is a stream of bytes (non-Unicode characters).  In 
> Python 3, StringIO is a stream of text (Unicode characters).  In the 
> early development of Python 3 (and 3.1's _pyio), it was implemented as a 
> TextIOWrapper over a BytesIO buffer.  TextIOWrapper does not support 
> relative seeks because it is difficult to map the concept of a "current 
> position" between bytes and the text that it encodes, especially with 
> variable-width encodings and other considerations.  Furthermore, the 
> value returned from TextIOWrapper.tell isn't just a file position but a 
> "cookie" that contains other data necessary to restore the decoding 
> mechanism to the same state.  However, for the default encoding (utf-8), 
> the current position is equivalent to that of the underlying bytes buffer.
> 
> In Python 3, StringIO is implemented using an internal buffer of Unicode 
> characters.  There is no technical reason why it can't support relative 
> seeks; I assume it does not for compatibility with the original Python 
> TextIOWrapper implementation (which is present in 3.1's _pyio, but not 
> in 3.0).
> 
> Note that because of the different implementations, StringIO.tell() (and 
> seek) behaves differently for the C and Python implementations:
> 
> $ python3.1
>  >>> import io, _pyio
>  >>> s = io.StringIO('\u263A'); s.read(1), s.tell()
> ('☺', 1)
>  >>> s = _pyio.StringIO('\u263A'); s.read(1), s.tell()
> ('☺', 3)

It seems to me that this discrepancy might be worth a tracker item.
I wonder why the second implementation is even there if not used.
Two different committers?

> The end result seems to be that, for text streams (including StringIO), 
> you *should* treat the value returned by tell() as an opaque magic 
> cookie, and *only* pass values to seek() that you have obtained from a 
> previous tell() call.  However, in practice, it appears that you *may* 
> seek StringIO objects relatively by characters using s.seek(s.tell() + 
> n), so long as you do not use the _pyio.StringIO implementation.

A tracker item could include a request that relative seek be restored if 
possible.
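
Until then, the workaround Miles describes can be wrapped in a small
helper. This is only a sketch: seek_relative is a made-up name, not part
of the io module, and it assumes that tell() counts characters, which
appears to hold for the C io.StringIO but not for _pyio.StringIO.

```python
import io


def seek_relative(stream, offset):
    """Move `stream` by `offset` characters from its current position.

    Hypothetical helper. Assumes stream.tell() returns a character
    index, as the C io.StringIO does; do not use with _pyio.StringIO.
    """
    target = stream.tell() + offset
    if target < 0:
        raise ValueError("cannot seek before the start of the stream")
    # An absolute seek (whence=0) is accepted where a relative one is not.
    return stream.seek(target)


s = io.StringIO("hello world")
s.read()               # consume everything; position is now 11
seek_relative(s, -5)   # step back five characters
print(s.read())        # prints: world
```

For real text files (TextIOWrapper) the value from tell() is still an
opaque cookie, so this arithmetic should not be relied on there.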

tjr



