python 3 and stringio.seek

Miles Kaufmann milesck at umich.edu
Wed Jul 29 04:14:03 EDT 2009


On Jul 28, 2009, at 6:30 AM, Michele Petrazzo wrote:

> Hi list,
> I'm trying to port a library of mine to Python 3, but I have a problem
> with StringIO.seek:
> the method no longer accepts a value like pos=-6, mode=1, but the
> "old" (2.x) version did...
>
> The error:
>  File "/home/devel/Py3/lib/python3.0/io.py", line 2031, in seek
>    return self._seek(pos, whence)
> IOError: Can't do nonzero cur-relative seeks
>
>
> How can I solve this?

In Python 2, StringIO is a stream of bytes (non-Unicode characters).   
In Python 3, StringIO is a stream of text (Unicode characters).  In  
the early development of Python 3 (and 3.1's _pyio), it was  
implemented as a TextIOWrapper over a BytesIO buffer.  TextIOWrapper
does not support relative seeks because it is difficult to map the
concept of a "current position" between the bytes and the text that
they encode, especially with variable-width encodings and decoders
that carry state.  Furthermore, the value returned from
TextIOWrapper.tell isn't just a file position but a "cookie" that  
contains other data necessary to restore the decoding mechanism to the  
same state.  However, for the default encoding (utf-8), the current  
position is equivalent to that of the underlying bytes buffer.
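
To make the restriction concrete, here is a minimal sketch, assuming a
current CPython 3 release in which io.TextIOWrapper rejects nonzero
cur-relative seeks with an exception derived from OSError (the exact
exception type and message have varied between versions).  The helper
name relative_seek_fails is my own, purely for illustration.

```python
import io

def relative_seek_fails(stream):
    """Return True if a nonzero seek relative to the current position is rejected."""
    try:
        stream.seek(-1, io.SEEK_CUR)
    except (IOError, OSError):  # io.UnsupportedOperation derives from OSError
        return True
    return False

# A text layer over a bytes buffer; the smiley takes 3 bytes in UTF-8.
text = io.TextIOWrapper(io.BytesIO("ab\u263Acd".encode("utf-8")),
                        encoding="utf-8")
text.read(3)
rejected = relative_seek_fails(text)

# The supported pattern: save the value from tell(), then seek() back to it.
text.seek(0)
text.read(2)
cookie = text.tell()   # treat as opaque
first = text.read(1)
text.seek(cookie)
second = text.read(1)  # the same character as `first`
```

Only absolute seeks to 0 or to a value previously returned by tell()
are relied on here; nothing is assumed about what the cookie contains.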

In Python 3.1, StringIO is implemented in C using an internal buffer
of Unicode characters.  There is no technical reason why it can't
support relative seeks; I assume it does not for compatibility with
the original Python TextIOWrapper implementation (which is present in
3.1's _pyio, but not in 3.0).

Note that because of the different implementations, StringIO.tell()  
(and seek) behaves differently for the C and Python implementations:

$ python3.1
 >>> import io, _pyio
 >>> s = io.StringIO('\u263A'); s.read(1), s.tell()
('☺', 1)
 >>> s = _pyio.StringIO('\u263A'); s.read(1), s.tell()
('☺', 3)

The end result seems to be that, for text streams (including
StringIO), you *should* treat the value returned by tell() as an
opaque magic cookie, and *only* pass values to seek() that you have  
obtained from a previous tell() call.  However, in practice, it  
appears that you *may* seek StringIO objects relatively by characters  
using s.seek(s.tell() + n), so long as you do not use the  
_pyio.StringIO implementation.
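
The workaround could be wrapped up roughly like this.  It is only a
sketch: seek_relative is a hypothetical helper of mine, and it rests on
the undocumented assumption that tell() on the C io.StringIO returns a
plain character index.

```python
import io

def seek_relative(stream, offset):
    """Move an io.StringIO by `offset` characters from its current position.

    Assumption: tell() returns a character index, which appears to hold
    for the C io.StringIO but not for _pyio.StringIO.
    """
    target = stream.tell() + offset
    if target < 0:
        raise ValueError("cannot seek before the start of the stream")
    return stream.seek(target)  # an absolute seek, which StringIO accepts

s = io.StringIO("hello world")
s.read()               # move to the end of the stream
seek_relative(s, -6)   # stands in for the pos=-6, mode=1 seek from the question
tail = s.read()        # ' world'
```

Because the offset is counted in characters rather than bytes, the same
call should behave the same way for non-ASCII text, again only under
the assumption stated above.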

If what you actually want is a stream of bytes, use BytesIO, which may  
be seeked (sought?) however you please.
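
For comparison, a short sketch of the same seek on a bytes stream,
using only the standard io module; the variable names are arbitrary.

```python
import io

b = io.BytesIO(b"hello world")
b.seek(0, io.SEEK_END)          # jump to the end of the buffer
pos = b.seek(-6, io.SEEK_CUR)   # relative seek: back 6 bytes from there
tail = b.read()                 # b' world'
```

Offsets here are byte offsets, so if the data is really encoded text
you have to take care not to land in the middle of a multi-byte
character.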

I'm basing this all on my reading of the Python source (and svn  
history), since it doesn't seem to be documented, so take it with a  
grain of salt.

-Miles



