[issue4862] utf-16 BOM is not skipped after seek(0)

STINNER Victor report at bugs.python.org
Wed Jan 7 12:46:50 CET 2009


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> This is because the zero in seek(0) is a "cookie" 
> which contains both the position and the decoder state. 
> Unfortunately, state=0 means 'endianness has been determined:
> native order'.

The problem may be that TextIOWrapper._pack_cookie() can produce a
cookie equal to 0. For example, to guarantee a non-zero value, replace:
    def _pack_cookie(self, position, ...):
        return (position | (dec_flags<<64) | ...
    def _unpack_cookie(self, bigint):
        rest, position = divmod(bigint, 1<<64)
        ...
with
    def _pack_cookie(self, position, ...):
        return (1 | (position<<1) | (dec_flags<<65) | ...
    def _unpack_cookie(self, bigint):
        if not (bigint & 1):
            raise ValueError("invalid cookie")
        bigint >>= 1
        rest, position = divmod(bigint, 1<<64)
        ...
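
To make the idea concrete, here is a minimal runnable sketch of the tag-bit
scheme. It is a simplification, not the real TextIOWrapper code: it packs
only position and dec_flags, and the function names and the POSITION_BITS
constant are hypothetical, chosen for illustration.

```python
# Hypothetical, simplified cookie with two fields only (position, dec_flags).
# The real cookie carries more fields, which are left out here.
POSITION_BITS = 64


def pack_cookie_plain(position, dec_flags=0):
    """Current-style packing: position 0 with flags 0 gives cookie 0."""
    return position | (dec_flags << POSITION_BITS)


def pack_cookie_tagged(position, dec_flags=0):
    """Proposed packing: bit 0 is always set, so a cookie is never 0."""
    return 1 | (position << 1) | (dec_flags << (POSITION_BITS + 1))


def unpack_cookie_tagged(cookie):
    """Reverse pack_cookie_tagged(), rejecting values without the tag bit."""
    if not (cookie & 1):
        raise ValueError("invalid cookie")
    cookie >>= 1  # drop the tag bit
    dec_flags, position = divmod(cookie, 1 << POSITION_BITS)
    return position, dec_flags


if __name__ == "__main__":
    print(pack_cookie_plain(0, 0))    # ambiguous with a plain seek(0)
    print(pack_cookie_tagged(0, 0))   # distinguishable from a plain 0
    print(unpack_cookie_tagged(pack_cookie_tagged(12345, 3)))
```

With this layout a bare 0 is no longer a valid cookie, so seek(0) would have
to be handled as a special case before unpacking.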

Why is the cookie an integer rather than an object with attributes?

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4862>
_______________________________________

