file.seek() and file.tell() look inconsistent to me

MRAB python at mrabarnett.plus.com
Mon Jul 4 13:29:17 EDT 2016


On 2016-07-04 16:48, Marco Buttu wrote:
> Hi all,
>
> if I open a file in text mode, do you know why file.seek() takes the
> number of bytes, and file.tell() returns the number of bytes? I was
> expecting the number of characters, like write() does:
>
>  >>> f = open('myfile', 'w')
>  >>> f.write('aè')
> 2
>
> It seems to me not consistent, and maybe could also be error prone:
>
>  >>> f.seek(2)
> 2
>  >>> f.write('c')
> 1
>  >>> f.close()
>  >>> open('myfile').read()
>     ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3...
>
>
Some encodings, such as UTF-8, use a variable number of bytes per 
character (codepoint, actually), so in order to seek to a certain 
character position you would need to read from a known position, e.g. 
the start of the file, until you reached the desired place.
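
As a rough sketch of what that would involve (seek_to_char is a
hypothetical helper, not something in the standard library, and the
temporary file and its contents are made up for illustration), you
could decode from the start of the file and then ask tell() where you
ended up:

```python
import os
import tempfile


def seek_to_char(f, n):
    """Hypothetical helper: leave text file f positioned just before
    character index n by decoding from the start of the file."""
    f.seek(0)
    f.read(n)        # decode and discard the first n characters
    return f.tell()  # opaque position, valid as an argument to f.seek()


# Illustrative file: 'è' takes two bytes in UTF-8, 'a' and 'b' one each.
path = os.path.join(tempfile.mkdtemp(), "myfile")
with open(path, "w", encoding="utf-8") as f:
    f.write("aèb")

with open(path, "r+", encoding="utf-8") as f:
    pos = seek_to_char(f, 2)  # position of the third character, 'b'
    f.seek(pos)
    f.write("c")              # 'c' and 'b' are both one byte long

with open(path, encoding="utf-8") as f:
    result = f.read()

print(result)
```

The cost is proportional to the amount of text before the target, which
is why it isn't what seek() does by default. Note also that overwriting
in place is only safe here because the old and new characters encode to
the same number of bytes.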

Most of the time you're seeking to a position that was previously 
returned by tell() anyway.
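
A minimal sketch of that pattern, assuming a UTF-8 file (the path and
the strings written are only illustrative): remember the value tell()
gives you and hand it back to seek() unchanged, without doing any
arithmetic on it.

```python
import os
import tempfile

# Illustrative temporary file, not the 'myfile' from the original post.
path = os.path.join(tempfile.mkdtemp(), "myfile")

with open(path, "w", encoding="utf-8") as f:
    f.write("aè")
    mark = f.tell()  # position just after 'aè', treated as opaque
    f.write("xyz")
    f.seek(mark)     # go back to the remembered position
    f.write("c")     # overwrites 'x'; both are one byte in UTF-8

with open(path, encoding="utf-8") as f:
    result = f.read()

print(result)
```

Because mark came from tell(), it always falls on a character boundary,
so the file still decodes cleanly afterwards.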

I think it's a case of "practicality beats purity".



