[New-bugs-announce] [issue40416] Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError

Rob Malouf report at bugs.python.org
Tue Apr 28 00:33:56 EDT 2020


New submission from Rob Malouf <rmalouf at mail.sdsu.edu>:

Calling TextIOWrapper.tell() while reading the attached gb2312-encoded file like this:

with open('udhr-gb2312.txt', encoding='GB2312') as f:
    while True:
        line = f.readline()
        t = f.tell()
        if not line:
            break

gives this result:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    t = f.tell()
UnicodeDecodeError: 'gb2312' codec can't decode byte 0xb5 in position 0: illegal multibyte sequence

The file seems to be well-formed and can be read without any problem. Only the call to tell() raises the exception.
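
If only the byte offset of each line start is needed, one possible workaround is to read the file in binary mode and decode each line by hand, so that tell() never has to reconstruct decoder state. The following is a minimal sketch under that assumption: the helper name lines_with_offsets and the sample text are hypothetical, and it relies on GB2312 (EUC-CN) never using byte 0x0A inside a multibyte character, so splitting on b'\n' is safe. Unlike text mode, it does no newline translation, so '\r\n' endings would be kept as-is.

```python
import os
import tempfile


def lines_with_offsets(path, encoding="gb2312"):
    """Yield (byte_offset, decoded_line) pairs for a text file.

    Hypothetical helper: the file is opened in binary mode, so tell()
    returns a plain byte position and no text decoder is involved.
    """
    with open(path, "rb") as f:
        while True:
            offset = f.tell()        # byte position of the line start
            raw = f.readline()       # bytes up to and including b"\n"
            if not raw:
                break
            yield offset, raw.decode(encoding)


if __name__ == "__main__":
    # Small made-up sample standing in for the attached file.
    sample = "世界人权宣言\n序言\n"
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        tmp.write(sample.encode("gb2312"))
    try:
        for offset, line in lines_with_offsets(tmp.name):
            print(offset, repr(line))
    finally:
        os.remove(tmp.name)
```

The offsets returned this way are ordinary byte positions, so they can be passed to seek() on a binary-mode file object; they are not interchangeable with the opaque values that TextIOWrapper.tell() returns.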

----------
components: IO, Unicode
files: udhr-gb2312.txt
messages: 367494
nosy: ezio.melotti, rmalouf, vstinner
priority: normal
severity: normal
status: open
title: Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError
type: crash
versions: Python 3.7
Added file: https://bugs.python.org/file49096/udhr-gb2312.txt

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue40416>
_______________________________________