[New-bugs-announce] [issue41497] Potential UnicodeDecodeError in dis

JIanqiu Tao report at bugs.python.org
Thu Aug 6 15:52:45 EDT 2020


New submission from JIanqiu Tao <zkonge at outlook.com>:

A UnicodeDecodeError can be raised when running "python -m dis" in an environment whose default encoding is not UTF-8.

Assume there is a file named "a.py" that contains "print('喵')" and is saved with UTF-8 encoding.

Run "python -m dis ./a.py" in a non-UTF-8 environment, for example a Windows PC whose default language is Chinese.

A UnicodeDecodeError is raised:

Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python38\lib\dis.py", line 553, in <module>
    _test()
  File "C:\Program Files\Python38\lib\dis.py", line 548, in _test
    source = infile.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb5 in position 9: illegal multibyte sequence
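
The same decoding failure can probably be reproduced on any platform without changing the system locale. The snippet below is only a sketch: it assumes that a text-mode read under a cp936 locale behaves like an explicit decode with the "gbk" codec, and the variable names are made up for illustration.

```python
# Sketch only: emulate what a text-mode read does under a cp936 (GBK) locale.
# The source text comes from the report; the variable names are hypothetical.
source_bytes = "print('喵')".encode("utf-8")  # the bytes stored in a.py

try:
    source_bytes.decode("gbk")  # the decode that open(path, "r") would perform
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Show the captured exception, if any.
print(error)
```

If the assumption holds, the printed error should name the same codec and byte offset as the traceback above.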

That is because the default encoding on Windows is determined by the system language. Chinese systems use cp936 (GBK, an extension of GB2312) as the default encoding, which cannot decode UTF-8 data.

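The fix is simply to read the file in "rb" mode instead of "r". compile() accepts bytes and determines the source encoding itself, so the locale encoding is never involved.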
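
A minimal sketch of that idea follows. It is not the actual patch to dis.py; the helper name disassemble_file is hypothetical, and it assumes the source is valid UTF-8 or carries a PEP 263 encoding declaration.

```python
import dis


def disassemble_file(path):
    """Hypothetical helper: disassemble a source file without locale decoding."""
    # Read raw bytes so the locale's default encoding is never used.
    with open(path, "rb") as infile:
        source = infile.read()
    # compile() accepts bytes and works out the source encoding itself
    # (UTF-8 unless the file declares another encoding).
    code = compile(source, path, "exec")
    dis.dis(code)
    return code
```

Calling disassemble_file("a.py") on the file from the report should then print the disassembly regardless of the system language. Another option, if a decoded string is needed, would be tokenize.open(), which opens a file in text mode using the encoding detected from the source.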

----------
components: Library (Lib)
messages: 374961
nosy: zkonge
priority: normal
severity: normal
status: open
title: Potential UnicodeDecodeError in dis
type: behavior
versions: Python 3.10, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41497>
_______________________________________

