[New-bugs-announce] [issue42453] utf-8 codec error when pip uninstalling a package which has files containing unicode filename on Windows
赵豪杰
report at bugs.python.org
Tue Nov 24 11:26:27 EST 2020
New submission from 赵豪杰 <1292756898 at qq.com>:
When a package is installed with `pip install package_name`, an `installed-files.txt` file is generated that records the files the package contains.
When the package is updated or uninstalled, pip needs to read `installed-files.txt` and then delete the old files.
If the installed package contains files whose names include non-ASCII characters, such as `文件`, the problem occurs.
In China (I don't know about other places), for historical reasons, the default Windows system codec is `gbk`, so `installed-files.txt` is also written with the `gbk` codec when a package is installed.
When updating or uninstalling, pip reads `installed-files.txt` with the `utf-8` codec. Since the file contains non-ASCII characters encoded as `gbk`, decoding fails with this error:
```
  File "d:\users\haujet\appdata\local\programs\python\python39\lib\site-packages\pip\_vendor\pkg_resources\__init__.py", line 1424, in get_metadata
    return value.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 343: invalid start byte in installed-files.txt file at path: d:\users\haujet\appdata\local\programs\python\python39\lib\site-packages\Markdown_Toolbox-0.0.8-py3.9.egg-info\installed-files.txt
```
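The mismatch can be reproduced without pip at all. The following is a minimal sketch, assuming only that the file name was encoded with `gbk` on write and decoded with `utf-8` on read; the name `文件.txt` is an illustrative example, not taken from the affected package.

```python
# Illustrative file name containing non-ASCII characters (an assumption,
# not the actual name from the affected package).
name = "文件.txt"

# Bytes as they would be written under a gbk system codec.
raw = name.encode("gbk")

# Reading those bytes back as UTF-8 fails, as in the traceback above.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print("utf-8 decode failed:", exc.reason)
else:
    error = None

# Decoding with the codec that was used for writing round-trips cleanly.
roundtrip = raw.decode("gbk")
print(roundtrip == name)
```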
I hate that default `gbk` system codec, but this setting is fixed on Windows.
So my suggestion is to add a `try`/`except` at the point of the error: if reading `installed-files.txt` with the `utf-8` codec fails, let the `gbk` codec have a go.
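A minimal sketch of that fallback, assuming a hypothetical helper named `decode_metadata` (this is not the actual `pkg_resources` code, and the `fallback` parameter is an illustration):

```python
def decode_metadata(value: bytes, fallback: str = "gbk") -> str:
    """Decode metadata bytes as UTF-8, retrying with a legacy codec.

    Hypothetical helper: `fallback` defaults to gbk to match this report;
    locale.getpreferredencoding(False) might be a more general choice.
    """
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # UTF-8 failed, so assume the file was written with the legacy codec.
        return value.decode(fallback)


# Both encodings of the same name decode to the same string.
print(decode_metadata("文件".encode("gbk")) == decode_metadata("文件".encode("utf-8")))
```

One caveat with this approach: a byte sequence written as `gbk` can occasionally also be valid UTF-8, in which case the first attempt succeeds but yields the wrong text.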
Or, as a more fundamental solution: when pip writes text files, it should strictly use the `utf-8` codec instead of the default system codec.
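As a rough sketch of the second suggestion, assuming hypothetical helpers `write_installed_files` and `read_installed_files` (not functions from pip or setuptools), passing `encoding="utf-8"` explicitly to `open()` keeps the file independent of the system codec:

```python
import os
import tempfile


def write_installed_files(path, filenames):
    """Write one file name per line, always as UTF-8 (hypothetical helper)."""
    # Without an explicit encoding, open() uses the locale's preferred
    # encoding, which is gbk on the systems described in this report.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for name in filenames:
            f.write(name + "\n")


def read_installed_files(path):
    """Read the list back, decoding as UTF-8 the same way the reader does."""
    with open(path, "rb") as f:
        return f.read().decode("utf-8").splitlines()


with tempfile.TemporaryDirectory() as tmp:
    record = os.path.join(tmp, "installed-files.txt")
    names = ["../package/文件.py", "PKG-INFO"]
    write_installed_files(record, names)
    restored = read_installed_files(record)

print(restored == names)
```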
----------
components: Windows
messages: 381753
nosy: HaujetZhao, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: utf-8 codec error when pip uninstalling a package which has files containing unicode filename on Windows
type: crash
versions: Python 3.9
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue42453>
_______________________________________