[New-bugs-announce] [issue38864] dbm: Can't open database with bytes-encoded filename
John Goerzen
report at bugs.python.org
Wed Nov 20 10:07:49 EST 2019
New submission from John Goerzen <jgoerzen at users.sourceforge.net>:
This simple recipe fails:
>>> import dbm
>>> dbm.open(b"foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/dbm/__init__.py", line 78, in open
    result = whichdb(file) if 'n' not in flag else None
  File "/usr/lib/python3.7/dbm/__init__.py", line 112, in whichdb
    f = io.open(filename + ".pag", "rb")
TypeError: can't concat str to bytes
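The TypeError comes from whichdb() appending a str suffix (".pag") to whatever filename it receives. As a rough sketch of how a suffix could be appended to either type, assuming a hypothetical helper named add_suffix (it is not part of dbm, and a real fix might look quite different):

```python
import os

def add_suffix(filename, suffix):
    """Hypothetical helper: append a str suffix to a str or bytes filename."""
    if isinstance(filename, bytes):
        # os.fsencode() turns the str suffix into bytes so the types match.
        return filename + os.fsencode(suffix)
    return filename + suffix

print(add_suffix("foo", ".pag"))
print(add_suffix(b"test\xf7", ".pag"))
```

The bytes branch keeps the original bytes untouched, so a name such as b"test\xf7" never has to pass through a decode step.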
Why does this matter? On POSIX, a filename is any string of bytes that does not contain 0x00 or '/'. A database with a filename containing, for instance, German characters in ISO-8859-1, can't be opened by dbm, EVEN WITH decoding.
For instance:
>>> file = b"test\xf7"
>>> dbm.open(file.decode())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf7 in position 4: invalid start byte
>>> db = dbm.open(file.decode('iso-8859-1'), 'c')
>>> db.close()
Then:
$ ls *.db | hd
00000000  74 65 73 74 c3 b7 2e 64  62 0a                    |test...db.|
0000000a
Note that the filename on disk does not contain the byte 0xf7; rather, it contains c3 b7, the UTF-8 encoding of the division sign (U+00F7, which is what 0xf7 means in ISO-8859-1). It is not possible to open a file named "test\xf7.db" with the dbm module.
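The mismatch can be reproduced without dbm at all. The sketch below uses only the str/bytes codecs and assumes a UTF-8 filesystem encoding, which is what the hexdump suggests. It also shows that the 'surrogateescape' error handler (the one os.fsdecode() uses on POSIX) round-trips the raw byte, although whether dbm accepts such a string is not something this report establishes.

```python
raw = b"test\xf7"

# Decoding as ISO-8859-1 maps byte 0xf7 to U+00F7 (DIVISION SIGN).
text = raw.decode("iso-8859-1")
# Re-encoding as UTF-8 yields two bytes, c3 b7, not the original byte.
on_disk = text.encode("utf-8")
print(on_disk)

# surrogateescape carries the undecodable byte through str and back.
escaped = raw.decode("utf-8", "surrogateescape")
restored = escaped.encode("utf-8", "surrogateescape")
print(restored == raw)
```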
----------
messages: 357078
nosy: jgoerzen
priority: normal
severity: normal
status: open
title: dbm: Can't open database with bytes-encoded filename
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38864>
_______________________________________