[issue1170] shlex have problems with parsing unicode
wombat
report at bugs.python.org
Wed Sep 29 03:16:49 EDT 2021
wombat <jewett.aij at gmail.com> added the comment:
The error messages may have gone away, but the underlying unicode limitations I mentioned remain:
Suppose you wanted to use shlex to build a parser for Chinese text. Would you have to set "wordchars" to a string containing every possible Chinese character?
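To make the limitation concrete, here is a minimal sketch, assuming Python 3's standard shlex in its default (non-POSIX) mode. The sample string is only an illustration. Because the default "wordchars" covers only ASCII letters, digits and the underscore, each Chinese character comes back as a separate token unless "whitespace_split" is turned on:

```python
import shlex

text = "你好 世界"  # illustrative sample input

# Default lexer: no Chinese character is in wordchars, so each one
# is returned as its own one-character token.
default_tokens = list(shlex.shlex(text))

# whitespace_split=True sidesteps wordchars entirely, but then
# whitespace is the only thing that can end a word.
lexer = shlex.shlex(text)
lexer.whitespace_split = True
split_tokens = list(lexer)

print(default_tokens)
print(split_tokens)
```

The second form keeps the words together, but it gives no way to say that some non-whitespace character (a parenthesis, for example) should also end a word.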
I myself wrote a parser for a crude language where words can contain any character except whitespace and parentheses. I needed a way to specify the characters which cannot belong to a word, so I modified shlex.py and added a "wordterminators" member. If "wordterminators" is left blank, "wordchars" is used instead. This was a trivial change to "shlex.py" and it added a lot of functionality.
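The "wordterminators" idea can be sketched roughly as follows. This is not the modified shlex.py described above. It is a standalone, hypothetical helper ("split_words" and its parameters are names invented for this illustration) that only shows the rule "a word is any run of characters that are neither whitespace nor terminators". It has none of shlex's quoting, escaping, or comment handling.

```python
def split_words(text, wordterminators="()", whitespace=" \t\r\n"):
    """Hypothetical helper: split text into words and terminator tokens.

    A word is any run of characters that are neither whitespace nor
    listed in wordterminators. Each terminator is its own token.
    """
    tokens = []
    word = []
    for ch in text:
        if ch in whitespace or ch in wordterminators:
            # The current word (if any) ends here.
            if word:
                tokens.append("".join(word))
                word = []
            if ch in wordterminators:
                tokens.append(ch)
        else:
            word.append(ch)
    # Flush a word that runs to the end of the input.
    if word:
        tokens.append("".join(word))
    return tokens


tokens = split_words("(定义 x-1 (加 2 3))")
print(tokens)
```

With this rule the caller lists only the handful of characters that end a word, rather than every character that may appear inside one.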
I would like to suggest making this change (or something similar) to the official version of "shlex.py". Would an email to "python-ideas at python.org" be the right way to make this proposal?
----------
nosy: +jewett-aij
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue1170>
_______________________________________