Problem writing some strings (UnicodeEncodeError)

Peter Otten __peter__ at web.de
Mon Jan 13 12:29:28 EST 2014


Paulo da Silva wrote:

> On 13-01-2014 08:58, Peter Otten wrote:
>> Peter Otten wrote:
>> 
>>> Paulo da Silva wrote:
>>>
>>>> On 12-01-2014 20:29, Peter Otten wrote:
>>>>> Paulo da Silva wrote:
>>>>>
>>>>>>> but I have not tried it myself. Also, some bytes may need to be
>>>>>>> escaped, either to be understood by the shell, or to address
>>>>>>> security concerns:
>>>>>>>
>>>>>>
>>>>>> Since I am putting the file names between "", the only char that needs
>>>>>> to be escaped is the " itself.
>>>>>
>>>>> What about the escape char?
>>>>>
>>>> Just this fn=fn.replace('"','\\"')
>>>>
>>>> So far I didn't find any problem, but the script is still running.
>>>
>>> To be a bit more explicit:
>>>
>>> >>> for filename in os.listdir():
>>> ...     print(template.replace("<fn>", filename.replace('"', '\\"')))
>>> ...
>>> ls "\\"; rm whatever; ls \"
>> 
>> The complete session:
>> 
>> >>> import os
>> >>> template = 'ls "<fn>"'
>> >>> with open('\\"; rm whatever; ls \\', "w") as f: pass
>> ...
>> >>> for filename in os.listdir():
>> ...     print(template.replace("<fn>", filename.replace('"', '\\"')))
>> ...
>> ls "\\"; rm whatever; ls \"
>> 
>> 
>> Shell variable substitution is another problem. c.l.py is probably not
>> the best place to get the complete list of possibilities.
> I see what you mean.
> This is a tedious problem. I don't know whether there is a simple solution
> in Python for this. I have to think about it ...
> For a more general and serious application I would not produce a bash
> script. I would do all the work in Python.
> 
> That's not the case here, however. This is a script that will be run only a
> few times, for a very special purpose. The only problem was the occurrence
> of some Portuguese characters in old filenames stored in an encoding other
> than UTF-8. A very few also include the " character.
> 
> The worst that could happen is that the bash script aborts. Then it
> would be easy to fix with a simple editor.

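I looked around in the stdlib and found shlex.quote(). It wraps the string 
in ' instead of ", which simplifies things: inside single quotes the shell 
gives no character a special meaning, so the only character that needs 
special-casing is ' itself: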

>>> import shlex
>>> print(shlex.quote("alpha'beta"))
'alpha'"'"'beta'

So the answer is simpler than I had expected.
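
Applied to the template from the session quoted above, this might look like 
the sketch below. It is untested against your real script; the "ls" template 
and the make_command() helper are placeholders I made up for illustration, 
and shlex.quote() needs Python 3.3 or later.

```python
import shlex

# Hypothetical template; "ls" stands in for whatever command the script emits.
TEMPLATE = "ls {fn}"


def make_command(filename):
    """Return one shell command line with the file name quoted for the shell."""
    # shlex.quote() wraps the name in single quotes, so the shell treats
    # characters such as ", \, $ and ; inside it as literal text.
    return TEMPLATE.format(fn=shlex.quote(filename))


# The hostile file name from the earlier session.
nasty = '\\"; rm whatever; ls \\'

print(make_command(nasty))
print(make_command("alpha'beta"))
print(make_command("$HOME"))
```

With this the "; rm whatever;" part stays inside the quoted argument instead 
of ending the ls command, and "$HOME" is no longer expanded by the shell.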




