Problem writing some strings (UnicodeEncodeError)
Peter Otten
__peter__ at web.de
Mon Jan 13 12:29:28 EST 2014
Paulo da Silva wrote:
> On 13-01-2014 08:58, Peter Otten wrote:
>> Peter Otten wrote:
>>
>>> Paulo da Silva wrote:
>>>
>>>> On 12-01-2014 20:29, Peter Otten wrote:
>>>>> Paulo da Silva wrote:
>>>>>
>>>>>>> but I have not tried it myself. Also, some bytes may need to be
>>>>>>> escaped, either to be understood by the shell, or to address
>>>>>>> security concerns:
>>>>>>>
>>>>>>
>>>>>> Since I am putting the file names between "", the only char that needs
>>>>>> to be escaped is the " itself.
>>>>>
>>>>> What about the escape char?
>>>>>
>>>> Just this fn=fn.replace('"','\\"')
>>>>
>>>> So far I didn't find any problem, but the script is still running.
>>>
>>> To be a bit more explicit:
>>>
>>>>>> for filename in os.listdir():
>>> ... print(template.replace("<fn>", filename.replace('"', '\\"')))
>>> ...
>>> ls "\\"; rm whatever; ls \"
>>
>> The complete session:
>>
>>>>> import os
>>>>> template = 'ls "<fn>"'
>>>>> with open('\\"; rm whatever; ls \\', "w") as f: pass
>> ...
>>>>> for filename in os.listdir():
>> ... print(template.replace("<fn>", filename.replace('"', '\\"')))
>> ...
>> ls "\\"; rm whatever; ls \"
>>
>>
>> Shell variable substitution is another problem. c.l.py is probably not
>> the best place to get the complete list of possibilities.
> I see what you mean.
> This is a tedious problem. Don't know if there is a simple solution in
> python for this. I have to think about it ...
> On a more general and serious application I would not produce a bash
> script. I would do all the work in python.
>
> That's not the case here, however. This is a script for a very special
> purpose that will be run only a few times. The only problem was that some
> old filenames contain Portuguese characters in an encoding other than
> utf-8. A very few also include the ".
>
> The worst that could happen is that the bash script aborts. Then it
> would be easy to fix using a simple editor.
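I looked around in the stdlib and found shlex.quote(). It wraps the string
in ' instead of ", which simplifies things: inside single quotes the shell
takes every character literally, so the only character that needs special
handling is ' itself: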
>>> import shlex
>>> print(shlex.quote("alpha'beta"))
'alpha'"'"'beta'
So the answer is simpler than I had expected.
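
Applied to the template from earlier in the thread, this might look roughly
like the sketch below. The helper name build_command is made up for
illustration, and the surrounding "" are dropped from the template because
shlex.quote() supplies its own quoting. I have only checked the result by
splitting it again with shlex.split(), not by running it through bash.

```python
import shlex

# The awkward filename from the earlier session: it contains a backslash,
# a double quote, and semicolons.
nasty = '\\"; rm whatever; ls \\'


def build_command(filename):
    # build_command is a hypothetical helper, not part of the original script.
    # shlex.quote() adds the surrounding quotes itself, so the template has none.
    return "ls " + shlex.quote(filename)


print(build_command(nasty))

# Round trip: splitting the command with POSIX shell rules gives back the
# original filename as one single argument.
assert shlex.split(build_command(nasty)) == ["ls", nasty]
assert shlex.split(build_command("alpha'beta")) == ["ls", "alpha'beta"]
```

Because everything ends up inside single quotes, shell variable substitution
such as $HOME is left alone as well.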