encode short string as filename (unix/windows)
Jean-Paul Calderone
exarkun at divmod.com
Mon Mar 27 11:21:28 EST 2006
On Mon, 27 Mar 2006 18:13:17 +0200, "Diez B. Roggisch" <deets at nospam.web.de> wrote:
>robert wrote:
>
>> want to encode/decode an arbitrary short 8-bit string as save filename.
>> is there a good already builtin encoding to do this (without too much
>> inflation) ? or re.sub expression?
>
>You could use the base64-encoder. Disadvantage is clearly that you can't
>easily read your original text. Alternatively, there is that encoding that
>is used by e.g. emails if you have an umlaut in a name. I _think_ it is
>called puny-code, but I'm not sure how and if you can use that from within
>python - google yourself :)
Punycode is used by DNS (for internationalized domain names). A commonly used email codec is quoted-printable. Here's an example of each:
>>> u'Helló world'.encode('utf-8').encode('quopri')
'Hell=C3=B3=20world'
>>> u'Helló world'.encode('punycode')
'Hell world-jbb'
>>>
Note the extra trip through utf-8 for quoted-printable: Python implements it as a byte encoding rather than a character encoding, so you cannot (safely) apply it to a unicode string.
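The session above is Python 2. As a rough sketch of the same two encodings on Python 3, where bytes objects no longer have an .encode method, something like the following should work. The helper names (to_quoted_printable and so on) are made up for illustration and are not part of any library; the sketch assumes the standard quopri module with quotetabs=True, so that spaces come out as =20 as in the output above.

```python
import quopri


def to_quoted_printable(text):
    """Encode text as UTF-8 bytes, then as quoted-printable bytes."""
    # quotetabs=True also escapes spaces and tabs (space becomes =20).
    return quopri.encodestring(text.encode('utf-8'), quotetabs=True)


def from_quoted_printable(data):
    """Reverse to_quoted_printable: undo quoted-printable, then UTF-8."""
    return quopri.decodestring(data).decode('utf-8')


def to_punycode(text):
    """Encode text with the built-in punycode codec (returns bytes)."""
    return text.encode('punycode')


def from_punycode(data):
    """Reverse to_punycode."""
    return data.decode('punycode')


if __name__ == '__main__':
    original = 'Helló world'
    qp = to_quoted_printable(original)
    puny = to_punycode(original)
    print(qp)
    print(puny)
    # Both encodings are reversible, so the original text comes back.
    print(from_quoted_printable(qp) == original)
    print(from_punycode(puny) == original)
```

Both helpers return bytes, so call .decode('ascii') on the result if a str is needed.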
Jean-Paul