encode short string as filename (unix/windows)
robert
no-spam at no-spam-no-spam.com
Mon Mar 27 11:40:38 EST 2006
Jean-Paul Calderone wrote:
> punycode is used by dns. A commonly used email codec is
> quoted-printable. Here's an example of each:
>
> >>> u'Helló world'.encode('utf-8').encode('quopri')
> 'Hell=C3=B3=20world'
> >>> u'Helló world'.encode('punycode')
> 'Hell world-jbb'
> >>>
> Note the extra trip through utf-8 for quoted-printable, as it is not
> implemented in Python as a character encoding, but a byte encoding, so
> you cannot (safely) apply it to a unicode string.
>
> Jean-Paul
>
>>> u'Helló world\\/\x00'.encode('punycode')
'Hell world\\/\x00-elb'
>>> u'Helló world\\/\x00'.encode('utf-8').encode('quopri')
'Hell=C3=B3=20world\\/=00'
>>>
Neither of those removes \ or /.
The other base.. codecs behave similarly.
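The session above is Python 2. As a rough sketch of the same check on Python 3 (an assumption on my side, not what the post ran), `str.encode('quopri')` is no longer available there, so the bytes-to-bytes codec is reached through `codecs.encode` instead. The variable names are only illustrative.

```python
import codecs

name = 'Helló world\\/\x00'

# punycode is a text codec: str in, bytes out.
puny = name.encode('punycode')

# quoted-printable is a bytes-to-bytes codec, so go through UTF-8 first.
quopri_bytes = codecs.encode(name.encode('utf-8'), 'quopri_codec')

# Both results still contain the path separators.
for label, encoded in (('punycode', puny), ('quopri', quopri_bytes)):
    print(label, encoded, b'/' in encoded, b'\\' in encoded)
```

Both codecs leave `/` and `\` in place, so neither result is usable as a filename as-is.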
So in the end I found myself regex'ing :-( , but this does minimal optical
damage to common strings ...
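import re

def encode_as_filename(s):
    # Replace each character that is unsafe in a filename with "+XX" (hex).
    def _(m): return "+%02X" % ord(m.group(0))
    return re.sub('[\x00"\\\\/*?:<>|+\n]', _, s)

def decode_from_filename(s):
    # Turn each "+XX" escape back into the character it stands for.
    def _(m): return chr(int(m.group(0)[1:], 16))
    return re.sub("\\+[0-9A-F]{2}", _, s)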
>>> newsletter.encode_as_filename('robert@?/\\+\n\x00:+test')
'robert@+3F+2F+5C+2B+0A+00+3A+2Btest'
>>> newsletter.decode_from_filename(_)
'robert@?/\\+\n\x00:+test'
>>>
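For anyone on Python 3, here is a minimal sketch of the same two functions, assuming `str` input. The escaped character set is the one from the post. The precompiled pattern names `_UNSAFE` and `_ESCAPE` are my own additions for illustration.

```python
import re

# Same character set as above: NUL, ", \, /, *, ?, :, <, >, |, + and newline.
_UNSAFE = re.compile(r'[\x00"\\/*?:<>|+\n]')
# An escape is '+' followed by exactly two uppercase hex digits.
_ESCAPE = re.compile(r'\+([0-9A-F]{2})')

def encode_as_filename(s):
    """Replace each unsafe character with '+' and its two-digit hex code."""
    return _UNSAFE.sub(lambda m: "+%02X" % ord(m.group(0)), s)

def decode_from_filename(s):
    """Turn each '+XX' escape back into the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), s)

original = 'robert@?/\\+\n\x00:+test'
encoded = encode_as_filename(original)
print(encoded)  # robert@+3F+2F+5C+2B+0A+00+3A+2Btest
assert decode_from_filename(encoded) == original
```

Because '+' itself is always escaped as +2B, every '+' left in an encoded name starts an escape, which is why decoding is unambiguous.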
Robert