Sanitizing filename strings across platforms

Jean-Paul Calderone calderone.jeanpaul at gmail.com
Tue May 31 23:17:50 EDT 2011


On May 31, 10:17 pm, Tim Chase <python.l... at tim.thechases.com> wrote:
> Scenario: file names from potentially untrusted sources may be odd
> and need to be sanitized for the underlying OS.
> On *nix, this generally just means "don't use '/' or \x00 in your
> string", while on Win32, there are a host of verboten characters
> and file-names.  Then there's also checking the abspath/normpath
> of the resulting name to make sure it's still in the intended folder.
>
> I've read through [1] and have started to glom together various
> bits from that thread.  My current course of action is something like
>
>   import string
>
>   SACRED_WIN32_FNAMES = set(
>     ['CON', 'PRN', 'CLOCK$', 'AUX', 'NUL'] +
>     ['LPT%i' % i for i in range(32)] +
>     ['COM%i' % i for i in range(32)])
>
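>   def sanitize_filename(fname):
>     # whitelist: drop every character that is not known to be safe
>     sane = set(string.ascii_letters + string.digits + '-_.[]{}()$')
>     results = ''.join(c for c in fname if c in sane)
>     # Win32 reserves these names with any extension too (e.g. NUL.txt),
>     # so compare only the part before the first dot
>     if results.split('.', 1)[0].upper() in SACRED_WIN32_FNAMES:
>       results = "_" + results
>     return results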
>
> but if somebody already has war-hardened code they'd be willing
> to share, I'd appreciate any thoughts.
>

There's http://pypi.python.org/pypi/filepath/0.1 (taken from
twisted.python.filepath).
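
For the "still in the intended folder" part of the question, here is a
minimal stdlib-only sketch of the containment check. The names safe_join
and UnsafePath are made up for illustration and are not the API of the
filepath package above. It assumes the base directory itself is trusted
and it does not resolve symlinks (os.path.realpath would be the place to
start if that matters).

```python
import os


class UnsafePath(ValueError):
    """Hypothetical error: the name would resolve outside the base directory."""


def safe_join(base_dir, fname):
    """Join fname onto base_dir and refuse any result outside base_dir."""
    if not fname or "\x00" in fname:
        raise UnsafePath("empty name or embedded NUL: %r" % (fname,))
    base = os.path.abspath(base_dir)
    # abspath also normalizes, so "sub/../../x" collapses before the check
    candidate = os.path.abspath(os.path.join(base, fname))
    # Compare against base plus a trailing separator so that a sibling
    # such as "/srv/upload-old" does not pass for "/srv/upload".
    prefix = base if base.endswith(os.sep) else base + os.sep
    if not candidate.startswith(prefix):
        raise UnsafePath("%r escapes %r" % (fname, base))
    return candidate
```

Something like safe_join(upload_dir, sanitize_filename(name)) would then
either return an absolute path below upload_dir or raise. Names such as
"../evil", an absolute path, or "." (which resolves to the base directory
itself) are all rejected.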

Jean-Paul


