[Python-ideas] Iterating non-newline-separated files should be easier

Nick Coghlan ncoghlan at gmail.com
Sun Jul 20 07:00:15 CEST 2014


On 20 July 2014 13:58, Andrew Barnert <abarnert at yahoo.com> wrote:
> First, why is it so odd to have newlines in filenames? It used to be pretty common on Classic Mac. Sure, they're not too common nowadays, but that's because they're illegal on DOS/Windows, and because the shell on Unix systems makes them a pain to deal with, not because there's something inherently nonsensical about the idea, any more than filenames with spaces or non-ASCII characters or >255 length.

You answered your own question: because DOS/Windows make them illegal,
and the Unix shell isn't fond of them either. I was a DOS/Windows user
for more than a decade before switching to Linux for personal use, and
in a decade of using Linux (and even going to work for a Linux
vendor), I've never encountered a filename with a newline in it. Thus
the idea that anyone *would* do such a thing, and that it would be
prevalent enough for UNIX tools to include a workaround in programs
that normally produce newline-separated output, is an entirely novel
concept for me. Any such file I encountered *would* be an outlier, and
I'd likely be in a position to get the offending filename fixed rather
than changing any data processing pipelines (whether written in Python
or not) to tolerate newlines in filenames (since the cost differential
between fixing one filename and updating the data processing pipelines
would be enormous).
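
For anyone else who hasn't run into the workaround in question: some
tools can emit NUL-terminated rather than newline-terminated names
(GNU find's -print0 is one example). The following is only a rough
sketch of what consuming such output from Python might look like.
The helper name iter_records, its parameters and the sample data are
all made up for illustration; none of this is an existing or proposed
stdlib API.

```python
import io


def iter_records(stream, sep=b"\0", chunk_size=4096):
    """Yield sep-terminated records from a binary stream, separator removed.

    Hypothetical helper for illustration only; not part of the stdlib.
    """
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last separator is complete; keep the tail.
        *records, buf = buf.split(sep)
        yield from records
    if buf:
        # Trailing data with no final separator is still yielded as a record.
        yield buf


# Simulated NUL-separated output; the second name contains a newline.
data = b"plain.txt\0odd\nname.txt\0last.txt\0"
names = list(iter_records(io.BytesIO(data)))
```

Splitting the same data on newlines would cut the second name in two,
which is exactly the failure the NUL separator is there to avoid.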

However, note that my attitude changed significantly once you
clarified the use case - it's clear that there *is* a use case, it's
just one that's outside my own personal experience. That's one of the
things the PEP process is for - to explain such use cases to folks
that haven't personally encountered them, and then explain why the
proposed solution addresses the use case in a way that makes sense for
the domains where the use case arises. The recent matrix
multiplication PEP was an exemplary instance of the breed.

That's what I'm asking for here: a PEP that makes sense to someone
like me for whom the idea of putting a newline in a filename is
completely alien. Yes, it's technically permitted by the underlying
operating system APIs on POSIX systems, but all the affordances at
both the console and GUI level suggest "no newlines allowed". If
you're coming from a DOS/Windows background (as I did), then the idea
that a newline is technically a permitted filename character may never
even occur to you (it certainly hadn't to me, and I'd never previously
come across anything to challenge that assumption).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

