[Python-ideas] Iterating non-newline-separated files should be easier

Nick Coghlan ncoghlan at gmail.com
Sun Jul 20 03:23:56 CEST 2014


On 20 July 2014 10:57, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Saturday, July 19, 2014 4:49 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>>On 20 Jul 2014 09:28, "Andrew Barnert" <abarnert at yahoo.com> wrote:
>
>
>>> In general, it's just as easy to write Unix command-line tools in Python as in Perl, and that's a good thing—it means I don't have to use Perl. But as soon as -0 comes into the mix, that's no longer true. And that's a problem.
>
>>I would find adding NULL to the potential newline set significantly less objectionable than opening it up to arbitrary character sequences.
>
>
>>Adding a single possible newline character is a much simpler change, and one likely to have far fewer odd consequences. This is especially so if specifying NULL as the line separator is only permitted for files opened in binary mode.
>
>
> But newline is only permitted for text mode. Are you suggesting that we add newline to binary mode, but the only allowed values are NULL (current behavior) and \0, while on text files the list of allowed values stays the same as today?

Actually, I temporarily forgot that newline was only handled at the
TextIOWrapper layer. All the more reason for a PEP that clearly lays
out the status quo (both Python's own newline handling and the "-0"
option for various UNIX utilities, and the way that is handled in
other scripting languages), and discusses the various options for
dealing with it (new RecordIOWrapper class with a new "open"
parameter, new methods on IO classes, new semantics on the existing
TextIOWrapper class).
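
For reference, the status quo can be sketched roughly as follows: since
newline handling lives in TextIOWrapper, splitting on NUL today means
reading a binary stream in chunks and splitting by hand. This is only a
minimal sketch; the name iter_records and its parameters are hypothetical
and not part of any existing or proposed API.

```python
import io


def iter_records(stream, sep=b"\0", chunk_size=8192):
    """Yield records from a binary stream, split on sep (separator removed).

    Hypothetical helper: illustrates the manual buffering needed today.
    """
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last separator is complete; the final
        # piece may be a partial record, so keep it for the next round.
        *records, buf = buf.split(sep)
        yield from records
    if buf:
        # Trailing record that was not followed by a separator.
        yield buf


if __name__ == "__main__":
    demo = io.BytesIO(b"a.txt\0dir/b c.txt\0last")
    for record in iter_records(demo, chunk_size=4):
        print(record)
```

A script consuming "find . -print0" output would presumably pass
sys.stdin.buffer as the stream and decode each record itself (for
example with os.fsdecode).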

If the description of the use cases is clear enough, then the "right
answer" amongst the presented alternatives (which includes "don't
change anything") may be obvious. At present, I'm genuinely unclear on
why someone would ever want to pass the "-0" option to the other UNIX
utilities, which then makes it very difficult to have a sensible
discussion on how we should address that use case in Python.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

