[Python-ideas] Iterating non-newline-separated files should be easier

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Fri Jul 18 13:53:48 CEST 2014


On 07/18/2014 02:04 AM, Andrew Barnert wrote:
> On Thursday, July 17, 2014 3:21 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
>
>
>
>>    On Thursday, July 17, 2014 2:40 PM, Alexander Heger <python at 2sn.net> wrote:
>
>>>    Could the "split" (or splitline) keyword-only
>>> parameter instead be passed to the open function
>>> (and the __init__ of IOBase and be stored there)?
>>
>> Good idea. It's less powerful/flexible, but probably
>> good enough for almost all use cases. (I can't think
>> of any file where I'd need to split part of it on \0
>> and the rest on \n…) Also, it means you can stick with
>> the normal __iter__ instead of needing a separate
>> iterlines method.
>
> It turns out to be even simpler than I expected.
>
> I reused the "newline" parameter of open and TextIOWrapper.__init__, adding a param of the same name to the constructors for BufferedReader, BufferedWriter, BufferedRWPair, BufferedRandom, and FileIO.
>
> For text files, just remove the check for newline being one of the standard values and it all works. For binary files, remove the check for truthy, make open pass each Buffered* constructor newline=(newline if binary else None), make each Buffered* class store it, and change two lines in RawIOBase.readline to use it. And that's it.
>

You are not the first one to come up with this idea and to suggest 
solutions. This whole thing has been sitting on the bug tracker as an 
unresolved issue (started by Nick Coghlan) for almost a decade:

http://bugs.python.org/issue1152248

Ever since discovering it, I've been sticking to the recipe provided by 
Douglas Alan:

http://bugs.python.org/issue1152248#msg109117
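
For anyone who does not want to follow the link, here is a minimal sketch 
of what a recipe of this kind can look like. It is not Douglas Alan's 
actual code; the name iter_records and its parameters (sep, chunk_size) 
are made up for illustration. It assumes a file object whose read() 
returns str or bytes matching the type of the separator:

```python
import io


def iter_records(fileobj, sep="\n", chunk_size=8192):
    """Yield records from fileobj, split on sep (hypothetical helper).

    The separator is stripped from each record. sep must have the same
    type (str or bytes) as the data returned by fileobj.read().
    """
    if not sep:
        raise ValueError("sep must be a non-empty str or bytes")
    buf = sep[:0]  # empty str or bytes, matching the separator type
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        parts = buf.split(sep)
        # The last piece may be an incomplete record (or hold part of a
        # multi-character separator), so keep it for the next round.
        buf = parts.pop()
        for part in parts:
            yield part
    if buf:
        # Final record that was not terminated by a separator.
        yield buf


# Usage with NUL-separated text, e.g. the output of "find -print0".
for name in iter_records(io.StringIO("a.txt\0b.txt\0c.txt\0"), sep="\0"):
    print(name)
```

When using something like this on a real text file, it is probably safest 
to open it with newline="" so that the io layer does not translate line 
endings before the split happens.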

Not that I wouldn't like to see this feature ship with Python, but it 
may help to read through all the aspects of the problem that have 
already been discussed there.

Best,
Wolfgang



