[Pythonmac-SIG] Re: MacPython and line-endings
Jack Jansen
jack@oratrix.nl
Sat, 06 Oct 2001 17:11:32 +0200
Recently, Chris Barker <chrishbarker@home.net> said:
> > - I don't like all the various ways to specify line endings (with
> > 'mac', 'Mac' and '\r' all being equivalent), I think I'd go with a
> > simple '\r' (maybe with a symbolic constant mac='\r' somewhere).
>
> I wanted the interface to be easy to use. I suppose if you are a
> programmer, you should know what a Mac line ending looks like, but that
> may not fit with CP4E. If you were to define a constant somewhere, where
> would you define it???
Hmm. I agree with your thinking that newbies won't know about
\r\n. But on the other hand, newbies will never use the parameter in the
first place: they'll read files with any line ending and write files
with the local line ending. But this isn't really all that important a
point, feel free to do what you like.
> > - Interactive input is important, and the lookahead for \n should be
> > tackled. I think the way to do it is different than what Guido
> > suggests, I think that instead of peeking ahead for a \n if you see
> > a \r you should set a flag (self.return_seen_skip_initial_newline)
> > that will eat the newline upon the next read/readline. But: this has
> > implications for tell(), as tell() _will_ have to do the peek to
> > return the correct position for the beginning of the next
> > line. And seek() should reset the flag.
> Also read().
>
> There just isn't a simple way to do this. I really hadn't been thinking
> of interactive input...how important is it to support arbitrary line
> ending in interactive input? How often would it be coming from an
> unknown source? Not that it wouldn't be nice for completeness' sake in
> any case.
Maybe it isn't all that important. We can assume that sys.stdin
conforms to the local convention, I guess. And for interactive input
coming in over sockets (think of things like a Python MUD server
connected to via telnet) we'll probably get a known convention.
But: I don't think it's all that difficult either. I think my flag
proposal should handle all cases fairly easily. Or do you see problems
with it?
I'm skipping the read()/readline() stuff, the more I think about it
the more I think that the readtoterminator() solution is the right
one. And if we want backward compatibility to Python versions that
don't have the readtoterminator() file object method we can add a
workaround to the class. We'd still have only a single place in the
code where we would have to look at every byte.
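A minimal sketch of what such a workaround could look like, assuming
readtoterminator(terminators, count) returns everything up to and
including the first terminator character, and treats count=0 as "no
limit". Both the function and those semantics are assumptions for
illustration; readtoterminator() is a proposed method, not an existing
one. This is the single place that looks at every character:

```python
import io

def readtoterminator(fp, terminators, count=0):
    """Hypothetical fallback for file objects lacking readtoterminator().

    Reads one character at a time and stops after the first character
    that appears in `terminators`, at end of file, or once `count`
    characters have been read (count <= 0 means no limit).
    """
    chars = []
    while count <= 0 or len(chars) < count:
        ch = fp.read(1)
        if not ch:              # end of file
            break
        chars.append(ch)
        if ch in terminators:   # terminator is included in the result
            break
    return ''.join(chars)

# Usage with an in-memory file holding a Mac line and a DOS line.
fp = io.StringIO("ab\rcd\r\nef")
first = readtoterminator(fp, '\r\n')   # stops after the '\r'
```

Note that a '\r\n' pair comes back as two separate reads ('cd\r' and
then '\n'), which is exactly why the caller needs the skip flag.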
Read and readline would become really simple:
    def readline(self, count=0):
        data = self.fp.readtoterminator('\r\n', count)
        if self.skipinitialnewline and data[:1] == '\n':
            # Second half of a '\r\n' pair: drop it and read the real line.
            data = self.fp.readtoterminator('\r\n', count)
        # Slices instead of indexing, so an empty string at EOF is safe.
        self.skipinitialnewline = (data[-1:] == '\r')
        if self.skipinitialnewline:
            data = data[:-1] + '\n'
        return data
    def read(self, count=0):
        data = ''
        while 1:
            next = self.readline(count)
            if not next: return data
            data = data + next
            if count:
                count = count - len(next)
                if count <= 0: return data
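To try the idea out without a real readtoterminator(), here is a
self-contained sketch. StubFile and AnyNewlineReader are made-up names,
and the stub's readtoterminator() only mimics the semantics assumed in
this thread (return text up to and including the first terminator).
When the pending '\n' of a '\r\n' pair is skipped, the sketch reads
again rather than returning an empty string, since an empty string
would look like end of file to the caller:

```python
class StubFile:
    """Hypothetical in-memory stand-in for a file with readtoterminator()."""
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def readtoterminator(self, terminators, count=0):
        # Find the nearest terminator character at or after self.pos.
        hits = [i for i in (self.text.find(t, self.pos) for t in terminators)
                if i >= 0]
        end = min(hits) + 1 if hits else len(self.text)
        if count > 0:
            end = min(end, self.pos + count)
        data = self.text[self.pos:end]
        self.pos = end
        return data


class AnyNewlineReader:
    """Wrapper that returns every line with a '\\n' ending."""
    def __init__(self, fp):
        self.fp = fp
        self.skipinitialnewline = False

    def readline(self, count=0):
        data = self.fp.readtoterminator('\r\n', count)
        if self.skipinitialnewline and data[:1] == '\n':
            # This '\n' completes a '\r\n' pair; skip it and read on.
            data = self.fp.readtoterminator('\r\n', count)
        self.skipinitialnewline = data.endswith('\r')
        if self.skipinitialnewline:
            data = data[:-1] + '\n'
        return data


reader = AnyNewlineReader(StubFile("unix\nmac\rdos\r\nlast"))
lines = []
while True:
    line = reader.readline()
    if not line:
        break
    lines.append(line)
```

With the mixed input above, each of the three terminated lines should
come back ending in a single '\n', and no blank line should appear
after the DOS-style line.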
> In any case, my goal is that something like this would become part of
> the built-in file object.
> [...]
> One reason it needs to be built in is that it could then be used for
> imports and execs() and all that, which is really where this all
> started.
This is a whole different can of worms. And I think that putting the
crossplatform newline functionality in the file objects isn't going to
get us closer to a solution:-(
The lowlevel import code uses stdio FILE * parameters all over the
place, so unless newline conversion is implemented at the stdio level
we will have to replace the whole import machinery by a
re-implementation in Python. This is doable, all the hooks are there
and there's prior art too if I'm not mistaken (IIRC someone did
imports from zip files or so).
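As a rough illustration of why a Python-level loader sidesteps the
stdio problem: once the source is read as a string, the loader can
normalize line endings itself before compiling. The sketch below is
not the import hook machinery, only the core step; load_source is a
made-up helper name, and it is written for a current Python where
exec() is a function:

```python
import types

def load_source(name, source_text):
    """Hypothetical helper: build a module from source text with any
    mix of '\\n', '\\r' and '\\r\\n' line endings."""
    # Convert '\r\n' first so it does not turn into two newlines.
    normalized = source_text.replace('\r\n', '\n').replace('\r', '\n')
    module = types.ModuleType(name)
    code = compile(normalized, name + '.py', 'exec')
    exec(code, module.__dict__)
    return module

# One Mac-style, one DOS-style and one Unix-style line.
demo = load_source('demo', "x = 1\ry = x + 1\r\nz = y * 2\n")
```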
> Jack, you had mentioned that you had some version of cross platform
> importing working with MacPython. What do you have now?
A very simple and efficient hack, for input only. The MSL stdio, that
MacPython uses, always calls a lowlevel internal routine to do \r->\n
mapping. MacPython now has a modified version of that routine that
will pass both \r and \n as \n. No support for \r\n, though.
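In Python terms the effect of that mapping is roughly the following.
This is only an illustration of the behaviour, not the MSL routine
itself, and map_newlines is a made-up name:

```python
def map_newlines(data):
    """Pass both '\\r' and '\\n' through as '\\n' (illustration only)."""
    return data.replace('\r', '\n')

mac_and_unix = map_newlines("a\rb\n")   # both endings become '\n'
dos = map_newlines("a\r\nb")            # '\r\n' becomes two newlines
```

The second call shows what "no support for \r\n" means in practice: a
DOS file comes through with an extra empty line after every line.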
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm