[Pythonmac-SIG] Re: [Python-Dev] Import hook to do end-of-line conversion?

M.-A. Lemburg mal@lemburg.com
Sat, 14 Apr 2001 19:02:09 +0200


Tim Peters wrote:
> 
> [MAL]
> > I don't know why this thread led to tweaking stdio -- after all
> > we only need a solution for the Python tokenizer ...
> 
> [Just]
> > Aaaaaaaaaaaargh! ;-) Here we go again: fixing the tokenizer is
> > great and all, but then what about all tools that read source
> > files line by line? ...
> 
> Note that this is why the topic needs a PEP:  nothing here is new; the same
> debates reoccur every time it comes up.

Right.
 
> [Aahz]
> > ...
> > QIO claims that it can be configured to recognize different
> > kinds of line endings.
> 
> It can be, yes, but in the same sense as Awk/Perl paragraph mode:  you can
> tell it to consider any string (not just single character) as meaning "end of
> the line", but it's a *fixed* string per invocation.  What people want *here*
> is more the ability to recognize the regular expression
> 
>     \r\n?|\n
> 
> as ending a line, and QIO can't do that directly (as currently written).  And
> MAL probably wants Unicode line-end detection:
> 
>     http://www.unicode.org/unicode/reports/tr13/

Right ;-)
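
As a rough illustration of the pattern Tim quotes, here is a minimal
Python sketch that splits a string on any of the three conventions.
The names LINE_END and split_lines are made up for this example, and
it does not attempt the Unicode separators described in TR13.

```python
import re

# The pattern quoted above: CR optionally followed by LF, or a bare LF.
LINE_END = re.compile(r"\r\n?|\n")


def split_lines(text):
    """Split text on Mac (\\r), DOS (\\r\\n) and Unix (\\n) line endings.

    Hypothetical helper for illustration only.
    """
    lines = LINE_END.split(text)
    # A trailing terminator (or empty input) leaves an empty final
    # element; drop it so "a\n" gives ["a"] rather than ["a", ""].
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

For the Unicode case, str.splitlines() in current Python also treats
U+2028 and U+2029 as line boundaries, which may be closer to what
TR13 asks for.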
 
> > QIO is claimed to be 2-3 times faster than Python 1.5.2; don't
> > know how that compares to 2.x.
> 
> The bulk of that was due to QIO avoiding per-character thread locks.  2.1
> avoids them too, so most of QIO's speed advantage should be gone now.  But
> QIO's internals could certainly be faster than they are (this is obscure
> because QIO.readline() has so many optional behaviors that the maze of
> if-tests makes it hard to see the speed-crucial bits; studying Perl's
> line-reading code is a better model, because Perl's speed-crucial inner loop
> has no non-essential operations -- Perl makes the *surrounding* code sort out
> the optional bits, instead of bogging down the loop with them).

Just curious: for the applications which Just has in mind,
reading source code line-by-line is not really needed. Wouldn't
it suffice to read the whole file, split it into lines and then
let the tools process the resulting list of lines?

Maybe a naive approach, but one which will most certainly work
on all platforms without having to replace stdio...
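
A minimal sketch of that approach, assuming the file fits comfortably
in memory. read_source_lines is a hypothetical helper name, not an
existing API. Opening the file in binary mode sidesteps whatever
translation the platform's stdio would apply, and bytes.splitlines()
accepts \r, \n and \r\n alike.

```python
def read_source_lines(path):
    """Return the lines of a file as a list, whatever its line endings.

    Hypothetical helper for illustration; the whole file is read at once.
    """
    # Binary mode: no platform-specific newline translation is applied.
    with open(path, "rb") as f:
        data = f.read()
    # bytes.splitlines() treats \n, \r and \r\n as line boundaries and
    # drops the terminators from the returned lines.
    return data.splitlines()
```

A tool would then iterate over the returned list instead of calling
readline() on the file object.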

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Pages:                           http://www.lemburg.com/python/