[Python-Dev] unicode regex quickie: should a newline be the same thing as a linebreak?

M.-A. Lemburg mal@lemburg.com
Tue, 30 May 2000 19:57:41 +0200


Fredrik Lundh wrote:
> 
> M.-A. Lemburg wrote:
> > > At the other end, the same compiled pattern can be applied
> > > to either 8-bit or unicode strings.  It's all just characters to
> > > the engine...
> >
> > Doesn't the engine remember whether the pattern was a string
> > or Unicode?
> 
> The pattern object contains a reference to the original pattern
> string, so I guess the answer is "yes, but indirectly".  But the core
> engine doesn't really care -- it just follows the instructions in the
> compiled pattern.
> 
> > Thinking about this some more: I wouldn't even mind if
> > the engine would use LINEBREAK for all strings :-). It would
> > certainly make life easier whenever you have to deal with
> > file input from different platforms, e.g. Mac, Unix and
> > Windows.
> 
> That's what I originally proposed (and implemented).  But this may
> (in theory, at least) break existing code.  If nothing else, it broke
> the test suite ;-)

SRE is new, so what could it break?
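
For reference, here is a rough sketch of the kind of difference under
discussion. It assumes the behaviour of the re module in a current
Python 3 interpreter, where only "\n" counts as a newline for ".", "^"
and "$"; the LINEBREAK variant is emulated by hand with an explicit
alternation and is not an actual engine mode. The sample text and the
variable names are made up for illustration.

```python
import re

# Hypothetical input mixing Mac (\r), Unix (\n) and Windows (\r\n) line ends.
text = "mac\runix\nwindows\r\nend"

# Newline-only semantics: "." also matches "\r", and "^"/"$" in
# MULTILINE mode recognise only "\n" as a line boundary.
newline_lines = re.findall(r"(?m)^.*$", text)

# LINEBREAK-style semantics, emulated by splitting on every line-end form.
# "\r\n" is listed first so that it is consumed as a single break.
linebreak_lines = re.split(r"\r\n|\r|\n", text)

print(newline_lines)    # ['mac\runix', 'windows\r', 'end']
print(linebreak_lines)  # ['mac', 'unix', 'windows', 'end']
```

Code that relies on the first result (for example, by stripping the
trailing "\r" itself) is the sort of thing that would see different
output if the engine treated every linebreak as a newline.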

Anyway, perhaps we should wait for some Perl 5.6 wizard to
speak up ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/