Regular Expression Help

Wed Aug 15 16:54:05 EDT 2001

I would suggest the book "Mastering Regular Expressions", published by
O'Reilly.  It is extremely well written and gives a good feel for how
regular expressions work.  It is much more enlightening than cookbook
approaches.

Paul

> -----Original Message-----
> From: Alex Martelli [mailto:aleax at aleax.it]
> Posted At: Tuesday, August 14, 2001 8:46 AM
> Posted To: comp.lang.python
> Conversation: Regular Expression Help
> Subject: Re: Regular Expression Help
> 
> 
> "Tino Lange" <tino.lange at isg.de> wrote in message
> news:3B793BD7.1A48D776 at isg.de...
>     ...
> > I want to parse a continuous file that contains messages surrounded by
> > non-alphanumeric begin- and end-signs.
> > (BEGIN sign 0x02, END sign 0x03)
> >
> > How can I parse this?
> > A working perl-script would be
> >
> > #!/usr/bin/perl
> > while(<>) { s/\x02/\n/g; s/\x03//g; print; }
> 
> Very fragile, it seems to me -- any \n inside a
> message becomes indistinguishable from the \n
> that replaces a begin marker.
> 
> > pattern=re.compile('([0x02] | [0x03])')
> 
> The character classes in this pattern match any one
> of the ASCII characters:
>     0
>     x
>     2
>     3
> although it's chosen a very peculiar way to specify
> that:-).  The spaces around the | are matched
> literally, too.  Plus, it defines a group, so the
> splitter itself would appear in the value from
> .split, which is apparently not what you want.
> 
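
A small sketch of the two points above, in case it helps.  The `sample`
string and the variable names are made up for illustration; only
`re.compile` and `split` come from the standard `re` module.

```python
import re

# Hypothetical test string: a literal "0" plus one real 0x02 control character.
sample = 'a0b\x02c'

# Inside [...] each character stands for itself, so [0x02] means
# "one of 0, x or 2", not the control character 0x02.
literal_class = re.compile('[0x02]')
print(literal_class.split(sample))      # ['a', 'b\x02c']

# A capturing group makes split() return the separators as well.
with_group = re.compile('([\x02\x03])')
print(with_group.split(sample))         # ['a0b', '\x02', 'c']

# Without the group, only the text between separators is returned.
without_group = re.compile('[\x02\x03]')
print(without_group.split(sample))      # ['a0b', 'c']
```
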
> > I could only split by "normal" characters as far as I saw in the
> > documentation.
> > Is this right?
> 
> No, you just have to use the \02 etc escapes to
> specify special characters.  Try this split.py:
> 
> import re
> 
> samplestring='able\02baker\03charlie\02delta'
> splitter = re.compile('[\02\03]')
> print(splitter.split(samplestring))
> 
> D:\py21>python split.py
> ['able', 'baker', 'charlie', 'delta']
> 
> This looks like what you want, right?
> 
> 
> Alex
> 
> 
>
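
If the goal is to pull out only the text that sits between a begin sign
(0x02) and the following end sign (0x03), rather than to split on either
one, something like the following might do.  It is only a sketch: the
function name `extract_messages` and the `stream` sample are hypothetical,
and it assumes that messages never nest and that each one is closed by an
end sign.

```python
import re

# ASCII names for the two markers from the question.
STX, ETX = '\x02', '\x03'

def extract_messages(data):
    """Return the text found between each STX ... ETX pair."""
    # [^\x03]* stops at the first ETX; a negated class also matches
    # newlines, so a \n inside a message is kept as part of it.
    frame = re.compile(STX + '([^' + ETX + ']*)' + ETX)
    return frame.findall(data)

# Hypothetical input: text outside the markers is ignored.
stream = 'noise\x02first\nline\x03\x02second\x03trailing'
print(extract_messages(stream))   # ['first\nline', 'second']
```

Unlike the split approach, this drops anything that falls outside a
begin/end pair, which may or may not be what is wanted.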