Regular Expression Help
Paul Schwartz
PaulS at scenicsoft.com
Wed Aug 15 16:54:05 EDT 2001
I would suggest the book "Mastering Regular Expressions", published by
O'Reilly. It is extremely well written and gives a good feel for regular
expressions. It is much more enlightening than cookbook approaches.
Paul
> -----Original Message-----
> From: Alex Martelli [mailto:aleax at aleax.it]
> Posted At: Tuesday, August 14, 2001 8:46 AM
> Posted To: comp.lang.python
> Conversation: Regular Expression Help
> Subject: Re: Regular Expression Help
>
>
> "Tino Lange" <tino.lange at isg.de> wrote in message
> news:3B793BD7.1A48D776 at isg.de...
> ...
> > I want to parse a continuous file that contains messages surrounded by
> > non-alphanumeric begin and end signs.
> > (BEGIN sign 0x02, END sign 0x03)
> >
> > How can I parse this?
> > A working perl-script would be
> >
> > #!/usr/bin/perl
> > while(<>) { s/\x02/\n/g; s/\x03//g; print; }
>
> Very fragile, it seems to me -- any \n within a
> message gets confused with the markers.
>
> > pattern=re.compile('([0x02] | [0x03])')
>
> This pattern matches any one of the ASCII characters:
> 0
> x
> 2
> 3
> although it's chosen a very peculiar way to specify
> that:-). Plus, it defines a group, so the splitter
> itself would appear in the value from .split, which
> is apparently not what you want.
>
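To make the point about the character class and the group concrete, here is a small sketch (not part of the original post). The sample string and the names `original`, `grouped` and `plain` are made up for illustration. As far as I can tell, the spaces in the posted pattern are also literal, since re.VERBOSE is not set, so a match additionally needs a space next to one of those characters.

```python
import re

# The pattern from the question, copied verbatim (spaces are literal here).
original = re.compile('([0x02] | [0x03])')
# The control characters in a class, once with and once without a group.
grouped = re.compile('([\x02\x03])')
plain = re.compile('[\x02\x03]')

# Illustrative sample: two words framed by 0x02 / 0x03 markers.
sample = 'able\x02baker\x03charlie'

# No '0', 'x', '2' or '3' next to a space occurs, so nothing is split.
print(original.split(sample))
# The capturing group makes the delimiters show up in the result.
print(grouped.split(sample))
# Without the group only the pieces between the markers are returned.
print(plain.split(sample))
```

The second call returns the markers interleaved with the text, which is the effect of the group described above.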
> > I could only split by "normal" characters as far as I saw in the
> > documentation.
> > Is this right?
>
> No, you just have to use the \02 etc escapes to
> specify special characters. Try this split.py:
>
> import re
>
> samplestring='able\02baker\03charlie\02delta'
> splitter = re.compile('[\02\03]')
> print splitter.split(samplestring)
>
> D:\py21>python split.py
> ['able', 'baker', 'charlie', 'delta']
>
> This looks like what you want, right?
>
>
> Alex
>
>
>
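If only the text between a begin sign and its end sign is wanted, a possible alternative to splitting is to capture each frame directly. This is only a sketch under that assumption. The helper name `extract_messages` and the sample stream are hypothetical, and anything outside a complete 0x02 ... 0x03 pair is simply ignored.

```python
import re

STX = '\x02'  # begin sign from the original question
ETX = '\x03'  # end sign from the original question

# Capture everything between one begin sign and the next end sign.
# The negated class also matches '\n', so multi-line messages stay intact.
frame = re.compile(STX + '([^' + ETX + ']*)' + ETX)

def extract_messages(data):
    """Return the payload of every complete STX...ETX frame in data."""
    return frame.findall(data)

# Illustrative input: leading noise, a two-line message, junk between
# frames, and a trailing frame that never receives its end sign.
stream = 'noise\x02first\nline two\x03junk\x02second\x03\x02unfinished'
print(extract_messages(stream))
```

Unlike the split approach, this drops text that sits between frames, so which one fits depends on whether such text can occur in the file.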