Extracting parts of string between anchor points
Denis McMahon
denismfmcmahon at gmail.com
Thu Feb 27 19:55:01 EST 2014
On Thu, 27 Feb 2014 20:07:56 +0000, Jignesh Sutar wrote:
> I've kind of got this working but my code is very ugly. I'm sure it's
> regular expression I need to achieve this more but not very familiar
> with use regex, particularly retaining part of the string that is being
> searched/matched for.
>
> Notes and code below to demonstrate what I am trying to achieve. Any
> help, much appreciated.
It seems you have a string which may be split into between 1 and 3
substrings by the presence of up to 2 delimiters, and that if both
delimiters are present, they are in a specified order.
You have several possible cases which, broadly speaking, break down into
4 groups:
(a) no delimiters present
(b) only delimiter 1 present
(c) only delimiter 2 present
(d) both delimiters present
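As a rough illustration of those four groups, here is a minimal sketch
that reports which group a string falls into. The function name
classify and the delimiter arguments are placeholders of mine, not
anything from your code, and it only tests for presence, not order.

```python
def classify(text, delim1, delim2):
    """Return 'a', 'b', 'c' or 'd' for the four groups listed above.

    Hypothetical helper: it only checks whether each delimiter occurs
    somewhere in text, and does not check the order they appear in.
    """
    has1 = delim1 in text
    has2 = delim2 in text
    if has1 and has2:
        return "d"  # both delimiters present
    if has1:
        return "b"  # only delimiter 1 present
    if has2:
        return "c"  # only delimiter 2 present
    return "a"      # no delimiters present
```

Each group still covers several sub-cases, depending on whether there is
any text before, between or after the delimiters.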
It is important when coding for such scenarios to consider the possible
cases that are not specified, as well as the ones that are.
For example, consider the string:
"<delim1><delim2>"
where you have both delims, in sequence, but no other data elements.
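One way to handle such edge cases without a regex is str.partition. The
sketch below is only an illustration under my own assumptions: the name
split_on_delims is made up, delim2 is only looked for after delim1 when
delim1 is present, and a missing part is reported as None while a part
that is present but empty is reported as "". It is not the code from
the file linked further down.

```python
def split_on_delims(text, delim1, delim2):
    """Return (head, after_delim1, after_delim2).

    head          text before the first delimiter found (all of text if none)
    after_delim1  text following delim1, or None if delim1 is absent
    after_delim2  text following delim2, or None if delim2 is absent

    Assumption: if both delimiters occur, delim1 comes first. A delim2
    that appears before delim1 is left in head as ordinary text.
    """
    head, sep1, rest = text.partition(delim1)
    if sep1:
        # delim1 found: look for delim2 only in what follows it
        after1, sep2, after2 = rest.partition(delim2)
    else:
        # delim1 absent: look for delim2 in the whole string
        after1 = None
        head, sep2, after2 = text.partition(delim2)
    if not sep2:
        # partition returns "" for the tail when the separator is missing;
        # use None so "absent" and "present but empty" stay distinct
        after2 = None
    return head, after1, after2


# Both delimiters, no other data: every part is present but empty.
print(split_on_delims("<delim1><delim2>", "<delim1>", "<delim2>"))
# prints ('', '', '')
```

Keeping None separate from "" is a design choice: it lets the caller
tell "delimiter missing" apart from "delimiter there, nothing after it".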
I believe there are at least 17 possible combinations, and maybe another
8 if you allow for the delims being out of sequence.
The code in the file at the url below processes 17 different cases. It
may help, or it may confuse.
http://www.sined.co.uk/tmp/strparse.py.txt
--
Denis McMahon, denismfmcmahon at gmail.com