[Tutor] re module- puzzling results when matching money
Alex Kleider
akleider at sonic.net
Sun Aug 4 01:34:34 CEST 2013
On 2013-08-03 13:30, Albert-Jan Roskam wrote:
> Word boundary. This is a zero-width assertion that matches only at the
> beginning or end of a word. A word is defined as a sequence of alphanumeric
> characters, so the end of a word is indicated by whitespace or a
> non-alphanumeric character. [http://docs.python.org/2/howto/regex.html]
> So I think it's because a dollar sign is not an alphanumeric character.
I get it now, thanks.
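
For the record, a minimal sketch of the word-boundary behaviour described
above. The variable names and the "d$f" test string are made up for
illustration; the calls are plain re.findall. A '$' surrounded by spaces has
no word character next to it, so \b cannot match on either side; squeezed
between two letters, it can.

```python
import re

# \b matches only between a word character (letter, digit, underscore)
# and a non-word character, or at the very start/end of the string.
word_between_spaces = re.findall(r"\be\b", "d e f")

# '$' is a non-word character and so is the space beside it:
# there is no boundary next to the '$', hence no match.
dollar_between_spaces = re.findall(r"\b\$\b", "d $ f")

# With letters directly on both sides, a boundary exists on each side.
dollar_between_letters = re.findall(r"\b\$\b", "d$f")

print(word_between_spaces)     # ['e']
print(dollar_between_spaces)   # []
print(dollar_between_letters)  # ['$']
```

So for matching money, a \b placed in front of the '$' is what gets in the
way; a \b after the digits is fine, since digits are word characters.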
>
>>>> re.findall(r"\b\e\b", "d e f")
                    ^
I'm puzzled by the '\' before the 'e' above. Testing suggests that its
presence or absence makes no difference.
> ['e']
>>>> re.findall(r"\b\$\b", "d $ f")
                    ^
Here it escapes the '$', which would otherwise be a metacharacter.
> []
>>>> re.findall(r"\b\&\b", "d & f")
                    ^
Here too I don't understand why it is there, but again it seems not to
matter.
> []
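
A rough sketch of how the three escapes differ, with hypothetical variable
names. Escaping a punctuation character that is not a metacharacter, such as
'&', is harmless. Escaping '$' changes its meaning from "end of string" to a
literal dollar sign. Escaping a letter that has no special meaning, such as
'e', was tolerated by the Python 2 re module, but Python 3.6 and later reject
it, so the sketch guards that call.

```python
import re

# '&' is not a metacharacter: escaped and unescaped forms match the same text.
escaped_amp = re.findall(r"\&", "d & f")
plain_amp = re.findall(r"&", "d & f")

# '$' is a metacharacter: only the escaped form matches a literal dollar sign.
escaped_dollar = re.findall(r"\$", "d $ f")

# An unknown letter escape such as \e matched a plain 'e' in Python 2,
# but raises re.error in Python 3.6 and later.
try:
    letter_escape = re.findall(r"\e", "d e f")
except re.error:
    letter_escape = None

print(escaped_amp, plain_amp, escaped_dollar, letter_escape)
```

The empty results above come from the \b on each side, not from the
backslash: neither '$' nor '&' has a word character next to it in those
strings.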
>
>
> How about this version (I like the re.VERBOSE/re.X flag!)
I am also getting to like re.VERBOSE.
>
> import re
> import collections
>
> regex = r"""(?P<sign>\$)
> (?P<dollars>\d*)
> (?:\.)
> (?P<cents>\d{2})"""
> target = \
> """Cost is $4.50. With a $.30 discount:
> Price is $4.15.
> The price could be less, say $4 or $4.
> Let's see how this plays out: $4.50.60
> """
> Match = collections.namedtuple("Match", "sign dollars cents")
> matches = [Match(*match) for match in re.findall(regex, target, re.X)]
> for match in matches:
>     print repr(match.sign), repr(match.dollars), repr(match.cents)
'collections' is new to me. A new topic to study.
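
As a note for later study, here is a small sketch of the same idea written
for Python 3. It is not the code quoted above: the names MONEY, Money and
text are invented for illustration, and it uses finditer with groupdict
instead of findall. It shows the two pieces together: re.VERBOSE lets the
pattern carry whitespace and comments, and collections.namedtuple gives each
match fields that can be read by name.

```python
import collections
import re

# In verbose mode, whitespace in the pattern is ignored and '#' starts a comment.
MONEY = re.compile(r"""
    (?P<sign>\$)        # a literal dollar sign
    (?P<dollars>\d*)    # zero or more dollar digits
    \.                  # a literal decimal point
    (?P<cents>\d{2})    # exactly two cents digits
    """, re.VERBOSE)

# A namedtuple is an ordinary tuple whose items also have names.
Money = collections.namedtuple("Money", "sign dollars cents")

text = "Cost is $4.50. With a $.30 discount: $4 or $4."

# groupdict() maps each named group to its text, which fits the field names.
amounts = [Money(**m.groupdict()) for m in MONEY.finditer(text)]

for amount in amounts:
    print(repr(amount.sign), repr(amount.dollars), repr(amount.cents))
```

With this sample text, only "$4.50" and "$.30" match, because the pattern
insists on a decimal point followed by two digits.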
Thanks for the help, much appreciated!
alex k