regexp matching end of line or comma

Jean-Michel Pichavant jeanmichel at sequans.com
Thu Nov 25 11:26:34 EST 2010


MRAB wrote:
> On 25/11/2010 14:40, Jean-Michel Pichavant wrote:
>> Hi guys,
>>
>> I'm struggling to match patterns ending with a comma ',' or an end of
>> line '$'.
>>
>> import re
>>
>> ex1 = 'sumthin,'
>> ex2 = 'sumthin'
>> m1 = re.match('(?P<something>\S+),', ex1)
>> m2 = re.match('(?P<something>\S+)$', ex2)
>> m3 = re.match('(?P<something>\S+)[,$]', ex1)
>> m4 = re.match('(?P<something>\S+)[,$]', ex2)
>>
>> print m1, m2
>> print m3
>> print m4
>>
>> <_sre.SRE_Match object at 0x8834de0> <_sre.SRE_Match object at 0x8834e20>
>> <_sre.SRE_Match object at 0x8834e60>
>> None
>>
>> My problem is that m4 is None while I'd like it to match ex2.
>>
>> Any clue?
>>
> Within a character set '$' is a literal '$' and not end-of-string, just
> as '\b' is '\x08' and not word-boundary.
>
> Use a lookahead instead:
>
> >>> re.match('(?P<something>\S+)(?=,|$)', ex1)
> <_sre.SRE_Match object at 0x01719FA0>
> >>> re.match('(?P<something>\S+)(?=,|$)', ex2)
> <_sre.SRE_Match object at 0x016937E0>
Thanks, it works that way.
By the way, I don't get the difference between non-capturing parentheses
(?:) and lookahead parentheses (?=):

re.match('(?P<something>\S+)(?:,|$)', ex2).groups()
('sumthin',)

re.match('(?P<something>\S+)(?=,|$)', ex2).groups()
('sumthin',)
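
For what it's worth, groups() cannot show a difference here, since neither (?:...) nor (?=...) creates a group. The difference is whether the trailing comma is consumed by the match, which shows up in group(0) and end(). A minimal sketch in Python 3 syntax (the code above is Python 2), assuming [^\s,]+ in place of \S+; that substitution is mine, made because a greedy \S+ swallows the comma itself, as the last line illustrates:

```python
import re

ex1 = 'sumthin,'

# [^\s,]+ is an assumption for this sketch, not the pattern from the thread.
# (?:,|$) consumes the comma; (?=,|$) only checks that it is there.
consuming = re.match(r'(?P<something>[^\s,]+)(?:,|$)', ex1)
lookahead = re.match(r'(?P<something>[^\s,]+)(?=,|$)', ex1)

# The named group is the same in both cases.
print(consuming.group('something'), lookahead.group('something'))

# The whole match and its end position differ.
print(repr(consuming.group(0)), consuming.end())   # comma included
print(repr(lookahead.group(0)), lookahead.end())   # comma left unconsumed

# With the original \S+ the group itself takes the comma, because ',' is
# a non-whitespace character and '$' then succeeds at the end of the string.
greedy = re.match(r'(?P<something>\S+)(?=,|$)', ex1)
print(repr(greedy.group('something')))
```

The unconsumed comma matters mostly when matching repeatedly (re.finditer, or match with a start position), because the next match begins where the previous one ended.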

JM



