Regular expressions, help?
Jon Clements
joncle at googlemail.com
Thu Apr 19 09:52:33 EDT 2012
On Thursday, 19 April 2012 07:11:54 UTC+1, Sania wrote:
> Hi,
> So I am trying to get the number of casualties in a text. After 'death
> toll' in the text the number I need is presented as you can see from
> the variable called text. Here is my code
> I'm pretty sure my regex is correct, I think it's the group part
> that's the problem.
> I am using NLTK with Python. group() grabs the string in parentheses and
> stores it in deadnum, and I append deadnum to a list.
>
> text="accounts put the death toll at 637 and those missing at 653 , but the total number is likely to be much bigger"
> dead=re.match(r".*death toll.*(\d[,\d\.]*)", text)
> deadnum=dead.group(1)
> deaths.append(deadnum)
> print deaths
>
> Any help would be appreciated,
> Thank you,
> Sania
Or just don't rely fully on a regex. For the sake of time, and the little sanity I believe I have left, I would just do something like:
death_toll = re.search(r'death toll.*?\d+', text).group().rsplit(' ', 1)[1]
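As for why group(1) in the quoted code does not return the whole number: as far as I can tell, the ".*" just before the group is greedy, so it swallows everything it can and leaves the group only the last digit in the string. A rough sketch of the difference, assuming the text is a single line (the names greedy_result and lazy_result are just mine for illustration):

```python
import re

text = ("accounts put the death toll at 637 and those missing at "
        "653 , but the total number is likely to be much bigger")

# Pattern from the question: the greedy ".*" before the group consumes as
# much as possible, so the group only gets the final digit it must give back.
greedy = re.match(r".*death toll.*(\d[,\d\.]*)", text)
greedy_result = greedy.group(1)

# Non-greedy ".*?" stops at the first digit after "death toll".
# re.search scans the string, so no leading ".*" is needed.
lazy = re.search(r"death toll.*?(\d[,\d\.]*)", text)
lazy_result = lazy.group(1) if lazy else None

print(greedy_result)  # 3
print(lazy_result)    # 637
```

If "death toll" might not appear in the text at all, check the match object for None before calling group(), as done for lazy_result above.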
hth,
Jon.