[Tutor] help with regular expressions

Alan Gauld alan.gauld at blueyonder.co.uk
Wed Jun 30 15:42:44 EDT 2004


> line = FileHandle.readline(-1)
> test = findall('\d', line)
>
> where "line" is a line of text from a file I would like to read
> and extract numbers

Regular expressions process text, that is, characters. So there
is a difference between a digit and a number. A digit in a
regular expression is a *character* that can be interpreted
as a number. The regex has no concept of numbers in the mathematical
sense; it simply sees them as groups of digit characters.
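
To make the digit/number distinction concrete, here is a small sketch.
The sample line is an assumption reconstructed from the desired output
quoted in the question, and the second pattern is only one possible way
to keep the digits of each number together:

```python
import re

# Sample line assumed from the desired output quoted in the question.
line = "3.444456    4   84.3546354\n"

# \d matches one digit character at a time, so every match is a
# one-character string.
digits = re.findall(r"\d", line)

# Matching a run of digits with an optional fractional part keeps each
# number together, but the results are still strings, not floats.
tokens = re.findall(r"\d+(?:\.\d+)?", line)

print(digits[:9])  # ['3', '4', '4', '4', '4', '5', '6', '4', '8']
print(tokens)      # ['3.444456', '4', '84.3546354']
```

Either way the regex hands back strings; turning them into numbers is
a separate conversion step.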

> example test would be for the above numbers
> [3, 4, 4, 4, 4, 5, 6, 4, 8, ....].
> How is this done so that test = [3.444456    4   84.3546354]?

You probably don't need regular expressions here at all, since regexes
are best for extracting complex patterns out of complex text. In this
case the string split() method will likely work better. You can then
convert the list of substrings into a list of numbers with the float()
function:

test = line.split()
numbers = [float(n) for n in test]

Or all in one line:

numbers = [float(n) for n in line.split()]

Called with no arguments, split() already ignores leading and trailing
whitespace, including the newline, so you only need strip() first if
you split on an explicit separator.
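
As a rough end-to-end sketch of the split() approach: the StringIO
object below is only a stand-in for the open file, and its contents
(with extra padding and a trailing newline) are made up for
illustration:

```python
import io

# StringIO stands in for the open file; the contents are made up.
file_handle = io.StringIO("  3.444456    4   84.3546354  \n")
line = file_handle.readline()

# split() with no arguments splits on runs of whitespace and ignores
# leading and trailing whitespace, including the newline.
numbers = [float(n) for n in line.split()]

print(numbers)  # [3.444456, 4.0, 84.3546354]
```

Note that float() raises ValueError if a substring is not a valid
number, so a line containing other text would need extra handling.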

Regex are powerful tools but using them where they aren't needed is a
good way to make your programs hard to read!

HTH,

Alan G.



