[Tutor] reading parts of an input string into different variables based on units.

Kent Johnson kent37 at tds.net
Thu Mar 20 02:02:24 CET 2008


Alan Gauld wrote:

> For something like this I'd use regular expressions.
> If your strings vary in length then I'd use a separate regular
> expression per unit then use that to findall matching
> substrings for each unit.

> Of course you will have to work out the right regex but that
> shouldn't be too difficult if you are sure you have a whitespace
> separator and a number followed by the unit string.

One regex can split apart a numeric part and a non-numeric unit:

In [22]: import re
In [23]: splitter = re.compile(r'(\d+)(\S+)')
In [24]: splitter.findall('2m 4cm 3mm')
Out[24]: [('2', 'm'), ('4', 'cm'), ('3', 'mm')]
In [25]: splitter.findall('1pound 30pence')
Out[25]: [('1', 'pound'), ('30', 'pence')]
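
To get from those pairs to one value per unit, something like the following might work. This is only a sketch: the helper name parse_quantities is made up for illustration, and it assumes whole numbers and that each unit appears at most once in the string (a repeated unit would overwrite the earlier value).

```python
import re

splitter = re.compile(r'(\d+)(\S+)')

def parse_quantities(text):
    """Return a dict mapping each unit string to its integer value.

    Hypothetical helper, not part of the re module.
    """
    # findall yields (number, unit) tuples; turn them into unit -> int
    return {unit: int(number) for number, unit in splitter.findall(text)}

lengths = parse_quantities('2m 4cm 3mm')
print(lengths['cm'])   # the value that was attached to 'cm'
```

A dict keyed by unit is usually easier to work with than a separate variable per unit, since the set of units does not have to be known in advance.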

If you want to allow decimals, change the regex to r'([\d.]+)(\S+)'
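
One caveat: [\d.]+ will also accept a number part such as '1.2.3', which float() would then reject. If that matters, a slightly stricter pattern is one option. The sketch below is an assumption on my part rather than anything from the original question; the names decimal_splitter and parse_decimal_quantities are made up for illustration.

```python
import re

# Assumed stricter variant: digits, optionally followed by '.' and more digits
decimal_splitter = re.compile(r'(\d+(?:\.\d+)?)(\S+)')

def parse_decimal_quantities(text):
    """Return a list of (float value, unit string) pairs (hypothetical helper)."""
    return [(float(number), unit)
            for number, unit in decimal_splitter.findall(text)]

pairs = parse_decimal_quantities('1.5m 20cm')
print(pairs)
```

The (?:...) group is non-capturing, so findall still returns exactly two items per match.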

Kent

