beginner's question about return value of re.split

Tim Chase python.list at tim.thechases.com
Fri Mar 21 11:31:20 EDT 2008


>     datum = "2008-03-14"
>     the_date = re.split('^([0-9]{4})-([0-9]{2})-([0-9]{2})$', datum, 3)
>     print the_date
> 
> Now the result that is printed is:
> ['', '2008', '03', '14', '']
> 
> My question: what are the empty strings doing there in the beginning and 
> in the end ? Is this due to a faulty regular expression ?


I think in this case, you just want the standard string .split() 
method:

   the_date = datum.split('-')

which returns

   ['2008', '03', '14']

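re.split() treats your regexp as the divider and splits the 
string around each match.  Because your pattern is anchored with 
^ and $, it matches the entire string, so the text before the 
divider and the text after it are both empty.  And because the 
pattern contains capturing groups, re.split() also returns the 
text of each group in between.  It would be similar to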

   >>> s = ','
   >>> s.split(',')
   ['', '']

only you get the captured groups in there too.  If you need 
more precision than a plain .split('-') gives you (in your case, 
ensuring that the parts are digits, with the right number of 
digits), keep your regexp but use it to match rather than to 
split:

   >>> import re
   >>> r = re.compile(r'^([0-9]{4})-([0-9]{2})-([0-9]{2})$')
   >>> m = r.match(datum)
   >>> m.groups()
   ('2008', '03', '14')
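
To see the pieces side by side, here is a rough sketch of the 
same ideas as a script.  The names with_groups, no_groups, on_dash 
and parse_date are just ones I made up for illustration; only 
re.split(), re.match() and .groups() come from the re module.

```python
import re

datum = "2008-03-14"
pattern = r'^([0-9]{4})-([0-9]{2})-([0-9]{2})$'

# The anchored pattern matches the whole string, so the whole string
# acts as the divider: nothing is left before or after it, which gives
# the two empty strings.  The capturing groups land in between.
with_groups = re.split(pattern, datum)

# Same pattern with non-capturing groups: only the empty pieces remain.
no_groups = re.split(r'^(?:[0-9]{4})-(?:[0-9]{2})-(?:[0-9]{2})$', datum)

# Splitting on just the dash gives the three fields, with no validation.
on_dash = re.split(r'-', datum)

def parse_date(text):
    """Hypothetical helper: return (year, month, day) strings, or None."""
    m = re.match(pattern, text)
    if m is None:
        return None
    return m.groups()

print(with_groups)
print(no_groups)
print(on_dash)
print(parse_date(datum))
print(parse_date("2008-3-14"))
```

The last call returns None because "3" is not two digits, which is 
the extra checking you give up if you only split on the dash.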

-tkc




