Regular expression bug?

Lie Ryan lie.1296 at gmail.com
Fri Feb 20 08:03:50 EST 2009


On Thu, 19 Feb 2009 13:03:59 -0800, Ron Garret wrote:

> In article <gnkdal$bcq$01$1 at news.t-online.com>,
>  Peter Otten <__peter__ at web.de> wrote:
> 
>> Ron Garret wrote:
>> 
>> > I'm trying to split a CamelCase string into its constituent
>> > components.
>> 
>> How about
>> 
>> >>> re.compile("[A-Za-z][a-z]*").findall("fooBarBaz")
>> ['foo', 'Bar', 'Baz']
> 
> That's very clever.  Thanks!
> 
>> > (BTW, I tried looking at the source code for the re module, but I
>> > could not find the relevant code.  re.split calls
>> > sre_compile.compile().split, but the string 'split' does not appear
>> > in sre_compile.py.  So where does this method come from?)
>> 
>> It's coded in C. The source is Modules/sremodule.c.
> 
> Ah.  Thanks!
> 
> rg

With a capturing group in the pattern, re.split() keeps the matched text
instead of discarding it:

>>> re.split('([A-Z][a-z]*)', 'fooBarBaz')
['foo', 'Bar', '', 'Baz', '']

It does what the OP wants, albeit with extra empty strings (the text
between adjacent matches and after the last one).
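
If the empty strings are unwanted, one way to drop them is to filter the
result. This is only a minimal sketch; the helper name split_camel_case is
made up for illustration and is not part of the re module:

```python
import re

def split_camel_case(name):
    # The capturing group makes re.split() return the matched words too.
    parts = re.split('([A-Z][a-z]*)', name)
    # Discard the empty strings left between and after the matches.
    return [part for part in parts if part]

print(split_camel_case('fooBarBaz'))
```

Note that, like the pattern above, this only handles ASCII letters, and a
run of capitals such as 'HTTP' comes back as single letters.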



