Regular expression bug?

Ron Garret rNOSPAMon at flownet.com
Thu Feb 19 16:08:19 EST 2009


In article <mailman.273.1235071607.11746.python-list at python.org>,
 Albert Hopkins <marduk at letterboxes.org> wrote:

> On Thu, 2009-02-19 at 10:55 -0800, Ron Garret wrote:
> > I'm trying to split a CamelCase string into its constituent components.  
> > This kind of works:
> > 
> > >>> re.split('[a-z][A-Z]', 'fooBarBaz')
> > ['fo', 'a', 'az']
> > 
> > but it consumes the boundary characters.  To fix this I tried using 
> > lookahead and lookbehind patterns instead, but it doesn't work:
> 
> That's how re.split works, same as str.split...

I think one could argue that 'foo'.split('') ought to return
['f','o','o'].
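
As things stand, str.split rejects an empty separator instead. A small sketch of the current behaviour, with list() as the usual way to get the individual characters (the variable names are only for illustration):

```python
# str.split refuses an empty separator and raises ValueError.
caught = False
try:
    'foo'.split('')
except ValueError:
    caught = True

# list() already yields the per-character result argued for above.
chars = list('foo')
print(caught, chars)
```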

> 
> > >>> re.split('((?<=[a-z])(?=[A-Z]))', 'fooBarBaz')
> > ['fooBarBaz']
> > 
> > However, it does seem to work with findall:
> > 
> > >>> re.findall('(?<=[a-z])(?=[A-Z])', 'fooBarBaz')
> > ['', '']
> 
> 
> Wow!
> 
> To tell you the truth, I can't even read that...

It's a regexp.  Of course you can't read it.  ;-)
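
For what it's worth, here is a minimal sketch of one possible workaround, assuming the goal is simply to cut at each lowercase-to-uppercase boundary. Since findall does report the zero-width matches, re.finditer can be used to get their positions and the string can be sliced by hand. The name split_camel is made up for illustration, and the pattern only knows about ASCII letters:

```python
import re

# Zero-width boundary: a lowercase letter behind, an uppercase letter ahead.
BOUNDARY = re.compile(r'(?<=[a-z])(?=[A-Z])')

def split_camel(s):
    """Split s at each lowercase-to-uppercase boundary without consuming characters."""
    pieces = []
    start = 0
    for m in BOUNDARY.finditer(s):
        # m.start() == m.end() because the match is empty; cut the string there.
        pieces.append(s[start:m.start()])
        start = m.start()
    pieces.append(s[start:])  # whatever follows the last boundary
    return pieces

print(split_camel('fooBarBaz'))
```

This does not depend on how re.split treats empty matches, so it sidesteps the problem rather than explaining it.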

rg


