regex \b behaviour in python

Walter Cruz walter.php at gmail.com
Thu Jun 19 11:59:27 EDT 2008


Hi all!

Just a simple question about the behaviour of a regex in Python. (I
discussed this on IRC, and they suggested that I post here.)

I tried to split the string "walter ' cruz" using \b.

In ruby, it returns:

irb(main):001:0>"walter ' cruz".split(/\b/)
=> ["walter", " ' ", "cruz"]

and in php:

Array
(
    [0] =>
    [1] => walter
    [2] =>  '
    [3] => cruz
    [4] =>
)


But in Python the behaviour of \b is different from Ruby or PHP.

The guys on IRC pointed me to a way to do that: [m.span() for m in
re.finditer(r'\b',"walter ' cruz")], but in fact there's some
difference, as it strips the spaces :)
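
For reference, here is a minimal sketch of how the boundary positions from
that finditer suggestion could be turned back into substrings, so the
pieces between boundaries (including the " ' " part) are kept. The helper
name split_on_boundaries is made up for illustration, and dropping the
empty pieces at both ends is an assumption chosen to match the Ruby output
above rather than the PHP one.

```python
import re

def split_on_boundaries(text):
    # Start offset of every zero-width \b match in the text.
    cuts = [m.start() for m in re.finditer(r'\b', text)]
    # Add both ends of the string so the first and last pieces are covered.
    edges = [0] + cuts + [len(text)]
    # Slice the text between each pair of consecutive offsets.
    pieces = [text[a:b] for a, b in zip(edges, edges[1:])]
    # Drop empty pieces (assumption: mimic the Ruby result shown above).
    return [p for p in pieces if p]

print(split_on_boundaries("walter ' cruz"))
```

Keeping the empty pieces instead (returning pieces unfiltered) should give
a list shaped like the PHP result, with an empty string at each end.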

My question is: why does \b behave like this in Python? Why is it different
from Ruby or PHP (or even Perl, I believe)?

(For the sake of curiosity, I was trying to solve
http://www.rubyquiz.com/quiz76.html in Python. But the point is not
to solve the quiz, but to understand why Python is different.)

[]'s
- Walter


