[Tutor] confusions about re module
Peter Otten
__peter__ at web.de
Sun May 1 09:14:42 CEST 2011
naheed arafat wrote:
> someone please tell me why i'm getting this output?
> specially the 'e3%' ! ! !
> >>> import re
> >>> re.findall('([\w]+.)','abdd.e3\45 dret.8dj st.jk')
> ['abdd.', 'e3%', 'dret.', '8dj ', 'st.', 'jk']
>
> I am getting the same output for the following too..
> >>> re.findall(r'([\w]+.)','abdd.e3\45 dret.8dj st.jk')
> ['abdd.', 'e3%', 'dret.', '8dj ', 'st.', 'jk']
>
> wasn't i supposed to get ['abdd.','dret.'] ??
> python version: 2.6.5
> os: windows
Quoting http://docs.python.org/library/re.html :
"""
'.'
(Dot.) In the default mode, this matches any character except a newline. If
the DOTALL flag has been specified, this matches any character including a
newline.
[...]
'\'
Either escapes special characters (permitting you to match characters like
'*', '?', and so forth), or signals a special sequence; special sequences
are discussed below.
"""
So you get the desired behaviour by escaping the dot:
>>> re.findall(r'([\w]+\.)','abdd.e3\45 dret.8dj st.jk')
['abdd.', 'dret.', 'st.']
>>> re.findall(r'([\w]+[.])','abdd.e3\45 dret.8dj st.jk')
['abdd.', 'dret.', 'st.']
(assuming that you left 'st.' out of your expected output accidentally)
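As for the 'e3%': that seems to come from the subject string rather than from the pattern. Only the pattern carries the r prefix, so in the ordinary string literal '\45' is read as an octal escape for the character with code 37, which is '%'. The unescaped dot then matches it like any other character. A minimal sketch of both effects (the variable names are mine, chosen only for illustration):

```python
import re

# Not a raw string, so '\45' is an octal escape: chr(0o45) == '%'.
text = 'abdd.e3\45 dret.8dj st.jk'
print(text)  # abdd.e3% dret.8dj st.jk

# A raw string keeps the backslash, '4' and '5' as three separate characters.
raw_text = r'abdd.e3\45 dret.8dj st.jk'

# Unescaped dot: matches any character except a newline, including '%' and ' '.
any_char = re.findall(r'([\w]+.)', text)

# Escaped dot: matches only a literal '.'.
literal_dot = re.findall(r'([\w]+\.)', text)

print(any_char)
print(literal_dot)
```

If you want the backslash to survive in the subject string, make that literal raw as well or double the backslash.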