Regular expressions

Peter Otten __peter__ at web.de
Thu Nov 5 03:33:39 EST 2015


Steven D'Aprano wrote:

> On Wed, 4 Nov 2015 07:57 pm, Peter Otten wrote:
> 
>> I tried Tim's example
>> 
>> $ seq 5 | grep '1*'
>> 1
>> 2
>> 3
>> 4
>> 5
>> $
> 
> I don't understand this. What on earth is grep matching? How does "4"
> match "1*"?

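grep looks for zero or more "1" characters. Zero occurrences count, so the
pattern matches the empty string at the start of every line, including "4".
Written in Python:

import re
import sys

for line in sys.stdin:
    if re.search("1*", line):
        print(line, end="")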
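
To see what the pattern actually matches on a line without any "1", here is
a small check. The sample string "4\n" is only my stand-in for one line of
the seq output:

```python
import re

# "1*" is allowed to match zero characters, so search() succeeds on any
# string, even one that contains no "1" at all.
match = re.search("1*", "4\n")

print(match is not None)    # True
print(repr(match.group()))  # ''
print(match.span())         # (0, 0)
```

The match is empty and sits at position 0, which is enough for grep to
print the line.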
 
>> which surprised me because I remembered that there usually weren't any
>> matching lines when I invoked grep instead of egrep by mistake. So I
>> tried another one
>> 
>> $ seq 5 | grep '[1-3]+'
>> $
>> 
>> and then headed for the man page. Apparently there is a subset called
>> "basic regular expressions":
>> 
>> """
>>   Basic vs Extended Regular Expressions
>>        In basic regular expressions the meta-characters ?, +, {, |, (,
>>        and ) lose their special meaning; instead use  the  backslashed
>>        versions \?, \+, \{, \|, \(, and \).
>> """
> 
> None of this appears relevant, as the metacharacter * is not listed. 

That's the very point. 

> So what's going on?

Most of the special characters you know from Python's re are not special in
grep's default (basic) syntax, but * is. The quoted passage explains why
regular expressions like "[1-3]+" don't behave as they do in Python: the "+"
is taken literally. A small subset, including the ominous "1*", works the
same way in both.
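
A rough way to reproduce both behaviours in Python, as a sketch rather than
an exact model of grep: I assume that basic-syntax "[1-3]+" corresponds to
Python's "[1-3]\+" (a literal plus sign), and that the list below stands in
for the output of seq 5.

```python
import re

lines = [f"{n}\n" for n in range(1, 6)]  # stand-in for `seq 5`

# Basic syntax: an unescaped "+" is an ordinary character.
# In Python's re that is spelled with a backslash.
basic = re.compile(r"[1-3]\+")

# Extended syntax (egrep): "+" means "one or more", as in Python's re.
extended = re.compile(r"[1-3]+")

basic_hits = [line.strip() for line in lines if basic.search(line)]
extended_hits = [line.strip() for line in lines if extended.search(line)]

print(basic_hits)     # []
print(extended_hits)  # ['1', '2', '3']
```

No line contains a literal "+", so the first pattern finds nothing, just
like grep '[1-3]+' above.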



