Regular expressions

Albert van der Horst albert at spenarnc.xs4all.nl
Thu Nov 5 08:39:52 EST 2015


Steven D'Aprano <steve at pearwood.info> writes:

>On Wed, 4 Nov 2015 07:57 pm, Peter Otten wrote:

>> I tried Tim's example
>>
>> $ seq 5 | grep '1*'
>> 1
>> 2
>> 3
>> 4
>> 5
>> $

>I don't understand this. What on earth is grep matching? How does "4"
>match "1*"?


>> which surprised me because I remembered that there usually weren't any
>> matching lines when I invoked grep instead of egrep by mistake. So I tried
>> another one
>>
>> $ seq 5 | grep '[1-3]+'
>> $
>>
>> and then headed for the man page. Apparently there is a subset called
>> "basic regular expressions":
>>
>> """
>>   Basic vs Extended Regular Expressions
>>        In basic regular expressions the meta-characters ?, +, {, |, (,
>>        and ) lose their special meaning; instead use  the  backslashed
>>        versions \?, \+, \{, \|, \(, and \).
>> """

>None of this appears relevant, as the metacharacter * is not listed. So
>what's going on?

* is so fundamental that it never loses its special meaning.
Same for [ .
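
To see what the basic-syntax reading of '[1-3]+' amounts to, here is a rough
sketch in Python. Python's re module uses the extended-style syntax, so the
literal plus sign that basic grep sees has to be written as \+ here. The
name bre_plus_literal is only for illustration, and the list stands in for
the output of `seq 5`.

```python
import re

# In basic regular expressions an unescaped "+" is an ordinary character,
# so grep '[1-3]+' looks for a digit 1-3 followed by a literal plus sign.
# Python's re syntax is extended-style, hence the backslash below.
bre_plus_literal = re.compile(r"[1-3]\+")

lines = [str(n) for n in range(1, 6)]  # stand-in for the output of `seq 5`

matching = [line for line in lines if bre_plus_literal.search(line)]
print(matching)                                # no line contains "+"
print(bool(bre_plus_literal.search("2+")))     # a line like "2+" would match
```

That is consistent with Peter's empty result for grep '[1-3]+'.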

* means zero or more of the preceding item (a character or a bracket
expression such as [1-3]).
This makes + superfluous (a mere convenience) as
    [1-3]+
can be expressed as
    [1-3][1-3]*
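
A quick way to convince yourself of that equivalence is to compare the two
patterns in Python. This is only a sketch: the sample strings and the helper
name spans are made up for the occasion, and Python's re module stands in
for grep here.

```python
import re

plus = re.compile(r"[1-3]+")       # one or more, written with +
star = re.compile(r"[1-3][1-3]*")  # the same thing, written with * only

# Arbitrary sample strings, chosen for illustration.
samples = ["", "1", "123", "4", "a21b", "3334"]

def spans(pattern, text):
    """Return the (start, end) positions of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]

for s in samples:
    # Both patterns find exactly the same matches in every sample.
    assert spans(plus, s) == spans(star, s)
    print(repr(s), spans(plus, s))
```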

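Note that [1-3]* matches the empty string, and so does 1*. Every line
contains the empty string, so grep '1*' selects every line, including "4".
This happens a lot.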
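
The same effect can be reproduced in Python. The grep function below is a
hypothetical stand-in that keeps the lines in which the pattern matches
somewhere, which is roughly what grep does; it is not how grep is
implemented.

```python
import re

lines = [str(n) for n in range(1, 6)]  # stand-in for the output of `seq 5`

def grep(pattern, lines):
    """Keep the lines in which pattern matches anywhere (toy grep)."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

# "1*" is satisfied by zero ones, so it matches (emptily) in every line.
print(grep(r"1*", lines))    # ['1', '2', '3', '4', '5']

# Requiring at least one "1" gives what was probably intended.
print(grep(r"11*", lines))   # ['1']

# The match found in "4" really is empty.
print(re.search(r"1*", "4").span())  # (0, 0)
```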

Groetjes Albert




>--
>Steven
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert at spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst



