Regular expressions
Peter Otten
__peter__ at web.de
Thu Nov 5 03:33:39 EST 2015
Steven D'Aprano wrote:
> On Wed, 4 Nov 2015 07:57 pm, Peter Otten wrote:
>
>> I tried Tim's example
>>
>> $ seq 5 | grep '1*'
>> 1
>> 2
>> 3
>> 4
>> 5
>> $
>
> I don't understand this. What on earth is grep matching? How does "4"
> match "1*"?
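The pattern looks for zero or more "1", and zero occurrences still count as a
match, so every line matches, "4" included. Written in Python:

    import re
    import sys

    for line in sys.stdin:
        if re.compile("1*").search(line):
            print(line, end="")
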
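
To make the "zero or more" part visible, here is a small sketch that prints what the pattern actually matched on each line. The helper name matched_text is made up for illustration, and the list of strings merely stands in for the output of seq 5.

```python
import re

PATTERN = re.compile("1*")


def matched_text(line):
    """Return the text that "1*" matched in line (hypothetical helper)."""
    match = PATTERN.search(line)
    # search() cannot fail for this pattern: "1*" is allowed to match
    # zero characters, so at worst it matches the empty string at index 0.
    return match.group()


# Stand-in for the five lines produced by `seq 5`.
for line in ["1", "2", "3", "4", "5"]:
    print(repr(line), "->", repr(matched_text(line)))
```

Only the first line yields a non-empty match; for the others the match is the empty string, which is still a match as far as grep and re.search() are concerned.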
>> which surprised me because I remembered that there usually weren't any
>> matching lines when I invoked grep instead of egrep by mistake. So I
>> tried another one
>>
>> $ seq 5 | grep '[1-3]+'
>> $
>>
>> and then headed for the man page. Apparently there is a subset called
>> "basic regular expressions":
>>
>> """
>> Basic vs Extended Regular Expressions
>> In basic regular expressions the meta-characters ?, +, {, |, (,
>> and ) lose their special meaning; instead use the backslashed
>> versions \?, \+, \{, \|, \(, and \).
>> """
>
> None of this appears relevant, as the metacharacter * is not listed.
That's the very point.
> So what's going on?
In grep's default (basic) mode the metacharacters listed in the quote lose
their special meaning, but * is not on that list and keeps it. The quote
explains why many regular expressions like "[1-3]+" that you may know from
Python's re don't work, while a small subset including the ominous "1*" does.
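
As a rough illustration in Python terms: re behaves like egrep here, so to approximate what grep's basic mode sees in '[1-3]+' you have to escape the "+" yourself. This is only a sketch of the idea, not a faithful emulation of grep, and the variable names are made up.

```python
import re

# Stand-in for the output of `seq 5`.
lines = ["1", "2", "3", "4", "5"]

# Python's re, like egrep: "+" means "one or more of the preceding item".
extended = [s for s in lines if re.search(r"[1-3]+", s)]

# Approximation of basic mode: "+" is an ordinary character, so the
# pattern asks for a digit from 1 to 3 followed by a literal plus sign.
basic = [s for s in lines if re.search(r"[1-3]\+", s)]

print(extended)  # ['1', '2', '3']
print(basic)     # []

# The "basic" reading does match once a literal "+" is present.
print(bool(re.search(r"[1-3]\+", "2+")))  # True
```

The empty second list corresponds to the empty output of grep '[1-3]+' above: none of the five lines contains a literal "+".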