Ask for help on using re

jak nospam at please.ty
Fri Aug 6 10:17:59 EDT 2021


On 06/08/2021 12:57, Jach Feng wrote:
> On Friday, August 6, 2021 at 4:10:05 PM [UTC+8], jak wrote:
>> On 05/08/2021 11:40, Jach Feng wrote:
>>> I want to distinguish between numbers with/without a dot attached:
>>>
>>>>>> text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
>>>>>> re.compile(r'ch \d{1,}[.]').findall(text)
>>> ['ch 1.', 'ch 23.']
>>>>>> re.compile(r'ch \d{1,}[^.]').findall(text)
>>> ['ch 23', 'ch 4 ', 'ch 56 ']
>>>
>>> I can guess why 'ch 23' appears in the second list. But how do I get rid of it?
>>>
>>> --Jach
>>>
>> import re
>> t = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'
>> r = re.compile(r'(ch +\d+\.)|(ch +\d+)', re.M)
>>
>> res = r.findall(t)
>>
>> dot = [x[0] for x in res if x[0] != '']
>> udot = [x[1] for x in res if x[1] != '']
>>
>> print(f"dot: {dot}")
>> print(f"undot: {udot}")
>>
>> out:
>>
>> dot: ['ch 1.', 'ch 23.']
>> undot: ['ch 4', 'ch 56']
>> r = re.compile(r'(ch +\d+\.)|(ch +\d+)', re.M)
> That's an interesting solution! Where is the '|' operator in re.compile() documented?
> 
> --Jach
> 

I honestly can't tell you; I've been using it for over 30 years. In any
case, you can find it mentioned in the "regular expressions quick
reference" on the site https://regex101.com (bottom right side).
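
For reference, the '|' alternation is also described in the "Regular
Expression Syntax" section of the Python re module documentation. Below
is a minimal sketch of how it behaves with findall(), reusing the text
from this thread. The names dotted, undotted and undotted_lookahead are
illustrative only, and the negative-lookahead variant is an alternative
that was not proposed in the thread.

```python
import re

text = 'ch 1. is\nch 23. is\nch 4 is\nch 56 is\n'

# Two alternatives joined by '|'; the dotted form is tried first at each position.
pattern = re.compile(r'(ch +\d+\.)|(ch +\d+)')

# With two groups, findall() returns (group1, group2) tuples; the group that
# did not take part in the match is an empty string.
pairs = pattern.findall(text)

dotted = [with_dot for with_dot, _ in pairs if with_dot]
undotted = [without_dot for _, without_dot in pairs if without_dot]

# Alternative without '|' (an assumption, not from the thread): a negative
# lookahead rejects a match that is followed by another digit or a dot, so
# the engine cannot backtrack from 'ch 23.' to 'ch 2'.
undotted_lookahead = re.findall(r'ch +\d+(?![\d.])', text)

print(dotted)
print(undotted)
print(undotted_lookahead)
```

Because the dotted alternative comes first, 'ch 23.' is consumed whole
before the undotted alternative gets a chance to match part of it.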


