Regular expression for file name
Bengt Richter
bokr at oz.net
Sun Jul 18 15:40:24 EDT 2004
On Sun, 18 Jul 2004 14:21:14 +0200, "Miki Tebeka" <miki.tebeka at zoran.com> wrote:
>Hello All,
>
>In a configuration file there can be ID's and filename tokens.
>The file names have a known suffix (.o or .mls) and I need to get a regular
>expression that will catch filename but not an ID.
>
>Currently:
>ID = r"[a-zA-Z\.]\w+(?![/\\])"
>FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>
>However if I have the filename "Sources/kernel/rom_kernel.mls" then
>"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
>as file name.
ITYM s/interrupted/interpreted/ ;-)
>
>Any way to do better?
If you want to prioritize matching among several patterns
that share a common leading part, note that (UIAM) or'ed
alternatives get tried left to right, and the first one that
matches at a given position wins. I haven't checked your
patterns themselves, but here's a possible way to give
priority to the FILENAME pattern:
>>> import re
>>> ID = r"[a-zA-Z\.]\w+(?![/\\])"
>>> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>>> COMBINED = '(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID)
>>> rxo = re.compile(COMBINED)
>>> filename = "Sources/kernel/rom_kernel.mls"
>>> rxo.search(filename).groupdict()
{'id': None, 'file': 'Sources/kernel/rom_kernel.mls'}
Try it with an id:
>>> rxo.search('no_slashes_in_this').groupdict()
{'id': 'no_slashes_in_this', 'file': None}
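To see that the order of the or'ed terms is what makes the difference, here is a small sketch that builds the combined pattern both ways. The names FILE_FIRST and ID_FIRST are made up for this illustration; the two sub-patterns are copied unchanged from the original question.

```python
import re

# Sub-patterns exactly as posted in the question.
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

# Hypothetical names for the two orderings (not from the original post).
FILE_FIRST = re.compile('(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID))
ID_FIRST = re.compile('(?P<id>%s)|(?P<file>%s)' % (ID, FILENAME))

path = "Sources/kernel/rom_kernel.mls"
file_first = FILE_FIRST.search(path)
id_first = ID_FIRST.search(path)

# With FILENAME tried first, the whole path is captured as a file.
print(file_first.group('file'))
# With ID tried first, only a leading fragment of the path is taken as an id,
# which is the problem described in the question.
print(id_first.group('id'))
```

With the ID alternative first, the lookahead only forces the match to back off one character before the slash, so the ID pattern still succeeds at the start of the path and FILENAME never gets a chance there.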
Of course you can pick the result apart, e.g.,
>>> result = rxo.search('no_slashes_in_this').groupdict()
>>> result['id']
'no_slashes_in_this'
>>> result['file']    # None, so the interactive prompt prints nothing
>>> result['file'] is None
True
>>> result['id'], result['file']
('no_slashes_in_this', None)
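If a config line holds several tokens, the same combined pattern can be walked with finditer. The following is only a sketch under that assumption: the helper name tokens and the sample line are invented for illustration, and the sub-patterns are still the unchecked ones from the question.

```python
import re

# Sub-patterns exactly as posted in the question.
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
rxo = re.compile('(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID))

def tokens(line):
    """Yield (kind, text) pairs, where kind is 'file' or 'id'."""
    for m in rxo.finditer(line):
        # Each match comes from exactly one of the two named alternatives,
        # so the other named group is None.
        kind = 'file' if m.group('file') is not None else 'id'
        yield kind, m.group(kind)

# Made-up sample line, only for illustration.
sample = "ID1 Sources/kernel/rom_kernel.mls foo"
found = list(tokens(sample))
print(found)
```

Because FILENAME is tried first at every position, the path comes out as a single file token, while the plain words on either side fall through to the ID alternative.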
No guarantees, but HTH
Regards,
Bengt Richter