Hostmask matching

John Machin sjmachin at lexicon.net
Sun Jun 4 04:24:06 EDT 2006


On 4/06/2006 4:45 PM, Nexu wrote:
> On Sun, 2006-06-04 at 06:26 +0000, Marc Schoechlin wrote:
>> Hi !
>>
>> Nexu <nexu.jin at gmail.com> schrieb:
>>> I'm trying to write a def to match a string that is an irc hostmask. eg:
>>> *one!~*name at a??-??-101-101.someisp.com
>>> But using re.search(). I get an error when the string starts with '*'.
>>> What is the best way to solve this?
>> I suppose the problem occurs because your expression is not a valid
>> regular expression.
>>
>> A correct regular expression should look like this:
>> ".*one!~.*name at a..-..-101-101.someisp.com"
> Thanks for everyone's input.
> 
> This solved the problem:
> 	host = 'someone!~thename at a80-80-101-101.someisp.com'
> 	mask = '*one!~*name at a??-??-101-101.someisp.com'
> 	newmask = re.sub('\*', '.*', re.sub('\?', '.', mask))
> result in that:
> 	re.search(newmask, host) == True

For a start, you must mean bool(re.search(newmask, host)) == True,
because re.search() returns a MatchObject or None; neither of those will
ever compare equal to True.
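
A small sketch of that point, using a made-up pattern and string (not the ones from this thread):

```python
import re

m = re.search('one', 'someone')   # a match object
n = re.search('two', 'someone')   # None, because nothing matched

# Neither result is equal to True, so "== True" is False both times.
hit_equals_true = (m == True)
miss_equals_true = (n == True)
print(hit_equals_true, miss_equals_true)

# Testing truthiness (or "is not None") is what you actually want.
print(bool(m), bool(n))
print(m is not None, n is not None)
```

In practice a plain "if re.search(newmask, host):" does the job without any comparison at all.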

Moving right along, you have only one facet of your multi-faceted 
multi-level problem fixed. Consider the following:

|>>> newmask
'.*one!~.*name at a..-..-101-101.someisp.com'
|>>> host = 'someone!~thename at a80-80-101-101.someisp.com'
|>>> bool(re.search(newmask, host))
True
|>>> host2 = 'someone!~thename at a80-80-101-101.someisp.communication.problem'
|>>> bool(re.search(newmask, host2))
True
|>>> host3 = 'someone!~thename at a80-80-101-101XsomeispYcom'
|>>> bool(re.search(newmask, host3))
True

You didn't answer either of my questions that would have told me whether 
host2 is a problem; if it is, you need a '$' at the end of newmask.

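To fix the host3 problem, you need '\.' instead of '.' wherever the mask
contains a literal dot, so that it matches only a dot. That escaping has
to happen before '?' is translated to '.', otherwise you would escape
your own wildcards as well.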
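
Here is a rough sketch of a translation that handles both the host2 and the host3 cases. It assumes that host2 should be rejected, which is why it appends '$'. The name mask_to_regex is my own invention, and the strings are copied as they appear earlier in this message:

```python
import re

def mask_to_regex(mask):
    # Hypothetical helper: translate the two hostmask wildcards and
    # escape every other character so that it matches only itself.
    parts = []
    for ch in mask:
        if ch == '*':
            parts.append('.*')
        elif ch == '?':
            parts.append('.')
        else:
            parts.append(re.escape(ch))
    # '$' rejects trailing junk (assumption: host2 is unwanted).
    return ''.join(parts) + '$'

mask = '*one!~*name at a??-??-101-101.someisp.com'
newmask = mask_to_regex(mask)

host = 'someone!~thename at a80-80-101-101.someisp.com'
host2 = 'someone!~thename at a80-80-101-101.someisp.communication.problem'
host3 = 'someone!~thename at a80-80-101-101XsomeispYcom'

for candidate in (host, host2, host3):
    print(candidate, bool(re.search(newmask, candidate)))
```

Walking the mask one character at a time avoids the ordering trap of chained re.sub() calls, where a later substitution can mangle the output of an earlier one.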

There is another possible host2-like problem: if you have a hostmask 
that starts with 'one' (i.e. no '*' at the front), what you are doing 
now will give True for incoming starting with 'anyone!' or 
'I_am_the_one!' or whatever. I don't think you want that to happen. Two 
solutions: (1) Put '^' at the start of newmask (2) use re.match() 
instead of re.search().
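
To illustrate the difference, here is a sketch with a hypothetical pattern such as you might get from a mask with no leading '*'. The incoming strings are made up:

```python
import re

# Hypothetical pattern, as if translated from a mask 'one!*'.
pattern = 'one!.*'

for incoming in ('one!~x', 'anyone!~x', 'I_am_the_one!~x'):
    unanchored = bool(re.search(pattern, incoming))        # matches anywhere
    caret = bool(re.search('^' + pattern, incoming))       # solution (1)
    matched = bool(re.match(pattern, incoming))            # solution (2)
    print(incoming, unanchored, caret, matched)
```

Only the first string should pass once the pattern is anchored; the unanchored search accepts all three.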

Another question: should you be doing a case-insensitive match? If so, 
you need re.search/match(newmask, host, re.IGNORECASE)
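
A minimal sketch of the flag's effect, with a made-up pattern and host chosen only for illustration:

```python
import re

# Hypothetical pattern and host; they differ only in letter case.
newmask = r'.*one!~.*name$'
host = 'SomeOne!~TheName'

case_sensitive = bool(re.match(newmask, host))
case_insensitive = bool(re.match(newmask, host, re.IGNORECASE))
print(case_sensitive, case_insensitive)
```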

You may wish to consider looking at the fnmatch module, at three levels:
(1) calling fnmatch.fnmatchcase() may be good enough for your purpose
(2) you can use the undocumented fnmatch.translate(), like this:
     newmask = fnmatch.translate(mask)
and use re.match()
(3) you can find the source code in
<YOUR_PYTHON_INSTALLATION_DIRECTORY>/Lib/fnmatch.py,
copy the translate function, and rip out the lines that treat '[' as 
special.
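
A sketch of levels (1) and (2), using the strings from earlier in this message. The last part shows why the '[' handling may matter; the mask and name containing '[' are hypothetical examples of mine:

```python
import fnmatch
import re

mask = '*one!~*name at a??-??-101-101.someisp.com'
host = 'someone!~thename at a80-80-101-101.someisp.com'
host2 = 'someone!~thename at a80-80-101-101.someisp.communication.problem'
host3 = 'someone!~thename at a80-80-101-101XsomeispYcom'

# Level (1): fnmatchcase(name, pattern) is case-sensitive, matches the
# whole string, and treats '.' as a literal character.
for candidate in (host, host2, host3):
    print(candidate, fnmatch.fnmatchcase(candidate, mask))

# Level (2): translate the mask once, then use re.match().
newmask = fnmatch.translate(mask)
via_translate = bool(re.match(newmask, host))
print(via_translate)

# The catch: fnmatch reads '[...]' as a character class, so a mask
# containing a literal '[' does not match itself.
bracket_mask = 'nick[away]!*'          # hypothetical mask
bracket_literal = fnmatch.fnmatchcase('nick[away]!x', bracket_mask)
bracket_class = fnmatch.fnmatchcase('nicka!x', bracket_mask)
print(bracket_literal, bracket_class)
```

If your masks can contain '[', the last two results are the reason to consider level (3).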

HTH,
John


