Unsung Modules Redux

Christos TZOTZIOY Georgiou DLNXPEGFQVEB at spammotel.com
Thu Jan 9 04:58:14 EST 2003


On 22 Dec 2002 23:30:24 -0500, rumours say that aahz at pythoncraft.com
(Aahz) might have written:

>In article <mailman.1040613990.20889.python-list at python.org>,
>Tim Peters  <tim.one at comcast.net> wrote:
>>[Scherer, Bill]
>>> 
>>> So, what's your favorite unsung module today?
>>
>>bisect.py.  Everyone needs it and nobody realizes it <wink>.
>
>I use a dict instead.  <0.2 wink>

Say you have a list of forbidden domains, e.g. ['doubleclick.net',
'doubleclick.com', 'yimg.com', ...], lots of them.  You want to filter a
list of candidate domains and find the forbidden ones.  Please note that
a subdomain such as a12.blabla.yimg.com is to be filtered as forbidden
too.  I think there are two obvious (at least to me so far) ways to do it:

1. Make forbidden_domains a dict.
For the domain a12.blabla.yimg.com, test each of ['a12.blabla.yimg.com',
'blabla.yimg.com', 'yimg.com', 'com'] for membership in
forbidden_domains.

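2. Leave forbidden_domains a list, reverse every string in it, and sort it
(perhaps add a dummy sentinel that sorts after every other item).
Reverse every candidate_domain, find an index in forbidden_domains using
bisect_left, and take the forbidden_domain at or just before that index.
If the reversed candidate_domain starts with it, check either for
len(candidate_domain)==len(forbidden_domain) or
candidate_domain[len(forbidden_domain)]=="." so that a match can only
end on a label boundary.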
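
Both ways can be sketched roughly as below.  This is only an
illustration under a few assumptions, not a tested production filter.
All function names are made up for the example, and a set stands in
for the dict in way 1.  Way 2 also differs a little from the
description above: it appends "." to every reversed name instead of
doing the length check, drops entries already covered by a shorter
forbidden domain (otherwise an entry such as x-yimg.com or a.yimg.com
could sort between yimg.com and the candidate), and uses bisect_right
so that no sentinel is needed.

```python
from bisect import bisect_right

def way1_is_forbidden(forbidden, candidate):
    """Way 1: try every dot-separated suffix of candidate against a set/dict."""
    labels = candidate.lower().split(".")
    return any(".".join(labels[i:]) in forbidden for i in range(len(labels)))

def make_key(domain):
    # Reverse the name and append "." so that a prefix match can only end
    # on a label boundary ("moc.gmiy." is not a prefix of "moc.gmiyton.").
    return domain.lower()[::-1] + "."

def way2_build(forbidden_domains):
    """Way 2 setup: sorted reversed keys, minus entries covered by a shorter one."""
    pruned = []
    for key in sorted({make_key(d) for d in forbidden_domains}):
        if pruned and key.startswith(pruned[-1]):
            continue  # e.g. a.yimg.com is redundant once yimg.com is listed
        pruned.append(key)
    return pruned

def way2_is_forbidden(keys, candidate):
    """Way 2 lookup: only the key just before the insertion point can match."""
    key = make_key(candidate)
    i = bisect_right(keys, key)
    return i > 0 and key.startswith(keys[i - 1])

forbidden = ["doubleclick.net", "doubleclick.com", "yimg.com",
             "x-yimg.com", "a.yimg.com"]
as_set = set(forbidden)
as_keys = way2_build(forbidden)

for name in ["a12.blabla.yimg.com", "yimg.com", "notyimg.com", "example.org"]:
    print(name, way1_is_forbidden(as_set, name), way2_is_forbidden(as_keys, name))
```

Way 1 costs one hash lookup per label of the candidate; way 2 costs one
binary search plus one startswith, at the price of the sorting and
pruning done up front.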

Way 2 seems more appropriate to me...
-- 
TZOTZIOY, I speak England very best,
Real email address: 'dHpvdEBzaWwtdGVjLmdy\n'.decode('base64')



