psss...I want to move from Perl to Python

Cameron Simpson cs at zip.com.au
Sun Jan 31 16:16:46 EST 2016


On 31Jan2016 09:49, Paul Rubin <no.email at nospam.invalid> wrote:
>Cameron Simpson <cs at zip.com.au> writes:
>> Adzapper. It has many many regexps matching URLs. (Actually a more
>> globlike syntax, but it gets turned into a regexp.) You plug it into
>> your squid proxy.
>
>Oh cool, is that out there in circulation?

Yes:

  http://adzapper.sourceforge.net/

which includes the installation instructions (install script, add a line to 
squid.conf).

However, my publication workflow is broken (and SourceForge isn't what it used 
to be), so I need to get the update process improved. I'm happy to send the latest 
copy to people by private email.

>It sounds like the approach of merging all the regexes into one and
>compiling to a FSM could be a big win.  I wouldn't expect too big a
>state space explosion.

Perhaps so. The existing script (a) merges the regexps for successive patterns of 
the same class and (b) uses Perl's "study" function, which examines a string 
that will have several regexps applied to it. I gather it notes things like 
character positions, which are then used in the matching process. Since the zapper 
applies all the rules to most URLs, this is a performance win.
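
For the curious, step (a) might look roughly like this in Python. This is only 
a sketch, not the zapper's actual code: the rule list, the class names and the 
helper names (glob_to_regexp, merge_rules, classify) are all made up for 
illustration, and I am assuming a glob syntax where only "*" is special. There 
is no counterpart to "study" in Python's re module, so only the merging is shown.

```python
import re

# Hypothetical rules as (class, glob-like pattern) pairs, in rule order.
# These are illustrative only and not taken from the real pattern file.
RULES = [
    ("AD", "http://ads.example.com/*"),
    ("AD", "http://*.example.net/banners/*"),
    ("COUNTER", "http://*/counter.cgi*"),
    ("AD", "http://*/adframe/*"),
]

def glob_to_regexp(glob):
    """Turn a glob into a regexp string (assumption: only '*' is special)."""
    return ".*".join(re.escape(piece) for piece in glob.split("*"))

def merge_rules(rules):
    """Merge runs of successive same-class patterns into one regexp each."""
    merged = []
    current_class, parts = None, []
    for cls, glob in rules:
        if parts and cls != current_class:
            # Class changed: flush the run collected so far.
            merged.append((current_class, re.compile("|".join(parts))))
            parts = []
        current_class = cls
        parts.append("(?:%s)" % glob_to_regexp(glob))
    if parts:
        merged.append((current_class, re.compile("|".join(parts))))
    return merged

def classify(url, merged):
    """Return the class of the first merged rule matching the whole URL."""
    for cls, regexp in merged:
        if regexp.fullmatch(url):
            return cls
    return None

MERGED = merge_rules(RULES)
```

Only successive rules of the same class are merged, so rule order is kept: the 
four rules above become three compiled regexps (AD, COUNTER, AD), and 
classify(url, MERGED) tries them in that order.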

Cheers,
Cameron Simpson <cs at zip.com.au>


