[Speed] Performance comparison of regular expression engines

Brett Cannon brett at python.org
Mon Mar 14 11:40:14 EDT 2016


On Mon, 14 Mar 2016 at 07:27 Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Sun, 13 Mar 2016 17:44:10 +0000
> Brett Cannon <brett at python.org> wrote:
> > >
> > > 2. One iteration of all searches on full text takes 29 seconds on my
> > > computer. Isn't this too long? In any case I want to first optimize some
> > > bottlenecks in the re module.
> > >
> >
> > I don't think we have established a "too long" time. We do have some
> > benchmarks like spectral_norm that don't run unless you use rigorous mode
> > and this could be one of them.
> >
> > > 3. Do we need one benchmark that gives an accumulated time of all
> > > searches, or separate microbenchmarks for every pattern?
> >
> > I don't care either way. Obviously it depends on whether you want to
> > measure overall re perf and have people aim to improve that or let people
> > target specific workload types.
>
> This is a more general latent issue with our current benchmarking
> philosophy.  We have built something which aims to be a general-purpose
> benchmark suite, but in some domains a more comprehensive set of
> benchmarks may be desirable.  Obviously we don't want to have 10 JSON
> benchmarks, 10 re benchmarks, 10 I/O benchmarks, etc. in the default
> benchmarks run, so what do we do for such cases?  Do we tell people
> domain-specific benchmarks should be developed independently?  Do we
> include some facilities to create such subsuites without them being
> part of the default bunch?
>
> (note a couple domain-specific benchmarks -- iobench, stringbench, etc.
> -- are currently maintained separately)
>

Good point. I personally don't have a good feel for how to handle this. Part
of me would like to consolidate the benchmarks so that it's easier to
discover what benchmarks there are. Another part of me doesn't want to
burden folks writing their own benchmarks for development purposes too much.
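
As a rough illustration of the accumulated-versus-per-pattern question quoted
above, here is a minimal sketch that records a timing for each pattern and
also reports the sum, so one run can serve either purpose. Everything named
here is hypothetical and chosen only for illustration: the PATTERNS table, the
TEXT sample, and the time_pattern() and run_benchmarks() helpers are not part
of the existing benchmark suite.

```python
import re
import time

# Hypothetical patterns and sample text, for illustration only.
PATTERNS = {
    "literal": r"Twain",
    "char_class": r"[a-q][^u-z]{3}x",
    "alternation": r"Tom|Sawyer|Huckleberry|Finn",
}
TEXT = "Tom Sawyer and Huckleberry Finn, by Mark Twain. " * 2000


def time_pattern(pattern, text, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` full scans."""
    compiled = re.compile(pattern)
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        compiled.findall(text)
        best = min(best, time.perf_counter() - start)
    return best


def run_benchmarks(patterns, text):
    """Return (per-pattern timings, accumulated total of those timings)."""
    timings = {name: time_pattern(p, text) for name, p in patterns.items()}
    return timings, sum(timings.values())


if __name__ == "__main__":
    timings, total = run_benchmarks(PATTERNS, TEXT)
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f}s")
    print(f"total: {total:.6f}s")
```

Keeping the per-pattern numbers and deriving the total from them means the
choice between "one benchmark" and "many microbenchmarks" becomes a reporting
decision rather than a measurement decision.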