[Catalog-sig] Flamenco queries

"Martin v. Löwis" martin at v.loewis.de
Mon Apr 9 01:22:03 CEST 2007


In studying the Cheeseshop code, I ran into flamenco.py, which
"represents a flamenco-style query". What is that?

It appears that it contains a fairly expensive SQL operation,

select rc.trove_id, r.name, r.version, r.summary
        from releases r, release_classifiers rc
        where r.name=rc.name and r.version=rc.version
          and r._pypi_hidden=FALSE

Postgres' 'explain analyze' gives this for the query:

                                                              QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=6130.46..6512.03 rows=3545 width=100) (actual
time=1069.206..1194.849 rows=13149 loops=1)
   Merge Cond: (("outer"."version" = "inner"."version") AND
("outer".name = "inner".name))
   ->  Sort  (cost=1009.16..1017.43 rows=3309 width=96) (actual
time=58.092..58.968 rows=2272 loops=1)
         Sort Key: r."version", r.name
         ->  Seq Scan on releases r  (cost=0.00..815.71 rows=3309
width=96) (actual time=0.050..18.740 rows=2272 loops=1)
               Filter: (_pypi_hidden = false)
   ->  Sort  (cost=5121.30..5228.40 rows=42841 width=68) (actual
time=1011.050..1039.380 rows=42841 loops=1)
         Sort Key: rc."version", rc.name
         ->  Seq Scan on release_classifiers rc  (cost=0.00..749.41
rows=42841 width=68) (actual time=0.052..68.960 rows=42841 loops=1)
 Total runtime: 1200.133 ms

This operation is the result of accessing, say

  /pypi?:action=browse&c=229&c=5

which gives a total of 5 packages.

What puzzles me is that the parameter of the URL (229 and 5) don't
occur in the query. Isn't there some cheaper way to achieve the
same result? After all, on my machine, this query runs for 1s!

I measured it on ximinez, and it reported a running time of 3.5s.

Regards,
Martin


More information about the Catalog-sig mailing list