From jodok at lovelysystems.com Mon Jul 2 21:33:38 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 2 Jul 2007 21:33:38 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
Message-ID: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
hi,
is it possible that our outgoing proxy server is being blocked by
cheeseshop? its IP address is 194.183.146.189
no, it was not an attack on cheeseshop :) we're simply running buildout
over and over and probably generating some load.
thanks
jodok
--
"Explicit is better than implicit."
-- The Zen of Python, by Tim Peters
Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77
From fdrake at gmail.com Mon Jul 2 23:21:06 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 2 Jul 2007 17:21:06 -0400
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
On 7/2/07, Jodok Batlogg wrote:
> is it possible that our outgoing proxy server is being blocked by
> cheeseshop? its IP address is 194.183.146.189
> no, it was not an attack on cheeseshop :) we're simply running buildout
> over and over and probably generating some load.
Hey Jodok,
I've taken to only using an internal repository for project buildouts;
if I need/want a new release from PyPI, I load that into the internal
repository. That avoids depending on PyPI being accessible at all
times, and I can always get what I've used again. No need to worry
about someone hiding old releases, or whatever.
It incurs a little overhead on adding or updating a package used in
my projects, but avoids depending on a highly-variable service. An
internal repository can still have problems, but at least it's easier
to make changes if needed.
-Fred
--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller
From jodok at lovelysystems.com Mon Jul 2 23:25:33 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 2 Jul 2007 23:25:33 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
<9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
Message-ID: <003AB009-74C9-4F3A-8C78-F9CA96B31605@lovelysystems.com>
On 02.07.2007, at 23:21, Fred Drake wrote:
> On 7/2/07, Jodok Batlogg wrote:
>> is it possible that our outgoing proxy server is being blocked by
>> cheeseshop? its IP address is 194.183.146.189
>> no, it was not an attack on cheeseshop :) we're simply running buildout
>> over and over and probably generating some load.
>
> Hey Jodok,
>
> I've taken to only using an internal repository for project buildouts;
> if I need/want a new release from PyPI, I load that into the internal
> repository. That avoids depending on PyPI being accessible at all
> times, and I can always get what I've used again. No need to worry
> about someone hiding old releases, or whatever.
>
> It incurs a little overhead on adding or updating a package used in
> my projects, but avoids depending on a highly-variable service. An
> internal repository can still have problems, but at least it's easier
> to make changes if needed.
already done after pypi being flaky :)
unfortunately now the outgoing ip of this repo is being blocked and
it sucks to scp downloaded files :)
thanks fred,
jodok
>
>
> -Fred
>
> --
> Fred L. Drake, Jr.
> "Chaos is the score upon which reality is written." --Henry Miller
--
"Errors should never pass silently."
"Unless explicitly silenced."
-- The Zen of Python, by Tim Peters
Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77
From jim at zope.com Tue Jul 3 00:04:36 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 2 Jul 2007 18:04:36 -0400
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <1BBE8714-5AC2-40E0-9182-F628B58F4911@zope.com>
On Jul 2, 2007, at 3:33 PM, Jodok Batlogg wrote:
> hi,
>
> is it possible that our outgoing proxy server is being blocked by
> cheeseshop? its IP address is 194.183.146.189
> no, it was not an attack on cheeseshop :) we're simply running buildout
> over and over and probably generating some load.
It's hard to believe that buildout could be generating enough load to
trigger being blocked.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From lac at openend.se Tue Jul 3 00:16:38 2007
From: lac at openend.se (Laura Creighton)
Date: Tue, 03 Jul 2007 00:16:38 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: Message from Jodok Batlogg of "Mon,
02 Jul 2007 21:33:38 +0200."
<8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <200707022216.l62MGcg6009085@theraft.openend.se>
Could it be that you are simply out of Apache processes? I recall that
Sean set the number of simultaneous ones at some very tiny number.
Laura
From martin at v.loewis.de Tue Jul 3 00:29:21 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 03 Jul 2007 00:29:21 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <200707022216.l62MGcg6009085@theraft.openend.se>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
<200707022216.l62MGcg6009085@theraft.openend.se>
Message-ID: <46897C41.7090609@v.loewis.de>
Laura Creighton wrote:
> Could it be that you are simply out of Apache processes? I recall that
> Sean set the number of simultaneous ones at some very tiny number.
I think you misunderstood. He set MaxRequestsPerChild to 10, which
means that each process will be replaced by a different one after
10 requests. MaxClients is 60, which should be more than enough.
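
As a rough illustration of the distinction, here is a toy model in Python.
It is not Apache code: the Worker class and the simulate function are made
up for this sketch, and it pre-spawns the full pool for simplicity. The
max_clients argument caps how many workers exist at once, while
max_requests_per_child only decides how many requests a worker serves
before it is replaced.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Worker:
    """A stand-in for one Apache child process (hypothetical model)."""
    pid: int
    served: int = 0


def simulate(total_requests, max_clients=60, max_requests_per_child=10):
    """Serve requests round-robin and recycle workers that hit their limit.

    Returns (number of simultaneous workers, number of workers ever created).
    """
    pids = count(1)
    pool = [Worker(next(pids)) for _ in range(max_clients)]
    created = len(pool)
    for i in range(total_requests):
        slot = i % max_clients
        worker = pool[slot]
        worker.served += 1
        if worker.served >= max_requests_per_child:
            # The child exits and a fresh one takes over its slot;
            # the size of the pool (the concurrency) does not change.
            pool[slot] = Worker(next(pids))
            created += 1
    return len(pool), created


simultaneous, created = simulate(total_requests=6000)
print(simultaneous, created)  # 60 660
```

In this model a low per-child limit costs extra process start-ups (660
workers created for 6000 requests) but never reduces the 60 slots that are
available at any one time.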
Regards,
Martin
From lac at openend.se Tue Jul 3 00:33:36 2007
From: lac at openend.se (Laura Creighton)
Date: Tue, 03 Jul 2007 00:33:36 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: Message from =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=
of "Tue, 03 Jul 2007 00:29:21 +0200." <46897C41.7090609@v.loewis.de>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
<200707022216.l62MGcg6009085@theraft.openend.se>
<46897C41.7090609@v.loewis.de>
Message-ID: <200707022233.l62MXavr011774@theraft.openend.se>
In a message of Tue, 03 Jul 2007 00:29:21 +0200, "Martin v. Löwis" writes:
>Laura Creighton wrote:
>> Could it be that you are simply out of Apache processes? I recall that
>> Sean set the number of simultaneous ones at some very tiny number.
>
>I think you misunderstood. He set MaxRequestsPerChild to 10, which
>means that each process will be replaced by a different one after
>10 requests. MaxClients is 60, which should be more than enough.
>
>Regards,
>Martin
yes, I thought it was 10. Sorry about that, and thank you.
Laura
From martin at v.loewis.de Tue Jul 3 09:22:11 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 03 Jul 2007 09:22:11 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <4689F923.8030304@v.loewis.de>
> is it possible that our outgoing proxy server is being blocked by
> cheeseshop? its IP address is 194.183.146.189
I can't see anything like that in the configuration of ximinez.
Furthermore, I cannot see that this IP address made any attempt
to contact ximinez. I got several accesses from 194.183.146.178,
for various versions of zc.buildout, through setuptools, and
I got requests from 194.183.146.185 through Firefox, but none
from the IP address that you mention. Going back until December
2006 (if I can trust the logs), that machine never made any
access to the Cheeseshop.
Regards,
Martin
From jodok at lovelysystems.com Tue Jul 3 11:02:19 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Tue, 3 Jul 2007 11:02:19 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <4689F923.8030304@v.loewis.de>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
<4689F923.8030304@v.loewis.de>
Message-ID: <5B0A8BC7-CC65-49E6-AA15-CCF591A0EA41@lovelysystems.com>
On 03.07.2007, at 09:22, Martin v. Löwis wrote:
>> is it possible that our outgoing proxy server is being blocked by
>> cheeseshop? its IP address is 194.183.146.189
>
> I can't see anything like that in the configuration of ximinez.
>
> Furthermore, I cannot see that this IP address made any attempt
> to contact ximinez. I got several accesses from 194.183.146.178,
> for various versions of zc.buildout, through setuptools, and
> I got requests from 194.183.146.185 through Firefox, but none
> from the IP address that you mention. Going back until December
> 2006 (if I can trust the logs), that machine never made any
> access to the Cheeseshop.
it seems to happen on the network level. i can't ping the machine
from this ip address :)
coming from 194.183.146.189:
traceroute to ximinez.python.org (82.94.237.219), 64 hops max, 60 byte packets
 1 lsfw01 (192.168.34.254) 0.727 ms 0.406 ms 0.345 ms
 2 194-183-146-177.tele.net (194.183.146.177) 1.212 ms 1.061 ms 3.801 ms
 3 cr4-swz1.net.tele.net (194.183.134.8) 6.733 ms 5.034 ms 4.472 ms
 4 fas0-1-70-cr3-swz1.net.tele.net (194.183.133.188) 4.550 ms 4.581 ms 4.627 ms
 5 atm0-0-r1-hoe1.net.tele.net (194.183.135.34) 5.743 ms 5.471 ms 5.362 ms
 6 giga0-2.r2-buh1.net.tele.net (194.183.135.194) 7.449 ms 6.484 ms 5.843 ms
 7 83.144.194.17 (83.144.194.17) 8.407 ms 8.736 ms 8.444 ms
 8 g4-0-211.core01.zrh01.atlas.cogentco.com (149.6.83.129) 9.269 ms 8.669 ms 8.727 ms
 9 p6-0.core01.str01.atlas.cogentco.com (130.117.0.53) 11.924 ms 11.825 ms 10.960 ms
10 p3-0.core01.fra03.atlas.cogentco.com (130.117.0.217) 13.820 ms 14.551 ms 13.941 ms
11 p3-0.core01.ams03.atlas.cogentco.com (130.117.0.145) 21.411 ms 21.266 ms 20.842 ms
12 t3-1.mpd01.ams03.atlas.cogentco.com (130.117.0.34) 20.100 ms 21.003 ms 20.880 ms
13 ams-ix.sara.xs4all.net (195.69.144.48) 20.878 ms 20.983 ms 28.193 ms
14 0.so-6-0-0.xr1.3d12.xs4all.net (194.109.5.1) 21.045 ms 21.486 ms 20.892 ms
15 0.so-3-0-0.cr1.3d12.xs4all.net (194.109.5.58) 49.436 ms 29.076 ms 103.199 ms
16 * * *
17 * * *
18 * * *
coming from 194.183.146.179:
traceroute to ximinez.python.org (82.94.237.219), 64 hops max, 60 byte packets
 1 lsfw01 (192.168.34.254) 2.030 ms 1.495 ms 1.461 ms
 2 * 194-183-146-177.tele.net (194.183.146.177) 1.834 ms 1.646 ms
 3 cr4-swz1.net.tele.net (194.183.134.8) 4.873 ms 6.393 ms 5.318 ms
 4 fas4-0-70-cr1-swz1.net.tele.net (194.183.133.190) 8.466 ms 196.174 ms 5.562 ms
 5 194.183.142.2 (194.183.142.2) 6.540 ms 6.462 ms 21.969 ms
 6 giga0-2.r2-buh1.net.tele.net (194.183.135.194) 6.642 ms 6.871 ms 7.797 ms
 7 83.144.194.17 (83.144.194.17) 18.965 ms 9.923 ms 10.459 ms
 8 g4-0-211.core01.zrh01.atlas.cogentco.com (149.6.83.129) 10.003 ms 9.462 ms 9.945 ms
 9 p6-0.core01.str01.atlas.cogentco.com (130.117.0.53) 13.728 ms 11.831 ms 12.375 ms
10 p3-0.core01.fra03.atlas.cogentco.com (130.117.0.217) 14.568 ms 16.176 ms 15.069 ms
11 p3-0.core01.ams03.atlas.cogentco.com (130.117.0.145) 124.421 ms 134.435 ms 205.047 ms
12 t3-1.mpd01.ams03.atlas.cogentco.com (130.117.0.34) 21.689 ms 21.962 ms 22.313 ms
13 ams-ix.tc2.xs4all.net (195.69.144.166) 21.655 ms 21.213 ms 23.011 ms
14 0.so-7-0-0.xr2.3d12.xs4all.net (194.109.5.13) 21.531 ms 21.966 ms 0.so-7-0-0.xr1.3d12.xs4all.net (194.109.5.9) 21.673 ms
15 0.so-2-0-0.cr1.3d12.xs4all.net (194.109.5.74) 21.526 ms 0.so-3-0-0.cr1.3d12.xs4all.net (194.109.5.58) 24.606 ms 22.263 ms
16 ximinez.python.org (82.94.237.219) 23.363 ms 21.890 ms 25.506 ms
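
A complementary check to ping and traceroute is to see whether a plain TCP
connection can be opened at all, since ICMP and TCP may be filtered
differently. The following is a minimal sketch using only the Python
standard library; the function name can_connect and the choice of port are
assumptions made for this example.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures and
        # unreachable networks alike.
        return False
```

Running can_connect("ximinez.python.org", 80) from each of the two source
addresses would show whether the difference seen in the traceroutes also
applies to HTTP traffic (port 80 is assumed here).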
thanks a lot for your help
jodok
>
> Regards,
> Martin
--
"Simple is better than complex."
-- The Zen of Python, by Tim Peters
Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77
From pje at telecommunity.com Thu Jul 5 02:56:25 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 04 Jul 2007 20:56:25 -0400
Subject: [Catalog-sig] Cheeseshop login problems?
Message-ID: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
I can't seem to log in to the Cheeseshop, from any platform or
machine, whether via script or browser (Firefox or Lynx). I haven't
changed my password, but just in case there was an issue with my
password, I asked for a password reset.
The passwords I received in email don't work either, however, which
seems to suggest that there is a server problem involved. :(
From richardjones at optusnet.com.au Thu Jul 5 05:43:44 2007
From: richardjones at optusnet.com.au (richardjones at optusnet.com.au)
Date: Thu, 05 Jul 2007 13:43:44 +1000
Subject: [Catalog-sig] Cheeseshop login problems?
Message-ID: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://mail.python.org/pipermail/catalog-sig/attachments/20070705/a90ec30e/attachment.asc
From martin at v.loewis.de Thu Jul 5 07:38:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 05 Jul 2007 07:38:36 +0200
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
References: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
Message-ID: <468C83DC.4030605@v.loewis.de>
> No logins appear to work at the moment.
>
> Has anyone made changes to the apache config recently?
I did - I'll look into it.
Martin
From martin at v.loewis.de Thu Jul 5 08:22:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 05 Jul 2007 08:22:33 +0200
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
References: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
Message-ID: <468C8E29.70808@v.loewis.de>
Phillip J. Eby wrote:
> I can't seem to log in to the Cheeseshop, from any platform or
> machine, whether via script or browser (Firefox or Lynx). I haven't
> changed my password, but just in case there was an issue with my
> password, I asked for a password reset.
>
> The passwords I received in email don't work either, however, which
> seems to suggest that there is a server problem involved. :(
Please try again; it should work now.
I switched the Cheeseshop from using mod_python to using FastCGI,
but forgot to do the RewriteCond dance. Sorry about that.
Regards,
Martin
From martin at v.loewis.de Thu Jul 5 08:37:35 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 05 Jul 2007 08:37:35 +0200
Subject: [Catalog-sig] Cheeseshop performance problems solved
Message-ID: <468C91AF.8000304@v.loewis.de>
I think I solved the performance problems of the Cheeseshop,
by switching both the wiki and the Cheeseshop to FastCGI.
I raised the MaxRequestsPerChild to 1000 again, and
MaxClients back to its default (256). There are four processes
running the PyPI, and four threads running MoinMoin.
If you experience problems, please report exact date and
time of the outage, as well as the nature of the outage
(e.g. if it doesn't respond within a reasonable time,
report what operation you did and after what time you
gave up waiting for a response).
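
One way to collect that kind of data is a small probe script. The sketch
below is only a suggestion: the function name probe, the 30 second default
and the field names in the result are all assumptions. It records when the
request started (in UTC), how long it took, and how it ended.

```python
import time
from datetime import datetime, timezone
from urllib.request import urlopen


def probe(url, give_up_after=30.0, fetch=urlopen):
    """Fetch url once and return a dict describing what happened and when.

    fetch defaults to urllib's urlopen; it can be swapped out for testing.
    Note that urlopen's timeout applies to each blocking socket operation,
    not to the request as a whole.
    """
    started = datetime.now(timezone.utc)
    t0 = time.monotonic()
    try:
        with fetch(url, timeout=give_up_after) as response:
            response.read()
        outcome = "ok"
    except Exception as exc:  # timeouts, HTTP errors, DNS failures, ...
        outcome = f"failed: {exc!r}"
    return {
        "url": url,
        "started_utc": started.isoformat(timespec="seconds"),
        "elapsed_seconds": round(time.monotonic() - t0, 3),
        "gave_up_after_seconds": give_up_after,
        "outcome": outcome,
    }
```

Calling probe("http://www.python.org/pypi/") and pasting the resulting
dict into a report would cover the date, the operation and the waiting
time in one go.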
Regards,
Martin
From jim at zope.com Thu Jul 5 15:32:44 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 5 Jul 2007 09:32:44 -0400
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <468C8E29.70808@v.loewis.de>
References: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
<468C8E29.70808@v.loewis.de>
Message-ID: <24CECA6B-B9F3-420B-8016-C3C4FBB06548@zope.com>
Hey Martin,
I want to say thanks to you and the other folks who are working on
trying to address the PyPI performance issues. Much much thanks!
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 6 01:29:57 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 5 Jul 2007 19:29:57 -0400
Subject: [Catalog-sig] psycoph errors from pypi
Message-ID: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
I imagine the people working on the cheeseshop are aware of this,
but, in case you aren't, I'm getting intermittent errors from the
cheeseshop. For example, requests for: http://www.python.org/pypi/
Often give:
Error...
There's been a problem with your request
psycopg.ProgrammingError: ERROR: current transaction is aborted,
commands ignored until end of transaction block
select name, version, summary, _pypi_ordering
from releases where (lower(name) LIKE '%%%%') and
_pypi_hidden = FALSE
order by lower(name), _pypi_ordering
or http://www.python.org/pypi/setuptools sometimes gives:
Error...
There's been a problem with your request
psycopg.ProgrammingError: ERROR: current transaction is aborted,
commands ignored until end of transaction block
select name, version, summary, _pypi_hidden
from releases
where name = 'setuptools' and _pypi_hidden = False
order by _pypi_ordering desc
http://www.python.org/pypi/setuptools/0.6c6 gives:
Error...
There's been a problem with your request
psycopg.ProgrammingError: ERROR: current transaction is aborted,
commands ignored until end of transaction block
select packages.name as name, stable_version, version, author,
author_email, maintainer, maintainer_email,
home_page,
license, summary, description, description_html,
keywords,
platform, download_url, _pypi_ordering, _pypi_hidden,
cheesecake_installability_id,
cheesecake_documentation_id,
cheesecake_code_kwalitee_id
from packages, releases
where packages.name='setuptools' and version='0.6c6'
and packages.name = releases.name
And so on.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Fri Jul 6 04:10:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 06 Jul 2007 04:10:19 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
Message-ID: <468DA48B.2020008@v.loewis.de>
Jim Fulton wrote:
> I imagine the people working on the cheeseshop are aware of this,
> but, in case you aren't, I'm getting intermittent errors from the
> cheeseshop. For example, requests for: http://www.python.org/pypi/
I wasn't aware of this until you reported it.
I don't have a clue what's causing it.
Regards,
Martin
From martin at v.loewis.de Fri Jul 6 04:33:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 06 Jul 2007 04:33:54 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468DA48B.2020008@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de>
Message-ID: <468DAA12.4000707@v.loewis.de>
Martin v. Löwis wrote:
> Jim Fulton wrote:
>> I imagine the people working on the cheeseshop are aware of this,
>> but, in case you aren't, I'm getting intermittent errors from the
>> cheeseshop. For example, requests for: http://www.python.org/pypi/
>
> I wasn't aware of this until you reported it.
>
> I don't have a clue what's causing it.
I now do, somewhat. Apparently, when you discard a cursor object
in psycopg, and create a new one, that doesn't necessarily start
a new transaction. So if there was some SQL error in the connection,
it stops accepting further SQL statements.
I fixed that by rolling back the connection after each request,
and before each new request.
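
The failure mode can be sketched with a toy model. Nothing below is
psycopg code: FakeConnection, FakeCursor, TransactionAborted and
handle_request are hypothetical names that only imitate the behaviour
described above, namely that one failed statement poisons the connection
for every later cursor until rollback() is called.

```python
class TransactionAborted(Exception):
    """Stand-in for psycopg.ProgrammingError in this toy model."""


class FakeConnection:
    """Toy model of a persistent PostgreSQL connection (not psycopg)."""

    def __init__(self):
        self.aborted = False

    def cursor(self):
        return FakeCursor(self)

    def rollback(self):
        # Ending the transaction clears the aborted state.
        self.aborted = False


class FakeCursor:
    def __init__(self, connection):
        self.connection = connection

    def execute(self, sql):
        if self.connection.aborted:
            raise TransactionAborted(
                "current transaction is aborted, "
                "commands ignored until end of transaction block")
        if sql == "BAD SQL":
            self.connection.aborted = True
            raise TransactionAborted("syntax error")
        return "ok"


def handle_request(connection, sql, rollback_first):
    """Run one statement on a shared connection, as one request would."""
    if rollback_first:
        connection.rollback()  # clear state left behind by earlier requests
    try:
        return connection.cursor().execute(sql)  # a brand-new cursor each time
    except TransactionAborted as exc:
        return f"error: {exc}"


conn = FakeConnection()
handle_request(conn, "BAD SQL", rollback_first=False)
print(handle_request(conn, "SELECT 1", rollback_first=False))  # still an error
print(handle_request(conn, "SELECT 1", rollback_first=True))   # ok
```

In the model, creating a new cursor does not help; only the rollback at
the start of the request makes the connection usable again.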
What I don't understand is why there was an error in the first
place (or what that error was).
Regards,
Martin
From jim at zope.com Fri Jul 6 14:04:06 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 08:04:06 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468DAA12.4000707@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
Message-ID:
On Jul 5, 2007, at 10:33 PM, Martin v. Löwis wrote:
> Martin v. Löwis wrote:
>> Jim Fulton wrote:
>>> I imagine the people working on the cheeseshop are aware of this,
>>> but, in case you aren't, I'm getting intermittent errors from the
>>> cheeseshop. For example, requests for: http://www.python.org/pypi/
>>
>> I wasn't aware of this until you reported it.
>>
>> I don't have a clue what's causing it.
>
> I now do, somewhat. Apparently, when you discard a cursor object
> in psycopg, and create a new one, that doesn't necessarily start
> a new transaction. So if there was some SQL error in the connection,
> it stops accepting further SQL statements.
>
> I fixed that by rolling back the connection after each request,
> and before each new request.
>
> What I don't understand is why there was an error in the first
> place (or what that error was).
OK, this probably isn't helpful, but I can't help asking an obvious
question. Did something change in the software other than a switch
from mod_python to FastCGI?
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 6 14:16:47 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 08:16:47 -0400
Subject: [Catalog-sig] Cheeseshop performance improved
In-Reply-To: <20070626105201.GA14025@tummy.com>
References: <467CC2E1.3010708@v.loewis.de>
<46801FDC.4060502@v.loewis.de>
<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
Message-ID:
On Jun 26, 2007, at 6:52 AM, Sean Reifschneider wrote:
...
> The quick fix would be to engage XS4ALL to upgrade the RAM in that
> box,
> leaving the box otherwise untouched. The system has only 1GB of
> RAM in it.
> It's got a 2.8GHz Xeon CPU in it, so I would expect it can take at
> least
> 4GB of RAM, if not 8 or 16GB.
>
> Thomas: If the PSF threw a grand or two at XS4ALL, could we get the
> memory
> in ximinez upgraded? Preferably to 4 or 8GB of RAM?
What is the status of this? This seems like a promising early step
and a pretty darn good use of PSF funds.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Fri Jul 6 19:21:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 13:21:00 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To:
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
Message-ID: <20070706171848.268C23A4046@sparrow.telecommunity.com>
At 08:04 AM 7/6/2007 -0400, Jim Fulton wrote:
>On Jul 5, 2007, at 10:33 PM, Martin v. Löwis wrote:
> > I now do, somewhat. Apparently, when you discard a cursor object
> > in psycopg, and create a new one, that doesn't necessarily start
> > a new transaction. So if there was some SQL error in the connection,
> > it stops accepting further SQL statements.
> >
> > I fixed that by rolling back the connection after each request,
> > and before each new request.
> >
> > What I don't understand is why there was an error in the first
> > place (or what that error was).
>
>OK, this probably isn't helpful, but I can't help asking an obvious
>question. Did something change in the software other than a switch
>from mod_python to FastCGI?
That wouldn't be necessary for this to become a problem. If PyPI was
CGI before, then any sort of transient SQL problem wouldn't have had
this effect, because the DB connection would've been closed at the
end of each request. So, it's probably an existing SQL error in PyPI.
From jafo at tummy.com Fri Jul 6 23:45:27 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Fri, 6 Jul 2007 15:45:27 -0600
Subject: [Catalog-sig] Cheeseshop performance improved
In-Reply-To:
References: <467CC2E1.3010708@v.loewis.de>
<46801FDC.4060502@v.loewis.de>
<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
Message-ID: <20070706214527.GR28082@tummy.com>
On Fri, Jul 06, 2007 at 08:16:47AM -0400, Jim Fulton wrote:
>What is the status of this? This seems like a promising early step
>and a pretty darn good use of PSF funds.
I never heard anything from Thomas, who I would think would be the right
person to run this through, as I really don't know anything about the
arrangement we have with XS4ALL. I guess we'd also need to get the PSF to
approve this, though I'd imagine that'd be little more than a formality.
If we don't have any response from Thomas in a bit, I can try contacting
XS4ALL directly and see if they can give us any ideas.
However, I believe that Martin also thinks that with his FastCGI changes it
should be happy now as is...
Thanks,
Sean
--
I think you are blind to the fact that the hand you hold
is the hand that holds you down. -- Everclear
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
From martin at v.loewis.de Sat Jul 7 00:02:47 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:02:47 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To:
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
Message-ID: <468EBC07.6010607@v.loewis.de>
>> I now do, somewhat. Apparently, when you discard a cursor object
>> in psycopg, and create a new one, that doesn't necessarily start
>> a new transaction. So if there was some SQL error in the connection,
>> it stops accepting further SQL statements.
>>
>> I fixed that by rolling back the connection after each request,
>> and before each new request.
>>
>> What I don't understand is why there was an error in the first
>> place (or what that error was).
>
> OK, this probably isn't helpful, but I can't help asking an obvious
> question. Did something change in the software other than a switch from
> mod_python to FastCGI?
Yes, I also made the connections to Postgres persistent, rather than
opening a new connection on each request.
Regards,
Martin
From jim at zope.com Sat Jul 7 00:06:29 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 18:06:29 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EBC07.6010607@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
<468EBC07.6010607@v.loewis.de>
Message-ID: <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
On Jul 6, 2007, at 6:02 PM, Martin v. Löwis wrote:
>>> I now do, somewhat. Apparently, when you discard a cursor object
>>> in psycopg, and create a new one, that doesn't necessarily start
>>> a new transaction. So if there was some SQL error in the connection,
>>> it stops accepting further SQL statements.
>>>
>>> I fixed that by rolling back the connection after each request,
>>> and before each new request.
>>>
>>> What I don't understand is why there was an error in the first
>>> place (or what that error was).
>>
>> OK, this probably isn't helpful, but I can't help asking an obvious
>> question. Did something change in the software other than a
>> switch from
>> mod_python to FastCGI?
>
> Yes, I also made the connections to Postgres persistent, rather than
> opening a new connection on each request.
Ah, OK, that explains it. This is a reasonable thing to do from a
performance point of view. Thanks for plugging away at this. :)
(Of course it's too bad we don't have a better way of testing
changes. Oh well.)
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Sat Jul 7 00:15:10 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:15:10 +0200
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <20070706214527.GR28082@tummy.com>
References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com>
Message-ID: <468EBEEE.9010404@v.loewis.de>
> I never heard anything from Thomas, who I would think would be the right
> person to run this through, as I really don't know anything about the
> arrangement we have with XS4ALL. I guess we'd also need to get the PSF to
> approve this, though I'd imagine that'd be little more than a formality.
>
> If we don't have any response from Thomas in a bit, I can try contacting
> XS4ALL directly and see if they can give us any ideas.
I expect such a project to complete in a matter of months rather
than a matter of days. It took a year or so before the current set of
machines was actively being used (IIRC).
> However, I believe that Martin also thinks that with his FastCGI changes it
> should be happy now as is...
Indeed. If there are further complaints on the performance, I'd like to
hear them (preferably with a way for reproducing them). There is still
stuff that can be done to improve PyPI further, such as better usage of
SQL.
Regards,
Martin
From martin at v.loewis.de Sat Jul 7 00:22:42 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:22:42 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
<468EBC07.6010607@v.loewis.de>
<0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
Message-ID: <468EC0B2.9070903@v.loewis.de>
> Ah, OK, that explains it. This is a reasonable thing to do from a
> performance point of view. Thanks for plugging away at this. :)
>
> (Of course it's too bad we don't have a better way of testing changes.
> Oh well.)
If there were volunteer testers, it would be possible to test changes
for some period of time. Such testers would have to build themselves
a PyPI installation, and then checkout all changes that have been
committed (or install them from a tracker where they float around).
Alternatively, if somebody contributed a unit test suite, certain
problems might get caught.
In the specific case, I tested whether PyPI "works" on my local
installation, and I apparently did not manage to trigger the
problem. My guess is that it was originally triggered by some
failing concurrent access, which is really hard to test for.
Regards,
Martin
From martin at v.loewis.de Sat Jul 7 00:40:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:40:19 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <20070706171848.268C23A4046@sparrow.telecommunity.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
<20070706171848.268C23A4046@sparrow.telecommunity.com>
Message-ID: <468EC4D3.5030108@v.loewis.de>
> That wouldn't be necessary for this to become a problem. If PyPI was
> CGI before, then any sort of transient SQL problem wouldn't have had
> this effect, because the DB connection would've been closed at the end
> of each request. So, it's probably an existing SQL error in PyPI.
That would be my guess. Another possibility might have been that
there was a Python exception, in which case PyPI would not have invoked
.commit on the transaction (so apparently, the transaction would have
been kept open). I'm unsure whether this might cause problems for
subsequent actions. Still, no such exceptions were reported...
In any case, I now do a .rollback in the case of an exception, and
a .rollback before processing a new request. I'd like to get some
confirmation that this is a sensible approach (or what else best
practice is).
Regards,
Martin
From ianb at colorstudy.com Sat Jul 7 00:44:51 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 06 Jul 2007 17:44:51 -0500
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC0B2.9070903@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com> <468DA48B.2020008@v.loewis.de>
<468DAA12.4000707@v.loewis.de> <468EBC07.6010607@v.loewis.de> <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
<468EC0B2.9070903@v.loewis.de>
Message-ID: <468EC5E3.2040903@colorstudy.com>
Martin v. Löwis wrote:
>> Ah, OK, that explains it. This is a reasonable thing to do from a
>> performance point of view. Thanks for plugging away at this. :)
>>
>> (Of course it's too bad we don't have a better way of testing changes.
>> Oh well.)
>
> If there were volunteer testers, it would be possible to test changes
> for some period of time. Such testers would have to build themselves
> a PyPI installation, and then checkout all changes that have been
> committed (or install them from a tracker where they float around).
>
> Alternatively, if somebody contributed a unit test suite, certain
> problems might get caught.
>
> In the specific case, I tested whether PyPI "works" on my local
> installation, and I apparently did not manage to trigger the
> problem. My guess is that it was originally triggered by some
> failing concurrent access, which is really hard to test for.
Are exceptions being logged, and actively sent to someone who can handle
them? This particular problem sounds like it is fairly deployment- and
load-specific, so testing probably wouldn't have found it anyway.
--
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
: Write code, do good : http://topp.openplans.org/careers
From pje at telecommunity.com Sat Jul 7 01:20:37 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 19:20:37 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC4D3.5030108@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
<20070706171848.268C23A4046@sparrow.telecommunity.com>
<468EC4D3.5030108@v.loewis.de>
Message-ID: <20070706231831.B83783A405F@sparrow.telecommunity.com>
At 12:40 AM 7/7/2007 +0200, Martin v. Löwis wrote:
> > That wouldn't be necessary for this to become a problem. If PyPI was
> > CGI before, then any sort of transient SQL problem wouldn't have had
> > this effect, because the DB connection would've been closed at the end
> > of each request. So, it's probably an existing SQL error in PyPI.
>
>That would be my guess. Another possibility might have been that
>there was a Python exception, in which case PyPI would not have invoked
>.commit on the transaction (so apparently, the transaction would have
>been kept open). I'm unsure whether this might cause problems for
>subsequent actions. Still, no such exceptions were reported...
>
>In any case, I now do a .rollback in the case of an exception, and
>a .rollback before processing a new request. I'd like to get some
>confirmation that this is a sensible approach (or what else best
>practice is).
The best practice is ensuring that either a commit or rollback
happens at the end of each web request that uses the
connection. Then, there's no chance of a failed but not rolled-back
transaction continuing to hold locks in the database.
In PostgreSQL's case, the MVCC would prevent such a transaction from
blocking any read-only transactions, of course.
What you're doing is quite close to best practice; if I understand
you correctly, it differs only in the case of what happens if there
is a program error resulting in failure to commit or abort.
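
A minimal sketch of the practice described above, not PyPI's actual code: every request runs inside a wrapper that rolls back on entry (in case an earlier request left a transaction open), rolls back on any exception, and commits otherwise. The names request_transaction and handle_request and the journal table layout are made up for illustration, and sqlite3 only stands in for the real PostgreSQL connection.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def request_transaction(conn):
    """Make sure one web request always ends in commit or rollback."""
    # Discard anything a previous request may have left open.
    conn.rollback()
    try:
        yield conn
    except Exception:
        # The handler failed: release locks and drop partial writes.
        conn.rollback()
        raise
    else:
        conn.commit()


# sqlite3 is an assumption made to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journal (action TEXT)")
conn.commit()


def handle_request(conn, action, fail=False):
    """Hypothetical request handler that writes one journal row."""
    with request_transaction(conn):
        conn.execute("INSERT INTO journal VALUES (?)", (action,))
        if fail:
            raise RuntimeError("simulated handler error")


handle_request(conn, "create")
try:
    handle_request(conn, "broken", fail=True)
except RuntimeError:
    pass

rows = [r[0] for r in conn.execute("SELECT action FROM journal")]
```

Only the row from the successful request should remain; the failed one is rolled back before the next request touches the connection.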
From richardjones at optushome.com.au Sat Jul 7 01:30:46 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 7 Jul 2007 09:30:46 +1000
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC5E3.2040903@colorstudy.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
<468EC0B2.9070903@v.loewis.de> <468EC5E3.2040903@colorstudy.com>
Message-ID: <200707070930.46318.richardjones@optushome.com.au>
On Sat, 7 Jul 2007, Ian Bicking wrote:
> Are exceptions being logged, and actively sent to someone who can handle
> them? This particular problem sounds like it is fairly deployment- and
> load-specific, so testing probably wouldn't have found it anyway.
Errors are currently emailed to myself and AMK. This is controlled by the
config on ximinez, so others may receive the error emails if they desire.
Richard
From renesd at gmail.com Sat Jul 7 03:22:27 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 11:22:27 +1000
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <468EBEEE.9010404@v.loewis.de>
References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de>
<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
Message-ID: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
Hi,
yeah, the sql can be improved.
A lot of the queries cause a sequential scan of all the rows in the
journal and release tables.
I think the cause of this is that one of the tables does not have a
primary key, so postgresql can't optimize the query. Even if the
table had an incrementing numeric id field, then I think the joins
could be sped up. I haven't tested this yet, but maybe that'd help -
or maybe more changes would be needed. Postgresql
definitely needs a PK on each table though.
ps, I'm going to try and finish off that caching/static file work I've
been working on (more on that later). I guess I'll need to test things
a little differently with fastcgi. How did you set up a fastcgi pypi?
Cheers,
On 7/7/07, "Martin v. Löwis" wrote:
> > I never heard anything from Thomas, which I would think would be the right
> > person to run this through, as I really don't know anything about the
> > arrangement we have with XS4ALL. I guess we'd also need to get the PSF to
> > approve this, though I'd imagine that'd be little more than a formality.
> >
> > If we don't have any response from Thomas in a bit, I can try contacting
> > XS4ALL directly and see if they can give us any ideas.
>
> I expect such a project to complete in a matter of months rather
> than a matter of days. It took a year or so before the current set of
> machines was actively being used (IIRC).
>
> > However, I believe that Martin also thinks that with his FastCGi changes it
> > should be happy now as is...
>
> Indeed. If there are further complaints on the performance, I'd like to
> hear them (preferably with a way for reproducing them). There is still
> stuff that can be done to improve PyPI further, such as better usage of
> SQL.
>
> Regards,
> Martin
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>
From renesd at gmail.com Sat Jul 7 06:24:53 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 14:24:53 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
Message-ID: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
Hello,
here is the start of an apache config for using static files if they
exist, and the person is not logged in.
The idea will be to have a www/static/cheeseshop.python.org/pypi/
directory filled with the relevant cached files.
Here's the apache config so far. It checks to see if the person is
authorized, and if they are it does not use the static files.
There are a couple of special cases... ie the /pypi and pypi urls.
Now I just need to finish off the static file generation code. It
needs a tool which can run every minute or so, which will look for any
changes. If it finds changes it will update just those files. It
will generate the files in a separate directory first, and then move
them in - so people don't download half generated files. It will
optionally be able to regenerate all the static files - in case there
are database, or template changes.
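
A rough sketch of the "generate elsewhere, then move in" step, under some assumptions: the function name write_static_page and the example paths are hypothetical, and the temporary file is created next to the target rather than in a separate directory so that the final rename stays on one filesystem and is atomic.

```python
import os
import tempfile


def write_static_page(static_root, relative_path, html):
    """Write a page so readers never see a half-generated file."""
    target = os.path.join(static_root, relative_path)
    target_dir = os.path.dirname(target)
    os.makedirs(target_dir, exist_ok=True)

    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(html)
        # mkstemp creates the file as 0600; the web server must be able to read it.
        os.chmod(tmp_path, 0o644)
        # Atomic swap: clients get either the old page or the new one.
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target


root = tempfile.mkdtemp()
page = write_static_page(root, "pypi/example/index.html", "<html>v1</html>")
page = write_static_page(root, "pypi/example/index.html", "<html>v2</html>")
with open(page, encoding="utf-8") as f:
    content = f.read()
leftovers = [n for n in os.listdir(os.path.dirname(page)) if n.endswith(".tmp")]
```

A full regeneration could use the same function for every page, or build a whole tree in a sibling directory and rename the directory at the end.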
Of course the config will have to change a little bit for using fcgi
instead of modpython... but there shouldn't be too much to change.
I've also updated the http://wiki.python.org/moin/CheeseShopDev page
with some things I noticed when installing the cheeseshop again on my
laptop. Mainly dependencies, and missing config steps.
NameVirtualHost 192.168.0.3
ServerAdmin webmaster at localhost
ServerName gracerr.pretendpaper.com
DocumentRoot /home/rene/dev/python/cheeseshop/packages/trunk/www/
# Redirect RSS to a static file
Alias /pypi/?:action=rss /data/www/pypi/pypi_rss.xml
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
AddHandler cgi-script .cgi
Options Indexes
SetHandler mod_python
#PythonPath "['/data/pypi/src/pypi']+sys.path"
PythonPath "['/home/rene/dev/python/cheeseshop/packages/trunk/pypi']+sys.path"
PythonHandler pypi::handle
PythonDebug On
# 2007-06-15 -- POSTs to /pypi every second
deny from 69.55.232.188
# Rewrite rules
RewriteEngine on
# if the authorization header is empty, redirect.
RewriteCond %{HTTP:authorization} ^$
RewriteRule ^(.*)pypi/$ /static/package_index.html [L]
#RewriteRule ^(.*)pypi$ /static/front-page.html [L]
# always make the /pypi empty one go straight through.
RewriteRule ^(.*)pypi$ /pypi2 [PT]
# a file, or a directory, and empty authorization header.
RewriteCond %{HTTP:authorization} ^$
RewriteCond /home/rene/dev/python/cheeseshop/packages/trunk/www/static/gracerr.pretendpaper.com/%{REQUEST_FILENAME} -f
RewriteRule ^(.*)pypi/(.*) /static/gracerr.pretendpaper.com/pypi/$2 [PT]
RewriteCond %{HTTP:authorization} ^$
RewriteCond /home/rene/dev/python/cheeseshop/packages/trunk/www/static/gracerr.pretendpaper.com/%{REQUEST_FILENAME} -d
RewriteRule ^(.*)pypi/(.*) /static/gracerr.pretendpaper.com/pypi/$2 [PT]
# Look here instead...
RewriteRule (.*) /pypi2/$1 [PT]
# Point to package directory
RewriteRule /packages(/.*)?$ /data/packages$1 [last]
RewriteRule /icons/(.*$) /usr/share/apache2/icons/$1 [last]
RedirectMatch permanent ^/$ "http://gracerr.pretendpaper.com/pypi"
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 9
ErrorLog /var/log/apache2/grace_error.log
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#LogLevel warn
LogLevel debug
CustomLog /var/log/apache2/grace_access.log combined
#ServerSignature On
# mkdir /var/tmp/proxy2/cheeseshop
# chown www-data: /var/tmp/proxy2/cheeseshop
# CacheRoot "/var/tmp/proxy2/cheeseshop"
# CacheEnable disk /
# CacheSize 4000000
# # CacheMinFileSize setting this so that 403 forbidden pages are not cached.
# CacheMinFileSize 400
# CacheDirLevels 5
# CacheDirLength 3
# #CacheGcInterval 4
# CacheMaxExpire 24
# CacheLastModifiedFactor 0.1
# CacheDefaultExpire 1
# #CacheForceCompletion 100
From martin at v.loewis.de Sat Jul 7 08:30:53 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 08:30:53 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC5E3.2040903@colorstudy.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com> <468DA48B.2020008@v.loewis.de>
<468DAA12.4000707@v.loewis.de> <468EBC07.6010607@v.loewis.de> <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
<468EC0B2.9070903@v.loewis.de> <468EC5E3.2040903@colorstudy.com>
Message-ID: <468F331D.1080904@v.loewis.de>
> Are exceptions being logged, and actively sent to someone who can handle
> them? This particular problem sounds like it is fairly deployment- and
> load-specific, so testing probably wouldn't have found it anyway.
They are sent by email. AFAICT, they are not logged.
Regards,
Martin
From martin at v.loewis.de Sat Jul 7 08:44:21 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 08:44:21 +0200
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
References: <467CC2E1.3010708@v.loewis.de>
<46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com>
<468EBEEE.9010404@v.loewis.de>
<64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
Message-ID: <468F3645.1030000@v.loewis.de>
> A lot of the queries cause a sequential scan of all the rows in the
> journal and release tables.
>
> I think the cause of this is that one of the tables does not have a
> primary key, so postgresql can't optimize the query. Even if the
> table had an incrementing numeric id field, then I think the joins
> could be sped up. I haven't tested this yet, but maybe that'd help -
> or maybe there would need to be more changes needed. Postgresql
> definitely needs a PK on each table though.
Not definitely - an index is enough. A PK only adds an additional
constraint, and does not contribute in itself to performance.
In any case, I plan to add a name-version index to release_classifiers,
as the browsing often looks into release_classifiers by name and
version.
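
To illustrate the point about a plain index being enough, here is a small sketch that compares the query plan before and after adding a name-version index. It runs on sqlite3 purely as a stand-in (PostgreSQL's planner and EXPLAIN output differ), and the trove_id column and the index name are assumptions, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only name and version come from the discussion; trove_id is hypothetical.
conn.execute(
    "CREATE TABLE release_classifiers (name TEXT, version TEXT, trove_id INTEGER)"
)

QUERY = "SELECT trove_id FROM release_classifiers WHERE name = ? AND version = ?"


def plan(conn):
    """Return the planner's description of how QUERY would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("example", "1.0")).fetchall()
    return " ".join(row[-1] for row in rows)


before = plan(conn)  # no index yet: a full table scan
conn.execute(
    "CREATE INDEX rel_class_name_version_idx "
    "ON release_classifiers (name, version)"
)
after = plan(conn)  # the lookup can now go through the index

print(before)
print(after)
```

No primary key is declared anywhere in the sketch; the index alone changes the plan from a scan to an index search.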
> ps, I'm going to try and finish off that caching/static file work I've
> been working on(more on that later). I guess I'll need to test things
> a little differently with fastcgi. How did you set up a fastcgi pypi?
FastCgiServer /data/pypi/src/pypi/pypi.fcgi -idle-timeout 60 -processes 4
then
# Trick Apache in providing Basic-Auth to pypi.fcgi
RewriteCond %{HTTP:Authorization} ^(.+)$
RewriteRule ^/pypi(.*) /data/pypi/src/pypi/pypi.fcgi$1 [e=HTTP_CGI_AUTHORIZATION:%1,l]
ScriptAlias /pypi /data/pypi/src/pypi/pypi.fcgi
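
For whoever tests this locally: the rewrite rule above copies the Authorization header into the HTTP_CGI_AUTHORIZATION environment variable. A sketch of how the application side might decode it follows; the function name is made up and this is an assumption about usage, not a copy of what pypi.fcgi does.

```python
import base64


def basic_auth_from_environ(environ):
    """Return (user, password) from HTTP_CGI_AUTHORIZATION, or None."""
    header = environ.get("HTTP_CGI_AUTHORIZATION", "").strip()
    scheme, _, payload = header.partition(" ")
    if scheme.lower() != "basic" or not payload:
        return None
    try:
        decoded = base64.b64decode(payload.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # Malformed base64 or non-UTF-8 credentials: treat as anonymous.
        return None
    user, sep, password = decoded.partition(":")
    if not sep:
        return None
    return user, password


token = base64.b64encode(b"alice:secret").decode("ascii")
credentials = basic_auth_from_environ({"HTTP_CGI_AUTHORIZATION": "Basic " + token})
anonymous = basic_auth_from_environ({})
```

Requests without the header never match the RewriteCond, so the variable is simply absent and the helper returns None.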
Regards,
Martin
From martin at v.loewis.de Sat Jul 7 09:12:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 09:12:20 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
Message-ID: <468F3CD4.1070501@v.loewis.de>
> Now I just need to finish off the static file generation code. It
> needs a tool which can run every minute or so, which will look for any
> changes.
Would it be possible to trigger that explicitly by a write operation?
I'm doubtful about cron jobs for that kind of stuff - they run both
too often and too infrequent. It's too often because most of the time,
nothing changes, and too infrequent, because the user making the change
won't see it, and wonders where it got lost (they will see the change
as long as they are logged in, then they log out, and the release is not
there).
IIUC, every addition to the journals should trigger a change, and then
the updating of the download counters. There are also changes to the
templates, but it would be ok if one would have to trigger regeneration
manually in this case.
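
As a sketch of what "trigger on write" could look like: the journal write itself calls a regeneration callback for the affected package, so no polling is needed. JournalStore and add_journal_entry are hypothetical names for illustration and do not mirror the real store module.

```python
class JournalStore:
    """Hypothetical stand-in for the store; only the journal write matters here."""

    def __init__(self, regenerate):
        self.journal = []
        self._regenerate = regenerate  # callback taking a package name

    def add_journal_entry(self, package, action):
        self.journal.append((package, action))
        # Regenerate right after the write, so the uploader sees the change at once.
        self._regenerate(package)


regenerated = []
store = JournalStore(regenerate=regenerated.append)
store.add_journal_entry("example", "new release 1.0")
store.add_journal_entry("other", "remove 0.9")
```

Template changes would not pass through this path, which matches the suggestion that they trigger a manual full regeneration.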
Regards,
Martin
From renesd at gmail.com Sat Jul 7 09:38:18 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 17:38:18 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <468F3CD4.1070501@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
Message-ID: <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
Yeah, that could be triggered then.
For the case of multiple changes at a similar time, we could add some
checks to make sure the updater process is only running once.
Otherwise for the case when there are a few changes happening at a
time, the machine would get unnecessarily overloaded.
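
One possible shape for the "only running once" check, as a sketch: an exclusive lock file that a second updater fails to create. The name updater_lock and the lock path are assumptions; a real deployment would also need to deal with stale lock files left behind by a crashed updater.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def updater_lock(lock_path):
    """Yield True if this process got the lock, False if another holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield True
    finally:
        os.close(fd)
        os.unlink(lock_path)


lock_path = os.path.join(tempfile.mkdtemp(), "updater.lock")
results = []
with updater_lock(lock_path) as first:
    results.append(first)
    # A second updater starting while the first is still running backs off.
    with updater_lock(lock_path) as second:
        results.append(second)
with updater_lock(lock_path) as third:
    results.append(third)
```

An updater that gets False can simply exit, or record that another pass is needed once the running one finishes.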
On 7/7/07, "Martin v. Löwis" wrote:
> > Now I just need to finish off the static file generation code. It
> > needs a tool which can run every minute or so, which will look for any
> > changes.
>
> Would it be possible to trigger that explicitly by a write operation?
> I'm doubtful about cron jobs for that kind of stuff - they run both
> too often and too infrequent. It's too often because most of the time,
> nothing changes, and too infrequent, because the user making the change
> won't see it, and wonders where it got lost (they will see the change
> as long as they are logged in, then they log out, and the release is not
> there).
>
> IIUC, every addition to the journals should trigger a change, and then
> the updating of the download counters. There are also changes to the
> templates, but it would be ok if one would have to trigger regeneration
> manually in this case.
>
> Regards,
> Martin
>
From renesd at gmail.com Sat Jul 7 09:41:54 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 17:41:54 +1000
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <468F3645.1030000@v.loewis.de>
References: <467CC2E1.3010708@v.loewis.de> <46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
<64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
<468F3645.1030000@v.loewis.de>
Message-ID: <64ddb72c0707070041n5eb565c1jdaa25e4c9d583641@mail.gmail.com>
Thanks.
I thought because of the types of joins being done postgresql needed a
primary key - but maybe you can get them working with just some more
indices.
On 7/7/07, "Martin v. Löwis" wrote:
> > A lot of the queries cause a sequential scan of all the rows in the
> > journal and release tables.
> >
> > I think the cause of this is that one of the tables does not have a
> > primary key, so postgresql can't optimize the query. Even if the
> > table had an incrementing numeric id field, then I think the joins
> > could be sped up. I haven't tested this yet, but maybe that'd help -
> > or maybe there would need to be more changes needed. Postgresql
> > definitely needs a PK on each table though.
>
> Not definitely - an index is enough. A PK only adds an additional
> constraint, and does not contribute in itself to performance.
> In any case, I plan to add a name-version index to release_classifiers,
> as the browsing often looks into release_classifiers by name and
> version.
>
> > ps, I'm going to try and finish off that caching/static file work I've
> > been working on(more on that later). I guess I'll need to test things
> > a little differently with fastcgi. How did you set up a fastcgi pypi?
>
> FastCgiServer /data/pypi/src/pypi/pypi.fcgi -idle-timeout 60 -processes 4
>
> then
>
> # Trick Apache in providing Basic-Auth to pypi.fcgi
> RewriteCond %{HTTP:Authorization} ^(.+)$
> RewriteRule ^/pypi(.*) /data/pypi/src/pypi/pypi.fcgi$1 [e=HTTP_CGI_AUTHORIZATION:%1,l]
> ScriptAlias /pypi /data/pypi/src/pypi/pypi.fcgi
>
> Regards,
> Martin
>
From jafo at tummy.com Sat Jul 7 10:18:13 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 7 Jul 2007 02:18:13 -0600
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <468EBEEE.9010404@v.loewis.de>
References:
<46801FDC.4060502@v.loewis.de>
<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
Message-ID: <20070707081813.GS28082@tummy.com>
On Sat, Jul 07, 2007 at 12:15:10AM +0200, "Martin v. Löwis" wrote:
>I expect such a project to complete in a matter of months rather
>than a matter of days. It took a year or so before the current set of
I believe that Jim was referring to the memory upgrade of ximinez, not the
getting creosote replaced with a new box. The memory upgrade should take
little if any of our time.
Thanks,
Sean
--
moshez always wanted to invent a compression scheme called "feather",
so he could tar and feather his files.
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
From renesd at gmail.com Sat Jul 7 11:03:24 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 19:03:24 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
Message-ID: <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
Hi,
I tried using memcached for caching the database queries - for logged
in users. It did speed it up a little, but not that much. It turns
out that the templates take most of the time - at least on my machine.
I guess pagetemplates are not that quick?
Here's the modified files if you want to try it out yourself:
http://rene.f0o.com/~rene/stuff/store.py
http://rene.f0o.com/~rene/stuff/webui.py
I only tried it on the queries that /pypi and /pypi/ use.
There's some timing in the webui that gets written to a file in /tmp/asdfsdaf
Under concurrent access, memcached will make more of a difference though.
Memcache could help even for logged in people, but I think replacing
the template language with something faster will have the most effect.
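
For anyone who wants to repeat the measurement without the modified files, here is a sketch of the cache-aside pattern plus the per-stage timing. DictCache is an in-process stand-in for a memcached client (no real client API is used), and cached_query, timed and the fake query are hypothetical names.

```python
import time


class DictCache:
    """In-process stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def cached_query(cache, key, run_query):
    """Return the cached result, running the query only on a miss."""
    value = cache.get(key)
    if value is None:
        value = run_query()
        cache.set(key, value)
    return value


def timed(label, func, timings):
    """Run func and record how long it took under label."""
    start = time.perf_counter()
    result = func()
    timings[label] = time.perf_counter() - start
    return result


calls = []


def fake_package_query():
    calls.append(1)  # count how often the "database" is really hit
    return ["example 1.0", "other 0.9"]


cache = DictCache()
timings = {}
first = timed("miss", lambda: cached_query(cache, "index", fake_package_query), timings)
second = timed("hit", lambda: cached_query(cache, "index", fake_package_query), timings)
```

Timing the template rendering with the same helper would show directly how the query time compares with the rendering time.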
Cheers,
On 7/7/07, René Dudfield wrote:
> Yeah, that could be triggered then.
>
> For the case of multiple changes at a similar time, we could add some
> checks to make sure the updater process is only running once.
> Otherwise for the case when there are a few changes happening at a
> time, the machine would get unnecessarily overloaded.
>
>
> On 7/7/07, "Martin v. Löwis" wrote:
> > > Now I just need to finish off the static file generation code. It
> > > needs a tool which can run every minute or so, which will look for any
> > > changes.
> >
> > Would it be possible to trigger that explicitly by a write operation?
> > I'm doubtful about cron jobs for that kind of stuff - they run both
> > too often and too infrequent. It's too often because most of the time,
> > nothing changes, and too infrequent, because the user making the change
> > won't see it, and wonders where it got lost (they will see the change
> > as long as they are logged in, then they log out, and the release is not
> > there).
> >
> > IIUC, every addition to the journals should trigger a change, and then
> > the updating of the download counters. There are also changes to the
> > templates, but it would be ok if one would have to trigger regeneration
> > manually in this case.
> >
> > Regards,
> > Martin
> >
>
From jim at zope.com Sat Jul 7 16:30:24 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 7 Jul 2007 10:30:24 -0400
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de>
<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
<64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
Message-ID: <2F7122AD-4F6C-4714-9955-0E12AF8A6864@zope.com>
On Jul 6, 2007, at 9:22 PM, René Dudfield wrote:
...
> ps, I'm going to try and finish off that caching/static file work I've
> been working on(more on that later).
Yay!
> I guess I'll need to test things
> a little differently with fastcgi. How did you set up a fastcgi pypi?
Does it matter? Couldn't you just test with CGI?
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Sat Jul 7 17:19:19 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 7 Jul 2007 11:19:19 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
Message-ID:
On Jul 7, 2007, at 12:24 AM, René Dudfield wrote:
...
> Now I just need to finish off the static file generation code. It
> needs a tool which can run every minute or so, which will look for any
> changes.
Why not write the files when the underlying packages change?
I don't like polling for two reasons:
- New pages are out of date for up to the polling interval. This is
especially annoying for someone who uploads a package and wants to be
able to access it immediately.
- Polling all of the pages to see what's changed doesn't seem
scalable to me.
...
> I've also updated the http://wiki.python.org/moin/CheeseShopDev page
> with some things I noticed when installing the cheeseshop again on my
> laptop. Mainly dependencies, and missing config steps.
Thanks!
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Sat Jul 7 18:39:42 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 18:39:42 +0200
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <20070707081813.GS28082@tummy.com>
References: <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com>
<468EBEEE.9010404@v.loewis.de> <20070707081813.GS28082@tummy.com>
Message-ID: <468FC1CE.8080708@v.loewis.de>
>> I expect such a project to complete in a matter of months rather
>> than a matter of days. It took a year or so before the current set of
>
> I believe that Jim was referring to the memory upgrade of ximinez, not the
> getting creosote replaced with a new box. The memory upgrade should take
> little if any of our time.
Ah, ok. If you would like to find the right person at XS4ALL to talk to,
please go ahead - else I could try myself.
Regards,
Martin
From martin at v.loewis.de Sat Jul 7 18:43:39 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 18:43:39 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
Message-ID: <468FC2BB.7030607@v.loewis.de>
> I tried using memcached for caching the database queries - for logged
> in users. It did speed it up a little, but not that much. It turns
> out that the templates take most of the time - at least on my machine.
For the majority of pages generated through page templates, I think
the static generation would be fine. I'm looking primarily into the
browse interface at the moment.
> There's some timing in the webui that gets written to a file in /tmp/asdfsdaf
>
> For concurrent access then memcached will make more of a difference though.
>
> Memcache could help even for logged in people, but I think replacing
> the template language with something faster will have the most effect.
I'm quite skeptical on caching in general (even about the static page
generation). It *should* be possible to make it fast enough so that
it doesn't need caching. I consider caching a work-around, not a
solution - and one with severe drawbacks.
Regards,
Martin
From jim at zope.com Sat Jul 7 19:48:50 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 7 Jul 2007 13:48:50 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <468FC2BB.7030607@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
Message-ID: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
On Jul 7, 2007, at 12:43 PM, Martin v. Löwis wrote:
...
> I'm quite skeptical on caching in general (even about the static page
> generation). It *should* be possible to make it fast enough so that
> it doesn't need caching.
Sure, with more hardware than we want to afford.
> I consider caching a work-around, not a
> solution - and one with severe drawbacks.
The pages we're talking about are static. They change at well-known
times. IMO, It's crazy to serve static content dynamically when it's
easy to serve it statically.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jafo at tummy.com Sat Jul 7 20:56:30 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 7 Jul 2007 12:56:30 -0600
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <468FC1CE.8080708@v.loewis.de>
References: <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<46802A10.8080205@v.loewis.de>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
<20070707081813.GS28082@tummy.com> <468FC1CE.8080708@v.loewis.de>
Message-ID: <20070707185630.GV28082@tummy.com>
On Sat, Jul 07, 2007 at 06:39:42PM +0200, "Martin v. Löwis" wrote:
>Ah, ok. If you would like to find the right person at XS4ALL to talk to,
>please go ahead - else I could try myself.
I've sent a request to the "sales" e-mail contact explaining what we're
trying to do and asking for direction.
Thanks,
Sean
--
You know you're in Canada when: You see a flyer advertising a polka-fest
at the curling rink.
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
From thomas at python.org Sat Jul 7 21:38:00 2007
From: thomas at python.org (Thomas Wouters)
Date: Sat, 7 Jul 2007 12:38:00 -0700
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <20070707185630.GV28082@tummy.com>
References: <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
<200706252144.l5PLi7cs032424@theraft.openend.se>
<20070626105201.GA14025@tummy.com>
<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
<20070707081813.GS28082@tummy.com> <468FC1CE.8080708@v.loewis.de>
<20070707185630.GV28082@tummy.com>
Message-ID: <9e804ac0707071238v6664e9c1xb954fe805f5ebb15@mail.gmail.com>
On 7/7/07, Sean Reifschneider wrote:
>
> On Sat, Jul 07, 2007 at 06:39:42PM +0200, "Martin v. Löwis" wrote:
> >Ah, ok. If you would like to find the right person at XS4ALL to talk to,
> >please go ahead - else I could try myself.
>
> I've sent a request to the "sales" e-mail contact explaining what we're
> trying to do and asking for direction.
I doubt they can figure out what to do, frankly, since we're not an official
sales customer. But who knows, they might surprise me ;) I sent out an email
asking for extra memory last week, but I've been busy with work and
travelling (first Mountain View for Google, now Vilnius for EuroPython) and
haven't had a chance to find out if the people I asked are even in the
country right now. If you don't hear back from sales, let me know and I'll
ask around more.
--
Thomas Wouters
Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!
From martin at v.loewis.de Sat Jul 7 22:24:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 22:24:59 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
Message-ID: <468FF69B.2090503@v.loewis.de>
Jim Fulton schrieb:
> ...
>> I'm quite skeptical on caching in general (even about the static page
>> generation). It *should* be possible to make it fast enough so that
>> it doesn't need caching.
>
> Sure, with more hardware than we want to afford.
So you are saying it's not fast enough already?
Regards,
Martin
From renesd at gmail.com Sun Jul 8 05:14:56 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 13:14:56 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
Message-ID: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Hello,
Cool, ok. Let's start with event based updating of the static files.
I need to make this tool in this way anyway though. But we can either
set it up to work with polling, or event based. We can start with
event based and switch to polling later if needed.
Since none of the files exists at the moment, the tool will be
needed to generate them initially. Also if templates change, or the
database changes - then the static pages may need regenerating.
Polling is just one sql statement to see if something has changed.
You do this once, no matter how many things have changed. It's a
really quick operation if nothing has changed.
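A minimal sketch of that single polling query, assuming a hypothetical
`journals` table with a `submitted_date` column holding ISO timestamps.
sqlite3 is used only to keep the example self-contained; the table,
column and function names are illustrative and not taken from the PyPI
code:

```python
import sqlite3


def changed_since(conn, since):
    """Return True if any journal entry is newer than `since`.

    `since` is an ISO-formatted timestamp string; ISO timestamps sort
    correctly when compared as plain strings.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM journals WHERE submitted_date > ?", (since,)
    ).fetchone()
    return row[0] > 0


def demo_connection():
    """Build a throwaway in-memory database with one recorded change."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journals (name TEXT, version TEXT, submitted_date TEXT)"
    )
    conn.execute(
        "INSERT INTO journals VALUES ('pygame', '1.7.1', '2007-07-07 10:00:00')"
    )
    return conn
```

The poller would remember the time of its last run and pass it as
`since`; when nothing has changed the query returns almost immediately.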
Polling ends up being faster if you are constantly having to do things
all the time anyway. It's what network drivers do these days because
they realise that there is a constant stream of events (interrupts)
anyway - so they might as well deal with them at a fixed interval.
Logged in users will not see the static file anyway - since they are
logged in, they get to see the dynamically generated stuff.
Imagine this case:
2-3 users are updating their packages at a similar time. With event
based updating, the main index then gets regenerated 3 times, rather
than once with polling. The more people who are changing things, the
better polling works. If there are 20 people changing things at the
same time, then there is still only one update of the main index page.
However, since the cheeseshop only gets updated about 6 times daily,
event based is probably better for the moment.
Anyway... I'm just making the tool which can be used on demand, or at
regular timings.
Cheers,
On 7/8/07, Jim Fulton wrote:
>
> On Jul 7, 2007, at 12:24 AM, René Dudfield wrote:
> ...
> > Now I just need to finish off the static file generation code. It
> > needs a tool which can run every minute or so, which will look for any
> > changes.
>
> Why not write the files when the underlying packages change?
>
> I don't like polling for two reasons:
>
> - New pages are out of date for up to the polling interval. This is
> especially annoying for someone who uploads a package and wants to be
> able to access it immediately.
>
> - Polling all of the pages to see what's changed doesn't seem
> scalable to me.
>
> ...
>
> > I've also updated the http://wiki.python.org/moin/CheeseShopDev page
> > with some things I noticed when installing the cheeseshop again on my
> > laptop. Mainly dependencies, and missing config steps.
>
> Thanks!
>
> Jim
>
> --
> Jim Fulton mailto:jim at zope.com Python Powered!
> CTO (540) 361-1714 http://www.python.org
> Zope Corporation http://www.zope.com http://www.zope.org
>
>
>
>
From renesd at gmail.com Sun Jul 8 05:27:53 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 13:27:53 +1000
Subject: [Catalog-sig] europython cheeseshop sprint? Rolling out changes.
Message-ID: <64ddb72c0707072027p226f7125k3642ef00c5577675@mail.gmail.com>
Hellos,
I'll need to coordinate with someone at some point to implement my
changes... since I don't have access. I'm at europython, so maybe
that would be a good time to meet up for a little sprint?
Is there anyone with access to the cheeseshop going to europython who
wants to work on implementing these changes? I don't have subversion
commit access, or access to the server, so I'll need someone else who
does to help me.
Here's the sprint wiki page for sprints:
http://wiki.python.org/moin/EuroPython2007Sprints
I also created a page here:
http://wiki.python.org/moin/EuroPython2007/CheeseshopSprint
We need to decide when to do the sprint too.
Please let me know if you want to join the sprint, and on what day?
What other things do people want to work on at the sprint?
It would be good to set up a different virtual domain so we can test
changes on there without mucking up the normal cheeseshop so much. It
might be best if I set it up on a separate server for testing, since
apache will have to be restarted a lot.
Since there aren't really any tests for the cheeseshop, should I start
adding some? If so with which tool? I'd like to make some tests to
see if the dynamic or static files are being served - depending on whether
the user is authorized or not. I'd also like to
These tests can also serve as monitoring tools - to answer this
question - 'is the cheeseshop still working?'
From renesd at gmail.com Sun Jul 8 06:48:45 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 14:48:45 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Message-ID: <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
Hi,
here's the start of the static file generator. It just works on one
web path and one output file at a time so far. It doesn't figure out the
correct path to put the file, or check to see if there are any
changes.
http://rene.f0o.com/~rene/stuff/pypi/pypi-static-generation.py
# here is like looking at the http://cheeseshop.python.org/pypi/pygame url
python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html
It uses the webui.py code, so that no code is repeated. It does this
in a similar manner to pypi.py, pypi.cgi and pypi.fcgi: by providing
its own implementation of the RequestWrapper class.
I thought I'd just keep posting my changes to the mailing list as I
go... so there's some history of changes - and so people can have a
look/review if they want. If that annoys people I'll stop sending to
the list.
Next up I'm going to put a few functions into store.py. Ones to check
if a release has changed since a given date. Also one to see if any
changes at all have happened since a given date.
I'll also add some onChange type functions for releases. That will be
where all of the code can go for stuff that happens on a change to
releases etc.
cheers,
From martin at v.loewis.de Sun Jul 8 07:19:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 07:19:33 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Message-ID: <469073E5.6010201@v.loewis.de>
> Polling is just one sql statement to see if something has changed.
It's not good enough to know that something has changed - one would
also need to know what precisely has changed, or else you would need
to regenerate everything.
> Polling ends up being faster if you are constantly having to do things
> all the time anyway.
Maybe (I don't fully understand what you are trying to say).
However, the cheeseshop does not change very often, so you don't
have to do things all the time anyway. If it did, caching would have
no advantage.
> 2-3 users are updating their packages, at a similar time. The main
> index then gets regenerated 3 times, rather than once.
[Not sure what page precisely you are referring to as "the main index".
I'll assume you talk about the home page]
On July 7 (yesterday), there were 54 changes; the day before, there were
37. Of these, it is typical that multiple changes to the same package
happen within a few seconds, and then no changes happen for many
minutes; often not a single change within an hour.
It very rarely happens that there are 3 users simultaneously updating
their packages.
Regenerating the main index 3 times is very fast. Depending on how
precisely you prevent concurrent updates, and depending on how
similar the times are, the three users may not trigger three updates,
but only two, if the first update is still running when the second
and third ones are attempted.
Regards,
Martin
From martin at v.loewis.de Sun Jul 8 07:29:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 07:29:36 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
Message-ID: <46907640.3010408@v.loewis.de>
> Next up I'm going to put a few functions into store.py. Ones to check
> if a release has changed since a given date. Also one to see if any
> changes at all have happened since a given date.
Is this really necessary? I think it would be sufficient to have a table
of name,version pairs that list the releases that have changed. This
table is filled on modification, and cleared by the regeneration.
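A minimal sketch of such a table, assuming a hypothetical
`changed_releases` table and helper names that do not exist in
store.py; sqlite3 stands in for the real database only to keep the
example self-contained:

```python
import sqlite3


def create_changed_table(conn):
    """Create the (hypothetical) table of releases awaiting regeneration."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changed_releases ("
        "name TEXT, version TEXT, PRIMARY KEY (name, version))"
    )


def mark_changed(conn, name, version):
    """Record a modified release; repeated changes collapse into one row."""
    conn.execute(
        "INSERT OR IGNORE INTO changed_releases VALUES (?, ?)", (name, version)
    )


def regenerate_changed(conn, regenerate):
    """Regenerate every listed release, clearing each row once it is done."""
    rows = conn.execute("SELECT name, version FROM changed_releases").fetchall()
    for name, version in rows:
        regenerate(name, version)
        conn.execute(
            "DELETE FROM changed_releases WHERE name = ? AND version = ?",
            (name, version),
        )
    return rows
```

Several changes to the same release within a few seconds then cost only
one regeneration, because the primary key keeps a single row per
name,version pair.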
Regards,
Martin
From renesd at gmail.com Sun Jul 8 07:36:21 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 15:36:21 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46907640.3010408@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
<46907640.3010408@v.loewis.de>
Message-ID: <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com>
hello,
It's less work to just look up when the last change was, rather than
make another table and store it - duplicating the data.
Cheers,
On 7/8/07, "Martin v. Löwis" wrote:
> > Next up I'm going to put a few functions into store.py. Ones to check
> > if a release has changed since a given date. Also one to see if any
> > changes at all have happened since a given date.
>
> Is this really necessary? I think it would be sufficient to have a table
> of name,version pairs that list the releases that have changed. This
> table is filled on modification, and cleared by the regeneration.
>
> Regards,
> Martin
>
From renesd at gmail.com Sun Jul 8 09:46:18 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 17:46:18 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
<46907640.3010408@v.loewis.de>
<64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com>
Message-ID: <64ddb72c0707080046j4c1a2566s7cf6ae5cba0ad9c6@mail.gmail.com>
Hi,
here's another update:
http://rene.f0o.com/~rene/stuff/pypi/pypi-static-generation.py
Now you can also create all of the releases listed on the "/pypi/" url.
python pypi-static-generation.py -create_all
It still doesn't do date checking yet. I'll probably get around to
that tomorrow.
so it creates these files and directories:
/pypi/Pygame/index.html
/pypi/Pygame/1.7.1/index.html
So these urls can use the static files:
/pypi/Pygame/
/pypi/Pygame
/pypi/Pygame/1.7.1
/pypi/Pygame/1.7.1/
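A rough sketch of that URL-to-file mapping, assuming a hypothetical
document root of /var/www/static and a made-up helper name; the real
script may lay the files out differently:

```python
import posixpath


def static_file_for(url_path, root="/var/www/static"):
    """Map a request path to the index.html that would serve it.

    Both "/pypi/Pygame" and "/pypi/Pygame/" map to the same file, so the
    trailing slash does not matter.
    """
    parts = [part for part in url_path.split("/") if part]
    # Refuse path components that would escape the document root.
    if any(part in (".", "..") for part in parts):
        raise ValueError("unsafe path: %r" % (url_path,))
    return posixpath.join(root, *parts, "index.html")
```

For example, static_file_for("/pypi/Pygame/1.7.1/") and
static_file_for("/pypi/Pygame/1.7.1") both name the same index.html.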
It took about 20 minutes to generate all of them on my Ye Olde p3
256MB ram, laptop HD computer.
Cheers,
From jim at zope.com Sun Jul 8 14:14:27 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 8 Jul 2007 08:14:27 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <468FF69B.2090503@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
Message-ID: <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
On Jul 7, 2007, at 4:24 PM, Martin v. Löwis wrote:
> Jim Fulton schrieb:
>> ...
>>> I'm quite skeptical on caching in general (even about the static
>>> page
>>> generation). It *should* be possible to make it fast enough so that
>>> it doesn't need caching.
>>
>> Sure, with more hardware than we want to afford.
>
> So you are saying it's not fast enough already?
Uh, yeah. That's what this whole thread has been about. *Maybe* all
your efforts will make it fast enough. I'm skeptical though. Also
understand that now that we're using the cheeseshop to support
automated builds, the load will increase a lot over time.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Sun Jul 8 18:07:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 18:07:27 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
Message-ID: <46910BBF.3010308@v.loewis.de>
>> So you are saying it's not fast enough already?
>
> Uh, yeah.
Can you please be more precise, then? What kind of operation are
you performing, how long does it take, and how long should it
take so that you would consider it fast enough?
It's difficult to implement a system if the requirements are
unknown to those implementing it.
Regards,
Martin
From pje at telecommunity.com Sun Jul 8 19:27:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 08 Jul 2007 13:27:56 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
Message-ID: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote:
>On Jul 7, 2007, at 12:43 PM, Martin v. Löwis wrote:
>...
> > I'm quite skeptical on caching in general (even about the static page
> > generation). It *should* be possible to make it fast enough so that
> > it doesn't need caching.
>
>Sure, with more hardware than we want to afford.
>
> > I consider caching a work-around, not a
> > solution - and one with severe drawbacks.
>
>The pages we're talking about are static. They change at well-known
>times. IMO, It's crazy to serve static content dynamically when it's
>easy to serve it statically.
If they're effectively static, why can't Apache cache them? Shouldn't
we be able to simply add Last-Modified/If-Modified-Since support to
the PyPI output, and enable Apache's disk caching for non-logged-in
users?
That is, as long as there is a quick last-modified-time query for a
package, we can use that to process the If-Modified-Since header. The
modification time could even be memcached, so as not to need a
database hit 99% of the time.
While that's not necessarily as fast as static page generation, it's
a lot less complex to get right, and it saves the main piece of CPU
load: i.e., doing SQL queries and actually generating the page.
Pages that pertain to more than one package might be a bit more
complex to do this on, but if I understand correctly it's mainly the
package-specific pages we're concerned with here, correct? Even so,
it's possible to have any updates also update a global "something's
changed" time, and use that time as the Last-Modified of those pages.
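A rough sketch of the If-Modified-Since check suggested above. It
assumes the web server hands the request header to the script as the
HTTP_IF_MODIFIED_SINCE environment variable (the usual CGI naming for
request headers, not something confirmed for this setup), and that
`last_modified` comes from some quick per-package query.
`conditional_response` is a made-up helper name:

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(environ, last_modified):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` must be a timezone-aware datetime. HTTP dates have
    one-second resolution, so microseconds are dropped before comparing.
    """
    last_modified = last_modified.replace(microsecond=0)
    headers = [
        ("Last-Modified", formatdate(last_modified.timestamp(), usegmt=True))
    ]
    raw = environ.get("HTTP_IF_MODIFIED_SINCE")
    if raw:
        try:
            since = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full response
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if since is not None and last_modified <= since:
            # Nothing changed: no SQL beyond the timestamp, no template run.
            return "304 Not Modified", headers
    return "200 OK", headers
```

Only the 200 branch would go on to query the database and render the
page; the 304 branch sends headers alone.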
From martin at v.loewis.de Sun Jul 8 19:37:24 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 19:37:24 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<20070708172544.8D2763A404D@sparrow.telecommunity.com>
Message-ID: <469120D4.60909@v.loewis.de>
> If they're effectively static, why can't Apache cache them?
That's easy to answer: nobody told Apache to do that
(and I don't know how to tell it to).
René's approach currently is to generate the files explicitly
on disk, and then have Apache return them always from disk.
> Shouldn't
> we be able to simply add Last-Modified/If-Modified support to the PyPI
> output, and enable Apache's disk caching for non-logged-in users?
How precisely would that work? I.e. what software should put what
header into what place, and how would the cache then find out that
the real data have changed?
> While that's not necessarily as fast as static page generation, it's a
> lot less complex to get right, and it saves the main piece of CPU load:
> i.e., doing SQL queries and actually generating the page.
I'm not convinced yet that this is where the time is spent (seeing
actual profiling data would convince me). I have learned to never
ever guess what precisely is consuming cycles in a piece of software.
> Pages that pertain to more than one package might be a bit more complex
> to do this on, but if I understand correctly it's mainly the
> package-specific pages we're concerned with here, correct?
I'm not convinced of that, either.
Regards,
Martin
From renesd at gmail.com Sun Jul 8 19:47:17 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 03:47:17 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<20070708172544.8D2763A404D@sparrow.telecommunity.com>
Message-ID: <64ddb72c0707081047i1f4209e0j1584c1c2d6863bc5@mail.gmail.com>
Hi,
turning on caching is the plan as well, but after the static files.
See my earlier emails on the subject.
However static pages have their uses too, and are a bit faster than
the cached ones.
From renesd at gmail.com Sun Jul 8 19:50:07 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 03:50:07 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469120D4.60909@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<20070708172544.8D2763A404D@sparrow.telecommunity.com>
<469120D4.60909@v.loewis.de>
Message-ID: <64ddb72c0707081050l55c8beakbc241e5ac94ed7d7@mail.gmail.com>
On 7/9/07, "Martin v. Löwis" wrote:
> > If they're effectively static, why can't Apache cache them?
>
> That's easy to answer: nobody told Apache to do that
> (and I don't know how to tell it to).
>
> René's approach currently is to generate the files explicitly
> on disk, and then have Apache return them always from disk.
Yeah, have apache return from disk if not logged in. Also if the
static file is not there, then it generates the page dynamically.
From pje at telecommunity.com Sun Jul 8 21:33:36 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 08 Jul 2007 15:33:36 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469120D4.60909@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<20070708172544.8D2763A404D@sparrow.telecommunity.com>
<469120D4.60909@v.loewis.de>
Message-ID: <20070708193123.CCB803A404D@sparrow.telecommunity.com>
At 07:37 PM 7/8/2007 +0200, Martin v. Löwis wrote:
> > If they're effectively static, why can't Apache cache them?
>
>That's easy to answer: nobody told Apache to do that
>(and I don't know how to tell it to).
>
>René's approach currently is to generate the files explicitly
>on disk, and then have Apache return them always from disk.
>
> > Shouldn't
> > we be able to simply add Last-Modified/If-Modified support to the PyPI
> > output, and enable Apache's disk caching for non-logged-in users?
>
>How precisely would that work? I.e. what software should put what
>header into what place, and how would the cache then find out that
>the real data have changed?
I was under the impression that when Apache caching is enabled, it
can add an If-Modified-Since header to incoming requests, and in the
event that the dynamic content hasn't changed, use its cached version
of the response. I am not an expert on this, however.
If it does do this, then PyPI would check for an If-Modified-Since
header and compare it to the modified date for the page, and return a
"not changed" response if appropriate.
> > While that's not necessarily as fast as static page generation, it's a
> > lot less complex to get right, and it saves the main piece of CPU load:
> > i.e., doing SQL queries and actually generating the page.
>
>I'm not convinced yet that this is where the time is spent (seeing
>actual profiling data would convince me).
I thought Rene' had done such profiling, as he said it was the
templates that were taking most of the CPU.
> > Pages that pertain to more than one package might be a bit more complex
> > to do this on, but if I understand correctly it's mainly the
> > package-specific pages we're concerned with here, correct?
>
>I'm not convinced of that, either.
Well, I thought those were the ones we were caching.
It may be that I'm making too many assumptions, but if those
assumptions are correct, then the whole thing gets a lot easier to
prove correct, compared to a static cache, due to fewer moving
parts. If most CPU time is spent rendering package-specific pages,
then this approach would fix the problem using the fewest changed
parts and extra code to maintain.
From martin at v.loewis.de Sun Jul 8 21:34:00 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 21:34:00 +0200
Subject: [Catalog-sig] ZPT template caching
Message-ID: <46913C28.4060903@v.loewis.de>
I just added template caching to PyPI: rather than parsing
a page template on each request, it caches the templates, and
later renders a pre-parsed one. According to my measurements,
this should reduce the number of Python function calls needed
to render a page noticeably.
As a side effect, Apache needs to be restarted when a template
changes (this was already the case for code changes).
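A minimal sketch of the idea, not the actual PyPI change: `parse`
stands for whatever callable turns a template file into a parsed
template object, and `TemplateCache` is a made-up name.

```python
class TemplateCache:
    """Parse each template once and reuse the parsed object afterwards."""

    def __init__(self, parse):
        self._parse = parse      # hypothetical parser callable
        self._templates = {}     # template name -> parsed template

    def get(self, name):
        # Only the first request for a template pays the parsing cost.
        if name not in self._templates:
            self._templates[name] = self._parse(name)
        return self._templates[name]
```

Because nothing ever evicts an entry, an edited template is not noticed
until the process restarts, which matches the side effect described
above.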
Regards,
Martin
From martin at v.loewis.de Sun Jul 8 22:00:44 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 22:00:44 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070708193123.CCB803A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<20070708172544.8D2763A404D@sparrow.telecommunity.com>
<469120D4.60909@v.loewis.de>
<20070708193123.CCB803A404D@sparrow.telecommunity.com>
Message-ID: <4691426C.7030501@v.loewis.de>
> I was under the impression that when Apache caching is enabled, it can
> add an If-Modified-Since header to incoming requests, and in the event
> that the dynamic content hasn't changed, use its cached version of the
> response. I am not an expert on this, however.
Where would it add that? The (F)CGI script doesn't see any headers,
except for those communicated in environment variables. AFAICT,
there is none for if-modified-since.
If you were thinking of mod_cache: it will expire entries after
CacheDefaultExpire (default 1h), unless an Expires or Last-Modified
header is in the original response. In the latter case,
CacheLastModifiedFactor is used to determine an expiry period
(default 10% since last-modified).
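As arithmetic, the expiry rule just described looks roughly like this. It is a simplified model using only the two directives named above with the defaults quoted (1h and 10%); the function and parameter names are made up for illustration, and other mod_cache settings are left out.

```python
def freshness_lifetime(age_since_modified=None, expires_in=None,
                       default_expire=3600, lm_factor=0.1):
    """Seconds a cached entry is treated as fresh (simplified model).

    expires_in: seconds until the Expires header's time, if one was sent.
    age_since_modified: seconds between Last-Modified and the response
    time, if a Last-Modified header was sent.
    """
    if expires_in is not None:
        return expires_in
    if age_since_modified is not None:
        # CacheLastModifiedFactor: a fraction of the time since last change.
        return age_since_modified * lm_factor
    # Neither header present: CacheDefaultExpire applies.
    return default_expire


# A page last changed 5 hours ago would be served from cache for ~30 minutes.
five_hours = freshness_lifetime(age_since_modified=5 * 3600)
```

So a page that carries only Last-Modified can still be served stale for a while after a new upload.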
>> I'm not convinced yet that this is where the time is spent (seeing
>> actual profiling data would convince me).
>
> I thought Rene' had done such profiling, as he said it was the templates
> that were taking most of the CPU.
I saw that he said that it's taking most of the CPU; however, he didn't
say he did profiling.
I now did, and found that the parsing of the templates takes some time,
so it now caches the parsed templates.
>> > Pages that pertain to more than one package might be a bit more complex
>> > to do this on, but if I understand correctly it's mainly the
>> > package-specific pages we're concerned with here, correct?
>>
>> I'm not convinced of that, either.
>
> Well, I thought those were the ones we were caching.
Not "were caching", but "going to cache". As I said before, I'm
unconvinced that this is where the load goes; as a consequence,
I'm unconvinced that generating static pages will improve things.
Of course, if Rene completes this project, and the static
pages don't actually break anything, it shouldn't hurt to use them;
then we will see what the saving is (there surely will be *some*
saving, and it might be that those who complain about the performance
most will see a performance increase assuming that they are primarily
interested in the static pages).
> It may be that I'm making too many assumptions, but if those assumptions
> are correct, then the whole thing gets a lot easier to prove correct,
> compared to a static cache, due to fewer moving parts. If most CPU time
> is spent rendering package-specific pages, then this approach would fix
> the problem using the fewest changed parts and extra code to maintain.
My biggest concern is whether there can be a reliable computation of
"has this changed". If that predicate gives an incorrect response,
it doesn't matter much whether Apache does its own caching, or whether
the static pages fail to be regenerated.
Regards,
Martin
From jafo at tummy.com Mon Jul 9 06:50:38 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sun, 8 Jul 2007 22:50:38 -0600
Subject: [Catalog-sig] ZPT template caching
In-Reply-To: <46913C28.4060903@v.loewis.de>
References: <46913C28.4060903@v.loewis.de>
Message-ID: <20070709045038.GA12464@tummy.com>
On Sun, Jul 08, 2007 at 09:34:00PM +0200, "Martin v. Löwis" wrote:
>As a side effect, Apache needs to be restarted when a template
>changes (this was already the case for code changes).
The way I cache our site, I put the cache into memcached, so that the cache
is shared among all apaches, ages out old stuff, and when I update
something I just tell memcached to invalidate everything in its cache, no
Apache restart necessary. I *DO* need to restart it if I make code
changes, but not template changes.
Thanks,
Sean
--
If not actually disgruntled, he was far from being gruntled.
-- P. G. Wodehouse
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
From martin at v.loewis.de Mon Jul 9 07:08:15 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 09 Jul 2007 07:08:15 +0200
Subject: [Catalog-sig] ZPT template caching
In-Reply-To: <20070709045038.GA12464@tummy.com>
References: <46913C28.4060903@v.loewis.de> <20070709045038.GA12464@tummy.com>
Message-ID: <4691C2BF.1060901@v.loewis.de>
> The way I cache our site, I put the cache into memcached, so that the cache
> is shared among all apaches, ages out old stuff, and when I update
> something I just tell memcached to invalidate everything in its cache, no
> Apache restart necessary. I *DO* need to restart it if I make code
> changes, but not template changes.
How can I put parsed zope templates into memcached?
Regards,
Martin
From jafo at tummy.com Mon Jul 9 07:25:57 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sun, 8 Jul 2007 23:25:57 -0600
Subject: [Catalog-sig] ZPT template caching
In-Reply-To: <4691C2BF.1060901@v.loewis.de>
References: <46913C28.4060903@v.loewis.de> <20070709045038.GA12464@tummy.com>
<4691C2BF.1060901@v.loewis.de>
Message-ID: <20070709052557.GD5041@tummy.com>
On Mon, Jul 09, 2007 at 07:08:15AM +0200, "Martin v. Löwis" wrote:
>How can I put parsed zope templates into memcached?
I have no idea. I do it by caching the results, which for my application
is all I really care about and don't vary from request to request unless
the data or template has changed, or it's a different day.
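A rough sketch of that kind of result caching: the rendered output is keyed on everything it depends on (data version, template version, and the date), so a change in any of them simply misses the cache. The dict stands in for a memcached client, and all names here are hypothetical.

```python
import datetime

_cache = {}  # stand-in for a memcached client (assumption for this sketch)


def cached_render(page, data_version, template_version, render, today=None):
    """Return rendered output, re-rendering only when the key changes."""
    today = today or datetime.date.today()
    key = (page, data_version, template_version, today.isoformat())
    if key not in _cache:
        _cache[key] = render()
    return _cache[key]


def invalidate_all():
    """Equivalent of telling the cache to drop everything it holds."""
    _cache.clear()
```

This caches finished pages, not parsed templates, so it does not answer the question of how to store a parsed template object.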
Sean
--
Examine what is said, not who speaks. (Arabian Proverb)
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
Back off man. I'm a scientist. http://HackingSociety.org/
From jim at zope.com Mon Jul 9 15:49:32 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 09:49:32 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Message-ID: <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
What Martin said :), and:
On Jul 7, 2007, at 11:14 PM, René Dudfield wrote:
...
> Logged in users will not see the static file anyway - since they are
> logged in, they get to see the dynamically generated stuff.
Here's a common use case:
- A user uploads a new release
- They then use setuptools to install the release from PyPI.
setuptools will not present their credentials and will therefore
behave like a logged in user. It will see and install an older
version of the package.
This will be very mysterious and annoying to the user that just
uploaded the release.
> Imagine this case:
> 2-3 users are updating their packages, at a similar time. The main
> index then gets regenerated 3 times, rather than once.
Who cares. That's one page that we get dynamically now.
> The more
> people who are changing things the more this method works. If there
> are 20 people changing things at the same time, then there is still
> only one update of the main index page. However since the cheeseshop
> only gets updated about 6 times daily, event based is probably better
> for the moment.
Yup.
> Anyway... I'm just making the tool which can be used on demand, or at
> regular timings.
I wonder if we are talking about the same thing here. I fear not.
With event based update, you should only update the pages that need
to be updated, at worst, this should be the pages for the project
being updated plus http://www.python.org/pypi/. The software needed
for this would be very different than the software that would build
the static pages initially or update all if a template has changed.
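To make the distinction concrete, here is a small sketch of which pages each kind of event would touch. The function name and the URL layout ("/pypi/" plus one page per project) are assumptions for illustration; a real project may have more than one page.

```python
def pages_to_regenerate(project=None, template_changed=False, all_projects=()):
    """Return the URL paths that need rebuilding for one event."""
    index = "/pypi/"
    if template_changed:
        # A template change invalidates every static page.
        return [index] + ["/pypi/%s" % name for name in sorted(all_projects)]
    # A release upload touches only that project's page and the main index.
    return [index, "/pypi/%s" % project]
```

For example, `pages_to_regenerate("zc.buildout")` yields just two paths, while a template change walks the whole project list.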
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Mon Jul 9 16:09:37 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 10:09:37 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46910BBF.3010308@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
Message-ID:
On Jul 8, 2007, at 12:07 PM, Martin v. Löwis wrote:
>>> So you are saying it's not fast enough already?
>>
>> Uh, yeah.
>
> Can you please be more precise, then? What kind of operation are
> you performing,
I'm using setuptools. Setuptools looks at package pages (e.g.
http://www.python.org/pypi/foobar), it looks at
http://www.python.org/pypi/, and it downloads distributions. (AFAICT,
the latter is done dynamically too, which
is especially insane.)
> how long does it take,
Lately, it has often taken minutes. This has been the major
problem. At the best of times, well, I don't know when those are. :)
ATM, requests for http://www.python.org/pypi/zc.buildout take about
1/3 second. Requests for http://cheeseshop.python.org/packages/2.5/z/
zc.buildout/zc.buildout-1.0.0b28-py2.5.egg take about 2.5 seconds.
Requests for http://www.python.org/pypi/ take about 10 seconds.
I would say that these times are too long.
> and how long should it
> take so that you would consider it fast enough?
IMO, it needs to be much much faster. If we were serving pages
statically, we would be able to serve thousands of requests per
second. There's nothing about this application that would make doing
that hard.
> It's difficult to implement a system if the requirements are
> unknown to those implementing it.
I'm sorry, I've been talking about setuptools all along. I thought
the use case was understood. Also, I thought it was pretty obvious
that the performance we've been seeing lately is totally
unacceptable. It's hard to pinpoint exactly what the acceptable
performance will be, in part because, if we do better, demand will
increase. Note that, as it is now, demand is possibly decreasing
because people are building their own indexes.
If this was an application that had to be served dynamically (and of
course, parts of it are), then it would be much more interesting to
discuss targets for dynamic delivery. The performance-critical parts
of this application -- the pages that setuptools uses, can readily be
served statically, so it makes no sense not to do so.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Mon Jul 9 16:21:23 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 10:21:23 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<20070708172544.8D2763A404D@sparrow.telecommunity.com>
Message-ID: <437CFE1D-125A-4856-936E-27FC688B57BA@zope.com>
On Jul 8, 2007, at 1:27 PM, Phillip J. Eby wrote:
> At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote:
>
>> On Jul 7, 2007, at 12:43 PM, Martin v. Löwis wrote:
>> ...
>> > I'm quite skeptical on caching in general (even about the static
>> page
>> > generation). It *should* be possible to make it fast enough so that
>> > it doesn't need caching.
>>
>> Sure, with more hardware than we want to afford.
>>
>> > I consider caching a work-around, not a
>> > solution - and one with severe drawbacks.
>>
>> The pages we're talking about are static. They change at well-known
>> times. IMO, It's crazy to serve static content dynamically when it's
>> easy to serve it statically.
>
> If they're effectively static, why can't Apache cache them?
> Shouldn't we be able to simply add Last-Modified/If-Modified
> support to the PyPI output, and enable Apache's disk caching for
> non-logged-in users?
When caching something, you typically specify an age before you start
checking. That means that content would be stale for that period.
Sometimes, that is both acceptable and necessary. In any case,
dynamic servers typically take just as long to handle an If-Modified
or Last-Modified request as they do to handle a regular request. It
would be just as complicated, if not more so, to get the cheeseshop
software to do this properly as it would be to just bake the pages.
> That is, as long as there is a quick last-modified-time query for a
> package, we can use those to process the If-Modified header. The
> modification time could even be memcached, so as not to need a
> database hit 99% of the time.
No, it can't be cached. What would you do to make sure that cache
wasn't stale?
> While that's not necessarily as fast as static page generation,
> it's a lot less complex to get right, and it saves the main piece
> of CPU load: i.e., doing SQL queries and actually generating the page.
It is really easy to get static page generation right for an
application this simple. You know when pages are invalidated. The
page relationships are not at all complicated here.
> Pages that pertain to more than one package might be a bit more
> complex to do this on, but if I understand correctly it's mainly
> the package-specific pages we're concerned with here, correct?
Yes, and http://www.python.org/pypi/
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From renesd at gmail.com Mon Jul 9 17:13:48 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 19:13:48 +0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
Message-ID: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Hello Jim,
I double+ agree we should update on change.
On 7/9/07, Jim Fulton wrote:
> Here's a common use case:
>
> - A user uploads a new release
>
> - They then use setuptools to install the release from PyPI.
> setuptools will not present their credentials and will therefore
> behave like a logged in user. It will see and install an older
> version of the package.
>
You mean it will behave like someone *not* logged in right? Either
way they should always get the latest change.
To do this atomically, so that no one can possibly get an old
page, the static file will be removed as the change is committed.
Then everyone gets the latest change right away - as soon as the
change has been committed.
> > Anyway... I'm just making the tool which can be used on demand, or at
> > regular timings.
>
> I wonder if we are talking about the same thing here. I fear not.
> With event based update, you should only update the pages that need
> to be updated, at worst, this should be the pages for the project
> being updated plus http://www.python.org/pypi/. The software needed
> for this would be very different than the software that would build
> the static pages initially or update all if a template has changed.
>
These are the commands so far:
python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html
python pypi-static-generation.py -create_all
The generation of the main index page would be:
python pypi-static-generation.py -create_single /pypi/
path_to_static_indexpage.html
Then there would be a command to update the single page:
python pypi-static-generation.py -create_single /pypi/Pygame
path_to_static_pygame.html
Ok, that's all for now. I'll be able to finish it off in a few days
after europython.
Cheers,
From renesd at gmail.com Mon Jul 9 17:19:52 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 19:19:52 +0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
Message-ID: <64ddb72c0707090819l2382c8cu619a85ec0d3464dc@mail.gmail.com>
On 7/9/07, Jim Fulton wrote:
> ATM, requests for http://www.python.org/pypi/zc.buildout take about
> 1/3 second. Requests for http://cheeseshop.python.org/packages/2.5/z/
> zc.buildout/zc.buildout-1.0.0b28-py2.5.egg take about 2.5 seconds.
> Requests for http://www.python.org/pypi/ take about 10 seconds.
>
> I would say that these times are too long.
>
Hi again,
Just a note: the static pages served through the mod_rewrite logic go
pretty quickly. So both those pages can be served at 1000s of
requests per second.
Cheers,
From jim at zope.com Mon Jul 9 18:27:08 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 12:27:08 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Message-ID: <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com>
On Jul 9, 2007, at 11:13 AM, René Dudfield wrote:
> Hello Jim,
>
> I double+ agree we should update on change.
Yay! :)
> On 7/9/07, Jim Fulton wrote:
>> Here's a common use case:
>>
>> - A user uploads a new release
>>
>> - They then use setuptools to install the release from PyPI.
>> setuptools will not present their credentials and will therefore
>> behave like a logged in user. It will see and install an older
>> version of the package.
>>
>
> You mean it will behave like someone *not* logged in right?
Right.
> Either
> way they should always get the latest change.
Yes, if we update the static on change.
I thought you were arguing that it didn't matter if cached pages were
out of date because the person updating the pages would see the
changes because they'd see uncached pages.
> The way to do this atomically, so not one can possibly get an old
> page, the static file will be removed as the change is committed.
> Then everyone gets the latest change right away - as soon as the
> change has been committed.
Sure.
>
>> > Anyway... I'm just making the tool which can be used on demand,
>> or at
>> > regular timings.
>>
>> I wonder if we are talking about the same thing here. I fear not.
>> With event based update, you should only update the pages that need
>> to be updated, at worst, this should be the pages for the project
>> being updated plus http://www.python.org/pypi/. The software needed
>> for this would be very different than the software that would build
>> the static pages initially or update all if a template has changed.
>>
>
>
> These are the commands so far:
> python pypi-static-generation.py -create_single /pypi/pygame /tmp/
> pygame.html
> python pypi-static-generation.py -create_all
Ah, so one script, 2 behaviors. Fair enough.
> The generation of the main index page would be:
> python pypi-static-generation.py -create_single /pypi/
> path_to_static_indexpage.html
>
> Then there would be a command to update the single page:
> python pypi-static-generation.py -create_single /pypi/Pygame
> path_to_static_pygame.html
Shouldn't that be implied by both of the commands above?
I'm a little surprised that you are doing this as an external script,
as opposed to adding the behavior to the cheeseshop code, but I guess
it doesn't matter.
> Ok, that's all for now. I'll be able to finish it off in a few days
> after europython.
Haven't you been able to get anyone to sprint with you on it there?
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Mon Jul 9 18:44:45 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 12:44:45 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Message-ID: <20070709164232.95EED3A404D@sparrow.telecommunity.com>
At 07:13 PM 7/9/2007 +0400, René Dudfield wrote:
>The way to do this atomically, so not one can possibly get an old
>page, the static file will be removed as the change is committed.
>Then everyone gets the latest change right away - as soon as the
>change has been committed.
This sounds pretty good... except that you may need better
protection against a race condition. What happens if a page is
removed *while* it is being regenerated? PostgreSQL has MVCC for
read-only transactions, so the static page will be generated against
old data, unless you have some other locking mechanism used to
serialize access to the static file, that is shared by both the
deletion and generating mechanisms.
One possible approach: if the generator writes its files to
foo/index.html.tmp (opened with exclusive access) and then renames
them to 'foo/index.html', then the deletion mechanism can attempt to
*first* remove the .tmp file, then the real file. Both processes
must be robust against their renames or unlinks or exclusive open()'s
failing, but there would then be no possibility of collision. The
exclusive open would have to be done at the *start* of write
processing, however, before any database queries have been
attempted. (And their connection must be rolled back at that
point.) This ensures that, if a writer succeeds in locking the .tmp
file, then they are seeing data that is current.
All that having been said, the idea in general sounds good. If PyPI
itself simply checked whether the URL it's about to serve is
cacheable (i.e., has a static location and no user logged in), and if
so, opened the temp file for exclusive writing, it could just dump
its generated page out, and rename it at the end if it had been
successful in acquiring the temp file.
And voila! No separate caching process, no scheduling, and an always
perfectly-up-to-date cache. As soon as a page becomes out of date,
it gets served dynamically... but only for as long as it takes to
serve one copy of that page. :)
In pseudocode:
def process_request():
    if no authentication header and URL path is cacheable:
        try:
            temp = exclusive open cache file with .tmp extension
        except os.error:
            pass
        else:
            with stdout redirected to temp:
                process_request_normally()
            try:
                rename(tempfilename, realfilename)
            except os.error:
                pass
            send_browser_contents_of(temp)
            return
    return process_request_normally()
Here, 'process_request_normally()' should refer to everything that
PyPI does now, *including database connection rollback or
commit*. This will ensure that it's impossible to write stale data
to the cache.
The deletion process should just do this:
for name in (cache_path+'.tmp', cache_path):
    try:
        os.unlink(name)
    except os.error:
        pass
after committing the database transaction.
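For concreteness, here is a runnable sketch of the two pieces above. It is a close transcription of the pseudocode, not a hardened implementation: `serve_cached`, `invalidate`, and the `render` callable are hypothetical names, and `render` is assumed to roll back and start a fresh database transaction itself before reading.

```python
import os
import tempfile


def serve_cached(cache_path, render):
    """Render a page; publish it to the cache if we hold the .tmp file.

    `render` is a hypothetical callable returning the page as bytes. It is
    assumed to begin a fresh database transaction, so that it reads current
    data once the .tmp file has been acquired.
    """
    tmp_path = cache_path + ".tmp"
    try:
        # O_EXCL makes the open fail if another writer already has the file.
        fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except OSError:
        return render()  # no lock: just serve the page dynamically
    with os.fdopen(fd, "wb") as temp:
        body = render()
        temp.write(body)
    try:
        os.rename(tmp_path, cache_path)  # publish the finished page
    except OSError:
        pass  # the .tmp file was unlinked by an invalidation; skip caching
    return body


def invalidate(cache_path):
    """Call after the database commit: drop the .tmp file, then the page."""
    for name in (cache_path + ".tmp", cache_path):
        try:
            os.unlink(name)
        except OSError:
            pass  # nothing to remove


# Small demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
page = os.path.join(demo_dir, "index.html")
first = serve_cached(page, lambda: b"release 1.0")
cached_after_first = os.path.exists(page)
invalidate(page)
cached_after_invalidate = os.path.exists(page)
```

If `render` raises, this sketch leaves the .tmp file behind, which keeps the page uncached until the next invalidation removes it.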
Informal serialization proof:
* Only one process may write to a page's .tmp file at a time
* Either the writer has committed its page write (by renaming the
.tmp file), or it has not (i.e., rename() is atomic)
* If the writer has *not* committed its page, then the first unlink
will prevent it from doing so.
* If the writer *has* committed its page, then the second unlink will
undo this.
* If between the two unlink operations, another writer appears, that
writer will be reading current data from the database, because it has
to acquire exclusive access to the .tmp file before doing a rollback
and reading the data it will use for writing.
QED, it will be impossible to have stale data in the cache, unless
the invalidating request fails to attempt its two unlink operations
during the brief window after its database commit.
From renesd at gmail.com Mon Jul 9 19:15:19 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 03:15:19 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070709164232.95EED3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
<20070709164232.95EED3A404D@sparrow.telecommunity.com>
Message-ID: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
On 7/10/07, Phillip J. Eby wrote:
> At 07:13 PM 7/9/2007 +0400, René Dudfield wrote:
> >The way to do this atomically, so not one can possibly get an old
> >page, the static file will be removed as the change is committed.
> >Then everyone gets the latest change right away - as soon as the
> >change has been committed.
>
> This sounds pretty good... except that you may need better
> protection against a race condition. What happens if a page is
> removed *while* it is being regenerated? PostgreSQL has MVCC for
> read-only transactions, so the static page will be generated against
> old data, unless you have some other locking mechanism used to
> serialize access to the static file, that is shared by both the
> deletion and generating mechanisms.
>
Hi,
move in linux/unix is atomic. So the file is generated and then moved
in. unlink is similar... once you remove it, any processes with that
file open still references the old file.
So no race condition.
def the static generation:
- generate file in temp file
- move temp file to place where static file lives.
def the update code:
- do inserts/updates/deletes.
- remove static files.
- commit change.
- the static generation()
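The "generate in temp file, then move" step might look like the sketch below. The function name is made up for illustration, and the temp file is created in the target directory on the assumption that the move is only atomic within one filesystem. It covers just the atomic publication of a single page; it does not by itself order two updates that finish close together.

```python
import os
import tempfile


def publish_static_page(path, body):
    """Write `body` (bytes) to a temp file, then move it over `path`."""
    directory = os.path.dirname(path) or "."
    # Same directory as the target, so the move stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
        os.replace(tmp_path, path)  # readers see the old or the new file
    except BaseException:
        try:
            os.unlink(tmp_path)  # do not leave a half-written temp file
        except OSError:
            pass
        raise


# Small demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, "pygame.html")
publish_static_page(target, b"<html>old</html>")
publish_static_page(target, b"<html>new</html>")
with open(target, "rb") as f:
    final_body = f.read()
leftovers = [n for n in os.listdir(demo_dir) if n.endswith(".tmp")]
```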
From renesd at gmail.com Mon Jul 9 19:31:10 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 03:31:10 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
<34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com>
Message-ID: <64ddb72c0707091031m1fe5fccai12708e38fb547d79@mail.gmail.com>
No, I haven't found anyone yet. I'll write it up on the board, and
see if anyone wants to join in tomorrow - or maybe find someone at the
bar tonight.
Where do people report bugs for the cheeseshop/distutils? Someone was
telling me today that he couldn't get the setup.py to do new releases
anymore.
cu.
On 7/10/07, Jim Fulton wrote:
>
> On Jul 9, 2007, at 11:13 AM, René Dudfield wrote:
>
> > Hello Jim,
> >
> > I double+ agree we should update on change.
>
> Yay! :)
>
> > On 7/9/07, Jim Fulton wrote:
> >> Here's a common use case:
> >>
> >> - A user uploads a new release
> >>
> >> - They then use setuptools to install the release from PyPI.
> >> setuptools will not present their credentials and will therefore
> >> behave like a logged in user. It will see and install an older
> >> version of the package.
> >>
> >
> > You mean it will behave like someone *not* logged in right?
>
> Right.
>
> > Either
> > way they should always get the latest change.
>
> Yes, if we update the static on change.
>
> I thought you were arguing that it didn't matter if cached pages were
> out of date because the person updating the pages would see the
> changes because they'd see uncached pages.
>
> > The way to do this atomically, so not one can possibly get an old
> > page, the static file will be removed as the change is committed.
> > Then everyone gets the latest change right away - as soon as the
> > change has been committed.
>
> Sure.
>
>
> >
> >> > Anyway... I'm just making the tool which can be used on demand,
> >> or at
> >> > regular timings.
> >>
> >> I wonder if we are talking about the same thing here. I fear not.
> >> With event based update, you should only update the pages that need
> >> to be updated, at worst, this should be the pages for the project
> >> being updated plus http://www.python.org/pypi/. The software needed
> >> for this would be very different than the software that would build
> >> the static pages initially or update all if a template has changed.
> >>
> >
> >
> > These are the commands so far:
> > python pypi-static-generation.py -create_single /pypi/pygame /tmp/
> > pygame.html
> > python pypi-static-generation.py -create_all
>
> Ah, so one script, 2 behaviors. Fair enough.
>
>
> > The generation of the main index page would be:
> > python pypi-static-generation.py -create_single /pypi/
> > path_to_static_indexpage.html
> >
> > Then there would be a command to update the single page:
> > python pypi-static-generation.py -create_single /pypi/Pygame
> > path_to_static_pygame.html
>
> Shouldn't that be implied by both of the commands above.
>
> I'm a little surprised that you are doing this as an external script,
> as opposed to adding the behavior to the cheeseshop code, but I guess
> it doesn't matter.
>
> > Ok, that's all for now. I'll be able to finish it off in a few days
> > after europython.
>
> Haven't you been able to get anyone to sprint with you on it there?
>
> Jim
>
> --
> Jim Fulton mailto:jim at zope.com Python Powered!
> CTO (540) 361-1714 http://www.python.org
> Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Mon Jul 9 19:37:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 13:37:56 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
<20070709164232.95EED3A404D@sparrow.telecommunity.com>
<64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
Message-ID: <20070709173543.BC31B3A404D@sparrow.telecommunity.com>
At 03:15 AM 7/10/2007 +1000, René Dudfield wrote:
>def the static generation:
> - generate file in temp file
> - move temp file to place where static file lives.
>
>def the update code:
> - do inserts/updates/deletes.
> - remove static files.
> - commit change.
> - the static generation()
Ah - I was assuming static generation was going to be a separate process.
However, there's still a race condition here, unless you open the
temp file exclusively before the transaction commits. If you wait
until after the transaction is finished, another change could occur
to the same page after you, but finish its page write *before* you,
causing you to overwrite it with your move! You then end up with an
outdated page that will stick around indefinitely. (Yes, it's
unlikely, but it *can* happen, and therefore eventually will.)
So, as in my suggestion, you *still* need an exclusive open of a
pre-determined tempfile name, prior to transaction commit. Then,
such an occurrence is impossible.
By the way, the generate-on-change approach also means you have to do
a big batch run to pre-generate all the existing static pages; the
approach I suggested will simply generate them in response to actual
demand, with no batch processing necessary. A new PyPI installation
would just build up its cache as it gets used, getting faster as it goes.
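
A minimal sketch of the ordering described above, in present-day Python 3 rather than the Python 2 of this thread. The names open_exclusive, update_page, commit and render are hypothetical stand-ins, not part of the PyPI code. The only point illustrated is that the pre-determined temp file is opened exclusively before the transaction commits, so a second updater of the same page fails immediately instead of racing the final rename.

```python
import os


def open_exclusive(tmp_path):
    """Create tmp_path for writing; raise FileExistsError if it already exists."""
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    return os.fdopen(fd, "w", encoding="utf-8")


def update_page(static_path, commit, render):
    """Regenerate one static page around a database commit.

    commit and render are caller-supplied callables (assumptions for this
    sketch): commit() finishes the transaction, render() returns the new HTML.
    """
    tmp_path = static_path + ".tmp"   # pre-determined name, one per page
    tmp = open_exclusive(tmp_path)    # taken *before* the commit
    try:
        with tmp:
            commit()
            tmp.write(render())
        os.replace(tmp_path, static_path)  # atomic move into place
    finally:
        # Only reached when we own the temp file; clean up after a failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

What a second updater should do when it gets FileExistsError (wait, retry, or give up) is left open here, as it is in the message above.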
From martin at v.loewis.de Tue Jul 10 00:16:03 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 00:16:03 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
Message-ID: <4692B3A3.5030209@v.loewis.de>
> Lately, it has often taken minutes. This has been the major problem.
> At the best of times, well, I don't know when those are. :)
>
> ATM, requests for http://www.python.org/pypi/zc.buildout take about 1/3
> second.
Ok. By "ATM", you mean July 9, 14:09 GMT?
Please take a look at
http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html
That was the most significant spike in the load today, and I surely
would like to know what was causing it.
> Requests for
> http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg
> take about 2.5 seconds.
That is a static file, not going through PyPI. It's 168kiB, so that
means you download with 67kB/s.
> Requests for http://www.python.org/pypi/ take
> about 10 seconds.
Why does that matter for setuptools? Does setuptools ever look at this
page?
> I would say that these times are too long.
Which of these precisely? Given that the actual file downloads in 2.5s,
why is it important that the access to the page referring to it is 1/3s?
>> and how long should it
>> take so that you would consider it fast enough?
>
> IMO, it needs to be much much faster. If we were serving pages
> statically, we would be able to serve thousands of requests per
> second. There's nothing about this application that would make doing
> that hard.
I looked at the load preceding your message. Counting 1000 requests
backwards from 14:09, we are at 16:07. So this system receives roughly
1000 requests per minute in its peak load, and it seems to be able to
handle them (although the performance degrades at that point).
Of these requests, 853 came from a single machine (x.y.237.218), which
appears to be an extraordinarily "big" client of PyPI. 45 requests
came from msnbot, 13 from Google, 44 requests from setuptools (from
different machines), and the rest from various web browsers and
crawlers.
Also, there is a significant difference between throughput and latency:
1000 requests per second is a throughput requirement, whereas "faster
than 0.3s" is a latency requirement. They are somewhat unrelated, see
below.
>> It's difficult to implement a system if the requirements are
>> unknown to those implementing it.
>
> I'm sorry, I've been talking about setuptools all along. I thought the
> use case was understood.
I understand the use case, I just don't understand the performance
requirements resulting out of it. If it's an automated build, why do
you care if the page download completes in 0.3s or in 0.01s (it won't
be much faster because of network roundtrip times).
> Also, I thought it was pretty obvious that the
> performance we've been seeing lately is totally unacceptable.
Define "lately". I never personally saw "totally unacceptable
performance". Whenever I access the system, it behaves completely
reasonably, much faster than any other web pages.
There were only two instances of "totally unacceptable performance",
which were when the system was overloaded, and thrashing. I have
since fixed these cases; they cannot occur again. So I don't think
it is possible that the current installation shows "totally
unacceptable" performance.
> If this was an application that had to be served dynamically (and of
> course, parts of it are), then it would be much more interesting to
> discuss targets for dynamic delivery. The performance-critical parts of
> this application -- the pages that setuptools uses, can readily be
> served statically, so it makes no sense not to do so.
Except that somebody needs to implement that, of course.
Regards,
Martin
From martin at v.loewis.de Tue Jul 10 00:19:58 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 00:19:58 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Message-ID: <4692B48E.705@v.loewis.de>
> These are the commands so far:
> python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html
> python pypi-static-generation.py -create_all
That also needs -create_single /pypi/pywin32/210
Regards,
Martin
From martin at v.loewis.de Tue Jul 10 00:37:51 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 00:37:51 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> <20070709164232.95EED3A404D@sparrow.telecommunity.com>
<64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
Message-ID: <4692B8BF.60203@v.loewis.de>
> So no race condition.
What Phillip says: "the update code" has a race condition,
if multiple simultaneous updates occur.
My proposal is still to put a table into Postgres that lists
the pages to regenerate. The (single) update process would
lock this job table, clear it, release the lock, and start
generating; alternatively, multiple update processes would each
lock the table, generate, then release the lock.
Regards,
Martin
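
A rough sketch of the single-updater variant of this job-table idea. It uses Python 3's sqlite3 module purely as a stand-in so that it runs anywhere; the table name regen_jobs and the helper names are invented for illustration. In Postgres the analogous step would presumably be an explicit LOCK TABLE inside the transaction, where this sketch uses SQLite's BEGIN IMMEDIATE.

```python
import sqlite3


def make_queue():
    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN/COMMIT statements below are issued explicitly by this code.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE regen_jobs (path TEXT PRIMARY KEY)")
    return conn


def enqueue(conn, path):
    """Record that the static page for path must be regenerated."""
    conn.execute("INSERT OR IGNORE INTO regen_jobs (path) VALUES (?)", (path,))


def claim_jobs(conn):
    """Lock the job table, read it, clear it, release the lock."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        rows = conn.execute("SELECT path FROM regen_jobs ORDER BY path")
        paths = [row[0] for row in rows]
        conn.execute("DELETE FROM regen_jobs")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return paths  # generation happens after the lock is released
```

Because the paths are claimed and cleared in one transaction, an update that arrives during generation simply queues a fresh job for the next pass.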
From pje at telecommunity.com Tue Jul 10 02:34:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 20:34:26 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4692B3A3.5030209@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
Message-ID: <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
At 12:16 AM 7/10/2007 +0200, Martin v. Löwis wrote:
> > Requests for http://www.python.org/pypi/ take
> > about 10 seconds.
>
>Why does that matter for setuptools? Does setuptools ever look at this
>page?
Yes, in order to find the correct spelling for a package's name. If
a user types, say "pylons" when the package is listed on PyPI as
"Pylons", setuptools looks at the root after the lookup of
/pypi/pylons fails. This need could be eliminated if PyPI would
canonicalize package names case-insensitively, collapsing all
non-alphanumeric characters (other than '.') to a single '-'. i.e.:
import re

def safe_name(name):
    """Convert an arbitrary string to a standard distribution name.

    Any runs of non-alphanumeric/. characters are replaced with a single '-'.
    """
    return re.sub('[^A-Za-z0-9.]+', '-', name)
A case-insensitive match by safe_name would be ideal, and could also
be used to prevent users from registering packages whose names differ
only by case or punctuation.
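
To illustrate the case-insensitive match by safe_name suggested above, here is a small sketch in Python 3. safe_name is repeated so the block is self-contained; canonical_key and find_conflict are hypothetical helper names, not existing PyPI or setuptools functions.

```python
import re


def safe_name(name):
    """Replace runs of non-alphanumeric/. characters with a single '-'."""
    return re.sub('[^A-Za-z0-9.]+', '-', name)


def canonical_key(name):
    """Key under which names differing only by case or punctuation collide."""
    return safe_name(name).lower()


def find_conflict(new_name, existing_names):
    """Return the registered name that new_name collides with, or None."""
    key = canonical_key(new_name)
    for existing in existing_names:
        if canonical_key(existing) == key:
            return existing
    return None
```

The same lookup serves both purposes mentioned: resolving "pylons" to "Pylons" on a request, and refusing a registration whose key is already taken.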
From martin at v.loewis.de Tue Jul 10 07:33:46 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 07:33:46 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
Message-ID: <46931A3A.5000703@v.loewis.de>
> Yes, in order to find the correct spelling for a package's name. If a
> user types, say "pylons" when the package is listed on PyPI as "Pylons",
> setuptools looks at the root after the lookup of /pypi/pylons fails.
I don't understand. How does it help to look at /pypi in this case?
The right spelling of Pylons is not listed there, unless there was
a release of Pylons recently.
If you want to correct the spelling, you need to look at
http://cheeseshop.python.org/pypi?%3Aaction=index
> A case-insensitive match by safe_name would be ideal, and could also be
> used to prevent users from registering packages whose names differ only
> by case or punctuation.
Would it be acceptable to do an HTTP redirect in that case, ie.
redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5? I would not
want to have multiple URLs to render the same page, in general
(I know it already does that in some cases).
I can see how lower-casing helps; I'm doubtful about replacing
spaces. I.e. why is it better to look for
python-ftp-server-library--pyftpdlib-
than
Python FTP server library (pyftpdlib)
IOW, if you have a mis-spelling of the latter, what are the
chances that it is so misspelled that the safe_name is still
the former? Shouldn't the package owner just correct the
package name, to pyftpdlib, and put the other string into
the summary?
In any case, if it were postgres 8.1 or later, I could simply do
select name from packages where
regexp_replace(lower(name),'[^a-z0-9.]+','-','g')='gnosis-utilities';
to do the lookup; with 7.4, I would have to download all names
and do the safe matching myself.
Regards,
Martin
From martin at v.loewis.de Tue Jul 10 08:07:15 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 08:07:15 +0200
Subject: [Catalog-sig] Speeding up /pypi
Message-ID: <46932213.6050508@v.loewis.de>
I created a partial index (didn't know such a thing existed until
yesterday) to speed up the computation of the home page:
CREATE INDEX journals_latest_releases ON
journals(submitted_date, name, version)
WHERE version IS NOT NULL AND action='new release';
and reworked the query to let postgres actually use that index;
now I can get the Cheeseshop home page as fast as that of
www.python.org (namely, in 0.1s), as measured by
import time, urllib; start=time.time();x=urllib.urlopen("http://cheeseshop.python.org/pypi").read();print time.time()-start
Regards,
Martin
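
For readers unfamiliar with partial indexes, here is a self-contained sketch using Python 3's sqlite3 module, which accepts the same CREATE INDEX ... WHERE syntax. The journals column types and the SELECT are guesses made for illustration; the actual reworked PyPI query is not shown in this thread. The idea is that a query whose WHERE clause repeats the index predicate becomes eligible to use the much smaller index.

```python
import sqlite3


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    # Assumed minimal schema; the real journals table is not shown here.
    conn.execute(
        "CREATE TABLE journals ("
        "name TEXT, version TEXT, action TEXT, submitted_date TEXT)"
    )
    # Index only the rows the home page cares about.
    conn.execute(
        "CREATE INDEX journals_latest_releases ON "
        "journals(submitted_date, name, version) "
        "WHERE version IS NOT NULL AND action='new release'"
    )
    return conn


def latest_releases(conn, limit):
    """Newest releases first; the WHERE clause mirrors the index predicate."""
    return conn.execute(
        "SELECT name, version FROM journals "
        "WHERE version IS NOT NULL AND action='new release' "
        "ORDER BY submitted_date DESC LIMIT ?",
        (limit,),
    ).fetchall()
```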
From renesd at gmail.com Tue Jul 10 10:49:10 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 18:49:10 +1000
Subject: [Catalog-sig] Speeding up /pypi
In-Reply-To: <46932213.6050508@v.loewis.de>
References: <46932213.6050508@v.loewis.de>
Message-ID: <64ddb72c0707100149k46782c66m214b184447ab667b@mail.gmail.com>
nice one :)
On 7/10/07, "Martin v. Löwis" wrote:
> I created a partial index (didn't know such a thing existed until
> yesterday) to speed up the computation of the home page:
>
> CREATE INDEX journals_latest_releases ON
> journals(submitted_date, name, version)
> WHERE version IS NOT NULL AND action='new release';
>
> and reworked the query to let postgres actually use that index;
> now I can get the Cheeseshop home page as fast as that of
> www.python.org (namely, in 0.1s), as measured by
>
> import time, urllib; start=time.time();x=urllib.urlopen("http://cheeseshop.python.org/pypi").read();print time.time()-start
>
> Regards,
> Martin
From jim at zope.com Tue Jul 10 15:52:42 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 09:52:42 -0400
Subject: [Catalog-sig] Merge catalog and distutils sigs
Message-ID: <6C0A5EEC-7E01-4C25-BC09-E0B595C8109A@zope.com>
Is there a good reason for the distutils and catalog sigs to be
separate? Now that PyPI is an integral part of the distribution
system, I find most topics are really of interest to both sigs,
and I bet that the overlap between the sigs is significant.
Would anyone object to combining them?
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Tue Jul 10 16:15:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 10:15:05 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46931A3A.5000703@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
Message-ID: <20070710141304.BC6903A40A4@sparrow.telecommunity.com>
At 07:33 AM 7/10/2007 +0200, Martin v. Löwis wrote:
> > Yes, in order to find the correct spelling for a package's name. If a
> > user types, say "pylons" when the package is listed on PyPI as "Pylons",
> > setuptools looks at the root after the lookup of /pypi/pylons fails.
>
>I don't understand. How does it help to look at /pypi in this case?
It doesn't. It looks at /pypi/ (note the trailing /) -- which lists
all packages.
>The right spelling of Pylons is not listed there, unless there was
>a release of Pylons recently.
>
>If you want to correct the spelling, you need to look at
>
>http://cheeseshop.python.org/pypi?%3Aaction=index
Which is also spelled /pypi/ - the advantage of this is that a purely
static index consisting of Apache directory indexes produces an
equally useful result for setuptools.
> > A case-insensitive match by safe_name would be ideal, and could also be
> > used to prevent users from registering packages whose names differ only
> > by case or punctuation.
>
>Would it be acceptable to do an HTTP redirect in that case, ie.
>redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5?
Yes, although setuptools at the moment looks at /pypi/pylons/
(again, with a trailing /) and does not go to individual version
pages unless the base page contains only links to individual version pages.
It will handle a redirect correctly, as far as interpreting relative
links on result pages.
> I would not
>want to have multiple URLs to render the same page, in general
>(I know it already does that in some cases).
>
>I can see how lower-casing helps; I'm doubtful about replacing
>spaces. I.e. why is it better to look for
>
>python-ftp-server-library--pyftpdlib-
That '--' would actually just be one '-'
>than
>
>Python FTP server library (pyftpdlib)
It's not much better; however, there are a lot of packages with
shorter names for which it does help. Mainly, though, setuptools
just uses this for purposes of determining distribution filenames.
>IOW, if you have a mis-spelling of the latter, what are the
>chances that it is so misspelled that the safe_name is still
>the former? Shouldn't the package owner just correct the
>package name, to pyftpdlib, and put the other string into
>the summary?
>
>In any case, if it were postgres 8.1 or later, I could simply do
>
>select name from packages where
>regexp_replace(lower(name),'[^a-z0-9.]+','-','g')='gnosis-utilities';
>
>to do the lookup; with 7.4, I would have to download all names
>and do the safe matching myself.
I think this will work instead:
select name from packages where name ~* 'gnosis[^a-z0-9.]+utilities'
i.e., replace all '-' in the safe_name() with the appropriate
regex. '~*' is the case-insensitive regular expression match
operator, according to:
http://www.postgresql.org/docs/7.4/interactive/functions-matching.html
Of course, it may also suffice to do:
select lower(name) from packages where name like 'gnosis_%utilities'
i.e. replace all '-' in the safe_name with '_%', which is sort of
like '.+' in a regex. You would still have to postprocess the result
to catch the difference between say, "gnosis-utilities" and
"gnosis3utilities" or some such, but there should be very few such matches.
The "like" query may be easier for postgres to use an index on - an
expression index on lower(name) would do the trick. Of course, I'm
used to trying to optimize much larger databases than PyPI - with
only a few thousand entries, a non-index query here may be just fine.
In any case, this query should also be used to check for uniqueness
when adding packages.
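
A sketch, in Python 3, of how the two query patterns above could be derived from a safe_name and how the postprocessing step might look. regex_for, like_for and postfilter are hypothetical helper names. The '^' and '$' anchors are an addition not present in the query above, and the LIKE pattern is lowercased on the assumption that it is compared against lower(name).

```python
import re


def safe_name(name):
    return re.sub('[^A-Za-z0-9.]+', '-', name)


def regex_for(safe):
    """Pattern for the '~*' operator: each '-' matches one or more
    non-alphanumeric/. characters."""
    parts = [re.escape(part) for part in safe.lower().split('-')]
    return '^' + '[^a-z0-9.]+'.join(parts) + '$'


def like_for(safe):
    """Pattern for LIKE: each '-' becomes '_%' (one character, then anything)."""
    return safe.lower().replace('-', '_%')


def postfilter(candidates, safe):
    """Drop LIKE false positives such as 'gnosis3utilities'."""
    wanted = safe.lower()
    return [c for c in candidates if safe_name(c).lower() == wanted]
```

Since safe_name already collapses '_' and '%' into '-', the LIKE pattern never contains stray wildcard characters from the package name itself.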
From jim at zope.com Tue Jul 10 16:32:10 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 10:32:10 -0400
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <46937F10.3070201@weitershausen.de>
References: <46937F10.3070201@weitershausen.de>
Message-ID: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
You raise a really good point, which is especially relevant in light
of pypi performance issues and discussions.
I'm copying the distutils and catalog sigs to get some wider
discussion. I apologize for the cross posting.
I'm beginning to wonder about the strategy that setuptools uses, or
maybe about the way we are using the index.
It's important to note that there is nothing specific about the
buildout package here.
It is very important to make multiple versions available to support
requirements for specific package versions. It makes builds/installs
repeatable, whether talking about buildout or other systems built on
setuptools. When someone has tested and wants to release an
application built from a collection of distributions, they will want
to specify those *specific* versions for future builds or installs.
This means that we need to retain any versions published indefinitely
in a way that can be found by setuptools.
Currently, the only way to support multiple versions with the
cheeseshop is to unhide past releases. This has a fairly severe
effect on performance. As the example below shows, setuptools will
fetch the package page and then fetch the pages for each release.
That's a lot of requests. What makes it worse is that the individual
package pages can be fairly long. I've gotten in the habit of
including full documentation on every release page. For example,
recent release pages for zc.buildout are around 200K. This is a
fairly significant amount of data to transfer. This will certainly
make the scanning process take a long time for clients. (Obviously,
if we keep doing things the way we are, I'll need to stop doing that.)
All of this aggravates any performance problems we might have.
Up to now, setuptools has tried hard to use existing systems without
change. This means that it reuses systems designed primarily for
people, not software. I think that setuptools rightly took the
approach it has up to now so that progress could be made without
making people change other systems. This was appropriate when
setuptools was evolving and people were figuring out ways to use it.
I think it is time to take a step back and think a lot harder about
how we'd want to structure an index to support setuptools.
IMO, a setuptools-aware index would have a single page for each package:
- The single page would be published in a case-insensitive way. It
would be nice to find a way to avoid this, or maybe we should use a
windows-based web server. :) It would also be served very cheaply,
for example statically.
- The single page would list links for all available distributions,
which should include all distributions published. It would also list
any other URLs that should be scanned for releases, when releases
aren't all uploaded to PyPI.
- The single page would contain very little additional information.
It would be for use by software, not humans.
In addition, the root page with a trailing / would be empty and very
cheap.
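
As a rough illustration of the single-page idea, the following Python 3 sketch renders a bare page of links for one package. The function name, the HTML layout and the argument names are assumptions made for this example; nothing here is an agreed format.

```python
from html import escape


def render_package_page(name, dist_urls, scan_urls=()):
    """Return a minimal HTML page meant for software, not humans.

    dist_urls: direct links to every published distribution file.
    scan_urls: other pages that should be scanned for releases.
    """
    lines = ["<html><head><title>Links for %s</title></head><body>" % escape(name)]
    for url in dist_urls:
        filename = url.rsplit("/", 1)[-1]
        lines.append('<a href="%s">%s</a><br/>'
                     % (escape(url, quote=True), escape(filename)))
    for url in scan_urls:
        lines.append('<a href="%s">%s</a><br/>'
                     % (escape(url, quote=True), escape(url)))
    lines.append("</body></html>")
    return "\n".join(lines)
```

A page like this is a few hundred bytes per release rather than the roughly 200K mentioned above, and it can be written to disk once per upload and served statically.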
There are a lot of ways we could achieve this pretty cheaply while
keeping the existing system pretty much as it is.
For example, the current effort to bake static pages could bake these
pages instead. We could make the new index available at a different
URL for people to play with while we worked the kinks out of the
process.
Of course, those of us who use the cheeseshop and setuptools
extensively can also achieve much of this by changing the way we work.
Thoughts?
Jim
On Jul 10, 2007, at 8:44 AM, Philipp von Weitershausen wrote:
> When easy_installing zc.buildout I realized that the CheeseShop
> still lists a gazillion old versions of zc.buildout. That makes it
> take quite some time to install zc.buildout (see below), and I
> reckon the same sort of check has to happen each time it looks for
> a new version of that egg...
>
> Is there any reason for having so many old versions around?
>
>
> $ easy_install zc.buildout
> Searching for zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
> Reading http://svn.zope.org/zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
> Best match: zc.buildout 1.0.0b28
> ...
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Tue Jul 10 16:40:48 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 10:40:48 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4692B3A3.5030209@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
Message-ID: <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
On Jul 9, 2007, at 6:16 PM, Martin v. Löwis wrote:
...
> Ok. By "ATM", you mean July 9, 14:09 GMT?
Whenever I sent the note.
> Please take a look at
>
> http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html
>
> That was the most significant spike in the load today, and I surely
> would like to know what was causing it.
Maybe someone was trying to mirror pypi because it is too slow. :/ I
suspect that there is a lot of this going on.
>
>> Requests for
>> http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg
>> take about 2.5 seconds.
>
> That is a static file, not going through PyPI. It's 168kiB, so that
> means you download with 67kB/s.
OK. So I guess that is reasonable. I'll note that in the long term,
we'll probably want to create mirrors to get better locality and thus
faster downloads and to prevent excessive bandwidth consumption for
python.org.
>
>> Requests for http://www.python.org/pypi/ take
>> about 10 seconds.
>
> Why does that matter for setuptools? Does setuptools ever look at this
> page?
Phillip answered this.
>> I would say that these times are too long.
>
> Which of these precisely? Given that the actual file downloads in
> 2.5s,
> why is it important that the access to the page referring to it is
> 1/3s?
I guess all of them except the download. Really, in the long run, I
think the download time is too long too. But that isn't my immediate
concern.
BTW, the problem is exacerbated by packages like zc.buildout that
include full documentation in their package pages. Although even
packages that don't do that seem to take about a third of a second.
>>> and how long should it
>>> take so that you would consider it fast enough?
>>
>> IMO, it needs to be much much faster. If we were serving pages
>> statically, we would be able to serve thousands of requests per
>> second. There's nothing about this application that would make doing
>> that hard.
>
> I looked at the load preceding your message. Counting 1000 requests
> backwards from 14:09, we are at 16:07. So this system receives roughly
> 1000 requests per minute in its peak load, and it seems to be able to
> handle them (although the performance degrades at that point).
You can expect one of 2 things to happen:
- We'll fix the PyPI performance problems and load will increase
dramatically, or
- We won't fix the problems and people will create alternate
indexes. This is already happening. If that happens, the load will
likely still increase, although not as rapidly.
...
>>> It's difficult to implement a system if the requirements are
>>> unknown to those implementing it.
>>
>> I'm sorry, I've been talking about setuptools all along. I
>> thought the
>> use case was understood.
>
> I understand the use case, I just don't understand the performance
> requirements resulting out of it. If it's an automated build, why do
> you care if the page download completes in 0.3s or in 0.01s (it won't
> be much faster because of network roundtrip times).
Two reasons:
- People wait for these builds. A build will usually make *many*
(tens or hundreds) of requests for pypi checking for new versions of
software. If there are no new versions, which will be the common
case, then nothing will be downloaded. I'm most interested in
speeding up the checking. Of course, a request for
http://www.python.org/pypi/ will usually be done once per build if any of
the packages in the build aren't in pypi (only once because
setuptools caches pages internally). It would be nice to find a way
to stop doing this.
- If performance degrades, as it has often lately, then the times are
much longer. In fact, requests over the last few weeks have often
timed out, making work grind to a halt. It's important to realize
that demand will increase substantially, so whatever we do needs to
be scalable.
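
To make the first reason concrete, here is a back-of-the-envelope calculation in Python 3. It assumes the index requests of a build are made one after another, and it uses only figures quoted in this thread (tens to hundreds of requests, about 1/3 second versus 0.01 second per page); serial_wait is a made-up helper name.

```python
def serial_wait(n_requests, seconds_per_request):
    """Total time a person waits if the requests are made sequentially."""
    return n_requests * seconds_per_request


for n in (10, 100):
    for latency in (1 / 3, 0.01):
        print("%3d requests at %.2fs each: about %.1fs of waiting"
              % (n, latency, serial_wait(n, latency)))
```

At 1/3 second per page, a build that checks a hundred packages spends roughly half a minute just confirming that nothing has changed, which is the cost being objected to.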
>> Also, I thought it was pretty obvious that the
>> performance we've been seeing lately is totally unacceptable.
>
> Define "lately". I never personally saw "totally unacceptable
> performance". Whenever I access the system, it behaves completely
> reasonably, much faster than any other web pages.
I've seen requests take minutes and time out with proxy errors many
times over the last few weeks. We, ZC, and many people we work with
are at the point of building private indexes to get around the
horrible performance.
> There were only two instances of "totally unacceptable performance",
> which were when the system was overloaded, and thrashing. I have
> since fixed these cases; they cannot occur again. So I don't think
> it is possible that the current installation shows "totally
> unacceptable" performance.
Maybe others can chime in.
>> If this was an application that had to be served dynamically (and of
>> course, parts of it are), then it would be much more interesting to
>> discuss targets for dynamic delivery. The performance-critical
>> parts of
>> this application -- the pages that setuptools uses, can readily be
>> served statically, so it makes no sense not to do so.
>
> Except that somebody needs to implement that, of course.
And happily, someone is.
I've realized this morning, in responding to a note from Philipp von
Weitershausen that we really should take a step back and think about
an index to support setuptools, or, failing that, rethink the ways
we're using PyPI in light of the way setuptools works.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Tue Jul 10 17:56:42 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 11:56:42 -0400
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
Message-ID: <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote:
>Currently, the only way to support multiple versions with the
>cheeseshop is to unhide past releases. This has a fairly severe
>effect on performance. As the example below shows, setuptools will
>fetch the package page and then fetch the pages for each release.
>That's a lot of requests.
This could potentially be fixed in setuptools, so that it only looks
at release pages that match its requirements, in highest-to-lowest
version order, stopping as soon as a suitable match is found. That
would eliminate the current issue -- but only for new versions of
setuptools. So I do like your idea better, since it can be made to
work for already-deployed clients as well.
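
The client-side fix mentioned above might look roughly like the following Python 3 sketch. It is not setuptools code: first_match and its parameters are hypothetical, and version ordering is delegated to a caller-supplied sort_key because real version comparison is more involved.

```python
def first_match(versions, satisfies, fetch_links, sort_key):
    """Scan release pages from highest to lowest version.

    satisfies(version) says whether a version meets the requirement;
    fetch_links(version) fetches that release page and returns its download
    links. Scanning stops at the first suitable version that has links.
    """
    for version in sorted(versions, key=sort_key, reverse=True):
        if not satisfies(version):
            continue  # skip the page entirely; no request is made
        links = fetch_links(version)
        if links:
            return version, links
    return None, []
```

With the zc.buildout listing quoted earlier in this thread, this would typically read one release page instead of thirteen.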
>I think it is time to take a step back and think a lot harder about
>how we'd want to structure an index to support setuptools.
+1, as long as somebody's willing to build and host the
thing. Please see my earlier comments on the Catalog-Sig about this.
>IMO, a setuptools-aware index would have a single page for each package:
>
>- The single page would be published in a case-insensitive way. It
>would be nice to find a way to avoid this, or maybe we should use a
>windows-based web server. :) It would also be served very cheaply,
>for example statically.
Apache's CheckSpelling directive does case-insensitivity and
approximate matching. Combine that with making the directories be
based on "safe_name" values to begin with, and you should be all set.
>- The single page would list links for all available distributions,
>which should include all distributions published. It would also list
>any other URLs that should be scanned for releases, when releases
>aren't all uploaded to PyPI.
The piece you're missing here is direct links to other downloads,
such as "#egg=project-dev" subversion links. However, if you
extracted these from all of the relevant PyPI HTML pages, you could
certainly do that.
>In addition, the root page with a trailing / would be empty and very
>cheap.
As long as the individual package directories are safe_name based,
this would work.
>There are a lot of ways we could achieve this pretty cheaply while
>keeping the existing system pretty much as it is.
Of course, there are still other reasons to want to improve the
Cheeseshop's performance, such as search engines and other bots.
>For example, the current effort to bake static pages could bake these
>pages instead. We could make the new index available at a different
>URL for people to play with while we worked the kinks out of the
>process.
...and then use a User-Agent rewrite rule to redirect setuptools
clients to the static piece, as soon as we're satisfied that it works.
From jim at zope.com Tue Jul 10 18:04:01 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 12:04:01 -0400
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
<20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
Message-ID:
On Jul 10, 2007, at 11:56 AM, Phillip J. Eby wrote:
> At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote:
>> Currently, the only way to support multiple versions with the
>> cheeseshop is to unhide past releases. This has a fairly severe
>> effect on performance. As the example below shows, setuptools will
>> fetch the package page and then fetch the pages for each release.
>> That's a lot of requests.
>
> This could potentially be fixed in setuptools, so that it only
> looks at release pages that match its requirements, in highest-to-
> lowest version order, stopping as soon as a suitable match is
> found. That would eliminate the current issue
No, it will mitigate the current issue somewhat, but it will still
involve multiple requests per package, while a simpler index
structure would allow a single request per package.
> -- but only for new versions of setuptools. So I do like your idea
> better, since it can be made to work for already-deployed clients
> as well.
Yup.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Tue Jul 10 23:29:14 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 10 Jul 2007 23:29:14 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070710141304.BC6903A40A4@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
Message-ID: <4693FA2A.3020107@v.loewis.de>
> It doesn't. It looks at /pypi/ (note the trailing /) -- which lists all
> packages.
Ah, ok. I keep forgetting that feature.
>> Would it be acceptable to do an HTTP redirect in that case, ie.
>> redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5?
>
> Yes, although setuptools at the moment looks at /pypi/pylons/ (again,
> with a trailing /) and does not go to individual version pages unless
> the base page contains only links to individual version pages.
Right - I meant that to mean that it would redirect /pypi/Pylons/ to
/pypi/pylons/
> I think this will work instead:
>
> select name from packages where name ~* 'gnosis[^a-z0-9.]+utilities'
Ok. I was hoping to be able to create an index of safe_names, which
postgres would automatically maintain on updates; the above approach
would always cause a sequential scan (in postgres, not in Python).
Your second approach (using like) might solve that, but there I
dislike having the logic both in Python and in SQL - ideally, only
one of them should do "real" computation (and ideally, it would be
SQL).
On ximinez, your query gets analyzed as
Seq Scan on packages (cost=0.00..46.65 rows=1 width=13) (actual
time=0.461..9.367 rows=1 loops=1)
Filter: (name ~* 'gnosis[^a-z0-9.]+utilities'::text)
Total runtime: 9.413 ms
Compared to some other queries it performs, that's a cheap one.
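The pattern used in the query above can be derived mechanically from a project name. A minimal sketch, using Python's re module as a stand-in for PostgreSQL's ~* operator (the two dialects should agree on a pattern this simple); name_pattern is a hypothetical helper, not PyPI code:

```python
import re

# Separator class taken from the query in this thread.
SEPARATOR = "[^a-z0-9.]+"


def name_pattern(name):
    """Build a case-insensitive match pattern for a project name.

    The name is split on runs of characters other than letters, digits
    and '.', and the pieces are rejoined with the separator class.
    """
    parts = [p for p in re.split(r"[^A-Za-z0-9.]+", name.lower()) if p]
    return SEPARATOR.join(re.escape(p) for p in parts)


def matches(name, candidate):
    """Emulate `candidate ~* pattern` with an unanchored, case-insensitive search."""
    return re.search(name_pattern(name), candidate, re.IGNORECASE) is not None
```

Like the original query, the search is unanchored, so it can also match longer names that merely contain the pattern; anchoring it would be a further design decision.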
> In any case, this query should also be used to check for uniqueness when
> adding packages.
Hmm. I'm somewhat skeptical about setuptools (or any other packaging
infrastructure, say, Debian) establishing rules on what makes a
difference in package names.
Regards,
Martin
From martin at v.loewis.de Tue Jul 10 23:36:28 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 10 Jul 2007 23:36:28 +0200
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
Message-ID: <4693FBDC.2060201@v.loewis.de>
> For example, the current effort to bake static pages could bake these
> pages instead.
Certainly not instead; in addition, if there are volunteers to implement
that.
> We could make the new index available at a different
> URL for people to play with while we worked the kinks out of the
> process.
I have been thinking about the same thing. I think it would be good
to have; however, it will surely take some time until all setuptools
implementations learn to use it.
> Of course, those of us who use the cheeseshop and setuptools
> extensively can also achieve much of this by changing the way we work.
Hmm. How about those using them extensively start contributing to
them also?
Regards,
Martin
From martin at v.loewis.de Tue Jul 10 23:39:28 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 10 Jul 2007 23:39:28 +0200
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To:
References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
Message-ID: <4693FC90.9060001@v.loewis.de>
> No, it will mitigate the current issue somewhat, but it will still
> involve multiple requests per package, while a simpler index
> structure would allow a single request per package.
I don't understand. If setuptools would always look at
/pypi/package/version first, it would immediately find the right
page if that version is indeed stored in the cheeseshop.
Why would that require multiple requests per package?
Regards,
Martin
From martin at v.loewis.de Tue Jul 10 23:48:04 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 10 Jul 2007 23:48:04 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
Message-ID: <4693FE94.6090107@v.loewis.de>
>> That was the most significant spike in the load today, and I surely
>> would like to know what was causing it.
>
> Maybe someone was trying to mirror pypi because it is too slow. :/ I
> suspect that there is a lot of this going on.
In that case, I doubt it. The top client identified itself as
setuptools.
> I've seen requests take minutes and time out with proxy errors many
> times over the last few weeks. We, ZC, and many people we work with are
> at the point of building private indexes to get around the horrible
> performance.
I still don't understand why you consider this an easier option than
contributing to the existing project. If you invest time to do an
alternative, isn't this more expensive than starting where others
have already contributed?
But if you think that scratches your itches: good luck!
> Maybe others can chime in.
That's also my concern. Nobody else is complaining; AFAICT, there
is just one unhappy user of PyPI.
Regards,
Martin
From jim at zope.com Tue Jul 10 23:49:43 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 17:49:43 -0400
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <4693FBDC.2060201@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
<4693FBDC.2060201@v.loewis.de>
Message-ID: <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
On Jul 10, 2007, at 5:36 PM, Martin v. Löwis wrote:
>> For example, the current effort to bake static pages could bake these
>> pages instead.
>
> Certainly not instead; in addition, if there are volunteers to
> implement
> that.
Sure,
>
>> We could make the new index available at a different
>> URL for people to play with while we worked the kinks out of the
>> process.
>
> I have been thinking about the same thing. I think it would be good
> to have; however, it will surely take some time until all setuptools
> implementations learn to use it.
No, not at all. You can tell setuptools to use a different index
than the current one. For example, this is a command-line option for
easy_install and a configuration option for buildout.
>> Of course, those of us who use the cheeseshop and setuptools
>> extensively can also achieve much of this by changing the way we
>> work.
>
> Hmm. How about those using them extensively start contributing to
> them also?
I like to think that I am by participating in this discussion.
Actually changing the cheeseshop software has a very high learning
curve. I don't think that I can make that kind of time any time
soon. I'm very grateful that you and René are doing what you're
doing. I also suspect that, given your and René's activity, it would
be counter productive for someone else to get involved at that level,
but maybe I'm wrong about that.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Tue Jul 10 23:55:28 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 17:55:28 -0400
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <4693FC90.9060001@v.loewis.de>
References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
<4693FC90.9060001@v.loewis.de>
Message-ID:
On Jul 10, 2007, at 5:39 PM, Martin v. Löwis wrote:
>> No, it will mitigate the current issue somewhat, but it will still
>> involve multiple requests per package, while a simpler index
>> structure would allow a single request per package.
>
> I don't understand. If setuptools would always look at
> /pypi/package/version first, it would immediately find the right
> page if that version is indeed stored in the cheeseshop.
>
> Why would that require multiple requests per package?
It usually doesn't have a single required version. It usually has
just a package name or a name and a range of versions. It has to
scan the package page to find out what versions are available, and
*then* it can load the release page for the highest version that
satisfies the requirement. It can usually read that one page;
however, there may be additional filtering needed that would cause it
to search multiple releases. For example, it might be looking for a
source distribution, or a platform-specific distribution that isn't
available for the most recent release. In any case, the best case is
that it has to scan the package page to find the most recent release,
and then scan that release page.
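A minimal sketch comparing the request counts of the two index layouts discussed here, under stated assumptions: versions are purely numeric, a "suitable" distribution is identified by filename suffix, and both function names are hypothetical rather than anything setuptools provides:

```python
def parse_version(text):
    """Parse a purely numeric dotted version such as '1.4.2' (a simplification)."""
    return tuple(int(part) for part in text.split("."))


def two_level_lookup(release_links, suffix):
    """Package page first, then release pages from the highest version down."""
    requests = 1  # the package page listing the releases
    for version in sorted(release_links, key=parse_version, reverse=True):
        requests += 1  # one request per release page visited
        if any(link.endswith(suffix) for link in release_links[version]):
            return version, requests
    return None, requests


def single_page_lookup(release_links, suffix):
    """One page lists every distribution, so a single request is enough."""
    matches = [version for version, links in release_links.items()
               if any(link.endswith(suffix) for link in links)]
    best = max(matches, key=parse_version) if matches else None
    return best, 1
```

When the newest release offers only a platform-specific egg and a source distribution is wanted, the two-level layout costs three requests in this model and the single page costs one.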
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Wed Jul 11 00:18:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 18:18:00 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4693FA2A.3020107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
Message-ID: <20070710221547.4A3043A40A4@sparrow.telecommunity.com>
At 11:29 PM 7/10/2007 +0200, Martin v. Löwis wrote:
>Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>infrastructure, say, Debian) establishing rules on what makes a
>difference in package names.
I can certainly understand that. However, *having* SOME definition
that's more human-friendly (and cross-platform filename friendly!)
than "the bytes match exactly", would be very useful to have.
If PyPI had already had one (and I asked about this when I was first
trying to establish one) I'd have used that, or negotiated a
compromise if it didn't meet the filename-related requirements.
However, none of the times that I asked about this issue on either
the catalog-sig or the distutils-sig did anyone propose any
alternative canonicalization, nor bring up any objection besides the
general sort of reservation that you're expressing here - i.e., not
sure it's a good idea, but not expressing any particular reason it's
a bad idea.
Note that Windows (and Mac OS under certain circumstances) have
filename case insensitivity, and have different restrictions about
what can or can't be in a filename than Unix. Spaces and other
punctuation characters can cause problems for shells, even if they're
theoretically acceptable as filenames.
If you'd like to propose a *different* canonicalization, however, I'm
certainly willing to consider implementing it in setuptools, if it
can be done. However, as I said, nobody has proposed anything else,
but it would be nice to resolve the issue *before* name collisions happen.
If anything, I think that PyPI canonicalization may wish to be *more*
restrictive than setuptools' is. There isn't a whole lot of user
benefit to having, say, "Mike's Nifty module" and "Mikes Nifty
Module" being considered distinct packages, even though setuptools
actually allows that distinction to be made.
IOW, setuptools' focus is more on distribution filename safety,
rather than on sensible naming distinctions for end users. The
former is less restrictive than the latter, I believe.
From jim at zope.com Wed Jul 11 00:54:10 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 18:54:10 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4693FE94.6090107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
Message-ID:
On Jul 10, 2007, at 5:48 PM, Martin v. Löwis wrote:
>> I've seen requests take minutes and time out with proxy errors many
>> times over the last few weeks. We, ZC, and many people we work
>> with are
>> at the point of building private indexes to get around the horrible
>> performance.
>
> I still don't understand why you consider this an easier option than
> contributing to the existing project.
I don't. I'm not advocating it. In fact, I've been trying to
convince people not to.
People are doing it, usually in limited ways, out of desperation.
...
>> Maybe others can chime in.
>
> That's also my concern. Nobody else is complaining; AFAICT, there
> is just one unhappy user of PyPI.
Oh come on, I'm not the only one who has posted messages on this
mailing list over the last few weeks reporting problems.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Wed Jul 11 00:55:57 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 18:55:57 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4693FA2A.3020107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
Message-ID: <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
On Jul 10, 2007, at 5:29 PM, Martin v. Löwis wrote:
...
> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
> infrastructure, say, Debian) establishing rules on what makes a
> difference in package names.
Why? It certainly seems reasonable to me for a packaging system to
define rules for package names.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Wed Jul 11 01:03:09 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 19:03:09 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070710221547.4A3043A40A4@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
Message-ID:
On Jul 10, 2007, at 6:18 PM, Phillip J. Eby wrote:
> At 11:29 PM 7/10/2007 +0200, Martin v. Löwis wrote:
>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>> infrastructure, say, Debian) establishing rules on what makes a
>> difference in package names.
>
> I can certainly understand that. However, *having* SOME definition
> that's more human-friendly (and cross-platform filename friendly!)
> than "the bytes match exactly", would be very useful to have.
>
> If PyPI had already had one (and I asked about this when I was
> first trying to establish one) I'd have used that, or negotiated a
> compromise if it didn't meet the filename-related requirements.
>
> However, none of the times that I asked about this issue on either
> the catalog-sig or the distutils-sig did anyone propose any
> alternative canonicalization, nor bring up any objection besides
> the general sort of reservation that you're expressing here - i.e.,
> not sure it's a good idea, but not expressing any particular reason
> it's a bad idea.
I think it is time we (the Python community) nailed this down.
Perhaps a distribution project-name naming PEP is in order.
>
> Note that Windows (and Mac OS under certain circumstances) have
> filename case insensitivity, and have different restrictions about
> what can or can't be in a filename than Unix. Spaces and other
> punctuation characters can cause problems for shells, even if
> they're theoretically acceptable as filenames.
Why should this imply case insensitivity of distribution project
names? Python has case-sensitive module (including package) names
that can lead to problems if two modules have names that differ only
in case. (I assume that Python 3000 retains this although, sadly, I
don't know.) We deal with this by telling people "don't do that."
Two packages with the same name except for case are incompatible, but
then, so are modules with incompatible dependencies.
> If you'd like to propose a *different* canonicalization, however,
> I'm certainly willing to consider implementing it in setuptools, if
> it can be done. However, as I said, nobody has proposed anything
> else, but it would be nice to resolve the issue *before* name
> collisions happen.
>
> If anything, I think that PyPI canonicalization may wish to be
> *more* restrictive than setuptools' is. There isn't a whole lot of
> user benefit to having, say, "Mike's Nifty module" and "Mikes Nifty
> Module" being considered distinct packages, even though setuptools
> actually allows that distinction to be made.
>
> IOW, setuptools' focus is more on distribution filename safety,
> rather than on sensible naming distinctions for end users. The
> former is less restrictive than the latter, I believe.
I don't care much what canonicalization we use, but I agree strongly
that we should decide something.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Wed Jul 11 02:12:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 20:12:54 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
Message-ID: <20070711001040.D6DBF3A404D@sparrow.telecommunity.com>
At 07:03 PM 7/10/2007 -0400, Jim Fulton wrote:
>Why should this imply case insensitivity of distribution project
>names? Python has case-sensitive module (including package) names
>that can lead to problems if two modules have names that differ only
>in case.
Module names are identifiers, with an already-restricted character
set. Package names are strings, and many people (especially those
who enter their PyPI data through the web) assume they can put
whatever the heck they want in there.
> (I assume that Python 3000 retains this although, sadly, I
>don't know.) We deal with this by telling people "don't do that."
Right... and PyPI's input validation would be a good place to tell them. :)
>Two packages with the same name except for case are incompatible, but
>then, so are modules with incompatible dependencies.
Compatibility isn't the only concern, it's also about confusion as to
which package is which. While one can't legislate away confusion,
fixing simple, obvious errors that can and *do* occur in practice
(like one package name having one space in it, the other having two!)
is a good idea.
One of the things that prompted my search for a canonicalization
strategy was my survey of existing CheeseShop packages, which
actually included a certain amount of duplication due to changes in
case or punctuation at one point. (I believe the specific instances
were fixed a long time ago, although I wouldn't rule out the
possibility that some still exist.)
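A minimal sketch of how such a survey for near-duplicate names could be run. The canonical key used here (the safe_name-style collapsing of punctuation runs, then lower-casing) is an assumption chosen for illustration, and find_collisions is a hypothetical helper:

```python
import re
from collections import defaultdict


def canonical_key(name):
    """Collapse runs of characters outside A-Z, a-z, 0-9 and '.' to '-', then lower-case."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


def find_collisions(names):
    """Group project names by canonical key and keep the groups with several members."""
    groups = defaultdict(list)
    for name in names:
        groups[canonical_key(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Names that differ only in case, or in having one space versus two, end up in the same group and can be flagged at registration time.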
From srichter at cosmos.phy.tufts.edu Wed Jul 11 02:16:40 2007
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Tue, 10 Jul 2007 20:16:40 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
Message-ID: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
Hi all,
Jim Fulton forwarded this exchange to the Zope3-Dev mailing list asking for us
to comment.
> On Jul 10, 2007, at 5:48 PM, Martin v. Löwis wrote:
>> That's also my concern. Nobody else is complaining; AFAICT, there
>> is just one unhappy user of PyPI.
>
> Oh come on, I'm not the only one who has posted messages on this
> mailing list over the last few weeks reporting problems.
I can assure you that I have had trouble with performance several times. One
Friday I could not even finish my release, because I could not upload to PyPI
or test the release since the packages were not downloaded after 5 hours!
Regards,
Stephan
--
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training
From waterbug at pangalactic.us Wed Jul 11 04:55:30 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Tue, 10 Jul 2007 22:55:30 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4693FE94.6090107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
Message-ID: <469446A2.9070500@pangalactic.us>
Martin v. Löwis wrote:
> [Jim Fulton wrote:]
>> Maybe others can chime in.
>
> That's also my concern. Nobody else is complaining; AFAICT, there
> is just one unhappy user of PyPI.
I'm not happy with PyPI's performance either.
Probably many users are like me: I thought it was
common knowledge that the performance of PyPI was bad, but
I didn't want to complain when it appeared that people were
working on improvements.
Steve
From richardjones at optusnet.com.au Wed Jul 11 06:04:24 2007
From: richardjones at optusnet.com.au (richardjones at optusnet.com.au)
Date: Wed, 11 Jul 2007 14:04:24 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
Message-ID: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://mail.python.org/pipermail/catalog-sig/attachments/20070711/c7f5e06a/attachment.pot
From martin at v.loewis.de Wed Jul 11 07:16:26 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:16:26 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070710221547.4A3043A40A4@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
Message-ID: <469467AA.7070409@v.loewis.de>
> Note that Windows (and Mac OS under certain circumstances) have filename
> case insensitivity, and have different restrictions about what can or
> can't be in a filename than Unix. Spaces and other punctuation
> characters can cause problems for shells, even if they're theoretically
> acceptable as filenames.
I can see that collisions should be avoided in advance when it comes to
file names. However, the name of a software package is not necessarily a
file name, nor is it even related to the name of files inside the
package.
*Python* package names are the ones that must not conflict. For
a packaging tool, the names of the package files must not conflict,
either. For the package names in general, issues of file names
are only remotely relevant, at first glance.
> IOW, setuptools' focus is more on distribution filename safety, rather
> than on sensible naming distinctions for end users. The former is less
> restrictive than the latter, I believe.
Yes. However, it's not clear to me that the infrastructure needs to
(or even is able to) enforce sensible naming. Instead, any policing
that might be necessary should be done in the community. If two
packages are named too similarly, users will get confused, and
eventually one package may disappear, get renamed, get its naming
challenged in court, and so on. It's not the job of the package
*index* to do that sort of policing.
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 07:19:45 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:19:45 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
Message-ID: <46946871.3060100@v.loewis.de>
> People are doing it, usually in limited ways, out of desperation.
Same question to these people, then (whoever they are): why
do you think it's easier to build your own index in desperation,
rather than contributing to PyPI?
>> That's also my concern. Nobody else is complaining; AFAICT, there
>> is just one unhappy user of PyPI.
>
> Oh come on, I'm not the only one who has posted messages on this mailing
> list over the last few weeks reporting problems.
Can you kindly refer to four or five such messages in the archives?
I must have missed them.
Regards,
Martin
From waterbug at pangalactic.us Wed Jul 11 07:20:05 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Wed, 11 Jul 2007 01:20:05 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
Message-ID: <46946885.8080100@pangalactic.us>
richardjones at optusnet.com.au wrote:
> Stephen Waterbury wrote:
>> Martin v. Löwis wrote:
>>> [Jim Fulton wrote:]
>>>> Maybe others can chime in.
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>> I'm not happy with PyPI's performance either.
>> Probably many users are like me: I thought it was
>> common knowledge that the performance of PyPI was bad, but
>> I didn't want to complain when it appeared that people were
>> working on improvements.
>
> It has been slow in the past, but Martin has done some great work
> speeding it up in the last few days. If it's still slow, please
> report when you noticed and what you were trying to do.
I agree, Martin's improvements have made a huge difference, in my
recent experience. Thanks, Martin! I inferred from the conversation
that the performance is variable, and I think my tests of it have been
in off-peak times, so my current impressions should be regarded as
anecdotal ... another reason why I hadn't volunteered my opinion until
this request for input.
Steve
From martin at v.loewis.de Wed Jul 11 07:21:09 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:21:09 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
Message-ID: <469468C5.8000906@v.loewis.de>
>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>> infrastructure, say, Debian) establishing rules on what makes a
>> difference in package names.
>
> Why? It certainly seems reasonable to me for a packaging system to
> define rules for package names.
Ah, sure. It's certainly fine and reasonable for a packaging system
to do that for its own purposes. However, I'm skeptical about that
packaging system then enforcing its rules on other systems (such
as the cheeseshop, which is not a packaging system).
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 07:28:09 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:28:09 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469446A2.9070500@pangalactic.us>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de>
<469446A2.9070500@pangalactic.us>
Message-ID: <46946A69.4000702@v.loewis.de>
> I'm not happy with PyPI's performance either.
> Probably many users are like me: I thought it was
> common knowledge that the performance of PyPI was bad
Please trust me that it isn't. I know that PyPI could
become unresponsive, and I FIXED that. AFAICT, it's
solved, done, can't happen again. I do not know that
performance IS bad; I know that it WAS bad (primarily
not due to the way the software was written, but
due to the way it was run).
> but
> I didn't want to complain when it appeared that people were
> working on improvements.
Sure: mere complaints would not be constructive. However,
specific *reports* of problems are absolutely necessary.
If you experience problems today, tomorrow, next week,
by all means, report them. Different people apparently
also have different perceptions of what good performance is,
so please always make a full bug report:
- what precisely did you do (including "when" also
in this case),
- what happened,
- what did you expect to happen instead
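
A minimal sketch of how those three items could be captured for a slow
request. The helper names (timed_call, format_report) and the example URL
are hypothetical, not part of PyPI or setuptools; the fetch function is
passed in so the sketch does not assume any particular HTTP client.

```python
import time
from datetime import datetime, timezone


def timed_call(fetch, url):
    """Call fetch(url); return (result, elapsed_seconds, utc_start_time)."""
    started_at = datetime.now(timezone.utc)
    t0 = time.perf_counter()
    result = fetch(url)
    elapsed = time.perf_counter() - t0
    return result, elapsed, started_at


def format_report(url, started_at, elapsed, expected_seconds):
    """Render the three items of a problem report as plain text."""
    when = started_at.strftime("%Y-%m-%d %H:%M:%S UTC")
    return "\n".join([
        "What I did: GET %s at %s" % (url, when),
        "What happened: the request took %.2f seconds" % elapsed,
        "What I expected: a response within %.2f seconds" % expected_seconds,
    ])
```

Recording the start time in UTC avoids the time-zone ambiguity that comes
up when reports from different continents are compared with server logs.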
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 07:31:06 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:31:06 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46946885.8080100@pangalactic.us>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46946885.8080100@pangalactic.us>
Message-ID: <46946B1A.9040004@v.loewis.de>
> I agree, Martin's improvements have made a huge difference, in my
> recent experience. Thanks, Martin! I inferred from the conversation
> that the performance is variable, and I think my tests of it have been
> in off-peak times, so my current impressions should be regarded as
> anecdotal ... another reason why I hadn't volunteered my opinion until
> this request for input.
Ah, ok. Please take a look at
http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html
Times are in CEST (UTC+2), so the peak load occurred during the times
I was asleep - I never personally see any significant load on the system
anymore. If you work in a time zone similar to mine, I would
consider your problems solved.
Regards,
Martin
From gentoodev at gmail.com Wed Jul 11 07:54:48 2007
From: gentoodev at gmail.com (Rob Cakebread)
Date: Tue, 10 Jul 2007 22:54:48 -0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469467AA.7070409@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
Message-ID: <9b06ffb10707102254i57e5c0f8gf92836805f8a0626@mail.gmail.com>
On 7/10/07, "Martin v. Löwis" wrote:
>
> Yes. However, it's not clear to me that the infrastructure needs to
> (or even is able to) enforce sensible naming. Instead, any policing
> that might be necessary should be done in the community. If two
> packages are named too similarly, users will get confused, and
> eventually one package may disappear, get renamed, get its naming
> challenged in court, and so on. It's not the job of the package
> *index* to do that sort of policing.
>
Every package index I can think of does enforce sensible naming, except PyPI.
Nobody is going to change the name of their project if you enforce
sensible naming for PyPI; they'll just have to map their project name
to a form that fits PyPI's system, just like on
Freshmeat, RubyForge etc.
From martin at v.loewis.de Wed Jul 11 07:06:59 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:06:59 +0200
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
<4693FBDC.2060201@v.loewis.de>
<4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
Message-ID: <46946573.2070400@v.loewis.de>
>> I have been thinking about the same thing. I think it would be good
>> to have, however, it will surely take some time until all setuptools
>> implementations learn to use it.
>
> No, not at all. You can tell setuptools to use a different index than
> the current one. For example, this is a command-line option for
> easy_install and a configuration option for buildout.
Yes. However, that will make the feature only available to those who
know about it. I have only very shallow knowledge of setuptools and
easy_install (I almost never use them), and I surely would
miss such an option, and miss why it's relevant.
It's true that the Apache installation could also redirect existing
installations to the new pages, but I doubt that they would be otherwise
widely used until setuptools changes its hard-coded default.
>> Hmm. How about those using them extensively start contributing to
>> them also?
>
> I like to think that I am by participating in this discussion. Actually
> changing the cheeseshop software has a very high learning curve. I don't
> think that I can make that kind of time any time soon. I'm very
> grateful that you and René are doing what you're doing. I also suspect
> that, given your and René's activity, it would be counter productive for
> someone else to get involved at that level, but maybe I'm wrong about that.
I strongly think you are. There are many things that could be improved,
and I would not mind leaving the cheeseshop alone if some other
maintainer came along - I also have other things to do.
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 07:44:53 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 07:44:53 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
References: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
Message-ID: <46946E55.30308@v.loewis.de>
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>> Oh come on, I'm not the only one who has posted messages on this
>> mailing list over the last few weeks reporting problems.
>
> I can assure you that I have had trouble with performance several times. One
> Friday I could not even finish my release, because I could not upload to PyPI
> or test the release since the packages were not downloaded after 5 hours!
I assume you are talking about the past here - I can readily believe that
has happened. I think it's fixed now, and it should not happen again
that you have to wait 5 hours to download a file (unless there is
some hardware failure, network outage or the like beyond the control
of the local software).
So yes, I trust that there have been complaints in the past - I
wonder whether there are *still* complaints (beyond the ones
of Jim Fulton).
Regards,
Martin
From benji at benjiyork.com Wed Jul 11 14:10:30 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 08:10:30 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46946871.3060100@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de>
<46946871.3060100@v.loewis.de>
Message-ID: <4694C8B6.1030804@benjiyork.com>
Martin v. Löwis wrote:
>> People are doing it, usually in limited ways, out of desperation.
>
> Same question to these people, then (whoever they are): why
> do you think it's easier to build your own index in desperation,
> rather than contributing to PyPI?
Because they aren't aware of the progress being made or the intent to
make more?
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>> Oh come on, I'm not the only one who has posted messages on this mailing
>> list over the last few weeks reporting problems.
>
> Can you kindly refer to four or five such messages in the archives?
> I must have missed them.
Here's one (you didn't say they had to be past messages ).
Is your position that PyPI isn't down/very slow on occasion or that when
it is no one complains?
My team has lost many man hours to PyPI being down/glacially slow. This
isn't meant to disparage PyPI, though; if it weren't such a great thing
it wouldn't be important to us.
--
Benji York
http://benjiyork.com
From renesd at gmail.com Wed Jul 11 14:20:22 2007
From: renesd at gmail.com (René Dudfield)
Date: Wed, 11 Jul 2007 22:20:22 +1000
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <46946573.2070400@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
<4693FBDC.2060201@v.loewis.de>
<4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
<46946573.2070400@v.loewis.de>
Message-ID: <64ddb72c0707110520j42bb8f27nb676bcf4de39d14c@mail.gmail.com>
I have to say the cheeseshop code was pretty easy to get into.
I think I was able to make most of my changes within the first reading of it.
It quite clearly separates things like the templates, the database
functionality and the 'webui'.
There are definitely a huge number of things that I would love to
change with it over time, and I hope other people begin to develop it
more - it can only help the python community as a whole.
The number of people doing releases has increased quite a lot even in
the last two months, so I think the releases will get more frequent.
As it grows it will continue to need different changes - optimizations
to the database/webserver, and also optimizations to the user
interface.
On 7/11/07, "Martin v. Löwis" wrote:
> >> I have been thinking about the same thing. I think it would be good
> >> to have, however, it will surely take some time until all setuptools
> >> implementations learn to use it.
> >
> > No, not at all. You can tell setuptools to use a different index than
> > the current one. For example, this is a command-line option for
> > easy_install and a configuration option for buildout.
>
> Yes. However, that will make the feature only available to those who
> know about it. I have very shallow knowledge of setuptools and
> easy_install only (I nearly never use them at all), and I surely would
> miss such an option, and miss why it's relevant.
>
> It's true that the Apache installation could also redirect existing
> installations to the new pages, but I doubt that they would be otherwise
> widely used until setuptools changes its hard-coded default.
>
> >> Hmm. How about those using them extensively start contributing to
> >> them also?
> >
> > I like to think that I am by participating in this discussion. Actually
> > changing the cheeseshop software has a very high learning curve. I don't
> > think that I can make that kind of time any time soon. I'm very
> > grateful that you and René are doing what you're doing. I also suspect
> > that, given your and René's activity, it would be counter productive for
> > someone else to get involved at that level, but maybe I'm wrong about that.
>
> I strongly think you are. There are many things that could be improved,
> and I would not mind leaving the cheeseshop alone if some other
> maintainer came along - I also have other things to do.
>
> Regards,
> Martin
From benji at benjiyork.com Wed Jul 11 14:42:16 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 08:42:16 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469468C5.8000906@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
Message-ID: <4694D028.6050203@benjiyork.com>
Martin v. Löwis wrote:
>>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>>> infrastructure, say, Debian) establishing rules on what makes a
>>> difference in package names.
>> Why? It certainly seems reasonable to me for a packaging system to
>> define rules for package names.
>
> Ah, sure. It's certainly fine and reasonable for a packaging system
> to do that for its own purposes. However, I'm skeptical about that
> packaging system then enforcing its rules on other systems (such
> as the cheeseshop, which is not a packaging system).
Although it wasn't part of the cheeseshop's original mission, it has
become an integral part of distributing Python packages. If it doesn't
want to participate in its new-found utility, other options need to be
explored.
--
Benji York
http://benjiyork.com
From jim at zope.com Wed Jul 11 14:52:20 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 08:52:20 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469467AA.7070409@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
Message-ID: <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
On Jul 11, 2007, at 1:16 AM, Martin v. Löwis wrote:
...
>> IOW, setuptools' focus is more on distribution filename safety,
>> rather
>> than on sensible naming distinctions for end users. The former is
>> less
>> restrictive than the latter, I believe.
>
> Yes. However, it's not clear to me that the infrastructure needs to
> (or even is able to) enforce sensible naming. Instead, any policing
> that might be necessary should be done in the community. If two
> packages are named too similarly, users will get confused, and
> eventually one package may disappear, get renamed, get its naming
> challenged in court, and so on. It's not the job of the package
> *index* to do that sort of policing.
When Phillip designed setuptools, he tried to have a very low impact
on lots of systems. He did that very well and that has allowed
setuptools to be adopted gradually with very little up front buy in.
One of the decisions Phillip made was to not use an installed-package
database other than sys.path. When a distribution is installed, the
installed file name reflects the package name. If you want to know
whether a package is installed, you can scan sys.path looking for
files or directories that contain/reflect the package name. IMO,
this was a very good decision; however, it does have the disadvantage
that it may run afoul of system file-naming limitations. Again, I
think this was a fair trade off.
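
A simplified sketch of that idea: scan a search path for files or
directories whose names reflect the project name. This is not setuptools'
actual lookup code; the function names and the matching rule (a name of the
form "<project>-<version>..." ending in ".egg" or ".egg-info") are
assumptions made for illustration.

```python
import os
import sys


def _reflects(file_name, wanted):
    """True if file_name looks like '<wanted>-<version>...' with an egg suffix."""
    lowered = file_name.lower()
    return lowered.startswith(wanted + "-") and lowered.endswith((".egg", ".egg-info"))


def find_installed(project_name, search_path=None):
    """Return path entries, or files inside them, that reflect project_name."""
    if search_path is None:
        search_path = sys.path
    # Assumption: file names use '_' where the project name has '-'.
    wanted = project_name.replace("-", "_").lower()
    matches = []
    for entry in search_path:
        location = entry or os.curdir  # '' on sys.path means the current directory
        # An egg can itself be an entry on the path.
        if _reflects(os.path.basename(os.path.normpath(location)), wanted):
            matches.append(entry)
            continue
        try:
            names = sorted(os.listdir(location))
        except OSError:
            continue  # missing directory, zip file, or unreadable entry
        for name in names:
            if _reflects(name, wanted):
                matches.append(os.path.join(location, name))
    return matches
```

Because the only "database" is the file system, a project name that the
file system cannot represent faithfully (case-insensitive volumes, reserved
characters) is exactly where this approach runs into trouble.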
The question for us is how much effort we are willing to make to
prevent people from shooting themselves in the foot. I can
understand why Phillip would like the package index to prevent people
from choosing problematic package names.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Wed Jul 11 14:56:22 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 08:56:22 -0400
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <46946573.2070400@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>
<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
<4693FBDC.2060201@v.loewis.de>
<4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
<46946573.2070400@v.loewis.de>
Message-ID: <7E6E8D05-9669-4765-B61D-254835DDA553@zope.com>
On Jul 11, 2007, at 1:06 AM, Martin v. Löwis wrote:
>>> I have been thinking about the same thing. I think it would be good
>>> to have, however, it will surely take some time until all setuptools
>>> implementations learn to use it.
>>
>> No, not at all. You can tell setuptools to use a different index
>> than
>> the current one. For example, this is a command-line option for
>> easy_install and a configuration option for buildout.
>
> Yes. However, that will make the feature only available to those who
> know about it. I have very shallow knowledge of setuptools and
> easy_install only (I nearly never use them at all), and I surely would
> miss such an option, and miss why it's relevant.
That's fine. I don't care if most people can find it. While it is
an *experimental* index, it is fine if only a few people play with
it. If it is proven to work properly, then we could arrange that
other people get it by default.
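
For those who do want to play with it, here is a minimal sketch of
generating a buildout configuration that points at an alternative index.
It relies on buildout's "index" option in the [buildout] section;
easy_install accepts the same URL through its --index-url (-i) option.
The URL shown is a placeholder and the helper name is hypothetical.

```python
import configparser
import io


def buildout_config(index_url, parts=()):
    """Return the text of a minimal buildout.cfg that uses index_url."""
    # RawConfigParser avoids '%' interpolation, which URLs may contain.
    config = configparser.RawConfigParser()
    config["buildout"] = {
        "index": index_url,
        "parts": " ".join(parts),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder URL; substitute the experimental index once it exists.
    print(buildout_config("http://index.example.invalid/simple/"))
```

Keeping the index URL in one configuration value is what makes it cheap to
switch back if the experimental index misbehaves.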
> It's true that the Apache installation could also redirect existing
> installations to the new pages, but I doubt that they would be
> otherwise
> widely used until setuptools changes its hard-coded default.
Right, that's why, if the experiment works, we should then change the
Apache config to redirect setuptools to it.
Changing the Apache config is much easier than updating the
setuptools installed base.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Wed Jul 11 15:32:31 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 09:32:31 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46946871.3060100@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
<46946871.3060100@v.loewis.de>
Message-ID:
On Jul 11, 2007, at 1:19 AM, Martin v. Löwis wrote:
...
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>>
>> Oh come on, I'm not the only one who has posted messages on this
>> mailing
>> list over the last few weeks reporting problems.
>
> Can you kindly refer to four or five such messages in the archives?
> I must have missed them.
http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html
http://mail.python.org/pipermail/catalog-sig/2007-June/001101.html
http://mail.python.org/pipermail/catalog-sig/2007-April/001049.html
http://mail.python.org/pipermail/catalog-sig/2006-November/000997.html
There haven't been a large number of messages.
There must not be a problem.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Wed Jul 11 15:34:41 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 09:34:41 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469468C5.8000906@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
Message-ID:
On Jul 11, 2007, at 1:21 AM, Martin v. Löwis wrote:
>>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>>> infrastructure, say, Debian) establishing rules on what makes a
>>> difference in package names.
>>
>> Why? It certainly seems reasonable to me for a packaging system to
>> define rules for package names.
>
> Ah, sure. It's certainly fine and reasonable for a packaging system
> to do that for its own purposes. However, I'm skeptical about that
> packaging system then enforcing its rules on other systems (such
> as the cheeseshop, which is not a packaging system).
OK, let's take a step back. IMO, PyPI is a *part* of the packaging
system. If we can't agree that that is true, then we need to find a
package index that *is* part of the package system.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Wed Jul 11 16:03:33 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 10:03:33 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
Message-ID: <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
On Jul 11, 2007, at 12:04 AM, richardjones at optusnet.com.au wrote:
> Stephen Waterbury wrote:
>> Martin v. Löwis wrote:
>>> [Jim Fulton wrote:]
>>>> Maybe others can chime in.
>>>
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>>
>> I'm not happy with PyPI's performance either.
>> Probably many users are like me: I thought it was
>> common knowledge that the performance of PyPI was bad, but
>> I didn't want to complain when it appeared that people were
>> working on improvements.
>
> It has been slow in the past, but Martin has done some great work
> speeding it up in the last few days.
Yup. Much thanks Martin!
> If it's still slow, please report when you noticed and what you
> were trying to do.
Let's look at the new, improved times. Right now ~14:00UTC July 11:
http://www.python.org/ZODB3 takes about .3 seconds (median; the mean
is higher)
http://www.python.org/ZODB3/3.8.0b2 also takes about .3 seconds
http://www.python.org/pypi/ takes about 6 seconds (median)
For the sake of argument, let's ignore http://www.python.org/pypi/.
The .3-second time per request is *much* better than we had before
(I assume), but it's *not fast enough*. The demand on the package
index used by setuptools is going to increase substantially. Even if
setuptools only made a single request per package, .3 seconds per
request is too slow. Given the current structure of the index,
setuptools has to make a request for the package and a request per
release. For ZODB, this means about 12 requests, or more than 3
seconds. Of course, this will increase over time, as more releases
are made.
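
The arithmetic behind that estimate, as a small sketch. It assumes one
request for the package page plus one per release, so "about 12 requests"
is modelled here as 11 releases; the release count and the function name
are assumptions for illustration.

```python
def index_scan_cost(releases, seconds_per_request):
    """Return (request_count, total_seconds) for scanning one package."""
    requests = 1 + releases  # one package page plus one page per release
    return requests, requests * seconds_per_request


if __name__ == "__main__":
    for releases in (11, 24, 49):
        count, total = index_scan_cost(releases, 0.3)
        print("%d releases: %d requests, about %.1f seconds" % (releases, count, total))
```

The cost grows linearly with the number of releases, which is why a
per-request time that looks acceptable today stops being acceptable as
projects accumulate releases.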
The progress Martin has made has (I assume and hope) greatly
increased the reliability and performance of PyPI. This is very
important and much appreciated. It is not enough in the long (or, I
suspect, medium) term.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From nathan at creativecommons.org Wed Jul 11 16:28:33 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 07:28:33 -0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46946A69.4000702@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
Message-ID:
On 7/10/07, "Martin v. Löwis" wrote:
> > I'm not happy with PyPI's performance either.
> > Probably many users are like me: I thought it was
> > common knowledge that the performance of PyPI was bad
>
> Please trust me that it isn't. I know that PyPI could
> become unresponsive, and I FIXED that. AFAICT, it's
> solved, done, can't happen again. I do not know that
> performance IS bad; I know that it WAS bad (primarily
> not due to the way the software was written, but
> due to the way it was run).
The speed has noticeably improved (thanks!) but as recently as Monday
PyPI was unresponsive and then returning proxy errors. It definitely
caused us (Creative Commons) to lose productivity Monday afternoon
(PDT).
Nathan
>
> > but
> > I didn't want to complain when it appeared that people were
> > working on improvements.
>
> Sure: mere complaints would not be constructive. However,
> specific *reports* of problems are absolutely necessary.
> If you experience problems today, tomorrow, next week,
> by all means, report them. Different people apparently
> also have different perception what good performance is,
> so please always make a full bug report:
>
> - what precisely did you do (including "when" also
> in this case),
> - what happened,
> - what did you expect to happen instead
>
> Regards,
> Martin
From jodok at lovelysystems.com Wed Jul 11 17:57:14 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Wed, 11 Jul 2007 17:57:14 +0200 (CEST)
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
Message-ID: <21138246.8381184169434589.JavaMail.root@post.webmeisterei.com>
sorry for incorrect quoting - i'm at europython and the webmailer behaves badly...
i've been complaining loudly! :)
in fact cheeseshop was unusably slow. in the meantime we built our own index and no longer depend on cheeseshop. i think at least lovely systems (me) and zope corporation (jim) offered help and volunteered to pay someone to fix it.
for me, the current solution is just "tuning"; it does not address the general problem behind the current software design (that is, the pypi software and parts of setuptools in general). i've been following the thread actively and would like to thank martin especially for his work on a short-term solution. nevertheless we need to solve these issues. as a lot of other projects are moving to egg-based distributions, pypi is an integral part.
baking static pages would be my first choice.
jodok
----- Original Message -----
From: "Jim Fulton"
To: "Martin v. Löwis"
Cc: catalog-sig at python.org
Sent: Wednesday, July 11, 2007 4:32:31 PM (GMT+0200) Europe/Athens
Subject: Re: [Catalog-sig] start on static generation, and caching - apache config.
On Jul 11, 2007, at 1:19 AM, Martin v. Löwis wrote:
...
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>>
>> Oh come on, I'm not the only one who has posted messages on this
>> mailing
>> list over the last few weeks reporting problems.
>
> Can you kindly refer to four or five such messages in the archives?
> I must have missed them.
http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html
http://mail.python.org/pipermail/catalog-sig/2007-June/001101.html
http://mail.python.org/pipermail/catalog-sig/2007-April/001049.html
http://mail.python.org/pipermail/catalog-sig/2006-November/000997.html
There haven't been a large number of messages.
There must not be a problem.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
--
Lovely Systems, Partner
phone: +43 5572 908060, fax: +43 5572 908060-77
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
From martin at v.loewis.de Wed Jul 11 19:40:34 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 19:40:34 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
Message-ID: <46951612.9010009@v.loewis.de>
> The speed has noticeably improved (thanks!) but as recently as Monday
> PyPI was unresponsive and then returning proxy errors. It definitely
> caused us (Creative Commons) to lose productivity Monday afternoon
> (PDT).
Ok. What precisely was that proxy error? (I'm puzzled, because I'm
not aware of a proxy somewhere)
Regards,
Martin
From fdrake at gmail.com Wed Jul 11 19:42:04 2007
From: fdrake at gmail.com (Fred Drake)
Date: Wed, 11 Jul 2007 13:42:04 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
Message-ID: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
On 7/11/07, Nathan R. Yergler wrote:
> The speed has noticeably improved (thanks!) but as recently as Monday
> PyPI was unresponsive and then returning proxy errors. It definitely
> caused us (Creative Commons) to lose productivity Monday afternoon
> (PDT).
We're seeing this right now, too. I'm checking both www.python.org
and cheeseshop.python.org.
-Fred
--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller
From nathan at creativecommons.org Wed Jul 11 19:47:33 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 10:47:33 -0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46951612.9010009@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
Message-ID:
On 7/11/07, "Martin v. Löwis" wrote:
> > The speed has noticeably improved (thanks!) but as recently as Monday
> > PyPI was unresponsive and then returning proxy errors. It definitely
> > caused us (Creative Commons) to lose productivity Monday afternoon
> > (PDT).
>
> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
> not aware of a proxy somewhere)
IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
is passing requests through to a local process (mod_rewrite or
mod_proxy?), and that process wasn't responding.
>
> Regards,
> Martin
>
From jim at zope.com Wed Jul 11 19:50:01 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 13:50:01 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46951612.9010009@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
Message-ID:
On Jul 11, 2007, at 1:40 PM, Martin v. Löwis wrote:
>> The speed has noticeably improved (thanks!) but as recently as Monday
>> PyPI was unresponsive and then returning proxy errors. It definitely
>> caused us (Creative Commons) to lose productivity Monday afternoon
>> (PDT).
>
> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
> not aware of a proxy somewhere)
Here's the error I just got after several minutes of spinning trying
to get: http://www.python.org/pypi/ZODB3
503 Service Temporarily Unavailable
Service Temporarily Unavailable
The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From benji at benjiyork.com Wed Jul 11 19:50:39 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 13:50:39 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46946E55.30308@v.loewis.de>
References: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
<46946E55.30308@v.loewis.de>
Message-ID: <4695186F.3030207@benjiyork.com>
Martin v. Löwis wrote:
> So yes, I trust that there have been complaints in the past - I
> wonder whether there are *still* complaints (beyond the ones
> of Jim Fulton).
Here's a complaint: the cheeseshop is down.
--
Benji York
http://benjiyork.com
From fdrake at gmail.com Wed Jul 11 19:50:56 2007
From: fdrake at gmail.com (Fred Drake)
Date: Wed, 11 Jul 2007 13:50:56 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
Message-ID: <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
On 7/11/07, Nathan R. Yergler wrote:
> IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> is passing requests through to a local process (mod_rewrite or
> mod_proxy?), and that process wasn't responding.
Firefox's "Page Info" says 503.
-Fred
--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller
From nathan at creativecommons.org Wed Jul 11 19:55:46 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 10:55:46 -0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
<9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
Message-ID:
I'm getting the following right now:
502 Proxy Error
Proxy Error
The proxy server received an invalid
response from an upstream server.
The proxy server could not handle the request GET /pypi.
Reason: Error reading from remote server
On 7/11/07, Fred Drake wrote:
> On 7/11/07, Nathan R. Yergler wrote:
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
>
> Firefox's "Page Info" says 503.
>
>
> -Fred
>
> --
> Fred L. Drake, Jr.
> "Chaos is the score upon which reality is written." --Henry Miller
>
From martin at v.loewis.de Wed Jul 11 20:01:59 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:01:59 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4694C8B6.1030804@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de>
<4694C8B6.1030804@benjiyork.com>
Message-ID: <46951B17.4000104@v.loewis.de>
Benji York schrieb:
> Martin v. Löwis wrote:
>>> People are doing it, usually in limited ways, out of desperation.
>> Same question to these people, then (whoever they are): why
>> do you think it's easier to build your own index in desperation,
>> rather than contributing to PyPI?
>
> Because they aren't aware of the progress being made or the intent to
> make more?
And then, why didn't they ask how they could help?
People can start all the projects they want, of course. It just seems
like a waste of volunteer time to work on competing projects.
> Here's one (you didn't say they had to be past messages ).
And indeed, I'm more interested in new reports than in old ones
(since the system changed since the old ones).
> Is your position that PyPI isn't down/very slow on occasion or that when
> it is no one complains?
Both. I believe it shouldn't be down, and I have no precise reports of
it being "very slow". Jim Fulton complained that it took 0.3s to
get a single package's page, which I cannot classify as "very slow".
> My team has lost many man hours to PyPI being down/glacially slow. This
> isn't meant to disparage PyPI though, if it weren't such a great thing
> it wouldn't be important to us.
But when did that happen precisely?
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 20:03:01 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:03:01 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
Message-ID: <46951B55.9050009@v.loewis.de>
>> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
>> not aware of a proxy somewhere)
>
> IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> is passing requests through to a local process (mod_rewrite or
> mod_proxy?), and that process wasn't responding.
Neither is going on for PyPI, AFAIK - it's mod_fastcgi.
Regards,
Martin
From nathan at creativecommons.org Wed Jul 11 20:06:20 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 11:06:20 -0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46951B55.9050009@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
<46951B55.9050009@v.loewis.de>
Message-ID:
On 7/11/07, "Martin v. Löwis" wrote:
> >> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
> >> not aware of a proxy somewhere)
> >
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
>
> Neither is going on for PyPI, AFAIK - it's mod_fastcgi.
>
So perhaps the external fastcgi server has barfed? Like I said, I was
just guessing based on past experience. I don't know enough about the
internals of PyPI to actually comment on how applicable that
experience is.
NRY
From benji at benjiyork.com Wed Jul 11 20:25:58 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 14:25:58 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46951B17.4000104@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de>
<4694C8B6.1030804@benjiyork.com> <46951B17.4000104@v.loewis.de>
Message-ID: <469520B6.2030002@benjiyork.com>
Martin v. Löwis wrote:
> Benji York schrieb:
>> Is your position that PyPI isn't down/very slow on occasion or that when
>> it is no one complains?
>
> Both. I believe it shouldn't be down
The cheeseshop has provided its own proof that that belief is mistaken
by being down as I began composing this message.
> Jim Fulton complained that it took 0.3s to
> get a single package's page, which I cannot classify as "very slow".
During a single run setuptools or zc.buildout may make hundreds of
requests to the cheeseshop taking a total time in the minutes. That's
not fast enough. I can't see a technical reason why these requests
couldn't be handled much faster than 3 a second.
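To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 0.3 s latency is the per-request figure quoted above; the request counts are illustrative assumptions, not measurements of any real buildout.

```python
def total_seconds(requests, seconds_per_request=0.3):
    """Wall-clock time for `requests` sequential requests at a fixed latency."""
    return requests * seconds_per_request


# Hypothetical request counts: a single package with a dozen releases,
# then small and large buildouts.
for n in (12, 100, 500):
    secs = total_seconds(n)
    print(f"{n} requests: roughly {secs:.0f} s ({secs / 60:.1f} min)")
```

The estimate assumes the requests are made strictly one after another, which is how the index lookups described in this thread behave; any parallelism or caching on the client side would lower it.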
>> My team has lost many man hours to PyPI being down/glacially slow. This
>> isn't meant to disparage PyPI though, if it weren't such a great thing
>> it wouldn't be important to us.
>
> But when did that happen precisely?
I don't recall precisely. I'll be sure to report outages religiously
from now on.
--
Benji York
http://benjiyork.com
From martin at v.loewis.de Wed Jul 11 20:27:00 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:27:00 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de>
<469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
Message-ID: <469520F4.2050708@v.loewis.de>
Nathan R. Yergler schrieb:
> I'm getting the following right now:
>
> 502 Proxy Error
>
> Proxy Error
> The proxy server received an invalid
> response from an upstream server.
> The proxy server could not handle the request GET /pypi.
> Reason: Error reading from remote server
>
Thanks for all the reports. I'm really puzzled about what precisely
happened. Apache has logged tons of error messages like this one:
[Wed Jul 11 20:11:01 2007] [warn] FastCGI: server
"/data/pypi/src/pypi/pypi.fcgi" has failed to remain running for 30
seconds given 3 attempts, its restart interval has been backed off to
600 seconds
That caused the outage: the PyPI FCGI servers stopped, and failed
to restart, so FCGI backed off starting new ones.
However, I don't understand why PyPI crashed - it did not leave
a log message, and did not send an error email. After restarting
it, it seems to run just fine. The first crashed server was started
at 7:56 (UTC+2), and, at 11:04, the line
[warn] FastCGI: server "/data/pypi/src/pypi/pypi.fcgi" (pid 3770)
terminated by calling exit with status '0'
was logged, i.e. PyPI voluntarily decided to exit. The same happened
later again and again, but I can't figure out why it would do such
a thing.
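For anyone who wants to scan an Apache error log for the same symptoms, a minimal Python sketch follows. The two patterns are modelled only on the warnings quoted in this message; other mod_fastcgi versions may word them differently, and the function name `classify` is made up for illustration.

```python
import re

# Patterns copied from the two warnings quoted above (assumption: the
# wording is stable across the log; it may not be in other versions).
BACKOFF = re.compile(
    r'FastCGI: server "(?P<path>[^"]+)" has failed to remain running '
    r'for (?P<secs>\d+) seconds given (?P<attempts>\d+) attempts, '
    r'its restart interval has been backed off to (?P<backoff>\d+) seconds'
)
EXITED = re.compile(
    r'FastCGI: server "(?P<path>[^"]+)" \(pid (?P<pid>\d+)\) '
    r"terminated by calling exit with status '(?P<status>\d+)'"
)


def classify(line):
    """Return ('backoff', seconds), ('exit', status) or None for one log line."""
    m = BACKOFF.search(line)
    if m:
        return ("backoff", int(m.group("backoff")))
    m = EXITED.search(line)
    if m:
        return ("exit", int(m.group("status")))
    return None


# The two log lines from this message, rejoined onto single lines.
sample = [
    '[Wed Jul 11 20:11:01 2007] [warn] FastCGI: server '
    '"/data/pypi/src/pypi/pypi.fcgi" has failed to remain running for 30 '
    'seconds given 3 attempts, its restart interval has been backed off to '
    '600 seconds',
    '[warn] FastCGI: server "/data/pypi/src/pypi/pypi.fcgi" (pid 3770) '
    "terminated by calling exit with status '0'",
]
for line in sample:
    print(classify(line))
```

An 'exit' with status 0 followed shortly by a 'backoff' is the sequence described here: the process leaves voluntarily, and after repeated short-lived restarts the restart interval is stretched to 600 seconds, during which requests fail.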
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 20:27:57 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:27:57 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
<46951B55.9050009@v.loewis.de>
Message-ID: <4695212D.6010406@v.loewis.de>
> So perhaps the external fastcgi server has barfed? Like I said, I was
> just guessing based on past experience. I don't know enough about the
> internals of PyPI to actually comment on how applicable that
> experience is.
I just looked into it a little - that happened, but I don't know why.
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 20:32:22 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:32:22 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4694D028.6050203@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
Message-ID: <46952236.30704@v.loewis.de>
> Although it wasn't part of the cheeseshop's original mission, it has
> become an integral part of distributing Python packages. If it doesn't
> want to participate in its new-found utility, other options need to be
> explored.
It's a software system; it doesn't have a mission.
I just dislike making unilateral decisions.
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 20:35:02 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:35:02 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
Message-ID: <469522D6.1070706@v.loewis.de>
> The questions for us is, how much effort we are willing to make to
> prevent people from shooting themselves in the foot. I can understand
> why Phillip would like the package index to prevent people from choosing
> problematic package names.
That's not my understanding - the issue isn't with "problematic package
names", but with conflicting package names. IOW, any single name is
fine - it's a pair of names that would cause a problem (and only if
you wanted to install both packages on the same system).
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 20:36:57 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:36:57 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
Message-ID: <46952349.5050606@v.loewis.de>
> OK, let's take a step back. IMO, PyPI is a *part* of the packaging
> system. If we can't agree that that is true, then we need to find a
> package index that *is* part of the package system.
It might be hairsplitting to discuss this specific question, but
I think the purpose of PyPI is to allow people to find Python
packages, i.e. it is a package index.
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 20:40:29 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:40:29 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
Message-ID: <4695241D.3090203@v.loewis.de>
> The .3-second times per request is *much* better than we had before
> (I assume), but it's *not fast enough*. The demand on the package
> index used by setuptools is going to increase substantially. Even if
> setuptools only made a single request per package, .3 seconds per
> request is too slow. Given the current structure of the index,
> setuptools has to make a request for the package and a request per
> release. For ZODB, this means about 12 requests, or more than 3
> seconds. Of course, this will increase over time, as more releases
> are made.
This I still don't understand. Why does it need to query all available
releases?
Regards,
Martin
From benji at benjiyork.com Wed Jul 11 20:41:32 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 14:41:32 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46952236.30704@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
<46952236.30704@v.loewis.de>
Message-ID: <4695245C.3020703@benjiyork.com>
Martin v. Löwis wrote:
>> Although it wasn't part of the cheeseshop's original mission, it has
>> become an integral part of distributing Python packages. If it doesn't
>> want to participate in its new-found utility, other options need to be
>> explored.
>
> It's a software system; it doesn't have a mission.
This SIG has a mission; I was under the impression that the cheeseshop
was developed to forward that mission. If not, we need to start work on
something that will provide a usable server-side complement to setuptools.
> I just dislike making unilateral decisions.
Fortunately you don't have to. We have several people here with varied
experience that have the facilities to communicate their desires and
expertise.
--
Benji York
http://benjiyork.com
From jim at zope.com Wed Jul 11 20:45:09 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 14:45:09 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <46952349.5050606@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
<46952349.5050606@v.loewis.de>
Message-ID:
On Jul 11, 2007, at 2:36 PM, Martin v. Löwis wrote:
>> OK, let's take a step back. IMO, PyPI is a *part* of the packaging
>> system. If we can't agree that that is true, then we need to find a
>> package index that *is* part of the package system.
>
> It might be hairsplitting to discuss this specific question, but
> I think the purpose of PyPI is to allow people to find Python
> packages, i.e. it is a package index.
Let me try to put this another way.
Can we agree that it is part of the purpose of PyPI to serve as a
repository for setuptools? I'd like to resolve this issue. If it
isn't part of PyPI's purpose to serve as a repository for setuptools,
then we'll build another system that *does* have that purpose. If it
is part of the purpose to serve as a repository for setuptools, then
we'll need to take various needs of setuptools into account.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Wed Jul 11 20:37:21 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 14:37:21 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469467AA.7070409@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
Message-ID: <20070711184549.733CE3A404D@sparrow.telecommunity.com>
At 07:16 AM 7/11/2007 +0200, Martin v. Löwis wrote:
> > Note that Windows (and Mac OS under certain circumstances) have filename
> > case insensitivity, and have different restrictions about what can or
> > can't be in a filename than Unix. Spaces and other punctuation
> > characters can cause problems for shells, even if they're theoretically
> > acceptable as filenames.
>
>I can see that collisions should be avoided in advance when it comes to
>file names. However, the name of a software package is not necessarily a
>file name,
Actually, it is. The distutils generate distribution filenames based on this.
> > IOW, setuptools' focus is more on distribution filename safety, rather
> > than on sensible naming distinctions for end users. The former is less
> > restrictive than the latter, I believe.
>
>Yes. However, it's not clear to me that the infrastructure needs to
>(or even is able to) enforce sensible naming.
I said sensible *distinctions* - not sensible naming. Clearly, we
can't advise people not to publish packages named "Joe's
Miscellaneous Functions", at least not in an automated way. :)
> Instead, any policing
>that might be necessary should be done in the community. If two
>packages are named too similarly, users will get confused, and
>eventually one package may disappear, get renamed, get its naming
>challenged in court, and so on. It's not the job of the package
>*index* to do that sort of policing.
Within its own scope, that's a valid and sensible argument. Within
the larger scope of "what is good for users", I would say it does no
*good* to allow people to register such similar package names, and in
many cases will do *harm* to do so.
Contrariwise, it will not do *harm* to anyone to reject their
too-similar name, and will in fact do them good. Today, I almost
created a package called "Aspects". Had I done so, and uploaded it
to the Cheeseshop, I wouldn't have been warned that there is already
a package named "aspects". I would have been well on my way to
creating confusion that would be entirely avoidable, were the
Cheeseshop to stop me at the point of registration or uploading.
Since the restriction can cause no real harm, and produces a net
good, but the lack of restriction can cause real harm (e.g., I had to
later change a package name, thereby breaking dependencies in other
packages), there is no reason *not* to provide that benefit to the
users, and protect them from that harm.
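As a rough illustration of the kind of check meant here, a Python sketch follows. The folding rule (lowercase, with runs of punctuation collapsed to a single dash) and the names `collision_key` and `find_conflict` are assumptions made up for this example; this is not what the Cheeseshop or setuptools currently does.

```python
import re


def collision_key(name):
    """Fold case and collapse punctuation runs so near-identical names match."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def find_conflict(new_name, registered):
    """Return an already-registered name that collides with new_name, or None."""
    key = collision_key(new_name)
    for existing in registered:
        if collision_key(existing) == key:
            return existing
    return None


# Hypothetical registry contents, using names mentioned in this thread.
registered = ["aspects", "ZODB3", "zc.buildout"]
print(find_conflict("Aspects", registered))      # differs only by case
print(find_conflict("zc_buildout", registered))  # differs only by punctuation
print(find_conflict("Brand-New", registered))    # no similar name registered
```

A check like this at registration time would have flagged "Aspects" against the existing "aspects" before anything was uploaded. It only catches names that would clash as filenames on case-insensitive systems; it does nothing about names that are merely confusing to humans.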
Perhaps, as Jim says, it is time to start treating PyPI as part of
the packaging system. It is so in fact, anyway. Meanwhile, the
separation between cataloging and packaging causes other issues, such
as the complete disconnect between the cataloging of metadata and the
automated production and use of such metadata. The PKG-INFO format
has been degrading with each new version, in terms of defining more
metadata for which over-restrictive *syntax* is defined, while being
almost completely lacking in any *semantics*.
This schism between the idea of neatly cataloging things, versus
being able to actually *use* that cataloging for practical purposes
by automated tools (as opposed to being usable only to human
readers), seems to be at the heart of some of the current discussion.
From martin at v.loewis.de Wed Jul 11 20:46:02 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 20:46:02 +0200
Subject: [Catalog-sig] cheeseshop outage
In-Reply-To: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de>
<469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de>
<9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
Message-ID: <4695256A.5020208@v.loewis.de>
Fred Drake schrieb:
> On 7/11/07, Nathan R. Yergler wrote:
>> The speed has noticeably improved (thanks!) but as recently as Monday
>> PyPI was unresponsive and then returning proxy errors. It definitely
>> caused us (Creative Commons) to lose productivity Monday afternoon
>> (PDT).
>
> We're seeing this right now, too. I'm checking both www.python.org
> and cheeseshop.python.org.
If www.python.org is up, it should be safe to ignore. If you can find any
post-mortem evidence on ximinez, that would be much appreciated.
Regards,
Martin
P.S. Why is www.python.org proxying for ximinez? Shouldn't it perform
redirects instead?
From benji at benjiyork.com Wed Jul 11 20:52:50 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 14:52:50 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711184549.733CE3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de>
<20070711184549.733CE3A404D@sparrow.telecommunity.com>
Message-ID: <46952702.8060606@benjiyork.com>
Phillip J. Eby wrote:
> This schism between the idea of neatly cataloging things, versus
> being able to actually *use* that cataloging for practical purposes
> by automated tools (as opposed to being usable only to human
> readers), seems to be at the heart of some of the current discussion.
Wasn't there a proposal to merge the catalog-sig and distutils-sig?
--
Benji York
http://benjiyork.com
From jim at zope.com Wed Jul 11 20:57:43 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 14:57:43 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4695241D.3090203@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
Message-ID: <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
On Jul 11, 2007, at 2:40 PM, Martin v. Löwis wrote:
>> The .3-second times per request is *much* better than we had before
>> (I assume), but it's *not fast enough*. The demand on the package
>> index used by setuptools is going to increase substantially. Even if
>> setuptools only made a single request per package, .3 seconds per
>> request is too slow. Given the current structure of the index,
>> setuptools has to make a request for the package and a request per
>> release. For ZODB, this means about 12 requests, or more than 3
>> seconds. Of course, this will increase over time, as more releases
>> are made.
>
> This I still don't understand. Why does it need to query all available
> releases?
The way that setuptools currently works, it scans each of the release
pages looking for distributions. In theory, it could take the names
of these pages into account and scan fewer. It will still have to
scan at least 2.
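To make the arithmetic concrete, here is a rough sketch using only the figures quoted in this thread (0.3 seconds per request, about 12 requests for ZODB, a best case of 2 pages). The 300-request run is a made-up stand-in for "hundreds of requests", not a measurement.

```python
SECONDS_PER_REQUEST = 0.3  # per-request time quoted in this thread


def index_cost(requests, seconds_per_request=SECONDS_PER_REQUEST):
    """Total time spent waiting on index pages, ignoring downloads."""
    return requests * seconds_per_request


zodb_today = index_cost(12)   # package page plus release pages, as quoted
best_case = index_cost(2)     # package page plus a single release page
big_run = index_cost(300)     # hypothetical run touching many packages

print("ZODB today:  %.1f s" % zodb_today)
print("best case:   %.1f s" % best_case)
print("300 requests: %.0f s" % big_run)
```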
I have a feeling that I'll never convince you that a third of a
second is too slow. I think I'll stop trying. Hopefully, René will
be able to get baking working, at which point the pages will be a lot
faster. At that point, I think it would be good to pursue alternate
pages more optimized for setuptools to reduce the number and size of
setuptools requests. I'll help any way I can with that.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Wed Jul 11 21:03:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 15:03:12 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469520B6.2030002@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
<46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com>
<46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com>
Message-ID: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
At 02:25 PM 7/11/2007 -0400, Benji York wrote:
>Martin v. Löwis wrote:
> > Benji York schrieb:
>
> >> Is your position that PyPI isn't down/very slow on occasion or that when
> >> it is no one complains?
> >
> > Both. I believe it shouldn't be down
>
>The cheeseshop has provided its own proof that that belief is mistaken
>by being down as I began composing this message.
>
> > Jim Fulton complained that it took 0.3s to
> > get a single package's page, which I cannot classify as "very slow".
>
>During a single run setuptools or zc.buildout may make hundreds of
>requests to the cheeseshop taking a total time in the minutes. That's
>not fast enough. I can't see a technical reason why these requests
>couldn't be handled much faster than 3 a second.
An interesting thought for future optimization... an XML-RPC catalog
server designed for this use case could in fact do all the
computation server-side, resolving dependencies and evaluating
version constraints. Heck, in theory, it could cache packages'
external links, and simply hand back to the caller a complete list of
candidate URLs to choose for downloading. That way, most activities
would take only one server round-trip to complete, if the client sent
a list of everything it expects to need, and the server includes
everything that the server expects the client to want due to those
things' dependencies.
The main obstacle to implementing such a service today, is that it
would have no way of knowing what dependencies to look for, without
sniffing the contents of .egg files. But, as long as a superset of
possible dependencies was listed in PKG-INFO, the server could make
intelligent guesses about what other packages are likely to be
needed, and return their version/download info as well. Returning
information for packages that turn out not to be needed is likely to
be far less expensive than having to make round-trip requests.
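A minimal sketch of the server-side part of that idea, assuming the server already holds a table of declared dependencies and download URLs. The catalog contents, package names and URLs below are invented for illustration, and version constraints are left out entirely.

```python
# Hypothetical catalog: declared dependencies and known download URLs.
CATALOG = {
    "exampleapp": {
        "requires": ["examplelib", "exampleutil"],
        "urls": ["http://example.invalid/exampleapp-1.0.tar.gz"],
    },
    "examplelib": {
        "requires": ["exampleutil"],
        "urls": ["http://example.invalid/examplelib-2.1.tar.gz"],
    },
    "exampleutil": {
        "requires": [],
        "urls": ["http://example.invalid/exampleutil-0.3.tar.gz"],
    },
}


def resolve(wanted):
    """Return {package: [urls]} for wanted packages and everything they need.

    The client sends everything it expects to need in one call; the server
    walks the dependency table so no further round trips are required.
    """
    result = {}
    pending = list(wanted)
    while pending:
        name = pending.pop()
        if name in result or name not in CATALOG:
            continue  # already handled, or unknown to this server
        entry = CATALOG[name]
        result[name] = list(entry["urls"])
        pending.extend(entry["requires"])
    return result


# A function like this could be exposed with the standard library's
# xmlrpc.server.SimpleXMLRPCServer via register_function(resolve).
print(sorted(resolve(["exampleapp"])))
```

Returning a superset of what the client ends up using costs a few extra bytes per response, which is the tradeoff argued for above.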
An alternative to providing that information from metadata, of
course, would be for the client to include a "referrer" header of
sorts, saying why it is asking for a package. The server could then
simply "learn" the relevant associations.
From jim at zope.com Wed Jul 11 21:08:03 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:08:03 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
<46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com>
<46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com>
<20070711190058.2322F3A404D@sparrow.telecommunity.com>
Message-ID: <9EE8B28D-5B16-4AE8-8001-E3ECCC34A199@zope.com>
On Jul 11, 2007, at 3:03 PM, Phillip J. Eby wrote:
> At 02:25 PM 7/11/2007 -0400, Benji York wrote:
>> Martin v. Löwis wrote:
>>> Benji York schrieb:
>>
>>>> Is your position that PyPI isn't down/very slow on occasion or
>>>> that when
>>>> it is no one complains?
>>>
>>> Both. I believe it shouldn't be down
>>
>> The cheeseshop has provided its own proof that that belief is
>> mistaken
>> by being down as I began composing this message.
>>
>>> Jim Fulton complained that it took 0.3s to
>>> get a single package's page, which I cannot classify as "very slow".
>>
>> During a single run setuptools or zc.buildout may make hundreds of
>> requests to the cheeseshop taking a total time in the minutes.
>> That's
>> not fast enough. I can't see a technical reason why these requests
>> couldn't be handled much faster than 3 a second.
>
> An interesting thought for future optimization... an XML-RPC catalog
> server designed for this use case could in fact do all the
> computation server-side, resolving dependencies and evaluating
> version constraints. Heck, in theory, it could cache packages'
> external links, and simply hand back to the caller a complete list of
> candidate URLs to choose for downloading. That way, most activities
> would take only one server round-trip to complete, if the client sent
> a list of everything it expects to need, and the server includes
> everything that the server expects the client to want due to those
> things' dependencies.
That wouldn't help when local (e.g. development) or private
distributions need to be included in the mix.
I think collecting all of the links for a package that PYPI knows
about on individual package pages would go a very long way to
reducing the number of requests. If these pages were served
statically (or in similar times), then I think we'd be in very good
shape.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Wed Jul 11 21:13:42 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 21:13:42 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4695245C.3020703@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
<46952236.30704@v.loewis.de> <4695245C.3020703@benjiyork.com>
Message-ID: <46952BE6.1070604@v.loewis.de>
Benji York schrieb:
> Martin v. Löwis wrote:
>>> Although it wasn't part of the cheeseshop's original mission, it has
>>> become an integral part of distributing Python packages. If it doesn't
>>> want to participate in its new-found utility, other options need to be
>>> explored.
>>
>> It's a software system; it doesn't have a mission.
>
> This SIG has a mission, I was under the impression that the cheeseshop
> was developed to forward that mission.
That's true. That mission is "The Python Catalog SIG aims at producing a
master index of Python software and other resources."
I think this still is the mission - be *the* central site for indexing
Python software. The part "other resources" apparently never was
considered; it only indexes software now.
>> I just dislike making unilateral decisions.
>
> Fortunately you don't have to. We have several people here with varied
> experience that have the facilities to communicate their desires and
> expertise.
Ok. Of course, here the usual software engineer's reaction comes into
play: if you don't think something is that important, you try to come
up with reasons for not doing it. I should have been more open: I don't
see that I have time to implement the clashing check that Phillip
proposed, although I'll see what I can do about the redirect
on lookup.
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 21:23:51 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 21:23:51 +0200
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
<46952349.5050606@v.loewis.de>
Message-ID: <46952E47.8020700@v.loewis.de>
> Can we agree that it is part of the purpose of PyPI to serve as a
> repository for setuptools? I'd like to resolve this issue. If it isn't
> part of PyPI's purpose to serve as a repository for setuptools, then
> we'll build another system that *does* have that purpose. If it is part
> of the purpose to serve as a repository for setuptools, then we'll need
> to take various needs of setuptools into account.
I can't answer that question. I know PyPI is a master index of Python
software and other resources, because (as Benji York kindly reminded
me) that's the mission under which it was created.
Beyond that, it is what the community makes it to be. I personally know
it is not a "repository for setuptools" for *me*, as I don't use
setuptools. I also know it is a "repository for setuptools" for you,
as you have reported using it for that purpose. For many of the package
authors, I think it is a platform to advertise their software; for
some, it is also a web hosting service to place their released files
onto.
As for taking needs into account: First of all, it's a volunteer
project. Open source contributors are known to primarily scratch
their own itches. So if you want to see needs be taken into account,
you may have to write the code yourself, pay somebody to write
it for you, or talk somebody into writing it for you. In particular,
I personally won't write any line of code just because of a threat to
go away and write a competing index. Instead, my reaction to such
a threat remains the same: good luck!
Regards,
Martin
From benji at benjiyork.com Wed Jul 11 21:27:45 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 15:27:45 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46952BE6.1070604@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
<46952236.30704@v.loewis.de> <4695245C.3020703@benjiyork.com>
<46952BE6.1070604@v.loewis.de>
Message-ID: <46952F31.5020806@benjiyork.com>
Martin v. Löwis wrote:
> That's true. That mission is "The Python Catalog SIG aims at producing a
> master index of Python software and other resources."
>
> I think this still is the mission - be *the* central site for indexing
> Python software. The part "other resources" apparently never was
> considered; it only indexes software now.
There exists ambiguity as to the audience for the index. Humans are
assumed; I propose that packaging systems need to be on the list as well.
> I should have been more open: I don't
> see that I have time to implement the clashing check that Phillip
> proposed, although I'll see what I can do about the redirect
> on lookup.
Knowing your motivation helps. I don't think anyone expected you to
jump on the implementation. It's OK to say that you don't have time to
implement something. There are other people that can help, and if not
it'll just have to wait. We have to make sure we distinguish between
desirability and feasibility.
--
Benji York
http://benjiyork.com
From pje at telecommunity.com Wed Jul 11 21:25:44 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 15:25:44 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46952702.8060606@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<20070711184549.733CE3A404D@sparrow.telecommunity.com>
<46952702.8060606@benjiyork.com>
Message-ID: <20070711192751.A9FF33A404D@sparrow.telecommunity.com>
At 02:52 PM 7/11/2007 -0400, Benji York wrote:
>Phillip J. Eby wrote:
>>This schism between the idea of neatly cataloging things, versus
>>being able to actually *use* that cataloging for practical purposes
>>by automated tools (as opposed to being usable only to human
>>readers), seems to be at the heart of some of the current discussion.
>
>Wasn't there a proposal to merge the catalog-sig and distutils-sig?
Merging the lists isn't going to merge the people or change anybody's
point of view. The difference in SIGs reflects, for the most part, a
difference in Special Interest -- the "I" in SIG.
Or another way of looking at the "I" is "Itch". The people who have
been working on cataloging already have their itch basically
scratched; PyPI has been sufficient for their needs for some time now.
The packaging people, OTOH, have an ever-increasing itch, as
setuptools hits its "hockey stick" growth phase both in user volume
and package volume. This is, understandably, of little interest to
people who don't do lots of packaging, deployment, and distribution.
I absolutely don't want to disparage the good folks who have made
PyPI what it is today, and I totally understand their not wanting to
take on the burden of supporting a tool they don't use or care about
themselves, just because it happens to use PyPI.
But it seems to me that for folks whose Interest/Itch is not merely
finding packages, but *using* them, a different infrastructure is
needed, treating PyPI as the ultimate *source* of the information,
without being also its sole *distribution* point, or query interface.
There are plenty of folks who have offered to spend funds, provide
hosting, etc. for PyPI mirrors or alternatives -- perhaps we should
create a SIG to start figuring out *how* to provide that, ideally
while creating the least amount of additional service burden on the Cheeseshop.
Ideally, we could then support having the Cheeseshop redirect
existing clients to a nearby distribution index, while newer clients
could use a distribution index to start with.
Such a discussion would need to resolve certain design tradeoffs such
as speed and availability vs. freshness of the index vs. load on the
primary Cheeseshop vs. ability to have lots of mirrors/distribution
indexes vs. ease of selecting one, etc.
But I believe the main reason why such discussion hasn't gone very
far at this point is because the packaging-interest folks have been
looking to the cataloging-interest folks to provide direction and
focus to the discussion of the tradeoffs, even though these things
lie mostly outside their itch/interest. I think it is more likely to
be productive for the packaging-interest folks to get clear about
what they want first, and then the cataloging-interest folks can
chime in if they see something being proposed that might be
especially harmful to the Cheeseshop's availability or performance.
From jim at zope.com Wed Jul 11 21:29:23 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:29:23 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <46952E47.8020700@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
<46952349.5050606@v.loewis.de>
<46952E47.8020700@v.loewis.de>
Message-ID: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
On Jul 11, 2007, at 3:23 PM, Martin v. Löwis wrote:
...
> As for taking needs into account: First of all, it's a volunteer
> project. Open source contributors are known to primarily scratch
> their own itches.
Thank you for explaining open source to me.
> So if you want to see needs be taken into account,
> you may have to write the code yourself, pay somebody to write
> it for you, or talk somebody into writing it for you.
Yup. I'm aware of that.
> In particular,
> I personally won't write any line of code just because of a threat to
> go away and write a competing index.
First, I'm not aware that anyone has asked you to do anything.
Second, I certainly meant no threat. We need a working index to use
with setuptools. I would hope, in the spirit of open source, to
collaborate on that. A basic question that needs to be answered is
whether to use PyPI or to build something else.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Wed Jul 11 21:41:59 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 21:41:59 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
Message-ID: <46953287.8020702@v.loewis.de>
>> This I still don't understand. Why does it need to query all available
>> releases?
>
> The way that setuptools currently works, it scans each of the release
> pages looking for distributions. In theory, it could take the names of
> these pages into account and scan fewer. It will still have to scan at
> least 2.
Can you elaborate please? Why does it need to find distributions for
versions that it will eventually not download?
> I have a feeling that I'll never convince you that a third of a second
> is too slow.
That's likely, yes.
> to get baking working, at which point the pages will be a lot faster.
> At that point, I think it would be good to pursue alternate pages more
> optimized for setuptools to reduce the number and size of setuptools
> requests. I'll help any way I can with that.
Deal: please provide sample pages for some of the packages (starting
with some zc packages perhaps), plus a directory structure in which
they should live.
I'll put them up on ximinez, at (say) /raw (or /simple, or
whatever URL people propose), so that one can experiment with
whether they look right.
Then somebody else can write a generator to populate that; I
will at the earliest point when I have time (which won't be
before August), unless somebody does it earlier.
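As a starting point for that generator, here is a rough sketch that writes one static page per package into a directory tree, with every known download link on that page. The <root>/<package>/index.html layout, the page markup and the sample data are assumptions for discussion, not an agreed format.

```python
import html
import os
import tempfile


def render_page(package, files):
    """Render one static page listing every (filename, url) pair for a package."""
    links = "\n".join(
        '<a href="%s">%s</a><br/>' % (html.escape(url, quote=True), html.escape(name))
        for name, url in files
    )
    return (
        "<html><head><title>Links for %s</title></head><body>\n%s\n</body></html>\n"
        % (html.escape(package), links)
    )


def bake(root, index):
    """Write <root>/<package>/index.html for every package in index."""
    for package, files in index.items():
        directory = os.path.join(root, package)
        os.makedirs(directory, exist_ok=True)
        with open(os.path.join(directory, "index.html"), "w", encoding="utf-8") as f:
            f.write(render_page(package, files))


# Hypothetical sample data; a real generator would read the PyPI database.
index = {
    "examplepkg": [
        ("examplepkg-1.0.tar.gz", "http://example.invalid/examplepkg-1.0.tar.gz"),
        ("examplepkg-1.1.tar.gz", "http://example.invalid/examplepkg-1.1.tar.gz"),
    ],
}

root = tempfile.mkdtemp()
bake(root, index)
print(os.path.join(root, "examplepkg", "index.html"))
```

Because the output is plain files, the web server can hand them out without running any Python per request.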
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 21:53:04 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 21:53:04 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de>
<46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com>
<46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com>
<20070711190058.2322F3A404D@sparrow.telecommunity.com>
Message-ID: <46953520.4080106@v.loewis.de>
> An interesting thought for future optimization... an XML-RPC catalog
> server designed for this use case could in fact do all the computation
> server-side, resolving dependencies and evaluating version constraints.
> Heck, in theory, it could cache packages' external links, and simply
> hand back to the caller a complete list of candidate URLs to choose for
> downloading.
You mean something like
    select f.filename
      from release_files f, releases r
     where f.name = 'setuptools'
       and f.name = r.name
       and f.version = r.version
       and not r._pypi_hidden;
This gives
    filename
    ----------------------------------
    setuptools-0.6c5.win32-py2.3.exe
    setuptools-0.6c5-py2.3.egg
    setuptools-0.6c5.win32-py2.4.exe
    setuptools-0.6c5-1.src.rpm
    setuptools-0.6c5.win32-py2.5.exe
    setuptools-0.6c5.tar.gz
    setuptools-0.6c5-py2.5.egg
    setuptools-0.6c5-py2.4.egg
That would be very easy to add to the RPC server, and
would be quite efficient also.
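For anyone who wants to try the query locally, here is a small sketch that runs it against an in-memory SQLite database. The table and column names come from the query above; the column types, the sample rows and the hidden 0.6c4 release are assumptions made up for the demonstration, and the real database is not SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table releases (name text, version text, _pypi_hidden boolean);
    create table release_files (name text, version text, filename text);
""")

# Hypothetical sample rows: one visible release and one hidden release.
conn.executemany(
    "insert into releases values (?, ?, ?)",
    [("setuptools", "0.6c5", 0), ("setuptools", "0.6c4", 1)],
)
conn.executemany(
    "insert into release_files values (?, ?, ?)",
    [
        ("setuptools", "0.6c5", "setuptools-0.6c5.tar.gz"),
        ("setuptools", "0.6c5", "setuptools-0.6c5-py2.5.egg"),
        ("setuptools", "0.6c4", "setuptools-0.6c4.tar.gz"),
    ],
)


def release_files(conn, package):
    """Return filenames of all files belonging to non-hidden releases."""
    rows = conn.execute(
        "select f.filename from release_files f, releases r "
        "where f.name = ? and f.name = r.name and f.version = r.version "
        "and not r._pypi_hidden order by f.filename",
        (package,),
    )
    return [row[0] for row in rows]


print(release_files(conn, "setuptools"))
```

The only changes from the query as posted are the bound parameter in place of the literal package name and an added "order by" to make the output stable.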
> That way, most activities would take only one server
> round-trip to complete, if the client sent a list of everything it
> expects to need, and the server includes everything that the server
> expects the client to want due to those things' dependencies.
>
> The main obstacle to implementing such a service today, is that it would
> have no way of knowing what dependencies to look for, without sniffing
> the contents of .egg files.
For that, I would definitely need code contributions.
Regards,
Martin
From jim at zope.com Wed Jul 11 21:57:47 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:57:47 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46953287.8020702@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
<46953287.8020702@v.loewis.de>
Message-ID:
On Jul 11, 2007, at 3:41 PM, Martin v. Löwis wrote:
>>> This I still don't understand. Why does it need to query all
>>> available
>>> releases?
>>
>> The way that setuptools currently works, it scans each of the release
>> pages looking for distributions. In theory, it could take the
>> names of
>> these pages into account and scan fewer. It will still have to
>> scan at
>> least 2.
>
> Can you elaborate please? Why does it need to find distributions for
> versions that it will eventually not download?
It just scans the package page for URLs. It doesn't really know that
the release pages correspond to a particular version.
Let's suppose that setuptools was changed to be aware that PyPI
release pages correspond to a particular version. In that case, it
would have to read the package page to discover the release pages and
then it would have to read at least one release page. If it had
requirements other than the version (e.g. Python version or
platform), it might have to scan several releases to find an
acceptable distribution. But, in the best case, it would have to
scan at least two pages.
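The scanning behaviour described above can be sketched roughly as follows. This is not setuptools' actual code: the page contents, URL layout and suffix list are invented, and pages are looked up in a dictionary instead of being fetched over HTTP, so the fetch counter stands in for real requests.

```python
from html.parser import HTMLParser

# Hypothetical index: a package page linking to two release pages.
PAGES = {
    "/pypi/examplepkg": '<a href="/pypi/examplepkg/1.0">1.0</a> '
                        '<a href="/pypi/examplepkg/1.1">1.1</a>',
    "/pypi/examplepkg/1.0": '<a href="http://example.invalid/examplepkg-1.0.tar.gz">sdist</a>',
    "/pypi/examplepkg/1.1": '<a href="http://example.invalid/examplepkg-1.1.tar.gz">sdist</a>',
}

DIST_SUFFIXES = (".tar.gz", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def scan(page):
    collector = LinkCollector()
    collector.feed(page)
    return collector.links


def find_distributions(start):
    """Follow every link from the package page; count the pages fetched."""
    fetched, dists, seen = 0, [], set()
    pending = [start]
    while pending:
        url = pending.pop()
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        fetched += 1  # one request per index page visited
        for link in scan(PAGES[url]):
            if link.endswith(DIST_SUFFIXES):
                dists.append(link)
            else:
                pending.append(link)  # not a distribution, so scan it too
    return sorted(dists), fetched


dists, fetched = find_distributions("/pypi/examplepkg")
print(fetched, "pages fetched for", len(dists), "distributions")
```

With one release page per version, the number of pages fetched grows with every release, which is the cost being discussed here.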
...
>> to get baking working, at which point the pages will be a lot faster.
>> At that point, I think it would be good to pursue alternate pages
>> more
>> optimized for setuptools to reduce the number and size of setuptools
>> requests. I'll help any way I can with that.
>
> Deal: please provide sample pages for some of the packages (starting
> with some zc packages perhaps), plus a directory structure in which
> they should live.
Fair enough. I'll do that.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Wed Jul 11 22:08:33 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 22:08:33 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711192751.A9FF33A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<20070711184549.733CE3A404D@sparrow.telecommunity.com>
<46952702.8060606@benjiyork.com>
<20070711192751.A9FF33A404D@sparrow.telecommunity.com>
Message-ID: <469538C1.4050404@v.loewis.de>
> There are plenty of folks who have offered to spend funds, provide
> hosting, etc. for PyPI mirrors or alternatives -- perhaps we should
> create a SIG to start figuring out *how* to provide that, ideally while
> creating the least amount of additional service burden on the Cheeseshop.
This makes me suspicious. I can certainly believe that you may need more
sheer processing power, or more bandwidth, for such a system than the
current PyPI installation has to offer.
What I don't see is why you need to implement something *different*. If
you need better queries - fine, add them to PyPI. If you need
replication, load balancing, etc, please add it to PyPI. If you
have a way faster machine, migrate PyPI to that machine. That
is all possible, but assumes availability of volunteers. However,
the approach "let's create a different system" *also* needs
volunteers. So I'd rather have these volunteers contribute to
a single system, instead of each of them building their own one.
With the particular offer of a faster machine, *all* it needs
is a volunteer who first migrates and then maintains the
installation. Of course, that would involve responsibility for
all of PyPI (i.e. also dealing with abandoned packages that
somebody else takes over, adding new classifiers, etc) (I
say that because that aspect also lacks volunteers in the
current installation).
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 22:11:09 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:11:09 +0200
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
<46952349.5050606@v.loewis.de>
<46952E47.8020700@v.loewis.de>
<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
Message-ID: <4695395D.5030602@v.loewis.de>
> Second, I certainly meant no threat. We need a working index to use
> with setuptools. I would hope, in the spirit of open source to
> collaborate on that. A basic question that needs to be answered is
> whether to use PyPI or to build something else.
Ok. For this question, there is a seemingly-obvious answer: use PyPI.
Why on earth would somebody want to build something else?
Regards,
Martin
From martin at v.loewis.de Wed Jul 11 22:15:44 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:15:44 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
<46953287.8020702@v.loewis.de>
Message-ID: <46953A70.6070600@v.loewis.de>
> Let's suppose that setuptools was changed to be aware that PyPI release
> pages correspond to a particular version. In that case, it would have
> to read the package page to discover the release pages and then it would
> have to read at least one release page. If it had requirements other
> than the version (e.g. Python version or platform), it might have to
> scan several releases to find an acceptable distribution. But, in the
> best case, it would have to scan at least two pages.
Sure. However, that makes the difference between O(1) and O(N),
where N is the number of releases recorded. Going back to your
original concern: you would not have to change the policy of
keeping many different releases if the number of releases
does not impact performance.
When it looks for individual release pages, does it know that these
are release pages, or does it follow all links on the package
page? If the latter, what links does it follow (there are plenty
more on the package page)?
Regards,
Martin
From jim at zope.com Wed Jul 11 22:18:08 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 16:18:08 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <4695395D.5030602@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de>
<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
<469468C5.8000906@v.loewis.de>
<46952349.5050606@v.loewis.de>
<46952E47.8020700@v.loewis.de>
<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
<4695395D.5030602@v.loewis.de>
Message-ID: <86EDAB6A-62C4-437C-82CD-34242258472C@zope.com>
On Jul 11, 2007, at 4:11 PM, Martin v. Löwis wrote:
>> Second, I certainly meant no threat. We need a working index to use
>> with setuptools. I would hope, in the spirit of open source to
>> collaborate on that. A basic question that needs to be answered is
>> whether to use PyPI or to build something else.
>
> Ok. For this question, there is a seemingly-obvious answer: use PyPI.
> Why on earth would somebody want to build something else?
If we can make PyPI do what we (where "we" doesn't have to include
"you") need, then there is no reason.
I don't want to shove a bunch of requirements down someone's throat.
I understand that you don't object to new requirements if you don't
have to be responsible for them. That's perfectly fair.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From benji at benjiyork.com Wed Jul 11 22:22:09 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 16:22:09 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <4695395D.5030602@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <46952349.5050606@v.loewis.de> <46952E47.8020700@v.loewis.de> <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
<4695395D.5030602@v.loewis.de>
Message-ID: <46953BF1.2020905@benjiyork.com>
Martin v. Löwis wrote:
>> Second, I certainly meant no threat. We need a working index to use
>> with setuptools. I would hope, in the spirit of open source to
>> collaborate on that. A basic question that needs to be answered is
>> whether to use PyPI or to build something else.
>
> Ok. For this question, there is a seemingly-obvious answer: use PyPI.
> Why on earth would somebody want to build something else?
Great; now that we've established that PyPI's audience will include
setuptools, the people who know what it wants can make (or reiterate)
proposals.
--
Benji York
http://benjiyork.com
From jodok at lovelysystems.com Wed Jul 11 23:15:43 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Wed, 11 Jul 2007 21:15:43 +0000 GMT
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de><069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com><469468C5.8000906@v.loewis.de><46952349.5050606@v.loewis.de><46952E47.8020700@v.loewis.de>
<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
Message-ID: <1827602359-1184184972-cardhu_blackberry.rim.net-22952-@engine37-cell01.bwc.produk.on.blackberry>
+1 on all you said jim
--
Lovely Systems, Partner
phone: +43 5572 908060, fax: +43 5572 908060-77
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
-----Original Message-----
From: Jim Fulton
Date: Wed, 11 Jul 2007 15:29:23
To: "Martin v. Löwis"
Cc:catalog-sig at python.org
Subject: Re: [Catalog-sig] The purpose(s) of PYPI
On Jul 11, 2007, at 3:23 PM, Martin v. Löwis wrote:
...
> As for taking needs into account: First of all, it's a volunteer
> project. Open source contributors are known to primarily scratch
> their own itches.
Thank you for explaining open source to me.
> So if you want to see needs be taken into account,
> you may have to write the code yourself, pay somebody to write
> it for you, or talk somebody into writing it for you.
Yup. I'm aware of that.
> In particular,
> I personally won't write any line of code just because of a threat to
> go away and write a competing index.
First, I'm not aware that anyone has asked you to do anything.
Second, I certainly meant no threat. We need a working index to use
with setuptools. I would hope, in the spirit of open source, to
collaborate on that. A basic question that needs to be answered is
whether to use PyPI or to build something else.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
_______________________________________________
Catalog-SIG mailing list
Catalog-SIG at python.org
http://mail.python.org/mailman/listinfo/catalog-sig
From jim at zope.com Wed Jul 11 22:29:56 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 16:29:56 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46953A70.6070600@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
<46953287.8020702@v.loewis.de>
<46953A70.6070600@v.loewis.de>
Message-ID:
On Jul 11, 2007, at 4:15 PM, Martin v. Löwis wrote:
>> Let's suppose that setuptools was changed to be aware that PyPI release
>> pages correspond to a particular version. In that case, it would have
>> to read the package page to discover the release pages and then it would
>> have to read at least one release page. If it had requirements other
>> than the version (e.g. Python version or platform), it might have to
>> scan several releases to find an acceptable distribution. But, in the
>> best case, it would have to scan at least two pages.
>
> Sure. However, that makes the difference between O(1) and O(N),
> where N is the number of releases recorded. Going back to your
> original concern: you would not have to change the policy of
> keeping many different releases if the number of releases
> does not impact performance.
Yup. Absolutely. That's why we should change the index or
setuptools, or both. IMO, it makes the most sense to change the
index to have setuptools specific pages, in addition to the ones for
humans, that allow:
- One page per package and
- a minimal amount of data to be downloaded and scanned per page.
(As I noted before, release pages are meant for humans. They
sometimes contain *lots* of data that setuptools doesn't need.)
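To make the "one page per package, minimal data" idea concrete, here is a rough sketch of what generating such a page could look like. The function name, the page layout, and the example.invalid URLs are all made up for illustration; nothing here is an existing PyPI or setuptools format.

```python
from html import escape


def render_package_page(project, files):
    """Return a minimal HTML page with one link per distribution file.

    `files` is a list of (filename, url) pairs. Both the input shape and
    the page layout are assumptions made for this sketch.
    """
    lines = ["<html><head><title>Links for %s</title></head><body>"
             % escape(project)]
    for filename, url in files:
        # One anchor per downloadable file; no descriptions, no metadata.
        lines.append('<a href="%s">%s</a><br/>'
                     % (escape(url, quote=True), escape(filename)))
    lines.append("</body></html>")
    return "\n".join(lines)


# Hypothetical data: every release of the package appears on the same page.
page = render_package_page(
    "example.pkg",
    [
        ("example.pkg-1.0.tar.gz",
         "http://example.invalid/files/example.pkg-1.0.tar.gz"),
        ("example.pkg-1.1.tar.gz",
         "http://example.invalid/files/example.pkg-1.1.tar.gz"),
    ],
)
```

A client would then need a single request per package, no matter how many releases are listed, and the page could be served as a static file.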
> When it looks for individual release pages, does it know that these
> are release pages, or does it follow all links on the package
> page?
I'll have to dig to answer that question precisely. I'll do that
after pausing to see if Phillip explains it first.
> If the latter, what links does it follow (there are plenty
> more on the package page)?
See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html
It seems to only scan the release pages. So it has some heuristic to
know which links to follow.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Wed Jul 11 22:43:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:43:41 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
<46953287.8020702@v.loewis.de>
<46953A70.6070600@v.loewis.de>
Message-ID: <469540FD.5060109@v.loewis.de>
>> If the latter, what links does it follow (there are plenty
>> more on the package page)?
>
> See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html
>
> It seems to only scan the release pages. So it has some heuristic to
> know which links to follow.
Looking at
http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
tells me that it always expects that release pages have the form
base/projectname/version.
This looks like a formal specification of PyPI, so I wonder why it
then would not trust this specification more actively.
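As a sketch of what "trusting the specification" might mean in practice: read the links on the package page, keep only those shaped like base/projectname/version, pick the best version, and fetch just that one release page. The helper names are hypothetical, and the version ordering is a deliberate simplification (numeric components only), not what setuptools actually does.

```python
import re
from urllib.parse import urlparse


def release_versions(base, project, links):
    """Return the versions of links shaped like base/project/version."""
    prefix = urlparse(base.rstrip("/") + "/" + project + "/").path
    versions = []
    for link in links:
        path = urlparse(link).path
        if not path.startswith(prefix):
            continue  # not below this project's page
        rest = path[len(prefix):].strip("/")
        if rest and "/" not in rest:
            versions.append(rest)
    return versions


def version_key(version):
    """Simplified ordering: compare the numeric components only."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def best_release_url(base, project, links):
    """Return the URL of the single release page worth fetching, or None."""
    versions = release_versions(base, project, links)
    if not versions:
        return None
    best = max(versions, key=version_key)
    return "%s/%s/%s" % (base.rstrip("/"), project, best)


# Hypothetical links as they might be scraped from a package page.
links = [
    "http://example.invalid/pypi/demo/1.0",
    "http://example.invalid/pypi/demo/1.10",
    "http://example.invalid/pypi/demo/1.9",
    "http://example.invalid/pypi/other/2.0",
    "http://example.invalid/about",
]
chosen = best_release_url("http://example.invalid/pypi", "demo", links)
```

With this approach the number of page reads stays at two (package page plus one release page) however many releases are recorded.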
Regards,
Martin
From jim at zope.com Wed Jul 11 22:55:55 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 16:55:55 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469540FD.5060109@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
<46953287.8020702@v.loewis.de>
<46953A70.6070600@v.loewis.de>
<469540FD.5060109@v.loewis.de>
Message-ID: <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
On Jul 11, 2007, at 4:43 PM, Martin v. Löwis wrote:
>>> If the latter, what links does it follow (there are plenty
>>> more on the package page)?
>>
>> See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html
>>
>> It seems to only scan the release pages. So it has some heuristic to
>> know which links to follow.
>
> Looking at
>
> http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
>
> tells me that it always expects that release pages have the form
> base/projectname/version.
>
> This looks like a formal specification of PyPI, so I wonder why it
> then would not trust this specification more actively.
Phillip has certainly said it could.
IMO, it wouldn't really matter if the pages used by setuptools were
specialized for it. Compared with changing setuptools to be more
clever in its handling of release pages, providing custom pages for
setuptools will reduce the number of requests by at least 50% and
sometimes much more and will greatly reduce the amount of data that
needs to be downloaded and scanned. Someone will need to modify some
software in either case, so the custom index pages look like a big
win to me.
I'll take a stab at writing a module, probably using setuptools
itself, to scan the existing package and release pages to generate
the sort of pages I'm talking about. This can be used to generate
sample pages and might be useful for implementing the pages.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From richardjones at optusnet.com.au Thu Jul 12 00:09:48 2007
From: richardjones at optusnet.com.au (Richard Jones)
Date: Thu, 12 Jul 2007 08:09:48 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
Message-ID: <200707120809.48344.richardjones@optusnet.com.au>
On Thu, 12 Jul 2007, you wrote:
> Yup. Absolutely. That's why we should change the index or
> setuptools, or both. IMO, it makes the most sense to change the
> index to have setuptools specific pages, in addition to the ones for
> humans, that allow:
... you know about the XML-RPC interface, yes?
http://wiki.python.org/moin/CheeseShopXmlRpc
I never fully understood why setuptools went with HTML scraping instead of
XML-RPC.
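For readers who have not used the interface, a minimal sketch of a client follows. It relies on the package_releases() and release_data() methods of the XML-RPC interface; the stub class and its data are invented so the example runs offline, and the module name shown is the Python 3 one (in Python 2 it was xmlrpclib).

```python
import xmlrpc.client


def connect(index_url):
    """Build a proxy for an index; the URL is supplied by the caller."""
    return xmlrpc.client.ServerProxy(index_url)


def collect_release_metadata(client, project):
    """Map each listed version of `project` to its release_data() dict.

    Note that this costs one call for the version list plus one call
    per version.
    """
    return {
        version: client.release_data(project, version)
        for version in client.package_releases(project)
    }


class StubIndex:
    """Offline stand-in for a server proxy; the data below is invented."""

    _releases = {
        "demo": {
            "1.0": {"name": "demo", "version": "1.0"},
            "1.1": {"name": "demo", "version": "1.1"},
        }
    }

    def package_releases(self, name):
        return list(self._releases.get(name, {}))

    def release_data(self, name, version):
        return self._releases[name][version]


metadata = collect_release_metadata(StubIndex(), "demo")
```

Against a live server, the object returned by connect() would take the place of StubIndex().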
Richard
From richardjones at optushome.com.au Thu Jul 12 00:11:49 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Thu, 12 Jul 2007 08:11:49 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469522D6.1070706@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
Message-ID: <200707120811.49824.richardjones@optushome.com.au>
On Thu, 12 Jul 2007, Martin v. Löwis wrote:
> > The questions for us is, how much effort we are willing to make to
> > prevent people from shooting themselves in the foot. I can understand
> > why Phillip would like the package index to prevent people from choosing
> > problematic package names.
>
> That's not my understanding - the issue isn't with "problematic package
> names", but with conflicting package names. IOW, any single name is
> fine - it's a pair of names that would cause a problem (and only if
> you wanted to install both packages on the same system).
A big issue that's not been raised is that *distutils* have no package name
rules, but it's being proposed that PyPI does - thus a package author will
potentially get an error when uploading their package, and also the name that
appears in the index may be quite different to the name of their package.
Richard
From richardjones at optushome.com.au Thu Jul 12 00:23:11 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Thu, 12 Jul 2007 08:23:11 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<469520B6.2030002@benjiyork.com>
<20070711190058.2322F3A404D@sparrow.telecommunity.com>
Message-ID: <200707120823.12001.richardjones@optushome.com.au>
On Thu, 12 Jul 2007, Phillip J. Eby wrote:
> An interesting thought for future optimization... an XML-RPC catalog
> server designed for this use case could in fact do all the
> computation server-side, resolving dependencies and evaluating
> version constraints.
Just to remind again: PyPI has an XML-RPC interface, and has had for a long
time. It has a history of accepting any and all additional functions for that
interface.
Richard
ps. why is it I keep on reading this undercurrent of "pypi doesn't do exactly
what we need, so let's write a new one" and not "let's just add some more
functionality to pypi so it does exactly what we need"... Is there something
written somewhere, or even implied, that PyPI is somehow a closed
development? If there is, I really need to strongly reiterate - PyPI will
*always* be completely open for new developers. Please see the wiki page
http://wiki.python.org/moin/CheeseShopDev for further information.
From pje at telecommunity.com Thu Jul 12 00:40:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 18:40:26 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
Message-ID: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
At 08:09 AM 7/12/2007 +1000, Richard Jones wrote:
>On Thu, 12 Jul 2007, you wrote:
> > Yup. Absolutely. That's why we should change the index or
> > setuptools, or both. IMO, it makes the most sense to change the
> > index to have setuptools specific pages, in addition to the ones for
> > humans, that allow:
>
>... you know about the XML-RPC interface, yes?
>
>http://wiki.python.org/moin/CheeseShopXmlRpc
>
>I never fully understood why setuptools went with HTML scraping instead of
>XML-RPC.
Fundamentally, it was because the XML-RPC API did not then (and does
not now) provide everything that's needed. (As I mentioned a few of
the other times you asked this.) The API has improved and added some
of the missing bits, but not all of them.
There are two pieces still missing:
1. Access to "hidden" packages' release info
2. Links in the long_description that are rendered by PyPI's web interface
Without #2, we can't pick up author-provided Subversion links; see:
http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall
for details.
With this information, easy_install could be changed to use the
XML-RPC API.... *but* it would make even *more* round-trips to PyPI
than it does now, unless those APIs were also designed differently
than the ones that exist now, because you would need at least one
search to find the correct package and its PKG-INFO, and another
search to get the download files. Currently, it can at least get
both of these in one trip, if the package name is an exact match.
To answer Martin's question of why setuptools doesn't "trust" the
PyPI specification even more, it's because having chosen to use the
web interface to get the information, I thought it prudent to use
only that subset of the web interface that could be easily duplicated
using simple Apache directory indexes, since that meant someone could
create their own index or mirror a portion of PyPI without having to
implement its entire feature set. This later proved prudent when Jim
wanted to have tests of his buildout framework that did not rely on
PyPI being up, as it made it easier to create a mock PyPI for unit
testing purposes.
To be honest, the one thing I did *not* anticipate in this design was
that Jim would be making 20 releases of the same package available in
"unhidden" form. :)
From pje at telecommunity.com Thu Jul 12 00:44:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 18:44:51 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707120811.49824.richardjones@optushome.com.au>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<200707120811.49824.richardjones@optushome.com.au>
Message-ID: <20070711224237.511A73A404D@sparrow.telecommunity.com>
At 08:11 AM 7/12/2007 +1000, Richard Jones wrote:
>On Thu, 12 Jul 2007, Martin v. Löwis wrote:
> > > The questions for us is, how much effort we are willing to make to
> > > prevent people from shooting themselves in the foot. I can understand
> > > why Phillip would like the package index to prevent people from choosing
> > > problematic package names.
> >
> > That's not my understanding - the issue isn't with "problematic package
> > names", but with conflicting package names. IOW, any single name is
> > fine - it's a pair of names that would cause a problem (and only if
> > you wanted to install both packages on the same system).
>
>A big issue that's not been raised is that *distutils* have no package name
>rules, but it's being proposed that PyPI does - thus a package author will
>potentially get an error when uploading their package,
That would happen now, if they spell their package exactly the same
as somebody else's package.
>and also the name that
>appears in the index may be quite different to the name of their package.
No-one has proposed that PyPI *change* a package's name, only that
one not be allowed to *add* a package whose name does not
sufficiently differ from an existing package that it would have a
different filename.
In other words, since someone has uploaded a package to the
CheeseShop called "aspects", I should not be able to register a
package called "Aspects" or "asPecTS".
If on the other hand I had registered a package named "Aspects"
first, then the other person should not be able to create one called
"aspects" or "ASPects".
So there is neither any changing of names, nor rejection of names on
their own, but only a restriction as to how *similar* two names may be.
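A minimal sketch of such a check, assuming that "too similar" means "equal when case is ignored". That is only one reading of the rule; the real criterion (names that would produce the same filename) may cover more than case. The function name is hypothetical.

```python
def find_conflict(new_name, registered_names):
    """Return the registered name that `new_name` collides with, or None.

    Assumption for this sketch: two names collide when they are equal
    after lowercasing. Names are never rewritten, only compared.
    """
    wanted = new_name.lower()
    for existing in registered_names:
        if existing.lower() == wanted:
            return existing
    return None


registered = ["aspects", "zope.interface"]
first = find_conflict("Aspects", registered)
second = find_conflict("asPecTS", registered)
third = find_conflict("facets", registered)
```

Here `first` and `second` both report a collision with "aspects", while `third` is None, so "facets" could be registered under exactly the name its author chose.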
From martin at v.loewis.de Thu Jul 12 00:51:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 00:51:55 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<20070711223812.D02D13A404D@sparrow.telecommunity.com>
Message-ID: <46955F0B.2060006@v.loewis.de>
> 1. Access to "hidden" packages' release info
Can you explain what you need them for, and when?
I don't fully understand _pypi_hidden, however, I thought that
a "hidden" release is really one that the author doesn't want
to be ever found, and that is maintained just because old
clients know exactly where it is, and access it directly.
> 2. Links in the long_description that are rendered by PyPI's web interface
Just specify precisely what operation you want, and what precisely the
result should be, and it will appear (also for _pypi_hidden).
> With this information, easy_install could be changed to use the XML-RPC
> API.... *but* it would make even *more* round-trips to PyPI than it
> does now, unless those APIs were also designed differently than the ones
> that exist now, because you would need at least one search to find the
> correct package and its PKG-INFO, and another search to get the download
> files. Currently, it can at least get both of these in one trip, if the
> package name is an exact match.
Ok, so can you design different APIs, reducing the number of roundtrips
to one in the common case, while simultaneously not requiring the server
to compute information that is not needed in the common case?
If you can, it will appear.
> To answer Martin's question of why setuptools doesn't "trust" the PyPI
> specification even more, it's because having chosen to use the web
> interface to get the information, I thought it prudent to use only that
> subset of the web interface that could be easily duplicated using simple
> Apache directory indexes, since that meant someone could create their
> own index or mirror a portion of PyPI without having to implement its
> entire feature set. This later proved prudent when Jim wanted to have
> tests of his buildout framework that did not rely on PyPI being up, as
> it made it easier to create a mock PyPI for unit testing purposes.
I still don't understand. I'm talking about not accessing all
versions in /root/package/version, trusting that the last part
really is a version (i.e. reading only /root/package, finding
out all possible versions, selecting the best one, then reading
/root/package/bestversion).
I cannot see why this is unavailable in straight directory
indexes. Correct me if I'm wrong, but I think you can have
/root/package/index.html
/root/package/version/index.html
and then still choose to make both index.html the same
(if there is only a single version), or list the individual
versions in the top-level index.html.
Or, you can just drop /root/package/index.html, trusting
that the Apache directory index will list the single
version subdirectory, anyway.
Regards,
Martin
From jim at zope.com Thu Jul 12 00:51:51 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 18:51:51 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
Message-ID: <297846B8-94DC-4770-9476-711796E82FEC@zope.com>
On Jul 11, 2007, at 6:09 PM, Richard Jones wrote:
> On Thu, 12 Jul 2007, you wrote:
>> Yup. Absolutely. That's why we should change the index or
>> setuptools, or both. IMO, it makes the most sense to change the
>> index to have setuptools specific pages, in addition to the ones for
>> humans, that allow:
>
> ... you know about the XML-RPC interface, yes?
Yes.
>
> http://wiki.python.org/moin/CheeseShopXmlRpc
>
> I never fully understood why setuptools went with HTML scraping instead of
> XML-RPC.
The main reason, as Phillip has explained, is that he wants to allow
static mirrors of the index. Another good reason is to allow static
implementation, which would be far more scalable in the long run.
Thanks for reminding me of this though as it will make my little
project to prototype an alternate index format for setuptools easier. :)
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Thu Jul 12 00:56:01 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 18:56:01 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707120823.12001.richardjones@optushome.com.au>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<469520B6.2030002@benjiyork.com>
<20070711190058.2322F3A404D@sparrow.telecommunity.com>
<200707120823.12001.richardjones@optushome.com.au>
Message-ID: <4198A946-0B11-4F19-9D99-CD7F7B4B9161@zope.com>
On Jul 11, 2007, at 6:23 PM, Richard Jones wrote:
...
> ps. why is it I keep on reading this undercurrent of "pypi doesn't do exactly
> what we need, so let's write a new one" and not "let's just add some more
> functionality to pypi so it does exactly what we need"... Is there something
> written somewhere, or even implied, that PyPI is somehow a closed
> development? If there is, I really need to strongly reiterate - PyPI will
> *always* be completely open for new developers. Please see the wiki page
> http://wiki.python.org/moin/CheeseShopDev for further information.
I don't think anyone wants to write an alternative. Well, maybe
there are people like that, but you aren't reading them here. Why
would people spend time arguing about requirements, performance, etc,
if they wanted to write their own?
Some people are being forced to implement their own indexes because
they've become dependent on PyPI and PyPI just hasn't been there for
them lately. I'm pretty sure they don't want to maintain alternate
indexes in the long term.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jeremy.kloth at 4suite.org Thu Jul 12 01:20:49 2007
From: jeremy.kloth at 4suite.org (Jeremy Kloth)
Date: Wed, 11 Jul 2007 17:20:49 -0600
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<200707120809.48344.richardjones@optusnet.com.au>
<20070711223812.D02D13A404D@sparrow.telecommunity.com>
Message-ID: <200707111720.49299.jeremy.kloth@4suite.org>
On Wednesday 11 July 2007 4:40:26 pm Phillip J. Eby wrote:
> 1. Access to "hidden" packages' release info
This already exists. Simply call release_data() with the exact version you are
interested in. It returns the metadata regardless of the "hidden" flag.
> 2. Links in the long_description that are rendered by PyPI's web interface
The 'description' key in the dictionary returned by release_data() contains
the long_description as provided by the package's setup.py. I would think
that scanning just that should be simpler than relying on particular
formatting of the PyPI generated package page.
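
A minimal sketch of the lookup described above. Only release_data() and
its 'description' key come from this message; the helper name and the
offline stand-in class are hypothetical, used so the example runs without
network access.

```python
def long_description(proxy, name, version):
    """Return the long_description stored for one exact release.

    `proxy` is any object exposing release_data(name, version).
    An exact version string is required, as described in the message.
    """
    metadata = proxy.release_data(name, version)
    return metadata.get("description", "")


class FakeIndex:
    """Hypothetical offline stand-in for the index (illustration only)."""

    def release_data(self, name, version):
        return {
            "name": name,
            "version": version,
            "description": "See http://example.org/svn/trunk#egg=demo-dev",
        }


print(long_description(FakeIndex(), "demo", "1.0"))
```

Against a live index, an XML-RPC server proxy (xmlrpclib.ServerProxy in
2007, xmlrpc.client.ServerProxy in Python 3) would take the place of
FakeIndex().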
--
Jeremy Kloth
http://4suite.org/
From jim at zope.com Thu Jul 12 01:32:21 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 19:32:21 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<20070711223812.D02D13A404D@sparrow.telecommunity.com>
Message-ID: <484AE499-EB19-4831-9AFB-1BCC3FCE9249@zope.com>
On Jul 11, 2007, at 6:40 PM, Phillip J. Eby wrote:
...
> There are two pieces still missing:
>
> 1. Access to "hidden" packages' release info
I'm not sure what you are referring to here. Are you talking about
hidden releases? Or something else?
> 2. Links in the long_description that are rendered by PyPI's web
> interface
>
> Without #2, we can't pick up author-provided Subversion links; see:
>
> http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall
> for details.
AFAICT, the information is available in the output of the
release_data method.
...
> To be honest, the one thing I did *not* anticipate in this design
> was that Jim would be making 20 releases of the same package
> available in "unhidden" form. :)
I assume you understand why this is needed. (Or maybe it isn't needed
and I'm missing something.) We need to be able to depend on old
versions and AFAICT, setuptools can't see hidden releases.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Thu Jul 12 01:46:45 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 19:46:45 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707120811.49824.richardjones@optushome.com.au>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<200707120811.49824.richardjones@optushome.com.au>
Message-ID:
On Jul 11, 2007, at 6:11 PM, Richard Jones wrote:
> On Thu, 12 Jul 2007, Martin v. Löwis wrote:
>>> The question for us is, how much effort we are willing to make to
>>> prevent people from shooting themselves in the foot. I can
>>> understand
>>> why Phillip would like the package index to prevent people from
>>> choosing
>>> problematic package names.
>>
>> That's not my understanding - the issue isn't with "problematic
>> package
>> names", but with conflicting package names. IOW, any single name is
>> fine - it's a pair of names that would cause a problem (and only if
>> you wanted to install both packages on the same system).
>
> A big issue that's not been raised is that *distutils* have no
> package name
> rules, but it's being proposed that PyPI does - thus a package
> author will
> potentially get an error when uploading their package, and also the
> name that
> appears in the index may be quite different to the name of their
> package.
Maybe distutils should have more package name rules than it does
now. We (the Community) should be free to change things based on
experience. We now have a lot more experience with this stuff than
we had a few years ago. Maybe we should consider a reset.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Thu Jul 12 01:47:59 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 19:47:59 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <297846B8-94DC-4770-9476-711796E82FEC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
Message-ID:
On Jul 11, 2007, at 6:51 PM, Jim Fulton wrote:
> Another good reason is to allow static
> implementation, which would be far more scalable in the long run.
ATM, from my machine, xml-rpc requests to PyPI are taking about .27
seconds. This is only a little less than regular page requests.
With the current API, it would require at best 3 requests to get all
of the distribution URLs. Presumably, with a change to the API, we
could get this down to one request, but that's still a long time
given the demand I expect on PyPI in the future.
It would be so much simpler to just publish a static page for each
package that setuptools could parse. I'll try to prototype this.
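
As a rough illustration of the idea (not the prototype mentioned above),
a static per-package page could be as small as a list of anchors. The
page layout, function name and example URL below are all assumptions.

```python
from html import escape


def render_links_page(package, urls):
    """Render a bare HTML page with one anchor per distribution URL."""
    lines = ["<html><head><title>Links for %s</title></head><body>"
             % escape(package)]
    for url in urls:
        filename = url.rsplit("/", 1)[-1]  # last path segment as link text
        lines.append('<a href="%s">%s</a><br/>'
                     % (escape(url, quote=True), escape(filename)))
    lines.append("</body></html>")
    return "\n".join(lines)


page = render_links_page(
    "ZODB3", ["http://example.org/dist/ZODB3-3.8.0.tar.gz"])
print(page)
```

A page like this can be written to disk once per upload and served by the
web server without running any index code per request.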
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From waterbug at pangalactic.us Thu Jul 12 03:17:00 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Wed, 11 Jul 2007 21:17:00 -0400
Subject: [Catalog-sig] No more cc's please (was Re: start on static
generation, and caching - apache config.)
In-Reply-To: <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
<4695241D.3090203@v.loewis.de>
<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
<46953287.8020702@v.loewis.de>
<46953A70.6070600@v.loewis.de>
<469540FD.5060109@v.loewis.de>
<05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
Message-ID: <4695810C.7070606@pangalactic.us>
Everyone:
Please exclude me from the cc's of all messages you send to the list!
I'm a *member* of the catalog-sig list, so I'm getting 2 copies of every
message in this thread and it's getting annoying. I'm against all this
cc crap anyway -- that's why we have a *list*, dammit! (Geez, one
would think Python programmers would be more email literate! grumble.)
Thanks,
Steve
From martin at v.loewis.de Thu Jul 12 07:11:50 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 07:11:50 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
Message-ID: <4695B816.9020706@v.loewis.de>
> ATM, from my machine, xml-rpc requests to PyPI are taking about .27
> seconds. This is only a little less than regular page requests. With
> the current API, It would require at best 3 requests to get all of the
> distribution URLs. Presumably, with a change to the API, we could get
> this down to one request, but that's still a long time given the demand
> I expect on PyPI in the future.
You seem to assume that if you see a round trip time of .27 seconds,
that then PyPI could only do 3 requests per second. That is not so.
I just logged onto www.python.org (a machine that is close to
cheeseshop.python.org), and called this function:
>>> import time, xmlrpclib
>>> s = xmlrpclib.ServerProxy("http://cheeseshop.python.org/pypi")
>>> def f():
...     start = time.time()
...     for i in range(1000):
...         s.package_releases('setuptools')
...     return time.time() - start
...
>>> f()
7.6247878074645996
So it can currently do 130 XML-RPC requests per second, to
a single client. Inverting it, a request takes 0.0076s,
which is a lot less than 0.27s.
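
The arithmetic behind those two figures, as a small sketch using only the
numbers reported in the session above:

```python
requests_made = 1000
elapsed_seconds = 7.6247878074645996  # value returned by f() above

throughput = requests_made / elapsed_seconds    # requests per second
per_request = elapsed_seconds / requests_made   # seconds per request

print("%.1f requests/s, %.4f s per request" % (throughput, per_request))
```

The requests were issued one after another by a single client close to
the server, so the per-request figure here includes very little network
latency.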
Regards,
Martin
From pje at telecommunity.com Thu Jul 12 07:48:38 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 01:48:38 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707111720.49299.jeremy.kloth@4suite.org>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<200707120809.48344.richardjones@optusnet.com.au>
<20070711223812.D02D13A404D@sparrow.telecommunity.com>
<200707111720.49299.jeremy.kloth@4suite.org>
Message-ID: <20070712054627.886D13A404D@sparrow.telecommunity.com>
At 05:20 PM 7/11/2007 -0600, Jeremy Kloth wrote:
>On Wednesday 11 July 2007 4:40:26 pm Phillip J. Eby wrote:
> > 1. Access to "hidden" packages' release info
>
>This already exists. Simply call release_data() with the exact
>version you are
>interested in. It returns the metadata regardless of the "hidden" flag.
There is no way to discover those versions, however, AFAICT.
> > 2. Links in the long_description that are rendered by PyPI's web interface
>
>The 'description' key in the dictionary returned by release_data() contains
>the long_description as provided by the package's setup.py. I would think
>that scanning just that should be simpler than relying on particular
>formatting of the PyPI generated package page.
Alas, this entire subject area is one where lots of people "would
think" that such-and-such a thing would be simpler, but
isn't. :( In this case, long_description is allowed to be
reStructured Text, which nothing less than a full reST parser can
handle. It's much easier to scan for a simple regular expression
pattern to pull the links out of HTML, than to handle all the ways
URLs can be spelled in reST, AFAICT.
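
To make the comparison concrete, here is a sketch of the kind of regular
expression scan meant above. The pattern and the sample HTML are
illustrative assumptions, not the expression setuptools actually uses.

```python
import re

# Matches the href value of an anchor tag, with single or double quotes.
HREF = re.compile(r'<a\s[^>]*?href\s*=\s*["\']([^"\'>]+)["\']',
                  re.IGNORECASE)


def extract_links(html):
    """Return every anchor href found in a chunk of rendered HTML."""
    return HREF.findall(html)


sample = ('<p>Dev: <a class="reference" '
          'href="http://svn.example.org/demo/trunk#egg=demo-dev">trunk</a>'
          " and <A HREF='http://example.org/x'>x</A></p>")
print(extract_links(sample))
```

Once the index has rendered the description, every link appears in this
one form, whereas the reST source may spell the same link as an inline
URL, a named hyperlink target, or an anonymous reference.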
That having been said, I've never actually made the attempt, for
simple historical reasons. I'll happily review patches for the
functionality, as long as they can gracefully fall back to
non-XML-RPC use, or provide an option to disable it so people using
their own static indexes can still function.
From renesd at gmail.com Thu Jul 12 08:01:00 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Thu, 12 Jul 2007 16:01:00 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
Message-ID: <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com>
xmlrpc uses POST. So it's terrible for performance, and semantically
impossible to cache.
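
A small sketch of why URL-keyed caching does not fit: in XML-RPC the
method name and arguments travel in the request body, so every call goes
to the same URL. It uses Python 3's xmlrpc.client (xmlrpclib in 2007) and
sends nothing over the network.

```python
import xmlrpc.client

# Build the body that a client would POST for package_releases('setuptools').
body = xmlrpc.client.dumps(("setuptools",), methodname="package_releases")
print(body)
```

The printed XML contains both the method name and the argument; none of
it appears in the URL, which is the same for every method of the index.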
On 7/12/07, Richard Jones wrote:
> On Thu, 12 Jul 2007, you wrote:
> > Yup. Absolutely. That's why we should change the index or
> > setuptools, or both. IMO, it makes the most sense to change the
> > index to have setuptools specific pages, in addition to the ones for
> > humans, that allow:
>
> ... you know about the XML-RPC interface, yes?
>
> http://wiki.python.org/moin/CheeseShopXmlRpc
>
> I never fully understood why setuptools went with HTML scraping instead of
> XML-RPC.
>
>
> Richard
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>
From renesd at gmail.com Thu Jul 12 08:15:23 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Thu, 12 Jul 2007 16:15:23 +1000
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com>
Message-ID: <64ddb72c0707112315q6f34439en79b437ad1e9c4d6e@mail.gmail.com>
hellos,
ok, maybe I'm wrong about the performance of this interface!
I guess I meant in general - using POST for GET requests is not such a
nice thing.
cu.
On 7/12/07, René Dudfield wrote:
> xmlrpc uses POST. So it's terrible for performance, and semantically
> impossible to cache.
>
>
> On 7/12/07, Richard Jones wrote:
> > On Thu, 12 Jul 2007, you wrote:
> > > Yup. Absolutely. That's why we should change the index or
> > > setuptools, or both. IMO, it makes the most sense to change the
> > > index to have setuptools specific pages, in addition to the ones for
> > > humans, that allow:
> >
> > ... you know about the XML-RPC interface, yes?
> >
> > http://wiki.python.org/moin/CheeseShopXmlRpc
> >
> > I never fully understood why setuptools went with HTML scraping instead of
> > XML-RPC.
> >
> >
> > Richard
From jim at zope.com Thu Jul 12 12:34:19 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 06:34:19 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4695B816.9020706@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
<4695B816.9020706@v.loewis.de>
Message-ID: <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
On Jul 12, 2007, at 1:11 AM, Martin v. Löwis wrote:
>> ATM, from my machine, xml-rpc requests to PyPI are taking about .27
>> seconds. This is only a little less than regular page requests.
>> With
>> the current API, It would require at best 3 requests to get all of
>> the
>> distribution URLs. Presumably, with a change to the API, we could
>> get
>> this down to one request, but that's still a long time given the
>> demand
>> I expect on PyPI in the future.
>
> You seem to assume that if you see a round trip time of .27 seconds,
> that then PyPI could only do 3 requests per second. That is not so.
Yeah, it occurred to me on my way home that a substantial part of the
time might be due to distance.
I wonder what times ab against http://www.python.org/pypi/ZODB3 from
inside the python.org network would give.
I wonder if it would help much to make multiple HTTP requests in the
same connection. This might be something to look at in setuptools
and/or xmlrpclib.
....
> So it can currently do 130 XML-RPC requests per second, to
> a single client. Inverting it, a request takes 0.0076s,
> which is a lot less than 0.27s.
Cool. That's much better. Thanks for trying this.
OTOH, this points up a couple things:
1. Since many people will be far away from PyPI, I think our long-
term plan should encompass geographic mirrors. It's good that the
server is spending a small amount of time, but it still takes *me* a
long time to get data.
2. It's important to reduce the number of round trips.
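
A back-of-the-envelope sketch of point 2, using the 0.27 s per-request
time measured earlier in the thread. It assumes requests are made
sequentially with no connection reuse; the 20-package figure is a
hypothetical buildout size.

```python
ROUND_TRIP_SECONDS = 0.27  # per-request time measured from a distant client


def lookup_seconds(requests_per_package, packages=1):
    """Estimate the wall-clock time spent waiting on the index."""
    return ROUND_TRIP_SECONDS * requests_per_package * packages


print("3 requests per package: %.2f s" % lookup_seconds(3))
print("1 request per package:  %.2f s" % lookup_seconds(1))
print("20 packages, 3 each:    %.2f s" % lookup_seconds(3, packages=20))
```

Under these assumptions the client's waiting time scales with the number
of round trips, whatever the server's own processing time is.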
I'm still opposed to using XML-RPC because:
- It's harder to mirror, and
- It's still slower than static pages.
Note that after our discussion, I'm equally against the current
approach of parsing a human interface. I still think it makes a lot
more sense to have a tailored interface for setuptools.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From benji at benjiyork.com Thu Jul 12 13:26:44 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 12 Jul 2007 07:26:44 -0400
Subject: [Catalog-sig] No more cc's please (was Re: start on static
generation, and caching - apache config.)
In-Reply-To: <4695810C.7070606@pangalactic.us>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> <4695241D.3090203@v.loewis.de> <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com> <46953287.8020702@v.loewis.de> <46953A70.6070600@v.loewis.de> <469540FD.5060109@v.loewis.de> <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
<4695810C.7070606@pangalactic.us>
Message-ID: <46960FF4.3050609@benjiyork.com>
Stephen Waterbury wrote:
> Please exclude me from the cc's of all messages you send to the list!
> I'm a *member* of the catalog-sig list, so I'm getting 2 copies of every
> message in this thread and it's getting annoying. I'm against all this
> cc crap anyway -- that's why we have a *list*, dammit! (Geez, one
> would think Python programmers would be more email literate! grumble.)
Go to http://mail.python.org/mailman/options/catalog-sig and set the
"Avoid duplicate copies of messages?" option to "Yes". (One would think
a list member would be more mailman literate!)
--
Benji York
http://benjiyork.com
From amk at amk.ca Thu Jul 12 14:20:30 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 12 Jul 2007 08:20:30 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46951B55.9050009@v.loewis.de>
References:
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<46951612.9010009@v.loewis.de>
<46951B55.9050009@v.loewis.de>
Message-ID: <20070712122030.GA5853@amk-desktop.matrixgroup.net>
On Wed, Jul 11, 2007 at 08:03:01PM +0200, "Martin v. Löwis" wrote:
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
>
> Neither is going on for PyPI, AFAIK - it's mod_fastcgi.
www.python.org/pypi does use mod_proxy to provide PyPI access from the
old URL; it's possible these users were going through www.python.org.
--amk
From gentoodev at gmail.com Thu Jul 12 16:39:14 2007
From: gentoodev at gmail.com (Rob Cakebread)
Date: Thu, 12 Jul 2007 07:39:14 -0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
<46946A69.4000702@v.loewis.de>
<9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
Message-ID: <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com>
On 7/11/07, Fred Drake wrote:
> On 7/11/07, Nathan R. Yergler wrote:
> > The speed has noticeably improved (thanks!) but as recently as Monday
> > PyPI was unresponsive and then returning proxy errors. It definitely
> > caused us (Creative Commons) to lose productivity Monday afternoon
> > (PDT).
>
> We're seeing this right now, too. I'm checking both www.python.org
> and cheeseshop.python.org.
>
>
As of 7:30am PST it's timing out on the website and via XML-RPC,
testing from L.A. or Germany.
From pje at telecommunity.com Thu Jul 12 20:07:52 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 14:07:52 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469522D6.1070706@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
Message-ID: <20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
At 08:35 PM 7/11/2007 +0200, Martin v. Löwis wrote:
> > The question for us is, how much effort we are willing to make to
> > prevent people from shooting themselves in the foot. I can understand
> > why Phillip would like the package index to prevent people from choosing
> > problematic package names.
>
>That's not my understanding - the issue isn't with "problematic package
>names", but with conflicting package names. IOW, any single name is
>fine - it's a pair of names that would cause a problem (and only if
>you wanted to install both packages on the same system).
It's also a problem for locating the correct package in the first
place... which seems to fall under the jurisdiction of a "package index". :)
This is just as important for direct human users of the Cheeseshop,
as it is for the humans using software to access the Cheeseshop.
From jim at zope.com Thu Jul 12 20:15:03 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 14:15:03 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
Message-ID: <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
On Jul 12, 2007, at 2:07 PM, Phillip J. Eby wrote:
> At 08:35 PM 7/11/2007 +0200, Martin v. Löwis wrote:
>> > The question for us is, how much effort we are willing to make to
>> > prevent people from shooting themselves in the foot. I can
>> understand
>> > why Phillip would like the package index to prevent people from
>> choosing
>> > problematic package names.
>>
>> That's not my understanding - the issue isn't with "problematic
>> package
>> names", but with conflicting package names. IOW, any single name is
>> fine - it's a pair of names that would cause a problem (and only if
>> you wanted to install both packages on the same system).
>
> It's also a problem for locating the correct package in the first
> place... which seems to fall under the jurisdiction of a "package
> index". :)
>
> This is just as important for direct human users of the Cheeseshop,
> as it is for the humans using software to access the Cheeseshop.
I want to make sure I understand this. I would hope that searching
would be case insensitive and otherwise flexible wrt names. Is there
any reason we can't expect URLs and requirement specifications to be
precisely spelled? That is, if someone names their package "sPaM", I
see no reason why PyPI needs to support anything other than
http://www.python.org/pypi/sPaM as the one URL of the package. Someone
should be able to use the search UI to search for "spam" and see a
result that includes "sPaM". From then on, they should be able to
type the name "sPaM". Or am I missing something?
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Thu Jul 12 20:43:11 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 14:43:11 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
Message-ID: <20070712184056.F219A3A40B0@sparrow.telecommunity.com>
At 02:15 PM 7/12/2007 -0400, Jim Fulton wrote:
>I want to make sure I understand this. I would hope that searching
>would be case insensitive and otherwise flexible wrt names.
PyPI's searching is indeed case insensitive, and is a
substring/keyword search as well.
> Is there
>any reason we can't expect URLs and requirement specifications to be
>precisely spelled? That is, if someone names their package "sPaM", I
>see no reason why PyPI needs to support anything other than
>http://www.python.org/pypi/sPaM as the one URL of the package. Someone
>should be able to use the search UI to search for "spam" and see a
>result that includes "sPaM". From then on, they should be able to
>type the name "sPaM". Or am I missing something?
You're missing that the subject is about similarity of names. A typo
of say, 'SPam' shouldn't return me some package *other* than the one
I'm looking for. It'd be nice if the resulting page said something
besides "Not Found", too... like "there's no SPam, but here are a
bunch of packages whose name contains 'spam'".
If it did that, setuptools would be able to find the right page
without hitting the main index, too. But redirection, as proposed by
Martin, also accomplishes the same thing.
And again, all this helps human direct users of the index, too.
From jim at zope.com Thu Jul 12 21:02:10 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 15:02:10 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <20070712184056.F219A3A40B0@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
Message-ID: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
On Jul 12, 2007, at 2:43 PM, Phillip J. Eby wrote:
> At 02:15 PM 7/12/2007 -0400, Jim Fulton wrote:
>> I want to make sure I understand this. I would hope that searching
>> would be case insensitive and otherwise flexible wrt names.
>
> PyPI's searching is indeed case insensitive, and is a substring/
> keyword search as well.
>
>
>> Is there
>> any reason we can't expect URLs and requirement specifications to be
>> precisely spelled? That is, if someone names their package "sPaM", I
>> see no reason why PyPI needs to support anything other than
>> http://www.python.org/pypi/sPaM as the one URL of the package.
>> Someone
>> should be able to use the search UI to search for "spam" and see a
>> result that includes "sPaM". From then on, they should be able to
>> type the name "sPaM". Or am I missing something?
>
> You're missing that the subject is about similarity of names.
> A typo of say, 'SPam' shouldn't return me some package *other*
> than the one I'm looking for.
No, I understand that part. I understand the desire to avoid
conflicts that cause problems down the road. I would prefer to
"disallow" this by rejecting new package names that are too similar
to already-registered packages.
> It'd be nice if the resulting page said something besides "Not
> Found", too... like "there's no SPam, but here are a bunch of
> packages whose name contains 'spam'".
I think this would be fine in a human interface.
> If it did that, setuptools would be able to find the right page
> without hitting the main index, too. But redirection, as proposed
> by Martin, also accomplishes the same thing.
I really don't like this for setuptools. My preference is that
setuptools should be required to ask for a package with precise
spelling.
> And again, all this helps human direct users of the index, too.
I think it encourages humans to do bad things. If someone misspells
ZODB3 as zodb3 and is able to install it with easy_install, then
they'll be tempted to use the name "zodb3" in their requirements
specifications. That is a bad thing IMO. We're talking about
technical users and I think it is reasonable to expect them to be
precise in their specifications.
I could live with case-insensitive package names if we (for some
definition of we, possibly being Guido) decided we want them, but I'd
prefer they be case sensitive. I'd still be in favor of avoiding
confusing duplicates. If we stick with case-sensitive package names,
then I'd prefer that the interaction of setuptools with the index be
case sensitive.
I wouldn't object to setuptools giving people help. So, for example,
if I type "zodb3", I wouldn't object to setuptools letting the user
know that maybe they should use ZODB3.
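
A sketch of that behaviour, assuming the client has a list of index
names to hand. The function name and return shape are hypothetical; this
is not how setuptools is implemented.

```python
def find_package(requested, index_names):
    """Return (match, hints).

    The match is exact and case sensitive. When there is no match,
    hints lists the names that differ from the request only by case.
    """
    if requested in index_names:
        return requested, []
    folded = requested.lower()
    hints = sorted(n for n in index_names if n.lower() == folded)
    return None, hints


names = {"ZODB3", "setuptools"}
print(find_package("ZODB3", names))
print(find_package("zodb3", names))
```

With this shape the misspelled request fails, but the tool can still
print "maybe you meant ZODB3" from the hints.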
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Thu Jul 12 21:26:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 15:26:02 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
Message-ID: <20070712192350.347B13A40B0@sparrow.telecommunity.com>
At 03:02 PM 7/12/2007 -0400, Jim Fulton wrote:
>We're talking about
>technical users and I think it is reasonable to expect them to be
>precise in their specifications.
IMO, "technical users" is a wider range of people than you seem to be
thinking of. In any case, this is a separate topic from disallowing
too-similar names -- which you agree we should do.
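
One possible reading of "too similar", sketched as a registration-time
check. The rule used here (ignore case, and treat runs of '-', '_' and
'.' as equivalent) is an assumption for illustration; the thread does
not define the rule.

```python
import re


def canonical(name):
    """Fold case and collapse runs of '-', '_' and '.' into one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def conflicts(new_name, registered):
    """Return registered names that collide with new_name under canonical()."""
    key = canonical(new_name)
    return sorted(n for n in registered
                  if canonical(n) == key and n != new_name)


registered = ["sPaM", "ZODB3", "zope.interface"]
print(conflicts("SPam", registered))
print(conflicts("zope_interface", registered))
print(conflicts("eggs", registered))
```

An index applying a check like this would reject the first two uploads
and accept the third.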
Whether to then also introduce case-sensitivity into various parts of
easy_install is another subject that doesn't really matter to the catalog-sig.
Please note, however, that it is not a minor change by any means --
case-insensitivity exists throughout pkg_resources and setuptools to
handle operating system filename case-insensitivity, not just for
index lookups. In fact, I believe the index lookups *are*
case-sensitive; IIRC it's only link parsing that is case-insensitive.
From jim at zope.com Thu Jul 12 21:31:45 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 15:31:45 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <20070712192350.347B13A40B0@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<468F3CD4.1070501@v.loewis.de>
<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
<20070712192350.347B13A40B0@sparrow.telecommunity.com>
Message-ID: <4CD1A7D8-1911-45C9-AB08-C4DC3E1CDFA9@zope.com>
On Jul 12, 2007, at 3:26 PM, Phillip J. Eby wrote:
...
> Whether to then also introduce case-sensitivity into various parts
> of easy_install is another subject that doesn't really matter to
> the catalog-sig.
I'm not sure we agree on what matters to the catalog sig. :)
(I still need to respond to your note on that topic.)
> Please note, however, that it is not a minor change by any means --
> case-insensitivity exists throughout pkg_resources and setuptools
> to handle operating system filename case-insensitivity, not just
> for index lookups. In fact, I believe the index lookups *are* case-
> sensitive; IIRC it's only link parsing that is case-insensitive.
I'm not suggesting that you shouldn't deal with file-system case
insensitivity. If I were to change setuptools to match my opinion, I
would probably just change the code that tries to get a package
listing to look for close matches to print a suggestion and stop
rather than guessing a package name and continuing.
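A minimal sketch of the "print a suggestion and stop" behaviour described above, written for Python 3. The function names and the 0.8 similarity cutoff are assumptions made for illustration; none of this is setuptools code.

```python
import difflib


def suggest_names(requested, known_names, limit=3):
    """Return known names that resemble `requested`, ignoring case."""
    by_lower = {}
    for name in known_names:
        by_lower.setdefault(name.lower(), []).append(name)
    close = difflib.get_close_matches(
        requested.lower(), list(by_lower), n=limit, cutoff=0.8)
    return [name for key in close for name in by_lower[key]]


def resolve(requested, known_names):
    """Accept only an exact spelling; otherwise stop with a suggestion."""
    if requested in known_names:
        return requested
    message = "No package named %r." % requested
    hints = suggest_names(requested, known_names)
    if hints:
        message += " Did you mean: %s?" % ", ".join(hints)
    raise LookupError(message)
```

With this shape, a request for "zodb3" against an index that lists "ZODB3" fails with a hint instead of silently continuing under a guessed name.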
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Thu Jul 12 23:09:32 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 12 Jul 2007 23:09:32 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
Message-ID: <4696988C.6050309@v.loewis.de>
> I wonder what times ab against http://www.python.org/pypi/ZODB3 from
> inside the python.org network would give.
I just measured it. 1000 requests take 17s using urllib, giving roughly
60 requests per second.
> I wonder if it would help much to make multiple HTTP requests in the
> same connection. This might be something to look at in setuptools
> and/or xmlrpclib.
Only for remote connections, due to the round-trips required for the
TCP handshake. Locally, Apache opens a new connection to the FCGI
servers per request (using the farmer-worker pattern).
> 1. Since many people will be far away from PyPI, I think our long-
> term plan should encompass geographic mirrors. It's good that the
> server is spending a small amount of time, but it still takes *me* a
> long time to get data.
Ok. I am, in general, skeptical about mirroring. However, if it
makes people happy, feel free to implement it.
A number of issues should be considered, of course:
- there should be a way to get authoritative answers somehow, preferably
from mirrors, but, if necessary, from the main site
- I really wish to collect download counters across mirrors. "Official"
mirrors should be obliged to report download statistics once a day
or so.
> 2. It's important to reduce the number of round trips.
A colleague today suggested that the best way to reduce round trips
is to give each machine a local copy of the index, the same way
Debian apt works: you do 'apt-get update', and then have a local
copy of the catalog that you can build against. No roundtrips
at all (except for the one to update the local catalog), for the
expense of being out of date if you don't manually update the
catalog.
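The apt-style idea can be sketched as follows (Python 3). The class name, the JSON snapshot file and the `fetch_index` callable are all hypothetical; the sketch only shows that lookups read the local copy and that `update()` is the single point of network traffic.

```python
import json
import os


class LocalCatalog:
    """A local snapshot of a package index, refreshed only on request."""

    def __init__(self, path, fetch_index):
        self.path = path                # where the snapshot is stored
        self.fetch_index = fetch_index  # hypothetical: returns {name: [versions]}

    def update(self):
        """The one round trip: fetch the whole index and store it locally."""
        index = self.fetch_index()
        with open(self.path, "w") as f:
            json.dump(index, f)

    def versions(self, name):
        """Answer from the snapshot only; this never touches the network."""
        if not os.path.exists(self.path):
            raise RuntimeError("no local catalog yet; call update() first")
        with open(self.path) as f:
            return json.load(f).get(name, [])
```

The trade-off is the one named above: answers are only as fresh as the last `update()`.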
Regards,
Martin
From martin at v.loewis.de Thu Jul 12 23:12:55 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 12 Jul 2007 23:12:55 +0200
Subject: [Catalog-sig] www.python.org/pypi might redirect?
In-Reply-To: <20070712122030.GA5853@amk-desktop.matrixgroup.net>
References: <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de>
<469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <46951B55.9050009@v.loewis.de>
<20070712122030.GA5853@amk-desktop.matrixgroup.net>
Message-ID: <46969957.1020404@v.loewis.de>
> www.python.org/pypi does use mod_proxy to provide PyPI access from the
> old URL; it's possible these users were going through www.python.org.
I wonder why that is. Would there be anything wrong with making that
a (permanent) redirect instead?
Users of the old URL should see a speedup if they do many requests;
all relative URLs would directly go to cheeseshop, rather than having
to pass through www.python.org again.
Regards,
Martin
From martin at v.loewis.de Thu Jul 12 23:25:58 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 12 Jul 2007 23:25:58 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de>
<469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
<9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com>
Message-ID: <46969C66.2020806@v.loewis.de>
> As of 7:30am PST it's timing out on the website and via XML-RPC,
> testing from L.A. or Germany.
It seems the same crash of all FCGI servers (with a failure of mod_fcgi
to restart them) has happened again. I still have no clue what's causing
it, but I added a watchdog that should restart it within a minute the
next time.
Regards,
Martin
From martin at v.loewis.de Thu Jul 12 23:38:50 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 12 Jul 2007 23:38:50 +0200
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> <20070712184056.F219A3A40B0@sparrow.telecommunity.com>
<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
Message-ID: <46969F6A.8030904@v.loewis.de>
> I really don't like this for setuptools. My preference is that
> setuptools should be required to ask for a package with precise
> spelling.
I think the way setuptools currently works is this:
Every name gets converted to its lower-case safe-name equivalent.
All dependencies, file names, resource identifications etc
are based on that version of the name, *not* the "true"
name of the package.
Then, when setuptools tries to find a package whose "true"
name is in mixed-case, it uses the lower-cased safe-named
version, and PyPI reports that the package does not exist.
Then, setuptools queries the entire package list, trying
to find out the original spelling of the package.
I'm sure Phillip will correct me if I'm wrong.
> I could live with case-insensitive package names if we (for some
> definition of we, possibly being Guido) decided we want them, but I'd
> prefer they be case sensitive. I'd still be in favor of avoiding
> confusing duplicates. If we stick with case-sensitive package names,
> then I'd prefer that the interaction of setuptools with the index be
> case sensitive.
See above - I believe setuptools package names are case insensitive
today.
Regards,
Martin
From jim at zope.com Fri Jul 13 01:14:33 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 19:14:33 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4696988C.6050309@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
Message-ID:
On Jul 12, 2007, at 5:09 PM, Martin v. Löwis wrote:
...
>> I wonder if it would help much to make multiple HTTP requests in the
>> same connection. This might be something to look at in setuptools
>> and/or xmlrpclib.
>
> Only for remote connections, due to the round-trips required for
> TCP handshake. Locally, Apache opens a new connection to the FCGI
> servers per request (using the farmer-worker pattern).
Right, but most connections will be remote, so this is a potential win.
>
>> 1. Since many people will be far away from PyPI, I think our long-
>> term plan should encompass geographic mirrors. It's good that the
>> server is spending a small amount of time, but it still takes *me* a
>> long time to get data.
>
> Ok. I am, in general, skeptical about mirroring. However, if it
> makes people happy, feel free to implement it.
My goal is to have PyPI provide a simplified version of the data for
use by setuptools that is easily mirrored using standard mirroring
tools. (I may actually prototype this with a kind of mirror.)
> A number of issues should be considered, of course:
> - there should be a way to get authoritative answers somehow,
> preferably
> from mirrors, but, if necessary, from the main site
I don't know what you mean. I envision mirrors as being read-only
and only used by setuptools. The main site would certainly be
authoritative.
> - I really wish to collect download counters across mirrors.
> "Official"
> mirrors should be obliged to report download statistics once a day
> or so.
OK.
>
>> 2. It's important to reduce the number of round trips.
>
> A colleague today suggested that the best way to reduce round trips
> is to give each machine a local copy of the index, the same way
> Debian apt works: you do 'apt-get update', and then have a local
> copy of the catalog that you can build against. No roundtrips
> at all (except for the one to update the local catalog), for the
> expense of being out of date if you don't manually update the
> catalog.
Yup. This might be a really nice way to go. It would be especially
nice if a client could contact PyPI and ask for new data since a
given time. I imagine that this request could be as cheap as the
requests we have now, unless a client was very out of date.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Fri Jul 13 01:35:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 19:35:05 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <46969F6A.8030904@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
<468FC2BB.7030607@v.loewis.de>
<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
<468FF69B.2090503@v.loewis.de>
<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
<46910BBF.3010308@v.loewis.de>
<4692B3A3.5030209@v.loewis.de>
<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
<46931A3A.5000703@v.loewis.de>
<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
<4693FA2A.3020107@v.loewis.de>
<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
<469467AA.7070409@v.loewis.de>
<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
<46969F6A.8030904@v.loewis.de>
Message-ID: <20070712233252.3C2913A40A9@sparrow.telecommunity.com>
At 11:38 PM 7/12/2007 +0200, Martin v. Löwis wrote:
> > I really don't like this for setuptools. My preference is that
> > setuptools should be required to ask for a package with precise
> > spelling.
>
>I think the way setuptools currently works is this:
>
>Every name gets converted to its lower-case safe-name equivalent.
>All dependencies, file names, resource identifications etc
>are based on that version of the name, *not* the "true"
>name of the package.
Object comparisons are done case-insensitively, but the objects
themselves keep the case-insensitive forms ('key' attributes)
separate from the originally-input names ('project_name' attributes).
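As an illustration of keeping both forms side by side, here is a small Python 3 sketch. The `Project` class and the exact normalisation rule in `safe_name` are assumptions for this example, not a copy of the pkg_resources code; only the attribute names 'key' and 'project_name' come from the description above.

```python
import re


def safe_name(name):
    """Replace runs of characters other than letters, digits and '.' with '-'."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


class Project:
    """Illustrative stand-in that keeps both spellings of a name."""

    def __init__(self, project_name):
        self.project_name = safe_name(project_name)  # spelling as entered
        self.key = self.project_name.lower()         # case-insensitive form

    def __eq__(self, other):
        # Comparison uses only the case-insensitive key.
        return isinstance(other, Project) and self.key == other.key

    def __hash__(self):
        return hash(self.key)
```

Two objects built from "ZODB3" and "zodb3" compare equal, yet each still reports the spelling it was created with.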
>Then, when setuptools tries to find a package whose "true"
>name is in mixed-case, it uses the lower-cased safe-named
>version, and PyPI reports that the package does not exist.
>Then, setuptools queries the entire package list, trying
>to find out the original spelling of the package.
This is almost correct, except that it actually tries to look up
whatever the user actually input, then the safe_name() form of
that. For index lookups, it does not actually change the case of
what was entered, so if the user enters something that exactly
matches what's on PyPI, they'll have a better chance of getting
everything in one request.... unless there are multiple versions
listed, of course.
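The lookup order discussed in this exchange (the name as entered, then its safe_name() form, then a scan of the full package list) can be sketched like this in Python 3. `fetch_page` and `fetch_all_names` are hypothetical callables standing in for the index requests, and the request counter is there only to show why an exact spelling is cheaper.

```python
import re


def safe_name(name):
    """Replace runs of characters other than letters, digits and '.' with '-'."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


def find_project(requested, fetch_page, fetch_all_names):
    """Return (name_on_index, page, number_of_requests)."""
    requests = 0
    tried = []
    # First the name exactly as entered, then its safe_name() form.
    for candidate in (requested, safe_name(requested)):
        if candidate in tried:
            continue
        tried.append(candidate)
        requests += 1
        page = fetch_page(candidate)
        if page is not None:
            return candidate, page, requests
    # Fallback: fetch the full listing and match case-insensitively.
    requests += 1
    wanted = safe_name(requested).lower()
    for name in fetch_all_names():
        if safe_name(name).lower() == wanted:
            requests += 1
            return name, fetch_page(name), requests
    raise LookupError("no project matching %r" % requested)
```

In this model an exact spelling costs one request, while a wrongly-cased one costs three, including the expensive full listing.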
From pje at telecommunity.com Fri Jul 13 01:43:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 19:43:04 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
<46953A70.6070600@v.loewis.de>
<200707120809.48344.richardjones@optusnet.com.au>
<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
<4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
Message-ID: <20070712234049.97ED63A40A9@sparrow.telecommunity.com>
At 07:14 PM 7/12/2007 -0400, Jim Fulton wrote:
>On Jul 12, 2007, at 5:09 PM, Martin v. Löwis wrote:
> >> 2. It's important to reduce the number of round trips.
> >
> > A colleague today suggested that the best way to reduce round trips
> > is to give each machine a local copy of the index, the same way
> > Debian apt works: you do 'apt-get update', and then have a local
> > copy of the catalog that you can build against. No roundtrips
> > at all (except for the one to update the local catalog), for the
> > expense of being out of date if you don't manually update the
> > catalog.
>
>Yup. This might be a really nice way to go. It would be especially
>nice if a client could contact PyPI and ask for new data since a
>given time. I imagine that this request could be as cheap as the
>requests we have now, unless a client was very out of date.
Such a query could simply consist of which packages had been updated,
and the data could then be cleared from the local cache.
The downside to this approach is that it's not any faster for
anything you've never downloaded before.
So, I'm not really sure how to create a quality user experience with
edge caching alone. It seems to me that geographically localized
mirrors are needed to provide infrequent users and new users with
good performance. And presumably, the commercial users who are
having issues now, want their users as well as their developers to
have good performance.
(Personally, I find it extremely irritating every time the "yum"
package manager makes me wait for it to download a bunch of
repository data that isn't necessarily even related to what I just
asked it to do.)
From doug at hellfly.net Fri Jul 13 03:26:12 2007
From: doug at hellfly.net (Doug Hellmann)
Date: Thu, 12 Jul 2007 21:26:12 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
Message-ID: <786451BD-A013-48C1-87B9-884F46151B81@hellfly.net>
On Jul 12, 2007, at 7:14 PM, Jim Fulton wrote:
>>> 2. It's important to reduce the number of round trips.
>>
>> A colleague today suggested that the best way to reduce round trips
>> is to give each machine a local copy of the index, the same way
>> Debian apt works: you do 'apt-get update', and then have a local
>> copy of the catalog that you can build against. No roundtrips
>> at all (except for the one to update the local catalog), for the
>> expense of being out of date if you don't manually update the
>> catalog.
>
> Yup. This might be a really nice way to go. It would be especially
> nice if a client could contact PyPI and ask for new data since a
> given time. I imagine that this request could be as cheap as the
> requests we have now, unless a client was very out of date.
That sounds like RSS.
Doug
From martin at v.loewis.de Fri Jul 13 10:04:33 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 13 Jul 2007 10:04:33 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
Message-ID: <46973211.1060801@v.loewis.de>
>> A number of issues should be considered, of course:
>> - there should be a way to get authoritative answers somehow, preferably
>> from mirrors, but, if necessary, from the main site
>
> I don't know what you mean. I envision mirrors as being read-only and
> only used by setuptools. The main site would certainly be authoritative.
The problem is with outdated information. With a mirror, the question
is always "is my information current". Perhaps it's ok for users of
a mirror to use outdated information. However, when people register
a package, then use setuptools to install it, they might be puzzled
that it won't find the package just because it was using an outdated
mirror.
In many cases, it's fine to use outdated information, of course, e.g.
if you know that the package hasn't been released for many weeks now,
or in case you will update the next day again, and then fetch the
newer release.
> Yup. This might be a really nice way to go. It would be especially nice
> if a client could contact PyPI and ask for new data since a given time.
> I imagine that this request could be as cheap as the requests we have
> now, unless a client was very out of date.
PyPI already supports that: the updated_releases RPC call will return
all packages that have changed since a given date.
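A hedged sketch of how a client might use such a call to expire a local cache (Python 3). Only the method name `updated_releases` comes from this message; the argument (a timestamp) and the return shape (pairs of name and version) are assumptions, as is the helper name. `server` can be any object with that method, for example an `xmlrpc.client.ServerProxy`.

```python
import time


def invalidate_stale(cache, server, last_checked, now=None):
    """Drop cached entries for packages the server reports as changed.

    Returns the timestamp to pass as `last_checked` on the next call.
    """
    # Take the new timestamp before asking, so changes that land while
    # the request is in flight are seen again next time, not missed.
    now = int(time.time()) if now is None else now
    for name, version in server.updated_releases(last_checked):
        cache.pop(name, None)
    return now
```

Clock differences between client and server are ignored here; a real client would probably want the server to supply the timestamp.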
Regards,
Martin
From jim at zope.com Fri Jul 13 16:59:01 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 10:59:01 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46973211.1060801@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
<46973211.1060801@v.loewis.de>
Message-ID:
On Jul 13, 2007, at 4:04 AM, Martin v. Löwis wrote:
>>> A number of issues should be considered, of course:
>>> - there should be a way to get authoritative answers somehow,
>>> preferably
>>> from mirrors, but, if necessary, from the main site
>>
>> I don't know what you mean. I envision mirrors as being read-only
>> and
>> only used by setuptools. The main site would certainly be
>> authoritative.
>
> The problem is with outdated information. With a mirror, the question
> is always "is my information current". Perhaps it's ok for users of
> a mirror to use outdated information. However, when people register
> a package, then use setuptools to install it, they might be puzzled
> that it won't find the package just because it was using an outdated
> mirror.
I agree 100% with this concern, which is why I was skeptical of
caching in the classical form.
Right. So the question is, how can we keep the mirror up to date? :)
>> Yup. This might be a really nice way to go. It would be especially
>> nice
>> if a client could contact PyPI and ask for new data since a given
>> time.
>> I imagine that this request could be as cheap as the requests we have
>> now, unless a client was very out of date.
>
> PyPI already supports that: the updated_releases RPC call will return
> all packages that have changed since a given date.
Awesome! Too bad it wasn't shown in:
http://wiki.python.org/moin/CheeseShopXmlRpc
I'll look at the source (location hints welcome) and update that page.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Fri Jul 13 17:14:38 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 13 Jul 2007 17:14:38 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
<46973211.1060801@v.loewis.de>
Message-ID: <469796DE.805@v.loewis.de>
> Right. So the question is, how can we keep the mirror up to date? :)
I think there is no efficient way to provide perfect synchronization
(not without putting too much load on the central server again).
If slight propagation delays are acceptable, it would be possible that
the central server publishes sequence numbers of each update performed,
and mirrors could check with a single roundtrip what the most current
sequence number is.
Then it is the mirror's choice how much it can age; checking every
minute would be reasonable IMO for most purposes; users that want
to see their just-uploaded stuff then would either need to wait
that minute, or go to the master site, or fetch the sequence
number of the master site and compare it with the one of the mirror
they use.
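The sequence-number scheme can be modelled in a few lines of Python 3. Both classes are in-memory stand-ins invented for this sketch (nothing here is existing PyPI code); the point is that a poll costs one cheap call when nothing has changed, and that comparing serials tells a user whether a mirror is current.

```python
class Master:
    """Central server: numbers every update it performs."""

    def __init__(self):
        self.serial = 0
        self.log = []  # list of (serial, package name)

    def record_update(self, name):
        self.serial += 1
        self.log.append((self.serial, name))

    def current_serial(self):
        return self.serial

    def changes_since(self, serial):
        return [entry for entry in self.log if entry[0] > serial]


class Mirror:
    """Mirror: remembers the last serial it has caught up to."""

    def __init__(self, master):
        self.master = master
        self.serial = 0
        self.stale = set()  # packages that need to be re-fetched

    def poll(self):
        """One cheap round trip; ask for changes only when behind."""
        if self.master.current_serial() == self.serial:
            return False
        changes = self.master.changes_since(self.serial)
        for serial, name in changes:
            self.stale.add(name)
        if changes:
            self.serial = changes[-1][0]
        return True

    def is_current(self):
        return self.serial == self.master.current_serial()
```

How often `poll()` runs is then the mirror's choice, as described above.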
> I'll look at the source (location hints welcome) and update that page.
See
http://svn.python.org/view/trunk/pypi/rpc.py?rev=433&root=packages&view=markup
Regards,
Martin
From jim at zope.com Fri Jul 13 18:01:18 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 12:01:18 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <46973211.1060801@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
<46973211.1060801@v.loewis.de>
Message-ID: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
On Jul 13, 2007, at 4:04 AM, Martin v. Löwis wrote:
> PyPI already supports that: the updated_releases RPC call will return
> all packages that have changed since a given date.
It appears that this only shows new releases. If I upload a new
distribution to a release, it doesn't cause the release to appear as
updated. A common scenario for me is that I'll create a release,
upload a source release, and then, some time later, when someone bugs
me, I'll upload a windows egg. The way things are now, the later
upload won't be noticed. Of course, the initial upload won't be
noticed if someone happens to poll between release creation and the
first upload.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 13 18:02:33 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 12:02:33 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469796DE.805@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
<46973211.1060801@v.loewis.de>
<469796DE.805@v.loewis.de>
Message-ID:
On Jul 13, 2007, at 11:14 AM, Martin v. Löwis wrote:
>> Right. So the question is, how can we keep the mirror up to date? :)
>
> I think there is no efficient way to provide perfect synchronization
> (not without putting too much load on the central server again).
Well, if the mirrors were known, then the primary could notify
them. Of course, that would make them more complex. Of course
polling has its complexities too.
> If slight propagation delays are acceptable, it would be possible that
> the central server publishes sequence numbers of each update
> performed,
> and mirrors could check with a single roundtrip what the most current
> sequence number is.
If the updated_releases actually reflected updates, then I think that
would be good enough. Then we could use the UTC second as the
sequence number. :)
>
> Then it is the mirror's choice how much it can age; checking every
> minute would be reasonable IMO for most purposes;
Yup
> users that want
> to see their just-uploaded stuff then would either need to wait
> that minute, or go to the master site, or fetch the sequence
> number of the master site and compare it with the one of the mirror
> they use.
Yup
>
>> I'll look at the source (location hints welcome) and update that
>> page.
>
> See
>
> http://svn.python.org/view/trunk/pypi/rpc.py?
> rev=433&root=packages&view=markup
Thanks.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Fri Jul 13 18:50:50 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 13 Jul 2007 18:50:50 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
<46973211.1060801@v.loewis.de>
<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
Message-ID: <4697AD6A.1030602@v.loewis.de>
> It appears that this only shows new releases.
That's true. I don't know why it does that; it may be that this
interface predates file uploading.
> If I upload a new distribution to a release
With "distribution", you always mean "file", right?
Regards,
Martin
From martin at v.loewis.de Fri Jul 13 19:07:30 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 13 Jul 2007 19:07:30 +0200
Subject: [Catalog-sig] Effect of HTTP 1.1
Message-ID: <4697B152.7030304@v.loewis.de>
I did some measurements, with the script below.
For 30 requests, a single HTTP 1.1 connection
needs 5.4s over my DSL connection; 30 individual
connections need 11.7s. So if setuptools expects
to request multiple pages from the index, it would
definitely be useful to keep the connection
(I don't know at all whether it currently does so
already).
Regards,
Martin
import httplib, time

# Case 1: 30 requests over a single persistent HTTP 1.1 connection.
t = time.time()
h = httplib.HTTPConnection("cheeseshop.python.org")
for i in range(30):
    h.putrequest("GET", "/pypi/Lamina/")
    h.endheaders()
    r = h.getresponse()
    r.begin()
    r.read()  # read the whole body before sending the next request
h.close()
print time.time()-t

# Case 2: a new connection for each of the 30 requests.
t = time.time()
for i in range(30):
    h = httplib.HTTPConnection("cheeseshop.python.org")
    h.putrequest("GET", "/pypi/Lamina/")
    h.endheaders()
    r = h.getresponse()
    r.begin()
    r.read()
    h.close()
print time.time()-t
From jim at zope.com Fri Jul 13 19:45:10 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 13:45:10 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4697AD6A.1030602@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de>
<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
<4696988C.6050309@v.loewis.de>
<46973211.1060801@v.loewis.de>
<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
<4697AD6A.1030602@v.loewis.de>
Message-ID:
On Jul 13, 2007, at 12:50 PM, Martin v. Löwis wrote:
>> It appears that this only shows new releases.
>
> That's true. I don't know why it does that; it may be that this
> interface predates file uploading.
>
>> If I upload a new distribution to a release
>
> With "distribution", you always mean "file", right?
Yup.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 13 19:54:57 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 13:54:57 -0400
Subject: [Catalog-sig] Effect of HTTP 1.1
In-Reply-To: <4697B152.7030304@v.loewis.de>
References: <4697B152.7030304@v.loewis.de>
Message-ID: <657E3A38-2871-4B4F-9CBE-B5A777CFB9F5@zope.com>
On Jul 13, 2007, at 1:07 PM, Martin v. Löwis wrote:
> I did some measurements, with the script below.
> For 30 requests, a single HTTP 1.1 connection
> needs 5.4s over my DSL connection; 30 individual
> connections need 11.7s.
Interesting. Your DSL times for connection/request are actually
longer than what I'm seeing. Maybe geography isn't so important.
Measurements are good. It's going to be interesting to see how this
all pans out. It's definitely interesting that you doubled the
throughput using a single connection.
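A minimal sketch of the two strategies being compared, written for Python 3's http.client (the successor of httplib). The function name fetch_paths and the connection_factory hook are illustrative assumptions, not part of Martin's script; the hook only exists so the logic can be exercised without a network.

```python
import http.client


def fetch_paths(host, paths, reuse_connection=True,
                connection_factory=http.client.HTTPConnection):
    """Fetch each path with GET; return a list of (status, body_length).

    connection_factory is a hypothetical hook for testing; by default the
    real http.client.HTTPConnection is used.
    """
    results = []
    conn = None
    for path in paths:
        if conn is None:
            conn = connection_factory(host)
        conn.request("GET", path)
        response = conn.getresponse()
        # The body has to be read completely before the same connection
        # can carry the next request.
        body = response.read()
        results.append((response.status, len(body)))
        if not reuse_connection:
            # One connection per request: close and start over next time.
            conn.close()
            conn = None
    if conn is not None:
        conn.close()
    return results
```

With reuse_connection=True all requests share one connection; with False each request pays the connection setup cost again, which is the difference the measurement above is about.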
> So if setuptools expects
> to request multiple pages from the index, it would
> definitely be useful to keep the connection
> (I don't know at all whether it currently does so
> already).
I don't think so.
This also looks like a good optimization for xmlrpclib.
Thanks for trying this.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Fri Jul 13 20:34:45 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 13 Jul 2007 14:34:45 -0400
Subject: [Catalog-sig] Effect of HTTP 1.1
In-Reply-To: <4697B152.7030304@v.loewis.de>
References: <4697B152.7030304@v.loewis.de>
Message-ID: <20070713183235.817C13A40A8@sparrow.telecommunity.com>
At 07:07 PM 7/13/2007 +0200, Martin v. Löwis wrote:
>I did some measurements, with the script below.
>For 30 requests, a single HTTP 1.1 connection
>needs 5.4s over my DSL connection; 30 individual
>connections need 11.7s. So if setuptools expects
>to request multiple pages from the index, it would
>definitely be useful to keep the connection
>(I don't know at all whether it currently does so
>already).
It doesn't. I looked just now and found this, that looks like it
might produce the desired effect for easy_install:
http://linux.duke.edu/projects/urlgrabber/contents/urlgrabber/keepalive.py
Perhaps someone (Jim?) would like to try activating it in a process
using easy_install (i.e. doing the urllib2.install_opener dance), and
see if it gives a performance boost. If it works well, then perhaps
a patch for setuptools.package_index to use a custom opener is in order.
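For readers unfamiliar with the "install_opener dance", here is a rough sketch of its shape using urllib.request, the Python 3 name for urllib2. CountingHTTPHandler is a hypothetical stand-in that only counts requests; it is not the keepalive handler from urlgrabber, which would be passed in its place.

```python
import urllib.request


class CountingHTTPHandler(urllib.request.HTTPHandler):
    """Hypothetical stand-in for a keep-alive handler; it only counts requests."""

    def __init__(self):
        super().__init__()
        self.requests_seen = 0

    def http_request(self, request):
        # Called by the opener before each plain-HTTP request is sent.
        self.requests_seen += 1
        return super().http_request(request)


def install_custom_opener(handler):
    """Build an opener around handler and make urlopen() use it process-wide."""
    # build_opener() drops its default HTTPHandler when given a subclass instance.
    opener = urllib.request.build_opener(handler)
    urllib.request.install_opener(opener)
    return opener
```

After install_custom_opener(CountingHTTPHandler()) runs, every later urllib.request.urlopen() call in the process goes through the custom handler, which is why a single call early in the process is enough for an experiment.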
From jim at zope.com Fri Jul 13 20:51:54 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 14:51:54 -0400
Subject: [Catalog-sig] Effect of HTTP 1.1
In-Reply-To: <20070713183235.817C13A40A8@sparrow.telecommunity.com>
References: <4697B152.7030304@v.loewis.de>
<20070713183235.817C13A40A8@sparrow.telecommunity.com>
Message-ID: <5F747173-F02A-42A5-8767-ACDA61CD0C5C@zope.com>
On Jul 13, 2007, at 2:34 PM, Phillip J. Eby wrote:
> At 07:07 PM 7/13/2007 +0200, Martin v. Löwis wrote:
>> I did some measurements, with the script below.
>> For 30 requests, a single HTTP 1.1 connection
>> needs 5.4s over my DSL connection; 30 individual
>> connections need 11.7s. So if setuptools expects
>> to request multiple pages from the index, it would
>> definitely be useful to keep the connection
>> (I don't know at all whether it currently does so
>> already).
>
> It doesn't. I looked just now and found this, that looks like it
> might produce the desired effect for easy_install:
>
> http://linux.duke.edu/projects/urlgrabber/contents/urlgrabber/
> keepalive.py
>
> Perhaps someone (Jim?) would like to try activating it in a process
> using easy_install (i.e. doing the urllib2.install_opener dance),
> and see if it gives a performance boost. If it works well, then
> perhaps a patch for setuptools.package_index to use a custom opener
> is in order.
I'd be happy to do this sometime in the next few weeks.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 13 22:17:20 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 16:17:20 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4697D796.5080803@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de>
<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
<4697D796.5080803@v.loewis.de>
Message-ID:
On Jul 13, 2007, at 3:50 PM, Martin v. Löwis wrote:
>> It appears that this only shows new releases. If I update a new
>> distribution to a release, it doesn't cause the release to appear as
>> updated. A common scenario for me is that I'll create a release,
>> update a source release, and then, some time later, when someone bugs
>> me, I'll upload a windows egg. The way things are now, the later
>> upload won't be noticed. Of course, the initial upload won't be
>> noticed if someone happens to poll between release creation and the
>> first upload.
>
> Ok, I added another operation "changelog", that gives you four-tuples
> name, version, timestamp, action. It's the complete journal, except
> that privacy fields (author and IP) are not returned, and except
> changes to the package (rather than a specific release) are not
> returned.
Very cool. Thanks! It doesn't seem to catch file-uploads, either
through distutils or through the web. I uploaded a windows release
for zope.proxy this morning and I just (within the last half hour)
uploaded some eggs for
http://cheeseshop.python.org/pypi/zc.zodbrecipes/0.2.1
and am not seeing anything in the transcript.
> The possible values for "action" remain undocumented. If there is
> interested, people can propose a specification that PyPI should
> try to stick to; this specification should allow for
> still-undocumented action values (to allow addition of more actions).
I have no immediate use for action at this time other than as
documentation when interpreting the output.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Fri Jul 13 22:43:07 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 22:43:07 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de>
<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
<4697D796.5080803@v.loewis.de>
Message-ID: <4697E3DB.8070801@v.loewis.de>
> Very cool. Thanks! It doesn't seem to catch file-uploads, either
> through distutils or through the web. I uploaded a windows release for
> zope.proxy this morning and I just (withing the last half hour) uploaded
> some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/0.2.1 and
> am not seeing anything in the transcript.
It appears that file additions were logged without a package version
(just package name). I don't know why this is, but I changed changelog
to return all entries (so version may be None, using the XML-RPC nil
extension). I also started logging the version for the file.
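A client consuming these entries has to tolerate a version of None. A small sketch of one way to do that, assuming the entries arrive as (name, version, timestamp, action) tuples; the function name split_changelog is made up for illustration.

```python
from collections import defaultdict


def split_changelog(entries):
    """Separate release-level entries from entries that carry no version.

    entries: iterable of (name, version, timestamp, action) tuples, where
    version may be None.
    Returns (releases, unversioned): a dict mapping package name to the set
    of versions touched, and a set of package names that had at least one
    entry without a version.
    """
    releases = defaultdict(set)
    unversioned = set()
    for name, version, _timestamp, _action in entries:
        if version is None:
            unversioned.add(name)
        else:
            releases[name].add(version)
    return dict(releases), unversioned
```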
So please try again.
Regards,
Martin
From jim at zope.com Fri Jul 13 23:15:26 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 17:15:26 -0400
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <4697E3DB.8070801@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de>
<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
<4697D796.5080803@v.loewis.de>
<4697E3DB.8070801@v.loewis.de>
Message-ID: <8175AD9F-7D42-4C8C-8F97-2CAAA876F7D9@zope.com>
On Jul 13, 2007, at 4:43 PM, Martin v. Löwis wrote:
>> Very cool. Thanks! It doesn't seem to catch file-uploads, either
>> through distutils or through the web. I uploaded a windows release
>> for
>> zope.proxy this morning and I just (withing the last half hour)
>> uploaded
>> some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/
>> 0.2.1 and
>> am not seeing anything in the transcript.
>
> It appears that file additions were logged without a package version
> (just package name). I don't know why this is, but I changed changelog
> to return all entries (so version may be None, using the XML-RPC nil
> extension). I also started logging the version for the file.
>
> So please try again.
Works great! Thanks!
(Now I just wish I wasn't going to be offline all weekend.)
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Fri Jul 13 21:50:46 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 21:50:46 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de>
<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
Message-ID: <4697D796.5080803@v.loewis.de>
> It appears that this only shows new releases. If I update a new
> distribution to a release, it doesn't cause the release to appear as
> updated. A common scenario for me is that I'll create a release,
> update a source release, and then, some time later, when someone bugs
> me, I'll upload a windows egg. The way things are now, the later
> upload won't be noticed. Of course, the initial upload won't be
> noticed if someone happens to poll between release creation and the
> first upload.
Ok, I added another operation "changelog", that gives you four-tuples
name, version, timestamp, action. It's the complete journal, except
that privacy fields (author and IP) are not returned, and except
changes to the package (rather than a specific release) are not
returned.
The possible values for "action" remain undocumented. If there is
interest, people can propose a specification that PyPI should
try to stick to; this specification should allow for
still-undocumented action values (to allow addition of more actions).
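A sketch of how a client might call the new operation with Python 3's xmlrpc.client. Two things here are assumptions rather than facts stated above: that the XML-RPC endpoint is the /pypi URL, and that changelog takes a single "since" timestamp argument. The server parameter is only a hook so a stand-in object can be used for testing.

```python
import xmlrpc.client
from collections import namedtuple

ChangelogEntry = namedtuple("ChangelogEntry", "name version timestamp action")

# Assumed endpoint, not confirmed in this thread.
PYPI_XMLRPC_URL = "http://cheeseshop.python.org/pypi"


def fetch_changelog(since, server=None):
    """Return journal entries as ChangelogEntry tuples.

    since: timestamp of the oldest entry wanted (assumed call signature).
    server: any object with a changelog(since) method; defaults to a
    ServerProxy for the assumed endpoint.
    """
    if server is None:
        server = xmlrpc.client.ServerProxy(PYPI_XMLRPC_URL)
    # Each row is expected to be a four-element sequence:
    # name, version, timestamp, action.
    return [ChangelogEntry(*row) for row in server.changelog(since)]
```

Since the action values are undocumented, a client is probably best off treating the action field as opaque text and keying only on name and version.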
Regards,
Martin
From gentoodev at gmail.com Tue Jul 17 08:45:14 2007
From: gentoodev at gmail.com (Rob Cakebread)
Date: Mon, 16 Jul 2007 23:45:14 -0700
Subject: [Catalog-sig] PyPI command-line tool: yolk
Message-ID: <9b06ffb10707162345s7813d59dpc61b758b50d3df66@mail.gmail.com>
yolk 0.3.0 has been released and lets you use the new PyPI XML-RPC
methods 'changelog' and 'updated_releases'.
You can see the latest releases for the last N hours:
yolk -L 24
You can see a detailed ChangeLog of The Cheese Shop for the last N hours:
yolk -C 6
http://tools.assembla.com/yolk
From stuart at stuartbishop.net Wed Jul 18 11:58:11 2007
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Wed, 18 Jul 2007 16:58:11 +0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469522D6.1070706@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de>
Message-ID: <469DE433.4040405@stuartbishop.net>
Martin v. Löwis wrote:
>> The questions for us is, how much effort we are willing to make to
>> prevent people from shooting themselves in the foot. I can understand
>> why Phillip would like the package index to prevent people from choosing
>> problematic package names.
>
> That's not my understanding - the issue isn't with "problematic package
> names", but with conflicting package names. IOW, any single name is
> fine - it's a pair of names that would cause a problem (and only if
> you wanted to install both packages on the same system).
By not blocking registration of packages with similar names, we are creating
a security problem. If there is a popular package 'CoolStuff', I just have
to upload a trojan 'coolstuff' and suddenly people will end up using my
trojan which they thought was coming from a trusted source. I think this
attack vector is possible right now and only a BUGTRAQ post away from being
common knowledge.
I think blocking this is the responsibility of the package index, as it is
the first point that it is possible to do so.
I think a reasonable restriction would be printable ASCII only names and not
allowing registration of a package with a name differing only in case,
whitespace or punctuation.
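To make the proposed restriction concrete, here is a minimal sketch of such a check. The function names and the exact normalization (lowercase, keep only letters and digits) are one possible reading of "differing only in case, whitespace or punctuation", not an agreed specification.

```python
def normalize_name(name):
    """Return the comparison key for a package name.

    Raises ValueError for names that are empty, contain anything outside
    printable ASCII (space through tilde), or contain no letters or digits.
    """
    if not name or any(not (32 <= ord(ch) < 127) for ch in name):
        raise ValueError("package names must be printable ASCII: %r" % (name,))
    # Drop whitespace and punctuation, fold case.
    key = "".join(ch.lower() for ch in name if ch.isalnum())
    if not key:
        raise ValueError("package name has no letters or digits: %r" % (name,))
    return key


def find_conflict(new_name, existing_names):
    """Return an existing name that collides with new_name, or None.

    existing_names are assumed to be valid names already in the index.
    """
    key = normalize_name(new_name)
    for existing in existing_names:
        if existing != new_name and normalize_name(existing) == key:
            return existing
    return None
```

With this rule, registering 'coolstuff' while 'CoolStuff' exists would be rejected. Note that it would also treat names such as 'zc.buildout' and 'zcbuildout' as the same project, which may or may not be desirable.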
There are additional side benefits that fall out of this (being able to
optimize searches by doing exact matches rather than fuzzy, or avoiding
whole classes of case-sensitivity or Unicode bugs in other applications
integrating with the registry, or reducing confusion to end users, or
reducing the likelihood of less user-hostile systems being developed and
making the official registry irrelevant - heck, I work on a closed source
system that would happily take the business).
--
Stuart Bishop
http://www.stuartbishop.net/
From martin at v.loewis.de Thu Jul 19 00:07:30 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 19 Jul 2007 00:07:30 +0200
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469DE433.4040405@stuartbishop.net>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net>
Message-ID: <469E8F22.7080204@v.loewis.de>
> I think blocking this is the responsibility of the package index, as it is
> the first point that it is possible to do so.
Would you like to contribute a patch?
Regards,
Martin
From stuart at stuartbishop.net Thu Jul 19 06:00:53 2007
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu, 19 Jul 2007 11:00:53 +0700
Subject: [Catalog-sig] start on static generation,
and caching - apache config.
In-Reply-To: <469E8F22.7080204@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net>
<469E8F22.7080204@v.loewis.de>
Message-ID: <469EE1F5.7000802@stuartbishop.net>
Martin v. Löwis wrote:
>> I think blocking this is the responsibility of the package index, as it is
>> the first point that it is possible to do so.
>
> Would you like to contribute a patch?
Yes, but it would be rather pointless to make one if my analysis is
incorrect or it would be bounced for some non-technical reason so I emailed
it for discussion. I'm also unsure if switching to exact matching on a
normalized string instead of substring matching is good (well... it is good
for performance, but might not be good for UI).
I haven't looked at the source code to see how much work is involved yet -
if I find the Python code incomprehensible I should at least be able to do
the PostgreSQL side of things.
--
Stuart Bishop
http://www.stuartbishop.net/
From martin at v.loewis.de Thu Jul 19 09:17:15 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 19 Jul 2007 09:17:15 +0200
Subject: [Catalog-sig] Package naming (Was: start on static generation,
and caching - apache config.)
In-Reply-To: <469EE1F5.7000802@stuartbishop.net>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
<469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net>
<469E8F22.7080204@v.loewis.de> <469EE1F5.7000802@stuartbishop.net>
Message-ID: <469F0FFB.6010904@v.loewis.de>
> Yes, but it would be rather pointless to make one if my analysis is
> incorrect or it would be bounced for some non-technical reason so I emailed
> it for discussion. I'm also unsure if switching to exact matching on a
> normalized string instead of substring matching is good (well... it is good
> for performance, but might not be good for UI).
That's something completely different. I thought you were saying that
the Cheeseshop should block conflicting registrations. To implement
that, you only have to perform any normalization when a new project
is registered. There are roughly three new registrations per day, so
performance is irrelevant here.
Matching on lookup is rather a convenience to users; they can put
in a misspelled string and still find the package. OTOH, the search
interface already does case-insensitive matching; I doubt that doing
it in the URL adds much convenience. OTOH, it does add performance
(not convenience) to setuptools users, as setuptools could stop
downloading the complete package list to find the match.
But these are unrelated; if you want to contribute, it might be
best to just focus on the part that really worries you (namely
the security risk of conflicting registrations).
Regards,
Martin
From jim at zope.com Thu Jul 19 13:06:34 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 19 Jul 2007 07:06:34 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
Message-ID: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Over the past few months, we've struggled quite a bit with Python
Package Index (PyPI) performance and stability. Thanks to the heroic
efforts of Martin v. Löwis and others, performance and especially
stability have improved quite a bit. Martin has demonstrated that, at
least when running well, PyPI seems to answer most requests on the
order of 7 milliseconds (around 150 requests per second) internally.
That's not bad. Unfortunately for users, actual times can be quite a
bit longer. For me at work, requests take around 300 milliseconds.
For Martin, they seem to take somewhat longer. 300 milliseconds
isn't so bad for a request or two; however, easy_install can easily
make tens or even hundreds of requests to satisfy a user request for a
package. zc.buildout, when verifying that a large system with many
tens of packages has the most up to date versions of each package can
easily make thousands of requests.
Why do setuptools and buildout make so many requests? If a package
exposes more than one release, then setuptools checks the package's
main PyPI page and the pages for each release. We need to be able to
easily use older releases, so we can't hide old releases. Typical
projects of ours have many old releases exposed. Even if setuptools
were more clever in the way it searched PyPI, it would still have to
make a minimum of 2 requests per package for packages with multiple
versions exposed.
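A back-of-the-envelope sketch of why the request count matters. It assumes one request for the main page plus one per exposed release (and a single request when only one release is exposed), and it counts only round-trip latency, not the time to download large pages. The function names are illustrative.

```python
def index_requests(release_count):
    """Index pages read for one package under the behaviour described above."""
    if release_count > 1:
        # Main package page plus one page per exposed release.
        return 1 + release_count
    # Assumption: a package with a single exposed release needs one request.
    return 1


def estimated_seconds(request_count, latency_ms):
    """Total time spent waiting on the index, ignoring transfer time."""
    return request_count * latency_ms / 1000.0
```

For example, a package with 13 exposed releases needs 14 index requests, so at 300 milliseconds each the index alone costs roughly 4 seconds; a buildout that makes 1000 requests waits about 5 minutes.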
Another potential issue is that PyPI pages can be large. I've found
it convenient to use PyPI package pages as the home page for many of
my projects. I like to include package documentation in my project
pages. Perhaps this is an abuse of PyPI, but it is very convenient
for me and no one has complained. :) The zc.buildout pages are
around 200K. That's a fair bit of data for setuptools to download
and scan for download URLs.
In the course of this discussion, I've realized that it doesn't make
sense for setuptools to use the same interface that humans use.
setuptools doesn't need to see all of the data that is useful to
humans. Similarly, humans generally don't need to see all of the
historical releases for a project. I suggested a simple page format
designed just for setuptools. An alternative would be an xmlrpc
API. I prefer pages because I think that, over time, the amount of
requests from automated tools like easy_install and zc.buildout will
increase substantially and ultimately, will overwhelm dynamic
servers, even ones like PyPI that are reasonably fast. I also think
that a simple static collection of pages will be easier to mirror and
I think some number of geographic mirrors is likely to help some
people. I promised to prototype the format I suggested.
I've created an experimental prototype setuptools-specific package
index at
http://download.zope.org/ppix
Going to that page gives brief instructions for using it with
easy_install and zc.buildout. To see an individual package page, add
the package name to the URL, as in:
http://download.zope.org/ppix/setuptools/
A few things to note about this:
- I don't expose a long package list at
http://download.zope.org/ppix/. The long package list would be
expensive to download and supports a use case that I consider to be
of negative value, which is installing packages with case-insensitive
package names. I think it is important for humans to be able to
search for packages using case-insensitive search terms, but I think
that, after identifying a package, precise package names should be
used. I think it is especially important that precise package names
be used in package requirements.
- There is a single page per package. This can greatly reduce the
number of requests. Packages that store all of their distributions
in PyPI and that don't have off-site home pages or download URLs can
be scanned with a single request. Note that I excluded home page and
download URLs that pointed back to the package's PyPI page, as that
wouldn't provide any new information to setuptools.
- Download URLs for *hidden* packages are included. Humans don't
need to see old revisions, but setuptools-based tools do. If we used
an index like this for setuptools, we could stop unhiding old
releases when we created new releases in PyPI. This would make PyPI
more useful to humans and less of a pain for developers.
- Download URLs are the same as they are in PyPI. Using this new
index, distributions are still downloaded from PyPI, so the index
doesn't affect PyPI download statistics.
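The link-selection rules in the list above could be sketched roughly as follows. This is not the actual ppix code; the function name page_links and its parameters are assumptions made for illustration.

```python
def page_links(pypi_page_url, file_urls, home_page=None, download_url=None):
    """Collect the links to put on a single per-package index page.

    pypi_page_url: the package's own PyPI page.
    file_urls: download URLs for every release file, hidden releases
        included, unchanged from what PyPI serves.
    home_page, download_url: optional off-site links from the metadata.
    """
    base = pypi_page_url.rstrip("/")
    links = list(file_urls)
    for extra in (home_page, download_url):
        if not extra:
            continue
        candidate = extra.rstrip("/")
        # A link back to the package's own PyPI page tells setuptools
        # nothing new, so it is left out.
        if candidate == base or candidate.startswith(base + "/"):
            continue
        if extra not in links:
            links.append(extra)
    return links
```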
To see the impact of this, it's interesting to look at installing
zc.buildout using easy_install from PyPI and from the experimental
index:
Installing using PyPI looks like this:
(env)jim@ds9:~/tmp$ time easy_install zc.buildout
Searching for zc.buildout
Reading http://cheeseshop.python.org/pypi/zc.buildout/
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
Reading http://svn.zope.org/zc.buildout
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
Best match: zc.buildout 1.0.0b28
Downloading http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
Processing zc.buildout-1.0.0b28-py2.5.egg
creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/python2.5
Adding zc.buildout 1.0.0b28 to easy-install.pth file
Installing buildout script to /home/jim/tmp/env/bin/
Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
Processing dependencies for zc.buildout
Searching for setuptools==0.6c6
Best match: setuptools 0.6c6
Processing setuptools-0.6c6-py2.5.egg
Adding setuptools 0.6c6 to easy-install.pth file
Installing easy_install script to /home/jim/tmp/env/bin/
Installing easy_install-2.5 script to /home/jim/tmp/env/bin/
Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-py2.5.egg
Processing dependencies for setuptools==0.6c6
Finished processing dependencies for setuptools==0.6c6
Finished installing setuptools==0.6c6
Finished processing dependencies for zc.buildout
Finished installing zc.buildout
real 0m31.360s
user 0m1.136s
sys 0m0.060s
Note the large number of pages read. Here I was installing a single
package with one dependency, setuptools, that was already installed.
Let's look at this again using the experimental index:
(env)jim@ds9:~/tmp$ time easy_install -i http://download.zope.org/ppix zc.buildout
Searching for zc.buildout
Reading http://download.zope.org/ppix/zc.buildout/
Best match: zc.buildout 1.0.0b28
Downloading http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
Processing zc.buildout-1.0.0b28-py2.5.egg
creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/python2.5
Adding zc.buildout 1.0.0b28 to easy-install.pth file
Installing buildout script to /home/jim/tmp/env/bin/
Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
Processing dependencies for zc.buildout
Searching for setuptools==0.6c6
Best match: setuptools 0.6c6
Processing setuptools-0.6c6-py2.5.egg
Adding setuptools 0.6c6 to easy-install.pth file
Installing easy_install script to /home/jim/tmp/env/bin/
Installing easy_install-2.5 script to /home/jim/tmp/env/bin/
Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-py2.5.egg
Processing dependencies for setuptools==0.6c6
Finished processing dependencies for setuptools==0.6c6
Finished installing setuptools==0.6c6
Finished processing dependencies for zc.buildout
Finished installing zc.buildout
real 0m7.006s
user 0m0.244s
sys 0m0.040s
Note:
- We made far fewer requests with the new index
- Most of the time in the second example was spent actually
downloading the buildout distribution. Most of the time in the first
example was spent reading the index.
- I used workingenv to create clean environments for each of the
examples above.
WRT zc.buildout, refreshing a buildout with just ZODB installed in it
takes about 45 seconds for me using PyPI and about 5 seconds using
the experimental index.
Some of the speed improvement is due to the fact that the
experimental index is much closer to me (on the net) than PyPI. ATM,
requests to PyPI take *me* around 500 milliseconds, while requests to
the experimental index are taking between 100 and 300 milliseconds.
(I'm at home and this seems to be somewhat variable.) Most of the
speed improvements are from reducing the number of requests.
I'm polling PyPI once a minute to get and apply updates. Thanks to
the new XML-RPC method that Martin added, this is very efficient to do.
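One polling step could look roughly like the sketch below. It is not the code running behind the experimental index; poll_once and the fetch_entries callable are hypothetical names, and the entries are assumed to be (name, version, timestamp, action) tuples as returned by the changelog method.

```python
def poll_once(fetch_entries, last_timestamp):
    """Run one polling step against the journal.

    fetch_entries: callable taking a timestamp and returning the entries
        logged since then.
    last_timestamp: timestamp of the newest entry already processed.
    Returns (names_to_refresh, new_last_timestamp).
    """
    entries = fetch_entries(last_timestamp)
    # Each package page is regenerated once, however many entries it has.
    names = sorted({entry[0] for entry in entries})
    newest = max((entry[2] for entry in entries), default=last_timestamp)
    return names, max(newest, last_timestamp)
```

A driver would call poll_once in a loop, regenerate the pages for the returned names, remember the returned timestamp, and sleep 60 seconds between calls. When nothing changed, the step costs a single small request.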
I encourage people to check this out and even try using it with
easy_install and especially buildout. AFAIK, aside from being much
faster and showing download files for hidden releases it is
completely equivalent to PyPI for setuptools use. My intention is to
keep this experimental index going and up to date for the foreseeable
future and plan to use it for all my work.
My primary goal is to prototype the new index format. If this seems
useful, then I think that www.python.org should expose an index in
this format to setuptools, either at a different URL or by satisfying
setuptools requests from the index based on client information. I'd
love to see this index populated via a baking mechanism that updates
package pages when they change, rather than through polling as I'm
doing.
There would be some benefit to having geographic mirrors. I suspect
that having such mirrors available would improve performance further,
at least for some folks. It might also be useful to have some
mirrors for redundancy purposes. Note though that what I'm doing is
mirroring only the index data. I'm not mirroring distributions. Of
course, I'd be happy to make my software available. (It already is
via our subversion repository.)
I hope this effort spurs useful discussion and progress.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Fri Jul 20 10:21:18 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 20 Jul 2007 10:21:18 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <46A0707E.6000103@v.loewis.de>
> I've created and experimental prototype setuptools-specific package
> index at
>
> http://download.zope.org/ppix
Cool! If this proves useful, people are encouraged to contribute the
proper patches to PyPI to regenerate the page directly on each log
change.
There is a slight transactional trickiness to doing so: If you
regenerate before the commit, it might be that the commit fails;
then you would have to rollback the page update, too. If you
regenerate after commit, it might be that you run into race
conditions if the same package sees two updates in two
transactions very quickly, and the second regeneration completes
before the first one.
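One way to sidestep the after-commit race can be sketched as follows. This
assumes each committed change carries a monotonically increasing serial
number; the names PageStore and serial are hypothetical and not part of
PyPI. A regeneration that finishes late is simply dropped if a page from a
later commit is already in place.

```python
import threading


class PageStore:
    """Keeps the newest rendered page per package, keyed by commit serial.

    Hypothetical helper for illustration; PyPI has no such class.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._pages = {}  # package name -> (serial, html)

    def publish(self, name, serial, html):
        """Store html unless a page from a later commit is already stored.

        Returns True if the page was written, False if it was stale.
        """
        with self._lock:
            current = self._pages.get(name)
            if current is not None and current[0] >= serial:
                return False  # a newer regeneration already finished
            self._pages[name] = (serial, html)
            return True

    def get(self, name):
        with self._lock:
            return self._pages[name][1]


store = PageStore()
# The regeneration for commit 2 completes before the one for commit 1.
store.publish("setuptools", 2, "<html>after commit 2</html>")
store.publish("setuptools", 1, "<html>after commit 1</html>")
print(store.get("setuptools"))
```

The page from commit 2 survives even though the regeneration for commit 1
finished last.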
If people would find it easier to make these pages dynamic,
such patches would also be kindly accepted. Generating the
pages on access should be fairly cheap; the SQL is
select filename,md5_digest from release_files where name='setuptools';
and putting the result of that into a ppix-like HTML page
should be much faster than invoking ZPT.
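A minimal sketch of the dynamic approach, under stated assumptions: sqlite3
stands in for the real Postgres database, the release_files table has only
the three columns used here, and the base URL and page layout are made up
for illustration. Only the query itself comes from the message above.

```python
import sqlite3
from html import escape


def render_package_page(name, rows, base_url):
    """Render (filename, md5_digest) rows as a bare list of download links."""
    links = []
    for filename, md5_digest in rows:
        href = "%s/%s#md5=%s" % (base_url, filename, md5_digest)
        links.append('<a href="%s">%s</a><br/>'
                     % (escape(href, quote=True), escape(filename)))
    return ("<html><head><title>Links for %s</title></head><body>\n%s\n"
            "</body></html>" % (escape(name), "\n".join(links)))


# In-memory stand-in for the PyPI database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table release_files (name text, filename text, md5_digest text)")
conn.execute(
    "insert into release_files values (?, ?, ?)",
    ("zc.buildout", "zc.buildout-1.0.0b28-py2.5.egg",
     "4e37e53f010ed7984555a029732f479d"))

# Same query as above, with the package name passed as a parameter.
rows = conn.execute(
    "select filename, md5_digest from release_files where name = ?",
    ("zc.buildout",)).fetchall()

page = render_package_page(
    "zc.buildout", rows, "http://example.invalid/packages")
print(page)
```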
Regards,
Martin
From ct at gocept.com Fri Jul 20 12:02:45 2007
From: ct at gocept.com (Christian Theune)
Date: Fri, 20 Jul 2007 12:02:45 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <1184925765.6519.3.camel@mindy>
Am Donnerstag, den 19.07.2007, 07:06 -0400 schrieb Jim Fulton:
> I promised to prototype the format I suggested.
>
> I've created an experimental prototype setuptools-specific package
> index at
>
> http://download.zope.org/ppix
Yay! This works like a charm!
> There would be some benefit to having geographic mirrors. I suspect
> that having such mirrors available would improve performance further,
> at least for some folks. It might also be useful to have some
> mirrors for redundancy purposes. Note though that what I'm doing is
> mirroring only the index data. I'm not mirroring distributions. Of
> course, I'd be happy to make my software available. (It already is
> via our subversion repository.)
I'd be happy to support mirroring once all this is sorted out. I can
offer a server in Germany/Europe.
Christian
From jim at zope.com Fri Jul 20 13:45:57 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 07:45:57 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <46A0707E.6000103@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A0707E.6000103@v.loewis.de>
Message-ID: <5105308E-F651-438B-8C3D-F5FCAF8A8351@zope.com>
On Jul 20, 2007, at 4:21 AM, Martin v. Löwis wrote:
>> I've created an experimental prototype setuptools-specific package
>> index at
>>
>> http://download.zope.org/ppix
>
> Cool! If this proves useful, people are encouraged to contribute the
> proper patches to PyPI to regenerate the page directly on each log
> change.
>
> There is a slight transactional trickiness to doing so: If you
> regenerate before the commit, it might be that the commit fails;
> then you would have to rollback the page update, too. If you
> regenerate after commit, it might be that you run into race
> conditions if the same package sees two updates in two
> transactions very quickly, and the second regeneration completes
> before the first one.
>
> If people would find it easier to make these pages dynamic,
> such patches would also be kindly accepted. Generating the
> pages on access should be fairly cheap; the SQL is
>
> select filename,md5_digest from release_files where name='setuptools';
>
> and putting the result of that into a ppix-like HTML page
> should be much faster than invoking ZPT.
A few notes.
It is important to show files from hidden releases as well as
unhidden releases. I suspect the select statement above does that.
I parse long descriptions to get #egg= links. I also give some
special care to urls that point back to PyPI to avoid having
setuptools go back to the human interface.
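Both steps can be sketched roughly like this. It is not the actual ppix
code: the regular expression, the host list and the example URL are
assumptions chosen for illustration.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: any http(s) URL that carries an #egg= fragment.
EGG_LINK = re.compile(r"https?://[^\s<>]+#egg=[^\s<>]+")

# Hosts that served the human PyPI interface at the time (assumed list).
PYPI_HOSTS = {"cheeseshop.python.org", "www.python.org"}


def egg_links(long_description):
    """Return the #egg= URLs found in a package's long description."""
    return EGG_LINK.findall(long_description)


def points_back_to_pypi(url):
    """True for links into the human interface, which should be skipped."""
    parts = urlparse(url)
    return parts.hostname in PYPI_HOSTS and parts.path.startswith("/pypi")


description = (
    "Development version:\n"
    "http://svn.example.invalid/proj/trunk#egg=proj-dev\n"
    "See the docs for details.")
print(egg_links(description))
print(points_back_to_pypi("http://cheeseshop.python.org/pypi/zc.buildout/"))
print(points_back_to_pypi("http://svn.zope.org/zc.buildout"))
```

Distribution download URLs under /packages/ on the same host are not
filtered, since only paths starting with /pypi count as the human interface
in this sketch.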
It might be easiest to just trigger the existing ppix sw to poll
after a change. Thanks to your xmlrpc addition, polling is quite
cheap. Alternatively, we could install the existing software in a
way that polls more or less continuously. This would be quite
trivial. What you suggest is probably cleaner but requires some
expertise with the current software. :) I'd much rather generate
static files (as I'm doing now) than serve these dynamically.
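The polling idea can be sketched as below. I am assuming the XML-RPC method
behaves like a changelog call that takes a timestamp and returns tuples
starting with the package name; fetch_changes and regenerate are
hypothetical callables, so the sketch runs without touching the network.
The change records in the example are made up.

```python
import time


def poll_once(fetch_changes, regenerate, since):
    """Regenerate static pages for every package changed after `since`.

    Returns the timestamp to pass as `since` on the next poll.
    """
    # Take the timestamp before fetching so that changes arriving while
    # we work are picked up by the next poll rather than lost.
    now = int(time.time())
    changed = {name for name, *_ in fetch_changes(since)}
    for name in sorted(changed):
        regenerate(name)
    return now


def fake_changes(since):
    # Made-up records shaped like (name, version, timestamp, action).
    return [
        ("zc.buildout", "1.0.0b28", 1184900000, "new release"),
        ("zc.buildout", "1.0.0b28", 1184900001, "add file"),
        ("setuptools", "0.6c6", 1184900002, "update"),
    ]


regenerated = []
next_since = poll_once(fake_changes, regenerated.append, since=0)
print(regenerated)
```

Each changed package is regenerated once per poll, however many change
records it has.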
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 20 13:48:39 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 07:48:39 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <1184925765.6519.3.camel@mindy>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<1184925765.6519.3.camel@mindy>
Message-ID: <465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
On Jul 20, 2007, at 6:02 AM, Christian Theune wrote:
...
> I'd be happy to support mirroring once all this is sorted out. I can
> offer a server in Germany/Europe.
If we decide that mirrors would be a good idea, it will be important,
imo, to select mirror sites based on their connectivity. The goal of
the mirrors should be to try to give people options with short
network distances.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From ct at gocept.com Fri Jul 20 13:52:12 2007
From: ct at gocept.com (Christian Theune)
Date: Fri, 20 Jul 2007 13:52:12 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<1184925765.6519.3.camel@mindy>
<465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
Message-ID: <1184932332.6519.11.camel@mindy>
Am Freitag, den 20.07.2007, 07:48 -0400 schrieb Jim Fulton:
> On Jul 20, 2007, at 6:02 AM, Christian Theune wrote:
> ...
> > I'd be happy to support mirroring once all this is sorted out. I can
> > offer a server in Germany/Europe.
>
> If we decide that mirrors would be a good idea, it will be important,
> imo, to select mirror sites based on their connectivity. The goal of
> the mirrors should be to try to give people options with short
> network distances.
Right. However, do you have any specific, measurable parameters in
mind?
(Our server is reasonably well connected, reachable with about 5 hops
from within Germany with latency around 40ms on a DSL line. Multiple
GBit lines to the hosting center.)
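One measurable parameter would be connection latency as seen from a client.
A rough sketch, assuming TCP connect time is an acceptable proxy for network
distance; the mirror names and sample values are made up, and
connect_time_ms is defined but deliberately not called here.

```python
import socket
import statistics
import time


def connect_time_ms(host, port=80, timeout=5.0):
    """Time one TCP connect to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def rank_mirrors(samples_ms):
    """Order mirror names by median latency, lowest first.

    samples_ms maps a mirror name to a list of measurements in ms.
    """
    return sorted(
        samples_ms, key=lambda name: statistics.median(samples_ms[name]))


# Made-up measurements for two hypothetical mirrors.
samples = {
    "mirror-far": [300, 280, 500, 310],
    "mirror-near": [40, 42, 39, 41],
}
print(rank_mirrors(samples))
```

The median is used rather than the mean so that a single slow sample does
not dominate the ranking.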
Christian
From jodok at lovelysystems.com Fri Jul 20 10:50:40 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Fri, 20 Jul 2007 10:50:40 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <6000B516-6593-4A98-AA08-B6C7B329BC62@lovelysystems.com>
thanks jim.
you saved our day. we'll send some austrian cheese over :)
jodok
On 19.07.2007, at 13:06, Jim Fulton wrote:
> Over the past few months, we've struggled quite a bit with Python
> Package Index (PyPI) performance and stability. Thanks to the heroic
> efforts of Martin v. Löwis and others, performance and especially
> stability have improved quite a bit. Martin has demonstrated that, at
> least when running well, PyPI seems to answer most requests on the
> order of 7 milliseconds (around 150 requests per second) internally.
> That's not bad. Unfortunately for users, actual times can be quite a
> bit longer. For me at work, requests take around 300 milliseconds.
> For Martin, they seem to take somewhat longer. 300 milliseconds
> isn't so bad for a request or two; however, easy_install can easily
> make tens or even hundreds of requests to satisfy a user request for a
> package. zc.buildout, when verifying that a large system with many
> tens of packages has the most up to date versions of each package can
> easily make thousands of requests.
>
> Why do setuptools and buildout make so many requests? If a package
> exposes more than one release, then setuptools checks the package's
> main PyPI page and the pages for each release. We need to be able to
> easily use older releases, so we can't hide old releases. Typical
> projects of ours have many old releases exposed. Even if setuptools
> were more clever in the way it searched PyPI, it would still have to
> make a minimum of 2 requests per package for packages with multiple
> versions exposed.
>
> Another potential issue is that PyPI pages can be large. I've found
> it convenient to use PyPI package pages as the home page for many of
> my projects. I like to include package documentation in my project
> pages. Perhaps this is an abuse of PyPI, but it is very convenient
> for me and no one has complained. :) The zc.buildout pages are
> around 200K. That's a fair bit of data for setuptools to download
> and scan for download URLs.
>
> In the course of this discussion, I've realized that it doesn't make
> sense for setuptools to use the same interface that humans use.
> setuptools doesn't need to see all of the data that is useful to
> humans. Similarly, humans generally don't need to see all of the
> historical releases for a project. I suggested a simple page format
> designed just for setuptools. An alternative would be an xmlrpc
> API. I prefer pages because I think that, over time, the amount of
> requests from automated tools like easy_install and zc.buildout will
> increase substantially and ultimately, will overwhelm dynamic
> servers, even ones like PyPI that are reasonably fast. I also think
> that a simple static collection of pages will be easier to mirror and
> I think some number of geographic mirrors is likely to help some
> people. I promised to prototype the format I suggested.
>
> I've created an experimental prototype setuptools-specific package
> index at
>
> http://download.zope.org/ppix
>
> Going to that page gives brief instructions for using it with
> easy_install and zc.buildout. To see an individual package page, add
> the package name to the URL, as in:
>
> http://download.zope.org/ppix/setuptools/
>
> A few things to note about this:
>
> - I don't expose a long package list at http://download.zope.org/ppix/.
> The long package list would be expensive to download and
> supports a use case that I consider to be of negative value, which is
> installing packages with case-insensitive package names. I think it
> is important for humans to be able to search for packages using case-
> insensitive search terms, but I think that, after identifying a
> package, precise package names should be used. I think it is
> especially important that precise package names be used in package
> requirements.
>
> - There is a single page per package. This can greatly reduce the
> number of requests. Packages that store all of their distributions
> in PyPI and that don't have off-site home pages or download URLs can
> be scanned with a single request. Note that I excluded home page and
> download URLs that pointed back to the package's PyPI page, as that
> wouldn't provide any new information to setuptools.
>
> - Download URLs for *hidden* packages are included. Humans don't
> need to see old revisions, but setuptools-based tools do. If we used
> an index like this for setuptools, we could stop unhiding old
> releases when we created new releases in PyPI. This would make PyPI
> more useful to humans and less of a pain for developers.
>
> - Download URLs are the same as they are in PyPI. Using this new
> index, distributions are still downloaded from PyPI, so the index
> doesn't affect PyPI download statistics.
>
> To see the impact of this, it's interesting to look at installing
> zc.buildout using easy_install from PyPI and from the experimental
> index:
> Installing using PyPI looks like this:
>
> (env)jim at ds9:~/tmp$ time easy_install zc.buildout
> Searching for zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
> Reading http://svn.zope.org/zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
> Best match: zc.buildout 1.0.0b28
> Downloading http://cheeseshop.python.org/packages/2.5/z/
> zc.buildout/zc.buildout-1.0.0b28-
> py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
> Processing zc.buildout-1.0.0b28-py2.5.egg
> creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
> Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/
> python2.5
> Adding zc.buildout 1.0.0b28 to easy-install.pth file
> Installing buildout script to /home/jim/tmp/env/bin/
>
> Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
> Processing dependencies for zc.buildout
> Searching for setuptools==0.6c6
> Best match: setuptools 0.6c6
> Processing setuptools-0.6c6-py2.5.egg
> Adding setuptools 0.6c6 to easy-install.pth file
> Installing easy_install script to /home/jim/tmp/env/bin/
> Installing easy_install-2.5 script to /home/jim/tmp/env/bin/
>
> Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-
> py2.5.egg
> Processing dependencies for setuptools==0.6c6
> Finished processing dependencies for setuptools==0.6c6
> Finished installing setuptools==0.6c6
> Finished processing dependencies for zc.buildout
> Finished installing zc.buildout
>
> real 0m31.360s
> user 0m1.136s
> sys 0m0.060s
>
> Note the large number of pages read. Here I was installing a single
> package with one dependency, setuptools, that was already installed.
> Let's look at this again using the experimental index:
>
> (env)jim at ds9:~/tmp$ time easy_install -i http://download.zope.org/ppix zc.buildout
> Searching for zc.buildout
> Reading http://download.zope.org/ppix/zc.buildout/
> Best match: zc.buildout 1.0.0b28
> Downloading http://cheeseshop.python.org/packages/2.5/z/
> zc.buildout/zc.buildout-1.0.0b28-
> py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
> Processing zc.buildout-1.0.0b28-py2.5.egg
> creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
> Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/
> python2.5
> Adding zc.buildout 1.0.0b28 to easy-install.pth file
> Installing buildout script to /home/jim/tmp/env/bin/
>
> Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
> Processing dependencies for zc.buildout
> Searching for setuptools==0.6c6
> Best match: setuptools 0.6c6
> Processing setuptools-0.6c6-py2.5.egg
> Adding setuptools 0.6c6 to easy-install.pth file
> Installing easy_install script to /home/jim/tmp/env/bin/
> Installing easy_install-2.5 script to /home/jim/tmp/env/bin/
>
> Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-
> py2.5.egg
> Processing dependencies for setuptools==0.6c6
> Finished processing dependencies for setuptools==0.6c6
> Finished installing setuptools==0.6c6
> Finished processing dependencies for zc.buildout
> Finished installing zc.buildout
>
> real 0m7.006s
> user 0m0.244s
> sys 0m0.040s
>
> Note:
>
> - We made far fewer requests with the new index
>
> - Most of the time in the second example was spent actually
> downloading the buildout distribution. Most of the time in the first
> example was spent reading the index.
>
> - I used workingenv to create clean environments for each of the
> examples above.
>
> WRT zc.buildout, refreshing a buildout with just ZODB installed in it
> takes about 45 seconds for me using PyPI and about 5 seconds using
> the experimental index.
>
> Some of the speed improvement is due to the fact that the
> experimental index is much closer to me (on the net) than PyPI. ATM,
> requests to PyPI take *me* around 500 milliseconds, while requests to
> the experimental index are taking between 100 and 300 milliseconds.
> (I'm at home and this seems to be somewhat variable.) Most of the
> speed improvements are from reducing the number of requests.
>
> I'm polling PyPI once a minute to get and apply updates. Thanks to
> the new XML-RPC method that Martin added, this is very efficient to
> do.
>
> I encourage people to check this out and even try using it with
> easy_install and especially buildout. AFAIK, aside from being much
> faster and showing download files for hidden releases it is
> completely equivalent to PyPI for setuptools use. My intention is to
> keep this experimental index going and up to date for the foreseeable
> future and plan to use it for all my work.
>
> My primary goal is to prototype the new index format. If this seems
> useful, then I think that www.python.org should expose an index in
> this format to setuptools, either at a different URL or by satisfying
> setuptools requests from the index based on client information. I'd
> love to see this index populated via a baking mechanism that updates
> package pages when they change, rather than through polling as I'm
> doing.
>
> There would be some benefit to having geographic mirrors. I suspect
> that having such mirrors available would improve performance further,
> at least for some folks. It might also be useful to have some
> mirrors for redundancy purposes. Note though that what I'm doing is
> mirroring only the index data. I'm not mirroring distributions. Of
> course, I'd be happy to make my software available. (It already is
> via our subversion repository.)
>
> I hope this effort spurs useful discussion and progress.
>
> Jim
>
> --
> Jim Fulton mailto:jim at zope.com Python Powered!
> CTO (540) 361-1714 http://www.python.org
> Zope Corporation http://www.zope.com http://www.zope.org
>
>
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
--
"Although never is often better than *right* now."
-- The Zen of Python, by Tim Peters
Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77
From jim at zope.com Fri Jul 20 15:42:37 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 09:42:37 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <1184932332.6519.11.camel@mindy>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<1184925765.6519.3.camel@mindy>
<465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
<1184932332.6519.11.camel@mindy>
Message-ID: <5686B35D-34DD-49FE-A8E7-37397A4AE808@zope.com>
On Jul 20, 2007, at 7:52 AM, Christian Theune wrote:
> Am Freitag, den 20.07.2007, 07:48 -0400 schrieb Jim Fulton:
>> On Jul 20, 2007, at 6:02 AM, Christian Theune wrote:
>> ...
>>> I'd be happy to support mirroring once all this is sorted out. I can
>>> offer a server in Germany/Europe.
>>
>> If we decide that mirrors would be a good idea, it will be important,
>> imo, to select mirror sites based on their connectivity. The goal of
>> the mirrors should be to try to give people options with short
>> network distances.
>
> Right, however, do you have any specific parameters that can be
> measured
> in mind?
I'm not enough of a network expert. Hopefully, someone more
knowledgeable will make a suggestion. BTW, with the current PyPI
performance, I'm guessing we could have 10s of mirrors poll once a
minute without affecting other users.
> (Our server is reasonably well connected, reachable with about 5 hops
> from within Germany with latency around 40ms on a DSL line. Multiple
> GBit lines to the hosting center.)
I didn't mean to suggest that you weren't well connected.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Fri Jul 20 22:09:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 20 Jul 2007 16:09:39 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <20070720200721.88E1D3A403A@sparrow.telecommunity.com>
At 07:06 AM 7/19/2007 -0400, Jim Fulton wrote:
>I've created an experimental prototype setuptools-specific package
>index at
>
> http://download.zope.org/ppix
>
>Going to that page gives brief instructions for using it with
>easy_install and zc.buildout.
FYI, the handling of homepage and download links is broken. You have
e.g. 'meta="homepage"' instead of 'rel="homepage"', so easy_install
doesn't pick these up and look for links there, meaning that ppix
fails to find downloads for e.g. pywin32 which is hosted at Sourceforge.
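To illustrate the difference, here is a small sketch that collects only
anchors marked with rel="homepage" or rel="download". It uses the standard
library's HTML parser and is not setuptools' actual implementation; the
example URL is hypothetical.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collects hrefs of <a> tags whose rel is homepage or download."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if attrs.get("href") and ("homepage" in rels or "download" in rels):
            self.links.append(attrs["href"])


def rel_links(html):
    parser = RelLinkParser()
    parser.feed(html)
    return parser.links


broken = '<a href="http://example.invalid/project" meta="homepage">home</a>'
fixed = '<a href="http://example.invalid/project" rel="homepage">home</a>'
print(rel_links(broken))  # the meta= attribute is ignored
print(rel_links(fixed))
```

With meta="homepage" the link is never followed, which is why off-site
downloads go missing.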
(On a perhaps not entirely unrelated note, the Cheeseshop appears to
be down at the moment:
"""Error...
There's been a problem with your request
psycopg.OperationalError: no connection to the server""")
By the way, I'd suggest explaining (or linking to an explanation) on
the ppix main page describing how to configure easy_install such that
the '-i' option isn't necessary. Perhaps we could add an example to
the EasyInstall docs somewhere near:
http://peak.telecommunity.com/DevCenter/EasyInstall#creating-your-own-package-index
and then link to it from the ppix page.
From jim at zope.com Fri Jul 20 22:07:08 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 16:07:08 -0400
Subject: [Catalog-sig] PyPI is down with a psycopg error
Message-ID: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
Requests to http://www.python.org/pypi are giving:
Error...
There's been a problem with your request
psycopg.OperationalError: no connection to the server
This (or something like it) has been happening since 19:54 UTC. I know
because my once a minute cron job to update ppix has been failing
since then. :)
The good news is that folks who have switched to using
http://download.zope.org/ppix/ for setuptools (easy_install and buildout)
are unaffected.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Fri Jul 20 22:18:55 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 16:18:55 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <20070720200721.88E1D3A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<20070720200721.88E1D3A403A@sparrow.telecommunity.com>
Message-ID: <24B11DD1-DD79-4171-A38F-06B642EC354B@zope.com>
On Jul 20, 2007, at 4:09 PM, Phillip J. Eby wrote:
> At 07:06 AM 7/19/2007 -0400, Jim Fulton wrote:
>> I've created an experimental prototype setuptools-specific package
>> index at
>>
>> http://download.zope.org/ppix
>>
>> Going to that page gives brief instructions for using it with
>> easy_install and zc.buildout.
>
> FYI, the handling of homepage and download links is broken. You
> have e.g. 'meta="homepage"' instead of 'rel="homepage"', so
> easy_install doesn't pick these up and look for links there,
> meaning that ppix fails to find downloads for e.g. pywin32 which is
> hosted at Sourceforge.
Doh! Fixed.
> (On a perhaps not entirely unrelated note, the Cheeseshop appears
> to be down at the moment:
>
> """Error...
>
> There's been a problem with your request
>
> psycopg.OperationalError: no connection to the server""")
>
>
> By the way, I'd suggest explaining (or linking to an explanation)
> on the ppix main page describing how to configure easy_install such
> that the '-i' option isn't necessary.
If you send me some text, I'd be happy to add it to the ppix main page.
> Perhaps we could add an example to the EasyInstall docs somewhere
> near:
>
> http://peak.telecommunity.com/DevCenter/EasyInstall#creating-your-own-package-index
>
> and then link to it from the ppix page.
+1
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From benji at benjiyork.com Fri Jul 20 22:04:29 2007
From: benji at benjiyork.com (Benji York)
Date: Fri, 20 Jul 2007 16:04:29 -0400
Subject: [Catalog-sig] Cheeseshop down
Message-ID: <46A1154D.7000708@benjiyork.com>
Fulfilling my dutifully sworn obligation to report every instance of
PyPI being down:
"""
Error...
There's been a problem with your request
psycopg.OperationalError: no connection to the server
"""
--
Benji York
http://benjiyork.com
From bray at sent.com Fri Jul 20 23:01:16 2007
From: bray at sent.com (Brian Ray)
Date: Fri, 20 Jul 2007 16:01:16 -0500
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
Message-ID:
On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote:
>
> Error...
>
> There's been a problem with your request
>
> psycopg.OperationalError: no connection to the server
>
Come on!
Still down.
Not good. Does anybody know a short-term fix and a long-term solution?
Brian Ray
bray at sent.com
From jim at zope.com Fri Jul 20 23:10:26 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 17:10:26 -0400
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To:
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
Message-ID:
On Jul 20, 2007, at 5:01 PM, Brian Ray wrote:
>
> On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote:
>
>>
>> Error...
>>
>> There's been a problem with your request
>>
>> psycopg.OperationalError: no connection to the server
>>
>
> Come on!
>
> Still down.
>
> Not Good. Does anybody know a short term fix and a long term
> solution?
If you're using it for easy_install or buildout, use
http://download.zope.org/ppix as your package index.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From richardjones at optushome.com.au Sat Jul 21 01:34:34 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 21 Jul 2007 09:34:34 +1000
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To:
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
Message-ID: <200707210934.34159.richardjones@optushome.com.au>
On Sat, 21 Jul 2007, Brian Ray wrote:
> On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote:
> > Error...
> >
> > There's been a problem with your request
> >
> > psycopg.OperationalError: no connection to the server
>
> Come on!
Yes, because complaining about it will fix it.
Postgres is up and running, but the web interface is reporting the above
errors as though it can't connect. I can only assume that the persistent
connection has run into trouble. I've disabled persistent connections in the
fcgi config, but now apache will need restarting. I'm trying to contact
someone who can do that.
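For what it's worth, a stale persistent connection can also be handled in
code by reconnecting once when a query fails. The sketch below is generic
and is not the PyPI web application's code: OperationalError is a stand-in
for the driver's exception, and FakeConn simulates a server restart.

```python
class OperationalError(Exception):
    """Stand-in for the database driver's connection error."""


class ReconnectingDB:
    """Wraps a persistent connection and reconnects once on failure."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument connection factory
        self._conn = connect()

    def query(self, sql):
        try:
            return self._conn.query(sql)
        except OperationalError:
            # The old connection is dead; open a new one and retry once.
            self._conn = self._connect()
            return self._conn.query(sql)


class FakeConn:
    """Fake connection that breaks when the fake server restarts."""

    def __init__(self, server):
        self.server = server
        self.generation = server["generation"]

    def query(self, sql):
        if self.generation != self.server["generation"]:
            raise OperationalError("no connection to the server")
        return "ok: " + sql


server = {"generation": 1}
db = ReconnectingDB(lambda: FakeConn(server))
db.query("select 1")
server["generation"] += 1  # simulate a database restart
print(db.query("select 1"))
```

Retrying blindly is only safe for statements that can be repeated, so a
real application would limit this to reads or to the start of a
transaction.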
> Not Good. Does anybody know a short term fix and a long term solution?
You can volunteer to also be a maintainer of the system.
Richard
From richardjones at optushome.com.au Sat Jul 21 02:08:35 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 21 Jul 2007 10:08:35 +1000
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To: <200707210934.34159.richardjones@optushome.com.au>
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
<200707210934.34159.richardjones@optushome.com.au>
Message-ID: <200707211008.35954.richardjones@optushome.com.au>
On Sat, 21 Jul 2007, Richard Jones wrote:
> I'm trying to contact someone who can do that.
It looks like one of the volunteer sysadmins has now restarted apache and the
database connection issues are no more.
Richard
From martin at v.loewis.de Sat Jul 21 08:05:13 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 21 Jul 2007 08:05:13 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <20070720200721.88E1D3A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<20070720200721.88E1D3A403A@sparrow.telecommunity.com>
Message-ID: <46A1A219.60906@v.loewis.de>
> (On a perhaps not entirely unrelated note, the Cheeseshop appears to
> be down at the moment:
>
> """Error...
>
> There's been a problem with your request
>
> psycopg.OperationalError: no connection to the server""")
Around that time, the Postgres log has these entries:
2007-07-20 21:53:24 [14636] LOG: received fast shutdown request
2007-07-20 21:53:24 [14636] LOG: aborting any active transactions
2007-07-20 21:53:24 [26166] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [15769] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [10390] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [31182] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [30066] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [10162] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [17452] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [17147] FATAL: terminating connection due to administrator command
2007-07-20 21:53:24 [1159] LOG: shutting down
2007-07-20 21:53:26 [1159] LOG: database system is shut down
2007-07-20 21:53:33 [1469] LOG: database system was shut down at 2007-07-20 21:53:26 CEST
2007-07-20 21:53:33 [1469] LOG: checkpoint record is at A/FD833F0
2007-07-20 21:53:33 [1469] LOG: redo record is at A/FD833F0; undo record is at 0/0; shutdown TRUE
2007-07-20 21:53:33 [1469] LOG: next transaction ID: 110977718; next OID: 61913929
2007-07-20 21:53:33 [1469] LOG: database system is ready
and Sean Reifschneider was logged in, so I suspect he did some
maintenance work.
Sean?
Regards,
Martin
From jafo at tummy.com Sat Jul 21 08:17:20 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 21 Jul 2007 00:17:20 -0600
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific
PyPI index.
In-Reply-To: <46A1A219.60906@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<20070720200721.88E1D3A403A@sparrow.telecommunity.com>
<46A1A219.60906@v.loewis.de>
Message-ID: <20070721061720.GB4489@tummy.com>
On Sat, Jul 21, 2007 at 08:05:13AM +0200, "Martin v. Löwis" wrote:
>Around that time, the Postgres log has these entries:
There was an upgrade of Postgres done earlier. As far as I can see,
PyPI is running, so it must have been resolved earlier. AMK mentioned there
was a problem with the restart after the upgrade and Apache had to be
restarted, but that was about 6 hours ago.
Thanks,
Sean
--
"I not only use all the brains that I have, but all that I can borrow."
-- Woodrow Wilson
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
From martin at v.loewis.de Sat Jul 21 19:00:30 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 21 Jul 2007 19:00:30 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <46A23BAE.5090907@v.loewis.de>
> I've created an experimental prototype setuptools-specific package
> index at
>
> http://download.zope.org/ppix
I've now added something similar as
http://cheeseshop.python.org/simple/
It differs from your site in a few ways:
- it does include a top-level index of all packages (but neither
releases nor descriptions)
- it's always current, due to being dynamically computed
- it may differ in the precise list of URLs displayed;
if there are important deviations, please let me know.
Regards,
Martin
From jim at zope.com Sat Jul 21 19:12:48 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 21 Jul 2007 13:12:48 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A23BAE.5090907@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
Message-ID: <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
On Jul 21, 2007, at 1:00 PM, Martin v. Löwis wrote:
>> I've created an experimental prototype setuptools-specific package
>> index at
>>
>> http://download.zope.org/ppix
>
> I've now added something similar as
>
> http://cheeseshop.python.org/simple/
Way cool!
>
> It differs from your site in a few ways:
>
> - it does include a top-level index of all packages (but neither
> releases nor descriptions)
Why? This is a relatively expensive page, due to its size I assume,
that really provides no value. This will slow down setuptools.
> - it's always current, due to being dynamically computed
And also unreliable, for the same reason. For example, it would have
been inaccessible yesterday afternoon. And also puts more load on the
server. It would be much better imo if static pages could be written
on writes.
> - it may differ in the precise list of URLs displayed;
> if there are important deviations, please let me know.
The download and homepage URL anchors need rel="download" or
rel="homepage".
They lack the #egg= links.
Compare your page for setuptools to mine.
Also, some packages use their pypi pages as their home page links.
You want to exclude these; otherwise, setuptools will circle around
to the human interface, which defeats the point of the simple interface.
Thanks for plugging away on this.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
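The exclusion Jim asks for above (dropping home page links that point back
at the index itself) could be sketched roughly as follows. This is only an
illustration: the function name and the default host list are assumptions,
not code from the cheeseshop or from Jim's prototype.

```python
from urllib.parse import urlparse


def is_index_self_link(home_page, index_hosts=("cheeseshop.python.org",)):
    """Return True if a home page URL points back at the package index.

    index_hosts is an assumed list of host names serving the index; a
    simple index would skip emitting rel="homepage" for such URLs so that
    setuptools is not sent back to the human-oriented pages.
    """
    host = (urlparse(home_page).hostname or "").lower()
    return host in index_hosts
```

A page generator would call this once per package and leave out the home
page anchor whenever it returns True.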
From pje at telecommunity.com Sat Jul 21 19:48:16 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 13:48:16 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A23BAE.5090907@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
Message-ID: <20070721174558.DDF923A403A@sparrow.telecommunity.com>
At 07:00 PM 7/21/2007 +0200, Martin v. Löwis wrote:
> > I've created an experimental prototype setuptools-specific package
> > index at
> >
> > http://download.zope.org/ppix
>
>I've now added something similar as
>
>http://cheeseshop.python.org/simple/
It's very fast, thanks.
>It differs from your site in a few ways:
>
>- it does include a top-level index of all packages (but neither
> releases nor descriptions)
Unfortunately, that doesn't help current versions of setuptools. See
point #7 of:
http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
Setuptools looks for release links, not package links on that page.
Compare:
$ easy_install -vvvi http://cheeseshop.python.org/simple Pywin32
Searching for Pywin32
Reading http://cheeseshop.python.org/simple/Pywin32/
Couldn't find index page for 'Pywin32' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading http://cheeseshop.python.org/simple/
No local packages or download links found for Pywin32
error: Could not find suitable distribution for Requirement.parse('Pywin32')
$ easy_install -vvvi http://cheeseshop.python.org/pypi Pywin32
Searching for Pywin32
Reading http://cheeseshop.python.org/pypi/Pywin32/
Couldn't find index page for 'Pywin32' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading http://cheeseshop.python.org/pypi/
Reading http://cheeseshop.python.org/pypi/pywin32/210
Reading http://sf.net/projects/pywin32
...
>- it's always current, due to being dynamically computed
>- it may differ in the precise list of URLs displayed;
> if there are important deviations, please let me know.
Jim's already mentioned these, but the rel="" info (per the index API
spec's point #6), and the links embedded in the long_description
field (per point #4) are missing. Without these, easy_install can't
find sourceforge links, subversion checkouts, or any other embedded
direct download links. For example:
$ easy_install -vvvi http://cheeseshop.python.org/simple pywin32
Searching for pywin32
Reading http://cheeseshop.python.org/simple/pywin32/
No local packages or download links found for pywin32
error: Could not find suitable distribution for Requirement.parse('pywin32')
$ easy_install -vvvi http://cheeseshop.python.org/pypi pywin32
Searching for pywin32
Reading http://cheeseshop.python.org/pypi/pywin32/
Reading http://sf.net/projects/pywin32
Reading http://sourceforge.net/project/showfiles.php?group_id=78018
Found link:
http://downloads.sourceforge.net/pywin32/pywin32-210.win32-py2.2.exe?modtime=1159009204&big_mirror=0
...[a dozen more links]
$ easy_install -i http://cheeseshop.python.org/simple setuptools==dev
Searching for setuptools==dev
Reading http://cheeseshop.python.org/simple/setuptools/
No local packages or download links found for setuptools==dev
error: Could not find suitable distribution for
Requirement.parse('setuptools==dev')
$ easy_install -i http://cheeseshop.python.org/pypi setuptools==dev
Searching for setuptools==dev
Reading http://cheeseshop.python.org/pypi/setuptools/
Reading http://cheeseshop.python.org/pypi/setuptools
Reading http://cheeseshop.python.org/pypi/setuptools/0.6c6
Best match: setuptools dev
Downloading
http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev
Doing subversion checkout from
http://svn.python.org/projects/sandbox/trunk/setuptools/ to ...
From martin at v.loewis.de Sat Jul 21 21:08:52 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Jul 2007 21:08:52 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
Message-ID: <46A259C4.6090605@v.loewis.de>
>> - it does include a top-level index of all packages (but neither
>> releases nor descriptions)
>
> Why? This is a relatively expensive page, due to its size I assume,
> that really provides no value. This will slow down setuptools.
IIUC, it won't slow down setuptools, as setuptools looks at it only
if it cannot find the real package page due to a misspelling. So
as long as everything is spelled correctly, it should not provide
any slowdown.
If people do misspell a package name when invoking easy_install,
they get the feature that you consider of no value.
As for performance - 30 downloads take 3.9s currently from nearby.
>> - it's always current, due to being dynamically computed
>
> And also unreliable, for the same reason. For example, it would have
> been inaccessible yesterday afternoon.
The same could happen to Apache, too, of course. svn.python.org
sometimes fails to restart when a restart is requested on log rotation.
Any software is unreliable; to reduce downtime, you need an operator
that is available when something breaks.
> And also puts more load on the server. It would be much better imo
> if static pages could be written on writes.
Contributions are welcome. In addition to me considering it futile,
I also don't know how to implement it correctly.
>> - it may differ in the precise list of URLs displayed;
>> if there are important deviations, please let me know.
>
> The download and homepage URL anchors need rel="download" or
> rel="homepage".
Done.
> They lack the #egg= links.
How are these computed?
> Also, some packages use their pypi pages as their home page links.
Ok, done.
Regards,
Martin
From martin at v.loewis.de Sat Jul 21 21:23:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Jul 2007 21:23:30 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721174558.DDF923A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
Message-ID: <46A25D32.4080606@v.loewis.de>
> Unfortunately, that doesn't help current versions of setuptools. See
> point #7 of:
>
> http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
>
> Setuptools looks for release links, not package links on that page.
I don't understand. What's a "release link"? The links on the index
page *do* go to the "project's active version pages", as specified
(there aren't any numbered version pages)
Jim left out that page entirely - are you saying it is impossible
to provide such an index page with the page structure that Jim
proposed?
> $ easy_install -vvvi http://cheeseshop.python.org/simple Pywin32
> Searching for Pywin32
> Reading http://cheeseshop.python.org/simple/Pywin32/
> Couldn't find index page for 'Pywin32' (maybe misspelled?)
> Scanning index of all packages (this may take a while)
> Reading http://cheeseshop.python.org/simple/
> No local packages or download links found for Pywin32
I see that it doesn't work, but I cannot understand why.
On
http://cheeseshop.python.org/simple/
"pywin32" is clearly linked, so it should be able to resolve
the misspelling.
> Jim's already mentioned these, but the rel="" info (per the index API
> spec's point #6),
This is fixed.
> and the links embedded in the long_description field
> (per point #4) are missing.
I have to think about this more. Is it correct that you want all href
attributes of all a elements in the long_description? And how do you
know what the long_description is from just looking at the rendered
page?
Regards,
Martin
From pje at telecommunity.com Sat Jul 21 21:51:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 15:51:26 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A25D32.4080606@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
Message-ID: <20070721194908.F16373A403A@sparrow.telecommunity.com>
At 09:23 PM 7/21/2007 +0200, Martin v. Löwis wrote:
> > Unfortunately, that doesn't help current versions of setuptools. See
> > point #7 of:
> >
> > http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
> >
> > Setuptools looks for release links, not package links on that page.
>
>I don't understand. What's a "release link"? The links on the index
>page *do* go to the "project's active version pages", as specified
>(there aren't any numbered version pages)
See point #2:
"""2. Individual project version pages' URLs must be of the form
base/projectname/version, where base is the package index's base URL."""
That's what's meant by "version pages" in point #7 -- i.e., they
*must* be of that two-part form for setuptools to recognize them as such.
>I see that it doesn't work, but I cannot understand why.
>On
>
>http://cheeseshop.python.org/simple/
>
>"pywin32" is clearly linked, so it should be able to resolve
>the misspelling.
It could perhaps be *changed* to do so, but at present it follows the
spec's definition of "version page" URLs.
> > Jim's already mentioned these, but the rel="" info (per the index API
> > spec's point #6),
>
>This is fixed.
Great; Sourceforge and other offsite download pages work now.
> > and the links embedded in the long_description field
> > (per point #4) are missing.
>
>I have to think about this more. Is it correct that you want all href
>attributes of all a elements in the long_description?
Yes; of course, the usual rendering needs to be applied, since
long_description can contain reStructuredText.
> And how do you
>know what the long_description is from just looking at the rendered
>page?
You don't need to; easy_install discovers those links the same way it
does any other Cheeseshop-provided download links. From
easy_install's point of view, the entire page is just one big mass of
links that might point to downloads:
"""4. ...It is explicitly permitted for a project's
"long_description" to include URLs, and these should be formatted as
HTML links by the package index, as EasyInstall does *no special
processing* [emph. added] to identify what parts of a page are
index-specific and which are part of the project's supplied description."""
In other words, the *only* links that are specially handled are the
"rel" ones, which it follows unconditionally to look for additional
direct download links. All other links are merely *inspected* to see
if they obviously refer to a downloadable package (e.g. .tgz, .zip,
.egg, .exe etc., or explicitly-marked #egg). As a side-effect, this
means that links to perform Cheeseshop operations, links to other
parts of python.org, etc. are simply ignored, as they are not links
to downloadables nor marked as #egg.
If a URL can be determined by inspection to be a download link, then
easy_install extracts version and platform info from the URL and adds
it as a candidate for download selection. When both the home page
and download URL have been read, along with any detected "active
version pages" (as defined above), then easy_install chooses the
"best" download URL from all the candidates it has seen up to that point.
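The link handling described in this message can be sketched along the
following lines. This is a simplified illustration, not easy_install's
actual code: the class and function names and the suffix list are
assumptions, and version/platform extraction is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse

# Assumed list of "obviously downloadable" suffixes, for illustration only.
DOWNLOAD_SUFFIXES = (".tgz", ".tar.gz", ".zip", ".egg", ".exe")


class LinkCollector(HTMLParser):
    """Collect (href, rel) pairs for every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href:
            self.links.append((href, (attrs.get("rel") or "").lower()))


def classify_links(html):
    """Split a page's links into: follow, download candidates, ignored."""
    collector = LinkCollector()
    collector.feed(html)
    follow, candidates, ignored = [], [], []
    for href, rel in collector.links:
        url, fragment = urldefrag(href)
        path = urlparse(url).path.lower()
        if {"homepage", "download"} & set(rel.split()):
            # rel-marked links are followed to look for more downloads.
            follow.append(href)
        elif fragment.startswith("egg=") or path.endswith(DOWNLOAD_SUFFIXES):
            # Recognizable by inspection as a downloadable package.
            candidates.append(href)
        else:
            # Login links, other python.org pages, etc.
            ignored.append(href)
    return follow, candidates, ignored
```

Nothing in the sketch needs to know which part of the page came from
long_description, which matches the "one big mass of links" view above.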
From pje at telecommunity.com Sat Jul 21 21:53:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 15:53:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To:
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
Message-ID: <20070721195122.CF2343A40D7@sparrow.telecommunity.com>
At 09:23 PM 7/21/2007 +0200, Georg Brandl wrote:
>What I, as an outsider, can see: for the Pygments package, Jim's page
>lists the development link from the package description
>(http://trac.pocoo.org/repos/pygments/trunk#egg=Pygments-dev), but
>it looks like it's badly extracted (it has a trailing ">`__"), yours
>doesn't list it at all.
Hm, perhaps Jim is extracting it by looking for #egg URLs, rather
than by actually processing the reST markup with docutils. That
should probably be fixed, since there are many ways to specify URLs
in reST and handling them all with regular expressions is unlikely to
work as well as applying regular expressions to the resulting HTML. :)
(Also, looking only for #egg links will miss non-#egg links embedded
in the long_description, in the event that someone places direct
download links there.)
From martin at v.loewis.de Sun Jul 22 00:53:03 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 00:53:03 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721194908.F16373A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
<20070721194908.F16373A403A@sparrow.telecommunity.com>
Message-ID: <46A28E4F.5070905@v.loewis.de>
> See point #2:
>
> """2. Individual project version pages' URLs must be of the form
> base/projectname/version, where base is the package index's base URL."""
>
> That's what's meant by "version pages" in point #7 -- i.e., they *must*
> be of that two-part form for setuptools to recognize them as such.
Ok, but I still cannot see how to fix that: there simply *is* no
version part that I could point to.
Does that mean that Jim's approach does not work?
> Yes; of course, the usual rendering needs to be applied, since
> long_description can contain reStructuredText.
Ok, I now added these links as well.
Regards,
Martin
From pje at telecommunity.com Sun Jul 22 01:20:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 19:20:04 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A28E4F.5070905@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
<20070721194908.F16373A403A@sparrow.telecommunity.com>
<46A28E4F.5070905@v.loewis.de>
Message-ID: <20070721231808.2D5793A403A@sparrow.telecommunity.com>
At 12:53 AM 7/22/2007 +0200, Martin v. Löwis wrote:
> > See point #2:
> >
> > """2. Individual project version pages' URLs must be of the form
> > base/projectname/version, where base is the package index's base URL."""
> >
> > That's what's meant by "version pages" in point #7 -- i.e., they *must*
> > be of that two-part form for setuptools to recognize them as such.
>
>Ok, but I still cannot see how to fix that: there simply *is* no
>version part that I could point to.
Actually, 'version' is allowed to be an empty string, so simply
adding a trailing '/' to the links you're generating now should work.
The only thing the version part of a version page URL is used for, is
to handle links to .py files: setuptools uses the package version (if
available) to synthesize a setup.py for installing standalone .py files.
If the version is not available, it won't be able to do that, but
that's a relatively minor feature, all things considered. Few
packages are distributed via a single .py download URL, but the
package index could actually tack on an #egg designator to such links
in order to preserve 100% backward-compatibility.
>Does that mean that Jim's approach does not work?
Jim isn't providing the top-level index, and thus doesn't provide
punctuation or case corrections. The "version pages" convention is
only used by setuptools to discover additional index pages for
crawling, anyway, and his whole design is intended to prevent crawling.
> > Yes; of course, the usual rendering needs to be applied, since
> > long_description can contain reStructuredText.
>
>Ok, I now added these links as well.
Looks good, thanks!
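As a rough sketch of the "version page" rule discussed above
(base/projectname/version, where version may be empty), a recognizer might
look like this. The function name is hypothetical and this is not
setuptools' own implementation.

```python
def parse_index_url(base, url):
    """Return (project, version) if url has the form base/projectname/version.

    The version part may be an empty string (base/projectname/). Returns
    None for anything else, including a bare base/projectname link with no
    trailing slash.
    """
    if not base.endswith("/"):
        base += "/"
    if not url.startswith(base):
        return None
    parts = url[len(base):].split("/")
    if len(parts) != 2 or not parts[0]:
        return None
    project, version = parts
    if "#" in version:
        # A fragment such as #egg=... is not a version page.
        return None
    return project, version
```

With base http://cheeseshop.python.org/simple, the link .../simple/pywin32/
is recognized with an empty version, while .../simple/pywin32 is not, which
is why adding the trailing '/' is enough.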
From martin at v.loewis.de Sun Jul 22 09:42:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 09:42:19 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721231808.2D5793A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
<20070721194908.F16373A403A@sparrow.telecommunity.com>
<46A28E4F.5070905@v.loewis.de>
<20070721231808.2D5793A403A@sparrow.telecommunity.com>
Message-ID: <46A30A5B.4020007@v.loewis.de>
> Actually, 'version' is allowed to be an empty string, so simply adding a
> trailing '/' to the links you're generating now should work.
It does indeed.
Regards,
Martin
From jim at zope.com Sun Jul 22 15:09:44 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 09:09:44 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A259C4.6090605@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
Message-ID:
On Jul 21, 2007, at 3:08 PM, Martin v. Löwis wrote:
>>> - it does include a top-level index of all packages (but neither
>>> releases nor descriptions)
>>
>> Why? This is a relatively expensive page, due to its size I assume,
>> that really provides no value. This will slow down setuptools.
>
> IIUC, it won't slow down setuptools, as setuptools looks at it only
> if it cannot find the real package page due to a misspelling. So
> as long as everything is spelled correctly, it should not provide
> any slowdown.
>
> If people do misspell a package name when invoking easy_install,
> they get the feature that you consider of no value.
That is not correct. Not all packages are in PyPI. Using a package
that isn't in PyPI will trigger a fetch of that page. It isn't
misspelled, it's just not there. People should *not* misspell pages
when using setuptools. They should certainly not use misspelled
package names in requirements. In my strongly held opinion, allowing
imprecise names in requirements and setuptools commands is of negative
value.
> As for performance - 30 downloads take 3.9s currently from nearby.
That's nice. For me, that page takes 3 or 4 times as long as other
pages.
>>> - it's always current, due to being dynamically computed
>>
>> And also unreliable, for the same reason. For example, it would have
>> been inaccessible yesterday afternoon.
>
> The same could happen to Apache, too, of course. svn.python.org
> sometimes fails to restart when a restart is requested on log rotation.
>
> Any software is unreliable; to reduce downtime, you need an operator
> that is available when something breaks.
Apache has a far better record than the cheeseshop. I give up.
>> And also puts more load on the server. It would be much better imo
>> if static pages could be written on writes.
>
> Contributions are welcome. In addition to me considering it futile,
> I also don't know how to implement it correctly.
I'd be happy to contribute my polling version. That solves my
problems and I can't justify the additional effort to figure out the
cheeseshop software.
...
>> They lack the #egg= links.
>
> How are these computed?
By parsing the description.
Apparently, I'm doing this incorrectly. I'll have to look into that.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Sun Jul 22 15:16:44 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 09:16:44 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <20070721195122.CF2343A40D7@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721195122.CF2343A40D7@sparrow.telecommunity.com>
Message-ID:
On Jul 21, 2007, at 3:53 PM, Phillip J. Eby wrote:
> At 09:23 PM 7/21/2007 +0200, Georg Brandl wrote:
>> What I, as an outsider, can see: for the Pygments package, Jim's page
>> lists the development link from the package description
>> (http://trac.pocoo.org/repos/pygments/trunk#egg=Pygments-dev), but
>> it looks like it's badly extracted (it has a trailing ">`__"), yours
>> doesn't list it at all.
>
> Hm, perhaps Jim is extracting it by looking for #egg URLs, rather
> than by actually processing the reST markup with docutils.
Yup.
> That should probably be fixed, since there are many ways to specify
> URLs in reST and handling them all with regular expressions is
> unlikely to work
Yeah, I was hoping to get off easy. :)
> as well as applying regular expressions to the resulting HTML. :)
:)
> (Also, looking only for #egg links will miss non-#egg links
> embedded in the long_description, in the event that someone places
> direct download links there.)
By this, I assume you mean direct links to distributions.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Sun Jul 22 15:19:05 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 09:19:05 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721231808.2D5793A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
<20070721194908.F16373A403A@sparrow.telecommunity.com>
<46A28E4F.5070905@v.loewis.de>
<20070721231808.2D5793A403A@sparrow.telecommunity.com>
Message-ID:
On Jul 21, 2007, at 7:20 PM, Phillip J. Eby wrote:
...
> Jim isn't providing the top-level index, and thus doesn't provide
> punctuation or case corrections.
Yup
> The "version pages" convention is only used by setuptools to
> discover additional index pages for crawling, anyway, and his whole
> design is intended to prevent crawling.
That's a secondary benefit. The main goal is to avoid the expense of
that page for packages that aren't in PyPI, as some packages I use
aren't.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From martin at v.loewis.de Sun Jul 22 18:24:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 18:24:41 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To:
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
Message-ID: <46A384C9.8040404@v.loewis.de>
>> If people do misspell a package name when invoking easy_install,
>> they get the feature that you consider of no value.
>
> That is not correct. Not all packages are in PyPI. Using a package that
> isn't in PyPI will trigger a fetch of that page.
I don't understand. What page is fetched if the package is not in PyPI?
> It isn't misspelled,
> it's just not there. People should *not* misspell pages when using
> setuptools. They should certainly not use misspelled package names in
> requirements. In my strongly held opinion, allowing imprecise names in
> requirements and setuptools commands is of negative value.
I cannot comment on that. I don't use setuptools, and have no intuition what
is good or bad when using it (for example, I consider .egg files and
the notion of eggs inherently bad).
My main motivation to provide that page is that the setuptools
specification says it should be there. As this entire infrastructure
is for the sake of setuptools, I find it pointless to not support
setuptools fully.
> I'd be happy to contribute my polling version. That solves my problems
> and I can't justify the additional effort to figure out the cheeseshop
> software.
I'd like to hear other opinions here. Would people prefer if the index
was always correct (and perhaps somewhat slow), or would they prefer
instead that it is super-efficient (and somewhat out-of-date)?
Regards,
Martin
From martin at v.loewis.de Sun Jul 22 18:26:14 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 18:26:14 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To:
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
<20070721194908.F16373A403A@sparrow.telecommunity.com>
<46A28E4F.5070905@v.loewis.de>
<20070721231808.2D5793A403A@sparrow.telecommunity.com>
Message-ID: <46A38526.2010308@v.loewis.de>
> That's a secondary benefit. The main goal is to avoid the expense of
> that page for packages that aren't in PyPI, as some packages I use aren't.
I see. Shouldn't that be fixed by providing an option to setuptools
that avoids going to the index for missing packages?
Regards,
Martin
From tseaver at palladion.com Sun Jul 22 18:33:11 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Sun, 22 Jul 2007 12:33:11 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de>
Message-ID: <46A386C7.5080203@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Martin v. Löwis wrote:
>>> If people do misspell a package name when invoking easy_install,
>>> they get the feature that you consider of no value.
>> That is not correct. Not all packages are in PyPI. Using a package that
>> isn't in PyPI will trigger a fetch of that page.
>
> I don't understand. What page is fetched if the package is not in PyPI?
I think Jim was referring to a package which is *registered* in PyPI,
but whose download location was elsewhere.
>> I'd be happy to contribute my polling version. That solves my problems
>> and I can't justify the additional effort to figure out the cheeseshop
>> software.
>
> I'd like to hear other opinions here. Would people prefer if the index
> was always correct (and perhaps somewhat slow), or would they prefer
> instead that it is super-efficient (and somewhat out-of-date)?
I would prefer the second, particularly as I think the caching solution
lends itself to mirroring, which would also improve availability.
- From my complete ignorance of the underlying architecture: the polling
solution would stay pretty current if there were an extremely cheap way
to ask for the latest "transaction ID" on the cheeseshop, or if the
query could fetch only registrations newer than the last poll time. Are
such queries possible over the XML-RPC interface?
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGo4bH+gerLs4ltQ4RAjiWAJ9/5TeOWAHdwL7PS5QAUnpyZWJzMQCeN5hT
5rRjOHzAu4cf+TKktNntWV8=
=p59N
-----END PGP SIGNATURE-----
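The incremental polling idea raised above could be sketched as below,
assuming the index offered some cheap "changes since" query. Whether the
XML-RPC interface provides one is exactly the open question in the message,
so fetch_changes is a hypothetical callable supplied by the caller, as are
the other names.

```python
def poll_once(fetch_changes, last_serial, rebuild_page):
    """Rebuild static pages for packages changed since last_serial.

    fetch_changes(last_serial) is a hypothetical callable returning a list
    of (serial, package_name) tuples newer than last_serial.
    rebuild_page(package_name) regenerates one static page.
    Returns the serial to pass to the next poll.
    """
    changes = fetch_changes(last_serial)
    # Rebuild each changed package once, even if it changed several times.
    for name in sorted({name for _, name in changes}):
        rebuild_page(name)
    return max([serial for serial, _ in changes], default=last_serial)
```

A mirror would call poll_once in a loop, storing the returned serial
between runs; a poll that finds no changes does no rebuilding at all.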
From pje at telecommunity.com Sun Jul 22 18:40:11 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jul 2007 12:40:11 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38526.2010308@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<20070721174558.DDF923A403A@sparrow.telecommunity.com>
<46A25D32.4080606@v.loewis.de>
<20070721194908.F16373A403A@sparrow.telecommunity.com>
<46A28E4F.5070905@v.loewis.de>
<20070721231808.2D5793A403A@sparrow.telecommunity.com>
<46A38526.2010308@v.loewis.de>
Message-ID: <20070722163754.A78EF3A40A9@sparrow.telecommunity.com>
At 06:26 PM 7/22/2007 +0200, Martin v. Löwis wrote:
> > That's a secondary benefit. The main goal is to avoid the expense of
> > that page for packages that aren't in PyPI, as some packages I use aren't.
>
>I see. Shouldn't that be fixed by providing an option to setuptools
>that avoids going to the index for missing packages?
There's already such an option; --find-links or -f lets you specify
URLs that should be checked before *any* PyPI access occurs. If all
dependencies can be met using those URLs without going to PyPI, and
you haven't explicitly requested -U (--update), easy_install doesn't
go to PyPI.
You can also specify such links in a setup script using
setup(dependency_links=[...]), which bakes them into the .egg. When
searching for that egg's dependencies, easy_install will pick them up
and use them.
So, it's actually possible to install a package and all its
dependencies without using PyPI at all, if the package author(s) bake
the URLs in.
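A minimal sketch of the setup-script side of this, for illustration only. The project name, the requirement, and the URL are all made up; only the install_requires and dependency_links keywords come from setuptools itself.

```python
# Hypothetical setup.py fragment: every name and URL below is a placeholder.
SETUP_KWARGS = dict(
    name="example.plugin",                      # hypothetical project name
    version="0.1",
    install_requires=["example.core>=1.0"],     # hypothetical dependency
    # Pages or direct download URLs that easy_install should scan for the
    # dependencies above before it falls back to the package index.
    dependency_links=["http://downloads.example.com/eggs/"],
)

# In a real setup.py these keywords would be passed straight to setuptools:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

For a one-off install, the command-line equivalent would be something like "easy_install -f http://downloads.example.com/eggs/ example.plugin" (same placeholder URL).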
From jim at zope.com Sun Jul 22 18:38:09 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 12:38:09 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de>
Message-ID:
On Jul 22, 2007, at 12:24 PM, Martin v. Löwis wrote:
>>> If people do misspell a package name when invoking easy_install,
>>> they get the feature that you consider of no value.
>>
>> That is not correct. Not all packages are in PyPI. Using a
>> package that
>> isn't in PyPI will trigger a fetch of that page.
>
> I don't understand. What page is fetched if the package is not in
> PyPI?
We have lots of packages that aren't in PyPI. Some of them aren't
ready for PyPI or are not of general interest. Some are proprietary.
>> It isn't misspelled,
>> it's just not there. People should *not* misspell pages when using
>> setuptools. They should certainly not use misspelled package
>> names in
>> requirements. In my strongly held opinion, allowing imprecise
>> names in
>> requirements and setuptools commands is of negative value.
>
> I cannot comment on that. I don't use setuptools, and have no intuition
> what
> is good or bad when using it (for example, I consider .egg files and
> the notion of eggs inherently bad).
>
> My main motivation to provide that page is that the setuptools
> specification says it should be there. As this entire infrastructure
> is for the sake of setuptools, I find it pointless to not support
> setuptools fully.
Fair enough. Theory beats practicality every time. ;)
>> I'd be happy to contribute my polling version. That solves my
>> problems
>> and I can't justify the additional effort to figure out the
>> cheeseshop
>> software.
>
> I'd like to hear other opinions here.
Yes. This has been a fairly limited discussion. Sigh.
> Would people prefer if the index
> was always correct (and perhaps somewhat slow), or would they prefer
> instead that it is super-efficient (and somewhat out-of-date)?
Where somewhat out of date could be a matter of seconds. IMO, a
python.org index could poll every few seconds, given that local
polling only takes a few milliseconds. I have a feeling that this
discussion is going to annoy someone with PyPI software knowledge
enough to add baking on write. :) For example, I had the impression
that Rene' was planning to invoke scripts after updates. It would be
easy to invoke my polling script or a script based on your work.
BTW, I'm pretty sure that geographic mirrors are desirable, both for
performance and redundancy reasons. I think that, for these, polling
once a minute is plenty and puts negligible load on PyPI, assuming
that there aren't hundreds of them.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Sun Jul 22 18:41:55 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 12:41:55 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A386C7.5080203@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de> <46A386C7.5080203@palladion.com>
Message-ID: <437D4304-ECF3-4240-8C33-F946128F8232@zope.com>
On Jul 22, 2007, at 12:33 PM, Tres Seaver wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Martin v. Löwis wrote:
>>>> If people do misspell a package name when invoking easy_install,
>>>> they get the feature that you consider of no value.
>>> That is not correct. Not all packages are in PyPI. Using a
>>> package that
>>> isn't in PyPI will trigger a fetch of that page.
>>
>> I don't understand. What page is fetched if the package is not in
>> PyPI?
>
> I think Jim was referring to a package which is *registered* in PyPI,
> but whose download location was elsewhere.
No, I was referring to packages that aren't ready for or of interest
to PyPI or to proprietary packages.
...
> - From my complete ignorance of the underlying architecture: the
> polling
> solution would stay pretty current if there were an extremely cheap
> way
> to ask for the latest "transaction ID" on the cheeseshop, or if the
> query could fetch only registrations newer than the last poll time.
There is such an API thanks to Martin.
> Are
> such queries possible over the XML-RPC interface?
Yup. I'm using them. Queries take only a few milliseconds per
request on the server.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Sun Jul 22 18:51:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jul 2007 12:51:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To:
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
Message-ID: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>People should *not* misspell pages
>when using setuptools. They should certainly not use misspelled
>package names in requirements.
People do all sorts of things they shouldn't. That doesn't stop them
blaming other people for their mistakes.
It's said that a 10% improvement in ease-of-use can double a
product's users. Case sensitivity is a barrier to entry for new
users, and setuptools can't afford any additional entry barriers.
A significant part of setuptools' audience includes people who are
new to Python, or at least new to installing or distributing Python
modules, and quite a lot of setuptools features are aimed squarely at
that audience. This happens to be one of them.
> In my strongly held opinion, allowing
>imprecise names in requirements and setuptools commands is of negative
>value.
I understand that perspective. But practicality beats purity, and
this is absolutely a "worse is better" type of situation.
Setuptools has lots of features that are targeted at different
audiences. There are plenty of features targeted at the group you're
in; don't begrudge the other groups their features. :)
(This is probably one reason that setuptools is so controversial;
everybody can find *something* about it to hate, even if those very
same things are quite loved by a different group of users. E.g. you
and case-insensitivity, Martin and eggs, etc.)
From martin at v.loewis.de Sun Jul 22 18:54:36 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 22 Jul 2007 18:54:36 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A386C7.5080203@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de> <46A386C7.5080203@palladion.com>
Message-ID: <46A38BCC.1000707@v.loewis.de>
> I would prefer the second, particularly as I think the caching solution
> lends itself to mirroring, which would also improve availability.
I think this conclusion is wrong: Jim already has a mirror
infrastructure that anybody can run, without the need of running that
on the central server.
> - From my complete ignorance of the underlying architecture: the polling
> solution would stay pretty current if there were an extremely cheap way
> to ask for the latest "transaction ID" on the cheeseshop, or if the
> query could fetch only registrations newer than the last poll time. Are
> such queries possible over the XML-RPC interface?
Yes; you can ask for all changes since a certain UTC time. People
shouldn't invoke that every UTC second, though - once a minute is
fine.
Regards,
Martin
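A rough sketch of a poller that respects the once-a-minute guidance above. It assumes the index exposes a changelog(since) XML-RPC method taking a UTC timestamp, as described in this message; the function name, the constant, and the tuple layout of the returned entries are illustrative assumptions, and using the local clock as the next "since" value is a simplification that ignores clock skew between client and server.

```python
import time

MIN_POLL_INTERVAL = 60  # seconds; "once a minute is fine"


def poll_changes(server, last_poll, now=None):
    """Return (changes, new_last_poll), or (None, last_poll) if it is too soon.

    `server` is anything with a changelog(since) method, for example an
    xmlrpc.client.ServerProxy (xmlrpclib.ServerProxy on Python 2) pointed at
    the index's XML-RPC endpoint. The endpoint URL is left to the caller.
    """
    now = int(time.time()) if now is None else int(now)
    if now - last_poll < MIN_POLL_INTERVAL:
        # Polled less than a minute ago: do not touch the server at all.
        return None, last_poll
    changes = server.changelog(last_poll)
    return changes, now
```

A mirror would call poll_changes() in a loop, carrying the returned timestamp forward as the next last_poll value.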
From martin at v.loewis.de Sun Jul 22 19:03:49 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 22 Jul 2007 19:03:49 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To:
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de>
Message-ID: <46A38DF5.6010701@v.loewis.de>
Jim Fulton schrieb:
> On Jul 22, 2007, at 12:24 PM, Martin v. Löwis wrote:
>>>> If people do misspell a package name when invoking easy_install,
>>>> they get the feature that you consider of no value.
>>>
>>> That is not correct. Not all packages are in PyPI. Using a package that
>>> isn't in PyPI will trigger a fetch of that page.
>>
>> I don't understand. What page is fetched if the package is not in PyPI?
>
> We have lots of packages that aren't in PyPI. Some of them aren't ready
> for PyPI or are not of general interest. Some are proprietary.
Ah, ok. So I stand by my original statement (the one you classified
as incorrect): *If* I do misspell a package name, *then* setuptools
will correct the spelling if the index page is available.
>> Would people prefer if the index
>> was always correct (and perhaps somewhat slow), or would they prefer
>> instead that it is super-efficient (and somewhat out-of-date)?
>
> Where somewhat out of date could be a matter of seconds.
And where somewhat slower could be "practically not noticeable".
> BTW, I'm pretty sure that geographic mirrors are desirable, both for
> performance and redundancy reasons. I think that, for these, polling
> once a minute is plenty and puts negligible load on PyPI, assuming that
> there aren't hundreds of them.
Sure: I don't mind at all if more people run your software on their
machines. If people want it more official, we can have
"cheeseshop0.python.org", "cheeseshop1.python.org", and so on,
or "de.cheeseshop.python.org", "jp.cheeseshop.python.org", and so
on.
As I said before: if people also want to mirror the files, I'd
ask them to provide download statistics. Given the changelog, it
would be easy to keep a file mirror up-to-date (of course,
if a mirror downloads all files, these downloads also count
towards the download statistics - which might confuse people).
Regards,
Martin
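To illustrate how a file mirror might use the changelog, here is a small sketch that reduces a batch of changelog entries to the set of packages whose files need re-fetching. The (name, version, timestamp, action) entry layout and the function name are assumptions made for this example, not a documented format.

```python
def packages_to_refresh(changelog):
    """Map each changed package name to its most recent change timestamp.

    `changelog` is an iterable of (name, version, timestamp, action) tuples;
    this layout is assumed for the sketch.
    """
    latest = {}
    for name, _version, timestamp, _action in changelog:
        # Several entries for one package collapse into a single refresh.
        latest[name] = max(timestamp, latest.get(name, 0))
    return latest
```

The mirror then only re-downloads files for the names in the returned mapping, rather than walking the whole index on every pass.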
From martin at v.loewis.de Sun Jul 22 20:40:05 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 22 Jul 2007 20:40:05 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <46A3A485.7060602@v.loewis.de>
> WRT zc.buildout, refreshing a buildout with just ZODB installed in it
> takes about 45 seconds for me using PyPI and about 5 seconds using
> the experimental index.
Can you kindly provide a measurement for the index at
http://cheeseshop.python.org/simple/ as well?
Thanks,
Martin
From fdrake at gmail.com Mon Jul 23 06:56:48 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 23 Jul 2007 00:56:48 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
Message-ID: <9cee7ab80707222156o2bae8a32pdaf7767f8c167918@mail.gmail.com>
On 7/22/07, Phillip J. Eby wrote:
> Setuptools has lots of features that are targeted at different
> audiences. There are plenty of features targeted at the group you're
> in, don't begrudge the other groups their features. :)
Actually, I suspect this is a substantial contributor to setuptools
being considered controversial: it encompasses too many different
features. That certainly keeps me feeling unhappy about depending on
it.
-Fred
--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller
From jim at zope.com Mon Jul 23 12:59:44 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 06:59:44 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38DF5.6010701@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de>
<46A38DF5.6010701@v.loewis.de>
Message-ID:
On Jul 22, 2007, at 1:03 PM, Martin v. Löwis wrote:
> Jim Fulton schrieb:
>> On Jul 22, 2007, at 12:24 PM, Martin v. Löwis wrote:
>>>>> If people do misspell a package name when invoking easy_install,
>>>>> they get the feature that you consider of no value.
>>>>
>>>> That is not correct. Not all packages are in PyPI. Using a
>>>> package that
>>>> isn't in PyPI will trigger a fetch of that page.
>>>
>>> I don't understand. What page is fetched if the package is not in
>>> PyPI?
>>
>> We have lots of packages that aren't in PyPI. Some of them aren't
>> ready
>> for PyPI or are not of general interest. Some are proprietary.
>
> Ah, ok. So I stand by my original statement (the one you classified
> as incorrect): *If* I do misspell a package name, *then* setuptools
> will correct the spelling if the index page is available.
Your full original statement was:
On Jul 21, 2007, at 3:08 PM, Martin v. Löwis wrote:
> IIUC, it won't slow down setuptools, as setuptools looks at it only
> if it cannot find the real package page due to a misspelling. So
> as long as everything is spelled correctly, it should not provide
> any slowdown.
>
> If people do misspell a package name when invoking easy_install,
> they get the feature that you consider of no value.
I was referring to the part about not slowing things down when people
didn't misspell. But it looks like I was mistaken. It was my
understanding that setuptools always checked index/ when it couldn't
find index/package_name/, but as Phillip pointed out, if it finds a
package via find links, it won't look at index/. Basic tests seem to
confirm this.
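The lookup order described here can be modelled roughly as follows. This is a toy model of the behaviour discussed in this thread, not setuptools' actual code: the function name, the use of plain sets for the find-links pages and the index, and the "index/..." labels are all assumptions for illustration.

```python
def resolve(name, find_links, index):
    """Return (resolved_name_or_None, list_of_index_pages_fetched).

    find_links: set of project names reachable through find-links URLs
    index:      set of project names that have their own index page
    """
    fetched = []
    if name in find_links:
        # Satisfied locally: the index is never contacted.
        return name, fetched
    fetched.append("index/%s/" % name)
    if name in index:
        return name, fetched
    # No exact page: fall back to the full listing and ignore case.
    fetched.append("index/")
    wanted = name.lower()
    for candidate in sorted(index):
        if candidate.lower() == wanted:
            return candidate, fetched
    return None, fetched
```

In this model a correctly spelled public package costs one page fetch, a package found via find-links costs none, and a package that is in neither place (or is misspelled) pays for the full index/ listing.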
>>> Would people prefer if the index
>>> was always correct (and perhaps somewhat slow), or would they prefer
>>> instead that it is super-efficient (and somewhat out-of-date)?
>>
>> Where somewhat out of date could be a matter of seconds.
>
> And where somewhat slower could be "practically not noticeable".
I wasn't arguing about speed. I agree that when PyPI is working
well, the difference between the speed of the dynamic page and the
speed of a static page wouldn't be noticeable.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Mon Jul 23 13:08:50 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 07:08:50 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
Message-ID: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>> People should *not* misspell pages
>> when using setuptools. They should certainly not use misspelled
>> package names in requirements.
>
> People do all sorts of things they shouldn't. That doesn't stop
> them blaming other people for their mistakes.
>
> It's said that a 10% improvement in ease-of-use can double a
> product's users. Case sensitivity is a barrier to entry for new
> users, and setuptools can't afford any additional entry barriers.
I totally don't buy this in a case like this. People installing
packages with setuptools are technical users. We expect them to
write Python scripts.
> A significant part of setuptools' audience includes people who are
> new to Python, or at least new to installing or distributing Python
> modules, and quite a lot of setuptools features are aimed squarely
> at that audience. This happens to be one of them.
I don't think that encouraging use of case insensitive names by
people who are about to start learning a language that uses case
sensitive names is doing them any favors.
>> In my strongly held opinion, allowing
>> imprecise names in requirements and setuptools commands is of negative
>> value.
>
> I understand that perspective. But practicality beats purity, and
> this is absolutely a "worse is better" type of situation.
Obviously we disagree.
> Setuptools has lots of features that are targeted at different
> audiences. There are plenty of features targeted at the group
> you're in, don't begrudge the other groups their features. :)
I don't think you are helping them.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From jim at zope.com Mon Jul 23 13:36:45 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 07:36:45 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A3A485.7060602@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A3A485.7060602@v.loewis.de>
Message-ID: <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com>
On Jul 22, 2007, at 2:40 PM, Martin v. Löwis wrote:
>> WRT zc.buildout, refreshing a buildout with just ZODB installed in it
>> takes about 45 seconds for me using PyPI and about 5 seconds using
>> the experimental index.
>
> Can you kindly provide a measurement for the index at
> http://cheeseshop.python.org/simple/ as well?
Yup. So, ATM:
Using old PyPI takes about 1m5s
Using simple takes about 25s
Using ppix takes about 8s
Some notes:
- ZODB isn't the best example as it has download links to
www.zope.org, making it take longer than packages without offsite
links (relative to PyPI).
- I expect that the difference between simple and ppix *for me* is a
matter of geography.
Refreshing an empty buildout checks the zc.buildout and setuptools
packages. For that:
Old PyPI takes 25s
Simple takes 8s
and ppix takes .5s
Again, I assume that the difference between simple and ppix has more
to do with geography than the difference between serving statically
and dynamically. The simple page has more links on it than the ppix
page, because I haven't gotten around to scarfing all the links off of a
reStructuredText rendering of the long description. I doubt that makes
any difference. It will be interesting to try again after I fix that.
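For a quick comparison, the approximate timings quoted above can be turned into speedup factors relative to the old PyPI pages. The dictionary below just restates those figures in seconds (1m5s taken as 65); the variable and function names are illustrative.

```python
# Approximate wall-clock seconds as reported in this message.
timings = {
    "ZODB buildout": {"old PyPI": 65.0, "simple": 25.0, "ppix": 8.0},
    "empty buildout": {"old PyPI": 25.0, "simple": 8.0, "ppix": 0.5},
}


def speedups(row, baseline="old PyPI"):
    """Return how many times faster each index is than the baseline."""
    return {index: row[baseline] / seconds for index, seconds in row.items()}


for case, row in timings.items():
    for index, factor in speedups(row).items():
        print("%-15s %-9s %6.2fx" % (case, index, factor))
```

On these numbers, simple is roughly 2.6x to 3.1x faster than old PyPI, and ppix roughly 8x to 50x, with the caveat already noted that much of the simple/ppix gap may be geography.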
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From pje at telecommunity.com Mon Jul 23 17:22:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 11:22:30 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
Message-ID: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote:
>On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
>>At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>>>People should *not* misspell pages
>>>when using setuptools. They should certainly not use misspelled
>>>package names in requirements.
>>
>>People do all sorts of things they shouldn't. That doesn't stop
>>them blaming other people for their mistakes.
>>
>>It's said that a 10% improvement in ease-of-use can double a
>>product's users. Case sensitivity is a barrier to entry for new
>>users, and setuptools can't afford any additional entry barriers.
>
>I totally don't buy this in a case like this. People installing
>packages with setuptools are technical users. We expect them to
>write Python scripts.
No, "we" don't. Eggs were created to support application-level
plugins, such as are used by Trac and Chandler. Trac and Chandler
users are not necessarily programmers, let alone Python programmers.
From tseaver at palladion.com Mon Jul 23 18:01:02 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 23 Jul 2007 12:01:02 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
Message-ID: <46A4D0BE.4030706@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Phillip J. Eby wrote:
> At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote:
>> On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
>>> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>>>> People should *not* misspell pages
>>>> when using setuptools. They should certainly not use misspelled
>>>> package names in requirements.
>>> People do all sorts of things they shouldn't. That doesn't stop
>>> them blaming other people for their mistakes.
>>>
>>> It's said that a 10% improvement in ease-of-use can double a
>>> product's users. Case sensitivity is a barrier to entry for new
>>> users, and setuptools can't afford any additional entry barriers.
>> I totally don't buy this in a case like this. People installing
>> packages with setuptools are technical users. We expect them to
>> write Python scripts.
>
> No, "we" don't. Eggs were created to support application-level
> plugins, such as are used by Trac and Chandler. Trac and Chandler
> users are not necessarily programmers, let alone Python programmers.
But by definition, the people typing the names of the dependencies into
a 'setup.py' for such a plugin *are* Python programmers, and could be
expected to know about case sensitivity.
I don't think Jim was arguing that human-centric *search* should punish
misspellings, but rather that encouraging such sloppiness in other
packages is a misfeature, especially if supporting it induces a tax on
*all* users of automated dependency resolution.
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGpNC++gerLs4ltQ4RAr2HAJ9UdPIVdz36inTG7nkm8SnrWPpcOgCgjKPc
sOqbuwOhUvlsSYpgxFSz1mg=
=F1EY
-----END PGP SIGNATURE-----
From noah.gift at gmail.com Mon Jul 23 18:37:47 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Mon, 23 Jul 2007 12:37:47 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <46A4D0BE.4030706@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>
<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
<46A4D0BE.4030706@palladion.com>
Message-ID:
>
>
> But by definition, the people typing the names of the dependencies into
> a 'setup.py' for such a plugin *are* Python programmers, and could be
> expected to know about case sensitivity.
>
> I don't think Jim was arguing that human-centric *search* should punish
> misspellings, but rather that encouraging such sloppiness in other
> packages is a misfeature, especially if supporting it induces a tax on
> *all* users of automated dependency resolution.
>
>
In my humble opinion, I for one completely agree with Phillip. I have had
to sit down with quite a few new Python Programmers and show them how to use
easy_install and I "thank God" easy_install is smart enough to figure out
case sensitivity. This is a wonderful feature!!!! Please don't ever get
rid of it :)
Not being able to install a package as they couldn't figure out the exact
name of the package could be the final straw for some new programmer to
Python!
Noah Gift
From barry at python.org Mon Jul 23 18:46:24 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 23 Jul 2007 12:46:24 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <46A4D0BE.4030706@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
<46A4D0BE.4030706@palladion.com>
Message-ID:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Jul 23, 2007, at 12:01 PM, Tres Seaver wrote:
>>>> It's said that a 10% improvement in ease-of-use can double a
>>>> product's users.
Under that principle, can I renew my plea for a better name than
"easy_install"?
- -Barry
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
iQCVAwUBRqTbYHEjvBPtnXfVAQIHmgP+L5eDz3n4mrcPk5K6NEexQPLrOT9iSd+w
cFYhn+FL5QoK6snRfxFp25KFmdz/raKDeGpQ4ZIy3nhpZTqxeQpPCsAg84rrw0lQ
lflPXkMMmZJTi+3JmjXc2mhj2SlHZ+73XxRPcD2NKnqr14sxlunJMPe4/IX+y1Rf
9C5WVwoCiJ0=
=b+zs
-----END PGP SIGNATURE-----
From jodok at lovelysystems.com Mon Jul 23 19:56:45 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 23 Jul 2007 13:56:45 -0400
Subject: [Catalog-sig] setuptools upload to pypi
Message-ID:
hi,
i can't upload a new egg to cheeseshop...
running "python setup.py bdist_egg register upload" hangs for several
minutes at "Using PyPI login from /Users/jodok/.pypirc".
entering username and password interactively results in the same.
the webinterface seems to work fine (at least browsing)
any idea?
thanks
jodok
--
"Now is better than never."
-- The Zen of Python, by Tim Peters
Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77
From kantrn at rpi.edu Mon Jul 23 20:28:27 2007
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Mon, 23 Jul 2007 14:28:27 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To:
References:
Message-ID: <46A4F34B.4090004@rpi.edu>
I've been seeing that this morning too. Uploads work fine, it's just the
register that seems to fail.
--Noah
Jodok Batlogg wrote:
> hi,
>
> i can't upload a new egg to cheeseshop...
>
> running "python setup.py bdist_egg register upload" hangs for several
> minutes at "Using PyPI login from /Users/jodok/.pypirc".
> entering username and password interactively results in the same.
> the webinterface seems to work fine (at least browsing)
>
> any idea?
>
> thanks
>
> jodok
>
> --
> "Now is better than never."
> -- The Zen of Python, by Tim Peters
>
> Jodok Batlogg, Lovely Systems
> Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
> phone: +43 5572 908060, fax: +43 5572 908060-77
From tseaver at palladion.com Mon Jul 23 20:48:40 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 23 Jul 2007 14:48:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To:
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com>
Message-ID: <46A4F808.4050406@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Noah Gift wrote:
>>
>> But by definition, the people typing the names of the dependencies into
>> a 'setup.py' for such a plugin *are* Python programmers, and could be
>> expected to know about case sensitivity.
>>
>> I don't think Jim was arguing that human-centric *search* should punish
>> misspellings, but rather that encouraging such sloppiness in other
>> packages is a misfeature, especially if supporting it induces a tax on
>> *all* users of automated dependency resolution.
>>
>>
> In my humble opinion, I for one completely agree with Phillip. I have had
> to sit down with quite a few new Python Programmers and show them how to use
> easy_install and I "thank God" easy_install is smart enough to figure out
> case sensitivity. This is a wonderful feature!!!! Please don't ever get
> rid of it :)
> Not being able to install a package as they couldn't figure out the exact
> name of the package could be the final straw for some new programmer to
> Python!
There are two different use cases here:
1. User mis-types the name of a package on the command line, e.g.:
$ easy_install Foo
when it should be spelled:
$ easy_install foo
Being forgiving of case-mangling here is a concern of the
easy_install *application*, and is non-controversial.
2. Programmer mis-types the name of a package in the dependencies
for his own package, e.g.:
setup(install_requires=['Foo']...)
In this case, coddling the error causes it to *propagate*, because
other programmers will copy it directly, or depend on the error-
filled package. Worse, the cost of error correction is transferred
to *all* users of the setuptools library, even if they never use
'easy_install' at all.
I'm fine with leaving the newbie-friendly behavior in 'easy_install'; I
just don't like the performance hit it induces on users of setuptools
who *can* spell.
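
To make the two cases concrete, here is a minimal sketch of the distinction. It is not setuptools' actual code: the INDEX contents and the find_exact / find_forgiving helpers are hypothetical names used only for illustration. The strict lookup costs one dictionary probe; the forgiving one falls back to scanning every name and warns when it had to guess.

```python
import warnings

# Hypothetical index contents; a real tool would fetch names from the package index.
INDEX = {"foo": "foo-1.0.tar.gz", "Bar": "Bar-2.1.tar.gz"}


def find_exact(name, index=INDEX):
    """Strict lookup, as a library resolving install_requires might do."""
    return index.get(name)


def find_forgiving(name, index=INDEX):
    """Command-line lookup: exact match first, then a case-insensitive scan."""
    hit = index.get(name)
    if hit is not None:
        return hit
    wanted = name.lower()
    for candidate, dist in index.items():
        if candidate.lower() == wanted:
            # Tell the user which spelling was actually used.
            warnings.warn("%r not found; using %r instead" % (name, candidate))
            return dist
    return None
```

With this split, find_forgiving("Foo") still locates the "foo" distribution for someone typing at a prompt, while find_exact("Foo") returns None, so a misspelled dependency in a setup.py would fail early instead of being copied by others.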
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGpPgI+gerLs4ltQ4RApzMAJ0WP6gzaM8n99fxkyo0Se285Te3bQCg1vxF
6ihYIENH8GpsQ7/ZF062T4Q=
=OuxU
-----END PGP SIGNATURE-----
From benji at benjiyork.com Mon Jul 23 20:54:27 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 14:54:27 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de>
<46A384C9.8040404@v.loewis.de>
Message-ID: <46A4F963.3040609@benjiyork.com>
Martin v. Löwis wrote:
> would they prefer instead that it is super-efficient (and somewhat
> out-of-date)?
Yes. At most a few minutes out of date and faster/more reliable would
be my strong preference.
--
Benji York
http://benjiyork.com
From jim at zope.com Mon Jul 23 20:55:16 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 14:55:16 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <46A4F808.4050406@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com>
<46A4F808.4050406@palladion.com>
Message-ID: <9FFADEB3-0E83-417E-B6EE-AF9A172690D0@zope.com>
On Jul 23, 2007, at 2:48 PM, Tres Seaver wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Noah Gift wrote:
>>>
>>> But by definition, the people typing the names of the
>>> dependencies into
>>> a 'setup.py' for such a plugin *are* Python programmers, and
>>> could be
>>> expected to know about case sensitivity.
>>>
> >>> I don't think Jim was arguing that human-centric *search* should
>>> punish
>>> misspellings, but rather that encouraging such sloppiness in other
>>> packages is a misfeature, especially if supporting it induces a
>>> tax on
>>> *all* users of automated dependency resolution.
>>>
>>>
>> In my humble opinion, I for one completely agree with Phillip. I
>> have had
>> to sit down with quite a few new Python Programmers and show them
>> how to use
>> easy_install and I "thank God" easy_install is smart enough to
>> figure out
>> case sensitivity. This is a wonderful feature!!!! Please don't
>> ever get
>> rid of it :)
>> Not being able to install a package as they couldn't figure out
>> the exact
>> name of the package could be the final straw for some new
>> programmer to
>> Python!
>
> There are two different use cases here:
>
> 1. User mis-types the name of a package on the command line, e.g.:
>
> $ easy_install Foo
>
> when it should be spelled:
>
> $ easy_install foo
>
> Being forgiving of case-mangling here is a concern of the
> easy_install *application*, and is non-controversial.
For me this is potentially controversial because:
> 2. Programmer mis-types the name of a package in the dependencies
> for his own package, e.g.:
>
> setup(install_requires=['Foo']...)
Note that this might be intentional, as opposed to a typo. The
programmer will think "Foo" is a valid name because it worked with
easy_install. It's true that easy_install prints a warning, but it
is buried in so much output that it is easily missed or ignored.
> In this case, coddling the error causes it to *propagate*, because
> other programmers will copy it directly, or depend on the error-
> filled package. Worse, the cost of error correction is
> transferred
> to *all* users of the setuptools library, even if they never use
> 'easy_install' at all.
Well said.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From benji at benjiyork.com Mon Jul 23 20:58:44 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 14:58:44 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38DF5.6010701@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de>
<46A38DF5.6010701@v.loewis.de>
Message-ID: <46A4FA64.5050404@benjiyork.com>
Martin v. Löwis wrote:
> And where somewhat slower could be "practically not noticeable".
Perhaps it /could/ be, but isn't currently. For example, updating one
piece of software I have with almost 150 dependencies takes 45 seconds
with ppix, 4:45 without. I plan to do similar timings with the "simple"
PyPI interface when I get a chance and report the results here.
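
For scale, the arithmetic behind those two timings can be worked out as below. This is only a rough sketch assuming Python 3 division; the dependency count is "almost 150", so the figure of 150 is an approximation rather than a measured value.

```python
# Timings as reported in this message; DEPS is approximate ("almost 150").
DEPS = 150
with_ppix = 45                  # seconds
without_ppix = 4 * 60 + 45      # 4:45 expressed in seconds

speedup = without_ppix / with_ppix
per_dep_with = with_ppix / DEPS
per_dep_without = without_ppix / DEPS

print("speedup: %.1fx" % speedup)
print("per dependency: %.1fs with ppix, %.1fs without" % (per_dep_with, per_dep_without))
```

That works out to roughly a sixfold difference, or a bit under two seconds per dependency without ppix versus about a third of a second with it.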
--
Benji York
http://benjiyork.com
From jim at zope.com Mon Jul 23 21:06:46 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 15:06:46 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A4FA64.5050404@benjiyork.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de>
<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
Message-ID: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
On Jul 23, 2007, at 2:58 PM, Benji York wrote:
> Martin v. Löwis wrote:
>> And where somewhat slower could be "practically not noticeable".
>
> Perhaps it /could/ be, but isn't currently. For example, updating
> one piece of software I have with almost 150 dependencies takes 45
> seconds with ppix, 4:45 without. I plan to do similar timings with
> the "simple" PyPI interface when I get a chance and report the
> results here.
I suspect that this has more to do with network distance than with
server speed.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
From noah.gift at gmail.com Mon Jul 23 21:30:44 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Mon, 23 Jul 2007 15:30:44 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
index.
In-Reply-To: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
<46A23BAE.5090907@v.loewis.de>
<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
<46A259C4.6090605@v.loewis.de>