From jodok at lovelysystems.com Mon Jul 2 21:33:38 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 2 Jul 2007 21:33:38 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
Message-ID: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>

hi,

is it possible that our outgoing proxy server is being blocked by
cheeseshop? its ip address is 194.183.146.189
no, it was no attack on cheeseshop :) we're simply running buildout
over and over and probably generating some load.

thanks

jodok

--
"Explicit is better than implicit."
  -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77

From fdrake at gmail.com Mon Jul 2 23:21:06 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 2 Jul 2007 17:21:06 -0400
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>

On 7/2/07, Jodok Batlogg wrote:
> is it possible that our outgoing proxy server is being blocked by
> cheeseshop? its ip address is 194.183.146.189
> no, it was no attack on cheeseshop :) we're simply running buildout
> over and over and probably generating some load.

Hey Jodok,

I've taken to only using an internal repository for project buildouts;
if I need/want a new release from PyPI, I load that into the internal
repository. That avoids depending on PyPI being accessible at all
times, and I can always get what I've used again. No need to worry
about someone hiding old releases, or whatever.
It incurs a little overhead on adding or updating a package used in
my projects, but avoids depending on a highly-variable service. An
internal repository can still have problems, but at least it's easier
to make changes if needed.

  -Fred

--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller

From jodok at lovelysystems.com Mon Jul 2 23:25:33 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 2 Jul 2007 23:25:33 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
Message-ID: <003AB009-74C9-4F3A-8C78-F9CA96B31605@lovelysystems.com>

On 02.07.2007, at 23:21, Fred Drake wrote:

> On 7/2/07, Jodok Batlogg wrote:
>> is it possible that our outgoing proxy server is being blocked by
>> cheeseshop? its ip address is 194.183.146.189
>> no, it was no attack on cheeseshop :) we're simply running buildout
>> over and over and probably generating some load.
>
> Hey Jodok,
>
> I've taken to only using an internal repository for project buildouts;
> if I need/want a new release from PyPI, I load that into the internal
> repository. That avoids depending on PyPI being accessible at all
> times, and I can always get what I've used again. No need to worry
> about someone hiding old releases, or whatever.
>
> It incurs a little overhead on adding or updating a package used in
> my projects, but avoids depending on a highly-variable service. An
> internal repository can still have problems, but at least it's easier
> to make changes if needed.

already done after pypi being flakey :)
unfortunately now the outgoing ip of this repo is being blocked and
it sucks to scp downloaded files :)

thanks fred,

jodok

>
>
>   -Fred
>
> --
> Fred L. Drake, Jr.
> "Chaos is the score upon which reality is written."
> --Henry Miller

--
"Errors should never pass silently."
"Unless explicitly silenced."
  -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77

From jim at zope.com Tue Jul 3 00:04:36 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 2 Jul 2007 18:04:36 -0400
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <1BBE8714-5AC2-40E0-9182-F628B58F4911@zope.com>

On Jul 2, 2007, at 3:33 PM, Jodok Batlogg wrote:

> hi,
>
> is it possible that our outgoing proxy server is being blocked by
> cheeseshop? its ip address is 194.183.146.189
> no, it was no attack on cheeseshop :) we're simply running buildout
> over and over and probably generating some load.

It's hard to believe that buildout could be generating enough load to
trigger being blocked.

Jim

--
Jim Fulton            mailto:jim at zope.com    Python Powered!
CTO                   (540) 361-1714          http://www.python.org
Zope Corporation      http://www.zope.com     http://www.zope.org

From lac at openend.se Tue Jul 3 00:16:38 2007
From: lac at openend.se (Laura Creighton)
Date: Tue, 03 Jul 2007 00:16:38 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: Message from Jodok Batlogg of "Mon, 02 Jul 2007 21:33:38 +0200."
	<8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <200707022216.l62MGcg6009085@theraft.openend.se>

Could it be that you are simply out of Apache processes? I recall that
Sean set the number of simultaneous ones at some very tiny number.
Laura

From martin at v.loewis.de Tue Jul 3 00:29:21 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 03 Jul 2007 00:29:21 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <200707022216.l62MGcg6009085@theraft.openend.se>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<200707022216.l62MGcg6009085@theraft.openend.se>
Message-ID: <46897C41.7090609@v.loewis.de>

Laura Creighton wrote:
> Could it be that you are simply out of Apache processes? I recall that
> Sean set the number of simultaneous ones at some very tiny number.

I think you misunderstood. He set MaxRequestsPerChild to 10, which
means that each process will be replaced by a different one after
10 requests. MaxClients is 60, which should be more than enough.

Regards,
Martin

From lac at openend.se Tue Jul 3 00:33:36 2007
From: lac at openend.se (Laura Creighton)
Date: Tue, 03 Jul 2007 00:33:36 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: Message from "Martin v. Löwis" of "Tue, 03 Jul 2007 00:29:21 +0200."
	<46897C41.7090609@v.loewis.de>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<200707022216.l62MGcg6009085@theraft.openend.se>
	<46897C41.7090609@v.loewis.de>
Message-ID: <200707022233.l62MXavr011774@theraft.openend.se>

In a message of Tue, 03 Jul 2007 00:29:21 +0200, "Martin v. Löwis" writes:
>Laura Creighton wrote:
>> Could it be that you are simply out of Apache processes? I recall that
>> Sean set the number of simultaneous ones at some very tiny number.
>
>I think you misunderstood. He set MaxRequestsPerChild to 10, which
>means that each process will be replaced by a different one after
>10 requests. MaxClients is 60, which should be more than enough.
>
>Regards,
>Martin

yes, I thought it was 10. Sorry about that, and thank you.
Laura

From martin at v.loewis.de Tue Jul 3 09:22:11 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 03 Jul 2007 09:22:11 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <4689F923.8030304@v.loewis.de>

> is it possible that our outgoing proxy server is being blocked by
> cheeseshop? its ip address is 194.183.146.189

I can't see anything like that in the configuration of ximinez.

Furthermore, I cannot see that this IP address made any attempt
to contact ximinez. I got several accesses from 194.183.146.178,
for various versions of zc.buildout, through setuptools, and
I got requests from 194.183.146.185 through Firefox, but none
from the IP address that you mention. Going back until December
2006 (if I can trust the logs), that machine never made any
access to the Cheeseshop.

Regards,
Martin

From jodok at lovelysystems.com Tue Jul 3 11:02:19 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Tue, 3 Jul 2007 11:02:19 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <4689F923.8030304@v.loewis.de>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<4689F923.8030304@v.loewis.de>
Message-ID: <5B0A8BC7-CC65-49E6-AA15-CCF591A0EA41@lovelysystems.com>

On 03.07.2007, at 09:22, Martin v. Löwis wrote:

>> is it possible that our outgoing proxy server is being blocked by
>> cheeseshop? its ip address is 194.183.146.189
>
> I can't see anything like that in the configuration of ximinez.
>
> Furthermore, I cannot see that this IP address made any attempt
> to contact ximinez. I got several accesses from 194.183.146.178,
> for various versions of zc.buildout, through setuptools, and
> I got requests from 194.183.146.185 through Firefox, but none
> from the IP address that you mention.
> Going back until December 2006 (if I can trust the logs), that
> machine never made any access to the Cheeseshop.

it seems to happen on the network level. i can't ping the machine
from this ip address :)

coming from 194.183.146.189:

traceroute to ximinez.python.org (82.94.237.219), 64 hops max, 60 byte packets
 1  lsfw01 (192.168.34.254)  0.727 ms  0.406 ms  0.345 ms
 2  194-183-146-177.tele.net (194.183.146.177)  1.212 ms  1.061 ms  3.801 ms
 3  cr4-swz1.net.tele.net (194.183.134.8)  6.733 ms  5.034 ms  4.472 ms
 4  fas0-1-70-cr3-swz1.net.tele.net (194.183.133.188)  4.550 ms  4.581 ms  4.627 ms
 5  atm0-0-r1-hoe1.net.tele.net (194.183.135.34)  5.743 ms  5.471 ms  5.362 ms
 6  giga0-2.r2-buh1.net.tele.net (194.183.135.194)  7.449 ms  6.484 ms  5.843 ms
 7  83.144.194.17 (83.144.194.17)  8.407 ms  8.736 ms  8.444 ms
 8  g4-0-211.core01.zrh01.atlas.cogentco.com (149.6.83.129)  9.269 ms  8.669 ms  8.727 ms
 9  p6-0.core01.str01.atlas.cogentco.com (130.117.0.53)  11.924 ms  11.825 ms  10.960 ms
10  p3-0.core01.fra03.atlas.cogentco.com (130.117.0.217)  13.820 ms  14.551 ms  13.941 ms
11  p3-0.core01.ams03.atlas.cogentco.com (130.117.0.145)  21.411 ms  21.266 ms  20.842 ms
12  t3-1.mpd01.ams03.atlas.cogentco.com (130.117.0.34)  20.100 ms  21.003 ms  20.880 ms
13  ams-ix.sara.xs4all.net (195.69.144.48)  20.878 ms  20.983 ms  28.193 ms
14  0.so-6-0-0.xr1.3d12.xs4all.net (194.109.5.1)  21.045 ms  21.486 ms  20.892 ms
15  0.so-3-0-0.cr1.3d12.xs4all.net (194.109.5.58)  49.436 ms  29.076 ms  103.199 ms
16  * * *
17  * * *
18  * * *

coming from 194.183.146.179:

traceroute to ximinez.python.org (82.94.237.219), 64 hops max, 60 byte packets
 1  lsfw01 (192.168.34.254)  2.030 ms  1.495 ms  1.461 ms
 2  * 194-183-146-177.tele.net (194.183.146.177)  1.834 ms  1.646 ms
 3  cr4-swz1.net.tele.net (194.183.134.8)  4.873 ms  6.393 ms  5.318 ms
 4  fas4-0-70-cr1-swz1.net.tele.net (194.183.133.190)  8.466 ms  196.174 ms  5.562 ms
 5  194.183.142.2 (194.183.142.2)  6.540 ms  6.462 ms  21.969 ms
 6  giga0-2.r2-buh1.net.tele.net (194.183.135.194)  6.642 ms  6.871 ms  7.797 ms
 7  83.144.194.17 (83.144.194.17)  18.965 ms  9.923 ms  10.459 ms
 8  g4-0-211.core01.zrh01.atlas.cogentco.com (149.6.83.129)  10.003 ms  9.462 ms  9.945 ms
 9  p6-0.core01.str01.atlas.cogentco.com (130.117.0.53)  13.728 ms  11.831 ms  12.375 ms
10  p3-0.core01.fra03.atlas.cogentco.com (130.117.0.217)  14.568 ms  16.176 ms  15.069 ms
11  p3-0.core01.ams03.atlas.cogentco.com (130.117.0.145)  124.421 ms  134.435 ms  205.047 ms
12  t3-1.mpd01.ams03.atlas.cogentco.com (130.117.0.34)  21.689 ms  21.962 ms  22.313 ms
13  ams-ix.tc2.xs4all.net (195.69.144.166)  21.655 ms  21.213 ms  23.011 ms
14  0.so-7-0-0.xr2.3d12.xs4all.net (194.109.5.13)  21.531 ms  21.966 ms
    0.so-7-0-0.xr1.3d12.xs4all.net (194.109.5.9)  21.673 ms
15  0.so-2-0-0.cr1.3d12.xs4all.net (194.109.5.74)  21.526 ms
    0.so-3-0-0.cr1.3d12.xs4all.net (194.109.5.58)  24.606 ms  22.263 ms
16  ximinez.python.org (82.94.237.219)  23.363 ms  21.890 ms  25.506 ms

thanks a lot for your help

jodok

>
> Regards,
> Martin

--
"Simple is better than complex."
  -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77

From pje at telecommunity.com Thu Jul 5 02:56:25 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 04 Jul 2007 20:56:25 -0400
Subject: [Catalog-sig] Cheeseshop login problems?
Message-ID: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>

I can't seem to log in to the Cheeseshop, from any platform or
machine, whether via script or browser (Firefox or Lynx). I haven't
changed my password, but just in case there was an issue with my
password, I asked for a password reset.
The passwords I received in email don't work either, however, which
seems to suggest that there is a server problem involved. :(

From richardjones at optusnet.com.au Thu Jul 5 05:43:44 2007
From: richardjones at optusnet.com.au (richardjones at optusnet.com.au)
Date: Thu, 05 Jul 2007 13:43:44 +1000
Subject: [Catalog-sig] Cheeseshop login problems?
Message-ID: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://mail.python.org/pipermail/catalog-sig/attachments/20070705/a90ec30e/attachment.asc

From martin at v.loewis.de Thu Jul 5 07:38:36 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 05 Jul 2007 07:38:36 +0200
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
References: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
Message-ID: <468C83DC.4030605@v.loewis.de>

> No logins appear to work at the moment.
>
> Has anyone made changes to the apache config recently?

I did - I'll look into it.

Martin

From martin at v.loewis.de Thu Jul 5 08:22:33 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 05 Jul 2007 08:22:33 +0200
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
References: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
Message-ID: <468C8E29.70808@v.loewis.de>

Phillip J. Eby wrote:
> I can't seem to log in to the Cheeseshop, from any platform or
> machine, whether via script or browser (Firefox or Lynx). I haven't
> changed my password, but just in case there was an issue with my
> password, I asked for a password reset.
>
> The passwords I received in email don't work either, however, which
> seems to suggest that there is a server problem involved. :(

Please try again; it should work now.
I switched the Cheeseshop from using mod_python to using FastCGI,
but forgot to do the RewriteCond dance.

Sorry about that.

Regards,
Martin

From martin at v.loewis.de Thu Jul 5 08:37:35 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 05 Jul 2007 08:37:35 +0200
Subject: [Catalog-sig] Cheeseshop performance problems solved
Message-ID: <468C91AF.8000304@v.loewis.de>

I think I solved the performance problems of the Cheeseshop by
switching both the wiki and the Cheeseshop to FastCGI. I raised
MaxRequestsPerChild to 1000 again, and set MaxClients back to its
default (256). There are four processes running PyPI, and four
threads running MoinMoin.

If you experience problems, please report the exact date and time of
the outage, as well as the nature of the outage (e.g. if it doesn't
respond within a reasonable time, report what operation you did and
after what time you gave up waiting for a response).

Regards,
Martin

From jim at zope.com Thu Jul 5 15:32:44 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 5 Jul 2007 09:32:44 -0400
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <468C8E29.70808@v.loewis.de>
References: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
	<468C8E29.70808@v.loewis.de>
Message-ID: <24CECA6B-B9F3-420B-8016-C3C4FBB06548@zope.com>

Hey Martin,

I want to say thanks to you and the other folks who are working on
trying to address the PyPI performance issues. Much much thanks!

Jim

--
Jim Fulton            mailto:jim at zope.com    Python Powered!
CTO                   (540) 361-1714          http://www.python.org
Zope Corporation      http://www.zope.com     http://www.zope.org

From jim at zope.com Fri Jul 6 01:29:57 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 5 Jul 2007 19:29:57 -0400
Subject: [Catalog-sig] psycoph errors from pypi
Message-ID: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>

I imagine the people working on the cheeseshop are aware of this,
but, in case you aren't, I'm getting intermittent errors from the
cheeseshop. For example, requests for:

  http://www.python.org/pypi/

Often give:

  Error...

  There's been a problem with your request

  psycopg.ProgrammingError: ERROR: current transaction is aborted,
  commands ignored until end of transaction block

  select name, version, summary, _pypi_ordering from releases where
  (lower(name) LIKE '%%%%') and _pypi_hidden = FALSE order by
  lower(name), _pypi_ordering

or

  http://www.python.org/pypi/setuptools

sometimes gives:

  Error...

  There's been a problem with your request

  psycopg.ProgrammingError: ERROR: current transaction is aborted,
  commands ignored until end of transaction block

  select name, version, summary, _pypi_hidden from releases where
  name = 'setuptools' and _pypi_hidden = False order by
  _pypi_ordering desc

  http://www.python.org/pypi/setuptools/0.6c6

gives:

  Error...

  There's been a problem with your request

  psycopg.ProgrammingError: ERROR: current transaction is aborted,
  commands ignored until end of transaction block

  select packages.name as name, stable_version, version, author,
  author_email, maintainer, maintainer_email, home_page, license,
  summary, description, description_html, keywords, platform,
  download_url, _pypi_ordering, _pypi_hidden,
  cheesecake_installability_id, cheesecake_documentation_id,
  cheesecake_code_kwalitee_id from packages, releases where
  packages.name='setuptools' and version='0.6c6' and
  packages.name = releases.name

And so on.

Jim

--
Jim Fulton            mailto:jim at zope.com    Python Powered!
CTO                   (540) 361-1714          http://www.python.org
Zope Corporation      http://www.zope.com     http://www.zope.org

From martin at v.loewis.de Fri Jul 6 04:10:19 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 06 Jul 2007 04:10:19 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
Message-ID: <468DA48B.2020008@v.loewis.de>

Jim Fulton wrote:
> I imagine the people working on the cheeseshop are aware of this,
> but, in case you aren't, I'm getting intermittent errors from the
> cheeseshop. For example, requests for: http://www.python.org/pypi/

I wasn't aware of this until you reported it.

I don't have a clue what's causing it.

Regards,
Martin

From martin at v.loewis.de Fri Jul 6 04:33:54 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 06 Jul 2007 04:33:54 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468DA48B.2020008@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
Message-ID: <468DAA12.4000707@v.loewis.de>

Martin v. Löwis wrote:
> Jim Fulton wrote:
>> I imagine the people working on the cheeseshop are aware of this,
>> but, in case you aren't, I'm getting intermittent errors from the
>> cheeseshop. For example, requests for: http://www.python.org/pypi/
>
> I wasn't aware of this until you reported it.
>
> I don't have a clue what's causing it.

I now do, somewhat. Apparently, when you discard a cursor object
in psycopg, and create a new one, that doesn't necessarily start
a new transaction. So if there was some SQL error in the connection,
it stops accepting further SQL statements.

I fixed that by rolling back the connection after each request,
and before each new request.

What I don't understand is why there was an error in the first
place (or what that error was).
Regards,
Martin

From jim at zope.com Fri Jul 6 14:04:06 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 08:04:06 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468DAA12.4000707@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>
Message-ID:

On Jul 5, 2007, at 10:33 PM, Martin v. Löwis wrote:

> Martin v. Löwis wrote:
>> Jim Fulton wrote:
>>> I imagine the people working on the cheeseshop are aware of this,
>>> but, in case you aren't, I'm getting intermittent errors from the
>>> cheeseshop. For example, requests for: http://www.python.org/pypi/
>>
>> I wasn't aware of this until you reported it.
>>
>> I don't have a clue what's causing it.
>
> I now do, somewhat. Apparently, when you discard a cursor object
> in psycopg, and create a new one, that doesn't necessarily start
> a new transaction. So if there was some SQL error in the connection,
> it stops accepting further SQL statements.
>
> I fixed that by rolling back the connection after each request,
> and before each new request.
>
> What I don't understand is why there was an error in the first
> place (or what that error was).

OK, this probably isn't helpful, but I can't help asking an obvious
question. Did something change in the software other than a switch
from mod_python to FastCGI?

Jim

--
Jim Fulton            mailto:jim at zope.com    Python Powered!
CTO                   (540) 361-1714          http://www.python.org
Zope Corporation      http://www.zope.com     http://www.zope.org

From jim at zope.com Fri Jul 6 14:16:47 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 08:16:47 -0400
Subject: [Catalog-sig] Cheeseshop performance improved
In-Reply-To: <20070626105201.GA14025@tummy.com>
References: <467CC2E1.3010708@v.loewis.de>
	<46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
Message-ID:

On Jun 26, 2007, at 6:52 AM, Sean Reifschneider wrote:
...
> The quick fix would be to engage XS4ALL to upgrade the RAM in that
> box, leaving the box otherwise untouched. The system has only 1GB of
> RAM in it. It's got a 2.8GHz Xeon CPU in it, so I would expect it can
> take at least 4GB of RAM, if not 8 or 16GB.
>
> Thomas: If the PSF threw a grand or two at XS4ALL, could we get the
> memory in ximinez upgraded? Preferably to 4 or 8GB of RAM?

What is the status of this? This seems like a promising early step
and a pretty darn good use of PSF funds.

Jim

--
Jim Fulton            mailto:jim at zope.com    Python Powered!
CTO                   (540) 361-1714          http://www.python.org
Zope Corporation      http://www.zope.com     http://www.zope.org

From pje at telecommunity.com Fri Jul 6 19:21:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 13:21:00 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To:
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>
Message-ID: <20070706171848.268C23A4046@sparrow.telecommunity.com>

At 08:04 AM 7/6/2007 -0400, Jim Fulton wrote:
>On Jul 5, 2007, at 10:33 PM, Martin v. Löwis wrote:
> > I now do, somewhat. Apparently, when you discard a cursor object
> > in psycopg, and create a new one, that doesn't necessarily start
> > a new transaction.
> > So if there was some SQL error in the connection,
> > it stops accepting further SQL statements.
> >
> > I fixed that by rolling back the connection after each request,
> > and before each new request.
> >
> > What I don't understand is why there was an error in the first
> > place (or what that error was).
>
>OK, this probably isn't helpful, but I can't help asking an obvious
>question. Did something change in the software other than a switch
>from mod_python to FastCGI?

That wouldn't be necessary for this to become a problem. If PyPI was
CGI before, then any sort of transient SQL problem wouldn't have had
this effect, because the DB connection would've been closed at the
end of each request. So, it's probably an existing SQL error in PyPI.

From jafo at tummy.com Fri Jul 6 23:45:27 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Fri, 6 Jul 2007 15:45:27 -0600
Subject: [Catalog-sig] Cheeseshop performance improved
In-Reply-To:
References: <467CC2E1.3010708@v.loewis.de>
	<46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
Message-ID: <20070706214527.GR28082@tummy.com>

On Fri, Jul 06, 2007 at 08:16:47AM -0400, Jim Fulton wrote:
>What is the status of this? This seems like a promising early step
>and a pretty darn good use of PSF funds.

I never heard anything from Thomas, which I would think would be the right
person to run this through, as I really don't know anything about the
arrangement we have with XS4ALL. I guess we'd also need to get the PSF to
approve this, though I'd imagine that'd be little more than a formality.

If we don't have any response from Thomas in a bit, I can try contacting
XS4ALL directly and see if they can give us any ideas.

However, I believe that Martin also thinks that with his FastCGI changes it
should be happy now as is...
Thanks,
Sean
--
 I think you are blind to the fact that the hand you hold is the hand that
 holds you down.  -- Everclear
Sean Reifschneider, Member of Technical Staff
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability

From martin at v.loewis.de Sat Jul 7 00:02:47 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 07 Jul 2007 00:02:47 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To:
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>
Message-ID: <468EBC07.6010607@v.loewis.de>

>> I now do, somewhat. Apparently, when you discard a cursor object
>> in psycopg, and create a new one, that doesn't necessarily start
>> a new transaction. So if there was some SQL error in the connection,
>> it stops accepting further SQL statements.
>>
>> I fixed that by rolling back the connection after each request,
>> and before each new request.
>>
>> What I don't understand is why there was an error in the first
>> place (or what that error was).
>
> OK, this probably isn't helpful, but I can't help asking an obvious
> question. Did something change in the software other than a switch from
> mod_python to FastCGI?

Yes, I also made the connections to Postgres persistent, rather than
opening a new connection on each request.

Regards,
Martin

From jim at zope.com Sat Jul 7 00:06:29 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 18:06:29 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EBC07.6010607@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>
	<468EBC07.6010607@v.loewis.de>
Message-ID: <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>

On Jul 6, 2007, at 6:02 PM, Martin v. Löwis wrote:

>>> I now do, somewhat.
>>> Apparently, when you discard a cursor object
>>> in psycopg, and create a new one, that doesn't necessarily start
>>> a new transaction. So if there was some SQL error in the connection,
>>> it stops accepting further SQL statements.
>>>
>>> I fixed that by rolling back the connection after each request,
>>> and before each new request.
>>>
>>> What I don't understand is why there was an error in the first
>>> place (or what that error was).
>>
>> OK, this probably isn't helpful, but I can't help asking an obvious
>> question. Did something change in the software other than a
>> switch from mod_python to FastCGI?
>
> Yes, I also made the connections to Postgres persistent, rather than
> opening a new connection on each request.

Ah, OK, that explains it. This is a reasonable thing to do from a
performance point of view. Thanks for plugging away at this. :)

(Of course it's too bad we don't have a better way of testing
changes. Oh well.)

Jim

--
Jim Fulton            mailto:jim at zope.com    Python Powered!
CTO                   (540) 361-1714          http://www.python.org
Zope Corporation      http://www.zope.com     http://www.zope.org

From martin at v.loewis.de Sat Jul 7 00:15:10 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 07 Jul 2007 00:15:10 +0200
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <20070706214527.GR28082@tummy.com>
References: <467CC2E1.3010708@v.loewis.de>
	<46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<20070706214527.GR28082@tummy.com>
Message-ID: <468EBEEE.9010404@v.loewis.de>

> I never heard anything from Thomas, which I would think would be the right
> person to run this through, as I really don't know anything about the
> arrangement we have with XS4ALL.
> I guess we'd also need to get the PSF to
> approve this, though I'd imagine that'd be little more than a formality.
>
> If we don't have any response from Thomas in a bit, I can try contacting
> XS4ALL directly and see if they can give us any ideas.

I expect such a project to complete in a matter of months rather than
a matter of days. It took a year or so before the current set of
machines was actively being used (IIRC).

> However, I believe that Martin also thinks that with his FastCGI changes it
> should be happy now as is...

Indeed. If there are further complaints on the performance, I'd like
to hear them (preferably with a way for reproducing them). There is
still stuff that can be done to improve PyPI further, such as better
usage of SQL.

Regards,
Martin

From martin at v.loewis.de Sat Jul 7 00:22:42 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 07 Jul 2007 00:22:42 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>
	<468EBC07.6010607@v.loewis.de>
	<0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
Message-ID: <468EC0B2.9070903@v.loewis.de>

> Ah, OK, that explains it. This is a reasonable thing to do from a
> performance point of view. Thanks for plugging away at this. :)
>
> (Of course it's too bad we don't have a better way of testing changes.
> Oh well.)

If there were volunteer testers, it would be possible to test changes
for some period of time. Such testers would have to build themselves
a PyPI installation, and then check out all changes that have been
committed (or install them from a tracker where they float around).

Alternatively, if somebody contributed a unit test suite, certain
problems might get caught.
In the specific case, I tested whether PyPI "works" on my local
installation, and I apparently did not manage to trigger the
problem. My guess is that it was originally triggered by some
failing concurrent access, which is really hard to test for.

Regards,
Martin

From martin at v.loewis.de Sat Jul 7 00:40:19 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 07 Jul 2007 00:40:19 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <20070706171848.268C23A4046@sparrow.telecommunity.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>
	<20070706171848.268C23A4046@sparrow.telecommunity.com>
Message-ID: <468EC4D3.5030108@v.loewis.de>

> That wouldn't be necessary for this to become a problem. If PyPI was
> CGI before, then any sort of transient SQL problem wouldn't have had
> this effect, because the DB connection would've been closed at the end
> of each request. So, it's probably an existing SQL error in PyPI.

That would be my guess. Another possibility might have been that
there was a Python exception, in which case PyPI would not have invoked
.commit on the transaction (so apparently, the transaction would have
been kept open). I'm unsure whether this might cause problems for
subsequent actions. Still, no such exceptions were reported...

In any case, I now do a .rollback in the case of an exception, and
a .rollback before processing a new request. I'd like to get some
confirmation that this is a sensible approach (or what else best
practice is).
Regards, Martin From ianb at colorstudy.com Sat Jul 7 00:44:51 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 06 Jul 2007 17:44:51 -0500 Subject: [Catalog-sig] psycoph errors from pypi In-Reply-To: <468EC0B2.9070903@v.loewis.de> References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com> <468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de> <468EBC07.6010607@v.loewis.de> <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com> <468EC0B2.9070903@v.loewis.de> Message-ID: <468EC5E3.2040903@colorstudy.com> Martin v. L?wis wrote: >> Ah, OK, that explains it. This is a reasonable thing to do from a >> performance point of view. Thanks for plugging away at this. :) >> >> (Of course it's too bad we don't have a better way of testing changes. >> Oh well.) > > If there were volunteer testers, it would be possible to test changes > for some period of time. Such testers would have to build themselves > a PyPI installation, and then checkout all changes that have been > committed (or install them from a tracker where they float around). > > Alternatively, if somebody contributed a unit test suite, certain > problems might get caught. > > In the specific case, I tested whether PyPI "works" on my local > installation, and I apparently didn't not manage to trigger the > problem. My guess is that it was originally triggered by some > failing concurrent access, which is really hard to test for. Are exceptions being logged, and actively sent to someone who can handle them? This particular problem sounds like it is fairly deployment- and load-specific, so testing probably wouldn't have found it anyway. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org : Write code, do good : http://topp.openplans.org/careers From pje at telecommunity.com Sat Jul 7 01:20:37 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 06 Jul 2007 19:20:37 -0400 Subject: [Catalog-sig] psycoph errors from pypi In-Reply-To: <468EC4D3.5030108@v.loewis.de> References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com> <468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de> <20070706171848.268C23A4046@sparrow.telecommunity.com> <468EC4D3.5030108@v.loewis.de> Message-ID: <20070706231831.B83783A405F@sparrow.telecommunity.com> At 12:40 AM 7/7/2007 +0200, Martin v. Löwis wrote: > > That wouldn't be necessary for this to become a problem. If PyPI was > > CGI before, then any sort of transient SQL problem wouldn't have had > > this effect, because the DB connection would've been closed at the end > > of each request. So, it's probably an existing SQL error in PyPI. > >That would be my guess. Another possibility might have been that >there was a Python exception, in which case PyPI would not have invoked >.commit on the transaction (so apparently, the transaction would have >been kept open). I'm unsure whether this might cause problems for >subsequent actions. Still, no such exceptions were reported... > >In any case, I now do a .rollback in the case of an exception, and >a .rollback before processing a new request. I'd like to get some >confirmation that this is a sensible approach (or what else best >practice is). The best practice is ensuring that either a commit or rollback happens at the end of each web request that uses the connection. Then, there's no chance of a failed but not rolled-back transaction continuing to hold locks in the database. In PostgreSQL's case, the MVCC would prevent such a transaction from blocking any read-only transactions, of course. What you're doing is quite close to best practice; if I understand you correctly, it differs only in the case of what happens if there is a program error resulting in failure to commit or abort.
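The commit-or-rollback rule described above can be sketched roughly as follows. This is only an illustration, not PyPI's actual code: it uses Python's sqlite3 module as a stand-in for the PostgreSQL driver so that it runs on its own, and run_request, good_request, bad_request and the one-column journals table are hypothetical names made up for the example.

```python
import sqlite3


def run_request(conn, handler):
    """Run one request handler inside its own transaction (hypothetical helper)."""
    # Defensive rollback: discard anything an earlier request left open.
    conn.rollback()
    try:
        result = handler(conn)
    except Exception:
        # A failed request must not leave a transaction holding locks.
        conn.rollback()
        raise
    # The request succeeded, so make its changes permanent.
    conn.commit()
    return result


def good_request(conn):
    conn.execute("INSERT INTO journals (action) VALUES ('new release')")
    return "ok"


def bad_request(conn):
    conn.execute("INSERT INTO journals (action) VALUES ('half done')")
    raise RuntimeError("simulated failure")


# In-memory stand-in database; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journals (action TEXT)")
conn.commit()

run_request(conn, good_request)
try:
    run_request(conn, bad_request)
except RuntimeError:
    pass

rows = [r[0] for r in conn.execute("SELECT action FROM journals")]
print(rows)
```

Only the row written by the successful request should survive; the insert made by the failing request is rolled back, so a long-running FastCGI process does not carry a broken transaction into the next request.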
From richardjones at optushome.com.au Sat Jul 7 01:30:46 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Sat, 7 Jul 2007 09:30:46 +1000 Subject: [Catalog-sig] psycoph errors from pypi In-Reply-To: <468EC5E3.2040903@colorstudy.com> References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com> <468EC0B2.9070903@v.loewis.de> <468EC5E3.2040903@colorstudy.com> Message-ID: <200707070930.46318.richardjones@optushome.com.au> On Sat, 7 Jul 2007, Ian Bicking wrote: > Are exceptions being logged, and actively sent to someone who can handle > them? This particular problem sounds like it is fairly deployment- and > load-specific, so testing probably wouldn't have found it anyway. Errors are currently emailed to myself and AMK. This is controlled by the config on xminiez, so others may receive the error emails if they desire. Richard From renesd at gmail.com Sat Jul 7 03:22:27 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 7 Jul 2007 11:22:27 +1000 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <468EBEEE.9010404@v.loewis.de> References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> Message-ID: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com> Hi, yeah, the sql can be improved. A lot of the queries cause a sequential scan of all the rows in the journal and release tables. I think the cause of this is that one of the tables does not have a primary key, so postgresql can't optimize the query. Even if the table had an incrementing numeric id field, then I think the joins could be sped up. I haven't tested this yet, but maybe that'd help - or maybe there would need to be more changes needed. Postgresql definitely needs a PK on each table though. 
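A rough sketch of the sequential-scan point, under loud assumptions: it uses Python's sqlite3 as a stand-in (PostgreSQL's planner and its EXPLAIN output are different), and the releases table here is a simplified, hypothetical schema rather than the real one. It only shows that the planner switches from scanning every row to an index lookup once the lookup columns are indexed; a primary key on those columns would create such an index implicitly.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: no primary key and no indexes.
conn.execute("CREATE TABLE releases (name TEXT, version TEXT, summary TEXT)")

lookup = "SELECT summary FROM releases WHERE name = ? AND version = ?"
args = ("roundup", "1.3.3")

plan_before = query_plan(conn, lookup, args)
conn.execute("CREATE INDEX releases_name_version ON releases (name, version)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

On PostgreSQL the equivalent check would be to run EXPLAIN on the slow queries before and after adding the index.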
ps, I'm going to try and finish off that caching/static file work I've been working on(more on that later). I guess I'll need to test things a little differently with fastcgi. How did you set up a fastcgi pypi? Cheers, On 7/7/07, "Martin v. L?wis" wrote: > > I never heard anything from Thomas, which I would think would be the right > > person to run this through, as I really don't know anything about the > > arrangement we have with XS4ALL. I guess we'd also need to get the PSF to > > approve this, though I'd imagine that'd be little more than a formality. > > > > If we don't have any response from Thomas in a bit, I can try contacting > > XS4ALL directly and see if they can give us any ideas. > > I expect such a project to complete in a matter of months rather > than a matter of days. It took a year or so before the current set of > machines was actively being used (IIRC). > > > However, I believe that Martin also thinks that with his FastCGi changes it > > should be happy now as is... > > Indeed. If there are further complaints on the performance, I'd like to > hear them (preferably with a way for reproducing them). There is still > stuff that can be done to improve PyPI further, such as better usage of > SQL. > > Regards, > Martin > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > From renesd at gmail.com Sat Jul 7 06:24:53 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 7 Jul 2007 14:24:53 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. Message-ID: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> Hello, here is the start of an apache config for using static files if they exist, and the person is not logged in. The idea will be to have a www/static/cheeseshop.python.org/pypi/ directory filled with the relevant cached files. Here's the apache config so far. 
It checks to see if the person is authorized, and if they are it does not use the static files. There are a couple of special cases... i.e. the /pypi and pypi urls. Now I just need to finish off the static file generation code. It needs a tool which can run every minute or so, which will look for any changes. If it finds changes it will update just those files. It will generate the files in a separate directory first, and then move them in - so people don't download half generated files. It will optionally be able to regenerate all the static files - in case there are database, or template changes. Of course the config will have to change a little bit for using fcgi instead of modpython... but there shouldn't be too much to change. I've also updated the http://wiki.python.org/moin/CheeseShopDev page with some things I noticed when installing the cheeseshop again on my laptop. Mainly dependencies, and missing config steps.

NameVirtualHost 192.168.0.3
ServerAdmin webmaster at localhost
ServerName gracerr.pretendpaper.com
DocumentRoot /home/rene/dev/python/cheeseshop/packages/trunk/www/

# Redirect RSS to a static file
Alias /pypi/?:action=rss /data/www/pypi/pypi_rss.xml

Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
AddHandler cgi-script .cgi
Options Indexes

SetHandler mod_python
#PythonPath "['/data/pypi/src/pypi']+sys.path"
PythonPath "['/home/rene/dev/python/cheeseshop/packages/trunk/pypi']+sys.path"
PythonHandler pypi::handle
PythonDebug On
# 2007-06-15 -- POSTs to /pypi every second
deny from 69.55.232.188

# Rewrite rules
RewriteEngine on

# if the authorization header is empty, redirect.
RewriteCond %{HTTP:authorization} ^$
RewriteRule ^(.*)pypi/$ /static/package_index.html [L]
#RewriteRule ^(.*)pypi$ /static/front-page.html [L]

# always make the /pypi empty one go straight through.
RewriteRule ^(.*)pypi$ /pypi2 [PT]

# a file, or a directory, and empty authorization header.
RewriteCond %{HTTP:authorization} ^$
RewriteCond /home/rene/dev/python/cheeseshop/packages/trunk/www/static/gracerr.pretendpaper.com/%{REQUEST_FILENAME} -f
RewriteRule ^(.*)pypi/(.*) /static/gracerr.pretendpaper.com/pypi/$2 [PT]

RewriteCond %{HTTP:authorization} ^$
RewriteCond /home/rene/dev/python/cheeseshop/packages/trunk/www/static/gracerr.pretendpaper.com/%{REQUEST_FILENAME} -d
RewriteRule ^(.*)pypi/(.*) /static/gracerr.pretendpaper.com/pypi/$2 [PT]

# Look here instead...
RewriteRule (.*) /pypi2/$1 [PT]

# Point to package directory
RewriteRule /packages(/.*)?$ /data/packages$1 [last]
RewriteRule /icons/(.*$) /usr/share/apache2/icons/$1 [last]

RedirectMatch permanent ^/$ "http://gracerr.pretendpaper.com/pypi"

RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 9

ErrorLog /var/log/apache2/grace_error.log

# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#LogLevel warn
LogLevel debug

CustomLog /var/log/apache2/grace_access.log combined
#ServerSignature On

# mkdir /var/tmp/proxy2/cheeseshop
# chown www-data: /var/tmp/proxy2/cheeseshop
# CacheRoot "/var/tmp/proxy2/cheeseshop"
# CacheEnable disk /
# CacheSize 4000000
# # CacheMinFileSize setting this so that 403 forbidden pages are not cached.
# CacheMinFileSize 400
# CacheDirLevels 5
# CacheDirLength 3
# #CacheGcInterval 4
# CacheMaxExpire 24
# CacheLastModifiedFactor 0.1
# CacheDefaultExpire 1
# #CacheForceCompletion 100

From martin at v.loewis.de Sat Jul 7 08:30:53 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 07 Jul 2007 08:30:53 +0200 Subject: [Catalog-sig] psycoph errors from pypi In-Reply-To: <468EC5E3.2040903@colorstudy.com> References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com> <468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de> <468EBC07.6010607@v.loewis.de> <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com> <468EC0B2.9070903@v.loewis.de> <468EC5E3.2040903@colorstudy.com> Message-ID: <468F331D.1080904@v.loewis.de> > Are exceptions being logged, and actively sent to someone who can handle > them? This particular problem sounds like it is fairly deployment- and > load-specific, so testing probably wouldn't have found it anyway. They are sent by email. AFAICT, they are not logged. Regards, Martin From martin at v.loewis.de Sat Jul 7 08:44:21 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 07 Jul 2007 08:44:21 +0200 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com> References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com> Message-ID: <468F3645.1030000@v.loewis.de> > A lot of the queries cause a sequential scan of all the rows in the > journal and release tables. > > I think the cause of this is that one of the tables does not have a > primary key, so postgresql can't optimize the query.
Even if the > table had an incrementing numeric id field, then I think the joins > could be sped up. I haven't tested this yet, but maybe that'd help - > or maybe there would need to be more changes needed. Postgresql > definitely needs a PK on each table though. Not definitely - an index is enough. A PK only adds an additional constraint, and does not contribute in itself to performance. In any case, I plan to add a name-version index to release_classifiers, as the browsing often looks into release_classifiers by name and version. > ps, I'm going to try and finish off that caching/static file work I've > been working on(more on that later). I guess I'll need to test things > a little differently with fastcgi. How did you set up a fastcgi pypi?

FastCgiServer /data/pypi/src/pypi/pypi.fcgi -idle-timeout 60 -processes 4

then

# Trick Apache into providing Basic-Auth to pypi.fcgi
RewriteCond %{HTTP:Authorization} ^(.+)$
RewriteRule ^/pypi(.*) /data/pypi/src/pypi/pypi.fcgi$1 [e=HTTP_CGI_AUTHORIZATION:%1,l]
ScriptAlias /pypi /data/pypi/src/pypi/pypi.fcgi

Regards, Martin From martin at v.loewis.de Sat Jul 7 09:12:20 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 07 Jul 2007 09:12:20 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> Message-ID: <468F3CD4.1070501@v.loewis.de> > Now I just need to finish off the static file generation code. It > needs a tool which can run every minute or so, which will look for any > changes. Would it be possible to trigger that explicitly by a write operation? I'm doubtful about cron jobs for that kind of stuff - they run both too often and too infrequently.
It's too often because most of the time, nothing changes, and too infrequent, because the user making the change won't see it, and wonders where it got lost (they will see the change as long they are logged in, then they log out, and the release is not there). IIUC, every addition to the journals should trigger a change, and then the updating of the download counters. There are also changes to the templates, but it would be ok if one would have to trigger regeneration manually in this case. Regards, Martin From renesd at gmail.com Sat Jul 7 09:38:18 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 7 Jul 2007 17:38:18 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <468F3CD4.1070501@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> Message-ID: <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> Yeah, that could be triggered then. For the case of multiple changes at a similar time, we could add some checks to make sure the updater process is only running once. Otherwise for the case when there are a few changes happening at a time, the machine would get unnecessarily overloaded. On 7/7/07, "Martin v. L?wis" wrote: > > Now I just need to finish off the static file generation code. It > > needs a tool which can run every minute or so, which will look for any > > changes. > > Would it be possible to trigger that explicitly by a write operation? > I'm doubtful about cron jobs for that kind of stuff - they run both > too often and too infrequent. It's too often because most of the time, > nothing changes, and too infrequent, because the user making the change > won't see it, and wonders where it got lost (they will see the change > as long they are logged in, then they log out, and the release is not > there). 
> > IIUC, every addition to the journals should trigger a change, and then > the updating of the download counters. There are also changes to the > templates, but it would be ok if one would have to trigger regeneration > manually in this case. > > Regards, > Martin > From renesd at gmail.com Sat Jul 7 09:41:54 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 7 Jul 2007 17:41:54 +1000 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <468F3645.1030000@v.loewis.de> References: <467CC2E1.3010708@v.loewis.de> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com> <468F3645.1030000@v.loewis.de> Message-ID: <64ddb72c0707070041n5eb565c1jdaa25e4c9d583641@mail.gmail.com> Thanks. I thought because of the types of joins being done postgresql needed a primary key - but maybe you can get them working with just some more indices. On 7/7/07, "Martin v. L?wis" wrote: > > A lot of the queries cause a sequential scan of all the rows in the > > journal and release tables. > > > > I think the cause of this is that one of the tables does not have a > > primary key, so postgresql can't optimize the query. Even if the > > table had an incrementing numeric id field, then I think the joins > > could be sped up. I haven't tested this yet, but maybe that'd help - > > or maybe there would need to be more changes needed. Postgresql > > definitely needs a PK on each table though. > > Not definitely - and index is enough. A PK only adds an additional > constraint, and does not contribute in itself to performance. > In any case, I plan to add a name-version index to release_classifiers, > as the browsing often looks into release_classifiers by name and > version. 
> > > ps, I'm going to try and finish off that caching/static file work I've > > been working on(more on that later). I guess I'll need to test things > > a little differently with fastcgi. How did you set up a fastcgi pypi? > > FastCgiServer /data/pypi/src/pypi/pypi.fcgi -idle-timeout 60 -processes 4 > > then > > # Trick Apache in providing Basic-Auth to pypi.fcgi > RewriteCond %{HTTP:Authorization} ^(.+)$ > RewriteRule ^/pypi(.*) /data/pypi/src/pypi/pypi.fcgi$1 > [e=HTTP_CGI_AUTHORIZATION:%1,l] > ScriptAlias /pypi /data/pypi/src/pypi/pypi.fcgi > > Regards, > Martin > From jafo at tummy.com Sat Jul 7 10:18:13 2007 From: jafo at tummy.com (Sean Reifschneider) Date: Sat, 7 Jul 2007 02:18:13 -0600 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <468EBEEE.9010404@v.loewis.de> References: <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> Message-ID: <20070707081813.GS28082@tummy.com> On Sat, Jul 07, 2007 at 12:15:10AM +0200, "Martin v. Löwis" wrote: >I expect such a project to complete in a matter of months rather >than a matter of days. It took a year or so before the current set of I believe that Jim was referring to the memory upgrade of ximinez, not to getting creosote replaced with a new box. The memory upgrade should take little if any of our time. Thanks, Sean -- moshez always wanted to invent a compression scheme called "feather", so he could tar and feather his files. Sean Reifschneider, Member of Technical Staff tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability From renesd at gmail.com Sat Jul 7 11:03:24 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 7 Jul 2007 19:03:24 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> Message-ID: <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> Hi, I tried using memcached for caching the database queries - for logged in users. It did speed it up a little, but not that much. It turns out that the templates take most of the time - at least on my machine. I guess pagetemplates are not that quick? Here's the modified files if you want to try it out yourself: http://rene.f0o.com/~rene/stuff/store.py http://rene.f0o.com/~rene/stuff/webui.py I just tried out on the queries that /pypi /pypi/ use. There's some timing in the webui that gets written to a file in /tmp/asdfsdaf For concurrent access then memcached will make more of a difference though. Memcache could help even for logged in people, but I think replacing the template language with something faster will have the most effect. Cheers, On 7/7/07, Ren? Dudfield wrote: > Yeah, that could be triggered then. > > For the case of multiple changes at a similar time, we could add some > checks to make sure the updater process is only running once. > Otherwise for the case when there are a few changes happening at a > time, the machine would get unnecessarily overloaded. > > > On 7/7/07, "Martin v. L?wis" wrote: > > > Now I just need to finish off the static file generation code. It > > > needs a tool which can run every minute or so, which will look for any > > > changes. > > > > Would it be possible to trigger that explicitly by a write operation? > > I'm doubtful about cron jobs for that kind of stuff - they run both > > too often and too infrequent. 
It's too often because most of the time, > > nothing changes, and too infrequent, because the user making the change > > won't see it, and wonders where it got lost (they will see the change > > as long they are logged in, then they log out, and the release is not > > there). > > > > IIUC, every addition to the journals should trigger a change, and then > > the updating of the download counters. There are also changes to the > > templates, but it would be ok if one would have to trigger regeneration > > manually in this case. > > > > Regards, > > Martin > > > From jim at zope.com Sat Jul 7 16:30:24 2007 From: jim at zope.com (Jim Fulton) Date: Sat, 7 Jul 2007 10:30:24 -0400 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com> References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com> Message-ID: <2F7122AD-4F6C-4714-9955-0E12AF8A6864@zope.com> On Jul 6, 2007, at 9:22 PM, Ren? Dudfield wrote: ... > ps, I'm going to try and finish off that caching/static file work I've > been working on(more on that later). Yay! > I guess I'll need to test things > a little differently with fastcgi. How did you set up a fastcgi pypi? Does it matter? Couldn't you just test with CGI? Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Sat Jul 7 17:19:19 2007 From: jim at zope.com (Jim Fulton) Date: Sat, 7 Jul 2007 11:19:19 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> Message-ID: On Jul 7, 2007, at 12:24 AM, Ren? Dudfield wrote: ... > Now I just need to finish off the static file generation code. It > needs a tool which can run every minute or so, which will look for any > changes. Why not write the files when the underlying packages change? I don't like polling for two reasons: - New pages are out of date for up to the polling interval. This is especially annoying for someone who uploads a package and wants to be able to access it immediately. - Polling all of the pages to see what's changed doesn't seem scalable to me. ... > I've also updated the http://wiki.python.org/moin/CheeseShopDev page > with some things I noticed when installing the cheeseshop again on my > laptop. Mainly dependencies, and missing config steps. Thanks! Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Sat Jul 7 18:39:42 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 07 Jul 2007 18:39:42 +0200 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <20070707081813.GS28082@tummy.com> References: <46801FDC.4060502@v.loewis.de> <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> <20070707081813.GS28082@tummy.com> Message-ID: <468FC1CE.8080708@v.loewis.de> >> I expect such a project to complete in a matter of months rather >> than a matter of days. It took a year or so before the current set of > > I believe that Jim was referring to the memory upgrade of ximinez, not the > getting creosote replaced with a new box. 
The memory upgrade should tale > little if any of our time. Ah, ok. If you would like to find the right person at XS4ALL to talk to, please go ahead - else I could try myself. Regards, Martin From martin at v.loewis.de Sat Jul 7 18:43:39 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 07 Jul 2007 18:43:39 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> Message-ID: <468FC2BB.7030607@v.loewis.de> > I tried using memcached for caching the database queries - for logged > in users. It did speed it up a little, but not that much. It turns > out that the templates take most of the time - at least on my machine. For the majority of pages generated through page templates, I think the static generation would be fine. I'm looking primarily into the browse interface at the moment. > There's some timing in the webui that gets written to a file in /tmp/asdfsdaf > > For concurrent access then memcached will make more of a difference though. > > Memcache could help even for logged in people, but I think replacing > the template language with something faster will have the most effect. I'm quite skeptical on caching in general (even about the static page generation). It *should* be possible to make it fast enough so that it doesn't need caching. I consider caching a work-around, not a solution - and one with severe drawbacks. Regards, Martin From jim at zope.com Sat Jul 7 19:48:50 2007 From: jim at zope.com (Jim Fulton) Date: Sat, 7 Jul 2007 13:48:50 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <468FC2BB.7030607@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> Message-ID: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote: ... > I'm quite skeptical on caching in general (even about the static page > generation). It *should* be possible to make it fast enough so that > it doesn't need caching. Sure, with more hardware than we want to afford. > I consider caching a work-around, not a > solution - and one with severe drawbacks. The pages we're talking about are static. They change at well-known times. IMO, It's crazy to serve static content dynamically when it's easy to serve it statically. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jafo at tummy.com Sat Jul 7 20:56:30 2007 From: jafo at tummy.com (Sean Reifschneider) Date: Sat, 7 Jul 2007 12:56:30 -0600 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <468FC1CE.8080708@v.loewis.de> References: <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <46802A10.8080205@v.loewis.de> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> <20070707081813.GS28082@tummy.com> <468FC1CE.8080708@v.loewis.de> Message-ID: <20070707185630.GV28082@tummy.com> On Sat, Jul 07, 2007 at 06:39:42PM +0200, "Martin v. L?wis" wrote: >Ah, ok. If you would like to find the right person at XS4ALL to talk to, >please go ahead - else I could try myself. I've sent a request to the "sales" e-mail contact explaining what we're trying to do and asking for direction. 
Thanks, Sean -- You know you're in Canada when: You see a flyer advertising a polka-fest at the curling rink. Sean Reifschneider, Member of Technical Staff tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability From thomas at python.org Sat Jul 7 21:38:00 2007 From: thomas at python.org (Thomas Wouters) Date: Sat, 7 Jul 2007 12:38:00 -0700 Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved In-Reply-To: <20070707185630.GV28082@tummy.com> References: <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com> <200706252144.l5PLi7cs032424@theraft.openend.se> <20070626105201.GA14025@tummy.com> <20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de> <20070707081813.GS28082@tummy.com> <468FC1CE.8080708@v.loewis.de> <20070707185630.GV28082@tummy.com> Message-ID: <9e804ac0707071238v6664e9c1xb954fe805f5ebb15@mail.gmail.com> On 7/7/07, Sean Reifschneider wrote: > > On Sat, Jul 07, 2007 at 06:39:42PM +0200, "Martin v. L?wis" wrote: > >Ah, ok. If you would like to find the right person at XS4ALL to talk to, > >please go ahead - else I could try myself. > > I've sent a request to the "sales" e-mail contact explaining what we're > trying to do and asking for direction. I doubt they can figure out what to do, frankly, since we're not an official sales customer. But who knows, they might surprise me ;) I sent out an email asking for extra memory last week, but I've been busy with work and travelling (first Mountain View for Google, now Vilnius for EuroPython) and haven't had a chance to find out if the people I asked are even in the country right now. If you don't hear back from sales, let me know and I'll ask around more. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/catalog-sig/attachments/20070707/0500495c/attachment.html From martin at v.loewis.de Sat Jul 7 22:24:59 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 07 Jul 2007 22:24:59 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> Message-ID: <468FF69B.2090503@v.loewis.de> Jim Fulton wrote: > ... >> I'm quite skeptical on caching in general (even about the static page >> generation). It *should* be possible to make it fast enough so that >> it doesn't need caching. > > Sure, with more hardware than we want to afford. So you are saying it's not fast enough already? Regards, Martin From renesd at gmail.com Sun Jul 8 05:14:56 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sun, 8 Jul 2007 13:14:56 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> Message-ID: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> Hello, Cool, ok. Let's start with event based updating of the static files. I need to make this tool in this way anyway though. But we can either set it up to work with polling, or event based. We can start with event based and switch to polling later if needed. Since none of the files exist at the moment, the tool will be needed to generate them initially. Also if templates change, or the database changes - then the static pages may need regenerating. Polling is just one sql statement to see if something has changed.
You do this once, no matter how many things have changed. It's a really quick operation if nothing has changed. Polling ends up being faster if you are constantly having to do things all the time anyway. It's what network drivers do these days because they realise that there is a constant stream of events (interrupts) anyway - so might as well deal with them at a fixed interval. Logged in users will not see the static file anyway - since they are logged in, they get to see the dynamically generated stuff. Imagine this case: 2-3 users are updating their packages at a similar time. With event based updating the main index then gets regenerated 3 times, rather than once with polling. The more people who are changing things, the better polling works. If there are 20 people changing things at the same time, then there is still only one update of the main index page. However since the cheeseshop only gets updated about 6 times daily, event based is probably better for the moment. Anyway... I'm just making the tool which can be used on demand, or at regular intervals. Cheers, On 7/8/07, Jim Fulton wrote: > > On Jul 7, 2007, at 12:24 AM, René Dudfield wrote: > ... > > Now I just need to finish off the static file generation code. It > > needs a tool which can run every minute or so, which will look for any > > changes. > > Why not write the files when the underlying packages change? > > I don't like polling for two reasons: > > - New pages are out of date for up to the polling interval. This is > especially annoying for someone who uploads a package and wants to be > able to access it immediately. > > - Polling all of the pages to see what's changed doesn't seem > scalable to me. > > ... > > > I've also updated the http://wiki.python.org/moin/CheeseShopDev page > > with some things I noticed when installing the cheeseshop again on my > > laptop. Mainly dependencies, and missing config steps. > > Thanks! > > Jim > > -- > Jim Fulton mailto:jim at zope.com Python Powered! 
> CTO (540) 361-1714 http://www.python.org > Zope Corporation http://www.zope.com http://www.zope.org > > > > From renesd at gmail.com Sun Jul 8 05:27:53 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sun, 8 Jul 2007 13:27:53 +1000 Subject: [Catalog-sig] europython cheeseshop sprint? Rolling out changes. Message-ID: <64ddb72c0707072027p226f7125k3642ef00c5577675@mail.gmail.com> Hello, I'll need to coordinate with someone at some point to implement my changes... since I don't have access. I'm at EuroPython, so maybe that would be a good time to meet up for a little sprint? Is there anyone with access to the cheeseshop going to EuroPython who wants to work on implementing these changes? I don't have subversion commit access, or access to the server, so I'll need someone else who does to help me. Here's the wiki page for sprints: http://wiki.python.org/moin/EuroPython2007Sprints I also created a page here: http://wiki.python.org/moin/EuroPython2007/CheeseshopSprint We need to decide when to do the sprint too. Please let me know if you want to join the sprint, and on what day. What other things do people want to work on at the sprint? It would be good to set up a different virtual domain so we can test changes on there without mucking up the normal cheeseshop so much. It might be best if I set it up on a separate server for testing, since Apache will have to be restarted a lot. Since there aren't really any tests for the cheeseshop, should I start adding some? If so with which tool? I'd like to make some tests to see if the dynamic or static files are being served - depending on whether the user is authorized or not. I'd also like to These tests can also serve as monitoring tools - to answer this question - 'is the cheeseshop still working?' 
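As a rough illustration of the 'is the cheeseshop still working?' check mentioned above, here is a minimal sketch in Python. The function name is_up and its opener parameter are assumptions made for this example (they are not part of the cheeseshop code); the opener is injectable so the check can be exercised without touching the network.

```python
import urllib.request


def is_up(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if `url` answers with HTTP 200 and a non-empty body.

    `is_up` and `opener` are hypothetical names for this sketch only.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200 and len(response.read()) > 0
    except OSError:
        # urllib's URLError/HTTPError and socket errors are OSError subclasses.
        return False
```

A cron job or test runner could call is_up("http://cheeseshop.python.org/pypi") and alert when it returns False; telling static from dynamic responses would need some extra marker that this sketch does not assume.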
From renesd at gmail.com Sun Jul 8 06:48:45 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sun, 8 Jul 2007 14:48:45 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> Message-ID: <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com> Hi, here's the start of the static file generator. It just works on one web path, and one fileout at a time so far. It doesn't figure out the correct path to put the file, or check to see if there are any changes. http://rene.f0o.com/~rene/stuff/pypi/pypi-static-generation.py # here is like looking at the http://cheeseshop.python.org/pypi/pygame url python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html It uses the webui.py code, so that there will not be any repeating of code. It does this in a similar manner to how the pypi.py pypi.cgi and pypi.fcgi codes works. That is by making its implementation of the RequestWrapper class. I thought I'd just keep posting my changes to the mailing list as I go... so there's some history of changes - and so people can have a look/review if they want. If that annoys people I'll stop sending to the list. Next up I'm going to put a few functions into store.py. Ones to check if a release has changed since a given date. Also one to see if any changes at all have happened since a given date. I'll also add some onChange type functions for releases. That will be where all of the code can go for stuff that happens on a change to releases etc. cheers, On 7/8/07, Ren? Dudfield wrote: > Hello, > > Cool, ok. Let's start with event based updating of the static files. > > > I need to make this tool in this way anyway though. But we can either > set it up to work with polling, or event based. 
We can start with > event based and switch to polling later if needed. > > Since none of the files exists at the moment, the tool will be > needed to generate them initially. Also if templates change, or the > database changes - then the static pages may need regenerating. > > > Polling is just one sql statement to see if something has changed. > You do this once, no matter how many things have changed. It's a > really quick, operation if nothing has changed. > > Polling ends up being faster if you are constantly having to do things > all the time anyway. It's what network drivers do these days because > they realise that there are a constant stream of events(interupts) > anyway - so might as well deal with them at a fixed interval. > > Logged in users will not see the static file anyway - since they are > logged in, they get to see the dynamically generated stuff. > > Imagine this case: > 2-3 users are updating their packages, at a similar time. The main > index then gets regenerated 3 times, rather than once. The more > people who are changing things the more this method works. If there > are 20 people changing things at the same time, then there is still > only one update of the main index page. However since the cheeseshop > only gets updated about 6 times daily, event based is probably better > for the moment. > > > Anyway... I'm just making the tool which can be used on demand, or at > regular timings. > > > Cheers, > > > > On 7/8/07, Jim Fulton wrote: > > > > On Jul 7, 2007, at 12:24 AM, Ren? Dudfield wrote: > > ... > > > Now I just need to finish off the static file generation code. It > > > needs a tool which can run every minute or so, which will look for any > > > changes. > > > > Why not write the files when the underlying packages change? > > > > I don't like polling for two reasons: > > > > - New pages are out of date for up to the polling interval. 
This is > > especially annoying for someone who uploads a package and wants to be > > able to access it immediately. > > > > - Polling all of the pages to see what's changed doesn't seem > > scalable to me. > > > > ... > > > > > I've also updated the http://wiki.python.org/moin/CheeseShopDev page > > > with some things I noticed when installing the cheeseshop again on my > > > laptop. Mainly dependencies, and missing config steps. > > > > Thanks! > > > > Jim > > > > -- > > Jim Fulton mailto:jim at zope.com Python Powered! > > CTO (540) 361-1714 http://www.python.org > > Zope Corporation http://www.zope.com http://www.zope.org > > > > > > > > > From martin at v.loewis.de Sun Jul 8 07:19:33 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 08 Jul 2007 07:19:33 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> Message-ID: <469073E5.6010201@v.loewis.de> > Polling is just one sql statement to see if something has changed. It's not good enough if something has changed - one would also need to know what precisely has changed, or else you would need to regenerate everything. > Polling ends up being faster if you are constantly having to do things > all the time anyway. Maybe (I don't fully understand what you try to say). However, the cheeseshop does not change very often, so you don't have to do things all the time anyway. If it was, caching would have no advantage. > 2-3 users are updating their packages, at a similar time. The main > index then gets regenerated 3 times, rather than once. [Not sure what page precisely you are referring to as "the main index". I'll assume you talk about the home page] On July 7 (yesterday), there were 54 changes; the day before, there were 37. 
Of these, it is typical that multiple changes to the same package happen within a few seconds, and then no changes happen for many minutes; often not a single change within an hour. It very rarely happens that there are 3 users simultaneously updating their packages. Regenerating the main index 3 times is very fast. Depending on how precisely you prevent concurrent updates, and depending on how similar the times are, the three users may not trigger three updates, but only two, if the first update is still running when the second and third one is attempted. Regards, Martin From martin at v.loewis.de Sun Jul 8 07:29:36 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 08 Jul 2007 07:29:36 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com> Message-ID: <46907640.3010408@v.loewis.de> > Next up I'm going to put a few functions into store.py. Ones to check > if a release has changed since a given date. Also one to see if any > changes at all have happened since a given date. Is this really necessary? I think it would be sufficient to have a table of name,version pairs that list the releases that have changed. This table is filled on modification, and cleared by the regeneration. Regards, Martin From renesd at gmail.com Sun Jul 8 07:36:21 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sun, 8 Jul 2007 15:36:21 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <46907640.3010408@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com> <46907640.3010408@v.loewis.de> Message-ID: <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com> hello, It's less work to just look up to see when the last change was. Rather than make another table and store it - duplicating the data. Cheers, On 7/8/07, "Martin v. L?wis" wrote: > > Next up I'm going to put a few functions into store.py. Ones to check > > if a release has changed since a given date. Also one to see if any > > changes at all have happened since a given date. > > Is this really necessary? I think it would be sufficient to have a table > of name,version pairs that list the releases that have changed. This > table is filled on modification, and cleared by the regeneration. > > Regards, > Martin > From renesd at gmail.com Sun Jul 8 09:46:18 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sun, 8 Jul 2007 17:46:18 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com> <46907640.3010408@v.loewis.de> <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com> Message-ID: <64ddb72c0707080046j4c1a2566s7cf6ae5cba0ad9c6@mail.gmail.com> Hi, here's another update: http://rene.f0o.com/~rene/stuff/pypi/pypi-static-generation.py Now you can also create all of the releases listed on the "/pypi/" url. python pypi-static-generation.py -create_all It still doesn't do date checking yet. I'll probably get around to that tomorrow. 
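The "look up when the last change was" idea discussed above could look roughly like the following sketch. It is only an illustration: the journals table, its columns and the sample rows are assumptions invented for the example, and sqlite3 stands in for whatever database the cheeseshop actually uses. Returning (name, version) pairs rather than a yes/no answer also covers the point that one needs to know what precisely has changed.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journals (name TEXT, version TEXT, submitted_date TEXT)"
)
conn.executemany(
    "INSERT INTO journals VALUES (?, ?, ?)",
    [
        ("Pygame", "1.7.1", "2007-07-06 10:00:00"),
        ("zc.buildout", "1.0.0b28", "2007-07-08 09:30:00"),
    ],
)


def changed_since(conn, since):
    """Return the (name, version) pairs modified after `since`.

    Timestamps are ISO-formatted strings, so plain string comparison
    orders them chronologically.
    """
    rows = conn.execute(
        "SELECT DISTINCT name, version FROM journals "
        "WHERE submitted_date > ? ORDER BY name, version",
        (since,),
    )
    return rows.fetchall()
```

A regeneration tool would remember the time of its last run and pass it as `since`; an empty result means no page needs rebuilding.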
so it creates these files and directories: /pypi/Pygame/index.html /pypi/Pygame/1.7.1/index.html So these urls can use the static files: /pypi/Pygame/ /pypi/Pygame /pypi/Pygame/1.7.1 /pypi/Pygame/1.7.1/ It took about 20 minutes to generate all of them on my Ye Olde p3 256MB ram, laptop HD computer. Cheers, On 7/8/07, Ren? Dudfield wrote: > hello, > > It's less work to just look up to see when the last change was. > Rather than make another table and store it - duplicating the data. > > Cheers, > > > On 7/8/07, "Martin v. L?wis" wrote: > > > Next up I'm going to put a few functions into store.py. Ones to check > > > if a release has changed since a given date. Also one to see if any > > > changes at all have happened since a given date. > > > > Is this really necessary? I think it would be sufficient to have a table > > of name,version pairs that list the releases that have changed. This > > table is filled on modification, and cleared by the regeneration. > > > > Regards, > > Martin > > > From jim at zope.com Sun Jul 8 14:14:27 2007 From: jim at zope.com (Jim Fulton) Date: Sun, 8 Jul 2007 08:14:27 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <468FF69B.2090503@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> Message-ID: <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> On Jul 7, 2007, at 4:24 PM, Martin v. L?wis wrote: > Jim Fulton schrieb: >> ... >>> I'm quite skeptical on caching in general (even about the static >>> page >>> generation). It *should* be possible to make it fast enough so that >>> it doesn't need caching. >> >> Sure, with more hardware than we want to afford. 
> > So you are saying it's not fast enough already? Uh, yeah. That's what this whole thread has been about. *Maybe* all your efforts will make it fast enough. I'm skeptical though. Also understand that now that we're using the cheeseshop to support automated builds, the load will increase a lot over time. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Sun Jul 8 18:07:27 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 08 Jul 2007 18:07:27 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> Message-ID: <46910BBF.3010308@v.loewis.de> >> So you are saying it's not fast enough already? > > Uh, yeah. Can you please be more precise, then? What kind of operation are you performing, how long does it take, and how long should it take so that you would consider it fast enough? It's difficult to implement a system if the requirements are unknown to those implementing it. Regards, Martin From pje at telecommunity.com Sun Jul 8 19:27:56 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 08 Jul 2007 13:27:56 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> Message-ID: <20070708172544.8D2763A404D@sparrow.telecommunity.com> At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote: >On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote: >... > > I'm quite skeptical on caching in general (even about the static page > > generation). It *should* be possible to make it fast enough so that > > it doesn't need caching. > >Sure, with more hardware than we want to afford. > > > I consider caching a work-around, not a > > solution - and one with severe drawbacks. > >The pages we're talking about are static. They change at well-known >times. IMO, It's crazy to serve static content dynamically when it's >easy to serve it statically. If they're effectively static, why can't Apache cache them? Shouldn't we be able to simply add Last-Modified/If-Modified support to the PyPI output, and enable Apache's disk caching for non-logged-in users? That is, as long as there is a quick last-modified-time query for a package, we can use those to process the If-Modified header. The modification time could even be memcached, so as not to need a database hit 99% of the time. While that's not necessarily as fast as static page generation, it's a lot less complex to get right, and it saves the main piece of CPU load: i.e., doing SQL queries and actually generating the page. Pages that pertain to more than one package might be a bit more complex to do this on, but if I understand correctly it's mainly the package-specific pages we're concerned with here, correct? 
Even so, it's possible to have any updates also update a global "something's changed" time, and use that time as the Last-Modified of those pages. From martin at v.loewis.de Sun Jul 8 19:37:24 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 08 Jul 2007 19:37:24 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <20070708172544.8D2763A404D@sparrow.telecommunity.com> Message-ID: <469120D4.60909@v.loewis.de> > If they're effectively static, why can't Apache cache them? That's easy to answer: nobody told Apache to do that (and I don't know how to tell it to). Ren?'s approach currently is to generate the files explicitly on disk, and then have Apache return them always from disk. > Shouldn't > we be able to simply add Last-Modified/If-Modified support to the PyPI > output, and enable Apache's disk caching for non-logged-in users? How precisely would that work? I.e. what software should put what header into what place, and how would the cache then find out that the real data have changed? > While that's not necessarily as fast as static page generation, it's a > lot less complex to get right, and it saves the main piece of CPU load: > i.e., doing SQL queries and actually generating the page. I'm not convinced yet that this is where the time is spent (seeing actual profiling data would convince me). I have learned to never ever guess what precisely is consuming cycles in a piece of software. 
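For readers wondering what "actual profiling data" might look like in practice, here is a small sketch using the standard cProfile and pstats modules. The render_page function is a made-up stand-in for rendering one page and is not taken from the PyPI code; profile_call is likewise a hypothetical helper.

```python
import cProfile
import io
import pstats


def render_page():
    """Hypothetical stand-in for rendering a single page."""
    return "".join("<li>%d</li>" % i for i in range(10000))


def profile_call(func, top=10):
    """Run `func` under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


report = profile_call(render_page)
```

Printing report shows which functions account for the cumulative time, which is the kind of evidence that distinguishes template parsing from SQL queries or page generation.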
> Pages that pertain to more than one package might be a bit more complex > to do this on, but if I understand correctly it's mainly the > package-specific pages we're concerned with here, correct? I'm not convinced of that, either. Regards, Martin From renesd at gmail.com Sun Jul 8 19:47:17 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Mon, 9 Jul 2007 03:47:17 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <20070708172544.8D2763A404D@sparrow.telecommunity.com> Message-ID: <64ddb72c0707081047i1f4209e0j1584c1c2d6863bc5@mail.gmail.com> Hi, turning on caching is the plan as well, but after the static files. See my earlier emails on the subject. However static pages have their uses too, and are a bit faster than the cached ones. On 7/9/07, Phillip J. Eby wrote: > At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote: > > >On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote: > >... > > > I'm quite skeptical on caching in general (even about the static page > > > generation). It *should* be possible to make it fast enough so that > > > it doesn't need caching. > > > >Sure, with more hardware than we want to afford. > > > > > I consider caching a work-around, not a > > > solution - and one with severe drawbacks. > > > >The pages we're talking about are static. They change at well-known > >times. IMO, It's crazy to serve static content dynamically when it's > >easy to serve it statically. > > If they're effectively static, why can't Apache cache > them? 
Shouldn't we be able to simply add Last-Modified/If-Modified > support to the PyPI output, and enable Apache's disk caching for > non-logged-in users? > > That is, as long as there is a quick last-modified-time query for a > package, we can use those to process the If-Modified header. The > modification time could even be memcached, so as not to need a > database hit 99% of the time. > > While that's not necessarily as fast as static page generation, it's > a lot less complex to get right, and it saves the main piece of CPU > load: i.e., doing SQL queries and actually generating the page. > > Pages that pertain to more than one package might be a bit more > complex to do this on, but if I understand correctly it's mainly the > package-specific pages we're concerned with here, correct? Even so, > it's possible to have any updates also update a global "something's > changed" time, and use that time as the Last-Modified of those pages. > > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > From renesd at gmail.com Sun Jul 8 19:50:07 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Mon, 9 Jul 2007 03:50:07 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469120D4.60909@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <20070708172544.8D2763A404D@sparrow.telecommunity.com> <469120D4.60909@v.loewis.de> Message-ID: <64ddb72c0707081050l55c8beakbc241e5ac94ed7d7@mail.gmail.com> On 7/9/07, "Martin v. L?wis" wrote: > > If they're effectively static, why can't Apache cache them? 
> > That's easy to answer: nobody told Apache to do that > (and I don't know how to tell it to). > > Ren?'s approach currently is to generate the files explicitly > on disk, and then have Apache return them always from disk. Yeah, have apache return from disk if not logged in. Also if the static file is not there, then it generates the page dynamically. From pje at telecommunity.com Sun Jul 8 21:33:36 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 08 Jul 2007 15:33:36 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469120D4.60909@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <20070708172544.8D2763A404D@sparrow.telecommunity.com> <469120D4.60909@v.loewis.de> Message-ID: <20070708193123.CCB803A404D@sparrow.telecommunity.com> At 07:37 PM 7/8/2007 +0200, Martin v. L?wis wrote: > > If they're effectively static, why can't Apache cache them? > >That's easy to answer: nobody told Apache to do that >(and I don't know how to tell it to). > >Ren?'s approach currently is to generate the files explicitly >on disk, and then have Apache return them always from disk. > > > Shouldn't > > we be able to simply add Last-Modified/If-Modified support to the PyPI > > output, and enable Apache's disk caching for non-logged-in users? > >How precisely would that work? I.e. what software should put what >header into what place, and how would the cache then find out that >the real data have changed? I was under the impression that when Apache caching is enabled, it can add an If-Modified-Since header to incoming requests, and in the event that the dynamic content hasn't changed, use its cached version of the response. I am not an expert on this, however. 
If it does do this, then PyPI would check for an If-Modified-Since header and compare it to the modified date for the page, and return a "not changed" response if appropriate. > > While that's not necessarily as fast as static page generation, it's a > > lot less complex to get right, and it saves the main piece of CPU load: > > i.e., doing SQL queries and actually generating the page. > >I'm not convinced yet that this is where the time is spent (seeing >actual profiling data would convince me). I thought René had done such profiling, as he said it was the templates that were taking most of the CPU. > > Pages that pertain to more than one package might be a bit more complex > > to do this on, but if I understand correctly it's mainly the > > package-specific pages we're concerned with here, correct? > >I'm not convinced of that, either. Well, I thought those were the ones we were caching. It may be that I'm making too many assumptions, but if those assumptions are correct, then the whole thing gets a lot easier to prove correct, compared to a static cache, due to fewer moving parts. If most CPU time is spent rendering package-specific pages, then this approach would fix the problem with the fewest changed parts and the least extra code to maintain. From martin at v.loewis.de Sun Jul 8 21:34:00 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 08 Jul 2007 21:34:00 +0200 Subject: [Catalog-sig] ZPT template caching Message-ID: <46913C28.4060903@v.loewis.de> I just added template caching to PyPI: rather than parsing a page template on each request, it caches the parsed templates and renders from the pre-parsed copy on later requests. According to my measurements, this should noticeably reduce the number of Python function calls needed to render a page. As a side effect, Apache needs to be restarted when a template changes (this was already the case for code changes). 
Regards, Martin From martin at v.loewis.de Sun Jul 8 22:00:44 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 08 Jul 2007 22:00:44 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070708193123.CCB803A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <20070708172544.8D2763A404D@sparrow.telecommunity.com> <469120D4.60909@v.loewis.de> <20070708193123.CCB803A404D@sparrow.telecommunity.com> Message-ID: <4691426C.7030501@v.loewis.de> > I was under the impression that when Apache caching is enabled, it can > add an If-Modified-Since header to incoming requests, and in the event > that the dynamic content hasn't changed, use its cached version of the > response. I am not an expert on this, however. Where would it add that? The (F)CGI script doesn't see any headers, except for those communicated in environment variables. AFAICT, there is none for if-modified-since. If you were thinking of mod_cache: it will expire entries after CacheDefaultExpire (default 1h), unless an Expires or Last-Modified header is in the original response. In the latter case, CacheLastModifiedFactor is used to determine an expiry period (default: 10% of the time since last-modified). >> I'm not convinced yet that this is where the time is spent (seeing >> actual profiling data would convince me). > > I thought René had done such profiling, as he said it was the templates > that were taking most of the CPU. I saw that he said that it's taking most of the CPU; however, he didn't say he did profiling. I have now done so, and found that the parsing of the templates takes some time, so it now caches the parsed templates. 
>> > Pages that pertain to more than one package might be a bit more complex >> > to do this on, but if I understand correctly it's mainly the >> > package-specific pages we're concerned with here, correct? >> >> I'm not convinced of that, either. > > Well, I thought those were the ones we were caching. Not "were caching", but "going to cache". As I said before, I'm unconvinced that this is where the load goes; as a consequence, I'm unconvinced that generating static pages will improve things. Of course, if Rene completes this project, and the static pages don't actually break anything, it shouldn't hurt to use them; then we will see what the saving is (there surely will be *some* saving, and it might be that those who complain about the performance most will see a performance increase assuming that they are primarily interested in the static pages). > It may be that I'm making too many assumptions, but if those assumptions > are correct, then the whole thing gets a lot easier to prove correct, > compared to a static cache, due to fewer moving parts. If most CPU time > is spent rendering package-specific pages, then this approach would fix > the problem using the fewest changed parts and extra code to maintain. My biggest concern is whether there can be a reliable computation of "has this changed". If that predicate gives an incorrect response, it doesn't matter much whether Apache does its own caching, or whether the static pages fail to be regenerated. Regards, Martin From jafo at tummy.com Mon Jul 9 06:50:38 2007 From: jafo at tummy.com (Sean Reifschneider) Date: Sun, 8 Jul 2007 22:50:38 -0600 Subject: [Catalog-sig] ZPT template caching In-Reply-To: <46913C28.4060903@v.loewis.de> References: <46913C28.4060903@v.loewis.de> Message-ID: <20070709045038.GA12464@tummy.com> On Sun, Jul 08, 2007 at 09:34:00PM +0200, "Martin v. Löwis" wrote: >As a side effect, Apache needs to be restarted when a template >changes (this was already the case for code changes).
The way I cache our site, I put the cache into memcached, so that the cache is shared among all apaches, ages out old stuff, and when I update something I just tell memcached to invalidate everything in its cache, no Apache restart necessary. I *DO* need to restart it if I make code changes, but not template changes. Thanks, Sean -- If not actually disgruntled, he was far from being gruntled. -- P. G. Wodehouse Sean Reifschneider, Member of Technical Staff tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability From martin at v.loewis.de Mon Jul 9 07:08:15 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 09 Jul 2007 07:08:15 +0200 Subject: [Catalog-sig] ZPT template caching In-Reply-To: <20070709045038.GA12464@tummy.com> References: <46913C28.4060903@v.loewis.de> <20070709045038.GA12464@tummy.com> Message-ID: <4691C2BF.1060901@v.loewis.de> > The way I cache our site, I put the cache into memcached, so that the cache > is shared among all apaches, ages out old stuff, and when I update > something I just tell memcached to invalidate everything in its cache, no > Apache restart necessary. I *DO* need to restart it if I make code > changes, but not template changes. How can I put parsed zope templates into memcached? Regards, Martin From jafo at tummy.com Mon Jul 9 07:25:57 2007 From: jafo at tummy.com (Sean Reifschneider) Date: Sun, 8 Jul 2007 23:25:57 -0600 Subject: [Catalog-sig] ZPT template caching In-Reply-To: <4691C2BF.1060901@v.loewis.de> References: <46913C28.4060903@v.loewis.de> <20070709045038.GA12464@tummy.com> <4691C2BF.1060901@v.loewis.de> Message-ID: <20070709052557.GD5041@tummy.com> On Mon, Jul 09, 2007 at 07:08:15AM +0200, "Martin v. Löwis" wrote: >How can I put parsed zope templates into memcached? I have no idea.
I do it by caching the results, which for my application is all I really care about and don't vary from request to request unless the data or template has changed, or it's a different day. Sean -- Examine what is said, not who speaks. (Arabian Proverb) Sean Reifschneider, Member of Technical Staff tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability Back off man. I'm a scientist. http://HackingSociety.org/ From jim at zope.com Mon Jul 9 15:49:32 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 9 Jul 2007 09:49:32 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> Message-ID: <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> What Martin said :), and: On Jul 7, 2007, at 11:14 PM, Ren? Dudfield wrote: ... > Logged in users will not see the static file anyway - since they are > logged in, they get to see the dynamically generated stuff. Here's a common use case: - A user uploads a new release - They then use setuptools to install the release from PyPI. setuptools will not present their credentials and will therefore behave like a logged in user. It will see and install an older version of the package. This will be very mysterious and annoying to the user that just uploaded the release. > Imagine this case: > 2-3 users are updating their packages, at a similar time. The main > index then gets regenerated 3 times, rather than once. Who cares. That's one page that we get dynamically now. > The more > people who are changing things the more this method works. If there > are 20 people changing things at the same time, then there is still > only one update of the main index page. However since the cheeseshop > only gets updated about 6 times daily, event based is probably better > for the moment. Yup. 
> Anyway... I'm just making the tool which can be used on demand, or at > regular timings. I wonder if we are talking about the same thing here. I fear not. With event based update, you should only update the pages that need to be updated, at worst, this should be the pages for the project being updated plus http://www.python.org/pypi/. The software needed for this would be very different than the software that would build the static pages initially or update all if a template has changed. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Jul 9 16:09:37 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 9 Jul 2007 10:09:37 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46910BBF.3010308@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> Message-ID: On Jul 8, 2007, at 12:07 PM, Martin v. Löwis wrote: >>> So you are saying it's not fast enough already? >> >> Uh, yeah. > > Can you please be more precise, then? What kind of operation are > you performing, I'm using setuptools. Setuptools looks at package pages (e.g. http://www.python.org/pypi/foobar), it looks at: http://www.python.org/pypi/ and it downloads distributions. (AFAICT, the latter is done dynamically too, which is especially insane.) > how long does it take, Lately, it has often taken minutes. This has been the major problem. At the best of times, well, I don't know when those are.
:) ATM, requests for http://www.python.org/pypi/zc.buildout take about 1/3 second. Requests for http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg take about 2.5 seconds. Requests for http://www.python.org/pypi/ take about 10 seconds. I would say that these times are too long. > and how long should it > take so that you would consider it fast enough? IMO, it needs to be much much faster. If we were serving pages statically, we would be able to serve thousands of requests per second. There's nothing about this application that would make doing that hard. > It's difficult to implement a system if the requirements are > unknown to those implementing it. I'm sorry, I've been talking about setuptools all along. I thought the use case was understood. Also, I thought it was pretty obvious that the performance we've been seeing lately is totally unacceptable. It's hard to pinpoint exactly what the acceptable performance will be, in part because, when we do better, demand will increase. Note that, as it is now, demand is possibly decreasing because people are building their own indexes. If this was an application that had to be served dynamically (and of course, parts of it are), then it would be much more interesting to discuss targets for dynamic delivery. The performance-critical parts of this application -- the pages that setuptools uses, can readily be served statically, so it makes no sense not to do so. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Jul 9 16:21:23 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 9 Jul 2007 10:21:23 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <20070708172544.8D2763A404D@sparrow.telecommunity.com> Message-ID: <437CFE1D-125A-4856-936E-27FC688B57BA@zope.com> On Jul 8, 2007, at 1:27 PM, Phillip J. Eby wrote: > At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote: > >> On Jul 7, 2007, at 12:43 PM, Martin v. Löwis wrote: >> ... >> > I'm quite skeptical on caching in general (even about the static >> page >> > generation). It *should* be possible to make it fast enough so that >> > it doesn't need caching. >> >> Sure, with more hardware than we want to afford. >> >> > I consider caching a work-around, not a >> > solution - and one with severe drawbacks. >> >> The pages we're talking about are static. They change at well-known >> times. IMO, It's crazy to serve static content dynamically when it's >> easy to serve it statically. > > If they're effectively static, why can't Apache cache them? > Shouldn't we be able to simply add Last-Modified/If-Modified > support to the PyPI output, and enable Apache's disk caching for > non-logged-in users? When caching something, you typically specify an age before you start checking. That means that content would be stale for that period. Sometimes, that is both acceptable and necessary. In any case, dynamic servers typically take just as long to handle an If-Modified or Last-Modified request as they do to handle a regular request. It would be just as complicated, if not more so, to get the cheeseshop software to do this properly than it would to just bake. > That is, as long as there is a quick last-modified-time query for a > package, we can use those to process the If-Modified header.
The > modification time could even be memcached, so as not to need a > database hit 99% of the time. No, it can't be cached. What would you do to make sure that cache wasn't stale? > While that's not necessarily as fast as static page generation, > it's a lot less complex to get right, and it saves the main piece > of CPU load: i.e., doing SQL queries and actually generating the page. It is really easy to get static page generation right for an application this simple. You know when pages are invalidated. The page relationships are not at all complicated here. > Pages that pertain to more than one package might be a bit more > complex to do this on, but if I understand correctly it's mainly > the package-specific pages we're concerned with here, correct? Yes, and http://www.python.org/pypi/ Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From renesd at gmail.com Mon Jul 9 17:13:48 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Mon, 9 Jul 2007 19:13:48 +0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> Message-ID: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> Hello Jim, I double+ agree we should update on change. On 7/9/07, Jim Fulton wrote: > Here's a common use case: > > - A user uploads a new release > > - They then use setuptools to install the release from PyPI. > setuptools will not present their credentials and will therefore > behave like a logged in user. It will see and install an older > version of the package. > You mean it will behave like someone *not* logged in right? Either way they should always get the latest change.
To do this atomically, so that no one can possibly get an old page, the static file will be removed as the change is committed. Then everyone gets the latest change right away - as soon as the change has been committed. > > Anyway... I'm just making the tool which can be used on demand, or at > > regular timings. > > I wonder if we are talking about the same thing here. I fear not. > With event based update, you should only update the pages that need > to be updated, at worst, this should be the pages for the project > being updated plus http://www.python.org/pypi/. The software needed > for this would be very different than the software that would build > the static pages initially or update all if a template has changed. >

These are the commands so far:

    python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html
    python pypi-static-generation.py -create_all

The generation of the main index page would be:

    python pypi-static-generation.py -create_single /pypi/ path_to_static_indexpage.html

Then there would be a command to update the single page:

    python pypi-static-generation.py -create_single /pypi/Pygame path_to_static_pygame.html

Ok, that's all for now. I'll be able to finish it off in a few days after europython. Cheers,
From renesd at gmail.com Mon Jul 9 17:19:52 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Mon, 9 Jul 2007 19:19:52 +0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> Message-ID: <64ddb72c0707090819l2382c8cu619a85ec0d3464dc@mail.gmail.com> On 7/9/07, Jim Fulton wrote: > ATM, requests for http://www.python.org/pypi/zc.buildout takes about > 1/3 second. Requests for http://cheeseshop.python.org/packages/2.5/z/ > zc.buildout/zc.buildout-1.0.0b28-py2.5.egg take about 2.5 seconds. > Requests for http://www.python.org/pypi/ take about 10 seconds. > > I would say that these times are too long. > Hi again, Just a note, the static pages through the mod-rewrite logic goes pretty quickly. So both those pages can be served at 1000s of requests per second. Cheers, From jim at zope.com Mon Jul 9 18:27:08 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 9 Jul 2007 12:27:08 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> Message-ID: <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com> On Jul 9, 2007, at 11:13 AM, Ren? Dudfield wrote: > Hello Jim, > > I double+ agree we should update on change. Yay! :) > On 7/9/07, Jim Fulton wrote: >> Here's a common use case: >> >> - A user uploads a new release >> >> - They then use setuptools to install the release from PyPI. >> setuptools will not present their credentials and will therefore >> behave like a logged in user. 
It will see and install an older >> version of the package. >> > > You mean it will behave like someone *not* logged in right? Right. > Either > way they should always get the latest change. Yes, if we update the static on change. I thought you were arguing that it didn't matter if cached pages were out of date because the person updating the pages would see the changes because they'd see uncached pages. > The way to do this atomically, so not one can possibly get an old > page, the static file will be removed as the change is committed. > Then everyone gets the latest change right away - as soon as the > change has been committed. Sure. > >> > Anyway... I'm just making the tool which can be used on demand, >> or at >> > regular timings. >> >> I wonder if we are talking about the same thing here. I fear not. >> With event based update, you should only update the pages that need >> to be updated, at worst, this should be the pages for the project >> being updated plus http://www.python.org/pypi/. The software needed >> for this would be very different than the software that would build >> the static pages initially or update all if a template has changed. >> > > > These are the commands so far: > python pypi-static-generation.py -create_single /pypi/pygame /tmp/ > pygame.html > python pypi-static-generation.py -create_all Ah, so one script, 2 behaviors. Fair enough. > The generation of the main index page would be: > python pypi-static-generation.py -create_single /pypi/ > path_to_static_indexpage.html > > Then there would be a command to update the single page: > python pypi-static-generation.py -create_single /pypi/Pygame > path_to_static_pygame.html Shouldn't that be implied by both of the commands above? I'm a little surprised that you are doing this as an external script, as opposed to adding the behavior to the cheeseshop code, but I guess it doesn't matter. > Ok, that's all for now. I'll be able to finish it off in a few days > after europython.
Haven't you been able to get anyone to sprint with you on it there? Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Mon Jul 9 18:44:45 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 09 Jul 2007 12:44:45 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> Message-ID: <20070709164232.95EED3A404D@sparrow.telecommunity.com> At 07:13 PM 7/9/2007 +0400, René Dudfield wrote: >The way to do this atomically, so not one can possibly get an old >page, the static file will be removed as the change is committed. >Then everyone gets the latest change right away - as soon as the >change has been committed. This sounds pretty good... except that you may need better protection against a race condition. What happens if a page is removed *while* it is being regenerated? PostgreSQL has MVCC for read-only transactions, so the static page will be generated against old data, unless you have some other locking mechanism used to serialize access to the static file, that is shared by both the deletion and generating mechanisms. One possible approach: if the generator writes its files to foo/index.html.tmp (opened with exclusive access) and then renames them to 'foo/index.html', then the deletion mechanism can attempt to *first* remove the .tmp file, then the real file. Both processes must be robust against their renames or unlinks or exclusive open()'s failing, but there would then be no possibility of collision.
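As a rough illustration of the .tmp-and-rename scheme described above, here is a minimal Python sketch. It assumes a POSIX filesystem (atomic rename(), O_EXCL honoured on create); `serve_cacheable`, `invalidate` and the `render` callback are hypothetical names that do not appear in the PyPI code, and the sketch does not try to settle whether the scheme is safe under every possible interleaving of writers and invalidations.

```python
import os


def _unlink_quietly(path):
    """Remove `path`, ignoring the case where it is already gone."""
    try:
        os.unlink(path)
    except OSError:
        pass


def serve_cacheable(cache_path, render):
    """Return the page text, publishing it to the cache if we win the .tmp file.

    `render` is an assumed callable that builds the page text; it must do
    its database reads only after being called, i.e. after the exclusive open.
    """
    tmp_path = cache_path + ".tmp"
    try:
        # O_EXCL makes the create fail if another writer holds the .tmp file.
        fd = os.open(tmp_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except OSError:
        return render()  # someone else is generating; just serve dynamically
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            page = render()
            tmp.write(page)
    except Exception:
        _unlink_quietly(tmp_path)  # do not leave a stale lock file behind
        raise
    try:
        os.rename(tmp_path, cache_path)  # atomic publish on POSIX
    except OSError:
        pass  # the .tmp file was unlinked by an invalidation; skip caching
    return page


def invalidate(cache_path):
    """Run after the database commit: drop the .tmp file first, then the page."""
    for name in (cache_path + ".tmp", cache_path):
        _unlink_quietly(name)
```

The order inside `invalidate` matters for the argument being made here: removing the .tmp file first stops an in-flight writer from publishing, and removing the real file second undoes a publish that has already happened.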
The exclusive open would have to be done at the *start* of write processing, however, before any database queries have been attempted. (And their connection must be rolled back at that point.) This ensures that, if a writer succeeds in locking the .tmp file, then they are seeing data that is current.

All that having been said, the idea in general sounds good. If PyPI itself simply checked whether the URL it's about to serve is cacheable (i.e., has a static location and no user logged in), and if so, opened the temp file for exclusive writing, it could just dump its generated page out, and rename it at the end if it had been successful in acquiring the temp file.

And voila! No separate caching process, no scheduling, and an always perfectly-up-to-date cache. As soon as a page becomes out of date, it gets served dynamically... but only for as long as it takes to serve one copy of that page. :)

In pseudocode:

    def process_request():
        if no authentication header and URL path is cacheable:
            try:
                temp = exclusive open cache file with .tmp extension
            except os.error:
                pass
            else:
                with stdout redirected to temp:
                    process_request_normally()
                try:
                    rename(tempfilename, realfilename)
                except os.error:
                    pass
                send_browser_contents_of(temp)
                return
        return process_request_normally()

Here, 'process_request_normally()' should refer to everything that PyPI does now, *including database connection rollback or commit*. This will ensure that it's impossible to write stale data to the cache.

The deletion process should just do this:

    for name in (cache_path+'.tmp', cache_path):
        try:
            os.unlink(name)
        except os.error:
            pass

after committing the database transaction.

Informal serialization proof:

* Only one process may write to a page's .tmp file at a time
* Either the writer has committed its page write (by renaming the .tmp file), or it has not (i.e., rename() is atomic)
* If the writer has *not* committed its page, then the first unlink will prevent it from doing so.
* If the writer *has* committed its page, then the second unlink will undo this. * If between the two unlinks operations, another writer appears, that writer will be reading current data from the database, because it has to acquire exclusive access to the .tmp file before doing a rollback and reading the data it will use for writing. QED, it will be impossible to have stale data in the cache, unless the invalidating request fails to attempt its two unlink operations during the brief window after its database commit. From renesd at gmail.com Mon Jul 9 19:15:19 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Tue, 10 Jul 2007 03:15:19 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070709164232.95EED3A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> <20070709164232.95EED3A404D@sparrow.telecommunity.com> Message-ID: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com> On 7/10/07, Phillip J. Eby wrote: > At 07:13 PM 7/9/2007 +0400, Ren? Dudfield wrote: > >The way to do this atomically, so not one can possibly get an old > >page, the static file will be removed as the change is committed. > >Then everyone gets the latest change right away - as soon as the > >change has been committed. > > This sounds pretty good... except that you may need better > protection against a race condition. What happens if a page is > removed *while* it is being regenerated? PostgreSQL has MVCC for > read-only transactions, so the static page will be generated against > old data, unless you have some other locking mechanism used to > serialize access to the static file, that is shared by both the > deletion and generating mechanisms. > Hi, move in linux/unix is atomic. 
So the file is generated and then moved in. unlink is similar... once you remove it, any processes with that file open still references the old file. So no race condition. def the static generation: - generate file in temp file - move temp file to place where static file lives. def the update code: - do inserts/updates/deletes. - remove static files. - commit change. - the static generation() From renesd at gmail.com Mon Jul 9 19:31:10 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Tue, 10 Jul 2007 03:31:10 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com> Message-ID: <64ddb72c0707091031m1fe5fccai12708e38fb547d79@mail.gmail.com> No, I haven't found anyone yet. I'll write it up on the board, and see if anyone wants to join in tomorrow - or maybe find someone at the bar tonight. Where do people report bugs for the cheeseshop/distutils? Someone was telling me today that he couldn't get the setup.py to do new releases anymore. cu. On 7/10/07, Jim Fulton wrote: > > On Jul 9, 2007, at 11:13 AM, Ren? Dudfield wrote: > > > Hello Jim, > > > > I double+ agree we should update on change. > > Yay! :) > > > On 7/9/07, Jim Fulton wrote: > >> Here's a common use case: > >> > >> - A user uploads a new release > >> > >> - They then use setuptools to install the release from PyPI. > >> setuptools will not present their credentials and will therefore > >> behave like a logged in user. It will see and install an older > >> version of the package. > >> > > > > You mean it will behave like someone *not* logged in right? > > Right. 
> > > Either > > way they should always get the latest change. > > Yes, if we update the static on change. > > I though you were arguing that it didn't matter of cached pages were > out of date because the person updating the pages would see the > changes because they'd see uncached pages. > > > The way to do this atomically, so not one can possibly get an old > > page, the static file will be removed as the change is committed. > > Then everyone gets the latest change right away - as soon as the > > change has been committed. > > Sure. > > > > > >> > Anyway... I'm just making the tool which can be used on demand, > >> or at > >> > regular timings. > >> > >> I wonder if we are talking about the same thing here. I fear not. > >> With event based update, you should only update the pages that need > >> to be updated, at worst, this should be the pages for the project > >> being updated plus http://www.python.org/pypi/. The software needed > >> for this would be very different than the software that would build > >> the static pages initially or update all if a template has changed. > >> > > > > > > These are the commands so far: > > python pypi-static-generation.py -create_single /pypi/pygame /tmp/ > > pygame.html > > python pypi-static-generation.py -create_all > > Ah, so one script, 2 behaviors. Fair enough. > > > > The generation of the main index page would be: > > python pypi-static-generation.py -create_single /pypi/ > > path_to_static_indexpage.html > > > > Then there would be a command to update the single page: > > python pypi-static-generation.py -create_single /pypi/Pygame > > path_to_static_pygame.html > > Shouldn't that be implied by both of the commands above. > > I'm a little surprised that you are doing this as an external script, > as opposed to adding the behavior to the cheeseshop code, but I guess > it doesn't matter. > > > Ok, that's all for now. I'll be able to finish it off in a few days > > after europython. 
> > Haven't you been able to get anyone to sprint with you on it there? > > Jim > > -- > Jim Fulton mailto:jim at zope.com Python Powered! > CTO (540) 361-1714 http://www.python.org > Zope Corporation http://www.zope.com http://www.zope.org > > > > From pje at telecommunity.com Mon Jul 9 19:37:56 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 09 Jul 2007 13:37:56 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.co m> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> <20070709164232.95EED3A404D@sparrow.telecommunity.com> <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com> Message-ID: <20070709173543.BC31B3A404D@sparrow.telecommunity.com> At 03:15 AM 7/10/2007 +1000, Ren? Dudfield wrote: >def the static generation: > - generate file in temp file > - move temp file to place where static file lives. > >def the update code: > - do inserts/updates/deletes. > - remove static files. > - commit change. > - the static generation() Ah - I was assuming static generation was going to be a separate process. However, there's still a race condition here, unless you open the temp file exclusively before the transaction commits. If you wait until after the transaction is finished, another change could occur to the same page after you, but finish its page write *before* you, causing you to overwrite it with your move! You then end up with an outdated page that will stick around indefinitely. (Yes, it's unlikely, but it *can* happen, and therefore eventually will.) So, as in my suggestion, you *still* need an exclusive open of a pre-determined tempfile name, prior to transaction commit. Then, such an occurrence is impossible. 
By the way, the generate-on-change approach also means you have to do a big batch run to pre-generate all the existing static pages; the approach I suggested will simply generate them in response to actual demand, with no batch processing necessary. A new PyPI installation would just build up its cache as it gets used, getting faster as it goes. From martin at v.loewis.de Tue Jul 10 00:16:03 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 00:16:03 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> Message-ID: <4692B3A3.5030209@v.loewis.de> > Lately, it's has often taken minutes. This has been the major problem. > At the best of times. well, I don't know when those are. :) > > ATM, requests for http://www.python.org/pypi/zc.buildout takes about 1/3 > second. Ok. By "ATM", you mean July 9, 14:09 GMT? Please take a look at http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html That was the most significant spike in the load today, and I surely would like to know what was causing it. > Requests for > http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg > take about 2.5 seconds. That is a static file, not going through PyPI. It's 168kiB, so that means you download with 67kB/s. > Requests for http://www.python.org/pypi/ take > about 10 seconds. Why does that matter for setuptools? Does setuptools ever look at this page? > I would say that these times are too long. Which of these precisely? 
Given that the actual file downloads in 2.5s, why is it important that the access to the page referring to it is 1/3s? >> and how long should it >> take so that you would consider it fast enough? > > IMO, it needs to be much much faster. If we were serving pages > staticially, we would be able to serve thousands of requests per > second. There's nothing about this application that would make doing > that hard. I looked at the load preceding your message. Counting 1000 requests backwards from 14:09, we are at 16:07. So this system receives roughly 1000 requests per minute in its peak load, and it seems to be able to handle them (although the performance degrades at that point). Of these requests, 853 came from a single machine (x.y.237.218), which appears to be an extraordinarily "big" client of PyPI. 45 requests came from msnbot, 13 from Google, 44 requests from setuptools (from different machines), and the rest from various web browsers and crawlers. Also, there is a significant difference between throughput and latency: 1000 requests per second is a throughput requirement, whereas "faster than 0.3s" is a latency requirement. They are somewhat unrelated, see below. >> It's difficult to implement a system if the requirements are >> unknown to those implementing it. > > I'm sorry, I've been talking about setuptools all along. I thought the > use case was understood. I understand the use case, I just don't understand the performance requirements resulting out of it. If it's an automated build, why do you care if the page download completes in 0.3s or in 0.01s (it won't be much faster because of network roundtrip times). > Also, I thought it was pretty obvious that the > performance we've been seeing lately is totally unacceptable. Define "lately". I never personally saw "totally unacceptable performance". Whenever I access the system, it behaves completely reasonable, much faster than any other web pages. 
There were only two instances of "totally unacceptable performance", which were when the system was overloaded, and thrashing. I have since fixed these cases; they cannot occur again. So I don't think it is possible that the current installation shows "totally unacceptable" performance. > If this was an application that had to be served dynamically (and of > course, parts of it are), then it would be much more interesting to > discuss targets for dynamic delivery. The performance-critical parts of > this application -- the pages that setuptools uses, can readily be > served statically, so it makes no sense not to do so. Except that somebody needs to implement that, of course. Regards, Martin From martin at v.loewis.de Tue Jul 10 00:19:58 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 00:19:58 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> Message-ID: <4692B48E.705@v.loewis.de> > These are the commands so far: > python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html > python pypi-static-generation.py -create_all That also needs -create-single /pypi/pywin32/210 Regards, Martin From martin at v.loewis.de Tue Jul 10 00:37:51 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 00:37:51 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com> <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com> <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com> <20070709164232.95EED3A404D@sparrow.telecommunity.com> <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com> Message-ID: <4692B8BF.60203@v.loewis.de> > So no race condition. What Phillip says: "the update code" has a race condition, if multiple simultaneous updates occur. My proposal is still to put a table into Postgres that lists the pages to regenerate. The (single) update process would lock this job table, clear it, release the lock, and start generating; alternatively, multiple update process would each lock the table, generate, then release the lock. Regards, Martin From pje at telecommunity.com Tue Jul 10 02:34:26 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 09 Jul 2007 20:34:26 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <4692B3A3.5030209@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> Message-ID: <20070710003214.A2EA83A404D@sparrow.telecommunity.com> At 12:16 AM 7/10/2007 +0200, Martin v. L?wis wrote: > > Requests for http://www.python.org/pypi/ take > > about 10 seconds. > >Why does that matter for setuptools? Does setuptools ever look at this >page? Yes, in order to find the correct spelling for a package's name. 
If a user types, say, "pylons" when the package is listed on PyPI as "Pylons", setuptools looks at the root after the lookup of /pypi/pylons fails.

This need could be eliminated if PyPI would canonicalize package names case-insensitively, collapsing all non-alphanumeric characters (other than '.') to a single '-'. i.e.:

    import re

    def safe_name(name):
        """Convert an arbitrary string to a standard distribution name

        Any runs of non-alphanumeric/. characters are replaced with a single '-'.
        """
        return re.sub('[^A-Za-z0-9.]+', '-', name)

A case-insensitive match by safe_name would be ideal, and could also be used to prevent users from registering packages whose names differ only by case or punctuation.

From martin at v.loewis.de Tue Jul 10 07:33:46 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 07:33:46 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
Message-ID: <46931A3A.5000703@v.loewis.de>

> Yes, in order to find the correct spelling for a package's name. If a
> user types, say "pylons" when the package is listed on PyPI as "Pylons",
> setuptools looks at the root after the lookup of /pypi/pylons fails.

I don't understand. How does it help to look at /pypi in this case? The right spelling of Pylons is not listed there, unless there was a release of Pylons recently.
If you want to correct the spelling, you need to look at

http://cheeseshop.python.org/pypi?%3Aaction=index

> A case-insensitive match by safe_name would be ideal, and could also be
> used to prevent users from registering packages whose names differ only
> by case or punctuation.

Would it be acceptable to do an HTTP redirect in that case, i.e. redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5? I would not want to have multiple URLs to render the same page, in general (I know it already does that in some cases).

I can see how lower-casing helps; I'm doubtful about replacing spaces. I.e. why is it better to look for

python-ftp-server-library--pyftpdlib-

than

Python FTP server library (pyftpdlib)

IOW, if you have a misspelling of the latter, what are the chances that it is so misspelled that the safe_name is still the former? Shouldn't the package owner just correct the package name, to pyftpdlib, and put the other string into the summary?

In any case, if it were postgres 8.1 or later, I could simply do

    select name from packages where
    regexp_replace(lower(name),'[^a-z0-9.]','-')='gnosis-utilities';

to do the lookup; with 7.4, I would have to download all names and do the safe matching myself.
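
A minimal sketch of that client-side fallback, assuming the full list of package names has already been fetched from the database into a Python list. The helper name find_canonical_names and the sample names are hypothetical; safe_name is the function quoted earlier in the thread:

```python
import re

def safe_name(name):
    """Collapse runs of characters other than letters, digits and '.' to a single '-'."""
    return re.sub('[^A-Za-z0-9.]+', '-', name)

def find_canonical_names(requested, all_names):
    """Return the registered names whose lower-cased safe_name equals that of `requested`.

    Hypothetical helper: `all_names` stands in for the result of
    "select name from packages".
    """
    key = safe_name(requested).lower()
    return [n for n in all_names if safe_name(n).lower() == key]

# Sample data for illustration only.
names = ["Pylons", "Gnosis Utilities", "zc.buildout"]
print(find_canonical_names("gnosis-utilities", names))
print(find_canonical_names("pylons", names))
```

With only a few thousand package names, a linear scan like this is cheap; more than one result would indicate two registered names that differ only by case or punctuation.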
Regards,
Martin

From martin at v.loewis.de Tue Jul 10 08:07:15 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 08:07:15 +0200
Subject: [Catalog-sig] Speeding up /pypi
Message-ID: <46932213.6050508@v.loewis.de>

I created a partial index (didn't know such a thing existed until yesterday) to speed up the computation of the home page:

    CREATE INDEX journals_latest_releases ON
    journals(submitted_date, name, version)
    WHERE version IS NOT NULL AND action='new release';

and reworked the query to let postgres actually use that index; now I can get the Cheeseshop home page as fast as that of www.python.org (namely, in 0.1s), as measured by

    start=time.time();x=urllib.urlopen("http://cheeseshop.python.org/pypi").read();print time.time()-start

Regards,
Martin

From renesd at gmail.com Tue Jul 10 10:49:10 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 18:49:10 +1000
Subject: [Catalog-sig] Speeding up /pypi
In-Reply-To: <46932213.6050508@v.loewis.de>
References: <46932213.6050508@v.loewis.de>
Message-ID: <64ddb72c0707100149k46782c66m214b184447ab667b@mail.gmail.com>

nice one :)

On 7/10/07, "Martin v. Löwis" wrote:
> I created a partial index (didn't know such a thing existed until
> yesterday) to speed up the computation of the home page:
>
> CREATE INDEX journals_latest_releases ON
> journals(submitted_date, name, version)
> WHERE version IS NOT NULL AND action='new release';
>
> and reworked the query to let postgres actually use that index;
> now I can get the Cheeseshop home page as fast as that of
> www.python.org (namely, in 0.1s), as measured by
>
> start=time.time();x=urllib.urlopen("http://cheeseshop.python.org/pypi").read();print time.time()-start
>
> Regards,
> Martin

From jim at zope.com Tue Jul 10 15:52:42 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 09:52:42 -0400
Subject: [Catalog-sig] Merge catalog and distutils sigs
Message-ID: <6C0A5EEC-7E01-4C25-BC09-E0B595C8109A@zope.com>

Is there a good reason for the distutils and catalog sigs to be separate? Now that PyPI is an integral part of the distribution system, I find most topics are really of interest to both sigs, and I bet that the overlap between the sigs is significant.

Would anyone object to combining them?

Jim

--
Jim Fulton          mailto:jim at zope.com    Python Powered!
CTO                 (540) 361-1714           http://www.python.org
Zope Corporation    http://www.zope.com      http://www.zope.org

From pje at telecommunity.com Tue Jul 10 16:15:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 10:15:05 -0400
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <46931A3A.5000703@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> Message-ID: <20070710141304.BC6903A40A4@sparrow.telecommunity.com> At 07:33 AM 7/10/2007 +0200, Martin v. L?wis wrote: > > Yes, in order to find the correct spelling for a package's name. If a > > user types, say "pylons" when the package is listed on PyPI as "Pylons", > > setuptools looks at the root after the lookup of /pypi/pylons fails. > >I don't understand. How does it help to look at /pypi in this case? It doesn't. It looks at /pypi/ (note the trailing /) -- which lists all packages. >The right spelling of Pylons is not listed there, unless there was >a release of Pylons recently. > >If you want to correct the spelling, you need to look at > >http://cheeseshop.python.org/pypi?%3Aaction=index Which is also spelled /pypi/ - the advantage of this is that a purely static index consisting of Apache directory indexes produces an equally useful result for setuptools. > > A case-insensitive match by safe_name would be ideal, and could also be > > used to prevent users from registering packages whose names differ only > > by case or punctuation. > >Would it be acceptable to do an HTTP redirect in that case, ie. >redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5? Yes, although setuptoools at the moment looks at /pypi/pylons/ (again, with a trailing /) and does not go to individual version pages unless the base page contains only links to individual version pages. 
It will handle a redirect correctly, as far as interpreting relative links on result pages.

> I would not
>want to have multiple URLs to render the same page, in general
>(I know it already does that in some cases).
>
>I can see how lower-casing helps; I'm doubtful about replacing
>spaces. I.e. why is it better to look for
>
>python-ftp-server-library--pyftpdlib-

That '--' would actually just be one '-'

>than
>
>Python FTP server library (pyftpdlib)

It's not much better; however, there are a lot of packages with shorter names for which it does help. Mainly, though, setuptools just uses this for purposes of determining distribution filenames.

>IOW, if you have a mis-spelling of the latter, what are the
>chances that it is so misspelled that the safe_name is still
>the former? Shouldn't the package owner just correct the
>package name, to pyftpdlib, and put the other string into
>the summary?
>
>In any case, if it were postgres 8.1 or later, I could simply do
>
>select name from packages where
>regexp_replace(lower(name),'[^a-z0-9.]','-')='gnosis-utilities';
>
>to do the lookup; with 7.4, I would have to download all names
>and do the safe matching myself.

I think this will work instead:

    select name from packages where name ~* 'gnosis[^a-z0-9.]+utilities'

i.e., replace all '-' in the safe_name() with the appropriate regex. '~*' is the case-insensitive regular expression match operator, according to:

http://www.postgresql.org/docs/7.4/interactive/functions-matching.html

Of course, it may also suffice to do:

    select lower(name) from packages where name like 'gnosis_%utilities'

i.e. replace all '-' in the safe_name with '_%', which is sort of like '.+' in a regex. You would still have to postprocess the result to catch the difference between, say, "gnosis-utilities" and "gnosis3utilities" or some such, but there should be very few such matches. The "like" query may be easier for postgres to use an index on - an expression index on lower(name) would do the trick.
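
To make the "replace all '-' with the appropriate regex" step concrete, here is a small sketch of how the pattern could be built on the Python side before it is handed to the '~*' operator. The function name lookup_pattern is hypothetical, and the '^'/'$' anchors and the escaping of '.' are additions assumed here so that only whole names match; they are not part of the query shown above. Python's re module is used only as a stand-in to exercise the pattern:

```python
import re

def safe_name(name):
    """Collapse runs of characters other than letters, digits and '.' to a single '-'."""
    return re.sub('[^A-Za-z0-9.]+', '-', name)

def lookup_pattern(requested):
    """Build a regex for a case-insensitive match against registered names.

    Hypothetical helper: each '-' in the safe_name becomes '[^a-z0-9.]+',
    and the remaining pieces are escaped so that '.' matches literally.
    """
    parts = safe_name(requested).lower().split('-')
    return '^' + '[^a-z0-9.]+'.join(re.escape(p) for p in parts) + '$'

pattern = lookup_pattern("gnosis-utilities")
print(pattern)
print(bool(re.match(pattern, "Gnosis Utilities", re.I)))
print(bool(re.match(pattern, "gnosis3utilities", re.I)))
```

Because a digit is excluded by the character class, the "gnosis3utilities" case needs no postprocessing with this form, unlike the "like" variant.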
Of course, I'm used to trying to optimize much larger databases than PyPI - with only a few thousand entries, a non-index query here may be just fine. In any case, this query should also be used to check for uniqueness when adding packages. From jim at zope.com Tue Jul 10 16:32:10 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 10:32:10 -0400 Subject: [Catalog-sig] Why so many zc.buildout versions? In-Reply-To: <46937F10.3070201@weitershausen.de> References: <46937F10.3070201@weitershausen.de> Message-ID: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> You raise a really good point, which is especially relevant in light of pypi performance issues and discussions. I'm copying the distutils and catalog sigs to get some wider discussion. I apologize for the cross posting. I'm beginning to wonder about the strategy that setuptools uses, or maybe about the way we are using the index. It's important to note that there is nothing specific about the buildout package here. It is very important to make multiple versions available to support requirements for specific package versions. It make builds/installs repeatable, whether talking about buildout or other systems built on setuptools. When someone has tested and wants to release an application built from a collection of distributions, they will want to specify those *specific* versions for future builds or installs. This means that we need to retain any versions published indefinitely in a way that can be found by setuptools. Currently, the only way to support multiple versions with the cheeseshop is to unhide past releases. This has a fairly severe effect on performance. As the example below shows, setuptools will fetch the package page and then fetch the pages for each release. That's a lot of requests. What makes it worse is that the individual package pages can be fairly long. I've gotten in the habit of including full documentation on every release page. 
For example, recent release pages for zc.buildout are around 200K. This is a fairly significant amount of data to transfer. This will certainly make the scanning process take a long time for clients. (Obviously, if we keep doing things the way we are, I'll need to stop doing that.) All of this aggravates any performance problems we might have. Up to now, setuptools has tried hard to use existing systems without change. This means that it reuses systems designed primarily for people, not software. I think that setuptools rightly took the approach it has up to now so that progress could be made without making people change other systems. This was appropriate when setuptools was evolving and people were figuring out ways to use it. I think it is time to take a step back and think a lot harder about how we'd want to structure an index to support setuptools. IMO, a setuptools-aware index would have a single page for each package: - The single page would be published in a case-insensitive way. It would be nice to find a way to avoid this, or maybe we should use a windows-based web server. :) It would also be served very cheaply, for example statically. - The single page would list links for all available distributions, which should include all distributions published. It would also list any other URLs that should be scanned for releases, when releases aren't all uploaded to PyPI. - The single page would contain very little additional information. It would be for use by software, not humans. In addition, the root page with a trailing / would be empty and very cheap. There are a lot of ways we could achieve this pretty cheaply while keeping the existing system pretty much as it is. For example, the current effort to bake static pages could bake these pages instead. We could make the new index available at a different URL for people to play with while we worked the kinks out of the process. 
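
As a rough illustration of the single per-package page described above, a generator for the baked pages could look something like the following. The function name, the HTML layout, and the link text are assumptions made for this sketch, not an agreed format; the example URL is the zc.buildout egg mentioned earlier in the thread:

```python
from html import escape

def render_package_page(project, links):
    """Render a minimal, machine-oriented page for one project.

    `links` is a list of (url, text) pairs: uploaded distributions plus any
    external URLs that should be scanned for releases. Hypothetical layout.
    """
    lines = ["<html><head><title>Links for %s</title></head><body>" % escape(project)]
    for url, text in links:
        lines.append('<a href="%s">%s</a><br/>' % (escape(url, quote=True), escape(text)))
    lines.append("</body></html>")
    return "\n".join(lines)

page = render_package_page(
    "zc.buildout",
    [
        ("http://cheeseshop.python.org/packages/2.5/z/zc.buildout/"
         "zc.buildout-1.0.0b28-py2.5.egg",
         "zc.buildout-1.0.0b28-py2.5.egg"),
        ("http://svn.zope.org/zc.buildout", "zc.buildout external index"),
    ],
)
print(page)
```

A page like this carries no long description, so its size stays proportional to the number of releases rather than to the documentation on each release page.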
Of course, those of us who use the cheesehop and setuptools extensively can also achieve much of this by changing the way we work. Thoughts? Jim On Jul 10, 2007, at 8:44 AM, Philipp von Weitershausen wrote: > When easy_installing zc.buildout I realized that the CheeseShop > still lists a gazillion old versions of zc.buildout. That makes it > take quite some time to install zc.buildout (see below), and I > reckon the same sort of check has to happen each time it looks for > a new version of that egg... > > Is there any reason for having so many old versions around? > > > $ easy_install zc.buildout > Searching for zc.buildout > Reading http://cheeseshop.python.org/pypi/zc.buildout/ > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19 > Reading http://svn.zope.org/zc.buildout > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18 > Best match: zc.buildout 1.0.0b28 > ... -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Tue Jul 10 16:40:48 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 10:40:48 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <4692B3A3.5030209@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> Message-ID: <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> On Jul 9, 2007, at 6:16 PM, Martin v. L?wis wrote: ... > Ok. By "ATM", you mean July 9, 14:09 GMT? Whenever I sent the note, > Please take a look at > > http://ximinez.python.org/munin/localdomain/localhost.localdomain- > load.html > > That was the most significant spike in the load today, and I surely > would like to know what was causing it. Maybe someone was trying to mirror pypi because it is too slow. :/ I suspect that there is a lot of this going on. > >> Requests for >> http://cheeseshop.python.org/packages/2.5/z/zc.buildout/ >> zc.buildout-1.0.0b28-py2.5.egg >> take about 2.5 seconds. > > That is a static file, not going through PyPI. It's 168kiB, so that > means you download with 67kB/s. OK. So I guess that is reasonable. I'll note that in the long term, we'll probably want to create mirrors to get better locality and this faster downloads and to prevent excessive bandwith consumption for python.org. > >> Requests for http://www.python.org/pypi/ take >> about 10 seconds. > > Why does that matter for setuptools? Does setuptools ever look at this > page? Phillip answered this. >> I would say that these times are too long. > > Which of these precisely? Given that the actual file downloads in > 2.5s, > why is it important that the access to the page referring to it is > 1/3s? I guess all of them except the download. Really, in the long run, I think the download time is too long too. But that isn't my immediate concern. 
BTW, the problem is exacerbated by packages like zc.buildout that include full documentation in their package pages. Although even packages that don't do that seem to take about a third of a second.

>>> and how long should it
>>> take so that you would consider it fast enough?
>>
>> IMO, it needs to be much much faster. If we were serving pages
>> statically, we would be able to serve thousands of requests per
>> second. There's nothing about this application that would make doing
>> that hard.
>
> I looked at the load preceding your message. Counting 1000 requests
> backwards from 14:09, we are at 16:07. So this system receives roughly
> 1000 requests per minute in its peak load, and it seems to be able to
> handle them (although the performance degrades at that point).

You can expect one of 2 things to happen:

- We'll fix the PyPI performance problems and load will increase dramatically, or

- We won't fix the problems and people will create alternate indexes. This is already happening. If that happens, the load will likely still increase, although not as rapidly.

...

>>> It's difficult to implement a system if the requirements are
>>> unknown to those implementing it.
>>
>> I'm sorry, I've been talking about setuptools all along. I thought the
>> use case was understood.
>
> I understand the use case, I just don't understand the performance
> requirements resulting out of it. If it's an automated build, why do
> you care if the page download completes in 0.3s or in 0.01s (it won't
> be much faster because of network roundtrip times).

Two reasons:

- People wait for these builds. A build will usually make *many* (tens or hundreds) of requests for pypi checking for new versions of software. If there are no new versions, which will be the common case, then nothing will be downloaded. I'm most interested in speeding up the checking.
Of course, a request for http://www.python.org/pypi/ will usually be done once per build if any of the packages in the build aren't in pypi (only once because setuptools caches pages internally). It would be nice to find a way to stop doing this.

- If performance degrades, as it has often lately, then the times are much longer. In fact, requests over the last few weeks have often timed out, making work grind to a halt.

It's important to realize that demand will increase substantially, so whatever we do needs to be scalable.

>> Also, I thought it was pretty obvious that the
>> performance we've been seeing lately is totally unacceptable.
>
> Define "lately". I never personally saw "totally unacceptable
> performance". Whenever I access the system, it behaves completely
> reasonable, much faster than any other web pages.

I've seen requests take minutes and time out with proxy errors many times over the last few weeks. We, ZC, and many people we work with are at the point of building private indexes to get around the horrible performance.

> There were only two instances of "totally unacceptable performance",
> which were when the system was overloaded, and thrashing. I have
> since fixed these cases; they cannot occur again. So I don't think
> it is possible that the current installation shows "totally
> unacceptable" performance.

Maybe others can chime in.

>> If this was an application that had to be served dynamically (and of
>> course, parts of it are), then it would be much more interesting to
>> discuss targets for dynamic delivery. The performance-critical parts of
>> this application -- the pages that setuptools uses, can readily be
>> served statically, so it makes no sense not to do so.
>
> Except that somebody needs to implement that, of course.

And happily, someone is.
I've realized this morning, in responding to a note from Philipp von Weitershausen that we really should take a step back and think about an index to support setuptools, or, failing that, rethink the ways we're using PyPI in light of the way setuptools works. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Tue Jul 10 17:56:42 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 10 Jul 2007 11:56:42 -0400 Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions? In-Reply-To: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> Message-ID: <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com> At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote: >Currently, the only way to support multiple versions with the >cheeseshop is to unhide past releases. This has a fairly severe >effect on performance. As the example below shows, setuptools will >fetch the package page and then fetch the pages for each release. >That's a lot of requests. This could potentially be fixed in setuptools, so that it only looks at release pages that match its requirements, in highest-to-lowest version order, stopping as soon as a suitable match is found. That would eliminate the current issue -- but only for new versions of setuptools. So I do like your idea better, since it can be made to work for already-deployed clients as well. >I think it is time to take a step back and think a lot harder about >how we'd want to structure an index to support setuptools. +1, as long as somebody's willing to build and host the thing. Please see my earlier comments on the Catalog-Sig about this. >IMO, a setuptools-aware index would have a single page for each package: > >- The single page would be published in a case-insensitive way. 
It >would be nice to find a way to avoid this, or maybe we should use a >windows-based web server. :) It would also be served very cheaply, >for example statically. Apache's CheckSpelling directive does case-insensitivity and approximate matching. Combine that with making the directories be based on "safe_name" values to begin with, and you should be all set. >- The single page would list links for all available distributions, >which should include all distributions published. It would also list >any other URLs that should be scanned for releases, when releases >aren't all uploaded to PyPI. The piece you're missing here is direct links to other downloads, such as "#egg=project-dev" subversion links. However, if you extracted these from all of the relevant PyPI HTML pages, you could certainly do that. >In addition, the root page with a trailing / would be empty and very >cheap. As long as the individual package directories are safe_name based, this would work. >There are a lot of ways we could achieve this pretty cheaply while >keeping the existing system pretty much as it is. Of course, there are still other reasons to want to improve the Cheeseshop's performance, such as search engines and other bots. >For example, the current effort to bake static pages could bake these >pages instead. We could make the new index available at a different >URL for people to play with while we worked the kinks out of the >process. ...and then use a User-Agent rewrite rule to redirect setuptools clients to the static piece, as soon as we're satisfied that it works. From jim at zope.com Tue Jul 10 18:04:01 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 12:04:01 -0400 Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions? 
In-Reply-To: <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com> Message-ID: On Jul 10, 2007, at 11:56 AM, Phillip J. Eby wrote: > At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote: >> Currently, the only way to support multiple versions with the >> cheeseshop is to unhide past releases. This has a fairly severe >> effect on performance. As the example below shows, setuptools will >> fetch the package page and then fetch the pages for each release. >> That's a lot of requests. > > This could potentially be fixed in setuptools, so that it only > looks at release pages that match its requirements, in highest-to- > lowest version order, stopping as soon as a suitable match is > found. That would eliminate the current issue No, it will mitigate the current issue somewhat, but it will still involve multiple requests per package, while a simpler index structure would allow a single request per package. > -- but only for new versions of setuptools. So I do like your idea > better, since it can be made to work for already-deployed clients > as well. Yup. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Tue Jul 10 23:29:14 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 23:29:14 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <20070710141304.BC6903A40A4@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> Message-ID: <4693FA2A.3020107@v.loewis.de> > It doesn't. It looks at /pypi/ (note the trailing /) -- which lists all > packages. Ah, ok. I keep forgetting that feature. >> Would it be acceptable to do an HTTP redirect in that case, ie. >> redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5? > > Yes, although setuptoools at the moment looks at /pypi/pylons/ (again, > with a trailing /) and does not go to individual version pages unless > the base page contains only links to individual version pages. Right - I meant that to mean that it would redirect /pypi/Pylons/ to /pypi/pylons/ > I think this will work instead: > > select name from packages where name ~* 'gnosis[^a-z0-9.]+utilities' Ok. I was hoping to be able to create an index of safe_names, which postgres would automatically maintain on updates; the above approach would always cause a sequential scan (in postgres, not in Python). Your second approach (using like) might solve that, but there I dislike having the logic both in Python and in SQL - ideally, only one of them should do "real" computation (and ideally, it would be SQL). 
On ximinez, your query gets analyzed as Seq Scan on packages (cost=0.00..46.65 rows=1 width=13) (actual time=0.461..9.367 rows=1 loops=1) Filter: (name ~* 'gnosis[^a-z0-9.]+utilities'::text) Total runtime: 9.413 ms Compared to some other queries it performs, that's a cheap one. > In any case, this query should also be used to check for uniqueness when > adding packages. Hmm. I'm somewhat skeptical about setuptools (or any other packaging infrastructure, say, Debian) establishing rules on what makes a difference in package names. Regards, Martin From martin at v.loewis.de Tue Jul 10 23:36:28 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 23:36:28 +0200 Subject: [Catalog-sig] Why so many zc.buildout versions? In-Reply-To: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> Message-ID: <4693FBDC.2060201@v.loewis.de> > For example, the current effort to bake static pages could bake these > pages instead. Certainly not instead; in addition, if there are volunteers to implement that. > We could make the new index available at a different > URL for people to play with while we worked the kinks out of the > process. I have been thinking about the same thing. I think it would be good to have, however, it will surely take some time until all setuptools implementations learn to use it. > Of course, those of us who use the cheeseshop and setuptools > extensively can also achieve much of this by changing the way we work. Hmm. How about those using them extensively start contributing to them also? Regards, Martin From martin at v.loewis.de Tue Jul 10 23:39:28 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 23:39:28 +0200 Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com> Message-ID: <4693FC90.9060001@v.loewis.de> > No, it will mitigate the current issue somewhat, but it will still > involve multiple requests per package, while a simpler index > structure would allow a single request per package. I don't understand. If setuptools would always look /pypi/package/version first, it would immediately find the right page if that version is indeed stored in the cheeseshop. Why would that require multiple requests per package? Regards, Martin From martin at v.loewis.de Tue Jul 10 23:48:04 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Jul 2007 23:48:04 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> Message-ID: <4693FE94.6090107@v.loewis.de> >> That was the most significant spike in the load today, and I surely >> would like to know what was causing it. > > Maybe someone was trying to mirror pypi because it is too slow. :/ I > suspect that there is a lot of this going on. In that case, I doubt it. The top client identified itself as setuptools. > I've seen requests take minutes and time out with proxy errors many > times over the last few weeks. 
We, ZC, and many people we work with are > at the point of building private indexes to get around the horrible > performance. I still don't understand why you consider this an easier option than contributing to the existing project. If you invest time to do an alternative, isn't this more expensive than starting where others have already contributed? But if you think that scratches your itches: good luck! > Maybe others can chime in. That's also my concern. Nobody else is complaining; AFAICT, there is just one unhappy user of PyPI. Regards, Martin From jim at zope.com Tue Jul 10 23:49:43 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 17:49:43 -0400 Subject: [Catalog-sig] Why so many zc.buildout versions? In-Reply-To: <4693FBDC.2060201@v.loewis.de> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <4693FBDC.2060201@v.loewis.de> Message-ID: <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com> On Jul 10, 2007, at 5:36 PM, Martin v. Löwis wrote: >> For example, the current effort to bake static pages could bake these >> pages instead. > > Certainly not instead; in addition, if there are volunteers to > implement > that. Sure, > >> We could make the new index available at a different >> URL for people to play with while we worked the kinks out of the >> process. > > I have been thinking about the same thing. I think it would be good > to have, however, it will surely take some time until all setuptools > implementations learn to use it. No, not at all. You can tell setuptools to use a different index than the current one. For example, this is a command-line option for easy_install and a configuration option for buildout. >> Of course, those of us who use the cheeseshop and setuptools >> extensively can also achieve much of this by changing the way we >> work. > > Hmm. How about those using them extensively start contributing to > them also? I like to think that I am by participating in this discussion.
Actually changing the cheeseshop software has a very high learning curve. I don't think that I can make that kind of time any time soon. I'm very grateful that you and René are doing what you're doing. I also suspect that, given your and René's activity, it would be counterproductive for someone else to get involved at that level, but maybe I'm wrong about that. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Tue Jul 10 23:55:28 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 17:55:28 -0400 Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions? In-Reply-To: <4693FC90.9060001@v.loewis.de> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com> <4693FC90.9060001@v.loewis.de> Message-ID: On Jul 10, 2007, at 5:39 PM, Martin v. Löwis wrote: >> No, it will mitigate the current issue somewhat, but it will still >> involve multiple requests per package, while a simpler index >> structure would allow a single request per package. > > I don't understand. If setuptools would always look > /pypi/package/version first, it would immediately find the right > page if that version is indeed stored in the cheeseshop. > > Why would that require multiple requests per package? It usually doesn't have a single required version. It usually has just a package name or a name and a range of versions. It has to scan the package page to find out what versions are available, and *then* it can load the release page for the highest version that satisfies the requirement. It can usually read that one page, however, there may be additional filtering needed that would cause it to search multiple releases. For example, it might be looking for a source distribution, or a platform-specific distribution that isn't available for the most recent release.
In any case, the best case is that it has to scan the package page to find the most recent release, and then scan that release page. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Jul 11 00:18:00 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 10 Jul 2007 18:18:00 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <4693FA2A.3020107@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> Message-ID: <20070710221547.4A3043A40A4@sparrow.telecommunity.com> At 11:29 PM 7/10/2007 +0200, Martin v. Löwis wrote: >Hmm. I'm somewhat skeptical about setuptools (or any other packaging >infrastructure, say, Debian) establishing rules on what makes a >difference in package names. I can certainly understand that. However, *having* SOME definition that's more human-friendly (and cross-platform filename friendly!) than "the bytes match exactly", would be very useful to have. If PyPI had already had one (and I asked about this when I was first trying to establish one) I'd have used that, or negotiated a compromise if it didn't meet the filename-related requirements.
However, none of the times that I asked about this issue on either the catalog-sig or the distutils-sig did anyone propose any alternative canonicalization, nor bring up any objection besides the general sort of reservation that you're expressing here - i.e., not sure it's a good idea, but not expressing any particular reason it's a bad idea. Note that Windows (and Mac OS under certain circumstances) have filename case insensitivity, and have different restrictions about what can or can't be in a filename than Unix. Spaces and other punctuation characters can cause problems for shells, even if they're theoretically acceptable as filenames. If you'd like to propose a *different* canonicalization, however, I'm certainly willing to consider implementing it in setuptools, if it can be done. However, as I said, nobody has proposed anything else, but it would be nice to resolve the issue *before* name collisions happen. If anything, I think that PyPI canonicalization may wish to be *more* restrictive than setuptools' is. There isn't a whole lot of user benefit to having, say, "Mike's Nifty module" and "Mikes Nifty Module" being considered distinct packages, even though setuptools actually allows that distinction to be made. IOW, setuptools' focus is more on distribution filename safety, rather than on sensible naming distinctions for end users. The former is less restrictive than the latter, I believe. From jim at zope.com Wed Jul 11 00:54:10 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 18:54:10 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <4693FE94.6090107@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> Message-ID: On Jul 10, 2007, at 5:48 PM, Martin v. Löwis wrote: >> I've seen requests take minutes and time out with proxy errors many >> times over the last few weeks. We, ZC, and many people we work >> with are >> at the point of building private indexes to get around the horrible >> performance. > > I still don't understand why you consider this an easier option than > contributing to the existing project. I don't. I'm not advocating it. In fact, I've been trying to convince people not to. People are doing it, usually in limited ways, out of desperation. ... >> Maybe others can chime in. > > That's also my concern. Nobody else is complaining; AFAICT, there > is just one unhappy user of PyPI. Oh come on, I'm not the only one who has posted messages on this mailing list over the last few weeks reporting problems. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 11 00:55:57 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 18:55:57 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <4693FA2A.3020107@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> Message-ID: <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> On Jul 10, 2007, at 5:29 PM, Martin v. Löwis wrote: ... > Hmm. I'm somewhat skeptical about setuptools (or any other packaging > infrastructure, say, Debian) establishing rules on what makes a > difference in package names. Why? It certainly seems reasonable to me for a packaging system to define rules for package names. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 11 01:03:09 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 10 Jul 2007 19:03:09 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <20070710221547.4A3043A40A4@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> Message-ID: On Jul 10, 2007, at 6:18 PM, Phillip J. Eby wrote: > At 11:29 PM 7/10/2007 +0200, Martin v. Löwis wrote: >> Hmm. I'm somewhat skeptical about setuptools (or any other packaging >> infrastructure, say, Debian) establishing rules on what makes a >> difference in package names. > > I can certainly understand that. However, *having* SOME definition > that's more human-friendly (and cross-platform filename friendly!) > than "the bytes match exactly", would be very useful to have. > > If PyPI had already had one (and I asked about this when I was > first trying to establish one) I'd have used that, or negotiated a > compromise if it didn't meet the filename-related requirements. > > However, none of the times that I asked about this issue on either > the catalog-sig or the distutils-sig did anyone propose any > alternative canonicalization, nor bring up any objection besides > the general sort of reservation that you're expressing here - i.e., > not sure it's a good idea, but not expressing any particular reason > it's a bad idea. I think it is time we (the Python community) nailed this down. Perhaps a distribution project-name naming PEP is in order.
> > Note that Windows (and Mac OS under certain circumstances) have > filename case insensitivity, and have different restrictions about > what can or can't be in a filename than Unix. Spaces and other > punctuation characters can cause problems for shells, even if > they're theoretically acceptable as filenames. Why should this imply case insensitivity of distribution project names? Python has case sensitive module (including package) names that can lead to problems if two modules have names that differ only in case. (I assume that Python 3000 retains this although, sadly, I don't know.) We deal with this by telling people "don't do that." Two packages with the same name except for case are incompatible, but then, so are modules with incompatible dependencies. > If you'd like to propose a *different* canonicalization, however, > I'm certainly willing to consider implementing it in setuptools, if > it can be done. However, as I said, nobody has proposed anything > else, but it would be nice to resolve the issue *before* name > collisions happen. > > If anything, I think that PyPI canonicalization may wish to be > *more* restrictive than setuptools' is. There isn't a whole lot of > user benefit to having, say, "Mike's Nifty module" and "Mikes Nifty > Module" being considered distinct packages, even though setuptools > actually allows that distinction to be made. > > IOW, setuptools' focus is more on distribution filename safety, > rather than on sensible naming distinctions for end users. The > former is less restrictive than the latter, I believe. I don't care much what canonicalization we use, but I agree strongly that we should decide something. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Jul 11 02:12:54 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue, 10 Jul 2007 20:12:54 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> Message-ID: <20070711001040.D6DBF3A404D@sparrow.telecommunity.com> At 07:03 PM 7/10/2007 -0400, Jim Fulton wrote: >Why should this imply case insensitivity of distribution project >names. Python has case sensitive module (including package) names >that can lead to problems if two modules have names that differ only >in case. Module names are identifiers, with an already-restricted character set. Package names are strings, and many people (especially those who enter their PyPI data through the web) assume they can put whatever the heck they want in there. > (I assume that Python 3000 retains this although, sadly, I >don't know.) We deal with this by telling people "don't do that." Right... and PyPI's input validation would be a good place to tell them. :) >Two packages with the same name except for case are incompatible, but >then, so are modules with incompatible dependencies. Compatibility isn't the only concern, it's also about confusion as to which package is which. While one can't legislate away confusion, fixing simple, obvious errors that can and *do* occur in practice (like one package name having one space in it, the other having two!) is a good idea. 
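A registration-time check of the kind described here could look roughly like the following. It is a minimal sketch, not PyPI's validation code: the `canonical` rule (collapse runs of characters other than letters, digits and "." into "-", then fold case) is an assumption chosen for illustration, and the helper names and sample package names are hypothetical.

```python
import re
from collections import defaultdict


def canonical(name):
    """Assumed rule: collapse separator runs into '-' and fold case."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


def find_collisions(names):
    """Group already-registered names that share one canonical form."""
    groups = defaultdict(list)
    for name in names:
        groups[canonical(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


def conflicts_for(new_name, existing):
    """Return the existing names a new registration would be confused with."""
    key = canonical(new_name)
    return [n for n in existing if n != new_name and canonical(n) == key]
```

With this rule a name with one space and the same name with two spaces fall into the same group, so the second registration could be rejected or flagged. Whether case folding belongs in the rule is exactly the open question in this thread.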
One of the things that prompted my search for a canonicalization strategy was my survey of existing CheeseShop packages, which actually included a certain amount of duplication due to changes in case or punctuation at one point. (I believe the specific instances were fixed a long time ago, although I wouldn't rule out the possibility that some still exist.) From srichter at cosmos.phy.tufts.edu Wed Jul 11 02:16:40 2007 From: srichter at cosmos.phy.tufts.edu (Stephan Richter) Date: Tue, 10 Jul 2007 20:16:40 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. Message-ID: <200707102016.40669.srichter@cosmos.phy.tufts.edu> Hi all, Jim Fulton forwarded this exchange to the Zope3-Dev mailing list asking us to comment. > On Jul 10, 2007, at 5:48 PM, Martin v. Löwis wrote: >> That's also my concern. Nobody else is complaining; AFAICT, there >> is just one unhappy user of PyPI. > > Oh come on, I'm not the only one who has posted messages on this > mailing list over the last few weeks reporting problems. I can assure you that I have had trouble with performance several times. One Friday I could not even finish my release, because I could not upload to PyPI or test the release since the packages were not downloaded after 5 hours! Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training From waterbug at pangalactic.us Wed Jul 11 04:55:30 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Tue, 10 Jul 2007 22:55:30 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <4693FE94.6090107@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> Message-ID: <469446A2.9070500@pangalactic.us> Martin v. Löwis wrote: > [Jim Fulton wrote:] >> Maybe others can chime in. > > That's also my concern. Nobody else is complaining; AFAICT, there > is just one unhappy user of PyPI. I'm not happy with PyPI's performance either. Probably many users are like me: I thought it was common knowledge that the performance of PyPI was bad, but I didn't want to complain when it appeared that people were working on improvements. Steve From richardjones at optusnet.com.au Wed Jul 11 06:04:24 2007 From: richardjones at optusnet.com.au (richardjones at optusnet.com.au) Date: Wed, 11 Jul 2007 14:04:24 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. Message-ID: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://mail.python.org/pipermail/catalog-sig/attachments/20070711/c7f5e06a/attachment.pot From martin at v.loewis.de Wed Jul 11 07:16:26 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 07:16:26 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <20070710221547.4A3043A40A4@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> Message-ID: <469467AA.7070409@v.loewis.de> > Note that Windows (and Mac OS under certain circumstances) have filename > case insensitivity, and have different restrictions about what can or > can't be in a filename than Unix. Spaces and other punctuation > characters can cause problems for shells, even if they're theoretically > acceptable as filenames. I can see that collisions should be avoided in advance when it comes to file names. However, the name of a software package is not necessarily a file name, nor is it even related to the name of files inside the package. *Python* package names are the ones that must not conflict. For a packaging tool, the names of the package files must not conflict, either. For the package names in general, issues of file names are only remotely relevant, at first glance. > IOW, setuptools' focus is more on distribution filename safety, rather > than on sensible naming distinctions for end users. The former is less > restrictive than the latter, I believe. Yes. However, it's not clear to me that the infrastructure needs to (or even is able to) enforce sensible naming. Instead, any policing that might be necessary should be done in the community.
If two packages are named too similarly, users will get confused, and eventually one package may disappear, get renamed, get its naming challenged in court, and so on. It's not the job of the package *index* to do that sort of policing. Regards, Martin From martin at v.loewis.de Wed Jul 11 07:19:45 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 07:19:45 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> Message-ID: <46946871.3060100@v.loewis.de> > People are doing it, usually in limited ways, out of desperation. Same question to these people, then (whoever they are): why do you think it's easier to build your own index in desperation, rather than contributing to PyPI? >> That's also my concern. Nobody else is complaining; AFAICT, there >> is just one unhappy user of PyPI. > > Oh come on, I'm not the only one who has posted messages on this mailing > list over the last few weeks reporting problems. Can you kindly refer to four or five such messages in the archives? I must have missed them. Regards, Martin From waterbug at pangalactic.us Wed Jul 11 07:20:05 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Wed, 11 Jul 2007 01:20:05 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> Message-ID: <46946885.8080100@pangalactic.us> richardjones at optusnet.com.au wrote: > Stephen Waterbury wrote: >> Martin v. Löwis wrote: >>> [Jim Fulton wrote:] >>>> Maybe others can chime in. >>> That's also my concern. Nobody else is complaining; AFAICT, there >>> is just one unhappy user of PyPI. >> I'm not happy with PyPI's performance either. >> Probably many users are like me: I thought it was >> common knowledge that the performance of PyPI was bad, but >> I didn't want to complain when it appeared that people were >> working on improvements. > > It has been slow in the past, but Martin has done some great work > speeding it up in the last few days. If it's still slow, please > report when you noticed and what you were trying to do. I agree, Martin's improvements have made a huge difference, in my recent experience. Thanks, Martin! I inferred from the conversation that the performance is variable, and I think my tests of it have been in off-peak times, so my current impressions should be regarded as anecdotal ... another reason why I hadn't volunteered my opinion until this request for input. Steve From martin at v.loewis.de Wed Jul 11 07:21:09 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 07:21:09 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> Message-ID: <469468C5.8000906@v.loewis.de> >> Hmm. I'm somewhat skeptical about setuptools (or any other packaging >> infrastructure, say, Debian) establishing rules on what makes a >> difference in package names. > > Why? It certainly seems reasonable to me for a packaging system to > define rules for package names. Ah, sure. It's certainly fine and reasonable for a packaging system to do that for its own purposes. However, I'm skeptical about that packaging system then enforcing its rules on other systems (such as the cheeseshop, which is not a packaging system). Regards, Martin From martin at v.loewis.de Wed Jul 11 07:28:09 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 07:28:09 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <469446A2.9070500@pangalactic.us> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> Message-ID: <46946A69.4000702@v.loewis.de> > I'm not happy with PyPI's performance either. > Probably many users are like me: I thought it was > common knowledge that the performance of PyPI was bad Please trust me that it isn't. I know that PyPI could become unresponsive, and I FIXED that. AFAICT, it's solved, done, can't happen again. I do not know that performance IS bad; I know that it WAS bad (primarily not due to the way the software was written, but due to the way it was run). > but > I didn't want to complain when it appeared that people were > working on improvements. Sure: mere complaints would not be constructive. However, specific *reports* of problems are absolutely necessary. If you experience problems today, tomorrow, next week, by all means, report them. Different people apparently also have different perceptions of what good performance is, so please always make a full bug report: - what precisely did you do (including "when" also in this case), - what happened, - what did you expect to happen instead Regards, Martin From martin at v.loewis.de Wed Jul 11 07:31:06 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 07:31:06 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <46946885.8080100@pangalactic.us> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46946885.8080100@pangalactic.us> Message-ID: <46946B1A.9040004@v.loewis.de> > I agree, Martin's improvements have made a huge difference, in my > recent experience. Thanks, Martin! I inferred from the conversation > that the performance is variable, and I think my tests of it have been > in off-peak times, so my current impressions should be regarded as > anecdotal ... another reason why I hadn't volunteered my opinion until > this request for input. Ah, ok. Please take a look at http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html Times are in CEST (UTC+2), so the peak load occurred during the times I was asleep - I never personally see any significant load on the system anymore. If you also work in a similar time zone as I do, I would consider your problems solved. Regards, Martin From gentoodev at gmail.com Wed Jul 11 07:54:48 2007 From: gentoodev at gmail.com (Rob Cakebread) Date: Tue, 10 Jul 2007 22:54:48 -0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469467AA.7070409@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> Message-ID: <9b06ffb10707102254i57e5c0f8gf92836805f8a0626@mail.gmail.com> On 7/10/07, "Martin v. L?wis" wrote: > > Yes. However, it's not clear to me that the infrastructure needs to > (or even is able to) enforce sensible naming. Instead, any policing > that might be necessary should be done in the community. 
If two > packages are named too similarly, users will get confused, and > eventually one package may disappear, get renamed, get its naming > challenged in court, and so on. It's not the job of the package > *index* to do that sort of policing. > Every package index I can think of does enforce sensible naming, except PyPI. Nobody is going to change the name of their project if you enforce sensible naming for PyPI, they'll just have to map their project name to a way that is easily mapped to PyPI's system, just like on Freshmeat, RubyForge etc. From martin at v.loewis.de Wed Jul 11 07:06:59 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 07:06:59 +0200 Subject: [Catalog-sig] Why so many zc.buildout versions? In-Reply-To: <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <4693FBDC.2060201@v.loewis.de> <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com> Message-ID: <46946573.2070400@v.loewis.de> >> I have been thinking about the same thing. I think it would be good >> to have, however, it will surely take some time until all setuptools >> implementations learn to use it. > > No, not at all. You can tell setuptools to use a different index than > the current one. For example, this is a command-line option for > easy_install and a configuration option for buildout. Yes. However, that will make the feature only available to those who know about it. I have very shallow knowledge of setuptools and easy_install only (I nearly never use them at all), and I surely would miss such an option, and miss why it's relevant. It's true that the Apache installation could also redirect existing installations to the new pages, but I doubt that they would be otherwise widely used until setuptools changes its hard-coded default. >> Hmm. How about those using them extensively start contributing to >> them also? 
> > I like to think that I am by participating in this discussion. Actually > changing the cheeseshop software has a very high learning curve. I don't > think that I can make that kind of time any time soon. I'm very > grateful that you and Ren? are doing what you're doing. I also suspect > that, given your and Ren?'s activity, it would be counter productive for > someone else to get involved at that level, but maybe I'm wrong about that. I strongly think you are. There are many things that could be improved, and I would not mind leaving the cheeseshop alone if some other maintainer came along - I also have other things to do. Regards, Martin From martin at v.loewis.de Wed Jul 11 07:44:53 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 11 Jul 2007 07:44:53 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707102016.40669.srichter@cosmos.phy.tufts.edu> References: <200707102016.40669.srichter@cosmos.phy.tufts.edu> Message-ID: <46946E55.30308@v.loewis.de> >>> That's also my concern. Nobody else is complaining; AFAICT, there >>> is just one unhappy user of PyPI. >> Oh come on, I'm not the only one who has posted messages on this >> mailing list over the last few weeks reporting problems. > > I can assure you that I have had several times troubles with performance. One > Friday I could not even finish my release, because I could not upload to PyPI > or test the release since the packages were not downloaded after 5 hours! I assume you are talking about past here - I can readily believe that has happened. I think it's fixed now, and it should not happen again that you have to wait 5 hours to download a file (unless there is some hardware failure, network outage or the like beyond the control of the local software). So yes, I trust that there have been complaints in the past - I wonder whether there are *still* complaints (beyond the ones of Jim Fulton). 
Regards, Martin From benji at benjiyork.com Wed Jul 11 14:10:30 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 08:10:30 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46946871.3060100@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> Message-ID: <4694C8B6.1030804@benjiyork.com> Martin v. Löwis wrote: >> People are doing it, usually in limited ways, out of desperation. > > Same question to these people, then (whoever they are): why > do you think it's easier to build your own index in desperation, > rather than contributing to PyPI? Because they aren't aware of the progress being made or the intent to make more? >>> That's also my concern. Nobody else is complaining; AFAICT, there >>> is just one unhappy user of PyPI. >> Oh come on, I'm not the only one who has posted messages on this mailing >> list over the last few weeks reporting problems. > > Can you kindly refer to four or five such messages in the archives? > I must have missed them. Here's one (you didn't say they had to be past messages ). Is your position that PyPI isn't down/very slow on occasion or that when it is no one complains? My team has lost many man hours to PyPI being down/glacially slow. This isn't meant to disparage PyPI though; if it weren't such a great thing it wouldn't be important to us.
-- Benji York http://benjiyork.com From renesd at gmail.com Wed Jul 11 14:20:22 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Wed, 11 Jul 2007 22:20:22 +1000 Subject: [Catalog-sig] Why so many zc.buildout versions? In-Reply-To: <46946573.2070400@v.loewis.de> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <4693FBDC.2060201@v.loewis.de> <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com> <46946573.2070400@v.loewis.de> Message-ID: <64ddb72c0707110520j42bb8f27nb676bcf4de39d14c@mail.gmail.com> I have to say the cheeseshop code was pretty easy to get into. I think I was able to make most of my changes within the first reading of it. It quite clearly separates things like the templates, the database functionality and the 'webui'. There definitely are a huge amount of things that I would love to change with it over time, and I hope other people begin to develop it more - it can only help the python community as a whole. The amount of people doing releases has increased quite a lot even in the last two months, so I think the releases will get more frequent. As it grows it will continue to need different changes - optimizations to the database/webserver, and also optimizations to the user interface. On 7/11/07, "Martin v. L?wis" wrote: > >> I have been thinking about the same thing. I think it would be good > >> to have, however, it will surely take some time until all setuptools > >> implementations learn to use it. > > > > No, not at all. You can tell setuptools to use a different index than > > the current one. For example, this is a command-line option for > > easy_install and a configuration option for buildout. > > Yes. However, that will make the feature only available to those who > know about it. I have very shallow knowledge of setuptools and > easy_install only (I nearly never use them at all), and I surely would > miss such an option, and miss why it's relevant. 
> > It's true that the Apache installation could also redirect existing > installations to the new pages, but I doubt that they would be otherwise > widely used until setuptools changes its hard-coded default. > > >> Hmm. How about those using them extensively start contributing to > >> them also? > > > > I like to think that I am by participating in this discussion. Actually > > changing the cheeseshop software has a very high learning curve. I don't > > think that I can make that kind of time any time soon. I'm very > > grateful that you and Ren? are doing what you're doing. I also suspect > > that, given your and Ren?'s activity, it would be counter productive for > > someone else to get involved at that level, but maybe I'm wrong about that. > > I strongly think you are. There are many things that could be improved, > and I would not mind leaving the cheeseshop alone if some other > maintainer came along - I also have other things to do. > > Regards, > Martin > > > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > From benji at benjiyork.com Wed Jul 11 14:42:16 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 08:42:16 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <469468C5.8000906@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> Message-ID: <4694D028.6050203@benjiyork.com> Martin v. L?wis wrote: >>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging >>> infrastructure, say, Debian) establishing rules on what makes a >>> difference in package names. >> Why? It certainly seems reasonable to me for a packaging system to >> define rules for package names. > > Ah, sure. It's certainly fine and reasonable for a packaging system > to do that for its own purposes. However, I'm skeptical about that > packaging system then to enforce its rules on other systems (such > as the cheeseshop, which is not packaging system). Although it wasn't part of the cheeseshop's original mission, it has become an integral part of distributing Python packages. If it doesn't want to participate in its new-found utility, other options need to be explored. -- Benji York http://benjiyork.com From jim at zope.com Wed Jul 11 14:52:20 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 08:52:20 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <469467AA.7070409@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> Message-ID: <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> On Jul 11, 2007, at 1:16 AM, Martin v. L?wis wrote: ... >> IOW, setuptools' focus is more on distribution filename safety, >> rather >> than on sensible naming distinctions for end users. The former is >> less >> restrictive than the latter, I believe. > > Yes. However, it's not clear to me that the infrastructure needs to > (or even is able to) enforce sensible naming. Instead, any policing > that might be necessary should be done in the community. If two > packages are named too similarly, users will get confused, and > eventually one package may disappear, get renamed, get its naming > challenged in court, and so on. It's not the job of the package > *index* to do that sort of policing. When Phillip designed setuptools, he tried to have a very low impact on lots of systems. He did that very well and that has allowed setuptools to be adopted gradually with very little up front buy in. One of the decisions Phillip made was to not use an installed-package database other than sys.path. When a distribution is installed, the installed file name reflects the package name. 
If you want to know whether a package is installed, you can scan sys.path looking for files or directories that contain/reflect the package name. IMO, this was a very good decision, however, it does have the disadvantage that it may run afoul of system file-naming limitations. Again, I think this was a fair trade off. The questions for us is, how much effort we are willing to make to prevent people from shooting themselves in the foot. I can understand why Phillip would like the package index to prevent people from choosing problematic package names. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 11 14:56:22 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 08:56:22 -0400 Subject: [Catalog-sig] Why so many zc.buildout versions? In-Reply-To: <46946573.2070400@v.loewis.de> References: <46937F10.3070201@weitershausen.de> <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com> <4693FBDC.2060201@v.loewis.de> <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com> <46946573.2070400@v.loewis.de> Message-ID: <7E6E8D05-9669-4765-B61D-254835DDA553@zope.com> On Jul 11, 2007, at 1:06 AM, Martin v. L?wis wrote: >>> I have been thinking about the same thing. I think it would be good >>> to have, however, it will surely take some time until all setuptools >>> implementations learn to use it. >> >> No, not at all. You can tell setuptools to use a different index >> than >> the current one. For example, this is a command-line option for >> easy_install and a configuration option for buildout. > > Yes. However, that will make the feature only available to those who > know about it. I have very shallow knowledge of setuptools and > easy_install only (I nearly never use them at all), and I surely would > miss such an option, and miss why it's relevant. That's fine. I don't care if most people can find it. 
While it is an *experimental* index, it is fine if only a few people play with it. If it is proven to work properly, then we could arrange that other people get it by default. > It's true that the Apache installation could also redirect existing > installations to the new pages, but I doubt that they would be > otherwise > widely used until setuptools changes its hard-coded default. Right, that's why, if the experiment works, we should then change the Apache config to redirect setuptools to it. Changing the Apache config is much easier than updating the setuptools installed base. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 11 15:32:31 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 09:32:31 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46946871.3060100@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> Message-ID: On Jul 11, 2007, at 1:19 AM, Martin v. Löwis wrote: ... >>> That's also my concern. Nobody else is complaining; AFAICT, there >>> is just one unhappy user of PyPI. >> >> Oh come on, I'm not the only one who has posted messages on this >> mailing >> list over the last few weeks reporting problems. > > Can you kindly refer to four or five such messages in the archives? > I must have missed them.
http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html http://mail.python.org/pipermail/catalog-sig/2007-June/001101.html http://mail.python.org/pipermail/catalog-sig/2007-April/001049.html http://mail.python.org/pipermail/catalog-sig/2006-November/000997.html There haven't been a large number of messages. There must not be a problem. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html From jim at zope.com Wed Jul 11 15:34:41 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 09:34:41 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469468C5.8000906@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> Message-ID: On Jul 11, 2007, at 1:21 AM, Martin v. L?wis wrote: >>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging >>> infrastructure, say, Debian) establishing rules on what makes a >>> difference in package names. >> >> Why? It certainly seems reasonable to me for a packaging system to >> define rules for package names. > > Ah, sure. It's certainly fine and reasonable for a packaging system > to do that for its own purposes. 
However, I'm skeptical about that > packaging system then to enforce its rules on other systems (such > as the cheeseshop, which is not packaging system). OK, let's take a step back. IMO, PyPI is a *part* of the packaging system. If we can't agree that that is true, then we need to find a package index that *is* part of the package system. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 11 16:03:33 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 10:03:33 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> Message-ID: <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> On Jul 11, 2007, at 12:04 AM, richardjones at optusnet.com.au wrote: > Stephen Waterbury wrote: >> Martin v. Löwis wrote: >>> [Jim Fulton wrote:] >>>> Maybe others can chime in. >>> >>> That's also my concern. Nobody else is complaining; AFAICT, there >>> is just one unhappy user of PyPI. >> >> I'm not happy with PyPI's performance either. >> Probably many users are like me: I thought it was >> common knowledge that the performance of PyPI was bad, but >> I didn't want to complain when it appeared that people were >> working on improvements. > > It has been slow in the past, but Martin has done some great work > speeding it up in the last few days. Yup. Much thanks Martin! > If it's still slow, please report when you noticed and what you > were trying to do. Let's look at the new-improved times. Right now ~14:00UTC July 11: http://www.python.org/ZODB3 takes about .3 seconds (median)(mean is higher) http://www.python.org/ZODB3/3.8.0b2 also takes about .3 seconds http://www.python.org/pypi/ takes about 6 seconds (median) For the sake of argument, let's ignore http://www.python.org/pypi/.
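A rough sketch of how median figures like the ones above could be gathered, and of how a per-request time scales with the number of index requests. This is only an illustration: the helper names `median_latency` and `estimated_index_time` are made up here, and the sample count of 5 is an arbitrary assumption, not something taken from the measurements in this thread.

```python
import statistics
import time
from typing import Callable, List


def median_latency(fetch: Callable[[], object], samples: int = 5) -> float:
    """Call fetch() `samples` times; return the median wall-clock time in seconds.

    `fetch` is any zero-argument callable (hypothetical helper, so the
    timing logic can be exercised without touching the network).
    """
    timings: List[float] = []
    for _ in range(samples):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def estimated_index_time(per_request_seconds: float, requests: int) -> float:
    """Total time if every index request costs about the same."""
    return per_request_seconds * requests


if __name__ == "__main__":
    # A dozen requests at roughly 0.3 s each comes to about 3.6 s.
    print(round(estimated_index_time(0.3, 12), 1))
```

To time a real page, `fetch` could be something like `lambda: urllib.request.urlopen(url).read()` (after `import urllib.request`), with `url` set to whichever index page is being checked.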
The .3-second time per request is *much* better than we had before (I assume), but it's *not fast enough*. The demand on the package index used by setuptools is going to increase substantially. Even if setuptools only made a single request per package, .3 seconds per request is too slow. Given the current structure of the index, setuptools has to make a request for the package and a request per release. For ZODB, this means about 12 requests, or more than 3 seconds. Of course, this will increase over time, as more releases are made. The progress Martin has made has (I assume and hope) greatly increased the reliability and performance of PyPI. This is very important and much appreciated. It is not enough in the long (or, I suspect, medium) term. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From nathan at creativecommons.org Wed Jul 11 16:28:33 2007 From: nathan at creativecommons.org (Nathan R. Yergler) Date: Wed, 11 Jul 2007 07:28:33 -0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46946A69.4000702@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> Message-ID: On 7/10/07, "Martin v. Löwis" wrote: > > I'm not happy with PyPI's performance either. > > Probably many users are like me: I thought it was > > common knowledge that the performance of PyPI was bad > > Please trust me that it isn't. I know that PyPI could > become unresponsive, and I FIXED that. AFAICT, it's > solved, done, can't happen again.
I do not know that > performance IS bad; I know that it WAS bad (primarily > not due to the way the software was written, but > due to the way it was run). The speed has noticeably improved (thanks!) but as recently as Monday PyPI was unresponsive and then returning proxy errors. It definitely caused us (Creative Commons) to lose productivity Monday afternoon (PDT). Nathan > > > but > > I didn't want to complain when it appeared that people were > > working on improvements. > > Sure: mere complaints would not be constructive. However, > specific *reports* of problems are absolutely necessary. > If you experience problems today, tomorrow, next week, > by all means, report them. Different people apparently > also have different perception what good performance is, > so please always make a full bug report: > > - what precisely did you do (including "when" also > in this case), > - what happened, > - what did you expect to happen instead > > Regards, > Martin > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > From jodok at lovelysystems.com Wed Jul 11 17:57:14 2007 From: jodok at lovelysystems.com (Jodok Batlogg) Date: Wed, 11 Jul 2007 17:57:14 +0200 (CEST) Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: Message-ID: <21138246.8381184169434589.JavaMail.root@post.webmeisterei.com> sorry for incorrect quoting - i'm at europython and the webmailer behaves badly... i've been complaining loudly! :) in fact cheeseshop was unusably slow. in meanwhile we built our own index and are not depending on cheeseshop anymore. i think at least me (lovely systems) and jim (zope corporation) offered help and volunteered to pay someone to fix it. for me, the current solution is just "tuning", but not addressing the general problem behind the current software design (that is the pypi software and parts of setuptools in general). 
i've been following the thread actively and like to thank especially martin for his work to get a short-term solution. nevertheless we need to solve these issues. as a lot of other projects are moving to egg-based distributions pypi is an integral part. baking static pages would be my first choice. jodok ----- Original Message ----- From: "Jim Fulton" To: "Martin v. Löwis" Cc: catalog-sig at python.org Sent: Wednesday, July 11, 2007 4:32:31 PM (GMT+0200) Europe/Athens Subject: Re: [Catalog-sig] start on static generation, and caching - apache config. On Jul 11, 2007, at 1:19 AM, Martin v. Löwis wrote: ... >>> That's also my concern. Nobody else is complaining; AFAICT, there >>> is just one unhappy user of PyPI. >> >> Oh come on, I'm not the only one who has posted messages on this >> mailing >> list over the last few weeks reporting problems. > > Can you kindly refer to four or five such messages in the archives? > I must have missed them. http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html http://mail.python.org/pipermail/catalog-sig/2007-June/001101.html http://mail.python.org/pipermail/catalog-sig/2007-April/001049.html http://mail.python.org/pipermail/catalog-sig/2006-November/000997.html There haven't been a large number of messages. There must not be a problem. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html _______________________________________________ Catalog-SIG mailing list Catalog-SIG at python.org http://mail.python.org/mailman/listinfo/catalog-sig -- Lovely Systems, Partner phone: +43 5572 908060, fax: +43 5572 908060-77 Schmelzhütterstraße 26a, 6850 Dornbirn, Austria From martin at v.loewis.de Wed Jul 11 19:40:34 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 19:40:34 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> Message-ID: <46951612.9010009@v.loewis.de> > The speed has noticeably improved (thanks!) but as recently as Monday > PyPI was unresponsive and then returning proxy errors. It definitely > caused us (Creative Commons) to lose productivity Monday afternoon > (PDT). Ok. What precisely was that proxy error? (I'm puzzled, because I'm not aware of a proxy somewhere) Regards, Martin From fdrake at gmail.com Wed Jul 11 19:42:04 2007 From: fdrake at gmail.com (Fred Drake) Date: Wed, 11 Jul 2007 13:42:04 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> Message-ID: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com> On 7/11/07, Nathan R. Yergler wrote: > The speed has noticeably improved (thanks!) but as recently as Monday > PyPI was unresponsive and then returning proxy errors. It definitely > caused us (Creative Commons) to lose productivity Monday afternoon > (PDT). We're seeing this right now, too. I'm checking both www.python.org and cheeseshop.python.org. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." --Henry Miller From nathan at creativecommons.org Wed Jul 11 19:47:33 2007 From: nathan at creativecommons.org (Nathan R. Yergler) Date: Wed, 11 Jul 2007 10:47:33 -0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46951612.9010009@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> Message-ID: On 7/11/07, "Martin v. L?wis" wrote: > > The speed has noticeably improved (thanks!) but as recently as Monday > > PyPI was unresponsive and then returning proxy errors. It definitely > > caused us (Creative Commons) to lose productivity Monday afternoon > > (PDT). > > Ok. What precisely was that proxy error? 
(I'm puzzled, because I'm > not aware of a proxy somewhere) IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache was passing requests through to a local process (mod_rewrite or mod_proxy?), and that process wasn't responding. > > Regards, > Martin > From jim at zope.com Wed Jul 11 19:50:01 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 13:50:01 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46951612.9010009@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> Message-ID: On Jul 11, 2007, at 1:40 PM, Martin v. Löwis wrote: >> The speed has noticeably improved (thanks!) but as recently as Monday >> PyPI was unresponsive and then returning proxy errors. It definitely >> caused us (Creative Commons) to lose productivity Monday afternoon >> (PDT). > > Ok. What precisely was that proxy error? (I'm puzzled, because I'm > not aware of a proxy somewhere) Here's the error I just got after several minutes of spinning trying to get: http://www.python.org/pypi/ZODB3 503 Service Temporarily Unavailable

Service Temporarily Unavailable

The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.

Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From benji at benjiyork.com Wed Jul 11 19:50:39 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 13:50:39 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46946E55.30308@v.loewis.de> References: <200707102016.40669.srichter@cosmos.phy.tufts.edu> <46946E55.30308@v.loewis.de> Message-ID: <4695186F.3030207@benjiyork.com> Martin v. L?wis wrote: > So yes, I trust that there have been complaints in the past - I > wonder whether there are *still* complaints (beyond the ones > of Jim Fulton). Here's a complaint: the cheeseshop is down. -- Benji York http://benjiyork.com From fdrake at gmail.com Wed Jul 11 19:50:56 2007 From: fdrake at gmail.com (Fred Drake) Date: Wed, 11 Jul 2007 13:50:56 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> Message-ID: <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com> On 7/11/07, Nathan R. Yergler wrote: > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache > is passing requests through to a local process (mod_rewrite or > mod_proxy?), and that process wasn't responding. Firefox's "Page Info" says 503. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." --Henry Miller From nathan at creativecommons.org Wed Jul 11 19:55:46 2007 From: nathan at creativecommons.org (Nathan R. Yergler) Date: Wed, 11 Jul 2007 10:55:46 -0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com> Message-ID: I'm getting the following right now: 502 Proxy Error

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /pypi.

Reason: Error reading from remote server

On 7/11/07, Fred Drake wrote: > On 7/11/07, Nathan R. Yergler wrote: > > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache > > is passing requests through to a local process (mod_rewrite or > > mod_proxy?), and that process wasn't responding. > > Firefox's "Page Info" says 503. > > > -Fred > > -- > Fred L. Drake, Jr. > "Chaos is the score upon which reality is written." --Henry Miller > From martin at v.loewis.de Wed Jul 11 20:01:59 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:01:59 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <4694C8B6.1030804@benjiyork.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com> Message-ID: <46951B17.4000104@v.loewis.de> Benji York schrieb: > Martin v. L?wis wrote: >>> People are doing it, usually in limited ways, out of desperation. >> Same question to these people, then (whoever they are): why >> do you think it's easier to build your own index in desperation, >> rather than contributing to PyPI? > > Because they aren't aware of the progress being made or the intent to > make more? And then, why didn't they ask how they could help? People can start all the projects they want, of course. It just seems like a waste of volunteer time to work on competing projects. > Here's one (you didn't say they had to be past messages ). 
And indeed, I'm more interested in new reports than in old ones (since the system changed since the old ones). > Is your position that PyPI isn't down/very slow on occasion or that when > it is no one complains? Both. I believe it shouldn't be down, and I have no precise reports of it being "very slow". Jim Fulton complained that it took 0.3s to get a single package's page, which I cannot classify as "very slow". > My team has lost many man hours to PyPI begin down/glacially slow. This > isn't meant to disparage PyPI though, if it weren't such a great thing > it wouldn't be important to us. But when did that happen precisely? Regards, Martin From martin at v.loewis.de Wed Jul 11 20:03:01 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:03:01 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> Message-ID: <46951B55.9050009@v.loewis.de> >> Ok. What precisely was that proxy error? (I'm puzzled, because I'm >> not aware of a proxy somewhere) > > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache > is passing requests through to a local process (mod_rewrite or > mod_proxy?), and that process wasn't responding. Neither is going on for PyPI, AFAIK - it's mod_fastcgi. Regards, Martin From nathan at creativecommons.org Wed Jul 11 20:06:20 2007 From: nathan at creativecommons.org (Nathan R. Yergler) Date: Wed, 11 Jul 2007 11:06:20 -0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <46951B55.9050009@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <46951B55.9050009@v.loewis.de> Message-ID: On 7/11/07, "Martin v. L?wis" wrote: > >> Ok. What precisely was that proxy error? (I'm puzzled, because I'm > >> not aware of a proxy somewhere) > > > > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache > > is passing requests through to a local process (mod_rewrite or > > mod_proxy?), and that process wasn't responding. > > Neither is going on for PyPI, AFAIK - it's mod_fastcgi. > So perhaps the external fastcgi server has barfed? Like I said, I was just guessing based on past experience. I don't know enough about the internals of PyPI to actually comment on how applicable that experience is. NRY From benji at benjiyork.com Wed Jul 11 20:25:58 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 14:25:58 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46951B17.4000104@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com> <46951B17.4000104@v.loewis.de> Message-ID: <469520B6.2030002@benjiyork.com> Martin v. 
Löwis wrote: > Benji York wrote: >> Is your position that PyPI isn't down/very slow on occasion or that when >> it is no one complains? > > Both. I believe it shouldn't be down The cheeseshop has provided its own proof that that belief is mistaken by being down as I began composing this message. > Jim Fulton complained that it took 0.3s to > get a single package's page, which I cannot classify as "very slow". During a single run setuptools or zc.buildout may make hundreds of requests to the cheeseshop taking a total time in the minutes. That's not fast enough. I can't see a technical reason why these requests couldn't be handled much faster than 3 a second. >> My team has lost many man hours to PyPI being down/glacially slow. This >> isn't meant to disparage PyPI though, if it weren't such a great thing >> it wouldn't be important to us. > > But when did that happen precisely? I don't recall precisely. I'll be sure to report outages religiously from now on. -- Benji York http://benjiyork.com From martin at v.loewis.de Wed Jul 11 20:27:00 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:27:00 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com> Message-ID: <469520F4.2050708@v.loewis.de> Nathan R. Yergler wrote: > I'm getting the following right now: > > > > 502 Proxy Error > >

Proxy Error

>

The proxy server received an invalid > response from an upstream server.
> The proxy server could not handle the request GET /pypi.

> Reason: Error reading from remote server

> >

Thanks for all the reports. I'm really puzzled what precisely happened. Apache has logged tons of the error messages

[Wed Jul 11 20:11:01 2007] [warn] FastCGI: server "/data/pypi/src/pypi/pypi.fcgi" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds

That caused the outage: the PyPI FCGI servers stopped, and failed to restart, so FCGI backed off starting new ones. However, I don't understand why PyPI crashed - it did not leave a log message, and did not send an error email. After restarting it, it seems to run just fine.

The first crashed server was started 7:56 (UTC+2), and, at 11:04, the line

[warn] FastCGI: server "/data/pypi/src/pypi/pypi.fcgi" (pid 3770) terminated by calling exit with status '0'

was logged, i.e. PyPI voluntarily decided to exit. The same happened later again and again, but I can't figure out why it would do such a thing.

Regards, Martin

From martin at v.loewis.de Wed Jul 11 20:27:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:27:57 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To:
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <46951B55.9050009@v.loewis.de>
Message-ID: <4695212D.6010406@v.loewis.de>

> So perhaps the external fastcgi server has barfed? Like I said, I was
> just guessing based on past experience. I don't know enough about the
> internals of PyPI to actually comment on how applicable that
> experience is.

I just looked into it a little - that happened, but I don't know why.
Regards, Martin From martin at v.loewis.de Wed Jul 11 20:32:22 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:32:22 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <4694D028.6050203@benjiyork.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com> Message-ID: <46952236.30704@v.loewis.de> > Although it wasn't part of the cheeseshop's original mission, it has > become an integral part of distributing Python packages. If it doesn't > want to participate in its new-found utility, other options need to be > explored. It's a software system; it doesn't have a mission. I just dislike making unilateral decisions. Regards, Martin From martin at v.loewis.de Wed Jul 11 20:35:02 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:35:02 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> Message-ID: <469522D6.1070706@v.loewis.de> > The questions for us is, how much effort we are willing to make to > prevent people from shooting themselves in the foot. I can understand > why Phillip would like the package index to prevent people from choosing > problematic package names. That's not my understanding - the issue isn't with "problematic package names", but with conflicting package names. IOW, any single name is fine - it's a pair of names that would cause a problem (and only if you wanted to install both packages on the same system). Regards, Martin From martin at v.loewis.de Wed Jul 11 20:36:57 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:36:57 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> Message-ID: <46952349.5050606@v.loewis.de> > OK, let's take a step back. IMO, PyPI is a *part* is the packaging > system. If we can't agree that that is true, then we need to find a > package index that *is* part of the package system. It might be hairsplitting to discuss this specific question, but I think the purpose of PyPI is to allow people to find Python packages, i.e. it is a package index. Regards, Martin From martin at v.loewis.de Wed Jul 11 20:40:29 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:40:29 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> Message-ID: <4695241D.3090203@v.loewis.de> > The .3-second times per request is *much* better than we had before > (I assume), but it's *not fast enough*. The demand on the package > index used by setuptools is going to increase substantially. Even if > setuptools only made a single request per package, .3 seconds per > request is too slow. 
Given the current structure of the index, > setuptools has to make a request for the package and a request per > release. For ZODB, this means about 12 requests, or more than 3 > seconds. Of course, this will increase over time, as more releases > are made. This I still don't understand. Why does it need to query all available releases? Regards, Martin From benji at benjiyork.com Wed Jul 11 20:41:32 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 14:41:32 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46952236.30704@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com> <46952236.30704@v.loewis.de> Message-ID: <4695245C.3020703@benjiyork.com> Martin v. Löwis wrote: >> Although it wasn't part of the cheeseshop's original mission, it has >> become an integral part of distributing Python packages. If it doesn't >> want to participate in its new-found utility, other options need to be >> explored. > > It's a software system; it doesn't have a mission. This SIG has a mission; I was under the impression that the cheeseshop was developed to forward that mission. If not, we need to start work on something that will provide a usable server-side complement to setuptools. > I just dislike making unilateral decisions.
Fortunately you don't have to. We have several people here with varied experience that have the facilities to communicate their desires and expertise. -- Benji York http://benjiyork.com From jim at zope.com Wed Jul 11 20:45:09 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 14:45:09 -0400 Subject: [Catalog-sig] The purpose(s) of PYPI In-Reply-To: <46952349.5050606@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <46952349.5050606@v.loewis.de> Message-ID: On Jul 11, 2007, at 2:36 PM, Martin v. L?wis wrote: >> OK, let's take a step back. IMO, PyPI is a *part* is the packaging >> system. If we can't agree that that is true, then we need to find a >> package index that *is* part of the package system. > > It might be hairsplitting to discuss this specific question, but > I think the purpose of PyPI is to allow people to find Python > packages, i.e. it is a package index. Let me try to put this another way. Can we agree that it is part of the purpose of PyPI to serve as a repository for setuptools? I'd like to resolve this issue. If it isn't part of PyPI's purpose to serve as a repository for setuptools, then we'll build another system that *does* have that purpose. If it is part of the purpose to serve as a repository for setuptools, then we'll need to take various needs of setuptools into account. 
Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Jul 11 20:37:21 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 11 Jul 2007 14:37:21 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469467AA.7070409@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> Message-ID: <20070711184549.733CE3A404D@sparrow.telecommunity.com> At 07:16 AM 7/11/2007 +0200, Martin v. L?wis wrote: > > Note that Windows (and Mac OS under certain circumstances) have filename > > case insensitivity, and have different restrictions about what can or > > can't be in a filename than Unix. Spaces and other punctuation > > characters can cause problems for shells, even if they're theoretically > > acceptable as filenames. > >I can see that collisions should be avoided in advance when it comes to >file names. However, the name of a software package is not necessarily a >file name, Actually, it is. The distutils generate distribution filenames based on this. > > IOW, setuptools' focus is more on distribution filename safety, rather > > than on sensible naming distinctions for end users. 
The former is less > > restrictive than the latter, I believe. > >Yes. However, it's not clear to me that the infrastructure needs to >(or even is able to) enforce sensible naming. I said sensible *distinctions* - not sensible naming. Clearly, we can't advise people not to publish packages named "Joe's Miscellaneous Functions", at least not in an automated way. :) > Instead, any policing >that might be necessary should be done in the community. If two >packages are named too similarly, users will get confused, and >eventually one package may disappear, get renamed, get its naming >challenged in court, and so on. It's not the job of the package >*index* to do that sort of policing. Within its own scope, that's a valid and sensible argument. Within the larger scope of "what is good for users", I would say it does no *good* to allow people to register such similar package names, and in many cases will do *harm* to do so. Contrariwise, it will not do *harm* to anyone to reject their too-similar name, and will in fact do them good. Today, I almost created a package called "Aspects". Had I done so, and uploaded it to the Cheeseshop, I wouldn't have been warned that there is already a package named "aspects". I would have been well on my way to creating confusion that would be entirely avoidable, were the Cheeseshop to stop me at the point of registration or uploading. Since the restriction can cause no real harm, and produces a net good, but the lack of restriction can cause real harm (e.g., I had to later change a package name, thereby breaking dependencies in other packages), there is no reason *not* to provide that benefit to the users, and protect them from that harm. Perhaps, as Jim says, it is time to start treating PyPI as part of the packaging system. It is so in fact, anyway. 
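A minimal sketch of the kind of registration-time check argued for above, offered only as an illustration. It assumes a hypothetical normalization rule (lowercase the name and collapse runs of characters other than letters, digits and '.' into a single '-'); this is not claimed to be the rule the Cheeseshop or setuptools actually applies, and the names collision_key and find_conflict are made up for this example.

```python
import re


def collision_key(name):
    """Hypothetical normalization used only for comparing names.

    Lowercases the name and collapses every run of characters other
    than letters, digits and '.' into a single '-'.
    """
    return re.sub(r"[^a-z0-9.]+", "-", name.lower())


def find_conflict(new_name, registered):
    """Return the registered name that new_name would collide with, or None."""
    key = collision_key(new_name)
    for existing in registered:
        # An exact match is a re-registration, not a near-duplicate.
        if existing != new_name and collision_key(existing) == key:
            return existing
    return None


# Illustrative data only; not a snapshot of the real index.
registered = ["aspects", "ZODB3", "zc.buildout"]

print(find_conflict("Aspects", registered))   # collides with "aspects"
print(find_conflict("NewThing", registered))  # no collision
```

With a check like this run at registration or upload time, the "Aspects" versus "aspects" case would be reported before the second name is ever published, which is the point of the argument; where the line for "too similar" should be drawn is a policy question the sketch does not settle.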
Meanwhile, the separation between cataloging and packaging means other issues, such as the complete disconnect between the cataloging of metadata and the automated production and use of such metadata. The PKG-INFO format has been degrading with each new version, in terms of defining more metadata for which over-restrictive *syntax* is defined, while being almost completely lacking in any *semantics*. This schism between the idea of neatly cataloging things, versus being able to actually *use* that cataloging for practical purposes by automated tools (as opposed to being usable only to human readers), seems to be at the heart of some of the current discussion. From martin at v.loewis.de Wed Jul 11 20:46:02 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 20:46:02 +0200 Subject: [Catalog-sig] cheeseshop outage In-Reply-To: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com> Message-ID: <4695256A.5020208@v.loewis.de> Fred Drake schrieb: > On 7/11/07, Nathan R. Yergler wrote: >> The speed has noticeably improved (thanks!) but as recently as Monday >> PyPI was unresponsive and then returning proxy errors. It definitely >> caused us (Creative Commons) to lose productivity Monday afternoon >> (PDT). > > We're seeing this right now, too. I'm checking both www.python.org > and cheeseshop.python.org. If www.python.org is up, should be safe to ignore. If you can find any post-mortem evidence on ximinez, that would be much appreciated. Regards, Martin P.S. Why is www.python.org proxying for ximinez? 
Shouldn't it perform redirects instead? From benji at benjiyork.com Wed Jul 11 20:52:50 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 14:52:50 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070711184549.733CE3A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <20070711184549.733CE3A404D@sparrow.telecommunity.com> Message-ID: <46952702.8060606@benjiyork.com> Phillip J. Eby wrote: > This schism between the idea of neatly cataloging things, versus > being able to actually *use* that cataloging for practical purposes > by automated tools (as opposed to being usable only to human > readers), seems to be at the heart of some of the current discussion. Wasn't there a proposal to merge the catalog-sig and distutils-sig? -- Benji York http://benjiyork.com From jim at zope.com Wed Jul 11 20:57:43 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 14:57:43 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <4695241D.3090203@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> <4695241D.3090203@v.loewis.de> Message-ID: <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com> On Jul 11, 2007, at 2:40 PM, Martin v. Löwis wrote: >> The .3-second times per request is *much* better than we had before >> (I assume), but it's *not fast enough*. The demand on the package >> index used by setuptools is going to increase substantially. Even if >> setuptools only made a single request per package, .3 seconds per >> request is too slow. Given the current structure of the index, >> setuptools has to make a request for the package and a request per >> release. For ZODB, this means about 12 requests, or more than 3 >> seconds. Of course, this will increase over time, as more releases >> are made. > > This I still don't understand. Why does it need to query all available > releases? The way that setuptools currently works, it scans each of the release pages looking for distributions. In theory, it could take the names of these pages into account and scan fewer. It will still have to scan at least 2. I have a feeling that I'll never convince you that a third of a second is too slow. I think I'll stop trying. Hopefully, René will be able to get baking working, at which point the pages will be a lot faster. At that point, I think it would be good to pursue alternate pages more optimized for setuptools to reduce the number and size of setuptools requests. I'll help any way I can with that. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Jul 11 21:03:12 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 11 Jul 2007 15:03:12 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <469520B6.2030002@benjiyork.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com> <46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com> Message-ID: <20070711190058.2322F3A404D@sparrow.telecommunity.com> At 02:25 PM 7/11/2007 -0400, Benji York wrote: >Martin v. L?wis wrote: > > Benji York schrieb: > > >> Is your position that PyPI isn't down/very slow on occasion or that when > >> it is no one complains? > > > > Both. I believe it shouldn't be down > >The cheeseshop has provided its own proof that that believe is mistaken >by being down as I began composing this message. > > > Jim Fulton complained that it took 0.3s to > > get a single package's page, which I cannot classify as "very slow". > >During a single run setuptools or zc.buildout may make hundreds of >requests to the cheeseshop taking a total time in the minutes. That's >not fast enough. I can't see a technical reason why these requests >couldn't be handled much faster than 3 a second. An interesting thought for future optimization... an XML-RPC catalog server designed for this use case could in fact do all the computation server-side, resolving dependencies and evaluating version constraints. Heck, in theory, it could cache packages' external links, and simply hand back to the caller a complete list of candidate URLs to choose for downloading. 
That way, most activities would take only one server round-trip to
complete, if the client sent a list of everything it expects to need,
and the server included everything that it expects the client to want
due to those things' dependencies.

The main obstacle to implementing such a service today is that it
would have no way of knowing what dependencies to look for without
sniffing the contents of .egg files. But, as long as a superset of
possible dependencies was listed in PKG-INFO, the server could make
intelligent guesses about what other packages are likely to be needed,
and return their version/download info as well. Returning information
for packages that turn out not to be needed is likely to be far less
expensive than having to make additional round-trip requests.

An alternative to providing that information from metadata, of course,
would be for the client to include a "referrer" header of sorts, saying
why it is asking for a package. The server could then simply "learn"
the relevant associations.

From jim at zope.com  Wed Jul 11 21:08:03 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:08:03 -0400
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com> <46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com> <20070711190058.2322F3A404D@sparrow.telecommunity.com> Message-ID: <9EE8B28D-5B16-4AE8-8001-E3ECCC34A199@zope.com> On Jul 11, 2007, at 3:03 PM, Phillip J. Eby wrote: > At 02:25 PM 7/11/2007 -0400, Benji York wrote: >> Martin v. L?wis wrote: >>> Benji York schrieb: >> >>>> Is your position that PyPI isn't down/very slow on occasion or >>>> that when >>>> it is no one complains? >>> >>> Both. I believe it shouldn't be down >> >> The cheeseshop has provided its own proof that that believe is >> mistaken >> by being down as I began composing this message. >> >>> Jim Fulton complained that it took 0.3s to >>> get a single package's page, which I cannot classify as "very slow". >> >> During a single run setuptools or zc.buildout may make hundreds of >> requests to the cheeseshop taking a total time in the minutes. >> That's >> not fast enough. I can't see a technical reason why these requests >> couldn't be handled much faster than 3 a second. > > An interesting thought for future optimization... an XML-RPC catalog > server designed for this use case could in fact do all the > computation server-side, resolving dependencies and evaluating > version constraints. 
Heck, in theory, it could cache packages' > external links, and simply hand back to the caller a complete list of > candidate URLs to choose for downloading. That way, most activities > would take only one server round-trip to complete, if the client sent > a list of everything it expects to need, and the server includes > everything that the server expects the client to want due to those > things' dependencies. That wouldn't help when local (e.g. development) or private distributions need to be included in the mix. I think collecting all of the links for a package that PYPI knows about on individual package pages would go a very long way to reducing the number of requests. If these pages were served statically (or in similar times), then I think we'd be in very good shape. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Wed Jul 11 21:13:42 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 21:13:42 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <4695245C.3020703@benjiyork.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com> <46952236.30704@v.loewis.de> <4695245C.3020703@benjiyork.com> Message-ID: <46952BE6.1070604@v.loewis.de> Benji York schrieb: > Martin v. L?wis wrote: >>> Although it wasn't part of the cheeseshop's original mission, it has >>> become an integral part of distributing Python packages. If it doesn't >>> want to participate in its new-found utility, other options need to be >>> explored. >> >> It's a software system; it doesn't have a mission. > > This SIG has a mission, I was under the impression that the cheeseshop > was developed to forward that mission. That's true. That mission is "The Python Catalog SIG aims at producing a master index of Python software and other resources." I think this still is the mission - be *the* central site for indexing Python software. The part "other resources" apparently never was considered; it only indexes software now. >> I just dislike making unilateral decisions. > > Fortunately you don't have to. We have several people here with varied > experience that have the facilities to communicate their desires and > expertise. Ok. 
Of course, here the usual software engineer's reaction comes into play: if you don't think something is that important, you try to come up with reasons not doing it. I should have been more open: I don't see that I have time to implement the clashing check that Phillip proposed, although I'll see what I can do about the redirect on lookup. Regards, Martin From martin at v.loewis.de Wed Jul 11 21:23:51 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 21:23:51 +0200 Subject: [Catalog-sig] The purpose(s) of PYPI In-Reply-To: References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <46952349.5050606@v.loewis.de> Message-ID: <46952E47.8020700@v.loewis.de> > Can we agree that it is part of the purpose of PyPI to serve as a > repository for setuptools? I'd like to resolve this issue. If it isn't > part of PyPI's purpose to serve as a repository for setuptools, then > we'll build another system that *does* have that purpose. If it is part > of the purpose to serve as a repository for setuptools, then we'll need > to take various needs of setuptools into account. I can't answer that question. I know PyPI is a master index of Python software and other resources, because (as Benji York kindly reminded me) that's the mission under which it was created. 
Beyond that, it is what the community makes of it. I personally know
it is not a "repository for setuptools" for *me*, as I don't use
setuptools. I also know it is a "repository for setuptools" for you,
as you have reported using it for that purpose. For many of the
package authors, I think it is a platform to advertise their software;
for some, it is also a web hosting service to place their released
files onto.

As for taking needs into account: First of all, it's a volunteer
project. Open source contributors are known to primarily scratch
their own itches. So if you want to see needs be taken into account,
you may have to write the code yourself, pay somebody to write
it for you, or talk somebody into writing it for you. In particular,
I personally won't write any line of code just because of a threat to
go away and write a competing index. Instead, my reaction to such a
threat remains the same: good luck!

Regards,
Martin

From benji at benjiyork.com  Wed Jul 11 21:27:45 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 15:27:45 -0400
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <46952BE6.1070604@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com> <46952236.30704@v.loewis.de> <4695245C.3020703@benjiyork.com> <46952BE6.1070604@v.loewis.de> Message-ID: <46952F31.5020806@benjiyork.com> Martin v. L?wis wrote: > That's true. That mission is "The Python Catalog SIG aims at producing a > master index of Python software and other resources." > > I think this still is the mission - be *the* central site for indexing > Python software. The part "other resources" apparently never was > considered; it only indexes software now. There exists ambiguity as to the audience for the index. Humans are assumed; I propose that packaging systems need to be on the list as well. > I should have been more open: I don't > see that I have time to implement the clashing check that Phillip > proposed, although I'll see what I can do about the redirect > on lookup. Knowing your motivation helps. I don't think anyone expected you to jump on the implementation. It's OK to say that you don't have time to implement something. There are other people that can help, and if not it'll just have to wait. We have to make sure we distinguish between desirability and feasibility. 
-- Benji York http://benjiyork.com From pje at telecommunity.com Wed Jul 11 21:25:44 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 11 Jul 2007 15:25:44 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46952702.8060606@benjiyork.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <20070711184549.733CE3A404D@sparrow.telecommunity.com> <46952702.8060606@benjiyork.com> Message-ID: <20070711192751.A9FF33A404D@sparrow.telecommunity.com> At 02:52 PM 7/11/2007 -0400, Benji York wrote: >Phillip J. Eby wrote: >>This schism between the idea of neatly cataloging things, versus >>being able to actually *use* that cataloging for practical purposes >>by automated tools (as opposed to being usable only to human >>readers), seems to be at the heart of some of the current discussion. > >Wasn't there a proposal to merge the catalog-sig and distutils-sig? Merging the lists isn't going to merge the people or change anybody's point of view. The difference in SIGs reflects, for the most part, a difference in Special Interest -- the "I" in SIG. Or another way of looking at the "I" is "Itch". The people who have been working on cataloging already have their itch basically scratched; PyPI has been sufficient for their needs for some time now. 
The packaging people, OTOH, have an ever-increasing itch, as
setuptools hits its "hockey stick" growth phase both in user volume
and package volume. This is, understandably, of little interest to
people who don't do lots of packaging, deployment, and distribution.

I absolutely don't want to disparage the good folks who have made PyPI
what it is today, and I totally understand their not wanting to take on
the burden of supporting a tool they don't use or care about
themselves, just because it happens to use PyPI.

But it seems to me that for folks whose Interest/Itch is not merely
finding packages, but *using* them, a different infrastructure is
needed, treating PyPI as the ultimate *source* of the information,
without also being its sole *distribution* point or query interface.

There are plenty of folks who have offered to spend funds, provide
hosting, etc. for PyPI mirrors or alternatives -- perhaps we should
create a SIG to start figuring out *how* to provide that, ideally while
creating the least amount of additional service burden on the
Cheeseshop. Ideally, we could then support having the Cheeseshop
redirect existing clients to a nearby distribution index, while newer
clients could use a distribution index to start with.

Such a discussion would need to resolve certain design tradeoffs such
as speed and availability vs. freshness of the index vs. load on the
primary Cheeseshop vs. ability to have lots of mirrors/distribution
indexes vs. ease of selecting one, etc.

But I believe the main reason such discussion hasn't gone very far at
this point is that the packaging-interest folks have been looking to
the cataloging-interest folks to provide direction and focus to the
discussion of the tradeoffs, even though these things lie mostly
outside their itch/interest.
I think it is more likely to be productive for the packaging-interest
folks to get clear about what they want first, and then the
cataloging-interest folks can chime in if they see something being
proposed that might be especially harmful to the Cheeseshop's
availability or performance.

From jim at zope.com  Wed Jul 11 21:29:23 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:29:23 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <46952E47.8020700@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
    <468F3CD4.1070501@v.loewis.de>
    <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
    <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
    <468FC2BB.7030607@v.loewis.de>
    <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
    <468FF69B.2090503@v.loewis.de>
    <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
    <46910BBF.3010308@v.loewis.de>
    <4692B3A3.5030209@v.loewis.de>
    <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
    <46931A3A.5000703@v.loewis.de>
    <20070710141304.BC6903A40A4@sparrow.telecommunity.com>
    <4693FA2A.3020107@v.loewis.de>
    <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
    <469468C5.8000906@v.loewis.de>
    <46952349.5050606@v.loewis.de>
    <46952E47.8020700@v.loewis.de>
Message-ID: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>

On Jul 11, 2007, at 3:23 PM, Martin v. Löwis wrote:
...
> As for taking needs into account: First of all, it's a volunteer
> project. Open source contributors are known to primarily scratch
> their own itches.

Thank you for explaining open source to me.

> So if you want to see needs be taken into account,
> you may have to write the code yourself, pay somebody to write
> it for you, or talk somebody into writing it for you.

Yup. I'm aware of that.

> In particular,
> I personally won't write any line of code just because of a threat to
> go away and write a competing index.

First, I'm not aware that anyone has asked you to do anything.

Second, I certainly meant no threat.
We need a working index to use with setuptools. I would hope, in the
spirit of open source, to collaborate on that. A basic question that
needs to be answered is whether to use PyPI or to build something else.

Jim

--
Jim Fulton  mailto:jim at zope.com  Python Powered!
CTO  (540) 361-1714  http://www.python.org
Zope Corporation  http://www.zope.com  http://www.zope.org

From martin at v.loewis.de  Wed Jul 11 21:41:59 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 21:41:59 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
    <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
    <4695241D.3090203@v.loewis.de>
    <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
Message-ID: <46953287.8020702@v.loewis.de>

>> This I still don't understand. Why does it need to query all available
>> releases?
>
> The way that setuptools currently works, it scans each of the release
> pages looking for distributions. In theory, it could take the names of
> these pages into account and scan fewer. It will still have to scan at
> least 2.

Can you elaborate please? Why does it need to find distributions for
versions that it will eventually not download?

> I have a feeling that I'll never convince you that a third of a second
> is too slow.

That's likely, yes.

> to get baking working, at which point the pages will be a lot faster.
> At that point, I think it would be good to pursue alternate pages more
> optimized for setuptools to reduce the number and size of setuptools
> requests. I'll help any way I can with that.

Deal: please provide sample pages for some of the packages (starting
with some zc packages perhaps), plus a directory structure in which
they should live.
I'll put them up on ximinez, at (say) /raw (or /simple, or whatever URL people propose), so that one can experiment with whether they look right. Then somebody else can write a generator to populate that; I will at the earliest point when I have time (which won't be before August), unless somebody does it earlier. Regards, Martin From martin at v.loewis.de Wed Jul 11 21:53:04 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 21:53:04 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com> <46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com> <20070711190058.2322F3A404D@sparrow.telecommunity.com> Message-ID: <46953520.4080106@v.loewis.de> > An interesting thought for future optimization... an XML-RPC catalog > server designed for this use case could in fact do all the computation > server-side, resolving dependencies and evaluating version constraints. > Heck, in theory, it could cache packages' external links, and simply > hand back to the caller a complete list of candidate URLs to choose for > downloading. 
You mean something like

  select f.filename
    from release_files f, releases r
   where f.name='setuptools'
     and f.name=r.name
     and f.version=r.version
     and not r._pypi_hidden;

This gives

             filename
----------------------------------
 setuptools-0.6c5.win32-py2.3.exe
 setuptools-0.6c5-py2.3.egg
 setuptools-0.6c5.win32-py2.4.exe
 setuptools-0.6c5-1.src.rpm
 setuptools-0.6c5.win32-py2.5.exe
 setuptools-0.6c5.tar.gz
 setuptools-0.6c5-py2.5.egg
 setuptools-0.6c5-py2.4.egg

That would be very easy to add to the RPC server, and would be quite
efficient also.

> That way, most activities would take only one server
> round-trip to complete, if the client sent a list of everything it
> expects to need, and the server includes everything that the server
> expects the client to want due to those things' dependencies.
>
> The main obstacle to implementing such a service today, is that it would
> have no way of knowing what dependencies to look for, without sniffing
> the contents of .egg files.

For that, I would definitely need code contributions.

Regards,
Martin

From jim at zope.com  Wed Jul 11 21:57:47 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:57:47 -0400
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <46953287.8020702@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
    <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
    <4695241D.3090203@v.loewis.de>
    <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
    <46953287.8020702@v.loewis.de>
Message-ID:

On Jul 11, 2007, at 3:41 PM, Martin v. Löwis wrote:

>>> This I still don't understand. Why does it need to query all
>>> available releases?
>>
>> The way that setuptools currently works, it scans each of the release
>> pages looking for distributions. In theory, it could take the
>> names of these pages into account and scan fewer. It will still
>> have to scan at least 2.
>
> Can you elaborate please?
> Why does it need to find distributions for
> versions that it will eventually not download?

It just scans the package page for URLs. It doesn't really know that
the release pages correspond to a particular version.

Let's suppose that setuptools was changed to be aware that PyPI
release pages correspond to a particular version. In that case, it
would have to read the package page to discover the release pages and
then it would have to read at least one release page. If it had
requirements other than the version (e.g. Python version or platform),
it might have to scan several releases to find an acceptable
distribution. But, in the best case, it would have to scan at least
two pages.

...

>> to get baking working, at which point the pages will be a lot faster.
>> At that point, I think it would be good to pursue alternate pages
>> more optimized for setuptools to reduce the number and size of
>> setuptools requests. I'll help any way I can with that.
>
> Deal: please provide sample pages for some of the packages (starting
> with some zc packages perhaps), plus a directory structure in which
> they should live.

Fair enough. I'll do that.

Jim

--
Jim Fulton  mailto:jim at zope.com  Python Powered!
CTO  (540) 361-1714  http://www.python.org
Zope Corporation  http://www.zope.com  http://www.zope.org

From martin at v.loewis.de  Wed Jul 11 22:08:33 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 22:08:33 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <20070711192751.A9FF33A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
    <468F3CD4.1070501@v.loewis.de>
    <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
    <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
    <468FC2BB.7030607@v.loewis.de>
    <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
    <468FF69B.2090503@v.loewis.de>
    <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
    <46910BBF.3010308@v.loewis.de>
    <4692B3A3.5030209@v.loewis.de>
    <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
    <46931A3A.5000703@v.loewis.de>
    <20070710141304.BC6903A40A4@sparrow.telecommunity.com>
    <4693FA2A.3020107@v.loewis.de>
    <20070710221547.4A3043A40A4@sparrow.telecommunity.com>
    <469467AA.7070409@v.loewis.de>
    <20070711184549.733CE3A404D@sparrow.telecommunity.com>
    <46952702.8060606@benjiyork.com>
    <20070711192751.A9FF33A404D@sparrow.telecommunity.com>
Message-ID: <469538C1.4050404@v.loewis.de>

> There are plenty of folks who have offered to spend funds, provide
> hosting, etc. for PyPI mirrors or alternatives -- perhaps we should
> create a SIG to start figuring out *how* to provide that, ideally while
> creating the least amount of additional service burden on the Cheeseshop.

This makes me suspicious. I can certainly believe that you may need
more sheer processing power, or more bandwidth, for such a system
than the current PyPI installation has to offer.

What I don't see is why you need to implement something *different*.
If you need better queries - fine, add them to PyPI. If you need
replication, load balancing, etc, please add it to PyPI. If you have
a way faster machine, migrate PyPI to that machine.

That is all possible, but assumes availability of volunteers. However,
the approach "let's create a different system" *also* needs volunteers.
So I'd rather have these volunteers contribute to a single system,
instead of each of them building their own one.
With the particular offer of a faster machine, *all* it needs is a volunteer who first migrates and then maintains the installation. Of course, that would involve responsibility for all of PyPI (i.e. also dealing with abandoned packages that somebody else takes over, adding new classifiers, etc) (I say that because that aspect also lacks volunteers in the current installation). Regards, Martin From martin at v.loewis.de Wed Jul 11 22:11:09 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 22:11:09 +0200 Subject: [Catalog-sig] The purpose(s) of PYPI In-Reply-To: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <46952349.5050606@v.loewis.de> <46952E47.8020700@v.loewis.de> <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com> Message-ID: <4695395D.5030602@v.loewis.de> > Second, I certainly meant no threat. We need a working index to use > with setuptools. I would hope, in the spirit of open source to > collaborate on that. A basic questions that needs to be answered is > whether to use PyPI or to build something else. Ok. For this question, there is a seemingly-obvious answer: use PyPI. Why on earth would somebody want to build something else? 
Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 22:15:44 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 11 Jul 2007 22:15:44 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To:
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
    <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
    <4695241D.3090203@v.loewis.de>
    <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
    <46953287.8020702@v.loewis.de>
Message-ID: <46953A70.6070600@v.loewis.de>

> Let's suppose that setuptools was changed to be aware that PyPI release
> pages correspond to a particular version. In that case, it would have
> to read the package page to discover the release pages and then it would
> have to read at least one release page. If it had requirements other
> than the version (e.g. Python version or platform), it might have to
> scan several releases to find an acceptable distribution. But, in the
> best case, it would have to scan at least two pages.

Sure. However, that makes the difference between O(1) and O(N),
where N is the number of releases recorded. Going back to your
original concern: you would not have to change the policy of
keeping many different releases if the number of releases does
not impact performance.

When it looks for individual release pages, does it know that these
are release pages, or does it follow all links on the package page?
If the latter, what links does it follow (there are plenty more
on the package page)?
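
The O(1) versus O(N) point can be made concrete with a rough sketch. It only models request counts; it assumes the 0.3 s per request figure and the "about 12 requests" ZODB example quoted earlier in the thread, and every function name below is hypothetical (none of this is setuptools or PyPI code).

```python
# Rough request-count model for the lookup strategies discussed in the
# thread. All names are hypothetical illustrations, not setuptools APIs.

SECONDS_PER_REQUEST = 0.3  # per-request time quoted in the thread


def requests_scanning_all_releases(num_releases):
    """Behaviour as described: one package page plus every release page, O(N)."""
    return 1 + num_releases


def requests_version_aware_best_case(num_releases):
    """Best case if the client knew which release page to read: O(1)."""
    return 1 + min(num_releases, 1)


def requests_single_combined_page(num_releases):
    """All download links collected on one package page: one request."""
    return 1


def estimated_seconds(num_requests):
    """Total time, assuming every request costs SECONDS_PER_REQUEST."""
    return num_requests * SECONDS_PER_REQUEST


if __name__ == "__main__":
    # Assumption: 11 release pages, which yields the "about 12 requests"
    # mentioned for ZODB.
    releases = 11
    for strategy in (
        requests_scanning_all_releases,
        requests_version_aware_best_case,
        requests_single_combined_page,
    ):
        count = strategy(releases)
        print(f"{strategy.__name__}: {count} requests, "
              f"about {estimated_seconds(count):.1f} s")
```

Under these assumptions only the first strategy grows with the number of releases, which is why the number of retained releases matters for the current scanning behaviour and not for the other two.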
Regards, Martin From jim at zope.com Wed Jul 11 22:18:08 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 16:18:08 -0400 Subject: [Catalog-sig] The purpose(s) of PYPI In-Reply-To: <4695395D.5030602@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <46952349.5050606@v.loewis.de> <46952E47.8020700@v.loewis.de> <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com> <4695395D.5030602@v.loewis.de> Message-ID: <86EDAB6A-62C4-437C-82CD-34242258472C@zope.com> On Jul 11, 2007, at 4:11 PM, Martin v. L?wis wrote: >> Second, I certainly meant no threat. We need a working index to use >> with setuptools. I would hope, in the spirit of open source to >> collaborate on that. A basic questions that needs to be answered is >> whether to use PyPI or to build something else. > > Ok. For this question, there is a seemingly-obvious answer: use PyPI. > Why on earth would somebody want to build something else? If we can make PyPI do what we (where "we" doesn't have to include "you") need, then there is no reason. I don't want to shove a bunch of requirements down someone's throat. I understand that you don't object to new requirements if you don't have to be responsible for them. That's perfectly fair. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From benji at benjiyork.com Wed Jul 11 22:22:09 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 11 Jul 2007 16:22:09 -0400 Subject: [Catalog-sig] The purpose(s) of PYPI In-Reply-To: <4695395D.5030602@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com> <469468C5.8000906@v.loewis.de> <46952349.5050606@v.loewis.de> <46952E47.8020700@v.loewis.de> <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com> <4695395D.5030602@v.loewis.de> Message-ID: <46953BF1.2020905@benjiyork.com> Martin v. L?wis wrote: >> Second, I certainly meant no threat. We need a working index to use >> with setuptools. I would hope, in the spirit of open source to >> collaborate on that. A basic questions that needs to be answered is >> whether to use PyPI or to build something else. > > Ok. For this question, there is a seemingly-obvious answer: use PyPI. > Why on earth would somebody want to build something else? Great; now that we've established that PyPI's audience will include setuptools, the people who know what it wants can make (or reiterate) proposals. 
-- Benji York http://benjiyork.com From jodok at lovelysystems.com Wed Jul 11 23:15:43 2007 From: jodok at lovelysystems.com (Jodok Batlogg) Date: Wed, 11 Jul 2007 21:15:43 +0000 GMT Subject: [Catalog-sig] The purpose(s) of PYPI In-Reply-To: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de><069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com><469468C5.8000906@v.loewis.de><46952349.5050606@v.loewis.de><46952E47.8020700@v.loewis.de> <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com> Message-ID: <1827602359-1184184972-cardhu_blackberry.rim.net-22952-@engine37-cell01.bwc.produk.on.blackberry> +1 on all you said jim -- Lovely Systems, Partner phone: +43 5572 908060, fax: +43 5572 908060-77 Schmelzhütterstraße 26a, 6850 Dornbirn, Austria -----Original Message----- From: Jim Fulton Date: Wed, 11 Jul 2007 15:29:23 To: "Martin v. Löwis" Cc:catalog-sig at python.org Subject: Re: [Catalog-sig] The purpose(s) of PYPI On Jul 11, 2007, at 3:23 PM, Martin v. Löwis wrote: ... > As for taking needs into account: First of all, it's a volunteer > project. Open source contributors are known to primarily scratch > their own itches. Thank you for explaining open source to me. > So if you want to see needs be taken into account, > you may have to write the code yourself, pay somebody to write > it for your, or talk somebody into writing it for you. Yup. I'm aware of that.
> In particular, > I personally won't write any line of code just because of a threat to > go away and write a competing index. First, I'm not aware that anyone has asked you do do anything. Second, I certainly meant no threat. We need a working index to use with setuptools. I would hope, in the spirit of open source to collaborate on that. A basic questions that needs to be answered is whether to use PyPI or to build something else. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org _______________________________________________ Catalog-SIG mailing list Catalog-SIG at python.org http://mail.python.org/mailman/listinfo/catalog-sig From jim at zope.com Wed Jul 11 22:29:56 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 16:29:56 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46953A70.6070600@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> <4695241D.3090203@v.loewis.de> <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com> <46953287.8020702@v.loewis.de> <46953A70.6070600@v.loewis.de> Message-ID: On Jul 11, 2007, at 4:15 PM, Martin v. L?wis wrote: >> Let's suppose that setuptools was changed to be aware that PyPI >> release >> pages correspond to a particular version. In that case, it would >> have >> to read the package page to discover the release pages and then it >> would >> have to read at least one release page. If it had requirements other >> than the version (e.g. Python version or platform), it might have to >> scan several releases to find an acceptable distribution. But, in >> the >> best case, it would have to scan at least two pages. > > Sure. However, that makes the difference between O(1) and O(N), > where N is the number of releases recorded. 
Going back to your > original concern: you would not have to change the policy of > keeping many different releases if the number of releases > does not impact performance. Yup. Absolutely. That's why it we should change the index or setuptools, or both. IMO, it makes the most sense to change the index to have setuptools specific pages, in addition to the ones for humans, that allow: - One page per package and - a minimal amount of data to be downloaded and scanned per page. (As I noted before, release pages are meant for humans. They sometimes contain *lots* of data that setuptools doesn't need.) > When it looks for individual release pages, does it know that these > are release pages, or does it follow all links on the package > page? I'll have to dig to answer that question precisely. I'll do that after pausing to see if Phillip explains it first. > If the latter, what links does it follow (there are plenty > more on the package page)? See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html It seems to only scan the release pages. So it has some heuristic to know which links to follow. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Wed Jul 11 22:43:41 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jul 2007 22:43:41 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> <4695241D.3090203@v.loewis.de> <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com> <46953287.8020702@v.loewis.de> <46953A70.6070600@v.loewis.de> Message-ID: <469540FD.5060109@v.loewis.de> >> If the latter, what links does it follow (there are plenty >> more on the package page)? 
> > See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html > > It seems to only scan the release pages. So it has some heuristic to > know which links to follow. Looking at http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api tells me that it always expects that release pages have the form base/projectname/version. This looks like a formal specification of PyPI, so I wonder why it then would not trust this specification more actively. Regards, Martin From jim at zope.com Wed Jul 11 22:55:55 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 16:55:55 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469540FD.5060109@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> <4695241D.3090203@v.loewis.de> <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com> <46953287.8020702@v.loewis.de> <46953A70.6070600@v.loewis.de> <469540FD.5060109@v.loewis.de> Message-ID: <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com> On Jul 11, 2007, at 4:43 PM, Martin v. Löwis wrote: >>> If the latter, what links does it follow (there are plenty >>> more on the package page)? >> >> See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html >> >> It seems to only scan the release pages. So it has some heuristic to >> know which links to follow. > > Looking at > > http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api > > tells me that it always expects that release pages have the form > base/projectname/version. > > This looks like a formal specification of PyPI, so I wonder why it > then would not trust this specification more actively. Phillip has certainly said it could. IMO, it wouldn't really matter if the pages used by setuptools were specialized for it.
Compared with changing setuptools to be more clever in its handling of release pages, providing custom pages for setuptools will reduce the number of requests by at least 50% and sometimes much more and will greatly reduce the amount of data that needs to be downloaded and scanned. Someone will need to modify some software in either case, so the custom index pages look like a big win to me. I'll take a stab at writing a module, probably using setuptools itself, to scan the existing package and release pages to generate the sort of pages I'm talking about. This can be used to generate sample pages and might be useful for implementing the pages. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From richardjones at optusnet.com.au Thu Jul 12 00:09:48 2007 From: richardjones at optusnet.com.au (Richard Jones) Date: Thu, 12 Jul 2007 08:09:48 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> Message-ID: <200707120809.48344.richardjones@optusnet.com.au> On Thu, 12 Jul 2007, you wrote: > Yup. Absolutely. That's why it we should change the index or > setuptools, or both. IMO, it makes the most sense to change the > index to have setuptools specific pages, in addition to the ones for > humans, that allow: ... you know about the XML-RPC interface, yes? http://wiki.python.org/moin/CheeseShopXmlRpc I never fully understood why setuptools went with HTML scraping instead of XML-RPC. Richard From richardjones at optushome.com.au Thu Jul 12 00:11:49 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Thu, 12 Jul 2007 08:11:49 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <469522D6.1070706@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> Message-ID: <200707120811.49824.richardjones@optushome.com.au> On Thu, 12 Jul 2007, Martin v. Löwis wrote: > > The questions for us is, how much effort we are willing to make to > > prevent people from shooting themselves in the foot. I can understand > > why Phillip would like the package index to prevent people from choosing > > problematic package names. > > That's not my understanding - the issue isn't with "problematic package > names", but with conflicting package names. IOW, any single name is > fine - it's a pair of names that would cause a problem (and only if > you wanted to install both packages on the same system). A big issue that's not been raised is that *distutils* have no package name rules, but it's being proposed that PyPI does - thus a package author will potentially get an error when uploading their package, and also the name that appears in the index may be quite different to the name of their package. Richard From richardjones at optushome.com.au Thu Jul 12 00:23:11 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Thu, 12 Jul 2007 08:23:11 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <469520B6.2030002@benjiyork.com> <20070711190058.2322F3A404D@sparrow.telecommunity.com> Message-ID: <200707120823.12001.richardjones@optushome.com.au> On Thu, 12 Jul 2007, Phillip J. Eby wrote: > An interesting thought for future optimization... an XML-RPC catalog > server designed for this use case could in fact do all the > computation server-side, resolving dependencies and evaluating > version constraints.
Just to remind again: PyPI has an XML-RPC interface, and has had for a long time. It has a history of accepting any and all additional functions for that interface. Richard ps. why is it I keep on reading this undercurrent of "pypi doesn't do exactly what we need, so let's write a new one" and not "let's just add some more functionality to pypi so it does exactly what we need"... Is there something written somewhere, or even implied, that PyPI is somehow a closed development? If there is, I really need to strongly reiterate - PyPI will *always* be completely open for new developers. Please see the wiki page http://wiki.python.org/moin/CheeseShopDev for further information. From pje at telecommunity.com Thu Jul 12 00:40:26 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 11 Jul 2007 18:40:26 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> Message-ID: <20070711223812.D02D13A404D@sparrow.telecommunity.com> At 08:09 AM 7/12/2007 +1000, Richard Jones wrote: >On Thu, 12 Jul 2007, you wrote: > > Yup. Absolutely. That's why it we should change the index or > > setuptools, or both. IMO, it makes the most sense to change the > > index to have setuptools specific pages, in addition to the ones for > > humans, that allow: > >... you know about the XML-RPC interface, yes? > >http://wiki.python.org/moin/CheeseShopXmlRpc > >I never fully understood why setuptools went with HTML scraping instead of >XML-RPC. Fundamentally, it was because the XML-RPC API did not then (and does not now) provide everything that's needed. (As I mentioned a few of the other times you asked this.) The API has improved and added some of the missing bits, but not all of them. There are two pieces still missing: 1. 
Access to "hidden" packages' release info 2. Links in the long_description that are rendered by PyPI's web interface Without #2, we can't pick up author-provided Subversion links; see: http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall for details. With this information, easy_install could be changed to use the XML-RPC API.... *but* it would make even *more* round-trips to PyPI than it does now, unless those APIs were also designed differently than the ones that exist now, because you would need at least one search to find the correct package and its PKG-INFO, and another search to get the download files. Currently, it can at least get both of these in one trip, if the package name is an exact match. To answer Martin's question of why setuptools doesn't "trust" the PyPI specification even more, it's because having chosen to use the web interface to get the information, I thought it prudent to use only that subset of the web interface that could be easily duplicated using simple Apache directory indexes, since that meant someone could create their own index or mirror a portion of PyPI without having to implement its entire feature set. This later proved prudent when Jim wanted to have tests of his buildout framework that did not rely on PyPI being up, as it made it easier to create a mock PyPI for unit testing purposes. To be honest, the one thing I did *not* anticipate in this design was that Jim would be making 20 releases of the same package available in "unhidden" form. :) From pje at telecommunity.com Thu Jul 12 00:44:51 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 11 Jul 2007 18:44:51 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <200707120811.49824.richardjones@optushome.com.au> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <200707120811.49824.richardjones@optushome.com.au> Message-ID: <20070711224237.511A73A404D@sparrow.telecommunity.com> At 08:11 AM 7/12/2007 +1000, Richard Jones wrote: >On Thu, 12 Jul 2007, Martin v. Löwis wrote: > > > The questions for us is, how much effort we are willing to make to > > > prevent people from shooting themselves in the foot. I can understand > > > why Phillip would like the package index to prevent people from choosing > > > problematic package names. > > > > That's not my understanding - the issue isn't with "problematic package > > names", but with conflicting package names. IOW, any single name is > > fine - it's a pair of names that would cause a problem (and only if > > you wanted to install both packages on the same system). > >A big issue that's not been raised is that *distutils* have no package name >rules, but it's being proposed that PyPI does - thus a package author will >potentially get an error when uploading their package, That would happen now, if they spell their package exactly the same as somebody else's package. >and also the name that >appears in the index may be quite different to the name of their package. No-one has proposed that PyPI *change* a package's name, only that one not be allowed to *add* a package whose name does not sufficiently differ from an existing package that it would have a different filename. In other words, since someone has uploaded a package to the CheeseShop called "aspects", I should not be able to register a package called "Aspects" or "asPecTS". If on the other hand I had registered a package named "Aspects" first, then the other person should not be able to create one called "aspects" or "ASPects".
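
A minimal sketch of the similarity rule described above, assuming that "does not sufficiently differ" means nothing more than a case-insensitive match. The function name `find_name_conflicts` and the sample registry are hypothetical; a real check might also need to fold punctuation the way distribution filenames do, which this sketch does not attempt.

```python
def find_name_conflicts(new_name, registered_names):
    """Return registered names that collide with new_name when case is ignored.

    Assumption: two names conflict if they are equal after lowercasing.
    """
    key = new_name.lower()
    return [name for name in registered_names if name.lower() == key]


# Hypothetical registry contents, for illustration only.
registered = ["aspects", "setuptools", "zc.buildout"]

# "Aspects" collides with the already-registered "aspects".
print(find_name_conflicts("Aspects", registered))
# A name with no case-insensitive twin produces an empty list.
print(find_name_conflicts("PyProtocols", registered))
```

Note that nothing is renamed or rejected on its own merits here: the function only reports which existing names a new registration would be too close to.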
So there is neither any changing of names, nor rejection of names on their own, but only a restriction as to how *similar* two names may be. From martin at v.loewis.de Thu Jul 12 00:51:55 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Jul 2007 00:51:55 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <20070711223812.D02D13A404D@sparrow.telecommunity.com> Message-ID: <46955F0B.2060006@v.loewis.de> > 1. Access to "hidden" packages' release info Can you explain what you need them for, and when? I don't fully understand _pypi_hidden, however, I thought that a "hidden" release is really one that the author doesn't want to be ever found, and that is maintained just because of old clients know exactly where it is, and access it directly. > 2. Links in the long_description that are rendered by PyPI's web interface Just specify precisely what operation you want, and what precisely the result should be, and it will appear (also for _pypi_hidden). > With this information, easy_install could be changed to use the XML-RPC > API.... *but* it would make even *more* round-trips to PyPI than it > does now, unless those APIs were also designed differently than the ones > that exist now, because you would need at least one search to find the > correct package and its PKG-INFO, and another search to get the download > files. Currently, it can at least get both of these in one trip, if the > package name is an exact match. Ok, so can you design different APIs, reducing the number of roundtrips to one in the common case, while simultaneously not requiring the server to compute information that is not needed in the common case? If you can, it will appear. 
> To answer Martin's question of why setuptools doesn't "trust" the PyPI > specification even more, it's because having chosen to use the web > interface to get the information, I thought it prudent to use only that > subset of the web interface that could be easily duplicated using simple > Apache directory indexes, since that meant someone could create their > own index or mirror a portion of PyPI without having to implement its > entire feature set. This later proved prudent when Jim wanted to have > tests of his buildout framework that did not rely on PyPI being up, as > it made it easier to create a mock PyPI for unit testing purposes. I still don't understand. I'm talking about not accessing all versions in /root/package/version, trusting that the last part really is a version (i.e. reading only /root/package, finding out all possible versions, selecting the best one, then reading /root/package/bestversion). I cannot see why this is unavailable in a straight directory indexes. Correct me if I'm wrong, but I think you can have /root/package/index.html /root/package/version/index.html and then still chose to make both index.html the same (if there is only a single version), or list the individual versions in the top-level index.html. Or, you can just drop /root/package/index.html, trusting that the Apache directory index will list the single version subdirectory, anyway. Regards, Martin From jim at zope.com Thu Jul 12 00:51:51 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 18:51:51 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> Message-ID: <297846B8-94DC-4770-9476-711796E82FEC@zope.com> On Jul 11, 2007, at 6:09 PM, Richard Jones wrote: > On Thu, 12 Jul 2007, you wrote: >> Yup. Absolutely. 
That's why it we should change the index or >> setuptools, or both. IMO, it makes the most sense to change the >> index to have setuptools specific pages, in addition to the ones for >> humans, that allow: > > ... you know about the XML-RPC interface, yes? Yes. > > http://wiki.python.org/moin/CheeseShopXmlRpc > > I never fully understood why setuptools went with HTML scraping > instead of > XML-RPC. The main reason, as Phillip has explained is that he wants to allow static mirrors of the index. Another good reason is to allow static implementation, which would be far more scalable in the long run. Thanks for reminding me of this though as it will make my little project to prototype an alternate index format for setuptools easier. :) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Thu Jul 12 00:56:01 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 18:56:01 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707120823.12001.richardjones@optushome.com.au> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <469520B6.2030002@benjiyork.com> <20070711190058.2322F3A404D@sparrow.telecommunity.com> <200707120823.12001.richardjones@optushome.com.au> Message-ID: <4198A946-0B11-4F19-9D99-CD7F7B4B9161@zope.com> On Jul 11, 2007, at 6:23 PM, Richard Jones wrote: ... > ps. why is it I keep on reading this undercurrent of "pypi doesn't > do exactly > what we need, so let's write a new one" and not "let's just add > some more > functionality to pypi so it does exactly what we need"... Is there > something > written somewhere, or even implied, that PyPI is somehow a closed > development? If there is, I really need to strongly reiterate - > PyPI will > *always* be completely open for new developers. 
Please see the wiki > page > http://wiki.python.org/moin/CheeseShopDev for further information. I don't think anyone wants to write an alternative. Well, maybe there are people like that, but you aren't reading them here. Why would people spend time arguing about requirements, performance, etc, if they wanted to write their own. Some people are being forced to implement their own indexes because they've become dependent on PyPI and PyPI just hasn't been there for them lately. I'm pretty sure they don't want to maintain alternate indexes in the long term. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jeremy.kloth at 4suite.org Thu Jul 12 01:20:49 2007 From: jeremy.kloth at 4suite.org (Jeremy Kloth) Date: Wed, 11 Jul 2007 17:20:49 -0600 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <200707120809.48344.richardjones@optusnet.com.au> <20070711223812.D02D13A404D@sparrow.telecommunity.com> Message-ID: <200707111720.49299.jeremy.kloth@4suite.org> On Wednesday 11 July 2007 4:40:26 pm Phillip J. Eby wrote: > 1. Access to "hidden" packages' release info This already exists. Simply call release_data() with the exact version you are interested in. It returns the metadata regardless of the "hidden" flag. > 2. Links in the long_description that are rendered by PyPI's web interface The 'description' key in the dictionary returned by release_data() contains the long_description as provided by the package's setup.py. I would think that scanning just that should be simpler than relying on particular formatting of the PyPI generated package page. 
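
The lookup discussed here can be sketched roughly as follows. `release_data` is the XML-RPC method name used in this thread; the helper `describe_release`, the stub class, and its sample data are hypothetical and stand in for a real server so that the example runs offline. On Python 3 a real client would presumably be created with `xmlrpc.client.ServerProxy(...)` (the `xmlrpclib` seen elsewhere in this thread is the Python 2 name).

```python
def describe_release(client, name, version):
    """Fetch one exact release and return its long_description text.

    `client` is anything exposing release_data(name, version), for example
    an XML-RPC server proxy. Returns None if no metadata comes back.
    """
    data = client.release_data(name, version)
    if not data:
        return None
    return data.get("description")


class StubIndex:
    """Offline stand-in for the index; the data below is made up."""

    _releases = {
        ("example", "1.0"): {
            "name": "example",
            "version": "1.0",
            "description": "See http://svn.example.org/example/trunk#egg=example-dev",
        },
    }

    def release_data(self, name, version):
        # An unknown (name, version) pair yields an empty dict in this stub.
        return self._releases.get((name, version), {})


client = StubIndex()
print(describe_release(client, "example", "1.0"))
```

The caller still has to know the exact version string in advance; the sketch says nothing about how versions would be discovered.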
-- Jeremy Kloth http://4suite.org/ From jim at zope.com Thu Jul 12 01:32:21 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 19:32:21 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <20070711223812.D02D13A404D@sparrow.telecommunity.com> Message-ID: <484AE499-EB19-4831-9AFB-1BCC3FCE9249@zope.com> On Jul 11, 2007, at 6:40 PM, Phillip J. Eby wrote: ... > There are two pieces still missing: > > 1. Access to "hidden" packages' release info I'm not sure what you are referring to here. Are you talking about hidden releases? Or something else? > 2. Links in the long_description that are rendered by PyPI's web > interface > > Without #2, we can't pick up author-provided Subversion links; see: > > http://peak.telecommunity.com/DevCenter/setuptools#making-your- > package-available-for-easyinstall > for details. AFAICT, the information is available in the output of the release_data method. ... > To be honest, the one thing I did *not* anticipate in this design > was that Jim would be making 20 releases of the same package > available in "unhidden" form. :) I assume you understand why this is needed. (Or maybe it isn't needed and I'm missing something.) We need to be able to depend on old versions and AFAICT, setuptools can't see hidden releases. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Thu Jul 12 01:46:45 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 19:46:45 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <200707120811.49824.richardjones@optushome.com.au> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <200707120811.49824.richardjones@optushome.com.au> Message-ID: On Jul 11, 2007, at 6:11 PM, Richard Jones wrote: > On Thu, 12 Jul 2007, Martin v. Löwis wrote: >>> The questions for us is, how much effort we are willing to make to >>> prevent people from shooting themselves in the foot. I can >>> understand >>> why Phillip would like the package index to prevent people from >>> choosing >>> problematic package names. >> >> That's not my understanding - the issue isn't with "problematic >> package >> names", but with conflicting package names. IOW, any single name is >> fine - it's a pair of names that would cause a problem (and only if >> you wanted to install both packages on the same system). > > A big issue that's not been raised is that *distutils* have no > package name > rules, but it's being proposed that PyPI does - thus a package > author will > potentially get an error when uploading their package, and also the > name that > appears in the index may be quite different to the name of their > package. Maybe distutils should have more package name rules than it does now. We (the Community) should be free to change things based on experience. We now have a lot more experience with this stuff than we had a few years ago. Maybe we should consider a reset. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Thu Jul 12 01:47:59 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 11 Jul 2007 19:47:59 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <297846B8-94DC-4770-9476-711796E82FEC@zope.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> Message-ID: On Jul 11, 2007, at 6:51 PM, Jim Fulton wrote: > Another good reason is to allow static > implementation, which would be far more scalable in the long run. ATM, from my machine, xml-rpc requests to PyPI are taking about .27 seconds. This is only a little less than regular page requests. With the current API, It would require at best 3 requests to get all of the distribution URLs. Presumably, with a change to the API, we could get this down to one request, but that's still a long time given the demand I expect on PyPI in the future. It would be so much simpler to just publish a static page for each package that setuptools could parse. I'll try to prototype this. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From waterbug at pangalactic.us Thu Jul 12 03:17:00 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Wed, 11 Jul 2007 21:17:00 -0400 Subject: [Catalog-sig] No more cc's please (was Re: start on static generation, and caching - apache config.) In-Reply-To: <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <721297D4-85EA-4397-84C9-D90E5598477A@zope.com> <4695241D.3090203@v.loewis.de> <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com> <46953287.8020702@v.loewis.de> <46953A70.6070600@v.loewis.de> <469540FD.5060109@v.loewis.de> <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com> Message-ID: <4695810C.7070606@pangalactic.us> Everyone: Please exclude me from the cc's of all messages you send to the list! 
I'm a *member* of the catalog-sig list, so I'm getting 2 copies of every message in this thread and it's getting annoying. I'm against all this cc crap anyway -- that's why we have a *list*, dammit! (Geez, one would think Python programmers would be more email literate! grumble.) Thanks, Steve From martin at v.loewis.de Thu Jul 12 07:11:50 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 12 Jul 2007 07:11:50 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> Message-ID: <4695B816.9020706@v.loewis.de> > ATM, from my machine, xml-rpc requests to PyPI are taking about .27 > seconds. This is only a little less than regular page requests. With > the current API, It would require at best 3 requests to get all of the > distribution URLs. Presumably, with a change to the API, we could get > this down to one request, but that's still a long time given the demand > I expect on PyPI in the future. You seem to assume that if you see a round trip time of .27 seconds, that then PyPI could only do 3 requests per second. That is not so. I just logged onto www.python.org (a machine that is close to cheeseshop.python.org), and called this function:

>>> s=xmlrpclib.ServerProxy("http://cheeseshop.python.org/pypi")
>>> def f():
...     start=time.time()
...     for i in range(1000):s.package_releases('setuptools')
...     return time.time()-start
...
>>> f()
7.6247878074645996

So it can currently do 130 XML-RPC requests per second, to a single client. Inverting it, a request takes 0.0076s, which is a lot less than 0.27s. Regards, Martin From pje at telecommunity.com Thu Jul 12 07:48:38 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu, 12 Jul 2007 01:48:38 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707111720.49299.jeremy.kloth@4suite.org> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <200707120809.48344.richardjones@optusnet.com.au> <20070711223812.D02D13A404D@sparrow.telecommunity.com> <200707111720.49299.jeremy.kloth@4suite.org> Message-ID: <20070712054627.886D13A404D@sparrow.telecommunity.com> At 05:20 PM 7/11/2007 -0600, Jeremy Kloth wrote: >On Wednesday 11 July 2007 4:40:26 pm Phillip J. Eby wrote: > > 1. Access to "hidden" packages' release info > >This already exists. Simply call release_data() with the exact >version you are >interested in. It returns the metadata regardless of the "hidden" flag. There is no way to discover those versions, however, AFAICT > > 2. Links in the long_description that are rendered by PyPI's web interface > >The 'description' key in the dictionary returned by release_data() contains >the long_description as provided by the package's setup.py. I would think >that scanning just that should be simpler than relying on particular >formatting of the PyPI generated package page. Alas, this entire subject area is one where lots of people "would think" that such-and-such a thing would be simpler, but isn't. :( In this case, long_description is allowed to be reStructured Text, which nothing less than a full reST parser can handle. It's much easier to scan for a simple regular expression pattern to pull the links out of HTML, than to handle all the ways URLs can be spelled in reST, AFAICT. That having been said, I've never actually made the attempt, for simple historical reasons. I'll happily review patches for the functionality, as long as they can gracefully fall back to non-XML-RPC use, or provide an option to disable it so people using their own static indexes can still function. 
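
A minimal sketch of the "simple regular expression" approach to pulling
links out of HTML that the message above contrasts with parsing reST.
The pattern, the function name find_links and the sample page are all
hypothetical illustrations, not setuptools' actual code, and a pattern
this small will miss unusual markup (unquoted attributes, for instance).

```python
import re

# Hypothetical pattern: the quoted href value of an <a ...> tag.
# Assumption: attribute values are wrapped in single or double quotes.
HREF = re.compile(r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def find_links(html):
    """Return every quoted href value found in anchor tags, in page order."""
    return HREF.findall(html)


# Made-up page content, standing in for a generated package page.
sample_page = """
<html><body>
<a href="http://example.org/dist/Spam-1.0.tar.gz">Spam-1.0.tar.gz</a>
<A HREF='http://example.org/home'>Home page</A>
</body></html>
"""

links = find_links(sample_page)
for link in links:
    print(link)
```

Doing the same against a reST long_description would first require
rendering or parsing the reST, which is the extra cost the message
points out.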
From renesd at gmail.com Thu Jul 12 08:01:00 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Thu, 12 Jul 2007 16:01:00 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> Message-ID: <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com> xmlrpc uses POST. So it's terrible for performance, and semantically impossible to cache. On 7/12/07, Richard Jones wrote: > On Thu, 12 Jul 2007, you wrote: > > Yup. Absolutely. That's why it we should change the index or > > setuptools, or both. IMO, it makes the most sense to change the > > index to have setuptools specific pages, in addition to the ones for > > humans, that allow: > > ... you know about the XML-RPC interface, yes? > > http://wiki.python.org/moin/CheeseShopXmlRpc > > I never fully understood why setuptools went with HTML scraping instead of > XML-RPC. > > > Richard > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > From renesd at gmail.com Thu Jul 12 08:15:23 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Thu, 12 Jul 2007 16:15:23 +1000 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com> Message-ID: <64ddb72c0707112315q6f34439en79b437ad1e9c4d6e@mail.gmail.com> hellos, ok, maybe I'm wrong about the performance of this interface! 
I guess I meant in general - using POST for GET requests is not such a
nice thing.

cu.

On 7/12/07, René Dudfield wrote:
> xmlrpc uses POST. So it's terrible for performance, and semantically
> impossible to cache.
>
>
> On 7/12/07, Richard Jones wrote:
> > On Thu, 12 Jul 2007, you wrote:
> > > Yup. Absolutely. That's why it we should change the index or
> > > setuptools, or both. IMO, it makes the most sense to change the
> > > index to have setuptools specific pages, in addition to the ones for
> > > humans, that allow:
> >
> > ... you know about the XML-RPC interface, yes?
> >
> > http://wiki.python.org/moin/CheeseShopXmlRpc
> >
> > I never fully understood why setuptools went with HTML scraping instead of
> > XML-RPC.
> >
> >
> > Richard
> > _______________________________________________
> > Catalog-SIG mailing list
> > Catalog-SIG at python.org
> > http://mail.python.org/mailman/listinfo/catalog-sig
> >
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From jim at zope.com Thu Jul 12 12:34:19 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 06:34:19 -0400
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <4695B816.9020706@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
 <46953A70.6070600@v.loewis.de>
 <200707120809.48344.richardjones@optusnet.com.au>
 <297846B8-94DC-4770-9476-711796E82FEC@zope.com>
 <4695B816.9020706@v.loewis.de>
Message-ID: <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>

On Jul 12, 2007, at 1:11 AM, Martin v. Löwis wrote:

>> ATM, from my machine, xml-rpc requests to PyPI are taking about .27
>> seconds. This is only a little less than regular page requests. With
>> the current API, It would require at best 3 requests to get all of the
>> distribution URLs.
Presumably, with a change to the API, we could get
>> this down to one request, but that's still a long time given the demand
>> I expect on PyPI in the future.
>
> You seem to assume that if you see a round trip time of .27 seconds,
> that then PyPI could only do 3 requests per second. That is not so.

Yeah, it occurred to me on my way home that a substantial part of the
time might be due to distance.

I wonder what times ab against http://www.python.org/pypi/ZODB3 from
inside the python.org network would give.

I wonder if it would help much to make multiple HTTP requests in the
same connection. This might be something to look at in setuptools
and/or xmlrpclib.

....

> So it can currently do 130 XML-RPC requests per second, to
> a single client. Inverting it, a request takes 0.0076s,
> which is a lot less than 0.27s.

Cool. That's much better. Thanks for trying this.

OTOH, this points up a couple things:

1. Since many people will be far away from PyPI, I think our long-term
   plan should encompass geographic mirrors. It's good that the
   server is spending a small amount of time, but it still takes *me* a
   long time to get data.

2. It's important to reduce the number of round trips.

I'm still opposed to using XML-RPC because:

- It's harder to mirror, and

- It's still slower than static pages.

Note that after our discussion, I'm equally against the current
approach of parsing a human interface. I still think it makes a lot
more sense to have a tailored interface for setuptools.

Jim

--
Jim Fulton          mailto:jim at zope.com     Python Powered!
CTO                 (540) 361-1714            http://www.python.org
Zope Corporation    http://www.zope.com       http://www.zope.org

From benji at benjiyork.com Thu Jul 12 13:26:44 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 12 Jul 2007 07:26:44 -0400
Subject: [Catalog-sig] No more cc's please (was Re: start on static generation, and caching - apache config.)
In-Reply-To: <4695810C.7070606@pangalactic.us>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
 <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
 <4695241D.3090203@v.loewis.de>
 <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
 <46953287.8020702@v.loewis.de>
 <46953A70.6070600@v.loewis.de>
 <469540FD.5060109@v.loewis.de>
 <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
 <4695810C.7070606@pangalactic.us>
Message-ID: <46960FF4.3050609@benjiyork.com>

Stephen Waterbury wrote:
> Please exclude me from the cc's of all messages you send to the list!
> I'm a *member* of the catalog-sig list, so I'm getting 2 copies of every
> message in this thread and it's getting annoying. I'm against all this
> cc crap anyway -- that's why we have a *list*, dammit! (Geez, one
> would think Python programmers would be more email literate! grumble.)

Go to http://mail.python.org/mailman/options/catalog-sig and set the
"Avoid duplicate copies of messages?" option to "Yes".

(One would think a list member would be more mailman literate!)
--
Benji York
http://benjiyork.com

From amk at amk.ca Thu Jul 12 14:20:30 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 12 Jul 2007 08:20:30 -0400
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <46951B55.9050009@v.loewis.de>
References: <4692B3A3.5030209@v.loewis.de>
 <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
 <4693FE94.6090107@v.loewis.de>
 <469446A2.9070500@pangalactic.us>
 <46946A69.4000702@v.loewis.de>
 <46951612.9010009@v.loewis.de>
 <46951B55.9050009@v.loewis.de>
Message-ID: <20070712122030.GA5853@amk-desktop.matrixgroup.net>

On Wed, Jul 11, 2007 at 08:03:01PM +0200, "Martin v. Löwis" wrote:
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
>
> Neither is going on for PyPI, AFAIK - it's mod_fastcgi.
www.python.org/pypi does use mod_proxy to provide PyPI access from the old URL; it's possible these users were going through www.python.org. --amk From gentoodev at gmail.com Thu Jul 12 16:39:14 2007 From: gentoodev at gmail.com (Rob Cakebread) Date: Thu, 12 Jul 2007 07:39:14 -0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com> Message-ID: <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com> On 7/11/07, Fred Drake wrote: > On 7/11/07, Nathan R. Yergler wrote: > > The speed has noticeably improved (thanks!) but as recently as Monday > > PyPI was unresponsive and then returning proxy errors. It definitely > > caused us (Creative Commons) to lose productivity Monday afternoon > > (PDT). > > We're seeing this right now, too. I'm checking both www.python.org > and cheeseshop.python.org. > > As of 7:30am PST it's timing out on the website and via XML-RPC, testing from L.A. or Germany. From pje at telecommunity.com Thu Jul 12 20:07:52 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 12 Jul 2007 14:07:52 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <469522D6.1070706@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> Message-ID: <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> At 08:35 PM 7/11/2007 +0200, Martin v. L?wis wrote: > > The questions for us is, how much effort we are willing to make to > > prevent people from shooting themselves in the foot. I can understand > > why Phillip would like the package index to prevent people from choosing > > problematic package names. > >That's not my understanding - the issue isn't with "problematic package >names", but with conflicting package names. IOW, any single name is >fine - it's a pair of names that would cause a problem (and only if >you wanted to install both packages on the same system). It's also a problem for locating the correct package in the first place... which seems to fall under the jurisdiction of a "package index". :) This is just as important for direct human users of the Cheeseshop, as it is for the humans using software to access the Cheeseshop. From jim at zope.com Thu Jul 12 20:15:03 2007 From: jim at zope.com (Jim Fulton) Date: Thu, 12 Jul 2007 14:15:03 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> Message-ID: <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> On Jul 12, 2007, at 2:07 PM, Phillip J. Eby wrote: > At 08:35 PM 7/11/2007 +0200, Martin v. L?wis wrote: >> > The questions for us is, how much effort we are willing to make to >> > prevent people from shooting themselves in the foot. I can >> understand >> > why Phillip would like the package index to prevent people from >> choosing >> > problematic package names. >> >> That's not my understanding - the issue isn't with "problematic >> package >> names", but with conflicting package names. IOW, any single name is >> fine - it's a pair of names that would cause a problem (and only if >> you wanted to install both packages on the same system). > > It's also a problem for locating the correct package in the first > place... which seems to fall under the jurisdiction of a "package > index". :) > > This is just as important for direct human users of the Cheeseshop, > as it is for the humans using software to access the Cheeseshop. I want to make sure I understand this. 
I would hope that searching would be case insensitive and otherwise flexible wrt names. Is there any reason we can't expect URLs and requirement specifications to be precisely spelled? That is, if someone names their package "sPaM", I see no reason why PyPI needs to support anything other than http:// www.python.org/pypi/sPaM as the one URL of the package. Someone should be able to use the search UI to search for "spam" and see a result that includes "sPaM". From then on, they should be able to type the name "sPaM". Or am I missing something? Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Thu Jul 12 20:43:11 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 12 Jul 2007 14:43:11 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> Message-ID: <20070712184056.F219A3A40B0@sparrow.telecommunity.com> At 02:15 PM 7/12/2007 -0400, Jim Fulton wrote: >I want to 
make sure I understand this. I would hope that searching >would be case insensitive and otherwise flexible wrt names. PyPI's searching is indeed case insensitive, and is a substring/keyword search as well. > Is there >any reason we can't expect URLs and requirement specifications to be >precisely spelled? That is, if someone names their package "sPaM", I >see no reason why PyPI needs to support anything other than http:// >www.python.org/pypi/sPaM as the one URL of the package. Someone >should be able to use the search UI to search for "spam" and see a >result that includes "sPaM". From then on, they should be able to >type the name "sPaM". Or am I missing something? You're missing that the subject is about similarity of names. A typo of say, 'SPam' shouldn't return me some package *other* than the one I'm looking for. It'd be nice if the resulting page said something besides "Not Found", too... like "there's no SPam, but here are a bunch of packages whose name contains 'spam'". If it did that, setuptools would be able to find the right page without hitting the main index, too. But redirection, as proposed by Martin, also accomplishes the same thing. And again, all this helps human direct users of the index, too. 
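
A minimal sketch of the "there's no SPam, but here are a bunch of
packages whose name contains 'spam'" behaviour suggested above. The
function name suggest and the list of known names are hypothetical;
this only illustrates the idea and says nothing about how PyPI itself
is implemented.

```python
def suggest(requested, known_names):
    """Look up a package name exactly; on a miss, offer near matches.

    Returns (exact_name, suggestions). exact_name is None when the
    precise spelling is not registered; suggestions then lists every
    known name that contains the request, ignoring case.
    """
    if requested in known_names:
        return requested, []
    needle = requested.lower()
    return None, sorted(n for n in known_names if needle in n.lower())


# Hypothetical index contents.
known = ["sPaM", "spam-utils", "ZODB3", "setuptools"]

exact, hints = suggest("SPam", known)
if exact is None:
    print("there's no SPam, but these names contain 'spam': %s" % ", ".join(hints))
```

The exact lookup stays case sensitive; only the fallback listing is
relaxed, so a typo never silently resolves to a different package.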
From jim at zope.com Thu Jul 12 21:02:10 2007 From: jim at zope.com (Jim Fulton) Date: Thu, 12 Jul 2007 15:02:10 -0400 Subject: [Catalog-sig] Case sensitivity of package names In-Reply-To: <20070712184056.F219A3A40B0@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> <20070712184056.F219A3A40B0@sparrow.telecommunity.com> Message-ID: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> On Jul 12, 2007, at 2:43 PM, Phillip J. Eby wrote: > At 02:15 PM 7/12/2007 -0400, Jim Fulton wrote: >> I want to make sure I understand this. I would hope that searching >> would be case insensitive and otherwise flexible wrt names. > > PyPI's searching is indeed case insensitive, and is a substring/ > keyword search as well. > > >> Is there >> any reason we can't expect URLs and requirement specifications to be >> precisely spelled? That is, if someone names their package "sPaM", I >> see no reason why PyPI needs to support anything other than >> http:// www.python.org/pypi/sPaM as the one URL of the package. >> Someone >> should be able to use the search UI to search for "spam" and see a >> result that includes "sPaM". 
From then on, they should be able to
>> type the name "sPaM". Or am I missing something?
>
> You're missing that the subject is about similarity of names.
> A typo of say, 'SPam' shouldn't return me some package *other*
> than the one I'm looking for.

No, I understand that part. I understand the desire to avoid
conflicts that cause problems down the road. I would prefer to
"disallow" this by rejecting new package names that are too similar
to already-registered packages.

> It'd be nice if the resulting page said something besides "Not
> Found", too... like "there's no SPam, but here are a bunch of
> packages whose name contains 'spam'".

I think this would be fine in a human interface.

> If it did that, setuptools would be able to find the right page
> without hitting the main index, too. But redirection, as proposed
> by Martin, also accomplishes the same thing.

I really don't like this for setuptools. My preference is that
setuptools should be required to ask for a package with precise
spelling.

> And again, all this helps human direct users of the index, too.

I think it encourages humans to do bad things. If someone misspells
ZODB3 as zodb3 and is able to install it with easy_install, then
they'll be tempted to use the name "zodb3" in their requirements
specifications. That is a bad thing IMO. We're talking about
technical users and I think it is reasonable to expect them to be
precise in their specifications.

I could live with case-insensitive package names if we (for some
definition of we, possibly being Guido) decided we want them, but I'd
prefer they be case sensitive. I'd still be in favor of avoiding
confusing duplicates. If we stick with case-sensitive package names,
then I'd prefer that the interaction of setuptools with the index be
case sensitive.

I wouldn't object to setuptools giving people help. So, for example,
if I type "zodb3", I wouldn't object to setuptools letting the user
know that maybe they should use ZODB3.
Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Thu Jul 12 21:26:02 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 12 Jul 2007 15:26:02 -0400 Subject: [Catalog-sig] Case sensitivity of package names In-Reply-To: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> <20070712184056.F219A3A40B0@sparrow.telecommunity.com> <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> Message-ID: <20070712192350.347B13A40B0@sparrow.telecommunity.com> At 03:02 PM 7/12/2007 -0400, Jim Fulton wrote: >We're talking about >technical users and I think it is reasonable to expect them to be >precise in their specifications. IMO, "technical users" is a wider range of people than you seem to be thinking of. In any case, this is a separate topic from disallowing too-similar names -- which you agree we should do. Whether to then also introduce case-sensitivity into various parts of easy_install is another subject that doesn't really matter to the catalog-sig. 
Please note, however, that it is not a minor change by any means -- case-insensitivity exists throughout pkg_resources and setuptools to handle operating system filename case-insensitivity, not just for index lookups. In fact, I believe the index lookups *are* case-sensitive; IIRC it's only link parsing that is case-insensitive. From jim at zope.com Thu Jul 12 21:31:45 2007 From: jim at zope.com (Jim Fulton) Date: Thu, 12 Jul 2007 15:31:45 -0400 Subject: [Catalog-sig] Case sensitivity of package names In-Reply-To: <20070712192350.347B13A40B0@sparrow.telecommunity.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> <20070712184056.F219A3A40B0@sparrow.telecommunity.com> <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> <20070712192350.347B13A40B0@sparrow.telecommunity.com> Message-ID: <4CD1A7D8-1911-45C9-AB08-C4DC3E1CDFA9@zope.com> On Jul 12, 2007, at 3:26 PM, Phillip J. Eby wrote: ... > Whether to then also introduce case-sensitivity into various parts > of easy_install is another subject that doesn't really matter to > the catalog-sig. I'm not sure we agree on what matters to the catalog sig. 
:) (I still need to respond to your note on that topic.)

> Please note, however, that it is not a minor change by any means --
> case-insensitivity exists throughout pkg_resources and setuptools
> to handle operating system filename case-insensitivity, not just
> for index lookups. In fact, I believe the index lookups *are*
> case-sensitive; IIRC it's only link parsing that is case-insensitive.

I'm not suggesting that you shouldn't deal with file-system case
insensitivity. If I were to change setuptools to match my opinion, I
would probably just change the code that tries to get a package
listing to look for close matches to print a suggestion and stop
rather than guessing a package name and continuing.

Jim

--
Jim Fulton          mailto:jim at zope.com     Python Powered!
CTO                 (540) 361-1714            http://www.python.org
Zope Corporation    http://www.zope.com       http://www.zope.org

From martin at v.loewis.de Thu Jul 12 23:09:32 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 23:09:32 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
 <46953A70.6070600@v.loewis.de>
 <200707120809.48344.richardjones@optusnet.com.au>
 <297846B8-94DC-4770-9476-711796E82FEC@zope.com>
 <4695B816.9020706@v.loewis.de>
 <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
Message-ID: <4696988C.6050309@v.loewis.de>

> I wonder what times ab against http://www.python.org/pypi/ZODB3 from
> inside the python.org network would give.

I just measured it. 1000 requests take 17s using urllib, giving
about 60 requests per second.

> I wonder if it would help much to make multiple HTTP requests in the
> same connection. This might be something to look at in setuptools
> and/or xmlrpclib.

Only for remote connections, due to the round-trips required for
TCP handshake.
Locally, Apache opens a new connection to the FCGI
servers per request (using the farmer-worker pattern).

> 1. Since many people will be far away from PyPI, I think our long-term
> plan should encompass geographic mirrors. It's good that the
> server is spending a small amount of time, but it still takes *me* a
> long time to get data.

Ok. I am, in general, skeptical about mirroring. However, if it
makes people happy, feel free to implement it.

A number of issues should be considered, of course:
- there should be a way to get authoritative answers somehow, preferably
  from mirrors, but, if necessary, from the main site
- I really wish to collect download counters across mirrors. "Official"
  mirrors should be obliged to report download statistics once a day
  or so.

> 2. It's important to reduce the number of round trips.

A colleague today suggested that the best way to reduce round trips
is to give each machine a local copy of the index, the same way
Debian apt works: you do 'apt-get update', and then have a local
copy of the catalog that you can build against. No roundtrips
at all (except for the one to update the local catalog), for the
expense of being out of date if you don't manually update the
catalog.

Regards,
Martin

From martin at v.loewis.de Thu Jul 12 23:12:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 23:12:55 +0200
Subject: [Catalog-sig] www.python.org/pypi might redirect?
In-Reply-To: <20070712122030.GA5853@amk-desktop.matrixgroup.net> References: <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <46951612.9010009@v.loewis.de> <46951B55.9050009@v.loewis.de> <20070712122030.GA5853@amk-desktop.matrixgroup.net> Message-ID: <46969957.1020404@v.loewis.de> > www.python.org/pypi does use mod_proxy to provide PyPI access from the > old URL; it's possible these users were going through www.python.org. I wonder why that is. Would there be anything wrong with making that a (permanent) redirect instead? Users of the old URL should see a speedup if they do many requests; all relative URLs would directly go to cheeseshop, rather than having to pass through www.python.org again. Regards, Martin From martin at v.loewis.de Thu Jul 12 23:25:58 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Jul 2007 23:25:58 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com> <4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us> <46946A69.4000702@v.loewis.de> <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com> <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com> Message-ID: <46969C66.2020806@v.loewis.de> > As of 7:30am PST it's timing out on the website and via XML-RPC, > testing from L.A. or Germany. It seems the same crash of all FCGI servers (with a failure of mod_fcgi to restart them) has happened again. I still have no clue what's causing it, but I added a watchdog that should restart it within a minute the next time. 
Regards, Martin From martin at v.loewis.de Thu Jul 12 23:38:50 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Jul 2007 23:38:50 +0200 Subject: [Catalog-sig] Case sensitivity of package names In-Reply-To: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> <20070712184056.F219A3A40B0@sparrow.telecommunity.com> <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> Message-ID: <46969F6A.8030904@v.loewis.de> > I really don't like this for setuptools. My preference is that > setuptools should be required to ask for a package with precise > spelling. I think the way setuptools currently works is this: Every name gets converted to its lower-case safe-name equivalent. All dependencies, file names, resource identifications etc are based on that version of the name, *not* the "true" name of the package. Then, when setuptools tries to find a package whose "true" name is in mixed-case, it uses the lower-cased safe-named version, and PyPI reports that the package does not exist. Then, setuptools queries the entire package list, trying to find out the original spelling of the package. I'm sure Phillip will correct me if I'm wrong. 
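
A minimal sketch of the lookup sequence described above: normalize the
requested name, try it directly, and only on a miss scan the full
package list for the original spelling. The helper names, the
normalization rule and the sample index are assumptions made for
illustration; this is not setuptools' actual code.

```python
import re


def normalize(name):
    """Assumed 'lower-case safe name': runs of characters other than
    letters, digits and '.' become a single '-', then lower-case."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


def find_true_name(requested, index_names):
    """Return the index's own spelling of a package, or None.

    index_names stands in for the full package list, which a real
    client would only fetch after the direct lookup failed.
    """
    if requested in index_names:
        return requested              # direct lookup succeeded
    wanted = normalize(requested)
    for name in index_names:          # fallback: scan the whole list
        if normalize(name) == wanted:
            return name
    return None


# Hypothetical index contents.
index = ["ZODB3", "setuptools", "My Package"]
print(find_true_name("zodb3", index))
```

The fallback scan is the expensive step: it needs the entire package
list just to recover one name's capitalization.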
> I could live with case-insensitive package names if we (for some > definition of we, possibly being Guido) decided we want them, but I'd > prefer they be case sensitive. I'd still be in favor of avoiding > confusing duplicates. If we stick with case-sensitive package names, > then I'd prefer that the interaction of setuptools with the index be > case sensitive. See above - I believe setuptools package names are case insensitive today. Regards, Martin From jim at zope.com Fri Jul 13 01:14:33 2007 From: jim at zope.com (Jim Fulton) Date: Thu, 12 Jul 2007 19:14:33 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <4696988C.6050309@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> Message-ID: On Jul 12, 2007, at 5:09 PM, Martin v. Löwis wrote: ... >> I wonder if it would help much to make multiple HTTP requests in the >> same connection. This might be something to look at in setuptools >> and/or xmlrpclib. > > Only for remote connections, due to the round-trips required for > TCP handshake. Locally, Apache opens a new connection to the FCGI > servers per request (using the farmer-worker pattern). Right, but most connections will be remote, so this is a potential win. > >> 1. Since many people will be far away from PyPI, I think our long- >> term plan should encompass geographic mirrors. It's good that the >> server is spending a small amount of time, but it still takes *me* a >> long time to get data. > > Ok. I am, in general, skeptical about mirroring. However, if it > makes people happy, feel free to implement it.
My goal is to have PyPI provide a simplified version of the data for use by setuptools that is easily mirrored using standard mirroring tools. (I may actually prototype this with a kind of mirror.) > A number of issues should be considered, of course: > - there should be a way to get authoritative answers somehow, > preferably > from mirrors, but, if necessary, from the main site I don't know what you mean. I envision mirrors as being read-only and only used by setuptools. The main site would certainly be authoritative. > - I really wish to collect download counters across mirrors. > "Official" > mirrors should be obliged to report download statistics once a day > or so. OK. > >> 2. It's important to reduce the number of round trips. > > A colleague today suggested that the best way to reduce round trips > is to give each machine a local copy of the index, the same way > Debian apt works: you do 'apt-get update', and then have a local > copy of the catalog that you can build against. No roundtrips > at all (except for the one to update the local catalog), for the > expense of being out of date if you don't manually update the > catalog. Yup. This might be a really nice way to go. It would be especially nice if a client could contact PyPI and ask for new data since a given time. I imagine that this request could be as cheap as the requests we have now, unless a client was very out of date. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Fri Jul 13 01:35:05 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 12 Jul 2007 19:35:05 -0400 Subject: [Catalog-sig] Case sensitivity of package names In-Reply-To: <46969F6A.8030904@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <20070712180539.3BFB43A40D7@sparrow.telecommunity.com> <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com> <20070712184056.F219A3A40B0@sparrow.telecommunity.com> <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com> <46969F6A.8030904@v.loewis.de> Message-ID: <20070712233252.3C2913A40A9@sparrow.telecommunity.com> At 11:38 PM 7/12/2007 +0200, Martin v. L?wis wrote: > > I really don't like this for setuptools. My preference is that > > setuptools should be required to ask for a package with precise > > spelling. > >I think the way setuptools currently works is this: > >Every name gets converted to its lower-case safe-name equivalent. >All dependencies, file names, resource identifications etc >are based on that version of the name, *not* the "true" >name of the package. Object comparisons are done case-insensitively, but the objects themselves keep the case-insensitive forms ('key' attributes) separate from the originally-input names ('project_name' attributes). >Then, when setuptools tries to find a package whose "true" >name is in mixed-case, it uses the lower-cased safe-named >version, and PyPI reports that the package does not exist. 
>Then, setuptools queries the entire package list, trying >to find out the original spelling of the package. This is almost correct, except that it actually tries to lookup whatever the user actually input, then the safe_name() form of that. For index lookups, it does not actually change the case of what was entered, so if the user enters something that exactly matches what's on PyPI, they'll have a better chance of getting everything in one request.... unless there are multiple versions listed, of course. From pje at telecommunity.com Fri Jul 13 01:43:04 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 12 Jul 2007 19:43:04 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> Message-ID: <20070712234049.97ED63A40A9@sparrow.telecommunity.com> At 07:14 PM 7/12/2007 -0400, Jim Fulton wrote: >On Jul 12, 2007, at 5:09 PM, Martin v. L?wis wrote: > >> 2. It's important to reduce the number of round trips. > > > > A colleague today suggested that the best way to reduce round trips > > is to give each machine a local copy of the index, the same way > > Debian apt works: you do 'apt-get update', and then have a local > > copy of the catalog that you can build against. No roundtrips > > at all (except for the one to update the local catalog), for the > > expense of being out of date if you don't manually update the > > catalog. > >Yup. This might be a really nice way to go. It would be especially >nice if a client could contact PyPI and ask for new data since a >given time. I imagine that this request could be as cheap as the >requests we have now, unless a client was very out of date. 
Such a query could simply consist of which packages had been updated, and the data could then be cleared from the local cache. The downside to this approach is that it's not any faster for anything you've never downloaded before. So, I'm not really sure how to create a quality user experience with edge caching alone. It seems to me that geographically localized mirrors are needed to provide infrequent users and new users with good performance. And presumably, the commercial users who are having issues now, want their users as well as their developers to have good performance. (Personally, I find it extremely irritating every time the "yum" package manager makes me wait for it to download a bunch of repository data that isn't necessarily even related to what I just asked it to do.) From doug at hellfly.net Fri Jul 13 03:26:12 2007 From: doug at hellfly.net (Doug Hellmann) Date: Thu, 12 Jul 2007 21:26:12 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> Message-ID: <786451BD-A013-48C1-87B9-884F46151B81@hellfly.net> On Jul 12, 2007, at 7:14 PM, Jim Fulton wrote: >>> 2. It's important to reduce the number of round trips. >> >> A colleague today suggested that the best way to reduce round trips >> is to give each machine a local copy of the index, the same way >> Debian apt works: you do 'apt-get update', and then have a local >> copy of the catalog that you can build against. No roundtrips >> at all (except for the one to update the local catalog), for the >> expense of being out of date if you don't manually update the >> catalog. > > Yup. This might be a really nice way to go. 
It would be especially > nice if a client could contact PyPI and ask for new data since a > given time. I imagine that this request could be as cheap as the > requests we have now, unless a client was very out of date. That sounds like RSS. Doug From martin at v.loewis.de Fri Jul 13 10:04:33 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Jul 2007 10:04:33 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> Message-ID: <46973211.1060801@v.loewis.de> >> A number of issues should be considered, of course: >> - there should be a way to get authoritative answers somehow, preferably >> from mirrors, but, if necessary, from the main site > > I don't know what you mean. I envision mirrors as being read-only and > only used by setuptools. The main site would certainly be authoritative. The problem is with outdated information. With a mirror, the question is always "is my information current". Perhaps it's ok for users of a mirror to use outdated information. However, when people register a package, then use setuptools to install it, they might be puzzled that it won't find the package just because it was using an outdated mirror. In many cases, it's fine to use outdated information, of course, e.g. if you know that the package hasn't been released for many weeks now, or in case you will update the next day again, and then fetch the newer release. > Yup. This might be a really nice way to go. It would be especially nice > if a client could contact PyPI and ask for new data since a given time. 
> I imagine that this request could be as cheap as the requests we have > now, unless a client was very out of date. PyPI already supports that: the updated_releases RPC call will return all packages that have changed since a given date. Regards, Martin From jim at zope.com Fri Jul 13 16:59:01 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 10:59:01 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46973211.1060801@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> Message-ID: On Jul 13, 2007, at 4:04 AM, Martin v. L?wis wrote: >>> A number of issues should be considered, of course: >>> - there should be a way to get authoritative answers somehow, >>> preferably >>> from mirrors, but, if necessary, from the main site >> >> I don't know what you mean. I envision mirrors as being read-only >> and >> only used by setuptools. The main site would certainly be >> authoritative. > > The problem is with outdated information. With a mirror, the question > is always "is my information current". Perhaps it's ok for users of > a mirror to use outdated information. However, when people register > a package, then use setuptools to install it, they might be puzzled > that it won't find the package just because it was using an outdated > mirror. I agree 100% with this concern, which is why I was skeptical of caching in the classical form. Right. So the question is, how can we keep the mirror up to date? :) >> Yup. This might be a really nice way to go. It would be especially >> nice >> if a client could contact PyPI and ask for new data since a given >> time. 
>> I imagine that this request could be as cheap as the requests we have >> now, unless a client was very out of date. > > PyPI already supports that: the updated_releases RPC call will return > all packages that have changed since a given date. Awesome! Too bad it wasn't shown in: http://wiki.python.org/moin/CheeseShopXmlRpc I'll look at the source (location hints welcome) and update that page. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Fri Jul 13 17:14:38 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Jul 2007 17:14:38 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> Message-ID: <469796DE.805@v.loewis.de> > Right. So the question is, how can we keep the mirror up to date? :) I think there is no efficient way to provide perfect synchronization (not without putting too much load on the central server again). If slight propagation delays are acceptable, the central server could publish a sequence number for each update performed, and mirrors could check with a single round trip what the most current sequence number is. Then it is the mirror's choice how much it can age; checking every minute would be reasonable IMO for most purposes; users that want to see their just-uploaded stuff then would either need to wait that minute, or go to the master site, or fetch the sequence number of the master site and compare it with the one of the mirror they use.
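[Editorial sketch, not part of the original message.] The sequence-number scheme outlined above might look roughly like this. Everything here is hypothetical: the Master and Mirror classes, their method names, and the in-memory journal are assumptions made for illustration and stand in for whatever RPC calls and storage a real index and mirror would use.

```python
class Master:
    """Stand-in for the central index: numbers every update it records."""

    def __init__(self):
        self.journal = []                      # list of (serial, package_name)

    def record_update(self, name):
        self.journal.append((len(self.journal) + 1, name))

    def current_serial(self):
        """The cheap, single-round-trip check a mirror would poll."""
        return len(self.journal)

    def changes_since(self, serial):
        return [entry for entry in self.journal if entry[0] > serial]


class Mirror:
    """Stand-in for a read-only mirror that polls the master periodically."""

    def __init__(self, master):
        self.master = master
        self.serial = 0                        # last serial this mirror has seen
        self.stale = set()                     # packages whose data must be refetched

    def poll(self):
        """Return the number of journal entries picked up by this poll."""
        latest = self.master.current_serial()
        if latest == self.serial:
            return 0                           # nothing new; no further requests
        changes = self.master.changes_since(self.serial)
        for _serial, name in changes:
            self.stale.add(name)
        self.serial = latest
        return len(changes)

    def is_current(self):
        """What an impatient uploader could check before trusting the mirror."""
        return self.serial == self.master.current_serial()
```

A mirror would call poll() on a timer (once a minute, in the suggestion above); a user who has just uploaded something could compare serials via is_current() and go to the master site if the mirror lags.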
> I'll look at the source (location hints welcome) and update that page. See http://svn.python.org/view/trunk/pypi/rpc.py?rev=433&root=packages&view=markup Regards, Martin From jim at zope.com Fri Jul 13 18:01:18 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 12:01:18 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <46973211.1060801@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> Message-ID: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> On Jul 13, 2007, at 4:04 AM, Martin v. L?wis wrote: > PyPI already supports that: the updated_releases RPC call will return > all packages that have changed since a given date. It appears that this only shows new releases. If I update a new distribution to a release, it doesn't cause the release to appear as updated. A common scenario for me is that I'll create a release, update a source release, and then, some time later, when someone bugs me, I'll upload a windows egg. The way things are now, the later upload won't be noticed. Of course, the initial upload won't be noticed if someone happens to poll between release creation and the first upload. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Jul 13 18:02:33 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 12:02:33 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <469796DE.805@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <469796DE.805@v.loewis.de> Message-ID: On Jul 13, 2007, at 11:14 AM, Martin v. Löwis wrote: >> Right. So the question is, how can we keep the mirror up to date? :) > > I think there is no efficient way to provide perfect synchronization > (not without putting too much load on the central server again). Well, if the mirrors were known, then the primary could notify them. Of course, that would make them more complex. Of course, polling has its complexities too. > If slight propagation delays are acceptable, it would be possible that > the central server publishes sequence numbers of each update > performed, > and mirrors could check with a single roundtrip what the most current > sequence number is. If the updated_releases actually reflected updates, then I think that would be good enough. Then we could use the UTC second as the sequence number. :) > > Then it is the mirror's choice how much it can age; checking every > minute would be reasonable IMO for most purposes; Yup > users that want > to see their just-uploaded stuff then would either need to wait > that minute, or go to the master site, or fetch the sequence > number of the master site and compare it with the one of the mirror > they use. Yup > >> I'll look at the source (location hints welcome) and update that >> page. > > See > > http://svn.python.org/view/trunk/pypi/rpc.py? > rev=433&root=packages&view=markup Thanks. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Fri Jul 13 18:50:50 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Jul 2007 18:50:50 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> Message-ID: <4697AD6A.1030602@v.loewis.de> > It appears that this only shows new releases. That's true. I don't know why it does that; it may be that this interface predates file uploading. > If I update a new distribution to a release With "distribution", you always mean "file", right? Regards, Martin From martin at v.loewis.de Fri Jul 13 19:07:30 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Jul 2007 19:07:30 +0200 Subject: [Catalog-sig] Effect of HTTP 1.1 Message-ID: <4697B152.7030304@v.loewis.de> I did some measurements, with the script below. For 30 requests, a single HTTP 1.1 connection needs 5.4s over my DSL connection; 30 individual connections need 11.7s. So if setuptools expects to request multiple pages from the index, it would definitely be useful to keep the connection (I don't know at all whether it currently does so already). 
Regards, Martin

import httplib, time

t=time.time()
h = httplib.HTTPConnection("cheeseshop.python.org")
for i in range(30):
    h.putrequest("GET", "/pypi/Lamina/")
    h.endheaders()
    r = h.getresponse()
    r.begin()
    r.read()
h.close()
print time.time()-t

t=time.time()
for i in range(30):
    h = httplib.HTTPConnection("cheeseshop.python.org")
    h.putrequest("GET", "/pypi/Lamina/")
    h.endheaders()
    r = h.getresponse()
    r.begin()
    r.read()
    h.close()
print time.time()-t

From jim at zope.com Fri Jul 13 19:45:10 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 13:45:10 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <4697AD6A.1030602@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> <4697AD6A.1030602@v.loewis.de> Message-ID: On Jul 13, 2007, at 12:50 PM, Martin v. L?wis wrote: >> It appears that this only shows new releases. > > That's true. I don't know why it does that; it may be that this > interface predates file uploading. > >> If I update a new distribution to a release > > With "distribution", you always mean "file", right? Yup. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Jul 13 19:54:57 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 13:54:57 -0400 Subject: [Catalog-sig] Effect of HTTP 1.1 In-Reply-To: <4697B152.7030304@v.loewis.de> References: <4697B152.7030304@v.loewis.de> Message-ID: <657E3A38-2871-4B4F-9CBE-B5A777CFB9F5@zope.com> On Jul 13, 2007, at 1:07 PM, Martin v.
Löwis wrote: > I did some measurements, with the script below. > For 30 requests, a single HTTP 1.1 connection > needs 5.4s over my DSL connection; 30 individual > connections need 11.7s. Interesting. Your DSL times for connection/request are actually longer than what I'm seeing. Maybe geography isn't so important. Measurements are good. It's going to be interesting to see how this all pans out. It's definitely interesting that you doubled the throughput using a single connection. > So if setuptools expects > to request multiple pages from the index, it would > definitely be useful to keep the connection > (I don't know at all whether it currently does so > already). I don't think so. This also looks like a good optimization for xmlrpclib. Thanks for trying this. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Fri Jul 13 20:34:45 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 13 Jul 2007 14:34:45 -0400 Subject: [Catalog-sig] Effect of HTTP 1.1 In-Reply-To: <4697B152.7030304@v.loewis.de> References: <4697B152.7030304@v.loewis.de> Message-ID: <20070713183235.817C13A40A8@sparrow.telecommunity.com> At 07:07 PM 7/13/2007 +0200, Martin v. Löwis wrote: >I did some measurements, with the script below. >For 30 requests, a single HTTP 1.1 connection >needs 5.4s over my DSL connection; 30 individual >connections need 11.7s. So if setuptools expects >to request multiple pages from the index, it would >definitely be useful to keep the connection >(I don't know at all whether it currently does so >already). It doesn't. I looked just now and found this, that looks like it might produce the desired effect for easy_install: http://linux.duke.edu/projects/urlgrabber/contents/urlgrabber/keepalive.py Perhaps someone (Jim?) would like to try activating it in a process using easy_install (i.e.
doing the urllib2.install_opener dance), and see if it gives a performance boost. If it works well, then perhaps a patch for setuptools.package_index to use a custom opener is in order. From jim at zope.com Fri Jul 13 20:51:54 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 14:51:54 -0400 Subject: [Catalog-sig] Effect of HTTP 1.1 In-Reply-To: <20070713183235.817C13A40A8@sparrow.telecommunity.com> References: <4697B152.7030304@v.loewis.de> <20070713183235.817C13A40A8@sparrow.telecommunity.com> Message-ID: <5F747173-F02A-42A5-8767-ACDA61CD0C5C@zope.com> On Jul 13, 2007, at 2:34 PM, Phillip J. Eby wrote: > At 07:07 PM 7/13/2007 +0200, Martin v. L?wis wrote: >> I did some measurements, with the script below. >> For 30 requests, a single HTTP 1.1 connection >> needs 5.4s over my DSL connection; 30 individual >> connections need 11.7s. So if setuptools expects >> to request multiple pages from the index, it would >> definitely be useful to keep the connection >> (I don't know at all whether it currently does so >> already). > > It doesn't. I looked just now and found this, that looks like it > might produce the desired effect for easy_install: > > http://linux.duke.edu/projects/urlgrabber/contents/urlgrabber/ > keepalive.py > > Perhaps someone (Jim?) would like to try activating it in a process > using easy_install (i.e. doing the urllib2.install_opener dance), > and see if it gives a performance boost. If it works well, then > perhaps a patch for setuptools.package_index to use a custom opener > is in order. I'd be happy to do this sometime in the next few weeks. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Jul 13 22:17:20 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 16:17:20 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <4697D796.5080803@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> <4697D796.5080803@v.loewis.de> Message-ID: On Jul 13, 2007, at 3:50 PM, Martin v. Löwis wrote: >> It appears that this only shows new releases. If I update a new >> distribution to a release, it doesn't cause the release to appear as >> updated. A common scenario for me is that I'll create a release, >> update a source release, and then, some time later, when someone bugs >> me, I'll upload a windows egg. The way things are now, the later >> upload won't be noticed. Of course, the initial upload won't be >> noticed if someone happens to poll between release creation and the >> first upload. > > Ok, I added another operation "changelog", that gives you four-tuples > name, version, timestamp, action. It's the complete journal, except > that privacy fields (author and IP) are not returned, and except > changes to the package (rather than a specific release) are not > returned. Very cool. Thanks! It doesn't seem to catch file-uploads, either through distutils or through the web. I uploaded a windows release for zope.proxy this morning and I just (within the last half hour) uploaded some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/0.2.1 and am not seeing anything in the transcript. > The possible values for "action" remain undocumented. If there is > interest, people can propose a specification that PyPI should > try to stick to; this specification should allow for > still-undocumented action values (to allow addition of more actions).
I have no immediate use for action at this time other than as documentation when interpreting the output. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Fri Jul 13 22:43:07 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Jul 2007 22:43:07 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> <4697D796.5080803@v.loewis.de> Message-ID: <4697E3DB.8070801@v.loewis.de> > Very cool. Thanks! It doesn't seem to catch file-uploads, either > through distutils or through the web. I uploaded a windows release for > zope.proxy this morning and I just (within the last half hour) uploaded > some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/0.2.1 and > am not seeing anything in the transcript. It appears that file additions were logged without a package version (just package name). I don't know why this is, but I changed changelog to return all entries (so version may be None, using the XML-RPC nil extension). I also started logging the version for the file. So please try again. Regards, Martin From jim at zope.com Fri Jul 13 23:15:26 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 13 Jul 2007 17:15:26 -0400 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <4697E3DB.8070801@v.loewis.de> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> <4697D796.5080803@v.loewis.de> <4697E3DB.8070801@v.loewis.de> Message-ID: <8175AD9F-7D42-4C8C-8F97-2CAAA876F7D9@zope.com> On Jul 13, 2007, at 4:43 PM, Martin v. Löwis wrote: >> Very cool. Thanks! It doesn't seem to catch file-uploads, either >> through distutils or through the web. I uploaded a windows release >> for >> zope.proxy this morning and I just (within the last half hour) >> uploaded >> some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/ >> 0.2.1 and >> am not seeing anything in the transcript. > > It appears that file additions were logged without a package version > (just package name). I don't know why this is, but I changed changelog > to return all entries (so version may be None, using the XML-RPC nil > extension). I also started logging the version for the file. > > So please try again. Works great! Thanks! (Now I just wish I wasn't going to be offline all weekend.) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Fri Jul 13 21:50:46 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Jul 2007 21:50:46 +0200 Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au> <46953A70.6070600@v.loewis.de> <200707120809.48344.richardjones@optusnet.com.au> <297846B8-94DC-4770-9476-711796E82FEC@zope.com> <4695B816.9020706@v.loewis.de> <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com> <4696988C.6050309@v.loewis.de> <46973211.1060801@v.loewis.de> <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com> Message-ID: <4697D796.5080803@v.loewis.de> > It appears that this only shows new releases. If I update a new > distribution to a release, it doesn't cause the release to appear as > updated. A common scenario for me is that I'll create a release, > update a source release, and then, some time later, when someone bugs > me, I'll upload a windows egg. The way things are now, the later > upload won't be noticed. Of course, the initial upload won't be > noticed if someone happens to poll between release creation and the > first upload. Ok, I added another operation "changelog", that gives you four-tuples name, version, timestamp, action. It's the complete journal, except that privacy fields (author and IP) are not returned, and except changes to the package (rather than a specific release) are not returned. The possible values for "action" remain undocumented. If there is interest, people can propose a specification that PyPI should try to stick to; this specification should allow for still-undocumented action values (to allow addition of more actions). Regards, Martin From gentoodev at gmail.com Tue Jul 17 08:45:14 2007 From: gentoodev at gmail.com (Rob Cakebread) Date: Mon, 16 Jul 2007 23:45:14 -0700 Subject: [Catalog-sig] PyPI command-line tool: yolk Message-ID: <9b06ffb10707162345s7813d59dpc61b758b50d3df66@mail.gmail.com> yolk 0.3.0 has been released and lets you use the new PyPI XML-RPC methods 'changelog' and 'updated_releases'.
You can see the latest releases for the last : yolk -L 24 You can see a detailed ChangeLog of The Cheese Shop by the last : yolk -C 6 http://tools.assembla.com/yolk From stuart at stuartbishop.net Wed Jul 18 11:58:11 2007 From: stuart at stuartbishop.net (Stuart Bishop) Date: Wed, 18 Jul 2007 16:58:11 +0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. In-Reply-To: <469522D6.1070706@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> Message-ID: <469DE433.4040405@stuartbishop.net> Martin v. L?wis wrote: >> The questions for us is, how much effort we are willing to make to >> prevent people from shooting themselves in the foot. I can understand >> why Phillip would like the package index to prevent people from choosing >> problematic package names. > > That's not my understanding - the issue isn't with "problematic package > names", but with conflicting package names. IOW, any single name is > fine - it's a pair of names that would cause a problem (and only if > you wanted to install both packages on the same system). By not blocking registration of packages with similar names, we are creating a security problem. 
If there is a popular package 'CoolStuff', I just have to upload a
trojan 'coolstuff' and suddenly people will end up using my trojan,
which they thought was coming from a trusted source. I think this
attack vector is possible right now and only a BUGTRAQ post away from
being common knowledge.

I think blocking this is the responsibility of the package index, as it
is the first point at which it is possible to do so. I think a
reasonable restriction would be printable-ASCII-only names, and not
allowing registration of a package whose name differs from an existing
one only in case, whitespace or punctuation.

There are additional side benefits that fall out of this (being able to
optimize searches by doing exact matches rather than fuzzy ones,
avoiding whole classes of case-sensitivity or Unicode bugs in other
applications integrating with the registry, reducing confusion for end
users, or reducing the likelihood of less user-hostile systems being
developed and making the official registry irrelevant - heck, I work on
a closed source system that would happily take the business).

--
Stuart Bishop
http://www.stuartbishop.net/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: OpenPGP digital signature
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070718/e971e102/attachment.pgp

From martin at v.loewis.de Thu Jul 19 00:07:30 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 19 Jul 2007 00:07:30 +0200
Subject: [Catalog-sig] start on static generation, and caching - apache config.
In-Reply-To: <469DE433.4040405@stuartbishop.net> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net> Message-ID: <469E8F22.7080204@v.loewis.de> > I think blocking this is the responsibility of the package index, as it is > the first point that it is possible to do so. Would you like to contribute a patch? Regards, Martin From stuart at stuartbishop.net Thu Jul 19 06:00:53 2007 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu, 19 Jul 2007 11:00:53 +0700 Subject: [Catalog-sig] start on static generation, and caching - apache config. 
In-Reply-To: <469E8F22.7080204@v.loewis.de> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net> <469E8F22.7080204@v.loewis.de> Message-ID: <469EE1F5.7000802@stuartbishop.net> Martin v. L?wis wrote: >> I think blocking this is the responsibility of the package index, as it is >> the first point that it is possible to do so. > > Would you like to contribute a patch? Yes, but it would be rather pointless to make one if my analysis is incorrect or it would be bounced for some non-technical reason so I emailed it for discussion. I'm also unsure if switching to exact matching on a normalized string instead of substring matching is good (well... it is good for performance, but might not be good for UI). I haven't looked at the source code to see how much work is involved yet - if I find the Python code incomprehensible I should at least be able to do the PostgreSQL side of things. -- Stuart Bishop http://www.stuartbishop.net/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070719/5e71e1d1/attachment.pgp From martin at v.loewis.de Thu Jul 19 09:17:15 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 19 Jul 2007 09:17:15 +0200 Subject: [Catalog-sig] Package naming (Was: start on static generation, and caching - apache config.) In-Reply-To: <469EE1F5.7000802@stuartbishop.net> References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com> <468F3CD4.1070501@v.loewis.de> <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com> <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com> <468FC2BB.7030607@v.loewis.de> <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com> <468FF69B.2090503@v.loewis.de> <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com> <46910BBF.3010308@v.loewis.de> <4692B3A3.5030209@v.loewis.de> <20070710003214.A2EA83A404D@sparrow.telecommunity.com> <46931A3A.5000703@v.loewis.de> <20070710141304.BC6903A40A4@sparrow.telecommunity.com> <4693FA2A.3020107@v.loewis.de> <20070710221547.4A3043A40A4@sparrow.telecommunity.com> <469467AA.7070409@v.loewis.de> <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com> <469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net> <469E8F22.7080204@v.loewis.de> <469EE1F5.7000802@stuartbishop.net> Message-ID: <469F0FFB.6010904@v.loewis.de> > Yes, but it would be rather pointless to make one if my analysis is > incorrect or it would be bounced for some non-technical reason so I emailed > it for discussion. I'm also unsure if switching to exact matching on a > normalized string instead of substring matching is good (well... it is good > for performance, but might not be good for UI). That's something completely different. I thought you were saying that the Cheeseshop should block conflicting registrations. 
To implement that, you only have to perform any normalization when a
new project is registered. There are roughly three new registrations
per day, so performance is irrelevant here.

Matching on lookup is rather a convenience to users; they can put in a
misspelled string and still find the package. OTOH, the search
interface already does case-insensitive matching; I doubt that doing it
in the URL adds much convenience. OTOH, it does add performance (not
convenience) for setuptools users, as setuptools could stop downloading
the complete package list to find the match.

But these are unrelated; if you want to contribute, it might be best to
just focus on the part that really worries you (namely the security
risk of conflicting registrations).

Regards,
Martin

From jim at zope.com Thu Jul 19 13:06:34 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 19 Jul 2007 07:06:34 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
Message-ID: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>

Over the past few months, we've struggled quite a bit with Python
Package Index (PyPI) performance and stability. Thanks to the heroic
efforts of Martin v. Löwis and others, performance and especially
stability have improved quite a bit. Martin has demonstrated that, at
least when running well, PyPI seems to answer most requests on the
order of 7 milliseconds (around 150 requests per second) internally.
That's not bad. Unfortunately for users, actual times can be quite a
bit longer. For me at work, requests take around 300 milliseconds. For
Martin, they seem to take somewhat longer. 300 milliseconds isn't so
bad for a request or two; however, easy_install can easily make tens or
even hundreds of requests to satisfy a user request for a package.
zc.buildout, when verifying that a large system with many tens of
packages has the most up-to-date versions of each package, can easily
make thousands of requests.

Why do setuptools and buildout make so many requests?
If a package exposes more than one release, then setuptools checks the
package's main PyPI page and the pages for each release. We need to be
able to easily use older releases, so we can't hide old releases.
Typical projects of ours have many old releases exposed. Even if
setuptools were more clever in the way it searched PyPI, it would still
have to make a minimum of 2 requests per package for packages with
multiple versions exposed.

Another potential issue is that PyPI pages can be large. I've found it
convenient to use PyPI package pages as the home page for many of my
projects. I like to include package documentation in my project pages.
Perhaps this is an abuse of PyPI, but it is very convenient for me and
no one has complained. :) The zc.buildout pages are around 200K. That's
a fair bit of data for setuptools to download and scan for download
URLs.

In the course of this discussion, I've realized that it doesn't make
sense for setuptools to use the same interface that humans use.
setuptools doesn't need to see all of the data that is useful to
humans. Similarly, humans generally don't need to see all of the
historical releases for a project. I suggested a simple page format
designed just for setuptools. An alternative would be an xmlrpc API. I
prefer pages because I think that, over time, the number of requests
from automated tools like easy_install and zc.buildout will increase
substantially and will ultimately overwhelm dynamic servers, even ones
like PyPI that are reasonably fast. I also think that a simple static
collection of pages will be easier to mirror, and I think some number
of geographic mirrors is likely to help some people. I promised to
prototype the format I suggested.

I've created an experimental prototype setuptools-specific package
index at

  http://download.zope.org/ppix

Going to that page gives brief instructions for using it with
easy_install and zc.buildout.
To see an individual package page, add the package name to the URL, as
in:

  http://download.zope.org/ppix/setuptools/

A few things to note about this:

- I don't expose a long package list at http://download.zope.org/ppix/.
  The long package list would be expensive to download and supports a
  use case that I consider to be of negative value, which is installing
  packages with case-insensitive package names. I think it is important
  for humans to be able to search for packages using case-insensitive
  search terms, but I think that, after identifying a package, precise
  package names should be used. I think it is especially important that
  precise package names be used in package requirements.

- There is a single page per package. This can greatly reduce the
  number of requests. Packages that store all of their distributions in
  PyPI and that don't have off-site home pages or download URLs can be
  scanned with a single request. Note that I excluded home page and
  download URLs that pointed back to the package's PyPI page, as that
  wouldn't provide any new information to setuptools.

- Download URLs for *hidden* packages are included. Humans don't need
  to see old revisions, but setuptools-based tools do. If we used an
  index like this for setuptools, we could stop unhiding old releases
  when we created new releases in PyPI. This would make PyPI more
  useful to humans and less of a pain for developers.

- Download URLs are the same as they are in PyPI. Using this new index,
  distributions are still downloaded from PyPI, so the index doesn't
  affect PyPI download statistics.
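To make the "single page per package" idea concrete, here is a minimal sketch of how a client might pull download URLs out of such a page. It is an illustration only: the HTML in SAMPLE_PAGE is a made-up stand-in (the actual markup served at the ppix URL is not shown in this message), the helper names are hypothetical, and this is not how setuptools itself is implemented. The only detail taken from the thread is that download URLs carry an `#md5=` fragment.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def download_links(page_html):
    """Return (url, md5) pairs for links carrying an #md5= fragment."""
    collector = LinkCollector()
    collector.feed(page_html)
    found = []
    for href in collector.links:
        url, fragment = urldefrag(href)
        if fragment.startswith("md5="):
            found.append((url, fragment[len("md5="):]))
    return found


# Hypothetical page body; only the egg URL and digest come from the thread.
SAMPLE_PAGE = """
<html><body>
<a href="http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg#md5=4e37e53f010ed7984555a029732f479d">zc.buildout-1.0.0b28-py2.5.egg</a>
<a href="http://svn.zope.org/zc.buildout">home page</a>
</body></html>
"""

for url, digest in download_links(SAMPLE_PAGE):
    print(url, digest)
```

With one such page per package, a tool needs a single request to learn every available distribution, including those from hidden releases.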
To see the impact of this, it's interesting to look at installing
zc.buildout using easy_install from PyPI and from the experimental
index. Installing using PyPI looks like this:

    (env)jim at ds9:~/tmp$ time easy_install zc.buildout
    Searching for zc.buildout
    Reading http://cheeseshop.python.org/pypi/zc.buildout/
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
    Reading http://svn.zope.org/zc.buildout
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
    Best match: zc.buildout 1.0.0b28
    Downloading http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
    Processing zc.buildout-1.0.0b28-py2.5.egg
    creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
    Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/python2.5
    Adding zc.buildout 1.0.0b28 to easy-install.pth file
    Installing buildout script to /home/jim/tmp/env/bin/

    Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
    Processing dependencies for zc.buildout
    Searching for setuptools==0.6c6
    Best match: setuptools 0.6c6
    Processing setuptools-0.6c6-py2.5.egg
    Adding setuptools 0.6c6 to easy-install.pth file
    Installing easy_install script to /home/jim/tmp/env/bin/
    Installing easy_install-2.5 script to /home/jim/tmp/env/bin/

    Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-py2.5.egg
    Processing dependencies for setuptools==0.6c6
    Finished processing dependencies for setuptools==0.6c6
    Finished installing setuptools==0.6c6
    Finished processing dependencies for zc.buildout
    Finished installing zc.buildout

    real    0m31.360s
    user    0m1.136s
    sys     0m0.060s

Note the large number of pages read. Here I was installing a single
package with one dependency, setuptools, that was already installed.
Let's look at this again using the experimental index:

    (env)jim at ds9:~/tmp$ time easy_install -i http://download.zope.org/ppix zc.buildout
    Searching for zc.buildout
    Reading http://download.zope.org/ppix/zc.buildout/
    Best match: zc.buildout 1.0.0b28
    Downloading http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
    Processing zc.buildout-1.0.0b28-py2.5.egg
    creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
    Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/python2.5
    Adding zc.buildout 1.0.0b28 to easy-install.pth file
    Installing buildout script to /home/jim/tmp/env/bin/

    Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-py2.5.egg
    Processing dependencies for zc.buildout
    Searching for setuptools==0.6c6
    Best match: setuptools 0.6c6
    Processing setuptools-0.6c6-py2.5.egg
    Adding setuptools 0.6c6 to easy-install.pth file
    Installing easy_install script to /home/jim/tmp/env/bin/
    Installing easy_install-2.5 script to /home/jim/tmp/env/bin/

    Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-py2.5.egg
    Processing dependencies for setuptools==0.6c6
    Finished processing dependencies for setuptools==0.6c6
    Finished installing setuptools==0.6c6
    Finished processing dependencies for zc.buildout
    Finished installing zc.buildout

    real    0m7.006s
    user    0m0.244s
    sys     0m0.040s

Note:

- We made far fewer requests with the new index.

- Most of the time in the second example was spent
  actually downloading the buildout distribution. Most of the time in
  the first example was spent reading the index.

- I used workingenv to create clean environments for each of the
  examples above.

WRT zc.buildout, refreshing a buildout with just ZODB installed in it
takes about 45 seconds for me using PyPI and about 5 seconds using the
experimental index.

Some of the speed improvement is due to the fact that the experimental
index is much closer to me (on the net) than PyPI. ATM, requests to
PyPI take *me* around 500 milliseconds, while requests to the
experimental index are taking between 100 and 300 milliseconds. (I'm at
home and this seems to be somewhat variable.) Most of the speed
improvement comes from reducing the number of requests.

I'm polling PyPI once a minute to get and apply updates. Thanks to the
new XML-RPC method that Martin added, this is very efficient to do.

I encourage people to check this out and even try using it with
easy_install and especially buildout. AFAIK, aside from being much
faster and showing download files for hidden releases, it is completely
equivalent to PyPI for setuptools use. My intention is to keep this
experimental index going and up to date for the foreseeable future, and
I plan to use it for all my work.

My primary goal is to prototype the new index format. If this seems
useful, then I think that www.python.org should expose an index in this
format to setuptools, either at a different URL or by satisfying
setuptools requests from the index based on client information. I'd
love to see this index populated via a baking mechanism that updates
package pages when they change, rather than through polling as I'm
doing.

There would be some benefit to having geographic mirrors. I suspect
that having such mirrors available would improve performance further,
at least for some folks. It might also be useful to have some mirrors
for redundancy purposes. Note though that what I'm doing is mirroring
only the index data.
I'm not mirroring distributions. Of course, I'd be happy to make my software available. (It already is via our subversion repository.) I hope this effort spurs useful discussion and progress. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Fri Jul 20 10:21:18 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Jul 2007 10:21:18 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> Message-ID: <46A0707E.6000103@v.loewis.de> > I've created and experimental prototype setuptools-specific package > index at > > http://download.zope.org/ppix Cool! If this proves useful, people are encouraged to contribute the proper patches to PyPI to regenerate the page directly on each log change. There is a slight transactional trickiness to doing so: If you regenerate before the commit, it might be that the commit fails; then you would have to rollback the page update, too. If you regenerate after commit, it might be that you run into race conditions if the same package sees two updates in two transactions very quickly, and the second regeneration completes before the first one. If people would find it easier to make these pages dynamic, such patches would also be kindly accepted. Generating the pages on access should be fairly cheap; the SQL is select filename,md5_digest from release_files where name='setuptools'; and putting the result of that into an ppix-like HTML page should be much faster than invoking ZPT. Regards, Martin From ct at gocept.com Fri Jul 20 12:02:45 2007 From: ct at gocept.com (Christian Theune) Date: Fri, 20 Jul 2007 12:02:45 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> Message-ID: <1184925765.6519.3.camel@mindy> Am Donnerstag, den 19.07.2007, 07:06 -0400 schrieb Jim Fulton: > I promised to prototype the format I suggested. > > I've created and experimental prototype setuptools-specific package > index at > > http://download.zope.org/ppix Yay! This works like a charme! > There would be some benefit to having geographic mirrors. I suspect > that having such mirrors available would improve performance further, > at least for some folks. It might also be useful to have some > mirrors for redundancy purposes. Note though that what I'm doing is > mirroring the only index data. I'm not mirroring distributions. Of > course, I'd be happy to make my software available. (It already is > via our subversion repository.) I'd be happy to support mirroring once all this is sorted out/ I can offer a server in Germany/Europe. Christian From jim at zope.com Fri Jul 20 13:45:57 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 20 Jul 2007 07:45:57 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A0707E.6000103@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A0707E.6000103@v.loewis.de> Message-ID: <5105308E-F651-438B-8C3D-F5FCAF8A8351@zope.com> On Jul 20, 2007, at 4:21 AM, Martin v. L?wis wrote: >> I've created and experimental prototype setuptools-specific package >> index at >> >> http://download.zope.org/ppix > > Cool! If this proves useful, people are encouraged to contribute the > proper patches to PyPI to regenerate the page directly on each log > change. > > There is a slight transactional trickiness to doing so: If you > regenerate before the commit, it might be that the commit fails; > then you would have to rollback the page update, too. 
If you > regenerate after commit, it might be that you run into race > conditions if the same package sees two updates in two > transactions very quickly, and the second regeneration completes > before the first one. > > If people would find it easier to make these pages dynamic, > such patches would also be kindly accepted. Generating the > pages on access should be fairly cheap; the SQL is > > select filename,md5_digest from release_files where name='setuptools'; > > and putting the result of that into an ppix-like HTML page > should be much faster than invoking ZPT. A few notes. It is important to show files from hidden releases as well as unhidden releases. I suspect the select statement above does that. I parse long descriptions to get #egg= links. I also give some special care to urls that point back to PyPI to avoid having setuptools go back to the human interface. It might be easiest to just trigger the existing ppix sw to poll after a change. Thanks to your xmlrpc addition, polling is quite cheap. Alternatively, we could install the existing software in a way that polls more or less continuously. This would be quite trivial. What you suggest is probably cleaner but requires some expertise with the current software. :) I'd much rather generate static files (as I'm doing now) than serve these dynamically. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Jul 20 13:48:39 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 20 Jul 2007 07:48:39 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <1184925765.6519.3.camel@mindy> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <1184925765.6519.3.camel@mindy> Message-ID: <465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com> On Jul 20, 2007, at 6:02 AM, Christian Theune wrote: ... 
> I'd be happy to support mirroring once all this is sorted out/ I can
> offer a server in Germany/Europe.

If we decide that mirrors would be a good idea, it will be important,
imo, to select mirror sites based on their connectivity. The goal of
the mirrors should be to give people options with short network
distances.

Jim

--
Jim Fulton              mailto:jim at zope.com     Python Powered!
CTO                     (540) 361-1714          http://www.python.org
Zope Corporation        http://www.zope.com     http://www.zope.org

From ct at gocept.com Fri Jul 20 13:52:12 2007
From: ct at gocept.com (Christian Theune)
Date: Fri, 20 Jul 2007 13:52:12 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> Message-ID: <6000B516-6593-4A98-AA08-B6C7B329BC62@lovelysystems.com> thanks jim. you save our day. we'll send some austrian cheese over :) jodok On 19.07.2007, at 13:06, Jim Fulton wrote: > Over the past few months, we've struggled quite a bit with Python > Package Index (PyPI) performance and stability. Thanks to the heroic > efforts of Martin v. L?wis and others, performance and especially > stability have improved quite a bit. Martin has demonstrated that, at > least when running well, PyPI seems to answer most requests on the > order of 7 miliseconds (around 150 requests per second) internally. > That's not bad. Unfortunately for users, actual times can be quite a > bit longer. For me at work, request take around 300 milliseconds. > For Martin, they seem to take somewhat longer. 300 milliseconds > isn't so bad for a request or two, however, easy install can easily > make 10s or even hundreds of requests to satisfy a user request for a > package. zc.buildout, when verifying that a large system with many > tens of packages has the most up to date versions of each package can > easily make thousands of requests. > > Why do setuptools and buildout make so many requests? If a package > exposes more than one release, then setuptools checks the package's > main PyPI page and the pages for each release. We need to be able to > easily use older releases, so we can't hide old releases. Typical > projects of ours have many old releases exposed. If setuptools was > more clever in the way it searched PyPI, but it would still have to > make a minimum of 2 requests per package for packages with multiple > versions exposed. > > Another potential issue is that PyPI pages can be large. I've found > it convenient to use PyPI package pages as the home page for many of > my projects. I like to include package documentation in my project > pages. 
Perhaps this is an abuse of PyPI, but it is very convenient > for me and no one has complained. :) The zc.buildout pages are > around 200K. That's a fair bit of data for setuptools to download > and scan for download URLs. > > In the course of this discussion, I've realized that it doesn't make > sense for setuptools to use the same interface that humans use. > setuptools doesn't need to see all of the data that is useful to > humans. Similarly, humans generally don't need to see all of the > historical releases for a project. I suggested a simple page format > designed just for setuptools. An alternative would be an xmlrpc > API. I prefer pages because I think that, over time, the amount of > requests from automated tools like easy_install and zc.buildout will > increase substantially and ultimately, will overwhelm dynamic > servers, even ones like PyPI that are reasonably fast. I also think > that a simple static collection of pages will be easier to mirror and > I think some number of geographic mirrors is likely to help some > people. I promised to prototype the format I suggested. > > I've created and experimental prototype setuptools-specific package > index at > > http://download.zope.org/ppix > > Going to that page gives brief instructions for using it with > easy_install and zc.buildout. To see an individual package page, add > the package name to the URL, as in: > > http://download.zope.org/ppix/setuptools/ > > A few things to note about this: > > - I don't expose a long package list at http://download.zope.org/ > ppix/. The long package list would be expensive to download and > supports a use case that I consider to be of negative value, which is > installing packages with case-insensitive package names, I think it > is important for humans to be able to search for packages using case- > insensitive search terms, but I think that, after identifying a > package, precise package names should be used. 
I think it is > especially important that precise package names be used in package > requirements. > > - There is a single page per package. This can greatly reduce the > number of requests. Packages that store all of their distributions > in PyPI and that don't have off-site home pages or download URLs can > be scanned with a single request. Note that I excluded home page and > download URLs that pointed back to the packages PyPI page, as that > wouldn't provide any new information to setuptools. > > - Download URLs for *hidden* packages are included. Humans don't > need to see old revisions, but setuptools-based tools do. If we used > an index like this for setuptools, we could stop unhiding old > releases when we created new releases in PyPI. This would make PyPI > more useful to humans and less of a pain for developers. > > - Download URLs are the same as they are in PyPI. Using this new > index, distributions are still downloaded from PyPI, so the index > doesn't affect PyPI download statistics. 
> > To see the impact of this, it's interesting to look at installing > zc.buildout using easy_install from PyPI and from the experimental > index: > Installing using PyPI looks like this: > > (env)jim at ds9:~/tmp$ time easy_install zc.buildout > Searching for zc.buildout > Reading http://cheeseshop.python.org/pypi/zc.buildout/ > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19 > Reading http://svn.zope.org/zc.buildout > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16 > Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18 > Best match: zc.buildout 1.0.0b28 > Downloading http://cheeseshop.python.org/packages/2.5/z/ > zc.buildout/zc.buildout-1.0.0b28- > py2.5.egg#md5=4e37e53f010ed7984555a029732f479d > Processing zc.buildout-1.0.0b28-py2.5.egg > creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- > py2.5.egg > Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/ > python2.5 > Adding zc.buildout 1.0.0b28 to easy-install.pth file > Installing buildout script to /home/jim/tmp/env/bin/ > > Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- > py2.5.egg > Processing dependencies for zc.buildout > Searching for setuptools==0.6c6 > Best match: setuptools 0.6c6 > Processing setuptools-0.6c6-py2.5.egg > Adding setuptools 0.6c6 to easy-install.pth file > 
Installing easy_install script to /home/jim/tmp/env/bin/ > Installing easy_install-2.5 script to /home/jim/tmp/env/bin/ > > Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6- > py2.5.egg > Processing dependencies for setuptools==0.6c6 > Finished processing dependencies for setuptools==0.6c6 > Finished installing setuptools==0.6c6 > Finished processing dependencies for zc.buildout > Finished installing zc.buildout > > real 0m31.360s > user 0m1.136s > sys 0m0.060s > > Note the large number of pages read. Here I was installing a single > package with one dependency, setuptools, that was already installed. > Let's look at this again using the experimental index: > > (env)jim at ds9:~/tmp$ time easy_install -i http://download.zope.org/ > ppix zc.buildout > Searching for zc.buildout > Reading http://download.zope.org/ppix/zc.buildout/ > Best match: zc.buildout 1.0.0b28 > Downloading http://cheeseshop.python.org/packages/2.5/z/ > zc.buildout/zc.buildout-1.0.0b28- > py2.5.egg#md5=4e37e53f010ed7984555a029732f479d > Processing zc.buildout-1.0.0b28-py2.5.egg > creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- > py2.5.egg > Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/ > python2.5 > Adding zc.buildout 1.0.0b28 to easy-install.pth file > Installing buildout script to /home/jim/tmp/env/bin/ > > Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- > py2.5.egg > Processing dependencies for zc.buildout > Searching for setuptools==0.6c6 > Best match: setuptools 0.6c6 > Processing setuptools-0.6c6-py2.5.egg > Adding setuptools 0.6c6 to easy-install.pth file > Installing easy_install script to /home/jim/tmp/env/bin/ > Installing easy_install-2.5 script to /home/jim/tmp/env/bin/ > > Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6- > py2.5.egg > Processing dependencies for setuptools==0.6c6 > Finished processing dependencies for setuptools==0.6c6 > Finished installing setuptools==0.6c6 > Finished processing dependencies 
for zc.buildout > Finished installing zc.buildout > > real 0m7.006s > user 0m0.244s > sys 0m0.040s > > Note: > > - We made far fewer requests with the new index > > - Most of the time in the second example was spent actually > downloading the buildout distribution. Most of the time in the first > example was spent reading the index. > > - I used workingenv to create clean environments for each of the > examples above. > > WRT zc.buildout, refreshing a buildout with just ZODB installed in it > takes about 45 seconds for me using PyPI and about 5 seconds using > the experimental index. > > Some of the speed improvement is due to the fact that the > experimental index is much closer to me (on the net) than PyPI. ATM, > requests to PyPI take *me* around 500 milliseconds, while requests to > the experimental index are taking between 100 and 300 milliseconds. > (I'm at home and this seems to be somewhat variable.) Most of the > speed improvements are from reducing the number of requests. > > I'm polling PyPI once a minute to get and apply updates. Thanks to > the new XML-RPC method that Martin added, this is very efficient to > do. > > I encourage people to check this out and even try using it with > easy_install and especially buildout. AFAIK, aside from being much > faster and showing download files for hidden releases, it is > completely equivalent to PyPI for setuptools use. My intention is to > keep this experimental index going and up to date for the foreseeable > future, and I plan to use it for all my work. > > My primary goal is to prototype the new index format. If this seems > useful, then I think that www.python.org should expose an index in > this format to setuptools, either at a different URL or by satisfying > setuptools requests from the index based on client information. I'd > love to see this index populated via a baking mechanism that updates > package pages when they change, rather than through polling as I'm > doing. 
> > There would be some benefit to having geographic mirrors. I suspect > that having such mirrors available would improve performance further, > at least for some folks. It might also be useful to have some > mirrors for redundancy purposes. Note though that what I'm doing is > mirroring only the index data. I'm not mirroring distributions. Of > course, I'd be happy to make my software available. (It already is > via our subversion repository.) > > I hope this effort spurs useful discussion and progress. > > Jim > > -- > Jim Fulton mailto:jim at zope.com Python Powered! > CTO (540) 361-1714 http://www.python.org > Zope Corporation http://www.zope.com http://www.zope.org > > > > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig -- "Although never is often better than *right* now." -- The Zen of Python, by Tim Peters Jodok Batlogg, Lovely Systems Schmelzhütterstraße 26a, 6850 Dornbirn, Austria phone: +43 5572 908060, fax: +43 5572 908060-77 -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2454 bytes Desc: not available Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070720/26d28f4b/attachment-0001.bin From jim at zope.com Fri Jul 20 15:42:37 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 20 Jul 2007 09:42:37 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <1184932332.6519.11.camel@mindy> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <1184925765.6519.3.camel@mindy> <465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com> <1184932332.6519.11.camel@mindy> Message-ID: <5686B35D-34DD-49FE-A8E7-37397A4AE808@zope.com> On Jul 20, 2007, at 7:52 AM, Christian Theune wrote: > Am Freitag, den 20.07.2007, 07:48 -0400 schrieb Jim Fulton: >> On Jul 20, 2007, at 6:02 AM, Christian Theune wrote: >> ... 
>>> I'd be happy to support mirroring once all this is sorted out/ I can >>> offer a server in Germany/Europe. >> >> If we decide that mirrors would be a good idea, it will be important, >> imo, to select mirror sites bases on their connectivity. The goal of >> the mirrors should be to try to give people options with short >> network distances. > > Right, however, do you have any specific parameters that can be > measured > in mind? I'm not enough of a network expert. Hopefully, someone more knowledgeable will make a suggestion. BTW, with the current PyPI performance, I'm guessing we could have 10s of mirrors poll once a minute without affecting other users. > (Our server is reasonably well connected, reachable with about 5 hops > from within Germany with latency around 40ms on a DSL line. Multiple > GBit lines to the hosting center.) I didn't mean to suggest that you weren't well connected. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Fri Jul 20 22:09:39 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 20 Jul 2007 16:09:39 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> Message-ID: <20070720200721.88E1D3A403A@sparrow.telecommunity.com> At 07:06 AM 7/19/2007 -0400, Jim Fulton wrote: >I've created and experimental prototype setuptools-specific package >index at > > http://download.zope.org/ppix > >Going to that page gives brief instructions for using it with >easy_install and zc.buildout. FYI, the handling of homepage and download links is broken. You have e.g. 'meta="homepage"' instead of 'rel="homepage"', so easy_install doesn't pick these up and look for links there, meaning that ppix fails to find downloads for e.g. pywin32 which is hosted at Sourceforge. 
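A minimal sketch of the rel= point above, assuming only what the message says: an anchor counts as a home page or download link when it carries rel="homepage" or rel="download", so an anchor written with meta="homepage" is skipped. This is not setuptools' actual code; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect hrefs of <a> tags whose rel attribute marks them as
    a home page or download link (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rel_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel may hold several space-separated tokens, or be absent.
        rels = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if href and ("homepage" in rels or "download" in rels):
            self.rel_links.append(href)


def find_rel_links(html):
    """Return the hrefs an installer following rel= links would visit."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return parser.rel_links


good = '<a href="http://sf.net/projects/pywin32" rel="homepage">home</a>'
bad = '<a href="http://sf.net/projects/pywin32" meta="homepage">home</a>'
print(find_rel_links(good))
print(find_rel_links(bad))
```

With the meta= spelling the list comes back empty, which would match the symptom reported for pywin32: the off-site page is never read, so no downloads are found.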
(On a perhaps not entirely unrelated note, the Cheeseshop appears to be down at the moment: """Error... There's been a problem with your request psycopg.OperationalError: no connection to the server""") By the way, I'd suggest explaining (or linking to an explanation) on the ppix main page describing how to configure easy_install such that the '-i' option isn't necessary. Perhaps we could add an example to the EasyInstall docs somewhere near: http://peak.telecommunity.com/DevCenter/EasyInstall#creating-your-own-package-index and then link to it from the ppix page. From jim at zope.com Fri Jul 20 22:07:08 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 20 Jul 2007 16:07:08 -0400 Subject: [Catalog-sig] PyPI is down with a psycopg error Message-ID: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com> Requests to http://www.python.org/pypi are giving: Error... There's been a problem with your request psycopg.OperationalError: no connection to the server This (or something like it) has been happening since 7:54 UTC. I know because my once a minute cron job to update ppix has been failing since then. :) The good news is that folks who have switched to using http:// download.zope.org/ppix/ for setuptools (easy_install and buildout) are unaffected. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Jul 20 22:18:55 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 20 Jul 2007 16:18:55 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070720200721.88E1D3A403A@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070720200721.88E1D3A403A@sparrow.telecommunity.com> Message-ID: <24B11DD1-DD79-4171-A38F-06B642EC354B@zope.com> On Jul 20, 2007, at 4:09 PM, Phillip J. 
Eby wrote: > At 07:06 AM 7/19/2007 -0400, Jim Fulton wrote: >> I've created and experimental prototype setuptools-specific package >> index at >> >> http://download.zope.org/ppix >> >> Going to that page gives brief instructions for using it with >> easy_install and zc.buildout. > > FYI, the handling of homepage and download links is broken. You > have e.g. 'meta="homepage"' instead of 'rel="homepage"', so > easy_install doesn't pick these up and look for links there, > meaning that ppix fails to find downloads for e.g. pywin32 which is > hosted at Sourceforge. Doh! Fixed. > (On a perhaps not entirely unrelated note, the Cheeseshop appears > to be down at the moment: > > """Error... > > There's been a problem with your request > > psycopg.OperationalError: no connection to the server""") > > > By the way, I'd suggest explaining (or linking to an explanation) > on the ppix main page describing how to configure easy_install such > that the '-i' option isn't necessary. If you send me some text, I'd be happy to add it to the ppix main page. > Perhaps we could add an example to the EasyInstall docs somewhere > near: > > http://peak.telecommunity.com/DevCenter/EasyInstall#creating-your- > own-package-index > > and then link to it from the ppix page. +1 Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From benji at benjiyork.com Fri Jul 20 22:04:29 2007 From: benji at benjiyork.com (Benji York) Date: Fri, 20 Jul 2007 16:04:29 -0400 Subject: [Catalog-sig] Cheeseshop down Message-ID: <46A1154D.7000708@benjiyork.com> Fulfilling my dutifully sworn obligation to report every instance of PYPI being down: """ Error... 
There's been a problem with your request psycopg.OperationalError: no connection to the server """ -- Benji York http://benjiyork.com From bray at sent.com Fri Jul 20 23:01:16 2007 From: bray at sent.com (Brian Ray) Date: Fri, 20 Jul 2007 16:01:16 -0500 Subject: [Catalog-sig] PyPI is down with a psycopg error In-Reply-To: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com> References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com> Message-ID: On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote: > > Error... > > There's been a problem with your request > > psycopg.OperationalError: no connection to the server > Come on! Still down. Not Good. Does anybody know a short term fix and a long term solution? Brian Ray bray at sent.com From jim at zope.com Fri Jul 20 23:10:26 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 20 Jul 2007 17:10:26 -0400 Subject: [Catalog-sig] PyPI is down with a psycopg error In-Reply-To: References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com> Message-ID: On Jul 20, 2007, at 5:01 PM, Brian Ray wrote: > > On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote: > >> >> Error... >> >> There's been a problem with your request >> >> psycopg.OperationalError: no connection to the server >> > > Come on! > > Still down. > > Not Good. Does anybody know a short term fix and a long term > solution? If you're using it for easy_install or buildout, use http://download.zope.org/ppix as your package index. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From richardjones at optushome.com.au Sat Jul 21 01:34:34 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Sat, 21 Jul 2007 09:34:34 +1000 Subject: [Catalog-sig] PyPI is down with a psycopg error In-Reply-To: References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com> Message-ID: <200707210934.34159.richardjones@optushome.com.au> On Sat, 21 Jul 2007, Brian Ray wrote: > On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote: > > Error... > > > > There's been a problem with your request > > > > psycopg.OperationalError: no connection to the server > > Come on! Yes, because complaining about it will fix it. Postgres is up and running, but the web interface is reporting the above errors as though it can't connect. I can only assume that the persistent connection has run into trouble. I've disabled persistent connections in the fcgi config, but now apache will need restarting. I'm trying to contact someone who can do that. > Not Good. Does anybody know a short term fix and a long term solution. You can volunteer to also be a maintainer of the system. Richard From richardjones at optushome.com.au Sat Jul 21 02:08:35 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Sat, 21 Jul 2007 10:08:35 +1000 Subject: [Catalog-sig] PyPI is down with a psycopg error In-Reply-To: <200707210934.34159.richardjones@optushome.com.au> References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com> <200707210934.34159.richardjones@optushome.com.au> Message-ID: <200707211008.35954.richardjones@optushome.com.au> On Sat, 21 Jul 2007, Richard Jones wrote: > I'm trying to contact someone who can do that. It looks like one of the volunteer sysadmins has now restarted apache and the database connection issues are no more. 
Richard From martin at v.loewis.de Sat Jul 21 08:05:13 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Jul 2007 08:05:13 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070720200721.88E1D3A403A@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070720200721.88E1D3A403A@sparrow.telecommunity.com> Message-ID: <46A1A219.60906@v.loewis.de> > (On a perhaps not entirely unrelated note, the Cheeseshop appears to > be down at the moment: > > """Error... > > There's been a problem with your request > > psycopg.OperationalError: no connection to the server""") Around that time, the Postgres log has these entries: 2007-07-20 21:53:24 [14636] LOG: received fast shutdown request 2007-07-20 21:53:24 [14636] LOG: aborting any active transactions 2007-07-20 21:53:24 [26166] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [15769] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [10390] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [31182] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [30066] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [10162] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [17452] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [17147] FATAL: terminating connection due to administrator command 2007-07-20 21:53:24 [1159] LOG: shutting down 2007-07-20 21:53:26 [1159] LOG: database system is shut down 2007-07-20 21:53:33 [1469] LOG: database system was shut down at 2007-07-20 21:53:26 CEST 2007-07-20 21:53:33 [1469] LOG: checkpoint record is at A/FD833F0 2007-07-20 21:53:33 [1469] LOG: redo record is at A/FD833F0; undo record is at 0/0; shutdown TRUE 2007-07-20 21:53:33 [1469] LOG: next transaction ID: 110977718; 
next OID: 61913929 2007-07-20 21:53:33 [1469] LOG: database system is ready and Sean Reifschneider was logged in, so I suspect he did some maintenance work. Sean? Regards, Martin From jafo at tummy.com Sat Jul 21 08:17:20 2007 From: jafo at tummy.com (Sean Reifschneider) Date: Sat, 21 Jul 2007 00:17:20 -0600 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A1A219.60906@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070720200721.88E1D3A403A@sparrow.telecommunity.com> <46A1A219.60906@v.loewis.de> Message-ID: <20070721061720.GB4489@tummy.com> On Sat, Jul 21, 2007 at 08:05:13AM +0200, "Martin v. Löwis" wrote: >Around that time, the Postgres log has these entries: There was an upgrade of Postgres done earlier. As far as I can see, pypi is running. It must have been resolved earlier. AMK mentioned there was a problem with the upgrade restart and Apache had to be restarted, that was like 6 hours ago though. Thanks, Sean -- "I not only use all the brains that I have, but all that I can borrow." -- Woodrow Wilson Sean Reifschneider, Member of Technical Staff tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability From martin at v.loewis.de Sat Jul 21 19:00:30 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Jul 2007 19:00:30 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. 
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> Message-ID: <46A23BAE.5090907@v.loewis.de> > I've created and experimental prototype setuptools-specific package > index at > > http://download.zope.org/ppix I've now added something similar as http://cheeseshop.python.org/simple/ It differs from your site in a few ways: - it does include a top-level index of all packages (but neither releases nor descriptions) - it's always current, due to being dynamically computed - it may differ in the precise list of URLs displayed; if there are important deviations, please let me know. Regards, Martin From jim at zope.com Sat Jul 21 19:12:48 2007 From: jim at zope.com (Jim Fulton) Date: Sat, 21 Jul 2007 13:12:48 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A23BAE.5090907@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> Message-ID: <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> On Jul 21, 2007, at 1:00 PM, Martin v. Löwis wrote: >> I've created and experimental prototype setuptools-specific package >> index at >> >> http://download.zope.org/ppix > > I've now added something similar as > > http://cheeseshop.python.org/simple/ Way cool! > > It differs from your site in a few ways: > > - it does include a top-level index of all packages (but neither > releases nor descriptions) Why? This is a relatively expensive page, due to its size I assume, that really provides no value. This will slow down setuptools. > - it's always current, due to being dynamically computed And also unreliable, for the same reason. For example, it would have been inaccessible yesterday afternoon. And also puts more load on the server. It would be much better imo if static pages could be written on writes. > - it may differ in the precise list of URLs displayed; > if there are important deviations, please let me know. 
The download and homepage URL anchors need rel="download" or rel="homepage". They lack the #egg= links. Compare your page for setuptools to mine. Also, some packages use their pypi pages as their home page links. You want to exclude these, otherwise, setuptools will circle around to the human interface, which defeats the point of the simple interface. Thanks for plugging away on this. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Sat Jul 21 19:48:16 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 21 Jul 2007 13:48:16 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A23BAE.5090907@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> Message-ID: <20070721174558.DDF923A403A@sparrow.telecommunity.com> At 07:00 PM 7/21/2007 +0200, Martin v. Löwis wrote: > > I've created and experimental prototype setuptools-specific package > > index at > > > > http://download.zope.org/ppix > >I've now added something similar as > >http://cheeseshop.python.org/simple/ It's very fast, thanks. >It differs from your site in a few ways: > >- it does include a top-level index of all packages (but neither > releases nor descriptions) Unfortunately, that doesn't help current versions of setuptools. See point #7 of: http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api Setuptools looks for release links, not package links on that page. Compare: $ easy_install -vvvi http://cheeseshop.python.org/simple Pywin32 Searching for Pywin32 Reading http://cheeseshop.python.org/simple/Pywin32/ Couldn't find index page for 'Pywin32' (maybe misspelled?) 
Scanning index of all packages (this may take a while) Reading http://cheeseshop.python.org/simple/ No local packages or download links found for Pywin32 error: Could not find suitable distribution for Requirement.parse('Pywin32') $ easy_install -vvvi http://cheeseshop.python.org/pypi Pywin32 Searching for Pywin32 Reading http://cheeseshop.python.org/pypi/Pywin32/ Couldn't find index page for 'Pywin32' (maybe misspelled?) Scanning index of all packages (this may take a while) Reading http://cheeseshop.python.org/pypi/ Reading http://cheeseshop.python.org/pypi/pywin32/210 Reading http://sf.net/projects/pywin32 ... >- it's always current, due to being dynamically computed >- it may differ in the precise list of URLs displayed; > if there are important deviations, please let me know. Jim's already mentioned these, but the rel="" info (per the index API spec's point #6), and the links embedded in the long_description field (per point #4) are missing. Without these, easy_install can't find sourceforge links, subversion checkouts, or any other embedded direct download links. 
For example: $ easy_install -vvvi http://cheeseshop.python.org/simple pywin32 Searching for pywin32 Reading http://cheeseshop.python.org/simple/pywin32/ No local packages or download links found for pywin32 error: Could not find suitable distribution for Requirement.parse('pywin32') $ easy_install -vvvi http://cheeseshop.python.org/pypi pywin32 Searching for pywin32 Reading http://cheeseshop.python.org/pypi/pywin32/ Reading http://sf.net/projects/pywin32 Reading http://sourceforge.net/project/showfiles.php?group_id=78018 Found link: http://downloads.sourceforge.net/pywin32/pywin32-210.win32-py2.2.exe?modtime=1159009204&big_mirror=0 ...[a dozen more links] $ easy_install -i http://cheeseshop.python.org/simple setuptools==dev Searching for setuptools==dev Reading http://cheeseshop.python.org/simple/setuptools/ No local packages or download links found for setuptools==dev error: Could not find suitable distribution for Requirement.parse('setuptools==dev') $ easy_install -i http://cheeseshop.python.org/pypi setuptools==dev Searching for setuptools==dev Reading http://cheeseshop.python.org/pypi/setuptools/ Reading http://cheeseshop.python.org/pypi/setuptools Reading http://cheeseshop.python.org/pypi/setuptools/0.6c6 Best match: setuptools dev Downloading http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev Doing subversion checkout from http://svn.python.org/projects/sandbox/trunk/setuptools/ to ... From martin at v.loewis.de Sat Jul 21 21:08:52 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Jul 2007 21:08:52 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. 
In-Reply-To: <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> Message-ID: <46A259C4.6090605@v.loewis.de> >> - it does include a top-level index of all packages (but neither >> releases nor descriptions) > > Why? This is a relatively expensive page, due to its size I assume, > that really provides no value. This will slow down setuptools. IIUC, it won't slow down setuptools, as setuptools looks at it only if it cannot find the real package page due to a misspelling. So as long as everything is spelled correctly, it should not provide any slowdown. If people do misspell a package name when invoking easy_install, they get the feature that you consider of no value. As for performance - 30 downloads take 3.9s currently from nearby. >> - it's always current, due to being dynamically computed > > And also unreliable, for the same reason. For example, it would have > been inaccessible yesterday afternoon. The same could happen to Apache, too, of course. svn.python.org sometimes fails to restart when a restart is requested on log rotation. Any software is unreliable; to reduce downtime, you need an operator that is available when something breaks. > And also puts more load on the server. It would be much better imo > if static pages could be written on writes. Contributions are welcome. In addition to me considering it futile, I also don't know how to implement it correctly. >> - it may differ in the precise list of URLs displayed; >> if there are important deviations, please let me know. > > The download and homepage URL anchors need rel="download" or > rel="homepage". Done. > They lack the #egg= links. How are these computed? > Also, some packages use their pypi pages as their home page links. Ok, done. 
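One way the exclusion of home page links that point back at the index could look, as a rough sketch only. It assumes the index lives under /pypi on the two hosts named in this thread; the helper names are hypothetical and this is not the code actually deployed on either index.

```python
from urllib.parse import urlparse

# Hosts taken from URLs quoted in this thread; treat the set as an assumption.
INDEX_HOSTS = {"cheeseshop.python.org", "www.python.org"}


def points_back_to_index(url):
    """True if url is the human-facing index itself or a page below it."""
    parts = urlparse(url)
    on_index_host = parts.netloc.lower() in INDEX_HOSTS
    under_pypi = parts.path == "/pypi" or parts.path.startswith("/pypi/")
    return on_index_host and under_pypi


def external_links(urls):
    """Keep only home page / download URLs worth listing on a simple page."""
    return [u for u in urls if not points_back_to_index(u)]


print(external_links([
    "http://cheeseshop.python.org/pypi/zc.buildout",
    "http://svn.zope.org/zc.buildout",
]))
```

Dropping such links keeps a tool that follows rel= links from circling back into the full human interface.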
Regards, Martin From martin at v.loewis.de Sat Jul 21 21:23:30 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Jul 2007 21:23:30 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <20070721174558.DDF923A403A@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> Message-ID: <46A25D32.4080606@v.loewis.de> > Unfortunately, that doesn't help current versions of setuptools. See > point #7 of: > > http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api > > Setuptools looks for release links, not package links on that page. I don't understand. What's a "release link"? The links on the index page *do* go to the "project's active version pages", as specified (there aren't any numbered version pages) Jim left out that page entirely - are you saying it is impossible to provide such an index page with the page structure that Jim proposed? > $ easy_install -vvvi http://cheeseshop.python.org/simple Pywin32 > Searching for Pywin32 > Reading http://cheeseshop.python.org/simple/Pywin32/ > Couldn't find index page for 'Pywin32' (maybe misspelled?) > Scanning index of all packages (this may take a while) > Reading http://cheeseshop.python.org/simple/ > No local packages or download links found for Pywin32 I see that it doesn't work, but I cannot understand why. On http://cheeseshop.python.org/simple/ "pywin32" is clearly linked, so it should be able to resolve the misspelling. > Jim's already mentioned these, but the rel="" info (per the index API > spec's point #6), This is fixed. > and the links embedded in the long_description field > (per point #4) are missing. I have to think about this more. Is it correct that you want all href attributes of all a elements in the long_description? And how do you know what the long_description is from just looking at the rendered page? 
Regards, Martin From pje at telecommunity.com Sat Jul 21 21:51:26 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 21 Jul 2007 15:51:26 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A25D32.4080606@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> Message-ID: <20070721194908.F16373A403A@sparrow.telecommunity.com> At 09:23 PM 7/21/2007 +0200, Martin v. L?wis wrote: > > Unfortunately, that doesn't help current versions of setuptools. See > > point #7 of: > > > > http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api > > > > Setuptools looks for release links, not package links on that page. > >I don't understand. What's a "release link"? The links on the index >page *do* go to the "project's active version pages", as specified >(there aren't any numbered version pages) See point #2: """2. Individual project version pages' URLs must be of the form base/projectname/version, where base is the package index's base URL.""" That's what's meant by "version pages" in point #7 -- i.e., they *must* be of that two-part form for setuptools to recognize them as such. >I see that it doesn't work, but I cannot understand why. >On > >http://cheeseshop.python.org/simple/ > >"pywin32" is clearly linked, so it should be able to resolve >the misspelling. It could perhaps be *changed* to do so, but at present it follows the spec's definition of "version page" URLs. > > Jim's already mentioned these, but the rel="" info (per the index API > > spec's point #6), > >This is fixed. Great; Sourceforge and other offsite download pages work now. > > and the links embedded in the long_description field > > (per point #4) are missing. > >I have to think about this more. Is it correct that you want all href >attributes of all a elements in the long_description? 
Yes; of course, the usual rendering needs to be applied, since long_description can contain reStructuredText. > And how do you >know what the long_description is from just looking at the rendered >page? You don't need to; easy_install discovers those links the same way it does any other Cheeseshop-provided download links. From easy_install's point of view, the entire page is just one big mass of links that might point to downloads: """4. ...It is explicitly permitted for a project's "long_description" to include URLs, and these should be formatted as HTML links by the package index, as EasyInstall does *no special processing* [emph. added] to identify what parts of a page are index-specific and which are part of the project's supplied description.""" In other words, the *only* links that are specially handled are the "rel" ones, which it follows unconditionally to look for additional direct download links. All other links are merely *inspected* to see if they obviously refer to a downloadable package (e.g. .tgz, .zip, .egg, .exe etc., or explicitly-marked #egg). As a side-effect, this means that links to perform Cheeseshop operations, links to other parts of python.org, etc. are simply ignored, as they are not links to downloadables nor marked as #egg. If a URL can be determined by inspection to be a download link, then easy_install extracts version and platform info from the URL and adds it as a candidate for download selection. When both the home page and download URL have been read, along with any detected "active version pages" (as defined above), then easy_install chooses the "best" download URL from all the candidates it has seen up to that point. From pje at telecommunity.com Sat Jul 21 21:53:40 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 21 Jul 2007 15:53:40 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> Message-ID: <20070721195122.CF2343A40D7@sparrow.telecommunity.com> At 09:23 PM 7/21/2007 +0200, Georg Brandl wrote: >What I, as an outsider, can see: for the Pygments package, Jim's page >lists the development link from the package description >(http://trac.pocoo.org/repos/pygments/trunk#egg=Pygments-dev), but >it looks like it's badly extracted (it has a trailing ">`__"), yours >doesn't list it at all. Hm, perhaps Jim is extracting it by looking for #egg URLs, rather than by actually processing the reST markup with docutils. That should probably be fixed, since there are many ways to specify URLs in reST and handling them all with regular expressions is unlikely to work as well as applying regular expressions to the resulting HTML. :) (Also, looking only for #egg links will miss non-#egg links embedded in the long_description, in the event that someone places direct download links there.) From martin at v.loewis.de Sun Jul 22 00:53:03 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 00:53:03 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <20070721194908.F16373A403A@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> <20070721194908.F16373A403A@sparrow.telecommunity.com> Message-ID: <46A28E4F.5070905@v.loewis.de> > See point #2: > > """2. Individual project version pages' URLs must be of the form > base/projectname/version, where base is the package index's base URL.""" > > That's what's meant by "version pages" in point #7 -- i.e., they *must* > be of that two-part form for setuptools to recognize them as such. Ok, but I still cannot see how to fix that: there simply *is* no version part that I could point to. 
Does that mean that Jim's approach does not work? > Yes; of course, the usual rendering needs to be applied, since > long_description can contain reStructuredText. Ok, I now added these links as well. Regards, Martin From pje at telecommunity.com Sun Jul 22 01:20:04 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 21 Jul 2007 19:20:04 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A28E4F.5070905@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> <20070721194908.F16373A403A@sparrow.telecommunity.com> <46A28E4F.5070905@v.loewis.de> Message-ID: <20070721231808.2D5793A403A@sparrow.telecommunity.com> At 12:53 AM 7/22/2007 +0200, Martin v. L?wis wrote: > > See point #2: > > > > """2. Individual project version pages' URLs must be of the form > > base/projectname/version, where base is the package index's base URL.""" > > > > That's what's meant by "version pages" in point #7 -- i.e., they *must* > > be of that two-part form for setuptools to recognize them as such. > >Ok, but I still cannot see how to fix that: there simply *is* no >version part that I could point to. Actually, 'version' is allowed to be an empty string, so simply adding a trailing '/' to the links you're generating now should work. The only thing the version part of a version page URL is used for, is to handle links to .py files: setuptools uses the package version (if available) to synthesize a setup.py for installing standalone .py files. If the version is not available, it won't be able to do that, but that's a relatively minor feature, all things considered. Few packages are distributed via a single .py download URL, but the package index could actually tack on an #egg designator to such links in order to preserve 100% backward-compatibility. >Does that mean that Jim's approach does not work? 
Jim isn't providing the top-level index, and thus doesn't provide punctuation or case corrections. The "version pages" convention is only used by setuptools to discover additional index pages for crawling, anyway, and his whole design is intended to prevent crawling. > > Yes; of course, the usual rendering needs to be applied, since > > long_description can contain reStructuredText. > >Ok, I now added these links as well. Looks good, thanks! From martin at v.loewis.de Sun Jul 22 09:42:19 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 09:42:19 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <20070721231808.2D5793A403A@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> <20070721194908.F16373A403A@sparrow.telecommunity.com> <46A28E4F.5070905@v.loewis.de> <20070721231808.2D5793A403A@sparrow.telecommunity.com> Message-ID: <46A30A5B.4020007@v.loewis.de> > Actually, 'version' is allowed to be an empty string, so simply adding a > trailing '/' to the links you're generating now should work. It does indeed. Regards, Martin From jim at zope.com Sun Jul 22 15:09:44 2007 From: jim at zope.com (Jim Fulton) Date: Sun, 22 Jul 2007 09:09:44 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A259C4.6090605@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> Message-ID: On Jul 21, 2007, at 3:08 PM, Martin v. L?wis wrote: >>> - it does include a top-level index of all packages (but neither >>> releases nor descriptions) >> >> Why? This is a relatively expensive page, due to it's size I assume, >> that really provides no value. This will slow down setuptools. 
> > IIUC, it won't slow down setuptools, as setuptools looks at it only > if it cannot find the real package page due to a misspelling. So > as long as everything is spelled correctly, it should not provide > any slowdown. > > If people do misspell a package name when invoking easy_install, > they get the feature that you consider of no value. That is not correct. Not all packages are in PyPI. Using a package that isn't in PyPI will trigger a fetch of that page. It isn't misspelled, it's just not there. People should *not* misspell pages when using setuptools. They should certainly not use misspelled package names in requirements. In my strongly held opinion, allowing imprecise names in requirements and setuptools commands is of negative value. > As for performance - 30 downloads take 3.9s currently from nearby. That's nice. For me, that page takes 3 or 4 times as long as other pages. >>> - it's always current, due to being dynamically computed >> >> And also unreliable, for the same reason. For example, it would have >> been inaccessible yesterday afternoon. > > The same could happen to Apache, too, of course. svn.python.org > sometimes fails to restart when a restart is requested on log rotation. > > Any software is unreliable; to reduce downtime, you need an operator > that is available when something breaks. Apache has a far better record than the cheeseshop. I give up. >> And also puts more load on the server. It would be much better imo >> if static pages could be written on writes. > > Contributions are welcome. In addition to me considering it futile, > I also don't know how to implement it correctly. I'd be happy to contribute my polling version. That solves my problems and I can't justify the additional effort to figure out the cheeseshop software. ... >> They lack the #egg= links. > > How are these computed? By parsing the description. Apparently, I'm doing this incorrectly. I'll have to look into that.
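For readers following the link-extraction discussion: the approach Phillip describes earlier in the thread (render the description to HTML first, then inspect every link on the page) might be sketched roughly as below. This is only an illustration, not setuptools' or the index's actual code. The names `LinkCollector`, `looks_like_download` and `classify_links` are hypothetical, the extension list contains just the examples named in the thread, and the input is assumed to be HTML that has already been rendered from reStructuredText.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse

# Assumption: only the example extensions named in the thread.
DOWNLOAD_EXTENSIONS = (".tgz", ".zip", ".egg", ".exe")


class LinkCollector(HTMLParser):
    """Collect (href, rel) pairs from every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            # A missing or valueless rel attribute is treated as empty.
            self.links.append((href, attr_map.get("rel") or ""))


def looks_like_download(href):
    """True if the URL is explicitly marked #egg=... or ends in a known extension."""
    url, fragment = urldefrag(href)
    if fragment.startswith("egg="):
        return True
    return urlparse(url).path.lower().endswith(DOWNLOAD_EXTENSIONS)


def classify_links(html):
    """Split links into rel-marked pages to follow and download candidates.

    Links that are neither (site navigation, index operations) are dropped,
    which mirrors the "everything else is simply ignored" behaviour
    described in the thread.
    """
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    follow, candidates = [], []
    for href, rel in parser.links:
        if rel:
            follow.append(href)
        elif looks_like_download(href):
            candidates.append(href)
    return follow, candidates
```

Because the inspection runs on rendered HTML, it does not matter which reST link syntax the author used, which is the point Phillip makes about regular expressions over raw reST.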
Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Sun Jul 22 15:16:44 2007 From: jim at zope.com (Jim Fulton) Date: Sun, 22 Jul 2007 09:16:44 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070721195122.CF2343A40D7@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721195122.CF2343A40D7@sparrow.telecommunity.com> Message-ID: On Jul 21, 2007, at 3:53 PM, Phillip J. Eby wrote: > At 09:23 PM 7/21/2007 +0200, Georg Brandl wrote: >> What I, as an outsider, can see: for the Pygments package, Jim's page >> lists the development link from the package description >> (http://trac.pocoo.org/repos/pygments/trunk#egg=Pygments-dev), but >> it looks like it's badly extracted (it has a trailing ">`__"), yours >> doesn't list it at all. > > Hm, perhaps Jim is extracting it by looking for #egg URLs, rather > than by actually processing the reST markup with docutils. Yup. > That should probably be fixed, since there are many ways to specify > URLs in reST and handling them all with regular expressions is > unlikely to work Yeah, I was hoping to get off easy. :) > as well as applying regular expressions to the resulting HTML. :) :) > (Also, looking only for #egg links will miss non-#egg links > embedded in the long_description, in the event that someone places > direct download links there.) By this, I assume you mean direct links to distributions. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Sun Jul 22 15:19:05 2007 From: jim at zope.com (Jim Fulton) Date: Sun, 22 Jul 2007 09:19:05 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. 
In-Reply-To: <20070721231808.2D5793A403A@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> <20070721194908.F16373A403A@sparrow.telecommunity.com> <46A28E4F.5070905@v.loewis.de> <20070721231808.2D5793A403A@sparrow.telecommunity.com> Message-ID: On Jul 21, 2007, at 7:20 PM, Phillip J. Eby wrote: ... > Jim isn't providing the top-level index, and thus doesn't provide > punctuation or case corrections. Yup > The "version pages" convention is only used by setuptools to > discover additional index pages for crawling, anyway, and his whole > design is intended to prevent crawling. That's a secondary benefit. The main goal is to avoid the expense of that page for packages that aren't in PyPI, as some packages I use aren't. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Sun Jul 22 18:24:41 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 18:24:41 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> Message-ID: <46A384C9.8040404@v.loewis.de> >> If people do misspell a package name when invoking easy_install, >> they get the feature that you consider of no value. > > That is not correct. Not all packages are in PyPI. Using a package that > isn't in PyPI will trigger a fetch of that page. I don't understand. What page is fetched if the package is not in PyPI? > It isn't misspelled, > it's just not there. People should *not* misspell pages when using > setuptools. They should certainly not use misspelled package names in > requirements. 
In my strongly help opinion, allowing imprecise names in > requirements and setuptools command if of negative value. I cannot comment on. I don't use setuptools, and have no intuition what is good or bad when using it (for example, I consider .egg files and the notion of eggs inherently bad). My main motivation to provide that page is that the setuptools specification says it should be there. As this entire infrastructure is for the sake of setuptools, I find it pointless to not support setuptools fully. > I'd be happy to contribute my polling version. That solves my problems > and I can't justify the additional effort to figure out the cheeseshop > softtware. I'd like to hear other opinions here. Would people prefer if the index was always correct (and perhaps somewhat slow), or would they prefer instead that it is super-efficient (and somewhat out-of-date)? Regards, Martin From martin at v.loewis.de Sun Jul 22 18:26:14 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 18:26:14 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> <20070721194908.F16373A403A@sparrow.telecommunity.com> <46A28E4F.5070905@v.loewis.de> <20070721231808.2D5793A403A@sparrow.telecommunity.com> Message-ID: <46A38526.2010308@v.loewis.de> > That's a secondary benefit. The main goal is to avoid the expense of > that page for packages that aren't in PyPI, as some packages I use aren't. I see. Shouldn't that be fixed by providing an option to setuptools that avoids going to the index for missing packages? Regards, Martin From tseaver at palladion.com Sun Jul 22 18:33:11 2007 From: tseaver at palladion.com (Tres Seaver) Date: Sun, 22 Jul 2007 12:33:11 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A384C9.8040404@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> Message-ID: <46A386C7.5080203@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin v. L?wis wrote: >>> If people do misspell a package name when invoking easy_install, >>> they get the feature that you consider of no value. >> That is not correct. Not all packages are in PyPI. Using a package that >> isn't in PyPI will trigger a fetch of that page. > > I don't understand. What page is fetched if the package is not in PyPI? I think Jim was referring to a package which is *registered* in PyPI, but whose download location was elsewhere. >> I'd be happy to contribute my polling version. That solves my problems >> and I can't justify the additional effort to figure out the cheeseshop >> softtware. > > I'd like to hear other opinions here. Would people prefer if the index > was always correct (and perhaps somewhat slow), or would they prefer > instead that it is super-efficient (and somewhat out-of-date)? I would prefer the second, particularly as I think the caching solution lends itself to mirroring, which would also improve availability. - From my complete ignorance of the underlying architecture: the polling solution would stay pretty current if there were an extremely cheap way to ask for the latest "transaction ID" on the cheeseshop, or if the query could fetch only registrations newer than the last poll time. Are such queries possible over the XML-RPC interface? Tres. 
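The incremental poll asked about here could look roughly like the sketch below. It assumes the index exposes an XML-RPC method named `changelog(since)` that returns entries whose first field is the project name; the exact tuple layout, the function name `poll_changes` and the server URL in the usage comment are assumptions for illustration, not something confirmed in this thread.

```python
import time


def poll_changes(client, last_poll):
    """Return (sorted changed project names, timestamp for the next poll).

    `client` is any object with a changelog(since) method, for example an
    xmlrpc.client.ServerProxy pointed at the index (assumed interface).
    """
    # Take the new timestamp *before* asking, so a change that lands while
    # the request is in flight is reported again next time, not missed.
    next_poll = int(time.time())
    entries = client.changelog(last_poll)
    # Assumption: each entry starts with the project name.
    changed = sorted({entry[0] for entry in entries})
    return changed, next_poll


# Hypothetical usage (not executed here; the URL is an assumption):
#   import xmlrpc.client
#   client = xmlrpc.client.ServerProxy("http://cheeseshop.python.org/pypi")
#   changed, stamp = poll_changes(client, stamp)
#   ...regenerate only the static pages for the projects in `changed`...
```

A mirror would call this on a timer and rebuild only the pages for the returned names, so the cost of each poll is proportional to recent activity and not to the size of the index.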
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGo4bH+gerLs4ltQ4RAjiWAJ9/5TeOWAHdwL7PS5QAUnpyZWJzMQCeN5hT 5rRjOHzAu4cf+TKktNntWV8= =p59N -----END PGP SIGNATURE-----
From pje at telecommunity.com Sun Jul 22 18:40:11 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 22 Jul 2007 12:40:11 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A38526.2010308@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <20070721174558.DDF923A403A@sparrow.telecommunity.com> <46A25D32.4080606@v.loewis.de> <20070721194908.F16373A403A@sparrow.telecommunity.com> <46A28E4F.5070905@v.loewis.de> <20070721231808.2D5793A403A@sparrow.telecommunity.com> <46A38526.2010308@v.loewis.de> Message-ID: <20070722163754.A78EF3A40A9@sparrow.telecommunity.com> At 06:26 PM 7/22/2007 +0200, Martin v. L?wis wrote: > > That's a secondary benefit. The main goal is to avoid the expense of > > that page for packages that aren't in PyPI, as some packages I use aren't. > >I see. Shouldn't that be fixed by providing an option to setuptools >that avoids going to the index for missing packages? There's already such an option; --find-links or -f lets you specify URLs that should be checked before *any* PyPI access occurs.
If all dependencies can be met using those URLs without going to PyPI, and you haven't explicitly requested -U (--update), easy_install doesn't go to PyPI. You can also specify such links in a setup script using setup(dependency_links=[...]), which bakes them into the .egg. When searching for that egg's dependencies, easy_install will pick them up and use them. So, it's actually possible to install a package and all its dependencies without using PyPI at all, if the package author(s) bake the URLs in. From jim at zope.com Sun Jul 22 18:38:09 2007 From: jim at zope.com (Jim Fulton) Date: Sun, 22 Jul 2007 12:38:09 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A384C9.8040404@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> Message-ID: On Jul 22, 2007, at 12:24 PM, Martin v. L?wis wrote: >>> If people do misspell a package name when invoking easy_install, >>> they get the feature that you consider of no value. >> >> That is not correct. Not all packages are in PyPI. Using a >> package that >> isn't in PyPI will trigger a fetch of that page. > > I don't understand. What page is fetched if the package is not in > PyPI? We have lots of packages that aren't in PyPI. Some of them aren't ready for PyPI or are not of general interest. Some are proprietary. >> It isn't misspelled, >> it's just not there. People should *not* misspell pages when using >> setuptools. They should certainly not use misspelled package >> names in >> requirements. In my strongly help opinion, allowing imprecise >> names in >> requirements and setuptools command if of negative value. > > I cannot comment on. I don't use setuptools, and have no intuition > what > is good or bad when using it (for example, I consider .egg files and > the notion of eggs inherently bad). 
> > My main motivation to provide that page is that the setuptools > specification says it should be there. As this entire infrastructure > is for the sake of setuptools, I find it pointless to not support > setuptools fully. Fair enough. Theory beats practicality every time. ;) >> I'd be happy to contribute my polling version. That solves my >> problems >> and I can't justify the additional effort to figure out the >> cheeseshop >> softtware. > > I'd like to hear other opinions here. Yes. This has been a fairly limited discussion. Sigh. > Would people prefer if the index > was always correct (and perhaps somewhat slow), or would they prefer > instead that it is super-efficient (and somewhat out-of-date)? Where somewhat out of date could be a matter of seconds. IMO, a python.org index could poll every few seconds, given that local polling only takes a few milliseconds. I have a feeling that this discussion is going to annoy someone with PyPI software knowledge enough to add baking on write. :) For example, I had the impression that Rene' was planning to invoke scripts after updates. It would be easy to invoke my polling script or a script based on your work, BTW, I'm pretty sure that geographic mirrors are desirable, both for performance and redundancy reasons. I think that, for these, polling once a minute is plenty and puts negligible load on PyPI, assuming that there aren't hundreds of them. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Sun Jul 22 18:41:55 2007 From: jim at zope.com (Jim Fulton) Date: Sun, 22 Jul 2007 12:41:55 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A386C7.5080203@palladion.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A386C7.5080203@palladion.com> Message-ID: <437D4304-ECF3-4240-8C33-F946128F8232@zope.com> On Jul 22, 2007, at 12:33 PM, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Martin v. L?wis wrote: >>>> If people do misspell a package name when invoking easy_install, >>>> they get the feature that you consider of no value. >>> That is not correct. Not all packages are in PyPI. Using a >>> package that >>> isn't in PyPI will trigger a fetch of that page. >> >> I don't understand. What page is fetched if the package is not in >> PyPI? > > I think Jim was referring to a package which is *registered* in PyPI, > but whose download location was elsewhere. No, I was referring to packages that aren't ready for or of interest to PyPI or to proprietary packages. ... > - From my complete ignorance of the underlying architecture: the > polling > solution would stay pretty current if there were an extremely cheap > way > to ask for the latest "transaction ID" on the cheeseshop, or if the > query could fetch only registrations newer than the last poll time. There is such an API thanks to Martin. > Are > such queries possible over the XML-RPC interface? Yup. I'm using them. Queries take only a few milliseconds per request on the server. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Sun Jul 22 18:51:40 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 22 Jul 2007 12:51:40 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> Message-ID: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote: >People should *not* misspell pages >when using setuptools. They should certainly not use misspelled >package names in requirements. People do all sorts of things they shouldn't. That doesn't stop them blaming other people for their mistakes. It's said that a 10% improvement in ease-of-use can double a product's users. Case sensitivity is a barrier to entry for new users, and setuptools can't afford any additional entry barriers. A significant part of setuptools' audience includes people who are new to Python, or at least new to installing or distributing Python modules, and quite a lot of setuptools features are aimed squarely at that audience. This happens to be one of them. > In my strongly help opinion, allowing >imprecise names in requirements and setuptools command if of negative >value. I understand that perspective. But practicality beats purity, and this is absolutely a "worse is better" type of situation. Setuptools has lots of features that are targeted at different audiences. There are plenty of features targeted at the group you're in, don't begrudge the other groups their features. :) (This is probably one reason that setuptools is so controversial; everybody can find *something* about it to hate, even if those very same things are quite loved by a different group of users. E.g. you and case-insensitivity, Martin and eggs, etc.) From martin at v.loewis.de Sun Jul 22 18:54:36 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 18:54:36 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A386C7.5080203@palladion.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A386C7.5080203@palladion.com> Message-ID: <46A38BCC.1000707@v.loewis.de> > I would prefer the second, particularly as I think the caching solution > lends itself to mirroring, which would also improve availability. I think this conclusion is wrong: Jim already has a mirror infrastructure that anybody can run, without the need of running that on the central server. > - From my complete ignorance of the underlying architecture: the polling > solution would stay pretty current if there were an extremely cheap way > to ask for the latest "transaction ID" on the cheeseshop, or if the > query could fetch only registrations newer than the last poll time. Are > such queries possible over the XML-RPC interface? Yes; you can ask for all changes since a certain UTC time. People shouldn't invoke that every UTC second, though - once a minute is fine. Regards, Martin From martin at v.loewis.de Sun Jul 22 19:03:49 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 19:03:49 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> Message-ID: <46A38DF5.6010701@v.loewis.de> Jim Fulton schrieb: > On Jul 22, 2007, at 12:24 PM, Martin v. L?wis wrote: >>>> If people do misspell a package name when invoking easy_install, >>>> they get the feature that you consider of no value. >>> >>> That is not correct. Not all packages are in PyPI. Using a package that >>> isn't in PyPI will trigger a fetch of that page. >> >> I don't understand. 
What page is fetched if the package is not in PyPI? > > We have lots of packages that aren't in PyPI. Some of them aren't ready > for PyPI or are not of general interest. Some are proprietary. Ah, ok. So I stand by my original statement (the one you classified as incorrect): *If* I do misspell a package name, *then* setuptools will correct the spelling if the index page is available. >> Would people prefer if the index >> was always correct (and perhaps somewhat slow), or would they prefer >> instead that it is super-efficient (and somewhat out-of-date)? > > Where somewhat out of date could be a matter of seconds. And where somewhat slower could be "practically not noticeable". > BTW, I'm pretty sure that geographic mirrors are desirable, both for > performance and redundancy reasons. I think that, for these, polling > once a minute is plenty and puts negligible load on PyPI, assuming that > there aren't hundreds of them. Sure: I don't mind at all if more people run your software on their machines. If people want it more official, we can have "cheeseshop0.python.org", "cheeseshop1.python.org", and so on, or "de.cheeseshop.python.org", "jp.cheeseshop.python.org", and so on. As I said before: if people also want to mirror the files, I'd ask them to provide download statistics. Given the changelog, it would be easy to keep a file mirror up-to-date (of course, if a mirror downloads all files, these downloads also count towards the download statistics - which might confuse people). Regards, Martin From martin at v.loewis.de Sun Jul 22 20:40:05 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 22 Jul 2007 20:40:05 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> Message-ID: <46A3A485.7060602@v.loewis.de> > WRT zc.buildout, refreshing a buildout with just ZODB installed in it > takes about 45 seconds for me using PyPI and about 5 seconds using > the experimental index. Can you kindly provide a measurement for the index at http://cheeseshop.python.org/simple/ as well? Thanks, Martin From fdrake at gmail.com Mon Jul 23 06:56:48 2007 From: fdrake at gmail.com (Fred Drake) Date: Mon, 23 Jul 2007 00:56:48 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> Message-ID: <9cee7ab80707222156o2bae8a32pdaf7767f8c167918@mail.gmail.com> On 7/22/07, Phillip J. Eby wrote: > Setuptools has lots of features that are targeted at different > audiences. There are plenty of features targeted at the group you're > in, don't begrudge the other groups their features. :) Actually, I suspect this is a substantial contributor to setuptools being considered controversial: it encompasses too many different features. That certainly keeps me feeling unhappy about depending on it. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." --Henry Miller From jim at zope.com Mon Jul 23 12:59:44 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 23 Jul 2007 06:59:44 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38DF5.6010701@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> Message-ID: On Jul 22, 2007, at 1:03 PM, Martin v. Löwis wrote: > Jim Fulton wrote: >> On Jul 22, 2007, at 12:24 PM, Martin v. Löwis wrote: >>>>> If people do misspell a package name when invoking easy_install, >>>>> they get the feature that you consider of no value. >>>> >>>> That is not correct. Not all packages are in PyPI. Using a >>>> package that >>>> isn't in PyPI will trigger a fetch of that page. >>> >>> I don't understand. What page is fetched if the package is not in >>> PyPI? >> >> We have lots of packages that aren't in PyPI. Some of them aren't >> ready >> for PyPI or are not of general interest. Some are proprietary. > > Ah, ok. So I stand by my original statement (the one you classified > as incorrect): *If* I do misspell a package name, *then* setuptools > will correct the spelling if the index page is available. Your full original statement was: On Jul 21, 2007, at 3:08 PM, Martin v. Löwis wrote: > IIUC, it won't slow down setuptools, as setuptools looks at it only > if it cannot find the real package page due to a misspelling. So > as long as everything is spelled correctly, it should not provide > any slowdown. > > If people do misspell a package name when invoking easy_install, > they get the feature that you consider of no value. I was referring to the part about not slowing things down when people didn't misspell. But it looks like I was mistaken. It was my understanding that setuptools always checked index/ when it couldn't find index/package_name/, but as Phillip pointed out, if it finds a package via find links, it won't look at index/. Basic tests seem to confirm this.
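[Editorial aside, not part of the original message.] The lookup order described in the paragraph above can be sketched as a toy model. This is only an illustration of the behaviour as reported in this thread, not setuptools' actual code; the function name `resolve`, its arguments, and the dictionaries standing in for find-links and the index are all hypothetical.

```python
def resolve(name, find_links, index):
    """Toy model of the lookup order discussed in this thread.

    find_links -- dict of exact project name -> URL (stands in for find-links)
    index      -- dict of exact project name -> URL (stands in for index/<name>/)
    Returns (url_or_None, list_of_index_pages_fetched).
    """
    fetched = []

    # 1. A hit in find-links ends the search; no index page is requested.
    if name in find_links:
        return find_links[name], fetched

    # 2. Otherwise request the project's own page, index/<name>/.
    fetched.append("index/%s/" % name)
    if name in index:
        return index[name], fetched

    # 3. Only if that page is missing is the full listing, index/, fetched
    #    and scanned for a case-insensitive match.
    fetched.append("index/")
    for real_name, url in index.items():
        if real_name.lower() == name.lower():
            return url, fetched
    return None, fetched
```

In this model a private package found through find-links never touches the index, a correctly spelled public name costs one page fetch, and only a misspelled or unknown name pays for the large index/ listing.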
>>> Would people prefer if the index >>> was always correct (and perhaps somewhat slow), or would they prefer >>> instead that it is super-efficient (and somewhat out-of-date)? >> >> Where somewhat out of date could be a matter of seconds. > > And where somewhat slower could be "practically not noticable". I wasn't arguing about speed. I agree that when PyPI is working well, the difference between the speed of the dynamic page and the speed of a static page wouldn't be noticeable. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Jul 23 13:08:50 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 23 Jul 2007 07:08:50 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> Message-ID: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote: > At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote: >> People should *not* misspell pages >> when using setuptools. They should certainly not use misspelled >> package names in requirements. > > People do all sorts of things they shouldn't. That doesn't stop > them blaming other people for their mistakes. > > It's said that a 10% improvement in ease-of-use can double a > product's users. Case sensitivity is a barrier to entry for new > users, and setuptools can't afford any additional entry barriers. I totally don't buy this in a case like this. People installing packages with setuptools are technical users. We expect them to write Python scripts. 
> A significant part of setuptools' audience includes people who are > new to Python, or at least new to installing or distributing Python > modules, and quite a lot of setuptools features are aimed squarely > at that audience. This happens to be one of them. I don't think that encouraging use of case insensitive names by people who are about to start learning a language that uses case sensitive names is doing them any favors. >> In my strongly held opinion, allowing >> imprecise names in requirements and setuptools command is of negative >> value. > > I understand that perspective. But practicality beats purity, and > this is absolutely a "worse is better" type of situation. Obviously we disagree. > Setuptools has lots of features that are targeted at different > audiences. There are plenty of features targeted at the group > you're in, don't begrudge the other groups their features. :) I don't think you are helping them. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Jul 23 13:36:45 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 23 Jul 2007 07:36:45 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A3A485.7060602@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A3A485.7060602@v.loewis.de> Message-ID: <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com> On Jul 22, 2007, at 2:40 PM, Martin v. Löwis wrote: >> WRT zc.buildout, refreshing a buildout with just ZODB installed in it >> takes about 45 seconds for me using PyPI and about 5 seconds using >> the experimental index. > > Can you kindly provide a measurement for the index at > http://cheeseshop.python.org/simple/ as well? Yup.
So, ATM: Using old PyPI takes about 1m5s Using simple takes about 25s Using ppix takes about 8s Some notes: - ZODB isn't the best example as it has download links to www.zope.org, making it take longer than packages without offsite links (relative to PyPI). - I expect that the difference between simple and ppix *for me* is a matter of geography. Refreshing an empty buildout checks the zc.buildout and setuptools packages. For that: Old PyPI takes 25s Simple takes 8s and ppix takes .5s Again, I assume that the difference between simple and ppix has more to do with geography than the difference between serving statically and dynamically. The simple page has more links on it than the ppix page, because I haven't gotten around to scarf all links off of a restructured-text rendering of long description. I doubt that makes any difference. It will be interesting to try again after I fix that. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Mon Jul 23 17:22:30 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 23 Jul 2007 11:22:30 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> Message-ID: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote: >On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote: >>At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote: >>>People should *not* misspell pages >>>when using setuptools. They should certainly not use misspelled >>>package names in requirements. 
>> >>People do all sorts of things they shouldn't. That doesn't stop >>them blaming other people for their mistakes. >> >>It's said that a 10% improvement in ease-of-use can double a >>product's users. Case sensitivity is a barrier to entry for new >>users, and setuptools can't afford any additional entry barriers. > >I totally don't buy this in a case like this. People installing >packages with setuptools are technical users. We expect them to >write Python scripts. No, "we" don't. Eggs were created to support application-level plugins, such as are used by Trac and Chandler. Trac and Chandler users are not necessarily programmers, let alone Python programmers. From tseaver at palladion.com Mon Jul 23 18:01:02 2007 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 23 Jul 2007 12:01:02 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> Message-ID: <46A4D0BE.4030706@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Phillip J. Eby wrote: > At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote: >> On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote: >>> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote: >>>> People should *not* misspell pages >>>> when using setuptools. They should certainly not use misspelled >>>> package names in requirements. >>> People do all sorts of things they shouldn't. That doesn't stop >>> them blaming other people for their mistakes. >>> >>> It's said that a 10% improvement in ease-of-use can double a >>> product's users. 
Case sensitivity is a barrier to entry for new >>> users, and setuptools can't afford any additional entry barriers. >> I totally don't buy this in a case like this. People installing >> packages with setuptools are technical users. We expect them to >> write Python scripts. > > No, "we" don't. Eggs were created to support application-level > plugins, such as are used by Trac and Chandler. Trac and Chandler > users are not necessarily programmers, let alone Python programmers. But by definition, the people typing the names of the dependencies into a 'setup.py' for such a plugin *are* Python programmers, and could be expected to know about case sensitivity. I don't think Jim was arguing that human-centric *search* should punish misspellings, but rather that encouraging such sloppiness in other packages is a misfeature, especially if supporting it induces a tax on *all* users of automated dependency resolution. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGpNC++gerLs4ltQ4RAr2HAJ9UdPIVdz36inTG7nkm8SnrWPpcOgCgjKPc sOqbuwOhUvlsSYpgxFSz1mg= =F1EY -----END PGP SIGNATURE-----
From noah.gift at gmail.com Mon Jul 23 18:37:47 2007 From: noah.gift at gmail.com (Noah Gift) Date: Mon, 23 Jul 2007 12:37:47 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A4D0BE.4030706@palladion.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> Message-ID: > > > But by definition, the people typing the names of the dependencies into > a 'setup.py' for such a plugin *are* Python programmers, and could be > expected to know about case sensitivity. > > I don't think Jim was arguing that human-centric *search* should punish > misspellings, but rather that encouraging such sloppiness in other > packages is a misfeature, especially if supporting it induces a tax on > *all* users of automated dependency resolution. > > In my humble opinion, I for one completely agree with Phillip. I have had to sit down with quite a few new Python Programmers and show them how to use easy_install and I "thank God" easy_install is smart enough to figure out case sensitivity. This is a wonderful feature!!!! Please don't ever get rid of it :) Not being able to install a package as they couldn't figure out the exact name of the package could be the final straw for some new programmer to Python!
Noah Gift From barry at python.org Mon Jul 23 18:46:24 2007 From: barry at python.org (Barry Warsaw) Date: Mon, 23 Jul 2007 12:46:24 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A4D0BE.4030706@palladion.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jul 23, 2007, at 12:01 PM, Tres Seaver wrote: >>>> It's said that a 10% improvement in ease-of-use can double a >>>> product's users. Under that principle, can I renew my plea for a better name than "easy_install"? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRqTbYHEjvBPtnXfVAQIHmgP+L5eDz3n4mrcPk5K6NEexQPLrOT9iSd+w cFYhn+FL5QoK6snRfxFp25KFmdz/raKDeGpQ4ZIy3nhpZTqxeQpPCsAg84rrw0lQ lflPXkMMmZJTi+3JmjXc2mhj2SlHZ+73XxRPcD2NKnqr14sxlunJMPe4/IX+y1Rf 9C5WVwoCiJ0= =b+zs -----END PGP SIGNATURE----- From jodok at lovelysystems.com Mon Jul 23 19:56:45 2007 From: jodok at lovelysystems.com (Jodok Batlogg) Date: Mon, 23 Jul 2007 13:56:45 -0400 Subject: [Catalog-sig] setuptools upload to pypi Message-ID: hi, i can't upload a new egg to cheeseshop... running "python setup.py bdist_egg register upload" hangs for several minutes at "Using PyPI login from /Users/jodok/.pypirc". entering username and password interactively results in the same. the webinterface seems to work fine (at least browsing) any idea? thanks jodok -- "Now is better than never."
-- The Zen of Python, by Tim Peters Jodok Batlogg, Lovely Systems Schmelzhütterstraße 26a, 6850 Dornbirn, Austria phone: +43 5572 908060, fax: +43 5572 908060-77 From kantrn at rpi.edu Mon Jul 23 20:28:27 2007 From: kantrn at rpi.edu (Noah Kantrowitz) Date: Mon, 23 Jul 2007 14:28:27 -0400 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: References: Message-ID: <46A4F34B.4090004@rpi.edu> I've been seeing that this morning too. Uploads work fine, it's just the register that seems to fail. --Noah Jodok Batlogg wrote: > hi, > > i can't upload a new egg to cheeseshop... > > running "python setup.py bdist_egg register upload" hangs for several > minutes at "Using PyPI login from /Users/jodok/.pypirc". > entering username and password interactively results in the same. > the webinterface seems to work fine (at least browsing) > > any idea? > > thanks > > jodok > > -- > "Now is better than never." > -- The Zen of Python, by Tim Peters > > Jodok Batlogg, Lovely Systems > Schmelzhütterstraße 26a, 6850 Dornbirn, Austria > phone: +43 5572 908060, fax: +43 5572 908060-77 > From tseaver at palladion.com Mon Jul 23 20:48:40 2007 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 23 Jul 2007 14:48:40 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index.
In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> Message-ID: <46A4F808.4050406@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Noah Gift wrote: >> >> But by definition, the people typing the names of the dependencies into >> a 'setup.py' for such a plugin *are* Python programmers, and could be >> expected to know about case sensitivity. >> >> I don't think Jim was arguing that human-centric *search* should punish >> misspellings, but rather that encouraging such sloppiness in other >> packages is a misfeature, especially if supporting it induces a tax on >> *all* users of automated dependency resolution. >> >> > In my humble opinion, I for one completely agree with Phillip. I have had > to sit down with quite a few new Python Programmers and show them how to use > easy_install and I "thank God" easy_install is smart enough to figure out > case sensitivity. This is a wonderful feature!!!! Please don't ever get > rid of it :) > Not being able to install a package as they couldn't figure out the exact > name of the package could be the final straw for some new programmer to > Python! There are two different use cases here: 1. User mis-types the name of a package on the command line, e.g.: $ easy_install Foo when it should be spelled: $ easy_install foo Being forgiving of case-mangling here is a concern of the easy_install *application*, and is non-controversial. 2. Programmer mis-types the name of a package in the dependencies for his own package, e.g.: setup(install_requires=['Foo']...) In this case, coddling the error causes it to *propagate*, because other programmers will copy it directly, or depend on the error-filled package.
Worse, the cost of error correction is transferred to *all* users of the setuptools library, even if they never use 'easy_install' at all. I'm fine with leaving the newbie-friendly behavior in 'easy_install'; I just don't like the performance hit it induces on users of setuptools who *can* spell. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGpPgI+gerLs4ltQ4RApzMAJ0WP6gzaM8n99fxkyo0Se285Te3bQCg1vxF 6ihYIENH8GpsQ7/ZF062T4Q= =OuxU -----END PGP SIGNATURE----- From benji at benjiyork.com Mon Jul 23 20:54:27 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 14:54:27 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A384C9.8040404@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> Message-ID: <46A4F963.3040609@benjiyork.com> Martin v. Löwis wrote: > would they prefer instead that it is super-efficient (and somewhat > out-of-date)? Yes. At most a few minutes out of date and faster/more reliable would be my strong preference. -- Benji York http://benjiyork.com From jim at zope.com Mon Jul 23 20:55:16 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 23 Jul 2007 14:55:16 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A4F808.4050406@palladion.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A4F808.4050406@palladion.com> Message-ID: <9FFADEB3-0E83-417E-B6EE-AF9A172690D0@zope.com> On Jul 23, 2007, at 2:48 PM, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Noah Gift wrote: >>> >>> But by definition, the people typing the names of the >>> dependencies into >>> a 'setup.py' for such a plugin *are* Python programmers, and >>> could be >>> expected to know about case sensitivity. >>> >>> I don't think Jim was arguing that human-centric *search* should >>> punish >>> misspellings, but rather that encouraging such sloppiness in other >>> packages is a misfeature, especially if supporting it induces a >>> tax on >>> *all* users of automated dependency resolution. >>> >>> >> In my humble opinion, I for one completely agree with Phillip.
I >> have had >> to sit down with quite a few new Python Programmers and show them >> how to use >> easy_install and I "thank God" easy_install is smart enough to >> figure out >> case sensitivity. This is a wonderful feature!!!! Please don't >> ever get >> rid of it :) >> Not being able to install a package as they couldn't figure out >> the exact >> name of the package could be the final straw for some new >> programmer to >> Python! > > There are two different use cases here: > > 1. User mis-types the name of a package on the command line, e.g.: > > $ easy_install Foo > > when it should be spelled: > > $ easy_install foo > > Being forgiving of case-mangling here is a concern of the > easy_install *application*, and is non-controversial. For me this is potentially controversial because: > 2. Programmer mis-types the name of a package in the dependencies > for his own package, e.g.: > > setup(install_requires=['Foo']...) Note that this might be intentional, as opposed to a typo. The programmer will think "Foo" is a valid name because it worked with easy_install. It's true that easy_install prints a warning, but it is buried in so much output that it is easily missed or ignored. > In this case, coddling the error causes it to *propagate*, because > other programmers will copy it directly, or depend on the > error-filled package. Worse, the cost of error correction is > transferred > to *all* users of the setuptools library, even if they never use > 'easy_install' at all. Well said. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From benji at benjiyork.com Mon Jul 23 20:58:44 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 14:58:44 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38DF5.6010701@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> Message-ID: <46A4FA64.5050404@benjiyork.com> Martin v. Löwis wrote: > And where somewhat slower could be "practically not noticeable". Perhaps it /could/ be, but isn't currently. For example, updating one piece of software I have with almost 150 dependencies takes 45 seconds with ppix, 4:45 without. I plan to do similar timings with the "simple" PyPI interface when I get a chance and report the results here. -- Benji York http://benjiyork.com From jim at zope.com Mon Jul 23 21:06:46 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 23 Jul 2007 15:06:46 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A4FA64.5050404@benjiyork.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> Message-ID: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> On Jul 23, 2007, at 2:58 PM, Benji York wrote: > Martin v. Löwis wrote: >> And where somewhat slower could be "practically not noticeable". > > Perhaps it /could/ be, but isn't currently. For example, updating > one piece of software I have with almost 150 dependencies takes 45 > seconds with ppix, 4:45 without. I plan to do similar timings with > the "simple" PyPI interface when I get a chance and report the > results here. I suspect that this has more to do with network distance than with server speed. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From noah.gift at gmail.com Mon Jul 23 21:30:44 2007 From: noah.gift at gmail.com (Noah Gift) Date: Mon, 23 Jul 2007 15:30:44 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> Message-ID: On 7/23/07, Jim Fulton wrote: > > > On Jul 23, 2007, at 2:58 PM, Benji York wrote: > > > Martin v. Löwis wrote: > >> And where somewhat slower could be "practically not noticeable". > > > > Perhaps it /could/ be, but isn't currently. For example, updating > > one piece of software I have with almost 150 dependencies takes 45 > > seconds with ppix, 4:45 without. I plan to do similar timings with > > the "simple" PyPI interface when I get a chance and report the > > results here. > > I suspect that this has more to do with network distance than with > server speed. That is an interesting point. It is amazing how many directory type things get slammed, but the problem is really latency, such as a slow DNS lookup. I wonder how much quicker an easy_install would be with local DNS lookups, package names, etc. I had a problem with an LDAP server I set up that was really tricky to figure out until I wrote some scripts that ran continuously getting stats, and I realized that a DNS server would hang occasionally and it would grind everything to a halt. People kept telling me they would have an occasional 'ls -l' that would hang for 20 seconds. Caching DNS servers fixed it. > Jim > > -- > Jim Fulton mailto:jim at zope.com Python > Powered!
> CTO (540) 361-1714 > http://www.python.org > Zope Corporation http://www.zope.com > http://www.zope.org -- http://www.blog.noahgift.com From benji at benjiyork.com Mon Jul 23 21:41:05 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 15:41:05 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> Message-ID: <46A50451.5050908@benjiyork.com> Jim Fulton wrote: > On Jul 23, 2007, at 2:58 PM, Benji York wrote: > >> Martin v. Löwis wrote: >>> And where somewhat slower could be "practically not noticeable". >> Perhaps it /could/ be, but isn't currently. For example, updating >> one piece of software I have with almost 150 dependencies takes 45 >> seconds with ppix, 4:45 without. I plan to do similar timings with >> the "simple" PyPI interface when I get a chance and report the >> results here. > > I suspect that this has more to do with network distance than with > server speed. That's actually my point. Geographically distributed mirrors that are a little out of sync are much more valuable (IMO) than a centralized service that is absolutely up to date, but "far" away. For me the static/dynamic argument is more about stability, and central/distributed is more about (network) speed.
-- Benji York http://benjiyork.com From benji at benjiyork.com Mon Jul 23 21:02:08 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 15:02:08 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> Message-ID: <46A4FB30.2000304@benjiyork.com> Jim Fulton wrote: > On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote: >> A significant part of setuptools' audience includes people who are >> new to Python, or at least new to installing or distributing Python >> modules, and quite a lot of setuptools features are aimed squarely >> at that audience. This happens to be one of them. > > I don't think that encouraging use of case insensitive names by > people who are about start learning a language that uses case > sensitive names is doing them any favors. Agreed. -- Benji York http://benjiyork.com From benji at benjiyork.com Mon Jul 23 21:05:42 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 15:05:42 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> Message-ID: <46A4FC06.3010109@benjiyork.com> Noah Gift wrote: > In my humble opinion, I for one completely agree with Phillip. 
I have had to > sit down with quite a few new Python Programmers and show them how to use > easy_install and I "thank God" easy_install is smart enough to figure out case > sensitivity. This is a wonderful feature!!!! Please don't ever get rid of it :) If easy_install had instead said "sorry, I can't find 'foo', perhaps you meant 'Foo'", then the user would be both spared frustration and enlightened. -- Benji York http://benjiyork.com From martin at v.loewis.de Mon Jul 23 22:00:12 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jul 2007 22:00:12 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A3A485.7060602@v.loewis.de> <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com> Message-ID: <46A508CC.5010706@v.loewis.de> > Yup. So, ATM: > > Using old PyPI takes about 1m5s > Using simple takes about 25s > Using ppix takes about 8s Thanks! > Again, I assume that the difference between simple and ppix has more to > do with geography than the difference between serving statically and > dynamically. The simple page has more links on it than the ppix page, > because I haven't gotten around to scarf all links off of a > restructured-text rendering of long description. I doubt that makes any > difference. It will be interesting to try again after I fix that. If you think that the /simple pages are correct, it might be easier to just mirror them instead of doing all the work yourself. I don't plan to take that service offline, unless experimentation shows it has serious flaws. Regards, Martin From martin at v.loewis.de Mon Jul 23 22:04:36 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jul 2007 22:04:36 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A4D0BE.4030706@palladion.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> Message-ID: <46A509D4.3070108@v.loewis.de> > But by definition, the people typing the names of the dependencies into > a 'setup.py' for such a plugin *are* Python programmers, and could be > expected to know about case sensitivity. > > I don't think Jim was areguing that human-centric *search* should punish > misspellings, but rather that encouraging such sloppiness in other > packages is a misfeature, especially if supporting it induces a tax on > *all* users of automated dependency resolution. Right. I think Phillip is primarily talking about package names as specified on the command line of easy_install. So if your concern is about package names specified in dependencies, one solution could be that setuptools distinguishes whether to apply case corrections and normalization, depending on whether it was an end-user-typed name or a programmer-specified one. What I don't know is how difficult that would be to implement, and what volunteer is supposed to implement it if it were easy/possible, so I by no means propose that such a solution should be implemented, even if it would solve the problem. Regards, Martin From rlratzel at enthought.com Mon Jul 23 22:04:18 2007 From: rlratzel at enthought.com (Rick Ratzel) Date: Mon, 23 Jul 2007 15:04:18 -0500 (CDT) Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A4FC06.3010109@benjiyork.com> (message from Benji York on Mon, 23 Jul 2007 15:05:42 -0400) References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A4FC06.3010109@benjiyork.com> Message-ID: <20070723200419.058451DF4F6@mail.enthought.com> > Date: Mon, 23 Jul 2007 15:05:42 -0400 > From: Benji York > > Noah Gift wrote: > > In my humble opinion, I for one completely agree with Phillip. I have had to > > sit down with quite a few new Python Programmers and show them how to use > > easy_install and I "thank God" easy_install is smart enough to figure out case > > sensitivity. This is a wonderful feature!!!! Please don't ever get rid of it :) > > If easy_install had instead said "sorry, I can't find 'foo', perhaps you > meant 'Foo'", then the user would be both spared frustration and > enlightened. +1 -- Rick Ratzel - Enthought, Inc. 515 Congress Avenue, Suite 2100 - Austin, Texas 78701 512-536-1057 x229 - Fax: 512-536-1059 http://www.enthought.com From jim at zope.com Mon Jul 23 22:05:53 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 23 Jul 2007 16:05:53 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A508CC.5010706@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A3A485.7060602@v.loewis.de> <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com> <46A508CC.5010706@v.loewis.de> Message-ID: On Jul 23, 2007, at 4:00 PM, Martin v. L?wis wrote: ... >> Again, I assume that the difference between simple and ppix has >> more to >> do with geography than the difference between serving statically and >> dynamically. 
The simple page has more links on it than the ppix page, >> because I haven't gotten around to scarf all links off of a >> restructured-text rendering of long description. I doubt that >> makes any >> difference. It will be interesting to try again after I fix that. > > If you think that the /simple pages are correct, it might be easier to > just mirror them instead of doing all the work yourself. Good point. I might just do that. > I don't plan to take that service offline, unless experimentation > shows it has serious flaws. Cool. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Mon Jul 23 22:13:37 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 23 Jul 2007 22:13:37 +0200 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: References: Message-ID: <46A50BF1.9020303@v.loewis.de> > i can't upload a new egg to cheeseshop... > > running "python setup.py bdist_egg register upload" hangs for several > minutes at "Using PyPI login from /Users/jodok/.pypirc". > entering username and password interactively results in the same. > the webinterface seems to work fine (at least browsing) > > any idea? I think that's because I turned off proxying from www.python.org/pypi to cheeseshop.python.org/pypi, and replaced it with redirection (302, temporary redirect) instead (temporary just in case people find problems with that). (I asked a few days ago whether that would be a problem, and nobody said it would). I'd appreciate it if somebody could investigate what precisely is causing the problem (I thought urllib[2] would be able to handle redirects), how to fix it, and propose a fix to the code base. I have now reverted the change (which, of course, gives a performance problem, as all accesses to www.python.org/pypi now go through two web servers).
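How a client library treats a 302 answer to a POST can be checked without touching the network. What follows is only a minimal sketch, assuming Python 3's urllib.request (the successor of the urllib2 module mentioned above) and its stock HTTPRedirectHandler; the payload is a placeholder, and the two URLs are simply the hosts named in this thread.

```python
from urllib.request import HTTPRedirectHandler, Request

# A request with a body is sent as POST, as an upload would be.
# The payload is a placeholder, not the real upload form data.
post = Request("http://www.python.org/pypi", data=b"placeholder-payload")

# Ask the stock redirect handler which request it would issue after a
# 302 pointing at the other host. Nothing is sent over the network here.
handler = HTTPRedirectHandler()
followup = handler.redirect_request(
    post, None, 302, "Found", {}, "http://cheeseshop.python.org/pypi"
)

print(post.get_method())      # method of the original request
print(followup.get_method())  # method of the request made after the redirect
print(followup.data)          # body carried over to the redirected request
```

If the follow-up request turns out to be a GET without a body, the uploaded data never reaches the new host, which would at least be consistent with the failed uploads reported above.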
Regards, Martin From martin at v.loewis.de Mon Jul 23 22:16:27 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jul 2007 22:16:27 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A4FA64.5050404@benjiyork.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> Message-ID: <46A50C9B.7060501@v.loewis.de> > Perhaps it /could/ be, but isn't currently. For example, updating one > piece of software I have with almost 150 dependencies takes 45 seconds > with ppix, 4:45 without. I plan to do similar timings with the "simple" > PyPI interface when I get a chance and report the results here. I was, of course, talking about the simple interface. The full index will certainly take much more time because setuptools has to request more pages, and each page contains a lot of unnecessary data. Regards, Martin From martin at v.loewis.de Mon Jul 23 22:21:05 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jul 2007 22:21:05 +0200 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A50451.5050908@benjiyork.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> <46A50451.5050908@benjiyork.com> Message-ID: <46A50DB1.3080207@v.loewis.de> >>>> And where somewhat slower could be "practically not noticable". >>> Perhaps it /could/ be, but isn't currently. 
For example, updating >>> one piece of software I have with almost 150 dependencies takes 45 >>> seconds with ppix, 4:45 without. I plan to do similar timings with >>> the "simple" PyPI interface when I get a chance and report the >>> results here. >> >> I suspect that this has more to do with network distance than with >> server speed. > > That's actually my point. Geographically distributed mirrors that are a > little out of sync are much more valuable (IMO) than a centralized > service that is absolutely up to date, but "far" away. Ok, but then your response didn't really answer my question. If people want to run distributed mirrors that are somewhat behind, by all means: start today (just remember to talk to me if you also want to mirror files - if not, just run Jim's software as-is). My question was about the "simple" interface on the central server, to which you seem to say "I don't need it at all - whether it's current and slow or behind and fast" (which, in a sense, is also a response to the question, namely "I don't care"). Regards, Martin From pje at telecommunity.com Mon Jul 23 22:43:40 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 23 Jul 2007 16:43:40 -0400 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <46A50BF1.9020303@v.loewis.de> References: <46A50BF1.9020303@v.loewis.de> Message-ID: <20070723204445.65ABC3A40AA@sparrow.telecommunity.com> At 10:13 PM 7/23/2007 +0200, Martin v. L?wis wrote: > > i can't upload a new egg to cheeseshop... > > > > running "python setup.py bdist_egg register upload" hangs for several > > minutes at "Using PyPI login from /Users/jodok/.pypirc". > > entering username and password interactively results in the same. > > the webinterface seems to work fine (at least browsing) > > > > any idea? 
> >I think that's because I turned of proxying from www.python.org/pypi >to cheeseshop.python.org/pypi, and replaced it with redirection >(302, temporary redirect) instead (temporary just in case people >find problems with that). If you were doing that for POST requests, that is probably the source of the problem. You could always restrict the proxying to occur only for non-GET requests, since IIRC distutils.command.register and distutils.command.upload use POSTs. GET requests generally have a much wider leeway for safe redirection than POST requests do. Of course, one must also preserve the query string in a redirected GET, and I don't think Apache's Redirect directive does that either. You can certainly do it with mod_rewrite, however. I expect that the combination of preserving query strings on redirection, and only redirecting GETs should make the transition safe. From pje at telecommunity.com Mon Jul 23 22:47:04 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 23 Jul 2007 16:47:04 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A509D4.3070108@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> Message-ID: <20070723204446.294223A40B2@sparrow.telecommunity.com> At 10:04 PM 7/23/2007 +0200, Martin v. L?wis wrote: > > But by definition, the people typing the names of the dependencies into > > a 'setup.py' for such a plugin *are* Python programmers, and could be > > expected to know about case sensitivity. 
> > > > I don't think Jim was areguing that human-centric *search* should punish > > misspellings, but rather that encouraging such sloppiness in other > > packages is a misfeature, especially if supporting it induces a tax on > > *all* users of automated dependency resolution. > >Right. I think Phillip is primarily talking about package names as >specified on the command line of easy_install. > >So if your concern is about package names specified in dependencies, >one solution could be that setuptools distinguishes whether to apply >case corrections and normalization, depending on whether it was an >end-user-typed name or a programmer-specified one. > >What I don't know is how difficult that would be to implement, and >what volunteer is supposed to implement it if it were easy/possible, >so I by no means propose that such a solution should be implemented, >even if it would solve the problem. Yes, especially since compatibility with the existing installation base requires case insensitivity, because on case-insensitive platforms easy_install already normalizes the case of filenames it creates. So, the question of what the "right thing" to do is in the abstract has already been moot for a year or two. From martin at v.loewis.de Mon Jul 23 23:03:20 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jul 2007 23:03:20 +0200 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <20070723204445.65ABC3A40AA@sparrow.telecommunity.com> References: <46A50BF1.9020303@v.loewis.de> <20070723204445.65ABC3A40AA@sparrow.telecommunity.com> Message-ID: <46A51798.8000907@v.loewis.de> > because I turned of proxying from www.python.org/pypi >> to cheeseshop.python.org/pypi, and replaced it with redirection >> (302, temporary redirect) instead (temporary just in case people >> find problems with that). > > If you were doing that for POST requests, that is probably the source of > the problem. 
You could always restrict the proxying to occur only for > non-GET requests, since IIRC distutils.command.register and > distutils.command.upload use POSTs. GET requests generally have a much > wider leeway for safe redirection than POST requests do. What is the problem with redirects for POST? In particular, why doesn't urllib2 support it? > Of course, one must also preserve the query string in a redirected GET, > and I don't think Apache's Redirect directive does that either. You can > certainly do it with mod_rewrite, however. I see - I was using a plain Redirect. > I expect that the combination of preserving query strings on > redirection, and only redirecting GETs should make the transition safe. Can you share the magic to do that? I'd really like to start phasing out www.python.org/pypi, although I now see that it will take a few Python releases to get the cheeseshop home page replaced in distutils. In particular, if I also keep the mod_proxy setup for the reverse proxy, how will it interact with the redirect for the GET only? Regards, Martin From fdrake at gmail.com Mon Jul 23 23:13:18 2007 From: fdrake at gmail.com (Fred Drake) Date: Mon, 23 Jul 2007 17:13:18 -0400 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <46A50BF1.9020303@v.loewis.de> References: <46A50BF1.9020303@v.loewis.de> Message-ID: <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com> On 7/23/07, "Martin v. Löwis" wrote: > I think that's because I turned off proxying from www.python.org/pypi > to cheeseshop.python.org/pypi, and replaced it with redirection > (302, temporary redirect) instead (temporary just in case people > find problems with that). > > (I asked a few days ago whether that would be a problem, and nobody > said it would). I guess I just didn't find the time, but my objections are non-technical and have apparently been of no interest when voiced in the past. Basically, I think exposing human beings to the name "cheeseshop" is bad.
Specifically, it's confusing to anyone not familiar with a particular Monty Python skit. A nice skit (IMO), but not a good public-facing name for PyPI. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." --Henry Miller From martin at v.loewis.de Mon Jul 23 23:13:48 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jul 2007 23:13:48 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070723204446.294223A40B2@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> Message-ID: <46A51A0C.2090800@v.loewis.de> > Yes, especially since compatibility with the existing installation > base requires case insensitivity, because on case-insensitive > platforms easy_install already normalizes the case of filenames it > creates. So, the question of what the "right thing" to do is in the > abstract has already been moot for a year or two. Can you elaborate a bit, please? Why does the case of filenames matter for the queries it makes? AFAIU, it gets package names either from the user or from setup.py, perhaps also from packages dependency inside .egg files (assuming those support dependencies); these should all be case-sensitive. Regards, Martin From pje at telecommunity.com Mon Jul 23 23:21:16 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 23 Jul 2007 17:21:16 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A51A0C.2090800@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> Message-ID: <20070723211919.361E23A403D@sparrow.telecommunity.com> At 11:13 PM 7/23/2007 +0200, Martin v. Löwis wrote: > > Yes, especially since compatibility with the existing installation > > base requires case insensitivity, because on case-insensitive > > platforms easy_install already normalizes the case of filenames it > > creates. So, the question of what the "right thing" to do is in the > > abstract has already been moot for a year or two. > >Can you elaborate a bit, please? Why does the case of filenames >matter for the queries it makes? > >AFAIU, it gets package names either from the user or from setup.py, >perhaps also from packages dependency inside .egg files (assuming >those support dependencies); these should all be case-sensitive. In order to resolve dependencies, the system looks at installed .egg files and directories (and .egg-info directories), and extracts package name and version info from the filenames. From benji at benjiyork.com Mon Jul 23 23:26:39 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 17:26:39 -0400 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com> References: <46A50BF1.9020303@v.loewis.de> <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com> Message-ID: <46A51D0F.30406@benjiyork.com> Fred Drake wrote: > Basically, I think exposing human beings to the name "cheeseshop" [is] > bad. [...]
A nice skit (IMO), but not a good public-facing name for > PyPI. I have to agree on both counts. -- Benji York http://benjiyork.com From pje at telecommunity.com Mon Jul 23 23:31:39 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 23 Jul 2007 17:31:39 -0400 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <46A51798.8000907@v.loewis.de> References: <46A50BF1.9020303@v.loewis.de> <20070723204445.65ABC3A40AA@sparrow.telecommunity.com> <46A51798.8000907@v.loewis.de> Message-ID: <20070723212920.8BFE63A40AA@sparrow.telecommunity.com> At 11:03 PM 7/23/2007 +0200, Martin v. L?wis wrote: > > because I turned of proxying from www.python.org/pypi > >> to cheeseshop.python.org/pypi, and replaced it with redirection > >> (302, temporary redirect) instead (temporary just in case people > >> find problems with that). > > > > If you were doing that for POST requests, that is probably the source of > > the problem. You could always restrict the proxying to occur only for > > non-GET requests, since IIRC distutils.command.register and > > distutils.command.upload use POSTs. GET requests generally have a much > > wider leeway for safe redirection than POST requests do. > >What is the problem with redirects for POST? In particular, why doesn't >urllib2 support it? It's my understanding that a redirection response to a POST means "GET the location I'm giving you", not "sorry, you should POST to this other place instead." At least, that's how I understand web browsers to interpret it, and I believe urllib2 does as well. So, the issue is not one of "not supporting" POSTs, it's a question of what the semantics of a redirected POST should be. As far as I'm aware, it doesn't cause the POST to repeat, although that *might* depend on the specific status code and HTTP version. > > Of course, one must also preserve the query string in a redirected GET, > > and I don't think Apache's Redirect directive does that either. 
You can > > certainly do it with mod_rewrite, however. > >I see - I was using a plain Redirect. > > > I expect that the combination of preserving query strings on > > redirection, and only redirecting GETs should make the transition safe. > >Can you share the magic to do that? I'd really like to start phasing >out www.python.org/pypi, although I now see that it will take a few >Python releases to get the cheeseshop home page replaced in distutils. > >In particular, if I also keep the mod_proxy setup for the reverse >proxy, how will it interact with the redirect for the GET only? Well, if you are using mod_rewrite to do both the redirection and the proxying, then it should suffice to have the GET rewrite with [R] and the remainder use [P]. Something like:

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteRule ^pypi(.*)$ http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [R,L]
RewriteRule ^pypi(.*)$ http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [P,L]

But I'd test that with some dummy URLs instead of 'pypi', first. Notice that this is not using any mod_proxy directives, just using mod_rewrite proxy support. I've never used the mod_proxy directives, actually, but I have used mod_rewrite proxying. From benji at benjiyork.com Mon Jul 23 23:34:16 2007 From: benji at benjiyork.com (Benji York) Date: Mon, 23 Jul 2007 17:34:16 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A50DB1.3080207@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com> <46A50451.5050908@benjiyork.com> <46A50DB1.3080207@v.loewis.de> Message-ID: <46A51ED8.4020502@benjiyork.com> Martin v.
L?wis wrote: > My question was about the "simple" interface on the central > server Ah, I didn't realize. > to which you seem to say "I don't need it at all - whether > it's current and slow or behind and fast" (which, in a sense, > is also a response to the question, namely "I don't care"). I think it's a great idea to have both human- and machine-targeted versions available. It looks like setuptools is about twice as fast (in at least one instance) with the simple version. That seems like a pretty big win to me. -- Benji York http://benjiyork.com From richardjones at optushome.com.au Tue Jul 24 00:02:26 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Tue, 24 Jul 2007 08:02:26 +1000 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <46A50BF1.9020303@v.loewis.de> References: <46A50BF1.9020303@v.loewis.de> Message-ID: <200707240802.26694.richardjones@optushome.com.au> On Tue, 24 Jul 2007, Martin v. L?wis wrote: > I think that's because I turned of proxying from www.python.org/pypi > to cheeseshop.python.org/pypi, and replaced it with redirection > (302, temporary redirect) instead (temporary just in case people > find problems with that). > > (I asked a few days ago whether that would be a problem, and nobody > said it would). Sorry, I somehow totally missed your message on this. Richard From pje at telecommunity.com Tue Jul 24 00:56:51 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 23 Jul 2007 18:56:51 -0400 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <20070723212920.8BFE63A40AA@sparrow.telecommunity.com> References: <46A50BF1.9020303@v.loewis.de> <20070723204445.65ABC3A40AA@sparrow.telecommunity.com> <46A51798.8000907@v.loewis.de> <20070723212920.8BFE63A40AA@sparrow.telecommunity.com> Message-ID: <20070723225431.AB1063A40B2@sparrow.telecommunity.com> At 05:31 PM 7/23/2007 -0400, Phillip J. 
Eby wrote: >Something like: > >RewriteEngine On >RewriteBase / >RewriteCond %{REQUEST_METHOD} ^GET$ >RewriteRule ^pypi(.*)$ >http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [R,L] >RewriteRule ^pypi(.*)$ >http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [P,L] Ugh. Looks like those lines wrapped in transit. The two RewriteRule lines should be one line each, with the 'http:' appearing after the "^pypi(.*)$" and a space. From martin at v.loewis.de Tue Jul 24 06:33:22 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 06:33:22 +0200 Subject: [Catalog-sig] setuptools upload to pypi In-Reply-To: <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com> References: <46A50BF1.9020303@v.loewis.de> <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com> Message-ID: <46A58112.204@v.loewis.de> > Basically, I think exposing human beings to the name "cheeseshop" in > bad. Specifically, it's confusing to anyone not familiar with a > particular Monty Python skit. A nice skit (IMO), but not a good > public-facing name for PyPI. Ok. However, I think this is a matter of taste, and he who designs the system gets to name it. Barring a BDFL pronouncement or PSF board decision, Cheeseshop is the name of that system, whether people like that name or not. So I have heard that, but this cannot stop me from fixing what I consider a technical performance problem (namely, that all data go through two machines). Regards, Martin From martin at v.loewis.de Tue Jul 24 06:40:18 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 06:40:18 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <20070723211919.361E23A403D@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> Message-ID: <46A582B2.3060105@v.loewis.de> Phillip J. Eby wrote: > At 11:13 PM 7/23/2007 +0200, Martin v. Löwis wrote: >> > Yes, especially since compatibility with the existing installation >> > base requires case insensitivity, because on case-insensitive >> > platforms easy_install already normalizes the case of filenames it >> > creates. So, the question of what the "right thing" to do is in the >> > abstract has already been moot for a year or two. >> >> Can you elaborate a bit, please? Why does the case of filenames >> matter for the queries it makes? > > In order to resolve dependencies, the system looks at installed .egg > files and directories (and .egg-info directories), and extracts package > name and version info from the filenames. Still - why does that require case-insensitive lookups to the index? Suppose a package specifies a dependency Foo. IIUC, you look on disk whether foo is already present, finding the version(s) of foo installed in that process. Then, this either is satisfying or not. If it is, you don't need the index at all. If it is not, you need to go to the index - but you still know that it is Foo that you were looking for, no? So lookups for dependencies in the index could always be case-sensitive; please correct me if I'm wrong.
Regards, Martin From martin at v.loewis.de Tue Jul 24 07:52:46 2007 From: martin at v.loewis.de (Martin v. Löwis) Date: Tue, 24 Jul 2007 07:52:46 +0200 Subject: [Catalog-sig] Cheeseshop webstats Message-ID: <46A593AE.9030609@v.loewis.de> For those who are curious, I started collecting webstats, at http://cheeseshop.python.org/webstats/ Regards, Martin From jim at zope.com Tue Jul 24 12:11:15 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 06:11:15 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070723211919.361E23A403D@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> Message-ID: <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote: > At 11:13 PM 7/23/2007 +0200, Martin v. Löwis wrote: >>> Yes, especially since compatibility with the existing installation >>> base requires case insensitivity, because on case-insensitive >>> platforms easy_install already normalizes the case of filenames it >>> creates. So, the question of what the "right thing" to do is in the >>> abstract has already been moot for a year or two. >> >> Can you elaborate a bit, please? Why does the case of filenames >> matter for the queries it makes? >> >> AFAIU, it gets package names either from the user or from setup.py, >> perhaps also from packages dependency inside .egg files (assuming >> those support dependencies); these should all be case-sensitive.
> > In order to resolve dependencies, the system looks at installed .egg > files and directories (and .egg-info directories), and extracts > package name and version info from the filenames. But the package name and version are in the PKG-INFO files, so it certainly has access to non-normalized names. Why can't it double check a possible match against that file? Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From benji at benjiyork.com Tue Jul 24 16:42:37 2007 From: benji at benjiyork.com (Benji York) Date: Tue, 24 Jul 2007 10:42:37 -0400 Subject: [Catalog-sig] Prototype setuptools-specific PyPI index. In-Reply-To: <46A4FA64.5050404@benjiyork.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <46A384C9.8040404@v.loewis.de> <46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com> Message-ID: <46A60FDD.2030207@benjiyork.com> Benji York wrote: > I plan to do similar timings with the "simple" PyPI interface when I > get a chance and report the results here. Here are my non-scientific results: buildout times: regular: 4:52.86 simple: 3:15.57 ppix: 2:03.58 As everyone is aware, network latency has a large impact on this so here are the shortest round-trip packet times I got (with a small sample). cheeseshop.python.org: 93ms download.zope.org: 8ms I suspect the majority/entirety of the difference between ppix and simple is network related. -- Benji York http://benjiyork.com From pje at telecommunity.com Tue Jul 24 17:31:19 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 24 Jul 2007 11:31:19 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index.
In-Reply-To: <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> Message-ID: <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> At 06:11 AM 7/24/2007 -0400, Jim Fulton wrote: >On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote: > >>At 11:13 PM 7/23/2007 +0200, Martin v. Löwis wrote: >>>>Yes, especially since compatibility with the existing installation >>>>base requires case insensitivity, because on case-insensitive >>>>platforms easy_install already normalizes the case of filenames it >>>>creates. So, the question of what the "right thing" to do is in the >>>>abstract has already been moot for a year or two. >>> >>>Can you elaborate a bit, please? Why does the case of filenames >>>matter for the queries it makes? >>> >>>AFAIU, it gets package names either from the user or from setup.py, >>>perhaps also from packages dependency inside .egg files (assuming >>>those support dependencies); these should all be case-sensitive. >> >>In order to resolve dependencies, the system looks at installed .egg >>files and directories (and .egg-info directories), and extracts >>package name and version info from the filenames. > >But the package name and version are in the PKG-INFO files, so it >certainly has access to non-normalized names. Why can't it double >check a possible match against that file?
Because if case actually made a difference, we couldn't have both packages installed in the same directory, could we? And why add an extra file open (which currently is only needed for "develop" eggs) to the process of building a working set or environment, in order to confirm something whose only purpose is to make requirements more difficult to specify? :) Note that if what's bothering you is the package index access time, use Apache's mod_speling to enable case-insensitive URLs for the static page tree. From jim at zope.com Tue Jul 24 17:39:38 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 11:39:38 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> Message-ID: <37BB633B-B2E6-43BB-AB16-CFE807CF8625@zope.com> On Jul 24, 2007, at 11:31 AM, Phillip J. Eby wrote: > At 06:11 AM 7/24/2007 -0400, Jim Fulton wrote: > >> On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote: >> >>> At 11:13 PM 7/23/2007 +0200, Martin v. Löwis wrote: >>>>> Yes, especially since compatibility with the existing installation >>>>> base requires case insensitivity, because on case-insensitive >>>>> platforms easy_install already normalizes the case of filenames it >>>>> creates.
So, the question of what the "right thing" to do is >>>>> in the >>>>> abstract has already been moot for a year or two. >>>> >>>> Can you elaborate a bit, please? Why does the case of filenames >>>> matter for the queries it makes? >>>> >>>> AFAIU, it gets package names either from the user or from setup.py, >>>> perhaps also from packages dependency inside .egg files (assuming >>>> those support dependencies); these should all be case-sensitive. >>> >>> In order to resolve dependencies, the system looks at installed .egg >>> files and directories (and .egg-info directories), and extracts >>> package name and version info from the filenames. >> >> But the package name and version are in the PKG-INFO files, so it >> certainly has access to non-normalized names. Why can't it double >> check a possible match against that file? > > Because if case actually made a difference, we couldn't have both > packages installed in the same directory, could we? And why add an > extra file open (which currently is only needed for "develop" eggs) > to the process of building a working set or environment, in order > to confirm something whose only purpose is to make requirements > more difficult to specify? :) Currently, we allow packages to differ only in case. The fact that setuptools pretends we don't doesn't change the fact that we do. You said that "compatibility with the existing installation base requires case insensitivity, because on case-insensitive platforms easy_install already normalizes the case of filenames it creates". I'm merely pointing out that we don't have to rely solely on the file name. > Note that if what's bothering you is the package index access time, > use Apache's mod_speling to enable case-insensitive URLs for the > static page tree. *If* we decide that package names are case insensitive, then we should do this. We haven't decided this. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Tue Jul 24 17:54:32 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 24 Jul 2007 11:54:32 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <37BB633B-B2E6-43BB-AB16-CFE807CF8625@zope.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <37BB633B-B2E6-43BB-AB16-CFE807CF8625@zope.com> Message-ID: <20070724155212.E16A93A40A7@sparrow.telecommunity.com> At 11:39 AM 7/24/2007 -0400, Jim Fulton wrote: >On Jul 24, 2007, at 11:31 AM, Phillip J. Eby wrote: > >>At 06:11 AM 7/24/2007 -0400, Jim Fulton wrote: >> >>>On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote: >>> >>>>At 11:13 PM 7/23/2007 +0200, Martin v. Löwis wrote: >>>>>>Yes, especially since compatibility with the existing installation >>>>>>base requires case insensitivity, because on case-insensitive >>>>>>platforms easy_install already normalizes the case of filenames it >>>>>>creates. So, the question of what the "right thing" to do is >>>>>>in the >>>>>>abstract has already been moot for a year or two. >>>>> >>>>>Can you elaborate a bit, please? Why does the case of filenames >>>>>matter for the queries it makes?
>>>>> >>>>>AFAIU, it gets package names either from the user or from setup.py, >>>>>perhaps also from packages dependency inside .egg files (assuming >>>>>those support dependencies); these should all be case-sensitive. >>>> >>>>In order to resolve dependencies, the system looks at installed .egg >>>>files and directories (and .egg-info directories), and extracts >>>>package name and version info from the filenames. >>> >>>But the package name and version are in the PKG-INFO files, so it >>>certainly has access to non-normalized names. Why can't it double >>>check a possible match against that file? >> >>Because if case actually made a difference, we couldn't have both >>packages installed in the same directory, could we? And why add an >>extra file open (which currently is only needed for "develop" eggs) >>to the process of building a working set or environment, in order >>to confirm something whose only purpose is to make requirements >>more difficult to specify? :) > >Currently, we allow packages to differ only in case. The fact that >setuptools pretends we don't doesn't change the fact that we do. I wasn't under the impression that we were discussing whether allowing project names to differ only in case was a good idea, since I haven't heard anybody give an argument that it's a *good* idea. In fact, it seems like an obviously bad idea on its face, whether setuptools is in the picture or not. >>Note that if what's bothering you is the package index access time, >>use Apache's mod_speling to enable case-insensitive URLs for the >>static page tree. > >*If* we decide that package names are case insensitive, then we >should do this. We haven't decided this. Well, so far the only argument *against* it that I recall seeing, is your argument that sloppy requirement specs slow everybody down by making them do the extra package index hit. So, if that's fixable, what other argument is there for treating the names case-sensitively?
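The idea debated in the messages above, matching project names without regard to case while still remembering the registered spelling, can be illustrated with a small sketch. It is only an illustration: build_index and lookup are hypothetical helpers, not setuptools or PyPI functions, and the assumption that two names differing only in case should be rejected is precisely the point still undecided in this thread.

```python
# Hypothetical sketch of a case-insensitive but case-preserving name index.
# Nothing here is taken from setuptools or the PyPI code base.

def build_index(registered_names):
    """Map each lowercased name to its registered spelling.

    Assumption for illustration: names that differ only in case clash.
    """
    index = {}
    for name in registered_names:
        key = name.lower()
        if key in index and index[key] != name:
            raise ValueError("%r clashes with %r" % (name, index[key]))
        index[key] = name
    return index


def lookup(index, requested):
    """Return the registered spelling for a requirement, or None if unknown."""
    return index.get(requested.lower())


index = build_index(["Foo", "zope.interface", "Twisted"])
print(lookup(index, "foo"))      # prints the registered spelling
print(lookup(index, "TWISTED"))
print(lookup(index, "missing"))
```

With such an index a requirement spelled "foo" resolves to the registered "Foo" in a single lookup, and a registration that differs only in case from an existing one is refused up front.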
From jim at zope.com Tue Jul 24 18:36:29 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 12:36:29 -0400 Subject: [Catalog-sig] We need to make a decision wrt distribution names Message-ID: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> Obviously, we are having a debate about what forms distribution names can take. I think we need a decision. Does anyone know if there are existing rules for package names? I can't find them if there are. Up until now, I think we've been in somewhat of a prototyping mode, but I think it's time to move beyond that. I strongly suggest that we need an official specification that says: - what's a legal package name and - what the equivalence rules for package names are. Whatever we decide needs to be well supported by setuptools and PyPI. I can live with whatever we decide as long as we decide something and make sure it is well communicated and implemented. In particular, I could live with the equivalence rules that setuptools uses if they are documented and if they are supported correctly and efficiently by the index (including mirrors). IMO, a decision is extremely important. If we can't reach consensus, then we need to call in the BDFL. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From waterbug at pangalactic.us Tue Jul 24 18:49:43 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Tue, 24 Jul 2007 12:49:43 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> Message-ID: <46A62DA7.9000304@pangalactic.us> Jim Fulton wrote: > Does anyone know if there are existing rules for package names? I > can't find them if there are. ... 
Well, there is PEP 8, which has this to say on the subject: "Package and Module Names "Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged." Steve From jim at zope.com Tue Jul 24 18:54:36 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 12:54:36 -0400 Subject: [Catalog-sig] We need to make a decision wrt distribution names In-Reply-To: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> Message-ID: <41DE9981-BDAE-4A35-B989-DAD4749CA6BD@zope.com> On Jul 24, 2007, at 12:36 PM, Jim Fulton wrote: > > Obviously, we are having a debate about what forms distribution > names can take. I think we need a decision. > > Does anyone know if there are existing rules for package names? Doh. I meant "distribution names". Sorry. > I can't find them if there are. Up until now, I think we've been > in somewhat of a prototyping mode, but I think it's time to move > beyond that. > > I strongly suggest that we need an official specification that says: > > - what's a legal package name and Ditto > > - what the equivalence rules for package names are. Ditto. > > Whatever we decide needs to be well supported by setuptools and > PyPI. I can live with whatever we decide as long as we decide > something and make sure it is well communicated and implemented. In > particular, I could live with the equivalence rules that setuptools > uses if they are documented and if they are supported correctly and > efficiently by the index (including mirrors). > > IMO, a decision is extremely important. If we can't reach > consensus, then we need to call in the BDFL. -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Tue Jul 24 18:55:39 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 12:55:39 -0400 Subject: [Catalog-sig] We need to make a decision wrt distribution names (second try) Message-ID: Obviously, we are having a debate about what forms distribution names can take. I think we need a decision. Does anyone know if there are existing rules for distribution names? I can't find them if there are. Up until now, I think we've been in somewhat of a prototyping mode, but I think it's time to move beyond that. I strongly suggest that we need an official specification that says: - what's a legal distribution name and - what the equivalence rules for distribution names are. Whatever we decide needs to be well supported by setuptools and PyPI. I can live with whatever we decide as long as we decide something and make sure it is well communicated and implemented. In particular, I could live with the equivalence rules that setuptools uses if they are documented and if they are supported correctly and efficiently by the index (including mirrors). IMO, a decision is extremely important. If we can't reach consensus, then we need to call in the BDFL. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Tue Jul 24 18:57:01 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 12:57:01 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <46A62DA7.9000304@pangalactic.us> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> Message-ID: On Jul 24, 2007, at 12:49 PM, Stephen Waterbury wrote: > Jim Fulton wrote: >> Does anyone know if there are existing rules for package names? I >> can't find them if there are. ... 
> > Well, there is PEP 8, which has this to say on the subject: > > "Package and Module Names > > "Modules should have short, all-lowercase names. Underscores > can be used in the module name if it improves readability. > Python packages should also have short, all-lowercase names, > although the use of underscores is discouraged." Doh, I was sloppy in my terminology. I should have said "distribution name". We're talking about the names used in PyPI, the Python Distribution index. ;) Also the value passed to the "name" argument of setup. Sorry for the confusion. :) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From waterbug at pangalactic.us Tue Jul 24 19:09:20 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Tue, 24 Jul 2007 13:09:20 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> Message-ID: <46A63240.7070003@pangalactic.us> Jim Fulton wrote: > > On Jul 24, 2007, at 12:49 PM, Stephen Waterbury wrote: > >> Jim Fulton wrote: >>> Does anyone know if there are existing rules for package names? I >>> can't find them if there are. ... >> >> Well, there is PEP 8, which has this to say on the subject: >> >> "Package and Module Names >> >> "Modules should have short, all-lowercase names. Underscores >> can be used in the module name if it improves readability. >> Python packages should also have short, all-lowercase names, >> although the use of underscores is discouraged." > > Doh, I was sloppy in my terminology. I should have said "distribution > name". We're talking about the names used in PyPI, the Python > Distribution index. ;) Also the value passed to the "name" argument of > setup. > > Sorry for the confusion. :) Actually, I wasn't confused. 
:) I'd suggest a convention that allows a distribution "title" (e.g., "Zope", "Twisted", etc.) and a distribution "name" that would simply be the name of the distribution's top-level package (e.g., "zope", "twisted", etc.), which should follow the PEP 8 suggestion for package names and should be what setuptools uses together with a version reference to uniquely identify a specific distribution/version (egg). Steve From martin at v.loewis.de Tue Jul 24 19:29:06 2007 From: martin at v.loewis.de (Martin v. Löwis) Date: Tue, 24 Jul 2007 19:29:06 +0200 Subject: [Catalog-sig] We need to make a decision wrt distribution names In-Reply-To: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> Message-ID: <46A636E2.2090408@v.loewis.de> > I strongly suggest that we need an official specification that says: The process would then be to write a PEP. It will end with a BDFL pronouncement either way, but that might be easy to obtain if there is consensus up-front. Regards, Martin From jim at zope.com Tue Jul 24 19:33:43 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 13:33:43 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: <46A636DB.50105@ibp.de> References: <46A636DB.50105@ibp.de> Message-ID: On Jul 24, 2007, at 1:28 PM, Lars Immisch wrote: > Hi, > >> Obviously, we are having a debate about what forms distribution >> names can take. I think we need a decision. > > Thanks for bringing this up. > >> Does anyone know if there are existing rules for distribution >> names? I can't find them if there are. Up until now, I think >> we've been in somewhat of a prototyping mode, but I think it's >> time to move beyond that. >> I strongly suggest that we need an official specification that says: >> - what's a legal distribution name and >> - what the equivalence rules for distribution names are.
> > Comparison rules are also important: > > Is artin-1.2-rc2 < artin-1.2? Note that these are not distribution names. Well, that depends on how you define "distribution names". Sigh. The distribution names I'm trying to talk about don't have version numbers. I don't see a particular reason why these distribution names have to be ordered, Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From martin at v.loewis.de Tue Jul 24 19:40:55 2007 From: martin at v.loewis.de (Martin v. Löwis) Date: Tue, 24 Jul 2007 19:40:55 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> Message-ID: <46A639A7.7090305@v.loewis.de> >> But the package name and version are in the PKG-INFO files, so it >> certainly has access to non-normalized names. Why can't it double >> check a possible match against that file? > > Because if case actually made a difference, we couldn't have both > packages installed in the same directory, could we? Right. However, there is a difference between case-insensitive, and case-preserving.
> Note that if what's bothering you is the package index access time, use > Apache's mod_speling to enable case-insensitive URLs for the static page > tree. That won't help. If you look for a name of a non-registered package, setuptools will go to the index even if mod_speling corrects spelling errors. Such an approach is only possible if setuptools would stop using the entire index if the server has case-insensitive lookup (which it cannot determine). Regards, Martin From pje at telecommunity.com Tue Jul 24 19:45:08 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 24 Jul 2007 13:45:08 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <46A63240.7070003@pangalactic.us> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> <46A63240.7070003@pangalactic.us> Message-ID: <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> At 01:09 PM 7/24/2007 -0400, Stephen Waterbury wrote: >Actually, I wasn't confused. :) I'd suggest a convention that allows >a distribution "title" (e.g., "Zope", "Twisted", etc.) and a >distribution "name" that would simply be the name of the >distribution's top-level package (e.g., "zope", "twisted", etc.), This proposal would rule out namespace packages, in addition to being incompatible with existing distribution names. Note that package != distribution -- a distribution may contain zero or more packages (even top-level), *and* a single package (top-level or otherwise) may be spread over more than one distribution. Also note that this was true even with the distutils, long before setuptools existed. From pje at telecommunity.com Tue Jul 24 19:48:54 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 24 Jul 2007 13:48:54 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: References: <46A636DB.50105@ibp.de> Message-ID: <20070724174634.2283E3A40A7@sparrow.telecommunity.com> At 01:33 PM 7/24/2007 -0400, Jim Fulton wrote: >Note that these are not distribution names. Well, that depends on >how you define "distribution names". Sigh. The distribution names >I'm trying to talk about don't have version numbers. Setuptools uses the term "project name" for what you're calling a distribution name, if that helps any. :) From pje at telecommunity.com Tue Jul 24 19:52:32 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 24 Jul 2007 13:52:32 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A639A7.7090305@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> Message-ID: <20070724175013.238323A40A7@sparrow.telecommunity.com> At 07:40 PM 7/24/2007 +0200, Martin v. Löwis wrote: > >> But the package name and version are in the PKG-INFO files, so it > >> certainly has access to non-normalized names. Why can't it double > >> check a possible match against that file? > > > > Because if case actually made a difference, we couldn't have both > > packages installed in the same directory, could we? >
>Right.
However, there is a difference between case-insensitive, >and case-preserving. I don't understand your statement here, nor what is supposed to follow from it. > > Note that if what's bothering you is the package index access time, use > > Apache's mod_speling to enable case-insensitive URLs for the static page > > tree. > >That won't help. If you look for a name of a non-registered package, >setuptools will go to the index even if mod_speling corrects spelling >errors. Jim's objection was that if it's possible to get case-correction from the index, people will declare setup.py dependencies with incorrect case, leading to other packages having indirect dependencies with incorrect case, leading to lots of package index lookups. This objection is relevant only to requirements which differ from the actual project name only by their case. A non-registered package lookup is going to fail no matter what, and thus isn't going to wind up in a setup.py without a dependency_links specifier that will prevent it being looked up in the package index to begin with. From jim at zope.com Tue Jul 24 19:51:31 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 13:51:31 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: <20070724174634.2283E3A40A7@sparrow.telecommunity.com> References: <46A636DB.50105@ibp.de> <20070724174634.2283E3A40A7@sparrow.telecommunity.com> Message-ID: <0A0EDDEC-7BCC-41C1-822A-5D93AF20E1F7@zope.com> On Jul 24, 2007, at 1:48 PM, Phillip J. Eby wrote: > At 01:33 PM 7/24/2007 -0400, Jim Fulton wrote: >> Note that these are not distribution names. Well, that depends on >> how you define "distribution names". Sigh. The distribution names >> I'm trying to talk about don't have version numbers. > > Setuptools uses the term "project name" for what you're calling a > distribution name, if that helps any. :) Right. I'm happy to use that. Does anyone want to disagree?
BTW, to up the ante, I volunteer to try to update the distutils document to reflect what we decide. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From lars at ibp.de Tue Jul 24 19:28:59 2007 From: lars at ibp.de (Lars Immisch) Date: Tue, 24 Jul 2007 19:28:59 +0200 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: References: Message-ID: <46A636DB.50105@ibp.de> Hi, > Obviously, we are having a debate about what forms distribution names > can take. I think we need a decision. Thanks for bringing this up. > Does anyone know if there are existing rules for distribution names? > I can't find them if there are. Up until now, I think we've been in > somewhat of a prototyping mode, but I think it's time to move beyond > that. > > I strongly suggest that we need an official specification that says: > > - what's a legal distribution name and > > - what the equivalence rules for distribution names are. Comparison rules are also important: Is artin-1.2-rc2 < artin-1.2? IMO, it's perfectly fine to just state: comparisons are lexicographical (ASCII only). But I'd like to see this mentioned somewhere. - Lars From lars at ibp.de Tue Jul 24 20:11:57 2007 From: lars at ibp.de (Lars Immisch) Date: Tue, 24 Jul 2007 20:11:57 +0200 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: References: <46A636DB.50105@ibp.de> Message-ID: <46A640ED.8070406@ibp.de> Hi, >>> Obviously, we are having a debate about what forms distribution >>> names can take. I think we need a decision. >> >> Thanks for bringing this up. >> >>> Does anyone know if there are existing rules for distribution >>> names? I can't find them if there are. Up until now, I think we've >>> been in somewhat of a prototyping mode, but I think it's time to >>> move beyond that. 
>>> I strongly suggest that we need an official specification that says: >>> - what's a legal distribution name and >>> - what the equivalence rules for distribution names are. >> >> Comparison rules are also important: >> >> Is artin-1.2-rc2 < artin-1.2? > > Note that these are not distribution names. Well, that depends on how > you define "distribution names". Sigh. The dsitribution names I'm > trying to talk about don't have version numbers. I don't see a > particular reason why these distribution names have to be ordered, I see. Sorry for the drive-by-shooting. Still, I'd like a stated convention how version numbers are compared. I believe this would be good for setuptools also. But the issue is separable from project naming conventions. - Lars From martin at v.loewis.de Tue Jul 24 20:21:11 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 20:21:11 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070724175013.238323A40A7@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> Message-ID: <46A64317.6090907@v.loewis.de> >> > Because if case actually made a difference, we couldn't have both >> > packages installed in the same directory, could we? >> >> Right. 
However, there is a difference between case-insensitive, >> and case-preserving. > > I don't understand your statement here, nor what is supposed to follow > from it. Clearly, on a case-insensitive file system, project names differing only in case cannot coexist. That doesn't mean that all references to the project should be case-normalized (e.g. lower-cased). So even if project names compare case-insensitive, there still should (could) be a "right" spelling, the one that the package author wants to see. This is the spelling that others then should use. So I still don't see why the file names on disk have any effect on the lookup setuptools do to the index. > Jim's objection was that if it's possible to get case-correction from > the index, people will declare setup.py dependencies with incorrect > case, leading to other packages having indirect dependencies with > incorrect case, leading to lots of package index lookups. I don't think that was his objection. IIUC, he complains about incorrect spellings as bad, period - regardless of whether they also have a performance effect. It's like spelling your name "Philipp" - that's a bad thing to do, independent of whether it also makes you harder to find (which it actually doesn't, thanks to Google). > This objection is relevant only to requirements which differ from the > actual project name only by their case. A non-registered package lookup > is going to fail no matter what, and thus isn't going to wind up in a > setup.py without a dependency_links specifier that will prevent it being > looked up in the package index to begin with. Right. However, if setuptools would stop making case insensitive lookups to the index, lookups to unregistered packages would become more efficient. Regards, Martin From pje at telecommunity.com Tue Jul 24 20:44:08 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 24 Jul 2007 14:44:08 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A64317.6090907@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> Message-ID: <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> At 08:21 PM 7/24/2007 +0200, Martin v. L?wis wrote: > >> > Because if case actually made a difference, we couldn't have both > >> > packages installed in the same directory, could we? > >> > >> Right. However, there is a difference between case-insensitive, > >> and case-preserving. > > > > I don't understand your statement here, nor what is supposed to follow > > from it. > >Clearly, on a case-insensitive file system, project names differing >only in case cannot coexist. That doesn't mean that all references >to the project should be case-normalized (e.g. lower-cased). > >So even if project names compare case-insensitive, there still >should (could) be a "right" spelling, the one that the package >author wants to see. This is the spelling that others then should >use. Well, that spelling will certainly show up everywhere. Setuptools is case-preserving, *except* with regard to installing egg files on case-insensitive filesystems (as defined by what os.path.normcase does on a given platform). When it installs an egg, it normalizes the case of the target path. 
In all other matters it is case-insensitive for comparison, but case-preserving of the inputs it receives. > > Jim's objection was that if it's possible to get case-correction from > > the index, people will declare setup.py dependencies with incorrect > > case, leading to other packages having indirect dependencies with > > incorrect case, leading to lots of package index lookups. > >I don't think that was his objection. IIUC, he complains about >incorrect spellings as bad, period - regardless of whether they also >have a performance effect. It's like spelling your name "Philipp" - >that's a bad thing to do, independent of whether it also makes you >harder to find (which it actually doesn't, thanks to Google). It's actually more like spelling my name "phillip", which is arguably still spelled correctly, if punctuated poorly. :) And it's also an answer to the wrong question: the *first* question is whether we should allow "phillip" and "Phillip" to co-exist in the package index. If not, then there is the question of whether there is any reason to be case-sensitive with respect to searching. If we are agreed that having projects whose names differ only by case is a bad idea, then the latter question is considerably less controversial. > > This objection is relevant only to requirements which differ from the > > actual project name only by their case. A non-registered package lookup > > is going to fail no matter what, and thus isn't going to wind up in a > > setup.py without a dependency_links specifier that will prevent it being > > looked up in the package index to begin with. > >Right. However, if setuptools would stop making case insensitive >lookups to the index, lookups to unregistered packages would become >more efficient. I'm not sure I follow you. If a non-registered package is used as a dependency, the setup() will need to specify dependency_links, in which case PyPI will not be consulted. 
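As an aside for readers of the archive, the "case-insensitive for comparison, but case-preserving" behaviour described in this message can be illustrated with a minimal sketch. The class name `ProjectIndex` and its methods are hypothetical and are not part of setuptools or PyPI; the sketch only assumes that names are compared by their lower-cased form while the author's original spelling is what gets stored and returned.

```python
class ProjectIndex:
    """Hypothetical registry: case-insensitive keys, case-preserving names."""

    def __init__(self):
        # Maps the lower-cased name to the spelling the author registered.
        self._by_key = {}

    def register(self, name):
        """Register a project; refuse names differing only by case."""
        key = name.lower()
        existing = self._by_key.get(key)
        if existing is not None and existing != name:
            raise ValueError(f"{name!r} conflicts with registered {existing!r}")
        self._by_key[key] = name
        return name

    def lookup(self, requested):
        """Return the registered spelling, or None if nothing matches."""
        return self._by_key.get(requested.lower())


index = ProjectIndex()
index.register("Phillip")
print(index.lookup("phillip"))  # prints the author's spelling
print(index.lookup("unknown"))  # prints None for an unregistered name
```

Under this assumption, "phillip" and "Phillip" cannot both be registered, which is the precondition under which case-insensitive searching stops being controversial.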
From martin at v.loewis.de Tue Jul 24 20:54:24 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 20:54:24 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> Message-ID: <46A64AE0.7000307@v.loewis.de> >> Right. However, if setuptools would stop making case insensitive >> lookups to the index, lookups to unregistered packages would become >> more efficient. > > I'm not sure I follow you. If a non-registered package is used as a > dependency, the setup() will need to specify dependency_links, in which > case PyPI will not be consulted. Ah, ok. So is it then correct that setuptools never looks at pypi/, unless the user misspelled a package name on the command line? 
Regards, Martin From jim at zope.com Tue Jul 24 21:20:40 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 15:20:40 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: <46A640ED.8070406@ibp.de> References: <46A636DB.50105@ibp.de> <46A640ED.8070406@ibp.de> Message-ID: <1FE480D2-27FD-4F42-82FD-06C387805EE2@zope.com> On Jul 24, 2007, at 2:11 PM, Lars Immisch wrote: > Hi, > >>>> Obviously, we are having a debate about what forms distribution >>>> names can take. I think we need a decision. >>> >>> Thanks for bringing this up. >>> >>>> Does anyone know if there are existing rules for distribution >>>> names? I can't find them if there are. Up until now, I think >>>> we've been in somewhat of a prototyping mode, but I think it's >>>> time to move beyond that. >>>> I strongly suggest that we need an official specification that >>>> says: >>>> - what's a legal distribution name and >>>> - what the equivalence rules for distribution names are. >>> >>> Comparison rules are also important: >>> >>> Is artin-1.2-rc2 < artin-1.2? >> Note that these are not distribution names. Well, that depends on >> how you define "distribution names". Sigh. The distribution names >> I'm trying to talk about don't have version numbers. I don't see >> a particular reason why these distribution names have to be ordered, > > I see. Sorry for the drive-by-shooting. np > Still, I'd like a stated convention how version numbers are > compared. I believe this would be good for setuptools also. setuptools has this. It would be nice to bless it in a PEP. > But the issue is separable from project naming conventions. Yup. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Tue Jul 24 21:46:42 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue, 24 Jul 2007 15:46:42 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A64AE0.7000307@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <46A23BAE.5090907@v.loewis.de> <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com> <46A259C4.6090605@v.loewis.de> <20070722164922.AE50D3A40A9@sparrow.telecommunity.com> <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com> <20070723152015.E7AFA3A403D@sparrow.telecommunity.com> <46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de> <20070723204446.294223A40B2@sparrow.telecommunity.com> <46A51A0C.2090800@v.loewis.de> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> Message-ID: <20070724194426.828EE3A40A7@sparrow.telecommunity.com> At 08:54 PM 7/24/2007 +0200, Martin v. L?wis wrote: > >> Right. However, if setuptools would stop making case insensitive > >> lookups to the index, lookups to unregistered packages would become > >> more efficient. > > > > I'm not sure I follow you. If a non-registered package is used as a > > dependency, the setup() will need to specify dependency_links, in which > > case PyPI will not be consulted. > >Ah, ok. So is it then correct that setuptools never looks at pypi/, >unless the user misspelled a package name on the command line? Pretty much, yes. 
From martin at v.loewis.de Tue Jul 24 21:55:47 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 21:55:47 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org Message-ID: <46A65943.7000302@v.loewis.de> After some discussion, it seems that nobody really likes the name "cheeseshop" for the Python Package Index, and some people seem to actively hate it. So I'm going to change the name (again/back): the software will call itself "Python Package Index", abbreviated as pypi (PyPI where case matters). The machine address cheeseshop.python.org will continue to work for a foreseeable future, but will not be actively advertised. Regards, Martin From noah.gift at gmail.com Tue Jul 24 21:57:30 2007 From: noah.gift at gmail.com (Noah Gift) Date: Tue, 24 Jul 2007 15:57:30 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070724194426.828EE3A40A7@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> Message-ID: On 7/24/07, Phillip J. Eby wrote: > At 08:54 PM 7/24/2007 +0200, Martin v. L?wis wrote: > > >> Right. However, if setuptools would stop making case insensitive > > >> lookups to the index, lookups to unregistered packages would become > > >> more efficient. > > > > > > I'm not sure I follow you. If a non-registered package is used as a > > > dependency, the setup() will need to specify dependency_links, in which > > > case PyPI will not be consulted. > > > >Ah, ok. 
So is it then correct that setuptools never looks at pypi/, > >unless the user misspelled a package name on the command line? > > Pretty much, yes. Would it be a bad idea to suggest the case insensitive lookup happen against a local flat file that gets diff'd from PyPI? Then only the culprit gets punished using their own CPU :) From martin at v.loewis.de Tue Jul 24 22:02:27 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 22:02:27 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070723211919.361E23A403D@sparrow.telecommunity.com> <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> Message-ID: <46A65AD3.3060607@v.loewis.de> > Would it be a bad idea to suggest the case insensitive lookup happen > against a local flat file that gets diff'd from PyPI? Then only the > culprit gets punished using their own CPU :) What does it mean to "diff a flat file from PyPI"? Regards, Martin From noah.gift at gmail.com Tue Jul 24 22:07:15 2007 From: noah.gift at gmail.com (Noah Gift) Date: Tue, 24 Jul 2007 16:07:15 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. 
In-Reply-To: <46A65AD3.3060607@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> <46A65AD3.3060607@v.loewis.de> Message-ID: On 7/24/07, "Martin v. Löwis" wrote: > > Would it be a bad idea to suggest the case insensitive lookup happen > > against a local flat file that gets diff'd from PyPI? Then only the > > culprit gets punished using their own CPU :) > > What does it mean to "diff a flat file from PyPI"? I am familiar with an open source project called Radmind. It maintains machines by keeping a local transcript with all of the files and "overloads" on it. When you modify the file system you diff the changes into an overload and put them on the server. When the client asks for an update, the client checks to see if its transcript files are the same. If they are then it does nothing. If they differ, the file(s) get updated. Then the magic is that the search and replace for which files it needs to grab are done locally using tons of local CPU. When the client resolves all of the files it needs, it then grabs them from the server. It is a nifty design: http://rsug.itd.umich.edu/software/radmind/ So, if someone does an "incorrect" search, easy_install checks to see first if it has the latest "file". If not, it then replaces its local index. Then the search happens locally, rather than going back and forth to the server. > > Regards, > Martin > -- http://www.blog.noahgift.com From pje at telecommunity.com Tue Jul 24 22:21:29 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue, 24 Jul 2007 16:21:29 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A65943.7000302@v.loewis.de> References: <46A65943.7000302@v.loewis.de> Message-ID: <20070724201911.1210E3A40A7@sparrow.telecommunity.com> At 09:55 PM 7/24/2007 +0200, Martin v. Löwis wrote: >After some discussion, it seems that nobody really likes >the name "cheeseshop" for the Python Package Index, >and some people seem to actively hate it. I was under the impression that that's also the case for the name "PyPI", which was changed because of difficulty of disambiguating from "PyPy" in conversation. Cheeseshop is at least a word that is obviously a noun, and it is in somewhat more common use, with 224,000 google hits for "cheeseshop python -monty", versus 199,000 for "pypi python -monty". From martin at v.loewis.de Tue Jul 24 22:23:58 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 22:23:58 +0200 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com> <46A639A7.7090305@v.loewis.de> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> <46A65AD3.3060607@v.loewis.de> Message-ID: <46A65FDE.7080806@v.loewis.de> > I am familiar with an open source project called Radmind. It > maintains machines be keeping a local transcript with all of the files > and "overloads" on it. When you modify the file system you diff the > changes into an overload and put them on the server. That's still a lot of terminology which I don't understand, and have no intuition for, perhaps because English is not my native language.
I give up trying to understand - just to give you an idea: What's a "transcript of files"? How do you "overload" on it (why is "to overload" used with the preposition "on")? How do I "diff" a change "into" "an overload" (which now is a noun, it seems)? > So, if someone does an "incorrect" search, easy_install checks to see > first if it has the latest "file". If not, it then replaces its local > index. Then the search happens locally, not being going back and > forth to the server. I think this brings us to the real issue: you asked whether this would be a bad idea to suggest that? I now think "perhaps not bad, but unhelpful, unless you also contribute an implementation of it". It's a change to setuptools, which is still mostly a one-man-show, (IIUC), so proposing ideas in general is futile (as for most software with a single author - including PyPI); the single author cannot possibly implement all the ideas people have. Regards, Martin From jim at zope.com Tue Jul 24 22:36:31 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 16:36:31 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A65943.7000302@v.loewis.de> References: <46A65943.7000302@v.loewis.de> Message-ID: <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> On Jul 24, 2007, at 3:55 PM, Martin v. L?wis wrote: > After some discussion, it seems that nobody really likes > the name "cheeseshop" for the Python Package Index, > and some people seem to actively hate it. > > So I'm going to change the name (again/back): the software > will call itself "Python Package Index", abbreviated as > pypi (PyPI where case matters). The machine address > cheeseshop.python.org will continue to work for a > foreseeable future, but will not be actively advertised. I think this is progress. I'll note that "Package Index" is somewhat misleading, because it actually indexes distributions, not packages. 
A more precise name would be pydi and wouldn't be so easily confused with pypy. (Jim ducks. Jim looks forward to pydi tee shirts.) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From noah.gift at gmail.com Tue Jul 24 22:37:49 2007 From: noah.gift at gmail.com (Noah Gift) Date: Tue, 24 Jul 2007 16:37:49 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <46A65FDE.7080806@v.loewis.de> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> <46A65AD3.3060607@v.loewis.de> <46A65FDE.7080806@v.loewis.de> Message-ID: My real motive is selfishness. I like that easy_install is not case sensitive, as do other people I am helping to learn Python. I just hope that doesn't go away. My suggestion is more geared toward, how do I "keep" that feature :) > no intuition for, perhaps because English is not my native language. > I give up trying to understand - just to give you an idea: I apologize, I can be very lazy when I type. > I now think "perhaps not bad, but > unhelpful, unless you also contribute an implementation of it". > It's a change to setuptools, which is still mostly a one-man-show, > (IIUC), so proposing ideas in general is futile (as for most software > with a single author - including PyPI); the single author cannot > possibly implement all the ideas people have. The basic algorithm is that a local index of PyPI could be kept in one file. If an incorrect search was made, the first action to occur would be to check if the local file was the same as the file on the server. If not, it would sync the changes with svn.
Then easy_install would try to do lookups against the local file to find a match. I am happy to help if you need help. I have a particular interest in easy_install as I am writing a chapter on it for an O'Reilly book as well, again a partially selfish motive :) Noah > > Regards, > Martin > -- http://www.blog.noahgift.com From martin at v.loewis.de Tue Jul 24 23:35:24 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 23:35:24 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <20070724201911.1210E3A40A7@sparrow.telecommunity.com> References: <46A65943.7000302@v.loewis.de> <20070724201911.1210E3A40A7@sparrow.telecommunity.com> Message-ID: <46A6709C.1040004@v.loewis.de> >> After some discussion, it seems that nobody really likes >> the name "cheeseshop" for the Python Package Index, >> and some people seem to actively hate it. > > I was under the impression that that's also the case for the name > "PyPI", which was changed because of difficulty of disambiguating from > "PyPy" in conversation. That may be the case - however, Guido van Rossum said he would like to see PyPI promoted, and thought that this already had been decided. Richard Jones doesn't object; so PyPI it is. > Cheeseshop is at least a word that is obviously a noun, and it is in > somewhat more common use, with 224000 google hits for "cheeseshop python > -monty", versus 199,000 for "pypi python -monty". Sure. I can see all the reasons why one would like to have something like that. However, it's an authority decision, and I firmly believe in authority when it comes to naming things - somebody has to pick a name, and PyPI is the name that got picked (along with its full spelling of "Python Package Index" - google for that also) But then, I can't even see why the number of hits is important - what matters is what comes out at place 1 in Google.
Regards, Martin From philipp at weitershausen.de Tue Jul 24 23:33:40 2007 From: philipp at weitershausen.de (Philipp von Weitershausen) Date: Tue, 24 Jul 2007 23:33:40 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A65943.7000302@v.loewis.de> References: <46A65943.7000302@v.loewis.de> Message-ID: Martin v. Löwis wrote: > After some discussion, it seems that nobody really likes > the name "cheeseshop" for the Python Package Index, > and some people seem to actively hate it. Not sure if this makes a difference, but I'm one of the people who actively love it (whatever that means :)). > So I'm going to change the name (again/back): the software > will call itself "Python Package Index", abbreviated as > pypi (PyPI where case matters). The machine address > cheeseshop.python.org will continue to work for a > foreseeable future, but will not be actively advertised. To avoid confusion with PyPy, we can perhaps encourage a different pronunciation (or a totally different name as Jim has suggested). I've been pronouncing it "pippi" (as in Longstocking) myself. -- http://worldcookery.com -- Professional Zope documentation and training From martin at v.loewis.de Tue Jul 24 23:43:20 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Jul 2007 23:43:20 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> Message-ID: <46A67278.6000709@v.loewis.de> > I'll note that "Package Index" is somewhat misleading, because it > actually indexes distributions, not packages. A more precise name would > be pydi and wouldn't be so easily confused with pypy. (Jim ducks. Jim > looks forward to pydi tee shirts.) That's actually irrelevant. There is no option to name it something *other* than either PyPI or Cheeseshop.
Naming it something else has been tried and failed tremendously, so I won't try it again (and being the main PyPI maintainer at the moment, nobody else has a chance to try something else - if you want to give it a name, contribute to it for a few years, then have your own try). FWIW, "distribution" is quite misleading. SuSE is a Distribution (of Linux), and so is Debian; ActivePython is a distribution of Python (I think - before coming to that area, I would have thought that "the distribution of Python varies across continents"). Regards, Martin From jim at zope.com Tue Jul 24 23:49:15 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 17:49:15 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A67278.6000709@v.loewis.de> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> <46A67278.6000709@v.loewis.de> Message-ID: <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> On Jul 24, 2007, at 5:43 PM, Martin v. Löwis wrote: ... > FWIW, "distribution" is quite misleading. SuSE is a Distribution > (of Linux), and so is Debian; ActivePython is a distribution of > Python (I think - before coming to that area, I would have though > that "the distribution of Python varies across continents"). I didn't come up with the name "distribution". Distutils did that. Whether we like it or not, the Python Library Reference defines this term. http://docs.python.org/dist/distutils-term.html We have a real problem with terminology. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From kantrn at rpi.edu Tue Jul 24 23:47:43 2007 From: kantrn at rpi.edu (Noah Kantrowitz) Date: Tue, 24 Jul 2007 17:47:43 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A6709C.1040004@v.loewis.de> References: <46A65943.7000302@v.loewis.de> <20070724201911.1210E3A40A7@sparrow.telecommunity.com> <46A6709C.1040004@v.loewis.de> Message-ID: On Jul 24, 2007, at 5:35 PM, Martin v. L?wis wrote: >>> After some discussion, it seems that nobody really likes >>> the name "cheeseshop" for the Python Package Index, >>> and some people seem to actively hate it. >> >> I was under the impression that that's also the case for the name >> "PyPI", which was changed because of difficulty of disambiguating >> from >> "PyPy" in conversation. > > That may be the case - however, Guido van Rossum said he would like > to see PyPI promoted, and thought that this already had been decided. > Richard Jones doesn't object; so PyPI it is. > >> Cheeseshop is at least a word that is obviously a noun, and it is in >> somewhat more common use, with 224000 google hits for "cheeseshop >> python >> -monty", versus 199,000 for "pypi python -monty". > > Sure. I can see all the reasons why one would like to have something > like that. However, it's an authority decision, and I firmly believe > in authority when it comes to naming things - somebody has to pick > a name, and PyPI is the name that got picked (along with its full > spelling of "Python Package Index" - google for that also) > > But then, I can't even see why the number of hits is important - what > matters is what comes out at place 1 in Google. > Personally I don't have much of a problem with PyPI (pie pee eye) vs. PyPy (pie pie). 
--Noah From martin at v.loewis.de Wed Jul 25 00:07:17 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 25 Jul 2007 00:07:17 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> <46A67278.6000709@v.loewis.de> <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> Message-ID: <46A67815.7020807@v.loewis.de> > I didn't come up with the name "distribution". Distutils did that. > Whether we like it or not, the Python Library Reference defines this term. > > http://docs.python.org/dist/distutils-term.html > > We have a real problem with terminology. Perhaps. I notice that the page you refer to does *not* define the term "distribution", but "module distribution". I also notice that PyPI is not an index for these (i.e. .tar.gz or whatever files containing Python modules). Instead, it *indexes* Python projects (as Richard calls them, and I think quite correctly so). Each project then may have multiple _releases_, and each of them may refer to distributions (but not only so, it also refers to a home page, an author, a description, Trove classifiers, etc). Regards, Martin FWIW, the distutils terminology would make it "PyMDI" :-) From pje at telecommunity.com Wed Jul 25 00:13:48 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 24 Jul 2007 18:13:48 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index.
In-Reply-To: References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070724175013.238323A40A7@sparrow.telecommunity.com> <46A64317.6090907@v.loewis.de> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> <46A65AD3.3060607@v.loewis.de> <46A65FDE.7080806@v.loewis.de> Message-ID: <20070724221128.D9B193A40A7@sparrow.telecommunity.com> At 04:37 PM 7/24/2007 -0400, Noah Gift wrote: >The basic algorithm is that a local index of PyPi could be kept in one >file. If an incorrect search was made, the first action to occur >would be to check if the local file was the same as the file on the >server. If not, it would sync the changes with svn. Then >easy_install would try to do lookups against the local file to find a >match. Note that there are a lot of ways you can implement something like this without even involving me on the client or Martin on the server. For example, setuptools.package_index uses urllib2 for all its URL access, so installing an "opener" that does caching before invoking easy_install is possible. You can also subclass the easy_install command class and the PackageIndex class, or tell the easy_install command class to use a different PackageIndex implementation. In the long run, I'd like to add some entry points to allow people to extend the search mechanism in such ways, but for now you can certainly hack subclasses easily enough and make your own alternative commands, as Jim has done for integrating zc.buildout with setuptools. 
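[Editorial sketch, not part of the original message.] The "caching opener" idea above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: it uses Python 3's urllib.request (the successor to the urllib2 module named in the message), the CacheHandler class and its cache-directory layout are hypothetical, and it stores only response bodies, dropping headers that a real index client would probably need (Content-Type, for instance). Whether a given setuptools version routes all its access through the globally installed opener is also an assumption.

```python
import email.message
import hashlib
import io
import os
import urllib.request
import urllib.response


class CacheHandler(urllib.request.BaseHandler):
    """Hypothetical handler: answer GET requests from a directory cache."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _cache_path(self, url):
        # One file per URL, named by a hash of the URL.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, digest)

    def _make_response(self, body, url):
        # Build a minimal response object; headers are deliberately empty.
        resp = urllib.response.addinfourl(
            io.BytesIO(body), email.message.Message(), url, 200)
        resp.msg = "OK"  # the stock HTTP error processor reads .msg
        return resp

    def default_open(self, req):
        # Runs before the scheme-specific handlers; returning None means
        # "not cached, let the normal handlers fetch it".
        if req.get_method() != "GET":
            return None
        path = self._cache_path(req.full_url)
        if not os.path.exists(path):
            return None
        with open(path, "rb") as f:
            return self._make_response(f.read(), req.full_url)

    def http_response(self, req, response):
        # Store successful GET bodies that are not in the cache yet.
        path = self._cache_path(req.full_url)
        if (req.get_method() != "GET" or response.code != 200
                or os.path.exists(path)):
            return response
        body = response.read()
        with open(path, "wb") as f:
            f.write(body)
        return self._make_response(body, req.full_url)

    https_response = http_response
```

Installing it globally before running the index client would look like urllib.request.install_opener(urllib.request.build_opener(CacheHandler("pypi-cache"))), where the directory name is only an example. The sketch never expires entries, so a stale cache would have to be cleared by hand.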
From richardjones at optushome.com.au Wed Jul 25 00:14:22 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Wed, 25 Jul 2007 08:14:22 +1000 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names (second try) In-Reply-To: <46A640ED.8070406@ibp.de> References: <46A640ED.8070406@ibp.de> Message-ID: <200707250814.22196.richardjones@optushome.com.au> On Wed, 25 Jul 2007, Lars Immisch wrote: > Still, I'd like a stated convention how version numbers are compared. I > believe this would be good for setuptools also. Currently PyPI sorts releases using distutils.version.LooseVersion. It uses distutils.version.StrictVersion when parsing "provides", "requires" and "obsoletes" setup.py package meta-data. Richard From g.brandl at gmx.net Wed Jul 25 00:15:22 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 25 Jul 2007 00:15:22 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A67815.7020807@v.loewis.de> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> <46A67278.6000709@v.loewis.de> <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> <46A67815.7020807@v.loewis.de> Message-ID: Martin v. Löwis wrote: >> I didn't come up with the name "distribution". Distutils did that. >> Whether we like it or not, the Python Library Reference defines this term. >> >> http://docs.python.org/dist/distutils-term.html >> >> We have a real problem with terminology. > > Perhaps. I notice that the page you refer to does *not* define > the term "distribution", but "module distribution". I also notice > that PyPI is not an index for these (i.e. .tar.gz or whatever > files containing Python modules). Instead, it *indexes* Python > projects (as Richard calls them, and I think quite correctly > so).
Each project then may have multiple _releases_, and each > of them may refer to distributions (but not only so, it > also refers to a home page, an author, a description, Trove > classifiers, etc). So why not change the name to "Python Project Index"? The abbreviation stays the same... Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From benji at benjiyork.com Tue Jul 24 22:47:05 2007 From: benji at benjiyork.com (Benji York) Date: Tue, 24 Jul 2007 16:47:05 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> Message-ID: <46A66549.4010406@benjiyork.com> Jim Fulton wrote: > A more precise name would be pydi /me looks forward to the Mr. T jokes. -- Benji York http://benjiyork.com From noah.gift at gmail.com Wed Jul 25 00:30:31 2007 From: noah.gift at gmail.com (Noah Gift) Date: Tue, 24 Jul 2007 18:30:31 -0400 Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI index. In-Reply-To: <20070724221128.D9B193A40A7@sparrow.telecommunity.com> References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com> <20070724184151.1EAE53A40A7@sparrow.telecommunity.com> <46A64AE0.7000307@v.loewis.de> <20070724194426.828EE3A40A7@sparrow.telecommunity.com> <46A65AD3.3060607@v.loewis.de> <46A65FDE.7080806@v.loewis.de> <20070724221128.D9B193A40A7@sparrow.telecommunity.com> Message-ID: On 7/24/07, Phillip J. Eby wrote: > At 04:37 PM 7/24/2007 -0400, Noah Gift wrote: > >The basic algorithm is that a local index of PyPi could be kept in one > >file. 
If an incorrect search was made, the first action to occur > >would be to check if the local file was the same as the file on the > >server. If not, it would sync the changes with svn. Then > >easy_install would try to do lookups against the local file to find a > >match. > > Note that there are a lot of ways you can implement something like > this without even involving me on the client or Martin on the > server. For example, setuptools.package_index uses urllib2 for all > its URL access, so installing an "opener" that does caching before > invoking easy_install is possible. You can also subclass the > easy_install command class and the PackageIndex class, or tell the > easy_install command class to use a different PackageIndex implementation. > > In the long run, I'd like to add some entry points to allow people to > extend the search mechanism in such ways, but for now you can > certainly hack subclasses easily enough and make your own alternative > commands, as Jim has done for integrating zc.buildout with setuptools. > Great suggestion! I really like that idea. Does this mean it is also easy to point to another local repository that is available via NFS? I guess a local http mirror would work just as well, if you told the opener about it. This seems like a good way to instruct a sysadmin on how to set up a local customized infrastructure! > -- http://www.blog.noahgift.com From jim at zope.com Wed Jul 25 00:31:29 2007 From: jim at zope.com (Jim Fulton) Date: Tue, 24 Jul 2007 18:31:29 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A67815.7020807@v.loewis.de> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> <46A67278.6000709@v.loewis.de> <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> <46A67815.7020807@v.loewis.de> Message-ID: <2F7239A7-4098-4E0F-BA06-93042CD479C4@zope.com> On Jul 24, 2007, at 6:07 PM, Martin v. Löwis wrote: >> I didn't come up with the name "distribution".
Distutils did that. >> Whether we like it or not, the Python Library Reference defines >> this term. >> >> http://docs.python.org/dist/distutils-term.html >> >> We have a real problem with terminology. > > Perhaps. I notice that the page you refer to does *not* define > the term "distribution", but "module distribution". Obviously, "module" is modifying the term "distribution". No matter, I like your point below. > I also notice > that PyPI is not an index for these (i.e. .tar.gz or whatever > files containing Python modules). Instead, in *indexes* Python > projects (as Richard calls them, and I think quite correctly > so). Each project then may have multiple _releases_, and each > of them may refer to distributions (but not only so, it > also refers to a home page, an author, a description, Trove > classifiers, etc). I think that using the term "project" here addresses the terminology issue nicely. As Phillip pointed out, this is the terminology that setuptools uses. So maybe PyPI should expand to "Python Project Index". Aside from PyPI, I'd really like to "bless" this terminology. If we all seem to like this term, I'd be happy to try to update the distutils documentation to reflect this terminology. (I hope we don't need a PEP to adopt this terminology.) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From renesd at gmail.com Wed Jul 25 01:34:27 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Wed, 25 Jul 2007 09:34:27 +1000 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A65943.7000302@v.loewis.de> References: <46A65943.7000302@v.loewis.de> Message-ID: <64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com> Cheeseshop is better I reckon. pypi sounds like nothing. Cheeseshop is at least fun... for those with a sense of humour. On 7/25/07, "Martin v. 
Löwis" wrote: > After some discussion, it seems that nobody really likes > the name "cheeseshop" for the Python Package Index, > and some people seem to actively hate it. > > So I'm going to change the name (again/back): the software > will call itself "Python Package Index", abbreviated as > pypi (PyPI where case matters). The machine address > cheeseshop.python.org will continue to work for a > foreseeable future, but will not be actively advertised. > > Regards, > Martin > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > From waterbug at pangalactic.us Wed Jul 25 03:56:17 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Tue, 24 Jul 2007 21:56:17 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> <46A63240.7070003@pangalactic.us> <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> Message-ID: <46A6ADC1.4000008@pangalactic.us> Phillip J. Eby wrote: > At 01:09 PM 7/24/2007 -0400, Stephen Waterbury wrote: >> Actually, I wasn't confused. :) I'd suggest a convention that allows >> a distribution "title" (e.g., "Zope", "Twisted", etc.) and a >> distribution "name" that would simply be the name of the >> distribution's top-level package (e.g., "zope", "twisted", etc.), > > This proposal would rule out namespace packages ... I thought about that. The rule for namespace distributions would be to allow dotted names, e.g. "zope.interface", "zope.schema", etc., as are often currently used. In fact, in a real sense, those *are* the top-level packages of namespace packages. > in addition to being > incompatible with existing distribution names.
I thought the point was to come up with a new distribution naming convention, because there currently isn't one -- but the naming convention has to be consistent with all existing distribution names? Seems a tough constraint. > Note that package != distribution ... Yes, I knew that. Of course, now the discussion seems to suggest "project" or "project release" might be a better name than "distribution", and I agree with that. > -- a distribution may contain zero or > more packages (even top-level) ... Indeed, and I've always disliked multiple top-level packages in an [installable unit]. I never liked ZODB strewing top-level packages all over site-packages. (But I do like ZODB -- thanks Jim et al.! I'd just much prefer that it have a top-level "zodb" package.) Of course, eggs make site-packages dirs look much tidier, but I'd still prefer that each [installable unit] have a top-level package, because then it's obvious where imported modules come from just by looking at their top-level namespace. > *and* a single package (top-level or > otherwise) may be spread over more than one distribution. IMO, a package that's spread over more than one distribution should probably not be top-level in both distributions. :) BTW, I am not emotionally attached to this proposal (good thing, eh? ;), but there are a couple of principles in it that I thought deserved a little bit of logical advocacy, e.g.: * if a package deserves a "top-level" namespace, it probably also deserves have its own [installable unit]. * although package != [installable unit], I still think it's not illogical to use the top-level package of an [installable unit] as part of its canonical unique identifier. But admittedly one would have to agree with some of my other points above to agree with that. Steve From pje at telecommunity.com Wed Jul 25 04:19:03 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 24 Jul 2007 22:19:03 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <46A6ADC1.4000008@pangalactic.us> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> <46A63240.7070003@pangalactic.us> <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> <46A6ADC1.4000008@pangalactic.us> Message-ID: <20070725021710.B45F43A40A7@sparrow.telecommunity.com> At 09:56 PM 7/24/2007 -0400, Stephen Waterbury wrote: >I thought the point was to come up with a new distribution naming >convention, Nope, just clarify the rules for *distinguishing* projects by name -- a much less ambitious goal, since it's pretty easy to do with little or no impact on existing projects. A new naming convention isn't in scope, since it would require a "boil the ocean" renaming effort to implement, assuming you could get everyone to agree in the first place. From waterbug at pangalactic.us Wed Jul 25 04:56:03 2007 From: waterbug at pangalactic.us (Stephen Waterbury) Date: Tue, 24 Jul 2007 22:56:03 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <20070725021710.B45F43A40A7@sparrow.telecommunity.com> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> <46A63240.7070003@pangalactic.us> <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> <46A6ADC1.4000008@pangalactic.us> <20070725021710.B45F43A40A7@sparrow.telecommunity.com> Message-ID: <46A6BBC3.7000505@pangalactic.us> Phillip J. Eby wrote: > At 09:56 PM 7/24/2007 -0400, Stephen Waterbury wrote: >> I thought the point was to come up with a new distribution naming >> convention, > > Nope, just clarify the rules for *distinguishing* projects by name -- > a much less ambitious goal, since it's pretty easy to do with little > or no impact on existing projects. 
> > A new naming convention isn't in scope, since it would require a > "boil the ocean" renaming effort to implement, assuming you could get > everyone to agree in the first place. Indeed. Boiling the ocean will have to wait. I still think putting multiple top-level packages in a single installable is a mistake. ;) Peace. Steve From martin at v.loewis.de Wed Jul 25 07:34:01 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 25 Jul 2007 07:34:01 +0200 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> <46A67278.6000709@v.loewis.de> <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> <46A67815.7020807@v.loewis.de> Message-ID: <46A6E0C9.1050802@v.loewis.de> > So why not change the name to "Python Project Index"? The abbreviation > stays the same... Because I'm uncertain how people will react to it. The unabbreviated name really doesn't matter that much, except for Google searches perhaps. If "Project Index" gets a clear preference over "Package Index", we can try that; if people are likely to object to it also, nothing is gained. Regards, Martin From benji at benjiyork.com Wed Jul 25 14:40:28 2007 From: benji at benjiyork.com (Benji York) Date: Wed, 25 Jul 2007 08:40:28 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <46A6E0C9.1050802@v.loewis.de> References: <46A65943.7000302@v.loewis.de> <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com> <46A67278.6000709@v.loewis.de> <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com> <46A67815.7020807@v.loewis.de> <46A6E0C9.1050802@v.loewis.de> Message-ID: <46A744BC.5010702@benjiyork.com> Martin v. Löwis wrote: >> So why not change the name to "Python Project Index"? The >> abbreviation stays the same... > > Because I'm uncertain how people will react to it. [...]
If "Project > Index" gets a clear preference over "Package Index", we can try that; > if people are likely to object to it also, nothing is gained. This looks like a bike shed from here. I suggest you and whomever you want to consult pick a name and go with it. And I repent for my earlier paint color suggestion. -- Benji York http://benjiyork.com From jim at zope.com Wed Jul 25 15:25:38 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 25 Jul 2007 09:25:38 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <20070725021710.B45F43A40A7@sparrow.telecommunity.com> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> <46A63240.7070003@pangalactic.us> <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> <46A6ADC1.4000008@pangalactic.us> <20070725021710.B45F43A40A7@sparrow.telecommunity.com> Message-ID: On Jul 24, 2007, at 10:19 PM, Phillip J. Eby wrote: > At 09:56 PM 7/24/2007 -0400, Stephen Waterbury wrote: >> I thought the point was to come up with a new distribution naming >> convention, > > Nope, just clarify the rules for *distinguishing* projects by name -- > a much less ambitious goal, since it's pretty easy to do with little > or no impact on existing projects. I mostly agree, except that I think we also need to define what is legal in a project name. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 25 15:30:45 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 25 Jul 2007 09:30:45 -0400 Subject: [Catalog-sig] [Distutils] We need to make a decision wrt distribution names In-Reply-To: <46A6ADC1.4000008@pangalactic.us> References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com> <46A62DA7.9000304@pangalactic.us> <46A63240.7070003@pangalactic.us> <20070724174248.F40AA3A40A7@sparrow.telecommunity.com> <46A6ADC1.4000008@pangalactic.us> Message-ID: <1882F28C-AC83-4EEE-9F86-979CF5DEB88E@zope.com> On Jul 24, 2007, at 9:56 PM, Stephen Waterbury wrote: > Phillip J. Eby wrote: >> At 01:09 PM 7/24/2007 -0400, Stephen Waterbury wrote: >>> Actually, I wasn't confused. :) I'd suggest a convention that >>> allows >>> a distribution "title" (e.g., "Zope", "Twisted", etc.) and a >>> distribution "name" that would simply be the name of the >>> distribution's top-level package (e.g., "zope", "twisted", etc.), >> >> This proposal would rule out namespace packages ... > > I thought about that. The rule for namespace distributions would > be to > allow dotted names, e.g. "zope.interface", "zope.schema", etc., as are > often currently used. In fact, in a real sense, those *are* the > top-level packages of namespace packages. Those are the top-level packages of those distributions. >> in addition to being >> incompatible with existing distribution names. > > I thought the point was to come up with a new distribution naming > convention, because there currently isn't one -- but the naming > convention has to be consistent with all existing distribution > names? Seems a tough constraint. No, my proposal was to define: - Rules for constructing *legal* (as opposed to "good") project names - Rules for variations on project names. ... >> -- a distribution may contain zero or >> more packages (even top-level) ... 
> > Indeed, and I've always disliked multiple top-level packages in an > [installable unit]. No offense intended, but this seems arbitrary to me. Note that not only can a distribution contain more than one package, it can contain no packages. >> *and* a single package (top-level or >> otherwise) may be spread over more than one distribution. > > IMO, a package that's spread over more than one distribution should > probably not be top-level in both distributions. :) Phillip was (I think) referring to namespace packages. Namespace packages are a very important tool for maintaining some sanity in package naming. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Jul 25 15:59:31 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 25 Jul 2007 09:59:31 -0400 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: <64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com> References: <46A65943.7000302@v.loewis.de> <64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com> Message-ID: On Jul 24, 2007, at 7:34 PM, René Dudfield wrote: > Cheeseshop is better I reckon. > > pypi sounds like nothing. Cheeseshop is at least fun... for those > with a sense of humour. Minor hysterical note: The original joke, as I understand it, was based on the fact that PyPI originally didn't contain any packages (or distributions). Like the cheeseshop having no cheese, the package index had no packages (or even distributions). The original joke doesn't really apply any more as many or most of us are actually uploading our distributions, so the project index really does indeed have cheese, I mean distributions. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From renesd at gmail.com Thu Jul 26 00:50:56 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Thu, 26 Jul 2007 08:50:56 +1000 Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org In-Reply-To: References: <46A65943.7000302@v.loewis.de> <64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com> Message-ID: <64ddb72c0707251550y9a5603ay38067556fbcb1584@mail.gmail.com> ok, that is even funnier - or are all jokes ruined by being explained? btw, I don't think most people are uploading their packages yet. You just have to compare the number of python projects on A) sourceforge/googlecode, B) python cookbook C) pygame projects There's still massive amounts of python code not indexed by pypi. oh, was projects one of the names being considered? projects.python.org ? How does the cookbook idea fit in with pypi? I guess I've seen pypi as about packages and modules rather than projects and cookbook recipes. Maybe those things were left out on purpose? On 7/25/07, Jim Fulton wrote: > > > On Jul 24, 2007, at 7:34 PM, René Dudfield wrote: > > > Cheeseshop is better I reckon. > > > > pypi sounds like nothing. Cheeseshop is at least fun... for those > > with a sense of humour. > > Minor hysterical note: The original joke, as I understand it, was > based on the fact that PyPI originally didn't contain any packages > (or distributions). Like the cheeseshop having no cheese, the package > index had no packages (or even distributions). The original joke > doesn't really apply any more as many or most of us are actually > uploading our distributions, so the project index really does indeed > have cheese, I mean distributions. > > Jim > > -- > Jim Fulton mailto:jim at zope.com Python > Powered!
> CTO (540) 361-1714 > http://www.python.org > Zope Corporation http://www.zope.com > http://www.zope.org > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/catalog-sig/attachments/20070726/fd729fa5/attachment.html From michael at d2m.at Fri Jul 27 07:24:15 2007 From: michael at d2m.at (Michael Haubenwallner) Date: Fri, 27 Jul 2007 07:24:15 +0200 Subject: [Catalog-sig] rss feed: broken links Message-ID: FYI: the RSS feed's links to packages (like http://python.python.org/pypi/...) are broken. Btw, changing the URLs will likely double the last 30 feed items in aggregators. Michael -- http://www.zope.org/Members/d2m http:/planetzope.org From martin at v.loewis.de Fri Jul 27 08:03:53 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 27 Jul 2007 08:03:53 +0200 Subject: [Catalog-sig] rss feed: broken links In-Reply-To: References: Message-ID: <46A98AC9.3050009@v.loewis.de> Michael Haubenwallner wrote: > FYI: the RSS feed's links to packages > (like http://python.python.org/pypi/...) > are broken. Oops, fixed. > Btw, changing the URLs will likely double the last 30 feed items in > aggregators. Sure. I don't think anything can be done about that; after a few days, this is old news, anyway. Regards, Martin From renesd at gmail.com Sat Jul 28 02:22:16 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 28 Jul 2007 10:22:16 +1000 Subject: [Catalog-sig] static files, and testing pypi Message-ID: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> Hello, I've got a bit of spare time again after catching up on work after attending europython - so was wondering if I should still finish the static file stuff? Should I still finish the static file generation, or is this not wanted? I think it would still be useful. So if it is still wanted, can I please get a new version of the database?
Since I think there have been significant changes since my copy of it. As part of it I want to write some unittests and regression tests/monitoring scripts. I think this is sorely needed for pypi, so we don't see the same kind of breakage when we refactor - and to make sure the service is running ok. I guess a tool for this stuff might be the webunit that Richard Jones wrote? Or some other tool? http://mechanicalcat.net/tech/webunit/ Should unittests just be written with unittest? Or some other framework? If the maintainers want to stick with no tests I'll just write my tests separately. Or I can just set up a basic framework with unittest, and webunit. From martin at v.loewis.de Sat Jul 28 09:34:26 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Jul 2007 09:34:26 +0200 Subject: [Catalog-sig] static files, and testing pypi In-Reply-To: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> Message-ID: <46AAF182.9070400@v.loewis.de> > I think it would still be useful. So if it is still wanted, can I > please get a new version of the database? Since I think there have > been significant changes since my copy of it. Please take a look at the tools/sql-migrate* files. They should bring your database up to the current schema. > I guess a tool for this stuff might be the webunit that Richard Jones > wrote? Or some other tool? http://mechanicalcat.net/tech/webunit/ > > Should unittests just be written with unittest? Or some other framework? I don't care what framework is chosen - pick any that allows for completely automated test runs.
Regards, Martin From richardjones at optushome.com.au Sat Jul 28 10:03:04 2007 From: richardjones at optushome.com.au (Richard Jones) Date: Sat, 28 Jul 2007 18:03:04 +1000 Subject: [Catalog-sig] static files, and testing pypi In-Reply-To: <46AAF182.9070400@v.loewis.de> References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> <46AAF182.9070400@v.loewis.de> Message-ID: <200707281803.04466.richardjones@optushome.com.au> On Sat, 28 Jul 2007, Martin v. Löwis wrote: > > I guess a tool for this stuff might be the webunit that Richard Jones > > wrote? Or some other tool? http://mechanicalcat.net/tech/webunit/ > > > > Should unittests just be written with unittest? Or some other framework? > > I don't care what framework is chosen - pick any that allows for > completely automated test runs. FWIW that's pretty much what webunit was designed for. Richard From benji at benjiyork.com Sat Jul 28 15:02:44 2007 From: benji at benjiyork.com (Benji York) Date: Sat, 28 Jul 2007 09:02:44 -0400 Subject: [Catalog-sig] static files, and testing pypi In-Reply-To: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> Message-ID: <46AB3E74.6090303@benjiyork.com> René Dudfield wrote: > Should I still finish the static file generation, or is this not wanted? I like the idea, if only from a stability standpoint. (Granted, stability has been improved greatly of late, but static files will always trump dynamic page generation). > As part of it I want to write some unittests and regression > tests/monitoring scripts. +1 > I guess a tool for this stuff might be the webunit that Richard Jones > wrote? Or some other tool? http://mechanicalcat.net/tech/webunit/ There's a good list of web testing tools at http://pycheesecake.org/wiki/PythonTestingToolsTaxonomy#WebTestingTools. I have a fondness for doctest, so would recommend that as well.
-- Benji York http://benjiyork.com

From martin at v.loewis.de Sat Jul 28 15:37:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 28 Jul 2007 15:37:57 +0200
Subject: [Catalog-sig] static files, and testing pypi
In-Reply-To: <46AB3E74.6090303@benjiyork.com>
References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com> <46AB3E74.6090303@benjiyork.com>
Message-ID: <46AB46B5.6020806@v.loewis.de>

> I like the idea, if only from a stability standpoint. (Granted,
> stability has been improved greatly of late, but static files will
> always trump dynamic page generation).

Depends on how you define stability, perhaps. If the dynamic generation on update stops working at some point, such an error may remain unnoticed for some period of time. The pages remain available, but are incorrect.

Regards, Martin

From martin at v.loewis.de Sat Jul 28 16:22:29 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 28 Jul 2007 16:22:29 +0200
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <20070723212920.8BFE63A40AA@sparrow.telecommunity.com>
References: <46A50BF1.9020303@v.loewis.de> <20070723204445.65ABC3A40AA@sparrow.telecommunity.com> <46A51798.8000907@v.loewis.de> <20070723212920.8BFE63A40AA@sparrow.telecommunity.com>
Message-ID: <46AB5125.2000806@v.loewis.de>

> RewriteEngine On
> RewriteBase /
> RewriteCond %{REQUEST_METHOD} ^GET$
> RewriteRule ^pypi(.*)$
> http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [R,L]
> RewriteRule ^pypi(.*)$
> http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [P,L]

Thanks! I have now activated something like this, namely

RewriteCond %{REQUEST_METHOD} ^GET$
RewriteRule ^/pypi(.*)$ http://pypi.python.org/pypi$1?%{QUERY_STRING} [R,L]
RewriteRule ^/pypi(.*)$ http://pypi.python.org/pypi$1?%{QUERY_STRING} [P,L]

I haven't set RewriteBase, as this is in the central server config and would affect other rules as well.

Regards, Martin
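
[Editorial sketch, not part of the original message.] The rule pair above can be read as a two-way decision: the RewriteCond guards only the rule that follows it, so GET requests under /pypi are redirected to the new host, while any other method (such as an upload POST, going by the thread subject) falls through to the second rule and is proxied. A rough Python model of that decision follows; the route function is hypothetical and only mirrors the logic, it is not how Apache evaluates the rules, and Apache's handling of an empty query string is glossed over.

```python
import re

# Assumed constants, taken from the rules quoted in the message.
PYPI_PATH = re.compile(r"^/pypi(.*)$")
TARGET = "http://pypi.python.org/pypi"


def route(method, path, query=""):
    """Return (action, url) for a request, or None if no rule matches.

    action is "redirect" (flags [R,L]) or "proxy" (flags [P,L]).
    """
    match = PYPI_PATH.match(path)
    if match is None:
        return None
    url = TARGET + match.group(1)
    if query:
        url += "?" + query
    # RewriteCond %{REQUEST_METHOD} ^GET$ applies to the first rule only.
    if method == "GET":
        return ("redirect", url)
    # Everything else reaches the second rule and is proxied.
    return ("proxy", url)
```

For example, route("POST", "/pypi", ":action=submit") yields a "proxy" action, so the client talks to the old host name throughout, whereas a browser GET is told to go to pypi.python.org directly.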