From greg.ewing at canterbury.ac.nz Wed Oct 1 02:40:35 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 01 Oct 2008 12:40:35 +1200
Subject: [Catalog-sig] [Distutils] PEP for distutils
In-Reply-To: <48E24897.8010800@colorstudy.com>
References: <94bdd2610809280555p12c0e326r4a867bd3b67efbd9@mail.gmail.com>
    <48E23F2D.3000802@simplistix.co.uk>
    <20080930153822.GJ26878@phare.normalesup.org>
    <48E24897.8010800@colorstudy.com>
Message-ID: <48E2C703.2030801@canterbury.ac.nz>

Ian Bicking wrote:

> FWIW, pyinstall can collect all the packages before installing any of
> them. You do have to download all packages, though, as that's the only
> way to get the metadata.

This may be something to make sure is on the requirements
list for a metadata standard: Make sure there is a defined
way of getting just the metadata from a repository without
having to download the whole package.

--
Greg

From ziade.tarek at gmail.com Wed Oct 1 14:10:48 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 1 Oct 2008 14:10:48 +0200
Subject: [Catalog-sig] pre-PEP : Synthesis of previous threads, and irc talks + proposals
Message-ID: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>

Hello

I have followed most of the threads from the past days, and we talked
a lot on IRC with people from Fedora, Debian, Enthought, TG2 on
possible enhancements.

While the other threads are continuing in deeper detail, I would like
to start a fresh thread where people don't have to re-read everything
to be able to give their opinions on very precise points.

This thread focuses on calling out the current problems and the
solutions that can be adopted. I'd like to have "+1" and "-1" on each
proposal, with at most one sentence, or a fix if there is a mistake.
That could help us speed up the work.

Let's try to keep this thread concise; if you want to discuss a
problem in depth, start another thread, and I'll follow it to fix my
summary.

The problems
============

1/ the dependencies of a package are not expressed in the Requires
metadata of the package most of the time. Adding a dependency on a
module is not really done; developers add dependencies on packages.
Furthermore, developers tend to use the setuptools "install_requires"
and "tests_require" arguments to express dependencies. So basically,
you have to run an egg_info to get them, because the info files are
generated by commands.

2/ the existence of PyPI had a side-effect: people tend to push the
entire doc of the package into one single field (long_description) to
display it at PyPI. The documentation of the package is not clearly
pointed out to others.

3/ the metadata cannot be expressed in a static file by the developer,
because sometimes it is calculated by code. While this is very
permissive, that is how it works, but it is tied to argument passing
to setup().

4/ PyPI lacks direct information about dependencies. In the meantime,
the DOAP project is working on a way to express dependencies, but it
is a work in progress.

5/ ideally, there should be one and only one version of a given
package in an OS-based installation

6/ packagers would like to have more compatibility information to work
out security upgrades or version conflicts

7/ developers should be able to have more options when they define
version dependencies in their packages, things like:

    A depends on B>=1.2 and B<=2.0 but with a preference for B 1.4

    or "avoid B 1.7"

they give tips to packagers !

8/ the requires-python field is rarely used by people, so unless you
try the package, you don't know, when it is a source distribution,
whether it is going to run on various python versions.

9/ unless the developer has a strong commitment to an OS, he will
never create and use a file that is located in /etc

10/ you can't possibly have complete knowledge of the dependency graph
and possible conflicts when you introduce a versioned dependency in
your package.
Packages at given versions are known by some people to work well
together, or not, in a set of versioned packages. Let's call this a
"known good set" (KGS).

  - OS packagers know and maintain the KGS for their distribution.
  - Web framework packagers do it for their application.

You don't, unless you work in a "KGS" environment. But if you want
your package to be a regular python package at PyPI, packagers should
be able to change its dependencies to make it fit their own KGS, and
to build their knowledge on it.

The developer's dependency info is a tip and a help for a packager,
not an enforcement. see [7]

11/ people should always upload the sdist version at PyPI; they don't
always do it. Otherwise it is a pain for packagers.

Proposals
=========

this is also a synthesis of what I heard, and some elements I have
added to respect the needs that were expressed.

0/ a lot of work can be done to clean distutils, no matter what is
decided (another PEP is built for that): cleaning, removing old-style
code, testing

1/ let's change the Python Metadata, in order to introduce a better
dependency system, by

  - officially introducing "install requires" and "test requires"
    metadata in there
  - marking "requires" as deprecated

2/ Let's move part of setuptools code in distutils, to respect those
changes.

3/ let's create a simple convention : the metadata should be expressed
in a python module called 'pkginfo.py' where each metadata is a
variable.

that can be used by setup.py and therefore by any tool that works with
it, even if it does not run a setup.py command.

This is simpler, this is cleaner. you don't have to run some setup
magic to read them. at least some magic introduced by commands

4/ let's change PyPI to make it work with the new metadata and to
enforce a few things

Enforcements:
  - a binary distribution cannot be uploaded if a source distribution
    has not been previously provided for the version
  - the requires-python needs to be present.
    : come on, you know what python versions your package works with !

New features:
  - we should be able to download the metadata of a package without
    downloading the package
  - PyPI should display the install and test dependencies in the UI
  - The XML-RPC should provide this new metadata as well.
  - a commenting system should allow developers and packagers to give
    more info on a package at PyPI to make the work easier

Open questions
==============

(please if you want to react on those, open another thread, with a
clean cut, otherwise it is hard to follow directly)

- what about the documentation ? can't we express it better in the
  Metadata ? I think we can structure it a bit
- what about the configuration ? can't we find a way to interact with
  a config ini-like file for instance and not care if it is located
  at /etc/package.cfg or at /Volumes/..etc ?

Regards
Tarek

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From gherman at darwin.in-berlin.de Wed Oct 1 14:35:05 2008
From: gherman at darwin.in-berlin.de (Dinu Gherman)
Date: Wed, 1 Oct 2008 14:35:05 +0200
Subject: [Catalog-sig] Does package_releases() always return all version numbers?
Message-ID: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de>

Hi,

I'm new on this list after I've discovered how to access the
PyPI repository via XML-RPC. I'm planning to look deeper into
the possibilities of PyPI, e.g. by writing little tools for
reporting statistics or verifying my own project registrations
and uploads.

Right now I'm trying to understand why in the example below
the "package_releases" method returns only the latest version
string for one of my packages, named "pdfnup", although it
should be two, as one can see online:

  http://pypi.python.org/pypi/pdfnup/0.3.0
  http://pypi.python.org/pypi/pdfnup/0.3.1

I don't know if this is an unexpected behaviour for my "pdfnup"
only or if other packages show the same behaviour.
Also, I don't know if the package owner must do something to
prevent this from happening, other than what I did:

  setup.py register
  setup.py sdist/bdist_egg upload

Regards,

Dinu

PS:

  import xmlrpclib
  serverUrl = "http://pypi.python.org/pypi"
  server = xmlrpclib.Server(serverUrl)
  name = "pdfnup"
  versions = server.package_releases(name)
  print versions
  # gives ["0.3.1"], expected: ["0.3.0", "0.3.1"]

  for d in server.release_urls("pdfnup", "0.3.1"):
      print d["filename"]
  # pdfnup-0.3.1-py2.5.egg
  # pdfnup-0.3.1.tar.gz

  for d in server.release_urls("pdfnup", "0.3.0"):
      print d["filename"]
  # pdfnup-0.3.0.tar.gz
  # pdfnup-0.3.0-py2.5.egg

From fdrake at gmail.com Wed Oct 1 15:08:06 2008
From: fdrake at gmail.com (Fred Drake)
Date: Wed, 1 Oct 2008 09:08:06 -0400
Subject: [Catalog-sig] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
Message-ID: <9cee7ab80810010608uc761173t2937493d88a79ea5@mail.gmail.com>

On Wed, Oct 1, 2008 at 8:10 AM, Tarek Ziadé wrote:
> 8/ the requires-python field is rarely used by people, so unless you
> try the package, you don't know when it is a source
> distribution, if it is going to run on various python versions.

What requires-python field?

I don't see this documented for distutils or setuptools.

  -Fred

--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written."
--Henry Miller

From asmodai at in-nomine.org Wed Oct 1 15:16:09 2008
From: asmodai at in-nomine.org (Jeroen Ruigrok van der Werven)
Date: Wed, 1 Oct 2008 15:16:09 +0200
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
Message-ID: <20081001131609.GW30869@nexus.in-nomine.org>

-On [20081001 14:10], Tarek Ziadé (ziade.tarek at gmail.com) wrote:
>I have followed most of the threads from the past days, and we talked
>a lot on IRC with people from Fedora, Debian, Enthought, TG2 on
>possible enhancements

Just a note: do not forget the BSD Unix systems when it comes to packaging
and whatnot, it's quite a bit different from the Linux systems. Also,
there's pkgsrc which spans multiple different platforms.

--
Jeroen Ruigrok van der Werven / asmodai
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Vae victis!

From ziade.tarek at gmail.com Wed Oct 1 15:21:35 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 1 Oct 2008 15:21:35 +0200
Subject: [Catalog-sig] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <9cee7ab80810010608uc761173t2937493d88a79ea5@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <9cee7ab80810010608uc761173t2937493d88a79ea5@mail.gmail.com>
Message-ID: <94bdd2610810010621h6033e2d0l791e623e86ac8513@mail.gmail.com>

On Wed, Oct 1, 2008 at 3:08 PM, Fred Drake wrote:
> On Wed, Oct 1, 2008 at 8:10 AM, Tarek Ziadé wrote:
>> 8/ the requires-python field is rarely used by people, so unless you
>> try the package, you don't know when it is a source
>> distribution, if it is going to run on various python versions.
>
> What requires-python field?
>
> I don't see this documented for distutils or setuptools.

The one described here http://www.python.org/dev/peps/pep-0345/ in
Metadata 1.2: 'Requires-Python'

So it can't be used by people at this time; that was a mistake.

the new version of [8] could be:

8/ you don't know, when it is a source distribution, whether it is
going to run on various python versions. PEP 345 defines it, but it is
not yet implemented in distutils, nor in setuptools.

and in the proposal parts:

0/ ... + implement the metadata 1.2 from PEP 345, (besides the
"requires" metadata)

>
>
> -Fred
>
> --
> Fred L. Drake, Jr.
> "Chaos is the score upon which reality is written." --Henry Miller
>

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From amk at amk.ca Wed Oct 1 15:21:37 2008
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 1 Oct 2008 09:21:37 -0400
Subject: [Catalog-sig] [Distutils] PEP for distutils
In-Reply-To: <94bdd2610809301341j60830399s37cf310939783b86@mail.gmail.com>
References: <94bdd2610809280555p12c0e326r4a867bd3b67efbd9@mail.gmail.com>
    <48E23F2D.3000802@simplistix.co.uk>
    <20080930153822.GJ26878@phare.normalesup.org>
    <48E24897.8010800@colorstudy.com>
    <20080930162559.GA11804@amk-desktop.matrixgroup.net>
    <20080930175201.106863A409C@sparrow.telecommunity.com>
    <9b06ffb10809301121r589e1784je1300fb2632ab46@mail.gmail.com>
    <94bdd2610809301341j60830399s37cf310939783b86@mail.gmail.com>
Message-ID: <20081001132137.GA7882@amk-desktop.matrixgroup.net>

On Tue, Sep 30, 2008 at 10:41:12PM +0200, Tarek Ziadé wrote:
> - can a RDF-based database can possibly handle such a graph ?
> - would it make sense for PyPI to query the doap server to get those
> dependency infos ?

Good RDF tools can handle really big databases: DBpedia
(http://dbpedia.org/) is 116 million triples.

On the other hand, I don't think the running PyPI installation should
query some other server because it'll add another point of failure.

--amk

From ziade.tarek at gmail.com Wed Oct 1 15:24:38 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 1 Oct 2008 15:24:38 +0200
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <20081001131609.GW30869@nexus.in-nomine.org>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <20081001131609.GW30869@nexus.in-nomine.org>
Message-ID: <94bdd2610810010624t33fef42cs1268c05d63adf4d1@mail.gmail.com>

On Wed, Oct 1, 2008 at 3:16 PM, Jeroen Ruigrok van der Werven wrote:
> -On [20081001 14:10], Tarek Ziadé (ziade.tarek at gmail.com) wrote:
>>I have followed most of the threads from the past days, and we talked
>>a lot on IRC with people from Fedora, Debian, Enthought, TG2 on
>>possible enhancements
>
> Just a note: do not forget the BSD Unix systems when it comes to packaging
> and whatnot, it's quite a bit different from the Linux systems. Also,
> there's pkgsrc which spans multiple different platforms.

Yes indeed, a BSD packager could be helpful in the discussion. Are
you able to help us with this ?

Tarek

From asmodai at in-nomine.org Wed Oct 1 15:29:02 2008
From: asmodai at in-nomine.org (Jeroen Ruigrok van der Werven)
Date: Wed, 1 Oct 2008 15:29:02 +0200
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810010624t33fef42cs1268c05d63adf4d1@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <20081001131609.GW30869@nexus.in-nomine.org>
    <94bdd2610810010624t33fef42cs1268c05d63adf4d1@mail.gmail.com>
Message-ID: <20081001132902.GX30869@nexus.in-nomine.org>

-On [20081001 15:24], Tarek Ziadé (ziade.tarek at gmail.com) wrote:
>Yes indeed, a BSD packager could be helpful in the discussion. Are
>you able to help us with this ?

I used to be a FreeBSD and DragonFly BSD committer and I still use Python
mostly on BSD systems.
I am sure Trent Nelson could also help. I know Joerg Sonnenberger from
pkgsrc/NetBSD quite well too.

So all in all I should be able to. My time seems a bit more limited than
yours though. :)

--
Jeroen Ruigrok van der Werven / asmodai
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Dreams are like Angels, they keep bad at bay, bad at bay, Love is the Light,
scaring Darkness away...

From pje at telecommunity.com Wed Oct 1 19:29:32 2008
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 01 Oct 2008 13:29:32 -0400
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
Message-ID: <20081001172824.B1E0B3A4072@sparrow.telecommunity.com>

At 02:10 PM 10/1/2008 +0200, Tarek Ziadé wrote:
>Proposals
>=========
>
>this is also a synthesis of what I heard, and some elements I have
>added to respect the needs that were expressed.
>
>0/ a lot of work can be done to clean distutils, no matter what is
>decided (another PEP is built for that): cleaning, removing old-style
>code, testing
>
>1/ let's change the Python Metadata, in order to introduce a better
>dependency system, by
>
> - officially introducing "install requires" and "test requires"
> metadata in there
> - marking "requires" as deprecated
>
>2/ Let's move part of setuptools code in distutils, to respect those changes.
>
>3/ let's create a simple convention : the metadata should be expressed
>in a python module called 'pkginfo.py'
> where each metadata is a variable.
>
> that can be used by setup.py and therefore by any tool that works
>with it, even if it does not run
> a setup.py command.
>
> This is simpler, this is cleaner. you don't have to run some setup
>magic to read them.
> at least some magic introduced by commands

I'm -1 on all of the above.
I think we need a standard for tools interop (ala WSGI), not
implementation tweaks for the existing tools. I also think that a
concrete metadata format proposal is premature at this time; we've
barely begun to gather -- let alone specify -- our requirements for
that metadata. (Essentially, only version dependencies have been
discussed, AFAICT.)

There have been many people agreeing that the distutils are thoroughly
broken and a new approach is needed; these proposals sound like minor
tweaks to the existing infrastructure, rather than a way to get rid of
it. So to me, the above doesn't seem like a synthesis of the threads
that I've been reading.

>4/ let's change PyPI to make it work with the new metadata and to
>enforce a few things
>
>Enforcements:
> - a binary distribution cannot be uploaded if a source distribution
>has not been previously provided for the version

Note that this doesn't allow closed-source packages to be uploaded;
thus it would need to be a warning, rather than a requirement.

>New features:
> - we should be able to download the metadata of a package without
>downloading the package
> - PyPI should display the install and test dependencies in the UI

It could only do this for specific binaries, since dependencies can be
dynamic.

From ziade.tarek at gmail.com Wed Oct 1 19:55:59 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Wed, 1 Oct 2008 19:55:59 +0200
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <20081001172824.B1E0B3A4072@sparrow.telecommunity.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <20081001172824.B1E0B3A4072@sparrow.telecommunity.com>
Message-ID: <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com>

On Wed, Oct 1, 2008 at 7:29 PM, Phillip J. Eby wrote:
> I'm -1 on all of the above. I think we need a standard for tools interop
> (ala WSGI), not implementation tweaks for the existing tools. I also think
> that a concrete metadata format proposal is premature at this time; we've
> barely begun to gather -- let alone specify -- our requirements for that
> metadata. (Essentially, only version dependencies have been discussed,
> AFAICT.)

What are the other important points we need to discuss at this point
in your opinion ?

>
> There have been many people agreeing that the distutils are thoroughly
> broken and a new approach is needed; these proposals sound like minor
> tweaks to the existing infrastructure, rather than a way to get rid of it.
> So to me, the above doesn't seem like a synthesis of the threads that I've
> been reading.
>
>
>> 4/ let's change PyPI to make it work with the new metadata and to
>> enforce a few things
>>
>> Enforcements:
>> - a binary distribution cannot be uploaded if a source distribution
>> has not been previously provided for the version
>
> Note that this doesn't allow closed-source packages to be uploaded; thus it
> would need to be a warning, rather than a requirement.
>

Right. do you agree it is something useful to do ?

>
>> New features:
>> - we should be able to download the metadata of a package without
>> downloading the package
>> - PyPI should display the install and test dependencies in the UI
>
> It could only do this for specific binaries, since dependencies can be
> dynamic.
>
>

What does dynamic mean here ? the python module to static file process
or more ? can you provide an example ?

Regards
Tarek

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From pje at telecommunity.com Wed Oct 1 20:28:07 2008
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 01 Oct 2008 14:28:07 -0400
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <20081001172824.B1E0B3A4072@sparrow.telecommunity.com>
    <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com>
Message-ID: <20081001182658.E96173A4072@sparrow.telecommunity.com>

At 07:55 PM 10/1/2008 +0200, Tarek Ziadé wrote:
>On Wed, Oct 1, 2008 at 7:29 PM, Phillip J. Eby wrote:
> > I'm -1 on all of the above. I think we need a standard for tools interop
> > (ala WSGI), not implementation tweaks for the existing tools. I also think
> > that a concrete metadata format proposal is premature at this time; we've
> > barely begun to gather -- let alone specify -- our requirements for that
> > metadata. (Essentially, only version dependencies have been discussed,
> > AFAICT.)
>
>What are the other important points we need to discuss at this point
>in your opinion ?

What information needs to be conveyed by a "build" tool to an
"install" tool, and vice versa.

For example, an install tool needs to know what files are
documentation, which are sample data, and what is part of the library
proper, as well as what things are scripts (and in what language those
scripts are written, e.g. Python, shell, batch, etc.). Some install
tools need to know about icons, menus, registry info, cron jobs, etc.
(These are perhaps more properly the domain of applications than
libraries, but I'm going to assume that these things are in scope.)

The way that this information is communicated needs to be extensible,
so that optional metadata for debs and msi's and rpm's and whathaveyou
can be incorporated, without needing to modify the standard --
especially if the APIs for reading and writing this data are in the
stdlib.

There needs to be a way for install tools to ask a source package to
"configure" itself, possibly specifying options (and a way for it to
find out what those options are, to be able to present them to the
user). Then there needs to be a way for the install tool to ask the
package to build itself with the configured options, and a standard
for how the build tool(s) communicate errors or other issues back to
the install tools.

There needs to be a way, of course, for the package to specify what
build tools it needs in order to be built, and for those to plug in to
the (again stdlib-contained) build API.

There needs to be a better API for querying C configuration and
compiler details, that's separate from the distutils "ccompiler"
stuff.

Last, but not least, there needs to be a definition of core build and
install tools to be both an example/reference implementation of the
standard, and to provide the basics needed by the core.

I think I've mentioned all of these previously in the thread. I also
think that as a matter of technicalities, these things are not
difficult to achieve... but if they are only achieved *technically*,
then the result is a failure, not a success.

In order for the *real* goal to be achieved (i.e., a flourishing
build/install system for Python), widespread participation and buy-in
is required. If the OS people or the big package people (e.g. Zope
Corp., Enthought) or the distutils aficionados are left out, then the
result won't get used.

I think the best way to ensure that nobody is left out, is to get them
to participate in the design of a standard that ensures that *they*
will be able to control their destiny, by creating their own build
plugins and/or install tools... or at least having a robust choice of
alternatives.

We need a consensus "de jure" standard (ala WSGI), rather than just an
ongoing "de facto" standard (ala distutils/setuptools), or we're not
making any substantial progress, just handing the reins over to
somebody else.

> >> Enforcements:
> >> - a binary distribution cannot be uploaded if a source distribution
> >> has not been previously provided for the version
> >
> > Note that this doesn't allow closed-source packages to be uploaded; thus it
> > would need to be a warning, rather than a requirement.
> >
>
>Right. do you agree it is something useful to do ?

Sure.

> >> New features:
> >> - we should be able to download the metadata of a package without
> >> downloading the package
> >> - PyPI should display the install and test dependencies in the UI
> >
> > It could only do this for specific binaries, since dependencies can be
> > dynamic.
> >
> >
>
>What does dynamic mean here ? the python module to static file process
>or more ? can you provide an example ?

Dependencies can be platform-specific as well as python-version
specific. If you want ctypes, you would depend on it in python 2.4 but
not in 2.5. Similarly, if on some platform a certain library is
required to implement a feature that is natively accessible on other
platforms. In these cases, you would have logic in setup.py to detect
this and choose the appropriate dependencies.

From martin at v.loewis.de Wed Oct 1 20:38:07 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 01 Oct 2008 20:38:07 +0200
Subject: [Catalog-sig] Does package_releases() always return all version numbers?
In-Reply-To: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de>
References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de>
Message-ID: <48E3C38F.60409@v.loewis.de>

> versions = server.package_releases(name)

Try

  versions = server.package_releases(name,True)

This will also report hidden releases.
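
The difference between the two calls can be sketched as follows. This is a minimal illustration rather than PyPI's own code: `list_releases` and `connect` are hypothetical helpers, the sketch is written for Python 3 (where `xmlrpclib` is named `xmlrpc.client`), and it assumes the second positional argument of `package_releases` is the show-hidden flag described above.

```python
import xmlrpc.client

PYPI_URL = "http://pypi.python.org/pypi"  # endpoint used earlier in this thread


def connect(url=PYPI_URL):
    """Build an XML-RPC proxy; no request is sent until a method is called."""
    return xmlrpc.client.ServerProxy(url)


def list_releases(server, name, show_hidden=False):
    """Return the version strings the server reports for `name`.

    The result is sorted as plain strings, which is not a real
    version-aware ordering.
    """
    if show_hidden:
        # Assumption: passing True as second argument also lists hidden releases.
        versions = server.package_releases(name, True)
    else:
        versions = server.package_releases(name)
    return sorted(versions)
```

Calling `list_releases(connect(), "pdfnup")` and then the same with `show_hidden=True` would let one compare the two answers, provided the server still answers these calls.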

Regards,
Martin

From gael.varoquaux at normalesup.org Wed Oct 1 21:05:59 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 1 Oct 2008 21:05:59 +0200
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <20081001182658.E96173A4072@sparrow.telecommunity.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <20081001172824.B1E0B3A4072@sparrow.telecommunity.com>
    <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com>
    <20081001182658.E96173A4072@sparrow.telecommunity.com>
Message-ID: <20081001190559.GB585@phare.normalesup.org>

On Wed, Oct 01, 2008 at 02:28:07PM -0400, Phillip J. Eby wrote:
> In order for the *real* goal to be achieved (i.e., a flourishing
> build/install system for Python), widespread participation and buy-in is
> required. If the OS people or the big package people (e.g. Zope Corp.,
> Enthought) or the distutils aficionados are left out, then the result won't
> get used.
>
> [...]
>
> We need a consensus "de jure" standard (ala WSGI), rather than just an
> ongoing "de facto" standard (ala distutils/setuptools), or we're not making
> any substantial progress, just handing the reins over to somebody else.

Nice words. I like very much the sound of this discussion.

Gaël

From ziade.tarek at gmail.com Thu Oct 2 03:10:59 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Thu, 2 Oct 2008 03:10:59 +0200
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <20081001182658.E96173A4072@sparrow.telecommunity.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <20081001172824.B1E0B3A4072@sparrow.telecommunity.com>
    <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com>
    <20081001182658.E96173A4072@sparrow.telecommunity.com>
Message-ID: <94bdd2610810011810h3ea48e6cmb8d223cc24e1cf85@mail.gmail.com>

On Wed, Oct 1, 2008 at 8:28 PM, Phillip J. Eby wrote:
> At 07:55 PM 10/1/2008 +0200, Tarek Ziadé wrote:
>>
>> On Wed, Oct 1, 2008 at 7:29 PM, Phillip J. Eby
>> wrote:
>> > I'm -1 on all of the above. I think we need a standard for tools
>> > interop
>> > (ala WSGI), not implementation tweaks for the existing tools. I also
>> > think
>> > that a concrete metadata format proposal is premature at this time;
>> > we've
>> > barely begun to gather -- let alone specify -- our requirements for that
>> > metadata. (Essentially, only version dependencies have been discussed,
>> > AFAICT.)
>>
>> What are the other important points we need to discuss at this point
>> in your opinion ?
>
> What information needs to be conveyed by a "build" tool to an "install"
> tool, and vice versa.
>
> For example, an install tool needs to know what files are documentation,
> which are sample data, and what is part of the library proper, as well as
> what things are scripts (and in what language those scripts are written,
> e.g. Python, shell, batch, etc.). Some install tools need to know about
> icons, menus, registry info, cron jobs, etc. (These are perhaps more
> properly the domain of applications than libraries, but I'm going to assume
> that these things are in scope.)
>
> The way that this information is communicated needs to be extensible, so
> that optional metadata for debs and msi's and rpm's and whathaveyou can be
> incorporated, without needing to modify the standard -- especially if the
> APIs for reading and writing this data are in the stdlib.

So basically with this system, this would mean that an ini-file in my
package would be marked as a configuration file for the installer API,
and that third party tools would decide what to do with this file.

This would mean that my program would have to access that file through
an API as well, to get back to the ini-file in the system.

how would it work ?

> I think I've mentioned all of these previously in the thread. I also think
> that as a matter of technicalities, these things are not difficult to
> achieve... but if they are only achieved *technically*, then the result
> is a failure, not a success.
>
> In order for the *real* goal to be achieved (i.e., a flourishing
> build/install system for Python), widespread participation and buy-in is
> required. If the OS people or the big package people (e.g. Zope Corp.,
> Enthought) or the distutils aficionados are left out, then the result won't
> get used.

Well yes, that is basically what we have been trying to build for a
few days, but threads are not linear so people can't keep up and jump
in like that.

So maybe you mentioned that idea before in the thread, but if we want
people to buy the idea, it should be put in the wiki imho, even
prematurely, and built there, starting from a small set of points.

I mean, you said earlier that it is premature to write a concrete
document, but it's hard to follow in threads the ideas proposed.

That was the idea of my early proposal : start a synthetic list of
problems and for each problem possible solutions. Then discuss them in
the ML and make the right one rise.

That is just my 2 cents on how the discussions go

>
> I think the best way to ensure that nobody is left out, is to get them to
> participate in the design of a standard that ensures that *they* will be
> able to control their destiny, by creating their own build plugins and/or
> install tools... or at least having a robust choice of alternatives.
>
> We need a consensus "de jure" standard (ala WSGI), rather than just an
> ongoing "de facto" standard (ala distutils/setuptools), or we're not making
> any substantial progress, just handing the reins over to somebody else.

+10

>
>
>> >> Enforcements:
>> >> - a binary distribution cannot be uploaded if a source distribution
>> >> has not been previously provided for the version
>> >
>> > Note that this doesn't allow closed-source packages to be uploaded; thus
>> > it
>> > would need to be a warning, rather than a requirement.
>> >
>>
>> Right. do you agree it is something useful to do ?
>
> Sure.

Ok so maybe this is *one* problem we can already solve at PyPI with a
patch.

>> >> - PyPI should display the install and test dependencies in the UI
>> >
>> > It could only do this for specific binaries, since dependencies can be
>> > dynamic.
>> >
>> >
>>
>> What does dynamic mean here ? the python module to static file process or more
>> ?
>> can you provide an example ?
>
> Dependencies can be platform-specific as well as python-version specific.
> If you want ctypes, you would depend on it in python 2.4 but not in 2.5.
> Similarly, if on some platform a certain library is required to implement a
> feature that is natively accessible on other platforms. In these cases, you
> would have logic in setup.py to detect this and choose the appropriate
> dependencies.

ok, right, it is not possible with what we have now.

If each dependency was marked as platform-specific or python-version
specific in the metadata when it is necessary, we would know them all
without calling extra detection code to build them.
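
One way to picture dependencies that are marked as python-version or platform specific is sketched below in Python 3. Everything in it is hypothetical: the `DEPENDENCIES` table, its `python_below` and `platform` keys, and the `select_dependencies` helper are invented for illustration and are not part of distutils, setuptools, or any metadata format proposed in this thread; `somelib-win32` is a made-up package name.

```python
import sys

# Static description: each entry names a requirement plus optional conditions.
# The keys "python_below" and "platform" are invented for this sketch.
DEPENDENCIES = [
    {"name": "ctypes", "python_below": (2, 5)},       # in the stdlib from 2.5 on
    {"name": "somelib-win32", "platform": "win32"},   # made-up platform-specific package
    {"name": "B>=1.2"},                               # unconditional
]


def select_dependencies(deps, python_version, platform):
    """Return the names of the entries whose conditions match the target."""
    selected = []
    for dep in deps:
        limit = dep.get("python_below")
        if limit is not None and python_version >= limit:
            continue  # only needed on older interpreters
        wanted = dep.get("platform")
        if wanted is not None and wanted != platform:
            continue  # only needed on another platform
        selected.append(dep["name"])
    return selected


def dependencies_for_this_interpreter(deps=DEPENDENCIES):
    """Evaluate the table for the interpreter running this code."""
    return select_dependencies(deps, sys.version_info[:2], sys.platform)
```

Because the table is plain data, a tool such as PyPI could in principle list every possible dependency without running any detection code, and an installer could evaluate the conditions for its own target.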
I hate the idea of dynamic metadata, in fact. I can't express precisely
why at this point.

>
>

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From gherman at darwin.in-berlin.de Thu Oct 2 10:12:19 2008
From: gherman at darwin.in-berlin.de (Dinu Gherman)
Date: Thu, 2 Oct 2008 10:12:19 +0200
Subject: [Catalog-sig] Does package_releases() always return all version numbers?
In-Reply-To: <48E3C38F.60409@v.loewis.de>
References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de>
    <48E3C38F.60409@v.loewis.de>
Message-ID:

Martin v. Löwis:

> Try
>
> versions = server.package_releases(name,True)
>
> This will also report hidden releases.

Great news, thanks! Is there a way to find this out by inspecting
web pages, APIs, or anything else? The only documentation I found
is listed here (and this does not mention the second parameter):

http://wiki.python.org/moin/PyPiXmlRpc

And the method does not contain a docstring, unfortunately:

    >>> import xmlrpclib
    >>> serverUrl = "http://pypi.python.org/pypi"
    >>> server = xmlrpclib.Server(serverUrl)
    >>> print server.package_releases.__doc__
    None

More generally, how can I find out which code is actually running
the PyPI server?

Regards,

Dinu

From martin at v.loewis.de Thu Oct 2 21:15:56 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 02 Oct 2008 21:15:56 +0200
Subject: [Catalog-sig] Does package_releases() always return all version numbers?
In-Reply-To:
References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de>
    <48E3C38F.60409@v.loewis.de>
Message-ID: <48E51DEC.3040000@v.loewis.de>

> Great news, thanks! Is there a way to find this out by inspecting
> web pages, APIs, or anything else?
It's possible by inspecting the code; in this specific case, in

https://svn.python.org/packages/trunk/pypi/rpc.py

> The only documentation I found
> is listed here (and this does not mention the second parameter):
>
> http://wiki.python.org/moin/PyPiXmlRpc

If you compare the code and the wiki page, please feel free to make
any necessary corrections to the wiki.

> And the method does not contain a docstring, unfortunately:
>
> >>> import xmlrpclib
> >>> serverUrl = "http://pypi.python.org/pypi"
> >>> server = xmlrpclib.Server(serverUrl)
> >>> print server.package_releases.__doc__
> None

Hmm. I don't think XML-RPC supports fetching doc strings from the
remote implementation (although that might be a cool idea).

> More generally, how can I find out which code is actually running
> the PyPI server?

See above.

Regards,
Martin

From martin at v.loewis.de Thu Oct 2 22:28:14 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 02 Oct 2008 22:28:14 +0200
Subject: [Catalog-sig] Does package_releases() always return all version numbers?
In-Reply-To: <078701c924cb$c6ea1380$54be3a80$@net>
References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de>
    <48E3C38F.60409@v.loewis.de>
    <48E51DEC.3040000@v.loewis.de>
    <078701c924cb$c6ea1380$54be3a80$@net>
Message-ID: <48E52EDE.3060901@v.loewis.de>

> I sent in a patch to allow the normal system.* functions (including getting
> signatures and documentation) a while ago, but never got a response.
> http://mail.python.org/pipermail/catalog-sig/2008-May/001679.html

Can you please use the PyPI tracker for such contributions? It's at

http://sourceforge.net/projects/pypi

resp.

http://sourceforge.net/tracker/?atid=513503&group_id=66150&func=browse

I was either unaware of your message, or have forgotten about it.
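
The system.* functions mentioned in this exchange are the conventional
XML-RPC introspection methods (system.listMethods, system.methodHelp,
system.methodSignature). The sketch below shows how a server that
registers them lets a client read a method's documentation over the
wire. It assumes the Python 3 standard library (xmlrpc.server and
xmlrpc.client, the successors of the xmlrpclib module used above); the
toy package_releases function and its data are hypothetical stand-ins,
not the code in PyPI's rpc.py, and the request is dispatched in-process
instead of over HTTP.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher


def package_releases(name, show_hidden=False):
    """Return release versions of `name`; include hidden ones if show_hidden."""
    # Hypothetical data: (version, hidden) pairs for one package.
    releases = {"example": [("1.0", True), ("1.1", False)]}
    return [version for version, hidden in releases.get(name, [])
            if show_hidden or not hidden]


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(package_releases)
# Adds system.listMethods, system.methodHelp and system.methodSignature.
dispatcher.register_introspection_functions()


def call(method, *params):
    """Marshal an XML-RPC request, dispatch it in-process, unmarshal the reply."""
    request = xmlrpc.client.dumps(params, methodname=method).encode("utf-8")
    response = dispatcher._marshaled_dispatch(request)
    (result,), _ = xmlrpc.client.loads(response)
    return result


print(call("system.listMethods"))
print(call("system.methodHelp", "package_releases"))
print(call("package_releases", "example", True))
```

Against a real server that exposes these methods, the same information
would be available through xmlrpc.client.ServerProxy, for example
server.system.methodHelp("package_releases").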
Regards, Martin From noah at coderanger.net Thu Oct 2 22:16:37 2008 From: noah at coderanger.net (Noah Kantrowitz) Date: Thu, 2 Oct 2008 13:16:37 -0700 Subject: [Catalog-sig] Does package_releases() always return all version numbers? In-Reply-To: <48E51DEC.3040000@v.loewis.de> References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de> <48E3C38F.60409@v.loewis.de> <48E51DEC.3040000@v.loewis.de> Message-ID: <078701c924cb$c6ea1380$54be3a80$@net> > -----Original Message----- > From: catalog-sig-bounces+noah=coderanger.net at python.org > [mailto:catalog-sig-bounces+noah=coderanger.net at python.org] On Behalf > Of "Martin v. L?wis" > Sent: Thursday, October 02, 2008 12:16 PM > Cc: catalog-sig at python.org > Subject: Re: [Catalog-sig] Does package_releases() always return all > version numbers? > > > Great news, thanks! Is there a way to find this out by inspecting > > web pages, APIs, or anything else? I sent in a patch to allow the normal system.* functions (including getting signatures and documentation) a while ago, but never got a response. http://mail.python.org/pipermail/catalog-sig/2008-May/001679.html --Noah From amk at amk.ca Thu Oct 2 23:39:26 2008 From: amk at amk.ca (A.M. Kuchling) Date: Thu, 2 Oct 2008 17:39:26 -0400 Subject: [Catalog-sig] Does package_releases() always return all version numbers? In-Reply-To: <078701c924cb$c6ea1380$54be3a80$@net> References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de> <48E3C38F.60409@v.loewis.de> <48E51DEC.3040000@v.loewis.de> <078701c924cb$c6ea1380$54be3a80$@net> Message-ID: <20081002213926.GA20615@amk-desktop.matrixgroup.net> On Thu, Oct 02, 2008 at 01:16:37PM -0700, Noah Kantrowitz wrote: > I sent in a patch to allow the normal system.* functions (including getting > signatures and documentation) a while ago, but never got a response. 
> http://mail.python.org/pipermail/catalog-sig/2008-May/001679.html A stray thought: people wanting to experiment with their own PyPI installations could probably use Bazaar and its bzr-svn plugin to maintain private branches of PyPI that could then be published on their own servers or Launchpad. --amk From chris at simplistix.co.uk Fri Oct 3 17:01:44 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 03 Oct 2008 16:01:44 +0100 Subject: [Catalog-sig] [Distutils] PEP for distutils In-Reply-To: <20080930175201.106863A409C@sparrow.telecommunity.com> References: <94bdd2610809280555p12c0e326r4a867bd3b67efbd9@mail.gmail.com> <48E23F2D.3000802@simplistix.co.uk> <20080930153822.GJ26878@phare.normalesup.org> <48E24897.8010800@colorstudy.com> <20080930162559.GA11804@amk-desktop.matrixgroup.net> <20080930175201.106863A409C@sparrow.telecommunity.com> Message-ID: <48E633D8.1090808@simplistix.co.uk> Phillip J. Eby wrote: > Nope. And it can't possibly do so, unless it contains dependency data > for every possible variation of the package. For example, a package > might dynamically declare dependency on ctypes, depending on whether > you're installing it for Python 2.4 or Python 2.5. (Dependencies can > also be platform-specific and build-option-specific, as well as > Python-version-specific.) Ug. No-one actually does this do they? 
man, setup.py both sucks and blows at the same time :-( Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Fri Oct 3 17:18:48 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 03 Oct 2008 16:18:48 +0100 Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals In-Reply-To: <20081001172824.B1E0B3A4072@sparrow.telecommunity.com> References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <20081001172824.B1E0B3A4072@sparrow.telecommunity.com> Message-ID: <48E637D8.9070004@simplistix.co.uk> Phillip J. Eby wrote: > I think we need a standard for tools > interop (ala WSGI), not implementation tweaks for the existing tools. Agreed. >> 4/ let's change PyPI to make it work with the new metadata and to >> enforce a few things >> >> Enforcements: >> - a binary distribution cannot be uploaded if a source distrbution >> has not been previously provided for the version > > Note that this doesn't allow closed-source packages to be uploaded; thus > it would need to be a warning, rather than a requirement. This is an important point. We can't assume any one repository will have all needed packages. > It could only do this for specific binaries, since dependencies can be > dynamic. They should not be dynamic :-( Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From pje at telecommunity.com Fri Oct 3 17:59:14 2008 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 03 Oct 2008 11:59:14 -0400 Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals In-Reply-To: <48E637D8.9070004@simplistix.co.uk> References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <20081001172824.B1E0B3A4072@sparrow.telecommunity.com> <48E637D8.9070004@simplistix.co.uk> Message-ID: <20081003155805.15F023A40D9@sparrow.telecommunity.com> At 04:18 PM 10/3/2008 +0100, Chris Withers wrote: >Phillip J. Eby wrote: >>It could only do this for specific binaries, since dependencies can >>be dynamic. > >They should not be dynamic :-( Too bad. They are, because they have to be in order to support more than one platform and/or Python version from the same source base. From gherman at darwin.in-berlin.de Sat Oct 4 18:46:22 2008 From: gherman at darwin.in-berlin.de (Dinu Gherman) Date: Sat, 4 Oct 2008 18:46:22 +0200 Subject: [Catalog-sig] Does package_releases() always return all version numbers? In-Reply-To: <48E51DEC.3040000@v.loewis.de> References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de> <48E3C38F.60409@v.loewis.de> <48E51DEC.3040000@v.loewis.de> Message-ID: Martin v. L?wis: >> The only documentation I found >> is listed here (and this does not mention the second parameter): >> >> http://wiki.python.org/moin/PyPiXmlRpc > > If you compare the code and the wiki page, please feel free to make > any necessary corrections to the wiki. Done. I'm still left with the feeling, though, that this function is a good example for a mismatch in its intent as expressed in the function name, "get package releases" (note the plural form), and the default value for its "show_hidden" parameter (False). > It's possible by inspecting the code; in the specific case, in > > https://svn.python.org/packages/trunk/pypi/rpc.py In fact, this, too, would benefit from a couple of docstrings. Regards, Dinu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From martin at v.loewis.de Sat Oct 4 19:02:12 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 04 Oct 2008 19:02:12 +0200 Subject: [Catalog-sig] Does package_releases() always return all version numbers? In-Reply-To: References: <5A441D61-EB18-4242-85E4-0324280969AD@darwin.in-berlin.de> <48E3C38F.60409@v.loewis.de> <48E51DEC.3040000@v.loewis.de> Message-ID: <48E7A194.4010700@v.loewis.de> >> If you compare the code and the wiki page, please feel free to make >> any necessary corrections to the wiki. > Done. Thanks! > I'm still left with the feeling, though, that this function > is a good example for a mismatch in its intent as expressed > in the function name, "get package releases" (note the plural > form), and the default value for its "show_hidden" parameter > (False). There is a lot discussion on the "hidden" feature of PyPI. Even if you set show_hidden to False, you may still get multiple packages, if there are multiple non-hidden ones - it's not required that all releases are hidden but one. As a package owner, you have a Hide? drop-down for each release, which you can set as you wish. I've rephrased the wiki, but it's probably possible to use better words to describe this flag. >> https://svn.python.org/packages/trunk/pypi/rpc.py > > In fact, this, too, would benefit from a couple of docstrings. Certainly. Much of it is a shallow wrapper around a store.py routine; those do have docstrings. Regards, Martin From zooko at zooko.com Sun Oct 5 18:04:20 2008 From: zooko at zooko.com (zooko) Date: Sun, 5 Oct 2008 10:04:20 -0600 Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals In-Reply-To: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> Message-ID: <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> Thanks for the synthesis, Tarek. 
I have some experience using current Python packaging in the field -- the Tahoe project [1] -- and so I would like to throw in what I know of what is currently working and what is currently needed and what isn't a big deal to me. This doesn't, of course, mean that other people might value things that I don't, but at least the following opinions of mine are won from hard experience. On Oct 1, 2008, at 6:10 AM, Tarek Ziad? wrote: > 1/ the dependencies of a package are not expressed in the Require > metadata of the package most of the time. +2 -- This is the biggest problem. The dependencies are not expressed *anywhere* in the metadata of the package most of the time. We need a de jure and de facto way to express dependencies so that developers will actually write them down. > Furthermore, developer tend to use setuptools "install_requires" > and "tests_require" arguments to express dependencies. > > So basically, you have to run an egg_info to get them, because the > info files are generated by commands. +0 -- I can see how this could be done better, but it isn't a pressing problem us. The current mechanism to get that dependency information at build/develop/install time works okay. > 2/ the existence of PyPI had a side-effect: people tend to push the > entire doc of the package in one single field (long_description) > to display them at PyPI. The documentation of the package is not > cleary pointed to others. +0 -- I would like more structured docs because then I could patch stdeb [2] to put docs into /usr/share/docs/$PACKAGE on Debian. But it isn't a pressing problem for us (we currently kludge around that issue). > 3/ the metadata infos cannot be expressed in a static file by the > developer, because sometimes they are calculated by code. > while this very permissive, that is how it works but they are > tighted to argument passing to setup(). 
+0 -- I totally agree that a static, separate, declarative file containing just data and no code would be a nicer way to do this. But the current way is working for us. > 4/ PyPI lacks of direct information about dependencies. +? -- I don't know. It sounds like it would be a big improvement, but the current mechanism of discovering dependencies by downloading distributions and executing their setup.py's seems to be working. > 5/ ideally, they should be one and only one version of a given package > in an OS-based installation -1 -- This is the strong preference of the folks who package software for OSes -- Debian, Fedora, etc. -- but it is not necessarily the choice of the users who use their OSes. It is best for the Python packaging standards to be agnostic towards this, or at least to support both this desideratum and its opposite. > 6/ packagers would like to have more compatibility information to work > out on security upgrades or version conflicts > 7/ developers should be able to have more options when they define > version dependencies in their packages, things like: > A depends on B>=1.2 and B<=2.0 but with a preference to B 1.4 > or "avoid B 1.7" > > they give tips to packagers ! +0 -- If we try to do better than Debian and Fedora already do then this risks being a science project -- i.e. something that will take a few years and might or might not pan out. If we try to just ape them and learn from their decade's worth of mistakes then this is probably doable. > The developer dependencies infos is a tip and a help for a packager, > not an enforcement. see [7] +1 -- In around 95% of the cases that I've seen, the developer's dependencies info was good enough. But, people have to be able to do something about the other 5%, so they have to be able to override developer-provided dependency information with their own. 
Obviously they can do this by patching or runtime-patching or maintaining their own branch, but we should specify a standard, principled way to do it instead. > 11/ people should always upload the sdist version at PyPI, they > don't do it always. otherwise it is a pain for packagers. +1 -- sdist format should be encouraged. > 1/ let's change the Python Metadata , in order to introduce a better > dependency system, by > > - officialy introduce "install requires" and "test requires" > metadata in there > - mark "requires" as deprecated +1 > 2/ Let's move part of setuptools code in distutils, to respect > those changes. +1 > 3/ let's create a simple convention : the metadata should be expressed > in a python module called 'pkginfo.py' > where each metadata is a variable. > > that can be used by setup.py and therefore by any tool that work > with it, even if it does not run > a setup.py command. > > This is simpler, this is cleaner. you don't have to run some setup > magic to read them. > at least some magic introduces by commands Uh... I thought the idea was to *not* have arbitrary Python code executed in this part? How about a flat file that people can reliably parse with, say, "grep", to learn about metadata. > - a binary distribution cannot be uploaded if a source distrbution > has not been previously provided for the version > - the requires-python need to be present. : come on, you know what > python versions your package work with ! +1 > - we should be able to download the metadata of a package without > downloading the package > - PyPI should display the install and test dependencies in the UI > - The XML-RPC should provide this new metadata as well. 
> - a commenting system should allow developers and packagers to > give more infos on a package at PyPI > to make the work easier +1 Regards, Zooko [1] http://allmydata.org/trac/tahoe [2] http://stdeb.python-hosting.com/ --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month From zooko at zooko.com Sun Oct 5 18:13:37 2008 From: zooko at zooko.com (zooko) Date: Sun, 5 Oct 2008 10:13:37 -0600 Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals In-Reply-To: <94bdd2610810011810h3ea48e6cmb8d223cc24e1cf85@mail.gmail.com> References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <20081001172824.B1E0B3A4072@sparrow.telecommunity.com> <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com> <20081001182658.E96173A4072@sparrow.telecommunity.com> <94bdd2610810011810h3ea48e6cmb8d223cc24e1cf85@mail.gmail.com> Message-ID: On Oct 1, 2008, at 19:10 PM, Tarek Ziad? wrote: > I hate the idea of dynamic metadata in fact. I can't express precisely > why at that point. Me too and me too. Perhaps it would help to distinguish between requiring a certain functionality and requiring a specific codebase which implements that functionality. For example: distribution A requires the functionality of ctypes. That part is statically, declaratively always true. However, distribution A doesn't necessarily require a *distribution* named "ctypes". If you are running on Python 2.6, then that functionality is already present. If there is a new distribution out there named "new_ctypes" which provides the same functionality and the same interface but is a completely different code base, then the presence of "new_ctypes" satisfies distribution A's requirements. The former question is simple, static, and declarative. The latter question isn't. 
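
The distinction drawn above, between requiring a functionality and
requiring a particular distribution, can be sketched as a small lookup.
This is only an illustration under stated assumptions: the
STDLIB_SINCE and PROVIDERS tables and the is_satisfied helper are
hypothetical, "new_ctypes" is the made-up distribution from the
message, and Python 2.5 is taken as the release that first bundled
ctypes, as stated earlier in the thread.

```python
# Capability -> first Python version whose standard library provides it.
STDLIB_SINCE = {"ctypes": (2, 5)}

# Capability -> distributions that implement the same interface.
PROVIDERS = {"ctypes": ["ctypes", "new_ctypes"]}


def is_satisfied(capability, python_version, installed):
    """Return True if `capability` is available on this installation.

    The requirement itself ("needs ctypes functionality") is static;
    only the answer depends on the Python version and on which
    distributions are installed.
    """
    since = STDLIB_SINCE.get(capability)
    if since is not None and python_version >= since:
        return True
    return any(dist in installed for dist in PROVIDERS.get(capability, ()))


print(is_satisfied("ctypes", (2, 6), installed=set()))
print(is_satisfied("ctypes", (2, 4), installed={"new_ctypes"}))
print(is_satisfied("ctypes", (2, 4), installed=set()))
```

The declaration stays the same for every target; the installer or
packager supplies the environment-specific inputs.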
In most cases there is only one implementation of a given interface, so we make do by equating the interface with the implementation. I wonder how Debian and Fedora handle this sort of issue? Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month From a.badger at gmail.com Mon Oct 6 07:27:18 2008 From: a.badger at gmail.com (Toshio Kuratomi) Date: Sun, 05 Oct 2008 22:27:18 -0700 Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals In-Reply-To: References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <20081001172824.B1E0B3A4072@sparrow.telecommunity.com> <94bdd2610810011055j69411c6fu6f417cf3cf930460@mail.gmail.com> <20081001182658.E96173A4072@sparrow.telecommunity.com> <94bdd2610810011810h3ea48e6cmb8d223cc24e1cf85@mail.gmail.com> Message-ID: <48E9A1B6.2050607@gmail.com> zooko wrote: > On Oct 1, 2008, at 19:10 PM, Tarek Ziad? wrote: > >> I hate the idea of dynamic metadata in fact. I can't express precisely >> why at that point. > > Me too and me too. > > Perhaps it would help to distinguish between requiring a certain > functionality and requiring a specific codebase which implements that > functionality. > > For example: distribution A requires the functionality of ctypes. That > part is statically, declaratively always true. > > However, distribution A doesn't necessarily require a *distribution* > named "ctypes". If you are running on Python 2.6, then that > functionality is already present. If there is a new distribution out > there named "new_ctypes" which provides the same functionality and the > same interface but is a completely different code base, then the > presence of "new_ctypes" satisfies distribution A's requirements. > > The former question is simple, static, and declarative. The latter > question isn't. 
>
> In most cases there is only one implementation of a given interface, so
> we make do by equating the interface with the implementation.
>
> I wonder how Debian and Fedora handle this sort of issue?
>

With python modules we just require one thing providing the interface.
Let's say that elementtree was merged into python-2.5. And let's say
that we got python-2.5 as the default python in Fedora 7. Since we only
have one version of python in any release of Fedora we do something
like this:

Requires: python
%if 0%{?fedora} < 7
Requires: python-elementtree
%endif

We are thinking of enhancing what dependency information we Require and
Provide (the problem being... we want to do this automatically.) If we
get that working, we could do things like:

Requires: python(elementtree)

and in Fedora 6, python-elementtree would have:

Provides: python(elementtree)

whereas in Fedora 7+, the python package would have:

Provides: python(elementtree)

Note that this information is not as easy to get to as the metadata
provided by eggs, so we're still trying to come up with a script that
will generate this data automatically.

-Toshio

From charlesw123456 at gmail.com Mon Oct 6 08:09:13 2008
From: charlesw123456 at gmail.com (li wang)
Date: Mon, 6 Oct 2008 14:09:13 +0800
Subject: [Catalog-sig] If PyPI is more strict with its packages, may be we can build binary packages from them directly.
Message-ID:

hi~

I'm writing pypi2pkgsys: http://code.google.com/p/pypi2pkgsys/ .

I noticed that the names and licenses of Python modules registered in
PyPI are really a mess. With a name such as 'Are You Human?', even
easy_install cannot install the package.
Most of the linux user use a linux distribution, such as fedora, ubuntu or gentoo, I'm not willing to install python module by easy-install because the local package management system will not know the python module is installed already. They will install them with some old version again. If PyPI is more strict in name, license and its format, automatically package install within the distribution package management system should be possible. By the way, PyPI is really slow here, mirror should be welcomed. And I have add the local mirror features into pypi2pkgsys. Thanks. Any comment is welcomed. Charles Oct 6th, 2008. From martin at v.loewis.de Mon Oct 6 08:25:05 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 06 Oct 2008 08:25:05 +0200 Subject: [Catalog-sig] If PyPI is more strict with its packages, may be we can build binary packages from them directly. In-Reply-To: References: Message-ID: <48E9AF41.5080804@v.loewis.de> > I'm writing pypi2pkgsys: http://code.google.com/p/pypi2pkgsys/ . > I noticed that the name, license of python modules registered in PyPI > is really a miss. Such as 'Are You Human?', even easy-install can not > install them with these strange name. I don't really see the problem. Sure, it is very difficult to fetch this record from PyPI. But then, it's the package author's fault if his package is inaccessible. If you have an automated tool to access packages, just skip over the packages that you cannot access. This wouldn't be very different from the case where PyPI would have been more strict: just presume that the package is not there if you don't like its name. > If PyPI is more strict in name, license and its format, automatically > package install within the distribution package management system should > be possible. But it is possible already! See above. While I can sympathize with a desire to enforce a certain package name syntax, I am unsure what licenses have to do with it. 
Why should PyPI enforce a policy on the license field, and what should that policy be? Regards, Martin From charlesw123456 at gmail.com Mon Oct 6 13:59:59 2008 From: charlesw123456 at gmail.com (li wang) Date: Mon, 6 Oct 2008 19:59:59 +0800 Subject: [Catalog-sig] If PyPI is more strict with its packages, may be we can build binary packages from them directly. In-Reply-To: <48E9AF41.5080804@v.loewis.de> References: <48E9AF41.5080804@v.loewis.de> Message-ID: hi~ 2008/10/6 "Martin v. L?wis" : >> I'm writing pypi2pkgsys: http://code.google.com/p/pypi2pkgsys/ . >> I noticed that the name, license of python modules registered in PyPI >> is really a miss. Such as 'Are You Human?', even easy-install can not >> install them with these strange name. > > I don't really see the problem. Sure, it is very difficult to fetch this > record from PyPI. But then, it's the package author's fault if his > package is inaccessible. Sure, of course it is a problem of the author. And this policy may help PyPI to collect more packages for users. But this fault will defeat the user but not the author, why user have to bear the the fault of the author? Now there are many packages in PyPI already, may be it is a time to let the author care about this problem to make the user more comfortable? :) > > If you have an automated tool to access packages, just skip over the > packages that you cannot access. This wouldn't be very different from > the case where PyPI would have been more strict: just presume that the > package is not there if you don't like its name. > >> If PyPI is more strict in name, license and its format, automatically >> package install within the distribution package management system should >> be possible. > > But it is possible already! See above. > > While I can sympathize with a desire to enforce a certain package name > syntax, I am unsure what licenses have to do with it. Why should PyPI > enforce a policy on the license field, and what should that policy be? 
In fact, pypi2pkgsys can scan the PyPI catalog and log all broken
packages automatically. Here are the log statistics:

$ sudo pypi-logstats.py /var/tmp/pypi/pypi2pkgsys.log
/var/tmp/pypi/pypi2pkgsys.log: 2902(59.95%) ok, 0( 0.00%) manual, 1939(40.05%) bad.

The causes of the breakage are diverse: a package may be broken by a
bad name, or by an unrecognized license (some people use GPL, some use
http://www.gnu.org/licenses/licenses/gpl.html, some use
http://www.opensource.org/licenses/gpl-license.php). Some people embed
the whole license text in the license argument of setup(). And the
sites of many packages are not accessible, so I cannot get any code
from them.

As far as I know, a Gentoo ebuild requires a standardized format for
the license. I don't want to apply the ebuild rules to PyPI, I just
hope to refine what PyPI has. As you can see, for the GPL alone there
are many variants in PyPI:
GPL, general public licence, http://www.gnu.org/licenses/gpl.txt,
http://www.gnu.org/licenses/gpl.html,
http://www.gnu.org/copyleft/gpl.html,
http://www.opensource.org/licenses/gpl-license.php ....

Regards,
Charles
Oct 6th, 2008

From martin at v.loewis.de Mon Oct 6 21:33:17 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 06 Oct 2008 21:33:17 +0200
Subject: [Catalog-sig] If PyPI is more strict with its packages, may be we can build binary packages from them directly.
In-Reply-To:
References: <48E9AF41.5080804@v.loewis.de>
Message-ID: <48EA67FD.80609@v.loewis.de>

> Sure, of course it is a problem of the author. And this policy may help PyPI
> to collect more packages for users. But this fault will defeat the user but not
> the author, why user have to bear the fault of the author? Now there are
> many packages in PyPI already, may be it is a time to let the author care about
> this problem to make the user more comfortable? :)

I fail to see why this creates a problem for the users.
> In fact, pypi2pkgsys can scan PyPI catalog automatically and log all broken > packages automatically. There is the log statistics: > > $ sudo pypi-logstats.py /var/tmp/pypi/pypi2pkgsys.log > /var/tmp/pypi/pypi2pkgsys.log: 2902(59.95%) ok, 0( 0.00%) manual, > 1939(40.05%) bad. > > The reason of the damage is diversity, may be broken by bad name, may > be broken by > unrecognized license (Somebody use GPL, somebody use > http://www.gnu.org/licenses/licenses/gpl.html, > somebody use http://www.opensource.org/licenses/gpl-license.php). Hmm. Maybe if you also look at the Trove classifiers, your recognition for licenses is better. I don't want to restrict package authors in the licenses that they chose for their software. If they chose a license that is not yet recognized, your tool certainly won't be able to map it to some well-known list of licenses (which you apparently need for some reason I don't understand). However, why should PyPI restrict the licenses for Python packages to the list of licenses that pyp2pkgsys supports? > Somebody embedded all of > the text into license argument of setup...... And the site of many > packages are not accessable, and > I can not get any code from them. And that's intentional. This is the Python Package *Index*, not a Python package repository. Some people chose to provide source code, others don't. Perhaps the package isn't even free software. > As I known, gentoo ebuild require a standardizied format on license. > I'm not want to apply the rule of > ebuild to PyPI, but just hope to refine it. As you see, for GPL, there > are many varieties in PyPI: > GPL, general public licence, http://www.gnu.org/licenses/gpl.txt, > http://www.gnu.org/licenses/gpl.html, > http://www.gnu.org/copyleft/gpl.html,http://www.opensource.org/licenses/gpl-license.php > .... Hmm. I personally don't think anything should change about that. 
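
A minimal sketch of how a tool could combine the two sources discussed
here: prefer a "License ::" Trove classifier when the package declares
one, and otherwise fall back to a small alias table for the free-form
license field. The normalize_license helper and the alias table are
hypothetical and are not how pypi2pkgsys works; the alias entries are
the GPL variants quoted in the thread, and the classifier string is the
standard PyPI one for the GPL.

```python
GPL_NAME = "GNU General Public License (GPL)"

# Free-form spellings of the GPL quoted in the thread, lower-cased.
GPL_ALIASES = {
    "gpl",
    "general public licence",
    "http://www.gnu.org/licenses/gpl.txt",
    "http://www.gnu.org/licenses/gpl.html",
    "http://www.gnu.org/copyleft/gpl.html",
    "http://www.opensource.org/licenses/gpl-license.php",
}


def normalize_license(license_field, classifiers=()):
    """Map a package's license metadata to one canonical name, or None.

    A "License :: ..." Trove classifier wins over the free-form field,
    because classifiers come from a fixed vocabulary.
    """
    for classifier in classifiers:
        parts = [part.strip() for part in classifier.split("::")]
        if parts[0] == "License":
            return parts[-1]
    if license_field.strip().lower() in GPL_ALIASES:
        return GPL_NAME
    # Unrecognized: leave the decision to the packager.
    return None


print(normalize_license("GPL"))
print(normalize_license("see LICENSE.txt",
                        ["License :: OSI Approved :: GNU General Public License (GPL)"]))
print(normalize_license("my own terms"))
```

Returning None for anything unrecognized keeps the index permissive,
as argued above, while still letting a packaging tool skip or flag the
package.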
Regards,
Martin

From ziade.tarek at gmail.com Tue Oct 7 16:07:26 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Tue, 7 Oct 2008 10:07:26 -0400
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com>
    <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com>
Message-ID: <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com>

On Sun, Oct 5, 2008 at 12:04 PM, zooko wrote:
>> 5/ ideally, they should be one and only one version of a given package
>> in an OS-based installation
>
> -1 -- This is the strong preference of the folks who package software for
> OSes -- Debian, Fedora, etc. -- but it is not necessarily the choice of the
> users who use their OSes. It is best for the Python packaging standards to
> be agnostic towards this, or at least to support both this desideratum and
> its opposite.

I can see this as an exponential problem for packagers, but let's
think about it:

- How would you handle several versions of the same package in Python then?
- How would each application pick the right version?
- How would you decide which version is the default one?

That is the core of the problem.

The -m feature of setuptools is nice, but it activates one version at
a time, and this is global to Python unless each application handles
the version switch, which is pretty heavy.

A programmable sys.path?

Tarek.

From ziade.tarek at gmail.com Tue Oct 7 16:35:49 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Tue, 7 Oct 2008 10:35:49 -0400
Subject: [Catalog-sig] Distutils and PyPI : P4-Sprint in D.C.
Message-ID: <94bdd2610810070735g1e4a93d6n45beda83dd8c158c@mail.gmail.com>

Hey all,

We are going to have a P4-sprint (pre-pre-pre-PEP sprint) in D.C.
during the Plone Conference.
The idea is to try to bring the discussions that have been going on in
the mailing lists into the next stage.

Please join us if you are interested, even if you are not in D.C.

http://www.openplans.org/projects/plone-conference-2008-dc/distribute

Regards
Tarek

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From fdrake at gmail.com Tue Oct 7 17:37:52 2008
From: fdrake at gmail.com (Fred Drake)
Date: Tue, 7 Oct 2008 11:37:52 -0400
Subject: [Catalog-sig] Distutils and PyPI : P4-Sprint in D.C.
In-Reply-To: <94bdd2610810070735g1e4a93d6n45beda83dd8c158c@mail.gmail.com>
References: <94bdd2610810070735g1e4a93d6n45beda83dd8c158c@mail.gmail.com>
Message-ID: <9cee7ab80810070837q486414a3v8dadd697e7892d09@mail.gmail.com>

On Tue, Oct 7, 2008 at 10:35 AM, Tarek Ziadé wrote:
> We are going to have a P4-sprint (pre-pre-pre-PEP sprint) in D.C.
> during the Plone Conference.

Very cool.

Given the venue, should people expect that they're welcome in person
even if not associated with the Plone conference?

I don't know if I'll be able to steal some time from my family, but
there's a possibility.

-Fred

--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller

From ziade.tarek at gmail.com Tue Oct 7 17:44:29 2008
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 7 Oct 2008 11:44:29 -0400
Subject: [Catalog-sig] Distutils and PyPI : P4-Sprint in D.C.
In-Reply-To: <9cee7ab80810070837q486414a3v8dadd697e7892d09@mail.gmail.com>
References: <94bdd2610810070735g1e4a93d6n45beda83dd8c158c@mail.gmail.com> <9cee7ab80810070837q486414a3v8dadd697e7892d09@mail.gmail.com>
Message-ID: <94bdd2610810070844t7622c326ib9aba78a18e1bc5c@mail.gmail.com>

On Tue, Oct 7, 2008 at 11:37 AM, Fred Drake wrote:
> On Tue, Oct 7, 2008 at 10:35 AM, Tarek Ziadé wrote:
>> We are going to have a P4-sprint (pre-pre-pre-PEP sprint) in D.C.
>> during the Plone Conference.
>
> Very cool.
>
> Given the venue, should people expect that they're welcome in person
> even if not associated with the Plone conference?

Yes, there's a huge space, everyone is welcome!

>
> I don't know if I'll be able to steal some time from my family, but
> there's a possibility.
>

That would be awesome!

Tarek

>
> -Fred
>
> --
> Fred L. Drake, Jr.
> "Chaos is the score upon which reality is written." --Henry Miller
>

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From pje at telecommunity.com Tue Oct 7 20:42:24 2008
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 07 Oct 2008 14:42:24 -0400
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com>
Message-ID: <20081007184113.54CF03A4045@sparrow.telecommunity.com>

At 10:07 AM 10/7/2008 -0400, Tarek Ziadé wrote:
>The -m feature of setuptools is nice, but it activates one version at
>a time, and
>this is global to Python unless each application is handling the
>version switch,
>which is pretty heavy.

With or without the -m switch, scripts installed by setuptools will find
the version they are specified to use, without the user needing to do
anything. So, you can have a default version of an egg (used by the
interpreter and non-setuptools scripts), and then some non-default
versions that are used by scripts.

zc.buildout and virtualenv also have their own ways of accomplishing the
same thing, e.g., by hardcoding paths in an installed script.
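
[Editorial sketch] The behaviour described above (a default version for the plain interpreter, and per-script versions chosen from what is installed) can be modelled in a few lines. This is only an illustration of the selection rule, not setuptools' actual implementation; resolve, parse_version and the version numbers are hypothetical, and versions are assumed to be purely numeric dotted strings.

```python
def parse_version(text):
    """Turn "1.4" into (1, 4) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve(installed, default, pinned=None):
    """Pick the version of one project that a program would import.

    installed: version strings present on disk.
    default:   the version on the plain sys.path (what the interpreter
               and non-setuptools scripts see).
    pinned:    (minimum, maximum) version strings a generated script was
               built against, or None when nothing is pinned.
    """
    if pinned is None:
        return default
    low, high = (parse_version(v) for v in pinned)
    candidates = [v for v in installed if low <= parse_version(v) <= high]
    if not candidates:
        raise LookupError("no installed version satisfies %r" % (pinned,))
    # Prefer the newest installed version inside the pinned range.
    return max(candidates, key=parse_version)


INSTALLED = ["1.2", "1.4", "2.0"]
plain = resolve(INSTALLED, default="1.2")
script = resolve(INSTALLED, default="1.2", pinned=("1.3", "1.9"))
```

Here the unpinned program keeps the default, while the pinned script gets a non-default version without the user doing anything.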
From ziade.tarek at gmail.com Tue Oct 7 20:58:00 2008
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 7 Oct 2008 14:58:00 -0400
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <20081007184113.54CF03A4045@sparrow.telecommunity.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com> <20081007184113.54CF03A4045@sparrow.telecommunity.com>
Message-ID: <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com>

On Tue, Oct 7, 2008 at 2:42 PM, Phillip J. Eby wrote:
> At 10:07 AM 10/7/2008 -0400, Tarek Ziadé wrote:
>>
>> The -m feature of setuptools is nice, but it activates one version at
>> a time, and
>> this is global to Python unless each application is handling the
>> version switch,
>> which is pretty heavy.
>
> With or without the -m switch, scripts installed by setuptools will find the
> version they are specified to use, without the user needing to do anything.
> So, you can have a default version of an egg (used by the interpreter and
> non-setuptools scripts), and then some non-default versions that are used by
> scripts.
>
> zc.buildout and virtualenv also have their own ways of accomplishing the
> same thing, e.g., by hardcoding paths in an installed script.

In a plain Python setup:

If foo 1.2 is the default, and a package wants to use foo 1.4,
it needs to specifically call pkg_resources.require() in the code, to
activate it in sys.path before importing "foo" in the code.

Since each package can list its dependencies, with versions, in
setuptools' install_requires, how hard would it be to automatically make
the right "require()" calls when the package is used?

Tarek

--
Tarek Ziadé
| Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From ianb at colorstudy.com Tue Oct 7 21:07:35 2008
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 07 Oct 2008 14:07:35 -0500
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com> <20081007184113.54CF03A4045@sparrow.telecommunity.com> <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com>
Message-ID: <48EBB377.7030800@colorstudy.com>

Tarek Ziadé wrote:
> On Tue, Oct 7, 2008 at 2:42 PM, Phillip J. Eby wrote:
>> At 10:07 AM 10/7/2008 -0400, Tarek Ziadé wrote:
>>> The -m feature of setuptools is nice, but it activates one version at
>>> a time, and
>>> this is global to Python unless each application is handling the
>>> version switch,
>>> which is pretty heavy.
>> With or without the -m switch, scripts installed by setuptools will find the
>> version they are specified to use, without the user needing to do anything.
>> So, you can have a default version of an egg (used by the interpreter and
>> non-setuptools scripts), and then some non-default versions that are used by
>> scripts.
>>
>> zc.buildout and virtualenv also have their own ways of accomplishing the
>> same thing, e.g., by hardcoding paths in an installed script.
>
> in a plain python setup,
>
> If foo 1.2 is the default, and a package wants use foo 1.4,
> it needs to specifically call pkg_resources.require() in the code, to
> activate it in sys.path
> before importing "foo" in the code.
> > Since each package can list with setuptools its dependencies with > versions in install_requires, > how hard would it be to automatically call the right "require()" > calls when the package is used ? require() is recursive, so as long as the original script is explicitly loaded (e.g., from a binary script, or something that loads eggs/entry points) then the proper versions will be loaded. Though as far as I know, pkg_resources won't remove other versions of the egg from the path, so it only works if there are no active versions of the eggs. Which isn't how many people install packages, so this feature of require() doesn't get used for much of anything (at least that I've seen). I'll also note that the require in setuptools-generated scripts causes pretty frequent problems for people, all to support this multi-version feature that no one really uses. An example of an easy way to cause the problem, if you do: "python setup.py develop; svn up; python setup.py egg_info" it'll break any scripts, or if you install a script in an unusual location, or use $PYTHONPATH but don't set $PATH so that you get an unexpected script that doesn't match your libraries -- since pyinstall is using --single-version-externally-managed, I kind of wish I could easily turn off the require() as well (I could monkeypatch setuptools to remove it, but I've been burned by going down that path before). -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org From pje at telecommunity.com Tue Oct 7 22:27:40 2008 From: pje at telecommunity.com (Phillip J. 
Eby)
Date: Tue, 07 Oct 2008 16:27:40 -0400
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com> <20081007184113.54CF03A4045@sparrow.telecommunity.com> <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com>
Message-ID: <20081007202630.266D23A4045@sparrow.telecommunity.com>

At 02:58 PM 10/7/2008 -0400, Tarek Ziadé wrote:
>On Tue, Oct 7, 2008 at 2:42 PM, Phillip J. Eby wrote:
> > At 10:07 AM 10/7/2008 -0400, Tarek Ziadé wrote:
> >>
> >> The -m feature of setuptools is nice, but it activates one version at
> >> a time, and
> >> this is global to Python unless each application is handling the
> >> version switch,
> >> which is pretty heavy.
> >
> > With or without the -m switch, scripts installed by setuptools will find the
> > version they are specified to use, without the user needing to do anything.
> > So, you can have a default version of an egg (used by the interpreter and
> > non-setuptools scripts), and then some non-default versions that are used by
> > scripts.
> >
> > zc.buildout and virtualenv also have their own ways of accomplishing the
> > same thing, e.g., by hardcoding paths in an installed script.
>
>in a plain python setup,
>
>If foo 1.2 is the default, and a package wants use foo 1.4,
>it needs to specifically call pkg_resources.require() in the code, to
>activate it in sys.path
>before importing "foo" in the code.

You can't un-default the default, actually. If there's a default, it
can't be replaced once pkg_resources has been imported.

>Since each package can list with setuptools its dependencies with
>versions in install_requires,
>how hard would it be to automatically call the right "require()"
>calls when the package is used ?
This is already done by setuptools-generated scripts. Same for
zc.buildout and virtualenv, they just do it differently.

From robert.kern at gmail.com Tue Oct 7 23:58:42 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 07 Oct 2008 16:58:42 -0500
Subject: [Catalog-sig] Simultaneous multi-version support
In-Reply-To: <87abdgjcw3.fsf_-_@benfinney.id.au>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com> <20081007184113.54CF03A4045@sparrow.telecommunity.com> <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com> <48EBB377.7030800@colorstudy.com> <87abdgjcw3.fsf_-_@benfinney.id.au>
Message-ID:

Ben Finney wrote:
> Ian Bicking writes:
>
>> I'll also note that the require in setuptools-generated scripts causes
>> pretty frequent problems for people, all to support this multi-version
>> feature that no one really uses.
>
> I agree heartily that it seems to cause more trouble than it's worth --
> for my assessment of its worth, anyway. Is it true that “no-one” (to
> some epsilon value) actually uses this feature?

There is one person on enthought-dev who does this for everything. He
says it keeps him honest about his dependencies. And consequently keeps
us at Enthought honest about ours.

I typically have multiple versions of things installed, and switch
between them, but I never use pkg_resources.require() to do so.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From ben+python at benfinney.id.au Tue Oct 7 23:47:40 2008
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 08 Oct 2008 08:47:40 +1100
Subject: [Catalog-sig] Simultaneous multi-version support (was: pre-PEP : Synthesis of previous threads, and irc talks + proposals)
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com> <20081007184113.54CF03A4045@sparrow.telecommunity.com> <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com> <48EBB377.7030800@colorstudy.com>
Message-ID: <87abdgjcw3.fsf_-_@benfinney.id.au>

Ian Bicking writes:

> I'll also note that the require in setuptools-generated scripts causes
> pretty frequent problems for people, all to support this multi-version
> feature that no one really uses.

I agree heartily that it seems to cause more trouble than it's worth --
for my assessment of its worth, anyway. Is it true that “no-one” (to
some epsilon value) actually uses this feature?

--
 \        “The best ad-libs are rehearsed.” --Graham Kennedy |
  `\                                                          |
_o__)                                                         |
Ben Finney

From david at ar.media.kyoto-u.ac.jp Wed Oct 8 04:58:20 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 08 Oct 2008 11:58:20 +0900
Subject: [Catalog-sig] [Distutils] pre-PEP : Synthesis of previous threads, and irc talks + proposals
In-Reply-To: <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com>
References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com>
Message-ID: <48EC21CC.3080709@ar.media.kyoto-u.ac.jp>

Tarek Ziadé
wrote:
> On Sun, Oct 5, 2008 at 12:04 PM, zooko wrote:
>>> 5/ ideally, they should be one and only one version of a given package
>>> in an OS-based installation
>> -1 -- This is the strong preference of the folks who package software for
>> OSes -- Debian, Fedora, etc. -- but it is not necessarily the choice of the
>> users who use their OSes. It is best for the Python packaging standards to
>> be agnostic towards this, or at least to support both this desideratum and
>> its opposite.

It is not just a strong preference of the Linux guys, it is proper
software engineering. It is fine for developers to have several versions
installed at the same time, but we should really discourage developers
from depending on the possibility of deploying several versions of the
same package, at least in the site-packages owned by the system.

The problem is not just for Linux guys: by enabling everyone to deploy
several versions, you encourage people not to care about API
compatibility, and then quickly, in the dependencies, you will have
requirements such as foo >= 1.2, foo < 1.3, which quickly means things
will fail because a package A depends on B and C, and B depends on
foo 1.2 and C on foo 1.3. By encouraging multiple versions, you are
encouraging this kind of failure all the time.

If people want to try several versions, there is virtualenv and co (or
just installing several Python interpreters). It should be a *developer
only* convenience as much as possible. We can have a system to control
imported versions, but without support from the Python interpreter, it
will be unreliable and keep breaking as it does now.
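
[Editorial sketch] The failure described above (A depends on B and C, B needs foo 1.2, C needs foo 1.3) can be detected mechanically by walking the dependency graph and collecting the exact versions each package pins. The data layout and the name find_conflicts are hypothetical, and only exact pins are modelled, not version ranges.

```python
def find_conflicts(requirements, root):
    """Return {dependency: {version: [requirers]}} for conflicting pins.

    requirements maps a package name to {dependency name: exact version
    string, or None when any version is acceptable}.
    """
    pins = {}
    seen = set()
    stack = [root]
    while stack:
        package = stack.pop()
        if package in seen:
            continue
        seen.add(package)
        for dependency, version in requirements.get(package, {}).items():
            if version is not None:
                pins.setdefault(dependency, {}).setdefault(version, []).append(package)
            stack.append(dependency)
    # A dependency is in conflict when more than one version is pinned.
    return {dep: by_version for dep, by_version in pins.items() if len(by_version) > 1}


# The example from the message: A -> B, C; B -> foo 1.2; C -> foo 1.3.
REQUIREMENTS = {
    "A": {"B": None, "C": None},
    "B": {"foo": "1.2"},
    "C": {"foo": "1.3"},
}
conflicts = find_conflicts(REQUIREMENTS, "A")
```

With a single site-packages copy of foo, neither pin can be honoured for both B and C, which is the point being made.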
cheers, David From zooko at zooko.com Wed Oct 8 15:15:08 2008 From: zooko at zooko.com (zooko) Date: Wed, 8 Oct 2008 07:15:08 -0600 Subject: [Catalog-sig] [Distutils] Simultaneous multi-version support In-Reply-To: References: <94bdd2610810010510w43abf97bkf19f0324520e3dfe@mail.gmail.com> <8E57AB2B-4C1C-49F4-9FF8-78F57509395E@zooko.com> <94bdd2610810070707k5fee06eayd2d1086da9459213@mail.gmail.com> <20081007184113.54CF03A4045@sparrow.telecommunity.com> <94bdd2610810071158u624007eft4593b253f1887be0@mail.gmail.com> <48EBB377.7030800@colorstudy.com> <87abdgjcw3.fsf_-_@benfinney.id.au> Message-ID: <4CA944A8-FD07-4157-BB2E-4F0D5D674066@zooko.com> We use pkg_resources.require() in Tahoe solely in order to get better and earlier error messages in the case of missing or wrong-version dependencies: http://allmydata.org/trac/tahoe/browser/_auto_deps.py?rev=2968#L22 """ The purpose of this function is to raise a pkg_resources exception if any of the requirements can't be imported. This is just to give earlier and more explicit error messages, as opposed to waiting until the source code tries to import some module from one of these packages and gets an ImportError. This function gets called from src/allmydata/__init__.py . """ We are considering experimenting with the multi-version feature of eggs, but haven't tried it yet. pkg_resources.require() is not particularly problematic for us as far as I recall. Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month From lists at zopyx.com Thu Oct 9 12:40:14 2008 From: lists at zopyx.com (Andreas Jung) Date: Thu, 09 Oct 2008 06:40:14 -0400 Subject: [Catalog-sig] PyPI replication project Message-ID: <48EDDF8E.8030204@zopyx.com> Hi there, I would like to inform you that we created the "PyPI replication project" hosted on Launchpad. 
Driven by the needs of the Zope world, which uses zc.buildout
intensively, we created a package, z3c.pypimirror, for mirroring the
packages directly hosted on PyPI on a local server. This removed the
dependency on the PyPI server(s), which are a single point of failure
and have often had availability and reliability issues in the past.

In phase 1 of the project (coming soon) we will provide a number of
independent machines (up to five servers) with a full copy of all
packages hosted directly on PyPI.

For phase 2 (next year) we will rework the codebase of z3c.pypimirror
and try to deal in some way with packages hosted externally. In addition
we are thinking about providing some kind of automatic mirror selection
within setuptools/zc.buildout based on DNS alias entries (still to be
planned).

Andreas

From martin at v.loewis.de Thu Oct 9 20:32:59 2008
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 09 Oct 2008 20:32:59 +0200
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EDDF8E.8030204@zopyx.com>
References: <48EDDF8E.8030204@zopyx.com>
Message-ID: <48EE4E5B.9070200@v.loewis.de>

> In phase 1 of the project (upcoming soon) we will provide a number of
> independent machines (up to five servers) with a full copy of all
> packages hosted directly on PyPI.

Did you consider offering to host and manage PyPI *instead* of creating
this mirror?

Regards,
Martin

From lists at zopyx.com Thu Oct 9 20:53:30 2008
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 09 Oct 2008 14:53:30 -0400
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE4E5B.9070200@v.loewis.de>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de>
Message-ID: <48EE532A.3030907@zopyx.com>

On 09.10.2008 at 14:32, Martin v. Löwis wrote:
>> In phase 1 of the project (upcoming soon) we will provide a number of
>> independent machines (up to five servers) with a full copy of all
>> packages hosted directly on PyPI.
>
> Did you consider offering to host and manage PyPI *instead* of creating
> this mirror?

What is the background for this question? Right now we have the problem
that we must have the eggs and source code archives available for doing
buildout - we don't depend on the metadata and the PyPI software itself.
Are you asking because your server setup is too limited or needs further
resources for reliability and availability?

Andreas

From martin at v.loewis.de Thu Oct 9 23:18:10 2008
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 09 Oct 2008 23:18:10 +0200
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE532A.3030907@zopyx.com>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com>
Message-ID: <48EE7512.7040607@v.loewis.de>

>> Did you consider offering to host and manage PyPI *instead* of creating
>> this mirror?
>
> What is the background for this question? Right now we have the problem
> that we must have the eggs and source code archive available for doing
> buildout - we don't depend on the metadata and the PyPI software itself.
> Are you asking because your server setup is too limited or needs further
> resource concerning reliability and availability?

I'm concerned that PyPI will fork, and that users will have to choose
between your installation and PyPI, in particular for publishing
packages (I assume that initially, uploading to your infrastructure
might not be possible, but what you describe sounds like a good starting
point for a package repository).
I don't think the machine running pypi.python.org is too limited, but
in order to provide the availability you complained is missing, having
full-time personnel managing it would be useful (not because it needs
constant maintenance, but so that somebody is there who can respond
quickly).

So if you would host PyPI on these machines, it might be that the
mirroring software becomes unnecessary, and that the added redundancy
and maintenance staff can provide what you need without creating
a separate repository.

Regards,
Martin

From lists at zopyx.com Fri Oct 10 00:04:37 2008
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 09 Oct 2008 18:04:37 -0400
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE7512.7040607@v.loewis.de>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de>
Message-ID: <48EE7FF5.7090405@zopyx.com>

On 09.10.2008 at 17:18, Martin v. Löwis wrote:
>>> Did you consider offering to host and manage PyPI *instead* of creating
>>> this mirror?
>> What is the background for this question? Right now we have the problem
>> that we must have the eggs and source code archive available for doing
>> buildout - we don't depend on the metadata and the PyPI software itself.
>> Are you asking because your server setup is too limited or needs further
>> resource concerning reliability and availability?
>
> I'm concerned that PyPI will fork, and that users have to choose between
> your installation, and PyPI, in particular for publishing packages
> (I assume that initially, uploading to your infrastructure might not
> be possible, but what you describe sounds like a good starting point for
> a package repository).

Nah...no fear...there is no intention to fork PyPI. In particular, PyPI
will remain the master and the replication provides _only_ download
support. There is absolutely no intention of providing _any_ kind of
upload possibility - never ever.
Think of it as being like the CPAN mirrors.

>
> I don't think the machine running pypi.python.org is too limited, but
> in order to provide the availability you complained is missing, having
> full-time personnel managing it would be useful (not because it needs
> constant maintenance, but so that somebody is there who can respond
> quickly).

If there were an easy way to replicate the PyPI backend (no idea about
the implementation - is any RDBMS involved?) then there would likely be
volunteers providing the resources for a mirror of the current backend.

>
> So if you would host PyPI on these machines, it might be that the
> mirroring software becomes unnecessary, and that the added redundancy
> and maintenance staff can provide what you need without creating
> a separate repository.

Please tell us what a distributed PyPI backend would look like. One
requirement would be that the software and database part (if any) have
to be in good shape - the last time I tried the PyPI software I did not
get the best impression.

Andreas

From martin at v.loewis.de Fri Oct 10 00:36:33 2008
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 10 Oct 2008 00:36:33 +0200
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE7FF5.7090405@zopyx.com>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com>
Message-ID: <48EE8771.8040605@v.loewis.de>

> If there would be an easy way to replicate the PyPI backend (no idea
> about the implementation - any RDBMS involved) then there are likely
> volunteers taking over the resources for a mirror of the current backend.

I don't think a mirror would help. If the original system is down,
who would tell users to go to alternative locations?

> Please tell us how a distributed PyPI backend would look like..one
> requirement would be that the software and database part (if any) have
> to be in a good shape - the last time I tried the PyPI software I have
> not had the best impression.

Again, I don't think that distribution improves anything - I rather
would expect that it introduces new problems.

What would help is more people who know how to operate the software,
and restart it when it breaks (or can contribute automatic error
recovery).

Regards,
Martin

From renesd at gmail.com Fri Oct 10 00:42:53 2008
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Fri, 10 Oct 2008 09:42:53 +1100
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE8771.8040605@v.loewis.de>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de>
Message-ID: <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com>

hellos,

Mirrors help every other packaging system. So it stands to reason
that it would help pypi too. I think since many zope people have been
using mirrors instead of using pypi directly... pypi has been more
available. It's running lots better for other reasons too... but less
load is probably also nice for pypi :)

cheers,

On Fri, Oct 10, 2008 at 9:36 AM, "Martin v. Löwis" wrote:
>> If there would be an easy way to replicate the PyPI backend (no idea
>> about the implementation - any RDBMS involved) then there are likely
>> volunteers taking over the resources for a mirror of the current backend.
>
> I don't think a mirror would help. If the original system is down,
> who would tell users to go to alternative locations?
>
>> Please tell us how a distributed PyPI backend would look like..one
>> requirement would be that the software and database part (if any) have
>> to be in a good shape - the last time I tried the PyPI software I have
>> not had the best impression.
>
> Again, I don't think that distribution improves anything - I rather
> would expect that it introduces new problems.
>
> What would help is more people who know how to operate the software,
> and restart it when it breaks (or can contribute automatic error
> recovery).
>
> Regards,
> Martin

From lists at zopyx.com Fri Oct 10 00:45:41 2008
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 09 Oct 2008 18:45:41 -0400
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE8771.8040605@v.loewis.de>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de>
Message-ID: <48EE8995.9080206@zopyx.com>

On 09.10.2008 at 18:36, Martin v. Löwis wrote:
>> If there would be an easy way to replicate the PyPI backend (no idea
>> about the implementation - any RDBMS involved) then there are likely
>> volunteers taking over the resources for a mirror of the current backend.
>
> I don't think a mirror would help. If the original system is down,
> who would tell users to go to alternative locations?

zc.buildout supports this using the 'find-links' option. As I said, the
primary focus is on zc.buildout users (that is, the complete Zope &
Plone world).

>
>> Please tell us how a distributed PyPI backend would look like..one
>> requirement would be that the software and database part (if any) have
>> to be in a good shape - the last time I tried the PyPI software I have
>> not had the best impression.
>
> Again, I don't think that distribution improves anything - I rather
> would expect that it introduces new problems.

We have maintained an internal mirror of PyPI for a while (just because
we need to have PyPI available _all_ the time for production reasons).
And it works out pretty well.

Andreas

From martin at v.loewis.de Fri Oct 10 00:59:56 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 10 Oct 2008 00:59:56 +0200
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com>
Message-ID: <48EE8CEC.1000706@v.loewis.de>

> Mirrors help every other packaging system. So it stands to reason
> that it would help pypi too. I think since many zope people have been
> using mirrors instead of using pypi directly... pypi has been more
> available. It's running lots better for other reasons too... but less
> load is probably also nice for pypi :)

I'm fine with people operating their own mirrors. I just don't think
it can be made *invisible* to users that they use a mirror. In the
mirroring systems for Linux distributions, for example, people have
to explicitly select which mirror they want to use (and accept that
the mirror may lag behind by a day or so). It's also clear that it is
a "mere" mirror.

What Andreas was asking was how a distributed PyPI installation could
work, by which I assume he was asking for one that a) is invisible
(often misleadingly called "transparent") to users, and b) allows
updates to replicas.

I'm skeptical that such a system would work all that well, and can
be created in a reasonable amount of time.

Regards,
Martin

From lists at zopyx.com Fri Oct 10 12:40:17 2008
From: lists at zopyx.com (Andreas Jung)
Date: Fri, 10 Oct 2008 06:40:17 -0400
Subject: [Catalog-sig] PyPI replication project
In-Reply-To: <48EE8CEC.1000706@v.loewis.de>
References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de>
Message-ID: <48EF3111.3010706@zopyx.com>

On 09.10.2008 at 18:59, Martin v. Löwis wrote:
>> Mirrors help every other packaging system. So it stands to reason
>> that it would help pypi too. I think since many zope people have been
>> using mirrors instead of using pypi directly... pypi has been more
>> available. It's running lots better for other reasons too... but less
>> load is probably also nice for pypi :)
>
> I'm fine with people operating their own mirrors. I just don't think
> it can be made *invisible* to users that they use a mirror. In the
> mirroring systems for Linux distributions, for example, people have
> to explicitly select which mirror they want to use (and accept that
> the mirror may lag behind by a day or so). It's also clear that it is
> a "mere" mirror.

Implicit or explicit mirror selection is not the primary point in
phase 1 of the project. The point is that we must have access to the
distribution packages and eggs at any time - independent of the
availability of PyPI (whether related to issues with the PyPI server or
caused by internet outages or routing problems).

An implicit selection of a mirror in case of a detected outage would be
nice, but this is possibly not the most important issue right now.
We can always reconfigure our buildout configurations easily to point to
a new server or define a series of mirroring servers.
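
[Editorial sketch] "A series of mirroring servers" amounts to trying each configured server in order and using the first one that answers. The sketch below shows that rule only; it is not zc.buildout or setuptools code. The function name fetch_from_mirrors and the example host names are hypothetical, and the actual download is passed in as a callable so that the logic can be exercised without a network.

```python
def fetch_from_mirrors(path, mirrors, fetch):
    """Try each mirror in order; return (mirror, payload) from the first
    one that answers.

    path:    path below the mirror root, e.g. "simple/foo/".
    mirrors: base URLs, in order of preference.
    fetch:   callable taking a URL and returning the payload; it is
             expected to raise OSError when the server is unreachable.
    """
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return base, fetch(url)
        except OSError as exc:
            # Remember the failure and fall through to the next mirror.
            errors.append((base, exc))
    raise OSError("all mirrors failed: %r" % (errors,))
```

In real use, fetch could be something like lambda url: urllib.request.urlopen(url).read(), since the errors raised by urllib.request are OSError subclasses. Note that the selection here is explicit: the user (or the buildout configuration) still has to list the mirrors.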
> > What Andreas was asking how a distributed PyPI installation could work, > by which I assume he was asking for one that a) is invisible (of called > misleadingly "transparent") to users, and b) allows updates to replica. > > I'm skeptical that such an system would work all that well, and can > be created in a reasonable amount of time As stated earlier: we can already define multiple servers as part of a buildout configuration. A better mirror selection algorithm would be nice to have for the future but right now we don't actually need it and can live with the current state. Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: lists.vcf Type: text/x-vcard Size: 316 bytes Desc: not available URL: From tarek.ziade at ingeniweb.com Fri Oct 10 14:38:13 2008 From: tarek.ziade at ingeniweb.com (Tarek Ziade) Date: Fri, 10 Oct 2008 08:38:13 -0400 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48EF3111.3010706@zopyx.com> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EF3111.3010706@zopyx.com> Message-ID: 2008/10/10 Andreas Jung > Am 09.10.2008 18:59 Uhr, Martin v. L?wis schrieb: > >> Mirrors help every other packaging system. So it stands to reason >>> that it would help pypi too. I think since many zope people have been >>> using mirrors instead of using pypi directly... pypi has been more >>> available. It's running lots better for other reasons too... but less >>> load is probably also nice for pypi :) >>> >> >> I'm fine with people operating their own mirrors. I just don't think >> it can be made *invisible* to users that they use a mirror. 
In the >> mirroring systems for Linux distributions, for example, people have >> to explicitly select which mirror they want to use (and accept that >> the mirror may lag behind by a day or so). It's also clear that it is >> a "mere" mirror. >> > > Implict or explict mirror selection is not primary point in phase 1 > of the project. The point is that we must have access to the distribution > packages and eggs at any time - independent of the available of PyPI (either > related to issues with the PyPI server or caused by internet outages or > routing problems). > > An implicit selection of a mirror in case of an detected outtage would be > nice but this is possibly not the most important issue right now. > We can always reconfigure out buildout configurations easily to a new > server or define a series of mirroring servers. > > >> What Andreas was asking how a distributed PyPI installation could work, >> by which I assume he was asking for one that a) is invisible (of called >> misleadingly "transparent") to users, and b) allows updates to replica. >> >> I'm skeptical that such an system would work all that well, and can >> be created in a reasonable amount of time >> > > As stated earlier: we can already define multiple servers as part > of a buildout configuration. A better mirror selection algorithm would be > nice to have for the future but right now we don't actually need it and can > live with the current state. I think the key is to choose upon several indexes, not to use extra find-links. (I have submitted a patch for setuptools to handle several indexes for that: I think the key is to choose upon several indexes, not to used) Last, I think we should create a mirror registration system on PyPI, to ping mirrors when a new package is uploaded, so the sync is simpler. > > Andreas > > _______________________________________________ > Catalog-SIG mailing list > Catalog-SIG at python.org > http://mail.python.org/mailman/listinfo/catalog-sig > > -- Tarek Ziad? 
- Directeur Technique INGENIWEB (TM) - SAS 50000 Euros - RC B 438 725 632 Bureaux de la Colline - 1 rue Royale - Bâtiment D - 9ème étage 92210 Saint Cloud - France Phone : 01.78.15.24.00 / Fax : 01 46 02 44 04 http://www.ingeniweb.com - une société du groupe Alter Way From chris at simplistix.co.uk Fri Oct 10 17:20:40 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 10 Oct 2008 16:20:40 +0100 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48EDDF8E.8030204@zopyx.com> References: <48EDDF8E.8030204@zopyx.com> Message-ID: <48EF72C8.1040705@simplistix.co.uk> Andreas Jung wrote: > In phase 1 of the project (upcoming soon) we will provide a number of > independent machines (up to five servers) with a full copy of all > packages hosted directly on PyPI. > > For phase 2 (next year) we will rework the codebase of z3c.pypimirror > and try to deal in some way with packages hosted externally. In addition > we think about providing some kind of automatic mirror-selection within > setuptools/zc.buildout based on DNS alias entries (subject to be planned). I don't see why someone doesn't just develop PyPI as a Google App Engine app and use Amazon S3 for the storage of the packages.
Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From lists at zopyx.com Fri Oct 10 17:23:45 2008 From: lists at zopyx.com (Andreas Jung) Date: Fri, 10 Oct 2008 11:23:45 -0400 Subject: [Catalog-sig] PyPI replication project In-Reply-To: References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EF3111.3010706@zopyx.com> Message-ID: <48EF7381.4030902@zopyx.com> On 10.10.2008 8:38 Uhr, Tarek Ziade wrote: > > > 2008/10/10 Andreas Jung > > > Am 09.10.2008 18:59 Uhr, Martin v. L?wis schrieb: > > Mirrors help every other packaging system. So it stands to > reason > that it would help pypi too. I think since many zope people > have been > using mirrors instead of using pypi directly... pypi has > been more > available. It's running lots better for other reasons too... > but less > load is probably also nice for pypi :) > > > I'm fine with people operating their own mirrors. I just don't think > it can be made *invisible* to users that they use a mirror. In the > mirroring systems for Linux distributions, for example, people have > to explicitly select which mirror they want to use (and accept that > the mirror may lag behind by a day or so). It's also clear that > it is > a "mere" mirror. > > > Implict or explict mirror selection is not primary point in phase 1 > of the project. The point is that we must have access to the > distribution packages and eggs at any time - independent of the > available of PyPI (either related to issues with the PyPI server or > caused by internet outages or routing problems). > > An implicit selection of a mirror in case of an detected outtage > would be nice but this is possibly not the most important issue > right now. 
> We can always reconfigure out buildout configurations easily to a > new server or define a series of mirroring servers. > > > > What Andreas was asking how a distributed PyPI installation > could work, > by which I assume he was asking for one that a) is invisible (of > called > misleadingly "transparent") to users, and b) allows updates to > replica. > > I'm skeptical that such an system would work all that well, and can > be created in a reasonable amount of time > > > As stated earlier: we can already define multiple servers as part > of a buildout configuration. A better mirror selection algorithm > would be nice to have for the future but right now we don't actually > need it and can live with the current state. > > > I think the key is to choose upon several indexes, not to use extra > find-links. > > (I have submitted a patch for setuptools to handle several indexes for > that: I think the key > is to choose upon several indexes, not to used) > > Last, I think we should create a mirror registration system on PyPI, to > ping mirrors when > a new package is uploaded, so the sync is simpler. I really want to defer this discussion for now since it is not relevant for phase 1 of the project. Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: lists.vcf Type: text/x-vcard Size: 330 bytes Desc: not available URL: From ianb at colorstudy.com Fri Oct 10 20:34:33 2008 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 10 Oct 2008 14:34:33 -0400 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48EE8CEC.1000706@v.loewis.de> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> Message-ID: <48EFA039.30002@colorstudy.com> Martin v. 
Löwis wrote: >> Mirrors help every other packaging system. So it stands to reason >> that it would help pypi too. I think since many zope people have been >> using mirrors instead of using pypi directly... pypi has been more >> available. It's running lots better for other reasons too... but less >> load is probably also nice for pypi :) > > I'm fine with people operating their own mirrors. I just don't think > it can be made *invisible* to users that they use a mirror. In the > mirroring systems for Linux distributions, for example, people have > to explicitly select which mirror they want to use (and accept that > the mirror may lag behind by a day or so). I vaguely remember CPAN doing something like having machine-readable lists of mirrors, and those lists are available at a couple of reliable locations, and those locations are hardcoded into the tool. That doesn't speak to how well updated the mirror is, but I think some Linux distributions have clever solutions to that aspect too. If some component of the system was built in a push manner (i.e., a static file), and that file was kept synced between a couple of reliable servers (I don't think it's really important if one of these servers is a couple of seconds out of date), then we'd have something fairly reliable. So... the static file(s) could be a list of mirrors, and maybe a last-modified time for the entire system; then you could pick a mirror and check its last-modified time against the one in the mirror list to see if the mirror was fully up-to-date. The problem there is that mirrors might be out of date, but not in a way you care about (e.g., some package is uploaded that you don't care about). And there I vaguely remember someone talking about a more clever algorithm where you could tell if the mirror was up to date for the packages you care about.
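To make the last-modified comparison concrete, here is a rough sketch under stated assumptions: the master publishes one last-modified time for the whole system, each mirror reports its own, and a mirror counts as fresh when it lags by no more than a chosen tolerance. The function name and data shapes are invented for illustration; nothing like this exists in PyPI or setuptools as far as this thread says.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def fresh_mirrors(
    master_last_modified: datetime,
    mirror_last_modified: Dict[str, datetime],
    max_lag: timedelta = timedelta(0),
) -> List[str]:
    """Return the mirror URLs that lag the master by at most max_lag.

    mirror_last_modified maps a mirror URL to the timestamp that
    mirror reports for its last completed sync (hypothetical shape).
    """
    return [
        url
        for url, stamp in mirror_last_modified.items()
        if master_last_modified - stamp <= max_lag
    ]
```

A tolerance of zero only accepts fully synced mirrors; a client that does not care about the most recent uploads could pass a larger `max_lag`, which is a crude stand-in for the "up to date for the packages you care about" idea.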
But, if mirrors are pinged about updates, they should really be able to keep up to date quickly, as most packages are small and new releases happen at a rate more like every couple hours. Sorry... this is more speculation than based on actual knowledge, but I think there are feasible ways to do these things. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org From renesd at gmail.com Fri Oct 10 22:04:44 2008 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 11 Oct 2008 07:04:44 +1100 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48EFA039.30002@colorstudy.com> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EFA039.30002@colorstudy.com> Message-ID: <64ddb72c0810101304l2243409dw4fca2dbc85fca39a@mail.gmail.com> On Sat, Oct 11, 2008 at 5:34 AM, Ian Bicking wrote: ... > I vaguely remember CPAN doing something like having machine-readable lists > of mirrors, and those lists are available at a couple reliable locations, > and those locations are hardcoded into the tool. > > That doesn't speak to how well updated the mirror is, but I think some Linux > distributions have clever solutions to that aspect too. Debian stores a diff of the package index for each update. So you can quickly download what has changed, and also see where a mirror is up to mirroring with a few http requests. I think it also combines multiple updates together... so that after 20 updates you don't need to get 20, just 1. From amk at amk.ca Sat Oct 11 16:46:52 2008 From: amk at amk.ca (A.M. Kuchling) Date: Sat, 11 Oct 2008 10:46:52 -0400 Subject: [Catalog-sig] Distutils and PyPI : P4-Sprint in D.C. 
In-Reply-To: <94bdd2610810070735g1e4a93d6n45beda83dd8c158c@mail.gmail.com> References: <94bdd2610810070735g1e4a93d6n45beda83dd8c158c@mail.gmail.com> Message-ID: <20081011144652.GA28602@amk.local> On Tue, Oct 07, 2008 at 10:35:49AM -0400, Tarek Ziadé wrote: > We are going to have a P4-sprint (pre-pre-pre-PEP sprint) in D.C. > during the Plone Conference. Sadly I'm not going to be able to make it after all; various things have come up and I'm going to spend Saturday running errands. Sorry. --amk From ziade.tarek at gmail.com Sat Oct 11 19:56:27 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 11 Oct 2008 13:56:27 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks Message-ID: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> Here's the list of tasks we are going to work on. They are simple. - PyPI : write a patch to enforce that a source distribution is uploaded (or to display a warning otherwise), so that if a binary distribution or a zipped egg is uploaded we are sure we provide the source as well. - Documentation: write a glossary for the distutils/setuptools/PyPI terminology on the python.org wiki - PyPI mirroring: write a PEP to implement a mirroring protocol, where mirrors can register at PyPI. Then when a package is uploaded, mirrors will be pinged through RPC so they know they can get synced. - setuptools: finish the patch for the multiple index support, with a CPAN-like mechanism on the client side, with socket timeout management - distutils: code cleaning: better test coverage, remove logging, etc. Tarek -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From ziade.tarek at gmail.com Sun Oct 12 00:24:04 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 11 Oct 2008 18:24:04 -0400 Subject: [Catalog-sig] distribute D.C.
sprint tasks In-Reply-To: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> Message-ID: <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> Day 1 is almost over, We worked on two elements so far: the mirroring thing, and the terminology one these are early drafts, http://wiki.python.org/moin/PythonPackagingTerminology http://wiki.python.org/moin/PEP_374 please comment Tarek On Sat, Oct 11, 2008 at 1:56 PM, Tarek Ziad? wrote: > Here's the lists of tasks we are going to work on. They are simple. > > - PyPI : write a patch to enforce (or display a warning) the source > distribution to be uploaded. so if a binary distribution or a zipped > egg is uploaded > we are sure we provide the source as well. > > - Documentation: write a glossary for the distutils/setuptools/Pypi > terminology on python.org wiki > > - PyPI mirroring: write a PEP to implement a mirroring protocol, where > mirrors can register at PyPI. Then when a package is uploaded, mirrors > will be ping through RPC > so they know they can eventually get synced. > > - setuptools: finish the patch for the multiple index support, with a > CPAN-like mechanism on the client side, with a socket timeout > managment - > > - distutils: code cleaning: better test coverage, remove logging, etc. > > > Tarek > > -- > Tarek Ziad? | Association AfPy | www.afpy.org > Blog FR | http://programmation-python.org > Blog EN | http://tarekziade.wordpress.com/ > -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From lists at zopyx.com Sun Oct 12 01:40:28 2008 From: lists at zopyx.com (Andreas Jung) Date: Sat, 11 Oct 2008 19:40:28 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks In-Reply-To: <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> Message-ID: <48F1396C.8030907@zopyx.com> On 11.10.2008 18:24, Tarek Ziadé wrote: > Day 1 is almost over, > > We worked on two elements so far: the mirroring thing, and the terminology one > > these are early drafts, > > http://wiki.python.org/moin/PythonPackagingTerminology > http://wiki.python.org/moin/PEP_374 > > I think we should also investigate how other repositories like CPAN or the Ruby world deal with mirroring. A notification mechanism appears fragile to me. I believe that the mirrors should remain "dumb" in order to keep the complete system simple and solid - fewer moving parts are better than more. I really have a bad feeling about the approach as described in your PEP. Andreas From ziade.tarek at gmail.com Sun Oct 12 03:56:51 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 11 Oct 2008 21:56:51 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F1396C.8030907@zopyx.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> Message-ID: <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> On Sat, Oct 11, 2008 at 7:40 PM, Andreas Jung wrote: > On 11.10.2008 18:24, Tarek Ziadé
wrote: >> >> Day 1 is almost over, >> >> We worked on two elements so far: the mirroring thing, and the terminology >> one >> >> these are early drafts, >> >> http://wiki.python.org/moin/PythonPackagingTerminology >> http://wiki.python.org/moin/PEP_374 >> >> > > I think we should also investigate how other repositories like CPAN > or the Ruby world deals with mirroring. or rather linux http://www.mail-archive.com/distutils-sig at python.org/msg05791.html > A notification mechanism appears > fragile to me. I believe that the mirrors should remain "dumb" > in order to keep the complete system simple and solid -` Well at some point you need a protocol, otherwise your mirror is this "dumb" thing we cannot garantee to be a reliable thing. > less moving parts > are better than more. I really have a bad feeling about the approach as > described in your PEP. well, I think we need more than a rsync script at some point. > > Andreas > -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From lists at zopyx.com Sun Oct 12 09:57:00 2008 From: lists at zopyx.com (Andreas Jung) Date: Sun, 12 Oct 2008 03:57:00 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> Message-ID: <48F1ADCC.5050602@zopyx.com> On 11.10.2008 21:56 Uhr, Tarek Ziad? wrote: > On Sat, Oct 11, 2008 at 7:40 PM, Andreas Jung wrote: >> On 11.10.2008 18:24 Uhr, Tarek Ziad? 
wrote: >>> Day 1 is almost over, >>> >>> We worked on two elements so far: the mirroring thing, and the terminology >>> one >>> >>> these are early drafts, >>> >>> http://wiki.python.org/moin/PythonPackagingTerminology >>> http://wiki.python.org/moin/PEP_374 >>> >>> >> I think we should also investigate how other repositories like CPAN >> or the Ruby world deals with mirroring. > > or rather linux > > http://www.mail-archive.com/distutils-sig at python.org/msg05791.html > > >> A notification mechanism appears >> fragile to me. I believe that the mirrors should remain "dumb" >> in order to keep the complete system simple and solid -` > > Well at some point you need a protocol, otherwise your mirror > is this "dumb" thing we cannot garantee to be a reliable thing. > >> less moving parts >> are better than more. I really have a bad feeling about the approach as >> described in your PEP. > > well, I think we need more than a rsync script at some point. The question is if you want a push or pull mechanism. z3c.pypimirror implements the pull implementation and performs an incremental update in the sense of rsync in a reliable way. Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: lists.vcf Type: text/x-vcard Size: 316 bytes Desc: not available URL: From sdouche at gmail.com Sun Oct 12 14:40:05 2008 From: sdouche at gmail.com (Sebastien Douche) Date: Sun, 12 Oct 2008 14:40:05 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F1ADCC.5050602@zopyx.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> Message-ID: <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> On Sun, Oct 12, 2008 at 09:57, Andreas Jung wrote: > The question is if you want a push or pull mechanism. 
z3c.pypimirror > implements the pull implementation and performs an incremental update > in the sense of rsync in a reliable way. Can you say more about the incremental update? I use z3c.pypimirror and it needs 2 hours to make a complete update. 2008-10-12 12:01:43,671 DEBUG Statistics 2008-10-12 12:01:43,672 DEBUG ---------- 2008-10-12 12:01:43,672 DEBUG Found (cached): 17626 2008-10-12 12:01:43,672 DEBUG Stored (downloaded): 1306 2008-10-12 12:01:43,673 DEBUG Not found (404): 35 2008-10-12 12:01:43,673 DEBUG Invalid packages: 1 2008-10-12 12:01:43,673 DEBUG Invalid URLs: 265 2008-10-12 12:01:43,673 DEBUG Runtime: 120m21s -- Seb From ziade.tarek at gmail.com Sun Oct 12 14:55:44 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 08:55:44 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F1ADCC.5050602@zopyx.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> Message-ID: <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> On Sun, Oct 12, 2008 at 3:57 AM, Andreas Jung wrote: >> well, I think we need more than a rsync script at some point. > > The question is if you want a push or pull mechanism. z3c.pypimirror > implements the pull implementation and performs an incremental update > in the sense of rsync in a reliable way. You don't want a push mechanism, but you want a way to list the mirrors at PyPI, their states, and know if they are good mirrors or not. That is what a ping mechanism provides. So the point is not about being able to provide a reliable "copy of files" program: I can use for that a "wget --mirror" or "rsync -r" and I don't need to write a program for this. On the other hand, maybe the XML-RPC thing is a little heavy, but we sure need to know how "fresh" a mirror is.
Maybe if the last-modified header is maintained on the mirror this could be enough for PyPI and other third-party application to know about it. -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From ziade.tarek at gmail.com Sun Oct 12 14:58:02 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 08:58:02 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> Message-ID: <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> On Sun, Oct 12, 2008 at 8:40 AM, Sebastien Douche wrote: > On Sun, Oct 12, 2008 at 09:57, Andreas Jung wrote: >> The question is if you want a push or pull mechanism. z3c.pypimirror >> implements the pull implementation and performs an incremental update >> in the sense of rsync in a reliable way. > > Can you speak more on incremental update ? I use z3c;pypimirror and it > needs 2 hours to make a complete update. we use wget --mirror, and it is working quite good, an upgrade is around 10 minutes IIRC but I think rsync is the best way to go for a mirror. From ziade.tarek at gmail.com Sun Oct 12 18:05:20 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 12:05:20 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks In-Reply-To: <48F1396C.8030907@zopyx.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> Message-ID: <94bdd2610810120905i2af1c221gc6075d7fe4b573ce@mail.gmail.com> We removed the RPC mechanism and added a freshness-date principle, to make things simpler. See http://wiki.python.org/moin/PEP_374 On Sat, Oct 11, 2008 at 7:40 PM, Andreas Jung wrote: > On 11.10.2008 18:24, Tarek Ziadé wrote: >> >> Day 1 is almost over, >> >> We worked on two elements so far: the mirroring thing, and the terminology >> one >> >> these are early drafts, >> >> http://wiki.python.org/moin/PythonPackagingTerminology >> http://wiki.python.org/moin/PEP_374 >> >> > > I think we should also investigate how other repositories like CPAN > or the Ruby world deals with mirroring. A notification mechanism appears > fragile to me. I believe that the mirrors should remain "dumb" > in order to keep the complete system simple and solid - less moving parts > are better than more. I really have a bad feeling about the approach as > described in your PEP. > > Andreas > -- Tarek Ziadé
| Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Sun Oct 12 19:14:44 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 19:14:44 +0200 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48EF3111.3010706@zopyx.com> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EF3111.3010706@zopyx.com> Message-ID: <48F23084.2000701@v.loewis.de> > As stated earlier: we can already define multiple servers as part > of a buildout configuration. A better mirror selection algorithm would > be nice to have for the future but right now we don't actually need it > and can live with the current state. Ok. In that case, distributing PyPI would be easy: make one system the master, and maintain a single-master postgres replication to all slaves; on each file upload, also trigger an rsync replication. Disable all writes on the slaves. Regards, Martin From martin at v.loewis.de Sun Oct 12 19:15:42 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 12 Oct 2008 19:15:42 +0200 Subject: [Catalog-sig] PyPI replication project In-Reply-To: References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EF3111.3010706@zopyx.com> Message-ID: <48F230BE.5030101@v.loewis.de> > Last, I think we should create a mirror registration system on PyPI, to > ping mirrors when > a new package is uploaded, so the sync is simpler. 
Why do you think the sync is difficult right now? Regards, Martin From martin at v.loewis.de Sun Oct 12 19:47:39 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 19:47:39 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F1396C.8030907@zopyx.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> Message-ID: <48F2383B.7020103@v.loewis.de> > I think we should also investigate how other repositories like CPAN > or the Ruby world deals with mirroring. A notification mechanism appears > fragile to me. I believe that the mirrors should remain "dumb" > in order to keep the complete system simple and solid - less moving > parts are better than more. I really have a bad feeling about the > approach as described in your PEP. I'm also skeptical about that. I don't think this callback solves any specific problem. Regards, Martin From martin at v.loewis.de Sun Oct 12 19:48:43 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 19:48:43 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> Message-ID: <48F2387B.8060709@v.loewis.de> > Can you speak more on incremental update ? What would you like to know? incremental update should be very easy to implement for a mirror tool, with no additional changes to PyPI. 
Regards, Martin From martin at v.loewis.de Sun Oct 12 19:50:15 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 19:50:15 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> Message-ID: <48F238D7.7050600@v.loewis.de> > we use wget --mirror, and it is working quite good, an upgrade is > around 10 minutes IIRC > but I think rsync is the best way to go for a mirror. I disagree. The best way to do the mirroring is with the specific protocol explicitly designed to do mirroring, namely to look at the changelog. PLEASE DONT USE RSYNC OR WGET TO MIRROR PYPI. PLEASE! Regards, Martin From ziade.tarek at gmail.com Sun Oct 12 19:51:42 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 13:51:42 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F238D7.7050600@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> Message-ID: <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> On Sun, Oct 12, 2008 at 1:50 PM, "Martin v. 
L?wis" wrote: >> we use wget --mirror, and it is working quite good, an upgrade is >> around 10 minutes IIRC >> but I think rsync is the best way to go for a mirror. > > I disagree. The best way to do the mirroring is with the specific > protocol explicitly designed to do mirroring, namely to look at > the changelog. > > PLEASE DONT USE RSYNC OR WGET TO MIRROR PYPI. PLEASE! could you explain why is that a problem ? > > Regards, > Martin > -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Sun Oct 12 19:53:45 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 19:53:45 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> Message-ID: <48F239A9.2040609@v.loewis.de> > That is what a ping mechanism provides. Hmm. If the mirror provided a file "last-changed", it would be very easy to find out whether the mirror is still running. Mirrors not providing that file could be ignored. > I can use for that a "wget --mirror" or "rsync -r" and I don't need to > write a program for this. If more people start mirroring PyPI through wget or rsync, I need to ban specific IP addresses. For the moment, please consider my request to stop mirroring PyPI with wget, and to write (or use) a real mirroring tool. 
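The "last-changed" idea from earlier in this message could be checked roughly as below. This is a sketch under stated assumptions, not an agreed format: it assumes the file holds a single timestamp such as 2008-10-12T12:00:00, that a mirror older than one day counts as stale, and the function name `mirror_is_fresh` is hypothetical. Downloading the file is left out; a mirror that does not provide it is represented by None.

```python
from datetime import datetime, timedelta


def mirror_is_fresh(last_changed_text, now, max_age=timedelta(days=1)):
    """Decide whether a mirror still looks alive.

    last_changed_text: content of the mirror's "last-changed" file,
                       or None if the mirror does not provide one.
    now:               current time as a datetime.
    max_age:           assumed staleness threshold (one day by default).
    """
    if last_changed_text is None:
        # Mirrors not providing the file are ignored.
        return False
    try:
        stamp = datetime.strptime(last_changed_text.strip(), "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        # An unparsable file is treated like a missing one.
        return False
    return now - stamp <= max_age


if __name__ == "__main__":
    now = datetime(2008, 10, 12, 18, 0, 0)
    print(mirror_is_fresh("2008-10-12T12:00:00", now))
    print(mirror_is_fresh(None, now))
```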
Regards, Martin From martin at v.loewis.de Sun Oct 12 19:58:52 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 19:58:52 +0200 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48EFA039.30002@colorstudy.com> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EFA039.30002@colorstudy.com> Message-ID: <48F23ADC.4080108@v.loewis.de> > Sorry... this is more speculation than based on actual knowledge, but I > think there are feasible ways to do these things. PyPI provides mirrors with a changelog where they can efficiently ask for a list of packages that have changed since they last synchronized. This is the recommended way for mirrors to operate; polling the changelog once every minute is acceptable load for PyPI. Regards, Martin From martin at v.loewis.de Sun Oct 12 20:12:33 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 20:12:33 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> Message-ID: <48F23E11.30801@v.loewis.de> > could you explain why is that a problem ? It produces significant load on the master. 
If you look at the web stats, e.g for September: http://pypi.python.org/webstats/usage_200809.html you see that there had been 5671455 hits, or 41%, of accesses through wget. The problem with wget mirroring is that it needs to read *many* pages, to find out the *few* changes. FWIW, it's also the case that 4940769 hits originate from France. Could it be that you are alone responsible for 40% of the traffic on PyPI? Regards, Martin From ziade.tarek at gmail.com Sun Oct 12 20:16:17 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 14:16:17 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F239A9.2040609@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> Message-ID: <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> On Sun, Oct 12, 2008 at 1:53 PM, "Martin v. L?wis" wrote: >> That is what a ping mechanism provides. > > Hmm. If the mirror provided a file "last-changed", it would be very > easy to find out whether the mirror is still running. Mirrors not > providing that file could be ignored. Right, please take a look at my last version http://wiki.python.org/moin/PEP_374 it tries to go in that direction From ziade.tarek at gmail.com Sun Oct 12 20:32:25 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 14:32:25 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks
In-Reply-To: <48F23E11.30801@v.loewis.de>
References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de>
Message-ID: <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com>

On Sun, Oct 12, 2008 at 2:12 PM, "Martin v. Löwis" wrote:
>> could you explain why is that a problem ?
>
> It produces significant load on the master. If you look at the web
> stats, e.g for September:
>
> http://pypi.python.org/webstats/usage_200809.html
>
> you see that there had been 5671455 hits, or 41%, of accesses through
> wget.
>
> The problem with wget mirroring is that it needs to read *many*
> pages, to find out the *few* changes.

Sure,

> FWIW, it's also the case that 4940769 hits originate from
> France. Could it be that you are alone responsible for 40% of
> the traffic on PyPI?
>

Yes, I am the only Python developer in France. That's me.

Just kidding :)

France has a lot of python/plone developers that triggers buildouts every day,
so I am pretty sure the mirrors don't make the whole traffic in PyPI.

We could probably do things better though. Here's my proposal:

+ see if we can locate the mirrors, so for instance, if I register a
  "Paris mirror" people will eventually go there because it is the
  nearest location for them (à la CPAN)
+ create a new user agent for mirroring tools

Regards
Tarek

> Regards,
> Martin
>

--
Tarek Ziadé
| Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Sun Oct 12 20:34:27 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 20:34:27 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> Message-ID: <48F24333.5070008@v.loewis.de> > Right, please take a look at my last version http://wiki.python.org/moin/PEP_374 > it tries to go in that direction For such an infrastructure (which apparently intends to mirror the files as well), I insist that a propagation of download counters is made mandatory. The only mirrors that can be excused from that are private ones. Regards, Martin From martin at v.loewis.de Sun Oct 12 20:49:26 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 20:49:26 +0200 Subject: [Catalog-sig] distribute D.C. 
sprint tasks In-Reply-To: <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> Message-ID: <48F246B6.8000707@v.loewis.de> >> FWIW, it's also the case that 4940769 hits originate from >> France. Could it be that you are alone responsible for 40% of >> the traffic on PyPI? >> > > Yes, I am the only Python developer in France. That's me. > > Just kidding :) > > France has a lot of python/plone developers that triggers buildouts every day, > so I am pretty sure the mirrors don't make the whole traffic in PyPI. Hmm. Yesterday, there were 199250 accesses to PyPI through wget. Of those, 169971 requests came from a single address (from Dedibox in France), 28966 requests from a second one (from Sakura in Japan). So it *is* wget mirrors that make the whole traffic in PyPI. Regards, Martin From ziade.tarek at gmail.com Sun Oct 12 21:12:51 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 15:12:51 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks In-Reply-To: <48F24333.5070008@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> Message-ID: <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> On Sun, Oct 12, 2008 at 2:34 PM, "Martin v. L?wis" wrote: >> Right, please take a look at my last version http://wiki.python.org/moin/PEP_374 >> it tries to go in that direction > > For such an infrastructure (which apparently intends to mirror the files > as well), I insist that a propagation of download counters is made > mandatory. But how do you want to display them ? Do you want to display the grand total on PyPI ? In that cas each mirror should provide a counter page, > The only mirrors that can be excused from that are private > ones. > > Regards, > Martin > -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From ziade.tarek at gmail.com Sun Oct 12 21:21:02 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 15:21:02 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks
In-Reply-To: <48F246B6.8000707@v.loewis.de>
References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de>
Message-ID: <94bdd2610810121221se32d293yd51e51b516fd3811@mail.gmail.com>

On Sun, Oct 12, 2008 at 2:49 PM, "Martin v. Löwis" wrote:
>>> FWIW, it's also the case that 4940769 hits originate from
>>> France. Could it be that you are alone responsible for 40% of
>>> the traffic on PyPI?
>>>
>>
>> Yes, I am the only Python developer in France. That's me.
>>
>> Just kidding :)
>>
>> France has a lot of python/plone developers that triggers buildouts every day,
>> so I am pretty sure the mirrors don't make the whole traffic in PyPI.
>
> Hmm. Yesterday, there were 199250 accesses to PyPI through wget.
> Of those, 169971 requests came from a single address (from Dedibox in
> France), 28966 requests from a second one (from Sakura in Japan).

Yes, that is us. We have two mirrors here: a smart one (a proxy that
makes the minimum number of calls to PyPI) and the wget one; the second
one is the wget mirror.

However, wget's mirror option uses a file listing and a timestamping
mechanism, so files or pages are downloaded only if they have changed.
Basically, only headers are read.

I'll shut it off anyway. We have the smart collective.proxy at work now,
and will eventually switch the wget one to the pypimirror one
if we provide an "official" full mirror.

>
> So it *is* wget mirrors that make the whole traffic in PyPI.
>
> Regards,
> Martin
>

--
Tarek Ziadé
| Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Sun Oct 12 21:23:44 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 21:23:44 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> Message-ID: <48F24EC0.601@v.loewis.de> > But how do you want to display them ? Do you want to display the grand total > on PyPI ? Yes, exactly so. > In that cas each mirror should provide a counter page, Why that? Shouldn't the mirrors then also display the grand total? Regards, Martin From ziade.tarek at gmail.com Sun Oct 12 21:34:20 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 15:34:20 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks In-Reply-To: <48F24EC0.601@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> Message-ID: <94bdd2610810121234l133cac7bsb6377a87f44ae8f9@mail.gmail.com> On Sun, Oct 12, 2008 at 3:23 PM, "Martin v. L?wis" wrote: >> But how do you want to display them ? Do you want to display the grand total >> on PyPI ? > > Yes, exactly so. > >> In that cas each mirror should provide a counter page, > > Why that? Shouldn't the mirrors then also display the grand total? ok then we'll try to think about some solutions for that, > > Regards, > Martin > -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From ziade.tarek at gmail.com Sun Oct 12 22:08:27 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 12 Oct 2008 16:08:27 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F24EC0.601@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> Message-ID: <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> On Sun, Oct 12, 2008 at 3:23 PM, "Martin v. 
L?wis" wrote: >> But how do you want to display them ? Do you want to display the grand total >> on PyPI ? > > Yes, exactly so. > >> In that cas each mirror should provide a counter page, > > Why that? Shouldn't the mirrors then also display the grand total? how do you collect them in PyPI ? via Apache logs ? > > Regards, > Martin > -- Tarek Ziad? | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Sun Oct 12 22:32:51 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 22:32:51 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> Message-ID: <48F25EF3.60509@v.loewis.de> > how do you collect them in PyPI ? via Apache logs ? Exactly. It's in tools/apache_count.py Regards, Martin From zopyxfilter at googlemail.com Sun Oct 12 23:33:17 2008 From: zopyxfilter at googlemail.com (zopyxfilter at googlemail.com) Date: Sun, 12 Oct 2008 17:33:17 -0400 Subject: [Catalog-sig] distribute D.C. 
sprint tasks In-Reply-To: <48F2387B.8060709@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <48F2387B.8060709@v.loewis.de> Message-ID: <48F26D1D.4010901@gmail.com> On 12.10.2008 13:48 Uhr, Martin v. L?wis wrote: >> Can you speak more on incremental update ? >> > > What would you like to know? incremental update should be very > easy to implement for a mirror tool, with no additional changes > to PyPI. Our z3c.pypimirror already performs an incremental update based on the information available from the index.html page of the simple index and the available md5 hashes. Works like a charm... Andreas From martin at v.loewis.de Sun Oct 12 23:47:00 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Oct 2008 23:47:00 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F26D1D.4010901@gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <48F2387B.8060709@v.loewis.de> <48F26D1D.4010901@gmail.com> Message-ID: <48F27054.1060307@v.loewis.de> > Our z3c.pypimirror already performs an incremental update based on > the information available from the index.html page of the simple > index and the available md5 hashes. Works like a charm... So how does it find out when a release gets made? 
Regards, Martin From zopyxfilter at googlemail.com Sun Oct 12 23:52:29 2008 From: zopyxfilter at googlemail.com (zopyxfilter at googlemail.com) Date: Sun, 12 Oct 2008 17:52:29 -0400 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F27054.1060307@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <48F2387B.8060709@v.loewis.de> <48F26D1D.4010901@gmail.com> <48F27054.1060307@v.loewis.de> Message-ID: <48F2719D.9050409@gmail.com> On 12.10.2008 17:47 Uhr, Martin v. L?wis wrote: >> Our z3c.pypimirror already performs an incremental update based on >> the information available from the index.html page of the simple >> index and the available md5 hashes. Works like a charm... >> > > So how does it find out when a release gets made? > What do you mean by that? Andreas From martin at v.loewis.de Mon Oct 13 00:18:35 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 13 Oct 2008 00:18:35 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F2719D.9050409@gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <48F2387B.8060709@v.loewis.de> <48F26D1D.4010901@gmail.com> <48F27054.1060307@v.loewis.de> <48F2719D.9050409@gmail.com> Message-ID: <48F277BB.4060105@v.loewis.de> >>> Our z3c.pypimirror already performs an incremental update based on >>> the information available from the index.html page of the simple >>> index and the available md5 hashes. 
Works like a charm...
>>>
>>
>> So how does it find out when a release gets made?
>>
>
> What do you mean by that?

If you only look at

http://pypi.python.org/simple/

then you have no way of finding out what changed. So "the information
available from the index.html page of the simple index" is not actually
suitable for building incremental mirroring. What you describe is not
possible.

I just looked at the z3c.pypimirror source, and found that it isn't
really incremental: whenever it mirrors, it looks at *all* index.html
pages, of each and every package (all 4900 of them, except when you
restrict the mirror). It then only downloads any new files that may
have been added/deleted, and it *is* incremental wrt. files. IIUC,
it is *not* incremental wrt. the package index itself.

Please correct me if I'm wrong (and please correct z3c.pypimirror
if I'm not :-)

Can you please set a specific User-Agent header, to find out what
amount of traffic pypimirror produces? Currently, urllib accounts for
17% of the requests, excluding requests made through urllib by
setuptools (which is a separate 18%). It's probably not all of them
through pypimirror, but of the 64626 requests made through urllib
yesterday, 41671 originated from zopyx.com.

For real incremental mirroring, you should retrieve the changelog,
and access only those package pages that have actually changed since
the last time you ran the mirror (successfully).

Regards,
Martin

From zopyxfilter at googlemail.com Mon Oct 13 14:10:24 2008
From: zopyxfilter at googlemail.com (zopyxfilter at googlemail.com)
Date: Mon, 13 Oct 2008 08:10:24 -0400
Subject: [Catalog-sig] distribute D.C. sprint tasks
In-Reply-To: <48F277BB.4060105@v.loewis.de>
References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <48F2387B.8060709@v.loewis.de> <48F26D1D.4010901@gmail.com> <48F27054.1060307@v.loewis.de> <48F2719D.9050409@gmail.com> <48F277BB.4060105@v.loewis.de>
Message-ID: <48F33AB0.9010405@gmail.com>

On 12.10.2008 18:18 Uhr, Martin v. Löwis wrote:
>>>> Our z3c.pypimirror already performs an incremental update based on
>>>> the information available from the index.html page of the simple
>>>> index and the available md5 hashes. Works like a charm...
>>>>
>>>>
>>> So how does it find out when a release gets made?
>>>
>>>
>> What do you mean by that?
>>
>
> If you only look at
>
> http://pypi.python.org/simple/
>
> then you have no way of finding out what changed. So "the information
> available from the index.html page of the simple index" is not actually
> suitable for building incremental mirroring. What you describe is not
> possible.
>
> I just looked at the z3c.pypimirror source, and found that it isn't
> really incremental: whenever it mirrors, it looks at *all* index.html
> pages, of each and every package (all 4900 of them, except when you
> restrict the mirror). It then only downloads any new files that may
> have been added/deleted, and it *is* incremental wrt. files. IIUC,
> it is *not* incremental wrt. the package index itself.
>
> Please correct me if I'm wrong (and please correct z3c.pypimirror
> if I'm not :-)
>

Good suggestion. I think we can take the changelog into account easily.
I'll have to check this with Daniel Kraft, the original author of the package.

> Can you please set a specific User-Agent header, to find out what
> amount of traffic pypimirror produces?
Currently, urllib accounts for > 17% of the requests, excluding requests made through urllib by > setuptools (which is a separate 18%). It's probably not all of them > through pypimirror, but of the 64626 requests made through urllib > yesterday, 41671 originated from zopyx.com. > > Should not be a problem. > For real incremental mirroring, you should retrieve the changelog, > and access only those package pages that have actually changed since > the last time you ran the mirror (successfully). See above. Andreas
From ziade.tarek at gmail.com Mon Oct 13 16:35:11 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Mon, 13 Oct 2008 10:35:11 -0400 Subject: [Catalog-sig] distribute D.C.
sprint tasks
In-Reply-To: <48F25EF3.60509@v.loewis.de>
References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de>
Message-ID: <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com>

On Sun, Oct 12, 2008 at 4:32 PM, "Martin v. Löwis" wrote:
>> how do you collect them in PyPI ? via Apache logs ?
>
> Exactly. It's in tools/apache_count.py

How often do you run it? I guess a daily update is enough for the grand total?

Anyway, the mirrors should be able to reuse this script for their own
internal count as well, and PyPI would need to provide a way for the
mirrors to report their counts and to get back the grand count.
But I wouldn't want to make Apache mandatory for the mirrors.

What about this:

1/ each mirror maintains simple text-based stats pages, with the local
   count, reachable from a URL (/local_stats)
2/ PyPI modifies its script so it injects its Apache count + the
   registered mirrors' local counts
3/ PyPI maintains a simple text stats page, with the grand count (/stats)

One stats page represents one day, and the stats are presented in
folders that represent the year and the month.

So the stats from October 11th will be reachable at:

.../local_stats/2008/10/11

The stats page can refer to the packages using a
PACKAGE_NAME/FILE = HITS syntax:

iw.recipe.fss/iw.recipe.fss-0.2.1.tar.gz = 123
foo.bar/foo.bar-0.3.tar.gz = 12
...

This is a fairly simple structure any mirroring tool can create, and
we could provide a simple Python script that generates it from the
Apache logs.

Regards
Tarek

>
> Regards,
> Martin
>

--
Tarek Ziadé
| Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Mon Oct 13 23:35:34 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 13 Oct 2008 23:35:34 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> Message-ID: <48F3BF26.4050905@v.loewis.de> > How often do you run it ? I guess a daily update is enough for the grand total ? I think it still runs daily. There was one complaint about that, but the user could accept that as a policy after understanding what happened (he thought the feature was broken as there was no immediate update). > 1/ each mirror maintain simple text-based stats pages, with the local > count, reachable from an url (/local_stats) > 2/ PyPI modifies its script so it injects its apache count + the > registered mirrors local counts > 3/ PyPI maintains a simple text stats page, with the grand count (/stats) Sounds fine to me. Expect that to become a long file, though, with one line per file (roughly 20000 files with at least one download). > one stat page represents one day, and the stats are presented in > folders that represents the year and the month I wonder whether it might be easier to have a single file, with the totals for that server. 
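To make the proposed stats pages concrete, here is a rough sketch of how per-mirror counts could be summed into a grand total. It assumes one PACKAGE_NAME/FILE = HITS entry per line, also accepts lines where the "=" is left out, and assumes file names contain no whitespace. The names `parse_stats` and `grand_total` are hypothetical; this is not PyPI's tools/apache_count.py.

```python
from collections import Counter


def parse_stats(text):
    """Parse one stats page into a Counter mapping 'package/file' to hits.

    Accepts both 'pkg/file = 123' and 'pkg/file 123' (assumed formats).
    """
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            # Skip blank lines and anything without a hit count.
            continue
        path, hits = parts[0], parts[-1]
        counts[path] += int(hits)
    return counts


def grand_total(pages):
    """Sum the counts of several stats pages (local count plus mirrors)."""
    total = Counter()
    for page in pages:
        total.update(parse_stats(page))
    return total


if __name__ == "__main__":
    master = "iw.recipe.fss/iw.recipe.fss-0.2.1.tar.gz = 123\nfoo.bar/foo.bar-0.3.tar.gz = 12\n"
    mirror = "foo.bar/foo.bar-0.3.tar.gz 8\n"
    for path, hits in sorted(grand_total([master, mirror]).items()):
        print(path, hits)
```

Whether the pages are split per day or kept as one file of totals only changes which pages are passed to `grand_total`; the per-line format stays the same.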
> iw.recipe.fss/iw.recipe.fss-0.2.1.tar.gz = 123 > foo.bar/foo.bar-0.3.tar.gz = 12 I would drop the "=" in that syntax. Regards, Martin From tarek.ziade at ingeniweb.com Mon Oct 13 23:58:57 2008 From: tarek.ziade at ingeniweb.com (Tarek Ziade) Date: Mon, 13 Oct 2008 17:58:57 -0400 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <48F3BF26.4050905@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> Message-ID: 2008/10/13 "Martin v. Löwis" > > How often do you run it ? I guess a daily update is enough for the grand > total ? > > I think it still runs daily. There was one complaint about that, but the > user could accept that as a policy after understanding what happened (he > thought the feature was broken as there was no immediate update). > > > 1/ each mirror maintain simple text-based stats pages, with the local > > count, reachable from an url (/local_stats) > > 2/ PyPI modifies its script so it injects its apache count + the > > registered mirrors local counts > > 3/ PyPI maintains a simple text stats page, with the grand count > (/stats) > > Sounds fine to me. Expect that to become a long file, though, with one > line per file (roughly 20000 files with at least one download). 
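A minimal sketch of steps 2/ and 3/ quoted above, assuming each stats page holds one "PACKAGE_NAME/FILE = HITS" entry per line and that the "=" may be dropped as Martin suggests. The function names are hypothetical and are not taken from tools/apache_count.py; how the pages are fetched from the mirrors is left out.

```python
import re
from collections import Counter

# Assumed line format: "PACKAGE_NAME/FILE = HITS", with the "=" optional.
LINE_RE = re.compile(r"^(?P<path>\S+)(?:\s*=\s*|\s+)(?P<hits>\d+)$")


def parse_stats_page(text):
    """Parse one text stats page into a Counter mapping path -> hits."""
    counts = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError("unparseable stats line: %r" % raw)
        counts[match.group("path")] += int(match.group("hits"))
    return counts


def grand_total(pages):
    """Sum the PyPI page and every mirror's local page into one count."""
    total = Counter()
    for text in pages:
        total.update(parse_stats_page(text))
    return total
```

With roughly 20000 lines per page this stays cheap: every page is read once and the total is one dictionary entry per file.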
Maybe we could use one subfolder per letter of the alphabet, like what is done in packages/ at PyPI. That would lower it to roughly 1000 items per page. > > > one stat page represents one day, and the stats are presented in > > folders that represents the year and the month > > I wonder whether it might be easier to have a single file, with the > totals for that server. You would need to specify a timestamp for each single download though, to make sure PyPI knows which hits to count, depending on the last date it checked the mirror. If we have 1000 downloads per day, that's a huge file after a while. > > > > iw.recipe.fss/iw.recipe.fss-0.2.1.tar.gz = 123 > > foo.bar/foo.bar-0.3.tar.gz = 12 > > I would drop the "=" in that syntax. > OK, I'll update the proposal to reflect this. > Regards, > Martin > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > http://mail.python.org/mailman/listinfo/distutils-sig > -- Tarek Ziadé - Directeur Technique INGENIWEB (TM) - SAS 50000 Euros - RC B 438 725 632 Bureaux de la Colline - 1 rue Royale - Bâtiment D - 9ème étage 92210 Saint Cloud - France Phone : 01.78.15.24.00 / Fax : 01 46 02 44 04 http://www.ingeniweb.com - une société du groupe Alter Way From martin at v.loewis.de Tue Oct 14 00:16:56 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 14 Oct 2008 00:16:56 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> Message-ID: <48F3C8D8.4020804@v.loewis.de> > Maybe we could use one subfolder per alphabet letter, Would that simplify anything? PyPI uses one directory per letter to reduce the number of files in a single directory, in case ext3 doesn't deal with large directories well. For the stats, the "large directories" argument wouldn't count. OTOH, if you do have separate pages per letter, the master server would still need to download all individual files. Having them split into chunks just increases the load, rather than reducing it. > You would need to specify a timestamp for each single download though, > to make sure PyPI > knows which hits to count, depending on the last date it checked the > mirror. No. It would just compute the grand total from scratch each time. Regards, Martin From ziade.tarek at gmail.com Tue Oct 14 02:34:00 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Mon, 13 Oct 2008 20:34:00 -0400 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: <48F3C8D8.4020804@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> Message-ID: <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> On Mon, Oct 13, 2008 at 6:16 PM, "Martin v. Löwis" wrote: >> Maybe we could use one subfolder per alphabet letter, > > Would that simplify anything? > > PyPI uses one directory per letter to reduce the number of files in a > single directory, in case ext3 doesn't deal with large directories well. > For the stats, the "large directories" argument wouldn't count. > > OTOH, if you do have separate pages per letter, the master server would > still need to download all individual files. Having them split into > chunks just increases the load, rather than reducing it. Yes, I thought you were concerned about the size of that file, rather than the number of calls PyPI would need to perform. > > >> You would need to specify a timestamp for each single download though, >> to make sure PyPI >> knows which hits to count, depending on the last date it checked the >> mirror. > > No. It would just compute the grand total from scratch each time. > OK. OTOH, you would lose an interesting piece of information: how downloads evolve over time. As a packager, I can see some interesting use cases. For example, when foo 2.0 gets out, I can watch foo 1.0 downloads decrease and foo 2.0 downloads rise (and if not, make sure I have promoted 2.0 correctly). People would be able to build interesting statistics tools from there. This would of course be possible only if PyPI provides the same timestamped pages for the grand total. 
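A rough sketch of that use case, assuming one stats page per day published under a YEAR/MONTH/DAY path as in the proposal. Both function names and the in-memory layout (a dict mapping each date to that day's filename -> hits dict) are assumptions made for illustration only.

```python
import datetime


def stats_path(day, prefix="/stats"):
    """Build the per-day page path, e.g. /stats/2008/10/11."""
    return "%s/%04d/%02d/%02d" % (prefix, day.year, day.month, day.day)


def daily_series(pages_by_day, filename):
    """Return [(date, hits)] for one file, oldest day first.

    A file that does not appear on a given day's page counts as 0 hits.
    """
    return [
        (day, pages_by_day[day].get(filename, 0))
        for day in sorted(pages_by_day)
    ]
```

Comparing the series for foo 1.0 and foo 2.0 over the days after a release would show whether the new version is picking up.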
This leads to another point we did not discuss yet: it would be interesting to keep the user-agent info in the mirrors, and make sure every piece of automatic package-grabbing software out there has its own user agent id. For instance, knowing that 90% of the downloads of a given package were done by zc.buildout is interesting. IIRC, we cannot know it right now, and I could work on the zc.buildout side for that, because it uses the setuptools user agent id. Regards Tarek > > Regards, > Martin > -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Tue Oct 14 06:59:54 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Oct 2008 06:59:54 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> Message-ID: <48F4274A.9060601@v.loewis.de> > OTOH you would lose an interesting info: how downloads evolve in > time. Users interested in that could produce that information themselves, though: they query the download stats once a day, and compute the first derivative. > For instance, knowing that 90% of the downloads of a given package > were done by zc.buildout is interesting. 
IIRC, we cannot know it > right now, and I could work on zc.buildout side for that, because it > uses the setuptools user agent id In principle, it would be fine with me to track that, as long as I don't need to preserve the complete log files. So would the stats file then be of the form package,filename,useragent,count ? (commas in the useragent replaced with semicolons) Regards, Martin From ziade.tarek at gmail.com Tue Oct 14 14:02:05 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Tue, 14 Oct 2008 14:02:05 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <48F4274A.9060601@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> Message-ID: <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> On Tue, Oct 14, 2008 at 12:59 AM, "Martin v. Löwis" wrote: >> OTOH you would lose an interesting info: how downloads evolve in >> time. > > Users interested in that could produce that information themselves, > though: they query the download stats once a day, and compute the > first derivative. ok > >> For instance, knowing that 90% of the downloads of a given package >> were done by zc.buildout is interesting. IIRC, we cannot know it >> right now, and I could work on zc.buildout side for that, because it >> uses the setuptools user agent id > > In principle, it would be fine with me to track that, as long as I > don't need to preserve the complete log files. > > So would the stats file then be of the form > > package,filename,useragent,count > > ? 
(commas in the useragent replaced with semicolons) > Sounds good, I'll write that down in the proposal as well. The last points I can think of, which we discussed at the sprint: - are non open source licensed packages allowed at PyPI ? - wouldn't it make sense for open source packages to require an sdist upload before any other kind of distribution (this is a feature requested by many people in fact, as binary distributions obfuscate things and make it hard to install if it's not the same version, and if it was not intended by the packager) Regards > Regards, > Martin > > > -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From tseaver at palladion.com Tue Oct 14 14:03:30 2008 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 14 Oct 2008 08:03:30 -0400 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <48F4274A.9060601@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> Message-ID: <48F48A92.6010106@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin v. Löwis wrote: >> OTOH you would lose an interesting info: how downloads evolve in >> time. > > Users interested in that could produce that information themselves, > though: they query the download stats once a day, and compute the > first derivative. > >> For instance, knowing that 90% of the downloads of a given package >> were done by zc.buildout is interesting. 
IIRC, we cannot know it >> right now, and I could work on zc.buildout side for that, because it >> uses the setuptools user agent id > > In principle, it would be fine with me to track that, as long as I > don't need to preserve the complete log files. > > So would the stats file then be of the form > > package,filename,useragent,count > > ? (commas in the useragent replaced with semicolons) Why not just use the csv module for that, and let it handle the escaping? Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFI9IqS+gerLs4ltQ4RAnNPAKDUZ+TuDjTt8yL4ncf78DCeSYiXsACfYjB2 p/l1QhaPSovWJHMdL+JcqrU= =4ClS -----END PGP SIGNATURE----- From tseaver at palladion.com Tue Oct 14 16:15:49 2008 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 14 Oct 2008 10:15:49 -0400 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 (trimming distutils SIG, as this is about PyPI policy): Tarek Ziadé wrote: > On Tue, Oct 14, 2008 at 12:59 AM, "Martin v. Löwis" wrote: > last points I can think of, that we have discussed at the sprint: > > - are non open source licensed packages allowed at PyPI ? 
> > - wouldn't it make sense for open source package to force a sdist > upload before any other kind of distribution > > (this is a feature claimed by many people in fact, as binary > distribution obfuscate things and make it hard to install if it's > not the same version, and if it was not intended by the packager) I think a reasonable policy would be to allow 'register' but not 'upload' of non-FOSS packages: those who want binary-only distribution typically have a profit-motive in mind, and can therefore pay for their own bandwidth for downloads. Likewise, I think a requirement that packagers upload an 'sdist' before uploading any 'bdist' files (eggs, windows installers, etc.) is reasonable, as it reinforces the first policy, as well as removes a certain class of "you can't get there from here" bugs (i.e., the package is available on PyPI, but not in any form which a particular user can use, even if that user could install the package from source). I don't think a subversion URL is an adequate replacement for an 'sdist': if the package is not mature enough to run 'setup.py sdist', then it shouldn't be used for 'bdist' either. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFI9KmV+gerLs4ltQ4RAvviAJ9KSEh0Gt+j55CnFhxUNXIGgQenugCfTulw oX5eRUfIJQ+e4Q09b0e6MK4= =gj+p -----END PGP SIGNATURE----- From martin at v.loewis.de Tue Oct 14 19:52:52 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Oct 2008 19:52:52 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> Message-ID: <48F4DC74.4010308@v.loewis.de> > - are non open source licensed packages allowed at PyPI ? Sure! There is no censorship applied in PyPI, except for content completely unrelated to Python. The only exception is when the package owner is unresponsive, and somebody else wants to take over the package. We need some procedure to formalize this case. > - wouldn't it make sense for open source package to force a sdist > upload before any other kind of distribution (this is a feature > claimed by many people in fact, as binary distribution obfuscate > things and make it hard to install if it's not the same version, and > if it was not intended by the packager) I don't want to impose quality control on the packages. If they don't upload anything, fine. If they upload broken packages, fine. If they supply invalid URLs, fine. If they mistype their email addresses, names, or licensing terms, fine. If they fail to provide source code, fine. Users should contact the authors and report problems with the registration if they find any. They sometimes mistake the PyPI tracker as tracking problems with the packages, but that happens rarely. Perhaps we can provide a form to submit a message to the package owner, to be used when everything else fails (such a form would require a PyPI account for the sender, too). 
Regards, Martin From martin at v.loewis.de Tue Oct 14 19:55:38 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Oct 2008 19:55:38 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <48F48A92.6010106@palladion.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <48F48A92.6010106@palladion.com> Message-ID: <48F4DD1A.3050808@v.loewis.de> >> So would the stats file then be of the form > >> package,filename,useragent,count > >> ? (commas in the useragent replaced with semicolons) > > Why not just use the csv module for that, and let it handle the escaping? I don't want to specify "the format is the one that the csv module produces". Instead, I could accept an explicit specification of what the CSV module produces, indicating that mirrors are encouraged to use the CSV module to actually produce the data. However, if somebody prefers to implement the mirror in Perl, then this should still be possible. So what would the specification then read like? Regards, Martin From tseaver at palladion.com Tue Oct 14 20:07:42 2008 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 14 Oct 2008 14:07:42 -0400 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: <48F4DD1A.3050808@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <48F48A92.6010106@palladion.com> <48F4DD1A.3050808@v.loewis.de> Message-ID: <48F4DFEE.1050702@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin v. Löwis wrote: >>> So would the stats file then be of the form >>> package,filename,useragent,count >>> ? (commas in the useragent replaced with semicolons) >> Why not just use the csv module for that, and let it handle the escaping? > > I don't want to specify "the format is the one that the csv module > produces". Instead, I could accept an explicit specification of what > the CSV module produces, indicating that mirrors are encouraged to use > the CSV module to actually produce the data. However, if somebody > prefers to implement the mirror in Perl, then this should still be > possible. > > So what would the specification then read like? The 'excel' dialect specified in PEP 305: http://www.python.org/dev/peps/pep-0305/#id19 Tres. 
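As an illustration of what the 'excel' dialect does with the proposed package,filename,useragent,count rows, here is a small sketch using the standard csv module. The helper name write_stats and the sample user agent string are made up for the example; the point is that a comma inside the user agent gets quoted rather than replaced with a semicolon.

```python
import csv
import io


def write_stats(rows, stream):
    """Write (package, filename, useragent, count) rows in the 'excel' dialect."""
    writer = csv.writer(stream, dialect="excel")
    for package, filename, useragent, count in rows:
        writer.writerow([package, filename, useragent, count])


# Hypothetical sample row; the user agent value is invented for illustration.
buf = io.StringIO()
write_stats(
    [("foo.bar", "foo.bar-0.3.tar.gz", "setuptools/0.6c9, Python 2.5", 12)],
    buf,
)
```

Any CSV reader that follows the same quoting rules, in Perl or elsewhere, should get the four fields back unchanged.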
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFI9N/u+gerLs4ltQ4RAjbMAKCNdXlbkjlcfnhZBo/yoq36JY1RtgCgvAEf FVZyUDGHHiUqduzrJq711W0= =tBHR -----END PGP SIGNATURE----- From martin at v.loewis.de Tue Oct 14 20:44:45 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Oct 2008 20:44:45 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: <48F4DFEE.1050702@palladion.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <48F48A92.6010106@palladion.com> <48F4DD1A.3050808@v.loewis.de> <48F4DFEE.1050702@palladion.com> Message-ID: <48F4E89D.50600@v.loewis.de> >> So what would the specification then read like? > > The 'excel' dialect specified in PEP 305: > > http://www.python.org/dev/peps/pep-0305/#id19 That works fine for me. I notice that the PEP doesn't actually match the API, in particular wrt. to passing a fieldnames keyword argument. Do we need to specify whether there will be a heading line, and if so, should we require it? Thanks, Martin (who is not a CSV expert at all) From robert.kern at gmail.com Tue Oct 14 20:51:32 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 14 Oct 2008 13:51:32 -0500 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: <48F4DC74.4010308@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> <48F4DC74.4010308@v.loewis.de> Message-ID: Martin v. Löwis wrote: >> - are non open source licensed packages allowed at PyPI ? > > Sure! There is no censorship applied in PyPI, except for content > completely unrelated to Python. > > The only exception is when the package owner is unresponsive, > and somebody else wants to take over the package. We need some > procedure to formalize this case. If you are going to formalize such procedures, it may also be worth thinking about the case of someone uploading code that they do not have rights to. It's not very likely, but the response often must be swift, at least in some jurisdictions. The potential for mirrors, and thus the need for coordinated action, makes it even more important to formalize a procedure, IMO. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tseaver at palladion.com Tue Oct 14 20:52:19 2008 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 14 Oct 2008 14:52:19 -0400 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: <48F4E89D.50600@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24333.5070008@v.loewis.de> <94bdd2610810121212xb132054y1e43f87e034fd1c2@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <48F48A92.6010106@palladion.com> <48F4DD1A.3050808@v.loewis.de> <48F4DFEE.1050702@palladion.com> <48F4E89D.50600@v.loewis.de> Message-ID: <48F4EA63.9060908@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin v. Löwis wrote: >>> So what would the specification then read like? >> The 'excel' dialect specified in PEP 305: >> >> http://www.python.org/dev/peps/pep-0305/#id19 > > That works fine for me. > > I notice that the PEP doesn't actually match the API, in particular > wrt. to passing a fieldnames keyword argument. > > Do we need to specify whether there will be a heading line, and if > so, should we require it? A heading line makes the file self-documenting, and even a bit future-proof (assuming reasonable defaults exist for any columns added later). Given that we don't have any legacy code which is generating one without the headings, I would require it. Tres. 
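A sketch of the heading-line idea, assuming the four column names from the proposal. The function names are hypothetical, and treating a missing useragent column as an empty string is just one example of a "reasonable default", not something the thread specifies.

```python
import csv
import io

FIELDNAMES = ["package", "filename", "useragent", "count"]


def write_stats_with_header(rows, stream):
    """Write dict rows preceded by a heading line naming the columns."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES, dialect="excel")
    writer.writeheader()
    writer.writerows(rows)


def read_stats(stream):
    """Read rows by column name, so column order and extra columns don't matter."""
    for row in csv.DictReader(stream, dialect="excel"):
        yield {
            "package": row["package"],
            "filename": row["filename"],
            # Assumed default for files produced without this column.
            "useragent": row.get("useragent", ""),
            "count": int(row["count"]),
        }
```

Because readers look columns up by name, a later column added by newer mirrors is simply ignored by older consumers.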
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFI9Opj+gerLs4ltQ4RAnibAKC/anBNwT8zId3OSD42uGp+Qgc5qQCgq4MM KKe9oKPqo7W+LfXjyhJdPY4= =9EML -----END PGP SIGNATURE----- From martin at v.loewis.de Tue Oct 14 21:21:58 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 14 Oct 2008 21:21:58 +0200 Subject: [Catalog-sig] [Distutils] distribute D.C. sprint tasks In-Reply-To: References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F24EC0.601@v.loewis.de> <94bdd2610810121308oe29debdy98a917c3a72fc211@mail.gmail.com> <48F25EF3.60509@v.loewis.de> <94bdd2610810130735r779745cdr70fd5e0afcfcdbc0@mail.gmail.com> <48F3BF26.4050905@v.loewis.de> <48F3C8D8.4020804@v.loewis.de> <94bdd2610810131734k59be8e2al71db3e142e9244c9@mail.gmail.com> <48F4274A.9060601@v.loewis.de> <94bdd2610810140502k6632e640o8e1466c872c72223@mail.gmail.com> <48F4DC74.4010308@v.loewis.de> Message-ID: <48F4F156.1050909@v.loewis.de> > If you are going to formalize such procedures, it may also be worth > thinking about the case of someone uploading code that they do not have > rights to. It's not very likely, but the response often must be swift, > at least in some jurisdictions. The potential for mirrors, and thus the > need for coordinated action, makes it even more important to formalize a > procedure, IMO. The procedure for such a case is fairly obvious: when the true copyright holder requests removal (e.g. through the bug tracker, or by email to the PSF, which appears as the responsible entity of the website), the files would be deleted immediately, and the uploader would be requested to clarify. 
Regards, Martin From dhess at bothan.net Thu Oct 16 06:46:58 2008 From: dhess at bothan.net (Drew Hess) Date: Wed, 15 Oct 2008 21:46:58 -0700 Subject: [Catalog-sig] Request for AGPLv3 classifier Message-ID: Hi, The OSI has approved the Affero GPL version 3: http://www.opensource.org/licenses/agpl-v3.html Can we get a classifier for it? thanks! d From martin at v.loewis.de Thu Oct 16 10:17:24 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 16 Oct 2008 10:17:24 +0200 Subject: [Catalog-sig] Request for AGPLv3 classifier In-Reply-To: References: Message-ID: <48F6F894.5000509@v.loewis.de> > The OSI has approved the Affero GPL version 3: > > http://www.opensource.org/licenses/agpl-v3.html > > Can we get a classifier for it? Would that be License :: OSI Approved :: GNU AFFERO GENERAL PUBLIC LICENSE v3 ? Regards, Martin From dhess at bothan.net Thu Oct 16 10:43:39 2008 From: dhess at bothan.net (Drew Hess) Date: Thu, 16 Oct 2008 01:43:39 -0700 Subject: [Catalog-sig] Request for AGPLv3 classifier In-Reply-To: <48F6F894.5000509@v.loewis.de> ("Martin v. =?iso-8859-1?Q?L?= =?iso-8859-1?Q?=F6wis=22's?= message of "Thu\, 16 Oct 2008 10\:17\:24 +0200") References: <48F6F894.5000509@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Would that be > > License :: OSI Approved :: GNU AFFERO GENERAL PUBLIC LICENSE v3 More or less... Maybe without the all-caps ;) thanks d -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 193 bytes Desc: not available URL: From martin at v.loewis.de Fri Oct 17 06:04:29 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Oct 2008 06:04:29 +0200 Subject: [Catalog-sig] Request for AGPLv3 classifier In-Reply-To: References: <48F6F894.5000509@v.loewis.de> Message-ID: <48F80ECD.90607@v.loewis.de> Drew Hess wrote: > "Martin v. Löwis" writes: > >> Would that be >> >> License :: OSI Approved :: GNU AFFERO GENERAL PUBLIC LICENSE v3 > > > More or less... Maybe without the all-caps ;) Ok, it's done. Regards, Martin From dhess at bothan.net Fri Oct 17 06:22:11 2008 From: dhess at bothan.net (Drew Hess) Date: Thu, 16 Oct 2008 21:22:11 -0700 Subject: [Catalog-sig] Request for AGPLv3 classifier In-Reply-To: <48F80ECD.90607@v.loewis.de> ("Martin v. =?iso-8859-1?Q?L=F6w?= =?iso-8859-1?Q?is=22's?= message of "Fri\, 17 Oct 2008 06\:04\:29 +0200") References: <48F6F894.5000509@v.loewis.de> <48F80ECD.90607@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Ok, it's done. Great, thanks! d From a.badger at gmail.com Fri Oct 17 06:47:34 2008 From: a.badger at gmail.com (Toshio Kuratomi) Date: Thu, 16 Oct 2008 21:47:34 -0700 Subject: [Catalog-sig] [Distutils] distribute D.C. 
sprint tasks In-Reply-To: <48F24333.5070008@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <94bdd2610810120555q5ee21b59q595efffed1aff2ed@mail.gmail.com> <48F239A9.2040609@v.loewis.de> <94bdd2610810121116x3b7f7a6bw7cf0298600d33cba@mail.gmail.com> <48F24333.5070008@v.loewis.de> Message-ID: <48F818E6.7020101@gmail.com> Martin v. Löwis wrote: >> Right, please take a look at my last version http://wiki.python.org/moin/PEP_374 >> it tries to go in that direction > > For such an infrastructure (which apparently intends to mirror the files > as well), I insist that a propagation of download counters is made > mandatory. The only mirrors that can be excused from that are private > ones. This may not apply to PyPI, as the sites you get to mirror you may be different, but Linux distributions have found that it is easier to get mirrors if the mirror admins can run as little custom stuff as possible, i.e. if they can retrieve content from the master mirror via a simple rsync cron job that they write themselves, they are happiest. We have found other ways to generate download statistics in these cases (for instance, based upon how many calls are made to retrieve the mirrorlist, or how many calls are made for specific packages via the mirror redirector). As I say, whether this is a problem for you will depend on the willingness of the sites that are mirroring you to run scripts that you've written rather than their own. -Toshio -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: OpenPGP digital signature URL: From martin at v.loewis.de Sat Oct 18 15:00:48 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Oct 2008 15:00:48 +0200 Subject: [Catalog-sig] Classifiers for Python versions Message-ID: <48F9DE00.3000404@v.loewis.de> As suggested in issue 2169549, I have now added a number of trove classifiers to PyPI to denote support for certain Python versions, namely Programming Language :: Python :: 2 Programming Language :: Python :: 2.3 Programming Language :: Python :: 2.4 Programming Language :: Python :: 2.5 Programming Language :: Python :: 2.6 Programming Language :: Python :: 2.7 Programming Language :: Python :: 3 Programming Language :: Python :: 3.0 Programming Language :: Python :: 3.1 With these, packages can indicate that they support 2.x (but not 3.x), or that they have been tested/written for specific Python releases. As a side effect, packages can now expressly state that they support Python 3000. Regards, Martin From fdrake at gmail.com Sat Oct 18 17:17:40 2008 From: fdrake at gmail.com (Fred Drake) Date: Sat, 18 Oct 2008 11:17:40 -0400 Subject: [Catalog-sig] Classifiers for Python versions In-Reply-To: <48F9DE00.3000404@v.loewis.de> References: <48F9DE00.3000404@v.loewis.de> Message-ID: <9cee7ab80810180817v78c99910g4b0c0f7663782406@mail.gmail.com> On Sat, Oct 18, 2008 at 9:00 AM, "Martin v. L?wis" wrote: > With these, packages can indicate that they support 2.x > (but not 3.x), or that they have been tested/written for > specific Python releases. > > As a side effect, packages can now expressly state that they > support Python 3000. Thanks, Martin! This is good news. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." 
--Henry Miller From chris at simplistix.co.uk Tue Oct 21 14:59:53 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 21 Oct 2008 13:59:53 +0100 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48F246B6.8000707@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810111524w4cba4f70n4766fdb0b95fe73e@mail.gmail.com> <48F1396C.8030907@zopyx.com> <94bdd2610810111856j4dbefcabra43dc590cb556e0d@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> Message-ID: <48FDD249.8070309@simplistix.co.uk> Martin v. Löwis wrote: > Hmm. Yesterday, there were 199250 accesses to PyPI through wget. > Of those, 169971 requests came from a single address (from Dedibox in > France), 28966 requests from a second one (from Sakura in Japan). > > So it *is* wget mirrors that make the whole traffic in PyPI. If it were me, I'd just IP firewall the offenders. There's no need for this kind of behaviour if there's an acceptable mirror protocol available...
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From ziade.tarek at gmail.com Tue Oct 21 15:06:18 2008 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Tue, 21 Oct 2008 15:06:18 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48FDD249.8070309@simplistix.co.uk> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> <48FDD249.8070309@simplistix.co.uk> Message-ID: <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> On Tue, Oct 21, 2008 at 2:59 PM, Chris Withers wrote: > Martin v. Löwis wrote: >> >> Hmm. Yesterday, there were 199250 accesses to PyPI through wget. >> Of those, 169971 requests came from a single address (from Dedibox in >> France), 28966 requests from a second one (from Sakura in Japan). >> >> So it *is* wget mirrors that make the whole traffic in PyPI. > > If it were me, I'd just IP firewall the offenders. There's no need for this > kind of behaviour if there's an acceptable mirror protocol available... Well not yet... but the PEP should be finished sometime this week. > > cheers, > > Chris > > -- > Simplistix - Content Management, Zope & Python Consulting > - http://www.simplistix.co.uk > -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From chris at simplistix.co.uk Tue Oct 21 15:07:09 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 21 Oct 2008 14:07:09 +0100 Subject: [Catalog-sig] distribute D.C.
sprint tasks In-Reply-To: <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> <48FDD249.8070309@simplistix.co.uk> <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> Message-ID: <48FDD3FD.4030309@simplistix.co.uk> Tarek Ziadé wrote: > On Tue, Oct 21, 2008 at 2:59 PM, Chris Withers wrote: >> Martin v. Löwis wrote: >>> Hmm. Yesterday, there were 199250 accesses to PyPI through wget. >>> Of those, 169971 requests came from a single address (from Dedibox in >>> France), 28966 requests from a second one (from Sakura in Japan). >>> >>> So it *is* wget mirrors that make the whole traffic in PyPI. >> If it were me, I'd just IP firewall the offenders. There's no need for this >> kind of behaviour if there's an acceptable mirror protocol available... > > Well not yet... but the PEP should be finished sometime this week, I'm pretty sure that Martin said something was already available... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From ziade.tarek at gmail.com Tue Oct 21 15:11:23 2008 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Tue, 21 Oct 2008 15:11:23 +0200 Subject: [Catalog-sig] distribute D.C.
sprint tasks In-Reply-To: <48FDD3FD.4030309@simplistix.co.uk> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> <48FDD249.8070309@simplistix.co.uk> <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> <48FDD3FD.4030309@simplistix.co.uk> Message-ID: <94bdd2610810210611s65a47ae2h2963c502be6777ff@mail.gmail.com> On Tue, Oct 21, 2008 at 3:07 PM, Chris Withers wrote: > Tarek Ziadé wrote: >> >> On Tue, Oct 21, 2008 at 2:59 PM, Chris Withers >> wrote: >>> >>> Martin v. Löwis wrote: >>>> >>>> Hmm. Yesterday, there were 199250 accesses to PyPI through wget. >>>> Of those, 169971 requests came from a single address (from Dedibox in >>>> France), 28966 requests from a second one (from Sakura in Japan). >>>> >>>> So it *is* wget mirrors that make the whole traffic in PyPI. >>> >>> If it were me, I'd just IP firewall the offenders. There's no need for >>> this >>> kind of behaviour if there's an acceptable mirror protocol available... >> >> Well not yet... but the PEP should be finished sometime this week, > > I'm pretty sure that Martin said something was already available... I am not sure what you are talking about. The only protocol published, afaik, is pje's documentation on Peak, which explains how a package index should be laid out, plus some insights from Martin in this thread. Now, both Andreas and I have worked on the topic, and even if our mirrors have created too many hits on PyPI lately, the "clean" protocol and the right client behavior described by Martin will be written up in the PEP and applied in all clients at some point. A User-Agent request header will also be added to identify clients.
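To illustrate the last point, a mirror client could tag its requests roughly as follows. This is only a minimal sketch using the Python standard library: the agent string `example-mirror/0.1` and the helper `build_request` are hypothetical, and the format the PEP eventually specifies may differ.

```python
import urllib.request

# Hypothetical agent string; the format the PEP will settle on is not known here.
MIRROR_USER_AGENT = "example-mirror/0.1"


def build_request(url, user_agent=MIRROR_USER_AGENT):
    """Return a Request object carrying an identifying User-Agent header.

    Building the Request does not touch the network; it would be passed
    to urllib.request.urlopen() by the mirroring tool.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://pypi.python.org/pypi/mock")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

With a distinctive agent string, the index operator can separate mirror traffic from ordinary downloads in the access logs instead of guessing from IP addresses.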
Cheers Tarek > > Chris > > -- > Simplistix - Content Management, Zope & Python Consulting > - http://www.simplistix.co.uk > -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Tue Oct 21 21:42:17 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 21 Oct 2008 21:42:17 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <48F1ADCC.5050602@zopyx.com> <5e1183fa0810120540g6e902699hfe7b177b57532f9c@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> <48FDD249.8070309@simplistix.co.uk> <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> Message-ID: <48FE3099.8020501@v.loewis.de> >> If it were me, I'd just IP firewall the offenders. There's no need for this >> kind of behaviour if there's an acceptable mirror protocol available... > > Well not yet... but the PEP should be finished sometime this week, There has been an acceptable mirror protocol available for more than a year now, irrespective of any PEP you are working on. Claiming that true mirroring becomes possible only with the PEP disregards the work that Jim Fulton and I put into coming up with a workable solution. Regards, Martin From martin at v.loewis.de Tue Oct 21 21:47:06 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 21 Oct 2008 21:47:06 +0200 Subject: [Catalog-sig] distribute D.C.
sprint tasks In-Reply-To: <94bdd2610810210611s65a47ae2h2963c502be6777ff@mail.gmail.com> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> <48FDD249.8070309@simplistix.co.uk> <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> <48FDD3FD.4030309@simplistix.co.uk> <94bdd2610810210611s65a47ae2h2963c502be6777ff@mail.gmail.com> Message-ID: <48FE31BA.5030708@v.loewis.de> > I am not sure what you are talking about, the only protocol published > is pje's documentation on Peak, that explains how a package index > should be layered + some insights from Martin in this thread afaik. See the discussion from July 2007 on distutils-sig. Regards, Martin From ziade.tarek at gmail.com Wed Oct 22 00:10:43 2008 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Wed, 22 Oct 2008 00:10:43 +0200 Subject: [Catalog-sig] distribute D.C. sprint tasks In-Reply-To: <48FE3099.8020501@v.loewis.de> References: <94bdd2610810111056k2e277cc0t4a1e1cbe88fcf841@mail.gmail.com> <94bdd2610810120558v31a75fc2ga9823e92f906bede@mail.gmail.com> <48F238D7.7050600@v.loewis.de> <94bdd2610810121051m40f4114ai8a9730d958bf7d50@mail.gmail.com> <48F23E11.30801@v.loewis.de> <94bdd2610810121132o34c51df8gb3acac84fa283502@mail.gmail.com> <48F246B6.8000707@v.loewis.de> <48FDD249.8070309@simplistix.co.uk> <94bdd2610810210606p5533a1f3mbb6b37f02da85d8d@mail.gmail.com> <48FE3099.8020501@v.loewis.de> Message-ID: <94bdd2610810211510i166fe697x1b368ebda4bad74e@mail.gmail.com> On Tue, Oct 21, 2008 at 9:42 PM, "Martin v. Löwis" wrote: >>> If it were me, I'd just IP firewall the offenders. There's no need for this >>> kind of behaviour if there's an acceptable mirror protocol available... >> >> Well not yet...
but the PEP should be finished sometime this week, > > There has been an acceptable mirror protocol available for more than a year > now, irrespective of any PEP you are working on. Claiming that only with > the PEP, true mirroring becomes possible, disregards the work that Jim > Fulton and I put into coming up with a workable solution. Right, sorry about that. The PEP addresses complementary needs (like client-side fail-over between mirrors) and will also try to summarize, in one section, how mirrors and client apps should behave in every respect, referring to all the previous work on that, since it has already provided solutions for optimal access to PyPI. Tarek -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From lists at zopyx.com Fri Oct 24 14:17:25 2008 From: lists at zopyx.com (Andreas Jung) Date: Fri, 24 Oct 2008 14:17:25 +0200 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <48F23ADC.4080108@v.loewis.de> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EFA039.30002@colorstudy.com> <48F23ADC.4080108@v.loewis.de> Message-ID: <4901BCD5.4070404@zopyx.com> On 12.10.2008 19:58, Martin v. Löwis wrote: >> Sorry... this is more speculation than based on actual knowledge, but I >> think there are feasible ways to do these things. > > PyPI provides mirrors with a changelog where they can efficiently ask > for a list of packages that have changed since they last synchronized. > This is the recommended way for mirrors to operate; polling the > changelog once every minute is acceptable load for PyPI. > > We've added support for the changelog() API in z3c.sqlalchemy 1.0.1 and will add support for a custom user agent string asap.
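The changelog-driven synchronisation quoted above can be sketched as follows. The sketch assumes that changelog entries arrive as `(name, version, timestamp, action)` tuples, which is the shape PyPI's XML-RPC `changelog(since)` call is understood to return; the helper `packages_to_refresh` and the sample entries are made up for illustration and are not taken from any mirroring tool.

```python
def packages_to_refresh(entries, last_sync):
    """Return (sorted names changed after last_sync, newest timestamp seen).

    `entries` is assumed to be an iterable of
    (name, version, timestamp, action) tuples, with integer timestamps.
    """
    changed = set()
    newest = last_sync
    for name, _version, timestamp, _action in entries:
        if timestamp > last_sync:
            changed.add(name)
            newest = max(newest, timestamp)
    return sorted(changed), newest


# Made-up sample data standing in for a changelog response.
sample = [
    ("mock", "0.4.0", 1223828590, "new release"),
    ("mock", "0.4.0", 1223828600, "add 2.6 file mock-0.4.0-py2.6.egg"),
    ("example-pkg", "1.0", 1223000000, "create"),
]
names, newest = packages_to_refresh(sample, last_sync=1223800000)
print(names, newest)
```

A mirror would store `newest` and pass it as the `since` value on its next poll, fetching only the listed packages rather than crawling the whole index.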
Andreas From martin at v.loewis.de Fri Oct 24 23:53:08 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 24 Oct 2008 23:53:08 +0200 Subject: [Catalog-sig] PyPI replication project In-Reply-To: <4901BCD5.4070404@zopyx.com> References: <48EDDF8E.8030204@zopyx.com> <48EE4E5B.9070200@v.loewis.de> <48EE532A.3030907@zopyx.com> <48EE7512.7040607@v.loewis.de> <48EE7FF5.7090405@zopyx.com> <48EE8771.8040605@v.loewis.de> <64ddb72c0810091542o69d74b82p30a14dcecc8f0e1e@mail.gmail.com> <48EE8CEC.1000706@v.loewis.de> <48EFA039.30002@colorstudy.com> <48F23ADC.4080108@v.loewis.de> <4901BCD5.4070404@zopyx.com> Message-ID: <490243C4.1040308@v.loewis.de> > We've added support for the changelog() API in z3c.sqlalchemy 1.0.1 and > will add support for a custom user agent string asap. Great! If you find any problems, please let me know. Regards, Martin From zooko at zooko.com Sat Oct 25 03:39:35 2008 From: zooko at zooko.com (zooko) Date: Fri, 24 Oct 2008 19:39:35 -0600 Subject: [Catalog-sig] I wish for timestamps on pypi-hosted files. Message-ID: The subject line pretty much says it all. I'd like to know when those files were uploaded. Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $10/month From martin at v.loewis.de Sat Oct 25 04:02:23 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 25 Oct 2008 04:02:23 +0200 Subject: [Catalog-sig] I wish for timestamps on pypi-hosted files. In-Reply-To: References: Message-ID: <49027E2F.4060202@v.loewis.de> > The subject line pretty much says it all. I'd like to know when those > files were uploaded.
They are available already; please take a look at (say) http://pypi.python.org/packages/2.6/m/mock/ To retrieve the timestamp programmatically, you can use the HEAD verb and look at the Last-Modified header.

$ telnet pypi.python.org http
Trying 82.94.164.163...
Connected to ximinez.python.org.
Escape character is '^]'.
HEAD /packages/2.6/m/mock/mock-0.4.0-py2.6.egg HTTP/1.0
Host: pypi.python.org

HTTP/1.1 200 OK
Date: Sat, 25 Oct 2008 01:57:18 GMT
Server: Apache/2.2.3 (Debian) mod_fastcgi/2.4.2
Last-Modified: Sun, 12 Oct 2008 16:23:10 GMT
ETag: "51409c-16ac-cc5f7780"
Accept-Ranges: bytes
Content-Length: 5804
Connection: close
Content-Type: application/octet-stream

Regards, Martin From zooko at zooko.com Sat Oct 25 04:12:11 2008 From: zooko at zooko.com (zooko) Date: Fri, 24 Oct 2008 20:12:11 -0600 Subject: [Catalog-sig] I wish for timestamps on pypi-hosted files. In-Reply-To: <49027E2F.4060202@v.loewis.de> References: <49027E2F.4060202@v.loewis.de> Message-ID: <938C00D6-5DF6-494A-A8C5-5CB682E8C2BE@zooko.com> On Oct 24, 2008, at 20:02 PM, Martin v. Löwis wrote: > They are available already, please take a look at (say) I see -- thanks! To clarify, I had meant that I wanted to see those timestamps when browsing packages like this: http://pypi.python.org/pypi/mock Regards, Zooko From martin at v.loewis.de Sat Oct 25 08:43:49 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 25 Oct 2008 08:43:49 +0200 Subject: [Catalog-sig] I wish for timestamps on pypi-hosted files. In-Reply-To: <938C00D6-5DF6-494A-A8C5-5CB682E8C2BE@zooko.com> References: <49027E2F.4060202@v.loewis.de> <938C00D6-5DF6-494A-A8C5-5CB682E8C2BE@zooko.com> Message-ID: <4902C025.6010308@v.loewis.de> > To clarify, I had meant that I wanted to see those timestamps when > browsing packages like this: > > http://pypi.python.org/pypi/mock Please submit a bug report at http://sourceforge.net/projects/pypi It would be best if you included a patch.
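In the meantime, the HEAD lookup shown with telnet earlier in this thread can also be scripted. The following is a minimal sketch with the Python standard library; the helper names `upload_time_from_header` and `fetch_last_modified` are hypothetical, and only the header parsing is exercised here, so no network access is needed to run it.

```python
import http.client
from email.utils import parsedate_to_datetime


def upload_time_from_header(value):
    """Parse an HTTP Last-Modified header value into a datetime."""
    return parsedate_to_datetime(value)


def fetch_last_modified(host, path):
    """Issue a HEAD request and return the parsed Last-Modified value, or None."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        value = conn.getresponse().getheader("Last-Modified")
    finally:
        conn.close()
    return upload_time_from_header(value) if value else None


# Header value copied from the telnet transcript in this thread.
when = upload_time_from_header("Sun, 12 Oct 2008 16:23:10 GMT")
print(when.isoformat())
```

This relies on the server's Last-Modified value matching the upload time, which is what the transcript suggests but is not something HTTP itself guarantees.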
Regards, Martin From ziade.tarek at gmail.com Sun Oct 26 23:01:06 2008 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Sun, 26 Oct 2008 23:01:06 +0100 Subject: [Catalog-sig] fail-over and index merging for tools like setuptools and pyinstall Message-ID: <94bdd2610810261501p51e7dfa2ya6de9d8e56a2a824@mail.gmail.com> Hi! Sorry for the crosspost, but this mail concerns both lists. There are two points that need to be discussed to finish the work on the PyPI proposal started in D.C. (http://wiki.python.org/PEP_374):
- the fail-over mechanism
- merging several indexes

*Fail over*
PyPI will provide a static page that lists all its mirrors. Each line of this file describes a mirror. It provides the root URL, followed by the relative URLs of:
- the index: the root of the package index
- the last-modified page: a static text file that gives the date of the last sync
- the local stats page: a static text file that gives the number of downloads of a file, per package, per user-agent
- the global stats page, calculated by PyPI, that gives the grand total of all downloads (the sum of PyPI local stats + mirrors local stats)
- the mirrors page, that lists all mirrors
For example:
http://example.com/pypi,index,last-modified,local-stats,stats,mirrors
http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors
(see the proposal doc for more info)
This mirror list says, for example, that a mirror is available at http://example.com/pypi/index, and that its last-modified date is available at http://example.com/pypi/last-modified. On the client side, this means it is possible to list the mirrors of a given package index in order to implement a fail-over mechanism. Moreover, it makes it possible to select the nearest mirror.

*Merging several indexes*
Besides fail over, another thing needs to be implemented on the client side: being able to use different indexes. This is an obviously missing feature: we don't want to push all our customers' packages to PyPI.
At the same time, we do want to use tools like distutils, setuptools, etc. the same way with any kind of package. So an easy way to use private package indexes alongside PyPI is needed. It is now possible in Python 2.6, with the new .pypirc file, to define several indexes. From there, software like PloneSoftwareCenter allows developers to work with indexes other than PyPI. But tools like setuptools need to evolve the same way. Each of these indexes can have its own mirrors, as defined previously, but the client needs to combine all the different indexes into a "super" index. This can be implemented by working with a sorted list of indexes. When a client is looking for a package, it can look in each index in turn and pick the first package that fits. Any comments? Cheers Tarek -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From martin at v.loewis.de Sun Oct 26 23:10:03 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 26 Oct 2008 23:10:03 +0100 Subject: [Catalog-sig] fail-over and index merging for tools like setuptools and pyinstall In-Reply-To: <94bdd2610810261501p51e7dfa2ya6de9d8e56a2a824@mail.gmail.com> References: <94bdd2610810261501p51e7dfa2ya6de9d8e56a2a824@mail.gmail.com> Message-ID: <4904EABB.7070709@v.loewis.de> > Any comments ? Who will implement all that? Regards, Martin From ziade.tarek at gmail.com Sun Oct 26 23:14:59 2008 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Sun, 26 Oct 2008 23:14:59 +0100 Subject: [Catalog-sig] fail-over and index merging for tools like setuptools and pyinstall In-Reply-To: <4904EABB.7070709@v.loewis.de> References: <94bdd2610810261501p51e7dfa2ya6de9d8e56a2a824@mail.gmail.com> <4904EABB.7070709@v.loewis.de> Message-ID: <94bdd2610810261514j6da146d2uc3be1cc95c1a212f@mail.gmail.com> On Sun, Oct 26, 2008 at 11:10 PM, "Martin v. Löwis" wrote: >> Any comments ? > > Who will implement all that?
> I am willing to do it. I have already started writing a patch for setuptools, submitted for review in the tracker. I can also write the patches for PyPI, since I have the code and a database dump, and I can organize sprints if other people want to help implement this. > Regards, > Martin > -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/
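The mirror-list format from the fail-over proposal earlier in this thread (a root URL followed by five relative paths, comma-separated) could be consumed by a client along the following lines. This is a minimal sketch: the field order follows the example lines in the proposal, while `parse_mirror_line`, `pick_mirror`, and the `is_alive` callback are hypothetical names and not part of setuptools or PyPI.

```python
from collections import namedtuple

# Relative-path fields, in the order used by the proposal's example lines.
FIELDS = ("index", "last_modified", "local_stats", "stats", "mirrors")
Mirror = namedtuple("Mirror", ("root",) + FIELDS)


def parse_mirror_line(line):
    """Turn 'root,index,last-modified,local-stats,stats,mirrors' into full URLs."""
    parts = [p.strip() for p in line.strip().split(",")]
    if len(parts) != len(FIELDS) + 1:
        raise ValueError("expected 6 comma-separated fields: %r" % line)
    root = parts[0].rstrip("/")
    return Mirror(root, *("%s/%s" % (root, rel) for rel in parts[1:]))


def pick_mirror(mirrors, is_alive):
    """Fail over: return the first mirror for which is_alive(mirror) is true."""
    for mirror in mirrors:
        if is_alive(mirror):
            return mirror
    return None


lines = [
    "http://example.com/pypi,index,last-modified,local-stats,stats,mirrors",
    "http://example2.com/pypi,index,last-modified,local-stats,stats,mirrors",
]
mirrors = [parse_mirror_line(line) for line in lines]
# Pretend the first mirror is down; a real client would probe the network here.
chosen = pick_mirror(mirrors, is_alive=lambda m: "example2" in m.root)
print(chosen.index)
```

Keeping the liveness check as a callback leaves the policy open: a client could test reachability, compare last-modified dates, or prefer the nearest mirror without changing the parsing code.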