From kantrn at rpi.edu Mon May 5 10:59:41 2008
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Mon, 05 May 2008 04:59:41 -0400
Subject: [Catalog-sig] Hidden packages in XML-RPC
Message-ID: <481ECC7D.5070605@rpi.edu>

I remember a while back there was a request to add a way to see hidden releases, and I see this got added in trunk. I can't get it to work against the live server though, can someone update it?

--Noah

From kantrn at rpi.edu Mon May 5 11:19:08 2008
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Mon, 05 May 2008 05:19:08 -0400
Subject: [Catalog-sig] Hidden packages in XML-RPC
In-Reply-To: <481ECC7D.5070605@rpi.edu>
References: <481ECC7D.5070605@rpi.edu>
Message-ID: <481ED10C.7030509@rpi.edu>

Noah Kantrowitz wrote:
> I remember a while back there was a request to add a way to see hidden
> releases, and I see this got added in trunk. I can't get it to work
> against the live server though, can someone update it?

A related request, could you add a similar argument to the search call? With Trac plugins it is common to have multiple active branches for different Trac major versions, so the idea of only one unhidden package is generally not what I want.

--Noah

From kantrn at rpi.edu Mon May 5 15:46:12 2008
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Mon, 05 May 2008 09:46:12 -0400
Subject: [Catalog-sig] New XML-RPC implementation
Message-ID: <481F0FA4.6020801@rpi.edu>

Continuing my flood of XML-RPC related fun, attached is a mostly untested patch to make the RPC system use SimpleXMLRPCDispatcher.
This means adding the introspection and multicall APIs too, which may help alleviate concerns about round-trips.

--Noah

PS: Is there any simple way to get a development environment going? I think I have finally got the schema cleaned up enough to import, but it's hard to tell if this is the right one to use (pkgbase_schema.sql).

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: new_rpc.patch

From yanghatespam at gmail.com Thu May 8 21:01:17 2008
From: yanghatespam at gmail.com (Yang Zhang)
Date: Thu, 08 May 2008 15:01:17 -0400
Subject: [Catalog-sig] Feature request: activity analytics/statistics/traffic/downloads
Message-ID: <48234DFD.8050209@gmail.com>

Hi, is it possible to collect and view statistics about the activity of our packages? For me, I don't mind having this information be either public or private, and I'm particularly interested in # visitors and # downloads (over time). Even a quick and dirty, very-coarse-grained first pass at introducing these numbers (ASAP) would be tremendously appreciated! Thanks for hearing me in.

--
Yang Zhang
http://www.mit.edu/~y_z/

From amk at amk.ca Thu May 8 21:59:40 2008
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 8 May 2008 15:59:40 -0400
Subject: [Catalog-sig] Feature request: activity analytics/statistics/traffic/downloads
In-Reply-To: <48234DFD.8050209@gmail.com>
References: <48234DFD.8050209@gmail.com>
Message-ID: <20080508195940.GA14053@amk-desktop.matrixgroup.net>

On Thu, May 08, 2008 at 03:01:17PM -0400, Yang Zhang wrote:
> Hi, is it possible to collect and view statistics about the activity of
> our packages? For me, I don't mind having this information be either
> public or private, and I'm particularly interested in # visitors and #
> downloads (over time).
Even a quick and dirty, very-coarse-grained
> first pass at introducing these numbers (ASAP) would be tremendously
> appreciated! Thanks for hearing me in.

Rough download figures for individual files are updated once per day and shown on the release page. For example, http://pypi.python.org/pypi/Rat/0.1 shows 319, 101, and 185 downloads for the three variants of the package.

--amk

From martin at v.loewis.de Thu May 8 22:32:54 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 08 May 2008 22:32:54 +0200
Subject: [Catalog-sig] Feature request: activity analytics/statistics/traffic/downloads
In-Reply-To: <20080508195940.GA14053@amk-desktop.matrixgroup.net>
References: <48234DFD.8050209@gmail.com> <20080508195940.GA14053@amk-desktop.matrixgroup.net>
Message-ID: <48236376.3000906@v.loewis.de>

> On Thu, May 08, 2008 at 03:01:17PM -0400, Yang Zhang wrote:
>> Hi, is it possible to collect and view statistics about the activity of
>> our packages? For me, I don't mind having this information be either
>> public or private, and I'm particularly interested in # visitors and #
>> downloads (over time). Even a quick and dirty, very-coarse-grained
>> first pass at introducing these numbers (ASAP) would be tremendously
>> appreciated! Thanks for hearing me in.
>
> Rough download figures for individual files are updated once per day
> and shown on the release page. For example,
> http://pypi.python.org/pypi/Rat/0.1 shows 319, 101, and 185 downloads
> for the three variants of the package.

In addition, webstats are available at

http://pypi.python.org/webstats/

Regards,
Martin

From martin at v.loewis.de Mon May 12 12:35:24 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 12 May 2008 12:35:24 +0200
Subject: [Catalog-sig] PyPI naming policy changes
Message-ID: <48281D6C.5020200@v.loewis.de>

I just implemented a number of policy changes in PyPI:

- per email address, there can only be one registered user.
  If you have a need to have multiple accounts for the same
  email address, please provide a patch to correctly implement
  the password reset procedure.

- package names must differ in their pkg_resources.safe_name(p).lower()
  values (e.g. you can't have two packages that only differ in case,
  or in the amount of white space between words).

  Existing registrations are not affected, although package owners have
  been asked to clean this up appropriately.

- files must start with to_filename(safe_name(pkg_name)), ignoring case.
  E.g. for the "BerkeleyDB Backend Storage Engine for DURUS"
  package, valid file names would be
  BerkeleyDB_Backend_Storage_Engine_for_DURUS-1.0.tar.gz or
  berkeleydb_backend_storage_engine_for_durus-1.0.tar.gz, but not
  durus-berkeleydbstorage-20061121.tar.gz.

  Existing registrations are not affected; no attempt to clean
  up the data is made.

Please let me know if you see any problems with that policy.

Regards,
Martin

From renesd at gmail.com Mon May 12 14:50:38 2008
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 12 May 2008 22:50:38 +1000
Subject: [Catalog-sig] PyPI naming policy changes
In-Reply-To: <48281D6C.5020200@v.loewis.de>
References: <48281D6C.5020200@v.loewis.de>
Message-ID: <64ddb72c0805120550m335037b5l8b7f64a84658090d@mail.gmail.com>

hi,

what is wrong with the file name: durus-berkeleydbstorage-20061121.tar.gz ?

cheers,

On Mon, May 12, 2008 at 8:35 PM, "Martin v. Löwis" wrote:
> I just implemented a number of policy changes in PyPI:
>
> - per email address, there can only be one registered user.
>   If you have a need to have multiple accounts for the same
>   email address, please provide a patch to correctly implement
>   the password reset procedure.
>
> - package names must differ in their pkg_resources.safe_name(p).lower()
>   values (e.g. you can't have two packages that only differ in case,
>   or in the amount of white space between words).
>
>   Existing registrations are not affected, although package owners have
>   been asked to clean this up appropriately.
>
> - files must start with to_filename(safe_name(pkg_name)), ignoring case.
>   E.g. for the "BerkeleyDB Backend Storage Engine for DURUS"
>   package, valid file names would be
>   BerkeleyDB_Backend_Storage_Engine_for_DURUS-1.0.tar.gz or
>   berkeleydb_backend_storage_engine_for_durus-1.0.tar.gz, but not
>   durus-berkeleydbstorage-20061121.tar.gz.
>
>   Existing registrations are not affected; no attempt to clean
>   up the data is made.
>
> Please let me know if you see any problems with that policy.
>
> Regards,
> Martin
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From martin at v.loewis.de Mon May 12 15:04:32 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 12 May 2008 15:04:32 +0200
Subject: [Catalog-sig] PyPI naming policy changes
In-Reply-To: <64ddb72c0805120550m335037b5l8b7f64a84658090d@mail.gmail.com>
References: <48281D6C.5020200@v.loewis.de> <64ddb72c0805120550m335037b5l8b7f64a84658090d@mail.gmail.com>
Message-ID: <48284060.9030300@v.loewis.de>

> what is wrong with the file name: durus-berkeleydbstorage-20061121.tar.gz ?

It doesn't match the package name. As a consequence, setuptools would not be able to find it. See also

http://sourceforge.net/tracker/index.php?func=detail&aid=1901694&group_id=66150&atid=513503

Regards,
Martin

From pje at telecommunity.com Mon May 12 17:12:50 2008
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 12 May 2008 11:12:50 -0400
Subject: [Catalog-sig] PyPI naming policy changes
In-Reply-To: <48281D6C.5020200@v.loewis.de>
References: <48281D6C.5020200@v.loewis.de>
Message-ID: <20080512151230.B48283A40C2@sparrow.telecommunity.com>

At 12:35 PM 5/12/2008 +0200, Martin v.
Löwis wrote:
>- package names must differ in their pkg_resources.safe_name(p).lower()
>  values (e.g. you can't have two packages that only differ in case,
>  or in the amount of white space between words).

Hurray! Thanks for taking care of this.

>- files must start with to_filename(safe_name(pkg_name)), ignoring case.
>  E.g. for the "BerkeleyDB Backend Storage Engine for DURUS"
>  package, valid file names would be
>  BerkeleyDB_Backend_Storage_Engine_for_DURUS-1.0.tar.gz or
>  berkeleydb_backend_storage_engine_for_durus-1.0.tar.gz, but not
>  durus-berkeleydbstorage-20061121.tar.gz.
>
>  Existing registrations are not affected; no attempt to clean
>  up the data is made.
>
>Please let me know if you see any problems with that policy.

One possible problem -- the distutils do not generate "safe" filenames. That is, a project named "foo-bar" or "py-baz" will have '-' in the source distribution filename generated by the distutils. So, if you are testing filename.startswith(to_filename(...)), then distutils-generated packages for such projects won't work.

Probably the test you want here is:

    safe_name(filename).startswith(safe_name(project_name))

This will handle both distutils-generated and setuptools-generated distribution files. (Setuptools already "knows" that distutils-generated files have this naming ambiguity, btw, and works around it.)

From martin at v.loewis.de Mon May 12 17:54:19 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 12 May 2008 17:54:19 +0200
Subject: [Catalog-sig] PyPI naming policy changes
In-Reply-To: <20080512151230.B48283A40C2@sparrow.telecommunity.com>
References: <48281D6C.5020200@v.loewis.de> <20080512151230.B48283A40C2@sparrow.telecommunity.com>
Message-ID: <4828682B.4050902@v.loewis.de>

> Probably the test you want here is:
>
> safe_name(filename).startswith(safe_name(project_name))

So what's to_filename then for?
Also, is it allowed for the distribution's name to be all-lowercase, when the PyPI package name is mixed-case?

In any case, I updated the formula in r543.

Regards,
Martin

From pje at telecommunity.com Mon May 12 18:25:33 2008
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 12 May 2008 12:25:33 -0400
Subject: [Catalog-sig] PyPI naming policy changes
In-Reply-To: <4828682B.4050902@v.loewis.de>
References: <48281D6C.5020200@v.loewis.de> <20080512151230.B48283A40C2@sparrow.telecommunity.com> <4828682B.4050902@v.loewis.de>
Message-ID: <20080512162511.088323A40AE@sparrow.telecommunity.com>

At 05:54 PM 5/12/2008 +0200, Martin v. Löwis wrote:
> > Probably the test you want here is:
> >
> > safe_name(filename).startswith(safe_name(project_name))
>
>So what's to_filename then for?

It's used by setuptools to name files in a way that's unambiguous as to which parts mean what.

> Also, is it allowed for the
>distribution's name to be all-lowercase, when the PyPI package name
>is mixed-case?

Yes... sorry, I left that out of the above... it should be filename.lower() and project_name.lower(). Setuptools processes project names and version numbers case-insensitively.

From martin at v.loewis.de Mon May 12 19:27:20 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 12 May 2008 19:27:20 +0200
Subject: [Catalog-sig] PyPI naming policy changes
In-Reply-To: <20080512162511.088323A40AE@sparrow.telecommunity.com>
References: <48281D6C.5020200@v.loewis.de> <20080512151230.B48283A40C2@sparrow.telecommunity.com> <4828682B.4050902@v.loewis.de> <20080512162511.088323A40AE@sparrow.telecommunity.com>
Message-ID: <48287DF8.2010202@v.loewis.de>

> Yes... sorry, I left that out of the above... it should be
> filename.lower() and project_name.lower(). Setuptools processes project
> names and version numbers case-insensitively.

Ok, fixed in r544.
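
For illustration, the check discussed in this thread (compare the safe_name() forms, ignoring case) might look roughly like the sketch below. The actual change in r544 is not shown in this thread, so this is only an assumption about its shape: safe_name here is a local stand-in written to behave like pkg_resources.safe_name, and filename_matches_project is a hypothetical helper name.

```python
import re


def safe_name(name):
    # Local stand-in modelled on pkg_resources.safe_name: every run of
    # characters other than letters, digits and '.' becomes a single '-'.
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


def filename_matches_project(filename, project_name):
    # Hypothetical helper: case-insensitive prefix test on the "safe" forms,
    # so both foo_bar-1.0.tar.gz and foo-bar-1.0.tar.gz match project foo-bar.
    return safe_name(filename).lower().startswith(safe_name(project_name).lower())


project = "BerkeleyDB Backend Storage Engine for DURUS"
for candidate in (
    "BerkeleyDB_Backend_Storage_Engine_for_DURUS-1.0.tar.gz",
    "berkeleydb_backend_storage_engine_for_durus-1.0.tar.gz",
    "durus-berkeleydbstorage-20061121.tar.gz",
):
    print(candidate, filename_matches_project(candidate, project))
```

With these definitions the first two file names are accepted and the third is rejected, which matches the examples given in the policy announcement.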
Regards,
Martin

From tarek.ziade at ingeniweb.com Wed May 14 13:27:28 2008
From: tarek.ziade at ingeniweb.com (Tarek Ziade)
Date: Wed, 14 May 2008 13:27:28 +0200
Subject: [Catalog-sig] search queries in PyPI
Message-ID:

Hi,

I was wondering how the search works in PyPI (didn't have time to dig through the code)

I was unable to do specific queries. For instance, how do I get the packages that have the word "nose" and the word "plugin" in their short descriptions ?

I tried 'nose AND plugin', 'nose+plugin', etc.. without success.

I tried '"nose plugin"' and I got back a package that had this sequence of words, but also had a package that has nothing to do with it (z3c.sampledata 0.1.0 )

Any tips ?

Tarek

--
Tarek Ziadé - Directeur Technique
INGENIWEB (TM) - SAS 50000 Euros - RC B 438 725 632
Bureaux de la Colline - 1 rue Royale - Bâtiment D - 9ème étage
92210 Saint Cloud - France
Phone : 01.78.15.24.00 / Fax : 01 46 02 44 04
http://www.ingeniweb.com - une société du groupe Alter Way

From martin at v.loewis.de Wed May 14 19:18:56 2008
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 14 May 2008 19:18:56 +0200
Subject: [Catalog-sig] search queries in PyPI
In-Reply-To:
References:
Message-ID: <482B1F00.5090907@v.loewis.de>

> Any tips ?

Not from me; I would have to look it up in the code as well.

Regards,
Martin

From kantrn at rpi.edu Wed May 14 19:25:01 2008
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Wed, 14 May 2008 13:25:01 -0400
Subject: [Catalog-sig] search queries in PyPI
In-Reply-To:
References:
Message-ID: <482B206D.9080604@rpi.edu>

Tarek Ziade wrote:
> Hi,
>
> I was wondering how the search works in PyPI (didn't have time to dig
> through the code)
>
> I was unable to do specific queries. For instance, how do I get the
> packages that have the word "nose" and the word "plugin" in their short
> descriptions ?
>
> I tried 'nose AND plugin', 'nose+plugin', etc..
without success.
>
> I tried '"nose plugin"' and I got back a package that had this sequence of
> words, but also had a package that has nothing to do with it
> (z3c.sampledata 0.1.0 )
>

Try "nose%plugin". That's the syntax used in the XML-RPC API at least.

--Noah

From tarek.ziade at ingeniweb.com Wed May 14 21:06:09 2008
From: tarek.ziade at ingeniweb.com (Tarek Ziade)
Date: Wed, 14 May 2008 21:06:09 +0200
Subject: [Catalog-sig] search queries in PyPI
In-Reply-To: <482B206D.9080604@rpi.edu>
References: <482B206D.9080604@rpi.edu>
Message-ID:

2008/5/14 Noah Kantrowitz :

> Tarek Ziade wrote:
>
>> Hi,
>>
>> I was wondering how the search works in PyPI (didn't have time to dig
>> through the code)
>>
>> I was unable to do specific queries. For instance, how do I get the
>> packages that have the word "nose" and the word "plugin" in their short
>> descriptions ?
>>
>> I tried 'nose AND plugin', 'nose+plugin', etc.. without success.
>>
>> I tried '"nose plugin"' and I got back a package that had this sequence of
>> words, but also had a package that has nothing to do with it
>> (z3c.sampledata 0.1.0 )
>>
>
> Try "nose%plugin". That's the syntax used in the XML-RPC API at least.

ah ! interesting, that worked, thanks !

I have also dug through the code to see how it is done. Here's the pseudo code:

    def search(query):
        results = {}
        terms = query.split()  # split the query on whitespace
        for term in terms:
            for field in ('name', 'description', 'summary'):
                # `store` is the storage (database) object;
                # the lookup is done for the current field
                for result in store.query_packages(term):
                    # ... some score calculation, depending on
                    # which field the term was found in
                    results[result.name] = result
        return results

Basically, there is one request over the storage (database) for each word entered in the query. 'AND' is not used; it is even removed, because it is listed as a stop word.
So, Noah's query, using %, doesn't split the words and sends them directly to the DB in one string, using the SQL LIKE operator.

In the meantime, store.query_packages *has* a feature to do AND and OR searches:

    def query_packages(query, operator='and'):
        ...

I think it wouldn't cost too much here to change the webui interface to use the store.py features. It would also make it faster, since only one database query would be needed per search.

I still need to install a PyPI instance for a patch I wanted to propose for making PyPI permissive about nonexistent classifiers, so maybe I can try a patch for this in the meantime ?

The change could take the AND and OR words into account to do the proper query.

Tarek

> --Noah

From richardjones at optushome.com.au Thu May 15 00:16:46 2008
From: richardjones at optushome.com.au (Richard Jones)
Date: Thu, 15 May 2008 08:16:46 +1000
Subject: [Catalog-sig] search queries in PyPI
In-Reply-To:
References: <482B206D.9080604@rpi.edu>
Message-ID: <200805150816.46654.richardjones@optushome.com.au>

On Thu, 15 May 2008, Tarek Ziade wrote:
> I think it wouldn't cost too much here to change the webui interface, to
> use store.py features.

The current search implementation is a trade-off between the complexity you would like to have and useful results for arbitrary, inelegant term lists.

Your pseudo-code omits an important feature - the scoring of the result based on the type of match.
Perhaps the scoring could be altered such that hits on more than one term in the same field result in a higher score?

Richard

From tarek.ziade at ingeniweb.com Thu May 15 09:48:47 2008
From: tarek.ziade at ingeniweb.com (Tarek Ziade)
Date: Thu, 15 May 2008 09:48:47 +0200
Subject: [Catalog-sig] search queries in PyPI
In-Reply-To: <200805150816.46654.richardjones@optushome.com.au>
References: <482B206D.9080604@rpi.edu> <200805150816.46654.richardjones@optushome.com.au>
Message-ID:

2008/5/15 Richard Jones :

> On Thu, 15 May 2008, Tarek Ziade wrote:
> > I think it wouldn't cost too much here to change the webui interface, to
> > use store.py features.
>
> The current search implementation is a trade-off between the complexity you
> would like to have and useful results for arbitrary, inelegant term lists.

ok, yes, my use case is being able to look for several words with AND and/or OR. If searchable_text is a field with all the text the package has:

    search('nose AND plugin')

--- which in my mind would result in something like:

    select distinct package_name from package
    where searchable_text like '%nose%' AND searchable_text like '%plugin%'

But I guess it can be rather complex, and Noah's trick fits my needs :D

> Your pseudo-code omits an important feature - the scoring of the result
> based on the type of match.

Ok.

semi-related: looking at the code, I think query_packages can be called just once per term, with the score then calculated for each column on the results. That would lower the number of queries (to 1/3).

> Perhaps the scoring could be altered such that hits on more than one term
> in the same field result in a higher score?

if it is doable, that would be nice.

Maybe we could have an advanced search screen that would take care of asking the user which field the term refers to (like roundup) ?
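
As a rough sketch of the two ideas above (look each term up once, score per column, and reward several terms hitting the same field), something along these lines could work. Everything in it is an assumption made for illustration: packages are plain dicts rather than database rows, FIELD_WEIGHTS and the squared-hit rule are invented, and none of it reflects the scoring PyPI actually uses.

```python
# Hypothetical per-field weights; a match in the name counts most.
FIELD_WEIGHTS = {"name": 4, "summary": 2, "description": 1}

# Words treated as operators/stop words rather than search terms.
STOP_WORDS = ("and", "or")


def score_packages(packages, query):
    terms = [t.lower() for t in query.split() if t.lower() not in STOP_WORDS]
    scores = {}
    for pkg in packages:
        total = 0
        for field, weight in FIELD_WEIGHTS.items():
            text = pkg.get(field, "").lower()
            hits = sum(1 for term in terms if term in text)
            # Squaring the hit count makes two terms in one field score
            # higher than the same two terms spread over two fields.
            total += weight * hits * hits
        if total:
            scores[pkg["name"]] = total
    # Best score first; ties broken alphabetically by package name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


packages = [
    {"name": "nose-plugin-x", "summary": "a nose plugin", "description": ""},
    {"name": "nose", "summary": "test runner", "description": "supports plugin loading"},
    {"name": "other", "summary": "unrelated", "description": ""},
]
print(score_packages(packages, "nose AND plugin"))
```

Here the package that has both terms in its name and summary ranks well above the one where the two terms are spread across different fields, and packages with no hit are left out.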
Tarek

> Richard

From zooko at zooko.com Thu May 15 14:47:37 2008
From: zooko at zooko.com (zooko)
Date: Thu, 15 May 2008 06:47:37 -0600
Subject: [Catalog-sig] I wish pypi displayed the date of upload.
Message-ID: <45879E3D-2686-49FC-89E0-BD14CD251991@zooko.com>

I often find myself wondering when a certain package was uploaded.

Regards,
Zooko

From python at jwp.name Sat May 17 18:30:23 2008
From: python at jwp.name (James William Pye)
Date: Sat, 17 May 2008 09:30:23 -0700
Subject: [Catalog-sig] Adding package pydoc to PyPI
In-Reply-To: <47A2AD1E.4020700@v.loewis.de>
References: <20080131162523.GA50793@lit.jwp.name> <47A209DC.5060200@v.loewis.de> <20080131194445.GA51222@lit.jwp.name> <47A22D68.3040707@v.loewis.de> <20080131231849.GA52682@lit.jwp.name> <47A2AD1E.4020700@v.loewis.de>
Message-ID: <20080517163023.GA5306@lit.jwp.name>

For those that are curious, I have been writing some code in this direction. The "jwp_pkg_documentation" package provides a setuptools command ``dist_doc`` that extracts a package's documentation into the ``dist/doc.xml`` file.

RNG for the produced XML:
https://svn.jwp.name/lib/python/pkg_documentation/fork/v0.9.1/validation/pypkg.rng

Example output (buggy and far from where I want it to be):
http://python.projects.postgresql.org/doc/pg_proboscis-1.0.html

On Fri, Feb 01, 2008 at 06:24:46AM +0100, "Martin v. Löwis" wrote:
> See above.
If you implement the service, I would consider it feasible
> to provide a link to a package's pydoc documentation in PyPI
> (similar to homepage and download_url; say documentation_url), although
> support for that in distutils probably requires a PEP.

I really like this idea. Even in the absence of the proposed service, I think this would be very useful.