From ianb at colorstudy.com Tue Jun 15 21:53:05 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Tue Jun 15 21:53:13 2004 Subject: [Catalog-sig] PyPI improvements Message-ID: Howdy. I just recently posted some ideas for PyPI (http://blog.colorstudy.com/ianb/weblog/2004/06/15.html#P123), but some of the basic features I was thinking about should be generally useful: 1. Express relationships between packages. These are relationships like alternative-implementation, fork, part-of, recommends, requires, etc. At the moment I'm thinking purely about displaying this information, not any fancy distutils magic installation of dependencies. 2. Cache packages. I.e., download a copy of the package, and if the package disappears then we have a backup. 
These are both more important when dealing with smaller pieces of code -- code that is part of a more interdependent ecosystem (1), and code that is more transient (2) -- but these apply just as well to full-blown packages. The other thing that might be useful is some improved categorization of code. The Trove categories are... well, they are stupid. No fault of anyone here. CPAN's much more coarsely-grained categories are much better, in my opinion (Acme, AI, Algorithm, Apache, AppConfig, Archive, Array, and so on: http://www.cpan.org/modules/by-module). But even more coarsely-grained than that, there are classes of package. Right now we have libraries and applications. I'd like to add modules -- though the name is vague, I'm thinking of code on the sophisticated end of the Python Cookbook entries. Small, reusable, and not worth distutilifying (I just can't imagine making a whole package for one 100-line module, nor can I imagine using such a package). When you're looking for code, each of these is quite different from the others -- for any search, you will probably be interested in any of these (a library to use, or a module or application to borrow from). Right now we're neither here nor there, as people don't think to add applications to PyPI, and the trove categories are inappropriate for libraries. On top of this is the infrastructure issue, which probably also has to be dealt with before moving forward much (i.e., SQLite and CGI). Concurrent updates to a SQLite database from multiple processes scare the crap out of me. But it doesn't look like that should be too hard to fix. 
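One way to make concurrent updates less scary, sketched here with Python's standard-library sqlite3 module (which postdates this thread), is to give each process a busy timeout and to take the write lock up front with BEGIN IMMEDIATE, so that competing writers queue at the start of a transaction. The `releases` table and the `open_db` and `record_release` helpers are hypothetical; this is not PyPI's schema or code.

```python
import sqlite3


def open_db(path, busy_timeout_s=5.0):
    """Open a connection that waits up to busy_timeout_s seconds for locks."""
    # isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
    return sqlite3.connect(path, timeout=busy_timeout_s, isolation_level=None)


def record_release(conn, name, version):
    """Insert one row inside a transaction that takes the write lock first."""
    # BEGIN IMMEDIATE acquires the write lock now, so a second writer waits
    # here (up to the busy timeout) rather than failing later in its transaction.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(
            "INSERT INTO releases (name, version) VALUES (?, ?)",
            (name, version),
        )
        conn.execute("COMMIT")
    except Exception:
        # Undo the partial transaction and release the lock before re-raising.
        conn.execute("ROLLBACK")
        raise
```

A writer that cannot get the lock within the timeout still gets an error, so callers would need to retry or report the failure.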
-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From richardjones at optushome.com.au Tue Jun 15 23:30:40 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Tue Jun 15 23:30:50 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: References: Message-ID: <200406161330.40399.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wednesday 16 Jun 2004 11:53, Ian Bicking wrote: > Howdy. I just recently posted some ideas for PyPI > (http://blog.colorstudy.com/ianb/weblog/2004/06/15.html#P123) I commented there, but I might repeat some of my comments here too where appropriate. > 1. Express relationships between packages. These are relationships > like alternative-implementation, fork, part-of, recommends, requires, > etc. At the moment I'm thinking purely about displaying this > information, not any fancy distutils magic installation of > dependencies. There's been a number of proposals and I believe some code towards implementing this kind of meta-data capture. The two extensions to distutils dealing with this issue that I know of are PIMP (/PackMan) and the ZPKG tools: http://undefined.org/python/pimp/ http://www.python.org/packman/ (couldn't find a page giving the technical details of PIMP) http://zope.org/Members/fdrake/zpkgtools/ (this page has a good list of links to prior discussions / proposals) Various proposals have also been made on this list. I have no idea how related those projects are. It would be a shame to develop *another* system. > 2. Cache packages. I.e., download a copy of the package, and if the > package disappears then we have a backup. The disappearance of packages is a concern. An archive network would solve this issue, but it requires both organisation and support from hosts. I'm pretty sure the current python.org machine is not suitable for storing packages. > The other thing that might be useful is some improved categorization of > code. The Trove categories are... 
well, they are stupid. No fault of > anyone here. CPAN's much more coarsely-grained categories are much > better, in my opinion (Acme, AI, Algorithm, Apache, AppConfig, Archive, > Array, and so on: http://www.cpan.org/modules/by-module The current Trove list may be extended - I simply drew on the two best-known lists: sourceforge and freshmeat. What's the "Acme" category hold? :) > But even more coarsely-grained than that, there are classes of package. > Right now we have libraries and applications. PyPI doesn't make this distinction - though I believe it is a useful one. > I'd like to add modules -- though the name is vague, I'm thinking of > code on the sophisticated end of the Python Cookbook entries. Small, > reusable, and not worth distutilifying This sounds like a good idea, but raises a couple of issues: 1. Distutils isn't involved, but that's OK since PyPI allows TTW (through-the-web) entry of package meta-data. 2. PyPI currently makes no assumptions about what the download_url points to. Would you advocate using the download_url for locating the module source? As I said in response to your weblog entry: "PyPI is intended to be an index of metadata that is generated by distutils. I'm not sure I'm comfortable extending that scope to include actual code fragments. It would confuse the meta-data schema and user interfaces considerably." > When you're looking for code, each of these is quite different from the > others -- for any search, you will probably be interested in any of > these (a library to use, or a module or application to borrow from). Yep. And note that some entries will span two (or all?) categories - Roundup, for example, is both a library and an application. > Right now we're neither here nor there, as people don't think to add > applications to PyPI, and the trove categories are inappropriate for > libraries. I don't believe the categories as they stand are *that* useless! 
> On top of this is the infrastructure issue, which probably also has to > be dealt with before moving forward much (i.e., SQLite and CGI). > Concurrent updates to a SQLite database from multiple processes scares > the crap out of me. But it doesn't look like that should be too hard > to fix. As I said in response to your weblog entry: "Finally, PyPI is bordering on being too large for the technologies it's built on; sqlite will need to be replaced by postgresql some time soon and the cgi.py-based web ui scales very poorly. Development such as you're proposing would push those technologies over the edge :)" On a separate topic, I believe it's pretty important that a document be written that captures your intentions. A lot of ideas have floated around on this list over the years - only to be subsequently forgotten because they're lost in the list archive. Yes, I'm suggesting writing a PEP about it. That way there's a single place someone can go to see the content and status of the proposal. Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFAz77grGisBEHG6TARAvUKAJ9Oh4oNtRzSLYmchYWwBdG2uYW2UQCdGHTU ZIFY1pyM9iM+PM5iLTFOa3w= =8/Tl -----END PGP SIGNATURE----- From ianb at colorstudy.com Wed Jun 16 00:32:13 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Jun 16 00:32:22 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: <200406161330.40399.richardjones@optushome.com.au> References: <200406161330.40399.richardjones@optushome.com.au> Message-ID: <23D3400E-BF4E-11D8-B1BB-000393C2D67E@colorstudy.com> On Jun 15, 2004, at 10:30 PM, Richard Jones wrote: >> 1. Express relationships between packages. These are relationships >> like alternative-implementation, fork, part-of, recommends, requires, >> etc. At the moment I'm thinking purely about displaying this >> information, not any fancy distutils magic installation of >> dependencies. 
> > There's been a number of proposals and I believe some code towards > implementing this kind of meta-data capture. > > The two extensions to distutils dealing with this issue that I know of > are > PIMP (/PackMan) and the ZPKG tools: > > http://undefined.org/python/pimp/ > http://www.python.org/packman/ > (couldn't find a page giving the technical details of PIMP) > http://zope.org/Members/fdrake/zpkgtools/ > (this page has a good list of links to prior discussions / > proposals) > > Various proposals have also been made on this list. I have no idea how > related > those projects are. It would be a shame to develop *another* system. I'm not entirely clear on all of these, but I think they all are looking for dependencies. Along with that they need canonical identifiers, which PyPI already has well enough (package names). For modules this wouldn't work, as the naming would be less unique. Module identifiers would be an issue, but I don't think they'd participate in automated dependencies quite so much. >> 2. Cache packages. I.e., download a copy of the package, and if the >> package disappears then we have a backup. > > The disappearance of packages is a concern. An archive network would > solve > this issue, but it requires both organisation and support from hosts. > I'm > pretty sure the current python.org machine is not suitable for storing > packages. It should mostly take disk space, at least how I'm envisioning it. If each package has a download URL (that's a real download URL, not just a web page with other references) then we cache the archives and provide a link to that archive if we detect that the source archive is gone. Packages without download locations won't be very popular (though they can still be interesting -- I've certainly found links to missing code that would interest me). >> The other thing that might be useful is some improved categorization >> of >> code. The Trove categories are... well, they are stupid. No fault of >> anyone here. 
CPAN's much more coarsely-grained categories are much >> better, in my opinion (Acme, AI, Algorithm, Apache, AppConfig, >> Archive, >> Array, and so on: http://www.cpan.org/modules/by-module > > The current Trove list may be extended - I simply drew on the two > best-known > lists: sourceforge and freshmeat. > > What's the "Acme" category hold? :) Joke modules, I believe. Pythonistas apparently aren't as prone to humor. So it goes. I've found the trove categories to be overwhelming to use when creating packages, and I've never paid attention to them when looking for packages. In part because I can't expect authors to have defined categories for their package. In Perl the categories are also caught up in naming, which I don't think we'd use. And you can't belong to multiple categories, for the same reason. But I think they present a simpler set of categories that would be more useful. The Vaults has a reasonable set of categories as well. We just need less categories. >> But even more coarsely-grained than that, there are classes of >> package. >> Right now we have libraries and applications. > > PyPI doesn't make this distinction - though I believe it is a useful > one. > > >> I'd like to add modules -- though the name is vague, I'm thinking of >> code on the sophisticated end of the Python Cookbook entries. Small, >> reusable, and not worth distutilifying > > This sounds like a good idea, but raises a couple of issues: > > 1. Distutils isn't involed, but that's OK since PyPI allows TTW entry > of package meta-data. I'd probably want to set up a automatic submission client that uses docstrings, but that's a separate issue. > 2. PyPI currently makes no assumptions about what the download_url > points to. Would you advocate using the download_url for locating > the module source? Yes, or another field. Freshmeat allows for a set of download URLs, which would potentially help this -- i.e., Windows installer, tarball, rpm or deb, etc. 
> As I said in response to your weblog entry: > > "PyPI is intended to be an index of metadata that is generated by > distutils. > I'm not sure I'm comfortable extending that scope to include actual > code > fragments. It would confuse the meta-data schema and user interfaces > considerably." The idea of broad categories (application, library, module) may alleviate the UI issues. We already have enough fragmentation -- even the Vaults get new submissions that don't go to PyPI -- so I'd hate to set up an entirely separate system. It could be parallel, but that doesn't seem necessary. Anyway, the prerequisite features are generally useful, so it's not a decision that has to happen yet. >> When you're looking for code, each of these is quite different from >> the >> others -- for any search, you will probably be interested in any of >> these (a library to use, or a module or application to borrow from). > > Yep. And note that some entries will span two (or all?) categories - > Roundup, > for example, is both a library and an application. Maybe that's what would be called a "framework". But yes, it's a little vague. >> Right now we're neither here nor there, as people don't think to add >> applications to PyPI, and the trove categories are inappropriate for >> libraries. > > I don't believe the categories as they stand are *that* useless! Perhaps. On one hand they are a set of properties (e.g., development status or natural language), which you probably wouldn't search on, but which are useful fields. Or a broad filter that would be appropriate for separate interfaces (intended audience). Or largely meaningless (at least for libraries, particularly OS and programming language). Which leaves the topics, which aren't the best set of categories. And I don't think I'd rely on them. So far I just have searched on the description. A full-text of description, keywords, title, and classifiers would probably be my favorite search if available. 
Unless I'm searching for a specific package, I would find searching on any single field (including category) to be too restrictive and too likely to cause me to miss something interesting. >> On top of this is the infrastructure issue, which probably also has to >> be dealt with before moving forward much (i.e., SQLite and CGI). >> Concurrent updates to a SQLite database from multiple processes scares >> the crap out of me. But it doesn't look like that should be too hard >> to fix. > > As I said in response to your weblog entry: > > "Finally, PyPI is bordering on being too large for the technologies > it's built > on; sqlite will need to be replaced by postgresql some time soon and > the > cgi.py-based web ui scales very poorly. Development such as you're > proposing > would push those technologies over the edge :)" > > > On a separate topic, I believe it's pretty important that a document be > written that captures your intentions. A lot of ideas have floated > around on > this list over the years - only to be subsequently forgotten because > they're > lost in the list archive. Yes, I'm suggesting writing a PEP about it. > That > way there's a single place someone can go to see the content and > status of > the proposal. Sure, after a bit of back-and-forth here. Maybe it would be easier to just write something up to be put in docs/ in CVS. 
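The combined search over description, keywords, title, and classifiers mentioned above could start out as simple as this sketch. The record layout (a dict with `title`, `description`, `keywords`, and `classifiers`) and the `search_packages` name are assumptions for illustration, not PyPI's schema, and a real index would want something faster than a linear scan.

```python
def search_packages(packages, query):
    """Return titles of packages whose combined text contains every query term."""
    terms = query.lower().split()
    hits = []
    for pkg in packages:
        # Fold every searchable field into one lowercase blob, so a term can
        # match in any field instead of only in a single chosen one.
        blob = " ".join([
            pkg.get("title", ""),
            pkg.get("description", ""),
            " ".join(pkg.get("keywords", [])),
            " ".join(pkg.get("classifiers", [])),
        ]).lower()
        if all(term in blob for term in terms):
            hits.append(pkg["title"])
    return hits
```

Matching is by plain substring, so a term can hit in the title, the free text, or a classifier without the searcher having to pick a field.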
-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From richardjones at optushome.com.au Wed Jun 16 05:59:45 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Wed Jun 16 05:59:57 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: <23D3400E-BF4E-11D8-B1BB-000393C2D67E@colorstudy.com> References: <200406161330.40399.richardjones@optushome.com.au> <23D3400E-BF4E-11D8-B1BB-000393C2D67E@colorstudy.com> Message-ID: <200406161959.45396.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wednesday 16 Jun 2004 14:32, Ian Bicking wrote: > For modules this wouldn't work, as the naming would be less unique. > Module identifiers would be an issue, but I don't think they'd > participate in automated dependencies quite so much. If you're going to have some meta-data embedded in the module, then one of those fields can be a name in the PyPI namespace. I think that if the modules are going to be in PyPI, then they've got to have a unique name. Names are keys in PyPI (just as they are in CPAN / PAUSE). > It should mostly take disk space, at least how I'm envisioning it. Then current python.org (creosote) is definitely not up to the task. > If > each package has a download URL (that's a real download URL, not just a > web page with other references) then we cache the archives and provide > a link to that archive if we detect that the source archive is gone. I guess the issue is how we know what the download_url points to. I think we agree that the distutils meta-data is going to have to grow some additional fields (or a single complex field) that point specifically to source, win32 binary, redhat RPM, etc. download files. Of course, for projects hosted on sourceforge, all this is moot since there is no such thing as a URL pointing to a file (ok, there is, but I suspect your project would be booted if you used URLs pointing directly at mirrors). > > What's the "Acme" category hold? :) > > Joke modules, I believe. 
Pythonistas apparently aren't as prone to > humor. So it goes. That's what I figured. I'll take the rest of your statement in the sarcastic light that it was obviously intended ;) > I've found the trove categories to be overwhelming to use when creating > packages, and I've never paid attention to them when looking for > packages. In part because I can't expect authors to have defined > categories for their package. But they do. I've personally found the category searching to be quite productive a couple of times now. Perhaps I should generate some statistics? I'd have separate counts for users using any categories and those using topics... > In Perl the categories are also caught up in naming, which I don't > think we'd use. And you can't belong to multiple categories, for the > same reason. But I think they present a simpler set of categories that > would be more useful. The Vaults has a reasonable set of categories as > well. We just need less categories. Again, the categories we have at the moment are just the combination of the sourceforge and freshmeat listings. I'm well aware it's not the best list that it could be and I'd be more than happy to work on the list. > I'd probably want to set up a automatic submission client that uses > docstrings, but that's a separate issue. I think this is a great idea (I'm a fan of lowest-possible-burden for contributors ;) > The idea of broad categories (application, library, module) may > alleviate the UI issues. Agreed. > We already have enough fragmentation -- even > the Vaults get new submissions that don't go to PyPI -- so I'd hate to > set up an entirely separate system. Aside: It really is a shame I got zero response from repeated enquiries about collaboration with the Vaults people. I honestly didn't want to have to develop a new system :( > Sure, after a bit of back-and-forth here. Maybe it would be easier to > just write something up to be put in docs/ in CVS. Which CVS? 
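A docstring-driven submission client could start from something like the sketch below, which reads `Key: value` lines out of a module docstring without importing the module. The field names, the docstring layout, and the `docstring_metadata` helper are all hypothetical; nothing here is an agreed format. The `name` field would be the unique key in the PyPI namespace discussed above.

```python
import ast


def docstring_metadata(source):
    """Parse 'Key: value' lines from a module docstring into a dict."""
    # ast.parse reads the source text only; the module is never executed.
    doc = ast.get_docstring(ast.parse(source)) or ""
    meta = {}
    for line in doc.splitlines():
        key, sep, value = line.partition(":")
        # Only accept simple one-word keys such as Name or Version.
        if sep and key.strip().isalpha():
            meta[key.strip().lower()] = value.strip()
    return meta
```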
Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA0BoRrGisBEHG6TARAlL+AJ0Ql/XkS7I2AyQ5GZVfBK1p7NyqegCeL5Ue sb/rHbE3eN1uH00jd1ThGPc= =4g4l -----END PGP SIGNATURE----- From ianb at colorstudy.com Wed Jun 16 12:51:49 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Jun 17 08:12:36 2004 Subject: [Catalog-sig] Upgrading infrastructure Message-ID: <40D07AA5.2090201@colorstudy.com> I also noticed in those comments a note on Postgres, and that it still needed to be installed, and who knows when that would happen... Would it be sufficient just to move to a threaded, long-running model instead of CGI? For all its flaws (in a highly concurrent environment), I suspect SQLite probably wouldn't be so bad once the CGI overhead is gone, and the CGI overhead will be significant even if Postgres is in place. Thoughts on what to use? Static generation of a few key files would also be helpful (like the RSS feed). Ian From catalog-sig at claytonbrown.net Wed Jun 16 10:59:15 2004 From: catalog-sig at claytonbrown.net (catalog-sig@claytonbrown.net) Date: Thu Jun 17 09:43:19 2004 Subject: [Catalog-sig] Package/Module/Recipe Versioning, Aggregation and Distrobution Message-ID: I have been developing a bootstrap loader to enable module/package and Python interpreter versioning/specification at time of import within a script. This is primarily to encourage code re-use/portability (satisfying dependencies on multiple machines and platforms), and to allow revision control and coexistence of packages, a particular weak point of Python in my experience (I could be just doing things wrong). I am interested now in how such versioned packages could be aggregated and made available through a centralised service such as PyPI/CPAN, and in whether the functionality described below will be a spanner in the works for distribution techniques.
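The import-time version request described here can be approximated without new syntax. This is a minimal sketch, not the recipe's code: `version_matches` and `import_versioned` are hypothetical helpers, and it assumes the module exposes a `__version__` string, which many packages do not.

```python
import importlib


def version_matches(installed, requested):
    """True if installed agrees with requested to the depth requested gives."""
    want = requested.split(".")
    have = installed.split(".")
    # '1.2' accepts 1.2.0 and 1.2.9 but not 1.3.0; deeper requests are stricter.
    return have[:len(want)] == want


def import_versioned(name, requested):
    """Import a module and check its __version__ against the request."""
    module = importlib.import_module(name)
    installed = getattr(module, "__version__", None)
    if installed is None or not version_matches(installed, requested):
        raise ImportError(
            "%s %s does not satisfy requested version %s"
            % (name, installed, requested)
        )
    return module
```

This only checks whichever copy is already on the import path; choosing between several installed versions, as the loader does, would additionally need the versioned directory layout.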
---------my actual question ------------------
To people familiar with distributing Python projects and the associated tools (e.g. py2exe, distutils, PyPI, freeze, etc.), which is not my strong point in Python: is this going to cause headaches with the way the aforementioned tools work? Can the versioning/package retrieval/recipe retrieval described below be easily integrated with a CPAN-like service such as PyPI and http://python.org/peps/pep-0273.html in a logical manner, allowing platform-specific, versioned packages to be aggregated and made available automagically at time of import, or through some form of dependency checker?
------------------------------------
-----functionality------------
This is similar to a technique seen in PythonMegaWidgets and a discussion I found of David Ascher's on versioning. I have made my initial version of this available on ASPN. It comprises two parts, versioner.py and version_loader.py (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/285215). versioner walks a site-packages tree identifying folder structures where versioned packages occur, and distributes version_loader.py as __init__.py in the parent folder of these versioned packages. It also creates empty placeholder __init__.py's and autoloaders (from foldername import * - where script name = folder name and no __init__.py exists) so that you can automagically have "from package.folder.folder.script import method, etc ,etc". Admittedly Mark Hammond's win32, win32com, etc. didn't work at first inspection, but I am happy for such packages to remain locally in site-packages, as these are not cross-platform packages anyhow; however, all other packages I have migrated to my versioned site-packages directory have played nicely in the tests made.
(for example: bsddb3, ClientCookie, DCOracle2, Ft, id3, log4py, logging, mx, OpenSSL, PIL, psyco, psyco2, PythonMagick, soaplib, wxPython, xmlplus), with the manual hack of dragging any built binaries (.so, .dll, .pyd, etc.) for said package from DLLs, etc. into the versioned package folder (OK, when this is dependent on system services, e.g. mysqld, Berkeley DB, etc., this could get difficult).

The impact on Python syntax for this is minimal, with local variables such as:

    _foo_version_ = '1.2.3.4'  # where point depth is the level of compatibility required
    import foo

Optionally:

    _python_version_ = '2.2.2'  # again, point depth is variable

I further intend to remove all of the pollution this makes of the local namespace once the import has been performed. I have not gone through extensive testing or bug fixing as yet, but all seems to work quite nicely. Ultimately I would love an extension to the Python core syntax along the lines of:

    import foo version 2.3.4.5 (with innate platform inspection suffixing on package name)  # to give the behaviour I have implemented

I thought I had seen a PEP on this, though I can't seem to find it.
------------------------------------
------------extension to this------------
Ultimately I would like to achieve a CPAN-like web retrieval of versioned packages/scripts which are referenced yet not available, doing this at time of import, perhaps if a local variable is declared, e.g. _PyPi_download_ = 1, where if a version is specified it will retrieve that version, else defaulting to the latest. I would like to extend this with recipes as well, e.g.
    import recipe.aspn_python.stringutil.md5hash as md5hash
    import recipe.parnassus.stringutil.utf8escape as ut8esc
    import recipe.claysstuff.stringutil.utf8unescape as ut8unesc

Where one could register a base folder name and their lookup service with a central web lookup service, which fires a search for a package/module/script name and returns standardised XML results to allow retrieval of the said package, storing it in the path you have nominated for versioned packages/recipes (perhaps in site.py), perhaps even allowing syntax to import all packages from source/category/subcategory/etc., e.g. aspn_python.category.*

Having PyPI aggregate packages (perhaps with unit tests also) and mirror them seems logical; bundling all immediate dependencies within a single archive also seems logical. Having packages also nominate dependencies could be a huge benefit, e.g.:

    requires['package'] = '1.2.3'

Or a standard dependencies.xml in the base folder of the package. OK, this may be simplistic, and modelling dependencies is a whole separate issue, but for example: libpthread.so.0 #using ldd #using locate(if present) | binaryName -v [perhaps this is a little flimsy] #for private packages #uses Pypi to get search service to locate/download package
------------------------------------
I thought I may as well put out some of the concerns I came across with versioned package management, and some of the enhancements I could see being of benefit, for discussion. My apologies if I have missed such discussions in these groups; I have only recently joined them. Apologies if any of this seems unclear.
From amk at amk.ca Wed Jun 16 12:14:56 2004 From: amk at amk.ca (A.M.
Kuchling) Date: Thu Jun 17 12:35:52 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: <200406161330.40399.richardjones@optushome.com.au> References: <200406161330.40399.richardjones@optushome.com.au> Message-ID: <20040616161455.GA1972@rogue.amk.ca> On Wed, Jun 16, 2004 at 01:30:40PM +1000, Richard Jones wrote: > "Finally, PyPI is bordering on being too large for the technologies it's built > on; sqlite will need to be replaced by postgresql some time soon and the > cgi.py-based web ui scales very poorly. Development such as you're proposing I can pursue installing PostgreSQL on the Python server. It would be useful for PyPI, could be used for Roundup, and might be handy for content management on python.org. --amk From ianb at colorstudy.com Thu Jun 17 12:48:45 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Jun 17 12:49:03 2004 Subject: [Catalog-sig] Category suggestions Message-ID: <40D1CB6D.4070207@colorstudy.com> Here's a list of categories that I think are unneeded, with a few additions as well (marked with +). Generally I think a category should only exist if ... (a) Someone would say "I want something like X", where X is a category, or... (b) Having found a package, I want to know if it has property X (e.g., licensing, maturity) (c) It can't be replaced with an unambiguous keyword, or an element of the description (e.g., Z39.50) (d) If a subcategory, a user would be genuinely interested in the specific subcategory, where there would be an *excess* of uninteresting packages in the parent category. (e) If not a property-based category (e.g., maturity level), it shouldn't apply to a significant number of the packages. "Utilities" is silly. "Python" is obvious. With a bit more thought, it would probably be possible to trim the remaining categories considerably, and add in some more useful categories. E.g., "metaclasses". Generally there should be more Python-specific categories (e.g., Zope, etc). 
Any Python framework that has a significant number of packages that depend on that framework should be a category. (Unless a relationship system makes that redundant, which might be an interesting way to factor it.)

Maybe the properties should also be removed and turned into normal fields. E.g., we already have a license field. It's nice to sort on free/proprietary, and maybe permissive/GPL (for free) and free-but-proprietary/not-free... but maybe those categories can be filled in automatically instead of having the redundancy.

The categories:

Environment :: Console :: Curses
Environment :: Console :: Framebuffer
Environment :: Console :: Newt
Environment :: Console :: svgalib
Environment :: MacOS X :: Aqua
Environment :: MacOS X :: Carbon
Environment :: MacOS X :: Cocoa
Environment :: Other Environment
Environment :: Plugins
Environment :: Web Environment :: Mozilla (?)
Natural Language :: English (?)
Operating System :: OS Independent (generally, the OS categories seem excessive for Python)
Programming Language :: Python (well duh it uses Python)
Programming Language :: Zope
Topic :: Communications :: Email :: Address Book
Topic :: Communications :: Email :: Email Clients (MUA)
Topic :: Communications :: Email :: Filters
Topic :: Communications :: Email :: Mail Transport Agents
Topic :: Communications :: Email :: Mailing List Servers
Topic :: Communications :: Email :: Post-Office
Topic :: Communications :: Email :: Post-Office :: IMAP
Topic :: Communications :: Email :: Post-Office :: POP3
+ Topic :: Communications :: Email :: Client
+ Topic :: Communications :: Email :: Server
Topic :: Communications :: FIDO
Topic :: Communications :: File Sharing :: Gnutella
Topic :: Communications :: File Sharing :: Napster (too timely)
Topic :: Communications :: Ham Radio
Topic :: Communications :: Internet Phone
Topic :: Communications :: Telephony (both?)
Topic :: Database (too *few* topics...)
+ Topic :: Database :: MySQL
+ Topic :: Database :: PostgreSQL
+ Topic :: Database :: SQLite
+ Topic :: Database :: Other RDBMS
+ Topic :: Database :: RDBMS wrappers
+ Topic :: Database :: Persistence (for ZODB, Kirbybase, etc)
Topic :: Desktop Environment :: K Desktop Environment (KDE) :: Themes
Topic :: Desktop Environment :: PicoGUI :: Applications
Topic :: Desktop Environment :: PicoGUI :: Themes
Topic :: Desktop Environment :: Window Managers :: Afterstep
Topic :: Desktop Environment :: Window Managers :: Afterstep :: Themes
Topic :: Desktop Environment :: Window Managers :: Applets
Topic :: Desktop Environment :: Window Managers :: Blackbox
Topic :: Desktop Environment :: Window Managers :: Blackbox :: Themes
Topic :: Desktop Environment :: Window Managers :: CTWM
Topic :: Desktop Environment :: Window Managers :: CTWM :: Themes
Topic :: Desktop Environment :: Window Managers :: Enlightenment
Topic :: Desktop Environment :: Window Managers :: Enlightenment :: Epplets
Topic :: Desktop Environment :: Window Managers :: Enlightenment :: Themes DR15
Topic :: Desktop Environment :: Window Managers :: Enlightenment :: Themes DR16
Topic :: Desktop Environment :: Window Managers :: Enlightenment :: Themes DR17
Topic :: Desktop Environment :: Window Managers :: FVWM
Topic :: Desktop Environment :: Window Managers :: FVWM :: Themes
Topic :: Desktop Environment :: Window Managers :: Fluxbox
Topic :: Desktop Environment :: Window Managers :: Fluxbox :: Themes
Topic :: Desktop Environment :: Window Managers :: IceWM
Topic :: Desktop Environment :: Window Managers :: IceWM :: Themes
Topic :: Desktop Environment :: Window Managers :: MetaCity
Topic :: Desktop Environment :: Window Managers :: MetaCity :: Themes
Topic :: Desktop Environment :: Window Managers :: Oroborus
Topic :: Desktop Environment :: Window Managers :: Oroborus :: Themes
Topic :: Desktop Environment :: Window Managers :: Sawfish
Topic :: Desktop Environment :: Window Managers :: Sawfish :: Themes 0.30
Topic :: Desktop Environment :: Window Managers :: Sawfish :: Themes pre-0.30 Topic :: Desktop Environment :: Window Managers :: Waimea Topic :: Desktop Environment :: Window Managers :: Waimea :: Themes Topic :: Desktop Environment :: Window Managers :: Window Maker Topic :: Desktop Environment :: Window Managers :: Window Maker :: Applets Topic :: Desktop Environment :: Window Managers :: Window Maker :: Themes Topic :: Desktop Environment :: Window Managers :: XFCE Topic :: Desktop Environment :: Window Managers :: XFCE :: Themes Topic :: Internet :: WAP Topic :: Internet :: WWW/HTTP :: Dynamic Content Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Message Boards Topic :: Internet :: WWW/HTTP :: Dynamic Content :: News/Diary Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Page Counters + Topic :: Internet :: WWW/HTTP :: Frameworks + Topic :: Internet :: WWW/HTTP :: Frameworks :: CGI + Topic :: Internet :: WWW/HTTP :: Frameworks :: mod_python + Topic :: Internet :: WWW/HTTP :: Twisted + Topic :: Internet :: WWW/HTTP :: Zope 2 + Topic :: Internet :: WWW/HTTP :: Zope 2 :: Products + Topic :: Internet :: WWW/HTTP :: Zope 3 + Topic :: Internet :: WWW/HTTP :: Zope 3 :: Products + Topic :: Internet :: WWW/HTTP :: Content Management (Actually, I'd rather rethink all of Internet) Topic :: Internet :: Z39.50 Topic :: Multimedia :: Graphics :: Capture :: Digital Camera Topic :: Multimedia :: Graphics :: Capture :: Scanners Topic :: Multimedia :: Graphics :: Capture :: Screen Capture Topic :: Multimedia :: Sound/Audio :: CD Audio :: CD Playing Topic :: Multimedia :: Sound/Audio :: CD Audio :: CD Ripping Topic :: Multimedia :: Sound/Audio :: CD Audio :: CD Writing Topic :: Multimedia :: Sound/Audio :: Players :: MP3 Topic :: Office/Business :: News/Diary (It's not clear why this is Office/Business) Topic :: Other/Nonlisted Topic Re: Topic :: Scientific/Engineering (It might be good to get input 
from someone who cares about this area) Topic :: Sociology Topic :: Sociology :: Genealogy Topic :: Sociology :: History (Sociology just seems strange -- genealogy belongs somewhere, though) Topic :: Software Development :: Assemblers Topic :: Software Development :: Disassemblers Topic :: Software Development :: Documentation Topic :: Software Development :: Libraries Topic :: Software Development :: Libraries :: Application Frameworks Topic :: Software Development :: Libraries :: Java Libraries Topic :: Software Development :: Libraries :: PHP Classes Topic :: Software Development :: Libraries :: Perl Modules Topic :: Software Development :: Libraries :: Pike Modules Topic :: Software Development :: Libraries :: Python Modules Topic :: Software Development :: Libraries :: Ruby Modules Topic :: Software Development :: Libraries :: Tcl Extensions Topic :: Software Development :: Localization (Redundant with Internationalization) Topic :: Software Development :: Object Brokering Topic :: Software Development :: Object Brokering :: CORBA ("Object Brokering" is a loaded term) Topic :: Software Development :: Quality Assurance (Redundant with Testing / Bug Tracking) Topic :: Software Development :: Testing :: Traffic Generation Topic :: Software Development :: Version Control :: RCS Topic :: Software Development :: Version Control :: SCCS + Topic :: Software Development :: Version Control :: Subversion Topic :: Software Development :: Widget Sets Topic :: System :: Archiving :: Backup Topic :: System :: Archiving :: Compression Topic :: System :: Archiving :: Mirroring Topic :: System :: Archiving :: Packaging Topic :: System :: Boot Topic :: System :: Boot :: Init Topic :: System :: Clustering (Redundant with Distributed Computing) Topic :: System :: Console Fonts Topic :: System :: Emulators Topic :: System :: Filesystems Topic :: System :: Hardware Topic :: System :: Hardware :: Hardware Drivers Topic :: System :: Hardware :: Mainframes Topic :: System :: Hardware :: 
Symmetric Multi-processing Topic :: System :: Networking :: Monitoring Topic :: System :: Networking :: Monitoring :: Hardware Watchdog Topic :: System :: Networking :: Time Synchronization Topic :: System :: Operating System Topic :: System :: Operating System Kernels Topic :: System :: Operating System Kernels :: BSD Topic :: System :: Operating System Kernels :: GNU Hurd Topic :: System :: Operating System Kernels :: Linux Topic :: System :: Power (UPS) Topic :: System :: Recovery Tools Topic :: System :: Software Distribution ("Software" loaded, redundant with Installation/Setup -- File Distribution maybe more appropriate) Topic :: System :: Systems Administration Topic :: System :: Systems Administration :: Authentication/Directory Topic :: System :: Systems Administration :: Authentication/Directory :: LDAP Topic :: System :: Systems Administration :: Authentication/Directory :: NIS (Audience, not Topic) Topic :: Terminals Topic :: Terminals :: Serial Topic :: Terminals :: Telnet Topic :: Terminals :: Terminal Emulators/X Terminals Topic :: Text Editors :: Text Processing Topic :: Text Processing :: Fonts Topic :: Text Processing :: General Topic :: Text Processing :: Markup :: LaTeX Topic :: Text Processing :: Markup :: SGML Topic :: Text Processing :: Markup :: VRML Topic :: Utilities (Too vague) From amk at amk.ca Thu Jun 17 14:01:18 2004 From: amk at amk.ca (A.M. Kuchling) Date: Thu Jun 17 14:02:02 2004 Subject: [Catalog-sig] Which PostgreSQL interface? Message-ID: <20040617180118.GB9325@rogue.amk.ca> I've embarked on compiling PostgreSQL on www.python.org, and am wondering which Python interface to use. I've used Pygresql and have been happy with it, but Richard and Ian, I'd like to know if you have opinions about this. --amk From ianb at colorstudy.com Thu Jun 17 14:36:59 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Jun 17 14:37:13 2004 Subject: [Catalog-sig] Which PostgreSQL interface? 
In-Reply-To: <20040617180118.GB9325@rogue.amk.ca> References: <20040617180118.GB9325@rogue.amk.ca> Message-ID: <40D1E4CB.5020100@colorstudy.com> A.M. Kuchling wrote: > I've embarked on compiling PostgreSQL on www.python.org, and am > wondering which Python interface to use. I've used Pygresql and have > been happy with it, but Richard and Ian, I'd like to know if you have > opinions about this. PyGreSQL is kind of wonky sometimes. Psycopg seems to be the better interface, in my experience. Ian From slash at dotnetslash.net Thu Jun 17 15:03:56 2004 From: slash at dotnetslash.net (Mark W. Alexander) Date: Thu Jun 17 15:04:07 2004 Subject: [Catalog-sig] Which PostgreSQL interface? In-Reply-To: <40D1E4CB.5020100@colorstudy.com> References: <20040617180118.GB9325@rogue.amk.ca> <40D1E4CB.5020100@colorstudy.com> Message-ID: <20040617190356.GA22597@dotnetslash.net> On Thu, Jun 17, 2004 at 01:36:59PM -0500, Ian Bicking wrote: > A.M. Kuchling wrote: > >I've embarked on compiling PostgreSQL on www.python.org, and am > >wondering which Python interface to use. I've used Pygresql and have > >been happy with it, but Richard and Ian, I'd like to know if you have > >opinions about this. > > PyGreSQL is kind of wonky sometimes. Psycopg seems to be the better > interface, in my experience. This was some time ago, but the author of PyGreSQL stated that he did not use the DB-API compatible mode, so his support for it was hit & miss based on contributions. psycopg, otoh, was written to the DB-API-2.0 spec, with thread-safety in mind, and transparently supports connection pooling. I switched to psycopg and haven't looked back (though I'd be interested in more current experiences with PyGreSQL.) mwa -- Mark W. Alexander slash@dotnetslash.net The contents of this message authored by Mark W. Alexander are released under the Creative Commons Attribution-NonCommercial license. Copyright of quoted materials are retained by the original author(s). 
http://creativecommons.org/licenses/by-nc/2.0/ From richardjones at optushome.com.au Thu Jun 17 18:14:09 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Thu Jun 17 18:14:43 2004 Subject: [Catalog-sig] Which PostgreSQL interface? In-Reply-To: <20040617180118.GB9325@rogue.amk.ca> References: <20040617180118.GB9325@rogue.amk.ca> Message-ID: <200406180814.13103.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday 18 Jun 2004 04:01, A.M. Kuchling wrote: > I've embarked on compiling PostgreSQL on www.python.org, and am > wondering which Python interface to use. I've used Pygresql and have > been happy with it, but Richard and Ian, I'd like to know if you have > opinions about this. I have had good experiences with psycopg. It requires the mxDateTime module though, which makes building a little more painful. I have had no experiences with any other psql python modules though. Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA0he1rGisBEHG6TARAmpfAJ41OGpNhB98pvopltdCB+aZKPGMSwCeM1jp G6NSe26znq6Edt3TYKFShhY= =fqDw -----END PGP SIGNATURE----- From golux at comcast.net Thu Jun 17 15:36:28 2004 From: golux at comcast.net (Stephen Waterbury) Date: Sun Jun 20 23:35:05 2004 Subject: [Catalog-sig] Which PostgreSQL interface? In-Reply-To: <20040617180118.GB9325@rogue.amk.ca> References: <20040617180118.GB9325@rogue.amk.ca> Message-ID: <40D1F2BC.3000907@comcast.net> A.M. Kuchling wrote: > I've embarked on compiling PostgreSQL on www.python.org, and am > wondering which Python interface to use. I've used Pygresql and have > been happy with it, but Richard and Ian, I'd like to know if you have > opinions about this. I've used pyPgSQL and psycopg with good results, both by themselves and as backends with Twisted's adbapi. Both pyPgSQL and psycopg are very well supported, and questions to their maintainers are answered very promptly. 
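For readers comparing the adapters discussed in this thread, the DB-API 2.0 calling pattern that psycopg is said to follow looks roughly like the sketch below. This is only an illustration: it uses the standard library's sqlite3 module as a stand-in driver so that it runs without a PostgreSQL server, and the `packages` table, its columns, and the sample rows are hypothetical. With psycopg the `connect()` arguments and the parameter placeholder style would differ.

```python
import sqlite3  # stand-in DB-API 2.0 driver; not the adapter chosen in this thread


def list_packages(conn):
    """Return package names in alphabetical order using plain DB-API calls."""
    cur = conn.cursor()
    cur.execute("SELECT name FROM packages ORDER BY name")
    return [row[0] for row in cur.fetchall()]


# Hypothetical schema and rows, purely for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, summary TEXT)")
cur.executemany(
    "INSERT INTO packages (name, summary) VALUES (?, ?)",  # sqlite3 uses '?' placeholders
    [("roundup", "issue tracker"), ("docutils", "documentation tools")],
)
conn.commit()

names = list_packages(conn)
print(names)
```

Because the connect/cursor/execute/fetch sequence is common to DB-API 2.0 drivers, code written this way mostly needs only its connection setup and placeholder style changed when switching adapters.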
Steve From amk at amk.ca Fri Jun 18 09:47:35 2004 From: amk at amk.ca (A.M. Kuchling) Date: Sun Jun 20 23:43:05 2004 Subject: [Catalog-sig] PostgreSQL installed Message-ID: <20040618134735.GA16283@rogue.amk.ca> I've finished installing PostgreSQL 7.4.3 and the Psycopg DB adapter on python.org. It's configured to only support Unix domain sockets, not TCP/IP, so databases can only be accessed from creosote. Original source code is in ~amk/source/ext ; if we come up with a standard place to keep source code, I'll move it. The binaries, libraries, and data are in /usr/local/pgsql743/ ; you may want to add /usr/local/pgsql743/bin to your path. A cron job will make a nightly backup to /usr/local/pgsql743/dump/backup . Currently I'm the only user with the ability to create new users and databases. Ask me to create a PostgreSQL user for you; let me know if you want the right to create new databases and/or new DB users. Since everyone with login access to creosote is trusted, I'll give these rights to anyone who asks. Richard, I've already created a user for you and a database named "pypi"; you should be able to run "psql pypi" and then type SQL queries. Let me know if there's a problem. --amk From richardjones at optushome.com.au Fri Jun 18 20:51:39 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Sun Jun 20 23:46:11 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: <20040616161455.GA1972@rogue.amk.ca> References: <200406161330.40399.richardjones@optushome.com.au> <20040616161455.GA1972@rogue.amk.ca> Message-ID: <200406191051.43755.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thursday 17 Jun 2004 02:14, A.M. Kuchling wrote: > On Wed, Jun 16, 2004 at 01:30:40PM +1000, Richard Jones wrote: > > "Finally, PyPI is bordering on being too large for the technologies it's > > built on; sqlite will need to be replaced by postgresql some time soon > > and the cgi.py-based web ui scales very poorly. 
Development such as > > you're proposing > > I can pursue installing PostgreSQL on the Python server. It would be > useful for PyPI, could be used for Roundup, and might be handy for > content management on python.org. That would be brilliant, thanks! Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA044frGisBEHG6TARAlXSAJ40E6p3+dHQbLD1CtTv03AaK9AikwCcCdAi 5lAxfDS6ngMI9RvOhJA93Ug= =zZNu -----END PGP SIGNATURE----- From ianb at colorstudy.com Wed Jun 16 12:31:05 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Sun Jun 20 23:47:18 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: <200406161959.45396.richardjones@optushome.com.au> References: <200406161330.40399.richardjones@optushome.com.au> <23D3400E-BF4E-11D8-B1BB-000393C2D67E@colorstudy.com> <200406161959.45396.richardjones@optushome.com.au> Message-ID: <40D075C9.9020108@colorstudy.com> Richard Jones wrote: > On Wednesday 16 Jun 2004 14:32, Ian Bicking wrote: >>For modules this wouldn't work, as the naming would be less unique. >>Module identifiers would be an issue, but I don't think they'd >>participate in automated dependencies quite so much. > > > If you're going to have some meta-data embedded in the module, then one of > those fields can be a name in the PyPI namespace. > > I think that if the modules are going to be in PyPI, then they've got to have > a unique name. Names are keys in PyPI (just as they are in CPAN / PAUSE). I think they'd have to be parameterized in some way then -- stand-alone modules just aren't likely to be uniquely named. Or, to make them uniquely named would lead to funky names (e.g., joe_screenscraper.py). Maybe it could be a name of author_username:module_name, or something like that. Or maybe the names simply don't have to match the Python module name. I really *don't* want to encourage a lot of distutiled modules with conflicting names. 
In the comments on my post (http://blog.colorstudy.com/system/comments.py?u=0000001&p=P123) someone suggested automatically creating a zip/tarball with the proper setup.py, and I think that would be a bad idea and would lead to a polluted site-packages. >>It should mostly take disk space, at least how I'm envisioning it. > > Then current python.org (creosote) is definitely not up to the task. Okay... then maybe it should be a RPC setup. Like a client can query for a list of URLs that have not been archived (sufficiently), and can register the fact it has archived a URL. Then there'd be the infrastructure so that archiving can be offloaded to another machine, though no archiving or mirroring would be built into the system. There shouldn't be any lack of machines for the use -- disk and bandwidth is so cheap these days that a complex system like CPAN's seems like overkill. I'm guessing creosote is just kind of old, and (understandably) no one wants to deal with the sysadmin issues of upgrading. >>If >>each package has a download URL (that's a real download URL, not just a >>web page with other references) then we cache the archives and provide >>a link to that archive if we detect that the source archive is gone. > > > I guess the issue is how we know what the download_url points to. > > I think we agree that the distutils meta-data is going to have to grow some > additional fields (or single a complex field) that point specifically to > source, win32 binary, redhat RPM, etc. download files. Of course, for > projects hosted on sourceforge, all this is moot since there is no such thing > as a URL pointing to a file (ok, there is, but I suspect your project would > be booted if you used URLs pointing directly at mirrors). I think the SF downloading should be built into the downloading client, as a kind of screen scraper to get to the real file. There's so much stuff on SF that it can be special cased. 
Alternately, we could ask submitters to give us a direct URL, and we would only distribute that URL to mirrors, never to users. In case of URLs, perhaps we need a (url, url_type, url_description) relation, where url_type is restricted, and maybe (or maybe not) (url, url_type) is unique. url_type would be like homepage, documentation, changelog, tar.gz download, Windows installer, Mac disk image, etc. Kind of like SourceForge does for downloads, but both a bit larger, and a bit less explicit. (E.g., I think SF has two different types for .gz and .bz2 files, which seems unnecessary.) >>>What's the "Acme" category hold? :) >> >>Joke modules, I believe. Pythonistas apparently aren't as prone to >>humor. So it goes. > > > That's what I figured. I'll take the rest of your statement in the sarcastic > light that it was obviously intended ;) There really aren't that many joke Python modules, at least that I've seen. Maybe because we lack the namespace for them. Or because we are less prone to puns. >>I've found the trove categories to be overwhelming to use when creating >>packages, and I've never paid attention to them when looking for >>packages. In part because I can't expect authors to have defined >>categories for their package. > > > But they do. I've personally found the category searching to be quite > productive a couple of times now. > > Perhaps I should generate some statistics? I'd have separate counts for users > using any categories and those using topics... Statistics might be useful -- we already have the number of packages that are using categories (in the browse screen), but a statistic of the number of packages that aren't using the categories would be helpful. >>In Perl the categories are also caught up in naming, which I don't >>think we'd use. And you can't belong to multiple categories, for the >>same reason. But I think they present a simpler set of categories that >>would be more useful. The Vaults has a reasonable set of categories as >>well. 
We just need less categories. > > > Again, the categories we have at the moment are just the combination of the > sourceforge and freshmeat listings. I'm well aware it's not the best list > that it could be and I'd be more than happy to work on the list. Maybe it would even be sufficient to fill it out just a little more (to make it more appropriate for Python and libraries), and then offer to display a trimmed-down list, since everyone fills out their categories by skimming through the list of available categories. Keywords are also another model, and are somewhat redundant now. Myself, I never know what to put in for keywords. An interactive setup.py-builder could be nice too -- it could help people get over the distutils hump, as well as promoting the necessary parts for PyPI submission. >>I'd probably want to set up a automatic submission client that uses >>docstrings, but that's a separate issue. > > > I think this is a great idea (I'm a fan of lowest-possible-burden for > contributors ;) > > > >>The idea of broad categories (application, library, module) may >>alleviate the UI issues. > > > Agreed. > > > >>We already have enough fragmentation -- even >>the Vaults get new submissions that don't go to PyPI -- so I'd hate to >>set up an entirely separate system. > > > Aside: It really is a shame I got zero response from repeated enquiries about > collaboration with the Vaults people. I honestly didn't want to have to > develop a new system :( Yeah, I was wondering about that. It would still be nice to import the Vaults data, as there's a lot of older but useful stuff there. Even if they didn't collaborate, at some point it would be nice to make PyPI more canonical, and import the Vaults data and (hopefully) shut that service down once it's entirely redundant. I should really be careful, as I am prone to distraction (this itself is a distraction) and I probably can't participate in this stuff in a really consistent way. 
But something like setting up an import might be a good level of commitment. >>Sure, after a bit of back-and-forth here. Maybe it would be easier to >>just write something up to be put in docs/ in CVS. > > > Which CVS? In the pypi SF project. Is that the canonical repository for the code? Ian From richardjones at optushome.com.au Fri Jun 18 21:26:13 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Sun Jun 20 23:53:08 2004 Subject: [Catalog-sig] Category suggestions In-Reply-To: <40D1CB6D.4070207@colorstudy.com> References: <40D1CB6D.4070207@colorstudy.com> Message-ID: <200406191126.13603.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday 18 Jun 2004 02:48, Ian Bicking wrote: > Here's a list of categories that I think are unneeded, with a few > additions as well (marked with +). I agree with most of your changes, and I have some comments where we disagree. I'd want to analyse whether or not anyone is using any of the topics to be removed. > Generally I think a category should only exist if ... > > (a) Someone would say "I want something like X", where X is a category, > or... > (b) Having found a package, I want to know if it has property X (e.g., > licensing, maturity) > (c) It can't be replaced with an unambiguous keyword, or an element of the > description (e.g., Z39.50) > (d) If a subcategory, a user would be genuinely interested in the > specific subcategory, where there would be an *excess* of uninteresting > packages in the parent category. > (e) If not a property-based category (e.g., maturity level), it > shouldn't apply to a significant number of the packages. "Utilities" is > silly. "Python" is obvious. I tend to agree with this. Note that the browsing functionality matches subcategories, allowing the packager to specify more finely what their package does, e.g. "Topic :: Communications :: Email :: Address Book" while still matching "Topic :: Communications :: Email". 
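The subcategory matching described above can be sketched as a prefix comparison on the " :: "-separated parts of a classifier. This is a minimal illustration, not PyPI's actual browsing code; the function names and the sample package names are hypothetical.

```python
SEP = " :: "


def matches(browse_category, package_classifier):
    """True if the classifier equals the browsed category or lies beneath it."""
    wanted = browse_category.split(SEP)
    parts = package_classifier.split(SEP)
    # Compare whole path components, not raw string prefixes.
    return parts[:len(wanted)] == wanted


def browse(category, packages):
    """Return sorted names of packages with at least one classifier under `category`."""
    return sorted(
        name
        for name, classifiers in packages.items()
        if any(matches(category, c) for c in classifiers)
    )


# Hypothetical package records, for illustration only.
packages = {
    "addressbook-demo": ["Topic :: Communications :: Email :: Address Book"],
    "filter-demo": ["Topic :: Communications :: Email :: Filters"],
    "db-demo": ["Topic :: Database"],
}

email_packages = browse("Topic :: Communications :: Email", packages)
print(email_packages)
```

Splitting on the separator rather than using a plain string prefix test keeps "Topic :: Communications :: Email" from accidentally matching a sibling such as "Topic :: Communications :: Email Clients".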
> With a bit more thought, it would probably be possible to trim the > remaining categories considerably, and add in some more useful > categories. Agreed. > Maybe the properties should also be removed and turned into normal > fields. Yes, this has come up before. I think it's a good idea, but I'm unsure about how to go about it. Someone has also pointed out that the license field could be used to include the entire license text. I'm not sure whether that's useful though. For your list, I've implicitly agreed to the changes you proposed unless I note below: + Environment :: Embedded > Natural Language :: English (?) For non i18n, it's good to be explicit. > Operating System :: OS Independent > (generally, the OS categories seem excessive for Python) They can be important though. > Programming Language :: Python > (well duh it uses Python) Yeah, this section does seem silly. I guess the only important selections here are Python, C, C++ and Java. Perhaps C# too? > Topic :: Communications :: Email :: Address Book > Topic :: Communications :: Email :: Email Clients (MUA) > Topic :: Communications :: Email :: Filters > Topic :: Communications :: Email :: Mail Transport Agents > Topic :: Communications :: Email :: Mailing List Servers > Topic :: Communications :: Email :: Post-Office > Topic :: Communications :: Email :: Post-Office :: IMAP > Topic :: Communications :: Email :: Post-Office :: POP3 > + Topic :: Communications :: Email :: Client > + Topic :: Communications :: Email :: Server We'd want to keep Filters - for things like spambayes, etc. > Topic :: Communications :: Internet Phone > Topic :: Communications :: Telephony > (both?) Technically they are separate things, but I'm not sure there's going to be enough packages to warrant two categories. Let's just go with Telephony. > + Topic :: Database :: RDBMS wrappers IMO this could be confused with the DB-API wrapper. Perhaps Object-Relational wrappers? 
> Topic :: Internet :: WWW/HTTP :: Dynamic Content Hrm - I'm not sure why you prefer Frameworks over > Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries > Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Message Boards > Topic :: Internet :: WWW/HTTP :: Dynamic Content :: News/Diary > Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Page Counters > + Topic :: Internet :: WWW/HTTP :: Frameworks > + Topic :: Internet :: WWW/HTTP :: Frameworks :: CGI > + Topic :: Internet :: WWW/HTTP :: Frameworks :: mod_python > + Topic :: Internet :: WWW/HTTP :: Twisted > + Topic :: Internet :: WWW/HTTP :: Zope 2 > + Topic :: Internet :: WWW/HTTP :: Zope 2 :: Products > + Topic :: Internet :: WWW/HTTP :: Zope 3 > + Topic :: Internet :: WWW/HTTP :: Zope 3 :: Products > + Topic :: Internet :: WWW/HTTP :: Content Management > (Actually, I'd rather rethink all of Internet) Yes, given that Twisted and the Zopes are both more than just WWW/HTTP. I'd be happy to knock them back a notch: + Topic :: Internet :: Twisted + Topic :: Internet :: Zope 2 + Topic :: Internet :: Zope 2 :: Products + Topic :: Internet :: Zope 3 + Topic :: Internet :: Zope 3 :: Products > Topic :: Office/Business :: News/Diary > (It's not clear why this is Office/Business) Yeah. I think most of the Office/Business topics could be relabelled "Organisational" or some similar adjective. > Re: Topic :: Scientific/Engineering > (It might be good to get input from someone who cares about this > area) I can actually have a good crack at this - I'm doing a review of Field Of Knowledge classification systems for work. Off the top of my head, the set here is pretty good. > Topic :: Software Development :: Disassemblers decompyle? > Topic :: Software Development :: Documentation docutils, et al? > Topic :: Software Development :: Object Brokering > Topic :: Software Development :: Object Brokering :: CORBA > ("Object Brokering" a loaded term) Yes, but there are a number of schemes in Python that do it. 
> Topic :: Software Development :: Version Control :: RCS > Topic :: Software Development :: Version Control :: SCCS > + Topic :: Software Development :: Version Control :: Subversion Do we need sub-topics? > Topic :: Software Development :: Widget Sets There's a number of Python GUI implementations. I'm not sure where else they'd go. Perhaps we need + Topic :: User Interface > Topic :: System :: Systems Administration > (Audience, not Topic) I agree the subcategories could go away, but the category is useful. Audience and Topic are separate concepts (a package may be for system administrators, or it might implement system admin tools for use by other people) > Topic :: Text Editors :: Text Processing docutils et al? Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA05Y1rGisBEHG6TARAqIMAJ9F1IVhLTCDyJh/IAv/i0cmOjdRsQCcC6vR yAS/zjnIewm+vd4tdmJLzU4= =FPAb -----END PGP SIGNATURE----- From richardjones at optushome.com.au Thu Jun 17 18:15:54 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Mon Jun 21 00:16:36 2004 Subject: [Catalog-sig] Upgrading infrastructure In-Reply-To: <40D07AA5.2090201@colorstudy.com> References: <40D07AA5.2090201@colorstudy.com> Message-ID: <200406180815.54445.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thursday 17 Jun 2004 02:51, Ian Bicking wrote: > I also noticed in those comments a note on Postgres, and that it still > needed to be installed, and who knows when that would happen... It needs someone with time enough to install it. 
> Would it be sufficient just to move to a threaded, long-running model > instead of CGI? For all its flaws (in a highly concurrent environment), > I suspect SQLite probably wouldn't be so bad once the CGI overhead is > gone, and the CGI overhead will be significant even if Postgres is in > place. > > Thoughts on what to use? I was thinking about investigating Quixote. I've just not had the time to look into it. > Static generation of a few key files would also be helpful (like the RSS > feed). This was done as soon as I realised that some idiots out there have their RSS feed readers set to poll every minute. It still goes through the cgi interface, but doesn't require opening the database. Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA0hgarGisBEHG6TARAnm+AJ0cXCeRat4Mzdq1tygrphil5DZxfgCfXWPE uswqySjz5OM4IDTl/lZbgSc= =UHxl -----END PGP SIGNATURE----- From Chris.Barker at noaa.gov Mon Jun 21 13:57:39 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Jun 21 15:09:28 2004 Subject: [Catalog-sig] Package/Module/Recipe Versioning, Aggregation and Distrobution In-Reply-To: References: Message-ID: <40D72193.4040507@noaa.gov> catalog-sig@claytonbrown.net wrote: > I have been developing a bootstrap loader to enable module/package & python interpreter versioning/specification at time of import within a script. > This is primarily to encourage code re-use/portability (satisfying dependencies on multiple machines & platforms), and allow revision control and coexistence of packages, a particular weak point of python in my experience (I could be just doing things wrong). I am interested now in how such versioned packages could be aggregated and made available through a centralised service such as PyPI/CPAN, and as to whether such functionality described below will be a spanner in the works for distribution techniques. Who sent this note? 
There has been discussion recently on the wxPython-users list about how to support versioning of wxPython. I'd love to hear what you've worked out. And yes, I agree this is a real weak point in Python that would be nice to address Python-wide, rather than each package developer coming up with their own scheme. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov From golux at comcast.net Tue Jun 22 20:09:26 2004 From: golux at comcast.net (Stephen Waterbury) Date: Tue Jun 22 20:14:55 2004 Subject: [Catalog-sig] Category suggestions In-Reply-To: <200406191126.13603.richardjones@optushome.com.au> References: <40D1CB6D.4070207@colorstudy.com> <200406191126.13603.richardjones@optushome.com.au> Message-ID: <40D8CA36.2090400@comcast.net> Richard Jones wrote: > On Friday 18 Jun 2004 02:48, Ian Bicking wrote: >>Maybe the properties should also be removed and turned into normal >>fields. > > Yes, this has come up before. I think it's a good idea, but I'm unsure about > how to go about it. Someone has also pointed out that the license field could > be used to include the entire license text. I'm not sure whether that's > useful though. Especially since it might be repeated in several places. Perhaps the license field could contain the URL to a single authoritative copy of the license text. - Steve From ianb at colorstudy.com Tue Jun 22 22:30:26 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Tue Jun 22 22:37:16 2004 Subject: [Catalog-sig] Category suggestions In-Reply-To: <200406191126.13603.richardjones@optushome.com.au> References: <40D1CB6D.4070207@colorstudy.com> <200406191126.13603.richardjones@optushome.com.au> Message-ID: <49338A96-C4BD-11D8-89B0-000393C2D67E@colorstudy.com> On Jun 18, 2004, at 8:26 PM, Richard Jones wrote: >> Maybe the properties should also be removed and turned into normal >> fields. 
> > Yes, this has come up before. I think it's a good idea, but I'm unsure > about > how to go about it. Someone has also pointed out that the license > field could > be used to include the entire license text. I'm not sure whether that's > useful though. I think the hierarchy that the trove categories use is useful, specifically the OSI hierarchy, and classifying some of the different proprietary licenses. At the same time, belonging to multiple license categories doesn't make sense, and at least one should be required (even if they just choose "other"). > For your list, I've implicitly agreed to the changes you proposed unless I > note > below: > > + Environment :: Embedded > > >> Natural Language :: English (?) > > For non i18n, it's good to be explicit. The difficulty to me is that there are a lot of libraries, and this shouldn't apply to libraries (unless they are not English). For applications this is more interesting, but this is the obvious default for almost all libraries. >> Operating System :: OS Independent >> (generally, the OS categories seem excessive for Python) > > They can be important though. Again, they shouldn't apply to most libraries. But I suppose OS Independent is good enough for most. A "pure python" category might be nice. >> Programming Language :: Python >> (well duh it uses Python) > > Yeah, this section does seem silly. I guess the only important > selections here > are Python, C, C++ and Java. Perhaps C# too? Kind of... I can imagine other things meant to integrate between languages -- e.g., Fortran and Python. Or source code generation, or build environments, etc. But that doesn't mean that each language needs its own category, maybe Programming Language :: Other is enough. 
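Returning to the license point made earlier in this message (a package should belong to exactly one license category), a submission check might look roughly like the sketch below. This is an assumption about how such a rule could be enforced, not existing PyPI behaviour; the function names are hypothetical.

```python
SEP = " :: "


def license_classifiers(classifiers):
    """Return only the classifiers whose top-level part is 'License'."""
    return [c for c in classifiers if c.split(SEP)[0] == "License"]


def check_license(classifiers):
    """Return the single license classifier, or raise ValueError if there is not exactly one."""
    found = license_classifiers(classifiers)
    if len(found) != 1:
        raise ValueError(
            "expected exactly one License classifier, got %d" % len(found)
        )
    return found[0]


# Hypothetical submission metadata, for illustration only.
submitted = [
    "License :: OSI Approved :: MIT License",
    "Topic :: Database",
]
chosen = check_license(submitted)
print(chosen)
```

A package with no license information would fail this check until the author picks something, even if only a catch-all "other" entry.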
>> Topic :: Communications :: Email :: Address Book >> Topic :: Communications :: Email :: Email Clients (MUA) >> Topic :: Communications :: Email :: Filters >> Topic :: Communications :: Email :: Mail Transport Agents >> Topic :: Communications :: Email :: Mailing List Servers >> Topic :: Communications :: Email :: Post-Office >> Topic :: Communications :: Email :: Post-Office :: IMAP >> Topic :: Communications :: Email :: Post-Office :: POP3 >> + Topic :: Communications :: Email :: Client >> + Topic :: Communications :: Email :: Server > > We'd want to keep Filters - for things like spambayes, etc. Yes, there are a bunch of those. >> Topic :: Communications :: Internet Phone >> Topic :: Communications :: Telephony >> (both?) > > Technically they are separate things, but I'm not sure there's going > to be > enough packages to warrant two categories. Let's just go with > Telephony. > > >> + Topic :: Database :: RDBMS wrappers > > IMO this could be confused with the DB-API wrapper. Perhaps > Object-Relational > wrappers? There are a lot that aren't ORMs, like SQLDict or things like that. >> Topic :: Internet :: WWW/HTTP :: Dynamic Content > > Hrm - I'm not sure why you prefer Frameworks over "Dynamic Content" seems very vague to me. Any web application could qualify. Hmm... there should be some category to distinguish actual web applications from libraries.
>> Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI >> Tools/Libraries >> Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Message Boards >> Topic :: Internet :: WWW/HTTP :: Dynamic Content :: News/Diary >> Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Page Counters >> + Topic :: Internet :: WWW/HTTP :: Frameworks >> + Topic :: Internet :: WWW/HTTP :: Frameworks :: CGI >> + Topic :: Internet :: WWW/HTTP :: Frameworks :: mod_python >> + Topic :: Internet :: WWW/HTTP :: Twisted >> + Topic :: Internet :: WWW/HTTP :: Zope 2 >> + Topic :: Internet :: WWW/HTTP :: Zope 2 :: Products >> + Topic :: Internet :: WWW/HTTP :: Zope 3 >> + Topic :: Internet :: WWW/HTTP :: Zope 3 :: Products >> + Topic :: Internet :: WWW/HTTP :: Content Management >> (Actually, I'd rather rethink all of Internet) > > Yes, given that Twisted and the Zopes are both more than just > WWW/HTTP. I'd be > happy to knock them back a notch: > > + Topic :: Internet :: Twisted > + Topic :: Internet :: Zope 2 > + Topic :: Internet :: Zope 2 :: Products > + Topic :: Internet :: Zope 3 > + Topic :: Internet :: Zope 3 :: Products Sure. >> Topic :: Office/Business :: News/Diary >> (It's not clear why this is Office/Business) > > Yeah. I think most of the Office/Business topics could be relabelled > "Organisational" or some similar Adjective. > > >> Re: Topic :: Scientific/Engineering >> (It might be good to get input from someone who cares about this >> area) > > I can actually have a good crack at this - I'm doing a review of Field > Of > Knowledge classification systems for work. Off the top of my head, the > set > here is pretty good. > > >> Topic :: Software Development :: Disassemblers > > decompyle? > > >> Topic :: Software Development :: Documentation > > docutils, et al? That's true, I was thinking of actual documentation, but documentation tools make sense.
> >> Topic :: Software Development :: Object Brokering >> Topic :: Software Development :: Object Brokering :: CORBA >> ("Object Brokering" a loaded term) > > Yes, but there's a number of schemes in Python that do it. I feel like there should be some better term, though -- one which doesn't feel so specific to CORBA and that era. >> Topic :: Software Development :: Version Control :: RCS >> Topic :: Software Development :: Version Control :: SCCS >> + Topic :: Software Development :: Version Control :: Subversion > > Do we need sub-topics? I think it would make sense -- there's a number of tools and applications built on Subversion and CVS, and they are pretty tied to that backend. >> Topic :: Software Development :: Widget Sets > > There's a number of Python GUI implementations. I'm not sure where > else they'd > go. Perhaps we need > > + Topic :: User Interface Hmm... "User Interface" maybe sounds too vague, but you're right there should be something. Maybe GUI would be clearer. > >> Topic :: System :: Systems Administration >> (Audience, not Topic) > > I agree the subcategories could go away, but the category is useful. > Audience > and Topic are separate concepts (a package may be for system > administrators, > or it might implement system admin tools for use by other people) > > >> Topic :: Text Editors :: Text Processing > > docutils et al? But in Text Editors? Maybe moved somewhere else. Also I can't remember if there's good categories for HTML and XML (where HTML is separate). There's a lot of tools specific to that. 
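An aside on the category strings discussed in this message: they are " :: "-separated paths, so a proposal like moving Twisted and Zope up a level can be checked mechanically by nesting them into a tree. What follows is only a minimal sketch; `build_tree` is a hypothetical helper written for illustration and is not part of PyPI's code.

```python
def build_tree(classifiers):
    """Nest '::'-separated classifier strings into a dict of dicts.

    Hypothetical helper for illustration; not taken from the PyPI source.
    """
    tree = {}
    for line in classifiers:
        node = tree
        # Each path component becomes one level of nesting.
        for part in (p.strip() for p in line.split("::")):
            node = node.setdefault(part, {})
    return tree


if __name__ == "__main__":
    proposed = [
        "Topic :: Internet :: Twisted",
        "Topic :: Internet :: Zope 2",
        "Topic :: Internet :: Zope 2 :: Products",
    ]
    print(build_tree(proposed))
```

A nested dict keeps the sketch short; a real catalogue would more likely store parent/child rows in its database.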
-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From richardjones at optushome.com.au Wed Jun 23 02:11:22 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Wed Jun 23 02:13:22 2004 Subject: [Catalog-sig] PyPI improvements In-Reply-To: <40D075C9.9020108@colorstudy.com> References: <200406161959.45396.richardjones@optushome.com.au> <40D075C9.9020108@colorstudy.com> Message-ID: <200406231611.22656.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thursday 17 Jun 2004 02:31, Ian Bicking wrote: > Richard Jones wrote: > > On Wednesday 16 Jun 2004 14:32, Ian Bicking wrote: > >>For modules this wouldn't work, as the naming would be less unique. > >>Module identifiers would be an issue, but I don't think they'd > >>participate in automated dependencies quite so much. > > > > If you're going to have some meta-data embedded in the module, then one > > of those fields can be a name in the PyPI namespace. > > > > I think that if the modules are going to be in PyPI, then they've got to > > have a unique name. Names are keys in PyPI (just as they are in CPAN / > > PAUSE). > > I think they'd have to be parameterized in some way then -- stand-alone > modules just aren't likely to be uniquely named. Or, to make them > uniquely named would lead to funky names (e.g., joe_screenscraper.py). > Maybe it could be a name of author_username:module_name, or something > like that. Or maybe the names simply don't have to match the Python > module name. I guess the problem with modules having the same name is that people can then get confused about eg. *which* "csv" module you might be talking about. There's at least four of them out there, though one of those is in the stdlib. > I really *don't* want to encourage a lot of distutiled modules with > conflicting names. 
In the comments on my post > (http://blog.colorstudy.com/system/comments.py?u=0000001&p=P123) someone > suggested automatically creating a zip/tarball with the proper setup.py, > and I think that would be a bad idea and would lead to a polluted > site-packages. Where else would these modules be installed? Or are you advocating that they not be installed? > Okay... then maybe it should be a RPC setup. Like a client can query > for a list of URLs that have not been archived (sufficiently), and can > register the fact it has archived a URL. Then there'd be the > infrastructure so that archiving can be offloaded to another machine, > though no archiving or mirroring would be built into the system. There > shouldn't be any lack of machines for the use -- disk and bandwidth is > so cheap these days that a complex system like CPAN's seems like > overkill. Seems like a reasonable idea. We'd want PyPI to be able to query the archives it knows about on a regular (weekly?) basis just to make sure everything's in sync. > I'm guessing creosote is just kind of old, and > (understandably) no one wants to deal with the sysadmin issues of > upgrading. Upgrading creosote is in the works, but it requires resources. > I think the SF downloading should be built into the downloading client, > as a kind of screen scraper to get to the real file. There's so much > stuff on SF that it can be special cased. That's easy enough to do. > In case of URLs, perhaps we need a (url, url_type, url_description) > relation, where url_type is restricted, and maybe (or maybe not) (url, > url_type) is unique. url_type would be like homepage, documentation, > changelog, tar.gz download, Windows installer, Mac disk image, etc. I like this idea a lot. How it's expressed in PKG_INFO terms is another thing :) > Kind of like SourceForge does for downloads, but both a bit larger, and > a bit less explicit. (E.g., I think SF has two different types for .gz > and .bz2 files, which seems unnecessary.) 
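The (url, url_type, url_description) relation proposed above could look roughly like the following in SQLite, which the thread names as PyPI's current storage. This is a sketch under assumptions: the table name `package_urls`, the `package` column, the helper names, and the decision to make (package, url, url_type) unique are all illustrative choices, not anything taken from the PyPI schema. The allowed types are the ones listed in the message.

```python
import sqlite3

# url_type is "restricted"; these values come from the list in the message.
URL_TYPES = (
    "homepage",
    "documentation",
    "changelog",
    "tar.gz download",
    "Windows installer",
    "Mac disk image",
)


def make_db():
    """Create an in-memory database with a hypothetical package_urls table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE package_urls (
            package         TEXT NOT NULL,  -- assumed column: owning package
            url             TEXT NOT NULL,
            url_type        TEXT NOT NULL,
            url_description TEXT,
            UNIQUE (package, url, url_type)  -- assumption; the thread leaves this open
        )
        """
    )
    return conn


def add_url(conn, package, url, url_type, description=""):
    """Insert one URL row, rejecting types outside the restricted list."""
    if url_type not in URL_TYPES:
        raise ValueError("unknown url_type: %r" % (url_type,))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO package_urls VALUES (?, ?, ?, ?)",
            (package, url, url_type, description),
        )
```

Whether uniqueness should be enforced at all is exactly the "maybe (or maybe not)" left open in the proposal; dropping the UNIQUE clause gives the other variant.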
Yes, the sf.net types are a strange bunch. > Statistics might be useful -- we already have the number of packages > that are using categories (in the browse screen), but a statistic of the > number of packages that aren't using the categories would be helpful. I did some quickie analysis: http://www.mechanicalcat.net/richard/log/Python/PyPI_Categories > An interactive setup.py-builder could be nice too -- it could help > people get over the distutils hump, as well as promoting the necessary > parts for PyPI submission. Yep, this would be a great idea. My tkinter skillz are very low though. I've done most of my GUI programming in PyQt, and tkinter seems quite cumbersome in comparison :( If I can find the time, I'd still like to have another crack at tkinter, and perhaps this is the kind of project I need. Unless someone else beats me to it :) > >>Sure, after a bit of back-and-forth here. Maybe it would be easier to > >>just write something up to be put in docs/ in CVS. > > > > Which CVS? > > In the pypi SF project. Is that the canonical repository for the code? Yep. I don't have a CVS change mailer set up for that cvsroot yet, as I'm the only committer. 
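The archiving idea quoted earlier in this message (a client asks which URLs are not yet archived sufficiently, then reports back once it has archived one) comes down to a small bookkeeping interface. The sketch below models only that bookkeeping in-process; the class and method names and the `min_copies` threshold are hypothetical, and no RPC transport is shown.

```python
class ArchiveRegistry:
    """Track which download URLs have been archived, and by whom.

    Hypothetical illustration of the proposed RPC surface; names are invented.
    """

    def __init__(self, min_copies=1):
        # How many independent archives count as "sufficiently" archived.
        self.min_copies = min_copies
        self._copies = {}  # url -> set of archiver names

    def add_url(self, url):
        """Record a URL that PyPI would like to see archived."""
        self._copies.setdefault(url, set())

    def unarchived(self):
        """Return the URLs that still have fewer than min_copies archives."""
        return sorted(
            url for url, who in self._copies.items()
            if len(who) < self.min_copies
        )

    def register_archived(self, url, archiver):
        """Let an archiver report that it now holds a copy of url."""
        if url not in self._copies:
            raise KeyError(url)
        self._copies[url].add(archiver)
```

The periodic sync check mentioned in the reply would sit on top of this, with PyPI asking each known archiver to confirm the URLs it has registered.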
Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA2R8KrGisBEHG6TARAhYEAJ9AWk8XpRHKInSJvUj2la0p0iSV1wCaAg6b /TYUhYvdIztkYRt6ACnMwIc= =10x5 -----END PGP SIGNATURE----- From richardjones at optushome.com.au Thu Jun 24 18:50:30 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Thu Jun 24 18:50:38 2004 Subject: [Catalog-sig] Category suggestions In-Reply-To: <200406191126.13603.richardjones@optushome.com.au> References: <40D1CB6D.4070207@colorstudy.com> <200406191126.13603.richardjones@optushome.com.au> Message-ID: <200406250850.30418.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Saturday 19 Jun 2004 11:26, Richard Jones wrote: > > Re: Topic :: Scientific/Engineering > > (It might be good to get input from someone who cares about this > > area) > > I can actually have a good crack at this - I'm doing a review of Field Of > Knowledge classification systems for work. There's two reasonable Field of Knowledge standards in the world. The Australian Standard Research Classification and the OECD's Frascati Manual. Together, they specify around 1100 actual fields. The following are the logical groupings of those fields (with each group holding between 5 and 20 fields). 
Humanities Philosophy Cultural Studies Archaeology Anthropology History Linguistics Languages Literature Media Curatorial Sciences The Arts Religion Social Sciences Sociology Politics Human Geography Demography Psychology Cognitive Science Education Social Work Economics Business Law Media Sciences Mathematics Statistics Astronomy Physics Chemistry Earth Sciences Biology Medicine Alternative Medicine Public Health Nursing Human Movement and Sport Environmental Sciences Built Environment Engineering Information and Communication Sciences -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA21q2rGisBEHG6TARAlEJAJ94McPB8tVCb5D85MhaZoCjnmZKpQCeOk81 S9VYemi6URzs1t/XcD+ZNqQ= =8NuB -----END PGP SIGNATURE----- From richardjones at optushome.com.au Sun Jun 27 01:17:47 2004 From: richardjones at optushome.com.au (Richard Jones) Date: Sun Jun 27 01:18:00 2004 Subject: [Catalog-sig] Revised categories Message-ID: <200406271517.51765.richardjones@optushome.com.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I've put up a Wiki page holding the revised categories listing: http://www.python.org/cgi-bin/moinmoin/RevisedPyPiCategories I made most changes we've discussed, though I didn't change "Dynamic Content" to "Frameworks" as I know that the former is being used quite a bit, so I assume that people are happy with it. I also didn't touch the programming languages section. I'm not sure we need to. Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA3lh/rGisBEHG6TARAjUDAJ4zx1aYF/RImssrSZgYIzT3fsukpQCdFCTo yGvn+x8MrEyq26MmaPjY2G0= =K6bz -----END PGP SIGNATURE-----