From greg@cosc.canterbury.ac.nz Tue Apr 1 00:43:05 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 01 Apr 2003 12:43:05 +1200 (NZST) Subject: [Python-Dev] Distutils documentation amputated in 2.2 docs? Message-ID: <200304010043.h310h5M17556@oma.cosc.canterbury.ac.nz> I was looking at the Distributing Python Modules section of the distutils docs for 2.2 the other day, and it mentioned a section about extending the distutils, but there did not appear to be any such section. Further investigation revealed that the 1.6 version of the docs *does* have this section, as section 8, but somewhere between the 1.6 and 2.2 docs, this section has disappeared, along with almost all of section 9, "Reference", which now appears as section 7, but with only a small part of what it should contain. What's the proper way of submitting a bug report about this? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From paul@prescod.net Tue Apr 1 00:52:06 2003 From: paul@prescod.net (Paul Prescod) Date: Mon, 31 Mar 2003 16:52:06 -0800 Subject: [Python-Dev] Capabilities In-Reply-To: References: Message-ID: <3E88E2B6.1080409@prescod.net> Ka-Ping Yee wrote: > Hmm, i'm not sure you understood what i meant. The code example i posted > is a solution to the design challenge: "provide read-only access to a > directory and its subdirectories, but no access to the rest of the filesystem". > I'm looking for other security design challenges to tackle in Python. > Once enough of them have been tried, we'll have a better understanding of > what Python would need to do to make secure programming easier. Okay, how about allowing a piece of untrusted code to import modules from a selected subset of all modules. 
For instance you probably want to allow untrusted code to get access to regular expressions and codecs (after taming!) but not os or socket. Speaking of sockets, web browsers often allow connections to sockets only at a particular domain. In a capabilities world, I guess the domain would be an object that you could request sockets from. Are DOS issues in scope? How do we prevent untrusted code from just bringing the interpreter to a halt? A smart enough attacker could even block all threads in the current process by finding a task that is usually not time-sliced and making it go on for a very long time. without looking at the Python implementation, I can't remember an example off of the top of my head, but perhaps a large multiplication or search-and-replace in a string. Paul Prescod From paul@prescod.net Tue Apr 1 01:08:40 2003 From: paul@prescod.net (Paul Prescod) Date: Mon, 31 Mar 2003 17:08:40 -0800 Subject: [Python-Dev] Capabilities In-Reply-To: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> References: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E88E698.7000503@prescod.net> Guido van Rossum wrote: >... > >>In many classes, __init__ exercises authority. An obvious C type with >>the same problem is the "file" type (being able to ask a file object >>for its type gets you the ability to open any file on the filesystem). >>But many Python classes are in the same position -- they acquire >>authority upon initialization. > > > What do you mean exactly by "exercise authority"? Again, I understand > this for C code, but it would seem that all authority ultimately comes > from C code, so I don't understand what authority __init__() can > exercise. Given that Zipfile("/tmp/foo.zip") can read a zipfile, the zipfile class clearly has the ability to open files. It derives this ability from the fact that it can get at open(), os.open etc. 
In a capabilities world, it should not have access to that stuff unless the caller specifically gave it access. And the logical way for the caller to give it that access is like this: ZipFile(already_opened_file) But in restricted code > ... > But is it really ZipFile.__init__ that exercises the authority? Isn't > its authority derived from that of the open() function that it calls? I think that's the problem. the ZipFile module has a back-door "capability" that is incredibly powerful. In a library designed for capabilities, its only access to the outside world would be via data passed to it explicitly. > In what sense is the ZipFile class an entity by itself, rather than > just a pile of Python statements that derive any and all authority > from its caller? In the sense that it can import "open" or "os.open" rather than being forced to only communicate with the world through objects provided by the caller. If we imagine a world where it has no access to those back-doors then I can't see why Ping's complaint about access to classes would be a problem. Paul Prescod From jriehl@spaceship.com Tue Apr 1 01:50:39 2003 From: jriehl@spaceship.com (Jonathan Riehl) Date: Mon, 31 Mar 2003 19:50:39 -0600 (CST) Subject: [Python-Dev] PEP 269 once more. Message-ID: Hey all, FYI, Guido closed the patch I had on SourceForge (599331), but I have just put an updated patch there. I have added some documentation on how my pgen module may be used, and the interface is much more consistent and useful than the prior upload. If anyone is interested in playing with pgen from Python, check it out and let me know what you think. Thanks! -Jon From martin@v.loewis.de Tue Apr 1 06:12:17 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 01 Apr 2003 08:12:17 +0200 Subject: [Python-Dev] Distutils documentation amputated in 2.2 docs? 
In-Reply-To: <200304010043.h310h5M17556@oma.cosc.canterbury.ac.nz>
References: <200304010043.h310h5M17556@oma.cosc.canterbury.ac.nz>
Message-ID:

Greg Ewing writes:

> What's the proper way of submitting a bug report about this?

It would be best if you would provide a patch. Try to locate the
primary source of the missing documentation (i.e. a TeX snippet), and
integrate this into the current CVS, then do a cvs diff.

If you find that the text is still there in the primary source, and
just not rendered in the HTML version, submit a bug report pointing to
the precise file that does not get rendered.

Regards,
Martin

From joel@boost-consulting.com Tue Apr 1 08:56:34 2003
From: joel@boost-consulting.com (Joel de Guzman)
Date: Tue, 1 Apr 2003 16:56:34 +0800
Subject: [Python-Dev] How to suppress instance __dict__?
References: <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <200303231546.h2NFkex04473@pcp02138704pcs.reston01.va.comcast.net>
 <200303232104.h2NL4GQ04819@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <021d01c2f82c$9b6d3470$4ee1afca@kim>

Dave Abrahams wrote:

>> I am generating extension types derived from a type which is derived
>> from 'int' by calling the metaclass; in order to prevent instances
>> of the most-derived type from getting an instance __dict__ I am
>> putting an empty tuple in the class __dict__ as '__slots__'. The
>> problem with this hack is that it disables pickling of these babies:
>>
>>     "a class that defines __slots__ without defining __getstate__
>>      cannot be pickled"
>>

Guido van Rossum wrote:

> Yes. I was assuming you'd do this at the C level. To do what I
> suggested in Python, I think you'd have to write this:
>
>     class M(type):
>         def __new__(cls, name, bases, dict):
>             C = type.__new__(cls, name, bases, dict)
>             del C.__getstate__
>             return C

Hi,

Ok, I'm lost. Please be easy with me, I'm still learning the C API
interfacing with Python :) Here's what I have so far.
Emulating the desired behavior in Python, I can do:

    class EnumMeta(type):
        def __new__(cls, name, bases, dict):
            C = type.__new__(cls, name, bases, dict)
            del C.__getstate__
            return C

    class Enum(int):
        __metaclass__ = EnumMeta
        __slots__ = ()

    x = Enum(1964)
    print x

    import pickle
    print "SAVING"
    out_x = pickle.dumps(x)

    print "LOADING"
    xl = pickle.loads(out_x)
    print xl

I'm trying to rewrite this in C/C++ with the intent to patch
Boost.Python to allow pickling on enums. I took on this task to learn
more about the low-level details of Python C interfacing. So far, I
have implemented EnumMeta in C that does not override anything yet and
installed that as the metaclass of Enum.

I was wondering... Is there some C code somewhere that I can see that
implements some sort of meta-stuff? I read PEP253 and 253 and
"Unifying Types and Classes in Python 2.2". The examples there
(specifically the class autoprop) are written in Python. I tried
searching for examples in C from the current CVS snapshot of 2.3 but I
failed in doing so. I'm sure it's there, but I don't know where to
find it.

To be specific, I'm lost in trying to implement tp_new of
PyTypeObject. How do I call the default tp_new for metaclasses?

TIA,
--
Joel de Guzman
joel at boost-consulting.com
http://www.boost-consulting.com
http://spirit.sf.net

From zooko@zooko.com Tue Apr 1 16:47:56 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 01 Apr 2003 11:47:56 -0500
Subject: [Python-Dev] Capabilities (we already got one)
In-Reply-To: Message from Guido van Rossum of "Mon, 31 Mar 2003 17:43:09 EST."
 <200303312243.h2VMhCC24639@odiug.zope.com>
References: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>
 <200303311944.h2VJhsA16638@odiug.zope.com>
 <200303312243.h2VMhCC24639@odiug.zope.com>
Message-ID:

(I, Zooko, wrote the lines prepended with "> > ".)

Guido wrote:
>
> Yes.
That may be why the demand for capabilities has been met with > resistance: to quote the French in "Monty Python and the Holy Grail", > "we already got one!" :-) ;-) Such skepticism is of course perfectly appropriate for proposed changes to your beautiful language. More on the one you already got below. (I agree: you already got one.) > > Here's a two sentence definition of capabilities: > > I've heard too many of these. They are all too abstract. There may have been a terminological problem. The word "capabilities" has been used for three different systems -- "capabilities-as-rows-of-the-Lampson-access- control-matrix", "capabilities-as-keys", and "capabilities-as-references". Unfortunately, the distinction is rarely made explicit, so people often assert things about "capabilities" which are untrue of capabilities-as-references. (Ping has just written a paper about this.) The former two kinds of capabilities have major problems and are disliked by almost everybody. The last one is the one that Ping, Ben Laurie and I are advocating, and the one that you already got. Anyway, if someone gave a definition of capabilities-as-references and it didn't match with the two-sentence definition I gave (and with the diagram), then it was wrong. Here's the two-sentence definition again: > > Authority originates in C code (in the interpreter or C extension > > modules), and is passed from thing to thing. > > This part I like. > > > A given thing "X" -- an instance of ZipFile, for example -- has the > > authority to use a given authority -- to invoke the real open(), for > > example -- if and only if some thing "Y" previously held both the > > "open()" authority and the "authority to extend authorities to X" > > authority, and chose to extend the "open()" authority to X. > > But the instance of ZipFile is not really a protection domain. > Methods on the instance may have different authority. Okay, ZipFile was the wrong example. 
Here it is without examples: Abstract version: A given thing "X" can use a given authority "S" if and only if some thing "Y" has previously held both the authority and the "authority to extend authorities to X" and chose to extend "S" to X. To make it concrete, I will use the word "object" to mean "anything referenced by a Python reference". This includes class instances, closures, bound methods, stack frames, etc. When I mean Python's instance-of-a-class "object", I'll say "instance" instead of "object". So the concrete version is: Concrete version: An object "X" can use an object "S" if and only if some object "Y" has previously held references to both S and X, and chose to give a reference to S to X. (Quoting out of order:) > > Hm. Reviewing the rexec docs, I being to suspect that the "access > > control system with unified designation and authority" *is* how > > Python does access control in restricted mode, and that rexec itself > > is just to manage module import and certain dangerous builtins. > > Yes. [...] > Sure. The question is, what exactly are Alice, Bob and Carol? I > claim that they are not specific class instances but they are each a > "workspace" as I tried to explain before. A workspace is more or less > the contents of a particular "sys.modules" dictionary. I believe I understand the motivation for rexec now. I think that in restricted-execution-mode (hereafter: "REM", as per Greg Ewing's suggestion [1]), Python objects have encapsulation -- one can't access their private data without their permission. Once this is done, Python references are capabilities. So if you have a Python object such as a wxWindow instance, and you want to control access to it, the natural way to do that is to control how references to it are passed around. This is why you've already got one. The natural and Pythonic way to control access to Python objects is with capabilities, and that's what you've been doing all along. 
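
A minimal sketch of the "concrete version" above, not taken from the thread: the class names Logger and Worker are hypothetical, and the code only illustrates the reference-passing style. Ordinary Python does nothing to stop Worker from reaching a Logger some other way, which is the gap the rest of this discussion is about.

```python
class Logger:
    """S: the object whose use we want to control."""

    def __init__(self):
        self.lines = []

    def write(self, msg):
        self.lines.append(msg)


class Worker:
    """X: starts out holding no reference to any Logger."""

    def __init__(self):
        self.log = None

    def accept(self, log):
        # X can only ever use what somebody hands to it here.
        self.log = log

    def run(self):
        if self.log is None:
            return "no authority to log"
        self.log.write("worker ran")
        return "logged"


# Y: this top-level code holds references to both S and X ...
s = Logger()
x = Worker()
before = x.run()

# ... and chooses to extend S to X by passing the reference along.
x.accept(s)
after = x.run()
```

Here the top-level code plays the role of "Y": until it calls x.accept(s), the worker has no way to name the logger at all.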
However, you don't use the same technique to control access to Python *modules* such as the zipfile module, because the "import zipfile" statement will give the current scope access to the zipfile module even if nobody has granted such access to the current scope. This is a violation of the two-sentence definition and of the graph: the current scope just gained authority ex nihilo. So your solution to this, to prevent code from grabbing privileges willy nilly via "import" and builtins, is rexec, which creates a scope in which code executes (now called a "workspace"), and allows you to control which builtins and modules are available for code executing in that "workspace". Now access to modules conforms to the definition of capabilities: an object X can access a module S if and only if some object Y previously had access to X's workspace and to S, and Y chose to give X access to S. So unless I've missed something, rexec conforms to the definition of capabilities as well. (Of course, one can always build other access-control mechanisms on top of capabilities. In particular, the rexec "hooks" mechanism seems intended for that.) Regards, Zooko http://zooko.com/ ^-- under re-construction: some new stuff, some broken links [1] http://mail.python.org/pipermail/python-dev/2003-March/034311.html From jeremy@zope.com Tue Apr 1 17:10:16 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 01 Apr 2003 12:10:16 -0500 Subject: [Python-Dev] Capabilities (we already got one) In-Reply-To: References: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> <200303311944.h2VJhsA16638@odiug.zope.com> <200303312243.h2VMhCC24639@odiug.zope.com> Message-ID: <1049217016.14149.12.camel@slothrop.zope.com> On Tue, 2003-04-01 at 11:47, Zooko wrote: > I think that in restricted-execution-mode (hereafter: "REM", as per Greg Ewing's > suggestion [1]), Python objects have encapsulation -- one can't access their > private data without their permission. 
>
> Once this is done, Python references are capabilities.

REM does not provide object encapsulation, but it disables enough
introspection that it is possible to provide encapsulation. The REM
implementation provides a Bastion function that creates private state
by storing the state in func_defaults, which is inaccessible in REM.

Jeremy

From paul@prescod.net Tue Apr 1 18:29:37 2003
From: paul@prescod.net (Paul Prescod)
Date: Tue, 01 Apr 2003 10:29:37 -0800
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303312243.h2VMhCC24639@odiug.zope.com>
References: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>
 <200303311944.h2VJhsA16638@odiug.zope.com>
 <200303312243.h2VMhCC24639@odiug.zope.com>
Message-ID: <3E89DA91.9040001@prescod.net>

Guido van Rossum wrote:

>>How is the implementation of "open" provided by the trusted code to
>>the untrusted code? Is it possible to provide a different "open"
>>implementation to different "instances" of the zipfile module? (I
>>think not, as there is no such thing as "a different instance of a
>>module", but perhaps you could have two rexec "workspaces" each of
>>which has a zipfile module with a different "open"?)
>
>
> To the contrary, it is very easy to provide code with a different
> version of open(). E.g.:
>
>     # this executes as trusted code
>     def my_open(...):
>         "open() variant that only allows reading"
>     my_builtins = {"len": len, "open": my_open, "range": range, ...}
>     namespace = {"__builtins__": my_builtins}
>     exec "..." in namespace

That's fair enough, but why is it better for the "protection domain"
to be an invoked "workspace" instead of an object? Think of it from a
software engineering point of view: you're proposing that the right
way to manage security is to override more-or-less global variables.
Zooko is proposing that you pass the capabilities each method needs to
that method. I.e., standard structured programming.

Let's say that untrusted code wants access to the socket module.
The surrounding code wants to tame it to prevent socket connections to certain IP addresses. I think that in the rexec model, the surrounding application would have to go in and poke "safe" versions of the constructor into the module. Or they would have to disallow access to the module altogether and provide an object that tamed module appropriately. The first approach is kind of error prone. The second approach requires the untrusted code to use a model of programming that is very different than "standard Python." If we imagined a Python with capabilities were built in deeply, the socket module would be designed to be tamed. By default it would have no authority at all except that which is passed in. The authority to contact the outside world would be separate from all of the other useful stuff in the socket module and socket class. I'm not necessarily advocating this kind of a change to the Python library... Paul Prescod From pje@telecommunity.com Tue Apr 1 18:01:54 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 01 Apr 2003 13:01:54 -0500 Subject: [Python-Dev] Capabilities (we already got one) Message-ID: <5.1.1.6.0.20030401124212.01e03670@mail.rapidsite.net> >However, you don't use the same technique to control access to Python *modules* >such as the zipfile module, because the "import zipfile" statement will give the >current scope access to the zipfile module even if nobody has granted such >access to the current scope. >... >So your solution to this, to prevent code from grabbing privileges willy nilly >via "import" and builtins, is rexec, which creates a scope in which code >executes (now called a "workspace"), and allows you to control which builtins >and modules are available for code executing in that "workspace". Almost. I think you may be confusing module *code* and module *objects*. Guido pointed this out earlier. A Python module object is populated by executing a body of *code* against the module *object* dictionary. 
The module object dictionary contains a '__builtins__' entry that gives it its "base" capabilities. Module *objects* possess capabilities, which are in their dictionary or reachable from it. *Code* doesn't possess capabilities except to constants used in the code. So access to *code* only grants you capabilities to the code and its constants. So, in order to provide a capability-safe environment, you need only provide a custom __import__ which uses a different 'sys.modules' that is specific to that environment. At that point, a "workspace" consists of an object graph rooted in the supplied '__builtins__', locals(), globals(), and initially executing code. We can then see that the standard Python environment is in fact a capability system, wherein everything is reachable from everything else. The "holes" in this capability system, then, are: 1. introspective abilities that allow "breaking out" of the workspace (such as the ability to 'sys._getframe()' or examine tracebacks to "reach up" to higher-level stack frames) 2. the structuring of the library in ways that equate creating an instance of a class with an "unsafe" capability. (E.g., creating instances of 'file()') coupled with instance->class introspection 3. Lack of true "privacy" for objects. (Proxies are a useful way to address this issue, because they allow more than one "capability" to exist for the same object.) From ping@zesty.ca Tue Apr 1 20:12:49 2003 From: ping@zesty.ca (Ka-Ping Yee) Date: Tue, 1 Apr 2003 14:12:49 -0600 (CST) Subject: [Python-Dev] Capabilities (we already got one) In-Reply-To: Message-ID: On Tue, 1 Apr 2003, Zooko wrote: > I think that in restricted-execution-mode (hereafter: "REM", as per Greg Ewing's > suggestion [1]), Python objects have encapsulation -- one can't access their > private data without their permission. > > Once this is done, Python references are capabilities. Aaack! I wish you would *stop* saying that! 
There is no criterion by which a reference is or is not a capability. To talk in such terms only confuses the issue. It is possible to program in a capability style in any Turing-complete programming language, just as it is possible to program in an object style or a functional style or a procedural style. The question is: what does programming in a capability style look like, and how might Python facilitate (or even encourage) that style? To say that activating restricted execution mode causes things to "become" capabilities is as meaningless as saying that adding a feature to the C language would suddenly turn an arbitrary C program into an object-oriented program. -- ?!ng From ehuss@netmeridian.com Tue Apr 1 21:41:54 2003 From: ehuss@netmeridian.com (Eric Huss) Date: Tue, 1 Apr 2003 13:41:54 -0800 (PST) Subject: [Python-Dev] Minor issue with PyErr_NormalizeException Message-ID: We had a bug in one of our extension modules that caused a core dump in PyErr_NormalizeException(). At the very top of the function (line 133) it checks for a NULL type. I think it should have a "return" here so that the code does not continue and thus dump core on line 153 when it calls PyClass_Check(type). This should also make the comment not lie about dumping core. ;) Just thought I'd pass it on.. -Eric From klm@zope.com Tue Apr 1 22:35:10 2003 From: klm@zope.com (Ken Manheimer) Date: Tue, 1 Apr 2003 17:35:10 -0500 (EST) Subject: [Python-Dev] Capabilities (we already got one) In-Reply-To: Message-ID: On Tue, 1 Apr 2003, Ka-Ping Yee wrote: > On Tue, 1 Apr 2003, Zooko wrote: > > I think that in restricted-execution-mode (hereafter: "REM", as > > per Greg Ewing's suggestion [1]), Python objects have > > encapsulation -- one can't access their private data without their > > permission. > > > > Once this is done, Python references are capabilities. > > Aaack! I wish you would *stop* saying that! > > There is no criterion by which a reference is or is not a capability. 
> To talk in such terms only confuses the issue. I take the above, with a bit of license, to mean that REM enables encapsulation for python objects, so they are closer to being safe to use as capabilities. Subsequent posts suggest that encapsulation isn't actually achieved, but that's not the issue here - the issue, as i understand it, is how to talk about enabling capability-based safety in python code. > It is possible to program in a capability style in any Turing-complete > programming language, just as it is possible to program in an object > style or a functional style or a procedural style. The question is: > what does programming in a capability style look like, and how might > Python facilitate (or even encourage) that style? I think the last part is, more specifically, "what measures need to be taken to enable safe use of python objects for capability style programming?" > To say that activating restricted execution mode causes things to > "become" capabilities is as meaningless as saying that adding a feature > to the C language would suddenly turn an arbitrary C program into an > object-oriented program. I'm not near as clear about all this as you seem to be, but i have the feeling the statements are not as meaningless as you're suggesting. I *do* think that getting more clear about what the questions are that we're trying to answer would be helpful, here. One big one seems to be: "What needs to be done to enable effective ("safe"?) use of python object (references) as capabilities?" I've seen answers to this roll by several times - i think we need to settle them, and collect the conclusions in a PEP. And we need to identify what other questions there are. One more probably is, "how do we use python objects as capabilities, once we can ensure their safety?" And maybe it'd be helpful to elaborate what "safety" means. 
--
Ken
klm@zope.com

  Alan Turing thought about criteria to settle the question of whether
  machines can think, a question of which we now know that it is about
  as relevant as the question of whether submarines can swim.
                                                   -- Edsger Dijkstra

From beau@nyc-search.com Tue Apr 1 23:15:44 2003
From: beau@nyc-search.com (beau@nyc-search.com)
Date: Tue, 01 Apr 2003 18:15:44 -0500
Subject: [Python-Dev] Python Programmers, NYC
Message-ID: <3E8A1DA0.5E202C45@nyc-search.com>

Python Programmers, NYC
http://www.nyc-search.com/jobs/python.html

We are seeking an experienced and highly talented programmer/scripter/analyst to fill the position of Technical Lead for our quality control group. The successful candidate will collaborate with engineering, QC, and clients, and will be responsible for developing and executing testing scripts to ensure that all aspects of client data, as transformed into reports, meet stringent quality standards.

Job Requirements:

  • Solid experience programming with Python and Java, preferably in a UNIX environment.
  • Strong knowledge of databases (Oracle) and SQL - knowledge of PL/SQL preferred.
  • Strong analytical skills (mathematics or statistics background preferred).
  • Demonstrated business knowledge of public education systems in the United States helpful.
  • We are using Python to: Prototype and simulate key product functionality, as well as test the client data for consistency and test product subsystems for correctness.
  • Candidates who elaborate on their knowledge of the above *key* requirements will get the best response.
My client hires on a contract basis first and then it becomes full time if both parties are happy.

Candidates MUST be permanent and local tri-state (NY, NJ, CT) residents.

Please submit Word resume and hourly/salary requirements to python@nyc-search.com
From greg@cosc.canterbury.ac.nz Wed Apr 2 01:58:31 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 02 Apr 2003 13:58:31 +1200 (NZST)
Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?)
In-Reply-To:
Message-ID: <200304020158.h321wVY02357@oma.cosc.canterbury.ac.nz>

> It would be best if you would provide a patch. Try to locate the
> primary source of the missing documentation (i.e. a TeX snippet),
> and integrate this into the current CVS, then do a cvs diff.

I'd rather not get involved in all that right now. I just want to
draw this to the attention of whoever is maintaining the
documentation.

> submit a bug report

That's what I *want* to do, but I can't figure out how. Following the
obvious links leads me to the SourceForge Bug Tracker page, but I
can't find anything there for submitting a new bug report, only
browsing existing ones.

Can someone please tell me how to submit a bug report?

Thanks,

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From lalo@laranja.org Wed Apr 2 02:40:11 2003
From: lalo@laranja.org (Lalo Martins)
Date: Tue, 1 Apr 2003 23:40:11 -0300
Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?)
In-Reply-To: <200304020158.h321wVY02357@oma.cosc.canterbury.ac.nz>
References: <200304020158.h321wVY02357@oma.cosc.canterbury.ac.nz>
Message-ID: <20030402024010.GG6887@laranja.org>

On Wed, Apr 02, 2003 at 01:58:31PM +1200, Greg Ewing wrote:
>
> That's what I *want* to do, but I can't figure out how.
> Following the obvious links leads me to the SourceForge
> Bug Tracker page, but I can't find anything there for
> submitting a new bug report, only browsing existing ones.
> > Can someone please tell me how to submit a bug report? You need to login to sourceforge. Once you do that you should see a bar that looks like Submit New | Browse | Reporting | Admin the link you want is "Submit New". []s, |alo +---- -- Those who trade freedom for security lose both and deserve neither. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! (I play RPG) http://www.eujogorpg.com.br/ GNU: never give up freedom http://www.gnu.org/ From tim.one@comcast.net Wed Apr 2 03:03:45 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 01 Apr 2003 22:03:45 -0500 Subject: [Python-Dev] Minor issue with PyErr_NormalizeException In-Reply-To: Message-ID: [Eric Huss] > We had a bug in one of our extension modules that caused a core dump in > PyErr_NormalizeException(). At the very top of the function (line 133) it > checks for a NULL type. I think it should have a "return" here so that > the code does not continue and thus dump core on line 153 when it calls > PyClass_Check(type). This should also make the comment not lie about > dumping core. ;) > > Just thought I'd pass it on.. I agree the code doesn't make sense, but the comment doesn't either. I'm in favor of replacing the guts of the if (type == NULL) { block with a call to Py_FatalError(). From barry@python.org Wed Apr 2 04:06:32 2003 From: barry@python.org (Barry Warsaw) Date: 01 Apr 2003 23:06:32 -0500 Subject: [Python-Dev] Minor issue with PyErr_NormalizeException In-Reply-To: References: Message-ID: <1049256392.3057.3.camel@geddy> On Tue, 2003-04-01 at 22:03, Tim Peters wrote: > [Eric Huss] > > I agree the code doesn't make sense, but the comment doesn't either. I'm in > favor of replacing the guts of the > > if (type == NULL) { > > block with a call to Py_FatalError(). 
+1

-Barry

From drifty@alum.berkeley.edu Wed Apr 2 04:52:22 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Tue, 1 Apr 2003 20:52:22 -0800 (PST)
Subject: [Python-Dev] python-dev Summary for 2003-03-16 through 2003-03-31
Message-ID:

You guys have 24 hours to correct my usual bunch of mistakes. Also
give me feedback on the new format for the Quickies section.

-----------

+++++++++++++++++++++++++++++++++++++++++++++++++++++
python-dev Summary for 2003-03-16 through 2003-03-31
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. _last summary: http://www.python.org/dev/summary/2003-03-01_2003-03-15.html

======================
Summary Announcements
======================

PyCon is now over! It was a wonderful experience. Getting to meet
people from python-dev in person was great. The sprint was fun and
productive (work on the AST branch, caching where something is found
in an inheritance tree, and a new CALL_ATTR opcode were all worked
on). Definitely was worth it.

I am trying a new way of formatting the Quickies_ section. I am trying
non-inline implicit links instead of inlined ones. I am hoping this
will read better in the text version of the summary. If you have an
opinion on whether the new or old version is better let me know. And
remember, the last time I asked for an opinion Michael Chermside was
the only person to respond and thus ended up making an executive
decision.

.. _PyCon: http://www.python.org/pycon/

========================
`Re: lists v. tuples`__
========================
__ http://mail.python.org/pipermail/python-dev/2003-March/034029.html

Splinter threads:
    - `Re: Re: lists v. tuples `__

This developed from a thread covered in the `last summary`_ that
discussed the different uses of lists and tuples. By the start date
for this summary, though, it had turned into a discussion on
comparisons. This occurred when sorting heterogeneous objects came up.
Guido commented that having anything beyond equality and non-equality tests for non-related objects does not make sense. This also led Guido to comment that "TOOWTDI makes me want to get rid of __cmp__" (TOOWTDI is "There is Only One Way to Do It"). Now before people start screaming bloody murder over the possible future loss of __cmp__() (which probably won't happen until Python 3), realize that all comparisons can be done using the six other rich comparisons (__lt__(), __eq__(), etc.). There is some possible code elegance lost if you have to use two rich comparisons instead of a single __cmp__() comparison, but it will not prevent you from doing anything that you could do before. This all led Guido to suggest introducing the function before(). This would be used for arbitrary ordering of objects. Alex Martelli said it would "be very nice if before(x,y) were the same as x`__ The thread that will not die (nor does it look like it will in the near future; Guido asked to postpone discussing it until he gets back from `Python UK`_ which will continue the discussion into the next summary). I am ending up an expert at capabilities against my will. =) In case you have not been following all of this, capabilities as being discussed here is the idea that security is based on passing around references to objects. If you have a reference you can use it with no restrictions. Security comes in by controlling who you give references to. So I might ask for a reference to file(), but I won't necessarily get it. I could, instead, be handed a reference to a restrictive version of file() that only opens files in an OS's temporary file directory. If that is not clear, read the `last summary`_ on this thread. And now, on to the new stuff... One point made about capabilities is that they partially go against the Pythonic grain. Since you have to pass capabilities specifically and shouldn't allow them to be inherited, it does not go with the way you tend to write Python code.
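To make the "restrictive version of file()" idea above concrete, here is a minimal sketch that is not taken from the thread. The names ``make_tempdir_opener``, ``open_in_root`` and ``untrusted_task`` are hypothetical, it is written for current Python 3 rather than 2.3, and the path check is for illustration only: ordinary Python offers no real encapsulation, so this shows the style of handing out a narrowed reference, not a secure sandbox.

```python
import os
import tempfile


def make_tempdir_opener(root):
    """Return a function that opens files only underneath `root`.

    Hypothetical helper for illustration; the returned closure plays the
    role of the "capability" handed to less trusted code.
    """
    root = os.path.realpath(root)

    def open_in_root(relative_name, mode="r"):
        # Resolve the requested path and refuse anything outside root.
        target = os.path.realpath(os.path.join(root, relative_name))
        if os.path.commonpath([root, target]) != root:
            raise PermissionError("outside the permitted directory")
        return open(target, mode)

    return open_in_root


def untrusted_task(opener):
    # By convention this code is only given the narrowed reference.
    with opener("note.txt", "w") as handle:
        handle.write("hello")
    with opener("note.txt") as handle:
        return handle.read()


workdir = tempfile.mkdtemp()
open_in_tmp = make_tempdir_opener(workdir)
result = untrusted_task(open_in_tmp)

# An attempt to climb out of the directory is refused.
try:
    open_in_tmp(os.path.join("..", "escape.txt"), "w")
    escaped = True
except PermissionError:
    escaped = False
```

The point of the sketch is only that access is decided by which reference gets passed in; nothing here stops the task from reaching the real open() through the built-ins, which is the gap the thread is arguing about.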
There were also suggestions to add arguments to import statements to give a more fine-grained control over them. But it was pointed out that classes fit this bill. The idea of limiting what modules are accessible by some code by not using a universally global scope (i.e., not using sys.modules) but by having a specific scope for each function was suggested. As Greg Ewing put it, "it would be dynamic scoping of the import namespace". While trying to clarify things (which were at PyCon thanks to the Open Space discussion held there on this subject), a good distinction between a rexec_ world (as in the module) and a capabilities world was made by Guido. In capabilities, security is based on passing around references that have the amount of power you are willing for them to have. In a rexec world, it is based on what powers the built-ins give you; there is no worry about passing around code. Also, in the rexec world, you can have the idea of a "workspace" where __builtin__ has very specific definitions of built-ins that are used when executing untrusted code. Ka-Ping Yee wrote up an example of some code of what it would be like to code with capabilities (can be found at XXX ). .. _Python UK: http://www.python-uk.org/ .. _rexec: http://www.python.org/dev/doc/devel/lib/module-rexec.html ========= Quickies ========= `tzset`__ time.tzset() is going to be kept in Python, but only on UNIX. The testing suite was also loosened so as to not throw as many false-negatives. __ http://mail.python.org/pipermail/python-dev/2003-March/034062.html `Windows IO`__ stdin and stdout on Windows are TTYs. You can get 3rd-party modules to get more control over the TTY.
__ http://mail.python.org/pipermail/python-dev/2003-March/034102.html `Who approved PyObject_GenericGetIter()???`__ Splinter threads: `Re: [Python-checkins] python/dist/src/Modules _hotshot.c,...`__; `PyObject_GenericGetIter()`__ Raymond Hettinger wrote a function called PyObject_GenericGetIter() that returned self for objects that were iterators themselves. Thomas Wouters didn't like the name and neither did Guido since it wasn't generic at all; it worked specifically with objects that were iterators themselves. Thus the function was renamed to PyObject_SelfIter(). __ http://mail.python.org/pipermail/python-dev/2003-March/034107.html __ http://mail.python.org/pipermail/python-dev/2003-March/034103.html __ http://mail.python.org/pipermail/python-dev/2003-March/034110.html `test_posix failures?`__ A test for posix.getlogin() was failing for Barry Warsaw under XEmacs (that is what he gets for not using Vim_ =). Thomas Wouters pointed out it only works when there is a utmp file somewhere. Basically it was agreed the test that was failing should be removed. __ http://mail.python.org/pipermail/python-dev/2003-March/034120.html .. _Vim: http://www.vim.org/ `Shortcut bugfix`__ Raymond Hettinger reported that a change in `_tkinter.c`_ for a function led to it returning strings or ints which broke PMW_ (although having a function return two different things was disputed in the thread; I think it used to return a string and now returns an int). The suggestion of making string.atoi() more lenient on its accepted arguments was made but shot down since it changes semantics. If you want to keep the old way of having everything in Tkinter return strings instead of more proper object types (such as ints where appropriate), you can put the line ``Tkinter.wantobjects = 0`` before the first creation of a tkapp object. __ http://mail.python.org/pipermail/python-dev/2003-March/034138.html ..
__tkinter.c: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/_tkinter.c .. _PMW: http://pmw.sourceforge.net/ `csv package ready for prime-time?`__ Related: `csv package stitched into CVS hierarchy`__ Skip Montanaro: Okay to move csv_ package from the sandbox into the stdlib? Guido van Rossum: Yes. __ http://mail.python.org/pipermail/python-dev/2003-March/034162.html __ http://mail.python.org/pipermail/python-dev/2003-March/034179.html .. _csv: http://www.python.org/dev/doc/devel/lib/module-csv.html `string.strip doc vs code mismatch`__ Neal Norwitz asked for someone to look at http://python.org/sf/697220 which updates string.strip() from the string_ module to take an optional second argument. The patch is still open. __ http://mail.python.org/pipermail/python-dev/2003-March/034167.html .. _string: http://www.python.org/dev/doc/devel/lib/module-string.html `Re: More int/long integration issues`__ The point was made that it would be nice if the statement ``if num in range(...): ...`` could be optimized by the compiler if range() was only the built-in by substituting it with something like xrange() and thus skip creating a huge list. This would allow the removal of xrange() without issue. Guido suggested a restartable iterator (generator would work wonderfully if you could just get everything else to make what range() returns look like the list it should be). __ http://mail.python.org/pipermail/python-dev/2003-March/034019.html `socket timeouts fail w/ makefile()`__ Skip Montanaro discovered that using the makefile() method on a socket causes the file-like object to not observe the new timeout facility introduced in Python 2.3. He has since patched it so that it works properly and that sockets always have a makefile() (wasn't always the case before). __ http://mail.python.org/pipermail/python-dev/2003-March/034177.html `New Module?
Tiger Hashsum`__ Tino Lange implemented a wrapper for the `Tiger hash sum`_ for Python and asked how he could get it added to the stdlib. He was told that he would need community backing before his module could be added in order to make sure that there is enough demand to warrant the addition. __ http://mail.python.org/pipermail/python-dev/2003-March/034191.html .. _Tiger hash sum: http://www.cs.technion.ac.il/~biham/Reports/Tiger/ `Icon for Python RSS Feed?`__ Tino Lange asked if an XML RSS feed icon could be added at http://www.python.org/ for http://www.python.org/channews.rdf . It has been added. __ http://mail.python.org/pipermail/python-dev/2003-March/034196.html `How to suppress instance __dict__?`__ David Abrahams asked if there was an easy way to suppress an instance __dict__'s creation from a metaclass. The answer turned out to be no. __ http://mail.python.org/pipermail/python-dev/2003-March/034197.html `Weekly Python Bug/Patch Summary`__ Another summary can be found at http://mail.python.org/pipermail/python-dev/2003-March/034286.html Skip Montanaro's weekly reminder how Python ain't perfect. __ http://mail.python.org/pipermail/python-dev/2003-March/034200.html `[ot] offline`__ Samuele Pedroni is off relaxing and is going to be offline for two weeks starting March 23. __ http://mail.python.org/pipermail/python-dev/2003-March/034204.html `funny leak`__ Christian Tismer discovered a memory leak in a funky def statement he came up with. The leak has since been squashed (done at PyCon_ during the sprint, actually). __ http://mail.python.org/pipermail/python-dev/2003-March/034212.html `Checkins to Attic?`__ CVS_ uses something called the Attic to put files that are only in a branch but not the HEAD of a tree. __ http://mail.python.org/pipermail/python-dev/2003-March/034230.html ..
_CVS: http://www.cvshome.org/ `ossaudiodev tweak needs testing`__ Greg Ward asked people who are running Linux or FreeBSD to execute ``Lib/test/regrtest.py -uaudio test_ossaudiodev`` so as to test his latest change to ossaudiodev_. __ http://mail.python.org/pipermail/python-dev/2003-March/034233.html .. _ossaudiodev: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/ossaudiodev.c `cvs.python.sourceforge.net fouled up`__ Apparently when you get that nice message from SourceForge_ telling you that recv() has aborted because of server overloading you can rest assured that people with checkin rights get to continue to connect since they get priority. __ http://mail.python.org/pipermail/python-dev/2003-March/034234.html .. _SF: .. _SourceForge: http://www.sf.net/ `Doc strings for typeslots?`__ You can't add custom docstrings to things stored in typeobject slots at the C level. __ http://mail.python.org/pipermail/python-dev/2003-March/034239.html `Compiler treats None both as a constant and variable`__ As of now the compiler outputs opcode that treats None as both a global and a constant. That will change at some point when assigning to None becomes an error instead of a warning as it is in Python 2.3; the change will possibly be made in 2.4. __ http://mail.python.org/pipermail/python-dev/2003-March/034281.html `iconv codec`__ M.A. Lemburg stated that he questioned whether the iconv codec was ready for prime-time. There have been multiple issues with it and most seem to stem from a platform's codec and not ones that come with Python. This affects all u"".encode() calls when the codec does not come with Python. Hye-Shik Chang said he would get his iconv codec NG patch up on SF in the next few days and that would be applied.
__ http://mail.python.org/pipermail/python-dev/2003-March/034300.html From beau@nyc-search.com Wed Apr 2 04:52:26 2003 From: beau@nyc-search.com (beau@nyc-search.com) Date: Tue, 01 Apr 2003 23:52:26 -0500 Subject: [Python-Dev] Python Technical Lead, New York, NY Message-ID: <3E8A6C8A.223A19FC@nyc-search.com> This is a multi-part message in MIME format. --------------FCDF22A5C479E2E8508D5BD8 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit http://www.nyc-search.com/jobs/python.html --------------FCDF22A5C479E2E8508D5BD8 Content-Type: text/html; charset=us-ascii; name="python.html" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="python.html" Content-Base: "http://www.nyc-search.com/jobs/python. html" Content-Location: "http://www.nyc-search.com/jobs/python. html" Python Technical Lead, New York, NY Python Technical Lead, New York, NY

We are seeking an experienced and highly-talented programmer/scripter/analyst to fill the position of Technical Lead for our quality control group. The successful candidate will collaborate with engineering, QC, and clients, and shall be responsible for developing and executing testing scripts to ensure all aspects of client data, as transformed to reports, meet stringent quality standards.

Job Requirements:

  • Solid experience programming with Python and Java, preferably in a UNIX environment.
  • Strong knowledge of databases (Oracle) and SQL - knowledge of PL/SQL preferred.
  • Strong analytical skills (mathematics or statistics background preferred).
  • Demonstrated business knowledge of public education systems in the United States helpful.
  • We are using Python to: Prototype and simulate key product functionality, as well as test the client data for consistency and test product subsystems for correctness.
  • Candidates who elaborate on their knowledge of the above *key* requirements will get the best response.
My client hires on a contract basis first and then it becomes full time if both parties are happy.

Candidates MUST be permanent and local tri-state (NY, NJ, CT) residents.

Please submit Word resume and hourly/salary requirements to python@nyc-search.com
  --------------FCDF22A5C479E2E8508D5BD8-- From Jack.Jansen@cwi.nl Wed Apr 2 09:21:17 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Wed, 2 Apr 2003 11:21:17 +0200 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: <20030402024010.GG6887@laranja.org> Message-ID: <7574B507-64EC-11D7-80C3-0030655234CE@cwi.nl> On Wednesday, Apr 2, 2003, at 04:40 Europe/Amsterdam, Lalo Martins wrote: >> Can someone please tell me how to submit a bug report? > > You need to login to sourceforge. > > Once you do that you should see a bar that looks like > Submit New | Browse | Reporting | Admin > the link you want is "Submit New". Aargh, this is very bad! I'm always logged in when I visit sourceforge (and I assume that most of us are), I wasn't aware of the fact that if you are not logged in you get no indication whatsoever that it is possible to submit bugs. Do we have control over what is on that page, i.e. could we add a note to the top saying "If you want to submit a new bug please log in first"? Otherwise I think the "bugs" link on www.python.org should go to a local page which explains this before sending people off to the sourceforge tracker. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From dave@boost-consulting.com Wed Apr 2 12:57:34 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 02 Apr 2003 07:57:34 -0500 Subject: [Python-Dev] How to suppress instance __dict__? 
In-Reply-To: <021d01c2f82c$9b6d3470$4ee1afca@kim> ("Joel de Guzman"'s message of "Tue, 1 Apr 2003 16:56:34 +0800") References: <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net> <200303231546.h2NFkex04473@pcp02138704pcs.reston01.va.comcast.net> <200303232104.h2NL4GQ04819@pcp02138704pcs.reston01.va.comcast.net> <021d01c2f82c$9b6d3470$4ee1afca@kim> Message-ID: Hi, Joel -- I don't think this is more than marginally appropriate for python-dev, and probably we shouldn't bother Guido about it until I've failed to help you first. Everybody else can ignore the rest of this message unless they have a sick fascination with Boost.Python... "Joel de Guzman" writes: > Ok, I'm lost. Please be easy with me, I'm still learning the C API > interfacing with Python :) Here's what I have so far. Emulating the > desired behavior in Python, I can do: > > class EnumMeta(type): > def __new__(cls, name, bases, dict): > C = type.__new__(cls, name, bases, dict) > del C.__getstate__ > return C > > class Enum(int): > __metaclass__ = EnumMeta > __slots__ = () > > > x = Enum(1964) > print x > > import pickle > print "SAVING" > out_x = pickle.dumps(x) > > print "LOADING" > xl = pickle.loads(out_x) > print xl > > I'm trying to rewrite this in C/C++ with the intent to patch > Boost.Python to allow pickling on enums. I took on this task to > learn more about the low level details of Python C interfacing. > So far, I have implemented EnumMeta in C that does not override > anything yet and installed that as the metaclass of Enum. > > I was wondering... Is there some C code somewhere that I can see > that implements some sort of meta-stuff? We have some in Boost.Python already, and I'm about to check in some more to implement static data members. > I read PEP253 and 253 and "Unifying Types and Classes in Python > 2.2". The examples there (specifically the class autoprop) is > written in Python. 
I tried searching for examples in C from the > current CVS snapsot of 2.3 but I failed in doing so. I'm sure it's > there, but I don't know where to find. Actually there are very few metaclasses in Python proper. AFAIK, PyType_Type is the only metaclass in the core. > To be specific, I'm lost in trying to implement tp_new of > PyTypeObject. How do I call the default tp_new for metaclasses? PyTypeObject.tp_new( /*args here*/ ) should work. HTH, -- Dave Abrahams Boost Consulting www.boost-consulting.com From zooko@zooko.com Wed Apr 2 13:39:33 2003 From: zooko@zooko.com (Zooko) Date: Wed, 02 Apr 2003 08:39:33 -0500 Subject: [Python-Dev] python-dev Summary for 2003-03-16 through 2003-03-31 In-Reply-To: Message from Brett Cannon of "Tue, 01 Apr 2003 20:52:22 PST." References: Message-ID: Brett Cannon wrote: > > One point made about capabilities is that they partially go against the > Pythonic grain. Since you have to pass capabilities specifically and > shouldn't allow them to be inherited, it does not go with the way you tend > to write Python code. This doesn't make sense to me, and I don't recall a message which asserted it. If capabilities were implemented as Python references, you could inherit capabilities (== references) from superclasses, just as you can currently do. The rest looks like a good summary! Regards, Zooko http://zooko.com/ ^-- under re-construction: some new stuff, some broken links From nas@python.ca Wed Apr 2 14:35:53 2003 From: nas@python.ca (Neil Schemenauer) Date: Wed, 2 Apr 2003 06:35:53 -0800 Subject: [Python-Dev] python-dev Summary for 2003-03-16 through 2003-03-31 In-Reply-To: References: Message-ID: <20030402143553.GA6801@glacier.arctrix.com> Brett Cannon wrote: > Neil Schemanauer suggested adding a warning for when this kind of > shadowing is done. There is a patch on SF (http://www.python.org/sf/711448) that adds a warning. It probably needs a bit of polish but I think it could go into 2.3. 
Neil From op73418@mail.telepac.pt Wed Apr 2 14:42:41 2003 From: op73418@mail.telepac.pt (Gonçalo Rodrigues) Date: Wed, 2 Apr 2003 15:42:41 +0100 Subject: [Python-Dev] Super and properties Message-ID: <001401c2f926$1d32d7e0$a8130dd5@violante> Hi all, Since this is my first post here, let me first introduce myself. I'm Gonçalo Rodrigues. I work in mathematics, mathematical physics to be more precise. I am a self-taught hobbyist programmer and fell in love with Python a year and half ago. And of interesting personal details this is about all so let me get down to business. My problem has to do with super that does not seem to work well with properties. I posted to comp.lang.python a while ago and there I was advised to post here. So, suppose I override a property in a subclass, e.g.

>>> class test(object):
...     def __init__(self, n):
...         self.__n = n
...     def __get_n(self):
...         return self.__n
...     def __set_n(self, n):
...         self.__n = n
...     n = property(__get_n, __set_n)
...
>>> a = test(8)
>>> a.n
8
>>> class test2(test):
...     def __init__(self, n):
...         super(test2, self).__init__(n)
...     def __get_n(self):
...         return "Got ya!"
...     n = property(__get_n)
...
>>> b = test2(8)
>>> b.n
'Got ya!'

Now, since I'm overriding a property, it is only normal that I may want to call the property implementation in the super class. But the obvious way (to me at least) does not work:

>>> print super(test2, b).n
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'super' object has no attribute 'n'

I know I can get at the property via the class, e.g. do

>>> test.n.__get__(b)
8
>>>

Or, not hardcoding the test class,

>>> b.__class__.__mro__[1].n.__get__(b)
8

But this is ugly at best. To add to the puzzle, the following works, albeit not in the way I expected

>>> super(test2, b).__getattribute__('n')
'Got ya!'

Since I do not know if this is a bug in super or a feature request for it, I thought I'd better post here and leave it to your consideration.
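For the situation described above, one way to reach the inherited property without relying on super() is to go through the base class's property object directly. The following is only a sketch: the class names ``Base`` and ``Derived`` are hypothetical stand-ins for ``test`` and ``test2``, and single-underscore names are used to sidestep the name mangling of ``__n``. ``fget`` is the getter stored on a property object.

```python
class Base(object):
    def __init__(self, n):
        self._n = n

    def _get_n(self):
        return self._n

    n = property(_get_n)


class Derived(Base):
    def _get_n(self):
        # Call the overridden property's getter explicitly via the base class.
        inherited = Base.n.fget(self)
        return "Got ya! (base value was %d)" % inherited

    n = property(_get_n)


b = Derived(8)
via_override = b.n                     # goes through Derived.n
via_base = Base.n.__get__(b, Derived)  # the descriptor call from the post
```

This still hardcodes the base class, which is exactly the ugliness the post complains about; it only shows that the property descriptor itself remains reachable.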
With my best regards, G. Rodrigues From lkcl@samba-tng.org Wed Apr 2 09:07:26 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Wed, 2 Apr 2003 09:07:26 +0000 Subject: [Python-Dev] [PEP] += on return of function call result Message-ID: <20030402090726.GN1048@localhost> example code: log = {} for t in range(5): for r in range(10): log.setdefault(r, '') += "test %d\n" % t pprint(log) instead, as the above is not possible, the following must be used: from operator import add ... ... ... add(log.setdefault(r, ''), "test %d\n" % t) ... ARGH! just checked - NOPE! add doesn't work. and there's no function "radd" or "__radd__" in the operator module. unless there are really good reasons, can i recommend allowing += on return result of function calls. i cannot honestly think of or believe that there is a reasonable justification for restricting the += operator. append() on the return result of setdefault works absolutely fine, which is GREAT because you have no idea how long i have been fed up of not being able to do this in one line: log = {} log.setdefault(99, []).append("test %d\n" % t) l. From ark@research.att.com Wed Apr 2 14:54:35 2003 From: ark@research.att.com (Andrew Koenig) Date: 02 Apr 2003 09:54:35 -0500 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030402090726.GN1048@localhost> References: <20030402090726.GN1048@localhost> Message-ID: Luke> example code: Luke> log = {} Luke> for t in range(5): Luke> for r in range(10): Luke> log.setdefault(r, '') += "test %d\n" % t Luke> pprint(log) Luke> instead, as the above is not possible, the following must be used: Luke> from operator import add Luke> ... Luke> ... Luke> ... Luke> add(log.setdefault(r, ''), "test %d\n" % t) Luke> ... ARGH! just checked - NOPE! add doesn't work. Luke> and there's no function "radd" or "__radd__" in the Luke> operator module. Why can't you do this? 
for t in range(5): for r in range(10): foo = log.setdefault(r,'') foo += "test %d\n" % t -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From lkcl@samba-tng.org Wed Apr 2 15:12:33 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Wed, 2 Apr 2003 15:12:33 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: References: <20030402090726.GN1048@localhost> Message-ID: <20030402151232.GX1048@localhost> On Wed, Apr 02, 2003 at 09:54:35AM -0500, Andrew Koenig wrote: > Why can't you do this? > > for t in range(5): > for r in range(10): > foo = log.setdefault(r,'') > foo += "test %d\n" % t because i am thick? ... now why didn't that occur to me :) thanks andrew, l. p.s. so it's on the "would be nice to have" From ben@algroup.co.uk Wed Apr 2 16:22:09 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Wed, 02 Apr 2003 17:22:09 +0100 Subject: [Python-Dev] Capabilities (we already got one) In-Reply-To: <5.1.1.6.0.20030401124212.01e03670@mail.rapidsite.net> References: <5.1.1.6.0.20030401124212.01e03670@mail.rapidsite.net> Message-ID: <3E8B0E31.5060001@algroup.co.uk> This message came unglued from the rest of the thread, so I'm going to unglue my response from my catching up with the rest of the thread (which I am partway through at the moment) ;-) Phillip J. Eby wrote: > >However, you don't use the same technique to control access to Python > *modules* > >such as the zipfile module, because the "import zipfile" statement > will give the > >current scope access to the zipfile module even if nobody has granted > such > >access to the current scope. > >... > >So your solution to this, to prevent code from grabbing privileges > willy nilly > >via "import" and builtins, is rexec, which creates a scope in which code > >executes (now called a "workspace"), and allows you to control which > builtins > >and modules are available for code executing in that "workspace". > > Almost. 
I think you may be confusing module *code* and module > *objects*. Guido pointed this out earlier. > > A Python module object is populated by executing a body of *code* > against the module *object* dictionary. The module object dictionary > contains a '__builtins__' entry that gives it its "base" capabilities. > > Module *objects* possess capabilities, which are in their dictionary or > reachable from it. *Code* doesn't possess capabilities except to > constants used in the code. So access to *code* only grants you > capabilities to the code and its constants. > > So, in order to provide a capability-safe environment, you need only > provide a custom __import__ which uses a different 'sys.modules' that is > specific to that environment. At that point, a "workspace" consists of > an object graph rooted in the supplied '__builtins__', locals(), > globals(), and initially executing code. > > We can then see that the standard Python environment is in fact a > capability system, wherein everything is reachable from everything else. I'm not quite sure what you mean by this. Of course, the fact that Python doesn't seem to be all that far from a capability system is one of the attractions, but until the holes you mention (and perhaps others) are plugged, it isn't a capability system. > > The "holes" in this capability system, then, are: > > 1. introspective abilities that allow "breaking out" of the workspace > (such as the ability to 'sys._getframe()' or examine tracebacks to > "reach up" to higher-level stack frames) > > 2. the structuring of the library in ways that equate creating an > instance of a class with an "unsafe" capability. (E.g., creating > instances of 'file()') coupled with instance->class introspection > > 3. Lack of true "privacy" for objects. (Proxies are a useful way to > address this issue, because they allow more than one "capability" to > exist for the same object.) 
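The custom ``__import__`` idea quoted above can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the thread: it is written for current Python 3 (where the module is ``builtins`` rather than ``__builtin__``), it swaps in a simple allow-list wrapper around ``__import__`` instead of a separate ``sys.modules``, and the helper names ``make_restricted_import`` and ``run_in_workspace`` are hypothetical.

```python
import builtins


def make_restricted_import(allowed):
    """Build an __import__ replacement admitting only top-level names in `allowed`."""
    real_import = builtins.__import__

    def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
        if name.split(".")[0] not in allowed:
            raise ImportError("import of %r is not permitted here" % name)
        return real_import(name, globals, locals, fromlist, level)

    return restricted_import


def run_in_workspace(source, allowed):
    # The workspace's only built-in is the filtered __import__.
    workspace = {"__builtins__": {"__import__": make_restricted_import(allowed)}}
    exec(source, workspace)
    return workspace


ok = run_in_workspace("import math\nroot = math.sqrt(16)", {"math"})
root = ok["root"]

try:
    run_in_workspace("import os", {"math"})
    blocked = False
except ImportError:
    blocked = True
```

As hole 1 in the list above suggests, a filter like this is easy to escape through introspection, so it should be read as a picture of the "workspace" idea and not as a working sandbox.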
Of course, once you have a capability system, you get the effect of more than one capability for the same object for free, as it were, simply by, err, proxying with other objects. The objection to doing it the other way round is that for capability languages to be truly usable the capability functionality needs to be automatic, not something that is painfully added to each class or object (at least, that is the claim we capability mavens are making). Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From aahz@pythoncraft.com Wed Apr 2 17:55:48 2003 From: aahz@pythoncraft.com (Aahz) Date: Wed, 2 Apr 2003 12:55:48 -0500 Subject: [Python-Dev] Security challenge (was Re: Capabilities) In-Reply-To: References: <3E8768BE.8010603@prescod.net> Message-ID: <20030402175548.GA25135@panix.com> On Mon, Mar 31, 2003, Ka-Ping Yee wrote: > > I'm looking for other security design challenges to tackle in Python. > Once enough of them have been tried, we'll have a better understanding > of what Python would need to do to make secure programming easier. Okay, how about using LDAP to secure access to a database and give each user appropriate privileges? I'm just throwing this in as an example of mediated access that's required to be effective in the Real World [tm]; I'm sure you can think of simpler examples if you want. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ This is Python. We don't care much about theory, except where it intersects with useful practice. 
--Aahz, c.l.py, 2/4/2002 From drifty@alum.berkeley.edu Wed Apr 2 20:36:38 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Wed, 2 Apr 2003 12:36:38 -0800 (PST) Subject: [Python-Dev] python-dev Summary for 2003-03-16 through 2003-03-31 In-Reply-To: References: Message-ID: [Zooko] > > Brett Cannon wrote: > > > > One point made about capabilities is that they partially go against the > > Pythonic grain. Since you have to pass capabilities specifically and > > shouldn't allow them to be inherited, it does not go with the way you tend > > to write Python code. > > This doesn't make sense to me, and I don't recall a message which asserted it. > It was said in an email. I don't remember who off the top of my head, but someone stated something along these lines. > If capabilities were implemented as Python references, you could inherit > capabilities (== references) from superclasses, just as you can currently do. > That's why it says "shouldn't" instead of "couldn't". I could re-word this to go more along the way Ping phrased it in how the class statement does not make perfect sense for capabilities but it can be used. > The rest looks like a good summary! > Thanks. -Brett From martin@v.loewis.de Wed Apr 2 21:24:32 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 02 Apr 2003 23:24:32 +0200 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: <7574B507-64EC-11D7-80C3-0030655234CE@cwi.nl> References: <7574B507-64EC-11D7-80C3-0030655234CE@cwi.nl> Message-ID: Jack Jansen writes: > Do we have control over what is on that page, i.e. could we add a > note to the top saying "If you want to submit a new bug please log > in first"? Please have a look at the page now. Look ok? Is that needed for patches as well? Regards, Martin From fdrake@acm.org Wed Apr 2 21:34:24 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 2 Apr 2003 16:34:24 -0500 Subject: How do I report a bug?
(Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: References: <7574B507-64EC-11D7-80C3-0030655234CE@cwi.nl> Message-ID: <16011.22368.351593.284577@grendel.zope.com> Martin v. Löwis writes: > Please have a look at the page now. Look ok? Is that needed for > patches as well? Yes; that tracker has the same requirement for submission. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From zooko@zooko.com Wed Apr 2 22:53:31 2003 From: zooko@zooko.com (Zooko) Date: Wed, 02 Apr 2003 17:53:31 -0500 Subject: [Python-Dev] python-dev Summary for 2003-03-16 through 2003-03-31 In-Reply-To: Message from Brett Cannon of "Wed, 02 Apr 2003 12:36:38 PST." References: Message-ID: > > > One point made about capabilities is that they partially go against the > > > Pythonic grain. ... > > If capabilities were implemented as Python references, you could inherit > > capabilities (== references) from superclasses, just as you can currently do. > > That's why it says "shouldn't" instead of "couldn't". I could re-word > this to go more along the way Ping phrased it in how the class statement > does not make perfect sense for capabilities but it can be used. I can't speak for Ping, but I would be quite surprised if he thought that capabilities were un-Pythonic. (I wouldn't be surprised if he disapproved of the notion of classes in a programming language, regardless of security considerations...) Speaking for myself, capabilities have two main advantages: they fit with the Zen of Python, they enable higher-order least-privilege, and they fit with the principle of unifying designation and authority. But seriously, I feel that capabilities fit with normal Python programming as it is currently practiced.
Regards, Zooko http://zooko.com/ ^-- under re-construction: some new stuff, some broken links From zooko@zooko.com Wed Apr 2 23:08:12 2003 From: zooko@zooko.com (Zooko) Date: Wed, 02 Apr 2003 18:08:12 -0500 Subject: [Python-Dev] Capabilities (we already got one) In-Reply-To: Message from Ka-Ping Yee of "Tue, 01 Apr 2003 14:12:49 CST." References: Message-ID: (I, Zooko, wrote the lines prepended with "> > ".) Ping wrote: > > > I think that in restricted-execution-mode (hereafter: "REM", as per Greg Ewing's > > suggestion [1]), Python objects have encapsulation -- one can't access their > > private data without their permission. > > > > Once this is done, Python references are capabilities. > > Aaack! I wish you would *stop* saying that! > > There is no criterion by which a reference is or is not a capability. > To talk in such terms only confuses the issue. Let me be a little more precise. Once Python objects are encapsulated, then possession of a reference is constrained in the following way: you can have a reference only if another object that had it chose to give it to you (or if you create something yourself, in which case you get the first-ever reference to it). This constraint happens to be the same constraint that the rule of capabilities imposes on the transmission of capabilities: you can have a capability only if someone else who had it chose to give it to you (or if you create something yourself, in which case you get the first-ever capability to it). Therefore, if you wish to use capability access control to manage access to resources in Python you can use the following technique: 1. Encapsulate the resource that you wish to control in a Python object. 2. Say to yourself "References are capabilities!". 3. Control the way references to that object are shared. Doing it this way will yield the advantages that capability access control enjoys over alternative access control models. 
It also has the advantage that your skills at Python programming can be applied directly to the problem of managing access control, without requiring you to learn any new policy language or new concepts. You are quite right, Ping, that capability access control could be enforced in other ways in Python. I didn't mean to say "capabilities are Python references", which would imply that capability access control could not be implemented in any other way. I'm deliberately refraining from posting about the issue of controlling import of modules and builtins in an attempt to "slow down" the discussion until Guido returns from Python UK. Regards, Zooko http://zooko.com/ ^-- under re-construction: some new stuff, some broken links From greg@cosc.canterbury.ac.nz Thu Apr 3 01:07:52 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 03 Apr 2003 13:07:52 +1200 (NZST) Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030402151232.GX1048@localhost> Message-ID: <200304030107.h3317qq20982@oma.cosc.canterbury.ac.nz> Andrew Koenig wrote: > Why can't you do this? > foo = log.setdefault(r,'') > foo += "test %d\n" % t You can do it, but it's useless! >>> d = {} >>> foo = d.setdefault(42, "buckle") >>> foo += " my shoe" >>> d {42: 'buckle'} What Mr. Leighton wanted is *impossible* when the value concerned is immutable, because by the time you get to the += operator, there's no information left about where the value came from, and thus no way to update the dict with the new value. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Apr 3 02:19:51 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 03 Apr 2003 14:19:51 +1200 (NZST) Subject: How do I report a bug? 
(Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: Message-ID: <200304030219.h332Jp223291@oma.cosc.canterbury.ac.nz> Martin: > Please have a look at the page now. Look ok? What page are you talking about, exactly? I just tried the "Bug Tracker" link in the sidebar of www.python.org, and it still goes straight to a sourceforge page, which looks just the same as before as far as I can tell. What am I supposed to be seeing? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Thu Apr 3 02:31:43 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 02 Apr 2003 21:31:43 -0500 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: <200304030219.h332Jp223291@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > What page are you talking about, exactly? I just tried > the "Bug Tracker" link in the sidebar of www.python.org, > and it still goes straight to a sourceforge page, which > looks just the same as before as far as I can tell. > > What am I supposed to be seeing? I expect he wants you to see the line that says Please log into SourceForge to submit a new report. below the filter boxes and above the 1-line bug summaries. From ark@research.att.com Thu Apr 3 02:38:48 2003 From: ark@research.att.com (Andrew Koenig) Date: 02 Apr 2003 21:38:48 -0500 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <200304030107.h3317qq20982@oma.cosc.canterbury.ac.nz> References: <200304030107.h3317qq20982@oma.cosc.canterbury.ac.nz> Message-ID: Greg> Andrew Koenig wrote: >> Why can't you do this? >> foo = log.setdefault(r,'') >> foo += "test %d\n" % t Greg> You can do it, but it's useless! 
>>>> d = {} >>>> foo = d.setdefault(42, "buckle") >>>> foo += " my shoe" >>>> d Greg> {42: 'buckle'} Greg> What Mr. Leighton wanted is *impossible* when the value Greg> concerned is immutable, because by the time you get to Greg> the += operator, there's no information left about where Greg> the value came from, and thus no way to update the Greg> dict with the new value. Of course it's impossible when the value is immutable, because += cam't mutate it :-) However, consider this: foo = [] foo += ["my shoe"] No problem, right? So the behavior of foo = d.setdefault(r,'') foo += "test %d\n" % t depends on what type foo has, and the OP didn't say. But whatever type foo might have, the behavior of the two statements above ought logically to be the same as the theoretical behavior of d.setdefault(r,'') += "test %d\n" % t which is what the OP was trying to achieve. -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From greg@cosc.canterbury.ac.nz Thu Apr 3 02:56:43 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 03 Apr 2003 14:56:43 +1200 (NZST) Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: Message-ID: <200304030256.h332uha23381@oma.cosc.canterbury.ac.nz> Tim Peters : > I expect he wants you to see the line that says > > Please log into SourceForge to submit a new report. > > below the filter boxes and above the 1-line bug summaries. Hmmm, okay, I can see it now, but it would be easy to miss if I weren't looking for it. Perhaps it could be made a little larger and set off from the items above and below it? Ideally, of course, the Submit New button should always be there, and lead to a page telling you to log in if you're not already. But presumably you don't have that much control over the page? 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Apr 3 03:04:27 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 03 Apr 2003 15:04:27 +1200 (NZST) Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: Message-ID: <200304030304.h3334Rc23393@oma.cosc.canterbury.ac.nz> Andrew Koenig : > So the behavior of > > foo = d.setdefault(r,'') > foo += "test %d\n" % t > > depends on what type foo has, and the OP didn't say. I assumed that the code snippet was from his actual application, in which case he *did* want it to work on strings, in which case, even if he had the feature he wanted, it wouldn't have helped him. I think the fact that this would only work when the value was mutable is a good reason to disallow it. Too big a source of surprises, otherwise. Being forced to find another way to update the value in this case is a feature, because the absence of such a way when the value is immutable makes it clear that there's no way to do what you're trying to do! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Thu Apr 3 03:09:25 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 02 Apr 2003 22:09:25 -0500 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: <200304030256.h332uha23381@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Hmmm, okay, I can see it now, but it would be easy to > miss if I weren't looking for it. 
> > Perhaps it could be made a little larger and set off from > the items above and below it? We have no control over either -- SF lets us put words there, but that's all. I added another paragraph: Please log into SourceForge to submit a new report. SourceForge will not allow you to submit a new bug report unless you're logged in. It's not as invisible now. > Ideally, of course, the Submit New button should always > be there, and lead to a page telling you to log in > if you're not already. But presumably you don't have > that much control over the page? That's right. From fdrake@acm.org Thu Apr 3 03:57:38 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 2 Apr 2003 22:57:38 -0500 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: References: <200304030256.h332uha23381@oma.cosc.canterbury.ac.nz> Message-ID: <16011.45362.723995.488848@grendel.zope.com> Tim Peters writes: > We have no control over either -- SF lets us put words there, but that's > all. I added another paragraph: We can do a little more; see the Expat tracker's "Submit New" page for an example that enhances the presentation a bit: http://sourceforge.net/tracker/?func=add&group_id=10127&atid=110127 One catch, of course, is that the extra blurb is always shown, even for people that are already logged in (I suspect the majority of use is by the development team); the farther down the page we push the actual bug information, the harder it is for developers to use. We need to think about the tradeoff; it is important to encourage good reports from people interested in providing them and willing to do so. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin@v.loewis.de Thu Apr 3 04:33:31 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 03 Apr 2003 06:33:31 +0200 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) 
In-Reply-To: <16011.45362.723995.488848@grendel.zope.com> References: <200304030256.h332uha23381@oma.cosc.canterbury.ac.nz> <16011.45362.723995.488848@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > One catch, of course, is that the extra blurb is always shown, even > for people that are already logged in (I suspect the majority of use > is by the development team); the farther down the page we push the > actual bug information, the harder it is for developers to use. I have now boldified parts of it; this doesn't take more space, but should increase visibility. I hope it's not considered annoying - feel free to undo that. If they would allow us to put PHP into that box, we could even suppress the text if the user was logged in. Regards, Martin From boris.boutillier@arteris.net Thu Apr 3 06:09:11 2003 From: boris.boutillier@arteris.net (Boris Boutillier) Date: 03 Apr 2003 08:09:11 +0200 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <200304030304.h3334Rc23393@oma.cosc.canterbury.ac.nz> References: <200304030304.h3334Rc23393@oma.cosc.canterbury.ac.nz> Message-ID: <1049350152.23533.20.camel@elevedelix> There is a way to do it, even with immutable objects; it is a little bit heavier:
>>> x = {}
>>> x.setdefault(42,'buckle')
'buckle'
>>> x[42] += '3'
>>> x
{42: 'buckle3'}
Boris Boutillier, - ARTERIS - Artwork Interconnecting System 6, Parc Ariane 78284 Guyancourt (FRANCE) On Thu, 2003-04-03 at 05:04, Greg Ewing wrote: > Andrew Koenig : > > > So the behavior of > > > > foo = d.setdefault(r,'') > > foo += "test %d\n" % t > > > > depends on what type foo has, and the OP didn't say. > > I assumed that the code snippet was from his actual application, in > which case he *did* want it to work on strings, in which case, even if > he had the feature he wanted, it wouldn't have helped him. > > I think the fact that this would only work when the value was mutable > is a good reason to disallow it.
Too big a source of surprises, > otherwise. > > Being forced to find another way to update the value in this case is a > feature, because the absence of such a way when the value is immutable > makes it clear that there's no way to do what you're trying to do! > > Greg Ewing, Computer Science Dept, +--------------------------------------+ > University of Canterbury, | A citizen of NewZealandCorp, a | > Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | > greg@cosc.canterbury.ac.nz +--------------------------------------+ > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From walter@livinglogic.de Thu Apr 3 08:53:17 2003 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Thu, 03 Apr 2003 10:53:17 +0200 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <200304030304.h3334Rc23393@oma.cosc.canterbury.ac.nz> References: <200304030304.h3334Rc23393@oma.cosc.canterbury.ac.nz> Message-ID: <3E8BF67D.4060807@livinglogic.de> Greg Ewing wrote: > Andrew Koenig : > > >>So the behavior of >> >> foo = d.setdefault(r,'') >> foo += "test %d\n" % t >> >>depends on what type foo has, and the OP didn't say. > > I assumed that the code snippet was from his actual application, in > which case he *did* want it to work on strings, in which case, even if > he had the feature he wanted, it wouldn't have helped him. > [...] > Being forced to find another way to update the value in this case is a > feature, because the absence of such a way when the value is immutable > makes it clear that there's no way to do what you're trying to do! Mutable (or at least appendable) strings should probably be done with StringIO/cStringIO. How about adding support for __iadd__ and __str__ (and __unicode__) to both? 
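The behaviour argued over in this thread can be checked directly. The following is a sketch assuming a modern Python 3 interpreter; `io.StringIO` stands in for the StringIO/cStringIO modules named above, and the variable names are illustrative. Note that it appends with `write()`, since `StringIO` does not support `+=` (that support is only a proposal here).

```python
import io

# Immutable value: += rebinds the local name; the dict is unchanged.
d = {}
foo = d.setdefault(42, "buckle")
foo += " my shoe"
unchanged = dict(d)          # snapshot: still {42: 'buckle'}

# Augmented assignment on a subscription stores the result back.
d[42] += " my shoe"

# Mutable value: list += extends in place, so the dict sees the change.
m = {}
items = m.setdefault("k", [])
items += ["my shoe"]

# Appendable text buffer: accumulate with write(), read with getvalue().
buf = io.StringIO()
for t in range(2):
    buf.write("test %d\n" % t)
log_text = buf.getvalue()
```

The difference between the first and third cases is exactly the type dependence Andrew Koenig points out: the same two statements update the dict for a list but not for a string.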
Bye, Walter Dörwald From ben@algroup.co.uk Thu Apr 3 10:43:10 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Thu, 03 Apr 2003 11:43:10 +0100 Subject: [Python-Dev] Capabilities In-Reply-To: References: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E8C103E.90201@algroup.co.uk> Zooko wrote: > In the capability way of life, it is still the case that access to the ZipFile > class gives you the ability to open files anywhere in the system! (That is: I'm > assuming for now that we implement capabilities without re-writing every > dangerous class in the Library.) In this scheme, there are no flags, and when > you run code that you think might misuse this feature, you simply don't give > that code a reference to the ZipFile class. (Also, we have to arrange that it > can't acquire a reference by "import zipfile".) It would probably be helpful to explain what you (or, at least, I) would do if you (I) were writing from scratch, rather than "taming" the existing libraries. In this case, ZipFile would require a file capability to be passed to it at construction time, and so would become non-dangerous, which is, I think, where Guido is coming from. The risk only occurs because we want to not rewrite the whole library, just to wrap it, and it's important to understand that this isn't really the "proper" way to do it (though, of course, the ZipFile class is not unlike any of the other non-capability things we'd have to wrap anyway, given a non-capability OS underneath; it just happens to be one that _can_ be rewritten if we want to rewrite it). Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit."
- Robert Woodruff From ben@algroup.co.uk Thu Apr 3 10:52:08 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Thu, 03 Apr 2003 11:52:08 +0100 Subject: [Python-Dev] Capabilities (we already got one) In-Reply-To: References: Message-ID: <3E8C1258.3070906@algroup.co.uk> Ken Manheimer wrote: > On Tue, 1 Apr 2003, Ka-Ping Yee wrote: > One big one seems to be: "What needs to be done to enable effective > ("safe"?) use of python object (references) as capabilities?" I've > seen answers to this roll by several times - i think we need to settle > them, and collect the conclusions in a PEP. And we need to identify > what other questions there are. I am in the process of writing a PEP, and it is being informed by this discussion. Unfortunately, I have several day jobs and it's going somewhat slowly. I've also been bogged down somewhat in a theoretical discussion with a bunch of capability experts over globals and how they should work. However, we do appear to have reached closure on that issue: globals have to be at least transitively immutable - unfortunately, I have demonstrated that this requirement is not sufficient to make them safe, but it is (we believe) necessary. So, now I've sorted that one out I can complete my first pass on the PEP, which I expect to do in the next few days. At that point, I'm slightly unsure how best to proceed. The most obvious way is, of course, to follow the standard PEP procedure, but are there people who would like to comment before I submit the first draft? It is still going to be full of unanswered questions, but I do think we are near to the stage where we can start nailing down the answers. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit."
- Robert Woodruff From mcherm@mcherm.com Thu Apr 3 13:09:31 2003 From: mcherm@mcherm.com (Michael Chermside) Date: Thu, 3 Apr 2003 05:09:31 -0800 Subject: [Python-Dev] Re: Capabilities (we already got one) Message-ID: <1049375371.3e8c328be581d@mcherm.com> > The objection to doing it the other way round is that for capability > languages to be truly usable the capability functionality needs to be > automatic, not something that is painfully added to each class or object > (at least, that is the claim we capability mavens are making). Just how strong a claim are you making here? It seems to me that the need for security (via capabilities or any other mechanism) is an UNUSUAL need. Most programs don't need it at all, others need it in only a few places. Now don't get me wrong... when you DO need it, you really need it, and just throwing something together without explicit language support is somewhere between impossible and terrifically-difficult-and-error-prone. So supporting secure execution (via capabilities or whatever) in the language is a great idea. And I like the capabilities-as-references approach... it's simple, elegant, and not error prone. But if you're going so far as to imply that capability functionality needs to be present ALWAYS, and supported (and considered) in every class or object, then that's going too far. A random module should, for instance, be able to open arbitrary files in the file system without being passed any special objects, UNLESS we do something special when we load it to indicate that we want it to run in a restricted mode. I think that zipfile is a good example here. As a library developer, I should be able to write and distribute a zipfile module without thinking about capabilities or security at all. 
Of course, when others go to use it in a secure or restricted mode, they may find that it isn't as useful as they'd like, but (I believe) we shouldn't say NO ONE can have a zipfile module unless the module author is willing to address security issues. Someone can write securezipfile when they get the itch. Now, if we really built security (via capabilities) into the language from the ground up, then ALL modules would work by being passed appropriate capability objects, and only the starting script would possess all capabilities. There would be no "file" builtin, just file objects (and ReadOnlyFile objects, and DirectorySubTree objects, and so forth) which got passed around. So OF COURSE the original author of zipfile would write it to accept a file at construction rather than allowing it to open files... that would be the natural way to do things. But that language isn't python... and I don't think it's worth changing Python enough to get there. So if you're proposing this drastic a change (which I doubt), then I think it's too drastic. But if you're NOT, then you have to realize that there will be lots of library modules like zipfile, which were written by people who didn't give any thought to security (since it's a rarely-used feature of the language). So we need workarounds (like wrappers or proxies) that can be applied after-the-fact to modules and classes that weren't written with security in mind. If that's "painfully adding something to each class or object", then I don't see how it's to be avoided. -- Michael Chermside From zooko@zooko.com Thu Apr 3 13:29:57 2003 From: zooko@zooko.com (Zooko) Date: Thu, 03 Apr 2003 08:29:57 -0500 Subject: [Python-Dev] Capabilities In-Reply-To: Message from Ben Laurie of "Thu, 03 Apr 2003 11:43:10 +0100." <3E8C103E.90201@algroup.co.uk> References: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> <3E8C103E.90201@algroup.co.uk> Message-ID: (I, Zooko, wrote the lines prepended with "> > ".) 
Ben Laurie wrote: > > > In the capability way of life, it is still the case that access to the ZipFile > > class gives you the ability to open files anywhere in the system! (That is: I'm > > assuming for now that we implement capabilities without re-writing every > > dangerous class in the Library.) ... > It would probably be helpful to explain what you (or, at least, I) would > do if you (I) were writing from scratch, rather then "taming" the > existing libraries. In this case, Zipfile would require a file > capability to be passed to it at construction time, and so would become > non-dangerous, which is, I think, where Guido is coming from. Thank you. You are right about how I would do it, and I think you are right that this fits with Guido's approach, too. I would make the constructor of the ZipFile class take a file object, and hide (at least from unprivileged code) the option of passing a filename to the constructor. This would make it so that no authority is gained by importing the zipfile module. Regards, Zooko http://zooko.com/ ^-- under re-construction: some new stuff, some broken links From ben@algroup.co.uk Thu Apr 3 14:04:27 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Thu, 03 Apr 2003 15:04:27 +0100 Subject: [Python-Dev] Capabilities In-Reply-To: <3E88E2B6.1080409@prescod.net> References: <3E88E2B6.1080409@prescod.net> Message-ID: <3E8C3F6B.8000000@algroup.co.uk> Paul Prescod wrote: > Are DOS issues in scope? How do we prevent untrusted code from just > bringing the interpreter to a halt? A smart enough attacker could even > block all threads in the current process by finding a task that is > usually not time-sliced and making it go on for a very long time. > without looking at the Python implementation, I can't remember an > example off of the top of my head, but perhaps a large multiplication or > search-and-replace in a string. 
It seems to me that this is an issue orthogonal to capabilities (though access to mechanisms that regulate it might well be capability-based). Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From ben@algroup.co.uk Thu Apr 3 14:05:45 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Thu, 03 Apr 2003 15:05:45 +0100 Subject: [Python-Dev] Capabilities In-Reply-To: References: Message-ID: <3E8C3FB9.50101@algroup.co.uk> Ka-Ping Yee wrote: > Hmm, i'm not sure you understood what i meant. The code example i posted > is a solution to the design challenge: "provide read-only access to a > directory and its subdirectories, but no access to the rest of the filesystem". > I'm looking for other security design challenges to tackle in Python. > Once enough of them have been tried, we'll have a better understanding of > what Python would need to do to make secure programming easier. Well, one of the favourites is to create a file selection dialog that will only give access (optionally readonly) to the file designated by the user. This may be rather more than you want to bite off as a working system at this stage, though! It might be a useful thought experiment, though. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From fdrake@acm.org Thu Apr 3 14:40:21 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 3 Apr 2003 09:40:21 -0500 Subject: How do I report a bug? (Re: [Python-Dev] Distutils documentation amputated in 2.2 docs?) In-Reply-To: References: <200304030256.h332uha23381@oma.cosc.canterbury.ac.nz> <16011.45362.723995.488848@grendel.zope.com> Message-ID: <16012.18389.659720.951267@grendel.zope.com> Martin v. 
Löwis writes: > I have now boldified parts of it; this doesn't take make space, but > should increase visibility. I hope it's not considered annoying - feel > free to undo that. Nice! I've made the boldified text a hyperlink to the login page, and copied the text to the patch tracker as well. > If they would allow us to put PHP into that box, we could even > suppress the text if the user was logged in. Hmm. I don't know that they won't, I just don't know the incantation to determine if a user is logged on. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From drifty@alum.berkeley.edu Thu Apr 3 19:05:56 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Thu, 3 Apr 2003 11:05:56 -0800 (PST) Subject: [Python-Dev] python-dev Summary for 2003-03-16 through 2003-03-31 In-Reply-To: References: Message-ID: [Zooko] > But seriously, I feel that capabilities fit with normal Python programming as it > is currently practiced. > The paragraph is gone, so no need to worry about this anymore. -Brett From altis@semi-retired.com Thu Apr 3 19:42:09 2003 From: altis@semi-retired.com (Kevin Altis) Date: Thu, 3 Apr 2003 11:42:09 -0800 Subject: [Python-Dev] fwd: Dan Sugalski on continuations and closures Message-ID: via Simon Willison's blog: http://simon.incutio.com/archive/2003/04/03/#closuresAndContinuations " Thanks to Dan Sugalski (designer of Parrot, the next generation Perl VM) I finally understand what continuations and closures actually are. He explains them as part of a comparison between the forthcoming Parrot and two popular virtual machines already in existence: * (Perl|python|Ruby) on (.NET|JVM) leads in to the explanation. http://www.sidhe.org/~dan/blog/archives/000151.html * The reason for Parrot, part 2 explains closures. http://www.sidhe.org/~dan/blog/archives/000152.html * Continuations and VMs explains continuations.
http://www.sidhe.org/~dan/blog/archives/000156.html * Continuations and VMs, part 2 rounds things off by explaining why the JVM and the CLR are unsuitable environments for supporting these language features. http://www.sidhe.org/~dan/blog/archives/000157.html " ka ps. In order to focus on Python promotion and site-redesign efforts I've suspended delivery of python-dev email in the short-term and will only be scanning the archives as time permits. If you need to flame me, please address your emails to me directly or /dev/null, your choice ;-) From martin@v.loewis.de Thu Apr 3 22:36:49 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 04 Apr 2003 00:36:49 +0200 Subject: [Python-Dev] Re: How do I report a bug? In-Reply-To: <16012.18389.659720.951267@grendel.zope.com> References: <200304030256.h332uha23381@oma.cosc.canterbury.ac.nz> <16011.45362.723995.488848@grendel.zope.com> <16012.18389.659720.951267@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > > If they would allow us to put PHP into that box, we could even > > suppress the text if the user was logged in. > > Hmm. I don't know that they won't, I just don't know the incantation > to determine if a user is logged on. If it's still the same code as in SF 2.5, it is "user_isloggedin()": http://phpxref.sourceforge.net/sourceforge/include/User.class.source.html#l555 As an example usage, see http://phpxref.sourceforge.net/sourceforge/patch/add_patch.php.source.html#l49 Regards, Martin From tim.one@comcast.net Fri Apr 4 04:08:54 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 03 Apr 2003 23:08:54 -0500 Subject: [Python-Dev] Boom Message-ID: While enduring dental implant surgery earlier today, I thought to myself "oops -- I bet this program will crash Python". 
Turns out it does, in current CVS, and almost certainly in every version of Python since cyclic gc was added:

"""
import gc

class C:
    def __getattr__(self, attr):
        del self.attr
        raise AttributeError

a = C()
b = C()
a.attr = b
b.attr = a

del a, b
gc.collect()
"""

Short course: a and b are in a trash cycle. gcmodule's move_finalizers() finds one of them and calls has_finalizer() to see whether it's collectible. Say it's b. has_finalizer() calls (in effect) hasattr(b, "__del__"), and b.__getattr__() deletes b.attr as a side effect before saying b.__del__ doesn't exist. That drops the refcount on a to 0, which in turn drops the refcount on a.__dict__ to 0. Those two are the killers: a and a.__dict__ become untracked (by gc) as part of cleaning them up, but the move_finalizers() "next" local still points to one of them (to the __dict__, in the run I happened to step thru). As a result, the next trip around the move_finalizers() loop calls has_finalizer() on memory that's already been free()ed. Hilarity ensues.

The anesthesia is wearing off and I won't speculate about solutions now. I suspect it's easy, or close to intractable.

PLabs folks, I'm unsure whether this relates to the ZODB test failure we've been bashing away at. All, ZODB is a persistent database, and at one point in this test gc determines that "a ghost" is unreachable. When gc's has_finalizer() asks whether the ghost has a __del__ method, the persistence machinery kicks in, sucking the ghost's state off of disk, and executing a lot of Python code as a result. Part of the Python code executed does appear (if hazy memory serves) to delete some previously unreachable objects that were also in (or hanging off of) the ghost's cycle, and so in the unreachable list gc's move_finalizers() is crawling over. The kind of blowup above could be one bad effect, and Jeremy was seeing blowups with move_finalizers() in the traceback.
Unfortunately, the test doesn't blow up under CVS Python, and 2.2.2 doesn't have the telltale 0xdbdbdbdb filler 2.3's debug PyMalloc sprays into free()ed memory. From tim.one@comcast.net Fri Apr 4 04:37:47 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 03 Apr 2003 23:37:47 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: Message-ID: [jhylton@users.sourceforge.net] > Modified Files: > Tag: release22-maint > gcmodule.c > Log Message: > Fix memory corruption in garbage collection. > ... > The problem with the previous revision is that it followed > gc->gc.gc_next before calling has_finalizer(). If has_finalizer() > gc->happened to deallocate the object FROM_GC(gc->gc.gc_next), then > the next time through the loop gc would point to freed memory. The > fix is to always follow the next pointer after calling > has_finalizer(). Oops! I didn't see this before posting my "Boom" msg. > Note that Python 2.3 does not have this problem, because > has_finalizer() checks the tp_del slot and never runs Python code. That part isn't so, alas: the program I posted in the "Boom" msg crashes 2.3, via the same mechanism: return PyInstance_Check(op) ? PyObject_HasAttr(op, delstr) : PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE) ? op->ob_type->tp_del != NULL : 0; It's the PyInstance_Check(op) path there that's still vulnerable. I'll poke at that. > Tim, Barry, and I peed away the better part of two days tracking this > down. > ! next = gc->gc.gc_next; > if (has_finalizer(op)) { > gc_list_remove(gc); > gc_list_append(gc, finalizers); > gc->gc.gc_refs = GC_MOVED; > } > } > } > --- 277,290 ---- > for (; gc != unreachable; gc=next) { > PyObject *op = FROM_GC(gc); > ! /* has_finalizer() may result in arbitrary Python > ! code being run. 
*/ > if (has_finalizer(op)) { > + next = gc->gc.gc_next; > gc_list_remove(gc); > gc_list_append(gc, finalizers); > gc->gc.gc_refs = GC_MOVED; > } > + else > + next = gc->gc.gc_next; > } > } Are we certain that has_finalizer() can't unlink gc itself from the unreachable list? If it can, then > + else > + next = gc->gc.gc_next; will set next to the content of free()ed memory. In fact, I believe the Boom program will suffer this fate ... yup, it does. "The problem" isn't yet really fixed in any version of Python, although I agree it's a lot better with the change above. From ben@algroup.co.uk Fri Apr 4 10:41:43 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Fri, 04 Apr 2003 11:41:43 +0100 Subject: [Python-Dev] Re: Capabilities (we already got one) In-Reply-To: <1049375371.3e8c328be581d@mcherm.com> References: <1049375371.3e8c328be581d@mcherm.com> Message-ID: <3E8D6167.4020804@algroup.co.uk> Michael Chermside wrote: >>The objection to doing it the other way round is that for capability >>languages to be truly usable the capability functionality needs to be >>automatic, not something that is painfully added to each class or object >>(at least, that is the claim we capability mavens are making). > > > Just how strong a claim are you making here? > > It seems to me that the need for security (via capabilities or any other > mechanism) is an UNUSUAL need. Most programs don't need it at all, > others need it in only a few places. Now don't get me wrong... when you > DO need it, you really need it, and just throwing something together > without explicit language support is somewhere between impossible and > terrifically-difficult-and-error-prone. So supporting secure execution > (via capabilities or whatever) in the language is a great idea. And I > like the capabilities-as-references approach... it's simple, elegant, > and not error prone. 
> > But if you're going so far as to imply that capability functionality > needs to be present ALWAYS, and supported (and considered) in every class > or object, then that's going too far. A random module should, for > instance, be able to open arbitrary files in the file system without > being passed any special objects, UNLESS we do something special when we > load it to indicate that we want it to run in a restricted mode. > > I think that zipfile is a good example here. As a library developer, I > should be able to write and distribute a zipfile module without thinking > about capabilities or security at all. Of course, when others go to use > it in a secure or restricted mode, they may find that it isn't as useful > as they'd like, but (I believe) we shouldn't say NO ONE can have a > zipfile module unless the module author is willing to address security > issues. Someone can write securezipfile when they get the itch. > > Now, if we really built security (via capabilities) into the language > from the ground up, then ALL modules would work by being passed > appropriate capability objects, and only the starting script would > possess all capabilities. There would be no "file" builtin, just file > objects (and ReadOnlyFile objects, and DirectorySubTree objects, and > so forth) which got passed around. So OF COURSE the original author > of zipfile would write it to accept a file at construction rather than > allowing it to open files... that would be the natural way to do things. > But that language isn't python... and I don't think it's worth changing > Python enough to get there. > > So if you're proposing this drastic a change (which I doubt), then I > think it's too drastic. But if you're NOT, then you have to realize > that there will be lots of library modules like zipfile, which were > written by people who didn't give any thought to security (since it's > a rarely-used feature of the language). 
So we need workarounds (like > wrappers or proxies) that can be applied after-the-fact to modules and > classes that weren't written with security in mind. If that's > "painfully adding something to each class or object", then I don't see > how it's to be avoided. I am completely in agreement. Taming of existing modules is inevitably going to be somewhat painful - and, in some cases, it may be less painful to simply rewrite them. As you suspect, what I am proposing is that _when_ a programmer wishes to use capabilities as a security mechanism, it is desirable to make that as easy to use as possible. I'm not sure I agree that the need for security is particularly unusual but I don't think its worth having a big argument about. I certainly do agree that crippling Python in order to get capabilities is not a desirable outcome. Not that I have that option anyway :-) Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From ping@zesty.ca Fri Apr 4 12:28:18 2003 From: ping@zesty.ca (Ka-Ping Yee) Date: Fri, 4 Apr 2003 06:28:18 -0600 (CST) Subject: [Python-Dev] Re: Capabilities (we already got one) In-Reply-To: <3E8D6167.4020804@algroup.co.uk> Message-ID: Michael Chermside wrote: > It seems to me that the need for security (via capabilities or any other > mechanism) is an UNUSUAL need. Most programs don't need it at all, > others need it in only a few places. I think you are missing the point somewhat. Security is about making sure your program will do what you expect. So it is just as much about avoiding bugs as about thwarting malicious agents. Programming in a capability style makes programs more reliable and bugs less damaging. 
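A small illustration of that style, as a sketch only: count_members is a made-up helper, and the example relies on zipfile.ZipFile accepting an already-open file object as well as a path. The function is handed exactly the one file it may read, rather than a name it then opens for itself.

```python
import io
import zipfile

def count_members(archive_file):
    """Count the entries of a zip archive given an open, readable file object.

    The caller decides which file this code may touch; the function never
    calls open() on a path of its own choosing.
    """
    with zipfile.ZipFile(archive_file) as zf:
        return len(zf.namelist())

# Build a tiny archive in memory so the example needs no real files.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "alpha")
    zf.writestr("b.txt", "beta")
buf.seek(0)

member_count = count_members(buf)
print(member_count)  # 2
```

Nothing in today's Python prevents count_members from importing os and opening other files, so this shows the design habit only, not an enforced restriction.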
Colleagues of mine have established the habit of programming in a capability style in Java -- not because Java supports capabilities, and not because they need security at all, but just because programming *as if* the language had capabilities leads to a better modular design. On Fri, 4 Apr 2003, Ben Laurie wrote: > I'm not sure I agree that the need for security is particularly unusual > but I don't think its worth having a big argument about. I certainly do > agree that crippling Python in order to get capabilities is not a > desirable outcome. Not that I have that option anyway :-) I also prefer to avoid loaded language. No one is talking about "crippling" anything. The essence of a capability model is simply to be explicit when authority is transferred. Explicit is better than implicit. -- ?!ng From jeremy@zope.com Fri Apr 4 16:46:32 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 04 Apr 2003 11:46:32 -0500 Subject: [Python-Dev] Re: [PythonLabs] Re: [Python-checkins] python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: References: Message-ID: <1049474792.14151.85.camel@slothrop.zope.com> On Thu, 2003-04-03 at 23:37, Tim Peters wrote: > > ! next = gc->gc.gc_next; > > if (has_finalizer(op)) { > > gc_list_remove(gc); > > gc_list_append(gc, finalizers); > > gc->gc.gc_refs = GC_MOVED; > > } > > } > > } > > --- 277,290 ---- > > for (; gc != unreachable; gc=next) { > > PyObject *op = FROM_GC(gc); > > ! /* has_finalizer() may result in arbitrary Python > > ! code being run. */ > > if (has_finalizer(op)) { > > + next = gc->gc.gc_next; > > gc_list_remove(gc); > > gc_list_append(gc, finalizers); > > gc->gc.gc_refs = GC_MOVED; > > } > > + else > > + next = gc->gc.gc_next; > > } > > } > > Are we certain that has_finalizer() can't unlink gc itself from the > unreachable list? If it can, then > > > + else > > + next = gc->gc.gc_next; > > will set next to the content of free()ed memory. In fact, I believe the > Boom program will suffer this fate ... yup, it does. 
"The problem" isn't > yet really fixed in any version of Python, although I agree it's a lot > better with the change above. It looks like it's hard to find a place to stand. Since arbitrary Python code can run, then an arbitrary set of objects in the unreachable list can suddenly become unlinked. The previous, current, and next objects are all suspect. I think a safe approach would be to move everything out of unreachable and into either "collectable" or "finalizers". That way, we can do a while (!gc_list_is_empty(unreachable)) loop and always deal with the head of the unreachable list. Each time through the loop, the head of the list can be moved to collectable or finalizers or become unlinked, so we always make progress. Sound plausible? Jeremy From jeremy@zope.com Fri Apr 4 17:39:16 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 04 Apr 2003 12:39:16 -0500 Subject: [Python-Dev] Re: [PythonLabs] Re: [Python-checkins] python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: <1049474792.14151.85.camel@slothrop.zope.com> References: <1049474792.14151.85.camel@slothrop.zope.com> Message-ID: <1049477956.14152.93.camel@slothrop.zope.com> On Fri, 2003-04-04 at 11:46, Jeremy Hylton wrote: > I think a safe approach would be to move everything out of unreachable > and into either "collectable" or "finalizers". That way, we can do a > while (!gc_list_is_empty(unreachable)) loop and always deal with the > head of the unreachable list. Each time through the loop, the head of > the list can be moved to collectable or finalizers or become unlinked, > so we always make progress. > > Sound plausible? Yes. I've got a patch that fixes the boom case, but I'm not sure I've handled the case where the object becomes reachable as a result of running PyObject_HasAttr(). I'll post after testing that. 
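A rough sketch of the head-of-list loop described above, written as a toy Python model and not as the C in gcmodule.c. The names drain_unreachable, Node and nasty_has_finalizer, and the use of a plain list in place of the gc linked list, are assumptions made for illustration. What it tries to show is that the loop never keeps a "next" pointer across the has_finalizer() call; it only ever looks at the current head, so a callback that unlinks other entries cannot leave it pointing at stale memory.

```python
def drain_unreachable(unreachable, has_finalizer):
    """Toy model: move every entry out of `unreachable`.

    `has_finalizer(obj, unreachable)` stands in for the C function of the
    same name. Like the real one it may run arbitrary code, which this
    model allows to remove any entries (even `obj`) from `unreachable`.
    """
    collectable, finalizers = [], []
    while unreachable:                  # like !gc_list_is_empty(unreachable)
        head = unreachable[0]           # always re-read the head
        result = has_finalizer(head, unreachable)
        # Only move the head if the callback did not unlink it.
        if unreachable and unreachable[0] is head:
            unreachable.pop(0)
            (finalizers if result else collectable).append(head)
        # Otherwise the list already shrank, so progress was still made.
    return collectable, finalizers


class Node:
    def __init__(self, name, finalizer=False):
        self.name = name
        self.finalizer = finalizer


def nasty_has_finalizer(obj, unreachable):
    # Side effect: asking about "b" unlinks "a" behind the loop's back.
    if obj.name == "b":
        unreachable[:] = [o for o in unreachable if o.name != "a"]
    return obj.finalizer


nodes = [Node("b"), Node("a"), Node("c", finalizer=True)]
collectable, finalizers = drain_unreachable(list(nodes), nasty_has_finalizer)
print([n.name for n in collectable], [n.name for n in finalizers])
# ['b'] ['c']
```

The model does not cover an object becoming reachable again as a result of the callback, which is the case still being tested above.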
Jeremy From jeremy@zope.com Fri Apr 4 18:26:11 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 04 Apr 2003 13:26:11 -0500 Subject: [Python-Dev] Re: [PythonLabs] Re: [Python-checkins] python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: <1049477956.14152.93.camel@slothrop.zope.com> References: <1049474792.14151.85.camel@slothrop.zope.com> <1049477956.14152.93.camel@slothrop.zope.com> Message-ID: <1049480770.14146.95.camel@slothrop.zope.com> On Fri, 2003-04-04 at 12:39, Jeremy Hylton wrote: > On Fri, 2003-04-04 at 11:46, Jeremy Hylton wrote: > > I think a safe approach would be to move everything out of unreachable > > and into either "collectable" or "finalizers". That way, we can do a > > while (!gc_list_is_empty(unreachable)) loop and always deal with the > > head of the unreachable list. Each time through the loop, the head of > > the list can be moved to collectable or finalizers or become unlinked, > > so we always make progress. > > > > Sound plausible? > > Yes. I've got a patch that fixes the boom case, but I'm not sure I've > handled the case where the object becomes reachable as a result of > running PyObject_HasAttr(). I'll post after testing that. It's SF patch 715446. There's a lingering problem with test_gc, but I hope it's tractable. Jeremy From jeremy@zope.com Fri Apr 4 20:15:51 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 04 Apr 2003 15:15:51 -0500 Subject: [Python-Dev] Re: [PythonLabs] Re: [Python-checkins] python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: <1049480770.14146.95.camel@slothrop.zope.com> References: <1049474792.14151.85.camel@slothrop.zope.com> <1049477956.14152.93.camel@slothrop.zope.com> <1049480770.14146.95.camel@slothrop.zope.com> Message-ID: <1049487350.14146.101.camel@slothrop.zope.com> We've got the first version of boom nailed, but we've got the same problem in handle_finalizers(). The version of boom below doesn't blow up until the second time the has_finalizer() is called. 
I don't understand the logic in handle_finalizers(), though. If the objects
are all in the finalizers list, why do we call has_finalizer() a second
time? Shouldn't everything have a finalizer at that point?

Jeremy

import gc

class C:

    def __init__(self):
        self.x = 0

    def delete(self):
        print "never called"

    def __getattr__(self, attr):
        self.x += 1
        print self.x
        if self.x > 1:
            del self.attr
        else:
            return self.delete
        raise AttributeError

a = C()
b = C()
a.attr = b
b.attr = a

del a, b
print gc.collect()

From tim_one@email.msn.com Sat Apr 5 08:15:40 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 5 Apr 2003 03:15:40 -0500
Subject: [Python-Dev] Re: [PythonLabs] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: <1049487350.14146.101.camel@slothrop.zope.com>
Message-ID:

[Jeremy Hylton]
> We've got the first version of boom nailed, but we've got the same
> problem in handle_finalizers(). The version of boom below doesn't blow
> up until the second time the has_finalizer() is called.
>
> I don't understand the logic in handle_finalizers(), though. If the
> objects are all in the finalizers list, why do we call has_finalizer() a
> second time? Shouldn't everything have a finalizer at that point?

Nope -- the parenthetical

    /* Handle uncollectable garbage (cycles with finalizers). */

comment is incomplete. The earlier call to move_finalizer_reachable() also
put everything reachable only *from* trash cycles with finalizers into the
list. So, e.g., if the trash graph is like

    A<->B->C

and A has a finalizer but B and C don't, they're all in the finalizers list
(at this point) regardless. But B and C aren't stopping the blob from
getting collected, and we're trying to do the user a favor by putting only A
(the troublemaker) into gc.garbage. It's an approximation, though. For
example, if A and C both had finalizers, A and C would both be put into
gc.garbage, even though C's finalizer isn't stopping anything from getting
collected.
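To make the A<->B->C case concrete, here is a toy graph model of the two steps just described. It is not the gcmodule.c code: reachable_from and garbage_candidates are hypothetical names, and the trash graph is represented as a plain dict of edges.

```python
def reachable_from(roots, edges):
    """Return every node reachable from `roots` (the roots included)."""
    seen, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def garbage_candidates(edges, with_finalizer):
    """Model the two steps described above.

    1. The "finalizers" list ends up holding the objects that have a
       finalizer plus everything reachable from them.
    2. Only the members that really have a finalizer are reported
       (the real collector appends those to gc.garbage).
    """
    finalizers_list = reachable_from(with_finalizer, edges)
    reported = sorted(n for n in finalizers_list if n in with_finalizer)
    return sorted(finalizers_list), reported


# Trash graph A<->B->C
edges = {"A": ["B"], "B": ["A", "C"], "C": []}

only_a = garbage_candidates(edges, {"A"})
a_and_c = garbage_candidates(edges, {"A", "C"})
print(only_a)   # (['A', 'B', 'C'], ['A'])
print(a_and_c)  # (['A', 'B', 'C'], ['A', 'C'])
```

The second call shows the approximation: C is reported although its finalizer is not what keeps the cycle from being collected.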
The comments are apparently a bit out of synch with the code, because 17
months ago all instance objects in the finalizers list were put into
gc.garbage (regardless of whether they had __del__). The checkin comment for
rev 2.28 sez the __del__ change was needed to fix a bug; but I'm too groggy
to dig more now.

From tim_one@email.msn.com Sat Apr 5 19:34:36 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 5 Apr 2003 14:34:36 -0500
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: <1049487350.14146.101.camel@slothrop.zope.com>
Message-ID:

I checked in some more changes (2.3 head only). This kind of program may be
intractable:

"""
class C:
    def __getattr__(self, attribute):
        global alist
        if 'attr' in self.__dict__:
            alist.append(self.attr)
            del self.attr
        raise AttributeError

import gc
gc.collect()
a = C()
b = C()
alist = []
a.attr = b
b.attr = a
a.x = 1
b.x = 2

del a, b

# Oops. This prints 4: it's collecting
# a, b, and their dicts.
print gc.collect()

# Despite that __getattr__ resurrected them.
print alist

# But gc cleared their dicts.
print alist[0].__dict__
print alist[1].__dict__

# So a.x and b.x fail.
print alist[0].x, alist[1].x
"""

While a __getattr__ side effect may resurrect an object in gc's unreachable
list, gc has no way to know that an object has been resurrected short of
starting over again. In the absence of that, the object remains in gc's
unreachable list, and its tp_clear slot eventually gets called. The internal
C stuff remains self-consistent, so this won't cause a segfault (etc), but
it may (as above) be surprising. I don't see a sane way to fix this so long
as asking whether __del__ exists can execute arbitrary mounds of Python
code.
From exarkun@intarweb.us Sat Apr 5 19:35:31 2003
From: exarkun@intarweb.us (Jp Calderone)
Date: Sat, 5 Apr 2003 14:35:31 -0500
Subject: [Python-Dev] Placement of os.fdopen functionality
Message-ID: <20030405193531.GA23455@meson.dyndns.org>

It occurred to me this afternoon (after answering a question about creating
file objects from file descriptors) that perhaps os.fdopen would be more
logically placed someplace else - of course it could also remain as
os.fdopen() for whatever deprecation period is warranted. Perhaps as a class
method of the file type, file.fromfd()?

Should I file a feature request for this on sf, or would it be considered
too much of a mindless twiddle to bother with?

Jp

--
http://catandgirl.com/view.cgi?44
--
up 16 days, 16:00, 5 users, load average: 1.13, 0.93, 0.85

From martin@v.loewis.de Sat Apr 5 20:34:13 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 05 Apr 2003 22:34:13 +0200
Subject: [Python-Dev] Placement of os.fdopen functionality
In-Reply-To: <20030405193531.GA23455@meson.dyndns.org>
References: <20030405193531.GA23455@meson.dyndns.org>
Message-ID:

Jp Calderone writes:

> Perhaps as a class method of the file type, file.fromfd()?
>
> Should I file a feature request for this on sf, or would it be considered
> too much of a mindless twiddle to bother with?

Feel free to file a feature request, but I'd predict that it might sit there
for some years until it is closed because of no action. OTOH, if you would
produce a patch implementing the feature, it might get attention.
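For readers who have not used it, a minimal sketch of what os.fdopen() does. It is written for a current Python 3 interpreter rather than the 2.x of this thread, and file.fromfd() is only the name proposed above, so it is not used here.

```python
import os

# os.pipe() returns two raw integer file descriptors.
read_fd, write_fd = os.pipe()

# os.fdopen() wraps an existing descriptor in an ordinary file object.
with os.fdopen(write_fd, "w") as writer:
    writer.write("hello from a descriptor\n")

with os.fdopen(read_fd, "r") as reader:
    text = reader.read()

print(text, end="")
```

Closing the file object also closes the underlying descriptor, which is why neither os.close() call appears.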
Regards, Martin From tim.one@comcast.net Sun Apr 6 00:05:21 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 05 Apr 2003 19:05:21 -0500 Subject: [Python-Dev] Re: [PythonLabs] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: <1049487350.14146.101.camel@slothrop.zope.com> Message-ID: [Jeremy Hylton] > We've got the first version of boom nailed, but we've got the same > problem in handle_finalizers(). The version of boom below doesn't blow > up until the second time the has_finalizer() is called. It isn't really necessary to call has_finalizer() a second time, and I'll check in changes so that it doesn't anymore (assuming the test suite passes -- it's running as I type this). > I don't understand the logic in handle_finalizers(), though. If the > objects are all in the finalizers list, why do we call has_finalizer() a > second time? Shouldn't everything has a finalizer at that point? I tried to explain that last night. The essence of the changes I have pending is to make move_finalizer_reachable() move the tentatively unreachable objects reachable only from finalizers into a new & distinct list, reachable_from_finalizers. After that, everything in finalizers has a finalizer and nothing in reachable_from_finalizers does, so we don't have to call has_finalizer() again. Before, finalizers contained everything in both (finalizers and reachable_from_finalizers) lists, so another has_finalizer() call on each object was needed to distinguish the two kinds (has a finalizer, doesn't have a finalizer) of objects again. > import gc > > class C: > > def __init__(self): > self.x = 0 > > def delete(self): > print "never called" > > def __getattr__(self, attr): > self.x += 1 > print self.x > if self.x > 1: > del self.attr > else: > return self.delete > raise AttributeError > > a = C() > b = C() > a.attr = b > b.attr = a > > del a, b > print gc.collect() I also added a non-printing variant of this to test_gc. 
In the new world, the "del self.attr" bits never get called, so this is just
a vanilla trash cycle now.

From jeremy@alum.mit.edu Sun Apr 6 03:02:04 2003
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: 05 Apr 2003 21:02:04 -0500
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To:
References:
Message-ID: <1049594522.24643.57.camel@localhost.localdomain>

On Sat, 2003-04-05 at 14:34, Tim Peters wrote:
> While a __getattr__ side effect may resurrect an object in gc's unreachable
> list, gc has no way to know that an object has been resurrected short of
> starting over again. In the absence of that, the object remains in gc's
> unreachable list, and its tp_clear slot eventually gets called. The
> internal C stuff remains self-consistent, so this won't cause a segfault
> (etc), but it may (as above) be surprising. I don't see a sane way to fix
> this so long as asking whether __del__ exists can execute arbitrary mounds
> of Python code.

I think I'll second the thought that there are no satisfactory answers here.
We've made a big step forward by fixing the core dumps.

If we want to document the current behavior, we would say that garbage
collection may leave reachable objects in an "invalid state" in the presence
of "problematic objects." A "problematic object" is an instance of a classic
class that defines a getattr hook (__getattr__) but not a finalizer
(__del__). An object in an "invalid state" has had its tp_clear slot
executed; in the case of instances, this means the __dict__ will be empty.
Specifically, if a problematic object is part of an unreachable cycle, the
garbage collector will execute the code in its getattr hook; if executing
that code makes any object in the cycle reachable again, it will be left in
an invalid state.

If we document this for 2.2, it's more complicated because instances of
new-style classes are also affected.
What's worse, a new-style class with a __getattribute__ hook is affected
regardless of whether it has a finalizer.

Here are a couple of thoughts about how to avoid leaving objects in an
invalid state. It's pretty unlikely for it to happen, but speaking from
experience it's baffling when it does.

#1. (I think this was Fred's suggestion on Friday.) Don't do a hasattr()
check on the object, do it on the class. This is what happens with new-style
classes in Python 2.3: If a new-style class doesn't define an __del__
method, then its instances don't have a finalizer. It doesn't matter whether
the specific instance has an __del__ attribute.

Limitations: This is a change in semantics, although it only covers a nearly
insane corner case. The other limitation is that things could still go
wrong, although only in the presence of a classic metaclass!

#2. If an object has a getattr hook and it's involved in a cycle, just put
it in gc.garbage. Forget about checking for a finalizer. That seems fine for
2.3, since we're only talking about classic classes with getattr hooks. But
it doesn't sound very pleasant for 2.2, since it covers any class instance
with a getattr hook.

I think #1 is pretty reasonable. I'd like to see something fixed for 2.2.3,
but I worry that the semantic change may be unacceptable for a bug fix
release. (But maybe not, the semantics are pretty insane right now :-).

Jeremy

From jim@zope.com Sun Apr 6 12:07:44 2003
From: jim@zope.com (Jim Fulton)
Date: Sun, 06 Apr 2003 07:07:44 -0400
Subject: [PythonLabs] Re: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To:
References:
Message-ID: <3E900A80.3010802@zope.com>

Tim Peters wrote:
...
> While a __getattr__ side effect may resurrect an object in gc's unreachable
> list, gc has no way to know that an object has been resurrected short of
> starting over again.
In the absence of that, the object remains in gc's > unreachable list, and its tp_clear slot eventually gets called. The > internal C stuff remains self-consistent, so this won't cause a segfault > (etc), but it may (as above) be surprising. I don't see a sane way to fix > this so long as asking whether __del__ exists can execute arbitrary mounds > of Python code. If I understand the problem, it can be avoided by avoiding old-style classes. Maybe it's time to, at least optionally, cause a warning when old-style classes are used. :) I'm not kidding for Zope. I think it might be worth-while to be issue such a warning in Zope. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From skip@mojam.com Sun Apr 6 13:00:22 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 6 Apr 2003 07:00:22 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200304061200.h36C0MU07870@manatee.mojam.com> Bug/Patch Summary ----------------- 384 open / 3510 total bugs (+7) 136 open / 2062 total patches (no change) New Bugs -------- test_zipimport failing on ia64 (at least) (2003-03-30) http://python.org/sf/712322 Cannot change the class of a list (2003-03-31) http://python.org/sf/712975 test_pty fails on HP-UX and AIX when run after test_openpty (2003-03-31) http://python.org/sf/713169 site.py breaks if prefix is empty (2003-04-01) http://python.org/sf/713601 Distutils documentation amputated (2003-04-01) http://python.org/sf/713722 cPickle fails to pickle inf (2003-04-03) http://python.org/sf/714733 bsddb.first()/next() raise undocumented exception (2003-04-03) http://python.org/sf/715063 pydoc support for keywords (2003-04-05) http://python.org/sf/715782 Minor nested scopes doc issues (2003-04-06) http://python.org/sf/716168 New Patches ----------- Bug fix 548176: urlparse('http://foo?blah') errs (2003-03-30) http://python.org/sf/712317 sre fixes for lastindex and 
minimizing repeats+assertions (2003-03-31) http://python.org/sf/712900 Fixes for 'commands' module on win32 (2003-04-01) http://python.org/sf/713428 rfc822.parsedate returns a tuple (2003-04-01) http://python.org/sf/713599 freeze fails when extensions_win32.ini is missing (2003-04-01) http://python.org/sf/713645 iconv_codec NG (2003-04-02) http://python.org/sf/713820 Unicode Codecs for CJK Encodings (2003-04-02) http://python.org/sf/713824 Guard against segfaults in debug code (2003-04-02) http://python.org/sf/714348 timeouts for FTP connect (and other supported ops) (2003-04-03) http://python.org/sf/714592 Document freeze process in PC/config.c (2003-04-03) http://python.org/sf/714957 Closed Bugs ----------- locale.getpreferredencoding fails on AIX (2003-01-31) http://python.org/sf/678259 configure option --enable-shared make problems (2003-03-11) http://python.org/sf/701823 -i -u options give SyntaxError on Windows (2003-03-21) http://python.org/sf/707576 Closed Patches -------------- sgmllib support for additional tag forms (2002-04-17) http://python.org/sf/545300 posixfy some things (2002-12-08) http://python.org/sf/650412 Add missing constants for IRIX al module (2003-01-13) http://python.org/sf/667548 Py_Main() removal of exit() calls. 
Return value instead (2003-01-21) http://python.org/sf/672053 fix for bug 672614 :) (2003-02-28) http://python.org/sf/695250 Wrong prototype for PyUnicode_Splitlines on documentation (2003-03-11) http://python.org/sf/701395 more apply removals (2003-03-11) http://python.org/sf/701494 Fix a few broken links in pydoc (2003-03-19) http://python.org/sf/706338 Adds Mock Object support to unittest.TestCase (2003-03-19) http://python.org/sf/706590 Make "%c" % u"a" work (2003-03-26) http://python.org/sf/710127 Backport to 2.2.2 of codec registry fix (2003-03-27) http://python.org/sf/710576 Obsolete comment in urlparse.py (2003-03-30) http://python.org/sf/712124 From nas@python.ca Sun Apr 6 19:43:21 2003 From: nas@python.ca (Neil Schemenauer) Date: Sun, 6 Apr 2003 11:43:21 -0700 Subject: [PythonLabs] Re: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: <3E900A80.3010802@zope.com> References: <3E900A80.3010802@zope.com> Message-ID: <20030406184320.GA14894@glacier.arctrix.com> Jim Fulton wrote: > Maybe it's time to, at least optionally, cause a warning when > old-style classes are used. :) I'm not kidding for Zope. I think it > might be worth-while to be issue such a warning in Zope. A command line option that enabled new-style classes by default may be a good idea (suggested to me by AMK at PyCon). Neil From barry@python.org Sun Apr 6 23:03:32 2003 From: barry@python.org (Barry Warsaw) Date: 06 Apr 2003 18:03:32 -0400 Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6 In-Reply-To: <1049594522.24643.57.camel@localhost.localdomain> References: <1049594522.24643.57.camel@localhost.localdomain> Message-ID: <1049666611.9026.3.camel@geddy> On Sat, 2003-04-05 at 21:02, Jeremy Hylton wrote: > #1. (I think this was Fred's suggestion on Friday.) Don't do a > hasattr() check on the object, do it on the class. 
This is what happens
> with new-style classes in Python 2.3: If a new-style class doesn't
> define an __del__ method, then its instances don't have a finalizer. It
> doesn't matter whether the specific instance has an __del__ attribute.

FWIW, IIRC Jython does something vaguely like this. Actually the existence
of __del__ is checked at class creation time because it's expensive to call
__del__ when the object is Java gc'd, and we use two different Java classes
for classic class instances depending on whether it had a __del__ or not.
This means you can't add __del__ to the class or the instance after the
class is defined.

Personally I think this is reasonable and I don't recall this biting anyone
when I was working on Jython.

-Barry

From tim_one@email.msn.com Mon Apr 7 01:47:53 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 6 Apr 2003 20:47:53 -0400
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modulesgcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: <1049594522.24643.57.camel@localhost.localdomain>
Message-ID:

[Jeremy Hylton]
> I think I'll second the thought that there are no satisfactory answers
> here. We've made a big step forward by fixing the core dumps.
>
> If we want to document the current behavior, we would say that garbage
> collection may leave reachable objects in an "invalid state" in the
> presence of "problematic objects." A "problematic object" is an
> instance of a classic class that defines a getattr hook (__getattr__)
> but not a finalizer (__del__). An object in an "invalid state" has had
> its tp_clear slot executed; in the case of instances, this means the
> __dict__ will be empty. Specifically, if a problematic object is part
> of an unreachable cycle, the garbage collector will execute the code in
> its getattr hook; if executing that code makes any object in the cycle
> reachable again, it will be left in an invalid state.

I expect that documenting it comprehensibly is impossible.
For example, the referent of "it" in your last sentence is unclear, and hard
to flesh out. A problematic object doesn't need to be part of a cycle to
cause problems, and when it does cause problems the things that end up in an
unexpected state needn't be part of cycles either. It's more that the
problematic object needs to be reachable only from an unreachable cycle (the
unreachable cycle needn't contain problematic objects), and then it's all
the objects reachable only from the unreachable cycle and from the
problematic object that may be in trouble (and regardless of whether they're
in cycles).

Here's a concrete example, where the instance of the problematic D isn't in
a cycle, and neither are the list or the dict that get magically cleared
(.mylist and .mydict) despite being resurrected:

"""
class C:
    pass

class D:
    def __init__(self):
        self.mydict = {'a': 1, 'b': 2}
        self.mylist = range(100)

    def __getattr__(self, attribute):
        global alist
        if attribute == "__del__":
            alist.append(self.mydict)
            alist.append(self.mylist)
        raise AttributeError

import gc
gc.collect()
a = C()
a.loop = a          # make a cycle
a.d_instance = D()  # an instance of D hangs *off* the cycle
alist = []

del a

print gc.collect()  # 6: a, a.d_instance, their __dicts__, and D()'s
                    # mydict and mylist
print alist         # [{}, []]
"""

If we had enough words to explain that, it still wouldn't be enough, because
the effect of calling tp_clear isn't defined by the language for any type.
If, for example, D also defined a .mytuple attr and resurrected it in
__getattr__, the user would see that *that* one survived OK (tuples happen
to have a NULL tp_clear slot).

> If we document this for 2.2, it's more complicated because instances of
> new-style classes are also affected. What's worse, a new-style class
> with a __getattribute__ hook is affected regardless of whether it has a
> finalizer.

In 2.2 but not 2.3, right? I haven't tried anything with __getattribute__.
For that matter, in my own Python programming, I've never even defined a __getattr__ method -- I spend most of my life tracking down bugs in things I don't use . > Here are a couple of thoughts about how to avoid leaving objects in an > invalid state. I'd much rather pursue that than write docs nobody will understand. > It's pretty unlikely for it to happen, but speaking from > experience it's baffling when it does. > > #1. (I think this was Fred's suggestion on Friday.) Don't do a > hasattr() check on the object, do it on the class. This is what happens > with new-style classes in Python 2.3: If a new-style class doesn't > define an __del__ method, then its instances don't have finalizer. It > doesn't matter whether the specific instance has an __del__ attribute. > > Limitations: This is a change in semantics, although it only covers a > nearly insane corner case. The other limitation is that things could > still go wrong, although only in the presence of a classic metaclass! I'm not sure I followed the last sentence. If I did, screw calling hasattr() -- do a string lookup for "__del__" in the classic class's __dict__, and that's it. Anything that ends up executing arbitrary Python code is going to leave holes. > #2. If an object has a getattr hook and it's involved in a cycle, just > put it in gc.garbage. Forget about checking for a finalizer. That > seems fine for 2.3, since we're only talking about classic classes with > getattr hooks. But it doesn't sound very pleasant for 2.2, since it > covers an class instance with a getattr hook. I'd like to avoid expanding the definition of what ends up in gc.garbage. The relationship to __del__ and unreachable cycles is explainable now, modulo the __getattr__ insanity. Getting rid of the latter is a lot more attractive than folding it into the former. > I think #1 is pretty reasonable. I'd like to see something fixed for > 2.2.3, but I worry that the semantic change may be unacceptable for a > bug fix release. 
> (But maybe not, the semantics are pretty insane right
> now :-).

I have no problem with changing this for 2.2.3. I doubt any Python app will be affected, except possibly to rid 1 in 10,000 of a subtle bug. There's certainly no defensible app that relied on Python segfaulting here, and I can't imagine any relying on containers getting magically cleared at unpredictable times.

BTW, I'm still wondering why the ZODB thread test failed the way it did for Tres and Barry and me: you saw corrupt gc lists, but the rest of us never did. We saw a Connection instance with a mysteriously cleared __dict__. That's consistent with the __getattr__-hook-resurrects-an-object-reachable-only-from-an-unreachable-cycle examples I posted, but did you guys figure out on Friday whether that's what was actually happening? The corrupt-gc-lists symptom was explained by the __getattr__ hook deleting unreachable objects while gc was still crawling over them, and that's a different (albeit related) problem than __dicts__ getting cleared by magic.

From greg@cosc.canterbury.ac.nz Mon Apr 7 01:54:20 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 07 Apr 2003 12:54:20 +1200 (NZST)
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To:
Message-ID: <200304070054.h370sK814932@oma.cosc.canterbury.ac.nz>

> I don't see a sane way to fix this so long as asking whether __del__
> exists can execute arbitrary mounds of Python code.

This further confirms my opinion that __del__ methods are evil, and the language would be the better for their complete removal.

Failing that, perhaps they should be made a bit less dynamic, so that the GC can make reasonable assumptions about their existence without having to execute Python code.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Apr 7 01:56:35 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 07 Apr 2003 12:56:35 +1200 (NZST)
Subject: [Python-Dev] Placement of os.fdopen functionality
In-Reply-To: <20030405193531.GA23455@meson.dyndns.org>
Message-ID: <200304070056.h370uZc14935@oma.cosc.canterbury.ac.nz>

Jp Calderone:

> perhaps os.fdopen would be more logically placed someplace else -
> Perhaps as a class method of the file type, file.fromfd()?

Not all OSes have the notion of a file descriptor, which is probably why it's in the os module.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Mon Apr 7 02:04:39 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 07 Apr 2003 13:04:39 +1200 (NZST)
Subject: [PythonLabs] Re: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: <3E900A80.3010802@zope.com>
Message-ID: <200304070104.h3714df15005@oma.cosc.canterbury.ac.nz>

> Maybe it's time to, at least optionally, cause a warning when
> old-style classes are used. :)

You might want to, er, make an exception for subclasses of Exception (you still don't get any choice there, right?)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tim_one@email.msn.com Mon Apr 7 02:11:10 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 6 Apr 2003 21:11:10 -0400
Subject: [PythonLabs] Re: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To:
Message-ID:

[Jim Fulton]
> If I understand the problem, it can be avoided by avoiding
> old-style classes.

In Python 2.3, that appears to be true. In Python 2.2.2, not true. The problems are caused by __getattr__ hooks that resurrect unreachable objects, and/or remove the last reference to an unreachable object, when such a hook is on an instance reachable only from an unreachable cycle, and the class doesn't explicitly define a __del__ method, and the class has a getattr hook, and the getattr hook does extreme things instead of just saying "no, there's no __del__ here".

Python 2.3 introduced new machinery for new-style classes specifically aimed at answering the "does it support __del__?" question without invoking getattr hooks, and that's why it's not a problem for new-style classes in 2.3. New-style classes still go thru getattr hooks to answer this question in 2.2.2.

There were problems in Python and problems in Zope here. Jeremy fixed the Zope problems under 2.2 by breaking the

    and the getattr hook does extreme things instead of just saying
    "no, there's no __del__ here"

link of the chain for persistent objects.

> Maybe it's time to, at least optionally, cause a warning when
> old-style classes are used. :) I'm not kidding for Zope. I think it
> might be worth-while to issue such a warning in Zope.

There may be good reasons for wanting that, but none raised in this thread so far are relevant (unless 2.3 is mandated for Zope, which I'm sure we don't want to do).
From tim_one@email.msn.com Mon Apr 7 02:30:56 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 6 Apr 2003 21:30:56 -0400
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: <200304070054.h370sK814932@oma.cosc.canterbury.ac.nz>
Message-ID:

[Greg Ewing]
> This further confirms my opinion that __del__ methods are evil, and
> the language would be the better for their complete removal.

They sure create more than their share of implementation headaches, so don't fare well on the "if the implementation is hard to explain, it's a bad idea" scale.

> Failing that, perhaps they should be made a bit less dynamic, so that
> the GC can make reasonable assumptions about their existence without
> having to execute Python code.

Guido already did so for new-style classes in Python 2.3. That machinery doesn't exist in 2.2.2, and old-style classes remain a problem under 2.3 too. Backward compatibility constrains how much we can get away with, of course.

From jeremy@alum.mit.edu Mon Apr 7 04:45:05 2003
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: 06 Apr 2003 23:45:05 -0400
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modulesgcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To:
References:
Message-ID: <1049687104.1383.27.camel@localhost.localdomain>

On Sun, 2003-04-06 at 20:47, Tim Peters wrote:
> BTW, I'm still wondering why the ZODB thread test failed the way it did for
> Tres and Barry and me: you saw corrupt gc lists, but the rest of us never
> did. We saw a Connection instance with a mysteriously cleared __dict__.
> That's consistent with the __getattr__-hook-resurrects-an-
> object-reachable-only-from-an-unreachable-cycle examples I posted, but did
> you guys figure out on Friday whether that's what was actually happening?
> The corrupt-gc-lists symptom was explained by the __getattr__ hook deleting
> unreachable objects while gc was still crawling over them, and that's a
> different (albeit related) problem than __dicts__ getting cleared by magic.

[Note to everyone else, there's a lot of ZODB-specific detail in the answer. It might not be that interesting beyond ZODB developers.]

The __getattr__ code in ZODB made a large cycle of objects reachable again. The __getattr__ hook called a method on a ZODB Connection and the Connection registered itself with the current transaction (basically, a global resource). Then the Connection got tp_cleared by the garbage collector. Now the Connection is a zombie but it's also registered with a transaction. When the transaction committed or aborted, the code failed because the Connection didn't have any attributes.

I got particularly lucky with my compiler/platform/Python version/whatever. Part of the code in __getattr__ deleted a key-value pair from a dictionary. I think that was partly chance; there was nothing about the code that guaranteed the key was in the dict, but it deleted it if it was. The value in the dict was a weakref. The weakref decrefed and deallocated its callback function. Just by luck, the callback function was the next thing in the unreachable gc list. So I got a segfault when I dereferenced the now-freed GC header of the callback object.
Jeremy

From oren-py-d@hishome.net Mon Apr 7 07:16:30 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 7 Apr 2003 02:16:30 -0400
Subject: [Python-Dev] Placement of os.fdopen functionality
In-Reply-To: <20030405193531.GA23455@meson.dyndns.org>
References: <20030405193531.GA23455@meson.dyndns.org>
Message-ID: <20030407061630.GA12658@hishome.net>

On Sat, Apr 05, 2003 at 02:35:31PM -0500, Jp Calderone wrote:
> It occurred to me this afternoon (after answering a question about creating
> file objects from file descriptors) that perhaps os.fdopen would be more
> logically placed someplace else - of course it could also remain as
> os.fdopen() for whatever deprecation period is warranted.
>
> Perhaps as a class method of the file type, file.fromfd()?

I don't see much point in moving it around just because the place doesn't seem right, but the fact that it's a function rather than a method means that some things cannot be done in pure Python.

I can create an uninitialized instance of a subclass of 'file' using file.__new__(filesubclass) but the only way to open it is by name using file.__init__(filesubclassinstance, 'filename'). A file subclass cannot be opened from a file descriptor because fdopen always returns a new instance of 'file'.

If there was some way to open an uninitialized file object from a file descriptor it would be possible, for example, to write a version of popen that returns a subclass of file. It could add a method for retrieving the exit code of the process, do something interesting on __del__, etc.

Here are some alternatives of where this could be implemented, followed by what a Python implementation of os.fdopen would look like:

1. New form of file.__new__ with more arguments:

    def fdopen(fd, mode='r', buffering=-1):
        return file.__new__('(fdopen)', mode, buffering, fd)

2. Optional argument to file.__init__:

    def fdopen(fd, mode='r', buffering=-1):
        return file('(fdopen)', mode, buffering, fd)

3. Instance method (NOT a class method):

    def fdopen(fd, mode='r', buffering=-1):
        f = file.__new__()
        f.fdopen(fd, mode, buffering, '(fdopen)')
        return f

Oren

From theller@python.net Mon Apr 7 07:56:38 2003
From: theller@python.net (Thomas Heller)
Date: 07 Apr 2003 08:56:38 +0200
Subject: [Python-Dev] LONG_LONG (Was: [Python-checkins] python/dist/src/Misc NEWS,1.703,1.704)
In-Reply-To:
References:
Message-ID:

loewis@users.sourceforge.net writes:

> Update of /cvsroot/python/python/dist/src/Misc
> In directory sc8-pr-cvs1:/tmp/cvs-serv28757/Misc
>
> Modified Files:
> NEWS
> Log Message:
> Rename LONG_LONG to PY_LONG_LONG. Fixes #710285.
>

What is the recommended way to port code like this to Python 2.3, and still remain compatible with 2.2?

Thanks,

Thomas

typedef struct {
    PyObject_HEAD
    char tag;
    union {
        char c;
        char b;
        short h;
        int i;
        long l;
#ifdef HAVE_LONG_LONG
        LONG_LONG q;
#endif
        double d;
        float f;
        void *p;
    } value;
    PyObject *obj;
} PyCArgObject;

From mhammond@skippinet.com.au Mon Apr 7 12:23:02 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Mon, 07 Apr 2003 21:23:02 +1000
Subject: [Python-Dev] LONG_LONG (Was: [Python-checkins] python/dist/src/Misc NEWS,1.703,1.704)
In-Reply-To:
Message-ID:

> > Rename LONG_LONG to PY_LONG_LONG. Fixes #710285.
> >
>
> What is the recommended way to port code like this to Python 2.3,
> and still remain compatible with 2.2?

#if defined(PY_LONG_LONG) && !defined(LONG_LONG)
#define LONG_LONG PY_LONG_LONG /* grrr :( */
#endif

?

This change does break things.

Mark.

From skip@pobox.com Mon Apr 7 15:56:41 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 7 Apr 2003 09:56:41 -0500
Subject: [Python-Dev] LONG_LONG (Was: [Python-checkins] python/dist/src/Misc NEWS, 1.703, 1.704)
In-Reply-To:
References:
Message-ID: <16017.37289.216513.120081@montanaro.dyndns.org>

Thomas> What is the recommended way to port code like this to Python
Thomas> 2.3, and still remain compatible with 2.2?
Thomas> #ifdef HAVE_LONG_LONG
Thomas>     LONG_LONG q;
Thomas> #endif

Wouldn't this work?

    #ifdef HAVE_LONG_LONG
    # ifdef PY_LONG_LONG
        PY_LONG_LONG q;
    # else
        LONG_LONG q;
    # endif
    #endif

As MarkH pointed out, this change is going to break some code, but there's probably no way around it. Obviously, some other package defines a LONG_LONG macro or there wouldn't have been a bug report. Better to bite the bullet sooner than later.

Skip

From msg_2222@yahoo.com Mon Apr 7 18:16:53 2003
From: msg_2222@yahoo.com (Rick Y)
Date: Mon, 7 Apr 2003 10:16:53 -0700 (PDT)
Subject: [Python-Dev] socket question
Message-ID: <20030407171653.41362.qmail@web20711.mail.yahoo.com>

How can I enable the _socket module in my Solaris Python? I did not build it. Downloaded it from sunfreeware.

./viewcvs-install
Traceback (most recent call last):
  File "./viewcvs-install", line 35, in ?
    import compat
  File "./lib/compat.py", line 20, in ?
    import urllib
  File "/usr/local/lib/python2.1/urllib.py", line 26, in ?
    import socket
  File "/usr/local/lib/python2.1/socket.py", line 41, in ?
    from _socket import *
ImportError: ld.so.1: python: fatal: libssl.so.0.9.6: open failed: No such file or directory

From aahz@pythoncraft.com Mon Apr 7 18:28:37 2003
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 7 Apr 2003 13:28:37 -0400
Subject: [Python-Dev] socket question
In-Reply-To: <20030407171653.41362.qmail@web20711.mail.yahoo.com>
References: <20030407171653.41362.qmail@web20711.mail.yahoo.com>
Message-ID: <20030407172837.GA18682@panix.com>

On Mon, Apr 07, 2003, Rick Y wrote:
>
> how can i enable _socket module in my solaris python?.

python-dev is for discussions about developing the language, not for questions about using Python. You'll probably get better advice by subscribing to the newsgroup comp.lang.python (or python-list).
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

This is Python. We don't care much about theory, except where it intersects with useful practice. --Aahz, c.l.py, 2/4/2002

From jeremy@zope.com Mon Apr 7 18:43:28 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 07 Apr 2003 13:43:28 -0400
Subject: [Python-Dev] socket question
In-Reply-To: <20030407171653.41362.qmail@web20711.mail.yahoo.com>
References: <20030407171653.41362.qmail@web20711.mail.yahoo.com>
Message-ID: <1049737408.23331.19.camel@slothrop.zope.com>

Rick,

This question would be more appropriate on python-list. The python-dev list is for discussion among people who work on the Python implementation, rather than for end-user questions. But don't sweat it; you probably didn't know that.

On Mon, 2003-04-07 at 13:16, Rick Y wrote:
> how can i enable _socket module in my solaris python?.
> i did not build it. Downloaded it from sunfreeware.
>
> ./viewcvs-install
> Traceback (most recent call last):
>   File "./viewcvs-install", line 35, in ?
>     import compat
>   File "./lib/compat.py", line 20, in ?
>     import urllib
>   File "/usr/local/lib/python2.1/urllib.py", line 26,
> in ?
>     import socket
>   File "/usr/local/lib/python2.1/socket.py", line 41,
> in ?
>     from _socket import *
> ImportError: ld.so.1: python: fatal: libssl.so.0.9.6:
> open failed: No such file or directory

The version of Python you are using has been linked against OpenSSL. The import of _socket is failing because libssl.so can't be found at run-time. You either need to tell your linker where to find the file or install OpenSSL. I'm sure you can find more help on the details on the other list.

Jeremy

From martin@v.loewis.de Mon Apr 7 22:29:14 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 07 Apr 2003 23:29:14 +0200
Subject: [Python-Dev] LONG_LONG (Was: [Python-checkins] python/dist/src/Misc NEWS,1.703,1.704)
In-Reply-To:
References:
Message-ID:

Mark Hammond writes:

> #if defined(PY_LONG_LONG) && !defined(LONG_LONG)
> #define LONG_LONG PY_LONG_LONG /* grrr :( */
> #endif

That works; perhaps one would remove the comment...

> This change does break things.

Most certainly. However, it was broken before, as it failed to be renamed in the grand renaming.

Regards,
Martin

From marcus.h.mendenhall@vanderbilt.edu Tue Apr 8 15:38:57 2003
From: marcus.h.mendenhall@vanderbilt.edu (Marcus Mendenhall)
Date: Tue, 8 Apr 2003 09:38:57 -0500
Subject: [Python-Dev] _socket efficiencies ideas
Message-ID:

I have been in discussion recently with Martin v. Loewis about an idea I have been thinking about for a while to improve the efficiency of the connect method in the _socket module. I posted the original suggestion to the python suggestions tracker on sourceforge as item 706392.

A bit of history and justification:
I am doing a lot of work using python to develop almost-real-time distributed data acquisition and control systems for running laboratory apparatus. In this environment, I do a lot of sun-rpc calls as part of the vxi-11 protocol to allow TCP/IP access to gpib-like devices. As a part of this, I do a lot of socket.connect() calls, often with the connections being quite transient. The problem is that the current python _socket module makes a DNS call to try to resolve each address before connect is called, which if I am connecting/disconnecting many times a second results in pathological and gratuitous network activity. Incidentally, I am in the process of creating a sourceforge project, pythonlabtools (just approved this morning), in which I will start maintaining a repository of the tools I have been working on.

My first solution to this, for which I submitted a patch to the tracker system (with guidance from Martin), was to create a wrapper for the sockaddr object, which one can create in advance, and when _socket.connect() is called (actually when getsockaddrarg() is called by connect), results in an immediate connection without any DNS activity.

This solution solves part of the problem, but may not be the right final one. After writing this patch and verifying its functionality, I tried it in the real world. Then, I realized that for sun-rpc work, it wasn't quite what I needed, since the socket number may be changing each time the rpc request is made, resulting in a new address wrapper being needed, and thus DNS activity again.

After thinking about what I have done with this patch, I would also like to suggest another change (for which I am also willing to submit the patch, which is quite simple): Consistent with some of the already extant glue in _socket to handle addresses like , would there be any reason not to modify setipaddr() and getaddrinfo() so that if an address is prefixed with (e.g. 127.0.0.1) that the PASSIVE and NUMERIC flags are always set so these routines reject any non-numeric address, but handle numeric ones very efficiently?

I have already implemented a predecessor to this which I am experimentally running at home in python 2.2.2, in which I made it so that prefixing the address with an exclamation point provided this functionality. Given the somewhat more legible approach the team has already chosen for special addresses, I see no reason why using a (or some such) prefix isn't reasonable.

Do any members of the development team have commentary on this? Would such a change be likely to be accepted into the system? Any reasons which it might break something? The actual patch would be only about 10 lines of code, (plus some documentation), a few in each of the routines mentioned above.

Thanks for any suggestions.

Marcus Mendenhall

From guido@python.org Tue Apr 8 15:50:50 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 08 Apr 2003 10:50:50 -0400
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: Your message of "Tue, 08 Apr 2003 09:38:57 CDT."
References:
Message-ID: <200304081450.h38EoqE20178@odiug.zope.com>

> I have been in discussion recently with Martin v. Loewis about an idea
> I have been thinking about for a while to improve the efficiency of the
> connect method in the _socket module. I posted the original suggestion
> to the python suggestions tracker on sourceforge as item 706392.
>
> A bit of history and justification:
> I am doing a lot of work using python to develop almost-real-time
> distributed data acquisition and control systems for running
> laboratory apparatus. In this environment, I do a lot of sun-rpc calls
> as part of the vxi-11 protocol to allow TCP/IP access to gpib-like
> devices. As a part of this, I do a lot of socket.connect() calls,
> often with the connections being quite transient. The problem is that
> the current python _socket module makes a DNS call to try to resolve
> each address before connect is called, which if I am
> connecting/disconnecting many times a second results in pathological
> and gratuitous network activity. Incidentally, I am in the process of
> creating a sourceforge project, pythonlabtools (just approved this
> morning), in which I will start maintaining a repository of the tools I
> have been working on.

Are you sure that it tries to make a DNS call even when the address is pure numeric? That seems a mistake, and if that's really happening, I think that is the part that should be fixed. Maybe in the _socket module, maybe in getaddrinfo().
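
For readers following along in present-day Python, the numeric-only behaviour under discussion can be approximated from user code. This is a minimal sketch, assuming that the "NUMERIC" flag mentioned in the thread corresponds to `socket.AI_NUMERICHOST`; `resolve_numeric` is a made-up helper name, and nothing here reflects what the 2.2/2.3 `_socket` C code actually did.

```python
import socket


def resolve_numeric(host, port):
    """Return an IPv4 sockaddr for a numeric host, or None if it is a name.

    AI_NUMERICHOST asks getaddrinfo() to refuse anything that would need a
    name lookup, so a hostname is rejected instead of being resolved.
    """
    try:
        infos = socket.getaddrinfo(
            host, port, socket.AF_INET, socket.SOCK_STREAM, 0,
            socket.AI_NUMERICHOST,
        )
    except socket.gaierror:
        return None
    # Each entry is (family, type, proto, canonname, sockaddr).
    return infos[0][4]


numeric = resolve_numeric("127.0.0.1", 80)
named = resolve_numeric("no-such-host.invalid", 80)
```

A caller that connects many times a second could resolve once this way and reuse the returned sockaddr tuple for each `connect()`.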

> My first solution to this, for which I submitted a patch to the tracker
> system (with guidance from Martin), was to create a wrapper for the
> sockaddr object, which one can create in advance, and when
> _socket.connect() is called (actually when getsockaddrarg() is called
> by connect), results in an immediate connection without any DNS
> activity.
>
> This solution solves part of the problem, but may not be the right
> final one. After writing this patch and verifying its functionality, I
> tried it in the real world. Then, I realized that for sun-rpc work, it
> wasn't quite what I needed, since the socket number may be changing
> each time the rpc request is made, resulting in a new address wrapper
> being needed, and thus DNS activity again.
>
> After thinking about what I have done with this patch, I would also
> like to suggest another change (for which I am also willing to submit
> the patch, which is quite simple): Consistent with some of the already
> extant glue in _socket to handle addresses like , would
> there be any reason not to modify
> setipaddr() and getaddrinfo() so that if an address is prefixed with
> (e.g. 127.0.0.1) that the PASSIVE and NUMERIC flags
> are always set so these routines reject any non-numeric address, but
> handle numeric ones very efficiently?
>
> I have already implemented a predecessor to this which I am
> experimentally running at home in python 2.2.2, in which I made it so
> that prefixing the address with an exclamation point provided this
> functionality. Given the somewhat more legible approach the team has
> already chosen for special addresses, I see no reason why using a
> (or some such) prefix isn't reasonable.
>
> Do any members of the development team have commentary on this? Would
> such a change be likely to be accepted into the system? Any reasons
> which it might break something?
> The actual patch would be only about
> 10 lines of code, (plus some documentation), a few in each of the
> routines mentioned above.

I don't see why we would have to add the flag to the address when the form of the address itself is already a perfect clue that the address is purely numeric. I'd be happy to see a patch that intercepts addresses of the form \d+\.\d+\.\d+\.\d+ and parses those without calling getaddrinfo().

--Guido van Rossum (home page: http://www.python.org/~guido/)

From marcus.h.mendenhall@vanderbilt.edu Tue Apr 8 16:59:27 2003
From: marcus.h.mendenhall@vanderbilt.edu (Marcus Mendenhall)
Date: Tue, 8 Apr 2003 10:59:27 -0500
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: <200304081450.h38EoqE20178@odiug.zope.com>
Message-ID: <138CDF38-69DB-11D7-A8D4-003065A81A70@vanderbilt.edu>

Thanks for your prompt reply!

On Tuesday, April 8, 2003, at 09:50 AM, Guido van Rossum wrote:

>> I have been in discussion recently with Martin v. Loewis about an idea
>> I have been thinking about for a while to improve the efficiency of
>> the
>> connect method in the _socket module. I posted the original
>> suggestion
>> to the python suggestions tracker on sourceforge as item 706392.
>>
>> A bit of history and justification:
>> I am doing a lot of work using python to develop almost-real-time
>> distributed data acquisition and control systems for running
>> laboratory apparatus. In this environment, I do a lot of sun-rpc
>> calls
>> as part of the vxi-11 protocol to allow TCP/IP access to gpib-like
>> devices. As a part of this, I do a lot of socket.connect() calls,
>> often with the connections being quite transient. The problem is that
>> the current python _socket module makes a DNS call to try to resolve
>> each address before connect is called, which if I am
>> connecting/disconnecting many times a second results in pathological
>> and gratuitous network activity.
>> Incidentally, I am in the process of
>> creating a sourceforge project, pythonlabtools (just approved this
>> morning), in which I will start maintaining a repository of the tools
>> I
>> have been working on.
>
> Are you sure that it tries make a DNS call even when the address is
> pure numeric? That seems a mistake, and if that's really happening, I
> think that is the part that should be fixed. Maybe in the _socket
> module, maybe in getaddrinfo().
>

Yes, it seems to do this. It sets the PASSIVE flags, but that doesn't seem to be quite enough to prevent DNS activity, although the NUMERIC flag does the job. This is true, at least, in 2.3.x on MacOSX, and since the socket stuff is all the same, I suspect it is true on many Unixes. Note that this doesn't happen on the MacOS9 version, which provides its own socket interface through GUSI, which apparently is smart enough to handle it.

>> My first solution to this, for which I submitted a patch to the
>> tracker
>> system (with guidance from Martin), was to create a wrapper for the
>> sockaddr object, which one can create in advance, and when
>> _socket.connect() is called (actually when getsockaddrarg() is called
>> by connect), results in an immediate connection without any DNS
>> activity.
>>
>> This solution solves part of the problem, but may not be the right
>> final one. After writing this patch and verifying its functionality,
>> I
>> tried it in the real world. Then, I realized that for sun-rpc work,
>> it
>> wasn't quite what I needed, since the socket number may be changing
>> each time the rpc request is made, resulting in a new address wrapper
>> being needed, and thus DNS activity again.
>>
>> After thinking about what I have done with this patch, I would also
>> like to suggest another change (for which I am also willing to submit
>> the patch, which is quite simple): Consistent with some of the
>> already
>> extant glue in _socket to handle addresses like , would
>> there be any reason not to modify
>> setipaddr() and getaddrinfo() so that if an address is prefixed with
>> (e.g. 127.0.0.1) that the PASSIVE and NUMERIC flags
>> are always set so these routines reject any non-numeric address, but
>> handle numeric ones very efficiently?
>>
>> I have already implemented a predecessor to this which I am
>> experimentally running at home in python 2.2.2, in which I made it so
>> that prefixing the address with an exclamation point provided this
>> functionality. Given the somewhat more legible approach the team has
>> already chosen for special addresses, I see no reason why using a
>> (or some such) prefix isn't reasonable.
>>
>> Do any members of the development team have commentary on this? Would
>> such a change be likely to be accepted into the system? Any reasons
>> which it might break something? The actual patch would be only about
>> 10 lines of code, (plus some documentation), a few in each of the
>> routines mentioned above.
>
> I don't see why we would have to add the flag to the address
> when the form of the address itself is already a perfect clue that the
> address is purely numeric. I'd be happy to see a patch that
> intercepts addresses of the form \d+\.\d+\.\d+\.\d+ and parses those
> without calling getaddrinfo().
>

Do we want this? The parser would then also have to be modified to handle numeric INET6 addresses, when they become popular. I actually did implement one of my trial versions this way, and it worked fine. There is one minor issue, too. In urllib, there are some calls to getaddrinfo to get (for maybe no good reason) CNAMEs of addresses.
I would like some way to tag an address with a very strong comment that it is what it is, and I would like all further processing disabled. Also, a 'trial' parsing of an address for matching an a.b.c.d pattern each time is a lot more processor intensive than checking for at the beginning. I am perfectly happy to implement it either way.

> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From guido@python.org Tue Apr 8 19:01:24 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 08 Apr 2003 14:01:24 -0400
Subject: [PythonLabs] Re: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: Your message of "Sun, 06 Apr 2003 11:43:21 PDT." <20030406184320.GA14894@glacier.arctrix.com>
References: <3E900A80.3010802@zope.com> <20030406184320.GA14894@glacier.arctrix.com>
Message-ID: <200304081801.h38I1QL22691@odiug.zope.com>

> A command line option that enabled new-style classes by default may be a
> good idea (suggested to me by AMK at PyCon).

I expect lots of things to break; such an option would have to be at least as well-hidden as -U.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Tue Apr 8 19:06:47 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 08 Apr 2003 14:06:47 -0400
Subject: [Python-Dev] Re: [Python-checkins]python/dist/src/Modules gcmodule.c,2.33.6.5,2.33.6.6
In-Reply-To: Your message of "Mon, 07 Apr 2003 12:54:20 +1200." <200304070054.h370sK814932@oma.cosc.canterbury.ac.nz>
References: <200304070054.h370sK814932@oma.cosc.canterbury.ac.nz>
Message-ID: <200304081806.h38I6v822730@odiug.zope.com>

> This further confirms my opinion that __del__ methods are evil, and
> the language would be the better for their complete removal.

No can do. There must be a way to force e.g. calling os.close() for an integer file descriptor returned by os.open() without writing C code. But this should be exceedingly rare.
A quick inspection of the standard library found one other case: flushing buffered data out. I think that's also a valid use of __del__.

> Failing that, perhaps they should be made a bit less dynamic, so
> that the GC can make reasonable assumptions about their existence
> without having to execute Python code.

+1

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jafo@tummy.com Wed Apr 9 13:48:48 2003
From: jafo@tummy.com (Sean Reifschneider)
Date: Wed, 9 Apr 2003 06:48:48 -0600
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: <200304081450.h38EoqE20178@odiug.zope.com>
References: <200304081450.h38EoqE20178@odiug.zope.com>
Message-ID: <20030409124848.GB15649@tummy.com>

On Tue, Apr 08, 2003 at 10:50:50AM -0400, Guido van Rossum wrote:
>Are you sure that it tries make a DNS call even when the address is
>pure numeric? That seems a mistake, and if that's really happening, I

My first thought is that there should be a local DNS cache on the machine that is running these apps. My second thought is that Python could benefit from caching some lookup information...

>address is purely numeric. I'd be happy to see a patch that
>intercepts addresses of the form \d+\.\d+\.\d+\.\d+ and parses those
>without calling getaddrinfo().

It's not quite that easy. Beyond the IPV6 issues mentioned elsewhere, you'd also want to check "\d+\.\d+" and "\d+\.\d+\.\d+". IP addresses will fill in missing ".0"s, which is particularly handy for accessing "127.1", which gets expanded to "127.0.0.1".

Sean
--
Rocky: "Do you know what an A-Bomb is?"
Bullwinkle: "Of course. ``A Bomb'' is what some people call our show."
Sean Reifschneider, Inimitably Superfluous
tummy.com, ltd. - Linux Consulting since 1995. Qmail, Python, SysAdmin

From hbl@st-andrews.ac.uk Wed Apr 9 14:35:46 2003
From: hbl@st-andrews.ac.uk (Hamish Lawson)
Date: Wed, 09 Apr 2003 14:35:46 +0100
Subject: [Python-Dev] PEP305 csv package: from csv import csv?
Message-ID: <5.2.0.9.0.20030409143148.01d0d620@spey.st-andrews.ac.uk> [Please excuse my posting this message here after initially posting it to python-list, but I realised afterwards that this might be the more appropriate forum (it hasn't so far had any responses on python-list anyway).] According to the documentation in progress at http://www.python.org/dev/doc/devel/whatsnew/node14.html use of the forthcoming csv module (as described in PEP305) requires it to be imported from the csv package:
    from csv import csv
    input = open('datafile', 'rb')
    reader = csv.reader(input)
    for line in reader:
        print line
Is there some reason why the csv package's __init__.py doesn't import the required names from csv.py, so allowing the shorter form below?
    import csv
    input = open('datafile', 'rb')
    reader = csv.reader(input)
    for line in reader:
        print line
Hamish Lawson From skip@pobox.com Wed Apr 9 14:43:11 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 9 Apr 2003 08:43:11 -0500 Subject: [Python-Dev] PEP305 csv package: from csv import csv? In-Reply-To: <5.2.0.9.0.20030409143148.01d0d620@spey.st-andrews.ac.uk> References: <5.2.0.9.0.20030409143148.01d0d620@spey.st-andrews.ac.uk> Message-ID: <16020.9071.801846.936864@montanaro.dyndns.org> >>>>> "Hamish" == Hamish Lawson writes: Hamish> [Please excuse my posting this message here after initially Hamish> posting it to python-list, but I realised afterwards that this Hamish> might be the more appropriate forum (it hasn't so far had any Hamish> responses on python-list anyway).] ... Actually, I forwarded your note to the csv mailing list: csv@mail.mojam.com. That'd be the best place to discuss the topic. ;-) I'll probably get around to changing things in the next day or two, but please feel free to submit a patch so I don't forget.
Skip From guido@python.org Wed Apr 9 14:51:26 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 09:51:26 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 06:48:48 MDT." <20030409124848.GB15649@tummy.com> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> Message-ID: <200304091351.h39DpSq24961@odiug.zope.com> > On Tue, Apr 08, 2003 at 10:50:50AM -0400, Guido van Rossum wrote: > >Are you sure that it tries make a DNS call even when the address is > >pure numeric? That seems a mistake, and if that's really happening, I > > My first thought is that there should be a local DNS cache on the > machine that is running these apps. My second thought is that Python > could benefit from caching some lookup information... I don't want to build a cache into Python, it should already be part of libresolv. > >address is purely numeric. I'd be happy to see a patch that > >intercepts addresses of the form \d+\.\d+\.\d+\.\d+ and parses those > >without calling getaddrinfo(). > > It's not quite that easy. Beyond the IPV6 issues mentioned elsewhere, The IPv6 folks can add their own cache. > you'd also want to check "\d+.\d+" and "\d+\.\d+\.\d+". IP addresses > will fill in missing ".0"s, which is particularly handy for accessing > "127.1", which gets expanded to "127.0.0.1". I didn't even know this, and I think it's bad style to use something that obscure (most people would probably guess that 127.1 means 0.0.127.1 or 127.1.0.0). But since you seem to know about this stuff, perhaps you can submit a patch? 
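[Illustration, not part of the original message.] The padding rule quoted above ("127.1" expands to "127.0.0.1") can be sketched in a few lines of Python. This is only a rough model of the behaviour Sean describes: the helper name expand_short_ipv4 is made up for this note, and it handles decimal parts only, which is an assumption of the sketch rather than a statement about any particular C library.

```python
def expand_short_ipv4(text):
    """Expand a short dotted IPv4 form to four octets (hypothetical helper).

    Leading parts are single octets; the last part fills all remaining bytes.
    """
    parts = text.split(".")
    if not 1 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError("not a numeric IPv4 form: %r" % (text,))
    numbers = [int(p) for p in parts]
    head, last = numbers[:-1], numbers[-1]
    if any(n > 255 for n in head):
        raise ValueError("octet out of range: %r" % (text,))
    tail_bytes = 4 - len(head)          # bytes covered by the last part
    if last >= 256 ** tail_bytes:
        raise ValueError("last part too large: %r" % (text,))
    value = 0
    for n in head:
        value = (value << 8) | n
    value = (value << (8 * tail_bytes)) | last
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

Under this rule the zeros are inserted before the last part, so neither of the guesses "0.0.127.1" or "127.1.0.0" comes out.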
--Guido van Rossum (home page: http://www.python.org/~guido/) From marcus.h.mendenhall@vanderbilt.edu Wed Apr 9 15:20:50 2003 From: marcus.h.mendenhall@vanderbilt.edu (Marcus Mendenhall) Date: Wed, 9 Apr 2003 09:20:50 -0500 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091351.h39DpSq24961@odiug.zope.com> Message-ID: <77018B84-6A96-11D7-87F7-003065A81A70@vanderbilt.edu> OK, I'll chime back in on the thread I started... I mostly have a question for Sean, since he seems to know the networking stuff well. Do you know of any reason why my original proposal (which is to allows IP addresses prefixed with e.g. 127.0.0.1 to cause both the AI_PASSIVE _and_ AI_NUMERIC flags to get set when resolution is attempted, which basically causes parsing with not real resolution at all) would break any known or plausible networking standards? The current Python socket module basically hides this part of the BSD socket API, and I find it quite useful to be able to suppress DNS activity absolutely for some addresses. And for Guido: since this type of tag has already been used in Python (as ), is there any reason why this solution is inelegant? Thanks. Marcus On Wednesday, April 9, 2003, at 08:51 AM, Guido van Rossum wrote: >> On Tue, Apr 08, 2003 at 10:50:50AM -0400, Guido van Rossum wrote: >>> Are you sure that it tries make a DNS call even when the address is >>> pure numeric? That seems a mistake, and if that's really happening, >>> I >> >> My first thought is that there should be a local DNS cache on the >> machine that is running these apps. My second thought is that Python >> could benefit from caching some lookup information... > > I don't want to build a cache into Python, it should already be part > of libresolv. > >>> address is purely numeric. I'd be happy to see a patch that >>> intercepts addresses of the form \d+\.\d+\.\d+\.\d+ and parses those >>> without calling getaddrinfo(). >> >> It's not quite that easy. 
Beyond the IPV6 issues mentioned elsewhere, > > The IPv6 folks can add their own cache. > >> you'd also want to check "\d+.\d+" and "\d+\.\d+\.\d+". IP addresses >> will fill in missing ".0"s, which is particularly handy for accessing >> "127.1", which gets expanded to "127.0.0.1". > > I didn't even know this, and I think it's bad style to use something > that obscure (most people would probably guess that 127.1 means > 0.0.127.1 or 127.1.0.0). > > But since you seem to know about this stuff, perhaps you can submit a > patch? > > --Guido van Rossum (home page: http://www.python.org/~guido/) > From Anthony Baxter Wed Apr 9 15:24:45 2003 From: Anthony Baxter (Anthony Baxter) Date: Thu, 10 Apr 2003 00:24:45 +1000 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <20030409124848.GB15649@tummy.com> Message-ID: <200304091424.h39EOje08304@localhost.localdomain> >>> Sean Reifschneider wrote > My first thought is that there should be a local DNS cache on the > machine that is running these apps. My second thought is that Python > could benefit from caching some lookup information... Ick ick. This is putting a bunch of code for a stub resolver into python. This stuff is hard to get right - I implemented this on top of pydns, and it was a lot of work to get (what I think is) correct, for not very much gain. The idea of either suppressing DNS lookups for all-numeric addresses, or some sort of extended API for suppressing DNS lookups might be better, but really, isn't this the job of the stub resolver? Anthony -- Anthony Baxter It's never too late to have a happy childhood. 
From marcus.h.mendenhall@vanderbilt.edu Wed Apr 9 15:32:00 2003 From: marcus.h.mendenhall@vanderbilt.edu (Marcus Mendenhall) Date: Wed, 9 Apr 2003 09:32:00 -0500 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091424.h39EOje08304@localhost.localdomain> Message-ID: <069761E4-6A98-11D7-87F7-003065A81A70@vanderbilt.edu> On Wednesday, April 9, 2003, at 09:24 AM, Anthony Baxter wrote: > >>>> Sean Reifschneider wrote >> My first thought is that there should be a local DNS cache on the >> machine that is running these apps. My second thought is that Python >> could benefit from caching some lookup information... > > Ick ick. This is putting a bunch of code for a stub resolver into > python. > This stuff is hard to get right - I implemented this on top of pydns, > and > it was a lot of work to get (what I think is) correct, for not very > much > gain. > > The idea of either suppressing DNS lookups for all-numeric addresses, > or > some sort of extended API for suppressing DNS lookups might be better, > but really, isn't this the job of the stub resolver? > This is part of the resolver API, via the AI_NUMERIC flags. I am just trying to expose that API to the top level of python. Marcus > Anthony > > -- > Anthony Baxter > It's never too late to have a happy childhood. > From guido@python.org Wed Apr 9 15:37:35 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 10:37:35 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 09:20:50 CDT." <77018B84-6A96-11D7-87F7-003065A81A70@vanderbilt.edu> References: <77018B84-6A96-11D7-87F7-003065A81A70@vanderbilt.edu> Message-ID: <200304091437.h39Ebc125316@odiug.zope.com> > OK, I'll chime back in on the thread I started... I mostly have a > question for Sean, since he seems to know the networking stuff well. I'll chime in nevertheless. > Do you know of any reason why my original proposal (which is to allows > IP addresses prefixed with e.g. 
127.0.0.1 to cause > both the AI_PASSIVE _and_ AI_NUMERIC flags to get set when resolution > is attempted, which basically causes parsing with not real resolution > at all) would break any known or plausible networking standards? What are those flags? Which API uses them? I still don't understand why intercepting the all-numeric syntax isn't good enough, and why you want a prefix. > The current Python socket module basically hides this part of the > BSD socket API, and I find it quite useful to be able to suppress > DNS activity absolutely for some addresses. And for Guido: since > this type of tag has already been used in Python (as ), > is there any reason why this solution is inelegant? The reason I'm reluctant to add a new notation is that AFAIK it would be unique to Python. It's better to stick to standard notations IMO. was probably a mistake, since it seems to mean the same as 0.0.0.0 (for IPv4). --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Wed Apr 9 15:38:03 2003 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 09 Apr 2003 10:38:03 -0400 Subject: [Python-Dev] SF file uploads work now Message-ID: <20030409143803.GE17847@epoch.metaslash.com> SF has fixed the problem which prevented a file from being uploaded when submitting a new patch. I just tested this and it worked. Neal From jafo@tummy.com Wed Apr 9 15:40:37 2003 From: jafo@tummy.com (Sean Reifschneider) Date: Wed, 9 Apr 2003 08:40:37 -0600 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091424.h39EOje08304@localhost.localdomain> References: <20030409124848.GB15649@tummy.com> <200304091424.h39EOje08304@localhost.localdomain> Message-ID: <20030409144037.GL1756@tummy.com> On Thu, Apr 10, 2003 at 12:24:45AM +1000, Anthony Baxter wrote: >Ick ick. This is putting a bunch of code for a stub resolver into python. 
>This stuff is hard to get right - I implemented this on top of pydns, and >it was a lot of work to get (what I think is) correct, for not very much >gain. Well, ideally you'd cache the data for as long as the SOA says to cache it. However, it sounds like in the situation that started this thread, even caching that data for some small but configurable number of seconds might help out. >The idea of either suppressing DNS lookups for all-numeric addresses, or >some sort of extended API for suppressing DNS lookups might be better, >but really, isn't this the job of the stub resolver? Definitely, on both counts... I like the idea of the "127.0.0.1" or otherwise somehow specifying that the address shouldn't be resolved. I wouldn't think that it'd be good to do lookups of purely IP addresses, but there is probably some obscure part of some spec that says it should happen. Contrary to popular belief, just because I know that IP addresses get padded with 0s, I'm not a networking lawyer. ;-) I learned that trick because it can help make dealing with IPV6 addresses much easier, but I've found it most useful with 127.1. Sean -- This message is REALLY offensive, so I ROT-13d it TWICE. -- Sean Reifschneider being silly on #python, 2000 Sean Reifschneider, Inimitably Superfluous tummy.com, ltd. - Linux Consulting since 1995. Qmail, Python, SysAdmin From guido@python.org Wed Apr 9 15:41:37 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 10:41:37 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Thu, 10 Apr 2003 00:24:45 +1000." <200304091424.h39EOje08304@localhost.localdomain> References: <200304091424.h39EOje08304@localhost.localdomain> Message-ID: <200304091441.h39EfnU25347@odiug.zope.com> > Ick ick. This is putting a bunch of code for a stub resolver into python. > This stuff is hard to get right - I implemented this on top of pydns, and > it was a lot of work to get (what I think is) correct, for not very much > gain. 
What I said. > The idea of either suppressing DNS lookups for all-numeric addresses, or > some sort of extended API for suppressing DNS lookups might be better, > but really, isn't this the job of the stub resolver? Hey, I just figured it out. The old socket module (Python 2.1 and before) *did* special-case \d+\.\d+\.\d+\.\d+! This code was somehow lost when the IPv6 support was added. I propose to put it back in, at least for IPv4 (AF_INET). Patch anyone? --Guido van Rossum (home page: http://www.python.org/~guido/) From jafo@tummy.com Wed Apr 9 15:48:04 2003 From: jafo@tummy.com (Sean Reifschneider) Date: Wed, 9 Apr 2003 08:48:04 -0600 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091351.h39DpSq24961@odiug.zope.com> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <200304091351.h39DpSq24961@odiug.zope.com> Message-ID: <20030409144803.GM1756@tummy.com> On Wed, Apr 09, 2003 at 09:51:26AM -0400, Guido van Rossum wrote: >I didn't even know this, and I think it's bad style to use something >that obscure Perhaps... It's also bad style to break the obscure cases that are defined by the specifications... ;-) >(most people would probably guess that 127.1 means >0.0.127.1 or 127.1.0.0). Yeah, unfortunately it's one of those cases that it doesn't really make sense until you actually know the padding happens, and then think about it... It really only makes sense to pad within the address because you are rarely going to have leading or trailing 0s in a network address. So, it pads before the trailing specified octet: 10.1 => 10.0.0.1 10.9.1 => 10.9.0.1 >But since you seem to know about this stuff, perhaps you can submit a >patch? I've updated my local CVS repository, I'll see if I can get a change done on the airplane today. Sean -- The structure of a system reflects the structure of the organization that built it. -- Richard E. Fairley Sean Reifschneider, Inimitably Superfluous tummy.com, ltd. 
- Linux Consulting since 1995. Qmail, Python, SysAdmin From guido@python.org Wed Apr 9 15:50:11 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 10:50:11 -0400 Subject: [Python-Dev] SF file uploads work now In-Reply-To: Your message of "Wed, 09 Apr 2003 10:38:03 EDT." <20030409143803.GE17847@epoch.metaslash.com> References: <20030409143803.GE17847@epoch.metaslash.com> Message-ID: <200304091450.h39EoDP25441@odiug.zope.com> > SF has fixed the problem which prevented a file from being uploaded > when submitting a new patch. I just tested this and it worked. Thanks! I've removed the big red warning about this from the "submit new" page. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Apr 9 15:54:18 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 10:54:18 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 08:48:04 MDT." <20030409144803.GM1756@tummy.com> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <200304091351.h39DpSq24961@odiug.zope.com> <20030409144803.GM1756@tummy.com> Message-ID: <200304091454.h39EsPr25477@odiug.zope.com> > On Wed, Apr 09, 2003 at 09:51:26AM -0400, Guido van Rossum wrote: > >I didn't even know this, and I think it's bad style to use something > >that obscure > > Perhaps... It's also bad style to break the obscure cases that are > defined by the specifications... ;-) Sure. I propose to special-case only what we *absolutely* *know* we can handle, and if on closer inspection we can't (e.g. someone writes 999.999.999.999) we pass it on to the official code. 
Here's the 2.1 code, which takes that approach: if (sscanf(name, "%d.%d.%d.%d%c", &d1, &d2, &d3, &d4, &ch) == 4 && 0 <= d1 && d1 <= 255 && 0 <= d2 && d2 <= 255 && 0 <= d3 && d3 <= 255 && 0 <= d4 && d4 <= 255) { addr_ret->sin_addr.s_addr = htonl( ((long) d1 << 24) | ((long) d2 << 16) | ((long) d3 << 8) | ((long) d4 << 0)); return 4; } > >But since you seem to know about this stuff, perhaps you can submit a > >patch? > > I've updated my local CVS repository, I'll see if I can get a change > done on the airplane today. Great! --Guido van Rossum (home page: http://www.python.org/~guido/) From marcus.h.mendenhall@vanderbilt.edu Wed Apr 9 16:07:51 2003 From: marcus.h.mendenhall@vanderbilt.edu (Marcus Mendenhall) Date: Wed, 9 Apr 2003 10:07:51 -0500 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091437.h39Ebc125316@odiug.zope.com> Message-ID: <0836E287-6A9D-11D7-87F7-003065A81A70@vanderbilt.edu> On Wednesday, April 9, 2003, at 09:37 AM, Guido van Rossum wrote: >> OK, I'll chime back in on the thread I started... I mostly have a >> question for Sean, since he seems to know the networking stuff well. > > I'll chime in nevertheless. > >> Do you know of any reason why my original proposal (which is to allows >> IP addresses prefixed with e.g. 127.0.0.1 to cause >> both the AI_PASSIVE _and_ AI_NUMERIC flags to get set when resolution >> is attempted, which basically causes parsing with not real resolution >> at all) would break any known or plausible networking standards? > > What are those flags? Which API uses them? > The getsockaddr call uses them (actually the correct name for one of the flags is AI_NUMERICHOST, not AI_NUMERIC as I originally stated), and its part of the BSD sockets library, which is basically what the python socketmodule wraps. > I still don't understand why intercepting the all-numeric syntax isn't > good enough, and why you want a prefix. 
> I guess intercepting all numeric is OK, it is just less efficient (since it requires a trial parsing of an address, which is wasted if it is not all numeric), and because it is so easy to implement . However, all my operational goals are achieved if the old check for pure numeric is reinstated at the lowest level (probably in getsockaddrarg in socketmodule.c), so it is used everywhere. >> The current Python socket module basically hides this part of the >> BSD socket API, and I find it quite useful to be able to suppress >> DNS activity absolutely for some addresses. And for Guido: since >> this type of tag has already been used in Python (as ), >> is there any reason why this solution is inelegant? > > The reason I'm reluctant to add a new notation is that AFAIK it would > be unique to Python. It's better to stick to standard notations IMO. > was probably a mistake, since it seems to mean the same as > 0.0.0.0 (for IPv4). I accept this logic. However, python is hiding a very useful (for efficiency) piece of the API, or depending on guessing whether you want it or not by looking at the format of an address. There are times in higher-level (python) code where getaddrinfo is called to get a CNAME, where I would also like to cause the raw IP to be returned by force, instead of attempting to get a CNAME, since I already know, by the IP I chose, that one doesn't exists. If we make the same check for numeric IPs in getaddrinfo, then it becomes impossible to resolve numeric names back to real ones. There is not way for getaddrinfo to know which way we want it, since in this case both ways might be needed. From guido@python.org Wed Apr 9 16:20:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 11:20:39 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 10:07:51 CDT." 
<0836E287-6A9D-11D7-87F7-003065A81A70@vanderbilt.edu> References: <0836E287-6A9D-11D7-87F7-003065A81A70@vanderbilt.edu> Message-ID: <200304091521.h39FL5425595@odiug.zope.com> > > I still don't understand why intercepting the all-numeric syntax > > isn't good enough, and why you want a prefix. > > > I guess intercepting all numeric is OK, it is just less efficient > (since it requires a trial parsing of an address, which is wasted if > it is not all numeric), and because it is so easy to implement > . The performance loss will be unmeasurable (parsing a string of at most 11 bytes against a very simple pattern). Compare that to the true cost of adding : documentation has to be added (and dozens of books updated), and code that wants to use numeric addresses has to be changed. > However, all my operational goals are achieved if the > old check for pure numeric is reinstated at the lowest level > (probably in getsockaddrarg in socketmodule.c), so it is used > everywhere. Right. > > The reason I'm reluctant to add a new notation is that AFAIK it would > > be unique to Python. It's better to stick to standard notations IMO. > > was probably a mistake, since it seems to mean the same as > > 0.0.0.0 (for IPv4). > I accept this logic. However, python is hiding a very useful (for > efficiency) piece of the API, or depending on guessing whether you want > it or not by looking at the format of an address. There are times in > higher-level (python) code where getaddrinfo is called to get a CNAME, > where I would also like to cause the raw IP to be returned by force, > instead of attempting to get a CNAME, since I already know, by the IP I > chose, that one doesn't exists. If we make the same check for numeric > IPs in getaddrinfo, then it becomes impossible to resolve numeric names > back to real ones. There is not way for getaddrinfo to know which way > we want it, since in this case both ways might be needed. You're right, this functionality should be made available. 
IMO the right solution is to make it a separate API in the socket module, not to add more syntax to the existing address parsing code. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Apr 9 19:36:17 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 09 Apr 2003 20:36:17 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <20030409124848.GB15649@tummy.com> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> Message-ID: <3E946821.6010208@v.loewis.de> Sean Reifschneider wrote: > My first thought is that there should be a local DNS cache on the > machine that is running these apps. My second thought is that Python > could benefit from caching some lookup information... I disagree. Python should expose the resolver library, and leave caching to it; many such libraries do caching already, in some form. The issue is different: In some cases the application just *knows* that an address is numeric, and that DNS lookup will fail. In these cases, lookup should be avoided - whether by explicit request from the application or by Python implicitly just knowing is a different issue. It turns out that Python doesn't need to 100% detect numeric addresses, as long as it would not classify addresses as numeric which aren't. Perhaps it is even possible to leave the "is numeric" test to the implementation of getaddrinfo, i.e. calling it twice (try numeric first, then try resolving the name)? 
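[Illustration, not part of the original message.] The "call it twice" idea might look roughly like the sketch below at the Python level. It assumes the present-day socket.getaddrinfo() signature and the socket.AI_NUMERICHOST constant; the function name lookup_ipv4 is invented for this note, and nothing here is the patch under discussion.

```python
import socket

def lookup_ipv4(host, port):
    """Try a parse-only pass first, then fall back to normal resolution."""
    try:
        # AI_NUMERICHOST asks getaddrinfo() to parse the string only and
        # never to consult the resolver.
        return socket.getaddrinfo(host, port, socket.AF_INET,
                                  socket.SOCK_STREAM, 0,
                                  socket.AI_NUMERICHOST)
    except socket.gaierror:
        # Not a numeric address: ignore the first error and resolve the name.
        return socket.getaddrinfo(host, port, socket.AF_INET,
                                  socket.SOCK_STREAM)
```

A call such as lookup_ipv4("127.0.0.1", 80) should be answered entirely by the first pass; only non-numeric names reach the second call.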
Regards, Martin From martin@v.loewis.de Wed Apr 9 19:38:32 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 09 Apr 2003 20:38:32 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091351.h39DpSq24961@odiug.zope.com> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <200304091351.h39DpSq24961@odiug.zope.com> Message-ID: <3E9468A8.8050407@v.loewis.de> Guido van Rossum wrote: > I didn't even know this, and I think it's bad style to use something > that obscure (most people would probably guess that 127.1 means > 0.0.127.1 or 127.1.0.0). > > But since you seem to know about this stuff, perhaps you can submit a > patch? I think the OP is willing to create a patch if guided into a direction. The basic question is: should Python automatically recognize numeric addresses, or should the application have a way to indicate a numeric address? Regards, Martin From skip@pobox.com Wed Apr 9 19:44:51 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 9 Apr 2003 13:44:51 -0500 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <3E946821.6010208@v.loewis.de> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <3E946821.6010208@v.loewis.de> Message-ID: <16020.27171.834878.631470@montanaro.dyndns.org> Martin> It turns out that Python doesn't need to 100% detect numeric Martin> addresses, as long as it would not classify addresses as numeric Martin> which aren't. Perhaps it is even possible to leave the "is Martin> numeric" test to the implementation of getaddrinfo, i.e. calling Martin> it twice (try numeric first, then try resolving the name)? Can a top-level domain be all digits? If not, why not assume numeric if re.search(r"\.\d+$", addr) is not None? 
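[Illustration, not part of the original message.] Spelled out as a function, the heuristic above would be something like the following sketch. The name looks_numeric is made up for this note, and the heuristic is only sound if a top-level domain can never be all digits, which is exactly the open question.

```python
import re

def looks_numeric(addr):
    # True when the last dot-separated label consists only of digits.
    return re.search(r"\.\d+$", addr) is not None
```

Note that it treats short forms such as "127.1" as numeric, but not a dotless single number.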
Skip From guido@python.org Wed Apr 9 19:45:49 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 14:45:49 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 20:36:17 +0200." <3E946821.6010208@v.loewis.de> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <3E946821.6010208@v.loewis.de> Message-ID: <200304091845.h39Ijor31915@odiug.zope.com> > Sean Reifschneider wrote: > > > My first thought is that there should be a local DNS cache on the > > machine that is running these apps. My second thought is that Python > > could benefit from caching some lookup information... [MvL] > I disagree. Python should expose the resolver library, and leave > caching to it; many such libraries do caching already, in some form. Right. > The issue is different: In some cases the application just *knows* > that an address is numeric, and that DNS lookup will fail. In fact, I've often written code that passes a numeric address, and I've always assumed that in that case the code would take a shortcut because there's nothing to look up (only to parse). > In these cases, lookup should be avoided - whether by explicit > request from the application or by Python implicitly just knowing is > a different issue. > > It turns out that Python doesn't need to 100% detect numeric > addresses, as long as it would not classify addresses as numeric > which aren't. Perhaps it is even possible to leave the "is numeric" > test to the implementation of getaddrinfo, i.e. calling it twice > (try numeric first, then try resolving the name)? Perhaps, as long as we can safely ignore the first error. This would probably be a little slower, but probably not slow enough to matter, and it sounds like a very general solution.
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Apr 9 19:49:54 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 09 Apr 2003 20:49:54 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <0836E287-6A9D-11D7-87F7-003065A81A70@vanderbilt.edu> References: <0836E287-6A9D-11D7-87F7-003065A81A70@vanderbilt.edu> Message-ID: <3E946B52.7090708@v.loewis.de> Marcus Mendenhall wrote: > The getsockaddr call uses them (actually the correct name for one of the > flags is AI_NUMERICHOST, not AI_NUMERIC as I originally stated), and its > part of the BSD sockets library, which is basically what the python > socketmodule wraps. More importantly, it is part of RFC 2553, which Python uses; it is also part of Winsock2. > I guess intercepting all numeric is OK, it is just less efficient (since > it requires a trial parsing of an address, which is wasted if it is not > all numeric), and because it is so easy to implement . But isn't the same trial parsing needed to determine presence of the "" flag? The trial parsing Guido proposes usually stops with the first letter in a non-numeric address, and accesses up to 16 letters for a numeric address. Regards, Martin From guido@python.org Wed Apr 9 19:47:41 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 14:47:41 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 20:38:32 +0200." <3E9468A8.8050407@v.loewis.de> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <200304091351.h39DpSq24961@odiug.zope.com> <3E9468A8.8050407@v.loewis.de> Message-ID: <200304091848.h39IlpW31935@odiug.zope.com> > The basic question is: should Python automatically recognize numeric > addresses, or should the application have a way to indicate a numeric > address? It should be automatically recognized. Python has always done this (until 2.1 at least). 
I don't think there is any ambiguity; AFAIK it's not possible to put something in the DNS so that an all-numeric address gets remapped (that would be a nasty security problem waiting to happen). --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Apr 9 19:59:56 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 09 Apr 2003 20:59:56 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <16020.27171.834878.631470@montanaro.dyndns.org> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <3E946821.6010208@v.loewis.de> <16020.27171.834878.631470@montanaro.dyndns.org> Message-ID: <3E946DAC.8010909@v.loewis.de> Skip Montanaro wrote: > Can a top-level domain be all digits? It appears nobody here can answer this question with certainty. If the answer is "no", it is surprising that getaddrinfo implementations still make resolver calls in this case even if they are sure that those resolver calls fail. One would hope that people writing socket libraries should know the answer. Regards, Martin From marcus.h.mendenhall@vanderbilt.edu Wed Apr 9 20:14:16 2003 From: marcus.h.mendenhall@vanderbilt.edu (Marcus Mendenhall) Date: Wed, 9 Apr 2003 14:14:16 -0500 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <3E946B52.7090708@v.loewis.de> Message-ID: <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> On Wednesday, April 9, 2003, at 01:49 PM, Martin v. Löwis wrote: > Marcus Mendenhall wrote: > >> The getsockaddr call uses them (actually the correct name for one of >> the flags is AI_NUMERICHOST, not AI_NUMERIC as I originally stated), >> and its part of the BSD sockets library, which is basically what the >> python socketmodule wraps. > > More importantly, it is part of RFC 2553, which Python uses; it is also > part of Winsock2.
> >> I guess intercepting all numeric is OK, it is just less efficient >> (since it requires a trial parsing of an address, which is wasted if >> it is not all numeric), and because it is so easy to implement >> <numeric>. > > But isn't the same trial parsing needed to determine presence of the > "<numeric>" flag? The trial parsing Guido proposes usually stops with > the first letter in a non-numeric address, and accesses up to 16 > letters > for a numeric address. Yes, but a compare of the head of a string to a constant is probably something which requires 1% of the cpu time of a sscanf. Just: if (string[0]=='<' && !strncmp(string,"<numeric>",9)) {whatever} the first compare avoids even a subroutine call in the most likely case (string does not begin with <numeric>) but then checks extremely quickly if it is right after that. Even though cpu time is cheap, we should save it for useful work. Marcus From nas@python.ca Wed Apr 9 20:31:22 2003 From: nas@python.ca (Neil Schemenauer) Date: Wed, 9 Apr 2003 12:31:22 -0700 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> References: <3E946B52.7090708@v.loewis.de> <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> Message-ID: <20030409193122.GA20230@glacier.arctrix.com> Marcus Mendenhall wrote: > Even though cpu time is cheap, we should save it for useful work. Saving a few cycles while having to complicate the interface is not the Python way. +1 on restoring the old sscanf code (or something similar to it). ObTrivia: IP addresses can be written as a single number (at least for many IP implementations). Try "ping 2130706433". Neil From jeremy@zope.com Wed Apr 9 20:33:47 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 09 Apr 2003 15:33:47 -0400 Subject: [Python-Dev] tp_clear return value Message-ID: <1049916827.4961.64.camel@slothrop.zope.com> Why does tp_clear have a return value?
All the code I've seen returns 0, but the only place that clear is called doesn't inspect its return value. Jeremy From guido@python.org Wed Apr 9 20:40:56 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 15:40:56 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 14:14:16 CDT." <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> References: <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> Message-ID: <200304091941.h39Jf7A00697@odiug.zope.com> > Even though cpu time is cheap, we should save it for useful work. With that attitude, I'm surprised you're using Python at all. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Wed Apr 9 20:48:10 2003 From: nas@python.ca (Neil Schemenauer) Date: Wed, 9 Apr 2003 15:48:10 -0400 Subject: [Python-Dev] Re: tp_clear return value In-Reply-To: <1049916827.4961.64.camel@slothrop.zope.com> References: <1049916827.4961.64.camel@slothrop.zope.com> Message-ID: <20030409194810.GA27070@mems-exchange.org> On Wed, Apr 09, 2003 at 03:33:47PM -0400, Jeremy Hylton wrote: > Why does tp_clear have a return value? All the code I've seen returns > 0, but the only place that clear is called doesn't inspect its return > value. I guess I would have to say overdesign. I was thinking that tp_clear and tp_traverse could somehow be used by things other than the GC. In retrospect that doesn't seem likely or even possible. The GC has pretty specific requirements. In retrospect, I think both tp_traverse and tp_clear should have returned "void". That would have made implementing those methods easier. Testing for errors in tp_traverse methods is silly since nothing returns an error, and, even if it did, the GC couldn't handle it. :-( How do we sort this out? I suppose one step would be to document that the return values of tp_traverse and tp_clear are ignored. 
If we agree on that, I volunteer to go through the code and remove the useless tests for errors in the tp_traverse methods. Neil From guido@python.org Wed Apr 9 20:52:03 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 15:52:03 -0400 Subject: [Python-Dev] Re: tp_clear return value In-Reply-To: Your message of "Wed, 09 Apr 2003 15:48:10 EDT." <20030409194810.GA27070@mems-exchange.org> References: <1049916827.4961.64.camel@slothrop.zope.com> <20030409194810.GA27070@mems-exchange.org> Message-ID: <200304091952.h39Jq6Y01468@odiug.zope.com> > On Wed, Apr 09, 2003 at 03:33:47PM -0400, Jeremy Hylton wrote: > > Why does tp_clear have a return value? All the code I've seen returns > > 0, but the only place that clear is called doesn't inspect its return > > value. [In response, Neil admitted] > I guess I would have to say overdesign. I was thinking that tp_clear > and tp_traverse could somehow be used by things other than the GC. In > retrospect that doesn't seem likely or even possible. The GC has pretty > specific requirements. > > In retrospect, I think both tp_traverse and tp_clear should have > returned "void". That would have made implementing those methods > easier. Testing for errors in tp_traverse methods is silly since > nothing returns an error, and, even if it did, the GC couldn't handle > it. > > :-( > > How do we sort this out? I suppose one step would be to document that > the return values of tp_traverse and tp_clear are ignored. If we agree > on that, I volunteer to go through the code and remove the useless tests > for errors in the tp_traverse methods. That's a good first step. Unfortunately changing the declaration to void will break 3rd party extensions so that will be too painful. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jafo@tummy.com Wed Apr 9 20:22:48 2003 From: jafo@tummy.com (Sean Reifschneider) Date: Wed, 9 Apr 2003 13:22:48 -0600 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <3E946821.6010208@v.loewis.de> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <3E946821.6010208@v.loewis.de> Message-ID: <20030409192248.GQ1756@tummy.com> On Wed, Apr 09, 2003 at 08:36:17PM +0200, "Martin v. L?wis" wrote: >I disagree. Python should expose the resolver library, and leave caching >to it; many such libraries do caching already, in some form. Why don't we carry it to the logical conclusion and say that the resolver should also avoid doing a forward lookup on an already numeric IP? I've noticed that before the Red Hat 8.0 release, doing a "telnet " would usually be very fast on the initial connection, and since 8.0 it's been slow as if doing a lookup... To me that indicates that the resolver used to do this and has been changed to not, which makes me wonder why that was... Perhaps we're being too clever and it's going to come back to bite us? The "" syntax would allow us to leave resolution as it is and let the user override it when they deem necessary. If we try to auto-detect (which I'm usually all for), we should probably implement a "" or similar? Sean -- Geek English Rule #7: To reduce redundancy, the word "scary" can be left out of any statement containing the phrase "scary java applet". Sean Reifschneider, Inimitably Superfluous tummy.com, ltd. - Linux Consulting since 1995. Qmail, Python, SysAdmin From guido@python.org Wed Apr 9 21:05:50 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 16:05:50 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: Your message of "Wed, 09 Apr 2003 13:22:48 MDT." 
<20030409192248.GQ1756@tummy.com> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <3E946821.6010208@v.loewis.de> <20030409192248.GQ1756@tummy.com> Message-ID: <200304092005.h39K5pd01600@odiug.zope.com> > Why don't we carry it to the logical conclusion and say that the > resolver should also avoid doing a forward lookup on an already numeric > IP? > > I've noticed that before the Red Hat 8.0 release, doing a "telnet " > would usually be very fast on the initial connection, and since 8.0 it's > been slow as if doing a lookup... To me that indicates that the > resolver used to do this and has been changed to not, which makes me > wonder why that was... > > Perhaps we're being too clever and it's going to come back to bite us? I think it's the other way around. The resolver lost some perfectly good caching in the upgrade to support IPv6. The designers probably didn't notice the difference because in their own setup, DNS is fast. I expect the caching will come back eventually. > The "" syntax would allow us to leave > resolution as it is and let the user override it when they deem > necessary. If we try to auto-detect (which I'm usually all for), we > should probably implement a "" or similar? YAGNI. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Apr 9 21:27:01 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 09 Apr 2003 22:27:01 +0200 Subject: [Python-Dev] Re: tp_clear return value In-Reply-To: <20030409194810.GA27070@mems-exchange.org> References: <1049916827.4961.64.camel@slothrop.zope.com> <20030409194810.GA27070@mems-exchange.org> Message-ID: <3E948215.8050504@v.loewis.de> Neil Schemenauer wrote: > I guess I would have to say overdesign. I was thinking that tp_clear > and tp_traverse could somehow be used by things other than the GC. In > retrospect that doesn't seem likely or even possible. The GC has pretty > specific requirements. 
> > In retrospect, I think both tp_traverse and tp_clear should have > returned "void". While this is true for tp_clear, tp_traverse is actually more general. gc.get_referrers uses tp_traverse, for something other than collection. > That would have made implementing those methods > easier. Testing for errors in tp_traverse methods is silly since > nothing returns an error, and, even if it did, the GC couldn't handle > it. Again, gc.get_referrers "uses" this feature. If extending the list fails, traversal is aborted. Whether this is useful is questionable, as the entire notion of "out of memory exception handling" is questionable. Regards, Martin From jafo@tummy.com Wed Apr 9 21:33:19 2003 From: jafo@tummy.com (Sean Reifschneider) Date: Wed, 9 Apr 2003 14:33:19 -0600 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <16020.27171.834878.631470@montanaro.dyndns.org> References: <200304081450.h38EoqE20178@odiug.zope.com> <20030409124848.GB15649@tummy.com> <3E946821.6010208@v.loewis.de> <16020.27171.834878.631470@montanaro.dyndns.org> Message-ID: <20030409203319.GS1756@tummy.com> On Wed, Apr 09, 2003 at 01:44:51PM -0500, Skip Montanaro wrote: >Can a top-level domain be all digits? If not, why not assume numeric if >re.search(r"\.\d+$", addr) is not None? I don't think anyone sane would create a top-level that's digits, particularly in the range of 0 to 255. That probably means that somebody is going to do it... ;-/ I think checking for 2 to 4 dotted octets in the range of 0 to 255 would be safest... Yes, you can probably get away with using the regex above, but I wouldn't want to. Sean -- Sucking all the marrow out of life doesn't mean choking on the bone. -- _Dead_Poet's_Society_ Sean Reifschneider, Inimitably Superfluous tummy.com, ltd. - Linux Consulting since 1995. 
Qmail, Python, SysAdmin From tim.one@comcast.net Wed Apr 9 22:33:07 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 09 Apr 2003 17:33:07 -0400 Subject: [Python-Dev] Re: tp_clear return value In-Reply-To: <3E948215.8050504@v.loewis.de> Message-ID: [Neil Schemenauer] >> I was thinking that tp_clear and tp_traverse could somehow be used by >> things other than the GC. In retrospect that doesn't seem likely or even >> possible. The GC has pretty specific requirements. >> In retrospect, I think both tp_traverse and tp_clear should have >> returned "void". [Martin v. Lowis] > While this is true for tp_clear, tp_traverse is actually more general. > gc.get_referrers uses tp_traverse, for something other than collection. >> That would have made implementing those methods >> easier. Testing for errors in tp_traverse methods is silly since >> nothing returns an error, and, even if it did, the GC couldn't handle >> it. > Again, gc.get_referrers "uses" this feature. If extending the list > fails, traversal is aborted. Whether this is useful is questionable, > as the entire notion of "out of memory exception handling" is > questionable. The brand new gc.get_referents uses the return value of tp_traverse to abort on out-of-memory, but gc.get_referrers uses it for a different purpose (its traversal function returns true if the visited object is in the tuple of objects passed in, else returns false). The internal gc.get_referrers_for is what aborts on out-of-memory in the get_referrers subsystem. tp_traverse is fine as-is. The return value of tp_clear does indeed appear without plausible use. >> If we agree that, I volunteer to go through the code and remove the >> useless tests for errors in the tp_traverse methods. That would make get_referents press on after memory is exhausted. 
It would also change the semantics of get_referrers, in a subtle way (if object A has 25 references to object B, gc.get_referrers(B) contains only 1 instance of A today, but would contain 25 instances of A if tp_traverse methods ignored visit() return values). truth-isn't-necessarily-an-error-ly y'rs - tim From Jack.Jansen@oratrix.com Wed Apr 9 22:33:14 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Wed, 9 Apr 2003 23:33:14 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <20030409144037.GL1756@tummy.com> Message-ID: On woensdag, apr 9, 2003, at 16:40 Europe/Amsterdam, Sean Reifschneider wrote: > On Thu, Apr 10, 2003 at 12:24:45AM +1000, Anthony Baxter wrote: >> Ick ick. This is putting a bunch of code for a stub resolver into >> python. >> This stuff is hard to get right - I implemented this on top of pydns, >> and >> it was a lot of work to get (what I think is) correct, for not very >> much >> gain. > > Well, ideally you'd cache the data for as long as the SOA says to cache > it. However, it sounds like in the situation that started this thread, > even caching that data for some small but configurable number of > seconds > might help out. I wouldn't touch caching with a ten foot pole here: Python cannot know what happens under the hood of the network. For example, if I move my WiFi-equipped laptop from one location to another I don't want to be forced to restart my Python applications just to clear some silly cache, knowing that the OS and libc layers have handled the switch fine. 
(And, yes, Windoze-users are probably required to reboot anyway, but my Mac handles changing IP addresses just nicely:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From nas@python.ca Wed Apr 9 22:41:04 2003 From: nas@python.ca (Neil Schemenauer) Date: Wed, 9 Apr 2003 14:41:04 -0700 Subject: [Python-Dev] Re: tp_clear return value In-Reply-To: <3E948215.8050504@v.loewis.de> References: <1049916827.4961.64.camel@slothrop.zope.com> <20030409194810.GA27070@mems-exchange.org> <3E948215.8050504@v.loewis.de> Message-ID: <20030409214104.GA20544@glacier.arctrix.com> "Martin v. Löwis" wrote: > Neil Schemenauer wrote: > >In retrospect, I think both tp_traverse and tp_clear should have > >returned "void". > > While this is true for tp_clear, tp_traverse is actually more general. > gc.get_referrers uses tp_traverse, for something other than collection. Could the visit procedure keep track of errors? Something like:

    struct result {
        int error; /* true if an error occurred while traversing */
        /* other results */
    };

    static void
    myvisit(PyObject* obj, struct result *r)
    {
        if (!r->error) {
            /* normal processing; set r->error if an error occurs */
        }
    }

From martin@v.loewis.de Wed Apr 9 22:47:52 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 09 Apr 2003 23:47:52 +0200 Subject: [Python-Dev] Re: tp_clear return value In-Reply-To: <20030409214104.GA20544@glacier.arctrix.com> References: <1049916827.4961.64.camel@slothrop.zope.com> <20030409194810.GA27070@mems-exchange.org> <3E948215.8050504@v.loewis.de> <20030409214104.GA20544@glacier.arctrix.com> Message-ID: <3E949508.1030902@v.loewis.de> Neil Schemenauer wrote: > Could the visit procedure keep track of errors? No. For get_referrers (as Tim explains), it might be acceptable but less efficient (since traversal should stop when the object is found to be a referrer).
For get_referents, an error in the callback should really abort traversal as the system just went out of memory. Regards, Martin From db3l@fitlinxx.com Wed Apr 9 23:11:10 2003 From: db3l@fitlinxx.com (David Bolen) Date: 09 Apr 2003 18:11:10 -0400 Subject: [Python-Dev] Re: _socket efficiencies ideas References: <3E946B52.7090708@v.loewis.de> <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> <20030409193122.GA20230@glacier.arctrix.com> Message-ID: Neil Schemenauer writes: > Marcus Mendenhall wrote: > > Even though cpu time is cheap, we should save it for useful work. > > Saving a few cycles while having the complicate the interface is not the > Python way. +1 on restoring the old sscanf code (or something similar > to it). For what it's worth, whenever I had network code that I wanted to accept names or addresses, I always distinguished them through an attempt using the platform inet_addr() system call. If that returns an error (-1), then I go ahead and process it as a name, otherwise I use the address it returns. inet_addr() will itself take care of validating that the address is legal (e.g., no octet over 255 and only up to 4 octets), padding values as necessary (e.g., x.y.z is processed as if z was a 16-bit value, x.z as if z was a 24-bit value, x as a 32-bit value), and permits decimal, octal or hexadecimal forms of the individual octets. I believe this behavior is portable and well defined. If you wanted the same code to work for IPv4 and IPv6, you'd probably want to use inet_pton() instead since inet_addr() only does IPv4, although that would lose the hex/octal options. You'd probably have to conditionalize that anyway since it might not be available on IPv4 only configurations, so I could see using inet_addr() for IPv4 and inet_pton() for IPv6. > ObTrivia: IP addresses can be written as a single number (at least for > many IP implementations). Try "ping 2130706433". That's part of the inet_addr() definition. 
When a single value is given as the string, it is assumed to be the complete 32-bit address value, and is stored directly without any byte rearrangement. So, 2130706433 is (127*2^24) + 1, or "127.0.0.1" - but then obviously you knew that :-) -- David From greg@cosc.canterbury.ac.nz Thu Apr 10 01:31:34 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Apr 2003 12:31:34 +1200 (NZST) Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091424.h39EOje08304@localhost.localdomain> Message-ID: <200304100031.h3A0VYV24951@oma.cosc.canterbury.ac.nz> Anthony Baxter : > The idea of either suppressing DNS lookups for all-numeric addresses, or > some sort of extended API for suppressing DNS lookups might be better, > but really, isn't this the job of the stub resolver? Seems to me the basic problem is that we're representing two completely different things -- a DNS name and a raw IP address -- the same way, i.e. as a string. A raw IP address should (at least optionally) be represented by something different, such as a tuple of ints. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido@python.org Thu Apr 10 01:37:58 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Apr 2003 20:37:58 -0400 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: "Your message of Thu, 10 Apr 2003 12:31:34 +1200." <200304100031.h3A0VYV24951@oma.cosc.canterbury.ac.nz> References: <200304100031.h3A0VYV24951@oma.cosc.canterbury.ac.nz> Message-ID: <200304100037.h3A0bwt01972@pcp02138704pcs.reston01.va.comcast.net> > Seems to me the basic problem is that we're representing > two completely different things -- a DNS name and a raw > IP address -- the same way, i.e. as a string.
> > A raw IP address should (at least optionally) be represented > by something different, such as a tuple of ints. Why? There's never any ambiguity about which kind is intended. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Thu Apr 10 02:10:44 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Apr 2003 13:10:44 +1200 (NZST) Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091848.h39IlpW31935@odiug.zope.com> Message-ID: <200304100110.h3A1Aij25025@oma.cosc.canterbury.ac.nz> Guido van Rossum : > AFAIK it's not possible to put something in the DNS so that an > all-numeric address gets remapped In that case, there's no problem at all, and I withdraw my suggestion about using tuples for numeric addresses. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Apr 10 02:15:05 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Apr 2003 13:15:05 +1200 (NZST) Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> Message-ID: <200304100115.h3A1F5425035@oma.cosc.canterbury.ac.nz> Marcus Mendenhall : > Just: if (string[0]=='<' && not strncmp(string,"<numeric>",9)) > {whatever} By the same token, checking whether the first char is a digit ought to weed out about 99.999% of all non-numeric domain name addresses. If this is even a problem, which I doubt. We're talking about something called from Python, for goodness sake... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From andrew@acooke.org Thu Apr 10 02:27:35 2003 From: andrew@acooke.org (andrew cooke) Date: Wed, 9 Apr 2003 21:27:35 -0400 (CLT) Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304100115.h3A1F5425035@oma.cosc.canterbury.ac.nz> References: <750D46CE-6ABF-11D7-87F7-003065A81A70@vanderbilt.edu> <200304100115.h3A1F5425035@oma.cosc.canterbury.ac.nz> Message-ID: <40894.127.0.0.1.1049938055.squirrel@127.0.0.1> this is a fragment from RFC 1034 (DOMAIN NAMES - CONCEPTS AND FACILITIES) http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1034.html i'm not 100% sure that this is the "normative" definition, but if it is then it clearly requires a non-numeric initial character for each label. (sorry if someone has already mentioned this!) andrew 3.5 Preferred name syntax The DNS specifications attempt to be as general as possible in the rules for constructing domain names. The idea is that the name of any existing object can be expressed as a domain name with minimal changes. However, when assigning a domain name for an object, the prudent user will select a name which satisfies both the rules of the domain system and any existing rules for the object, whether these rules are published or implied by existing programs. For example, when naming a mail domain, the user should satisfy both the rules of this memo and those in RFC-822. When creating a new host name, the old rules for HOSTS.TXT should be followed. This avoids problems when old software is converted to use domain names. The following syntax will result in fewer problems with many applications that use domain names (e.g., mail, TELNET). <domain> ::= <subdomain> | " " <subdomain> ::=