From guido@python.org Sat Jun 1 02:21:59 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 31 May 2002 21:21:59 -0400
Subject: [Python-Dev] Customization docs
In-Reply-To: Your message of "Fri, 31 May 2002 18:45:13 EDT." <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com>
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com>
Message-ID: <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net>

I'll leave the doc questions for Fred (maybe better open a SF bug for them though). Then:

> From what I could find in the docs, it's completely non-obvious how the
> following works for immutable objects in containers:
>
> >>> x = [ 1, 2, 3]
> >>> x[1] += 3
> >>> x
> [1, 5, 3]
>
> Is the sequence of operations described someplace?

Um, in the code. :-( Using dis(), you'll find that x[1] += 3 executes the following:

          6 LOAD_FAST                0 (x)
          9 LOAD_CONST               1 (1)
         12 DUP_TOPX                 2
         15 BINARY_SUBSCR
         16 LOAD_CONST               2 (3)
         19 INPLACE_ADD
         20 ROT_THREE
         21 STORE_SUBSCR

> How does Python decide that sequence elements are immutable?

Huh? It doesn't. If they were mutable, had you expected something else?

>>> x = [[1], [3], [5]]
>>> x[1] += [6]
>>> x
[[1], [3, 6], [5]]
>>>

Basically, += on an attribute or subscripted container does the following:

(1) get the thing out
(2) apply the inplace operation to the thing
(3) put the thing back in

The inplace operation, of course, is a binary operator that *may* modify its first operand in place, but *must* return the resulting value; if it modified the first operand in place, it *should* return that operand. If a type doesn't support an inplace operation, the regular binary operator is invoked instead.

Does this help? (The whole thing is designed to be intuitive, but that probably doesn't work in your case.
:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From s_lott@yahoo.com Sat Jun 1 03:11:06 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Fri, 31 May 2002 19:11:06 -0700 (PDT)
Subject: [Python-Dev] deprecating string module?
In-Reply-To:
Message-ID: <20020601021106.29157.qmail@web9606.mail.yahoo.com>

In Python, you don't need overloading; you have a variety of optional parameter mechanisms. I think the "member functions" issues from C++ don't apply to Python because C++ is strongly typed, meaning that many similar functions have to be written with slightly different type signatures. The lack of strong typing makes it practical to write generic operations.

I find that use of free functions defeats good object-oriented design and leads to functionality being informally bound by a cluster of free functions that have similar names. I'm suspicious of this, finding it tiresome to maintain and debug.

--- "Martin v. Loewis" wrote:
> Guido van Rossum writes:
>
> > Is this still relevant to Python? Why are C++ member functions
> > difficult for generic programs? Does the same apply to Python methods?
>
> In a generic algorithm foo, you can write
>
>     def foo1(x):
>         bar(x)
>
> if you have global functions. With methods, you need to write
>
>     def foo1(x):
>         x.bar()
>
> which means that bar must be a method of x. This might be difficult to
> achieve if x is of a class that you cannot control. In C++, it is then
> still possible to define a function
>
>     def bar(x of-type-X):
>         pass
>
> which, by means of overload resolution, will be found from foo1
> automatically.
>
> In Python, this is not so easy since you don't have overloading.
>
> Regards,
> Martin
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev

=====
--
S.
Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa
Macintosh user: drinking upstream from the herd.

__________________________________________________
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com

From s_lott@yahoo.com Sat Jun 1 03:45:53 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Fri, 31 May 2002 19:45:53 -0700 (PDT)
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To:
Message-ID: <20020601024553.59600.qmail@web9607.mail.yahoo.com>

The class isn't really the unit of reuse. The old one-class-per-file rules from C++ aren't helpful for good reusable design; they are for optimizing compiling and making.

This is a great book on large-scale design considerations. Much of it is C++ specific, but parts apply to Python:

    Large-Scale C++ Software Design, John Lakos
    Addison-Wesley, Paperback, Published July 1996, 845 pages, ISBN 0201633620.

The module of related classes is the unit of reuse. A cluster of related modules can make sense for a large, complex reusable component, like an application program.

As a user, anything in a module file that is not a class definition (or the odd occasional convenience function) is a show-stopper. If there is some funny business to implement submodules, that ends my interest. Part of the open source social contract is that if I'm going to use it, I'd better be able to support it, even if you win the lottery and retire to a fishing boat in the Caribbean.

The question of Optik having several reusable elements pushes my envelope. If its job is to parse command line arguments, how many different reusable elements can there really be? Perhaps there are several candidate modules here. It seems difficult to justify putting them all into a library. The problem doesn't seem complex enough to justify a complex solution.

--- "Patrick K. O'Brien" wrote:
> [Barry A.
Warsaw] > > If that's so, then I'd prefer to see each class in its own > module > > inside a parent package. > > Without trying to open a can of worms here, is there any sort > of consensus > on the use of packages with multiple smaller modules vs. one > module > containing everything? I'm asking about the Python standard > library, > specifically. According to the one-class-per-module rule of > thumb, there are > some Python modules that could be refactored into packages. > Weighing against > that is the convenience of importing a single module. > > I'm just wondering if there are any guidelines that should > frame one's > thinking beyond the fairly obvious ones? For example, is the > standard > library an exceptional case because it must appeal to new > users as well as > experts? Does a good part of this issue come down to personal > preference? Or > are there advantages and disadvantages that should be > documented? (Maybe > they already have.) > > Is the current library configuration considered healthy? There > are a mix of > packages and single modules. Are these implementations pretty > optimal, or > would they be organized differently if one had the chance to > do it all over > again? > > Just curious. > > --- > Patrick K. O'Brien > Orbtech > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev ===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd. __________________________________________________ Do You Yahoo!? Yahoo! 
- Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com

From gward@python.net Sat Jun 1 03:57:39 2002
From: gward@python.net (Greg Ward)
Date: Fri, 31 May 2002 22:57:39 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <20020601024553.59600.qmail@web9607.mail.yahoo.com>
References: <20020601024553.59600.qmail@web9607.mail.yahoo.com>
Message-ID: <20020601025739.GA17229@gerg.ca>

On 31 May 2002, Steven Lott said:
> The question of Optik having several reusable elements pushes my
> envelope. If its job is to parse command line arguments, how many
> different reusable elements can there really be? Perhaps there are
> several candidate modules here. It seems difficult to justify putting
> them all into a library. The problem doesn't seem complex enough to
> justify a complex solution.

I think I agree with everything you said. There are only two important classes in Optik: OptionParser and Option. Together with one trivial support class (OptionValue) and some exception classes, that is the module -- the unit of reusability, in your terms.

For convenience while developing, I split Optik into three source files -- optik/option_parser.py, optik/option.py, and optik/errors.py. There's not that much code; about 1100 lines. And it's all pretty tightly related -- the OptionParser class is useless without Option, and vice-versa.

If you just want to use the code, it doesn't much matter if optik (or OptionParser) is a package with three sub-modules or a single file. If you just want to read the code, it's probably easier to have a single file. If you're hacking on it, it's probably easier to split the code up.

I think Optik is now moving into that long, happy phase where it is mostly read and rarely hacked on, so I think it's time to merge the three separate source files into one. I very much doubt that it's too complex for this -- I have worked hard to keep it tightly focussed on doing one thing well.
Greg

--
Greg Ward - nerd
gward@python.net
http://starship.python.net/~gward/
I appoint you ambassador to Fantasy Island!!!

From barry@zope.com Sat Jun 1 04:16:48 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 31 May 2002 23:16:48 -0400
Subject: [Python-Dev] subclass a module?
Message-ID: <15608.15520.96707.809995@anthem.wooz.org>

Am I freaking out, did I miss something, or was the `root' in my root beer float tonight something other than sarsaparilla?

-------------------- snip snip --------------------
Python 2.2.1 (#1, May 31 2002, 18:34:35)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> class allmodcons(string): pass
...
>>> string
>>> allmodcons
-------------------- snip snip --------------------

Can I now subclass from modules? And if so, what good does that do me?

-------------------- snip snip --------------------
>>> dir(allmodcons)
[]
>>> allmodcons.whitespace
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'whitespace'
>>> string.whitespace
'\t\n\x0b\x0c\r \xa0'
>>>
-------------------- snip snip --------------------

stickin'-to-herbal-tea-and-dr.-pepper-ly y'rs,
-Barry

From goodger@users.sourceforge.net Sat Jun 1 04:30:43 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 31 May 2002 23:30:43 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <package> import <module>"
In-Reply-To:
Message-ID:

I ran across this wrinkle and hope that someone can shed some light. First posted to comp.lang.python, but no help there. Can anyone here enlighten me?

I have a package on sys.path containing pairs of modules, each importing the other::

    package/
        __init__.py:
            # empty
        module1.py:
            import module2  # relative import
        module2.py:
            import module1

Executing "from package import module1" works fine.
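[Editor's note: the mutual-import layout above can be reproduced end to end with a small throwaway script. This sketch builds the package in a temporary directory and uses the absolute "import package.module" spelling (the pair of module names and the NAME attributes are added here purely so there is something to check; they are not from the original post):]

```python
import os
import sys
import tempfile

# Recreate the example package on disk so the script is self-contained.
# The two modules import each other with absolute "import package.module"
# statements, the form the thread reports as working.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "package")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()   # empty __init__.py
with open(os.path.join(pkg, "module3.py"), "w") as f:
    f.write("import package.module4  # absolute import\n"
            "NAME = 'module3'\n")
with open(os.path.join(pkg, "module4.py"), "w") as f:
    f.write("import package.module3\n"
            "NAME = 'module4'\n")

sys.path.insert(0, root)
from package import module3  # the circular imports resolve fine

print(module3.NAME)                  # -> module3
print(module3.package.module4.NAME)  # -> module4
```

The key point is that "import package.module4" only binds the top-level name "package" in the importing module; the submodule attribute is looked up lazily, after both modules have finished executing, which is why the cycle is harmless in this spelling.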
Changing the import statements to absolute dotted forms also works for "from package import module3"::

    module3.py:
        import package.module4  # absolute import
    module4.py:
        import package.module3

However, if I change both imports to be absolute using the "from/import" form, it doesn't work::

    module5.py:
        from package import module6  # absolute import
    module6.py:
        from package import module5

Now I get an exception::

    >>> from package import module5
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "package/module5.py", line 1, in ?
        from package import module6
      File "package/module6.py", line 1, in ?
        from package import module5
    ImportError: cannot import name module5

Is this behavior expected? Or is it a bug? I note that FAQ entry 4.37 [*]_ says we shouldn't do "from <module> import *"; I'm not. Are all "from ... import" statements forbidden in this context? Why? (It seems to me that "import package.module" and "from package import module" are equivalent imports, except for their effect on the local namespace.) Is there an authoritative reference (docs, past c.l.p post, bug report, etc.)?

.. [*] http://www.python.org/cgi-bin/faqw.py?req=show&file=faq04.037.htp

--
David Goodger
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/

From guido@python.org Sat Jun 1 05:27:44 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 00:27:44 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: Your message of "Fri, 31 May 2002 22:57:39 EDT." <20020601025739.GA17229@gerg.ca>
References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca>
Message-ID: <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>

> If you're hacking on it, it's probably easier to split the code up.
Hm, that's not how I tend to hack on things (except when working with others who like that style). Why do you find hacking on several (many?) small files easier than on a single large file? Surely not because loading a large file (in the editor, or in Python) takes too long? That was in the 80s. :-) Is it because multiple Emacs buffers allow you to maintain multiple current positions, with all the context that that entails? Or is it something else?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jun 1 05:29:51 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 00:29:51 -0400
Subject: [Python-Dev] subclass a module?
In-Reply-To: Your message of "Fri, 31 May 2002 23:16:48 EDT." <15608.15520.96707.809995@anthem.wooz.org>
References: <15608.15520.96707.809995@anthem.wooz.org>
Message-ID: <200206010429.g514Tpi19397@pcp742651pcs.reston01.va.comcast.net>

> Can I now subclass from modules?

It's a bug IMO.

> And if so, what good does that do me?

None whatsoever. The resulting class cannot be instantiated.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jun 1 05:42:42 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 00:42:42 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <package> import <module>"
In-Reply-To: Your message of "Fri, 31 May 2002 23:30:43 EDT."
References:
Message-ID: <200206010442.g514gg419477@pcp742651pcs.reston01.va.comcast.net>

> However, if I change both imports to be absolute using the
> "from/import" form, it doesn't work::
>
>     module5.py:
>         from package import module6  # absolute import
>
>     module6.py:
>         from package import module5
>
> Now I get an exception::
>
>     >>> from package import module5
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "package/module5.py", line 1, in ?
>         from package import module6
>       File "package/module6.py", line 1, in ?
> from package import module5 > ImportError: cannot import name module5 > > Is this behavior expected? Or is it a bug? It's probably due to the extremely subtle (lame?) way that "from package import module" is (has to be?) implemented. It's too late at night for me to dig further to come up with an explanation, but maybe reading the file knee.py is helpful -- it gives the *algorithm* used for package and module import. In 2.2 and before, it's Lib/knee.py; in 2.3, it's been moved to Demo/imputils/knee.py. I think that you'll have to live with it. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net> Message-ID: <073501c20960$23f367e0$6601a8c0@boostconsulting.com> From: "Guido van Rossum" > Um, in the code. :-( Using dis(), you'll find that x[1]+=3 executes > the following: > > 6 LOAD_FAST 0 (x) > 9 LOAD_CONST 1 (1) > 12 DUP_TOPX 2 > 15 BINARY_SUBSCR > 16 LOAD_CONST 2 (3) > 19 INPLACE_ADD > 20 ROT_THREE > 21 STORE_SUBSCR > > > How does Python decide that sequence elements are immutable? > > Huh? It doesn't. If they were mutable, had you expected something > else? Actually, yes. I had expcected that Python would know it didn't need to "put the thing back in", since the thing gets modified in place. Knowing that it doesn't work that way clears up a lot. > >>> x = [[1], [3], [5]] > >>> x[1] += [6] > >>> x > [[1], [3, 6], [5]] > >>> Well of /course/ I know that's the result. The question was, how is the result achieved? > Basically, += on an attribute or subscripted container does the > following: > > (1) get the thing out > (2) apply the inplace operation to the thing > (3) put the thing back in > > The inplace operation, of course, is a binary operator that *may* > modify its first operand in place, but *must* return the resulting > value; if it modified the first operand in place, it *should* return > that operand. 
> If a type doesn't support an inplace operation, the regular binary
> operator is invoked instead.

That's the easy part.

> Does this help? (The whole thing is designed to be intuitive, but
> that probably doesn't work in your case. :-)

I use this stuff from Python without thinking about it, but when it comes to building new types, I sometimes need to have a better sense of the underlying mechanism.

Thanks,
Dave

P.S. Say, you could optimize away putting the thing back at runtime if the inplace operation returns its first argument... but you probably already thought of that one.

From David Abrahams"
Message-ID: <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>

> In Python, you don't need overloading; you have a variety of
> optional parameter mechanisms

...which forces users to write a centralized dispatching mechanism that could be much more elegantly handled by the language. The language already does something just for operators, but the rules are complicated and don't scale well.

> I think the "member functions" issues from C++ don't apply to
> Python because C++ is strongly typed, meaning that many similar
> functions have to be written with slightly different type
> signatures.

That's very seldom the case in my C++ code. Why would you do that in lieu of writing function templates? I think Martin hit the nail on the head: you can achieve some decoupling of algorithms from data structures using free functions, but you need some way to look up the appropriate free function for a given data structure. For that, you need some kind of overload resolution.

> The lack of strong typing makes it practical to
> write generic operations.

Templates and overloading in C++ make it practical to write statically-type-checked generic operations.

From David Abrahams" <200206010429.g514Tpi19397@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <076b01c20964$573b9f60$6601a8c0@boostconsulting.com>

From: "Guido van Rossum"
> > Can I now subclass from modules?
>
> It's a bug IMO.
> > And if so, what good does that do me?
>
> None whatsoever. The resulting class cannot be instantiated.

Really?

>>> import re
>>> class X(type(re)):
...     def hello(): print 'hi'
...
>>> newmod = X()
>>> newmod.hello
>

From neal@metaslash.com Sat Jun 1 14:20:04 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 01 Jun 2002 09:20:04 -0400
Subject: [Python-Dev] PYC Magic
Message-ID: <3CF8CA04.C9070F3A@metaslash.com>

I recently posted a patch to fix a bug: http://python.org/sf/561858. The patch requires changing .pyc magic. Since this bug goes back to 2.1, what is the process for changing .pyc magic in bugfix releases? I.e., is it allowed?

In this case the co_stacksize > 32767 and only a short is written to disk. This could be doubled to 65536 (probably should be) without changing the magic. But even that isn't sufficient to solve this problem.

It also brings up a related problem. If the PyCodeObject can't be written to disk, should a .pyc be created at all? The code will run fine the first time, but when imported the second time it will fail.

The other 16 bit values stored are: co_argcount, co_nlocals, co_flags. At least argcount & nlocals aren't too likely to exceed 32k, but co_flags could, which would be silently ignored now.

Neal

From guido@python.org Sat Jun 1 14:34:01 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:34:01 -0400
Subject: [Python-Dev] Customization docs
In-Reply-To: Your message of "Sat, 01 Jun 2002 07:33:20 EDT." <073501c20960$23f367e0$6601a8c0@boostconsulting.com>
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com> <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net> <073501c20960$23f367e0$6601a8c0@boostconsulting.com>
Message-ID: <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net>

> > > How does Python decide that sequence elements are immutable?
> >
> > Huh? It doesn't. If they were mutable, had you expected something
> > else?
>
> Actually, yes.
> I had expected that Python would know it didn't need to "put the thing
> back in", since the thing gets modified in place. Knowing that it
> doesn't work that way clears up a lot.

Still, I don't understand which other outcome than [1, 6, 5] you had expected.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jun 1 14:35:09 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:35:09 -0400
Subject: [Python-Dev] deprecating string module?
In-Reply-To: Your message of "Sat, 01 Jun 2002 07:42:09 EDT." <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>
References: <20020601021106.29157.qmail@web9606.mail.yahoo.com> <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>
Message-ID: <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net>

> > In Python, you don't need overloading; you have a variety of
> > optional parameter mechanisms
>
> ...which forces users to write a centralized dispatching mechanism
> that could be much more elegantly handled by the language. The
> language already does something just for operators, but the rules
> are complicated and don't scale well.

I don't think the situation can be improved without adding type declarations.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jun 1 14:36:51 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:36:51 -0400
Subject: [Python-Dev] subclass a module?
In-Reply-To: Your message of "Sat, 01 Jun 2002 08:00:53 EDT." <076b01c20964$573b9f60$6601a8c0@boostconsulting.com>
References: <15608.15520.96707.809995@anthem.wooz.org> <200206010429.g514Tpi19397@pcp742651pcs.reston01.va.comcast.net> <076b01c20964$573b9f60$6601a8c0@boostconsulting.com>
Message-ID: <200206011336.g51Daqe21701@pcp742651pcs.reston01.va.comcast.net>
> > Really?
>
> >>> import re
> >>> class X(type(re)):
> ...     def hello(): print 'hi'
> ...
> >>> newmod = X()
> >>> newmod.hello
> >

You subclass the module metaclass. The example we were discussing was different: it subclassed the module itself, like this:

>>> import re
>>> class X(re): pass
...
>>> X()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: 'module' object is not callable
>>>

--Guido van Rossum (home page: http://www.python.org/~guido/)

From David Abrahams" <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net> <073501c20960$23f367e0$6601a8c0@boostconsulting.com> <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <07d801c20970$cdebe050$6601a8c0@boostconsulting.com>

From: "Guido van Rossum"
> > > > How does Python decide that sequence elements are immutable?
> > >
> > > Huh? It doesn't. If they were mutable, had you expected something
> > > else?
> >
> > Actually, yes. I had expected that Python would know it didn't need
> > to "put the thing back in", since the thing gets modified in
> > place. Knowing that it doesn't work that way clears up a lot.
>
> Still, I don't understand which other outcome than [1, 6, 5] you had
> expected.

As I indicated in my previous mail, I didn't expect any other result. My question was about what a new type needs to do in order for things to work properly in Python. If, as I had incorrectly assumed, Python were checking a type's mutability before deciding whether it would be putting the result back into the sequence, I would need to know what criteria Python uses to decide mutability.

-Dave

From guido@python.org Sat Jun 1 14:45:09 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:45:09 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: Your message of "Sat, 01 Jun 2002 09:20:04 EDT."
<3CF8CA04.C9070F3A@metaslash.com> References: <3CF8CA04.C9070F3A@metaslash.com> Message-ID: <200206011345.g51Dj9h21769@pcp742651pcs.reston01.va.comcast.net> > I recently posted a patch to fix a bug: http://python.org/sf/561858. > The patch requires changing .pyc magic. Since this bug goes back > to 2.1, what is the process for changing .pyc magic in bugfix releases? > ie, is it allowed? Absolutely not!!!!! .pyc files must remain 100% compatible!!! (Imagine someone doing a .pyc-only distribution for 2.1.3 and finding that it doesn't work for 2.1.4!) > In this case the co_stacksize > 32767 and only a short is written > to disk. This could be doubled to 65536 (probably should be) > without changing the magic. But even that isn't sufficient > to solve this problem. I guess the only way to fix this in 2.1.x is to raise an error -- that's better than the crash that will follow if you try to execute that code. > It also brings up a related problem. If the PyCodeObject > can't be written to disk, should a .pyc be created at all? > The code will run fine the first time, but when imported > the second time it will fail. What do you mean by "can't be written to disk"? Is the disk full? Is there another kind of write error? The magic number is written last, only when the write is successful. > The other 16 bit values stored are: co_argcount, co_nlocals, co_flags. > At least argcount & nlocals aren't too likely to exceed 32k, but > co_flags could, which would be silently ignored now. If you're going to change the marshal format anyway, I'd increase all of them to 32 bit ints. After all, I thought the stacksize would never exceed 32K either... --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@python.net Sat Jun 1 14:42:36 2002 From: gward@python.net (Greg Ward) Date: Sat, 1 Jun 2002 09:42:36 -0400 Subject: [Python-Dev] Where to put wrap_text()? 
Message-ID: <20020601134236.GA17691@gerg.ca>

Hidden away in distutils.fancy_getopt is an exceedingly handy function called wrap_text(). It does just what you might expect from the name:

def wrap_text (text, width):
    """wrap_text(text : string, width : int) -> [string]

    Split 'text' into multiple lines of no more than 'width'
    characters each, and return the list of strings that results.
    """

Surprise surprise, Optik uses this. I've never been terribly happy about importing it from distutils.fancy_getopt, and putting Optik into the standard library as OptionParser is a great opportunity for putting wrap_text somewhere more sensible. I happen to think that wrap_text() is useful for more than just auto-formatting --help messages, so hiding it away in OptionParser.py doesn't seem right. Also, Perl has a Text::Wrap module that's been part of the standard library for not-quite-forever -- so shouldn't Python have one too?

Proposal: a new standard library module, wrap_text, which combines the best of distutils.fancy_getopt.wrap_text() and Text::Wrap. Right now, I'm thinking of an interface something like this:

wrap(text : string, width : int) -> [string]
    Split 'text' into multiple lines of no more than 'width' characters
    each, and return the list of strings that results. Tabs in 'text'
    are expanded with string.expandtabs(), and all other whitespace
    characters (including newline) are converted to space. [This is
    identical to distutils.fancy_getopt.wrap_text(), but the docstring
    is more complete.]

wrap_nomunge(text : string, width : int) -> [string]
    Same as wrap(), without munging whitespace. [Not sure if this is
    really useful to expose publicly. Opinions?]

fill(text : string, width : int,
     initial_tab : string = "", subsequent_tab : string = "") -> string
    Reformat the paragraph in 'text' to fit in lines of no more than
    'width' columns.
The first line is prefixed with 'initial_tab', and subsequent lines are prefixed with 'subsequent_tab'; the lengths of the tab strings are accounted for when wrapping lines to fit in 'width' columns. [This is just a glorified "\n".join(wrap(...)); the idea to add initial_tab and subsequent_tab was stolen from Perl's Text::Wrap.] I'll go whip up some code and submit a patch to SF. If people like it, I'll even write some tests and documentation too. Greg -- Greg Ward - Unix nerd gward@python.net http://starship.python.net/~gward/ Support bacteria -- it's the only culture some people have! From guido@python.org Sat Jun 1 14:54:20 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 09:54:20 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: Your message of "Sat, 01 Jun 2002 09:42:36 EDT." <20020601134236.GA17691@gerg.ca> References: <20020601134236.GA17691@gerg.ca> Message-ID: <200206011354.g51DsKK21861@pcp742651pcs.reston01.va.comcast.net> > Proposal: a new standard library module, wrap_text, which combines the > best of distutils.fancy_getopt.wrap_text() and Text::Wrap. I think this is a fine idea. But *please* don't put an underscore in the name. I'd say "wrap" or "wraptext" are better than "wrap_text". --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Sat Jun 1 14:49:46 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 1 Jun 2002 09:49:46 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: <20020601134236.GA17691@gerg.ca> References: <20020601134236.GA17691@gerg.ca> Message-ID: <20020601134946.GA608@panix.com> On Sat, Jun 01, 2002, Greg Ward wrote: > > Proposal: a new standard library module, wrap_text, which combines the > best of distutils.fancy_getopt.wrap_text() and Text::Wrap. Personally, I'd like to at least get the functionality of some versions of 'fmt', which have both goal and maxlength parameters. 
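[Editor's note: for the curious, the wrap()/fill() interface Greg proposes above can be sketched in a few lines of pure Python. This is a simplification, not the distutils implementation: words longer than 'width' are not broken, and fill() wraps to the longer of the two prefixes instead of doing exact per-line accounting.]

```python
def wrap(text, width):
    """Split 'text' into lines of no more than 'width' characters each,
    expanding tabs and collapsing all other whitespace (including
    newlines) to single spaces, as in the proposal above."""
    words = text.expandtabs().split()   # split() also munges whitespace
    lines, current = [], ""
    for word in words:
        candidate = current + " " + word if current else word
        if len(candidate) <= width or not current:
            current = candidate          # fits (or the word stands alone)
        else:
            lines.append(current)        # line is full; start a new one
            current = word
    if current:
        lines.append(current)
    return lines

def fill(text, width, initial_tab="", subsequent_tab=""):
    """Glorified "\\n".join(wrap(...)): reformat 'text' to fit in 'width'
    columns, prefixing the first line with 'initial_tab' and the rest
    with 'subsequent_tab', whose lengths count against 'width'."""
    room = width - max(len(initial_tab), len(subsequent_tab))
    lines = wrap(text, room)
    return "\n".join((initial_tab if i == 0 else subsequent_tab) + line
                     for i, line in enumerate(lines))

print(wrap("The quick brown fox jumps over the lazy dog", 15))
# -> ['The quick brown', 'fox jumps over', 'the lazy dog']
print(fill("The quick brown fox", 12, subsequent_tab="  "))
```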
If you feel like getting ambitious, there's the 'par' program that can wrap quoted text, but that can always be added to a later version of the library.

--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler

From pinard@iro.umontreal.ca Sat Jun 1 15:00:04 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 01 Jun 2002 10:00:04 -0400
Subject: [Python-Dev] Those (punctuations and skull heads) bug tracking systems! :-)
Message-ID:

Hi, gang. Anecdotal rants from a technical Luddite, yours truly! :-)

A few days ago, I wrote Fred about a tiny problem in Python documentation. Fred replied (very nicely, don't doubt it) something like "Yes, this should absolutely be corrected, but being busy now, I might forget about this -- so please submit a bug report using the SF tracker". I learnt to shudder with horror when people tell me such things. Email is so simple, clean, expeditious and human! Each time I have to use a BTS, it is the same story: I spend a lot of hours studying it, then later experimenting with the system. And finally, I invariably fall into dead ends, after having met a few blatant bugs in the BTS itself. Don't tell me it's my browser. The browser is an integral part of the BTS. Think "user" here!

So, trying once more to be a good citizen, I spent many hours yesterday sorting and reading the email I saved over time, with various comments or references from Python developers about the BTS in use. I filed these foreseeing that I could not escape the Python BTS forever, especially if I want to involve myself a bit more.
Reading all this more attentively, I noticed a flurry of alternate, confusing, and sometimes heavy notations to access already submitted reports and documentation, and changes in numbering and methods over time that did not always seem to be fully gracious; I admired the relative nicety of the Python SF redirector, and noted its minor shortcomings. Notable to me were many developer comments about reports being mis-attributed, re-filed, unduly aging or nearly lost in practice. This morning, I decided to give it a serious try, knowing that there is a facility to prepare the message offline using a reasonable editor (Netscape is very far from my concept of a usable editor) and submit it afterwards. I prepared the message yesterday, saved it into a `temp0' file, and moved it over to the machine here, coming back from travel. Netscape first refused to see that `temp0' file in its directory in the file browsing window; it apparently only saw `*.html' files. I was surely not going to turn my little communication into HTML first for Fred to see, so I merely typed the file name in the upload box. "Category", "Group", "Summary", "Check to Upload and Attach File" all had a little '?' beside them, from which I expected some documentation, but clicking on them yielded "File loaded" in the bottom echo area, and _nothing_ more. For "File Description" in particular, I would have needed more information, but there was no `?' next to it, so I merely guessed it wanted a MIME type and wrote "text/plain" within it. Clicking "SUBMIT" gave something like "ERROR Invalid file name", and no kind of feedback about the bug having been submitted. So I guessed the file needed an extension, renamed `temp0' into `temp0.txt', then modified the file name accordingly in the upload box. Re-attempting "SUBMIT" a second time yielded: "ERROR You Attempted To Double-submit this item. Please avoid double-clicking." Sigh! The usual misery! OK.
Instead of uploading a prepared file, I will now proceed to try cutting and pasting into Netscape from a real editor, hoping that the mangling will be limited. I do know I have more comments and nuances for Fred; I should find the courage to share them: I fear one does not have much of a choice for contributing. Such rotten reporting systems are merely discouraging. Hmph! I'll continue trying to tame myself to these user interface failures. Sometimes, I ponder that if all maintainers were using the same BTS, the effort of learning to cope with _that one_ would probably have some more worth. I do imagine that a BTS could be useful. There are many BTS around, random projects using random BTS -- so the effort of fighting with a BTS often has to be restarted when you play in many fields. Moreover, the truth is, at least for Python, that using a BTS does not solve the main problem, which is the insufficient number of contributors and developers. Risk for risk, I still think I have a much better chance of being listened to and understood when I write to Fred directly! With some luck, Fred is an ordered and careful man who, just like me, is able to handle folders. I read with pleasure all the thread saying that `roundup' has an email interface, is actively being improved, and could replace the SF tracker. Let us hope it will be more usable than its predecessors! You know, the real goal of all this is allowing for simple and humble communication between humans, about the knowledge of a problem. I surely used to be a very active reporter for all problems I saw everywhere, at the time maintainers were still reachable. When the effort gets too frustrating, sadly, one might feel less inclined to offer contributions, and rather choose to enjoy more of the sun, music, and life! :-) It may look like a useless moan, but it might be worth saying after all.
-- François Pinard http://www.iro.umontreal.ca/~pinard From neal@metaslash.com Sat Jun 1 15:15:22 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sat, 01 Jun 2002 10:15:22 -0400 Subject: [Python-Dev] PYC Magic References: <3CF8CA04.C9070F3A@metaslash.com> <200206011345.g51Dj9h21769@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3CF8D6FA.D9D7B8CE@metaslash.com> Guido van Rossum wrote: > > > I recently posted a patch to fix a bug: http://python.org/sf/561858. > > The patch requires changing .pyc magic. Since this bug goes back > > to 2.1, what is the process for changing .pyc magic in bugfix releases? > > ie, is it allowed? > > Absolutely not!!!!! .pyc files must remain 100% compatible!!! > (Imagine someone doing a .pyc-only distribution for 2.1.3 and finding > that it doesn't work for 2.1.4!) Ok, I'll work on a patch for 2.1/2.2. In looking through other magic code, I found that when -U was removed, it was only removed from the usage msg. Should the option and code be removed altogether? ie, -U is still used by getopt and still changes the magic, so -U is just a hidden option now. > > It also brings up a related problem. If the PyCodeObject > > can't be written to disk, should a .pyc be created at all? > > The code will run fine the first time, but when imported > > the second time it will fail. > > What do you mean by "can't be written to disk"? Is the disk full? Is > there another kind of write error? The magic number is written last, > only when the write is successful. Disk full was one condition. The other condition was if a value is 32 bits in memory, but only 16 bits are written to disk. Based on your comment to increase all of the 16-bit values for PyCode, that will no longer be the case. There could also be transient write errors that leave the file corrupted, since only part of the data would be written. One case where this could happen is an interrupted system call. There is one other possible problem.
[wr]_short() is now only used in one place: for long.digits which are unsigned ints. But r_short() does sign extension. Is this a problem? Neal From paul-python@svensson.org Sat Jun 1 15:17:50 2002 From: paul-python@svensson.org (Paul Svensson) Date: Sat, 1 Jun 2002 10:17:50 -0400 (EDT) Subject: [Python-Dev] Customization docs In-Reply-To: <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net> Message-ID: On Sat, 1 Jun 2002, Guido van Rossum wrote: >> > > How does Python decide that sequence elements are immutable? >> > >> > Huh? It doesn't. If they were mutable, had you expected something >> > else? >> >> Actually, yes. I had expected that Python would know it didn't need >> to "put the thing back in", since the thing gets modified in >> place. Knowing that it doesn't work that way clears up a lot. > >Still, I don't understand which other outcome than [1, 6, 5] you had >expected. Well, _I_ would have expected this to work: Python 2.1 (#4, Jun 6 2001, 08:54:49) [GCC 2.95.2 19991024 (release)] on linux2 Type "copyright", "credits" or "license" for more information. >>> x = ([],[],[]) >>> x[1] += [1] Traceback (most recent call last): File "", line 1, in ? TypeError: object doesn't support item assignment Given that the object x[1] can be (and is) modified in place, I find this behaviour quite counter-intuitive, especially considering: >>> z = x[1] >>> z += [2] >>> x ([], [1, 2], []) /Paul From neal@metaslash.com Sat Jun 1 15:18:31 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sat, 01 Jun 2002 10:18:31 -0400 Subject: [Python-Dev] Where to put wrap_text()? References: <20020601134236.GA17691@gerg.ca> <200206011354.g51DsKK21861@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3CF8D7B7.BFA0AD14@metaslash.com> Guido van Rossum wrote: > > > Proposal: a new standard library module, wrap_text, which combines the > > best of distutils.fancy_getopt.wrap_text() and Text::Wrap. > > I think this is a fine idea. But *please* don't put an underscore in > the name.
I'd say "wrap" or "wraptext" are better than "wrap_text". Some possibilities are:

* a string method
* a UserString method
* a new module text, with a function wrap()
* add function wrap() to UserString

Should it work on unicode strings too? Neal From walter@livinglogic.de Sat Jun 1 15:23:39 2002 From: walter@livinglogic.de (Walter Dörwald) Date: Sat, 01 Jun 2002 16:23:39 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> Message-ID: <3CF8D8EB.60604@livinglogic.de> Raymond Hettinger wrote: > While we're eliminating uses of the string and types modules, how about > other code clean-ups and modernization: > > [...] don't forget: import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime But to be able to remove "import stat" everywhere the remaining functions in stat.py would have to be implemented as methods. And what about the remaining constants defined in stat.py? Bye, Walter Dörwald From gward@python.net Sat Jun 1 15:38:55 2002 From: gward@python.net (Greg Ward) Date: Sat, 1 Jun 2002 10:38:55 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020601143855.GA18632@gerg.ca> On 01 June 2002, Guido van Rossum said: > Hm, that's not how I tend to hack on things (except when working with > others who like that style). Why do you find hacking on several > (many?) small files easier for you than on a single large file? Actually, Optik started out in one file; I split it up somewhere around 600 or 700 lines of code expecting it to grow more. It only grew to around 1100 lines, which I suppose is a good thing. I think having small modules makes me more comfortable about adding code -- I don't feel at all hemmed-in adding 50 lines to a 300-line module, but adding 50 lines to an 800-line module makes me nervous. I think it all boils down to having things in easily-digested chunks, rather than concerns about stressing Emacs out. (OTOH and wildly OT: since I gave in a couple years ago and started using Emacs syntax-colouring, it *does* take a lot longer to load modules up -- eg. ~2 sec for the 1000-line rfc822.py. But that's probably just because Emacs is a great shaggy beast of an editor ("Eight(y) Megs and Constantly Swapping", "Eventually Mallocs All Core Storage", you know...). I'm sure if I got a brain transplant so that I could use vim, it would be different.) Greg -- Greg Ward - programmer-at-big gward@python.net http://starship.python.net/~gward/ Gee, I feel kind of LIGHT in the head now, knowing I can't make my satellite dish PAYMENTS!
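For readers following the augmented-assignment thread above, Guido's three-step description of `x[1] += 3` -- (1) get the thing out, (2) apply the in-place operation, (3) put the thing back in -- can be sketched directly in Python. This is an illustrative sketch, not CPython's actual implementation; `inplace_add_item` is a made-up helper name.

```python
# Sketch of what ``x[1] += 3`` expands to, per Guido's description:
# (1) BINARY_SUBSCR, (2) INPLACE_ADD, (3) STORE_SUBSCR.
import operator

def inplace_add_item(container, index, value):
    item = container[index]             # (1) get the thing out
    item = operator.iadd(item, value)   # (2) in-place add; may mutate item
    container[index] = item             # (3) put the thing back in
    return container

x = [1, 2, 3]
inplace_add_item(x, 1, 3)
print(x)  # [1, 5, 3]

# With a mutable element inside a tuple, steps (1) and (2) still run;
# it is step (3) that raises TypeError -- after the mutation already
# happened, which is exactly the behaviour Paul found counter-intuitive.
t = ([], [], [])
try:
    inplace_add_item(t, 1, [1])
except TypeError:
    pass
print(t)  # ([], [1], []) -- the list was mutated before the store failed
```

The same three steps apply to attributes (`obj.attr += v`), with the subscript operations replaced by attribute get/set.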
From aahz@pythoncraft.com Sat Jun 1 15:44:05 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 1 Jun 2002 10:44:05 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020601144405.GA6490@panix.com> On Sat, Jun 01, 2002, Guido van Rossum wrote: > > > If you're hacking on it, it's probably easier to split the code up. > > Hm, that's not how I tend to hack on things (except when working with > others who like that style). Why do you find hacking on several > (many?) small files easier for you than on a single large file? > Surely not because loading a large file (in the editor, or in Python) > takes too long? That was in the 80s. :-) Is it because multiple > Emacs buffers allow you to maintain multiple current positions, with > all the context that that entails? Or is it something else? s/Emacs/vi sessions/ Yes. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler From David Abrahams" <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net> <073501c20960$23f367e0$6601a8c0@boostconsulting.com> <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net> <07d801c20970$cdebe050$6601a8c0@boostconsulting.com> <200206011348.g51Dm0i21793@pcp742651pcs.reston01.va.comcast.net> Message-ID: <081001c2097d$4fce5740$6601a8c0@boostconsulting.com> From: "Guido van Rossum" > > As I indicated in my previous mail, I didn't expect any other result. > > Then your question was formulated strangely. You showed the result > and said "how does it know that list items are immutable"; the context > suggested strongly to me that you had expected something else. 
> > My question was about what a new type needs to do in order for things to > > work properly in Python. > > You could have asked that directly. :-) Incorrect background assumptions have a way of fouling communication. I hope it's obvious that I'm making an effort to be clear. -Dave From guido@python.org Sat Jun 1 16:20:01 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 11:20:01 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: Your message of "Sat, 01 Jun 2002 10:18:31 EDT." <3CF8D7B7.BFA0AD14@metaslash.com> References: <20020601134236.GA17691@gerg.ca> <200206011354.g51DsKK21861@pcp742651pcs.reston01.va.comcast.net> <3CF8D7B7.BFA0AD14@metaslash.com> Message-ID: <200206011520.g51FK1P22041@pcp742651pcs.reston01.va.comcast.net> > Some possibilities are: > > * a string method > * a UserString method This should *definitely* not be a method. Too specialized, too many possibilities for tweaking the algorithm. > * a new module text, with a function wrap() > * add function wrap() to UserString > > Should it work on unicode strings too? Yes. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com> <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net> Message-ID: <082601c2097f$f7b75900$6601a8c0@boostconsulting.com> From: "Guido van Rossum" > > > In python, you don't need overloading, you have a variety of > > > optional parameter mechanisms > > > > ...which forces users to write centralized dispatching mechanism > > that could be much more elegantly-handled by the language. The > > language already does something just for operators, but the rules > > are complicated and don't scale well. > > I don't think the situation can be improved without adding type > declarations. You could do a little without type declarations, to handle functions with different numbers of arguments or keywords, though I don't think that would be very satisfying.
A good solution would not necessarily need full type declaration capability; just some way to annotate function signatures with types. What I mean is that, for example, the ability to declare the type of a local variable would not be of any use in overload resolution. In fact, pure (sub)type comparison is probably not the best mechanism for Python's overload resolution. For example, it should be possible to write a function which will match any sequence object. I'd like to see something like the following sketch:

1. Each function can have an optional associated rating function which, given (args, kw), returns a float describing the quality of the match to its arguments.

2. When calling an overloaded function, the best match from the pool of overloads is taken.

3. The default rating function works as follows:

   a. Each formal argument has an optional associated match object.
   b. The match object contributes a rating to the overall rating of the function.
   c. If the match object is a type T, the rating system favors arguments x where T appears earlier in the mro of x.__class__. The availability of explicit conversions such as int() and float() is considered, but always produces a worse match than a subtype match.
   d. If the match object is a callable non-type, it's expected to produce an argument match rating.

Obviously, there are some details missing. I have to think about this stuff anyway for Boost.Python (since its current trivial overload resolution is not really adequate); if there's any interest here it would be nice for me, since I'd stand a chance of doing something which is likely to be consistent with whatever happens in Python. -Dave From guido@python.org Sat Jun 1 16:26:37 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 11:26:37 -0400 Subject: [Python-Dev] PYC Magic In-Reply-To: Your message of "Sat, 01 Jun 2002 10:15:22 EDT."
<3CF8D6FA.D9D7B8CE@metaslash.com> References: <3CF8CA04.C9070F3A@metaslash.com> <200206011345.g51Dj9h21769@pcp742651pcs.reston01.va.comcast.net> <3CF8D6FA.D9D7B8CE@metaslash.com> Message-ID: <200206011526.g51FQbO22066@pcp742651pcs.reston01.va.comcast.net> > In looking through other magic code, I found that when -U > was removed. It was only removed from the usage msg. > Should the option and code be removed altogether? > ie, -U is still used by getopt and still changes the magic, > so -U is just a hidden option now. -U is a handy option for developers wanting to test Unicode conformance of their code, but the help message promised more than it could deliver. Please leave this alone. > There is one other possible problem. [wr]_short() is now only > used in one place: for long.digits which are unsigned ints. > But r_short() does sign extension. Is this a problem? Long digits are only 15 bits, so if you change it to return an unsigned short that shouldn't matter. Dunno if there's magic for negative numbers though (in memory, the length is negative, but the digits are not). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jun 1 16:28:45 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 11:28:45 -0400 Subject: [Python-Dev] Other library code transformations In-Reply-To: Your message of "Sat, 01 Jun 2002 16:23:39 +0200." <3CF8D8EB.60604@livinglogic.de> References: <001501c208bd$46133420$d061accf@othello> <3CF8D8EB.60604@livinglogic.de> Message-ID: <200206011528.g51FSjb22095@pcp742651pcs.reston01.va.comcast.net> > But to be able to remove "import stat" everywhere the remaining > functions in stat.py would have to be implemented as methods. > And what about the remaining constants defined in stat.py? I see no reason to want to deprecate the stat module, only the indexing constants in the stat tuple. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jun 1 16:27:48 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 11:27:48 -0400 Subject: [Python-Dev] Customization docs In-Reply-To: Your message of "Sat, 01 Jun 2002 10:17:50 EDT." References: Message-ID: <200206011527.g51FRmt22081@pcp742651pcs.reston01.va.comcast.net> > Well, _I_ would have expected this to work: > > Python 2.1 (#4, Jun 6 2001, 08:54:49) > [GCC 2.95.2 19991024 (release)] on linux2 > Type "copyright", "credits" or "license" for more information. > >>> x = ([],[],[]) > >>> x[1] += [1] > Traceback (most recent call last): > File "", line 1, in ? > TypeError: object doesn't support item assignment Yes, but that can't be fixed without breaking other things. Too bad. It's not like this is an important use case in real life. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Sat Jun 1 16:39:17 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 1 Jun 2002 11:39:17 -0400 Subject: [Python-Dev] deprecating string module? In-Reply-To: <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net> References: <20020601021106.29157.qmail@web9606.mail.yahoo.com> <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com> <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020601153917.GA14320@panix.com> >>> In python, you don't need overloading, you have a variety of >>> optional parameter mechanisms >> >> ...which forces users to write centralized dispatching mechanism >> that could be much more elegantly-handled by the language. The >> language already does something just for operators, but the rules >> are complicated and don't scale well. > > I don't think the situation can be improved without adding type > declarations. I thought this was the issue interfaces were supposed to handle? 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler From barry@zope.com Sat Jun 1 16:55:23 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 1 Jun 2002 11:55:23 -0400 Subject: [Python-Dev] Where to put wrap_text()? References: <20020601134236.GA17691@gerg.ca> Message-ID: <15608.61035.133229.77125@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> Proposal: a new standard library module, wrap_text, which GW> combines the best of distutils.fancy_getopt.wrap_text() and GW> Text::Wrap. Right now, I'm thinking of an interface something GW> like this: You might consider a text package with submodules for various wrapping algorithms. The text package might even grow other functionality later too. I say this because in Mailman I also have a wrap() function (big surprise, eh?) that implements the Python FAQ wizard rules for wrapping:

def wrap(text, column=70, honor_leading_ws=1):
    """Wrap and fill the text to the specified column.

    Wrapping is always in effect, although if it is not possible to wrap
    a line (because some word is longer than `column' characters) the
    line is broken at the next available whitespace boundary. Paragraphs
    are also always filled, unless honor_leading_ws is true and the line
    begins with whitespace. This is the algorithm that the Python FAQ
    wizard uses, and seems like a good compromise.
    """

There's nothing at all Mailman specific about it, so I wouldn't mind donating it to the standard library.
-Barry From aahz@pythoncraft.com Sat Jun 1 17:07:00 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 1 Jun 2002 12:07:00 -0400 Subject: Documenting practice (was Re: [Python-Dev] Python 2.3 release schedule) In-Reply-To: <2mr8jwv84e.fsf@starship.python.net> References: <2mr8jwv84e.fsf@starship.python.net> Message-ID: <20020601160659.GA16298@panix.com> On Tue, May 28, 2002, Michael Hudson wrote: > > Thanks; I think it is a good idea to describe intended usage in no > uncertain terms *somewhere* at least. Probably lots of places. Any > book authors reading python-dev? Yes. However, I'm no C programmer, and feedback on my book proposal makes it likely that API stuff will be dumped -- which is fine with me. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler From guido@python.org Sat Jun 1 17:17:50 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 12:17:50 -0400 Subject: [Python-Dev] deprecating string module? In-Reply-To: Your message of "Sat, 01 Jun 2002 11:39:17 EDT." <20020601153917.GA14320@panix.com> References: <20020601021106.29157.qmail@web9606.mail.yahoo.com> <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com> <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net> <20020601153917.GA14320@panix.com> Message-ID: <200206011617.g51GHor22381@pcp742651pcs.reston01.va.comcast.net> > I thought this was the issue interfaces were supposed to handle? You'd still need a way to attach an interface declaration to a function argument. Smells like type declarations to me.
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Sat Jun 1 17:24:25 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 1 Jun 2002 12:24:25 -0400 Subject: [Python-Dev] PYC Magic In-Reply-To: <3CF8D6FA.D9D7B8CE@metaslash.com> Message-ID: <3CF8BCF9.5557.4C49011F@localhost> On 1 Jun 2002 at 10:15, Neal Norwitz wrote: > Guido van Rossum wrote: > > What do you mean by "can't be written to disk"? > Disk full was one condition. I can't be 100% sure of the cause, but I *have* seen this (a bad .pyc file that had to be deleted before the module would import). The .pyc was woefully short but passed the magic test. I think this was 2.1, maybe 2.0. This was during a firestorm at a client site, so I didn't get around to a bug report. -- Gordon http://www.mcmillan-inc.com/ From pinard@iro.umontreal.ca Sat Jun 1 17:51:03 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 01 Jun 2002 12:51:03 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: <20020601134236.GA17691@gerg.ca> References: <20020601134236.GA17691@gerg.ca> Message-ID: [Greg Ward] > Proposal: a new standard library module, wrap_text, which combines the > best of distutils.fancy_getopt.wrap_text() and Text::Wrap. [Aahz] > Personally, I'd like to at least get the functionality of some versions > of 'fmt' [Guido van Rossum] > I think this is a fine idea. But *please* don't put an underscore in > the name. I'd say "wrap" or "wraptext" are better than "wrap_text". One thing that I would love to have available in Python is a function able to wrap text using Knuth's filling algorithm. GNU `fmt' does it, and it is _so_ better than dumb refilling, in my eyes at least, that I managed so Emacs own filling algorithm is short-circuited with an external call (I do not mind the small fraction of a second it takes). Also, is there some existing module in which `wraptext' would fit nicely? 
That might be better than creating a new module for not many functions. -- François Pinard http://www.iro.umontreal.ca/~pinard From tim.one@comcast.net Sat Jun 1 17:50:29 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 01 Jun 2002 12:50:29 -0400 Subject: [Python-Dev] PYC Magic In-Reply-To: <200206011526.g51FQbO22066@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] > Long digits are only 15 bits, so if you change it to return an > unsigned short that shouldn't matter. Dunno if there's magic for > negative numbers though (in memory, the length is negative, but the > digits are not). The marshal format is the same: signed length and unsigned digits. The signed length goes thru [rw]_long. From aahz@pythoncraft.com Sat Jun 1 17:59:06 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 1 Jun 2002 12:59:06 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: References: <20020601134236.GA17691@gerg.ca> Message-ID: <20020601165906.GA23320@panix.com> On Sat, Jun 01, 2002, François Pinard wrote: > > Also, is there some existing module in which `wraptext' would fit nicely? > That might be better than creating a new module for not many functions. I'd prefer to create a package called 'text', with wrap being a module inside it. That way, as we add parsing (e.g. mxTextTools) and other features to the standard library, they can be stuck in the package. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler From tim.one@comcast.net Sat Jun 1 17:58:48 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 01 Jun 2002 12:58:48 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: <20020601134236.GA17691@gerg.ca> Message-ID: [Greg Ward, on wrapping text] > ... Note that regrtest.py also has a wrapper:

def printlist(x, width=70, indent=4):
    """Print the elements of a sequence to stdout.

    Optional arg width (default 70) is the maximum line length.
    Optional arg indent (default 4) is the number of blanks with which
    to begin each line.
    """

This kind of thing gets reinvented too often, so +1 on a module from me. Just make sure it handles the union of all possible desires, but has a simple and intuitive interface. From mal@lemburg.com Sat Jun 1 18:36:34 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 01 Jun 2002 19:36:34 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> Message-ID: <3CF90622.6000106@lemburg.com> Walter Dörwald wrote: > Raymond Hettinger wrote: > > While we're eliminating uses of the string and types modules, how about > > other code clean-ups and modernization: > > > > [...] > don't forget: > import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime > But to be able to remove "import stat" everywhere the remaining functions in stat.py would have to be implemented as methods. > And what about the remaining constants defined in stat.py? While you're at it: could you also write up all these little "code cleanups" in some file so that Andrew can integrate them in the migration guide ?! Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From barry@zope.com Sat Jun 1 17:02:38 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Sat, 1 Jun 2002 12:02:38 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> Message-ID: <15608.61470.981820.695481@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> (OTOH and wildly OT: since I gave in a couple years ago and GW> started using Emacs syntax-colouring, it *does* take a lot GW> longer to load modules up -- eg. ~2 sec for the 1000-line GW> rfc822.py. But that's probably just because Emacs is a great GW> shaggy beast of an editor ("Eight(y) Megs and Constantly GW> Swapping", "Eventually Mallocs All Core Storage", you GW> know...). I'm sure if I got a brain transplant so that I GW> could use vim, it would be different.) Actually, I've found jed to be a very nice quick-in-quick-out alternative to XEmacs (the one true Emacs :). Its default bindings and operation are close enough that I never notice the difference, for simple quick editing jobs. -Barry From python@rcn.com Sat Jun 1 19:34:46 2002 From: python@rcn.com (Raymond Hettinger) Date: Sat, 1 Jun 2002 14:34:46 -0400 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> Message-ID: <000b01c2099b$023984a0$1bea7ad1@othello> From: "M.-A. Lemburg" > While you're at it: could you also write up all these little > "code cleanups" in some file so that Andrew can integrate them > in the migration guide ?! Will do! Raymond Hettinger From mal@lemburg.com Sat Jun 1 20:48:30 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sat, 01 Jun 2002 21:48:30 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> Message-ID: <3CF9250E.30002@lemburg.com> Raymond Hettinger wrote: > From: "M.-A. Lemburg" > >>While you're at it: could you also write up all these little >>"code cleanups" in some file so that Andrew can integrate them >>in the migration guide ?! > > > Will do! Great. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From python@rcn.com Sat Jun 1 20:50:50 2002 From: python@rcn.com (Raymond Hettinger) Date: Sat, 1 Jun 2002 15:50:50 -0400 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D8EB.60604@livinglogic.de> Message-ID: <001f01c209a5$a27007a0$1bea7ad1@othello> From: "Walter Dörwald" > don't forget: > import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime Done! BTW, it was surprising how many times the above has been coded as: os.stat("foo")[8] Raymond Hettinger From gward@python.net Sat Jun 1 23:05:29 2002 From: gward@python.net (Greg Ward) Date: Sat, 1 Jun 2002 18:05:29 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: References: <20020601134236.GA17691@gerg.ca> Message-ID: <20020601220529.GA20025@gerg.ca> On 01 June 2002, François Pinard said: > One thing that I would love to have available in Python is a function able > to wrap text using Knuth's filling algorithm.
GNU `fmt' does it, and it > is _so_ better than dumb refilling, in my eyes at least, that I managed > so Emacs own filling algorithm is short-circuited with an external call > (I do not mind the small fraction of a second it takes). Damn, I had no idea there was a body of computer science (however small) devoted to the art of filling text. Trust Knuth to be there first. Do you have a reference for this algorithm apart from GNU fmt's source code? Google'ing for "knuth text fill algorithm" was unhelpful, ditto with s/fill/wrap/. Anyways, despite being warned just today on the conceptual/philosophical danger of classes whose names end in "-er" [1], I'm leaning towards a TextWrapper class, so that everyone may impose their desires through subclassing. I'll start with my simple naive text-wrapping algorithm, and then we can see who wants to contribute fancy/clever algorithms to the pot. > Also, is there some existing module in which `wraptext' would fit nicely? > That might be better than creating a new module for not many functions. Not if it grows to accommodate Optik/OptionParser, Mailman, regrtest, etc. Greg [1] objects should *be*, not *do*, and class names like HelpFormatter and TextWrapper are impositions of procedural abstraction onto OOP. It's something to be aware of, but still a useful idiom (IMHO). -- Greg Ward - Unix geek gward@python.net http://starship.python.net/~gward/ No problem is so formidable that you can't just walk away from it. From tim.one@comcast.net Sun Jun 2 00:19:08 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 01 Jun 2002 19:19:08 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: <20020601220529.GA20025@gerg.ca> Message-ID: [Greg Ward] > Damn, I had no idea there was a body of computer science (however small) > devoted to the art of filling text. I take it you don't spend much time surveying the range of computer science literature . > Trust Knuth to be there first.
Do you have a reference for this > algorithm apart from GNU fmt's source code? Google'ing for "knuth text > fill algorithm" was unhelpful, ditto with s/fill/wrap/. Search for Knuth hyphenation instead. Three months later, the best advice you'll have read is to avoid hyphenation entirely. But then you're stuck fighting snaky little rivers of vertical whitespace without the biggest gun in the arsenal. Avoid right justification entirely too, and let the whitespace fall where it may. Doing justification with fixed-width fonts is like juggling dirt anyway . > Anyways, despite being warned just today on the conceptual/philosophical > danger of classes whose names end in "-er" [1], I'm leaning towards a > TextWrapper class, so that everyone may impose their desires through > subclassing. LOL! Resolved, that the world would be a better place if all classes ended with "-ist". From goodger@users.sourceforge.net Sun Jun 2 01:30:26 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 01 Jun 2002 20:30:26 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? Message-ID: Greg Ward wrote: > [1] objects should *be*, not *do*, and class names like > HelpFormatter and TextWrapper are impositions of procedural > abstraction onto OOP. I don't see anything dangerous about -er objects. There are plenty of objects in the real world that end in -er, all nouns: Programmer, Bookkeeper, Publisher, Reader, Writer, Trucker, ad infinitum. Plenty of precedent in the OOP world too: Debugger, Profiler, Parser, TestLoader, SequenceMatcher, Visitor. Objects combine state (data) with behavior (processing); sometimes the state is most important, sometimes the behavior. Following that kind of over-simplified "rule" may do more harm than good. I'm glad you didn't fall for it. 
;-) -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From guido@python.org Sun Jun 2 01:42:12 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Jun 2002 20:42:12 -0400 Subject: [Python-Dev] PYC Magic In-Reply-To: Your message of "Sat, 01 Jun 2002 12:24:25 EDT." <3CF8BCF9.5557.4C49011F@localhost> References: <3CF8BCF9.5557.4C49011F@localhost> Message-ID: <200206020042.g520gCL22848@pcp742651pcs.reston01.va.comcast.net> [GMcM] > I can't be 100% sure of the cause, but I *have* > seen this (a bad .pyc file that had to be > deleted before the module would import). The .pyc > was woefully short but passed the magic > test. I think this was 2.1, maybe 2.0. Hm... Here's the code responsible for writing .pyc files:

static void
write_compiled_module(PyCodeObject *co, char *cpathname, long mtime)
{
    [...]
    PyMarshal_WriteLongToFile(pyc_magic, fp);
    /* First write a 0 for mtime */
    PyMarshal_WriteLongToFile(0L, fp);
    PyMarshal_WriteObjectToFile((PyObject *)co, fp);
    if (ferror(fp)) {
        /* Don't keep partial file */
        fclose(fp);
        (void) unlink(cpathname);
        return;
    }
    /* Now write the true mtime */
    fseek(fp, 4L, 0);
    PyMarshal_WriteLongToFile(mtime, fp);
    fflush(fp);
    fclose(fp);
    [...]
}

It's been like this for a very long time. It always writes the magic number, but withholds the mtime until it's done writing without errors. And if the mtime doesn't match, the .pyc is ignored (unless there's no .py file...). The only way this could write the correct mtime but not all the marshalled data would be if ferror(fp) doesn't actually indicate an error after a write failure due to a disk full condition. And that's a stdio quality of implementation issue. I'm not sure if there's anything I could do differently to make this more robust. (I guess I could write the correct magic number at the end too.)
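The write-placeholder-then-backpatch idea Guido describes can be sketched in a few lines of modern Python. The file layout and names below are hypothetical (this is not the real .pyc format or the import machinery), but the ordering is the same: the validity stamp is only written once the payload has landed without error.

```python
import struct

# Hypothetical cache-file layout (NOT the real .pyc format): a 4-byte
# magic number, a 4-byte mtime slot, then the payload bytes.
MAGIC = 0x0A0D1234

def write_cached(path, mtime, payload):
    """Write magic + zero mtime + payload, then backpatch the true
    mtime. A write that dies halfway leaves mtime == 0, so the cache
    entry never validates against the source's real mtime."""
    with open(path, "wb") as fp:
        fp.write(struct.pack("<i", MAGIC))
        fp.write(struct.pack("<i", 0))        # placeholder mtime
        fp.write(payload)
        fp.flush()
        fp.seek(4)                            # backpatch the real mtime
        fp.write(struct.pack("<i", mtime))

def read_header(path):
    """Return the (magic, mtime) pair from the first 8 bytes."""
    with open(path, "rb") as fp:
        return struct.unpack("<ii", fp.read(8))
```

As in the C code, the weak spot is error reporting from the underlying I/O layer: if a short write is not reported, the backpatch can still stamp a truncated file as valid.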
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Sun Jun 2 04:43:14 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 1 Jun 2002 22:43:14 -0500 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: References: <20020601134236.GA17691@gerg.ca> Message-ID: <15609.37970.691340.638319@12-248-41-177.client.attbi.com> Tim> Note that regrtest.py also has a wrapper: Me too...

    def wrap(s, col=74, startcol=0, hangindent=0):
        """Insert newlines into 's' so it doesn't extend past 'col'.

        All lines are indented to 'startcol'. The indentation of the
        first line is adjusted further by hangindent.
        """

I guess everybody has one of these laying about... I'll be happy to dump mine once something mostly equivalent is available. I love to throw out code. Skip From smurf@noris.de Sun Jun 2 05:04:59 2002 From: smurf@noris.de (Matthias Urlichs) Date: Sun, 2 Jun 2002 06:04:59 +0200 Subject: [Python-Dev] intra-package mutual imports fail: "from import " Message-ID:

> module5.py:
> from package import module6 # absolute import
>
> module6.py:
> from package import module5
> [...]
> ImportError: cannot import name module5
>
> Is this behavior expected? Or is it a bug?

The problem is that importing with from consists of two steps:

- load the module
- add the imported names to the local namespace

Since this addition is by reference to the actual object and not to the symbol's name in the other module, a concept which Python doesn't have (use Perl if you want this...), your recursive import doesn't work. The solution would be:

    import package.module6 as module6

which should have the same effect.
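The two-step behaviour Matthias describes can be seen without a package at all; a minimal modern-Python sketch (the module name `fakemod` is invented, and the module is built in memory so the example needs no files on disk):

```python
import sys
import types

# A throwaway in-memory module standing in for any imported module.
mod = types.ModuleType("fakemod")
mod.answer = 1
sys.modules["fakemod"] = mod

# Step 1: load the module.  Step 2: bind the *object* mod.answer
# refers to -- not a live link to the module's symbol table slot.
from fakemod import answer

mod.answer = 2        # rebind the name inside the module...
print(answer)         # ...our local binding is unaffected: 1
print(mod.answer)     # 2
```

Because only an object reference is copied, a module whose names have not been assigned yet (as during a recursive import) has nothing to hand over, which is exactly the ImportError above.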
-- Matthias Urlichs From walter@livinglogic.de Sun Jun 2 10:27:28 2002 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Sun, 02 Jun 2002 11:27:28 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> Message-ID: <3CF9E500.9030103@livinglogic.de> Raymond Hettinger wrote: > From: "M.-A. Lemburg" > >>While you're at it: could you also write up all these little >>"code cleanups" in some file so that Andrew can integrate them >>in the migration guide ?! > > > Will do! There's another one: "foobar"[:3]=="foo" --> "foobar".startswith("foo") Bye, Walter Dörwald From skip@mojam.com Sun Jun 2 13:00:18 2002 From: skip@mojam.com (Skip Montanaro) Date: Sun, 2 Jun 2002 07:00:18 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200206021200.g52C0I721199@12-248-41-177.client.attbi.com> Bug/Patch Summary ----------------- 263 open / 2539 total bugs (+8) 136 open / 1532 total patches (+8) New Bugs -------- crash in shelve module (2001-03-13) http://python.org/sf/408271 UTF-16 BOM handling counterintuitive (2002-05-13) http://python.org/sf/555360 import user doesn't work with CGIs (2002-05-14) http://python.org/sf/555779 removing extensions without admin rights (2002-05-14) http://python.org/sf/555810 installing extension w/o admin rights (2002-05-14) http://python.org/sf/555812 Flawed fcntl.ioctl implementation. 
(2002-05-14) http://python.org/sf/555817 Expat improperly described in setup.py (2002-05-15) http://python.org/sf/556370 illegal use of malloc/free (2002-05-16) http://python.org/sf/557028 TclError is a str should be an Exception (2002-05-17) http://python.org/sf/557436 netrc module can't handle all passwords (2002-05-18) http://python.org/sf/557704 faqwiz.py could do email obfuscation (2002-05-19) http://python.org/sf/558072 Compile error _sre.c on Cray T3E (2002-05-19) http://python.org/sf/558153 Shutdown of IDLE blows up (2002-05-19) http://python.org/sf/558166 rfc822.Message.get() incompatibility (2002-05-20) http://python.org/sf/558179 unittest.TestResult documentation (2002-05-20) http://python.org/sf/558278 \verbatiminput and name duplication (2002-05-20) http://python.org/sf/558279 DL_EXPORT on VC7 broken (2002-05-20) http://python.org/sf/558488 HTTPSConnection memory leakage (2002-05-22) http://python.org/sf/559117 imaplib.IMAP4.open() typo (2002-05-23) http://python.org/sf/559884 inconsistent behavior of __getslice__ (2002-05-24) http://python.org/sf/560064 PyType_IsSubtype can segfault (2002-05-24) http://python.org/sf/560215 Add docs for 'string' (2002-05-24) http://python.org/sf/560286 foo() doesn't use __getattribute__ (2002-05-25) http://python.org/sf/560438 deepcopy can't handle custom metaclasses (2002-05-26) http://python.org/sf/560794 Maximum recursion limit exceeded (2002-05-27) http://python.org/sf/561047 ConfigParser has_option case sensitive (2002-05-29) http://python.org/sf/561822 Assertion with very long lists (2002-05-29) http://python.org/sf/561858 test_signal.py fails on FreeBSD-4-stable (2002-05-29) http://python.org/sf/562188 build problems on DEC Unix 4.0f (2002-05-30) http://python.org/sf/562585 xmlrpclib.Binary.data undocumented (2002-05-31) http://python.org/sf/562878 Module can be used as a base class (2002-05-31) http://python.org/sf/563060 Clarify documentation for inspect (2002-06-01) http://python.org/sf/563273 Fuzziness in 
inspect module documentatio (2002-06-01) http://python.org/sf/563298 Heap corruption in debug (2002-06-01) http://python.org/sf/563303 Getting traceback in embedded python. (2002-06-01) http://python.org/sf/563338 Add separator argument to readline() (2002-06-02) http://python.org/sf/563491 New Patches ----------- timeout socket implementation (2002-05-12) http://python.org/sf/555085 Mutable object change flag (2002-05-12) http://python.org/sf/555251 Cygwin AH_BOTTOM cleanup patch (2002-05-14) http://python.org/sf/555929 OSX build -- make python.app (2002-05-18) http://python.org/sf/557719 Ebcdic compliancy in stringobject source (2002-05-19) http://python.org/sf/557946 cmd.py: add instance-specific stdin/out (2002-05-20) http://python.org/sf/558544 SocketServer: don't flush closed wfile (2002-05-20) http://python.org/sf/558547 GC: untrack simple objects (2002-05-21) http://python.org/sf/558745 Use builtin boolean if present (2002-05-22) http://python.org/sf/559288 Expose xrange type in builtins (2002-05-23) http://python.org/sf/559833 isinstance error message (2002-05-24) http://python.org/sf/560250 os.uname() on Darwin space in machine (2002-05-24) http://python.org/sf/560311 Karatsuba multiplication (2002-05-24) http://python.org/sf/560379 Micro optimizations (2002-05-27) http://python.org/sf/561244 webchecker chokes at charsets. (2002-05-28) http://python.org/sf/561478 README additions for Cray T3E (2002-05-28) http://python.org/sf/561724 Installation database patch (2002-05-29) http://python.org/sf/562100 Getting rid of string, types and stat (2002-05-30) http://python.org/sf/562373 Prevent duplicates in readline history (2002-05-30) http://python.org/sf/562492 Add isxxx() methods to string objects (2002-05-30) http://python.org/sf/562501 First patch: start describing types... 
(2002-05-30) http://python.org/sf/562529 Remove UserDict from cookie.py (2002-05-31) http://python.org/sf/562987 Closed Bugs ----------- pdb can only step when at botframe (PR#4) (2000-07-31) http://python.org/sf/210682 Copy from stdout after crash (2001-11-29) http://python.org/sf/487297 detail: tp_basicsize and tp_itemsize (2001-12-12) http://python.org/sf/492349 Finder Tool Move not working on MOSX (2001-12-15) http://python.org/sf/493826 plugin project generation has problems (2001-12-18) http://python.org/sf/494572 Inaccuracy(?) in tutorial section 9.2 (2002-01-07) http://python.org/sf/500539 random.cunifvariate() incorrect? (2002-01-21) http://python.org/sf/506647 random.gammavariate hosed (2002-03-07) http://python.org/sf/527139 __reduce__ does not work as documented (2002-03-21) http://python.org/sf/533291 rexec: potential security hole (2002-03-22) http://python.org/sf/533625 Running MacPython as non-priv user may fail (2002-03-23) http://python.org/sf/534158 Compile fails on posixmodule.c (2002-04-10) http://python.org/sf/542003 Distutils readme outdated (2002-04-12) http://python.org/sf/542912 bug? 
floor divison on complex (2002-04-13) http://python.org/sf/543387 base64 newlines - documentation (again) (2002-04-22) http://python.org/sf/547037 cStringIO mangles Unicode (2002-04-23) http://python.org/sf/547537 Missing or wrong index entries (2002-04-25) http://python.org/sf/548693 Poor error message for float() (2002-05-02) http://python.org/sf/551673 PDF won't print (2002-05-03) http://python.org/sf/551828 "./configure" crashes (2002-05-06) http://python.org/sf/553000 cPickle dies on short reads (2002-05-07) http://python.org/sf/553512 bug in telnetlib- 'opt' instead of 'c' (2002-05-09) http://python.org/sf/554073 test_fcntl fails on OpenBSD 3.0 (2002-05-10) http://python.org/sf/554663 --disable-unicode builds horked (2002-05-11) http://python.org/sf/554912 rfc822.Message.getaddrlist broken (2002-05-11) http://python.org/sf/555035 Closed Patches -------------- Reminder: 2.3 should check tp_compare (2001-10-18) http://python.org/sf/472523 foreign-platform newline support (2001-10-31) http://python.org/sf/476814 Cygwin setup.py import workaround patch (2001-12-10) http://python.org/sf/491107 Fix webbrowser running on MachoPython (2002-01-10) http://python.org/sf/502205 fix random.gammavariate bug #527139 (2002-03-13) http://python.org/sf/529408 force gzip to open files with 'b' (2002-03-28) http://python.org/sf/536278 context sensitive help/keyword search (2002-04-08) http://python.org/sf/541031 error about string formatting rewording? 
(2002-04-26) http://python.org/sf/549187 Unittest for base64 (2002-04-28) http://python.org/sf/550002 __doc__ strings of builtin types (2002-04-29) http://python.org/sf/550290 GC collection frequency bug (2002-05-03) http://python.org/sf/551915 Add degrees() & radians() to math module (2002-05-05) http://python.org/sf/552452 Cygwin Makefile.pre.in vestige patch (2002-05-08) http://python.org/sf/553678 OpenBSD fixes for Python 2.2 (2002-05-10) http://python.org/sf/554719 From akuchlin@mems-exchange.org Sun Jun 2 13:45:17 2002 From: akuchlin@mems-exchange.org (akuchlin@mems-exchange.org) Date: Sun, 2 Jun 2002 08:45:17 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: <15608.61035.133229.77125@anthem.wooz.org> References: <20020601134236.GA17691@gerg.ca> <15608.61035.133229.77125@anthem.wooz.org> Message-ID: <20020602124517.GA31389@mems-exchange.org> On Sat, Jun 01, 2002 at 11:55:23AM -0400, Barry A. Warsaw wrote: >You might consider a text package with submodules for various wrapping >algorithms. The text package might even grow other functionality >later too. +1. --amk From pinard@iro.umontreal.ca Sun Jun 2 14:09:57 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 02 Jun 2002 09:09:57 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: <20020601220529.GA20025@gerg.ca> References: <20020601134236.GA17691@gerg.ca> <20020601220529.GA20025@gerg.ca> Message-ID: [Greg Ward] > Do you have a reference for this algorithm apart from GNU fmt's source > code? Surely not handy. I heard about it, and others even more capable, many years ago. If I remember well, Knuth's algorithm plays by moving line cuts and optimising a global function through dynamic programming, giving more points, say, when punctuation coincides with end of lines, removing points when a single letter words appear at end of lines, and such thing. 
So lines are not guaranteed to be as filled as possible, but the overall appearance of the paragraph gets better, sometimes much better. I'm Cc:ing Ross Paterson, who wrote GNU `fmt', in hope he could shed some light about references, or otherwise. Some filling algorithms used by typographers (or so I heard) are even careful about dismantling vertical or diagonal (aliased) white lines which sometimes build up across paragraphs by the effect of dumber filling. > I'm leaning towards a TextWrapper class, so that everyone may impose > their desires through subclassing. Distutils experience speaking here? :-) By the way, I would like if the module was not named `text'. I use `text' all over in my programs already as a common variable name, as a way to not use `string' for a common variable name, for obvious reasons. Granted that `string' is progressively becoming available again :-). Maybe Python should try to not name modules with likely to be use-everywhere local variables. > Not if it grows to accommodate Optik/OptionParser, Mailman, regrtest, > etc. At some places in my things, I have unusual wrapping/filling needs, and wonder if they could all fit in a generic scheme. An interesting question and exercise, surely. -- François Pinard http://www.iro.umontreal.ca/~pinard From skip@pobox.com Sun Jun 2 14:10:15 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 2 Jun 2002 08:10:15 -0500 Subject: [Python-Dev] "max recursion limit exceeded" canned response? Message-ID: <15610.6455.96035.742110@12-248-41-177.client.attbi.com> How would we go about adding a canned response to the commonly submitted "max recursion limit exceeded" bug report? I think Tim's discussion of re design patterns to use in http://python.org/sf/493252 (or something like it) probably belongs in the re module docs since this is such a common stumbling block for people used to using ".*?". I'll work something up for the Examples section and Jake's hockey game this morning.
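The stumbling block Skip mentions usually involves patterns built around ".*?"; the usual advice was to say precisely what may appear between the delimiters with a character class. A small illustrative example (invented here, not taken from the tracker item):

```python
import re

html = '<a href="first.html"> ... <a href="second.html">'

# Non-greedy ".*?" gives the right answer here, but patterns that
# lean on it can backtrack heavily, which on the old recursive re
# engine could exceed the recursion limit on large inputs.
lazy = re.findall(r'href="(.*?)"', html)

# A character class states exactly what may appear between the
# quotes, so the engine never needs backtracking to find the close.
tight = re.findall(r'href="([^"]*)"', html)

print(lazy)   # ['first.html', 'second.html']
print(tight)  # ['first.html', 'second.html']
```

Both forms match the same strings; the second is simply kinder to the matcher.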
Skip From guido@python.org Sun Jun 2 14:22:53 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 02 Jun 2002 09:22:53 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: Your message of "02 Jun 2002 09:09:57 EDT." References: <20020601134236.GA17691@gerg.ca> <20020601220529.GA20025@gerg.ca> Message-ID: <200206021322.g52DMr531014@pcp742651pcs.reston01.va.comcast.net> > > Do you have a reference for this algorithm apart from GNU fmt's > > source code? Can we focus on getting the module/package structure and a basic algorithm first? It's fine to design the structure for easy extensibility with other algorithms, but implementing Knuth's algorithm seems hopelessly out of scope. Even Emacs' fill-paragraph is too fancy-schmancy for my taste (for inclusion as a Python standard library). Simply breaking lines at a certain limit is all that's needed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jun 2 14:25:08 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 02 Jun 2002 09:25:08 -0400 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: Your message of "Sun, 02 Jun 2002 06:04:59 +0200." References: Message-ID: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net> > > module5.py: > > from package import module6 # absolute import > > > > module6.py: > > from package import module5 > > [...] > > ImportError: cannot import name module5 > > > > Is this behavior expected? Or is it a bug? > > The problem is that importing with from consists of two steps: > - load the module > - add the imported names to the local namespace Good explanation! This means it's an unavoidable problem. Maybe you can fix the FAQ entry? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From goodger@users.sourceforge.net Sun Jun 2 15:11:31 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sun, 02 Jun 2002 10:11:31 -0400 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: Message-ID: I wrote: >> module5.py: >> from package import module6 # absolute import >> >> module6.py: >> from package import module5 >> [...] >> ImportError: cannot import name module5 >> >> Is this behavior expected? Or is it a bug? Matthias Urlichs replied: > The problem is that importing with from consists of two steps: > - load the module > - add the imported names to the local namespace > > Since this addition is by reference to the actual object and not to > the symbol's name in the other module, a concept which Python doesn't > have (use Perl if you want this...), your recursive import doesn't > work. > > The solution would be: > import package.module6 as module6 > > which should have the same effect. Perhaps I'm just dense, or perhaps it's because of my choice of names in my example, but I don't understand the explanation. Could you be more specific, perhaps with a concrete example? Despite Guido's "Good explanation!", the above text in the FAQ entry wouldn't eliminate my confusion. I suspect it's a good explanation for those that already understand what's going on behind the scenes. 
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From David Abrahams" The following small program is giving me some unexpected results with Python 2.2.1:

class Int(object):
    def __int__(self): return 10

class Float(object):
    def __float__(self): return 10.0

class Long(object):
    def __long__(self): return 10L

class Complex(object):
    def __complex__(self): return (10+0j)

def attempt(f,arg):
    try:
        return f(arg)
    except Exception,e:
        return str(e.__class__.__name__)+': '+str(e)

for f in int,float,long,complex:
    for t in Int,Float,Long,Complex:
        print f.__name__ + '(' + t.__name__ + ')\t\t',
        print attempt(f,t())

----- results ------

int(Int)          10
int(Float)        TypeError: object can't be converted to int
int(Long)         TypeError: object can't be converted to int
int(Complex)      TypeError: object can't be converted to int

*** OK, int() seems to work as expected

float(Int)        TypeError: float() needs a string argument
float(Float)      10.0
float(Long)       TypeError: float() needs a string argument
float(Complex)    TypeError: float() needs a string argument

*** float() seems to work, but what's with the error message about strings?

long(Int)         TypeError: object can't be converted to long
long(Float)       TypeError: object can't be converted to long
long(Long)        10
long(Complex)     TypeError: object can't be converted to long

**** OK, long seems to work as expected

complex(Int)      TypeError: complex() arg can't be converted to complex
complex(Float)    (10+0j)
complex(Long)     TypeError: complex() arg can't be converted to complex
complex(Complex)  TypeError: complex() arg can't be converted to complex

**** I can understand complex() handling Float implicitly, but only if it also handles Complex! And if it does handle Float implicitly, shouldn't all of these handle everything? Comments?
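For comparison, under a recent Python each constructor still looks for its own conversion hook (plus a few documented fallbacks) rather than chaining through another type's hook, so a class that wants to cooperate with all of them defines each slot explicitly. A small sketch with a hypothetical class:

```python
class Number:
    # A hypothetical class supplying all three conversion hooks, so
    # int(), float() and complex() each find the slot they look for.
    def __int__(self):
        return 10
    def __float__(self):
        return 10.0
    def __complex__(self):
        return 10 + 0j

n = Number()
print(int(n), float(n), complex(n))   # 10 10.0 (10+0j)
```

Omitting any one hook reproduces the corresponding TypeError pattern in the table above (modulo the exact error message, which has been reworded since 2.2).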
-Dave +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From David Abrahams" import " References: Message-ID: <0b3701c20a43$39a61ef0$6601a8c0@boostconsulting.com> From: "David Goodger" > Perhaps I'm just dense, or perhaps it's because of my choice of names > in my example, but I don't understand the explanation. Could you be > more specific, perhaps with a concrete example? Despite Guido's > "Good explanation!", the above text in the FAQ entry wouldn't > eliminate my confusion. Nor mine, FWIW. -D From smurf@noris.de Sun Jun 2 15:44:07 2002 From: smurf@noris.de (Matthias Urlichs) Date: Sun, 2 Jun 2002 16:44:07 +0200 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: ; from goodger@users.sourceforge.net on Sun, Jun 02, 2002 at 10:11:31AM -0400 References: Message-ID: <20020602164407.E17316@noris.de> Hi, David Goodger: > Perhaps I'm just dense, or perhaps it's because of my choice of names > in my example, but I don't understand the explanation. Could you be > more specific, perhaps with a concrete example?

foo.py:
    from bar import one

bar.py:
    from foo import two

main.py:
    import foo

So what happens is, more or less:

    main imports foo
    Empty globals for foo are created
    foo is compiled
    foo loads bar
        Empty globals for bar are created
        bar is compiled
        bar loads foo (which is a no-op since there already is a module named foo)
        bar.two = foo.two

... which fails, because the compiler isn't done with foo yet and the global symbol dict for foo is still empty. > eliminate my confusion. I suspect it's a good explanation for those > that already understand what's going on behind the scenes.
> _If_ you can change foo.py so that it reads:

    two = 2
    from bar import one

i.e., initialize the exports first and load afterwards, the test would work. However, the following will NOT work:

    two = None
    from bar import one
    two = do_something(with(bar.one))

for (hopefully) obvious reasons. -- Matthias Urlichs | noris network AG | http://smurf.noris.de/ From mwh@python.net Sun Jun 2 15:45:13 2002 From: mwh@python.net (Michael Hudson) Date: 02 Jun 2002 15:45:13 +0100 Subject: [Python-Dev] PYC Magic In-Reply-To: "Gordon McMillan"'s message of "Sat, 1 Jun 2002 12:24:25 -0400" References: <3CF8BCF9.5557.4C49011F@localhost> Message-ID: <2m1ybpu4cm.fsf@starship.python.net> "Gordon McMillan" writes: > On 1 Jun 2002 at 10:15, Neal Norwitz wrote: > > > Guido van Rossum wrote: > > > > What do you mean by "can't be written to disk"? > > > Disk full was one condition. > > I can't be 100% sure of the cause, but I *have* > seen this (a bad .pyc file that had to be > deleted before the module would import). The .pyc > was woefully short but passed the magic > test. I think this was 2.1, maybe 2.0. Someone on comp.lang.python reported getting corrupt .pycs by having modules in a user-writeable directory being accessed more-or-less simultaneously by different Python versions. I'm not sure what could be done about that. Cheers, M. -- First of all, email me your AOL password as a security measure. You may find that won't be able to connect to the 'net for a while. This is normal. The next thing to do is turn your computer upside down and shake it to reboot it.
-- Darren Tucker, asr From mwh@python.net Sun Jun 2 16:13:24 2002 From: mwh@python.net (Michael Hudson) Date: 02 Jun 2002 16:13:24 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test test_signal.py,1.9,1.10 In-Reply-To: Michael Hudson's message of "30 May 2002 17:05:04 +0100" References: <2mit56ros4.fsf@starship.python.net> <2msn49hp3i.fsf@starship.python.net> <200205301228.g4UCS9S07601@pcp742651pcs.reston01.va.comcast.net> <2mhekphjdi.fsf@starship.python.net> <200205301540.g4UFefk24115@odiug.zope.com> <2melfthb9r.fsf@starship.python.net> Message-ID: <2m3cw5zpbf.fsf@starship.python.net> Michael Hudson writes: > Now what do I do? Back my patch out? Not expose the functions on > BSD? It works on Linux... But nowhere else, it would seem, at least not when python is built threaded. Darwin fails in a similar manner to FreeBSD, test_signal hangs on Solaris, but only if it's run as part of the test suite -- it runs fine if you just run it alone. Argh! I really am starting to think that I should back out my patch and distribute my code as an extension module. Opinions? Cheers, M. -- But since your post didn't lay out your assumptions, your goals, or how you view language characteristics as fitting in with either, you're not a *natural* candidate for embracing Design by Contract <0.6 wink>. -- Tim Peters, giving Eiffel adoption advice From gisle@ActiveState.com Sun Jun 2 16:40:49 2002 From: gisle@ActiveState.com (Gisle Aas) Date: 02 Jun 2002 08:40:49 -0700 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: References: Message-ID: Matthias Urlichs writes: > Since this addition is by reference to the actual object and not to > the symbol's name in the other module, a concept which Python doesn't > have (use Perl if you want this...) Perl doesn't add references to names. It imports direct references as well.
The difference is that perl will create the named object in the exporting package when it is imported, if the exporting package's init code has not executed yet. In Perl this works because we at import time know if we are importing a variable (and what kind) or a function, and later assignments to variables or redefinitions of functions mutate the object in-place. Regards, Gisle Aas From barry@zope.com Sun Jun 2 16:54:55 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 2 Jun 2002 11:54:55 -0400 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> Message-ID: <15610.16335.191744.30203@anthem.wooz.org> What about "foo = foo + 1" => "foo += 1"? -Barry From aahz@pythoncraft.com Sun Jun 2 16:56:33 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 2 Jun 2002 11:56:33 -0400 Subject: [Python-Dev] Numeric conversions In-Reply-To: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com> References: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com> Message-ID: <20020602155633.GA3139@panix.com> On Sun, Jun 02, 2002, David Abrahams wrote: > > The following small program is giving me some unexpected results with > Python 2.2.1: > > class Int(object): > def __int__(self): return 10 > > class Float(object): > def __float__(self): return 10.0 > > ----- results ------ > > int(Int) 10 > int(Float) TypeError: object can't be converted to int Um. I'm confuzzled. Float doesn't have an __int__ method; why do you expect it to work? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." 
--Ira Winkler From David Abrahams" <20020602155633.GA3139@panix.com> Message-ID: <0b5701c20a4e$6648f760$6601a8c0@boostconsulting.com> From: "Aahz" > On Sun, Jun 02, 2002, David Abrahams wrote: > > > > The following small program is giving me some unexpected results with > > Python 2.2.1: > > > > class Int(object): > > def __int__(self): return 10 > > > > class Float(object): > > def __float__(self): return 10.0 > > > > ----- results ------ > > > > int(Int) 10 > > int(Float) TypeError: object can't be converted to int > > Um. I'm confuzzled. Float doesn't have an __int__ method; why do you > expect it to work? I don't. As I wrote below that: *** OK, int() seems to work as expected Each item beginning with '***' is meant to refer to the group of 4 lines above. From barry@zope.com Sun Jun 2 17:00:27 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 2 Jun 2002 12:00:27 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? References: <20020601134236.GA17691@gerg.ca> <20020601220529.GA20025@gerg.ca> Message-ID: <15610.16667.463562.261068@anthem.wooz.org> >>>>> "FP" == François Pinard writes: FP> By the way, I would like if the module was not named `text'. Well, since Greg's writing it , I think "textutils" is a natural package name. :) -Barry From smurf@noris.de Sun Jun 2 17:05:37 2002 From: smurf@noris.de (Matthias Urlichs) Date: Sun, 2 Jun 2002 18:05:37 +0200 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: ; from gisle@ActiveState.com on Sun, Jun 02, 2002 at 08:40:49AM -0700 References: Message-ID: <20020602180537.G17316@noris.de> Hi, Gisle Aas: > Perl doesn't add references to names. It imports direct reference as > well. What I meant to say was: Perl shares the actual symbol table slot when you import something; so a later reassignment to the variable in question will affect every module. Python doesn't have that additional indirection.
> In Perl this works because we at import time know if we are importing > a variable (and what kind) or a function, and later assignments to > variables or redefinitions of functions mutate the object in-place. > ... which is essentially a different way to state the same thing. ;-) -- Matthias Urlichs | noris network AG | http://smurf.noris.de/ From goodger@users.sourceforge.net Sun Jun 2 17:19:39 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sun, 02 Jun 2002 12:19:39 -0400 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: <20020602164407.E17316@noris.de> Message-ID: Matthias, thank you for your explanation. I was operating under the assumption that the mechanism behind "from package import module" was somehow different from that behind "from module import name", because there is no name "module" inside package/__init__.py. A little experimentation confirmed that it was a mistaken assumption. Now all is clear. The note in section 6.12 ("The import statement") of the Language Reference, "XXX Can't be bothered to spell this out right now", has always bothered me. I will endeavour to flesh it out for release 2.3. Perhaps Guido's package support essay (http://www.python.org/doc/essays/packages.html), edited, should form a new section or appendix. Ideas anyone? -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From pinard@iro.umontreal.ca Sun Jun 2 17:35:42 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 02 Jun 2002 12:35:42 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? 
In-Reply-To: <200206021322.g52DMr531014@pcp742651pcs.reston01.va.comcast.net> References: <20020601134236.GA17691@gerg.ca> <20020601220529.GA20025@gerg.ca> <200206021322.g52DMr531014@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > > Do you have a reference for this algorithm apart from GNU fmt's > > > source code? > Can we focus on getting the module/package structure and a basic > algorithm first? It's fine to design the structure for easy > extensibility with other algorithms, but implementing Knuth's > algorithm seems hopelessly out of scope. There is no emergency for Knuth's algorithm, of course. However, if I mentioned it, this was as an invitation for the package to be designed with an open mind about extensibility. And it usually helps opening the mind, pondering various avenues. What should we read in your "hopelessly out of scope" comment, above? Do you mean you would object beforehand that Python offers it? -- François Pinard http://www.iro.umontreal.ca/~pinard From gisle@ActiveState.com Sun Jun 2 17:44:25 2002 From: gisle@ActiveState.com (Gisle Aas) Date: 02 Jun 2002 09:44:25 -0700 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: <20020602180537.G17316@noris.de> References: <20020602180537.G17316@noris.de> Message-ID: "Matthias Urlichs" writes: > Gisle Aas: > > Perl doesn't add references to names. It imports direct reference as > > well. > > What I meant to say was: Perl shares the actual symbol table slot when you > import something; Wrong. If you do: package Bar; use Foo qw(foo); Then you end up with \&Bar::foo and \&Foo::foo pointing to the same function object, but the symbol table slots are independent. It is exactly the same situation you would have in Python with two namespace dicts pointing to the same function object.
But you can achieve sharing by explicit import of the symbol (aka glob) using something like: use Foo qw(*foo); Regards, Gisle Aas From mal@lemburg.com Sun Jun 2 18:22:18 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 02 Jun 2002 19:22:18 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> Message-ID: <3CFA544A.6050701@lemburg.com> Raymond, while you're documenting the various changes, please include a Python version number with all of them, so that the migration guide can use this information as well. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From smurf@noris.de Sun Jun 2 18:45:06 2002 From: smurf@noris.de (Matthias Urlichs) Date: Sun, 2 Jun 2002 19:45:06 +0200 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: ; from gisle@ActiveState.com on Sun, Jun 02, 2002 at 09:44:25AM -0700 References: <20020602180537.G17316@noris.de> Message-ID: <20020602194506.H17316@noris.de> Hi, Gisle Aas: > > What I meant to say was: Perl shares the actual symbol table slot when you > > import something; > > Wrong. If you do: > > package Bar; > use Foo qw(foo); > > Then you end up with \&Bar::foo and \&Foo::foo pointing to the same > function object, but the symbol table slots are independent. Right, actually; we're just miscommunicating. Let's state it differently: the situation is more like "use Foo qw($foo)" (you can't assign to &Foo::foo).
After the "use", \$Bar::foo and \$Foo:foo point to the same scalar variable, thus $Foo::foo and $Bar::foo are the same variable and no longer independent (unless a different scalar or glob reference is stored in to *Foo::foo, but that's too much magic for a FAQ entry). In Python, Bar.foo gets set to the contents of Foo.foo when the import statement is processed, but the two variables are otherwise independent. -- Matthias Urlichs | noris network AG | http://smurf.noris.de/ From guido@python.org Sun Jun 2 21:56:39 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 02 Jun 2002 16:56:39 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib formatter.py,1.20,1.21 ftplib.py,1.69,1.70 gettext.py,1.13,1.14 hmac.py,1.5,1.6 In-Reply-To: Your message of "Sun, 02 Jun 2002 16:18:25 +1100." References: Message-ID: <200206022056.g52KudD31601@pcp742651pcs.reston01.va.comcast.net> [Andrew MacIntyre commenting on Raymond H's changes from "if x" to "if x is not None"] > You have in fact changed the semantics of this test with your change. > > In the case where file = '', the original would fall through to the > elif, whereas with your change it won't. The question is whether that's an important change or not. Raymond has changed the semantics in every case where he made this particular change. Usually that's fine. Occasionally it's not. > It concerns me that your extensive changes have introduced some > unexpected traps which won't be sprung until 2.3 is released. Me too. I think 99% of the changes were "right", but without looking at actual use cases much more we won't know where the 1% mistakes are. I think it's okay to do this (because of the 99%) but we should be aware that we may be breaking some code and willing to revert the decision in some cases. I'm not sure how we can test this enough before 2.3 is released -- surely the alphas won't shake out enough. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Sun Jun 2 22:53:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 02 Jun 2002 23:53:20 +0200 Subject: [Python-Dev] PYC Magic In-Reply-To: <2m1ybpu4cm.fsf@starship.python.net> References: <3CF8BCF9.5557.4C49011F@localhost> <2m1ybpu4cm.fsf@starship.python.net> Message-ID: Michael Hudson writes: > Someone on comp.lang.python reported getting corrupt .pycs by having > modules in a user-writeable directory being accessed more-or-less > simultaneously by different Python versions. I'm not sure what could > be done about that. The user could remove write permission on that directory. I think Python should provide an option to never write .pyc files, controllable through sys.something. Regards, Martin From martin@v.loewis.de Sun Jun 2 22:51:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 02 Jun 2002 23:51:30 +0200 Subject: [Python-Dev] "max recursion limit exceeded" canned response? In-Reply-To: <15610.6455.96035.742110@12-248-41-177.client.attbi.com> References: <15610.6455.96035.742110@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > How would we go about adding a canned response to the commonly submitted > "max recursion limit exceeded" bug report? Post the precise text that you want to see as the canned response, and somebody can install it. Regards, Martin From tim.one@comcast.net Sun Jun 2 23:03:55 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 02 Jun 2002 18:03:55 -0400 Subject: [Python-Dev] "max recursion limit exceeded" canned response? In-Reply-To: Message-ID: [Skip Montanaro] > How would we go about adding a canned response to the commonly submitted > "max recursion limit exceeded" bug report? [Martin v. Loewis] > Post the precise text that you want to see as the canned response, and > somebody can install it. I don't think any canned answer will suffice -- every context is different enough that it needs custom help.
I vote instead that we stop answering these reports at all: let /F do it. That will eventually provoke him into either writing the canned response he wants to see, or to complete the long-delayed task of removing this ceiling from sre. From akuchlin@mems-exchange.org Mon Jun 3 01:37:09 2002 From: akuchlin@mems-exchange.org (akuchlin@mems-exchange.org) Date: Sun, 2 Jun 2002 20:37:09 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: References: <20020601134236.GA17691@gerg.ca> <20020601220529.GA20025@gerg.ca> Message-ID: <20020603003709.GA1214@mems-exchange.org> On Sun, Jun 02, 2002 at 09:09:57AM -0400, Fran?ois Pinard wrote: >years ago. If I remember well, Knuth's algorithm plays by moving line >cuts and optimising a global function through dynamic programming, giving >more points, say, when punctuation coincides with end of lines, ... If that's the same algorithm that's used by TeX, see http://www.amk.ca/python/code/tex_wrap.html . --amk From guido@python.org Mon Jun 3 06:01:36 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 01:01:36 -0400 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: Your message of "Sun, 02 Jun 2002 12:19:39 EDT." References: Message-ID: <200206030501.g5351aG31933@pcp742651pcs.reston01.va.comcast.net> > The note in section 6.12 ("The import statement") of the Language > Reference, "XXX Can't be bothered to spell this out right now", has > always bothered me. I will endeavour to flesh it out for release > 2.3. Perhaps Guido's package support essay > (http://www.python.org/doc/essays/packages.html), edited, should > form a new section or appendix. Ideas anyone? The information should be worked into the language reference IMO. I'm not sure if an appendix is appropriate or if this could be part of the documentation for the import statement. Thanks for helping out!!! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 3 06:09:09 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 01:09:09 -0400 Subject: [Python-Dev] Other library code transformations In-Reply-To: Your message of "Sun, 02 Jun 2002 11:54:55 EDT." <15610.16335.191744.30203@anthem.wooz.org> References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> Message-ID: <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net> > What about "foo = foo + 1" => "foo += 1"? I'm not for making peephole changes like this. It's easy to make mistakes (even if you run the test suite) if you don't guess the type of a variable right. I think it's better to bring code up to date in style only as part of a serious rewrite of the module containing it -- so you can fix up all different aspects. It's often kind of strange to see a modernization like this in code that otherwise shows it hasn't been modified in 5 years... (Exceptions are to get rid of deprecation warnings, or outright failures, of course.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 3 06:10:08 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 01:10:08 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test test_signal.py,1.9,1.10 In-Reply-To: Your message of "02 Jun 2002 16:13:24 BST." 
<2m3cw5zpbf.fsf@starship.python.net> References: <2mit56ros4.fsf@starship.python.net> <2msn49hp3i.fsf@starship.python.net> <200205301228.g4UCS9S07601@pcp742651pcs.reston01.va.comcast.net> <2mhekphjdi.fsf@starship.python.net> <200205301540.g4UFefk24115@odiug.zope.com> <2melfthb9r.fsf@starship.python.net> <2m3cw5zpbf.fsf@starship.python.net> Message-ID: <200206030510.g535A8g31997@pcp742651pcs.reston01.va.comcast.net> > But nowhere else, it would seem, at least not when python is built > threaded. Darwin fails in a similar manner to FreeBSD, test_signal > hangs on Solaris, but only if it's run as part of the test suite -- > it runs fine if you just run it alone. Argh! I really am starting to > think that I should back out my patch and distribute my code as an > extension module. Opinions? Back it out, and think of a way to only enable it on Linux. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 3 06:12:02 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 01:12:02 -0400 Subject: [Python-Dev] PYC Magic In-Reply-To: Your message of "02 Jun 2002 15:45:13 BST." <2m1ybpu4cm.fsf@starship.python.net> References: <3CF8BCF9.5557.4C49011F@localhost> <2m1ybpu4cm.fsf@starship.python.net> Message-ID: <200206030512.g535C2Q32028@pcp742651pcs.reston01.va.comcast.net> > Someone on comp.lang.python reported getting corrupt .pycs by having > modules in a user-writeable directory being accessed more-or-less > simultaneously by different Python versions. I'm not sure what > could be done about that. Yes, that doesn't work... I suggest creating copies of the code per Python version. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 3 06:23:03 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 01:23:03 -0400 Subject: [Python-Dev] Numeric conversions In-Reply-To: Your message of "Sun, 02 Jun 2002 10:36:12 EDT."
<0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com> References: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com> Message-ID: <200206030523.g535N3t32090@pcp742651pcs.reston01.va.comcast.net> > The following small program is giving me some unexpected results with > Python 2.2.1: > > class Int(object): > def __int__(self): return 10 > > class Float(object): > def __float__(self): return 10.0 > > class Long(object): > def __long__(self): return 10L > > class Complex(object): > def __complex__(self): return (10+0j) > > def attempt(f,arg): > try: > return f(arg) > except Exception,e: > return str(e.__class__.__name__)+': '+str(e) > > for f in int,float,long,complex: > for t in Int,Float,Long,Complex: > print f.__name__ + '(' + t.__name__ + ')\t\t', > print attempt(f,t()) > > ----- results ------ > > int(Int) 10 > int(Float) TypeError: object can't be converted to int > int(Long) TypeError: object can't be converted to int > int(Complex) TypeError: object can't be converted to int > > *** OK, int() seems to work as expected > > float(Int) TypeError: float() needs a string argument > float(Float) 10.0 > float(Long) TypeError: float() needs a string argument > float(Complex) TypeError: float() needs a string argument > > *** float() seems to work, but what's with the error message about strings? Sloppy coding. float(), like int() and long(), takes either a number or a string. Raymond Hettinger fixed this in CVS in response to SF bug 551673, about two weeks ago. 
:-) > long(Int) TypeError: object can't be converted to long > long(Float) TypeError: object can't be converted to long > long(Long) 10 > long(Complex) TypeError: object can't be converted to long > > **** OK, long seems to work as expected > > complex(Int) TypeError: complex() arg can't be converted to > complex > complex(Float) (10+0j) > complex(Long) TypeError: complex() arg can't be converted to > complex > complex(Complex) TypeError: complex() arg can't be converted to > complex > > **** I can understand complex() handling Float implicitly, but only if it > also handles Complex! And if it does handle Float implicitly, shouldn't all > of these handle everything? The signature of complex() is different -- it takes two float arguments, the real and imaginary part. It doesn't take Complex() because of a bug: it only looks for __complex__ if the argument is a classic instance. Maybe Raymond can fix this? I've added a bug report: python.org/sf/563740 --Guido van Rossum (home page: http://www.python.org/~guido/) From smurf@noris.de Mon Jun 3 07:01:21 2002 From: smurf@noris.de (Matthias Urlichs) Date: Mon, 3 Jun 2002 08:01:21 +0200 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Sun, Jun 02, 2002 at 09:25:08AM -0400 References: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020603080121.I17316@noris.de> Hi, Guido van Rossum: > Good explanation! This means it's an unavoidable problem. Maybe you > can fix the FAQ entry? > I've rewritten FAQ 4.37, though I just noticed that I mis-pasted the log entry (it's incomplete). I'll be more careful in the future. 
:-/ -- Matthias Urlichs | noris network AG | http://smurf.noris.de/ From fredrik@pythonware.com Mon Jun 3 10:07:22 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 Jun 2002 11:07:22 +0200 Subject: [Python-Dev] Re: Adding Optik to the standard library References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> Message-ID: <01b401c20ade$13c7bdb0$0900a8c0@spiff> greg wrote: > (OTOH and wildly OT: since I gave in a couple years ago and started > using Emacs syntax-colouring, it *does* take a lot longer to load > modules up -- eg. ~2 sec for the 1000-line rfc822.py. (setq font-lock-support-mode 'lazy-lock-mode) From fredrik@pythonware.com Mon Jun 3 10:08:52 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 Jun 2002 11:08:52 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> Message-ID: <01d401c20ade$d0dc9650$0900a8c0@spiff> walter wrote: > import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime or, nicer: os.path.getmtime("foo") From fredrik@pythonware.com Mon Jun 3 10:12:34 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 Jun 2002 11:12:34 +0200 Subject: [Python-Dev] Re: Where to put wrap_text()? References: <20020601134236.GA17691@gerg.ca> <20020601220529.GA20025@gerg.ca> Message-ID: <01d501c20ade$d0e76bc0$0900a8c0@spiff> greg wrote: > Damn, I had no idea there was a body of computer science (however small) > devoted to the art of filling text. Trust Knuth to be there first. Do > you have a reference for this algorithm apart from GNU fmt's source > code? Google'ing for "knuth text fill algorithm" was unhelpful, ditto > with s/fill/wrap/. http://www.amk.ca/python/code/tex_wrap.html > > Also, is there some existing module in which `wraptext' would fit nicely?
> > That might be better than creating a new module for not many functions. "string" (yes, I'm serious). From barry@zope.com Mon Jun 3 10:48:26 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 3 Jun 2002 05:48:26 -0400 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15611.15210.602452.975631@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> What about "foo = foo + 1" => "foo += 1"? GvR> I'm not for making peephole changes like this. It's easy to GvR> make mistakes (even if you run the test suite) if you don't GvR> guess the type of a variable right. I think it's better to GvR> bring code up to date in style only as part of a serious GvR> rewrite of the module containing it -- so you can fix up all GvR> different aspects. It's often kind of strange to see a GvR> modernization like this in code that otherwise shows it GvR> hasn't been modified in 5 years... I agree! If it works, don't fix it. I was responding to this > While you're at it: could you also write up all these little > "code cleanups" in some file so that Andrew can integrate them > in the migration guide ?! I think it's a nice "code cleanup" that is worth noting in a migration guide. -Barry From barry@zope.com Mon Jun 3 11:06:32 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Mon, 3 Jun 2002 06:06:32 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> Message-ID: <15611.16296.202088.831238@anthem.wooz.org> >>>>> "FL" == Fredrik Lundh writes: FL> (setq font-lock-support-mode 'lazy-lock-mode) (add-hook 'font-lock-mode-hook 'turn-on-fast-lock) dueling-hooks-ly y'rs, -Barry From gmcm@hypernet.com Mon Jun 3 13:37:10 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 3 Jun 2002 08:37:10 -0400 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net> References: Your message of "Sun, 02 Jun 2002 06:04:59 +0200." Message-ID: <3CFB2AB6.22345.55C5AC0E@localhost> On 2 Jun 2002 at 9:25, Guido van Rossum wrote: [Matthias Urlichs] > > The problem is that importing with from consists of > > two steps: - load the module - add the imported names > > to the local namespace > > Good explanation! This means it's an unavoidable > problem. Um, different problem. What Matthias explains is unavoidable. But in David's case, the containing namespace (the package) is not empty when module2 wants module1. In fact, I believe that sys.modules['package.module1'] is there (though *it* is empty). My guess is that import is looking for module1 as an attribute of package, and that that binding hasn't taken place yet. If I use iu instead of the builtin import, it works. 
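Gordon's diagnosis can be sketched directly: mid-way through the circular import, sys.modules already holds the submodule, but the package object does not yet have the corresponding attribute. The package and module names below are invented for illustration:

```python
import sys
import types

# Simulate the interpreter's state part-way through importing
# package.module1 (names are hypothetical):
package = types.ModuleType("package")
module1 = types.ModuleType("package.module1")
sys.modules["package"] = package
sys.modules["package.module1"] = module1

# The submodule is already registered...
assert "package.module1" in sys.modules

# ...but "from package import module1" does an attribute lookup on the
# package object, and that binding has not been made yet:
assert not hasattr(package, "module1")

# The binding normally appears once the submodule's import completes:
package.module1 = module1
assert hasattr(package, "module1")

# Clean up the simulated entries.
del sys.modules["package"], sys.modules["package.module1"]
```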
-- Gordon http://www.mcmillan-inc.com/ From skip@pobox.com Mon Jun 3 14:34:13 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 3 Jun 2002 08:34:13 -0500 Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <15611.16296.202088.831238@anthem.wooz.org> References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org> Message-ID: <15611.28757.838957.151483@12-248-41-177.client.attbi.com> BAW> (add-hook 'font-lock-mode-hook 'turn-on-fast-lock) Whoa! What a difference! After seeing your response to /F, I assume his solution was for GNU Emacs and yours is for XEmacs, right? Skip From barry@zope.com Mon Jun 3 14:37:48 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 3 Jun 2002 09:37:48 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org> <15611.28757.838957.151483@12-248-41-177.client.attbi.com> Message-ID: <15611.28972.172040.401411@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: BAW> (add-hook 'font-lock-mode-hook 'turn-on-fast-lock) SM> Whoa! What a difference! SM> After seeing your response to /F, I assume his solution was SM> for GNU Emacs and yours is for XEmacs, right? What's "GNU Emacs"? -Barry From guido@python.org Mon Jun 3 15:04:58 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 10:04:58 -0400 Subject: [Python-Dev] Other library code transformations In-Reply-To: Your message of "Mon, 03 Jun 2002 05:48:26 EDT."
<15611.15210.602452.975631@anthem.wooz.org> References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net> <15611.15210.602452.975631@anthem.wooz.org> Message-ID: <200206031404.g53E4w400629@pcp742651pcs.reston01.va.comcast.net> > I was responding to this > > > While you're at it: could you also write up all these little > > "code cleanups" in some file so that Andrew can integrate them > > in the migration guide ?! > > I think it's a nice "code cleanup" that is worth noting in a migration > guide. I think MAL wrote that. I interpreted it as "Raymond should document which modules he changed, and how, so that people can be aware of the subtle semantic changes." I now realize that he probably meant "can you write up a list of things you can do to modernize your code." But IMO the latter doesn't belong in a migration guide; the migration guide should focus on what you *have* to change in order to avoid disappointments later. Most of the things Raymond does aren't about new features in 2.3 either. And I *do* think that a migration guide should at least contain a general warning about the kind of changes that Raymond did that might affect 3rd party code. --Guido van Rossum (home page: http://www.python.org/~guido/) From s_lott@yahoo.com Mon Jun 3 14:58:24 2002 From: s_lott@yahoo.com (Steven Lott) Date: Mon, 3 Jun 2002 06:58:24 -0700 (PDT) Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: Message-ID: <20020603135824.43784.qmail@web9605.mail.yahoo.com> Another place for a text wrap function would be part of pprint. I agree with M. Pinard that it doesn't deserve an entire module. RE seems a little off-task for formatting text. PPRINT seems more closely related to the core problem. 
And it leaves room for adding additional formatting and pretty-printing features. Perhaps a small class hierarchy with different wrapping algorithms (filling and justifying, no filling, etc.) --- Tim Peters wrote: > [Greg Ward, on wrapping text] > > ... > > Note that regrtest.py also has a wrapper: > > def printlist(x, width=70, indent=4): > """Print the elements of a sequence to stdout. > > Optional arg width (default 70) is the maximum line > length. > Optional arg indent (default 4) is the number of blanks > with which to > begin each line. > """ > > This kind of thing gets reinvented too often, so +1 on a > module from me. > Just make sure it handle the union of all possible desires, > but has a simple > and intuitive interface . > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev ===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd. __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From mal@lemburg.com Mon Jun 3 15:15:06 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 03 Jun 2002 16:15:06 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net> <15611.15210.602452.975631@anthem.wooz.org> <200206031404.g53E4w400629@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3CFB79EA.3070607@lemburg.com> Guido van Rossum wrote: >>I was responding to this >> >> > While you're at it: could you also write up all these little >> > "code cleanups" in some file so that Andrew can integrate them >> > in the migration guide ?! >> >>I think it's a nice "code cleanup" that is worth noting in a migration >>guide. > > > I think MAL wrote that. I interpreted it as "Raymond should document > which modules he changed, and how, so that people can be aware of the > subtle semantic changes." I now realize that he probably meant "can > you write up a list of things you can do to modernize your code." I meant the latter. This information is needed in order to modernise Python code and also to have a reference for checking existing code against a specific Python version. > But IMO the latter doesn't belong in a migration guide; the migration > guide should focus on what you *have* to change in order to avoid > disappointments later. Most of the things Raymond does aren't about > new features in 2.3 either. I know. That's why I asked Raymond to add a Python version to each of the modifications (basically pointing out in which Python version this coding style became available). Note that migration does not only include correcting code which might break; it also covers code cleanups like what Raymond is currently doing. I somehow have a feeling that you are afraid of such a guide, Guido. Is that so? And if yes, why?
I think all this is valuable information and worth publishing. > And I *do* think that a migration guide should at least contain a > general warning about the kind of changes that Raymond did that might > affect 3rd party code. Certainly. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From guido@python.org Mon Jun 3 15:21:04 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 10:21:04 -0400 Subject: [Python-Dev] intra-package mutual imports fail: "from import " In-Reply-To: Your message of "Mon, 03 Jun 2002 08:37:10 EDT." <3CFB2AB6.22345.55C5AC0E@localhost> References: Your message of "Sun, 02 Jun 2002 06:04:59 +0200." <3CFB2AB6.22345.55C5AC0E@localhost> Message-ID: <200206031421.g53EL4W00847@pcp742651pcs.reston01.va.comcast.net> > Um, different problem. What Matthias explains is > unavoidable. But in David's case, the containing > namespace (the package) is not empty when > module2 wants module1. In fact, I believe that > sys.modules['package.module1'] is there (though > *it* is empty). > > My guess is that import is looking for module1 > as an attribute of package, and that that binding > hasn't taken place yet. Yes, that's what "from package import module" does -- it wants "module" to be an attribute of "package". This is because it doesn't really distinguish between "from package import module" and "from module import attribute". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 3 15:50:39 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 10:50:39 -0400 Subject: [Python-Dev] Other library code transformations In-Reply-To: Your message of "Mon, 03 Jun 2002 16:15:06 +0200." 
<3CFB79EA.3070607@lemburg.com> References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net> <15611.15210.602452.975631@anthem.wooz.org> <200206031404.g53E4w400629@pcp742651pcs.reston01.va.comcast.net> <3CFB79EA.3070607@lemburg.com> Message-ID: <200206031450.g53Eodx01137@pcp742651pcs.reston01.va.comcast.net> > I somehow have a feeling that you are afraid of such a guide, > Guido. Is that so ? and if yes, why ? I think all this is valuable > information and worth publishing. No, not at all! I just misunderstood what your purpose of a migration guide was. Sorry for the confusion. --Guido van Rossum (home page: http://www.python.org/~guido/) From pobrien@orbtech.com Mon Jun 3 16:03:42 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Mon, 3 Jun 2002 10:03:42 -0500 Subject: [Python-Dev] Other library code transformations In-Reply-To: <200206031450.g53Eodx01137@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > No, not at all! I just misunderstood what your purpose of a migration > guide was. Sorry for the confusion. Certainly distinguishing between Required changes and Recommended changes would be a good thing. Required changes are what needs to be done to keep old code working. Recommended changes could include stylistic changes, new features, new idioms, etc. I think it would even make sense to talk about changes that should be made in anticipation of future feature deprecation. --- Patrick K. 
O'Brien Orbtech From mgilfix@eecs.tufts.edu Mon Jun 3 16:22:45 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Mon, 3 Jun 2002 11:22:45 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200205232013.g4NKD6X07596@odiug.zope.com>; from guido@python.org on Thu, May 23, 2002 at 04:13:06PM -0400 References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> Message-ID: <20020603112245.E19838@eecs.tufts.edu> Alrighty. Here's the monster reply. I'll be much faster with the replies this week. Had a hectic week last week. On Thu, May 23 @ 16:13, Guido van Rossum wrote: > General style nits: > > - You submitted a reverse diff! Customary is diff(old, new). Oops. Will fix that this time round. > - Please don't put a space between the function name and the open > parenthesis. (You do this both in C and in Python code.) Fixed. Some personal preference bled in there. All removed in my copy. > - Also please don't put a space between the open parenthesis and the > first argument (you do this almost consistently in test_timeout.py). Couldn't really figure out what you were seeing here. I read that you saw something like func( a, b), which I don't see in my local copy. I do have something like this for list comprehension: [ x.find('\n') for x in self._rbuf ] Er, but I thought there were supposed to be surrounding spaces at the edges... > - Please don't introduce lines longer than 78 columns. Fixed my offending line. I've also corrected some other lines in the socket module that went over 78 columns (there were a few). > > Feedback on the patch to socket.py: > > - I think you're changing the semantics of unbuffered and > line-buffered reads/writes on Windows. For one thing, you no longer > implement line-buffered writes correctly. The idea is that if the > buffer size is set to 1, data is flushed at \n only, so that if > the code builds up the line using many small writes, this doesn't > result in many small sends.
There was code for this in write() -- > why did you delete it? I screwed up the write. Now the write is:

    def write(self, data):
        self._wbuf = self._wbuf + data
        if self._wbufsize == 1:
            if '\n' in data:
                self.flush ()
        elif len(self._wbuf) >= self._wbufsize:
            self.flush()

which is pretty much the same as the old. The read should be ok though. I could really use someone with a win compiler to test this for me. > - It also looks like you've broken the semantics of size<0 in read(). Maybe I'm misunderstanding the code, but I thought that a size < 0 meant to read until there are no more? The statement:

    while size < 0 or buf_len < size:

accomplishes the same thing as what's in the current socket.py implementation. If you look closely, the 'if >= 0' branch *always* returns, meaning that the < 0 is equiv to while 1. Due to shortcutting, the same thing happens in the above statement. Maybe a comment would make it clearer? > - Maybe changing the write buffer to a list makes sense too? I could do this. Then just do a join before the flush. Is the append /that/ much faster? > - Since this code appears independent from the timeoutsocket code, > maybe we can discuss this separately? The point of this code was to keep from losing data when an exception occurs (as timothy, if I remember correctly, pointed out). Hence the reason for keeping a lot more data around in instance variables instead of local variables. So the windows version might (in obscure cases) be affected by the timeout changes. That's what this patch was addressing. > > Feedback on the documentation: > > - I would document that the argument to settimeout() should be int or > float (hm, can it be a long? that should be acceptable even if it's > strange), and that the return value of gettimeout() is None or a > float. It can be a long in my local copy. The argument can be any numeric value and the special None. I've updated my documentation to be more explicit.
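[The settimeout()/gettimeout() semantics being hashed out here are easy to check against the API as it eventually shipped; the following is a minimal sketch using the modern socket module, not code from the patch itself.]

```python
import socket

# Sketch of the semantics discussed above, as they ended up in the
# shipped socket module: settimeout() accepts any numeric value or None,
# and gettimeout() always reports back None or a float.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

s.settimeout(5)              # an int is accepted...
print(s.gettimeout())        # ...but reported as the float 5.0

s.settimeout(0.25)           # a float works as-is
print(s.gettimeout())        # 0.25

s.settimeout(None)           # None means no timeout at all
print(s.gettimeout())        # None

s.close()
```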
> - You may want to document the interaction with blocking mode. I've put notes in the tex documentation. Here's how the interaction works: if the socket is in non-blocking mode: All operations are non-blocking and setting timeouts doesn't mean anything (they are not enforced). A timeout can still be changed and gettimeout will reflect the value but the exception will never be raised. else if the socket is in blocking mode: enabling timeouts does the usual thing you would expect from timeouts. > Feedback on the C socket module changes: > > - Why import the select module? It's much more efficient to use the > select system call directly. You don't need all the overhead and > generality that the select module adds on top of it, and it costs a > lot (select allocates lots of objects and lots of memory and hence > is very expensive). Well, the thinking was that if there were any portability issues with select, they could be taken care of in one place. At the time, I hadn't really looked closely at the select module. Now that I glance at it, pretty much all the code in the select module just extracts the necessary information from the objects for polling. I suppose I could just use select directly... There's also the advantage of all the error handling in select. I could do a stripped down version of the code, I suppose, for speed. Seemed like a good idea for code re-use. > - is already included by Python.h. I didn't do this but it's been removed. > - Please don't introduce more static functions with a 'Py' name > prefix. Only did this in one place, with PyTimeout_Err. The reason was that the other Error functions used the Py prefix, so it was done for consistency. I can change that.. or change the naming scheme with the others if you like. > - You should raise TypeError if the type of the argument is wrong, and > ValueError if the value is wrong (out of range). Not SocketError. Oops. Fixed. 
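[The TypeError-vs-ValueError convention Guido asks for is visible in the socket module as it shipped; a small sketch against today's API, not the patch under review:]

```python
import socket

# The convention: a wrong *type* raises TypeError, a wrong *value*
# (out of range) raises ValueError -- never a socket-specific error.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

try:
    s.settimeout("soon")     # not a number at all
except TypeError:
    print("wrong type -> TypeError")

try:
    s.settimeout(-1.0)       # a number, but out of range
except ValueError:
    print("wrong value -> ValueError")

s.close()
```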
> - I believe that you can't reliably maintain a "sock_blocking" flag; > there are setsockopt() or ioctl() calls that can make a socket > blocking or non-blocking. Also, there's no way to know whether a > socket passed to fromfd() is in blocking mode or not. Well, upon socket creation (in init_sockobject), the socket is set to blocking mode. I think that handles fromfd, right? Doesn't every initialization means have to call that function? The real problem would be someone using an ioctl or setsockopt (Can you even do blocking stuff through setsockopt?). Ugh. The original timeoutsocket didn't really deal with anything like that. Erm, seems like an interface problem here - using ioctl kinda breaks the socket object interface. Perhaps we should be doing some sort of getsockopt to figure out the blocking mode and update our state accordingly? That would be an extra call for each thing to the interface though. One solution is to set/unset blocking mode right before doing each call to be sure of the state and based on the internally stored value of the blocking attribute... but... then that kind of renders ioctl useless. Another solution might be to set the blocking mode to on everytime someone sets a timeout. That would change the blocking/socket interaction already described a bit but not drastically. Also easy to implement. That sends the message: Don't use ioctls when using timeouts. Hmm.. Will need to think about this more. Any insight would be helpful or some wisdom about how you usually handle this sort of thing. > - There are refcount bugs. I didn't do a detailed review of these, > but I note that the return value from PyFloat_FromDouble() in > PySocketSock_settimeout() is leaked. (There's an INCREF that's only > needed for the None case.) This has been fixed. I was one ref count too high in my scheme. 
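[The refcount bookkeeping being debugged here can be illustrated from the Python side; a CPython-specific sketch -- sys.getrefcount is an implementation detail, and its result includes the temporary reference created for the call itself:]

```python
import sys

# Each extra reference -- like the stray INCREF Guido spotted -- bumps
# the count by one; the balancing DECREF (here, del) brings it back down.
value = 3.14
baseline = sys.getrefcount(value)

alias = value                       # analogous to an INCREF in C
assert sys.getrefcount(value) == baseline + 1

del alias                           # the matching DECREF
assert sys.getrefcount(value) == baseline
print("references balanced")
```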
> - The function internal_select() is *always* used like this:
>
>     count = internal_select (s, 1);
>     if (count < 0)
>         return NULL;
>     else if (count == 0) /* Timeout elapsed */
>         return PyTimeout_Err ();
>
> If internal_select() called PyTimeout_Err() itself, all call sites
> could be simplified to this:
>
>     count = internal_select (s, 1);
>     if (count < 0)
>         return NULL;
>
> or even (now that the count variable is no longer needed) to this:
>
>     if(internal_select (s, 1) < 0)
>         return NULL;

Good point. Except the return value needs to be checked for <= 0 in this case. Changes were made.

> - The accept() wrapper contains this bit of code (only showing the
> Unix variant):
>
>     if (newfd < 0)
>         if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK))
>             return s->errorhandler ();
>
> Isn't the sense of testing s->sock_blocking wrong? I would think
> that if we're in blocking mode we'd want to return immediately
> without even checking the errno. I recommend writing this out more
> clearly, e.g. like this:
>
>     if (s->sock_blocking)
>         return s->errorhandler();
>     /* A non-blocking accept() failed */
>     if (errno != EAGAIN && errno != EWOULDBLOCK)
>         return s->errorhandler();

I've written this out more explicitly as you suggest. It is supposed to be !s->sock_blocking though. If we're in non-blocking mode at that point with an error, then it's definitely an error. If we're in blocking mode, then we have to check the type of error. The reason being that the underlying socket is always in non-blocking mode (remember select) so we need to check that we don't have a weird error. I've written it out like this:

    if (newfd < 0) {
        if (!s->sock_blocking)
            return s->errorhandler();
        /* Check if we have a true failure for a blocking socket */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return s->errorhandler();
    }

I've also fixed a similar thing for connect.

> - What is s->errorhandler() for?
AFAICT, this is always equal to > PySocket_Err! This was always in the module. Not sure why it was put there initially. I used it to be consistent. > - The whole interaction between non-blocking mode and timeout mode is > confusing to me. Are you sure that this always does the right > thing? Have you even thought about what "the right thing" is in all > 4 combinations? I think I've explained this earlier in the thread. Lemme know if I need any more clarifications. If you made it this far, it's time for coffee. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From guido@python.org Mon Jun 3 16:34:34 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 11:34:34 -0400 Subject: [Python-Dev] Other library code transformations In-Reply-To: Your message of "Mon, 03 Jun 2002 10:03:42 CDT." References: Message-ID: <200206031534.g53FYYR01331@pcp742651pcs.reston01.va.comcast.net> > Certainly distinguishing between Required changes and Recommended > changes would be a good thing. Required changes are what needs to be > done to keep old code working. Recommended changes could include > stylistic changes, new features, new idioms, etc. I think it would > even make sense to talk about changes that should be made in > anticipation of future feature deprecation. I'm not sure that using x+=1 instead of x=x+1 should even be a recommended change. This is a personal choice, just like using True/False to indicate truth values. The "is None" vs. "== None" issue is a general style recommendation, not a migration tip. This is a "should do" issue. The "if not x" vs. "if x is None" issue is also a general style recommendation. This is a "could do" issue, because the semantics are different.
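[The semantic difference between the two "could do" spellings is easy to demonstrate; a minimal sketch:]

```python
# "not x" is true for every false-ish value, while "x is None" picks out
# None alone -- which is why swapping one for the other changes behavior.
for x in (None, 0, 0.0, "", [], {}):
    print(f"{x!r:6}  not x: {not x}   x is None: {x is None}")
```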
--Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Mon Jun 3 16:54:11 2002 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 03 Jun 2002 17:54:11 +0200 Subject: [Python-Dev] Other library code transformations References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <01d401c20ade$d0dc9650$0900a8c0@spiff> Message-ID: <3CFB9123.7050203@livinglogic.de> Fredrik Lundh wrote: > walter wrote: > > >>import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime > > > or, nicer: > > os.path.getmtime("foo") Is there an os.path function available for all the os.stat entries? Which version should we use? Should we change this at all? (stat.py won't go away, only string.py and types.py will.) Bye, Walter Dörwald From pobrien@orbtech.com Mon Jun 3 16:58:41 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Mon, 3 Jun 2002 10:58:41 -0500 Subject: [Python-Dev] Other library code transformations In-Reply-To: <200206031534.g53FYYR01331@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > > Certainly distinguishing between Required changes and Recommended > > changes would be a good thing. Required changes are what needs to be > > done to keep old code working. Recommended changes could include > > stylistic changes, new features, new idioms, etc. I think it would > > even make sense to talk about changes that should be made in > > anticipation of future feature deprecation. > > I'm not sure that using x+=1 instead of x=x+1 should be even a > recommended change. This is a personal choice, just like using > True/False to indicate truth values. But the choice only becomes available with a certain version of Python. > The "is None" vs. "== None" issue is a general style recommendation, > not a migration tip. This is a "should do" issue. Right. But it wouldn't hurt to remind people in a migration guide, would it? > The "if not x" vs. 
"if x is None" issue is also a general style > recommendation. This is a "could do" issue, because the semantics are > different. Perhaps three sections then: Required, Recommended and Optional (TMTOWTDI)? I just think it would be good to know what coding changes one could/should start making when a new version is released. And which practices one should stop. And if there are lots of examples in real code of certain poor practices, it wouldn't hurt to point them out as well, even if they aren't necessarily tied to a particular release of Python. (But I'm willing to concede that I might be stretching the scope of this migration guide beyond its limits with that last item.) --- Patrick K. O'Brien Orbtech From guido@python.org Mon Jun 3 17:02:59 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 12:02:59 -0400 Subject: [Python-Dev] Other library code transformations In-Reply-To: Your message of "Mon, 03 Jun 2002 10:58:41 CDT." References: Message-ID: <200206031602.g53G2xm02048@pcp742651pcs.reston01.va.comcast.net> > > I'm not sure that using x+=1 instead of x=x+1 should be even a > > recommended change. This is a personal choice, just like using > > True/False to indicate truth values. > > But the choice only becomes available with a certain version of Python. 2.0. > > The "is None" vs. "== None" issue is a general style recommendation, > > not a migration tip. This is a "should do" issue. > > Right. But it wouldn't hurt to remind people in a migration guide, > would it? Yes it would. A migration guide should focus on migration and leave general style tips to other documents. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 3 18:22:16 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 13:22:16 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Your message of "Mon, 03 Jun 2002 11:22:45 EDT." 
<20020603112245.E19838@eecs.tufts.edu> References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> Message-ID: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> [Addressing only points that need attention] > > - Also please don't put a space between the open parenthesis and the > > first argument (you do this almost consistently in test_timeout.py). > > Couldn't really figure out what you were seeing here. I read that > you saw something like func( a, b), which I don't see in my local > copy. test_timeout.py from the SF page has this. I'm glad you fixed this already in your own copy. > I do have something like this for list comprehension: > > [ x.find('\n') for x in self._rbuf ] > > Er, but I thought there were supposed to be surrounding spaces at the > edges... I prefer to see that as [x.find('\n') for x in self._rbuf] > I screwed up the write. Now the write is: > [...] > > which is pretty much the same as the old. The read should be ok > though. I could really use someone with a win compiler to test this > for me. I'll review it more when you next upload it. > > - It also looks like you've broken the semantics of size<0 in read(). > > Maybe I'm misunderstanding the code, but I thought that a size < 0 > meant to read until there are no more? The statement: > > while size < 0 or buf_len < size: > > accomplishes the same thing as what's in the current socket.py > implementation. If you look closely, the 'if >= 0' branch *always* returns, > meaning that the < 0 is equiv to while 1. Due to shortcutting, the same > thing happens in the above statement. Maybe a comment would make it clearer? I was referring to this piece of code:

    ! if buf_len > size:
    !     self._rbuf.append (data[size:])
    !     data = data[:size]

Here data[size:] gives you the last byte of the data and data[:size] chops off the last byte. > > - Maybe changing the write buffer to a list makes sense too? > > I could do this.
Then just do a join before the flush. Is the append /that/ much faster? Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2). > > - Since this code appears independent from the timeoutsocket code, > > maybe we can discuss this separately? > > The point of this code was to keep from losing data when an exception > occurs (as timothy, if I remember correctly, pointed out). Hence the reason > for keeping a lot more data around in instance variables instead of local > variables. So the windows version might (in obscure cases) be affected > by the timeout changes. That's what this patch was addressing. OK, but given the issues the first version had, I recommend that the code gets more review and that you write unit tests for all cases. > > - Please don't introduce more static functions with a 'Py' name > > prefix. > > Only did this in one place, with PyTimeout_Err. The reason was that the > other Error functions used the Py prefix, so it was done for consistency. I > can change that.. or change the naming scheme with the others if you like. I like to do code cleanup that doesn't change semantics (like renamings) as a separate patch and checkin. You can do this before or after the timeout changes, but don't merge it into the timeout changes. I'd still like the static names that you introduce not to start with Py. > > - I believe that you can't reliably maintain a "sock_blocking" flag; > > there are setsockopt() or ioctl() calls that can make a socket > > blocking or non-blocking. Also, there's no way to know whether a > > socket passed to fromfd() is in blocking mode or not. > > Well, upon socket creation (in init_sockobject), the socket is > set to blocking mode. I think that handles fromfd, right? Doesn't > every initialization means have to call that function? OK, it looks like you call internal_setblocking(s, 0) to set the socket in nonblocking mode.
(Hm, I don't see any calls to set the socket in blocking mode!) So do I understand that you are now always setting the socket in non-blocking mode, even when there is no timeout specified, and that you look at the sock_blocking flag to decide whether to do timeouts or just pass the nonblocking behavior to the user? This is a change in semantics, and could interfere with existing applications that pass the socket's file descriptor off to other code. I think I'd be happier if the behavior wasn't changed at all until a timeout is set for a socket -- then existing code won't break. > The real problem would be someone using an ioctl or setsockopt (Can > you even do blocking stuff through setsockopt?). Yes, setblocking() makes a call to setsockopt(). :-) > Ugh. The original timeoutsocket didn't really deal with anything > like that. Erm, seems like an interface problem here - using ioctl > kinda breaks the socket object interface. Perhaps we should be doing > some sort of getsockopt to figure out the blocking mode and update > our state accordingly? That would be an extra call for each thing to > the interface though. I only really care for sockets passed in to fromfd(). E.g. someone can currently do: s1 = socket(AF_INET, SOCK_STREAM) s1.setblocking(0) s2 = fromfd(s1.fileno()) # Now s2 is non-blocking too I'd like this to continue to work as long as s1 doesn't set a timeout. > One solution is to set/unset blocking mode right before doing each > call to be sure of the state and based on the internally stored value > of the blocking attribute... but... then that kind of renders ioctl > useless. Don't worry so much about ioctl, but do worry about fromfd. > Another solution might be to set the blocking mode to on everytime > someone sets a timeout. That would change the blocking/socket > interaction already described a bit but not drastically. Also easy > to implement. That sends the message: Don't use ioctls when using > timeouts. I like this. > Hmm.. 
Will need to think about this more. Any insight would be > helpful or some wisdom about how you usually handle this sort of > thing. See above. Since we don't know what people out there are doing, I don't want to break existing code. We do know that existing code doesn't use (this form of) timeout, so we can exploit that knowledge. > > - What is s->errorhandler() for? AFAICT, this is always equal to > > PySocket_Err! > > This was always in the module. Not sure why it was put there intially. > I used it to be consistent. Argh, you're right. MAL added this; I'll ask him why. > If you made it this far, it's time for coffee. When can I expect a new version? --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Mon Jun 3 18:16:02 2002 From: aahz@pythoncraft.com (Aahz) Date: Mon, 3 Jun 2002 13:16:02 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <15611.16296.202088.831238@anthem.wooz.org> References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org> Message-ID: <20020603171602.GB20395@panix.com> On Mon, Jun 03, 2002, Barry A. Warsaw wrote: > > >>>>> "FL" == Fredrik Lundh writes: > > FL> (setq font-lock-support-mode 'lazy-lock-mode) > > (add-hook 'font-lock-mode-hook 'turn-on-fast-lock) Which proves that vi[m] is the true Pythonic editor -- there's only one way. baiting-ly y'rs -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler
Au côté technique de la chose, il y a certes des rapports et des Webseiten: Totalitaer,de - Die Waffe gegen die Kritik http://www.raum-und-zeit.com/Aktuell/Brummton.htm http://www.fosar-bludorf.com/Tempelhof/ http://jya.com/haarp.htm http://www.zeitenschrift.at/magazin/zs_24_15/1_mikrowaffen.htm http://www.bse-plus.de/d/doc/lbrief/lbmincontr.htm http://home.nexgo.de/kraven/bigb/big3.html http://w3.nrl.navy.mil/projects/haarp/index.html http://cryptome.org/ http://www.raven1.net/ravindex.htm http://www.calweb.com/~welsh/ http://www.cahe.org/ http://www.parascope.com/ds/mkultra0.htm http://www.trufax.org/menu/mind.html http://www.trufax.org/menu/elect.html http://mindcontrolforum.com/ http://www.trufax.org/menu/elect.html usw. usw. usw. toutefois qui ne peut quand même pas être qui on fait soetwas, ou? Une violation des droits de l'homme séparer ressembler!?! Il est possible, par la préparation, des oreilles et dans l'effet avec la prothèse dentaire éventuellement existante? Avec la technique de radio relativement simple?? Dans ce Land?Hier et aujourd'hui Sous quels motifs? Où le département est-il en réalité 5 du BND et de la protection d'constitution? peut il être qu'il y a les personnes qui livrent en permanence le BND/Verfassungsschutz, de manière funktechnischem un rapport de situation, sans le remarquer le -même , dans l'enfance rendu possible?? Par de tels collaborateurs officieux, avec le BND et la protection d'constitution, après manière, des informations sont-elles rassemblées et plus de, purement théoriquement, chaque citoyen allemand? Il y a alors encore un droit à des Privatsphere? Qui contrôle en réalité le BND, mad et protection d'constitution sur une infiltration??? 
Il s'agit en réalité dans le Mail me la question de savoir si lui éléments criminels, dont le motif de l'enrichissement, ou de groupements des motifs idéologiques, possible de s'acquérir le savoir et la technique qui à d'autres temps, est autre MotivenEt place-t-il le savoir technique dont le public vraiment la fin la barre de drapeau a connaissance ? Il n'est pas donc exactement la même chose possible pour des éléments criminels, moi cela maintenant fois verharmlost et minimisant une légende, personnes ou groupes particuliers avec des moyens relativement simples, de quels motifs aussi toujours, auszuspionieren?(Westfernsehen?), a été développé. Et ce "Ausspioniererei" ne représente-t-il pas une intervention considérable dans la vie privée? Il est possible personnes ou groupes particuliers, pour certain Oeffentlichkeit(suggeriert?), celui p. ex. à l'aide des côtés Internet, comme par exemple "le Pranger"geschaffen pourrait, fois vorausgestzt schikanieren terroriesieren et ou , et qui toute (suggerierten)Oeffentlichkeit?Haben les personnes ceux là, ou d'un autre côté verunglimpft, ou on ne pas calomnie, en réalité une chance au Gegenoeffentlichkeit?Ist qui meurtre d'appel? Il y a quelques années, je ne suis pas encore par hasard sur le côté "celui" poussé, fonctionnais alors cela sous la couche de pont de l'entremise partenaire. Des personnes particulières, ou des communautés d'intérêts le peuventelles, d'un autobut pur, de tels côtés servent, sous la couche de pont d'un Zivilkourage douteux, tracent plus de des campagnes de précipitation, propres intérêts tout à fait persoehnliche entremêlent? De tels côtés peuvent-ils servir à la coordination des manoeuvres criminelles? 
From tim.one@comcast.net Mon Jun 3 18:32:27 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 03 Jun 2002 13:32:27 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] >>> - Maybe changing the write buffer to a list makes sense too? [mgilfix@eecs.tufts.edu] >> I could do this. Then just do a join before the flush. Is the append >> /that/ much faster? [Guido] > Depends on how small the chunks are you write. Roughly, repeated list > append is O(N log N), while repeated string append is O(N**2).
Repeated list append is O(N) amortized (a single append may take O(N) time all by itself, but if you do N of them in a row the time is still no worse than O(N) overall; a possible conceptual difficulty may arise here because the value of "N" changes over time, and while growing to a total of size N may require O(log N) whole-list copies, each of the copies involves far fewer elements than the final value of N -- if you add up all these smaller values of N in the worst case, the sum is O(N) wrt the final value of N, and so it's worst-case O(N) overall wrt the final value of N). From trentm@ActiveState.com Mon Jun 3 19:04:16 2002 From: trentm@ActiveState.com (Trent Mick) Date: Mon, 3 Jun 2002 11:04:16 -0700 Subject: [Python-Dev] PYC Magic In-Reply-To: ; from martin@v.loewis.de on Sun, Jun 02, 2002 at 11:53:20PM +0200 References: <3CF8BCF9.5557.4C49011F@localhost> <2m1ybpu4cm.fsf@starship.python.net> Message-ID: <20020603110416.A25092@ActiveState.com> [Martin v. Loewis wrote] > I think Python should provide an option to never write .pyc files, > controllable through sys.something. +1 ActivePython's uninstallation process has a custom action (which uses the installed Python) to remove .pyc files before the MSI process removes the other files. That process *creates* new .pyc files which makes uninstallation a little bit of a pain. 
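The bytecode cleanup Trent describes can be sketched as a small tree walk (a hypothetical stand-in for ActivePython's actual MSI custom action; the function name and the use of os.walk are illustrative only):

```python
import os

def remove_bytecode(root):
    """Delete .pyc/.pyo files under root and report what was removed.

    Sketch only: the real uninstaller runs this kind of pass as an MSI
    custom action using the installed Python.
    """
    removed = []
    # topdown=False visits leaf directories first, so removals never
    # race ahead of the traversal.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            if name.endswith(".pyc") or name.endswith(".pyo"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

The catch described above remains: the Python process running the sweep can itself import modules and write fresh .pyc files after the pass has finished.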
Trent -- Trent Mick TrentM@ActiveState.com From mgilfix@eecs.tufts.edu Mon Jun 3 19:39:06 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Mon, 3 Jun 2002 14:39:06 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 03, 2002 at 01:22:16PM -0400 References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020603143906.J19838@eecs.tufts.edu> On Mon, Jun 03 @ 13:22, Guido van Rossum wrote: > > If you made it this far, it's time for coffee. > > When can I expect a new version? Give me a day or two to address these points and produce the new version. -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From python@rcn.com Mon Jun 3 19:51:20 2002 From: python@rcn.com (Raymond Hettinger) Date: Mon, 3 Jun 2002 14:51:20 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation Message-ID: <001a01c20b2f$a7518060$7eec7ad1@othello> Here is my cut at the migration and modernization guide. Comments are welcome. Walter and Neal, would you like to add the somewhat more involved steps for eliminating the types and strings modules. Raymond Hettinger ----------------------------------------------- Code Modernization and Migration Guide Pattern: if d.has_key(k): --> if k in d: Idea: For testing dictionary membership, use the 'in' keyword instead of the 'has_key()' method. Version: 2.2 or greater Benefits: The result is shorter and more readable. The style becomes consistent with tests for membership in lists. The result is slightly faster because has_key requires an attribute search. Locating: grep has_key Contra-indications: 1. 
if dictlike.has_key(k) ## objects like shelve do not define __contains__() Pattern: for k in d.keys() --> for k in d for k in d.items() --> for k in d.iteritems() for k in d.values() --> for k in d.itervalues() Idea: Use the new iter methods for looping over dictionaries Version: 2.2 or greater Benefits: The iter methods are faster because they do not have to create a new list object with a complete copy of all of the keys, values, or items. Selecting only keys, items, or values as needed saves the time for creating unused object references and, in the case of items, saves a second hash look-up of the key. Contra-indications: 1. def getids(): return d.keys() ## do not change the return type 2. for k in dictlike.keys() ## objects like shelve do not define iter methods 3. k = d.keys(); j = k[:] ## iterators do not support slicing, sorting or other operations 4. for k in d.keys(): del d[k] ## dict iterators prohibit modifying the dictionary Pattern: if v == None --> if v is None: Idea: Since there is only one None object, it can be tested with identity. Version: Any Benefits: Identity tests are slightly faster than equality tests. Also, some object types may overload comparison to be much slower (or even break). Locating: grep '== None' or grep '!= None' Pattern: os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime os.stat("foo")[stat.ST_MTIME] --> os.path.getmtime("foo") Idea: Replace stat constants or indices with the new stat attributes Version: 2.2 or greater Benefits: The attributes are not order dependent and do not require an import of the stat module Locating: grep os.stat Pattern: import whrandom --> import random Idea: Replace deprecated module Version: 2.1 or greater Benefits: All random methods collected in one place Locating: grep whrandom From skip@pobox.com Mon Jun 3 20:00:22 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 3 Jun 2002 14:00:22 -0500 Subject: [Python-Dev] "max recursion limit exceeded" canned response?
In-Reply-To: References: <15610.6455.96035.742110@12-248-41-177.client.attbi.com> Message-ID: <15611.48326.400187.797188@beluga.mojam.com> >> How would we go about adding a canned response to the commonly >> submitted "max recursion limit exceeded" bug report? Martin> Post the precise text that you want to see as the canned Martin> response, and somebody can install it. How about: The max recursion limit problem in the re module is well-known. Until this limitation in the implementation is removed, to work around it check http://www.python.org/dev/doc/devel/lib/module-re.html http://python.org/sf/493252 Note that the examples in the CVS version of the re module do contain some tips for working around the problem, however they haven't yet percolated to the main doc set. Skip From mwh@python.net Mon Jun 3 20:04:03 2002 From: mwh@python.net (Michael Hudson) Date: 03 Jun 2002 20:04:03 +0100 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Guido van Rossum's message of "Mon, 03 Jun 2002 13:22:16 -0400" References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> Message-ID: <2mvg90mbfg.fsf@starship.python.net> Guido van Rossum writes: > OK, but given the issues the first version had, I recommand that the ^^^^^^^^^ I *like* this typo :) > code gets more review and that you write unit tests for all cases. Cheers, M. -- I've reinvented the idea of variables and types as in a programming language, something I do on every project.
-- Greg Ward, September 1998 From skip@pobox.com Mon Jun 3 20:05:33 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 3 Jun 2002 14:05:33 -0500 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: <001a01c20b2f$a7518060$7eec7ad1@othello> References: <001a01c20b2f$a7518060$7eec7ad1@othello> Message-ID: <15611.48637.35273.683341@beluga.mojam.com> Raymond> Pattern: if d.has_key(k): --> if k in d: Raymond> Idea: For testing dictionary membership, use the 'in' keyword Raymond> instead of the 'has_key()' method. Raymond> Version: 2.2 or greater Raymond> Benefits: The result is shorter and more readable. The style Raymond> becomes consistent with tests for membership in lists. The Raymond> result is slightly faster because has_key requires an attribute Raymond> search. Also faster (I think) because it avoids executing the expensive CALL_FUNCTION opcode. (Probably applies to the d.keys() part of second pattern as well.) Raymond> Pattern: if v == None --> if v is None: Raymond> Idea: Since there is only one None object, it can be tested Raymond> with identity. Raymond> Version: Any Raymond> Benefits: Identity tests are slightly faster than equality Raymond> tests. Also, some object types may overload comparison to be Raymond> much slower (or even break). Raymond> Locating: grep '== None' or grep '!= None' Also: if v: --> if v is None where appropriate (often when testing function arguments that default to None). This may change semantics though and has to be undertaken with some care. Skip From tim.one@comcast.net Mon Jun 3 20:04:50 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 03 Jun 2002 15:04:50 -0400 Subject: [Python-Dev] "max recursion limit exceeded" canned response? In-Reply-To: <15611.48326.400187.797188@beluga.mojam.com> Message-ID: [Skip Montanaro] > How about: > > The max recursion limit problem in the re module is well-known. 
Until > this limitation in the implementation is removed, to work around it > check > > http://www.python.org/dev/doc/devel/lib/module-re.html > http://python.org/sf/493252 I've added this as a canned response, with name "SRE max recursion limit". Thanks! From skip@pobox.com Mon Jun 3 20:18:37 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 3 Jun 2002 14:18:37 -0500 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: <15611.48637.35273.683341@beluga.mojam.com> References: <001a01c20b2f$a7518060$7eec7ad1@othello> <15611.48637.35273.683341@beluga.mojam.com> Message-ID: <15611.49421.389497.166235@beluga.mojam.com> Skip> Also: Skip> if v: --> if v is None Ack!!! Obviously I got the sense of the test backwards: if v: --> if v is not None: Skip From neal@metaslash.com Mon Jun 3 20:32:53 2002 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 03 Jun 2002 15:32:53 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation References: <001a01c20b2f$a7518060$7eec7ad1@othello> Message-ID: <3CFBC465.9CE292D7@metaslash.com> Raymond Hettinger wrote: > > Here is my cut at the migration and modernization guide. > > Comments are welcome. > > Walter and Neal, would you like to add the somewhat more involved steps for > eliminating the types and strings modules. Here's some more. Note the last one. Martin wanted to make sure this made it into whatsnew. I have already changed a few, one in Bdb I think. I will be changing TclError also. This could be a problem if anyone assumed these exceptions would be strings. Neal -- Pattern: import types ; type(v) == types.IntType --> isinstance(v, int) type(s) in types.StringTypes --> isinstance(s, basestring) Idea: The types module will likely be deprecated in the future. Version: 2.2 or greater Benefits: May be slightly faster, avoid a deprecated feature. Locating: grep types *.py | grep import Pattern: import string ; string.method(s, ...) --> s.method(...)
c in string.whitespace --> c.isspace() Idea: The string module will likely be deprecated in the future. Version: 2.0 or greater Benefits: Slightly faster, avoid a deprecated feature. Locating: grep string *.py | grep import Pattern: NewError = 'NewError' --> class NewError(Exception): pass Idea: String exceptions are deprecated, derive from Exception base class. Version: Any Benefits: String exceptions will not work in future versions. Allows except Exception: clause to work. Locating: Use PyChecker From guido@python.org Mon Jun 3 20:42:49 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 15:42:49 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: Your message of "Mon, 03 Jun 2002 15:32:53 EDT." <3CFBC465.9CE292D7@metaslash.com> References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> Message-ID: <200206031942.g53Jgnq17951@pcp742651pcs.reston01.va.comcast.net> > Pattern: NewError = 'NewError' --> class NewError(Exception): pass > Idea: String exceptions are deprecated, derive from Exception base class. > Version: Any > Benefits: String exceptions will not work in future versions. Allows except Exception: clause to work. > Locating: Use PyChecker Should also warn against class exceptions not deriving from Exception. Be careful about generic phrases like "String exceptions will not work in future versions." Some people (especially those who tend to fear change ;-) start to panic when they read this, thinking it might be in 2.4. I don't think we'll be able to delete string exceptions before Python 3.0, so you can be explicit in this case. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon Jun 3 19:41:22 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 3 Jun 2002 13:41:22 -0500 Subject: [Python-Dev] zap _Py prefix?
Message-ID: <15611.47186.828411.542771@beluga.mojam.com> The issue of Michael's static PyTimeout_Err symbol reminded me about a question I had about _Py-prefixed symbols. I realize they are all "internal", but I also recall Tim saying a couple of times that the ANSI C standard reserves all symbols which begin with underscores for use by compiler writers. Should the _Py-prefixed symbols be renamed, for example, from _PyUnicode_IsDecimalDigit to Py__Unicode_IsDecimalDigit ? If so, we would then declare that all external symbols which begin with "Py__" were part of the private API. We would of course add macro definitions during the deprecation period: #define _PyUnicode_IsDecimalDigit Py__Unicode_IsDecimalDigit (It would also be nice to #warn when the macros are used. Is that possible with the C preprocessor?) Skip From neal@metaslash.com Mon Jun 3 20:54:20 2002 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 03 Jun 2002 15:54:20 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> <200206031942.g53Jgnq17951@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3CFBC96C.C29CBD98@metaslash.com> Guido van Rossum wrote: > > > Pattern: NewError = 'NewError' --> class NewError(Exception): pass > > Idea: String exceptions are deprecated, derive from Exception base class. > > Version: Any > > Benefits: String exceptions will not work in future versions. Allows except Exception: clause to work. > > Locating: Use PyChecker > > Should also warn against class exceptions not deriving from Exception. I was going to add this check, but I then I noticed I already had. :-) > Be careful about generic phrases like "String exceptions will not work > in future versions." Some people (especially those who tend to fear > change ;-) start to panic when they read this, thinking it might be in > 2.4. 
Maybe people won't bitch at you so much for things like bool and will start bitching at me. :-) On a somewhat related note, I was perusing the Perl 5.8 RC1 notes. While there was probably not any incompatible change as "major" as bool, there were many, many "minor" incompatible changes. Most seemed pretty small and had been warned about in the past. 1. I think Python is doing a good job wrt change. 2. Perhaps, there needs to be stronger warnings about deprecated or questionable features. (We seem to be working towards this.) Neal PS I don't view the bool change as major and I think it was a good change. From guido@python.org Mon Jun 3 21:32:37 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 16:32:37 -0400 Subject: [Python-Dev] zap _Py prefix? In-Reply-To: Your message of "Mon, 03 Jun 2002 13:41:22 CDT." <15611.47186.828411.542771@beluga.mojam.com> References: <15611.47186.828411.542771@beluga.mojam.com> Message-ID: <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> > Should the _Py-prefixed symbols be renamed, for example, from > > _PyUnicode_IsDecimalDigit > > to > > Py__Unicode_IsDecimalDigit I've replied to this and I'll reply again. When a C compiler is spotted that defines a conflicting symbol or that refuses to compile our code because of this, it's early enough to change this. > ? If so, we would then declare that all external symbols which begin with > "Py__" were part of the private API. The problem is that Py__ doesn't scream "internal" like "_Py" does. If we had to, I'd propose "_py". --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Mon Jun 3 21:29:12 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 3 Jun 2002 16:29:12 -0400 Subject: [Python-Dev] zap _Py prefix? 
References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15611.53656.944393.842462@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> The problem is that Py__ doesn't scream "internal" like "_Py" GvR> does. If we had to, I'd propose "_py". Or how 'bout "mypy_". :) keep-your-hands-off-of-my-pie-ly y'rs, -Barry From python@rcn.com Mon Jun 3 21:42:11 2002 From: python@rcn.com (Raymond Hettinger) Date: Mon, 3 Jun 2002 16:42:11 -0400 Subject: [Python-Dev] Silent Deprecation Message-ID: <005a01c20b3f$239c82a0$a9e77ad1@othello> Did we ever decide how to implement silent deprecation in the docs? Some of the choices were: 1. Delete it from the current docs (making Fred cringe) 2. Move it to a separate section of the docs 3. Add a note to the docs. Perhaps \dissuade{apply()}{Use func(*args,**kwds) instead} Raymond Hettinger From tim.one@comcast.net Mon Jun 3 21:49:17 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 03 Jun 2002 16:49:17 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176 In-Reply-To: Message-ID: [Guido] > Modified Files: > object.c > Log Message: > Implement the intention of SF patch 472523 (but coded differently). > > In the past, an object's tp_compare could return any value. In 2.2 > the docs were tightened to require it to return -1, 0 or 1; and -1 for > an error. > > We now issue a warning if the value is not in this range. > ... > > I haven't decided yet whether to backport this to 2.2.x. The patch > applies fine. But is it fair to start warning in 2.2.2 about code > that worked flawlessly in 2.2.1? If 2.2.x is the Python-in-a-Tie line, I say "no way". People wearing ties don't care whether a thing is right or wrong, so long as "it works" they simply don't want to hear about it at all. Converting old extensions to avoid new warnings is a 2.3 task for them. 
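The tightened tp_compare contract discussed above can be illustrated at the Python level (a hedged sketch; the function name here is made up, and real tp_compare slots of course return these values from C, where -1 also doubles as the error return):

```python
def three_way_compare(a, b):
    # The 2.2 contract: a comparison must return exactly -1, 0, or 1.
    # Returning something like a - b can produce arbitrary integers,
    # which the object.c change now warns about.
    if a < b:
        return -1
    if a > b:
        return 1
    return 0
```

A slot that used to return subtraction-style results merely worked by accident; under the new warning it should be normalized exactly as above.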
From guido@python.org Mon Jun 3 21:55:35 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Jun 2002 16:55:35 -0400 Subject: [Python-Dev] Silent Deprecation In-Reply-To: Your message of "Mon, 03 Jun 2002 16:42:11 EDT." <005a01c20b3f$239c82a0$a9e77ad1@othello> References: <005a01c20b3f$239c82a0$a9e77ad1@othello> Message-ID: <200206032055.g53KtZD21491@pcp742651pcs.reston01.va.comcast.net> > Did we ever decide how to implement silent deprecation in the docs? > > Some of the choices were: > 1. Delete it from the current docs (making Fred cringe) > 2. Move it to a separate section of the docs > 3. Add a note to the docs. Perhaps \dissuade{apply()}{Use > func(*args,**kwds) instead} I'll leave this for Fred to pronounce on. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> Message-ID: <100201c20b40$5f177320$6601a8c0@boostconsulting.com> From: "Guido van Rossum" > The problem is that Py__ doesn't scream "internal" like "_Py" does. > If we had to, I'd propose "_py". ISO/IEC 9899:1999: 7.1.3 Reserved identifiers 1 Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers. — All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use. — All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces. i-liked-mypy_-ly y'rs, Dave From mal@lemburg.com Mon Jun 3 21:57:28 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 03 Jun 2002 22:57:28 +0200 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176 References: Message-ID: <3CFBD838.5090205@lemburg.com> Tim Peters wrote: > [Guido] > >>Modified Files: >> object.c >>Log Message: >>Implement the intention of SF patch 472523 (but coded differently). >> >>In the past, an object's tp_compare could return any value. In 2.2 >>the docs were tightened to require it to return -1, 0 or 1; and -1 for >>an error. >> >>We now issue a warning if the value is not in this range. >>... Another one of these little changes that slipped my radar... migration guide candidate. >>I haven't decided yet whether to backport this to 2.2.x. The patch >>applies fine. But is it fair to start warning in 2.2.2 about code >>that worked flawlessly in 2.2.1? > > > If 2.2.x is the Python-in-a-Tie line, I say "no way". People wearing ties > don't care whether a thing is right or wrong, so long as "it works" they > simply don't want to hear about it at all. Converting old extensions to > avoid new warnings is a 2.3 task for them. Since when do you wear a tie, Tim ? ;-) (or have you found a new employer requiring this... comcast.net ?) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From mgilfix@eecs.tufts.edu Mon Jun 3 22:00:53 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Mon, 3 Jun 2002 17:00:53 -0400 Subject: [Python-Dev] zap _Py prefix? 
In-Reply-To: <100201c20b40$5f177320$6601a8c0@boostconsulting.com>; from david.abrahams@rcn.com on Mon, Jun 03, 2002 at 04:51:00PM -0400 References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com> Message-ID: <20020603170053.C2362@eecs.tufts.edu> There's also py backwards: yp_func or even: yP_func I think I like the last one better. On Mon, Jun 03 @ 16:51, David Abrahams wrote: > > From: "Guido van Rossum" > > > The problem is that Py__ doesn't scream "internal" like "_Py" does. > > If we had to, I'd propose "_py". > > ISO/IEC 9899:1999: > 7.1.3 Reserved identifiers > 1 Each header declares or defines all identifiers listed in its associated > subclause, and > optionally declares or defines identifiers listed in its associated future > library directions > subclause and identifiers which are always reserved either for any use or > for use as file > scope identifiers. > — All identifiers that begin with an underscore and either an uppercase > letter or another > underscore are always reserved for any use. > — All identifiers that begin with an underscore are always reserved for use > as identifiers with file scope in both the ordinary and tag name spaces. > > > > i-liked-mypy_-ly y'rs, > > Dave > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev `-> (david.abrahams) -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From David Abrahams" <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com> <20020603170053.C2362@eecs.tufts.edu> Message-ID: <106001c20b43$48870dc0$6601a8c0@boostconsulting.com> From: "Michael Gilfix" > There's also py backwards: > > yp_func > > or even: > > yP_func > > I think I like the last one better. Yeep! 
That's unpronounceably delicious! frosted-lucky-charms-ly y'rs, dave From tim.one@comcast.net Mon Jun 3 22:23:26 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 03 Jun 2002 17:23:26 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176 In-Reply-To: <3CFBD838.5090205@lemburg.com> Message-ID: [M.-A. Lemburg] > ... > Since when do you wear a tie, Tim ? ;-) (or have you found a > new employer requiring this... comcast.net ?) Ya, I'm now a sales rep for Comcast, selling cable TV door to door in rural Virginia. It's pretty much a nightmare, as they haven't yet laid any cable in rural Virginia, and almost 10% of my customers ask for their money back when they discover I've just sold them the right to get cable if it ever comes to their area. Still, it beats working for Guido. so-glad-you-asked-and-btw-do-you-have-a-second-tv?-ly y'rs - tim From barry@zope.com Mon Jun 3 22:28:42 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 3 Jun 2002 17:28:42 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176 References: <3CFBD838.5090205@lemburg.com> Message-ID: <15611.57226.657099.191575@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> [M.-A. Lemburg] >> ... Since when do you wear a tie, Tim ? ;-) (or have you found >> a new employer requiring this... comcast.net ?) TP> Ya, I'm now a sales rep for Comcast, selling cable TV door to TP> door in rural Virginia. It's pretty much a nightmare, as they TP> haven't yet laid any cable in rural Virginia, and almost 10% TP> of my customers ask for their money back when they discover TP> I've just sold them the right to get cable if it ever comes to TP> their area. Still, it beats working for Guido. And if you slip him $50 he'll hook you up to the illicit Uncle Timmy's Farm Report channel. Moo. 
it's-even-entertaining-in-maryland-ly y'rs, -Barry From tim.one@comcast.net Mon Jun 3 22:45:21 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 03 Jun 2002 17:45:21 -0400 Subject: [Python-Dev] Lazily GC tracking tuples In-Reply-To: Message-ID: [Kevin Jacobs, on Neil's tuple-untracking patch] > Sorry, I wasn't very clear here. The patch _does_ fix the performance > problem by untracking cycle-less tuples when we use the naive version of > our code (i.e., the one that does not play with the garbage collector). > However, the performance of the patched GC when compared to our GC-tuned > code is very similar. Then Neil's patch is doing all that we could wish of it in this case (you seem to have counted it as a strike against the patch that it didn't do better than you can by turning off gc by hand, but that's unrealistic if so), and then some: >>> The good news is that another (unrelated) part of our code just became >>> about 20-40% faster with this patch, though I need to do some fairly >>> major surgery to isolate why this is so. That makes it a winner if it doesn't slow non-pathological cases "too much" (counting your cases as pathological, just because they are ). From jacobs@penguin.theopalgroup.com Mon Jun 3 23:04:00 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 3 Jun 2002 18:04:00 -0400 (EDT) Subject: [Python-Dev] Lazily GC tracking tuples In-Reply-To: Message-ID: On Mon, 3 Jun 2002, Tim Peters wrote: > [Kevin Jacobs, on Neil's tuple-untracking patch] > > Sorry, I wasn't very clear here. The patch _does_ fix the performance > > problem by untracking cycle-less tuples when we use the naive version of > > our code (i.e., the one that does not play with the garbage collector). > > However, the performance of the patched GC when compared to our GC-tuned > > code is very similar. 
> > Then Neil's patch is doing all that we could wish of it in this case (you > seem to have counted it as a strike against the patch that it didn't do > better than you can by turning off gc by hand, but that's unrealistic if > so), and then some: I didn't count it as a strike against the patch -- I had just hoped that untracking tuples would result in faster execution than turning GC off and letting my heap swell obscenely. One extreme case could happen if I turn off GC, run my code, and let it fill all of my real memory with tuples, and start swapping to disk. Clearly, keeping GC enabled with the tuple untracking patch would result in huge performance gains. This is not the situation I was dealing with, though I was hoping for a relatively smaller improvement from having a more compact and (hopefully) less fragmented heap. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From mwh@python.net Mon Jun 3 23:23:44 2002 From: mwh@python.net (Michael Hudson) Date: 03 Jun 2002 23:23:44 +0100 Subject: [Python-Dev] zap _Py prefix? In-Reply-To: Michael Gilfix's message of "Mon, 3 Jun 2002 17:00:53 -0400" References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com> <20020603170053.C2362@eecs.tufts.edu> Message-ID: <2m1ybogfwv.fsf@starship.python.net> Michael Gilfix writes: > There's also py backwards: > > yp_func > > or even: > > yP_func > > I think I like the last one better. Unfortunately, that screams "Yellow Pages", even to me. ...-let's-call-the-whole-thing-off-ly y'rs m. -- Well, yes. I don't think I'd put something like "penchant for anal play" and "able to wield a buttplug" in a CV unless it was relevant to the gig being applied for...
-- Matt McLeod, alt.sysadmin.recovery From gward@python.net Mon Jun 3 23:50:39 2002 From: gward@python.net (Greg Ward) Date: Mon, 3 Jun 2002 18:50:39 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <15611.16296.202088.831238@anthem.wooz.org> References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org> Message-ID: <20020603225039.GA6787@gerg.ca> On 03 June 2002, Barry A. Warsaw said: > > >>>>> "FL" == Fredrik Lundh writes: > > FL> (setq font-lock-support-mode 'lazy-lock-mode) > > (add-hook 'font-lock-mode-hook 'turn-on-fast-lock) Neither one worked for me (XEmacs 21.4.6). You're *never* going to believe what did work: load a font-locked file go to Options menu go to "Syntax Highlighting" select "Lazy lock" back to Options menu select "Save Options ..." restart XEmacs XEmacs seems to have added this bit of gibberish, err sorry line of Lisp code, to my ~/.xemacs/custom.el: '(lazy-lock-mode t nil (lazy-lock)) The wonderful thing about (X)Emacs is that there are so very many ways for it not to do what you want it to do, and every one of those ways just might work in some version of (X)Emacs somewhere... Greg -- Greg Ward - Python bigot gward@python.net http://starship.python.net/~gward/ Vote anarchist. From tim.one@comcast.net Mon Jun 3 23:51:33 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 03 Jun 2002 18:51:33 -0400 Subject: [Python-Dev] Lazily GC tracking tuples In-Reply-To: Message-ID: [Kevin Jacobs] > I didn't count it as a strike against the patch -- I had just hoped that > untracking tuples would result in faster execution than turning GC off and > letting my heap swell obscenely. Kevin, keep in mind that we haven't run your app: the only things we know about it are what you've told us. 
That your heap swells obscenely when gc is off is news to me. *Does* your heap swell obscenely when turning gc off, but does not swell obscenely if you keep gc on? If so, you should keep in mind too that the best way to speed gc is not to create cyclic trash to begin with. > One extreme case could happend if I turn off GC, run my code, and let it > fill all of my real memory with tuples, and start swapping to disk. > Clearly, keeping GC enabled with the tuple untracking patch would result > in huge performance gains. Sorry, this isn't clear at all. If your tuples aren't actually in cycles, then whether GC is on or off is irrelevant to how long they live, and to how much memory they consume. It doesn't cost any extra memory (not even one byte) for a tuple to live in a gc list; on the flip side, no memory is saved by Neil's patch. > This is not the situation I was dealing with, though I was hoping for a > relatively smaller improvement from having a more compact and (hopefully) > less fragmented heap. Neil's patch should have no effect on memory use or fragmentation. It's only aiming at reducing the *time* spent uselessly scanning and rescanning and rescanning and ... tuples in bad cases. So long as your tuples stay alive, they're going to consume the same amount of memory with or without the patch, and whether or not you disable gc. From gward@python.net Tue Jun 4 00:46:12 2002 From: gward@python.net (Greg Ward) Date: Mon, 3 Jun 2002 19:46:12 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: References: <20020601220529.GA20025@gerg.ca> Message-ID: <20020603234612.GA6891@gerg.ca> On 01 June 2002, Tim Peters said: > I take it you don't spend much time surveying the range of computer science > literature . *snort* I went to grad school in CS. Wasn't that enough? ;-> > Search for > > Knuth hyphenation > > instead. Three months later, the best advice you'll have read is to avoid > hyphenation entirely. 
I have no desire to put auto-hyphenation into the Python standard library -- isn't the whole world trying to get *away* from (natural) language-specific code? It's wonderful that Knuth came up with the algorithm, and even more wonderful that Andrew implemented it for us in Python. My wrapping algorithm respects hyphens according to the English-language conventions I learned in school, augmented by my peculiar need to handle strings like "-b" and "--file". But that's all I need. > Doing > justification with fixed-width fonts is like juggling dirt anyway . Don't worry, I have even less intention of going there. Greg -- Greg Ward - Linux nerd gward@python.net http://starship.python.net/~gward/ Never put off till tomorrow what you can put off till the day after tomorrow. From greg@cosc.canterbury.ac.nz Tue Jun 4 01:19:37 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 04 Jun 2002 12:19:37 +1200 (NZST) Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <15607.2869.648570.950260@anthem.wooz.org> Message-ID: <200206040019.MAA06851@s454.cosc.canterbury.ac.nz> barry@zope.com (Barry A. Warsaw): > I'm probably somewhat influenced too by > my early C++ days when we adopted a one class per .h file (and one > class implementation per .cc file). IIRC, Objective-C also encouraged > this granularity of organization. Deciding how to split things up into files is not such a big issue in C-related languages, because file organisation is not tied to naming. You can change your mind about it without having to change any of the code which refers to the affected items. In Python, one is encouraged to put more thought into the matter, because it affects how things are named. One-class-per-module is convenient for editing, but it introduces an extra unneeded level into the naming hierarchy. It's unfortunate that editing convenience and naming convenience seem to be in conflict in Python. Maybe a folding editor is the answer... 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@zope.com Tue Jun 4 02:35:32 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 3 Jun 2002 21:35:32 -0400 Subject: [Python-Dev] Re: Adding Optik to the standard library References: <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org> <20020603225039.GA6787@gerg.ca> Message-ID: <15612.6500.399598.338375@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> Neither one worked for me (XEmacs 21.4.6). Well of course (wink) you also have to (require 'fast-lock). | You're *never* | going to believe what did work: | load a font-locked file | go to Options menu | go to "Syntax Highlighting" | select "Lazy lock" | back to Options menu | select "Save Options ..." | restart XEmacs Oh, but I do believe it. GW> XEmacs seems to have added this bit of gibberish, err sorry GW> line of Lisp code, to my ~/.xemacs/custom.el: GW> '(lazy-lock-mode t nil (lazy-lock)) Pshhh. D'oh. Obvious. GW> The wonderful thing about (X)Emacs is that there are so very GW> many ways for it not to do what you want it to do, and every GW> one of those ways just might work in some version of (X)Emacs GW> somewhere... 
how-many-more-would-you-like?-ly y'rs, -Barry From jacobs@penguin.theopalgroup.com Tue Jun 4 02:52:31 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 3 Jun 2002 21:52:31 -0400 (EDT) Subject: [Python-Dev] Lazily GC tracking tuples In-Reply-To: Message-ID: On Mon, 3 Jun 2002, Tim Peters wrote: > [Kevin Jacobs] > > I didn't count it as a strike against the patch -- I had just hoped that > > untracking tuples would result in faster execution than turning GC off and > > letting my heap swell obscenely. > > Kevin, keep in mind that we haven't run your app: the only things we know > about it are what you've told us. That your heap swells obscenely when gc > is off is news to me. *Does* your heap swell obscenely when turning gc off, > but does not swell obscenely if you keep gc on? If so, you should keep in > mind too that the best way to speed gc is not to create cyclic trash to > begin with. This part of my app allocates a mix of several kinds of objects. Some are cyclic tuple trees, many are acyclic tuples, and others are complex class instances. To make things worse, the object lifetimes vary from ephemeral to very long-lived. Due to this mix, turning GC off *does* cause the heap to swell, and we have to very carefully monitor the algorithm to relieve the swelling frequently enough to prevent eating too much system memory, but infrequently enough that we don't spend all of our time in the GC tracking through the many acyclic tuples. Also, due to the dynamic nature of the algorithm, it is not trivial to avoid creating cyclic trash. > > One extreme case could happen if I turn off GC, run my code, and let it > > fill all of my real memory with tuples, and start swapping to disk. > > Clearly, keeping GC enabled with the tuple untracking patch would result > > in huge performance gains. > > Sorry, this isn't clear at all.
If your tuples aren't actually in cycles, > then whether GC is on or off is irrelevant to how long they live, and to how > much memory they consume. It doesn't cost any extra memory (not even one > byte) for a tuple to live in a gc list; on the flip side, no memory is saved > by Neil's patch. But I do generate cyclic trash, and it does build up when I turn off GC. So, if it runs long enough with GC turned off, the machine will run out of real memory. This is all that I am saying. > > This is not the situation I was dealing with, though I was hoping for a > > relatively smaller improvement from having a more compact and (hopefully) > > less fragmented heap. > > Neil's patch should have no effect on memory use or fragmentation. It's > only aiming at reducing the *time* spent uselessly scanning and rescanning > and rescanning and ... tuples in bad cases. So long as your tuples stay > alive, they're going to consume the same amount of memory with or without > the patch, and whether or not you disable gc. With Neil's patch and GC turned ON: Python untracks many, many tuples that live for several generations and that would otherwise have to be scanned to dispose of the accumulating cyclic garbage. Some GC overhead is observed, due to normal periodic scanning and the extra work needed to untrack acyclic tuples. Without Neil's patch and automatic GC turned OFF: Python does not spend any time scanning for garbage, and things are very fast until we run out of core memory, or the patterns of allocations result in an extremely fragmented heap due to interleaved allocation of objects with very heterogeneous lifetimes. At certain intervals, GC would be triggered manually to release the cyclic trash. My hope was that the more compact and hopefully less fragmented heap would more than offset the overhead of automatic GC scans and the tuple untracking sweep. Hopefully my comments are now a little clearer...
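[Editorial aside: for readers retracing this exchange on a modern CPython, both behaviors under discussion can be observed directly from the gc module. The sketch below is an illustration, not code from Kevin's application; gc.is_tracked() arrived later (Python 2.6), and the tuple untracking it demonstrates is the descendant of Neil's patch.]

```python
import gc

# The disable-then-collect discipline Kevin describes: switch off automatic
# cyclic GC during an allocation-heavy phase, then sweep manually.
gc.disable()
try:
    node = []
    node.append(node)      # self-referential list: cyclic trash once dropped
    del node               # unreachable now, but reference counting alone can't free it
    freed = gc.collect()   # manual sweep reclaims the cycle
finally:
    gc.enable()
assert freed >= 1

# Tuple untracking: after a collection, a tuple holding only atomic values is
# removed from the GC lists, so later collections never rescan it.
t = tuple([1, 2, 3])
gc.collect()
print(gc.is_tracked(t))    # False: the tuple has been untracked
```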
-Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From greg@cosc.canterbury.ac.nz Tue Jun 4 03:00:06 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 04 Jun 2002 14:00:06 +1200 (NZST) Subject: [Python-Dev] subclass a module? In-Reply-To: <15608.15520.96707.809995@anthem.wooz.org> Message-ID: <200206040200.OAA06890@s454.cosc.canterbury.ac.nz> barry@zope.com (Barry A. Warsaw): > Can I now subclass from modules? And if so, what good does that do > me? This seems to be a side effect of two things: (1) Python 2.2 will accept anything as a base class whose type is callable with the appropriate arguments, and (2) types.ModuleType doesn't seem to care what arguments you give it: Python 2.2 (#14, May 28 2002, 14:11:27) [GCC 2.95.2 19991024 (release)] on sunos5 Type "help", "copyright", "credits" or "license" for more information. >>> from types import ModuleType >>> ModuleType() <module '?' (built-in)> >>> ModuleType(42) <module '?' (built-in)> >>> ModuleType("dead", "parrot") <module '?' (built-in)> >>> ModuleType("nobody", "expects", "the", "spanish", "base", "class") <module '?' (built-in)> >>> So your class statement is simply creating an empty module. An interesting feature of the new scheme is that the "class" statement can be used to create things which don't even remotely resemble classes. Brain-explosion for the masses! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Tue Jun 4 02:17:38 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 04 Jun 2002 13:17:38 +1200 (NZST) Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <20020601220529.GA20025@gerg.ca> Message-ID: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> Greg Ward: > despite being warned just today on the conceptual/philosophical > danger of classes whose names end in "-er" [1] > > [1] objects should *be*, not *do*, and class names like HelpFormatter > and TextWrapper are impositions of procedural abstraction onto > OOP. I disagree with this statement completely. Surely the concept of objects *doing* things is central to the whole idea of OO! Why do you think objects have things called "methods"?-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@v.loewis.de Tue Jun 4 06:32:52 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 04 Jun 2002 07:32:52 +0200 Subject: [Python-Dev] zap _Py prefix? In-Reply-To: <100201c20b40$5f177320$6601a8c0@boostconsulting.com> References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com> Message-ID: "David Abrahams" writes: > All identifiers that begin with an underscore and either an > uppercase letter or another underscore are always reserved for any > use. I agree with Guido that this is not enough incentive to change the names of these functions. Even though a compiler may use them, no compiler of interest will. Regards, Martin From guido@python.org Tue Jun 4 07:08:53 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 02:08:53 -0400 Subject: [Python-Dev] subclass a module? In-Reply-To: Your message of "Tue, 04 Jun 2002 14:00:06 +1200."
<200206040200.OAA06890@s454.cosc.canterbury.ac.nz> References: <200206040200.OAA06890@s454.cosc.canterbury.ac.nz> Message-ID: <200206040608.g5468rb31173@pcp742651pcs.reston01.va.comcast.net> > > Can I now subclass from modules? And if so, what good does that do > > me? > > This seems to be a side effect of two things: > (1) Python 2.2 will accept anything as a base class > whose type is callable with the appropriate arguments, > and (2) types.ModuleType doesn't seem to care what > arguments you give it: Thanks for explaining this! (I've been away from this so long that it baffled me a bit. :-) I've fixed this by making the module constructor sane: it now requires a name and takes an optional docstring. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Jun 4 08:48:36 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 04 Jun 2002 09:48:36 +0200 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176 References: Message-ID: <3CFC70D4.6050303@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg] > >>... >>Since when do you wear a tie, Tim ? ;-) (or have you found a >>new employer requiring this... comcast.net ?) > > > Ya, I'm now a sales rep for Comcast, selling cable TV door to door in rural > Virginia. It's pretty much a nightmare, as they haven't yet laid any cable > in rural Virginia, and almost 10% of my customers ask for their money back > when they discover I've just sold them the right to get cable if it ever > comes to their area. Still, it beats working for Guido. Yeah, probably better than squashing dirty bugs living near wild pythons on a daily basis. > so-glad-you-asked-and-btw-do-you-have-a-second-tv?-ly y'rs - tim Not yet, but I promise to get one as soon as I move to Virginia. 
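[Editorial aside: Guido's fix to the module constructor — a required name plus an optional docstring — is still how types.ModuleType behaves in current Pythons. A quick sketch, not taken from the thread:]

```python
from types import ModuleType

# A name is now required, and a docstring is optional.
m = ModuleType("dead_parrot", "An ex-module.")
assert m.__name__ == "dead_parrot"
assert m.__doc__ == "An ex-module."

# Calling the constructor with no arguments is rejected outright, which
# closes the loophole Greg Ewing demonstrated above.
try:
    ModuleType()
except TypeError as exc:
    print("rejected:", exc)
```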
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From walter@livinglogic.de Tue Jun 4 12:58:53 2002 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue, 04 Jun 2002 13:58:53 +0200 Subject: [Python-Dev] Draft Guide for code migration and modernation References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> Message-ID: <3CFCAB7D.4090507@livinglogic.de> Neal Norwitz wrote: > [...] > Pattern: import string ; string.method(s, ...) --> s.method(...) join and zfill should probably be mentioned separately: """ Be careful with string.join(): The order of the arguments is reversed here. string.zfill has a "decadent feature": It also works for non-string objects by calling repr before formatting. """ string.atoi(s, ...) --> int(s, ...) string.atol(s, ...) --> long(s, ...) string.atof(s, ...) --> float(s, ...) > c in string.whitespace --> c.isspace() This changes the meaning slightly for unicode characters, because chr(i).isspace() != unichr(i).isspace() for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 } New ones: Pattern: "foobar"[:3] == "foo" -> "foobar".startswith("foo") "foobar"[-3:] == "bar" -> "foobar".endswith("bar") Version: ??? (It was added on the string_methods branch) Benefits: Faster because no slice has to be created. No danger of miscounting. Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "==" grep "\[\w*:\w*[0-9]*\w*\]" | grep "==" Pattern: import types; if hasattr(types, "UnicodeType"): foo else: bar --> try: unicode except NameError: bar else: foo Idea: The types module is likely to be deprecated in the future. Version: 2.2 Benefits: Avoid a deprecated feature.
Locating: grep "hasattr.*UnicodeType" Bye, Walter Dörwald From s_lott@yahoo.com Tue Jun 4 13:03:04 2002 From: s_lott@yahoo.com (Steven Lott) Date: Tue, 4 Jun 2002 05:03:04 -0700 (PDT) Subject: [Python-Dev] Re: Adding Optik to the standard library In-Reply-To: <200206040019.MAA06851@s454.cosc.canterbury.ac.nz> Message-ID: <20020604120304.17738.qmail@web9605.mail.yahoo.com> Or a literate programming tool that separates these concerns nicely. --- Greg Ewing wrote: > barry@zope.com (Barry A. Warsaw): > > > I'm probably somewhat influenced too by > > my early C++ days when we adopted a one class per .h file > (and one > > class implementation per .cc file). IIRC, Objective-C also > encouraged > > this granularity of organization. > > Deciding how to split things up into files is not such > a big issue in C-related languages, because file > organisation is not tied to naming. You can change > your mind about it without having to change any of > the code which refers to the affected items. > > In Python, one is encouraged to put more thought into > the matter, because it affects how things are named. > One-class-per-module is convenient for editing, but > it introduces an extra unneeded level into the > naming hierarchy. > > It's unfortunate that editing convenience and naming > convenience seem to be in conflict in Python. Maybe > a folding editor is the answer... > > Greg Ewing, Computer Science Dept, > +--------------------------------------+ > University of Canterbury, | A citizen of NewZealandCorp, a > | > Christchurch, New Zealand | wholly-owned subsidiary of USA > Inc. | > greg@cosc.canterbury.ac.nz > +--------------------------------------+ > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev ===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd. 
__________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From guido@python.org Tue Jun 4 14:16:11 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 09:16:11 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: Your message of "Tue, 04 Jun 2002 13:58:53 +0200." <3CFCAB7D.4090507@livinglogic.de> References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> <3CFCAB7D.4090507@livinglogic.de> Message-ID: <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net> > string.zfill has a "decadent feature": It also works for > non-string objects by calling repr before formatting. Hm, but repr() was the wrong thing to call here anyway. :-( > > c in string.whitespace --> c.isspace() > > This changes the meaning slightly for unicode characters, because > chr(i).isspace() != unichr(i).isspace() > for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 } That's unfortunate, because I'd like unicode to be an extension of ASCII also in this kind of functionality. What are these and why are they considered spaces? Would it hurt to make them spaces in ASCII too? > New ones: > > Pattern: "foobar"[:3] == "foo" -> "foobar".startswith("foo") > "foobar"[-3:] == "bar" -> "foobar".endswith("bar") > Version: ??? (It was added on the string_methods branch) 2.0. > Benefits: Faster because no slice has to be created. > No danger of miscounting. > Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "==" > grep "\[\w*:\w*[0-9]*\w*\]" | grep "==" Are these regexes really worth making part of the migration guide? \w* isn't a good pattern to catch an arbitrary expression, it only catches simple identifiers! --Guido van Rossum (home page: http://www.python.org/~guido/) From pobrien@orbtech.com Tue Jun 4 14:23:34 2002 From: pobrien@orbtech.com (Patrick K. 
O'Brien) Date: Tue, 4 Jun 2002 08:23:34 -0500 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "==" > > grep "\[\w*:\w*[0-9]*\w*\]" | grep "==" > > Are these regexes really worth making part of the migration guide? > \w* isn't a good pattern to catch an arbitrary expression, it only > catches simple identifiers! Doesn't that make a pretty good case for including (properly formed) regexes? --- Patrick K. O'Brien Orbtech From pinard@iro.umontreal.ca Tue Jun 4 14:29:51 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 04 Jun 2002 09:29:51 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> Message-ID: Hi, people. For this incoming text wrapper facility, there is a feature that appears really essential to me, and many others: the protection of full stops[1]. In a previous message, I spoke of Knuth's algorithm as a nice possibility, but this is merely whipped cream and cherry over the ice cream. Protection of full stops does not fall in that decoration category, it is essential. I mean, for those who care, a wrapper without full stop protection would be rather unusable when there is more than one sentence to refill. ---------- [1] Full stops are punctuation ending sentences with two spaces guaranteed. Full stops are defined that way for typography based on fixed width fonts, like when we say "this many characters to a line". 
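[Editorial aside: the wrapping facility debated in this thread eventually shipped as the standard textwrap module, whose fix_sentence_endings option addresses exactly the full-stop concern François raises — sentence-ending periods keep two spaces after them when lines are rejoined. A brief, anachronistic-for-2002 sketch:]

```python
import textwrap

text = "Call me Ishmael.  Some years ago, never mind how long, I went to sea."

# fix_sentence_endings keeps exactly two spaces after a sentence-ending
# period whenever two sentences land on the same output line.
wrapper = textwrap.TextWrapper(width=40, fix_sentence_endings=True)
for line in wrapper.wrap(text):
    print(line)
```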
-- François Pinard http://www.iro.umontreal.ca/~pinard From walter@livinglogic.de Tue Jun 4 15:06:29 2002 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Tue, 04 Jun 2002 16:06:29 +0200 Subject: [Python-Dev] Draft Guide for code migration and modernation References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> <3CFCAB7D.4090507@livinglogic.de> <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3CFCC965.5060200@livinglogic.de> Guido van Rossum wrote: >>string.zfill has a "decadent feature": It also works for >>non-string objects by calling repr before formatting. > > > Hm, but repr() was the wrong thing to call here anyway. :-( The old code used `x`. Should we change it to use str()? >>> c in string.whitespace --> c.isspace() >> >>This changes the meaning slightly for unicode characters, because >>chr(i).isspace() != unichr(i).isspace() >>for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 } > > > That's unfortunate, because I'd like unicode to be an extension of > ASCII also in this kind of functionality. What are these and why are > they considered spaces? http://www.unicode.org/Public/UNIDATA/NamesList.txt says: 001C = INFORMATION SEPARATOR FOUR = file separator (FS) 001D = INFORMATION SEPARATOR THREE = group separator (GS) 001E = INFORMATION SEPARATOR TWO = record separator (RS) 001F = INFORMATION SEPARATOR ONE = unit separator (US) 0085 = NEXT LINE (NEL) 00A0 NO-BREAK SPACE x (space - 0020) x (figure space - 2007) x (narrow no-break space - 202F) x (word joiner - 2060) x (zero width no-break space - FEFF) # 0020 > Would it hurt to make them spaces in ASCII > too? stringobject.c::string_isspace() currently uses the isspace() function from <ctype.h>. >>New ones: >> >>Pattern: "foobar"[:3] == "foo" -> "foobar".startswith("foo") >> "foobar"[-3:] == "bar" -> "foobar".endswith("bar") >>Version: ??? (It was added on the string_methods branch) > > > 2.0.
> > >>Benefits: Faster because no slice has to be created. >> No danger of miscounting. >>Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "==" >> grep "\[\w*:\w*[0-9]*\w*\]" | grep "==" > > > Are these regexes really worth making part of the migration guide? > \w* isn't a good pattern to catch an arbitrary expression, it only > catches simple identifiers! Ouch, that was meant to be grep "\[[[:space:]]*-[[:digit:]]*[[:space:]]*:[[:space:]]*\]" | grep "==" grep "\[[[:space:]]*:[[:space:]]*[[:digit:]]*[[:space:]]*\]" | grep "==" This doesn't find "foobar"[-len("bar"):]=="bar", only constants. But at least it's a little better than vgrep. ;) Bye, Walter Dörwald From akuchlin@mems-exchange.org Tue Jun 4 15:07:34 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 4 Jun 2002 10:07:34 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> Message-ID: <20020604140734.GB1039@ute.mems-exchange.org> On Tue, Jun 04, 2002 at 09:29:51AM -0400, Fran?ois Pinard wrote: >[1] Full stops are punctuation ending sentences with two spaces guaranteed. >Full stops are defined that way for typography based on fixed width fonts, >like when we say "this many characters to a line". I don't think this really matters, because I doubt anyone will be implementing full justification. Left justification is just a matter of inserting newlines at particular points, so if the input data has two spaces after punctuation, line-breaking won't introduce any errors. --amk From guido@python.org Tue Jun 4 15:11:10 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 10:11:10 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: Your message of "Tue, 04 Jun 2002 08:23:34 CDT." References: Message-ID: <200206041411.g54EBAI10211@odiug.zope.com> > Doesn't that make a pretty good case for including (properly formed) > regexes? You can't match an expression with a regex. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From pobrien@orbtech.com Tue Jun 4 15:22:51 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Tue, 4 Jun 2002 09:22:51 -0500 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: <200206041411.g54EBAI10211@odiug.zope.com> Message-ID: [Guido van Rossum] > > Doesn't that make a pretty good case for including (properly formed) > > regexes? > > You can't match an expression with a regex. Doh! Sorry. ;-) --- Patrick K. O'Brien Orbtech From mal@lemburg.com Tue Jun 4 15:17:26 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 04 Jun 2002 16:17:26 +0200 Subject: [Python-Dev] Draft Guide for code migration and modernation References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> <3CFCAB7D.4090507@livinglogic.de> <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3CFCCBF6.8010902@lemburg.com> Guido van Rossum wrote: >>> c in string.whitespace --> c.isspace() >> >>This changes the meaning slightly for unicode characters, because >>chr(i).isspace() != unichr(i).isspace() >>for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 } > > > That's unfortunate, because I'd like unicode to be an extension of > ASCII also in this kind of functionality. What are these and why are > they considered spaces? Would it hurt to make them spaces in ASCII > too? 
From the Unicode database: 001C;;Cc;0;B;;;;;N;FILE SEPARATOR;;;; 001D;;Cc;0;B;;;;;N;GROUP SEPARATOR;;;; 001E;;Cc;0;B;;;;;N;RECORD SEPARATOR;;;; 001F;;Cc;0;S;;;;;N;UNIT SEPARATOR;;;; 0085;;Cc;0;B;;;;;N;NEXT LINE;;;; 00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;; -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From guido@python.org Tue Jun 4 15:32:01 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 10:32:01 -0400 Subject: [Python-Dev] Draft Guide for code migration and modernation In-Reply-To: Your message of "Tue, 04 Jun 2002 16:06:29 +0200." <3CFCC965.5060200@livinglogic.de> References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> <3CFCAB7D.4090507@livinglogic.de> <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net> <3CFCC965.5060200@livinglogic.de> Message-ID: <200206041432.g54EW1Q10622@odiug.zope.com> > > Hm, but repr() was the wrong thing to call here anyway. :-( > > The old code used `x`. Should we change it to use str()? Can't do that, it's an incompatibility. In a module of mostly historic importance, it doesn't make sense to change it incompatibly. > > Would it hurt to make them spaces in ASCII > > too? > > stringobject.c::string_isspace() currently uses the isspace() > function from <ctype.h>. I guess we'll have to live with this difference. There's not much harm, since nobody uses these anyway. > grep "\[[[:space:]]*-[[:digit:]]*[[:space:]]*:[[:space:]]*\]" | grep "==" > grep "\[[[:space:]]*:[[:space:]]*[[:digit:]]*[[:space:]]*\]" | grep "==" > > This doesn't find "foobar"[-len("bar"):]=="bar", only constants. > > But at least it's a little better than vgrep. ;) Doesn't answer my question.
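[Editorial aside on the whitespace point settled above: the six disputed code points can be checked directly on a present-day Python. The sketch assumes the modern str/bytes split — bytes.isspace() shows the <ctype.h>-style C-locale behavior of the old byte string, and str.isspace() shows the Unicode property, which is exactly Walter's chr(i) vs. unichr(i) difference.]

```python
disputed = (0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0)

for i in disputed:
    # Byte strings still follow C isspace(): none of the six count.
    assert not bytes([i]).isspace()
    # Unicode strings use the Unicode White_Space property: all six do.
    assert chr(i).isspace()
```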
I'm doubting the wisdom of including these grep instructions (correct or otherwise :-) for several reasons: (1) It doesn't catch all cases (regexes aren't powerful enough to match arbitrary expressions) (2) This recipe is Unix specific (3) (Most important) it encourages "peephole changes" By "peephole changes" I mean a very focused search-and-destroy looking for a pattern and changing it, without looking at anything else. This can cause anachronistic code, where one line is modern style, and the rest of a function uses outdated idioms. IMO that looks worse than all old style. It can also cause bugs to slip in. In my recollection, every time someone went in and did a sweep over the standard library looking for a particular pattern to fix, they introduced at least one bug. I much prefer such modernizations to be done only when you have a reason to look at a particular module anyway, so you really understand the code before you go in. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@strakt.com Tue Jun 4 16:08:37 2002 From: martin@strakt.com (Martin =?iso-8859-1?Q?Sj=F6gren?=) Date: Tue, 4 Jun 2002 17:08:37 +0200 Subject: [Python-Dev] Python 2.3 release schedule In-Reply-To: <001c01c20474$e1a50140$f9d8accf@othello> References: <200205242217.g4OMHbu25323@pcp742651pcs.reston01.va.comcast.net> <001c01c20474$e1a50140$f9d8accf@othello> Message-ID: <20020604150837.GA30078@strakt.com> On Sun, May 26, 2002 at 01:19:15AM -0400, Raymond Hettinger wrote: > ia.filter(pred) # takewhile > ia.invfilter(pred) # dropwhile Err. I don't know what you mean with "filter", but in Haskell, there is a big difference between filter and takeWhile. 
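[Editorial aside: Martin's distinction maps one-for-one onto Python — filter keeps every matching element, while takeWhile/dropWhile stop or start at the first failing element. The sketch uses the itertools module, an anachronism relative to this thread (itertools arrived in Python 2.3), with the same predicate and input as the Haskell session.]

```python
from itertools import takewhile, dropwhile

data = [1, 2, 3, 4, 5]
pred = lambda x: x > 3

# filter: keep every element satisfying the predicate.
assert [x for x in data if pred(x)] == [4, 5]

# takewhile: keep elements only until the predicate first fails;
# 1 fails immediately, so nothing is taken.
assert list(takewhile(pred, data)) == []

# dropwhile: drop elements until the predicate first fails, then keep
# everything else; 1 fails immediately, so nothing is dropped.
assert list(dropwhile(pred, data)) == [1, 2, 3, 4, 5]
```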
Prelude> filter (>3) [1..5] [4,5] Prelude> takeWhile (>3) [1..5] [] Prelude> dropWhile (>3) [1..5] [1,2,3,4,5] /Martin -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 7710870 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From neal@metaslash.com Tue Jun 4 16:17:50 2002 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 04 Jun 2002 11:17:50 -0400 Subject: [Python-Dev] Changes to PEP 8 & 42 Message-ID: <3CFCDA1E.B0577968@metaslash.com> I've also got a change to PEP 8 (for basestring). Is the wording ok? Should I check it in or should Barry/Guido? Also, it seems that PEP 42 is a bit out of date. I think the stat/statvfs changes may be done (at least started), the std library uses 4 space indents, math.radians/degrees were added. Probably others too. Neal -- Index: pep-0008.txt =================================================================== RCS file: /cvsroot/python/python/nondist/peps/pep-0008.txt,v retrieving revision 1.13 diff -C1 -r1.13 pep-0008.txt *** pep-0008.txt 29 May 2002 16:07:27 -0000 1.13 --- pep-0008.txt 3 Jun 2002 20:25:54 -0000 *************** *** 542,548 **** When checking if an object is a string, keep in mind that it ! might be a unicode string too! In Python 2.2, the types module ! has the StringTypes type defined for that purpose, e.g.: ! from types import StringTypes: ! if isinstance(strorunicodeobj, StringTypes): --- 542,547 ---- When checking if an object is a string, keep in mind that it ! might be a unicode string too! In Python 2.3, str and unicode ! have a common base class, basestring, so you can do: ! if isinstance(strorunicodeobj, basestring): From guido@python.org Tue Jun 4 16:39:32 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 11:39:32 -0400 Subject: [Python-Dev] Changes to PEP 8 & 42 In-Reply-To: Your message of "Tue, 04 Jun 2002 11:17:50 EDT."
<3CFCDA1E.B0577968@metaslash.com> References: <3CFCDA1E.B0577968@metaslash.com> Message-ID: <200206041539.g54FdWF01219@odiug.zope.com> > I've also got a change to PEP 8 (for basestring). Is the wording ok? > Should I check it in or should Barry/Guido? You can check it in, but I suggest providing ways of doing this for 2.0/2.1, 2.2, and 2.3 (since they are all different). > Also, it seems that PEP 42 is a bit out of date. I think the > stat/statvfs changes may be done (at least started), > the std library uses 4 space indents, math.radians/degrees were added. > Probably others too. Whoever fulfilled those wishes should ideally edit the PEP. You can do it too. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Jun 4 16:43:29 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 4 Jun 2002 11:43:29 -0400 Subject: [Python-Dev] Changes to PEP 8 & 42 References: <3CFCDA1E.B0577968@metaslash.com> Message-ID: <15612.57377.547271.755162@anthem.wooz.org> >>>>> "NN" == Neal Norwitz writes: NN> I've also got a change to PEP 8 (for basestring). Is the NN> wording ok? Should I check it in or should Barry/Guido? I would add the Python 2.3 recommendation to the PEP instead of replacing the Python 2.2 recommendation. If you do that, feel free to commit the change. -Barry From martin@v.loewis.de Tue Jun 4 18:40:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 4 Jun 2002 19:40:27 +0200 Subject: [Python-Dev] Patch #473512 Message-ID: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> I'm ready to apply patch 473512 : getopt with GNU style scanning, which adds getopt.gnu_getopt. Any objections? Regards, Martin From guido@python.org Tue Jun 4 19:04:28 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 14:04:28 -0400 Subject: [Python-Dev] Patch #473512 In-Reply-To: Your message of "Tue, 04 Jun 2002 19:40:27 +0200." 
<200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> Message-ID: <200206041804.g54I4Sd16333@odiug.zope.com> > I'm ready to apply patch 473512 : getopt with GNU style scanning, > which adds getopt.gnu_getopt. > > Any objections? Is there a point to adding more cruft to getopt.py now that we're getting Greg Ward's Optik? Also, I happen to hate GNU style getopt. You may call me an old fogey, but I think options should precede other arguments. But other that, no objections. --Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Tue Jun 4 19:09:26 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 04 Jun 2002 14:09:26 -0400 Subject: [Python-Dev] Re: Patch #473512 In-Reply-To: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> Message-ID: [Martin v. Loewis] > I'm ready to apply patch 473512 : getopt with GNU style scanning, which > adds getopt.gnu_getopt. Any objections? GNU getopt changes once in a while. Will `getopt.gnu_getopt' track and reflect these changes as they occur? I mean, is it the intent? If yes, the name might be fine. Otherwise, it might be better to name this other `getopt' after some of its properties, instead of using `gnu' as a prefix. If GNU it has to be, maybe it should be capitalised? Some existing modules suggest capitals where underlines would probably have been sufficient, maybe we should use capitals where they are more naturally mandated. Not a big matter for me, but probably worth a thought nevertheless? Have a good day, everybody! -- François Pinard http://www.iro.umontreal.ca/~pinard From pinard@iro.umontreal.ca Tue Jun 4 19:34:12 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 04 Jun 2002 14:34:12 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? 
In-Reply-To: <20020604140734.GB1039@ute.mems-exchange.org> References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> <20020604140734.GB1039@ute.mems-exchange.org> Message-ID: [Andrew Kuchling] > On Tue, Jun 04, 2002 at 09:29:51AM -0400, François Pinard wrote: > >[1] Full stops are punctuation ending sentences with two spaces guaranteed. > >Full stops are defined that way for typography based on fixed width fonts, > >like when we say "this many characters to a line". > I don't think this really matters, because I doubt anyone will be > implementing full justification. This is an orthogonal matter, unrelated to full stops. Simultaneous left and right justification for fixed fonts texts is _not_ to be praised[1]. The real goal of any typographical device, like wrapping, is improving the legibility of text. Maybe simultaneous left and right justification is more "good looking", some would even say "beautiful", but I think it is considered well known that such simultaneous justification significantly decreases legibility for fixed width fonts. If a typographical device aims at beauty instead of legibility, it misses the real goal. > Left justification is just a matter of inserting newlines at particular > points, so if the input data has two spaces after punctuation, > line-breaking won't introduce any errors. Excellent if it could be done exactly this way. However, things are not always that simple. If a newline is inserted at some point for wrapping purposes, it is desirable and usual to remove what was whitespace around that point, so we do not have unwelcome spaces at start of the beginning line, or spurious trailing whitespace at end of the previous line. If the wrapping device otherwise replaces sequences of many spaces by one, it should be careful at replacing many spaces by two, in context of full stops.
---------- [1] I think, shudder and horror, that `man' does simultaneous left and right justification when producing ASCII pages, this is especially bad since `man' is about documentation to start with. Of course, when generating pages for laser printers, with proportional fonts and micro-spacing, things are pretty different, and _then_ simultaneous left and right justification makes sense for legibility, if kept within reasonable bounds of course. I'm almost sure that all of us have seen dubious and unreasonable usages. -- François Pinard http://www.iro.umontreal.ca/~pinard From martin@v.loewis.de Tue Jun 4 19:34:07 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 04 Jun 2002 20:34:07 +0200 Subject: [Python-Dev] Patch #473512 In-Reply-To: <200206041804.g54I4Sd16333@odiug.zope.com> References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> <200206041804.g54I4Sd16333@odiug.zope.com> Message-ID: Guido van Rossum writes: > Is there a point to adding more cruft to getopt.py now that we're > getting Greg Ward's Optik? Perhaps ease-of-use - people wanting to use GNU getopt style only need to change the function name in their existing application. > Also, I happen to hate GNU style getopt. You may call me an old > fogey, but I think options should precede other arguments. That certainly is debatable. However, since the patch is a pure addition, every application author will have to make the choice herself, and no existing application will break. Regards, Martin From guido@python.org Tue Jun 4 19:53:54 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 14:53:54 -0400 Subject: [Python-Dev] Patch #473512 In-Reply-To: Your message of "04 Jun 2002 20:34:07 +0200." References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> <200206041804.g54I4Sd16333@odiug.zope.com> Message-ID: <200206041853.g54Irsb17084@odiug.zope.com> > > Is there a point to adding more cruft to getopt.py now that we're > > getting Greg Ward's Optik? 
> > Perhaps ease-of-use - people wanting to use GNU getopt style only need > to change the function name in their existing application. > > > Also, I happen to hate GNU style getopt. You may call me an old > > fogey, but I think options should precede other arguments. > > That certainly is debatable. However, since the patch is a pure > addition, every application author will have to make the choice > herself, and no existing application will break. So I'm a neutral 0 on the patch. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 4 19:55:11 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 14:55:11 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()? In-Reply-To: Your message of "04 Jun 2002 14:34:12 EDT." References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> <20020604140734.GB1039@ute.mems-exchange.org> Message-ID: <200206041855.g54ItBW17096@odiug.zope.com> > Excellent if it could be done exactly this way. However, things are > not always that simple. If a newline is inserted at some point for > wrapping purposes, it is desirable and usual to remove what was > whitespace around that point, so we do not have unwelcome spaces at > start of the beginning line, or spurious trailing whitespace at end > of the previous line. If the wrapping device otherwise replaces > sequences of many spaces by one, it should be careful at replacing > many space by two, in context of full stops. Emacs does it this way because you reformat the same paragraph over and over. The downside is that sometimes a line is shorter than it could be because it would end in a period. For what we're doing here (producing tidy output) I prefer not to do the Emacs fiddling. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue Jun 4 20:48:20 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 04 Jun 2002 21:48:20 +0200 Subject: [Python-Dev] Re: Patch #473512 In-Reply-To: References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> Message-ID: pinard@iro.umontreal.ca (François Pinard) writes: > GNU getopt changes once in a while. In what way? When has it changed last, and what was that change? > Will `getopt.gnu_getopt' track and reflect these changes as they > occur? I mean, is it the intent? If yes, the name might be fine. > Otherwise, it might be better to name this other `getopt' after some > of its properties, instead of using `gnu' as a prefix. Assuming that bug fixes are made to GNU getopt, it would certainly be reasonable to reflect them in getopt.gnu_getopt. > If GNU it has to be, maybe it should be capitalised? Some existing modules > suggest capitals where underlines would probably have been sufficient, > maybe we should use capitals where they are more naturally mandated. > > Not a big matter for me, but probably worth a thought nevertheless? I've no opinion on that except that function names are traditionally all lower-case. I'll ask the author of the patch. Regards, Martin From guido@python.org Tue Jun 4 20:58:31 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 15:58:31 -0400 Subject: [Python-Dev] Re: Patch #473512 In-Reply-To: Your message of "04 Jun 2002 21:48:20 +0200." References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> Message-ID: <200206041958.g54JwVS18336@odiug.zope.com> > > If GNU it has to be, maybe it should be capitalised? Some existing modules > > suggest capitals where underlines would probably have been sufficient, > > maybe we should use capitals where they are more naturally mandated. > > > > Not a big matter for me, but probably worth a thought nevertheless? > > I've no opinion on that except that function names are traditionally > all lower-case. I'll ask the author of the patch.
I'm -1 on capitalizing GNU here -- the module name and function name are all lowercase. --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Tue Jun 4 21:08:08 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 4 Jun 2002 16:08:08 -0400 Subject: [Python-Dev] xrange identity crisis Message-ID: <20020604200808.GA43351@hishome.net> It seems that the xrange object in the current CVS can't make up its mind whether it's an iterator or an iterable: >>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)] >>> iterators = [iter(x) for x in iterables] >>> for x in iterables + iterators: ... print hasattr(x, 'next'), x is iter(x), type(x) ... False False False False False False False False False False True False True True True True True True True True True True True False Generally, iterables don't have a next() method and return a new object each time they are iter()ed. Iterators do have a next() method and return themselves on iter(). xrange is a strange hybrid. In Python 2.2.0/1 xrange behaved just like the other iterables: >>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)] >>> iterators = [iter(x) for x in iterables] >>> for x in iterables + iterators: ... print hasattr(x, 'next'), x is iter(x), type(x) ... 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 What's the rationale behind this change? Oren From jepler@unpythonic.net Tue Jun 4 22:01:24 2002 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 4 Jun 2002 16:01:24 -0500 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <20020604200808.GA43351@hishome.net> References: <20020604200808.GA43351@hishome.net> Message-ID: <20020604160123.D24361@unpythonic.net> On Tue, Jun 04, 2002 at 04:08:08PM -0400, Oren Tirosh wrote: > It seems that the xrange object in the current CVS can't make up its mind > whether it's an iterator or an iterable: In 2.2, xrange had no "next" method, so it got wrapped by a generic iterator object. 
It was desirable for performance to have xrange also act as an iterator. See http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/rangeobject.c.diff?r1=2.35&r2=2.36 for the change. See http://www.python.org/sf/551410 for the sf patch this comes from. However, the following code would give different results if 'iter(x) is x' for xrange objects: x = xrange(5) for a in x: for b in x: print a,b (it'd print "0 1" "0 2" "0 3" "0 4" if they were the same iterator, just as for 'x = iter(range(5))') so, it's necessary to return a *different* xrange object from iter(x) so it can start iterating from the beginning again. I think there's an optimization that *the first time*, iter(x) is x for an xrange object. Hm, the python cvs I have here is too old to have this optimization ... so I can't really tell you how it works now for sure. Jeff From python@rcn.com Tue Jun 4 21:57:58 2002 From: python@rcn.com (Raymond Hettinger) Date: Tue, 4 Jun 2002 16:57:58 -0400 Subject: [Python-Dev] xrange identity crisis References: <20020604200808.GA43351@hishome.net> Message-ID: <001901c20c0a$8254d880$f061accf@othello> Xrange was given its own tp_iter slot and now runs as fast as range. In single pass timings, it runs faster. In multiple passes, range is still quicker because it only has to create the PyNumbers once. Being immutable, xrange had the advantage that it could serve as its own iterator and did not require the extra code needed for list iterators and dict iterators. Raymond Hettinger ----- Original Message ----- From: "Oren Tirosh" To: Sent: Tuesday, June 04, 2002 4:08 PM Subject: [Python-Dev] xrange identity crisis > It seems that the xrange object in the current CVS can't make up its mind > whether it's an iterator or an iterable: > > >>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)] > >>> iterators = [iter(x) for x in iterables] > >>> for x in iterables + iterators: > ... print hasattr(x, 'next'), x is iter(x), type(x) > ...
> False False > False False > False False > False False > False False > True False > True True > True True > True True > True True > True True > True False > > Generally, iterables don't have a next() method and return a new object > each time they are iter()ed. Iterators do have a next() method and return > themselves on iter(). xrange is a strange hybrid. > > In Python 2.2.0/1 xrange behaved just like the other iterables: > >>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)] > >>> iterators = [iter(x) for x in iterables] > >>> for x in iterables + iterators: > ... print hasattr(x, 'next'), x is iter(x), type(x) > ... > 0 0 > 0 0 > 0 0 > 0 0 > 0 0 > 0 0 > 1 1 > 1 1 > 1 1 > 1 1 > 1 1 > 1 1 > > What's the rationale behind this change? > > Oren > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > From guido@python.org Tue Jun 4 22:12:45 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 17:12:45 -0400 Subject: [Python-Dev] xrange identity crisis In-Reply-To: Your message of "Tue, 04 Jun 2002 16:01:24 CDT." <20020604160123.D24361@unpythonic.net> References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> Message-ID: <200206042112.g54LCjl25275@odiug.zope.com> > On Tue, Jun 04, 2002 at 04:08:08PM -0400, Oren Tirosh wrote: > > It seems that the xrange object in the current CVS can't make up its mind > > whether it's an iterator or an iterable: > > In 2.2, xrange had no "next" method, so it got wrapped by a generic > iterator object. It was desirable for performance to have xrange also > act as an iterator. This seems to propagate the confusion. To avoid being wrapped by a generic iterator object, you need to define an __iter__ method, not a next method. 
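[Editorial aside: Guido's rule — iterables define __iter__, while iterators define next (today __next__) and return themselves from iter() — can be illustrated with a toy class. Evens is a made-up name, and the generator-based __iter__ is just one of several ways to satisfy the protocol.]

```python
class Evens:
    """Re-iterable container of the even numbers below `stop`."""
    def __init__(self, stop):
        self.stop = stop

    def __iter__(self):
        # Defining __iter__ (not a next method) means each call hands
        # back a fresh, independent iterator -- here, a new generator.
        for i in range(0, self.stop, 2):
            yield i

e = Evens(7)
assert list(e) == [0, 2, 4, 6]
assert list(e) == [0, 2, 4, 6]   # a second full pass works
assert iter(e) is not e          # the object is not its own iterator
it = iter(e)
assert iter(it) is it            # but the iterator it yields is
```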
The current xrange code (from SF patch #551410) uses the xrange object as both an iterator and iterable, and has an extra flag to make things work right when the same object is iterated over more than once. Without doing more of a review, I can only say that I'm a bit uncomfortable with that approach. Something like the more recent code that Raymond H added to listobject.c to add a custom iterator makes more sense. But perhaps it is defensible. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 4 22:18:39 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 17:18:39 -0400 Subject: [Python-Dev] xrange identity crisis In-Reply-To: Your message of "Tue, 04 Jun 2002 16:57:58 EDT." <001901c20c0a$8254d880$f061accf@othello> References: <20020604200808.GA43351@hishome.net> <001901c20c0a$8254d880$f061accf@othello> Message-ID: <200206042118.g54LIdr25307@odiug.zope.com> [Raymond Hettinger] > Xrange was given its own tp_iter slot and now runs as fast as range. > In single pass timings, it runs faster. In multiple passes, range > is still quicker because it only has to create the PyNumbers once. > > Being immutable, xrange had the advantage that it could serve as its > own iterator and did not require the extra code needed for list > iterators and dict iterators. Did you write the patch that Martin checked in? It's broken.
>>> a = iter(xrange(10)) >>> for i in a: print i if i == 4: print '*', a.next() 0 1 2 3 4 * 0 5 6 7 8 9 >>> Compare to: >>> a = iter(range(10)) >>> for i in a: print i if i == 4: print '*', a.next() 0 1 2 3 4 * 5 6 7 8 9 >>> --Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Tue Jun 4 22:42:52 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 5 Jun 2002 00:42:52 +0300 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <20020604160123.D24361@unpythonic.net>; from jepler@unpythonic.net on Tue, Jun 04, 2002 at 04:01:24PM -0500 References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> Message-ID: <20020605004252.A27339@hishome.net> On Tue, Jun 04, 2002 at 04:01:24PM -0500, Jeff Epler wrote: > On Tue, Jun 04, 2002 at 04:08:08PM -0400, Oren Tirosh wrote: > > It seems that the xrange object in the current CVS can't make up its mind > > whether it's an iterator or an iterable: > > In 2.2, xrange had no "next" method, so it got wrapped by a generic > iterator object. It was desirable for performance to have xrange also > act as an iterator. I understand the performance issue. But it is possible to improve the performance of iterating over xranges without creating this unholy chimera. >>> type([]), type(iter([])) (<type 'list'>, <type 'listiterator'>) ... lists have a listiterator >>> type({}), type(iter({})) (<type 'dict'>, <type 'dictionary-iterator'>) ... dictionaries have a dictionary-iterator >>> type(xrange(10)), type(iter(xrange(10))) (<type 'xrange'>, <type 'xrange'>) ... why shouldn't an xrange have an xrangeiterator? It's the only way to make xrange behave consistently with other iterables. > However, the following code would give different results if 'iter(x) > is x' for xrange objects: > x = xrange(5) > for a in x: > for b in x: > print a,b xrange is currently stuck halfway between an iterable and an iterator. If it was made 100% iterator you would be right, it would break this code. What I'm saying is that it should be 100% iterable. I know it works just fine the way it is.
But I see a lot of confusion on the python list around the semantics of iterators and this behavior might make it just a little bit worse. Oren From martin@v.loewis.de Tue Jun 4 22:47:00 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 04 Jun 2002 23:47:00 +0200 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <200206042112.g54LCjl25275@odiug.zope.com> References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> <200206042112.g54LCjl25275@odiug.zope.com> Message-ID: Guido van Rossum writes: > The current xrange code (from SF patch #551410) uses the xrange object > as both an iterator and iterable, and has an extra flag to make things > work right when the same object is iterated over more than once. > Without doing more of a review, I can only say that I'm a but > uncomfortable with that approach. Something like the more recent code > that Raymond H added to listobject.c to add a custom iterator makes > more sense. But perhaps it is defensible. The main defense is that the typical use case is for i in xrange(len(some_list)) In that case, it is desirable not to create an additional object, and nobody will notice the difference. Regards, Martin From martin@v.loewis.de Tue Jun 4 22:52:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 04 Jun 2002 23:52:30 +0200 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <20020604200808.GA43351@hishome.net> References: <20020604200808.GA43351@hishome.net> Message-ID: Oren Tirosh writes: > What's the rationale behind this change? The rationale is that it is more efficient. You seem to think it is a problem. Can you explain why you think so? Regards, Martin From martin@v.loewis.de Tue Jun 4 23:07:36 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 05 Jun 2002 00:07:36 +0200 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <20020605004252.A27339@hishome.net> References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> <20020605004252.A27339@hishome.net> Message-ID: Oren Tirosh writes: > ... why shouldn't an xrange have an xrangeiterator? Because that would create an additional object. > It's the only way to make xrange behave consistently with other iterables. Why does it have to be consistent? > I know it works just fine the way it is. But I see a lot of confusion on > the python list around the semantics of iterators and this behavior might > make it just a little bit worse. Why do you think people will get confused? Most people will use it in the canonical form for i in range(maxvalue) in which case they cannot experience any difference (except for the performance boost)? Regards, Martin From greg@cosc.canterbury.ac.nz Wed Jun 5 01:24:58 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Jun 2002 12:24:58 +1200 (NZST) Subject: [Python-Dev] xrange identity crisis In-Reply-To: Message-ID: <200206050024.MAA06957@s454.cosc.canterbury.ac.nz> martin@v.loewis.de (Martin v. Loewis): > The main defense is that the typical use case is > > for i in xrange(len(some_list)) How about deprecating xrange, and introducing a new function such as indexes(sequence) that returns a proper iterator. That would clear up all the xrange confusion and make for nicer looking code as well. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From gward@python.net Wed Jun 5 02:14:02 2002 From: gward@python.net (Greg Ward) Date: Tue, 4 Jun 2002 21:14:02 -0400 Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> Message-ID: <20020605011402.GA13638@gerg.ca> On 04 June 2002, François Pinard said: > Hi, people. > > For this incoming text wrapper facility, there is a feature that appears > really essential to me, and many others: the protection of full stops[1]. If you mean reformatting this: """ This is a sentence ending. If we convert each newline to a single space, there won't be enough space after that period. """ to this: """ This is a sentence ending. If we convert each newline to a single space, there won't be enough space after that period. """ then my wrapping algorithm handles it. However, it's currently limited to English, because it relies on string.lowercase to detect sentence ending periods -- this needs to be fixed, but I was going to post the code and let someone who understands locales tell me what to do. ;-) Greg -- Greg Ward - programmer-at-big gward@python.net http://starship.python.net/~gward/ "... but in the town it was well known that when they got home their fat and psychopathic wives would thrash them to within inches of their lives ..." From guido@python.org Wed Jun 5 02:22:30 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 21:22:30 -0400 Subject: [Python-Dev] xrange identity crisis In-Reply-To: Your message of "04 Jun 2002 23:47:00 +0200." References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> <200206042112.g54LCjl25275@odiug.zope.com> Message-ID: <200206050122.g551MUA01710@pcp02138704pcs.reston01.va.comcast.net> > The main defense is that the typical use case is > > for i in xrange(len(some_list)) > > In that case, it is desirable not to create an additional object, and > nobody will notice the difference. Is it really so bad if this allocates *two* objects instead of one? I think that's the only way to get my example to work correctly. And it *has* to work correctly.
If two objects are created anyway, I agree with Oren that it's better to have a separate range-iterator object type. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 5 02:42:05 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Jun 2002 21:42:05 -0400 Subject: [Python-Dev] xrange identity crisis In-Reply-To: Your message of "Wed, 05 Jun 2002 12:24:58 +1200." <200206050024.MAA06957@s454.cosc.canterbury.ac.nz> References: <200206050024.MAA06957@s454.cosc.canterbury.ac.nz> Message-ID: <200206050142.g551g5j01889@pcp02138704pcs.reston01.va.comcast.net> > How about deprecating xrange, Deprecating xrange has about as much chance as deprecating the string module. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Wed Jun 5 03:43:28 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Jun 2002 14:43:28 +1200 (NZST) Subject: [Python-Dev] xrange identity crisis In-Reply-To: <200206050142.g551g5j01889@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206050243.OAA07021@s454.cosc.canterbury.ac.nz> > Deprecating xrange has about as much chance as deprecating the string > module. Well, discouraging it then, or whatever *is* being done with the string module. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From python@rcn.com Wed Jun 5 05:52:16 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 5 Jun 2002 00:52:16 -0400 Subject: [Python-Dev] xrange identity crisis References: <20020604200808.GA43351@hishome.net> <001901c20c0a$8254d880$f061accf@othello> <200206042118.g54LIdr25307@odiug.zope.com> Message-ID: <008101c20c4c$c4fa9700$a666accf@othello> RDH> Xrange was given its own tp_iter slot and now runs as fast as range.
RDH> > In single pass timings, it runs faster. In multiple passes, range
RDH> > is still quicker because it only has to create the PyNumbers once.
RDH> >
RDH> > Being immutable, xrange had the advantage that it could serve as its
RDH> > own iterator and did not require the extra code needed for list
RDH> > iterators and dict iterators.

GvR> Did you write the patch that Martin checked in?
GvR>
GvR> It's broken.
GvR>
GvR> >>> a = iter(xrange(10))
GvR> >>> for i in a:
GvR>         print i
GvR>         if i == 4: print '*', a.next()

Okay, here's the distilled analysis. Given x=xrange(10),

1. Oren notes that id(iter(x)) == id(x) which is atypical of objects that have special iterator types or get wrapped by the generic iterobject.

2. GvR notes that id(iter(x)) != id(iter(iter(x))) which is inconsistent with range().

#1 should not be a requirement. A call to iter should simply return something that has an iterable interface whether it be a new object or the current object. In examples of user defined classes with their own __iter__() method, we show the object returning itself. At the same time, we allow the __iter__ method to possibly be defined with a generator which returns a new object. In short, the object identity of iter(x) has not been promised to be either equal or not equal to x. If we decide that #1 is required (for consistency with the way other iterables are currently implemented), the most straightforward solution is to add an xrange iteratorobject to rangeobject.c just like we did for listobject.c. I'll be happy to do this if it is what everyone wants. For #2, the most compelling argument is that xrange should be a drop-in replacement for range in *every* circumstance including the weird use case of iter(iter(xrange(10))). This is easily accomplished and I've attached a simple patch to the bug report that restores this behavior.
However, before accepting the patch, I think we ought to consider whether the current xrange() behavior is more rational than the range() behavior. PEP 234 says: """Some folks have requested the ability to restart an iterator. This should be dealt with by calling iter() on a sequence repeatedly, not by the iterator protocol itself. """ Maybe, the right way to go is to assure that iter(x) returns a freshly loaded iterator instead of the same iterator in the same state. Right now (with xrange different from range), we get what I think is weirder behavior from range():

>>> a = iter(range(3))
>>> for i in a:
...     for j in a:
...         print i,j
0 1
0 2
>>> a = iter(xrange(3))
>>> for i in a:
...     for j in a:
...         print i,j
0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2

BTW, I'm happy to do whatever you guys think best: (a) Adding an xrangeiteratorobject fixes #1 and #2 resulting in an xrange() identical to range() with no cost to performance during the loop (creation performance suffers just a bit). (b) Adding my other patch (attached to the bug report www.python.org/sf/564601), fixes #2 only (again with no cost to loop performance). (c) Leaving it the way it is gives xrange a behavior that is identical to range for the common use cases, and arguably superior abilities for the weird cases. Raymond Hettinger From greg@cosc.canterbury.ac.nz Wed Jun 5 06:08:32 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Jun 2002 17:08:32 +1200 (NZST) Subject: [Python-Dev] xrange identity crisis In-Reply-To: <008101c20c4c$c4fa9700$a666accf@othello> Message-ID: <200206050508.RAA07040@s454.cosc.canterbury.ac.nz> > Maybe, the right way to go is to assure that iter(x) returns a freshly > loaded iterator instead of the same iterator in the same state. That would be a change to the semantics of all iterators, not worth it just to fix a small oddity with xrange. I think it's fairly clear that xrange is to be thought of as a lazy list, *not* an iterator.
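Greg's "lazy list" framing, and Raymond's option (a), can be sketched in pure Python. This is a hypothetical illustration (Python 3 spelling, with __next__ rather than 2.2's next()), not the actual C code in rangeobject.c:

```python
class LazyRangeIterator:
    """Separate iterator type; iterating never mutates the range itself."""
    def __init__(self, stop):
        self.i, self.stop = 0, stop

    def __iter__(self):
        return self                  # an iterator is its own iterator

    def __next__(self):
        if self.i >= self.stop:
            raise StopIteration
        self.i += 1
        return self.i - 1


class LazyRange:
    """A hypothetical lazy list of 0..stop-1, in the spirit of 2.x xrange."""
    def __init__(self, stop):
        self.stop = stop

    def __len__(self):
        return self.stop

    def __getitem__(self, i):        # random access, like a real list
        if 0 <= i < self.stop:
            return i
        raise IndexError(i)

    def __iter__(self):
        return LazyRangeIterator(self.stop)   # fresh iterator every call


x = LazyRange(3)
assert iter(x) is not x              # no identity surprise (point #1)
assert list(x) == [0, 1, 2]
a = iter(x)
# nested loops over one shared iterator now match range()'s behavior
assert [(i, j) for i in a for j in a] == [(0, 1), (0, 2)]
```

Because __iter__ returns a fresh object each time, looping over the range itself restarts cleanly, while looping twice over a single iterator consumes it, exactly as Raymond's range() transcript shows.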
The best way to fix it (if it needs fixing) is to have iter(xrange(...)) always return a new object, I think. It wouldn't be possible for all iterators to behave the way you suggest, anyway, because some kinds of iterator don't have an underlying sequence that can be restarted (e.g. file iterators). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@v.loewis.de Wed Jun 5 08:30:05 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 05 Jun 2002 09:30:05 +0200 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <200206050122.g551MUA01710@pcp02138704pcs.reston01.va.comcast.net> References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> <200206042112.g54LCjl25275@odiug.zope.com> <200206050122.g551MUA01710@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Is it really so bad if this allocates *two* objects instead of one? When accepting the patch, I assumed that the observed speed difference between xrange and range originated from the fact that xrange iteration allocates iterator objects. I'm not so sure anymore that this is the real cause; more likely, it is again the exception handling when exhausting the range. > I think that's the only way to get my example to work correctly. And it > *has* to work correctly. > > If two objects are created anyway, I agree with Oren that it's better > to have a separate range-iterator object type. I agree. I wouldn't mind if somebody would review Raymond's patch to introduce such a thing.
Regards, Martin From oren-py-d@hishome.net Wed Jun 5 10:21:23 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 5 Jun 2002 12:21:23 +0300 Subject: [Python-Dev] xrange identity crisis In-Reply-To: <001901c20c0a$8254d880$f061accf@othello>; from python@rcn.com on Tue, Jun 04, 2002 at 04:57:58PM -0400 References: <20020604200808.GA43351@hishome.net> <001901c20c0a$8254d880$f061accf@othello> Message-ID: <20020605122123.A1420@hishome.net> On Tue, Jun 04, 2002 at 04:57:58PM -0400, Raymond Hettinger wrote: > Being immutable, xrange had the advantage that it could serve as its own > iterator and did not require the extra code needed for list iterators and > dict iterators. In its current form, xrange is no longer immutable. It has state information and calling the next() method of an xrange object modifies it. I guess the difference between us is that you are concerned with what works while I am irrationally obsessed with semantics :-) Oren From thomas.heller@ion-tof.com Wed Jun 5 15:25:35 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 5 Jun 2002 16:25:35 +0200 Subject: [Python-Dev] 'compile' error message Message-ID: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook> Consider:

Python 2.3a0 (#29, Jun 5 2002, 13:09:10) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> compile("1+*3", "myfile.py", "exec")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
    1+*3
      ^
SyntaxError: invalid syntax
>>>

Shouldn't it print "myfile.py" instead of "<string>"?
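For the record, this was indeed fixed: in any modern CPython the SyntaxError raised by compile() carries the filename argument. A quick check (a demo, not the 2002 code):

```python
# compile() with a bogus statement should raise SyntaxError, and the
# exception should report the filename we passed in, not "<string>".
try:
    compile("1+*3", "myfile.py", "exec")
except SyntaxError as err:
    assert err.filename == "myfile.py"   # the behavior Thomas expected
    assert err.lineno == 1
else:
    raise AssertionError("compile() should have rejected '1+*3'")
```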
Thomas From Oleg Broytmann Wed Jun 5 15:48:14 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Wed, 5 Jun 2002 18:48:14 +0400 Subject: [Python-Dev] 'compile' error message In-Reply-To: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Wed, Jun 05, 2002 at 04:25:35PM +0200 References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook> Message-ID: <20020605184814.D26674@phd.pp.ru> On Wed, Jun 05, 2002 at 04:25:35PM +0200, Thomas Heller wrote: > Python 2.3a0 (#29, Jun 5 2002, 13:09:10) [MSC 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> compile("1+*3", "myfile.py", "exec") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > File "<string>", line 1 > 1+*3 > ^ > SyntaxError: invalid syntax > >>> > > Shouldn't it print "myfile.py" instead of "<string>"? I think it should. Just yesterday I got stuck on this bug. Please file a bug report. PS. Tomorrow I'll publish the code that uses "compile", watch c.l.py :) Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From David Abrahams" <20020605184814.D26674@phd.pp.ru> Message-ID: <177701c20ca3$16e69d10$6601a8c0@boostconsulting.com> Didn't I report this problem for 2.2? I was getting this "<string>" thing in my doctest outputs. Could've sworn I phoned it in, and was told it was already fixed. -Dave ----- Original Message ----- From: "Oleg Broytmann" Shouldn't it print "myfile.py" instead of "<string>"? > > I think it should. Just yesterday I got stuck on this bug. Please file a bug report. > > PS. Tomorrow I'll publish the code that uses "compile", watch c.l.py :) > > Oleg. > -- > Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru > Programmers don't die, they just GOSUB without RETURN.
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From guido@python.org Wed Jun 5 17:14:45 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 12:14:45 -0400 Subject: [Python-Dev] 'compile' error message In-Reply-To: Your message of "Wed, 05 Jun 2002 16:25:35 +0200." <0c6401c20c9c$db600610$e000a8c0@thomasnotebook> References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook> Message-ID: <200206051614.g55GEjr01951@pcp02138704pcs.reston01.va.comcast.net> > >>> compile("1+*3", "myfile.py", "exec") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > File "<string>", line 1 > 1+*3 > ^ > SyntaxError: invalid syntax > >>> > > Shouldn't it print "myfile.py" instead of "<string>"? Yes. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Wed Jun 5 18:04:42 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 5 Jun 2002 19:04:42 +0200 Subject: [Python-Dev] 'compile' error message References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook> <200206051614.g55GEjr01951@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0d3001c20cb3$164b7b40$e000a8c0@thomasnotebook> > > >>> compile("1+*3", "myfile.py", "exec") > > Traceback (most recent call last): > > File "<stdin>", line 1, in ? > > File "<string>", line 1 > > 1+*3 > > ^ > > SyntaxError: invalid syntax > > >>> > > > > Shouldn't it print "myfile.py" instead of "<string>"? > > Yes. > > --Guido van Rossum (home page: http://www.python.org/~guido/) Submitted as SF bug #564931. Thomas From eikeon@eikeon.com Wed Jun 5 20:43:27 2002 From: eikeon@eikeon.com (Daniel 'eikeon' Krech) Date: 05 Jun 2002 15:43:27 -0400 Subject: [Python-Dev] d.get_key(key) -> key? Message-ID: While attempting to "intern" the nodes in our rdflib's triple store I have come across the following question.
Is there or could there be an efficient way to get an existing key from a dictionary given a key that is == but whose id is not? For example: given a==b and id(a)!=id(b) and d[a] = 1, what is the best way to: d.get_key(b) -> a --eikeon, http://eikeon.com/ PS: Here is the code where I am trying to get rid of multiple instances of equivalent nodes: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/redfoot/rdflib-1.0/rdflib/store/memory.py?rev=1.1.1.1&content-type=text/vnd.viewcvs-markup and a not so efficient first attempt:

# (could use a s.get(e2)->e1 as well given e1==e2 and id(e1)!=id(e2))
class Set(object):
    def __init__(self):
        self.__set = []

    def add(self, obj):
        e = self.get(obj)
        if e:
            # already have equivalent element, so return the one we have
            return e
        else:
            self.__set.append(obj)
            return obj

    def get(self, obj):
        if obj in self.__set:
            for e in self.__set:
                if e==obj:
                    return e
        return None

class Intern(object):
    def __init__(self):
        super(Intern, self).__init__()
        self.__nodes = Set()

    def add(self, subject, predicate, object):
        subject = self.__nodes.add(subject)
        predicate = self.__nodes.add(predicate)
        object = self.__nodes.add(object)
        super(Intern, self).add(subject, predicate, object)

From guido@python.org Wed Jun 5 20:52:37 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 15:52:37 -0400 Subject: [Python-Dev] SF sending content-free emails Message-ID: <200206051952.g55Jqcl04146@pcp02138704pcs.reston01.va.comcast.net> You may have noticed that the SF tracker is sending email that doesn't contain any content, when an item is updated. I've filed a bug report with SF. http://sourceforge.net/tracker/index.php?func=detail&aid=565001&group_id=1&atid=200001 Gordon, how's the roundup project coming along?
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From mgilfix@eecs.tufts.edu Wed Jun 5 20:52:35 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 5 Jun 2002 15:52:35 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 03, 2002 at 01:22:16PM -0400 References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020605155235.B5911@eecs.tufts.edu> On Mon, Jun 03 @ 13:22, Guido van Rossum wrote: > > Couldn't really figure out what you were seeing here. I read that > > you saw something like func( a, b), which I don't see in my local > > copy. > > test_timeout.py from the SF page has this. I'm glad you fixed this > already in your own copy. Weird. I didn't change anything. Oh well. We'll see if it shows up in the new patch this time round. > > I do have something like this for list comprehension: > > > > [ x.find('\n') for x in self._rbuf ] > > > > Er, but I thought there were supposed to be surrounding spaces at the > > edges... > > I prefer to see that as > > [x.find('\n') for x in self._rbuf] Ok. Done. One day, you can explain to me why you despise whitespace so. Perhaps she was mean to you or something. She's always hanging around with that tab guy at any rate and they make a bad mix. > > > - It also looks like you've broken the semantics of size<0 in read(). > > I was referring to this piece of code:

> ! if buf_len > size:
> !     self._rbuf.append (data[size:])
> !     data = data[:size]

> Here data[size:] gives you the last byte of the data and data[:size] > chops off the last byte. Ok. This has been fixed. All read sizes now work and have been tested by me. > > > - Maybe changing the write buffer to a list makes sense too? > > > > I could do this. Then just do a join before the flush.
Is the append /that/ much faster? > Depends on how small the chunks are you write. Roughly, repeated list > append is O(N log N), while repeated string append is O(N**2). Done. The write buffer now uses a list, so it should be faster than the initial version and the one currently in use. > OK, but given the issues the first version had, I recommend that the > code gets more review and that you write unit tests for all cases. I agree. I wasn't thorough enough in my checking. I'm going to see if I can include a test-case specifically to test the windows file class directly. > > > - Please don't introduce more static functions with a 'Py' name > > > prefix. > > > > Only did this in one place, with PyTimeout_Err. The reason was that the > > other Error functions used the Py prefix, so it was done for consistency. I > > can change that.. or change the naming scheme with the others if you like. > I like to do code cleanup that doesn't change semantics (like > renamings) as a separate patch and checkin. You can do this before or > after the timeout changes, but don't merge it into the timeout > changes. I'd still like the static names that you introduce not to > start with Py. Ok. I'll change the PyTimeout_Err to just timeout_err. We can do some other cleanup after the patch has been accepted. It's big enough as is and no need to add more complication. > OK, it looks like you call internal_setblocking(s, 0) to set the > socket in nonblocking mode. (Hm, I don't see any calls to set the > socket in blocking mode!) > > So do I understand that you are now always setting the socket in > non-blocking mode, even when there is no timeout specified, and that > you look at the sock_blocking flag to decide whether to do timeouts or > just pass the nonblocking behavior to the user? > > This is a change in semantics, and could interfere with existing > applications that pass the socket's file descriptor off to other > code.
I think I'd be happier if the behavior wasn't changed at all > until a timeout is set for a socket -- then existing code won't > break. So, the best way to proceed seems to be:

if (s->sock_timeout == Py_None)
    /* Perhaps do nothing, or just do original behavior */
else
    /* Get funky. Do one of the solutions discussed below */

> I only really care for sockets passed in to fromfd(). E.g. someone > can currently do:

> s1 = socket(AF_INET, SOCK_STREAM)
> s1.setblocking(0)
>
> s2 = fromfd(s1.fileno())
> # Now s2 is non-blocking too

> I'd like this to continue to work as long as s1 doesn't set a timeout. I see the issue. We'll worry about this and not ioctl. So let's look at solutions: > > One solution is to set/unset blocking mode right before doing each > > call to be sure of the state and based on the internally stored value > > of the blocking attribute... but... then that kind of renders ioctl > > useless. > Don't worry so much about ioctl, but do worry about fromfd. Not so popular. > > Another solution might be to set the blocking mode to on every time > > someone sets a timeout. That would change the blocking/socket > > interaction already described a bit but not drastically. Also easy > > to implement. That sends the message: Don't use ioctls when using > > timeouts. > I like this. Alright. Well, using the above pseudo-code scheme, we should be alright. So here are the new semantics:

If you set_timeout(int/float/long != None): The actual socket gets put in non-blocking mode and the usual select stuff is done.

If you set_timeout(None): The old behavior is used AND automatically, the socket is set to blocking mode. That means that someone who was doing non-blocking stuff before, sets a timeout, and then unsets one, will have to do a set_blocking call again if he wants non-blocking stuff. This makes sense 'cause timeout stuff is blocking by nature.

That seems fairest and we always have an idea of what state we're in.
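The semantics Michael describes here are essentially what later shipped in the standard socket module, with the spellings settimeout()/setblocking() rather than set_timeout()/set_blocking(). A quick sketch against the released API:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

s.settimeout(1.5)           # timeout mode: non-blocking internals + select
assert s.gettimeout() == 1.5

s.settimeout(None)          # unsetting the timeout restores blocking mode
assert s.gettimeout() is None

s.setblocking(False)        # explicit non-blocking is a timeout of 0.0
assert s.gettimeout() == 0.0

s.close()
```

Note how settimeout(None) really does put the socket back into plain blocking mode, exactly the "timeout stuff is blocking by nature" rule agreed on above.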
-- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From martin@v.loewis.de Wed Jun 5 21:30:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 05 Jun 2002 22:30:47 +0200 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: References: Message-ID: "Daniel 'eikeon' Krech" writes: > While attempting to "intern" the nodes in our rdflib's triple store > I have come across the following question. Why is that a python-dev question? Please use python-list to discuss applications of Python. Regards, Martin From guido@python.org Wed Jun 5 22:33:55 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 17:33:55 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Your message of "Wed, 05 Jun 2002 15:52:35 EDT." <20020605155235.B5911@eecs.tufts.edu> References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> <20020605155235.B5911@eecs.tufts.edu> Message-ID: <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net> > Ok. Done. One day, you can explain to me why you despise whitespace > so. Perhaps she was mean to you or something. She's always hanging > around with that tab guy at any rate and they make a bad mix. I like the whitespace use in the English language (like so) best. > Ok. This has been fixed. All read sizes now work and have been tested > by me. Have you written unit tests? That would be really great. Ideally, the tests should pass both before and after your patches. > So, the best way to proceed seems to be: > > if (s->sock_timeout == Py_None) > /* Perhaps do nothing, or just do original behavior */ > else > /* Get funky. Do one of the solutions discussed below */ Yes. 
> So here are the new semantics: > > If you set_timeout(int/float/long != None): > The actual socket gets put in non-blocking mode and the usual select > stuff is done. > If you set_timeout(None): > The old behavior is used AND automatically, the socket is set > to blocking mode. That means that someone who was doing non-blocking > stuff before, sets a timeout, and then unsets one, will have to do > a set_blocking call again if he wants non-blocking stuff. This makes > sense 'cause timeout stuff is blocking by nature. Sounds good! --Guido van Rossum (home page: http://www.python.org/~guido/) From eikeon@eikeon.com Wed Jun 5 22:33:36 2002 From: eikeon@eikeon.com (Daniel 'eikeon' Krech) Date: 05 Jun 2002 17:33:36 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: References: Message-ID: martin@v.loewis.de (Martin v. Loewis) writes: > "Daniel 'eikeon' Krech" writes: > > > While attempting to "intern" the nodes in our rdflib's triple store > > I have come across the following question. > > Why is that a python-dev question? Please use python-list to discuss > applications of Python. Sorry, seems to me like it was on topic for python-dev seeing as python dictionaries do not currently have the functionality I desire. And it would make a great addition to an already great language, IMO. Did not mean for my message to come across as a question about applying python... it is not. --eikeon From ping@zesty.ca Wed Jun 5 22:40:16 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 5 Jun 2002 16:40:16 -0500 (CDT) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Message-ID: On 5 Jun 2002, Daniel 'eikeon' Krech wrote: > > Is there or could there be an efficient way to get an existing key from > a dictionary given a key that is == but whose id is not. For example: If i understand you correctly, a good way to solve this problem is to provide a __hash__ method on the objects that you are using as keys to your dictionary. Dictionaries look up keys by hash equality. 
Note that you will have to ensure that the keys are immutable (i.e. once they are put in the dictionary, they should never change). -- ?!ng From barry@barrys-emacs.org Wed Jun 5 23:07:10 2002 From: barry@barrys-emacs.org (Barry Scott) Date: Wed, 5 Jun 2002 23:07:10 +0100 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Message-ID: <000001c20cdd$56a14330$070210ac@LAPDANCE> Why not store the key as part of the value? d[a] = a d[b] => a If you need more info in the value put a class instance or a tuple with the key as part of the value. d[a] = (a,1) Barry -----Original Message----- From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On Behalf Of Daniel 'eikeon' Krech Sent: 05 June 2002 20:43 To: python-dev@python.org Subject: [Python-Dev] d.get_key(key) -> key? While attempting to "intern" the nodes in our rdflib's triple store I have come across the following question. Is there or could there be an efficient way to get an existing key from a dictionary given a key that is == but whose id is not.
For example: given a==b and id(a)!=id(b) and d[a] = 1, what is the best way to: d.get_key(b) -> a --eikeon, http://eikeon.com/ PS: Here is the code where I am trying to get rid of multiple instances of equivalent nodes: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/redfoot/rdflib-1.0/rdflib/store/memory.py?rev=1.1.1.1&content-type=text/vnd.viewcvs-markup and a not so efficient first attempt:

# (could use a s.get(e2)->e1 as well given e1==e2 and id(e1)!=id(e2))
class Set(object):
    def __init__(self):
        self.__set = []

    def add(self, obj):
        e = self.get(obj)
        if e:
            # already have equivalent element, so return the one we have
            return e
        else:
            self.__set.append(obj)
            return obj

    def get(self, obj):
        if obj in self.__set:
            for e in self.__set:
                if e==obj:
                    return e
        return None

class Intern(object):
    def __init__(self):
        super(Intern, self).__init__()
        self.__nodes = Set()

    def add(self, subject, predicate, object):
        subject = self.__nodes.add(subject)
        predicate = self.__nodes.add(predicate)
        object = self.__nodes.add(object)
        super(Intern, self).add(subject, predicate, object)

_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev From mgilfix@eecs.tufts.edu Wed Jun 5 23:07:29 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 5 Jun 2002 18:07:29 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jun 05, 2002 at 05:33:55PM -0400 References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> <20020605155235.B5911@eecs.tufts.edu> <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020605180728.C5911@eecs.tufts.edu> On Wed, Jun 05 @ 17:33, Guido van Rossum wrote: > > Ok. This has been fixed.
All read sizes now work and have been tested > > by me. > > Have you written unit tests? That would be really great. Ideally, > the tests should pass both before and after your patches. Done. I've added them into the test_socket.py test as I didn't feel like starting a new test that does roughly the same thing. Works on both the old (2.1.3 source I had lying around my system) and the new. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From niemeyer@conectiva.com Wed Jun 5 23:09:01 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Wed, 5 Jun 2002 19:09:01 -0300 Subject: [Python-Dev] Patch #473512 In-Reply-To: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> Message-ID: <20020605190901.A7546@ibook.distro.conectiva> > I'm ready to apply patch 473512 : getopt with GNU style scanning, > which adds getopt.gnu_getopt. I'm +1 on that. I've written a wrapper by myself a few times. Having it in the library will help. Even with Optik, this should be a small patch, and I don't think getopt will be deprecated any time soon. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From mgilfix@eecs.tufts.edu Wed Jun 5 23:25:15 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 5 Jun 2002 18:25:15 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jun 05, 2002 at 05:33:55PM -0400 References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> <20020605155235.B5911@eecs.tufts.edu> <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020605182515.D5911@eecs.tufts.edu> Ok. 
The new version of the patch is in the sourceforge tracker. Hopefully I haven't forgotten anything. Enjoy all. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From eikeon@eikeon.com Wed Jun 5 23:41:12 2002 From: eikeon@eikeon.com (Daniel 'eikeon' Krech) Date: 05 Jun 2002 18:41:12 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <000001c20cdd$56a14330$070210ac@LAPDANCE> References: <000001c20cdd$56a14330$070210ac@LAPDANCE> Message-ID: "Barry Scott" writes: > Why not store the key as part of the value? > > d[a] = a > > d[b] => a > > If you need more info in the value put a class instance or a > tuple with the key as part of the value. > > d[a] = (a,1) Ideally it would be nice not to have to store it as part of the value. But that should work. Thank you. I should have split my question clearly into two questions. Sorry for dragging the off-topic (for python-dev) aspect of my question into this list. The question I tried (poorly) to raise to this list is if a get_key(key) -> key as I described could be added to dictionaries in future versions of python. I know at least one user that would use it :) Thank you from a happy python user, --eikeon From barry@barrys-emacs.org Thu Jun 6 00:14:07 2002 From: barry@barrys-emacs.org (Barry Scott) Date: Thu, 6 Jun 2002 00:14:07 +0100 Subject: [Python-Dev] "max recursion limit exceeded" canned response? In-Reply-To: Message-ID: <000301c20ce6$b113c730$070210ac@LAPDANCE> I take it the bug is that .*? is implemented recursively rather than iteratively? I wondered if .*? was broken, but it yields the right answer for short input strings. The case of * applied to a fixed width term could be implemented iteratively, ".*", "[axz]*" etc. But variable sized terms would need a record of what they matched for back tracking. For example "(\w+\s+)*". The compiler can figure these differences out.
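Barry's diagnosis matches the thread: the 2002-era sre engine recursed while matching repeats, so long inputs hit "maximum recursion limit exceeded". That ceiling was later removed when sre was made non-recursive (in Python 2.4), so the same kind of pattern now runs in bounded stack space. A small demonstration against a modern re module:

```python
import re

# A lazy repeat over a very long run of a fixed-width term -- the kind of
# input that used to exhaust the recursive engine's ceiling.
s = "a" * 100000 + "b"

m = re.match(r"a*?b", s)   # same .*?-style lazy repeat Barry mentions
assert m is not None
assert m.end() == len(s)   # the whole run plus the final "b" is consumed
```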
Using a back tracking stack allocated from the heap would reduce the memory used to run the search at the cost of code complexity. Once the bug is fixed the canned message will only need to cover the case of greedy repeats * and {n,} encountering an input string line that is too long? I'm working on a regex parser/engine for Barry's Emacs and these design problems are fresh in my thoughts. Barry -----Original Message----- From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On Behalf Of Tim Peters Sent: 02 June 2002 23:04 To: python-dev@python.org Subject: RE: [Python-Dev] "max recursion limit exceeded" canned response? [Skip Montanaro] > How would we go about adding a canned response to the commonly submitted > "max recursion limit exceeded" bug report? [Martin v. Loewis] > Post the precise text that you want to see as the canned response, and > somebody can install it. I don't think any canned answer will suffice -- every context is different enough that it needs custom help. I vote instead that we stop answering these reports at all: let /F do it. That will eventually provoke him into either writing the canned response he wants to see, or to complete the long-delayed task of removing this ceiling from sre. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev From guido@python.org Thu Jun 6 00:25:27 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 19:25:27 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "05 Jun 2002 17:33:36 EDT." References: Message-ID: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> > Sorry, seems to me like it was on topic for python-dev seeing as > python dictionaries do not currently have the functionality I > desire. And it would make a great addition to an already great > language, IMO.
Did not mean for my message to come across as a question about applying python... it is not. The functionality you propose seems too esoteric to add. It's probably simpler to make sure you call intern() before storing the key in the dict anyway. --Guido van Rossum (home page: http://www.python.org/~guido/) From eikeon@eikeon.com Thu Jun 6 01:02:34 2002 From: eikeon@eikeon.com (Daniel 'eikeon' Krech) Date: 05 Jun 2002 20:02:34 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> References: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > The functionality you propose seems too esoteric to add. Fair enough. The fact that it is a bit esoteric is the reason why I raised it... to me it seems like a nice boundary case to support, but an unlikely one unless someone raised it. > It's probably simpler to make sure you call intern() before storing > the key in the dict anyway. It is my understanding that calling intern creates a string that is forever immortal? [Perhaps this is a question for python-list though?] For our application, we can not afford to "leak" the memory we would lose to the immortal strings. [I will likely proceed by storing the key as part of the value as Barry suggested.] Thank you for your time to hear me out. Sorry it was too esoteric... felt it at least deserved raising... now it is time to forget that I did :) Thank you all for your time, --eikeon From pobrien@orbtech.com Thu Jun 6 01:24:25 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 5 Jun 2002 19:24:25 -0500 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention Message-ID: Forgive me if this is slightly off-topic for this list, but since we've been talking about migration guides and coding idioms and tweaking performance and such, I've got a few questions I'd like to ask. I'll start with an actual code sample.
This is a very simple class that's part of an xhtml toolkit I'm writing. class Comment: def __init__(self, content=''): self.content = content def __call__(self, content=''): o = self.__class__(content) return str(o) def __str__(self): return '<!-- %s -->' % self.content def __repr__(self): return repr(self.__str__()) When I look at this, I see certain decisions I've made and I'm wondering if I've made the best decisions. I'm wondering how to balance performance against clarity and proper coding conventions. 1. In the __call__ I save a reference to the object. Instead, I could simply: return str(self.__class__(content)) Is there much of a performance impact by explicitly naming intermediate references? (I need some of Tim Peters's performance testing scripts.) 2. I chose the slightly indirect str(o) instead of o.__str__(). Is this slower? Is one style preferred over the other and why? 3. I used a format string, '<!-- %s -->' % self.content, where I could just as easily have concatenated '<!-- ' + self.content + ' -->' instead. Is one faster than the other? 4. Is there any documentation that covers these kinds of issues where there is more than one way to do something? I'd like to have some foundation for making these decisions. As you can probably guess, I usually hate having more than one way to do anything. ;-) --- Patrick K. O'Brien Orbtech From greg@cosc.canterbury.ac.nz Thu Jun 6 01:26:40 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 06 Jun 2002 12:26:40 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Message-ID: <200206060026.MAA07140@s454.cosc.canterbury.ac.nz> "Daniel 'eikeon' Krech" : > Ideally it would be nice not to have to store it as part of the > value. You could keep a separate dictionary mapping each "canonical" value to itself, and use that for normalising things before looking up the main dictionary.
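A minimal sketch of that normalising dictionary (the class name is made up for illustration; dict.setdefault makes the look-up-or-insert a single step):

```python
class Canon:
    """Map every value to the canonical (first-seen) member of its
    equivalence class, without touching intern()."""
    def __init__(self):
        self._canon = {}

    def get_key(self, key):
        # Return the stored object equal to key, storing key itself
        # if it is the first member of its equivalence class we see.
        return self._canon.setdefault(key, key)

canon = Canon()
a = "".join(["ab", "ab"])   # two equal strings that are
b = "".join(["a", "bab"])   # nevertheless distinct objects (CPython)
assert a == b and a is not b
assert canon.get_key(a) is a
assert canon.get_key(b) is a   # b is normalised to the canonical a
```

The main dictionary is then only ever indexed through canon.get_key(k), so equal keys collapse to one object and nothing is made immortal.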
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido@python.org Thu Jun 6 01:53:02 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 20:53:02 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "Thu, 06 Jun 2002 12:26:40 +1200." <200206060026.MAA07140@s454.cosc.canterbury.ac.nz> References: <200206060026.MAA07140@s454.cosc.canterbury.ac.nz> Message-ID: <200206060053.g560r2h05238@pcp02138704pcs.reston01.va.comcast.net> > You could keep a separate dictionary mapping each > "canonical" value to itself, and use that for > normalising things before looking up the main > dictionary. That's what intern() does. Can't he just call intern()? Or does he want the *uninterned* version of the key back? Why on earth? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 02:07:11 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 21:07:11 -0400 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: Your message of "Wed, 05 Jun 2002 19:24:25 CDT." References: Message-ID: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> > class Comment: > > def __init__(self, content=''): > self.content = content > > def __call__(self, content=''): > o = self.__class__(content) > return str(o) > > def __str__(self): > return '' % self.content > > def __repr__(self): > return repr(self.__str__()) > > When I look at this, I see certain decisions I've made and I'm wondering if > I've made the best decisions. I'm wondering how to balance performance > against clarity and proper coding conventions. > > 1. In the __call__ I save a reference to the object. 
Instead, I could > simply: > > return str(self.__class__(content)) > > Is there much of a performance impact by explicitly naming intermediate > references? (I need some of Tim Peters's performance testing scripts.) Since o is a "fast local" (all locals are fast locals except when a function uses exec or import *), it is very fast. The load and store of fast locals are about the fastest opcodes around. I am more worried about the inefficiency of instantiating self.__class__ and then throwing it away after calling str() on it. You could factor out the body of __str__ into a separate method so that you can invoke it from __call__ without creating an instance. > 2. I chose the slightly indirect str(o) instead of o.__str__(). Is this > slower? Is one style preferred over the other and why? str(o) is preferred. I would say that you should never call __foo__ methods directly except when you're overriding a base class's __foo__ method. > 3. I used a format string, '<!-- %s -->' % self.content, where I could just > as easily have concatenated '<!-- ' + self.content + ' -->' > instead. Is one faster than the other? You could time it. My personal belief is that for more than one + operator, %s is faster. > 4. Is there any documentation that covers these kinds of issues > where there is more than one way to do something? I'd like to have > some foundation for making these decisions. As you can probably > guess, I usually hate having more than one way to do anything. ;-) I'm not aware of documentation, and I think you should give yourself some credit for having a personal opinion. Study the standard library and you'll get an idea of what's "done" and what's "not done". BTW I have another gripe about your example. > > def __str__(self): > > return '<!-- %s -->' % self.content > > > > def __repr__(self): > > return repr(self.__str__()) This definition of __repr__ makes no sense to me -- all it does is add string quotes around the contents of the string (and escape non-printing characters and quotes if there are any).
That is confusing, because it will appear to the reader as if the object is a string. You probably should write __repr__ = __str__ instead. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Thu Jun 6 02:06:48 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 06 Jun 2002 13:06:48 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206060053.g560r2h05238@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206060106.NAA07146@s454.cosc.canterbury.ac.nz> Guido: > That's what intern() does. Can't he just call intern()? Because his keys aren't strings, I think. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido@python.org Thu Jun 6 02:11:47 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 21:11:47 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "05 Jun 2002 20:02:34 EDT." References: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net> > It is my understanding that calling intern creates a string that is > forever immortal? [Perhaps this is a question for python-list > though?] Yes. > For our application, we can not afford to "leak" the memory we would > lose to the immortal strings. Thanks for explaining that. The use case still seems too esoteric to me to warrant adding a feature to Python. You could always write an extension that does it though. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Thu Jun 6 02:12:56 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 06 Jun 2002 13:12:56 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> > For our application, we can not afford to "leak" the memory we would > lose to the immortal strings. Seems to me that if the implementation of interning were smart enough, it would be able to drop strings that were not referenced from anywhere else. Maybe *that* would be a useful feature to add? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From eikeon@eikeon.com Thu Jun 6 02:29:45 2002 From: eikeon@eikeon.com (Daniel 'eikeon' Krech) Date: 05 Jun 2002 21:29:45 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net> References: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net> Message-ID: >Because his keys aren't strings, I think. Our objects are subclasses of string. >Seems to me that if the implementation of interning were >smart enough, it would be able to drop strings that were >not referenced from anywhere else. >Maybe *that* would be a useful feature to add? Yes. Especially if it could be made to work with subclasses of string ;) --eikeon From guido@python.org Thu Jun 6 02:36:26 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 21:36:26 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "Thu, 06 Jun 2002 13:12:56 +1200." 
<200206060112.NAA07149@s454.cosc.canterbury.ac.nz> References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> Message-ID: <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> > Seems to me that if the implementation of interning were > smart enough, it would be able to drop strings that were > not referenced from anywhere else. > > Maybe *that* would be a useful feature to add? An occasional run through the 'interned' dict (in stringobject.c) looking for strings with refcount 2 would do this. Maybe something for the gc module to handle as a service whenever it runs its last-generation collection? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 02:44:26 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Jun 2002 21:44:26 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "05 Jun 2002 21:29:45 EDT." References: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206060144.g561iQc05572@pcp02138704pcs.reston01.va.comcast.net> > Our objects are subclasses of string. Ah, those can't be interned. > >Seems to me that if the implementation of interning were > >smart enough, it would be able to drop strings that were > >not referenced from anywhere else. > > >Maybe *that* would be a useful feature to add? > > Yes. Especially if it could be made to work with subclasses of string ;) Alas, subclasses of str can't be interned. Consider the following scenario. You intern a str-subclass-instance with value "foo" that implements a funky __repr__. Some other unrelated piece of code interns the string "foo". When they apply repr() to it, they'll be very unhappy that their string has been turned into something else. (In fact, the interning code, when it sees a str-subclass-instance, makes a copy as a true str instance.)
--Guido van Rossum (home page: http://www.python.org/~guido/) From pobrien@orbtech.com Thu Jun 6 03:15:51 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 5 Jun 2002 21:15:51 -0500 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > I am more worried about the inefficiency of instantiating > self.__class__ and then throwing it away after calling str() on it. > You could factor out the body of __str__ into a separate method so > that you can invoke it from __call__ without creating an instance. Some more code from the module might help explain this design decision. I'm still sort of toying with this to see if I like it. The basic idea here is that I'm trying to support both DOM-like xhtml structures as well as simple function-like callables that return strings. When the instance is called it needs a fresh state in order to better mimic a true function. It isn't immediately obvious to me how I might refactor this to avoid instantiating a throwaway. class Element: def __init__(self, klass, id, style, title): self.name = self.__class__.__name__.lower() self.attrs = { 'class': klass, # Space-separated list of classes. 'id': id, # Document-wide unique id. 'style': style, # Associated style info. 'title': title, # Advisory title/amplification. } def attrstring(self): attrs = self.attrs.keys() attrs.sort() # Sorting is only cosmetic, not required. l = [] # List of formatted attribute/value pairs. for attr in attrs: value = self.attrs[attr] if value is not None and value != '': l += ['%s="%s"' % (attr, convert(value))] s = ' ' + ' '.join(l) # Prepend a single space. return s.rstrip() # Reduce to an empty string if no attrs. 
def __str__(self): pass def __repr__(self): return repr(self.__str__()) class EmptyElement(Element): def __init__(self, klass=None, id=None, style=None, title=None): Element.__init__(self, klass, id, style, title) def __call__(self, klass=None, id=None, style=None, title=None): o = self.__class__(klass, id, style, title) return str(o) def __str__(self): attrstring = self.attrstring() return '<%s%s />\n' % (self.name, attrstring) class SimpleElement(Element): def __init__(self, content='', klass=None, id=None, style=None, title=None): self.content = content Element.__init__(self, klass, id, style, title) def __call__(self, content='', klass=None, id=None, style=None, title=None): o = self.__class__(content, klass, id, style, title) return str(o) def __str__(self): attrstring = self.attrstring() return '<%s%s>\n%s\n</%s>\n' % \ (self.name, attrstring, convert(self.content), self.name) class Br(EmptyElement): pass class Hr(EmptyElement): pass class P(SimpleElement): pass # The following singleton instances are callable, returning strings. # They can be used like simple functions to return properly tagged contents. br = Br() comment = Comment() hr = Hr() p = P() > BTW I have another gripe about your example. > > > def __str__(self): > > return '<!-- %s -->' % self.content > > > > def __repr__(self): > > return repr(self.__str__()) > This definition of __repr__ makes no sense to me -- all it does is add > string quotes around the contents of the string (and escape > non-printing characters and quotes if there are any). That is > confusing, because it will appear to the reader as if the object is a > string. Yes. This was a conscious design choice for this particular application. Maybe there is a better way, and maybe I'm not being too Pythonic, but I'm not particularly troubled by this even though I know I'm "breaking the rules". I guess I don't mind if there is more than one way to do something, as long as one way is the Python way and the other way is my way. ;-) --- Patrick K.
O'Brien Orbtech From barry@zope.com Thu Jun 6 04:35:55 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 5 Jun 2002 23:35:55 -0400 Subject: [Python-Dev] d.get_key(key) -> key? References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15614.55451.1832.105520@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> Seems to me that if the implementation of interning were smart >> enough, it would be able to drop strings that were not >> referenced from anywhere else. Maybe *that* would be a useful >> feature to add? GvR> An occasional run through the 'interned' dict (in GvR> stringobject.c) looking for strings with refcount 2 would do GvR> this. Maybe something for the gc module do handle as a GvR> service whenever it runs its last-generation collection? What about exposing _Py_ReleaseInternedStrings() to Python, say, from the gc module? -Barry From aahz@pythoncraft.com Thu Jun 6 05:03:33 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 6 Jun 2002 00:03:33 -0400 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: References: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020606040333.GA11085@panix.com> On Wed, Jun 05, 2002, Patrick K. O'Brien wrote: > > class Element: > > def __str__(self): > pass Dunno about other people's opinions, but I have a strong distaste for creating methods whose body contains pass. I always use "raise NotImplementedError". -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler From pobrien@orbtech.com Thu Jun 6 05:38:51 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 5 Jun 2002 23:38:51 -0500 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: <20020606040333.GA11085@panix.com> Message-ID: [Aahz] > > On Wed, Jun 05, 2002, Patrick K. 
O'Brien wrote: > > > > class Element: > > > > def __str__(self): > > pass > > Dunno about other people's opinions, but I have a strong distaste for > creating methods whose body contains pass. I always use "raise > NotImplementedError". I agree. That's a bad habit of mine that I need to change. Thanks for the reminder. --- Patrick K. O'Brien Orbtech From martin@v.loewis.de Thu Jun 6 07:36:41 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 06 Jun 2002 08:36:41 +0200 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > An occasional run through the 'interned' dict (in stringobject.c) > looking for strings with refcount 2 would do this. Maybe something > for the gc module do handle as a service whenever it runs its > last-generation collection? This has the potential of breaking applications that remember the id() of an interned string, instead of its value. Regards, Martin From martin@v.loewis.de Thu Jun 6 07:44:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 06 Jun 2002 08:44:30 +0200 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: References: Message-ID: "Daniel 'eikeon' Krech" writes: > Sorry, seems to me like it was on topic for python-dev seeing as > python dictionaries do not currently have the functionality I > desire. It sure is possible. You have been essentially asking the question "How do I get the canonical member of an equivalence class in Python", for which the canonical answer is "you intern the one member of each equivalence class". As Barry Scott explains, this is best done with an interning dictionary. You are also asking "how do I efficiently implement sets in Python". This is almost a FAQ, the answer is "if the elements are hashable, use a dictionary". 
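The dictionary-as-set idiom referred to here, as it looked before the built-in set type existed (the values are throwaways; only the keys matter):

```python
# Build the "set" by storing a dummy value per element.
seen = {}
for word in ["red", "green", "red", "blue", "green"]:
    seen[word] = None

assert len(seen) == 3            # duplicates collapse
assert "blue" in seen            # O(1) average membership test
assert sorted(seen) == ["blue", "green", "red"]
```

Today this is spelled set([...]), but the hashing machinery underneath is the same.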
Regards, Martin From loewis@informatik.hu-berlin.de Thu Jun 6 10:41:15 2002 From: loewis@informatik.hu-berlin.de (Martin v. Löwis) Date: 06 Jun 2002 11:41:15 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t Message-ID: What terrible things would happen if ob_size would be changed from int to size_t? The question recently came up on comp.lang.python, where the poster noticed that you cannot mmap large files on a 64-bit system where int is 32 bits; there is a 2Gib limit on the length of objects on his specific system. About the only problem I can see is that you could not store negative numbers anymore. Is ssize_t universally available, or could be used on systems where it is available? Regards, Martin From mal@lemburg.com Thu Jun 6 11:19:17 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 06 Jun 2002 12:19:17 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: Message-ID: <3CFF3725.303@lemburg.com> Martin v. Löwis wrote: > What terrible things would happen if ob_size would be changed from int > to size_t? This would cause binary incompatibility for all extension types on 64-bit systems since the object struct layout would change (probably not much of an issue since binary compatibility is not guaranteed between releases anyway). > The question recently came up on comp.lang.python, where the poster > noticed that you cannot mmap large files on a 64-bit system where int > is 32 bits; there is a 2Gib limit on the length of objects on his > specific system. Wouldn't it be easier to solve this particular problem in the type used for mmapping files? > About the only problem I can see is that you could not store negative > numbers anymore. Is ssize_t universally available, or could be used on > systems where it is available?
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From guido@python.org Thu Jun 6 13:50:38 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 08:50:38 -0400 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: Your message of "Wed, 05 Jun 2002 21:15:51 CDT." References: Message-ID: <200206061250.g56Cocp06447@pcp02138704pcs.reston01.va.comcast.net> > Yes. This was a conscious design choice for this particular > application. Maybe there is a better way, and maybe I'm not being > too Pythonic, but I'm not particularly troubled by this even though > I know I'm "breaking the rules". Maybe you shouldn't ask for advice if you have it all worked out already? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 13:57:41 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 08:57:41 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "Wed, 05 Jun 2002 23:35:55 EDT." <15614.55451.1832.105520@anthem.wooz.org> References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> <15614.55451.1832.105520@anthem.wooz.org> Message-ID: <200206061257.g56Cvfg06543@pcp02138704pcs.reston01.va.comcast.net> > GvR> An occasional run through the 'interned' dict (in > GvR> stringobject.c) looking for strings with refcount 2 would do > GvR> this. Maybe something for the gc module do handle as a > GvR> service whenever it runs its last-generation collection? > > What about exposing _Py_ReleaseInternedStrings() to Python, say, from > the gc module? If it's going to be an exposed API, it will have to live in stringobject.c, since the 'interned' dict is a static global there.
Wanna give it a crack? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 13:58:21 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 08:58:21 -0400 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: Your message of "Thu, 06 Jun 2002 00:03:33 EDT." <20020606040333.GA11085@panix.com> References: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> <20020606040333.GA11085@panix.com> Message-ID: <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net> > > class Element: > > > > def __str__(self): > > pass > > Dunno about other people's opinions, but I have a strong distaste for > creating methods whose body contains pass. I always use "raise > NotImplementedError". But that has different semantics! --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Thu Jun 6 14:13:25 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 6 Jun 2002 09:13:25 -0400 Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net> References: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> <20020606040333.GA11085@panix.com> <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020606131325.GA21511@panix.com> On Thu, Jun 06, 2002, Guido van Rossum wrote: > >>> class Element: >>> >>> def __str__(self): >>> pass >> >> Dunno about other people's opinions, but I have a strong distaste for >> creating methods whose body contains pass. I always use "raise >> NotImplementedError". > > But that has different semantics! Yes, exactly. My point was that one rarely wants the semantics of "pass" for method definitions, and that goes double or triple for the special methods such as __str__. Consider what happens to an application that calls str() on this object and gets back a None instead of a string. 
Blech -- errors should never pass silently. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I had lots of reasonable theories about children myself, until I had some." --Michael Rios From guido@python.org Thu Jun 6 14:19:46 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 09:19:46 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "06 Jun 2002 11:41:15 +0200." References: Message-ID: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> > What terrible things would happen if ob_size would be changed from int > to size_t? Binary incompatibility on 64-bit platforms, for one. Also, many other APIs would have to be changed: everything that takes or returns an int (e.g. PyObject_Size, PySequence_GetItem) would have to be changed, and again would be a binary incompatibility. Also could cause lots of compilation warnings when user code stores the result into an int. > The question recently came up on comp.lang.python, where the poster > noticed that you cannot mmap large files on a 64-bit system where int > is 32 bits; there is a 2Gib limit on the length of objects on his > specific system. That is indeed painful. > About the only problem I can see is that you could not store negative > numbers anymore. Is ssize_t universally available, or could be used on > systems where it is available? I've never heard of it, so it must be a relatively newfangled thing. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 14:07:29 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 09:07:29 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "06 Jun 2002 08:36:41 +0200."
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net> > > An occasional run through the 'interned' dict (in stringobject.c) > > looking for strings with refcount 2 would do this. Maybe something > > for the gc module do handle as a service whenever it runs its > > last-generation collection? > > This has the potential of breaking applications that remember the id() > of an interned string, instead of its value. Ow, good point! It's also quite possible that there are no outside references to an interned string, but another string with the same value still references the interned string from its ob_sinterned field. E.g. s = "frobnicate"*3 t = intern(s) del t To solve this, we would have to make the ob_sinterned slot count as a reference to the interned string. But then string_dealloc would be complicated (it would have to call Py_XDECREF(op->ob_sinterned)), possibly slowing things down. Is this worth it? The fear for unbounded growth of the interned strings table is pretty common amongst authors of serious long-running programs. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Jun 6 16:47:19 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 6 Jun 2002 11:47:19 -0400 Subject: Releasing the intern dictionary (was Re: [Python-Dev] d.get_key(key) -> key?) References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> <15614.55451.1832.105520@anthem.wooz.org> <200206061257.g56Cvfg06543@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15615.33799.915681.15874@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> What about exposing _Py_ReleaseInternedStrings() to Python, >> say, from the gc module? 
GvR> If it's going to be an exposed API, it will have to live in GvR> stringobject.c, since the 'interned' dict is a static global GvR> there. Actually, I don't think so, since _Py_ReleaseInternedStrings() is already an extern function. GvR> Wanna give it a crack? http://sourceforge.net/tracker/index.php?func=detail&aid=565378&group_id=5470&atid=305470 Doc changes and test case included. -Barry From gward@python.net Thu Jun 6 16:20:11 2002 From: gward@python.net (Greg Ward) Date: Thu, 6 Jun 2002 11:20:11 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: References: <20020601134236.GA17691@gerg.ca> Message-ID: <20020606152011.GA16829@gerg.ca> On 01 June 2002, Tim Peters said: > [Greg Ward, on wrapping text] > > ... > > Note that regrtest.py also has a wrapper: > > def printlist(x, width=70, indent=4): > """Print the elements of a sequence to stdout. > > Optional arg width (default 70) is the maximum line length. > Optional arg indent (default 4) is the number of blanks with which to > begin each line. > """ I think this one will probably stand; I've gotten to the point with my text-wrapping code where I'm reimplementing the various other text-wrappers people have mentioned on top of it, and regrtest.printlist() is just not a good fit. It's for printing lists compactly, not for filling text. Whatever. > Just make sure it handle the union of all possible desires, but has a simple > and intuitive interface . Right. Gotcha. Code coming up soon. Greg -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ Quick!! Act as if nothing has happened! 
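The core of any such wrapper is a greedy refill loop -- a toy version for comparison (it ignores tabs, hyphens, sentence endings, and long-word breaking, all of which a real TextWrapper has to handle):

```python
def simple_fill(text, width=70):
    # Greedy word wrapping: keep adding words to the current line
    # until the next word (plus its separating space) would exceed
    # width, then flush the line and start a new one.
    lines, current, length = [], [], 0
    for word in text.split():
        extra = len(word) + (1 if current else 0)
        if current and length + extra > width:
            lines.append(" ".join(current))
            current, length = [], 0
            extra = len(word)
        current.append(word)
        length += extra
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines)

wrapped = simple_fill("Support bacteria -- it's the only culture "
                      "some people have!", 28)
assert max(len(line) for line in wrapped.split("\n")) <= 28
assert " ".join(wrapped.split()) == ("Support bacteria -- it's the "
                                     "only culture some people have!")
```

A word longer than width simply lands on a line by itself here, which is the break_long_words=False behaviour.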
From gward@python.net Thu Jun 6 16:46:01 2002 From: gward@python.net (Greg Ward) Date: Thu, 6 Jun 2002 11:46:01 -0400 Subject: [Python-Dev] textwrap.py Message-ID: <20020606154601.GA16897@gerg.ca> --jI8keyz6grp/JLjh Content-Type: text/plain; charset=unknown-8bit Content-Disposition: inline Content-Transfer-Encoding: 8bit Hi all -- since my ISP seems to be taking a holiday today, I was able to polish off my proposed text-wrapping module. Of course, it'll sit in my mail queue until my link to the outside world is back, but never mind. Anyways, the code is attached. I don't care if this becomes textwrap.py, wraptext.py, text/wrap.py, or whatever -- let's concentrate on the code for now. I'll also attach my test script, so you can see what TextWrapper can and cannot do. Things to note: * The code is not locale-aware; it should be to detect sentence endings, which it needs to do to ensure that there are two spaces after each sentence ending. Eg. it fixes "I have eaten. And you?" but not "Moi, j'ai mangé. Et toi?" * The code is not Unicode-aware. I have no idea what will happen if you pass Unicode strings to it. * However, it is hyphen-aware. Please spend a few minutes gawping at the enormity of wordsep_re -- that took a while to get right. ;-) * Despite occasional complaints (hello, Jeremy and Neil), I still write "def foo (a, b)" rather than "def foo(a, b)". I'll (begrudgingly) fix this before checking anything in. * I'm not sure if exposing flags to make whitespace-munging optional is a good idea. Opinions? * need to convert the test suite to unittest (I guess) BTW, as an exercise I implemented Mailman's wrap() function on top of TextWrapper, and it seems to have worked fine. I tried to implement regrtest's printlist(), but it's not a good fit (as I mentioned in my last post). Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ Support bacteria -- it's the only culture some people have! 
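The hyphen-aware splitting Greg mentions can be seen in miniature before reading the attachment: the pattern below is the same wordsep_re regex from the attached module, wrapped in a small standalone helper (the helper's name and modern-Python syntax are this sketch's own, not part of the original post):

```python
import re

# Same pattern as the attached module's wordsep_re: a capturing group,
# so re.split() keeps the separators it matches.
wordsep = re.compile(r'(\s+|'                  # any whitespace
                     r'\w{2,}-(?=\w{2,})|'     # hyphenated words
                     r'(?<=\w)-{2,}(?=\w))')   # em-dash

def split_chunks(text):
    # drop the empty strings re.split() yields between adjacent matches
    return [chunk for chunk in wordsep.split(text) if chunk]

chunks = split_chunks("Hello there -- you goof-ball, use the -b option!")
# 'goof-ball,' is split after its hyphen; '--' and '-b' stay whole
```

Note how the lookahead/lookbehind assertions keep "-b" and a space-surrounded "--" intact while still allowing a break after "goof-".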
--jI8keyz6grp/JLjh Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="textwrap.py"

"""
Utilities for wrapping text strings and filling text paragraphs.
"""

__revision__ = "$Id$"

import string, re

# XXX is this going to be implemented properly somewhere in 2.3?
def islower (c):
    return c in string.lowercase


class TextWrapper:
    """
    Object for wrapping/filling text.  The public interface consists of
    the wrap() and fill() methods; the other methods are just there for
    subclasses to override in order to tweak the default behaviour.  If
    you want to completely replace the main wrapping algorithm, you'll
    probably have to override _wrap_chunks().

    Several instance attributes control various aspects of wrapping:
      expand_tabs
        if true (default), tabs in input text will be expanded to spaces
        before further processing.  Each tab will become 1 .. 8 spaces,
        depending on its position in its line.  If false, each tab is
        treated as a single character.
      replace_whitespace
        if true (default), all whitespace characters in the input text
        are replaced by spaces after tab expansion.  Note that if
        expand_tabs is false and replace_whitespace is true, every tab
        will be converted to a single space!
      break_long_words
        if true (default), words longer than the line width constraint
        will be broken.  If false, those words will not be broken, and
        some lines might be longer than the width constraint.
    """

    whitespace_trans = string.maketrans(string.whitespace,
                                        ' ' * len(string.whitespace))

    # This funky little regex is just the trick for splitting
    # text up into word-wrappable chunks.  E.g.
    #   "Hello there -- you goof-ball, use the -b option!"
    # splits into
    #   Hello/ /there/ /--/ /you/ /goof-/ball,/ /use/ /the/ /-b/ /option!
    # (after stripping out empty strings).
    wordsep_re = re.compile(r'(\s+|'                  # any whitespace
                            r'\w{2,}-(?=\w{2,})|'     # hyphenated words
                            r'(?<=\w)-{2,}(?=\w))')   # em-dash

    def __init__ (self):
        self.expand_tabs = 1
        self.replace_whitespace = 1
        self.break_long_words = 1


    # -- Private methods -----------------------------------------------
    # (possibly useful for subclasses to override)

    def _munge_whitespace (self, text):
        """_munge_whitespace(text : string) -> string

        Munge whitespace in text: expand tabs and convert all other
        whitespace characters to spaces.  Eg. " foo\tbar\n\nbaz"
        becomes " foo    bar  baz".
        """
        if self.expand_tabs:
            text = text.expandtabs()
        if self.replace_whitespace:
            text = text.translate(self.whitespace_trans)
        return text

    def _split (self, text):
        """_split(text : string) -> [string]

        Split the text to wrap into indivisible chunks.  Chunks are not
        quite the same as words; see wrap_chunks() for full details.
        As an example, the text
          Look, goof-ball -- use the -b option!
        breaks into the following chunks:
          'Look,', ' ', 'goof-', 'ball', ' ', '--', ' ',
          'use', ' ', 'the', ' ', '-b', ' ', 'option!'
        """
        chunks = self.wordsep_re.split(text)
        chunks = filter(None, chunks)
        return chunks

    def _fix_sentence_endings (self, chunks):
        """_fix_sentence_endings(chunks : [string])

        Correct for sentence endings buried in 'chunks'.  Eg. when the
        original text contains "... foo.\nBar ...", munge_whitespace()
        and split() will convert that to [..., "foo.", " ", "Bar", ...]
        which has one too few spaces; this method simply changes the one
        space to two.
        """
        i = 0
        while i < len(chunks)-1:
            # chunks[i] looks like the last word of a sentence,
            # and it's followed by a single space.
            if (chunks[i][-1] == "." and
                chunks[i+1] == " " and
                islower(chunks[i][-2])):
                chunks[i+1] = "  "
                i += 2
            else:
                i += 1

    def _handle_long_word (self, chunks, cur_line, cur_len, width):
        """_handle_long_word(chunks : [string], cur_line : [string],
                             cur_len : int, width : int)

        Handle a chunk of text (most likely a word, not whitespace) that
        is too long to fit in any line.
""" space_left = width - cur_len # If we're allowed to break long words, then do so: put as much # of the next chunk onto the current line as will fit. if self.break_long_words: cur_line.append(chunks[0][0:space_left]) chunks[0] = chunks[0][space_left:] # Otherwise, we have to preserve the long word intact. Only add # it to the current line if there's nothing already there -- # that minimizes how much we violate the width constraint. elif not cur_line: cur_line.append(chunks.pop(0)) # If we're not allowed to break long words, and there's already # text on the current line, do nothing. Next time through the # main loop of _wrap_chunks(), we'll wind up here again, but # cur_len will be zero, so the next line will be entirely # devoted to the long word that we can't handle right now. def _wrap_chunks (self, chunks, width): """_wrap_chunks(chunks : [string], width : int) -> [string] Wrap a sequence of text chunks and return a list of lines of length 'width' or less. (If 'break_long_words' is false, some lines may be longer than 'width'.) Chunks correspond roughly to words and the whitespace between them: each chunk is indivisible (modulo 'break_long_words'), but a line break can come between any two chunks. Chunks should not have internal whitespace; ie. a chunk is either all whitespace or a "word". Whitespace chunks will be removed from the beginning and end of lines, but apart from that whitespace is preserved. """ lines = [] while chunks: cur_line = [] # list of chunks (to-be-joined) cur_len = 0 # length of current line # First chunk on line is whitespace -- drop it. if chunks[0].strip() == '': del chunks[0] while chunks: l = len(chunks[0]) # Can at least squeeze this chunk onto the current line. if cur_len + l <= width: cur_line.append(chunks.pop(0)) cur_len += l # Nope, this line is full. else: break # The current line is full, and the next chunk is too big to # fit on *any* line (not just this one). 
            if chunks and len(chunks[0]) > width:
                self._handle_long_word(chunks, cur_line, cur_len, width)

            # If the last chunk on this line is all whitespace, drop it.
            if cur_line and cur_line[-1].strip() == '':
                del cur_line[-1]

            # Convert current line back to a string and store it in list
            # of all lines (return value).
            if cur_line:
                lines.append(''.join(cur_line))

        return lines


    # -- Public interface ----------------------------------------------

    def wrap (self, text, width):
        """wrap(text : string, width : int) -> [string]

        Split 'text' into multiple lines of no more than 'width'
        characters each, and return the list of strings that results.
        Tabs in 'text' are expanded with string.expandtabs(), and all
        other whitespace characters (including newline) are converted to
        space.
        """
        text = self._munge_whitespace(text)
        if len(text) <= width:
            return [text]
        chunks = self._split(text)
        self._fix_sentence_endings(chunks)
        return self._wrap_chunks(chunks, width)

    def fill (self, text, width, initial_tab="", subsequent_tab=""):
        """fill(text : string, width : int,
                initial_tab : string = "",
                subsequent_tab : string = "") -> string

        Reformat the paragraph in 'text' to fit in lines of no more than
        'width' columns.  The first line is prefixed with 'initial_tab',
        and subsequent lines are prefixed with 'subsequent_tab'; the
        lengths of the tab strings are accounted for when wrapping lines
        to fit in 'width' columns.
""" lines = self.wrap(text, width) sep = "\n" + subsequent_tab return initial_tab + sep.join(lines) # Convenience interface _wrapper = TextWrapper() def wrap (text, width): return _wrapper.wrap(text, width) def fill (text, width, initial_tab="", subsequent_tab=""): return _wrapper.fill(text, width, initial_tab, subsequent_tab) --jI8keyz6grp/JLjh Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="test_textwrap.py" #!/usr/bin/env python from textwrap import TextWrapper num = 0 def test (result, expect): global num num += 1 if result == expect: print "%d: ok" % num else: print "%d: not ok, expected:" % num for i in range(len(expect)): print " %d: %r" % (i, expect[i]) print "but got:" for i in range(len(result)): print " %d: %r" % (i, result[i]) wrapper = TextWrapper() wrap = wrapper.wrap # Simple case: just words, spaces, and a bit of punctuation. t = "Hello there, how are you this fine day? I'm glad to hear it!" test(wrap(t, 12), ["Hello there,", "how are you", "this fine", "day? I'm", "glad to hear", "it!"]) test(wrap(t, 42), ["Hello there, how are you this fine day?", "I'm glad to hear it!"]) test(wrap(t, 80), [t]) # Whitespace munging and end-of-sentence detection. t = """\ This is a paragraph that already has line breaks. But some of its lines are much longer than the others, so it needs to be wrapped. Some lines are \ttabbed too. What a mess! """ test(wrap(t, 45), ["This is a paragraph that already has line", "breaks. But some of its lines are much", "longer than the others, so it needs to be", "wrapped. Some lines are tabbed too. What a", "mess!"]) # Wrapping to make short lines longer. t = "This is a\nshort paragraph." test(wrap(t, 20), ["This is a short", "paragraph."]) test(wrap(t, 40), ["This is a short paragraph."]) # Test breaking hyphenated words. 
t = "this-is-a-useful-feature-for-reformatting-posts-from-tim-peters'ly"
test(wrap(t, 40), ["this-is-a-useful-feature-for-",
                   "reformatting-posts-from-tim-peters'ly"])
test(wrap(t, 41), ["this-is-a-useful-feature-for-",
                   "reformatting-posts-from-tim-peters'ly"])
test(wrap(t, 42), ["this-is-a-useful-feature-for-reformatting-",
                   "posts-from-tim-peters'ly"])

# Ensure that the standard _split() method works as advertised in
# the comments (don't you hate it when code and comments diverge?).
t = "Hello there -- you goof-ball, use the -b option!"
test(wrapper._split(t),
     ["Hello", " ", "there", " ", "--", " ", "you", " ", "goof-",
      "ball,", " ", "use", " ", "the", " ", "-b", " ", "option!"])

text = '''
Did you say "supercalifragilisticexpialidocious?"
How *do* you spell that odd word, anyways?
'''
# XXX sentence ending not detected because of quotes
test(wrap(text, 30), ['Did you say "supercalifragilis',
                      'ticexpialidocious?" How *do*',
                      'you spell that odd word,',
                      'anyways?'])
test(wrap(text, 50), ['Did you say "supercalifragilisticexpialidocious?"',
                      'How *do* you spell that odd word, anyways?'])

wrapper.break_long_words = 0
test(wrap(text, 30), ['Did you say',
                      '"supercalifragilisticexpialidocious?"',
                      'How *do* you spell that odd',
                      'word, anyways?'])

--jI8keyz6grp/JLjh--

From aahz@pythoncraft.com Thu Jun 6 17:15:42 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 6 Jun 2002 12:15:42 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <20020606154601.GA16897@gerg.ca> References: <20020606154601.GA16897@gerg.ca> Message-ID: <20020606161541.GA26647@panix.com> On Thu, Jun 06, 2002, Greg Ward wrote: > > * The code is not locale-aware; it should be to detect sentence > endings, which it needs to do to ensure that there are two spaces > after each sentence ending. Eg. it fixes > "I have eaten. And you?" > but not > "Moi, j'ai mangé. Et toi?" It should fix neither. However, it should preserve sentence endings: "I have eaten.  And you?" becomes "I have eaten.  And you?"
Writing the algorithm this way should require no locale-dependent code. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I had lots of reasonable theories about children myself, until I had some." --Michael Rios From loewis@informatik.hu-berlin.de Thu Jun 6 18:44:41 2002 From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=) Date: 06 Jun 2002 19:44:41 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Binary incompatibility on 64-bit platforms, for one. Isn't Python 2.3 breaking binary compatibility, anyway, so that the PYTHON_API_VERSION must be bumped? Regards, Martin From guido@python.org Thu Jun 6 18:48:00 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 13:48:00 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "06 Jun 2002 19:44:41 +0200." References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206061748.g56Hm0k15221@odiug.zope.com> > > Binary incompatibility on 64-bit platforms, for one. > > Isn't Python 2.3 breaking binary compatibility, anyway, so that the > PYTHON_API_VERSION must be bumped? Maybe, but we're still trying to be as compatible as possible -- sometimes it helps. What about my other objections? --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Thu Jun 6 19:06:26 2002 From: neal@metaslash.com (Neal Norwitz) Date: Thu, 06 Jun 2002 14:06:26 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility Message-ID: <3CFFA4A2.9C2D9313@metaslash.com> This is a multi-part message in MIME format. 
--------------93293161D9DE985912DCB2D9 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit

Since the subject has come up several times recently, and someone (Walter?) suggested a PEP be written....here goes.

Attached is a draft PEP. Comments?

Neal

--------------93293161D9DE985912DCB2D9 Content-Type: text/plain; charset=us-ascii; name="pep-nn.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="pep-nn.txt"

PEP: XXX
Title: Backward Compatibility for Standard Library
Version: $Revision:$
Last-Modified: $Date:$
Author: neal@metaslash.com (Neal Norwitz)
Status: Draft
Type: Informational
Created: 06-Jun-2002
Post-History:
Python-Version: 2.3

Abstract

    This PEP describes the packages and modules in the standard
    library which should remain backward compatible with previous
    versions of Python.

Rationale

    Authors have various reasons why packages and modules should
    continue to work with previous versions of Python.  In order to
    maintain backward compatibility for these modules while moving the
    rest of the standard library forward, it is necessary to know
    which modules can be modified and which should use old and
    possibly deprecated features.

    Generally, authors should attempt to keep changes backward
    compatible with the previous released version of Python in order
    to make bug fixes easier to backport.

Backward Compatible Packages & Modules

    Package/Module Maintainer(s)    Python Version
    -------------- -------------    --------------
    distutils      Andrew Kuchling  1.5.2
    email          Barry Warsaw     2.1
    sre            Fredrik Lundh    1.5.2
    xml (PyXML)    Martin v. Loewis 2.0

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--------------93293161D9DE985912DCB2D9--

From loewis@informatik.hu-berlin.de Thu Jun 6 19:16:10 2002 From: loewis@informatik.hu-berlin.de (Martin v.
=?iso-8859-1?q?L=F6wis?=) Date: 06 Jun 2002 20:16:10 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib getopt.py,1.17,1.18 In-Reply-To: <3CFF60F9.48B334FF@metaslash.com> References: <3CFF60F9.48B334FF@metaslash.com> Message-ID: Neal Norwitz writes: > For additions to the stdlib, should we try to make sure new features > are used? In the above code, type(longopts) ... -> > isinstance(longopts, str) (or basestring?) and all_options_first > could be a bool. Done. It really should be isinstance(longopts, str), since Unicode in command line options is not yet supported (although they should be, since, on Windows, command line options are "natively" Unicode). Regards, Martin From guido@python.org Thu Jun 6 19:22:32 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 14:22:32 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 14:06:26 EDT." <3CFFA4A2.9C2D9313@metaslash.com> References: <3CFFA4A2.9C2D9313@metaslash.com> Message-ID: <200206061822.g56IMWT23115@odiug.zope.com> > Since the subject has come up several times recently, > and some one (Walter?) suggested a PEP be written....here goes. > > Attached is a draft PEP. Comments? Good idea. Maybe you should mention some of the most common things you need to avoid to preserve backwards compatibility with 1.5.2, 2.0, 2.1? Without trying for completeness: For 1.5.2 (these were introduced in 2.0): string methods, unicode, augmented assignment, list comprehensions, zip(), dict.setdefault(), print >>f, calling f(*args), plus all of the following. For 2.0 (introduced in 2.1): nested scopes with future statement, rich comparisons, function attributes, plus all of the following. For 2.1 (introduced in 2.2): new-style classes, iterators, generators with future statement, nested scopes without future statement, plus all of the following. For 2.2 (introduced in 2.3): generators without future statement, bool (what else?).
--Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Thu Jun 6 19:26:30 2002 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 06 Jun 2002 20:26:30 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> Message-ID: <3CFFA956.4000004@livinglogic.de> Neal Norwitz wrote: > Since the subject has come up several times recently, > and some one (Walter?) suggested a PEP be written....here goes. It was Thomas Heller on http://www.python.org/sf/561478 > [...] > Package/Module Maintainer(s) Python Version > -------------- ------------- -------------- Tools/freeze/modulefinder.py Thomas Heller 1.5.2 Bye, Walter Dörwald From fredrik@pythonware.com Thu Jun 6 19:25:46 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 6 Jun 2002 20:25:46 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> Message-ID: <019a01c20d87$95ac5380$ced241d5@hagrid> neal wrote: > Backward Compatible Packages & Modules > > Package/Module Maintainer(s) Python Version > -------------- ------------- -------------- > distutils Andrew Kuchling 1.5.2 > email Barry Warsaw 2.1 > sre Fredrik Lundh 1.5.2 + xmlrpclib Fredrik Lundh 1.5.2 (the code says 1.5.1, but I don't think I've tested that in quite a while...) > xml (PyXML) Martin v. Loewis 2.0 From thomas.heller@ion-tof.com Thu Jun 6 19:24:19 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 20:24:19 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> Message-ID: <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> > Since the subject has come up several times recently, > and some one (Walter?) suggested a PEP be written....here goes. It was me (if you mean the comment on bug 561478), but who cares... > > Attached is a draft PEP. Comments? 
Since it may become impossible in the future to remain backward compatible, should there be a (planned) Python version which no longer maintains backwards compatibility? > Package/Module Maintainer(s) Python Version > -------------- ------------- -------------- tools/scripts/freeze/modulefinder ??? 1.5.2 Thomas From loewis@informatik.hu-berlin.de Thu Jun 6 19:31:10 2002 From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=) Date: 06 Jun 2002 20:31:10 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <200206061748.g56Hm0k15221@odiug.zope.com> References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> Message-ID: Guido van Rossum writes: > What about my other objections? Besides "breaks binary compatibility", the only other objection was: > Also could cause lots of compilation warnings when user code stores > the result into an int. True; this would be a migration issue. To be safe, we probably would define Py_size_t (or Py_ssize_t). People on 32-bit platforms would not notice the problems; people on 64-bit platforms would soon provide patches to use Py_ssize_t in the core. That is a lot of work, so it requires careful planning, but I believe this needs to be done sooner or later. Given MAL's and your response, I already accepted that it would likely be done rather later than sooner. I don't agree with MAL's objection > Wouldn't it be easier to solve this particular problem in > the type used for mmapping files ? Sure, it would be faster and easier, but that is the dark side of the force. People will find that they cannot have string objects with more than 2 GiB one day, too, and, perhaps somewhat later, that they cannot have more than 2 milliard objects in a list. It is unlikely that the problem will go away, so at some point, all the problems will become pressing.
It is perfectly reasonable to defer the binary breakage to that later point, except that probably more users will be affected in the future than would be affected now (because of the current rareness of 64-bit Python installations). Regards, Martin From guido@python.org Thu Jun 6 19:32:30 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 14:32:30 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 20:24:19 +0200." <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> Message-ID: <200206061832.g56IWU523219@odiug.zope.com> > Since it may become impossible in the future to remain backward > compatibility, should there be a (planned) Python version > which no longer maintains backwards compatibility? That would be 3.0. Of course minor incompatibilities creep in at each new release. > > Package/Module Maintainer(s) Python Version > > -------------- ------------- -------------- > tools/scripts/freeze/modulefinder ??? 1.5.2 Can I ask once more why? --Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Thu Jun 6 19:34:57 2002 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 06 Jun 2002 20:34:57 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> Message-ID: <3CFFAB51.4050508@livinglogic.de> Thomas Heller wrote: >>Since the subject has come up several times recently, >>and some one (Walter?) suggested a PEP be written....here goes. > > It was me (if you mean the comment on bug 561478), but who cares... > >>Attached is a draft PEP. Comments? > > > Since it may become impossible in the future to remain backward > compatibility, should there be a (planned) Python version > which no longer maintains backwards compatibility? 
> > >> Package/Module Maintainer(s) Python Version >> -------------- ------------- -------------- > > tools/scripts/freeze/modulefinder ??? 1.5.2 Ouch, I misinterpreted your comment on bug #561478. So who is ???. Bye, Walter Dörwald From thomas.heller@ion-tof.com Thu Jun 6 19:37:33 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 20:37:33 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> Message-ID: <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> > > Since it may become impossible in the future to remain backward > > compatibility, should there be a (planned) Python version > > which no longer maintains backwards compatibility? > > That would be 3.0. > > Of course minor incompatibilities creep in at each new release. > > > > Package/Module Maintainer(s) Python Version > > > -------------- ------------- -------------- > > tools/scripts/freeze/modulefinder ??? 1.5.2 > > Can I ask once more why? > I use it in py2exe, and this still supports 1.5.2. Thomas From thomas.heller@ion-tof.com Thu Jun 6 19:38:05 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 20:38:05 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <3CFFAB51.4050508@livinglogic.de> Message-ID: <133001c20d89$4bc7ef70$e000a8c0@thomasnotebook> > >> Package/Module Maintainer(s) Python Version > >> -------------- ------------- -------------- > > > > tools/scripts/freeze/modulefinder ??? 1.5.2 > > Ouch, I misinterpreted your comment on bug #561478. > > So who is ???. > Maybe 'Thomas Heller et al.' 
Thomas From guido@python.org Thu Jun 6 19:44:21 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 14:44:21 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "06 Jun 2002 20:31:10 +0200." References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> Message-ID: <200206061844.g56IiMY23310@odiug.zope.com> > Besides "breaks binary compatibility", the only other objection was: > > > Also could cause lots of compilation warnings when user code stores > > the result into an int. > > True; this would be a migration issue. To be safe, we probably would > define Py_size_t (or Py_ssize_t). People on 32-bit platforms would not > notice the problems; people on 64-bit platforms would soon provide > patches to use Py_ssize_t in the core. > > That is a lot of work, so it requires careful planning, but I believe > this needs to be done sooner or later. Given MAL's and your response, > I already accepted that it would likely be done rather later than > sooner. Perhaps we could introduce a new signed type in 2.3 that's implemented as an int, and switch it to something of the same size as size_t in a later revision. > I don't agree with MAL's objection > > > Wouldn't it be easier to solve this particular problem in > > the type used for mmapping files ? > > Sure, it would be faster and easier, but that is the dark side of the > force. People will find that they cannot have string objects with more > than 2Gib one day, too, and, perhaps somewhat later, that they cannot > have more than 2 milliard objects in a list. What's a milliard? Seriously, I think the problem for this "solution" would be that you can't use index notation on an mmap object, because PySequence_GetSlice takes two int args. I'm not very concerned about strings or lists with more than 2GB items, but I am concerned about other memory buffers. 
> It is unlikely that the problem will go away, so at some point, all > the problems will become pressing. It is perfectly reasonable to defer > the binary breakage to that later point, except that probably more > users will be affected in the future than would be affected now > (because of the current rareness of 64-bit Python installations). So we should be planning now. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 19:51:30 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 14:51:30 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 20:37:33 +0200." <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> Message-ID: <200206061851.g56IpUC23357@odiug.zope.com> > > > tools/scripts/freeze/modulefinder ??? 1.5.2 I think the maintainer is Mark Hammond. I doubt he cares about 1.5.2 compatibility though. > > Can I ask once more why? > > > I use it in py2exe, and this still supports 1.5.2. Can you elaborate? Can't you include the last version of modulefinder.py that supports 1.5.2 in your py2exe distro? Or run py2exe with a 1.5.2 python? It seems to me that modulefinder.py depends on the dis.py module of the current Python -- how can you use a modulefinder.py from Python 2.x for a Python 1.5.2 program? --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Thu Jun 6 20:03:04 2002 From: neal@metaslash.com (Neal Norwitz) Date: Thu, 06 Jun 2002 15:03:04 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <200206061822.g56IMWT23115@odiug.zope.com> Message-ID: <3CFFB1E8.F60E3F4@metaslash.com> This is a multi-part message in MIME format. 
--------------6729E0A908F3290E4EFD95F5 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit

Guido van Rossum wrote: > Maybe you should mention some of the most common things you need to avoid > to preserve backwards compatibility with 1.5.2, 2.0, 2.1? Updated version attached. Not sure if the Tools should remain in there. Neal

--------------6729E0A908F3290E4EFD95F5 Content-Type: text/plain; charset=us-ascii; name="pep-nn.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="pep-nn.txt"

PEP: XXX
Title: Backward Compatibility for Standard Library
Version: $Revision:$
Last-Modified: $Date:$
Author: neal@metaslash.com (Neal Norwitz)
Status: Draft
Type: Informational
Created: 06-Jun-2002
Post-History:
Python-Version: 2.3

Abstract

    This PEP describes the packages and modules in the standard
    library which should remain backward compatible with previous
    versions of Python.

Rationale

    Authors have various reasons why packages and modules should
    continue to work with previous versions of Python.  In order to
    maintain backward compatibility for these modules while moving the
    rest of the standard library forward, it is necessary to know
    which modules can be modified and which should use old and
    possibly deprecated features.

    Generally, authors should attempt to keep changes backward
    compatible with the previous released version of Python in order
    to make bug fixes easier to backport.

Features to Avoid

    The following list contains common features to avoid in order to
    maintain backward compatibility with each version of Python.  This
    list is not complete!  It is only meant as a general guide.

    Note the features to avoid were implemented in the following
    version.  For example, features listed next to 1.5.2 were
    implemented in 2.0.
    Version Features
    ------- --------
    1.5.2   string methods, Unicode, list comprehensions, augmented
            assignment (eg, +=), zip(), import x as y,
            dict.setdefault(), print >> f, calling f(*args, **kw),
            plus 2.0 features

    2.0     nested scopes, rich comparisons, function attributes,
            plus 2.1 features

    2.1     use of object or new-style classes, iterators, using
            generators, nested scopes, or // without from __future__
            import ... statement, plus 2.2 features

    2.2     bool, True, False, basestring, enumerate(), {}.pop(),
            PendingDeprecationWarning, Universal Newlines,
            plus 2.3 features

Backward Compatible Packages, Modules, and Tools

    Package/Module Maintainer(s)    Python Version
    -------------- -------------    --------------
    compiler       Jeremy Hylton    2.1
    distutils      Andrew Kuchling  1.5.2
    email          Barry Warsaw     2.1
    sre            Fredrik Lundh    1.5.2
    xml (PyXML)    Martin v. Loewis 2.0
    xmlrpclib      Fredrik Lundh    1.5.2

    Tool                        Maintainer(s) Python Version
    ----                        ------------- --------------
    scripts/freeze/modulefinder Thomas Heller 1.5.2

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--------------6729E0A908F3290E4EFD95F5--

From thomas.heller@ion-tof.com Thu Jun 6 20:06:53 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 21:06:53 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> Message-ID: <137201c20d8d$5226a790$e000a8c0@thomasnotebook> > Can you elaborate? Can't you include the last version of > modulefinder.py that supports 1.5.2 in your py2exe distro? Or run > py2exe with a 1.5.2 python?
It seems to me that modulefinder.py > depends on the dis.py module of the current Python -- how can you use > a modulefinder.py from Python 2.x for a Python 1.5.2 program? > First, I want to use a version-independent modulefinder in py2exe, if possible. Second, it seems to work from 1.5.2 up to 2.2, currently. Except for a single use of a string method someone overlooked probably, see http://www.python.org/sf/564840. modulefinder simply uses some opnames from dis, and all seem to be present already in 1.5.2. Thomas From guido@python.org Thu Jun 6 20:08:34 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 15:08:34 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 15:03:04 EDT." <3CFFB1E8.F60E3F4@metaslash.com> References: <3CFFA4A2.9C2D9313@metaslash.com> <200206061822.g56IMWT23115@odiug.zope.com> <3CFFB1E8.F60E3F4@metaslash.com> Message-ID: <200206061908.g56J8YZ23563@odiug.zope.com> > Updated version attached. Not sure if the Tools should remain in there. Go ahead and check it in as PEP 291. (PEP 290 is reserved for RaymondH's Migration Guide.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 6 20:10:52 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 15:10:52 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 21:06:53 +0200." <137201c20d8d$5226a790$e000a8c0@thomasnotebook> References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> Message-ID: <200206061910.g56JAq023595@odiug.zope.com> > > Can you elaborate? Can't you include the last version of > > modulefinder.py that supports 1.5.2 in your py2exe distro?
Or run > > py2exe with a 1.5.2 python? It seems to me that modulefinder.py > > depends on the dis.py module of the current Python -- how can you use > > a modulefinder.py from Python 2.x for a Python 1.5.2 program? > > > First, I want to use a version-independent modulefinder in py2exe, > if possible. > Second, it seems to work from 1.5.2 up to 2.2, currently. Except > for a single use of a string method someone overlooked probably, > see http://www.python.org/sf/564840. > > modulefinder simply uses some opnames from dis, and all seem to be > present already in 1.5.2. So why can't you include a copy of modulefinder.py in your distro? You seem to be using it beyond its intended use. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Jun 6 20:15:26 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 21:15:26 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> Message-ID: <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> > > > Can you elaborate? Can't you include the last version of > > > modulefinder.py that supports 1.5.2 in your py2exe distro? Or run > > > py2exe with a 1.5.2 python? It seems to me that modulefinder.py > > > depends on the dis.py module of the current Python -- how can you use > > > a modulefinder.py from Python 2.x for a Python 1.5.2 program? > > > > > First, I want to use a version-independent modulefinder in py2exe, > > if possible. > > Second, it seems to work from 1.5.2 up to 2.2, currently. Except > > for a single use of a string method someone overlooked probably, > > see http://www.python.org/sf/564840.
> > > > modulefinder simply uses some opnames from dis, and all seem to be > > present already in 1.5.2. > > So why can't you include a copy of modulefinder.py in your distro? That's what I'm doing. I just want to keep it up-to-date with the latest and greatest version from the Python distro with the minimum effort. > You seem to be using it beyond its intended use. > You mean the cross-version use? As I said, it works nice. Anyway, if there is a strong reason to do so, it can be removed from PEP 291 - but string methods and booleans aren't such a reason (IMO). Thomas From guido@python.org Thu Jun 6 20:25:23 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 15:25:23 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 21:15:26 +0200." <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> Message-ID: <200206061925.g56JPN623651@odiug.zope.com> > > So why can't you include a copy of modulefinder.py in your distro? > That's what I'm doing. I just want to keep it up-to-date with > the latest and greatest version from the Python distro with the minimum > effort. The solution is simple. Just don't pull the new copy from the next Python release. > > You seem to be using it beyond its intended use. > You mean the cross-version use? As I said, it works nice. No, I meant that this module is part of the freeze tool. That has no requirement to be backwards compatible, since each Python version comes with its own version of freeze. 
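The dis dependency Thomas describes is small: modulefinder only needs opcode names and numbers so it can scan bytecode for import instructions. A sketch of that idea using today's dis API (the function name and the modern dis.get_instructions() call are illustrative, not what the 2002 code did):

```python
import dis

# modulefinder-style use of dis: look up an opcode number by name,
# then scan a code object's bytecode for import instructions.
IMPORT_NAME = dis.opmap["IMPORT_NAME"]

def imports_in(code):
    # Return the names pulled in by IMPORT_NAME opcodes (sketch only;
    # the real modulefinder also walks nested code objects).
    return [ins.argval for ins in dis.get_instructions(code)
            if ins.opcode == IMPORT_NAME]

code = compile("import os, sys", "<example>", "exec")
print(imports_in(code))  # ['os', 'sys']
```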
Suppose that the .pyc file format changes in a backwards incompatible way (we're considering this too) and suppose modulefinder has to be changed. I think it should be possible to do that without consideration for older Python versions. > Anyway, if there is a strong reason to do so, it can be > removed from PEP 291 - but string methods and booleans > aren't such a reason (IMO). I think it should be removed. I want to avoid having random claims for backwards compatibility of arbitrary parts of the Python distribution, because the more of these we have, the more constrained we are as maintainers. The other cases are all packages that are being distributed separately by their maintainers for use with older Python versions. I think your use case is considerably different -- you are simply borrowing a module. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Jun 6 20:38:09 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 21:38:09 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com> Message-ID: <143201c20d91$b02be180$e000a8c0@thomasnotebook> > > > You seem to be using it beyond its intended use. > > > You mean the cross-version use? As I said, it works nice. > > No, I meant that this module is part of the freeze tool. That has no > requirement to be backwards compatible, since each Python version > comes with its own version of freeze. 
Suppose that the .pyc file > format changes in a backwards incompatible way (we're considering this > too) and suppose modulefinder has to be changed. I think it should be > possible to do that without consideration for older Python versions. In this case I propose to add it to the standard library (or maybe Gordon's mf replacement, together with iu, his imputil replacement ?). > > > Anyway, if there is a strong reason to do so, it can be > > removed from PEP 291 - but string methods and booleans > > aren't such a reason (IMO). > > I think it should be removed. I want to avoid having random claims > for backwards compatibility of arbitrary parts of the Python > distribution, because the more of these we have, the more constrained > we are as maintainers. > Ok. > The other cases are all packages that are being distributed separately > by their maintainers for use with older Python versions. I think your > use case is considerably different -- you are simply borrowing a > module. > Thomas From guido@python.org Thu Jun 6 20:49:51 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 15:49:51 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 21:38:09 +0200." <143201c20d91$b02be180$e000a8c0@thomasnotebook> References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com> <143201c20d91$b02be180$e000a8c0@thomasnotebook> Message-ID: <200206061949.g56JnpZ30419@odiug.zope.com> > > Suppose that the .pyc file > > format changes in a backwards incompatible way (we're considering this > > too) and suppose modulefinder has to be changed. 
I think it should be > > possible to do that without consideration for older Python versions. > In this case I propose to add it to the standard library That's a non-sequitur. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Jun 6 20:53:17 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Jun 2002 21:53:17 +0200 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com> <143201c20d91$b02be180$e000a8c0@thomasnotebook> <200206061949.g56JnpZ30419@odiug.zope.com> Message-ID: <144e01c20d93$cd504a60$e000a8c0@thomasnotebook> > > > Suppose that the .pyc file > > > format changes in a backwards incompatible way (we're considering this > > > too) and suppose modulefinder has to be changed. I think it should be > > > possible to do that without consideration for older Python versions. > > > In this case I propose to add it to the standard library > > That's a non-sequitur. I don't understand what you mean. What I mean is: If modulefinder provides functionality outside the freeze tool, and if it is maintained anyway, why not add it to the library? Thomas From guido@python.org Thu Jun 6 21:01:16 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 16:01:16 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility In-Reply-To: Your message of "Thu, 06 Jun 2002 21:53:17 +0200." 
<144e01c20d93$cd504a60$e000a8c0@thomasnotebook> References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com> <143201c20d91$b02be180$e000a8c0@thomasnotebook> <200206061949.g56JnpZ30419@odiug.zope.com> <144e01c20d93$cd504a60$e000a8c0@thomasnotebook> Message-ID: <200206062001.g56K1G730686@odiug.zope.com> > > > > Suppose that the .pyc file > > > > format changes in a backwards incompatible way (we're considering this > > > > too) and suppose modulefinder has to be changed. I think it should be > > > > possible to do that without consideration for older Python versions. > > > > > In this case I propose to add it to the standard library > > > > That's a non-sequitur. > > I don't understand what you mean. > What I mean is: > If modulefinder provides functionality outside the freeze tool, > and if it is maintained anyway, why not add it to the library? My "Suppose that..." was in the context of the freeze tool. Since modulefinder.py looks in .pyc files, it has to track the .pyc file format, hence it cannot be required to be 1.5.2 compatible. Only *you* are claiming that it is useful outside the freeze tool. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Jun 6 21:02:25 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 06 Jun 2002 22:02:25 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> Message-ID: <3CFFBFD1.1050305@lemburg.com> Martin v. Löwis wrote: > Guido van Rossum writes: > >>What about my other objections?
> Besides "breaks binary compatibility", the only other objection was: >>Also could cause lots of compilation warnings when user code stores >>the result into an int. > True; this would be a migration issue. To be safe, we probably would > define Py_size_t (or Py_ssize_t). People on 32-bit platforms would not > notice the problems; people on 64-bit platforms would soon provide > patches to use Py_ssize_t in the core. > That is a lot of work, so it requires careful planning, but I believe > this needs to be done sooner or later. Given MAL's and your response, > I already accepted that it would likely be done rather later than > sooner. > I don't agree with MAL's objection Not that I would be surprised ;-)... but which one ? >>Wouldn't it be easier to solve this particular problem in >>the type used for mmapping files ? > Sure, it would be faster and easier, but that is the dark side of the > force. People will find that they cannot have string objects with more > than 2Gib one day, too, and, perhaps somewhat later, that they cannot > have more than 2 milliard objects in a list. > It is unlikely that the problem will go away, so at some point, all > the problems will become pressing. It is perfectly reasonable to defer > the binary breakage to that later point, except that probably more > users will be affected in the future than would be affected now > (because of the current rareness of 64-bit Python installations). Why not leave this for Py3K when 64-bit platforms will have become common enough to make this a real need (I doubt that anyone is using 1GB Python strings nowadays without getting MemoryErrors :-). Until then, I'd rather like to see the file IO APIs and related types fixed so that they can handle 2GB files all the way through.
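The seek()/tell() half of this concern can be checked cheaply today: seek past the signed-32-bit boundary in a sparse temporary file instead of actually writing two gigabytes. A sketch, assuming a filesystem that supports sparse files:

```python
import tempfile

# Seek past the 2**31 boundary and verify positions round-trip as
# exact integers; the file stays sparse, so no real 2 GB is written.
with tempfile.TemporaryFile() as f:
    offset = 2**31 + 17          # just past the signed-32-bit limit
    assert f.seek(offset) == offset
    assert f.tell() == offset
    f.write(b"x")                # extends the (sparse) file
    assert f.tell() == offset + 1
```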
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From guido@python.org Thu Jun 6 21:04:47 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 16:04:47 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "Thu, 06 Jun 2002 22:02:25 +0200." <3CFFBFD1.1050305@lemburg.com> References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> Message-ID: <200206062004.g56K4lS30831@odiug.zope.com> > Until then, I'd rather like to see the file IO APIs and related > types fixed so that they can handle 2GB files all the way > through. Which file IO APIs need to be fixed? I thought we supported large files already (when the OS supports them)? --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Jun 6 20:12:36 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 6 Jun 2002 15:12:36 -0400 Subject: [Python-Dev] trace.py and the obscurity of Tools/scripts/ In-Reply-To: <15543.28437.909937.768107@slothrop.zope.com> References: <15541.51980.403233.710018@anthem.wooz.org> <15541.53473.786848.71301@anthem.wooz.org> <15543.27943.6250.555793@12-248-41-177.client.attbi.com> <15543.28437.909937.768107@slothrop.zope.com> Message-ID: <15615.46116.832422.534079@grendel.zope.com> [cleaning out some old mail...] Zooko wrote: > So in terms of `trace.py', it is a widely useful tool and > already has a programmatic interface. Being added to the > hallowed Python Standard Library would be a major step up in > publicity and hence usage. It would require better docs > regarding the programmatic usage. Skip wrote: > Its speed cries out for a rewrite of some sort.
I haven't > thought about it, but I wonder if it could be layered on top of > hotshot. Jeremy Hylton writes: > Can any of the handler methods be re-coded in C? The hotshot changes > allow you to install a C function as a trace hook, but doesn't the > trace function in trace.py do a fair amount of work? Another possibility is to use _hotshot.coverage() instead of the "normal" profiler in HotShot. This records the enter/exit events and SET_LINENO instructions, but doesn't take any timing measurements, so operates much quicker than profiling. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin@v.loewis.de Thu Jun 6 21:16:04 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 06 Jun 2002 22:16:04 +0200 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net> References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > To solve this, we would have to make the ob_sinterned slot count as a > reference to the interned string. But then string_dealloc would be > complicated (it would have to call Py_XDECREF(op->ob_sinterned)), > possibly slowing things down. > > Is this worth it? That (latter) change seems "right" regardless of whether interned strings are ever released. > The fear for unbounded growth of the interned strings table is > pretty common amongst authors of serious long-running programs. I think it is. Unbounded growth of interned strings is precisely the reason why the XML libraries repeatedly came up with their own interning dictionaries, which only persist for the lifetime of parsing the document, since the next document may want to intern entirely different things. This is the reason that the intern() function is bad to use for most applications.
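The property being paid for here is guaranteed sharing: once interned, equal strings are the same object, so identity comparison works. A sketch with today's spelling (in the Python of this thread intern() was a builtin; it later moved to sys.intern):

```python
import sys

# Two dynamically built, equal strings are normally separate objects.
a = "-".join(["xml", "attribute"])
b = "-".join(["xml", "attribute"])
assert a == b

# After interning both, they are one shared object, so "is" works.
# The cost Martin describes: the interned table keeps such strings
# alive (for the interpreter's lifetime, in the Python of this era),
# which is why the XML parsers used per-document interning dicts.
a = sys.intern(a)
b = sys.intern(b)
assert a is b
```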
Regards, Martin From mal@lemburg.com Thu Jun 6 21:25:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 06 Jun 2002 22:25:11 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> Message-ID: <3CFFC527.1090907@lemburg.com> Guido van Rossum wrote: >>Until then, I'd rather like to see the file IO APIs and related >>types fixed so that they can handle 2GB files all the way >>through. > > > Which file IO APIs need to be fixed? I thought we supported large > files already (when the OS supports them)? The file object does, but the mmap module doesn't, and it is not clear to me whether all code in the standard lib can actually deal with file positions outside the int range (most code probably doesn't care, since it uses .read() and .write() exclusively), e.g. can SRE scan mmapped files of such size ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From guido@python.org Thu Jun 6 21:33:57 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 16:33:57 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "Thu, 06 Jun 2002 22:25:11 +0200." <3CFFC527.1090907@lemburg.com> References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> Message-ID: <200206062033.g56KXvJ05145@odiug.zope.com> > >>Until then, I'd rather like to see the file IO APIs and related > >>types fixed so that they can handle 2GB files all the way > >>through.
(I suppose you meant >2GB files.) > > Which file IO APIs need to be fixed? I thought we supported large > > files already (when the OS supports them)? > > The file object does, but what the mmap module doesn't and > it is not clear to me whether all code in the standard lib > can actually deal with file positions outside the int range > (most code probably doesn't care, since it uses .read() > and .write() exclusively), e.g. can SRE scan mmapped > files of such size ? On a 32-bit machine you can mmap at most 2 GB anyway I expect, due to the VM architecture (and otherwise the limit would obviously be 4 GB). In which architecture are you interested? The only place where this might be a problem is when a pointer is 64 bits but an int is 32 bits. What other modules are you worried about? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Thu Jun 6 21:53:44 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 6 Jun 2002 22:53:44 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> Message-ID: <03e201c20d9c$4213d350$ced241d5@hagrid> Guido van Rossum wrote: > In which architecture are you interested? The only place where this > might be a problem is when a pointer is 64 bits but an int is 32 bits. which means all 64-bit Unix machines... From gward@python.net Thu Jun 6 21:49:01 2002 From: gward@python.net (Greg Ward) Date: Thu, 6 Jun 2002 16:49:01 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <20020606161541.GA26647@panix.com> References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> Message-ID: <20020606204901.GA18310@gerg.ca> On 06 June 2002, Aahz said: > It should fix neither. 
However, it should preserve sentence endings: Actually, what it does fix is this: blah blah blah here's the end of sentence at the end of a line. And here's the next sentence. Which, after whitespace-mangling, becomes ... end of a line. And here's ... which is incorrect (sentences should be separated by two spaces in fixed-width fonts). The catch is that single-space-separated sentences elsewhere in the text are also fixed, which *I* think is a good thing, but should be optional. Whatever -- minor detail. I'll check it in first and then worry about making more features optional. Greg -- Greg Ward - Unix geek gward@python.net http://starship.python.net/~gward/ Budget's in the red? Let's tax religion! -- Dead Kennedys From mal@lemburg.com Thu Jun 6 22:39:58 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 06 Jun 2002 23:39:58 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> Message-ID: <3CFFD6AE.9020602@lemburg.com> Guido van Rossum wrote: >>>>Until then, I'd rather like to see the file IO APIs and related >>>>types fixed so that they can handle 2GB files all the way >>>>through. >>> > > (I suppose you meant >2GB files.) Yes. >>>Which file IO APIs need to be fixed? I thought we supported large >>>files already (when the OS supports them)? >> >>The file object does, but what the mmap module doesn't and >>it is not clear to me whether all code in the standard lib >>can actually deal with file positions outside the int range >>(most code probably doesn't care, since it uses .read() >>and .write() exclusively), e.g. can SRE scan mmapped >>files of such size ? 
> > > On a 32-bit machine you can mmap at most 2 GB anyway I expect, due to > the VM architecture (and otherwise the limit would obviously be 4 GB). > > In which architecture are you interested? The only place where this > might be a problem is when a pointer is 64 bits but an int is 32 bits. 64-bit Unix systems such as AIX 5L. > What other modules are you worried about? I'm not worried about any modules... take this as PEP-42 wish: someone would need to check all the code using e.g. file.seek() and file.tell() to make sure that it works correctly with long values. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From fredrik@pythonware.com Thu Jun 6 22:44:02 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 6 Jun 2002 23:44:02 +0200 Subject: [Python-Dev] textwrap.py References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> Message-ID: <047001c20da3$49c4f4b0$ced241d5@hagrid> Greg Ward wrote: > Which, after whitespace-mangling, becomes > > ... end of a line. And here's ... > > which is incorrect (sentences should be separated by two spaces in > fixed-width fonts). that depends on the locale. the two space rule does not apply to swedish, for example. and googling for "two space rule" and "one space after" + period makes me think it doesn't really apply to english either... see eg http://www.press.uchicago.edu/Misc/Chicago/cmosfaq/cmosfaq.OneSpaceorTwo.html "There is a traditional American practice, favored by some, of leaving two spaces after colons and periods. This practice is discouraged /.../" (and thousands of similar entries. from what I can tell, the more *real* research done by an author, the more likely he is to come down on the one space side...) 
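The knob being argued about survives in the textwrap module that eventually shipped: fix_sentence_endings, off by default, re-inserts the second space after sentence-ending punctuation when lines are rejoined. A sketch against the current API:

```python
import textwrap

text = ("Here is the end of a sentence at the end of a line. "
        "And here is the next sentence.")

# Off by default: rejoined text keeps a single space after the period.
plain = textwrap.fill(text, width=200)

# Opting in restores the two-space convention at sentence boundaries.
fixed = textwrap.fill(text, width=200, fix_sentence_endings=True)

print(plain)
print(fixed)
```

Greg's "should be optional" position is exactly how it landed: the default matches Fredrik's one-space camp, and the Emacs-style double space is opt-in.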
From martin@v.loewis.de Thu Jun 6 22:55:03 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 06 Jun 2002 23:55:03 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <3CFFD6AE.9020602@lemburg.com> References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> <3CFFD6AE.9020602@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > 64-bit Unix systems such as AIX 5L. > > > What other modules are you worried about? > > I'm not worried about any modules... take this as PEP-42 wish: > someone would need to check all the code using e.g. file.seek() > and file.tell() to make sure that it works correctly with > long values. That is supposed to work today. If it doesn't, make a detailed bug report. Regards, Martin From DavidA@ActiveState.com Fri Jun 7 00:31:17 2002 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 06 Jun 2002 16:31:17 -0700 Subject: [Python-Dev] textwrap.py References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> <047001c20da3$49c4f4b0$ced241d5@hagrid> Message-ID: <3CFFF0C5.7010902@ActiveState.com> Fredrik Lundh wrote: >Greg Ward wrote: > > > >>Which, after whitespace-mangling, becomes >> >> ... end of a line. And here's ... >> >>which is incorrect (sentences should be separated by two spaces in >>fixed-width fonts). >> >> > >that depends on the locale. > >the two space rule does not apply to swedish, for example. > >and googling for "two space rule" and "one space after" + period >makes me think it doesn't really apply to english either... see eg > >http://www.press.uchicago.edu/Misc/Chicago/cmosfaq/cmosfaq.OneSpaceorTwo.html > > "There is a traditional American practice, favored by some, > of leaving two spaces after colons and periods. 
This practice > is discouraged /.../" > >(and thousands of similar entries. from what I can tell, the more >*real* research done by an author, the more likely he is to come >down on the one space side...) > > I did some research on this in a previous life, and my memory is that the two-space rule was designed, much like using underscores, as a guide to the typesetter, since periods are easily missed. Underscores were an indication that the text should be emphasized, and no well-typeset document will include real underscores (except for "effect"). --da From tim.one@comcast.net Fri Jun 7 01:12:58 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 06 Jun 2002 20:12:58 -0400 Subject: [Python-Dev] Where to put wrap_text()? In-Reply-To: <20020606152011.GA16829@gerg.ca> Message-ID: [Tim]
> Note that regrtest.py also has a wrapper:
>
>     def printlist(x, width=70, indent=4):
>         """Print the elements of a sequence to stdout.
>
>         Optional arg width (default 70) is the maximum line length.
>         Optional arg indent (default 4) is the number of blanks
>         with which to begin each line.
>         """

[Greg Ward] > I think this one will probably stand; I've gotten to the point with my > text-wrapping code where I'm reimplementing the various other > text-wrappers people have mentioned on top of it, and > regrtest.printlist() is just not a good fit. It's for printing > lists compactly, not for filling text. Whatever. regrtest's printlist is trivial to implement on top of the code you posted:

    def printlist(x, width=70, indent=4):
        guts = map(str, x)
        blanks = ' ' * indent
        w = textwrap.TextWrapper()
        print w.fill(' '.join(guts), width, blanks, blanks)

TextWrapper certainly doesn't have to worry about changing the list into a string; all I want is that it wrap a string, and it does. >> Just make sure it handles the union of all possible desires, but >> has a simple and intuitive interface . > Right. Gotcha. Code coming up soon. It's no more than 10x more elaborate than necessary, so ship it .
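Tim's five-liner maps directly onto the textwrap API as it finally stabilized, where the indents became keyword arguments to fill() rather than positional arguments. A sketch, not the regrtest original:

```python
import textwrap

def printlist(x, width=70, indent=4):
    # Same idea as regrtest's helper: render a sequence compactly,
    # every output line indented by `indent` blanks.
    blanks = " " * indent
    print(textwrap.fill(" ".join(map(str, x)), width,
                        initial_indent=blanks, subsequent_indent=blanks))

printlist(["test_gc", "test_descr", "test_socket"])
```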
From guido@python.org Fri Jun 7 01:19:54 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 20:19:54 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Your message of "Thu, 06 Jun 2002 23:44:02 +0200." <047001c20da3$49c4f4b0$ced241d5@hagrid> References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> <047001c20da3$49c4f4b0$ced241d5@hagrid> Message-ID: <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net> > (and thousands of similar entries. from what I can tell, the more > *real* research done by an author, the more likely he is to come > down on the one space side...) Knuth, when he invented TeX, heavily promoted a typesetting rule (for variable-width fonts) that allowed the whitespace after a full stop to stretch more than regular word space. The Emacs folks, who love Knuth, translated this idea for fixed-width text into two spaces. Note that HTML also doesn't do this -- it always single-spaces text. Looks fine to me. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 01:23:46 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 20:23:46 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "06 Jun 2002 23:55:03 +0200." References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> <3CFFD6AE.9020602@lemburg.com> Message-ID: <200206070023.g570NkZ07427@pcp02138704pcs.reston01.va.comcast.net> > > I'm not worried about any modules... take this as PEP-42 wish: > > someone would need to check all the code using e.g. file.seek() > > and file.tell() to make sure that it works correctly with > > long values. > > That is supposed to work today. If it doesn't, make a detailed bug > report. 
While file.seek() and file.tell() are indeed fixed, I think MAL has a fear that some modules don't like getting a long from tell(). I fixed a bug of this kind in dumbdbm more than three years ago, when a long wasn't acceptable as a multiplier in string repetition. Since then, longs aren't quite so poisonous as they once were, and I don't think this fear is rational any more. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jun 7 01:48:53 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 06 Jun 2002 20:48:53 -0400 Subject: [Python-Dev] Bizarre new test failure Message-ID: Guido noticed this on Linux late this afternoon. I've seen it on Win2K and Win98SE since. test_gc fails if you run the whole test suite from the start: test test_gc failed -- test_list: actual 10, expected 1 This seems impossible (look at the test). It doesn't fail in isolation. It fails in both debug and release builds. Not *all* tests before test_gc need to be run in order to provoke a failure, but I can detect no sense in which ones do need to be run (not just one or two, but lots of them).
Here are the files that changed between a Python that does work (yesterday) and now:

    P python/configure
    P python/configure.in
    P python/pyconfig.h.in
    P python/Doc/lib/libgetopt.tex
    P python/Doc/lib/libsocket.tex
    P python/Lib/copy.py
    P python/Lib/fileinput.py
    P python/Lib/getopt.py
    P python/Lib/posixpath.py
    P python/Lib/shutil.py
    P python/Lib/socket.py
    P python/Lib/compiler/pyassem.py
    P python/Lib/compiler/pycodegen.py
    P python/Lib/compiler/transformer.py
    P python/Lib/distutils/command/clean.py
    P python/Lib/test/test_b1.py
    P python/Lib/test/test_commands.py
    P python/Lib/test/test_descr.py
    P python/Lib/test/test_getopt.py
    P python/Lib/test/test_socket.py
    U python/Lib/test/test_timeout.py
    P python/Misc/ACKS
    P python/Misc/NEWS
    P python/Modules/gcmodule.c
    P python/Modules/socketmodule.c
    P python/Modules/socketmodule.h
    P python/Objects/abstract.c
    P python/Objects/complexobject.c
    P python/Objects/rangeobject.c
    P python/Tools/webchecker/webchecker.py

While Jeremy did fiddle gcmodule.c, that isn't the cause. I changed the test like so:

    def test_list():
        import sys
        l = []
        l.append(l)
        gc.collect()
        del l
        gc.set_debug(gc.DEBUG_SAVEALL)
        n = gc.collect()
        print >> sys.stderr, '*' * 30, n, gc.garbage
        expect(n, 1, "list")

Here's the list of garbage objects it found:

    [<...>,
     {'__dict__': <...>,
      '__module__': 'test_descr',
      '__weakref__': <...>,
      '__doc__': None,
      '__init__': <...>},
     (<...>, <...>, <...>),
     (<...>,),
     [[...]],
     <...>,
     <...>,
     <...>,
     (<...>,),
     <...>]

The recursive list: [[...]] is the only one expected here. Your turn . From greg@cosc.canterbury.ac.nz Fri Jun 7 01:47:06 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jun 2002 12:47:06 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Message-ID: <200206070047.MAA07206@s454.cosc.canterbury.ac.nz> > This has the potential of breaking applications that remember the id() > of an interned string, instead of its value. Unless the manual promises that interned strings will live forever, I'd say such an application is broken already.
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Jun 7 01:53:52 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jun 2002 12:53:52 +1200 (NZST) Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention In-Reply-To: <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206070053.MAA07210@s454.cosc.canterbury.ac.nz> Guido van Rossum : > > > def __str__(self): > > > pass > > > > Dunno about other people's opinions, but I have a strong distaste for > > creating methods whose body contains pass. I always use "raise > > NotImplementedError". > > But that has different semantics! In this particular case, the program blows up anyway if this method is ever called, so you might as well return a meaningful exception!

Python 2.2 (#14, May 28 2002, 14:11:27)
[GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> class C:
...     def __str__(self):
...         pass
...
>>> c = C()
>>> str(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: __str__ returned non-string (type NoneType)
>>>

Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From neal@metaslash.com Fri Jun 7 02:03:25 2002 From: neal@metaslash.com (Neal Norwitz) Date: Thu, 06 Jun 2002 21:03:25 -0400 Subject: [Python-Dev] Bizarre new test failure References: Message-ID: <3D00065D.5AEAECF4@metaslash.com> Tim Peters wrote: > > Guido noticed this on Linux late this afternoon. I've seen it on Win2K and > Win98SE since.
test_gc fails if you run the whole test suite from the > start: > > test test_gc failed -- test_list: actual 10, expected 1 > > This seems impossible (look at the test). It doesn't fail in isolation. It > fails in both debug and release builds. Not *all* tests before test_gc need > to be run in order to provoke a failure, but I can detect no sense in which > do need to be run (not just one or two, but lots of them). > > Here are the files that changed between a Python that does work (yesterday) > and now: I've gotten this intermittently. Although the first time I got it was sometime yesterday, so I think you may have to go back a bit farther. Neal From greg@cosc.canterbury.ac.nz Fri Jun 7 02:02:56 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jun 2002 13:02:56 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206070102.NAA07214@s454.cosc.canterbury.ac.nz> Guido: > It's also quite possible that there are no outside > references to an interned string, but another string with the same > value still references the interned string from its ob_sinterned > field. > > To solve this, we would have to make the ob_sinterned slot count as a > reference to the interned string. But then string_dealloc would be > complicated (it would have to call Py_XDECREF(op->ob_sinterned)), > possibly slowing things down. If the intern table cleanup is being done by a GC pass, you don't need a full Py_XDECREF -- you only need to decrement op->ob_sinterned->ob_refcnt. Doesn't sound excessively expensive to me, but I suppose it would have to be timed to make sure. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Jun 7 02:10:17 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jun 2002 13:10:17 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Message-ID: <200206070110.NAA07217@s454.cosc.canterbury.ac.nz> > > Is this worth it? > > I think it is. I think so, too. Currently, interning can *almost* be regarded as no more than an optimisation that speeds up comparing strings -- almost, because it has this side effect of making the strings immortal. Removing that side effect would be good. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido@python.org Fri Jun 7 02:20:26 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 21:20:26 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "Thu, 06 Jun 2002 20:48:53 EDT." References: Message-ID: <200206070120.g571KQC14344@pcp02138704pcs.reston01.va.comcast.net> > Here are the files that changed between a Python that does work (yesterday) > and now: > Here's the list of garbage objects it found:
>
> [<...>,
>  {'__dict__': <...>,
>   '__module__': 'test_descr',
>   '__weakref__': <...>,
>   '__doc__': None,
>   '__init__': <...>},
>  (<...>, <...>, <...>),
>  (<...>,),
>  [[...]],
>  <...>,
>  <...>,
>  <...>,
>  (<...>,),
>  <...>]
Most of these are leftovers from the test supers() in test_descr.py. If Neal is right and this could be two days old, I'm curious if my last change to typeobject.c (2.148) might not be the culprit, since it messes with the garbage collector. I'm trying to fix the non-blocking code in the socket module first, so I doubt I'll get to this tonight. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Fri Jun 7 03:21:29 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Thu, 6 Jun 2002 22:21:29 -0400 Subject: [Python-Dev] textwrap.py References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> <047001c20da3$49c4f4b0$ced241d5@hagrid> <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15616.6313.71537.816137@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> The Emacs folks, who love Knuth, translated this idea for GvR> fixed-width text into two spaces. I always thought the Emacs folks adopted it because it made the filling algorithms easier to deal with the difference between: ...on a stick with no mustard. At least, that's how I prefer my... and ...in love with Dr. Frankenstein, and who once noticed a watermelon... Two spaces after the sentence end and one after the abbreviation. Here's some interesting information from XEmacs: C-h f fill-paragraph RET If `sentence-end-double-space' is non-nil, then period followed by one space does not end a sentence, so don't break a line there. C-h v sentence-end-double-space RET *Non-nil means a single space does not end a sentence. This variable applies only to filling, not motion commands. To change the behavior of motion commands, see `sentence-end'. It's clear to me that any standard wrapping code in Python needs to handle either style. -Barry From guido@python.org Fri Jun 7 03:33:18 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Jun 2002 22:33:18 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Your message of "Thu, 06 Jun 2002 22:21:29 EDT." 
<15616.6313.71537.816137@anthem.wooz.org> References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> <047001c20da3$49c4f4b0$ced241d5@hagrid> <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net> <15616.6313.71537.816137@anthem.wooz.org> Message-ID: <200206070233.g572XIh14811@pcp02138704pcs.reston01.va.comcast.net> > I always thought the Emacs folks adopted it because it made the > filling algorithms easier to deal with the difference between: > > ...on a stick with no mustard. At least, that's how I prefer my... > > and > > ...in love with Dr. Frankenstein, and who once noticed a watermelon... You've got that backwards of course. If we didn't want two spaces after a full stop, we wouldn't need any of this nonsense. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Fri Jun 7 03:54:43 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 6 Jun 2002 22:54:43 -0400 Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility References: <3CFFA4A2.9C2D9313@metaslash.com> <200206061822.g56IMWT23115@odiug.zope.com> <3CFFB1E8.F60E3F4@metaslash.com> <200206061908.g56J8YZ23563@odiug.zope.com> Message-ID: <15616.8307.731761.157525@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Go ahead and check it in as PEP 291. (PEP 290 is reserved GvR> for RaymondH's Migration Guide.) Today's thunderstorms knocked out my home network so I'm just now catching up on email. Great PEP Neal, thanks for doing it! One nit: PEP file numbers must have 4 digits, so I cvs rm'd pep-291.txt, copied it to pep-0291.txt and cvs added the latter. I also sync'd it to www.python.org. Please make any future updates to the pep-0291.txt file. Thanks, -Barry From guido@python.org Fri Jun 7 05:01:02 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 00:01:02 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Your message of "Wed, 05 Jun 2002 17:33:55 EDT." 
Message-ID: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> I've more or less completed the introduction of timeout sockets. Executive summary: after sock.settimeout(T), all methods of sock will block for at most T floating seconds and fail if they can't complete within that time. sock.settimeout(None) restores full blocking mode. I've also done some long-needed rigorous cleanup of the socket module source code, e.g. I got rid of the PySock* static names. Remaining issues: - A test suite. There's no decent test suite for the timeout code. The file test_timeout.py doesn't test the functionality (as I discovered when the test succeeded while I had several blunders in the select code that made everything always time out). - Cross-platform testing. It's possible that the cleanup broke things on some platforms, or that select() doesn't work the same way. I can only test on Windows and Linux; there is code specific to OS/2 and RISCOS in the module too. - I'm not sure that the handling of timeout errors in accept(), connect() and connect_ex() is 100% correct (this code sniffs the error code and decides whether to retry or not). - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? Currently it sets a timeout of zero seconds, and that behaves pretty much the same as setting the socket in nonblocking mode -- but not exactly. Maybe these should be made the same? - A socket filedescriptor passed to fromfd() is now assumed to be in blocking, non-timeout mode. - The socket.py module has been changed too, changing the way buffering is done on Windows. I haven't reviewed or tested this code thoroughly. I hope some of the developers on this list will help me out with all this! In the mean time, many thanks to Michael Gilfix who did most of the thinking and coding. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 05:02:44 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 00:02:44 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "06 Jun 2002 22:16:04 +0200." References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net> > > To solve this, we would have to make the ob_sinterned slot count as a > > reference to the interned string. But then string_dealloc would be > > complicated (it would have to call Py_XDECREF(op->ob_sinterned)), > > possibly slowing things down. > > > > Is this worth it? > > That (latter) change seem "right" regardless of whether interned > strings are ever released. OK, let's do this. > > The fear for unbounded growth of the interned strings table is > > pretty common amongst authors of serious long-running programs. > > I think it is. Unbound growth of interned strings is precisely the > reason why the XML libraries repeatedly came up with their own > interning dictionaries, which only persist for the lifetime of parsing > the document, since the next document may want to intern entirely > different things. This is the reason that the intern() function is bad > to use for most applications. So let's expose a function that cleans out unused strings from the interned dict. Long-running apps can decide when to call this. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Fri Jun 7 05:19:57 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jun 2002 16:19:57 +1200 (NZST) Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206070419.QAA07241@s454.cosc.canterbury.ac.nz> Guido: > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > Currently it sets a timeout of zero seconds, and that behaves pretty > much the same as setting the socket in nonblocking mode -- but not > exactly. Maybe these should be made the same? I'd say no. Someone might want the current behaviour, whatever it is -- and if they don't, they can always make it properly non-blocking. Don't make a special case unless it's absolutely necessary. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Fri Jun 7 05:22:25 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jun 2002 16:22:25 +1200 (NZST) Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206070422.QAA07244@s454.cosc.canterbury.ac.nz> Guido: > So let's expose a function that cleans out unused strings from the > interned dict. Long-running apps can decide when to call this. Would it do any harm to call this automatically from the garbage collector? I suppose it should be exposed as well, in case you want it but have GC turned off -- but in the normal case you shouldn't have to do anything special to get it. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From mgilfix@eecs.tufts.edu Fri Jun 7 05:26:23 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Fri, 7 Jun 2002 00:26:23 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 12:01:02AM -0400 References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020607002623.A20029@eecs.tufts.edu> On Fri, Jun 07 @ 00:01, Guido van Rossum wrote: > I've more or less completed the introduction of timeout sockets. > > Executive summary: after sock.settimeout(T), all methods of sock will > block for at most T floating seconds and fail if they can't complete > within that time. sock.settimeout(None) restores full blocking mode. > > I've also done some long-needed rigorous cleanup of the socket module > source code, e.g. I got rid of the PySock* static names. Good stuff. The module needed a little work as I discovered as well :) > Remaining issues: > > - A test suite. There's no decent test suite for the timeout code. > The file test_timeout.py doesn't test the functionality (as I > discovered when the test succeeded while I had several blunders in > the select code that made everything always time out). Er, hopefully Bernard is still watching this thread as he wrote the test_timeout.py. He's been pretty quiet though as of late... I'm willing to rewrite the tests if he doesn't have the time. I think the tests should follow the same pattern as the test_socket.py. While adding my regression tests, I noted that the general socket test suite could use some re-writing but I didn't feel it appropriate to tackle it at that point. Perhaps a next patch? 
> - Cross-platform testing. It's possible that the cleanup broke things > on some platforms, or that select() doesn't work the same way. I > can only test on Windows and Linux; there is code specific to OS/2 > and RISCOS in the module too. This was a concern from the beginning but we had some chat on the dev list and concluded that any system supporting sockets has to support select or some equivalent (hence the initial reason for using the select module, although I agree it was expensive). > - I'm not sure that the handling of timeout errors in accept(), > connect() and connect_ex() is 100% correct (this code sniffs the > error code and decides whether to retry or not). I've tested these on linux (manually) and they seem to work just fine. I didn't do as much testing with connect_ex but the code is very similar to connect, so confidence is high-er. The reason for the two-pass is because the initial connect needs to be made to start the process and then try again, based on the error codes, for non-blocking connects. It's weird like that. > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > Currently it sets a timeout of zero seconds, and that behaves pretty > much the same as setting the socket in nonblocking mode -- but not > exactly. Maybe these should be made the same? I thought about this and whether or not I wanted to address this. I kinda decided to leave them separate though. I don't think setting a timeout means anything equivalent to setblocking(0). In fact, I can't see why anyone should ever set a timeout of zero and the immediate throwing of the exception is a good alert as to what's going on. I vote, leave them separate and as they are now... > - A socket filedescriptor passed to fromfd() is now assumed to be in > blocking, non-timeout mode. > > - The socket.py module has been changed too, changing the way > buffering is done on Windows. I haven't reviewed or tested this > code thoroughly. 
I added a regression test to test_socket.py to test this, that works on both the old code (I used 2.1.3) and the new code. Hopefully, this will be instrumental for those testing it and it reflects my manual tests. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From tim.one@comcast.net Fri Jun 7 05:32:07 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 00:32:07 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > Knuth, when he invented TeX, heavily promoted a typesetting rule (for > variable-width fonts) that allowed the whitespace after a full stop to > stretch more than regular word space. The Emacs folks, who love > Knuth, translated this idea for fixed-width text into two spaces. Two spaces between sentences was the rule for monospaced fonts before Knuth was born. It got beaten into me by my mother when I learned to type, and is still the rule for monospaced fonts according to several style guides. Are there TWO spaces after every sentence? Manuscripts without two spaces after each sentence will be rejected. > Note that HTML also doesn't do this -- it always single-spaces text. Are there TWO spaces after every sentence? Manuscripts without two spaces after each sentence will be rejected. The web page from which that quote was taken forces an extra space after the question mark (I didn't insert it after pasting the quote) in the obvious way: Are there TWO spaces after every sentence?  Manuscripts without two spaces after each will be rejected. > Looks fine to me. It wouldn't if you viewed it in Courier; for fixed-width fonts it very arguably helps people parse. The two-space gimmick is out of favor for published works because proportional fonts and kerning are adequate to distinguish sentences. It still Rulz the DOS box, though . 
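For what it's worth, the two-space convention ended up opt-in in the module this thread produced: the shipped TextWrapper has a fix_sentence_endings flag that widens the space after what it guesses is a sentence end when rejoining lines. A short sketch of that shipped behavior, including the abbreviation case Barry raised:

```python
import textwrap

# The heuristic: a chunk ending in a lowercase letter followed by
# '.', '!' or '?' (optionally a closing quote) is treated as a
# sentence end, and the single space after it is widened to two.
w = textwrap.TextWrapper(width=60, fix_sentence_endings=True)

print(w.fill("Call me Ishmael. Some years ago I went to sea."))

# The heuristic cannot tell an abbreviation from a sentence end --
# exactly the hard case from the fill-paragraph discussion -- so
# "Dr." also gets two spaces after it.
print(w.fill("She fell in love with Dr. Frankenstein at once."))
```

With the flag left at its default of False, rewrapping simply collapses all inter-word whitespace to single spaces, which is why the wrapper cannot round-trip two-space text without help.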
From tim.one@comcast.net Fri Jun 7 05:40:54 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 00:40:54 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <200206070023.g570NkZ07427@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > While file.seek() and file.tell() are indeed fixed, I think MAL has a > fear that some modules don't like getting a long from tell(). I fixed > a bug of this kind in dumbdbm more than three years ago, when a long > wasn't acceptable as a multiplier in string repetition. Since then, > longs aren't quite so poisonous as they once were, and I don't think > this fear is rational any more. Andrew K fixed a lot of these too, for some definition of "fixed" . From xscottg@yahoo.com Fri Jun 7 05:56:09 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Thu, 6 Jun 2002 21:56:09 -0700 (PDT) Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020607045609.8677.qmail@web12908.mail.yahoo.com> --- Guido van Rossum wrote: > > Also > could cause lots of compilation warnings when user code stores the > result into an int. > The compiler won't complain a wink for int pointers passed to varargs functions. PyArg_ParseTuple and any format specifiers that have # after the typecode could be quiet bugs in any extension modules. This could be handled in a backwards compatible fashion by adding a new code indicating ssize_t while leaving '#' as indicating int. __________________________________________________ Do You Yahoo!? Yahoo! 
- Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From nas@python.ca Fri Jun 7 06:31:04 2002 From: nas@python.ca (Neil Schemenauer) Date: Thu, 6 Jun 2002 22:31:04 -0700 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: ; from tim.one@comcast.net on Thu, Jun 06, 2002 at 08:48:53PM -0400 References: Message-ID: <20020606223104.B1389@glacier.arctrix.com> Tim Peters wrote: > test test_gc failed -- test_list: actual 10, expected 1 Hmm. > Here's the list of garbage objects it found:
>
> [<...>,
>  {'__dict__': <...>,
>   '__module__': 'test_descr',
>   '__weakref__': <...>,
>   '__doc__': None,
>   '__init__': <...>},
>  (<...>, <...>, <...>),
>  (<...>,),
>  [[...]],
>  <...>,
>  <...>,
>  <...>,
>  (<...>,),
>  <...>]
I wonder if some new cyclic garbage structure needs two gc.collect() passes in order to break it up. Neil From nas@python.ca Fri Jun 7 06:38:37 2002 From: nas@python.ca (Neil Schemenauer) Date: Thu, 6 Jun 2002 22:38:37 -0700 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: ; from tim.one@comcast.net on Thu, Jun 06, 2002 at 08:48:53PM -0400 References: Message-ID: <20020606223837.C1389@glacier.arctrix.com> Tim Peters wrote: > I can detect no sense in which do need to be run (not just one or two, > but lots of them). It's easy to reproduce. First, disable the GC. Next, run: regrtest.py test_descr test_gc My wild guess is that some tp_clear method is not doing its job. I'll take a closer look tomorrow if someone hasn't figured it out by then. Must sleep. Too much CS. Neil From loewis@informatik.hu-berlin.de Fri Jun 7 08:05:31 2002 From: loewis@informatik.hu-berlin.de (Martin v. Löwis) Date: 07 Jun 2002 09:05:31 +0200 Subject: [Python-Dev] Quota on sf.net Message-ID: It appears SF is rearranging servers, and asks projects to honor their disk quota, see https://sourceforge.net/forum/forum.php?forum_id=183601 There is a per-project disk quota of 100MB; /home/groups/p/py/python currently consumes 880MB. Most of this (830MB) is in htdocs/snapshots.
Should we move those onto python.org? Regards, Martin From loewis@informatik.hu-berlin.de Fri Jun 7 08:09:35 2002 From: loewis@informatik.hu-berlin.de (Martin v. Löwis) Date: 07 Jun 2002 09:09:35 +0200 Subject: [Python-Dev] Making doc strings optional Message-ID: I'm ready to apply patch #505375. Any objections? Martin From mal@lemburg.com Fri Jun 7 08:26:59 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 07 Jun 2002 09:26:59 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <200206061748.g56Hm0k15221@odiug.zope.com> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> <3CFFD6AE.9020602@lemburg.com> <200206070023.g570NkZ07427@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D006043.7010802@lemburg.com> Guido van Rossum wrote: >>>I'm not worried about any modules... take this as PEP-42 wish: >>>someone would need to check all the code using e.g. file.seek() >>>and file.tell() to make sure that it works correctly with >>>long values. >> >>That is supposed to work today. If it doesn't, make a detailed bug >>report. > > > While file.seek() and file.tell() are indeed fixed, I think MAL has a > fear that some modules don't like getting a long from tell(). I fixed > a bug of this kind in dumbdbm more than three years ago, when a long > wasn't acceptable as a multiplier in string repetition. Since then, > longs aren't quite so poisonous as they once were, and I don't think > this fear is rational any more. If Martin has checked the code for this already, I'm fine. I stumbled across problems in this area with mxBeeBase which did not support using longs as addresses and since the problems are rather subtle, I assumed that other code not specifically built for handling longs in file positions could have similar problems.
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From eikeon@eikeon.com Fri Jun 7 08:26:58 2002 From: eikeon@eikeon.com (Daniel 'eikeon' Krech) Date: 07 Jun 2002 03:26:58 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net> References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net> <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > > > To solve this, we would have to make the ob_sinterned slot count as a > > > reference to the interned string. But then string_dealloc would be > > > complicated (it would have to call Py_XDECREF(op->ob_sinterned)), > > > possibly slowing things down. > > > > > > Is this worth it? > > > > That (latter) change seem "right" regardless of whether interned > > strings are ever released. > > OK, let's do this. Cool! From tim_one@email.msn.com Fri Jun 7 08:49:27 2002 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 7 Jun 2002 03:49:27 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <20020606223837.C1389@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > It's easy to reproduce. First, disable the GC. Next, run: > > regrtest.py test_descr test_gc Sorry, thinking is cheating. > My wild guess is that some tp_clear method is not doing it's job. I'll > take a closer look tomorrow if someone hasn't figured it out by then. > Must sleep. Too much CS. Ya, Canadian sausage always does me in too. I'll attach a self-contained (in the sense that you can run it directly by itself, without regrtest.py) test program. 
Guido might have some idea what does . For me, it prints:

    collected 3
    collected 51
    collected 9
    collected 0

and, at the end,

    collected 1

and it's not a coincidence that 9+1 == 10 (the failing value seen when running the test suite). It suggests one easy way to fix test_gc . > I wonder if some new cyclic garbage structure needs two gc.collect() > passes in order to break it up. If there isn't a bug, this case takes 3(!) passes.

    from test_support import vereq

    def supers():
        class A(object):
            def meth(self, a):
                return "A(%r)" % a

        class B(A):
            def __init__(self):
                self.__super = super(B, self)
            def meth(self, a):
                return "B(%r)" % a + self.__super.meth(a)

        class C(A):
            def meth(self, a):
                return "C(%r)" % a + self.__super.meth(a)
        C._C__super = super(C)

        class D(C, B):
            def meth(self, a):
                return "D(%r)" % a + super(D, self).meth(a)

        class mysuper(super):
            def __init__(self, *args):
                return super(mysuper, self).__init__(*args)

        class E(D):
            def meth(self, a):
                return "E(%r)" % a + mysuper(E, self).meth(a)

        class F(E):
            def meth(self, a):
                s = self.__super
                return "F(%r)[%s]" % (a, s.__class__.__name__) + s.meth(a)
        F._F__super = mysuper(F)

        vereq(F().meth(6), "F(6)[mysuper]E(6)D(6)C(6)B(6)A(6)")

    import gc
    gc.disable()
    L = []
    L.append(L)
    supers()
    while 1:
        n = gc.collect()
        print "collected", n
        if n == 0:
            break
    del L
    n = gc.collect()
    print
    print "and, at the end, collected", n

From just@letterror.com Fri Jun 7 09:05:23 2002 From: just@letterror.com (Just van Rossum) Date: Fri, 7 Jun 2002 10:05:23 +0200 Subject: [Python-Dev] textwrap.py In-Reply-To: Message-ID: Tim Peters wrote: > It wouldn't if you viewed it in Courier; for fixed-width fonts it very > arguably helps people parse. The two-space gimmick is out of favor for > published works because proportional fonts and kerning are adequate to > distinguish sentences. It still Rulz the DOS box, though .
It contains much more white than it would in a proportional font and/or the dot is much bigger to compensate for that. Either way, in most fixed-width fonts the period sticks out pretty well. I don't see how that can be harder to parse than when set in a proportional font, let alone why an extra space would help. can't-afford-to-stay-out-of-a-typographical-flame-fest-on-python-dev-ly y'rs - Just From fredrik@pythonware.com Fri Jun 7 12:20:58 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 7 Jun 2002 13:20:58 +0200 Subject: [Python-Dev] textwrap.py References: Message-ID: <01f501c20e15$bb84cd60$0900a8c0@spiff> tim wrote: > Two spaces between sentences was the rule for monospaced fonts before > Knuth was born. in America, perhaps. people from other parts of the world may also wish to use the textwrap modules (or better, string.wrap). so let's add an option (e.g. ms_davis_told_me_so=1 ;-) > It got beaten into me by my mother when I learned to type, and > is still the rule for monospaced fonts according to several style > guides. if you do the google searches I mention, you'll find that the word "some" is more correct than "several". > Are there TWO spaces after every sentence? Manuscripts without > two spaces after each sentence will be rejected. to quote another random web page: "Even I was told by my typing teacher to put two spaces after a period. It's just that I trusted the advice I got from graphic designers more than I trusted my typing teacher. My typing teacher also carried a lunch box and wore short-sleeved white dress shirts with really bad ties to school every day. It's up to you...." and "... I've found tenacity and authority the overriding "arguments" for maintaining the two-space rule. Empirically and financially, the one-space rule makes sense." > It wouldn't if you viewed it in Courier; for fixed-width fonts it very > arguably helps people parse.
according to vision researchers, humans using their eyes to read text don't care much about sentence breaks inside blocks of text -- for some reason, they're probably more interested in the content. and humans don't appear to use regular expressions at all. how weird. From neal@metaslash.com Fri Jun 7 13:39:19 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 07 Jun 2002 08:39:19 -0400 Subject: [Python-Dev] Socket timeout patch References: <200206070419.QAA07241@s454.cosc.canterbury.ac.nz> Message-ID: <3D00A977.7FBDFC50@metaslash.com> Greg Ewing wrote: > > Guido: > > > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > > Currently it sets a timeout of zero seconds, and that behaves pretty > > much the same as setting the socket in nonblocking mode -- but not > > exactly. Maybe these should be made the same? > > I'd say no. Someone might want the current behaviour, > whatever it is -- and if they don't, they can always > make it properly non-blocking. Don't make a special > case unless it's absolutely necessary. Another possibility would be to make settimeout(0.0) equivalent to settimeout(None), ie disable timeouts. Neal From guido@python.org Fri Jun 7 13:54:05 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 08:54:05 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Your message of "Fri, 07 Jun 2002 00:32:07 EDT." References: Message-ID: <200206071254.g57Cs5V16781@pcp02138704pcs.reston01.va.comcast.net> > The web page from which that quote was taken forces an extra space > after the question mark (I didn't insert it after pasting the quote) > in the obvious way: > > Are there TWO spaces after every sentence?&nbsp; Manuscripts > without two spaces after each will be rejected. How pedantic. HTML wasn't intended to be written this way.
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 14:02:44 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 09:02:44 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: Your message of "07 Jun 2002 09:05:31 +0200." References: Message-ID: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> > It appears SF is rearranging servers, and asks projects to honor their > disk quota, see > > https://sourceforge.net/forum/forum.php?forum_id=183601 > > There is a per-project disk quota of 100MB; /home/groups/p/py/python > currently consumes 880MB. Most of this (830MB) is in > htdocs/snapshots. Should we move those onto python.org? What is htdocs/snapshots? There's plenty of space on creosote, but maybe the snapshots should be reduced in volume first? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 14:19:42 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 09:19:42 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Your message of "Fri, 07 Jun 2002 16:19:57 +1200." <200206070419.QAA07241@s454.cosc.canterbury.ac.nz> References: <200206070419.QAA07241@s454.cosc.canterbury.ac.nz> Message-ID: <200206071319.g57DJgC17102@pcp02138704pcs.reston01.va.comcast.net> [Guido] > > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > > Currently it sets a timeout of zero seconds, and that behaves pretty > > much the same as setting the socket in nonblocking mode -- but not > > exactly. Maybe these should be made the same? [GregE] > I'd say no. Someone might want the current behaviour, > whatever it is -- and if they don't, they can always > make it properly non-blocking. Don't make a special > case unless it's absolutely necessary. Why would someone want the current (as of last night) behavior? IMO it's useless. The distinction with non-blocking mode is very minimal. 
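For reference, the distinction Guido calls minimal here eventually disappeared: in the socket module as it ships today, settimeout(0.0) is documented as equivalent to setblocking(False). A small sketch of that equivalence (modern Python 3 syntax, which post-dates this thread; getblocking() needs 3.7+):

```python
import socket

# In modern CPython, a zero timeout and non-blocking mode converge on
# the same socket state -- the unification being debated above.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.settimeout(0.0)

s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s2.setblocking(False)

# both sockets report a 0.0 timeout and non-blocking mode
print(s1.gettimeout(), s1.getblocking())
print(s2.gettimeout(), s2.getblocking())

s1.close()
s2.close()
```

settimeout(None), by contrast, means blocking mode with no timeout, matching the select() convention Guido mentions below.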
[Neal] > Another possibility would be to make settimeout(0.0) equivalent to > settimeout(None), ie disable timeouts. Hm, but a zero really does smell more of non-blocking than of blocking. It would also be inconsistent with the timeout argument to select(), which currently uses None for blocking, 0 for non-blocking, and other positive numbers for a timeout in seconds -- just like settimeout(). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 14:19:51 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 09:19:51 -0400 Subject: [Python-Dev] d.get_key(key) -> key? In-Reply-To: Your message of "Fri, 07 Jun 2002 16:22:25 +1200." <200206070422.QAA07244@s454.cosc.canterbury.ac.nz> References: <200206070422.QAA07244@s454.cosc.canterbury.ac.nz> Message-ID: <200206071319.g57DJpH17110@pcp02138704pcs.reston01.va.comcast.net> > > So let's expose a function that cleans out unused strings from the > > interned dict. Long-running apps can decide when to call this. > > Would it do any harm to call this automatically from > the garbage collector? That's what I initially proposed -- do it in the last-generation GC pass, which runs every million object allocations or so. But since this is potentially expensive (running through a large dict), long-running processes might want to control when it runs. > I suppose it should be exposed as well, in case you > want it but have GC turned off -- but in the normal > case you shouldn't have to do anything special to > get it. It's a pure slowdown for most programs, even long-running programs (one could say *especially* for long-running programs, since short-running programs won't get to the last generation GC pass). Only long-running (24x7) servers that execute some kind of (pseudo-)code submitted by clients are vulnerable to the interned-dict-bloat problem. Such programs are full of hacks to limit memory bloat already, so this would be yet another trick for them to deploy.
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Fri Jun 7 14:49:57 2002 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 7 Jun 2002 09:49:57 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <01f501c20e15$bb84cd60$0900a8c0@spiff> Message-ID: [Tim] >> Two spaces between sentences was the rule for monospaced fonts before >> Knuth was born. [/F] > in America, perhaps. people from other parts of the world may > also wish to use the textwrap modules (or better, string.wrap). Despite that you never bought a shift key, you use two spaces between sentences. Are you American? François Pinard's name can't even be spelled in American , and said "Protection of full stops does not fall in that decoration category, it is essential." Just complained about it, and I invite you to set your browser to a fixed-width font and judge the readability of his msg compared to the pieces of mine he quoted (Just, the point isn't to make the period stand out, it's to make the start of the next sentence stand out): http://mail.python.org/pipermail/python-dev/2002-June/025141.html to my eyes single-space sucks with a monospaced font, and I agree with François on this: it makes monospaced text look like a giant run-on sentence. > so let's add an option (e.g. ms_davis_told_me_so=1 ;-) >> It got beaten into me by my mother when I learned to type, and >> is still the rule for monospaced fonts according to several style >> guides. > if you do the google searches I mention, I did, but I looked at a lot more than discussion boards. > you'll find that the word "some" is more correct than "several". I'm not sure that distinction means something; if it does, I don't buy it. >> Are there TWO spaces after every sentence? Manuscripts without >> two spaces after each sentence will be rejected. > to quote another random web page: > > "Even I was told by my typing teacher to put two spaces after > a period.
It's just that I trusted the advice I got from graphic > designers more than I trusted my typing teacher. My typing > teacher also carried a lunch box and wore short-sleeved white > dress shirts with really bad ties to school every day. It's up > to you...." A difference is that my quote came from a publisher spelling out requirements for submission, while yours is pulled from a casual msg in a discussion board. This is the difference between quoting a journal and an Archimedes Plutonium post from sci.physics . > and > > "... I've found tenacity and authority the overriding "arguments" > for maintaining the two-space rule. Empirically and financially, the > one-space rule makes sense." And in *that* discussion board, the preceding msg in the thread says At my last Technical Writing job my manager was adamant about using two spaces after a period, and I have become accustomed to using two spaces. and Two spaces after a period is still the rule ... Selective quoting of random people blathering at each other doesn't count as "research" to me. If it does to anyone else, you can find hundreds of quotes supporting any view you like. > ... > according to vision researchers, humans using their eyes to read > text don't care much about sentence breaks inside blocks of text This reads like a garbled paraphrase; I assume that if you had a real reference, you would have given it <0.9 wink>. > -- for some reason, they're probably more interested in the con- > tent. and humans don't appear to use regular expressions at all. > how weird. [from Patricia Godfrey's review of "The Mac Is Not a Typewriter] ... The author details all the typewriter makeshifts, such as two hyphens for a dash, that no longer have to be—and should not be— employed when you’re working on a PC. But in one case she reveals her youth. Typing two spaces after an end-of-sentence period, she thinks, was only done on typewriters because typewriters have monospaced type, and you shouldn’t do it on a PC. 
Like many theories, it sounds logical, but those of us who read old books or are old enough to remember when typesetting was an art practiced by people, rather than the result of an algorithm, know better. Typists were taught to hit two spaces after a period when typing because typeset material once upon a time used extra space there. ... This is an interesting instance of a phenomenon that we should all be aware of: in times of much change, collective cultural amnesia can occur, and a whole society can forget something that "everyone knew." The regexp attempts to preserve what everyone used to know, against computer-inspired reduction to the simplest thing that can possibly be implemented. in-my-courier-new-world-i-know-what-works-ly y'rs - tim From tim_one@email.msn.com Fri Jun 7 14:54:56 2002 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 7 Jun 2002 09:54:56 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <200206071254.g57Cs5V16781@pcp02138704pcs.reston01.va.comcast.net> Message-ID: >> Are there TWO spaces after every sentence?&nbsp; Manuscripts >> without two spaces after each will be rejected. [Guido] > How pedantic. HTML wasn't intended to be written this way. The quote had nothing to do with HTML. Or are you focusing exclusively on the instance of &nbsp; ? That's helpful . From nas@python.ca Fri Jun 7 15:10:54 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 7 Jun 2002 07:10:54 -0700 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: ; from tim_one@email.msn.com on Fri, Jun 07, 2002 at 03:49:27AM -0400 References: <20020606223837.C1389@glacier.arctrix.com> Message-ID: <20020607071054.A2511@glacier.arctrix.com> Tim Peters wrote: > [Neil Schemenauer] > > Must sleep. Too much CS. > > Ya, Canadian sausage always does me in too. But it's so good. > If there isn't a bug, this case takes 3(!) passes. Perhaps it is a reference counting bug. If a reference count is too high then tp_clear will keep decref'ing it until it gets to zero.
Neil From just@letterror.com Fri Jun 7 15:15:11 2002 From: just@letterror.com (Just van Rossum) Date: Fri, 7 Jun 2002 16:15:11 +0200 Subject: [Python-Dev] textwrap.py In-Reply-To: Message-ID: Tim Peters wrote: > (Just, the point isn't to make the period stand out, it's > to make the start of the next sentence stand out): Sure, but there are already *two* things to make that clear: end the previous sentence with a period, start the next with a capital letter. An extra space is overkill. But I guess your point may be that caps usually stand out less in fixed-width fonts, which may be true. > http://mail.python.org/pipermail/python-dev/2002-June/025141.html That sucks only because the empty line between quote and followup was deleted from the original... > to my eyes single-space sucks with a monospaced font and I agree with > François on this it makes monospaced text look like a giant run-on sentence. Don't know about canadians, but I wouldn't listen to the french : they write spaces *before* punctuation ! Just From tim.one@comcast.net Fri Jun 7 15:33:51 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 10:33:51 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <20020606223104.B1389@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > I wonder if some new cyclic garbage structure needs two gc.collect() > passes in order to break it up. Can you dream up a way that can happen legitimately? I haven't been able to, short of assuming the existence of a container object that isn't tracked by gc. Else it seems that all the unreachable cycles that exist at any given time will be found by a single all-generations gc pass (either that, or gc is busted ).
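Tim's single-pass claim is easy to check for an ordinary cycle; a minimal sketch with today's gc module (Python 3 syntax, but the behavior is the same one being discussed):

```python
import gc

gc.disable()   # collect only when we say so, as in the failing test setup
gc.collect()   # flush any garbage that already existed

# build a self-referential cycle and drop the last external reference
lst = []
lst.append(lst)
del lst

# one full collection finds the now-unreachable cycle
n = gc.collect()
print("collected", n)

gc.enable()
```

A case needing two or three collect() passes, as in Tim's test program, is the anomaly worth investigating.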
From perry@stsci.edu Fri Jun 7 15:39:53 2002 From: perry@stsci.edu (Perry Greenfield) Date: Fri, 07 Jun 2002 10:39:53 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t Message-ID: Guido writes: > I'm not very concerned about strings or lists with more than 2GB > items, but I am concerned about other memory buffers. Those in the Numeric/numarray community, for one, would also be concerned. Although there aren't many data arrays these days that are larger than 2GB there are some beginning to appear. I have no doubt that within a few years there will be many more. I'm not sure I understand all the implications of the discussion here, but it sounds like an important issue. Currently strings are frequently used as a common "medium" to pass binary data from one module to another (e.g., from Numeric to PIL); limiting strings to 2GB may prove a problem in this area (though frankly, I suspect few will want to use them as temporary buffers for objects that size until memories have grown a bit more :-). Perry Greenfield From gmcm@hypernet.com Fri Jun 7 15:40:32 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 7 Jun 2002 10:40:32 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: References: Message-ID: <3D008DA0.20248.6AD00DBA@localhost> Aargh! It doesn't matter if it "makes sense"[1]! It's a widely known rule that some people still insist upon. I don't see anyone arguing you should adopt the convention, just that people who follow the convention should see it respected. -- Gordon http://www.mcmillan-inc.com/ [1] Style guidelines frequently appeal to a notion of "sense" that only makes sense if the guideline appeals to you. I would cite the GNU C code style guide as an example, but that would only get me flamed :-). 
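The opt-in switch being argued for is essentially what textwrap ended up shipping: TextWrapper's fix_sentence_endings flag, off by default. A short sketch of the two behaviors (Python 3 syntax; the sample text is made up):

```python
import textwrap

text = "Hello there. This is a test. It works fine."

# default: sentence breaks are rejoined with a single space
print(textwrap.fill(text, width=60))

# fix_sentence_endings=True: two spaces after sentence-ending punctuation
print(textwrap.fill(text, width=60, fix_sentence_endings=True))
```

The flag only affects how broken lines are rejoined; it does not try to guess every sentence boundary (the docs note it can misfire on abbreviations like "Dr.").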
From tim.one@comcast.net Fri Jun 7 15:44:46 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 10:44:46 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Message-ID: >> (Just, the point isn't to make the period stand out, it's >> to make the start of the next sentence stand out): [Just] > Sure, but there are already *two* things to make that clear: end > the previous sentence with a period, start the next with a capital > letter. An extra space is overkill. But I guess your point may be > that caps usually stand out less in fixed-width fonts, which may be > true. They do seem to stand out less, but that isn't really my point. My point is that I've been living mostly with fixed-width fonts for more than 30 years, and even in 1970 I noticed it was easier to read prose in such fonts when sentences were separated by two spaces. And that's before my eyes started growing old -- it's gotten more noticeable over the years. I don't know exactly why that is, but I can't notice a thing hundreds of times and then be convinced by abstract arguments that I've been hallucinating for decades. > ... > Don't know about canadians, but I wouldn't listen to the french : > they write spaces *before* punctuation ! God knows I'd rather align myself with the Dutch, but in the grand tradition of European wars, punctuation makes strange bedfellows . From s_lott@yahoo.com Fri Jun 7 15:48:51 2002 From: s_lott@yahoo.com (Steven Lott) Date: Fri, 7 Jun 2002 07:48:51 -0700 (PDT) Subject: [Python-Dev] textwrap.py In-Reply-To: Message-ID: <20020607144851.21255.qmail@web9605.mail.yahoo.com> > "old enough to remember when typesetting was > an art > practiced by people" Hey wait a minute, I resemble that remark! Briefly, I spent some time setting cold lead type with my stubbly little fingers. You used an "em" space after full stops, an "en" space otherwise. 
You padded after the "em" first, then spread "thins" around the line to get it long enough that you could clamp it firmly in the frame. However, this doesn't resolve the monofont issue. An "em" is (usually) not twice as wide as an "en". An "en" is the width of the letter "n"; about in the middle of all of the widths. An "em" is the width of the letter "m", the widest of all letters. Anyway. I'm not a big fan of flags and options and settings. I think the text wrapper should have the "fix sentence ending" method renamed to "find sentence ending" and the wrap() and fill() functions could have a hook where a Strategy class can be applied.

- One strategy class puts a single space after full stops.
- Another puts double spaces after full stops.
- A subclass of either of these could spread space around to justify the line.
- I think that Unicode offers "em" and "en"-sized spaces; what this does with good old fashioned Courier 12 I have no idea; but someone could add this strategy if it made them happy.

===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd. __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From guido@python.org Fri Jun 7 15:58:02 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 10:58:02 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "Fri, 07 Jun 2002 10:39:53 EDT." References: Message-ID: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> > > I'm not very concerned about strings or lists with more than 2GB > > items, but I am concerned about other memory buffers. > > Those in the Numeric/numarray community, for one, would also be > concerned. Although there aren't many data arrays these days that are > larger than 2GB there are some beginning to appear.
I have no doubt > that within a few years there will be many more. I'm not sure I > understand all the implications of the discussion here, but it sounds > like an important issue. Currently strings are frequently used as > a common "medium" to pass binary data from one module to another > (e.g., from Numeric to PIL); limiting strings to 2GB may prove > a problem in this area (though frankly, I suspect few will want > to use them as temporary buffers for objects that size until memories > have grown a bit more :-). Sorry, I should have been more exact. I meant 2 billion items, not 2 gigabytes. That should give you an extra factor 4-8 to play with. :-) We'll fix this in Python 3.0 for sure -- the question is, should we start fixing it now and binary compatibility be damned, or should we honor binary compatibility more? Maybe someone in the Python-with-a-tie camp can comment? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jun 7 15:51:46 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 10:51:46 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <3D008DA0.20248.6AD00DBA@localhost> Message-ID: [Gordon McMillan] > Aargh! > > It doesn't matter if it "makes sense"[1]! Indeed, it doesn't even matter if one side is dead wrong. Hell, it doesn't even matter if all sides are dead wrong. > It's a widely known rule that some people still insist upon. The ones with a normal sense of aesthetics, yes . > ... > [1] Style guidelines frequently appeal to a > notion of "sense" that only makes sense > if the guideline appeals to you. I would cite > the GNU C code style guide as an example, > but that would only get me flamed :-). """ Aside from this, I prefer code formatted like this:

    if (x < foo (y, z))
      haha = bar[4] + 5;
    else
      {
        while (z)
          {
            haha += foo (z, z);
            z--;
          }
        return ++x + bar ();
      }

I find it easier to read a program when it has spaces before the open-parentheses and after the commas.
""" I've recently had the opportunity to work with reams of code done this way. The best that can be said of it is that it's dead wrong. Out of consideration for our youth, I'll refrain from revealing the worst that can be said of it. At least it has two spaces after right curly braces . From guido@python.org Fri Jun 7 16:00:46 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 11:00:46 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Your message of "Fri, 07 Jun 2002 10:40:32 EDT." <3D008DA0.20248.6AD00DBA@localhost> References: <3D008DA0.20248.6AD00DBA@localhost> Message-ID: <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net> > It doesn't matter if it "makes sense"[1]! It's a widely known rule > that some people still insist upon. I don't see anyone arguing you > should adopt the convention, just that people who follow the > convention should see it respected. True, but then there needs to be a way to enable/disable it, since even if you never use two spaces after a period, the rule can still generate them for you in the output: when an input sentence ends at the end of a line but the output sentence doesn't, the rule will translate the newline into two spaces instead of one. I vote to have it off by default. --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Fri Jun 7 16:00:05 2002 From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=) Date: 07 Jun 2002 17:00:05 +0200 Subject: [Python-Dev] Quota on sf.net In-Reply-To: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > What is htdocs/snapshots? There's plenty of space on creosote, but > maybe the snapshots should be reduced in volume first? I'm not sure. Jeremy owns it, but I don't know what process creates it. 
Regards, Martin From sholden@holdenweb.com Fri Jun 7 15:59:48 2002 From: sholden@holdenweb.com (Steve Holden) Date: Fri, 7 Jun 2002 10:59:48 -0400 Subject: [Python-Dev] textwrap.py References: Message-ID: <012c01c20e33$f9462540$7201a8c0@holdenweb.com> ----- Original Message ----- From: "Just van Rossum" To: Sent: Friday, June 07, 2002 10:15 AM Subject: RE: [Python-Dev] textwrap.py Tim Peters wrote: > (Just, the point isn't to make the period stand out, it's > to make the start of the next sentence stand out): Sure, but there are already *two* things to make that clear: end the previous sentence with a period, start the next with a capital letter. An extra space is overkill. But I guess your point may be that caps usually stand out less in fixed-width fonts, which may be true. > http://mail.python.org/pipermail/python-dev/2002-June/025141.html That sucks only because the empty line between quote and followup was deleted from the original... > to my eyes single-space sucks with a monospaced font and I agree with > François on this it makes monospaced text look like a giant run-on sentence. Don't know about canadians, but I wouldn't listen to the french : they write spaces *before* punctuation ! --------End Original Message-------- If the energy that has gone into this debate had gone into modifying existing code, by now each different version of the text formatting function being discussed could have an "twospace" keyword argument which could be set to achieve the required behavior and defaulted to the author's preference. I smell the bicycle shed here. regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From mal@lemburg.com Fri Jun 7 16:12:57 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 07 Jun 2002 17:12:57 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D00CD79.6050409@lemburg.com> Guido van Rossum wrote: >>>I'm not very concerned about strings or lists with more than 2GB >>>items, but I am concerned about other memory buffers. >> >>Those in the Numeric/numarray community, for one, would also be >>concerned. Although there aren't many data arrays these days that are >>larger than 2GB there are some beginning to appear. I have no doubt >>that within a few years there will be many more. I'm not sure I >>understand all the implications of the discussion here, but it sounds >>like an important issue. Currently strings are frequently used as >>a common "medium" to pass binary data from one module to another >>(e.g., from Numeric to PIL); limiting strings to 2GB may prove >>a problem in this area (though frankly, I suspect few will want >>to use them as temporary buffers for objects that size until memories >>have grown a bit more :-). > > > Sorry, I should have been more exact. I meant 2 billion items, not 2 > gigabytes. That should give you an extra factor 4-8 to play with. :-) > > We'll fix this in Python 3.0 for sure -- the question is, should we > start fixing it now and binary compatibility be damned, or should we > honor binary compatiblity more? What binary compatibility ? I thought we had given that idea up after 1.5.2 was out the door (which is also why the Windows distutils installers are very picky about the Python version to install an extension for). 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From guido@python.org Fri Jun 7 16:19:05 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 11:19:05 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "Fri, 07 Jun 2002 10:33:51 EDT." References: Message-ID: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net> > > I wonder if some new cyclic garbage structure needs two gc.collect() > > passes in order to break it up. > > Can you dream up a way that can happen legitimately? I haven't been able > to, short of assuming the existence of a container object that isn't tracked > by gc. Else it seems that all the unreachable cycles that exist at any > given time will be found by a single all-generations gc pass (either that, > or gc is busted ). Any idea why this would only happen on Windows? I tried it on Linux and couldn't get it to fail. Not even with gc.set_threshold(1). I'll go review my tp_clear code next. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Fri Jun 7 16:15:05 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 7 Jun 2002 11:15:05 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net> References: <3D008DA0.20248.6AD00DBA@localhost> <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020607151505.GA22182@panix.com> On Fri, Jun 07, 2002, Guido van Rossum wrote: > > > It doesn't matter if it "makes sense"[1]! It's a widely known rule > > that some people still insist upon. I don't see anyone arguing you > > should adopt the convention, just that people who follow the > > convention should see it respected. 
> > True, but then there needs to be a way to enable/disable it, since > even if you never use two spaces after a period, the rule can still > generate them for you in the output: when an input sentence ends at > the end of a line but the output sentence doesn't, the rule will > translate the newline into two spaces instead of one. > > I vote to have it off by default. How about a compromise? If the algorithm discovers a sentence with two or more spaces ending it, it goes into "two-space" mode; otherwise, it defaults to one-space mode. (I think fmt does this, but I'm not sure; it's certainly the case that sometimes it preserves my spaces and sometimes it doesn't, and I've never been able to figure it out precisely.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I had lots of reasonable theories about children myself, until I had some." --Michael Rios From guido@python.org Fri Jun 7 16:23:23 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 11:23:23 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "Fri, 07 Jun 2002 17:12:57 +0200." <3D00CD79.6050409@lemburg.com> References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> Message-ID: <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net> > What binary compatibility ? I thought we had given that idea > up after 1.5.2 was out the door (which is also why the Windows > distutils installers are very picky about the Python version > to install an extension for). You keep saying this, and I keep denying it. In everything I do I try to remain binary compatible in struct layout and function signatures. Can you point to a document that records a decision to the contrary? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jun 7 16:23:04 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 11:23:04 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <3D00CD79.6050409@lemburg.com> Message-ID: [M.-A. Lemburg] > What binary compatibility ? I thought we had given that idea > up after 1.5.2 was out the door (which is also why the Windows > distutils installers are very picky about the Python version > to install an extension for). It's strange. The binary API has changed with every non-bugfix release since then, but if you stare at the details, old binaries will almost certainly work correctly despite the API warning messages. Guido explained to me that this is why API mismatch is just a warning instead of an error. This is also why it took weeks to enable pymalloc by default, instead of days (i.e., to make sure that old binaries don't have to be recompiled). I don't speak for distutils, of course. From tim.one@comcast.net Fri Jun 7 16:26:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 11:26:47 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > Any idea why this would only happen on Windows? I tried it on Linux > and couldn't get it to fail. Not even with gc.set_threshold(1). What exactly is "it"? The failure when running regrtest.py in whole; the failure Neil reported (and I assume on Linux) by running just test_descr and test_gc after *disabling* gc in regrtest.py ("disable" == gc.disable() or gc.set_threshold(0), not gc.set_threshold(1)); or the 3 gc.collect()s it takes to clear out the cycles in the self-contained test program I posted? > I'll go review my tp_clear code next. Probably a good idea regardless.
From tim.one@comcast.net Fri Jun 7 16:27:38 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 11:27:38 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <20020607151505.GA22182@panix.com> Message-ID: [Aahz] > How about a compromise? If the algorithm discovers a sentence with two > or more spaces ending it, it goes into "two-space" mode; otherwise, it > defaults to one-space mode. (I think fmt does this, but I'm not sure; > it's certainly the case that sometimes it preserves my spaces and > sometimes it doesn't, and I've never been able to figure it out > precisely.) That's certainly worthy of emulation . From mgilfix@eecs.tufts.edu Fri Jun 7 16:45:26 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Fri, 7 Jun 2002 11:45:26 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: ; from tim_one@email.msn.com on Fri, Jun 07, 2002 at 03:49:27AM -0400 References: <20020606223837.C1389@glacier.arctrix.com> Message-ID: <20020607114526.B24428@eecs.tufts.edu> On Fri, Jun 07 @ 03:49, Tim Peters wrote: > Ya, Canadian sausage always does me in too. I'll attach a self-contained > (in the sense that you can run it directly by itself, without regrtest.py) > test program. Guido might have some idea what does . For me, it > prints: Every Canadian knows that you should always opt for the ham. I think we feed our turkeys molson... -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From guido@python.org Fri Jun 7 16:50:13 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 11:50:13 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "Fri, 07 Jun 2002 03:49:27 EDT." References: Message-ID: <200206071550.g57FoDk25701@pcp02138704pcs.reston01.va.comcast.net> > > I wonder if some new cyclic garbage structure needs two gc.collect() > > passes in order to break it up. > > If there isn't a bug, this case takes 3(!) passes. 
That same testcase prints the same output for me on Linux, with Python 2.2, with a 2.3 from June 4th, and with 2.3 from current CVS:

collected 3
collected 51
collected 9
collected 0

and, at the end,

collected 1

So there really are test cases that require more than one collection to clean them up. Next: > [Guido] > > Any idea why this would only happen on Windows? I tried it on Linux > > and couldn't get it to fail. Not even with gc.set_threshold(1). [Tim] > What exactly is "it"? The failure when running regrtest.py in > whole; the failure Neil reported (and I assume on Linux) by running > just test_descr and test_gc after *disabling* gc in regrtest.py > ("disable" == gc.disable() or gc.set_threshold(0), not > gc.set_threshold(1)); or the 3 gc.collect()s it takes to clear out > the cycles in the self-contained test program I posted? I meant the failure on Windows. But I can now reproduce on Linux what Neil did using the new -t option that I just added to regrtest.py:

./python ../Lib/test/regrtest.py -t0 test_descr test_gc

which tells me

test test_gc failed -- test_list: actual 10, expected 1

When I put an extra gc.collect() call in test_gc.test_list(), the test succeeds. Is this the right fix? I can't see anything particularly wrong with subtype_clear() or the slot-traversing subtype_traverse() in typeobject.c. --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Fri Jun 7 16:54:22 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 7 Jun 2002 08:54:22 -0700 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 11:19:05AM -0400 References: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020607085422.A3051@glacier.arctrix.com> Guido van Rossum wrote: > Any idea why this would only happen on Windows? What only happens on Windows? I can reliably reproduce the problem on Linux.
> I tried it on Linux and couldn't get it to fail. Not even with > gc.set_threshold(1). I think you want gc.disable(). > I'll go review my tp_clear code next. I'm narrowing in on the change that broke things. It happened between Dec 1 and Dec 15. Neil From mal@lemburg.com Fri Jun 7 17:10:13 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 07 Jun 2002 18:10:13 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D00DAE5.10203@lemburg.com> Guido van Rossum wrote: >>What binary compatibility ? I thought we had given that idea >>up after 1.5.2 was out the door (which is also why the Windows >>distutils installers are very picky about the Python version >>to install an extension for). > > > You keep saying this, and I keep denying it. :-) > In everything I do I try > to remain binary compatible in struct layout and function signatures. > Can you point to a document that records a decision to the contrary? Garbage collection, weak references, changes in the memory allocation, etc. etc. All these change the binary layout of structs or the semantics of memory allocation -- mostly only in slight ways, but to a point where 100% binary compatibility is not given anymore. Other changes (which I know of) are e.g. the Unicode APIs which have changed (they now include UCS2 or UCS4 depending on whether you use 16-bit or 32-bit Unicode internally). I don't think that binary compatibility is all that important; it just requires a recompile (and hey, that way you even get sub-type support for free ;-). Far more difficult to handle are all those minute little changes which easily slip the developer's radar.
Luckily this will get approached now by Andrew and Raymond, so things are getting much better for us poor souls having to live on supporting 3-4 different Python versions :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From tim.one@comcast.net Fri Jun 7 17:03:49 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 12:03:49 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206071550.g57FoDk25701@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > That same testcase prints the same output for me on Linux, with Python > 2.2, with a 2.3 from June 4th, and with 2.3 from current CVS: > > collected 3 > collected 51 > collected 9 > collected 0 > > and, at the end, collected 1 > > So there really are test cases that require more than one collection > to clean them up. Same here. I wish we understood why. Or that at least one of Neil and I understood why. > ... > But I can now reproduce on Linux what Neil did using the new -t option > that I just added to regrtest.py: > > ./python ../Lib/test/regrtest.py -t0 test_descr test_gc > > which tells me > > test test_gc failed -- test_list: actual 10, expected 1 > > When I put an extra gc.collect() call in test_gc.test_list(), the test > succeeds. > > Is this the right fix? No, but assuming there isn't a real bug here, repeating gc.collect() until it returns 0 would be -- as the self-contained program showed, we *may* need to call gc.collect() as many as 4 times before that happens. And if it's legit that it may need 4, I see no reason for believing there's any a priori upper bound on how many may be needed. And the test could have failed all along, even in 2.2; it apparently depends on how many times gc just happens to run before we get to test_gc. 
I'll check in a "drain it" fix to test_gc, but I'm still squirming. > I can't see anything particilarly wrong with subtype_clear() or the > slot-traversing subtype_traverse() in typeobject.c. I couldn't either, but in my case I had scant idea what it thought it was trying to do <0.9 wink>. From guido@python.org Fri Jun 7 17:29:40 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 12:29:40 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Your message of "Fri, 07 Jun 2002 00:26:23 EDT." <20020607002623.A20029@eecs.tufts.edu> References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> Message-ID: <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> First, a few new issues in this thread: - On Windows, setting a timeout on a socket and then using s.makefile() works as expected (the I/O operations on the file will time out according to the timeout set on the socket). This is because makefile() returns a pseudo-file that calls s.recv() etc. on the socket object. But on Unix, s.makefile() on a socket with a timeout is a disaster: because the socket is internally set to nonblocking mode, all I/O operations will fail if they cannot be completed immediately (effectively setting a timeout of 0 on the file). I have currently documented around this, but maybe it would be better if makefile() used a pseudo-file on all platforms, for uniform behavior. Thoughts? I'm also thinking of implementing the socket wrapper (which is currently a Python class that has a "real" socket object as an instance variable) as a subclass of the real socket class instead. - The original timeout socket code (in Python) by Tim O'Malley had a global timeout which you could set so that *all* sockets *automatically* had their timeout set. This is nice if you want it to affect library modules like urllib or ftplib. That feature is currently missing. Should we add it? 
(I have some concerns about it, in that it might break other code -- and it doesn't seem wise to do this for server-end sockets in general. But it's a nice hack for smaller programs.) Now on to my reply to Michael Gilfix: > Good stuff. The module needed a little work as I discovered as well > :) ...and it still needs more. There are still way too many #ifdefs in the code. > Er, hopefully Bernard is still watching this thread as he wrote > the test_timeout.py. He's been pretty quiet though as of late... I'm > willing to rewrite the tests if he doesn't have the time. Either way would be good. > I think the tests should follow the same pattern as the > test_socket.py. While adding my regression tests, I noted that the > general socket test suite could use some re-writing but I didn't feel > it appropriate to tackle it at that point. Perhaps a next patch? Yes, please! > > - Cross-platform testing. It's possible that the cleanup broke things > > on some platforms, or that select() doesn't work the same way. I > > can only test on Windows and Linux; there is code specific to OS/2 > > and RISCOS in the module too. > > This was a concern from the beginning but we had some chat on the > dev list and concluded that any system supporting sockets has to > support select or some equivalent (hence the initial reason for using > the select module, although I agree it was expensive). But that doesn't mean there aren't platform-specific tweaks necessary to import the definition of select() and the FD_* macros. We'll find out soon enough, this is what alpha releases are for. :-) > > - I'm not sure that the handling of timeout errors in accept(), > > connect() and connect_ex() is 100% correct (this code sniffs the > > error code and decides whether to retry or not). > > I've tested these on linux (manually) and they seem to work just > fine. I didn't do as much testing with connect_ex but the code is > very similar to connect, so confidence is high-er. 
The reason for the > two-pass is because the initial connect needs to be made to start the > process and then try again, based on the error codes, for non-blocking > connects. It's weird like that. I'll wait for the unit tests. These should test all three modes (blocking, non-blocking, and timeout). Can you explain why on Windows you say that the socket is connected when connect() returns a WSAEINVAL error? Also, your code enters this block even in non-blocking mode, if a timeout was also set. (Fortunately I fixed this for you by setting the timeout to -1.0 in setblocking(). Unfortunately there's still a later test for !sock_blocking in the same block that cannot succeed any more because of that.) The same concerns apply to connect_ex() and accept(), which have very similar logic. I believe it is possible on some Unix variants (maybe on Linux) that when select indicates that a socket is ready, if the socket is in nonblocking mode, the call will return an error because some kernel resource is unavailable. This suggests that you may have to keep the socket in blocking mode except when you have to do a connect() or accept() (for which you can't do a select without setting the socket in nonblocking mode first). Looking at connect_ex, it seems to be missing the "res = errno" bit in the case where it says "we're already connected". It used to return errno here, now it will return -1. Maybe the conex_finally label should be moved up to before the "if (res != 0) {" line? > > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > > Currently it sets a timeout of zero seconds, and that behaves pretty > > much the same as setting the socket in nonblocking mode -- but not > > exactly. Maybe these should be made the same? > > I thought about this and whether or not I wanted to address this. I > kinda decided to leave them separate though. I don't think setting a > timeout means anything equivalent to setblocking(0). 
In fact, I can't > see why anyone should ever set a timeout of zero and the immediate > throwing of the exception is a good alert as to what's going on. I > vote, leave them separate and as they are now... OTOH, a timeout of 0 behaves very similar to nonblocking mode -- similar enough that a program that uses setblocking(0) would probably also work when using settimeout(0). I kind of like the idea of having only a single internal flag value, sock_timeout, rather than two (sock_timeout and sock_blocking). > > - The socket.py module has been changed too, changing the way > > buffering is done on Windows. I haven't reviewed or tested this > > code thoroughly. > > I added a regression test to test_socket.py to test this, that works > on both the old code (I used 2.1.3) and the new code. Hopefully, this > will be instrumental for those testing it and it reflects my manual > tests. The tests don't look very systematic. There are many cases (default bufsize, unbuffered, bufsize=1, large bufsize; combine with read(), readline(), read a line larger than the buffer size, etc.). We need a more systematic approach to unit testing here. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 17:31:44 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 12:31:44 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "Fri, 07 Jun 2002 12:03:49 EDT." References: Message-ID: <200206071631.g57GVix25975@pcp02138704pcs.reston01.va.comcast.net> > No, but assuming there isn't a real bug here, repeating gc.collect() until > it returns 0 would be -- as the self-contained program showed, we *may* need > to call gc.collect() as many as 4 times before that happens. And if it's > legit that it may need 4, I see no reason for believing there's any a priori > upper bound on how many may be needed. 
And the test could have failed all > along, even in 2.2; it apparently depends on how many times gc just happens > to run before we get to test_gc. > > I'll check in a "drain it" fix to test_gc, but I'm still squirming. Hold off. Neil said he thought there was a bug introduced early December -- that's before 2.2 was released! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 7 17:41:53 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 12:41:53 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "Fri, 07 Jun 2002 18:10:13 +0200." <3D00DAE5.10203@lemburg.com> References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net> <3D00DAE5.10203@lemburg.com> Message-ID: <200206071641.g57Gfr826040@pcp02138704pcs.reston01.va.comcast.net> > > Can you point to a document that records a decision to the contrary? > > Garbage collection, weak references, changes in the memory allocation, > etc. etc. IMO this is 100% FUD. The GC does not change the object lay-out. It is only triggered for types that have a specific flag bit. The changes in the GC API also changed the flag bit used. Weak references also use a flag bit in the type object and if the flag bit is on, look at a field in the type that checks whether there is a weakref pointer in the object struct. All objects (with public object lay-out) that have had their struct extended have always done so by appending to the end. Tim spent weeks to make the memory allocation code backwards compatible (with several different versions, binary and source compatibility). As an example, the old Zope ExtensionClass code works fine with Python 2.2. > All these change the binary layout of structs or the semantics > of memory allocation -- mostly only in slight ways, but to a point > where 100% binary compatibility is not given anymore.
Still, I maintain that most extensions that work with 1.5.2 still work today without recompilation, if you can live with the API version change warnings. Try it! > Other changes (which I know of) are e.g. the Unicode APIs which > have changed (they now include UCS2 or UCS4 depending on whether > you use 16-bit or 32-bit Unicode internally). When you compile with UCS2 it should be backward compatible. > I don't think that binary compatibility is all that important; > it just requires a recompile (and hey, that way you even get > sub-type support for free ;-). Actually, you don't -- you have to set a flag bit to make your type subclassable. There are too many things that classic extension types don't provide. > Far more difficult to handle are all those minute little changes > which easily slip the developer's radar. Examples? > Luckily this will get approached now by Andrew and Raymond, so > things are getting much better for us poor souls having to > live on supporting 3-4 different Python versions :-) I'm not sure which initiative you are referring to. Or even which Andrew (I presume you mean Raymond Hettinger)? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jun 7 17:38:19 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 12:38:19 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206071631.g57GVix25975@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Tim] >> I'll check in a "drain it" fix to test_gc, but I'm still squirming. [Guido] > Hold off. Neil said he thought there was a bug introduced early > December -- that's before 2.2 was release! Yup, I saw that. 
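The "drain it" fix Tim mentions boils down to repeating gc.collect() until a pass finds nothing left. A minimal sketch of that loop (collect_all is a hypothetical helper name, not part of the gc module):

```python
import gc

def collect_all():
    """Call gc.collect() until a pass reports no unreachable objects."""
    total = 0
    while True:
        found = gc.collect()  # number of unreachable objects this pass
        if found == 0:
            return total
        total += found
```

With a healthy collector the loop exits after one extra pass; it only spins longer if, as in the bug under discussion, a single pass leaves collectable cycles behind.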
From nas@python.ca Fri Jun 7 17:33:27 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 7 Jun 2002 09:33:27 -0700 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: ; from tim.one@comcast.net on Fri, Jun 07, 2002 at 12:03:49PM -0400 References: <200206071550.g57FoDk25701@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020607093327.A3135@glacier.arctrix.com> Attached is a little program that triggers the behavior. The CVS change I finally narrowed in on was the addition of similar code to test_descr. A reference counting bug is still my best guess. Guido? Neil [attachment: gcbug.py]

import gc
gc.disable()

def main():
    # must be inside function scope
    class A(object):
        def __init__(self):
            self.__super = super(A, self)
    A()

main()
print 'first collect', gc.collect()
print 'second collect', gc.collect()

From tim.one@comcast.net Fri Jun 7 17:57:15 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 12:57:15 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <20020607093327.A3135@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > Attached is a little program that triggers the behavior. The CVS change > I finally narrowed in on was the addition of similar code to test_descr. Ouch! > A reference counting bug is still my best guess. Guido?
Here's the code:

import gc
gc.disable()

def main():
    # must be inside function scope
    class A(object):
        def __init__(self):
            self.__super = super(A, self)
    A()

main()
print 'first collect', gc.collect()
print 'second collect', gc.collect()

The first collect is getting these: [<__main__.A object at 0x0066A090>, , >, {'_A__super': , >} ] The second is getting these: [, {'__dict__': , '__module__': '__main__', '__weakref__': , '__doc__': None, '__init__': }, (, ), (,), , , , (,), ] For some reason, the cell nags me. Perhaps because of your "must be inside function scope" comment, and that cells are poorly understood by me. From jeremy@zope.com Fri Jun 7 13:41:20 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 7 Jun 2002 08:41:20 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: References: <20020607093327.A3135@glacier.arctrix.com> Message-ID: <15616.43504.457473.110331@slothrop.zope.com> >>>>> "TP" == Tim Peters writes: TP> For some reason, the cell nags me. Perhaps because of your TP> "must be inside function scope" comment, and that cells are TP> poorly understood by me. I can explain the cells, at least.

def main():
    # must be inside function scope
    class A(object):
        def __init__(self):
            self.__super = super(A, self)
    A()

In the example, the value bound to A is stored in a cell, because it is a free variable in __init__(). There are two references to the cell after the class statement is executed. One is the frame for main(). The other is the func_closure for __init__(). The second reference creates a cycle. The cycle is: class A refers to function __init__ refers to cell for A refers to class A That's it for what I understand. It looks like the example code creates two cycles, and one cycle refers to the other. The first cycle is the one involving the A instance and the super instance variable. That cycle has a reference to class A. When the garbage collector runs, it determines the first cycle is garbage.
It doesn't determine the second cycle is garbage because it has an external reference from the first cycle. I presume that the garbage collector can't collect both cycles in one pass without re-running the update & subtract refs phase after deleting all the garbage. During the first refs pass, the second cycle wasn't detected. The second cycle is only collectable after the first cycle has been collected. Jeremy From nas@python.ca Fri Jun 7 18:41:37 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 7 Jun 2002 10:41:37 -0700 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <15616.43504.457473.110331@slothrop.zope.com>; from jeremy@zope.com on Fri, Jun 07, 2002 at 08:41:20AM -0400 References: <20020607093327.A3135@glacier.arctrix.com> <15616.43504.457473.110331@slothrop.zope.com> Message-ID: <20020607104137.C3400@glacier.arctrix.com> Jeremy Hylton wrote: > When the garbage collector runs, it determines the first cycle is > garbage. It doesn't determine the second cycle is garbage because it > has an external reference from the first cycle. But both cycles should be in the set being collected. It should be able to collect them both at once. If your theory is correct then we should be able to construct some cyclic garbage using only list objects and get the same behavior, right? 
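For example, one self-referencing list can hold the only reference to another (a sketch; the counts returned by gc.collect() can also include unrelated garbage):

```python
import gc
gc.disable()

outer = []
outer.append(outer)    # first cycle: a list containing itself
inner = []
inner.append(inner)    # second cycle, likewise self-referencing
outer.append(inner)    # the first cycle keeps the second alive

del outer, inner
first = gc.collect()   # both lists are unreachable at this point
second = gc.collect()
```

If linked cycles really required separate passes, the second call would have to pick up the inner list; with plain lists a single full pass should find both, since the collector computes reachability over everything it tracks at once.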
Neil From mgilfix@eecs.tufts.edu Fri Jun 7 18:40:36 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Fri, 7 Jun 2002 13:40:36 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 12:29:40PM -0400 References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020607134036.C24428@eecs.tufts.edu> On Fri, Jun 07 @ 12:29, Guido van Rossum wrote: > First, a few new issues in this thread: > > - On Windows, setting a timeout on a socket and then using s.makefile() > works as expected (the I/O operations on the file will time out > according to the timeout set on the socket). This is because > makefile() returns a pseudo-file that calls s.recv() etc. on the > socket object. But on Unix, s.makefile() on a socket with a timeout > is a disaster: because the socket is internally set to nonblocking > mode, all I/O operations will fail if they cannot be completed > immediately (effectively setting a timeout of 0 on the file). I have > currently documented around this, but maybe it would be better if > makefile() used a pseudo-file on all platforms, for uniform behavior. > Thoughts? I'm also thinking of implementing the socket wrapper (which > is currently a Python class that has a "real" socket object as an > instance variable) as a subclass of the real socket class instead. Glad to hear that _fileobject works well. Is there any benefit to having the file code in C? I bet the python code isn't that much slower. It does seem a shame to have to switch between the two. Maybe one solution is that a makefile should set blocking back on if a timeout exists? That would solve the problem and is a consistent change since it only checks timeout behavior (= 2 lines of code). I'd vote for using the python fileobject for both. 
Some profiling of the two would be nice if you have the time. Er, what's the difference between that and _socketobject in socket.py? Why not just use the python bit consistently? > - The original timeout socket code (in Python) by Tim O'Malley had a > global timeout which you could set so that *all* sockets > *automatically* had their timeout set. This is nice if you want it > to affect library modules like urllib or ftplib. That feature is > currently missing. Should we add it? (I have some concerns about > it, in that it might break other code -- and it doesn't seem wise to > do this for server-end sockets in general. But it's a nice hack for > smaller programs.) Is it really so painful for apps to keep track of all their sockets and then do something like: for sock in sock_list: sock.settimeout (blah) Why keep track of them in the socket module, unless there's already code for this. > > Good stuff. The module needed a little work as I discovered as well > > :) > > ...and it still needs more. There are still way too many #ifdefs in > the code. Well, we agreed to do some clean-up in a separate patch but you seem anxious to get it in there :) > > I think the tests should follow the same pattern as the > > test_socket.py. While adding my regression tests, I noted that the > > general socket test suite could use some re-writing but I didn't feel > > it appropriate to tackle it at that point. Perhaps a next patch? > > Yes, please! Alrighty. I'll re-write the test_socket.py and do the test_timeout.py as well. > > This was a concern from the beginning but we had some chat on the > > dev list and concluded that any system supporting sockets has to > > support select or some equivalent (hence the initial reason for using > > the select module, although I agree it was expensive). > > But that doesn't mean there aren't platform-specific tweaks necessary > to import the definition of select() and the FD_* macros. 
We'll find > out soon enough, this is what alpha releases are for. :-) Well, this was the initial reason to use the selectmodule.c code. There's got to be a way to share code between the two for bare access to select, since someone else might want to use such functionality one day (and this has set the precendent). Why not make a small change to selectmodule.c that opens up the code in a C API or some sort? And then have select_select use that function internally. > > > - I'm not sure that the handling of timeout errors in accept(), > > > connect() and connect_ex() is 100% correct (this code sniffs the > > > error code and decides whether to retry or not). This is how the original timeoutsocket.py did it and it seems to be the way to do blocking connects. You try to do the connect, check if it happened instantaneously and then if not, do the select, and then try again. Errno is the way to check it. That's why if we're doing timeout stuff, there's a second call to accept/connect. Says the linux man pages: EAGAIN or EWOULDBLOCK The socket is marked non-blocking and no connections are present to be accepted. > > I've tested these on linux (manually) and they seem to work just > > fine. I didn't do as much testing with connect_ex but the code is > > very similar to connect, so confidence is high-er. The reason for the > > two-pass is because the initial connect needs to be made to start the > > process and then try again, based on the error codes, for non-blocking > > connects. It's weird like that. > > I'll wait for the unit tests. These should test all three modes > (blocking, non-blocking, and timeout). Ok.. Should I merge the test_timeout.py and test_socket.py as well then? A little off-topic, while I was thinking of restructuring these tests, I was wondering what might be the best way to structure a unit test where things have to work in seperate processes/threads. 
What I'd really like to do is: * Have the setup function set up server and client sockets * Have the tear-down function close them * Have some synchronization function or simple message (this is socket specific) * Then have a test that has access to both threads and can insert call-backs to run in each thread. This seems tricky with the current unit-test frame work. The way I'd do it is using threading.Event () and have the thing block until the server-side and client-side respectively submit their test callbacks. But it might be nice to have a general class that can be added to the testing framework. Thoughts? > Can you explain why on Windows you say that the socket is connected > when connect() returns a WSAEINVAL error? This is what timeoutsocket.py used as the unix equivalent error codes, and since I'm not set up to test windows and since it was working code, I took their word for it. > Also, your code enters this block even in non-blocking mode, if a > timeout was also set. (Fortunately I fixed this for you by setting > the timeout to -1.0 in setblocking(). Unfortunately there's still a > later test for !sock_blocking in the same block that cannot succeed > any more because of that.) This confusion is arising because of the restructuring of the code. Erm, this check applies if we have a timeout but are in non-blocking mode. Perhaps you changed this? To make it clearer, originally before the v2 of the patch, the socket was always in non-blocking mode, so it was necessary to check whether we were examining error code with non-blocking in mind, or whether we were checking for possible timeout behavior. Since we've changed this, it now checks if non-blocking has been set while a timeout has been set. Seems valid to me... > The same concerns apply to connect_ex() and accept(), which have very > similar logic. 
> > I believe it is possible on some Unix variants (maybe on Linux) that > when select indicates that a socket is ready, if the socket is in > nonblocking mode, the call will return an error because some kernel > resource is unavailable. This suggests that you may have to keep the > socket in blocking mode except when you have to do a connect() or > accept() (for which you can't do a select without setting the socket > in nonblocking mode first). Not sure about this. Checking the man pages, the error codes seem to be the thing to check to determine what the behavior is. Perhaps you could clarify? > Looking at connect_ex, it seems to be missing the "res = errno" bit > in the case where it says "we're already connected". It used to > return errno here, now it will return -1. Maybe the conex_finally > label should be moved up to before the "if (res != 0) {" line? Ah yes. I didn't look closely enough at the windows bit. On linux it isn't necessary. Let's move it up. > > I thought about this and whether or not I wanted to address this. I > > kinda decided to leave them separate though. I don't think setting a > > timeout means anything equivalent to setblocking(0). In fact, I can't > > see why anyone should ever set a timeout of zero and the immediate > > throwing of the exception is a good alert as to what's going on. I > > vote, leave them separate and as they are now... > > OTOH, a timeout of 0 behaves very similar to nonblocking mode -- > similar enough that a program that uses setblocking(0) would probably > also work when using settimeout(0). I kind of like the idea of having > only a single internal flag value, sock_timeout, rather than two > (sock_timeout and sock_blocking). But one throws an exception and one doesn't. It seems to me that setting a timeout of 0 is sort of an error, if anything. It'll be an easy way to do a superficial test of the functionality in the regr test. 
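As it happens, the single-flag view is the one that eventually stuck: in today's socket module, settimeout(0.0) and setblocking(0) are equivalent, and a zero timeout raises the non-blocking error immediately rather than a timeout exception. A minimal sketch against a modern interpreter:

```python
import socket

a, b = socket.socketpair()      # a pre-connected pair, handy for testing
a.settimeout(0.0)               # zero timeout == non-blocking mode today
raised = None
try:
    a.recv(1)                   # nothing was sent, so this cannot complete
except BlockingIOError as exc:  # the same error setblocking(0) mode gives
    raised = type(exc).__name__
print(raised)                   # BlockingIOError
a.close()
b.close()
```

So the "one throws an exception and one doesn't" distinction debated here was ultimately resolved by making the two modes identical.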
> > > - The socket.py module has been changed too, changing the way > > > buffering is done on Windows. I haven't reviewed or tested this > > > code thoroughly. > > > > I added a regression test to test_socket.py to test this, that works > > on both the old code (I used 2.1.3) and the new code. Hopefully, this > > will be instrumental for those testing it and it reflects my manual > > tests. > > The tests don't look very systematic. There are many cases (default > bufsize, unbuffered, bufsize=1, large bufsize; combine with read(), > readline(), read a line larger than the buffer size, etc.). We need a > more systematic approach to unit testing here. Ok, so to recap which tests we want: * Default read() * Read with size given * unbuffered read * large buffer size * Mix read and readline * Do a readline * Do a readline larger than buffer size. Any others in the list? -- Mike `-> (guido) -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From fredrik@pythonware.com Fri Jun 7 18:42:13 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 7 Jun 2002 19:42:13 +0200 Subject: [Python-Dev] textwrap.py References: Message-ID: <006d01c20e4a$d207a3c0$ced241d5@hagrid> tim wrote: > Despite that you never bought a shift key, you use two spaces between > sentences. that's only to compensate for the lack of uppercase letters. I can change that, if you wish. > A difference is that my quote came from a publisher spelling out > requirements for submission from what I can tell, the three most well-respected style guides for American English are the Chicago Manual of Style, the MLA style, and the APA. the page I linked to in my first post on this topic was from the official CMS FAQ: This practice [of using double spaces] is discouraged by the University of Chicago Press, especially for formally published works and the manuscripts from which they are published.
the MLA FAQ says: Publications in the United States today usually have the same spacing after a punctuation mark as between words on the same line /.../ In addition, most publishers' guidelines for preparing a manuscript on disk ask authors to type only the spaces that are to appear in print. and continues ... there is nothing wrong with using two spaces after concluding punctuation marks unless an instructor or editor requests that you do otherwise. the APA don't have a FAQ, but according to a "crib sheet" I found on the net, the 5th edition says something similar to: Use one space after all punctuation. and finally, John Rhodes (of webword fame) has collected lots of arguments for and against: http://www.webword.com/reports/period.html his conclusion: If you can't decide for yourself based on the above information here is my advice: You should use one space. Period. > > you'll find that the word "some" is more correct than "several". > > I'm not sure that distinction means something; if it does, I don't buy it. now that you mention it, I'm not sure either. let's see: according to my dictionary, "some" implies "more than none but not many" while "several" implies "two or more, but not a large number". you're right; one could probably find two style guides that support your view... > > "... I've found tenacity and authority the overriding "arguments" > > for maintaining the two-space rule. Empirically and financially, the > > one-space rule makes sense." > > Selective quoting of random people blathering at each other doesn't > count as "research" to me. no, but selective quoting can be used to make a point you're too lazy to spell out yourself: most proponents rely on the authority of their typing teacher or their mom.
> > according to vision researchers, humans using their eyes to read > > text don't care much about sentence breaks inside blocks of text > > This reads like a garbled paraphrase; I assume that if you had a real > reference, you would have given it <0.9 wink>. no, it was an attempt to summarize various sources (as I've interpreted them) in a way that could be understood by a bot. as we all know, bots can simply copy bytes from an input device, and don't have to learn how to carefully move their eyes in various intricate patterns... > Like many theories, it sounds logical, but those of us who read old > books or are old enough to remember when typesetting was an art > practiced by people, rather than the result of an algorithm, know > better. Typists were taught to hit two spaces after a period when > typing because typeset material once upon a time used extra space > there. > > This is an interesting instance of a phenomenon that we should all > be aware of: in times of much change, collective cultural amnesia > can occur, and a whole society can forget something that "everyone > knew." the webword page mentions that it was difficult to typeset double spaces on the first linotype machines, and when customers had to pay extra to get double spaces, it quickly became unfashionable: If the operator typed two spaces in a row, you had two wedges next to each other, and that tended to gum up the operation. Clients who insisted could be accommodated by typing an en-space followed by a justifier-space, but printers charged extra for it and ridiculed it as 'French Spacing, oo-la-la, you want it all fancy, huh?' iirc, the linotype was introduced in the 1890's. when did Patricia write that review? ;-) ::: anyway, to end this thread, the only reasonable thing is to do like the "fmt" command, and provide a bunch of options: newtext = string.wrap(text, width=, french_spacing=, split=, prefix=) I'll leave it to Guido to pick suitable defaults.
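The interface that eventually shipped as textwrap is recognizably close to this proposal: width survived as-is, and fix_sentence_endings ended up as the sentence-spacing knob (reading it as a rough stand-in for the french_spacing switch is a loose analogy, not anything stated in the thread). A sketch:

```python
import textwrap

text = ("two spaces after a period were standard on typewriters.  "
        "proportional fonts made the habit obsolete.  or so they say.")

# fix_sentence_endings=True makes the wrapper keep two spaces after
# sentence-ending punctuation when refilling text (roughly the opposite
# sense of a french_spacing flag, which would suppress them)
wrapped = textwrap.fill(text, width=40, fix_sentence_endings=True)
print(wrapped)
```

The defaults Guido was left to pick turned out to be width=70 and fix_sentence_endings=False, i.e. single spacing unless you ask otherwise.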
Cheers /F From mal@lemburg.com Fri Jun 7 18:48:30 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 07 Jun 2002 19:48:30 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net> <3D00DAE5.10203@lemburg.com> <200206071641.g57Gfr826040@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D00F1EE.9030501@lemburg.com> Guido van Rossum wrote: >>>Can you point to a document that records a decision to the contrary? >> >>Garbage collection, weak references, changes in the memory allocation, >>etc. etc. > > > IMO this is 100% FUD. Could be 95% FUD :-), but I do remember that older versions of my extensions broke when weak references were introduced. > The GC does not change the object lay-out. It is only triggered for > types that have a specific flag bit. The changes in the GC API also > changed the flag bit used. Weak references also use a flag bit in the > type object and if the flag bit is on, look at a field in the type > that checks whether there is a weakref pointer in the object struct. > All objects (with public object lay-out) that have had their struct > extended have always done so by appending to the end. > > Tim spent weeks to make the memory allocation code backwards > compatible (with several different versions, binary and source > compatibility). > > As an example, the old Zope ExtensionClass code works fine with Python > 2.2. > >>All these change the binary layout of structs or the semantics >>of memory allocation -- mostly only in slight ways, but to a point >>where 100% binary compatibility is not given anymore. > > > Still, I maintain that most extensions that work with 1.5.2 still work > today without recompilation, if you can live with the API version > change warnings. Try it! That's true for most extensions. Note that I wasn't saying that they all broke... 
distutils is mainly being very careful about the Python version on Windows because the name of the DLL contains the version name and the reference is hard-coded into the extension DLLs. Also, I don't have a problem with recompiling an extension for a new Python version. What's important to me is that the existing code continues to compile and work, not that a compiled version for some old Python version continues to run in a new version (the warnings are unacceptable in a production environment, so there's no point in discussing this). >>Other changes (which I know of) are e.g. the Unicode APIs which >>have changed (they now include UCS2 or UCS4 depending on whether >>you use 16-bit or 32-bit Unicode internally). > > > When you compile with UCS2 it should be backward compatible. No, it's not: 08077f18 T PyUnicodeUCS2_AsASCIIString 0807da34 T PyUnicodeUCS2_AsCharmapString 080760a8 T PyUnicodeUCS2_AsEncodedString 08077b54 T PyUnicodeUCS2_AsLatin1String 0807d84c T PyUnicodeUCS2_AsRawUnicodeEscapeString 08076f64 T PyUnicodeUCS2_AsUTF16String 0807d6b4 T PyUnicodeUCS2_AsUTF8String ... >>I don't think that binary compatibility is all that important; >>it just requires a recompile (and hey, that way you even get >>sub-type support for free ;-). > > > Actually, you don't -- you have to set a flag bit to make your type > subclassable. There are too many things that classic extension types > don't provide. I was referring to the Py_Check() changes. After a recompile they now also accept subtypes. >>Far more difficult to handle are all those minute little changes >>which easily slip the developer's radar. > > > Examples? Just see Raymond's PEP for a list of coding changes over the years. Before this list existed, getting at that information was hard.
Other subtle changes include the bool stuff, things like re starting to fail when it sees multiple definitions of group names, changes to xrange, change of character escaping, new scoping rules, renaming in the C API (Length->Size) etc. etc. >>Luckily this will get approached now by Andrew and Raymond, so >>things are getting much better for us poor souls having to >>live on supporting 3-4 different Python versions :-) > > > I'm not sure which initiative you are referring to. The migration guide. > Or even which > Andrew (I presume you mean Raymond Hettinger)? Andrew Kuchling. Raymond Hettinger is summarizing the new coding style possibilities (and how to write code for older Python versions). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From tim.one@comcast.net Fri Jun 7 18:51:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 13:51:16 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <15616.43504.457473.110331@slothrop.zope.com> Message-ID: [Jeremy Hylton, explains the cells in Neil's example] Thanks! That was helpful. > ... > class A refers to > function __init__ refers to > cell for A refers to > class A > > That's it for what I understand. Where does the singleton tuple containing a cell come from? I guess it must be in function __init__. > It looks like the example code creates two cycles, and one cycle > refers to the other. The first cycle is the one involving the A > instance and the super instance variable. That cycle has a reference > to class A. > > When the garbage collector runs, it determines the first cycle is > garbage. It doesn't determine the second cycle is garbage because it > has an external reference from the first cycle.
The proper understanding of "external" here is wrt all the objects GC tracks. So long as references are *within* that grand set, there are no external references in a relevant sense. "External" means stuff like the reference is due to an untracked container, or to a C variable -- stuff like that. > I presume that the garbage collector can't collect both cycles in one > pass without re-running the update & subtract refs phase after > deleting all the garbage. During the first refs pass, the second > cycle wasn't detected. The second cycle is only collectable after the > first cycle has been collected. I don't think that's it. Here: C:\Code\python\PCbuild>python Python 2.3a0 (#29, Jun 5 2002, 23:17:02) [MSC 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> class A: pass ... >>> class B: pass ... >>> A.a = A # one cycle >>> B.b = B # another >>> A.b = B # point the first cycle at the second >>> import gc >>> gc.collect() 0 >>> gc.set_debug(gc.DEBUG_SAVEALL) >>> del B >>> gc.collect() # A still keeping everything alive 0 >>> del A # Both cycles are trash now >>> gc.collect() # And both are recovered in one pass 4 >>> print gc.garbage [<class __main__.A at ...>, {'a': <class __main__.A at ...>, '__module__': '__main__', 'b': <class __main__.B at ...>, '__doc__': None}, <class __main__.B at ...>, {'__module__': '__main__', 'b': <class __main__.B at ...>, '__doc__': None}] >>> From jeremy@zope.com Fri Jun 7 14:12:33 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 7 Jun 2002 09:12:33 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: References: <15616.43504.457473.110331@slothrop.zope.com> Message-ID: <15616.45377.425270.366203@slothrop.zope.com> >>>>> "TP" == Tim Peters writes: TP> [Jeremy Hylton, explains the cells in Neil's example] Thanks! TP> That was helpful. > ... > class A refers to > function __init__ refers to > cell for A refers to > class A TP> Where does the singleton tuple containing a cell come from? I TP> guess it must be in function __init__.
As Guido mentioned, I was illustrative but avoided being thorough <0.5 wink>. class A refers to its __dict__ refers to function __init__ refers to its func_closure (a tuple of cells) refers to cell for A refers to class A Jeremy From guido@python.org Fri Jun 7 19:04:31 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 14:04:31 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Your message of "Fri, 07 Jun 2002 19:42:13 +0200." <006d01c20e4a$d207a3c0$ced241d5@hagrid> References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> Message-ID: <200206071804.g57I4VT26617@pcp02138704pcs.reston01.va.comcast.net> > 'French Spacing, oo-la-la, you want it all fancy, huh? The really bizarre thing being that in LaTeX, \frenchspacing means *not* to put extra space after a sentence! It also appears right that (human) typesetters did stretch the space between sentences more than the space between words when stretching a line to right-justify it. In order to do that with a computer typesetting program, and to avoid it assuming a sentence ends after other use of periods (e.g. in "Mr. Lundh"), you have to tell it where the sentences end. Double spacing is a convenient convention for that. --Guido van Rossum (home page: http://www.python.org/~guido/) From sholden@holdenweb.com Fri Jun 7 19:19:18 2002 From: sholden@holdenweb.com (Steve Holden) Date: Fri, 7 Jun 2002 14:19:18 -0400 Subject: [Python-Dev] Socket timeout patch References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu> Message-ID: <052101c20e4f$d8153930$7201a8c0@holdenweb.com> [ ... ] [Guido] > > - The original timeout socket code (in Python) by Tim O'Malley had a > > global timeout which you could set so that *all* sockets > > *automatically* had their timeout set. This is nice if you want it > > to affect library modules like urllib or ftplib.
That feature is > > currently missing. Should we add it? (I have some concerns about > > it, in that it might break other code -- and it doesn't seem wise to > > do this for server-end sockets in general. But it's a nice hack for > > smaller programs.) > [Mike] > Is it really so painful for apps to keep track of all their sockets > and then do something like: > > for sock in sock_list: > sock.settimeout (blah) > > Why keep track of them in the socket module, unless there's already code > for this. > It isn't painful, it's impossible (unless you want to revise all the libraries). The real problem comes when a program uses a socket-based library such as smtplib or ftplib. Without the ability to impose a default timeout the library client has no way to set a timeout until the library has created the socket (and even then it will break encapsulation to do so in many cases). Unfortunately, the most common requirement for a timeout is to avoid socket code hanging when it makes the initial attempt to connect to a non-responsive host. Under these circumstances, if the connect() doesn't time out it can apparently be as long as two hours before an exception is raised. [...] regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From fredrik@pythonware.com Fri Jun 7 19:27:14 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 7 Jun 2002 20:27:14 +0200 Subject: [Python-Dev] textwrap.py References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <200206071804.g57I4VT26617@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <00a801c20e50$f5dc6730$ced241d5@hagrid> Guido wrote: > It also appears right that (human) typsetters did stretch the space > between sentences more than the space between words when stretching a > line to right-justify it. 
In order to do that with a computer > typesetting program, and to avoid it assuming a sentence ends after > other use of periods (e.g. in "Mr. Lundh"), you have to tell it where > the sentences end. Double spacing is a convenient convention for > that. or you could use non-breaking spaces... From guido@python.org Fri Jun 7 19:39:18 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 14:39:18 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: Your message of "Fri, 07 Jun 2002 20:27:14 +0200." <00a801c20e50$f5dc6730$ced241d5@hagrid> References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <200206071804.g57I4VT26617@pcp02138704pcs.reston01.va.comcast.net> <00a801c20e50$f5dc6730$ced241d5@hagrid> Message-ID: <200206071839.g57IdIk26862@pcp02138704pcs.reston01.va.comcast.net> > or you could use non-breaking spaces... My keyboard doesn't have one. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Jun 7 19:36:53 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 07 Jun 2002 20:36:53 +0200 Subject: [Python-Dev] Socket timeout patch References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu> <052101c20e4f$d8153930$7201a8c0@holdenweb.com> Message-ID: <3D00FD45.1090306@lemburg.com> >>>- The original timeout socket code (in Python) by Tim O'Malley had a >>> global timeout which you could set so that *all* sockets >>> *automatically* had their timeout set. This is nice if you want it >>> to affect library modules like urllib or ftplib. That feature is >>> currently missing. Should we add it? (I have some concerns about >>> it, in that it might break other code -- and it doesn't seem wise to >>> do this for server-end sockets in general. But it's a nice hack for >>> smaller programs.) Would be nice to have this. 
Programs like Plucker which do a lot of socket work could benefit from it. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From tim.one@comcast.net Fri Jun 7 19:46:35 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 07 Jun 2002 14:46:35 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <006d01c20e4a$d207a3c0$ced241d5@hagrid> Message-ID: [Tim] > Despite that you never bought a shift key, you use two spaces between > sentences. [/F] > that's only to compensate for the lack of uppercase letters. > I can change that, if you wish. Goodness no! I read email in Courier New, and the "extra" spaces do more to make your style readable than would capital letters. > ... > from what I can tell, the three most well-respected style guides > for American English is the Chicago Manual of Style, the MLA style, > and the APA. Ya, I saw all that stuff. As I said at the very start, the "two space" rule doesn't make sense for published works, as proportional fonts, kerning, and the other gimmicks available to real typesetting are sufficient there. I'm solely talking about monospaced fonts. The CMS etc are not. If you follow links deeply enough, you'll find at least one of the authors of these guides "confessing" that they use two spaces in email, so that it's readable in a fixed font. > ... > Publications in the United States today usually have the same > spacing after a punctuation mark as between words on the same > line /.../ Except virtually no publications in the US today use monospaced fonts. > ... > and finally, John Rhodes (of webword fame) has collected lots of arguments > for and against: > > http://www.webword.com/reports/period.html Yes, I read that too. 
His "revelation" at the start is crucial: One of the next things I realized is that, in general, the spacing after a period will be irrelevant since most fonts used today are proportional and goes on to reinforce the point in BOLD whenever he can: ... the current typographic standard for a single space after the period is a reflection of the power of proportionally spaced fonts. Repetitions of this point are ubiquitous all over the web, not just in my email. > ... > iirc, the linotype was introduced in the 1890's. when did Patricia > write that review? ;-) 1996. It's an OK review: http://www.the-efa.org/news/gramglean.html > ... > anyway, to end this thread, the only reasonable thing is to do > like the "fmt" command, and provide a bunch of options: > > newtext = string.wrap(text, width=, french_spacing=, split=, prefix=) > > I'll leave it to Guido to pick suitable defaults. Greg seems to want to do it via setting vrbls on subclasses. I couldn't care less how it's done, so long as I have some way to wrap for readability in a fixed-width font. From guido@python.org Fri Jun 7 20:25:10 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 15:25:10 -0400 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: Your message of "Fri, 07 Jun 2002 19:48:30 +0200." <3D00F1EE.9030501@lemburg.com> References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net> <3D00DAE5.10203@lemburg.com> <200206071641.g57Gfr826040@pcp02138704pcs.reston01.va.comcast.net> <3D00F1EE.9030501@lemburg.com> Message-ID: <200206071925.g57JPB627364@pcp02138704pcs.reston01.va.comcast.net> > Could be 95% FUD :-), but I do remember that older versions of my > extensions broke when weak references were introduced. Could be that you were breaking the rules of course. :-) > Also, I don't have a problem with recompiling an extension for > a new Python version.
What's important to me is that the > existing code continues to compile and work, not that a > compiled version for some old Python version continues > to run in a new version (the warnings are unacceptable in > a production environment, so there's no point in discussing > this). There are two kinds of case to be made for binary compatibility, both involving 3rd party extensions whose maintainer has lost interest. Case one is: it's only available in binary for a given platform (maybe something it's linked with wasn't open source). Case two: the code doesn't compile any more under a new Python version, and the user who wants to use it isn't sufficiently versatile in C to be able to fix it (or has no time). > > When you compile with UCS2 it should be backward compatible. > > No, it's not: > > 08077f18 T PyUnicodeUCS2_AsASCIIString > 0807da34 T PyUnicodeUCS2_AsCharmapString > 080760a8 T PyUnicodeUCS2_AsEncodedString > 08077b54 T PyUnicodeUCS2_AsLatin1String > 0807d84c T PyUnicodeUCS2_AsRawUnicodeEscapeString > 08076f64 T PyUnicodeUCS2_AsUTF16String > 0807d6b4 T PyUnicodeUCS2_AsUTF8String > ... Hm. Maybe only the UCS4 variants should be renamed? Of course, few extensions reference Unicode APIs... > Just see Raymond's PEP for a list of coding changes over the > years. Before this list existed , getting at that information was > hard. But you don't *have* to make any of those changes. That's the whole point of backwards compatibility. > Other subtle changes include the bool stuff, things like re starting > to fail when it sees multiple definitions of group names, changes > to xrange, change of character escaping, new scoping rules, > renaming in the C API (Length->Size) etc. etc. I think we have left the topic of binary compatibility here. :-) > Raymond Hettinger is summarizing the new coding style > possiblities (and how to write code for older Python > versions). Yeah, I'm waiting for him to check it in as PEP 290. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bernie@3captus.com Fri Jun 7 20:17:52 2002 From: bernie@3captus.com (Bernard Yue) Date: Fri, 07 Jun 2002 13:17:52 -0600 Subject: [Python-Dev] Socket timeout patch References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D0106E0.5D8DF0A5@3captus.com> [Guido] > Remaining issues: > > - A test suite. There's no decent test suite for the timeout code. > The file test_timeout.py doesn't test the functionality (as I > discovered when the test succeeded while I had several blunders > in the select code that made everything always time out). [Michael] > Er, hopefully Bernard is still watching this thread as he wrote > the test_timeout.py. He's been pretty quiet though as of late... I'm > willing to rewrite the tests if he doesn't have the time. > > I think the tests should follow the same pattern as the > test_socket.py. While adding my regression tests, I noted that the > general socket test suite could use some re-writing but I didn't feel > it appropriate to tackle it at that point. Perhaps a next patch? Looks like I have missed the war, folks! I will work on the test suite. The original test_timeout.py is incomplete. I actually had problems when writing test cases for accept(), blocking() and makefile(). Guido, you are right on the point, the test suite should work without the timeout code as well. If I'd done that ... As for the scope of the test suite, I would prefer to focus on socket timeout tests for now. Though there will be overlapping tests between the socket timeout tests and the socket tests, we can always merge them later. [Guido] > - Cross-platform testing. It's possible that the cleanup broke things > on some platforms, or that select() doesn't work the same way. I > can only test on Windows and Linux; there is code specific to OS/2 > and RISCOS in the module too.
[Michael] > This was a concern from the beginning but we had some chat on the > dev list and concluded that any system supporting sockets has to > support select or some equivalent (hence the initial reason for using > the select module, although I agree it was expensive). I now have Visual C++ version 6, but still limited to Windows and Linux. I think once we are done with these two platforms, we can ask people to run the tests on other platforms. But I agree with Michael that using the Python select module puts us on the safer side. [Guido] > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > OTOH, a timeout of 0 behaves very similar to nonblocking mode -- > similar enough that a program that uses setblocking(0) would > probably also work when using settimeout(0). I kind of like the > idea of having only a single internal flag value, sock_timeout, > rather than two (sock_timeout and sock_blocking). Agree. [Guido] > - The original timeout socket code (in Python) by Tim O'Malley had a > global timeout which you could set so that *all* sockets > *automatically* had their timeout set. This is nice if you want it > to affect library modules like urllib or ftplib. That feature is > currently missing. Should we add it? (I have some concerns about > it, in that it might break other code -- and it doesn't seem wise to > do this for server-end sockets in general. But it's a nice hack for > smaller programs.) Steve Holden and M.-A. Lemburg have spoken. Bernie From guido@python.org Fri Jun 7 21:03:10 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Jun 2002 16:03:10 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: Your message of "Fri, 07 Jun 2002 13:40:36 EDT."
<20020607134036.C24428@eecs.tufts.edu> References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu> Message-ID: <200206072003.g57K3AJ27544@pcp02138704pcs.reston01.va.comcast.net> [Jeremy, please skip forward to where it says "Stevens" or "Jeremy" and comment.] > Glad to hear that _fileobject works well. I didn't say that. I was hoping it does though. :-) > Is there any benefit to having the file code in C? Some operations (notably pickle.dump() and .load()) only work with real files. Other operations (e.g. printing a large list or dict) can be faster with real files because they don't have to build the str() or repr() of the whole thing as a string first. > I bet the python code isn't that much slower. You're on. Write a benchmark. I notice that httplib uses makefile(), and often reads very small chunks. > It does seem a shame to have to switch > between the two. Maybe one solution is that a makefile should set > blocking back on if a timeout exists? That's not very nice. It could raise an exception. But you could set the timeout on the socket *after* calling makefile(), and then you'd be hosed. But if we always use the Python makefile(), the problem is solved. > That would solve the problem > and is a consistent change since it only checks timeout behavior (= > 2 lines of code). I'd vote for using the python fileobject for > both. Some profiling of the two would be nice if you have the time. I don't, maybe you do? :-) > > I'm also thinking of implementing the socket wrapper (which > > is currently a Python class that has a "real" socket object as an > > instance variable) as a subclass of the real socket class instead. > Er, what's the difference between that and _socketobject in > socket.py? Why not just use the python bit consistently? Making it a subclass should be faster.
But I don't know yet if it can work -- I probably have to change the constructor at the C level to be able to support dup() (which is the other reason for the Python wrapper).

> Is it really so painful for apps to keep track of all their sockets
> and then do something like:
>
> for sock in sock_list:
>     sock.settimeout(blah)
>
> Why keep track of them in the socket module, unless there's already code
> for this.

Steve Holden already answered this one.  Also, you don't have to keep track of all sockets -- you just have to apply the timeout if one is set in a global variable.

> Well, we agreed to do some clean-up in a separate patch but you seem
> anxious to get it in there :)

I am relaxing now, waiting for you to pick up again.

> Well, this was the initial reason to use the selectmodule.c code.
> There's got to be a way to share code between the two for bare access
> to select, since someone else might want to use such functionality one
> day (and this has set the precedent). Why not make a small change to
> selectmodule.c that opens up the code in a C API of some sort? And
> then have select_select use that function internally.

If two modules are both shared libraries, it's really painful to share entry points (see the interface between _ssl.c and socketmodule.c for an example -- it's written down in a large comment block in socketmodule.h).  I really think one should be able to use select() in more than one file.  At least it works on Windows. ;-)

> > > > - I'm not sure that the handling of timeout errors in accept(),
> > > > connect() and connect_ex() is 100% correct (this code sniffs the
> > > > error code and decides whether to retry or not).

> This is how the original timeoutsocket.py did it and it seems
> to be the way to do blocking connects.

I understand all that.  My comment is that your patch changed the control flow in non-blocking mode too.  I think I accidentally fixed it by setting sock_timeout to -1.0 in setblocking().
But I'm not 100% sure, so I'd like this aspect to be unit-tested thoroughly.

> Ok.. Should I merge the test_timeout.py and test_socket.py as well
> then?

No, you can create as many (or as few) unit test files as you need.

> A little off-topic, while I was thinking of restructuring these
> tests, I was wondering what might be the best way to structure a unit
> test where things have to work in separate processes/threads. What
> I'd really like to do is:
>
> * Have the setup function set up server and client sockets
> * Have the tear-down function close them
> * Have some synchronization function or simple message (this is
>   socket specific)
> * Then have a test that has access to both threads and can insert
>   call-backs to run in each thread.
>
> This seems tricky with the current unit-test framework. The way I'd
> do it is using threading.Event() and have the thing block until the
> server-side and client-side respectively submit their test callbacks.
> But it might be nice to have a general class that can be added to the
> testing framework. Thoughts?

If I were you I'd worry about getting it right once first.  Then we can see if there's room for generalization.  (You might want to try to convert test_socketserver.py to your proposed framework to see how well it works.)

> > Can you explain why on Windows you say that the socket is connected
> > when connect() returns a WSAEINVAL error?

> This is what timeoutsocket.py used as the unix equivalent error
> codes, and since I'm not set up to test windows and since it was
> working code, I took their word for it.

Well, but WSAEINVAL can also be returned for other conditions.  See http://msdn.microsoft.com/library/en-us/winsock/wsapiref_8m7m.asp -- it seems that the *second* time you call connect(), WSAEINVAL can only mean that you're already connected.
But if this socket is in a different state, and the connect() is simply not appropriate, I don't like the fact that connect() would simply return "success" rather than reporting an error.  E.g. I could do this:

s = socket()
s.settimeout(100)
s.connect((host, port))
.
.
.
# By mistake:
s.connect((otherhost, otherport))

I want the latter connect() to fail, but I think your code will make it succeed.

> > Also, your code enters this block even in non-blocking mode, if a
> > timeout was also set.  (Fortunately I fixed this for you by setting
> > the timeout to -1.0 in setblocking().  Unfortunately there's still a
> > later test for !sock_blocking in the same block that cannot succeed
> > any more because of that.)

> This confusion is arising because of the restructuring of the code.
> Erm, this check applies if we have a timeout but are in non-blocking
> mode. Perhaps you changed this? To make it clearer, originally before
> the v2 of the patch, the socket was always in non-blocking mode, so
> it was necessary to check whether we were examining error code with
> non-blocking in mind, or whether we were checking for possible timeout
> behavior. Since we've changed this, it now checks if non-blocking has
> been set while a timeout has been set. Seems valid to me...

But I changed that again: setblocking() now always disables the timeout.  Read the new source in CVS.

> > The same concerns apply to connect_ex() and accept(), which have very
> > similar logic.
> >
> > I believe it is possible on some Unix variants (maybe on Linux) that
> > when select indicates that a socket is ready, if the socket is in
> > nonblocking mode, the call will return an error because some kernel
> > resource is unavailable.  This suggests that you may have to keep the
> > socket in blocking mode except when you have to do a connect() or
> > accept() (for which you can't do a select without setting the socket
> > in nonblocking mode first).

> Not sure about this.
Checking the man pages, the error codes seem > to be the thing to check to determine what the behavior is. Perhaps > you could clarify? When a timeout is set, the socket file descriptor is always in nonblocking mode. Take sock_recv() for example. It calls internal_select(), and if that returns >= 1, it calls recv(). But according to the Stevens books, it is still possible (under heavy load) that the recv() returns an EWOULDBLOCK error. (We ran into this while debugging a high-performance application based on Spread. The select() succeeded, but the recv() failed, because the socket was in nonblocking mode. Well, I'm *almost* sure that this was the case -- Jeremy Hylton should know the details.) > > Looking at connect_ex, it seems to be missing the "res = errno" bit > > in the case where it says "we're already connected". It used to > > return errno here, now it will return -1. Maybe the conex_finally > > label should be moved up to before the "if (res != 0) {" line? > > Ah yes. I didn't look closely enough at the windows bit. On linux > it isn't necessary. Let's move it up. OK, done. > > OTOH, a timeout of 0 behaves very similar to nonblocking mode -- > > similar enough that a program that uses setblocking(0) would probably > > also work when using settimeout(0). I kind of like the idea of having > > only a single internal flag value, sock_timeout, rather than two > > (sock_timeout and sock_blocking). > > But one throws an exception and one doesn't. What do you mean? In nonblocking mode you get an exception when the socket isn't ready too. > It seems to me that > setting a timeout of 0 is sort of an error, if anything. It'll be an > easy way to do a superficial test of the functionality in the regr > test. OK, we don't seem to be able to agree on this. I'll let your wisdom prevail. > > The tests don't look very systematic. 
> > There are many cases (default
> > bufsize, unbuffered, bufsize=1, large bufsize; combine with read(),
> > readline(), read a line larger than the buffer size, etc.).  We need a
> > more systematic approach to unit testing here.

> Ok, so to recap which tests we want:
>
> * Default read()
> * Read with size given
> * unbuffered read
> * large buffer size
> * Mix read and readline
> * Do a readline
> * Do a readline larger than buffer size.
>
> Any others in the list?

Check the socket.py source code and make sure that every code path through every method is taken at least once.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Fri Jun 7 21:14:37 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 16:14:37 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Fri, 07 Jun 2002 13:17:52 MDT." <3D0106E0.5D8DF0A5@3captus.com>
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <3D0106E0.5D8DF0A5@3captus.com>
Message-ID: <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>

> Looks like I have missed the war, folks!  I will work on the test
> suite.  The original test_timeout.py is incomplete.  I actually had
> problems when writing test cases for accept(), using blocking() and
> makefile().  Guido, you are right on the point, the test suite
> should work without the timeout code as well.  If I've done that ...
>
> As for the scope of the test suite, I would prefer to focus on socket
> timeout test for now.  Though there will be overlapping test for socket
> timeout test and socket test, we can always merge it later.

Thanks, Bernie!

> I now have Visual C++ version 6, but am still limited to Windows and
> Linux.  I think once we are done with these two platforms, we can ask
> people to run the test on other platforms.

Good idea.
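For what it's worth, the test list above is really a cross product of buffer sizes and read styles, and could be generated mechanically rather than enumerated by hand. A minimal sketch in today's Python (all names here are hypothetical, purely illustrative):

```python
import itertools

# Hypothetical matrix of file-object test cases: buffer sizes crossed
# with read styles.  None means "default bufsize", 0 means unbuffered.
BUFSIZES = [None, 0, 1, 8192, 100000]
READ_STYLES = [
    "read_all",               # plain read()
    "read_sized",             # read(n)
    "readline",               # readline()
    "readline_over_bufsize",  # a line longer than the buffer
    "mixed",                  # interleave read() and readline()
]

def test_matrix():
    """Return every (bufsize, read style) combination exactly once."""
    return list(itertools.product(BUFSIZES, READ_STYLES))
```

Each pair would then drive one makefile()-based test case, so adding a new buffer size or read style automatically multiplies the coverage.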
> But I agreed with Michael that using python select module put us on > the safer side. But it's too slow. > [Guido] > > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)? > > > OTOH, a timeout of 0 behaves very similar to nonblocking mode -- > > similar enough that a program that uses setblocking(0) would > > probably also work when using settimeout(0). I kind of like the > > idea of having only a single internal flag value, sock_timeout, > > rather than two (sock_timeout and sock_blocking). > > Agree. Hm, finally someone who agrees with me on this. ;-) > Steve Holden and M.-A. Lemburg have spoken. Can I expect a patch from you or Michael? --Guido van Rossum (home page: http://www.python.org/~guido/) From mgilfix@eecs.tufts.edu Fri Jun 7 21:32:32 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Fri, 7 Jun 2002 16:32:32 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <3D00FD45.1090306@lemburg.com>; from mal@lemburg.com on Fri, Jun 07, 2002 at 08:36:53PM +0200 References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu> <052101c20e4f$d8153930$7201a8c0@holdenweb.com> <3D00FD45.1090306@lemburg.com> Message-ID: <20020607163232.D24428@eecs.tufts.edu> I stand corrected. Seems like people want the feature... On Fri, Jun 07 @ 20:36, M.-A. Lemburg wrote: > >>>- The original timeout socket code (in Python) by Tim O'Malley had a > >>> global timeout which you could set so that *all* sockets > >>> *automatically* had their timeout set. This is nice if you want it > >>> to affect library modules like urllib or ftplib. That feature is > >>> currently missing. Should we add it? (I have some concerns about > >>> it, in that it might break other code -- and it doesn't seem wise to > >>> do this for server-end sockets in general. But it's a nice hack for > >>> smaller programs.) 
> > Would be nice to have this.  Programs like Plucker which do a lot
> of socket work could benefit from it.
>
> --
> Marc-Andre Lemburg
> CEO eGenix.com Software GmbH
> ______________________________________________________________________
> Company & Consulting:           http://www.egenix.com/
> Python Software:                http://www.egenix.com/files/python/
> Meet us at EuroPython 2002:     http://www.europython.org/
>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev

`-> (mal)

--
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

From mgilfix@eecs.tufts.edu Fri Jun 7 21:41:55 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 16:41:55 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 04:14:37PM -0400
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <3D0106E0.5D8DF0A5@3captus.com> <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607164154.F24428@eecs.tufts.edu>

If no one's taken this after I finish the rewrite of test_socket.py, I'll tackle this.  So, you'll have either Bernard or me on it.

-- Mike

On Fri, Jun 07 @ 16:14, Guido van Rossum wrote:
> > Steve Holden and M.-A. Lemburg have spoken.
>
> Can I expect a patch from you or Michael?
> > --Guido van Rossum (home page: http://www.python.org/~guido/)

`-> (guido)

--
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

From bernie@3captus.com Fri Jun 7 21:32:39 2002
From: bernie@3captus.com (Bernard Yue)
Date: Fri, 07 Jun 2002 14:32:39 -0600
Subject: [Python-Dev] Socket timeout patch
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <3D0106E0.5D8DF0A5@3captus.com> <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D011867.EBC5470@3captus.com>

Guido van Rossum wrote:
>
> > But I agree with Michael that using the Python select module puts us on
> > the safer side.
>
> But it's too slow.

Well, if we have to use native select(), can I assume that there will only be three cases, namely UNIX, Windows and BeOS (looks like that's the case from selectmodule.c)?  Assuming the above is true, what needs to be done is to create a C API from select_select() so that socketmodule.c can use it.  Is that correct?

> > Steve Holden and M.-A. Lemburg have spoken.
>
> Can I expect a patch from you or Michael?

Let's see where we are when I've finished the test suite.  I'll do it if it still hasn't been done.
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Bernie

From mgilfix@eecs.tufts.edu Fri Jun 7 21:40:22 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 16:40:22 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <3D0106E0.5D8DF0A5@3captus.com>; from bernie@3captus.com on Fri, Jun 07, 2002 at 01:17:52PM -0600
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <3D0106E0.5D8DF0A5@3captus.com>
Message-ID: <20020607164022.E24428@eecs.tufts.edu>

On Fri, Jun 07 @ 13:17, Bernard Yue wrote:
> Looks like I have missed the war, folks!  I will work on the test
> suite.  The original test_timeout.py is incomplete.  I actually had
> problems when writing test cases for accept(), using blocking() and
> makefile().  Guido, you are right on the point, the test suite
> should work without the timeout code as well.  If I've done that ...
>
> As for the scope of the test suite, I would prefer to focus on socket
> timeout test for now.  Though there will be overlapping test for socket
> timeout test and socket test, we can always merge it later.

Sounds good.  I'll work on rewriting test_socket.py, which needs to be done anyway to better test the _fileobject in Windows -- especially if we decide to adopt that later.  I'll run some profiling tests and we'll see how painful it is.  That way Guido can smack me appropriately.  I'll probably be able to draft one up tomorrow (I doubt this evening).

> > [Guido]
> > - Cross-platform testing.  It's possible that the cleanup broke things
> >   on some platforms, or that select() doesn't work the same way.  I
> >   can only test on Windows and Linux; there is code specific to OS/2
> >   and RISCOS in the module too.
> > [Michael]
> > This was a concern from the beginning but we had some chat on the
> > dev list and concluded that any system supporting sockets has to
> > support select or some equivalent (hence the initial reason for using
> > the select module, although I agree it was expensive).

> I now have Visual C++ version 6, but am still limited to Windows and
> Linux.  I think once we are done with these two platforms, we can ask
> people to run the test on other platforms.  But I agree with Michael
> that using the Python select module puts us on the safer side.

Great.  We need some more win testing since it's so much different from *nix.

-- Mike

--
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

From gward@python.net Fri Jun 7 22:33:29 2002
From: gward@python.net (Greg Ward)
Date: Fri, 7 Jun 2002 17:33:29 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>
References: <3D008DA0.20248.6AD00DBA@localhost> <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607213329.GA21836@gerg.ca>

On 07 June 2002, Guido van Rossum said:
> True, but then there needs to be a way to enable/disable it, since
> even if you never use two spaces after a period, the rule can still
> generate them for you in the output: when an input sentence ends at
> the end of a line but the output sentence doesn't, the rule will
> translate the newline into two spaces instead of one.
>
> I vote to have it off by default.

Sounds about right to me.  Reading this thread has revealed that 1) I was correct to add sentence-ending-detection code, 2) I missed a few subtle details (eg. my code will change "Dr. Frankenstein" to "Dr.  Frankenstein" -- d'ohh!), and 3) the programmer must be able to select whether she wants to use it.

Greg

--
Greg Ward - Unix bigot
gward@python.net
http://starship.python.net/~gward/

A day for firm decisions!!!!!  Or is it?
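(As a historical aside: the option under discussion ended up in the standard textwrap module as fix_sentence_endings, and the abbreviation pitfall Greg describes is easy to reproduce there. A small demonstration in today's Python:)

```python
import textwrap

w = textwrap.TextWrapper(width=60, fix_sentence_endings=True)

# A real sentence ending gets the two-space treatment...
print(w.fill("It works. Really."))

# ...but so does an abbreviation, since "Dr." looks like a
# lowercase-letter-plus-period sentence ending to the detector.
print(w.fill("Ask Dr. Frankenstein now."))
```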
From gward@python.net Fri Jun 7 22:39:47 2002
From: gward@python.net (Greg Ward)
Date: Fri, 7 Jun 2002 17:39:47 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To:
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid>
Message-ID: <20020607213947.GB21836@gerg.ca>

On 07 June 2002, Tim Peters said:
> Greg seems to want to do it via setting vrbls on subclasses.  I couldn't
> care less how it's done, so long as I have some way to wrap for readability
> in a fixed-width font.

No subclass required -- just an instance:

wrapper = TextWrapper()
wrapper.fix_sentence_endings = 0
wrapper.wrap(...)

Not sure if "fix_sentence_endings" is the right spelling, but it'll do for now.

no-i-do-NOT-know-what-a-strategy-class-is, Greg

--
Greg Ward - Unix weenie
gward@python.net
http://starship.python.net/~gward/

I just heard the SEVENTIES were over!!  And I was just getting in touch with my LEISURE SUIT!!

From paul@prescod.net Fri Jun 7 22:59:53 2002
From: paul@prescod.net (Paul Prescod)
Date: Fri, 07 Jun 2002 14:59:53 -0700
Subject: [Python-Dev] textwrap.py
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <20020607213947.GB21836@gerg.ca>
Message-ID: <3D012CD9.89D40523@prescod.net>

Greg Ward wrote:
>
> On 07 June 2002, Tim Peters said:
> > Greg seems to want to do it via setting vrbls on subclasses.  I couldn't
> > care less how it's done, so long as I have some way to wrap for readability
> > in a fixed-width font.
>
> No subclass required -- just an instance:
>
> wrapper = TextWrapper()
> wrapper.fix_sentence_endings = 0
> wrapper.wrap(...)
>
> Not sure if "fix_sentence_endings" is the right spelling, but it'll do
> for now.

Why three statements instead of one expression?

textwrap.wrap_my_text(text, fix_sentence_endings = 0)

If you want to do class-y stuff internally, then go ahead.  But wrapping text is a stateless mathematical function with a domain and range.  I'd prefer function syntax.
Paul Prescod From gward@python.net Fri Jun 7 23:06:40 2002 From: gward@python.net (Greg Ward) Date: Fri, 7 Jun 2002 18:06:40 -0400 Subject: [Python-Dev] textwrap.py In-Reply-To: <3D012CD9.89D40523@prescod.net> References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> Message-ID: <20020607220640.GA21975@gerg.ca> On 07 June 2002, Paul Prescod said: > Why three statements instead of one expression? > > textwrap.wrap_my_text(text, fix_sentence_endings = 0) > > If you want to do class-y stuff internally, then go ahead. But wrapping > text is a stateless mathematical function with a domain and range. I'd > prefer function syntax. Yeah, me too. But there are an unbounded number of possible options that people might insist on, and making these options instance attributes seems vaguely friendly to subclasses to me. These are both wild, unproven allegations, of course. Patches welcome. Greg -- Greg Ward - geek gward@python.net http://starship.python.net/~gward/ I just read that 50% of the population has below median IQ! From paul@prescod.net Sat Jun 8 02:24:40 2002 From: paul@prescod.net (Paul Prescod) Date: Fri, 07 Jun 2002 18:24:40 -0700 Subject: [Python-Dev] textwrap.py References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> Message-ID: <3D015CD8.DE7C4BC2@prescod.net> Greg Ward wrote: > >... > > Yeah, me too. But there are an unbounded number of possible options > that people might insist on, and making these options instance > attributes seems vaguely friendly to subclasses to me. I don't follow. If I want a subclass then I need to instantiate it somehow. When I do, I'll call its constructor. I'll pass its constructor the keyword arguments that the subclass expects. 
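Paul's point about subclass constructors can be made concrete with today's textwrap module. The subclass and its prefix option below are hypothetical, purely for illustration: extra options arrive as keyword arguments, and everything the subclass doesn't recognize is forwarded to the base constructor.

```python
import textwrap

class PrefixWrapper(textwrap.TextWrapper):
    """Hypothetical subclass: adds one option of its own (prefix)
    and passes every other keyword argument through to TextWrapper."""
    def __init__(self, prefix="", **kwargs):
        super().__init__(**kwargs)   # width=, fix_sentence_endings=, ...
        self.prefix = prefix

    def fill(self, text):
        # Prepend the prefix to the wrapped result.
        return self.prefix + super().fill(text)

w = PrefixWrapper(prefix="> ", width=40)
print(w.fill("hello world"))
```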
Paul Prescod From aahz@pythoncraft.com Sat Jun 8 02:40:33 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 7 Jun 2002 21:40:33 -0400 Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020608014033.GA9625@panix.com> On Fri, Jun 07, 2002, Guido van Rossum wrote: > > - The original timeout socket code (in Python) by Tim O'Malley had a > global timeout which you could set so that *all* sockets > *automatically* had their timeout set. This is nice if you want it > to affect library modules like urllib or ftplib. That feature is > currently missing. Should we add it? (I have some concerns about > it, in that it might break other code -- and it doesn't seem wise to > do this for server-end sockets in general. But it's a nice hack for > smaller programs.) +1 -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I had lots of reasonable theories about children myself, until I had some." --Michael Rios From martin@v.loewis.de Sat Jun 8 06:49:34 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 08 Jun 2002 07:49:34 +0200 Subject: [Python-Dev] Changing ob_size to [s]size_t In-Reply-To: <3D00CD79.6050409@lemburg.com> References: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > What binary compatibility ? The binary compatibility of extension modules across Python releases. That is not available on Windows, but it is available on Unix. Regards, Martin From mgilfix@eecs.tufts.edu Sat Jun 8 22:26:28 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Sat, 8 Jun 2002 17:26:28 -0400 Subject: [Python-Dev] unittest and sockets. Ugh!? 
Message-ID: <20020608172627.D9486@eecs.tufts.edu>

Could someone please offer some advice/comments on this?  While restructuring test_socket.py, I've come across the following (limitation?) problem with the unittest module.  To test socket stuff, I really need to use two separate processes or threads.  Since fork() is much better supported, here's my attempt at porting the _fileobject tests into unittest.  I really like the structure of the test (hopefully you guys agree) and this is how I'd like to lay it out ideally for testing things like accept/connect, etc, using the various layers of my inheritance hierarchy.

However, this test won't work because I get an error binding to the socket in the setUp function because the socket is already taken.  Is this because unittest dispatches the tests at roughly the same time?  I'm not quite sure why this is failing in sequence (perhaps I'm missing something).  In addition, I thought that the setUp/tearDown functions were shared between all tests within a class, not called for each test, but this does not seem true.  If I want the setup to be shared between all tests, do I have to override __init__?

Another issue is that unittest doesn't seem to like that I've forked.  It considers it to be the equivalent of two tests.  Perhaps I shouldn't care, provided that they all pass anyway.  Any other stuff that I've seen that uses forking/threading doesn't seem to use the unittest style framework.  Perhaps I shouldn't be using this and should just write outside of it?  That would be a shame since I like many of the features of the framework but some seem limiting.

-- Mike

=======================================================================

#!/usr/bin/env python

import unittest
import test_support
import socket
import os
import time

PORT = 50007
HOST = 'localhost'

class SocketTest(unittest.TestCase):

    def setUp(self):
        canfork = hasattr(os, 'fork')
        if not canfork:
            raise test_support.TestSkipped, \
                  "Platform does not support forking."
        # Use this to figure out who we are in the tests
        self.parent = os.fork()
        if self.parent:
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            self.s.bind((HOST, PORT))
            self.s.listen(1)
        else:
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            time.sleep(1)  # So we can catch up

    def tearDown(self):
        self.s.close()
        self.s = None

class SocketConnectedTest(SocketTest):

    SYNCH_MSG = 'Michael Gilfix was here'

    def setUp(self):
        SocketTest.setUp(self)
        if self.parent:
            conn, addr = self.s.accept()
            self.conn = conn
        else:
            self.s.connect((HOST, PORT))
            self.conn = self.s

    def tearDown(self):
        if self.parent:
            self.conn.close()
            self.conn = None
        SocketTest.tearDown(self)

    def synchronize(self):
        time.sleep(1)
        if self.parent:
            msg = self.conn.recv(len(self.SYNCH_MSG))
            self.assertEqual(msg, self.SYNCH_MSG,
                             "Parent synchronization error")
            self.conn.send(msg)
        else:
            self.conn.send(self.SYNCH_MSG)
            msg = self.conn.recv(len(self.SYNCH_MSG))
            self.assertEqual(msg, self.SYNCH_MSG,
                             "Child synchronization error")
        time.sleep(1)

class FileObjectClassTestCase(SocketConnectedTest):

    def setUp(self):
        SocketConnectedTest.setUp(self)
        # Create a file object for both the parent/client processes
        self.f = socket._fileobject(self.conn, 'rb', 8192)

    def tearDown(self):
        self.f.close()
        SocketConnectedTest.tearDown(self)

    def testSmallRead(self):
        """Performing small read test."""
        if self.parent:
            first_seg = self.f.read(7)
            second_seg = self.f.read(25)
            msg = ''.join((first_seg, second_seg))
            self.assertEqual(msg, self.SYNCH_MSG,
                             "Error performing small read.")
        else:
            self.f.write(self.SYNCH_MSG)
            self.f.flush()

    def testUnbufferedRead(self):
        """Performing unbuffered read test."""
        if self.parent:
            buf = ''
            while 1:
                char = self.f.read(1)
                self.failIf(not char,
                            "Error performing unbuffered read.")
                buf += char
                if buf == self.SYNCH_MSG:
                    break
        else:
            self.f.write(self.SYNCH_MSG)
            self.f.flush()

    def testReadline(self):
        """Performing readline test."""
        if self.parent:
            line = self.f.readline()
            self.assertEqual(line, self.SYNCH_MSG,
                             "Error performing readline.")
        else:
            self.f.write(self.SYNCH_MSG)
            self.f.flush()

def suite():
    suite = unittest.TestSuite()
    suite.addTest(unittest.makeSuite(FileObjectClassTestCase))
    return suite

if __name__ == '__main__':
    unittest.main()

--
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

From gward@python.net Sun Jun 9 01:27:22 2002
From: gward@python.net (Greg Ward)
Date: Sat, 8 Jun 2002 20:27:22 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D015CD8.DE7C4BC2@prescod.net>
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> <3D015CD8.DE7C4BC2@prescod.net>
Message-ID: <20020609002722.GA3750@gerg.ca>

On 07 June 2002, Paul Prescod said:
> Greg Ward wrote:
> >
> >...
> >
> > Yeah, me too.  But there are an unbounded number of possible options
> > that people might insist on, and making these options instance
> > attributes seems vaguely friendly to subclasses to me.
>
> I don't follow. If I want a subclass then I need to instantiate it
> somehow. When I do, I'll call its constructor. I'll pass its constructor
> the keyword arguments that the subclass expects.
Compromise: the TextWrapper constructor now looks like this:

    def __init__ (self,
                  expand_tabs=True,
                  replace_whitespace=True,
                  fix_sentence_endings=False,
                  break_long_words=True):
        self.expand_tabs = expand_tabs
        self.replace_whitespace = replace_whitespace
        self.fix_sentence_endings = fix_sentence_endings
        self.break_long_words = break_long_words

Good enough?  I'm happy with it.

Greg

--
Greg Ward - Unix geek
gward@python.net
http://starship.python.net/~gward/

Question authority!

From tim.one@comcast.net Sun Jun 9 02:00:40 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 08 Jun 2002 21:00:40 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: <20020608172627.D9486@eecs.tufts.edu>
Message-ID:

[Michael Gilfix]
> Could someone please offer some advice/comments on this?  While
> restructuring test_socket.py, I've come across the following
> (limitation?) problem with the unittest module. To test socket stuff,
> I really need to use two separate processes or threads. Since
> fork() is much better supported,

Better supported than what?  Threads?  No way.  If you use fork(), the test won't run at all except on Unixish systems.  If you use threads, it will run just about everywhere.  Use threads.

Alas, I have no idea what unittest does in the presence of fork or threads, and no desire to learn .

> ...
> Any other stuff that I've seen that uses forking/threading doesn't
> seem to use the unittest style framework.

The existing fork and thread tests almost all long predate the invention of unittest.  Frankly, I find that the layers of classes in elaborate unittests ensure I almost always spend more time trying to understand what a failing unittest *thinks* it's trying to do, and fixing what turn out to be bad assumptions, than in fixing actual bugs in the stuff it's supposed to be testing.
My coworkers do not, and PythonLabs has done several projects now at Zope Corp now that try to mix unittest with multiple threads and processes in the *app* being tested. Even that much is a never-ending nightmare. Then again, I feel this more acutely than them because the tests always fail on Windows -- or any other platform where the timing is 1% different . no-easy-answers-here-ly y'rs - tim From mgilfix@eecs.tufts.edu Sun Jun 9 02:43:41 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Sat, 8 Jun 2002 21:43:41 -0400 Subject: [Python-Dev] unittest and sockets. Ugh!? In-Reply-To: ; from tim.one@comcast.net on Sat, Jun 08, 2002 at 09:00:40PM -0400 References: <20020608172627.D9486@eecs.tufts.edu> Message-ID: <20020608214341.G9486@eecs.tufts.edu> On Sat, Jun 08 @ 21:00, Tim Peters wrote: > Better supported than what? Threads? No way. If you use fork(), the test > won't run at all except on Unixish systems. If you use threads, it will run > just about everywhere. Use threads. Will do. I would have much rather used threads to begin with in fact. I just assumed that the reason the socket module used fork to begin with is because it was considered more portable. Well, you know that thing about assuming makes an ass-out-of-u-and-me. > Alas, I have no idea what unittest does in the presence of fork or threads, > and no desire to learn . I'll just change it to threads happily and find out :) > > ... > > Any other stuff that I've seen that uses forking/threading doesn't > > seem to use the unittest style framework. > > The existing fork and thread tests almost all long predate the invention of > unittest. Frankly, I find that the layers of classes in elaborate unittests > ensure I almost always spend more time trying to understand what a failing > unittest *thinks* it's trying to do, and fixing what turn out to be bad > assumptions, than in fixing actual bugs in the stuff it's supposed to be > testing. 
Combining that artificial complexity with the inherent complexity > of multiple processes or threads is something I instinctively shy away from. I would agree in some respects. When I first started looking at unittest, I thought it seemed more complicated than it was worth, and I'm sure the advanced features are. I don't find the documentation very good at describing just what I needed to get going - at least not up to par with, for example, the xml.minidom documentation, which gets you going in 5 minutes. I just haven't made up my mind yet about what's bugging me, and maybe I'll have more insight after the process. However, after trying it a bit, I've decided that I really like the format/layout and it's quite convenient. I'm just not sure what it can and can't do yet. > My coworkers do not, and PythonLabs has done several projects now at Zope > Corp now that try to mix unittest with multiple threads and processes in the > *app* being tested. Even that much is a never-ending nightmare. Then > again, I feel this more acutely than them because the tests always fail on > Windows -- or any other platform where the timing is 1% different . Well, it's easier to envision with sockets, where the timing issues are easier to sort out. But well-written tests are a blessing, and the more I look at the Python regression suite, the more I realize that it is lacking . > no-easy-answers-here-ly y'rs - tim agreeingly-and-I-hope-I-don't-pick-up-this-habit-ly y'rs -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From guido@python.org Sun Jun 9 03:10:44 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 08 Jun 2002 22:10:44 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "Fri, 07 Jun 2002 12:31:44 EDT." Message-ID: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> Offline, Jeremy, Neil, Tim & I figured out what was really the case here.
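The failure analyzed in this thread involves a cycle between a new-style class and one of its instances. Such a cycle is easy to build; here is an illustrative sketch against the modern gc module API (the helper name is invented), showing that only the cyclic collector, not plain reference counting, can reclaim it:

```python
import gc

def make_class_instance_cycle():
    class A(object):
        pass
    A.a = A()   # the instance references A via its type; A's dict references the instance
    return A

gc.collect()                      # start from a clean slate
cls = make_class_instance_cycle()
del cls                           # drop the only external reference; the cycle remains
found = gc.collect()              # only the collector can break the class<->instance cycle
print(found > 0)
```

After del, neither object's refcount reaches zero (each keeps the other alive), so the collector must find the cycle unreachable and reclaim it.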
I believe the last thing we reported here was that we'd found a sample program that required several gc.collect() calls to clean out all its garbage, which surprised Neil and Tim. Since they know the collector inside out, if it surprises them, there's probably a bug.

Here's the analysis. A new-style instance increfs its class (its ob_type), to make sure that the class doesn't go away as long as the instance exists. But the tp_traverse handler for new-style instances didn't visit the ob_type reference, so the collector thinks there are "outside" references to the class, and doesn't collect it. When the last instance goes away, the refcnt to the class finally matches what the collector sees, and it can collect the class (once all other references to it are gone of course).

I was tempted to say "this ain't so bad, let's keep it that way." But Jeremy and Tim came up with counterexamples involving a cycle between a class and its instance, where the cycle would never be broken. Tim found that this program grows without bounds, though ever slower, since it spends more and more time in the 2nd generation collector, where all the uncollected objects eventually end up:

    while 1:
        class A(object):
            pass
        A.a = A()

I tried the simplest possible fix, which was to visit self->ob_type in the new-style instance tp_traverse handler (subtype_traverse() in typeobject.c). But this caused assertions to fail all over the place. It turns out that when the collector decides to break a cycle like this, it calls the tp_clear handler for each object in the cycle, and then the subsequent deletion of the instance references the type in ways that have been made invalid by the clearing of the type. So this was a dead end.

So here's a patch that does do the right thing. It adds a new field to type objects, tp_dependents. This is essentially a second reference count, counting the instances and direct subclasses.
As long as tp_dependents is nonzero, this means there are still instances or subclasses, and then the type's tp_clear handler doesn't do anything. A consequence of the patch is that there will always be examples that take more than one collection to clean their garbage -- but eventually all garbage will be cleared out. (I suppose a worst-case example would be a very long chain of subclasses, which would be cleared out one class per collection.) Consequently, I'm patching test_gc.py to get rid of garbage left behind by previous tests in a loop.

A downside of the patch is that it adds a new field to the type object structure; I believe this prevents it from being a backport candidate to 2.2.2. For half a blissful hour I believed that it would be possible to do something much simpler by doubling the regular refcnt; then I realized that the double refcnt would mean the collector would never break the cycle. Still, I am wishing for a solution that avoids adding a new field.

Please comment on this patch!

Index: Include/object.h =================================================================== RCS file: /cvsroot/python/python/dist/src/Include/object.h,v retrieving revision 2.101 diff -c -r2.101 object.h *** Include/object.h 12 Apr 2002 01:57:06 -0000 2.101 --- Include/object.h 8 Jun 2002 13:09:14 -0000 *************** *** 292,297 **** --- 292,298 ---- PyObject *tp_cache; PyObject *tp_subclasses; PyObject *tp_weaklist; + int tp_dependents; #ifdef COUNT_ALLOCS /* these must be last and never explicitly initialized */ Index: Objects/typeobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v retrieving revision 2.148 diff -c -r2.148 typeobject.c *** Objects/typeobject.c 4 Jun 2002 19:52:53 -0000 2.148 --- Objects/typeobject.c 8 Jun 2002 13:09:15 -0000 *************** *** 218,225 **** memset(obj, '\0', size); !
if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) Py_INCREF(type); if (type->tp_itemsize == 0) PyObject_INIT(obj, type); --- 218,227 ---- memset(obj, '\0', size); ! if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_INCREF(type); + type->tp_dependents++; + } if (type->tp_itemsize == 0) PyObject_INIT(obj, type); *************** *** 290,295 **** --- 292,303 ---- } } + if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { + int err = visit((PyObject *)type, arg); + if (err) + return err; + } + if (basetraverse) return basetraverse(self, visit, arg); return 0; *************** *** 464,469 **** --- 472,478 ---- /* Can't reference self beyond this point */ if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_DECREF(type); + type->tp_dependents--; } } *************** *** 1170,1175 **** --- 1179,1192 ---- Py_INCREF(base); type->tp_base = base; + /* Incref the bases' tp_dependents count */ + for (i = 0; i < nbases; i++) { + PyTypeObject *b; + b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i); + if (PyType_Check(b) && (b->tp_flags & Py_TPFLAGS_HEAPTYPE)) + b->tp_dependents++; + } + /* Initialize tp_dict from passed-in dict */ type->tp_dict = dict = PyDict_Copy(dict); if (dict == NULL) { *************** *** 1431,1441 **** --- 1448,1475 ---- static void type_dealloc(PyTypeObject *type) { + PyObject *bases; etype *et; /* Assert this is a heap-allocated type object */ assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); _PyObject_GC_UNTRACK(type); + + /* Decref the bases' tp_dependents count */ + bases = type->tp_bases; + if (bases) { + int i, nbases; + assert(PyTuple_Check(bases)); + nbases = PyTuple_GET_SIZE(bases); + for (i = 0; i < nbases; i++) { + PyTypeObject *b; + b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i); + if (PyType_Check(b) && + (b->tp_flags & Py_TPFLAGS_HEAPTYPE)) + b->tp_dependents--; + } + } + PyObject_ClearWeakRefs((PyObject *)type); et = (etype *)type; Py_XDECREF(type->tp_base); *************** *** 1495,1502 **** etype *et; int err; ! if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! 
return 0; et = (etype *)type; --- 1529,1535 ---- etype *et; int err; ! assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); et = (etype *)type; *************** *** 1524,1533 **** type_clear(PyTypeObject *type) { etype *et; ! PyObject *tmp; ! if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! return 0; et = (etype *)type; --- 1557,1583 ---- type_clear(PyTypeObject *type) { etype *et; ! PyObject *tmp, *bases; ! assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); ! ! if (type->tp_dependents) ! return 0; /* Not yet, there are still instances */ ! ! /* Decref the bases' tp_dependents count */ ! bases = type->tp_bases; ! if (bases) { ! int i, nbases; ! assert(PyTuple_Check(bases)); ! nbases = PyTuple_GET_SIZE(bases); ! for (i = 0; i < nbases; i++) { ! PyTypeObject *b; ! b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i); ! if (PyType_Check(b) && ! (b->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! b->tp_dependents--; ! } ! } et = (etype *)type; *************** *** 1754,1763 **** --- 1804,1815 ---- } if (new->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_INCREF(new); + new->tp_dependents++; } self->ob_type = new; if (old->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_DECREF(old); + old->tp_dependents--; } return 0; } Index: Lib/test/test_gc.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/test/test_gc.py,v retrieving revision 1.14 diff -c -r1.14 test_gc.py *** Lib/test/test_gc.py 28 Mar 2002 21:22:25 -0000 1.14 --- Lib/test/test_gc.py 8 Jun 2002 13:09:15 -0000 *************** *** 220,228 **** def test(): if verbose: print "disabling automatic collection" enabled = gc.isenabled() gc.disable() ! verify(not gc.isenabled() ) debug = gc.get_debug() gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak --- 220,229 ---- def test(): if verbose: print "disabling automatic collection" + while gc.collect(): pass # collect garbage from previous tests enabled = gc.isenabled() gc.disable() ! 
verify(not gc.isenabled()) debug = gc.get_debug() gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jun 9 05:17:11 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 09 Jun 2002 00:17:11 -0400 Subject: [Python-Dev] unittest and sockets. Ugh!? In-Reply-To: Your message of "Sat, 08 Jun 2002 17:26:28 EDT." <20020608172627.D9486@eecs.tufts.edu> References: <20020608172627.D9486@eecs.tufts.edu> Message-ID: <200206090417.g594HBk03827@pcp02138704pcs.reston01.va.comcast.net> > Could someone please offer some advice/comments on this? While > restructuring test_socket.py, I've come across the following > (limitation?) problem with the unittest module. To test socket stuff, > I really need to use two separate processes or threads. Since > fork() is much better supported, here's my attempt at porting the > _fileobject tests into unittest. I really like the structure of the > test (hopefully you guys agree) and this is how I'd like to lay it out > ideally for testing things like accept/connect, etc, using the various > layers of my inheritance hierarchy. Please don't use fork -- that isn't supported on Windows. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Sun Jun 9 08:38:45 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 09 Jun 2002 09:38:45 +0200 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > I tried the simplest possible fix, which was to visit self->ob_type in > the new-style instance tp_traverse handler (subtype_traverse() in > typeobject.c). But this caused assertions to fail all over the place.
> It turns out that when the collector decides to break a cycle like > this, it calls the tp_clear handler for each object in the cycle, and > then the subsequent deletion of the instance references the type in > ways that have been made invalid by the clearing of the type. So this > was a dead end. I'd like to question this statement. It ought to be possible, IMO, to dealloc an instance whose type has been cleared. The problem appears to be in the tp_clear. The task of tp_clear is to clear all references that may participate in cycles (*not* to clear all references per se). Now, if type_clear would clear tp_dict, tp_subclasses, and et->slots, but leave alone tp_base, tp_bases, and tp_mro, the type would still be "good enough" for subtype_dealloc, no? Regards, Martin From bernie@3captus.com Sun Jun 9 09:44:04 2002 From: bernie@3captus.com (Bernard Yue) Date: Sun, 09 Jun 2002 02:44:04 -0600 Subject: [Python-Dev] Subclassing threading.Thread Message-ID: <3D031553.D9E263C8@3captus.com> Attached files test1.py and test2.py produce different results. Is it a bug?
Bernie

test1.py:

    #!/usr/bin/env python
    import socket
    import threading

    class server:
        def __init__(self):
            self._addr_local = ('127.0.0.1', 25339)
            self._s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self._s.bind(self._addr_local)

    def test():
        for i in range(10):
            a = threading.Thread(target=server)
            a.start()
            a.run()
            a.join()

    if __name__ == '__main__':
        test()

test2.py:

    #!/usr/bin/env python
    import socket
    import threading

    class server(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self.__addr_local = ('127.0.0.1', 25339)
            self.__s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.__s.bind(self.__addr_local)

    def test():
        for i in range(10):
            a = server()
            a.start()
            a.run()
            a.join()

    if __name__ == '__main__':
        test()

From martin@v.loewis.de Sun Jun 9 10:20:18 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 09 Jun 2002 11:20:18 +0200 Subject: [Python-Dev] Subclassing threading.Thread In-Reply-To: <3D031553.D9E263C8@3captus.com> References: <3D031553.D9E263C8@3captus.com> Message-ID: Bernard Yue writes: > Attached files test1.py and test2.py produce different result. Is it a > bug? No. Martin From paul@prescod.net Sun Jun 9 19:19:47 2002 From: paul@prescod.net (Paul Prescod) Date: Sun, 09 Jun 2002 11:19:47 -0700 Subject: [Python-Dev] textwrap.py References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> <3D015CD8.DE7C4BC2@prescod.net> <20020609002722.GA3750@gerg.ca> Message-ID: <3D039C43.786EE3D8@prescod.net> Greg Ward wrote: > >...
> > However, I *still* don't want to make all of TextWrapper's options > keyword arguments to the wrap() method, because 1) I'd be morally bound > to make them kwargs to the fill() method, and to the standalone wrap() > and fill() functions as well, which is a PITA; and 2) I think it's > useful to be able to encode your preferences in an object for multiple > wrapping jobs. I buy the second argument but not the first. I'm 95% happy with what you propose and suggest only a tiny change. If "expand_tabs" or "replace_whitespace" or "break_long_words" are options that people will want to specify when doing text wrapping, then why *wouldn't* they be arguments to the wrap and fill functions? The object is useful when you want to keep those options persistently. But it seems clear that you would want to pass the same arguments to the function versions. Here's my proposed (untested) fix:

    def wrap (text, width, **kwargs):
        return TextWrapper(**kwargs).wrap(text, width)

    def fill (text, width, initial_tab="", subsequent_tab="", **kwargs):
        return TextWrapper(**kwargs).fill(text, width, initial_tab, subsequent_tab)

I'm not clear on why the "width" argument is special and should be on the wrap method rather than in the constructor. But I suspect most people will use the convenience functions so they'll never know the difference. Paul Prescod From s_lott@yahoo.com Sun Jun 9 19:44:25 2002 From: s_lott@yahoo.com (Steven Lott) Date: Sun, 9 Jun 2002 11:44:25 -0700 (PDT) Subject: [Python-Dev] textwrap.py In-Reply-To: <3D039C43.786EE3D8@prescod.net> Message-ID: <20020609184425.18907.qmail@web9601.mail.yahoo.com> Here's a version with the Strategy classes included. This allows for essentially unlimited alternatives on the subjects of long words, full stops, and also permits right justification. This is my preference for resolving "creeping featuritis".
Any new feature can be implemented as yet another strategy plug-in. Note that I am AR about superclasses. Python does not require this level of fussiness, but I find that when I leave them out, I always wish I had them as a place to factor out common functions. ===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd.

--0-1592618721-1023648265=:17605 Content-Type: application/octet-stream; name="textwrap.py" Content-Transfer-Encoding: base64 Content-Description: textwrap.py Content-Disposition: attachment; filename="textwrap.py" [base64-encoded file data omitted]
--0-1592618721-1023648265=:17605 Content-Type: application/octet-stream; name="test_textwrap.py" Content-Transfer-Encoding: base64 Content-Description: test_textwrap.py Content-Disposition: attachment; filename="test_textwrap.py" [base64-encoded file data omitted]
ICItLSIsICIgIiwgInlvdSIsICIgIiwgImdvb2YtIiwKICAgICAgImJhbGws IiwgIiAiLCAidXNlIiwgIiAiLCAidGhlIiwgIiAiLCAiLWIiLCAiICIsICAi b3B0aW9uISJdKQoKCnRleHQgPSAnJycKRGlkIHlvdSBzYXkgInN1cGVyY2Fs aWZyYWdpbGlzdGljZXhwaWFsaWRvY2lvdXM/IgpIb3cgKmRvKiB5b3Ugc3Bl bGwgdGhhdCBvZGQgd29yZCwgYW55d2F5cz8KJycnCiMgWFhYIHNlbnRlbmNl IGVuZGluZyBub3QgZGV0ZWN0ZWQgYmVjYXVzZSBvZiBxdW90ZXMKdGVzdCh3 cmFwKHRleHQsIDMwKSwKICAgICBbJ0RpZCB5b3Ugc2F5ICJzdXBlcmNhbGlm cmFnaWxpcycsCiAgICAgICd0aWNleHBpYWxpZG9jaW91cz8iIEhvdyAqZG8q JywKICAgICAgJ3lvdSBzcGVsbCB0aGF0IG9kZCB3b3JkLCcsCiAgICAgICdh bnl3YXlzPyddKQp0ZXN0KHdyYXAodGV4dCwgNTApLAogICAgIFsnRGlkIHlv dSBzYXkgInN1cGVyY2FsaWZyYWdpbGlzdGljZXhwaWFsaWRvY2lvdXM/Iics CiAgICAgICdIb3cgKmRvKiB5b3Ugc3BlbGwgdGhhdCBvZGQgd29yZCwgYW55 d2F5cz8nXSkKCnRlc3QoVGV4dFdyYXBwZXIobG9uZ3dvcmRzPUtlZXBMb25n V29yZHMoKSkud3JhcCh0ZXh0LCAzMCksCiAgICAgWydEaWQgeW91IHNheScs CiAgICAgICcic3VwZXJjYWxpZnJhZ2lsaXN0aWNleHBpYWxpZG9jaW91cz8i JywKICAgICAgJ0hvdyAqZG8qIHlvdSBzcGVsbCB0aGF0IG9kZCcsCiAgICAg ICd3b3JkLCBhbnl3YXlzPyddKQoK --0-1592618721-1023648265=:17605-- From niemeyer@conectiva.com Sun Jun 9 20:19:18 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sun, 9 Jun 2002 16:19:18 -0300 Subject: [Python-Dev] os.stat(filename).r_dev Message-ID: <20020609161918.A17718@ibook.distro.conectiva> While talking to Lars about tarfile.py, he has noted an interesting detail in the current implementation of os.stat(filename).r_dev: ------------- With the current os.stat(), it is impossible to implement addition of those devices. That's because the result for st_rdev is just a plain integer, which still must be divided into the major and minor part. This division (resp. the C type dev_t) differs between several operating systems: OS format major minor Linux 32-bit upper 16bits lower 16bits SVR4 32-bit upper 14bits lower 18bits BSD 16-bit upper 8bits lower 8bits ------------- It seems like we really need some way to decode r_dev. 
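A minimal sketch of the decoding Gustavo is asking for, assuming the Linux convention from the table (major in the upper 16 bits, minor in the lower 16 bits of a 32-bit dev_t); the helper names mirror the C macros, but these functions are hypothetical -- nothing like them existed in the os module at the time:

```python
# Hypothetical st_rdev helpers, assuming the Linux 16/16 split from the
# table above; other platforms would need different shifts and masks.
def major(dev):
    return (dev >> 16) & 0xFFFF

def minor(dev):
    return dev & 0xFFFF

def makedev(maj, min_):
    return ((maj & 0xFFFF) << 16) | (min_ & 0xFFFF)

# Round trip: on Linux, /dev/sda1 conventionally has major 8, minor 1.
dev = makedev(8, 1)
assert (major(dev), minor(dev)) == (8, 1)
```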
One possible solutions are to implement major(), minor(), and makedev() somewhere. Another solution, if r_dev's raw value has no obvious use, would be to turn it into a two elements tuple like (major, minor). Any suggestions? -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From guido@python.org Sun Jun 9 20:58:52 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 09 Jun 2002 15:58:52 -0400 Subject: [Python-Dev] os.stat(filename).r_dev In-Reply-To: Your message of "Sun, 09 Jun 2002 16:19:18 -0300." <20020609161918.A17718@ibook.distro.conectiva> References: <20020609161918.A17718@ibook.distro.conectiva> Message-ID: <200206091958.g59JwqH15597@pcp02138704pcs.reston01.va.comcast.net> > Any suggestions? Submit a patch. ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jun 9 21:02:20 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 09 Jun 2002 16:02:20 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "09 Jun 2002 09:38:45 +0200." References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net> [Guido] > > I tried the simplest possible fix, which was to visit self->ob_type in > > the new-style instance tp_traverse handler (subtype_traverse() in > > typeobject.c). But this caused assertions to fail all over the place. > > It turns out that when the collector decides to break a cycle like > > this, it calls the tp_clear handler for each object in the cycle, and > > then the subsequent deletion of the instance references the type in > > ways that have been made invalid by the clearing of the type. So this > > was a dead end. [Martin] > I'd like to question this statement. It ought to be possible, IMO, to > dealloc an instance whose type has been cleared. > > The problem appears to be in the tp_clear. 
The task of tp_clear is to > clear all references that may participate in cycles (*not* to clear > all references per se). Now, if type_clear would clear tp_dict, > tp_subclasses, and et->slots, but leave alone tp_base, tp_bases, and > tp_mro, the type would still be "good enough" for subtype_dealloc, no? Alas, I don't think so. When tp_dict is cleared, this can remove the __del__ method before it can be called (it is called by the instance's tp_dealloc). But tp_dict has to be cleared, because it can participate in cycles (e.g. you could do A.A = A). tp_mro participates in a cycle too: it is a tuple whose first element is the type itself. Tuples are immutable, so the tp_clear for tuples doesn't do anything. So type_clear is our only hope to break this cycle. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Sun Jun 9 21:46:26 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 09 Jun 2002 22:46:26 +0200 Subject: [Python-Dev] os.stat(filename).r_dev In-Reply-To: <20020609161918.A17718@ibook.distro.conectiva> References: <20020609161918.A17718@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > It seems like we really need some way to decode r_dev. One possible > solutions are to implement major(), minor(), and makedev() somewhere. > Another solution, if r_dev's raw value has no obvious use, would be to > turn it into a two elements tuple like (major, minor). > > Any suggestions? I'd add a field r_dev_pair which splits this into major and minor. I would not remove r_dev, since existing code may break. Notice that major, minor, and makedev is already available through TYPES on many platforms, although this has the known limitations, and is probably wrong for Linux at the moment. Regards, Martin From martin@v.loewis.de Sun Jun 9 21:56:00 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 09 Jun 2002 22:56:00 +0200 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net> References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > When tp_dict is cleared, this can remove the __del__ method before it > can be called (it is called by the instance's tp_dealloc). That cannot happen: an object whose type has an __del__ cannot refer to an object for which tp_clear has been called. Objects with finalizers go into gc.garbage, so in this case, the type is resurrected, and not cleared. > tp_mro participates in a cycle too: it is a tuple whose first element > is the type itself. Tuples are immutable, so the tp_clear for tuples > doesn't do anything. So type_clear is our only hope to break this > cycle. I see. So tp_mro must be cleared in tp_clear; it's not used from subtype_dealloc, so it won't cause problems to clear it. Regards, Martin From niemeyer@conectiva.com Sun Jun 9 22:17:11 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sun, 9 Jun 2002 18:17:11 -0300 Subject: [Python-Dev] os.stat(filename).r_dev In-Reply-To: References: <20020609161918.A17718@ibook.distro.conectiva> Message-ID: <20020609181710.A18935@ibook.distro.conectiva> Hi Martin! First, some self-corrections.. :-) > > It seems like we really need some way to decode r_dev. One possible > > solutions are to implement major(), minor(), and makedev() somewhere. "solution is" > > Another solution, if r_dev's raw value has no obvious use, would be to This should be st_rdev. > > turn it into a two elements tuple like (major, minor). > I'd add a field r_dev_pair which splits this into major and minor. I > would not remove r_dev, since existing code may break. Isn't st_rdev being made available only in 2.3, through stat attributes?
> Notice that major, minor, and makedev is already available through > TYPES on many platforms, although this has the known limitations, and > is probably wrong for Linux at the moment. Indeed. Here's what's defined here:

    def major(dev): return ((int)(((dev) >> 8) & 0xff))
    def minor(dev): return ((int)((dev) & 0xff))

    def major(dev): return (((dev).__val[1] >> 8) & 0xff)
    def minor(dev): return ((dev).__val[1] & 0xff)

    def major(dev): return (((dev).__val[0] >> 8) & 0xff)
    def minor(dev): return ((dev).__val[0] & 0xff)

-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From guido@python.org Mon Jun 10 00:15:39 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 09 Jun 2002 19:15:39 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "09 Jun 2002 22:56:00 +0200." References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net> > > When tp_dict is cleared, this can remove the __del__ method before it > > can be called (it is called by the instance's tp_dealloc). > > That cannot happen: an object whose type has an __del__ cannot refer > to an object for which tp_clear has been called. Objects with > finalizers go into gc.garbage, so in this case, the type is > resurrected, and not cleared. You're right! > > tp_mro participates in a cycle too: it is a tuple whose first element > > is the type itself. Tuples are immutable, so the tp_clear for tuples > > doesn't do anything. So type_clear is our only hope to break this > > cycle. > > I see. So tp_mro must be cleared in tp_clear; it's not used from > subtype_dealloc, so it won't cause problems to clear it. You've convinced me. Here's a patch that only touches typeobject.c. It doesn't add any fields, and it doesn't require multiple collections to clear out cycles involving a class and its type. I like it!
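The cycles being discussed are easy to observe from the Python side: a class's MRO tuple starts with the class itself, and the A.A = A trick mentioned earlier puts the class into its own __dict__ (a sketch of the situation, not of the patch):

```python
import gc

class A(object):
    pass

A.A = A                    # tp_dict now holds a reference back to A itself
assert A.__mro__[0] is A   # tp_mro is a tuple whose first element is A

# Once the last name for such a class goes away, only the cycle collector
# can reclaim it, by breaking these cycles in the type's tp_clear.
del A
gc.collect()
```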
(Note: at the top of type_traverse() and type_clear(), there used to be code saying "if not a heaptype, return". That code was never necessary, because the collector doesn't call the traverse or clear hooks when tp_is_gc() returns false -- which it does when the heaptype flag isn't set. So I replaced these two with an assert that this is a heaptype.) Index: typeobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v retrieving revision 2.148 diff -c -c -r2.148 typeobject.c *** typeobject.c 4 Jun 2002 19:52:53 -0000 2.148 --- typeobject.c 9 Jun 2002 23:05:47 -0000 *************** *** 290,295 **** --- 290,301 ---- } } + if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { + int err = visit((PyObject *)type, arg); + if (err) + return err; + } + if (basetraverse) return basetraverse(self, visit, arg); return 0; *************** *** 1323,1329 **** return NULL; } mro = type->tp_mro; ! assert(mro != NULL); } assert(PyTuple_Check(mro)); n = PyTuple_GET_SIZE(mro); --- 1329,1336 ---- return NULL; } mro = type->tp_mro; ! if (mro == NULL) ! return NULL; } assert(PyTuple_Check(mro)); n = PyTuple_GET_SIZE(mro); *************** *** 1335,1341 **** assert(PyType_Check(base)); dict = ((PyTypeObject *)base)->tp_dict; } ! assert(dict && PyDict_Check(dict)); res = PyDict_GetItem(dict, name); if (res != NULL) return res; --- 1342,1349 ---- assert(PyType_Check(base)); dict = ((PyTypeObject *)base)->tp_dict; } ! if (dict == NULL || !PyDict_Check(dict)) ! continue; res = PyDict_GetItem(dict, name); if (res != NULL) return res; *************** *** 1495,1502 **** etype *et; int err; ! if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! return 0; et = (etype *)type; --- 1503,1509 ---- etype *et; int err; ! 
assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); et = (etype *)type; *************** *** 1512,1519 **** VISIT(type->tp_mro); VISIT(type->tp_bases); VISIT(type->tp_base); - VISIT(type->tp_subclasses); - VISIT(et->slots); #undef VISIT --- 1519,1524 ---- *************** *** 1526,1533 **** etype *et; PyObject *tmp; ! if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! return 0; et = (etype *)type; --- 1531,1537 ---- etype *et; PyObject *tmp; ! assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); et = (etype *)type; *************** *** 1541,1555 **** CLEAR(type->tp_dict); CLEAR(type->tp_cache); CLEAR(type->tp_mro); - CLEAR(type->tp_bases); - CLEAR(type->tp_base); - CLEAR(type->tp_subclasses); - CLEAR(et->slots); - - if (type->tp_doc != NULL) { - PyObject_FREE(type->tp_doc); - type->tp_doc = NULL; - } #undef CLEAR --- 1545,1550 ---- *************** *** 2166,2175 **** PyTypeObject *base; int i, n; ! if (type->tp_flags & Py_TPFLAGS_READY) { ! assert(type->tp_dict != NULL); return 0; - } assert((type->tp_flags & Py_TPFLAGS_READYING) == 0); type->tp_flags |= Py_TPFLAGS_READYING; --- 2161,2168 ---- PyTypeObject *base; int i, n; ! if (type->tp_flags & Py_TPFLAGS_READY) return 0; assert((type->tp_flags & Py_TPFLAGS_READYING) == 0); type->tp_flags |= Py_TPFLAGS_READYING; --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Mon Jun 10 06:52:18 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 10 Jun 2002 07:52:18 +0200 Subject: [Python-Dev] os.stat(filename).r_dev In-Reply-To: <20020609181710.A18935@ibook.distro.conectiva> References: <20020609161918.A17718@ibook.distro.conectiva> <20020609181710.A18935@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > Isn't st_rdev being made available only in 2.3, trough stat attributes? No, it was available in 2.2 already. So there is a backwards compatibility issue. Regards, Martin From martin@v.loewis.de Mon Jun 10 07:06:22 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 10 Jun 2002 08:06:22 +0200 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net> References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net> <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > You've convinced me. Here's a patch that only touches typeobject.c. > It doesn't add any fields, and it doesn't require multiple collections > to clear out cycles involving a class and its type. I like it! Looks good to me, too! Martin From mwh@python.net Mon Jun 10 11:36:30 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jun 2002 11:36:30 +0100 Subject: [Python-Dev] pymemcompat.h & PyMem_New and friends In-Reply-To: Michael Hudson's message of "29 May 2002 16:13:58 +0100" References: <2mr8jwv84e.fsf@starship.python.net> <20020528080149.A7799@glacier.arctrix.com> <2mit57x9zd.fsf@starship.python.net> Message-ID: <2msn3vpgi9.fsf_-_@starship.python.net> Michael Hudson writes: > /* There are three "families" of memory API: the "raw memory", "object > memory" and "object" families. (This is ignoring the matter of the > cycle collector, about which more is said below). Of course this is an over-simplification. There is at least one other family in fairly widespread use in the Python core; the "typed memory allocator", PyMem_New, PyMem_Resize and PyMem_Del. Should this family be listed in pymemcompat.h or subtly discouraged? (I don't think there are any other options). I think it should be subtly discouraged, for a couple of reasons: a) three is a smaller number than four. b) there is a non-analogy: PyMem_Malloc ---> PyMem_New PyObject_Malloc ---> PyObject_New They do rather different things. c) I don't think omitting a cast and a sizeof is that much of a win. I'm not proposing actually taking these interfaces away.
(as a special bonus I won't even mention the fact that we have PyMem_Resize, PyObject_GC_Resize (only used in listobject.c) but not PyObject_Resize...) Cheers, M. -- If a train station is a place where a train stops, what's a workstation? -- unknown (to me, at least) From tim.one@comcast.net Mon Jun 10 13:07:08 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 08:07:08 -0400 Subject: [Python-Dev] pymemcompat.h & PyMem_New and friends In-Reply-To: <2msn3vpgi9.fsf_-_@starship.python.net> Message-ID: [Michael Hudson] > ... > There is at least one other family in fairly widespread use in the > Python core; the "typed memory allocator", PyMem_New, PyMem_Resize and > PyMem_Del. Should this family be listed in pyemcompat.h or subtly > discouraged? (I don't think there are any other options). I left it out of the "recommended" memory API, so "subtly discouraged" gets my vote. I think the current comment in pymem.h shows this : * These are carried along for historical reasons. There's rarely a good * reason to use them anymore (you can just as easily do the multiply and * cast yourself). > I think it should be subtly discouraged, for a couple of reasons: > > a) three is a smaller number than four. > b) there is a non-analogy: > > PyMem_Malloc ---> PyMem_New > PyObject_Malloc ---> PyObject_New > > They do rather different things. > c) I don't think omitting a cast and a sizeof is that much of a win. It also hides a multiply, and that's a Bad Idea because callers almost never first check that the hidden multiply doesn't overflow a size_t -- and neither do the macros. There are a few calls in the Python core that do, but only because I slammed those checks in when a real-life overflow bug surfaced. If the PyMem_XYZ family were reworked to detect overflow, it may become valuable again. > I'm not proposing actually taking these interfaces away. I suppose that explains why you're still breathing . 
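Tim's point about the hidden multiply can be sketched in Python (purely illustrative -- the real PyMem_New is a C preprocessor macro, and SIZE_MAX here stands in for an assumed 32-bit size_t):

```python
SIZE_MAX = 2**32 - 1  # assumed 32-bit size_t, for illustration only

def checked_nbytes(n, elsize):
    """Byte count for n elements of elsize bytes each, refusing the
    silent wraparound that an unchecked C multiply would produce."""
    if elsize and n > SIZE_MAX // elsize:
        raise OverflowError("n * elsize overflows size_t")
    return n * elsize

assert checked_nbytes(100, 4) == 400
# 2**30 elements of 8 bytes wrap a 32-bit size_t; the check catches it:
try:
    checked_nbytes(2**30, 8)
except OverflowError:
    pass
```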
> (as a special bonus I won't even mention the fact that we have > PyMem_Resize, PyObject_GC_Resize (only used in listobject.c) No, it's not used in listobject.c -- there's no need for it there, as list guts are stored separately from the list object. It is used in tupleobject.c and in frameobject.c, as tuples and frames are the only container types in Python that *embed* a variable amount of data in the object and may participate in cycles. > but not PyObject_Resize...) That seems a curious omission, but it would only be useful for a variable- size object that doesn't participate in cyclic gc. 8-bit strings are the only type that come to mind. From tim.one@comcast.net Mon Jun 10 13:16:51 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 08:16:51 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) Message-ID: I think Beni has a very nice idea here, especially for people who can't visualize 2's-complement (not mentioning Guido by name ). -----Original Message----- From: Beni Cherniavksy Sent: Monday, June 10, 2002 1:57 AM To: python-list@python.org Subject: Re: Does Python need a '>>>' operator? ... [quotes of old postings deleted] ... I just got another idea: use 0x1234 for 0-filled numbers and 1xABCD for 1-filled ones. That way you impose no restrictions on what follows the prefix and keep backward compatibility. 0xFFFFFFFF stays a 2^n-1 _positive_ number, as it should be. The look of 1x is weird at first but it is very logical... -- Beni Cherniavsky -- http://mail.python.org/mailman/listinfo/python-list From guido@python.org Mon Jun 10 13:52:43 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 08:52:43 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: Your message of "Mon, 10 Jun 2002 08:16:51 EDT." 
References: Message-ID: <200206101252.g5ACqi922863@pcp02138704pcs.reston01.va.comcast.net> > I think Beni has a very nice idea here, especially for people who can't > visualize 2's-complement (not mentioning Guido by name ). In fact it's so subtle that I didn't notice what he proposed. I thought it had to do with the uppercase of 1xABCD. Maybe that's too subtle? Do we really need this? > > I just got another idea: use 0x1234 for 0-filled numbers and 1xABCD for > > 1-filled ones. That way you impose no restrictions on what follows the > > prefix and keep backward compatibility. 0xFFFFFFFF stays a 2^n-1 > > _positive_ number, as it should be. The look of 1x is weird at first but > > it is very logical... > > > > Beni Cherniavsky --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon Jun 10 13:50:10 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 08:50:10 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Message-ID: [Martin v. Loewis] > ... > The problem appears to be in the tp_clear. The task of tp_clear is to > clear all references that may participate in cycles (*not* to clear > all references per se). That's the key insight, and one we all missed. Thanks for sharing your brain, Martin! From mgilfix@eecs.tufts.edu Mon Jun 10 13:56:40 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Mon, 10 Jun 2002 08:56:40 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: ; from tim.one@comcast.net on Mon, Jun 10, 2002 at 08:16:51AM -0400 References: Message-ID: <20020610085640.D23641@eecs.tufts.edu> On Mon, Jun 10 @ 08:16, Tim Peters wrote: > I think Beni has a very nice idea here, especially for people who can't > visualize 2's-complement (not mentioning Guido by name ). I like the idea but I'm not sure that still solves the down casting problem.
Say I do some bit ops on a long type and want to get it into an int size (for whatever reason and there are several), I need somehow to tell python that it is not an overflow when I'm int()ing the number. Perhaps int could take a second hidden argument. Be able to do a: int(big_num, signed=1) which is pretty clear. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From niemeyer@conectiva.com Mon Jun 10 13:58:52 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 10 Jun 2002 09:58:52 -0300 Subject: [Python-Dev] os.stat(filename).r_dev In-Reply-To: References: <20020609161918.A17718@ibook.distro.conectiva> <20020609181710.A18935@ibook.distro.conectiva> Message-ID: <20020610095851.B1769@ibook.distro.conectiva> > > Isn't st_rdev being made available only in 2.3, through stat attributes? > > No, it was available in 2.2 already. So there is a backwards > compatibility issue. You're right then. st_rdev_pair may be the way to go. I'm not sure if introducing major(), minor(), and makedev() would be a good idea, since they are completely platform dependent. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From David Abrahams: A couple of quick questions for the authors of the Python source: I notice that most, if not all, of the Python 'C' API includes null checks for the PyObject* arguments, meaning that you can't crash Python by passing the result of a previous operation, even if it returns an error. First question: can that be counted on? Hmm, I guess I've answered my own question -- PyNumber_InPlaceAdd has no checks. I note that the null_error() check in abstract.c is non-destructive: it preserves any existing error, whereas other checks (e.g. in typeobject.c) do not. Second question: I guess I really want to know what the intention behind these checks is.
Is it something like "prevent extension writers from crashing Python in some large percentage of cases", or is there a deeper plan that I'm missing? TIA, Dave +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From tim.one@comcast.net Mon Jun 10 14:08:36 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:08:36 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: <200206101252.g5ACqi922863@pcp02138704pcs.reston01.va.comcast.net> Message-ID: >> I think Beni has a very nice idea here, especially for people who can't >> visualize 2's-complement (not mentioning Guido by name ). [Guido] > In fact it's so subtle that I didn't notice what he proposed. I > though it had to do with the uppercase of 1xABCD. > > Maybe that's too subtle? In context, it was part of a long thread wherein assorted people griped that they couldn't visualize what, e.g., >>> hex(-1L << 10) '-0x400L' >>> means, recalling that hex() is often used when people are thinking of its argument as a bitstring. 1xc00 "shows the bits" more clearly even in such an easy case. In a case like '-0xB373D', it's much harder to visualize the bits, and this will grow more acute under int-long unification. Right now, hex(negative_plain_int) shows the bits directly; after unification, hex(negative_plain_int) will likely have to resort to producing "negative literals" as hex(negative_long_int) currently does. > Do we really need this? No, but I think it would make unification more attractive to people who care about this sub-issue. The 0x vs 1x idea grew on me the longer I played with it. 
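Until a 1x-style notation exists, the usual way to "show the bits" of a negative number is to mask it to an assumed fixed width first (width 32 is an arbitrary choice here):

```python
def bits(n, width=32):
    """hex() of n's two's-complement bit pattern at the given width."""
    return hex(n & ((1 << width) - 1))

assert bits(-1 << 10) == '0xfffffc00'   # the pattern that '-0x400' hides
assert bits(-0xB373D) == '0xfff4c8c3'
```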
Bonus: we could generalize and say that integers beginning with "1" are negative octal literals . From tim.one@comcast.net Mon Jun 10 14:12:55 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:12:55 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: <20020610085640.D23641@eecs.tufts.edu> Message-ID: [Michael Gilfix] > I like the idea but I'm not sure that still solves the down casting > problem. It's not even pretending to have something to do with downcasting. > Say I do some bit ops on a long type and want to get it into an int > size (for whatever reason and there are several), I need somehow to > tell python that it is not an overflow when I'm int()ing the number. Sorry, I don't know what you want it to do. You have to specify the intended semantics first. > Perhaps int could take a second hidden argument. Be > able to do a: > > int(big_num, signed=1) > > which is pretty clear. After int/long unification is complete, int() and long() will likely be the same function. If you only want the last N bits, apply "&" to the long and a bitmask with the N least-significant bits set. From mwh@python.net Mon Jun 10 14:21:26 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jun 2002 14:21:26 +0100 Subject: [Python-Dev] Null checking In-Reply-To: "David Abrahams"'s message of "Mon, 10 Jun 2002 09:01:32 -0400" References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> Message-ID: <2mptyz46cp.fsf@starship.python.net> "David Abrahams" writes: > A couple of quick questions for the authors of the Python source: I notice > that most, if not all, of the Python 'C' API includes null checks for the > PyObject* arguments, meaning that you can't crash Python by passing the > result of a previous operation, even if it returns an error. > > First question: can that be counted on? Hmm, I guess I've answered my own > question -- PyNumber_InPlaceAdd has no checks. You got it. 
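Tim's masking advice, plus the sign-extension step Michael seems to be after, can be written out like this (the 32-bit default width is an assumption; any width works):

```python
def to_signed(n, width=32):
    """Interpret the low `width` bits of n as a two's-complement value."""
    n &= (1 << width) - 1        # keep only the last `width` bits
    if n & (1 << (width - 1)):   # sign bit set?
        n -= 1 << width          # sign-extend downward
    return n

assert to_signed(0xFFFFFFFF) == -1
assert to_signed(0x7FFFFFFF) == 2147483647
assert to_signed((1 << 64) - 2, width=64) == -2
```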
> I note that the null_error() check in abstract.c is non-destructive: it > preserves any existing error, whereas other checks (e.g. in typeobject.c) > do not. > > Second question: I guess I really want to know what the intention behind > these checks is. I'm not sure there is one. It may just be a bad example of defensive programming (cf. OOSC). > Is it something like "prevent extension writers from crashing Python > in some large percentage of cases", or is there a deeper plan that > I'm missing? Well, if you're missing it, so am I. I'd also like to know why all the (for instance) methods in tupleobject.c start with "if (!PyTuple_Check(self)". You'd have to try REALLY hard to get those tests to fail... Cheers, M. -- Q: What are 1000 lawyers at the bottom of the ocean? A: A good start. (A lawyer told me this joke.) -- Michael Ströder, comp.lang.python From guido@python.org Mon Jun 10 14:28:42 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 09:28:42 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: Your message of "Mon, 10 Jun 2002 08:56:40 EDT." <20020610085640.D23641@eecs.tufts.edu> References: <20020610085640.D23641@eecs.tufts.edu> Message-ID: <200206101328.g5ADSgB22999@pcp02138704pcs.reston01.va.comcast.net> > I like the idea but I'm not sure that still solves the down casting > problem. Say I do some bit ops on a long type and want to get it > into an int size (for whatever reason and there are several), I need > somehow to tell python that it is not an overflow when I'm int()ing > the number. Perhaps int could take a second hidden argument. Be > able to do a: > > int(big_num, signed=1) > > which is pretty clear. I haven't been following the thread on c.l.py. What problem do you think this is trying to solve? 
Anyway, if you want to get an int back (which should be pretty rare in 2.2 and up since ints and longs are *almost* completely interchangeable) you should be able to say something like

    x & 0x7fffffff

--Guido van Rossum (home page: http://www.python.org/~guido/) From mgilfix@eecs.tufts.edu Mon Jun 10 14:24:19 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Mon, 10 Jun 2002 09:24:19 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: ; from tim.one@comcast.net on Mon, Jun 10, 2002 at 09:12:55AM -0400 References: <20020610085640.D23641@eecs.tufts.edu> Message-ID: <20020610092418.E23641@eecs.tufts.edu> On Mon, Jun 10 @ 09:12, Tim Peters wrote: > [Michael Gilfix] > > I like the idea but I'm not sure that still solves the down casting > > problem. > > It's not even pretending to have something to do with downcasting. Er, I thought it was part of dealing with the int/long unification, where it becomes more difficult to express signed numbers as well. I think my phrasing was off. Should have been: Now if only we could solve... > > Say I do some bit ops on a long type and want to get it into an int > > size (for whatever reason and there are several), I need somehow to > > tell python that it is not an overflow when I'm int()ing the number. > > Sorry, I don't know what you want it to do. You have to specify the > intended semantics first. Well, in today's python, if I want to operate on a 64-bit block (without breaking it up into two ints), I could use a long to hold my value. Then let's say I perform some operation and I know the result is a 32-bit value and is signed. It's not easy to get it back into an int. I suppose with unification, I could just do:

    if num & 0xA0000000:
        num = -num

I just want a straight-forward way of expressing that it's signed.
-- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From tim.one@comcast.net Mon Jun 10 14:25:34 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:25:34 -0400 Subject: [Python-Dev] Null checking In-Reply-To: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams, on NULL-checking in the source] > ... > Second question: I guess I really want to know what the intention behind > these checks is. Is it something like "prevent extension writers from > crashing Python in some large percentage of cases", or is there a deeper > plan that I'm missing? Different authors have different paranoia levels. My level is here, for functions that don't intend to accept NULL arguments:

1. Public API functions should always do explicit NULL checks on pointer arguments, and it's the user's fault if they pass a NULL. A NULL argument should never crash Python regardless.

2. Private API functions should always assert non-NULL-ness on pointer arguments, and it's a bug in Python if a caller passes a NULL.

Any place where the Python code base deviates from those is simply a place I didn't write . > I note that the null_error() check in abstract.c is non-destructive: it > preserves any existing error, whereas other checks (e.g. in typeobject.c) > do not. Different authors. Guido is omnipotent but not omnipresent . It would be good (IMO) to expose something like null_error in the public API, to encourage NULL-checking. I don't know that there's real value in trying to preserve a pre-existing exception, though (if the code is hosed, it's hosed). From guido@python.org Mon Jun 10 14:32:08 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 09:32:08 -0400 Subject: [Python-Dev] Null checking In-Reply-To: Your message of "Mon, 10 Jun 2002 09:01:32 EDT."
<005901c2107f$52925d10$6601a8c0@boostconsulting.com> References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> Message-ID: <200206101332.g5ADW8v23024@pcp02138704pcs.reston01.va.comcast.net> > A couple of quick questions for the authors of the Python source: I > notice that most, if not all, of the Python 'C' API includes null > checks for the PyObject* arguments, meaning that you can't crash > Python by passing the result of a previous operation, even if it > returns an error. > > First question: can that be counted on? Hmm, I guess I've answered > my own question -- PyNumber_InPlaceAdd has no checks. Unless documented explicitly you cannot count on it! > I note that the null_error() check in abstract.c is non-destructive: > it preserves any existing error, whereas other checks (e.g. in > typeobject.c) do not. Different goals. (I'm not sure which checks in typeobject.c you're referring to.) > Second question: I guess I really want to know what the intention > behind these checks is. Is it something like "prevent extension > writers from crashing Python in some large percentage of cases", or > is there a deeper plan that I'm missing? Jim Fulton contributed the code that uses null_error(). I think he was making it possible to pass the result from one call to the next without doing the error checking on the first call. Personally, I find that inexcusable laziness and I don't intend to encourage it or propagate this style to other APIs. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon Jun 10 14:28:57 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:28:57 -0400 Subject: [Python-Dev] Null checking In-Reply-To: <2mptyz46cp.fsf@starship.python.net> Message-ID: [Michael Hudson] > ... > I'd also like to know why all the (for instance) methods in > tupleobject.c start with "if (!PyTuple_Check(self)". You'd have to > try REALLY hard to get those tests to fail... 
Not at all: extension modules can pass any sort of nonsense to public API functions. Python shouldn't crash as a result. The checking is expensive, though. From guido@python.org Mon Jun 10 14:34:01 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 09:34:01 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: Your message of "Mon, 10 Jun 2002 09:08:36 EDT." References: Message-ID: <200206101334.g5ADY1523041@pcp02138704pcs.reston01.va.comcast.net> > > Do we really need this? > > No, but I think it would make unification more attractive to people > who care about this sub-issue. The 0x vs 1x idea grew on me the > longer I played with it. Bonus: we could generalize and say that > integers beginning with "1" are negative octal literals . I'm not sure that we should extend the language at such a fundamental level (as adding a new form of literal) to address such a minor issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" Message-ID: <009101c21083$6feb0a70$6601a8c0@boostconsulting.com> From: "Tim Peters" > Different authors. Guido is omnipotent but not omnipresent . It > would be good (IMO) to expose something like null_error in the public API, > to encourage NULL-checking. I don't know that there's real value in trying > to preserve a pre-existing exception, though (if the code is hosed, it's > hosed). It depends on whether you intend to make the null checks part of the public interface. There is a style of programming which says: "write your code with no error checks, then look at the end to see if something went wrong". When, as in 'C', you don't have real exception-handling in the language, it can lead to smaller/more-straightforward code. 
-Dave From guido@python.org Mon Jun 10 14:41:47 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 09:41:47 -0400 Subject: [Python-Dev] Null checking In-Reply-To: Your message of "10 Jun 2002 14:21:26 BST." <2mptyz46cp.fsf@starship.python.net> References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> <2mptyz46cp.fsf@starship.python.net> Message-ID: <200206101341.g5ADflr23092@pcp02138704pcs.reston01.va.comcast.net> > I'd also like to know why all the (for instance) methods in > tupleobject.c start with "if (!PyTuple_Check(self)". You'd have to > try REALLY hard to get those tests to fail... I found exactly one call to PyTuple_Check() that satisfies that description, and it was in your own (uncommitted) addition, tuplesubscript(). :-) Note that the PyXxx_Check() macros do *not* check for a NULL pointer and crash hard if you pass them one. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Mon Jun 10 14:37:30 2002 From: mwh@python.net (Michael Hudson) Date: 10 Jun 2002 14:37:30 +0100 Subject: [Python-Dev] Null checking In-Reply-To: Guido van Rossum's message of "Mon, 10 Jun 2002 09:41:47 -0400" References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> <2mptyz46cp.fsf@starship.python.net> <200206101341.g5ADflr23092@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2mlm9n45lx.fsf@starship.python.net> Guido van Rossum writes: > > I'd also like to know why all the (for instance) methods in > > tupleobject.c start with "if (!PyTuple_Check(self)". You'd have to > > try REALLY hard to get those tests to fail... > > I found exactly one call to PyTuple_Check() that satisfies that > description, and it was in your own (uncommitted) addition, > tuplesubscript(). :-) Yeah. I wonder what I was thinking when I wrote that (it was two years ago now, after all). Never mind, I'll do my research better before my next post. Cheers, M. 
-- GET *BONK* BACK *BONK* IN *BONK* THERE *BONK* -- Naich using the troll hammer in cam.misc From tim.one@comcast.net Mon Jun 10 14:41:27 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:41:27 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: <20020610092418.E23641@eecs.tufts.edu> Message-ID: [Michael Gilfix] > ... > Well, in today's python, if I want to operate on a 64-bit block (without > breaking it up into two ints), I could use a long to hold my value. Then > let's say I perform some operation and I know the result is a 32-bit value > and is signed. It's not easy to get it back into an int. If it's a signed result that truly fits in a 32-bit signed int, and you know you're running on a 32-bit box, simply do int(result). Nothing more than that is necessary or helpful. If you have a *positive* long that would fit in a 32-bit unsigned int (which type Python doesn't have), and know you're running on a 32-bit box, and only want the same bit pattern in an int, you can do

    def toint32(long):
        if long & 0x80000000L:
            long -= 1L << 32
        return int(long)

This also raises OverflowError if you're mistaken about it fitting in 32 bits.

> I suppose with unification, I could just do:
>
>     if num & 0xA0000000:
>         num = -num

With full unification, there is no distinct int type, so there's nothing at all you need to do. From tim.one@comcast.net Mon Jun 10 14:42:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:42:28 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: <200206101334.g5ADY1523041@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > I'm not sure that we should extend the language at such a fundamental > level (as adding a new form of literal) to address such a minor issue. That was obvious the first time around .
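Tim's toint32 above is a special case of a general recipe: keep only the low N bits of the long, then subtract 2**N when the sign bit is set. A minimal sketch of that recipe, in modern Python syntax (the name to_signed is an assumption for illustration, not something from this thread; note that, unlike toint32, it masks out-of-range input rather than raising OverflowError):

```python
def to_signed(value, bits=32):
    """Interpret the low `bits` bits of `value` as a two's-complement
    signed integer; bits=32 reproduces the effect of toint32 above."""
    mask = (1 << bits) - 1            # the N least-significant bits set
    value &= mask                     # discard anything above bit N-1
    if value & (1 << (bits - 1)):     # sign bit set?
        value -= 1 << bits            # subtract 2**N to sign-extend
    return value

print(to_signed(0xFFFFFFFF))    # -1
print(to_signed(0x7FFFFFFF))    # 2147483647
print(to_signed(0xFF, bits=8))  # -1
```

The same function covers the 64-bit case Michael mentions by passing bits=64.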
From guido@python.org Mon Jun 10 14:48:10 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 09:48:10 -0400 Subject: [Python-Dev] Null checking In-Reply-To: Your message of "Mon, 10 Jun 2002 09:25:34 EDT." References: Message-ID: <200206101348.g5ADmAV23128@pcp02138704pcs.reston01.va.comcast.net> > 1. Public API functions should always do explicit NULL checks on > pointer arguments, and it's the user's fault if they pass a NULL. > A NULL argument should never crash Python regardless. This is violated in 99% of the code (you've got to start writing more code, Tim :-). My position is different: extensions shouldn't pass NULL pointers to Python APIs and if they do it's their fault. > > I note that the null_error() check in abstract.c is non-destructive: it > > preserves any existing error, whereas other checks (e.g. in typeobject.c) > > do not. > > Different authors. Guido is omnipotent but not omnipresent . > It would be good (IMO) to expose something like null_error in the > public API, to encourage NULL-checking. I don't know that there's > real value in trying to preserve a pre-existing exception, though > (if the code is hosed, it's hosed). That was a specific semantic trick that Jim tried to use (see my previous mail). I guess the idea was that you could write things like

    PyObject_DelItemString(PyObject_GetAttr(PyEval_GetGlobals(), "foo"), "bar").

But this never caught on -- I'm sure in a large part because most things require you to do a DECREF if the result is *not* NULL. It *is* handy in Py_BuildValue(), and that has now grown a 'N' format that eats a reference to an object. Both 'O' and 'N' formats return NULL while preserving an existing error if they see a NULL.
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon Jun 10 14:55:12 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 09:55:12 -0400 Subject: [Python-Dev] Null checking In-Reply-To: <200206101348.g5ADmAV23128@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Tim, howling in the wilderness] >> 1. Public API functions should always do explicit NULL checks on >> pointer arguments, and it's the user's fault if they pass a NULL. >> A NULL argument should never crash Python regardless. [Guido] > This is violated in 99% of the code (you've got to start writing more > code, Tim :-). My position is different: extensions shouldn't pass > NULL pointers to Python APIs and if they do it's their fault. Then let's compromise: 0. All functions in the API, whether public or private, that don't intend to do something sensible with a NULL pointer argument, should assert non-NULL-ness. > ... > That was a specific semantic trick that Jim tried to use (see my > previous mail). I guess the idea was that you could write things > like > > PyObject_DelItemString(PyObject_GetAttr(PyEval_GetGlobals(), > "foo"), "bar"). > > But this never caught on -- I'm sure in a large part because most > things require you to do a DECREF if the result is *not* NULL. You may have noticed that I've been spending much of my recent life checking in changes to clean up stray references when Zope's BTree code finds a reason to exit prematurely. It's due to a different set of mechanisms, but the pattern is clear . No more on null_error() from me. From guido@python.org Mon Jun 10 15:03:35 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 10:03:35 -0400 Subject: [Python-Dev] Null checking In-Reply-To: Your message of "Mon, 10 Jun 2002 09:55:12 EDT." References: Message-ID: <200206101403.g5AE3Z323281@pcp02138704pcs.reston01.va.comcast.net> > 0.
All functions in the API, whether public or private, that don't > intend to do something sensible with a NULL pointer argument, > should assert non-NULL-ness. Sounds good to me. --Guido van Rossum (home page: http://www.python.org/~guido/) From mgilfix@eecs.tufts.edu Mon Jun 10 15:00:33 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Mon, 10 Jun 2002 10:00:33 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: ; from tim.one@comcast.net on Mon, Jun 10, 2002 at 09:41:27AM -0400 References: <20020610092418.E23641@eecs.tufts.edu> Message-ID: <20020610100032.F23641@eecs.tufts.edu> On Mon, Jun 10 @ 09:41, Tim Peters wrote: > If you have a *positive* long that would fit in a 32-bit unsigned int (which > type Python doesn't have), and know you're running on a 32-bit box, and only > want the same bit pattern in an int, you can do
>
>     def toint32(long):
>         if long & 0x80000000L:
>             long -= 1L << 32
>         return int(long)
>
> This also raises OverflowError if you're mistaken about it fitting in 32 > bits. Whoops. That should have been a 0x8... in my example :) At any rate, I wish this function was available as a built-in. It would be nice if I had some conversion function where python looked at the highest bit and treated that as my word boundary to determine whether the number is positive or not. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From tim.one@comcast.net Mon Jun 10 15:08:15 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 10:08:15 -0400 Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: <20020610100032.F23641@eecs.tufts.edu> Message-ID: >> If you have a *positive* long that would fit in a 32-bit >> unsigned int .. > Whoops. That should have been a 0x8... in my example :) [Michael Gilfix] > At any rate, I wish this function was available as a built-in.
It > would be nice if I had some conversion function where python looked > at the highest bit and treated that as my word boundary to determine > whether the number is positive or not. Write a patch and try to sell it to Guido. I expect that with int/long unification coming along nicely, it doesn't stand much chance. From jeremy@zope.com Mon Jun 10 10:24:10 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 10 Jun 2002 05:24:10 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15620.28730.145429.690221@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: >> It appears SF is rearranging servers, and asks projects to honor >> their disk quota, see >> >> https://sourceforge.net/forum/forum.php?forum_id=183601 >> >> There is a per-project disk quota of 100MB; >> /home/groups/p/py/python currently consumes 880MB. Most of this >> (830MB) is in htdocs/snapshots. Should we move those onto >> python.org? GvR> What is htdocs/snapshots? There's plenty of space on creosote, GvR> but maybe the snapshots should be reduced in volume first? Last time quotas came up, the SF managers said that our project could exceed the normal quota. Still, we didn't intend to have ~1GB of CVS snapshots. The script that deletes old snapshots had a bug -- didn't deal with change-of-year -- that kept lots of old snapshots around. I just deleted a lot of them, so that we are using less space. I'm not sure the snapshots are worth the bother at all. Are there downloads statistics for the SF web pages? I'll bet no one has ever looked at them. Jeremy From tim.one@comcast.net Mon Jun 10 15:15:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 10:15:11 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido, to MvL] > ... 
> You've convinced me. Here's a patch that only touches typeobject.c. > It doesn't add any fields, and it doesn't require multiple collections > to clear out cycles involving a class and its type. I like it! Me too. It's lovely. Check it in! It could use some comments about *why* the clear function isn't clearing all the members. That's unusual enough for a clear function that it deserves some prose. From fdrake@acm.org Mon Jun 10 15:15:37 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 10 Jun 2002 10:15:37 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: <15620.28730.145429.690221@slothrop.zope.com> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> Message-ID: <15620.46217.344039.853918@grendel.zope.com> Jeremy Hylton writes: > I'm not sure the snapshots are worth the bother at all. Are there > downloads statistics for the SF web pages? I'll bet no one has ever > looked at them. I think there's a way to get the logs, but we'd have to run our own analysis tools to see if specific pages/directories get requests. I'd be fine to just drop the snapshots; we can deal with it as needed if we get requests for them. At the most, there's no need to keep more than a week's worth around. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Mon Jun 10 15:23:30 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 10:23:30 -0400 Subject: [Python-Dev] Bizarre new test failure In-Reply-To: Your message of "Mon, 10 Jun 2002 10:15:11 EDT." References: Message-ID: <200206101423.g5AENUq24016@pcp02138704pcs.reston01.va.comcast.net> > Me too. It's lovely. Check it in! It could use some comments > about *why* the clear function isn't clearing all the members. > That's unusual enough for a clear function that it deserves some > prose. Will do. But first I have to fix SF 551412 for the third time. 
The issues here made me look at that once more and realize that the real cause of the problem was a bug in slot_tp_number -- it was being called on behalf of the second argument. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 10 15:19:48 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 10:19:48 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: Your message of "Mon, 10 Jun 2002 05:24:10 EDT." <15620.28730.145429.690221@slothrop.zope.com> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> Message-ID: <200206101419.g5AEJnY23884@pcp02138704pcs.reston01.va.comcast.net> > I'm not sure the snapshots are worth the bother at all. Are there > downloads statistics for the SF web pages? I'll bet no one has ever > looked at them. I forget -- are these snapshots of a checkout or of the whole CVS directory? If the former, we can probably lose them if and when SF starts to enforce quota. If the latter, then I suggest that having them on SF defeats the purpose -- we want them on hardware that is as far away from SF as we can imagine, like halfway across the world on www.python.org. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Mon Jun 10 15:25:39 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 10 Jun 2002 10:25:39 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: <15620.28730.145429.690221@slothrop.zope.com> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> Message-ID: <20020610142539.GA14084@ute.mems-exchange.org> On Mon, Jun 10, 2002 at 05:24:10AM -0400, Jeremy Hylton wrote: >I'm not sure the snapshots are worth the bother at all. Are there >downloads statistics for the SF web pages? I'll bet no one has ever >looked at them.
Wasn't the original goal of the snapshots to create an audit trail, guarding against someone Trojaning the CVS repository? And didn't Sean Reifscheider burn a CD containing a whole bunch of snapshots? --amk (www.amk.ca) Still, the future lies this way. -- The Doctor, in "Logopolis" From barry@zope.com Mon Jun 10 15:29:23 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 10 Jun 2002 10:29:23 -0400 Subject: [Python-Dev] Quota on sf.net References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> <200206101419.g5AEJnY23884@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15620.47043.201842.176944@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> I forget -- are these snapshots of a checkout or of the whole GvR> CVS directory? If the former, we can probably lose them if GvR> and when SF starts to enforce quota. If the latter, then I GvR> suggest that having them on SF defeats the purpose -- we want GvR> them on hardware that is as far away from SF as we can GvR> imagine, like halfway across the world on www.python.org. :-) creosote (at xs4all) is doing the nightly cvs repository snapshot downloads. It's probably due time to clear those out, but I haven't heard Thomas complain yet. :) >>>>> "AK" == Andrew Kuchling writes: AK> Wasn't the original goal of the snapshots to create an audit AK> trail, guarding against someone Trojaning the CVS repository? AK> And didn't Sean Reifscheider burn a CD containing a whole AK> bunch of snapshots? I believe that's true. I think Fred got one of the copies. 
-Barry From jeremy@zope.com Mon Jun 10 10:46:06 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 10 Jun 2002 05:46:06 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: <20020610142539.GA14084@ute.mems-exchange.org> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> <20020610142539.GA14084@ute.mems-exchange.org> Message-ID: <15620.30046.267559.90616@slothrop.zope.com> >>>>> "AMK" == Andrew Kuchling writes: AMK> On Mon, Jun 10, 2002 at 05:24:10AM -0400, Jeremy Hylton wrote: >> I'm not sure the snapshots are worth the bother at all. Are >> there downloads statistics for the SF web pages? I'll bet no one >> has ever looked at them. AMK> Wasn't the original goal of the snapshots to create an audit AMK> trail, guarding against someone Trojaning the CVS repository? AMK> And didn't Sean Reifscheider burn a CD containing a whole bunch AMK> of snapshots? There are two different sets of snapshots. One is the copies of the CVS repository that Barry makes every night. The snapshots I'm talking about are cvs checkouts done every night. We did this because someone requested them, and it wasn't too much trouble. But now I'm wondering whether continued maintenance is worth the trouble. Jeremy From guido@python.org Mon Jun 10 15:39:10 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 10:39:10 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: Your message of "Mon, 10 Jun 2002 05:46:06 EDT." <15620.30046.267559.90616@slothrop.zope.com> References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> <20020610142539.GA14084@ute.mems-exchange.org> <15620.30046.267559.90616@slothrop.zope.com> Message-ID: <200206101439.g5AEdBp24496@pcp02138704pcs.reston01.va.comcast.net> > The snapshots I'm talking about are cvs checkouts done every night. > We did this because someone requested them, and it wasn't too much > trouble. 
But now I'm wondering whether continued maintenance is > worth the trouble. OK, lose them. If there's a real need, someone in the community will take over. --Guido van Rossum (home page: http://www.python.org/~guido/) From niemeyer@conectiva.com Mon Jun 10 15:53:34 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 10 Jun 2002 11:53:34 -0300 Subject: [Python-Dev] Null checking In-Reply-To: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> Message-ID: <20020610115334.A3324@ibook.distro.conectiva> Hello David! > A couple of quick questions for the authors of the Python source: I notice > that most, if not all, of the Python 'C' API includes null checks for the > PyObject* arguments, meaning that you can't crash Python by passing the > result of a previous operation, even if it returns an error. > > First question: can that be counted on? Hmm, I guess I've answered my own > question -- PyNumber_InPlaceAdd has no checks. Something that should also be noticed is that even if Python doesn't break, leaving errors around until a later point results in more trouble debugging this code if something goes wrong, and maybe even wrong error messages being sent to the user. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From tim.one@comcast.net Mon Jun 10 16:30:22 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 11:30:22 -0400 Subject: [Python-Dev] unittest and sockets. Ugh!? In-Reply-To: <20020608214341.G9486@eecs.tufts.edu> Message-ID: BTW, if you want to run threads with unittest, I expect you'll have to ensure that only the thread that starts unittest reports errors to unittest. I'll call that "the main thread". You should be aware that if a non-main thread dies, unittest won't know that. A common problem in the threaded tests PLabs has written is that a thread dies an ignoble death but unittest goes on regardless and says "ok!" 
at the end; if you didn't stare at all the output, you never would have known something went wrong. So wrap the body of your thread's work in a catch-all try/except, and if anything goes wrong communicate that back to the main thread. For example, a Queue object (one or more) could work nicely for this. From skip@mojam.com Mon Jun 10 16:42:15 2002 From: skip@mojam.com (Skip Montanaro) Date: Mon, 10 Jun 2002 10:42:15 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200206101542.g5AFgFR11272@12-248-41-177.client.attbi.com> Bug/Patch Summary ----------------- 263 open / 2562 total bugs (-3) 131 open / 1541 total patches (-5) New Bugs -------- Missing operator docs (2002-06-02) http://python.org/sf/563530 urllib2 can't cope with error response (2002-06-02) http://python.org/sf/563665 os.tmpfile should use w+b, not w+ (2002-06-02) http://python.org/sf/563750 getttext defaults with unicode (2002-06-03) http://python.org/sf/563915 FixTk.py logic wrong (2002-06-04) http://python.org/sf/564729 compile traceback must include filename (2002-06-05) http://python.org/sf/564931 IDLE needs printing (2002-06-06) http://python.org/sf/565373 urllib FancyURLopener.__init__ / urlopen (2002-06-06) http://python.org/sf/565414 ImportError: No module named _socket (2002-06-07) http://python.org/sf/565710 crash on gethostbyaddr (2002-06-07) http://python.org/sf/565747 string.replace() can corrupt heap (2002-06-07) http://python.org/sf/565993 telnetlib makes Python dump core (2002-06-07) http://python.org/sf/566006 Popen exectuion blocking w/threads (2002-06-07) http://python.org/sf/566037 Bgen should generate 7-bit-clean code (2002-06-08) http://python.org/sf/566302 PyUnicode_Find() returns wrong results (2002-06-09) http://python.org/sf/566631 rotormodule's set_key calls strlen (2002-06-10) http://python.org/sf/566859 Typo in "What's new in Python 2.3" (2002-06-10) http://python.org/sf/566869 New Patches ----------- experimental support for extended slicing on 
lists (2000-07-27) http://python.org/sf/400998 posixmodule.c RedHat 6.1 (bug #535545) (2002-06-03) http://python.org/sf/563954 error in weakref.WeakKeyDictionary (2002-06-04) http://python.org/sf/564549 modulefinder and string methods (2002-06-05) http://python.org/sf/564840 email Parser non-strict mode (2002-06-06) http://python.org/sf/565183 Expose _Py_ReleaseInternedStrings (2002-06-06) http://python.org/sf/565378 Rationalize DL_IMPORT and DL_EXPORT (2002-06-07) http://python.org/sf/566100 fix bug in shutil.rmtree exception case (2002-06-09) http://python.org/sf/566517 Closed Bugs ----------- Coercion rules incomplete (2001-05-07) http://python.org/sf/421973 clean doesn't (2002-01-29) http://python.org/sf/510186 __slots__ may lead to undetected cycles (2002-02-18) http://python.org/sf/519621 make fails at posixmodule.c (2002-03-26) http://python.org/sf/535545 Warn for __coerce__ in new-style classes (2002-04-22) http://python.org/sf/547211 possible to fail to calc mro's (2002-05-02) http://python.org/sf/551412 UTF-16 BOM handling counterintuitive (2002-05-13) http://python.org/sf/555360 TclError is a str should be an Exception (2002-05-17) http://python.org/sf/557436 Shutdown of IDLE blows up (2002-05-19) http://python.org/sf/558166 rfc822.Message.get() incompatibility (2002-05-20) http://python.org/sf/558179 imaplib.IMAP4.open() typo (2002-05-23) http://python.org/sf/559884 PyType_IsSubtype can segfault (2002-05-24) http://python.org/sf/560215 foo() doesn't use __getattribute__ (2002-05-25) http://python.org/sf/560438 Maximum recursion limit exceeded (2002-05-27) http://python.org/sf/561047 Module can be used as a base class (2002-05-31) http://python.org/sf/563060 Heap corruption in debug (2002-06-01) http://python.org/sf/563303 Add separator argument to readline() (2002-06-02) http://python.org/sf/563491 Closed Patches -------------- Remote execution patch for IDLE (2001-07-11) http://python.org/sf/440407 GNU/Hurd doesn't have large file support (2001-12-27) 
http://python.org/sf/497099 building a shared python library (2001-12-27) http://python.org/sf/497102 make setup.py less chatty by default (2002-01-17) http://python.org/sf/504889 Make doc strings optional (2002-01-18) http://python.org/sf/505375 Distutils & non-installed Python (2002-04-23) http://python.org/sf/547734 test_commands.py using . incorrectly (2002-05-03) http://python.org/sf/551911 Fix breakage of smtplib.starttls() (2002-05-03) http://python.org/sf/552060 Cygwin AH_BOTTOM cleanup patch (2002-05-14) http://python.org/sf/555929 Expose xrange type in builtins (2002-05-23) http://python.org/sf/559833 isinstance error message (2002-05-24) http://python.org/sf/560250 webchecker chokes at charsets. (2002-05-28) http://python.org/sf/561478 Getting rid of string, types and stat (2002-05-30) http://python.org/sf/562373

From bernie@3captus.com Mon Jun 10 17:04:40 2002
From: bernie@3captus.com (Bernard Yue)
Date: Mon, 10 Jun 2002 10:04:40 -0600
Subject: [Python-Dev] unittest and sockets. Ugh!?
References:
Message-ID: <3D04CE18.7404CA90@3captus.com>

Tim Peters wrote:
>
> BTW, if you want to run threads with unittest, I expect you'll have to
> ensure that only the thread that starts unittest reports errors to unittest.
> I'll call that "the main thread". You should be aware that if a non-main
> thread dies, unittest won't know that. A common problem in the threaded
> tests PLabs has written is that a thread dies an ignoble death but unittest
> goes on regardless and says "ok!" at the end; if you didn't stare at all the
> output, you never would have known something went wrong.
>
> So wrap the body of your thread's work in a catch-all try/except, and if
> anything goes wrong communicate that back to the main thread. For example,
> a Queue object (one or more) could work nicely for this.

Thanks for the good tip, Tim! I will add those to my test case. The main challenge in socket testing, however, lies in thread synchronization.
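(A minimal sketch of Tim's suggestion — wrap the whole thread body in a catch-all try/except and report failures back over a Queue that the main thread drains; the worker body and the deliberate ValueError below are stand-ins, and the syntax is present-day rather than 2002-era:)

```python
import threading
import queue

def careful_worker(errors):
    # Wrap the *whole* body, so a thread that dies an ignoble death
    # is reported to the main thread instead of vanishing silently.
    try:
        assert 1 + 1 == 2          # stand-in for the real test work
        raise ValueError("boom")   # simulate the thread dying
    except Exception as exc:
        errors.put(exc)

errors = queue.Queue()
worker = threading.Thread(target=careful_worker, args=(errors,))
worker.start()
worker.join()

# Back in the main (unittest-running) thread: drain the queue and
# collect anything the workers reported.
failures = []
while not errors.empty():
    failures.append(errors.get())
```

In a real TestCase the drain loop would end with something like an assertion that `failures` is empty, so unittest can no longer say "ok!" when a worker died.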
I have made some progress on that front, Michael. See if the following code fragment helps:

class socketObjTestCase(unittest.TestCase):
    """Test Case for Socket Object"""

    def setUp(self):
        self.__s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.__server_addr = ('127.0.0.1', 25339)
        self.__ready = threading.Event()
        self.__done = threading.Event()
        self.__quit = threading.Event()
        self.__server = threading.Thread(target=server,
                                         args=(self.__server_addr,
                                               self.__ready,
                                               self.__done,
                                               self.__quit))
        self.__server.start()
        self.__ready.wait()

    def tearDown(self):
        self.__s.close()
        self.__quit.set()
        self.__server.join()
        del self.__server
        self.__done.wait()
        self.__ready.clear()
        self.__done.clear()
        self.__quit.clear()

class server:
    def __init__(self, addr, ready, done, quit):
        self.__ready = ready
        self.__dead = done
        self.__quit = quit
        self.__s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.__s.setblocking(0)
        self.__s.bind(addr)
        self.__s.listen(1)
        self.getclient()

    def __del__(self):
        self.__dead.set()

    def getclient(self):
        self.__ready.set()
        while not self.__quit.isSet():
            try:
                _client, _addr = self.__s.accept()
                self.serveclient(_client, _addr)
            except socket.error, msg:
                pass
        self.__s.shutdown(2)

    def serveclient(self, sock, addr):
        print sock, addr

Bernie

From mgilfix@eecs.tufts.edu Mon Jun 10 16:43:59 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 10 Jun 2002 11:43:59 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: ; from tim.one@comcast.net on Mon, Jun 10, 2002 at 11:30:22AM -0400
References: <20020608214341.G9486@eecs.tufts.edu>
Message-ID: <20020610114359.C25627@eecs.tufts.edu>

Cool. Thanks for the tip. You wouldn't happen to have any publicly available examples? Couldn't hurt to see someone else's layout so I can be sure I have the best structure for mine.
-- Mike

On Mon, Jun 10 @ 11:30, Tim Peters wrote:
> BTW, if you want to run threads with unittest, I expect you'll have to
> ensure that only the thread that starts unittest reports errors to unittest.
> I'll call that "the main thread". You should be aware that if a non-main
> thread dies, unittest won't know that. A common problem in the threaded
> tests PLabs has written is that a thread dies an ignoble death but unittest
> goes on regardless and says "ok!" at the end; if you didn't stare at all the
> output, you never would have known something went wrong.
>
> So wrap the body of your thread's work in a catch-all try/except, and if
> anything goes wrong communicate that back to the main thread. For example,
> a Queue object (one or more) could work nicely for this.
`-> (tim.one)

-- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

From loewis@informatik.hu-berlin.de Mon Jun 10 18:04:43 2002
From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=)
Date: 10 Jun 2002 19:04:43 +0200
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <15620.28730.145429.690221@slothrop.zope.com>
References: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com>
Message-ID:

Jeremy Hylton writes:
> I'm not sure the snapshots are worth the bother at all. Are there
> download statistics for the SF web pages? I'll bet no one has ever
> looked at them.

My recommendation would be to disable the script, and remove the snapshots, perhaps leaving a page that anybody who wants the snapshots should ask at python-dev to re-enable them.
Regards, Martin

From gward@python.net Mon Jun 10 21:18:00 2002
From: gward@python.net (Greg Ward)
Date: Mon, 10 Jun 2002 16:18:00 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D039C43.786EE3D8@prescod.net>
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> <3D015CD8.DE7C4BC2@prescod.net> <20020609002722.GA3750@gerg.ca> <3D039C43.786EE3D8@prescod.net>
Message-ID: <20020610201800.GB7655@gerg.ca>

On 09 June 2002, Paul Prescod said:
> I'm not clear on why the "width" argument is special and should be on
> the wrap method rather than in the constructor. But I suspect most
> people will use the convenience functions so they'll never know the
> difference.

Beats me. It does seem kind of silly. I think I'll go fix it now.

Greg
--
Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ "He's dead, Jim. You get his tricorder and I'll grab his wallet."

From gward@python.net Mon Jun 10 21:17:15 2002
From: gward@python.net (Greg Ward)
Date: Mon, 10 Jun 2002 16:17:15 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <20020609184425.18907.qmail@web9601.mail.yahoo.com>
References: <3D039C43.786EE3D8@prescod.net> <20020609184425.18907.qmail@web9601.mail.yahoo.com>
Message-ID: <20020610201714.GA7655@gerg.ca>

On 09 June 2002, Steven Lott said:
> Here's a version with the Strategy classes included. This
> allows for essentially unlimited alternatives on the subjects of
> long words, full stops, and also permits right justification.

Ahh, very interesting. Smells like massive flaming overkill here, but at least now I understand what you meant by "strategy class". (I kept having visions of a classroom full of kids playing chess... design patterns are great, as long as everyone has a copy of *Design Patterns* on their desk.
;-) I think my main reservation about this technique is that it does nothing to make the simplest case simpler, and it makes the slightly complex case ("I just want to disable breaking long words") a hell of a lot harder. Greg -- Greg Ward - just another /P(erl|ython)/ hacker gward@python.net http://starship.python.net/~gward/ The NSA. We care: we listen to our customers. From dan@sidhe.org Mon Jun 10 21:42:45 2002 From: dan@sidhe.org (Dan Sugalski) Date: Mon, 10 Jun 2002 16:42:45 -0400 Subject: [Python-Dev] Parrot in Phoenix Message-ID: Dunno if anyone's interested, but I'll be giving a full-on presentation on Parrot, the dynamic language interpreter we're building to implement Perl 6 on top of, on June 20th in Phoenix, AZ to the Phoenix perlmongers. If anyone's interested, drop me a note and I'll get you directions and update the head count. If you're worried about being surrounded by perl people, don't--you'll be surrounded by perl *and* ruby people. :) (Some of the folks involved in Cardinal, the project to layer Ruby on top of Parrot, will be there too) -- Dan --------------------------------------"it's like this"------------------- Dan Sugalski even samurai dan@sidhe.org have teddy bears and even teddy bears get drunk From tim.one@comcast.net Mon Jun 10 22:34:27 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 10 Jun 2002 17:34:27 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59 In-Reply-To: Message-ID: > Update of /cvsroot/python/python/nondist/peps > In directory usw-pr-cvs1:/tmp/cvs-serv32728 > > Modified Files: > pep-0042.txt > Log Message: > Added another wish. Removed a bunch of fulfilled wishes (no guarantee > that I caught all of 'em). ... > - - Port the Python SSL code to Windows. > - > - http://www.python.org/sf/210683 That this was done is news to me. That doesn't mean it isn't true . Python 2.3a0 (#29, Jun 1 2002, 02:50:59) [MSC 32 bit (Intel)] on win32 ... 
>>> import socket >>> hasattr(socket, 'ssl') False >>> From guido@python.org Mon Jun 10 22:46:10 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Jun 2002 17:46:10 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59 In-Reply-To: Your message of "Mon, 10 Jun 2002 17:34:27 EDT." References: Message-ID: <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> > > - - Port the Python SSL code to Windows. > > - > > - http://www.python.org/sf/210683 > > That this was done is news to me. That doesn't mean it isn't true . AFAIK the C code works on Windows; I've heard repeated confirmations. It's just that we don't feel like configuring it (and there isn't a lot of demand AFAICT :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From andymac@bullseye.apana.org.au Mon Jun 10 21:26:05 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Tue, 11 Jun 2002 07:26:05 +1100 (edt) Subject: [Python-Dev] Socket timeout patch In-Reply-To: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Fri, 7 Jun 2002, Guido van Rossum wrote: > I've more or less completed the introduction of timeout sockets. {...} > - Cross-platform testing. It's possible that the cleanup broke things > on some platforms, or that select() doesn't work the same way. I > can only test on Windows and Linux; there is code specific to OS/2 > and RISCOS in the module too. wrt OS/2: sock_init() is an OS/2 TCPIP public symbol, which is used in the OS/2 os_init() (about line 2982 of socketmodule.c, as of yesterday). This of course clashes with the sock_init() defined in socketmodule.c. Even though the EMX port doesn't need the underlying sock_init(), EMX' socket.h defines sock_init() for compatibility with VACPP. Once the name clash is resolved, the module compiles and completes test_socket with no problems. -- Andrew I MacIntyre "These thoughts are mine alone..." 
E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From DavidA@ActiveState.com Tue Jun 11 07:51:29 2002 From: DavidA@ActiveState.com (David Ascher) Date: Mon, 10 Jun 2002 23:51:29 -0700 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59 References: <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D059DF1.4020003@ActiveState.com> Guido van Rossum wrote: >>>- - Port the Python SSL code to Windows. >>>- >>>- http://www.python.org/sf/210683 >>> >>That this was done is news to me. That doesn't mean it isn't true . >> > >AFAIK the C code works on Windows; I've heard repeated confirmations. >It's just that we don't feel like configuring it (and there isn't a >lot of demand AFAICT :-). > It works. We've tested it, and we have some customers who need it. The test suite was somewhat busted, and there was a bug in 2.2.0, but 2.2.1 is fine. The entire feature is somewhat underdocumented, though =(. --david From skip@pobox.com Tue Jun 11 17:22:08 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 11 Jun 2002 11:22:08 -0500 Subject: [Python-Dev] Please give this patch for building bsddb a try Message-ID: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> If you build the bsddb module on a Unix-like system (that is, you use configure and setup.py to build the interpreter and it attempts to build the bsddb module), please give the new patch attached to http://python.org/sf/553108 a try. Ignore the subject of the patch. I just tacked my patch onto this item and assigned it to myself. If/when the issue is settled I'll track down and close other patches and bug reports related to building the bsddb module. Briefly, it attempts the following: 1. 
Makes it inconvenient (though certainly not impossible) to build/link with version 1 of the Berkeley DB library by commenting out the relevant part of the db_try_this dictionary in setup.py. 2. Links the search for a DB library and corresponding include files so you don't find a version 2 include file and a version 3 library (for example). 3. Attempts to do the same for the dbm module when it decides to link with the Berkeley DB library for compatibility (this is stuff under "development" and will almost certainly require further changes). (You can ignore the debug print I forgot to remove before creating the patch. ;-) I asked on c.l.py about where people have the Berkeley DB stuff installed so I could tune the locations listed in db_try_this, but the thread almost immediately went off into the weeds arguing about berkdb license issues. I therefore humbly request your more rational input on this topic. If you have a Unix-ish system and Berkeley DB is installed somewhere not listed in the db_try_this dictionary in setup.py, please let me know. Thx, Skip From Oleg Broytmann Tue Jun 11 17:39:06 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Tue, 11 Jun 2002 20:39:06 +0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>; from skip@pobox.com on Tue, Jun 11, 2002 at 11:22:08AM -0500 References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> Message-ID: <20020611203906.V6026@phd.pp.ru> Hello! On Tue, Jun 11, 2002 at 11:22:08AM -0500, Skip Montanaro wrote: > http://python.org/sf/553108 > > 1. Makes it inconvenient (though certainly not impossible) to build/link > with version 1 of the Berkeley DB library by commenting out the > relevant part of the db_try_this dictionary in setup.py. Can I have two different modules simultaneously? For example, a module linked with db.1.85 plus a module linked with db3. > 2. 
Links the search for a DB library and corresponding include files so > you don't find a version 2 include file and a version 3 library (for > example).

After compiling bsddb-3.2 from sources I have got /usr/local/BerkeleyDB.3.2/ directory, with lib/include being its subdirectories. The patch didn't look into this, as I understand it.

Oleg.
-- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

From skip@pobox.com Tue Jun 11 17:58:42 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 11 Jun 2002 11:58:42 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020611203906.V6026@phd.pp.ru>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru>
Message-ID: <15622.11330.948519.279929@12-248-41-177.client.attbi.com>

>> 1. Makes it inconvenient (though certainly not impossible) to
>> build/link with version 1 of the Berkeley DB library by commenting
>> out the relevant part of the db_try_this dictionary in setup.py.

Oleg> Can I have two different modules simultaneously? For example, a
Oleg> module linked with db.1.85 plus a module linked with db3.

Nope. I don't believe you can do that today (at least not without some build-time gymnastics), and I have no plans to support that. For one thing, you'd have to compile and link bsddbmodule.c twice. To allow multiple versions to be loaded into the interpreter you'd also have to name them differently. This would require source code changes to keep global symbols (at least the module init functions) from clashing.

>> 2. Links the search for a DB library and corresponding include files
>> so you don't find a version 2 include file and a version 3 library
>> (for example).

Oleg> After compiling bsddb-3.2 from sources I have got
Oleg> /usr/local/BerkeleyDB.3.2/ directory, with lib/include being its
Oleg> subdirectories. The patch didn't look into this, as I understand
Oleg> it.
Thanks, I'll add that. I also notice that /usr/local/BerkeleyDB.4.0 is the default install directory for the 4.0 source. Skip From guido@python.org Tue Jun 11 19:28:05 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Jun 2002 14:28:05 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59 In-Reply-To: Your message of "Mon, 10 Jun 2002 23:51:29 PDT." <3D059DF1.4020003@ActiveState.com> References: <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> <3D059DF1.4020003@ActiveState.com> Message-ID: <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net> [About SSL on Windows] > It works. We've tested it, and we have some customers who need it. The > test suite was somewhat busted, and there was a bug in 2.2.0, but 2.2.1 > is fine. > > The entire feature is somewhat underdocumented, though =(. What do you mean. Is this not enough? :-) http://www.python.org/doc/current/lib/ssl-objects.html --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue Jun 11 20:00:24 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 11 Jun 2002 21:00:24 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15622.11330.948519.279929@12-248-41-177.client.attbi.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > This would require source code changes to keep global symbols (at > least the module init functions) from clashing. It actually only requires different init functions. To support that with distutils, you need to tell distutils to generate different object files from the same source file, which is probably not supported out of the box. 
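(A toy sketch of what "different init functions from one source file" amounts to — generating two renamed copies of a single module template, roughly the sed/awk-style trick that comes up in this thread. The template below is invented for illustration; it is not the real bsddbmodule.c.)

```python
# One shared "template" for the module source; each generated copy
# gets its own init<name>() symbol, so two builds of the same code
# could be linked into one interpreter without clashing.
TEMPLATE = """\
/* generated file -- do not edit */
#define MODULE_NAME "%(name)s"
void init%(name)s(void) { /* module setup for %(db)s goes here */ }
"""

def generate(name, db):
    # Stamp out one concrete module source from the template.
    return TEMPLATE % {'name': name, 'db': db}

src185 = generate('bsddb185', 'Berkeley DB 1.85')
src3 = generate('bsddb3', 'Berkeley DB 3.x')
```

Each generated file would then be compiled to its own object file, sidestepping the problem Martin describes of distutils reusing one object file per source file.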
Regards, Martin From Oleg Broytmann Tue Jun 11 20:48:52 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Tue, 11 Jun 2002 23:48:52 +0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: ; from martin@v.loewis.de on Tue, Jun 11, 2002 at 09:00:24PM +0200 References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> Message-ID: <20020611234852.D23356@phd.pp.ru> On Tue, Jun 11, 2002 at 09:00:24PM +0200, Martin v. Loewis wrote: > Skip Montanaro writes: > > > This would require source code changes to keep global symbols (at > > least the module init functions) from clashing. > > It actually only requires different init functions. To support that > with distutils, you need to tell distutils to generate different > object files from the same source file, which is probably not > supported out of the box. I know. Once I thought about sed/awk magic to generate two different modules from one template. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From haering_python@gmx.de Tue Jun 11 20:58:48 2002 From: haering_python@gmx.de (Gerhard =?iso-8859-15?Q?H=E4ring?=) Date: Tue, 11 Jun 2002 21:58:48 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020611234852.D23356@phd.pp.ru> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <20020611234852.D23356@phd.pp.ru> Message-ID: <20020611195848.GA27976@lilith.my-fqdn.de> * Oleg Broytmann [2002-06-11 23:48 +0400]: > On Tue, Jun 11, 2002 at 09:00:24PM +0200, Martin v. Loewis wrote: > > Skip Montanaro writes: > > > > > This would require source code changes to keep global symbols (at > > > least the module init functions) from clashing. > > > > It actually only requires different init functions. 
To support that > > with distutils, you need to tell distutils to generate different > > object files from the same source file, which is probably not > > supported out of the box. > > I know. Once I thought about sed/awk magic to generate two different > modules from one template. What about symlinks, like: bsd18module.c -> bsd30module.c bsd30module.c and using a few #ifdefs in the C sources? Gerhard -- This sig powered by Python! Außentemperatur in München: 17.1 °C Wind: 1.7 m/s From martin@v.loewis.de Tue Jun 11 21:17:15 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 11 Jun 2002 22:17:15 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020611195848.GA27976@lilith.my-fqdn.de> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <20020611234852.D23356@phd.pp.ru> <20020611195848.GA27976@lilith.my-fqdn.de> Message-ID: Gerhard Häring writes: > What about symlinks, like: That can't work on Windows. Regards, Martin From skip@pobox.com Tue Jun 11 21:18:09 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 11 Jun 2002 15:18:09 -0500 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> Message-ID: <15622.23297.193301.295155@12-248-41-177.client.attbi.com> >> This would require source code changes to keep global symbols (at >> least the module init functions) from clashing. Martin> It actually only requires different init functions. To support Martin> that with distutils, you need to tell distutils to generate Martin> different object files from the same source file, which is Martin> probably not supported out of the box. Thanks for the clarification, Martin.
Even though this seems possible with minimal changes to the source, I still think supporting this is not worth it. Oleg, at your end I suspect you could fairly easily copy a bit of code in setup.py, copy bsddbmodule.c to bsddb1module.c, and rename the module init function. Skip From jon+python-dev@unequivocal.co.uk Tue Jun 11 21:19:59 2002 From: jon+python-dev@unequivocal.co.uk (Jon Ribbens) Date: Tue, 11 Jun 2002 21:19:59 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59 In-Reply-To: <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Tue, Jun 11, 2002 at 02:28:05PM -0400 References: <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> <3D059DF1.4020003@ActiveState.com> <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020611211959.M14101@snowy.squish.net> Guido van Rossum wrote: > > The entire feature is somewhat underdocumented, though =(. > > What do you mean. Is this not enough? :-) > > http://www.python.org/doc/current/lib/ssl-objects.html What about socket.sslerror, socket.SSL_ERROR_*, what to do about the various socket.SSL_ERROR_* values, etc? From guido@python.org Tue Jun 11 21:28:30 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Jun 2002 16:28:30 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: Your message of "Tue, 11 Jun 2002 21:58:48 +0200." <20020611195848.GA27976@lilith.my-fqdn.de> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <20020611234852.D23356@phd.pp.ru> <20020611195848.GA27976@lilith.my-fqdn.de> Message-ID: <200206112028.g5BKSUD29813@pcp02138704pcs.reston01.va.comcast.net> > > I know. Once I thought about sed/awk magic to generate two different > > modules from one template. 
> > What about symlinks, like: > > bsd18module.c -> bsd30module.c > bsd30module.c > > and using a few #ifdefs in the C sources? Instead of symlinks, how about one .h file containing most of the code and two or three .c files that set a few #defines and then #include the .h file? Similar to what we do for Python/threads.c --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 11 21:53:39 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Jun 2002 16:53:39 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59 In-Reply-To: Your message of "Tue, 11 Jun 2002 21:19:59 BST." <20020611211959.M14101@snowy.squish.net> References: <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> <3D059DF1.4020003@ActiveState.com> <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net> <20020611211959.M14101@snowy.squish.net> Message-ID: <200206112053.g5BKrd329949@pcp02138704pcs.reston01.va.comcast.net> > > What do you mean. Is this not enough? :-) > > > > http://www.python.org/doc/current/lib/ssl-objects.html > > What about socket.sslerror, socket.SSL_ERROR_*, what to do about the > various socket.SSL_ERROR_* values, etc? I was kidding. I'm hoping that someone who has used this stuff can contribute (a) a little more fleshed-out docs, and (b) a working example that goes beyond implementing an "https://..." URL. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 11 22:11:20 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Jun 2002 17:11:20 -0400 Subject: [Python-Dev] urllib.py and 303 redirect Message-ID: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net> There seems to be a move in the HTTP world to move away from 302 redirect responses to 303 redirects. 
The 302 response was poorly specified in the original HTTP spec, and most browsers implemented it by doing a GET on the redirected URL even if the original request was a POST, but the spec was ambiguous, and some browsers implemented it by repeating the original request to the redirected URL. The urllib module does the latter. The HTTP/1.1 spec now recommends doing the former, and it also has two new responses, 303 to unambiguously specify a redirect that must use a GET, and 307 to specify the original intent of 302. More info is on this page, which discusses the issue from a Zope perspective (which is how I found out about this):

http://dev.zope.org/Wikis/DevSite/Projects/ComponentArchitecture/Use303RedirectsByDefault

and here is a nice general treatise on the subject:

http://ppewww.ph.gla.ac.uk/~flavell/www/post-redirect.html

It's clear that urllib would be wise to implement a handler for the 303 response. A 307 handler would be useful too. But the big question is, should we also change the 302 handler to implement the HTTP/1.1 recommended behavior? I vaguely remember that the 302 handler used to do this and that it was "fixed", but I can't find it in the CVS log. Changing it *could* break applications, but is more likely to unbreak them, given that this is now the spec's recommendation. Opinions?

Whatever we do should probably also be backported to Python 2.2.1.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz@pythoncraft.com Wed Jun 12 00:29:16 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 11 Jun 2002 19:29:16 -0400
Subject: [Python-Dev] urllib.py and 303 redirect
In-Reply-To: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net>
References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020611232916.GA3126@panix.com>

On Tue, Jun 11, 2002, Guido van Rossum wrote:
>
> Whatever we do should probably also be backported to Python 2.2.1.

Should it?
IMO, not unless someone stands forward with a clear case that the current behavior for 302 is buggy. If the current behavior is simply ambiguous and works well enough in many situations, I think that changing semantics would be counter to the intention for bugfix releases. No objection here to adding 303 and 307 handlers, though. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I had lots of reasonable theories about children myself, until I had some." --Michael Rios From guido@python.org Wed Jun 12 00:54:00 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Jun 2002 19:54:00 -0400 Subject: [Python-Dev] urllib.py and 303 redirect In-Reply-To: Your message of "Tue, 11 Jun 2002 19:29:16 EDT." <20020611232916.GA3126@panix.com> References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net> <20020611232916.GA3126@panix.com> Message-ID: <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net> > > Whatever we do should probably also be backported to Python 2.2.1. > > Should it? IMO, not unless someone stands forward with a clear case > that the current behavior for 302 is buggy. If the current behavior > is simply ambiguous and works well enough in many situations, I > think that changing semantics would be counter to the intention for > bugfix releases. The recommendation in the HTTP/1.1 standard is unclear IMO. It says: | 10.3.3 302 Found | | The requested resource resides temporarily under a different | URI. Since the redirection might be altered on occasion, the client | SHOULD continue to use the Request-URI for future requests. This | response is only cacheable if indicated by a Cache-Control or | Expires header field. | | The temporary URI SHOULD be given by the Location field in the | response. Unless the request method was HEAD, the entity of the | response SHOULD contain a short hypertext note with a hyperlink to | the new URI(s). OK so far. 
| If the 302 status code is received in response to a request other | than GET or HEAD, the user agent MUST NOT automatically redirect the | request unless it can be confirmed by the user, since this might | change the conditions under which the request was issued. I *think* this says that the current urllib behavior (to reissue a POST request to the redirected URL) should *not* be done, since there is no user confirmation. | Note: RFC 1945 and RFC 2068 specify that the client is not | allowed to change the method on the redirected request. | However, most existing user agent implementations treat 302 as | if it were a 303 response, performing a GET on the Location | field-value regardless of the original request method. The | status codes 303 and 307 have been added for servers that wish | to make unambiguously clear which kind of reaction is expected | of the client. This is ambiguous but suggests that changing PUT to GET is what most servers expect by now. > No objection here to adding 303 and 307 handlers, though. Could I shame you into submitting a patch? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Wed Jun 12 01:19:54 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 11 Jun 2002 20:19:54 -0400 Subject: [Python-Dev] urllib.py and 303 redirect In-Reply-To: <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net> References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net> <20020611232916.GA3126@panix.com> <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020612001954.GA10924@panix.com> >> No objection here to adding 303 and 307 handlers, though. > > Could I shame you into submitting a patch? :-) Not until I figure out how I want to deal with SF not working with Lynx; haven't been able to get anyone at SF interested in talking about fixing the problem, and I've been too busy with writing (OSCON slides and book) to investigate alternatives. 
(Though I'll probably bite the bullet and switch to using PPP. :-( Need to do something about ISPs for that; my backup account supports PPP, but I'm limited in number of hours per month.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I had lots of reasonable theories about children myself, until I had some." --Michael Rios From David Abrahams" My initial post asking about the implementation of += sparked a small thread over on python-list, from which I've come to the conclusion that my little optimization suggestion (don't try to set the attribute or item if the inplace op returns its first argument) is actually more semantically correct. For better or worse, these ideas aren't all mine, as http://aspn.activestate.com/ASPN/Mail/Message/python-list/1222524 attests. Consider: >>> t = ([1],[2],[3]) >>> t[0].append(2) # OK, the elements of the tuple are mutable >>> t ([1, 2], [2], [3]) >>> >>> t[1] += [3] # ?? Just did the equivalent operation above Traceback (most recent call last): File "", line 1, in ? TypeError: object doesn't support item assignment >>> t # Despite the exception, the operation succeeded! ([1, 2], [2, 3], [3]) So, the exception happens because the user is ostensibly trying to modify this immutable tuple... but of course she's not. She's just trying to modify the element of the tuple, which is itself mutable, and that makes the exception surprising. Even more surprising in light of the exception is the fact that everything seems to have worked. In order to get this all to make sense, she needs to twist her brain into remembering that inplace operations don't really just modify their targets "in place", but also try to "replace" them. However, if we just set up the inplace operations so that when they return the original object there's no "replace", all of these problems go away. We don't lose any safety; trying to do += on an immutable tuple element will still fail. 
Also it makes tuples a generic replacement for lists in more places. There are other, more-perverse cases which the proposed change in semantics would also fix. For example: >>> class X(object): ... def __init__(self, l): ... self.container = l # will form a cycle ... self.stuff = [] ... def __iadd__(self, other): ... self.stuff += other # add to X's internal list ... del self.container[0] ... return self ... >>> l = [ 1, 2, 3] >>> l.append(X(l)) # the X element refers to l >>> l [1, 2, 3, <__main__.X object at 0x00876668>] >>> l[3] += 'a' # the element is gone by write-back time. Traceback (most recent call last): File "", line 1, in ? IndexError: list assignment index out of range >>> l # But everything succeeded [2, 3, <__main__.X object at 0x00876668>] >>> l[2].stuff ['a'] >>> l.append('tail') # this one's even weirder >>> l[2] += 'a' >>> l [3, <__main__.X object at 0x00876668>, <__main__.X object at 0x00876668>] These are too esoteric to be compelling on their own, but my proposal would also make them work as expected. Thoughts? -Dave ------------ P.S. This error message is kind of weird: >>> t = ([1],[2],[3]) >>> t[1] += 3 Traceback (most recent call last): File "", line 1, in ? TypeError: argument to += must be iterable ^^^^^^^^^^^^^^^^ ?? +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From guido@python.org Wed Jun 12 03:06:42 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Jun 2002 22:06:42 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: Your message of "Tue, 11 Jun 2002 20:38:38 EDT." 
<00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> Message-ID: <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net> > My initial post asking about the implementation of += sparked a > small thread over on python-list, from which I've come to the > conclusion that my little optimization suggestion (don't try to set > the attribute or item if the inplace op returns its first argument) > is actually more semantically correct. [...] > Thoughts? One problem is that it's really hard to design the bytecode so that this can be implemented. The problem is that the compiler sees this: a[i] += x and must compile bytecode that works for all cases: a can be mutable or immutable, and += could return the same or a different object as a[i]. It currently generates code that uses a STORE_SUBSCR opcode (store into a[i]) with the saved value of the object and index used for the BINARY_SUBSCR (load from a[i]) opcode. It would have to generate additional code to (a) save the object retrieved from a[i], (b) compare the result to it using the 'is' operator, and (c) pop some stuff off the stack and skip over the assignment if true. That could be done, but the extra test would definitely slow things down. A worse problem is that it's a semantic change. For example, persistent objects in Zope require (under certain circumstances) that if you modify an object that lives in a persistent container, you have to store it back into the container in order for the persistency mechanism to pick up on the change. Currently we can rely on a[i]+=x and a.foo+=x to do the assignment. Under your proposal, we couldn't (unless we knew that the item was of an immutable type). That is such a subtle change in semantics that I don't want to risk it without years of transitional warnings. Personally, I'd rather accept that if you have a = ([], [], []), a[1]+=[2] won't work. You can always write a[1].extend([2]). 
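[Editor's note: the three-step expansion Guido describes, and the skip-the-store variant David proposes, can be sketched in plain Python. `inplace_setitem` is a hypothetical illustration, not anything in the interpreter.]

```python
def inplace_setitem(a, i, x, skip_store_if_same=False):
    # (1) get the thing out                       -- BINARY_SUBSCR
    obj = a[i]
    # (2) apply the inplace operation to it       -- INPLACE_ADD
    if hasattr(type(obj), '__iadd__'):
        result = type(obj).__iadd__(obj, x)
    else:
        result = obj + x
    # (3) put the thing back in                   -- STORE_SUBSCR
    # David's proposal: skip this step when the op returned its first
    # operand, so tuples holding mutable elements would not raise.
    if skip_store_if_same and result is obj:
        return
    a[i] = result

t = ([1], [2], [3])
try:
    inplace_setitem(t, 1, [3])      # current semantics: the store raises
except TypeError:
    pass
assert t[1] == [2, 3]               # ...but the list was already mutated
```

With `skip_store_if_same=True` the same call completes without an exception, which is exactly the behavior change under discussion.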
--Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <018d01c21208$1efadab0$6501a8c0@boostconsulting.com> From: "Guido van Rossum" > > My initial post asking about the implementation of += sparked a > > small thread over on python-list, from which I've come to the > > conclusion that my little optimization suggestion (don't try to set > > the attribute or item if the inplace op returns its first argument) > > is actually more semantically correct. > [...] > > Thoughts? > > One problem is that it's really hard to design the bytecode so that > this can be implemented. The problem is that the compiler sees this: > > a[i] += x > > and must compile bytecode that works for all cases: a can be mutable > or immutable, and += could return the same or a different object as > a[i]. It currently generates code that uses a STORE_SUBSCR opcode > (store into a[i]) with the saved value of the object and index used > for the BINARY_SUBSCR (load from a[i]) opcode. It would have to > generate additional code to (a) save the object retrieved from a[i] Isn't that already lying about on the stack somewhere? Didn't you have to have it in order to invoke "+= x" on it? (I'm totally ignorant of Python's bytecode, I'll be the first to admit) > (b) compare the result to it using the 'is' operator, and (c) pop some > stuff off the stack and skip over the assignment if true. That could > be done, but the extra test would definitely slow things down. As was suggested by someone else in the thread I referenced, I was thinking that a new bytecode would be used to handle this. It has to be faster to do one test in 'C' code than it is to re-index into a map or even to do the refcount-twiddling that goes with an unneeded store into a list. > A worse problem is that it's a semantic change. 
For example, > persistent objects in Zope require (under certain circumstances) that > if you modify an object that lives in a persistent container, you have > to store it back into the container in order for the persistency > mechanism to pick up on the change. Currently we can rely on a[i]+=x > and a.foo+=x to do the assignment. Under your proposal, we couldn't > (unless we knew that the item was of an immutable type). That's right. I would have suggested that for persistent containers, the object returned carries its own write-back knowledge. > That is such > a subtle change in semantics that I don't want to risk it without > years of transitional warnings. Hah, code breakage. The purity of the language must not be compromised, at any cost! Well, ok, if someone's actually using this extra step I guess you can't change it on a whim... > Personally, I'd rather accept that if you have a = ([], [], []), > a[1]+=[2] won't work. You can always write a[1].extend([2]). It's your choice, of course. However, it seems a little strange to have this fundamental operation which is optimized for persistent containers but doesn't work right -- and (I assert without evidence) must be slower than necessary -- in some simple cases. The pathological/non-generic cases are the ones that make me think twice about using the inplace ops at all. They don't, in fact, "just work", so I have to think carefully about what's happening to avoid getting myself in trouble. -Dave From David Abrahams" Suppose I want to execute, from "C", the same steps taken by Python in evaluating the expression x <= y I see no documented "C" API function which can do that. I'm guessing PyObject_RichCompare[Bool] may do what I want, but since it's undocumented I assume it's untouchable. Guido? 
-Dave +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From tim.one@comcast.net Wed Jun 12 16:02:50 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 12 Jun 2002 11:02:50 -0400 Subject: [Python-Dev] Rich Comparison from "C" API? In-Reply-To: <020c01c2121a$d69c91b0$6501a8c0@boostconsulting.com> Message-ID: [David Abrahams] > Suppose I want to execute, from "C", the same steps taken by Python in > evaluating the expression > > x <= y > > I see no documented "C" API function which can do that. I'm guessing > PyObject_RichCompare[Bool] may do what I want, but since it's > undocumented I assume its untouchable. Guido? It doesn't start with an underscore, and is advertised in object.h, so that it's undocumented just means you didn't yet volunteer a doc patch . /* Return -1 if error; 1 if v op w; 0 if not (v op w). */ int PyObject_RichCompareBool(PyObject *v, PyObject *w, int op) where op is one of (also in object.h) /* Rich comparison opcodes */ #define Py_LT 0 #define Py_LE 1 #define Py_EQ 2 #define Py_NE 3 #define Py_GT 4 #define Py_GE 5 So the answer to your question is int result = PyObject_RichCompareBool(x, y, Py_LE); if (result < 0) return error_indicator; /* now result is true/false */ From guido@python.org Wed Jun 12 16:13:26 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 12 Jun 2002 11:13:26 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: Your message of "Wed, 12 Jun 2002 07:54:29 EDT." 
<018d01c21208$1efadab0$6501a8c0@boostconsulting.com> References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net> <018d01c21208$1efadab0$6501a8c0@boostconsulting.com> Message-ID: <200206121513.g5CFDQa25012@odiug.zope.com> > > One problem is that it's really hard to design the bytecode so that > > this can be implemented. The problem is that the compiler sees this: > > > > a[i] += x > > > > and must compile bytecode that works for all cases: a can be mutable > > or immutable, and += could return the same or a different object as > > a[i]. It currently generates code that uses a STORE_SUBSCR opcode > > (store into a[i]) with the saved value of the object and index used > > for the BINARY_SUBSCR (load from a[i]) opcode. It would have to > > generate additional code to (a) save the object retrieved from a[i] > > Isn't that already lying about on the stack somewhere? Didn't you have to > have it in order to invoke "+= x" on it? (I'm totally ignorant of Python's > bytecode, I'll be the first to admit) Getting that object is the easy part. > > (b) compare the result to it using the 'is' operator, and (c) pop some > > stuff of the stack and skip over the assignment if true. That could > > be done, but the extra test would definitely slow things down. > > As was suggested by someone else in the thread I referenced, I was thinking > that a new bytecode would be used to handle this. It has to be faster to do > one test in 'C' code than it is to re-indexing into a map or even to do the > refcount-twiddling that goes with an unneeded store into a list. > > > A worse problem is that it's a semantic change. For example, > > persistent objects in Zope require (under certain circumstances) that > > if you modify an object that lives in a persistent container, you have > > to store it back into the container in order for the persistency > > mechanism to pick up on the change. 
Currently we can rely on a[i]+=x > > and a.foo+=x to do the assigment. Under your proposal, we couldn't > > (unless we knew that the item was of an immutable type). > > That's right. I would have suggested that for persistent containers, the > object returned carries its own write-back knowledge. But that's not how it works. Giving each container a persistent object ID is not an option. > > That is such > > a subtle change in semantics that I don't want to risk it without > > years of transitional warnings. > > Hah, code breakage. The purity of the language must not be compromised, at > any cost! Well, ok, if someone's actually using this extra step I guess you > can't change it on a whim... > > > Personally, I'd rather accept that if you have a = ([], [], []), > > a[1]+=[2] won't work. You can always write a[1].extend([2]). > > It's your choice, of course. However, it seems a little strange to have > this fundamental operation which is optimized for persistent containers but > doesn't work right -- and (I assert without evidence) must be slower than > neccessary -- in some simple cases. The pathological/non-generic cases are > the ones that make me think twice about using the inplace ops at all. They > don't, in fact, "just work", so I have to think carefully about what's > happening to avoid getting myself in trouble. You have a habit of thinking too much instead of using common sense. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" Message-ID: <026301c21224$344847b0$6501a8c0@boostconsulting.com> ----- Original Message ----- From: "Tim Peters" To: "David Abrahams" Cc: Sent: Wednesday, June 12, 2002 11:02 AM Subject: RE: [Python-Dev] Rich Comparison from "C" API? > [David Abrahams] > > Suppose I want to execute, from "C", the same steps taken by Python in > > evaluating the expression > > > > x <= y > > > > I see no documented "C" API function which can do that. 
I'm guessing > > PyObject_RichCompare[Bool] may do what I want, but since it's > > undocumented I assume it's untouchable. Guido? > > It doesn't start with an underscore, and is advertised in object.h, so that > it's undocumented just means you didn't yet volunteer a doc patch . ...and I didn't volunteer a doc patch yet because the source is too complicated to easily determine if it's exactly what I'm looking for. Umm, OK, I guess I was looking in the wrong place when poring over the implementation of PyObject_RichCompare: ceval.c calls PyObject_RichCompare directly. OK, doc patch coming up. -Dave From tim.one@comcast.net Wed Jun 12 16:28:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 12 Jun 2002 11:28:11 -0400 Subject: [Python-Dev] Rich Comparison from "C" API? In-Reply-To: <026301c21224$344847b0$6501a8c0@boostconsulting.com> Message-ID: [Tim, claims that PyObject_RichCompareBool isn't documented because David hasn't yet submitted a doc patch] [David Abrahams] > ...and I didn't volunteer a doc patch yet because the source is too > complicated to easily determine if it's exactly what I'm looking for. > Umm, OK, I guess I was looking in the wrong place when poring over the > implementation of PyObject_RichCompare: ceval.c calls > PyObject_RichCompare directly. OK, doc patch coming up. 1. You really want PyObject_RichCompareBool in your example, not PyObject_RichCompare. The former is much more efficient in some cases (e.g., a total ordering on dicts is much hairier to determine than just equality). 2. This is why I keep pulling your leg. Sometimes you fall for it . Thanks for the patch! From David Abrahams" <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net> <018d01c21208$1efadab0$6501a8c0@boostconsulting.com> <200206121513.g5CFDQa25012@odiug.zope.com> Message-ID: <028701c21227$029a8220$6501a8c0@boostconsulting.com> From: "Guido van Rossum" > > That's right. 
I would have suggested that for persistent containers, the > > object returned carries its own write-back knowledge. > > But that's not how it works. Giving each container a persistent > object ID is not an option. I'm sure this is moot, but I don't think I was suggesting that. I was suggesting that a persistent container's __getitem__() returns a proxy object which contains a reference back to the container. You can either write-back upon modifying the object, or, I suppose, upon __del__(). My scheme may not work (I don't really understand the Zope requirements or implementation), but it seems that the existing one is just as vulnerable in the case of a container of mutable objects: x = container_of_lists[2] x += 3 # no write-back -Dave From David Abrahams" Message-ID: <028801c21227$02c3dc10$6501a8c0@boostconsulting.com> From: "Tim Peters" > 2. This is why I keep pulling your leg. Sometimes you fall for it . Heh, you may have been kidding but Guido, privately, gave me the old Obi-wan line. > Thanks for the patch! Workin' on it... From David Abrahams" Message-ID: <02a201c21229$1d6df620$6501a8c0@boostconsulting.com> From: "Tim Peters" > 1. You really want PyObject_RichCompareBool in your example, not > PyObject_RichCompare. The former is much more efficient in some > cases (e.g., a total ordering on dicts is much hairier to determine > than just equality). Now you're really pulling my leg. PyObject_RichCompareBool is just: /* Return -1 if error; 1 if v op w; 0 if not (v op w). */ int PyObject_RichCompareBool(PyObject *v, PyObject *w, int op) { PyObject *res = PyObject_RichCompare(v, w, op); int ok; if (res == NULL) return -1; ok = PyObject_IsTrue(res); Py_DECREF(res); return ok; } How can that be more efficient than PyObject_RichCompare? From nas@python.ca Wed Jun 12 16:58:06 2002 From: nas@python.ca (Neil Schemenauer) Date: Wed, 12 Jun 2002 08:58:06 -0700 Subject: [Python-Dev] Rich Comparison from "C" API? 
In-Reply-To: <026301c21224$344847b0$6501a8c0@boostconsulting.com>; from david.abrahams@rcn.com on Wed, Jun 12, 2002 at 11:13:25AM -0400 References: <026301c21224$344847b0$6501a8c0@boostconsulting.com> Message-ID: <20020612085806.A23915@glacier.arctrix.com> David Abrahams wrote: > ...and I didn't volunteer a doc patch yet because the source is too > complicated to easily determine if it's exactly what I'm looking for. Dragons be there. Comparison operations are, I think, the most complicated part of the interpreter. Be brave and good luck to you. :-) Neil From guido@python.org Wed Jun 12 16:51:53 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 12 Jun 2002 11:51:53 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: Your message of "Wed, 12 Jun 2002 11:35:02 EDT." <028701c21227$029a8220$6501a8c0@boostconsulting.com> References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net> <018d01c21208$1efadab0$6501a8c0@boostconsulting.com> <200206121513.g5CFDQa25012@odiug.zope.com> <028701c21227$029a8220$6501a8c0@boostconsulting.com> Message-ID: <200206121551.g5CFprp25331@odiug.zope.com> > I'm sure this is moot, but I don't think I was suggesting that. I was > suggesting that a persistent container's __getitem__() returns a proxy > object which contains a reference back to the container. That's not sufficiently transparent for some purposes. > You can either > write-back upon modifying the object, or, I suppose, upon __del__(). My > scheme may not work (I don't really understand the Zope requirements or > implementation), but it seems that the existing one is just as vulnerable > in the case of a container of mutable objects: > > x = container_of_lists[2] > x += 3 # no write-back This is a known limitation. 
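[Editor's note: a toy sketch of the proxy idea David floats above — `__getitem__` returns a wrapper that writes itself back into the container after an in-place op. All names here are hypothetical; this is not how Zope's persistence works, and as Guido notes such a proxy is not transparent enough for every use.]

```python
class WriteBackProxy:
    def __init__(self, container, index, value):
        self._container = container
        self._index = index
        self._value = value

    def __iadd__(self, other):
        self._value += other                 # may rebind (ints) or mutate (lists)
        # explicit write-back into the owning container
        self._container.data[self._index] = self._value
        return self

class TrackingContainer:
    def __init__(self, items):
        self.data = list(items)

    def __getitem__(self, i):
        return WriteBackProxy(self, i, self.data[i])

c = TrackingContainer([1, 2, 3])
p = c[1]
p += 10            # the proxy, not the user, performs the store
assert c.data[1] == 12
```

The catch Guido points out remains: code that expects `c[1]` to be the element itself (an int, a list) now gets a proxy, which breaks type checks and identity comparisons — the transparency problem.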
--Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" Message-ID: <02cd01c2122b$9fd7aa00$6501a8c0@boostconsulting.com> From: "Tim Peters" > [David Abrahams] > > Now you're really pulling my leg. PyObject_RichCompareBool is just: > > ... > > How can that be more efficient that PyObject_RichCompare? > > I wasn't pulling your leg there, I was simply wrong. Who can blame me? You > never documented this stuff . In my patch I left out any mention of efficiency, just in case you were right -D From tim.one@comcast.net Wed Jun 12 17:01:10 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 12 Jun 2002 12:01:10 -0400 Subject: [Python-Dev] Rich Comparison from "C" API? In-Reply-To: <02a201c21229$1d6df620$6501a8c0@boostconsulting.com> Message-ID: [Tim] > 1. You really want PyObject_RichCompareBool in your example, not > PyObject_RichCompare. The former is much more efficient in some > cases (e.g., a total ordering on dicts is much hairer to determine > than just equality). [David Abrahams] > Now you're really pulling my leg. PyObject_RichCompareBool is just: > ... > How can that be more efficient that PyObject_RichCompare? I wasn't pulling your leg there, I was simply wrong. Who can blame me? You never documented this stuff . From fredrik@pythonware.com Wed Jun 12 17:54:56 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 12 Jun 2002 18:54:56 +0200 Subject: [Python-Dev] urllib.py and 303 redirect References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net> <20020611232916.GA3126@panix.com> <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net> <20020612001954.GA10924@panix.com> Message-ID: <002d01c21231$e4e67080$ced241d5@hagrid> Aahz wrote: > > Could I shame you into submitting a patch? 
:-) > > Not until I figure out how I want to deal with SF not working with Lynx; > haven't been able to get anyone at SF interested in talking about fixing > the problem if you come up with a patch, I'm sure someone can help you post it to SF. (didn't someone report that SF worked perfectly fine with "links", btw?) From skip@pobox.com Wed Jun 12 20:43:18 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 12 Jun 2002 14:43:18 -0500 Subject: [Python-Dev] code coverage updated Message-ID: <15623.42070.960948.412843@12-248-41-177.client.attbi.com> I updated the C and Python code coverage information at http://manatee.mojam.com/~skip/python/Python/dist/src/ Something changed about how gcov works, so all the .gcov files got dumped into the top-level directory. Accordingly, I needed to make a couple changes to get the tables to display again, and all the .c file info is jumbled together. I don't think there are actually any name conflicts in the Python C source, so the information there should be okay. If I get a few more minutes I will see if I can fix the problem. Skip From guido@python.org Wed Jun 12 21:59:26 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 12 Jun 2002 16:59:26 -0400 Subject: [Python-Dev] test_socket failures Message-ID: <200206122059.g5CKxQa16372@odiug.zope.com> I know that there are problems with the two new socket tests: test_timeout and test_socket. The problems are varied: the tests assume network access and a working and consistent DNS, they assume predictable timing, and there are a number of Windows-specific failures that I'm trying to track down. Also, when the full test suite is run, test_socket.py may hang, while in isolation it will work. (Gosh if only we had had these unit tests a few years ago. They bring up all sorts of issues that are good to know about.) I'll try to fix these ASAP. 
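[Editor's note: the network-access assumption Guido lists is often handled by probing connectivity up front and skipping rather than failing. A hypothetical guard helper, not what the stdlib test suite actually uses:]

```python
import socket

def network_available(host, port, timeout=5.0):
    # Probe a single TCP connection with a timeout, so tests that need
    # a live network (or working DNS) can bail out cleanly when it is
    # absent instead of hanging or reporting spurious failures.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        try:
            s.connect((host, port))
        except (socket.error, socket.timeout):
            return False
        return True
    finally:
        s.close()
```

A test module would then call something like `if not network_available("www.python.org", 80): skip the network tests` before exercising any remote host.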
--Guido van Rossum (home page: http://www.python.org/~guido/) From mgilfix@eecs.tufts.edu Thu Jun 13 00:13:55 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Wed, 12 Jun 2002 19:13:55 -0400 Subject: [Python-Dev] test_socket failures In-Reply-To: <200206122059.g5CKxQa16372@odiug.zope.com>; from guido@python.org on Wed, Jun 12, 2002 at 04:59:26PM -0400 References: <200206122059.g5CKxQa16372@odiug.zope.com> Message-ID: <20020612191355.C10542@eecs.tufts.edu> On Wed, Jun 12 @ 16:59, Guido van Rossum wrote: > I know that there are problems with the two new socket tests: > test_timeout and test_socket. The problems are varied: the tests > assume network access and a working and consistent DNS, they assume > predictable timing, and there are a number of Windows-specific failures > that I'm trying to track down. Also, when the full test suite is run, > test_socket.py may hang, while in isolation it will work. (Gosh if > only we had had these unit tests a few years ago. They bring up all > sorts of issues that are good to know about.) > > I'll try to fix these ASAP. Yeah, I hadn't gotten around to checking them within the full test suite. The version I had sent you was just for commentary . I try to do as much synchronization as possible. I'm sure fiddling with the synchronization points in the ThreadableTest class in test_socket.py will do the trick. BTW, I've concluded that the unittest module rocks. Just a show of support here. 
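[Editor's note: the "synchronization points" Michael mentions boil down to a rendezvous between the server and client threads, so the test never relies on timing. A minimal sketch with `threading.Event` — not the actual ThreadableTest code:]

```python
import socket
import threading

def serve_once(ready, result):
    # Server thread: bind/listen first, then signal readiness, so the
    # client can never race the bind. Sleeps are never needed.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    result["addr"] = srv.getsockname()
    ready.set()                      # rendezvous: client may connect now
    conn, _ = srv.accept()
    chunks = []
    while True:
        data = conn.recv(16)
        if not data:
            break
        chunks.append(data)
    result["data"] = b"".join(chunks)
    conn.close()
    srv.close()

ready = threading.Event()
result = {}
t = threading.Thread(target=serve_once, args=(ready, result))
t.start()
ready.wait()                         # wait on the event, not on a sleep()
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(result["addr"])
cli.sendall(b"ping")
cli.close()
t.join()
```

Because the client blocks on `ready.wait()`, the test behaves identically whether run in isolation or inside a loaded full-suite run.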
-- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From mwh@python.net Thu Jun 13 16:00:20 2002 From: mwh@python.net (Michael Hudson) Date: 13 Jun 2002 16:00:20 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src setup.py,1.89,1.90 In-Reply-To: gvanrossum@users.sourceforge.net's message of "Thu, 13 Jun 2002 07:41:38 -0700" References: Message-ID: <2m7kl3kyuz.fsf@starship.python.net> gvanrossum@users.sourceforge.net writes: > Index: setup.py > =================================================================== > RCS file: /cvsroot/python/python/dist/src/setup.py,v > retrieving revision 1.89 > retrieving revision 1.90 > diff -C2 -d -r1.89 -r1.90 > *** setup.py 11 Jun 2002 06:22:30 -0000 1.89 > --- setup.py 13 Jun 2002 14:41:32 -0000 1.90 > *************** > *** 272,275 **** > --- 272,277 ---- > exts.append( Extension('xreadlines', ['xreadlinesmodule.c']) ) > > + exts.append( Extension("bits", ["bits.c"]) ) > + I'm guessing that wasn't meant to get checked in? Cheers, M. -- /* I'd just like to take this moment to point out that C has all the expressive power of two dixie cups and a string. */ -- Jamie Zawinski from the xkeycaps source From gward@python.net Thu Jun 13 16:33:02 2002 From: gward@python.net (Greg Ward) Date: Thu, 13 Jun 2002 11:33:02 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> Message-ID: <20020613153302.GA1918@gerg.ca> On 11 June 2002, Martin v. Loewis said: > It actually only requires different init functions. To support that > with distutils, you need to tell distutils to generate different > object files from the same source file, which is probably not > supported out of the box. In theory: setup(... 
ext_modules=[Extension("foo1", ["foo.c"], define_macros=[('FOOSTYLE', 1)]), Extension("foo2", ["foo.c"], define_macros=[('FOOSTYLE', 2)])]) should work. See http://www.python.org/doc/current/dist/setup-script.html#SECTION000330000000000000000 Untested, YMMV, etc. Greg -- Greg Ward - just another Python hacker gward@python.net http://starship.python.net/~gward/ I have the power to HALT PRODUCTION on all TEENAGE SEX COMEDIES!! From thomas.heller@ion-tof.com Thu Jun 13 17:03:00 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 13 Jun 2002 18:03:00 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <20020613153302.GA1918@gerg.ca> Message-ID: <09da01c212f3$cacd72d0$e000a8c0@thomasnotebook> From: "Greg Ward" > On 11 June 2002, Martin v. Loewis said: > > It actually only requires different init functions. To support that > > with distutils, you need to tell distutils to generate different > > object files from the same source file, which is probably not > > supported out of the box. > > In theory: > > setup(... > ext_modules=[Extension("foo1", ["foo.c"], > define_macros=[('FOOSTYLE', 1)]), > Extension("foo2", ["foo.c"], > define_macros=[('FOOSTYLE', 2)])]) > > should work. But not in practice, IIRC. Because the build process for foo2 will see that foo.o is up to date already. Thomas From jeremy@zope.com Thu Jun 13 18:38:01 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Thu, 13 Jun 2002 13:38:01 -0400 Subject: [Python-Dev] change to compiler implementations In-Reply-To: References: Message-ID: <15624.55417.763429.27404@slothrop.zope.com> I just checked in two sets of changes to distutils. I refactored the implementation of compile() methods and I added some simple dependency tracking. 
I've only been able to test the changes on Unix, and expect Guido will soon test it with MSVC. I'd appreciate it if people with other affected compilers (Borland, Cygwin, EMX) could test it. Jeremy From skip@pobox.com Thu Jun 13 18:49:04 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 12:49:04 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies Message-ID: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> I wonder if it would be better to have distutils generate the appropriate type of makefile and execute that instead of directly building objects and shared libraries. This would finesse some of the dependency tracking problems that pop up frequently. Skip From jeremy@zope.com Thu Jun 13 18:51:30 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Thu, 13 Jun 2002 13:51:30 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> Message-ID: <15624.56226.581221.448980@slothrop.zope.com> >>>>> "SM" == Skip Montanaro writes: SM> I wonder if it would be better to have distutils generate the SM> appropriate type of makefile and execute that instead of SM> directly building objects and shared libraries. This would SM> finesse some of the dependency tracking problems that pop up SM> frequently. That sounds really complicated. Jeremy From guido@python.org Thu Jun 13 18:51:54 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 13 Jun 2002 13:51:54 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Thu, 13 Jun 2002 12:49:04 CDT." 
<15624.56080.793620.970381@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> Message-ID: <200206131751.g5DHps801904@odiug.zope.com> [Skip] > I wonder if it would be better to have distutils generate the > appropriate type of makefile and execute that instead of directly > building objects and shared libraries. This would finesse some of > the dependency tracking problems that pop up frequently. But that doesn't work for platforms that don't have a Make. And while Windows has one, its file format is completely different, so you'd have to teach distutils how to write each platform's Makefile format. -1 --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 13 18:55:13 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 13 Jun 2002 13:55:13 -0400 Subject: [Python-Dev] change to compiler implementations In-Reply-To: Your message of "Thu, 13 Jun 2002 13:38:01 EDT." <15624.55417.763429.27404@slothrop.zope.com> References: <15624.55417.763429.27404@slothrop.zope.com> Message-ID: <200206131755.g5DHtED02030@odiug.zope.com> > I just checked in two sets of changes to distutils. I refactored in > the implementation of compile() methods and I added some simple > dependency tracking. I've only been able to test the changes on Unix, > and expect Guido will soon test it with MSVC. I'd appreciate it if > people with other affected compilers (Borland, Cygwin, EMX) could test > it. Um, I don't use setup.py with MSVC on Windows. The MSVC project, for better or for worse, has entries to build all the extensions we need, and I don't have any 3rd party extensions I'd like to build. 
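[Editor's note: one common way around the stale-object problem Thomas Heller describes earlier in the thread — two Extensions sharing foo.c colliding on a single foo.o — is to generate a thin per-extension wrapper source that `#include`s the shared file, so each Extension compiles a distinctly named object file. `make_wrapper` below is purely illustrative, not part of distutils:]

```python
import os

TEMPLATE = '#define FOOSTYLE %d\n#include "foo.c"\n'

def make_wrapper(style, directory="."):
    # Write a tiny C wrapper whose filename differs per style, so
    # distutils' up-to-date check sees two independent source files.
    path = os.path.join(directory, "foo_style%d.c" % style)
    f = open(path, "w")
    f.write(TEMPLATE % style)
    f.close()
    return path
```

A setup script would then pass `Extension("foo1", [make_wrapper(1)])` and `Extension("foo2", [make_wrapper(2)])` instead of listing `foo.c` twice.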
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Thu Jun 13 19:00:55 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 13 Jun 2002 11:00:55 -0700 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> Message-ID: <3D08DDD7.BD8573D8@prescod.net> Skip Montanaro wrote: > > I wonder if it would be better to have distutils generate the appropriate > type of makefile and execute that instead of directly building objects and > shared libraries. This would finesse some of the dependency tracking > problems that pop up frequently. That's what Perl does ("MakeMaker") but I think it just adds another level of complexity, especially with different "makes" out there doing different things. Paul Prescod From thomas.heller@ion-tof.com Thu Jun 13 19:04:43 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 13 Jun 2002 20:04:43 +0200 Subject: [Python-Dev] Re: [Distutils] change to compiler implementations References: <15624.55417.763429.27404@slothrop.zope.com> Message-ID: <0ae001c21304$cb9d0f20$e000a8c0@thomasnotebook> From: "Jeremy Hylton" > I just checked in two sets of changes to distutils. I refactored in > the implementation of compile() methods and I added some simple > dependency tracking. I've only been able to test the changes on Unix, > and expect Guido will soon test it with MSVC. I'd appreciate it if > people with other affected compilers (Borland, Cygwin, EMX) could test > it. > I've tested some of my extensions with MSVC, and it works fine. Thomas From niemeyer@conectiva.com Thu Jun 13 19:55:13 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 13 Jun 2002 15:55:13 -0300 Subject: [Python-Dev] doc strings patch Message-ID: <20020613155513.A15681@ibook.distro.conectiva> Could someone please check out patch number 568124? It's a simple patch. OTOH, it's huge, and may break shortly if not applied. Thank you! 
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From paul@pfdubois.com Thu Jun 13 19:59:58 2002 From: paul@pfdubois.com (Paul F Dubois) Date: Thu, 13 Jun 2002 11:59:58 -0700 Subject: [Distutils] Re: [Python-Dev] change to compiler implementations In-Reply-To: <200206131755.g5DHtED02030@odiug.zope.com> Message-ID: <001101c2130c$8dee5fa0$0c01a8c0@NICKLEBY> Numeric is a suitable stress test on Windows that uses a setup.py. > -----Original Message----- > From: distutils-sig-admin@python.org > [mailto:distutils-sig-admin@python.org] On Behalf Of Guido van Rossum > Sent: Thursday, June 13, 2002 10:55 AM > To: jeremy@zope.com > Cc: python-dev@python.org; distutils-sig@python.org > Subject: [Distutils] Re: [Python-Dev] change to compiler > implementations > > > > I just checked in two sets of changes to distutils. I > refactored in > > the implementation of compile() methods and I added some simple > > dependency tracking. I've only been able to test the > changes on Unix, > > and expect Guido will soon test it with MSVC. I'd appreciate it if > > people with other affected compilers (Borland, Cygwin, EMX) > could test > > it. > > Um, I don't use setup.py with MSVC on Windows. The MSVC > project, for better or for worse, has entries to build all > the extensions we need, and I don't have any 3rd party > extensions I'd like to build. > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG@python.org > http://mail.python.org/mailman/listinfo/distutils-sig > From guido@python.org Thu Jun 13 20:04:48 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 13 Jun 2002 15:04:48 -0400 Subject: [Python-Dev] doc strings patch In-Reply-To: Your message of "Thu, 13 Jun 2002 15:55:13 -0300." 
<20020613155513.A15681@ibook.distro.conectiva> References: <20020613155513.A15681@ibook.distro.conectiva> Message-ID: <200206131904.g5DJ4mV04980@odiug.zope.com> > Could someone please check out patch number 568124? It's a simple > patch. OTOH, it's huge, and may break shortly if not applied. Looks good to me, except for the socket module (where I changed some docstrings). We need a volunteer to check it in! --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Thu Jun 13 20:36:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Jun 2002 21:36:20 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > I wonder if it would be better to have distutils generate the appropriate > type of makefile and execute that instead of directly building objects and > shared libraries. This would finesse some of the dependency tracking > problems that pop up frequently. It was one of the design principles of distutils to not rely on any other tools but Python and the C compiler. Regards, Martin From skip@pobox.com Thu Jun 13 20:43:12 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 14:43:12 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> Message-ID: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> >> I wonder if it would be better to have distutils generate the >> appropriate type of makefile and execute that instead... Martin> It was one of the design principles of distutils to not rely on Martin> any other tools but Python and the C compiler. Perhaps it's a design principle that needs to be rethought. 
If you can assume the presence of a C compiler I think you can generally assume the presence of a make tool of some sort. Skip From paul-python@svensson.org Thu Jun 13 20:55:51 2002 From: paul-python@svensson.org (Paul Svensson) Date: Thu, 13 Jun 2002 15:55:51 -0400 (EDT) Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> Message-ID: On Thu, 13 Jun 2002, Skip Montanaro wrote: > >> I wonder if it would be better to have distutils generate the > >> appropriate type of makefile and execute that instead... > > Martin> It was one of the design principles of distutils to not rely on > Martin> any other tools but Python and the C compiler. > >Perhaps it's a design principle that needs to be rethought. If you can >assume the presence of a C compiler I think you can generally assume the >presence of a make tool of some sort. ^^^^^^^^^^^^ That's the rub. The MAKE.EXE mostly found on WinDOS boxen doesn't have much more than the name in common with the well known Unix tool. /Paul From skip@pobox.com Thu Jun 13 21:06:30 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 15:06:30 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206131751.g5DHps801904@odiug.zope.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206131751.g5DHps801904@odiug.zope.com> Message-ID: <15624.64326.946056.280597@12-248-41-177.client.attbi.com> >> I wonder if it would be better to have distutils generate the >> appropriate type of makefile and execute that instead... Guido> But that doesn't work for platforms that don't have a Make. And Guido> while Windows has one, its file format is completely different, Guido> so you'd have to teach distutils how to write each platform's Guido> Makefile format. I don't see that writing different makefile formats is any harder than writing different shell commands. 
On those systems where you don't have a make-like tool, either distutils already writes compile and link commands or it doesn't work at all. On those systems where you do have a make-like facility, I see no reason to not use it. You will get more reliable dependency checking for one thing. Skip From skip@pobox.com Thu Jun 13 21:10:04 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 15:10:04 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <3D08DDD7.BD8573D8@prescod.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> Message-ID: <15624.64540.472905.469106@12-248-41-177.client.attbi.com> Paul> That's what Perl does ("MakeMaker") but I think it just adds Paul> another level of complexity, especially with different "makes" out Paul> there doing different things. If the extra complexity came with no added benefits I'd agree with you. However, most makes actually do support a fairly basic common syntax. Who cares about %-rules and suffix rules? Those are only there to make it easier for humans to maintain Makefiles. Just generate a brute-force low-level makefile. Distutils will then do the right thing in the face of file edits. Skip From gball@cfa.harvard.edu Thu Jun 13 21:26:24 2002 From: gball@cfa.harvard.edu (Greg Ball) Date: Thu, 13 Jun 2002 16:26:24 -0400 (EDT) Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.64540.472905.469106@12-248-41-177.client.attbi.com> Message-ID: On Thu, 13 Jun 2002, Skip Montanaro wrote: > If the extra complexity came with no added benefits I'd agree with you. > However, most makes actually do support a fairly basic common syntax. Who > cares about %-rules and suffix rules? Those are only there to make it > easier for humans to maintain Makefiles. Just generate a brute-force > low-level makefile. Distutils will then do the right thing in the face of > file edits. 
If you're not using sophisticated rules, the job make does is probably no more complicated than the job of generating a makefile. You just construct a dependency graph, then walk over it stat-ing the files, and running rules where needed. It's a SMOP ;-) -- Greg Ball From pyth@devel.trillke.net Thu Jun 13 21:40:04 2002 From: pyth@devel.trillke.net (holger krekel) Date: Thu, 13 Jun 2002 22:40:04 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.62928.845160.407762@12-248-41-177.client.attbi.com>; from skip@pobox.com on Thu, Jun 13, 2002 at 02:43:12PM -0500 References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> Message-ID: <20020613224004.F6609@prim.han.de> Skip Montanaro wrote: > > >> I wonder if it would be better to have distutils generate the > >> appropriate type of makefile and execute that instead... > > Martin> It was one of the design principles of distutils to not rely on > Martin> any other tools but Python and the C compiler. > > Perhaps it's a design principle that needs to be rethought. If you can > assume the presence of a C compiler I think you can generally assume the > presence of a make tool of some sort. isn't it funny that 'scons' as a *build system* doesn't rely on anything but python? I've heard rumors they even check dependencies... holger From martin@v.loewis.de Thu Jun 13 22:00:15 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Jun 2002 23:00:15 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > Perhaps it's a design principle that needs to be rethought. 
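[The dependency-graph walk described above can be sketched in a few lines of Python. This is a toy illustration, not distutils code; the `build` driver and its rule callables are names made up for the sketch.]

```python
import os

def needs_rebuild(target, sources):
    """A target is stale if it is missing or any source is newer."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)

def build(graph, rules):
    """Walk a dependency graph, running a rule wherever a target is stale.

    graph maps target -> list of source files; rules maps target -> a
    callable that rebuilds it.  (Both mappings are hypothetical.)
    """
    for target, sources in graph.items():
        if needs_rebuild(target, sources):
            rules[target](target, sources)
```

[A real tool would also topologically sort the graph so intermediate targets are rebuilt before the things that depend on them; the stat-and-compare part really is the easy bit.]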
If you can > assume the presence of a C compiler I think you can generally assume the > presence of a make tool of some sort. Maybe - although it removes reliability from the build process if you need to rely on locating another tool. For example, on Solaris, you could run into either the vendor's make, or GNU make. Also, it appears that nothing is gained by using make. Regards, Martin From martin@v.loewis.de Thu Jun 13 22:03:36 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Jun 2002 23:03:36 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.64326.946056.280597@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206131751.g5DHps801904@odiug.zope.com> <15624.64326.946056.280597@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > You will get more reliable dependency checking for one thing. I doubt that. To get that checking, you need to tell make what the dependencies are - it won't automatically assume anything except that object files depend on their input sources. In particular, you will *not* get dependencies to header files - in my experience, those are the biggest source of build problems. If you add a scanning procedure for finding header files, you can integrate this into distutils' dependency mechanisms just as well as you can generate five different makefile formats from those data. 
Regards, Martin From skip@pobox.com Thu Jun 13 22:05:53 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 16:05:53 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <20020613224004.F6609@prim.han.de> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <20020613224004.F6609@prim.han.de> Message-ID: <15625.2353.853651.314565@12-248-41-177.client.attbi.com> holger> isn't it funny that 'scons' as a *build system* doesn't rely on holger> anything but python? I've heard rumors they even check holger> dependencies... Scons would be fine by me. It doesn't rely on a C compiler, but if you want to build something that needs to be compiled I suspect you'd need one. Skip From skip@pobox.com Thu Jun 13 22:08:05 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 16:08:05 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> Message-ID: <15625.2485.994408.888814@12-248-41-177.client.attbi.com> Martin> Also, it appears that nothing is gained by using make. That's incorrect. Distutils is not a make substitute and I doubt it ever will be. What dependency checking it does do is incomplete and this gives rise to problems that are reported fairly frequently. 
Skip From pyth@devel.trillke.net Thu Jun 13 22:11:36 2002 From: pyth@devel.trillke.net (holger krekel) Date: Thu, 13 Jun 2002 23:11:36 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.2353.853651.314565@12-248-41-177.client.attbi.com>; from skip@pobox.com on Thu, Jun 13, 2002 at 04:05:53PM -0500 References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <20020613224004.F6609@prim.han.de> <15625.2353.853651.314565@12-248-41-177.client.attbi.com> Message-ID: <20020613231136.H6609@prim.han.de> Skip Montanaro wrote: > > holger> isn't it funny that 'scons' as a *build system* doesn't rely on > holger> anything but python? I've heard rumors they even check > holger> dependencies... > > Scons would be fine by me. It doesn't rely on a C compiler, but if you want > to build something that needs to be compiled I suspect you'd need one. I didn't mean to include or require scons for distutils. I was just making the obvious point that for the dependency task at hand python should be powerful enough. Reusing some code from scons might be worthwhile, though. why-use-a-car-when-you-can-beam-ly y'rs, holger From martin@v.loewis.de Thu Jun 13 23:13:50 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jun 2002 00:13:50 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.2485.994408.888814@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > That's incorrect. Distutils is not a make substitute and I doubt it ever > will be. What dependency checking it does do is incomplete and this gives > rise to problems that are reported fairly frequently. 
Can you provide a specific example to support this criticism? Could you also explain how generating makefiles would help? Regards, Martin From paul@prescod.net Fri Jun 14 01:09:28 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 13 Jun 2002 17:09:28 -0700 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> Message-ID: <3D093438.92B46349@prescod.net> Skip Montanaro wrote: > >... > > If the extra complexity came with no added benefits I'd agree with you. I guess most of us don't understand the benefits because we don't see dependency tracking as necessarily that difficult. It's no harder than the new method resolution order. ;) Jeremy says he has already started implementing dependency tracking. Would switching strategies to using make actually get us anywhere faster or easier? > However, most makes actually do support a fairly basic common syntax. Who > cares about %-rules and suffix rules? Those are only there to make it > easier for humans to maintain Makefiles Just generate a brute-force > low-level makefile. Distutils will then do the right thing in the face of > file edits. Okay, so let's say that we want distutils to handle ".i" files for SWIG (it does) and .pyrx files for PyREX (it should), then we have to generate rules for those too. Paul Prescod From goodger@users.sourceforge.net Fri Jun 14 01:37:04 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 13 Jun 2002 20:37:04 -0400 Subject: [Python-Dev] Design Patterns quick reference (was Re: textwrap.py) Message-ID: Greg Ward wrote: > design patterns are great, as long as everyone has a copy of *Design > Patterns* on their desk. ;-) For those of us who don't, here's a free and nearly-complete (20 patterns) quick reference: http://www.netobjectives.com/dpexplained/download/dpmatrix.pdf . 
Printed double-sided, it makes a good memory jogger. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From skip@pobox.com Fri Jun 14 01:53:52 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 19:53:52 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> Message-ID: <15625.16032.161304.357298@12-248-41-177.client.attbi.com> >> That's incorrect. Distutils is not a make substitute and I doubt it >> ever will be. What dependency checking it does do is incomplete and >> this gives rise to problems that are reported fairly frequently. Martin> Can you provide a specific example to support this criticism? Martin> Could you also explain how generating makefiles would help? >From python-list on June 10 (this is what made me wish yet again for better dependency checking): http://mail.python.org/pipermail/python-list/2002-June/108153.html It's clear nobody but me wants this, though I find it hard to believe most of you haven't been burned in the past the same way the above poster was. Frequently, after executing "cvs up" I see almost all of the Python core rebuild because some commonly used header file was modified, yet find that distutils rebuilds nothing. If a header file is modified which causes most of Objects and Python to be rebuilt, but nothing in Modules is rebuilt, I'm immediately suspicious. Here's a simple test you can perform at home. Build Python. Touch Include/Python.h. Run make again. Notice how the core files are all rebuilt but no modules are. Touch Modules/dbmmodule.c (or something else that builds). Run make again. 
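[The stale-module situation described in the touch test above can be detected mechanically. A minimal sketch follows; the function name and directory layout are made up, so point it at your own build tree.]

```python
import os
from glob import glob

def stale_shared_libs(build_dir, include_dir):
    """Return the shared libraries in build_dir that are older than the
    newest public header in include_dir -- i.e. the modules a
    header-aware build would have recompiled."""
    headers = glob(os.path.join(include_dir, '*.h'))
    if not headers:
        return []
    newest_header = max(os.path.getmtime(h) for h in headers)
    return sorted(so for so in glob(os.path.join(build_dir, '*.so'))
                  if os.path.getmtime(so) < newest_header)
```

[Run over a hypothetical build/lib directory after touching Include/Python.h, this would list every extension that distutils, at the time of this thread, silently left stale.]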
I'm simply going to stop worrying about it and just keep deleting all the stuff distutils builds to make sure I get correct shared libraries. Skip From skip@pobox.com Fri Jun 14 01:54:32 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 19:54:32 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <3D093438.92B46349@prescod.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> Message-ID: <15625.16072.900596.114938@12-248-41-177.client.attbi.com> Paul> I guess most of us don't understand the benefits because we don't Paul> see dependency tracking as necessarily that difficult. It's no Paul> harder than the new method resolution order. ;) If it's not that difficult why isn't it being done? Skip From guido@python.org Fri Jun 14 03:05:00 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 13 Jun 2002 22:05:00 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Thu, 13 Jun 2002 19:53:52 CDT." <15625.16032.161304.357298@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> Message-ID: <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> > Here's a simple test you can perform at home. Build Python. Touch > Include/Python.h. Run make again. Notice how the core files are all > rebuilt but no modules are. Touch Modules/dbmmodule.c (or something else > that builds). Run make again. Most of the time most of the rebuilds of the core are unnecessary. 
> I'm simply going to stop worrying about it and just keep deleting all the > stuff distutils builds to make sure I get correct shared libraries. Because we are religious about binary backwards compatibility, it is very rare that a change to a header file requires that extensions are recompiled. But when this happens, it is often the last thing we think of when debugging. :-( I think the conclusion from this thread is that it's not the checking of dependencies which is the problem. (Jeremy just added this to distutils.) It is the specification of which files are dependent on which others that is a pain. I think that with Jeremy's changes it would not be hard to add a rule to our setup.py that makes all extensions dependent on all .h files in the Include directory -- a reasonable approximation of the rule that the main Makefile uses. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 03:05:36 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 13 Jun 2002 22:05:36 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Thu, 13 Jun 2002 19:54:32 CDT." <15625.16072.900596.114938@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> Message-ID: <200206140205.g5E25av27450@pcp02138704pcs.reston01.va.comcast.net> > If it's not that difficult why isn't it being done? It's done. Jeremy added it today. 
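[The coarse rule proposed above — every extension depends on every header in Include — is easy to sketch. The `depends` attribute used here is an assumption for illustration; the thread does not say how Jeremy's new dependency tracking spells it.]

```python
import os
from glob import glob

def add_header_deps(extensions, include_dir):
    """Make every extension depend on every public header -- a coarse
    approximation of the main Makefile's rule.

    `extensions` is any sequence of objects carrying a `depends` list
    of filenames (a hypothetical attribute for this sketch).
    """
    headers = sorted(glob(os.path.join(include_dir, '*.h')))
    for ext in extensions:
        ext.depends = sorted(set(getattr(ext, 'depends', []) + headers))
```

[Overly conservative — touching any header marks every extension stale — but that matches what the core Makefile already does.]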
--Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Fri Jun 14 03:09:51 2002 From: nas@python.ca (Neil Schemenauer) Date: Thu, 13 Jun 2002 19:09:51 -0700 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.16072.900596.114938@12-248-41-177.client.attbi.com>; from skip@pobox.com on Thu, Jun 13, 2002 at 07:54:32PM -0500 References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> Message-ID: <20020613190951.A30383@glacier.arctrix.com> Skip Montanaro wrote: > If it's not that difficult why isn't it being done? I think the hard part is getting the dependency information. Using it is trivial. 'make' does not help solve the former problem. Speaking as someone who spent time hacking on the Python Makefile, avoid 'make'. The portable subset is limited and sucky. Neil From jeremy@zope.com Thu Jun 13 23:26:39 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Thu, 13 Jun 2002 18:26:39 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.64326.946056.280597@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206131751.g5DHps801904@odiug.zope.com> <15624.64326.946056.280597@12-248-41-177.client.attbi.com> Message-ID: <15625.7199.102456.85532@slothrop.zope.com> >>>>> "SM" == Skip Montanaro writes: SM> I don't see that writing different makefile formats is any SM> harder than writing different shell commands. On those systems SM> where you don't have a make-like tool, either distutils already SM> writes compile and link commands or it doesn't work at all. On SM> those systems where you do have a make-like facility, I see no SM> reason to not use it. 
You will get more reliable dependency SM> checking for one thing. Only if distutils grows a way to specify all those dependencies. Once you've specified them, I'm not sure why it is difficult to check them in Python code instead of relying on make. i'm-probably-naive-ly y'rs, Jeremy From skip@pobox.com Fri Jun 14 04:48:07 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 13 Jun 2002 22:48:07 -0500 Subject: [Python-Dev] Updates to bsddb and dbm module build process Message-ID: <15625.26487.865171.411458@12-248-41-177.client.attbi.com> Would someone like to take a look at the diff file attached to this patch? http://python.org/sf/553108 I think it's complete except for a note in Misc/NEWS. I'd assign it to Barry, but I understand he's a bit busy these days. Skip From martin@v.loewis.de Fri Jun 14 08:04:18 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jun 2002 09:04:18 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.16032.161304.357298@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > Martin> Can you provide a specific example to support this criticism? > Martin> Could you also explain how generating makefiles would help? > > >From python-list on June 10 (this is what made me wish yet again for better > dependency checking): > > http://mail.python.org/pipermail/python-list/2002-June/108153.html That does not answer my question: How would generating a makefile have helped? Notice that setup.py *will* regenerate nis.so if nis.c changes. The OP is right that it refused to do so because, meanwhile, he had changed Setup to build nismodule.so instead. 
This is where the real problem lies: that building modules via makesetup generates module.so, whereas building modules via setup.py builds .so. This needs to be fixed, and I feel that setup.py is right and makesetup is wrong. > It's clear nobody but me wants this Hard to tell, since I still don't quite get what "this" is. Generating makefiles: certainly I don't want this. The reason is not that I think there are no problems - I think that generating makefiles will not solve these problems. > though I find it hard to believe most of you haven't been burned in > the past the same way the above poster was. The specific problem comes from building shared modules through Setup. I never do this, so I have not been burned by that. > Frequently, after executing "cvs up" I see almost all of the Python core > rebuild because some commonly used header file was modified, yet find that > distutils rebuilds nothing. That I noticed. It has nothing to do with the article you quote, though, and I question that generating makefiles would help. I routinely rm -rf build when I see that some common header has changed. > Here's a simple test you can perform at home. Build Python. Touch > Include/Python.h. Run make again. Notice how the core files are all > rebuilt but no modules are. Touch Modules/dbmmodule.c (or something else > that builds). Run make again. I can reproduce your observations. I still don't see how generating makefiles will help to solve this problem. Regards, Martin From martin@v.loewis.de Fri Jun 14 08:10:04 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 14 Jun 2002 09:10:04 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.7199.102456.85532@slothrop.zope.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206131751.g5DHps801904@odiug.zope.com> <15624.64326.946056.280597@12-248-41-177.client.attbi.com> <15625.7199.102456.85532@slothrop.zope.com> Message-ID: Jeremy Hylton writes: > Only if distutils grows a way to specify all those dependencies. Once > you've specified them, I'm not sure why it is difficult to check them > in Python code instead of relying on make. I believe people normally want their build process to know dependencies without any specification of dependencies. Instead, the build process should know what the dependencies are by looking at the source files. For C, there are two ways to do that: you can either scan the sources yourself for include statements, or you can let the compiler dump dependency lists into files. The latter is only supported for some compilers, but it would help enormously: when compiling the first time, you know for sure that you will need to compile. When compiling the second time, you read the dependency information generated the first time, to determine whether any of the included headers has changed. If that is not the case, you can skip rebuilding. If you do rebuild, the dependency information will be updated automatically (since the change might have been to add an include). Regards, Martin From martin@v.loewis.de Fri Jun 14 08:11:00 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 14 Jun 2002 09:11:00 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.16072.900596.114938@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > Paul> I guess most of us don't understand the benefits because we don't > Paul> see dependency tracking as necessarily that difficult. It's no > Paul> harder than the new method resolution order. ;) > > If it's not that difficult why isn't it being done? You are wrong assuming it is not done. distutils does dependency analysis since day 1. Regards, Martin From martin@strakt.com Fri Jun 14 08:44:58 2002 From: martin@strakt.com (Martin =?iso-8859-1?Q?Sj=F6gren?=) Date: Fri, 14 Jun 2002 09:44:58 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020614074458.GA31022@strakt.com> On Thu, Jun 13, 2002 at 10:05:00PM -0400, Guido van Rossum wrote: > I think the conclusion from this thread is that it's not the checking > of dependencies which is the problem. (Jeremy just added this to > distutils.) It is the specification of which files are dependent on > which others that is a pain. 
I think that with Jeremy's changes it > would not be hard to add a rule to our setup.py that makes all > extensions dependent on all .h files in the Include directory -- a > reasonable approximation of the rule that the main Makefile uses. I for one would love to have dependencies in my extension modules, I usually end up deleting the build directory whenever I've changed a header file :( How about something like this: Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c': ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']}) though there is the problem of backwards compatibility :/ Just my two cents, Martin -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 7710870 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From mal@lemburg.com Fri Jun 14 09:04:24 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 14 Jun 2002 10:04:24 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> Message-ID: <3D09A388.8080107@lemburg.com> Martin Sjögren wrote: > On Thu, Jun 13, 2002 at 10:05:00PM -0400, Guido van Rossum wrote: > >>I think the conclusion from this thread is that it's not the checking >>of dependencies which is the problem. (Jeremy just added this to >>distutils.) It is the specification of which files are dependent on >>which others that is a pain. I think that with Jeremy's changes it >>would not be hard to add a rule to our setup.py that makes all >>extensions dependent on all .h files in the Include directory -- a >>reasonable approximation of the rule that the main Makefile uses.
> > I for one would love to have dependencies in my extension modules, I > usually end up deleting the build directory whenever I've changed a header > file :( > > How about something like this: > > Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c': > ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']}) > > though there is the problem of backwards compatibility :/ Just curious: distutils, as the name says, is a tool for distributing source code; that doesn't have much to do with developing code, where dependency analysis is nice to have since it saves compile time. When distributing code, the standard setup is that a user unzips the distutils-created archive, types "python setup.py install" and that's it. Dependency analysis doesn't gain him anything. Now if you want to use distutils in the development process then you have a different mindset and therefore need different tools, like scons or make (+ makedep, etc.). The question is whether we want distutils to be a development tool as well, or rather stick to its main purpose: that of simplifying distribution and installation of software (and thanks to Greg, it's great at that !). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From aleax@aleax.it Fri Jun 14 09:42:33 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 14 Jun 2002 10:42:33 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <3D09A388.8080107@lemburg.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> Message-ID: On Friday 14 June 2002 10:04 am, M.-A. Lemburg wrote: ...
> distutils, as the name says, is a tool for distributing > source code; that doesn't have much to do with developing code > where dependency analysis is nice to have since it saves compile However, distutils is already today the handiest building environment, particularly if your extension needs to support several platforms and/or several versions of Python. > The question is whether we want distutils to be a development > tool as well, or rather stick to its main purpose: that of > simplifying distribution and installation of software (and > thanks to Greg, it's great at that !). The "problem" (:-) is that it's great at just building extensions, too. python2.1 setup.py install, python2.2 setup.py install, python2.3 setup.py install, and hey pronto, I have my extension built and installed on all Python versions I want to support, ready for testing. Hard to beat!-) Alex From fredrik@pythonware.com Fri Jun 14 10:45:28 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 Jun 2002 11:45:28 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> Message-ID: <005a01c21388$387cd3e0$0900a8c0@spiff> alex wrote: > The "problem" (:-) is that it's great at just building extensions, too. > > python2.1 setup.py install, python2.2 setup.py install, python2.3 setup.py > install, and hey pronto, I have my extension built and installed on all > Python versions I want to support, ready for testing. Hard to beat!-) does your code always work right away? I tend to use an incremental approach, with lots of edit-compile-run cycles. I still haven't found a way to get the damn thing to just build my extension and copy it to the current directory, so I can run the test scripts. does anyone here know how to do that, without having to resort to ugly wrapper batch files/shell scripts?
(distutils is also a pain to use with a version management system that marks files in the repository as read-only; distutils copy function happily copies all the status bits. but the remove function refuses to remove files that are read-only, even if the files have been created by distutils itself...) From mwh@python.net Fri Jun 14 10:48:54 2002 From: mwh@python.net (Michael Hudson) Date: 14 Jun 2002 10:48:54 +0100 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: martin@v.loewis.de's message of "14 Jun 2002 09:10:04 +0200" References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206131751.g5DHps801904@odiug.zope.com> <15624.64326.946056.280597@12-248-41-177.client.attbi.com> <15625.7199.102456.85532@slothrop.zope.com> Message-ID: <2mofeeyyux.fsf@starship.python.net> martin@v.loewis.de (Martin v. Loewis) writes: > Jeremy Hylton writes: > > > Only if distutils grows a way to specify all those dependencies. Once > > you've specified them, I'm not sure why it is difficult to check them > > in Python code instead of relying on make. > > I believe people normally want their build process to know > dependencies without any specification of dependencies. Instead, the > build process should know what the dependencies are by looking at the > source files. > > For C, there are two ways to do that: you can either scan the sources > yourself for include statements, or you can let the compiler dump > dependency lists into files. > > The latter is only supported for some compilers, but it would help > enourmously: when compiling the first time, you know for sure that you > will need to compile. When compiling the second time, you read the > dependency information generated the first time, to determine whether > any of the included headers has changed. If that is not the case, you > can skip rebuilding. 
If you do rebuild, the dependency information > will be updated automatically (since the change might have been to add > an include). $ cd ~/src/sf/python/dist/src/Lib/distutils/command/ $ ls -l build_dep.py -rw-rw-r-- 1 mwh mwh 763 Apr 13 11:18 build_dep.py Had that idea. Didn't get very far with it, though. Maybe on the train to EuroPython... Cheers, M. -- The gripping hand is really that there are morons everywhere, it's just that the Americon morons are funnier than average. -- Pim van Riezen, alt.sysadmin.recovery From mwh@python.net Fri Jun 14 11:10:46 2002 From: mwh@python.net (Michael Hudson) Date: 14 Jun 2002 11:10:46 +0100 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: "Fredrik Lundh"'s message of "Fri, 14 Jun 2002 11:45:28 +0200" References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID: <2m7kl2npax.fsf@starship.python.net> "Fredrik Lundh" writes: > alex wrote: > > > The "problem" (:-) is that it's great at just building extensions, too. > > > > python2.1 setup.py install, python2.2 setup.py install, python2.3 setup.py > > install, and hey pronto, I have my extension built and installed on all > > Python versions I want to support, ready for testing. Hard to beat!-) > > does your code always work right away? > > I tend to use an incremental approach, with lots of edit-compile-run > cycles. I still haven't found a way to get the damn thing to just build > my extension and copy it to the current directory, so I can run the > test scripts. > > does anyone here know how to do that, without having to resort to > ugly wrapper batch files/shell scripts? Nope. I guess class install_local(distutils.command.install.install): ... would be one way. Perhaps it should be built in.
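Back on the dependency-file branch of the thread: a gcc-style .d file (what -MD/-MMD writes, and presumably what a build_dep command would read) is just a "target: prerequisites" rule with backslash continuations, so reading one takes only a few lines. A sketch — the function name is made up here, not distutils API:

```python
def parse_depfile(path):
    """Read a gcc-style dependency file (e.g. produced by 'gcc -MMD')
    and return (target, [prerequisites])."""
    with open(path) as f:
        text = f.read()
    text = text.replace("\\\n", " ")      # join backslash-continued lines
    target, _, deps = text.partition(":")
    return target.strip(), deps.split()
```

Comparing the mtimes of those prerequisites against the object file then tells you whether a rebuild is needed at all.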
> (distutils is also a pain to use with a version management system > that marks files in the repository as read-only; distutils copy function > happily copies all the status bits. but the remove function refuses to > remove files that are read-only, even if the files have been created > by distutils itself...) Yeah, this area sucks. It interacts v. badly with umask, too. Maybe I'll work on this bug instead on my next train journey... installing shared libraries with something like copy_tree is gross. Cheers, M. -- Counting lines is probably a good idea if you want to print it out and are short on paper, but I fail to see the purpose otherwise. -- Erik Naggum, comp.lang.lisp From thomas.heller@ion-tof.com Fri Jun 14 11:56:43 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 12:56:43 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID: <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> > does your code always work right away? > > I tend to use an incremental approach, with lots of edit-compile-run > cycles. I still haven't found a way to get the damn thing to just build > my extension and copy it to the current directory, so I can run the > test scripts. > > does anyone here know how to do that, without having to resort to > ugly wrapper batch files/shell scripts? > > (distutils is also a pain to use with a version management system > that marks files in the repository as read-only; distutils copy function > happily copies all the status bits. but the remove function refuses to > remove files that are read-only, even if the files have been created > by distutils itself...) > > > setup.py install --install-lib=. 
Thomas From aleax@aleax.it Fri Jun 14 12:03:39 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 14 Jun 2002 13:03:39 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <005a01c21388$387cd3e0$0900a8c0@spiff> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID: On Friday 14 June 2002 11:45 am, Fredrik Lundh wrote: > alex wrote: > > The "problem" (:-) is that it's great at just building extensions, too. > > > > python2.1 setup.py install, python2.2 setup.py install, python2.3 > > setup.py install, and hey pronto, I have my extension built and installed > > on all Python versions I want to support, ready for testing. Hard to > > beat!-) > > does your code always work right away? Never! As the tests fail and problems are identified, I edit the sources, and redo the setup.py install on one or more of the Python versions. > I tend to use an incremental approach, with lots of edit-compile-run Me too. Iterative and incremental is highly productive. > cycles. I still haven't found a way to get the damn thing to just build > my extension and copy it to the current directory, so I can run the > test scripts. I haven't even looked for such a way, since going to site-packages is no problem for me. If I was developing on a Python installation shared by several users I'd no doubt feel differently about it. > (distutils is also a pain to use with a version management system > that marks files in the repository as read-only; distutils copy function Many things are. Fortunately, cvs, for all of its problems, doesn't do the readonly thing:-).
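For version-control systems that do mark working files read-only, the failing removal Fredrik describes can be forced by clearing the read-only bit before deleting. A sketch — the name echoes the force_remove_tree() Thomas Heller mentions later in the thread, but this is an illustration, not the distutils API:

```python
import os
import stat

def force_remove_tree(top):
    """Remove a directory tree, clearing the read-only bit on every file
    first (as files copied out of a read-only checkout need on Windows)."""
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            os.chmod(path, stat.S_IWRITE | stat.S_IREAD)  # drop read-only
            os.remove(path)
        for name in dirs:
            os.rmdir(os.path.join(root, name))  # already emptied (bottom-up)
    os.rmdir(top)
```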
Alex From fredrik@pythonware.com Fri Jun 14 12:20:18 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 Jun 2002 13:20:18 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <200206141059.g5EAxLI31419@pythonware.com> Message-ID: <016901c21395$dc9ee190$0900a8c0@spiff> alex wrote: > > cycles. I still haven't found a way to get the damn thing to just build > > my extension and copy it to the current directory, so I can run the > > test scripts. > > I haven't even looked for such a way, since going to site-packages is > no problem for me. If I was developing on a Python installation shared > by several users I'd no doubt feel differently about it. you only work on a single project too, I assume. I tend to prefer not to install a broken extension in my machine's default install, in case I have to switch to another project... (and switching between projects is all I seem to do these days ;-) (and I maintain too many modules to afford to install a separate python interpreter for each one of them...) From fredrik@pythonware.com Fri Jun 14 12:23:06 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 Jun 2002 13:23:06 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> Message-ID: <016a01c21395$dca1eed0$0900a8c0@spiff> thomas wrote: > > > > does anyone here know how to do that, without having to resort to > > ugly wrapper batch files/shell scripts? > > > > (distutils is also a pain to use with a version management system > > that marks files in the repository as read-only; distutils copy function > > happily copies all the status bits.
but the remove function refuses to > > remove files that are read-only, even if the files have been created > > by distutils itself...) > > setup.py install --install-lib=. doesn't work: distutils ends up trying to overwrite (readonly) original source files. consider PIL, for example: in my source directory, I have the following files, checked out from a repository: setup.py _imaging.c *.c PIL/*.py I want to be able to run setup.py and end up with an _imaging.pyd in the same directory. I don't want distutils to attempt to copy stuff from PIL/*.py to PIL/*.py, mess up other parts of my source tree, install any scripts (broken or not) in the Python directory, or just generally make an ass of itself when failing to copy readonly files on top of other readonly files. the following is a bit more reliable (windows version): rd /s /q build python setup.py build rd /s /q install python setup.py install --prefix install copy install\*.pyd . if distutils didn't mess up when deleting readonly files it created all by itself, the following command could perhaps work: setup.py install_ext --install-lib=. but there is no install_ext command in the version I have... From fredrik@pythonware.com Fri Jun 14 12:25:38 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 Jun 2002 13:25:38 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies Message-ID: <019201c21396$36c9eb60$0900a8c0@spiff> > the following is a bit more reliable (windows version): > > rd /s /q build > python setup.py build > rd /s /q install > python setup.py install --prefix install > copy install\*.pyd . except that the PYD ends up under install\lib\site-packages in some versions of distutils, of course... brute-force workaround: copy install\*.pyd . copy install\lib\site-packages\*.pyd .
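For what it's worth, that brute-force copy can be written portably in Python: walk distutils' build/lib.* trees and pull any compiled extensions into the current directory. A sketch — the function name is hypothetical, and the build-directory layout assumed is the conventional build/lib.<platform>-<version> one:

```python
import glob
import os
import shutil

def copy_built_extensions(build_dir="build", dest=".", exts=(".pyd", ".so")):
    """Copy compiled extension modules out of distutils' build tree
    into dest -- a portable 'copy install\\*.pyd .' workaround."""
    copied = []
    for libdir in glob.glob(os.path.join(build_dir, "lib*")):
        for root, _dirs, files in os.walk(libdir):
            for name in files:
                if name.endswith(exts):   # str.endswith accepts a tuple
                    shutil.copy(os.path.join(root, name), dest)
                    copied.append(name)
    return copied
```

Run after `python setup.py build`, it leaves the fresh _imaging.pyd (or .so) next to the test scripts without touching site-packages.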
From aleax@aleax.it Fri Jun 14 12:36:37 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 14 Jun 2002 13:36:37 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <016901c21395$dc9ee190$0900a8c0@spiff> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206141059.g5EAxLI31419@pythonware.com> <016901c21395$dc9ee190$0900a8c0@spiff> Message-ID: On Friday 14 June 2002 01:20 pm, Fredrik Lundh wrote: > alex wrote: > > > cycles. I still haven't found a way to get the damn thing to just > > > build my extension and copy it to the current directory, so I can run > > > the test scripts. > > > > I haven't even looked for such a way, since going to site-packages is > > no problem for me. If I was developing on a Python installation shared > > by several users I'd no doubt feel differently about it. > > you only work on a single project too, I assume. You know what they say about "assume"...? In all fairness, I would say a substantial defect I have is to tend to try and juggle too MANY things at the same time. "a single project", *INDEED*...! > I tend to prefer not to install a broken extension in my machine's > default install, in case I have to switch to another project... (and > switching between projects is all I seem to do these days ;-) I guess it comes down to being a well-organized person. I'm not, and a couple of imperfect extensions in various site-packages doesn't importantly increase the already-high entropy of my working environment. Sure, it _would_ be even better if the extensions could be in some other distinguishable place until they're working -- but "the local directory" isn't such a place (no per-Python-version distinction in that case, and per-platform is also important -- I do most of my Windows work these days in a win4lin setup, same machine, disk, screen, &c, as my main Linux box -- ditto for cygwin inside that virtual Windows box inside that Linux box). 
Alex From thomas.heller@ion-tof.com Fri Jun 14 12:39:29 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 13:39:29 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> Message-ID: <0ef701c21398$24fc81c0$e000a8c0@thomasnotebook> From: "Fredrik Lundh" > consider PIL, for example: in my source directory, I have the > following files, checked out from a repository: > > setup.py > _imaging.c > *.c > PIL/*.py > > I want to be able to run setup.py and end up with an _imaging.pyd > in the same directory. I don't want distutils to attempt to copy > stuff from PIL/*.py to PIL/*.py, mess up other parts of my source > tree, install any scripts (broken or not) in the Python directory, or > just generally make an ass of itself when failing to copy readonly > files on top of other readonly files. > Then there's Berthold Höllmanns test-command he posted to the distutils sig, which internally runs the 'build' command, then extends sys.path by build_purelib, build_platlib, and the test-directory, and finally runs the tests in the test-directory files. For the readonly file issue, I have a force_remove_tree() function in one of my setup-scripts (well, actually it is part of the pyexe-distutils extension), cloned from distutils' remove_tree() function. Thomas From guido@python.org Fri Jun 14 12:58:53 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 07:58:53 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "14 Jun 2002 09:04:18 +0200." 
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> Message-ID: <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net> > This is where the real problem lies: that building modules via > makesetup generates module.so, whereas building modules via setup.py > builds .so. This needs to be fixed, and I feel that setup.py is right > and makesetup is wrong. IMO that's entirely accidental. You can use Setup to build either form. I would assume you can use setup.py to build either form too, but I'm not sure. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 13:05:32 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 08:05:32 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 09:44:58 +0200." <20020614074458.GA31022@strakt.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> Message-ID: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> > How about something like this: > > Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c': > ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']}) > > though there is the problem of backwards compatibility :/ But this is wrong: it's not foo1.c that depends on bar.h, it's foo1.o.
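In make terms that is exactly the rule "foo1.o: foo1.c bar.h", and the check behind it is a plain mtime comparison (distutils ships a similar helper in dep_util). A minimal sketch:

```python
import os

def out_of_date(target, prerequisites):
    """True if target must be rebuilt: it is missing, or some
    prerequisite (source file or header) is newer than it."""
    if not os.path.exists(target):
        return True
    mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > mtime for p in prerequisites)

# foo1.o is stale when foo1.c *or* bar.h has been touched:
# out_of_date("foo1.o", ["foo1.c", "bar.h"])
```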
With the latest CVS, on Unix or Linux, try this: - Run Make to be sure you are up to date - Touch Modules/socketobject.h - Run Make again The latest setup.py has directives that tell it that the _socket and _ssl modules depend on socketmodule.h, and this makes it rebuild the necessary .o and .so files (through the changes to distutils that Jeremy made). All we need is for someone to add all the other dependencies to setup.py. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 13:09:15 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 08:09:15 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 10:04:24 +0200." <3D09A388.8080107@lemburg.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> Message-ID: <200206141209.g5EC9F931822@pcp02138704pcs.reston01.va.comcast.net> > The question is whether we want distutils to be a development > tool as well, or rather stick to its main purpose: that of > simplifying distribution and installation of software (and > thanks to Greg, it's great at that !). Yes. Much of distutils is concerned with compiling, and that part is also needed by a development tool. So I'd say it's a pretty good match. You have to specify the extension build rules as some kind of script. We found Modules/Setup + makesetup inadequate, and moved to setup.py + distutils. Distutils is the best we got; it knows about many compilers and platforms; it was pretty easy to add .h file dependency handling (though not discovery). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 13:13:22 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 08:13:22 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 11:45:28 +0200." <005a01c21388$387cd3e0$0900a8c0@spiff> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID: <200206141213.g5ECDMT31861@pcp02138704pcs.reston01.va.comcast.net> > I tend to use an incremental approach, with lots of edit-compile-run > cycles. I still haven't found a way to get the damn thing to just build > my extension and copy it to the current directory, so I can run the > test scripts. Funny, I use an edit-compile-run cycle too, but I don't have the need to copy anything to the current directory. > does anyone here know how to do that, without having to resort to > ugly wrapper batch files/shell scripts? > > (distutils is also a pain to use with a version management system > that marks files in the repository as read-only; distutils copy function > happily copies all the status bits. but the remove function refuses to > remove files that are read-only, even if the files have been created > by distutils itself...) This smells like a bug. Maybe it can be fixed rather than used as a stick to hit the dog. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@strakt.com Fri Jun 14 13:12:25 2002 From: martin@strakt.com (Martin Sjögren) Date: Fri, 14 Jun 2002 14:12:25 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020614121225.GA32573@strakt.com> On Fri, Jun 14, 2002 at 08:05:32AM -0400, Guido van Rossum wrote: > > How about something like this: > > > > Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c': > > ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']}) > > > > though there is the problem of backwards compatibility :/ > > But this is wrong: it's not foo1.c that depends on bar.h, it's foo1.o. You're right. > With the latest CVS, on Unix or Linux, try this: > > - Run Make to be sure you are up to date > - Touch Modules/socketobject.h > - Run Make again > > The latest setup.py has directives that tell it that the _socket and > _ssl modules depend on socketmodule.h, and this makes it rebuild the > necessary .o and .so files (through the changes to distutils that > Jeremy made). Cool. But my module consists of several .c files, how do I specify which .o files depend on which .h files? Now, it's a shame I have to maintain compatibility with the Python 2.1 and Python 2.2 distributions in my setup.py ;) I suppose I could try/except...
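That try/except can be packaged as a small factory that retries without the newer keyword arguments when an older distutils rejects them. Sketched here against a stand-in class rather than distutils itself; the dependency keyword that eventually landed in distutils is spelled depends, but any unsupported keyword behaves the same way:

```python
def make_extension(ext_class, name, sources, **newer_kwargs):
    """Build an Extension-like object, silently dropping keyword
    arguments (e.g. a dependency spec) that an older distutils
    release doesn't understand."""
    try:
        return ext_class(name, sources, **newer_kwargs)
    except TypeError:            # old signature: unexpected keyword argument
        return ext_class(name, sources)
```

With the real classes, `make_extension(Extension, 'foo', ['foo1.c', 'foo2.c'], depends=['bar.h'])` would then work under both old and new distutils, just without rebuild tracking on the old one.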
Regards, Martin -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 7710870 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From guido@python.org Fri Jun 14 13:17:52 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 08:17:52 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 13:23:06 +0200." <016a01c21395$dca1eed0$0900a8c0@spiff> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> Message-ID: <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> > consider PIL, for example: in my source directory, I have the > following files, checked out from a repository: > > setup.py > _imaging.c > *.c > PIL/*.py > > I want to be able to run setup.py and end up with an _imaging.pyd > in the same directory. I don't want distutils to attempt to copy > stuff from PIL/*.py to PIL/*.py, mess up other parts of my source > tree, install any scripts (broken or not) in the Python directory, or > just generally make an ass of itself when failing to copy readonly > files on top of other readonly files. I thought that the thing to do this was python setup.py build_ext -i --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 13:21:58 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 08:21:58 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 14:12:25 +0200."
<20020614121225.GA32573@strakt.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <20020614121225.GA32573@strakt.com> Message-ID: <200206141222.g5ECMA406904@pcp02138704pcs.reston01.va.comcast.net> > Cool. But my module consists of several .c files, how do I specify > which .o files depend on which .h files? You can't. Compared to throwing away the entire build directory containing all Python extensions, it's still a huge win. For your extension, it may not make much of a difference. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Fri Jun 14 13:24:36 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 14 Jun 2002 08:24:36 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <2m7kl2npax.fsf@starship.python.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <2m7kl2npax.fsf@starship.python.net> Message-ID: <20020614122436.GA2791@ute.mems-exchange.org> On Fri, Jun 14, 2002 at 11:10:46AM +0100, Michael Hudson wrote: >Yeah, this area sucks. It interacts v. badly with umask, too. Maybe >I'll work on this bug instead on my next train journey... installing >shared libraries with something like copy_tree is gross. Out of curiosity, why? I've had the patch below sitting in my copy of the tree forever, but haven't gotten around to checking whether it fixes the umask-related bug: I think it should, though. 
Index: install_lib.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/command/install_lib.py,v
retrieving revision 1.40
diff -u -r1.40 install_lib.py
--- install_lib.py	4 Jun 2002 20:14:43 -0000	1.40
+++ install_lib.py	14 Jun 2002 12:05:30 -0000
@@ -106,7 +106,8 @@
     def install (self):
         if os.path.isdir(self.build_dir):
-            outfiles = self.copy_tree(self.build_dir, self.install_dir)
+            outfiles = self.copy_tree(self.build_dir, self.install_dir,
+                                      preserve_mode=1)
         else:
             self.warn("'%s' does not exist -- no Python modules to install" % self.build_dir)
--amk From David Abrahams" Message-ID: <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com> Hi Fred, I wasn't aware of how it would be formatted when I submitted my PyObject_RichCompare patches. I think the different italic "op" usages are too similar. Probably \emph should be replaced with something that bold-ifies. Best, Dave ----- Original Message ----- From: "Fred L. Drake" To: ; ; Sent: Friday, June 14, 2002 8:20 AM Subject: [Python-Dev] [development doc updates] > The development version of the documentation has been updated: > > http://www.python.org/dev/doc/devel/ > > Updated to reflect recent changes.
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From mwh@python.net Fri Jun 14 13:46:28 2002 From: mwh@python.net (Michael Hudson) Date: 14 Jun 2002 13:46:28 +0100 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Andrew Kuchling's message of "Fri, 14 Jun 2002 08:24:36 -0400" References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <2m7kl2npax.fsf@starship.python.net> <20020614122436.GA2791@ute.mems-exchange.org> Message-ID: <2m4rg6hvtn.fsf@starship.python.net> Andrew Kuchling writes: > On Fri, Jun 14, 2002 at 11:10:46AM +0100, Michael Hudson wrote: > >Yeah, this area sucks. It interacts v. badly with umask, too. Maybe > >I'll work on this bug instead on my next train journey... installing > >shared libraries with something like copy_tree is gross. > > Out of curiosity, why? Dunno. It just strikes me as a really bad idea. If the linker produces other cruft you'll end up installing that too. It's certainly possible that core files could get installed, if you've run tests in build/lib.foo/. > I've had the patch below sitting in my copy of the tree forever, but > haven't gotten around to checking whether it fixes the umask-related > bug: I think it should, though. I think I tried that and it didn't work, but can't remember all that clearly. Cheers, M. -- As it seems to me, in Perl you have to be an expert to correctly make a nested data structure like, say, a list of hashes of instances. In Python, you have to be an idiot not to be able to do it, because you just write it down. 
-- Peter Norvig, comp.lang.functional From mwh@python.net Fri Jun 14 13:49:39 2002 From: mwh@python.net (Michael Hudson) Date: 14 Jun 2002 13:49:39 +0100 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: "Thomas Heller"'s message of "Fri, 14 Jun 2002 13:39:29 +0200" References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <0ef701c21398$24fc81c0$e000a8c0@thomasnotebook> Message-ID: <2m1ybahvoc.fsf@starship.python.net> "Thomas Heller" writes: > Then there's Berthold Höllmanns test-command he posted to the > distutils sig, which internally runs the 'build' command, then extends > sys.path by build_purelib, build_platlib, and the test-directory, and > finally runs the tests in the test-directory files. You could have a variant of that that just ran a code.InteractiveInterpreter. $ python setup.py play-around Cheers, M. -- 31. Simplicity does not precede complexity, but follows it. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From mgilfix@eecs.tufts.edu Fri Jun 14 13:55:53 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Fri, 14 Jun 2002 08:55:53 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <016901c21395$dc9ee190$0900a8c0@spiff>; from fredrik@pythonware.com on Fri, Jun 14, 2002 at 01:20:18PM +0200 References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <200206141059.g5EAxLI31419@pythonware.com> <016901c21395$dc9ee190$0900a8c0@spiff> Message-ID: <20020614085552.A4109@eecs.tufts.edu> On Fri, Jun 14 @ 13:20, Fredrik Lundh wrote: > alex wrote: > > > cycles. 
I still haven't found a way to get the damn thing to just build > > > my extension and copy it to the current directory, so I can run the > > > test scripts. > > > > I haven't even looked for such a way, since going to site-packages is > > no problem for me. If I was developing on a Python installation shared > > by several users I'd no doubt feel differently about it. > > you only work on a single project too, I assume. > > I tend to prefer not to install a broken extension in my machine's > default install, in case I have to switch to another project... (and > switching between projects is all I seem to do these days ;-) > > (and I maintain too many modules to afford to install a separate > python interpreter for each one of them...) Er, do you encase the main routines for your programs in a sub-directory? I usually create a 'libapp' directory and put my sources in there and then the main application loads main.py from libapp. That way, setup.py installs libapp into site-packages and I don't have to worry about multiple projects. That's a definite help. If that doesn't satisfy you, you could always play around with the install locations and then sys.path.. but that's usually not necessary. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From guido@python.org Fri Jun 14 14:12:45 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 09:12:45 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "14 Jun 2002 13:46:28 BST." 
<2m4rg6hvtn.fsf@starship.python.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <2m7kl2npax.fsf@starship.python.net> <20020614122436.GA2791@ute.mems-exchange.org> <2m4rg6hvtn.fsf@starship.python.net> Message-ID: <200206141312.g5EDCjl07313@pcp02138704pcs.reston01.va.comcast.net> > > I've had the patch below sitting in my copy of the tree forever, but > > haven't gotten around to checking whether it fixes the umask-related > > bug: I think it should, though. > > I think I tried that and it didn't work, but can't remember all that > clearly. Looks like that wouldn't address /F's problem, which is that the original files are read-only, so distutils makes the copies read-only, and then refuses to remove them when you ask it to. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Fri Jun 14 09:25:15 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 14 Jun 2002 04:25:15 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <20020614121225.GA32573@strakt.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <20020614121225.GA32573@strakt.com> Message-ID: <15625.43115.50297.18925@slothrop.zope.com> >>>>> "MS" == Martin Sjögren writes: >> But this is wrong: it's not foo1.c that depends on bar.h, it's >> foo1.o. MS> You're right. On the other hand, distutils setup scripts don't talk about .o files directly.
They talk about the .c file and assume there is a one-to-one correspondence between .c files and .o files. >> With the latest CVS, on Unix or Linux, try this: >> >> - Run Make to be sure you are up to date >> - Touch Modules/socketmodule.h >> - Run Make again >> >> The latest setup.py has directives that tell it that the _socket >> and _ssl modules depend on socketmodule.h, and this makes it >> rebuild the necessary .o and .so files (through the changes to >> distutils that Jeremy made). MS> Cool. But my module consists of several .c files, how do I MS> specify which .o files depend on which .h files? I did something simpler, as Guido mentioned. I added global dependencies for an extension. This has been fine for all the extensions that I commonly build because they have only one or several source files. Recompiling a few .c files costs little. I agree that it would be nice to have fine-grained dependency tracking, but that costs more in the implementation and to use. Thomas Heller has a patch on SF (don't recall the number) that handles per-file dependencies. I didn't care for the way the dependencies are spelled in the setup script, but something like the dict that Martin (the other Martin, right?) suggested seems workable. MS> Now, it's a shame I have to maintain compatibility with the MS> Python 2.1 and Python 2.2 distributions in my setup.py ;) MS> I suppose I could try/except... We should come up with a good hack to use in setup scripts. This is my first try. It's got too many lines, but it works.

# A hack to determine if Extension objects support the depends keyword arg.
if not "depends" in Extension.__init__.func_code.co_varnames:
    # If it doesn't, create a local replacement that removes depends
    # from the kwargs before calling the regular constructor.
    _Extension = Extension
    class Extension(_Extension):
        def __init__(self, name, sources, **kwargs):
            if "depends" in kwargs:
                del kwargs["depends"]
            _Extension.__init__(self, name, sources, **kwargs)

Jeremy

From Jack.Jansen@cwi.nl Fri Jun 14 14:11:43 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 14 Jun 2002 15:11:43 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> Message-ID: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> On Thursday, June 13, 2002, at 07:49 , Skip Montanaro wrote: > > I wonder if it would be better to have distutils generate the > appropriate > type of makefile and execute that instead of directly building objects > and > shared libraries. This would finesse some of the dependency tracking > problems that pop up frequently. +1 Distutils is very unix-centric in that it expects there to be separate compile and link steps. While this can be made to work on Windows (at least for MSVC), where there are such separate compilers if you look hard enough, it can't be made to work for MetroWerks on the Mac, and also for MSVC it's a rather funny way to do things. I would much prefer it if distutils would (optionally) gather all its knowledge and generate a Makefile or an MW projectfile or an MSVC projectfile. For MW distutils already does this (every step simply remembers information, and at the "link" step it writes out a project file and builds that) but it would be nice if this way of operation was codified. Note that for people having an IDE this would also make debugging a lot easier: if you have an IDE project you can easily do nifty things like turn on debugging, use its class browser, etc.
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From martin@strakt.com Fri Jun 14 14:23:00 2002 From: martin@strakt.com (Martin Sjögren) Date: Fri, 14 Jun 2002 15:23:00 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15625.43115.50297.18925@slothrop.zope.com> References: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <20020614121225.GA32573@strakt.com> <15625.43115.50297.18925@slothrop.zope.com> Message-ID: <20020614132300.GB712@strakt.com> On Fri, Jun 14, 2002 at 04:25:15AM -0400, Jeremy Hylton wrote: > MS> Cool. But my module consists of several .c files, how do I > MS> specify which .o files depend on which .h files? > > I did something simpler, as Guido mentioned. I added global > dependencies for an extension. This has been fine for all the > extensions that I commonly build because they have only one or several > source files. Recompiling a few .c files costs little. > > I agree that it would be nice to have fine-grained dependency > tracking, but that costs more in the implementation and to use. > Thomas Heller has a patch on SF (don't recall the number) that handles > per-file dependencies. I didn't care for the way the dependencies are > spelled in the setup script, but something like the dict that Martin > (the other Martin, right?) suggested seems workable.

Extension('foo', ['foo1.c', 'foo2.c'],
          dependencies={'foo1.c': ['bar.h'],
                        'foo2.c': ['bar.h', 'bar2.h']})

That's what I suggested, is that what you meant?
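The dict spelling above is concrete enough to compute rebuilds from. A hypothetical helper, for illustration only — the 'dependencies' keyword here is a proposal in this thread, not existing distutils API:

```python
# Map each .c file to the headers it includes, as in the proposal above.
DEPENDENCIES = {
    'foo1.c': ['bar.h'],
    'foo2.c': ['bar.h', 'bar2.h'],
}

def sources_to_rebuild(changed_header, deps):
    """Return the .c files whose .o files are stale after a header edit."""
    return sorted(src for src, headers in deps.items()
                  if changed_header in headers)

# Touching bar2.h invalidates only foo2.o; touching bar.h invalidates both.
```

This is exactly the computation a build tool would do per compile step, instead of the coarser "recompile every source in the extension" behavior Jeremy describes.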
> MS> Now, it's a shame I have to maintain compatibility with the > MS> Python 2.1 and Python 2.2 distributions in my setup.py ;) > MS> I suppose I could try/except... > > We should come up with a good hack to use in setup scripts. This is > my first try. It's got too many lines, but it works. >
> # A hack to determine if Extension objects support the depends keyword arg.
> if not "depends" in Extension.__init__.func_code.co_varnames:
>     # If it doesn't, create a local replacement that removes depends
>     # from the kwargs before calling the regular constructor.
>     _Extension = Extension
>     class Extension(_Extension):
>         def __init__(self, name, sources, **kwargs):
>             if "depends" in kwargs:
>                 del kwargs["depends"]
>             _Extension.__init__(self, name, sources, **kwargs)
Eep :) Looks like it could work, yes, but I think I'll skip that one while I'm still running Python 2.2. :) Cheers, Martin -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 7710870 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From guido@python.org Fri Jun 14 14:35:40 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 09:35:40 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 15:11:43 +0200." <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> Message-ID: <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> > Distutils is very unix-centric in that it expects there to be separate > compile and link steps. While this can be made to work on Windows (at > least for MSVC) where there are such separate compilers if you look hard > enough it can't be made to work for MetroWerks on the Mac, and also for > MSVC it's a rather funny way to do things. Actually, the setup dialogs and general structure of MSVC make you very aware of the Unixoid structure of the underlying compiler suite.
:-) But I believe what you say about MW. > I would much prefer it if distutils would (optionally) gather all its > knowledge and generate a Makefile or an MW projectfile or an MSVC > projectfile. > > For MW distutils already does this (every step simply remembers > information, and at the "link" step it writes out a project file and > builds that) but it would be nice if this way of operation was codified. I'm not sure what's to codify -- this is different for each compiler suite. When using setup.py with a 3rd party extension on Windows, I like the fact that I don't have to fire up the GUI to build it. (I just wish it were easier to make distutils do the right thing for debug builds of Python. This has improved on Unix but I hear it's still broken on Windows.) > Note that for people having an IDE this would also make debugging a lot > easier: if you have an IDE project you can easily do nifty things like > turn on debugging, use its class browser, etc. That's for developers though, not for people installing extensions that come with a setup.py script. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jun 14 14:44:07 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 14 Jun 2002 09:44:07 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > (I just wish it were easier to make distutils do the right thing for > debug builds of Python. This has improved on Unix but I hear it's > still broken on Windows.) Hard to say. "stupid_build.py --debug" works great on Windows in the Zope3 tree. "setup.py --debug" on Windows in the Zope tree builds the debug stuff but leaves the results in unusable places. Since I don't understand distutils or the Zope build process, I'm not complaining. There's nothing that can't be fixed by hand via a mouse, Windows Explorer, and a spare hour each time around .
From fdrake@acm.org Fri Jun 14 14:53:21 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 14 Jun 2002 09:53:21 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com> References: <20020614122004.9148F286BC@beowolf.fdrake.net> <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com> Message-ID: <15625.62801.794774.266808@grendel.zope.com> David Abrahams writes: > I wasn't aware of how it would be formatted when I submitted my > PyObject_RichCompare patches. I think the different italic "op" usages are > too similar. Probably \emph should be replaced with something that > bold-ifies. Bold is the wrong thing. I propose: - change the argument name to "opid" - let the "op" stand-in show in the code font so it contrasts with the o1 and o2 references in the \samp. Here's what it looks like: http://www.python.org/dev/doc/devel/api/object.html#l2h-170 If that works for you, let me know and I'll update the 2.2.x docs and commit the changes. Thanks for your comments! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From David Abrahams" <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com> <15625.62801.794774.266808@grendel.zope.com> Message-ID: <08b101c213af$662a2dc0$6601a8c0@boostconsulting.com> That's great, Fred. You might also want to add the "Return value: New reference." riff to PyObject_RichCompare; I missed that when looking at the existing doc for examples. -Dave From: "Fred L. Drake, Jr." > > David Abrahams writes: > > I wasn't aware of how it would be formatted when I submitted my > > PyObject_RichCompare patches. I think the different italic "op" usages are > > too similar. Probably \emph should be replaced with something that > > bold-ifies. > > Bold is the wrong thing. I propose: > > - change the argument name to "opid" > - let the "op" stand-in show in the code font so it contrasts with the > o1 and o2 references in the \samp. 
> > Here's what it looks like: > > http://www.python.org/dev/doc/devel/api/object.html#l2h-170 > > If that works for you, let me know and I'll update the 2.2.x docs and > commit the changes. > > Thanks for your comments! > > > -Fred > > -- > Fred L. Drake, Jr. > PythonLabs at Zope Corporation > From fdrake@acm.org Fri Jun 14 15:37:56 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 14 Jun 2002 10:37:56 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <08b101c213af$662a2dc0$6601a8c0@boostconsulting.com> References: <20020614122004.9148F286BC@beowolf.fdrake.net> <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com> <15625.62801.794774.266808@grendel.zope.com> <08b101c213af$662a2dc0$6601a8c0@boostconsulting.com> Message-ID: <15625.65476.263702.375912@grendel.zope.com> David Abrahams writes: > That's great, Fred. You might also want to add the "Return value: New > reference." riff to PyObject_RichCompare; I missed that when looking at the > existing doc for examples. The mechanics of the refcount data are a bit different. I've added that and committed the changes; updates should appear on the site in the next hour. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From thomas.heller@ion-tof.com Fri Jun 14 15:38:59 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 16:38:59 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <105701c213b1$3853a460$e000a8c0@thomasnotebook> From: "Guido van Rossum" > I'm not sure what's to codify -- this is different for each compiler > suite. When using setup.py with a 3rd party extension on Windows, I > like the fact that I don't have to fire up the GUI to build it. (I Same for me. > just wish it were easier to make distutils do the right thing for > debug builds of Python. 
This has improved on Unix but I hear it's > still broken on Windows.) > What do you think is broken with the debug builds? I use it routinely and have no problems at all... [Jack] > > Note that for people having an IDE this would also make debugging a lot > > easier: if you have an IDE project you can easily do nifty things like > > turn on debugging, use its class browser, etc. I prefer to insert

#ifdef _DEBUG
    _asm int 3; /* breakpoint */
#endif

into the problematic sections of my code, and whoops, the MSVC GUI debugger opens just when this code is executed, even if it was started from the command line. Thomas From guido@python.org Fri Jun 14 15:51:21 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 10:51:21 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 16:38:59 +0200." <105701c213b1$3853a460$e000a8c0@thomasnotebook> References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> <105701c213b1$3853a460$e000a8c0@thomasnotebook> Message-ID: <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net> > What do you think is broken with the debug builds? > I use it routinely and have no problems at all... I was repeating hearsay. Here's what used to be broken on Unix: if you built a debug Python but did not install it (assuming a non-debug Python was already installed), and then used that debug Python to build a 3rd party extension, the debug Python's configuration would be ignored, and the extension would be built with the configuration of the installed Python instead. Such extensions can't be linked with the debug Python, which was the whole point of using the debug Python to build in the first place. Jeremy recently fixed this for Unix, and I'm very happy.
I think that using the debug executable should be sufficient to turn on the debug flags. More generally, I think that when you use a Python executable that lives in a build directory, the configuration of that build directory should be used for all extensions you build. This is what Jeremy did in his fix. (As a side effect, building the Python extensions no longer needs to be special-cased.) --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Fri Jun 14 15:59:06 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 16:59:06 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> <105701c213b1$3853a460$e000a8c0@thomasnotebook> <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <107901c213b4$07e849e0$e000a8c0@thomasnotebook> From: "Guido van Rossum" > > What do you think is broken with the debug builds? > > I use it routinely and have no problems at all... > > I was repeating hearsay. The complaints I remember (mostly from c.l.p) are from people who want to build debug versions of extensions while at the same time refusing to build a debug version of Python from the sources. > > Here's what used to be broken on Unix: if you built a debug Python but > did not install it (assuming a non-debug Python was already > installed), and then used that debug Python to build a 3rd party > extension, the debug Python's configuration would be ignored, and the > extension would be built with the configuration of the installed > Python instead. Such extensions can't be linked with the debug > Python, which was the whole point of using the debug Python to build > in the first place. > > Jeremy recently fixed this for Unix, and I'm very happy. 
> > But I believe that on Windows you still have to add "--debug" to your > setup.py build command to get the same effect. I think that using the > debug executable should be sufficient to turn on the debug flags. > > More generally, I think that when you use a Python executable that > lives in a build directory, the configuration of that build directory > should be used for all extensions you build. This is what Jeremy did > in his fix. (As a side effect, building the Python extensions no > longer needs to be special-cased.) > I don't know anything about building Python (and extensions) on Unix, but here's how it works on Windows: You can use the release as well as the debug version of Python to build debug or release extensions with distutils. You have to use the --debug switch to specify which one to use. The debug version needs other libraries than the release version; they all have an _d inserted into the filename just before the filename extension (but you probably know this already ;-). I don't know if it even is possible (in Python code) to determine whether the debug or the release exe is currently running. With changes I recently made to distutils, you can even do all this in a 'not installed' version, straight from CVS, for example. Thomas From David Abrahams" <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <08ed01c213b4$52d866b0$6601a8c0@boostconsulting.com> From: "Guido van Rossum" > > Distutils is very unix-centric in that it expects there to be separate > > compile and link steps. While this can be made to work on Windows (at > > least for MSVC) where there are such separate compilers if you look hard > > enough it can't be made to work for MetroWerks on the Mac, and also for > > MSVC it's a rather funny way to do things. > > Actually, the setup dialogs and general structure of MSVC make you > very aware of the Unixoid structure of the underlying compiler > suite. :-) > > But I believe what you say about MW.
Well, that really depends on whether you think supporting MacOS 9 development is important. MW supplies regular command-line tools for MacOS X. -Dave From jeremy@zope.com Fri Jun 14 11:20:04 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 14 Jun 2002 06:20:04 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net> References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net> <105701c213b1$3853a460$e000a8c0@thomasnotebook> <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15625.50004.13771.247686@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: GvR> Jeremy recently fixed this for Unix, and I'm very happy. Actually, it was Fred. I expect you're still very happy, and now Fred is, too. Jeremy From tim.one@comcast.net Fri Jun 14 16:18:22 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 14 Jun 2002 11:18:22 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <107901c213b4$07e849e0$e000a8c0@thomasnotebook> Message-ID: [Thomas Heller] > ... > I don't know if it even is possible (in Python code) to determine > whether the debug or the release exe is currently running. FYI, the sys module exposes some debugging tools only in the debug build. So, e.g.,

def is_debug_build():
    import sys
    return hasattr(sys, "getobjects")

returns the right answer (and, I believe, under all versions of Python).
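A sketch (not from the thread) wrapping the same feature test so code can use the debug-only hooks and degrade gracefully on a release build; sys.gettotalrefcount is another hook that exists only in debug builds:

```python
import sys

def is_debug_build():
    # Tim's test: sys.getobjects exists only in a debug build.
    return hasattr(sys, "getobjects")

def total_refcount():
    # sys.gettotalrefcount is likewise a debug-build-only hook;
    # report None on an ordinary release interpreter.
    if hasattr(sys, "gettotalrefcount"):
        return sys.gettotalrefcount()
    return None
```

On a release interpreter both probes simply come back False/None, so the same script runs unchanged in either build.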
From Jack.Jansen@cwi.nl Fri Jun 14 16:30:47 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 14 Jun 2002 17:30:47 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <105701c213b1$3853a460$e000a8c0@thomasnotebook> Message-ID: On Friday, June 14, 2002, at 04:38 , Thomas Heller wrote: > I prefer to insert > #ifdef _DEBUG > _asm int 3; /* breakpoint */ > #endif > into the problematic sections of my code, and whoops, > the MSVC GUI debugger opens just when this code is executed, > even if it was started from the command line. Ok, MSVC finally scored a point with me, this is nifty:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From thomas.heller@ion-tof.com Fri Jun 14 16:45:04 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 17:45:04 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: Message-ID: <10fb01c213ba$743e25f0$e000a8c0@thomasnotebook> From: "Tim Peters" > [Thomas Heller] > > ... > > I don't know if it even is possible (in Python code) to determine > > whether the debug or the release exe is currently running. > > FYI, the sys module exposes some debugging tools only in the debug build. > So, e.g., > > def is_debug_build(): > import sys > return hasattr(sys, "getobjects") > > returns the right answer (and, I believe, under all versions of Python). > I can (in 2.2) see sys.getobjects() and sys.gettotalrefcount(). I can also guess what gettotalrefcount does, but what does getobjects() do? Is it documented somewhere? 
Thomas From jepler@unpythonic.net Fri Jun 14 17:01:22 2002 From: jepler@unpythonic.net (Jeff Epler) Date: Fri, 14 Jun 2002 11:01:22 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <105701c213b1$3853a460$e000a8c0@thomasnotebook> Message-ID: <20020614160117.GE30070@unpythonic.net> On Fri, Jun 14, 2002 at 05:30:47PM +0200, Jack Jansen wrote: > > On Friday, June 14, 2002, at 04:38 , Thomas Heller wrote: > >I prefer to insert > >#ifdef _DEBUG > > _asm int 3; /* breakpoint */ > >#endif > >into the problematic sections of my code, and whoops, > >the MSVC GUI debugger opens just when this code is executed, > >even if it was started from the command line. > > Ok, MSVC finally scored a point with me, this is nifty:-) You can "set" a breakpoint this way in x86 Linux too. Unfortunately, when this is not run under the debugger, it simply sends a SIGTRAP to the process. In theory the standard library could handle SIGTRAP by invoking the debugger, but 5 minutes fiddling around didn't produce a very dependable way of doing so.

(gdb) run
Starting program: ./a.out
a

Program received signal SIGTRAP, Trace/breakpoint trap.
main () at bp.c:21
21          printf("b\n");
(gdb) cont
Continuing.
b

#include <stdio.h>

#define _DEBUG

#ifdef _DEBUG
#if defined(WIN32)
#define BREAKPOINT _asm int 3
#elif defined(__GNUC__) && defined(__i386__)
#define BREAKPOINT __asm__ __volatile__ ("int3")
#else
#warning "BREAKPOINT not defined for this OS / Compiler"
#define BREAKPOINT (void)0
#endif
#else
#define BREAKPOINT (void)0
#endif

int main(void)
{
    printf("a\n");
    BREAKPOINT;
    printf("b\n");
    return 0;
}

From oren-py-d@hishome.net Fri Jun 14 17:15:58 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 14 Jun 2002 19:15:58 +0300 Subject: [Python-Dev] 'new' and 'types' Message-ID: <20020614191558.A31580@hishome.net> Patch 568629 removes the built-in module new (with sincere apologies to Tommy Burnette ;-) and replaces it with a tiny Python module consisting of a single import statement: """This module is no longer required except for backward compatibility. Objects of most types can now be created by calling the type object. """

from types import \
     ClassType as classobj, \
     CodeType as code, \
     FunctionType as function, \
     InstanceType as instance, \
     MethodType as instancemethod, \
     ModuleType as module

These types (as well as buffer and slice) have been made callable. It looks like the Python core no longer has any objects that are created by a separate factory function (there are still some in the Modules). Now, what about the types module? It has been suggested that this module should be deprecated. I think it still has some use: we need a place to put all the types that are not used often enough to be added to the builtins. I suggest that they be placed in the module 'types' with names matching their __name__ attribute. The types module will still have the long MixedCaseType names for backward compatibility. The use of the long names should be deprecated, not the types module itself.
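A short sketch of the change described above: with the type objects callable, the old new-module factories become plain constructor calls. (Illustration only, using two types whose names are unchanged in later Pythons; the "demo" names are invented.)

```python
import types

# Formerly new.module("demo"): call the module type directly.
mod = types.ModuleType("demo")
mod.answer = 42

# Formerly new.function(...): build a function from a code object.
code = compile("40 + 2", "<demo>", "eval")
func = types.FunctionType(code, {})
result = func()  # calling the function evaluates the compiled expression
```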
Oren From tim.one@comcast.net Fri Jun 14 17:41:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 14 Jun 2002 12:41:03 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <10fb01c213ba$743e25f0$e000a8c0@thomasnotebook> Message-ID: [Thomas Heller] > I can (in 2.2) see sys.getobjects() and sys.gettotalrefcount(). > I can also guess what gettotalrefcount does, but what does > getobjects() do? Is it documented somewhere? Sorry, I don't think any debug-mode-only gimmicks are documented outside of comments in the source files. In a debug build, the PyObject layout changes (btw, that's why you can't mix debug-build modules w/ release-build modules), adding new _ob_next and _ob_prev pointers at the start of every PyObject. The pointers form a doubly-linked list, which contains every live object in existence, except for those statically allocated (the builtin type objects). The head of the list is in object.c's static refchain vrbl. sys.getobjects(n) returns that C list of (almost) all live objects, as a Python list. Excluded from the list returned are the list itself, and the objects created to *call* getobjects(). The list of objects is in allocation order, most-recently allocated at the start (getobjects()[0]). n is the maximum number of objects it will return, where n==0 means (of course ) infinity. You can also pass it a type after the int, and, if you do, only objects of that type get returned. getobjects() is the tool of last resort when trying to track down an excess of increfs over decrefs. Python code that's exceedingly careful to account for its own effects can figure out anything using it. I once determined that the compiler was leaking references to the integer 2 this way . 
From tim.one@comcast.net Fri Jun 14 18:00:06 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 14 Jun 2002 13:00:06 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Message-ID:

[Tim, on the debug-build sys.getobjects()]
> ...
> You can also pass it a type after the int, and, if you do, only objects of
> that type get returned.

Speaking of which, that became a lot more pleasant in 2.2, as new-style classes create new types, and most builtin types have builtin names. You can pee away delighted weeks pondering the mysteries. For example:

Python 2.3a0 (#29, Jun 13 2002, 17:06:59) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
[8285 refs]
>>> sys.getobjects(0, int)
[17, 19, 20, 18, 14, 512, 56, 448, 128, 256, 512, 1024, 2048, 49152, 40960, 4096, 32768, 24576, 8192, 16384, 61440, 4095, 9, 7, 6, 5, 10, 32, 16, 64, 4096, 128, 16384, 32768, 512, 1024, 256, 32767, 511, -4, -1, 15, 11, 8, 22, 4, 21, 23, 503316480, 65535, 2147483647, 1, 0, 3, 2, 33751201]
[8348 refs]
>>>

Why would the first int Python allocates be 33751201? The answer is clear with a little hexification:

>>> hex(_[-1])
'0x20300a1'
[8292 refs]
>>>

Or, if that answer isn't clear, you should unsubscribe from Python-Dev immediately.

From martin@v.loewis.de Fri Jun 14 18:16:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jun 2002 19:16:20 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net> Message-ID:

Guido van Rossum writes:

> IMO that's entirely accidental.
You can use Setup to build either
> form. I would assume you can use setup.py to build either form too,
> but I'm not sure.

Can you please elaborate? I believe that, because of the fragment

case $objs in
*$mod.o*) base=$mod;;
*) base=${mod}module;;
esac

the line

nis nismodule.c -lnsl # Sun yellow pages -- not everywhere

will always cause makesetup to build nismodule.so - if you want to build nis.so, you have to rename the source file. I don't think you can tell setup.py to build nismodule.so. So what do you propose to do to make the resulting shared library name consistent regardless of whether it is built through setup.py or makesetup?

Regards, Martin

From martin@v.loewis.de Fri Jun 14 18:18:07 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jun 2002 19:18:07 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <3D09A388.8080107@lemburg.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> Message-ID:

"M.-A. Lemburg" writes:

> The question is whether we want distutils to be a development
> tool as well, or rather stick to its main purpose: that of
> simplifying distribution and installation of software (and
> thanks to Greg, it's great at that !).

IMO, that's not a question anymore: distutils already *is* a tool used in build and development environments.

Regards, Martin

From martin@v.loewis.de Fri Jun 14 18:19:57 2002 From: martin@v.loewis.de (Martin v.
Loewis) Date: 14 Jun 2002 19:19:57 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <005a01c21388$387cd3e0$0900a8c0@spiff> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID:

"Fredrik Lundh" writes:

> I tend to use an incremental approach, with lots of edit-compile-run
> cycles. I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.
>
> does anyone here know how to do that, without having to resort to
> ugly wrapper batch files/shell scripts?

I usually make a symlink into the build directory. Then, whenever it is rebuilt, the symlink will still be there.

> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function
> happily copies all the status bits. but the remove function refuses to
> remove files that are read-only, even if the files have been created
> by distutils itself...)

That's a bug, IMO.

Regards, Martin

From martin@v.loewis.de Fri Jun 14 18:23:43 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jun 2002 19:23:43 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141213.g5ECDMT31861@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <200206141213.g5ECDMT31861@pcp02138704pcs.reston01.va.comcast.net> Message-ID:

Guido van Rossum writes:

> > I tend to use an incremental approach, with lots of edit-compile-run
> > cycles. I still haven't found a way to get the damn thing to just build
> > my extension and copy it to the current directory, so I can run the
> > test scripts.
>
> Funny, I use an edit-compile-run cycle too, but I don't have the need
> to copy anything to the current directory.

That's because Python treats its own build directory specially: it adds build/something to sys.path when it finds that it is started from the build directory. If you are developing a separate package, all your code ends up in ./build/lib.platform, which is not on sys.path.

Regards, Martin

From skip@pobox.com Fri Jun 14 18:49:04 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 12:49:04 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <3D09A388.8080107@lemburg.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> Message-ID: <15626.11408.660388.360296@12-248-41-177.client.attbi.com>

mal> The question is whether we want distutils to be a development tool
mal> as well, or rather stick to its main purpose: that of simplifying
mal> distribution and installation of software (and thanks to Greg, it's
mal> great at that !).

Thanks for elaborating the distinction. That is exactly what I missed. I really want make+makedepend. I think that's what others have missed as well.
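[Editor's note: Martin's symlink trick above can be scripted with the os module. A sketch under invented paths — on a real project the link target would be something like build/lib.linux-i686-2.2/yourmod.so:]

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
build = os.path.join(workdir, "build")
os.mkdir(build)

# Stand-in for the extension that distutils drops into build/
target = os.path.join(build, "demo.so")
open(target, "w").close()

# One-time symlink from the "current" directory into the build tree
link = os.path.join(workdir, "demo.so")
os.symlink(target, link)

# A rebuild rewrites the file behind the link; the link keeps working
with open(target, "w") as f:
    f.write("rebuilt")
with open(link) as f:
    print(f.read())   # -> rebuilt
```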
Skip From skip@pobox.com Fri Jun 14 19:02:01 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 13:02:01 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15626.12185.867020.437418@12-248-41-177.client.attbi.com> Guido> All we need is for someone to add all the other dependencies to Guido> setup.py. May I humbly propose that this task should be automated? Tools like makedepend have been invented and reinvented many times over precisely because it's too error-prone for humans to maintain that information manually. Switching from Make's syntax to Python's syntax won't make that task substantially easier. (Yes, I realize that backward compatibility is a strong goal so the layout of objects tends to change rarely. I still prefer having correct dependencies.) 
Skip From fredrik@pythonware.com Fri Jun 14 19:03:31 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 Jun 2002 20:03:31 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <01e701c213cd$d148ae60$ced241d5@hagrid> guido wrote: > I thought that the thing to do this was > > python setup.py build_ext -i oh, that's definitely close enough. that's what you get for reading the docs instead of trying every combination of the available options ;-) (maybe someone who knows a little more about distutils could take an hour and add brief overviews of all standard commands to the reference section(s)? just having a list of all commands and command options would have helped me, for sure...) thanks /F From martin@v.loewis.de Fri Jun 14 19:36:33 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 14 Jun 2002 20:36:33 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15626.12185.867020.437418@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <15626.12185.867020.437418@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > May I humbly propose that this task should be automated? 
Tools like > makedepend have been invented and reinvented many times over precisely > because it's too error-prone for humans to maintain that information > manually. They also have been invented and reinvented because the previous tool would not work, just like the next one wouldn't. makedepend is particularly bad: you need to reinvoke makedepend manually whenever you change a file, which is as easy to forget as updating dependency lists whenever you change a file. In addition, makedepend has problems finding out the names of header files used. That said, feel free to contribute patches that automate this task. Regards, Martin From guido@python.org Fri Jun 14 20:01:59 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 15:01:59 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "14 Jun 2002 19:16:20 +0200." References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206141902.g5EJ20e09812@pcp02138704pcs.reston01.va.comcast.net> > > IMO that's entirely accidental. You can use Setup to build either > > form. I would assume you can use setup.py to build either form too, > > but I'm not sure. > > Can you please elaborate? I believe that, because of the fragment > > case $objs in > *$mod.o*) base=$mod;; > *) base=${mod}module;; > esac > > the line > > nis nismodule.c -lnsl # Sun yellow pages -- not everywhere > > will always cause makesetup to build nismodule.so - if you want to > build nis.so, you have to rename the source file. Oops, I was mistaken. > I don't think you can tell setup.py to build nismodule.so. Actually, you can. Just specify "nismodule" as the extension name. Whether you should, I don't know.
> So what do you propose to do to make the resulting shared library name > consistent regardless of whether it is build through setup.py or > makesetup? I don't know if we need consistency, but if we do, I propose that we deprecate the "module" part. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 20:06:50 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 15:06:50 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 12:49:04 CDT." <15626.11408.660388.360296@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> Message-ID: <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> > Thanks for elaborating the distinction. That is exactly what I missed. I > really want make+makedepend. I think that's what others have missed as > well. Sorry Skip, but many others pointed out early on in this discussion that dependency discovery is the important issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 20:09:24 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 15:09:24 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 13:02:01 CDT." 
<15626.12185.867020.437418@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <15626.12185.867020.437418@12-248-41-177.client.attbi.com> Message-ID: <200206141909.g5EJ9Ow09923@pcp02138704pcs.reston01.va.comcast.net> > May I humbly propose that this task should be automated? Tools like > makedepend have been invented and reinvented many times over > precisely because it's too error-prone for humans to maintain that > information manually. Switching from Make's syntax to Python's > syntax won't make that task substantially easier. Unfortunately it's also darn tooting hard to do a good job of discovering dependencies, which is why there is still no standard tool that does this. Makedepend tries, but is still hard to use. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 14 20:11:30 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 15:11:30 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 20:03:31 +0200." 
<01e701c213cd$d148ae60$ced241d5@hagrid> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> <01e701c213cd$d148ae60$ced241d5@hagrid> Message-ID: <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net> > > I thought that the thing to do this was > > > > python setup.py build_ext -i > > oh, that's definitely close enough. > > that's what you get for reading the docs instead of trying > every combination of the available options ;-) > > (maybe someone who knows a little more about distutils > could take an hour and add brief overviews of all standard > commands to the reference section(s)? just having a list > of all commands and command options would have helped > me, for sure...) Instead of bothering with the (mostly) harmless but also mostly unhelpful manuals, try the --help feature. E.g. 
this has the info you want:

$ python setup.py build_ext --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'PyBuildExt' command:
  --build-lib (-b)     directory for compiled extension modules
  --build-temp (-t)    directory for temporary files (build by-products)
  --inplace (-i)       ignore build-lib and put compiled extensions into the
                       source directory alongside your pure Python modules
  --include-dirs (-I)  list of directories to search for header files
                       (separated by ':')
  --define (-D)        C preprocessor macros to define
  --undef (-U)         C preprocessor macros to undefine
  --libraries (-l)     external C libraries to link with
  --library-dirs (-L)  directories to search for external C libraries
                       (separated by ':')
  --rpath (-R)         directories to search for shared C libraries at runtime
  --link-objects (-O)  extra explicit link objects to include in the link
  --debug (-g)         compile/link with debugging information
  --force (-f)         forcibly build everything (ignore file timestamps)
  --compiler (-c)      specify the compiler type
  --swig-cpp           make SWIG create C++ files (default is C)
  --help-compiler      list available compilers

usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help
$

--Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Fri Jun 14 20:47:22 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 14 Jun 2002 12:47:22 -0700 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141909.g5EJ9Ow09923@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 14, 2002 at 03:09:24PM -0400 References: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <15626.12185.867020.437418@12-248-41-177.client.attbi.com> <200206141909.g5EJ9Ow09923@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020614124722.A415@glacier.arctrix.com>

Guido van Rossum wrote:
> Unfortunately it's also darn tooting hard to do a good job of
> discovering dependencies, which is why there is still no standard tool
> that does this. Makedepend tries, but is still hard to use.

ccache is an interesting solution to the problem.

Neil

From Steve Holden" <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> <01e701c213cd$d148ae60$ced241d5@hagrid> <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <052f01c213db$b4f0a2f0$bc778f41@holdenweb.com>

[Fredrik]
> > (maybe someone who knows a little more about distutils
> > could take an hour and add brief overviews of all standard
> > commands to the reference section(s)? just having a list
> > of all commands and command options would have helped
> > me, for sure...)
> [Guido] > Instead of bothering with the (mostly) harmless but also mostly > unhelpful manuals, try the --help feature. E.g. this has the info you > want: > > $ python setup.py build_ext --help [ ... ] It seems like a shame that effort was wasted producing "unhelpful" documentation (and I have to say my experience was similar, but I thought it was just me). The better the docs, the more module and extension authors will use distutils. Is the problem simply too generic for it to be logged as a documentation bug? (A bit like the famous DEC SIR: "VMS 2.0 does not work". DEC's response? "Fixed in next release"). Couldn't find anything in SF. regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From guido@python.org Fri Jun 14 20:49:43 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 15:49:43 -0400 Subject: [Python-Dev] 'new' and 'types' In-Reply-To: Your message of "Fri, 14 Jun 2002 19:15:58 +0300." <20020614191558.A31580@hishome.net> References: <20020614191558.A31580@hishome.net> Message-ID: <200206141949.g5EJnh610636@pcp02138704pcs.reston01.va.comcast.net> > Patch 568629 removes the built-in module new (with sincere apologies > to Tommy Burnette ;-) and replaces it with a tiny Python module > consisting of a single import statement: I'm reviewing it now. It seems it's your patch. Did you forget to mention that in this message? > Now, what about the types module? It has been suggested that this > module should be deprecated. I think it still has some use: we need > a place to put all the types that are not used often enough to be > added to the builtins. I suggest that they be placed in the module > 'types' with names matching their __name__ attribute. The types > module will still have the long MixedCaseType names for backward > compatibility. 
The use of the long names should be deprecated, not > the types module itself. Not a bad idea. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jun 14 20:46:36 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 14:46:36 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15626.18460.685973.605098@12-248-41-177.client.attbi.com> >> Thanks for elaborating the distinction. That is exactly what I >> missed. I really want make+makedepend. I think that's what others >> have missed as well. Guido> Sorry Skip, but many others pointed out early on in this Guido> discussion that dependency discovery is the important issue. Which distutils doesn't do, but for which make and/or compilers have done for years. 
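[Editor's note: the part of make that Skip wants — rebuild when any prerequisite is newer than its target — is a timestamp comparison that fits in a few lines. A sketch, not distutils' actual code (distutils ships similar helpers in distutils.dep_util); the file names and timestamps below are fabricated for the demonstration:]

```python
import os
import tempfile
import time

def needs_rebuild(target, deps):
    # make's core rule: rebuild if the target is missing or any
    # dependency has a newer modification time.
    if not os.path.exists(target):
        return True
    mtime = os.path.getmtime(target)
    return any(os.path.getmtime(d) > mtime for d in deps)

# Tiny demonstration with fabricated files and timestamps:
d = tempfile.mkdtemp()
src = os.path.join(d, "foo.c")
obj = os.path.join(d, "foo.o")
open(src, "w").close()
open(obj, "w").close()

now = time.time()
os.utime(obj, (now, now))            # object built "now"
os.utime(src, (now + 10, now + 10))  # source edited afterwards
print(needs_rebuild(obj, [src]))     # -> True

os.utime(obj, (now + 20, now + 20))  # recompiled
print(needs_rebuild(obj, [src]))     # -> False
```

The hard part the thread keeps circling is not this comparison but discovering the dependency list in the first place.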
Skip From fredrik@pythonware.com Fri Jun 14 20:48:29 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 Jun 2002 21:48:29 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> <01e701c213cd$d148ae60$ced241d5@hagrid> <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <023301c213dc$7d3567a0$ced241d5@hagrid> Guido wrote: > Instead of bothering with the (mostly) harmless but also mostly > unhelpful manuals, try the --help feature. E.g. this has the info you > want: I think I got sidetracked by the --help-commands summary, which sort of implies that build_ext is just a subvariant of build... (maybe we could add a --help-commands-long option that lists both the command names and their descriptions? my brain clearly couldn't execute [--help x for x in commands] without adding an arbitrary if-clause, but I'm sure distutils can do that...) From guido@python.org Fri Jun 14 20:54:55 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 15:54:55 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 14:46:36 CDT." 
<15626.18460.685973.605098@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> Message-ID: <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> > Guido> Sorry Skip, but many others pointed out early on in this > Guido> discussion that dependency discovery is the important issue. > > Which distutils doesn't do, but for which make and/or compilers have > done for years. That same imprecise language again that got you in trouble before! :-) Make doesn't do dependency discovery (beyond the trivial .c -> .o). There may be a few compilers that do this but I don't think it's the norm. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jun 14 20:55:56 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 14:55:56 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15626.18460.685973.605098@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> Message-ID: <15626.19020.867632.353969@12-248-41-177.client.attbi.com> Skip> Which distutils doesn't do, but for which make and/or compilers Skip> have done for years. Bad English, sorry. Should have been "which has been available for make for years". 
Skip From skip@pobox.com Fri Jun 14 20:58:35 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 14:58:35 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15626.19179.464237.382313@12-248-41-177.client.attbi.com> Guido> That same imprecise language again that got you in trouble Guido> before! :-) Yes, I realize that. Guido> Make doesn't do dependency discovery (beyond the trivial .c -> Guido> .o). There may be a few compilers that do this but I don't think Guido> it's the norm. I also realize that. Gcc has had good dependency checking for probably ten years. Sun's C compiler for a similar length of time. Larry Wall did a pretty good job of dependency checking for patch in the mid-80's. Scons does it as well. Skip From guido@python.org Fri Jun 14 21:07:02 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 16:07:02 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 14:58:35 CDT." 
<15626.19179.464237.382313@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> Message-ID: <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> > Gcc has had good dependency checking for probably ten years. How do you invoke this? Maybe we can use this to our advantage. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jun 14 21:12:20 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 15:12:20 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> 
<200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15626.20004.302739.140783@12-248-41-177.client.attbi.com> >> Gcc has had good dependency checking for probably ten years. Guido> How do you invoke this? Maybe we can use this to our advantage. "gcc -M" gives you all dependencies. "gcc -MM" gives you just the stuff included via '#include "file"' and omits the headers included via '#include <file>'. Programmers use <file> and "file" inconsistently enough that it's probably better to just use -M and eliminate the files you don't care about (or leave them in and have Python rebuild automatically after OS upgrades). There are several other variants as well. Search the GCC man page for "-M". It seems to me that distutils' base compiler class could provide a generic makedepend-like method which could be overridden in subclasses where specific compilers have better builtin schemes for dependency generation. Skip From guido@python.org Fri Jun 14 21:19:16 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Jun 2002 16:19:16 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Fri, 14 Jun 2002 15:12:20 CDT."
<15626.20004.302739.140783@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> Message-ID: <200206142019.g5EKJGV10977@pcp02138704pcs.reston01.va.comcast.net> > "gcc -M" gives you all dependencies. "gcc -MM" gives you just the > stuff included via '#include "file"' and omits the headers included > via '#include <file>'. Programmers use <file> and "file" > inconsistently enough that it's probably better to just use -M and > eliminate the files you don't care about (or leave them in and have > Python rebuild automatically after OS upgrades). There are several > other variants as well. Search the GCC man page for "-M". Cool. > It seems to me that distutils' base compiler class could provide a generic > makedepend-like method which could be overridden in subclasses where > specific compilers have better builtin schemes for dependency generation. Care to whip up a patch?
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Fri Jun 14 21:23:17 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 22:23:17 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> Message-ID: <00bb01c213e1$528655a0$475afea9@thomasnotebook> From: "Skip Montanaro" > > >> Gcc has had good dependency checking for probably ten years. > > Guido> How do you invoke this? Maybe we can use this to our advantage. > > "gcc -M" gives you all dependencies. "gcc -MM" gives you just the stuff > included via '#include "file"' and omits the headers included via '#include > <file>'. Programmers use <file> and "file" inconsistently enough that it's > probably better to just use -M and eliminate the files you don't care about > (or leave them in and have Python rebuild automatically after OS upgrades). > There are several other variants as well. Search the GCC man page for "-M".
> > It seems to me that distutils' base compiler class could provide a generic > makedepend-like method which could be overridden in subclasses where > specific compilers have better builtin schemes for dependency generation. > MSVC could do something similar with the /E or /P flag (preprocess to standard out or to file). A simple python filter looking for #line directives could then collect the dependencies. Isn't -E and -P also available in any unixish compiler? Thomas From thomas.heller@ion-tof.com Fri Jun 14 21:27:22 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 14 Jun 2002 22:27:22 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: Message-ID: <00db01c213e1$e3e21890$475afea9@thomasnotebook> > [Tim, on the debug-build sys.getobjects()] Thanks, this is useful info. Seems I have to read the source more often... Thomas From skip@pobox.com Fri Jun 14 21:33:31 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 14 Jun 2002 15:33:31 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <00bb01c213e1$528655a0$475afea9@thomasnotebook> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> 
<00bb01c213e1$528655a0$475afea9@thomasnotebook> Message-ID: <15626.21275.105157.150399@12-248-41-177.client.attbi.com> Thomas> MSVC could do something similar with the /E or /P flag Thomas> (preprocess to standard out or to file). A simple python filter Thomas> looking for #line directives could then collect the Thomas> dependencies. Isn't -E and -P also available in any unixish Thomas> compiler? Yes. I believe this is how some makedepend scripts work. Skip From ask@valueclick.com Fri Jun 14 21:47:04 2002 From: ask@valueclick.com (Ask Bjoern Hansen) Date: Fri, 14 Jun 2002 13:47:04 -0700 (PDT) Subject: [Python-Dev] Quota on sf.net In-Reply-To: Message-ID: On 10 Jun 2002, Martin v. Löwis wrote: > My recommendation would be to disable the script, and remove the > snapshots, perhaps leaving a page that anybody who wants the snapshots > should ask at python-dev to re-enable them. feel free to refer people to: http://cvs.perl.org/snapshots/python/ I'll keep about half a week's worth of 6-hourly snapshots there, like we do for parrot at http://cvs.perl.org/snapshots/parrot/ - ask -- ask bjoern hansen, http://ask.netcetera.dk/ !try; do(); From akuchlin@mems-exchange.org Fri Jun 14 21:57:44 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 14 Jun 2002 16:57:44 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <052f01c213db$b4f0a2f0$bc778f41@holdenweb.com> References: <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> <01e701c213cd$d148ae60$ced241d5@hagrid> <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net> <052f01c213db$b4f0a2f0$bc778f41@holdenweb.com> Message-ID: <20020614205744.GA12086@ute.mems-exchange.org> On Fri, Jun 14, 2002 at 03:43:04PM -0400, Steve Holden wrote: >It seems like a
shame that effort was wasted producing "unhelpful" >documentation (and I have to say my experience was similar, but I thought it >was just me). The better the docs, the more module and extension authors >will use distutils. Part of it is not having an idea of what tasks people commonly need to do with Distutils. --amk From tim.one@comcast.net Fri Jun 14 22:23:31 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 14 Jun 2002 17:23:31 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <01e701c213cd$d148ae60$ced241d5@hagrid> Message-ID: [/F] > ... > (maybe someone who knows a little more about distutils > could take an hour and add brief overviews of all standard > commands to the reference section(s)? just having a list > of all commands and command options would have helped > me, for sure...) Me too, except that it still would . The docs do a fine job of explaining the framework, but it turns out every option I actually have to use gets extracted from one of my coworkers at the point of tears . From gmcm@hypernet.com Fri Jun 14 23:25:43 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 14 Jun 2002 18:25:43 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net> References: Your message of "Fri, 14 Jun 2002 20:03:31 +0200." <01e701c213cd$d148ae60$ced241d5@hagrid> Message-ID: <3D0A3527.4320.90867436@localhost> > $ python setup.py build_ext --help > Global options: ... > --dry-run (-n) don't actually do anything Last time I tried that with a package, it went ahead and installed itself anyway. 
-- Gordon http://www.mcmillan-inc.com/ From neal@metaslash.com Fri Jun 14 21:13:12 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 14 Jun 2002 16:13:12 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D0A4E58.760E3D40@metaslash.com> Guido van Rossum wrote: > > > Gcc has had good dependency checking for probably ten years. > > How do you invoke this? Maybe we can use this to our advantage. Here's a bunch of useful options.

    gcc --help -v | grep -e -M
    -M    Generate make dependencies
    -MM   As -M, but ignore system header files
    -MF   Write dependency output to the given file
    -MG   Treat missing header file as generated files
    -MP   Generate phony targets for all headers
    -MQ   Add a MAKE-quoted target
    -MT   Add an unquoted target
    -MD   Print dependencies to FILE.d
    -MMD  Print dependencies to FILE.d
    -M    Print dependencies to stdout
    -MM   Print dependencies to stdout

From martin@v.loewis.de Fri Jun 14 23:38:41 2002 From: martin@v.loewis.de (Martin v.
Loewis) Date: 15 Jun 2002 00:38:41 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15626.20004.302739.140783@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > "gcc -M" gives you all dependencies. "gcc -MM" gives you just the stuff > included via '#include "file"' and omits the headers included via '#include > <file>'. Both options are somewhat obsolete. They require a separate invocation of the compiler to output the dependencies, since they write the dependencies to stdout; they can't do compilation at the same time. It is much better if compilation of a file updates the dependency information as a side effect. For that, gcc has supported -MD/-MMD since 1989; this generates dependencies in a file obtained by replacing the .o extension of the target with .d. SunPRO supports generation of dependency files also as a separate compiler invocation. It also supports the undocumented environment variable SUNPRO_DEPENDENCIES, which allows specification of the dependency file, along with specification of directories.
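For illustration, make-style dependency output of this kind is simple to consume from Python. A hypothetical sketch — the function name and the simplistic parsing are assumptions for illustration, not distutils code, and real .d files need more care (e.g. escaped spaces in paths):

```python
def parse_depfile(text):
    """Parse make-style dependency output such as 'gcc -M'/'-MD' emits.

    Returns a dict mapping each target to its list of prerequisites.
    Illustrative sketch only; not actual distutils code.
    """
    text = text.replace("\\\n", " ")   # join backslash-newline continuations
    deps = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        deps[target.strip()] = prereqs.split()
    return deps

# Example with the kind of output gcc -M produces:
sample = "foo.o: foo.c \\\n  foo.h /usr/include/stdio.h\n"
print(parse_depfile(sample))  # {'foo.o': ['foo.c', 'foo.h', '/usr/include/stdio.h']}
```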
GCC also supports SUNPRO_DEPENDENCIES, so this is the most effective and portable way to get dependency file generation. Regards, Martin From martin@v.loewis.de Fri Jun 14 23:45:40 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jun 2002 00:45:40 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141902.g5EJ20e09812@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net> <200206141902.g5EJ20e09812@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > > I don't think you can tell setup.py to build nismodule.so. > > Actually, you can. Just specify "nismodule" as the extension name. That won't work. setup.py tries to import "md5module", which fails since md5module.so has no function initmd5module. > I don't know if we need consistency, but if we do, I propose that we > deprecate the "module" part. Ok, I'll try to remove the feature that makesetup adds "module". 
Regards, Martin From nhodgson@bigpond.net.au Sat Jun 15 02:11:36 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sat, 15 Jun 2002 11:11:36 +1000 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <025f01c21409$98c42dd0$3da48490@neil> Guido: > Make doesn't do dependency discovery (beyond the trivial .c -> .o). > There may be a few compilers that do this but I don't think it's the > norm. Borland make does in conjunction with the compiler including header dependencies in the object file. Thus there is no need for dependency generation options like gcc's and no such options are provided. It's differences in functionality like this that will cause problems with moving towards greater use of make. Neil From skip@pobox.com Sat Jun 15 15:51:25 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 15 Jun 2002 09:51:25 -0500 Subject: [Python-Dev] unicode() and its error argument Message-ID: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> The unicode() builtin accepts an optional third argument, errors, which defaults to "strict". According to the docs if errors is set to "ignore", decoding errors are silently ignored. I seem to still get the occasional UnicodeError exception, however.
I'm still trying to track down an actual example (it doesn't happen often, and I hadn't wrapped unicode() in a try/except statement, so all I saw was the error raised, not the input string value). This reminds me, it occurred to me the other day that a plain text version of cgitb would be useful to use for non-web scripts. You'd get a lot more context about the environment in which the exception was raised. Skip From Oleg Broytmann Sat Jun 15 15:58:42 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Sat, 15 Jun 2002 18:58:42 +0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>; from skip@pobox.com on Sat, Jun 15, 2002 at 09:51:25AM -0500 References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> Message-ID: <20020615185842.D12705@phd.pp.ru> On Sat, Jun 15, 2002 at 09:51:25AM -0500, Skip Montanaro wrote: > The unicode() builtin accepts an optional third argument, errors, which > defaults to "strict". According to the docs if errors is set to "ignore", > decoding errors are silently ignored. I seem to still get the occasional > UnicodeError exception, however. I got the error very often (but I use encoding conversion much more often than you). First time I saw it I was very surprised that neither "ignore" nor "replace" can eliminate the error. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido@python.org Sat Jun 15 16:03:53 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 15 Jun 2002 11:03:53 -0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: Your message of "Sat, 15 Jun 2002 09:51:25 CDT."
<15627.21613.94336.985634@12-248-41-177.client.attbi.com> References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> Message-ID: <200206151503.g5FF3rJ16446@pcp02138704pcs.reston01.va.comcast.net> > The unicode() builtin accepts an optional third argument, errors, > which defaults to "strict". According to the docs if errors is set > to "ignore", decoding errors are silently ignored. I seem to still > get the occasional UnicodeError exception, however. I'm still > trying to track down an actual example (it doesn't happen often, and > I hadn't wrapped unicode() in a try/except statement, so all I saw > was the error raised, not the input string value). This is between you and MAL. :-) > This reminds me, it occurred to me the other day that a plain text > version of cgitb would be useful to use for non-web scripts. You'd > get a lot more context about the environment in which the exception > was raised. Not a bad idea. I think it could live in the traceback module, possibly as a family of functions named "fancy_traceback" and similar. Care to do a patch? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jun 15 16:05:12 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 15 Jun 2002 11:05:12 -0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: Your message of "Sat, 15 Jun 2002 18:58:42 +0400." <20020615185842.D12705@phd.pp.ru> References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <20020615185842.D12705@phd.pp.ru> Message-ID: <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net> > I got the error very often (but I use encoding conversion much more > often than you). First time I saw it I was very surprized that neither > "ignore" nor "replace" can eliminate the error. Got an example? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From Oleg Broytmann Sat Jun 15 16:04:41 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Sat, 15 Jun 2002 19:04:41 +0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jun 15, 2002 at 11:05:12AM -0400 References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <20020615185842.D12705@phd.pp.ru> <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020615190441.E12705@phd.pp.ru> On Sat, Jun 15, 2002 at 11:05:12AM -0400, Guido van Rossum wrote: > > I got the error very often (but I use encoding conversion much more > > often than you). First time I saw it I was very surprized that neither > > "ignore" nor "replace" can eliminate the error. > > Got an example? Not right now... I'll send it when I get one. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From jon+python-dev@unequivocal.co.uk Sat Jun 15 16:44:54 2002 From: jon+python-dev@unequivocal.co.uk (Jon Ribbens) Date: Sat, 15 Jun 2002 16:44:54 +0100 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>; from skip@pobox.com on Sat, Jun 15, 2002 at 09:51:25AM -0500 References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> Message-ID: <20020615164454.B4842@snowy.squish.net> Skip Montanaro wrote: > This reminds me, it occurred to me the other day that a plain text version > of cgitb would be useful to use for non-web scripts. You'd get a lot more > context about the environment in which the exception was raised. I have code adapted from cgitb in my jonpy cgi module which simultaneously does text and optional html fancy tracebacks. 
See the function "traceback" in: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/jonpy/jonpy/jon/cgi.py the 'req.error()' calls are doing the text traceback, simply remove the stuff that says 'if html' to remove the html traceback and then do a search&replace from req.error to out.write() or something. From tim.one@comcast.net Sat Jun 15 17:21:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 15 Jun 2002 12:21:03 -0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> Message-ID: [Skip Montanaro] > The unicode() builtin accepts an optional third argument, errors, which > defaults to "strict". According to the docs if errors is set to "ignore", > decoding errors are silently ignored. I seem to still get the occasional > UnicodeError exception, however. I'm still trying to track down an actual > example (it doesn't happen often, and I hadn't wrapped unicode() in a > try/except statement, so all I saw was the error raised, not the input > string value). Play with this:

"""
def generrors(encoding, errors, maxlen, maxtries):
    from random import choice, randint
    bytes = [chr(i) for i in range(256)]
    paste = ''.join
    for dummy in xrange(maxtries):
        n = randint(1, maxlen)
        raw = paste([choice(bytes) for dummy in range(n)])
        try:
            u = unicode(raw, encoding, errors)
        except UnicodeError, detail:
            print 'fail w/ errors', errors, '- raw data', repr(raw)
            print '    UnicodeError', str(detail)

errors = ('strict', 'replace', 'ignore')
generrors('mac-turkish', errors[2], 10, 1000)
"""

Plug in your favorite encoding and let it do the work of finding examples. It generates plenty of errors with 'strict', but so far I haven't seen it generate one with 'replace' or 'ignore'.
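A quick way to see the three error handlers side by side, sketched in later-Python syntax where unicode(raw, enc, errors) is spelled raw.decode(enc, errors); the byte string and codec here are just illustrative choices:

```python
# Minimal sketch of 'strict' vs 'ignore' vs 'replace' (modern spelling).
raw = b"abc\xffdef"          # 0xFF is not valid ASCII

try:
    raw.decode("ascii", "strict")
except UnicodeDecodeError:
    print("strict: raised")  # 'strict' raises on the bad byte

print(raw.decode("ascii", "ignore"))   # bad byte dropped: abcdef
print(raw.decode("ascii", "replace"))  # bad byte becomes U+FFFD
```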
From gward@python.net Sat Jun 15 18:31:14 2002 From: gward@python.net (Greg Ward) Date: Sat, 15 Jun 2002 13:31:14 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> Message-ID: <20020615173114.GA8981@gerg.ca> On 14 June 2002, Martin v. Loewis said: > Skip Montanaro writes: > > > Paul> I guess most of us don't understand the benefits because we don't > > Paul> see dependency tracking as necessarily that difficult. It's no > > Paul> harder than the new method resolution order. ;) > > > > If it's not that difficult why isn't it being done? > > You are wrong assuming it is not done. distutils does dependency > analysis since day 1. Only insofar as foo.o depends on foo.c. The header file stuff Jeremy has been adding sounds like a very useful addition (haven't actually inspected his patches yet). Greg -- Greg Ward - Unix bigot gward@python.net http://starship.python.net/~gward/ Monday is an awful way to spend one seventh of your life. From gward@python.net Sat Jun 15 18:36:39 2002 From: gward@python.net (Greg Ward) Date: Sat, 15 Jun 2002 13:36:39 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <005a01c21388$387cd3e0$0900a8c0@spiff> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID: <20020615173639.GB8981@gerg.ca> On 14 June 2002, Fredrik Lundh said: > alex wrote: > > > The "problem" (:-) is that it's great at just building extensions, too. 
> > > > python2.1 setup.py install, python2.2 setup.py install, python2.3 setup.py > > install, and hey pronto, I have my extension built and installed on all > > Python versions I want to support, ready for testing. Hard to beat!-) > > does your code always work right away? If we're talking about a downloaded third party extension -- the main use case for the Distutils -- one certainly hopes so! It's only a happy accident that the Distutils are moderately useful for building/development. > I tend to use an incremental approach, with lots of edit-compile-run > cycles. I still haven't found a way to get the damn thing to just build > my extension and copy it to the current directory, so I can run the > test scripts. Last time I checked: python setup.py build_ext --inplace > (distutils is also a pain to use with a version management system > that marks files in the repository as read-only; distutils copy function > happily copies all the status bits. but the remove function refuses to > remove files that are read-only, even if the files have been created > by distutils itself...) Yeah, that's a stupid situation. I'm sure there are "XXX" comments in the code where I ponder the wisdom of preserving mtime and mode. Greg -- Greg Ward - just another Python hacker gward@python.net http://starship.python.net/~gward/ From gmcm@hypernet.com Sat Jun 15 20:00:06 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 15 Jun 2002 15:00:06 -0400 Subject: [Python-Dev] SF bug count Message-ID: <3D0B5676.8419.94F0937E@localhost> Hi all, SF's "tracker" page http://sourceforge.net/tracker/?group_id=5470 says there are a total of 2581 bugs. Using a url template of (broken so as to be readable): bugurlfmt = http://sourceforge.net/tracker/index.php ?group_id=5470 &atid=105470 &set=custom &_assigned_to=100 &_status=100 &_category=100 &_group=100 &order=artifact_id &sort=ASC&offset=%d to get 51 at a time, I get only 636. Whose bug? 
-- Gordon http://www.mcmillan-inc.com/ From niemeyer@conectiva.com Sat Jun 15 20:08:31 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 15 Jun 2002 16:08:31 -0300 Subject: [Python-Dev] mkdev, major, st_rdev, etc Message-ID: <20020615160831.A5440@ibook.distro.conectiva> After thinking for a while, and doing some research about these functions, I've changed my mind about the best way to implement the needed functionality for tarfile. Maybe including major, minor, and makedev is the best solution. Some of the issues I'm considering: - st_rdev was already available in 2.2, so we'd have to introduce a new redundant pair attribute to provide a (major, minor) pair. - mkdev would be able to use the standard posix format, and would work regardless of makedev's availability (mkdev is being introduced in 2.3). - more flexible. major, minor, and makedev may be needed in other cases, besides st_rdev parsing and mknod device creation. - TYPES.py is already trying to provide them, but it's broken (indeed, it's more broken than that. h2py should use cpp to preprocess the files, but that's something for another occasion). - these "functions" are usually macros, thus should introduce little overhead. A patch providing these functions is available at http://www.python.org/sf/569139 -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From martin@v.loewis.de Sat Jun 15 22:00:02 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jun 2002 23:00:02 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <20020615173114.GA8981@gerg.ca> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> <20020615173114.GA8981@gerg.ca> Message-ID: Greg Ward writes: > > You are wrong assuming it is not done. 
distutils does dependency > > analysis since day 1. > > Only insofar as foo.o depends on foo.c. The header file stuff Jeremy > has been adding sounds like a very useful addition (haven't actually > inspected his patches yet). Certainly true. However, the makefiles that Skip wanted to generate would not have offered anything beyond "foo.o depends on foo.c". He then recognized that dependencies are essential, here, and suggested makedepend... Regards, Martin From martin@v.loewis.de Sat Jun 15 22:06:06 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Jun 2002 23:06:06 +0200 Subject: [Python-Dev] SF bug count In-Reply-To: <3D0B5676.8419.94F0937E@localhost> References: <3D0B5676.8419.94F0937E@localhost> Message-ID: "Gordon McMillan" writes: > to get 51 at a time, I get only 636. > > Whose bug? Yours; you count only unassigned reports. assigned_to=0 gives you some more. Regards, Martin From gmcm@hypernet.com Sat Jun 15 22:11:40 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 15 Jun 2002 17:11:40 -0400 Subject: [Python-Dev] SF bug count In-Reply-To: <3D0B5676.8419.94F0937E@localhost> Message-ID: <3D0B754C.4256.956906F7@localhost> On 15 Jun 2002 at 15:00, Gordon McMillan wrote: Hmmph. For any filterable column *except* _assigned_to, "100" means "any". For _assigned_to, the magic number is "0". > SF's "tracker" page > http://sourceforge.net/tracker/?group_id=5470 > says there are a total of 2581 bugs. > > Using a url template of (broken so as to be > readable): > bugurlfmt = > http://sourceforge.net/tracker/index.php > ?group_id=5470 > &atid=105470 > &set=custom > &_assigned_to=100 > &_status=100 > &_category=100 > &_group=100 > &order=artifact_id > &sort=ASC&offset=%d > > to get 51 at a time, I get only 636. > > Whose bug? Their design bug is my implementation bug. -- Gordon http://www.mcmillan-inc.com/ From martin@v.loewis.de Sat Jun 15 22:22:19 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 15 Jun 2002 23:22:19 +0200 Subject: [Python-Dev] mkdev, major, st_rdev, etc In-Reply-To: <20020615160831.A5440@ibook.distro.conectiva> References: <20020615160831.A5440@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > - mkdev would be able to use the standard posix format, and would > work regardless of makedev's availability (mkdev is being > introduced in 2.3). Notice that this interface is *not* part of the Posix spec. http://www.opengroup.org/onlinepubs/007904975/functions/mknod.html says that the only portable use of mknod is to create FIFOs; any use where dev is not null is unspecified. Furthermore, major and minor are not part of Posix. > - TYPES.py is already trying to provide them, but it's broken > (indeed, it's more broken than that. h2py should use cpp to > preprocess the files, but that's something for another occasion). That cannot work: the preprocessor will eat the macro definition, and you have no way to find out what its body was. > A patch providing these functions is available at > http://www.python.org/sf/569139 I wonder whether the additional TRY_COMPILE test is really necessary. Isn't it sufficient to restrict attention to systems on which major and minor are macros, and use #ifdef major inside posixmodule.c? Regards, Martin From niemeyer@conectiva.com Sat Jun 15 23:21:46 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 15 Jun 2002 19:21:46 -0300 Subject: [Python-Dev] mkdev, major, st_rdev, etc In-Reply-To: References: <20020615160831.A5440@ibook.distro.conectiva> Message-ID: <20020615192146.A5978@ibook.distro.conectiva> Hello Martin! > > - mkdev would be able to use the standard posix format, and would > > work regardless of makedev's availability (mkdev is being > > introduced in 2.3). > > Notice that this interface is *not* part of the Posix spec. Please, notice that what I said is that *mknod* would be able to use the standard posix format.
In other words, instead of

mknod(name, mode, major, minor)

it can become:

mknod(name, mode, device)

which *is* posix compliant, so your note seems to be another reason for us to use the newly proposed system. > http://www.opengroup.org/onlinepubs/007904975/functions/mknod.html > > says that the only portable use of mknod is to create FIFOs; any use > where dev is not null is unspecified. Furthermore, major and minor are > not part of Posix. Indeed. But I think we need this functionality nevertheless, since that's the only way to create special devices in Linux and other systems. Otherwise we won't be able to have modules like tarfile.py which have to rebuild them. Besides that, os is meant to have operating-system-specific functionality, isn't it? > > - TYPES.py is already trying to provide them, but it's broken > > (indeed, it's more broken than that. h2py should use cpp to > > preprocess the files, but that's something for another occasion). > > That cannot work: the preprocessor will eat the macro definition, and > you have no way to find out what its body was. That's what I meant:

[niemeyer@ibook dist]$ cpp -dM /usr/include/sys/types.h
#define __LITTLE_ENDIAN 1234
#define BYTE_ORDER __BYTE_ORDER
#define powerpc 1
#define __linux__ 1
#define LITTLE_ENDIAN __LITTLE_ENDIAN
#define FD_SET(fd, fdsetp) __FD_SET (fd, fdsetp)
[...]

> > A patch providing these functions is available at > > http://www.python.org/sf/569139 > > I wonder whether the additional TRY_COMPILE test is really > necessary. Isn't it sufficient to restrict attention to systems on > which major and minor are macros, and use > > #ifdef major > > inside posixmodule.c? I'm not sure these functions are macros on every system. If that's true, and makedev is always available with major, you're completely right. My purpose in including a TRY_COMPILE was to take a safe approach, based on a review of the way autoconf checks whether makedev is available. Martin, thanks for your review!
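[Editorial note: functions along the lines discussed here did end up in the os module — major(), minor(), makedev(), and a device argument to mknod(). A short sketch of how the pieces fit together; availability is Unix-only and, as Martin points out, none of it is guaranteed by POSIX:]

```python
import os
import stat

# Pack a (major, minor) pair into a single device number, then split it
# back apart; major() and minor() are the inverses of makedev().
dev = os.makedev(8, 1)          # on Linux, block device 8:1 is /dev/sda1
assert os.major(dev) == 8
assert os.minor(dev) == 1

# Creating the node itself uses the POSIX-shaped three-argument form.
# It requires root privileges, so it is only shown here, not executed:
# os.mknod('/tmp/fake-sda1', stat.S_IFBLK | 0o600, dev)
```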
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From guido@python.org Sun Jun 16 02:23:44 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 15 Jun 2002 21:23:44 -0400 Subject: [Python-Dev] SF bug count In-Reply-To: Your message of "Sat, 15 Jun 2002 15:00:06 EDT." <3D0B5676.8419.94F0937E@localhost> References: <3D0B5676.8419.94F0937E@localhost> Message-ID: <200206160123.g5G1NiJ23638@pcp02138704pcs.reston01.va.comcast.net> > SF's "tracker" page > http://sourceforge.net/tracker/?group_id=5470 > says there are a total of 2581 bugs. > > Using a url template of (broken so as to be > readable): > bugurlfmt = http://sourceforge.net/tracker/index.php > ?group_id=5470 > &atid=105470 > &set=custom > &_assigned_to=100 > &_status=100 > &_category=100 > &_group=100 > &order=artifact_id > &sort=ASC&offset=%d > > to get 51 at a time, I get only 636. > > Whose bug? I assume yours -- I tried manually clicking on the "Next 50" link and got bored after 20 clicks or over 1000 bugs. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Sun Jun 16 02:48:28 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 15 Jun 2002 20:48:28 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> <20020615173114.GA8981@gerg.ca> Message-ID: <15627.61036.77500.635415@12-248-41-177.client.attbi.com> Martin> Certainly true. However, the makefiles that Skip wanted to Martin> generate would not have offered anything beyond "foo.o depends Martin> on foo.c". He then recognized that dependencies are essential, Martin> here, and suggested makedepend... Please don't put words into my mouth (or thoughts into my brain)? 
I have used make+makedepend for a long time and tend to think of them as inseparable. I was certainly thinking in terms of .o:.h dependencies. That is, after all, what my example demonstrated was missing. Skip From skip@pobox.com Sun Jun 16 05:53:57 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 15 Jun 2002 23:53:57 -0500 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <200206151503.g5FF3rJ16446@pcp02138704pcs.reston01.va.comcast.net> References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <200206151503.g5FF3rJ16446@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15628.6629.292906.585819@12-248-41-177.client.attbi.com> >> This reminds me, it occurred to me the other day that a plain text >> version of cgitb would be useful to use for non-web scripts. You'd >> get a lot more context about the environment in which the exception >> was raised. Guido> Not a bad idea. I think it could live in the traceback module, Guido> possibly as a family of functions named "fancy_traceback" and Guido> similar. Care to do a patch? I just submitted a patch done differently than you suggested. I simply added a text() formatting routine to cgitb.py and an extra 'format' argument to cgitb.enable(). Now, if you want plain text output, just call enable() like so:

import cgitb
cgitb.enable(format="text")

I think I muffed the HTML formatting (there was an odd little bit of logic in there I believe I might have botched). I'll take another look at that and submit a revised patch if necessary and include a little doc update. For the curious, the patch is at http://python.org/sf/569574 Guido expressed interest, so I assigned it to him. ;-) Skip From mal@lemburg.com Sun Jun 16 11:48:49 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sun, 16 Jun 2002 12:48:49 +0200 Subject: [Python-Dev] unicode() and its error argument References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> Message-ID: <3D0C6D11.5090200@lemburg.com> Skip Montanaro wrote: > The unicode() builtin accepts an optional third argument, errors, which > defaults to "strict". According to the docs if errors is set to "ignore", > decoding errors are silently ignored. I seem to still get the occasional > UnicodeError exception, however. I'm still trying to track down an actual > example (it doesn't happen often, and I hadn't wrapped unicode() in a > try/except statement, so all I saw was the error raised, not the input > string value). The error argument is passed on to the codec you request. It's the codec that decides how to implement the error handling, not the unicode() builtin, so if you're seeing errors with 'ignore' then this is probably the result of some problem in the codec. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From skip@pobox.com Sun Jun 16 15:15:35 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 16 Jun 2002 09:15:35 -0500 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <3D0C6D11.5090200@lemburg.com> References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <3D0C6D11.5090200@lemburg.com> Message-ID: <15628.40327.92855.344184@12-248-41-177.client.attbi.com> >> According to the docs if errors is set to "ignore", decoding errors >> are silently ignored. I seem to still get the occasional >> UnicodeError exception, however. mal> The error argument is passed on to the codec you request ... so if mal> you're seeing errors with 'ignore' then this is probably the result mal> of some problem in the codec. Thanks. 
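[Editorial note: MAL's point — that the errors argument is simply forwarded to the codec, which decides what each policy means — is easy to demonstrate. A sketch in present-day spelling, where unicode(raw, enc, errors) is written raw.decode(enc, errors):]

```python
raw = b'abc\xff\xfedef'   # 0xFF and 0xFE can never occur in valid UTF-8

# 'strict' (the default): the codec raises UnicodeDecodeError
try:
    raw.decode('utf-8')
except UnicodeDecodeError as e:
    print('strict:', e.reason)

# The codec, not the caller, implements the other policies:
print(raw.decode('utf-8', 'ignore'))   # bad bytes dropped: 'abcdef'
print(raw.decode('utf-8', 'replace'))  # bad bytes become U+FFFD
```

So an exception that survives errors='ignore' points at the codec implementation (or at bad arguments), not at unicode() itself.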
I've been running my test program a lot the past couple of days. I think I squelched a couple bugs in my own code that may have been causing the problem. I thought it was because some non-string args were sneaking into the call, but that appears not to be the case either (the error message in that case is different than what I was seeing). Tim's inability to provoke errors was also suggestive that it was pilot error, not a problem with the plane. I'll keep my eye on things and let you know if anything else appears. Skip P.S. Happy Father's Day all you dads out there. From skip@mojam.com Sun Jun 16 17:14:25 2002 From: skip@mojam.com (Skip Montanaro) Date: Sun, 16 Jun 2002 11:14:25 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200206161614.g5GGEPZ24272@12-248-41-177.client.attbi.com> Bug/Patch Summary ----------------- 254 open / 2582 total bugs (-12) 128 open / 1554 total patches (-3) New Bugs -------- tarball to untar into a single dir (2002-06-11) http://python.org/sf/567576 whatsnew explains noargs incorrectly (2002-06-11) http://python.org/sf/567607 Various Playstation 2 Linux Test Errors (2002-06-12) http://python.org/sf/567892 urllib needs 303/307 handlers (2002-06-12) http://python.org/sf/568068 asynchat module undocumented (2002-06-12) http://python.org/sf/568134 Misleading string constant. 
(2002-06-12) http://python.org/sf/568269 socket module htonl/ntohl bug (2002-06-12) http://python.org/sf/568322 minor improvement to Grammar file (2002-06-13) http://python.org/sf/568412 __slots__ attribute and private variable (2002-06-14) http://python.org/sf/569257 cgi.py broken with xmlrpclib on Python 2 (2002-06-15) http://python.org/sf/569316 LINKCC incorrectly set (2002-06-16) http://python.org/sf/569668 New Patches ----------- unicode in sys.path (2002-06-10) http://python.org/sf/566999 GetFInfo update (2002-06-11) http://python.org/sf/567296 A different patch for python-mode vs gdb (2002-06-11) http://python.org/sf/567468 Add param to email.Utils.decode() (2002-06-12) http://python.org/sf/568348 Convert slice and buffer to types (2002-06-13) http://python.org/sf/568544 gettext module charset changes (2002-06-13) http://python.org/sf/568669 Implementation of major, minor and makedev (2002-06-14) http://python.org/sf/569139 names in types module (2002-06-15) http://python.org/sf/569328 plain text enhancement for cgitb (2002-06-15) http://python.org/sf/569574 Closed Bugs ----------- tuple __getitem__ limited (2001-09-06) http://python.org/sf/459235 str, __getitem__ and slices (2001-10-23) http://python.org/sf/473985 string.{starts,ends}with vs slices (2001-12-16) http://python.org/sf/493951 HTMLParser fail to handle '&foobar' (2002-01-06) http://python.org/sf/500073 markupbase handling of HTML declarations (2002-01-19) http://python.org/sf/505747 Recursive class instance "error" (2002-03-20) http://python.org/sf/532646 fcntl module with wrong module for ioctl (2002-04-30) http://python.org/sf/550777 deepcopy can't handle custom metaclasses (2002-05-26) http://python.org/sf/560794 Assertion with very long lists (2002-05-29) http://python.org/sf/561858 test_signal.py fails on FreeBSD-4-stable (2002-05-29) http://python.org/sf/562188 xmlrpclib.Binary.data undocumented (2002-05-31) http://python.org/sf/562878 Clarify documentation for inspect (2002-06-01) 
http://python.org/sf/563273 os.tmpfile should use w+b, not w+ (2002-06-02) http://python.org/sf/563750 getttext defaults with unicode (2002-06-03) http://python.org/sf/563915 IDLE needs printing (2002-06-06) http://python.org/sf/565373 urllib FancyURLopener.__init__ / urlopen (2002-06-06) http://python.org/sf/565414 string.replace() can corrupt heap (2002-06-07) http://python.org/sf/565993 telnetlib makes Python dump core (2002-06-07) http://python.org/sf/566006 rotormodule's set_key calls strlen (2002-06-10) http://python.org/sf/566859 Typo in "What's new in Python 2.3" (2002-06-10) http://python.org/sf/566869 Closed Patches -------------- experimental support for extended slicing on lists (2000-07-27) http://python.org/sf/400998 getopt with GNU style scanning (2001-10-21) http://python.org/sf/473512 AtheOS port of Python 2.2b2 (2001-12-02) http://python.org/sf/488073 Silence AIX C Compiler Warnings. (2002-03-21) http://python.org/sf/533070 Floating point issues in body of text (2002-04-26) http://python.org/sf/548943 Deprecate bsddb (2002-05-06) http://python.org/sf/553108 Prevent duplicates in readline history (2002-05-30) http://python.org/sf/562492 First patch: start describing types... (2002-05-30) http://python.org/sf/562529 posixmodule.c RedHat 6.1 (bug #535545) (2002-06-03) http://python.org/sf/563954 error in weakref.WeakKeyDictionary (2002-06-04) http://python.org/sf/564549 modulefinder and string methods (2002-06-05) http://python.org/sf/564840 fix bug in shutil.rmtree exception case (2002-06-09) http://python.org/sf/566517 From tim.one@comcast.net Sun Jun 16 17:41:23 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 16 Jun 2002 12:41:23 -0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <15628.40327.92855.344184@12-248-41-177.client.attbi.com> Message-ID: [Skip Montanaro] > ... > Tim's inability to provoke errors was also suggestive that it was pilot > error, not a problem with the plane. 
Ya, but what do I know about encodings? "Nothing" is right -- that's why I wrote a program to generate stuff at random. Taking that another step, to generate the encoding at random too, turns up at least one way to crash Python: the attached program eventually crashes when doing a utf7 decode. It appears to be in this line:

if ((ch == '-') || !B64CHAR(ch)) {

and ch "is big" when it blows up. I assume this is because B64CHAR(ch) expands in part to isalnum(ch), and on Windows the latter is done via array lookup (and ch is out-of-bounds). Other failures I've seen out of this are benign, like

>>> unicode('\xf1R\x7f^C\x1e\xd8', 'hex_codec', 'ignore')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\CODE\PYTHON\lib\encodings\hex_codec.py", line 41, in hex_decode
    assert errors == 'strict'
AssertionError
>>>

from random import choice, randint
from traceback import print_exc

bytes = [chr(i) for i in range(256)]
paste = ''.join

def generrors(encoding, errors, maxlen, maxtries):
    for dummy in xrange(maxtries):
        n = randint(1, maxlen)
        raw = paste([choice(bytes) for dummy in range(n)])
        try:
            u = unicode(raw, encoding, errors)
        except:
            print 'failure in unicode(%r, %r, %r)' % (raw, encoding, errors)
            print_exc(0)
            return 1
    return 0

from encodings.aliases import aliases
unique = aliases.values()
unique = dict(zip(unique, unique)).keys()
while unique:
    e = choice(unique)
    print
    print 'Trying', e
    if generrors(e, 'ignore', 10, 1000):
        unique.remove(e)

From mgilfix@eecs.tufts.edu Sun Jun 16 18:24:45 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Sun, 16 Jun 2002 13:24:45 -0400 Subject: [Python-Dev] test_socket failures In-Reply-To: <200206122059.g5CKxQa16372@odiug.zope.com>; from guido@python.org on Wed, Jun 12, 2002 at 04:59:26PM -0400 References: <200206122059.g5CKxQa16372@odiug.zope.com> Message-ID: <20020616132445.C23809@eecs.tufts.edu> Ok.
I just submitted a patch to SF: http://www.python.org/sf/569697 that fixes the race conditions in test_socket.py (and also documents ThreadedTest). I've done a make test and it seems to pass through the regression test. So please test this out on Windows for me, Guido. -- Mike On Wed, Jun 12 @ 16:59, Guido van Rossum wrote: > I know that there are problems with the two new socket tests: > test_timeout and test_socket. The problems are varied: the tests > assume network access and a working and consistent DNS, they assume > predictable timing, and there is a number of Windows-specific failures > that I'm trying to track down. Also, when the full test suite is run, > test_socket.py may hang, while in isolation it will work. (Gosh if > only we had had these unit tests a few years ago. They bring up all > sorts of issues that are good to know about.) > > I'll try to fix these ASAP. -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From aahz@pythoncraft.com Mon Jun 17 00:45:55 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 16 Jun 2002 19:45:55 -0400 Subject: [Python-Dev] PEP 8: Lists/tuples Message-ID: <20020616234555.GA3415@panix.com> I don't really want to open a can of worms here, but as I'm finishing up my OSCON slides, I remembered a conversation earlier where Guido said that tuples should be used for heterogeneous items (i.e. lightweight structs) while lists should be used for homogeneous items. Should this preference be enshrined in PEP 8? (I'm -1 myself, but I'd like to know what to tell my class.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Mon Jun 17 01:23:58 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 16 Jun 2002 20:23:58 -0400 Subject: [Python-Dev] PEP 8: Lists/tuples In-Reply-To: Your message of "Sun, 16 Jun 2002 19:45:55 EDT."
<20020616234555.GA3415@panix.com> References: <20020616234555.GA3415@panix.com> Message-ID: <200206170023.g5H0NxC00733@pcp02138704pcs.reston01.va.comcast.net> > I don't really want to open a can of worms here, but as I'm finishing up > my OSCON slides, I remembered a conversation earlier where Guido said > that tuples should be used for heterogeneous items (i.e. lightweight > structs) while lists should be used for homogeneous items. > > Should this preference be enshrined in PEP 8? Yes. > (I'm -1 myself, but I'd like to know what to tell my class.) Like it or not, that's what tuples are for. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon Jun 17 02:13:21 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 16 Jun 2002 21:13:21 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: <018d01c21208$1efadab0$6501a8c0@boostconsulting.com> Message-ID: [David Abrahams] > ... > The pathological/non-generic cases are the ones that make me think twice > about using the inplace ops at all. They don't, in fact, "just work", so > I have to think carefully about what's happening to avoid getting myself > in trouble. I didn't understand this thread. The inplace ops in Python do "just work" to my eyes, but I expect them to work the way Python defines them to work, which is quite uniform. For example,

e1[e2] += e3

acts like

t0, t1 = e1, e2
t0[t1] = t0[t1] + e3

There's no guarantee that e1[e2] as a whole is evaluated at most once, and, to the contrary, the subscription is performed twice, just like the "acts like" line implies. Likewise

e1.id += e2

acts like

t0 = e1
t0.id = t0.id + e2

The way an augmented assignment in Python works is defined by cases, on the form of the target. Those were the "subscription" and "attributeref" forms of target. There are two other relevant forms of target, "identifier" and "slicing", and they're wholly analogous.
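[Editorial note: the "subscription is performed twice" behavior Tim describes is directly observable with a container that logs its hook calls. A sketch in present-day syntax; the Tracing class is hypothetical, not from the thread:]

```python
calls = []

class Tracing(dict):
    # Record every subscript read and write the interpreter performs.
    def __getitem__(self, key):
        calls.append(('get', key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        calls.append(('set', key))
        super().__setitem__(key, value)

d = Tracing()
d['x'] = 0
calls.clear()

d['x'] += 1   # one subscription for the read, one for the write-back

print(calls)  # [('get', 'x'), ('set', 'x')]
```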
Note an implication: in a "complicated" target, it's only the top-level subscription or attribute lookup that gets evaluated twice; e.g.,

e1[e2].e3[e4] += e5

acts like

t0, t1 = e1[e2].e3, e4
t0[t1] = t0[t1] + e5

Note that Python doesn't have a reference-to-lvalue concept. If you don't believe "but it should, so I'm going to think as if it does", there's nothing surprising about augmented assignment in Python. Indeed, I'm not even surprised by what this prints:

>>> a = range(12)
>>> a[2:9] += [666]
>>> a

From greg@cosc.canterbury.ac.nz Mon Jun 17 02:46:58 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Jun 2002 13:46:58 +1200 (NZST) Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <3D09A388.8080107@lemburg.com> Message-ID: <200206170146.g5H1kwh27759@oma.cosc.canterbury.ac.nz> > The question is whether we want distutils to be a development > tool as well I'd say yes, we do -- otherwise we have to maintain two parallel systems for building stuff, which sucks for what should be obvious reasons. What's more -- on Windows, distutils is the only way I know *how* to build extension modules! I once tried doing it on my own and gave up in disgust. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Jun 17 02:57:51 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Jun 2002 13:57:51 +1200 (NZST) Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206170157.g5H1vp027803@oma.cosc.canterbury.ac.nz> > Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c': > > ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']}) > > But this is wrong: it's not foo1.c that depends on bar.h, it's foo1.o. It's not wrong if you read the dependency statement as "anything which depends on foo1.c also depends on bar.h" etc. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From David Abrahams" Message-ID: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com> From: "Tim Peters" > [David Abrahams] > ... > > The pathological/non-generic cases are the ones that make me think twice > > about using the inplace ops at all. They don't, in fact, "just work", so > > I have to think carefully about what's happening to avoid getting myself > > in trouble. > > I didn't understand this thread. The inplace ops in Python do "just work" > to my eyes, but I expect them to work the way Python defines them to work, > which is quite uniform. For example, > > e1[e2] += e3 > > acts like > > t0, t1 = e1, e2 > t0[t1] = t0[t1] + e3 But that's not even right, AFAICT. Instead, it's:

t0, t1 = e1, e2
t2 = t0[t1]
t2 += e3       # possible rebinding operation
t0[t1] = t2

> There's no guarantee that e1[e2] as a whole is evaluated at most once Actually, that was exactly what I expected.
What I didn't expect was that there's a guarantee that it's evaluated twice, once as part of a getitem and once as part of a setitem. > The way an augmented assignment in Python works is defined by cases, on the > form of the target. Those were the "subscription" and "attributeref" forms > of target. There are two other relevant forms of target, "identifier" and > "slicing", and they're wholly analogous. Note an implication: in a > "complicated" target, it's only the top-level subscription or attribute > lookup that gets evaluated twice; e.g., > > e1[e2].e3[e4] += e5 > > acts like > > t0, t1 = e1[e2].e3, e4 > t0[t1] = t0[t1] + e5 I understood that part, but thanks for going to the trouble. > Note that Python doesn't have a reference-to-lvalue concept. Never expected it to. > If you don't > believe "but it should, so I'm going to think as if it does", there's > nothing surprising about augmented assignment in Python. I don't think it should have a reference-to-lvalue. Please, give me a tiny bit of credit for being able to think Pythonically. I don't see everything in terms of C++; I just expected Python not to do a potentially expensive lookup and writeback in the cases where it could be avoided. Other people, apparently, are also surprised by some of the cases that arise due to the unconditional write-back operation. > Indeed, I'm not > even surprised by what this prints : > > >>> a = range(12) > >>> a[2:9] += [666] > >>> a I guess I am, even if I believed your "as-if" description:

>>> a = range(12)
>>> t0,t1 = a,slice(2,9)
>>> t0[t1] = t0[t1] + [666]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence index must be integer can-we-stop-beating-this-horse-now-ly y'rs, dave From jeremy@zope.com Mon Jun 17 13:40:30 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Jun 2002 08:40:30 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com> References: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com> Message-ID: <15629.55486.743231.509983@slothrop.zope.com> >>>>> "DA" == David Abrahams writes: DA> I guess I am, even if I believed your "as-if" description: >>> a = range(12) >>> t0,t1 = a,slice(2,9) >>> t0[t1] = t0[t1] + [666] DA> Traceback (most recent call last): DA> File "", line 1, in ? DA> TypeError: sequence index must be integer There seem to be many ways to spell this, all quite similar. And different versions of Python have different things to say about it. Current CVS says: >>> a = range(12) >>> t0,t1 = a,slice(2, 9) >>> t0[t1] = t0[t1] + [666] Traceback (most recent call last): File "", line 1, in ? ValueError: attempt to assign list of size 8 to extended slice of size 7 I suspect this is a bug, since I didn't ask for an extended slice. >>> a[2:9] = a[2:9] + [666] >>> a [0, 1, 2, 3, 4, 5, 6, 7, 8, 666, 9, 10, 11] Jeremy From guido@python.org Mon Jun 17 13:59:46 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 08:59:46 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: Your message of "Mon, 17 Jun 2002 08:40:30 EDT." <15629.55486.743231.509983@slothrop.zope.com> References: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com> <15629.55486.743231.509983@slothrop.zope.com> Message-ID: <200206171259.g5HCxkB08311@pcp02138704pcs.reston01.va.comcast.net> > Current CVS says: > > >>> a = range(12) > >>> t0,t1 = a,slice(2, 9) > >>> t0[t1] = t0[t1] + [666] > Traceback (most recent call last): > File "", line 1, in ? 
> ValueError: attempt to assign list of size 8 to extended slice of > size 7 > > I suspect this is a bug, since I didn't ask for an extended slice. > > >>> a[2:9] = a[2:9] + [666] > >>> a > [0, 1, 2, 3, 4, 5, 6, 7, 8, 666, 9, 10, 11] > > Jeremy Yeah, I think this is a bug introduced when MWH added support for extended slices. I hope he'll fix it so I won't have to. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Mon Jun 17 14:03:24 2002 From: mwh@python.net (Michael Hudson) Date: Mon, 17 Jun 2002 14:03:24 +0100 Subject: [Python-Dev] (no subject) Message-ID: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> My starship mail currently seems to be broken in and out, so this is my first mail sent with Mac OS X's Mail.app. I hope it comes out plain text... > Current CVS says: > > >>> a = range(12) > >>> t0,t1 = a,slice(2, 9) > >>> t0[t1] = t0[t1] + [666] > Traceback (most recent call last): > File "", line 1, in ? > ValueError: attempt to assign list of size 8 to extended slice of > size 7 > > I suspect this is a bug, since I didn't ask for an extended slice. Yes you did :) If you'd have tried that with any released Python, you'd have got a TypeError. The trouble is, there's no way to distinguish between l1[a:b:] l1[slice(a,b)] I deliberately made the former be the same as l1[a:b:1] (and so have the restriction on the length of slice) to reduce special-casing (both for the user and me). Do you think I got that wrong? Cheers, M. From guido@python.org Mon Jun 17 14:39:59 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 09:39:59 -0400 Subject: [Python-Dev] (no subject) In-Reply-To: Your message of "Mon, 17 Jun 2002 14:03:24 BST." 
<9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> References: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> Message-ID: <200206171339.g5HDdxN08737@pcp02138704pcs.reston01.va.comcast.net> > The trouble is, there's no way to distinguish between > > l1[a:b:] > l1[slice(a,b)] > > I deliberately made the former be the same as l1[a:b:1] (and so have the > restriction on the length of slice) to reduce special-casing (both for > the user and me). Do you think I got that wrong? Yes I think you got that wrong. __getslice__ and __setslice__ are being deprecated (or at least discouraged), so you'll have objects implementing only __getitem__. Such objects will get a slice object passed to __getitem__ even for simple (one-colon) slices. If such an object wants to pass the slice on to a list object underlying the implementation, it should be allowed to. IOW slice(a, b, None) should be considered equivalent to L[a:b] in all situations. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Mon Jun 17 10:54:51 2002 From: mwh@python.net (Michael Hudson) Date: 17 Jun 2002 10:54:51 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules unicodedata.c,2.16,2.17 References: <02e101c213e3$4785cc10$ced241d5@hagrid> <200206142048.g5EKmwC14089@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2m4rg21b84.fsf@starship.python.net> Guido van Rossum writes: > > in my experience, simple names without underscores always conflicts > > with something on platforms that I don't have access to, while simple > > names with leading underscores never causes any problems... > > Unfortunately, that's exactly the opposite of what the C > standardization committee wants you to do. Huh? I thought underscore-lowercase character was fine, and double-underscore and underscore-capital were the verboten combinations. Cheers, M. -- SPIDER: 'Scuse me. [scuttles off] ZAPHOD: One huge spider. FORD: Polite though.
-- The Hitch-Hikers Guide to the Galaxy, Episode 11 From mwh@python.net Mon Jun 17 10:54:58 2002 From: mwh@python.net (Michael Hudson) Date: 17 Jun 2002 10:54:58 +0100 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <005a01c21388$387cd3e0$0900a8c0@spiff> Message-ID: <2mznxuz0ul.fsf@starship.python.net> martin@v.loewis.de (Martin v. Loewis) writes: > > (distutils is also a pain to use with a version management system > > that marks files in the repository as read-only; distutils copy function > > happily copies all the status bits. but the remove function refuses to > > remove files that are read-only, even if the files have been created > > by distutils itself...) > > That's a bug, IMO. And hang on, wasn't it fixed by revision 1.12 of Lib/distutils/file_util.py? If not, more details would be appreciated. Cheers, M. -- Now this is what I don't get. Nobody said absolutely anything bad about anything. Yet it is always possible to just pull random flames out of ones ass. 
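[Editorial note: the check Jeremy describes is small enough to sketch outright. The helper names below are hypothetical, not actual distutils code; the .d file is the Make-style fragment that gcc -MD writes, and the parsing is deliberately naive about ':' appearing inside paths:]

```python
import os

def parse_depfile(path):
    """Parse a gcc -MD style 'foo.o: foo.c bar.h ...' fragment."""
    with open(path) as f:
        text = f.read().replace('\\\n', ' ')   # join continuation lines
    target, _, deps = text.partition(':')
    return deps.split()

def needs_recompile(c_file, o_file, d_file):
    if not os.path.exists(o_file):
        return True
    o_mtime = os.path.getmtime(o_file)
    if not os.path.exists(d_file):
        # No dependency info yet: fall back to comparing the .c file only.
        return os.path.getmtime(c_file) > o_mtime
    deps = parse_depfile(d_file) or [c_file]
    return any(os.path.getmtime(dep) > o_mtime for dep in deps)
```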
-- http://www.advogato.org/person/vicious/diary.html?start=60 From jeremy@zope.com Mon Jun 17 15:30:24 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Jun 2002 10:30:24 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> Message-ID: <15629.62080.478763.190468@slothrop.zope.com> >>>>> "MvL" == Martin v Loewis writes: MvL> GCC also supports SUNPRO_DEPENDENCIES, so this is the most MvL> effective and portable way to get dependency file generation. Here's a rough strategy for exploiting this feature in distutils. Does it make sense? Happily, I can't see any possible use of make. There is an option to enable dependency tracking. Not sure how the option is passed: command line (tedious), setup (not easily customized by user), does distutils have a user options file of some sort? Each time distutils compiles a file it passes the -MD file to generate a .d file. On subsequent compilations, it checks for the .d file. If the .d file does not exist or is older than the .c file, it recompiles. Otherwise, it parses the .d file and compares the times for each of the dependencies. 
This doesn't involve make because the only thing make would do for us is check the dependencies and invoke the compiler. distutils already knows how to do both those things. Jeremy From guido@python.org Mon Jun 17 15:47:38 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 10:47:38 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Mon, 17 Jun 2002 10:30:24 EDT." <15629.62080.478763.190468@slothrop.zope.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> Message-ID: <200206171447.g5HElcw09279@pcp02138704pcs.reston01.va.comcast.net> > Here's a rough strategy for exploiting this feature in distutils. > Does it make sense? Happily, I can't see any possible use of make. > > There is an option to enable dependency tracking. Not sure how the > option is passed: command line (tedious), setup (not easily customized > by user), does distutils have a user options file of some sort? We could make the configure script check for GCC, and if detected, add -MD to it. > Each time distutils compiles a file it passes the -MD file to generate > a .d file. 
> > On subsequent compilations, it checks for the .d file. If the .d file > does not exist or is older than the .c file, it recompiles. > Otherwise, it parses the .d file and compares the times for each of > the dependencies. Sounds good. It could skip parsing the .d file if the .o file doesn't exist or is older than the .c file. If there is no .d file, I would suggest only recompiling if the .c file is newer than the .o file (otherwise systems without GCC will see recompilation of everything all the time -- not a good idea IMO.) Go ahead and implement this! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 17 15:49:23 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 10:49:23 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules unicodedata.c,2.16,2.17 In-Reply-To: Your message of "17 Jun 2002 10:54:51 BST." <2m4rg21b84.fsf@starship.python.net> References: <02e101c213e3$4785cc10$ced241d5@hagrid> <200206142048.g5EKmwC14089@pcp02138704pcs.reston01.va.comcast.net> <2m4rg21b84.fsf@starship.python.net> Message-ID: <200206171449.g5HEnNe09298@pcp02138704pcs.reston01.va.comcast.net> > Huh? I thought underscore-lowercase character was fine, and > double-underscore and underscore-capital were the verboten > combinations. underscore-lowercase with external linkage is also reserved for the C implementation. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Mon Jun 17 15:53:13 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 17 Jun 2002 10:53:13 -0400 Subject: [Python-Dev] 'new' and 'types' In-Reply-To: <20020617134905.GA1003@gerg.ca> References: <20020614191558.A31580@hishome.net> <20020617134905.GA1003@gerg.ca> Message-ID: <20020617145313.GA6960@hishome.net> On Mon, Jun 17, 2002 at 09:49:05AM -0400, Greg Ward wrote: > On 14 June 2002, Oren Tirosh said: > > Patch 568629 removes the built-in module new (with sincere apologies to > > Tommy Burnette ;-) and replaces it with a tiny Python module consisting of a > > single import statement: > [...] > > Now, what about the types module? It has been suggested that this module > > should be deprecated. I think it still has some use: we need a place to put > > all the types that are not used often enough to be added to the builtins. > > I suggest that they be placed in the module 'types' with names matching their > > __name__ attribute. The types module will still have the long MixedCaseType > > names for backward compatibility. The use of the long names should be > > deprecated, not the types module itself. > > Two great ideas. I think you should write up a *short* PEP to keep them > alive -- this feels like one of those small, not-too-controversial > changes that will slip between the cracks unless/until someone with > checkin privs takes up the cause. Don't let that discourage you! The first is very much alive - my patch for the new module has already been checked in by Guido. No need for a PEP there because it's a transparent change. For the types module I have a pending patch but I guess it won't sneak into the CVS without a PEP. It requires general agreement that it's a bad thing for types to have two names, that the long names should be deprecated and that the right place for short names that are not builtins is in the types module. It's PEP time... 
Oren From guido@python.org Mon Jun 17 16:01:38 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 11:01:38 -0400 Subject: [Python-Dev] 'new' and 'types' In-Reply-To: Your message of "Mon, 17 Jun 2002 10:53:13 EDT." <20020617145313.GA6960@hishome.net> References: <20020614191558.A31580@hishome.net> <20020617134905.GA1003@gerg.ca> <20020617145313.GA6960@hishome.net> Message-ID: <200206171501.g5HF1cm09450@pcp02138704pcs.reston01.va.comcast.net> > For the types module I have a pending patch but I guess it won't > sneak into the CVS without a PEP. It requires general agreement that > it's a bad thing for types to have two names, that the long names > should be deprecated and that the right place for short names that > are not builtins is in the types module. FWIW, *I* agree with that. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon Jun 17 16:18:58 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 17 Jun 2002 10:18:58 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15629.62080.478763.190468@slothrop.zope.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> 
Message-ID: <15629.64994.385213.97041@12-248-41-177.client.attbi.com> Jeremy> Here's a rough strategy for exploiting this feature in Jeremy> distutils. Does it make sense? Happily, I can't see any Jeremy> possible use of make. I still don't quite understand what everyone's aversion to make is (yes, I realize it's not platform-independent, but then neither are C compilers or linkers and we manage to live with that), but I will let that slide. Instead, I see a potentially different approach. Write an scons build file (typically named SConstruct) and deliver that in the Modules directory. Most people can safely ignore it. The relatively few people (mostly on this list) who care about such things can simply install SCons (it's quite small) and run it to build the stuff in the Modules directory. The benefits as I see them are * SCons implements portable automatic dependency analysis already * Dependencies are based upon file checksums instead of timestamps (worthwhile in highly networked development environments) * Clearer separation between build/install and edit/compile/test types of tasks. I was able to create a simple SConstruct file over the weekend that builds many of the extension modules. I stalled a bit on library/include file discovery, but hopefully that barrier will be passed soon. I realize in the short-term there are also several disadvantages to this idea: * There will initially be a lot of overlap between setup.py and SCons. * SCons doesn't yet implement a VPATH-like capability so the source and build directories can't easily be separated. One is in the works though, planned for initial release in 0.09. The current version is 0.07. Skip From guido@python.org Mon Jun 17 16:30:09 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 11:30:09 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Mon, 17 Jun 2002 10:18:58 CDT." 
<15629.64994.385213.97041@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> Message-ID: <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net> [Proposal to use SCons] Let's not tie ourselves to SCons before it's a lot more mature. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Mon Jun 17 16:29:29 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Mon, 17 Jun 2002 11:29:29 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15630.89.990123.822136@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> [Proposal to use SCons] GvR> Let's not tie ourselves to SCons before it's a lot more GvR> mature. On the other hand, eating our own dogfood is a great incentive for quickly making it taste better. 
:) -Barry From jeremy@zope.com Mon Jun 17 16:36:38 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Jun 2002 11:36:38 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15629.64994.385213.97041@12-248-41-177.client.attbi.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> Message-ID: <15630.518.652328.859842@slothrop.zope.com> >>>>> "SM" == Skip Montanaro writes: SM> I still don't quite understand what everyone's aversion to make SM> is (yes, I realize it's not platform-independent, but then SM> neither are C compilers or linkers and we manage to live with SM> that), but I will let that slide. You didn't let it slide. You brought it up again. Many people have offered many reasons not to use make. You haven't offered any rebuttal to their arguments, which comes across as rather cavalier. SM> Instead, I see a potentially different approach. Write an scons SM> build file (typically named SConstruct) and deliver that in the SM> Modules directory. I don't care much about the Modules directory actually. 
I want this for third-party extensions that use distutils for distribution, particularly for my own third-party extensions :-). It sounds like you're proposing to drop distutils in favor of SCons, but not saying so explicitly. Is that right? If so, we'd need a stronger case for dumping distutils than automatic dependency tracking. If that isn't right, I don't understand how SCons and distutils meet in the middle. Would extension writers need to learn distutils and SCons? It seems like the primary benefit of SCons is that it does the dependency analysis for us, while only gcc and MSVC seem to offer something similar as a compiler builtin. Since those two compilers cover all the platforms I ever use, it isn't something that interests me a lot. SM> The benefits as I see them are SM> * SCons implements portable automatic dependency analysis SM> already That's good. SM> * Dependencies are based upon file checksums instead of SM> timestamps (worthwhile in highly networked development SM> environments) That's good, too, although we could do the same for distutils. Not too much work, but not my first priority. SM> * Clearer separation between build/install and edit/compile/test SM> types of tasks. I don't know what you mean. I use the current Python make file for both tasks, and haven't had much problem. SM> I was able to create a simple SConstruct file over the weekend SM> that builds many of the extension modules. I stalled a bit on SM> library/include file discovery, but hopefully that barrier will SM> be passed soon. That's cool. SM> I realize in the short-term there are also several disadvantages SM> to this idea: SM> * There will initially be a lot of overlap between setup.py and SM> SCons. Won't there be a lot of overlap for all time unless Python adopts SCons as the one true way to build extension modules? It's not like setup.py is going to be replaced. 
SM> * SCons doesn't yet implement a VPATH-like capability so the SM> source and build directories can't easily be separated. SM> One is in the works though, planned for initial release in SM> 0.09. The current version is 0.07. Absolute requirement for me :-(. I've got three CVS checkouts of Python and probably 10 total build directories that I use on a regular basis -- normal builds, debug builds, profiled builds, etc. Jeremy From mwh@python.net Mon Jun 17 15:26:22 2002 From: mwh@python.net (Michael Hudson) Date: 17 Jun 2002 15:26:22 +0100 Subject: [Python-Dev] extended slicing again In-Reply-To: Guido van Rossum's message of "Mon, 17 Jun 2002 09:39:59 -0400" References: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> <200206171339.g5HDdxN08737@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <2my9de7zht.fsf_-_@starship.python.net> Email back. Guido van Rossum writes: > > The trouble is, there's no way to distinguish between > > > > l1[a:b:] > > l1[slice(a,b)] > > > > I deliberately made the former be the same as l1[a:b:1] (and so have the > > restriction on the length of slice) to reduce special-casing (both for > > the user and me). Do you think I got that wrong? > > Yes I think you got that wrong. __getslice__ and __setslice__ are > being deprecated (or at least discouraged), so you'll have objects > implementing only __getitem__. Such objects will get a slice object > passed to __getitem__ even for simple (one-colon) slices. If such an > object wants to pass the slice on to a list object underlying the > implementation, it should be allowed to. > > IOW slice(a, b, None) should be considered equivalent to L[a:b] in all > situations. OK. I'll do this soon. It's not as bad as I thought at first -- only mutable sequences are affected, so it's only lists and arrays that need to be tweaked. Cheers, M. -- Our lecture theatre has just crashed. 
It will currently only silently display an unexplained line-drawing of a large dog accompanied by spookily flickering lights. -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year) From tim.one@comcast.net Mon Jun 17 17:28:26 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 17 Jun 2002 12:28:26 -0400 Subject: [Python-Dev] behavior of inplace operations In-Reply-To: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com> Message-ID: [Tim sez] > The inplace ops in Python do "just work" to my eyes, but I expect them > to work the way Python defines them to work,> which is quite uniform. > For example, > > e1[e2] += e3 > > acts like > > t0, t1 = e1, e2 > t0[t1] = t0[t1] + e3 [David Abrahams] > But that's not even right, AFAICT. Instead, its: > > t0, t1 = e1, e2 > t2 = t0[t1] > t2 += e3 # possible rebinding operation > t0[t1] = t2 That's closer, although the "mystery" in the 3rd line is less mysterious if the whole shebang is rewritten t0, t1 = e1, e2 t0[t1] = t0[t1].__iadd__(e3) That makes it clearer that the effect of the final binding is determined by what the __iadd__ implementation chooses to return. > ... > Actually, that was exactly what I expected. What I didn't expect was that > there's a guarantee that it's evaluated twice, once as part of a getitem > and once as part of a setitem. There is. > ... > I don't think it should have a reference-to-lvalue. Please, give me a tiny > bit of credit for being able to think Pythonically. I don't see everything > in terms of C++; I just expected Python not to do a potentially expensive > lookup and writeback in the cases where it could be avoided. Other people, > apparently, are also surprised by some of the cases that arise due to the > unconditional write-back operation. Getting single-evaluation requires picturing reference-to-lvalue, or magical writeback proxies, or something else equally convoluted: they're unPythonic simply because Python doesn't have stuff like that. 
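The uniform expansion Tim gives, `t0[t1] = t0[t1].__iadd__(e3)`, is easy to watch in action with a logging container (a sketch; `LoggingList` is made up for illustration):

```python
class LoggingList:
    """Wrap a list and record which hooks an augmented assignment hits."""
    def __init__(self, data):
        self.data = list(data)
        self.log = []
    def __getitem__(self, i):
        self.log.append("getitem")
        return self.data[i]
    def __setitem__(self, i, value):
        self.log.append("setitem")
        self.data[i] = value

x = LoggingList([1, 2, 3])
x[1] += 3                  # x[1] = x[1].__iadd__(3), with __add__ as
                           # the fallback since ints have no __iadd__
print(x.log)               # ['getitem', 'setitem']
print(x.data)              # [1, 5, 3]

y = LoggingList([[1], [3]])
y[1] += [6]                # the inner list mutates in place, yet the
                           # write-back (setitem) still happens
print(y.log)               # ['getitem', 'setitem']
print(y.data)              # [[1], [3, 6]]
```

Both the immutable and the mutable case trigger exactly one getitem and one setitem, which is the unconditional write-back being discussed.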
A protocol was invented for supporting both in-place and replacement forms of augmented assignments uniformly, but it's a Pythonically simple protocol in that it can be expressed *in* Python with no new concepts beyond that methods like __iadd__ exist. I don't dispute that it surprises some people some of the time, but I submit that any other protocol would surprise some people too. Heck, even before augmented assignments were introduced, it surprised some people that list = list + [list2] *didn't* extend list inplace. Overall, "other people" are nuts <0.9 wink>. From python@rcn.com Mon Jun 17 18:17:11 2002 From: python@rcn.com (Raymond Hettinger) Date: Mon, 17 Jun 2002 13:17:11 -0400 Subject: [Python-Dev] PEP 290 - Code Modernization and Migration Message-ID: <002a01c21622$d2ede3a0$52b53bd0@othello> The migration guide has been codified in a new informational PEP at http://www.python.org/peps/pep-0290.html. Developers with CVS access can add their contributions or improvements directly to the PEP. Over time, it is expected to grow and serve as a repository of collective wisdom regarding version upgrades. From jeremy@zope.com Mon Jun 17 20:22:33 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Jun 2002 15:22:33 -0400 Subject: [Python-Dev] large file support Message-ID: <15630.14073.764208.284613@slothrop.zope.com> I've run into a problem with large files using Python 2.1.2 and a Linux 2.4.9 box. We've got a large file -- almost 6GB -- that Python chokes on even though regular shell tools seem to be fine. In particular, os.stat() of the file fails with EOVERFLOW and open() of the file fails with EFBIG. The stat() failure is really bad because it means os.path.exists() returns false. strace tells me that other tools open the file passing O_LARGEFILE, but Python does not. (They pass it even for small files.) I can't find any succinct explanation of O_LARGEFILE, but Google turns up all sorts of pages that mention it. 
It looks like the right way to open large files, but it only seems to be defined in on the Linux box in question. I haven't had any luck searching for a decent way to invoke stat() and have it be prepared for a very large file. I think Python is definitely broken here. Can anyone offer any clues or pointers to documentation? Better yet, a fix. I'm happy to help integrate and test it. Jeremy From David Abrahams" Message-ID: <017c01c21633$d19dad30$6601a8c0@boostconsulting.com> ----- Original Message ----- From: "Tim Peters" > Getting single-evaluation requires picturing reference-to-lvalue, or magical > writeback proxies, or something else equally convoluted: No, sorry, it doesn't. All it requires is a straightforward interpretation of the name given to these operations. You guys called them "inplace". When I heard that I thought "oh, it just modifies the value in-place. Wait, what about immutable objects? I guess it must rebind the thing on the LHS to the new value." Call that convoluted if you want, but at least a few people seem to approach it that way. > they're unPythonic simply because Python doesn't have stuff like that. Just stop right there and sign it "tautology-of-the-week-ly y'rs", OK? > A protocol was invented > for supporting both in-place and replacement forms of augmented assignments > uniformly, but it's a Pythonically simple protocol in that it can be > expressed *in* Python with no new concepts beyond that methods like __iadd__ > exist. Another simple protocol can also be expressed in Python, but since it involves an "if" statement it might not be considered "Pythonically simple". But, y'know, I don't care about this issue that much -- I just don't like leaving the implication that I'm thinking convolutedly uncontested. IOW, yer pressing my buttons, Tim! I'm happy to drop the technical issue unless you want to keep trolling me. 
> I don't dispute that it surprises some people some of the time, but > I submit that any other protocol would surprise some people too. True. I just think that the trade-offs of the chosen protocol are going to surprise more people in simpler situations than an alternative would have. It seems as though clean operation with mutable tuple elements (among other things) was sacrificed for the sake of transparency with persistent containers. I would've made a different trade-off, but then I'm not BDFL around here. like-a-moth-to-a-flame-ly y'rs, dave From guido@python.org Mon Jun 17 20:31:36 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 15:31:36 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "Mon, 17 Jun 2002 15:22:33 EDT." <15630.14073.764208.284613@slothrop.zope.com> References: <15630.14073.764208.284613@slothrop.zope.com> Message-ID: <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> > I've run into a problem with large files using Python 2.1.2 and a > Linux 2.4.9 box. We've got a large file -- almost 6GB -- that Python > chokes on even though regular shell tools seem to be fine. Was this Python configured for large file support? I think you have to turn that on somehow, and then everything is automatic. --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Mon Jun 17 20:45:17 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 17 Jun 2002 12:45:17 -0700 Subject: [Python-Dev] large file support In-Reply-To: <15630.14073.764208.284613@slothrop.zope.com>; from jeremy@zope.com on Mon, Jun 17, 2002 at 03:22:33PM -0400 References: <15630.14073.764208.284613@slothrop.zope.com> Message-ID: <20020617124517.A13106@glacier.arctrix.com> Jeremy Hylton wrote: > can't find any succient explanation of O_LARGEFILE, but Google turns > up all sorts of pages that mention it. 
It looks like the right way to open large files, but it only seems to be defined in on the Linux box in question. Perhaps it is set by libc if the application is compiled with large file support. Neil From Andreas Jung Mon Jun 17 20:44:54 2002 From: Andreas Jung (Andreas Jung) Date: Mon, 17 Jun 2002 15:44:54 -0400 Subject: [Python-Dev] large file support In-Reply-To: <15630.14073.764208.284613@slothrop.zope.com> References: <15630.14073.764208.284613@slothrop.zope.com> Message-ID: <104250714.1024328694@[10.10.1.2]> I remember that we had trouble several times with compiling Python 2.1 under Redhat with LF support. Also the way described in the docs did not work in all cases and we had to tweak the sources a bit (I think it was posixmodule.c). -aj --On Monday, June 17, 2002 15:22 -0400 Jeremy Hylton wrote: > I've run into a problem with large files using Python 2.1.2 and a > Linux 2.4.9 box. We've got a large file -- almost 6GB -- that Python > chokes on even though regular shell tools seem to be fine. > > In particular, os.stat() of the file fails with EOVERFLOW and open() > of the file fails with EFBIG. The stat() failure is really bad > because it means os.path.exists() returns false. > > strace tells me that other tools open the file passing O_LARGEFILE, > but Python does not. (They pass it even for small files.) I can't > find any succient explanation of O_LARGEFILE, but Google turns up all > sorts of pages that mention it. It looks like the right way to open > large files, but it only seems to be defined in on the > Linux box in question. > > I haven't had any luck searching for a decent way to invoke stat() and > have it be prepared for a very large file. > > I think Python is definitely broken here. Can anyone offer any clues > or pointers to documentation? Better yet, a fix. I'm happy to help > integrate and test it. 
> > Jeremy > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev --------------------------------------------------------------------- - Andreas Jung http://www.andreas-jung.com - - EMail: andreas at andreas-jung.com - - "Life is too short to (re)write parsers" - --------------------------------------------------------------------- From jeremy@zope.com Mon Jun 17 20:37:05 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Jun 2002 15:37:05 -0400 Subject: [Python-Dev] large file support In-Reply-To: <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15630.14945.184194.434432@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: >> I've run into a problem with large files using Python 2.1.2 and a >> Linux 2.4.9 box. We've got a large file -- almost 6GB -- that >> Python chokes on even though regular shell tools seem to be fine. GvR> Was this Python configured for large file support? I think you GvR> have to turn that on somehow, and then everything is automatic. Indeed, I think my message ought to be mostly disregarded :-). I was told that Python had been built with large file support, but didn't test it myself. However, I'm still unhappy with one thing related to large file support. If you've got a Python that doesn't have large file support and you try os.path.exists() on a large file, it will return false. This is really bad! Imagine you've got code that says, if the file doesn't exist open with mode "w+b" :-(. I'd be happiest if os.path.exists() would work regardless of whether Python supported large files. I'd be satisfied with an exception that at least let me know something went wrong. 
Jeremy From guido@python.org Mon Jun 17 21:08:26 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 16:08:26 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "Mon, 17 Jun 2002 15:37:05 EDT." <15630.14945.184194.434432@slothrop.zope.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> Message-ID: <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> > However, I'm still unhappy with one thing related to large file > support. If you've got a Python that doesn't have large file support > and you try os.path.exists() on a large file, it will return false. > This is really bad! Imagine you've got code that says, if the file > doesn't exist open with mode "w+b" :-(. Wow, that sucks. > I'd be happiest if os.path.exists() would work regardless of whether > Python supported large files. I'd be satisifed with an exception that > at least let me know something went wrong. Is there an errno we can test for? stat() for a non-existent file raises one exception, stat() for a file in a directory you can't read raises a different one; maybe stat of a large file raises something else again? I think os.path.exists() ought to return True in this case. 
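A tolerant exists() along those lines might look like this in modern syntax (a sketch only; the real fix belongs in posixpath, and EOVERFLOW may not exist on every platform, hence the guarded lookup Guido asks for):

```python
import errno
import os

def tolerant_exists(path):
    # Like os.path.exists(), but a stat() failing with EOVERFLOW means
    # the file is merely too large for this build, so it does exist.
    try:
        os.stat(path)
    except OSError as e:
        # getattr() guards against platforms whose errno module has no
        # EOVERFLOW; comparing against None can never match a real errno.
        if e.errno == getattr(errno, "EOVERFLOW", None):
            return True
        return False
    return True
```

On a non-large-file build this returns True for a >2GB file instead of the misleading False that plain os.path.exists() gives.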
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Mon Jun 17 21:28:23 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Jun 2002 16:28:23 -0400 Subject: [Python-Dev] large file support In-Reply-To: <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15630.18023.298708.670795@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: >> I'd be happiest if os.path.exists() would work regardless of >> whether Python supported large files. I'd be satisifed with an >> exception that at least let me know something went wrong. GvR> Is there an errno we can test for? stat() for a non-existent GvR> file raises one exception, stat() for a file in a directory you GvR> can't read raises a different one; maybe stat of a large file GvR> raises something else again? I think os.path.exists() ought to GvR> return True in this case. On the platform I tried (apparently RH 7.1) it raises EOVERFLOW. I can extend posixpath to treat that as "file exists" tomorrow. Jeremy From bac@OCF.Berkeley.EDU Mon Jun 17 21:28:22 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 13:28:22 -0700 (PDT) Subject: [Python-Dev] Python strptime Message-ID: I have implemented strptime in pure Python (SF patch #474274) as a drop-in replacement for the time module's version, but there is the issue of the time module being a C extension. Any chance of getting a Python module stub for time (assuming this patch is good enough to be accepted)? There is also obviously the option of doing something like a time2, but is there enough other time-manipulating Python code out there to warrant another module? 
It could be used for housing naivetime and any other code that does not directly stem from some ANSI C function. -Brett C. From guido@python.org Mon Jun 17 21:49:49 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 16:49:49 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "Mon, 17 Jun 2002 16:28:23 EDT." <15630.18023.298708.670795@slothrop.zope.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> Message-ID: <200206172049.g5HKnn012080@pcp02138704pcs.reston01.va.comcast.net> > On the platform I tried (apparently RH 7.1) it raises EOVERFLOW. I > can extend posixpath to treat that as "file exists" tomorrow. OK. Be sure to check that the errno module and the value errno.EOVERFLOW exist before using them! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jun 17 21:51:34 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 16:51:34 -0400 Subject: [Python-Dev] Python strptime In-Reply-To: Your message of "Mon, 17 Jun 2002 13:28:22 PDT." References: Message-ID: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net> > I have implemented strptime in pure Python (SF patch #474274) as a drop-in > replacement for the time module's version, but there is the issue of the > time module being a C extension. Any chance of getting a Python module > stub for time (assuming this patch is good enough to be accepted)? > > There is also obviously the option of doing something like a time2, but is > there enough other time-manipulating Python code out there to warrant > another module? It could be used for housing naivetime and any other code > that does not directly stem from some ANSI C function. 
I think this should be done, but I have no time to review your strptime implementation. Can you submit (to the same patch item) a patch for timemodule.c that adds a callout to your Python strptime code when HAVE_STRPTIME is undefined? --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Mon Jun 17 22:25:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 17 Jun 2002 23:25:59 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15629.62080.478763.190468@slothrop.zope.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> Message-ID: Jeremy Hylton writes: > Here's a rough strategy for exploiting this feature in distutils. > Does it make sense? Sounds good. Unlike make, it should not choke if it cannot locate one of the inputs of the dependency file - it may be that the header file has gone away, and subsequent recompilation would update the dependency file to show that. If that is done, I'd still encourage use of the SUNPRO_DEPENDENCIES feature for use with SunPRO (aka Forte, aka Sun ONE). 
Not that I'm asking you to implement it, but it would be good if another such mechanism would be easy to hook into whatever you implement. Regards, Martin From guido@python.org Mon Jun 17 22:31:35 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 17:31:35 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "17 Jun 2002 23:25:59 +0200." References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> Message-ID: <200206172131.g5HLVZT12712@pcp02138704pcs.reston01.va.comcast.net> > If that is done, I'd still encourage use of the SUNPRO_DEPENDENCIES > feature for use with SunPRO (aka Forte, aka Sun ONE). Not that I'm > asking you to implement it, but it would be good if another such > mechanism would be easy to hook into whatever you implement. I don't recall that you explained the meaning of the SUNPRO_DEPENDENCIES variable, only that it was undocumented and did something similar to GCC's -M. That's hardly enough. 
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Mon Jun 17 22:31:35 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 14:31:35 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Mon, 17 Jun 2002, Guido van Rossum wrote: > > I have implemented strptime in pure Python (SF patch #474274) as a drop-in > > replacement for the time module's version, but there is the issue of the > > time module being a C extension. Any chance of getting a Python module > > stub for time (assuming this patch is good enough to be accepted)? > > > > There is also obviously the option of doing something like a time2, but is > > there enough other time-manipulating Python code out there to warrant > > another module? It could be used for housing naivetime and any other code > > that does not directly stem from some ANSI C function. > > I think this should be done, but I have no time to review your > strptime implementation. Can you submit (to the same patch item) a > patch for timemodule.c that adds a callout to your Python strptime > code when HAVE_STRPTIME is undefined? Do you just want a callout to strptime or should I also include my helper classes and functions? I have implemented a class that figures out and stores all locale-specific date info (weekday names, month names, etc.). I subclass that for another class that creates the regexes used by strptime. I also have three functions that calculate missing data (Julian date from Gregorian date, etc.). But one does not need access to any of these things directly if one just wants to use strptime like in the time module, so they can be left out for now if you prefer. 
> > --Guido van Rossum (home page: http://www.python.org/~guido/) > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > -Brett C. From guido@python.org Mon Jun 17 22:44:28 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 17:44:28 -0400 Subject: [Python-Dev] Python strptime In-Reply-To: Your message of "Mon, 17 Jun 2002 14:31:35 PDT." References: Message-ID: <200206172144.g5HLiSZ12766@pcp02138704pcs.reston01.va.comcast.net> > Do you just want a callout to strptime or should I also include my helper > classes and functions? I have implemented a class that figures out and > stores all locale-specific date info (weekday names, month names, etc.). > I subclass that for another class that creates the regexes used by > strptime. I also have three functions that calculate missing data (Julian > date from Gregorian date, etc.). > > But one does not need access to any of these things directly if one just > wants to use strptime like in the time module, so they can be left out for > now if you prefer. If there's a way to get at the extra stuff by importing strptime.py, that's preferred. The time module only needs to support the classic strptime function. (But as I said I haven't seen your code, so maybe I misunderstand your question.) --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Mon Jun 17 22:44:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 17 Jun 2002 23:44:59 +0200 Subject: [Python-Dev] large file support In-Reply-To: <15630.14945.184194.434432@slothrop.zope.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> Message-ID: jeremy@zope.com (Jeremy Hylton) writes: > However, I'm still unhappy with one thing related to large file > support. 
If you've got a Python that doesn't have large file support > and you try os.path.exists() on a large file, it will return false. > This is really bad! I believe this is a pilot error. On a system that supports large files, it is the administrator's job to make sure the Python installation has large file support enabled, otherwise, strange things may happen. So yes, it is bad, but no, it is not really bad. Feel free to fix it, but be prepared to include work-arounds in many other places, too. Regards, Martin From martin@v.loewis.de Mon Jun 17 22:49:11 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 17 Jun 2002 23:49:11 +0200 Subject: [Python-Dev] Python strptime In-Reply-To: References: Message-ID: Brett Cannon writes: > Do you just want a callout to strptime or should I also include my helper > classes and functions? I have implemented a class that figures out and > stores all locale-specific date info (weekday names, month names, etc.). That sounds terrible. How do you do that, and on what systems does it work? Do we really want to do that? Does it always work? Regards, Martin From bac@OCF.Berkeley.EDU Mon Jun 17 22:47:44 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 14:47:44 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: <200206172144.g5HLiSZ12766@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Mon, 17 Jun 2002, Guido van Rossum wrote: > > Do you just want a callout to strptime or should I also include my helper > > classes and functions? I have implemented a class that figures out and > > stores all locale-specific date info (weekday names, month names, etc.). > > I subclass that for another class that creates the regexes used by > > strptime. I also have three functions that calculate missing data (Julian > > date from Gregorian date, etc.). 
> > > > But one does not need access to any of these things directly if one just > > wants to use strptime like in the time module, so they can be left out for > > now if you prefer. > > If there's a way to get at the extra stuff by importing strptime.py, > that's preferred. The time module only needs to support the classic > strptime function. (But as I said I haven't seen your code, so maybe > I misunderstand your question.) No, you understood it. I made all of it importable. I figured they might be useful in some other fashion so there is no munging of names or explicit leaving out in __all__ or anything. So my question is answered. Now I just need to write the patch. Might be a little while since I have never bothered to learn callouts from C to Python. Guess I now have my personal project for the week. -Brett C. From guido@python.org Mon Jun 17 22:57:03 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 17:57:03 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "17 Jun 2002 23:44:59 +0200." References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> Message-ID: <200206172157.g5HLv3f12913@pcp02138704pcs.reston01.va.comcast.net> > I believe this is a pilot error. On a system that supports large > files, it is the administrator's job to make sure the Python > installation has large file support enabled, otherwise, strange things > may happen. I'm not sure who to blame, but note that (at least for 2.1.2, which is the version that Jeremy said he was given to use) large file support must be configured manually. So this might be a common problem. Unfortunately that may mean that it's only worth fixing in 2.1.4... --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Mon Jun 17 23:03:12 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 18 Jun 2002 00:03:12 +0200 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206172131.g5HLVZT12712@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <200206172131.g5HLVZT12712@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > I don't recall that you explained the meaning of the > SUNPRO_DEPENDENCIES variable, only that it was undocumented and did > something similar to GCC's -M. That's hardly enough. :-) I see :-) Suppose you have a file x.c, and you invoke env SUNPRO_DEPENDENCIES="x.deps build/x.o" gcc -c -o x.o x.c then a file x.deps is generated, and has, on the left-hand side of the dependency rule, build/x.o. It works the same way for compilers identifying themselves as cc: Sun WorkShop 6 update 1 C 5.2 2000/09/11 when invoked with -V. I can't give a complete list of compilers that support that feature, but setting the variable can't hurt - the worst case is that it is ignored. Regards, Martin From martin@v.loewis.de Mon Jun 17 22:47:09 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 17 Jun 2002 23:47:09 +0200 Subject: [Python-Dev] large file support In-Reply-To: <104250714.1024328694@[10.10.1.2]> References: <15630.14073.764208.284613@slothrop.zope.com> <104250714.1024328694@[10.10.1.2]> Message-ID: Andreas Jung writes: > I remember that we had several times trouble with compiling > Python 2.1 under Redhat with LF support. Also the way described in the docs > did not work in all cases and we had to tweak the sources a bit > (I think it was posixmodule.c). For the current 2.1 release, the docs are believed to be correct (the instructions used to be incorrect, as was the code). For 2.2, it is believed that no extra configuration is necessary on "most" systems (Windows, Linux, Solaris). Regards, Martin From bac@OCF.Berkeley.EDU Mon Jun 17 23:27:23 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 15:27:23 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: Message-ID: On 17 Jun 2002, Martin v. Loewis wrote: > Brett Cannon writes: > > > Do you just want a callout to strptime or should I also include my helper > > classes and functions? I have implemented a class that figures out and > > stores all locale-specific date info (weekday names, month names, etc.). > > That sounds terrible. How do you do that, and on what systems does it > work? Do we really want to do that? Does it always work? > Well, since locale info is not directly accessible for time-specific things in Python (let alone in C in a standard way), I have to do multiple calls to strftime to get the names of the weekdays. As for the strings representing locale-specific date, time, and date/time representation I have to go through and figure out the format of the output in order to extract the format string used by strftime to create the string. Since it is in pure Python and relies only on strftime and locale for its info, it works on all systems. I have yet to have anyone say it doesn't work for them. 
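[The strftime round-trip Brett describes can be sketched like this — an illustration of the technique, not his LocaleTime code; the week of Monday 1999-01-04 through Sunday 1999-01-10 supplies one date for each tm_wday value 0-6:]

```python
import time

def locale_weekday_names():
    """Recover the current locale's full weekday names by formatting
    one known date per weekday with %A (the strftime round-trip
    described above). Jan 4, 1999 was a Monday (tm_wday == 0)."""
    names = []
    for day in range(4, 11):  # Jan 4..10, 1999 = Monday..Sunday
        t = time.struct_time((1999, 1, day, 0, 0, 0, day - 4, day, 0))
        names.append(time.strftime('%A', t))
    return names
```

[In the default C locale this yields 'Monday' through 'Sunday'; under another locale setting it yields that locale's names. The stdlib calendar module builds its day_name data by essentially the same trick.]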
As for whether that is the best solution, I think it is for the situation. Yes, I could roll all of this into strptime itself and make it a single monolithic function. The reason I did this was so that the object (named LocaleTime) could handle lazy evaluation for that info. That way you are not paying the price of having to recalculate the same information thousands of times (for instance if you are parsing a huge logfile). I also think it is helpful to have that info available separately from strptime since locale does not provide it. Since the locale information is not accessible any other way that I can come up with, the only other solution is to have someone enter all the locale-specific info by hand. I personally would rather put up with a more complicated strptime setup than have to worry about entering all of that info. -Brett C. From niemeyer@conectiva.com Tue Jun 18 00:45:15 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 17 Jun 2002 20:45:15 -0300 Subject: [Python-Dev] Python strptime In-Reply-To: References: Message-ID: <20020617204515.A10690@ibook.distro.conectiva> Brett, > Well, since locale info is not directly accessible for time-specific > things in Python (let alone in C in a standard way), I have to do multiple > calls to strftime to get the names of the weekdays. As for the strings > representing locale-specific date, time, and date/time representation I > have to go through and figure out the format of the output in order to extract > the format string used by strftime to create the string. Since it is in > pure Python and relies only on strftime and locale for its info, it works > on all systems. I have yet to have anyone say it doesn't work for them. [...] > I also think it is helpful to have that info available separately from > strptime since locale does not provide it. What kind of information are you looking for exactly? 
I'm not sure if this is available in every platform (it's standardized only by the "Single UNIX Specification" according to my man page), but depending on this issue, everything you're looking for is there: >>> locale.nl_langinfo(locale.D_FMT) '%m/%d/%y' You could also try loading a translation catalog in the target system, but that could be unportable as well. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From bac@OCF.Berkeley.EDU Tue Jun 18 00:49:05 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 16:49:05 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: <20020617204515.A10690@ibook.distro.conectiva> Message-ID: > What kind of information are you looking for exactly? I'm not sure if > this is available in every platform (it's standardized only by the > "Single UNIX Specification" according to my man page), but depending on > this issue, everything you're looking for is there: > > >>> locale.nl_langinfo(locale.D_FMT) > '%m/%d/%y' > > You could also try loading a translation catalog in the target system, > but that could be unportable as well. That is the type of info I am looking for, but it is not portable. Windows does not have this functionality to my knowledge. If it did it is stupid that it does not have strptime built-in. 
ANSI C, unfortunately, > does not provide a way to get this info directly. This is why I have to > get it from strftime. Well, providing strftime and not strptime is stupid already, following your point of view. :-) -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From jason@crash.org Tue Jun 18 01:14:21 2002 From: jason@crash.org (Jason L. Asbahr) Date: Mon, 17 Jun 2002 19:14:21 -0500 Subject: [Python-Dev] Playstation 2 and GameCube ports Message-ID: Pythonistas, During the past year, I did some work that involved porting Python 2.1 to the Sony Playstation 2 (professional developer system) and Nintendo GameCube platforms. Since it involved little code change to Python itself, I was wondering if there is interest in merging these changes into the main trunk? This would first involve bringing those ports up to speed with the current CVS trunk. But before that, a PEP? Cheers, Jason ______________________________________________________________________ Jason Asbahr jason@asbahr.com From pobrien@orbtech.com Tue Jun 18 01:30:41 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Mon, 17 Jun 2002 19:30:41 -0500 Subject: [Python-Dev] Playstation 2 and GameCube ports In-Reply-To: Message-ID: [Jason L. Asbahr] > > Pythonistas, > > During the past year, I did some work that involved porting Python 2.1 > to the Sony Playstation 2 (professional developer system) and Nintendo > GameCube platforms. Since it involved little code change to Python > itself, I was wondering if there is interest in merging these changes > into the main trunk? > > This would first involve bringing those ports up to speed with the > current CVS trunk. But before that, a PEP? That would certainly get my son's attention and might even get him started in programming. I wouldn't mind seeing your efforts written up in a PEP. What exactly can you accomplish with Python on one of these boxes? -- Patrick K. 
O'Brien Orbtech ----------------------------------------------- "Your source for Python software development." ----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From guido@python.org Tue Jun 18 02:24:50 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 21:24:50 -0400 Subject: [Python-Dev] Playstation 2 and GameCube ports In-Reply-To: Your message of "Mon, 17 Jun 2002 19:14:21 CDT." References: Message-ID: <200206180124.g5I1Oos13421@pcp02138704pcs.reston01.va.comcast.net> > During the past year, I did some work that involved porting Python 2.1 > to the Sony Playstation 2 (professional developer system) and Nintendo > GameCube platforms. Since it involved little code change to Python > itself, I was wondering if there is interest in merging these changes > into the main trunk? > > This would first involve bringing those ports up to speed with the > current CVS trunk. But before that, a PEP? I don't think this needs a PEP -- you can just submit the changes to the SF patch manager, assuming they are indeed small. Our neighbors Qove here in McLean might be interested in your work. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 02:42:36 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 21:42:36 -0400 Subject: [Python-Dev] Playstation 2 and GameCube ports In-Reply-To: Your message of "Mon, 17 Jun 2002 19:30:41 CDT." References: Message-ID: <200206180142.g5I1gbg13515@pcp02138704pcs.reston01.va.comcast.net> > That would certainly get my son's attention and might even get him started > in programming. I wouldn't mind seeing your efforts written up in a PEP. > What exactly can you accomplish with Python on one of these boxes? Don't you need a (costly) developers license in order to use this? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 02:46:35 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 21:46:35 -0400 Subject: [Python-Dev] Python strptime In-Reply-To: Your message of "Mon, 17 Jun 2002 15:27:23 PDT." References: Message-ID: <200206180146.g5I1kZn13557@pcp02138704pcs.reston01.va.comcast.net> > Well, since locale info is not directly accessible for time-specific > things in Python (let alone in C in a standard way), I have to do > multiple calls to strftime to get the names of the weekdays. I guess so -- the calendar module does the same (and then makes them available). --Guido van Rossum (home page: http://www.python.org/~guido/) From jafo@tummy.com Tue Jun 18 03:50:02 2002 From: jafo@tummy.com (Sean Reifschneider) Date: Mon, 17 Jun 2002 20:50:02 -0600 Subject: [Python-Dev] large file support In-Reply-To: <15630.18023.298708.670795@slothrop.zope.com>; from jeremy@zope.com on Mon, Jun 17, 2002 at 04:28:23PM -0400 References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> Message-ID: <20020617205002.A13413@tummy.com> On Mon, Jun 17, 2002 at 04:28:23PM -0400, Jeremy Hylton wrote: >On the platform I tried (apparently RH 7.1) it raises EOVERFLOW. I >can extend posixpath to treat that as "file exists" tomorrow. How about changing os.path.exists for posix to: def exists(path): return(os.access(path, os.F_OK)) I haven't done more than a few simple tests, but I believe that this would provide similar functionality without relying on os.stat not breaking. Plus, access is faster (on the order of 2x as fast stating a quarter million files on my laptop). 
Sean -- I have never been able to conceive how any rational being could propose happiness to himself from the exercise of power over others. -- Jefferson Sean Reifschneider, Inimitably Superfluous tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python From guido@python.org Tue Jun 18 04:02:42 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 23:02:42 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "Mon, 17 Jun 2002 20:50:02 MDT." <20020617205002.A13413@tummy.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> Message-ID: <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> > How about changing os.path.exists for posix to: > > def exists(path): > return(os.access(path, os.F_OK)) NO, NO, NOOOOOOO! access() does something different. It checks permissions as they would be for the effective user id. DO NOT USE access() TO CHECK FOR FILE PERMISSIONS UNLESS YOU HAVE A SET-UID MISSION! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jafo@tummy.com Tue Jun 18 04:07:07 2002 From: jafo@tummy.com (Sean Reifschneider) Date: Mon, 17 Jun 2002 21:07:07 -0600 Subject: [Python-Dev] large file support In-Reply-To: <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 17, 2002 at 11:02:42PM -0400 References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020617210707.C24702@tummy.com> On Mon, Jun 17, 2002 at 11:02:42PM -0400, Guido van Rossum wrote: >> How about changing os.path.exists for posix to: >> >> def exists(path): >> return(os.access(path, os.F_OK)) > >NO, NO, NOOOOOOO! > >access() does something different. It checks permissions as they F_OK checks to see if the file exists. Am I misunderstanding something in the following test: [2] guin:tmp# cd /tmp [2] guin:tmp# mkdir test [2] guin:tmp# chmod 700 test [2] guin:tmp# touch test/exists [2] guin:tmp# chmod 700 test/exists [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo access: 0 exists: 0 [2] guin:tmp# chmod 111 /tmp/test [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo access: 1 exists: 1 [2] guin:tmp# chmod 000 test/exists [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo access: 1 exists: 1 [2] guin:tmp# chmod 000 /tmp/test [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo access: 0 exists: 0 [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/noexists' jafo access: 0 exists: 0 [2] guin:tmp# chmod 777 /tmp/test [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/noexists' jafo access: 0 exists: 0 The above is run as root, with the su doing the test as non-root. 
The code in showaccess simply does an os.access and then an os.path.exists and displays the results: [2] guin:tmp# cat /tmp/showaccess #!/usr/bin/env python2 import os, sys print 'access: %d exists: %d' % ( os.access(sys.argv[1], os.F_OK), os.path.exists(sys.argv[1])) Sean -- /home is where your .heart is. -- Sean Reifschneider, 1999 Sean Reifschneider, Inimitably Superfluous tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python From guido@python.org Tue Jun 18 04:25:34 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Jun 2002 23:25:34 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "Mon, 17 Jun 2002 21:07:07 MDT." <20020617210707.C24702@tummy.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> Message-ID: <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net> > >> def exists(path): > >> return(os.access(path, os.F_OK)) > > > >NO, NO, NOOOOOOO! > > > >access() does something different. It checks permissions as they > > F_OK checks to see if the file exists. It is my understanding that if some directory along the path to the file is accessible to root but not to the effective user, access() for a file in that directory might return 0 while exists would return 1, on some operating systems. There's only one rule for access(): only use it if you have a set-uid mission. 
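[Guido's distinction can be restated with a plain, non-root sketch complementing Sean's root test above: os.access(path, os.F_OK) asks "can the *effective* uid reach this path?", while os.path.exists() asks "does stat() succeed?". For an ordinary file owned by the caller the two agree, which is exactly why the difference is easy to miss — they diverge only when real and effective uids differ, or when an intervening directory is opaque to the effective uid:]

```python
import os
import tempfile

def compare_checks(path):
    """Return (access-based answer, stat-based answer) for path.
    These can disagree under set-uid conditions; for a normal
    process they match."""
    return os.access(path, os.F_OK), os.path.exists(path)

fd, name = tempfile.mkstemp()
os.close(fd)
try:
    print(compare_checks(name))                # (True, True) for a plain file
    print(compare_checks(name + '.missing'))   # (False, False)
finally:
    os.remove(name)
```

[compare_checks is a made-up name for illustration; the interesting cases are the ones Sean's chmod transcript exercises as root.]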
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Tue Jun 18 05:14:18 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Tue, 18 Jun 2002 00:14:18 -0400 Subject: [Python-Dev] large file support In-Reply-To: References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> Message-ID: <15630.45978.675730.729531@slothrop.zope.com> >>>>> "MvL" == Martin v Loewis writes: MvL> jeremy@zope.com (Jeremy Hylton) writes: >> However, I'm still unhappy with one thing related to large file >> support. If you've got a Python that doesn't have large file >> support and you try os.path.exists() on a large file, it will >> return false. This is really bad! MvL> I believe this is a pilot error. On a system that supports MvL> large files, it is the administrator's job to make sure the MvL> Python installation has large file support enabled, otherwise, MvL> strange things may happen. We sure don't provide much help for such an administrator. (Happily, I am not one.) The instructions for Linux offer a configure recipe and say "it might work." If you build without large file support on a Linux system, the test suite gives no indication that something went wrong. So I think it is unreasonable to say the Python install is broken, despite the fact that it's possible to do better. MvL> So yes, it is bad, but no, it is not really bad. Feel free to MvL> fix it, but be prepared to include work-arounds in many other MvL> places, too. os.path.exists() is perhaps the most egregious. I think it's worth backporting the fix to the 2.1 branch, along with any other glaring errors. We might still see a 2.1.4. 
Jeremy From mgilfix@eecs.tufts.edu Tue Jun 18 05:26:28 2002 From: mgilfix@eecs.tufts.edu (Michael Gilfix) Date: Tue, 18 Jun 2002 00:26:28 -0400 Subject: [Python-Dev] test_socket failures In-Reply-To: <20020618000741.F17999@eecs.tufts.edu>; from mgilfix@eecs.tufts.edu on Tue, Jun 18, 2002 at 12:07:41AM -0400 References: <200206122059.g5CKxQa16372@odiug.zope.com> <20020612191355.C10542@eecs.tufts.edu> <200206130042.g5D0gZu10922@pcp02138704pcs.reston01.va.comcast.net> <20020612232545.B12119@eecs.tufts.edu> <200206131657.g5DGv0300386@odiug.zope.com> <20020613140908.E18170@eecs.tufts.edu> <200206131816.g5DIGSS03032@odiug.zope.com> <20020618000741.F17999@eecs.tufts.edu> Message-ID: <20020618002627.G17999@eecs.tufts.edu> Guido, please let me know if you want me to do anything more regarding the test_socket.py stuff and perhaps some timeout stuff before I take off (this wednsday). A note for the interested: I'll be gone for a month on vacation, so response won't be timely. Regards, -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html From martin@v.loewis.de Tue Jun 18 06:30:15 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 18 Jun 2002 07:30:15 +0200 Subject: [Python-Dev] Python strptime In-Reply-To: References: Message-ID: Brett Cannon writes: > Well, since locale info is not directly accessible for time-specific > things in Python (let alone in C in a standard way), I have to do multiple > calls to strftime to get the names of the weekdays. I wonder what the purpose of having a pure-Python implementation of strptime is, if you have to rely on strftime. Is this for Windows only? 
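The reliance in question boils down to a trick of roughly this shape (a sketch of the approach, not Brett's actual code): build struct_time values whose tm_wday runs over the week, and let strftime spell out the locale's names.

```python
import time

def locale_weekday_names():
    # Feed strftime hand-built time tuples whose tm_wday field runs
    # 0..6 (0 is Monday) and collect the locale's full weekday names.
    # A pure-Python strptime can use the same trick for month names,
    # AM/PM strings, and so on, since the C library exposes the
    # locale data through strftime but not directly.
    names = []
    for wday in range(7):
        t = time.struct_time((1999, 3, 17, 22, 44, 55, wday, 76, 0))
        names.append(time.strftime('%A', t))
    return names
```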
Regards, Martin From jafo@tummy.com Tue Jun 18 06:32:41 2002 From: jafo@tummy.com (Sean Reifschneider) Date: Mon, 17 Jun 2002 23:32:41 -0600 Subject: [Python-Dev] large file support In-Reply-To: <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 17, 2002 at 11:25:34PM -0400 References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020617233241.E24702@tummy.com> On Mon, Jun 17, 2002 at 11:25:34PM -0400, Guido van Rossum wrote: >It is my understanding that if some directory along the path to the >file is accessible to root but not to the effective user, access() for >a file in that directory might return 0 while exists would return 1, I would be shocked if POSIX allowed a non-root user to probe file entries under a root/700 directory... What a paradox -- when I submitted the patch to add F_OK, you said that exists() did the same thing. ;-) Sean -- "Your documents always look so good." "That's because I keep my laser-printer set on ``stun''." -- Sean Reifschneider, 1998 Sean Reifschneider, Inimitably Superfluous tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python From bac@OCF.Berkeley.EDU Tue Jun 18 06:32:45 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 22:32:45 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: <20020617205321.A11040@ibook.distro.conectiva> Message-ID: On Mon, 17 Jun 2002, Gustavo Niemeyer wrote: > > That is the type of info I am looking for, but it is not portable. 
> > Windows does not have this functionality to my knowledge. If it did it is > > stupid that it does not have strptime built-in. ANSI C, unfortunately, > > does not provide a way to get this info directly. This is why I have to > > get it from strftime. > > Well, providing strftime and not strptime is stupid already, following > your point of view. :-) > Yeah. =) Since it is just reversed it is not that difficult once you have the locale information. The most difficult part of this whole module was trying to come up with a way to get that information. > -- > Gustavo Niemeyer > -Brett C. From martin@v.loewis.de Tue Jun 18 06:36:07 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 18 Jun 2002 07:36:07 +0200 Subject: [Python-Dev] large file support In-Reply-To: <15630.45978.675730.729531@slothrop.zope.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <15630.45978.675730.729531@slothrop.zope.com> Message-ID: jeremy@zope.com (Jeremy Hylton) writes: > We sure don't provide much help for such an administrator. We do, but not in 2.1. > os.path.exists() is perhaps the most egregious. I think it's worth > backporting the fix to the 2.1 branch, along with any other glaring > errors. We might still see a 2.1.4. In that case, I recommend to backport the machinery that enables LFS from 2.2. If this machinery fails to detect LFS support on a system, there is a good chance that your processing of EOVERFLOW fails on that system as well. 
Regards, Martin From bac@OCF.Berkeley.EDU Tue Jun 18 06:39:37 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 22:39:37 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: <200206180146.g5I1kZn13557@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Mon, 17 Jun 2002, Guido van Rossum wrote: > > Well, since locale info is not directly accessible for time-specific > > things in Python (let alone in C in a standard way), I have to do > > multiple calls to strftime to get the names of the weekdays. > > I guess so -- the calendar module does the same (and then makes them > available). > Perhaps this info is important enough to not be in time but in locale? I could rework my code that figures out the date info to fit more into locale. Maybe have some constants (like A_WEEKDAY, F_WEEKDAY, etc.) that could be passed to a function that would return a list of the requested names? Or could stay with the way I currently have it and just have a class that stores all of that info and has named attributes to return the info? > --Guido van Rossum (home page: http://www.python.org/~guido/) -Brett C. From bac@OCF.Berkeley.EDU Tue Jun 18 06:53:58 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 17 Jun 2002 22:53:58 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: Message-ID: On 18 Jun 2002, Martin v. Loewis wrote: > Brett Cannon writes: > > > Well, since locale info is not directly accessible for time-specific > > things in Python (let alone in C in a standard way), I have to do multiple > > calls to strftime to get the names of the weekdays. > > I wonder what the purpose of having a pure-Python implementation of > strptime is, if you have to rely on strftime. Is this for Windows only? > The purpose is that strptime is not common across all platforms. As it stands now, it requires that the underlying C library support it. Since it is not specified in ANSI C, not all have it. glibc has it so most UNIX installs have it. 
But Windows doesn't. It is not meant specifically for Windows, but it happens to be the major OS that lacks it. But strftime is guaranteed by Python to be there since it is in ANSI C. The reason I have the reliance on strftime is because it was the only way I could think of to reliably gain access to locale information in regards for time. If I didn't try to figure out what the names of months were, I would not need strftime at all. But since I wanted this to be able to be a drop-in replacement, I decided it was worth my time to figure out how to get this locale info when it is not directly accessible. As for why it is in Python and not C, it's mainly because I prefer Python. =) I think it could be done in C, but it would be much more work. I also remember Guido saying somewhere that he would like to see modules in Python when possible. It was possible, so I did it in Python. > Regards, > Martin > > -Brett C. From pf@artcom-gmbh.de Tue Jun 18 08:27:13 2002 From: pf@artcom-gmbh.de (Peter Funk) Date: Tue, 18 Jun 2002 09:27:13 +0200 (CEST) Subject: [Python-Dev] Python strptime In-Reply-To: from Brett Cannon at "Jun 17, 2002 10:53:58 pm" Message-ID: Hi, Brett Cannon: [...] > The purpose is that strptime is not common across all platforms. As it > stands now, it requires that the underlying C library support it. Since > it is not specified in ANSI C, not all have it. glibc has it so most UNIX > installs have it. But Windows doesn't. It is not meant specifically for > Windows, but it happens to be the major OS that lacks it. [...] There is some relationship betweeen the time module and the calendar module, which is also a long time member of the Python standard library. If your new code becomes part of the standard library, please have a look at the calendar module and its documentation. At least crossreferencing pointers should be added. May be someone will come up with a "locale-awareness" patch to calendar? 
Currently month and weekday names are constants hardcoded in english in calendar.py. The stuff you wrote might help here. Regards, Peter -- Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260 office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany) From fredrik@pythonware.com Tue Jun 18 09:50:05 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 18 Jun 2002 10:50:05 +0200 Subject: [Python-Dev] Python strptime References: Message-ID: <01e801c216a5$257eb130$0900a8c0@spiff> brett wrote: > Might be a little while since I have never bothered to learn callouts > from C to Python. Guess I now have my personal project for the week. look for the "call" function in Modules/_sre.c (and how it's used throughout the module). From mwh@python.net Tue Jun 18 12:23:11 2002 From: mwh@python.net (Michael Hudson) Date: 18 Jun 2002 12:23:11 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects bufferobject.c,2.17,2.18 classobject.c,2.156,2.157 descrobject.c,2.26,2.27 funcobject.c,2.54,2.55 sliceobject.c,2.14,2.15 In-Reply-To: gvanrossum@users.sourceforge.net's message of "Fri, 14 Jun 2002 13:41:18 -0700" References: Message-ID: <2m3cvkde5c.fsf@starship.python.net> gvanrossum@users.sourceforge.net writes: > Index: classobject.c [...] > + static PyObject * > + new_class(PyObject* unused, PyObject* args) > + { > + PyObject *name; > + PyObject *classes; > + PyObject *dict; > + > + if (!PyArg_ParseTuple(args, "SO!O!:class", > + &name, > + &PyTuple_Type, &classes, > + &PyDict_Type, &dict)) > + return NULL; > + return PyClass_New(classes, dict, name); > + } > + What's this for? It's not referred to anywhere, so I'm getting warnings about it. I'd just hack it out, but it only just got added... Cheers, M. -- ARTHUR: Don't ask me how it works or I'll start to whimper. 
-- The Hitch-Hikers Guide to the Galaxy, Episode 11 From guido@python.org Tue Jun 18 12:43:56 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 07:43:56 -0400 Subject: [Python-Dev] Python strptime In-Reply-To: Your message of "18 Jun 2002 07:30:15 +0200." References: Message-ID: <200206181144.g5IBi3g29948@pcp02138704pcs.reston01.va.comcast.net> > I wonder what the purpose of having a pure-Python implementation of > strptime is, if you have to rely on strftime. Is this for Windows only? Isn't the problem that strftime() is in the C standard but strptime() is not? So strptime() isn't always provided but we can count on strftime()? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 12:50:09 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 07:50:09 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects bufferobject.c,2.17,2.18 classobject.c,2.156,2.157 descrobject.c,2.26,2.27 funcobject.c,2.54,2.55 sliceobject.c,2.14,2.15 In-Reply-To: Your message of "18 Jun 2002 12:23:11 BST." <2m3cvkde5c.fsf@starship.python.net> References: <2m3cvkde5c.fsf@starship.python.net> Message-ID: <200206181150.g5IBo9e30012@pcp02138704pcs.reston01.va.comcast.net> > > Index: classobject.c > [...] > > + static PyObject * > > + new_class(PyObject* unused, PyObject* args) [...] > > What's this for? It's not referred to anywhere, so I'm getting > warnings about it. I'd just hack it out, but it only just got > added... Looks like an experiment by Oren Tirosh that didn't get nuked. I think you can safely lose it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 12:56:15 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 07:56:15 -0400 Subject: [Python-Dev] Python strptime In-Reply-To: Your message of "Tue, 18 Jun 2002 09:27:13 +0200." 
References: Message-ID: <200206181156.g5IBuFI30101@pcp02138704pcs.reston01.va.comcast.net> > Currently month and weekday names are constants hardcoded in > english in calendar.py. No they're not. You're a year behind. ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 13:00:26 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 08:00:26 -0400 Subject: [Python-Dev] Python strptime In-Reply-To: Your message of "Mon, 17 Jun 2002 22:39:37 PDT." References: Message-ID: <200206181200.g5IC0Qu30146@pcp02138704pcs.reston01.va.comcast.net> > Perhaps this info is important enough to not be in time but in locale? Perhaps, if Martin von Loewis agrees. > I could rework my code that figures out the date info to fit more into > locale. Maybe have some constants (like A_WEEKDAY, F_WEEKDAY, etc.) that > could be passed to a function that would return a list of the requested > names? Or could stay with the way I currently have it and just have a > class that stores all of that info and has named attributes to return the > info? I haven't seen your code and have no time to review it until I'm back from vacation, so can't comment on this bit. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 13:01:56 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 08:01:56 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "18 Jun 2002 07:36:07 +0200." References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <15630.45978.675730.729531@slothrop.zope.com> Message-ID: <200206181201.g5IC1uM30164@pcp02138704pcs.reston01.va.comcast.net> > In that case, I recommend to backport the machinery that enables LFS > from 2.2. 
If this machinery fails to detect LFS support on a system, > there is a good chance that your processing of EOVERFLOW fails on that > system as well. That sounds a good plan, though painful (much configure.in hacking, and didn't we switch to a newer version of autoconf?). Can you help? 2.1 is still a popular release, and large files will become more and more common as it grows older... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jun 18 13:05:35 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 08:05:35 -0400 Subject: [Python-Dev] large file support In-Reply-To: Your message of "Mon, 17 Jun 2002 23:32:41 MDT." <20020617233241.E24702@tummy.com> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net> <20020617233241.E24702@tummy.com> Message-ID: <200206181205.g5IC5aE30187@pcp02138704pcs.reston01.va.comcast.net> > I would be shocked if POSIX allowed a non-root user to probe file > entries under a root/700 directory... Exactly. If a program is written to use access(), and subsequently that program is used in a setuid(root) situation, access() will say you can't access the file, but exists() will say it exists. So access() cannot be used to emulate exists() -- they serve different purposes, and can return different results. > What a paradox -- when I submitted the patch to add F_OK, you said that > exists() did the same thing. ;-) Given the widespread misunderstanding of what access() does, anything that makes using access() easier is a mistake IMO. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From Oleg Broytmann Tue Jun 18 15:51:27 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Tue, 18 Jun 2002 18:51:27 +0400 Subject: [Python-Dev] unicode() and its error argument In-Reply-To: <20020615190441.E12705@phd.pp.ru>; from phd@phd.pp.ru on Sat, Jun 15, 2002 at 07:04:41PM +0400 References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <20020615185842.D12705@phd.pp.ru> <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net> <20020615190441.E12705@phd.pp.ru> Message-ID: <20020618185127.R17532@phd.pp.ru> Hello! On Sat, Jun 15, 2002 at 07:04:41PM +0400, Oleg Broytmann wrote: > On Sat, Jun 15, 2002 at 11:05:12AM -0400, Guido van Rossum wrote: > > > I got the error very often (but I use encoding conversion much more > > > often than you). First time I saw it I was very surprized that neither > > > "ignore" nor "replace" can eliminate the error. > > > > Got an example? > > Not right now... I'll send it when I get one. Sorry for the false alarm. It was my fault. I used to write s = unicode(s, "cp1251").encode("koi8-r", "replace") where I need s = unicode(s, "cp1251", "replace").encode("koi8-r", "replace") ^^^^^^^^^ Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
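Oleg's fix, spelled with current bytes/str types (the sample cp1251 bytes are purely illustrative): the errors argument has to be passed at *both* hops of the conversion.

```python
raw = b'\xef\xf0\xe8\xe2\xe5\xf2'   # some cp1251-encoded text

# Incomplete: 'replace' only guards the encode step; an undecodable
# input byte would still raise UnicodeDecodeError in the first call.
koi = raw.decode('cp1251').encode('koi8-r', 'replace')

# Complete: guard the decode step as well.
koi = raw.decode('cp1251', 'replace').encode('koi8-r', 'replace')
```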
From jafo@tummy.com Tue Jun 18 15:53:08 2002 From: jafo@tummy.com (Sean Reifschneider) Date: Tue, 18 Jun 2002 08:53:08 -0600 Subject: [Python-Dev] large file support In-Reply-To: <200206181205.g5IC5aE30187@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Tue, Jun 18, 2002 at 08:05:35AM -0400 References: <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net> <20020617233241.E24702@tummy.com> <200206181205.g5IC5aE30187@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020618085308.I24702@tummy.com> On Tue, Jun 18, 2002 at 08:05:35AM -0400, Guido van Rossum wrote: >Given the widespread misunderstanding of what access() does, anything >that makes using access() easier is a mistake IMO. I obviously need to re-read my Posix reference. I've submitted a docstr and library documentation change for os.access() which should make it clear what the issues are... Sean -- You know you're in Canada when: You see a flyer advertising a polka-fest at the curling rink. Sean Reifschneider, Inimitably Superfluous tummy.com - Linux Consulting since 1995. 
Qmail, KRUD, Firewalls, Python From skip@pobox.com Tue Jun 18 16:21:07 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 18 Jun 2002 10:21:07 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15631.20451.97640.63785@localhost.localdomain> Guido> [Proposal to use SCons] Guido> Let's not tie ourselves to SCons before it's a lot more mature. I wasn't proposing that, at least not for the short-term. I was suggesting that distutils be left as is, and a SConstruct file be delivered in .../Modules, to be used manually by developers to update module .so files. 
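A rough sketch of what such an SConstruct could contain (hypothetical module and file names, untested here; requires SCons, which injects names like Environment into the file's namespace):

```python
# SConstruct -- hypothetical sketch for rebuilding one extension module
import distutils.sysconfig

env = Environment(CPPPATH=[distutils.sysconfig.get_python_inc()])

# Build spam.so from its sources; SCons rescans the C sources, so the
# module is rebuilt whenever they *or* any header they include change,
# which is exactly the dependency tracking distutils lacks.
env.SharedLibrary(target='spam', source=['spammodule.c'],
                  SHLIBPREFIX='', SHLIBSUFFIX='.so')
```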
Skip From skip@pobox.com Tue Jun 18 16:31:46 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 18 Jun 2002 10:31:46 -0500 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15630.518.652328.859842@slothrop.zope.com> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> <15630.518.652328.859842@slothrop.zope.com> Message-ID: <15631.21090.114087.860848@localhost.localdomain> SM> Instead, I see a potentially different approach. Write an scons SM> build file (typically named SConstruct) and deliver that in the SM> Modules directory. Jeremy> I don't care much about the Modules directory actually. I want Jeremy> this for third-party extensions that use distutils for Jeremy> distribution, particularly for my own third-party extensions Jeremy> :-). As I think has been hashed out here recently, there are two functions that need to be addressed. Distribution/installation of modules is fine with distutils as it currently sits. Jeremy> It sounds like you're proposing to drop distutils in favor of Jeremy> SCons, but not saying so explicitly. Is that right? No. 
Here, I'll put it in writing: I am explicitly not suggesting that distutils be dropped. I suggested that a SConstruct file be added to the modules directory to be used by people who need to do more than install modules. That's it. Jeremy> If so, we'd need to strong case for dumping distutils than Jeremy> automatic dependency tracking. If that isn't right, I don't Jeremy> understand how SCons and distutils meet in the middle. Would Jeremy> extension writers need to learn distutils and SCons? No. I'm only suggesting that a SConstruct file be added to the Modules directory. I don't want it tied into the build process, at least for the time being. As Guido indicated, scons is still in its infancy. Skip From jeremy@zope.com Tue Jun 18 16:36:01 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Tue, 18 Jun 2002 11:36:01 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: <15631.21090.114087.860848@localhost.localdomain> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> <15630.518.652328.859842@slothrop.zope.com> <15631.21090.114087.860848@localhost.localdomain> 
Message-ID: <15631.21345.360104.903661@slothrop.zope.com> >>>>> "SM" == Skip Montanaro writes: SM> No. I'm only suggesting that a SConstruct file be added to the SM> Modules directory. I don't want it tied into the build process, SM> at least for the time being. As Guido indicated, scons is still SM> in its infancy. Oh! That sounds fine with me. Jeremy From gcordova@hebmex.com Tue Jun 18 16:07:45 2002 From: gcordova@hebmex.com (Gustavo Cordova) Date: Tue, 18 Jun 2002 10:07:45 -0500 Subject: [Python-Dev] Playstation 2 and GameCube ports Message-ID: > > > That would certainly get my son's attention and might even > > get him started in programming. I wouldn't mind seeing your > > efforts written up in a PEP. > > What exactly can you accomplish with Python on one of these boxes? > > Don't you need a (costly) developers license in order to use this? > > --Guido van Rossum (home page: http://www.python.org/~guido/) > Nope, just hack away at it in PS2-Linux, quite nice; you can distribute your own games. Now, if there's a PyGame and PyGL for PS2, then I think that some very cool hacks and demos are gonna start appearing for the piss-two in a little time. :-) -gus From skip@pobox.com Tue Jun 18 16:44:34 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 18 Jun 2002 10:44:34 -0500 Subject: [Python-Dev] large file support In-Reply-To: References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> Message-ID: <15631.21858.140029.311208@localhost.localdomain> >> If you've got a Python that doesn't have large file support and you >> try os.path.exists() on a large file, it will return false. This is >> really bad! Martin> I believe this is a pilot error. On a system that supports large Martin> files, it is the administrator's job to make sure the Python Martin> installation has large file support enabled, otherwise, strange Martin> things may happen. 
What about a networked environment? If machine A without large file support mounts an NFS directory from machine B that does support large files, what should a program running on A see if it attempts to stat a large file? Sounds like the EOVERFLOW thing would come in handy here. Skip From skip@pobox.com Tue Jun 18 16:47:47 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 18 Jun 2002 10:47:47 -0500 Subject: [Python-Dev] Python strptime In-Reply-To: References: Message-ID: <15631.22051.656345.152081@localhost.localdomain> Brett, Have you looked at calendar.py? It already does locale-specific weekday and month names. Localizing that code into a single place seems like it would be a good idea? Skip From guido@python.org Tue Jun 18 16:49:02 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Jun 2002 11:49:02 -0400 Subject: [Python-Dev] addressing distutils inability to track file dependencies In-Reply-To: Your message of "Tue, 18 Jun 2002 10:21:07 CDT." <15631.20451.97640.63785@localhost.localdomain> References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> 
<200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net> <15631.20451.97640.63785@localhost.localdomain> Message-ID: <200206181549.g5IFn2N01924@odiug.zope.com> > Guido> Let's not tie ourselves to SCons before it's a lot more mature. > > I wasn't proposing that, at least not for the short-term. I was suggesting > that distutils be left as is, and a SConstruct file be delivered in > .../Modules, to be used manually by developers to update module .so files. I don't object to that, but it wouldn't do me any good. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Jun 18 16:39:54 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 18 Jun 2002 10:39:54 -0500 Subject: [Python-Dev] Python strptime In-Reply-To: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net> References: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15631.21578.51007.71767@localhost.localdomain> Guido> Can you submit (to the same patch item) a patch for timemodule.c Guido> that adds a callout to your Python strptime code when Guido> HAVE_STRPTIME is undefined? I thought the preferred way to do this would be to implement a Lib/time.py module that includes Brett's strptime() function, move Modules/timemodule.c to Modules/_timemodule.c and at the end of Lib/time.py import the symbols from _time. Skip From aahz@pythoncraft.com Tue Jun 18 17:47:35 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 18 Jun 2002 12:47:35 -0400 Subject: [Python-Dev] Quota on sf.net In-Reply-To: References: Message-ID: <20020618164735.GB5681@panix.com> On Fri, Jun 14, 2002, Ask Bjoern Hansen wrote: > On 10 Jun 2002, Martin v. Löwis wrote: >> >> My recommendation would be to disable the scipt, and remove the >> snapshots, perhaps leaving a page that anybody who wants the snapshots >> should ask at python-dev to re-enable them. 
> > feel free to refer people to; > > http://cvs.perl.org/snapshots/python/ > > I'll keep about half a weeks worth of 6 hourly snapshots there, like > we do for parrot at http://cvs.perl.org/snapshots/parrot/ Thanks! I've just replaced the link in the Dev Guide with your link, so the SF snapshots can be blown away any time. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From martin@v.loewis.de Tue Jun 18 17:56:17 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 18 Jun 2002 18:56:17 +0200 Subject: [Python-Dev] large file support In-Reply-To: <200206181201.g5IC1uM30164@pcp02138704pcs.reston01.va.comcast.net> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <15630.45978.675730.729531@slothrop.zope.com> <200206181201.g5IC1uM30164@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Can you help? 2.1 is still a popular release, and large files will > become more and more common as it grows older... I can work out a patch, but that may take some time. Regards, Martin From martin@v.loewis.de Tue Jun 18 18:04:05 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 18 Jun 2002 19:04:05 +0200 Subject: [Python-Dev] large file support In-Reply-To: <15631.21858.140029.311208@localhost.localdomain> References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <15631.21858.140029.311208@localhost.localdomain> Message-ID: Skip Montanaro writes: > What about a networked environment? If machine A without large file support > mounts an NFS directory from machine B that does support large files, what > should a program running on A see if it attempts to stat a large file? 
I would have to read the specs to answer this question correctly, but I believe the answer would go like this: case 1: Machine A only supports NFSv2, which does not support large files. When machine A accesses a large file on machine B (through the NFS GETATTR operation), it will see a truncated file. Notice that the exact behaviour depends on the NFSv2 implementation on machine B. case 2: Machine A supports NFSv3, and the client NFS implementation correctly recognizes the large file. Now, you say "A has no large file support". That could either mean that the syscalls don't support that, or that the C library doesn't support that. If the kernel does not support it, it may be that it does not define EOVERFLOW, either. Most likely, you will again see the truncated value. > Sounds like the EOVERFLOW thing would come in handy here. It's not our choice whether the operating system reports EOVERFLOW, or a truncated file. My guess is that you likely see a truncated file, but you would need to specify a precise combination of (client C lib, client OS, wire NFS version, server OS) to find out what really happens. My guess is that if the system is not aware of large files, it likely won't work "correctly" when it sees one, with Python having no way to influence the outcome. Regards, Martin From kbutler@campuspipeline.com Tue Jun 18 18:20:59 2002 From: kbutler@campuspipeline.com (Kevin Butler) Date: Tue, 18 Jun 2002 11:20:59 -0600 Subject: [Python-Dev] popen behavior Message-ID: <3D0F6BFB.5080602@campuspipeline.com> I've done some work implementing popen* for Jython, and had a couple of questions: - Should we maintain the os.popen & popen2.popen dual exposure with their different argument & return value orders? The 'os' exposure is newer, so I assume it is preferred. 
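For concreteness, the mismatch between the two return conventions can be captured in a tiny adapter (an illustrative helper, not part of either module):

```python
# Hypothetical helper: reorder a popen2.popen3()-style result tuple
# (child_stdout, child_stdin, child_stderr) into the os.popen3()-style
# order (child_stdin, child_stdout, child_stderr).
def popen2_to_os_order(result):
    child_stdout, child_stdin, child_stderr = result
    return child_stdin, child_stdout, child_stderr

print(popen2_to_os_order(('out', 'in', 'err')))  # -> ('in', 'out', 'err')
```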
The calls follow these patterns: stdin, stdout, stderr = os.popen*( command, mode, bufsize ) stdout, stdin, stderr = popen2.popen*( command, bufsize, mode ) - Should we maintain the different behavior for lists of arguments vs strings? (it does not appear to be documented) That is, the command can be either a string or a list of strings. If it is a list of strings, it is executed as a new process without a shell. If it is a string, CPython's popen2 module attempts to execute it as a shell command-line as follows: if isinstance(cmd, types.StringTypes): cmd = ['/bin/sh', '-c', cmd] Thanks kb From edcjones@erols.com Tue Jun 18 19:03:45 2002 From: edcjones@erols.com (Edward C. Jones) Date: Tue, 18 Jun 2002 14:03:45 -0400 Subject: [Python-Dev] MultiDict / Table, suggestion for a new module Message-ID: <3D0F7601.1040003@erols.com> I have written a module called MultiDict.py which can be found at http://members.tripod.com/~edcjones/MultiDict.py . It contains two classes MultiDict and Table. MultiDict is like a dictionary except that each key can occur more than once. It is like the multimap in the C++ Standard Template Library except that MultiDicts are hashed rather than sorted. Table views a 2 dimensional nested list as a "relation" (a set of n-tuples). I use it for simple one table databases where the full panoply of SQL is not needed. If there is interest in this, I will write a PEP and some documentation. Thanks, Edward C. Jones From bac@OCF.Berkeley.EDU Tue Jun 18 19:20:00 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 18 Jun 2002 11:20:00 -0700 (PDT) Subject: [Python-Dev] Python strptime In-Reply-To: <200206181144.g5IBi3g29948@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Tue, 18 Jun 2002, Guido van Rossum wrote: > > I wonder what the purpose of having a pure-Python implementation of > > strptime is, if you have to rely on strftime. Is this for Windows only? > > Isn't the problem that strftime() is in the C standard but strptime() > is not? 
So strptime() isn't always provided but we can count on > strftime()? > Exactly. For some reason ANSI decided to go to the trouble of requiring strftime(), and thus all of the locale info for it, but not strptime() nor a standard way to expose that locale info for the programmer to use. > --Guido van Rossum (home page: http://www.python.org/~guido/) -Brett C. From David Abrahams" I did a little experiment to see if I could use a uniform interface for slicing (from C++): >>> range(10)[slice(3,5)] Traceback (most recent call last): File "", line 1, in ? TypeError: sequence index must be integer >>> class Y(object): ... def __getslice__(self, a, b): ... print "getslice",a,b ... >>> y = Y() >>> y[slice(3,5)] Traceback (most recent call last): File "", line 1, in ? TypeError: unsubscriptable object >>> y[3:5] getslice 3 5 This seems to indicate that I can't, in general, pass a slice object to PyObject_GetItem in order to do slicing.** Correct? So I went looking around for alternatives to PyObject_GetItem. I found PySequence_GetSlice, but that takes int parameters, and AFAIK there's no rule saying you can't slice on strings, for example. Further experiments revealed: >>> y['hi':'there'] Traceback (most recent call last): File "", line 1, in ? TypeError: unsubscriptable object >>> class X(object): ... def __getitem__(self, x): ... print 'getitem',x ... >>> X()['hi':'there'] getitem slice('hi', 'there', None) So I /can/ slice on strings, but only through __getitem__(). And... >>> class Z(Y): ... def __getitem__(self, x): ... print 'getitem',x ... >>> Z()[3:5] getslice 3 5 >>> Z()['3':5] getitem slice('3', 5, None) So Python is doing some dispatching internally based on the types of the slice elements, but: >>> class subint(int): pass ... >>> subint() 0 >>> Z[subint():5] Traceback (most recent call last): File "", line 1, in ? TypeError: unsubscriptable object So it's looking at the concrete type of the slice elements. I'm not sure I actually understand how this one fails. 
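The dispatch rule these experiments point at can be emulated in pure Python. This is a sketch of the inferred rule only, not CPython's actual implementation, and the Legacy/Modern classes are made-up demo types:

```python
# Inferred rule: a[i:j] uses __getslice__ only when both bounds are
# plain ints AND the type defines __getslice__; otherwise a slice
# object is passed to __getitem__.
class Legacy:
    def __getslice__(self, i, j):
        return ('getslice', i, j)
    def __getitem__(self, key):
        return ('getitem', key)

class Modern:
    def __getitem__(self, key):
        return ('getitem', key)

def emulate_getslice(obj, start, finish):
    if (type(start) is int and type(finish) is int
            and hasattr(type(obj), '__getslice__')):
        return type(obj).__getslice__(obj, start, finish)
    return obj[slice(start, finish)]

print(emulate_getslice(Legacy(), 3, 5))    # -> ('getslice', 3, 5)
print(emulate_getslice(Legacy(), '3', 5))  # -> ('getitem', slice('3', 5, None))
print(emulate_getslice(Modern(), 3, 5))    # -> ('getitem', slice(3, 5, None))
```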
I want to make a generalized getslice function in C which can operate on a
triple of arbitrary objects. Here's the python version I came up with:

    def getslice(x, start, finish):
        if (type(start) is type(finish) is int
            and hasattr(type(x), '__getslice__')):
            return x.__getslice__(start, finish)
        else:
            return x.__getitem__(slice(start, finish))

Have I got the logic right here?

Thanks,
Dave

**it seems like a good idea to make it work in the Python core, by
recognizing slice objects and dispatching the elements to __getslice__ if
they are ints and if one is defined. Have I overlooked something?

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)
  O__ ==    Pythonista (http://www.python.org)
 c/ /'_ ==  resume: http://users.rcn.com/abrahams/resume.html
(*) \(*) == email: david.abrahams@rcn.com
+---------------------------------------------------------------+

From bac@OCF.Berkeley.EDU Tue Jun 18 19:22:44 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 18 Jun 2002 11:22:44 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206181156.g5IBuFI30101@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: 

On Tue, 18 Jun 2002, Guido van Rossum wrote:

> > Currently month and weekday names are constants hardcoded in
> > english in calendar.py.
>
> No they're not.  You're a year behind. ;-)
>
Didn't realize that; undocumented feature. I will change my code to use
calendar for getting the names of the days of the week and months. I still
have my code, though, that figures out the format strings for date, time,
and date/time if Martin wants to use that in locale.

> --Guido van Rossum (home page: http://www.python.org/~guido/)

-Brett C.

From guido@python.org Tue Jun 18 20:52:49 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 15:52:49 -0400
Subject: [Python-Dev] making dbmmodule still broken
Message-ID: <200206181952.g5IJqnL02042@odiug.zope.com>

On my 2yo Mandrake 8.1 (?)
system, when I do "make" in the latest CVS tree, I always get an error from building dbmmodule.c: [guido@odiug linux]$ make case $MAKEFLAGS in \ *-s*) CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py -q build;; \ *) CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py build;; \ esac running build running build_ext building 'dbm' extension gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/guido/python/dist/src/./Include -I/usr/local/include -I/home/guido/python/dist/src/Include -I/home/guido/python/dist/src/linux -c /home/guido/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o /home/guido/python/dist/src/Modules/dbmmodule.c:25: #error "No ndbm.h available!" running build_scripts [guido@odiug linux]$ There's an ndbm.h is in /usr/include/db1/ndbm.h Skip, didn't you change something in this area recently? I think it's still busted. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue Jun 18 21:44:50 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 18 Jun 2002 22:44:50 +0200 Subject: [Python-Dev] popen behavior In-Reply-To: <3D0F6BFB.5080602@campuspipeline.com> References: <3D0F6BFB.5080602@campuspipeline.com> Message-ID: Kevin Butler writes: > - Should we maintain the os.popen & popen2.popen dual exposure with > their different argument & return value orders? Certainly. Any change to that will break existing applications. > - Should we maintain the different behavior for lists of arguments vs > strings? (it does not appear to be documented) > > That is, the command can be either a string or a list of strings. If > it is a list of strings, it is executed as a new process without a > shell. 
If it is a string, CPython's popen2 module attempts to execute > it as a shell command-line as follows: > if isinstance(cmd, types.StringTypes): > cmd = ['/bin/sh', '-c', cmd] If you propose to extend argument processing for one of the functions so that passing the additional arguments in current releases produces an exception - then adding this extension would be desirable if that adds consistency. Regards, Martin From barry@zope.com Tue Jun 18 23:03:08 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 18 Jun 2002 18:03:08 -0400 Subject: [Python-Dev] making dbmmodule still broken References: <200206181952.g5IJqnL02042@odiug.zope.com> Message-ID: <15631.44572.417268.622961@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> On my 2yo Mandrake 8.1 (?) system, when I do "make" in the GvR> latest CVS tree, I always get an error from building GvR> dbmmodule.c: I just tried building from scratch on my three systems using "configure --with-pymalloc ; make test" - RH6.1 checking ndbm.h usability... no checking ndbm.h presence... no checking for ndbm.h... no checking gdbm/ndbm.h usability... yes checking gdbm/ndbm.h presence... yes checking for gdbm/ndbm.h... yes building 'dbm' extension gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o gcc -shared build/temp.linux-i686-2.3/dbmmodule.o -L/usr/local/lib -lndbm -o build/lib.linux-i686-2.3/dbm.so test_dbm succeeds Note that test_bsddb was skipped. No attempt was even made to compile the bsddb extension. - RH7.3 checking ndbm.h usability... no checking ndbm.h presence... no checking for ndbm.h... no checking gdbm/ndbm.h usability... yes checking gdbm/ndbm.h presence... yes checking for gdbm/ndbm.h... yes building 'dbm' extension gcc -g -Wall -Wstrict-prototypes -fPIC -I. 
-I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o gcc -shared build/temp.linux-i686-2.3/dbmmodule.o -L/usr/local/lib -lndbm -o build/lib.linux-i686-2.3/dbm.so test_dbm succeeds, as does test_bsddb - MD8.1 checking ndbm.h usability... no checking ndbm.h presence... no checking for ndbm.h... no checking gdbm/ndbm.h usability... no checking gdbm/ndbm.h presence... no checking for gdbm/ndbm.h... no building 'dbm' extension gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o /home/barry/projects/python/Modules/dbmmodule.c:25:2: #error "No ndbm.h available!" and test_dbm is skipped As with Guido, there is an ndbm.h in /usr/include/db1/ndbm.h Also, bsddbmodule seems to get build okay (i.e. no errors are reported), but test_bsddb gets skipped: building 'bsddb' extension gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/include/db3 -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o gcc -shared build/temp.linux-i686-2.3/bsddbmodule.o -L/usr/local/BerkeleyDB.3.3/lib -L/usr/local/lib -ldb-3.3 -o build/lib.linux-i686-2.3/bsddb.so [...] 
test_bsddb test test_bsddb skipped -- No module named bsddb - I can't at the moment test MD8.2 -Barry From niemeyer@conectiva.com Tue Jun 18 23:14:42 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Tue, 18 Jun 2002 19:14:42 -0300 Subject: [Python-Dev] mkdev, major, st_rdev, etc In-Reply-To: <20020615160831.A5440@ibook.distro.conectiva> References: <20020615160831.A5440@ibook.distro.conectiva> Message-ID: <20020618191442.A7425@ibook.distro.conectiva> > After thinking for a while, and doing some research about these > functions, I've changed my mind about the best way to implement > the needed functionality for tarfile. Maybe including major, > minor, and makedev is the best solution. Some of the issues I'm > considering: [...] > A patch providing these functions is available at > http://www.python.org/sf/569139 Can someone please review it and let me know what I have to change to get it in, or commit if everything is ok? I'd like to give Lars some feedback about it, so that he can finish his work on tarfile.py. Thank you! -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From kbutler@campuspipeline.com Wed Jun 19 01:37:06 2002 From: kbutler@campuspipeline.com (Kevin Butler) Date: Tue, 18 Jun 2002 18:37:06 -0600 Subject: [Python-Dev] popen behavior References: <3D0F6BFB.5080602@campuspipeline.com> Message-ID: <3D0FD232.8070207@campuspipeline.com> Martin v. Loewis wrote: > Kevin Butler writes: > >>- Should we maintain the os.popen & popen2.popen dual exposure with >>their different argument & return value orders? > > Certainly. Any change to that will break existing applications. Actually, it would just "fail to enable existing applications that currently don't work on Jython". :-) But if one or the other form is or will be deprecated in CPython, I probably wouldn't expose it in Jython at this point (TMTOWTDI, etc.) >>- Should we maintain the different behavior for lists of arguments vs >>strings? 
>> (it does not appear to be documented)

> If you propose to extend argument processing for one of the functions
> so that passing the additional arguments in current releases produces
> an exception - then adding this extension would be desirable if that
> adds consistency.

I'm not sure what you meant here. The inconsistency is as follows (Python
output below):

On both (all?) platforms:
    popen*( "cmd arg arg" )              executes cmd in a subshell
    popen( ["cmd", "arg", "arg"] )       fails
In win32:
    popen[234]( ["cmd", "arg", "arg"] )  fails
In posix:
    popen[234]( ["cmd", "arg", "arg"] )  runs cmd w/o a subshell

I consider the posix behavior more useful (especially on Java where we
can't always determine a useful shell for a platform), but where it isn't
documented and isn't supported in win32, I wasn't sure if I should support
it.

I think it would also be useful and more consistent to allow popen() to
accept an args list, which is currently not supported on either platform.
Should I allow this for Java? Should I spend time to make the Win32
functions accept the args lists?

kb

Python 2.1.1 (#1, Aug 25 2001, 04:19:08) [GCC 3.0.1] on sunos5
Type "copyright", "credits" or "license" for more information.
>>> from os import popen, popen2
>>> out = popen( "echo $USER" )
>>> out.read()
'kbutler\n'
>>> out = popen( ["echo", "$USER"] )
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: popen() argument 1 must be string, not list
>>> in_, out = popen2( ["echo", "$USER"] )
>>> out.read()
'$USER\n'
>>> in_, out = popen2( "echo $USER" )
>>> out.read()
'kbutler\n'
>>>

Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from os import popen, popen2
>>> out = popen( "echo %USERNAME%" )
>>> out.read()
'kbutler\n'
>>> out = popen( ["echo", "%USERNAME%"] )
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: popen() argument 1 must be string, not list
>>> in_, out = popen2( ["echo", "%USERNAME%"] )
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: popen2() argument 1 must be string, not list
>>> in_, out = popen2( "echo %USERNAME%" )
>>> out.read()
'kbutler\n'
>>>

From groups@crash.org Wed Jun 19 02:00:52 2002
From: groups@crash.org (Jason L. Asbahr)
Date: Tue, 18 Jun 2002 20:00:52 -0500
Subject: [Python-Dev] Playstation 2 and GameCube ports
In-Reply-To: <200206180142.g5I1gbg13515@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: 

Patrick K. O'Brien wrote:
> That would certainly get my son's attention and might even get him started
> in programming. I wouldn't mind seeing your efforts written up in a PEP.
> What exactly can you accomplish with Python on one of these boxes?

Guido wrote:
> Don't you need a (costly) developers license in order to use this?

One does in fact need a costly, though very attractive and sleek :-),
developer box from Sony to get the most out of Python on the PS2 as a
professional game developer. However, the hobbyist PS2/Linux upgrade kit
for the retail PS2 unit may be acquired for $200 and Python could be used
on that system as well. Info at http://playstation2-linux.com

As for what you can accomplish with Python in gaming, check out my papers
at http://www.asbahr.com/papers.html :-)

Cheers,
Jason

From paul@prescod.net Wed Jun 19 02:14:50 2002
From: paul@prescod.net (Paul Prescod)
Date: Tue, 18 Jun 2002 18:14:50 -0700
Subject: [Python-Dev] Playstation 2 and GameCube ports
References: 
Message-ID: <3D0FDB0A.EC53656@prescod.net>

"Jason L. Asbahr" wrote:
>
>...
>
> However, the hobbyist PS2/Linux upgrade kit for the retail PS2 unit
> may be acquired for $200 and Python could be used on that system
> as well. Info at http://playstation2-linux.com

What do you lose by going this route? Obviously if this was good enough
there would be no need for developer boxes nor (I'd guess) for a special
port of Python.
Paul Prescod From barry@zope.com Wed Jun 19 03:34:17 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 18 Jun 2002 22:34:17 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> Message-ID: <15631.60841.28978.492291@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> If you build the bsddb module on a Unix-like system (that is, SM> you use configure and setup.py to build the interpreter and it SM> attempts to build the bsddb module), please give the new patch SM> attached to SM> http://python.org/sf/553108 Skip, Apologies for taking so long to respond to this thread. I'm still digging out from my move. Basically what you have in cvs works great, except for one small necessary addition. If you build Berkeley DB from source, it's going to install it in something like /usr/local/BerkeleyDB.3.3 by default. Why they choose such a bizarre location, I don't know. The problem is that unless your sysadmin hacks ld.so.conf to add /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path, bsddbmodule.so won't be linked in such a way that it can actually resolve the symbols at run time. I don't think it's reasonable to require such system hacking to get the bsddb module to link properly, and I think we can do better. Here's a small patch to setup.py which should fix things in a portable way, at least for *nix systems. It sets the envar LD_RUN_PATH to the location that it found the Berkeley library, but only if that envar isn't already set. I've tested this on RH7.3 and MD8.1 -- all of which I have a from-source install of BerkeleyDB 3.3.11. Seems to work well for me. I'm still having build trouble on my RH6.1 system, but maybe it's just too old to worry about (I /really/ need to upgrade one of these days ;). 
-------------------- snip snip -------------------- building 'bsddb' extension gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/local/BerkeleyDB.3.3/include -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o In file included from /home/barry/projects/python/Modules/bsddbmodule.c:25: /usr/local/BerkeleyDB.3.3/include/db_185.h:171: parse error before `*' /usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: type defaults to `int' in declaration of `__db185_open' /usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: data definition has no type or storage class /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbhashobject': /home/barry/projects/python/Modules/bsddbmodule.c:74: warning: assignment from incompatible pointer type /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbbtobject': /home/barry/projects/python/Modules/bsddbmodule.c:124: warning: assignment from incompatible pointer type /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbrnobject': /home/barry/projects/python/Modules/bsddbmodule.c:182: warning: assignment from incompatible pointer type -------------------- snip snip -------------------- Sigh. Anyway thanks! You've improved the situation immensely. 
-Barry

-------------------- snip snip --------------------
Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.94
diff -u -r1.94 setup.py
--- setup.py	17 Jun 2002 17:55:30 -0000	1.94
+++ setup.py	19 Jun 2002 01:01:19 -0000
@@ -507,6 +507,13 @@
                 dblibs = [dblib]
                 raise found
         except found:
+            # A default source build puts Berkeley DB in something like
+            # /usr/local/Berkeley.3.3 and the lib dir under that isn't
+            # normally on LD_RUN_PATH, unless the sysadmin has hacked
+            # /etc/ld.so.conf.  Setting the envar should be portable across
+            # compilers and platforms.
+            if 'LD_RUN_PATH' not in os.environ:
+                os.environ['LD_RUN_PATH'] = dblib_dir
             if dbinc == 'db_185.h':
                 exts.append(Extension('bsddb', ['bsddbmodule.c'],
                                       library_dirs=[dblib_dir],

From barry@zope.com Wed Jun 19 03:38:36 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 18 Jun 2002 22:38:36 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
	<20020611203906.V6026@phd.pp.ru>
Message-ID: <15631.61100.561824.480935@anthem.wooz.org>

>>>>> "OB" == Oleg Broytmann writes:

    OB> Can I have two different modules simultaneously?  For
    OB> example, a module linked with db.1.85 plus a module linked
    OB> with db3.

I still think we may want to pull PyBSDDB into the standard distro, as a
way to provide BDB api's > 1.85. The question is, what would this new
module be called? I dislike "bsddb3" -- which I think PyBSDDB itself uses
-- because it links against BDB 4.0.

OTOH, PyBSDDB seems to be quite solid, so I think it's mature enough to
migrate into the Python distro. I'm cc'ing pybsddb-users for feedback.
-Barry

From skip@pobox.com Wed Jun 19 03:03:57 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 21:03:57 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
Message-ID: <15631.59021.96743.453588@localhost.localdomain>

I wrote:

    ... here's what I propose (and what changes I just made locally):

I forgot about one other change. In dbmmodule.c I #include <db.h> if
HAVE_BERKDB_H is defined:

    #if defined(HAVE_NDBM_H)
    #include <ndbm.h>
    #if defined(PYOS_OS2) && !defined(PYCC_GCC)
    static char *which_dbm = "ndbm";
    #else
    static char *which_dbm = "GNU gdbm";  /* EMX port of GDBM */
    #endif
    #elif defined(HAVE_GDBM_NDBM_H)
    #include <gdbm/ndbm.h>
    static char *which_dbm = "GNU gdbm";
    #elif defined(HAVE_BERKDB_H)
    #include <db.h>
    static char *which_dbm = "Berkeley DB";
    #else
    #error "No ndbm.h available!"
    #endif

Skip

From skip@pobox.com Wed Jun 19 02:58:47 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 20:58:47 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
Message-ID: <15631.58711.213506.701945@localhost.localdomain>

    Guido> On my 2yo Mandrake 8.1 (?) system, when I do "make" in the latest
    Guido> CVS tree, I always get an error from building dbmmodule.c:

Okay, in between ringing up credit card charges I took another look at
building the dbm module. I can't cvs diff at the moment, but here's what I
propose (and what changes I just made locally):

* Remove the ndbm.h and gdbm/ndbm.h checks from configure.in and run
  autoconf.
* Change the block of code in setup.py that checks for dbm libraries and
  include files to

    if platform not in ['cygwin']:
        if (self.compiler.find_library_file(lib_dirs, 'ndbm')
            and find_file("ndbm.h", inc_dirs, []) is not None):
            exts.append( Extension('dbm', ['dbmmodule.c'],
                                   define_macros=[('HAVE_NDBM_H',None)],
                                   libraries = ['ndbm'] ) )
        elif (self.compiler.find_library_file(lib_dirs, 'gdbm')
              and find_file("gdbm/ndbm.h", inc_dirs, []) is not None):
            exts.append( Extension('dbm', ['dbmmodule.c'],
                                   define_macros=[('HAVE_GDBM_NDBM_H',None)],
                                   libraries = ['gdbm'] ) )
        elif db_incs is not None:
            exts.append( Extension('dbm', ['dbmmodule.c'],
                                   library_dirs=[dblib_dir],
                                   include_dirs=db_incs,
                                   define_macros=[('HAVE_BERKDB_H',None),
                                                  ('DB_DBM_HSEARCH',None)],
                                   libraries=dblibs))

This does two things. It removes the else clause which would never have
worked (no corresponding include files would have been found). It also
performs the two include file tests I removed from configure.in and defines
the appropriate macros.

Building after making these changes I get gdbm. If I mv gdbm/ndbm.h out of
the way or comment out the first elif branch I get Berkeley DB. I don't
have an ndbm library on my system, so I can't exercise the first branch.

I think it would probably be a good idea to alert the person running make
what library the module will be linked with. Anyone else agree?

Skip

From skip@pobox.com Wed Jun 19 00:44:40 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 18:44:40 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
In-Reply-To: <200206181952.g5IJqnL02042@odiug.zope.com>
References: <200206181952.g5IJqnL02042@odiug.zope.com>
Message-ID: <15631.50664.251930.976552@localhost.localdomain>

    Guido> On my 2yo Mandrake 8.1 (?)
system, when I do "make" in the latest Guido> CVS tree, I always get an error from building dbmmodule.c: Guido> [guido@odiug linux]$ make Guido> case $MAKEFLAGS in \ Guido> *-s*) CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py -q build;; \ Guido> *) CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py build;; \ Guido> esac Guido> running build Guido> running build_ext Guido> building 'dbm' extension Guido> gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/guido/python/dist/src/./Include -I/usr/local/include -I/home/guido/python/dist/src/Include -I/home/guido/python/dist/src/linux -c /home/guido/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o Guido> /home/guido/python/dist/src/Modules/dbmmodule.c:25: #error "No ndbm.h available!" Guido> running build_scripts Guido> [guido@odiug linux]$ Guido> There's an ndbm.h is in /usr/include/db1/ndbm.h Guido> Skip, didn't you change something in this area recently? I think Guido> it's still busted. :-( Hmmm... Works on my Mandrake 8.1 system. I have the db2-devel-2.4.14-5mdk package installed, which provides /usr/lib/libndbm.{a,so}. Note that you probably don't want to use /usr/include/db1/ndbm.h because you will wind up using is the broken Berkeley DB 1.85 hash file implementation. One of the two major goals of the change I checked in recently was to deprecate BerkeleyDB 1.85. Do you not have an ndbm or gdbm implementation on your system? If you don't have an acceptable set of libraries/include files it shouldn't try building the module at all. It looks like the else: branch else: exts.append( Extension('dbm', ['dbmmodule.c']) ) should probably be removed. I'll take another look at the problem again Wednesday. I am offline at the moment and can't "cvs up" (the modem here at the North Beach Inn shares a phone line with the credit card machine and it's the dinner hour... 
:-) Skip From gerhard@bigfoot.de Wed Jun 19 03:48:06 2002 From: gerhard@bigfoot.de (Gerhard =?iso-8859-15?Q?H=E4ring?=) Date: Wed, 19 Jun 2002 04:48:06 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15631.60841.28978.492291@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> Message-ID: <20020619024806.GA7218@lilith.my-fqdn.de> * Barry A. Warsaw [2002-06-18 22:34 -0400]: > The problem is that unless your sysadmin hacks ld.so.conf to add > /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path, > bsddbmodule.so won't be linked in such a way that it can actually > resolve the symbols at run time. > [...] > os.environ['LD_RUN_PATH'] = dblib_dir I may be missing something here, but AFAIC that's what the library_dirs parameter in the Extension constructor of distutils is for. It basically sets the runtime library path at compile time using the "-R" linker option. Gerhard -- mail: gerhard bigfoot de registered Linux user #64239 web: http://www.cs.fhm.edu/~ifw00065/ OpenPGP public key id AD24C930 public key fingerprint: 3FCC 8700 3012 0A9E B0C9 3667 814B 9CAA AD24 C930 reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b'))) From barry@zope.com Wed Jun 19 04:24:09 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 18 Jun 2002 23:24:09 -0400 Subject: [Python-Dev] Re: making dbmmodule still broken References: <15631.58711.213506.701945@localhost.localdomain> Message-ID: <15631.63833.440127.405556@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> I think it would probably be a good idea to alert the person SM> running make what library the module will be linked with. SM> Anyone else agree? +1. The less guessing the builder has to do the better! -Barry From barry@zope.com Wed Jun 19 04:27:28 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 18 Jun 2002 23:27:28 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <20020619024806.GA7218@lilith.my-fqdn.de> Message-ID: <15631.64032.21870.910289@anthem.wooz.org> >>>>> "GH" =3D=3D Gerhard H=E4ring writes: GH> * Barry A. Warsaw [2002-06-18 22:34 -0400]: >> The problem is that unless your sysadmin hacks ld.so.conf to >> add /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run >> path, bsddbmodule.so won't be linked in such a way that it can >> actually resolve the symbols at run time. [...] >> os.environ['LD_RUN_PATH'] =3D dblib_dir GH> I may be missing something here, but AFAIC that's what the GH> library_dirs parameter in the Extension constructor of GH> distutils is for. It basically sets the runtime library path GH> at compile time using the "-R" linker option. Possibly (Greg?), but without setting that envar (or doing a less portable -Xlinker -Rblah trick) I'd end up with a build/lib.linux-i686-2.3/bsddb_failed.so which, if you ran ldd over it would show no resolution for libdb-3.3.so. Also, no -R or equivalent option showed up in the compilation output. -Barry From barry@zope.com Tue Jun 18 23:29:02 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue Jun 18 23:29:11 2002 Subject: [Python-Dev] PEP 292, Simpler String Substitutions I'm so behind on my email, that the anticipated flamefest will surely die down before I get around to reading it. Yet still, here is a new PEP. :) -Barry -------------------- snip snip -------------------- PEP: 292 Title: Simpler String Substitutions Version: $Revision: 1.2 $ Last-Modified: $Date: 2002/06/19 02:54:22 $ Author: barry@zope.com (Barry A. Warsaw) Status: Draft Type: Standards Track Created: 18-Jun-2002 Python-Version: 2.3 Post-History: 18-Jun-2002 Abstract This PEP describes a simpler string substitution feature, also known as string interpolation. 
This PEP is "simpler" in two respects: 1. Python's current string substitution feature (commonly known as %-substitutions) is complicated and error prone. This PEP is simpler at the cost of less expressiveness. 2. PEP 215 proposed an alternative string interpolation feature, introducing a new `$' string prefix. PEP 292 is simpler than this because it involves no syntax changes and has much simpler rules for what substitutions can occur in the string. Rationale Python currently supports a string substitution (a.k.a. string interpolation) syntax based on C's printf() % formatting character[1]. While quite rich, %-formatting codes are also quite error prone, even for experienced Python programmers. A common mistake is to leave off the trailing format character, e.g. the `s' in "%(name)s". In addition, the rules for what can follow a % sign are fairly complex, while the usual application rarely needs such complexity. A Simpler Proposal Here we propose the addition of a new string method, called .sub() which performs substitution of mapping values into a string with special substitution placeholders. These placeholders are introduced with the $ character. The following rules for $-placeholders apply: 1. $$ is an escape; it is replaced with a single $ 2. $identifier names a substitution placeholder matching a mapping key of "identifier". "identifier" must be a Python identifier as defined in [2]. The first non-identifier character after the $ character terminates this placeholder specification. 3. ${identifier} is equivalent to $identifier and for clarity, this is the preferred form. It is required for when valid identifier characters follow the placeholder but are not part of the placeholder, e.g. "${noun}ification". No other characters have special meaning. The .sub() method takes an optional mapping (e.g. dictionary) where the keys match placeholders in the string, and the values are substituted for the placeholders. 
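The three placeholder rules above can be exercised with a small, hypothetical regex-based sketch (this is not the PEP's reference implementation, which follows below, though the pattern mirrors the one used there):

```python
import re

# $$ escape, bare $identifier, or braced ${identifier} (rules 1-3)
dre = re.compile(r'\$(?:(\$)|([_a-z]\w*)|\{([_a-z]\w*)\})', re.I)

def demo_sub(template, mapping):
    def repl(m):
        if m.group(1):              # rule 1: $$ collapses to a literal $
            return '$'
        # rules 2 and 3: bare or braced identifier looked up in the mapping
        return str(mapping[m.group(2) or m.group(3)])
    return dre.sub(repl, template)

# ${noun} must be braced, because 'ification' would otherwise be
# swallowed into the identifier.
print(demo_sub('$$1 buys a ${noun}ification of $name',
               {'noun': 'class', 'name': 'Guido'}))
```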
For example: '${name} was born in ${country}'.sub({'name': 'Guido', 'country': 'the Netherlands'}) returns 'Guido was born in the Netherlands' The mapping argument is optional; if it is omitted then the mapping is taken from the locals and globals of the context in which the .sub() method is executed. For example: def birth(self, name): country = self.countryOfOrigin['name'] return '${name} was born in ${country}' birth('Guido') returns 'Guido was born in the Netherlands' Reference Implementation Here's a Python 2.2-based reference implementation. Of course the real implementation would be in C, would not require a string subclass, and would not be modeled on the existing %-interpolation feature. import sys import re dre = re.compile(r'(\$\$)|\$([_a-z]\w*)|\$\{([_a-z]\w*)\}', re.I) EMPTYSTRING = '' class dstr(str): def sub(self, mapping=None): # Default mapping is locals/globals of caller if mapping is None: frame = sys._getframe(1) mapping = frame.f_globals.copy() mapping.update(frame.f_locals) # Escape %'s s = self.replace('%', '%%') # Convert $name and ${name} to %(name)s parts = dre.split(s) for i in range(1, len(parts), 4): if parts[i] is not None: parts[i] = '$' elif parts[i+1] is not None: parts[i+1] = '%(' + parts[i+1] + ')s' else: parts[i+2] = '%(' + parts[i+2] + ')s' # Interpolate return EMPTYSTRING.join(filter(None, parts)) % mapping And here are some examples: s = dstr('${name} was born in ${country}') print s.sub({'name': 'Guido', 'country': 'the Netherlands'}) name = 'Barry' country = 'the USA' print s.sub() This will print "Guido was born in the Netherlands" followed by "Barry was born in the USA". Handling Missing Keys What should happen when one of the substitution keys is missing from the mapping (or the locals/globals namespace if no argument is given)? There are two possibilities: - We can simply allow the exception (likely a NameError or KeyError) to propagate. - We can return the original substitution placeholder unchanged.
An example of the first is: print dstr('${name} was born in ${country}').sub({'name': 'Bob'}) would raise: Traceback (most recent call last): File "sub.py", line 66, in ? print s.sub({'name': 'Bob'}) File "sub.py", line 26, in sub return EMPTYSTRING.join(filter(None, parts)) % mapping KeyError: country An example of the second is: print dstr('${name} was born in ${country}').sub({'name': 'Bob'}) would print: Bob was born in ${country} The PEP author would prefer the latter interpretation, although a case can be made for raising the exception instead. We could almost ignore the issue, since the latter example could be accomplished by passing in a "safe-dictionary" instead of a normal dictionary, like so: class safedict(dict): def __getitem__(self, key): try: return dict.__getitem__(self, key) except KeyError: return '${%s}' % key so that d = safedict({'name': 'Bob'}) print dstr('${name} was born in ${country}').sub(d) would print: Bob was born in ${country} The one place where this won't work is when no arguments are given to the .sub() method. .sub() wouldn't know whether to wrap locals/globals in a safedict or not. This ambiguity can be solved in several ways: - we could have a parallel method called .safesub() which always wrapped its argument in a safedict() - .sub() could take an optional keyword argument flag which indicates whether to wrap the argument in a safedict or not. - .sub() could take an optional keyword argument which is a callable that would get called with the original mapping and return the mapping to be used for the substitution. By default, this callable would be the identity function, but you could easily pass in the safedict constructor instead. BDFL proto-pronouncement: It should always raise a NameError when the key is missing. There may not be sufficient use case for soft failures in the no-argument version. Comparison to PEP 215 PEP 215 describes an alternate proposal for string interpolation.
Unlike that PEP, this one does not propose any new syntax for Python. All the proposed new features are embodied in a new string method. PEP 215 proposes a new string prefix representation such as $"" which signals to Python that a new type of string is present. $-strings would have to interact with the existing r-prefixes and u-prefixes, essentially doubling the number of string prefix combinations. PEP 215 also allows for arbitrary Python expressions inside the $-strings, so that you could do things like: import sys print $"sys = $sys, sys = $sys.modules['sys']" which would return sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)> It's generally accepted that the rules in PEP 215 are safe in the sense that they introduce no new security issues (see PEP 215, "Security Issues" for details). However, the rules are still quite complex, and make it more difficult to see what exactly is the substitution placeholder in the original $-string. By design, this PEP does not provide as much interpolation power as PEP 215, however it is expected that the no-argument version of .sub() allows at least as much power with no loss of readability. References [1] String Formatting Operations http://www.python.org/doc/current/lib/typesseq-strings.html [2] Identifiers and Keywords http://www.python.org/doc/current/ref/identifiers.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From ping@zesty.ca Wed Jun 19 05:36:51 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Tue, 18 Jun 2002 21:36:51 -0700 (PDT) Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206190329.g5J3TLZ22194@server1.lfw.org> Message-ID: On Tue, 18 Jun 2002, Barry A.
Warsaw wrote: > def birth(self, name): > country = self.countryOfOrigin['name'] > return '${name} was born in ${country}' > > birth('Guido') > > returns > > 'Guido was born in the Netherlands' I assume you in fact meant return '${name} was born in ${country}'.sub() for the third line above? > print s.sub({'name': 'Guido', > 'country': 'the Netherlands'}) Have you considered the possibility of accepting keyword arguments instead? They would be slightly more pleasant to write: print s.sub(name='Guido', country='the Netherlands') This is motivated because i imagine relative frequencies of use to be something like this: 1. sub() [most frequent] 2. sub(name=value, ...) [nearly as frequent] 3. sub(dictionary) [least frequent] If you decide to use keyword arguments, you can either allow both keyword arguments and a single dictionary argument, or you can just accept keyword arguments and people can pass in dictionaries using **. -- ?!ng From paul@prescod.net Wed Jun 19 05:46:33 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 18 Jun 2002 21:46:33 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> Message-ID: <3D100CA9.999E7B3F@prescod.net> "Barry A. Warsaw" wrote: > >... The mapping argument is optional; if it is omitted then the > mapping is taken from the locals and globals of the context in > which the .sub() method is executed. For example: > > def birth(self, name): > country = self.countryOfOrigin['name'] > return '${name} was born in ${country}' > > birth('Guido') You forgot the "implicit .sub"() feature. >... > - We can simply allow the exception (likely a NameError or > KeyError) to propagate. Explicit! > - We can return the original substitution placeholder unchanged. Silently guess??? Overall it isn't bad...it's a little weird to have a method that depends on sys._getframe(1) (or as the say in Tcl-land "upvar"). It may set a bad precedent... 
Paul Prescod From greg@cosc.canterbury.ac.nz Wed Jun 19 06:11:37 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 19 Jun 2002 17:11:37 +1200 (NZST) Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?) In-Reply-To: Message-ID: <200206190511.g5J5BbU16855@oma.cosc.canterbury.ac.nz> > 1xc00 "shows the bits" more clearly even in such > an easy case. Except that, if we're thinking in hex, it's not a 1-filled bit string, it's an F-filled bit string! So it should be Fxc00 :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@v.loewis.de Wed Jun 19 06:30:11 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 19 Jun 2002 07:30:11 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15631.61100.561824.480935@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > I still think we may want to pull PyBSDDB into the standard distro, as > a way to provide BDB api's > 1.85. The question is, what would this > new module be called? I dislike "bsddb3" -- which I think PyBSDDB > itself uses -- because it links against BDB 4.0. If this is just a question of naming, I recommend bsddb2 - not indicating the version of the database, but the version of the Python module. Regards, Martin From martin@v.loewis.de Wed Jun 19 06:37:08 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 19 Jun 2002 07:37:08 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15631.60841.28978.492291@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > Basically what you have in cvs works great, except for one small > necessary addition. If you build Berkeley DB from source, it's going > to install it in something like /usr/local/BerkeleyDB.3.3 by default. > Why they choose such a bizarre location, I don't know. > > The problem is that unless your sysadmin hacks ld.so.conf to add > /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path, > bsddbmodule.so won't be linked in such a way that it can actually > resolve the symbols at run time. I don't think it's reasonable to > require such system hacking to get the bsddb module to link properly, > and I think we can do better. > > Here's a small patch to setup.py which should fix things in a portable > way, at least for *nix systems. It sets the envar LD_RUN_PATH to the > location that it found the Berkeley library, but only if that envar > isn't already set. I dislike that change. Setting LD_RUN_PATH is the job of whoever is building the compiler, and should not be done by Python automatically. So far, the Python build process avoids adding any -R linker options, since it requires quite some insight into the specific installation to determine whether usage of that option is the right thing. If setup.py fails to build an extension correctly, it is the administrator's job to specify a correct build procedure in Modules/Setup. For that reason, I rather recommend to remove the magic that setup.py looks in /usr/local/Berkeley*, instead of adding more magic.
Regards, Martin From python@rcn.com Wed Jun 19 07:45:46 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 19 Jun 2002 02:45:46 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> Message-ID: <007f01c2175c$f1e768e0$f4d8accf@othello> From: "Barry A. Warsaw" > A Simpler Proposal > > Here we propose the addition of a new string method, called .sub() > which performs substitution of mapping values into a string with > special substitution placeholders. These placeholders are > introduced with the $ character. The following rules for > $-placeholders apply: > > 1. $$ is an escape; it is replaced with a single $ Hmm, some strings (at least in the spam I receive) contain $$$$$$. How about ${$}? > 2. $identifier names a substitution placeholder matching a mapping > key of "identifier". "identifier" must be a Python identifier > as defined in [2]. The first non-identifier character after > the $ character terminates this placeholder specification. +1 > > 3. ${identifier} is equivalent to $identifier and for clarity, > this is the preferred form. It is required for when valid > identifier characters follow the placeholder but are not part of > the placeholder, e.g. "${noun}ification". +1 > Handling Missing Keys > > What should happen when one of the substitution keys is missing > from the mapping (or the locals/globals namespace if no argument > is given)? There are two possibilities: > > - We can simply allow the exception (likely a NameError or > KeyError) to propagate. > > - We can return the original substitution placeholder unchanged. 
And/Or, - Leave placeholder unchanged unless default argument supplied: mystr.sub(mydict, undefined='***') # Fill unknowns with stars And/Or, - Raise an exception only if specified: mystr.sub(mydict, undefined=NameError) And/Or - Return a count of the number of missed substitutions: nummisses = mystr.sub(mydict) > BDFL proto-pronouncement: It should always raise a NameError when > the key is missing. There may not be sufficient use case for soft > failures in the no-argument version. I had written a miniature mail-merge program and learned that the NameError approach is a PITA. It makes sense if the mapping is defined inside the program; however, externally supplied mappings (like a mergelist) can be expected to have "holes" and launching exceptions makes it harder to recover than having a default behavior. The best existing Python comparison is the str.replace() method which does not bomb-out when the target string is not found. Raymond Hettinger From pf@artcom-gmbh.de Wed Jun 19 07:50:50 2002 From: pf@artcom-gmbh.de (Peter Funk) Date: Wed, 19 Jun 2002 08:50:50 +0200 (CEST) Subject: [Python-Dev] Python strptime In-Reply-To: <200206181156.g5IBuFI30101@pcp02138704pcs.reston01.va.comcast.net> from Guido van Rossum at "Jun 18, 2002 07:56:15 am" Message-ID: Guido van Rossum: > > Currently month and weekday names are constants hardcoded in > > english in calendar.py. > > No they're not. You're a year behind. ;-) Oupppsss! Sorry. I must admit I looked at the most recent documentation and not at the most recent source code and so I missed the clever patch written by Denis S. Otkidach. Obviously there are still some more \versionchanged{} missing in python/dist/src/Doc/lib/libcalendar.tex. I will see if I can provide another doc patch.
Regards, Peter -- Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260 office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany) From fredrik@pythonware.com Wed Jun 19 08:05:00 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 09:05:00 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> Message-ID: <003701c2175f$b219c340$ced241d5@hagrid> Barry wrote: > def birth(self, name): > country = self.countryOfOrigin['name'] > return '${name} was born in ${country}'.sub() now explain why the above is a vast improvement over: def birth(self, name): country = self.countryOfOrigin['name'] return join(name, ' was born in ', country) (for extra bonus, explain how sub() can be made to execute substantially faster than a join() function) From oren-py-d@hishome.net Wed Jun 19 08:36:57 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 19 Jun 2002 03:36:57 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <007f01c2175c$f1e768e0$f4d8accf@othello> References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello> Message-ID: <20020619073657.GA25541@hishome.net> On Wed, Jun 19, 2002 at 02:45:46AM -0400, Raymond Hettinger wrote: > > Handling Missing Keys > > > > What should happen when one of the substitution keys is missing > > from the mapping (or the locals/globals namespace if no argument > > is given)? There are two possibilities: > > > > - We can simply allow the exception (likely a NameError or > > KeyError) to propagate. > > > > - We can return the original substitution placeholder unchanged. 
> > And/Or, > - Leave placeholder unchanged unless default argument supplied: > mystr.sub(mydict, undefined='***') # Fill unknowns with > stars > And/Or, > - Raise an exception only if specified: > mystr.sub(mydict, undefined=NameError) > And/Or > - Return a count of the number of missed substitutions: > nummisses = mystr.sub(mydict) > > > BDFL proto-pronouncement: It should always raise a NameError when > > the key is missing. There may not be sufficient use case for soft > > failures in the no-argument version. > > I had written a minature mail-merge program and learned that the NameError > approach is a PITA. It makes sense if the mapping is defined inside the > program; Exceptions are *supposed* to be a PITA in order to make sure they are hard to ignore. +1 on optional argument for default value. -1 on not raising exception for missing name. I think the best approach might be to raise NameError exception *unless* a default argument is passed. The number of misses cannot be returned by this method - it returns the new string. Oren From oren-py-d@hishome.net Wed Jun 19 08:51:21 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 19 Jun 2002 03:51:21 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <003701c2175f$b219c340$ced241d5@hagrid> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> Message-ID: <20020619075121.GB25541@hishome.net> On Wed, Jun 19, 2002 at 09:05:00AM +0200, Fredrik Lundh wrote: > Barry wrote: > > > def birth(self, name): > > country = self.countryOfOrigin['name'] > > return '${name} was born in ${country}'.sub() > > now explain why the above is a vast improvement over: > > def birth(self, name): > country = self.countryOfOrigin['name'] > return join(name, ' was born in ', country) Assuming join = lambda *args: ''.join(map(str, args)) 1. Friendly for people coming from other languages (Perl/shell). Same reason why the != operator was added as an alternative to <>. 2. 
Less quotes and commas for the terminally lazy. 3. More flexible for data-driven use. Either the template or the dictionary can be data rather than hard-wired into the code. Oren From martin@strakt.com Wed Jun 19 09:33:11 2002 From: martin@strakt.com (Martin Sjögren) Date: Wed, 19 Jun 2002 10:33:11 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <20020619075121.GB25541@hishome.net> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> Message-ID: <20020619083311.GA1011@ratthing-b3cf> On Wed, Jun 19, 2002 at 03:51:21AM -0400, Oren Tirosh wrote: > On Wed, Jun 19, 2002 at 09:05:00AM +0200, Fredrik Lundh wrote: > > Barry wrote: > > > > > def birth(self, name): > > > country = self.countryOfOrigin['name'] > > > return '${name} was born in ${country}'.sub() > > > > now explain why the above is a vast improvement over: > > > > def birth(self, name): > > country = self.countryOfOrigin['name'] > > return join(name, ' was born in ', country) > > Assuming join = lambda *args: ''.join(map(str, args)) > > 1. Friendly for people coming from other languages (Perl/shell). Same reason > why the != operator was added as an alternative to <>. > > 2. Less quotes and commas for the terminally lazy. > > 3. More flexible for data-driven use. Either the template or the > dictionary can be data rather than hard-wired into the code. But what about >>> '%(name)s was born in %(country)s' % {'name':'Guido', 'country':'the Netherlands'} 'Guido was born in the Netherlands' >>> name = 'Martin' >>> country = 'Sweden' >>> '%(name)s was born in %(country)s' % globals() 'Martin was born in Sweden' What's the advantage of using ${name} and ${country} instead?
/Martin -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 7710870 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From fredrik@pythonware.com Wed Jun 19 09:53:02 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 10:53:02 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> Message-ID: <006501c2176e$b9dbb3e0$0900a8c0@spiff> oren wrote: > 1. Friendly for people coming from other languages (Perl/shell). > > 2. Less quotes and commas for the terminally lazy. > > 3. More flexible for data-driven use. Either the template or the > dictionary can be data rather than hard-wired into the code. combine 1, 2, and 3 with _getframe(), and you have a feature that crackers are going to love... From duncan@rcp.co.uk Wed Jun 19 10:34:24 2002 From: duncan@rcp.co.uk (Duncan Booth) Date: Wed, 19 Jun 2002 10:34:24 +0100 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> Message-ID: <09342475030690@aluminium.rcp.co.uk> On 19 Jun 2002, Martin Sjögren wrote: > But what about > >>>> '%(name)s was born in %(country)s' % {'name':'Guido', > 'country':'the Netherlands'} > 'Guido was born in the Netherlands' >>>> name = 'Martin' >>>> country = 'Sweden' >>>> '%(name)s was born in %(country)s' % globals() > 'Martin was born in Sweden' > > What's the advantage of using ${name} and ${country} instead? Presumably it looks more natural to people experienced in shell programming or Perl---at the expense of losing the ability to format field widths and alignments of course (so should we have regexes delimited by '/' next?).
Personally I can't see the need for a second form of string interpolation, but since it comes up so often somebody must feel it is significantly superior to the existing system. What I really don't understand is why there is such pressure to get an alternative interpolation added as methods to str & unicode rather than just adding an interpolation module to the library? e.g. from interpolation import sub def birth(self, name): country = self.countryOfOrigin['name'] return sub('${name} was born in ${country}', vars()) I added in the explicit vars() parameter because the idea of a possibly unknown template string picking up arbitrary variables is, IMHO, a BAD idea. If it were a library module then it would probably also make sense to define a wrapper object constructed from a sequence that would do the interpolation when called: e.g. >>> message = interpolation.template('${name} was born in ${country}') >>> print message(name='Duncan', country='Scotland') Duncan was born in Scotland Putting it in a separate module would also give more scope for providing minor variations on the theme, for example the default should be to throw a NameError for missing variables, but you could have another function wrapping the basic one that substituted in a default value instead. -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? 
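Duncan's library-module idea is easy to prototype. The sketch below uses his hypothetical `interpolation` names (`sub` and `template`) with the PEP's placeholder syntax; as he prefers, the mapping is explicit and missing keys raise KeyError rather than being guessed silently:

```python
import re

# $$ escape, bare $identifier, or braced ${identifier}
_dre = re.compile(r'\$(?:(\$)|([_a-z]\w*)|\{([_a-z]\w*)\})', re.I)

def sub(string, mapping):
    # Explicit mapping only -- no frame magic, per Duncan's objection
    # to templates silently picking up arbitrary variables.
    def repl(m):
        if m.group(1):
            return '$'
        return str(mapping[m.group(2) or m.group(3)])
    return _dre.sub(repl, string)

class template:
    """Wrapper that interpolates keyword arguments when called."""
    def __init__(self, string):
        self.string = string

    def __call__(self, **mapping):
        return sub(self.string, mapping)

message = template('${name} was born in ${country}')
print(message(name='Duncan', country='Scotland'))  # Duncan was born in Scotland
```

Variations such as a default value for missing keys could then be layered on in the module without touching str or unicode at all.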
From fredrik@pythonware.com Wed Jun 19 11:01:36 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 12:01:36 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> Message-ID: <018c01c21778$4d374f60$0900a8c0@spiff> duncan wrote: > What I really don't understand is why there is such pressure to get an > alternative interpolation added as methods to str & unicode rather than > just adding an interpolation module to the library? > e.g. > > from interpolation import sub > def birth(self, name): > country = self.countryOfOrigin['name'] > return sub('${name} was born in ${country}', vars()) that's too easy, of course ;-) especially since someone has already added such a method, a long time ago (os.path.expandvars). and there's already an interpolation engine in there; barry's loop, join and format stuff can be replaced with a simple callback, and a call to sre.sub with the right pattern: def sub(string, mapping): def repl(m, mapping=mapping): return mapping[m.group(m.lastindex)] return sre.sub(A_PATTERN, repl, string) (if you like lambdas, you can turn this into a one-liner) maybe it would be sufficient to add a number of "right patterns" to the standard library... From walter@livinglogic.de Wed Jun 19 11:07:36 2002 From: walter@livinglogic.de (Walter Dörwald) Date: Wed, 19 Jun 2002 12:07:36 +0200 Subject: [Python-Dev] PEP 293, Codec Error Handling Callbacks Message-ID: <3D1057E8.9090200@livinglogic.de> Here's another new PEP.
Bye, Walter Dörwald ---------------------------------------------------------------------- PEP: 293 Title: Codec Error Handling Callbacks Version: $Revision: 1.1 $ Last-Modified: $Date: 2002/06/19 03:22:11 $ Author: Walter Dörwald Status: Draft Type: Standards Track Created: 18-Jun-2002 Python-Version: 2.3 Post-History: Abstract This PEP aims at extending Python's fixed codec error handling schemes with a more flexible callback based approach. Python currently uses a fixed error handling for codec error handlers. This PEP describes a mechanism which allows Python to use function callbacks as error handlers. With these more flexible error handlers it is possible to add new functionality to existing codecs by e.g. providing fallback solutions or different encodings for cases where the standard codec mapping does not apply. Specification Currently the set of codec error handling algorithms is fixed to either "strict", "replace" or "ignore" and the semantics of these algorithms is implemented separately for each codec. The proposed patch will make the set of error handling algorithms extensible through a codec error handler registry which maps handler names to handler functions. This registry consists of the following two C functions: int PyCodec_RegisterError(const char *name, PyObject *error) PyObject *PyCodec_LookupError(const char *name) and their Python counterparts codecs.register_error(name, error) codecs.lookup_error(name) PyCodec_LookupError raises a LookupError if no callback function has been registered under this name. Similar to the encoding name registry there is no way of unregistering callback functions or iterating through the available functions. The callback functions will be used in the following way by the codecs: when the codec encounters an encoding/decoding error, the callback function is looked up by name, the information about the error is stored in an exception object and the callback is called with this object. 
The callback returns information about how to proceed (or raises an exception). For encoding, the exception object will look like this: class UnicodeEncodeError(UnicodeError): def __init__(self, encoding, object, start, end, reason): UnicodeError.__init__(self, "encoding '%s' can't encode characters " + "in positions %d-%d: %s" % (encoding, start, end-1, reason)) self.encoding = encoding self.object = object self.start = start self.end = end self.reason = reason This type will be implemented in C with the appropriate setter and getter methods for the attributes, which have the following meaning: * encoding: The name of the encoding; * object: The original unicode object for which encode() has been called; * start: The position of the first unencodable character; * end: (The position of the last unencodable character)+1 (or the length of object, if all characters from start to the end of object are unencodable); * reason: The reason why object[start:end] couldn't be encoded. If object has consecutive unencodable characters, the encoder should collect those characters for one call to the callback if those characters can't be encoded for the same reason. The encoder is not required to implement this behaviour but may call the callback for every single character, but it is strongly suggested that the collecting method is implemented. The callback must not modify the exception object. If the callback does not raise an exception (either the one passed in, or a different one), it must return a tuple: (replacement, newpos) replacement is a unicode object that the encoder will encode and emit instead of the unencodable object[start:end] part, newpos specifies a new position within object, where (after encoding the replacement) the encoder will continue encoding. If the replacement string itself contains an unencodable character the encoder raises the exception object (but may set a different reason string before raising). 
Should further encoding errors occur, the encoder is allowed to reuse the exception object for the next call to the callback. Furthermore the encoder is allowed to cache the result of codecs.lookup_error. If the callback does not know how to handle the exception, it must raise a TypeError. Decoding works similar to encoding with the following differences: The exception class is named UnicodeDecodeError and the attribute object is the original 8bit string that the decoder is currently decoding. The decoder will call the callback with those bytes that constitute one undecodable sequence, even if there is more than one undecodable sequence that is undecodable for the same reason directly after the first one. E.g. for the "unicode-escape" encoding, when decoding the illegal string "\\u00\\u01x", the callback will be called twice (once for "\\u00" and once for "\\u01"). This is done to be able to generate the correct number of replacement characters. The replacement returned from the callback is a unicode object that will be emitted by the decoder as-is without further processing instead of the undecodable object[start:end] part. There is a third API that uses the old strict/ignore/replace error handling scheme: PyUnicode_TranslateCharmap/unicode.translate The proposed patch will enhance PyUnicode_TranslateCharmap, so that it also supports the callback registry. This has the additional side effect that PyUnicode_TranslateCharmap will support multi-character replacement strings (see SF feature request #403100 [1]). For PyUnicode_TranslateCharmap the exception class will be named UnicodeTranslateError. PyUnicode_TranslateCharmap will collect all consecutive untranslatable characters (i.e. those that map to None) and call the callback with them. The replacement returned from the callback is a unicode object that will be put in the translated result as-is, without further processing. 
All encoders and decoders are allowed to implement the callback functionality themselves if they recognize the callback name (i.e. if it is a system callback like "strict", "replace" and "ignore"). The proposed patch will add two additional system callback names: "backslashreplace" and "xmlcharrefreplace", which can be used for encoding and translating and which will also be implemented natively for all encoders and for PyUnicode_TranslateCharmap.

The Python equivalent of these five callbacks will look like this:

    def strict(exc):
        raise exc

    def ignore(exc):
        if isinstance(exc, UnicodeError):
            return (u"", exc.end)
        else:
            raise TypeError("can't handle %s" % exc.__class__.__name__)

    def replace(exc):
        if isinstance(exc, UnicodeEncodeError):
            return ((exc.end-exc.start)*u"?", exc.end)
        elif isinstance(exc, UnicodeDecodeError):
            return (u"\ufffd", exc.end)
        elif isinstance(exc, UnicodeTranslateError):
            return ((exc.end-exc.start)*u"\ufffd", exc.end)
        else:
            raise TypeError("can't handle %s" % exc.__class__.__name__)

    def backslashreplace(exc):
        if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
            s = u""
            for c in exc.object[exc.start:exc.end]:
                if ord(c) <= 0xff:
                    s += u"\\x%02x" % ord(c)
                elif ord(c) <= 0xffff:
                    s += u"\\u%04x" % ord(c)
                else:
                    s += u"\\U%08x" % ord(c)
            return (s, exc.end)
        else:
            raise TypeError("can't handle %s" % exc.__class__.__name__)

    def xmlcharrefreplace(exc):
        if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
            s = u""
            for c in exc.object[exc.start:exc.end]:
                s += u"&#%d;" % ord(c)
            return (s, exc.end)
        else:
            raise TypeError("can't handle %s" % exc.__class__.__name__)

These five callback handlers will also be accessible to Python as codecs.strict_error, codecs.ignore_error, codecs.replace_error, codecs.backslashreplace_error and codecs.xmlcharrefreplace_error.

Rationale

Most legacy encodings do not support the full range of Unicode characters. For these cases many high level protocols support a way of escaping a Unicode character (e.g.
Python itself supports the \x, \u and \U convention, XML supports character references via &#xxx; etc.). When implementing such an encoding algorithm, a problem with the current implementation of the encode method of Unicode objects becomes apparent: to determine which characters are unencodable by a certain encoding, every single character has to be tried, because encode does not provide any information about the location of the error(s). So

    # (1)
    us = u"xxx"
    s = us.encode(encoding)

has to be replaced by

    # (2)
    us = u"xxx"
    v = []
    for c in us:
        try:
            v.append(c.encode(encoding))
        except UnicodeError:
            v.append("&#%d;" % ord(c))
    s = "".join(v)

This slows down encoding dramatically, as the loop through the string is now done in Python code and no longer in C code. Furthermore, this solution poses problems with stateful encodings. For example, UTF-16 uses a Byte Order Mark at the start of the encoded byte string to specify the byte order. Using (2) with UTF-16 results in an 8-bit string with a BOM between every character.
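The BOM problem is easy to demonstrate (a sketch in modern Python; codecs.BOM is the native-order UTF-16 byte order mark):

```python
import codecs

us = u"ab"
whole = us.encode("utf-16")                           # one BOM, at the start
piecewise = b"".join(c.encode("utf-16") for c in us)  # one BOM per character

assert whole.count(codecs.BOM) == 1
assert piecewise.count(codecs.BOM) == len(us)
```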
To work around this problem, a stream writer - which keeps state between calls to the encoding function - has to be used:

    # (3)
    us = u"xxx"
    import codecs, cStringIO as StringIO

    writer = codecs.getwriter(encoding)

    v = StringIO.StringIO()
    uv = writer(v)
    for c in us:
        try:
            uv.write(c)
        except UnicodeError:
            uv.write(u"&#%d;" % ord(c))
    s = v.getvalue()

To compare the speed of (1) and (3) the following test script has been used:

    # (4)
    import time
    us = u"äa"*1000000
    encoding = "ascii"
    import codecs, cStringIO as StringIO

    t1 = time.time()
    s1 = us.encode(encoding, "replace")
    t2 = time.time()

    writer = codecs.getwriter(encoding)
    v = StringIO.StringIO()
    uv = writer(v)
    for c in us:
        try:
            uv.write(c)
        except UnicodeError:
            uv.write(u"?")
    s2 = v.getvalue()
    t3 = time.time()

    assert(s1 == s2)
    print "1:", t2-t1
    print "2:", t3-t2
    print "factor:", (t3-t2)/(t2-t1)

On Linux this gives the following output (with Python 2.3a0):

    1: 0.274321913719
    2: 51.1284689903
    factor: 186.381278466

i.e. (3) is about 185 times slower than (1).

Callbacks must be stateless, because as soon as a callback is registered it is available globally and can be called by multiple encode() calls. To be able to use stateful callbacks, the errors parameter for encode/decode/translate would have to be changed from char * to PyObject *, so that the callback could be used directly, without the need to register the callback globally. As this requires changes to lots of C prototypes, this approach was rejected.

Currently all encoding/decoding functions have arguments

    const Py_UNICODE *p, int size

or

    const char *p, int size

to specify the unicode characters/8-bit characters to be encoded/decoded. So in case of an error the codec has to create a new unicode or str object from these parameters and store it in the exception object. The callers of these encoding/decoding functions extract these parameters from str/unicode objects themselves most of the time, so it could speed up error handling if these objects were passed directly.
As this again requires changes to many C functions, this approach has been rejected.

Implementation Notes

A sample implementation is available as SourceForge patch #432401 [2]. The current version of this patch differs from the specification in the following ways:

    * The error information is passed from the codec to the callback
      not as an exception object, but as a tuple, which has an
      additional entry state, which can be used for additional
      information the codec might want to pass to the callback.

    * There are two separate registries (one for encoding/translating
      and one for decoding).

The class codecs.StreamReaderWriter uses the errors parameter for both reading and writing. To be more flexible, this should probably be changed to two separate parameters for reading and writing.

The errors parameter of PyUnicode_TranslateCharmap is not available to Python, which makes testing of the new functionality of PyUnicode_TranslateCharmap impossible from Python scripts. The patch should add an optional argument errors to unicode.translate to expose the functionality and make testing possible.

Codecs that do something other than encoding/decoding from/to unicode and want to use the new machinery can define their own exception classes; the strict handler will automatically work with them. The other predefined error handlers are unicode-specific and expect to get a Unicode(Encode|Decode|Translate)Error exception object, so they won't work.

Backwards Compatibility

The semantics of unicode.encode with errors="replace" have changed: the old version always stored a ? character in the output string even if no character was mapped to ? in the mapping. With the proposed patch, the replacement string from the callback will again be looked up in the mapping dictionary. But as all supported encodings are ASCII based, and thus map ? to ?, this should not be a problem in practice.
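As a sanity check, the externally visible behaviour specified for the five handlers can be exercised directly (a sketch in modern Python, where these handler names are built in):

```python
s = u"Franc: \u20ac"        # EURO SIGN is unencodable in ASCII

assert s.encode("ascii", "ignore") == b"Franc: "
assert s.encode("ascii", "replace") == b"Franc: ?"
assert s.encode("ascii", "xmlcharrefreplace") == b"Franc: &#8364;"
assert s.encode("ascii", "backslashreplace") == b"Franc: \\u20ac"

try:
    s.encode("ascii")       # "strict" is the default
except UnicodeEncodeError as exc:
    # The exception carries the attributes described in the specification.
    assert (exc.encoding, exc.start, exc.end) == ("ascii", 7, 8)
```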
References

    [1] SF feature request #403100
        "Multicharacter replacements in PyUnicode_TranslateCharmap"
        http://www.python.org/sf/403100

    [2] SF patch #432401 "unicode encoding error callbacks"
        http://www.python.org/sf/432401

Copyright

    This document has been placed in the public domain.

Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End:

From mwh@python.net Wed Jun 19 11:09:24 2002 From: mwh@python.net (Michael Hudson) Date: 19 Jun 2002 11:09:24 +0100 Subject: [Python-Dev] Slicing In-Reply-To: "David Abrahams"'s message of "Tue, 18 Jun 2002 14:21:23 -0400" References: <05cd01c216f5$00740fc0$6601a8c0@boostconsulting.com> Message-ID: <2m3cvjshpn.fsf@starship.python.net>

"David Abrahams" writes:

> I did a little experiment to see if I could use a uniform interface for
> slicing (from C++):
>
> >>> range(10)[slice(3,5)]
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: sequence index must be integer
> >>> class Y(object):
> ...     def __getslice__(self, a, b):
> ...         print "getslice", a, b
> ...
> >>> y = Y()
> >>> y[slice(3,5)]
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: unsubscriptable object
> >>> y[3:5]
> getslice 3 5
>
> This seems to indicate that I can't, in general, pass a slice object to
> PyObject_GetItem in order to do slicing.** Correct?

No. The time machine has got you here; update to CVS and try again.

This comes down to the (slightly odd, IMHO) distinction between sequences and mappings, which doesn't really appear at the Python level.

    type_pointer->tp_as_sequence->sq_item

takes a single int as a parameter;

    type_pointer->tp_as_mapping->mp_subscr

takes a PyObject*. Builtin sequences (as of last week) have mp_subscr methods that handle slices. I haven't checked, but would be amazed if PyObject_GetItem can't now be used with sliceobjects. (PS: I'm not sure I've got all the field names right here. They're close).
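(A sketch for any interpreter with that change -- and for Python 3, where __getslice__ is gone entirely: every subscript, slice or not, arrives at __getitem__.)

```python
class Probe:
    def __getitem__(self, index):
        return index            # echo back whatever subscript we received

p = Probe()
assert p[3:5] == slice(3, 5, None)          # p[3:5] builds a slice object
assert p[slice(3, 5)] == slice(3, 5, None)  # ...so this is equivalent
assert p["hi":"there"] == slice("hi", "there", None)
```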
> So I went looking around for alternatives to PyObject_GetItem. I found
> PySequence_GetSlice, but that takes int parameters, and AFAIK there's no
> rule saying you can't slice on strings, for example.
>
> Further experiments revealed:
>
> >>> y['hi':'there']
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: unsubscriptable object
> >>> class X(object):
> ...     def __getitem__(self, x):
> ...         print 'getitem', x
> ...
> >>> X()['hi':'there']
> getitem slice('hi', 'there', None)
>
> So I /can/ slice on strings, but only through __getitem__(). And...
>
> >>> class Z(Y):
> ...     def __getitem__(self, x):
> ...         print 'getitem', x
> ...
> >>> Z()[3:5]
> getslice 3 5
> >>> Z()['3':5]
> getitem slice('3', 5, None)
>
> So Python is doing some dispatching internally based on the types of the

This area is very messy.

> slice elements, but:
>
> >>> class subint(int): pass
> ...
> >>> subint()
> 0
> >>> Z[subint():5]
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: unsubscriptable object

This last one is easy: you're trying to subscript the class object!

> So it's looking at the concrete type of the slice elements. I'm not
> sure I actually understand how this one fails.
>
> I want to make a generalized getslice function in C which can operate on a
> triple of arbitrary objects. Here's the python version I came up with:
>
> def getslice(x, start, finish):
>     if (type(start) is type(finish) is int
>             and hasattr(type(x), '__getslice__')):
>         return x.__getslice__(start, finish)
>     else:
>         return x.__getitem__(slice(start, finish))
>
> Have I got the logic right here?

You can't do this logic from Python, AFAIK. I think PyObject_GetItem is your best bet.

Cheers,
M.

-- The only problem with Microsoft is they just have no taste.
-- Steve Jobs, (From _Triumph of the Nerds_ PBS special) and quoted by Aahz Maruch on comp.lang.python

From pyth@devel.trillke.net Wed Jun 19 11:21:51 2002 From: pyth@devel.trillke.net (holger krekel) Date: Wed, 19 Jun 2002 12:21:51 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <018c01c21778$4d374f60$0900a8c0@spiff>; from fredrik@pythonware.com on Wed, Jun 19, 2002 at 12:01:36PM +0200 References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <018c01c21778$4d374f60$0900a8c0@spiff> Message-ID: <20020619122151.I15079@prim.han.de>

Fredrik Lundh wrote: > duncan wrote: > > > What I really don't understand is why there is such pressure to get an > > alternative interpolation added as methods to str & unicode rather than > > just adding an interpolation module to the library? > > e.g. > > > > from interpolation import sub > > def birth(self, name): > > country = self.countryOfOrigin['name'] > > return sub('${name} was born in ${country}', vars()) > > that's too easy, of course ;-) > > especially since someone has already added such a method, a > long time ago (os.path.expandvars). > > and there's already an interpolation engine in there; barry's loop, > join and format stuff can be replaced with a simple callback, and a > call to sre.sub with the right pattern:
>
> def sub(string, mapping):
>     def repl(m, mapping=mapping):
>         return mapping[m.group(m.lastindex)]
>     return sre.sub(A_PATTERN, repl, string)
>
> (if you like lambdas, you can turn this into a one-liner) > > maybe it would be sufficient to add a number of "right patterns" > to the standard library...

FWIW, +1

holger

From barry@wooz.org Wed Jun 19 12:10:10 2002 From: barry@wooz.org (Barry A.
Warsaw) Date: Wed, 19 Jun 2002 07:10:10 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <200206190329.g5J3TLZ22194@server1.lfw.org> Message-ID: <15632.26258.371707.408884@anthem.wooz.org> >>>>> "KY" == Ka-Ping Yee writes: KY> I assume you in fact meant KY> return '${name} was born in ${country}'.sub() KY> for the third line above? Yup, thanks for the fix. >> print s.sub({'name': 'Guido', 'country': 'the Netherlands'}) KY> Have you considered the possibility of accepting keyword KY> arguments instead? Nope, and it's not a bad idea. I've added this in an "Open Issues" section. KY> If you decide to use keyword arguments, you can either allow KY> both keyword arguments and a single dictionary argument, or KY> you can just accept keyword arguments and people can pass in KY> dictionaries using **. I'd prefer the latter, otherwise we'd have to pick a keyword that would be off-limits as a substitution variable. Thanks! -Barry From barry@zope.com Wed Jun 19 12:16:52 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 07:16:52 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> Message-ID: <15632.26660.981170.540633@anthem.wooz.org> >>>>> "PP" == Paul Prescod writes: PP> You forgot the "implicit .sub"() feature. Yup, thanks. >> ... >> - We can simply allow the exception (likely a NameError or >> KeyError) to propagate. PP> Explicit! Now you're thinking like Guido! :) >> - We can return the original substitution placeholder >> unchanged. PP> Silently guess??? I'm beginning to agree that the exception should be raised. I want to be explicit about it so we can write the safedict wrapper effectively. 
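For comparison, the design that eventually shipped from this PEP as string.Template (Python 2.4) accepts both forms: a positional mapping argument and keyword arguments, with keywords taking precedence when both are given:

```python
from string import Template

s = Template('${name} was born in ${country}')

d = {'name': 'Guido', 'country': 'the Netherlands'}
assert s.substitute(d) == 'Guido was born in the Netherlands'
assert s.substitute(name='Guido', country='the Netherlands') == \
       'Guido was born in the Netherlands'

# Keyword arguments win over the mapping:
assert s.substitute(d, country='NL') == 'Guido was born in NL'
```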
I agree that an exception is better when using the no-arg version, but for the arg-version I /really/ want to be able to suppress all interpolation exceptions, and just return some string, even if it has placeholders still in it.

 PP> Overall it isn't bad...it's a little weird to have a method
 PP> that depends on sys._getframe(1) (or as they say in Tcl-land
 PP> "upvar"). It may set a bad precedent...

Noted. Thanks, -Barry

From barry@zope.com Wed Jun 19 12:23:59 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 07:23:59 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello> Message-ID: <15632.27087.786103.175959@anthem.wooz.org>

>>>>> "RH" == Raymond Hettinger writes:

 >> 1. $$ is an escape; it is replaced with a single $

 RH> Hmm, some strings (at least in the spam I receive) contain
 RH> $$$$$$. How about ${$}?

How often will you be interpolating into spam? Sounds like it could get messy. ;)

 >> Handling Missing Keys
 >> What should happen when one of the substitution keys is missing
 >> from the mapping (or the locals/globals namespace if no
 >> argument is given)? There are two possibilities:
 >> - We can simply allow the exception (likely a NameError or
 >>   KeyError) to propagate.
 >> - We can return the original substitution placeholder unchanged.

 RH> And/Or,
 RH> - Leave placeholder unchanged unless default argument supplied:
 RH>       mystr.sub(mydict, undefined='***')   # Fill unknowns with stars
 RH> And/Or,
 RH> - Raise an exception only if specified:
 RH>       mystr.sub(mydict, undefined=NameError)
 RH> And/Or
 RH> - Return a count of the number of missed substitutions:
 RH>       nummisses = mystr.sub(mydict)

I /really/ dislike interfaces that raise an exception or don't, based on an argument to the function. Returning the number of missed substitutions doesn't seem useful, but could be done with a regexp.
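Counting leftover placeholders with a regexp is indeed a one-liner (a sketch assuming the PEP's ${name} syntax; the result string here is hand-made for illustration):

```python
import re

# A substitution result in which one placeholder was never filled in.
result = 'Guido was born in ${country}'

# Count surviving ${identifier} placeholders.
nummisses = len(re.findall(r'\$\{[a-zA-Z_]\w*\}', result))
assert nummisses == 1
```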
Filling unknowns with stars could just as easily be done with a different safedict wrapper. The specific issue is when using locals+globals. In that case, it seems like the problem is clearly a programming bug, so the exception should be raised.

 >> BDFL proto-pronouncement: It should always raise a NameError
 >> when the key is missing. There may not be sufficient use case
 >> for soft failures in the no-argument version.

 RH> I had written a miniature mail-merge program and learned that
 RH> the NameError approach is a PITA. It makes sense if the
 RH> mapping is defined inside the program; however, externally
 RH> supplied mappings (like a mergelist) can be expected to have
 RH> "holes" and launching exceptions makes it harder to recover
 RH> than having a default behavior. The best existing Python
 RH> comparison is the str.replace() method which does not bomb-out
 RH> when the target string is not found.

I agree that certain use cases make the exception problematic. Think of a program that uses a template entered remotely through the web. That template could have misspellings in the variable substitutions. In that case I think you'd like to carry on as best you can, by returning a string with the bogus placeholders still in the string.

-Barry

From David Abrahams" <2m3cvjshpn.fsf@starship.python.net> Message-ID: <08a201c21788$7afc4760$6601a8c0@boostconsulting.com>

----- Original Message ----- From: "Michael Hudson"

> > This seems to indicate that I can't, in general, pass a slice object to
> > PyObject_GetItem in order to do slicing.** Correct?
>
> No. The time machine has got you here; update to CVS and try again.

While that result is of interest, I think I need to support 2.2.1, so maybe it doesn't make too much difference what the current CVS is doing.

> This comes down to the (slightly odd, IMHO) distinction between
> sequences and mappings, which doesn't really appear at the Python
> level.
> > type_pointer->tp_as_sequence->sq_item
> >
> > takes a single int as a parameter
> >
> > type_pointer->tp_as_mapping->mp_subscr
> >
> > takes a PyObject*.

I know about those details, but of course they aren't really a cause: as the current CVS shows, it *can* be handled.

> Builtin sequences (as of last week) have mp_subscr
> methods that handle slices. I haven't checked, but would be amazed if
> PyObject_GetItem can't now be used with sliceobjects.

Good to know.

> > So Python is doing some dispatching internally based on the types of the
> >
> > >>> class subint(int): pass
> > ...
> > >>> subint()
> > 0
> > >>> Z[subint():5]
> > Traceback (most recent call last):
> >   File "", line 1, in ?
> > TypeError: unsubscriptable object
>
> This last one is easy: you're trying to subscript the class object!

Oops, nice catch!

    >>> Z()[subint():5]
    getslice 0 5

> > I want to make a generalized getslice function in C which can operate on a
> > triple of arbitrary objects. Here's the python version I came up with:
> >
> > def getslice(x, start, finish):
> >     if (type(start) is type(finish) is int
> >             and hasattr(type(x), '__getslice__')):
> >         return x.__getslice__(start, finish)
> >     else:
> >         return x.__getitem__(slice(start, finish))
> >
> > Have I got the logic right here?
>
> You can't do this logic from Python, AFAIK.

Why do you say that? Are you saying I should be looking at slots and not attributes, plus special handling for classic classes (ick)?

> I think PyObject_GetItem is your best bet.

Well, not if I care about versions < 2.2.2. So, I'm modifying my logic slightly:

    def getslice(x, start, finish):
        if (isinstance(start, int) and isinstance(finish, int)
                and hasattr(type(x), '__getslice__')):
            return x.__getslice__(start, finish)
        else:
            return x.__getitem__(slice(start, finish))

-Dave

From barry@zope.com Wed Jun 19 13:10:48 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Wed, 19 Jun 2002 08:10:48 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> Message-ID: <15632.29896.521752.346381@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh writes:

 >> def birth(self, name): country = self.countryOfOrigin['name']
 >> return '${name} was born in ${country}'.sub()

 FL> now explain why the above is a vast improvement over:

 | def birth(self, name):
 |     country = self.countryOfOrigin['name']
 |     return join(name, ' was born in ', country)

One use case: you can't internationalize that. You /can/ translate '${name} was born in ${country}', which might end up in some languages like '${country} was ${name} born in'.

 FL> (for extra bonus, explain how sub() can be made to
 FL> execute substantially faster than a join() function)

All I care about is that it runs as fast as the % operator.

-Barry

From barry@zope.com Wed Jun 19 13:16:17 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 08:16:17 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> Message-ID: <15632.30225.721902.65921@anthem.wooz.org>

>>>>> "MS" == Martin Sjögren writes:

 MS> What's the advantage of using ${name} and ${country} instead?

There's a lot of empirical evidence that %(name)s is quite error prone. BTW, you can't use locals() or globals() because you really want globals()-overridden-with-locals(), i.e.

    d = globals().copy()
    d.update(locals())

vars() doesn't cut it either:

    >>> help(vars)
    Help on built-in function vars:

    vars(...)
        vars([object]) -> dictionary

        Without arguments, equivalent to locals(). With an argument,
        equivalent to object.__dict__.

-Barry

From barry@zope.com Wed Jun 19 13:18:44 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Wed, 19 Jun 2002 08:18:44 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <006501c2176e$b9dbb3e0$0900a8c0@spiff> Message-ID: <15632.30372.601835.200686@anthem.wooz.org> >>>>> "FL" == Fredrik Lundh writes: FL> combine 1, 2, and 3 with _getframe(), and you have a FL> feature that crackers are going to love... Why? I've added a note that you should never use no-arg .sub() on strings that come from untrusted sources. Are there any other specific security concerns you can identify? -Barry From barry@zope.com Wed Jun 19 13:29:39 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 08:29:39 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> Message-ID: <15632.31027.356393.678498@anthem.wooz.org> >>>>> "DB" == Duncan Booth writes: DB> What I really don't understand is why there is such pressure DB> to get an alternative interpolation added as methods to str & DB> unicode rather than just adding an interpolation module to the DB> library? e.g. Because I don't think there's all that much useful variation, open issues in this PEP notwithstanding. A module seems pretty heavy for such a simple addition. It might obviate the need for a PEP though. :) | from interpolation import sub | def birth(self, name): | country = self.countryOfOrigin['name'] | return sub('${name} was born in ${country}', vars()) DB> I added in the explicit vars() parameter because the idea of a DB> possibly unknown template string picking up arbitrary DB> variables is, IMHO, a BAD idea. Only if the template string comes from an untrusted source. 
If it's in your code, there should be no problem, and if there is, it's a programming bug. vars() doesn't cut it as mentioned in a previous reply. -Barry From barry@zope.com Wed Jun 19 13:31:56 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 08:31:56 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <018c01c21778$4d374f60$0900a8c0@spiff> Message-ID: <15632.31164.457058.17493@anthem.wooz.org> >>>>> "FL" == Fredrik Lundh writes: FL> especially since someone has already added such a method, a FL> long time ago (os.path.expandvars). Of course expandvars() takes its mapping from os.environ, but I see what you're saying. -Barry From neal@metaslash.com Wed Jun 19 13:32:47 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 19 Jun 2002 08:32:47 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <15632.30225.721902.65921@anthem.wooz.org> Message-ID: <3D1079EF.61AA3E79@metaslash.com> "Barry A. Warsaw" wrote: > > BTW, you can't use locals() or globals() because you really want > globals()-overridden-with-locals(), i.e. > > d = globals().copy() > d.update(locals()) What about free/cell vars? Will these be used? If not, is that a problem? Neal From aahz@pythoncraft.com Wed Jun 19 13:40:59 2002 From: aahz@pythoncraft.com (Aahz) Date: Wed, 19 Jun 2002 08:40:59 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> Message-ID: <20020619124059.GA22356@panix.com> On Wed, Jun 19, 2002, Martin v.
Loewis wrote: > barry@zope.com (Barry A. Warsaw) writes: >> >> Here's a small patch to setup.py which should fix things in a portable >> way, at least for *nix systems. It sets the envar LD_RUN_PATH to the >> location that it found the Berkeley library, but only if that envar >> isn't already set. > > I dislike that change. Setting LD_RUN_PATH is the job of whoever is > building the compiler, and should not be done by Python > automatically. So far, the Python build process avoids adding any -R > linker options, since it requires quite some insight into the specific > installation to determine whether usage of that option is the right > thing. > > If setup.py fails to build an extension correctly, it is the > administrator's job to specify a correct build procedure in > Modules/Setup. For that reason, I rather recommend to remove the magic > that setup.py looks in /usr/local/Berkeley*, instead of adding more > magic. -1 if it doesn't at least include an error message saying that we found dbm but couldn't use it. (That is, I agree with you that explicit is better than implicit -- but if we can provide info, we should.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Wed Jun 19 13:39:37 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:39:37 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: Your message of "Tue, 18 Jun 2002 22:38:36 EDT." <15631.61100.561824.480935@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> Message-ID: <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> > I still think we may want to pull PyBSDDB into the standard distro, as > a way to provide BDB api's > 1.85. The question is, what would this
I dislike "bsddb3" -- which I think PyBSDDB > itself uses -- because it links against BDB 4.0. Good idea. Maybe call it berkeleydb? That's what Sleepycat calls it (there's no connection with the BSD Unix distribution AFAICT). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 19 13:49:22 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:49:22 -0400 Subject: [Python-Dev] Re: making dbmmodule still broken In-Reply-To: Your message of "Tue, 18 Jun 2002 23:24:09 EDT." <15631.63833.440127.405556@anthem.wooz.org> References: <15631.58711.213506.701945@localhost.localdomain> <15631.63833.440127.405556@anthem.wooz.org> Message-ID: <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net> > SM> I think it would probably be a good idea to alert the person > SM> running make what library the module will be linked with. > SM> Anyone else agree? > > +1. The less guessing the builder has to do the better! Just don't start asking questions and reading answers from stdin. The Make process is often run unattended. A new option to allow asking questions is OK. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Wed Jun 19 13:46:05 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 19 Jun 2002 08:46:05 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <09342475030690@aluminium.rcp.co.uk> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> Message-ID: <20020619124604.GB31653@ute.mems-exchange.org> On Wed, Jun 19, 2002 at 10:34:24AM +0100, Duncan Booth wrote: >What I really don't understand is why there is such pressure to get an >alternative interpolation added as methods to str & unicode rather than >just adding an interpolation module to the library? 
It could live in the new text module, where Greg Ward's word-wrapping code will be going. +1 on /F's suggestion of recycling the os.path.expandvars() code. (Maybe a syntax-checker for %(...) strings would solve Mailman's problems, and alleviate the plaintive cries for an alternative interpolation syntax?) --amk (www.amk.ca) GERTRUDE: The lady doth protest too much, methinks. -- _Hamlet_, III, ii From guido@python.org Wed Jun 19 13:49:53 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:49:53 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Tue, 18 Jun 2002 23:29:20 EDT." <200206190329.g5J3TKm19802@smtp.zope.com> References: <200206190329.g5J3TKm19802@smtp.zope.com> Message-ID: <200206191249.g5JCnr901530@pcp02138704pcs.reston01.va.comcast.net> > I'm so behind on my email, that the anticipated flamefest will surely > die down before I get around to reading it. Yet still, here is a new > PEP. :) No, the flamefest won't start until you post to c.l.py. :) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 19 13:52:38 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:52:38 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Tue, 18 Jun 2002 21:36:51 PDT." References: Message-ID: <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net> > Have you considered the possibility of accepting keyword arguments > instead? They would be slightly more pleasant to write: > > print s.sub(name='Guido', country='the Netherlands') > > This is motivated because i imagine relative frequencies of use > to be something like this: > > 1. sub() [most frequent] > 2. sub(name=value, ...) [nearly as frequent] > 3. 
sub(dictionary) [least frequent] > > If you decide to use keyword arguments, you can either allow both > keyword arguments and a single dictionary argument, or you can > just accept keyword arguments and people can pass in dictionaries > using **. I imagine that the most common use case is a situation where the dict is already prepared. I think **dict is slower than a positional dict argument. I agree that keyword args would be useful in some cases where you can't trust the string. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 19 13:57:07 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:57:07 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 02:45:46 EDT." <007f01c2175c$f1e768e0$f4d8accf@othello> References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello> Message-ID: <200206191257.g5JCv7V01597@pcp02138704pcs.reston01.va.comcast.net> > > 1. $$ is an escape; it is replaced with a single $ > > Hmm, some strings (at least in the spam I receive) contain $$$$$$. > How about ${$}? I don't understand the use case. Do you want to *output* strings containing many dollars? If you want a {} based escape, it should be ${} IMO. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From Oleg Broytmann Wed Jun 19 13:53:44 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Wed, 19 Jun 2002 16:53:44 +0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jun 19, 2002 at 08:39:37AM -0400 References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020619165344.V4127@phd.pp.ru> On Wed, Jun 19, 2002 at 08:39:37AM -0400, Guido van Rossum wrote: > Good idea. Maybe call it berkeleydb? That's what Sleepycat calls it > (there's no connection with the BSD Unix distribution AFAICT). +1 on berkeleydb Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido@python.org Wed Jun 19 13:58:54 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:58:54 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 09:05:00 +0200." <003701c2175f$b219c340$ced241d5@hagrid> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> Message-ID: <200206191258.g5JCwsp01610@pcp02138704pcs.reston01.va.comcast.net> > > def birth(self, name): > > country = self.countryOfOrigin['name'] > > return '${name} was born in ${country}'.sub() > > now explain why the above is a vast improvement over: > > def birth(self, name): > country = self.countryOfOrigin['name'] > return join(name, ' was born in ', country) One word: I18n. > (for extra bonus, explain how sub() can be made to > execute substantially faster than a join() function) That's not a requirement. It can obviously be made as fast as the % operator. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 19 14:05:32 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 09:05:32 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 07:23:59 EDT." <15632.27087.786103.175959@anthem.wooz.org> References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello> <15632.27087.786103.175959@anthem.wooz.org> Message-ID: <200206191305.g5JD5Xe01727@pcp02138704pcs.reston01.va.comcast.net> > I agree that certain use cases make the exception problematic. Think > a program that uses a template entered remotely through the web. That > template could have misspellings in the variable substitutions. In > that case I think you'd like to carry on as best you can, by returning > a string with the bogus placeholders still in the string. That's a matter of validating the template before accepting it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 19 13:54:56 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 08:54:56 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Tue, 18 Jun 2002 21:46:33 PDT." <3D100CA9.999E7B3F@prescod.net> References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> Message-ID: <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net> > > - We can simply allow the exception (likely a NameError or > > KeyError) to propagate. > > Explicit! > > > - We can return the original substitution placeholder unchanged. > > Silently guess??? I'm strongly in favor of always making missing keys an error. It should be a KeyError when a dict is used, and a NameError when locals/globals are looked up. 
> Overall it isn't bad...it's a little weird to have a method that depends > on sys._getframe(1) (or as they say in Tcl-land "upvar"). It may set a > bad precedent... No, the real implementation will be in C. C functions always have access to locals and globals. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Wed Jun 19 14:13:25 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 15:13:25 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <200206191258.g5JCwsp01610@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <028c01c21795$a5841d20$0900a8c0@spiff> guido wrote: > > > def birth(self, name): > > > country = self.countryOfOrigin['name'] > > > return '${name} was born in ${country}'.sub() > > > > now explain why the above is a vast improvement over: > > > > def birth(self, name): > > country = self.countryOfOrigin['name'] > > return join(name, ' was born in ', country) > > One word: I18n. really? who's doing the localization in: return '${name} was born in ${country}'.sub() maybe barry meant return _('${name} was born in ${country}').sub() but in that case, I completely fail to see why he couldn't just as well do the substitution inside the "_" function: return _('${name} was born in ${country}') where _ could be defined as: def _(string, mapping=None): if mapping is None: ... def repl(m, mapping=mapping): return mapping[m.group(m.lastindex)] return sre.sub(A_PATTERN, repl, do_translation(string)) instead of def _(string, mapping=None): if mapping is None: ... 
return do_translation(string).sub(mapping) From tismer@tismer.com Wed Jun 19 14:38:46 2002 From: tismer@tismer.com (Christian Tismer) Date: Wed, 19 Jun 2002 15:38:46 +0200 Subject: [Python-Dev] Tcl adept wanted for Stackless problem Message-ID: <3D108966.9050402@tismer.com> Dear Lists, there is still some problem with Tcl and Stackless hiding. Is there anybody around who knows the Tcl/Tk sources about as I know Python's? My big question is: When does Tcl use C stack entries as globals, which are passed as function arguments to interpreter calls? This is of special interest when Python callbacks are invoked. The problem is that I use stack slicing, which moves parts of the C stack away at some times, and I need to know when I am forbidden to do that. It is solved to some extent, but not all. That's unfortunately not all. A friend is running the Tcl/Tk mainloop in one real thread, and stackless tasklets are running in another one (of course calling into Tcl). When tasklet switches are performed (again, this is moving stack contents around), there appear to be crashes, too. This kind of slicing should be allowed IMHO, since these are different contexts, which shouldn't interact at all. Are there any structures which are shared between threads that use Tcl/Tk? Something that may not disappear during some operations? Which part of the documentation / the Tcl/Tk source should I read to find this out without learning everything? This is really an urgent problem which is blocking me. If somebody has good knowledge, please show up! :-) thanks & cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From guido@python.org Wed Jun 19 15:45:49 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 10:45:49 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 08:29:39 EDT." <15632.31027.356393.678498@anthem.wooz.org> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> Message-ID: <200206191445.g5JEjoL01987@pcp02138704pcs.reston01.va.comcast.net> > DB> What I really don't understand is why there is such pressure > DB> to get an alternative interpolation added as methods to str & > DB> unicode rather than just adding an interpolation module to the > DB> library? e.g. > > Because I don't think there's all that much useful variation, open > issues in this PEP notwithstanding. A module seems pretty heavy for > such a simple addition. It might obviate the need for a PEP > though. :) Certainly if we can't agree on the PEP, a module might make sense. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jun 19 15:50:59 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 10:50:59 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 08:32:47 EDT." <3D1079EF.61AA3E79@metaslash.com> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <15632.30225.721902.65921@anthem.wooz.org> <3D1079EF.61AA3E79@metaslash.com> Message-ID: <200206191450.g5JEoxC02004@pcp02138704pcs.reston01.va.comcast.net> > > BTW, you can't use locals() or globals() because you really want > > globals()-overridden-with-locals(), i.e. 
> > > > d = globals().copy() > > d.update(locals()) > > What about free/cell vars? Will these be used? > If not, is that a problem? Without compiler support for this construct we have no hope of getting references to outer non-global scopes right. E.g. def f(): x = 12 def g(): return "x is $x".sub() return g Here the compiler has no clue that g references x, so it wouldn't do the special treatment for x that's needed to make it work. I see no way to fix this in general without introducing new syntax; note that the string "x is $x" could have been an argument to g(). --Guido van Rossum (home page: http://www.python.org/~guido/) From niemeyer@conectiva.com Wed Jun 19 15:40:36 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Wed, 19 Jun 2002 11:40:36 -0300 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <028c01c21795$a5841d20$0900a8c0@spiff> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <200206191258.g5JCwsp01610@pcp02138704pcs.reston01.va.comcast.net> <028c01c21795$a5841d20$0900a8c0@spiff> Message-ID: <20020619114036.A5586@ibook.distro.conectiva> [...] > but in that case, I completely fail to see why he couldn't > just as well do the substitution inside the "_" function: > > return _('${name} was born in ${country}') [...] That would parse every translated string, which doesn't seem reasonable. My vote is for sub()-like, inside an extension module. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From barry@zope.com Wed Jun 19 15:58:56 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Wed, 19 Jun 2002 10:58:56 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello> <15632.27087.786103.175959@anthem.wooz.org> <200206191305.g5JD5Xe01727@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.39984.662165.422755@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> I agree that certain use cases make the exception problematic. >> Think a program that uses a template entered remotely through >> the web. That template could have misspellings in the variable >> substitutions. In that case I think you'd like to carry on as >> best you can, by returning a string with the bogus placeholders >> still in the string. GvR> That's a matter of validating the template before accepting GvR> it. True, which isn't hard to do. You can write a regexp to extract the $names and then validate those. In fact, I think this is what newer versions of xgettext do for Python code (albeit with the %(name)s syntax). -Barry From greg@electricrain.com Wed Jun 19 19:21:41 2002 From: greg@electricrain.com (Gregory P. Smith) Date: Wed, 19 Jun 2002 11:21:41 -0700 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> Message-ID: <20020619182141.GA18944@zot.electricrain.com> On Wed, Jun 19, 2002 at 07:30:11AM +0200, Martin v. Loewis wrote: > barry@zope.com (Barry A. Warsaw) writes: > > > I still think we may want to pull PyBSDDB into the standard distro, as > > a way to provide BDB api's > 1.85. The question is, what would this > > new module be called? I dislike "bsddb3" -- which I think PyBSDDB > > itself uses -- because it links against BDB 4.0. 
> > If this is just a question of naming, I recommend bsddb2 - not > indicating the version of the database, but the version of the Python > module. If I hadn't made the initial mistake of naming pybsddb's module bsddb3 when i first extended robin's berkeleydb 2.x module to work with 3.0 I would agree with that name. I worry that having a module named bsddb2 might cause endless confusion as bsddb and bsddb3 already exist and did correlate to the version number. How about 'berkeleydb'? From paul@prescod.net Wed Jun 19 17:43:02 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 19 Jun 2002 09:43:02 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D10B496.67B19898@prescod.net> Guido van Rossum wrote: > >... > > I'm strongly in favor of always making missing keys an error. It > should be a KeyError when a dict is used, and a NameError when > locals/globals are looked up. I think someone later suggested that maybe a keyword argument could allow some kind of policy to be expressed. That would be okay for me if people feel strongly that some strategy for fixing up missing arguments is necessary. > > Overall it isn't bad...it's a little weird to have a method that depends > > on sys._getframe(1) (or as they say in Tcl-land "upvar"). It may set a > > bad precedent... > > No, the real implementation will be in C. C functions always have > access to locals and globals. I didn't mean it will be a bad precedent because of the implementation. I mean that methods do not usually peek into their caller's variables, even from C. What other methods do that? I'm still "+0" despite being somewhat uncomfortable with that aspect. 
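[A rough sketch of the frame-peeking lookup being debated here: when called with no mapping, build globals()-overridden-with-locals() from the *caller's* frame via sys._getframe(1). The name interpolate is a made-up stand-in for the proposed str.sub() method.]

```python
import re
import sys

_PAT = re.compile(r"\$(\w+)")

def interpolate(template):
    # Peek at the caller's frame: locals win over globals.
    frame = sys._getframe(1)
    d = frame.f_globals.copy()
    d.update(frame.f_locals)
    return _PAT.sub(lambda m: str(d[m.group(1)]), template)

def demo():
    x = 12
    return interpolate("x is $x")

print(demo())   # x is 12
```

[Note that, as Guido points out elsewhere in the thread, this trick only sees the immediate caller's locals and globals; free variables of enclosing functions are invisible without compiler support.]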
Paul Prescod From jeremy@zope.com Wed Jun 19 16:29:13 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 19 Jun 2002 11:29:13 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <200206191450.g5JEoxC02004@pcp02138704pcs.reston01.va.comcast.net> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <15632.30225.721902.65921@anthem.wooz.org> <3D1079EF.61AA3E79@metaslash.com> <200206191450.g5JEoxC02004@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.41801.526300.360183@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: >> > BTW, you can't use locals() or globals() because you really >> > want globals()-overridden-with-locals(), i.e. >> > >> > d = globals().copy() d.update(locals()) >> >> What about free/cell vars? Will these be used? If not, is that >> a problem? GvR> Without compiler support for this construct we have no hope of GvR> getting references to outer non-global scopes right. E.g. GvR> def f(): GvR> x = 12 GvR> def g(): GvR> return "x is $x".sub() GvR> return g GvR> Here the compiler has no clue that g references x, so it GvR> wouldn't do the special treatment for x that's needed to make GvR> it work. GvR> I see no way to fix this in general without introducing new GvR> syntax; note that the string "x is $x" could have been an GvR> argument to g(). If Python had macros, then we could define the interpolation function as a macro. It would expand to explicit references to all the variables in the block that called the macro. Then the compiler could do the right thing. Of course, we ain't got macros, but whatever. I think they would provide cleaner support for interpolation than sys._getframe(). Jeremy From barry@zope.com Wed Jun 19 19:44:34 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Wed, 19 Jun 2002 14:44:34 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.53522.716428.359480@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Maybe call it berkeleydb? >>>>> "GPS" == Gregory P Smith writes: GPS> How about 'berkeleydb'? Sounds like consensus. Greg, how do you feel about moving project management from the pybsddb project to the Python project? Maybe we should plan a transition on the pybsddb-users list. I'm willing to help. -Barry From mwh@python.net Wed Jun 19 17:07:55 2002 From: mwh@python.net (Michael Hudson) Date: 19 Jun 2002 17:07:55 +0100 Subject: [Python-Dev] extended slicing again In-Reply-To: Michael Hudson's message of "17 Jun 2002 15:26:22 +0100" References: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> <200206171339.g5HDdxN08737@pcp02138704pcs.reston01.va.comcast.net> <2my9de7zht.fsf_-_@starship.python.net> Message-ID: <2m7kkv9rqc.fsf@starship.python.net> Michael Hudson writes: > Guido van Rossum writes: > > > IOW slice(a, b, None) should be considered equivalent to L[a:b] in all > > situations. > > OK. I'll do this soon. It's not as bad as I thought at first -- only > mutable sequences are affected, so it's only lists and arrays that > need to be tweaked. That was easy! Cheers, M. -- I have no disaster recovery plan for black holes, I'm afraid. Also please be aware that if it one looks imminent I will be out rioting and setting fire to McDonalds (always wanted to do that) and probably not reading email anyway. 
-- Dan Barlow From guido@python.org Wed Jun 19 16:07:36 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 11:07:36 -0400 Subject: [Python-Dev] Tcl adept wanted for Stackless problem In-Reply-To: Your message of "Wed, 19 Jun 2002 15:38:46 +0200." <3D108966.9050402@tismer.com> References: <3D108966.9050402@tismer.com> Message-ID: <200206191507.g5JF7ag02088@pcp02138704pcs.reston01.va.comcast.net> > My big question is: > When does Tcl use C stack entries as globals, which > are passed as function arguments to interpreter calls? It's a performance hack, just as stackless :-). Tcl's interpreter data structure has a return value field which can receive a string of arbitrary length. In order to make this efficient, this is initialized with a pointer to a limited-size array on the stack of the caller; when the return value is longer, a malloc()'ed buffer is used. There is a little dance you have to do to free the malloc()'ed buffer. The big win is that most calls return short strings and hence you save a call to malloc() and one to free() per invocation. This is used *all over* the Tcl source, so good luck getting rid of it. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Wed Jun 19 16:16:44 2002 From: tim@zope.com (Tim Peters) Date: Wed, 19 Jun 2002 11:16:44 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <20020619124604.GB31653@ute.mems-exchange.org> Message-ID: [Andrew Kuchling] > +1 on /F's suggestion of recycling the os.path.expandvars() code. -1 on that part: os.path.expandvars() is an ill-defined mess (the core has more than one of them, varying by platform, and what they do differs in platform-irrelevant ways). +1 on making Barry fix expandvars : http://www.python.org/sf/494589 From barry@zope.com Wed Jun 19 19:31:58 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Wed, 19 Jun 2002 14:31:58 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> Message-ID: <15632.52766.822003.689689@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> I dislike that change. Setting LD_RUN_PATH is the jobs of MvL> whoever is building the compiler, and should not be done by MvL> Python automatically. So far, the Python build process avoids MvL> adding any -R linker options, since it requires quite some MvL> insight into the specific installation to determine whether MvL> usage of that option is the right thing. Really? You know the path for the -R/--rpath flag, so all you need is the magic compiler-specific incantation, and distutils already (or /should/ already) know that. MvL> If setup.py fails to build an extension correctly, it is the MvL> adminstrator's job to specify a correct build procedure in MvL> Modules/Setup. For that reason, I rather recommend to remove MvL> the magic that setup.py looks in /usr/local/Berkeley*, MvL> instead of adding more magic. I disagree. While the sysadmin should probably fiddle with /etc/ld.so.conf when he installs BerkeleyDB, it's not documented in the Sleepycat docs, so it's entirely possible that they haven't done it. That shouldn't stop Python from building a perfectly usable module, especially because it really can figure out all the necessary information. Is there some specific fear you have about compiling in the run-path? Note I'm not saying setting LD_RUN_PATH is the best approach, but it seemed like the most portable. I couldn't figure out if distutils knew what the right compiler-specific switches are (i.e. "-R dir" on Solaris cc if memory serves, and "-Xlinker -rpath -Xlinker dir" for gcc, and who knows what for other Unix or Windows compilers). 
-Barry From paul@prescod.net Wed Jun 19 18:44:14 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 19 Jun 2002 10:44:14 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> Message-ID: <3D10C2EE.CE833DB7@prescod.net> "Barry A. Warsaw" wrote: > >... > > Because I don't think there's all that much useful variation, open > issues in this PEP notwithstanding. A module seems pretty heavy for > such a simple addition. I really hate putting things in modules that will be needed in a Python programmer's second program (the one after "Hello world"). If this is to be the *simpler* way of doing introspection then getting at it should be simpler than getting at "%". $ is taught in hour 2, import is taught on day 2. Some people may never make it to the metaphorical day 2 if they are doing simple text processing in some kind of embedded-Python environment. Paul Prescod From greg@electricrain.com Wed Jun 19 20:31:47 2002 From: greg@electricrain.com (Gregory P. Smith) Date: Wed, 19 Jun 2002 12:31:47 -0700 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15632.53522.716428.359480@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.53522.716428.359480@anthem.wooz.org> Message-ID: <20020619193147.GB18944@zot.electricrain.com> On Wed, Jun 19, 2002 at 02:44:34PM -0400, Barry A. Warsaw wrote: > > >>>>> "GvR" == Guido van Rossum writes: > > GvR> Maybe call it berkeleydb? > > >>>>> "GPS" == Gregory P Smith writes: > > GPS> How about 'berkeleydb'? > > Sounds like consensus. 
Greg, how do you feel about moving project > management from the pybsddb project to the Python project? > > Maybe we should plan a transition on the pybsddb-users list. I'm > willing to help. That sounds like a good idea to me, though I don't know what moving it entails. (i assume creating a berkeleydb module directory in the python project and maintaining the code and documentation from there?). Technically Robin Dunn is the only project administrator on the pybsddb sourceforge project but i've got access to at least modify pybsddb.sf.net, cvs and file releases. The project has been relatively idle recently; i've tried to give it a little of my time every month or two (basic 4.0 support, some bugfixes, accepting patches, etc). As it is quite stable I don't believe he's actively working on it anymore. Robin? -G From tismer@tismer.com Wed Jun 19 16:28:50 2002 From: tismer@tismer.com (Christian Tismer) Date: Wed, 19 Jun 2002 17:28:50 +0200 Subject: [Python-Dev] Tcl adept wanted for Stackless problem References: <3D108966.9050402@tismer.com> <200206191507.g5JF7ag02088@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D10A332.7010909@tismer.com> Guido van Rossum wrote: >>My big question is: >>When does Tcl use C stack entries as globals, which >>are passed as function arguments to interpreter calls? > > > It's a performance hack, just as stackless :-). Where the effect of my hack is slightly bigger. We can fight that out in Charleroi. :-) > Tcl's interpreter data structure has a return value field which can > receive a string of arbitrary length. In order to make this > efficient, this is initialized with a pointer to a limited-size array > on the stack of the caller; when the return value is longer, a > malloc()'ed buffer is used. There is a little dance you have to do to > free the malloc()'ed buffer. The big win is that most calls return > short strings and hence you save a call to malloc() and one to free() > per invocation. 
> This is used *all over* the Tcl source, so good luck > getting rid of it. Thank you! I should better not try this. Instead, I'd like not to touch it at all. I have patched tkinter in a way that it does not slice the stack while some Tcl stuff is running (maybe I didn't catch all). That should mean that the small stack strings are all alive. That is, in the context of Tcl, I dispensed with the "stackless" concept. The remaining problem is switching of tasklets which contain Tcl invocations. I thought so far that this is no problem, since these are disjoint contexts, but Jeff Senn reported problems as well. I fear I have the problem that Tcl thinks it is still using the same interp, or it creates a nested one, while the tasklets are not nested, but seen as independent. Somehow I need to create a new Tcl frame chain for every tasklet that uses Tcl. Can this be the problem? Still no clue how to do it but thanks - ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From barry@zope.com Wed Jun 19 15:54:16 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 10:54:16 -0400 Subject: [Python-Dev] Re: making dbmmodule still broken References: <15631.58711.213506.701945@localhost.localdomain> <15631.63833.440127.405556@anthem.wooz.org> <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.39704.752754.245878@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Just don't start asking questions and reading answers from GvR> stdin. The Make process is often run unattended. A new GvR> option to allow asking questions is OK. 
Right, but printing build process calculations is a good thing. FWIW, XEmacs's build process tells you exactly what libraries it's linking with and features it's enabling. Very helpful in answering questions like "which Berkeley library did I link against?". -Barry From andymac@bullseye.apana.org.au Wed Jun 19 12:42:14 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Wed, 19 Jun 2002 22:42:14 +1100 (edt) Subject: [Python-Dev] test_socket failure on FreeBSD Message-ID: Below is the output of test_socket with the -v option, from a CVS tree of about 1915 UTC June 18. FreeBSD 4.4, gcc 2.95.3 (-g -O3). In speaking up now, I'm making the assumption that the non-blocking socket changes should be complete, modulo bugfixes. If this is not the case, please let me know, and I'll wait for the situation to stabilise. Otherwise, is there any more info I can (attempt to) provide? I tried "print"ing the addr variable when running the test, and just get "ERROR" (sans quotes of course). I've not yet tried to build the OS/2 port with the current CVS code, so I don't yet know what the situation is there. Won't have much time to dig until the weekend... -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia ---------- Forwarded message ---------- Date: Wed, 19 Jun 2002 22:13:45 +1000 (EST) {...} test_socket Testing for mission critical constants. ... ok Testing getservbyname(). ... ok Testing getsockopt(). ... ok Testing hostname resolution mechanisms. ... ok Making sure getnameinfo doesn't crash the interpreter. ... ok Testing for existance of non-crucial constants. ... ok Testing reference count for getnameinfo. ... ok Testing setsockopt(). ... ok Testing getsockname(). ... ok Testing that socket module exceptions. ... ok Testing fromfd(). ... ok Testing receive in chunks over TCP. ... ok Testing recvfrom() in chunks over TCP. 
... ERROR Testing large receive over TCP. ... ok Testing large recvfrom() over TCP. ... ERROR Testing sendall() with a 2048 byte string over TCP. ... ok Testing shutdown(). ... ok Testing recvfrom() over UDP. ... ok Testing sendto() and Recv() over UDP. ... ok Testing non-blocking accept. ... FAIL Testing non-blocking connect. ... ok Testing non-blocking recv. ... FAIL Testing whether set blocking works. ... ok Performing file readline test. ... ok Performing small file read test. ... ok Performing unbuffered file read test. ... ok ====================================================================== ERROR: Testing recvfrom() in chunks over TCP. ---------------------------------------------------------------------- Traceback (most recent call last): File "./Lib/test/test_socket.py", line 359, in testOverFlowRecvFrom hostname, port = addr TypeError: unpack non-sequence ====================================================================== ERROR: Testing large recvfrom() over TCP. ---------------------------------------------------------------------- Traceback (most recent call last): File "./Lib/test/test_socket.py", line 347, in testRecvFrom hostname, port = addr TypeError: unpack non-sequence ====================================================================== FAIL: Testing non-blocking accept. ---------------------------------------------------------------------- Traceback (most recent call last): File "./Lib/test/test_socket.py", line 451, in testAccept self.fail("Error trying to do non-blocking accept.") File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 254, in fail raise self.failureException, msg AssertionError: Error trying to do non-blocking accept. ====================================================================== FAIL: Testing non-blocking recv. 
---------------------------------------------------------------------- Traceback (most recent call last): File "./Lib/test/test_socket.py", line 478, in testRecv self.fail("Error trying to do non-blocking recv.") File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 254, in fail raise self.failureException, msg AssertionError: Error trying to do non-blocking recv. ---------------------------------------------------------------------- Ran 26 tests in 0.330s FAILED (failures=2, errors=2) test test_socket failed -- errors occurred; run in verbose mode for details 1 test failed: test_socket From guido@python.org Wed Jun 19 21:23:03 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 16:23:03 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 09:43:02 PDT." <3D10B496.67B19898@prescod.net> References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net> <3D10B496.67B19898@prescod.net> Message-ID: <200206192023.g5JKN3K02971@pcp02138704pcs.reston01.va.comcast.net> > > > Overall it isn't bad...it's a little weird to have a method that depends > > > on sys._getframe(1) (or as they say in Tcl-land "upvar"). It may set a > > > bad precedent... > > > > No, the real implementation will be in C. C functions always have > > access to locals and globals. > > I didn't mean it will be a bad precedent because of the implementation. I > mean that methods do not usually peek into their caller's variables, > even from C. What other methods do that? Dunno about methods, but locals(), globals(), vars() and dir() do this or something like it. > I'm still "+0" despite being somewhat uncomfortable with that > aspect. I think little would be lost if sub() always required a dict (or perhaps keyword args, although that feels like a YAGNI now). 
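[The two calling conventions under discussion, a positional mapping versus keyword arguments, can be contrasted with a small sketch; sub here is a free function standing in for the proposed string method, not its actual implementation.]

```python
import re

_PAT = re.compile(r"\$\{(\w+)\}")

def sub(template, mapping=None, **kwargs):
    d = dict(mapping or {})
    d.update(kwargs)                 # keywords override the mapping
    return _PAT.sub(lambda m: str(d[m.group(1)]), template)

s = "${name} was born in ${country}"
d = {"name": "Guido", "country": "the Netherlands"}
print(sub(s, d))     # dict already prepared (the common case)
print(sub(s, **d))   # keyword form; costs an extra dict copy
```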
I think that the key thing here is to set the precedent of using $ and the specific syntax proposed, not necessarily to have this as a built-in string method. Note: posixpath.expandvars() doesn't have $$, which is essential, and leaves unknown variables alone, which we (mostly) agree is not the right thing to do. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Wed Jun 19 21:23:55 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 22:23:55 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> Message-ID: <01bf01c217cf$407bcec0$ced241d5@hagrid> paul wrote: > $ is taught in hour 2, import is taught on day 2. says who? I usually mention "import" in the first hour (before methods), and nobody has ever had any problem with that... From guido@python.org Wed Jun 19 21:27:40 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 16:27:40 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 10:44:14 PDT." <3D10C2EE.CE833DB7@prescod.net> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> Message-ID: <200206192027.g5JKReA03020@pcp02138704pcs.reston01.va.comcast.net> > I really hate putting things in modules that will be needed in a Python > programmer's second program (the one after "Hello world"). If this is to > be the *simpler* way of doing introspection then getting at it should be > simpler than getting at "%".
$ is taught in hour 2, import is taught on > day 2. Some people may never make it to the metaphorical day 2 if they > are doing simple text processing in some kind of embedded-Python > environment. This is a good argument for making this a built-in (Barry, please add to your PEP!). Though I doubt that string % is taught in hour two -- you can do everything you want with str() and string concatenation, both of which *are* taught in hour two. (And you can do *most* of what you want with print, which is taught in hour one. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Wed Jun 19 21:29:20 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 22:29:20 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: Message-ID: <01f101c217d0$4e199110$ced241d5@hagrid> Tim Peters wrote: > [Andrew Kuchling] > > +1 on /F's suggestion of recycling the os.path.expandvars() code. > > -1 on that part: os.path.expandvars() is an ill-defined mess (the core has > more than one of them, varying by platform, and what they do differs in > platform-irrelevant ways). +1 on making Barry fix expandvars : I'm pretty sure my plan was to change *path.expandvars to def expandvars(string): return string.expandvars(string, os.environ) (and I've already sent SRE-based expandvars code to barry, so all he has to do is to check it in ;-) From gward@python.net Wed Jun 19 21:33:32 2002 From: gward@python.net (Greg Ward) Date: Wed, 19 Jun 2002 16:33:32 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020619024806.GA7218@lilith.my-fqdn.de> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <20020619024806.GA7218@lilith.my-fqdn.de> Message-ID: <20020619203332.GA9758@gerg.ca> On 19 June 2002, Gerhard Häring said: > * Barry A.
Warsaw [2002-06-18 22:34 -0400]: > > The problem is that unless your sysadmin hacks ld.so.conf to add > > /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path, > > bsddbmodule.so won't be linked in such a way that it can actually > > resolve the symbols at run time. > > [...] > > os.environ['LD_RUN_PATH'] = dblib_dir > > I may be missing something here, but AFAIC that's what the library_dirs > parameter in the Extension constructor of distutils is for. It basically > sets the runtime library path at compile time using the "-R" linker > option. No, library_dirs is for good old -L. AFAIK it works fine. For -R (or equivalent) you need runtime_library_dirs. I'm not sure if it works (or ever did). I think it's a question of knowing what magic options to supply to each compiler. Probably it works (worked) on Solaris, since for once Sun got things right and supplied a simple, obvious, working command-line option -- namely -R. Greg -- Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ Jesus Saves -- and you can too, by redeeming these valuable coupons! From guido@python.org Wed Jun 19 21:37:28 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 16:37:28 -0400 Subject: [Python-Dev] test_socket failure on FreeBSD In-Reply-To: Your message of "Wed, 19 Jun 2002 22:42:14 +1100." References: Message-ID: <200206192037.g5JKbSj03086@pcp02138704pcs.reston01.va.comcast.net> > Below is the output of test_socket with the -v option, from a CVS tree of > about 1915 UTC June 18. FreeBSD 4.4, gcc 2.95.3 (-g -O3). > > In speaking up now, I'm making the assumption that the non-blocking socket > changes should be complete, modulo bugfixes. If this is not the case, > please let me know, and I'll wait for the situation to stabilise. This is supposed to work, there's a missing feature but it's not being tested yet. :-) > Otherwise, is there any more info I can (attempt to) provide? 
I tried > "print"ing the addr variable when running the test, and just get "ERROR" > (sans quotes of course). Try print "\n" + repr(addr) There are probably some differences in the socket semantics. I'd appreciate it if you could provide a patch or at least a clue! (I'll be away from Friday through July 8.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@python.net Wed Jun 19 21:40:17 2002 From: gward@python.net (Greg Ward) Date: Wed, 19 Jun 2002 16:40:17 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <20020619124604.GB31653@ute.mems-exchange.org> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> Message-ID: <20020619204017.GB9758@gerg.ca> On 19 June 2002, Andrew Kuchling said: > It could live in the new text module, where Greg Ward's word-wrapping > code will be going. +1 on /F's suggestion of recycling the > os.path.expandvars() code. No, that's already checked in as textwrap.py. Greg From fredrik@pythonware.com Wed Jun 19 21:40:40 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 19 Jun 2002 22:40:40 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <01f101c217d0$4e199110$ced241d5@hagrid> Message-ID: <022901c217d1$a2ab6360$ced241d5@hagrid> > I'm pretty sure my plan was to change *path.expandvars to > > def expandvars(string): > return string.expandvars(string, os.environ) should of course have been: def expandvars(string): return text.expandvars(string, os.environ) From martin@v.loewis.de Wed Jun 19 21:43:01 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 19 Jun 2002 22:43:01 +0200 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020619182141.GA18944@zot.electricrain.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <20020619182141.GA18944@zot.electricrain.com> Message-ID: "Gregory P. Smith" writes: > If I hadn't made the initial mistake of naming pybsddb's module bsddb3 > when i first extended robin's berkeleydb 2.x module to work with 3.0 I > would agree with that name. I worry that having a module named bsddb2 > might cause endless confusion as bsddb and bsddb3 already exist and did > correlate to the version number. How about 'berkeleydb'? What does it have to do with the city of Berkeley (CA)? Perhaps "sleepycat"? Regards, Martin From guido@python.org Wed Jun 19 21:52:58 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 16:52:58 -0400 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: Your message of "19 Jun 2002 22:43:01 +0200." References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <20020619182141.GA18944@zot.electricrain.com> Message-ID: <200206192052.g5JKqw803249@pcp02138704pcs.reston01.va.comcast.net> > What does it have to do with the city of Berkeley (CA)? Perhaps > "sleepycat"? The company Sleepycat calls this particular product Berkeley DB, that's enough reason for me. They (may) have other products too, so Sleepycat is not sufficiently distinctive. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Jun 19 21:49:49 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 19 Jun 2002 22:49:49 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15632.52766.822003.689689@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.52766.822003.689689@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > Really? You know the path for the -R/--rpath flag, so all you need is > the magic compiler-specific incantation, and distutils already (or > /should/ already) know that. Yes, but you don't know whether usage of -R is appropriate. If the installed library is static, -R won't be needed. If then the target directory recorded with -R happens to be on an unavailable NFS server at run-time (on a completely different network), you cannot import the library module anymore, which would otherwise work perfectly fine. We had big problems with recorded library directories over the years; at some point, the administrators decided to take the machine that had /usr/local/lib/gcc-lib/sparc-sun-solaris2.3/2.5.8 on it offline. They did not know that they would thus make vim inoperable, which happened to be compiled with LD_RUN_PATH pointing to that directory - even though no library was ever needed from that directory. > I disagree. While the sysadmin should probably fiddle with > /etc/ld.so.conf when he installs BerkeleyDB, it's not documented in > the Sleepycat docs, so it's entirely possible that they haven't done > it. I'm not asking that the administrator fiddle with ld.so.conf. Instead, I'm asking that the administrator fiddle with Modules/Setup. > Is there some specific fear you have about compiling in the run-path? Yes, see above. > Note I'm not saying setting LD_RUN_PATH is the best approach, but it > seemed like the most portable. I couldn't figure out if distutils > knew what the right compiler-specific switches are (i.e.
"-R dir" on > Solaris cc if memory serves, and "-Xlinker -rpath -Xlinker dir" for > gcc, and who knows what for other Unix or Windows compilers). LD_LIBRARY_PATH won't work for Windows compilers, either. To my knowledge, there is nothing equivalent on Windows. Regards, Martin From skip@pobox.com Wed Jun 19 21:52:26 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 19 Jun 2002 15:52:26 -0500 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> Message-ID: <15632.61194.588186.196532@localhost.localdomain> Barry> Here's a small patch to setup.py which should fix things in a Barry> portable way, at least for *nix systems. It sets the envar Barry> LD_RUN_PATH to the location that it found the Berkeley library, Barry> but only if that envar isn't already set. Martin> I dislike that change. Setting LD_RUN_PATH is the job of Martin> whoever is building the compiler, and should not be done by Martin> Python automatically. Agreed. Also, is LD_RUN_PATH widely available? Martin> If setup.py fails to build an extension correctly, it is the Martin> administrator's job to specify a correct build procedure in Martin> Modules/Setup. For that reason, I rather recommend to remove the Martin> magic that setup.py looks in /usr/local/Berkeley*, instead of Martin> adding more magic. I'm happy with the current setup. While the /usr/local/BerkeleyN.M location is a bit odd, Sleepycat is pretty consistent in this regard. (At least versions 3 and 4 install this way.) I'd rather require sysadmins to run ldconfig or its equivalent. Most of the time people install packages using the default locations. In those situations where they don't, distutils accepts a couple environment variables which specify alternate search directories for libraries and include files.
Their names escape me at the moment, and I'm not sure they accept the usual colon-separated list of directories. If they don't, they should be suitably modified. It should probably be easy to specify these through some configure command line args. Skip From skip@pobox.com Wed Jun 19 22:10:24 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 19 Jun 2002 16:10:24 -0500 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15631.60841.28978.492291@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> Message-ID: <15632.62272.946354.832044@localhost.localdomain> BAW> I'm still having build trouble on my RH6.1 system, but maybe it's BAW> just too old to worry about (I /really/ need to upgrade one of BAW> these days BAW> ;). BAW> -------------------- snip snip -------------------- BAW> building 'bsddb' extension BAW> gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/local/BerkeleyDB.3.3/include -I. 
-I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o BAW> In file included from /home/barry/projects/python/Modules/bsddbmodule.c:25: BAW> /usr/local/BerkeleyDB.3.3/include/db_185.h:171: parse error before `*' BAW> /usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: type defaults to `int' in declaration of `__db185_open' BAW> /usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: data definition has no type or storage class BAW> /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbhashobject': BAW> /home/barry/projects/python/Modules/bsddbmodule.c:74: warning: assignment from incompatible pointer type BAW> /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbbtobject': BAW> /home/barry/projects/python/Modules/bsddbmodule.c:124: warning: assignment from incompatible pointer type BAW> /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbrnobject': BAW> /home/barry/projects/python/Modules/bsddbmodule.c:182: warning: assignment from incompatible pointer type BAW> -------------------- snip snip -------------------- I think you might have to define another CPP macro. In my post from last night about building dbmmodule.c I included define_macros=[('HAVE_BERKDB_H',None), ('DB_DBM_HSEARCH',None)], in the Extension constructor. Maybe DB_DBM_HSEARCH is also needed for older bsddb? I have no trouble building though. 
Skip From skip@pobox.com Wed Jun 19 22:15:16 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 19 Jun 2002 16:15:16 -0500 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.62564.638418.191453@localhost.localdomain> BAW> I still think we may want to pull PyBSDDB into the standard distro, BAW> as a way to provide BDB api's > 1.85. The question is, what would BAW> this new module be called? I dislike "bsddb3" -- which I think BAW> PyBSDDB itself uses -- because it links against BDB 4.0. Guido> Good idea. Maybe call it berkeleydb? That's what Sleepycat Guido> calls it (there's no connection with the BSD Unix distribution Guido> AFAICT). Why can't it just be called bsddb? As far as I could tell, it provides a bsddb-compatible interface at the module level. The only change at the bsddb level is the addition of an extra object (db? I can't recall right now and have to get offline soon for the credit card machine so I can't pause to check ;-) which gives the programmer access to all the PyBSDDB magic. Skip From skip@pobox.com Wed Jun 19 22:19:30 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 19 Jun 2002 16:19:30 -0500 Subject: [Python-Dev] Re: making dbmmodule still broken In-Reply-To: <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net> References: <15631.58711.213506.701945@localhost.localdomain> <15631.63833.440127.405556@anthem.wooz.org> <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.62818.352443.789431@localhost.localdomain> SM> I think it would probably be a good idea to alert the person SM> running make what library the module will be linked with. SM> Anyone else agree?
BAW> +1. The less guessing the builder has to do the better! Guido> Just don't start asking questions and reading answers from stdin. Agreed. If necessary, I would recommend adding an option to configure. Skip From Donald Beaudry Wed Jun 19 22:23:47 2002 From: Donald Beaudry (Donald Beaudry) Date: Wed, 19 Jun 2002 17:23:47 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net> <3D10B496.67B19898@prescod.net> <200206192023.g5JKN3K02971@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206192123.g5JLNlS25542@zippy.abinitio.com> Guido van Rossum wrote, > I think little would be lost if sub() always required a dict (or > perhaps keyword args, although that feels like a YAGNI now). Requiring the dict sounds about right to me. But, now when you consider that the sub() being discussed is little more than import re def sub(s, **kws): return re.sub(r"\${\w+}", lambda m, d=kws: d[m.group(0)[2:-1]], s) print sub("this is ${my} way to ${sub}", my="Don's", sub="do it") you just have to wonder what the fuss is really all about. Ease of use seems to be the issue. Should this variant of sub() just be added to the re module? With such a friendly introduction, it might coax new users into looking deeper into the power of re. There seems to be another issue though: the default value for the substitution dictionary and whether a KeyError or NameError should be raised when a key doesn't exist. Why not just define a new mapping object, returned from a call to namespace() that behaves something like this (bogus implementation): class namespace: def __getitem__(s, k): return eval(k) Then, print sub("this is ${my} way to ${sub}", **namespace()) Should do the right thing. The fun here is that the namespace() mechanism would be available for further abuse.
I see no reason to lock it up inside a string interpolation function. Consideration should even be given to allowing a frame index argument to be passed to it. So, def sub(s, **kws): if not kws: kws = namespace(-1) return re.sub(r"\${\w+}", lambda m, d=kws: d[m.group(0)[2:-1]], s) would do the complete job. But that might be too much like upvar ;) -- Donald Beaudry Ab Initio Software Corp. 201 Spring Street donb@abinitio.com Lexington, MA 02421 ...So much code... From greg@electricrain.com Wed Jun 19 22:25:59 2002 From: greg@electricrain.com (Gregory P. Smith) Date: Wed, 19 Jun 2002 14:25:59 -0700 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15632.62564.638418.191453@localhost.localdomain> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> Message-ID: <20020619212559.GC18944@zot.electricrain.com> On Wed, Jun 19, 2002 at 04:15:16PM -0500, Skip Montanaro wrote: > > BAW> I still think we may want to pull PyBSDDB into the standard distro, > BAW> as a way to provide BDB api's > 1.85. The question is, what would > BAW> this new module be called? I dislike "bsddb3" -- which I think > BAW> PyBSDDB itself uses -- because it links against BDB 4.0. > > Guido> Good idea. Maybe call it berkeleydb? That's what Sleepycat > Guido> calls it (there's no connection with the BSD Unix distribution > Guido> AFAICT). > > Why can't it just be called bsddb? As far as I could tell tell, it provides > a bsddb-compatible interface at the module level. The only change at the > bsddb level is the addition of an extra object (db? I can't recall right > now and have to get offline soon for the credit card machine so I can't > pause to check ;-) which gives the programmer access to all the PyBSDDB > magic. 
> > Skip Modern berkeleydb uses much different on disk database formats, glancing at the docs on sleepycat.com i don't even think it can read bsddb (1.85) files. Existing code using bsddb (1.85) should not automatically start using a different database library even if we provide a compatibility interface. That upgrade can be done to code manually using: import berkeleydb bsddb = berkeleydb (and creating a single bsddb module that used the old 1.85 library for the old interface and the 3.3/4.0 library for the modern interface would add bloat to many applications that don't need both if it were even possible to link that in such a way as to avoid the symbol conflicts) Greg -- Some mistakes are too much fun to make only once. From jeremy@zope.com Wed Jun 19 22:26:27 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 19 Jun 2002 17:26:27 -0400 Subject: [Python-Dev] PEP 8: Lists/tuples In-Reply-To: <200206170023.g5H0NxC00733@pcp02138704pcs.reston01.va.comcast.net> References: <20020616234555.GA3415@panix.com> <200206170023.g5H0NxC00733@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15632.63235.580901.707844@slothrop.zope.com> >>>>> "GvR" == Guido van Rossum writes: >> (I'm -1 myself, but I'd like to know what to tell my class.) GvR> Like it or not, that's what tuples are for. :-) That and storing homogenous lists in code objects and base classes and function closures and ... Jeremy From martin@v.loewis.de Wed Jun 19 22:41:53 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 19 Jun 2002 23:41:53 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15632.62564.638418.191453@localhost.localdomain> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> Message-ID: Skip Montanaro writes: > Why can't it just be called bsddb? If full compatibility is guaranteed, I'm all for it. Regards, Martin From tdelaney@avaya.com Thu Jun 20 01:31:53 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 20 Jun 2002 10:31:53 +1000 Subject: [Python-Dev] PEP 292, Simpler String Substitutions Message-ID: > From: barry@zope.com [mailto:barry@zope.com] > > >>>>> "MS" == Martin Sjögren writes: > > MS> What's the advantage of using ${name} and ${country} instead? > > There's a lot of empirical evidence that %(name)s is quite error > prone. Perhaps an unadorned %(name) should default to %(name)s? Tim Delaney From tdelaney@avaya.com Thu Jun 20 01:35:01 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 20 Jun 2002 10:35:01 +1000 Subject: [Python-Dev] PEP 292, Simpler String Substitutions Message-ID: > From: barry@zope.com [mailto:barry@zope.com] > > >>>>> "GvR" == Guido van Rossum writes: > > GvR> That's a matter of validating the template before accepting > GvR> it. > > True, which isn't hard to do. You can write a regexp to extract the > $names and then validate those. In fact, I think this is what newer > versions of xgettext do for Python code (albeit with the %(name)s > syntax). Of course, once you've validated a string, it's almost no extra work to do the interpolation in place (at least in source code size).
Tim Delaney From trentm@ActiveState.com Thu Jun 20 01:39:42 2002 From: trentm@ActiveState.com (Trent Mick) Date: Wed, 19 Jun 2002 17:39:42 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: ; from tdelaney@avaya.com on Thu, Jun 20, 2002 at 10:31:53AM +1000 References: Message-ID: <20020619173942.B28839@ActiveState.com> [Delaney, Timothy wrote] > > From: barry@zope.com [mailto:barry@zope.com] > > > > >>>>> "MS" == Martin Sjögren writes: > > > > MS> What's the advantage of using ${name} and ${country} instead? > > > > There's a lot of empirical evidence that %(name)s is quite error > > prone. > > Perhaps an unadorned %(name) should default to %(name)s? Or: - get pychecker2 working (the one that does not need to import modules that it checks, I *think* that that is one of the pychecker2 features) - get PyChecker in the core - provide a python flag to load the pychecker import hook to check your code when running it (say, '-w') - have PyChecker warn about "%(name)"-sans-formatting-character instances in strings (if it does not already). Trent -- Trent Mick TrentM@ActiveState.com From guido@python.org Thu Jun 20 01:50:11 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 20:50:11 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 17:39:42 PDT." <20020619173942.B28839@ActiveState.com> References: <20020619173942.B28839@ActiveState.com> Message-ID: <200206200050.g5K0oBv03714@pcp02138704pcs.reston01.va.comcast.net> > > > There's a lot of empirical evidence that %(name)s is quite error > > > prone. > > > > Perhaps an unadorned %(name) should default to %(name)s? Ambiguous, hence even more error-prone.
> Or: > - get pychecker2 working (the one that does not need to import modules > that it checks, I *think* that that is one of the pychecker2 features) > - get PyChecker in the core > - provide a python flag to load the pychecker import hook to check your > code when running it (say, '-w') > - have PyChecker warn about "%(name)"-sans-formatting-character > instances in strings (if it does not already). I'd rather have a notation that's less error-prone than a better way to check for errors. (Not that PyChecker 2 isn't a great idea. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Donald Beaudry Thu Jun 20 01:53:38 2002 From: Donald Beaudry (Donald Beaudry) Date: Wed, 19 Jun 2002 20:53:38 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: Message-ID: <200206200053.g5K0rch27979@zippy.abinitio.com> "Delaney, Timothy" wrote, > > From: barry@zope.com [mailto:barry@zope.com] > > > > >>>>> "MS" == Martin Sjögren writes: > > > > MS> What's the advantage of using ${name} and ${country} instead? > > > > There's a lot of empirical evidence that %(name)s is quite error > > prone. > > Perhaps an unadorned %(name) should default to %(name)s? The problem is knowing that it's unadorned. Now an unadorned %{name} could be interpreted as %(name)s. -- Donald Beaudry Ab Initio Software Corp. 201 Spring Street donb@abinitio.com Lexington, MA 02421 ...So little time... From joe@notcharles.ca Wed Jun 19 22:37:59 2002 From: joe@notcharles.ca (Joe Mason) Date: Wed, 19 Jun 2002 16:37:59 -0500 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net> References: <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020619213759.GA6092@plover.net> On Wed, Jun 19, 2002 at 08:52:38AM -0400, Guido van Rossum wrote: > I imagine that the most common use case is a situation where the dict > is already prepared.
I think **dict is slower than a positional dict > argument. I agree that keyword args would be useful in some cases > where you can't trust the string. As Barry noted, this isn't as powerful as PEP 215 (er, was that the right number? The earlier $interpolation one, anyway) because it doesn't allow arbitrary expressions. I'd imagine a common use case would be to shortcut an expression without binding it to a local variable, "The length is ${length}".sub(length = len(someString)) In this case it would be handy to use the default environment overridden by the new bindings, so you could do "The length of ${someString} is ${length}".sub(length = len(someString)) But that could get messy real fast. The idiom could be "The length of ${someString} is ${length}".sub(someString = someString, length = len(someString)) But that's ugly. Joe From guido@python.org Thu Jun 20 02:06:27 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 21:06:27 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 16:37:59 CDT." <20020619213759.GA6092@plover.net> References: <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net> <20020619213759.GA6092@plover.net> Message-ID: <200206200106.g5K16RD03949@pcp02138704pcs.reston01.va.comcast.net> > As Barry noted, this isn't as powerful as PEP 215 (er, was that the > right number? The earlier $interpolation one, anyway) because it > doesn't allow arbitrary expressions. That's intentional. Trying to put an expression parser in the interpolation code quickly leads to insanity. 
> I'd imagine a common use case > would be to shortcut an expression without binding it to a local > variable, > > "The length is ${length}".sub(length = len(someString)) > > In this case it would be handy to use the default environment overridden > by the new bindings, so you could do > > "The length of ${someString} is ${length}".sub(length = > len(someString)) > > But that could get messy real fast. The idiom could be > > "The length of ${someString} is ${length}".sub(someString = > someString, length = len(someString)) > > But that's ugly. I think you're simply trying to do too much in one line. Simple is better than complex. --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney@avaya.com Thu Jun 20 02:07:05 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 20 Jun 2002 11:07:05 +1000 Subject: [Python-Dev] PEP 292, Simpler String Substitutions Message-ID: > From: Guido van Rossum [mailto:guido@python.org] > > > > > There's a lot of empirical evidence that %(name)s is quite error > > > > prone. > > > > > > Perhaps an unadorned %(name) should default to %(name)s? > > Ambiguous, hence even more error-prone. Fair enough. I couldn't off the top of my head think of an ambiguous case, but of course there's '%(thing)s'ly-yours' ... PyChecker2 should *definitely* include checking format strings IMO, irrespective of whether $ formatting gets in. But only as a warning (because of the above case). Tim Delaney From neal@metaslash.com Thu Jun 20 02:17:27 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 19 Jun 2002 21:17:27 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: Message-ID: <3D112D27.BC846E5A@metaslash.com> "Delaney, Timothy" wrote: > PyChecker2 should *definitely* include checking format strings IMO, > irrespective of whether $ formatting gets in. But only as a warning (because > of the above case). It does, but there are a few bugs in it right now. 
Neal From tim.one@comcast.net Thu Jun 20 02:39:35 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 19 Jun 2002 21:39:35 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Message-ID: [Delaney, Timothy] > Fair enough. I couldn't off the top of my head think of an ambiguous case, The difficulty is that a blank is a kosher flag modifier in C formats. So, e.g., 'goofy%(name) dogs' is a legitimate Python format as is, and

    >>> print 'goofy%(name) dogs' % {'name': 666}
    goofy 666ogs
    >>>

works correctly. Why *Python* follows these goofy rules in all respects is a question we don't ask. From paul@prescod.net Thu Jun 20 02:58:26 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 19 Jun 2002 18:58:26 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <200206192027.g5JKReA03020@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D1136C2.1A7E0D47@prescod.net> Guido van Rossum wrote: > >... > Though I doubt that string % is taught in hour two -- you can do > everything you want with str() and string concatenation, both of which > *are* taught in hour two. If there were an easy way to do interpolation I might well want to teach it before any of str() or string concatenation. And I would probably treat it in preference to the magic and special "," operator of the print statement. I prefer to teach something that is generally useful like $ rather than something which they may have to unlearn like "," -- unlearn to the extent that they will naturally expect that commas in other contexts will do whitespace-generating concatenation and they hardly ever will.
Paul Prescod From guido@python.org Thu Jun 20 03:15:58 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 22:15:58 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 18:58:26 PDT." <3D1136C2.1A7E0D47@prescod.net> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <200206192027.g5JKReA03020@pcp02138704pcs.reston01.va.comcast.net> <3D1136C2.1A7E0D47@prescod.net> Message-ID: <200206200215.g5K2FwJ04359@pcp02138704pcs.reston01.va.comcast.net> > If there were an easy way to do interpolation I might well want to > teach it before any of str() or string concatenation. I'm afraid your students would end up appending a character c to a string s by writing

    s = "$s$c".sub()

Not exactly good style. > And I would probably treat it in preference to the magic and special > "," operator of the print statement. I object to this insinuation. > I prefer to teach something that is generally useful like $ rather > than something which they may have to unlearn like "," -- unlearn to > the extent that they will naturally expect that commas in other > contexts will do whitespace-generating concatenation and they hardly > ever will. (a) You're making this argument up. I don't believe for a second that you've observed this mistake in an actual student. (b) I expect that students never even *think* about the space between printed items -- it's entirely natural. (c) Commas are designed to "disappear" in our interpretation of things, and they do. The comma has so many uses where whitespace generation is just not one of the things you could possibly think about that I find it hard to take this argument seriously.
--Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Thu Jun 20 03:20:49 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 19 Jun 2002 22:20:49 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> Message-ID: [Barry A. Warsaw] > This PEP describes a simpler string substitution feature, also > known as string interpolation. This PEP is "simpler" in two > respects: > 1. Python's current string substitution feature (commonly known as > %-substitutions) is complicated and error prone. This PEP is > simpler at the cost of less expressiveness. > 2. PEP 215 proposed an alternative string interpolation feature, > introducing a new `$' string prefix. PEP 292 is simpler than > this because it involves no syntax changes and has much simpler > rules for what substitutions can occur in the string. For one, I do not like seeing `$' as a string prefix in Python, and wonder if we could not merely go with `%' as we always did in Python. At least, it keeps a kind of clear cut distance between Python and Perl. :-) > In addition, the rules for what can follow a % sign are fairly > complex, while the usual application rarely needs such complexity. This premise seems exaggerated to me. `%' as it stands is not that complex to understand. Moreover, many of us use `%' formatting a lot, so it is not so rare that the current `%' specification is useful. > 1. $$ is an escape; it is replaced with a single $ Let's suppose we stick with `%', the above rule reduces to something already known. > 3. ${identifier} [...] We could use %{identifier} as meaning `%(identifier)s'. Clean. Simple. > 2. $identifier [...] This is where the difficulty lies. Since the PEP already suggests that ${identifier} was to be preferred over $identifier, why not just go a bit forward, and drop 2.
altogether? Or else, how do you justify that using it really makes things more legible? Then, the whole proposal would reduce to adding %{identifier}, and instead of having `.sub()' methods or whatever, just stick with what we already have. This would be a mild change instead of a whole new feature, and keep Python a little more wrapped to itself. Interpolation proposals I've seen always looked a bit awkward and foreign so far. I guess that merely adding %{identifier} would wholly satisfy the given justifications for the PEP (that is, giving a means for people to avoid the error-prone %()s), with a minimal impact on the current Python definition, and a bit less of a surprise. Python does not have to look like Perl to be useful, you know! :-) > Handling Missing Keys This would be a non-issue, given that the behaviour of %(identifier)s for an undefined identifier is already what we want. > The mapping argument is optional; if it is omitted then the mapping is > taken from the locals and globals of the context in which the .sub() > method is executed. This is an interesting idea. However, there are other contexts where the concept of a compound dictionary of all globals and locals would be useful. Maybe we could have some allvars() similar to globals() and locals(), and use `... % allvars()' instead of `.sub()'? So this would serve both string interpolation and other avenues. I hope I have succeeded in expressing my feeling that we should try to keep string interpolation natural to what Python already is. We should not carelessly multiply paradigms. -- François Pinard http://www.iro.umontreal.ca/~pinard From guido@python.org Thu Jun 20 03:30:59 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 22:30:59 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "19 Jun 2002 22:20:49 EDT."
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> Message-ID: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> > For one, I do not like seeing `$' as a string prefix in Python, and > wonder if we could not merely go with `%' as we always did in > Python. At least, it keeps a kind of clear cut distance between > Python and Perl. :-) The $ means "substitution" in so many languages besides Perl that I wonder where you've been. > > In addition, the rules for what can follow a % sign are fairly > > complex, while the usual application rarely needs such complexity. > > This premise seems exaggerated to me. `%' as it stands is not that > complex to understand. Moreover, many of us use `%' formatting a lot, > so it is not so rare that the current `%' specification is useful. I quite like the positional % substitution. I think %(...)s was a mistake -- what we really wanted was ${...}. > > 1. $$ is an escape; it is replaced with a single $ > > Let's suppose we stick with `%', the above rule reduces to something > already known. > > > 3. ${identifier} [...] > > We could use %{identifier} as meaning `%(identifier)s'. Clean. Simple. Confusing. The visual difference between () and {} is too small. > > 2. $identifier [...] > > This is where the difficulty lies. Since the PEP already suggests that > ${identifier} was to be preferred over $identifier, why not just go a bit > forward, and drop 2. altogether? Or else, how do you justify that using > it really make things more legible? Less clutter. Compare "My name is $name, I live in $country" to "My name is ${name}, I live in ${country}" The {} add nothing but noise. We're copying this from the shell. 
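[The rule Guido describes here — braces are only needed when the name is directly followed by another word character — is exactly what string.Template later implemented. A quick demonstration against the eventual API:]

```python
from string import Template

# Bare $name works when a non-word character follows it...
t = Template("My name is $name, I live in $country")
print(t.substitute(name="Guido", country="the Netherlands"))
# My name is Guido, I live in the Netherlands

# ...but braces are needed when a word character follows, because
# "$names" would look up the key 'names' instead of 'name'.
print(Template("${name}s favourite language").substitute(name="Guido"))
# Guidos favourite language
```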
--Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Thu Jun 20 03:43:53 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 19 Jun 2002 22:43:53 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > The $ means "substitution" in so many languages besides Perl that I > wonder where you've been. Of course, I've been elsewhere. But Python currently uses `%' for driving interpolation, and on this topic, I've been with Python, if you wonder :-). > I quite like the positional % substitution. I think %(...)s was a > mistake -- what we really wanted was ${...}. The distinction between %()s and %()r, recently introduced, has been useful. But with str() and repr(), only one of those is really necessary. But it gave the impression that the Python trend is pushing for % to get stronger. The proposal of using $ as yet another formatting avenue makes it weaker. > Less clutter. Compare > "My name is $name, I live in $country" > to > "My name is ${name}, I live in ${country}" > The {} add nothing but noise. We're copying this from the shell. Noise decreases legibility. So, maybe the PEP should not say that ${name} is to be preferred over $name? Or else, it should explain why. -- François Pinard http://www.iro.umontreal.ca/~pinard From guido@python.org Thu Jun 20 03:50:51 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 22:50:51 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "19 Jun 2002 22:43:53 EDT."
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206200250.g5K2opM04538@pcp02138704pcs.reston01.va.comcast.net> > The distinction between %()s and %()r, recently introduced, has been > useful. But with str() and repr(), only one of those is really > necessary. But it gave the impression that the Python trend is pushing > for % to get stronger. The proposal of using $ as yet another > formatting avenue makes it weaker. Language evolution doesn't always go in a straight line. > > Less clutter. Compare > > > "My name is $name, I live in $country" > > > to > > > "My name is ${name}, I live in ${country}" > > > The {} add nothing but noise. We're copying this from the shell. > > Noise decreases legibility. So, maybe the PEP should not say that > ${name} is to be preferred over $name? Or else, it should explain > why. I agree that I see no reason to prefer ${name} (except when followed by another word character of course). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Jun 20 03:53:32 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 22:53:32 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> Message-ID: <15633.17324.467335.416736@anthem.wooz.org> >>>>> "AK" == Andrew Kuchling writes: AK> (Maybe a syntax-checker for %(...) strings would solve AK> Mailman's problems, and alleviate the plaintive cries for an AK> alternative interpolation syntax?) If I had to do it over again, I would have used $name in i18n source strings from the start. It would have saved lots of headaches and broken translations.
People just seem to get $names whereas they get %(name)s wrong too often. (Little known MM2.1 fact: you can actually convert your headers and footers to $name substitutions, but it's a hack. Later, it might be required. One of the outgrowths of experimenting with this was to add a %(name)s checker and now bogus names in the %(...)s are flagged as errors, while missing trailing `s's are flagged as warnings and auto-corrected.) -Barry From barry@zope.com Thu Jun 20 04:00:52 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 23:00:52 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <20020619124604.GB31653@ute.mems-exchange.org> Message-ID: <15633.17764.219180.975870@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> -1 on that part: os.path.expandvars() is an ill-defined mess TP> (the core has more than one of them, varying by platform, and TP> what they do differs in platform-irrelevant ways). +1 on TP> making Barry fix expandvars : TP> http://www.python.org/sf/494589 Heck, I've never even /used/ expandvars (I think that's the first time I've even typed that sequence of letters). Plus if it works on Unix, what more could you want? I'd say Mr. Neal Odd Body is handling that bug report just fine. -Barry From barry@zope.com Thu Jun 20 04:13:55 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 23:13:55 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net> <3D10B496.67B19898@prescod.net> <200206192023.g5JKN3K02971@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15633.18547.455980.465004@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> I think little would be lost if sub() always required a dict GvR> (or perhaps keyword args, although that feels like a YAGNI GvR> now). 
IME, doing the locals/globals trick is really helpful, but I might be willing to let go on that one, because I can wrap that functionality in my _() function. The reason for this is that I'd say 80-90% of the time, you have the value you want to interpolate into the string sitting around handy in a local variable. And that local variable has the name of the key in the template string. So what you (would) end up doing is:

    ... name = getNameSomehow()
    ... country = getCountryOfOrigin(name)
    ... return '$name was born in $country'.sub({'name': name,
    ...                                          'country': country})

Do that a few hundred times and you start wanting to make that a lot more concise. :) GvR> I think that the key thing here is to set the precedent of GvR> using $ and the specific syntax proposed, not necessarily to GvR> have this as a built-in string method. I'll note that before this idea gained PEPhood, Guido and I discussed using an operator, like: return '$name was born in $country' / dict but came around to the current proposal's string method. I agree with Guido that it's the use of $strings that is important here, and I don't care how the interpolation is actually done (builtin, string method, etc.), though relegating it to a module would, I think, make this a rarely used syntax. -Barry From pobrien@orbtech.com Thu Jun 20 04:25:34 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 19 Jun 2002 22:25:34 -0500 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > I quite like the positional % substitution. I think %(...)s was a > mistake -- what we really wanted was ${...}. What is the advantage of curly braces over parens in this context? +1 on the allvars() suggestion also. -- Patrick K. O'Brien Orbtech ----------------------------------------------- "Your source for Python software development."
----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From guido@python.org Thu Jun 20 04:32:37 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 23:32:37 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 22:25:34 CDT." References: Message-ID: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net> > > I quite like the positional % substitution. I think %(...)s was a > > mistake -- what we really wanted was ${...}. > > What is the advantage of curly braces over parens in this context? Apart from Make, most $ substituters use ${...}, not $(...). > +1 on the allvars() suggestion also. I have no idea what you are talking about. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Jun 20 04:31:36 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 23:31:36 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15633.19608.392087.60416@anthem.wooz.org> >>>>> "FP" == François Pinard writes: FP> So, maybe the PEP should not say that ${name} is to be FP> preferred over $name? Agreed. -Barry From barry@zope.com Thu Jun 20 04:34:38 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 23:34:38 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> Message-ID: <15633.19790.152438.926329@anthem.wooz.org> >>>>> "FP" == François Pinard writes: FP> However, there are other contexts where the concept of a FP> compound dictionary of all globals and locals would be useful.
FP> Maybe we could have some allvars() similar to globals() and FP> locals(), and use `... % allvars()' instead of `.sub()'? So FP> this would serve both string interpolation and other avenues. Or maybe just make vars() do something more useful when no arguments are given? In any event, allvars() or a-different-vars() is out of scope for this PEP. We'd use it if it was there, but I think it needs its own PEP, which someone else will have to champion. -Barry From damien.morton@acm.org Thu Jun 20 04:35:32 2002 From: damien.morton@acm.org (Damien Morton) Date: Wed, 19 Jun 2002 23:35:32 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions Message-ID: <006201c2180b$883c1120$72976c42@damien> "I'd rather have a notation that's less error-prone than a better way to check for errors. (Not that PyChecker 2 isn't a great idea. :-)" Percent "%s" notation already exists for strings. Backquote notation is already in Python, though I think it is little used. The $ notation reeks of obscure languages such as perl and shell. Why not simply add backquote notation to Python strings. I read in a recent email from Timbot, I think, that the backquote notation was originally intended for string interpolation too.

    name = "guido"
    country = "the netherlands"
    height = 1.92

    "`name` is from `country`".sub() -> "guido is from the netherlands"
    "`name.capitalize()` is from `country`" -> "Guido is from the netherlands"
    "`name` is %`height`4.1f meters tall".sub() -> "guido is 1.9 meters tall"
    "`name.capitalize()` can jump `height*1.7` meters".sub() -> "guido can jump 3.264 meters"

You could probably compile these interpolation strings as well. Another thought: One of the main problems with the "%(name)4.2f" notation is that the format comes after the variable name. It's easy to forget adding the actual format specifier in after the name. Why not alter the notation to allow the format specifier to come before the name part.
"%4.2f(height)" I think would be a whole lot less error prone, and would allow for the format specifier to default to "s" where omitted. "%(height)" is also less error prone, though it is ambiguous in the current scheme. I've seen a nice class which evaluates the string used in the name part:

    class itpl:
        def __getitem__(self, s):
            return eval(s, globals())

From guido@python.org Thu Jun 20 04:46:33 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 19 Jun 2002 23:46:33 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 23:35:32 EDT." <006201c2180b$883c1120$72976c42@damien> References: <006201c2180b$883c1120$72976c42@damien> Message-ID: <200206200346.g5K3kXV06637@pcp02138704pcs.reston01.va.comcast.net> > The $ notation reeks of obscure languages such as perl and shell. Sigh. Please grow up. > Why not simply add backquote notation to python strings. I read in a > recent email from Timbot, I think, that the backquote notation was > originally intended for string interpolation too. Unfortunately, backquotes are often hard to see, or mistaken for forward quotes. I think that disqualifies it. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Jun 20 04:45:32 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Jun 2002 23:45:32 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15633.20444.828514.232578@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Apart from Make, most $ substituters use ${...}, not $(...). GNU Make allows either braces or parentheses; there's no difference between the two. So it's a pretty strong precedent in lots of Unix tools. GNU Make also uses the $$ escape.
-Barry From David Abrahams" Message-ID: <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com> From: "Damien Morton" > "`name.capitalize()` can jump `height*1.7` meters".sub() -> "guido can > jump 3.264 meters" I love this suggestion. It's the sort of thing you can't do in C++ ;-) I suspect the arguments against will run to efficiency and complexity, since you need to compile the backquoted expressions (in some context). Hmm, here they are... Nope, I'm wrong -Dave From pobrien@orbtech.com Thu Jun 20 05:02:58 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 19 Jun 2002 23:02:58 -0500 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > > +1 on the allvars() suggestion also. > > I have no idea what you are talking about. :-( François Pinard made the following suggestion and I think something along the lines of allvars() would be very handy, especially with the html stuff I've been doing lately: > This is an interesting idea. However, there are other contexts where the > concept of a compound dictionary of all globals and locals would > be useful. > Maybe we could have some allvars() similar to globals() and locals(), > and use `... % allvars()' instead of `.sub()'? So this would serve both > string interpolation and other avenues. -- Patrick K. O'Brien Orbtech ----------------------------------------------- "Your source for Python software development." ----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From guido@python.org Thu Jun 20 05:06:55 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 00:06:55 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 23:55:44 EDT." 
<0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com> References: <006201c2180b$883c1120$72976c42@damien> <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com> Message-ID: <200206200406.g5K46tB06740@pcp02138704pcs.reston01.va.comcast.net> > I love this suggestion. It's the sort of thing you can't do in C++ ;-) > I suspect the arguments against will run to efficiency and complexity, > since you need to compile the backquoted expressions (in some context). Actually, I had planned a secret feature that skips matching nested {...} inside ${...}, so that you could write a magic dict whose keys were eval()'ed in the caller's context. The %(...) parser does this (skipping nested (...)) because someone wanted to do that. --Guido van Rossum (home page: http://www.python.org/~guido/) From damien.morton@acm.org Thu Jun 20 05:14:13 2002 From: damien.morton@acm.org (Damien Morton) Date: Thu, 20 Jun 2002 00:14:13 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <200206200407.g5K47f906751@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <006601c21810$ef52a770$72976c42@damien> LOL. You're right of course :) `...` is already in Python though. Better to generalise and/or extend an already existing construct than to add a new one, I would assert. You're also right about `...` being less visible than other pairs of delimiters. I think, though, that if we modify the % notation, the requirement should be for delimiters that are not parentheses, and ones that wouldn't interfere with putting expressions between them. This would allow for eval()ing the content between the delimiters. > -----Original Message----- > From: guido@pcp02138704pcs.reston01.va.comcast.net > [mailto:guido@pcp02138704pcs.reston01.va.comcast.net] On > Behalf Of Guido van Rossum > Sent: Thursday, 20 June 2002 00:08 > To: Damien Morton > Subject: Re: [Python-Dev] PEP 292, Simpler String Substitutions > > > > I stand by my position though.
I've been programming for a long time, > and I have rarely come across the $ notation. Mind you, I don't work > in unix very often, and I would hazard a guess that the $ substitution > comes mainly from unix. > > So does `...`. :-) > > --Guido van Rossum (home page: http://www.python.org/~guido/) > From guido@python.org Thu Jun 20 05:23:26 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 00:23:26 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 00:14:13 EDT." <006601c21810$ef52a770$72976c42@damien> References: <006601c21810$ef52a770$72976c42@damien> Message-ID: <200206200423.g5K4NQt06841@pcp02138704pcs.reston01.va.comcast.net> > `...` is already in Python though. But not in this form. > Better to generalise and/or extend an already existing construct than to > add a new one, I would assert. Only if that construct is successful. `...` has a bad rep -- people by and large prefer repr(). > You're also right about `...` being less visible than other pairs of > delimiters. > > I think, though, that if we modify the % notation, the requirement > should be for delimiters that are not parentheses, and ones that > wouldn't interfere with putting expressions between them. This would > allow for eval()ing the content between the delimiters. You could do that with ${...} if it skipped nested {...} inside, which I plan to implement as an (initially secret) extension. That should be good enough. If you try to put string literals containing { or } inside ${...} you deserve what you get. :-) Still, this wouldn't work:

    x = 12
    y = 14
    print "$x times $y equals ${x*y}".sub()

You'd have to pass a special magic dict (which I *won't* supply). --Guido van Rossum (home page: http://www.python.org/~guido/) From pobrien@orbtech.com Thu Jun 20 05:23:10 2002 From: pobrien@orbtech.com (Patrick K.
O'Brien) Date: Wed, 19 Jun 2002 23:23:10 -0500 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > > > I quite like the positional % substitution. I think %(...)s was a > > > mistake -- what we really wanted was ${...}. > > > > What is the advantage of curly braces over parens in this context? > > Apart from Make, most $ substituters use ${...}, not $(...). I guess what I was really wondering is whether that advantage clearly outweighs some of the possible disadvantages. I'm not a fan of curly braces and I'll be sad to see more of them in Python. There's something refreshing about only having curly braces for dictionaries and parens everywhere else. And since the existing string substitution uses parens, why shouldn't the new one? It won't surprise me if you've already considered all this and are fine with using curly braces here, but I just had to ask before it is a done deal. (And I promise I won't go on a boolean crusade and predict that curly braces will appear everywhere to the demise of the language.) -- Patrick K. O'Brien Orbtech ----------------------------------------------- "Your source for Python software development." ----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From guido@python.org Thu Jun 20 05:28:34 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 00:28:34 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Wed, 19 Jun 2002 23:23:10 CDT." References: Message-ID: <200206200428.g5K4SZh06890@pcp02138704pcs.reston01.va.comcast.net> > I'm not a fan of curly braces and I'll be sad to see more of them in > Python. This seems more emotional than anything else.
--Guido van Rossum (home page: http://www.python.org/~guido/) From pobrien@orbtech.com Thu Jun 20 05:33:15 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 19 Jun 2002 23:33:15 -0500 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206200428.g5K4SZh06890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > > I'm not a fan of curly braces and I'll be sad to see more of them in > > Python. > > This seems more emotional than anything else. Definitely. And habit. Since I program mostly in Python I'm used to {} meaning dictionary and I'm used to typing parens everywhere else. Others who are used to ${} for string substitution in other contexts will be happy that you copied that syntax. I'm just trying to see if there is anything more substantial involved. Sounds like there isn't. And that's fine. I'll adapt. :-) -- Patrick K. O'Brien Orbtech ----------------------------------------------- "Your source for Python software development." ----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From tdelaney@avaya.com Thu Jun 20 06:24:48 2002 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 20 Jun 2002 15:24:48 +1000 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions Message-ID: > From: Patrick K. O'Brien [mailto:pobrien@orbtech.com] > > Definitely. And habit. Since I program mostly in Python I'm used to {} > meaning dictionary and I'm used to typing parens everywhere In that case you should be happy. ${} is using a dictionary as its source ... Actually, for consistency, it should probably be $[] to suggest accessing a dictionary element. 
But I won't go down that path ;) Tim Delaney From python@rcn.com Thu Jun 20 06:38:35 2002 From: python@rcn.com (Raymond Hettinger) Date: Thu, 20 Jun 2002 01:38:35 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <006201c2180b$883c1120$72976c42@damien> Message-ID: <00cb01c2181c$b97fb320$d2f7a4d8@othello> From: "Damien Morton" > Why not simply add backquote notation to python strings. I read in a > recent email from Timbot, I think, that the backquote notation was > originally intended for string interpolation too. > > "`name` is from `country`".sub() > "`name.capitalize()` is from `country`".sub() > "`name` is %`height`4.1f meters tall".sub() > "`name.capitalize()` can jump `height*1.7` meters".sub() I'll bet this style would be brutal to read with the accented letters in French. Raymond Hettinger From Oleg Broytmann Thu Jun 20 09:16:08 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Thu, 20 Jun 2002 12:16:08 +0400 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: ; from martin@v.loewis.de on Wed, Jun 19, 2002 at 10:43:01PM +0200 References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <20020619182141.GA18944@zot.electricrain.com> Message-ID: <20020620121608.H22899@phd.pp.ru> On Wed, Jun 19, 2002 at 10:43:01PM +0200, Martin v. Loewis wrote: > What does it have to do with the city of Berkeley (CA)? Perhaps > "sleepycat"? from sleepycat import berkeleydb from sleepycat import bsddb2 from sleepycat import bsddb3 from sleepycat import bsddb4 ??? Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
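[Editor's note: the $name / ${name} substitution debated in this thread can be sketched in a few lines. This is only an illustration built on re — the function name `sub` and its exact grammar are assumptions for the sketch, not PEP 292's actual implementation, and details such as $$ escaping are skipped.]

```python
import re

# Rough sketch of $name / ${name} template substitution as discussed
# around PEP 292. Illustrative only: no $$ escaping, no error recovery.
def sub(template, mapping):
    pattern = re.compile(r"\$(?:\{(\w+)\}|(\w+))")
    def repl(match):
        # Either the braced form ${name} (group 1) or bare $name (group 2).
        key = match.group(1) or match.group(2)
        return str(mapping[key])
    return pattern.sub(repl, template)

print(sub("${name} is from ${country}",
          {"name": "Guido", "country": "the Netherlands"}))
# Guido is from the Netherlands
```

The braced form matters for exactly the reason raised later in the thread: ${name} makes it unambiguous where the placeholder name ends, with no trailing format-suffix character.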
From oren-py-d@hishome.net Thu Jun 20 08:18:56 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Thu, 20 Jun 2002 03:18:56 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <006201c2180b$883c1120$72976c42@damien> References: <006201c2180b$883c1120$72976c42@damien> Message-ID: <20020620071856.GA10497@hishome.net> On Wed, Jun 19, 2002 at 11:35:32PM -0400, Damien Morton wrote: > Why not simply add backquote notation to python strings. I read in a > recent email from Timbot, I think, that the backquote notation was > originally intended for string interpolation too. See http://tothink.com/python/embedpp Oren From fredrik@pythonware.com Thu Jun 20 11:17:14 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 12:17:14 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> Message-ID: <02fd01c21843$a959df30$ced241d5@hagrid> Greg wrote: > No, that's already checked in as textwrap.py. are you saying that you cannot rename stuff under CVS? not even delete it, and check it in again under a new name? (it's a brand new module, after all, so the history shouldn't matter much) I'm +1 on adding a text utility module for occasionally useful stuff like wrapping, getting rid of gremlins, doing various kinds of substitutions, centering/capitalizing/padding and otherwise formatting strings, searching/parsing, and other fun stuff that your average text editor can do (and +0 on using the existing "string" module for that purpose, but I can live with another name). 
I'm -1 on adding one specialized module, and then rejecting other things because we have nowhere to put it, or because adding a useful function to a corner of the standard library is so much harder than adding a method to a core data type, or because a single trainer has decided that he doesn't want to teach his classes to use modules and call functions... From tismer@tismer.com Thu Jun 20 11:27:04 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 20 Jun 2002 12:27:04 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid> Message-ID: <3D11ADF8.9090802@tismer.com> Fredrik Lundh wrote: > paul wrote: > >>$ is taught in hour 2, import is taught on day 2. > > > says who? > > I usually mention "import" in the first hour (before methods), > and nobody has ever had any problem with that... Well, same here, but that might change, since the string module is nearly obsolete. You can show reasonably powerful stuff(*) without a single import. (*) and that's what you need to get people interested. -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From fredrik@pythonware.com Thu Jun 20 11:52:26 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 12:52:26 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid> <3D11ADF8.9090802@tismer.com> Message-ID: <038e01c21848$9bef5320$ced241d5@hagrid> christian wrote: > > I usually mention "import" in the first hour (before methods), > > and nobody has ever had any problem with that... > > Well, same here, but that might change, since the string > module is nearly obsolete. You can show reasonably > powerful stuff(*) without a single import. > > (*) and that's what you need to get people interested. I usually start out with something web-oriented (which means urllib). how about adding a "get" method to strings? or an "L" prefix character that causes Python to wrap it up in a simple URL container: print url"http://www.python.org".read() ::: but in practice, if you really want people to get interested, make sure you have a domain-specific library installed on the training machines. why care about string fiddling when your second python program (after print "hello world") can be: import noaa im = noaa.open("noaa16_20020620_1021") im.rectify("euro") im.show() From tismer@tismer.com Thu Jun 20 12:05:20 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 20 Jun 2002 13:05:20 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: Message-ID: <3D11B6F0.5000803@tismer.com> Patrick K. O'Brien wrote: > [Guido van Rossum] > >>I quite like the positional % substitution. I think %(...)s was a >>mistake -- what we really wanted was ${...}. 
> > > What is the advantage of curly braces over parens in this context? It unambiguously spells that there is no format suffix char. > +1 on the allvars() suggestion also. me too. -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From greg@cosc.canterbury.ac.nz Thu Jun 20 08:25:34 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 20 Jun 2002 19:25:34 +1200 (NZST) Subject: [Python-Dev] Weird problem with exceptions raised in extension module Message-ID: <200206200725.g5K7PYt26747@oma.cosc.canterbury.ac.nz> I'm getting strange behaviour when raising an exception in an extension module generated by Pyrex. The extension module does the equivalent of def foo(): raise TypeError("Test-Exception") If I invoke it with the following Python code: try: mymodule.foo() except IOError: print "blarg" the following happens: Traceback (most recent call last): File "", line 3, in ? SystemError: 'finally' pops bad exception This only happens when the try-except catches something *other* than the exception being raised. If the exception being raised is caught, or no exception catching is done, the exception is handled properly. Also, it only happens when an *instance* is used as the exception object. If I do this instead: raise TypeError, "Test-Exception" the problem doesn't occur. The relevant piece of C code generated by Pyrex is as follows. Can anyone see if I'm doing anything wrong? (I'm aware that there's a missing Py_DECREF, but it shouldn't be causing this sort of thing.) The Python version I'm using is 2.2. 
__pyx_1 = __Pyx_GetName(__pyx_b, "TypeError"); if (!__pyx_1) goto __pyx_L1; __pyx_2 = PyString_FromString(__pyx_k1); if (!__pyx_2) goto __pyx_L1; __pyx_3 = PyTuple_New(1); if (!__pyx_3) goto __pyx_L1; PyTuple_SET_ITEM(__pyx_3, 0, __pyx_2); __pyx_2 = 0; __pyx_4 = PyObject_CallObject(__pyx_1, __pyx_3); if (!__pyx_4) goto __pyx_L1; Py_DECREF(__pyx_3); __pyx_3 = 0; PyErr_SetNone(__pyx_4); Py_DECREF(__pyx_4); __pyx_4 = 0; goto __pyx_L1; /*...*/ __pyx_L1:; Py_XDECREF(__pyx_1); Py_XDECREF(__pyx_2); Py_XDECREF(__pyx_3); Py_XDECREF(__pyx_4); __pyx_r = 0; __pyx_L0:; return __pyx_r; Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tismer@tismer.com Thu Jun 20 12:52:06 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 20 Jun 2002 13:52:06 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid> <3D11ADF8.9090802@tismer.com> <038e01c21848$9bef5320$ced241d5@hagrid> Message-ID: <3D11C1E6.4060508@tismer.com> Fredrik Lundh wrote: > christian wrote: > > >>>I usually mention "import" in the first hour (before methods), >>>and nobody has ever had any problem with that... >> >>Well, same here, but that might change, since the string >>module is nearly obsolete. You can show reasonably >>powerful stuff(*) without a single import. >> >>(*) and that's what you need to get people interested. > > > I usually start out with something web-oriented (which means > urllib). how about adding a "get" method to strings? 
or an "L" prefix character that causes Python to wrap it up in a simple > URL container: > > print url"http://www.python.org".read() *puke* > but in practice, if you really want people to get interested, > make sure you have a domain-specific library installed on the > training machines. why care about string fiddling when your > second python program (after print "hello world") can be: Yes, I know. I didn't want to make a point, just to point out that it is possible to show neat stuff without import. Sure, the next thing I show is COM stuff or formatted stock market reports, using urllib, xml... -- no point. --- the rest below is not to Fredrik but the whole thread --- I'd like to express my opinion at this place (which is as good as any other place in such a much-too-fast growing thread): The following statements are ordered by increasing hate. 1 - I do hate the idea of introducing a "$" sign at all. 2 - giving "$" special meaning in strings via a module 3 - doing it as a builtin function 4 - allowing it to address local/global variables Version 4 as worst comes visually quite close to languages like Perl. In another post, Guido answered such objection with "grow up". While my emotional reaction would be to reply with "wake up!", I have some rational reasons why I don't like this: I have to read and sometimes write lots of Perl code. The massive use of "$" gives me true headache. I don't want Python to remind me of headaches. One argument was that "$" and the unembraced usage in "$name" is so common and therefore easy to sell to Python newbies. Fine, but no reason to adopt this overly abused character. Instead, I'm happy that exactly "$" is nowhere used in formatting. I don't want to make Python similar to something, but to keep it different in this aspect. 
Like the triple quotes, the percent formatting exists rather seldom in other languages, and I love to use templates for makefiles, scripts and whatsoever, where I don't have to care too much about escaping the escapes. With an upcoming "$" feature, I fear that "%" might get abandoned in some future, and I lose this benefit. I agree with any sensible extension/refinement of the "%" sign. I disagree on using "$" for anything frequent in Python. I don't want to see variable names as placeholders inside of strings. Placeholders should be dictionary string keys, but this dictionary must be obtained explicitly. I do like the allvars() proposal. crap-py -ly - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From David Abrahams" <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com> <200206200406.g5K46tB06740@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <0e4c01c21855$d16773e0$6601a8c0@boostconsulting.com> From: "Guido van Rossum" > > I love this suggestion. It's the sort of thing you can't do in C++ ;-) > > I suspect the arguments against will run to efficiency and complexity, > > since you need to compile the backquoted expressions (in some context). > > Actually, I had planned a secret feature that skips matching nested > {...} inside ${...}, so that you could write a magic dict whose keys > were eval()'ed in the caller's context. The %(...) parser does this > (skipping nested (...)) because someone wanted to do that. Ooh, magic and secrets! Maybe a little too magical for me to understand easily. Is the stuff between ${...} allowed to be any valid expression? 
harry-potter's-got-nothing-on-you-ly y'rs, dave From fredrik@pythonware.com Thu Jun 20 13:34:57 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 14:34:57 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <006201c2180b$883c1120$72976c42@damien> <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com> <200206200406.g5K46tB06740@pcp02138704pcs.reston01.va.comcast.net> <0e4c01c21855$d16773e0$6601a8c0@boostconsulting.com> Message-ID: <010501c21856$e4950260$0900a8c0@spiff> David wrote: > Ooh, magic and secrets! Maybe a little too magical for me to understand > easily. Is the stuff between ${...} allowed to be any valid expression? not according to the PEP, but nothing stops you from using a magic dictionary: class magic_dict: def __getitem__(self, value): return str(eval(value)) d = magic_dict() print "%(__import__('os').system('echo hello'))s" % d print replacevars("${__import__('os').system('echo hello')}", d) # for extra fun, replace 'echo hello' with 'rm -rf ~' From David Abrahams" I suggested: "What about making a public interface to apply_slice() and assign_slice() which I can call in the future? Perhaps PyObject_Get/Set/DelSlice()?" If I submitted such a patch would it be likely to be accepted? I don't want to waste my time on this if it's a bad idea. 
-Dave +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From fredrik@pythonware.com Thu Jun 20 13:37:18 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 14:37:18 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206190322.g5J3M1I07670@pythonware.com><003701c2175f$b219c340$ced241d5@hagrid><20020619075121.GB25541@hishome.net><006501c2176e$b9dbb3e0$0900a8c0@spiff> <15632.30372.601835.200686@anthem.wooz.org> Message-ID: <012701c21857$37bcb2d0$0900a8c0@spiff> barry wrote: > I've added a note that you should never use no-arg .sub() on strings > that come from untrusted sources. if adding a note to the specification really helped, my server's logs wouldn't be full of findmail.pl requests, and our mail filters wouldn't catch quite as many outlook worms ;-) From aleax@aleax.it Thu Jun 20 13:38:36 2002 From: aleax@aleax.it (Alex Martelli) Date: Thu, 20 Jun 2002 14:38:36 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <3D11C1E6.4060508@tismer.com> References: <200206190322.g5J3M1I07670@pythonware.com> <038e01c21848$9bef5320$ced241d5@hagrid> <3D11C1E6.4060508@tismer.com> Message-ID: On Thursday 20 June 2002 01:52 pm, Christian Tismer wrote: ... > I'd like to express my opinion at this place (which is as good > as any other place in such a much-too-fast growing thread): Ditto (particularly since Christian's opinions strike me as quite sensible here). > The following statements are ordered by increasing hate. > 1 - I do hate the idea of introducing a "$" sign at all. In my case, I detest the idea of using '$' for something that is similar but subtly different from the job of '%'. It reeks of "more than one way to do it". 
I can just picture myself once more having to teach/explain "this, dear bright Python beginner, is the % way of formatting, with which you can do tasks X, Y, Z. However, for tasks Y, most of Z, and a good deal of T, there is also the $ way. It's close enough to the % way that you're sure to confuse their syntactic details when you try to learn both, but don't worry, those differences are arbitrary and bereft of any mnemonic value, so your eventual confusion is totally inevitable and you may as well give up. Cheer up, though -- if you're well-learned in C (not Java, C#, or C++, but the Real McCoy) *AND* sh (or bash or perl, and able to keep in mind exactly what subset of their interpolation syntax is implemented here), then you can toss a coin each and every time you want to format/interpolate, given the wide but not total overlap of tasks best accomplished each way. There, aren't you happy you've chosen to learn a language so powerful, simple, regular, and uniform, based on the self evident principle that there should be one way, and ideally just one way, to perform each task?" > 2 - giving "$" special meaning in strings via a module > 3 - doing it as a builtin function I agree that having it as a builtin (or string method) would be even worse than having it as a module. A module I can more easily try to "sweep under the carpet" as a side-show aberration. Built-in functions, operators, and methods of built-in object types, are far harder to explain away. > 4 - allowing it to address local/global variables Yeah, I can see this smells, too, but IMHO not quite as bad as the $-formatting - vs - %-formatting task overlap. > Version 4 as worst comes visually quite close to > languages like Perl. In another post, Guido answered > such objection with "grow up". While my emotional > reaction would be to reply with "wake up!", I have some > rationale reasons why I don't like this: > > I have to read and sometimes write lots of Perl code. 
> The massive use of "$" gives me true headache. I don't > want Python to remind me of headaches. I don't get this specific point. As punctuation goes, $ or % are much of a muchness from my POV (I admit to not having a very high visual orientation, so I may be missing some subtle point of graphical rendition?). A massive use of one OR the other would be just as bad. Am I misreading you or just missing something important? > One argument was that "$" and the unembraced usage in "$name" > is so common and therefore easy to sell to Python newbies. Newbies who come from Windows and have never knowingly used a Unix-ish box (probably more numerous today, despite Linux's renaissance -- newbies do tend to grow on Microsoft operating systems, that's what comes bundled with typical PCs today) might of course be familiar with %name and not $name (%name is what Microsoft's pitifully weak .BAT "language" uses). It seems to me that this "familiarity" argument doesn't cut much ice either way. Were we designing from scratch, and having to choose one punctuation character for this purpose, I'd be pretty neutral on ground of looks and familiarity. A slight bias against $ because on some terminals or printers it can come out as some OTHER currency symbol, depending on various settings, but that's pretty marginal. But I'd much rather not have both $ and % used in slightly different contexts for somewhat-similar, overlapping tasks... 
Good point -- sometimes being different than most others has pluses:-). Of course, if you were templating to MS .BAT files it would be the other way 'round, but one doesn't do that much:-). > With an upcoming "$" feature, I fear that "%" might get > abandoned in some future, and I loose this benefit. This sounds to me like a FUD/"slippery slope" argument, even though I'm in broad agreement with you. I've neither heard nor suspected anything about plans to introduce the huge code breakage that abandoning % would entail. Rather, my fear is exactly that we'll get BOTH approaches to formatting, in an acute if localized outbreak of morethanonewayitis. > I agree with any sensible extension/refinement of the "%" sign. Sure. The current %-formatting rules aren't perfect, far from it, and while we must of course keep compatibility when the % _operator_ is used, it WOULD be nice to have a function or method that does something simpler and sensible with the template string when called instead of the % operator. > I disagree on using "$" for anything frequent in Python. I'd have no inherent objection to using $ (or @ or ? -- I think those are the three currently unused ASCII printing characters) for other tasks that didn't overlap with %'s. > I don't want to see variable names as placeholder inside > of strings. Placeholders should be dictionary string keys, > but this dictionary must be obtained explicitly. Yes, I see your point. It's definitely a valid one. > I do like the allvars() proposal. Me too BUT. How would allvars deal with free variables? eval is currently unable to deal with them very well: >>> def f(): ... a=1; b=2; c=3 ... def g(x): ... z = b ... try: return eval(x) ... except: return '%s unknown'%x ... return g ... >>> K=f() >>> for Z in 'abc': print K(Z) ... a unknown 2 c unknown >>> 'b' is OK because the compiler has seen it used elsewhere in nested function g, but 'a' and 'c' aren't. 
If 'allvars' can't deal with this problem, then it should not be named 'allvars' but 'somevarsbutmaybenotalldepending'. Alex From martin@v.loewis.de Thu Jun 20 08:42:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 20 Jun 2002 09:42:27 +0200 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020619212559.GC18944@zot.electricrain.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> Message-ID: "Gregory P. Smith" writes: > Modern berkeleydb uses much different on disk database formats, glancing > at the docs on sleepycat.com i don't even think it can read bsddb (1.85) > files. Existing code using bsddb (1.85) should not automatically start > using a different database library even if we provide a compatibility > interface. The Python bsddb module never guaranteed that you can use it to read bsddb 1.85 data files. In fact, on many installations, the bsddb module links with bsddb 2.x or bsddb 3.x, using db_185.h. So this is no reason not to call the module bsddb. Regards, Martin From gmcm@hypernet.com Thu Jun 20 14:07:34 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 20 Jun 2002 09:07:34 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net> References: Your message of "19 Jun 2002 22:20:49 EDT." Message-ID: <3D119B56.32515.AD6E881A@localhost> On 19 Jun 2002 at 22:30, Guido van Rossum wrote: > The $ means "substitution" in so many languages > besides Perl that I wonder where you've been. It doesn't mean anything in any language I *like*. 
-- Gordon http://www.mcmillan-inc.com/ From fredrik@pythonware.com Thu Jun 20 14:34:02 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 15:34:02 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: Your message of "19 Jun 2002 22:20:49 EDT." <3D119B56.32515.AD6E881A@localhost> Message-ID: <02e401c2185f$24db5970$0900a8c0@spiff> gordon wrote: > > The $ means "substitution" in so many languages > > besides Perl that I wonder where you've been.=20 >=20 > It doesn't mean anything in any language I *like*. not even in american? From gward@python.net Thu Jun 20 14:49:05 2002 From: gward@python.net (Greg Ward) Date: Thu, 20 Jun 2002 09:49:05 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <02fd01c21843$a959df30$ced241d5@hagrid> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid> Message-ID: <20020620134905.GC13858@gerg.ca> On 20 June 2002, Fredrik Lundh said: > > No, that's already checked in as textwrap.py. > > are you saying that you cannot rename stuff under CVS? > not even delete it, and check it in again under a new name? Not at all -- I was just correcting Andrew's misunderstanding about where the text-wrapping code lives (for now). > I'm +1 on adding a text utility module for occasionally useful > stuff like wrapping, getting rid of gremlins, doing various kinds > of substitutions, centering/capitalizing/padding and otherwise > formatting strings, searching/parsing, and other fun stuff that > your average text editor can do (and +0 on using the existing > "string" module for that purpose, but I can live with another > name). Sounds like a pretty good idea to me. 
Note that textwrap.py is almost 300 lines, which in my worldview is big enough to warrant its own module. I don't think that reduces the desirability of having text.py (or text/__init__.py and friends) for all the things you mentioned. Greg -- Greg Ward - Python bigot gward@python.net http://starship.python.net/~gward/ From oren-py-d@hishome.net Thu Jun 20 14:49:16 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Thu, 20 Jun 2002 09:49:16 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <200206200053.g5K0rch27979@zippy.abinitio.com> References: <200206200053.g5K0rch27979@zippy.abinitio.com> Message-ID: <20020620134916.GA53951@hishome.net> >From what I've read on this thread so far my vote would be: +1 - no new forms of string formatting +0 - Donald Beaudry's proposal that %{name} would be equivalent to %(name)s -1 - anything else Oren From pinard@iro.umontreal.ca Thu Jun 20 14:41:00 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 20 Jun 2002 09:41:00 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <15633.17324.467335.416736@anthem.wooz.org> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <15633.17324.467335.416736@anthem.wooz.org> Message-ID: [Barry A. Warsaw] > If I had to do it over again, I would have used $name in i18n source > strings from the start. It would have saved lots of headaches and > broken translations. People just seem to get $names whereas they get > %(name)s wrong too often. There were similar problems in C, you know, that yielded the addition of diagnostics in GNU `msgfmt', in case of discrepancies between formatting specifications in the original and the translated string. 
The suffering would not have really existed if `msgfmt' had been made Python-aware, or if Python programs used their own `msgfmt'. As you wrote in your message, you wrote your own checker. A solution should be sought that would easily apply to all Python internationalised programs. -- François Pinard http://www.iro.umontreal.ca/~pinard From pinard@iro.umontreal.ca Thu Jun 20 15:28:16 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 20 Jun 2002 10:28:16 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <15633.19790.152438.926329@anthem.wooz.org> References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <15633.19790.152438.926329@anthem.wooz.org> Message-ID: [Barry A. Warsaw] > FP> However, there are other contexts where the concept of a > FP> compound dictionary of all globals and locals would be useful. > FP> Maybe we could have some allvars() similar to globals() and > FP> locals(), and use `... % allvars()' instead of `.sub()'? So > FP> this would serve both string interpolation and other avenues. > Or maybe just make vars() do something more useful when no arguments > are given? I surely had the thought, but changing the meaning of an existing library function is most probably out of the question. > In any event, allvars() or a-different-vars() is out of scope for this > PEP. We'd use it if it was there, but I think it needs its own PEP, > which someone else will have to champion. I do not see myself championing a PEP yet, I'm not sure the Python community is soft enough for my thin skin (not so thin maybe, but I really had my share of over-long discussions in other projects, I want some rest in these days). On the other hand, the allvars() suggestion is right on the point in my opinion. It is not a stand-alone suggestion, its goal was to stress that `.sub()' is too far from the `%' operator, it looks like a random addition. 
The available formatting paradigms of Python, I mean, those which are standard, should look a bit more unified, just to preserve overall elegance. If we want Python to stay elegant (which is the source of comfort and pleasure, these being the main goals of using Python after all), we have to seek elegance in each Python move. At the risk of looking frenetic and heretical, I guess that `$' would become more acceptable in view of the preceding paragraph, if we were introducing a `$' operator for driving `$' substitutions, the same as the `%' operator currently drives `%' substitutions. I'm not asserting that this is the direction to take, but I'm presenting this as an example of a direction that would be a bit less shocking, and which, through some unification, could somewhat salvage the messy aspect of having two formatting characters. Saying that PEP 292 rejects an idea because this idea would require another PEP to be debated and accepted beforehand, and then rushing the acceptance of PEP 292 as it stands, is probably missing the point of the discussion. Each time such an argumentation is made, we lose vision and favour the blossoming of various Python features in random directions, which is not good in the long term for Python self-consistency and elegance. -- François Pinard http://www.iro.umontreal.ca/~pinard From pf@artcom-gmbh.de Thu Jun 20 15:37:26 2002 From: pf@artcom-gmbh.de (Peter Funk) Date: Thu, 20 Jun 2002 16:37:26 +0200 (CEST) Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions) In-Reply-To: <20020620134916.GA53951@hishome.net> from Oren Tirosh at "Jun 20, 2002 09:49:16 am" Message-ID: Oren Tirosh: > From what I've read on this thread so far my vote would be: > > +1 - no new forms of string formatting > +0 - Donald Beaudry's proposal that %{name} would be equivalent to %(name)s > -1 - anything else /. 
just had a pointer to a feature defining the term "Version Fatigue": http://slashdot.org/articles/02/06/20/1223247.shtml?tid=126 """Version fatigue comes from the accumulated realization that most knowledge gained with regard to any particular version of a product will be useless with regard to future generations of that same product.""" Thinking about that and recent Python development: <> operator called "obsolescent", iterators, generators, list comprehensions, ugly '//' operator introduced for integer division, deprecating import string, types, possibly adding "$name".sub(), maybe later deprecating the % operator. What next? Regards, Peter -- Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260 office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany) From skip@pobox.com Wed Jun 19 23:27:06 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 19 Jun 2002 17:27:06 -0500 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020619212559.GC18944@zot.electricrain.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> Message-ID: <15633.1338.367283.257786@localhost.localdomain> >> Why can't it just be called bsddb? Greg> Modern berkeleydb uses much different on-disk database formats; Greg> glancing at the docs on sleepycat.com I don't even think it can Greg> read bsddb (1.85) files. That's never stopped us before. ;-) The current bsddb module works with versions 1, 2, 3, and 4 of Berkeley DB using the 1.85-compatible API that Sleepycat provides. It's always been the user's responsibility to run the appropriate db_dump or db_dump185 commands before using the next version of Berkeley DB.
Using the library from Python never removed that requirement. Skip From fredrik@pythonware.com Thu Jun 20 16:15:44 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 17:15:44 +0200 Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions) References: Message-ID: <049301c2186d$5d627270$ced241d5@hagrid> peter wrote: > <> operator called "obsolescent", iterators, generators hey, don't lump generators in with the rest of the stuff. generators opens a new universe, the rest is more like moving the furniture around... From gmcm@hypernet.com Thu Jun 20 16:43:03 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 20 Jun 2002 11:43:03 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <02e401c2185f$24db5970$0900a8c0@spiff> Message-ID: <3D11BFC7.4677.ADFCE47F@localhost> On 20 Jun 2002 at 15:34, Fredrik Lundh wrote: > gordon wrote: > > > > The $ means "substitution" in so many languages > > > besides Perl that I wonder where you've been. > > > > It doesn't mean anything in any language I *like*. > > not even in american? Where $ means "dough", which is one letter different from "cough" and "tough"[1]? the-world's-best-language-for-discussing- the-price-of-oranges-ly y'rs -- Gordon http://www.mcmillan-inc.com/ [1] If you're old-fashioned enough, you can spell "plow" as "plough", too. From niemeyer@conectiva.com Thu Jun 20 16:48:07 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 20 Jun 2002 12:48:07 -0300 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <006201c2180b$883c1120$72976c42@damien> References: <006201c2180b$883c1120$72976c42@damien> Message-ID: <20020620124807.B1504@ibook.distro.conectiva> > "`name` is from `country`".sub() -> "guido is from the netherlands" [...] But I'm not, thus I'm against any special character that needs two hits to be typed in my keyboard. 
;-))) -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From jacobs@penguin.theopalgroup.com Thu Jun 20 16:55:28 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Thu, 20 Jun 2002 11:55:28 -0400 (EDT) Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <3D11BFC7.4677.ADFCE47F@localhost> Message-ID: On Thu, 20 Jun 2002, Gordon McMillan wrote: > On 20 Jun 2002 at 15:34, Fredrik Lundh wrote: > > gordon wrote: > > > > > > The $ means "substitution" in so many languages > > > > besides Perl that I wonder where you've been. > > > > > > It doesn't mean anything in any language I *like*. > > > > not even in american? > > Where $ means "dough", which is one letter > different from "cough" and "tough"[1]? Shouldn't it be: d'oh! -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From aahz@pythoncraft.com Thu Jun 20 17:21:50 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 20 Jun 2002 12:21:50 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <15633.19790.152438.926329@anthem.wooz.org> Message-ID: <20020620162150.GB18208@panix.com> On Thu, Jun 20, 2002, François Pinard wrote: > [Barry A. Warsaw] >> >> In any event, allvars() or a-different-vars() is out of scope for this >> PEP. We'd use it if it was there, but I think it needs its own PEP, >> which someone else will have to champion. > > On the other hand, the allvars() suggestion is right on the point > in my opinion. It is not a stand-alone suggestion; its goal was to > stress that `.sub()' is too far from the `%' operator, it looks > like a random addition. The available formatting paradigms of Python, > I mean, those which are standard, should look a bit more unified, > just to preserve overall elegance.
If we want Python to stay elegant > (which is the source of comfort and pleasure, these being the main > goals of using Python after all), we have to seek elegance in each > Python move. +1 -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aahz@pythoncraft.com Thu Jun 20 17:26:14 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 20 Jun 2002 12:26:14 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <01bf01c217cf$407bcec0$ced241d5@hagrid> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid> Message-ID: <20020620162613.GC18208@panix.com> On Wed, Jun 19, 2002, Fredrik Lundh wrote: > Paul Prescod wrote: >> >> $ is taught in hour 2, import is taught on day 2. > > says who? > > I usually mention "import" in the first hour (before methods), > and nobody has ever had any problem with that... Same here. Note that there's a big difference between introducing import (which pretty much is essential somewhere in the first or third hour if you want to teach anything interesting) and giving a full explanation of how import works (which would indeed be day 2 or 3). 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aahz@pythoncraft.com Thu Jun 20 17:10:57 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 20 Jun 2002 12:10:57 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <02fd01c21843$a959df30$ced241d5@hagrid> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid> Message-ID: <20020620161057.GA18208@panix.com> On Thu, Jun 20, 2002, Fredrik Lundh wrote: > > I'm +1 on adding a text utility module for occasionally useful > stuff like wrapping, getting rid of gremlins, doing various kinds > of substitutions, centering/capitalizing/padding and otherwise > formatting strings, searching/parsing, and other fun stuff that > your average text editor can do (and +0 on using the existing > "string" module for that purpose, but I can live with another > name). +1 (And I said so in the original thread on text wrapping.) 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From aahz@pythoncraft.com Thu Jun 20 17:43:23 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 20 Jun 2002 12:43:23 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions Message-ID: <20020620164323.GA27422@panix.com> I'm about a pure 0 on this proposal in the aggregate. I'm -1 on .sub() as the name; I'd rather it be called .interp() (this mainly due to confusion with the existing re.sub() and str.replace()). I'm +0 on putting this functionality in the text module instead of adding a string method. I'm +0 on trying to find a solution that uses % instead of $ (There's Only One Way). I'm -1 on ${name}; there's no reason not to at least use $(name) for consistency. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From guido@python.org Thu Jun 20 18:10:36 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 13:10:36 -0400 Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions) In-Reply-To: Your message of "Thu, 20 Jun 2002 16:37:26 +0200." References: Message-ID: <200206201710.g5KHAaO03970@odiug.zope.com> > """Version fatigue comes from the accumulated realization that most > knowledge gained with regard to any particular version of a product > will be useless with regard to future generations of that same product.""" > > Thinking about that and recent Python development: This is highly exaggerated. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 20 18:30:47 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 13:30:47 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 03:18:56 EDT."
<20020620071856.GA10497@hishome.net> References: <006201c2180b$883c1120$72976c42@damien> <20020620071856.GA10497@hishome.net> Message-ID: <200206201730.g5KHUlP04117@odiug.zope.com> > See http://tothink.com/python/embedpp How come you never submitted this PEP to the PEPmeister? I can't comment on what I don't know. It certainly comes closest to the original ABC feature. (The main problem with `...` is that many people can't distinguish between ` and ', as user testing has shown.) --Guido van Rossum (home page: http://www.python.org/~guido/) From damien.morton@acm.org Thu Jun 20 18:45:03 2002 From: damien.morton@acm.org (Damien Morton) Date: Thu, 20 Jun 2002 13:45:03 -0400 Subject: [Python-Dev] FW: PEP 292, Simpler String Substitutions Message-ID: <008301c21882$357e25f0$72976c42@damien> You're right. I only threw that out there as a talking point rather than a serious suggestion. I take it you agree with my assertion that putting the format string before the variable would be less error prone? (if it didn't destroy the current usage). Given that the $ notation is all-new, perhaps prefixing with the format string should be considered, as in: "$4.2f{height}" In fact, if we are going to revisit format strings, why not ditch the format character and keep the numeric specifier only. Determine the format character by the type of the variable. For x = "hello", "$4.2{x}" == "$4s{x}" -> "hell" For x = 3.7865, "$4.2{x}" == "$4.2f{x}" -> "3.78" > -----Original Message----- > From: pinard@titan.progiciels-bpi.ca > [mailto:pinard@titan.progiciels-bpi.ca] On Behalf Of François Pinard > Sent: Thursday, 20 June 2002 10:39 > To: Damien Morton > Subject: Re: PEP 292, Simpler String Substitutions > > [Damien Morton] > > > Why not alter the notation to allow the format specifier to come > > before the name part. "%4.2f(height)" I think would be a whole lot > less error prone, and would allow for the format specifier > to default > to "s" where omitted. > > Hello, Damien. > > "%4.2f(height)" already has the meaning of "%4.2f", which is > complete in itself, and then "(height)", which is a constant > string -- you understand what I mean. Altering the notation > as you suggest would undoubtedly break many, many > applications, so we should guess it is not acceptable. > > -- > François Pinard http://www.iro.umontreal.ca/~pinard From guido@python.org Thu Jun 20 18:45:18 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 13:45:18 -0400 Subject: [Python-Dev] Weird problem with exceptions raised in extension module In-Reply-To: Your message of "Thu, 20 Jun 2002 19:25:34 +1200." <200206200725.g5K7PYt26747@oma.cosc.canterbury.ac.nz> References: <200206200725.g5K7PYt26747@oma.cosc.canterbury.ac.nz> Message-ID: <200206201745.g5KHjI604158@odiug.zope.com> > I'm getting strange behaviour when raising an > exception in an extension module generated by > Pyrex. The extension module does the equivalent of > > def foo(): > raise TypeError("Test-Exception") > > If I invoke it with the following Python code: > > try: > mymodule.foo() > except IOError: > print "blarg" > > the following happens: > > Traceback (most recent call last): > File "", line 3, in ? > SystemError: 'finally' pops bad exception > > This only happens when the try-except catches > something *other* than the exception being raised. > If the exception being raised is caught, or > no exception catching is done, the exception > is handled properly. > > Also, it only happens when an *instance* is used > as the exception object. If I do this instead: > > raise TypeError, "Test-Exception" > > the problem doesn't occur. > > The relevant piece of C code generated by > Pyrex is as follows. Can anyone see if I'm > doing anything wrong?
(I'm aware that there's > a missing Py_DECREF, but it shouldn't be > causing this sort of thing.) > > The Python version I'm using is 2.2. > > __pyx_1 = __Pyx_GetName(__pyx_b, "TypeError"); > if (!__pyx_1) goto __pyx_L1; > __pyx_2 = PyString_FromString(__pyx_k1); > if (!__pyx_2) goto __pyx_L1; > __pyx_3 = PyTuple_New(1); > if (!__pyx_3) goto __pyx_L1; > PyTuple_SET_ITEM(__pyx_3, 0, __pyx_2); > __pyx_2 = 0; > __pyx_4 = PyObject_CallObject(__pyx_1, __pyx_3); > if (!__pyx_4) goto __pyx_L1; > Py_DECREF(__pyx_3); > __pyx_3 = 0; > PyErr_SetNone(__pyx_4); > Py_DECREF(__pyx_4); > __pyx_4 = 0; > goto __pyx_L1; > > /*...*/ It seems that this is just for raise TypeError, "Test-Exception" Shouldn't you show the code for the try/except and for the function call/return too? But I think that you shouldn't be calling PyErr_SetNone() here -- I think you should call PyErr_SetObject(__pyx_1, __pyx_2). For details see do_raise() in ceval.c. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 20 18:46:58 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 13:46:58 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 13:05:20 +0200." <3D11B6F0.5000803@tismer.com> References: <3D11B6F0.5000803@tismer.com> Message-ID: <200206201746.g5KHkwH04175@odiug.zope.com> Christian, you seem to be contradicting yourself. First: [someone] > > +1 on the allvars() suggestion also. [Christian] > me too. and later: [Christian] > The following statements are ordered by increasing hate. > 1 - I do hate the idea of introducing a "$" sign at all. > 2 - giving "$" special meaning in strings via a module > 3 - doing it as a builtin function > 4 - allowing it to address local/global variables Doesn't 4 contradict your +1 on allvars()? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Thu Jun 20 18:48:12 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 20 Jun 2002 19:48:12 +0200 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <006201c2180b$883c1120$72976c42@damien> <20020620071856.GA10497@hishome.net> <200206201730.g5KHUlP04117@odiug.zope.com> Message-ID: <00d901c21882$a9258a20$ced241d5@hagrid> guido wrote: > > See http://tothink.com/python/embedpp > > How come you never submitted this PEP to the PEPmeister? iirc, that's because Oren did the 'why would e"X=`x`, Y=`calc_y(x)`." be a vast improvement over e("X=", x, ", Y=", calc_y(x), ".")?' test, and his answer was not "I18N" (for obvious reasons ;-) (but I think we called the function "I" at that time) From oren-py-d@hishome.net Thu Jun 20 19:23:19 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Thu, 20 Jun 2002 21:23:19 +0300 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <200206201730.g5KHUlP04117@odiug.zope.com>; from guido@python.org on Thu, Jun 20, 2002 at 01:30:47PM -0400 References: <006201c2180b$883c1120$72976c42@damien> <20020620071856.GA10497@hishome.net> <200206201730.g5KHUlP04117@odiug.zope.com> Message-ID: <20020620212319.A17467@hishome.net> On Thu, Jun 20, 2002 at 01:30:47PM -0400, Guido van Rossum wrote: > > See http://tothink.com/python/embedpp > > How come you never submitted this PEP to the PEPmeister? I can't > comment on what I don't know. It certainly comes closest to the > original ABC feature. (The main problem with `...` is that many people > can't distinguish between ` and ', as user testing has shown.) I guess I got a bit discouraged by the response on python-list back then.
Now I know better :-) Oren From tismer@tismer.com Thu Jun 20 19:28:43 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 20 Jun 2002 20:28:43 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> Message-ID: <3D121EDB.6070501@tismer.com> Guido van Rossum wrote: > Christian, > > you seem to be contradicting yourself. First: > > [someone] > >>>+1 on the allvars() suggestion also. >> > > [Christian] > >>me too. > > > and later: > > [Christian] > >>The following statements are ordered by increasing hate. >>1 - I do hate the idea of introducing a "$" sign at all. >>2 - giving "$" special meaning in strings via a module >>3 - doing it as a builtin function >>4 - allowing it to address local/global variables > > > Doesn't 4 contradict your +1 on allvars()? By no means. allvars() is something like locals() or globals(), just an explicit way to produce a dictionary of variables. What I want to preserve is the distinction between arbitrary "%(name)s" or maybe "${name}" names and my local variables. Using locals() or allvars(), I can decide to *feed* the formatting expression with variable names. But the implementation of .sub() should not know anything about variables, the same way as % doesn't know about variables. Formatting is "by value", IMHO. Furthermore I'd like to thank Alex for his opinions, additions and adjustments to my post. I have to say that I always *am* emotional with such stuff, although I'm trying hard not to. But he hits the nail's head more than I. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From paul@prescod.net Thu Jun 20 19:29:33 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 20 Jun 2002 11:29:33 -0700 Subject: [Python-Dev] *Simpler* string substitutions Message-ID: <3D121F0D.E3B60865@prescod.net> We will never come to a solution unless we agree on what, if any, the problem is. Here is my sense of the "interpolation" problem (based entirely on the code I see): * 95% of all scripts (or modules) need to do string interpolation * 5% of all scripts want to be explicit about the types * 10% of all scripts want to submit a dictionary rather than the current namespace * 5% of all scripts want to do printf-style formatting tricks Which means that if we do the math in a simplistic way, 20% of modules/scripts need these complicated features but the other 75% pay for these features that they are not using. They pay through having to use "% locals()" (which uses two advanced features of Python, operator overloading and the local namespace). They pay through counting the lengths of their %-tuples (in my case, usually miscounting). They pay through adding (or forgetting to add) the format specifier after "%(...)". They pay through having harder-to-read strings where they have to go back and forth to figure out what various positional variables mean. They pay through having to remember the special case for singletons -- except for singleton tuples! Of course the syntax is flexible: you get to choose HOW you pay (shifting from positional to name) and thus reduce some costs while you incur others, but you can't choose simply NOT to pay, as you can in every other scripting language I know. And remember that Python is a language that *encourages* readability. But this kind of code is common: * exception.append('\n %s%s =\n%s' % (indent, name, value)) whereas it could be just: * exception.append('\n ${indent}${name} =\n${value}') Which is shorter, uses fewer concepts, and keeps variables close to where they are used. We could argue that the programmer here made the wrong choice (versus using % locals()) but the point is that Python itself favoured the wrong choice by making the wrong choice shorter and simpler. Usually Python favours the right choice. The tax is small but it is collected on almost every script, almost every beginner and almost every programmer almost every day. So it adds up. If we put this new feature in a module (whether "text", "re", or "string"), then we are just devising another way to make people pay. At that point it becomes a negative feature, because it will clutter up the standard library without getting used. As long as you are agreeing to pay some tax, "%" is a smaller tax (at least at first) because it does not require you to interrupt your workflow to insert an import statement. In my mind, this feature is only worth adding if we agree that it is now the standard string interpolation feature and "%" becomes a quaint historical feature -- a bad experiment in operator overloading gone wrong. "%" could be renamed "text.printf" and would actually become more familiar to its core constituency and less of a syntactic aberration. "interp" could be a built-in and thus similarly simple syntactically. But I am against adding "$" if half of Python programmers are going to use that and half are going to use %. $ needs to be a replacement. There should be one obvious way to solve simple problems like this, not two. I am also against adding it as a useless function buried in a module that nobody will bother to import. Paul Prescod From guido@python.org Thu Jun 20 19:32:39 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 14:32:39 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 12:10:57 EDT."
<20020620161057.GA18208@panix.com> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid> <20020620161057.GA18208@panix.com> Message-ID: <200206201832.g5KIWd904912@odiug.zope.com> > > I'm +1 on adding a text utility module for occasionally useful > > stuff like wrapping, getting rid of gremlins, doing various kinds > > of substitutions, centering/capitalizing/padding and otherwise > > formatting strings, searching/parsing, and other fun stuff that > > your average text editor can do (and +0 on using the existing > > "string" module for that purpose, but I can live with another > > name). +1 on (e.g.) a text module. -1 on reusing the string module. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@tismer.com Thu Jun 20 19:47:48 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 20 Jun 2002 20:47:48 +0200 Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions) References: <200206201710.g5KHAaO03970@odiug.zope.com> Message-ID: <3D122354.8040308@tismer.com> Guido van Rossum wrote: >>"""Version fatigue comes from the accumulated realization that most >> knowledge gained with regard to any particular version of a product >> will be useless with regard to future generations of that same product.""" >> >>Thinking about that and recent Python development: > > > This is highly exaggerated. Guido, I'm not sure that you are always aware of what people actually like about Python and what they dislike. I have heard such complaints from so many people that I think there are reasonably many who don't share your judgement. Personally, I belong to the more conservatives, too. (Stunned?
No, really, I like the minimum, most orthogonal set of features, since I'm running low on brain cells). Don't take me as negative. This has to be said, once: I like the new generators very much. They have a lot of elegance and power. I am absolutely amazed by the solution to the type/class dichotomy, and I'm completely excited about the metaclass stuff. Great! Much more valuable memorizing than list comprehensions, booleans and hopefully no new formatting syntax. All in all, Python is evolving well. Maybe we could slow down a little, please? -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From pinard@iro.umontreal.ca Thu Jun 20 20:15:41 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 20 Jun 2002 15:15:41 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206201832.g5KIWd904912@odiug.zope.com> References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid> <20020620161057.GA18208@panix.com> <200206201832.g5KIWd904912@odiug.zope.com> Message-ID: [Guido van Rossum] > +1 on (e.g.) a text module. > -1 on reusing the string module. _A_ text module is OK. But please, avoid naming such a module "text". Coming to Python, I had to lose the habit of naming the usual string work-variable "string", because it would conflict with the module name.
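[The clash described here is ordinary name shadowing: once a work-variable reuses a module's name, the module becomes unreachable under that name. A small illustration with the existing string module:]

```python
import string

before = callable(string.capwords)   # the name is bound to the module

string = "some working text"         # a work-variable reuses the name
after = isinstance(string, str)      # the module is now shadowed
shouted = string.upper()             # str methods work; module attributes are gone
```

A module named "text" would impose the same trade-off on every program that uses `text` as an ordinary variable name.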
(Some use `s', but this is closer to algebra than to programming; for programming, clear names are usually better.) So, I have tons of programs and scripts using "text" instead for the usual string work-variable. It would be a pain having to revise this all, making room for "import text". -- François Pinard http://www.iro.umontreal.ca/~pinard From guido@python.org Thu Jun 20 20:54:29 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 15:54:29 -0400 Subject: [Python-Dev] Re: Version Fatigue In-Reply-To: Your message of "Thu, 20 Jun 2002 20:47:48 +0200." <3D122354.8040308@tismer.com> References: <200206201710.g5KHAaO03970@odiug.zope.com> <3D122354.8040308@tismer.com> Message-ID: <200206201954.g5KJsTt05302@odiug.zope.com> > Guido, I'm not sure that you are always aware of what > people actually like about Python and what they dislike. > I have heard such complaints from so many people > that I think there are reasonably many who don't share > your judgement. Tough. People used to like it because they trusted my judgement. Maybe I should stop listening to others. :-) Seriously, the community is large enough that we can't expect everybody to like the same things. There are reasonably many who still do share my judgement. > Personally, I belong to the more conservatives, too. > (Stunned? No, really, I like the minimum, most orthogonal > set of features, since I'm running low on brain cells). > > > Don't take me as negative. This has to be said, once: > > I like the new generators very much. They > have a lot of elegance and power. > I am absolutely amazed by the solution to > the type/class dichotomy, and I'm completely > excited about the metaclass stuff. Great! No surprise that you, always the mathematician, like the most brain-exploding features. :-) And note the contradiction, which you share with everybody else: you don't want new features, except the three that you absolutely need to have. And you see nothing wrong with this contradiction.
> Much more valuable memorizing than list comprehensions, > booleans and hopefully no new formatting syntax. For many people it's just the other way around though. > All in all, Python is evolving well. Maybe we could > slow down a little, please? I'm trying. I'm really trying. Please give me some credit. --Guido van Rossum (home page: http://www.python.org/~guido/) From niemeyer@conectiva.com Thu Jun 20 21:00:25 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 20 Jun 2002 17:00:25 -0300 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <3D121F0D.E3B60865@prescod.net> References: <3D121F0D.E3B60865@prescod.net> Message-ID: <20020620170025.A25014@ibook.distro.conectiva> > Here is my sense of the "interpolation" problem (based entirely on the > code I see): > > * 95% of all scripts (or modules) need to do string interpolation > > * 5% of all scripts want to be explicit about the types > > * 10% of all scripts want to submit a dictionary rather than the > current namespace > > * 5% of all scripts want to do printf-style formatting tricks > > Which means that if we do the math in a simplistic way, 20% > modules/scripts need these complicated features but the other 75% pay [...] I'm curious.. where did you get this from? Have you counted? I think 99% of the statistics are forged to enforce an opinion. :-) [...] > Of course the syntax is flexible: you get to choose HOW you pay > (shifting from positional to name) and thus reduce some costs while you > incur others, but you can't choose simply NOT to pay, as you can in > every other scripting language I know. > > And remember that Python is a language that *encourages* readability. > But this kind of code is common: > > * exception.append('\n %s%s =\n%s' % (indent, name, value)) > > whereas it could be just: > > * exception.append('\n ${indent}${name} =\n${value}') That's the usual Perl way of string interpolation. I've used Perl in some large projects before becoming a Python adept, and I must confess I don't miss this feature. Maybe it's my C background, but I don't like to mix code and strings. Think about these real examples, taken from *one* single module (BaseHTTPServer): "%s %s %s\r\n" % (self.protocol_version, str(code), message) "%s - - [%s] %s\n" % (self.address_string(), self.log_date_time_string(), format%args)) "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd], day, self.monthname[month], year, hh, mm, ss) "Serving HTTP on", sa[0], "port", sa[1], "..." "Bad HTTP/0.9 request type (%s)" % `command` "Unsupported method (%s)" % `self.command` "Bad request syntax (%s)" % `requestline` "Bad request version (%s)" % `version` > Which is shorter, uses fewer concepts, and keeps variables close to > where they are used. We could argue that the programmer here made the [...] Please, show me that with one of the examples above. > The tax is small but it is collected on almost every script, almost > every beginner and almost every programmer almost every day. So it adds > up. That seems like an excessive generalization of a personal opinion.
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From pinard@iro.umontreal.ca Thu Jun 20 21:17:38 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 20 Jun 2002 16:17:38 -0400 Subject: [Python-Dev] Re: Version Fatigue (was: Re: PEP 292, Simpler String Substitutions) In-Reply-To: <3D122354.8040308@tismer.com> References: <200206201710.g5KHAaO03970@odiug.zope.com> <3D122354.8040308@tismer.com> Message-ID: [Christian Tismer] > >> """Version fatigue comes from the accumulated realization that most > >> knowledge gained with regard to any particular version of a product will > >> be useless with regard to future generations of that same product.""" > >> > >>Thinking about that and recent Python development: > Guido van Rossum wrote: > > This is highly exaggerated. Exactly as stated, yes, I agree. There is another kind of fatigue which may apply to Python, by which a language becomes so featured over time that people may naturally come to limit themselves to a sufficient subset of the language and be perfectly happy, until they have to read the code written by guys speaking another subset of the language possibilities. Legibility becomes subjective and questionable. In the past, this has been true for some comprehensive implementations of LISP, and as I heard (but did not experience) for PL/I. There was a time, not so long ago, when there was only one way to do it in Python, and this one way was the good way, necessarily. This is not true anymore, and we ought to recognise that this impacts legibility. Fredrik wrote: > Hey, don't lump generators in with the rest of the stuff. Generators opens > a new universe, the rest is more like moving the furniture around... Indeed, there are very nice additions, that really bring something new, and generators are of this kind. 
Even moving the furniture around may be very good, like for when the underlying mechanics get revised, acquiring power and expressiveness on the road, while keeping the same surface aspect. At least, people recognise the furniture, and could appreciate the new order. Where it might hurt, however, is when the Python place gets crowded with furniture, that is, when Python gets new syntaxes and functions above those which already exist, while keeping the old stuff around more or less forever for compatibility reasons, with no firm intention or plan for deprecation, and no tools to help users at switching from a paradigm to its replacement. This ends up messy, as each programmer then uses some preferred subset. The `$' PEP is a typical example of this. Either the PEP should contain a serious and detailed study about how `%' is going to become deprecated, or the PEP should design `$' so nicely that it appears to be a born-twin of the current `%' format, long lost then recently rediscovered. The PEP should convince us that it would be heart-breaking to separate such loving brothers. Now, it looks like these two do not much belong to the same family, they just randomly met in Python, they are not especially fit with one another. Just the expression of a feeling, of course. :-) -- François Pinard http://www.iro.umontreal.ca/~pinard From paul@prescod.net Thu Jun 20 21:26:03 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 20 Jun 2002 13:26:03 -0700 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> Message-ID: <3D123A5B.EB7389AA@prescod.net> Gustavo Niemeyer wrote: > >... > > I'm curious.. where did you get this from? Have you counted? No. > I think 99% of the statistics are forged to enforce an opinion. :-) I said it was based only on my experience! > ... 
Think about these real examples, taken > from *one* single module (BaseHTTPServer): > > "%s %s %s\r\n" % (self.protocol_version, str(code), message) Let's presume a "sub" method with the features of Ping's string interpolation PEP. This would look like: "${self.protocol_version}, $code, $message\r\n".sub() Shorter and simpler. > "%s - - [%s] %s\n" % (self.address_string(), > self.log_date_time_string(), > format%args)) "${self.address_string()} - - [${self.log_date_time_string()}] ${format.sub(args)}".sub() But I would probably clarify that: addr = self.address_string() time = self.log_date_time_string() command = format.sub(args) "$addr - - [$time] $command\n".sub() > "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd], > day, > self.monthname[month], year, > hh, mm, ss) This one is part of the small percent that uses formatting codes. It wouldn't be rocket science to integrate formatting codes with the "$" notation $02d{day} but it would also be fine if this involved a call to textutils.printf() > "Serving HTTP on", sa[0], "port", sa[1], "..." This doesn't use "%" to start with, but it is still clearer (IMO) in the new notation: "Serving HTTP on ${sa[0]} port ${sa[1]} ..." > "Bad HTTP/0.9 request type (%s)" % `command` "Bad HTTP/0.9 request type ${`command`}" etc. Paul Prescod From jmiller@stsci.edu Thu Jun 20 21:29:18 2002 From: jmiller@stsci.edu (Todd Miller) Date: Thu, 20 Jun 2002 16:29:18 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ Message-ID: <3D123B1E.6050600@stsci.edu> There has been some recent interest in the Numeric/numarray community for using array objects as indices for builtin sequences. I know this has come up before, but to make myself clear, the basic idea is to make the following work: class C: def __int__(self): return 5 object = C() l = "Another feature..." print l[object] "h" Are there any plans (or interest) for developing Python in this direction? 
Todd From greg@electricrain.com Thu Jun 20 21:50:41 2002 From: greg@electricrain.com (Gregory P. Smith) Date: Thu, 20 Jun 2002 13:50:41 -0700 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15633.1338.367283.257786@localhost.localdomain> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> Message-ID: <20020620205041.GD18944@zot.electricrain.com> On Wed, Jun 19, 2002 at 05:27:06PM -0500, Skip Montanaro wrote: > > >> Why can't it just be called bsddb? > > Greg> Modern berkeleydb uses much different on disk database formats, > Greg> glancing at the docs on sleepycat.com i don't even think it can > Greg> read bsddb (1.85) files. > > That's never stopped us before. ;-) The current bsddb module works with > versions 1, 2, 3, and 4 of Berkeley DB using the 1.85-compatible API that > Sleepycat provides. It's always been the user's responsibility to run the > appropriate db_dump or db_dump185 commands before using the next version of > Berkeley DB. Using the library from Python never removed that requirement. Good point. I was ignorant of the original bsddb 1.85 module workings as i never used it. Pybsddb backwards compatibility was implemented (not by me) with the intention that it could be used as a replacement for the existing bsddb module. It passes the simplistic test_bsddb.py that is included with python today as well as pybsddb's own test_compat.py to test the compatibility layer. If we replace the existing bsddb with pybsddb (bsddb3), it should work. If there are hidden bugs that's what the alpha/beta periods are for. 
However linking against berkeleydb versions less than 3.2 will no longer be supported; should we keep the existing bsddb around as oldbsddb for users in that situation? -G From guido@python.org Thu Jun 20 21:53:12 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 16:53:12 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: Your message of "Thu, 20 Jun 2002 16:29:18 EDT." <3D123B1E.6050600@stsci.edu> References: <3D123B1E.6050600@stsci.edu> Message-ID: <200206202053.g5KKrCA05552@odiug.zope.com> > There has been some recent interest in the Numeric/numarray community > for using array objects as indices > for builtin sequences. I know this has come up before, but to make > myself clear, the basic idea is to make the > following work: > > class C: > def __int__(self): > return 5 > > object = C() > > l = "Another feature..." > > print l[object] > "h" > > Are there any plans (or interest) for developing Python in this direction? I'm concerned that this will also make floats acceptable as indices (since they have an __int__ method) and this would cause atrocities like print "hello"[3.5] to work. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@tismer.com Thu Jun 20 21:59:43 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 20 Jun 2002 22:59:43 +0200 Subject: [Python-Dev] Re: Version Fatigue (was: Re: PEP 292, Simpler String Substitutions) References: <200206201710.g5KHAaO03970@odiug.zope.com> <3D122354.8040308@tismer.com> Message-ID: <3D12423F.3010604@tismer.com> François Pinard wrote: > [Christian Tismer] > > >>>>"""Version fatigue comes from the accumulated realization that most >>>>knowledge gained with regard to any particular version of a product will >>>>be useless with regard to future generations of that same product.""" Thanks. This is a false quote, but I could have said it. :-) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From niemeyer@conectiva.com Thu Jun 20 22:20:43 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 20 Jun 2002 18:20:43 -0300 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <3D123A5B.EB7389AA@prescod.net> References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net> Message-ID: <20020620182043.A4252@ibook.distro.conectiva> > Let's presume a "sub" method with the features of Ping's string > interpolation PEP. This would look like: That's not the PEP being discussed, and if it was, it can't replace the % mapping. Read the Security Issues. -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From barry@barrys-emacs.org Thu Jun 20 22:21:02 2002 From: barry@barrys-emacs.org (Barry Scott) Date: Thu, 20 Jun 2002 22:21:02 +0100 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <3D121F0D.E3B60865@prescod.net> Message-ID: <001901c218a0$6158d1c0$070210ac@LAPDANCE> If I'm going to move from %(name)fmt to ${name} I need a place for the fmt format. Given the error prone nature of %(name) (it should have been %(name)s), how about adding the format inside the {}? For example: ${name:format} You can then have $name ${name} ${name:s} $name and ${name} work as you have already decided. ${name:format} allows the format to control the substitution. Barry From guido@python.org Thu Jun 20 22:21:24 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 17:21:24 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 11:29:33 PDT." 
<3D121F0D.E3B60865@prescod.net> References: <3D121F0D.E3B60865@prescod.net> Message-ID: <200206202121.g5KLLPT05634@odiug.zope.com> [Paul] > We will never come to a solution unless we agree on what, if any, > the problem is. [...eloquent argument, ending in...] > But I am against adding "$" if half of Python programmers are going > to use that and half are going to use %. $ needs to be a > replacement. There should be one obvious way to solve simple > problems like this, not two. I am also against adding it as a > useless function buried in a module that nobody will bother to > import. Well argued. Alex said roughly the same thing: let's not add $ while keeping %. Adding a function for $-interpolation to a module would certainly help some projects (like web templating) from reinventing the wheel -- but /F has shown that this particular wheel isn't hard to recreate. I would certainly recommend any project that offers substitution in templates that are edited by non-programmers to use the $-based syntax from Barry's PEP rather than Python's %(name)s syntax. (In particular I hope Python's i18n projects will use $ interpolation.) Oren made a good point that Paul emphasized: the most common use case needs interpolation from the current namespace in a string literal, and expressions would be handy. Oren also made the point that the necessary parsing could (should?) be done at compile time. We currently already have many ways to do this: - In some cases print is appropriate: def f(x, y): print "The sum of", x, "and", y, "is", x+y - You can use string concatenation: def f(x, y): return "The sum of " + str(x) + " and " + str(y) + " is " + str(x+y) - You can use % interpolation (with two variants: positional and by-name). A problem is that you have to specify an explicit tuple or dict of values. def f(x, y): return "The sum of %s and %s is %s" % (x, y, x+y) Note that the print version is the shortest, and IMO also the easiest to read. 
(Though some people might disagree and prefer the % version because it separates the template from the data; it's not much longer.) - You could have an interpolation helper function: def i(*a): return "".join(map(str, a)) so you could write this: def f(x, y): return i("The sum of ", x, " and ", y, " is ", x+y) This comes closer in length to the print version. IMO the attraction of the $ version is that it reduces the amount of punctuation so that it becomes even shorter and clearer. While I said "shorter" several times above when comparing styles, I really meant that as a shorthand for "shorter and clearer". Even the print example suffers from the fact that every interpolated value is separated from the surrounding template by a comma and a string quote on both sides -- that's a lot of visual clutter (not to mention stuff to type). Maybe in Python 3.0 we will be able to write: def f(x, y): return "The sum of $x and $y is $(x+y)" To me, it's a toss-up whether this looks better or worse than the ABC version: def f(x, y): return "The sum of `x` and `y` is `x+y`" but I do know that backticks have a poor reputation for being hard to find on the keyboard (newbies don't even know they have it), hard to distinguish in some fonts, and publishers often turn 'foo' into `foo', making it hard to publish accurate documentation. I think on some European keyboards ` is a dead key, making it even harder to type. Additionally, it's a symmetric operator, which makes it harder to parse complex examples. Now, how to get there (or somewhere similar) in Python 2.3? PEP 215 solves it by using (yet) another string prefix character. It uses $, which to me looks a bit ugly; in this thread, someone proposed using e, so you can do: def f(x, y): return e"The sum of $x and $y is $(x+y)" That looks OK to me, especially if it can be combined with u and r to create unicode and raw strings. 
There are other possibilities: def f(x, y): return "The sum of \$x and \$y is \$(x+y)" Alas, it's not 100% backwards compatible, and the \$ looks pretty bad. Another one: def f(x, y): return "The sum of \(x) and \(y) is \(x+y)" Still not 100% compatible, looks perhaps a bit better, but notice how now every interpolation needs three punctuation characters: almost as many as the print example. Assuming that interpolating simple variables is relatively common, I still like plain $ with something to tag the string as an interpolation best. PEP 292 is an attempt to do this *without* involving the parser: def f(x, y): return "The sum of $x and $y is $(x+y)".sub() Downsides are that it invites using non-literals as formats, with all the security aspects, and that its parsing happens at run-time (no big deal IMO). Now back to $ vs. %. I think I can defend having both in the language, but only if % is reduced to the positional version (classic printf). This would be used mostly to format numerical data with fixed column width. There would be very little overlap in use cases: % always requires you to specify explicit values, while $ is always % followed by a variable name. (Yet another variant is from Tcl, which uses $variable but also [expression]. In Python 3.0 this would become: def f(x, y): return "The sum of $x and $y is [x+y]" But now you have three characters that need quoting, and we might as well use \$ to quote a literal $ instead of $$.) All options are still open. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 20 22:35:41 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 17:35:41 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 13:26:03 PDT." 
<3D123A5B.EB7389AA@prescod.net> References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net> Message-ID: <200206202135.g5KLZf705731@odiug.zope.com> > > "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd], > > day, > > self.monthname[month], year, > > hh, mm, ss) > > This one is part of the small percent that uses formatting codes. It > wouldn't be rocket science to integrate formatting codes with the "$" > notation $02d{day} but it would also be fine if this involved a call to > textutils.printf() But if you support $02d{day} you should also support $d{day}, but that already means something different (the variable 'd' followed by '{day}'). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jun 20 22:37:39 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 17:37:39 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 22:21:02 BST." <001901c218a0$6158d1c0$070210ac@LAPDANCE> References: <001901c218a0$6158d1c0$070210ac@LAPDANCE> Message-ID: <200206202137.g5KLbd505742@odiug.zope.com> > If I'm going to move from %(name)fmt to ${name} I need a place for > the fmt format. Given the error prone nature of %(name) should have > been %(name)s > > Howabout adding the format inside the {} for example: > > ${name:format} > > You can then have > > $name > ${name} > ${name:s} > > $name and ${name} work as you have already decided. ${name:format} allows > the format to control the substitution. Not bad. 
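[Archive note: Barry's ${name:format} idea maps naturally onto the existing %-format codes. A minimal sketch of one possible reading of it -- the function name `dollar_sub` and the explicit mapping argument are hypothetical, not taken from any PEP:]

```python
import re

# ${name} or ${name:format}; the braces delimit the whole placeholder,
# which sidesteps Guido's "$d{day} vs. $02d{day}" ambiguity.
_PATTERN = re.compile(r'\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]+))?\}')

def dollar_sub(template, mapping):
    """Replace ${name} and ${name:format} using an explicit mapping.

    A bare ${name} defaults to %s; ${name:format} reuses the existing
    %-format codes, so ${day:02d} behaves like '%(day)02d'.
    """
    def repl(match):
        name, fmt = match.group(1), match.group(2) or 's'
        return ('%' + fmt) % mapping[name]
    return _PATTERN.sub(repl, template)

print(dollar_sub("${greeting}, ${day:02d} GMT", {'greeting': 'Sun', 'day': 2}))
```

Because the colon only has meaning inside the braces, plain $name and ${name} continue to work exactly as before.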
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Thu Jun 20 23:08:32 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 20 Jun 2002 15:08:32 -0700 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net> <20020620182043.A4252@ibook.distro.conectiva> Message-ID: <3D125260.8559CAD8@prescod.net> Gustavo Niemeyer wrote: > > > Let's presume a "sub" method with the features of Ping's string > > interpolation PEP. This would look like: > > That's not the PEP being discussed, and if it was, it can't replace > the % mapping. Read the Security Issues. That's true. I didn't mean to endorse any particular solution but rather to clarify the problem. I believe that only one of your examples required a feature (runtime provision of the format string) that was not in Ping's PEP. If another PEP is a better solution to the problem than the current one then fine. My point is that there *is* a problem! Paul Prescod From skip@pobox.com Thu Jun 20 23:17:56 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 20 Jun 2002 17:17:56 -0500 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <200206202121.g5KLLPT05634@odiug.zope.com> References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: <15634.21652.224578.240799@beluga.mojam.com> Guido> Alex said roughly the same thing: let's not add $ while keeping Guido> %. Then let's not add $ at all. ;-) Seriously, I'm not keen on having to modify all my %-formatted strings for something I perceive as a negligible improvement. I've seen nothing to suggest that any $-format proposals I've read were knock-my-socks-off better than the current %-format implementation. 
Skip From jmiller@stsci.edu Thu Jun 20 23:19:17 2002 From: jmiller@stsci.edu (Todd Miller) Date: Thu, 20 Jun 2002 18:19:17 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> Message-ID: <3D1254E5.6010007@stsci.edu> Guido van Rossum wrote: >>There has been some recent interest in the Numeric/numarray community >>for using array objects as indices >>for builtin sequences. I know this has come up before, but to make >>myself clear, the basic idea is to make the >>following work: >> >>class C: >> def __int__(self): >> return 5 >> >>object = C() >> >>l = "Another feature..." >> >>print l[object] >>"h" >> >>Are there any plans (or interest) for developing Python in this direction? >> > >I'm concerned that this will also make floats acceptable as indices >(since they have an __int__ method) and this would cause atrocities >like > >print "hello"[3.5] > >to work. > >--Guido van Rossum (home page: http://www.python.org/~guido/) > > >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev > That makes sense. What if we specifically excluded Float objects from the conversion? Are there any types that need to be excluded? If there's a chance of getting a patch for this accepted, STSCI is willing to do the work. Todd -- Todd Miller jmiller@stsci.edu STSCI / SSG (410) 338 4576 From niemeyer@conectiva.com Thu Jun 20 23:20:15 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 20 Jun 2002 19:20:15 -0300 Subject: [Python-Dev] h2py Message-ID: <20020620192014.A5111@ibook.distro.conectiva> Hi everyone! I was thinking about working a little bit in the h2py tool. But first, I'd like to understand what's the current position of its utility. Should I worry about it, or this tool and its generated files, are something to be obsoleted soon? Thanks! 
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From niemeyer@conectiva.com Thu Jun 20 23:32:44 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 20 Jun 2002 19:32:44 -0300 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <3D125260.8559CAD8@prescod.net> References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net> <20020620182043.A4252@ibook.distro.conectiva> <3D125260.8559CAD8@prescod.net> Message-ID: <20020620193244.B5111@ibook.distro.conectiva> > That's true. I didn't mean to endorse any particular solution but rather > to clarify the problem. I believe that only one of your examples > required a feature (runtime provision of the format string) that was not > in Ping's PEP. If another PEP is a better solution to the problem than > the current one then fine. My point is that there *is* a problem! Agreed. I feel relieved to know that the problem is in a PEP, and that there's a lot of smart people discussing its implementation. Don't worry, it won't get into Python before there's a minimum consensus on the solution. Of course, issuing your opinion is important to define the minimum consensus. ;-) -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From ping@zesty.ca Thu Jun 20 23:48:52 2002 From: ping@zesty.ca (Ka-Ping Yee) Date: Thu, 20 Jun 2002 15:48:52 -0700 (PDT) Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <20020620071856.GA10497@hishome.net> Message-ID: On Thu, 20 Jun 2002, Oren Tirosh wrote: > > See http://tothink.com/python/embedpp Hi Oren, Your proposal brings up some valid concerns with PEP 215: 1. run-time vs. compile-time parsing 2. how to decide what's an expression 3. balanced quoting instead of $ PEP 215 actually agrees with you on point #1. That is, the intent (though poorly explained) was that the interpolated strings would be turned into bytecode by the compiler. 
That is why the PEP insists on having the interpolated expressions in the literal itself -- they can be taken apart at compile time. However, i don't necessarily agree with PEP 215. (I mentioned this once before, but it might not hurt to reiterate that i didn't write the PEP because i desperately wanted string interpolation. I wrote it because i wanted to try to get one local optimum written down in a PEP, so there would be something for discussion.) Using compile-time parsing, as in PEP 215, has the advantage that it avoids any possible security problems; but it also eliminates the possibility of using this for internationalization. I see this as the key tension in the string interpolation issue (aside from all the syntax stuff -- which is naturally controversial). -- ?!ng "Computers are useless. They can only give you answers." -- Pablo Picasso From Donald Beaudry Thu Jun 20 23:50:51 2002 From: Donald Beaudry (Donald Beaudry) Date: Thu, 20 Jun 2002 18:50:51 -0400 Subject: [Python-Dev] *Simpler* string substitutions References: <001901c218a0$6158d1c0$070210ac@LAPDANCE> Message-ID: <200206202250.g5KMops08100@zippy.abinitio.com> "Barry Scott" wrote, > Howabout adding the format inside the {} for example: > > ${name:format} Considering that the $ is supposed to be familar to folks who use other tools, the colon used this way might undo much of that good will. On the other hand, %{name:format} might be just the right thing. -- Donald Beaudry Ab Initio Software Corp. 201 Spring Street donb@abinitio.com Lexington, MA 02421 ...So much code... 
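[Archive note: the i18n tension Ping describes -- compile-time parsing rules out translated templates -- can be made concrete with a small runtime sketch. The toy CATALOG dict stands in for a real gettext catalog, and both function names are hypothetical:]

```python
import re

# A toy catalog standing in for gettext; translators see the raw
# template and may reorder the ${...} placeholders freely.
CATALOG = {
    "The sum of ${x} and ${y} is ${total}":
        "La somme de ${x} et ${y} est ${total}",
}

def translate(template):
    return CATALOG.get(template, template)

def substitute(template, values):
    # Substitution happens *after* translation; a compile-time scheme
    # could not do this, because the translated template only exists
    # at run time.
    return re.sub(r'\$\{([A-Za-z_][A-Za-z0-9_]*)\}',
                  lambda m: str(values[m.group(1)]), template)

msg = substitute(translate("The sum of ${x} and ${y} is ${total}"),
                 {'x': 2, 'y': 3, 'total': 5})
print(msg)  # La somme de 2 et 3 est 5
```

Only simple names are substituted here; allowing arbitrary expressions in a run-time-loaded template is exactly the security problem the thread keeps circling back to.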
From aahz@pythoncraft.com Fri Jun 21 00:00:11 2002 From: aahz@pythoncraft.com (Aahz) Date: Thu, 20 Jun 2002 19:00:11 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <20020620170025.A25014@ibook.distro.conectiva> References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> Message-ID: <20020620230011.GA18327@panix.com> On Thu, Jun 20, 2002, Gustavo Niemeyer wrote: > > "Serving HTTP on", sa[0], "port", sa[1], "..." This is where current string handling comes up short. What's the correct way to internationalize this string? What if the person handling I18N isn't a Python programmer? I'm sort of caught in the middle here. I can see that in some ways what we currently have isn't ideal, but we've already got problems with strings violating the Only One Way stricture (largely due to immutability vs. "+" combined with .join() vs. % -- fortunately, the use cases for .join() and % are different, so people mostly use them appropriately). It seems to me that fixing the problems with % formatting for newbie Python programmers just isn't worth the pain. It also seems to me that getting better/simpler interpolation support for I18N and similar templating situations is also a requirement. I vote for two things: * String template class for the text module/package that does more-or-less what PEP 292 suggests. I think standardizing string templating would be a Good Thing. I recommend that only one interpolation form be supported; if we're following PEP 292, it should be ${var}. This makes it visually easy for translators to find the variables. * No changes to current string interpolation features unless it's made compatible with % formatting. I don't think I can support dropping % formatting even in Python 3.0; it's not just source code that will have string formats, but also config files and databases. 
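[Archive note: the template class proposed here can be sketched in a few lines. This is an illustration only, not the eventual implementation -- PEP 292's design later shipped as string.Template in Python 2.4, with a richer pattern than the ${var}-only form shown:]

```python
import re

class Template:
    """Minimal ${var}-only template, as suggested above."""
    _pattern = re.compile(r'\$\{([A-Za-z_][A-Za-z0-9_]*)\}')

    def __init__(self, template):
        self.template = template

    def substitute(self, mapping):
        # Raise KeyError for missing names rather than guessing;
        # a translator's typo should fail loudly.
        return self._pattern.sub(
            lambda m: str(mapping[m.group(1)]), self.template)

t = Template("Serving HTTP on ${host} port ${port} ...")
print(t.substitute({'host': '0.0.0.0', 'port': 8000}))
```

Supporting only the braced form keeps the placeholders visually easy for translators to find, which is the point being argued for.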
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From pinard@iro.umontreal.ca Fri Jun 21 00:40:23 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 20 Jun 2002 19:40:23 -0400 Subject: [Python-Dev] Re: *Simpler* string substitutions In-Reply-To: <200206202121.g5KLLPT05634@odiug.zope.com> References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: [Guido van Rossum] > [...] All options are still open. Thanks, Guido, for the synthesis of a summary of various avenues. These two points are worth underlining: 1) let's not add $ while keeping %. [...] having both in the language, but only if % is reduced to the positional version 2) the necessary parsing could (should?) be done at compile time. Here are other comments, some of which are related to internationalisation. > return "The sum of " + str(x) + " and " + str(y) + " is " + str(x+y) > return i("The sum of ", x, " and ", y, " is ", x+y) > print "The sum of", x, "and", y, "is", x+y > Note that the print version is the shortest, and IMO also the easiest > to read. These are good for quick programs, and `print' is good for debugging. But they are less appropriate whenever internationalisation is in the picture, because it is more handy and precise for translators to handle wider context at once, than individual sentence fragments. > [...] % interpolation (with two variants: positional and by-name). The advantage of by-name interpolation for internationalisation is the flexibility it gives for translators to reorganise the inserts. > return "The sum of `x` and `y` is `x+y`" > return "The sum of $x and $y is $(x+y)" > return "The sum of $x and $y is [x+y]" Those three above might be a little too magical for Python. 
Python ought not to have interpolation on all double-quoted strings like shells or Perl (and it should probably avoid deciding interpolability on the delimiter being a single or double quote, even if shells or Perl do this). > return "The sum of \(x) and \(y) is \(x+y)" > return "The sum of \$x and \$y is \$(x+y)" > return e"The sum of $x and $y is $(x+y)" > [...] I still like plain $ with something to tag the string as an > interpolation best. Those three are interesting, because they build on the escape syntax, or prefix letters, which Python already has. All these notations would naturally accept `ur' prefix letters. The shortest notation in the above is the third, using the `e' prefix, because this is the one requiring the least number of supplementary characters per interpolation. This is really a big advantage. (A detail about the letter `e': is it the best letter to use?) I also like the hidden suggestion that round parentheses are more readable than braces, something that was already granted in Python through the current %-by-name syntax. In fact, `${name}' would be more acceptable if Python also got at the same time `$(name)' as equivalent, and _also_ `%{name}format' as equivalent for `%(name)format'. The simplest is surely to avoid braces completely, not introducing them. As long as Python does not fully get rid of `%', I wonder if the last two examples above could not be rewritten: return "The sum of \%x and \%y is \%(x+y)" return e"The sum of %x and %y is %(x+y)" That would avoid introducing `$' while we already have `%'. On the other hand, it might be confusing to overload `%' too much, if one wants to mix everything like in: return e"The sum of %x and %y is %%d" % (x+y) This is debatable, and delicate. Users already have to deal with how to quote `\' and `%'. Having to deal with `$' as well, in all combinations and exceptional cases, makes a lot of things to consider. 
Most of us easily write shell scripts, yet we have difficulty to properly write or decipher a shell line using many quoting devices at once. Python is progressively climbing the same road. It should stay simpler, all considered. But I think the main problem in all these suggestions is how they interact with internationalisation. Surely: return _(e"The sum of %x and %y is %(x+y)") cannot be right. Interpolation has to be delayed to after translation, not before, because you agree that translators just cannot produce a translation for all possible inserts. I do not know what the solution is, and what kind of elegant magic may be invented to yield programmers all the flexibility they still need in that area. It is worth a good thought, and we should not rush into a decision before this aspect has been carefully analysed. If other PEPs are necessary for addressing interactions between interpolation and translation, these PEPs should be fully resolved before or concurrently with the PEP on interpolation, and not pictured as independent issues. > [...] There would be very little overlap in use cases: % always > requires you to specify explicit values, while $ is always % followed > by a variable name. Yes, the suggestion of using `$(name:format)', whenever needed, is a good one that should be retained, maybe as `%(name:format)', or maybe with `$'. It means that the overlap would not be so little, after all. -- François Pinard http://www.iro.umontreal.ca/~pinard From guido@python.org Fri Jun 21 02:29:30 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 21:29:30 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: Your message of "Thu, 20 Jun 2002 18:19:17 EDT." 
<3D1254E5.6010007@stsci.edu> References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> Message-ID: <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> [Todd Miller] > >>There has been some recent interest in the Numeric/numarray community > >>for using array objects as indices > >>for builtin sequences. I know this has come up before, but to make > >>myself clear, the basic idea is to make the > >>following work: > >> > >>class C: > >> def __int__(self): > >> return 5 > >> > >>object = C() > >> > >>l = "Another feature..." > >> > >>print l[object] > >>"e" > >> > >>Are there any plans (or interest) for developing Python in this direction? [Guido] > >I'm concerned that this will also make floats acceptable as indices > >(since they have an __int__ method) and this would cause atrocities > >like > > > >print "hello"[3.5] > > > >to work. [Todd] > That makes sense. What if we specifically excluded Float objects from > the conversion? Are there any types that need to be excluded? If > there's a chance of getting a patch for this accepted, STSCI is willing > to do the work. Hm, an exception for a specific type seems ugly. What if a user defines a UserFloat type, or a Rational type, or a FixedPoint type, with an __int__ conversion? This points to an unfortunate early design flaw in Python (inherited from C casts): __int__ has two different meanings -- sometimes it converts the type, sometimes it also truncates the value. I suppose you could hack something where you extract x.__int__() and x.__float__() and compare the two, but that could lead to a lot of overhead. I hesitate to propose a new special method, but that may be the only solution. :-( What's your use case? Why do you need this? 
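The two meanings of `__int__' described here can be seen side by side in a short sketch; both classes are hypothetical, invented purely for illustration:

```python
class FixedPointNumber:
    """Float-like type: here __int__ *truncates*, losing information."""
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return int(self.value)   # truncation: 3.5 -> 3

class ArrayScalar:
    """Int-like type: here __int__ is a lossless type conversion."""
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return self.value

# Both types answer int() the same way, so indexing code that merely
# calls __int__ cannot tell the safe conversion from the lossy one:
assert int(FixedPointNumber(3.5)) == 3   # silently truncated
assert int(ArrayScalar(3)) == 3          # exact
```

This ambiguity is why the thread circles around a separate, index-only protocol rather than reusing `__int__' directly.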
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 21 02:31:11 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 21:31:11 -0400 Subject: [Python-Dev] h2py In-Reply-To: Your message of "Thu, 20 Jun 2002 19:20:15 -0300." <20020620192014.A5111@ibook.distro.conectiva> References: <20020620192014.A5111@ibook.distro.conectiva> Message-ID: <200206210131.g5L1VBe09370@pcp02138704pcs.reston01.va.comcast.net> > I was thinking about working a little bit in the h2py tool. But > first, I'd like to understand what's the current position of its > utility. Should I worry about it, or this tool and its generated files, > are something to be obsoleted soon? It's a poor hack. We're trying to get away from having any header files generated by this tool, because it turns out there is always some platform where it misses some symbols. (E.g. I recall we had a case where a particular set of important constants was defined as an enum instead of as #defines.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 21 02:41:13 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Jun 2002 21:41:13 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: Your message of "Thu, 20 Jun 2002 15:48:52 PDT." References: Message-ID: <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net> > Using compile-time parsing, as in PEP 215, has the advantage that it > avoids any possible security problems; It is also the only way to properly support nested scopes. It would be confusing and inconsistent if you can use a variable from a nested scope in an expression but not in a "string display" (which I think is a cute name for strings with embedded expressions). > but it also eliminates the possibility of using this for > internationalization. 
I see this as the key tension in the string > interpolation issue (aside from all the syntax stuff -- which is > naturally controversial). Yes, I believe that Barry's main purpose is i18n. But I think i18n should not be approached in a cavalier way. If you need i18n of your application, you have to be very disciplined anyway. I think collecting the variables available for interpolation in a dict and passing them explicitly to an interpolation function is the way to go here. Also, in i18n the interpolation syntax must be usable for translators who are not necessarily programmers. I believe the $ notation with only simple variables is entirely adequate for that purpose -- and Barry can implement it in a few lines. (We just adopted this for Zope3, and while there are all sorts of open issues, $ interpolation is not one of them.) --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Fri Jun 21 04:05:17 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 20 Jun 2002 20:05:17 -0700 Subject: [Python-Dev] String substitution: compile-time versus runtime References: <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D1297ED.3990C30F@prescod.net> Guido van Rossum wrote: > > > Using compile-time parsing, as in PEP 215, has the advantage that it > > avoids any possible security problems; > > It is also the only way to properly support nested scopes. It would > be confusing and inconsistent if you can use a variable from a nested > scope in an expression but not in a "string display" (which I think is > a cute name for strings with embedded expressions). >... > Yes, I believe that Barry's main purpose is i18n. But I think i18n > should not be approached in a cavalier way. If you need i18n of your > application, you have to be very disciplined anyway. I think > collecting the variables available for interpolation in a dict and > passing them explicitly to an interpolation function is the way to go > here. 
I think that what I hear you saying is that interpolation should ideally be done at compile time for simple uses and at runtime for i18n. The compile-time version should have the ability to do full expressions (array indexes and self.members at the very least) and will have access to nested scopes. The runtime version should only work with dictionaries. I think you also said that they should both use named parameters instead of positional parameters. And presumably just for simplicity they would use similar syntax although one would be triggered at compile time and one at runtime. If "%" survives, it would be used for positional parameters, instead of named parameters. Is that your current thinking on the matter? I think we are making progress if we're coming to understand that the two different problem domains (simple scripts versus i18n) have different needs and that there is probably no one solution that fits both. Paul Prescod From niemeyer@conectiva.com Fri Jun 21 06:07:25 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Fri, 21 Jun 2002 02:07:25 -0300 Subject: [Python-Dev] Behavior of matching backreferences Message-ID: <20020621020725.A9565@ibook.distro.conectiva> Hi everyone! I was studying the sre module, when I came up with the following regular expression: re.compile("^(?P<a>a)?(?P=a)$").match("ebc").groups() The (?P=a) matches whatever was matched by the "a" group. If "a" is optional and doesn't match, it seems to make sense that (?P=a) becomes optional as well, instead of failing. Otherwise the regular expression above will always fail if the first group fails, even though it is optional. One could argue that to make it a valid regular expression, it should become "^(?P<a>a)?(?P=a)?". But that's a different regular expression, since it would match "a", while the regular expression above would match "aa" or "", but not "a". This kind of pattern is useful, for example, to match a string which could be optionally surrounded by quotes, like shell variables. Here's an example of such a pattern: r"^(?P<a>')?((?:\\'|[^'])*)(?P=a)$". This pattern matches "'a'", "\'a", "a\'a", "'a\'a'" and all such variants, but not "'a", "a'", or "a'a". I've submitted a patch to make this work to http://python.org/sf/571976 -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From niemeyer@conectiva.com Fri Jun 21 06:08:15 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Fri, 21 Jun 2002 02:08:15 -0300 Subject: [Python-Dev] h2py In-Reply-To: <200206210131.g5L1VBe09370@pcp02138704pcs.reston01.va.comcast.net> References: <20020620192014.A5111@ibook.distro.conectiva> <200206210131.g5L1VBe09370@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20020621020815.B9565@ibook.distro.conectiva> Guido, > It's a poor hack. We're trying to get away from having any header > files generated by this tool, because it turns out there is always > some platform where it misses some symbols. (E.g. I recall we had a > case where a particular set of important constants was defined as an > enum instead of as #defines.) Ok, I'll leave it as is then. Thank you! 
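The backreference behaviour Gustavo describes earlier in this digest can be exercised directly; this shows the stock engine, i.e. without his proposed patch applied:

```python
import re

pattern = re.compile(r"^(?P<a>a)?(?P=a)$")

# When group "a" participates, the backreference must repeat it:
assert pattern.match("aa") is not None
# "a" alone fails: with the group matched there is nothing left for
# (?P=a), and backtracking to an unmatched group fails it too:
assert pattern.match("a") is None
# The empty string also fails, because a backreference to a group that
# did not participate in the match fails rather than matching "":
assert pattern.match("") is None
```

Gustavo's patch would change the last case, letting the whole expression match when the optional group and its backreference are both absent.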
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From bac@OCF.Berkeley.EDU Fri Jun 21 06:23:25 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Thu, 20 Jun 2002 22:23:25 -0700 (PDT) Subject: [Python-Dev] strptime recapped Message-ID: I have written the callout to strptime.strptime (strptime is SF patch #474274) as Guido asked. Since that was the current hold-up and the thread has gone dormant, I figured I should summarize the discussion up to this point. 1) what is the need?: The question was raised why this was done. The answer was that since time is just a wrapper around time.h, strptime was not guaranteed since it is not a part of ANSI C. Some ANSI C libraries include it, though (like glibc), because it is so useful. Unfortunately Windows and OS X do not have it. Having it in Python means it is completely portable and no longer reliant on the ANSI C library being kind enough to provide it. 2) strftime dependence: Some people worried about the dependence upon strftime for calculating some info. But since strftime is guaranteed to be there by Python (since it is a part of ANSI C), the dependence is not an issue. 3) locale info for dates: Skip and Guido pointed out that calendar.py now generates the names of the weekdays and months on the fly similar to my solution. So I did go ahead and use it. But Skip pointed out that perhaps we should centralize any code that calculates locale info for dates (calendar.py's names and my code for figuring out format for date/time). I had suggested adding it to the locale module and Guido responded that Martin had to ok that. Martin hasn't responded to that idea. 4) location of strptime: Skip asked why Guido was having me write the callout patch to timemodule.c. He wondered why Lib/time.py wasn't just created holding my code and then renaming timemodule.c to _timemodule.c and importing it at the end of time.py. No response has been given thus far for that. 
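One of the helper calculations a pure-Python strptime needs is deriving tm_yday (the day of the year) from the parsed Gregorian date. A minimal sketch of that computation, illustrative only and not the code from the patch:

```python
def is_leap(year):
    # Gregorian rule: divisible by 4, except centuries not divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def day_of_year(year, month, day):
    """tm_yday-style ordinal: Jan 1 -> 1, Dec 31 -> 365 (or 366)."""
    days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    if is_leap(year):
        days_in_month[1] = 29
    return sum(days_in_month[:month - 1]) + day

assert day_of_year(2002, 1, 1) == 1
assert day_of_year(2002, 12, 31) == 365
assert day_of_year(2000, 3, 1) == 61   # 2000 is a leap year
```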
I also suggested a possible time2 where things like strptime, my helper fxns (calculate the Julian date from the Gregorian date, etc.), and things such as naivetime could be kept. That would allow time to stay as a straight wrapper to time.h while all bonus code gets relegated to time2. Guido said it might be a good idea but would have to wait until he got back from vacation. That pretty much sums up everything to this point; hope I got it right and didn't miss anything. -Brett C. From tim.one@comcast.net Fri Jun 21 06:49:02 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 01:49:02 -0400 Subject: [Python-Dev] RE: [Patches] [ python-Patches-566100 ] Rationalize DL_IMPORT and DL_EXPORT In-Reply-To: Message-ID: In case anyone wants prettier Python source code, check out Mark Hammond's patch. It deserves more consideration than 50 identical ways to spell string interpolation : http://www.python.org/sf/566100 From python@rcn.com Fri Jun 21 07:18:59 2002 From: python@rcn.com (Raymond Hettinger) Date: Fri, 21 Jun 2002 02:18:59 -0400 Subject: [Python-Dev] Re: *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net><200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: <00a401c218eb$8872c7c0$87d8accf@othello> +1 for $(name) instead of ${name} because it is closer to existing formatting spec because my tastes like it better -1 for $(x+y) because z=x+y; '$z'.sub() works fine because general expressions are harder to pick-out +1 for $(name:fmt) because the style is powerful and elegant +1 for \$ instead of $$ because \ is already an escape character because $$ is more likely to occur in actual string samples +1 for 'istring'.sub() instead of e'istring' because sub allows a particular mapping to be specified +1 for not being a separate module so the feature gets used +1 for leaving %()s alone because formats may have been stored external to programs +1 for not using back-quotes because they are hard to read in languages with accents because 
the open and close back-quotes are not distinct 'regnitteh dnomyar'[::-1] From greg@cosc.canterbury.ac.nz Fri Jun 21 07:36:23 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 21 Jun 2002 18:36:23 +1200 (NZST) Subject: [Python-Dev] Weird problem with exceptions raised in extension module In-Reply-To: <200206201745.g5KHjI604158@odiug.zope.com> Message-ID: <200206210636.g5L6aNU06187@oma.cosc.canterbury.ac.nz> Guido: > It seems that this is just for > > raise TypeError, "Test-Exception" Actually, it's raise TypeError("Test-Exception") > But I think that you shouldn't be calling PyErr_SetNone() here -- I > think you should call PyErr_SetObject(__pyx_1, __pyx_2). > > For details see do_raise() in ceval.c. Hmmm. Having studied this routine *very* carefully, I think I can see where things are going wrong. Reading the C API docs led me to believe that the equivalent of the Python statement raise x would be PyErr_SetNone(x) But it appears that is not the case, and what I should actually be doing is PyErr_SetObject( ((PyInstanceObject*)x)->in_class, x) This is... um... not very intuitive. Perhaps the C API docs could be amended to mention this? Also, it looks as if exceptions have to be old-style instances, not new-style ones. Is that correct? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleax@aleax.it Fri Jun 21 08:38:05 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 09:38:05 +0200 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <200206202121.g5KLLPT05634@odiug.zope.com> References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: On Thursday 20 June 2002 11:21 pm, Guido van Rossum wrote: ... > Now back to $ vs. %. 
I think I can defend having both in the > language, but only if % is reduced to the positional version (classic > printf). This would be used mostly to format numerical data with > fixed column width. There would be very little overlap in use cases: I think you're right: in a "greenfield" language design (a hypothetical one starting from scratch with no constraints of backwards compatibility) you can indeed defend using both % and $ for these two tasks, net of the issues of what feature set to give $ formatting -- implicit vs non-implicit access to variables, including the very delicate case of access to free variables (HOW to give access to free variables if the format string isn't a literal?); ability to use expressions and not just identifiers; ability to pass a mapping; what format control should be allowed in $ formatting -- and what syntax to use to give access to those features. If %(name)s is to be deprecated moving towards Python-3000 (surely it can't be _removed_ before then), $-formatting needs a very rich feature set; otherwise it can't _replace_ %-formatting. It seems to me that (assuming $ formatting IS destined to get into Python) $ formatting should then be introduced with all or most of the formatting power it eventually needs, so that those who want to make their programs Py3K-ready can use $ formatting to replace all their uses of %(name)s formatting. The "transition" period will thus inevitably offer different ways to perform the same tasks -- we can never get out of this bind, any time we move to deprecate an "old way" to perform a task, since the old way and the new way MUST both work together for a good while to allow migration. This substantial cost is of course worth paying only if the new way is a huge win over the old one -- not just "somewhat" better, but ENORMOUSLY better. But that's OK, and exactly the kind of delicate trade-off which you DO have such a good track record at getting right in the past:-). > All options are still open. 
Thanks for clarifying this. To me personally it seems that the gain of introducing $ formatting, if gain it be, is small enough not to be worth the transition cost, but that's just opinion, hard to back up with any substance. So I offer a real-life anecdote instead. A colleague at Strakt (a wizard at various communication and storage programming issues) had no previous exposure to Python at all, his recent background being mostly with Plan-9, Inferno, and Limbo (previously, other Bell Labs technologies, centered on Unix and C). He picked up Python on the job over the last few months -- basically from Python's own docs, our existing code base, and discussions with colleagues, me included -- and didn't take long to become productive with it. He still has some issues. Some are very understandable considering his background -- e.g., he's still not fully _comfortable_ with dynamic typing (I predict he'll grow to like it, but Rome wasn't built in one day). Overall, what I would call a pretty good scenario and an implicit tribute to Python's simplicity / ease / power. He may pine for Limbo, but in fact produces a lot of excellent Python code day in day out. But his biggest remaining "general peeve" struck me hard the other day, exactly because that's not something he "heard", but an observation he came up with all by himself, by reasonably unbiased examination of "Python as she's spoken". "I wouldn't mind Python so much" (I'm paraphrasing, but that IS the kind of grudging-compliment understatement he did use:-) "except that there's always so MANY deuced ways to do everything -- can't they just pick one and STICK with it?!". In the widespread subtext of most Python discourse this might sound like irony, but in his case, it was just an issue of fact (compared, remember, with SMALL languages such as Limbo -- bloated ones such as, e.g., C++, are totally *outside* his purview and experience) -- a bewildering array of possible variations. 
Surely inevitable when viewed diachronically (==as an evolution over time), but his view, like that of anybody who comes to Python anew today, is synchronic (==a snapshot at one moment). I don't think there's anything we can do to AVOID this phenomenon, of course, but right now I'm probably over-sensitized to the "transition costs" of introducing "yet one more way to do it" by this recent episode. So, it appears to me that REDUCING the occurrence of such perceptions is important. Alex From mal@lemburg.com Fri Jun 21 08:57:57 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 21 Jun 2002 09:57:57 +0200 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: <3D12DC85.6040501@lemburg.com> Alex Martelli wrote: > If %(name)s is to be deprecated moving towards Python-3000 (surely it > can't be _removed_ before then), $-formatting needs a very rich feature set; > otherwise it can't _replace_ %-formatting. It seems to me that (assuming > $ formatting IS destined to get into Python) $ formatting should then be > introduced with all or most of the formatting power it eventually needs, so > that those who want to make their programs Py3K-ready can use $ formatting > to replace all their uses of %(name)s formatting. I haven't jumped into this discussion since I thought that you were only discussing some new feature which I don't have a need for. Now if you want to deprecate %(name)s formatting, the situation is different: my tie would start jumping up and down, doing funny noises :-) So just this comment from me: please don't deprecate %(name)s formatting. For the rest: I don't really care. 
Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From aleax@aleax.it Fri Jun 21 09:10:03 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 10:10:03 +0200 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <3D1254E5.6010007@stsci.edu> References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> Message-ID: On Friday 21 June 2002 12:19 am, Todd Miller wrote: ... > >I'm concerned that this will also make floats acceptable as indices > >(since they have an __int__ method) and this would cause atrocities > >like > > > >print "hello"[3.5] ... > That makes sense. What if we specifically excluded Float objects from > the conversion? Are there any types that need to be excluded? If "Any type that's float-like", and that's a very hard set to pin down. Consider a user-written class that implements (e.g.) a number in decimal form (maybe BCD), carefully crafted to "look&feel just like float" except for its specifics (such as different rounding behavior). How would you tell that this class is NOT acceptable as a sequence index even though it has an __int__ method while another class with an __int__ method IS OK? It seems to me that one solution would be to add an attribute that is to be exposed by types / classes that WANT to be usable as indices in this way. If, say, the object exposes an attribute _usable_as_sequence_index, then the indexing code could proceed, otherwise, TypeError. It's quite sad that a lot of ad-hoc approaches such as this one have to be devised in each and every similar case, when PEP 246, gathering dust in the PEP repository, offers such a simple, elegant architecture for them all. 
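Alex's marker-attribute idea can be sketched in a few lines; the attribute name is his, while the classes and the `as_index' helper are invented here for illustration:

```python
class DecimalNumber:
    """Float-like: deliberately NOT marked as usable for indexing."""
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return int(self.value)

class ArrayScalar:
    """Int-like: opts in to being used as a sequence index."""
    _usable_as_sequence_index = True
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return self.value

def as_index(obj):
    """Accept real ints and opted-in types; reject everything else."""
    if isinstance(obj, int):
        return obj
    if getattr(obj, "_usable_as_sequence_index", False):
        return int(obj)
    raise TypeError("object cannot be used as a sequence index")

seq = "Another feature..."
assert seq[as_index(ArrayScalar(4))] == "h"
try:
    seq[as_index(DecimalNumber(3.5))]   # float-like: rejected
except TypeError:
    pass
```

As Alex says, this is an ad-hoc flag; the adaptation machinery he goes on to describe generalises the same opt-in decision.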
Basically, PEP 246 lets you ask a "central mechanism", given an object X and a "protocol" Y, to yield a Z (where Z is X if feasible, but in many cases might be a "version of X which is Y-fied without loss of information") such that Z is "X or a version of X that satisfies protocol Y". "Adaptation" is the name commonly used for this approach (also in PEP 246). When X can't be adapted to Y, an exception gets raised. Here, indexing code could ask for an adaptation of X to the "sequence index protocol" and get either "a version of X usable as sequence index" or an exception. "A protocol" is normally a type or class, and "Z satisfies protocol Y" may then be roughly equated to "Z is an instance of Y", but the concept is more general. If Python had a formal concept of 'interface', a protocol might also be an interface -- this is apparently what's holding up PEP 246, waiting for such 'interfaces' to appear. But "a protocol" may in fact be any object at all and the concept of "satisfying" it is really a matter of convention between the code that requests adaptation and the code that _provides_ adaptation. The latter may live in X's type, or in the Y protocol, or *outside of both* and get added to the "central mechanism" dynamically -- so you get a chance to adapt two separately developed frameworks without as much blood, sweat and tears as currently needed. (The compile-time equivalent of this is in Haskell's "typeclass" mechanism, but of course Python moves it to runtime instead.) Back to your specific issue. "An integer" is too BROAD a concept. When some client-code has an object X and "wants an integer equivalent of X" it may have SEVERAL different purposes in mind. int(X) can't guess and so provides only ONE way -- for example, truncating the fractional part if X is a float. 
If the client-code could ask more precisely for "give me a version of X to be used as a sequence index" it would still get back either an int OR an exception, BUT, the int result would only be supplied if "it was known" that X is indeed "usable without loss of information" for the specific purpose of indexing a sequence. The "it was known" part could reside in any one of three places: a. the SequenceIndexing protocol could 'know' that e.g. every int X is OK as a sequence index, and immediately return such an X if asked for adaptation of it; b. a type could 'know' its instances are OK as sequence indices, and supply the equivalent-for-THAT-purpose int on request; c. a "third-party" adapter could know that, for this application, instances of type A are OK to use as sequence indices: the third-party adapter would be installed at application startup, get invoked upon such adaptation requests when X is an instance of type A, and provide the needed int. See PEP 246 for one possible mechanism (at Python-level) to support this, but the mechanism is of course fully negotiable. The point is that we NEED something like PEP 246 each and every time we want to perform any task of this ilk. Almost every time I see type-testing (as implicit in the idea "but do something different if X is a float", for example), I see a need for PEP 246 that stays unmet because PEP 246 is waiting... Alex From aleax@aleax.it Fri Jun 21 09:20:02 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 10:20:02 +0200 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> References: <3D123B1E.6050600@stsci.edu> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Friday 21 June 2002 03:29 am, Guido van Rossum wrote: ... 
> This points to an unfortunate early design flaw in Python (inherited > from C casts): __int__ has two different meanings -- sometimes it > converts the type, sometimes it also truncates the value. That's inherent in any conversion to a type which has multiple purposes. I wouldn't call it a "design flaw" -- it's a "flaw" (?) in the underlying reality:-). > I hesitate to propose a new special method, but that may be the only > solution. :-( PEP 246... Alex From martin@v.loewis.de Fri Jun 21 09:34:00 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 21 Jun 2002 10:34:00 +0200 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <20020620205041.GD18944@zot.electricrain.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> Message-ID: "Gregory P. Smith" writes: > However linking against berkeleydb versions less than 3.2 will no longer > be supported; should we keep the existing bsddb around as oldbsddb for > users in that situation? I don't think so; users could always extract the module from older distributions if they want to. Instead, if there are complaints, I think we should try to extend source support a little further back. Regards, Martin From martin@v.loewis.de Fri Jun 21 09:36:14 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 21 Jun 2002 10:36:14 +0200 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <3D1254E5.6010007@stsci.edu> References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> Message-ID: Todd Miller writes: > That makes sense. 
What if we specifically excluded Float objects > from the conversion? Are there any types that need to be excluded? > If there's a chance of getting a patch for this accepted, STSCI is > willing to do the work. Perhaps an __index__ conversion could work? Regards, Martin From piers@cs.su.oz.au Fri Jun 21 09:41:25 2002 From: piers@cs.su.oz.au (Piers Lauder) Date: Fri, 21 Jun 2002 18:41:25 +1000 Subject: [Python-Dev] unifying read method semantics Message-ID: <1024648887.89.481975932@cs.su.oz.au> A user of imaplib's IMAP4_SSL class has complained that the "read" and "write" methods don't behave correctly, sometimes omitting to handle all the requested data. This is a bug - I should have noticed this common misconception when installing the submitted sub-class into imaplib. However, this is a common enough gotcha for python programmers that I wondered if it is worthwhile fixing it once and for all. Ie: mandate that core python modules providing read/write methods guarantee that all the data is sent by write() (or exception), and all the requested data is read() (or exception). The last time this came up the socketmodule code got a "sendall" method. However, this doesn't exist in the ssl portion of socketmodule.c. And while I'm on the topic - please could we always support "readline" (or "makefile") methods in C modules? Surely the following code now necessary in imaplib must make CPU-time conscious programmers wince: def readline(self): """Read line from remote.""" line = "" while 1: char = self.sslobj.read(1) line += char if char == "\n": return line :-) Piers Lauder. From martin@v.loewis.de Fri Jun 21 09:46:52 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 21 Jun 2002 10:46:52 +0200 Subject: [Python-Dev] unifying read method semantics In-Reply-To: <1024648887.89.481975932@cs.su.oz.au> References: <1024648887.89.481975932@cs.su.oz.au> Message-ID: Piers Lauder writes: > And while I'm on the topic - please could we always support "readline" > (or "makefile") methods in C modules? I don't think this is feasible. > Surely the following code now > necessary in imaplib must make CPU-time conscious programmers wince: > > def readline(self): > """Read line from remote.""" > line = "" > while 1: > char = self.sslobj.read(1) > line += char > if char == "\n": return line Moving this algorithm to another location won't essentially change CPU consumption... Regards, Martin From Paul.Moore@atosorigin.com Fri Jun 21 11:22:27 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 21 Jun 2002 11:22:27 +0100 Subject: [Python-Dev] Re: *Simpler* string substitutions Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com> Some points on the current thread, in no particular order... 1. While I agree that "$" is better known as an interpolation character than "%", it shouldn't be forgotten that "%" is the interpolation character in DOS/Windows shells. Some recent examples which showed "%" used ("The sum of %x and %y is %(x+y)") looked entirely natural to me (I use Windows more than Unix) - in fact, more so than "$"!! 2. The internationalisation issue is clearly important. However, it has very different characteristics insofar as the template string is (of necessity) handled at runtime, so issues of compilation and security become relevant. I'm no I18N expert, so I can't comment on details, but I *do* think it's worth separating out the I18N issues from the "simple interpolation" issues... 3. I feel that the existing % formatting operator cannot realistically be removed. 
Tidying up some of its warts may be possible, and even sensible, but there's too much code using it (and as was pointed out, template strings may not even be stored in code files) to make major changes. 4. Access to variables is also problematic. Without compile-time support, access to nested scopes is impossible (AIUI). But on the other hand, a scheme with subtle limitations such as lack of such access may not realistically count as "simple"... 5. (Personal opinion here!) I believe that formatting specifiers do not belong in a "simple" scheme - leave them for the "advanced" version (the existing % operator). On the other hand, I feel that expression interpolation, within limits, *is* suitable. It's the user's responsibility not to go overboard, though... Sorry for butting into an already long thread. I hope the summary is useful, at least... Paul. From tismer@tismer.com Fri Jun 21 12:09:33 2002 From: tismer@tismer.com (Christian Tismer) Date: Fri, 21 Jun 2002 13:09:33 +0200 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> <3D12DC85.6040501@lemburg.com> Message-ID: <3D13096D.9030803@tismer.com> M.-A. Lemburg wrote: > Alex Martelli wrote: > >> If %(name)s is to be deprecated moving towards Python-3000 (surely it >> can't be _removed_ before then), $-formatting needs a very rich >> feature set; otherwise it can't _replace_ %-formatting. It seems to >> me that (assuming >> $ formatting IS destined to get into Python) $ formatting should then be >> introduced with all or most of the formatting power it eventually >> needs, so >> that those who want to make their programs Py3K-ready can use $ >> formatting >> to replace all their uses of %(name)s formatting. > > > I haven't jumped into this discussion since I thought that > you were only discussing some new feature which I don't have > a need for. 
> > Now if you want to deprecate %(name)s formatting, > the situation is different: my tie would start jumping up > and down, doing funny noises :-) > > So just this comment from me: please don't deprecate %(name)s > formatting. For the rest: I don't really care. Yes, please don't! Besides the proposals so far, I'd like to add one, which I really like a bit, since I used it for years in an institute with a row of macro languages: How about name = "Guido" ; land = "The Netherlands" "His name is <> and he comes from <>.".sub(locals()) I always found this notation very sharp and readable, maybe this is just me. I like to have a notation that is easily parsed, has unique start and stop strings, no punctuation/whitespace rules at all. Any kind of extra stuff like format specifiers, default values or expressions (if you really must) can be added with ease. If people like to use different delimiters, why not: "His name is <$name$> and he comes from <$land$>.".sub(locals(), delimiters=("<$","$>") ) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Fri Jun 21 12:48:18 2002 From: tismer@tismer.com (Christian Tismer) Date: Fri, 21 Jun 2002 13:48:18 +0200 Subject: [Python-Dev] Re: Version Fatigue References: <200206201710.g5KHAaO03970@odiug.zope.com> <3D122354.8040308@tismer.com> <200206201954.g5KJsTt05302@odiug.zope.com> Message-ID: <3D131282.4030400@tismer.com> Guido van Rossum wrote: >>Guido, I'm not sure that you are always aware what >>people actually like about Python and what they dislike.
>>I have heard such complaints from so many people, >>that I think there are reasonably many who don't share >>your judgement. > > > Tough. People used to like it because they trusted my judgement. > Maybe I should stop listening to others. :-) I thought you did already? :-) > Seriously, the community is large enough that we can't expect > everybody to like the same things. There are reasonably many who > still do share my judgement. As the community grows, your audience also seems to change. Newbies are more comfortable with new features. The crowd of people like me who have become accustomed to slow and resistant motion in Python over the years are now no longer the main target of Python's development. I think I could name about 20 "oldtimers" without thinking, who might have similar feelings. [type, class, metaclass] > No surprise that you, always the mathematician, like the most > brain-exploding features. :-) It is not for the features. It is for the elegance of the concept, the great backward compatibility, the nice convergence of concepts that seemed to be impossible to get married. This is real "Kunst", whether I'm a math guy or a musician. > And note the contradiction, which you share with everybody else: you > don't want new features, except the three that you absolutely need to > have. And you see nothing wrong with this contradiction. Many people share this, but not me, sorry. I haven't requested any change to Python in years. I don't need any of the recent changes, but I can of course use them. >>All in all Python is evolving good. Maybe we could >>slow a little down, please? > > I'm trying. I'm really trying. Please give me some credit. I checked your soundness. How much do you want? -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From s_lott@yahoo.com Fri Jun 21 13:27:22 2002 From: s_lott@yahoo.com (Steven Lott) Date: Fri, 21 Jun 2002 05:27:22 -0700 (PDT) Subject: [Python-Dev] strptime recapped In-Reply-To: Message-ID: <20020621122722.44222.qmail@web9601.mail.yahoo.com> time2 might be a good place to include the nifty Reingold-Dershowitz "rata die" date numbering; it can be converted back and forth among a vast number of widely used calendars: Julian, Gregorian, Hebrew, old and new Hindu, Chinese (given enough floating-point accuracy), Astronomical Julian, and several others. Generally, "Julian" dates are really just the day number within a given year; this is a simple special case of the more general (and more useful) approach that R-D use. See http://emr.cs.iit.edu/home/reingold/calendar-book/index.shtml for more information. --- Brett Cannon wrote: > I have written the callout to strptime.strptime (strptime is > SF patch > #474274) as Guido asked. Since that was the current hold-up > and the > thread has gone dormant, I figured I should summarize the > discussion up to > this point. > > 1) what is the need?: > The question was raised why this was done. The answer was > that since time > is just a wrapper around time.h, strptime was not guaranteed > since it is > not a part of ANSI C. Some ANSI C libraries include it, > though (like > glibc), because it is so useful. Unfortunately Windows and OS > X do not > have it. Having it in Python means it is completely portable > and no > longer reliant on the ANSI C library being kind enough to > provide it. > > 2) strftime dependence: > Some people worried about the dependence upon strftime for > calculating > some info.
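(For reference, the "rata die" numbering is easy to compute from a Gregorian date; a rough sketch, counting day 1 as January 1 of year 1 in the proleptic Gregorian calendar:)

```python
def rata_die(year, month, day):
    # Reingold-Dershowitz fixed day number: 1 = 1 January of year 1,
    # proleptic Gregorian calendar.
    days_before_month = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    y = year - 1
    return (365 * y + y // 4 - y // 100 + y // 400  # days in all prior years
            + days_before_month[month - 1]          # days in prior months this year
            + (1 if leap and month > 2 else 0)      # leap day already passed?
            + day)

assert rata_die(2001, 1, 1) == 730486  # a commonly cited fixed-day value
```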
But since strftime is guaranteed to be there by > Python (since > it is a part of ANSI C), the dependence is not an issue. > > 3) locale info for dates: > Skip and Guido pointed out that calendar.py now generates the > names of > the weekdays and months on the fly similar to my solution. So > I did go > ahead and use it. But Skip pointed out that perhaps we should > centralize > any code that calculates locale info for dates (calendar.py's > names and my > code for figuring out format for date/time). I had suggested > adding it to > the locale module and Guido responded that Martin had to ok > that. Martin > hasn't responded to that idea. > > 4) location of strptime: > Skip asked why Guido was having me write the callout patch to > timemodule.c. He wondered why Lib/time.py wasn't just created > holding my > code and then renaming timemodule.c to _timemodule.c and > importing it at > the end of time.py. No response has been given thus far for > that. > > I also suggested a possible time2 where things like strptime, > my helper > fxns (calculate the Julian date from the Gregorian date, > etc.), and things > such as naivetime could be kept. That would allow time to > stay as a > straight wrapper to time.h while all bonus code gets relegated > to time2. > Guido said it might be a good idea but would have to wait > until he got > back from vacation. > > > That pretty much sums up everything to this point; hope I got > it right and > didn't miss anything. > > -Brett C. > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev ===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd. __________________________________________________ Do You Yahoo!? Yahoo! 
- Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From guido@python.org Fri Jun 21 13:43:28 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 08:43:28 -0400 Subject: [Python-Dev] Vacation Message-ID: <200206211243.g5LChT524426@pcp02138704pcs.reston01.va.comcast.net> I should mention that I'm going on vacation. I'm not going to run a vacation program -- I've had poor results with those in the past, and I expect to be sporadically checking email. But I don't expect to be responding to python-dev mail until I'm back. I'm leaving later today and plan to return Monday July 8, back at work the 9th. Of course, I'll see a bunch of you all at EuroPython! I'm looking forward to it -- it sounds like it's gonna be a great conference! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 21 13:50:54 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 08:50:54 -0400 Subject: [Python-Dev] Weird problem with exceptions raised in extension module In-Reply-To: Your message of "Fri, 21 Jun 2002 18:36:23 +1200." <200206210636.g5L6aNU06187@oma.cosc.canterbury.ac.nz> References: <200206210636.g5L6aNU06187@oma.cosc.canterbury.ac.nz> Message-ID: <200206211250.g5LCosp24749@pcp02138704pcs.reston01.va.comcast.net> > Reading the C API docs led me to believe that the > equivalent of the Python statement > > raise x > > would be > > PyErr_SetNone(x) > > But it appears that is not the case, and what I > should actually be doing is > > PyErr_SetObject( > ((PyInstanceObject*)x)->in_class, x) > > This is... um... not very intuitive. Perhaps the > C API docs could be amended to mention this? I guess so. The rule is that all PyErr_SetXXX functions correspond to a raise statement with a class as first argument. raise with an instance first argument is a shortcut. > Also, it looks as if exceptions have to be > old-style instances, not new-style ones. Is > that correct? 
Unfortunately so in the current code base. I'm not sure if/when we should lift this restriction. I'm also not sure if, when we lift it, we should make Exception and all other built-in exceptions new-style classes. New-style and classic classes aren't 100% compatible and I don't like to break people's code who have subclassed a built-in exception class and did something that doesn't work the same in new-style classes. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 21 13:59:47 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 08:59:47 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Fri, 21 Jun 2002 09:38:05 +0200." References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net> > But his biggest remaining "general peeve" struck me hard the other > day, exactly because that's not something he "heard", but an > observation he came up with all by himself, by reasonably unbiased > examination of "Python as she's spoken". "I wouldn't mind Python so > much" (I'm paraphrasing, but that IS the kind of grudging-compliment > understatement he did use:-) "except that there's always so MANY > deuced ways to do everything -- can't they just pick one and STICK > with it?!". In the widespread subtext of most Python discourse this > might sound like irony, but in his case, it was just an issue of > fact (compared, remember, with SMALL languages such as Limbo -- > bloated ones such as, e.g., C++, are totally *outside* his purview > and experience) -- a bewildering array of possible variations. > Surely inevitable when viewed diachronically (==as an evolution over > time), but his view, like that of anybody who comes to Python anew > today, is synchronic (==a snapshot at one moment).
> > I don't think there's anything we can do to AVOID this phenomenon, > of course, but right now I'm probably over-sensitized to the > "transition costs" of introducing "yet one more way to do it" by > this recent episode. So, it appears to me that REDUCING the > occurrence of such perceptions is important. AFAIK Limbo has a very small user base (and its key designer is much more arrogant than your average BDFL even :-). It's much easier to withstand the pressure to add features in that case. And lately, most new features have been better ways to do things you could already do, but only clumsily. That would add to his impressions. Plus, inevitably, that not everybody at Strakt uses the same coding style. I understand the sentiment, but users are like this: they all want you to stop adding features except the one thing they absolutely need. (Myhrvold) --Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Fri Jun 21 14:35:36 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 21 Jun 2002 09:35:36 -0400 Subject: [Python-Dev] Re: String substitution: compile-time versus runtime In-Reply-To: <3D1297ED.3990C30F@prescod.net> References: <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net> <3D1297ED.3990C30F@prescod.net> Message-ID: [Paul Prescod] > I think that what I hear you saying is that interpolation should ideally > be done at a compile time for simple uses and at runtime for i18n. > [...] If "%" survives, it would be used for positional parameters, > instead of named parameters. [...] I think we are making progress > if we're coming to understand that the two different problem domains > (simple scripts versus i18n) have different needs and that there is > probably no one solution that fits both. [Moore, Paul] > The internationalisation issue is clearly important. 
However, it has > very different characteristics insofar as the template string is (of > necessity) handled at runtime, so issues of compilation and security > become relevant. I'm no I18N expert, so I can't comment on details, > but I *do* think it's worth separating out the I18N issues from the > "simple interpolation" issues... You know, the ultimate goal of internationalisation, for a non-English-speaking user and even programmer, is to see his/her own language all over the screen. This means from the shell, from the system libraries, from all applications, big or small, everything. For what is provided by other programmers or maintainers, this may occur sooner or later, depending on the language, the interest of the maintainer, and the development dynamic. The far-reaching hope is that it will eventually occur. For the little things a user/programmer writes himself/herself, and this is where Python pops up, there are two ways. The simplest is to write all strings in the native language. The other way, meant to help exchange with various friends or get feedback from a wider community, is to do things properly, and internationalise even small scripts from the start. It is easy to develop such an attitude, yet currently, examples do not abound. I surely had it for a few languages, even though it was rather demanding on me, at the time `gettext' was not yet available -- and in fact, my works were used to benchmark various ideas before `gettext' was first written. The mantra I repeated all along had two key points: 1) internationalisation will only be successful if designed to be unobtrusive, otherwise average maintainers and implementors will resist it. 2) programmer duties and translation duties are to be kept separate, so these activities could be done asynchronously from one another.[1] I really, really think that with enough and proper care, Python could be set up so internationalisation of Python scripts is just unobtrusive routine.
There should not be one way to write Python when one does not internationalise, and another different way to use it when one internationalises. The full power and facilities of Python should be available at all times, unrelated to internationalisation intents. Non-English people should not have to pay a penalty, or if they do, the penalty should be minimised As Much As Possible. Our BDFL, Guido, should favour internationalisation as a principle in the evolution of the language, that is, more than a random negligible feature. I sincerely hope he will. For many people, internationalisation issues cannot be separated out that simply, or otherwise dismissed. We should rather learn to collaborate at properly addressing and solving them at each evolutionary step, so Python really remains a language for everybody. -------------------- [1] In practice, we've met those two goals only partly. For C programs, the character overhead per localised string is low -- the three characters "_()", while exceptionally _not_ obeying the GNU standard about a space before the opening parenthesis. The glue code is still small -- yet not as small as I would have wanted. I wrote the Emacs PO mode so marking strings in a C project can be done rather quickly by maintainers, and so translators can do their job alone. These are on the positive side. But I think we failed at the level of release engineering, as the combined complexity of Automake, Autoconf, Libtool and Gettext installation scripts is merely frightening, and very discouraging for the casual user. There were reasons behind "releng" choices, but they would make a long story. :-) Also, people in the development allowed more fundamental unneeded complexities, which had the sad effect of anchoring the original plans to the point of being stuck. On the other hand, people not understanding where we were aiming are happily unaware of what we are missing. (Maintainers may become incredibly stubborn, when having erections.
:-) Eh, that's life... Sigh! Python can do better on _all_ fronts. By the way, I hope that `distutils' can be adapted to address internationalisation-related release engineering difficulties, so these merely vanish in practice for Python lovers. We could also have other standard helper tools for non-installed scripts. -- François Pinard http://www.iro.umontreal.ca/~pinard From aleax@aleax.it Fri Jun 21 14:38:46 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 15:38:46 +0200 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net> References: <3D121F0D.E3B60865@prescod.net> <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Friday 21 June 2002 02:59 pm, Guido van Rossum wrote: ... > AFAIK Limbo has a very small user base (and its key designer is much > more arrogant than your average BDFL even :-). It's much easier to > withstand the pressure to add features in that case. And lately, most This is probably correct (both arrogance and a small user base help:-). However, see at end as for how a largish user base (of a certain kind) may actually HELP. But I'm not particularly concerned with comparisons of Python to Limbo, but rather with the general issue of how Python can be perceived (quite apart from any issue of spin or how to present it - by somebody who's not inclined to listen to spin or presentation but prefers to see things for himself). > new features have been better ways to do things you could already do, > but only clumsily. Yes, to some extent that's inevitable. Once a language is (net of finite-storage issues) Turing-complete, in a very real sense EVERYTHING is one of those "things you could already do". > That would add to his impressions. Plus, > inevitably, that not everybody at Strakt uses the same coding style. 
We try hard to avoid such "code ownership" issues, with pair programming, frequent refactoring, strong consensus-based coding-style guidelines, and so on. But of course we can't get them down to 0. > I understand the sentiment, but users are like this: they all want you > to stop adding features except the one thing they absolutely need. > (Myhrvold) Actually, I believe a lot of users don't particularly mind there being lots of redundant features around, but presumably THAT sort tends to be selected-against wrt Python-Dev (and Strakt employment), while (e.g.) Perl (or MS employment) might draw them more. Still, as long as you keep in your field of vision the reality that (excluding the selected-against crowd) every new feature you DO add is perceived as a negative by MOST users, I trust you'll keep being extremely selective in deciding what IS truly "absolutely" needed. This is the point I mentioned at the start about effects of user base. Given that the user base is largish AND biased AGAINST featuritis, it should HELP you "withstand the pressure to add features"... if you WANT to withstand it. I.e., you'll mostly get strong support for any stance of "let's NOT add this". You may dislike that when you WANT to add a feature, but surely not when it's about "withstanding the pressure". Alex From jmiller@stsci.edu Fri Jun 21 14:42:02 2002 From: jmiller@stsci.edu (Todd Miller) Date: Fri, 21 Jun 2002 09:42:02 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D132D2A.7080801@stsci.edu> Guido van Rossum wrote: >[Todd Miller] > >>>>There has been some recent interest in the Numeric/numarray community >>>>for using array objects as indices >>>>for builtin sequences. 
I know this has come up before, but to make >>>>myself clear, the basic idea is to make the >>>>following work: >>>> >>>>class C: >>>> def __int__(self): >>>> return 5 >>>> >>>>object = C() >>>> >>>>l = "Another feature..." >>>> >>>>print l[object] >>>>"h" >>>> >>>>Are there any plans (or interest) for developing Python in this direction? >>>> > >[Guido] > >>>I'm concerned that this will also make floats acceptable as indices >>>(since they have an __int__ method) and this would cause atrocities >>>like >>> >>>print "hello"[3.5] >>> >>>to work. >>> > >[Todd] > >>That makes sense. What if we specifically excluded Float objects from >>the conversion? Are there any types that need to be excluded? If >> ^ other types >> >>there's a chance of getting a patch for this accepted, STSCI is willing >>to do the work. >> > >Hm, an exception for a specific type seems ugly. What if a user > I agree. > >defines a UserFloat type, or a Rational type, or a FixedPoint type, >with an __int__ conversion? > Perry actually suggested excluding instances of any subclass of Float. I see now that there is also the related problem of excluding instances which act like Floats. > >This points to an unfortunate early design flaw in Python (inherited >from C casts): __int__ has two different meanings -- sometimes it >converts the type, sometimes it also truncates the value. > >I suppose you could hack something where you extract x.__int__() and >x.__float__() and compare the two, but that could lead to a lot of >overhead. > Sounds too tricky to me. I'd hate to explain it. > > >I hesitate to propose a new special method, but that may be the only >solution. :-( > I liked MvL's __index__ method. I understand your hesitancy. It's pretty tough exploring new features side-by-side the "version fatigue" thread :) > > >What's your use case? Why do you need this? > This might settle it :) Numeric/numarray arrays are sometimes used in reduction operations (e.g. max) which eliminate one dimension. 
Sometimes the result is a zero dimensional array, which is currently converted to a Python scalar in both Numeric and numarray. The conversion to scalar enables integer zero dimensional results to be used as indices, but causes other problems since any auxilliary information in the array (e.g. type = Int8) is lost. Adding some form of implicit conversion to index value might permit us to retain zero dimensional objects as arrays. > >--Guido van Rossum (home page: http://www.python.org/~guido/) > > >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev > From Jack.Jansen@cwi.nl Fri Jun 21 14:48:10 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 21 Jun 2002 15:48:10 +0200 Subject: [Python-Dev] Python strptime In-Reply-To: Message-ID: <85F02CFE-851D-11D6-8310-0030655234CE@cwi.nl> On Tuesday, June 18, 2002, at 07:30 , Martin v. Loewis wrote: > I wonder what the purpose of having a pure-Python implementation of > strptime is, if you have to rely on strftime. Is this for Windows only? MacPython would benefit as well: it also has strftime() but not strptime(). There's currently a pure-python implementation of strptime() in the Contrib folder, but Brett's solution would better (the Contrib strptime can't be incorporated because it's under GPL license, and the automatic callout to Python would be really nice as it makes this user-invisible). -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Fri Jun 21 14:56:27 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 09:56:27 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Fri, 21 Jun 2002 15:38:46 +0200." 
References: <3D121F0D.E3B60865@prescod.net> <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> > This is the point I mentioned at the start about effects of user > base. Given that the user base is largish AND biased AGAINST > featuritis, it should HELP you "withstand the pressure to add > features"... if you WANT to withstand it. I.e., you'll mostly get > strong support for any stance of "let's NOT add this". You may > dislike that when you WANT to add a feature, but surely not when > it's about "withstanding the pressure". I really have to start packing :-), but I've got one more thing to say. You say that the user base is biased against featuritis. Yet the user base is the largest source of new feature requests and proposals. How do you reconcile these? You yourself pleaded for PEP 246 just an hour ago. Surely that's a big honking new feature! For the user base as a whole, the Myhrvold quote is even more true. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jun 21 14:59:50 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 09:59:50 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: Your message of "Fri, 21 Jun 2002 09:42:02 EDT." <3D132D2A.7080801@stsci.edu> References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> <3D132D2A.7080801@stsci.edu> Message-ID: <200206211359.g5LDxoF25028@pcp02138704pcs.reston01.va.comcast.net> > I liked MvL's __index__ method. I understand your hesitancy. It's > pretty tough exploring new features side-by-side the "version > fatigue" thread :) Yes. > >What's your use case? Why do you need this? > This might settle it :) Numeric/numarray arrays are sometimes used in > reduction operations (e.g.
max) which eliminate one dimension. Sometimes > the result is a zero dimensional array, which is currently converted to > a Python scalar in both Numeric and numarray. The conversion to scalar > enables integer zero dimensional results to be used as indices, but > causes other problems since any auxilliary information in the array > (e.g. type = Int8) is lost. Adding some form of implicit conversion to > index value might permit us to retain zero dimensional objects as arrays. But when you're indexing Numeric/numarray arrays, you have full control over the interpretation of indices, so you can do this yourself. Do you really need to be able to index Python sequences (lists, tuples) with your 0-dimensional arrays? Could you live with having to call int() in those cases? --Guido van Rossum (home page: http://www.python.org/~guido/) From pinard@iro.umontreal.ca Fri Jun 21 14:58:37 2002 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard) Date: 21 Jun 2002 09:58:37 -0400 Subject: [Python-Dev] Re: String substitution: compile-time versus runtime In-Reply-To: <3D1297ED.3990C30F@prescod.net> References: <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net> <3D1297ED.3990C30F@prescod.net> Message-ID: [Alex Martelli] > If %(name)s is to be deprecated moving towards Python-3000 (surely it > can't be _removed_ before then), $-formatting needs a very rich feature > set; otherwise it can't _replace_ %-formatting. [...] The "transition" > period will thus inevitably offer different ways to perform the same > tasks [...] the old way and the new way MUST both work together for a > good while to allow migration. [Moore, Paul] > I feel that the existing % formatting operator cannot realistically > be removed. I too, like Alex and Paul, have a hard time believing that `%' will effectively fade out in favour of `$'. 
As a few people tried to stress (Alex did very well with his anecdote), changes in Python are welcome when they add real new capabilities, but they are less welcome when they merely add diversity over old substance: the language is then hurt each time, losing bits of simplicity (and even legibility, through the development of Python subsets in user habits). Each individual loss may be seen as insignificant when discussed separately[1], but when the pace of change is high, the losses accumulate, especially if the cleanup does not occur. This is why any change in current string interpolation should be crafted so it fits _very_ naturally with what already exists, and does not look like another feature patched over other features. A forever "transition" period between two interpolation paradigms, foreign to one another, might give exactly that bad impression. -------------------- [1] This is one of the drawbacks of the PEP system. By concentrating on individual features, we lose the vision of all features taken together. Only Guido has a global vision. :-) -- François Pinard http://www.iro.umontreal.ca/~pinard From aleax@aleax.it Fri Jun 21 15:19:50 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 16:19:50 +0200 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> References: <3D121F0D.E3B60865@prescod.net> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Friday 21 June 2002 03:56 pm, Guido van Rossum wrote: ... > You say that the user base is biased against featuritis. Yet the user > base is the largest source of new feature requests and proposals. How > do you reconcile these?
I hypothesized that, because of self-selection effects, Python's user base (particularly on Python-Dev and in Python-only firms) is biased against featuritis _when compared_ to the general population, which (I opine) includes a wider proportion of people who don't particularly mind a language having many redundant features. There is obviously nothing that needs to be "reconciled" between this hypothesized sample-bias and the observation that requests for features come more from the user base than from (who else would you expect them to come FROM -- people who've never HEARD about Python...?-). So, you're either joking or subject to a common and quite understandable "statistical fallacy". Never forget Bayes's Theorem...!-) > You yourself pleaded for PEP 246 just an hour > ago. Surely that's a big honking new feature! I prefer to think of it as a framework that lets most type-casts, type-tests, special purpose type-conversion methods, and the like, be avoided WITHOUT adding a zillion little ad-hoc features. But of course you could choose to view it differently:-). Alex From skip@pobox.com Fri Jun 21 15:26:35 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 21 Jun 2002 09:26:35 -0500 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> Message-ID: <15635.14235.79608.390983@beluga.mojam.com> Greg> should we keep the existing bsddb around as oldbsddb for users in Greg> that situation? Martin> I don't think so; users could always extract the module from Martin> older distributions if they want to.
I would prefer the old version be moved to lib-old (or Modules-old?). For people still running DB 2.x it shouldn't be a major headache to retrieve. Skip From guido@python.org Fri Jun 21 15:45:57 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 10:45:57 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Fri, 21 Jun 2002 16:19:50 +0200." References: <3D121F0D.E3B60865@prescod.net> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net> > I hypothesized that, because of self-selection effects, Python's > user base (particularly on Python-Dev and in Python-only firms) is > biased against featuritis _when compared_ to the general > population, which (I opine) includes a wider proportion of people > who don't particularly mind a language having many redundant > features. There is obviously nothing that needs to be "reconciled" > between this hypothesized sample-bias and the observation that > requests for features come more from the user base than from (who > else would you expect them to come FROM -- people who've never HEARD > about Python...?-). So, you're either joking or subject to a common > and quite understandable "statistical fallacy". Never forget > Bayes's Theorem...!-) I dunno. Most feature proposals come from the c.l.py crowd, and that's also the place where the loudest clamor for a stop to the featuritis was heard. And I believe that even those who consider themselves strongly anti-featuritis still have one or two pet features that they really need (even self-proclaimed arch-conservative Christian Tismer, who went so far as to develop his own version of the language because he couldn't get his pet feature adopted). > > You yourself pleaded for PEP 246 just an hour > > ago. Surely that's a big honking new feature!
> > I prefer to think of it as a framework that lets most type-casts, > type-tests, special purpose type-conversion methods, and the like, > be avoided WITHOUT adding a zillion little ad-hoc features. But of > course you could choose to view it differently:-). Surely it would be a dramatic change, probably deeper than new-style classes and generators together. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@tismer.com Fri Jun 21 16:10:10 2002 From: tismer@tismer.com (Christian Tismer) Date: Fri, 21 Jun 2002 17:10:10 +0200 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D1341D2.9020805@tismer.com> Guido van Rossum wrote: ... > And I believe that even those who consider themselves strongly > anti-featuritis still have one or two pet features that they really > need (even self-proclaimed arch-conservative Christian Tismer, who > went so far as to develop his own version of the language because he > couldn't get his pet feature adopted). ROTFLMAO yes, how could I forget. :-) Well, the real story is this: I created some problems and some solutions, made people dependent on this, and now I make my living from maintenance work. >>I prefer to think of it as a framework that lets most type-casts, >>type-tests, special purpose type-conversion methods, and the like, >>be avoided WITHOUT adding a zillion little ad-hoc features. But of >>course you could choose to view it differently:-). > > Surely it would be a dramatic change, probably deeper than new-style > classes and generators together. Sounds as if this would both be very powerful and might shrink the code base at the same time. One of the things I like the best. have-to-start-packing,-too - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From guido@python.org Fri Jun 21 16:19:32 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 11:19:32 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Your message of "Fri, 21 Jun 2002 17:10:10 +0200." <3D1341D2.9020805@tismer.com> References: <3D121F0D.E3B60865@prescod.net> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net> <3D1341D2.9020805@tismer.com> Message-ID: <200206211519.g5LFJWl25427@pcp02138704pcs.reston01.va.comcast.net> > > Surely it would be a dramatic change, probably deeper than new-style > > classes and generators together. > > Sounds as if this would both be very powerful and > might shrink the code base at the same time. > One of the things I like the best. See? Given sufficiently clever presentation, even the most conservative users can be made to want new things. Advertisers know this, of course. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jun 21 16:28:42 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 11:28:42 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <200206201730.g5KHUlP04117@odiug.zope.com> Message-ID: [Guido] > ... > (The main problem with `...` is that many people can't distinguish > between ` and ', as user testing has shown.) Including Tim testing, which is dear to my heart. The editor I usually use allows defining styles (font, size, color, etc) for syntactic elements, and for Python files I set it up so that the backtick has its own style, 1.5x bigger than all other characters. 
This makes it very easy to see the backticks as such, but mostly(!) because it forces extra vertical space above a line containing one. That's more emphasis than even a Tim needs. OTOH, I have no trouble seeing lowercase "x" . xabcx==repr(abc)-ly y'rs - tim From tim.one@comcast.net Fri Jun 21 16:33:14 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 11:33:14 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <200206201746.g5KHkwH04175@odiug.zope.com> Message-ID: [Guido, quotes Christian] >> The following statements are ordered by increasing hate. >> 1 - I do hate the idea of introducing a "$" sign at all. >> 2 - giving "$" special meaning in strings via a module >> 3 - doing it as a builtin function >> 4 - allowing it to address local/global variables [and adds] > Doesn't 4 contradict your +1 on allvars()? Since Christian's reply only increased the apparent contradiction, allow me to channel: they are ordered by increasing hate, but starting at the bottom. s/increasing/decreasing/ in his original, or s/hate/love/, and you can continue to read it in the top-down Dutch way . From tismer@tismer.com Fri Jun 21 16:53:36 2002 From: tismer@tismer.com (Christian Tismer) Date: Fri, 21 Jun 2002 17:53:36 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: Message-ID: <3D134C00.2090205@tismer.com> Tim Peters wrote: > [Guido, quotes Christian] > >>>The following statements are ordered by increasing hate. >>>1 - I do hate the idea of introducing a "$" sign at all. >>>2 - giving "$" special meaning in strings via a module >>>3 - doing it as a builtin function >>>4 - allowing it to address local/global variables >> > > [and adds] > >>Doesn't 4 contradict your +1 on allvars()? > > > Since Christian's reply only increased the apparent contradiction, allow me > to channel: they are ordered by increasing hate, but starting at the > bottom. 
s/increasing/decreasing/ in his original, or s/hate/love/, and you > can continue to read it in the top-down Dutch way . Huh? Reading from top to bottom, as I am used to, I see increasing numbers, which are in the same order as the "increasing hate" (not a linear function, but the same ordering). 4 - allowing it to address local/global variables is what I hate the most. This is in no contradiction to allvars(), which is simply a function that puts some variables into a dict, thereby liberating the interpolation from variable access. Where is the problem, please? -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From aleax@aleax.it Fri Jun 21 16:55:34 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 17:55:34 +0200 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net> References: <3D121F0D.E3B60865@prescod.net> <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Friday 21 June 2002 04:45 pm, Guido van Rossum wrote: ...[re PEP 246]... > Surely it would be a dramatic change, probably deeper than new-style > classes and generators together. Rarely does one catch Guido (or most any Dutch, I believe) in such a wild overbid. Heat getting to you?-) Protocol-Adaptation is (I believe) a nice idea, but somewhat of a marginal one when compared e.g.
to new-style classes (a change whose consequences still haven't finished propagating through the language -- witness the recent issues about making exception classes new-style vs keeping them classic) -- and most evidently so if you ALSO add generators to that side of the scales!-) Alex From jmiller@stsci.edu Fri Jun 21 17:00:27 2002 From: jmiller@stsci.edu (Todd Miller) Date: Fri, 21 Jun 2002 12:00:27 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> <3D132D2A.7080801@stsci.edu> <200206211359.g5LDxoF25028@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3D134D9B.7030601@stsci.edu> Guido van Rossum wrote: >>I liked MvL's __index__ method. I understand your hesitancy. It's >>pretty tough exploring new features side-by-side the "version >>fatigue" thread :) >> > >Yes. > >>>What's your use case? Why do you need this? >>> > >>This might settle it :) Numeric/numarray arrays are sometimes used in >>reduction operations (e.g. max) which eliminate one dimension. Sometimes >>the result is a zero dimensional array, which is currently converted to >>a Python scalar in both Numeric and numarray. The conversion to scalar >>enables integer zero dimensional results to be used as indices, but >>causes other problems since any auxiliary information in the array >>(e.g. type = Int8) is lost. Adding some form of implicit conversion to >>index value might permit us to retain zero dimensional objects as arrays. >> > >But when you're indexing Numeric/numarray arrays, you have full >control over the interpretation of indices, so you can do this >yourself. > Yes. We do this now in numarray. >Do you really need to be able to index Python sequences >(lists, tuples) with your 0-dimensional arrays? > We want to.
Here's why:

1) Currently, when you fully reduce or subscript a numarray, you get back a python scalar. This has the disadvantages that:

   a. information (type=Int8) is lost
   b. precision (Float128 --> types.FloatType) can be lost.
   c. subsequent code must handle multiple types:

        result = some_array_operation()
        if result in PythonScalarTypes:
            do_it_the_scalar_way()
        else:
            do_it_the_array_way()

2) 0-D arrays can be represented as simple scalars using __repr__. This creates a convenient illusion that a 0-D array is just a number. 0-D arrays solve all of the problems cited in 1. But 0-D arrays introduce one new problem:

   a. 0-D arrays don't work as builtin sequence indices, destroying the illusion that what lies at the bottom of an array is just a number.

If a fix for this was conceivable, we'd be willing to do the frontend work to make it happen.

>Could you live with
>having to call int() in those cases?
>
Yes and no. I think there are two disadvantages:

   a. There is a small notational overhead, and the need to remember it. In terms of the illusion that a 0-D array is a number, this will be a point of confusion.

   b. Once int() is added, the semantics of the code are narrower than they used to be. The same code called with an array as the sequence, might otherwise accept an index to be an array. Once int() is used, this can no longer reasonably happen since int(multi_valued_array) should raise an exception.

Thanks for your attention on this thread. Have a nice vacation!
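Todd's wish here is essentially what later shipped in Python 2.5 as PEP 357's __index__ slot. A toy sketch of the idea (ZeroDim is a made-up stand-in for a 0-d array, not numarray code):

```python
# Toy sketch of the __index__ idea (standardized later as PEP 357).
# ZeroDim is a hypothetical 0-d array stand-in: it keeps its type
# information, yet still works directly as a sequence index.
class ZeroDim:
    def __init__(self, value, typecode='Int8'):
        self.value = value
        self.typecode = typecode      # extra info that int() would discard

    def __index__(self):
        # Called by the interpreter whenever an integer index is required.
        return self.value

    def __repr__(self):
        return repr(self.value)       # the "just a number" illusion

x = [10, 20, 30]
i = ZeroDim(1)
print(x[i])           # 20 -- no explicit int() call needed
print(i.typecode)     # Int8 -- nothing was lost along the way
```

With this hook, builtin sequences accept the wrapper wherever they would accept an int, while int(multi_valued_array) can still raise for genuinely multi-valued arrays.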
Todd > >--Guido van Rossum (home page: http://www.python.org/~guido/) > > >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev > -- Todd Miller jmiller@stsci.edu STSCI / SSG (410) 338 4576 From python@rcn.com Fri Jun 21 16:59:05 2002 From: python@rcn.com (Raymond Hettinger) Date: Fri, 21 Jun 2002 11:59:05 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: Message-ID: <002f01c2193c$925a0360$a5f8a4d8@othello> > [Guido, quotes Christian] > >> The following statements are ordered by increasing hate. > >> 1 - I do hate the idea of introducing a "$" sign at all. > >> 2 - giving "$" special meaning in strings via a module > >> 3 - doing it as a builtin function > >> 4 - allowing it to address local/global variables > > [and adds] > > Doesn't 4 contradict your +1 on allvars()? > [Tim] > Since Christian's reply only increased the apparent contradiction, allow me > to channel: they are ordered by increasing hate, but starting at the > bottom. s/increasing/decreasing/ in his original, or s/hate/love/, and you > can continue to read it in the top-down Dutch way . 
template = [
    '$linenum - I do $feeling the idea of introducing the "$$" sign at all.',
    '$linenum - giving "$$" special meaning in strings via a module',
    '$linenum - doing it as a builtin function',
    '$linenum - allowing it to address local/global variables',
]
feeling = 'hate'
if 'Dutch' in options:
    feeling = 'love'
    template = template[::-1]          # cool new feature
print 'The following statements are ordered by increasing $feeling.'.sub()
for cnt, line in enumerate(template):  # cool new feature
    linenum = cnt+1    # still wish enumerate had an optional start arg
    print line.sub()   # aspiring cool new feature

'regnitteh dnomyar'[::-1] From mclay@nist.gov Fri Jun 21 16:57:50 2002 From: mclay@nist.gov (Michael McLay) Date: Fri, 21 Jun 2002 11:57:50 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: References: Message-ID: <200206211157.50972.mclay@nist.gov> On Thursday 20 June 2002 06:48 pm, Ka-Ping Yee wrote: > On Thu, 20 Jun 2002, Oren Tirosh wrote: > > See http://tothink.com/python/embedpp > > Hi Oren, > > Your proposal brings up some valid concerns with PEP 215: > > 1. run-time vs. compile-time parsing > 2. how to decide what's an expression > 3. balanced quoting instead of $ > I like Oren's PEP as a replacement for PEP 292. But there is one major problem with his notation. I would change the "`" character to something more readable. I tried examples with "@", "$", "%", "!", and "?". My preference was "?", "@", or "$". (The choice should consider the ease of typing on international keyboards.) The "?" seems like a good choice because the replacement expression will answer the question of what will appear in the string at that location. Here is Oren's example using the "?" to quote the expression.

print e"X=?x?, Y=?calc_y(x)?."

The following example is provided for contrast. It has a larger text to variable substitution ratio.

p = e"""A new character prefix "e" is defined for strings. This prefix precedes the 'u' and 'r' prefixes, if present.
Capital 'E' is also acceptable. Within an e-string any ?expressions? enclosed in backquotes are evaluated, converted to strings using the equivalent of the ?str()? function and embedded in-place into the e-string."""

In the larger body of text the "?" is clearly visible. I'm not so sure I like the "?" in the smaller example. It may be because the "?" looks too much like letters that can appear in a variable name. The "@" stands out a bit better than "?". This is probably because there are more pixels turned on and the character is fatter.

print e"X=@x@, Y=@calc_y(x)@."

p = e"""A new character prefix "e" is defined for strings. This prefix precedes the 'u' and 'r' prefixes, if present. Capital 'E' is also acceptable. Within an e-string any @expressions@ enclosed in backquotes are evaluated, converted to strings using the equivalent of the @str()@ function and embedded in-place into the e-string."""

The function of the "$" would be recognizable to people migrating from other languages, but it would be used as a balanced quote, rather than as a starting character in a variable that will be substituted. (Is this character easy to type on non-US keyboards? I thought the "$" was one of the characters that are replaced on European keyboards.) If the "@" is available on international keyboards then I think it would be a better choice.

print e"X=$x$, Y=$calc_y(x)$."

p = e"""A new character prefix "e" is defined for strings. This prefix precedes the 'u' and 'r' prefixes, if present. Capital 'E' is also acceptable.
Within an e-string any $expressions$ enclosed in backquotes are evaluated, converted to strings using the equivalent of the $str()$ function and embedded in-place into the e-string.""" From skip@pobox.com Fri Jun 21 17:36:26 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 21 Jun 2002 11:36:26 -0500 Subject: [Python-Dev] strptime recapped In-Reply-To: References: Message-ID: <15635.22026.914241.398242@beluga.mojam.com> Brett> 4) location of strptime: Brett> Skip asked why Guido was having me write the callout patch to Brett> timemodule.c. He wondered why Lib/time.py wasn't just created Brett> holding my code and then renaming timemodule.c to _timemodule.c Brett> and importing it at the end of time.py. No response has been Brett> given thus far for that. This is what's keeping me from going further. I did run the test suite against the latest version with no problem. I think making the current time module call out to a new strptime module is the wrong way to do things, especially given past practice (socket/_socket, string/strop, etc). I would prefer a time.py module be created to hold Brett's strptime function. On import, the last thing it would try doing is to import * from _time, which would obliterate Brett's Python version if the platform supports strptime(). Brett> I also suggested a possible time2 where things like strptime, my Brett> helper fxns (calculate the Julian date from the Gregorian date, Brett> etc.), and things such as naivetime could be kept. That's well beyond the scope of this patch. I'd rather not address it at this point (at least not on this thread). I'd prefer to just focus on how best to add strptime() to platforms without a libc version. 
Skip From skip@pobox.com Fri Jun 21 17:42:52 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 21 Jun 2002 11:42:52 -0500 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: <15635.22412.414350.293111@beluga.mojam.com> >> Now back to $ vs. %. I think I can defend having both in the >> language, but only if % is reduced to the positional version (classic >> printf). This would be used mostly to format numerical data with >> fixed column width. There would be very little overlap in use cases: Overlap or not, you wind up with two things that look very much alike doing nearly identical things. -1... Skip From guido@python.org Fri Jun 21 17:58:11 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 12:58:11 -0400 Subject: [Python-Dev] strptime recapped In-Reply-To: Your message of "Fri, 21 Jun 2002 11:36:26 CDT." <15635.22026.914241.398242@beluga.mojam.com> References: <15635.22026.914241.398242@beluga.mojam.com> Message-ID: <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> > This is what's keeping me from going further. I did run the test > suite against the latest version with no problem. I think making > the current time module call out to a new strptime module is the > wrong way to do things, especially given past practice > (socket/_socket, string/strop, etc). I would prefer a time.py > module be created to hold Brett's strptime function. On import, the > last thing it would try doing is to import * from _time, which would > obliterate Brett's Python version if the platform supports > strptime(). That's only a good idea if Brett's Python code has absolutely no features beyond the C version. I'm -0 on the time.py idea -- it seems it would churn things around more than absolutely necessary. But you're right about the socket precedent. 
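The module layout Skip proposes, and Guido is -0 on, can be sketched as follows. The strptime body is a placeholder, not Brett's actual code, and _time stands for the hypothetically renamed C module:

```python
# Sketch of the proposed time.py layout: define the pure-Python fallback
# first, then let the C module, where the platform provides one, shadow
# it via "import *".  The body below is a stand-in, not Brett's code.

def strptime(string, format='%a %b %d %H:%M:%S %Y'):
    """Stand-in for Brett's pure-Python strptime()."""
    return ('pure-python', string, format)

try:
    from _time import *    # C versions win wherever libc has strptime()
except ImportError:
    pass                   # no C module: the Python fallback stays bound

print(strptime('Fri Jun 21 10:45:57 2002')[0])
```

This mirrors the socket/_socket precedent Skip cites; Guido's counterpoint is that the fallback should survive under its own name when it has features the C version lacks, rather than being silently obliterated by the wildcard import.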
--Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Fri Jun 21 18:16:49 2002 From: python@rcn.com (Raymond Hettinger) Date: Fri, 21 Jun 2002 13:16:49 -0400 Subject: [Python-Dev] Behavior of buffer() Message-ID: <005001c21947$6eb31e00$adb53bd0@othello> I would like to solicit py-dev's thoughts on the best way to resolve a bug, www.python.org/sf/546434 . The root problem is that mybuf[:] returns a buffer type and mybuf[2:4] returns a string type. A similar issue exists for buffer repetition. One way to go is to have the slices always return a string. If code currently relies on the type of a buffer slice, it is more likely to be relying on it being a string as in: print mybuf[:4]. This is an intuitive guess because I can't find empirical evidence. Another reason to choose a string return type is that buffer() appears to have been designed to be as stringlike as possible so that it can be easily substituted in code originally designed for strings. The other way to go is to return a buffer object every time. Slices usually, but not always (see subclasses of list), return the same type that was being sliced. If we choose this route, another issue remains -- mybuf[:] returns self instead of a new buffer. I think that behavior is also a bug and should be changed to be consistent with the Python idiom where:

b = a[:]
assert id(a) != id(b)

Incidental to the above, GvR had a thought that slice repetition ought to always return an error. Though I don't see any use cases for buffer repetition, buffer objects do implement all other sequence behaviors and I think it would be weird to nullify the sq_repeat slot. I appreciate your thoughts on the best way to proceed.
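As it turned out, buffer's eventual successor in Python 3, memoryview, settled both questions the second way Raymond describes: every slice, including [:], returns a new object of the same type.

```python
# How Python 3's memoryview (buffer's successor) answers both questions:
# slices are uniformly memoryviews, and [:] yields a fresh object.
mv = memoryview(b'abcdef')

full = mv[:]
part = mv[2:4]

print(type(full) is memoryview, type(part) is memoryview)  # True True
print(full is mv)     # False: [:] is a new object, matching b = a[:]
print(bytes(part))    # b'cd'
```

Both views still share the underlying memory with the original bytes object, so the copy is of the view, not of the data.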
fixing-bugs-is-easier-than-deciding-appropriate-behavior-ly yours, 'regnitteh dnomyar'[::-1] From skip@pobox.com Fri Jun 21 18:29:59 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 21 Jun 2002 12:29:59 -0500 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: References: <3D121F0D.E3B60865@prescod.net> <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15635.25239.56486.554175@beluga.mojam.com> Alex> This is the point I mentioned at the start about effects of user Alex> base. Given that the user base is largish AND biased AGAINST Alex> featuritis, it should HELP you "withstand the pressure to add Alex> features"... if you WANT to withstand it. I.e., you'll mostly get Alex> strong support for any stance of "let's NOT add this". You may Alex> dislike that when you WANT to add a feature, but surely not when Alex> it's about "withstanding the pressure". Alex, I think you're missing one point. As the Python user base grows, even though the majority of people are comfortable with the status quo, most of them are silent most of the time, more people who do want some changes are added to the mix, and more people with strident voices who want some changes are available. Guido isn't cloned to keep up with the increasing user base, however. (I'm obviously picking numbers out of thin air in what follows.) If you go from 100,000 users, 100 of whom would like their favorite bit from the last language they used added to Python, and 1 of whom is a crackpot who just won't take "no" for an answer, to 1,000,000 users, you probably have 10 crackpots and 1,000 less strident voices now clamoring for change. You also probably have multiple proposals for similar changes (like string interpolation - everybody has their favorite scheme, whether it's $name, ${name}, %(name)s, or <>). You still have just one BDFL, however. He has more inputs to consider, and has to figure out who among the much larger masses are the crackpots.
And some of the arguments, whether they come from crackpots or not, are fairly convincing. Makes it tougher to resist change. Skip From joe@notcharles.ca Fri Jun 21 18:54:37 2002 From: joe@notcharles.ca (Joe Mason) Date: Fri, 21 Jun 2002 12:54:37 -0500 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions Message-ID: <20020621175437.GA1213@plover.net> Tim wrote: > [Guido, quotes Christian] > >> The following statements are ordered by increasing hate. > >> 1 - I do hate the idea of introducing a "$" sign at all. > >> 2 - giving "$" special meaning in strings via a module > >> 3 - doing it as a builtin function > >> 4 - allowing it to address local/global variables > > [and adds] > > Doesn't 4 contradict your +1 on allvars()? > > Since Christian's reply only increased the apparent contradiction, allow me > to channel: they are ordered by increasing hate, but starting at the > bottom. s/increasing/decreasing/ in his original, or s/hate/love/, and you > can continue to read it in the top-down Dutch way . If you'll allow me to counter-channel: Christian hates giving this special syntax form access to local/global variables, since it's a security risk that's not apparent unless you know what you're looking for. He prefers to use allvars() to achieve the same end, since it's explicit. He's not opposed to variable access in general. Write-only variables don't tend to find much use. Joe From David Abrahams" <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <145001c2194c$ac570c30$6601a8c0@boostconsulting.com> From: "Alex Martelli" > > You yourself pleaded for PEP 246 just an hour > > ago. Surely that's a big honking new feature! > > I prefer to think of it as a framework that lets most type-casts, type-tests, > special purpose type-conversion methods, and the like, be avoided WITHOUT > adding a zillion little ad-hoc features. 
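For readers without PEP 246 at hand, the adaptation machinery Alex describes boils down to a single adapt() function driven by two hooks. A rough sketch (simplified relative to the PEP's reference implementation; the demo classes are made up):

```python
# Rough sketch of PEP 246 adaptation: adapt(obj, protocol) asks first the
# object (__conform__), then the protocol (__adapt__), for an adapter.
class AdaptationError(TypeError):
    pass

def adapt(obj, protocol):
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj                     # already satisfies the protocol
    conform = getattr(obj, '__conform__', None)
    if conform is not None:
        adapter = conform(protocol)    # does the object know how to conform?
        if adapter is not None:
            return adapter
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        adapter = adapt_hook(obj)      # does the protocol know how to adapt?
        if adapter is not None:
            return adapter
    raise AdaptationError('cannot adapt %r to %r' % (obj, protocol))

# Demo: Text doesn't subclass FileLike, but can conform to it on request.
class FileLike:
    pass

class Reader:
    def __init__(self, data):
        self.data = data
    def read(self):
        return self.data

class Text:
    def __init__(self, data):
        self.data = data
    def __conform__(self, protocol):
        if protocol is FileLike:
            return Reader(self.data)

print(adapt(Text('spam'), FileLike).read())   # spam
```

Note that both hooks live on the participating classes themselves, which is exactly the point of contention in this thread: whether a third party can register an adapter without touching either class.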
Such a strong endorsement from you made me go take a cursory look; I think I'd be -1 on this in its current form. It seems like an intrusive mechanism in that it forces the adapter or the adaptee to know how to do the job. Given libraries A and B, can I do something to allow them to interoperate without modifying them? Conversely, is there a reasonably "safe" way to add adaptations to an existing type from the outside? I'm thinking of some analogy to specialization of traits in C++, here. -Dave From oren-py-d@hishome.net Fri Jun 21 19:09:03 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 21 Jun 2002 14:09:03 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <200206211157.50972.mclay@nist.gov> References: <200206211157.50972.mclay@nist.gov> Message-ID: <20020621180903.GA66506@hishome.net> On Fri, Jun 21, 2002 at 11:57:50AM -0400, Michael McLay wrote: > On Thursday 20 June 2002 06:48 pm, Ka-Ping Yee wrote: > > On Thu, 20 Jun 2002, Oren Tirosh wrote: > > > See http://tothink.com/python/embedpp > > > > Hi Oren, > > > > Your proposal brings up some valid concerns with PEP 215: > > > > 1. run-time vs. compile-time parsing > > 2. how to decide what's an expression > > 3. balanced quoting instead of $ > > > > I like Oren's PEP as a replacement for PEP 292. But there is one major > problem with his notation. I would change the "`" character to something > more readable. Expression embedding, unlike interpolation, is done at compile time. This would make it natural to use the same prefix used for inserting other kinds of special stuff into strings at compile-time - the backslash. print "X=\(x), Y=\(calc_y(x))." No need for double backslash. No need for a special string prefix either because \( currently has no meaning. 
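Oren's compile-time \(expression) embedding is, in hindsight, very close to what Python later adopted as f-strings (PEP 498, Python 3.6), with braces in place of the backslash-parens. His example in that later syntax (the calc_y body is made up, since the thread never defines it):

```python
# Oren's example rendered with f-string syntax (PEP 498): expressions are
# parsed at compile time and str()-converted into the surrounding string.
def calc_y(x):
    return x * 3 + 1    # hypothetical body; calc_y is unspecified above

x = 7
print(f"X={x}, Y={calc_y(x)}.")   # X=7, Y=22.
```

As with the \( proposal, the embedded expressions are part of the literal's compiled code, so there is no run-time parsing and no question of which names are reachable beyond ordinary scoping rules.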
Oren From skip@pobox.com Fri Jun 21 19:19:12 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 21 Jun 2002 13:19:12 -0500 Subject: [Python-Dev] strptime recapped In-Reply-To: <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> References: <15635.22026.914241.398242@beluga.mojam.com> <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15635.28192.396040.21104@beluga.mojam.com> I am getting more and more frustrated with the way things are going here lately. At this point I would be more than happy to pass off anything that's assigned to me to other people and just unsubscribe from python-dev. I feel like there are enormous contradictions in the way different changes to Python are being addressed. If you want to take over any of these bugs or patches, feel free: 411881 Use of "except:" in modules 569574 plain text enhancement for cgitb 542562 clean up trace.py 474274 Pure Python strptime() (PEP 42) 541694 whichdb unittest >> I would prefer a time.py module be created to hold Brett's strptime >> function. On import, the last thing it would try doing is to import >> * from _time, which would obliterate Brett's Python version if the >> platform supports strptime(). Guido> That's only a good idea if Brett's Python code has absolutely no Guido> features beyond the C version. I don't understand what you mean. Guido's probably gone by now. Perhaps someone can channel him. I am clearly missing something obvious, but I don't see any support for the argument that having the existing time module call out to a separate Python module makes a lot of sense. (Other than the fact that it comes from the BDFL, of course.) 
If we put Brett's changes into time.py (I'd argue that initially all we want is strptime(), but can live with the other stuff assuming it's tested - after all, it has to go somewhere), then

    from _time import *

at the bottom, the only thing to be eliminated would be his version of strptime(), and only if the platform libc didn't support it. The Gregorian/Julian date stuff would remain. If you don't want them exposed in time, just prefix them with underscores and don't add them to time.__all__. The original patch

    http://python.org/sf/474274

was just about adding strptime() to the time module. All PEP 42 asked for was

    Add a portable implementation of time.strptime() that works in clearly defined ways on all platforms.

All the other stuff is secondary in my mind to making time.strptime() universally available and should be dealt with separately. If performance isn't a big issue (I doubt it will be most of the time), I can see dropping the C version of time.strptime altogether. I still think the best way to add new stuff which is written in Python to the time module is to have time.py be the front-end module and have it import other stuff from a C-based _time module. Skip From guido@python.org Fri Jun 21 19:49:19 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 14:49:19 -0400 Subject: [Python-Dev] strptime recapped In-Reply-To: Your message of "Fri, 21 Jun 2002 13:19:12 CDT." <15635.28192.396040.21104@beluga.mojam.com> References: <15635.22026.914241.398242@beluga.mojam.com> <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> <15635.28192.396040.21104@beluga.mojam.com> Message-ID: <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net> (I'm still here, for maybe another hour.) > I am getting more and more frustrated with the way things are going here > lately. At this point I would be more than happy to pass off anything > that's assigned to me to other people and just unsubscribe from python-dev.
> I feel like there are enormous contradictions in the way different changes > to Python are being addressed. This sounds like a reference to something I've said but I don't get it. > If you want to take over any of these bugs or patches, feel free:
>
> 411881 Use of "except:" in modules
> 569574 plain text enhancement for cgitb
> 542562 clean up trace.py
> 474274 Pure Python strptime() (PEP 42)
> 541694 whichdb unittest

From time to time we all get frustrated. I, too, wish things would move along quicker. One thing that may not be obvious is that most of PythonLabs' resources (myself to some extent excluded) have been consumed by Zope projects recently, significantly reducing the time we can spend on moving Python projects along. This is part of our deal with Zope Corp: they pay our salaries, we have to spend over half our time on Zope Corp projects. That's on average: sometimes we spend weeks or more exclusively on Python stuff, other times we spend weeks working on Zope Corp stuff nearly full time. > >> I would prefer a time.py module be created to hold Brett's strptime > >> function. On import, the last thing it would try doing is to import > >> * from _time, which would obliterate Brett's Python version if the > >> platform supports strptime(). > > Guido> That's only a good idea if Brett's Python code has absolutely no > Guido> features beyond the C version. > > I don't understand what you mean. Guido's probably gone by now. Perhaps > someone can channel him. I am clearly missing something obvious, but I > don't see any support for the argument that having the existing time module > call out to a separate Python module makes a lot of sense. (Other than the > fact that it comes from the BDFL, of course.) I meant that if Brett's code has useful features not found in the standard strptime, it should be available explicitly (for those who want the extra features) and not be overwritten by the C version even if it exists.
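The import dance Skip and Guido are debating can be sketched in a few lines. Everything here is illustrative (the module layout and the stub body are hypothetical, not the eventual patch): the pure-Python function keeps an explicit name of its own, and a trailing wildcard-style import lets a platform C version take over the public name where one exists.

```python
# Hypothetical time.py front-end: define the portable version first,
# keep an explicit handle on it, then let the C implementation (if any)
# shadow the public name -- the analogue of `from _time import *`.

def strptime(data_string, format="%a %b %d %H:%M:%S %Y"):
    """Portable pure-Python implementation (stub for illustration)."""
    raise NotImplementedError("Brett's portable parser would live here")

python_strptime = strptime  # stays reachable even when C wins below

try:
    from time import strptime  # stand-in for `from _time import *`
except ImportError:
    pass  # no C version on this platform: the fallback stays bound

parsed = strptime("21 Jun 2002", "%d %b %Y")
```

This satisfies Guido's condition: the Python version remains available explicitly, while client code transparently gets the fastest implementation the platform offers.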
I'm not sure what your objection is against calling out to Python from C. We do it all the time, e.g. in PyErr_Warn(). I guess my objection (of -0 strength) against renaming time to _time is that you'd have to fix a dozen or so build recipes for all sorts of exotic platforms. The last time something like this was done (for new, a much less popular module than time) the initial change set broke the Windows build, and I think I saw Mac build changes for this issue checked in today or yesterday. We can avoid all that if the time module calls out to strptime.py. > If we put Brett's changes into time.py (I'd argue that initially all we want > is strptime(), but can live with the other stuff assuming it's tested - > after all, it has to go somewhere), then > > from _time import * > > at the bottom, the only thing to be eliminated would be his version of > strptime(), and only if the platform libc didn't support it. The > Gregorian/Julian date stuff would remain. If you don't want them exposed in > time, just prefix them with underscores and don't add them to time.all. It's not that I don't want to expose them. I haven't seen them, so I don't know how useful they are. However (as I have tried to point out a few times now in response to proposed changes to calendar.py) I plan to introduce the new datetime type that's currently living in nondist/sandbox/datetime/, either the Python version or the C version if we find time to finish it. This has all the date/time calculations you want, can represent years from AD 0 till 9999 (we can easily extend it if that's not enough :-) and I would like all code in need of date/time calculations to be based on this rather than grow more ad-hoc approaches to doing essentially the same. > The original patch > > http://python.org/sf/474274 > > was just about adding strptime() to the time module. All PEP 42 asked for > was > > Add a portable implementation of time.strptime() that works in clearly > defined ways on all platforms. 
> > All the other stuff is secondary in my mind to making time.strptime() > universally available and should be dealt with separately. Correct. > If performance isn't a big issue (I doubt it will be most of the time), I > can see dropping the C version of time.strptime altogether. I still think > the best way to add new stuff which is written in Python to the time module > is to have time.py be the front-end module and have it import other stuff > from a C-based _time module. I hope I can dissuade you from this. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jun 21 20:06:47 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 21 Jun 2002 14:06:47 -0500 Subject: [Python-Dev] strptime recapped In-Reply-To: <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net> References: <15635.22026.914241.398242@beluga.mojam.com> <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> <15635.28192.396040.21104@beluga.mojam.com> <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15635.31047.68516.959914@beluga.mojam.com> Guido> I guess my objection (of -0 strength) against renaming time to Guido> _time is that you'd have to fix a dozen or so build recipes for Guido> all sorts of exotic platforms. The last time something like this Guido> was done (for new, a much less popular module than time) the Guido> initial change set broke the Windows build, and I think I saw Mac Guido> build changes for this issue checked in today or yesterday. Okay, I can understand that issue. Still, that is a mere ripple compared to the longer term ramifications of taking a wrong turn by adding a strptime module that might turn out to be more-or-less an orphan. You have to consider: * Is strptime even the right name for it? I doubt it. Only us C-heads would think that was a good name. 
* If you create a strptime (or timeparse or parsedate) module should it really have exposed functions named julianFirst, julianToGreg or gregToJulian? Ignore the studly caps issue (sorry Brett, I don't think they fit in with normal naming practice in the Python core library) and just consider the functionality. Guido> We can avoid all that if the time module calls out to Guido> strptime.py. But it seems to me that it would be an even bigger step to add a new module to Lib, which, as it now sits, would probably only provide a single useful function. >> If we put Brett's changes into time.py (I'd argue that initially all >> we want is strptime(), but can live with the other stuff assuming >> it's tested - after all, it has to go somewhere), then >> >> from _time import * >> >> at the bottom, the only thing to be eliminated would be his version >> of strptime(), and only if the platform libc didn't support it. The >> Gregorian/Julian date stuff would remain. If you don't want them >> exposed in time, just prefix them with underscores and don't add them >> to time.__all__. Guido> It's not that I don't want to expose them. I haven't seen them, so I Guido> don't know how useful they are. Guido> However (as I have tried to point out a few times now in response Guido> to proposed changes to calendar.py) I plan to introduce the new Guido> datetime type that's currently living in Guido> nondist/sandbox/datetime/, either the Python version or the C Guido> version if we find time to finish it. Right. Which is another reason I think we shouldn't just plop a strptime module into Lib. There is more going on with time issues than just adding time.strptime(). Creating a Python-based time module seems less intrusive to me at the user level than creating a new module you will wind up supporting for a long time. >> If performance isn't a big issue (I doubt it will be most of the time), I >> can see dropping the C version of time.strptime altogether.
I still >> think the best way to add new stuff which is written in Python to the >> time module is to have time.py be the front-end module and have it >> import other stuff from a C-based _time module. Guido> I hope I can dissuade you from this. Likewise. ;-) It's clear that there is a lot of semi-related stuff going on related to timekeeping and time calculations. Maybe the best course is simply to hold off on Brett's patch for the time being and consider it in the context of all the other stuff (your datetime object, Brett's Gregorian/Julian functions, etc). Skip From guido@python.org Fri Jun 21 20:26:55 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Jun 2002 15:26:55 -0400 Subject: [Python-Dev] strptime recapped In-Reply-To: Your message of "Fri, 21 Jun 2002 14:06:47 CDT." <15635.31047.68516.959914@beluga.mojam.com> References: <15635.22026.914241.398242@beluga.mojam.com> <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> <15635.28192.396040.21104@beluga.mojam.com> <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net> <15635.31047.68516.959914@beluga.mojam.com> Message-ID: <200206211926.g5LJQuC26703@pcp02138704pcs.reston01.va.comcast.net> > Guido> I guess my objection (of -0 strength) against renaming > Guido> time to _time is that you'd have to fix a dozen or so > Guido> build recipes for all sorts of exotic platforms. The > Guido> last time something like this was done (for new, a much > Guido> less popular module than time) the initial change set > Guido> broke the Windows build, and I think I saw Mac build > Guido> changes for this issue checked in today or yesterday. [Skip] > Okay, I can understand that issue. Still, that is a mere ripple > compared to the longer term ramifications of taking a wrong turn by > adding a strptime module that might turn out to be more-or-less an > orphan. You have to consider: > > * Is strptime even the right name for it? I doubt it.
Only us > C-heads would think that was a good name. It's already called strptime in the time module. :-) > * If you create a strptime (or timeparse or parsedate) module > should it really have exposed functions named julianFirst, > julianToGreg or gregToJulian? Ignore the studly caps issue > (sorry Brett, I don't think they fit in with normal naming > practice in the Python core library) and just consider the > functionality. I think it shouldn't, see my argument about the datetime type. > Guido> We can avoid all that if the time module calls out to > Guido> strptime.py. > > But it seems to me that it would be an even bigger step to add a new > module to Lib, which, as it now sits, would probably only provide a > single useful function. IMO a new module in Lib is a much smaller step than renaming an existing built-in module. New modules get added all the time. > >> If we put Brett's changes into time.py (I'd argue that > >> initially all we want is strptime(), but can live with the > >> other stuff assuming it's tested - after all, it has to go > >> somewhere), then > >> > >> from _time import * > >> > >> at the bottom, the only thing to be eliminated would be his > >> version of strptime(), and only if the platform libc didn't > >> support it. The Gregorian/Julian date stuff would remain. > >> If you don't want them exposed in time, just prefix them with > >> underscores and don't add them to time.all. > > Guido> It's not that I don't want to expose them. I haven't > Guido> seen them, so I don't know how useful they are. > > Guido> However (as I have tried to point out a few times now in > Guido> response to proposed changes to calendar.py) I plan to > Guido> introduce the new datetime type that's currently living > Guido> in nondist/sandbox/datetime/, either the Python version > Guido> or the C version if we find time to finish it. > > Right. Which is another reason I think we shouldn't just plop a > strptime module into Lib. 
There is more going on with time issues > than just adding time.strptime(). Creating a Python-based time > module seems less intrusive to me at the user level than creating a > new module you will wind up supporting for a long time. I guess we just disagree. The datetime type does *not* have parsing capability, so we still need a strptime. > >> If performance isn't a big issue (I doubt it will be most of > >> the time), I can see dropping the C version of time.strptime > >> altogether. I still think the best way to add new stuff > >> which is written in Python to the time module is to have > >> time.py be the front-end module and have it import other > >> stuff from a C-based _time module. > > Guido> I hope I can dissuade you from this. > > Likewise. ;-) It's clear that there is a lot of semi-related stuff > going on related to timekeeping and time calculations. Maybe the > best course is simply to hold off on Brett's patch for the time > being and consider it in the context of all the other stuff > (your datetime object, Brett's Gregorian/Julian functions, etc). Yes, holding off until I have the time to work on datetime and review Brett's patch seems wise. Apologies for Brett. --Guido van Rossum (home page: http://www.python.org/~guido/) From hbl@st-andrews.ac.uk Fri Jun 21 20:31:27 2002 From: hbl@st-andrews.ac.uk (Hamish Lawson) Date: Fri, 21 Jun 2002 20:31:27 +0100 Subject: [Python-Dev] Provide a Python wrapper for any new C extension Message-ID: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> One of the arguments put forward against renaming the existing time module to _time (as part of incorporating a pure-Python strptime function) is that it could break some builds. Therefore I'd suggest that it could be a useful principle for any C extension added in the future to the standard library to have an accompanying pure-Python wrapper that would be the one that client code would usually import.
Hamish Lawson From mal@lemburg.com Fri Jun 21 20:41:00 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 21 Jun 2002 21:41:00 +0200 Subject: [Python-Dev] Provide a Python wrapper for any new C extension References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> Message-ID: <3D13814C.2040708@lemburg.com> Hamish Lawson wrote: > One of the arguments put forward against renaming the existing time > module to _time (as part of incorporating a pure-Python strptime > function) is that it could break some builds. Therefore I'd suggest that > it could be a useful principle for any C extension added in the future > to the standard library to have an accompanying pure-Python wrapper that > would be the one that client code would usually import. Sounds like a plan :-) BTW, this reminds me of the old idea to move that standard lib into a package, eg. 'python'... from python import time. We should at least reserve such a name RSN so that we don't run into problems later on. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From paul@prescod.net Fri Jun 21 20:54:53 2002 From: paul@prescod.net (Paul Prescod) Date: Fri, 21 Jun 2002 12:54:53 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206211157.50972.mclay@nist.gov> <20020621180903.GA66506@hishome.net> Message-ID: <3D13848D.345119F7@prescod.net> Oren Tirosh wrote: > >... > No need for double backslash. No need for a special string prefix either > because \( currently has no meaning. I like this idea but note that \( does have a current meaning: >>> "\(" '\\(' >>> "\(" =="\\(" 1 I think this is weird but it is inherited from C... So it would take time to phase this in. 
First we have to warn about \( and then give people time to find instances of it and change them to \\(. Then we could introduce a new meaning for it. Paul Prescod From smurf@noris.de Fri Jun 21 21:37:26 2002 From: smurf@noris.de (Matthias Urlichs) Date: Fri, 21 Jun 2002 22:37:26 +0200 Subject: [Python-Dev] *Simpler* string substitutions Message-ID: Guido: > publishers often turn 'foo' into `foo' It gets worse. The opposite of ` isn't ' -- it's ´. Besides, these are apostrophes and not quotes. _Real_ symmetric quotes are " " or « » or “ ” or ' ' or ‘ ’ or ..., but you can't use any of these with just ASCII. Apple's MPW Shell language played with some of these. Anyway, I agree that real languages use ${} or $WORD and that formatting is best done with ${NAME:format}. Personally, the "${foo}".sub(foo="bar") syntax (using keyword arguments) looks good and works reasonably well for i18n. A possible simplification would be to use the local+global variables if no arguments are given. > def f(x, y): > return e"The sum of $x and $y is $(x+y)" How would that work with i18n? Proposal: The compiler should translate _"foo is ${foo}" to _("foo is ${foo}") and e"foo is ${foo}" to "foo is ${foo}".sub() >That looks OK to me, especially if it can be combined with u and r to > create unicode and raw strings. > Exactly. > PEP 292 is an attempt to do this *without* involving the parser: > > def f(x, y): > return "The sum of $x and $y is $(x+y)".sub() > > Downsides are that it invites using non-literals as formats, with all > the security aspects, and that its parsing happens at run-time (no big > deal IMO). > You can't do it any other way if you want to use i18nalized strings and formats. Note that some sentences cannot be internationalized without rearranging some parameters...
-- Matthias Urlichs From aleax@aleax.it Fri Jun 21 21:37:21 2002 From: aleax@aleax.it (Alex Martelli) Date: Fri, 21 Jun 2002 22:37:21 +0200 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <145001c2194c$ac570c30$6601a8c0@boostconsulting.com> References: <3D121F0D.E3B60865@prescod.net> <145001c2194c$ac570c30$6601a8c0@boostconsulting.com> Message-ID: On Friday 21 June 2002 07:52 pm, David Abrahams wrote: ... > Such a strong endorsement from you made me go take a cursory look; I think > I'd be -1 on this in its current form. It seems like an intrusive mechanism > in that it forces the adapter or the adaptee to know how to do the job. That's point (e) in the Requirements of the PEP: """ e) When the context knows about the object and the protocol and knows how to adapt the object so that the required protocol is satisfied. This could use an adapter registry or similar method. """ but the reference implementation doesn't go as far as that -- the only thing the PEP has to say about how to implement "third-party, non-invasive adaptation" is this vague mention of an adapter registry. As I recall (the PEP's been stagnant for over a year so my memory of it is getting fuzzy), I had gently nudged Clark Evans at the time about committing to SOME specific form of registry, even if a minimal one (e.g., a dictionary of callables keyed by the pair (protocol, typeofadaptee)). However, do notice that even in its present form it's WAY less invasive than C++'s dynamic_cast<>, which ONLY allows the _adaptee_ to solve things -- and in a very inflexible way, too. With dynamic_cast there's no way the "protocol" can noninvasively "adopt" existing objects, nor can an object have any say about it (e.g. how to disambiguate between multiple inheritance cases). QueryInterface does let the adaptee have an explicit say, but still, the adaptee is the only party consulted.
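The "minimal" registry Alex mentions -- a dictionary of callables keyed by the pair (protocol, type of adaptee) -- takes only a few lines to sketch. The names below (register, adapt, AttrAccess) are invented for illustration and are not the PEP 246 API:

```python
# Third-party, non-invasive adaptation via a simple global registry.
_adapters = {}

def register(protocol, adaptee_type, adapter):
    _adapters[(protocol, adaptee_type)] = adapter

def adapt(obj, protocol):
    if isinstance(obj, protocol):      # already conformant: pass through
        return obj
    adapter = _adapters.get((protocol, type(obj)))
    if adapter is None:
        raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))
    return adapter(obj)

# Example: adapt dict to a made-up AttrAccess protocol without touching
# either dict or AttrAccess themselves.
class AttrAccess:
    def __init__(self, mapping):
        self.__dict__.update(mapping)

register(AttrAccess, dict, AttrAccess)
point = adapt({"x": 1, "y": 2}, AttrAccess)
```

Neither party needs a __conform__ or __adapt__ hook here, which is exactly the third-party case the PEP leaves vague.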
Only Haskell's typeclass, AFAIK, has (among widely used languages and objectmodels) a smooth way to allow noninvasive 3rd party post-facto adaptation (and another couple of small gems too), but I guess it has an easier life because it's compile-time rather than runtime. Reussner et al (http://hera.csse.monash.edu.au/dsse/seminars/2000-12-07-reussner.html) and of course Yellin and Strom (ACM Transactions on Programming Languages and Systems, 19(2):292-333, 1997) may have even better stories, but I think their work (particularly the parts on _automatic_ discovery/adaptation) must still count as research, not yet suitable as a foundation for a language to be widely deployed. So let's not count those here. > Given libraries A and B, can I do something to allow them to interoperate > without modifying them? With the reference implementation proposed in PEP 246 you'd only have a few more strings to your bow than in C++ or COM -- enough to solve many cases, but not all. The adapter registry, even in its simplest form, would, I think, solve all of these cases. > Conversely, is there a reasonably "safe" way to add adaptations to an > existing type from the outside? I'm thinking of some analogy to > specialization of traits in C++, here. If a type is DESIGNED to let itself be extended in this way, no problem, even with the reference implementation. Remember that one of the steps of function adapt is to call the type's __conform__ method, if any, with the protocol as its argument. If the PEP was adopted in its current reference-form (_without_ a global adapter registry), there would still be nothing stopping a clever type from letting 3rd parties extend its adaptability by enriching its __conform__ -- most simply for example by having __conform__ as a last step check a typespecific registry of adapters. 
Code may be clearer...:

    class DummyButExtensibleType(object):
        _adapters = {}
        def addadapter(cls, protocol, adapter):
            cls._adapters[protocol] = adapter
        addadapter = classmethod(addadapter)
        def __conform__(self, protocol):
            adapter = self._adapters.get(protocol)
            if adapter: return adapter(self, protocol)
            raise TypeError

BTW, I think the reference implementation's "# try to use the object's adapting mechanism" section is flawed -- it wouldn't let __conform__ return the object as being conformant to the protocol if the object happened to be false in a boolean context. I think TypeError must be the normal way for a __conform__ method (or an __adapt__ one) to indicate failure -- we can't reject conformant objects that happen to evaluate as false, it seems to me. Alex From smurf@noris.de Fri Jun 21 21:46:52 2002 From: smurf@noris.de (Matthias Urlichs) Date: Fri, 21 Jun 2002 22:46:52 +0200 Subject: [Python-Dev] *Simpler* string substitutions Message-ID: Aahz: > On Thu, Jun 20, 2002, Gustavo Niemeyer wrote: > > > > "Serving HTTP on", sa[0], "port", sa[1], "..." > _("Serving HTTP on ${addr} port ${port}").sub(addr=sa[0], port=sa[1])

    _("Serving HTTP on ${sa[0]} port ${sa[1]}").sub()

or some equivalent syntax. Note that the second way, above, has a distinct disadvantage for the translating person. He or she would probably know what "addr" stands for, but what is a "sa[0]" ??? > This is where current string handling comes up short. What's the > correct way to internationalize this string? Currently, you do the same thing but use % and a dictionary. As I said, equivalent syntax. > What if the person > handling I18N isn't a Python programmer? > The person gets frustrated and bitches at the programmer until the programmer fixes the code...
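As a historical footnote: the keyword-argument substitution style discussed in this thread is essentially what PEP 292 eventually shipped as string.Template in Python 2.4, with substitute() playing the role of the proposed .sub():

```python
from string import Template

# The thread's example, in the form PEP 292 ultimately took.
sa = ("0.0.0.0", 8000)  # sample (address, port) pair
msg = Template("Serving HTTP on ${addr} port ${port}").substitute(
    addr=sa[0], port=sa[1])
```

The explicit keyword names (addr, port) keep the template readable for a translator, which is the i18n point being argued above.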
-- Matthias Urlichs From bac@OCF.Berkeley.EDU Fri Jun 21 22:06:36 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Fri, 21 Jun 2002 14:06:36 -0700 (PDT) Subject: [Python-Dev] strptime recapped In-Reply-To: <200206211926.g5LJQuC26703@pcp02138704pcs.reston01.va.comcast.net> Message-ID: After reading all the email up to this point, [Guido van Rossum] > [Skip] [snip] > > * Is strptime even the right name for it? I doubt it. Only us > > C-heads would think that was a good name. > > It's already called strptime in the time module. :-) > I have to agree with Guido on this one. It might only make sense to people who come from C, but it has always been named this in Python. If the decision is made to go with another module for this code, though, then that is a different story. > > * If you create a strptime (or timeparse or parsedate) module > > should it really have exposed functions named julianFirst, > > julianToGreg or gregToJulian? Ignore the studly caps issue > > (sorry Brett, I don't think they fit in with normal naming > > practice in the Python core library) and just consider the I think you're right. I wrote this code originally after my last final ever in college as an undergraduate and so I was just more interested in relaxing and churning out some good code than being overly proper in fxn naming. =) I will go through and read the Python coding style PEP and clean up my code. [snip] [Big discussion on whether a new module in Lib or just the callout to my Python code from timemodule.c ensued that is beyond my comment since I am so new to this list] > Yes, holding off until I have the time to work on datetime and review > Brett's patch seems wise. Apologies for Brett. It's quite fine with me. I want to see this done right just like everyone else who cares about Python's development. Personally, I am just ecstatic that I am getting to help out in some way.
I feel more like a giddy little kid who is helping out some grown-ups with some important project than a recent college graduate. =) Enjoy your vacation, Guido. And don't leave us, Skip! I know I have greatly appreciated your help both on my patch and your input into all the other threads that have been going on as of late here. -Brett C. From bac@OCF.Berkeley.EDU Fri Jun 21 22:18:39 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Fri, 21 Jun 2002 14:18:39 -0700 (PDT) Subject: [Python-Dev] Provide a Python wrapper for any new C extension In-Reply-To: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> Message-ID: [Hamish Lawson] > One of the arguments put forward against renaming the existing time module > to _time (as part of incorporating a pure-Python strptime function) is that > it could break some builds. Therefore I'd suggest that it could be a useful > principle for any C extension added in the future to the standard library > to have an accompanying pure-Python wrapper that would be the one that > client code would usually import. I am for that, but then again I am biased in this situation. =) But it seems reasonable. I would think everyone who makes any major contribution of code to Python would much rather code it up in Python than C. It would probably help to get more code accepted since I know I felt a little daunted having to write that callout for strptime. The only obvious objection I can see to this is a performance hit for having to go through the Python stub to call the C extension. But I just did a very simple test of calling strftime('%c') 25,000 times from time directly and using a Python stub and it was .470 and .490 secs total respectively according to profile.run(). The other objection I can see is that this would promote coding everything in Python when possible and that might not always be the best solution. Some things should just be coded in C, period.
But I think for such situations that the person writing the code would most likely recognize that fact. Or maybe I am wrong in all of this. I don't know the exact process of how a C extension file gets accepted or what currently leads to an extension file getting a stub. I would (and I am sure anyone else new to the list) really appreciate someone possibly explaining it to me since I would like to know. -Brett C. From oren-py-d@hishome.net Fri Jun 21 22:27:37 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sat, 22 Jun 2002 00:27:37 +0300 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <3D13848D.345119F7@prescod.net>; from paul@prescod.net on Fri, Jun 21, 2002 at 12:54:53PM -0700 References: <200206211157.50972.mclay@nist.gov> <20020621180903.GA66506@hishome.net> <3D13848D.345119F7@prescod.net> Message-ID: <20020622002737.A31767@hishome.net> On Fri, Jun 21, 2002 at 12:54:53PM -0700, Paul Prescod wrote: > > No need for double backslash. No need for a special string prefix either > > because \( currently has no meaning. > > I like this idea but note that \( does have a current meaning: > > >>> "\(" > '\\(' > >>> "\(" =="\\(" > 1 """Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) """ In other words, programs that rely on this behaviour are broken. Oren From David Abrahams" <145001c2194c$ac570c30$6601a8c0@boostconsulting.com> Message-ID: <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com> From: "Alex Martelli" > On Friday 21 June 2002 07:52 pm, David Abrahams wrote: > ... > > Such a strong endorsement from you made me go take a cursory look; I think > > I'd be -1 on this in its current form. It seems like an intrusive mechanism > > in that it forces the adapter or the adaptee to know how to do the job.
> > That's point (e) in the Requirements of the PEP: > > """ > e) When the context knows about the object and the protocol and > knows how to adapt the object so that the required protocol is > satisfied. This could use an adapter registry or similar > method. > """ Oh, sorry I missed that. > However, do notice that even in its present form it's WAY less invasive > than C++'s dynamic_cast<>, which ONLY allows the _adaptee_ to solve things -- > and in a very inflexible way, too. With dynamic_cast there's no way the > "protocol" can noninvasively "adopt" existing objects, nor can an object > have any say about it (e.g. how to disambiguate between multiple > inheritance cases). QueryInterface does let the adaptee have an explicit > say, but still, the adaptee is the only party consulted. I wasn't trying to spark a comparison with C++ here, nor was I talking about runtime-dispatched stuff in C++. I'm not even sure I would call dynamic_cast<> a candidate for this kind of job, at least, not by iteself. I was thinking of the use of template specialization to describe the relationship of a type to a library, e.g. specialization of std::iterator_traits by libB, which makes libA::some_class available for use as an iterator with the standard library (assuming it has some appropriate interface). > Only Haskell's typeclass, AFAIK, has (among widely used languages and > objectmodels) a smooth way to allow noninvasive 3rd party post-facto > adaptation (and another couple of small gems too), but I guess it has an > easier life because it's compile-time rather than runtime. IIUC the same kind of thing can be implemented in C++ templates, if you know where to look. There's been a lot of discussion of how to build variant types lately. -Dave From greg@electricrain.com Fri Jun 21 22:54:44 2002 From: greg@electricrain.com (Gregory P. 
Smith) Date: Fri, 21 Jun 2002 14:54:44 -0700 Subject: [Python-Dev] Re: replacing bsddb with pybsddb's bsddb3 module In-Reply-To: <15635.14235.79608.390983@beluga.mojam.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <15635.14235.79608.390983@beluga.mojam.com> Message-ID: <20020621215444.GB30056@zot.electricrain.com> On Fri, Jun 21, 2002 at 09:26:35AM -0500, Skip Montanaro wrote: > > Greg> should we keep the existing bsddb around as oldbsddb for users in > Greg> that situation? > > Martin> I don't think so; users could always extract the module from > Martin> older distributions if they want to. > > I would prefer the old version be moved to lib-old (or Modules-old?). For > people still running DB 2.x it shouldn't be a major headache to retrieve. This sounds good. Here's what I see on the plate to be done so far:

1) move the existing Modules/bsddbmodule.c to a new Modules-old directory.

2) create a new Lib/bsddb directory containing bsddb3/bsddb3/*.py from the pybsddb project.

3) create a new Modules/bsddb directory containing bsddb3/src/* from the pybsddb project (the files should probably be renamed to _bsddbmodule.c and bsddbmoduleversion.h for consistent naming)

4) place the pybsddb setup.py in the Modules/bsddb directory, modifying it as needed. OR modify the top level setup.py to understand how to build the pybsddb module. (there is code in pybsddb's setup.py to locate the berkeleydb install and determine appropriate flags that should be cleaned up and carried on)

5) modify the top level python setup.py to build the bsddb module as appropriate.
6) "everything else" including integrating documentation and pybsddb's large test suite. Sound correct? How do we want future bsddb module development to proceed? I envision it either taking place 100% under the python project, or taking place as it is now in the pybsddb project with patches being fed to the python project as desired? Any preferences? [i prefer to not maintain the code in two places myself (ie: do it all under the python project)] Greg From tim.one@comcast.net Fri Jun 21 23:48:44 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 18:48:44 -0400 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: Message-ID: [Guido, re PEP 246] > Surely it would be a dramatic change, probably deeper than new-style > classes and generators together. [Alex Martelli] > Rarely does one catch Guido (or most any Dutch, I believe) in such > a wild overbid. Heat getting to you?-) Curiously, I don't think Guido was overstating his belief, but he's got his Python-User's Hat on there, not his Developer-of-Python Hat. While new-style classes cut deeply and broadly in the language implementation, most Python programmers can ignore them (the type/class split bit extension module authors the hardest, and life can be much more pleasant for them now). Protocol adaptation taken seriously would be a fundamental change in the Python Way of Life for users, from "just try it and see whether it works", to "you don't *have* to guess anymore". I think it would make a dramatic difference in the flavor of day-to-day Python programming -- and probably for the better, ignoring speed. 
From tim.one@comcast.net Sat Jun 22 00:01:25 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 19:01:25 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: <3D13848D.345119F7@prescod.net> Message-ID: [Paul Prescod] > I like this idea but note that \( does have a current meaning: > > >>> "\(" > '\\(' > >>> "\(" =="\\(" > 1 > > I think this is weird but it is inherited from C... C89 doesn't define the effect. C99 specifically forbids this treatment, and requires a diagnostic if \( appears. Guido did this originally to make it easier to write Emacsish regexps; the later raw strings were a better solution to that problem, although 99.7% of Python newbies seem to believe that raw strings are an idiot's attempt to make it easier to embed Windows file path literals (newbies -- gotta love 'em ). From tim.one@comcast.net Sat Jun 22 00:18:12 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 19:18:12 -0400 Subject: [Python-Dev] strptime recapped In-Reply-To: <15635.31047.68516.959914@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > * Is strptime even the right name for it? I doubt it. Only > us C-heads would think that was a good name. Given that we're stuck with strftime for date->string, strptime for string->date is better than just about anything else ('f' for 'format', 'p' for 'parse'). > * If you create a strptime (or timeparse or parsedate) module > should it really have exposed functions named julianFirst, > julianToGreg or gregToJulian? No, and definitely not at first. Stick to the original request and this will be sooooo much easier to resolve. As you put it earlier, All PEP 42 asked for was Add a portable implementation of time.strptime() that works in clearly defined ways on all platforms. Cool! Let's do just that much to start, and don't take it as "a reason" to rename the time module either (it really is trivial to add another .py file to Lib! 
give the name a leading underscore if you want to imply it's a helper for something else). From tim.one@comcast.net Sat Jun 22 00:31:00 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 19:31:00 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <3D134C00.2090205@tismer.com> Message-ID: [Tim] >> Since Christian's reply only increased the apparent contradiction, >> allow me to channel: ... [Christian Tismer] > Huh? > Reading from top to bottom, as I used to, I see increasing > numbers, which are in the same order as the "increasing hate" > (not a linear function, but the same ordering). > > 4 - allowing it to address local/global variables > is what I hate the most. > This is in no contradiction to allvars(), which is simply > a function that puts some variables into a dict, therefore > deliberating the interpolation from variable access. > > Where is the problem, please? I was warming up my awesome channeling powers for Guido's impending vacation, and all I can figure is that I must have left them parked in reverse the last time he came back. Nothing a 12-pack of Coke didn't cure, though! I channel that you'll graciously accept my apology . From tim.one@comcast.net Sat Jun 22 00:33:38 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 19:33:38 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <002f01c2193c$925a0360$a5f8a4d8@othello> Message-ID: [Raymond Hettinger] > ... > 'regnitteh dnomyar'[::-1] Is there any chance of ripping this out of the language before someone uses it for real? If not, strings need to grow a .reversed_title_case() method too. it's-bad-enough-we-added-a-reversed_alternating_rot13-method-ly y'rs - tim From bac@OCF.Berkeley.EDU Sat Jun 22 00:47:22 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Fri, 21 Jun 2002 16:47:22 -0700 (PDT) Subject: [Python-Dev] strptime recapped In-Reply-To: Message-ID: [Tim Peters] > [Skip Montanaro] [snip] > > Cool! 
Let's do just that much to start, and don't take it as "a reason" to > rename the time module either (it really is trivial to add another .py file > to Lib! give the name a leading underscore if you want to imply it's a > helper for something else). Sounds good to me. Perhaps this is the best solution for Python 2.3 (goes beta mid-July, right?). If we do this should we leave access to the C version of strptime, or move all calls over to my code? Personally, I say leave it since then any possible differences people might have with their implementation of strptime compared to mine won't affect them. This is not saying that I think there is, though; I have done my best to make sure there is not a deviance. There is also a noticeable performance difference between my implementation and the C version. I have tried to address this the best I could by making locale discovery lazy and by having the re object used for a format string be returnable, so that it can be reused instead of having to be recalculated, but there is still going to be a difference. So basically, I am agreeing with Tim that my module should just be added as Lib/_strptime.py and my callout should just be added to timemodule.c. I will clean up the naming of my helper fxns and add __all__ to only contain strptime to keep it simple. That will get this in for 2.3 and leaves the discussion of where time fxns, data types, etc. are going to live in Python for later. Who would have thought little old me would spark a Timbot response. =) -Brett C.
From tim.one@comcast.net Sat Jun 22 01:03:01 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 21 Jun 2002 20:03:01 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <3D134D9B.7030601@stsci.edu> Message-ID: [Todd Miller, wants to use rank-0 arrays as regular old indices] Here's a sick idea: given Python 2.2, you *could* make the type of a rank-0 array a subclass of Python's int type, making sure (if needed) to copy the value into "the int part" at the start of the struct. Then a rank-0 array would act like an integer in almost all contexts requiring a Python int, including use as a sequence index. The relevant code in the core is if (PyInt_Check(key)) return PySequence_GetItem(o, PyInt_AsLong(key)); in PyObject_GetItem(). PyInt_Check() says "yup!" for an instance of any subclass of int, and PyInt_AsLong() extracts "the int part" out of any instance of any subclass of int. In return, it shifts the burden onto convincing the rest of numarray that the thing is still an array too <0.4 wink>. From sholden@holdenweb.com Fri Jun 21 17:52:16 2002 From: sholden@holdenweb.com (Steve Holden) Date: Fri, 21 Jun 2002 12:52:16 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206182330444.SM01040@mail.python.org> Message-ID: <002d01c2199b$d7e50150$6300000a@holdenweb.com> "Barry A. Warsaw" wrote ... > > I'm so behind on my email, that the anticipated flamefest will surely > die down before I get around to reading it. Yet still, here is a new > PEP. :) > [flamefest-fodder] Seems to me that the volume of comment might imply that string formatting isn't the one obvious best way to do it. Now the flamefest has died down somewhat, the only thing I can see PEP 292 justifying is better documentation for the string-formatting operator. 
"But then I'm an $ptype".sub({"ptype": "old fogey"}) regards Steve ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From aleax@aleax.it Sat Jun 22 08:58:14 2002 From: aleax@aleax.it (Alex Martelli) Date: Sat, 22 Jun 2002 09:58:14 +0200 Subject: [Python-Dev] *Simpler* string substitutions In-Reply-To: <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com> References: <3D121F0D.E3B60865@prescod.net> <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com> Message-ID: On Friday 21 June 2002 11:39 pm, David Abrahams wrote: ... > > That's point (e) in the Requirements of the PEP: > > > > """ > > e) When the context knows about the object and the protocol and > > knows how to adapt the object so that the required protocol is > > satisfied. This could use an adapter registry or similar > > method. > > """ > > Oh, sorry I missed that. Easy to miss because the PEP (I think) makes no further reference to [e], not even to say it's not gonna address it directly. I think the PEP could be enhanced about this (as about the reference implementation's buglet which I already remarked upon). > I was thinking of the use of template specialization to describe the > relationship of a type to a library, e.g. specialization of > std::iterator_traits<libA::some_class> by libB, which makes > libA::some_class available for use as an iterator with the standard library > (assuming it has some appropriate interface). That requires proper design for extensibility in advance -- the standard library must have had the forethought to define, and use everywhere appropriate, std::iterator_traits, AND libA must expose classes that can be plugged into that "slot". As I tried indicating, if you're willing to require design-in-advance for such issues, PEP 246 (together with Python's general mechanisms) already offer what you need.
Allow me to offer an analogy: a Ruby programmer complains to a Python or C++ programmer "your language ain't flexible enough! I have a library X that supplies type X1 and a library Y that consumes any type Y1 which exposes a method Y2 and I want to just add a suitable Y2 to the existing X1 but Python/C++ doesn't let me modify the existing type/class X1". The Python or C++ programmer replies: "well INHERIT from X1 and add method Y2, that's easy". The Ruby programmer retorts: "No use, library X does in umpteen places a 'new X1();' [in C++ terms] so my subclassing won't be picked up" The Python or C++ programmer triumphantly concludes: "Ah that's a design error in X, X should instead use a factory makeX1() and let you override THAT to make your Y2-enriched X1 instead". Yeah right. That's like the airplane manufacturers explaining away most crashes as "pilot error". Perrow's "Normal Accidents" (GREAT book btw) is highly recommended reading, particularly to anybody who still falls for that line. *Humans are fallible* and most often in quite predictable ways: a system that stresses humans just the wrong way is gonna produce "pilot error" over and over AND over again. Wishing for a better Human Being Release 2.0 is just silly. Ain't gonna come and we couldn't afford the upgrade fee if it did:-). Yes, factories and such creational patterns ARE a better long-term answer, BUT there's no denying that Ruby's ability to patch things up with duct tape (while having its own costs, of course!-) can be a short-term lifesaver. "If God had WANTED us to get things right the first time he wouldn't have created duct tape", after all:-). End of analogy... The way I read [e] is more demanding -- allowing some degree of "impedance matching" WITHOUT requiring special forethought by the designers of either library, beyond using adapt rather than typetesting -- just some ingenuity on the third party's part.
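For what it's worth, Python itself does allow the duct-tape fix the hypothetical Ruby programmer wants, at least for ordinary classes: a method can be grafted onto an existing class after the fact. A small sketch, reusing the X1/Y2 placeholder names from the analogy:

```python
class X1:                 # imagine this lives in library X; we can't edit it
    def __init__(self):
        self.payload = 42

def consume(obj):         # imagine library Y requires a y2() method
    return obj.y2()

# Third-party duct tape: attach y2 to the existing class, post-facto.
def _y2(self):
    return self.payload * 2

X1.y2 = _y2

# Works even for instances that library X itself creates, since the
# patch is on the class, not on a subclass.
print(consume(X1()))      # prints 84
```

This only fails for types whose attributes are read-only (e.g. built-ins), which is where the factory/registry debate above actually bites.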
> > Only Haskell's typeclass, AFAIK, has (among widely used languages and > > objectmodels) a smooth way to allow noninvasive 3rd party post-facto > > adaptation (and another couple of small gems too), but I guess it has an > > easier life because it's compile-time rather than runtime. > > IIUC the same kind of thing can be implemented in C++ templates, if you > know where to look. There's been a lot of discussion of how to build > variant types lately. I don't think you can do it without some degree of design forethought, but admittedly I'm starting to get very slightly rusty (haven't designed a dazzling new C++ template in almost six months, after all:-). Alex From oren-py-d@hishome.net Sat Jun 22 11:44:59 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sat, 22 Jun 2002 13:44:59 +0300 Subject: [Python-Dev] PEP 292, Simpler String Substitutions In-Reply-To: ; from ping@zesty.ca on Thu, Jun 20, 2002 at 03:48:52PM -0700 References: <20020620071856.GA10497@hishome.net> Message-ID: <20020622134459.A6918@hishome.net> On Thu, Jun 20, 2002 at 03:48:52PM -0700, Ka-Ping Yee wrote: > Using compile-time parsing, as in PEP 215, has the advantage that it > avoids any possible security problems; but it also eliminates the > possibility of using this for internationalization. Compile-time parsing may eliminate the possibility of using the same mechanism for internationalization, but not the possibility of using the same syntax. A module may provide a function that interprets the same notation at runtime. The runtime version probably shouldn't support full expression embedding - just simple name substitution. > I see this as the key tension in the string interpolation issue (aside > from all the syntax stuff -- which is naturally controversial). And the security vs. ease-of-use issue. 
Oren From mwh@python.net Sat Jun 22 12:10:42 2002 From: mwh@python.net (Michael Hudson) Date: 22 Jun 2002 12:10:42 +0100 Subject: [Python-Dev] strptime recapped In-Reply-To: Brett Cannon's message of "Fri, 21 Jun 2002 14:06:36 -0700 (PDT)" References: Message-ID: <2madpna7rh.fsf@starship.python.net> Brett Cannon writes: > It's quite fine with me. I want to see this done right just like > everyone else who cares about Python's development. Personally, I > am just ecstatic that I am getting to help out in some way. I feel > more like a giddy little kid who is helping out some grown-ups with > some important project than a recent college graduate. =) Feel like being release manager for 2.2.2? Cheers, M. -- at any rate, I'm satisfied that not only do they know which end of the pointy thing to hold, but where to poke it for maximum effect. -- Eric The Read, asr, on google.com From David Abrahams" <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com> Message-ID: <164501c219e0$dacf69b0$6601a8c0@boostconsulting.com> ----- Original Message ----- From: "Alex Martelli" > > I was thinking of the use of template specialization to describe the > > relationship of a type to a library, e.g. specialization of > > std::iterator_traits<libA::some_class> by libB, which makes > > libA::some_class available for use as an iterator with the standard library > > (assuming it has some appropriate interface). > > That requires proper design for extensibility in advance -- the standard > library must have had the forethought to define, and use everywhere > appropriate, std::iterator_traits, AND libA must expose classes that > can be plugged into that "slot". Very true. > As I tried indicating, if you're willing to require design-in-advance for > such issues, PEP 246 (together with Python's general mechanisms) > already offer what you need. Super!
+1 > Yes, factories and such creational patterns ARE a better long-term > answer, BUT there's no denying that Ruby's ability to patch things > up with duct tape (while having its own costs, of course!-) can > be a short-term lifesaver. "If God had WANTED us to get things > right the first time he wouldn't have created duct tape", after all:-). In Alaska, where my wife grew up, they call it "100-mile-an-hour tape" -- good for any use up to 100 mph. [apparently not for ducts, though, even if they're sitting still :(] > > > Only Haskell's typeclass, AFAIK, has (among widely used languages and > > > objectmodels) a smooth way to allow noninvasive 3rd party post-facto > > > adaptation (and another couple of small gems too), but I guess it has an > > > easier life because it's compile-time rather than runtime. > > > > IIUC the same kind of thing can be implemented in C++ templates, if you > > know where to look. There's been a lot of discussion of how to build > > variant types lately. > > I don't think you can do it without some degree of design forethought, > but admittedly I'm starting to get very slightly rusty (haven't designed a > dazzling new C++ template in almost six months, after all:-). Well, I have to admit that I don't have the time to say anything backed up by any research at this point ...I'm currently stuck in a Microsoft gravity well trying to survive the descent... but thanks as always for your educational and broad perspective! 
-Dave From pinard@iro.umontreal.ca Sat Jun 22 14:52:07 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 22 Jun 2002 09:52:07 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <20020622134459.A6918@hishome.net> References: <20020620071856.GA10497@hishome.net> <20020622134459.A6918@hishome.net> Message-ID: [Oren Tirosh] > On Thu, Jun 20, 2002 at 03:48:52PM -0700, Ka-Ping Yee wrote: > > Using compile-time parsing, as in PEP 215, has the advantage that it > > avoids any possible security problems; but it also eliminates the > > possibility of using this for internationalization. > Compile-time parsing may eliminate the possibility of using the same > mechanism for internationalization, but not the possibility of using the > same syntax. Parsing must be done at some time. Maybe the solution lies in finding some way so Python could lazily delay the "compilation" of the string to after its translation (at run-time), when it is known beforehand that a given string is internationalised. The `.pyc' would contain byte-code and data slot for driving the laziness. The translation and compilation should occur only once for a particular string, of course, as the internationalised string may appear within a loop, or within a function which gets called often. In threaded contexts, if we allow for spurious re-compilations once in a long while, and with a simple bit of care, locks could be fully avoided.[1] The good in the above approach is that people would write Python about the same way whether internationalisation is in the picture or not, and would not have to suffer the complexities of "hand" optimisation of string interpolation in internationalised context. It would be simple for _everybody_, on the road meant to make internationalisation a breeze.
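The translate-lazily-and-compile-once scheme sketched above can be illustrated with today's string.Template (which postdates this thread); the catalogue and function names here are hypothetical:

```python
from string import Template

# Toy message catalogue: msgid -> translated template text, per language.
_translations = {
    'fr': {'$name was born in $country': '$name est ne en $country'},
}
_compiled = {}   # (msgid, lang) -> Template, filled lazily

def lazy_gettext(msgid, lang, **vars):
    key = (msgid, lang)
    if key not in _compiled:
        # Translate and "compile" only on first use of this (string,
        # language) pair; untranslated msgids fall back to themselves.
        text = _translations.get(lang, {}).get(msgid, msgid)
        _compiled[key] = Template(text)
    return _compiled[key].substitute(vars)

print(lazy_gettext('$name was born in $country', 'fr',
                   name='Guido', country='France'))
```

Keying the cache on the language code also gives the invalidation behaviour François asks for below: switching languages at run-time simply compiles a fresh entry.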
For Python to know at initial compile time if a string is going to be internationalised or not, it has to be modified, but a positive side of this effort is that internationalisation becomes part of the language design. A possible way towards this (suggested a long while ago) could be to use, beside `eru', some `t' prefix letter asking for translation. Two problems are still to be solved, however. First, going from `_("TEXT")' to `t"TEXT"', the translation function (`_' here) and textual domain should have proper defaults, while offering a way to override them for bigger applications needing finer control or tuning. A simple solution might lie, here, in inventing some special module attribute to that purpose. Second, some applications accept switching national language at run-time. So a mechanism is needed to invalidate lazily-compiled strings when such a switch occurs. An avenue would be to use the national language string code as the "done" flag in the lazy compilation process, allowing recompilation to occur on the fly, as needed. -------------------- [1] Temporarily switching locale-related environment variables in threaded contexts may yield pretty surprising results, this is well-known already. It only stresses, in my opinion, that the design has been frozen without having all the vision it would have taken. Many internationalisation devices implement half-hearted solutions for half-thought problems. I'm not at all asserting that it is possible to foresee everything in advance. Yet, we could be more productive by _not_ slavishly sticking to actual "standards".
-- François Pinard http://www.iro.umontreal.ca/~pinard From aahz@pythoncraft.com Sat Jun 22 15:00:16 2002 From: aahz@pythoncraft.com (Aahz) Date: Sat, 22 Jun 2002 10:00:16 -0400 Subject: [Python-Dev] strptime recapped In-Reply-To: <2madpna7rh.fsf@starship.python.net> References: <2madpna7rh.fsf@starship.python.net> Message-ID: <20020622140016.GA362@panix.com> On Sat, Jun 22, 2002, Michael Hudson wrote: > Brett Cannon writes: >> >> It's quite fine with me. I want to see this done right just like >> everyone else who cares about Python's development. Personally, I >> am just ecstatic that I am getting to help out in some way. I feel >> more like a giddy little kid who is helping out some grown-ups with >> some important project than a recent college graduate. =) > > Feel like being release manager for 2.2.2? Tsk, tsk, let's not burn out Brett before we get some useful code from him. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From bac@OCF.Berkeley.EDU Sat Jun 22 20:13:12 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Sat, 22 Jun 2002 12:13:12 -0700 (PDT) Subject: [Python-Dev] strptime recapped In-Reply-To: <20020622140016.GA362@panix.com> Message-ID: [Aahz] > On Sat, Jun 22, 2002, Michael Hudson wrote: > > Brett Cannon writes: > >> > >> It's quite fine with me. I want to see this done right just like > >> everyone else who cares about Python's development. Personally, I > >> am just ecstatic that I am getting to help out in some way. I feel > >> more like a giddy little kid who is helping out some grown-ups with > >> some important project than a recent college graduate. =) > > > > Feel like being release manager for 2.2.2? > > Tsk, tsk, let's not burn out Brett before we get some useful code from > him. Thanks for watching out for me, Aahz. =) Actually, I think it might be a cool thing to do. I do have the time (taking a year off before I apply to grad school and thus I am unemployed). 
Trouble is that beyond some light reading of timemodule.c, I have no experience with Python's C code, let alone writing my own extensions. Maybe next time. =) Wouldn't mind learning, though. Who knows, maybe I will get sucked into all of this enough to do my master's or PhD thesis on something Python-related (assuming I get into grad school). =) -Brett C. From barry@barrys-emacs.org Sat Jun 22 20:39:19 2002 From: barry@barrys-emacs.org (Barry Scott) Date: Sat, 22 Jun 2002 20:39:19 +0100 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <20020621020725.A9565@ibook.distro.conectiva> Message-ID: <001001c21a24$80a8dbd0$070210ac@LAPDANCE> I think the re module worked correctly. If you write your expression without the ambiguity: yours: "^(?P<a>a)?(?P=a)$" re-1a: "^((?P<a>a)(?P=a))?$" re-2a: "^(?P<a>a?)(?P=a)$" your test data ebc will not match either 'aa' or ''. Try removing the $ so that it will match '' at the start of the string. re-1b: "^((?P<a>a)(?P=a))?" re-2b: "^(?P<a>a?)(?P=a)" I think the re-2b form is the way to deal with the optional quotes. I'm not sure a patch is needed for this. BArry -----Original Message----- From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On Behalf Of Gustavo Niemeyer Sent: 21 June 2002 06:07 To: python-dev@python.org Subject: [Python-Dev] Behavior of matching backreferences Hi everyone! I was studying the sre module, when I came up with the following regular expression: re.compile("^(?P<a>a)?(?P=a)$").match("ebc").groups() The (?P=a) matches with whatever was matched by the "a" group. If "a" is optional and doesn't match, it seems to make sense that (?P=a) becomes optional as well, instead of failing. Otherwise the regular expression above will always fail if the first group fails, even being optional. One could argue that to make it a valid regular expression, it should become "^(?P<a>a)?(?P=a)?".
But that's a different regular expression, since it would match "a", while the regular expression above would match "aa" or "", but not "a". This kind of pattern is useful, for example, to match a string which could be optionally surrounded by quotes, like shell variables. Here's an example of such a pattern: r"^(?P<a>')?((?:\\'|[^'])*)(?P=a)$". This pattern matches "'a'", "\'a", "a\'a", "'a\'a'" and all such variants, but not "'a", "a'", or "a'a". I've submitted a patch to make this work to http://python.org/sf/571976 -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev From niemeyer@conectiva.com Sat Jun 22 21:10:36 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 22 Jun 2002 17:10:36 -0300 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <001001c21a24$80a8dbd0$070210ac@LAPDANCE> References: <20020621020725.A9565@ibook.distro.conectiva> <001001c21a24$80a8dbd0$070210ac@LAPDANCE> Message-ID: <20020622171035.A6004@ibook> > I think the re module worked correctly. > > If you write your expression without the ambiguity: I must confess I see no ambiguity in my expression. > yours: "^(?P<a>a)?(?P=a)$" > re-1a: "^((?P<a>a)(?P=a))?$" Using "aa" was just an example, of course. If I wanted to match "aa" or "", I wouldn't use this at all. > re-2a: "^(?P<a>a?)(?P=a)$" > > your test data ebc will not match either 'aa' or ''. Try removing > the $ so that it will match '' at the start of the string. Sorry, I took the wrong test to paste into the message. > re-1b: "^((?P<a>a)(?P=a))?" > re-2b: "^(?P<a>a?)(?P=a)" > > I think the re-2b form is the way to deal with the optional quotes. > > I'm not sure a patch is needed for this. If you think about a match with more characters, you'll end up in something like "^(?P<a>(abc)?)(?P=a)", instead of "^(?P<a>abc)?(?P=a)".
Besides having a little difference in their meanings (the first m.group(1) is '', and the second is None), it looks like you're working around an existing problem, but you may argue that this opinion is something personal. Thus, my main point here is that using the second regular expression will never work as expected, and there is no point in not fixing it, if that's possible and has already been done. If you find an example where it *should* fail, working as it is now, I promise I'll shut up, and withdraw myself. :-) -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From tismer@tismer.com Sat Jun 22 23:57:52 2002 From: tismer@tismer.com (Christian Tismer) Date: Sun, 23 Jun 2002 00:57:52 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: Message-ID: <3D1500F0.708@tismer.com> Tim Peters wrote: > [Tim] > >>>Since Christian's reply only increased the apparent contradiction, >>>allow me to channel: ... >> > > [Christian Tismer] > >>Huh? >>Reading from top to bottom, as I used to, I see increasing >>numbers, which are in the same order as the "increasing hate" >>(not a linear function, but the same ordering). >> >>4 - allowing it to address local/global variables >>is what I hate the most. >>This is in no contradiction to allvars(), which is simply >>a function that puts some variables into a dict, therefore >>deliberating the interpolation from variable access. >> >>Where is the problem, please? > > > I was warming up my awesome channeling powers for Guido's impending > vacation, and all I can figure is that I must have left them parked in > reverse the last time he came back. Nothing a 12-pack of Coke didn't cure, > though! I channel that you'll graciously accept my apology . Whow! A TPA. Will stick it next to my screen :-) Well, the slightly twisted content of that message shaded its correct logic, maybe.
Meanwhile, I'd like to drop that hate stuff and replace it by a little reasoning: Let's name locals/globals/whatever as "program variables". If there are program variables directly accessible inside strings to be interpolated, then I see possible abuse, if abusers manage to supply such a string in an unforeseen way. For that reason, I wanted to enforce that an explicit dictionary has to be passed as an argument, to remind the programmer that she is responsible for providing access. But at that time, I wasn't considering compile time string parsing. Compile time means the strings containing variable names are evaluated only once, and they behave like constants, cannot be passed in by a later intruder. That sounds pretty cool, although I don't see how this fits with I18n, which needs to change strings at runtime? Maybe it is possible to parse variable names out, replace them with some placeholders, and to do the internationalization after that, still not giving variable access to the final product. Example (now also allowing functions): name1 = "Felix" age1 = 8 name2 = "Hannes" age2 = 17 "My little son $name1 is $age1. $name2 is $(age2-age1) years older.".sub() --> "My little son Felix is 8. Hannes is 9 years older." This string might be translated under the hood into: _ipol = { x1: name1, x2: age1, x3: name2, x4: (age2-age1) } "My little son $x1 is $x2. $x3 is $x4 years older.".sub(_ipol) This string is now safe for further processing. Maybe the two forms should be syntactically different, but what I mean is a compile time transformation, that removes all real variable names in the first place. interpolation-is-by-value-not-by-name - ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From barry@barrys-emacs.org Sun Jun 23 00:40:25 2002 From: barry@barrys-emacs.org (Barry Scott) Date: Sun, 23 Jun 2002 00:40:25 +0100 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <20020622171035.A6004@ibook> Message-ID: <000501c21a46$2ee38210$070210ac@LAPDANCE> I think your re has a bug in it that in python would be "if cond: a = 1" followed by "print a"; python will give an error if cond is false. An re that defines a group conditionally as yours does I think is the same programming error. That's the ambiguity I am referring to, is or is not the named group defined? > If you think about a match with more characters, you'll end up in > something like "^(?P<a>(abc)?)(?P=a)", instead of "^(?P<a>abc)?(?P=a)". > Besides having a little difference in their meanings (the first > m.group(1) is '', and the second is None), it looks like you're > working around an existing problem, but you may argue that this opinion > is something personal. You can prevent groups being remembered using the (?:...) syntax if you need to preserve the group index. So you need: "^(?P<a>(?:abc)?)(?P=a)" I'm not convinced you have found a bug in the engine that needs fixing, I think it's your re that needs changing. I want the re engine to report the error for re that are illogical. BArry From barry@zope.com Sun Jun 23 01:37:23 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Sat, 22 Jun 2002 20:37:23 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15637.6211.441850.925511@anthem.wooz.org> >>>>> "PKO" == Patrick K O'Brien writes: PKO> I guess what I was really wondering is whether that advantage PKO> clearly outweighs some of the possible disadvantages. I'm not a PKO> fan of curly braces and I'll be sad to see more of them in PKO> Python. There's something refreshing about only having curly PKO> braces for dictionaries and parens everywhere else. And PKO> since the existing string substitution uses parens why PKO> shouldn't the new one? Personally, I wouldn't mind it if this syntax took a cue from the make program and accepted both $(name) and ${name} as alternatives to $name (with nested parenthesis/brace matching). -Barry From barry@zope.com Sun Jun 23 01:56:57 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 20:56:57 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> Message-ID: <15637.7385.966341.14847@anthem.wooz.org> >>>>> "CT" == Christian Tismer writes: CT> By no means. allvars() is something like locals() or CT> globals(), just an explicit way to produce a dictionary of CT> variables. I'd be ok with something like allvars() and requiring a dictionary to the .sub() method, /if/ allvars() were a method on a frame object. I really, really do want to write in my i18n programs: def whereBorn(name): country = countryOfOrigin(name) return _('$name was born in $country') I'd be fine if the definition of _() could reach into the frame of whereBorn() and give me a list of all variables, including ones in nested scopes.
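[Editorial note: a minimal sketch of the frame-reaching _() described here, added for illustration only. sys._getframe() stands in for the proposed frame method, and string.Template -- where PEP 292 eventually landed, in Python 2.4 -- stands in for the proposed .sub(). As noted, variables living only in nested-scope cells are invisible to this approach, and the translation-catalog lookup a real _() would do is skipped.]

```python
import sys
from string import Template  # PEP 292's eventual home (Python 2.4+)

def _(message):
    # Merge the caller's globals and locals (locals win): roughly the
    # "allvars()" dictionary discussed in this thread.  Variables held
    # only in nested-scope cells are NOT visible here -- the
    # hypothetical limitation mentioned above.
    caller = sys._getframe(1)
    namespace = dict(caller.f_globals)
    namespace.update(caller.f_locals)
    # A real _() would first look the message up in a catalog;
    # translation is skipped in this sketch.
    return Template(message).substitute(namespace)

def whereBorn(name):
    country = "Brazil"  # stand-in for a hypothetical countryOfOrigin(name)
    return _('$name was born in $country')

print(whereBorn("Lalo"))  # Lalo was born in Brazil
```

The point of the sketch is that the caller never passes a dictionary: _() assembles one from the calling frame, which is exactly the convenience being argued for.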
Actually, that'd be a lot better than what I do now (although truth be told, losing access to nested-scope variables is only a hypothetical limitation in the code I've written). The feature would be useless to me if I had to pass some explicit dictionary into the _() method. It makes writing i18n code extremely tedious. Invariably, the unsafeness of an implicit dictionary happens when strings come from untrusted sources, and your .py file can't be considered untrusted. In those cases, creating an explicit dictionary for interpolation is fine, but they also tend not to overlap with i18n much. -Barry From barry@zope.com Sun Jun 23 02:02:14 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:02:14 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <15633.19790.152438.926329@anthem.wooz.org> Message-ID: <15637.7702.841779.383698@anthem.wooz.org> >>>>> "FP" == François Pinard writes: FP> Saying that PEP 292 rejects an idea because this idea would FP> require another PEP to be debated and accepted beforehand, and FP> then rushing the acceptance of PEP 292 as it stands, is FP> probably missing the point of the discussion. I don't think there's /any/ danger of rushing acceptance of PEP 292. It may not even be accepted at all. still-slogging-through-50-some-odd-messages-ly y'rs, -Barry From barry@zope.com Sun Jun 23 02:12:22 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:12:22 -0400 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <001901c218a0$6158d1c0$070210ac@LAPDANCE> Message-ID: <15637.8310.961687.468635@anthem.wooz.org> >>>>> "BS" == Barry Scott writes: BS> If I'm going to move from %(name)fmt to ${name} I need a place BS> for the fmt format. One of the reasons why I added "simpler" to the PEP is because I didn't want to support formatting characters in the specification.
While admittedly handy for some applications, I submit that most string interpolation simply uses %s or %(name)s and there should be a simpler, less error-prone way of writing that. -Barry From paul@prescod.net Sun Jun 23 02:12:57 2002 From: paul@prescod.net (Paul Prescod) Date: Sat, 22 Jun 2002 18:12:57 -0700 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <20020620071856.GA10497@hishome.net> <20020622134459.A6918@hishome.net> Message-ID: <3D152099.1C1E73FA@prescod.net> Oren Tirosh wrote: > >... > > Compile-time parsing may eliminate the possibility of using the same > mechanism for internationalization, but not the possibility of using the > same syntax. A module may provide a function that interprets the same > notation at runtime. The runtime version probably shouldn't support full > expression embedding - just simple name substitution. I think that there are enough benefits for each form (compile time with expressions, runtime without) that we should expect any final solution to support both. Maybe you guys should merge your PEPs! Paul Prescod From jmiller@stsci.edu Sun Jun 23 02:22:52 2002 From: jmiller@stsci.edu (Todd Miller) Date: Sat, 22 Jun 2002 21:22:52 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ References: Message-ID: <3D1522EC.1070001@stsci.edu> Tim Peters wrote: >[Todd Miller, wants to use rank-0 arrays as regular old indices] > >Here's a sick idea: given Python 2.2, you *could* make the type of a rank-0 > Right now, numarray is a subclass of object for Python-2.2 in order to get properties so it can emulate some of Numeric's attributes. I'm wondering what I'd lose from object in order to pick up int's indexing. I'm also wondering how to make a rank-0 Float array fail as an index. I might try it just to see where it breaks... Thanks! > >array a subclass of Python's int type, making sure (if needed) to copy the >value into "the int part" at the start of the struct.
Then a rank-0 array >would act like an integer in almost all contexts requiring a Python int, >including use as a sequence index. > >The relevant code in the core is > > if (PyInt_Check(key)) > return PySequence_GetItem(o, PyInt_AsLong(key)); > >in PyObject_GetItem(). PyInt_Check() says "yup!" for an instance of any >subclass of int, and PyInt_AsLong() extracts "the int part" out of any >instance of any subclass of int. > >In return, it shifts the burden onto convincing the rest of numarray that >the thing is still an array too <0.4 wink>. > From barry@zope.com Sun Jun 23 02:20:54 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:20:54 -0400 Subject: [Python-Dev] *Simpler* string substitutions References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> Message-ID: <15637.8822.201643.67822@anthem.wooz.org> GvR> Oren made a good point that Paul emphasized: the most common GvR> use case needs interpolation from the current namespace in a GvR> string literal, and expressions would be handy. Oren also GvR> made the point that the necessary parsing could (should?) be GvR> done at compile time. I'll point out that in my experience, while expressions are (very) occasionally handy, you wouldn't necessarily need /arbitrary/ expressions. Something as simple as allowing dotted names only would solve probably 90% of uses, e.g. person = getPerson() print '${person.name} was born in ${person.country}' Not that this can't execute arbitrary code, of course, so the security implications of that would need to be examined. -Barry From barry@zope.com Sun Jun 23 02:36:31 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:36:31 -0400 Subject: [Python-Dev] Re: *Simpler* string substitutions References: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com> Message-ID: <15637.9759.111784.481102@anthem.wooz.org> >>>>> "PM" == Paul Moore writes: PM> 4. Access to variables is also problematic.
Without PM> compile-time support, access to nested scopes is impossible PM> (AIUI). Is this really true? I think it was two IPC's ago that Jeremy and I discussed the possibility of adding a method to frame objects that would basically yield you the equivalent of globals+freevars+locals. -Barry From barry@zope.com Sun Jun 23 02:40:22 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:40:22 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <200206182330444.SM01040@mail.python.org> <002d01c2199b$d7e50150$6300000a@holdenweb.com> Message-ID: <15637.9990.703227.618127@anthem.wooz.org> >>>>> "SH" == Steve Holden writes: SH> Seems to me that the volume of comment might imply that string SH> formatting isn't the one obvious best way to do it. Now the SH> flamefest has died down somewhat, the only thing I can see PEP SH> 292 justifying is better documentation for the SH> string-formatting operator. "But then I'm an SH> $pytpe".sub({"ptype": "old fogey"}) I will soon do an update of the PEP to add a bunch more open issues based on these threads, which I /think/ I've mostly slogged through. -Barry From barry@zope.com Sun Jun 23 02:45:42 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:45:42 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> Message-ID: <15637.10310.131724.556831@anthem.wooz.org> >>>>> "CT" == Christian Tismer writes: CT> If there are program variables directly accessible inside CT> strings to be interpolated, then I see possible abuse, if CT> abusers manage to supply such a string in an unforeseen way. For literal strings in .py files, the only way that's going to happen is if someone you don't trust is hacking your source code, /or/ if you have evil translators sneaking in bogus translation strings. The latter can be solved with a verification step over your message catalogs, while the former I leave as an exercise for the reader. 
:) So still, I trust automatic interpolation of program vars for literal strings, but for strings coming from some other source (e.g. a web form), then yes, you obviously want to be explicit about the interpolation dictionary. -Barry From barry@zope.com Sun Jun 23 02:50:47 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 22 Jun 2002 21:50:47 -0400 Subject: [Python-Dev] PEP 292, Simpler String Substitutions References: <20020620071856.GA10497@hishome.net> <20020622134459.A6918@hishome.net> <3D152099.1C1E73FA@prescod.net> Message-ID: <15637.10615.563601.808178@anthem.wooz.org> >>>>> "PP" == Paul Prescod writes: PP> I think that there are enough benefits for each form (compile PP> time with expressions, runtime without) that we should expect PP> any final solution to support both. Maybe you guys should PP> merge your PEPs! Only two of them are official PEPs currently <294 winks to Oren>. -Barry From tismer@tismer.com Sun Jun 23 03:04:59 2002 From: tismer@tismer.com (Christian Tismer) Date: Sun, 23 Jun 2002 04:04:59 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> Message-ID: <3D152CCB.6010000@tismer.com> Barry A. Warsaw wrote: >>>>>>"CT" == Christian Tismer writes: >>>>> > > CT> If there are program variables directly accessible inside > CT> strings to be interpolated, then I see possible abuse, if > CT> abusers manage to supply such a string in an unforeseen way. > > For literal strings in .py files, the only way that's going to happen > is if someone you don't trust is hacking your source code, /or/ if you > have evil translators sneaking in bogus translation strings. The > latter can be solved with a verification step over your message > catalogs, while the former I leave as an exercise for the reader. :) > > So still, I trust automatic interpolation of program vars for literal > strings, but for strings coming from some other source (e.g. 
a web > form), then yes, you obviously want to be explicit about the > interpolation dictionary. From another reply: > > def whereBorn(name): > country = countryOfOrigin(name) > return _('$name was born in $country') Ok, I'm all with it. For a couple of hours now, I've been riding the following horse: - $name, $(name), $(any expr) is just fine - all of this is compile-time stuff The idea is: Resolve the variables at compile time. Don't provide the feature at runtime. Here's a simple approach (I'm working on a more complicated one, too), assuming an "e" character prefix triggers expression extraction: def whereBorn(name): country = countryOfOrigin(name) return _(e'$name was born in $country') is accepted by the grammar, but turned into the equivalent of: def whereBorn(name): country = countryOfOrigin(name) return _('%(x1)s was born in %(x2)s') % { "x1": name, "x2": country} That is: the $ stuff is extracted, turning the fmt string into something anonymous. Your _() processes it, then the variables are formatted in. This turns the $ stuff completely into syntactic sugar. Any Python expression inside $() is allowed; it is compiled as if it were sitting inside the dict. I also believe it is a good idea to do the _() on the unexpanded string (as shown), since the submitted values are most probably hard to translate at all. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today?
http://www.stackless.com/ From sjmachin@lexicon.net Sun Jun 23 03:15:22 2002 From: sjmachin@lexicon.net (John Machin) Date: Sun, 23 Jun 2002 12:15:22 +1000 Subject: "Julian" ambiguity (was Re: [Python-Dev] strptime recapped) In-Reply-To: <20020621122722.44222.qmail@web9601.mail.yahoo.com> Message-ID: 21/06/2002 10:27:22 PM, Steven Lott wrote: > >Generally, "Julian" dates are really just the day number within >a given year; this is a simple special case of the more general >(and more useful) approach that R-D use. > >See >http://emr.cs.iit.edu/home/reingold/calendar-book/index.shtml > >for more information. > AFAICT from perusing their book, R-D use the term "julian-date" to mean a tuple (year, month, day) in the Julian calendar. The International Astro. Union uses "Julian date" to mean an instant in time measured in days (and fraction thereof) since noon on 1 January -4712 (Julian ("proleptic") calendar). See for example http://maia.usno.navy.mil/iauc19/iaures.html#B1 A "Julian day number" (or "JDN") is generally used to mean an ordinal day number counting day 0 as Julian_calendar(-4712, 1, 1) as above. Some folks use JDN to include the IAU's instant-in-time. Some folks use "julian day" to mean a day within a year (range 0-365 *or* 1-366, all inclusive). This terminology IMO should be severely deprecated. The concept is best described as something like "day of year", with a specification of the origin (0 or 1) when appropriate. It is not clear from the first of your sentences quoted above exactly what you are calling a "Julian date": (a) the tuple (given_year, day_of_year) with calendar not specified, or (b) just day_of_year. However, either answer seems IMO to be an inappropriate addition to the terminology confusion.
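[Editorial note: for concreteness, the distinct quantities John describes can be computed mechanically. A sketch using the datetime module (which postdates this thread, arriving in Python 2.3); the offset 1721425 is the standard constant aligning datetime's proleptic-Gregorian ordinals, where day 1 is 1 January of year 1, with the astronomical day count.]

```python
import datetime

d = datetime.date(2002, 6, 23)

# "Day of year" (what some folks loosely call a "julian day"), 1-based:
day_of_year = d.timetuple().tm_yday

# Julian day number (JDN): ordinal day count whose day 0 is
# 1 January -4712 in the proleptic Julian calendar.  Since datetime
# ordinals are proleptic Gregorian, a constant offset converts them:
jdn = d.toordinal() + 1721425

print(day_of_year, jdn)  # 174 2452449
```

A handy cross-check for the offset: 1 January 2000 has JDN 2451545 (noon that day is the J2000.0 epoch).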
Cheers, John From niemeyer@conectiva.com Sun Jun 23 05:39:36 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sun, 23 Jun 2002 01:39:36 -0300 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <000501c21a46$2ee38210$070210ac@LAPDANCE> References: <20020622171035.A6004@ibook> <000501c21a46$2ee38210$070210ac@LAPDANCE> Message-ID: <20020623013936.A6543@ibook> > I think your re has a bug in it that in python would be > > if cond: > a = 1 > print a > > python will give an error if cond is false. > > An re that defines a group conditionally as yours does I think > is the same programming error. That's the ambiguity I am > referring to, is or is not the named group defined? Sorry Barry, but I don't see your point here. There's no change in the naming semantics. In sre that's totally valid and used in a lot of code: >>> `re.compile("(?P<a>a)?").match("b").group("a")` 'None' >>> `re.compile("(?P<a>a)?").match("a").group("a")` "'a'" >>> [...] > You can prevent groups from being remembered by using the (?:...) syntax > if you need to preserve the group index. So you need: > > "^(?P<a>(?:abc)?)(?P=a)" Again, you may do regular expressions in many ways; the point I'm still raising is that there's one way that doesn't work as expected. > I'm not convinced you have found a bug in the engine that needs > fixing; I think it's your re that needs changing. I want the re engine > to report the error for res that are illogical. The re won't report anything when somebody uses this syntax. It will just not work as expected. If you think this re is illogical, don't use it. But I see no point in preventing others from using it. I'm not planning to discuss much more about this. My intentions and the issue are clear enough. I'd like to hear the opinion of Fredrik about this, though.
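[Editorial note: the disputed behaviour is easy to pin down interactively. A sketch using 'abc' as in the quoted messages, contrasting the original pattern with the (?:...) rewrite Barry suggests:]

```python
import re

# Original form: the group is optional, and the backreference can only
# match if the group itself captured something.
pat = re.compile(r"^(?P<a>abc)?(?P=a)$")
print(pat.match("abcabc") is not None)  # True: group captures 'abc', backref repeats it
print(pat.match("") is not None)        # False: group absent, so the backref fails too

# Barry's rewrite: the group always participates, capturing '' when the
# inner (?:abc)? is absent, so the backreference can also match ''.
pat2 = re.compile(r"^(?P<a>(?:abc)?)(?P=a)$")
print(pat2.match("") is not None)       # True: group and backref both match ''
print(pat2.match("abcabc") is not None) # True
```

This is the difference the messages describe between group('a') being None (never matched) and being '' (matched the empty string).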
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From xscottg@yahoo.com Sun Jun 23 06:05:06 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Sat, 22 Jun 2002 22:05:06 -0700 (PDT) Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <3D1522EC.1070001@stsci.edu> Message-ID: <20020623050506.37311.qmail@web40106.mail.yahoo.com> --- Todd Miller: > Right now, numarray is a subclass of object for Python-2.2 in order to > get properties in order to emulate some of Numeric's attributes. I'm > wondering what I'd loose from object in order to pick up int's indexing. Since int is also a subclass of object, you'd still get the benefits of new style classes... > I'm also wondering how to make a rank-0 Float array fail as an index. Raise a TypeError and it would match the standard behavior. __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From nhodgson@bigpond.net.au Sun Jun 23 11:32:24 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sun, 23 Jun 2002 20:32:24 +1000 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com><200206201746.g5KHkwH04175@odiug.zope.com><3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> Message-ID: <000a01c21aa1$438bfde0$3da48490@neil> Barry A. Warsaw: > def whereBorn(name): > country = countryOfOrigin(name) > return _('$name was born in $country') > ... > The feature would be useless to me if I had to pass some explicit > dictionary into the _() method. It makes writing i18n code extremely > tedious. I think you are overstating the problem here. The explicit bindings are a small increase over your current code as you are already creating an extra variable just to use the automatic binding. 
With explicit bindings: def whereBorn(name): return _('$name was born in $country', name=name, country=countryOfOrigin(name)) The protection provided is not just against untrustworthy translators but also allows checking the initial language code. You can ensure all the interpolations are provided with values and all the provided values are used. It avoids exposing implementation details such as the names of local variables and can ensure that a more meaningful identifier in the local context of the string is available to the translator. For example, I may have some code that processes a command line argument which has multiple uses on different execution paths: _('$moduleName already exists', moduleName = arg) _('$searchString can not be found', searchString = arg) Not making bindings explicit may mean that translators use other variables available at the translation point, leading to unexpected failures when internal details are changed. Neil From skip@mojam.com Sun Jun 23 13:00:13 2002 From: skip@mojam.com (Skip Montanaro) Date: Sun, 23 Jun 2002 07:00:13 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200206231200.g5NC0DU02192@12-248-11-90> Bug/Patch Summary ----------------- 254 open / 2603 total bugs (-3) 128 open / 1565 total patches (no change) New Bugs -------- ConfigParser code cleanup (2002-04-17) http://python.org/sf/545096 odd index entries (2002-06-17) http://python.org/sf/570003 "python -u" not binary on cygwin (2002-06-17) http://python.org/sf/570044 Broken pre.subn() (and pre.sub()) (2002-06-17) http://python.org/sf/570057 inspect.getmodule symlink-related failur (2002-06-17) http://python.org/sf/570300 .PYO files not imported unless -OO used (2002-06-18) http://python.org/sf/570640 bdist_rpm and the changelog option (2002-06-18) http://python.org/sf/570655 CGIHTTPServer flushes read-only file.
(2002-06-18) http://python.org/sf/570678 glob() fails for network drive in cgi (2002-06-19) http://python.org/sf/571167 imaplib fetch is broken (2002-06-19) http://python.org/sf/571334 Mixing framework and static Pythons (2002-06-19) http://python.org/sf/571343 Numeric Literal Anomoly (2002-06-19) http://python.org/sf/571382 test_import crashes/hangs for MacPython (2002-06-20) http://python.org/sf/571845 Segmentation fault in Python 2.3 (2002-06-20) http://python.org/sf/571885 python-mode IM parses code in docstrings (2002-06-21) http://python.org/sf/572341 Memory leak in object comparison (2002-06-22) http://python.org/sf/572567 New Patches ----------- Remove support for Win16 (2002-06-16) http://python.org/sf/569753 Fix bug in encodings.search_function (2002-06-20) http://python.org/sf/571603 Changes (?P=) with optional backref (2002-06-20) http://python.org/sf/571976 AUTH method LOGIN for smtplib (2002-06-21) http://python.org/sf/572031 Remove import string in Tools/ directory (2002-06-21) http://python.org/sf/572113 opt. 
timeouts for Queue.put() and .get() (2002-06-22) http://python.org/sf/572628 Closed Bugs ----------- Incorporate timeoutsocket.py into core (2001-08-30) http://python.org/sf/457114 ext call doco warts (2001-12-14) http://python.org/sf/493243 It's the future for generators (2001-12-21) http://python.org/sf/495978 PyModule_AddObject doesn't set exception (2002-02-27) http://python.org/sf/523473 Incomplete list of escape sequences (2002-03-06) http://python.org/sf/526390 range() description: rewording suggested (2002-03-11) http://python.org/sf/528748 Popen3 might cause dead lock (2002-03-16) http://python.org/sf/530637 6.9 The raise statement is confusing (2002-03-20) http://python.org/sf/532467 cut-o/paste-o in Marshalling doc: 2.2.1 (2002-03-22) http://python.org/sf/533735 mimify.mime_decode_header only latin1 (2002-05-03) http://python.org/sf/551912 [RefMan] Special status of "as" (2002-05-07) http://python.org/sf/553262 Expat improperly described in setup.py (2002-05-15) http://python.org/sf/556370 \verbatiminput and name duplication (2002-05-20) http://python.org/sf/558279 Missing operator docs (2002-06-02) http://python.org/sf/563530 PyUnicode_Find() returns wrong results (2002-06-09) http://python.org/sf/566631 __slots__ attribute and private variable (2002-06-14) http://python.org/sf/569257 Closed Patches -------------- Pure python version of calendar.weekday (2001-11-20) http://python.org/sf/483864 Janitoring in ConfigParser (2002-04-17) http://python.org/sf/545096 add support for HtmlHelp output (2002-05-06) http://python.org/sf/552835 texi2html.py - add support for HTML Help (2002-05-06) http://python.org/sf/552837 OSX build -- make python.app (2002-05-18) http://python.org/sf/557719 unicode in sys.path (2002-06-10) http://python.org/sf/566999 From jmiller@stsci.edu Sun Jun 23 14:37:12 2002 From: jmiller@stsci.edu (Todd Miller) Date: Sun, 23 Jun 2002 09:37:12 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ 
References: <20020623050506.37311.qmail@web40106.mail.yahoo.com> Message-ID: <3D15CF08.2020506@stsci.edu> Scott Gilbert wrote: >--- Todd Miller: > >>Right now, numarray is a subclass of object for Python-2.2 in order to >>get properties in order to emulate some of Numeric's attributes. I'm >>wondering what I'd loose from object in order to pick up int's indexing. >> > >Since int is also a subclass of object, you'd still get the benefits of new >style classes... > Well, that's excellent! > >> I'm also wondering how to make a rank-0 Float array fail as an index. >> > >Raise a TypeError and it would match the standard behavior. > Raise TypeError where? I was thinking I'd have to either inherit from int, or not, depending on the type of the array. It still might work out though... > > > >__________________________________________________ >Do You Yahoo!? >Yahoo! - Official partner of 2002 FIFA World Cup >http://fifaworldcup.yahoo.com > From barry@zope.com Sun Jun 23 16:33:18 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 23 Jun 2002 11:33:18 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> Message-ID: <15637.59966.161957.754620@anthem.wooz.org> >>>>> "NH" == Neil Hodgson writes: >> The feature would be useless to me if I had to >> pass some explicit dictionary into the _() method. It makes >> writing i18n code extremely tedious. NH> I think you are overstating the problem here. Trust me, I'm not. Then again, maybe it's just me, or my limited experience w/ i18n'd source code, but being forced to pass in the explicit bindings is a big burden in terms of maintainability and readability. NH> The explicit bindings are a small increase over your current NH> code as you are already creating an extra variable just to use NH> the automatic binding. 
With explicit bindings: NH> def whereBorn(name): | return _('$name was born in $country', | name=name, country=countryOfOrigin(name)) More often than not, you already have the values you want to interpolate sitting in local variables for other uses inside the function. Notice how you've written `name' 5 times there? Try that with every other line of code and see if it doesn't get tedious. ;) NH> The protection provided is not just against untrustworthy NH> translators but also allows checking the initial language NH> code. You can ensure all the interpolations are provided with NH> values and all the provided values are used. Yes, you could do that. Note that the actual interpolation function /does/ have access to a dictionary; it might have more stuff than you want (making the second check impossible), but the first check could be done. NH> It avoids exposing implementation details such as the names of NH> local variables This isn't an issue from a security standpoint, if the code is open source. And you should be picking meaningful local variable names anyway! Mine tend to be stuff like `subject', `listname', `realname'. I've yet to get a question about the meaning of an interpolation variable. Actually, translators really need access to the source code anyway, and .po files usually contain references to the file and line number of the source string, and po-mode makes it easy for translators to locate the context and the purpose of the translation. NH> and can ensure that a more meaningful identifier in the local NH> context of the string is available to the translator. For NH> example, I may have some code that processes a command line NH> argument which has multiple uses on different execution paths: NH> _('$moduleName already exists', moduleName = arg) NH> _('$searchString can not be found', searchString = arg) +1 on using explicit bindings or a dictionary when it improves clarity!
NH> Not making bindings explicit may mean that translators use NH> other variables available at the translation point, leading to NH> unexpected failures when internal details are changed. I18n'ing a program means you have to worry about a lot more things. If some local variable changed, I'd consider using an explicit binding to preserve the original source string, a change to which would force updated translations. Then again, you tend to get paranoid about changing /any/ source string, say to remove a comma, adjust whitespace, or fix a preposition. Any change means a dozen language teams have a new message they must translate (unless you can mechanically fix them for them). Another i18n approach altogether uses explicit message ids instead of using the source string as the implicit message id, but that has a whole 'nuther set of issues. multi-lingual-ly y'rs, -Barry From rnd@onego.ru Sun Jun 23 17:10:43 2002 From: rnd@onego.ru (Roman Suzi) Date: Sun, 23 Jun 2002 20:10:43 +0400 (MSD) Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <20020623013936.A6543@ibook> Message-ID: On Sun, 23 Jun 2002, Gustavo Niemeyer wrote: I do not agree with either of you. I think re should give an error at compile time (as it does in cases like (?<=REGEXP), where only a fixed length is allowed): >>> re.compile("(?<=R*)") Traceback (most recent call last): File "<stdin>", line 1, in ? File "/usr/lib/python2.2/sre.py", line 178, in compile return _compile(pattern, flags) File "/usr/lib/python2.2/sre.py", line 228, in _compile raise error, v # invalid expression sre_constants.error: look-behind requires fixed-width pattern Why? Because there is no sense in matching a non-existent group. It's simply incorrect. So, instead of having time-bombs like the one Gustavo found, it's better to check at re compile time. >Sorry Barry, but I don't see your point here. There's no change in >the naming semantics.
In sre that's totally valid and used in a >lot of code: > >>>> `re.compile("(?P<a>a)?").match("b").group("a")` >'None' >>>> `re.compile("(?P<a>a)?").match("a").group("a")` >"'a'" >>>> This is quite different. None has a sense of meta-value which indicates that the group was not used while matching. There is no way to use it in the re consistently. (Well, probably some syntax could be invented for it, like 'match only if exists', etc. But it is too subtle and is hardly needed.) Sincerely yours, Roman Suzi -- rnd@onego.ru =\= My AI powered by Linux RedHat 7.2 From lalo@laranja.org Sun Jun 23 19:16:30 2002 From: lalo@laranja.org (Lalo Martins) Date: Sun, 23 Jun 2002 15:16:30 -0300 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting Message-ID: <20020623181630.GN25927@laranja.org> For a moment, please remove from your mind your experience of C and printf. Meditate with me and picture yourself in a happy world of object-orientation and code readability, where everything cryptic and obscured is banished. Just to help you do that, I'll avoid the notation chosen by the PEP. Let's use, for the duration of this post, the hypothetical notation suggested by some other reader: "<name> is from <country>". Now, this thing we're talking about is replacing parts of the string with other strings. These strings may be the result of running some non-string objects through str(foo) - but, we are making no assumptions about these objects. Just that str(foo) is somehow meaningful. And, to my knowledge, there are no python objects for which str(foo) doesn't work. So, string substitution is non-intrusive. Also, if you keep your templates (let's call a string containing substitution markup a template, shall we?) outside your source code, as is the case with i18n, pure substitution doesn't require the people who edit them (for example, translators) to know anything about python *or* even programming.
String substitution only depends on an identifier ('name' or 'country'), no sick abbreviations like 's' or 'd' or 'f' or 'r' or 'x' that you have to keep a table for. So, string substitution is readable and non-cryptic. Now, data formatting is another animal entirely. It's a way to request one specific representation of a piece of data. But there is a catch. When you do '%8.3f' % foo you are *expecting* that foo is a floating-point number and you know you'll get TypeError otherwise. This is, IMO, invasive. In my ideal OO-paradise I would rather have something like foo.format(8, 3) (THIS IS NOT A PEP!). IMO, if you, as I asked in the first paragraph, pretend you don't know C and printf and python's % operator and then pretend you're having your first contact with it, while already having some experience with python's readability, it's hard not to be shocked. And I bet you'd go to great lengths to avoid using the "feature". Conclusion: I think string formatting is a cryptic and obscure misfeature inherited from C that should be deprecated in favour of something less invasive and more readable/explicit. More, I'm completely opposed to "<x.name> is <x.age> years old" because it's still cryptic and invasive. This should instead read something like "<name> is <age> years old".sub({'name': x.name, 'age': x.age.format(None, 0)}) Guido, can you please, for our enlightenment, tell us the reasons you feel %(foo)s was a mistake? []s, |alo +---- -- It doesn't bother me that people say things like "you'll never get anywhere with this attitude". In a few decades, it will make a good paragraph in my biography. You know, for a laugh.
-- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Python Foundry Guide http://www.sf.net/foundry/python-foundry/

From tim.one@comcast.net Sun Jun 23 19:28:53 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 23 Jun 2002 14:28:53 -0400 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <20020623013936.A6543@ibook> Message-ID:

[Gustavo Niemeyer, on the behavior of re.compile("^(?P<a>a)?(?P=a)$").match("ebc").groups() ]

Python and Perl work exactly the same way for the equivalent (but spellable in Perl) regexp

    ^(a)?\1$

matching the two strings "a" and "aa" and nothing else. That's what I expected. You didn't give a concrete example of what you think it should do instead. It may have been your intent to say that you believe the regexp *should* match the string "ebc", but you didn't really say so one way or the other. Regardless, neither Python nor Perl matches "ebc" in this case, and that's intended.

The Rule, in vague English, is that a backreference matches the same text as was matched by the referenced group; if the referenced group didn't match any text, then the backreference can't match either. Note that whether the referenced group matched any text is a different question than whether the referenced group is *used* in the match. This is a subtle point I suspect you're missing.

> Otherwise the regular expression above will always fail if the first
> group fails,

Yes.

> even being optional

There's no such beast as "an optional group". The

    ^(a)

part *must* match or the entire regexp fails, period, regardless of whether or not backreferences appear later. The question mark following doesn't change this requirement. (a)? says 'a' must match, but the overall pattern can choose to use this match or not. That's why the regexp as a whole matches the string "a": the (a) part does match 'a', the ? chooses not to use this match, and then the backreference matches the 'a' that the first group matched.
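[The rule stated above — a backreference can only match if the referenced group actually matched some text — can be checked directly against today's re module. A sketch; only the uncontroversial cases are asserted, since the subtler "unused match" case may vary between engine versions:]

```python
import re

# A backreference fails when the referenced group matched nothing at all.
assert re.match(r"^(a)?\1$", "aa") is not None   # group matched 'a'; \1 matches the second 'a'
assert re.match(r"^(a)?\1$", "ebc") is None      # group can't match at all, so \1 can't either

# Making the group itself match an *empty string* is a different situation:
assert re.match(r"^(a?)b\1$", "b") is not None   # group matched '', so \1 matches ''
assert re.match(r"^(a)?b\1$", "b") is None       # group never matched, so \1 fails
```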
Study the output of this and it may be clearer:

    import re
    pat = re.compile(r"^((a)?)(\2)$")
    print pat.match('a').groups()
    print pat.match('aa').groups()

> ...
> while the regular expression above would match "aa" or "", but not "a".

As above, Python and Perl disagree with you: they match "aa" and "a" but not "".

> ...
> My intentions and the issue are clear enough.

Sorry, your intentions weren't clear to me. The issue is, though <wink>.

From paul@prescod.net Sun Jun 23 19:32:21 2002 From: paul@prescod.net (Paul Prescod) Date: Sun, 23 Jun 2002 11:32:21 -0700 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> Message-ID: <3D161435.9D154EE0@prescod.net>

Christian Tismer wrote:
> >...
> > Ok, I'm all with it.
> Since a couple of hours, I'm riding the following horse:
>
> - $name, $(name), $(any expr) is just fine
> - all of this is compile-time stuff
> ....

I think you just described PEP 215. But what you're missing is that we need a compile-time facility for its flexibility and simplicity, but we also need a runtime facility to allow I18N.

> I also believe it is a good idea to do the _() on
> the unexpanded string (as shown), since the submitted
> values are most probably hard to translate at all.

_ runs at runtime. If the interpolation is done at compile time then "_" is executed too late.

Paul Prescod

From paul@prescod.net Sun Jun 23 19:38:43 2002 From: paul@prescod.net (Paul Prescod) Date: Sun, 23 Jun 2002 11:38:43 -0700 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com><200206201746.g5KHkwH04175@odiug.zope.com><3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> Message-ID: <3D1615B3.D9F382F@prescod.net>

Neil Hodgson wrote:
> >...
> > Not making bindings explicit may mean that translators use other
> variables available at the translation point leading to unexpected failures
> when internal details are changed.

Actually, I don't think that is the case. I think that the security implications of "_" are overstated.

    name = "Paul"
    country = "Canada"
    password = "jfoiejw"
    _('${name} was born in ${country}')

The "_" function can use a regular expression to determine that the original code used only "${name}" and "${country}". Then it can disallow access to ${password}:

    def _(origstring):
        orig_substitutions = get_substitutions(origstring)
        translation = lookup_translation(origstring)
        translation_substitutions = get_substitutions(translation)
        assert translation_substitutions == orig_substitutions

Paul Prescod

From tim.one@comcast.net Sun Jun 23 19:45:30 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 23 Jun 2002 14:45:30 -0400 Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <3D1522EC.1070001@stsci.edu> Message-ID:

[Todd Miller, wants to use rank-0 arrays as regular old indices]

[Tim]
> Here's a sick idea: given Python 2.2, you *could* make the type
> of a rank-0 array a subclass of Python's int type

[Todd]
> Right now, numarray is a subclass of object for Python-2.2 in order to
> get properties in order to emulate some of Numeric's attributes. I'm
> wondering what I'd lose from object in order to pick up int's indexing.

All types in 2.2 inherit from object, including int.

    >>> class IntWithA(int):
    ...     def seta(self, value):
    ...         self._a = value
    ...     def geta(self):
    ...         return self._a * 2
    ...     a = property(geta, seta)
    ...
    >>> i = IntWithA(42)
    >>> i
    42
    >>> i.a = 333
    >>> i.a
    666
    >>> range(50)[i]
    42
    >>>

So, e.g., adding arbitrary properties should be a crawl in the park.

> I'm also wondering how to make a rank-0 Float array fail as an index.

Quit while you're ahead <wink>.
The obvious idea is to make a Rank0FloatArray type which is not a subclass of int.

From tim.one@comcast.net Sun Jun 23 20:00:25 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 23 Jun 2002 15:00:25 -0400 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: Message-ID:

[Tim]
> ..
> There's no such beast as "an optional group". The
>
>     ^(a)
>
> part *must* match or the entire regexp fails, period, regardless
> of whether or not backreferences appear later. The question mark
> following doesn't change this requirement.
> ...

Wow, yesterday's drugs haven't worn off yet <wink>. The details of this explanation were partly full of beans. Let's consider a different regexp:

    ^(a)?b\1$

Should that match

    b

or not? Python and Perl say "no" today, because \1 refers to a group that didn't match. It remains unclear to me whether Gustavo is saying it should, but, if he is, that's too big a change, and

    ^(a?)b\1$

is the intended way to spell it.

From python@rcn.com Sun Jun 23 20:03:08 2002 From: python@rcn.com (Raymond Hettinger) Date: Sun, 23 Jun 2002 15:03:08 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() Message-ID: <002001c21ae8$9d687a40$bbb53bd0@othello>

GvR thought you guys might have some ideas on this one for me. If I don't get any replies, I may have to rely on my own instincts and judgment, and no one knows what follies might ensue ;)

Raymond Hettinger

----- Original Message ----- From: "Raymond Hettinger" To: Sent: Friday, June 21, 2002 1:16 PM Subject: Behavior of buffer()

> I would like to solicit py-dev's thoughts on the best way to resolve a bug,
> www.python.org/sf/546434 .
>
> The root problem is that mybuf[:] returns a buffer type and mybuf[2:4]
> returns a string type. A similar issue exists for buffer repetition.
>
> One way to go is to have the slices always return a string. If code
> currently relies on the type of a buffer slice, it is more likely to be
> relying on it being a string as in: print mybuf[:4].
> This is an intuitive guess because I can't find empirical evidence.
> Another reason to choose a string return type is that buffer() appears to
> have been designed to be as stringlike as possible so that it can be
> easily substituted in code originally designed for strings.
>
> The other way to go is to return a buffer object every time. Slices
> usually, but not always (see subclasses of list), return the same type
> that was being sliced. If we choose this route, another issue remains --
> mybuf[:] returns self instead of a new buffer. I think that behavior is
> also a bug and should be changed to be consistent with the Python idiom
> where:
>
>     b = a[:]
>     assert id(a) != id(b)
>
> Incidental to the above, GvR had a thought that slice repetition ought to
> always return an error. Though I don't see any use cases for buffer
> repetition, buffer objects do implement all other sequence behaviors and I
> think it would be weird to nullify the sq_repeat slot.
>
> I appreciate your thoughts on the best way to proceed.
>
> fixing-bugs-is-easier-than-deciding-appropriate-behavior-ly yours,
>
> 'regnitteh dnomyar'[::-1]

From pinard@iro.umontreal.ca Sun Jun 23 20:05:45 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 23 Jun 2002 15:05:45 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <15637.59966.161957.754620@anthem.wooz.org> References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org> Message-ID:

[Barry A. Warsaw]
> "NH" == Neil Hodgson
> Another i18n approach altogether uses explicit message ids instead of
> using the source string as the implicit message id, but that has a
> whole 'nuther set of issues.

The `catgets' approach, by opposition to the `gettext' approach.
I've seen some people having religious feelings in either direction. Roughly said, `catgets' is faster, as you directly index the translation string without having to hash the original string first. It is also easier to translate single words or strings offering little translation context, as English ambiguities are resolved by using different message ids for the same text fragment.

On the other hand, `gettext' can be made nearly as fast as `catgets', only _if_ we use efficient hashing combined with proper caching. But the real advantage of `gettext' is that internationalised sources are more legible and easier to maintain, since the original string is shown in clear exactly where it is meant to be used.

A problem with both is that implementations bundled in various systems are often weak or buggy, provided they exist of course. Portability is notoriously difficult. Linux and GNU `gettext' rate rather nicely. But nothing is perfect.

> [...] you tend to get paranoid about changing /any/ source string, say
> to remove a comma, adjust whitespace, or fix a preposition. Any change
> means a dozen language teams have a new message they must translate
> (unless you can mechanically fix them for them).

This is why the responsibilities between maintainers and programmers ought to be well split. If the maintainer feels responsible for the work that is induced on the translation teams by string changes, comfort is lost. The maintainer should do their work in all freedom, and the problem of later reflecting tiny editorial changes into PO `msgstr' fully pertains to translators, with the possible help of automatic tools. Translators should be prepared for such changes. If the split of responsibilities is not fully understood and accepted, internationalisation becomes much heavier, in practice, than it has to be.

> >> The feature would be useless to me if I had to pass some explicit
> >> dictionary into the _() method. It makes writing i18n code
> >> extremely tedious.
> NH> I think you are overstating the problem here. > Trust me, I'm not. [...] being forced to pass in the explicit bindings > is a big burden in terms of maintainability and readability. > NH> Not making bindings explicit may mean that translators use > NH> other variables available at the translation point leading to > NH> unexpected failures when internal details are changed. > I18n'ing a program means you have to worry about a lot more things. [...] Internationalisation should not add a significant burden on the programmer. I mean, if there is something cumbersome in the internationalisation of a string, then there is something cumbersome in that string outside any internationalisation context. If internationalisation really adds a significant burden, this is a signal that internationalisation has not been implemented well enough in the underlying language, or else, that it is not getting used correctly. I really think that internationalising of strings should be designed so it is a light activity and negligible burden for the maintainer. (And of course, translators should also get help in form of proper files and tools.) -- François Pinard http://www.iro.umontreal.ca/~pinard From tismer@tismer.com Sun Jun 23 20:24:16 2002 From: tismer@tismer.com (Christian Tismer) Date: Sun, 23 Jun 2002 21:24:16 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> Message-ID: <3D162060.9030101@tismer.com> Paul Prescod wrote: > Christian Tismer wrote: > >>... >> >>Ok, I'm all with it. >>Since a couple of hours, I'm riding the following horse: >> >>- $name, $(name), $(any expr) is just fine >>- all of this is compile-time stuff >>.... > > > I think you just described PEP 215. 
But what you're missing is that we
> need a compile time facility for its flexibility and simplicity but we
> also need a runtime facility to allow I18N.

Are you sure you got what I meant? I want to compile the variable references away at compile time, resulting in an ordinary format string. This string is wrapped by the runtime _(), and the result is then interpolated with a dict.

>> I also believe it is a good idea to do the _() on
>> the unexpanded string (as shown), since the submitted
>> values are most probably hard to translate at all.
>
> _ runs at runtime. If the interpolation is done at compile time then "_"
> is executed too late.

Compile time does no interpolation but a translation of the string into a different one, which is interpolated at runtime.

will-read-PEP215-anyway - chris

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From skip@pobox.com Sun Jun 23 20:28:20 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 23 Jun 2002 14:28:20 -0500 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: <20020623181630.GN25927@laranja.org> References: <20020623181630.GN25927@laranja.org> Message-ID: <15638.8532.380440.63318@beluga.mojam.com>

Lalo> These strings may be the result of running some non-string objects
Lalo> through str(foo) - but, we are making no assumptions about these
Lalo> objects. Just that str(foo) is somehow meaningful. And, to my
Lalo> knowledge, there are no python objects for which str(foo) doesn't
Lalo> work.
Unicode objects can't always be passed to str():

    >>> str(u"abc")
    'abc'
    >>> p = u'Scr\xfcj MacDuhk'
    >>> str(p)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)

(My default encoding is "ascii".) You need to encode Unicode objects using the appropriate charset, which may not always be the default.

Skip

From paul@prescod.net Sun Jun 23 20:36:34 2002 From: paul@prescod.net (Paul Prescod) Date: Sun, 23 Jun 2002 12:36:34 -0700 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> <3D162060.9030101@tismer.com> Message-ID: <3D162342.BBDC07B3@prescod.net>

Christian Tismer wrote:
> >...
> > Are you sure you got what I meant?
> I want to compile the variable references away at compile
> time, resulting in an ordinary format string.
> This string is wrapped by the runtime _(), and
> the result is then interpolated with a dict.

How can that be? Original expression:

    _($"$foo")

Expands to:

    _("%(x1)s" % {"x1": foo})

Standard Python order of operations will do the %-interpolation before the method call! You say that it could instead be

    _("%(x1)s") % {"x1": foo}

But how would Python know to do that? "_" is just another function. There is nothing magical about it. What if the function was instead re.compile? In that case I would want to do the interpolation *before* the compilation, not after!

Are you saying that the "_" function should be made special and recognized by the compiler?
Paul Prescod

From lalo@laranja.org Sun Jun 23 20:38:41 2002 From: lalo@laranja.org (Lalo Martins) Date: Sun, 23 Jun 2002 16:38:41 -0300 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting In-Reply-To: <15638.8532.380440.63318@beluga.mojam.com> References: <20020623181630.GN25927@laranja.org> <15638.8532.380440.63318@beluga.mojam.com> Message-ID: <20020623193841.GO25927@laranja.org>

On Sun, Jun 23, 2002 at 02:28:20PM -0500, Skip Montanaro wrote:
>
> Lalo> These strings may be the result of running some non-string objects
> Lalo> through str(foo) - but, we are making no assumptions about these
> Lalo> objects. Just that str(foo) is somehow meaningful. And, to my
> Lalo> knowledge, there are no python objects for which str(foo) doesn't
> Lalo> work.
>
> Unicode objects can't always be passed to str():
>
> >>> str(u"abc")
> 'abc'
> >>> p = u'Scr\xfcj MacDuhk'
> >>> str(p)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> (My default encoding is "ascii".)
>
> You need to encode Unicode objects using the appropriate charset, which may
> not always be the default.

Valid point but completely unrelated to my argument - just s/str/unicode/ where necessary. '%s' already handles this:

    >>> '-%s-' % u'Scr\xfcj MacDuhk'
    u'-Scr\xfcj MacDuhk-'

[]s, |alo +---- -- It doesn't bother me that people say things like "you'll never get anywhere with this attitude". In a few decades, it will make a good paragraph in my biography. You know, for a laugh. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG!
(I play RPG) http://www.eujogorpg.com.br/ Python Foundry Guide http://www.sf.net/foundry/python-foundry/

From tismer@tismer.com Sun Jun 23 21:51:41 2002 From: tismer@tismer.com (Christian Tismer) Date: Sun, 23 Jun 2002 22:51:41 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> <3D162060.9030101@tismer.com> <3D162342.BBDC07B3@prescod.net> Message-ID: <3D1634DD.8060101@tismer.com>

Paul Prescod wrote:
> Christian Tismer wrote:
>
>> ...
>>
>> Are you sure you got what I meant?
>> I want to compile the variable references away at compile
>> time, resulting in an ordinary format string.
>> This string is wrapped by the runtime _(), and
>> the result is then interpolated with a dict.
>
> How can that be?
>
> Original expression:
>
>     _($"$foo")
>
> Expands to:
>
>     _("%(x1)s"%{"x1": foo})
>
> Standard Python order of operations will do the %-interpolation before
> the method call! You say that it could instead be
>
>     _("%(x1)s")%{"x1": foo}
>
> But how would Python know to do that? "_" is just another function.
> There is nothing magical about it. What if the function was instead
> re.compile? In that case I would want to do the interpolation *before*
> the compilation, not after!
>
> Are you saying that the "_" function should be made special and
> recognized by the compiler?

As you say it, it looks a little as if something special would be needed, right. I have no concrete idea. Somehow I'd want to express that a function is applied after compile-time substitution, but before runtime interpolation.

Here's a simple idea - not very nice, but it could work: Assume a "$" prefix, which does the interpolation in the way you said. Assume further a "%" prefix, which does it only halfway, returning a tuple: (modified string, dict).
This tuple would be passed to _(), and it is _()'s decision to work this way:

    def _(s):
        if type(s) == type(()):
            s, args = s
        else:
            args = None
        # ... processing s ...
        if args:
            return s % args
        else:
            return s

But this is a minor issue, I just wanted to tell what I think should happen, without giving an exact solution.

cheers - chris

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From xscottg@yahoo.com Sun Jun 23 22:05:08 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Sun, 23 Jun 2002 14:05:08 -0700 (PDT) Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__ In-Reply-To: <3D15CF08.2020506@stsci.edu> Message-ID: <20020623210508.60531.qmail@web40105.mail.yahoo.com>

--- Todd Miller wrote:
> > > Raise a TypeError and it would match the standard behavior.
>
> Raise TypeError where? I was thinking I'd have to either inherit from
> int, or not, depending on the type of the array. It still might work
> out though...

You're right. You'd have to raise the TypeError from whatever object was being subscripted. I'm not sure what I was thinking...

__________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com

From martin@v.loewis.de Sun Jun 23 23:19:50 2002 From: martin@v.loewis.de (Martin v.
Loewis) Date: 24 Jun 2002 00:19:50 +0200 Subject: [Python-Dev] Re: replacing bsddb with pybsddb's bsddb3 module In-Reply-To: <20020621215444.GB30056@zot.electricrain.com> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <15635.14235.79608.390983@beluga.mojam.com> <20020621215444.GB30056@zot.electricrain.com> Message-ID: "Gregory P. Smith" writes: > Sound correct? Yes, please go ahead. > How do we want future bsddb module development to proceed? I envision > it either taking place 100% under the python project, or taking place > as it is now in the pybsddb project with patches being fed to the python > project as desired? Any preferences? [i prefer to not maintain the > code in two places myself (ie: do it all under the python project)] It's your choice. If people want to maintain pybsddb3 for older Python releases, it would be necessary to synchronize the two code bases regularly. That would be the task of whoever is interested in providing older Python releases with newer code. From the viewpoint of the Python distribution, this support is not interesting. Regards, Martin From xscottg@yahoo.com Sun Jun 23 23:22:09 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Sun, 23 Jun 2002 15:22:09 -0700 (PDT) Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <002001c21ae8$9d687a40$bbb53bd0@othello> Message-ID: <20020623222209.62675.qmail@web40105.mail.yahoo.com> --- Raymond Hettinger wrote: > GvR thought you guys might have some ideas on this one for me. > > If I don't get any replies, I may have to rely on my own instincts and > judgment and no one knows what follies might ensue ;) > > [...] 
I think buffers have a weird duality that they don't really want. In one case, the buffer object acts as a low level way to inspect some other object's PyBufferProcs. I'll call this BufferInspector. In the other case, the buffer object just acts like an array of bytes. I'll call this ByteArray.

So for a BufferInspector, you'd want slices to return new "views" into the same object, and repetition doesn't make any sense. If you wanted to copy the data out of the object you're mucking with, you'd be explicit about it - either creating a new string, or a new ByteArray.

For a ByteArray, I think you'd want slices to have copy behaviour and return a new ByteArray. Repetition also makes perfect sense.

Of course this all gets screwy when the object being inspected in the BufferInspector sense is created solely to provide a ByteArray. I see this as an ugly workaround for arraymodule.c not allowing one to supply a pointer/destructor when creating arrays.

The fact that either of these pretend to be strings is really convenient, but I don't think it has much to do with the weirdness. The fact that either of these returns strings for any operation is somewhat weird. For the ByteArray sense of the buffer object, it's analogous to a list slice/repetition returning a tuple.

Since the array module already has a way to create a ByteArray (and a ShortArray, and...), buffer objects don't really need to duplicate that effort. Except creating an array from your own "special memory" (mmap, DMA, third party API), and backwards compatibility in general. :-)

BTW: I chuckled when I saw you post this the first time. This topic seems to draw a lot of silence. I know that I would suggest deprecating the PyBufferObject to just being a BufferInspector, and taking what little extra functionality was in there and stuffing it into arraymodule.c. Another solution would be to factor PyBufferObject into PyBufferInspector and a "bytes" object.
A few months ago, I was tempted to submit a PEP saying as much, but I think that would have quietly fallen to the floor. Nobody seems to like this topic too much...

If you do go in and make changes to bufferobject.c, I've already submitted two patches (fallen quietly to the floor) that fix some other classic "buffer problems". You might want to look at them. Or not :-)

Cheers, -Scott

__________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com

From aahz@pythoncraft.com Sun Jun 23 23:35:26 2002 From: aahz@pythoncraft.com (Aahz) Date: Sun, 23 Jun 2002 18:35:26 -0400 Subject: [Python-Dev] Fw: Behavior of buffer() In-Reply-To: <20020623222209.62675.qmail@web40105.mail.yahoo.com> References: <002001c21ae8$9d687a40$bbb53bd0@othello> <20020623222209.62675.qmail@web40105.mail.yahoo.com> Message-ID: <20020623223526.GA2570@panix.com>

On Sun, Jun 23, 2002, Scott Gilbert wrote:
>
> I know that I would suggest deprecating the PyBufferObject to just being a
> BufferInspector, and taking what little extra functionality was in there
> and stuffing it into arraymodule.c. Another solution would be to factor
> PyBufferObject into PyBufferInspector and a "bytes" object. A few months
> ago, I was tempted to submit a PEP saying as much, but I think that would
> have quietly fallen to the floor. Nobody seems to like this topic too
> much...

OTOH, for PEPs, silence may be construed as consent. Just don't be too surprised if an actual PEP generated a lot of noise.

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/

From niemeyer@conectiva.com Mon Jun 24 00:30:06 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sun, 23 Jun 2002 20:30:06 -0300 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: References: Message-ID: <20020623203006.A9783@ibook>

Hello Tim!

> Wow, yesterday's drugs haven't worn off yet <wink>.
The details of this
> explanation were partly full of beans. Let's consider a different regexp:
[...]

Thanks for explaining again, in a way I could understand. :-)

> ^(a)?b\1$
>
> Should that match
>
> b
>
> or not? Python and Perl say "no" today, because \1 refers to a group that
> didn't match. It remains unclear to me whether Gustavo is saying it should,
> but, if he is, that's too big a change, and
>
> ^(a?)b\1$
[...]

I still think it should, because otherwise the "^(a)?b\1$" can never be used, and this expression will become "^((a)?)b\1$" if more than one character is needed. But since nobody agrees with me, and both languages are doing it that way, I give up. :-) Could you please reject the patch at SF?

Thank you!

-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From barry@zope.com Mon Jun 24 01:03:37 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 23 Jun 2002 20:03:37 -0400 Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting References: <20020623181630.GN25927@laranja.org> Message-ID: <15638.25049.352408.232831@anthem.wooz.org>

>>>>> "LM" == Lalo Martins writes:

LM> Also, if you keep your templates (let's call a string
LM> containing substitution markup a template, shall we?) outside
LM> your source code, as is the case with i18n, pure substitution
LM> doesn't require the people who edit them (for example,
LM> translators) to know anything about python *or* even
LM> programming.

It isn't always done that way though. See Francois's very good followup describing gettext vs. catgets.

LM> Now, data formatting is another animal entirely. It's a way to
LM> request one specific representation of a piece of data.

I agree!

-Barry

From barry@zope.com Mon Jun 24 01:12:26 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Sun, 23 Jun 2002 20:12:26 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org> Message-ID: <15638.25578.254353.531473@anthem.wooz.org>

>>>>> "FP" == François Pinard writes:

FP> This is why the responsibilities between maintainers and
FP> programmers ought to be well split. If the maintainer feels
FP> responsible for the work that is induced on the translation
FP> teams by string changes, comfort is lost. The maintainer
FP> should do their work in all freedom, and the problem of later
FP> reflecting tiny editorial changes into PO `msgstr' fully
FP> pertains to translators, with the possible help of automatic
FP> tools. Translators should be prepared for such changes. If
FP> the split of responsibilities is not fully understood and
FP> accepted, internationalisation becomes much heavier, in
FP> practice, than it has to be.

Unfortunately, sometimes one person has to wear both hats and then we see the tension between the roles.

>> I18n'ing a program means you have to worry about a lot more
>> things. [...]

FP> Internationalisation should not add a significant burden on
FP> the programmer. I mean, if there is something cumbersome in
FP> the internationalisation of a string, then there is something
FP> cumbersome in that string outside any internationalisation
FP> context.

It may not be a significant burden, once the infrastructure is in place and a rhythm is established, but it is still non-zero. Little issues crop up all the time, like the fact that a message might have the same English phrase but need to be distinguished for proper translation in some other languages (gettext vs.
catgets), or that the translation is slightly different depending on where the message is output (email, web, console), or dealing with localized formatting of numbers, dates, and other values. It's just stuff you have to keep in mind and deal with, but it's not insurmountable.

I think the current Python tools for i18n'ing are pretty good, and the bright side is that I'd still rather be developing an i18n'd program in Python than in just about any other language. One area that I think we could do better in is in support of localizing dates, currency, etc. Here, Stephan Richter is laying some groundwork in the Zope3 I18n project, possibly integrating IBM's ICU library into Python. http://www-124.ibm.com/icu/

-Barry

From groups@crash.org Mon Jun 24 01:23:50 2002 From: groups@crash.org (Jason L. Asbahr) Date: Sun, 23 Jun 2002 19:23:50 -0500 Subject: [Python-Dev] Playstation 2 and GameCube ports In-Reply-To: <3D0FDB0A.EC53656@prescod.net> Message-ID:

Paul,

The PS2 Linux FAQ has a great answer to this:

What are the differences between the Linux (for PlayStation 2) development environment and that used by professional game developers?

Professional game developers get access to a special version of the PlayStation 2 hardware which contains more memory and extra debug facilities. This hardware, known as the T10K, is a lot more expensive than a commercial PlayStation 2 and is only available to licensed game developers. If you are seriously interested in becoming a licensed game developer, please see this link for North America and this link for Europe and Australasia.

In addition to the T10K, licensed game developers get additional support which is part of the reason that the T10K is so much more expensive than a PlayStation 2 console. In terms of access to the PlayStation 2 hardware and libraries, Linux (for PlayStation 2) offers an almost identical set of functionality to that provided to licensed game developers.
In fact the system manuals provided with the Linux kit have identical content to 6 of the 7 system manuals provided to licensed developers. The missing information which is provided to licensed developers and not to users of Linux (for PlayStation 2) describes the hardware that controls the CD/DVD-ROM, SPU2 audio chip and other IO peripheral control hardware. This hardware functionality is still available for use with the Linux kit through a software interface called the Runtime Environment. The final major difference between the two is the operating system. A licensed developer creates games for the PlayStation 2 which use a lightweight proprietary operating system kernel. This kernel offers much less functionality than Linux, but has the advantage of offering slightly faster access to the hardware. In most cases, it is possible to get almost the same performance with Linux (for PlayStation 2) as with the professional game development tools. -----Original Message----- From: python-dev-admin@python.org [mailto:python-dev-admin@python.org] On Behalf Of Paul Prescod Sent: Tuesday, June 18, 2002 8:15 PM To: Jason L. Asbahr Cc: python-dev@python.org Subject: Re: [Python-Dev] Playstation 2 and GameCube ports "Jason L. Asbahr" wrote: > >... > > However, the hobbyist PS2/Linux upgrade kit for the retail PS2 unit > may be acquired for $200 and Python could be used on that system > as well. Info at http://playstation2-linux.com What do you lose by going this route? Obviously if this was good enough there would be no need for developer boxes nor (I'd guess) for a special port of Python. 
Paul Prescod _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev From tim.one@comcast.net Mon Jun 24 03:24:42 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 23 Jun 2002 22:24:42 -0400 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: <20020623203006.A9783@ibook> Message-ID: [Gustavo Niemeyer] > I still think it should, because otherwise the "^(a)?b\1$" can never be > used, and this expression will become "^((a)?)b\1$" if more than one > character is needed. Is that a real concern? I mean that in the sense of whether you have an actual application requiring that some multi-character bracketing string either does or doesn't appear on both ends of a thing, and typing another set of parens is a burden. Both parts of that seem strained. > But since nobody agrees with me, and both languages are doing it that > way, I give up. :-) That's wise. It's not just Python and Perl; I expect you're going to find this in every careful regexp package. There's a painful discussion buried here, wherein the POSIX committee debated their own ambiguous wording about backreferences. Their specific example is: what should the regexp (in Python notation, not POSIX) ^((.)*\2#)* match in xx#yy## ? Your example is hiding in there, on the "third iteration of the outer loop". The official POSIX interpretation was that it should match just the first 6 characters, and not the trailing #, because in a third iteration of the outer subexpression, . would match nothing (as distinct from matching a null string) and hence \2 would match nothing. Python and Perl agree, which wouldn't surprise you if you first implemented a regexp engine with stinking backreferences <0.9 wink>. The distinction between "matched an empty string" and "didn't match anything" is night-&-day inside an engine, and people skating on the edge (meaning using backreferences at all!) 
quickly rely on the exact behavior this implies. > Could you please reject the patch at SF? I'm not sure which one you mean, so on your authority I'm going to reject all patches at SF. Whew! This makes our job much easier. From niemeyer@conectiva.com Mon Jun 24 04:04:58 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 24 Jun 2002 00:04:58 -0300 Subject: [Python-Dev] Behavior of matching backreferences In-Reply-To: References: <20020623203006.A9783@ibook> Message-ID: <20020624000458.A12114@ibook> > > I still think it should, because otherwise the "^(a)?b\1$" can never be > > used, and this expression will become "^((a)?)b\1$" if more than one > > character is needed. > > Is that a real concern? I mean that in the sense of whether you have an > actual application requiring that some multi-character bracketing string > either does or doesn't appear on both ends of a thing, and typing another > set of parens is a burden. Both parts of that seem strained. No, it isn't, both because there is a way to implement this, as Barry and you have shown, and because *I* know it doesn't work as I'd expect. :-)) Indeed, I found it while implementing another feature which in my opinion is really useful, and can't easily be achieved. But that's something for another thread, another day. [...] > ? Your example is hiding in there, on the "third iteration of the outer > loop". The official POSIX interpretation was that it should match just the > first 6 characters, and not the trailing #, > > because in a third iteration of the outer subexpression, . would match > nothing (as distinct from matching a null string) and hence \2 would > match nothing. [...] Thanks for giving me a strong and detailed reason. I understand that small issues can end up in endless discussions and different implementations. I'm happy that the POSIX people thought about that before me <2.0 wink>. > > Could you please reject the patch at SF? 
> > I'm not sure which one you mean, so on your authority I'm going to reject > all patches at SF. Whew! This makes our job much easier. That's good! You'll get back the time you wasted with me. ;-)) -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From misa@redhat.com Mon Jun 24 04:06:10 2002 From: misa@redhat.com (Mihai Ibanescu) Date: Sun, 23 Jun 2002 23:06:10 -0400 (EDT) Subject: [Python-Dev] Added SSL through HTTP proxies support to httplib.py Message-ID: Hello, Can somebody please verify whether the following patch makes enough sense to be accepted? http://sourceforge.net/tracker/index.php?func=detail&aid=515003&group_id=5470&atid=305470 Thanks, Misa From neal@metaslash.com Mon Jun 24 04:16:20 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 23 Jun 2002 23:16:20 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org> <15638.25578.254353.531473@anthem.wooz.org> Message-ID: <3D168F04.CE03AD66@metaslash.com> I'm pretty negative on string interpolation; I don't see it as that useful, or %()s as that bad. But obviously, many others do feel there is a problem. I don't like the schism that $ vs. % would create. Nor do I like many other proposals. 
So here is yet another proposal:

* Add new builtin function interp() or some other name:

      def interp(format, uselocals=True, useglobals=True, dict={}, **kw)

* use % as the format character and allow optional () or {} around the name
* if this is acceptable, {name:format_modifiers} could be added in the future

Code would then look like this:

      >>> x = 5
      >>> print interp('x = %x')
      x = 5
      >>> print interp('x = %(x)')
      x = 5
      >>> print interp('x = %{x}')
      x = 5
      >>> print interp('y = %y')
      NameError: name 'y' is not defined
      >>> print interp('y = %y', dict={'y': 10})
      y = 10
      >>> print interp('y = %y', y=10)
      y = 10

This form:

* eliminates any hint of $
* is similar to current % handling, but hopefully fixes the current deficiencies
* allows locals and/or globals to be used
* allows any dictionary/mapping to be used
* allows keywords
* is extensible to allow for formatting in the future
* doesn't require much extra typing or thought

Now I'm sure everyone will tell me how awful this is. :-) Neal PS I'm -0 on this proposal. And I dislike the name interp. From pinard@iro.umontreal.ca Mon Jun 24 05:02:35 2002 From: pinard@iro.umontreal.ca (François Pinard) Date: 24 Jun 2002 00:02:35 -0400 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions In-Reply-To: <15638.25578.254353.531473@anthem.wooz.org> References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org> <15638.25578.254353.531473@anthem.wooz.org> Message-ID: [Barry A. Warsaw] > FP> This is why the responsibilities between maintainers and > FP> programmers ought to be well split. > Unfortunately, sometimes one person has to wear both hats and then we > see the tension between the roles. I have the same experience, having been for a good while the assigned French translator for the packages I was maintaining. 
But I was splitting my roles rather carefully, with the precise purpose of seeing where tensions and problems lay, and then working at improving how interactions go between the involved parties. > >> I18n'ing a program means you have to worry about a lot more > >> things. [...] > FP> Internationalisation should not add a significant burden on > FP> the programmer. > It may not be a significant burden, once the infrastructure is in > place and a rhythm is established, but it is still not zero. The Mailman effort has been especially courageous, as it ought to address many problems on which we have not yet accumulated much experience, but which are inescapable in the long run. For example, I guess you had to take care of translating external HTML templates, considering some input aspects, allowing on-the-fly language selection, and of course, looking into more prosaic non-message "locale" concerns. -- François Pinard http://www.iro.umontreal.ca/~pinard From barry@zope.com Mon Jun 24 05:28:50 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 00:28:50 -0400 Subject: [Python-Dev] I18n'ing a Python program (was Re: PEP 292, Simpler String Substitutions) References: <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com> <3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org> <15638.25578.254353.531473@anthem.wooz.org> Message-ID: <15638.40962.721153.934484@anthem.wooz.org> >>>>> "FP" == François Pinard writes: >> It may not be a significant burden, once the infrastructure is >> in place and a rhythm is established, but it is still not >> zero. FP> The Mailman effort has been especially courageous, as it ought FP> to address many problems on which we have not yet accumulated FP> much experience, but which are inescapable in the long run. 
FP> For example, I guess you had to take care of translating FP> external HTML templates, considering some input aspects, FP> allowing on-the-fly language selection, and of course, looking FP> into more prosaic non-message "locale" concerns. Thanks, I think it's been valuable experience -- I certainly have learned a lot! One of the most painful areas has in fact been the translating of HTML templates, specifically because a template file is far too coarse a granularity. When I want to add a new widget to a template, I can usually figure out where to add it in, say, the Spanish or French version, but it's nearly hopeless to try to add it to the Japanese version. :) Here, I hope Fred's, Stephan Richter's, and my efforts at i18n'ing Zope3's Page Templates will greatly improve things. It's early going but it feels right. It would mean you essentially have one version of the template but you'd mark it up to designate the translatable messages, and I think you'd end up integrating those with your Python source catalogs (but maybe in a different domain?). I'm not quite sure how that would translate to plaintext templates (e.g. for email messages). Input aspects are something neither MM nor Zope has (yet) adequately addressed. What I'm thinking of here are message footers in multiple languages or, say, a job description in multiple languages. We'll have to address these down the road. I've already mentioned the efforts in Zopeland for localizing non-message issues. On-the-fly language selection is something that I have had to deal with in MM, and Python's class-based gettext API is essential here, and works well. Zope3 and MM take slightly different u/i tacks, with Zope3 doing better browser language negotiation and MM allowing for explicit overrides in forms. Some combination of the two is probably where web-based applications want to head. 
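The class-based approach, roughly sketched (the domain name "example_app" and the "locale" directory below are hypothetical, and the NullTranslations fallback simply returns messages untranslated):

```python
import gettext

def get_translator(language):
    """Return a gettext function for one language, chosen per request.

    Unlike the process-global gettext.install(), keeping one translation
    object per language lets a long-running server switch languages
    on the fly for each user.
    """
    try:
        t = gettext.translation("example_app", "locale",
                                languages=[language])
    except OSError:
        # No catalog for this domain/language: fall back to the identity
        # translation so the English source strings are used as-is.
        t = gettext.NullTranslations()
    return t.gettext

_ = get_translator("fr")
print(_("Hello"))  # stays "Hello" until a French catalog is installed
```

Each request (or each list, in Mailman's case) can ask for its own translator without disturbing any other thread's language choice.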
now-to-make-time-to-finish-MM2.1-ly y'rs, -Barry From nhodgson@bigpond.net.au Mon Jun 24 09:22:14 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 24 Jun 2002 18:22:14 +1000 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D11B6F0.5000803@tismer.com><200206201746.g5KHkwH04175@odiug.zope.com><3D121EDB.6070501@tismer.com><15637.7385.966341.14847@anthem.wooz.org><000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org> Message-ID: <005201c21b58$3f6d39b0$3da48490@neil> [Doh! Forgot to send to the list as well - shouldn't try to use a computer when I have a cold] Barry A. Warsaw: > Trust me, I'm not. Then again, maybe it's just me, or my limited > experience w/ i18n'd source code, but being forced to pass in the > explicit bindings is a big burden in terms of maintainability and > readability. My main experience in internationalization has been in GUI apps where there is often a strong separation between the localizable static text and the variable text. In dialogs you often have: Static localized description: [Editable variable] In my editor SciTE, which currently has about 15 translations, only 9 of the 177 localizable strings are messages that require insertion of variables, and all of those require only one variable. Most of the strings are menu or dialog items. Maybe I'm just stingy with messages :-) On the largest sensibly internationalized project I have worked on (7 years old and with a maximum of 20 research/design/develop/test staff when I left), I would estimate that less than 50 messages required variable substitution. The amount of effort that went into ensuring that the messages were accurate, meaningful and understandable outweighed by several orders of magnitude any typing or reading work. 
Neil From s_lott@yahoo.com Mon Jun 24 15:40:49 2002 From: s_lott@yahoo.com (Steven Lott) Date: Mon, 24 Jun 2002 07:40:49 -0700 (PDT) Subject: "Julian" ambiguity (was Re: [Python-Dev] strptime recapped) In-Reply-To: Message-ID: <20020624144049.9418.qmail@web9603.mail.yahoo.com> Thanks for the amplification - that was precisely my point. When proposing that strptime() parse "Julian" dates, some more precise definition of Julian is required. --- John Machin wrote: > 21/06/2002 10:27:22 PM, Steven Lott wrote: > > > > >Generally, "Julian" dates are really just the day number > within > >a given year; this is a simple special case of the more > general > >(and more useful) approach that R-D use. > > > >See > >http://emr.cs.iit.edu/home/reingold/calendar-book/index.shtml > > > >for more information. > > > > AFAICT from perusing their book, R-D use the term > "julian-date" to mean a tuple (year, month, day) in the Julian > calendar. > The International Astro. Union uses "Julian date" to mean an > instant in time measured in days (and fraction thereof) since > noon on 1 January -4712 (Julian ("proleptic") calendar). See > for example > http://maia.usno.navy.mil/iauc19/iaures.html#B1 > > A "Julian day number" (or "JDN") is generally used to mean an > ordinal day number counting day 0 as Julian_calendar(-4712, 1, > 1) as above. Some folks use JDN to include the IAU's > instant-in-time. > > Some folks use "julian day" to mean a day within a year (range > 0-365 *or* 1-366 (all inclusive)). This terminology IMO should > be severely deprecated. The concept is best described as > something like "day of year", with a > specification of the origin (0 or 1) when appropriate. > > It is not clear from the first of your sentences quoted above > exactly what you are calling a "Julian date": (a) the tuple > (given_year, day_of_year) with calendar not specified or (b) > just day_of_year. However, either answer seems > IMO to be an inappropriate addition to the terminology > confusion. 
> > Cheers, > John > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev ===== -- S. Lott, CCP :-{) S_LOTT@YAHOO.COM http://www.mindspring.com/~slott1 Buccaneer #468: KaDiMa Macintosh user: drinking upstream from the herd. __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From lalo@laranja.org Mon Jun 24 17:07:19 2002 From: lalo@laranja.org (Lalo Martins) Date: Mon, 24 Jun 2002 13:07:19 -0300 Subject: [Python-Dev] New subscriber In-Reply-To: <20020624145315.8773.62199.Mailman@mail.python.org> References: <20020624145315.8773.62199.Mailman@mail.python.org> Message-ID: <20020624160719.GS25927@laranja.org> On Mon, Jun 24, 2002 at 10:53:15AM -0400, python-dev-request@python.org wrote: > If you are a new subscriber, please take the time to introduce yourself > briefly in your first post. Hmm, ok. My name is Fernando Martins, known as Lalo. I'm currently 27 and I live in Brazil. I've been a Python advocate since Bruce Perens introduced me to it in, what, '96. I've been working professionally with Zope - ranging from site building to training, from infrastructure hacking to consulting - since mid-99, when I selected it from a range of options due to the fact that it was in Python. In the course of zope training and consulting, I take every opportunity to give python courses and talks. I never subscribed to python-dev before because I was very involved in the Zope community and preferred to keep my mind out of lower-level stuff, but now I find there are lots of interesting things going on and I'd prefer to be a part of it. ;-) (Also, Zope is very cool but the web marketing can get tiresome - if I can find a way, I'm planning to retire from it at least in part and spend more time doing plain Python.) 
[]s, |alo +---- -- It doesn't bother me that people say things like "you'll never get anywhere with this attitude". In a few decades, it will make a good paragraph in my biography. You know, for a laugh. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! (I play RPG) http://www.eujogorpg.com.br/ Python Foundry Guide http://www.sf.net/foundry/python-foundry/ From barry@zope.com Mon Jun 24 19:04:31 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 14:04:31 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <20020619024806.GA7218@lilith.my-fqdn.de> <20020619203332.GA9758@gerg.ca> Message-ID: <15639.24367.371777.509082@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> No, library_dirs is for good old -L. AFAIK it works fine. GW> For -R (or equivalent) you need runtime_library_dirs. I'm not GW> sure if it works (or ever did). I think it's a question of GW> knowing what magic options to supply to each compiler. GW> Probably it works (worked) on Solaris, since for once Sun got GW> things right and supplied a simple, obvious, working GW> command-line option -- namely -R. runtime_library_dirs works perfectly for Linux and gcc, thanks. -Barry From barry@zope.com Mon Jun 24 19:29:07 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 14:29:07 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.52766.822003.689689@anthem.wooz.org> Message-ID: <15639.25843.562043.559385@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> barry@zope.com (Barry A. Warsaw) writes: >> Really? 
You know the path for the -R/--rpath flag, so all you >> need is the magic compiler-specific incantation, and distutils >> already (or /should/ already) knows that. MvL> Yes, but you don't know whether usage of -R is appropriate. MvL> If the installed library is static, -R won't be needed. And shouldn't hurt. MvL> If then the target directory recorded with -R happens to be MvL> on an unavailable NFS server at run-time (on a completely MvL> different network), you cannot import the library module MvL> anymore, which would otherwise work perfectly fine. Do people still use NFS servers to share programs? I thought big cheap disks and RPMs did away with all that. :) I believe that -R/-rpath adds directories to runtime search paths so if the NFS directory was unmounted, ld.so should still be able to locate the shared library through fallback means. That may fail too, but oh well. One issue on Solaris may be that -- according to the GNU ld docs -- the runtime search path will be built from the -L options which we're already passing, /unless/ -rpath is given, and this seems to be added to help with NFS mounted directories on the -L specified path. But since I'm proposing that the -rpath directory be the same as the -L path, I don't think it will make matters worse. MvL> We had big problems with recorded library directories over MvL> the years; at some point, the administrators decided to take MvL> the machine that had MvL> /usr/local/lib/gcc-lib/sparc-sun-solaris2.3/2.5.8 on it MvL> offline. They did not know that they would thus make vim MvL> inoperable, which happened to be compiled with LD_RUN_PATH MvL> pointing to that directory - even though no library was ever MvL> needed from that directory. Hmm. Was the problem that the NFS server was unresponsive, or that the directory was unmounted, but still searched? If the former, then maybe you do have a problem. 
I've experienced hangs over the years when NFS servers have been unresponsive (because the host was down and the nfs mount options weren't given to make this a soft error). I haven't used NFS in years though so my memory is rusty on the details. >> I disagree. While the sysadmin should probably fiddle with >> /etc/ld.so.conf when he installs BerkeleyDB, it's not >> documented in the Sleepycat docs, so it's entirely possible >> that they haven't done it. MvL> I'm not asking for the administrator to fiddle with MvL> ld.so.conf. Instead, I'm asking the administrator to fiddle with MvL> Modules/Setup. We've made it so easy to build a batteries-included Python that I think it would be unfortunate not to do better just because we fear that things /might/ go wrong in some strange environments. I think it's largely unnecessary to edit Modules/Setup these days, and since we /know/ that BerkeleyDB is installed in a funky location not usually on your ld.so path, I think we can take advantage of that to not require editing Modules/Setup in this case too. Our failure mode for bsddbmodule is so cryptic that it's very difficult to figure out why it's not available. I think this simple change to setup.py[1] would improve life for the average Python programmer. I'd be happy with a command line switch or envar to disable the -R/--rpath addition. Here's a compromise. If LD_RUN_PATH is set at all (regardless of value), don't add -R/--rpath. Or add a --without-rpath switch to configure. >> Note I'm not saying setting LD_RUN_PATH is the best approach, >> but it seemed like the most portable. I couldn't figure out if >> distutils knew what the right compiler-specific switches are >> (i.e. "-R dir" on Solaris cc if memory serves, and "-Xlinker >> -rpath -Xlinker dir" for gcc, and who knows what for other Unix >> or Windows compilers). MvL> LD_LIBRARY_PATH won't work for Windows compilers, either. To MvL> my knowledge, there is nothing equivalent on Windows. 
Someone else will have to figure out the problems for Windows source builders. I'd like to make life just a little easier for Linux and Unix users. I think this change will do that. -Barry

[1]
Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.95
diff -u -r1.95 setup.py
--- setup.py    21 Jun 2002 14:48:38 -0000      1.95
+++ setup.py    24 Jun 2002 18:03:06 -0000
@@ -510,12 +510,14 @@
         if dbinc == 'db_185.h':
             exts.append(Extension('bsddb', ['bsddbmodule.c'],
                                   library_dirs=[dblib_dir],
+                                  runtime_library_dirs=[dblib_dir],
                                   include_dirs=db_incs,
                                   define_macros=[('HAVE_DB_185_H',1)],
                                   libraries=[dblib]))
         else:
             exts.append(Extension('bsddb', ['bsddbmodule.c'],
                                   library_dirs=[dblib_dir],
+                                  runtime_library_dirs=[dblib_dir],
                                   include_dirs=db_incs,
                                   libraries=[dblib]))
         else:

From barry@zope.com Mon Jun 24 19:31:50 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 14:31:50 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.62272.946354.832044@localhost.localdomain> Message-ID: <15639.26006.812728.291668@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: BAW> I'm still having build trouble on my RH6.1 system, but maybe BAW> it's just too old to worry about (I /really/ need to upgrade BAW> one of these days ;). [errors snipped] SM> I think you might have to define another CPP macro. In my SM> post from last night about building dbmmodule.c I included | define_macros=[('HAVE_BERKDB_H',None), | ('DB_DBM_HSEARCH',None)], SM> in the Extension constructor. Maybe DB_DBM_HSEARCH is also SM> needed for older bsddb? I have no trouble building though. It must be an issue with this ancient RH6.1 system. It builds fine on Mandrake 8.1, and the time to dig into this would probably be better spent upgrading to a more modern Linux distro. 
But thanks, -Barry From barry@zope.com Mon Jun 24 19:34:15 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 14:34:15 -0400 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <15635.14235.79608.390983@beluga.mojam.com> Message-ID: <15639.26151.752521.415108@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: Greg> should we keep the existing bsddb around as oldbsddb for Greg> users in that situation? Martin> I don't think so; users could always extract the module Martin> from older distributions if they want to. SM> I would prefer the old version be moved to lib-old (or SM> Modules-old?). For people still running DB 2.x it shouldn't SM> be a major headache to retrieve. Modules/old/ probably. We wouldn't do anything with that directory except use it as a placeholder for old extension source, right? Do we care about preserving the cvs history for the current bsddbmodule.c? If so, we'll have to ask SF to do a cvs dance for us. It may not be worth it. -Barry From barry@zope.com Mon Jun 24 19:47:21 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Mon, 24 Jun 2002 14:47:21 -0400 Subject: [Python-Dev] Re: replacing bsddb with pybsddb's bsddb3 module References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <15635.14235.79608.390983@beluga.mojam.com> <20020621215444.GB30056@zot.electricrain.com> Message-ID: <15639.26937.866896.917152@anthem.wooz.org> >>>>> "GPS" == Gregory P Smith writes: GPS> This sounds good. Here's what I see on the plate to be done GPS> so far: GPS> 1) move the existing Modules/bsddbmodule.c to a new GPS> Modules-old or directory.

    mkdir Modules/old    (or Modules/extensions-old)
    mv Modules/bsddbmodule.c Modules/old

GPS> 2) create a new Lib/bsddb GPS> directory containing bsddb3/bsddb3/*.py from the pybsddb GPS> project. +1 GPS> 3) create a new Modules/bsddb directory containing GPS> bsddb3/src/* from the pybsddb project (the files should GPS> probably be renamed to _bsddbmodule.c and GPS> bsddbmoduleversion.h for consistent naming) I don't think you need to create a new directory under Modules for this; it's just two files. Probably Modules/_bsddbmodule.c and Modules/_bsddbversion.h are fine. Also, for backwards compatibility, won't Lib/bsddb/__init__.py need to do "from _bsddb import btopen, error, hashopen, rnopen"? GPS> 4) place the pybsddb setup.py in the Modules/bsddb directory, GPS> modifying it as needed. OR modify the top level setup.py GPS> to understand how to build the pybsddb module. (there is GPS> code in pybsddb's setup.py to locate the berkeleydb install GPS> and determine appropriate flags that should be cleaned up and GPS> carried on) How much of Skip's recent changes to setup.py can be retargeted for pybsddb? 
GPS> 5) modify the top level python setup.py to build the bsddb GPS> module as appropriate. 6) "everything else" including GPS> integrating documentation and pybsddb's large test suite. What to do about the test suite is a good question. pybsddb's is /much/ more extensive, and I wouldn't want to lose that, but I'm also not sure I'd want it to run during a normal regrtest. Here's an idea: leave test_bsddb as is and add pybsddb's as test_all_bsddb.py. Then add a "-u all_bsddb" to regrtest's resource flags. GPS> How do we want future bsddb module development to proceed? I GPS> envision it either taking place 100% under the python GPS> project, or taking place as it is now in the pybsddb project GPS> with patches being fed to the python project as desired? Any GPS> preferences? [i prefer to not maintain the code in two GPS> places myself (ie: do it all under the python project)] I'd like to see one more official release from the pybsddb project, since its cvs has some very useful additions (important bug fixes plus support for BerkeleyDB 4). Then move all development over to the Python project and let interested volunteers port critical patches back to the pybsddb project. If you add me as a developer on pybsddb.sf.net, I'll volunteer to help. 
-Barry From oren-py-l@hishome.net Mon Jun 24 21:01:40 2002 From: oren-py-l@hishome.net (Oren Tirosh) Date: Mon, 24 Jun 2002 23:01:40 +0300 Subject: [Python-Dev] PEP 294: Type Names in the types Module Message-ID: <20020624230140.B3555@hishome.net>

PEP: 294
Title: Type Names in the types Module
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/06/23 23:52:19 $
Author: oren at hishome.net (Oren Tirosh)
Status: Draft
Type: Standards track
Created: 19-Jun-2002
Python-Version: 2.3
Post-History:

Abstract

    This PEP proposes that symbols matching the type name should be
    added to the types module for all basic Python types in the types
    module:

        types.IntegerType -> types.int
        types.FunctionType -> types.function
        types.TracebackType -> types.traceback
        ...

    The long capitalized names currently in the types module will be
    deprecated. With this change the types module can serve as a
    replacement for the new module. The new module shall be deprecated
    and listed in PEP 4.

Rationale

    Using two sets of names for the same objects is redundant and
    confusing. In Python versions prior to 2.2 the symbols matching many
    type names were taken by the factory functions for those types. Now
    all basic types have been unified with their factory functions and
    therefore the type names are available to be consistently used to
    refer to the type object. Most types are accessible as either
    builtins or in the new module, but some types such as traceback and
    generator are only accessible through the types module, under names
    which do not match the type name. This PEP provides a uniform way to
    access all basic types under a single set of names.

Specification

    The types module shall pass the following test:

        import types
        for t in vars(types).values():
            if type(t) is type:
                assert getattr(types, t.__name__) is t

    The types 'class', 'instance method' and 'dict-proxy' have already
    been renamed to the valid Python identifiers 'classobj',
    'instancemethod' and 'dictproxy', making this possible. 
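    For illustration only (this is not the PEP's reference
    implementation, just a toy sketch of the proposed renaming), the
    aliases could be installed by giving every type in the module a
    second binding under its own __name__:

```python
import types

# Toy sketch: alias every type in the types module under its own
# __name__, e.g. types.TracebackType also becomes types.traceback.
for obj in list(vars(types).values()):
    if isinstance(obj, type):
        setattr(types, obj.__name__, obj)

# The module now passes the test given in the Specification above:
for obj in list(vars(types).values()):
    if isinstance(obj, type):
        assert getattr(types, obj.__name__) is obj
```

    Names like 'method-wrapper' that are not valid identifiers would
    still be reachable through getattr(), which is one reason the PEP
    notes the earlier renamings to valid identifiers.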
Backward compatibility Because of their widespread use it is not planned to actually remove the long names from the types module in some future version. However, the long names should be changed in documentation and library sources to discourage their use in new code. Reference Implementation A reference implementation is available in SourceForge patch #569328: http://www.python.org/sf/569328 Copyright This document has been placed in the public domain. From martin@v.loewis.de Mon Jun 24 21:05:13 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 24 Jun 2002 22:05:13 +0200 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15639.25843.562043.559385@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.52766.822003.689689@anthem.wooz.org> <15639.25843.562043.559385@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > MvL> If then the target directory recorded with -R happens to be > MvL> on an unavailable NFS server at run-time (on a completely > MvL> different network), you cannot import the library module > MvL> anymore, which would otherwise work perfectly fine. > > Do people still use NFS servers to share programs? I thought big > cheap disks and RPMs did away with all that. :) This was on Solaris, so no RPMs. > I believe that -R/-rpath adds directories to runtime search paths so > if the NFS directory was unmounted, ld.so should still be able to > locate the shared library through fallback means. That may fail too, > but oh well. Yes, but the startup time for the program increases dramatically - it has to wait for the dead NFS server to timeout. > One issue on Solaris may be that -- according to the GNU ld docs -- > the runtime search path will be built from the -L options which we're > already passing, /unless/ -rpath is given, and this seems to be added Where do the docs say that? 
I don't think this is the case, or ever was ... > to help with NFS mounted directories on the -L specified path. But > since I'm proposing that the -rpath directory be the same as the -L > path, I don't think it will make matters worse. Indeed, it wouldn't. > Hmm. Was the problem that the NFS server was unresponsive, or that > the directory was unmounted, but still searched? If the former, then > maybe you do have a problem. Yes, that was the problem. Even with soft mounting, it will still take time to timeout. > We've made it so easy to build a batteries-included Python that I > think it would be unfortunately not to do better just because we fear > that things /might/ go wrong in some strange environments. That is a reasonable argument, and I've been giving similar arguments in other cases, too, so I guess I should just stop complaining. > Here's a compromise. If LD_RUN_PATH is set at all (regardless of > value), don't add -R/--rpath. Or add a --without-rpath switch to > configure. I guess we don't need to compromise, and approach is *very* cryptic, so I'd rather avoid it. It looks like the current bsddb module is going to go away, anyway, so there is no need to tweak the current configuration that much. I don't know what the bsddb3 build procedure is, but any approach you come up with now probably needs to be redone after pybsddb3 integration. Regards, Martin From barry@zope.com Mon Jun 24 21:24:27 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 16:24:27 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.52766.822003.689689@anthem.wooz.org> <15639.25843.562043.559385@anthem.wooz.org> Message-ID: <15639.32763.103711.902632@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: >> Do people still use NFS servers to share programs? I thought >> big cheap disks and RPMs did away with all that. 
:) MvL> This was on Solaris, so no RPMs. I know, I was kind of joking. But even Solaris has pkg, though I don't know if it's in nearly as widespread use as Linux packages. >> I believe that -R/-rpath adds directories to runtime search >> paths so if the NFS directory was unmounted, ld.so should still >> be able to locate the shared library through fallback means. >> That may fail too, but oh well. MvL> Yes, but the startup time for the program increases MvL> dramatically - it has to wait for the dead NFS server to MvL> timeout. Yeah that would suck. I wonder if that would only affect imports of bsddb though since the Python executable itself wouldn't be linked w/-R. >> One issue on Solaris may be that -- according to the GNU ld >> docs -- the runtime search path will be built from the -L >> options which we're already passing, /unless/ -rpath is given, >> and this seems to be added MvL> Where do the docs say that? I don't think this is the case, MvL> or ever was ... It's in the GNU ld info page under Options: `-rpath DIR' [...] The `-rpath' option may also be used on SunOS. By default, on SunOS, the linker will form a runtime search patch out of all the `-L' options it is given. If a `-rpath' option is used, the runtime search path will be formed exclusively using the `-rpath' options, ignoring the `-L' options. This can be useful when using gcc, which adds many `-L' options which may be on NFS mounted filesystems. Reading it again now, it's not clear if "SunOS" also means "Solaris". >> We've made it so easy to build a batteries-included Python that >> I think it would be unfortunately not to do better just because >> we fear that things /might/ go wrong in some strange >> environments. MvL> That is a reasonable argument, and I've been giving similar MvL> arguments in other cases, too, so I guess I should just stop MvL> complaining. >> Here's a compromise. If LD_RUN_PATH is set at all (regardless >> of value), don't add -R/--rpath. 
Or add a --without-rpath >> switch to configure. MvL> I guess we don't need to compromise, and that approach is *very* MvL> cryptic, so I'd rather avoid it. Cool. I'll commit the change. MvL> It looks like the current bsddb module is going to go away, MvL> anyway, so there is no need to tweak the current MvL> configuration that much. I don't know what the bsddb3 build MvL> procedure is, but any approach you come up with now probably MvL> needs to be redone after pybsddb3 integration. I suspect we'll need /something/ like this once pybsddb's integrated, but I'll definitely be testing it once Greg does the integration. I doubt pybsddb's build process is going to just drop into place, and I suspect it'll actually be easier. Thanks, -Barry From bac@OCF.Berkeley.EDU Mon Jun 24 22:02:27 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 24 Jun 2002 14:02:27 -0700 (PDT) Subject: "Julian" ambiguity (was Re: [Python-Dev] strptime recapped) In-Reply-To: <20020624144049.9418.qmail@web9603.mail.yahoo.com> Message-ID: [Steven Lott] > Thanks for the amplification - that was precisely my point. > When proposing that strptime() parse "Julian" dates, some more > precise definition of Julian is required. [snip] strptime just follows strftime's definition of a Julian day, which is the day number within the year (Jan. 1 being day 1). It is out of my hands in terms of the definition of what type of Julian info strptime parses since I just follow the formats for strftime. But when Guido implements the new datetime type this argument will change, since both versions that he is considering do not include any Julian days or dates. There could be functions, though (mine could actually be used), that do calculate various Julian values, and those can be abundantly clear about what they return.
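[Editor's note: for concreteness, the strftime-style "Julian day" discussed above is just the day-of-year number (%j); a minimal sketch using the datetime module, which did not yet exist when this was written:]

```python
import datetime

# strftime's %j -- the "Julian day" that strptime parses -- is the
# day of the year, with January 1st counted as day 1.
d = datetime.date(2002, 6, 24)
print(d.timetuple().tm_yday)  # 175
print(d.strftime("%j"))       # 175
```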
-Brett From bac@OCF.Berkeley.EDU Mon Jun 24 22:17:08 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 24 Jun 2002 14:17:08 -0700 (PDT) Subject: [Python-Dev] New Subscriber Introduction Message-ID: Uh, I realize this is a little late, but I didn't thoroughly read the intro email for the list and so I didn't realize this was requested until Lalo sent his email. Anyway, better late than never. So my name is Brett Cannon. I am a recent graduate of the philosophy program here at UC Berkeley. I am taking a year off from school (and apparently employment =P) while I apply to grad school in hopes of pursuing a masters or doctorate in CS. I also hope to discover an area of CS that I love above all else during this year so that I can stop jumping between different areas of programming (I have the slight issue of wanting to be the best that I can be at everything, and this jumping around is not helping with that =). I discovered Python in Fall 2000, while trying to choose a language to use to teach myself OOP before I took my first CS course here at Cal. I have been using Python pretty exclusively since, except for CS coursework; Python spoiled me and didn't help when I had to use Scheme, Lisp, or Java in my classes. The only grumblings anyone has heard out of me as yet on this list are over my Python implementation of strptime. I do plan to stay on this list, though, even after this is resolved, and be as involved as I can on the list (which is going to be limited until I get off my rear and really dive into the C source). -Brett C.
From skip@pobox.com Mon Jun 24 23:48:44 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 24 Jun 2002 17:48:44 -0500 Subject: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15639.25843.562043.559385@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.52766.822003.689689@anthem.wooz.org> <15639.25843.562043.559385@anthem.wooz.org> Message-ID: <15639.41420.988942.868137@12-248-11-90.client.attbi.com> Just a quick note to let you all know you've completely lost me with all this -R stuff. If someone would like to implement this, the now-closed patch is at http://python.org/sf/553108 Just reopen it and assign it to yourself. A quick summary of this thread added to the bug report would probably be a good idea. Skip From skip@pobox.com Mon Jun 24 23:50:37 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 24 Jun 2002 17:50:37 -0500 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try In-Reply-To: <15639.26151.752521.415108@anthem.wooz.org> References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <15635.14235.79608.390983@beluga.mojam.com> <15639.26151.752521.415108@anthem.wooz.org> Message-ID: <15639.41533.776854.272767@12-248-11-90.client.attbi.com> SM> I would prefer the old version be moved to lib-old (or SM> Modules-old?). For people still running DB 2.x it shouldn't be a SM> major headache to retrieve. BAW> Modules/old/ probably. We wouldn't do anything with that directory BAW> except use it as a placeholder for old extension source, right? Sounds good to me. 
BAW> Do we care about preserving the cvs history for the current BAW> bsddbmodule.c? If so, we'll have to ask SF to do a cvs dance for BAW> us. It may not be worth it. I think it would be worthwhile. Alternatively, you could cvs remove it, then add it to Modules/old with a note to check the Attic for older revision notes. Skip From barry@zope.com Mon Jun 24 23:58:05 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 24 Jun 2002 18:58:05 -0400 Subject: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <15632.52766.822003.689689@anthem.wooz.org> <15639.25843.562043.559385@anthem.wooz.org> <15639.41420.988942.868137@12-248-11-90.client.attbi.com> Message-ID: <15639.41981.672105.879289@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> Just a quick note to let you all know you've completely lost SM> me with all this -R stuff. If someone would like to implement SM> this, the now-closed patch is at SM> http://python.org/sf/553108 SM> Just reopen it and assign it to yourself. A quick summary of SM> this thread added to the bug report would probably be a good SM> idea. Actually, I think we're now good to go, although we'll need to revisit this once Greg starts w/ the integration of pybsddb. Thanks Skip, -Barry From barry@zope.com Mon Jun 24 23:59:24 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Mon, 24 Jun 2002 18:59:24 -0400 Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <15635.14235.79608.390983@beluga.mojam.com> <15639.26151.752521.415108@anthem.wooz.org> <15639.41533.776854.272767@12-248-11-90.client.attbi.com> Message-ID: <15639.42060.171745.132635@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: BAW> Do we care about preserving the cvs history for the current BAW> bsddbmodule.c? If so, we'll have to ask SF to do a cvs dance BAW> for us. It may not be worth it. SM> I think it would be worthwhile. Alternatively, you could cvs SM> remove it, the add it to Modules/old with a note to check the SM> Attic for older revision notes. That would be fine with me. -Barry From mwh@python.net Tue Jun 25 00:09:38 2002 From: mwh@python.net (Michael Hudson) Date: 25 Jun 2002 00:09:38 +0100 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: Oren Tirosh's message of "Mon, 24 Jun 2002 23:01:40 +0300" References: <20020624230140.B3555@hishome.net> Message-ID: <2mvg888ea5.fsf@starship.python.net> Oren Tirosh writes: > Abstract > > This PEP proposes that symbols matching the type name should be > added to the types module for all basic Python types in the types > module: > > types.IntegerType -> types.int > types.FunctionType -> types.function > types.TracebackType -> types.traceback > ... > > The long capitalized names currently in the types module will be > deprecated. Um, can I be a little confused? 
If you are writing code that you know will be run in 2.2 and later, you write isinstance(obj, int) If you want to support 2.1 and so on, you write isinstance(obj, types.IntType) What would writing isinstance(obj, types.int) ever gain you except restricting execution to 2.3+? I mean, I don't have any real opinion *against* this pep, I just don't really see why anyone would care... Cheers, M. -- it's not that perl programmers are idiots, it's that the language rewards idiotic behavior in a way that no other language or tool has ever done -- Erik Naggum, comp.lang.lisp From lalo@laranja.org Tue Jun 25 00:18:45 2002 From: lalo@laranja.org (Lalo Martins) Date: Mon, 24 Jun 2002 20:18:45 -0300 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <2mvg888ea5.fsf@starship.python.net> References: <20020624230140.B3555@hishome.net> <2mvg888ea5.fsf@starship.python.net> Message-ID: <20020624231845.GT25927@laranja.org> Check the rationale: | Most types are accessible as either builtins or in the new module but some | types such as traceback and generator are only accssible through the types | module under names which do not match the type name. This PEP provides a | uniform way to access all basic types under a single set of names. []s, |alo +---- -- It doesn't bother me that people say things like "you'll never get anywhere with this attitude". In a few decades, it will make a good paragraph in my biography. You know, for a laugh. -- http://www.laranja.org/ mailto:lalo@laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! 
(I play RPG) http://www.eujogorpg.com.br/ Python Foundry Guide http://www.sf.net/foundry/python-foundry/ From greg@cosc.canterbury.ac.nz Tue Jun 25 02:07:15 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Jun 2002 13:07:15 +1200 (NZST) Subject: [Python-Dev] Re: String substitution: compile-time versus runtime In-Reply-To: Message-ID: <200206250107.NAA08919@s454.cosc.canterbury.ac.nz> pinard@iro.umontreal.ca: > I really, really think that with enough and proper care, Python > could be set so internationalisation of Python scripts is just > unobtrusive routine. There should not be one way to write Python when > one does not internationalise, and another different way to use it > when one internationalises. As long as you have a Turing-complete programming language available for constructing strings, there will always be ways to write code that defies any straightforward means of internationalisation. Or in other words, if internationalisation is a goal, you'll always have to keep it in mind when coding, one way or another. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From kevin@koconnor.net Tue Jun 25 02:33:18 2002 From: kevin@koconnor.net (Kevin O'Connor) Date: Mon, 24 Jun 2002 21:33:18 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code Message-ID: <20020624213318.A5740@arizona.localdomain> I often find myself needing priority queues in python, and I've finally broken down and written a simple implementation. Previously I've used sorted lists (via bisect) to get the job done, but the heap code significantly improves performance. There are C based implementations, but the effort of compiling in an extension often isn't worth the effort. I'm including the code here for everyone's amusement. 
Any chance something like this could make it into the standard python library? It would save a lot of time for lazy people like myself. :-) Cheers, -Kevin

def heappush(heap, item):
    pos = len(heap)
    heap.append(None)
    while pos:
        parentpos = (pos - 1) / 2
        parent = heap[parentpos]
        if item <= parent:
            break
        heap[pos] = parent
        pos = parentpos
    heap[pos] = item

def heappop(heap):
    endpos = len(heap) - 1
    if endpos <= 0:
        return heap.pop()
    returnitem = heap[0]
    item = heap.pop()
    pos = 0
    while 1:
        child2pos = (pos + 1) * 2
        child1pos = child2pos - 1
        if child2pos < endpos:
            child1 = heap[child1pos]
            child2 = heap[child2pos]
            if item >= child1 and item >= child2:
                break
            if child1 > child2:
                heap[pos] = child1
                pos = child1pos
                continue
            heap[pos] = child2
            pos = child2pos
            continue
        if child1pos < endpos:
            child1 = heap[child1pos]
            if child1 > item:
                heap[pos] = child1
                pos = child1pos
        break
    heap[pos] = item
    return returnitem

-- ------------------------------------------------------------------------ | Kevin O'Connor "BTW, IMHO we need a FAQ for | | kevin@koconnor.net 'IMHO', 'FAQ', 'BTW', etc. !" | ------------------------------------------------------------------------ From skip@pobox.com Tue Jun 25 02:53:49 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 24 Jun 2002 20:53:49 -0500 Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error Message-ID: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> I just noticed in the development docs that when a timeout on a socket occurs, socket.error is raised. I rather liked the idea that a different exception was raised for timeouts (I used Tim O'Malley's timeout_socket module). Making a TimeoutError exception a subclass of socket.error would be fine so you can catch it with existing code, but I could see recovering differently for a timeout as opposed to other possible errors:

    sock.settimeout(5.0)
    try:
        data = sock.recv(8192)
    except socket.TimeoutError:
        # maybe requeue the request
        ...
    except socket.error, codes:
        # some more drastic solution is needed
        ...

Skip From skip@pobox.com Tue Jun 25 03:00:49 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 24 Jun 2002 21:00:49 -0500 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <20020624213318.A5740@arizona.localdomain> References: <20020624213318.A5740@arizona.localdomain> Message-ID: <15639.52945.388250.264216@12-248-8-148.client.attbi.com> Kevin> I often find myself needing priority queues in python, and I've Kevin> finally broken down and written a simple implementation. Hmmm... I don't see a priority associated with items when you push them onto the queue in heappush(). This seems somewhat different than my notion of a priority queue. Seems to me that you could implement the type of priority queue I'm thinking of rather easily using a class that wraps a list of Queue.Queue objects. Am I missing something obvious? -- Skip Montanaro skip@pobox.com consulting: http://manatee.mojam.com/~skip/resume.html From kevin@koconnor.net Tue Jun 25 03:59:41 2002 From: kevin@koconnor.net (Kevin O'Connor) Date: Mon, 24 Jun 2002 22:59:41 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <15639.52945.388250.264216@12-248-8-148.client.attbi.com>; from skip@pobox.com on Mon, Jun 24, 2002 at 09:00:49PM -0500 References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> Message-ID: <20020624225941.A5798@arizona.localdomain> On Mon, Jun 24, 2002 at 09:00:49PM -0500, Skip Montanaro wrote: > > Kevin> I often find myself needing priority queues in python, and I've > Kevin> finally broken down and written a simple implementation. > > Hmmm... I don't see a priority associated with items when you push them > onto the queue in heappush(). This seems somewhat different than my notion > of a priority queue.
Hi Skip, I should have included a basic usage in my original email:

    >>> t = []; heappush(t, 10); heappush(t, 20); heappush(t, 15); heappush(t, 5)
    >>> print heappop(t), heappop(t), heappop(t), heappop(t)
    20 15 10 5

The binary heap has the property that pushing takes O(log n) time and popping takes O(log n) time. One may push in any order and a pop() always returns the greatest item in the list. I don't explicitly associate a priority with every item in the queue - instead I rely on the user having a __cmp__ operator defined on the items (if the default does not suffice). The same behavior can be obtained using sorted lists:

    >>> from bisect import insort
    >>> t = []; insort(t, 10); insort(t, 20); insort(t, 15); insort(t, 5)
    >>> print t.pop(), t.pop(), t.pop(), t.pop()
    20 15 10 5

But insort takes a lot more overhead on large lists. > Seems to me that you could implement the type of priority queue I'm thinking > of rather easily using a class that wraps a list of Queue.Queue objects. Am I > missing something obvious? Perhaps I am, because I do not see how one would use Queue.Queue efficiently for this task. Cheers, -Kevin -- ------------------------------------------------------------------------ | Kevin O'Connor "BTW, IMHO we need a FAQ for | | kevin@koconnor.net 'IMHO', 'FAQ', 'BTW', etc. !" | ------------------------------------------------------------------------ From zack@codesourcery.com Tue Jun 25 04:06:09 2002 From: zack@codesourcery.com (Zack Weinberg) Date: Mon, 24 Jun 2002 20:06:09 -0700 Subject: [Python-Dev] Improved tmpfile module Message-ID: <20020625030609.GD13729@codesourcery.com> Attached please find a rewritten and improved tmpfile.py. The major change is to make the temporary file names significantly harder to predict. This foils denial-of-service attacks, where a hostile program floods /tmp with files named @12345.NNNN to prevent process 12345 from creating any temp files.
It also makes the race condition inherent in tmpfile.mktemp() somewhat harder to exploit. I also implemented three new interfaces:

    (fd, name) = mkstemp(suffix="", binary=1):
        Creates a temporary file, returning both an OS-level file
        descriptor open on it and its name.  This is useful in
        situations where you need to know the name of the temporary
        file, but can't risk the race in mktemp.

    name = mkdtemp(suffix=""):
        Creates a temporary directory, without race.

    file = NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""):
        This is just the non-POSIX version of tmpfile.TemporaryFile()
        made available on all platforms, and with the .path attribute
        documented.  It provides a convenient way to get a temporary
        file with a name, that will be automatically deleted on close,
        and with a high-level file object associated with it.

Finally, I tore out a lot of the posix/not-posix conditionals, relying on the os module to provide open() and O_EXCL -- this should make all the recommended interfaces race-safe on non-posix systems, which they were not before. Comments? I would very much like to see something along these lines in 2.3; I have an application that needs to be reliable in the face of the aforementioned denial of service. Please note that I wound up removing all the top-level 'del foo' statements (cleaning up the namespace) as I could not figure out how to do them properly. I'm not a python guru. zw

"""Temporary files and filenames."""

import os
from errno import EEXIST
from random import Random

__all__ = [
    "TemporaryFile", "NamedTemporaryFile", # recommended (high level)
    "mkstemp", "mkdtemp",                  # recommended (low level)
    "mktemp", "gettempprefix",             # deprecated
    "tempdir", "template"                  # control
]

### Parameters that the caller may set to override the defaults.

tempdir = None

# _template contains an appropriate pattern for the name of each
# temporary file.
if os.name == 'nt':
    _template = '~%s~'
elif os.name in ('mac', 'riscos'):
    _template = 'Python-Tmp-%s'
else:
    _template = 'pyt%s' # better ideas?

### Recommended, user-visible interfaces.

_text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
if os.name == 'posix':
    _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
else:
    _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY

def mkstemp(suffix="", binary=1):
    """Function to create a named temporary file, with 'suffix' for its
    suffix.  Returns an OS-level handle to the file and the name, as a
    tuple.  If 'binary' is 1, the file is opened in binary mode,
    otherwise text mode (if this is a meaningful concept for the
    operating system in use).  In any case, the file is readable and
    writable only by the creating user, and executable by no one."""
    if binary:
        flags = _bin_openflags
    else:
        flags = _text_openflags

    while 1:
        name = _candidate_name(suffix)
        try:
            fd = os.open(name, flags, 0600)
            return (fd, name)
        except OSError, e:
            if e.errno == EEXIST:
                continue # try again
            raise

def mkdtemp(suffix=""):
    """Function to create a named temporary directory, with 'suffix'
    for its suffix.  Returns the name of the directory.  The directory
    is readable, writable, and searchable only by the creating user."""
    while 1:
        name = _candidate_name(suffix)
        try:
            os.mkdir(name, 0700)
            return name
        except OSError, e:
            if e.errno == EEXIST:
                continue # try again
            raise

class _TemporaryFileWrapper:
    """Temporary file wrapper

    This class provides a wrapper around files opened for temporary
    use.  In particular, it seeks to automatically remove the file
    when it is no longer needed.
    """

    # Cache the unlinker so we don't get spurious errors at shutdown
    # when the module-level "os" is None'd out.  Note that this must
    # be referenced as self.unlink, because the name TemporaryFileWrapper
    # may also get None'd out before __del__ is called.
    unlink = os.unlink

    def __init__(self, file, path):
        self.file = file
        self.path = path
        self.close_called = 0

    def close(self):
        if not self.close_called:
            self.close_called = 1
            self.file.close()
            self.unlink(self.path)

    def __del__(self):
        self.close()

    def __getattr__(self, name):
        file = self.__dict__['file']
        a = getattr(file, name)
        if type(a) != type(0):
            setattr(self, name, a)
        return a

def NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""):
    """Create a named temporary file, with 'suffix' for its suffix.
    It will automatically be deleted when it is closed.  Pass 'mode'
    and 'bufsize' to fdopen.  Returns a file object; the name of the
    file is accessible as file.path."""
    if 'b' in mode:
        binary = 1
    else:
        binary = 0
    (fd, name) = mkstemp(suffix, binary)
    file = os.fdopen(fd, mode, bufsize)
    return _TemporaryFileWrapper(file, name)

if os.name != 'posix':
    # A file cannot be unlinked while open, so TemporaryFile
    # degenerates to NamedTemporaryFile.
    TemporaryFile = NamedTemporaryFile
else:
    def TemporaryFile(mode='w+b', bufsize=-1, suffix=""):
        """Create a temporary file.  It has no name and will not
        survive being closed; the 'suffix' argument is ignored.  Pass
        'mode' and 'bufsize' to fdopen.  Returns a file object."""
        if 'b' in mode:
            binary = 1
        else:
            binary = 0
        (fd, name) = mkstemp(binary=binary)
        file = os.fdopen(fd, mode, bufsize)
        os.unlink(name)
        return file

### Deprecated, user-visible interfaces.

def mktemp(suffix=""):
    """User-callable function to return a unique temporary file name."""
    while 1:
        name = _candidate_name(suffix)
        if not os.path.exists(name):
            return name

def gettempprefix():
    """Function to calculate a prefix of the filename to use.

    This incorporates the current process id on systems that support
    such a notion, so that concurrent processes don't generate the
    same prefix.
    """
    global _template
    return (_template % `os.getpid()`) + '.'

### Threading gook.
try:
    from thread import allocate_lock
except ImportError:
    class _DummyMutex:
        def acquire(self):
            pass
        release = acquire
    def allocate_lock():
        return _DummyMutex()
    del _DummyMutex

_init_once_lock = allocate_lock()

def _init_once(var, constructor):
    """If 'var' is None, initialize it to the return value from
    'constructor'.  Do this exactly once, no matter how many threads
    call this routine.

    FIXME: How would I cause 'var' to be passed by reference to this
    routine, so that the caller can write simply
        _init_once(foo, make_foo)
    instead of
        foo = _init_once(foo, make_foo)
    ?"""

    # Check once outside the lock, so we can avoid acquiring it if
    # the variable has already been initialized.
    if var is not None:
        return var
    try:
        _init_once_lock.acquire()
        # Check again inside the lock, in case someone else got
        # here first.
        if var is None:
            var = constructor()
    finally:
        _init_once_lock.release()
    return var

### Internal routines and data.

_seq = None

def _candidate_name(suffix):
    """Return a candidate temporary name in 'tempdir' (global)
    ending with 'suffix'."""

    # We have to make sure that _seq and tempdir are initialized only
    # once, even in the presence of multiple threads of control.
    global _seq
    global tempdir
    _seq = _init_once(_seq, _RandomFilenameSequence)
    tempdir = _init_once(tempdir, _gettempdir)

    # Most of the work is done by _RandomFilenameSequence.
    return os.path.join(tempdir, _seq.get()) + suffix

class _RandomFilenameSequence:
    characters = ( "abcdefghijklmnopqrstuvwxyz"
                 + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                 + "0123456789-_" )

    def __init__(self):
        self.mutex = allocate_lock()
        self.rng = Random()

    def get(self):
        global _template
        # Only one thread can call into the RNG at a time.
        self.mutex.acquire()
        c = self.characters
        r = self.rng
        letters = ''.join([r.choice(c), r.choice(c), r.choice(c),
                           r.choice(c), r.choice(c), r.choice(c)])
        self.mutex.release()
        return (_template % letters)

# XXX This tries to be not UNIX specific, but I don't know beans about
# how to choose a temp directory or filename on MS-DOS or other
# systems so it may have to be changed...

# _gettempdir deduces whether a candidate temp dir is usable by
# trying to create a file in it, and write to it.  If that succeeds,
# great, it closes the file and unlinks it.  There's a race, though:
# the *name* of the test file it tries is the same across all threads
# under most OSes (Linux is an exception), and letting multiple threads
# all try to open, write to, close, and unlink a single file can cause
# a variety of bogus errors (e.g., you cannot unlink a file under
# Windows if anyone has it open, and two threads cannot create the
# same file in O_EXCL mode under Unix).  The simplest cure is to serialize
# calls to _gettempdir, which is done above in _candidate_name().
def _gettempdir():
    """Function to calculate the directory to use."""
    try:
        pwd = os.getcwd()
    except (AttributeError, os.error):
        pwd = os.curdir
    attempdirs = ['/tmp', '/var/tmp', '/usr/tmp', pwd]
    if os.name == 'nt':
        attempdirs.insert(0, 'C:\\TEMP')
        attempdirs.insert(0, '\\TEMP')
    elif os.name == 'mac':
        import macfs, MACFS
        try:
            refnum, dirid = macfs.FindFolder(MACFS.kOnSystemDisk,
                                             MACFS.kTemporaryFolderType, 1)
            dirname = macfs.FSSpec((refnum, dirid, '')).as_pathname()
            attempdirs.insert(0, dirname)
        except macfs.error:
            pass
    elif os.name == 'riscos':
        scrapdir = os.getenv('Wimp$ScrapDir')
        if scrapdir:
            attempdirs.insert(0, scrapdir)
    for envname in 'TMPDIR', 'TEMP', 'TMP':
        if os.environ.has_key(envname):
            attempdirs.insert(0, os.environ[envname])
    testfile = gettempprefix() + 'test'
    for dir in attempdirs:
        try:
            filename = os.path.join(dir, testfile)
            fd = os.open(filename, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0700)
            fp = os.fdopen(fd, 'w')
            fp.write('blat')
            fp.close()
            os.unlink(filename)
            del fp, fd
            return dir
        except IOError:
            pass
    msg = "Can't find a usable temporary directory amongst " + `attempdirs`
    raise IOError, msg

From skip@pobox.com Tue Jun 25 05:09:32 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Jun 2002 23:09:32 -0500
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624225941.A5798@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
	<15639.52945.388250.264216@12-248-8-148.client.attbi.com>
	<20020624225941.A5798@arizona.localdomain>
Message-ID: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>

    Kevin> I don't explicitly associate a priority with every item in the
    Kevin> queue - instead I rely on the user having a __cmp__ operator
    Kevin> defined on the items (if the default does not suffice).

That's what I missed.

    >> Seems to me that you could implement the type of priority queue I'm
    >> thinking of rather easily using a class that wraps a list of
    >> Queue.Queue objects.  Am I missing something obvious?
    Kevin> Perhaps I am, because I do not see how one would use Queue.Queue
    Kevin> efficiently for this task.

I don't know how efficient it would be, but I usually think that most
applications have a small, fixed set of possible priorities, like ("low",
"medium", "high") or ("info", "warning", "error", "fatal").  In this sort
of situation my initial inclination would be to implement a dict of Queue
instances which corresponds to the fixed set of priorities, something
like:

    import Queue

    class PriorityQueue:
        def __init__(self, priorities):
            self.queues = {}
            self.marker = Queue.Queue()
            self.priorities = priorities
            for p in priorities:
                self.queues[p] = Queue.Queue()

        def put(self, obj, priority):
            self.queues[priority].put(obj)
            self.marker.put(None)

        def get(self):
            dummy = self.marker.get()
            # at this point we know one of the queues has an entry for us;
            # scan from the highest priority (assumed to be listed last)
            # down to the lowest
            scan = list(self.priorities)
            scan.reverse()
            for p in scan:
                try:
                    return self.queues[p].get_nowait()
                except Queue.Empty:
                    pass

    if __name__ == "__main__":
        q = PriorityQueue(("low", "medium", "high"))
        q.put(12, "low")
        q.put(13, "high")
        q.put(14, "medium")
        print q.get()
        print q.get()
        print q.get()

Obviously this won't work if your set of priorities isn't fixed at the
outset, but I think it's pretty straightforward, and it should work in
multithreaded applications.  It will also work if for some reason you
want to queue up objects for which __cmp__ doesn't make sense.
Skip From oren-py-d@hishome.net Tue Jun 25 06:02:10 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 25 Jun 2002 01:02:10 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <2mvg888ea5.fsf@starship.python.net> References: <20020624230140.B3555@hishome.net> <2mvg888ea5.fsf@starship.python.net> Message-ID: <20020625050210.GA14749@hishome.net> On Tue, Jun 25, 2002 at 12:09:38AM +0100, Michael Hudson wrote: > Oren Tirosh writes: > > > Abstract > > > > This PEP proposes that symbols matching the type name should be > > added to the types module for all basic Python types in the types > > module: > > > > types.IntegerType -> types.int > > types.FunctionType -> types.function > > types.TracebackType -> types.traceback > > ... > > > > The long capitalized names currently in the types module will be > > deprecated. > > Um, can I be a little confused? If you are writing code that you know > will be run in 2.2 and later, you write > > isinstance(obj, int) > > If you want to support 2.1 and so on, you write > > isinstance(obj, types.IntType) > > What would writing > > isinstance(obj, types.int) > > ever gain you except restricting execution to 2.3+? It's like asking what do you gain by using string methods instead of the string module. It's part of a slow, long-term effort to clean up the language while trying to minimize the impact on existing code. Oren From greg@cosc.canterbury.ac.nz Tue Jun 25 06:28:36 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Jun 2002 17:28:36 +1200 (NZST) Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com> Message-ID: <200206250528.RAA08943@s454.cosc.canterbury.ac.nz> Skip Montanaro : > I don't know how efficient it would be, but I usually think that most > applications have a small, fixed set of possible priorities Some applications of priority queues are like that, but others aren't -- e.g. 
an event queue in a discrete event simulation, where events are ordered
by time.  I expect that's the sort of application Kevin had in mind.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From fredrik@pythonware.com Tue Jun 25 07:27:41 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 08:27:41 +0200
Subject: [Python-Dev] PEP 294: Type Names in the types Module
References: <20020624230140.B3555@hishome.net>
	<2mvg888ea5.fsf@starship.python.net>
	<20020625050210.GA14749@hishome.net>
Message-ID: <008c01c21c11$6b529070$ced241d5@hagrid>

Oren Tirosh wrote:

> > What would writing
> >
> > isinstance(obj, types.int)
> >
> > ever gain you except restricting execution to 2.3+?
>
> It's like asking what do you gain by using string methods instead of the
> string module.

no, it's not.  it's not like that at all.

as michael pointed out, we've already added a *third* way to access
type objects in 2.2.  you're adding a *fourth* way.

string methods were added at a time when Python went from one to two
different string types; they solved a real implementation problem.
reducing/eliminating the need for the string module was a side effect.

> It's part of a slow, long-term effort to clean up the language
> while trying to minimize the impact on existing code.

or as likely, part of a slow, long-term effort to make Python totally
unusable for any serious software engineering...

"who cares about timtowtdi?  we add a new one every week!"

"we know what's better for you.  you don't."

"deprecation guaranteed!"
(etc)

From oren-py-d@hishome.net Tue Jun 25 07:52:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 02:52:03 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID: <20020625065203.GA27183@hishome.net>

On Mon, Jun 24, 2002 at 09:33:18PM -0400, Kevin O'Connor wrote:
> I often find myself needing priority queues in python, and I've finally
> broken down and written a simple implementation.  Previously I've used
> sorted lists (via bisect) to get the job done, but the heap code
> significantly improves performance.  There are C based implementations,
> but the effort of compiling in an extension often isn't worth the
> effort.  I'm including the code here for everyone's amusement.
>
> Any chance something like this could make it into the standard python
> library?  It would save a lot of time for lazy people like myself. :-)

A sorted list is a much more general-purpose data structure than a
priority queue and can be used to implement a priority queue.  It offers
almost the same asymptotic performance:

sorted list using splay tree (amortized):
  insert: O(log n)
  pop:    O(log n)
  peek:   O(log n)

priority queue using binary heap:
  insert: O(log n)
  pop:    O(log n)
  peek:   O(1)

The only advantage of a heap is O(1) peek which doesn't seem so critical.
It may also have somewhat better performance by a constant factor because
it uses an array rather than allocating node structures.  But the internal
order of a heap-based priority queue is very non-intuitive and quite
useless for other purposes while a sorted list is, umm..., sorted!

	Oren

From martin@v.loewis.de Tue Jun 25 08:04:38 2002
From: martin@v.loewis.de (Martin v.
Loewis)
Date: 25 Jun 2002 09:04:38 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15639.32763.103711.902632@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
	<15631.60841.28978.492291@anthem.wooz.org>
	<15632.52766.822003.689689@anthem.wooz.org>
	<15639.25843.562043.559385@anthem.wooz.org>
	<15639.32763.103711.902632@anthem.wooz.org>
Message-ID:

barry@zope.com (Barry A. Warsaw) writes:

>     `-rpath DIR'
> [...]
>
>          The `-rpath' option may also be used on SunOS.  By default, on
>          SunOS, the linker will form a runtime search path out of all
>          the `-L' options it is given.  If a `-rpath' option is used,
>          the runtime search path will be formed exclusively using the
>          `-rpath' options, ignoring the `-L' options.  This can be
>          useful when using gcc, which adds many `-L' options which may
>          be on NFS mounted filesystems.
>
> Reading it again now, it's not clear if "SunOS" also means "Solaris".

I see.  This is indeed SunOS 4 only.

Regards,
Martin

From martin@v.loewis.de Tue Jun 25 08:15:12 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 09:15:12 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID:

"Kevin O'Connor" writes:

> Any chance something like this could make it into the standard python
> library?  It would save a lot of time for lazy people like myself. :-)

I think this deserves a library PEP.  I would also recommend having a
separate heap and priority queue API, to avoid the kind of confusion
that Skip ran into.  Something like the C++ STL API might be
appropriate: the heap functions take a comparator function, on top of
which you offer both heapsort and priority queues.
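The STL-style split Martin describes above — heap primitives parameterized
by a comparison function, with heapsort layered on top — might be sketched
roughly as follows.  This is only an illustration of the idea; the function
names and signatures are invented here, not a proposed library API:

```python
def _lt_default(a, b):
    # default comparator: ordinary less-than
    return a < b

def heappush(heap, item, lt=_lt_default):
    """Add 'item' to the list 'heap', keeping the heap invariant."""
    heap.append(item)
    pos = len(heap) - 1
    # sift the new item up toward the root
    while pos > 0:
        parent = (pos - 1) >> 1
        if lt(heap[pos], heap[parent]):
            heap[pos], heap[parent] = heap[parent], heap[pos]
            pos = parent
        else:
            break

def heappop(heap, lt=_lt_default):
    """Remove and return the smallest item (per 'lt') from 'heap'."""
    last = heap.pop()
    if not heap:
        return last
    smallest, heap[0] = heap[0], last
    # sift the moved element down until both children are no smaller
    pos, n = 0, len(heap)
    while True:
        child = 2 * pos + 1
        if child >= n:
            break
        if child + 1 < n and lt(heap[child + 1], heap[child]):
            child += 1
        if lt(heap[child], heap[pos]):
            heap[pos], heap[child] = heap[child], heap[pos]
            pos = child
        else:
            break
    return smallest

def heapsort(seq, lt=_lt_default):
    """Return a new list of seq's items, ordered by 'lt'."""
    heap = []
    for x in seq:
        heappush(heap, x, lt)
    return [heappop(heap, lt) for _ in range(len(heap))]
```

A priority queue then falls out for free: push (priority, item) pairs, or
pass an `lt` that compares the items' priorities.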
The technical issues set aside, the main purpose of a library PEP is to
record a commitment from the author to maintain the module, with the
option of removing the module if the author runs away, and nobody takes
over.

Regards,
Martin

From martin@v.loewis.de Tue Jun 25 08:23:43 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 09:23:43 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625065203.GA27183@hishome.net>
References: <20020624213318.A5740@arizona.localdomain>
	<20020625065203.GA27183@hishome.net>
Message-ID:

Oren Tirosh writes:

> The only advantage of a heap is O(1) peek which doesn't seem so
> critical.  It may also have somewhat better performance by a
> constant factor because it uses an array rather than allocating node
> structures.  But the internal order of a heap-based priority queue
> is very non-intuitive and quite useless for other purposes while a
> sorted list is, umm..., sorted!

I think that the fact that heaps don't allocate additional memory is a
valuable property, more valuable than the asymptotic complexity (which
is also quite good).  If you don't want to build priority queues, you
can still use heaps to sort a list.

IMO, heaps are so standard as an algorithm that they belong in the
Python library, in some form.  It is then the user's choice to use that
algorithm or not.

Regards,
Martin

From aleax@aleax.it Tue Jun 25 08:30:43 2002
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 25 Jun 2002 09:30:43 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
References: <20020624213318.A5740@arizona.localdomain>
	<20020624225941.A5798@arizona.localdomain>
	<15639.60668.591252.466454@12-248-8-148.client.attbi.com>
Message-ID:

On Tuesday 25 June 2002 06:09 am, Skip Montanaro wrote:
	...
> I don't know how efficient it would be, but I usually think that most
> applications have a small, fixed set of possible priorities, like ("low",
> "medium", "high") or ("info", "warning", "error", "fatal").  In this sort

Then you do "bin sorting", of course -- always worth considering when you
know the sort key can only take a small number of different values (as is
the more general "radix sorting" when you have a few such keys, or a key
that easily breaks down that way).  But it IS rather a special case,
albeit an important one (and quite possibly frequently occurring in some
application areas).

Alex

From oren-py-d@hishome.net Tue Jun 25 09:09:29 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 04:09:29 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To:
References: <20020624213318.A5740@arizona.localdomain>
	<20020625065203.GA27183@hishome.net>
Message-ID: <20020625080929.GA39304@hishome.net>

On Tue, Jun 25, 2002 at 09:23:43AM +0200, Martin v. Loewis wrote:
> Oren Tirosh writes:
>
> > The only advantage of a heap is O(1) peek which doesn't seem so
> > critical.  It may also have somewhat better performance by a
> > constant factor because it uses an array rather than allocating node
> > structures.  But the internal order of a heap-based priority queue
> > is very non-intuitive and quite useless for other purposes while a
> > sorted list is, umm..., sorted!
>
> I think that heaps don't allocate additional memory is a valuable
> property, more valuable than the asymptotic complexity (which is also
> quite good).  If you don't want to build priority queues, you can still
> use heaps to sort a list.

When I want to sort a list I just use .sort().  I don't care which
algorithm is used.  I don't care whether dictionaries are implemented
using hash tables, some kind of tree structure or magic smoke.  I just
trust Python to use a reasonably efficient implementation.
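The "bin sorting" Alex mentions above, for keys drawn from a small fixed
set, can be sketched as follows (illustrative only; the helper's name and
signature are made up):

```python
def bin_sort(items, key, bins):
    """Order 'items', whose key(item) takes values only from 'bins',
    by bucketing into one list per bin and concatenating the buckets
    in the order 'bins' lists them.  O(n + number of bins), and stable
    within each bin."""
    buckets = {}
    for b in bins:
        buckets[b] = []
    for item in items:
        buckets[key(item)].append(item)
    result = []
    for b in bins:
        result.extend(buckets[b])
    return result
```

For example, log records tagged "fatal"/"error"/"warning"/"info" can be
ordered by severity in a single pass, with no comparisons between records
at all.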
I always find it funny when C++ or Perl programmers refer to an associative array as a "hash". > IMO, heaps are so standard as an algorithm that they belong into the > Python library, in some form. It is then the user's choice to use that > algorithm or not. Heaps are a "standard algorithm" only from a CS point of view. It doesn't have much to do with everyday programming. Let's put it this way: If Python has an extension module in the standard library implementing a sorted list, would you care enough about the specific binary heap implementation to go and write one or would you just use what you had in the library for a priority queue? ;-) Oren From mal@lemburg.com Tue Jun 25 09:12:00 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 25 Jun 2002 10:12:00 +0200 Subject: [Python-Dev] New Subscriber Introduction References: Message-ID: <3D1825D0.2070309@lemburg.com> Brett Cannon wrote: > The only grumblings anyone has heard out of me as yet on this list is over > my Python implementation of strptime. I do plan to stay on this list, > though, even after this is resolved and be as involved as I can on the > list (which is going to be limited until I get off my rear and really dive > into the C source). Just curious: have you taken a look at the mxDateTime parser ? It has a slightly different approach than strptime() but also takes a lot of load from the programmer in terms of not requiring a predefined format. 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/

From pf@artcom-gmbh.de Tue Jun 25 09:56:47 2002
From: pf@artcom-gmbh.de (Peter Funk)
Date: Tue, 25 Jun 2002 10:56:47 +0200 (CEST)
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <2mvg888ea5.fsf@starship.python.net> from Michael Hudson at "Jun 25, 2002 00:09:38 am"
Message-ID:

Hi,

> Oren Tirosh writes:
[...]
> > types.IntegerType -> types.int
> > types.FunctionType -> types.function
> > types.TracebackType -> types.traceback
> > ...
> >
> > The long capitalized names currently in the types module will be
> > deprecated.

Michael Hudson:
[...]
> I mean, I don't have any real opinion *against* this pep, I just don't
> really see why anyone would care...

I care and I've a strong opinion against this PEP and any other so
called "enhancement" which makes it harder or impossible to write Python
code *NOW* that covers a certain range of Python language
implementations.

The Python documentation advertises the 'types' module with the
following wording:

    """This module defines names for all object types that are used by
    the standard Python interpreter, [...]
    It is safe to use "from types import *" -- the module does not
    export any names besides the ones listed here.  New names exported
    by future versions of this module will all end in "Type". """

This makes promises about future versions of this module and the Python
language.  Breaking promises is in general a very bad idea and will do
serious harm to trustworthiness.

At the time of this writing the oldest Python version I have to support
is Python 1.5.2 and this will stay so until at least the end of year
2004.
So any attempts to deprecate often-used language features do no good
other than demotivating people to start using Python.

It would be possible to change the documentation of the types module now
and start telling users that the Python development team made up their
mind.  That would open up the possibility to really deprecate the module
or change the type names later (but only much much later!), without
causing the effect I called "version fatigue" lately here.

A look at http://www.python.org/dev/doc/devel/lib/module-types.html
showed that this didn't happen yet.  Sigh!

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)

From fredrik@pythonware.com Tue Jun 25 11:16:18 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 12:16:18 +0200
Subject: [Python-Dev] New Subscriber Introduction
References: <3D1825D0.2070309@lemburg.com>
Message-ID: <003f01c21c31$59cae6c0$0900a8c0@spiff>

mal wrote:

> Just curious: have you taken a look at the mxDateTime parser ?

is that an extension of the rfc822.parsedate approach?

> It has a slightly different approach than strptime() but also
> takes a lot of load from the programmer in terms of not requiring
> a predefined format.

if you're asking me, strptime is mostly useless in 99% of all practical
cases (even more useless than scanf).  but luckily, it's mostly harmless
as well...

From mal@lemburg.com Tue Jun 25 11:21:31 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 Jun 2002 12:21:31 +0200
Subject: [Python-Dev] New Subscriber Introduction
References: <3D1825D0.2070309@lemburg.com>
	<003f01c21c31$59cae6c0$0900a8c0@spiff>
Message-ID: <3D18442B.30609@lemburg.com>

Fredrik Lundh wrote:
> mal wrote:
>
>> Just curious: have you taken a look at the mxDateTime parser ?
>
> is that an extension of the rfc822.parsedate approach?

Yes, but it goes far beyond RFC822 style dates and times.
>> It has a slightly different approach than strptime() but also
>> takes a lot of load from the programmer in terms of not requiring
>> a predefined format.
>
> if you're asking me, strptime is mostly useless in 99% of all practical
> cases (even more useless than scanf).  but luckily, it's mostly harmless
> as well...

Agreed.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/

From Oleg Broytmann Tue Jun 25 11:40:31 2002
From: Oleg Broytmann (Oleg Broytmann)
Date: Tue, 25 Jun 2002 14:40:31 +0400
Subject: [Python-Dev] mxDatTime parser (was: New Subscriber Introduction)
In-Reply-To: <3D18442B.30609@lemburg.com>; from mal@lemburg.com on Tue, Jun 25, 2002 at 12:21:31PM +0200
References: <3D1825D0.2070309@lemburg.com>
	<003f01c21c31$59cae6c0$0900a8c0@spiff>
	<3D18442B.30609@lemburg.com>
Message-ID: <20020625144031.A11513@phd.pp.ru>

On Tue, Jun 25, 2002 at 12:21:31PM +0200, M.-A. Lemburg wrote:
>>> Just curious: have you taken a look at the mxDateTime parser ?
>>
>> is that an extension of the rfc822.parsedate approach?
>
> Yes, but it goes far beyond RFC822 style dates and times.

   >>> from mx import DateTime
   >>> dt = DateTime.DateTimeFrom("21/12/2002")
   >>> dt
   >>> dt = DateTime.DateTimeFrom("21/08/2002")
   >>> dt
   >>> dt = DateTime.DateTimeFrom("21-08-2002")
   >>> dt

I am not sure I understand the logic.  Because of this I always use ISO
date format (2002-08-21).

Oleg.
-- 
Oleg Broytmann     http://phd.pp.ru/     phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.

From mal@lemburg.com Tue Jun 25 11:50:32 2002
From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 25 Jun 2002 12:50:32 +0200 Subject: [Python-Dev] mxDatTime parser (was: New Subscriber Introduction) References: <3D1825D0.2070309@lemburg.com> <003f01c21c31$59cae6c0$0900a8c0@spiff> <3D18442B.30609@lemburg.com> <20020625144031.A11513@phd.pp.ru> Message-ID: <3D184AF8.8090300@lemburg.com> Oleg Broytmann wrote: > On Tue, Jun 25, 2002 at 12:21:31PM +0200, M.-A. Lemburg wrote: > >>>>Just curious: have you taken a look at the mxDateTime parser ? >>> >>>is that an extension of the rfc822.parsedate approach? >> >>Yes, but it goes far beyond RFC822 style dates and times. > > >>>>from mx import DateTime >>>>dt = DateTime.DateTimeFrom("21/12/2002") >>>>dt >>> > > >>>>dt = DateTime.DateTimeFrom("21/08/2002") >>>>dt >>> > > >>>>dt = DateTime.DateTimeFrom("21-08-2002") >>>>dt >>> > > > I am not sure I understand the logic. Because of this I always use ISO > date format (2002-08-21). The problem with the first two is that the parser parses date *and* time (it defaults to today for entries which are not found in the string; this can be changed though). The last one is parsed as ISO date (21-08-20), the trailing 02 is omitted. As you can see date parsing is very difficult, and even though the mxDateTime parser already recognizes tons of different formats, it doesn't always work. 
It is getting better with each release, though :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From Oleg Broytmann Tue Jun 25 11:55:54 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Tue, 25 Jun 2002 14:55:54 +0400 Subject: [Python-Dev] mxDatTime parser (was: New Subscriber Introduction) In-Reply-To: <3D184AF8.8090300@lemburg.com>; from mal@lemburg.com on Tue, Jun 25, 2002 at 12:50:32PM +0200 References: <3D1825D0.2070309@lemburg.com> <003f01c21c31$59cae6c0$0900a8c0@spiff> <3D18442B.30609@lemburg.com> <20020625144031.A11513@phd.pp.ru> <3D184AF8.8090300@lemburg.com> Message-ID: <20020625145553.B11513@phd.pp.ru> On Tue, Jun 25, 2002 at 12:50:32PM +0200, M.-A. Lemburg wrote: > As you can see date parsing is very difficult, and even though Too true, and that's why I never sent a complaint. > the mxDateTime parser already recognizes tons of different > formats, it doesn't always work. It is getting better with > each release, though :-) Thank you for the work! Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From oren-py-d@hishome.net Tue Jun 25 11:58:39 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 25 Jun 2002 06:58:39 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: References: <2mvg888ea5.fsf@starship.python.net> Message-ID: <20020625105839.GA58813@hishome.net> On Tue, Jun 25, 2002 at 10:56:47AM +0200, Peter Funk wrote: > """This module defines names for all object types that are used by > the standard Python interpreter, [...] > It is safe to use "from types import *" -- the module does not > export any names besides the ones listed here. New names exported > by future versions of this module will all end in "Type". 
""" Thanks for pointing this out! > It would be possible to change the documentation of types module now > and start telling users that the Python development team made up > their mind. That would open up the possibility to really deprecate > the module or change the type names later (but only much much later!), > without causing the effect I called "version fatigue" lately here. I don't understand exactly what you are suggesting here. Would you care to explain it more clearly? Oren From pf@artcom-gmbh.de Tue Jun 25 13:08:54 2002 From: pf@artcom-gmbh.de (Peter Funk) Date: Tue, 25 Jun 2002 14:08:54 +0200 (CEST) Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <20020625105839.GA58813@hishome.net> from Oren Tirosh at "Jun 25, 2002 06:58:39 am" Message-ID: Hi, Oren Tirosh: > On Tue, Jun 25, 2002 at 10:56:47AM +0200, Peter Funk wrote: > > """This module defines names for all object types that are used by > > the standard Python interpreter, [...] > > It is safe to use "from types import *" -- the module does not > > export any names besides the ones listed here. New names exported > > by future versions of this module will all end in "Type". """ > > Thanks for pointing this out! > > > It would be possible to change the documentation of types module now > > and start telling users that the Python development team made up > > their mind. That would open up the possibility to really deprecate > > the module or change the type names later (but only much much later!), > > without causing the effect I called "version fatigue" lately here. > > I don't understand exactly what you are suggesting here. Would you care to > explain it more clearly? A recent thread here on python-dev came to the conclusion to "silently deprecate" the standard library modules 'string' and 'types'. This silent deprecation nevertheless means, that these modules will go away at some future point in time. 
I don't like this decision, but I understand the reasoning and can now
only hope that this point in time lies very very far away in the future.

It is a reasonable expectation that source code written for a certain
version of a serious programming language remains valid for a *LONG*
period of time.  Backward compatibility is absolutely essential.

What I was trying to suggest is to change the documentation of the
Python language and library as early as possible, so that programmers
get a reasonable chance to become familiar with any upcoming new
situation.  Unfortunately this will not help for software which has
already been written and is in production.  If in 2004 certain Python
programs written in 2000 or earlier would start raising ImportError
exceptions on 'from types import *' after upgrading to a new system
which may come with the latest version of Python, this will certainly
cause damage.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)

From oren-py-d@hishome.net Tue Jun 25 14:24:24 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 16:24:24 +0300
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: ; from pf@artcom-gmbh.de on Tue, Jun 25, 2002 at 02:08:54PM +0200
References: <20020625105839.GA58813@hishome.net>
Message-ID: <20020625162424.A5762@hishome.net>

On Tue, Jun 25, 2002 at 02:08:54PM +0200, Peter Funk wrote:
> A recent thread here on python-dev came to the conclusion to
> "silently deprecate" the standard library modules 'string' and 'types'.
> This silent deprecation nevertheless means, that these modules will
> go away at some future point in time.  I don't like this decision,
> but I understand the reasoning and can now only hope, that this
> point in time lies very very far away in the future.

I don't like it very much either.
I prefer the string module to be silently deprecated "forever" without
any specific schedule for removal.  That's why the Backward
Compatibility section of this PEP says that "it is not planned to
actually remove the long names from the types module in some future
version."

I think that actually breaking backward compatibility should be reserved
for really obscure modules that virtually nobody uses any more.  Another
case is when the programs that will be broken were using somewhat
questionable programming practices in the first place (e.g.
lst.append(x,y) instead of lst.append((x,y)) or assignment to
__class__).

> Unfortunately this will not help for software, which has already been
> written and is in production.  If in 2004 certain Python programs
> written in 2000 or earlier would start raising ImportError exceptions
> on 'from types import *' after upgrading to a new system which may come
> with the latest version of Python, this will certainly cause damage.

In fact, reusing the types module instead of deprecating and eventually
removing it will ensure that no ImportError will be raised.  The new
types module will also serve as a retirement home for the long type
names where they can live comfortably and still be of some use to old
code instead of being evicted.

There is a problem though.  "from types import *" would import the
short names, too, overriding the builtins.  If you redefine int or str
you probably deserve it :-) but if you have an innocent variable called
"function" somewhere in your module it will get clobbered.  The
solution might be to include only the long type names in __all__.
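The effect of restricting __all__ this way can be demonstrated with a
throwaway module; the module name and its contents below are invented
purely for illustration:

```python
import sys
import types

# A module shaped the way the proposal above suggests: both long and
# short names defined, but only the long names listed in __all__.
mod = types.ModuleType("faketypes")
mod.IntegerType = int
mod.FunctionType = type(lambda: None)
mod.int = int                        # short name, deliberately unexported
mod.__all__ = ["IntegerType", "FunctionType"]
sys.modules["faketypes"] = mod

# "from faketypes import *" copies exactly the names listed in __all__.
ns = {}
exec("from faketypes import *", ns)

# Only the long names arrived; the short name 'int' did not come along
# to shadow the builtin (or clobber a local of the same name).
assert sorted(n for n in ns if not n.startswith("__")) == \
       ["FunctionType", "IntegerType"]

del sys.modules["faketypes"]         # clean up the fake module
```

Without an __all__, a star-import would instead copy every name not
starting with an underscore, which is exactly the clobbering hazard
being discussed.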
	Oren

From aahz@pythoncraft.com Tue Jun 25 14:38:07 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 25 Jun 2002 09:38:07 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To:
References: <20020625105839.GA58813@hishome.net>
Message-ID: <20020625133807.GA21633@panix.com>

On Tue, Jun 25, 2002, Peter Funk wrote:
>
> Unfortunately this will not help for software, which has already been
> written and is in production.  If in 2004 certain Python programs
> written in 2000 or earlier would start raising ImportError exceptions
> on 'from types import *' after upgrading to a new system which may come
> with the latest version of Python, this will certainly cause damage.

This can be solved by a combination of changing the documentation and
using __all__ (which I think is in part precisely the point of creating
__all__).  (To save people time, __all__ controls what names import *
uses; I think it was introduced in Python 2.1, but I'm not sure.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/
Project Vote Smart: http://www.vote-smart.org/

From pobrien@orbtech.com Tue Jun 25 14:53:28 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Tue, 25 Jun 2002 08:53:28 -0500
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <20020623181630.GN25927@laranja.org>
Message-ID:

[Lalo Martins]
>
> Now, this thing we're talking about is replacing parts of the string
> with other strings.  These strings may be the result of running some
> non-string objects through str(foo) - but, we are making no assumptions
> about these objects.  Just that str(foo) is somehow meaningful.  And,
> to my knowledge, there are no python objects for which str(foo) doesn't
> work.

I guess it depends on your definition of "work".  This can fail if foo
is an instance of a class with __str__ (or __repr__) having a bug or
raising an exception.  If foo is your own code you probably want it to
fail.
If foo is someone else's code you may have no choice but to work around it. :-( -- Patrick K. O'Brien Orbtech ----------------------------------------------- "Your source for Python software development." ----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From aahz@pythoncraft.com Tue Jun 25 14:45:04 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 25 Jun 2002 09:45:04 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> Message-ID: <20020625134504.GB21633@panix.com> On Tue, Jun 25, 2002, Martin v. Loewis wrote: > > IMO, heaps are so standard as an algorithm that they belong into the > Python library, in some form. It is then the user's choice to use that > algorithm or not. Should this PEP be split in two, then? One for a new "AbstractData" package (that would include the heap algorithm) and one for an update to Queue that would use some algorithm from AbstractData. The latter might not even need a PEP. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From skip@pobox.com Tue Jun 25 15:23:20 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 25 Jun 2002 09:23:20 -0500 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <20020625133807.GA21633@panix.com> References: <20020625105839.GA58813@hishome.net> <20020625133807.GA21633@panix.com> Message-ID: <15640.31960.526262.104242@12-248-8-148.client.attbi.com> >> If in 2004 certain Python programs written in 2000 or earlier would >> start raising ImportError exceptions on 'from types import *' after >> upgrading to a new system which may come with the latest version of >> Python, this will certainly cause damage. 
aahz> This can be solved by a combination of changing the documentation aahz> and using __all__ (which I think is in part precisely the point of aahz> creating __all__). I don't think __all__ would help here. The problem as I see it is that the docs say "from types import *" is safe. If you add new names to the types module, they would presumably be added to __all__ as well, and then "from types import *" could clobber local variables or hide globals or builtins the programmer didn't anticipate. So, if we add an object named "function" to the types module and Peter's stable code has a variable of the same name, it's possible that running on a new version of Python will introduce a bug. Still, I have to quibble with Peter's somewhat extreme example. If you take a stable system of the complexity of perhaps Linux or Windows and upgrade it four years later, Python compatibility will probably only be one of many problems raised by the upgrade. If you have a stable program, you try to leave it alone. That means not upgrading it. If you modify the environment the program runs in, you need to retest it. If you write in C you can minimize these problems through static linkage, but the problem with Python is no different than that of a program written in C which uses shared libraries. Names can move around (from one library to another) or new names can be added, giving rise to name conflicts. I seem to recall someone reporting recently about another shared library which defined an external symbol named "socket_init". 
Skip From aahz@pythoncraft.com Tue Jun 25 15:53:57 2002 From: aahz@pythoncraft.com (Aahz) Date: Tue, 25 Jun 2002 10:53:57 -0400 Subject: [Python-Dev] PEP 294: Type Names in the types Module In-Reply-To: <15640.31960.526262.104242@12-248-8-148.client.attbi.com> References: <20020625105839.GA58813@hishome.net> <20020625133807.GA21633@panix.com> <15640.31960.526262.104242@12-248-8-148.client.attbi.com> Message-ID: <20020625145357.GA6652@panix.com> On Tue, Jun 25, 2002, Skip Montanaro wrote: > > >> If in 2004 certain Python programs written in 2000 or earlier would > >> start raising ImportError exceptions on 'from types import *' after > >> upgrading to a new system which may come with the latest version of > >> Python, this will certainly cause damage. > > aahz> This can be solved by a combination of changing the documentation > aahz> and using __all__ (which I think is in part precisely the point of > aahz> creating __all__). > > I don't think __all__ would help here. The problem as I see it is that the > docs say "from types import *" is safe. If you add new names to the types > module, they would presumably be added to __all__ as well, and then "from > types import *" could clobber local variables or hide globals or builtins > the programmer didn't anticipate. The point is that we could change the docs -- but Peter would still have his problem with import * unless we also used __all__ to retain the old behavior. Overall, I agree with your point about upgrading applications four years old; I'm just suggesting a possible mechanism for minimizing damage. 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From gward@python.net Tue Jun 25 16:08:16 2002 From: gward@python.net (Greg Ward) Date: Tue, 25 Jun 2002 11:08:16 -0400 Subject: [Python-Dev] Improved tmpfile module In-Reply-To: <20020625030609.GD13729@codesourcery.com> References: <20020625030609.GD13729@codesourcery.com> Message-ID: <20020625150816.GA3660@gerg.ca> On 24 June 2002, Zack Weinberg said: > Attached please find a rewritten and improved tmpfile.py. The major > change is to make the temporary file names significantly harder to > predict. This foils denial-of-service attacks, where a hostile > program floods /tmp with files named @12345.NNNN to prevent process > 12345 from creating any temp files. It also makes the race condition > inherent in tmpfile.mktemp() somewhat harder to exploit. Oh, good! I've long wished that there was a tmpfile module written by someone who understands the security issues involved in generating temporary filenames and files. I hope you do... ;-) > (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file, > returning both an OS-level file descriptor open on it and its name. > This is useful in situations where you need to know the name of the > temporary file, but can't risk the race in mktemp. +1 except for the name. What does the "s" stand for? Unfortunately, I can't think of a more descriptive name offhand. > name = mkdtemp(suffix=""): Creates a temporary directory, without > race. How about calling this one mktempdir() ? > file = NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""): This is > just the non-POSIX version of tmpfile.TemporaryFile() made available > on all platforms, and with the .path attribute documented. It > provides a convenient way to get a temporary file with a name, that > will be automatically deleted on close, and with a high-level file > object associated with it. I've scanned your code and the existing tempfile.py. 
I don't understand why you rearranged things. Please explain why your arrangement of _TemporaryFileWrapper/TemporaryFile/NamedTemporaryFile is better than what we have. A few minor comments on the code...

> if os.name == 'nt':
>     _template = '~%s~'
> elif os.name in ('mac', 'riscos'):
>     _template = 'Python-Tmp-%s'
> else:
>     _template = 'pyt%s' # better ideas?

Why reveal the implementation language of the application creating these temporary names? More importantly, why do it on certain platforms, but not others?

> ### Recommended, user-visible interfaces.
>
> _text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
> if os.name == 'posix':
>     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL

Why not just "_bin_openflags = _text_openflags" ? That clarifies their equality on Unix.

> else:
>     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY

Why not "_bin_openflags = _text_openflags | os.O_BINARY" ?

> def mkstemp(suffix="", binary=1):
>     """Function to create a named temporary file, with 'suffix' for
>     its suffix. Returns an OS-level handle to the file and the name,
>     as a tuple. If 'binary' is 1, the file is opened in binary mode,
>     otherwise text mode (if this is a meaningful concept for the
>     operating system in use). In any case, the file is readable and
>     writable only by the creating user, and executable by no one."""

"Function to" is redundant. That docstring should probably look something like this:

    """Create a named temporary file.

    Create a named temporary file with 'suffix' for its suffix. Return
    a tuple (fd, name) where 'fd' is an OS-level handle to the file,
    and 'name' is the complete path to the file. If 'binary' is true,
    the file is opened in binary mode, otherwise text mode (if this is
    a meaningful concept for the operating system in use). In any
    case, the file is readable and writable only by the creating user,
    and executable by no one (on platforms where that makes sense).
    """

Hmmm: if suffix == ".bat", the file is executable on some platforms.
That last sentence still needs work.

> if binary: flags = _bin_openflags
> else: flags = _text_openflags

I dunno if the Python coding standards dictate this, but I prefer

    if binary:
        flags = _bin_openflags
    else:
        flags = _text_openflags

> class _TemporaryFileWrapper:
>     """Temporary file wrapper
>
>     This class provides a wrapper around files opened for temporary use.
>     In particular, it seeks to automatically remove the file when it is
>     no longer needed.
>     """

Here's where I started getting confused. I don't dispute that the existing code could stand some rearrangement, but I don't understand why you did it the way you did. Please clarify!

> ### Deprecated, user-visible interfaces.
>
> def mktemp(suffix=""):
>     """User-callable function to return a unique temporary file name."""
>     while 1:
>         name = _candidate_name(suffix)
>         if not os.path.exists(name):
>             return name

The docstring for mktemp() should state *why* it's bad to use this function -- otherwise people will say, "oh, this looks like it does what I need" and use it in ignorance. So should the library reference manual.

Overall I'm +1 on the idea of improving tempfile with an eye to security. +0 on implementation, mainly because I don't understand how your arrangement of TemporaryFile and friends is better than what we have.

Greg
--
Greg Ward - geek
gward@python.net http://starship.python.net/~gward/
What the hell, go ahead and put all your eggs in one basket.
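[Editorial note: the race in mktemp() is the gap between checking that a name is free and later opening it; mkstemp() closes that gap by creating and opening the file in one atomic O_CREAT | O_EXCL step. A sketch against today's tempfile module, which provides this API:]

```python
import os
import tempfile

# mktemp() only *names* a file; another process can create that name
# between the exists() check and your open().  mkstemp() creates and
# opens the file atomically, so there is no window to exploit.
fd, path = tempfile.mkstemp(suffix=".tmp")
try:
    os.write(fd, b"no race here")
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)
finally:
    os.close(fd)
    os.unlink(path)
assert data == b"no race here"
```

The caller gets both the OS-level descriptor and the name, which is exactly the situation Zack's (fd, name) return value is designed for.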
From niemeyer@conectiva.com Tue Jun 25 16:09:51 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Tue, 25 Jun 2002 12:09:51 -0300 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com> References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> Message-ID: <20020625120951.B2207@ibook.distro.conectiva> > I don't know how efficient it would be, but I usually think that most > applications have a small, fixed set of possible priorities, like ("low", > "medium", "high") or ("info", "warning", "error", "fatal"). In this sort of > situation my initial inclination would be to implement a dict of Queue > instances which corresponds to the fixed set of priorities, something like: If priority queues were to be included, I'd rather add the necessary support in Queue to easily attach priority handling, if that's not already possible. Maybe adding a generic **kw parameter, and passing it to _put() could help a bit. The applications of a priority Queue I've used until now weren't able to use your approach. OTOH, there are many cases where you're right, and we could benefit from this. If it's of common sense that priority queues are that useful, we should probably add one or two subclasses of Queue in the Queue module (one with your approach and one with the more generic one). Otherwise, subclassing Queue is already easy enough, IMO (adding the **kw suggestion would avoid overloading put(), and seems reasonable to me). Thanks! 
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From tim.one@comcast.net Tue Jun 25 16:23:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 25 Jun 2002 11:23:28 -0400 Subject: [Python-Dev] Improved tmpfile module In-Reply-To: <20020625150816.GA3660@gerg.ca> Message-ID: [Greg Ward, to Zack Weinberg] > ../ > Overall I'm +1 on the idea of improving tempfile with an eye to > security. +0 on implementation, mainly because I don't understand how > your arrangement of TemporaryFile and friends is better than what we > have. -1 on the implementation here, because it didn't start with current CVS, so is missing important work that went into improving this module on Windows for 2.3. Whether spawned/forked processes inherit descriptors for "temp files" is also a security issue that's addressed in current CVS but seemed to have gotten dropped on the floor here. A note on UI: for many programmers, "it's a feature" that temp file names contain the pid. I don't think we can get away with taking that away no matter how stridently someone claims it's bad for us . From fredrik@pythonware.com Tue Jun 25 16:26:57 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 25 Jun 2002 17:26:57 +0200 Subject: [Python-Dev] Improved tmpfile module References: <20020625030609.GD13729@codesourcery.com> <20020625150816.GA3660@gerg.ca> Message-ID: <003f01c21c5c$c8d8de20$ced241d5@hagrid> Greg wrote: > > (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file, > > returning both an OS-level file descriptor open on it and its name. > > This is useful in situations where you need to know the name of the > > temporary file, but can't risk the race in mktemp. > > +1 except for the name. What does the "s" stand for? "safe"? or at least "safer"? unix systems usually have both "mktemp" and "mkstemp", but I think they're both deprecated under SUSv2 (use "tmpfile" instead). 
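[Editorial note: the SUSv2-style tmpfile() Fredrik mentions -- an anonymous temporary file with no name to race over -- corresponds to TemporaryFile in today's tempfile module. A short sketch:]

```python
import tempfile

# TemporaryFile has no visible name on POSIX (the file is unlinked as
# soon as it is created), so mktemp-style name races have nothing to
# attack.  The file vanishes automatically when closed.
with tempfile.TemporaryFile(mode="w+b") as f:
    f.write(b"scratch data")
    f.seek(0)
    contents = f.read()
assert contents == b"scratch data"
```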
From fredrik@pythonware.com Tue Jun 25 16:33:55 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 17:33:55 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> <20020625120951.B2207@ibook.distro.conectiva>
Message-ID: <004d01c21c5d$b96c6460$ced241d5@hagrid>

Gustavo Niemeyer wrote:
> If priority queues were to be included, I'd rather add the necessary
> support in Queue to easily attach priority handling, if that's not
> already possible.

it takes a whopping four lines of code, if you're a pragmatic programmer:

    #
    # implementation

    import Queue, bisect

    class PriorityQueue(Queue.Queue):
        def _put(self, item):
            bisect.insort(self.queue, item)

    #
    # usage

    queue = PriorityQueue(0)
    queue.put((2, "second"))
    queue.put((1, "first"))
    queue.put((3, "third"))

    priority, value = queue.get()

From bernie@3captus.com Tue Jun 25 16:29:34 2002
From: bernie@3captus.com (Bernard Yue)
Date: Tue, 25 Jun 2002 09:29:34 -0600
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
Message-ID: <3D188C5D.D519DD90@3captus.com>

Skip Montanaro wrote:
>
> I just noticed in the development docs that when a timeout on a socket
> occurs, socket.error is raised. I rather liked the idea that a different
> exception was raised for timeouts (I used Tim O'Malley's timeout_socket
> module). Making a TimeoutError exception a subclass of socket.error would
> be fine so you can catch it with existing code, but I could see recovering
> differently for a timeout as opposed to other possible errors:
>
> sock.settimeout(5.0)
> try:
>     data = sock.recv(8192)
> except socket.TimeoutError:
>     # maybe requeue the request
>     ...
> except socket.error, codes:
>     # some more drastic solution is needed
>     ...
>

+1 on your suggestion. Anyway, under windows, the current implementation returns an incorrect socket.error code for timeout. I am working on the test suite as well as a fix for the problem found. Once the code is bug free maybe we can put the TimeoutError in. I will leave it to Guido for the approval of the change, when he comes back from his holiday.

Bernie

> Skip
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev

From niemeyer@conectiva.com Tue Jun 25 17:02:16 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 25 Jun 2002 13:02:16 -0300
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <004d01c21c5d$b96c6460$ced241d5@hagrid>
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> <20020625120951.B2207@ibook.distro.conectiva> <004d01c21c5d$b96c6460$ced241d5@hagrid>
Message-ID: <20020625130216.B1837@ibook.distro.conectiva>

> it takes a whopping four lines of code, if you're a pragmatic
> programmer:

Indeed. Using a tuple directly was a nice idea! I was thinking about a priority parameter (maybe I'm not that pragmatic? ;-), which is not hard as well, but one will have to overload the put method to pass the priority parameter.

    import Queue, bisect

    class PriorityQueue(Queue.Queue):
        def __init__(self, maxsize=0, defaultpriority=0):
            self.defaultpriority = defaultpriority
            Queue.Queue.__init__(self, maxsize)

        def put(self, item, block=1, **kw):
            if block:
                self.fsema.acquire()
            elif not self.fsema.acquire(0):
                raise Full
            self.mutex.acquire()
            was_empty = self._empty()
            # <- Priority could be handled here as well.
            self._put(item, **kw)
            if was_empty:
                self.esema.release()
            if not self._full():
                self.fsema.release()
            self.mutex.release()

        def _put(self, item, **kw):
            # <- But here seems better
            priority = kw.get("priority", self.defaultpriority)
            bisect.insort(self.queue, (priority, item))

        def _get(self):
            return self.queue.pop(0)[1]

--
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

From sholden@holdenweb.com Tue Jun 25 17:21:48 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Tue, 25 Jun 2002 12:21:48 -0400
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
Message-ID: <0b0d01c21c64$6a17b0c0$6300000a@holdenweb.com>

----- Original Message -----
From: "Skip Montanaro"
To:
Sent: Monday, June 24, 2002 9:53 PM
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error

> I just noticed in the development docs that when a timeout on a socket
> occurs, socket.error is raised. I rather liked the idea that a different
> exception was raised for timeouts (I used Tim O'Malley's timeout_socket
> module). Making a TimeoutError exception a subclass of socket.error would
> be fine so you can catch it with existing code, but I could see recovering
> differently for a timeout as opposed to other possible errors:
>
> sock.settimeout(5.0)
> try:
>     data = sock.recv(8192)
> except socket.TimeoutError:
>     # maybe requeue the request
>     ...
> except socket.error, codes:
>     # some more drastic solution is needed
>     ...
>

This seems logical: the timeout is inherently different, so a separate "except" seems better than having to analyze the reason for the socket error.
regards

-----------------------------------------------------------------------
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------

From fredrik@pythonware.com Tue Jun 25 18:03:41 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 19:03:41 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> <20020625120951.B2207@ibook.distro.conectiva> <004d01c21c5d$b96c6460$ced241d5@hagrid> <20020625130216.B1837@ibook.distro.conectiva>
Message-ID: <016e01c21c6a$4402cae0$ced241d5@hagrid>

Gustavo wrote:
> def put(self, item, block=1, **kw):
>     if block:
>         self.fsema.acquire()
>     elif not self.fsema.acquire(0):
>         raise Full
>     self.mutex.acquire()
>     was_empty = self._empty()
>     # <- Priority could be handled here as well.
>     self._put(item, **kw)
>     if was_empty:
>         self.esema.release()
>     if not self._full():
>         self.fsema.release()
>     self.mutex.release()
>
> def _put(self, item, **kw):
>     # <- But here seems better
>     priority = kw.get("priority", self.defaultpriority)
>     bisect.insort(self.queue, (priority, item))

or better:

    def put(self, item, block=1, priority=None):
        if priority is None:
            priority = self.defaultpriority
        Queue.Queue.put(self, (priority, item), block)

From martin@v.loewis.de Tue Jun 25 19:18:02 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 20:18:02 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625080929.GA39304@hishome.net>
References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> <20020625080929.GA39304@hishome.net>
Message-ID:

Oren Tirosh writes:
> When I want to sort a list I just use .sort().
I don't care which > algorithm is used. I don't care whether dictionaries are implemented > using hash tables, some kind of tree structure or magic smoke. I > just trust Python to use a reasonably efficient implementation. And nobody says you should think differently. > I always find it funny when C++ or Perl programmers refer to an > associative array as a "hash". I agree. > Heaps are a "standard algorithm" only from a CS point of view. It doesn't > have much to do with everyday programming. This has many different reasons: In the case of Python, the standard .sort is indeed good for most applications. In general (including Python), usage of heapsort is rare since it is difficult to implement and not part of the standard library. Likewise, the naive priority queue implementation is good in most cases. If it was more easy to use, I assume it would be used more often. > Let's put it this way: If Python has an extension module in the standard > library implementing a sorted list, would you care enough about the > specific binary heap implementation to go and write one or would you just > use what you had in the library for a priority queue? ;-) I don't understand this question: Why do I have to implement anything? Having heapsort in the library precisely means that I do not have to write an implementation. Regards, Martin From martin@v.loewis.de Tue Jun 25 19:19:23 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 25 Jun 2002 20:19:23 +0200 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <20020625134504.GB21633@panix.com> References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> <20020625134504.GB21633@panix.com> Message-ID: Aahz writes: > Should this PEP be split in two, then? One for a new "AbstractData" > package (that would include the heap algorithm) and one for an update to > Queue that would use some algorithm from AbstractData. The latter might > not even need a PEP. I don't know. 
The author of the PEP would have the freedom to propose anything initially. Depending on the proposal, people will comment, then reorganizations might be necessary.

Regards,
Martin

From bac@OCF.Berkeley.EDU Tue Jun 25 19:43:45 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 25 Jun 2002 11:43:45 -0700 (PDT)
Subject: [Python-Dev] New Subscriber Introduction
In-Reply-To: <3D1825D0.2070309@lemburg.com>
Message-ID:

[M.-A. Lemburg]
> Just curious: have you taken a look at the mxDateTime parser ?
>
> It has a slightly different approach than strptime() but also
> takes a lot of load from the programmer in terms of not requiring
> a predefined format.

No. I originally wrote strptime a year ago and it was initially just a hack. It has just been fleshed out by me over the past year. Just last month was when I realized how I could figure out all the locale info on my own after having taken a break from it. I also wanted to avoid any possible license issues so I just did it completely from scratch.

As for your comment about not requiring a predefined format, I don't quite follow what you mean. Looking at mxDateTime's strptime, the only difference in the possible parameters is the optional default for mxDateTime. Otherwise both mxDateTime's and my implementation have exactly the same parameter requirements:

    mxDateTime.strptime(string,format_string[,default])
    strptime.strptime(data_string, format)

with string == data_string and format_string == format.

-Brett C.

From mal@lemburg.com Tue Jun 25 19:56:04 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 Jun 2002 20:56:04 +0200
Subject: [Python-Dev] New Subscriber Introduction
References:
Message-ID: <3D18BCC4.4020407@lemburg.com>

Brett Cannon wrote:
> [M.-A. Lemburg]
>
>>Just curious: have you taken a look at the mxDateTime parser ?
>>
>>It has a slightly different approach than strptime() but also
>>takes a lot of load from the programmer in terms of not requiring
>>a predefined format.
>
> No.
I originally wrote strptime a year ago and it was initially just a > hack. It just has been fleshed out by me over the past year. Just last > month was when I realized how I could figure out all the locale info on my > own after having taken a break from it. I also wanted to avoid any > possible license issues so I just did completely from scratch. mxDateTime is part of egenix-mx-base which is covered by an open source license similar to that of Python (with less fuzz, though :-). > As for your comment about not requiring a predefined format, I don't quite > follow what you mean. Looking at mxDateTime's strptime, the only > difference in the possible parameters is the optional default for > mxDateTime. Otherwise both mxDateTime's and my implementation have > exactly the same parameter requirements: > mxDateTime.strptime(string,format_string[,default]) > strptime.strptime(data_string, format) > > with string == data_string and format_string == format. That's correct. I was refering to the mx.DateTime.Parser module, which implements several different date/time parsers. The basic interface is mx.DateTime.DateTimeFrom(string). No format string is required. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/ From bac@OCF.Berkeley.EDU Tue Jun 25 20:06:26 2002 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 25 Jun 2002 12:06:26 -0700 (PDT) Subject: [Python-Dev] New Subscriber Introduction In-Reply-To: <3D18BCC4.4020407@lemburg.com> Message-ID: [M.-A. Lemburg] > mxDateTime is part of egenix-mx-base which is covered > by an open source license similar to that of Python (with less > fuzz, though :-). > Good to know. > That's correct. I was refering to the mx.DateTime.Parser > module, which implements several different date/time parsers. 
> > The basic interface is mx.DateTime.DateTimeFrom(string). No format > string is required. Ah, OK. Well, that is handy, but since this is meant to be a drop-in replacement for strptime, I don't think it is warranted here. Perhaps something like that could be put into Python when Guido starts putting in new fxns for the forthcoming new datetime type? And I do agree that strptime is not need most of the time. But it is there so might as well fix that non-portable wart. -Brett C. From kevin@koconnor.net Tue Jun 25 23:07:59 2002 From: kevin@koconnor.net (Kevin O'Connor) Date: Tue, 25 Jun 2002 18:07:59 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <20020625065203.GA27183@hishome.net>; from oren-py-d@hishome.net on Tue, Jun 25, 2002 at 02:52:03AM -0400 References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> Message-ID: <20020625180759.B5798@arizona.localdomain> On Tue, Jun 25, 2002 at 02:52:03AM -0400, Oren Tirosh wrote: > > Any chance something like this could make it into the standard python > > library? It would save a lot of time for lazy people like myself. :-) > > A sorted list is a much more general-purpose data structure than a priority > queue and can be used to implement a priority queue. It offers almost the same > asymptotic performance: Hi Oren, I agree that some form of a balanced tree object would be more useful, but unfortunately it doesn't exist natively. A pure python implementation of heaps is a pretty straight-forward addition. If, however, one were to consider adding C code then I would agree a tree object would be more valuable. As you surmised later, I wouldn't have bothered with a heap if trees were available. In fact, I've always wondered why Python dictionaries use the hash algorithm instead of the more general binary tree algorithm. 
:-} -Kevin -- ------------------------------------------------------------------------ | Kevin O'Connor "BTW, IMHO we need a FAQ for | | kevin@koconnor.net 'IMHO', 'FAQ', 'BTW', etc. !" | ------------------------------------------------------------------------ From kevin@koconnor.net Tue Jun 25 23:26:06 2002 From: kevin@koconnor.net (Kevin O'Connor) Date: Tue, 25 Jun 2002 18:26:06 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>; from skip@pobox.com on Mon, Jun 24, 2002 at 11:09:32PM -0500 References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> Message-ID: <20020625182606.C5798@arizona.localdomain> On Mon, Jun 24, 2002 at 11:09:32PM -0500, Skip Montanaro wrote: > I don't know how efficient it would be, but I usually think that most > applications have a small, fixed set of possible priorities, like ("low", > "medium", "high") or ("info", "warning", "error", "fatal"). In this sort of > situation my initial inclination would be to implement a dict of Queue > instances which corresponds to the fixed set of priorities, something like: Hi Skip, The application I had in mind stored between 100,000-1,000,000 objects with priorities between 0-150. I found that moving from bisect to a heap improved performance of the entire program by about 25%. >It will also work if for some reason you want > to queue up objects for which __cmp__ doesn't make sense. I just assumed the user would use the (priority, data) tuple trick at the start (it does make the algorithm simpler). In a way, the code is very similar to the way the bisect module is implemented. -Kevin -- ------------------------------------------------------------------------ | Kevin O'Connor "BTW, IMHO we need a FAQ for | | kevin@koconnor.net 'IMHO', 'FAQ', 'BTW', etc. !" 
| ------------------------------------------------------------------------

From tim.one@comcast.net Wed Jun 26 04:58:24 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 25 Jun 2002 23:58:24 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625180759.B5798@arizona.localdomain>
Message-ID:

[Kevin O'Connor]
> ...
> In fact, I've always wondered why Python dictionaries use the hash
> algorithm instead of the more general binary tree algorithm. :-}

Speed. Zope and StandaloneZODB have a BTree package, which I've recently spent a good amount of time optimizing. Here's a timing driver:

"""
from time import clock as now

N = 1000000
indices = range(N)

def doit(constructor):
    d = constructor()
    t1 = now()
    for i in indices: d[i] = i
    t2 = now()
    for i in indices: assert d[i] == i
    t3 = now()
    for i in indices: del d[i]
    t4 = now()
    return t2-t1, t3-t2, t4-t3

def drive(constructor, n):
    print "Using", constructor.__name__, "on", N, "entries"
    for i in range(n):
        d1, d2, d3 = doit(constructor)
        print "construct %6.3f" % d1
        print "query %6.3f" % d2
        print "remove %6.3f" % d3

def dict():
    return {}

from BTrees.OOBTree import OOBTree

drive(OOBTree, 3)
drive(dict, 3)
"""

This is a little strained because I'm running it under Python 2.1.3. This favors the BTrees, because I also spent a lot of time optimizing Python's dicts for the Python 2.2 release; 2.1 doesn't have that stuff. OOBTrees are most similar to Python's dicts, mapping objects to objects. Here's a run:

Using OOBTree on 1000000 entries
construct 5.376
query 5.571
remove 4.065
construct 5.349
query 5.610
remove 4.211
construct 5.363
query 5.585
remove 4.374
Using dict on 1000000 entries
construct 1.411
query 1.336
remove 0.780
construct 1.382
query 1.335
remove 0.781
construct 1.376
query 1.334
remove 0.778

There's just no contest here.
BTrees have many other virtues, like supporting range searches, and automatically playing nice with ZODB persistence, but they're plain sluggish compared to dicts. To be completely fair and unfair at the same time , there are also 4 other flavors of Zope BTree, purely for optimization reasons. In particular, the IIBTree maps Python ints to Python ints, and does so by avoiding Python int objects altogether, storing C longs directly and comparing them at native "compare a long to a long" C speed. That's *almost* as fast as Python 2.1 int->int dicts (which endure all-purpose Python object comparison), except for deletion (the BTree spends a lot of time tearing apart all the tree pointers again). Now that's a perfectly height-balanced search tree that "chunks up" blocks of keys for storage and speed efficiency, and rarely needs more than a simple local adjustment to maintain balance. I expect that puts it at the fast end of what can be achieved with a balanced tree scheme. The ABC language (which Guido worked on before Python) used AVL trees for just about everything under the covers. It's not a coincidence that Python doesn't use balanced trees for anything . From tim.one@comcast.net Wed Jun 26 05:30:36 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 26 Jun 2002 00:30:36 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <20020625134504.GB21633@panix.com> Message-ID: [Aahz] > Should this PEP be split in two, then? One for a new "AbstractData" > package (that would include the heap algorithm) and one for an update to > Queue that would use some algorithm from AbstractData. The latter might > not even need a PEP. I'm chuckling, but to myself . By the time you add all the bells and whistles everyone may want out of "a priority queue", the interface gets so frickin' complicated that almost everyone will ignore the library and call bisect.insort() themself. 
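(A minimal sketch of the bisect.insort() idiom just mentioned; the class and method names below are invented for illustration and do not come from the thread:)

```python
import bisect

class SortedListPQ:
    """Priority queue over a sorted list, per the bisect.insort() idiom.

    insort keeps the list ordered, so insertion is O(n) worst case
    (elements shift right), while the smallest entry is always at index 0.
    """
    def __init__(self):
        self._items = []

    def put(self, priority, data):
        # (priority, data) tuples sort lexicographically by priority
        bisect.insort(self._items, (priority, data))

    def get(self):
        return self._items.pop(0)   # smallest priority first

pq = SortedListPQ()
pq.put(3, "low")
pq.put(1, "high")
pq.put(2, "medium")
assert pq.get() == (1, "high")
```

Note the tuple-trick caveat raised elsewhere in the thread: when two priorities are equal, comparison falls through to the data objects themselves.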
/F gives me a thread-safe Queue when I don't want to pay overheads for enforcing mutual exclusion in 99% of my priority-queue apps. Schemes that store (priority, object) tuples to exploit lexicographic comparison are convenient to code but a nightmare if priorities can ever be equal, and object comparison can raise exceptions, or object comparison can be expensive. Sometimes I want a min-queue, other times a max-queue. Sometimes I need efficient access to both ends. About a month ago I needed to write a priority queue that was especially efficient at adding thousands of new entries in one gulp. And so on. It's easier to write appropriate code from scratch in Python than to figure out how to *use* a package profligate enough to contain canned solutions for all common and reasonable use cases. People have been known to gripe at the number of methods Python's simple little lists and dicts have sprouted -- heh heh. BTW, the Zope BTree may be a good candidate to fold into Python. I'm not sure. It's a mountain of fairly sophisticated code with an interface so rich that it's genuinely hard to learn how to use it as intended -- the latter especially should appeal to just about everyone . From aahz@pythoncraft.com Wed Jun 26 05:40:20 2002 From: aahz@pythoncraft.com (Aahz) Date: Wed, 26 Jun 2002 00:40:20 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: References: <20020625134504.GB21633@panix.com> Message-ID: <20020626044020.GB11161@panix.com> On Wed, Jun 26, 2002, Tim Peters wrote: > [Aahz] >> >> Should this PEP be split in two, then? One for a new "AbstractData" >> package (that would include the heap algorithm) and one for an update to >> Queue that would use some algorithm from AbstractData. The latter might >> not even need a PEP. > > I'm chuckling, but to myself . 
By the time you add all the > bells and whistles everyone may want out of "a priority queue", the > interface gets so frickin' complicated that almost everyone will > ignore the library and call bisect.insort() themself. Fair enough -- but I didn't really know about bisect myself. Looking at the docs for bisect, it says that the code might be best used as a source code example. I think that having a package to dump similar kinds of code might be a good idea. It's not a substitute for a CS course, but... > And so on. It's easier to write appropriate code from scratch in Python > than to figure out how to *use* a package profligate enough to contain > canned solutions for all common and reasonable use cases. People have been > known to gripe at the number of methods Python's simple little lists and > dicts have sprouted -- heh heh. Actually, I was expecting that the Queue PEP would be dropped once the AbstractData package got some momentum behind it. I was just trying to be a tiny bit subtle. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From skip@pobox.com Wed Jun 26 06:23:13 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 26 Jun 2002 00:23:13 -0500 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <20020626044020.GB11161@panix.com> References: <20020625134504.GB21633@panix.com> <20020626044020.GB11161@panix.com> Message-ID: <15641.20417.403770.873433@12-248-8-148.client.attbi.com> aahz> Fair enough -- but I didn't really know about bisect myself. aahz> Looking at the docs for bisect, it says that the code might be aahz> best used as a source code example. I always forget about it as well. I just added /F's four-line PriorityQueue class as an example in the bisect docs and a "seealso" pointing at the bisect doc to the Queue module doc.
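(The heap code Kevin posted in this thread is, incidentally, the ancestor of what later shipped as the standard heapq module in Python 2.3. The (priority, data) tuple trick he mentions looks like this on top of heapq; the variable names here are invented:)

```python
import heapq

events = []                          # a plain list used as a binary min-heap
heapq.heappush(events, (2, "medium"))
heapq.heappush(events, (1, "high"))
heapq.heappush(events, (3, "low"))

assert events[0] == (1, "high")      # O(1) peek at the smallest entry
assert heapq.heappop(events) == (1, "high")   # push/pop are O(log n)
```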
Skip From python@rcn.com Wed Jun 26 07:37:17 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 26 Jun 2002 02:37:17 -0400 Subject: [Python-Dev] Xrange and Slices Message-ID: <000d01c21cdb$eb03b720$91d8accf@othello> Wild idea of the day: Merge the code for xrange() into slice(). So that old code will work, make the word 'xrange' a synonym for 'slice' >>> x = xrange(0,10,2) >>> s = slice(0,10,2) >>> [m for m in dir(x) if m not in dir(s)] ['__getitem__', '__iter__', '__len__'] >>> [m for m in dir(s) if m not in dir(x)] ['__cmp__', 'start', 'step', 'stop'] Raymond Hettinger 'regnitteh dnomyar'[::-1] From python@rcn.com Wed Jun 26 08:36:21 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 26 Jun 2002 03:36:21 -0400 Subject: [Python-Dev] Dict constructor Message-ID: <008101c21ce4$2b504fc0$91d8accf@othello> Second wild idea of the day: The dict constructor currently accepts sequences where each element has length 2, interpreted as a key-value pair. Let's have it also accept sequences with elements of length 1, interpreted as a key:None pair. The benefit is that it provides a way to rapidly construct sets: lowercase = dict('abcdefghijklmnopqrstuvwxyz') if char in lowercase: ... dict([key1, key2, key3, key1]).keys() # eliminate duplicate keys Raymond Hettinger 'regnitteh dnomyar'[::-1] From lellinghaus@yahoo.com Wed Jun 26 09:21:35 2002 From: lellinghaus@yahoo.com (Lance Ellinghaus) Date: Wed, 26 Jun 2002 01:21:35 -0700 (PDT) Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8 Message-ID: <20020626082135.16733.qmail@web20905.mail.yahoo.com> --0-68967167-1025079695=:16014 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hello everyone! I had to get forkpty() and openpty() working under Solaris 2.8 for a project I am working on. Here are the diffs to the 2.2.1 source file. Please let me know if anyone has any problems with this! 
Lance Ellinghaus ===== -- Lance Ellinghaus __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com --0-68967167-1025079695=:16014 Content-Type: text/plain; name="posixmodule.c.diff" Content-Description: posixmodule.c.diff Content-Disposition: inline; filename="posixmodule.c.diff" *** Python-2.2.1/Modules/posixmodule.c Tue Mar 12 16:38:31 2002 --- Python-2.2.1.new/Modules/posixmodule.c Tue May 21 01:16:29 2002 *************** *** 1904,1910 **** } #endif ! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) #ifdef HAVE_PTY_H #include <pty.h> #else --- 1904,1913 ---- } #endif ! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(sun) ! #ifdef sun ! #include <sys/stropts.h> ! #endif #ifdef HAVE_PTY_H #include <pty.h> #else *************** *** 1914,1920 **** #endif /* HAVE_PTY_H */ #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) */ ! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) static char posix_openpty__doc__[] = "openpty() -> (master_fd, slave_fd)\n\ Open a pseudo-terminal, returning open fd's for both master and slave end.\n"; --- 1917,1923 ---- #endif /* HAVE_PTY_H */ #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) */ ! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(sun) static char posix_openpty__doc__[] = "openpty() -> (master_fd, slave_fd)\n\ Open a pseudo-terminal, returning open fd's for both master and slave end.\n"; *************** *** 1925,1932 **** int master_fd, slave_fd; #ifndef HAVE_OPENPTY char * slave_name; #endif ! if (!PyArg_ParseTuple(args, ":openpty")) return NULL; --- 1928,1941 ---- int master_fd, slave_fd; #ifndef HAVE_OPENPTY char * slave_name; + #ifdef sun + void *sig_saved; #endif ! #endif ! #if !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY) && defined(sun) ! extern char *ptsname(); ! #endif !
if (!PyArg_ParseTuple(args, ":openpty")) return NULL; *************** *** 1933,1939 **** #ifdef HAVE_OPENPTY if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) return posix_error(); ! #else slave_name = _getpty(&master_fd, O_RDWR, 0666, 0); if (slave_name == NULL) return posix_error(); --- 1942,1948 ---- #ifdef HAVE_OPENPTY if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) return posix_error(); ! #elif HAVE__GETPTY slave_name = _getpty(&master_fd, O_RDWR, 0666, 0); if (slave_name == NULL) return posix_error(); *************** *** 1941,1946 **** --- 1950,1966 ---- slave_fd = open(slave_name, O_RDWR); if (slave_fd < 0) return posix_error(); + #else + master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY); /* open master */ + sig_saved = signal(SIGCHLD, SIG_DFL); + grantpt(master_fd); /* change permission of slave */ + unlockpt(master_fd); /* unlock slave */ + signal(SIGCHLD,sig_saved); + slave_name = ptsname(master_fd); /* get name of slave */ + slave_fd = open(slave_name, O_RDWR); /* open slave */ + ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ + ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm*/ + ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat*/ #endif /* HAVE_OPENPTY */ return Py_BuildValue("(ii)", master_fd, slave_fd); *************** *** 1948,1954 **** } #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) */ ! #ifdef HAVE_FORKPTY static char posix_forkpty__doc__[] = "forkpty() -> (pid, master_fd)\n\ Fork a new process with a new pseudo-terminal as controlling tty.\n\n\ --- 1968,1974 ---- } #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) */ ! 
#if defined(HAVE_FORKPTY) || defined(sun) static char posix_forkpty__doc__[] = "forkpty() -> (pid, master_fd)\n\ Fork a new process with a new pseudo-terminal as controlling tty.\n\n\ *************** *** 1959,1968 **** --- 1979,2067 ---- posix_forkpty(PyObject *self, PyObject *args) { int master_fd, pid; + #if defined(sun) + int slave; + char * slave_name; + void *sig_saved; + int fd; + #endif if (!PyArg_ParseTuple(args, ":forkpty")) return NULL; + #if defined(sun) + master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY); /* open master */ + sig_saved = signal(SIGCHLD, SIG_DFL); + grantpt(master_fd); /* change permission of slave */ + unlockpt(master_fd); /* unlock slave */ + signal(SIGCHLD,sig_saved); + slave_name = ptsname(master_fd); /* get name of slave */ + slave = open(slave_name, O_RDWR); /* open slave */ + ioctl(slave, I_PUSH, "ptem"); /* push ptem */ + ioctl(slave, I_PUSH, "ldterm"); /* push ldterm*/ + ioctl(slave, I_PUSH, "ttcompat"); /* push ttcompat*/ + if (master_fd < 0 || slave < 0) + { + return posix_error(); + } + switch (pid = fork()) { + case -1: + return posix_error(); + case 0: + /* First disconnect from the old controlling tty. */ + #ifdef TIOCNOTTY + fd = open("/dev/tty", O_RDWR | O_NOCTTY); + if (fd >= 0) { + (void) ioctl(fd, TIOCNOTTY, NULL); + close(fd); + } + #endif /* TIOCNOTTY */ + if (setsid() < 0) + return posix_error(); + + /* + * Verify that we are successfully disconnected from the controlling + * tty. + */ + fd = open("/dev/tty", O_RDWR | O_NOCTTY); + if (fd >= 0) { + return posix_error(); + close(fd); + } + /* Make it our controlling tty. */ + #ifdef TIOCSCTTY + if (ioctl(slave, TIOCSCTTY, NULL) < 0) + return posix_error(); + #endif /* TIOCSCTTY */ + fd = open(slave_name, O_RDWR); + if (fd < 0) { + return posix_error(); + } else { + close(fd); + } + /* Verify that we now have a controlling tty. 
*/ + fd = open("/dev/tty", O_WRONLY); + if (fd < 0) + return posix_error(); + else { + close(fd); + } + (void) close(master_fd); + (void) dup2(slave, 0); + (void) dup2(slave, 1); + (void) dup2(slave, 2); + if (slave > 2) + (void) close(slave); + pid = 0; + break; + default: + /* + * parent + */ + (void) close(slave); + } + #else pid = forkpty(&master_fd, NULL, NULL, NULL); + #endif if (pid == -1) return posix_error(); if (pid == 0) *************** *** 5607,5616 **** #ifdef HAVE_FORK {"fork", posix_fork, METH_VARARGS, posix_fork__doc__}, #endif /* HAVE_FORK */ ! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) {"openpty", posix_openpty, METH_VARARGS, posix_openpty__doc__}, #endif /* HAVE_OPENPTY || HAVE__GETPTY */ ! #ifdef HAVE_FORKPTY {"forkpty", posix_forkpty, METH_VARARGS, posix_forkpty__doc__}, #endif /* HAVE_FORKPTY */ #ifdef HAVE_GETEGID --- 5706,5715 ---- #ifdef HAVE_FORK {"fork", posix_fork, METH_VARARGS, posix_fork__doc__}, #endif /* HAVE_FORK */ ! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(sun) {"openpty", posix_openpty, METH_VARARGS, posix_openpty__doc__}, #endif /* HAVE_OPENPTY || HAVE__GETPTY */ ! #if defined(HAVE_FORKPTY) || defined(sun) {"forkpty", posix_forkpty, METH_VARARGS, posix_forkpty__doc__}, #endif /* HAVE_FORKPTY */ #ifdef HAVE_GETEGID --0-68967167-1025079695=:16014-- From sholden@holdenweb.com Wed Jun 26 11:12:54 2002 From: sholden@holdenweb.com (Steve Holden) Date: Wed, 26 Jun 2002 06:12:54 -0400 Subject: [Python-Dev] Asyncore/asynchat Message-ID: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com> I thought I might try to add appropriate module documentation for asynchat. This effective code doesn't get enough recognition (IMHO), partly because you are forced to read the code to understand how to use it. I notice that Sam Rushing's code tends to use spaces before the parentheses around argument lists. Should I think about cleaning up the code at the same time, or are we best letting sleeping dogs lie?
regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From barry@zope.com Wed Jun 26 14:21:24 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 26 Jun 2002 09:21:24 -0400 Subject: [Python-Dev] Dict constructor References: <008101c21ce4$2b504fc0$91d8accf@othello> Message-ID: <15641.49108.839568.721853@anthem.wooz.org> >>>>> "RH" == Raymond Hettinger writes: RH> Second wild idea of the day: RH> The dict constructor currently accepts sequences where each RH> element has length 2, interpreted as a key-value pair. RH> Let's have it also accept sequences with elements of length 1, RH> interpreted as a key:None pair. None might be an unfortunate choice because it would make dict.get() less useful. I'd prefer key:1 But of course it's fairly easy to construct either with a list comprehension: Python 2.2.1 (#1, May 31 2002, 18:34:35) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import string >>> abc = string.letters[:26] >>> dict([(c, 1) for c in abc]) {'a': 1, 'c': 1, 'b': 1, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'i': 1, 'h': 1, 'k': 1, 'j': 1, 'm': 1, 'l': 1, 'o': 1, 'n': 1, 'q': 1, 'p': 1, 's': 1, 'r': 1, 'u': 1, 't': 1, 'w': 1, 'v': 1, 'y': 1, 'x': 1, 'z': 1} >>> dict([(c, None) for c in abc]) {'a': None, 'c': None, 'b': None, 'e': None, 'd': None, 'g': None, 'f': None, 'i': None, 'h': None, 'k': None, 'j': None, 'm': None, 'l': None, 'o': None, 'n': None, 'q': None, 'p': None, 's': None, 'r': None, 'u': None, 't': None, 'w': None, 'v': None, 'y': None, 'x': None, 'z': None} pep-274-ly y'rs, -Barry From oren-py-d@hishome.net Wed Jun 26 14:27:18 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 26 Jun 2002 09:27:18 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <000d01c21cdb$eb03b720$91d8accf@othello> References: <000d01c21cdb$eb03b720$91d8accf@othello> Message-ID: <20020626132718.GA57665@hishome.net> On Wed, Jun 26, 2002 at 02:37:17AM -0400, Raymond Hettinger wrote: > Wild idea of the day: > Merge the code for xrange() into slice(). > So that old code will work, make the word 'xrange' a synonym for 'slice' Nice idea. Since xrange is the one more commonly used in everyday programming I'd say that slice should be an alias to xrange, not the other way around. The start, stop and step attributes to xrange would have to be revived (what was the idea behind removing them in the first place?) This would make it trivial to implement a __getitem__ that fully supports extended slice notation: class Spam: def __getitem__(self, index): if isinstance(index, xrange): return [self[i] for i in index] else: ...handle integer index Two strange things about xrange objects: >>> xrange(1,100,2) xrange(1, 101, 2) It's been there since at least Python 2.0. Hasn't anyone noticed this bug before? >>> dir(x) [] Shouldn't it have at least __class__, __repr__, etc and everything else that object has? 
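(Oren's Spam sketch can be made runnable with the slice objects Python actually passes to __getitem__, no xrange/slice merge needed. This is an illustrative reconstruction, not code from the thread:)

```python
class Spam:
    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() normalizes start/stop/step against the
            # sequence length, handling negative and omitted values
            indices = range(*index.indices(len(self._data)))
            return [self._data[i] for i in indices]
        return self._data[index]      # plain integer index

s = Spam("abcdef")
assert s[1:6:2] == ["b", "d", "f"]    # extended slice notation
```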
Oren From pobrien@orbtech.com Wed Jun 26 14:49:26 2002 From: pobrien@orbtech.com (Patrick K. O'Brien) Date: Wed, 26 Jun 2002 08:49:26 -0500 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <20020626132718.GA57665@hishome.net> Message-ID: [Oren Tirosh] > > Two strange things about xrange objects: > >>> xrange(1,100,2) > xrange(1, 101, 2) > > It's been there since at least Python 2.0. Hasn't anyone noticed this > bug before? > > >>> dir(x) > [] > Shouldn't it have at least __class__, __repr__, etc and everything else > that object has? What is x in your example? Assuming x == xrange, I get this with Python 2.2.1: >>> dir(xrange) ['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', '__getattribute__', '__hash__', '__init__', '__name__', '__new__', '__reduce__', '__repr__', '__self__', '__setattr__', '__str__'] Assuming x == xrange(1, 100, 2): >>> x = xrange(1, 100, 2) >>> dir(x) PyCrust-Shell:1: DeprecationWarning: xrange object's 'start', 'stop' and 'step' attributes are deprecated ['start', 'step', 'stop', 'tolist'] -- Patrick K. O'Brien Orbtech ----------------------------------------------- "Your source for Python software development." ----------------------------------------------- Web: http://www.orbtech.com/web/pobrien/ Blog: http://www.orbtech.com/blog/pobrien/ Wiki: http://www.orbtech.com/wiki/PatrickOBrien ----------------------------------------------- From walter@livinglogic.de Wed Jun 26 14:55:38 2002 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 26 Jun 2002 15:55:38 +0200 Subject: [Python-Dev] Dict constructor References: <008101c21ce4$2b504fc0$91d8accf@othello> <15641.49108.839568.721853@anthem.wooz.org> Message-ID: <3D19C7DA.3050509@livinglogic.de> Barry A. Warsaw wrote: >>>>>>"RH" == Raymond Hettinger writes: >>>>> > > RH> Second wild idea of the day: > > RH> The dict constructor currently accepts sequences where each > RH> element has length 2, interpreted as a key-value pair. 
> > RH> Let's have it also accept sequences with elements of length 1, > RH> interpreted as a key:None pair. > > None might be an unfortunate choice because it would make dict.get() > less useful. I'd prefer key:1 How about key:True ? Bye, Walter Dörwald From thomas.heller@ion-tof.com Wed Jun 26 15:19:37 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 26 Jun 2002 16:19:37 +0200 Subject: [Python-Dev] Dict constructor References: <008101c21ce4$2b504fc0$91d8accf@othello> Message-ID: <029a01c21d1c$80ad0e30$e000a8c0@thomasnotebook> > Second wild idea of the day: > > The dict constructor currently accepts sequences where each element has > length 2, interpreted as a key-value pair. > > Let's have it also accept sequences with elements of length 1, interpreted > as a key:None pair. > > The benefit is that it provides a way to rapidly construct sets: > The downside is that it's another way to write programs incompatible with 2.2. Thomas From David Abrahams" <20020625065203.GA27183@hishome.net> Message-ID: <1c4901c21d1b$e0511d50$6601a8c0@boostconsulting.com> Also, in case nobody has said so, worst-case performance for insertion into a large heap (log N) is much better than for insertion into a sorted list (N). Of course, in practice, it takes a really large heap to notice these effects. -Dave From: "Martin v. Loewis" > Oren Tirosh writes: > > > The only advantage of a heap is O(1) peek which doesn't seem so > > critical. It may also have somewhat better performance by a > > constant factor because it uses an array rather than allocating node > > structures. But the internal order of a heap-based priority queue > > is very non-intuitive and quite useless for other purposes while a > > sorted list is, umm..., sorted! > > I think that the fact that heaps don't allocate additional memory is a > valuable property, more valuable than the asymptotic complexity (which is also > quite good). If you don't want to build priority queues, you can still > use heaps to sort a list.
> > IMO, heaps are so standard as an algorithm that they belong into the > Python library, in some form. It is then the user's choice to use that > algorithm or not. > > Regards, > Martin From barry@zope.com Wed Jun 26 15:54:38 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 26 Jun 2002 10:54:38 -0400 Subject: [Python-Dev] Dict constructor References: <008101c21ce4$2b504fc0$91d8accf@othello> <15641.49108.839568.721853@anthem.wooz.org> <3D19C7DA.3050509@livinglogic.de> Message-ID: <15641.54702.652214.551556@anthem.wooz.org> >>>>> "WD" == Walter Dörwald writes: WD> How about key:True ? Kids today, always with the newfangled gadgets. :) -Barry From David Abrahams" Message-ID: <1d9601c21d25$d32a85d0$6601a8c0@boostconsulting.com> This is really interesting. When I was at Dragon (well, actually, after Tim left and it became L&H), I ported my natural language parsing/understanding system from Python to C++ so it could run quickly enough for embedded devices. The core of this system was an associative container, so I knew that its performance would be crucial. I used C++ generics which made it really easy to swap in different associative container implementations, and I tried lots, including the red-black tree containers built into most C++ implementations, and hash tables. My experience was that trying to come up with a hash function that would give a significant speed increase over the tree containers was extremely difficult, because it was really hard to come up with a good hash function. Furthermore, even if I succeeded, it was like black magic: it was inconsistent across my test cases and there was no way to understand why it worked well, and to get a feeling for how it would scale to problems outside those cases. I ended up hand-coding a two-level scheme based on binary searches in contiguous arrays which blew away anything I'd been able to do with a hash table.
My conclusion was that for general-purpose use, the red-black tree was pretty good, despite its relatively high memory overhead of 3 pointers per node: it places easy requirements on the user (supply a strict weak ordering) and provides predictable and smooth performance even asymptotically. On the other hand, hashing requires that the user supply both a hash function and an equality detector which must agree with one another, requires hand-tuning of the hash function for performance, and is rather more unpredictable. We've been talking about adding hash-based containers to the C++ standard library but I'm reluctant on these grounds. It seems to me that when you really care about speed, some kind of hand-coded solution might be a better investment than trying to come up with a good hash function. I'm ready to believe that hashing is the most appropriate choice for Python, but I wonder what makes the difference? -Dave From: "Tim Peters" > There's just no contest here. BTrees have many other virtues, like > supporting range searches, and automatically playing nice with ZODB > persistence, but they're plain sluggish compared to dicts. To be completely > fair and unfair at the same time , there are also 4 other flavors of > Zope BTree, purely for optimization reasons. In particular, the IIBTree > maps Python ints to Python ints, and does so by avoiding Python int objects > altogether, storing C longs directly and comparing them at native "compare a > long to a long" C speed. That's *almost* as fast as Python 2.1 int->int > dicts (which endure all-purpose Python object comparison), except for > deletion (the BTree spends a lot of time tearing apart all the tree pointers > again). > > Now that's a perfectly height-balanced search tree that "chunks up" blocks > of keys for storage and speed efficiency, and rarely needs more than a > simple local adjustment to maintain balance. I expect that puts it at the > fast end of what can be achieved with a balanced tree scheme.
> > The ABC language (which Guido worked on before Python) used AVL trees for > just about everything under the covers. It's not a coincidence that Python > doesn't use balanced trees for anything . > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > From skip@pobox.com Wed Jun 26 17:32:16 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 26 Jun 2002 11:32:16 -0500 Subject: [Python-Dev] Asyncore/asynchat In-Reply-To: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com> References: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com> Message-ID: <15641.60560.844503.239136@12-248-8-148.client.attbi.com> Steve> I thought I might try to add appropriate module documentation for Steve> asynchat. This effective code doesn't get enough recognition Steve> (IMHO), partly because you are forced to read the code to Steve> understand how to use it. That would be a great idea. Once I actually tried it, it was easy to work with, but the lack of documentation does steepen the initial learning curve a bit. Steve> I notice that Sam Rushing's code tends to use spaces before the Steve> parentheses around argument lists. Should I think about cleaning Steve> up the code at the same time, or are we best letting sleeping Steve> dogs lie? I would let this particular sleeping dog lie. I think the code in Python is occasionally sync'd with Sam's code. Changing the spacing would just add a bunch of spurious differences and thus make that task more difficult. Skip From tim@zope.com Wed Jun 26 17:36:16 2002 From: tim@zope.com (Tim Peters) Date: Wed, 26 Jun 2002 12:36:16 -0400 Subject: [Python-Dev] Asyncore/asynchat In-Reply-To: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com> Message-ID: [Steve Holden] > I thought I might try to add appropriate module documentation for > asynchat. Cool! 
> This effective code doesn't get enough recognition (IMHO), partly because > you are forced to read the code to understand how to use it. > > I notice that Sam Rushing's code tends to use spaces before the > parentheses around argument lists. Should I think about cleaning up the > code at the same time, or are we best letting sleeping dogs lie? You should feel free to clean it up, but not at the same time: clean the spaces in a distinct checkin dedicated to just that much, with a checkin comment like "Whitespace normalization". From jeremy@zope.com Wed Jun 26 17:36:23 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 26 Jun 2002 12:36:23 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: <008101c21ce4$2b504fc0$91d8accf@othello> References: <008101c21ce4$2b504fc0$91d8accf@othello> Message-ID: <15641.60807.674692.84314@slothrop.zope.com> >>>>> "RH" == Raymond Hettinger writes: RH> Second wild idea of the day: The dict constructor currently RH> accepts sequences where each element has length 2, interpreted RH> as a key-value pair. RH> Let's have it also accept sequences with elements of length 1, RH> interpreted as a key:None pair. That seems a little too magical to me. RH> Raymond Hettinger 'regnitteh dnomyar'[::-1] Then again it seems like you like magic! Jeremy From python@rcn.com Wed Jun 26 18:38:09 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 26 Jun 2002 13:38:09 -0400 Subject: [Python-Dev] Dict constructor References: <008101c21ce4$2b504fc0$91d8accf@othello> <15641.60807.674692.84314@slothrop.zope.com> Message-ID: <001801c21d38$3d4e2220$56ec7ad1@othello> From: "Jeremy Hylton" > RH> Second wild idea of the day: The dict constructor currently > RH> accepts sequences where each element has length 2, interpreted > RH> as a key-value pair. > > RH> Let's have it also accept sequences with elements of length 1, > RH> interpreted as a key:None pair. > > That seems a little too magical to me. Fair enough. 
> > RH> Raymond Hettinger 'regnitteh dnomyar'[::-1] > Then again it seems like you like magic! While I'm a fan of performance magic, a la the Magic Castle, the root of this suggestion is more mundane. There are too many pieces of code that test membership with 'if elem in container' where the container is not a dictionary. This results in O(n) performance rather than O(1). To fix it, I found myself writing the same code over and over again: def _toset(container): return dict([(elem, True) for elem in container]) This repeated dictionary construction exercise occurs in so many guises that it would be worthwhile to provide a fast, less magical looking approach. Being able to construct dictionaries with default values isn't exactly the most exotic idea ever proposed. IMO, it's clearer, faster, commonly needed, and easy to implement. 'nuff said, Raymond Hettinger From oren-py-d@hishome.net Wed Jun 26 19:30:59 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Wed, 26 Jun 2002 21:30:59 +0300 Subject: [Python-Dev] Xrange and Slices In-Reply-To: ; from pobrien@orbtech.com on Wed, Jun 26, 2002 at 08:49:26AM -0500 References: <20020626132718.GA57665@hishome.net> Message-ID: <20020626213059.A7500@hishome.net> On Wed, Jun 26, 2002 at 08:49:26AM -0500, Patrick K. O'Brien wrote: > What is x in your example? Assuming x == xrange, I get this with Python > 2.2.1: > > >>> dir(xrange) > ['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', > '__getattribute__', '__hash__', '__init__', '__name__', '__new__', > '__reduce__', '__repr__', '__self__', '__setattr__', '__str__'] > > Assuming x == xrange(1, 100, 2): > > >>> x = xrange(1, 100, 2) > >>> dir(x) > PyCrust-Shell:1: DeprecationWarning: xrange object's 'start', 'stop' and > 'step' attributes are deprecated > ['start', 'step', 'stop', 'tolist'] It's the latter (xrange instance, not the type). I'm getting an empty dir() in the latest CVS version. The result you got is what happens in 2.2 and 2.2.1. 
Oren From tim@zope.com Wed Jun 26 20:18:52 2002 From: tim@zope.com (Tim Peters) Date: Wed, 26 Jun 2002 15:18:52 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: <008101c21ce4$2b504fc0$91d8accf@othello> Message-ID: [Raymond Hettinger] > Second wild idea of the day: > > The dict constructor currently accepts sequences where each element has > length 2, interpreted as a key-value pair. > > Let's have it also accept sequences with elements of length 1, > interpreted as a key:None pair. -1 because of ambiguity. Is this trying to build a set with the single element (42, 666), or a mapping of 42 to 666? dict([(42, 666)]) The same dilemma but perhaps subtler: dict(["ab", "cd", "ef"]) > The benefit is that it provides a way to rapidly construct sets: I've got nothing against sets, but don't think we should push raw dicts any closer to supporting them directly than they already are. Better for someone to take over Greg Wilson's PEP to add a new set module; I also note that Zope/ZODB's BTree package supports BTree-based sets directly as a distinct (from BTree-based mappings) datatype. From python@rcn.com Wed Jun 26 20:45:09 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 26 Jun 2002 15:45:09 -0400 Subject: [Python-Dev] Dict constructor References: Message-ID: <009601c21d49$fb2acee0$56ec7ad1@othello> From: "Tim Peters" > -1 because of ambiguity. Is this trying to build a set with the single > element (42, 666), or a mapping of 42 to 666? > > dict([(42, 666)]) I've been thinking about this and the unambiguous explicit solution is to specify a value argument like dict.get().
>>> dict([(42, 666)])  # current behavior unchanged
{42: 666}
>>> dict([(42, 666)], True)
{(42, 666): True}
>>> dict('0123456789abcdef', True)
{'a': True, 'c': True, 'b': True, 'e': True, 'd': True, 'f': True, '1': True, '0': True, '3': True, '2': True, '5': True, '4': True, '7': True, '6': True, '9': True, '8': True}
>>> dict('0123456789abcdef')  # current behavior unchanged
ValueError: dictionary update sequence element #0 has length 1; 2 is required

The goal is not to provide full set behavior but to facilitate the common task of building dictionaries with a constant value. It comes up in membership testing and in uniquifying sequences. The task of dict() is to construct dictionaries and this is a reasonably common construction. Raymond Hettinger From tim@zope.com Wed Jun 26 21:07:03 2002 From: tim@zope.com (Tim Peters) Date: Wed, 26 Jun 2002 16:07:03 -0400 Subject: [Python-Dev] Dict constructor In-Reply-To: <009601c21d49$fb2acee0$56ec7ad1@othello> Message-ID: [Raymond Hettinger] > I've been thinking about this and the unambiguous explicit solution is to > specify a value argument like dict.get(). > > >>> dict([(42, 666)]) # current behavior unchanged > {42: 666} > > >>> dict([(42, 666)], True) > {(42, 666): True} > > >>> dict('0123456789abcdef', True) > {'a': True, 'c': True, 'b': True, 'e': True, 'd': True, 'f': True, '1': > True, '0': True, '3': True, '2': True, '5': True, '4': True, '7': > True, '6': True, '9': True, '8': True} > > >>> dict('0123456789abcdef') # current behavior unchanged > ValueError: dictionary update sequence element #0 has length 1; 2 is > required That's better -- but I'd still rather see a set. > The goal is not to provide full set behavior but to facilitate the > common task of building dictionaries with a constant value. The only dicts with constant values I've ever seen are simulating sets. > It comes up in membership testing and in uniquifying sequences. Those are indeed two common examples of using dicts to get at set functionality.
> The task of dict() is to construct dictionaries and this is a > reasonably common construction. But only because there isn't a set type. From gsw@agere.com Wed Jun 26 21:22:06 2002 From: gsw@agere.com (Gerald S. Williams) Date: Wed, 26 Jun 2002 16:22:06 -0400 Subject: [Python-Dev] List comprehensions In-Reply-To: <20020626153218.1766.44879.Mailman@mail.python.org> Message-ID: Has anyone summarized the list comprehension design discussions? I found references to "lots of discussion" about it but haven't yet found the discussions themselves. I don't want to rehash any old discussions, but I came across a surprise recently while converting constructs like "map(lambda x:x+1,x)" and just wanted to see the rationale behind not creating a local scope for list comprehension variables. Any pointer would be appreciated. Thanks, -Jerry From python@rcn.com Wed Jun 26 21:48:23 2002 From: python@rcn.com (Raymond Hettinger) Date: Wed, 26 Jun 2002 16:48:23 -0400 Subject: [Python-Dev] List comprehensions References: Message-ID: <00d901c21d52$d13488c0$56ec7ad1@othello> From: "Gerald S. Williams" > I don't want to rehash any old discussions, but I came across a surprise > recently while converting constructs like "map(lambda x:x+1,x)" and just > wanted to see the rationale behind not creating a local scope for list > comprehension variables. 
The idea was to make

    a = [expr(i) for i in seqn]; print i

behave the same as:

    a = []
    for i in seqn:
        a.append(expr(i))
    print i  # i is in locals in its final loop state

Raymond Hettinger From Jack.Jansen@oratrix.com Wed Jun 26 21:30:43 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Wed, 26 Jun 2002 22:30:43 +0200 Subject: [Python-Dev] Dict constructor In-Reply-To: <001801c21d38$3d4e2220$56ec7ad1@othello> Message-ID: <969E2908-8943-11D6-A9BF-003065517236@oratrix.com> On Wednesday, June 26, 2002, at 07:38, Raymond Hettinger wrote:

> To fix it, I found myself
> writing the same code over and over again:
>
> def _toset(container):
>     return dict([(elem, True) for elem in container])
>
> This repeated dictionary construction exercise occurs in so many
> guises that it would be worthwhile to provide a fast, less magical
> looking approach.

I disagree on this being "magical", I tend to think of it as "Pythonic". If there is a reasonably easy to remember construct (such as this one: if you've seen it once you'll remember it) just use that, instead of adding extra layers of functionality. Moreover, this construct has lots of slight modifications that are useful in slightly different situations (i.e. don't put True in the value but something else), and people will "magically" understand these if they've seen this one. What I could imagine would be nice is a warning if you're doing inefficient "in" operations. But I guess this would have to be done in the interpreter itself (I don't think pychecker could do this, or could it?), and the definition of "inefficient" is going to be difficult ("if your program has done more than N1 in operations on a data structure with more than N2 items in it and these took an average of O(N1*N2/2) compares", and keep that information per object).
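The "slight modifications" Jack mentions are easy to parameterize; here is a sketch with the constant value as an optional argument (the helper name toset is invented here, and Python 2.3 later added dict.fromkeys() for exactly this job):

```python
def toset(container, value=True):
    # Map every element to a constant value so that membership
    # tests run in O(1) instead of O(n).
    return dict([(elem, value) for elem in container])

lookup = toset("0123456789abcdef")
assert "a" in lookup and "z" not in lookup
```

Swapping the value for a count or an empty list is the kind of variation Jack expects readers to infer once they have seen the basic form.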
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From fredrik@pythonware.com Wed Jun 26 22:20:48 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 26 Jun 2002 23:20:48 +0200 Subject: [Python-Dev] Dict constructor References: <969E2908-8943-11D6-A9BF-003065517236@oratrix.com> Message-ID: <001f01c21d57$5a5782c0$ced241d5@hagrid> jack wrote: > I disagree on this being "magical", I tend to think of it as > "Pythonic". If there is a reasonably easy to remember construct > (such as this one: if you've seen it once you'll remember it) > just use that, instead of adding extra layers of functionality. to quote a certain bot (guess raymond wasn't following that thread): "It's easier to write appropriate code from scratch in Python than to figure out how to *use* a package profligate enough to contain canned solutions for all common and reasonable use cases." time to add a best_practices module to Python 2.3? KeyError:-profligate-ly yrs /F From gsw@agere.com Wed Jun 26 22:38:22 2002 From: gsw@agere.com (Gerald S. Williams) Date: Wed, 26 Jun 2002 17:38:22 -0400 Subject: [Python-Dev] List comprehensions In-Reply-To: <00d901c21d52$d13488c0$56ec7ad1@othello> Message-ID: Raymond Hettinger wrote: > The idea was to make a = [expr(i) for i in seqn]; print i behave ... No problem. As long as it was decided that there's a use for the current behavior, I won't question it. -Jerry From skip@pobox.com Wed Jun 26 22:57:53 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 26 Jun 2002 16:57:53 -0500 Subject: [Python-Dev] List comprehensions In-Reply-To: References: <20020626153218.1766.44879.Mailman@mail.python.org> Message-ID: <15642.14561.397461.86661@12-248-8-148.client.attbi.com> Jerry> Has anyone summarized the list comprehension design discussions? Jerry> I found references to "lots of discussion" about it but haven't Jerry> yet found the discussions themselves.
Jerry, I think list comprehensions were just about the last major feature to be added to the language before PEPs became the absolute way to hash stuff out. Here's a comment from Guido dated 2000-08-11:

    Go for it! (This must be unique -- the PEP still hasn't been
    finished, and the code is already accepted. :-)

Consequently, the PEP (202) never did really get fleshed out. As I recall, it went something like:

1. Buncha discussion in c.l.py. Check out this thread begun by Greg Ewing from August 1998: http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=35C7E33C.4B14%40cosc.canterbury.ac.nz&rnum=1&prev=/groups%3Fq%3Dg:thl4020484492d%26dq%3D%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26selm%3D35C7E33C.4B14%2540cosc.canterbury.ac.nz With a little more aggressive use of the time machine, Tim & Greg could maybe have snuck them into 1.5.2!

2. Greg implemented them as a proof of concept and then they languished.

3. I picked them up in mid-2000 and got the ball rolling on getting them into 2.0.

4. They got accepted in August 2000.

The last couple steps happened while the 1.6/2.0/CNRI/BeOpen stuff was going on, so I'm pretty sure no summary of the discussions took place. If you're looking for a significant thread, I'd start in the python-dev archives around April or May 2000. You might also want to check the comments in the patch: http://python.org/sf/400654 I don't believe the issue of variable scope ever came up until after 2.0 was released. I certainly thought of them as just shorthand notation for for loops. (Maybe it was discussed in the 1998 thread, but I'm not about to read all 103 articles.
;-) Skip From fredrik@pythonware.com Wed Jun 26 23:05:29 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 27 Jun 2002 00:05:29 +0200 Subject: [Python-Dev] List comprehensions References: <20020626153218.1766.44879.Mailman@mail.python.org> <15642.14561.397461.86661@12-248-8-148.client.attbi.com> Message-ID: <002901c21d5d$96f1aca0$ced241d5@hagrid> skip wrote: > Consequently, the PEP (202) never did really get fleshed out. despite the fact that PEP 202 is marked as final, maybe relevant portions of this thread could be added to it? (so we can reply RTFP the next time someone stumbles upon this) > 1. Buncha discussion in c.l.py. Check out this thread begun by Greg > Ewing from August 1998: in case someone would like to add this to the PEP, that URL can be shortened to: http://groups.google.com/groups?threadm=35C7E33C.4B14%40cosc.canterbury.ac.nz From tim@zope.com Wed Jun 26 23:09:46 2002 From: tim@zope.com (Tim Peters) Date: Wed, 26 Jun 2002 18:09:46 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <1d9601c21d25$d32a85d0$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > This is really interesting. When I was at Dragon (well, actually, > after Tim left and it became L&H), I ported my natural language > parsing/understanding system from Python to C++ so it could run > quickly enough for embedded devices. The core of this system was an > associative container, so I knew that its performance would be > crucial. I used C++ generics which made it really easy to swap in > different associative container implementations, and I tried lots, > including the red-black tree containers built into most C++ > implementations, and hash tables. My experience was that trying to > come up with a hash function that would give a significant speed > increase over the tree containers was extremely difficult, because it > was really hard to come up with a good hash function.
There's more to a speedy hash implementation than just the hash function, of course. > Furthermore, even if I succeeded, it was like black magic: it was > inconsistent across my test cases and there was no way to understand > why it worked well, and to get a feeling for how it would scale to > problems outside those cases. Python's dictobject.c and .h have extensive comments about how Python's dicts work. Their behavior isn't mysterious, at least not after 10 years of thinking about it <0.9 wink>. Python's dicts also use tricks that I've never seen in print -- many people have contributed clever tricks. > I ended up hand-coding a two-level scheme based on binary searches in > contiguous arrays which blew away anything I'd been able to do with a > hash table. My conclusion was that for general-purpose use, the red- > black tree was pretty good, despite its relatively high memory overhead > of 3 pointers per node: The example I posted built a mapping with a million entries. A red-black tree of that size needs to chase between 20 and 40 pointers to determine membership. By sheer instruction count alone, that's way more instructions than the Python dict usually has to do, although it's comparable to the number the BTree had to do. The BTree probably has better cache behavior than a red-black tree; for example, all the keys in the million-element example were on the third level, and a query required exactly two pointer-chases to get to the right leaf bucket. All the rest is binary search on 60-120 element contiguous vectors (in fact, sounds a lot like your custom "two-level scheme") > it places easy requirements on the user (supply a strict weak ordering) > and provides predictable and smooth performance even asymptotically. OTOH, it can be very hard to write an efficient, correct "<" ordering, while testing just "equal or not?" can be easier and run quicker than that.
Dict comparison is a good example from the Python core: computing "<" for dicts is a nightmare, but computing "==" for dicts is easy (contrast the straightforward dict_equal() with the brain-busting dict_compare() + characterize() pair). This was one of the motivations for introducing "rich comparisons". > On the other hand, hashing requires that the user supply both a hash > function and an equality detector which must agree with one-another, I've rarely found this to be a challenge. For example, for sets that contain sets as elements, a suitable hash function can simply xor the hash codes of the set elements. Since equal sets have equal elements, such a scheme delivers equal hash codes for equal sets, and independent of the order in which set elements get enumerated. In contrast, defining a sensible *total* ordering on sets is a delicate undertaking (yes, I know that "strict weak ordering" is weaker than "total", but in real life you couldn't buy a hot dog with the difference <wink>). > requires hand-tuning of the hash function for performance, and is rather > more unpredictable. We've been talking about adding hash-based > containers to the C++ standard library but I'm reluctant on these > grounds. It seems to me that when you really care about speed, some kind > of hand-coded solution might be a better investment than trying to come > up with a good hash function. > > I'm ready to believe that hashing is the most appropriate choice for > Python, but I wonder what makes the difference? Well, I'm intimately familiar with the details of how Python dicts and Zope BTrees are implemented, down to staring at the machine instructions generated, and there's no mystery here to me. I'm not familiar with any of the details of what you tried. Understanding speed differences at this level isn't a "general principles" kind of discussion.
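The order-independent set hash Tim describes can be sketched in a few lines (a toy illustration; CPython's actual frozenset hash adds further bit-scrambling on top of a plain XOR):

```python
def set_hash(elements):
    # XOR is commutative and associative, so the result does not
    # depend on the order in which the elements are enumerated.
    h = 0
    for elem in elements:
        h ^= hash(elem)
    return h

# Equal sets hash equal, whatever order we happen to visit them in.
assert set_hash([1, "a", (2, 3)]) == set_hash([(2, 3), 1, "a"])
```

Since equal elements have equal hashes, equal sets necessarily get equal codes, which is exactly the agreement between the hash function and the equality detector that David is worried about.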
I should note that Zope's BTrees pay a lot for playing nice with persistence, about a factor of two: upon visiting and leaving each BTree node, there are messy test+branch sequences ensuring that the object isn't a ghost, notifying the persistence machinery that fields have been accessed and/or changed, and telling the persistence machinery when the object is no longer in active use. Most of these bookkeeping operations can fail too, so there's another layer of "and did that fail?" test+branches around all that. The saving grace for BTrees (why this doesn't cost a factor of, say, 10) is that each BTree node contains a fair amount of "stuff", so that the guts of each function can do a reasonable amount of useful work. The persistence overhead could be a disaster if visiting an object only moved one bit closer to the result. But Python's dicts aren't aware of persistence at all, and that did give dicts an ~= factor-of-2 advantage in the example. While they're still not as zippy as dicts after factoring that out, B-Trees certainly aren't pigs. BTW, note that Python's dicts originally catered only to string keys, as they were the implementation of Python's namespaces, and dicts remain highly optimized for that specific purpose. Indeed, there's a distinct dict lookup routine dedicated to dicts with string keys. Namespaces have no compelling use for range search or lexicographic traversal, just association, and peak lookup speed was the design imperative. From tim.one@comcast.net Thu Jun 27 00:29:42 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 26 Jun 2002 19:29:42 -0400 Subject: [Python-Dev] List comprehensions In-Reply-To: Message-ID: [Gerald S. Williams, on listcomp (non)scopes] > No problem. As long as it was decided that there's a use for > the current behavior, I won't question it. I'm not sure there's a use for it, but I am sure I'd shoot any coworker who found one and relied on it <wink>.
Python didn't have lexical scoping at the time listcomps were getting hammered out, and it would have been nuts to introduce a "local scope" for a single, isolated construct. Then and now, the semantics of listcomps can be exactly explained via a straightforward transformation to a for-loop. Now that we have lexical scoping, a local index vrbl would also be easy to explain -- but it would be "a change". From David Abrahams" Message-ID: <222401c21d68$bc1f3000$6601a8c0@boostconsulting.com> From: "Tim Peters" > There's more to a speedy hash implementation than just the hash function, of > course. 'course. > > Furthermore, even if I succeeded, it was like black magic: it was > > inconsistent across my test cases and there was no way to understand > > why it worked well, and to get a feeling for how it would scale to > > problems outside those cases. > > Python's dictobject.c and .h have extensive comments about how Python's > dicts work. Their behavior isn't mysterious, at least not after 10 years of > thinking about it <0.9 wink>. Python's dicts also use tricks that I've > never seen in print -- many people have contributed clever tricks. I noticed that, and I think the next time I try hashing I'm going to steal as much as possible from Python's implementation to get a head start. Noticing that also left me with a question: how come everybody in the world hasn't stolen as much as possible from the Python hashing implementation? Are there a billion such 10-years'-tweaked implementations lying around which all perform comparably well?
> The BTree probably has better cache behavior > than a red-black tree; for example, all the keys in the million-element > example were on the third level, and a query required exactly two > pointer-chases to get to the right leaf bucket. All the rest is binary > search on 60-120 element contiguous vectors (in fact, sounds a lot like your > custom "two-level scheme") Yeah, I think it ended up being something like that. Of course, the container I ended up with used domain-specific knowledge which would have been inappropriate for general-purpose use. > > it places easy requirements on the user (supply a strict weak ordering) > > and provides predictable and smooth performance even asymptotically. > > OTOH, it can be very hard to write an efficient, correct "<" ordering, while > testing just "equal or not?" can be easier and run quicker than that. Dict > comparison is a good example from the Python core: computing "<" for dicts > is a nightmare, but computing "==" for dicts is easy (contrast the > straightforward dict_equal() with the brain-busting dict_compare() + > characterize() pair). Well, OK, ordering hash tables is hard, unless the bucket count is a deterministic function of the element count. If they were sorted containers, of course, < would be a simple matter. And I assume that testing equality still involves a lot of hashing... Hmm, looking at the 3 C++ implementations of hashed containers that I have available to me, only one provides operator<(), which is rather strange since the other two implement operator== by first comparing sizes, then iterating through consecutive elements of each set looking for a difference. The implementation supplying operator<() uses a (IMO misguided) design that rehashes incrementally, but it seems to me that if the more straightforward approaches can implement operator==() as described, operator<() shouldn't have to be a big challenge for an everyday hash table. I'm obviously missing something, but what...?
> This was one of the motivations for introducing "rich > comparisons". I don't see how that helps. Got a link? Or a clue? > > On the other hand, hashing requires that the user supply both a hash > > function and an equality detector which must agree with one-another, > > I've rarely found this to be a challenge. For example, for sets that > contain sets as elements, a suitable hash function can simply xor the hash > codes of the set elements. Since equal sets have equal elements, such a > scheme delivers equal hash codes for equal sets, and independent of the > order in which set elements get enumerated. In contrast, defining a > sensible *total* ordering on sets is a delicate undertaking (yes, I know > that "strict weak ordering" is weaker than "total", but in real life you > couldn't buy a hot dog with the difference <wink>). I don't know what that means. If you represent your sets as sorted containers, getting a strict weak ordering on sets is trivial; you just do it with a lexicographical comparison of the two sequences. > > I'm ready to believe that hashing is the most appropriate choice for > > Python, but I wonder what makes the difference? > > Well, I'm intimately familiar with the details of how Python dicts and Zope > BTrees are implemented, down to staring at the machine instructions > generated, and there's no mystery here to me. I'm not familiar with any of > the details of what you tried. Understanding speed differences at this > level isn't a "general principles" kind of discussion. No, I suppose not. But Python's dicts are general-purpose containers, and you can put any key you like in there. It's still surprising to me given my (much less than 10 years') experience with hash implementations that you can design something that performs well over all those different cases.
> I should note that Zope's BTrees pay a lot for playing nice with > persistence, about a factor of two: upon visiting and leaving each BTree > node, there are messy test+branch sequences ensuring that the object isn't a > ghost, notifying the persistence machinery that fields have been accessed > and/or changed, and telling the persistence machinery when the object is no > longer in active use. Most of these bookkeeping operations can fail too, so > there's another layer of "and did that fail?" test+branches around all that. Aww, heck, you just need a good C++ exception-handling implementation to get rid of the error-checking overheads ;-) > The saving grace for BTrees (why this doesn't cost a factor of, say, 10) is > that each BTree node contains a fair amount of "stuff", so that the guts of > each function can do a reasonable amount of useful work. The persistence > overhead could be a disaster if visiting an object only moved one bit closer > to the result. > > But Python's dicts aren't aware of persistence at all, and that did give > dicts an ~= factor-of-2 advantage in the example. While they're still not > as zippy as dicts after factoring that out, B-Trees certainly aren't pigs. > > BTW, note that Python's dicts originally catered only to string keys, as > they were the implementation of Python's namespaces, and dicts remain highly > optimized for that specific purpose. Indeed, there's a distinct dict lookup > routine dedicated to dicts with string keys. Namespaces have no compelling > use for range search or lexicographic traversal, just association, and peak > lookup speed was the design imperative. Thanks for the perspective! 
still-learning-ly y'rs, dave From greg@cosc.canterbury.ac.nz Thu Jun 27 03:38:12 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Jun 2002 14:38:12 +1200 (NZST) Subject: [Python-Dev] Xrange and Slices In-Reply-To: <20020626132718.GA57665@hishome.net> Message-ID: <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz> Oren Tirosh <oren-py-d@hishome.net>: > Since xrange is the one more commonly used in everyday > programming I'd say that slice should be an alias to xrange, not the other > way around. I was about to yell "No, don't do that, slice is a type!" when I decided I'd better make sure that's true... and found that it's NOT!

Python 2.2 (#14, May 28 2002, 14:11:27) [GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> slice
<built-in function slice>
>>> s = slice(1,2,3)
>>> s.__class__
<type 'slice'>
>>>

So... why *isn't* slice == <type 'slice'>? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From skip@pobox.com Thu Jun 27 03:50:48 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 26 Jun 2002 21:50:48 -0500 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz> References: <20020626132718.GA57665@hishome.net> <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz> Message-ID: <15642.32136.631168.24453@12-248-8-148.client.attbi.com> Greg> So... why *isn't* slice == <type 'slice'>? I suspect nobody at PythonLabs currently has any spare round tuits.
Skip From oren-py-d@hishome.net Thu Jun 27 05:37:47 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Thu, 27 Jun 2002 00:37:47 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz> References: <20020626132718.GA57665@hishome.net> <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz> Message-ID: <20020627043747.GA80339@hishome.net> On Thu, Jun 27, 2002 at 02:38:12PM +1200, Greg Ewing wrote: > Oren Tirosh <oren-py-d@hishome.net>: > > > Since xrange is the one more commonly used in everyday > > programming I'd say that slice should be an alias to xrange, not the other > > way around. > > I was about to yell "No, don't do that, slice is a type!" > when I decided I'd better make sure that's true... > and found that it's NOT! It is in the latest CVS and so is xrange. Oren From tim.one@comcast.net Thu Jun 27 05:44:36 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 27 Jun 2002 00:44:36 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <20020626132718.GA57665@hishome.net> Message-ID: [Oren Tirosh] > ... > The start, stop and step attributes to xrange would have to be > revived (what was the idea behind removing them in the first place?) A futile attempt at bloat reduction. At the time, there was more code in Python to support unused xrange embellishments than there was to support generators. > ... > >>> xrange(1,100,2) > xrange(1, 101, 2) > > It's been there since at least Python 2.0. Hasn't anyone noticed this > bug before? It's been that way since xrange() was introduced, but nobody *called* it a bug before. The two expressions are equivalent:

>>> list(xrange(1, 100, 2)) == list(xrange(1, 101, 2))
True
>>>

[Greg Ewing] > ... > So... why *isn't* slice == <type 'slice'>? It is in current CVS Python, but still range != <type 'range'>, and won't until someone cares enough to change it.
From oren-py-d@hishome.net Thu Jun 27 08:00:53 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Thu, 27 Jun 2002 03:00:53 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: References: <20020626132718.GA57665@hishome.net> Message-ID: <20020627070053.GA96670@hishome.net> On Thu, Jun 27, 2002 at 12:44:36AM -0400, Tim Peters wrote: > > >>> xrange(1,100,2) > > xrange(1, 101, 2) > > > > It's been there since at least Python 2.0. Hasn't anyone noticed this > > bug before? > > It's been that way since xrange() was introduced, but nobody *called* it a > bug before. The two expressions are equivalent: > > >>> list(xrange(1, 100, 2)) == list(xrange(1, 101, 2)) > True I found that seconds after hitting 'y'... > [Greg Ewing] > > ... > > So... why *isn't* slice == <type 'slice'>? > > It is in current CVS Python, but still range != <type 'range'>, and won't > until someone cares enough to change it. There is no spoo^H^H^H^H <type 'range'>. xrange is <type 'xrange'> and range is <built-in function range>. Oren From oren-py-d@hishome.net Thu Jun 27 08:06:03 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Thu, 27 Jun 2002 03:06:03 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <000d01c21cdb$eb03b720$91d8accf@othello> References: <000d01c21cdb$eb03b720$91d8accf@othello> Message-ID: <20020627070603.GB96670@hishome.net> On Wed, Jun 26, 2002 at 02:37:17AM -0400, Raymond Hettinger wrote: > Wild idea of the day: > Merge the code for xrange() into slice(). > So that old code will work, make the word 'xrange' a synonym for 'slice' It looks possible, but it will hurt the performance of xrange. Internally, xrange uses C longs while slice uses Python objects with all the associated overhead.
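Oren's distinction between the two representations is still visible in modern Python, where range is the descendant of xrange: slice stores its three fields as arbitrary objects, while range validates its arguments up front so it can keep plain integers. A quick sketch in present-day Python (not 2.2):

```python
# slice accepts any objects at all -- it is just a container of three fields
s = slice("a", "b", "c")
assert (s.start, s.stop, s.step) == ("a", "b", "c")

# range insists on integers it can work with directly
r = range(1, 100, 2)
assert (r.start, r.stop, r.step) == (1, 100, 2)
try:
    range("a", "b")
except TypeError:
    pass  # non-integer endpoints are rejected
```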
Oren From gmcm@hypernet.com Thu Jun 27 12:06:29 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 27 Jun 2002 07:06:29 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <222401c21d68$bc1f3000$6601a8c0@boostconsulting.com> Message-ID: <3D1AB975.3954.1FBF483@localhost> On 26 Jun 2002 at 19:20, David Abrahams wrote: [Python's hashing] > Noticing that also left me with a question: how > come everybody in the world hasn't stolen as much as > possible from the Python hashing implementation? > Are there a billion such 10-years'-tweaked > implementations lying around which all perform > comparably well? Jean-Claude Wippler and Christian Tismer did some benchmarks against other implementations. IIRC, the only one in the same ballpark was Lua's (which, IIRC, was faster at least under some conditions). -- Gordon http://www.mcmillan-inc.com/ From sholden@holdenweb.com Thu Jun 27 12:15:32 2002 From: sholden@holdenweb.com (Steve Holden) Date: Thu, 27 Jun 2002 07:15:32 -0400 Subject: [Python-Dev] Dict constructor References: <009601c21d49$fb2acee0$56ec7ad1@othello> Message-ID: <12c301c21dcb$f420ebc0$6300000a@holdenweb.com> ----- Original Message ----- From: "Raymond Hettinger" To: Sent: Wednesday, June 26, 2002 3:45 PM Subject: Re: [Python-Dev] Dict constructor > From: "Tim Peters" > > -1 because of ambiguity. Is this trying to build a set with the single > > element (42, 666), or a mapping of 42 to 666? > > > > dict([(42, 666)]) > > I've been thinking about this and the unambiguous explicit solution is to > specify a value argument like dict.get().
> > >>> dict([(42, 666)]) # current behavior unchanged > {42: 666} > > >>> dict([(42, 666)], True) > {(42, 666): True} > > >>> dict('0123456789abcdef', True) > {'a': True, 'c': True, 'b': True, 'e': True, 'd': True, 'f': True, '1': > True, '0': True, '3': True, '2': True, '5': True, '4': True, '7': True, '6': > True, '9': True, '8': True} > > >>> dict('0123456789abcdef') # current behavior unchanged > ValueError: dictionary update sequence element #0 has length 1; 2 is > required > > > > The goal is not to provide full set behavior but to facilitate the common > task of building dictionaries with a constant value. It comes up in > membership testing and in uniquifying sequences. The task of dict() is to > construct dictionaries and this is a reasonably common construction. > But is it really common enough to merit special-casing what can anyway be spelt very simply:

    adict = {}
    for k in asequence:
        adict[k] = sentinel

? regards ----------------------------------------------------------------------- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/pwp/ ----------------------------------------------------------------------- From tack@cscs.ch Thu Jun 27 13:01:22 2002 From: tack@cscs.ch (Davide Tacchella) Date: Thu, 27 Jun 2002 14:01:22 +0200 Subject: [Python-Dev] Help, Compile / debug Python 2.2.1 64 bit on AIX Message-ID: <20020627140122.07b44f43.tack@cscs.ch> I'm trying to build Python with 64 bit support on AIX, so far I've encountered 2 problems, dynload_aix.c is not 100% 64 bit compliant (it includes some casts from pointer to (int)); this was causing Python to SEGV when building extensions. After changing from int to long, the error is now ILL (SIGILL); there is a pointer to NULL. The call stack from the debugger is:

    0x000000 initstruct (structmodule.c - line 1508)
    _PyImport_LoadDynamicModule (importdl.c - line 53)
    load_module (import.c - line 1365)
    import_submodule (import.c - line 1895)
    load_next (import.c - line 1751)
import_module_ex (import.c - line 1602)
PyImport_ImportModuleEx (import.c - line 1643)
builtin___import__ (bltinmodule.c - line 40)
PyCFunction_Call (methodobject.c - line 80)
eval_frame (ceval.c - line 2004)
....

Any idea? Any help is always welcome. Can anybody help me out? Davide From fredrik@pythonware.com Thu Jun 27 15:29:45 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 27 Jun 2002 16:29:45 +0200 Subject: [Python-Dev] SF task tracker confusion Message-ID: <04f801c21de7$168326e0$0900a8c0@spiff> on my "my sf.net" page, there are a couple of development tasks listed for python 2.1 (!). however, if I click on one of the links, e.g. https://sourceforge.net/pm/task.php?func=detailtask&project_task_id=25031&group_id=5470&group_project_id=4564 all I get is a page saying that: Permission Denied This project's administrator will have to grant you permission to view this page. any ideas? maybe the project's administrator could remove the tasks for me? From David Abrahams References: <009601c21d49$fb2acee0$56ec7ad1@othello> <12c301c21dcb$f420ebc0$6300000a@holdenweb.com> Message-ID: <239801c21deb$94528030$6601a8c0@boostconsulting.com> From: "Steve Holden" > But is it really common enough to merit special-casing what can anyway be > spelt very simply: > > adict = {} > for k in asequence: > adict[k] = sentinel > > ? Yep. -Dave From tim@zope.com Thu Jun 27 17:19:09 2002 From: tim@zope.com (Tim Peters) Date: Thu, 27 Jun 2002 12:19:09 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <3D1AB975.3954.1FBF483@localhost> Message-ID: [David Abrahams] > Noticing that also left me with a question: how > come everybody in the world hasn't stolen as much as > possible from the Python hashing implementation? > Are there a billion such 10-years'-tweaked > implementations lying around which all perform > comparably well? [Gordon McMillan] > Jean-Claude Wippler and Christian Tismer did some > benchmarks against other implementations.
IIRC, the > only one in the same ballpark was Lua's (which, IIRC, > was faster at least under some conditions). I'd like to see the benchmark. Like Python, Lua uses a power-of-2 table size, but unlike Python uses linked lists for collisions instead of open addressing. This appears to leave it very vulnerable to bad cases (like using [i << 16 for i in range(20000)] as a set of keys -- Python and Lua both grab the last 15 bits of the ints as their hash codes, which means every key maps to the same hash bucket. Looks like Lua would chain them all together. Python breaks the ties quickly via its collision resolution scrambling.). The Lua string hash appears systematically vulnerable:

static unsigned long hash_s (const char *s, size_t l) {
    unsigned long h = l;  /* seed */
    size_t step = (l>>5)|1;  /* if string is too long, don't hash all its chars */
    for (; l>=step; l-=step)
        h = h ^ ((h<<5)+(h>>2)+(unsigned char)*(s++));
    return h;
}

That hash function would be weak even if it didn't ignore up to 97% of the input characters. OTOH, if it happens not to collide, ignoring up to 97% of the characters eliminates up to 97% of the expense of computing a hash. Etc. Lua's hashes do appear to get a major benefit from lacking a Python feature: user-defined comparisons can (a) raise exceptions, and (b) mutate the hash table *while* you're looking for a key in it. Those cause the Python implementation lots of expensive pain (indeed, the main reason Python has a distinct lookup function for string-keyed dicts is that it doesn't have to choke itself worrying about #a or #b for builtin strings). There's a lovely irony here. Python's dicts are fast because they've been optimized to death. When Lua's dicts are fast, it seems more the case it's because they don't worry much about bad cases. That's *supposed* to be Python's trick.
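[Editor's sketch, not part of the original thread.] Tim's pathological key set is easy to check empirically in a current CPython, where the hash of a small non-negative int is still the int itself; every key of the form i << 16 has all-zero low bits, so the initial probe into any power-of-2 table of up to 2**16 slots lands in the same bucket:

```python
# Sketch: why [i << 16 for i in range(20000)] is a worst case for a
# power-of-2 hash table that masks off the low bits of the hash.
# Assumes CPython, where hash() of a small positive int is the int itself.
keys = [i << 16 for i in range(20000)]

table_mask = (1 << 15) - 1          # a table with 2**15 slots
first_probes = {hash(k) & table_mask for k in keys}

# Every key's initial probe is slot 0. Open addressing with hash
# perturbation (Python's scheme) escapes this quickly; simple chaining
# (old Lua's scheme) piles all 20000 keys into one linked list.
assert first_probes == {0}
```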
From tim@zope.com Thu Jun 27 19:54:50 2002 From: tim@zope.com (Tim Peters) Date: Thu, 27 Jun 2002 14:54:50 -0400 Subject: [Python-Dev] SF task tracker confusion In-Reply-To: <04f801c21de7$168326e0$0900a8c0@spiff> Message-ID: [/F] > on my "my sf.net" page, there are a couple of development > tasks listed for python 2.1 (!). So finish them already. > however, if I click on one of the links, e.g. > > https://sourceforge.net/pm/task.php?func=detailtask&project_task_i > d=25031&group_id=5470&group_project_id=4564 > > all I get is a page saying that: > > Permission Denied > > This project's administrator will have to grant > you permission to view this page. > > any ideas? maybe the project's administrator could remove > the tasks for me? I got the same error page. Looks like someone tried to disable use of the task manager, and delete the old tasks, without closing the subtasks first. It took a lot of fiddling but I believe I've done all I can to get those tasks off your page. You were the only one who still had a task assigned to them. If they still show up on your page, let me know; I expect we'll have to elevate it to an SF support request then. From fredrik@pythonware.com Thu Jun 27 19:56:49 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 27 Jun 2002 20:56:49 +0200 Subject: [Python-Dev] SF task tracker confusion References: <04f801c21de7$168326e0$0900a8c0@spiff> Message-ID: <028301c21e0c$66427f30$ced241d5@hagrid> > maybe the project's administrator could remove > the tasks for me? "You have no open tasks assigned to you." thanks! /F From fredrik@pythonware.com Thu Jun 27 20:12:41 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 27 Jun 2002 21:12:41 +0200 Subject: [Python-Dev] pre.sub's broken under 2.2 Message-ID: <02ad01c21e0e$b3737960$ced241d5@hagrid> just for the record, one of those "let's change a lot of code that we don't understand, just because we can" things broke the "pre" module in 2.2.
someone changed:

    try:
        repl = pcre_expand(_Dummy, repl)
    except:
        m = MatchObject(self, source, 0, end, [])

to

    try:
        repl = pcre_expand(_Dummy, repl)
    except error:
        m = MatchObject(self, source, 0, end, [])

but in the most common use case (replacement strings containing group references), the pcre_expand function raises a TypeError exception... From tim@zope.com Thu Jun 27 20:30:00 2002 From: tim@zope.com (Tim Peters) Date: Thu, 27 Jun 2002 15:30:00 -0400 Subject: [Python-Dev] pre.sub's broken under 2.2 In-Reply-To: <02ad01c21e0e$b3737960$ced241d5@hagrid> Message-ID: [/F] > just for the record, one of those "let's change a lot of > code that we don't understand, just because we can" In the case of try + bare-except, it was more a case of "let's change code we don't understand because it's impossible to guess its intent and that's bad for future maintenance". > things broke the "pre" module in 2.2. > > someone changed: > > try: > repl = pcre_expand(_Dummy, repl) > except: > m = MatchObject(self, source, 0, end, []) > > to > > try: > repl = pcre_expand(_Dummy, repl) > except error: > m = MatchObject(self, source, 0, end, []) > > but in the most common use case (replacement strings > containing group references), the pcre_expand function > raises a TypeError exception... Like I said. The except clause should list the exceptions it specifically intends to silence, and something as obscure as this case deserves a comment to boot. I also note that if this passed the tests, then the test suite wasn't even trying "the most common use case". there's-more-than-one-kind-of-breakage-illustrated-here-ly y'rs - tim From fredrik@pythonware.com Thu Jun 27 20:38:47 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 27 Jun 2002 21:38:47 +0200 Subject: [Python-Dev] pre.sub's broken under 2.2 References: Message-ID: <030901c21e12$443ce000$ced241d5@hagrid> tim wrote: > I also note that if this passed the tests, then the test suite wasn't even > trying "the most common use case".
sure. but who should make sure that the regression test suite covers the code being changed: the person changing it, or the end user? From barry@zope.com Thu Jun 27 20:13:05 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 27 Jun 2002 15:13:05 -0400 Subject: [Python-Dev] Building Python cvs w/ gcc 3.1 Message-ID: <15643.25537.767831.983206@anthem.wooz.org> File this under "you just can't win". I'm building Python cvs w/gcc 3.1 and I get warnings for every extension, e.g.: building 'zlib' extension gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/zlibmodule.c -o build/temp.linux-i686-2.3/zlibmodule.o cc1: warning: changing search order for system directory "/usr/local/include" cc1: warning: as it has already been specified as a non-system directory gcc -shared build/temp.linux-i686-2.3/zlibmodule.o -L/usr/local/lib -lz -o build/lib.linux-i686-2.3/zlib.so The problem is the inclusion of -I/usr/local/include because that's a directory on gcc's system include path. Adding such directories can cause gcc headaches because it likes to treat system include dirs specially, doing helpful things like fix bugs in vendor's header files and sticking them in special locations. -I apparently overrides the special treatment of system include dirs, so the warnings are gcc's way of helpfully reminding us not to do that. Unfortunately, it seems difficult to fix this in a principled way. I can't figure out a way to reliably ask gcc what its system include dirs are. -v doesn't give you the information. There's no switch to turn off these warnings. You could ask cpp ("cpp -v") which does provide output that could be grep'd for the system include dirs, but that just seems way too fragile. 
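[Editor's sketch, not from the thread.] The grep'ing Barry calls "way too fragile" would amount to parsing the block between the two marker lines that gcc/cpp print with -v. The sample output below is hardcoded (in the format those markers actually use), since no particular compiler can be assumed to be installed:

```python
def system_include_dirs(cpp_verbose_output):
    """Extract the system include path from 'cpp -v' style stderr output."""
    dirs, inside = [], False
    for line in cpp_verbose_output.splitlines():
        if line.startswith('#include <...> search starts here:'):
            inside = True                  # start of the search-path block
        elif line.startswith('End of search list.'):
            break                          # end of the block
        elif inside:
            dirs.append(line.strip())      # one directory per indented line
    return dirs

# Hardcoded sample of the stderr format in question:
sample = """\
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc-lib/i686/3.1/include
 /usr/include
End of search list."""

assert '/usr/local/include' in system_include_dirs(sample)
```

The parsing itself is trivial; the fragility Barry objects to is in obtaining the output at all (stderr redirection, distutils only exposing "gcc -E", and no stable contract that the marker lines won't change).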
Besides, that doesn't play well with distutils because it only wants to invoke the preprocessor using "gcc -E" and /that/ interprets -v as one of its options. You could use "gcc -E -Wp,-v dummyfile.c" but then you'd have to redirect stderr, capture that output, and grep it. Blech, blech, blech. If I comment out the line in setup.py which adds /usr/local/include to self.compiler.include_dirs, it takes care of the problem, but that might break other builds, so I'm loath to do that. The other option is to ignore the warnings since I don't think gcc 3.1 (3.x?) is distributed as the default compiler for very many distros yet. OTOH, it /will/ at some point and then it will be a PITA for support. OTTH, from some quick googling, I gather that this warning is somewhat controversial inside the gcc community, and other projects have dealt with it in heavyhanded ways (just don't -I/usr/local/include), so if we ignore it long enough, the problem might just go away. Sigh, I'm done with this for now, but wanted to get it into the archives for future reference. -Barry From tim@zope.com Thu Jun 27 20:56:16 2002 From: tim@zope.com (Tim Peters) Date: Thu, 27 Jun 2002 15:56:16 -0400 Subject: [Python-Dev] pre.sub's broken under 2.2 In-Reply-To: <030901c21e12$443ce000$ced241d5@hagrid> Message-ID: [/F] > sure. but who should make sure that the regression test suite > covers the code being changed: the person changing it, or the > end user? We could ask a lot of "who should have?" questions here. As it turns out, an end user finished everyone's job here. learn-&-move-on-ly y'rs - tim From zack@codesourcery.com Thu Jun 27 23:12:28 2002 From: zack@codesourcery.com (Zack Weinberg) Date: Thu, 27 Jun 2002 15:12:28 -0700 Subject: [Python-Dev] Improved tmpfile module Message-ID: <20020627221228.GB9371@codesourcery.com> I'm not subscribed to python-dev. Please cc: me directly on replies. I'm going to respond to all the comments at once.
Greg Ward wrote: > > Attached please find a rewritten and improved tmpfile.py. The major > > change is to make the temporary file names significantly harder to > > predict. This foils denial-of-service attacks, where a hostile > > program floods /tmp with files named @12345.NNNN to prevent process > > 12345 from creating any temp files. It also makes the race condition > > inherent in tmpfile.mktemp() somewhat harder to exploit. > > Oh, good! I've long wished that there was a tmpfile module written by > someone who understands the security issues involved in generating > temporary filenames and files. I hope you do... ;-) Well, I wrote the analogous code in the GNU C library (using basically the same algorithm). I'm confident it is safe on a Unix-based system. On Windows and others, I am relying on os.open(..., os.O_EXCL) to do what it claims to do; assuming it does, the code should be safe there too. > > (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file, > > returning both an OS-level file descriptor open on it and its name. > > This is useful in situations where you need to know the name of the > > temporary file, but can't risk the race in mktemp. > > +1 except for the name. What does the "s" stand for? Unfortunately, I > can't think of a more descriptive name offhand. Fredrik Lundh's suggestion that it is for "safer" seems plausible, but I do not actually know. I chose the names mkstemp and mkdtemp to match the functions of the same name in most modern Unix C libraries. Since they don't take the same "template" parameter that those functions do, that was probably a bad idea. [Note to Fredrik: at the C level, mkstemp is not deprecated in favor of tmpfile, as they do very different things - tmpfile(3) is analogous to tmpfile.TemporaryFile(), you don't get the file name back.] I'm open to suggestions for a better routine name; I can't think of a good one myself. > > name = mkdtemp(suffix=""): Creates a temporary directory, without > > race. 
> > How about calling this one mktempdir() ? Sure. > I've scanned your code and the existing tempfile.py. I don't > understand why you rearranged things. Please explain why your > arrangement of _TemporaryFileWrapper/TemporaryFile/ > NamedTemporaryFile is better than what we have. I was trying to get all the user-accessible interfaces to be at the top of the file. Also, I do not understand the bits in the existing file that delete names out of the module namespace after we're done with them, so I wound up taking all of that out to get it to work. I think the existing file's organization was largely determined by those 'del' statements. I'm happy to organize the file any way y'all like -- I'm kind of new to Python and I don't know the conventions yet. > A few minor comments on the code... >
> > if os.name == 'nt':
> >     _template = '~%s~'
> > elif os.name in ('mac', 'riscos'):
> >     _template = 'Python-Tmp-%s'
> > else:
> >     _template = 'pyt%s' # better ideas?
>
> Why reveal the implementation language of the application creating these > temporary names? More importantly, why do it on certain platforms, but not > others? This is largely as it was in the old file. I happen to know that ~%s~ is conventional for temporary files on Windows. I changed 'tmp%s' to 'pyt%s' for Unix to make it consistent with Mac/RiscOS. Ideally one would allow the calling application to control the prefix, but I'm not sure what the right interface is. Maybe tmpfile.mkstemp(prefix="", suffix="") where if one argument is provided it gets treated as the suffix, but if two are provided the prefix comes first, a la range()? Is there a way to express that in the prototype? > > ### Recommended, user-visible interfaces. > > > > _text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL > > if os.name == 'posix': > > _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL > > Why not just "_bin_openflags = _text_openflags" ? That clarifies their > equality on Unix.
> > > else: > > _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY > > Why not "_bin_openflags = _text_openflags | os.O_BINARY" ? *shrug* Okay. > > > def mkstemp(suffix="", binary=1): > > """Function to create a named temporary file, with 'suffix' for > > its suffix. Returns an OS-level handle to the file and the name, > > as a tuple. If 'binary' is 1, the file is opened in binary mode, > > otherwise text mode (if this is a meaningful concept for the > > operating system in use). In any case, the file is readable and > > writable only by the creating user, and executable by no one.""" > > "Function to" is redundant. I didn't change much of this text from the old file. Where are docstring conventions documented? > """Create a named temporary file. > > Create a named temporary file with 'suffix' for its suffix. Return > a tuple (fd, name) where 'fd' is an OS-level handle to the file, and > 'name' is the complete path to the file. If 'binary' is true, the > file is opened in binary mode, otherwise text mode (if this is a > meaningful concept for the operating system in use). In any case, > the file is readable and writable only by the creating user, and > executable by no one (on platforms where that makes sense). > """ Okay. > Hmmm: if suffix == ".bat", the file is executable on some platforms. > That last sentence still needs work. ... In any case, the file is readable and writable only by the creating user. On platforms where the file's permission bits control whether it can be executed as a program, no one can. Other platforms have other ways of controlling this: for instance, under Windows, the suffix determines whether the file can be executed. How's that? > > class _TemporaryFileWrapper: > > """Temporary file wrapper > > > > This class provides a wrapper around files opened for temporary use. > > In particular, it seeks to automatically remove the file when it is > > no longer needed. > > """ > > Here's where I started getting confused. 
I don't dispute that the > existing code could stand some rearrangement, but I don't understand why > you did it the way you did. Please clarify! See above. What would you consider a sensible arrangement? > > > ### Deprecated, user-visible interfaces. > > > > def mktemp(suffix=""): > > """User-callable function to return a unique temporary file name.""" > > while 1: > > name = _candidate_name(suffix) > > if not os.path.exists(name): > > return name > > The docstring for mktemp() should state *why* it's bad to use this > function -- otherwise people will say, "oh, this looks like it does what > I need" and use it in ignorance. So should the library reference > manual. Good point. """Suggest a name to be used for a temporary file. This function returns a file name, with 'suffix' for its suffix, which did not correspond to any file at some point in the past. By the time you get the return value of this function, a file may have already been created with that name. It is therefore unsafe to use this function for any purpose. It is deprecated and may be removed in a future version of Python.""" and corresponding text in the library manual? Tim Peters wrote: > > -1 on the implementation here, because it didn't start with current CVS, so > is missing important work that went into improving this module on Windows > for 2.3. Whether spawned/forked processes inherit descriptors for "temp > files" is also a security issue that's addressed in current CVS but seemed > to have gotten dropped on the floor here. I'll get my hands on a copy of current CVS and rework my changes against that. > A note on UI: for many programmers, "it's a feature" that temp file names > contain the pid. I don't think we can get away with taking that away no > matter how stridently someone claims it's bad for us . GNU libc took that away from C programmers about four years ago and no one even noticed. FreeBSD libc, ditto, although I'm not sure when it happened. 
zw From fredrik@pythonware.com Thu Jun 27 23:41:06 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 28 Jun 2002 00:41:06 +0200 Subject: [Python-Dev] Improved tmpfile module References: <20020627221228.GB9371@codesourcery.com> Message-ID: <05ea01c21e2b$bcab5fd0$ced241d5@hagrid> zack wrote: > [Note to Fredrik: at the C level, mkstemp is not deprecated in favor > of tmpfile, as they do very different things - tmpfile(3) is analogous > to tmpfile.TemporaryFile(), you don't get the file name back.] I quoted the SUSv2 spec from memory. shouldn't have done that: it says "preferred for portability reasons", not deprecated. From David Abrahams I just submitted a patch to the list.extend docstring, to reflect the fact that x.extend(xrange(10)) and x.extend((2,3)) both work when x is a list. Then I went to look at the documentation and noticed it says at http://www.python.org/dev/doc/devel/lib/typesseq-mutable.html:

s.extend(x)    same as s[len(s):len(s)] = x    (2)
...
(2) Raises an exception when x is not a list object. The extend() method is experimental and not supported by mutable sequence types other than lists.

Now I'm wondering what all this means. It is /not/ equivalent to the slice assignment, because list slice assignment requires a list rhs. What does this "experimental" label mean? Is my patch to the docstring wrong, in the sense that it suggests exploiting undefined behavior in the same way that the old append-multiple-items behavior was undefined? Also, I note that the table referenced above seems to be missing some right parentheses, at least on the .pop and .sort method descriptions.
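[Editor's note.] In today's Python the behavior David observed is the intended, documented one: extend() accepts an arbitrary iterable, and slice assignment now does too, so the equivalence the table claimed holds again. A quick check (modern Python, not the 2.2 semantics under discussion):

```python
# list.extend accepts any iterable, not just lists:
x = [1, 2, 3]
x.extend(range(10, 12))   # a range object
x.extend((2, 3))          # a tuple
x.extend("ab")            # a string extends with its characters
assert x == [1, 2, 3, 10, 11, 2, 3, 'a', 'b']

# Slice assignment, which required a list rhs in 2002, now takes
# arbitrary iterables as well:
y = [1, 2, 3]
y[len(y):len(y)] = iter((4, 5))
assert y == [1, 2, 3, 4, 5]
```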
-Dave +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From David Abrahams Message-ID: <021801c21e35$08fb1d90$6501a8c0@boostconsulting.com> ----- Original Message ----- From: "David Abrahams" To: Sent: Thursday, June 27, 2002 7:43 PM Subject: [Python-Dev] list.extend > I just submitted a patch to the list.extend docstring, to reflect the fact > that x.extend(xrange(10)) and x.extend((2,3)) both work when x is a list. > Then I went to look at the documentation and noticed it says at > http://www.python.org/dev/doc/devel/lib/typesseq-mutable.html: > > s.extend(x) same as s[len(s):len(s)] = x (2) > ... > (2) Raises an exception when x is not a list object. The extend() method is > experimental and not supported by mutable sequence types other than lists. > > > Now I'm wondering what all this means. It is /not/ equivalent to the slice > assignment, because list slice assignment requires a list rhs. What does > this "experimental" label mean? Is my patch to the docstring wrong, in the > sense that it suggests exploiting undefined behavior in the same way that > the old append-multiple-items behavior was undefined? Looking again, I note that even if my patch is wrong, either the doc or the implementation must be fixed since it currently lies about throwing an exception when x is not a list. If someone can channel me the right state of affairs I'll submit another patch.
-Dave From tim@zope.com Fri Jun 28 04:40:00 2002 From: tim@zope.com (Tim Peters) Date: Thu, 27 Jun 2002 23:40:00 -0400 Subject: [Python-Dev] list.extend In-Reply-To: <020201c21e34$6905b6b0$6501a8c0@boostconsulting.com> Message-ID: [David Abrahams] > I just submitted a patch to the list.extend docstring, to reflect the fact > that x.extend(xrange(10)) and x.extend((2,3)) both work when x is a list. > Then I went to look at the documentation and noticed it says at > http://www.python.org/dev/doc/devel/lib/typesseq-mutable.html: > > s.extend(x) same as s[len(s):len(s)] = x (2) Ya, that's no longer true. > ... > (2) Raises an exception when x is not a list object. That's true of s[len(s):len(s)] = x, but not of s.extend(x). > The extend() method is experimental "experimental" doesn't mean anything, so neutral on that. > and not supported by mutable sequence types other than lists. That's not true anymore either; for example, arrays (from the array module) have since grown .extend() methods. > Now I'm wondering what all this means. Just that the docs are, as you suspect, out of date. > It is /not/ equivalent to the slice assignment, because list slice > assignment requires a list rhs. Right. list.extend(x) actually requires that x be an iterable object. Even list.extend(open('some file')) works fine (and appends the lines of the file to the list). > What does this "experimental" label mean? I'm not sure. Guido slaps that label on new features from time to time, with the implication that they may go away in the following release. However, no *advertised* experimental feature has ever gone away, and I doubt one ever will. We should drop the "experimental" on this one for sure now, as lots of code uses list.extend(). > Is my patch to the docstring wrong, in the sense that it suggests > exploiting undefined behavior in the same way that the old append > -multiple-items behavior was undefined? I haven't looked at the patch because you didn't include a handy link.
It's definitely intended that list.extend() accept iterable objects now. > Also, I note that the table referenced above seems to be missing > some right parentheses, at least on the .pop and .sort method > descriptions. Yup, and they used to be there. Thanks for the loan of the eyeballs! From python@rcn.com Fri Jun 28 04:53:11 2002 From: python@rcn.com (Raymond Hettinger) Date: Thu, 27 Jun 2002 23:53:11 -0400 Subject: [Python-Dev] list.extend References: Message-ID: <002401c21e57$548b5280$19d8accf@othello> > > Also, I note that the table referenced above seems to be missing > > some right parentheses, at least on the .pop and .sort method > > descriptions. > > Yup, and they used to be there. Hmmph! This is occurring throughout the docs (see also dict.get() and dict.setdefault()). It looks like a flaw in the doc gen process or in the interaction of the tex macro for methods with optional arguments. Raymond Hettinger From David Abrahams Message-ID: <031801c21e58$7f3f37c0$6501a8c0@boostconsulting.com> From: "Tim Peters" > Thanks for the loan of the eyeballs! As long as I'm eyeballin' (and you're thankin'), I notice in PyInt_AsLong:

    if (op == NULL || (nb = op->ob_type->tp_as_number) == NULL ||
        nb->nb_int == NULL) {
        PyErr_SetString(PyExc_TypeError, "an integer is required");
        return -1;
    }

But really, an integer isn't required; any type with a tp_as_number section and a conversion to int will do. Should the error say "a numeric type convertible to int is required"? -Dave From tim.one@comcast.net Fri Jun 28 06:01:41 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 28 Jun 2002 01:01:41 -0400 Subject: [Python-Dev] Priority queue (binary heap) python code In-Reply-To: <222401c21d68$bc1f3000$6601a8c0@boostconsulting.com> Message-ID: [David Abrahams] > ... > Noticing that also left me with a question: how come everybody in > the world hasn't stolen as much as possible from the Python hashing > implementation?
Are there a billion such 10-years'-tweaked > implementations lying around which all perform comparably well? It's a Mystery, and in all directions. Python has virtually no code from, say, Tcl or Perl either, and the latter certainly do some things better than Python does them. I've studied all their hash implementations, but didn't find anything worth stealing ; OTOH, multiple attempts by multiple groups to steal Perl's regexp engine years ago fizzled out in a tarpit of frustration. Curious: Python independently developed a string hash *very* similar to what later became "the standard" Fowler-Noll-Vo string hash: http://www.isthe.com/chongo/tech/comp/fnv/ The multiplier is different, and the initial value, but that's it. I'm sure there was no communication in either direction. So ya, given enough time, a billion other projects will rediscover it too. >> OTOH, it can be very hard to write an efficient, correct "<" ordering, >> while testing just "equal or not?" can be easier and run quicker than >> that. Dict comparison is a good example from the Python core: >> computing "<" for dicts is a nightmare, but computing "==" for dicts is >> easy (contrast the straightforward dict_equal() with the brain-busting >> dict_compare() + characterize() pair). > Well, OK, ordering hash tables is hard, unless the bucket count is a > deterministic function of the element count. I don't know how the latter could help; for that matter, I'm not even sure what it means. > If they were sorted containers, of course, < would be a simple matter. Yes. > And I assume that testing equality still involves a lot of hashing... No more times than the common length of the two dicts. 
It's just:

def dict_equal(dict1, dict2):
    if len(dict1) != len(dict2):
        return False
    for key, value in dict1.iteritems():
        if key not in dict2 or not value == dict2[key]:
            return False
    return True

Searching dict2 for key *may* involve hashing key again (or it may not; for example, Python string objects are immutable and cache their 32-bit hash in the string object the first time it's computed). There's a world of pain involved in the "==" there, though, as a dict can very well have itself as a value in itself, and the time required for completion appears to be exponential in some pathological cases of that kind (Python does detect the unbounded recursion in such cases -- eventually). > Hmm, looking at the 3 C++ implementations of hashed containers that I have > available to me, only one provides operator<(), which is rather strange > since the other two implement operator == by first comparing sizes, then > iterating through consecutive elements of each set looking for a > difference. The implementation supplying operator<() uses a (IMO > misguided) design that rehashes incrementally, but it seems to me that if > the more straightforward approaches can implement operator==() as > described, operator<() shouldn't have to be a big challenge for an > everyday hash table. > > I'm obviously missing something, but what...? I don't know, but I didn't follow what you were saying (like, "rehashes incrementally" doesn't mean anything to me). If there's a simpler way to get "the right" answer, I'd love to see it. I once spent two hours trying to prove that the dict_compare() + characterize() pair in Python was correct, but gave up in a mushrooming forest of end cases. In The Beginning, Python implemented dict comparison by materializing the .items(), sorting both, and then doing list comparison. The correctness of that was easy to show.
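[Editor's sketch.] The "In The Beginning" scheme Tim describes is simple enough to write down; since modern dicts define only equality, the cmp()-style -1/0/1 result is reconstructed by hand here:

```python
def dict_compare(d1, d2):
    # Sketch of the original approach: materialize the items, sort
    # both, compare lexicographically. Easy to prove correct, but it
    # requires sortable keys and values and costs O(n log n) -- which
    # is why the common == case was later special-cased.
    items1 = sorted(d1.items())
    items2 = sorted(d2.items())
    return (items1 > items2) - (items1 < items2)  # cmp-style: -1, 0, or 1

assert dict_compare({1: 'a', 2: 'b'}, {2: 'b', 1: 'a'}) == 0
assert dict_compare({1: 'a'}, {1: 'b'}) == -1
assert dict_compare({2: 'a'}, {1: 'z'}) == 1
```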
But it turned out that in real life all anyone ever used was == comparison on dicts, and sorting was enormously expensive compared to what was possible. characterize() is a very clever hack Guido dreamt up to get the same result in no more than two passes -- but I've never been sure it's a thoroughly correct hack. OTOH, since nobody appears to care about "<" for dicts, if it's wrong we may never know that. >> This was one of the motivations for introducing "rich comparisons". > I don't see how that helps. Got a link? Or a clue? Sorry, I don't understand the question. When Python funneled all comparisons through cmp(), it wasn't possible for a type implementation to do anything faster for, say, "==", because it had no idea why cmp() was being called. Allowing people to ask for the specific comparison they wanted is part of what "rich comparisons" was about, and speed was one of the reasons for adopting it. Comparing strings for equality/inequality alone is also done faster than needing to resolve string ordering. And complex numbers have no accepted "<" relation at all. So comparing dicts isn't the only place it's easier and quicker to restrict the burden on the type to implementing equality testing. For user-defined types, I've often found it *much* easier. For example, I can easily tell whether two chessboards are equal (do they have the same pieces on the same squares?), but a concept of "<" for chessboards is strained. > I don't know what that means. There's too much of that on both sides here, so I declare this mercifully ended now <0.9 wink>. > If you represent your sets as sorted containers, getting a strict weak > ordering on sets is trivial; you just do it with a lexicographical > comparison of the two sequences. And if you don't, that conclusion doesn't follow. > ... > No, I suppose not. But python's dicts are general-purpose containers, and > you can put any key you like in there.
> It's still surprising to me given my (much less than 10 years')
> experience with hash implementations that you can design something that
> performs well over all those different cases.

You probably can't a priori, but after a decade people stumble into all
the cases that don't work well, and you eventually fiddle the
type-specific hash functions and the general implementation until
surprises appear to stop.  It remains a probabilistic method, though, and
there are no guarantees.  BTW, I believe that of all Python's builtin
types, only the hash function for integers remains in its original form
(hash(i) == i).  So even if I don't want to, I'm forced to agree that
finding a good hash function isn't trivial.

[on Zope's B-Trees]
> Aww, heck, you just need a good C++ exception-handling implementation to
> get rid of the error-checking overheads ;-)

I'd love to use C++ for this.  This is one of those things that defines 5
families of 4 related data structures each via a pile of .c and .h files
that get #include'd and recompiled 5 times after #define'ing a pile of
macros.  It would be *so* much more pleasant using templates.

> ...
> Thanks for the perspective!
>
> still-learning-ly y'rs,

You're too old for that now -- start making money instead .

the-psf-will-put-it-to-good-use-ly y'rs  - tim

From tim.one@comcast.net Fri Jun 28 06:20:17 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 01:20:17 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <031801c21e58$7f3f37c0$6501a8c0@boostconsulting.com>
Message-ID: 

[David Abrahams]
> As long as I'm eyeballin' (and you're thankin'), I notice in PyInt_AsLong:
>
>     if (op == NULL || (nb = op->ob_type->tp_as_number) == NULL ||
>         nb->nb_int == NULL) {
>         PyErr_SetString(PyExc_TypeError, "an integer is required");
>         return -1;
>     }
>
> But really, an integer isn't required; Any type with a tp_as_number
> section and a conversion to int will do.  Should the error say "a numeric
> type convertible to int is required"?
I'll leave it up to Fred, but I don't think so.  The suggestion is wordier,
would be wordier still if converted to the more accurate "an object of a
numeric type convertible to int is required", and even then is not, IMO,
more likely to be of real help when this error triggers.  If you want to
change it, be sure to hunt down all the related ones too; e.g.,

    >>> class C: pass
    >>> range(12)[C()]
    Traceback (most recent call last):
      File "", line 1, in ?
    TypeError: list indices must be integers
    >>>

BTW, most places that call PyInt_AsLong() either do so conditionally upon
the success of a PyInt_Check(), or replace the exception raised when it
returns -1 with an error.  Offhand I wasn't even able to provoke the msg
in question.

From David Abrahams"  Message-ID: <035201c21e63$bfbdfc90$6501a8c0@boostconsulting.com>

From: "Tim Peters"
> If you want to change it, be sure to hunt down all the related ones too;
> e.g.,

I wouldn't know where to start with that project.  Do you think it would
be a bad idea to make one of many error messages more accurate?

> BTW, most places that call PyInt_AsLong() either do so conditionally upon
> the success of a PyInt_Check(), or replace the exception raised when it
> returns -1 with an error.  Offhand I wasn't even able to provoke the msg
> in question.

We extension writers like to use it too, though, and usually without an
extra layer of error processing.

-Dave

From tim.one@comcast.net Fri Jun 28 06:48:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 01:48:11 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <035201c21e63$bfbdfc90$6501a8c0@boostconsulting.com>
Message-ID: 

[Tim]
>> If you want to change it, be sure to hunt down all the related ones
>> too; e.g.,

[David]
> I wouldn't know where to start with that project.  Do you think it would
> be a bad idea to make one of many error messages more accurate?

Increasing accuracy isn't necessarily helpful.
In any context where PyInt_AsLong is called, an int most certainly is
required *in the end*.  Spelling out that the implementation may satisfy
this requirement by asking a non-int type whether it knows how to convert
instances of itself to an int doesn't seem helpful to me as a user.  I'm
not thinking that much about the internal implementation, and "of course"
if an int is required Python will accept an object of a type that knows
how to convert itself to an int.

But I suppose you don't like seeing

    SyntaxError: invalid syntax

at the end of a 7-line statement either .

From David Abrahams"  Message-ID: <036101c21e68$8abed730$6501a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters"
To: "David Abrahams"
Cc: 
Sent: Friday, June 28, 2002 1:01 AM
Subject: RE: [Python-Dev] Priority queue (binary heap) python code

> [David Abrahams]
> > ...
> > Noticing that also left me with a question: how come everybody in
> > the world hasn't stolen as much as possible from the Python hashing
> > implementation?  Are there a billion such 10-years'-tweaked
> > implementations lying around which all perform comparably well?
>
> It's a Mystery, and in all directions.  Python has virtually no code
> from, say, Tcl or Perl either, and the latter certainly do some things
> better than Python does them.  I've studied all their hash
> implementations, but didn't find anything worth stealing ;

Well of course not!

> OTOH, multiple attempts by multiple groups to steal Perl's regexp engine
> years ago fizzled out in a tarpit of frustration.

Oh, I had the impression that Python's re *was* pilfered Perl.

> Curious: Python independently developed a string hash *very* similar to
> what later became "the standard" Fowler-Noll-Vo string hash:
>
>     http://www.isthe.com/chongo/tech/comp/fnv/
>
> The multiplier is different, and the initial value, but that's it.  I'm
> sure there was no communication in either direction.
> So ya, given enough time, a billion other projects will rediscover it
> too.

Nifty.

> > Well, OK, ordering hash tables is hard, unless the bucket count is a
> > deterministic function of the element count.
>
> I don't know how the latter could help; for that matter, I'm not even
> sure what it means.

I know what I meant, but I was wrong.  My brain cell musta jammed.
Ordering hash tables is hard if collisions are possible.

> > And I assume that testing equality still involves a lot of hashing...
>
> No more times than the common length of the two dicts.

Of course.

> It's just:
>
>     def dict_equal(dict1, dict2):
>         if len(dict1) != len(dict2):
>             return False
>         for key, value in dict1.iteritems():
>             if key not in dict2 or not value == dict2[key]:
>                 return False
>         return True
>
> Searching dict2 for key *may* involve hashing key again (or it may not;
> for example, Python string objects are immutable and cache their 32-bit
> hash in the string object the first time it's computed).

Tricky.  I guess a C++ object could be designed to cooperate with hash
tables in that way also.

> There's a world of pain involved in the "==" there, though, as a dict can
> very well have itself as a value in itself, and the time required for
> completion appears to be exponential in some pathological cases of that
> kind (Python does detect the unbounded recursion in such cases --
> eventually).

Yuck.  I wouldn't expect any C++ implementation to handle that issue.

> > Hmm, looking at the 3 C++ implementations of hashed containers that I
> > have available to me, only one provides operator<(), which is rather
> > strange since the other two implement operator == by first comparing
> > sizes, then iterating through consecutive elements of each set looking
> > for a
The implementation supplying operator<() uses a (IMO > > misguided) design that rehashes incrementally, but it seems to me that if > > the more straightforward approaches can implement operator==() as > > described, operator<() shouldn't have to be a big challenge for an > > everyday hash table. > > > > I'm obviously missing something, but what...? > > I don't know, but I didn't follow what you were saying (like, "rehashes > incrementally" doesn't mean anything to me). Get ahold of MSVC7 and look at the hash_set implementation. IIRC how Plaugher described it, it is constantly maintaining the load factor across insertions, so there's never a big cost to grow the table. It also keeps the items in each bucket sorted, so hash table comparisons are a lot easier. My gut tells me that this isn't worth what you pay for it, but so far my gut hasn't had very much of any value to say about hashing... The other implementations seem to implement equality as something like: template inline bool operator==(const hash_set& x, const hash_set& y) { return x.size() == y.size() && std::equal(x.begin(), x.end(), y.begin()); } Which has to be a bug unless they've got a very strange way of defining equality, or some kindof ordering built into the iterators. > If there's a simpler way to > get "the right" answer, I'd love to see it. I once spent two hours trying > to prove that the dict_compare() + characterize() pair in Python was > correct, but gave up in a mushrooming forest of end cases. I think it's a tougher problem in Python than in languages with value semantics, where an object can't actually contain itself. > In The Beginning, Python implemented dict comparison by materializing the > .items(), sorting both, and then doing list comparison. The correctness of > that was easy to show. But it turned out that in real life all anyone ever > used was == comparison on dicts, and sorting was enormously expensive > compared to what was possible. 
> characterize() is a very clever hack Guido dreamt up to get the same
> result in no more than two passes -- but I've never been sure it's a
> thoroughly correct hack.

??? I can't find characterize() described anywhere, nor can I find it on
my trusty dict objects:

    >>> help({}.characterize)
    Traceback (most recent call last):
      File "", line 1, in ?
    AttributeError: 'dict' object has no attribute 'characterize'

> OTOH, since nobody appears to care about "<" for dicts, if it's wrong we
> may never know that.

As long as the Python associative world is built around hash + ==, you're
probably OK.

> >> This was one of the motivations for introducing "rich comparisons".
>
> > I don't see how that helps.  Got a link?  Or a clue?
>
> Sorry, I don't understand the question.

Well, you answered it pretty damn well anyway...

> Comparing strings for equality/inequality alone is also done faster than
> needing to resolve string ordering.  And complex numbers have no accepted
> "<" relation at all.

Yeah, good point.  C++ has a less/operator< dichotomy mostly to
accommodate pointer types in segmented memory models, but there's no such
accommodation for complex.

> So comparing dicts isn't the only place it's easier and quicker to
> restrict the burden on the type to implementing equality testing.  For
> user-defined types, I've often found it *much* easier.  For example, I
> can easily tell whether two chessboards are equal (do they have the same
> pieces on the same squares?), but a concept of "<" for chessboards is
> strained.

Strained, maybe, but easy.  You can do a lexicographic comparison of the
square contents.

> > I don't know what that means.
>
> There's too much of that on both sides here, so I declare this mercifully
> ended now <0.9 wink>.

I, of course, will drag it on to the bitter end.

> > [on Zope's B-Trees]
> > Aww, heck, you just need a good C++ exception-handling implementation
> > to get rid of the error-checking overheads ;-)
>
> I'd love to use C++ for this.
> This is one of those things that defines 5 families of 4 related data
> structures each via a pile of .c and .h files that get #include'd and
> recompiled 5 times after #define'ing a pile of macros.  It would be *so*
> much more pleasant using templates.

I have *just* the library for you.  Works with 'C', too!

    http://www.boost.org/libs/preprocessor/doc/

Believe it or not, people are still pushing this technology to improve
compilation times and debuggability of the result.

> > ...
> > Thanks for the perspective!
> >
> > still-learning-ly y'rs,
>
> You're too old for that now -- start making money instead .

Sorry, I'll try hard to grow up now.

-Dave

From David Abrahams"  Message-ID: <039c01c21e6a$335ee960$6501a8c0@boostconsulting.com>

From: "Tim Peters"
> Increasing accuracy isn't necessarily helpful.  In any context where
> PyInt_AsLong is called, an int most certainly is required *in the end*.
> Spelling out that the implementation may satisfy this requirement by
> asking a non-int type whether it knows how to convert instances of itself
> to an int doesn't seem helpful to me as a user.  I'm not thinking that
> much about the internal implementation, and "of course" if an int is
> required Python will accept an object of a type that knows how to convert
> itself to an int.

OK.  Explicit is better than implicit, except when it's obvious what GvR
really meant ;-)

> But I suppose you don't like seeing
>
>     SyntaxError: invalid syntax
>
> at the end of a 7-line statement either .

I never like seeing that, but I don't know what you're getting at.
maybe-you-need-to--harder-ly y'rs,
dave

From aahz@pythoncraft.com Fri Jun 28 14:42:54 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 28 Jun 2002 09:42:54 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <036101c21e68$8abed730$6501a8c0@boostconsulting.com>
References: <036101c21e68$8abed730$6501a8c0@boostconsulting.com>
Message-ID: <20020628134254.GA14414@panix.com>

On Fri, Jun 28, 2002, David Abrahams wrote:
> From: "Tim Peters"
>>
>> OTOH, multiple attempts by multiple groups to steal Perl's regexp
>> engine years ago fizzled out in a tarpit of frustration.
>
> Oh, I had the impression that Python's re *was* pilfered Perl.

Thank Fredrik for a brilliant job of re-implementing Perl's regex syntax
into something that I assume is maintainable (haven't looked at the code
myself) *and* Unicode compliant.
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/
Project Vote Smart: http://www.vote-smart.org/

From David Abrahams"  <036101c21e68$8abed730$6501a8c0@boostconsulting.com> <20020628134254.GA14414@panix.com> Message-ID: <04b701c21eaa$89d77e70$6501a8c0@boostconsulting.com>

From: "Aahz"
> > Oh, I had the impression that Python's re *was* pilfered Perl.
>
> Thank Fredrik for a brilliant job of re-implementing Perl's regex syntax
> into something that I assume is maintainable (haven't looked at the code
> myself) *and* Unicode compliant.

Thanks, Fredrik!

From jacobs@penguin.theopalgroup.com Fri Jun 28 15:44:44 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 10:44:44 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
Message-ID: 

I've found what I consider a major problem with the garbage collector in
the Python 2.3 CVS tree.  Here is a small kernel that demonstrates the
problem:

    lst = []
    for i in range(100000):
        lst.append( (1,) )

The key ingredients are:

  1) A method is called on a container (rather than __setitem__ or
     __setattr__).
  2) A new object is allocated while the method object lives on the Python
     VM stack, as shown by the disassembled bytecodes:

         40 LOAD_FAST                1 (lst)
         43 LOAD_ATTR                3 (append)
         46 LOAD_CONST               2 (1)
         49 BUILD_TUPLE              1
         52 CALL_FUNCTION            1

These ingredients combine in the following way to trigger quadratic-time
behavior in the Python garbage collector:

  * First, the LOAD_ATTR on "lst" for "append" is called, and a
    PyCFunction is returned from this code in descrobject.c:method_get:

        return PyCFunction_New(descr->d_method, obj);

    Thus, a _new_ PyCFunction is allocated every time the method is
    requested.

  * This new method object is added to generation 0 of the garbage
    collector, which holds a reference to "lst".

  * The BUILD_TUPLE call may then trigger a garbage collection cycle.

  * Since the "append" method is in generation 0, the reference traversal
    must also follow all objects within "lst", even if "lst" is in
    generation 1 or 2.  This traversal requires time linear in the number
    of objects in "lst", thus increasing the overall time complexity of
    the code to quadratic in the number of elements in "lst".

Also note that this is a much more general problem than this small
example.  It can affect many types of objects in addition to methods,
including descriptors, iterator objects, and any other object that
contains a "back reference".

So, what can be done about this....  One simple solution would be to not
traverse some "back references" if we are collecting objects in generation
0.  This will avoid traversing virtually all of these ephemeral objects
that will trigger such expensive behavior.  If they live long enough to
pass through to generation one or two, then clearly they should be
traversed.

So, what do all of you GC gurus think?  Provided that my analysis is
sound, I can rapidly propose a patch to demonstrate this approach if there
is sufficient positive sentiment.  There is a bug open on sourceforge on
this issue, so feel free to reply via python-dev or via the bug -- I read
both.
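The "_new_ PyCFunction on every lookup" step above is easy to confirm from
the interpreter.  A minimal check, in present-day CPython (the per-lookup
allocation still exists, though the exact object type involved has changed
since 2.3):

```python
lst = []

# Each attribute lookup manufactures a fresh method object wrapping
# the same underlying function and the same list.
m1 = lst.append
m2 = lst.append

assert m1 is not m2          # a distinct method object per lookup
assert m1.__self__ is lst    # yet both are bound to the same list
assert m2.__self__ is lst
```

Since the container is reachable from every one of these short-lived
method objects, tracing any of them drags the whole list into a
generation-0 traversal.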
As usual sourceforge is buggered, so I have not been able to update the
bug with the contents of this e-mail.

http://sourceforge.net/tracker/?func=detail&atid=105470&aid=572567&group_id=5470

Regards,
-Kevin

PS: I have not looked into why this doesn't happen in Python 2.2.x or
before.  I suspect that it must be related to the recent GC changes in
methodobject.c.  I'm not motivated to spend much time looking into this,
because the current GC behavior is technically correct, though clearly
sub-optimal.

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19    E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714         WWW:    http://www.theopalgroup.com

From jeremy@zope.com Fri Jun 28 11:37:19 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 06:37:19 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: 
References: <020201c21e34$6905b6b0$6501a8c0@boostconsulting.com>
Message-ID: <15644.15455.298184.157605@slothrop.zope.com>

>>>>> "TP" == Tim Peters writes:

  TP> [David Abrahams]
  >> What does this "experimental" label mean?

  TP> I'm not sure.  Guido slaps that label on new features from time
  TP> to time, with the implication that they may go away in the
  TP> following release.  However, no *advertised* experimental
  TP> feature has ever gone away, and I doubt one ever will.  We
  TP> should drop the "experimental" on this one for sure now, as lots
  TP> of code uses list.extend().

The access statement was experimental and went away.  I guess it is the
exception that proves the rule.  It was removed about the time I started
using Python, so I don't know what its intended use was.

Many of the Python 2.2 features are also labeled experimental.  And I
don't expect that they will go away either.
Jeremy

From tim.one@comcast.net Fri Jun 28 16:50:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 11:50:42 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <15644.15455.298184.157605@slothrop.zope.com>
Message-ID: 

[Jeremy]
> The access statement was experimental and went away.  I guess it is
> the exception that proves the rule.

There are no exceptions to Guido's channeled rules :

    no *advertised* experimental feature has ever gone away

and the access stmt was never documented ("advertised").  The closest it
got was its NEWS entry for 0.9.9:

    * There's a new reserved word: "access".  The syntax and semantics are
      still subject of research and debate (as well as undocumented), but
      the parser knows about the keyword so you must not use it as a
      variable, function, or attribute name.

The "debate" mentioned there may have been limited to email between Guido
and (IIRC) Tommy Burnette.

> It was removed about the time I started using Python, so I don't know
> what its intended use was.

    access_stmt: 'access' NAME (',' NAME)* ':' accesstype (',' accesstype)*
    accesstype: NAME+
    # accesstype should be ('public' | 'protected' | 'private')
    #   ['read'] ['write']
    # but can't be because that would create undesirable reserved words!

So it was for creating attributes that could be written by the public but
read only by class methods .

> Many of the Python 2.2 features are also labeled experimental.  And I
> don't expect that they will go away either.

Well, at least not the ones we've told people about.  Barry's hack to make

    print << file, '%d' % i

read an int i from file may well go away.
From tim.one@comcast.net Fri Jun 28 17:18:34 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 12:18:34 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <15644.34975.161808.776825@anthem.wooz.org>
Message-ID: 

[Barry, on 'access']
> python-mode.el gained knowledge of it in 1996:
>
>     revision 2.81
>     date: 1996/09/04 15:21:55;  author: bwarsaw;  state: Exp;  lines: +4 -4
>     (python-font-lock-keywords): with Python 1.4 `access' is no a keyword

You're misreading "no" as "now" instead of "not".  This patch removed
'access' from python-font-lock-keywords, and that's exactly what "is no a
keyword" meant to me considering it was BarrySpeak .

> Which is just before Python 1.4 final.  I've no idea when it went
> away.

According to Misc/HISTORY, the bulk of it vanished in 1.4beta3, with
assorted forgotten pieces removed over the following years.

>   TP> Well, at least not the ones we've told people about.  Barry's
>   TP> hack to make
>
>   TP>     print << file, '%d' % i
>
>   TP> read an int i from file may well go away.
>
> It will last week.  Freakin' time machine erased all evidence
> tomorrow.

Damn -- it's already gone from my disk!  Quick, document it before

From barry@zope.com Fri Jun 28 17:29:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 28 Jun 2002 12:29:42 -0400
Subject: [Python-Dev] list.extend
References: <15644.34975.161808.776825@anthem.wooz.org>
Message-ID: <15644.36598.681690.547336@anthem.wooz.org>

>>>>> "TP" == Tim Peters writes:

  TP> You're misreading "no" as "now" instead of "not".  This patch
  TP> removed 'access' from python-font-lock-keywords, and that's
  TP> exactly what "is no a keyword" meant to me considering it was
  TP> BarrySpeak .

How weird, I never wrote any of that!

I /have/ been playing with NaturallySpeaking for Linux (tm) and all I did
was burp.  Why did it take that sound to mean: cause my XEmacs to respond
to the message, do the cvs log, cut-n-paste, send the message, without
even my knowledge?
Okay, it was a rather, um, soupy burp, but nonetheless...  You should have
seen what it did with the cat's purrs.

i-swear-honey-it-was-the-cat-that-ran-pt.py-ly y'rs,
-Barry

From David Abrahams"  <15644.36598.681690.547336@anthem.wooz.org> Message-ID: <05b501c21ec1$afc8a980$6501a8c0@boostconsulting.com>

From: "Barry A. Warsaw"
> >>>>> "TP" == Tim Peters writes:
>
>   TP> You're misreading "no" as "now" instead of "not".  This patch
>   TP> removed 'access' from python-font-lock-keywords, and that's
>   TP> exactly what "is no a keyword" meant to me considering it was
>   TP> BarrySpeak .
>
> How weird, I never wrote any of that!
>
> I /have/ been playing with NaturallySpeaking for Linux (tm) and all I
> did was burp.  Why did it take that sound to mean: cause my XEmacs to
> respond to the message, do the cvs log, cut-n-paste, send the message,
> without even my knowledge?  Okay, it was a rather, um, soupy burp, but
> nonetheless...

Part of the deal with my natural language system at Dragon was that they
wanted me to work on Dutch translation, but I don't know Dutch so I used
Python and figured that would be enough.  It turns out that Dutch sounds a
lot like burping to my ear.  I think you can see where this is headed...

-Dave

From barry@zope.com Fri Jun 28 17:02:39 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 28 Jun 2002 12:02:39 -0400
Subject: [Python-Dev] list.extend
References: <15644.15455.298184.157605@slothrop.zope.com>
Message-ID: <15644.34975.161808.776825@anthem.wooz.org>

>>>>> "TP" == Tim Peters writes:

  TP> There are no exceptions to Guido's channeled rules :
  TP> no *advertised* experimental feature has ever gone away
  TP> and the access stmt was never documented ("advertised").
  TP> The closest it got was its NEWS entry for 0.9.9:

python-mode.el gained knowledge of it in 1996:

    revision 2.81
    date: 1996/09/04 15:21:55;  author: bwarsaw;  state: Exp;  lines: +4 -4
    (python-font-lock-keywords): with Python 1.4 `access' is no a keyword

Which is just before Python 1.4 final.  I've no idea when it went away.

  >> Many of the Python 2.2 features are also labeled experimental.
  >> And I don't expect that they will go away either.

  TP> Well, at least not the ones we've told people about.  Barry's
  TP> hack to make

  TP>     print << file, '%d' % i

  TP> read an int i from file may well go away.

It will last week.  Freakin' time machine erased all evidence tomorrow.

-Barry

From jeremy@zope.com Fri Jun 28 15:02:20 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 10:02:20 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: 
References: 
Message-ID: <15644.27756.584393.217271@slothrop.zope.com>

I had a different idea to solve this performance problem and perhaps
others.  It's only half baked, but I thought it was at least worth
mentioning in an e-mail.  The premise is that the garbage collector tracks
a lot of objects that will never participate in cycles and can never
participate in cycles.  The idea is to avoid tracking objects until it
becomes possible for them to participate in a collectible cycle.

For example, an object referenced from a local variable will never be
collected until after the frame releases its reference.  So what if we did
not track objects that were stored in local variables?

To make this work, we would need to change the SETLOCAL macro in ceval to
track the object that it was DECREFing.  There are a lot of little details
that would make this complicated, unfortunately.  All new container
objects are tracked, so we would need to untrack ones that are stored in
local variables.  To track objects on DECREF, we would also need to ask if
the object type was GC-enabled.
Another kind of object that is never going to participate in a cycle, I
think, is an object that lives only temporarily on the ceval stack.  For
example, a bound method object created on the stack in order to be called.
If it's never bound to another object as an attribute or stored in a local
variable, it can never participate in the cycle.

How hard would it be to add logic that avoided tracking objects until it
was plausible that they would participate in a cycle?

Jeremy

From tim.one@comcast.net Fri Jun 28 19:59:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 14:59:54 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: 
Message-ID: 

[Kevin Jacobs, working hard!]

I don't know what causes this.  The little time I've been able to spend on
it ended up finding an obvious buglet in some new-in-2.3 gcmodule code:

    for (i = 0; i <= generation; i++)
        generations[generation].count = 0;

That was certainly intended to index by "i", not by "generation".  Fixing
that makes the gc.DEBUG_STATS output less surprising, and cuts down on the
number of collections, but doesn't really cure anything.

Note that bound methods in 2.2 also create new objects, etc; that was good
deduction, but not yet good enough .

From jacobs@penguin.theopalgroup.com Fri Jun 28 20:11:13 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 15:11:13 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: 
Message-ID: 

On Fri, 28 Jun 2002, Tim Peters wrote:
> I don't know what causes this.  The little time I've been able to spend
> on it ended up finding an obvious buglet in some new-in-2.3 gcmodule
> code:
>
>     for (i = 0; i <= generation; i++)
>         generations[generation].count = 0;
>
> That was certainly intended to index by "i", not by "generation".

Good catch!  I missed that in spite of reading those lines 20 times.

> Note that bound methods in 2.2 also create new objects, etc; that was
> good deduction, but not yet good enough .
That is why I added my "PS" about not looking into why it didn't blow up
in Python 2.2.  In reality, I did look, but only for 30 seconds, and then
decided I didn't want to know.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19    E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714         WWW:    http://www.theopalgroup.com

From jacobs@penguin.theopalgroup.com Fri Jun 28 20:23:04 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 15:23:04 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <15644.27756.584393.217271@slothrop.zope.com>
Message-ID: 

On Fri, 28 Jun 2002, Jeremy Hylton wrote:
> I had a different idea to solve this performance problem and perhaps
> others.  It's only half baked, but I thought it was at least worth
> mentioning in an e-mail.  The premise is that the garbage collector
> tracks a lot of objects that will never participate in cycles and can
> never participate in cycles.  The idea is to avoid tracking objects
> until it becomes possible for them to participate in a collectible
> cycle.

Hi Jeremy,

You have an interesting idea, though I'd state the premise slightly
differently.  How about:

  The premise is that the garbage collector tracks a lot of objects that
  will never participate in collectible cycles, because untraceable
  references are held.  The idea is to avoid tracking these objects until
  it becomes possible for them to participate in a collectible cycle.

(virtually any object _can_ participate in a cycle -- most just never do)

Offhand, I am not sure if my idea of ignoring certain references in
generation 0 or your idea will work better in practice.  Both require
adding more intelligence to the garbage collection system via careful
annotations.  I wouldn't be surprised if the optimal approach involved
both methods.

> How hard would it be to add logic that avoided tracking objects until
> it was plausible that they would participate in a [collectable] cycle?
I can work up a patch that does this.  Can anyone else think of places
where this makes sense, other than frame objects and the ceval stack?

Also, any thoughts on my approach?  I have a hard time thinking of any
situation that generates enough cyclic garbage where delaying collection
until generation 1 would be a serious problem.

-Kevin

PS: The bug Tim spotted makes a big difference too.

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19    E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714         WWW:    http://www.theopalgroup.com

From python@rcn.com Fri Jun 28 20:25:13 2002
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 28 Jun 2002 15:25:13 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
Message-ID: <001f01c21ed9$873f3c00$06ea7ad1@othello>

As far as I can tell, buffer() is one of the least used or known about
Python tools.  What do you guys think about this as a candidate for silent
deprecation (moving out of the primary documentation)?

Raymond Hettinger

From jeremy@zope.com Fri Jun 28 15:56:29 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 10:56:29 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: 
References: <15644.27756.584393.217271@slothrop.zope.com>
Message-ID: <15644.31005.110922.650771@slothrop.zope.com>

>>>>> "KJ" == Kevin Jacobs writes:

  KJ> You have an interesting idea, though I'd state the premise
  KJ> slightly differently.  How about:

  KJ> The premise is that the garbage collector tracks a lot of
  KJ> objects that will never participate in collectible cycles,
  KJ> because untraceable references are held.  The idea is to avoid
  KJ> tracking these objects until it becomes possible for them to
  KJ> participate in a collectible cycle.

I guess the untraced reference to the current frame is the untraceable
reference you refer to.
The crucial issue is that local variables of the current frame can't be
collected, so there's little point in tracking and traversing the objects
they refer to.  I agree, of course, that the concern is collectible
cycles.

  KJ> (virtually any object _can_ participate in a cycle -- most just
  KJ> never do)

Right.  There ought to be some way to exploit that.

  KJ> Also, any thoughts on my approach?  I have a hard time thinking
  KJ> of any situation that generates enough cyclic garbage where
  KJ> delaying collection until generation 1 would be a serious
  KJ> problem.

If I take your last statement literally, it sounds like we ought to avoid
doing anything until an object gets to generation 1 <0.7 wink>.

Your suggestion seems to be that we should treat references from older
generations to newer generations as external roots.  So a cycle that
spans generations will not get collected until everything is in the same
generation.  Indeed, that does not seem harmful.

On the other hand, it's hard to reconcile an intuitive notion of
generation with what we're doing by running GC over and over as you add
more elements to your list.  It doesn't seem right that your list becomes
an "old" object just because a single function allocates 100k young
objects.  That is, I wish the notion of generations accommodated a baby
boom in a generation.

Jeremy

From nas@python.ca Fri Jun 28 21:04:18 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 28 Jun 2002 13:04:18 -0700
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: ; from jacobs@penguin.theopalgroup.com on Fri, Jun 28, 2002 at 03:23:04PM -0400
References: <15644.27756.584393.217271@slothrop.zope.com>
Message-ID: <20020628130418.D10441@glacier.arctrix.com>

Another idea would be to exploit the fact that we know most of the root
objects (e.g. sys.modules and the current stack of frames).  I haven't
figured out a good use for this knowledge though.
Neil From jacobs@penguin.theopalgroup.com Fri Jun 28 21:07:33 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 28 Jun 2002 16:07:33 -0400 (EDT) Subject: [Python-Dev] Garbage collector problem In-Reply-To: <15644.31005.110922.650771@slothrop.zope.com> Message-ID: On Fri, 28 Jun 2002, Jeremy Hylton wrote: > Your suggestion seems to be that we should treat references from older > generations to newer generations as external roots. So a cycle that > spans generations will not get collected until everything is in the > same generation. Indeed, that does not seem harmful. Not really -- I'm saying that certain types of containers tend to hold references to other, much larger, containers. These small containers tend to be ephemoral -- they appear and disappear quickly -- but sometimes are unlucky enough to be around when a collection is triggered. In my example, the small containers were bound-method objects, which store back-references to their class instance, a huge list, which will live in generation 2 very quickly. I do not advocate making objects store which generation they belong to, but rather to delay the traversal of certain containers until after generation 0. This means that they've been around the block a few times, and may have fallen in with a bad cyclical crowd. This annotation should be added to objects that tend to shadow other containers, like bound-methods, iterators, generators, descriptors, etc. In some tests using real workloads, I've found that upwards of 99% of these ephemoral objects never make it to a generation 1 collection anyway. 
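[Editor's note] Kevin's bound-method scenario is easy to see from plain Python: the small, short-lived method object carries a back-reference to the (possibly huge) container it was looked up on. A minimal sketch:

```python
lst = [0] * 100000   # a large, long-lived container
m = lst.append       # small, ephemeral bound-method object

# the method object keeps the whole list reachable through its
# back-reference, which is why a collector that traverses m ends
# up considering lst as well
assert m.__self__ is lst
```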
Haulin' garbage, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From jacobs@penguin.theopalgroup.com Fri Jun 28 21:11:41 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 28 Jun 2002 16:11:41 -0400 (EDT) Subject: [Python-Dev] Garbage collector problem In-Reply-To: <20020628130418.D10441@glacier.arctrix.com> Message-ID: On Fri, 28 Jun 2002, Neil Schemenauer wrote: > Another idea would be exploit the fact that we know most of the root > objects (e.g. sys.modules and the current stack of frames). I haven't > figured out a good use for this knowledge though. If the root objects cannot be reached by the GC traversal, you get the approach that Jeremy is suggesting. (Though I just looked, and frame objects aren't exempt from tracking) -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From jeremy@zope.com Fri Jun 28 16:49:46 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 28 Jun 2002 11:49:46 -0400 Subject: [Python-Dev] Garbage collector problem In-Reply-To: References: <20020628130418.D10441@glacier.arctrix.com> Message-ID: <15644.34202.289500.35641@slothrop.zope.com> >>>>> "KJ" == Kevin Jacobs writes: KJ> On Fri, 28 Jun 2002, Neil Schemenauer wrote: >> Another idea would be exploit the fact that we know most of the >> root objects (e.g. sys.modules and the current stack of frames). >> I haven't figured out a good use for this knowledge though. KJ> If the root objects cannot be reached by the GC traversal, you KJ> get the approach that Jeremy is suggesting. (Though I just KJ> looked, and frame objects aren't exempt from tracking) Right. My suggestion is to not track a set of objects that otherwise would be tracked -- the current frame and its local variables. 
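[Editor's note] Jeremy's observation — that the running frame pins its local variables anyway — can be poked at from Python using the CPython-specific `sys._getframe()` hook:

```python
import sys

def demo():
    x = [1, 2, 3]            # a local of the currently running frame
    frame = sys._getframe()  # the frame executing demo() right now
    # while this frame is alive, x stays reachable through f_locals,
    # so tracking it separately for cycle collection buys little
    return frame.f_locals["x"] is x

assert demo()
```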
Jeremy From tim.one@comcast.net Fri Jun 28 22:49:58 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 28 Jun 2002 17:49:58 -0400 Subject: [Python-Dev] Garbage collector problem In-Reply-To: <15644.31005.110922.650771@slothrop.zope.com> Message-ID: [Jeremy, to Kevin Jacobs] > ... > Your suggestion seems to be that we should treat references from older > generations to newer generations as external roots. That's the way it works now: an object in gen N *is* an external root wrt any object it references in gen I with I < N. > So a cycle that spans generations will not get collected until > everything is in the same generation. Right, and that's what happens (already). When gen K is collected, all gens <= K are smushed into gen K at the start, and all trash cycles are collected except for those that contain at least one object in gen K+1 or higher. > Indeed, that does not seem harmful. It hasn't been so far , although you can certainly construct cases where it causes an inconvenient delay in trash collection. > On the other hand, it's hard to reconcile an intuitive notion of > generation with what we're doing by running GC over and over as you > add more elements to your list. It doesn't seem right that your list > becomes an "old" object just because a single function allocates 100k > young objects. That is, I wish the notion of generations accommodated > a baby boom in a generation. I don't think you do. Pushing the parent object into an older generation is exactly what's supposed to save us from needing to scan all its children every time a gen0 collection occurs. Under 2.2.1, Kevin's test case pushes "the list" into gen2 early on, and those of the list's children that existed at that time are never scanned again until another gen2 collection occurs. For a reason I still haven't determined, under current CVS "the whole list" is getting scanned by move_root_reachable() every time a gen0 collection occurs. 
It's also getting scanned by both subtract_refs() and move_root_reachable() every time a gen1 collection occurs. I'm not yet sure whether the mystery is why this happens in 2.3, or why it doesn't happen in 2.2.1 <0.5 wink>. From gmcm@hypernet.com Fri Jun 28 23:40:33 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 28 Jun 2002 18:40:33 -0400 Subject: [Python-Dev] Garbage collector problem In-Reply-To: References: <15644.31005.110922.650771@slothrop.zope.com> Message-ID: <3D1CADA1.25053.99DBE9E@localhost> On 28 Jun 2002 at 16:07, Kevin Jacobs wrote: > In some tests using real workloads, I've found that > upwards of 99% of these ephemoral objects never make > it to a generation 1 collection anyway. Um, the objects are ephemeral. You were probably thinking of Uncle Timmy. -- Gordon http://www.mcmillan-inc.com/ From tim.one@comcast.net Fri Jun 28 23:58:30 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 28 Jun 2002 18:58:30 -0400 Subject: [Python-Dev] Garbage collector problem In-Reply-To: Message-ID: [Tim] > ... > I'm not yet sure whether the mystery is why this happens in 2.3, or > why it doesn't happen in 2.2.1 <0.5 wink>. Knock that down to 0.1 wink <0.3 wink>: Kevin's problem goes away in current CVS if I change the guard in visit_decref() to if (IS_TRACKED(op) && !IS_MOVED(op)) ^^^^^^^^^^^^^^^^ added this I've no real idea why, as 2.2.1 didn't need this to prevent "the list" from getting continually pulled back into a younger generation. Without this change in current CVS, it looks like, in each gen0 collection: a. The bound method object in gen0 knocks "the list"'s gc_refs down to -124 when visit_decref() is called by the bound method object traverse via subtract_refs(). Therefore IS_MOVED("the list") is no longer true. b. move_root_reachable() then moves "the list" into the list of reachable things, because visit_move's has-always-been-there if (IS_TRACKED(op) && !IS_MOVED(op)) { guard doesn't believe "the list" has already been moved. 
visit_move then restores the list's gc_refs to the magic -123. c. move_root_reachable() goes on to scan all of "the list"'s entries too. d. "the list" itself gets moved into gen1, just because it's in the list of reachable things. e. The next gen0 collection starts at #a again, and does the same stuff all over again. Adding the new guard in visit_decref() breaks this at #a: IS_MOVED("the list") remains true, and so #b doesn't move "the list" into the set of reachable objects again, and so the list stays in whichever older generation it was in, and doesn't get scanned again (until the next gen2 traversal). The mystery to me now is why the a,b,c,d,e loop didn't happen in 2.2.1. From aahz@pythoncraft.com Sat Jun 29 00:03:43 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 28 Jun 2002 19:03:43 -0400 Subject: [Python-Dev] Improved tmpfile module In-Reply-To: <20020627221228.GB9371@codesourcery.com> References: <20020627221228.GB9371@codesourcery.com> Message-ID: <20020628230343.GA6262@panix.com> On Thu, Jun 27, 2002, Zack Weinberg wrote: > > This is largely as it was in the old file. I happen to know that ~%s~ > is conventional for temporary files on Windows. I changed 'tmp%s' to > 'pyt%s' for Unix to make it consistent with Mac/RiscOS > > Ideally one would allow the calling application to control the prefix, but > I'm not sure what the right interface is. Maybe > > tmpfile.mkstemp(prefix="", suffix="") > > where if one argument is provided it gets treated as the suffix, but > if two are provided the prefix comes first, a la range()? Is there a > way to express that in the prototype? The main problem with this is that range() doesn't support keyword arguments, just positional ones. In order to get the same effect with mkstemp, you'd have to do def tmpfile.mkstemp(*args): and raise an exception with more than two arguments.
Otherwise, if you allow keyword arguments, you get the possibility of: tmpfile.mkstemp(prefix="foo") and you can't distinguish that from tmpfile.mkstemp("foo") unless you change the prototype to def tmp.mkstemp(*args, **kwargs): which requires a bit more of a song-and-dance setup routine. In any event, you probably should not use empty strings as the default parameters; use None instead. (Yeah, this is getting a bit off-topic for python-dev; I'm just practicing for my book. ;-) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From tim.one@comcast.net Sat Jun 29 00:25:24 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 28 Jun 2002 19:25:24 -0400 Subject: [Python-Dev] Garbage collector problem In-Reply-To: Message-ID: [Tim] > ... > The mystery to me now is why the a,b,c,d,e loop didn't happen in 2.2.1. Because 2.2.1 has a bug in PyCFunction_New(), which ends with op->m_self = self; PyObject_GC_Init(op); return (PyObject *)op; But also in 2.2.1, /* This is here for the sake of backwards compatibility. Extensions that * use the old GC API will still compile but the objects will not be * tracked by the GC. */ #define PyGC_HEAD_SIZE 0 #define PyObject_GC_Init(op) #define PyObject_GC_Fini(op) #define PyObject_AS_GC(op) (op) #define PyObject_FROM_GC(op) (op) IOW, PyObject_GC_Init(op) is a nop in 2.2.1, and the bound method object never gets tracked. Therefore the a,b,c,d,e loop never gets started. In current CVS, the function ends with op->m_self = self; _PyObject_GC_TRACK(op); return (PyObject *)op; and a world of fun follows . 
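[Editor's note] Coming back to Aahz's point above about range()-style positional arguments: the dispatch he describes can be sketched like this (`mkstemp_args` is a hypothetical stand-in for illustration, not the proposed tmpfile API):

```python
def mkstemp_args(*args):
    """range()-style positional handling: one argument is the suffix,
    two arguments are (prefix, suffix)."""
    if len(args) == 0:
        prefix, suffix = "tmp", ""
    elif len(args) == 1:
        prefix, suffix = "tmp", args[0]
    elif len(args) == 2:
        prefix, suffix = args
    else:
        raise TypeError("mkstemp_args takes at most 2 arguments")
    return prefix, suffix

assert mkstemp_args(".txt") == ("tmp", ".txt")
assert mkstemp_args("pyt", ".txt") == ("pyt", ".txt")
```

The downside Aahz identifies is visible here: a keyword call like `mkstemp_args(prefix="foo")` simply raises TypeError, and supporting it cleanly forces the `(*args, **kwargs)` song-and-dance.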
From gward@python.net Sat Jun 29 00:56:28 2002 From: gward@python.net (Greg Ward) Date: Fri, 28 Jun 2002 19:56:28 -0400 Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8 In-Reply-To: <20020626082135.16733.qmail@web20905.mail.yahoo.com> References: <20020626082135.16733.qmail@web20905.mail.yahoo.com> Message-ID: <20020628235628.GA2634@gerg.ca> On 26 June 2002, Lance Ellinghaus said: > I had to get forkpty() and openpty() working under Solaris 2.8 for a > project I am working on. > Here are the diffs to the 2.2.1 source file. Patches will get lost in the shuffle on python-dev. You should a) make the patch relative to the current CVS, b) submit it to SourceForge, and c) keep your eye on the ball until someone checks it in. Thanks! Greg -- Greg Ward - geek-at-large gward@python.net http://starship.python.net/~gward/ War is Peace; Freedom is Slavery; Ignorance is Knowledge From aahz@pythoncraft.com Sat Jun 29 01:21:23 2002 From: aahz@pythoncraft.com (Aahz) Date: Fri, 28 Jun 2002 20:21:23 -0400 Subject: [Python-Dev] list.extend In-Reply-To: <05b501c21ec1$afc8a980$6501a8c0@boostconsulting.com> References: <15644.36598.681690.547336@anthem.wooz.org> <05b501c21ec1$afc8a980$6501a8c0@boostconsulting.com> Message-ID: <20020629002123.GC18004@panix.com> On Fri, Jun 28, 2002, David Abrahams wrote: > > Part of the deal with my natural language system at Dragon was that they > wanted me to work on Dutch translation, but I don't know Dutch so I used > Python and figured that would be enough. It turns out that Dutch sounds a > lot like burping to my ear. I think you can see where this is headed... Yeah, it means that Orlijn has a guaranteed job. 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Project Vote Smart: http://www.vote-smart.org/ From gward@python.net Sat Jun 29 01:24:55 2002 From: gward@python.net (Greg Ward) Date: Fri, 28 Jun 2002 20:24:55 -0400 Subject: [Python-Dev] Improved tmpfile module In-Reply-To: <20020627221228.GB9371@codesourcery.com> References: <20020627221228.GB9371@codesourcery.com> Message-ID: <20020629002455.GB2634@gerg.ca> On 27 June 2002, Zack Weinberg said: > Well, I wrote the analogous code in the GNU C library (using basically > the same algorithm). I'm confident it is safe on a Unix-based system. > On Windows and others, I am relying on os.open(..., os.O_EXCL) to do > what it claims to do; assuming it does, the code should be safe there too. Sounds like good credentials to me. Welcome to Python-land! Note that you'll probably get more positive feedback if you provide a patch to tmpfile.py rather than a complete rewrite. And patches will get lost on python-dev -- you should submit it to SourceForge, and stay on the case until the patch is accepted or rejected (or maybe deferred). [me] > +1 except for the name. What does the "s" stand for? Unfortunately, I > can't think of a more descriptive name offhand. [Zack] > Fredrik Lundh's suggestion that it is for "safer" seems plausible, but > I do not actually know. I chose the names mkstemp and mkdtemp to > match the functions of the same name in most modern Unix C libraries. > Since they don't take the same "template" parameter that those > functions do, that was probably a bad idea. Hmmmm... I'm torn here. When emulating (or wrapping) functionality from the standard C library or Unix kernel, I think it's generally good to preserve familiar, long-used names: os.chmod() is better than os.changemode() (or change_mode(), if I wrote the code). But mkstemp() and mkdtemp() are *not* familiar, long-used names. (At least not to me -- I program in C very rarely!) But they will probably become more familiar over time. 
Also, API changes that are just due to fundamental differences between C and Python (immutable strings, multiple return values) are not really enough reason to change a name. It looks like your Python mkstemp() has one big advantage over the glibc mkstemp() -- you can supply a suffix. IMHO, the inability to supply a prefix is a small disadvantage. But those add up to a noticeably different API. I think I'm slightly in favour of a different name for the Python version. If you make it act like this: mkstemp(template : string = (sensible default), suffix : string = "") -> (filename : string, fd : int) (err, I hope my personal type language is comprehensible), then call it mkstemp() after all. > [Note to Fredrik: at the C level, mkstemp is not deprecated in favor > of tmpfile, as they do very different things - tmpfile(3) is analogous > to tmpfile.TemporaryFile(), you don't get the file name back.] But the man page for mkstemp() in glibc 2.2.5 (Debian unstable) says: Don't use this function, use tmpfile(3) instead. It is better defined and more portable. BTW, that man page has two "NOTES" sections. > I was trying to get all the user-accessible interfaces to be at the > top of the file. Also, I do not understand the bits in the existing > file that delete names out of the module namespace after we're done > with them, so I wound up taking all of that out to get it to work. I > think the existing file's organization was largely determined by those > 'del' statements. > > I'm happy to organize the file any way y'all like -- I'm kind of new > to Python and I don't know the conventions yet. If I was starting from scratch, I would *probably* do something like this: if os.name == "posix": class TemporaryFile: [... define Unix version of TemporaryFile ...] elif os.name == "nt": class NamedTemporaryFile: [...] class TemporaryFile: [... on top of NamedTemporaryFile ...] elif os.name == "macos": # beats me But I don't know the full history of this module. 
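[Editor's note] Greg's per-platform sketch, filled out just enough to execute (the class bodies are placeholders for illustration, not a real tmpfile implementation):

```python
import os

if os.name == "posix":
    class TemporaryFile:
        """Unix flavour: could unlink the file right after opening it."""
        platform = "posix"
elif os.name == "nt":
    class NamedTemporaryFile:
        """Windows flavour: the file keeps a visible name while open."""
        platform = "nt"
    class TemporaryFile(NamedTemporaryFile):
        """Built on top of NamedTemporaryFile, as Greg suggests."""
else:
    class TemporaryFile:
        platform = os.name

assert TemporaryFile.platform == os.name
```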
IMHO you would have a much better chance of success if you prepared a couple of patches -- eg. one to add mkstemp() and mkdtemp() (possibly with different names), another to do whatever it is to TemporaryFile that you want to do. Possibly a third for general code cleanup, if you feel some is needed. (Or do the code cleanup patch first, if that's what's needed.) Greg -- Greg Ward - just another Python hacker gward@python.net http://starship.python.net/~gward/ I hope something GOOD came in the mail today so I have a REASON to live!! From nas@python.ca Sat Jun 29 04:01:27 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 28 Jun 2002 20:01:27 -0700 Subject: [Python-Dev] On the topic of garbage collection Message-ID: <20020628200127.A11344@glacier.arctrix.com> Seen on the net: http://www.ravenbrook.com/project/mps/ The Memory Pool System is a very general, adaptable, flexible, reliable, and efficient memory management system. It permits the flexible combination of memory management techniques, supporting manual and automatic memory management, in-line allocation, finalization, weakness, and multiple concurrent co-operating incremental generational garbage collections. It also includes a library of memory pool classes implementing specialized memory management policies. The code is offered under an open source license. Neil From fredrik@pythonware.com Sat Jun 29 12:02:13 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 29 Jun 2002 13:02:13 +0200 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() References: <001f01c21ed9$873f3c00$06ea7ad1@othello> Message-ID: <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> raymond wrote: > As far as I can tell, buffer() is one of the least used or known about > Python tools. What do you guys think about this as a candidate for silent > deprecation (moving out of the primary documentation)? +1, in theory. does anyone have any real-life use cases? 
I've never been able to use it for anything, and cannot recall ever seeing it being used by anyone else... (it sure doesn't work for the use cases I thought of when first learning about the API...) From tim.one@comcast.net Sat Jun 29 13:22:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 29 Jun 2002 08:22:28 -0400 Subject: [Python-Dev] Garbage collector problem In-Reply-To: Message-ID: [Kevin Jacobs] Nice job, Kevin! You learned a lot in a hurry here. I'll try to fill in some blanks. > ... > lst = [] > for i in range(100000): > lst.append( (1,) ) > > The key ingredients are: > > 1) A method is called on a container (rather than __setitem__ or > __setattr__). > > 2) A new object is allocated while the method object lives on the Python > VM stack, as shown by the disassembled bytecodes: > > 40 LOAD_FAST 1 (lst) > 43 LOAD_ATTR 3 (append) > 46 LOAD_CONST 2 (1) > 49 BUILD_TUPLE 1 > 52 CALL_FUNCTION 1 > > These ingredients combine in the following way to trigger quadratic-time > behavior in the Python garbage collector: > > * First, the LOAD_ATTR on "lst" for "append" is called, and a > PyCFunction is returned from this code in descrobject.c:method_get: > > return PyCFunction_New(descr->d_method, obj); > > Thus, a _new_ PyCFunction is allocated every time the method is > requested. In outline, so far all that has been true since 0 AP (After Python). > * This new method object is added to generation 0 of the garbage > collector, which holds a reference to "lst". It's a bug in 2.2.1 that the method object isn't getting added to gen0. It is added in current CVS. > * The BUILD_TUPLE call may then trigger a garbage collection cycle. > > * Since the "append" method is in generation 0, Yes. > the reference traversal must also follow all objects within "lst", > even if "lst" is in generation 1 or 2. According to me, it's a bug that it does so in current CVS, and a bug that's been in cyclic gc forever. 
This kind of gc scheme isn't "supposed to" chase old objects (there's no point to doing so -- if there is a reclaimable cycle in the young generation, the cycle is necessarily composed of pure young->young pointers, so chasing a cross-generation pointer can't yield any useful info). It's not a *semantic* error if it chases old objects too, but it does waste time, and can (as it does here) yank old objects back to a younger generation. I attached a brief patch to your bug report that stops this. > This traversal requires time linear in the number of > objects in "lst", thus increasing the overall time complexity of the > code to quadratic in the number of elements in "lst". Yes. Do note that this class of program is quadratic-time anyway, just because the rare gen2 traversals have to crawl over an ever-increasing lst too. BTW, the "range(100000)" in your test program also gets crawled over every time a gen2 collection occurs! That's why Neil made them rare . > Also note that this is a much more general problem than this > small example. Sure, although whether it's still "a real problem" after the patch is open to cost-benefit ridicule . > It can affect many types of objects in addition to methods, including > descriptors, iterator objects, and any other object that contains a "back > reference". > > So, what can be done about this.... One simple solution would be to not > traverse some "back references" if we are collecting objects in generation > 0. > > This will avoid traversing virtually all of these ephemoral > objects that will trigger such expensive behavior. If they live long > enough to pass through to generation one or two, then clearly they > should be traversed. > > So, what do all of you GC gurus think? Provided that my analysis > is sound, I can rapidly propose a patch to demonstrate this approach if > there is sufficient positive sentiment. Seeing a patch is the only way I'd understand your intent. You can understand my intent by reading my patch . 
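[Editor's note] For reference, the workload being dissected in this thread boils down to a loop like the one below; absolute numbers vary enormously between interpreter versions, so this only reproduces the shape of the test, not the reported figures:

```python
import gc
import time

def bench(n):
    # mirror of Kevin's test case: each iteration allocates a tuple
    # while a freshly created bound-method object is live on the VM stack
    lst = []
    for i in range(n):
        lst.append((1,))
    return lst

gc.enable()
t0 = time.perf_counter()
bench(20000)
elapsed = time.perf_counter() - t0
```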
> ... > > PS: I have not looked into why this doesn't happen in Python 2.2.x or > before. It's a bug in 2.2.1 (well, two bugs, if Neil accepts my claim that the patch I put up "fixes a bug" too). In 2.1, method objects hadn't yet been added to cyclic gc. From fredrik@pythonware.com Sat Jun 29 13:39:07 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 29 Jun 2002 14:39:07 +0200 Subject: [Python-Dev] Priority queue (binary heap) python code References: <036101c21e68$8abed730$6501a8c0@boostconsulting.com> Message-ID: <01f201c21f69$f7600f10$ced241d5@hagrid> david wrote: > > OTOH, multiple attempts by multiple groups to steal Perl's > > regexp engine years ago fizzled out in a tarpit of frustration. > > Oh, I had the impression that Python's re *was* pilfered Perl. Tim warned me that the mere attempt to read sources for existing RE implementations was a sure way to destroy my brain, so I avoided that. SRE is a clean-room implementation, using the Python 1.5.2 docs and the regression test suite as the only reference. I've never written a Perl regexp in my life. From jacobs@penguin.theopalgroup.com Sat Jun 29 14:03:32 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Sat, 29 Jun 2002 09:03:32 -0400 (EDT) Subject: [Python-Dev] Garbage collector problem In-Reply-To: Message-ID: On Sat, 29 Jun 2002, Tim Peters wrote: > Nice job, Kevin! You learned a lot in a hurry here. I'll try to fill in > some blanks. Thanks for the great sleuthing, Tim. I missed a few critical details about how the GC system was intended to work. It was not initially clear that most GC traversals were not recursive. i.e., I had assumed that functions like update_refs and subtract_refs did a DFS through all reachable references, instead of a shallow 1-level search. Of course, it all makes much more sense now. 
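[Editor's note] The shallow, one-level traversal Kevin describes can be mimicked with toy Python objects (an illustration of the idea, not the real C machinery behind tp_traverse):

```python
class Node:
    def __init__(self, *children):
        self.children = list(children)
    def traverse(self, visit):
        # like tp_traverse: call visit() on direct referents only,
        # with no recursion into grandchildren
        for child in self.children:
            visit(child)

seen = []
leaf = Node()
mid = Node(leaf)
root = Node(mid)
root.traverse(seen.append)

assert seen == [mid]      # only the direct child was visited
assert leaf not in seen   # the grandchild is reached on a later visit
```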
Here are the results of my test program (attached to the SF bug) with and without your patch installed (2.3a0+ and 2.3a0-, respectively) and GC enabled:

N         20000    40000    80000   160000   240000   320000   480000   640000
Ver.   -------- -------- -------- -------- -------- -------- -------- --------
1.5.2  316450/s 345590/s 349609/s 342895/s 351352/s 353734/s 345362/s 350978/s
2.0    183723/s 192671/s 174146/s 151661/s 154592/s 127181/s 114903/s  99469/s
2.2.1  228553/s 234018/s 197809/s 166019/s 171306/s 137840/s 122835/s 105785/s
2.3a0- 164968/s 111752/s  68220/s  38129/s  26098/s  19678/s  13488/s  10396/s
2.3a0+ 291286/s 287168/s 284857/s 233244/s 196731/s 170759/s 135541/s 129851/s

There is still room for improvement, but overall I'm happy with the performance of 2.3a0+. > > So, what do all of you GC gurus think? Provided that my analysis > > is sound, I can rapidly propose a patch to demonstrate this approach if > > there is sufficient positive sentiment. > > Seeing a patch is the only way I'd understand your intent. You can > understand my intent by reading my patch . When functioning correctly, the current garbage collector already does what I was suggesting (in more generality, to boot). No need for a patch. Thanks again, Tim. It was a lively chase through some of the strange and twisted innards of my favorite language. Off to write boring code again, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From Juergen Hermann" Message-ID: <17OJ5m-171Y92C@fwd10.sul.t-online.com> On Sat, 29 Jun 2002 13:02:13 +0200, Fredrik Lundh wrote: >does anyone have any real-life use cases? I've never been >able to use it for anything, and cannot recall ever seeing it >being used by anyone else... We use it for BLOB support in our Python binding to our C++ binding to Oracle OCI.
Oracle allows loading of limited ranges out of BLOBs, and the buffer interface perfectly fits into that. Ciao, Jürgen From mwh@python.net Sat Jun 29 16:07:46 2002 From: mwh@python.net (Michael Hudson) Date: 29 Jun 2002 16:07:46 +0100 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() In-Reply-To: j.her@t-online.de's message of "Sat, 29 Jun 2002 16:20:57 +0200" References: <17OJ5m-171Y92C@fwd10.sul.t-online.com> Message-ID: <2mlm8yw2bh.fsf@starship.python.net> j.her@t-online.de (Juergen Hermann) writes: > On Sat, 29 Jun 2002 13:02:13 +0200, Fredrik Lundh wrote: > > >does anyone have any real-life use cases? I've never been > >able to use it for anything, and cannot recall ever seeing it > >being used by anyone else... > > We use it for BLOB support in our Python binding to our C++ binding to > Oracle OCI. Oracle allows loading of limited ranges out of BLOBs, and > the buffer interface perfectly fits into that. But that's from C, right? I don't think anyone's suggested removing the C-level buffer interface. Cheers, M. -- I think perhaps we should have electoral collages and construct our representatives entirely of little bits of cloth and papier mache. -- Owen Dunn, ucam.chat, from his review of the year From tismer@tismer.com Sat Jun 29 16:48:00 2002 From: tismer@tismer.com (Christian Tismer) Date: Sat, 29 Jun 2002 17:48:00 +0200 Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions References: <3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> <3D162060.9030101@tismer.com> <3D162342.BBDC07B3@prescod.net> Message-ID: <3D1DD6B0.2020603@tismer.com> Extended proposal at the end: Paul Prescod wrote: > Christian Tismer wrote: >>... >> >>Are you sure you got what I meant? >>I want to compile the variable references away at compile >>time, resulting in an ordinary format string.
>>This string is wrapped by the runtime _(), and
>>the result is then interpolated with a dict.
>
> How can that be?
>
> Original expression:
>
> _($"$foo")
>
> Expands to:
>
> _("%(x1)s"%{"x1": foo})
>
> Standard Python order of operations will do the %-interpolation before
> the method call! You say that it could instead be
>
> _("%(x1)s")%{"x1": foo}
>
> But how would Python know to do that? "_" is just another function.
> There is nothing magical about it. What if the function was instead
> re.compile? In that case I would want to do the interpolation *before*
> the compilation, not after!
>
> Are you saying that the "_" function should be made special and
> recognized by the compiler?

My idea has evolved into the following: Consider an interpolating object with the following properties (sketched by a class here):

class Interpol:
    def __init__(self, fmt, dic):
        self.fmt = fmt
        self.dic = dic
    def __repr__(self):
        return self.fmt % self.dic

Original expression:

    _($"$foo")

Expands at compile time to:

    _( Interpol("%(x1)s", {"x1": foo}) )

Having said that, it is now up to the function _() to test whether its argument is an Interpol or not. It can do something like that:

def _(arg):
    ...
    if type(arg) is Interpol:
        return _(arg.fmt) % arg.dic

# or, maybe cleaner, leaving the formatting action
# to the Interpol class:

def _(arg):
    ...
    if isinstance(arg, Interpol):
        return arg.__class__(_(arg.fmt), arg.dic)

# which then in turn will return the final string,
# if it is interrogated via str or repr.

ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today?
http://www.stackless.com/ From martin@v.loewis.de Sat Jun 29 18:59:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 29 Jun 2002 19:59:47 +0200 Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8 In-Reply-To: <20020626082135.16733.qmail@web20905.mail.yahoo.com> References: <20020626082135.16733.qmail@web20905.mail.yahoo.com> Message-ID: Lance Ellinghaus writes: > Please let me know if anyone has any problems with this! I do. I have the general problem with posting such patches to python-dev; please put them onto SF instead. For specific problems, please see below. > ! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(sun) > ! #ifdef sun > ! #include > ! #endif I don't like #if defines. What is the problem, and why can't it be solved with a HAVE_ test? Also, are you certain your changes apply to all systems that define sun? > + master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY); /* open master */ > + sig_saved = signal(SIGCHLD, SIG_DFL); > + grantpt(master_fd); /* change permission of slave */ > + unlockpt(master_fd); /* unlock slave */ > + signal(SIGCHLD,sig_saved); > + slave_name = ptsname(master_fd); /* get name of slave */ > + slave_fd = open(slave_name, O_RDWR); /* open slave */ > + ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ > + ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm*/ > + ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat*/ Again, that is a fragment that seems to apply to more systems than just Solaris. It appears that atleast HP-UX has the same API, perhaps other SysV systems have that as well. On some of these other systems, ttcompat is not used, see http://ou800doc.caldera.com/SDK_sysprog/_Pseudo-tty_Drivers_em_ptm_and_p.html for an example. So I wonder whether it should be used by default - especially since the Solaris man page says that it can be autopushed as well. Regards, Martin From martin@v.loewis.de Sat Jun 29 19:02:29 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 29 Jun 2002 20:02:29 +0200 Subject: [Python-Dev] Asyncore/asynchat In-Reply-To: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com> References: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com> Message-ID: "Steve Holden" writes: > I notice that Sam Rushing's code tends to use spaces before the parentheses > around argument lists. Should I think about cleaning up the code at the same > time, or are we best letting sleeping dogs lie? The general principle seems to be that cleanup can be done while the module is reviewed, anyway. So I think doing these changes together with the documentation is appropriate. Regards, Martin From martin@v.loewis.de Sat Jun 29 20:24:16 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 29 Jun 2002 21:24:16 +0200 Subject: [Python-Dev] Building Python cvs w/ gcc 3.1 In-Reply-To: <15643.25537.767831.983206@anthem.wooz.org> References: <15643.25537.767831.983206@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > There's no switch to turn off these warnings. The GCC developers now consider this entire warning a bug in the compiler. Nobody can recall the rationale for the warning, and it will likely go away. Just remove it from your GCC sources if it bothers you too much. Regards, Martin From xscottg@yahoo.com Sat Jun 29 20:59:20 2002 From: xscottg@yahoo.com (Scott Gilbert) Date: Sat, 29 Jun 2002 12:59:20 -0700 (PDT) Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() In-Reply-To: <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> Message-ID: <20020629195920.64153.qmail@web40104.mail.yahoo.com> --- Fredrik Lundh wrote: > > does anyone have any real-life use cases? I've never been > able to use it for anything, and cannot recall ever seeing it > being used by anyone else... 
> As far as I can tell, it only has two uses - To create a (read only) subview of some other object without making a copy:

    a = array.array('b', [0])*16*1024*1024
    b = buffer(a, 512, 1024*1024)   # Cheap 1M view of 16M object

Or to add string-like qualities to an object which supports the PyBufferProcs interface, but didn't bother to support a string-like interface. There don't appear to be any of those in the Python core, so here is a bogus example:

    l = lazy.slacker()
    b = buffer(l)
    x = b[1024:1032]

> > (it sure doesn't work for the use cases I thought of when > first learning about the API...) > I think that's the reason that no one ever fixes its quirks and bugs. As soon as you understand what it is, you realize that even if it was fixed it isn't very useful. What would be useful is a mutable array of bytes that you could optionally construct from pointer and destructor, that pickled efficiently (no copy to string), and that reliably retained its pointer value after letting go of the GIL (no realloc). __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From lellinghaus@yahoo.com Sat Jun 29 21:57:05 2002 From: lellinghaus@yahoo.com (Lance Ellinghaus) Date: Sat, 29 Jun 2002 13:57:05 -0700 (PDT) Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8 In-Reply-To: Message-ID: <20020629205705.28248.qmail@web20905.mail.yahoo.com> Martin: See my comments below please... --- "Martin v. Loewis" wrote: > Lance Ellinghaus writes: > > > Please let me know if anyone has any problems with this! > > I do. I have the general problem with posting such patches to > python-dev; please put them onto SF instead. For specific problems, > please see below. I did not think just anyone could post to the python section on SF. My mistake. The rest of the comments are below... > > ! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || > defined(sun) > > !
#ifdef sun > > ! #include > > ! #endif > > I don't like #if defines. What is the problem, and why can't > it be solved with a HAVE_ test? The problem is that Solaris (SUN) does NOT have openpty() and does not have forkpty().. So what HAVE_ test would you suggest? What would I test for? I guess I could have tested for "grantpt()", but testing for "sun" works as needed. I understand your PERSONAL problem with testing for SYSTEMs.. but that does not mean it is WRONG.. > Also, are you certain your changes apply to all systems that define > sun? Yes. All currently supported Solaris systems will need this patch to provide openpty() and forkpty() services. Supported Solaris is 2.8. This should work with 2.9 as well. > > + master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY); /* open > master */ > > + sig_saved = signal(SIGCHLD, SIG_DFL); > > + grantpt(master_fd); /* change > permission of slave */ > > + unlockpt(master_fd); /* unlock slave > */ > > + signal(SIGCHLD,sig_saved); > > + slave_name = ptsname(master_fd); /* get name of > slave */ > > + slave_fd = open(slave_name, O_RDWR); /* open slave */ > > + ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ > > + ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm*/ > > + ioctl(slave_fd, I_PUSH, "ttcompat"); /* push > ttcompat*/ > > Again, that is a fragment that seems to apply to more systems than > just Solaris. It appears that atleast HP-UX has the same API, perhaps > other SysV systems have that as well. This may be the case. I was not coding for these other systems. I was only coding for Sun Solaris 2.8. If someone wants to test it on those other systems, then it could be expanded for them. > On some of these other systems, ttcompat is not used, see > > http://ou800doc.caldera.com/SDK_sysprog/_Pseudo-tty_Drivers_em_ptm_and_p.html > Again, was I coding for other systems? No. Hence the "#if defined(sun)". Again, many other systems do not need this patch as they already have forkpty() and openpty() defined. > for an example. 
So I wonder whether it should be used by default - > especially since the Solaris man page says that it can be autopushed > as well. Yes. You can use the autopush feature, but that requires making changes to the OS level configuration files. If they have been autopushed, it will not reload them. You do not want the requirement of making changes to the OS level configuration files if you can keep from having to do it. BTW: This is how SSH, EMACS, and other programs do it (YES I LOOKED!). Lance ===== -- Lance Ellinghaus __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From python@rcn.com Sat Jun 29 23:04:28 2002 From: python@rcn.com (Raymond Hettinger) Date: Sat, 29 Jun 2002 18:04:28 -0400 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> Message-ID: <003b01c21fb8$f0e3ce20$ecb53bd0@othello> From: "Fredrik Lundh" > > As far as I can tell, buffer() is one of the least used or known about > > Python tools. What do you guys think about this as a candidate for silent > > deprecation (moving out of the primary documentation)? > > +1, in theory. And perhaps in practice. No replies were received from my buffer() survey on comp.lang.py: http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&th=e36ac767eb076bf4&rnum=1 > does anyone have any real-life use cases? I've never been > able to use it for anything, and cannot recall ever seeing it > being used by anyone else... Also, I scanned a few packages (just the ones I thought might use it like Gadfly, HTMLgen, Spark, etc) on the Vaults of Parnassus and found zero occurrences. My Google searches turned up empty and so did a grep of the library. Raymond Hettinger From barry@zope.com Sun Jun 30 00:57:39 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Sat, 29 Jun 2002 19:57:39 -0400 Subject: [Python-Dev] Building Python cvs w/ gcc 3.1 References: <15643.25537.767831.983206@anthem.wooz.org> Message-ID: <15646.18803.947941.467257@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: >> There's no switch to turn off these warnings. MvL> The GCC developers now consider this entire warning a bug in MvL> the compiler. Nobody can recall the rationale for the MvL> warning, and it will likely go away. That's what I gathered from reading some archives. MvL> Just remove it from your GCC sources if it bothers you too MvL> much. It really doesn't. I was giving gcc 3.1 a shake for something else, but that didn't turn out to be relevant, so I'll probably just wax it for now. -Barry From tim.one@comcast.net Sun Jun 30 03:07:43 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 29 Jun 2002 22:07:43 -0400 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() In-Reply-To: <003b01c21fb8$f0e3ce20$ecb53bd0@othello> Message-ID: Guido's last essay on the buffer interface is still worth reading: http://mail.python.org/pipermail/python-dev/2000-October/009974.html No progress on the issues discussed has been made since, and, to the contrary, recent changes go in directions Guido didn't want to go. Note that he was in favor of both gutting and deprecating the buffer object (as distinct from the buffer C API) "way back then" already. The only time I ever see buffer() used in Python code is in examples of how to crash Python . In practice, the positive way to look at it is that we've been following Finn Bock's advice with passion: Because it is so difficult to look at java storage as a sequence of bytes, I think I'm all for keeping the buffer() builtin and buffer object as obscure and unknown as possible .
From python@rcn.com Sun Jun 30 04:24:35 2002 From: python@rcn.com (Raymond Hettinger) Date: Sat, 29 Jun 2002 23:24:35 -0400 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() References: Message-ID: <008e01c21fe5$a9383480$17ea7ad1@othello> From: "Tim Peters" > Guido's last essay on the buffer interface is still worth reading: > > http://mail.python.org/pipermail/python-dev/2000-October/009974.html Thanks for the helpful pointer! :) > No progress on the issues discussed has been made since, and, to the > contrary, recent changes go in directions Guido didn't want to go. He sent me to you guys for direction. The change was based on the advice I got. The point is moot because a) it's not too late to change course to returning all buffer objects, b) because almost nobody uses it anyway, and c) it all should probably be deprecated. > Note that he was in favor of both gutting and deprecating the buffer object > (as distinct from the buffer C API) "way back then" already. The only time > I ever see buffer() used in Python code is in examples of how to crash > Python . Perhaps full deprecation (of the Python API not the C API) is in order. It's just one fewer item in the Python concept space. Besides, mmap() and iterators have already addressed some of the original need. > In practice, the positive way to look at it is that we've been following > Finn Bock's advice with passion: > > Because it is so difficult to look at java storage as a sequence of > bytes, I think I'm all for keeping the buffer() builtin and buffer > object as obscure and unknown as possible . Sounds almost like silent deprecation to me . Raymond Hettinger From martin@v.loewis.de Sun Jun 30 07:55:56 2002 From: martin@v.loewis.de (Martin v.
Loewis) Date: 30 Jun 2002 08:55:56 +0200 Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8 In-Reply-To: <20020629205705.28248.qmail@web20905.mail.yahoo.com> References: <20020629205705.28248.qmail@web20905.mail.yahoo.com> Message-ID: Lance Ellinghaus writes: > The problem is that Solaris (SUN) does NOT have openpty() and does not > have forkpty().. So what HAVE_ test would you suggest? What would I > test for? For the features you use: HAVE_PTMX, HAVE_GRANTPT, HAVE_SYSV_STREAMS, ... If you know they always come in groups, testing for a single one would be sufficient. > I guess I could have tested for "grantpt()", but testing for "sun" > works as needed. Does it work on SunOS 4 as well? > I understand your PERSONAL problem with testing for SYSTEMs.. but > that does not mean it is WRONG.. It is not just my personal problem; it is a maintenance principle for Python. Perhaps there should be a section on it in PEP 7. In this case, it is not only wrong because it is too inclusive (as it tests for Sun 4 as well). What's worse is that it is too exclusive: it will force us to produce long lists of tests for other systems that use the same mechanism. > > Also, are you certain your changes apply to all systems that define > > sun? > > Yes. All currently supported Solaris systems will need this patch to > provide openpty() and forkpty() services. Supported Solaris is 2.8. > This should work with 2.9 as well. Besides SunOS 4, are you *sure* it also works on, say, Solaris 2.5? > This may be the case. I was not coding for these other systems. I was > only coding for Sun Solaris 2.8. But you should be. > If someone wants to test it on those other systems, then it could be > expanded for them. No. Anybody expanding it for other systems will use the same style that you currently use, and we can look forward to a constant stream of patches saying "add this, and trust me - I'm the only one who has such a system".
If we later find that the version test was incorrect, we are at a loss as to what to do. > Again, was I coding for other systems? No. Again, this is my primary concern with that patch. > Hence the "#if defined(sun)". Again, many other systems do not need > this patch as they already have forkpty() and openpty() defined. Right, and autoconf will find out. However, that still leaves quite a number of systems that follow the STREAMS way of life. If there is a chance to support them simultaneously, then this should be done. > Yes. You can use the autopush feature, but that requires making changes > to the OS level configuration files. If they have been autopushed, it > will not reload them. You do not want the requirement of making changes > to the OS level configuration files if you can keep from having to do > it. BTW: This is how SSH, EMACS, and other programs do it (YES I > LOOKED!). That doesn't necessarily make it more right. What happens if you leave out the ttcompat module? Regards, Martin From pyth@devel.trillke.net Sun Jun 30 09:12:10 2002 From: pyth@devel.trillke.net (holger krekel) Date: Sun, 30 Jun 2002 10:12:10 +0200 Subject: [Python-Dev] Re: *Simpler* string substitutions In-Reply-To: <15637.9759.111784.481102@anthem.wooz.org>; from barry@zope.com on Sat, Jun 22, 2002 at 09:36:31PM -0400 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com> <15637.9759.111784.481102@anthem.wooz.org> Message-ID: <20020630101210.D20310@prim.han.de> Barry A. Warsaw wrote: > > >>>>> "PM" == Paul Moore writes: > > PM> 4. Access to variables is also problematic. Without > PM> compile-time support, access to nested scopes is impossible > PM> (AIUI). > > Is this really true? I think it was two IPC's ago that Jeremy and I > discussed the possibility of adding a method to frame objects that > would basically yield you the equivalent of globals+freevars+locals.
Explicit ways to get at the actual name-obj bindings for any particular code block are much appreciated. What's currently the best way to access lexically scoped names from inside a code block? holger From oren-py-d@hishome.net Sun Jun 30 18:39:03 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Sun, 30 Jun 2002 13:39:03 -0400 Subject: [Python-Dev] Xrange and Slices In-Reply-To: <000d01c21cdb$eb03b720$91d8accf@othello> References: <000d01c21cdb$eb03b720$91d8accf@othello> Message-ID: <20020630173903.GA37045@hishome.net> On Wed, Jun 26, 2002 at 02:37:17AM -0400, Raymond Hettinger wrote: > Wild idea of the day: > Merge the code for xrange() into slice(). There's a patch pending for this: www.python.org/sf/575515 Some issues related to the change: xrange currently accepts only integer arguments. With this change it will accept any type and the exception will be raised when iteration is attempted. Is this a problem? The canonical use of xrange is to use it immediately in a for statement so it will probably go unnoticed. Should xrange be an alias for slice or the other way around? Personally I think that xrange is the more familiar of the two so the merged object should be called xrange. Its repr should also be like that of xrange, suppressing the display of unnecessary None arguments. One of the differences between slice and xrange is that slices are allowed to have open-ended ranges such as slice(10, None). It may be useful (and probably quite controversial...) to allow open-ended xranges too, defaulting to INT_MAX or INT_MIN, depending on the sign of the step. It's useful in for loops where you know you will bail out with break and also for zip. A possible extension is to add a method iterslice(len) to slice/xrange that exposes the functionality of PySlice_GetIndicesEx.
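[The index computation that PySlice_GetIndicesEx performs is visible from modern Python as slice.indices(length), so the proposed iterslice() can be sketched as a free function. The names iterslice and MyList below are hypothetical illustrations, not existing APIs.]

```python
def iterslice(s, length):
    # Clip a slice to a sequence of the given length and iterate the
    # resulting indices; slice.indices() is the Python-level face of
    # PySlice_GetIndicesEx, handling negative and open-ended bounds.
    start, stop, step = s.indices(length)
    return range(start, stop, step)


class MyList(list):
    # hypothetical container using iterslice() for slice indexing
    def __getitem__(self, index):
        if isinstance(index, slice):
            return [list.__getitem__(self, i)
                    for i in iterslice(index, len(self))]
        return list.__getitem__(self, index)


m = MyList("abcdef")
print(m[slice(None, None, -2)])  # -> ['f', 'd', 'b']
```

Open-ended ranges come out clipped for free: slice(10, None) applied against a length-5 sequence simply yields no indices.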
With this change the following code should work correctly for all forms of slicing:

    def __getitem__(self, index):
        if isinstance(index, xrange):
            return [self[i] for i in index.iterslice(len(self))]
        else:
            ... implement integer indexing for this container class

This extension, BTW, is independent of whether slice/xrange merging is accepted or not. Oren From Oleg Broytmann Sun Jun 30 20:48:20 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Sun, 30 Jun 2002 23:48:20 +0400 Subject: [Python-Dev] Infinite recursion in Pickle Message-ID: <20020630234820.A1006@phd.pp.ru> Hello! Nobody noted the message in c.l.py, let me try to ask you before I file a bug report on SF. ----- Forwarded message from Oleg Broytmann ----- On Thu, Jun 27, 2002 at 08:45:16PM +0000, Bengt Richter wrote: > > Recently Python (one program that I am debugging) started to crash. > >FreeBSD kills it with "Bus error", Linux with "Segmentation fault". > > > > I think the program crashed in the cPickle.dump(file, 1) I replaced cPickle.dump with pickle.dump and got infinite recursion. The traceback is below. What's that? Are there any limits that an object to be pickled must follow? Could it be a tree with loops? (I am pretty sure it could - I used the program for years, and data structures were not changed much). Could it be a "new" Python class? (Recently I changed one of my classes to be derived from builtin list instead of UserList). Well (or not so well), the traceback:

Traceback (most recent call last):
  File "/home/phd/lib/bookmarks_db/check_urls.py", line 158, in ?
    run()
  File "/home/phd/lib/bookmarks_db/check_urls.py", line 145, in run
    storage.store(root_folder)
  File "bkmk_stpickle.py", line 23, in store
  File "/usr/local/lib/python2.2/pickle.py", line 973, in dump
    Pickler(file, bin).dump(object)
  File "/usr/local/lib/python2.2/pickle.py", line 115, in dump
    self.save(object)
  File "/usr/local/lib/python2.2/pickle.py", line 219, in save
    self.save_reduce(callable, arg_tup, state)
  File "/usr/local/lib/python2.2/pickle.py", line 245, in save_reduce
    save(arg_tup)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 374, in save_tuple
    save(element)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
[about 1000 lines skipped - they are all the same]
  File "/usr/local/lib/python2.2/pickle.py", line 498, in save_inst
    save(stuff)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 447, in save_dict
    save(value)
  File "/usr/local/lib/python2.2/pickle.py", line 219, in save
    self.save_reduce(callable, arg_tup, state)
  File "/usr/local/lib/python2.2/pickle.py", line 245, in save_reduce
    save(arg_tup)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 374, in save_tuple
    save(element)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 414, in save_list
    save(element)
  File "/usr/local/lib/python2.2/pickle.py", line 143, in save
    pid = self.persistent_id(object)
RuntimeError: maximum recursion depth exceeded

----- End forwarded message -----

Oleg.
--
Oleg Broytmann     http://phd.pp.ru/     phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
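[The failure mode in the traceback above can be reproduced without the original data. pickle follows references recursively, so a "tree with loops" is not itself the problem (the memo catches cycles); sheer nesting depth is. A minimal sketch in modern Python, with an arbitrary depth of 5000, comfortably past the default recursion limit:]

```python
import pickle

# A cyclic structure pickles fine: the memo breaks the loop,
# so "a tree with loops" is not by itself the problem.
loop = []
loop.append(loop)
copy = pickle.loads(pickle.dumps(loop))
assert copy[0] is copy

# Sheer depth is the problem: each nesting level costs one more
# recursive save() call, so anything nested deeper than the
# interpreter's recursion limit blows up as in the traceback above.
deep = node = []
for _ in range(5000):  # arbitrary depth, well past the default limit
    node.append([])
    node = node[0]
try:
    pickle.dumps(deep)
except RecursionError:  # surfaced as plain RuntimeError in 2002
    print("maximum recursion depth exceeded")
```

So the question to ask of the data is not whether it has loops, but whether anything in it chains references deeper than the recursion limit.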
From tim.one@comcast.net Sun Jun 30 20:58:02 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 30 Jun 2002 15:58:02 -0400 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() In-Reply-To: <008e01c21fe5$a9383480$17ea7ad1@othello> Message-ID: [Tim] >> Guido's last essay on the buffer interface is still worth reading: >> >> http://mail.python.org/pipermail/python-dev/2000-October/009974.html >> >> No progress on the issues discussed has been made since, and, to the >> contrary, recent changes go in directions Guido didn't want to go. [Raymond Hettinger] > He sent me to you guys for direction. That's only because he forgot he wrote the essay -- it's my job to remember what he did . > The change was based on the advice I got. Wasn't that an empty set? > The point is moot because a) it's not too late to change course > to returning all buffer objects, b) because almost nobody uses it > anyway, and c) it all should probably be deprecated. In effect, it's been "silently deprecated" since before Guido wrote the above. > ... > Perhaps full deprecation (of the Python API not the C API) is in order. Someone will whine if that's done. Everyone's sick of fighting these battles. The buffer object is broken, won't get fixed (if it hasn't been by now ...), and nobody seems to have a real use for it; but, *because* it's virtually unused, "don't ask, don't tell" remains a path of small resistance. > It's just one fewer item in the Python concept space. Besides mmap() > and iterators have already addressed some of the original need. I don't know what the original need was, but suspect it was never addressed. IIRC, the real expressed need had something to do with running code objects directly out of mmap'ed files, presumably on memory-starved platforms. As Guido said in his essay, "the reason" for the buffer object's existence isn't clear, so whether the original need has been met, or could be met in other ways now, isn't clear either. 
Since it remains unused, if there is a need for it, it's a peculiar meaning for "need". From tim.one@comcast.net Sun Jun 30 21:06:36 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 30 Jun 2002 16:06:36 -0400 Subject: [Python-Dev] Infinite recursion in Pickle In-Reply-To: <20020630234820.A1006@phd.pp.ru> Message-ID: [Oleg Broytmann] > Nobody noted the message in c.l.py, let me try to ask you before I file > a bug report on SF. I saw the c.l.py msgs but found nothing to say: before you reduce this to a program someone else can run and get the same error, it's going to remain a mystery. Mysteries belong on SF, though. You may want to see whether the problem persists with a build from current CVS Python (something someone would have tried last week if they had a program they could run). From python@rcn.com Sun Jun 30 21:09:12 2002 From: python@rcn.com (Raymond Hettinger) Date: Sun, 30 Jun 2002 16:09:12 -0400 Subject: [Python-Dev] Silent Deprecation Candidate -- buffer() References: Message-ID: <006801c22072$00c890a0$88e97ad1@othello> RH> > The change was based on the advice I got. TP > Wasn't that an empty set? Not unless Scott Gilbert is a null: SG > > > "... So the best bet would be to have it just always return a string..." From Oleg Broytmann Sun Jun 30 21:25:35 2002 From: Oleg Broytmann (Oleg Broytmann) Date: Mon, 1 Jul 2002 00:25:35 +0400 Subject: [Python-Dev] Infinite recursion in Pickle In-Reply-To: ; from tim.one@comcast.net on Sun, Jun 30, 2002 at 04:06:36PM -0400 References: <20020630234820.A1006@phd.pp.ru> Message-ID: <20020701002535.A1510@phd.pp.ru> On Sun, Jun 30, 2002 at 04:06:36PM -0400, Tim Peters wrote: > I saw the c.l.py msgs but found nothing to say: before you reduce this to a > program someone else can run and get the same error, it's going to remain a I think I can reduce this, but I am afraid the data structure still will be large, > mystery. Mysteries belong on SF, though.
You may want to see whether the That's what I don't want to do - file a mysterious bug report. > problem persists with a build from current CVS Python (something someone > would have tried last week if they had a program they could run). I'll try. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From tim.one@comcast.net Sun Jun 30 21:32:35 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 30 Jun 2002 16:32:35 -0400 Subject: [Python-Dev] Infinite recursion in Pickle In-Reply-To: <20020701002535.A1510@phd.pp.ru> Message-ID: [Oleg Broytmann] > I think I can reduce this, but I am afraid the data structure > still will be large, That doesn't matter. It's the amount of *code* we don't understand and have to learn that matters. If you could reduce this to a gigabyte of pickle input that we only need to feed into pickle, that would be great. > That's what I don't want to do - file a mysterious bug report. That's what bug reports are best for! Now you've got comments about your bug scattered across comp.lang.python and python-dev, and nobody will be able to find them again. Attaching new info to a shared bug report is much more effective. if-there-isn't-a-mystery-there-isn't-a-bug-ly y'rs - tim