From steve at pearwood.info Tue May 1 02:47:09 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 01 May 2012 10:47:09 +1000 Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module execution] In-Reply-To: References: Message-ID: <4F9F328D.1040003@pearwood.info> Gregory P. Smith wrote: > Making modules "simply" be a class that could be subclasses rather than > their own thing _would_ be nice for one particular project I've worked on > where the project including APIs and basic implementations were open source > but which allowed for site specific code to override many/most of those > base implementations as a way of customizing it for your own specific (non > open source) environment. This makes no sense to me. What does the *licence* of a project have to do with the library API? I mean, yes, you could do such a thing, but surely you shouldn't. That would be like saying that the accelerator pedal should be on the right in cars you buy outright, but on the left for cars you get on hire-purchase. Nevertheless, I think your focus here is on the wrong thing. It seems to me that you are jumping to an implementation, namely that modules should stop being instances of a type and become classes, without having a clear idea of your functional requirements. The functional requirements *might* be: "There ought to be an easy way to customize the behaviour of attribute access in modules." Or perhaps: "There ought to be an easy way for one module to shadow another module with the same name, but still inherit behaviour from the shadowed module." neither of which *require* modules to become classes. Or perhaps it is something else... it is unclear to me exactly what problems you and Jim wish to solve, or whether they're the same kind of problem, which is why I say the functional requirements are unclear. Changing modules from an instance of ModuleType to "a class that could be a subclass" is surely going to break code. 
Somewhere, someone is relying on the fact that modules are not types and you're going to break their application. > Any APIs that were unfortunately defined as a > module with a bunch of functions in it was a real pain to make site > specific overrides for. It shouldn't be. Just ensure the site-specific override module comes first in the path, and "import module" will pick up the override module instead of the standard one. This is a simple exercise in shadowing modules. Of course, this implies that the override module has to override *everything*. There's currently no simple way for the shadowing module to inherit functionality from the shadowed module. You can probably hack something together, but it would be a PITA. -- Steven From guido at python.org Tue May 1 04:33:40 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Apr 2012 19:33:40 -0700 Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module execution] In-Reply-To: <4F9F328D.1040003@pearwood.info> References: <4F9F328D.1040003@pearwood.info> Message-ID: On Mon, Apr 30, 2012 at 5:47 PM, Steven D'Aprano wrote: > Gregory P. Smith wrote: > >> Making modules "simply" be a class that could be subclasses rather than >> their own thing _would_ be nice for one particular project I've worked on >> where the project including APIs and basic implementations were open >> source >> but which allowed for site specific code to override many/most of those >> base implementations as a way of customizing it for your own specific (non >> open source) environment. > This makes no sense to me. What does the *licence* of a project have to do > with the library API? I mean, yes, you could do such a thing, but surely you > shouldn't. That would be like saying that the accelerator pedal should be on > the right in cars you buy outright, but on the left for cars you get on > hire-purchase. That's an irrelevant, surprising and unfair criticism of Greg's message. 
He just tried to give a specific example without being too specific. > Nevertheless, I think your focus here is on the wrong thing. It seems to me > that you are jumping to an implementation, namely that modules should stop > being instances of a type and become classes, without having a clear idea of > your functional requirements. > > The functional requirements *might* be: > > "There ought to be an easy way to customize the behaviour of attribute > access in modules." > > Or perhaps: > > "There ought to be an easy way for one module to shadow another module with > the same name, but still inherit behaviour from the shadowed module." > > neither of which *require* modules to become classes. > > Or perhaps it is something else... it is unclear to me exactly what problems > you and Jim wish to solve, or whether they're the same kind of problem, > which is why I say the functional requirements are unclear. > > Changing modules from an instance of ModuleType to "a class that could be a > subclass" is surely going to break code. Somewhere, someone is relying on > the fact that modules are not types and you're going to break their > application. > > > >> Any APIs that were unfortunately defined as a >> module with a bunch of functions in it was a real pain to make site >> specific overrides for. > > > It shouldn't be. Just ensure the site-specific override module comes first > in the path, and "import module" will pick up the override module instead of > the standard one. This is a simple exercise in shadowing modules. > > Of course, this implies that the override module has to override > *everything*. There's currently no simple way for the shadowing module to > inherit functionality from the shadowed module. You can probably hack > something together, but it would be a PITA. 
If there is a bunch of functions and you want to replace a few of those, you can probably get the desired effect quite easily:

    from base_module import *  # Or the specific set of functions that comprise the API.

    def funct1():
        ...  # replacement body (elided in the original message)

    def funct2():
        ...  # replacement body (elided in the original message)

Not that I would recommend this -- it's easy to get confused if there are more than a very small number of functions. Also if base_module.funct3 were to call funct2, it wouldn't call the overridden version.

But all attempts to view modules as classes or instances have led to negative results. (I'm sure I've thought about it at various times in the past.) I think the reason is that a module at best acts as a class where every method is a *static* method, but implicitly so. And we all know how limited static methods are. (They're basically an accident -- back in the Python 2.2 days when I was inventing new-style classes and descriptors, I meant to implement class methods but at first I didn't understand them and accidentally implemented static methods first. Then it was too late to remove them and only provide class methods.)

There is actually a hack that is occasionally used and recommended: a module can define a class with the desired functionality, and then at the end, replace itself in sys.modules with an instance of that class (or with the class, if you insist, but that's generally less useful). E.g.:

    # module foo.py

    import sys

    class Foo:
        def funct1(self, *args):
            ...  # arguments and body elided in the original message

        def funct2(self, *args):
            ...  # arguments and body elided in the original message

    sys.modules[__name__] = Foo()

This works because the import machinery is actively enabling this hack, and as its final step pulls the actual module out of sys.modules, after loading it. (This is no accident. The hack was proposed long ago and we decided we liked it enough to support it in the import machinery.)

You can easily override __getattr__ / __getattribute__ / __setattr__ this way. It also makes "subclassing" the module a little easier (although accessing the class to be used as a base class is a little tricky: you'd have to use foo.__class__).

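A minimal, self-contained sketch of the sys.modules replacement hack described above. The module name foo_hack_demo, the method body and the __getattr__ behaviour are illustrative assumptions, not code from this thread; the module source is written to a temporary directory only so that the example runs as a single script.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Source of a hypothetical module that replaces itself with a class instance.
MODULE_SOURCE = textwrap.dedent('''
    import sys

    class Foo:
        def funct1(self):
            return "funct1 called"

        def __getattr__(self, name):
            # Only reached when normal attribute lookup fails.
            # Dunder names are left alone so introspection is not confused.
            if name.startswith("__"):
                raise AttributeError(name)
            return "dynamic:" + name

    # The import machinery hands back whatever sits in sys.modules
    # under this name once the module body has finished executing.
    sys.modules[__name__] = Foo()
''')

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "foo_hack_demo.py"), "w") as fh:
        fh.write(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        foo = importlib.import_module("foo_hack_demo")
    finally:
        sys.path.remove(tmp)

print(type(foo).__name__)   # the "module" is really a Foo instance
print(foo.funct1())
print(foo.spam)             # served by Foo.__getattr__
```

As noted in the message, a "subclass" of such a module would have to start from type(foo) (that is, foo.__class__), since the class itself is no longer reachable by name after the replacement.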
But of course the kind of API that Greg was griping about would never be implemented this way, so that's fairly useless. And if you were designing a module as an inheritable class right from the start you're much better off just using a class instead of the above hack. But all in all I don't think there's a great future in stock for the idea of allowing modules to be "subclassed". In the vast, vast majority of cases it's better to clearly have a separation between modules, which provide no inheritance and no instantiation, and classes, which provide both. I think Python is better off this way than Java, where all you have is classes (its packages cannot contain anything except class definitions). -- --Guido van Rossum (python.org/~guido) From ericsnowcurrently at gmail.com Tue May 1 04:39:47 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 30 Apr 2012 20:39:47 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert wrote: > On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow wrote: >> >> * ``sys.implementation`` as a proper namespace rather than a dict. ?It >> ?would be it's own module or an instance of a concrete class. > > So, what's the justification for it being a dict rather than an object > with attributes? The PEP merely (sensibly) concludes that it cannot be > considered a sequence. At this point I'm not aware of the strong justifications either way. However, sys.implementation is currently intended as a simple collection of variables. A dict reflects that. One obvious concern is that if we start off with a dict we're binding ourselves to that interface. If we later want concrete class with dotted lookup, we'd be looking at backwards-incompatibility. This is the part of the PEP that still needs more serious thought. > Relatedly, I find the PEP's use of the term "namespace" in reference > to a dict to be somewhat confusing. 
In my mind a mapping is a namespace. I don't have a problem changing that to mitigate any confusion. Thanks for the feedback. -eric From ncoghlan at gmail.com Tue May 1 04:48:02 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 1 May 2012 12:48:02 +1000 Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module execution] In-Reply-To: References: <4F9F328D.1040003@pearwood.info> Message-ID: On Tue, May 1, 2012 at 12:33 PM, Guido van Rossum wrote: > But all in all I don't think there's a great future in stock for the > idea of allowing modules to be "subclassed". In the vast, vast > majority of cases it's better to clearly have a separation between > modules, which provide no inheritance and no instantiation, and > classes, which provide both. I think Python is better off this way > than Java, where all you have is classes (its packages cannot contain > anything except class definitions). FWIW, in 3.3 the full import machinery will be exposed in sys.meta_path (and sys.path_hooks), so third parties will be free to experiment with whatever crazy things they want without having to work around the implicit import behaviour :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ericsnowcurrently at gmail.com Tue May 1 04:50:24 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 30 Apr 2012 20:50:24 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Sat, Apr 28, 2012 at 7:39 PM, Victor Stinner wrote: >> I've written up a PEP for the sys.implementation idea. ?Feedback is welcome! > > Cool, it's better with PEP! Even the change looks trivial. > >> name >> ?the name of the implementation (case sensitive). > > It would help if the PEP (and the documentation of sys.implementation) > lists at least the most common names. I suppose that we would have > something like: "CPython", "PyPy", "Jython", "IronPython". Good point. I'll do that. 
>> version >> ?the version of the implementation, as opposed to the version of the >> ?language it implements. ?This would use a standard format, similar to >> ?``sys.version_info`` (see `Version Format`_). > > Dummy question: what is sys.version/sys.version_info? The version of > the implementation or the version of the Python lnaguage? The PEP > should explain that, and maybe also the documentation of > sys.implementation.version (something like "use sys.version_info to > get the version of the Python language"). Yeah, sys.version (et al.) is the version of the language. It just happens to be the same as the implementation version for CPython. I'll make that more clear. >> cache_tag > > Why not adding this information to the imp module? This is certainly something I need to clarify. Either the different implementors set these values in the various modules to which they pertain; or they set them all in one place (sys.implementation). I really think we should avoid having a mix. In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point. -eric From ncoghlan at gmail.com Tue May 1 04:57:49 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 1 May 2012 12:57:49 +1000 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Tue, May 1, 2012 at 12:39 PM, Eric Snow wrote: > On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert wrote: >> On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow wrote: >>> >>> * ``sys.implementation`` as a proper namespace rather than a dict. ?It >>> ?would be it's own module or an instance of a concrete class. >> >> So, what's the justification for it being a dict rather than an object >> with attributes? The PEP merely (sensibly) concludes that it cannot be >> considered a sequence. 
> > At this point I'm not aware of the strong justifications either way. > However, sys.implementation is currently intended as a simple > collection of variables. ?A dict reflects that. > > One obvious concern is that if we start off with a dict we're binding > ourselves to that interface. ?If we later want concrete class with > dotted lookup, we'd be looking at backwards-incompatibility. ?This is > the part of the PEP that still needs more serious thought. I think it's a case where practicality beats purity. By using structseq, we get a nice representation and dotted attribute access, just as we have for sys.float_info. Providing this kind of convenience is the same reason collections.namedtuple exists. We should just document that the length of the tuple and the order of items is not guaranteed (either across implementations or between versions), and even the ability to iterate over the items or access them by index is not mandatory in an implementation. Would it be better if we had a separate "namespace" type in CPython that simply *disallowed* iteration and indexing? Perhaps, but we've survived long enough without it that I have my doubts about the practical need. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Tue May 1 05:08:44 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 1 May 2012 13:08:44 +1000 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Tue, May 1, 2012 at 12:50 PM, Eric Snow wrote: > In my mind sys.implementation makes more sense. ?For example, in the > case of cache_tag (which is merely a potential future variable), its > value is an implementation detail used by importlib. ?Having it in > sys.implementation would emphasize this point. Personally, I think cache_tag should be part of the initial proposal. 
Implementations may want to use different cache tags depending on additional information that importlib shouldn't need to care about, and I think it would also be reasonable to allow "cache_tag=None" to disable the implicit caching altogether. The ultimate goal would be for us to be able to eliminate implementation checks from other parts of the standard library. importlib is a good place to start, since the idea is that, aside from the mechanism used to bootstrap it into place, along with optional acceleration of __import__, importlib itself should be implementation independent. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ericsnowcurrently at gmail.com Tue May 1 05:22:30 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 30 Apr 2012 21:22:30 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: <20120430170454.08d73f74@resist.wooz.org> References: <20120430170454.08d73f74@resist.wooz.org> Message-ID: On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote: > On Apr 27, 2012, at 12:36 AM, Eric Snow wrote: >>``sys.implementation`` is a dictionary, as opposed to any form of "named" >>tuple (a la ``sys.version_info``). ?This is partly because it doesn't >>have meaning as a sequence, and partly because it's a potentially more >>variable data structure. > > I agree that sequence semantics are meaningless here. ?Presumably, a > dictionary is proposed because this > > ? ?cache_tag = sys.implementation.get('cache_tag') > > is nicer than > > ? ?cache_tag = getattr(sys.implementation, 'cache_tag', None) That's a good point. Also, a dict better reflects a collection of variables that a dotted-access object, which to me implies the potential for methods as well. > OTOH, maybe we need a nameddict type! You won't have to convince _me_. :) >>repository >> ? the implementation's repository URL. > > What does this mean? 
?Oh, I think you mean the URL for the VCS used to develop > this version of the implementation. ?Maybe vcs_url (and even then there could > be alternative blessed mirrors in other vcs's). ?A Debian analog are the Vcs-* > header (e.g. Vcs-Git, Vcs-Bzr, etc.). Yeah, you got it. For CPython it would be "http://hg.python.org/cpython". You're right that vcs_url is more clear. I'll update it. Perhaps I should clarify "Other Possible Values" in the PEP? I'd intended it as a list of meaningful names, most of which others had suggested, that could be considered at some later point. That's part of why I didn't develop the descriptions there too much. Rather, I wanted to focus on the two primary names for now. Should those potential names be considered more seriously right now? I was hoping to keep it light to start out, just the things we'd use immediately. >>repository_revision >> ? the revision identifier for the implementation. > > I'm not sure what this is. ?Is it like the hexgoo you see in the banner of a > from-source build that identifies the revision used to build this interpreter? > Is this key a replacement for that? I was thinking along those lines. For CPython, it could be 76678 or ab63e874265e or both. The decision on any constraints for this one would be subject to further discussion. > >>build_toolchain >> ? identifies the tools used to build the interpreter. > > As a tuple of free-form strings? That would work. I expect it would depend on how it would be used. >>url (or website) >> ? the URL of the implementation's site. > > Maybe 'homepage' (another Debian analog). Sounds good to me. >>site_prefix >> ? the preferred site prefix for this implementation. >> >>runtime >> ? the run-time environment in which the interpreter is running. > > I'm not sure what this means either. ;) Yeah, it's not so clear there. For Jython it would be something like "jvm X.X", for IronPython it would be ".net CLR X.X" or whatever. 
Again the actual definition would be subject to more discussion relative to the use case, be it information or otherwise.

>>gc_type
>>   the type of garbage collection used.
>
> Another free-form string?  What would be the values say, for CPython and
> Jython?

I was imagining a free-form string, like "reference counting" or "mark and sweep". It just depends on what people need it for.

>>Version Format
>>--------------
>>
>>XXX same as sys.version_info?
>
> Why not? :)  It might be useful also to have something similar to
> sys.hexversion, which I often find convenient.

That's the way I'm leaning. I've covered it a little more in the newer version of the PEP (on python-ideas).

>>* What are the long-term objectives for sys.implementation?
>>
>>  - pull in implementation detail from the main sys namespace and
>>    elsewhere (PEP 3137 lite).
>
> That's where this seems to be leaning.  Even if it's a good idea, I bet it
> will be a long time before the old sys names can be removed.

Yeah, it's definitely not the focus of the PEP, but I think it's a valid long-term goal of which we should be cognizant.

>>* Alternatives to the approach dictated by this PEP?
>>
>>* ``sys.implementation`` as a proper namespace rather than a dict.  It
>>  would be it's own module or an instance of a concrete class.
>
> Which might make sense, as would perhaps a top-level `implementation` module.
> IOW, why situate it in sys?
>
>>The implementatation of this PEP is covered in `issue 14673`_.
>
> s/implementatation/implementation

Got it.

> Nicely done!  Let's see how those placeholders shake out.

Thanks. I'm glad to get this rolling. And yeah, I need to poke the folks with the other implementations to get their feedback (rather than rely on nods from 3 years ago).
:) -eric From ericsnowcurrently at gmail.com Tue May 1 05:43:47 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 30 Apr 2012 21:43:47 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Mon, Apr 30, 2012 at 8:57 PM, Nick Coghlan wrote: > On Tue, May 1, 2012 at 12:39 PM, Eric Snow wrote: >> At this point I'm not aware of the strong justifications either way. >> However, sys.implementation is currently intended as a simple >> collection of variables. ?A dict reflects that. >> >> One obvious concern is that if we start off with a dict we're binding >> ourselves to that interface. ?If we later want concrete class with >> dotted lookup, we'd be looking at backwards-incompatibility. ?This is >> the part of the PEP that still needs more serious thought. > > I think it's a case where practicality beats purity. By using > structseq, we get a nice representation and dotted attribute access, > just as we have for sys.float_info. Providing this kind of convenience > is the same reason collections.namedtuple exists. That was my original sentiment, partly for the "this is how it's already been done" aspect. Barry made a good point about sys.implementation.get(name) vs. getattr(sys.implementation, name, None). However, having dotted access still seems more correct. (continued below...) > We should just document that the length of the tuple and the order of > items is not guaranteed (either across implementations or between > versions), and even the ability to iterate over the items or access > them by index is not mandatory in an implementation. Would it be > better if we had a separate "namespace" type in CPython that simply > *disallowed* iteration and indexing? Perhaps, but we've survived long > enough without it that I have my doubts about the practical need. That's a good point. Perhaps it depends on how general we expect the consumption of sys.implementation to be. 
If its practicality is oriented toward internal use then the data structure is not as critical. However, sys.implementation is intended to have a non-localized impact across the standard library and the interpreter. I'd rather not make hacking it become an attractive nuisance, regardless of our intentions for usage. This is where I usually defer to those that have been dealing for years with the aftermath of these types of decisions. -eric From ericsnowcurrently at gmail.com Tue May 1 05:47:51 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 30 Apr 2012 21:47:51 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan wrote: > On Tue, May 1, 2012 at 12:50 PM, Eric Snow wrote: >> In my mind sys.implementation makes more sense. ?For example, in the >> case of cache_tag (which is merely a potential future variable), its >> value is an implementation detail used by importlib. ?Having it in >> sys.implementation would emphasize this point. > > Personally, I think cache_tag should be part of the initial proposal. > Implementations may want to use different cache tags depending on > additional information that importlib shouldn't need to care about, > and I think it would also be reasonable to allow "cache_tag=None" to > disable the implicit caching altogether. Agreed. This is how I was thinking of it. I just wanted to keep things as minimal as possible to start. In importlib we can fall back to name+version if cache_tag isn't there. Still, of the potential variables, cache_tag is the strongest candidate, having a solid (if optional) use-case right now. > The ultimate goal would be for us to be able to eliminate > implementation checks from other parts of the standard library. 
> importlib is a good place to start, since the idea is that, aside from > the mechanism used to bootstrap it into place, along with optional > acceleration of __import__, importlib itself should be implementation > independent. Spot on! -eric From ericsnowcurrently at gmail.com Tue May 1 08:10:18 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 1 May 2012 00:10:18 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan wrote: > On Tue, May 1, 2012 at 12:50 PM, Eric Snow wrote: >> In my mind sys.implementation makes more sense. ?For example, in the >> case of cache_tag (which is merely a potential future variable), its >> value is an implementation detail used by importlib. ?Having it in >> sys.implementation would emphasize this point. > > Personally, I think cache_tag should be part of the initial proposal. > Implementations may want to use different cache tags depending on > additional information that importlib shouldn't need to care about, > and I think it would also be reasonable to allow "cache_tag=None" to > disable the implicit caching altogether. I'm going to leave it as-is for the moment, but I'm leaning toward doing this. -eric From ericsnowcurrently at gmail.com Tue May 1 21:05:52 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 1 May 2012 13:05:52 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: Updated: http://www.python.org/dev/peps/pep-0421/ -eric From barry at python.org Wed May 2 00:25:29 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 1 May 2012 18:25:29 -0400 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation References: <20120430170454.08d73f74@resist.wooz.org> Message-ID: <20120501182529.5ea3d94d@resist.wooz.org> On Apr 30, 2012, at 09:22 PM, Eric Snow wrote: >Perhaps I should clarify "Other Possible Values" in the PEP? 
I'd >intended it as a list of meaningful names, most of which others had >suggested, that could be considered at some later point. That's part >of why I didn't develop the descriptions there too much. Rather, I >wanted to focus on the two primary names for now. > >Should those potential names be considered more seriously right now? >I was hoping to keep it light to start out, just the things we'd use >immediately. I think you could keep it light (but +1 for adding cache_tag now). I'd suggest making it clear that neither the keys, values, nor semantics are actually being proposed in this PEP. The PEP could just include some examples for future additions (and thus de-emphasize that section of the PEP). It might be helpful to describe a mechanism by which future values would be added to sys.implementation. E.g. is a new PEP required for each? (I don't have an opinion on that right now. :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Wed May 2 00:28:26 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 1 May 2012 18:28:26 -0400 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation References: <20120430170454.08d73f74@resist.wooz.org> Message-ID: <20120501182826.0864f84b@resist.wooz.org> On Apr 30, 2012, at 09:22 PM, Eric Snow wrote: >> I agree that sequence semantics are meaningless here. ?Presumably, a >> dictionary is proposed because this >> >> ? ?cache_tag = sys.implementation.get('cache_tag') >> >> is nicer than >> >> ? ?cache_tag = getattr(sys.implementation, 'cache_tag', None) > >That's a good point. Also, a dict better reflects a collection of >variables that a dotted-access object, which to me implies the >potential for methods as well. > >> OTOH, maybe we need a nameddict type! > >You won't have to convince _me_. :) Well, I was being a bit facetious. 
You can easily implement those semantics in pure Python. 5 minute hack below.

Cheers,
-Barry

-----snip snip-----
#! /usr/bin/python3

_missing = object()

import operator
import unittest


class Implementation:
    cache_tag = 'cpython33'
    name = 'CPython'

    def __getitem__(self, name, default=_missing):
        result = getattr(self, name, default)
        if result is _missing:
            raise AttributeError("'{}' object has no attribute '{}'".format(
                self.__class__.__name__, name))
        return result

    def __setitem__(self, name, value):
        raise TypeError('read only')

    def __setattr__(self, name, value):
        raise TypeError('read only')


implementation = Implementation()


class TestImplementation(unittest.TestCase):
    def test_cache_tag(self):
        self.assertEqual(implementation.cache_tag, 'cpython33')
        self.assertEqual(implementation['cache_tag'], 'cpython33')

    def test_name(self):
        self.assertEqual(implementation.name, 'CPython')
        self.assertEqual(implementation['name'], 'CPython')

    def test_huh(self):
        self.assertRaises(AttributeError, operator.getitem,
                          implementation, 'droids')
        self.assertRaises(AttributeError, getattr,
                          implementation, 'droids')

    def test_read_only(self):
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, setattr,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'cache_tag', 'xpython99')
        self.assertRaises(TypeError, setattr,
                          implementation, 'cache_tag', 'xpython99')
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From steve at pearwood.info Wed May 2 03:09:17 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 02 May 2012 11:09:17 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: <20120430170454.08d73f74@resist.wooz.org>
Message-ID: <4FA0893D.4090903@pearwood.info>

Eric Snow wrote:
> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
>> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>> ``sys.implementation`` is a dictionary, as opposed to any form of "named"
>>> tuple (a la ``sys.version_info``). This is partly because it doesn't
>>> have meaning as a sequence, and partly because it's a potentially more
>>> variable data structure.
>> I agree that sequence semantics are meaningless here. Presumably, a
>> dictionary is proposed because this
>>
>> cache_tag = sys.implementation.get('cache_tag')
>>
>> is nicer than
>>
>> cache_tag = getattr(sys.implementation, 'cache_tag', None)
>
> That's a good point. Also, a dict better reflects a collection of
> variables that a dotted-access object, which to me implies the
> potential for methods as well.

Dicts have methods, and support iteration.

A dict suggests to me that an arbitrary number of items could be included, rather than suggesting a record-like structure with a fixed number of items. (Even if that number varies from release to release.)

On the other hand, a dict supports iteration, and len, so even if you don't know how many fields there are, you can always find them by iterating over the record.

Syntax-wise, dotted name access seems right to me for this, similar to sys.float_info. If you know a field exists, sys.implementation.field is much nicer than sys.implementation['field'].

I hate to admit it, but I'm starting to think that the right solution here is something like a dict with dotted name access.
http://code.activestate.com/recipes/473786
http://code.activestate.com/recipes/576586

sort of thing.

--
Steven

From steve at pearwood.info Wed May 2 03:24:19 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 02 May 2012 11:24:19 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To:
References:
Message-ID: <4FA08CC3.5070602@pearwood.info>

Nick Coghlan wrote:

> Would it be
> better if we had a separate "namespace" type in CPython that simply
> *disallowed* iteration and indexing? Perhaps, but we've survived long
> enough without it that I have my doubts about the practical need.

I have often wanted a namespace type, with class-like syntax and
module-like semantics. In pseudocode:

namespace Spam:
    x = 1
    def ham(a):
        return x+a
    def cheese(a):
        return ham(a)*10

Spam.cheese(5)
=> returns 60

But I suspect that's not what you're talking about here in context.

--
Steven

From ncoghlan at gmail.com Wed May 2 04:37:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 2 May 2012 12:37:08 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <4FA0893D.4090903@pearwood.info>
References: <20120430170454.08d73f74@resist.wooz.org> <4FA0893D.4090903@pearwood.info>
Message-ID:

On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano wrote:
> Syntax-wise, dotted name access seems right to me for this, similar to
> sys.float_info. If you know a field exists, sys.implementation.field is much
> nicer than sys.implementation['field'].
>
> I hate to admit it, but I'm starting to think that the right solution here
> is something like a dict with dotted name access.

Whereas I'm thinking it makes sense to explicitly separate out
"standard, must be defined by all conforming Python implementations"
and "implementation specific extras"

Under that model, we'd add an extra "metadata" field at the standard
level to hold implementation specific fields.
The initial set of standard fields would then be:

name: the name of the implementation (e.g. "CPython", "IronPython",
"PyPy", "Jython")
version: the version of the implementation (in sys.version_info format)
cache_tag: the identifier used by importlib when caching bytecode
files in __pycache__ (set to None to disable bytecode caching)
metadata: a dict containing arbitrary additional information about a
particular implementation

sys.implementation.metadata would then give a home for information
that needs to be builtin, without having to pollute the main sys
namespace.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com Thu May 3 03:23:14 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 2 May 2012 19:23:14 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120501182529.5ea3d94d@resist.wooz.org>
References: <20120430170454.08d73f74@resist.wooz.org> <20120501182529.5ea3d94d@resist.wooz.org>
Message-ID:

On Tue, May 1, 2012 at 4:25 PM, Barry Warsaw wrote:
> On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
>
>>Perhaps I should clarify "Other Possible Values" in the PEP?  I'd
>>intended it as a list of meaningful names, most of which others had
>>suggested, that could be considered at some later point.  That's part
>>of why I didn't develop the descriptions there too much.  Rather, I
>>wanted to focus on the two primary names for now.
>>
>>Should those potential names be considered more seriously right now?
>>I was hoping to keep it light to start out, just the things we'd use
>>immediately.
>
> I think you could keep it light (but +1 for adding cache_tag now).

cache_tag it is.

> I'd suggest making it clear that neither the keys, values, nor semantics are
> actually being proposed in this PEP.  The PEP could just include some examples
> for future additions (and thus de-emphasize that section of the PEP).
>
> It might be helpful to describe a mechanism by which future values would be
> added to sys.implementation.  E.g. is a new PEP required for each?  (I don't
> have an opinion on that right now. :)

This is a good direction. I'll update the PEP. Thanks!

-eric

From mwm at mired.org Thu May 3 03:28:21 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 2 May 2012 21:28:21 -0400
Subject: [Python-ideas] argparse FileType v.s default arguments...
In-Reply-To:
References: <20120430133338.33b2f75d@bhuda.mired.org>
Message-ID:

On Mon, Apr 30, 2012 at 3:59 PM, Gregory P. Smith wrote:
> On Mon, Apr 30, 2012 at 10:33 AM, Mike Meyer wrote:
>> While I really like the argparse module, I've run into a case I think
>> it ought to handle that it doesn't.
>>
>> So I'm asking here to see if 1) I've overlooked something, and it can
>> do this, or 2) there's a good reason for it not to do this or maybe 3)
>> this is a bad idea.
>>
>> The usage I ran into looks like this:
>>
>> parser.add_argument('configfile', default='/my/default/config',
>>                     type=FileType('r'), nargs='?')
>>
>> If I provide the argument, everything works fine, and it opens the
>> named file for me. If I don't, parser.configfile is set to the string,
>> which doesn't work very well when I try to use its read method.
>> Unfortunately, setting default to open('/my/default/config') has the
>> side effect of opening the file. Or raising an exception if the file
>> doesn't exist (which is a common reason for wanting to provide an
>> alternative!)
>>
>> Could default handling be made smarter, and if 1) type is set
>> and 2) the value of default is a string, pass the value of
>> default to type? Or maybe a flag to make that happen, or even a
>> default_factory argument (incompatible with default) that would accept
>> something like default_factory=lambda: open('/my/default/config')?
> This makes sense to me as described.
> I suggest going ahead and file an
> issue on bugs.python.org with the above.

I finally got around to this. There are already two issues that
address this problem, though in different ways: 12776 and 11389.

12776 includes a patch that worked against a build of a checkout
today. I've added a patch for test_argparse that adds tests to verify
that a default filename that doesn't exist with type=FileType
complains if you don't specify the argument, and opens the correct
file if you do.

References: <20120430170454.08d73f74@resist.wooz.org> <4FA0893D.4090903@pearwood.info>
Message-ID:

On Tue, May 1, 2012 at 8:37 PM, Nick Coghlan wrote:
> On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano wrote:
>> Syntax-wise, dotted name access seems right to me for this, similar to
>> sys.float_info. If you know a field exists, sys.implementation.field is much
>> nicer than sys.implementation['field'].
>>
>> I hate to admit it, but I'm starting to think that the right solution here
>> is something like a dict with dotted name access.
>
> Whereas I'm thinking it makes sense to explicitly separate out
> "standard, must be defined by all conforming Python implementations"
> and "implementation specific extras"
>
> Under that model, we'd add an extra "metadata" field at the standard
> level to hold implementation specific fields. The initial set of
> standard fields would then be:
>
> name: the name of the implementation (e.g. "CPython", "IronPython",
> "PyPy", "Jython")
> version: the version of the implementation (in sys.version_info format)
> cache_tag: the identifier used by importlib when caching bytecode
> files in __pycache__ (set to None to disable bytecode caching)
> metadata: a dict containing arbitrary additional information about a
> particular implementation
>
> sys.implementation.metadata would then give a home for information
> that needs to be builtin, without having to pollute the main sys
> namespace.

I really like this approach, particularly the separation aspect.
Presumably sys.implementation would be more struct-like (static-ish,
dotted-access namespace). I'll give it a day or two to stew and if it
still seems like a good idea I'll weave it into the PEP.

One question though: having it be iterable (a la structseq or
namedtuple) doesn't seem to be a good fit, but does it matter?
Likewise with mutability. Thoughts?

-eric

From ericsnowcurrently at gmail.com Thu May 3 04:17:40 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 2 May 2012 20:17:40 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120430170454.08d73f74@resist.wooz.org>
References: <20120430170454.08d73f74@resist.wooz.org>
Message-ID:

On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>Version Format
>>--------------
>>
>>XXX same as sys.version_info?
>
> Why not? :)  It might be useful also to have something similar to
> sys.hexversion, which I often find convenient.

Would it be worth mirroring all 3 (sys.version, sys.version_info,
sys.hexversion)? Symmetry is nice, but it also makes sense if
each would be as meaningful as they are in sys.

-eric

From steve at pearwood.info Thu May 3 05:49:59 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 3 May 2012 13:49:59 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To:
References: <20120430170454.08d73f74@resist.wooz.org>
Message-ID: <20120503034959.GA19401@ando>

On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
> > On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
> >>Version Format
> >>--------------
> >>
> >>XXX same as sys.version_info?
> >
> > Why not? :)  It might be useful also to have something similar to
> > sys.hexversion, which I often find convenient.
>
> Would it be worth mirroring all 3 (sys.version, sys.version_info,
> sys.hexversion)?
> Symmetry is nice, but it also makes sense if
> each would be as meaningful as they are in sys.

I am still unclear what justification there is for having a separate
sys.version (from PEP 421: "the version of the Python language") and
sys.implementation.version ("the version of the Python implementation").
Under what circumstances will one change but not the other?

--
Steven

From carl at oddbird.net Thu May 3 06:30:28 2012
From: carl at oddbird.net (Carl Meyer)
Date: Wed, 02 May 2012 22:30:28 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120503034959.GA19401@ando>
References: <20120430170454.08d73f74@resist.wooz.org> <20120503034959.GA19401@ando>
Message-ID: <4FA209E4.6040900@oddbird.net>

On 05/02/2012 09:49 PM, Steven D'Aprano wrote:
> On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
>> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
>>> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>>> Version Format
>>>> --------------
>>>>
>>>> XXX same as sys.version_info?
>>>
>>> Why not? :) It might be useful also to have something similar to
>>> sys.hexversion, which I often find convenient.
>>
>> Would it be worth mirroring all 3 (sys.version, sys.version_info,
>> sys.hexversion)? Symmetry is nice, but it also makes sense if
>> each would be as meaningful as they are in sys.
>
> I am still unclear what justification there is for having a separate
> sys.version (from PEP 421: "the version of the Python language") and
> sys.implementation.version ("the version of the Python implementation").
> Under what circumstances will one change but not the other?

I know at least PyPy has separate "PyPy version" and "Python language
compatibility version" numbers. They might choose to do a release that
increments the PyPy version (because they've made improvements to the
JIT or any number of other implementation-quality issues) but doesn't
change the bundled stdlib version or language-compatibility version at all.
Seems pretty reasonable to me.

Carl

From pyideas at rebertia.com Thu May 3 06:31:47 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 2 May 2012 21:31:47 -0700
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120503034959.GA19401@ando>
References: <20120430170454.08d73f74@resist.wooz.org> <20120503034959.GA19401@ando>
Message-ID:

On Wed, May 2, 2012 at 8:49 PM, Steven D'Aprano wrote:
> On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
>> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
>> > On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:

> I am still unclear what justification there is for having a separate
> sys.version (from PEP 421: "the version of the Python language") and
> sys.implementation.version ("the version of the Python implementation").
> Under what circumstances will one change but not the other?

In the event of an implementation bugfix? The Python version
implemented would be unchanged, but the implementation version would
be incremented slightly.

Cheers,
Chris

From ncoghlan at gmail.com Thu May 3 08:06:44 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 3 May 2012 16:06:44 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120503034959.GA19401@ando>
References: <20120430170454.08d73f74@resist.wooz.org> <20120503034959.GA19401@ando>
Message-ID:

On Thu, May 3, 2012 at 1:49 PM, Steven D'Aprano wrote:
> I am still unclear what justification there is for having a separate
> sys.version (from PEP 421: "the version of the Python language") and
> sys.implementation.version ("the version of the Python implementation").
> Under what circumstances will one change but not the other?

The PyPy example is the real motivator. It allows "sys.version" to
declare what version of Python the implementation intends to
implement, while sys.implementation.version may be completely
different. For example, a new implementation might declare
sys.version_info as (3, 3, etc...)
to indicate they're aiming at 3.3 compatibility, while
setting sys.implementation.version to (0, 1, etc...) to reflect its
actual immaturity as an implementation.

Implementations are of course free to set the two numbers in lock
step, and CPython, IronPython and Jython will likely continue to do
exactly that.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mal at egenix.com Thu May 3 10:20:39 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 03 May 2012 10:20:39 +0200
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To:
References: <20120430170454.08d73f74@resist.wooz.org> <20120503034959.GA19401@ando>
Message-ID: <4FA23FD7.3070906@egenix.com>

Some corrections to the PEP text:

platform.python_implementation()
--------------------------------

The following text in the PEP needs to be updated:

"""
The platform module guesses the python implementation by looking for
clues in a couple different sys variables [3]. However, this approach
is fragile.
"""

Fact is, that sys.version parsing is documented to be done by the
platform module (see the docs on sys.version), so implementations
are free to provide patches in case they choose different ways of
formatting sys.version.

A sys.implementation record would make things easier for the platform
module, though, so it's an improvement.

sys.version
-----------

sys.version is defined as "A string containing the version number
of the Python interpreter plus additional information on the build
number and compiler used. This string is displayed when the interactive
interpreter is started. Do not extract version information out of it,
rather, use version_info and the functions provided by the platform module."

It's not defined as "version of the Python language" as the PEP
appears to indicate.
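
To illustrate the advice in the quoted sys.version documentation, here is a
minimal sketch that reads structured version data from sys.version_info and
the platform module instead of parsing the sys.version string. The variable
names are only illustrative, and the printed values depend on the interpreter
that runs it.

```python
import platform
import sys

# Structured version data: no string parsing required.
major, minor, micro = sys.version_info[:3]

# The platform module does the sys.version parsing on the caller's behalf.
implementation_name = platform.python_implementation()  # e.g. 'CPython', 'PyPy'
version_tuple = platform.python_version_tuple()         # a tuple of strings

print(implementation_name, major, minor, micro)
print(version_tuple)
```

Note that platform.python_version_tuple() returns strings, so the numeric
fields of sys.version_info are usually the more convenient thing to compare.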
Other things:

Making sys.implementation a dictionary
--------------------------------------

This is not a good idea, since it allows for monkey-patching
the values and will also result in new undocumented or per-implementation
keys.

Better use a namedtuple like we do for all other such informational
resources.

sys.implementation information
------------------------------

While I'm not sure whether details such as VCS URLs and revision ids
should really be part of a data structure that is supposed to identify
the implementation (sys.version is better for that), if you do want
to add such information, then please add all of it, not just part
of the available build information. See platform._sys_version(), which
returns (name, version, branch, revision, buildno, builddate, compiler).

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 03 2012)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2012-04-26: Released mxODBC 3.1.2 http://egenix.com/go28
2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From ericsnowcurrently at gmail.com Thu May 3 22:50:49 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 3 May 2012 14:50:49 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <4FA23FD7.3070906@egenix.com>
References: <20120430170454.08d73f74@resist.wooz.org> <20120503034959.GA19401@ando> <4FA23FD7.3070906@egenix.com>
Message-ID:

On Thu, May 3, 2012 at 2:20 AM, M.-A.
Lemburg wrote:
> Some corrections to the PEP text:
>
> platform.python_implementation()
> --------------------------------
>
> The following text in the PEP needs to be updated:
>
> """
> The platform module guesses the python implementation by looking for
> clues in a couple different sys variables [3]. However, this approach
> is fragile.
> """
>
> Fact is, that sys.version parsing is documented to be done by the
> platform module (see the docs on sys.version), so implementations
> are free to provide patches in case they choose different ways of
> formatting sys.version.
>
> A sys.implementation record would make things easier for the platform
> module, though, so it's an improvement.

Yeah, I'll update that to be softer and more clear.

> sys.version
> -----------
>
> sys.version is defined as "A string containing the version number
> of the Python interpreter plus additional information on the build
> number and compiler used. This string is displayed when the interactive
> interpreter is started. Do not extract version information out of it,
> rather, use version_info and the functions provided by the platform module.
>
> It's not defined as "version of the Python language" as the PEP
> appears to indicate.

This is an excellent point. sys.(version|version_info|hexversion)
reflect CPython specifics, rather than the language itself. As far as
I know the language does not have a "micro" version, nor a release
level or serial.

So where does that leave us? Undoubtedly no small number of people
already depend on the sys variables for CPython release info, so
we can't just change the semantics.

I'll clarify the PEP and add this to the open issues list because the
PEP definitely needs to be clear here. Any suggestions on this point
would be great.
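
For reference, the fields mentioned above (micro, release level, serial) are
all packed into sys.hexversion. The sketch below assumes CPython's documented
layout for that integer; the helper name pack_hexversion is made up for
illustration and is not part of any module.

```python
import sys

# Release-level codes used in the hexversion encoding (assumed CPython layout).
_LEVEL_CODES = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def pack_hexversion(info):
    """Rebuild a hexversion-style integer from a version_info-like 5-tuple."""
    major, minor, micro, releaselevel, serial = info
    return (
        (major << 24)
        | (minor << 16)
        | (micro << 8)
        | (_LEVEL_CODES[releaselevel] << 4)
        | serial
    )


rebuilt = pack_hexversion(sys.version_info)
print(hex(rebuilt), hex(sys.hexversion))
```

If the assumption about the layout holds, the two printed values match, which
shows that hexversion carries exactly the same release detail as
sys.version_info and nothing about the language version as such.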
> Other things:
>
> Making sys.implementation a dictionary
> --------------------------------------
>
> This is not a good idea, since it allows for monkey-patching
> the values and will also result in new undocumented or per-implementation
> keys.
>
> Better use a namedtuple like we do for all other such informational
> resources.

Nick Coghlan made a good suggestion on this front that I'm likely going to
adopt: sys.implementation as an object (namespace with dotted access)
with required attributes. One required attribute would be 'metadata',
a dict where optional/per-implementation values could go.

Having it be immutable (make monkey-patching hard) didn't seem like it
mattered, though I'm not opposed. I just don't see that as a
convincing reason for it to be a named tuple (structseq, etc.).

To be honest, I'd like to avoid making sys.implementation any kind of
sequence. It has no meaning as a sequence (hence why the PEP shifted
from named tuple to dict). Unlike other informational sources, we
expect that the namespace of required attributes will grow over time.
As such, people shouldn't rely on a fixed number of attributes, which
a named tuple would imply. As well, I'm not convinced that the order
of the attributes is significant, nor that sequence unpacking is
useful here. So in order to send the right message on both points,
I'd rather not make it a sequence.

It *could* be meaningful to implement the Mapping ABC, but I'm not
going to specify that in the PEP without good reason. (I will add
that as an open issue though.)

Unless there is a good reason to use a named tuple, as opposed to a
regular object, let's not. However, I'm still quite open to hearing
out arguments on this point.
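
A rough sketch of the shape described above: a plain attribute namespace with
a few required fields plus a 'metadata' dict for per-implementation extras. It
uses types.SimpleNamespace (available from Python 3.3) purely as a stand-in;
the field values and the 'vcs_revision' key are invented for illustration and
are not what the PEP specifies.

```python
from types import SimpleNamespace

# Stand-in for the proposed object: dotted access, no sequence behaviour.
implementation = SimpleNamespace(
    name="CPython",
    version=(3, 3, 0, "final", 0),
    cache_tag="cpython-33",
    metadata={"vcs_revision": "hypothetical-value"},  # optional extras live here
)

print(implementation.name)
print(implementation.metadata.get("vcs_revision"))

# A namespace like this is neither iterable nor indexable, so nobody can
# come to depend on field order or field count.
try:
    iter(implementation)
    is_iterable = True
except TypeError:
    is_iterable = False
```

The point of the sketch is only the access pattern: required names are
attributes, and anything implementation specific goes through the dict.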
-eric

From nbvfour at gmail.com Thu May 3 22:57:33 2012
From: nbvfour at gmail.com (nbv4)
Date: Thu, 3 May 2012 13:57:33 -0700 (PDT)
Subject: [Python-ideas] one line class definitions
Message-ID: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>

Instead of

class CustomException(Exception):
    pass

how about just

class CustomException(Exception)

(no colon, no 'pass')
I'm sure this has been suggested before, but I couldn't find any...

From pyideas at rebertia.com Fri May 4 00:19:32 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Thu, 3 May 2012 15:19:32 -0700
Subject: [Python-ideas] one line class definitions
In-Reply-To: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
References: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
Message-ID:

On Thu, May 3, 2012 at 1:57 PM, nbv4 wrote:
> Instead of
>
> class CustomException(Exception):
>     pass
>
> how about just
>
> class CustomException(Exception)
>
> (no colon, no 'pass')
> I'm sure this has been suggested before, but I couldn't find any...

"Special cases aren't special enough to break the rules." -- PEP 20

Just use a docstring-only body; you should be documenting what the
exception means anyways:

class CustomException(Exception):
    """This exception means that a custom error happened."""

# other module-level code...
Cheers,
Chris

From ned at nedbatchelder.com Fri May 4 00:44:42 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Thu, 03 May 2012 18:44:42 -0400
Subject: [Python-ideas] one line class definitions
In-Reply-To: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
References: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
Message-ID: <4FA30A5A.7050407@nedbatchelder.com>

On 5/3/2012 4:57 PM, nbv4 wrote:
> Instead of
>
> class CustomException(Exception):
>     pass
>
> how about just
>
> class CustomException(Exception)
>
How about just:

class CustomException(Exception): pass

or better yet, using more lines?

--Ned.
> (no colon, no 'pass')
> I'm sure this has been suggested before, but I couldn't find any...
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From guido at python.org Fri May 4 00:56:07 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 May 2012 15:56:07 -0700
Subject: [Python-ideas] one line class definitions
In-Reply-To:
References: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
Message-ID:

(Repeat, somehow the message to which I replied had
python-ideas at googlegroups.com which doesn't exist.)

On Thu, May 3, 2012 at 2:53 PM, Guido van Rossum wrote:
> That's just asking for more mysterious errors if you forget the colon.
> You can always write
>
> class Foo(Bar): pass
>
> On Thu, May 3, 2012 at 1:57 PM, nbv4 wrote:
>> Instead of
>>
>> class CustomException(Exception):
>>     pass
>>
>> how about just
>>
>> class CustomException(Exception)
>>
>> (no colon, no 'pass')
>> I'm sure this has been suggested before, but I couldn't find any...
>
> --
> --Guido van Rossum (python.org/~guido)

--
--Guido van Rossum (python.org/~guido)

From Ronny.Pfannschmidt at gmx.de Sun May 6 09:48:46 2012
From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt)
Date: Sun, 06 May 2012 09:48:46 +0200
Subject: [Python-ideas] package based import
Message-ID: <4FA62CDE.8050307@gmx.de>

Hi,

this one is still pretty rough (not just on the edges)

after some tinkering with tools like nmp, i got the idea of package
based imports

the idea is to have a lookup based on package toplevel names first,
instead of just walking sys.path

that way it becomes more natural for packages to be in their own dir
instead of everything being merged in site-packages
and of course, much less files to walk to find a particular package,
since the mapping of package name to import paths is already known

in order to add such packages, some kind of registration would be
necessary

a basic example could be something like

packages.pth::

    import pkgutil
    # wherever __init__.py is
    pkgutil.register_package('flask', '~/Projects/flask/flask')
    pkgutil.register_module('hgdistver', '~/Projects/hgdistver.py')

although for convenience an ini file with sections and dotted name to
path mappings might be better.
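
One way the registration idea might be prototyped is with a finder on
sys.meta_path. This is a sketch only: RegistryFinder and its register_module
and register_package methods are hypothetical names (pkgutil has no such
functions), and it relies on importlib.util.spec_from_file_location as found
in current Python 3 releases.

```python
import importlib
import importlib.abc
import importlib.util
import os
import sys
import tempfile


class RegistryFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that resolves top-level names from a registry."""

    def __init__(self):
        self._modules = {}   # name -> path of a single .py file
        self._packages = {}  # name -> directory containing __init__.py

    def register_module(self, name, path):
        self._modules[name] = os.path.expanduser(path)

    def register_package(self, name, directory):
        self._packages[name] = os.path.expanduser(directory)

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self._modules:
            return importlib.util.spec_from_file_location(
                fullname, self._modules[fullname])
        if fullname in self._packages:
            directory = self._packages[fullname]
            return importlib.util.spec_from_file_location(
                fullname,
                os.path.join(directory, "__init__.py"),
                submodule_search_locations=[directory])
        return None  # not registered: fall through to the normal sys.path search


# Demo: register a throwaway module that lives outside sys.path.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "registry_demo_mod.py")
with open(demo_path, "w") as handle:
    handle.write("ANSWER = 42\n")

finder = RegistryFinder()
finder.register_module("registry_demo_mod", demo_path)
sys.meta_path.insert(0, finder)

registry_demo_mod = importlib.import_module("registry_demo_mod")
print(registry_demo_mod.ANSWER)
```

Because the finder returns None for anything it does not know about, the
usual import machinery keeps working for every unregistered name.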
once that is in place that opens up the path for fun local envs like
simply looping over eggs/dirs in a packages subdir and adding them,
instead of needing buildout/virtualenv

-- Ronny

From p.f.moore at gmail.com Sun May 6 10:02:09 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 6 May 2012 09:02:09 +0100
Subject: [Python-ideas] package based import
In-Reply-To: <4FA62CDE.8050307@gmx.de>
References: <4FA62CDE.8050307@gmx.de>
Message-ID:

On 6 May 2012 08:48, Ronny Pfannschmidt wrote:
>
> the idea is to have a lookup based on package toplevel names first, instead
> of just walking sys.path
>
> that way it becomes more natural for packages to be in their own dir instead of
> everything being merged in site-packages
> and of course, much less files to walk to find a particular package, since
> the mapping of package name to import paths is already known
>
> in order to add such packages, some kind of registration would be necessary

This should be relatively easy to do using importlib - as a custom
meta hook (in PEP 302 terms). It could probably be written as a 3rd
party module, at least as a proof of concept.

Paul.

From tjreedy at udel.edu Sun May 6 23:24:40 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 06 May 2012 17:24:40 -0400
Subject: [Python-ideas] Should range() == range(0)?
Message-ID:

It is a general principle that if a built-in class C has a unique (up to
equality) null object, then C() returns that null object.

>>> for f in (bool, int, float, complex, tuple, list, dict, set,
...           frozenset, str, bytes, bytearray):
...     print(bool(f()))
...
# 12 lines of False

Some imported classes such as fractions.Fraction and collections.deque
can be added to the list.

I add 'up to equality' because in the case of floats, 0.0 and -0.0 are
distinct but equal, and float() returns the obvious 0.0.
>>> import math as m
>>> 0.0 == -0.0
True
>>> m.copysign(1, 0.0)
1.0
>>> m.copysign(1, -0.0)
-1.0
>>> m.copysign(1, float())
1.0

The notable exception to the rule is
>>> range()
Traceback (most recent call last):
  File "", line 1, in
    range()
TypeError: range expected 1 arguments, got 0
>>> bool(range(0))
False

It is true that there are multiple distinct null range objects (because
the defining start,stop,step args are kept as attributes) but they are
all equal.
>>> range(1,1) == range(0)
True

range(0) == range(0, 0, 1) would be the obvious choice for range().

Another advantage of doing this, beside consistency, is that it would
emphasize that range() produces a re-iterable sequence, not just an
iterator.

Possible objections and responses:

1. This would slightly complicate the already messy code and doc for
range().

Pass, for now.

2. There is little need as there is already the alternative.

This is just as true or even more true for the other classes. While
int() is slightly easier to type than int(0), 0 is even easier.

3. There is little or no use case.

The justification I have seen for all the other classes behaving as they
do is expressions like type(x)(), which gets the null object
corresponding to x. This requires a parameterless call rather than a
literal (or display) or call with typed arg.

A proper objection of this sort would have to argue that range() is less
useful than all 12+ cases that we have now.

4. memoryview() does not work.

Even though memoryview(bytes()) and memoryview(bytearray()) are both
False and equal, other empty memoryviews would not all be equal. Besides
which, a memoryview is dependent on another object, and there is no
reason to create any particular object for it to be dependent on.

5. The dict view methods, such as dict.keys, do not work.

These also return views that are dependent on a primary object, and the
views also are null if the primary object is.
Here there is a unique null primary object, so it would at least be
possible to create an empty dict whose only reference is held by a
read-only view. On the other hand, '.keys' is a function, not a class.

6. filter() does not work.

While filter is a class, its instances, again, are dependent on another
object, not just at creation but during its lifetime. Moreover,
bool(empty-iterable) is not False. Ditto for map() and, for instance,
open(), even though in the latter case the primary object is external.

--
Terry Jan Reedy

From g.brandl at gmx.net Mon May 7 00:24:21 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 07 May 2012 00:24:21 +0200
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To:
References:
Message-ID:

On 05/06/2012 11:24 PM, Terry Reedy wrote:
> It is a general principle that if a built-in class C has a unique (up to
> equality) null object, then C() returns that null object.
>
> >>> for f in (bool, int, float, complex, tuple, list, dict, set,
> frozenset, str, bytes, bytearray):
> print(bool(f()))
>
> # 12 lines of False
>
> Some imported classes such as fractions.Fraction and collections.deque
> can be added to the list.
>
> I add 'up to equality' because in the case of floats, 0.0 and -0.0 are
> distinct but equal, and float() returns the obvious 0.0.
> >>> 0.0 == -0.0
> True
> >>> m.copysign(1, 0.0)
> 1.0
> >>> m.copysign(1, -0.0)
> -1.0
> >>> m.copysign(1, float())
> 1.0
>
> The notable exception to the rule is
> >>> range()
> Traceback (most recent call last):
> File "", line 1, in
> range()
> TypeError: range expected 1 arguments, got 0
> >>> bool(range(0))
> False
>
> It is true that there are multiple distinct null range objects (because
> the defining start,stop,step args are kept as attributes) but they are
> all equal.
> >>> range(1,1) == range(0)
> True
>
> range(0) == range(0, 0, 1) would be the obvious choice for range().
>
> Another advantage of doing this, beside consistency, is that it would
> emphasize that range() produces a re-iterable sequence, not just an
> iterator.
>
> Possible objections and responses: [1. - 6.]

7. The "default value" is only really useful for types that are best
described as "data-like". range is not a data-like type, it's a helper for
iteration, just as filter or dictviews aren't data-like.

Georg

From jeanpierreda at gmail.com Mon May 7 01:52:00 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 6 May 2012 19:52:00 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To:
References:
Message-ID:

On Sun, May 6, 2012 at 5:24 PM, Terry Reedy wrote:
> Another advantage of doing this, beside consistency, is that it would
> emphasize that range() produces a re-iterable sequence, not just an
> iterator.

How? The empty sequence is the exact case where reiterable objects and
iterators have identical iteration behavior. (Both immediately stop
every time you try.)

> Possible objections and responses:
>
> 1. This would slightly complicate the already messy code and doc for
> range().
>
> Pass, for now

By this, do you mean don't write new documentation? That just defers
the problem to later.

> 3. There is little or no use case.
>
> The justification I have seen for all the other classes behaving as they do
> is expressions like type(x)(), which gets the null object corresponding to
> x. This requires a parameterless call rather than a literal (or display) or
> call with typed arg.
>
> A proper objection of this sort would have to argue that range() is less useful
> than all 12+ cases that we have now.

Most of the other types are useful as parameters to something such as
collections.defaultdict. In the case of range, why not use tuple for
this?

Although, I actually like this idea, because it feels more consistent.
I imagine that isn't a good reason to like things though.
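
The two uses mentioned in this exchange, type(x)() and factory arguments such
as collections.defaultdict, can be illustrated with a short sketch. The helper
name empty_like is made up for this example.

```python
from collections import defaultdict


def empty_like(value):
    """Return the 'null' object of value's type via a zero-argument call."""
    return type(value)()


print(empty_like([1, 2]), repr(empty_like("abc")), empty_like(3.5))

# Zero-argument constructors double as default factories.
counts = defaultdict(int)
for word in ["spam", "ham", "spam"]:
    counts[word] += 1

groups = defaultdict(list)
groups["a"].append(1)

# range is the odd one out: it has no zero-argument form.
try:
    empty_like(range(3))
    range_has_default = True
except TypeError:
    range_has_default = False

print(dict(counts), dict(groups), range_has_default)
```

As Devin suggests, tuple already covers the "empty immutable sequence"
factory case, which is part of why range() is harder to motivate.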
-- Devin From steve at pearwood.info Mon May 7 02:46:32 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 07 May 2012 10:46:32 +1000 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: Message-ID: <4FA71B68.4010400@pearwood.info> Terry Reedy wrote: > It is a general principle that if a built-in class C has a unique (up to > equality) null object, then C() returns that null object. > > >>> for f in (bool, int, float, complex, tuple, list, dict, set, > frozenset, str, bytes, bytearray): > print(bool(f())) > > # 12 lines of False I don't think that's so much a general principle that should be aspired to as a general observation that many objects have an obvious "nothing" (empty) value that intuitively matches the zero-argument case, e.g. set, dict, list and so forth. The cases of int, float, complex etc. are a little more dubious; I'm not convinced there's a general philosophical reason why int() should be allowed at all. E.g. int("") fails, int([]) fails, etc. so there's no general principle that the int of "emptiness" is expected to return 0. The fact that float() has to choose between two zero objects, complex() between four, and Fraction and Decimal between an infinity of zero objects, highlights that the choice of a "default" is at least in part an arbitrary choice. If Python has any general principle here, it is that we should be reluctant to make arbitrary choices in the face of ambiguity. For the avoidance of doubt, I'm not arguing for changing the behaviour of int. The current behaviour is fine. But I don't think we should treat it as a general principle that other objects should necessarily follow. > Some imported classes such as fractions.Fraction and collections.deque > can be added to the list. [...] > It is true that there are multiple distinct null range objects (because > the defining start,stop,step args are kept as attributes) but they are > all equal. 
> >>> range(1,1) == range(0) > True Are you using Python 2 here? If so, you should be looking at xrange, not range. In Python 3, range objects are equal if their start, stop and step attributes are equal, not if their output values are equal: py> range(0) == range(1,1) False py> range(1, 6, 2) == range(1, 7, 2) False > range(0) == range(0, 0, 1) would be the obvious choice for range(). I'm not entirely sure that is quite so obvious. range() defaults to a start of 0 and a step of 1, so it's natural to reason that range() => range(0, end, 1). But surely we should treat end to be a required argument? If end is not required, that suggests the possibility of calling range with (say) a start value only, using the default end and step values. I think there is great value in keeping range simple, and the simplest thing is to keep end as a required argument and refuse the temptation to guess if it is not given. I do think this is a line-call though. If I were designing range from scratch, I too would be sorely tempted to have range() => range(0). > Another advantage of doing this, beside consistency, is that it would > emphasize that range() produces a re-iterable sequence, not just an > iterator. I don't follow your reasoning there. Whether range(*args) succeeds or fails for some arbitrary value of args has no bearing on whether it is re-iterable. Consider zip(). > 6. filter() does not work. > > While filter is a class, its instances, again, are dependent on another > object, not just at creation but during its lifetime. Moreover, > bool(empty-iterable) is not False. Ditto for map() and, for instance, > open(), even though in the latter case the primary object is external. Likewise reversed() and iter(). sorted() is an interesting case, because although it returns a list rather than a (hypothetical) SortedSequence object, it could choose to return [] when called with no arguments. I think it is right to not do so. 
zip() on the other hand is a counter-example, and it is informative to think about why zip() succeeds while range() fails. zip takes an arbitrary number of arguments, where no particular argument is required or treated differently from the others. Also there is a unique interpretation of zip() with no arguments: an empty zip object (or list in the case of Python 2). Nevertheless, I consider it somewhat surprising that zip() succeeds, and don't think that it is a good match for range. Given the general principle "the status quo wins", I'm going to vote -0 on the suggested change. -- Steven From tjreedy at udel.edu Mon May 7 04:20:35 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 06 May 2012 22:20:35 -0400 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: <4FA71B68.4010400@pearwood.info> References: <4FA71B68.4010400@pearwood.info> Message-ID: On 5/6/2012 8:46 PM, Steven D'Aprano wrote: > Terry Reedy wrote: >> It is a general principle that if a built-in class C has a unique (up >> to equality) null object, then C() returns that null object. >> >> >>> for f in (bool, int, float, complex, tuple, list, dict, set, >> frozenset, str, bytes, bytearray): >> print(bool(f())) >> >> # 12 lines of False > > I don't think that's so much a general principle that should be aspired > to as a general observation that many objects have an obvious "nothing" > (empty) value that intuitively matches the zero-argument case, e.g. set, > dict, list and so forth. The general principle, including consistency, *has* been invoked in discussions about making the code example above true. It is not just an accident. To me, an empty range is nearly as obvious as any other empty collection. > The cases of int, float, complex etc. are a little more dubious; I'm not > convinced there's a general philosophical reason why int() should be > allowed at all. E.g. int("") fails, int([]) fails, etc. 
so there's no > general principle that the int of "emptiness" is expected to return 0. > > The fact that float() has to choose between two zero objects, complex() > between four, and Fraction and Decimal between an infinity of zero > objects, Fraction normalizes all 0 fractions to 0/1, so there is no choice ;-) >>> from fractions import Fraction >>> Fraction(0, 2) Fraction(0, 1) >>> Fraction() Fraction(0, 1) I believe there was consideration given to similarly normalizing ranges so that equal ranges (in 3.3, see below) would have the same start, stop, and step attributes. But I believe Guido said that recording the input might help debugging. Or there might have been some point about consistency with slice objects. If list objects, for instance, had a .source_type attribute (for debugging), there would be multiple different but equal empty lists. Both [] and list() would then, most sensibly, use list as the default .source_type. > highlights that the choice of a "default" is at least in part > an arbitrary choice. If Python has any general principle here, it is > that we should be reluctant to make arbitrary choices in the face of > ambiguity. > > For the avoidance of doubt, I'm not arguing for changing the behaviour > of int. The current behaviour is fine. But I don't think we should treat > it as a general principle that other objects should necessarily follow. The consistent list above *is* a result of treating the principle as one that 'other' classes should follow. > Are you using Python 2 here? If so, you should be looking at xrange, not > range. 
In Python 3, range objects are equal if their start, stop and > step attributes are equal, not if their output values are equal: > > py> range(0) == range(1,1) > False > py> range(1, 6, 2) == range(1, 7, 2) > False Python 3.3.0a3 (default, May 1 2012, 16:46:00) [MSC v.1500 64 bit (AMD64)] on win32 >>> range(0) == range(1,1) True >>> range(1, 6, 2) == range(1, 7, 2) True I remember there being a discussion about this, which Guido was part of, that since ranges are sequences, not their source inputs, == should reflect what they are, and not how they came to be. If ranges A and B are equal, len(A) == len(B), A[i] == B[i], and iter(A) and iter(B) produce the same sequence -- and vice versa. >> range(0) == range(0, 0, 1) would be the obvious choice for range(). > > I'm not entirely sure that is quite so obvious. range() defaults to a > start of 0 and a step of 1, so it's natural to reason that range() => > range(0, end, 1). But surely we should treat end to be a required > argument? If end is not required, that suggests the possibility of > calling range with (say) a start value only, using the default end and > step values. > > I think there is great value in keeping range simple, and the simplest > thing is to keep end as a required argument and refuse the temptation to > guess if it is not given. > > I do think this is a line-call though. If I were designing range from > scratch, I too would be sorely tempted to have range() => range(0). > >> Another advantage of doing this, beside consistency, is that it would >> emphasize that range() produces a re-iterable sequence, not just an >> iterator. Sorry, that is mis-worded to the point of being erroneous. I meant to say 'non-iterator re-iterable sequence *instead of* an iterator. Just like a list or tuple or deque ... . > I don't follow your reasoning there. Whether range(*args) succeeds or > fails for some arbitrary value of args has no bearing on whether it is > re-iterable. 
Whether range is a non-iterator iterable sequence or an iterator has everything to do with whether it is reiterable. > Consider zip(). That surprises me. Zip is a one-time iterator, like map, dependent on underlying iterables. I wonder whether it is really intentional, or an accident of the definition or some implementation, that zip() returns an exhausted iterator instead of raising. In any case, bool(zip()) returns True, not False, so it has nothing to do with the return null principle. >> 6. filter() does not work. >> >> While filter is a class, its instances, again, are dependent on >> another object, not just at creation but during its lifetime. >> Moreover, bool(empty-iterable) is not False. Ditto for map() and, for >> instance, open(), even though in the latter case the primary object is >> external. > > Likewise reversed() and iter(). Both fail, as I expected. > sorted() is an interesting case, because although it returns a list > rather than a (hypothetical) SortedSequence object, it could choose to > return [] when called with no arguments. I think it is right to not do so. It is a function, not a class. I would not suggest that all functions of one arg should have a default input and therefore a default output. This is certainly not a Python design principle. > zip() on the other hand is a counter-example, and it is informative to > think about why zip() succeeds while range() fails. zip takes an > arbitrary number of arguments, where no particular argument is required > or treated differently from the others. Also there is a unique > interpretation of zip() with no arguments: an empty zip object (or list > in the case of Python 2). > > Nevertheless, I consider it somewhat surprising that zip() succeeds, and > don't think that it is a good match for range. They are not in the same sub-categories of iterables.
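A short sketch of the cases compared in this message, assuming a Python 3 interpreter. The helper raises_type_error is a hypothetical name introduced only for this illustration.

```python
# Classes with a null object: calling them with no arguments gives a false value.
null_objects = [cls() for cls in (bool, int, float, complex, tuple, list,
                                  dict, set, frozenset, str, bytes, bytearray)]
all_false = not any(bool(obj) for obj in null_objects)

# zip() succeeds with no arguments, but the result is an iterator,
# and an iterator is true even when it has nothing to yield.
empty_zip = zip()
zip_is_true = bool(empty_zip)
zip_items = list(empty_zip)

# An empty range is a sequence, so it is false like any other empty sequence.
empty_range_is_false = not range(0)


def raises_type_error(func):
    """Return True if calling func() with no arguments raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


# Callables from the discussion that refuse a zero-argument call.
no_arg_failures = {f.__name__: raises_type_error(f)
                   for f in (range, reversed, iter, sorted, filter, map)}
```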
-- Terry Jan Reedy From tjreedy at udel.edu Mon May 7 04:43:20 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 06 May 2012 22:43:20 -0400 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: Message-ID: On 5/6/2012 6:24 PM, Georg Brandl wrote: > 7. The "default value" is only really useful for types that are best > described as "data-like". > range is not a data-like type, it's a helper > for iteration, just as filter or dictviews aren't data-like. Not knowing your definition of 'data-like', it is hard to respond. A range is an immutable, indexable, reiterable sequence of regularly spaced ints with a definite length. It compactly represents a finite but possibly long arithmetic sequence. While mostly used for iteration, it is not limited to iteration. It implements the sequence protocol. It is not an iterator. It is not dependent on an underlying iterable. It is properly documented with the other sequence types. It is most like a bytes object in being an immutable sequence of ints. In that regard, it is different in not restricting the ints to [0,255] while restricting the differences to being equal. (Dict views, especially .keys() are also, to me, somewhat data-like and not limited to iteration. But, unlike ranges, they are dependent on another object.) -- Terry Jan Reedy From tjreedy at udel.edu Mon May 7 04:56:19 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 06 May 2012 22:56:19 -0400 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: <4FA71B68.4010400@pearwood.info> Message-ID: On 5/6/2012 10:20 PM, Terry Reedy wrote: > On 5/6/2012 8:46 PM, Steven D'Aprano wrote: >> Are you using Python 2 here? If so, you should be looking at xrange, not >> range.
In Python 3, range objects are equal if their start, stop and >> step attributes are equal, not if their output values are equal: >> >> py> range(0) == range(1,1) >> False >> py> range(1, 6, 2) == range(1, 7, 2) >> False > > Python 3.3.0a3 (default, May 1 2012, 16:46:00) [MSC v.1500 64 bit > (AMD64)] on win32 >>>> range(0) == range(1,1) > True >>>> range(1, 6, 2) == range(1, 7, 2) > True > > I remember there being a discussion about this, which Guido was part of, > that since ranges are sequences, not their source inputs, == should > reflect what they are, and not how they came to be. If ranges A and B > are equal, len(A) == len(B), A[i] == B[i], and iter(A) and iter(B) > produce the same sequence -- and vice versa. I found the change notice in the library manual. "Changed in version 3.3: Define '==' and '!=' to compare range objects based on the sequence of values they define (instead of comparing based on object identity)." That implies, for instance, "range(1,6,2) != range(1,6,2)" in 3.2, which is rather useless. Python slowly improves in many little ways. -- Terry Jan Reedy From tjreedy at udel.edu Mon May 7 05:22:18 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 06 May 2012 23:22:18 -0400 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: Message-ID: On 5/6/2012 7:52 PM, Devin Jeanpierre wrote: > On Sun, May 6, 2012 at 5:24 PM, Terry Reedy wrote: >> Another advantage of doing this, beside consistency, is that it would >> emphasize that range() produces a re-iterable sequence, not just an >> iterator. My apology for mis-writing that. A range is a non-iterator, re-iterable sequence rather than an iterator. > How? The empty sequence is the exact case where reiterable objects and > iterators have identical iteration behavior. (Both immediately stop > every time you try.) That is also true of empty tuples, lists, sets, and dicts. An iterator can only be used to iterate - once. Non-iterator iterables (usually) have other behaviors.
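The distinction drawn here can be sketched as follows, assuming Python 3.3 or later (where range equality compares the values produced). The variable names are illustrative only.

```python
r = range(1, 6, 2)

# A range can be iterated any number of times.
first_pass = list(r)
second_pass = list(r)

# An iterator made from it can be consumed only once.
it = iter(r)
once = list(it)
again = list(it)      # nothing left on the second attempt

# Sequence behaviour beyond iteration.
length = len(r)
last = r[-1]
contains_three = 3 in r

# Equality reflects the values, not the start/stop/step used to build them.
same_values = range(1, 6, 2) == range(1, 7, 2)
empty_equal = range(0) == range(1, 1)
```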
>> Possible objections and responses: >> >> 1. This would slightly complicate the already messy code and doc for >> range(). >> >> Pass, for now > > By this, do you mean don't write new documentation? No, it means I was deferring discussion of this possible objection unless someone raises it as a show-stopper, or it becomes the last issue. The current messiness is that the signature in the doc "range([start], stop[, step])" is non-standard in that it does not follow the rule that optional parameters and arguments follow required ones. It would perhaps be more accurate, but also possibly more confusing, to give it as "range(start_stop, [[stop], [step])", where start_stop is interpreted as start if stop is given and stop if stop is not (otherwise) given. Either version would just need an outer '[]' added: "range([[start], stop, [step]])" and a note "If no arguments are given, return range(0)." For a Python version, adding "= 0" to start_stop in the header should be sufficient. But I do not know how the C version works. > Most of the other types are useful as parameters to something such as > collections.defaultdict. I admit range() would be seemingly useless there. > Although, I actually like this idea, because it feels more consistent. > I imagine that isn't a good reason to like things though. I believe, though, it was a reason for the consistency of everything other than range. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Mon May 7 07:05:30 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 07 May 2012 17:05:30 +1200 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: <4FA71B68.4010400@pearwood.info> References: <4FA71B68.4010400@pearwood.info> Message-ID: <4FA7581A.6050807@canterbury.ac.nz> Steven D'Aprano wrote: > The cases of int, float, complex etc. are a little more dubious; I'm not > convinced there's a general philosophical reason why int() should be > allowed at all.
A philosophical reason would be that list() and int() both return false values. Pragmatically, it makes them useful as arguments to defaultdict. The fact that there is sometimes more than one representation of zero isn't much of a problem, since they all give the same result when you add a nonzero value to them. The defaultdict argument doesn't apply to range() in Python 3, or xrange() in Python 2, since you can't apply += to them. It also doesn't apply much to range() in Python 2, since list would work just as well as a defaultdict argument as a range that accepted no arguments. -- Greg From greg.ewing at canterbury.ac.nz Mon May 7 07:16:39 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 07 May 2012 17:16:39 +1200 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: <4FA71B68.4010400@pearwood.info> Message-ID: <4FA75AB7.1000702@canterbury.ac.nz> Terry Reedy wrote: > I believe there was consideration given to similarly normalizing ranges > so that equal ranges (in 3.3, see below) would have the same start, > stop, and step attributes. That might make sense if there were a well-defined algebra of range objects, but there isn't. For example, concatenating the sequences represented by two ranges with different step sizes results in a sequence that can't be represented by a single range object. Also I can't remember seeing a plethora of use cases for comparing range objects. -- Greg From ncoghlan at gmail.com Mon May 7 09:06:59 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 7 May 2012 17:06:59 +1000 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: <4FA75AB7.1000702@canterbury.ac.nz> References: <4FA71B68.4010400@pearwood.info> <4FA75AB7.1000702@canterbury.ac.nz> Message-ID: On Mon, May 7, 2012 at 3:16 PM, Greg Ewing wrote: > Also I can't remember seeing a plethora of use cases for > comparing range objects. 
Most of the changes to range() in 3.3 are about making them live up to their claim to implement the Sequence ABC. The approach taken to achieve this is to follow the philosophy that a Python 3.3 range object should behave as much as possible like a memory efficient representation for a tuple of regularly spaced integers (but ignoring the concatenation and repetition operations that tuples support but aren't part of the Sequence ABC). Having range() return an empty range in the same way that tuple() returns an empty tuple would be a natural extension of that philosophy. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Mon May 7 12:50:26 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 07 May 2012 12:50:26 +0200 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: <4FA71B68.4010400@pearwood.info> <4FA75AB7.1000702@canterbury.ac.nz> Message-ID: On 05/07/2012 09:06 AM, Nick Coghlan wrote: > On Mon, May 7, 2012 at 3:16 PM, Greg Ewing wrote: >> Also I can't remember seeing a plethora of use cases for >> comparing range objects. > > Most of the changes to range() in 3.3 are about making them live up to > their claim to implement the Sequence ABC. The approach taken to > achieve this is to follow the philosophy that a Python 3.3 range > object should behave as much as possible like a memory efficient > representation for a tuple of regularly spaced integers (but ignoring > the concatenation and repetition operations that tuples support but > aren't part of the Sequence ABC). > > Having range() return an empty range in the same way that tuple() > returns an empty tuple would be a natural extension of that > philosophy. For what gain? At the moment, I cannot think of any arguments in favor of the change, which is the point where arguments against it aren't even needed to keep the status quo.
Ah yes: and I would rather have the bug for i in range(): # <- "n" (or equivalent) missing give me an explicit exception than silently "skipping" the loop. After all, the primary use case for range() is loops, and we should not make that use worse for the benefit of hypothetical other use cases. Georg From ncoghlan at gmail.com Mon May 7 13:14:27 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 7 May 2012 21:14:27 +1000 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: <4FA71B68.4010400@pearwood.info> <4FA75AB7.1000702@canterbury.ac.nz> Message-ID: On Mon, May 7, 2012 at 8:50 PM, Georg Brandl wrote: > For what gain? At the moment, I cannot think of any arguments in favor > of the change, which is the point where arguments against it aren't > even needed to keep the status quo. > > Ah yes: and I would rather have the bug > > for i in range(): # <- "n" (or equivalent) missing > > give me an explicit exception than silently "skipping" the loop. > After all, the primary use case for range() is loops, and we should not > make that use worse for the benefit of hypothetical other use cases. Now *that's* a good reason to nix the idea :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ram.rachum at gmail.com Mon May 7 13:42:34 2012 From: ram.rachum at gmail.com (Ram Rachum) Date: Mon, 7 May 2012 04:42:34 -0700 (PDT) Subject: [Python-ideas] bool(datetime.time(0, 0)) Message-ID: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> Hello, Currently, `bool(datetime.time(0, 0)) is False`. Can we change that to `True`? There is nothing False-y about midnight. Ram.
From steve at pearwood.info Mon May 7 16:38:19 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 08 May 2012 00:38:19 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> Message-ID: <4FA7DE5B.8000703@pearwood.info> Ram Rachum wrote: > Hello, > > Currently, `bool(datetime.time(0, 0)) is False`. > > Can we change that to `True`? > > There is nothing False-y about midnight. Of course there is -- it is the witching hour, and witches are known to be deceivers whose words and actions are false. *wink* I fear that backwards compatibility will prevent any change, but I don't see any good reasons for treating any date or time as a false value. By the way, the "To:" address on your post is set to python-ideas at googlegroups.com, which does not exist. -- Steven From solipsis at pitrou.net Mon May 7 17:02:19 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 17:02:19 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> Message-ID: <20120507170219.266304f2@pitrou.net> On Tue, 08 May 2012 00:38:19 +1000 Steven D'Aprano wrote: > Ram Rachum wrote: > > Hello, > > > > Currently, `bool(datetime.time(0, 0)) is False`. > > > > Can we change that to `True`? > > > > There is nothing False-y about midnight. > > > Of course there is -- it is the witching hour, and witches are known to be > deceivers whose words and actions are false. > > *wink* > > I fear that backwards compatibility will prevent any change, but I don't see > any good reasons for treating any date or time as a false value. I, too, think it would be desirable to make the change. Regards Antoine.
From dickinsm at gmail.com Mon May 7 17:11:55 2012 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 7 May 2012 16:11:55 +0100 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120507170219.266304f2@pitrou.net> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> Message-ID: > Steven D'Aprano wrote: > I fear that backwards compatibility will prevent any change, but I don't see > any good reasons for treating any date or time as a false value. I agree for the date, time and datetime classes. Having timedelta(0) be False makes sense to me, though. But see: http://bugs.python.org/issue13936 Mark From alexander.belopolsky at gmail.com Mon May 7 17:19:04 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 May 2012 11:19:04 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> Message-ID: On Mon, May 7, 2012 at 11:11 AM, Mark Dickinson wrote: >> Steven D'Aprano wrote: >> I fear that backwards compatibility will prevent any change, but I don't see >> any good reasons for treating any date or time as a false value. > > I agree for the date, time and datetime classes. Can anyone show a use case where the change will result in an improvement? It seems to me that the issue mostly shows up in the code like "if t: ..." which would work better with "if t is not None: ...". 
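A minimal sketch of the "if t is not None" test suggested above. The function label is a hypothetical name used only for illustration; the point is that the result does not depend on what bool() returns for a midnight time object on the running interpreter.

```python
from datetime import time


def label(t=None):
    """Format an optional time of day; None means no time was given."""
    if t is not None:   # explicit test, independent of time.__bool__
        return t.strftime("%H:%M")
    return "unset"


midnight_label = label(time(0, 0))
afternoon_label = label(time(13, 5))
missing_label = label()
```

Written as "if t:", the midnight case would fall through to "unset" on any interpreter where midnight is false.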
From solipsis at pitrou.net Mon May 7 17:32:54 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 17:32:54 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> Message-ID: <20120507173254.6a6aee5b@pitrou.net> On Mon, 7 May 2012 11:19:04 -0400 Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 11:11 AM, Mark Dickinson wrote: > >> Steven D'Aprano wrote: > >> I fear that backwards compatibility will prevent any change, but I don't see > >> any good reasons for treating any date or time as a false value. > > > > I agree for the date, time and datetime classes. > > Can anyone show a use case where the change will result in an > improvement? Well, less occasional puzzlement is an improvement in itself. Unintuitive behaviour is always a risk for software quality. Regards Antoine. From alexander.belopolsky at gmail.com Mon May 7 17:57:34 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 May 2012 11:57:34 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120507173254.6a6aee5b@pitrou.net> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> Message-ID: On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou wrote: > Well, less occasional puzzlement is an improvement in itself. > Unintuitive behaviour is always a risk for software quality. I don't find the current behavior unintuitive. It is common to represent time of day as an integer (number of minutes or seconds since midnight) or as a float (fraction of the 24-hour day). In these cases one gets bool(midnight) -> False as an artifact of the representation. 
For someone who wants to switch from typeless time variables to datetime module types, bool(midnight) -> True may present an extra hurdle. One can improve the quality of his software by avoiding constructs that he finds unintuitive. For example, I claim that in most cases a test for bool(t) is really a lazy version of the more appropriate test for t is None. Note that if we make bool(midnight) -> True, it will not be trivial to faithfully reproduce the old behavior. I want the proponents of the change to try it before I explain why it is not easy. From solipsis at pitrou.net Mon May 7 18:06:53 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 18:06:53 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> Message-ID: <20120507180653.25a654d1@pitrou.net> On Mon, 7 May 2012 11:57:34 -0400 Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou wrote: > > Well, less occasional puzzlement is an improvement in itself. > > Unintuitive behaviour is always a risk for software quality. > > I don't find the current behavior unintuitive. It is common to > represent time of day as an integer (number of minutes or seconds > since midnight) or as a float (fraction of the 24-hour day). I'm not sure it's common. I don't remember seeing it myself. When I use an integer or a float as you say, it's to represent a *duration*, not an absolute time. > In these > cases one gets bool(midnight) -> False as an artifact of the > representation. That's part of why the integer or float representation is worse than a higher-level structure. > One can improve the quality of his software by > avoiding constructs that he finds unintuitive. For example, I claim > that in most cases a test for bool(t) is really a lazy version of the > more appropriate test for t is None. 
From a purity standpoint, you are right, but people still do it intuitively, and it works for well-behaved types. Either we try to lecture people into "the one way of writing Python code using time objects", or we make it so that common uses are not broken (i.e. a piece of code that gets wrongly executed in the rare case they encounter a midnight time object). > Note that if we make bool(midnight) -> True, it will not be trivial to > faithfully reproduce the old behavior. Why do you want to reproduce it? Does midnight warrant any special shortcut for testing? Especially one that is confusing to many readers. Regards Antoine. From alexander.belopolsky at gmail.com Mon May 7 18:24:21 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 May 2012 12:24:21 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120507180653.25a654d1@pitrou.net> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> Message-ID: On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: > Why do you want to reproduce it? If I am porting my software to the hypothetical Python 3.4 and see that time.__bool__ changed, I would prefer to simply replace every occurrence of time tested for truth with something equivalent. In a porting scenario, I don't want to second-guess the intent of the original programmer or "improve" the code. > Does midnight warrant any special shortcut for testing? I never needed it, but apparently it is common enough for users to notice and complain. That's why I asked my original question: if you've seen a time variable being tested for truth, was it a bug that can be fixed by a change in time.__bool__ or a deliberate test for the midnight value? > Especially one that is confusing to many readers.
I have a feeling that "readers" here are readers of documentation or tutorials rather than readers of actual code. If this is the case, we can discuss how to improve the documentation and not change the behavior. From solipsis at pitrou.net Mon May 7 18:33:43 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 18:33:43 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> Message-ID: <20120507183343.5552cccb@pitrou.net> On Mon, 7 May 2012 12:24:21 -0400 Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: > > Does midnight warrant any special shortcut for testing? > > I never needed it, but apparently it is common enough for users to > notice and complain. How so? Those users complain that midnight is false, not that they have trouble testing for midnight. That's the whole point really: they don't think about midnight as a special value, and they are surprised that it is. > That's why I asked my original question: if > you've seen a time variable being tested for truth, was it a bug that > can be fixed by a change in time.__bool__ or a deliberate test for the > midnight value? Most likely it's a bug, unless the code is written by an expert in the datetime module. I don't expect many people to remember such oddities (and I don't remember them myself), let alone willfully rely on them instead of writing more explicit code. > > Especially one that is confusing to many readers. > > I have a feeling that "readers" here are readers of documentation or > tutorials rather than readers of actual code. I was talking about readers of code. If I read code where boolean testing of a time object is done, I wouldn't assume the intent is to test for midnight (unless there's a comment indicating so).
Regards Antoine. From mal at egenix.com Mon May 7 18:53:28 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 07 May 2012 18:53:28 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> Message-ID: <4FA7FE08.2050901@egenix.com> Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou wrote: >> Well, less occasional puzzlement is an improvement in itself. >> Unintuitive behaviour is always a risk for software quality. > > I don't find the current behavior unintuitive. It is common to > represent time of day as an integer (number of minutes or seconds > since midnight) or as a float (fraction of the 24-hour day). In these > cases one gets bool(midnight) -> False as an artifact of the > representation. In Python 2.x, the slot used by bool() is called nb_nonzero, which returns 1/0 depending on whether a value is considered zero or not. This makes it quite natural for any special object representing a value akin to zero in its domain to be false. In Python 3.x, nb_nonzero was renamed to nb_bool without really paying attention to the fact that many types implemented the original meaning instead of a notion of boolean value, so I guess we'll just have to live with it, unless we want to introduce yet another subtle difference between Python 2.x and 3.x. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 07 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2012-07-02: EuroPython 2012, Florence, Italy 56 days to go 2012-04-26: Released mxODBC 3.1.2 http://egenix.com/go28 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Mon May 7 18:57:47 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 May 2012 12:57:47 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120507183343.5552cccb@pitrou.net> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> Message-ID: On Mon, May 7, 2012 at 12:33 PM, Antoine Pitrou wrote: >> I have a feeling that "readers" here are readers of documentation or >> tutorials rather than readers of actual code. > > I was talking about readers of code. If I read code where boolean > testing of a time object is done, I wouldn't assume the intent is to > test for midnight (unless there's a comment indicating so). I understand your hypothetical, but does such code actually exist in the wild or are we debating the number of angels that can dance at midnight? Yes, if I were to rely on bool(time(0)) exact behavior, I would comment my code. Do we know one way or another whether such code exists? As a matter of coding style, I recommend avoiding use of datetime.time objects. More often than not, time values are meaningless when detached from the date value. This is particularly true when timezone aware instances are used.
Lack of support for time + timedelta makes naked time values inconvenient even in reasonable applications that deal with schedules that repeat from day to day. If perceived uncertainty over the truth value will further dissuade anyone from using naked time objects, I am all for it. Note that since the date range starts at date(1,1,1) we don't have the same problem with the date or datetime objects. From tjreedy at udel.edu Mon May 7 19:04:17 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 07 May 2012 13:04:17 -0400 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: <4FA71B68.4010400@pearwood.info> <4FA75AB7.1000702@canterbury.ac.nz> Message-ID: On 5/7/2012 7:14 AM, Nick Coghlan wrote: > On Mon, May 7, 2012 at 8:50 PM, Georg Brandl wrote: >> For what gain? At the moment, I cannot think of any arguments in favor >> of the change, which is the point where arguments against it aren't >> even needed to keep the status quo. >> >> Ah yes: and I would rather have the bug >> >> for i in range(): #<- "n" (or equivalent) missing >> >> give me an explicit exception than silently "skipping" the loop. >> After all, the primary use case for range() is loops, and we should not >> make that use worse for the benefit of hypothetical other use cases. > > Now *that's* a good reason to nix the idea :) I agree that the bug possibility is by far the strongest for range whereas the usefulness is probably the weakest. So this seems a case of practicality beats purity.
-- Terry Jan Reedy From solipsis at pitrou.net Mon May 7 19:03:05 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 19:03:05 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> Message-ID: <20120507190305.5255b030@pitrou.net> On Mon, 7 May 2012 12:57:47 -0400 Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 12:33 PM, Antoine Pitrou wrote: > >> I have a feeling that "readers" here are readers of documentation or > >> tutorials rather than readers of actual code. > > > > I was talking about readers of code. If I read code where boolean > > testing of a time object is done, I wouldn't assume the intent is to > > test for midnight (unless there's a comment indicating so). > > I understand your hypothetical, but does such code actually exist in > the wild or are we debating the number of angels than can dance at > midnight? Well, people complained about it, so they did try to write such code and got bitten, right? Whether or not such code still exists "in the wild" probably depends on how fast people get bitten and fix it :-) But regardless, it's still an annoyance for people who write new code. On the other hand, nobody chimed in to say that they relied on boolean testing to check for midnight. > As a matter of coding stile, I recommend avoiding use of datetime.time > objects. More often than not, time values are meaningless when > detached from the date value. I tend to agree. > Note that since date range starts at date(1,1,1) we don't have the > same problem with the date or datetime objects. That's fortunate. Regards Antoine. 
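A small sketch of the point about date and datetime made above, assuming nothing beyond the standard datetime module (the variable names are made up for the example). Neither type has a zero-like value, so every instance is true in a boolean context, including a datetime that falls exactly on midnight. The result for a bare time(0, 0) is printed but deliberately not relied on, since it depends on the Python version in use.

```python
import datetime

# date and datetime have no "zero" value: the range starts at date(1, 1, 1),
# so every instance is true in a boolean context.
earliest_date = datetime.date.min          # date(1, 1, 1)
earliest_moment = datetime.datetime.min    # datetime(1, 1, 1, 0, 0)
midnight_moment = datetime.datetime(2012, 5, 7, 0, 0)

always_true = [bool(earliest_date), bool(earliest_moment), bool(midnight_moment)]
print(always_true)

# A bare time at midnight is the odd one out. Its truth value depends on the
# Python version, so it is only printed here, not asserted.
print(bool(datetime.time(0, 0)))
```

Only the bare time object is affected by the behaviour under discussion; attaching a date sidesteps the question entirely.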
From solipsis at pitrou.net Mon May 7 19:06:12 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 19:06:12 +0200 Subject: [Python-ideas] Should range() == range(0)? References: <4FA71B68.4010400@pearwood.info> <4FA75AB7.1000702@canterbury.ac.nz> Message-ID: <20120507190612.2173a86e@pitrou.net> On Mon, 07 May 2012 13:04:17 -0400 Terry Reedy wrote: > On 5/7/2012 7:14 AM, Nick Coghlan wrote: > > On Mon, May 7, 2012 at 8:50 PM, Georg Brandl wrote: > >> For what gain? At the moment, I cannot think of any arguments in favor > >> of the change, which is the point where arguments against it aren't > >> even needed to keep the status quo. > >> > >> Ah yes: and I would rather have the bug > >> > >> for i in range(): #<- "n" (or equivalent) missing > >> > >> give me an explicit exception than silently "skipping" the loop. > >> After all, the primary use case for range() is loops, and we should not > >> make that use worse for the benefit of hypothetical other use cases. > > > > Now *that's* a good reason to nix the idea :) > > I agree that the bug possibility is by far the strongest for range > whereas the usefulness is probably the weakest. So this seems a case of > practicality beats purity. The fact that there's absolutely no use case to call range() without an argument is enough to dismiss the idea, IMO. Just because something can be done doesn't mean it should be done. Regards Antoine. From alexander.belopolsky at gmail.com Mon May 7 19:16:11 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 May 2012 13:16:11 -0400 Subject: [Python-ideas] Should range() == range(0)? In-Reply-To: References: <4FA71B68.4010400@pearwood.info> <4FA75AB7.1000702@canterbury.ac.nz> Message-ID: On Mon, May 7, 2012 at 6:50 AM, Georg Brandl wrote: >> Having range() return an empty range in the same way that tuple() >> returns an empty tuple would be a natural extension of that >> philosophy. > > For what gain? 
Lack of the default constructor is a pain for generic programming in Python. It is not uncommon to require an arbitrary instance of the given type and calling the type without arguments is a convenient way to get one. I never missed working range() mostly because I don't recall ever using range as an actual type rather than the Python way to spell the C for loop. I do, however often miss default constructors for datetime objects, so I understand why some people may desire range(). From tjreedy at udel.edu Mon May 7 19:27:01 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 07 May 2012 13:27:01 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120507183343.5552cccb@pitrou.net> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> Message-ID: On 5/7/2012 12:33 PM, Antoine Pitrou wrote: > On Mon, 7 May 2012 12:24:21 -0400 > Alexander Belopolsky > wrote: >> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: >>> Does midnight warrant any special shortcut for testing? >> >> I never needed it, but apparently it is common enough for users to >> notice an complain. > > How so? Those users complain that midnight is false, not that they have > trouble testing for midnight. > That's the whole point really: they don't think about midnight as a > special value, and they are surprised that it is. It is only special in the representation because 24:00 == 00:00. I have the impression that European train timetables at least in decades past printed midnight arrival times as 24:00 instead of 00:00. I agree that that is an extremely thin reason for the current behavior. Someone printing timetables in that style should explicitly test arrivals for being midnight. 
There have been cultures that started the day at dawn or noon, and in the US at least, we still restart half-days at both noon and midnight, so both would be special here. Rather unusually, I disagree with Tim here: "It is odd, but really no odder than "zero values" of other types evaluating to false in Boolean contexts ;-)". Numerical 0 and empty collections are special and often need to be treated specially in a way that is untrue of midnight. I think treating it as special was a design mistake. There have been discussions on python-list to the effect that if one wants to branch on something being None or not, one should be explicit -- 'is None' or 'is not None' -- to avoid accidentally picking up other null values. -- Terry Jan Reedy From steve at pearwood.info Mon May 7 19:31:59 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 08 May 2012 03:31:59 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> Message-ID: <4FA8070F.1040107@pearwood.info> Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou wrote: >> Well, less occasional puzzlement is an improvement in itself. >> Unintuitive behaviour is always a risk for software quality. > > I don't find the current behavior unintuitive. It is common to > represent time of day as an integer (number of minutes or seconds > since midnight) or as a float (fraction of the 24-hour day). In these > cases one gets bool(midnight) -> False as an artifact of the > representation. I think you have made a good point there: the behaviour of bool(midnight) is an artifact of the internal representation. Unless this behaviour is documented, that makes it an implementation detail, and therefore lowers (but not eliminates) the barrier to changing it. [...] 
> Note that if we make bool(midnight) -> True, it will not be trivial to
> faithfully reproduce the old behavior. I want the proponents of the
> change to try it before I explain why it is not easy.

I think it is easy. Instead of either of these:

    if bool(some_time):
        ...
    if some_time:
        ...

write this:

    _MIDNIGHT = datetime.time(0, 0)  # defined once
    if some_time != _MIDNIGHT:
        ...

For code where some_time could be None, write this:

    if not (some_time is None or some_time == _MIDNIGHT):
        ...

Have I missed any common cases?

--
Steven

From alexander.belopolsky at gmail.com Mon May 7 19:44:29 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 13:44:29 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA8070F.1040107@pearwood.info>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <4FA8070F.1040107@pearwood.info>
Message-ID:

On Mon, May 7, 2012 at 1:31 PM, Steven D'Aprano wrote:
> I think it is easy. Instead of either of these:
>
>     if bool(some_time):
>         ...
>     if some_time:
>         ...
>
> write this:
>
>     _MIDNIGHT = datetime.time(0, 0)  # defined once
>     if some_time != _MIDNIGHT:
>         ...
>
> For code where some_time could be None, write this:
>
>     if not (some_time is None or some_time == _MIDNIGHT):
>         ...
>
>
> Have I missed any common cases?

Yes, your code will raise an exception if some_time has tzinfo set.
This is exactly the issue that I expected you to miss, so I rest my case.
:-) From breamoreboy at yahoo.co.uk Mon May 7 19:44:34 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 07 May 2012 18:44:34 +0100 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> Message-ID: On 07/05/2012 18:27, Terry Reedy wrote: > It is only special in the representation because 24:00 == 00:00. I have > the impression that European train timetables at least in decades past > printed midnight arrival times as 24:00 instead of 00:00. I agree that > that is an extremely thin reason for the current behavior. Someone > printing timetables in that style should explicitly test arrivals for > being midnight. > > There have been cultures that started the day at dawn or noon, and in > the US at least, we still restart half-days at both noon and midnight, > so both would be special here. > My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind. I'll happily be corrected. -- Cheers. Mark Lawrence. From mal at egenix.com Mon May 7 19:49:56 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 07 May 2012 19:49:56 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> Message-ID: <4FA80B44.8090608@egenix.com> Mark Lawrence wrote: > On 07/05/2012 18:27, Terry Reedy wrote: >> It is only special in the representation because 24:00 == 00:00. 
I have >> the impression that European train timetables at least in decades past >> printed midnight arrival times as 24:00 instead of 00:00. I agree that >> that is an extremely thin reason for the current behavior. Someone >> printing timetables in that style should explicitly test arrivals for >> being midnight. >> >> There have been cultures that started the day at dawn or noon, and in >> the US at least, we still restart half-days at both noon and midnight, >> so both would be special here. >> > > My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding > over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind. I'll > happily be corrected. It's in common use in Germany, e.g. for describing opening hours. -- Marc-Andre Lemburg eGenix.com From alexander.belopolsky at gmail.com Mon May 7 20:01:19 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 May 2012 14:01:19 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <4FA80B44.8090608@egenix.com> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> <4FA80B44.8090608@egenix.com> Message-ID: On Mon, May 7, 2012 at 1:49 PM, M.-A. Lemburg wrote: >> My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding >> over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind. I'll >> happily be corrected. > > It's in common use in Germany, e.g. for describing opening hours. Properly supporting 24:00 timestamps in datetime module is actually a more interesting issue than what bool(time(0)) should be. See http://bugs.python.org/issue10427 From mal at egenix.com Mon May 7 20:25:26 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 07 May 2012 20:25:26 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <20120507183343.5552cccb@pitrou.net> <4FA80B44.8090608@egenix.com> Message-ID: <4FA81396.4020606@egenix.com> Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 1:49 PM, M.-A. Lemburg wrote: >>> My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding >>> over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind.
I'll
>>> happily be corrected.
>>
>> It's in common use in Germany, e.g. for describing opening hours.
>
> Properly supporting 24:00 timestamps in datetime module is actually a
> more interesting issue than what bool(time(0)) should be. See
> http://bugs.python.org/issue10427

Just to clarify: 24:00 is used when describing times, but not in
timestamps (those use 00:00 and the next day). E.g. it's common to
write: "open 00:00-24:00" or "open 18:00-24:00". I've never seen
anything like "2011-12-31 24:00" in Germany, but Google suggests that
it's in common use in Asia:

https://www.google.de/search?q=%222011-12-31+24:00%22

--
Marc-Andre Lemburg
eGenix.com

From steve at pearwood.info Mon May 7 20:31:51 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 May 2012 04:31:51 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To:
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <4FA8070F.1040107@pearwood.info>
Message-ID: <4FA81517.8070300@pearwood.info>

Alexander Belopolsky wrote:
> On Mon, May 7, 2012 at 1:31 PM, Steven D'Aprano wrote:
>> For code where some_time could be None, write this:
>>
>>     if not (some_time is None or some_time == _MIDNIGHT):
>>         ...
>>
>>
>> Have I missed any common cases?
>
>
> Yes, your code will raise an exception if some_time has tzinfo set.
> This is exactly the issue that I expected you to miss, so I rest my
> case. :-)

But if you set the timezone, midnight is not necessarily false.

py> class GMT5(datetime.tzinfo):
...     def utcoffset(self,dt):
...         return timedelta(hours=5)
...     def tzname(self,dt):
...         return "GMT +5"
...     def dst(self,dt):
...         return timedelta(0)
...
py> gmt5 = GMT5()
py> bool(datetime.time(0,0, tzinfo=gmt5))
True
py> bool(datetime.time(5, 0, tzinfo=gmt5))
False

So I assume anyone using tzinfo will probably know enough not to be
testing against time objects directly. Or at least not be using
bool(some_time) to detect midnight.

--
Steven

From jkbbwr at gmail.com Mon May 7 20:43:17 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Mon, 7 May 2012 19:43:17 +0100
Subject: [Python-ideas] Replacing shelve in the next 3.x release.
Message-ID:

I suggest that we either replace the internals of shelve, or deprecate
it, or remove it in favour of other dbm's like
http://packages.python.org/sqlite3dbm/dbm.html.
Many people feel that shelve is a pointless module that should not be used because it relies too much on pickle, an insecure format. In my own searches, Google showed only 13 projects using shelve and GitHub showed only 3000-odd snippets containing shelve.open, so it's time this module either died quietly or got its internals replaced. For a start, shelve doesn't support integer keys, whereas the alternative suggested earlier clearly does. I'm sure there is other stuff I'm missing, which is why I'm posting here first. From solipsis at pitrou.net Mon May 7 20:50:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 20:50:11 +0200 Subject: [Python-ideas] Replacing shelve in the next 3.x release. References: Message-ID: <20120507205011.676362c0@pitrou.net> On Mon, 7 May 2012 19:43:17 +0100 Jakob Bowyer wrote: > I suggest that we either replace the internals of shelve, or deprecate > it, or remove it in favour of other dbm's like > http://packages.python.org/sqlite3dbm/dbm.html. Many people feel that > shelve is a pointless module that should not be used because it relies > too much on pickle, an insecure format, pickle is only insecure if you want to accept data from untrusted sources. shelve would obviously be very bad for an exchange format, but I don't think that's what it's used for. Someone should post a proper comparison of shelve with its alternatives (including functionality and performance) before a decision is made. Regards Antoine. From ubershmekel at gmail.com Mon May 7 21:20:56 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 7 May 2012 22:20:56 +0300 Subject: [Python-ideas] Replacing shelve in the next 3.x release.
In-Reply-To: <20120507205011.676362c0@pitrou.net> References: <20120507205011.676362c0@pitrou.net> Message-ID: On Mon, May 7, 2012 at 9:50 PM, Antoine Pitrou wrote: > On Mon, 7 May 2012 19:43:17 +0100 > Jakob Bowyer wrote: > > I suggest that we either replace the internals of shelve, or deprecate > > it, or remove it in favour of other dbm's like > > http://packages.python.org/sqlite3dbm/dbm.html. Many people feel that > > shelve is a pointless module that should not be used because it relies > > too much on pickle an insecure format, > > pickle is only insecure if you want to accept data from untrusted > sources. shelve would obviously be very bad for an exchange format, but > I don't think that's what it's used for. > > Someone should post a proper comparison of shelve with its alternatives > (including functionality and performance) before a decision is made. > > Regards > > Antoine. > > I used shelve for a long time on multiple projects as it's really easy to use but I had to deal with random data corruption on abrupt process termination. That was my motivator to implement an sqlite backend for shelve though I guess I wasn't motivated strongly enough to follow through. Yuval From solipsis at pitrou.net Mon May 7 21:25:19 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 21:25:19 +0200 Subject: [Python-ideas] Replacing shelve in the next 3.x release. References: <20120507205011.676362c0@pitrou.net> Message-ID: <20120507212519.58d25b26@pitrou.net> On Mon, 7 May 2012 22:20:56 +0300 Yuval Greenfield wrote: > I used shelve for a long time on multiple projects as it's really easy to > use but I had to deal with random data corruption on abrupt process > termination. That was my motivator to implement an sqlite backend for > shelve though I guess I wasn't motivated strongly enough to follow through.
Atomic replacement of the shelve file is probably an improvement worth adding. Regards Antoine. From greg.ewing at canterbury.ac.nz Tue May 8 02:18:20 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 08 May 2012 12:18:20 +1200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> Message-ID: <4FA8664C.2090201@canterbury.ac.nz> Alexander Belopolsky wrote: > I don't find the current behavior unintuitive. It is common to > represent time of day as an integer (number of minutes or seconds > since midnight) or as a float (fraction of the 24-hour day). In these > cases one gets bool(midnight) -> False as an artifact of the > representation. Relying on that artifact by using midnight as a kind of null value seems like a bad idea to me, though. Any code doing that almost deserves to be broken. -- Greg From greg.ewing at canterbury.ac.nz Tue May 8 02:32:23 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 08 May 2012 12:32:23 +1200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <4FA7FE08.2050901@egenix.com> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <4FA7FE08.2050901@egenix.com> Message-ID: <4FA86997.5000809@canterbury.ac.nz> M.-A. Lemburg wrote: > In Python 3.x, nb_nonzero was renamed to nb_bool without really > paying attention to the fact that many types implemented the original > meaning instead of a notion of boolean value I don't think it was wrong to do that. The fact that the C slot was called "nonzero" was never visible to the Python programmer, who always thought of the operation it represents as truth-testing. 
If there's any fault here, it's with C type implementors who have taken "nonzero" too literally. -- Greg From ncoghlan at gmail.com Tue May 8 03:57:08 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 8 May 2012 11:57:08 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <4FA86997.5000809@canterbury.ac.nz> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <4FA7FE08.2050901@egenix.com> <4FA86997.5000809@canterbury.ac.nz> Message-ID: On Tue, May 8, 2012 at 10:32 AM, Greg Ewing wrote: > M.-A. Lemburg wrote: >> >> In Python 3.x, nb_nonzero was renamed to nb_bool without really >> paying attention to the fact that many types implemented the original >> meaning instead of a notion of boolean value > > > I don't think it was wrong to do that. The fact that the > C slot was called "nonzero" was never visible to the Python > programmer, who always thought of the operation it represents > as truth-testing. > > If there's any fault here, it's with C type implementors who > have taken "nonzero" too literally. The Python level special method in 2.x is also __nonzero__ (or you could just implement __len__). In 3.x, the two relevant special methods are now __bool__ and __len__. Type designers are, of course, still free to use "non-zero" as their definition for how they choose to implement __bool__. For myself, I don't see any harm in having the zero hour be treated as the zero hour at the language level ("zero hour" is another term for midnight, which, as far as I know, stems from the military Zulu notation where it's written as "0000Z"). Certainly I don't see adequate justification for changing the boolean behaviour of time objects at this late date. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ethan at stoneleaf.us Tue May 8 04:57:20 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 07 May 2012 19:57:20 -0700 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> Message-ID: <4FA88B90.7060309@stoneleaf.us> Alexander Belopolsky wrote: > On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: >> Does midnight warrant any special shortcut for testing? > > I never needed it, but apparently it is common enough for users to > notice an complain. That's why I asked my original question: if > you've seen a time variable been tested for truth, was it a bug that > can be fixed by a change in time.__bool__ or a deliberate test for the > midnight value? Complain or rewrite to something reasonable, which is what I did. Much better to have the built-in time behave properly than have users either work around it or constantly create new classes. >> Especially one that is confusing to many readers. > > I have a feeling that "readers" here are readers of documentation or > tutorials rather than readers of actual code. If this is the case, we > can discuss how to improve the documentation and not change the > behavior. The behavior is broken. Midnight is not False. 
~Ethan~

From ncoghlan at gmail.com Tue May 8 05:57:13 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 13:57:13 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA88B90.7060309@stoneleaf.us>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us>
Message-ID:

On Tue, May 8, 2012 at 12:57 PM, Ethan Furman wrote:
> The behavior is broken. Midnight is not False.

Whereas I disagree - I see having zero hour be false as perfectly reasonable behaviour (not necessarily *useful*, but then having all time objects report as True in boolean context isn't particularly useful either).

In matters of opinion, the status quo reigns.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info Tue May 8 07:42:49 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 8 May 2012 15:42:49 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To:
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us>
Message-ID: <20120508054249.GB3797@ando>

On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
> On Tue, May 8, 2012 at 12:57 PM, Ethan Furman wrote:
> > The behavior is broken. Midnight is not False.
>
> Whereas I disagree - I see having zero hour be false as perfectly
> reasonable behaviour (not necessarily *useful*, but then having all
> time objects report as True in boolean context isn't particularly
> useful either).

On the contrary, it can be very useful to have all objects of some classes treated as true.
For example, we can write:

    mo = re.search(a, b)
    if mo:
        do_something_with(mo)

without having to worry about the case where a valid MatchObject happens to be false.

Consider:

    t = get_job_start_time()  # returns a datetime.time object, or None
    if t:
        do_something_with(t)

Oops, we have a bug. If the job happens to have started at exactly midnight, it will wrongly be treated as false.

But wait, it's worse than that. Because it's not actually midnight that gets treated as false, but some arbitrary time of the day which depends on your timezone. It's only midnight if you don't specify a tzinfo, or if you do and happen to be using GMT.

Midnight (modulo timezone) is not special enough to treat it as a false value. It's not an empty container or mapping, or the identity element under addition, or the only string that contains no substrings except itself. It's just another hour.

(Midnight is only special if you care about the change from one day to another. But if you care about that, you're probably using datetime objects rather than time objects, and then you don't have this problem because "midnight last Tuesday" is not treated as false.)

I believe that having time(0,0) be treated as false is at best a misfeature and at worst a bug. It is as unnecessary a special case as it would be to have the string "\0" treated as false. The only good defence for keeping it, in my opinion, would be fear that there is working code that relies on this.

> In matters of opinion, the status quo reigns.

That's somewhat of an exaggeration. The mere existence of a single dissenting opinion isn't enough to block all progress/changes. (Not unless it's Guido *wink*.) Consensus doesn't require every single person to agree.
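To make the timezone point concrete, here is a minimal sketch. The helper `legacy_time_bool` is hypothetical and is not part of the datetime module; it encodes the rule as described above (a time counts as false when it equals midnight once its UTC offset is subtracted), which is an assumption about the behaviour under discussion rather than a quotation of the implementation. The fixed UTC+10 zone is chosen purely for illustration.

```python
from datetime import time, timedelta, timezone


def legacy_time_bool(t):
    """Emulate the truth test described in this thread (assumed rule, not the real method)."""
    # Any non-zero seconds or microseconds make the value true outright.
    if t.second or t.microsecond:
        return True
    # Naive times have no offset; treat that as a zero offset.
    offset = t.utcoffset() or timedelta(0)
    # False only when hours and minutes exactly equal the UTC offset.
    return timedelta(hours=t.hour, minutes=t.minute) != offset


utc_plus_10 = timezone(timedelta(hours=10))  # illustrative fixed offset

naive_midnight = time(0, 0)
aware_midnight = time(0, 0, tzinfo=utc_plus_10)
aware_ten_am = time(10, 0, tzinfo=utc_plus_10)

for label, value in [
    ("naive 00:00", naive_midnight),
    ("00:00 at UTC+10", aware_midnight),
    ("10:00 at UTC+10", aware_ten_am),
]:
    print(label, "->", legacy_time_bool(value))
```

Under this assumed rule, local midnight in the UTC+10 zone is true, while 10:00 in that zone is the value that tests false.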
-- Steven From alexander.belopolsky at gmail.com Tue May 8 08:21:07 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 May 2012 02:21:07 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120508054249.GB3797@ando> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> Message-ID: On Tue, May 8, 2012 at 1:42 AM, Steven D'Aprano wrote: >> In matters of opinion, the status quo reigns. > > That's somewhat of an exaggeration. The mere existence of a single > dissenting opinion isn't enough to block all progress/changes. For what it's worth, I am also against changing the status quo. time(0) is special: it is the smallest possible value. If you deal with low resolution time values, say hourly schedules, it is not unreasonable to test for time(0). For example, when estimating daily averages, midnight samples can be weighted by 1/2 to account for the uncertainty in assigning midnight to a given day. From ethan at stoneleaf.us Tue May 8 08:33:56 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 07 May 2012 23:33:56 -0700 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> Message-ID: <4FA8BE54.6060909@stoneleaf.us> Alexander Belopolsky wrote: > On Tue, May 8, 2012 at 1:42 AM, Steven D'Aprano wrote: >>> In matters of opinion, the status quo reigns. >> That's somewhat of an exaggeration. The mere existence of a single >> dissenting opinion isn't enough to block all progress/changes. > > For what it's worth, I am also against changing the status quo. > time(0) is special: it is the smallest possible value. 
If you deal
> with low resolution time values, say hourly schedules, it is not
> unreasonable to test for time(0). For example, when estimating daily
> averages, midnight samples can be weighted by 1/2 to account for the
> uncertainty in assigning midnight to a given day.

Testing for midnight does not require midnight to be False.

And no, I don't maintain any hope of winning this argument -- that's why I wrote my own class. With it it is possible to create an unspecified moment... and guess what? It evaluates to False; all actual times evaluate as True. (Including midnight. ;)

~Ethan~

From ncoghlan at gmail.com Tue May 8 09:02:04 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 17:02:04 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando>
Message-ID:

On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano wrote:
> On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
>> On Tue, May 8, 2012 at 12:57 PM, Ethan Furman wrote:
>> > The behavior is broken. Midnight is not False.
>>
>> Whereas I disagree - I see having zero hour be false as perfectly
>> reasonable behaviour (not necessarily *useful*, but then having all
>> time objects report as True in boolean context isn't particularly
>> useful either).
>
> On the contrary, it can be very useful to have all objects of some
> classes treated as true. For example, we can write:
>
> mo = re.search(a, b)
> if mo:
>     do_something_with(mo)
>
> without having to worry about the case where a valid MatchObject happens
> to be false.
>
> Consider:
>
> t = get_job_start_time()  # returns a datetime.time object, or None
> if t:
>     do_something_with(t)
>
>
> Oops, we have a bug.
If the job happens to have started at exactly
> midnight, it will wrongly be treated as false.

IMO, you've completely misdiagnosed the source of that bug. Never *ever* rely on boolean evaluation when testing against None. *Always* use the "is not None" trailer.

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Tue May 8 09:09:25 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 17:09:25 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando>
Message-ID:

On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano wrote:
> On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
>> In matters of opinion, the status quo reigns.
>
> That's somewhat of an exaggeration. The mere existence of a single
> dissenting opinion isn't enough to block all progress/changes. (Not
> unless it's Guido *wink*.) Consensus doesn't require every single person
> to agree.

The current behaviour is perfectly consistent and well-defined, so changing it will break any code that relies on the current behaviour. The burden is not on me to prove that there *is* such code in the wild, it's on those proposing a change to prove that there *isn't* such code (which can't be done), or else to provide a sufficiently compelling rationale that the risk of breakage can be justified.

"I don't like it" is not a valid argument for a change, nor is "I like using a boolean test when I really mean an 'is not None' test".

*If* such a change were to be made, it would require at least one release where a DeprecationWarning was emitted before returning False, and then the return value could change in the next release. Why bother?

Cheers,
Nick.

--
Nick Coghlan |
ncoghlan at gmail.com | Brisbane, Australia

From ubershmekel at gmail.com Tue May 8 09:17:31 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Tue, 8 May 2012 10:17:31 +0300
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando>
Message-ID:

On Tue, May 8, 2012 at 8:42 AM, Steven D'Aprano wrote:
[...]
> It's only midnight if you don't specify a tzinfo, or
> if you do and happen to be using GMT.
>

Arbitrary and unexpected times evaluating to False is a bug waiting to happen. Personally I'd prefer all datetime.time objects had no boolean value at all.

> The only good defence for keeping it, in my opinion, would be fear that
> there is working code that relies on this.

+1

Yuval

From solipsis at pitrou.net Tue May 8 11:56:26 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 8 May 2012 11:56:26 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando>
Message-ID: <20120508115626.32706f28@pitrou.net>

On Tue, 8 May 2012 17:02:04 +1000 Nick Coghlan wrote:
>
> IMO, you've completely misdiagnosed the source of that bug. Never
> *ever* rely on boolean evaluation when testing against None.

Nick, that's just plain silly. If we didn't want people to rely on boolean evaluation, we wouldn't define __bool__ at all (or we would make it return a random value).

Regards

Antoine.
From ncoghlan at gmail.com Tue May 8 12:08:07 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 20:08:07 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508115626.32706f28@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <20120508115626.32706f28@pitrou.net>
Message-ID:

The problem is not using boolean evaluation - it's assuming that boolean evaluation is defined as "x is not None". Doing so introduces a completely unnecessary dependency on the type of "x". I'm frankly astonished that so many people seem to think it's a reasonable thing to do.

--
Sent from my phone, thus the relative brevity :)

On May 8, 2012 8:01 PM, "Antoine Pitrou" wrote:
> On Tue, 8 May 2012 17:02:04 +1000
> Nick Coghlan wrote:
> >
> > IMO, you've completely misdiagnosed the source of that bug. Never
> > *ever* rely on boolean evaluation when testing against None.
>
> Nick, that's just plain silly. If we didn't want people to rely on
> boolean evaluation, we wouldn't define __bool__ at all (or we would
> make it return a random value).
>
> Regards
>
> Antoine.

From solipsis at pitrou.net Tue May 8 12:11:04 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 08 May 2012 12:11:04 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To:
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <20120508115626.32706f28@pitrou.net>
Message-ID: <1336471864.3376.1.camel@localhost.localdomain>

On Tuesday 08 May 2012 at 20:08 +1000, Nick Coghlan wrote:
> The problem is not using boolean evaluation - it's assuming that
> boolean evaluation is defined as "x is not None". Doing so introduces
> a completely unnecessary dependency on the type of "x".

Well, the dependency is obvious when the type is already well-known.

Regards

Antoine.

From g.brandl at gmx.net Tue May 8 12:25:48 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 08 May 2012 12:25:48 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508115626.32706f28@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <20120508115626.32706f28@pitrou.net>
Message-ID:

On 05/08/2012 11:56 AM, Antoine Pitrou wrote:
> On Tue, 8 May 2012 17:02:04 +1000
> Nick Coghlan wrote:
>>
>> IMO, you've completely misdiagnosed the source of that bug. Never
>> *ever* rely on boolean evaluation when testing against None.
>
> Nick, that's just plain silly. If we didn't want people to rely on
> boolean evaluation, we wouldn't define __bool__ at all (or we would
> make it return a random value).

Read again: he's talking about people using "bool(x)" (implicitly) when they mean "x is not None".

Georg

From mal at egenix.com Tue May 8 12:34:32 2012
From: mal at egenix.com (M.-A.
Lemburg) Date: Tue, 08 May 2012 12:34:32 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> Message-ID: <4FA8F6B8.3030307@egenix.com> Nick Coghlan wrote: > On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano wrote: >> On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote: >>> On Tue, May 8, 2012 at 12:57 PM, Ethan Furman wrote: >>>> The behavior is broken. Midnight is not False. >>> >>> Whereas I disagree - I see having zero hour be false as perfectly >>> reasonable behaviour (not necessarily *useful*, but then having all >>> time objects report as True in boolean context isn't particularly >>> useful either). >> >> On the contrary, it can be very useful to have all objects of some >> classes treated as true. For example, we can write: >> >> mo = re.search(a, b) >> if mo: >> do_something_with(mo) >> >> without having to worry about the case where a valid MatchObject happens >> to be false. >> >> Consider: >> >> t = get_job_start_time() # returns a datetime.time object, or None >> if t: >> do_something_with(t) >> >> >> Oops, we have a bug. If the job happens to have started at exactly >> midnight, it will wrongly be treated as false. > > IMO, you've completely misdiagnosed the source of that bug. Never > *ever* rely on boolean evaluation when testing against None. *Always* > use the "is not None" trailer. Fully agreed. The above code is just plain wrong and often causes problems in larger applications - besides, it's also slower in most cases, esp. if determining the length of an object or converting it to a numeric value is slow. If you want to test for None return values, you need to use "if is None" or "if is not None". 
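As a minimal sketch of the identity test recommended above: `get_job_start_time` and `report` are hypothetical names introduced only for illustration (the first borrows the name used in the quoted example and takes an assumed `started` flag). The point is that `is not None` gives the same answer however `time.__bool__` happens to behave.

```python
from datetime import time
from typing import Optional


def get_job_start_time(started: bool) -> Optional[time]:
    """Hypothetical stand-in: a job that started at midnight, or None if it never ran."""
    return time(0, 0) if started else None


def report(t: Optional[time]) -> str:
    # Identity test against None: does not depend on the truth value of t.
    if t is not None:
        return "started at " + t.isoformat()
    return "not started"


print(report(get_job_start_time(True)))
print(report(get_job_start_time(False)))
```

With this form a midnight start is reported as a start, and only a genuine None is reported as "not started".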
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 08 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-07-02: EuroPython 2012, Florence, Italy 55 days to go 2012-04-26: Released mxODBC 3.1.2 http://egenix.com/go28 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Tue May 8 12:51:48 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 May 2012 12:51:48 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <20120508115626.32706f28@pitrou.net> Message-ID: <20120508125148.030614cb@pitrou.net> On Tue, 08 May 2012 12:25:48 +0200 Georg Brandl wrote: > On 05/08/2012 11:56 AM, Antoine Pitrou wrote: > > On Tue, 8 May 2012 17:02:04 +1000 > > Nick Coghlan wrote: > >> > >> IMO, you've completely misdiagnosed the source of that bug. Never > >> *ever* rely on boolean evaluation when testing against None. > > > > Nick, that's just plain silly. If we didn't want people to rely on > > boolean evaluation, we wouldn't define __bool__ at all (or we would > > make it return a random value). > > Read again: he's talking about people using "bool(x)" (implicitly) when > they mean "x is not None". That's what I read. 
From solipsis at pitrou.net Tue May 8 12:53:42 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 May 2012 12:53:42 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> Message-ID: <20120508125342.50a02242@pitrou.net> On Tue, 08 May 2012 12:34:32 +0200 "M.-A. Lemburg" wrote: > > Fully agreed. > > The above code is just plain wrong and often causes problems > in larger applications - besides, it's also slower in most > cases, esp. if determining the length of an object or > converting it to a numeric value is slow. > > If you want to test for None return values, you need to use > "if is None" or "if is not None". So who writes the PEP to deprecate __bool__ methods wholesale? Regards Antoine. From p.f.moore at gmail.com Tue May 8 13:12:05 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 8 May 2012 12:12:05 +0100 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120508125342.50a02242@pitrou.net> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> Message-ID: On 8 May 2012 11:53, Antoine Pitrou wrote: >> If you want to test for None return values, you need to use >> "if is None" or "if is not None". > > So who writes the PEP to deprecate __bool__ methods wholesale? I see no need - if you're testing for "true" return values, bool is correct. But if you're testing for None vs an actual value, it's not. The fact that testing for boolean true values is a lot rarer than people think, or that it's not appropriate in certain situations, doesn't mean it's useless. 
Paul. From solipsis at pitrou.net Tue May 8 13:17:22 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 May 2012 13:17:22 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> Message-ID: <20120508131722.1353070e@pitrou.net> On Tue, 8 May 2012 12:12:05 +0100 Paul Moore wrote: > On 8 May 2012 11:53, Antoine Pitrou wrote: > >> If you want to test for None return values, you need to use > >> "if is None" or "if is not None". > > > > So who writes the PEP to deprecate __bool__ methods wholesale? > > I see no need - if you're testing for "true" return values, bool is > correct. But if you're testing for None vs an actual value, it's not. Well, again, if that's the case, then __bool__ should be deprecated for all types where being "true" or "false" doesn't make obvious sense. Which is most types, actually. Of course, this is a completely vacuous discussion. The reality is that a __bool__ exists for all types, we are not deprecating it, and people rely on it even though PEP 8 zealots may recommend otherwise. Again, the question is whether time.__bool__ is sane and, if not, why not make it saner? Lecturing people on style doesn't make Python better. Regards Antoine. From mal at egenix.com Tue May 8 13:36:52 2012 From: mal at egenix.com (M.-A. 
Lemburg) Date: Tue, 08 May 2012 13:36:52 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120508125342.50a02242@pitrou.net> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> Message-ID: <4FA90554.9010505@egenix.com> Antoine Pitrou wrote: > On Tue, 08 May 2012 12:34:32 +0200 > "M.-A. Lemburg" wrote: >> >> Fully agreed. >> >> The above code is just plain wrong and often causes problems >> in larger applications - besides, it's also slower in most >> cases, esp. if determining the length of an object or >> converting it to a numeric value is slow. >> >> If you want to test for None return values, you need to use >> "if is None" or "if is not None". > > So who writes the PEP to deprecate __bool__ methods wholesale? I think I lost you there. What does the above have to do with __bool__ methods ? Whether or not a type implements the notion of a boolean value is really up to the specific implementation and not a question that can be answered in general. It's perfectly fine for time value to mimic a boolean value by following the same paradigm as a float "seconds since midnight" value. As such, reusing the __nonzero__ or __len__ slots for boolean values is fine as well. It may not always make sense in every conceivable way, but as long as there is a valid explanation that can be documented, I don't see that as problem. If you're purist, you'd probably disallow __bool__ methods on non-boolean types, but this is Python, so we pass on control to object and type implementers. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 08 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-07-02: EuroPython 2012, Florence, Italy 55 days to go 2012-04-26: Released mxODBC 3.1.2 http://egenix.com/go28 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.brandl at gmx.net Tue May 8 13:45:57 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 08 May 2012 13:45:57 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120508131722.1353070e@pitrou.net> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <20120508131722.1353070e@pitrou.net> Message-ID: On 05/08/2012 01:17 PM, Antoine Pitrou wrote: > On Tue, 8 May 2012 12:12:05 +0100 > Paul Moore wrote: >> On 8 May 2012 11:53, Antoine Pitrou wrote: >> >> If you want to test for None return values, you need to use >> >> "if is None" or "if is not None". >> > >> > So who writes the PEP to deprecate __bool__ methods wholesale? >> >> I see no need - if you're testing for "true" return values, bool is >> correct. But if you're testing for None vs an actual value, it's not. > > Well, again, if that's the case, then __bool__ should be deprecated for > all types where being "true" or "false" doesn't make obvious sense. > Which is most types, actually. Repeating a strange argument does not make it more true. 
Georg From mwm at mired.org Tue May 8 13:58:20 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 08 May 2012 07:58:20 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <20120508115626.32706f28@pitrou.net> Message-ID: <6fc091a5-c4bf-46e5-b9ba-8f3251ba1c1d@email.android.com> Nick Coghlan wrote: >The problem is not using boolean evaluation - it's assuming that >boolean >evaluation is defined as "x is not None". Doing so introduces a >completely >unnecessary dependency on the type of "x". I'm frankly astonished that >so >many people seem to think it's a reasonable thing to do. +1 -- Sent from my Android tablet. Please excuse my swyping. From solipsis at pitrou.net Tue May 8 14:00:44 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 May 2012 14:00:44 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <20120508131722.1353070e@pitrou.net> Message-ID: <20120508140044.3547fff9@pitrou.net> On Tue, 08 May 2012 13:45:57 +0200 Georg Brandl wrote: > On 05/08/2012 01:17 PM, Antoine Pitrou wrote: > > On Tue, 8 May 2012 12:12:05 +0100 > > Paul Moore wrote: > >> On 8 May 2012 11:53, Antoine Pitrou wrote: > >> >> If you want to test for None return values, you need to use > >> >> "if is None" or "if is not None". > >> > > >> > So who writes the PEP to deprecate __bool__ methods wholesale? > >> > >> I see no need - if you're testing for "true" return values, bool is > >> correct. But if you're testing for None vs an actual value, it's not. 
> > > > Well, again, if that's the case, then __bool__ should be deprecated for > > all types where being "true" or "false" doesn't make obvious sense. > > Which is most types, actually. > > Repeating a strange argument does not make it more true. Well, it does follow from what you wrote above... Regards Antoine. From mwm at mired.org Tue May 8 17:15:53 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 8 May 2012 11:15:53 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120508131722.1353070e@pitrou.net> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <20120508131722.1353070e@pitrou.net> Message-ID: <20120508111553.6c9023cd@bhuda.mired.org> On Tue, 8 May 2012 13:17:22 +0200 Antoine Pitrou wrote: > On Tue, 8 May 2012 12:12:05 +0100 > Paul Moore wrote: > > On 8 May 2012 11:53, Antoine Pitrou wrote: > > >> If you want to test for None return values, you need to use > > >> "if is None" or "if is not None". > > > So who writes the PEP to deprecate __bool__ methods wholesale? > > I see no need - if you're testing for "true" return values, bool is > > correct. But if you're testing for None vs an actual value, it's not. > Well, again, if that's the case, then __bool__ should be deprecated for > all types where being "true" or "false" doesn't make obvious sense. > Which is most types, actually. Not quite, because "practicality beats purity". It should be "all types where there aren't obviously useful values for 'true' and 'false' in a boolean context." For container types, "not empty" provides obviously useful etc. For numeric types, "nonzero" does that. I think that covers most of the builtins. 
Arguably, the ability to write "if not x" instead of "if empty(x)" isn't worth the price of the all-too-common bug of writing "if not x" when you should be writing "if x is None" passing quietly instead of possibly throwing an exception. But that battle is already lost (and I prefer the current behavior anyway).

For the case at hand - datetime.time() - the current behavior isn't obviously useful. If we were doing it from scratch, yeah, maybe it ought to be true all the time. Or maybe we should follow your suggestion here, and make converting a datetime.time() to bool throw an exception. But it doesn't appear to be a problem worth the cost of fixing.

> Of course, this is a completely vacuous discussion. The reality is that
> a __bool__ exists for all types, we are not deprecating it, and people
> rely on it even though PEP 8 zealots may recommend otherwise.

PEP 8 doesn't recommend against using __bool__. It warns about the common Python coding error of writing "if x" instead of "if x is not None" when x may have a value that's both false and not None.

This is a code correctness issue. Checking whether converting something to bool yields false when you're trying to see if it's some specific value that happens to convert to false is wrong. It's just as wrong to write "if not x" rather than "if not len(x)" to check to see if x is an empty container if x might be None as to write "if not x" rather than "if x is None" to check to see if x is None. The latter is the far more common bug in Python programs, which is why PEP 8 warns people about it instead of the former.

http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From guido at python.org Tue May 8 19:11:39 2012 From: guido at python.org (Guido van Rossum) Date: Tue, 8 May 2012 10:11:39 -0700 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120508111553.6c9023cd@bhuda.mired.org> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <20120508131722.1353070e@pitrou.net> <20120508111553.6c9023cd@bhuda.mired.org> Message-ID: I think there's nothing to be done. It's clear that making datetime.time() be false was a conscious decision when the datetime module was designed (we thought about *every* aspect of the design quite a lot -- see (*) below). At the same time knowing what I know now about common usage I wouldn't design it this way today. Note that the date and datetime types don't have this problem because a zero date is invalid; and for the timedelta type having zero be false is more useful than harmful. But for time, the use case is marginal and the "trap" is real, even if it is due to poor coding style. (Good API design should consider avoiding traps for poor coders one of its goals.) However, given that it's been a feature for so long I don't think we can change it. Perhaps we could call it out more in the documentation (though it's already quite prominent). (*) I trawled through some history. The original design wiki is only accessible on the wayback machine: http://wayback.archive.org/web/20020801000000*/http://zope.org/Members/fdrake/DateTimeWiki/FrontPage (has many versions between 2002 and 2006). 
The wiki has no mention of boolean interpretation for the time type, but the earliest docs for the C implementation mention Boolean context: http://web.archive.org/web/20030119231337/http://www.python.org/dev/doc/devel/lib/datetime-time.html -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Tue May 8 19:57:32 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 08 May 2012 13:57:32 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <4FA90554.9010505@egenix.com> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <4FA90554.9010505@egenix.com> Message-ID: On 5/8/2012 7:36 AM, M.-A. Lemburg wrote: > It's perfectly fine for time value to mimic a boolean value > by following the same paradigm as a float "seconds since midnight" > value. Ah, I think this is the key to the dispute as to whether midnight should be False or True. Is the implementation of time of day as seconds since midnight essential (then midnight should be False) or accidental (then midnight should be True like all other times)? Different discussants disagree on the premise and hence the conclusion. If one first implements time-of-day as a number representing seconds from midnight, then bool(midnight) is bool(0) is False, like it or not. If one later wraps the number as a Time object, as Python did, then seconds from midnight and the specialness of midnight are essential for the new object to be a completely back-compatible drop-in replacement (with augmentations). Anyway, if 'from midnight' is part of the core concept of the class, the current behavior is correct.
If one starts with time-of-day as a concept independent of linear numbers, as smoothly flowing around a circle, then making any particular time of day (or point on the circle) special seems wrong. Indeed, time of day is the same as local rotation angle with respect to the sun. So it is as much geometric as numeric. Abstractly, the second viewpoint seems correct. Pragmatically, however, civilized humans (those with clocks ;-) have standardized on local nominal midnight as the base point for numerically measuring time of day. --- We can also argue the issue both ways from the viewpoint of code compactness. False: Let t be a Time instance and midnight be Time(0). Then False midnight allows 'if t == midnight', which is needed occasionally, to be abbreviated 'if not t'. True: Let t be a Time instance or None, such as might be the return from a function just prior to testing t. Then True midnight allows 'if t is None', which may be needed on such occasions, to be abbreviated 'if not t'. While I am comfortable with and love the abbreviations for 0 and empty (and occasionally None), I would be disinclined, at least at present, to use either abbreviation for Time conditions. Typing the actual conditions above was as fast as thinking about getting the abbreviation to match correctly. --- If the stdlib had an Elevation class, we could have the same argument about whether Elevation(0) should be True, like all others, or False. -- Terry Jan Reedy From greg at krypto.org Tue May 8 20:03:17 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 8 May 2012 11:03:17 -0700 Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module execution] In-Reply-To: References: <4F9F328D.1040003@pearwood.info> Message-ID: On Mon, Apr 30, 2012 at 7:33 PM, Guido van Rossum wrote: > On Mon, Apr 30, 2012 at 5:47 PM, Steven D'Aprano > wrote: > > Gregory P.
Smith wrote: > > > >> Making modules "simply" be a class that could be subclasses rather than > >> their own thing _would_ be nice for one particular project I've worked > on > >> where the project including APIs and basic implementations were open > >> source > >> but which allowed for site specific code to override many/most of those > >> base implementations as a way of customizing it for your own specific > (non > >> open source) environment. > > > This makes no sense to me. What does the *licence* of a project have to > do > > with the library API? I mean, yes, you could do such a thing, but surely > you > > shouldn't. That would be like saying that the accelerator pedal should > be on > > the right in cars you buy outright, but on the left for cars you get on > > hire-purchase. > > That's an irrelevant, surprising and unfair criticism of Greg's > message. He just tried to give a specific example without being too > specific. > heh. right. I didn't want to drag people into a project that shall not be named because I dislike its design to begin with... I was just suggesting a case where we _would_ have found it useful to treat a module as a class that could be subclassed. A better overall design pattern using classes in the code's public APIs to start with would have prevented all such need. I am not strongly in favor of modules being classes but it is an interesting thing to ponder from time to time. > > Nevertheless, I think your focus here is on the wrong thing. It seems to > me > > that you are jumping to an implementation, namely that modules should > stop > > being instances of a type and become classes, without having a clear > idea of > > your functional requirements. > > > > The functional requirements *might* be: > > > > "There ought to be an easy way to customize the behaviour of attribute > > access in modules." 
> > > > Or perhaps: > > > > "There ought to be an easy way for one module to shadow another module > with > > the same name, but still inherit behaviour from the shadowed module." > > > > neither of which *require* modules to become classes. > > > > Or perhaps it is something else... it is unclear to me exactly what > problems > > you and Jim wish to solve, or whether they're the same kind of problem, > > which is why I say the functional requirements are unclear. > > > > Changing modules from an instance of ModuleType to "a class that could > be a > > subclass" is surely going to break code. Somewhere, someone is relying on > > the fact that modules are not types and you're going to break their > > application. > > > > > > > >> Any APIs that were unfortunately defined as a > >> module with a bunch of functions in it was a real pain to make site > >> specific overrides for. > > > > > > It shouldn't be. Just ensure the site-specific override module comes > first > > in the path, and "import module" will pick up the override module > instead of > > the standard one. This is a simple exercise in shadowing modules. > > > > Of course, this implies that the override module has to override > > *everything*. There's currently no simple way for the shadowing module to > > inherit functionality from the shadowed module. You can probably hack > > something together, but it would be a PITA. > > If there is a bunch of functions and you want to replace a few of > those, you can probably get the desired effect quite easily: > > from base_module import * # Or the specific set of functions that > comprise the API. > > def funct1(): > def funct2(): > > Not that I would recommend this -- it's easy to get confused if there > are more than a very small number of functions. Also if > base_module.funct3 were to call func2, it wouldn't call the overridden > version. > > But all attempts to view modules as classes or instances have lead to > negative results. 
(I'm sure I've thought about it at various times in > the past.) > > I think the reason is that a module at best acts as a class where > every method is a *static* method, but implicitly so. Ad we all know > how limited static methods are. (They're basically an accident -- back > in the Python 2.2 days when I was inventing new-style classes and > descriptors, I meant to implement class methods but at first I didn't > understand them and accidentally implemented static methods first. > Then it was too late to remove them and only provide class methods.) > This "oops, I implemented static methods" is a wonderful bit of history! :) > > There is actually a hack that is occasionally used and recommended: a > module can define a class with the desired functionality, and then at > the end, replace itself in sys.modules with an instance of that class > (or with the class, if you insist, but that's generally less useful). > E.g.: > > # module foo.py > > import sys > > class Foo: > def funct1(self, ): > def funct2(self, ): > > sys.modules[__name__] = Foo() > > This works because the import machinery is actively enabling this > hack, and as its final step pulls the actual module out of > sys.modules, after loading it. (This is no accident. The hack was > proposed long ago and we decided we liked enough to support it in the > import machinery.) > > You can easily override __getattr__ / __getattribute__ / __setattr__ > this way. It also makes "subclassing" the module a little easier > (although accessing the class to be used as a base class is a little > tricky: you'd have to use foo.__class__). But of course the kind of > API that Greg was griping about would never be implemented this way, > so that's fairly useless. And if you were designing a module as an > inheritable class right from the start you're much better off just > using a class instead of the above hack. 
> > But all in all I don't think there's a great future in stock for the > idea of allowing modules to be "subclassed". In the vast, vast > majority of cases it's better to clearly have a separation between > modules, which provide no inheritance and no instantiation, and > classes, which provide both. I think Python is better off this way > than Java, where all you have is classes (its packages cannot contain > anything except class definitions). > > -- > --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Tue May 8 20:30:39 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 May 2012 14:30:39 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <20120508131722.1353070e@pitrou.net> <20120508111553.6c9023cd@bhuda.mired.org> Message-ID: On Tue, May 8, 2012 at 1:11 PM, Guido van Rossum wrote: > At the same time knowing what I know > now about common usage I wouldn't design it this way today. I agree with this. Note that the latest addition to the datetime module - the timezone type is designed differently: >>> bool(timezone.utc) True In many ways the timezone type is similar to time: it represents a point on a 24-hour circle. Even though the "zero" timezone is even more special than midnight, the potential for a coding mistake testing for truth instead of identity with None is even greater because None is (unfortunately) a very common value for tzinfo.
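A minimal sketch of the point about timezone and tzinfo, for illustration only. The helper name describe_tz is hypothetical and not part of the datetime module; the sketch simply tests tzinfo for identity with None rather than for truth.

```python
from datetime import datetime, timezone

def describe_tz(dt):
    """Hypothetical helper: classify a datetime as naive or aware.

    tzinfo is compared against None explicitly, so the result does not
    depend on how a tzinfo object behaves in a boolean context.
    """
    if dt.tzinfo is None:
        return "naive"
    return "aware"

naive = datetime(2012, 5, 8, 14, 30)
aware = naive.replace(tzinfo=timezone.utc)

# The "zero" timezone is still a true value.
utc_is_true = bool(timezone.utc)
```

Because the test is "is None", it keeps working even for a tzinfo subclass that chooses to define its own truth value.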
I still don't think we can change bool(time(0)), though. From ethan at stoneleaf.us Tue May 8 20:46:25 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 08 May 2012 11:46:25 -0700 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <4FA90554.9010505@egenix.com> Message-ID: <4FA96A01.6040503@stoneleaf.us> Terry Reedy wrote: > If the stdlib had an Elevation class, we could have the same argument > about whether Elevation(0) should be True, like all others, or False. I liked your explanation, Terry. For me, it comes down to the something vs. nothing argument: empty containers, the number 0 (when it represents Nothing), False, etc., are all instances of nothing. I do not see midnight as a representation of nothing. I would probably go with Something for an elevation of zero as well. But not money. ;) positive is money I have, negative is money I owe, and zero is nothing. ~Ethan~ From mal at egenix.com Tue May 8 21:55:15 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 08 May 2012 21:55:15 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <4FA90554.9010505@egenix.com> Message-ID: <4FA97A23.7010009@egenix.com> Terry Reedy wrote: > On 5/8/2012 7:36 AM, M.-A. Lemburg wrote: > >> It's perfectly fine for time value to mimic a boolean value >> by following the same paradigm as a float "seconds since midnight" >> value. 
> > Ah, I think this is the key to the dispute as to whether midnight should be False or True. Is the > implementation of time of day as seconds since midnight essential (then midnight should be False) or > accidental (then midnight should be True like all other times)? Different discussants disagree on > the premise and hence the conclusion. > > If one first implements time-of-day as a number representing seconds from midnight, then > bool(midmight) is bool(0) is False, like it or not. If one later wraps the number as a Time object, > as Python did, then seconds from midnight and the specialness of midnight is essential for the new > object to be a completely back-compatible drop-in replacement (with augmentations). Anyway, if 'from > midnight' is part of the core concept of the class, the current behavior is correct. > > If one starts with time-of-day as a concept independent of linear numbers, as smoothly flowing > around a circle, then making any particular time of day (or point on the circle) special seems > wrong. Indeed, time of day is the same as local rotation angle with respect to the sum. So it is as > much geometric as numeric. > > Abstractly, the second viewpoint seems correct. Pragmatically, however, civilized humans (those with > clocks ;-) have standardized on local nominal midnight as the base point for numerically measuring > time of day. I think you have to broaden that view a bit :-) The Julian day starts noon and in other date/time concepts, the day starts at sunrise or sunrise, so it depends on the location as well as the day of the year (and various other astronomical corrections). See e.g. http://en.wikipedia.org/wiki/Day The whole date/time topic is full of mysteries, oddities and very human errors and misconceptions. It also demonstrates that there's no single right way to capture date/time. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 08 2012) >>> Python/Zope Consulting and Support ... 
http://www.egenix.com/ From ben+python at benfinney.id.au Wed May 9 02:10:18 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 09 May 2012 10:10:18 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <20120508115626.32706f28@pitrou.net> <20120508125148.030614cb@pitrou.net> Message-ID: <874nrq6wph.fsf@benfinney.id.au> Antoine Pitrou writes: > On Tue, 08 May 2012 12:25:48 +0200 > Georg Brandl wrote: > > On 05/08/2012 11:56 AM, Antoine Pitrou wrote: > > > On Tue, 8 May 2012 17:02:04 +1000 > > > Nick Coghlan wrote: > > >> > > >> IMO, you've completely misdiagnosed the source of that bug. Never > > >> *ever* rely on boolean evaluation when testing against None. > > > > > > Nick, that's just plain silly. If we didn't want people to rely on > > > boolean evaluation, we wouldn't define __bool__ at all (or we > > > would make it return a random value). > > > > Read again: he's talking about people using "bool(x)" (implicitly) > > when they mean "x is not None". > > That's what I read. Yet you misrepresent him, omitting the crucial qualifier "when testing against None"
when you quote him as saying "we don't want people to rely on boolean evaluation". Then you call your straw man silly. -- Ben Finney From jimjjewett at gmail.com Wed May 9 02:53:58 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 8 May 2012 20:53:58 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <20120507180653.25a654d1@pitrou.net> References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> Message-ID: On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: > Why do you want to reproduce it? Does midnight warrant any special > shortcut for testing? Especially one that is confusing to many > readers. Why do you think that 0 represents midnight? *Because* it is zero, it will often be used as a default for missing data. Python at least offers alternatives, but that doesn't mean people will use them, and certainly doesn't mean that the data wasn't already corrupted before Python ever saw it. And if I know the hour but not the minute or second, I myself would generally use zeros for the missing data even in Python. Saying that it represents midnight because it is defined that way is true only in the same sense that evaluating to False is correct because it is defined that way. With a sufficiently powerful time machine, I would use 1-60 to represent the minute/second being traversed and leave 0 for missing data. (And making this the right answer would probably involve going back long before Python considered the issue.) Without such a time machine, none of the options are good enough to justify breaking backwards compatibility.
-jJ From stephen at xemacs.org Wed May 9 03:42:42 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 09 May 2012 10:42:42 +0900 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: <4FA96A01.6040503@stoneleaf.us> References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <4FA90554.9010505@egenix.com> <4FA96A01.6040503@stoneleaf.us> Message-ID: <87lil2ruy5.fsf@uwakimon.sk.tsukuba.ac.jp> Ethan Furman writes: > But not money. ;) positive is money I have, negative is money I owe, > and zero is nothing. Accountants and economists take exception. Unless you live on a desert island with Friday, an aggregate zero is an accident, just like a Poisson arrival at exactly midnight is an accident. On the other hand, there are zeros that are None in accounting, but they are *always* associated with a real zero (no sales of an item, no production, worker absent, ...). I have to admit I find Terry's circular reasoning[1] compelling as an ex ante argument, but looking at the clock I discover it's ex post. And it's really not that big a deal; if a factory function is documented to return None, the test *should* be "x is not None", not "bool(x)". Footnotes: [1] Sorry, Terry, I couldn't resist! 
From steve at pearwood.info Wed May 9 05:20:56 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 09 May 2012 13:20:56 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> Message-ID: <4FA9E298.20704@pearwood.info> Nick Coghlan wrote: > On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano wrote: >> Consider: >> >> t = get_job_start_time() # returns a datetime.time object, or None >> if t: >> do_something_with(t) >> >> >> Oops, we have a bug. If the job happens to have started at exactly >> midnight, it will wrongly be treated as false. > > IMO, you've completely misdiagnosed the source of that bug. Never > *ever* rely on boolean evaluation when testing against None. *Always* > use the "is not None" trailer. and then in a follow-up post: > The problem is not using boolean evaluation - it's assuming that boolean > evaluation is defined as "x is not None". Doing so introduces a completely > unnecessary dependency on the type of "x". I'm frankly astonished that so > many people seem to think it's a reasonable thing to do. I am perfectly aware that None is not the only falsey value, and that bool tests are not implemented as comparisons against None. It is unfortunate that this thread has been hijacked into a argument about testing objects in a bool context, because that's not the fundamental problem. In my example code I intentionally assumed that time values don't have a false value, to show how the time values encourage buggy code. 
At the time I wrote that example I thought that the behaviour of time values in a bool context was undocumented, an easy mistake to make: the only documentation I have found is buried under the entry for time.tzinfo: a time object is considered to be true if and only if, after converting it to minutes and subtracting utcoffset() (or 0 if that's None), the result is non-zero. http://docs.python.org/py3k/library/datetime.html#datetime.time.tzinfo The complicated nature of this bool context should have been a clue that it might have been a bad idea. Calling the false time value "unpredictable" is perhaps a little strong, but it is certainly surprising. If my timezone calculations are correct, the local time which is falsey for me is 10am. Anyone keen to defend having 10am local time treated as false? Also, be careful about dogmatic prohibitions like "*never* ever rely on boolean evaluation when testing against None". The equivalent comparison for re MatchObjects is unproblematic. They are explicitly documented as always being true, with the explicit aim of allowing boolean evaluation to work correctly: "Match objects always have a boolean value of True. This lets you use a simple if-statement to test whether a match was found." http://docs.python.org/py3k/library/re.html#match-objects and indeed, there are at least five instances in the standard library that use the idiom mo = re.match(blah blah) if mo: ... or similar. It is unfortunate that time values don't operate similarly.
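For illustration, here is a minimal sketch of the defensive form of the get_job_start_time example. Both function bodies are hypothetical stand-ins written for this sketch, and the explicit comparison against None is the point being shown: it gives the same answer whatever bool() happens to return for a midnight value.

```python
from datetime import time

def get_job_start_time(started):
    """Hypothetical stand-in: a job that started at midnight, or None."""
    return time(0, 0) if started else None

def job_status(t):
    # Test identity with None; do not rely on the truth value of a time.
    if t is not None:
        return "started at " + t.isoformat()
    return "not started"

midnight_job = job_status(get_job_start_time(True))
pending_job = job_status(get_job_start_time(False))
```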
-- Steven From steve at pearwood.info Wed May 9 07:16:23 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 9 May 2012 15:16:23 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> Message-ID: <20120509051622.GA8882@ando> On Tue, May 08, 2012 at 02:21:07AM -0400, Alexander Belopolsky wrote: > On Tue, May 8, 2012 at 1:42 AM, Steven D'Aprano wrote: > >> In matters of opinion, the status quo reigns. > > > > That's somewhat of an exaggeration. The mere existence of a single > > dissenting opinion isn't enough to block all progress/changes. > > For what it's worth, I am also against changing the status quo. > time(0) is special: it is the smallest possible value. If you deal > with low resolution time values, say hourly schedules, it is not > unreasonable to test for time(0). For example, when estimating daily > averages, midnight samples can be weighted by 1/2 to account for the > uncertainty in assigning midnight to a given day. I think this demonstrates the insidious nature of this design flaw in time objects. Alexander, you caught me in a mistake earlier, when I neglected to take tzinfo into account, and here you are doing the same sort of thing: you can't reliably detect midnight with a simple bool(timevalue) test. -- Steven From steve at pearwood.info Wed May 9 07:42:28 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 9 May 2012 15:42:28 +1000 Subject: [Python-ideas] bool(datetime.time(0, 0)) In-Reply-To: References: <20120507180653.25a654d1@pitrou.net> <4FA88B90.7060309@stoneleaf.us> <20120508054249.GB3797@ando> <4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net> <4FA90554.9010505@egenix.com> Message-ID: <20120509054227.GB8882@ando> On Tue, May 08, 2012 at 01:57:32PM -0400, Terry Reedy wrote: > On 5/8/2012 7:36 AM, M.-A.
Lemburg wrote: > > >It's perfectly fine for time value to mimic a boolean value > >by following the same paradigm as a float "seconds since midnight" > >value. > > Ah, I think this is the key to the dispute as to whether midnight should > be False or True. Is the implementation of time of day as seconds since > midnight essential (then midnight should be False) or accidental (then > midnight should be True like all other times)? Different discussants > disagree on the premise and hence the conclusion. If we implemented times as an Hour-Minute-Second tuple would that imply that midnight was True because the tuple (0, 0, 0) is not an empty tuple? I don't think so. > If one first implements time-of-day as a number representing seconds > from midnight, then bool(midmight) is bool(0) is False, like it or not. > If one later wraps the number as a Time object, as Python did, then > seconds from midnight and the specialness of midnight is essential for > the new object to be a completely back-compatible drop-in replacement > (with augmentations). Anyway, if 'from midnight' is part of the core > concept of the class, the current behavior is correct. Just because times can be implemented as (say) floats doesn't mean it is sensible to treat them as floats. "Square root of 3:15pm" isn't a meaningful concept. > If one starts with time-of-day as a concept independent of linear > numbers, as smoothly flowing around a circle, then making any particular > time of day (or point on the circle) special seems wrong. Precisely. Especially when that special-time-of-day depends on the timezone. [...] > Abstractly, the second viewpoint seems correct. Pragmatically, however, > civilized humans (those with clocks ;-) have standardized on local > nominal midnight as the base point for numerically measuring time of day. Even if that is correct, and I think that orthodox Jews may disagree that the day begins at midnight (even those with clocks), the datetime.time class does not fit that model. 
Local midnight is not necessarily false. -- Steven From solipsis at pitrou.net Wed May 9 08:07:46 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 9 May 2012 08:07:46 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5> <4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net> <20120507173254.6a6aee5b@pitrou.net> <20120507180653.25a654d1@pitrou.net> Message-ID: <20120509080746.5fb45e3b@pitrou.net> On Tue, 8 May 2012 20:53:58 -0400 Jim Jewett wrote: > On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: > > > Why do you want to reproduce it? Does midnight warrant any special > > shortcut for testing? Especially one that is confusing to many > > readers. > > Why do you think that 0 represents midnight? *Because* it is zero, > it will often be used as a default for missing data. Well, if you've decided upfront that midnight "is zero", then you may argue it's special. But as others have shown, there's nothing obvious about midnight being "zero", especially with timezones factored in. For example, there are no binary operators where midnight is a "zero" i.e. a neutral element. Besides, we have a special value called None exactly for the purpose of representing missing data. Regards Antoine. From jimjjewett at gmail.com Wed May 9 18:38:19 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 9 May 2012 12:38:19 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data Message-ID: On Wed, May 9, 2012 at 2:07 AM, Antoine Pitrou wrote: > On Tue, 8 May 2012 20:53:58 -0400 > Jim Jewett wrote: >> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou wrote: >> > ... Does midnight warrant any special ... >> Why do you think that 0 represents midnight? ? *Because* it is zero, >> it will often be used as a default for missing data. > Well, if you've decided upfront that midnight "is zero", then you may > argue it's special. 
But as others have shown, there's nothing obvious > about midnight being "zero", especially with timezones factored in. > For example, there are no binary operators where midnight is a "zero" > i.e. a neutral element. The cyclic groups Z/n have a zero element, so *something* has to be effectively zero; start of day is as reasonable as anything else. Or are you just saying that there aren't *any* meaningful binary operators on hour-of-the-day, beyond __eq__ and __ne__? Practicality Beats Purity suggests that at least comparisons should work consistently, so that time-of-day can be consistently ordered, and that requires a least element. (You could make it noon, or mean sunrise, or actual sundown at a certain monument, but you do need one.) > Besides, we have a special value called None exactly for the purpose of > representing missing data. Not really at the moment, since datetime.time doesn't accept it as an argument, and itself uses 0 for missing data. That could *probably* be fixed in an upwards compatible way, but you would still have to special case how missing-data times should compare to current class instances. # Should they ever be equal? # If not, mixing types is a problem. time(hour, min, sec, microsecond) == Time(hour, min, sec, microsecond) ? # Should microseconds be required? # If not, mixing types is a problem. time(hour, min, sec, microsecond=0) == Time(hour, min, sec, microsecond=None) ? # Should even hours be required? # If so, how much precision is logically required? time(hour=0, min=0, sec=0, microsecond=0) == Time(hour=None, min=None, sec=None, microsecond=None) ? # datetime.time already skips the date. # Can hours be skipped too, to indicate "every hour on the hour"? Time(hour=None, min=0, sec=0, microsecond=0) ? # Should missing data never match, match 0, or match anything (the "on the hour" case) time(hour=7, min=15, sec=0, microsecond=0) == Time(hour=None, min=15, sec=0, microsecond=0) ? 
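A small sketch of the constructor behaviour referred to above, using only the existing datetime.time type (the capitalised Time in the questions is a hypothetical class and is not modelled here). The helper name time_accepts_none is made up for this illustration.

```python
from datetime import time

# Fields that are left out silently default to zero ...
seven = time(7)
seven_spelled_out = time(7, 0, 0, 0)

def time_accepts_none():
    """Hypothetical probe: does the time constructor accept None for a field?"""
    try:
        time(None)
    except TypeError:
        return False
    return True
```

So a caller cannot distinguish "seven o'clock exactly" from "seven, minutes unknown" by inspecting the object; that information has to be carried separately.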
-jJ From solipsis at pitrou.net Wed May 9 18:58:03 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 9 May 2012 18:58:03 +0200 Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data References: Message-ID: <20120509185803.49487659@pitrou.net> On Wed, 9 May 2012 12:38:19 -0400 Jim Jewett wrote: > Or > are you just saying that there aren't *any* meaningful binary > operators on hour-of-the-day, beyond __eq__ and __ne__? Indeed, there aren't. __eq__ and __ne__ don't produce a time result, so they can't be used as a basis for a group. Hence, the notion of a "zero" which you invoked is undefined here. > Practicality Beats Purity suggests that at least comparisons should > work consistently, so that time-of-day can be consistently ordered, > and that requires a least element. (You could make it noon, or mean > sunrise, or actual sundown at a certain monument, but you do need > one.) Sure, so what? Let's say you are creating a day-of-week class with 7 possible instances, and you make them orderable. Does it mean that the least of them (say, Monday or Sunday) should evaluate to false? > > Besides, we have a special value called None exactly for the purpose of > > representing missing data. > > Not really at the moment, since datetime.time doesn't accept it as an > argument, and itself uses 0 for missing data. That's bogus. int() doesn't take None as an argument, yet None is often used to indicate a missing integer argument in other APIs. This fact is true for most types under the sun. You are just inventing nonexistent constraints. Regards Antoine.
From jeanpierreda at gmail.com Wed May 9 19:10:32 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 9 May 2012 13:10:32 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data In-Reply-To: References: Message-ID: On Wed, May 9, 2012 at 12:38 PM, Jim Jewett wrote: > The cyclic groups Z/n have a zero element, so *something* has to be > effectively zero; start of day is as reasonable as anything else. ?Or > are you just saying that there aren't *any* meaningful binary > operators on hour-of-the-day, beyond __eq__ and __ne__? Times are not a group -- there's no addition or multiplication operator among times. They only add against timedeltas, and timedeltas are the ones that need a 0 in order for that to work properly in some sense (since the result is a time). -- Devin From jeanpierreda at gmail.com Wed May 9 20:27:20 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 9 May 2012 14:27:20 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data In-Reply-To: References: Message-ID: Egh, maybe I should elaborate. On Wed, May 9, 2012 at 1:10 PM, Devin Jeanpierre wrote: > Times are not a group -- there's no addition or multiplication > operator among times. They only add against timedeltas, and timedeltas > are the ones that need a 0 in order for that to work properly in some > sense (since the result is a time). In some system with addition, you generally want an "identity element" (called 0), such that for every x, 0 + x = x. + is only defined with times for timedeltas, not with times and other times. It doesn't make sense to add 3 'oclock to 5 'oclock. So if we're talking about some 0 such that 0 + x = x, either 0 is the time and x is the timedelta, or 0 is the timedelta and x is the time. It doesn't make sense for the zero to be with the times, since the result of addition with a timedelta shouldn't be a timedelta. There is no time such that time + x = x for any timedelta x. 
This basically means that there is _no time that makes sense as a "zero"_. At all. You can pick some arbitrary one and call it zero, but it isn't zero in an arithmetical sense. On the other hand, there definitely is a zero timedelta: x + timedelta(0) = x for every time x. -- Devin From alexander.belopolsky at gmail.com Wed May 9 20:36:32 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 May 2012 14:36:32 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data In-Reply-To: References: Message-ID: On Wed, May 9, 2012 at 1:10 PM, Devin Jeanpierre wrote: > Times are not a group -- there's no addition or multiplication > operator among times. They only add against timedeltas, ... No, they don't. There is really very little that you can do with detached time objects. While they have the tzinfo, with any timezone that observes DST it is useless. That's the main reason I am so skeptical about any ideas about improving the time type. Users should just learn to avoid using it and use full datetime instead. From jeanpierreda at gmail.com Wed May 9 20:52:12 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 9 May 2012 14:52:12 -0400 Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data In-Reply-To: References: Message-ID: On Wed, May 9, 2012 at 2:36 PM, Alexander Belopolsky wrote: > No, they don't. ?There is really very little that you can do with > detached time objects. ?While they have the tzinfo, with any timezone > that observes DST it is useless. ?That's the main reason I am so > skeptical about any ideas about improving the time type. ?Users should > just learn to avoid using it and use full datetime instead. Blagh, that's even worse. But thanks for the correction. I admit I was just assuming they were like datetimes. 
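
[Editorial sketch, not part of the original messages.] The exchange above can be checked directly against the datetime module. The helper name `time_supports_addition` is hypothetical. The sketch shows three things: a zero timedelta acts as the identity for datetime addition, a bare time object does not support addition with a timedelta at all, and time objects can still be ordered without any arithmetic.

```python
from datetime import datetime, time, timedelta

moment = datetime(2012, 5, 9, 7, 15)

# timedelta(0) is the additive identity for full datetimes.
unchanged = moment + timedelta(0)


def time_supports_addition():
    """Report whether a bare time can be added to a timedelta."""
    try:
        time(7, 15) + timedelta(hours=1)
    except TypeError:
        # datetime.time defines no arithmetic with timedelta.
        return False
    return True


# Ordering works on its own: midnight is simply the least value,
# which says nothing about it being an arithmetic zero.
earliest = min([time(7, 15), time(0, 0), time(23, 59)])

print(unchanged == moment, time_supports_addition(), earliest)
```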
-- Devin

From sven at marnach.net Wed May 9 20:48:56 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 9 May 2012 19:48:56 +0100
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
Message-ID: <20120509184856.GC3133@bagheera>

With the reintroduction of u"Unicode literals", Python 3.3 will remove one of the major stumbling stones for supporting Python 2.x and 3.3 within the same code base. Another rather trivial stumbling stone could be removed by adding the alias `future_builtins` for the `builtins` module. Currently, you need to use a try/except block, which isn't too bad, but I think it would be nicer if a line like

    from future_builtins import map

continues to work, just like __future__ imports continue to work. I think the above actually *is* a kind of __future__ import which just happens to be in a regular module because it doesn't need any special compiler support.

I know a few module names changed and some modules have been reorganised to packages, so you will still need try/except blocks for other imports. However, I think `future_builtins` is special because its sole raison d'être is forward-compatibility and because of the analogy with `__future__`.

Cheers,
Sven

From mikegraham at gmail.com Wed May 9 22:26:00 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 9 May 2012 16:26:00 -0400
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID:

On Wed, May 9, 2012 at 2:48 PM, Sven Marnach wrote:
> With the reintroduction of u"Unicode literals", Python 3.3 will remove
> one of the major stumbling stones for supporting Python 2.x and 3.3
> within the same code base. Another rather trivial stumbling stone
> could be removed by adding the alias `future_builtins` for the
> `builtins` module.
Currently, you need to use a try/except block, > which isn't too bad, but I think it would be nicer if a line like > > from future_builtins import map > > continues to work, just like __future__ imports continue to work. I > think the above actually *is* a kind of __future__ report which just > happens to be in a regular module because it doesn't need any special > compiler support. > > I know a few module names changed and some modules have been > reorganised to packages, so you will still need try/except blocks for > other imports. However, I think `future_builtins` is special because > it's sole raison d'?tre is forward-compatibility and becuase of the > analogy with `__future__`. > > Cheers, > Sven Sounds like it will do more good than harm. +1 Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From jkbbwr at gmail.com Wed May 9 22:28:31 2012 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Wed, 9 May 2012 21:28:31 +0100 Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins` In-Reply-To: <20120509184856.GC3133@bagheera> References: <20120509184856.GC3133@bagheera> Message-ID: Why not for naming call it from __future__.builtins import map On Wed, May 9, 2012 at 7:48 PM, Sven Marnach wrote: > With the reintroduction of u"Unicode literals", Python 3.3 will remove > one of the major stumbling stones for supporting Python 2.x and 3.3 > within the same code base. ?Another rather trivial stumbling stone > could be removed by adding the alias `future_builtins` for the > `builtins` module. ?Currently, you need to use a try/except block, > which isn't too bad, but I think it would be nicer if a line like > > ? ?from future_builtins import map > > continues to work, just like __future__ imports continue to work. ?I > think the above actually *is* a kind of __future__ report which just > happens to be in a regular module because it doesn't need any special > compiler support. 
> > I know a few module names changed and some modules have been > reorganised to packages, so you will still need try/except blocks for > other imports. ?However, I think `future_builtins` is special because > it's sole raison d'?tre is forward-compatibility and becuase of the > analogy with `__future__`. > > Cheers, > ? ?Sven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From mikegraham at gmail.com Wed May 9 23:17:33 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 9 May 2012 17:17:33 -0400 Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins` In-Reply-To: References: <20120509184856.GC3133@bagheera> Message-ID: On Wed, May 9, 2012 at 4:28 PM, Jakob Bowyer wrote: > Why not for naming call it from __future__.builtins import map > Because future_builtins already exists. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed May 9 23:59:03 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 09 May 2012 17:59:03 -0400 Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins` In-Reply-To: <20120509184856.GC3133@bagheera> References: <20120509184856.GC3133@bagheera> Message-ID: On 5/9/2012 2:48 PM, Sven Marnach wrote: > With the reintroduction of u"Unicode literals", Python 3.3 will remove > one of the major stumbling stones for supporting Python 2.x and 3.3 > within the same code base. The justification for that is that some people need *three* types of fast string/byte literals: 1. things that should be bytes in both Py 2 and Py 3; 2. things that should be bytes in Py 2 and unicode in Py 3; 3. things that should be unicode in both Py 2 and Py; and that there is no way to accomplish that with imports. > Another rather trivial stumbling stone > could be removed by adding the alias `future_builtins` for the > `builtins` module. 
Currently, you need to use a try/except block,
> which isn't too bad, but I think it would be nicer if a line like
> from future_builtins import map
> continues to work, just like __future__ imports continue to work.

This proposal, as admitted above and below, does not have the justification of the u prefix reversion. By bloating Python 3 with obsolete stuff, it would make Python 3 less nice. I was dubious about the u prefix addition, because I anticipated that additional, less justified, reversion proposals would follow ;-).

A strong -1

Every deprecation and removal or change of a name introduces a stumbling block that could be 'removed' by keeping the old name. So we do not remove things casually. This one was deprecated on introduction, so there is no surprise. I do not see any particular reason to special case it, and I notice that you had the sense to not propose that all changed/deleted module names be duplicated ;-).

I do not know whether this:
"The 2to3 tool that ports Python 2 code to Python 3 will recognize this usage and leave the new builtins alone"
is because 2to3 special-cases imports from future_builtins or because it always leaves explicitly imported names alone, even if they duplicate a built-in name. But I don't think it matters for a single code base, even if you do use 2to3 to help write that.

In any case, if you do not like how you have to directly use future_builtins with a single code base, wrap it. Install the work-a-like wrapper module in site-packages or include one in your application package. In either case, use whatever name you prefer.

app/x.py:

    from app.builtins import map  # or *, or whatever.

app/builtins.py:

    try:
        from future_builtins import *  # Py2
    except ImportError:  # Py3
        map = map  # rebind the builtin as a module attribute so it can be imported

This will work with all versions of Python 3.

> I know a few module names changed and some modules have been
> reorganised to packages, so you will still need try/except blocks for
> other imports.
If you really dislike conditional imports, you could wrap them all in an application 'stdlib' module that hides the version details. Is something like this really not part of existing compatibility packages?

--
Terry Jan Reedy

From ncoghlan at gmail.com Thu May 10 00:04:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 May 2012 08:04:38 +1000
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID:

No, because it is trivial to do the following during application startup (with appropriate version checks or try blocks):

    import sys, builtins
    sys.modules["future_builtins"] = builtins

Or, use the six package instead.

--
Sent from my phone, thus the relative brevity :)

From greg.ewing at canterbury.ac.nz Thu May 10 02:18:45 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2012 12:18:45 +1200
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
In-Reply-To:
References:
Message-ID: <4FAB0965.90009@canterbury.ac.nz>

Jim Jewett wrote:
> The cyclic groups Z/n have a zero element, so *something* has to be
> effectively zero;

But times of day are not a cyclic group. Time of day *differences* are, but not the times themselves.

--
Greg

From charlesw123456 at gmail.com Sat May 12 00:27:03 2012
From: charlesw123456 at gmail.com (li wang)
Date: Sat, 12 May 2012 06:27:03 +0800
Subject: [Python-ideas] I have an encrypted python module format: .pye
Message-ID:

Hi all:

I want to use Python in my product because I like it and have been familiar with it for many years, but I won't let the customer read and modify my code. So the best way is to encrypt my module .py to .pye.

Python currently writes compiled byte code (.pyc or .pyo) when a .py is imported; I have written a patch to add .pye support for encrypted byte code.
When a .pye is imported, python will check the environment variable PYTHONENCRYPT, if this environment variable is defined with non-blank value, the value is used to generate AES key and CBC initialize vector which will be used to encrypt .py and decrypt .pye. Now it is work for me, does the python community is interested for it? I believe this feature can be helpful to let the python to be used in bussiness use case. Thanks greatly. Charles Wang May/12, 2012. From mwm at mired.org Sat May 12 00:39:07 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 11 May 2012 18:39:07 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: <20120511183907.1c256ed8@bhuda.mired.org> On Sat, 12 May 2012 06:27:03 +0800 li wang wrote: > When a .pye is imported, python will check the environment variable > PYTHONENCRYPT, if this environment variable is defined with non-blank > value, the value is used to generate AES key and CBC initialize vector > which will be used to encrypt .py and decrypt .pye. And what prevents the customer from doing that themselves in order to read the source? > Now it is work for me, does the python community is interested for it? > I believe this feature can be helpful to let the python to be used in > bussiness use case. While the ability to hide code is a recurring request, it really doesn't get a lot of support. The problem is that you have to have the plain text of the code available on the customers machine in order to run it. So everything they need to know to decrypt it has to be on the machine, meaning you're relying on obscuring some part of that information to keep them from decrypting it outside of the execution environment. Security through obscurity is a bad idea, and never really works for very long. The recommended solution is to package your software so that reading the source isn't really a requirement. 
One alternative is to ship both a Python executable and .pyo files without the .py files. I believe there's even a tool for windows that bundles all of that up into a .exe file. This is really just more obscurity, though. It's not like extracting the .pyo files from the .exe is impossible, and turning .pyo files back into python code is straightforward. The better approach is to refactor the critical code into a web service, and sell the users a client and an account. Or give away the client and just sell the account. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From alexander.belopolsky at gmail.com Sat May 12 00:43:41 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 11 May 2012 18:43:41 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: <1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com> On May 11, 2012, at 6:27 PM, li wang wrote: > I won't let the customer to read and modify > my code. What you describe sounds impossible: how can your customer run your code without an encryption key? If you deliver the key, how can you prevent the customer from reading your code? Preventing modification is feasible with various signed code schemes, but software DRM can never work. From guido at python.org Sat May 12 01:02:55 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 11 May 2012 16:02:55 -0700 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: <1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com> References: <1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com> Message-ID: It it impossible in the same way that it is impossible to lock the front door of your house. The Dropbox client for most major OS'es is written in Python and they use a similar technique. They are very happy with it. 
--Guido On Fri, May 11, 2012 at 3:43 PM, Alexander Belopolsky wrote: > > > > > On May 11, 2012, at 6:27 PM, li wang wrote: > >> I won't let the customer to read and modify >> my code. > > What you describe sounds impossible: how can your customer run your code without an encryption key? If you deliver the key, how can you prevent the customer from reading your code? ?Preventing modification is feasible with various signed code schemes, but software DRM can never work. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) From grosser.meister.morti at gmx.net Sat May 12 01:14:24 2012 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sat, 12 May 2012 01:14:24 +0200 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: <1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com> Message-ID: <4FAD9D50.9030209@gmx.net> Well, a quick Google search found this: http://itooktheredpill.dyndns.org/2012/dropbox-decrypt/ So their encryption is pretty useless. The difference to breaking a door lock is, that breaking a lock requires some effort each time you do it. Breaking the encryption of such code only requires a one time effort by someone interested in cracking such things (provided he/she will then publish his/her findings, which they often do). On 05/12/2012 01:02 AM, Guido van Rossum wrote: > It it impossible in the same way that it is impossible to lock the > front door of your house. > > The Dropbox client for most major OS'es is written in Python and they > use a similar technique. They are very happy with it. > > --Guido > > On Fri, May 11, 2012 at 3:43 PM, Alexander Belopolsky > wrote: >> >> >> >> >> On May 11, 2012, at 6:27 PM, li wang wrote: >> >>> I won't let the customer to read and modify >>> my code. 
>> >> What you describe sounds impossible: how can your customer run your code without an encryption key? If you deliver the key, how can you prevent the customer from reading your code? Preventing modification is feasible with various signed code schemes, but software DRM can never work. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > From techtonik at gmail.com Sat May 12 10:21:17 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 12 May 2012 11:21:17 +0300 Subject: [Python-ideas] [...].join(sep) Message-ID: I am certain this was proposed many times, but still - why it is rejected? "real man don't use spaces".split().join('+').upper() instead of '+'.join("real man don't use spaces".split()).upper() The class purity (not being dependent from objects of other class) is not an argument here: string.join() produces list, why list.join() couldn't produce strings? The impedance mismatch can be, but it is a pain already and string.join() doesn't help: that means you still get exception when trying to join lists with no strings inside Can practicality still beat purity in this case? From phd at phdru.name Sat May 12 10:37:08 2012 From: phd at phdru.name (Oleg Broytman) Date: Sat, 12 May 2012 12:37:08 +0400 Subject: [Python-ideas] [...].join(sep) In-Reply-To: References: Message-ID: <20120512083708.GA3901@iskra.aviel.ru> On Sat, May 12, 2012 at 11:21:17AM +0300, anatoly techtonik wrote: > I am certain this was proposed many times Thousands. > string.join() produces list, why list.join() couldn't produce strings? string.split() produces list. There is no list.join() because list is only one of many containers. Should tuple has its own .join() method? What about other containers? iterables? generators? string.join() can accept any iterable, not only a list. That's the explanation why it's preferred. 
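
[Editorial sketch, not part of the original message.] The argument that str.join accepts any iterable can be seen in a few lines. The helper `join_all` is hypothetical, added only to show one way of handling non-string items, since str.join itself raises TypeError for them.

```python
def join_all(sep, items):
    """Join arbitrary objects by converting each one to str first.

    Hypothetical helper: str.join itself only accepts strings.
    """
    return sep.join(str(item) for item in items)


words = "real man don't use spaces".split()

# The same method works for a list, a tuple and a generator alike,
# without any of those types needing a join method of their own.
from_list = "+".join(words).upper()
from_tuple = "+".join(tuple(words)).upper()
from_generator = "+".join(w.upper() for w in words)

print(from_list)
print(join_all(", ", [1, 2.5, None]))
```

A list.join method would cover only the first of the three cases above; the separator-first spelling covers all of them with a single implementation.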
> that means you still get exception when trying to join lists with > no strings inside In what way do you expect list.join(string) would help? Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From techtonik at gmail.com Sat May 12 10:59:03 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 12 May 2012 11:59:03 +0300 Subject: [Python-ideas] hexdump Message-ID: Just an idea of usability fix for Python 3. hexdump module (function or bytes method is better) as simple, easy and intuitive way for dumping binary data when writing programs in Python. hexdump(bytes) - produce human readable dump of binary data, byte-by-byte representation, separated by space, 16-byte rows Rationale: 1. Debug. Generic binary data can't be output to console. A separate helper is needed to print, log or store its value in human readable format in database. This takes time. 2. Usability. binascii is ugly: name is not intuitive any more, there are a lot of functions, and it is not clear how it relates to unicode. 3. Serialization. It is convenient to have format that can be displayed in a text editor. Simple tools encourage people to use them. Practical example: >>> print(b) ? ? ? ?? ?? ? ?? ?? ? ? ? ? 
>>> b
'\xe6\xb0\x08\x04\xe7\x9e\x08\x04\xe7\xbc\x08\x04\xe7\xd5\x08\x04\xe7\xe4\x08\x04\xe6\xb0\x08\x04\xe7\xf0\x08\x04\xe7\xff\x08\x04\xe8\x0b\x08\x04\xe8\x1a\x08\x04\xe6\xb0\x08\x04\xe6\xb0\x08\x04'
>>> print(binascii.hexlify(b))
e6b00804e79e0804e7bc0804e7d50804e7e40804e6b00804e7f00804e7ff0804e80b0804e81a0804e6b00804e6b00804
>>>
>>> data = hexdump(b)
>>> print(data)
E6 B0 08 04 E7 9E 08 04 E7 BC 08 04 E7 D5 08 04
E7 E4 08 04 E6 B0 08 04 E7 F0 08 04 E7 FF 08 04
E8 0B 08 04 E8 1A 08 04 E6 B0 08 04 E6 B0 08 04
>>>
>>> # achieving the same output with binascii is overcomplicated
>>> data_lines = [binascii.hexlify(b)[i:min(i+32, len(binascii.hexlify(b)))] for i in xrange(0, len(binascii.hexlify(b)), 32)]
>>> data_lines = [' '.join(l[i:min(i+2, len(l))] for i in xrange(0, len(l), 2)).upper() for l in data_lines]
>>> print('\n'.join(data_lines))
E6 B0 08 04 E7 9E 08 04 E7 BC 08 04 E7 D5 08 04
E7 E4 08 04 E6 B0 08 04 E7 F0 08 04 E7 FF 08 04
E8 0B 08 04 E8 1A 08 04 E6 B0 08 04 E6 B0 08 04

On the other hand, getting the rather useless binascii output from hexdump() is quite trivial:

>>> data.replace(' ','').replace('\n','').lower()
'e6b00804e79e0804e7bc0804e7d50804e7e40804e6b00804e7f00804e7ff0804e80b0804e81a0804e6b00804e6b00804'

But more practical, for example, would be counting the offset from hexdump:

>>> print( ''.join( '%05x: %s\n' % (i*16,l) for i,l in enumerate(hexdump(b).split('\n'))))

Etc.

Conclusion: By providing better building blocks at the basic level, Python will become a better tool for more useful tasks.

References:
[1] http://stackoverflow.com/questions/2340319/python-3-1-1-string-to-hex
[2] http://en.wikipedia.org/wiki/Hex_dump

--
anatoly t.
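
[Editorial sketch, not part of the original message.] The proposed hexdump(bytes) does not exist in the standard library as described; the following is one possible Python 3 reading of the description above (uppercase hex, one space between bytes, 16-byte rows). The `rowsize` parameter is an assumption added for illustration and is not part of the proposal as written.

```python
def hexdump(data, rowsize=16):
    """Return an uppercase, space-separated hex dump of a bytes object.

    Hypothetical sketch of the proposed function; `rowsize` is an
    assumed extra parameter giving the number of bytes per row.
    """
    rows = []
    for start in range(0, len(data), rowsize):
        chunk = data[start:start + rowsize]
        # Iterating over bytes in Python 3 yields integers.
        rows.append(" ".join("%02X" % byte for byte in chunk))
    return "\n".join(rows)


sample = bytes(range(18))
print(hexdump(sample))
```

Returning a single string matches the `print(data)` usage in the message; a generator of rows would be an equally reasonable design if the output is meant to be written to a file.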
From phd at phdru.name Sat May 12 11:15:43 2012
From: phd at phdru.name (Oleg Broytman)
Date: Sat, 12 May 2012 13:15:43 +0400
Subject: [Python-ideas] hexdump
In-Reply-To:
References:
Message-ID: <20120512091543.GA5284@iskra.aviel.ru>

On Sat, May 12, 2012 at 11:59:03AM +0300, anatoly techtonik wrote:
> Just an idea of usability fix for Python 3.
> hexdump module (function or bytes method is better) as simple, easy
> and intuitive way for dumping binary data when writing programs in
> Python.

Well, you know, the way to add such modules to Python is via Cheeseshop.

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From simon.sapin at kozea.fr Sat May 12 10:34:26 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Sat, 12 May 2012 10:34:26 +0200
Subject: [Python-ideas] [...].join(sep)
In-Reply-To:
References:
Message-ID: <4FAE2092.70309@kozea.fr>

On 12/05/2012 10:21, anatoly techtonik wrote:
> I am certain this was proposed many times, but still - why it is rejected?
>
> "real man don't use spaces".split().join('+').upper()
> instead of
> '+'.join("real man don't use spaces".split()).upper()
>
>
> The class purity (not being dependent from objects of other class) is
> not an argument here:
> string.join() produces list, why list.join() couldn't produce strings?
>
> The impedance mismatch can be, but it is a pain already and
> string.join() doesn't help:
> that means you still get exception when trying to join lists with
> no strings inside
>
>
> Can practicality still beat purity in this case?

Hi,

I'm not sure what you mean by "class purity", but the argument against this is practical: list.join would work but we want to join iterables, not just lists. bytes.join and str.join accept any iterable (including user-defined ones), while not every iterable would have a join method.
Having the burden of defining join on user-defined string-like types (not very common) is better than on user-defined iterables (more common). Also, a "string-like" already needs many methods while __iter__ is enough to make an iterable. -- Simon Sapin From timothy.c.delaney at gmail.com Sat May 12 15:11:00 2012 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sat, 12 May 2012 23:11:00 +1000 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: On 12 May 2012 08:27, li wang wrote: > When a .pye is imported, python will check the environment variable > PYTHONENCRYPT, if this environment variable is defined with non-blank > value, the value is used to generate AES key and CBC initialize vector > which will be used to encrypt .py and decrypt .pye. > As others have noted, this is essentially useless for protecting your code. How do you set that environment variable on your customer's system, without giving them the key they need? You can erect a somewhat higher barrier by using Pyrex or Cython to compile your modules to .pyd/.so. It's still quite possible to extract your logic and/or patch around things, but it's a little harder. The only reasonably secure method (again, as noted by others) is to not have your code on the client machine e.g. using a web service for the critical logic. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sat May 12 16:29:35 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Sat, 12 May 2012 16:29:35 +0200 Subject: [Python-ideas] Move tarfile.filemode() into stat module Message-ID: http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304 I discovered this undocumented function by accident different years ago and reused it a couple of times since then. I think that leaving it hidden inside tarfile module is unfortunate. What about moving it into stat module and document it? 
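
[Editorial sketch, not part of the original message.] For readers on a later Python: to the best of my knowledge, Python 3.3 and newer expose this as stat.filemode(), which turns a numeric st_mode into the familiar "ls -l" style string. The helper name `describe` is hypothetical.

```python
import os
import stat

# Convert raw mode bits into the "-rwxrwxrwx" form.
regular = stat.filemode(0o100644)    # regular file, rw-r--r--
directory = stat.filemode(0o040755)  # directory, rwxr-xr-x


def describe(path):
    """Return the mode string for an existing filesystem path."""
    return stat.filemode(os.stat(path).st_mode)


print(regular, directory, describe("."))
```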
Regards, --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From solipsis at pitrou.net Sat May 12 16:41:10 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 12 May 2012 16:41:10 +0200 Subject: [Python-ideas] Move tarfile.filemode() into stat module References: Message-ID: <20120512164110.27316aec@pitrou.net> On Sat, 12 May 2012 16:29:35 +0200 Giampaolo Rodol? wrote: > http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304 > I discovered this undocumented function by accident different years > ago and reused it a couple of times since then. > I think that leaving it hidden inside tarfile module is unfortunate. > What about moving it into stat module and document it? I don't know which of stat or shutil would be the better recipient, but it's a good idea anyway. Regards Antoine. From g.rodola at gmail.com Sat May 12 17:23:48 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Sat, 12 May 2012 17:23:48 +0200 Subject: [Python-ideas] Move tarfile.filemode() into stat module In-Reply-To: <20120512164110.27316aec@pitrou.net> References: <20120512164110.27316aec@pitrou.net> Message-ID: 2012/5/12 Antoine Pitrou : > On Sat, 12 May 2012 16:29:35 +0200 > Giampaolo Rodol? > wrote: >> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304 >> I discovered this undocumented function by accident different years >> ago and reused it a couple of times since then. >> I think that leaving it hidden inside tarfile module is unfortunate. >> What about moving it into stat module and document it? > > I don't know which of stat or shutil would be the better recipient, but > it's a good idea anyway. Hmm... right. It's controversial. On one hand stat module looks better because the "mode" concept is scattered all over the place, on the other hand this is a perfect example of file-related "utility" function. Let's wait in order to collect some bikeshedding then. 
=) --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From tjreedy at udel.edu Sat May 12 18:18:33 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 12 May 2012 12:18:33 -0400 Subject: [Python-ideas] hexdump In-Reply-To: References: Message-ID: On 5/12/2012 4:59 AM, anatoly techtonik wrote: > Just an idea of usability fix for Python 3. > hexdump module (function or bytes method is better) as simple, easy > and intuitive way for dumping binary data when writing programs in > Python. > > hexdump(bytes) - produce human readable dump of binary data, > byte-by-byte representation, separated by space, 16-byte rows Hexdump, as you propose it, does three things. In each case, it fixes a parameter that could reasonably have a different value. 1. Splits the hex characters into groups of two characters, each representing one byte. For some uses, large chunks would be more useful. 2. Uppercases the alpha hex characters. This is a holdover from the ancient all-uppercase world, where there was no choice. While is may make the block visual more 'even' and 'aesthetic', which not actually being read, it makes it harder to tell the difference between a 0-9 digit and alpha digit. B and 8 become very similar. There is justification for binascii.hexlify using locecase. 3. Group the hex-represented units into lines of 16 each. This is only useful when the bytes come from memory with hex addresses, when the point is to determine the specific bytes at specific addresses. For displaying decimal-length byte strings, 25 bytes per line would be better. What it does not do. 4. Break lines into blocks. One might want to break up multiple lines of 25 into blocks of four lines each. 5. Label the rows and column either with hex or decimal labels. 6. Add 'dotted ascii' translation to reveal embedded ascii strints. Output: choices are an iterator of lines, a list of lines, and a string with embedded newlines. 
The second and third are easily derived from the first, so I propose the first as the best choice. An iterator can also be used to write to a file.

A flexible module would be a good addition to PyPI if not there already. Let's see...

hexencoder 1.0
hex encode decode and compare
This project offers 3 basic tools for manipulating binary files: 1) flexible hexdump
Home Page: http://sourceforge.net/projects/hexencoder

I did not look to see how flexible is 'flexible', but there it is.

> Rationale:
> 1. Debug.
> Generic binary data can't be output to console.

That depends on the console. Old IBM PCs had a character for every byte. That was meant for line-drawing, accents, and symbols, but could also be used for binary dumps. I believe there are Windows codepages that will do similar. Any bytes can be decoded as latin-1 and then printed.

> A separate helper
> is needed to print, log or store its value in human readable format in
> database. This takes time.

A custom helper gives custom output.

> 2. Usability.
> binascii is ugly: name is not intuitive any more, there are a lot
> of functions, and it is not clear how it relates to unicode.

Even if there are lots of functions, one might be added. What does 'it' refer to? hexdump or binascii? Both are about binary bytes and not about unicode characters, so neither relates to abstract unicode. Encoded unicode characters are binary data like any other, though if the encoding is utf-16 or utf-32, one would want 2 or 4 bytes dumped together, as I suggested above.

--
Terry Jan Reedy

From flub at devork.be Sat May 12 18:20:40 2012
From: flub at devork.be (Floris Bruynooghe)
Date: Sat, 12 May 2012 17:20:40 +0100
Subject: [Python-ideas] hexdump
In-Reply-To:
References:
Message-ID:

On 12 May 2012 09:59, anatoly techtonik wrote:
> hexdump(bytes) - produce human readable dump of binary data,

+1 on this basic function, that would be very nice in the stdlib. Now I always need to go and dig up my own function from somewhere.
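For illustration only, a minimal sketch of the kind of helper people keep digging up. The parameter names (rowsize, offsets, ascii) follow the signature suggested in this thread; the output layout (8-digit hex offset, lowercase hex pairs, '.' for non-printable bytes) and the choice to yield lines lazily are assumptions, not anything agreed on the list.

```python
def hexdump(data, rowsize=16, offsets=True, ascii=True):
    """Yield one formatted text line per `rowsize` bytes of `data`.

    Hypothetical helper: the layout is an assumption, not a stdlib API.
    Note that the `ascii` parameter shadows the builtin of the same name
    inside this function.
    """
    for start in range(0, len(data), rowsize):
        row = data[start:start + rowsize]
        parts = []
        if offsets:
            # Offset of the first byte of the row, in hex.
            parts.append("{:08x}".format(start))
        # Iterating over bytes yields ints in Python 3.
        hexpart = " ".join("{:02x}".format(b) for b in row)
        if ascii:
            # Pad a short final row so the text column stays aligned.
            parts.append(hexpart.ljust(rowsize * 3 - 1))
            parts.append("".join(chr(b) if 32 <= b < 127 else "." for b in row))
        else:
            parts.append(hexpart)
        yield "  ".join(parts)
```

Because it is a generator, a list is just list(hexdump(data)) and a single string is "\n".join(hexdump(data)).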
A certain deal of bikeshedding would be required on the function signature however, I'd go with something like: hexdump(data, rowsize=16, offsets=True, ascii=True) Where rowsize is the number of bytes on one row, offsets controls showing the byte number (in hex) of the first byte of each row and ascii controls showing the 7-bit printable characters in a right hand column. This would cover my needs, I'm sure other people will come up with more must-haves. Regards, Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org From jkbbwr at gmail.com Sat May 12 18:44:13 2012 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Sat, 12 May 2012 17:44:13 +0100 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: http://chargen.matasano.com/chargen/2009/7/22/if-youre-typing-the-letters-a-e-s-into-your-code-youre-doing.html On Sat, May 12, 2012 at 2:11 PM, Tim Delaney wrote: > On 12 May 2012 08:27, li wang wrote: >> >> When a .pye is imported, python will check the environment variable >> PYTHONENCRYPT, if this environment variable is defined with non-blank >> value, the value is used to generate AES key and CBC initialize vector >> which will be used to encrypt .py and decrypt .pye. > > > As others have noted, this is essentially useless for protecting your code. > How do you set that environment variable on your customer's system, without > giving them the key they need? > > You can erect a somewhat higher barrier by using Pyrex or Cython to compile > your modules to .pyd/.so. It's still quite possible to extract your logic > and/or patch around things, but it's a little harder. > > The only reasonably secure method (again, as noted by others) is to not have > your code on the client machine e.g. using a web service for the critical > logic. 
> > Tim Delaney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From brett at python.org Sat May 12 19:13:59 2012 From: brett at python.org (Brett Cannon) Date: Sat, 12 May 2012 13:13:59 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: On Fri, May 11, 2012 at 6:27 PM, li wang wrote: > hi all: > > I want to use python in my product because I like and familiar with > python for many years, but I won't let the customer to read and modify > my code. So the best way is to encrypt my module .py to .pye. > > Actually it's better to simply ship the .pyc/.pyo files and/or to minify the code to make it unreadable. As everyone pointed out, the encryption you are proposing won't stop anyone from reading your source, it will just make it a little harder. -Brett > Now python will write compiled byte code .pyc or .pyo when a .py is > imported, I have write a patch to add .pye support for encrypted byte > code. > > When a .pye is imported, python will check the environment variable > PYTHONENCRYPT, if this environment variable is defined with non-blank > value, the value is used to generate AES key and CBC initialize vector > which will be used to encrypt .py and decrypt .pye. > > Now it is work for me, does the python community is interested for it? > I believe this feature can be helpful to let the python to be used in > bussiness use case. > > Thanks greatly. > > Charles Wang May/12, 2012. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Sat May 12 19:14:46 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 12 May 2012 10:14:46 -0700 Subject: [Python-ideas] hexdump In-Reply-To: References: Message-ID: Rather than bikeshedding, why not implement the common formats and flags implemented by the venerable 'od' command? It's been time-tested... On Sat, May 12, 2012 at 9:20 AM, Floris Bruynooghe wrote: > On 12 May 2012 09:59, anatoly techtonik wrote: >> hexdump(bytes) ? - produce human readable dump of binary data, > > +1 on this basic function, that would be very nice in the stdlib. ?Now > I always need to go and dig up my own function from somewhere. > > A certain deal of bikeshedding would be required on the function > signature however, I'd go with something like: > > hexdump(data, rowsize=16, offsets=True, ascii=True) > > Where rowsize is the number of bytes on one row, offsets controls > showing the byte number (in hex) of the first byte of each row and > ascii controls showing the 7-bit printable characters in a right hand > column. > > This would cover my needs, I'm sure other people will come up with > more must-haves. > > Regards, > Floris > > -- > Debian GNU/Linux -- The Power of Freedom > www.debian.org | www.gnu.org | www.kernel.org > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Sat May 12 19:16:05 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 12 May 2012 13:16:05 -0400 Subject: [Python-ideas] Move tarfile.filemode() into stat module In-Reply-To: <20120512164110.27316aec@pitrou.net> References: <20120512164110.27316aec@pitrou.net> Message-ID: On 5/12/2012 10:41 AM, Antoine Pitrou wrote: > On Sat, 12 May 2012 16:29:35 +0200 > Giampaolo Rodol? 
> wrote: >> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304 >> I discovered this undocumented function by accident different years >> ago and reused it a couple of times since then. >> I think that leaving it hidden inside tarfile module is unfortunate. >> What about moving it into stat module and document it? > > I don't know which of stat or shutil would be the better recipient, but > it's a good idea anyway. I think I would more likely look in stat, and as noted below, the constants used for the table used in the function are already in stat.py. I checked, and # Bits used in the mode field, values in octal. #--------------------------------------------------------- S_IFLNK = 0o120000 # symbolic link ... are only used in filemode_table = ( ((S_IFLNK, "l"), ... which is only used in def filemode(mode): ... So all three can be cleanly extracted into another module. However 1) the bit definitions themselves should just be deleted as they *duplicate* those in stat.py. The S_Ixxx names are the same, the other names are variations of the other stat.S_Ixxxx names. So filemode_table (with '_' added?) could/should be re-written in stat.py to use the public, documented constants already defined there. However 2) stat.py lacks the nice comments explaining the constants in the file itself, so I *would* copy the comments to the appropriate lines. There only seems to be one use of the function in tarfile.py: Line 1998: print(filemode(tarinfo.mode), end=' ') All the other uses of 'filemode' are as a local name inside the open method, derived from its mode parameter: filemode, comptype = mode.split(":", 1) +1 on moving the table (probably with private name, and using the existing, documented stat S_Ixxxx constants) and function (public) to stat.py. 
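A rough sketch of what the relocated code could look like when written against the public stat constants. The table name and its exact ordering are assumptions modeled loosely on the tarfile helper, not the patch itself. (Later Python releases do ship a stat.filemode() function; this sketch is independent of it.)

```python
import stat

# Hypothetical private table built only from documented stat constants.
# Within each group the first matching entry wins, so more specific bit
# patterns must come before less specific ones.
_FILEMODE_TABLE = (
    # File type
    ((stat.S_IFLNK, "l"),
     (stat.S_IFREG, "-"),
     (stat.S_IFBLK, "b"),
     (stat.S_IFDIR, "d"),
     (stat.S_IFCHR, "c"),
     (stat.S_IFIFO, "p")),
    # Owner permissions
    ((stat.S_IRUSR, "r"),),
    ((stat.S_IWUSR, "w"),),
    ((stat.S_IXUSR | stat.S_ISUID, "s"),
     (stat.S_ISUID, "S"),
     (stat.S_IXUSR, "x")),
    # Group permissions
    ((stat.S_IRGRP, "r"),),
    ((stat.S_IWGRP, "w"),),
    ((stat.S_IXGRP | stat.S_ISGID, "s"),
     (stat.S_ISGID, "S"),
     (stat.S_IXGRP, "x")),
    # Other permissions
    ((stat.S_IROTH, "r"),),
    ((stat.S_IWOTH, "w"),),
    ((stat.S_IXOTH | stat.S_ISVTX, "t"),
     (stat.S_ISVTX, "T"),
     (stat.S_IXOTH, "x")),
)


def filemode(mode):
    """Convert a file's mode bits to a string like '-rwxr-xr-x'."""
    chars = []
    for group in _FILEMODE_TABLE:
        for bits, char in group:
            if mode & bits == bits:
                chars.append(char)
                break
        else:
            # No entry in this group matched.
            chars.append("-")
    return "".join(chars)
```

With this table a socket would be reported as '-', since it has no entry of its own; whether that is acceptable is a separate question.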
-- Terry Jan Reedy From mwm at mired.org Sat May 12 20:39:55 2012 From: mwm at mired.org (Mike Meyer) Date: Sat, 12 May 2012 14:39:55 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: <20120512143955.174c4a76@bhuda.mired.org> On Sat, 12 May 2012 13:13:59 -0400 Brett Cannon wrote: > On Fri, May 11, 2012 at 6:27 PM, li wang wrote: > > I want to use python in my product because I like and familiar with > > python for many years, but I won't let the customer to read and modify > > my code. So the best way is to encrypt my module .py to .pye. > Actually it's better to simply ship the .pyc/.pyo files and/or to minify > the code to make it unreadable. As everyone pointed out, the encryption you > are proposing won't stop anyone from reading your source, it will just make > it a little harder. I think it's worth explaining why just shipping the .pyc/.pyo files is "better". If it's not clear by now, a fancy encryption scheme won't protect your sources from someone who really wants to read them. On the other hand, shipping just the .pyc/.pyo files will stop casual browsing. The only real difference here is how much effort it takes to get the source. To carry Guido's analogy further, both lock your front door, one just uses a better lock. Neither will stop a determined burglar. On the other hand, if you ship code with a fancy encryption scheme, you're shipping more moving parts, which means more things to go wrong, which means more support calls. With the particular scheme you proposed, you'll get calls from people who managed to run the code without properly setting the environment variable, or set it to the wrong thing, and those are just the obvious problems. In summary, your encryption scheme will make life just a little harder for everyone when compared to simply not shipping the source. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From cmjohnson.mailinglist at gmail.com Sun May 13 02:49:54 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sat, 12 May 2012 14:49:54 -1000 Subject: [Python-ideas] Printf function? Message-ID: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is print "%10.2f" % x and the Python 3 is print("{:10.2f}".format(x)) Personally, I prefer the new style {} formatting to the old % formatting, but it is pretty busy when you want to do a print and format in one step. Why not add a printf function to the built-ins, so you could just write printf("{:10.2f}", x) Of course, writing a printf function for oneself is trivial and "not every three line function needs to be a built-in," but I do feel like this would be a win for Python's legibility. What do you all think? -- Carl Johnson From cs at zip.com.au Sun May 13 03:50:10 2012 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 13 May 2012 11:50:10 +1000 Subject: [Python-ideas] Printf function? In-Reply-To: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> Message-ID: <20120513015010.GA30528@cskk.homeip.net> On 12May2012 14:49, Carl M. Johnson wrote: | I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is | | print "%10.2f" % x | | and the Python 3 is | | print("{:10.2f}".format(x)) | | Personally, I prefer the new style {} formatting to the old % | formatting, but it is pretty busy when you want to do a print and | format in one step. 
Why not add a printf function to the built-ins, | so you could just write | | printf("{:10.2f}", x) | | Of course, writing a printf function for oneself is trivial and "not | every three line function needs to be a built-in," but I do feel like | this would be a win for Python's legibility. I'm -1 on it: - as you say, it could be a three line function - %-formatting isn't going away - neither %-formatting nor {}-formatting is anything to do with the print statement; they are both string actions So the printf idea does not achieve anything anyway. Observe my Python 3.2: [/home/cameron]janus*> python3.2 Python 3.2.2 (default, May 2 2012, 09:04:59) [GCC 4.5.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> x=1.5 >>> print("%10.2f" % x) 1.50 >>> Printf isn't needed. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Senior ego adepto, ocius ego eram. From cmjohnson.mailinglist at gmail.com Sun May 13 05:37:54 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sat, 12 May 2012 17:37:54 -1000 Subject: [Python-ideas] Printf function? In-Reply-To: <20120513015010.GA30528@cskk.homeip.net> References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> <20120513015010.GA30528@cskk.homeip.net> Message-ID: On May 12, 2012, at 3:50 PM, Cameron Simpson wrote: > Observe my Python 3.2: > > [/home/cameron]janus*> python3.2 > Python 3.2.2 (default, May 2 2012, 09:04:59) > [GCC 4.5.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> x=1.5 >>>> print("%10.2f" % x) > 1.50 >>>> > > Printf isn't needed. Well, if that's the solution, why do we even have .format in the first place? I know there are a lot of people who still prefer % formatting, but I personally never liked it, and I prefer not to use it if I have any choice about it. But that's neither here nor there. My question is, being that we have .format, why not make it easier to use? 
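For reference, the "three line function" being discussed might look roughly like the sketch below. The keyword names file and end are borrowed from print(); treating them as reserved by the wrapper is an assumption of this sketch, and it means they cannot double as named fields in the format string.

```python
import io


def printf(fmt, *args, file=None, end="\n", **kwargs):
    """Format `fmt` with str.format() and print the result.

    Hypothetical helper, not a builtin. `file` and `end` are passed to
    print(); every other argument goes to str.format().
    """
    print(fmt.format(*args, **kwargs), end=end, file=file)


# Usage: capture the output in a buffer instead of writing to sys.stdout.
buf = io.StringIO()
printf("{:10.2f}", 1.5, file=buf)
printf("{name} is {age}", name="x", age=3, file=buf)
```

Leaving out file (or passing None) prints to sys.stdout, exactly as print() does.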
From steve at pearwood.info Sun May 13 09:05:43 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 13 May 2012 17:05:43 +1000 Subject: [Python-ideas] Printf function? In-Reply-To: <20120513015010.GA30528@cskk.homeip.net> References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> <20120513015010.GA30528@cskk.homeip.net> Message-ID: <4FAF5D47.3090503@pearwood.info> Cameron Simpson wrote: > Printf isn't needed. Agreed. printf does two things, formatting and printing, and Python can already do both. There's no point in a format-then-print function when you can just format then print. However a lightweight alternative to regexes, something similar to scanf only safe, might be a nice idea. You can simulate scanf with regexes, but of course that's hardly lightweight. (But now I'm indulging in idle speculation, not a serious proposal.) -- Steven From steve at pearwood.info Sun May 13 09:22:45 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 13 May 2012 17:22:45 +1000 Subject: [Python-ideas] Printf function? In-Reply-To: References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> <20120513015010.GA30528@cskk.homeip.net> Message-ID: <4FAF6145.7090804@pearwood.info> Carl M. Johnson wrote: > Well, if that's the solution, why do we even have .format in the first > place? I know there are a lot of people who still prefer % formatting, but > I personally never liked it, and I prefer not to use it if I have any > choice about it. But that's neither here nor there. My question is, being > that we have .format, why not make it easier to use? Define "easier to use". Calling a method and passing its output to print seems to be pretty easy to me. The major issue with printf is that it prints AND formats. That means you can't easily capture its output. 
A simpler approach is to have one function that handles the printing, and another function (or possibly a choice of multiple functions) that handles the formatting, then simply pass the output of the second to the first. That is to say, multiple simple tools that do one thing each are simpler *and* more flexible than a single tool to do multiple things: a hammer and a wrench together are less complex than a combination hammer-wrench, and you can do more with them as separate tools than as a combo. -- Steven From steve at pearwood.info Sun May 13 11:36:52 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 13 May 2012 19:36:52 +1000 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: <20120512143955.174c4a76@bhuda.mired.org> References: <20120512143955.174c4a76@bhuda.mired.org> Message-ID: <4FAF80B4.1040507@pearwood.info> Mike Meyer wrote: > On Sat, 12 May 2012 13:13:59 -0400 > Brett Cannon wrote: > >> On Fri, May 11, 2012 at 6:27 PM, li wang wrote: >>> I want to use python in my product because I like and familiar with >>> python for many years, but I won't let the customer to read and modify >>> my code. So the best way is to encrypt my module .py to .pye. >> Actually it's better to simply ship the .pyc/.pyo files and/or to minify >> the code to make it unreadable. As everyone pointed out, the encryption you >> are proposing won't stop anyone from reading your source, it will just make >> it a little harder. > > I think it's worth explaining why just shipping the .pyc/.pyo files is > "better". > > If it's not clear by now, a fancy encryption scheme won't protect your > sources from someone who really wants to read them. On the other hand, > shipping just the .pyc/.pyo files will stop casual browsing. The only > real difference here is how much effort it takes to get the source. To > carry Guido's analogy further, both lock your front door, one just > uses a better lock. Neither will stop a determined burglar. 
I think Guido's analogy is bogus and wrongly suggests that encrypting applications just might work if you try hard enough. If we can lock the door and keep strangers from peeking inside, why can't we encrypt apps and stop people from peeking at the code? But the analogy doesn't follow. In the front door example, untrusted people don't have a key and are forced to pick or break the lock to get it. In the encryption example, untrusted people are given the key (as an environment variable), then trusted not to use it to read the source code! (Possibly on the assumption that they don't realise they have the key, or that using it manually is too difficult for them.) Ultimately, on a computer the user controls, with a key they have access to, they can bypass any encryption or security you install. That's why e.g. so many forms of copy protection and digital restrictions software try to take control away from the user, to some greater or lesser degree of success. -- Steven From masklinn at masklinn.net Sun May 13 11:51:58 2012 From: masklinn at masklinn.net (Masklinn) Date: Sun, 13 May 2012 11:51:58 +0200 Subject: [Python-ideas] Printf function? In-Reply-To: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> Message-ID: <45E5027E-9A8C-4748-BAE8-54ACA49D70E7@masklinn.net> On 2012-05-13, at 02:49 , Carl M. Johnson wrote: > I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is > > print "%10.2f" % x > > and the Python 3 is > > print("{:10.2f}".format(x)) > > Personally, I prefer the new style {} formatting to the old % formatting, but it is pretty busy when you want to do a print and format in one step. 
Why not add a printf function to the built-ins, so you could just write
>
>     printf("{:10.2f}", x)
>
> Of course, writing a printf function for oneself is trivial and "not every three line function needs to be a built-in," but I do feel like this would be a win for Python's legibility.
>
>
> What do you all think?

I'm -1 on two counts personally:

1. Even with Python 3's slightly more verbose string formatting, I don't think there's much (if any) gain in having a builtin merging print and format.

2. If I see a function called `printf` (or with `printf` as part of its name), I expect it to use printf-style format strings (that is, Python 2-style formatting). A function called printf with new-style format strings would be far more confusing than the current situation, I think.

From masklinn at masklinn.net Sun May 13 12:00:49 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 13 May 2012 12:00:49 +0200
Subject: [Python-ideas] Printf function?
In-Reply-To: <4FAF6145.7090804@pearwood.info>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> <20120513015010.GA30528@cskk.homeip.net> <4FAF6145.7090804@pearwood.info>
Message-ID:

On 2012-05-13, at 09:22 , Steven D'Aprano wrote:
> Define "easier to use". Calling a method and passing its output to print seems to be pretty easy to me.
>
> The major issue with printf is that it prints AND formats. That means you can't easily capture its output. A simpler approach is to have one function that handles the printing, and another function (or possibly a choice of multiple functions) that handles the formatting, then simply pass the output of the second to the first.
Another option is to have a formatting function similar to Common Lisp's format[0]: it formats a string and

* If provided with a stream (or stream-like) argument, writes the formatted string to the stream and returns `nil`
* Otherwise returns the formatted string

The function formats and prints, but capturing the output (to a non-standard stream or to a string) is trivial.

[0] http://www.lispworks.com/documentation/HyperSpec/Body/f_format.htm#format

From ncoghlan at gmail.com Sun May 13 16:44:02 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 May 2012 00:44:02 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
Message-ID:

On Sun, May 13, 2012 at 10:49 AM, Carl M. Johnson wrote:
> I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is
>
>     print "%10.2f" % x
>
> and the Python 3 is
>
>     print("{:10.2f}".format(x))

The main reason the format() builtin exists is to easily format single fields:

>>> x = 5.0
>>> print(format(x, "10.2f"))
      5.00

The new formatting system doesn't scale down as well as the old one, so "format a single field value with no surrounding text" is handled as a special case.

> Personally, I prefer the new style {} formatting to the old % formatting, but it is pretty busy when you want to do a print and format in one step. Why not add a printf function to the built-ins, so you could just write
>
>     printf("{:10.2f}", x)
>
> Of course, writing a printf function for oneself is trivial and "not every three line function needs to be a built-in," but I do feel like this would be a win for Python's legibility.
The problem is that you have two competing uses for your arguments - "print" wants to accept "file", "end", etc, while format() wants to accept *args and **kwds. It's better to keep the two separate and allow people to compose them as they wish.

For myself, aside from temporary debugging messages, I rarely call "print" directly in any non-trivial code - instead I'll have a "display" utility module that tweaks things appropriately for the specific application. Maybe it will redirect to logging, maybe it will print directly to a stream or to a file - the utility module gives me a single point of control without having to change the rest of the script.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sun May 13 16:44:57 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 May 2012 00:44:57 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <4FAF5D47.3090503@pearwood.info>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com> <20120513015010.GA30528@cskk.homeip.net> <4FAF5D47.3090503@pearwood.info>
Message-ID:

On Sun, May 13, 2012 at 5:05 PM, Steven D'Aprano wrote:
> However a lightweight alternative to regexes, something similar to scanf
> only safe, might be a nice idea. You can simulate scanf with regexes, but of
> course that's hardly lightweight.
>
> (But now I'm indulging in idle speculation, not a serious proposal.)

Take a look at the parse module on PyPI :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From gahtune at gmail.com Sun May 13 17:00:24 2012 From: gahtune at gmail.com (Gabriel AHTUNE) Date: Sun, 13 May 2012 23:00:24 +0800 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: <4FAF80B4.1040507@pearwood.info> References: <20120512143955.174c4a76@bhuda.mired.org> <4FAF80B4.1040507@pearwood.info> Message-ID: > > I think Guido's analogy is bogus and wrongly suggests that encrypting > applications just might work if you try hard enough. If we can lock the > door and keep strangers from peeking inside, why can't we encrypt apps and > stop people from peeking at the code? But the analogy doesn't follow. > The analogy is you want to protect the doors with a lock from the guy you gave the key. (source: house, encrypted: the lock, the way to decrypt in order to run: the key) > In the front door example, untrusted people don't have a key and are > forced to pick or break the lock to get it. In the encryption example, > untrusted people are given the key > (as an environment variable), then trusted not to use it to read the source > code! > The problem is that he don't trust the customer. > (Possibly on the assumption that they don't realise they have the key, or > that using it manually is too difficult for them.) > > Ultimately, on a computer the user controls, with a key they have access > to, they can bypass any encryption or security you install. That's why e.g. > so many forms of copy protection and digital restrictions software try to > take control away from the user, to some greater or lesser degree of > success. > > > -- > Steven > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Sun May 13 17:30:39 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 13 May 2012 08:30:39 -0700 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: <4FAF80B4.1040507@pearwood.info> References: <20120512143955.174c4a76@bhuda.mired.org> <4FAF80B4.1040507@pearwood.info> Message-ID: On Sun, May 13, 2012 at 2:36 AM, Steven D'Aprano wrote: > I think Guido's analogy is bogus and wrongly suggests that encrypting > applications just might work if you try hard enough. Eh? I didn't mean that at all. To the contrary I meant that every encryption can be broken but that it may still be a useful deterrent. I wasn't aware of the detail of the OP's proposal that the key was right in the user's environment -- but that actually has an exact analogy in the front door example: hiding the key under the mat. -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Sun May 13 18:43:08 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 13 May 2012 12:43:08 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: <20120512143955.174c4a76@bhuda.mired.org> <4FAF80B4.1040507@pearwood.info> Message-ID: On Sun, May 13, 2012 at 11:30 AM, Guido van Rossum wrote: > I wasn't aware of the detail of the OP's proposal that the key was > right in the user's environment -- but that actually has an exact > analogy in the front door example: hiding the key under the mat. This sounds like an argument not to include this functionality in stdlib. If hiding the key under the mat becomes standard, a key under the mat will be as inviting as an open front door. Those interested in obscurity should not invite public discussion of clandestine advantages of doormats over garden rocks. 
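To make the "key under the mat" point concrete: anything the customer runs on their own machine can read the same environment variable the interpreter would consult. The sketch below is only an illustration. The variable name PYTHONENCRYPT comes from the original proposal; the key derivation (SHA-256 truncated to 16 bytes) is a made-up stand-in, since the proposal does not say how the key is generated.

```python
import hashlib
import os


def derive_key(passphrase):
    """Hypothetical key derivation; the proposal does not specify one."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()[:16]


def recover_key(environ=os.environ):
    """Return the key any local process could rebuild, or None if unset."""
    secret = environ.get("PYTHONENCRYPT", "")
    if not secret.strip():
        # The proposal only activates for a non-blank value.
        return None
    return derive_key(secret)
```

Whatever the real derivation is, it has to be reproducible from that variable alone, so it offers no more secrecy than the variable itself.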
From guido at python.org Sun May 13 19:33:07 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2012 10:33:07 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To:
References: <20120512143955.174c4a76@bhuda.mired.org> <4FAF80B4.1040507@pearwood.info>
Message-ID:

--Guido van Rossum (sent from Android phone)

On May 13, 2012 9:43 AM, "Alexander Belopolsky" <alexander.belopolsky at gmail.com> wrote:
>
> On Sun, May 13, 2012 at 11:30 AM, Guido van Rossum wrote:
> > I wasn't aware of the detail of the OP's proposal that the key was
> > right in the user's environment -- but that actually has an exact
> > analogy in the front door example: hiding the key under the mat.
>
> This sounds like an argument not to include this functionality in
> stdlib. If hiding the key under the mat becomes standard, a key under
> the mat will be as inviting as an open front door. Those interested
> in obscurity should not invite public discussion of clandestine
> advantages of doormats over garden rocks.

Agreed, definitely not for the stdlib.

From mwm at mired.org Sun May 13 20:26:19 2012
From: mwm at mired.org (Mike Meyer)
Date: Sun, 13 May 2012 14:26:19 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <4FAF80B4.1040507@pearwood.info>
References: <20120512143955.174c4a76@bhuda.mired.org> <4FAF80B4.1040507@pearwood.info>
Message-ID: <20120513142619.07bce1a8@bhuda.mired.org>

On Sun, 13 May 2012 19:36:52 +1000 Steven D'Aprano wrote:
> Mike Meyer wrote:
> > If it's not clear by now, a fancy encryption scheme won't protect your
> > sources from someone who really wants to read them. On the other hand,
> > shipping just the .pyc/.pyo files will stop casual browsing. The only
> > real difference here is how much effort it takes to get the source.
To
> > carry Guido's analogy further, both lock your front door, one just
> > uses a better lock. Neither will stop a determined burglar.
> I think Guido's analogy is bogus and wrongly suggests that encrypting
> applications just might work if you try hard enough. If we can lock the door
> and keep strangers from peeking inside, why can't we encrypt apps and stop
> people from peeking at the code?

But locking the door *won't* keep strangers from peeking inside. Not if they really want to. It'll keep people from casually opening the door, but it won't stop someone who really wants to see the insides because they can:

> But the analogy doesn't follow. In the front
> door example, untrusted people don't have a key and are forced to pick or
> break the lock to get it.

Exactly. You can easily get tools to do all these things, as well as others, to get past the lock.

> In the encryption example, untrusted people are given the key (as an
> environment variable), then trusted not to use it to read the source
> code!

This is pretty much required in any form of DRM. You have to give the end user the keys in order for them to use what you gave them. Trying to prevent them from then doing *other* things is done by obfuscating how you get from the cyphertext to the plaintext. That it can't work is why the US container companies got laws passed making doing so illegal.

> (Possibly on the assumption that they don't realise they have the key, or that
> using it manually is too difficult for them.)

The difficulty level is immaterial. With the proper training and tools, none of these things (picking locks, breaking down doors, reverse engineering code obfuscation) is difficult. On the other hand, you can raise the difficulty level of any of them by investing more in whatever obstacles you're putting in the way. They both do the same thing. That's why the analogy works.

http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From g.rodola at gmail.com Sun May 13 23:04:38 2012 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Sun, 13 May 2012 23:04:38 +0200 Subject: [Python-ideas] Move tarfile.filemode() into stat module In-Reply-To: References: <20120512164110.27316aec@pitrou.net> Message-ID: 2012/5/12 Terry Reedy : > On 5/12/2012 10:41 AM, Antoine Pitrou wrote: >> >> On Sat, 12 May 2012 16:29:35 +0200 >> Giampaolo Rodolà >> wrote: >>> >>> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304 >>> I discovered this undocumented function by accident different years >>> ago and reused it a couple of times since then. >>> I think that leaving it hidden inside tarfile module is unfortunate. >>> What about moving it into stat module and document it? >> >> >> I don't know which of stat or shutil would be the better recipient, but >> it's a good idea anyway. > > > I think I would more likely look in stat, and as noted below, the constants > used for the table used in the function are already in stat.py. > > I checked, and > > # Bits used in the mode field, values in octal. > #--------------------------------------------------------- > S_IFLNK = 0o120000 # symbolic link > ... > > are only used in > > filemode_table = ( > ((S_IFLNK, "l"), > ... > > which is only used in > > def filemode(mode): ... > > So all three can be cleanly extracted into another module. > > However 1) the bit definitions themselves should just be deleted as they > *duplicate* those in stat.py. The S_Ixxx names are the same, the other names > are variations of the other stat.S_Ixxxx names. So filemode_table (with '_' > added?) could/should be re-written in stat.py to use the public, documented > constants already defined there. > > However 2) stat.py lacks the nice comments explaining the constants in the > file itself, so I *would* copy the comments to the appropriate lines.
> > > There only seems to be one use of the function in tarfile.py: > Line 1998: print(filemode(tarinfo.mode), end=' ') > > All the other uses of 'filemode' are as a local name inside the open method, > derived from its mode parameter: > filemode, comptype = mode.split(":", 1) > > +1 on moving the table (probably with private name, and using the existing, > documented stat S_Ixxxx constants) and function (public) to stat.py. > > -- > Terry Jan Reedy Agreed then. I'm going to submit a patch soon. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From g.rodola at gmail.com Mon May 14 00:29:18 2012 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Mon, 14 May 2012 00:29:18 +0200 Subject: [Python-ideas] Move tarfile.filemode() into stat module In-Reply-To: References: <20120512164110.27316aec@pitrou.net> Message-ID: 2012/5/12 Terry Reedy : > However 2) stat.py lacks the nice comments explaining the constants in the > file itself, so I *would* copy the comments to the appropriate lines. +1 If no one is opposed I'll do that tomorrow. > Add me terry.reedy as nosy and I will help review by checking the > S_Ixxx substitutions. Thanks, will do. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From mikegraham at gmail.com Mon May 14 19:35:13 2012 From: mikegraham at gmail.com (Mike Graham) Date: Mon, 14 May 2012 13:35:13 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: On Fri, May 11, 2012 at 6:27 PM, li wang wrote: > > I want to use python in my product because I like and familiar with > python for many years, but I won't let the customer to read and modify > my code. So the best way is to encrypt my module .py to .pye. The scheme you describe only provides a false sense of security. That would be very bad.
The only ways to protect your code are a) legally, which is the main one, and b) by not giving it to anyone (and making them access things by a remote interface). A very strong -1 from me. Do not provide wrong-headed, insecure features like this. Mike From guido at python.org Mon May 14 19:46:29 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 14 May 2012 10:46:29 -0700 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 10:35 AM, Mike Graham wrote: > On Fri, May 11, 2012 at 6:27 PM, li wang wrote: >> >> I want to use python in my product because I like and familiar with >> python for many years, but I won't let the customer to read and modify >> my code. So the best way is to encrypt my module .py to .pye. > > The scheme you describe only provides a false sense of security. That > would be very bad. You seem to be assuming security by obscurity is worse than no security. I disagree (although I am not defending it as the sole form of security). Many security professionals are not happy unless multiple levels of security are in place, some of which can only be described as obscurity. > The only ways to protect your code are a) legally, which is the main > one, If you look into legal ways of protecting physical property you'll find that having locks, fences etc. is often necessary for legal protection to apply. That's why so often you'll find "no trespassing" signs (in Holland these even have a specific reference to the law on them). > and b) by not giving it to anyone (and making them access things > by a remote interface). > > A very strong -1 from me. Do not provide wrong-headed, insecure > features like this.
I am -1 on including any support for encrypting bytecode in the standard library, for the same reasons that we *removed* Bastion and rexec -- since it cannot be made perfect, we'd be forever open to criticism and possible liability if someone misunderstood the level of security provided. But I am defending the right of users to implement a level of obscurity that they are comfortable with. At the same time it is good to point out the limits of such schemes. -- --Guido van Rossum (python.org/~guido) From mikegraham at gmail.com Mon May 14 20:00:11 2012 From: mikegraham at gmail.com (Mike Graham) Date: Mon, 14 May 2012 14:00:11 -0400 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum wrote: > You seem to be assuming security by obscurity is worse than no > security. I disagree (although I am not defending it as the sole form > of security). Many security professionals are not happy unless > multiple levels of security are in place, some of which can only be > described as obscurity. I would point out: a) It can be worse than no security for the same reason a cotton bulletproof jacket is worse than no bulletproof jacket: it lures you into a false sense of security, and b) The original post asked for a non-obscure, non-secure solution. > If you look into legal ways of protecting physical property you'll > find that having locks, fences etc. is often necessary for legal > protection to apply. That's why so often you'll find "no trespassing" > signs (in Holland these even have a specific reference to the law on > them). This is very true, but I think I might be missing something about your point. Are there places where intellectual property has similar laws or policies? 
Thanks, Mike From bruce at leapyear.org Mon May 14 20:10:27 2012 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 14 May 2012 11:10:27 -0700 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 11:00 AM, Mike Graham wrote: > On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum > wrote: > > If you look into legal ways of protecting physical property you'll > > find that having locks, fences etc. is often necessary for legal > > protection to apply. That's why so often you'll find "no trespassing" > > signs (in Holland these even have a specific reference to the law on > > them). > > This is very true, but I think I might be missing something about your > point. Are there places where intellectual property has similar laws > or policies? > Both patent and copyright law have the concept of 'willful infringement' and 'proper notice'. Taking the right steps to make sure the person receiving your IP is aware of your copyright and patent rights can make them a willful infringer and subject to harsher penalties. Conversely, failure to use proper notices means you have less protection. (It used to be that the mere absence of a copyright notice would put your work in the public domain but that is no longer the case.) If you obfuscate the code, the reader of the code cannot claim that you didn't mind if they read it. It makes your intent clear. While simply compiling source to byte codes obfuscates it to some extent, it doesn't send a clear message that you don't want them to read it. A notice at the front of the file saying that you don't want them to read it might be just as good as obfuscation from that standpoint. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com From mal at egenix.com Mon May 14 21:41:19 2012 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 14 May 2012 21:41:19 +0200 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: References: Message-ID: <4FB15FDF.9030806@egenix.com> Mike Graham wrote: > On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum wrote: >> You seem to be assuming security by obscurity is worse than no >> security. I disagree (although I am not defending it as the sole form >> of security). Many security professionals are not happy unless >> multiple levels of security are in place, some of which can only be >> described as obscurity. > > I would point out: a) It can be worse than no security for the same > reason a cotton bulletproof jacket is worse than no bulletproof > jacket: it lures you into a false sense of security, and b) The > original post asked for a non-obscure, non-secure solution. > >> If you look into legal ways of protecting physical property you'll >> find that having locks, fences etc. is often necessary for legal >> protection to apply. That's why so often you'll find "no trespassing" >> signs (in Holland these even have a specific reference to the law on >> them). > > This is very true, but I think I might be missing something about your > point. Are there places where intellectual property has similar laws > or policies? Yes, see http://en.wikipedia.org/wiki/Anti-circumvention Take e.g. the EU directive text: "...the expression 'technological measures' means any technology, device or component that, in the normal course of its operation, is designed to prevent or restrict acts..." "Technological measures shall be deemed 'effective' where the use of a protected work or other subjectmatter is controlled by the rightsholders through application of an access control or protection process, such as encryption, scrambling or other transformation of the work or other subject-matter or a copy control mechanism, which achieves the protection objective." 
There's an important difference between "security by obscurity" and "protection by obscurity". The first is very hard to achieve. The second is made easy by laws and regulations (because the first doesn't work out too well in practice). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 14 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-07-02: EuroPython 2012, Florence, Italy 49 days to go 2012-04-26: Released mxODBC 3.1.2 http://egenix.com/go28 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ckaynor at zindagigames.com Mon May 14 22:31:11 2012 From: ckaynor at zindagigames.com (Chris Kaynor) Date: Mon, 14 May 2012 13:31:11 -0700 Subject: [Python-ideas] I have an encrypted python module format: .pye In-Reply-To: <4FB15FDF.9030806@egenix.com> References: <4FB15FDF.9030806@egenix.com> Message-ID: On Mon, May 14, 2012 at 12:41 PM, M.-A. Lemburg wrote: > Mike Graham wrote: > > I would point out: a) It can be worse than no security for the same > > reason a cotton bulletproof jacket is worse than no bulletproof > > jacket: it lures you into a false sense of security, and b) The > > original post asked for a non-obscure, non-secure solution. > > > > On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum wrote: > >> If you look into legal ways of protecting physical property you'll > >> find that having locks, fences etc. is often necessary for legal > >> protection to apply. 
That's why so often you'll find "no trespassing" > >> signs (in Holland these even have a specific reference to the law on > >> them). > > > > This is very true, but I think I might be missing something about your > > point. Are there places where intellectual property has similar laws > > or policies? > > Yes, see http://en.wikipedia.org/wiki/Anti-circumvention > > Take e.g. the EU directive text: > > "...the expression 'technological measures' means any technology, device or component that, in the > normal course of its operation, is designed to prevent or restrict acts..." > > "Technological measures shall be deemed 'effective' where the use of a protected work or other > subjectmatter is controlled by the rightsholders through application of an access control or > protection process, such as encryption, scrambling or other transformation of the work or other > subject-matter or a copy control mechanism, which achieves the protection objective." As I read it, the text of the law quoted above would mean that just releasing the pyc files would be enough, as would running the source though an obfuscator. > > There's an important difference between "security by obscurity" and > "protection by obscurity". The first is very hard to achieve. The second > is made easy by laws and regulations (because the first doesn't work out > too well in practice). Chris From fuzzyman at gmail.com Tue May 15 01:28:02 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Tue, 15 May 2012 00:28:02 +0100 Subject: [Python-ideas] Unhelpful error message from sorted Message-ID: Hello all, It seems to me that the following error message, whilst technically correct, is unhelpful: >>> sorted([3, 2, 1], reverse=None) Traceback (most recent call last): File "", line 1, in TypeError: an integer is required Worth creating an issue for? 
Michael -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From pyideas at rebertia.com Tue May 15 02:04:04 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 14 May 2012 17:04:04 -0700 Subject: [Python-ideas] Unhelpful error message from sorted In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 4:28 PM, Michael Foord wrote: > Hello all, > > It seems to me that the following error message, whilst technically correct, > is unhelpful: > >>>> sorted([3, 2, 1], reverse=None) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: an integer is required > > Worth creating an issue for? IMO, yes. Surely a *bool[ean]* value ought to be required. (And mentioning the `reverse` parameter by name would of course also be nice.) Cheers, Chris From tjreedy at udel.edu Tue May 15 08:52:26 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 15 May 2012 02:52:26 -0400 Subject: [Python-ideas] Unhelpful error message from sorted In-Reply-To: References: Message-ID: On 5/14/2012 8:04 PM, Chris Rebert wrote: > On Mon, May 14, 2012 at 4:28 PM, Michael Foord wrote: >> Hello all, >> >> It seems to me that the following error message, whilst technically correct, >> is unhelpful: >> >>>>> sorted([3, 2, 1], reverse=None) >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> TypeError: an integer is required >> >> Worth creating an issue for? > > IMO, yes. Surely a *bool[ean]* value ought to be required. > (And mentioning the `reverse` parameter by name would of course also be nice.) There are still overly cryptic error messages.
I would like to see something more like TypeError: 'reverse' argument must be bool, not NoneType -- Terry Jan Reedy From storchaka at gmail.com Tue May 15 09:16:49 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 15 May 2012 10:16:49 +0300 Subject: [Python-ideas] Unhelpful error message from sorted In-Reply-To: References: Message-ID: <4FB202E1.4060509@gmail.com> On 15.05.12 09:52, Terry Reedy wrote: > There are still overly cryptic error messages. I would like to see > something more like > TypeError: 'reverse' argument must be bool, not NoneType With issue14705 [1], sort can accept reverse=None. [1] http://bugs.python.org/issue14705 From stefan_ml at behnel.de Tue May 15 09:58:27 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 15 May 2012 09:58:27 +0200 Subject: [Python-ideas] Unhelpful error message from sorted In-Reply-To: <4FB202E1.4060509@gmail.com> References: <4FB202E1.4060509@gmail.com> Message-ID: Serhiy Storchaka, 15.05.2012 09:16: > On 15.05.12 09:52, Terry Reedy wrote: >> There are still overly cryptic error messages. I would like to see >> something more like >> TypeError: 'reverse' argument must be bool, not NoneType > > With issue14705 [1], sort can accept reverse=None. > > [1] http://bugs.python.org/issue14705 Looks like a side effect, though. It doesn't make any sense to me to pass None for the "reverse" argument.
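To make the wording proposed in this thread concrete, here is a rough pure-Python sketch of a check that produces that kind of message. `sorted_strict` is a hypothetical name used only for illustration; it is not part of the stdlib, and it is not how CPython itself parses the `reverse` argument (that happens in C).

```python
def sorted_strict(iterable, *, key=None, reverse=False):
    """Hypothetical wrapper around sorted() that insists on a real bool.

    Illustrates the friendlier error message suggested in this thread;
    it is an assumption-laden sketch, not CPython's actual validation.
    """
    if not isinstance(reverse, bool):
        # Name the offending parameter and the type actually received.
        raise TypeError(
            "'reverse' argument must be bool, not %s" % type(reverse).__name__
        )
    return sorted(iterable, key=key, reverse=reverse)


print(sorted_strict([3, 1, 2], reverse=True))
try:
    sorted_strict([3, 1, 2], reverse=None)
except TypeError as exc:
    print(exc)
```

Whether strictness like this is desirable at all is exactly what the thread is debating; the sketch only shows what the strict option would look like to a caller.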
Stefan From steve at pearwood.info Tue May 15 10:02:08 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 15 May 2012 18:02:08 +1000 Subject: [Python-ideas] Unhelpful error message from sorted In-Reply-To: References: Message-ID: <20120515080208.GA30938@ando> On Tue, May 15, 2012 at 12:28:02AM +0100, Michael Foord wrote: > Hello all, > > It seems to me that the following error message, whilst technically > correct, is unhelpful: > > >>> sorted([3, 2, 1], reverse=None) > Traceback (most recent call last): > File "", line 1, in > TypeError: an integer is required I don't know what you mean by "technically correct". Surely the Pythonic idiom is to allow any value in a boolean context. sorted() here is neither one thing nor the other, neither duck-typing, since it won't accept flags that quack like a bool, nor does it strictly insist on a bool, since it accepts ints: >>> sorted([1,2,3], reverse=42) [3, 2, 1] I can't see any sense to this almost-but-not-quite type restriction. +1 to allow any object that is truthy or falsey (i.e. anything). +0 to allowing only True or False. -1 to half-heartedly allowing ints but no other values. -- Steven From fuzzyman at gmail.com Tue May 15 11:01:21 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Tue, 15 May 2012 10:01:21 +0100 Subject: [Python-ideas] Unhelpful error message from sorted In-Reply-To: <20120515080208.GA30938@ando> References: <20120515080208.GA30938@ando> Message-ID: On 15 May 2012 09:02, Steven D'Aprano wrote: > On Tue, May 15, 2012 at 12:28:02AM +0100, Michael Foord wrote: > > Hello all, > > > > It seems to me that the following error message, whilst technically > > correct, is unhelpful: > > > > >>> sorted([3, 2, 1], reverse=None) > > Traceback (most recent call last): > > File "", line 1, in > > TypeError: an integer is required > > I don't know what you mean by "technically correct". Surely the Pythonic > idiom is to allow any value in a boolean context. 
> > sorted() here is neither one thing nor the other, neither duck-typing, > since it won't accept flags that quack like a bool, nor does it strictly > insist on a bool, since it accepts ints: > > >>> sorted([1,2,3], reverse=42) > [3, 2, 1] > > I can't see any sense to this almost-but-not-quite type restriction. > > +1 to allow any object that is truthy or falsey (i.e. anything). > I would rather have "sorted(some_list, reverse=[1, 2, 3])" raise an error (and preferably a helpful error message that tells you which argument is faulty and why). Michael > +0 to allowing only True or False. > -1 to half-heartedly allowing ints but no other values. > > > -- > Steven > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From Suriaprakash.Mariappan at smsc.com Tue May 15 12:46:20 2012 From: Suriaprakash.Mariappan at smsc.com (Suriaprakash.Mariappan at smsc.com) Date: Tue, 15 May 2012 16:16:20 +0530 Subject: [Python-ideas] input function: built-in space between string and user-input Message-ID: print function: built-in space between string and variable: The below python code, length = 5 print('Length is', length) gives an output of Length is 5 Even though we have not specified a space between 'Length is' and the variable length, Python puts it for us so that we get a clean nice output and the program is much more readable this way (since we don't need to worry about spacing in the strings we use for output). This is surely an example of how Python makes life easy for the programmer.
input function: built-in space between string and user-input: However, the below python code, guess = int(input('Enter an integer')) gives an output of Enter an integer7 [Note: Assume 7 is entered by the user.] Suggestion: Similar to the print function, for the input function also, it would be nice to have Python put a space between the string and the user input, so that the output in the above case will be more readable as below. Enter an integer 7 Thanks and Regards, Suriaprakash M, Principal Engineer - Software, Standard Microsystems India Pvt. Ltd., Module 1, 4th Floor, Block A, SP Infocity, #40, MGR Salai, Perungudi, Chennai - 600 096, Tamil Nadu, INDIA. Email: Suriaprakash.Mariappan at smsc.com Mobile :+919381453832 Skype ID: msuriaprakash From ericsnowcurrently at gmail.com Tue May 15 18:26:35 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 15 May 2012 10:26:35 -0600 Subject: [Python-ideas] [Python-Dev] sys.implementation In-Reply-To: References: <20120426103150.4898a678@limelight.wooz.org> <4FAA3FA7.5070808@v.loewis.de> <20120509165039.23c8bf56@pitrou.net> <20120509095311.3a2c25c2@resist> <20120510105749.7401f1d2@pitrou.net> Message-ID: At this point I'm pretty comfortable with where PEP 421 is at. Before asking for pronouncement, I'd like to know if anyone has any outstanding concerns that should be addressed first. The only (relatively) substantial point of debate has been the type for sys.implementation. The PEP now limits the specification of the type to the minimum (Big-Endian vs. Little...er...attribute-access vs mapping). If anyone objects to the decision there to go with attribute-access, please make your case. From my point of view either one would be fine for what we need and attribute-access is more representative of the fixed namespace. Unless there is a really good reason to use a mapping, I'd like to stick with that. Thanks!
-eric From tjreedy at udel.edu Tue May 15 23:19:49 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 15 May 2012 17:19:49 -0400 Subject: [Python-ideas] input function: built-in space between string and user-input In-Reply-To: References: Message-ID: On 5/15/2012 6:46 AM, Suriaprakash.Mariappan at smsc.com wrote: > *_print function: built-in space between string and variable:_* > > The below python code, > > */length = 5/* > */print('Length is', length)/* > > gives an output of > > */Length is 5/* The */.../* and *_..._* bracketing makes your post harder to read. Perhaps this is used in India, but not elsewhere. Omit next time. > Even though we have not specified a space between 'Length is' and the > variable length, Python puts it for us so that we get a clean nice > output and the program is much more readable this way (since we don't > need to worry about spacing in the strings we use for output). This is > surely an example of how Python makes life easy for the programmer. > > *_input function: built-in space between string and user-input:_* > > However, the below python code, > > */guess = int(input('Enter an integer'))/* > > gives an output of > > */Enter an integer7/* > > [Note: Assume 7 is entered by the user.] > > *Suggestion: *Similar to the printf function, for the input function > also, it will be nice to have the Python put a space between string and > user-input, so that the output in the above case will be more readable > as below. > > */Enter an integer 7/* print() converts objects to strings and adds separators and a terminator before writing to outfile.write(). In 3.x, the separator, terminator, and outfile can all be changed from the default. The user is stuck with the fact that str(obj) is what it is, so it is handy to automatically tack something on. input() directly writes a prompt string with sys.stdout.write. There is no need to augment that as the user can make the prompt string be whatever they want.
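A small sketch of the distinction drawn above: print() supplies its own separator between arguments, whereas the prompt given to input() is written out as-is, so any spacing belongs in the prompt string itself. `ask_int`, its `reader` parameter and `fake_reader` are hypothetical names introduced here only so the example runs without a terminal.

```python
def ask_int(prompt, reader=input):
    """Ask for an integer; spacing is controlled entirely by `prompt`.

    `reader` is a hypothetical hook (defaulting to the built-in input)
    so the function can be exercised without an interactive session.
    """
    return int(reader(prompt))


length = 5
print('Length is', length)          # print() inserts sep=' ' between arguments
print('Length is', length, sep='')  # the separator can be overridden

# Stand-in for a user typing 7; it also records the prompt it was shown.
prompts = []


def fake_reader(prompt):
    prompts.append(prompt)
    return '7'


# The trailing space in the prompt is what separates it from the typed reply.
guess = ask_int('Enter an integer ', reader=fake_reader)
print(guess, prompts)
```

In real use the call would simply be `ask_int('Enter an integer ')`, and the terminal would show the prompt followed by whatever the user types.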
In any case, a change would break back-compatibility. -- Terry Jan Reedy From mwm at mired.org Wed May 16 08:32:15 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 02:32:15 -0400 Subject: [Python-ideas] get method for sets? Message-ID: <20120516023215.4699c0b4@bhuda.mired.org> Is there some reason that there isn't a straightforward way to get an element from a set without removing it? Everything I find either requires multiple statements or converting the set to another data type. It seems that some kind of get method would be useful. The argument that "getting an arbitrary element from a set isn't useful" is refuted by 1) the existence of the pop method, which does just that, and 2) the fact that I (and a number of other people) have run into such a need. My search for such a reason kept finding people asking how to get an element instead. Of course, my key words (set and get) are heavily overloaded. thanks, http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From steve at pearwood.info Wed May 16 08:58:43 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 16 May 2012 16:58:43 +1000 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516023215.4699c0b4@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> Message-ID: <20120516065843.GA2542@ando> On Wed, May 16, 2012 at 02:32:15AM -0400, Mike Meyer wrote: > Is there some reason that there isn't a straightforward way to get an > element from a set without removing it? Everything I find either > requires multiple statements or converting the set to another data > type. > > It seems that some kind of get method would be useful. The argument > that "getting an arbitrary element from a set isn't useful" is refuted > by 1) the existence of the pop method, which does just that, pop returns an arbitrary element, and removes it. 
That's a very different operation to "get this element from the set". The problem is, if there was a set.get(x) method, you have to pass x as argument, and it returns, what? x. So what's the point? You already have the return value before you call the function. def get(s, x): """Return element x from set s.""" if x in s: return x raise KeyError('not found') As I see it, this is only remotely useful if: - you care about identity, e.g. caching/interning - you care about types, e.g. get(s, 42) may return 42.0 as the element of the set instead. In either case, a dict is the more obvious data structure to use. def intern(d, x): """Intern element x in dict d, and return the interned version.""" return d.setdefault(x, x) But with sets? Seems pretty pointless to me. I can't help but feel that set.get() is a poorly thought out operation, much requested but rarely useful. > and 2) > the fact that I (and a number of other people) have run into such a > need. > > My search for such a reason kept finding people asking how > to get an element instead. Of course, my key words (set and get) are > heavily overloaded. What's your use-case? -- Steven From dirkjan at ochtman.nl Wed May 16 09:01:04 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Wed, 16 May 2012 09:01:04 +0200 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516065843.GA2542@ando> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516065843.GA2542@ando> Message-ID: On Wed, May 16, 2012 at 8:58 AM, Steven D'Aprano wrote: > pop returns an arbitrary element, and removes it. That's a very > different operation to "get this element from the set". > > The problem is, if there was a set.get(x) method, you have to pass x as > argument, and it returns, what? x. So what's the point? You already have > the return value before you call the function. I understood Mike's message to be proposing an argument-less .get() method. 
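For reference, an argument-less "get" of the kind described here can be sketched on top of ordinary iteration. `peek` is a hypothetical helper name, not an existing set method, and which element comes back is arbitrary.

```python
def peek(s):
    """Return an arbitrary element of set `s` without removing it.

    Hypothetical helper; it raises KeyError on an empty set, mirroring
    the exception set.pop() raises in that situation.
    """
    for item in s:
        return item
    raise KeyError('peek from an empty set')


colours = {'red', 'green', 'blue'}
item = peek(colours)
print(item in colours, len(colours))  # the set is left untouched
```

The one-liner `next(iter(s))` does the same job for a non-empty set, but raises StopIteration rather than KeyError when the set is empty.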
Cheers, Dirkjan From bruce at leapyear.org Wed May 16 09:02:31 2012 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 16 May 2012 00:02:31 -0700 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516023215.4699c0b4@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> Message-ID: On Tue, May 15, 2012 at 11:32 PM, Mike Meyer wrote: > Is there some reason that there isn't a straightforward way to get an > element from a set without removing it? Everything I find either > requires multiple statements or converting the set to another data > type. > > It seems that some kind of get method would be useful. The argument > that "getting an arbitrary element from a set isn't useful" is refuted > by 1) the existence of the pop method, which does just that, and 2) > the fact that I (and a number of other people) have run into such a > need. > Your request needs clarification. What does set.get do? What is the actual use case? I understand what pop does: it removes and returns an arbitrary member of the set. Therefore, if I call pop repeatedly, I eventually get all the members. That's useful. Here's one definition of get: def get_from_set1(s): """Return an arbitrary member of a set.""" return min(s, key=hash) How is this useful? Or do you mean instead: check whether an element is in the set and return it, otherwise return a default value: def get_from_set2(s, v, d=None): """Returns v if v is in the set, otherwise returns d.""" return v if v in s else d I suppose this could be useful but it's a one liner and seems much less obvious what it does than dict.get. Or did you mean something else? --- Bruce From mwm at mired.org Wed May 16 09:10:35 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 03:10:35 -0400 Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516065843.GA2542@ando> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516065843.GA2542@ando> Message-ID: <20120516031035.5974b5b3@bhuda.mired.org> On Wed, 16 May 2012 16:58:43 +1000 Steven D'Aprano wrote: > On Wed, May 16, 2012 at 02:32:15AM -0400, Mike Meyer wrote: > > Is there some reason that there isn't a straightforward way to get an > > element from a set without removing it? Everything I find either > > requires multiple statements or converting the set to another data > > type. > > It seems that some kind of get method would be useful. The argument > > that "getting an arbitrary element from a set isn't useful" is refuted > > by 1) the existence of the pop method, which does just that, > pop returns an arbitrary element, and removes it. That's a very > different operation to "get this element from the set". > The problem is, if there was a set.get(x) method, you have to pass x as > argument, and it returns, what? x. So what's the point? You already have > the return value before you call the function. I guess I should have been explicit about what I was asking about. I'm not asking for set.get(x) that returns "this element", I'm asking for set.get() that returns an arbitrary element, like set.pop(), but without removing it. It doesn't even need to be the same element that set.pop() would return. The name is probably a poor choice, but I'm not sure what else it should be. pop_without_remove seems a bit verbose, and implies that it might return the element a pop would. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Wed May 16 09:11:28 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 May 2012 17:11:28 +1000 Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516023215.4699c0b4@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> Message-ID: On Wed, May 16, 2012 at 4:32 PM, Mike Meyer wrote: > Is there some reason that there isn't a straightforward way to get an > element from a set without removing it? Everything I find either > requires multiple statements or converting the set to another data > type. > > It seems that some kind of get method would be useful. The argument > that "getting an arbitrary element from a set isn't useful" is refuted > by 1) the existence of the pop method, which does just that, and 2) > the fact that I (and a number of other people) have run into such a > need. > > My search for such a reason kept finding people asking how > to get an element instead. Of course, my key words (set and get) are > heavily overloaded. The two primary use cases handled by the current interface are: 1. Do something for all items in the set (iteration) 2. Do something for an arbitrary item in the set, and keep track of which items remain (set.pop) Now, at the iterator level, it is possible to turn "do something for all items in an iterable" to "do something for the *first* item in the iterable" via "next(iter(obj))". Since this use case is already covered by the iterator protocol, the question then becomes: Is there a specific reason a dedicated set-specific solution is needed rather than better educating people that "the first item" is an acceptable answer when the request is for "an arbitrary item" (this is particularly true in a world where set ordering is randomised by default)? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Wed May 16 09:26:45 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 16 May 2012 17:26:45 +1000 Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516031035.5974b5b3@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516065843.GA2542@ando> <20120516031035.5974b5b3@bhuda.mired.org> Message-ID: <20120516072645.GB2542@ando> On Wed, May 16, 2012 at 03:10:35AM -0400, Mike Meyer wrote: > I guess I should have been explicit about what I was asking about. :) > I'm not asking for set.get(x) that returns "this element", I'm asking > for set.get() that returns an arbitrary element, like set.pop(), but > without removing it. It doesn't even need to be the same element that > set.pop() would return. Could this helper function not do the job? def get(s): x = s.pop() s.add(x) return x Of course, this does not guarantee that repeated calls to get() won't return the same result over and over again. If that's unacceptable, you'll need to specify what behaviour is acceptable -- i.e. what your functional requirements are. E.g. "I need the element to be selected at random." "I don't need randomness, returning the elements in some arbitrary but deterministic order will do, with no repeats or cycles." "I don't care whether or not there are repeats, so long as the same element is not returned twice in a row." "Once I've seen every element, I expect get() to raise an exception." etc. And I guarantee that whatever your requirements are, other people will want something different. Once you have your requirements, you can start thinking about implementation (e.g. how does the set remember which elements have already been get'ed?). > The name is probably a poor choice, but I'm not sure what else it > should be. pop_without_remove seems a bit verbose, and implies that it > might return the element a pop would. Are you suggesting that get() and pop() should not return the same element? -- Steven From ben+python at benfinney.id.au Wed May 16 09:39:10 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 16 May 2012 17:39:10 +1000 Subject: [Python-ideas] get method for sets?
References: <20120516023215.4699c0b4@bhuda.mired.org> Message-ID: <87vcjw8ti9.fsf@benfinney.id.au> Mike Meyer writes: > Is there some reason that there isn't a straightforward way to get an > element from a set without removing it? Everything I find either > requires multiple statements or converting the set to another data > type. With a mapping, you use a key to get an item. With a sequence, you have an index and get an item. Sets are unordered collections of items without indices or keys. What does it mean to you to “get” an item from that? If you mean “get the items one by one”, a set is an iterable:: for item in foo_set: do_something_with(item) If you mean “test whether an item is in the set”, the ‘in’ operator works:: if item in foo_set: do_something() If you mean “get a specific item from a set”, the only way to do that is to *already have* the specific item and test whether it's in the set. > It seems that some kind of get method would be useful. The argument > that "getting an arbitrary element from a set isn't useful" is refuted > by 1) the existence of the pop method, which does just that, and 2) > the fact that I (and a number of other people) have run into such a > need. If by “get” you mean to get an *arbitrary* item, not a specific item, then what's the problem? You already have ‘set.pop’, as you point out. What need do you have that isn't being fulfilled by the existing methods and operators? Can you show some actual code that would be improved by a “get” operation on sets? -- \ “It's a terrible paradox that most charities are driven by | `\ religious belief.… if you think altruism without Jesus is not | _o__) altruism, then you're a dick.” -- Tim Minchin, 2010-11-28 | Ben Finney From jeanpierreda at gmail.com Wed May 16 09:38:57 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 16 May 2012 03:38:57 -0400 Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516072645.GB2542@ando> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516065843.GA2542@ando> <20120516031035.5974b5b3@bhuda.mired.org> <20120516072645.GB2542@ando> Message-ID: On Wed, May 16, 2012 at 3:26 AM, Steven D'Aprano wrote: > Are you suggesting that get() and pop() should not return the same > element? He is suggesting that "It doesn't even need to be the same element that set.pop() would return." -- Devin From mwm at mired.org Wed May 16 09:40:34 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 03:40:34 -0400 Subject: [Python-ideas] get method for sets? In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> Message-ID: <20120516034034.048f2eaa@bhuda.mired.org> On Wed, 16 May 2012 00:02:31 -0700 Bruce Leban wrote: > On Tue, May 15, 2012 at 11:32 PM, Mike Meyer wrote: > > Is there some reason that there isn't a straightforward way to get an > > element from a set without removing it? Everything I find either > > requires multiple statements or converting the set to another data > > type. > > > > It seems that some kind of get method would be useful. The argument > > that "getting an arbitrary element from a set isn't useful" is refuted > > by 1) the existence of the pop method, which does just that, and 2) > > the fact that I (and a number of other people) have run into such a > > need. > Your request needs clarification. What does set.get do? What is the actual > use case? I understand what pop does: it removes and returns an arbitrary > member of the set. Therefore, if I call pop repeatedly, I eventually get > all the members. That's useful. So is just getting a single member: > Here's one definition of get: > def get_from_set1(s): > """Return an arbitrary member of a set.""" > return min(s, key=hash) From poking around, at least at one time the fastest implementation was the very confusing: def get_from_set(s): for x in s: return x > How is this useful? 
Basically, anytime you want to examine an arbitrary element of a set, and would use pop, except you need to preserve the set for future use. In my case, I'm running a series of tests on the set, and some tests need an element. Again, looking for a reason for this not existing turned up other cases where people were wondering how to do this. Hmm. Maybe the name should be item? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mwm at mired.org Wed May 16 09:52:01 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 03:52:01 -0400 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516072645.GB2542@ando> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516065843.GA2542@ando> <20120516031035.5974b5b3@bhuda.mired.org> <20120516072645.GB2542@ando> Message-ID: <20120516035201.2fb0b3f6@bhuda.mired.org> On Wed, 16 May 2012 17:26:45 +1000 Steven D'Aprano wrote: > On Wed, May 16, 2012 at 03:10:35AM -0400, Mike Meyer wrote: > > I guess I should have been explicit about what I was asking about. > > :) > > > I'm not asking for set.get(x) that returns "this element", I'm asking > > for set.get() that returns an arbitrary element, like set.pop(), but > > without removing it. It doesn't even need to be the same element that > > set.pop() would return. > > Could this helper function not do the job? > > def get(s): > x = s.pop() > s.add(x) > return x Sure, if you don't mind munging the set unnecessarily. That's more readable, but slower and longer than: def get(s): for x in s: return x > Of course, this does not guarantee that repeated calls to get() won't > return the same result over and over again. If that's unacceptable, > you'll need to specify what behaviour is acceptable -- i.e. what your > functional requirements are. E.g. > "I need the element to be selected at random."
> "I don't need randomness, returning the elements in some arbitrary > but deterministic order will do, with no repeats or cycles." > "I don't care whether or not there are repeats, so long as the same > element is not returned twice in a row." > "Once I've seen every element, I expect get() to raise an exception." > etc. My requirements are "I need an element from the set". The behavior of repeated calls is immaterial. > And I guarantee that whatever your requirements are, other people will > want something different. That's not what I found in my google results. They were all pretty much asking for what I was asking for, and didn't care what happened beyond the first call. I believe you're assuming that the purpose of this method is to start an iteration through the set. That's not the case at all, and a single call to pop would be perfectly acceptable, except I would then need to put the element back. If that were the purpose, I'd agree with you - between iteration and pop, we've covered most of the ways you might want to iterate through the elements. But the point isn't to iterate through the elements, it's to examine a single element. > Once you have your requirements, you can start thinking about > implementation (e.g. how does the set remember which elements have > already been get'ed?). Doesn't need to, because it doesn't matter. > > The name is probably a poor choice, but I'm not sure what else it > > should be. pop_without_remove seems a bit verbose, and implies that it > > might return the element a pop would. > Are you suggesting that get() and pop() should not return the same > element? I'm suggesting there is no requirement that they return the same element. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Wed May 16 10:08:11 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 May 2012 18:08:11 +1000 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516034034.048f2eaa@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> Message-ID: On Wed, May 16, 2012 at 5:40 PM, Mike Meyer wrote: > From poking around, at least at one time the fastest implementation > was the very confusing: > > def get_from_set(s): > for x in s: > return x Why is this confusing? The operation you want to perform is "give me an object from this set, I don't care which one". That's not an operation that applies just to sets, you can do it with an iterable, therefore the spelling is one that works with any iterable: next(iter(s)) This entire thread is like asking for s.length() when len(s) already works. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From bruce at leapyear.org Wed May 16 10:08:09 2012 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 16 May 2012 01:08:09 -0700 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516034034.048f2eaa@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> Message-ID: On Wed, May 16, 2012 at 12:40 AM, Mike Meyer wrote: > On Wed, 16 May 2012 00:02:31 -0700 > Bruce Leban wrote: > > > Here's one definition of get: > > def get_from_set1(s): > > """Return an arbitrary member of a set.""" > > return min(s, key=hash) > > From poking around, at least at one time the fastest implementation > was the very confusing: > > def get_from_set(s): > for x in s: > return x > I didn't claim it was fast. I actually wrote that version instead of the in/return version for a very specific reason: it always returns the same element.
(The for/in/return version might return the same element every time too but it's not guaranteed.) > How is this useful? > > Basically, anytime you want to examine an arbitrary element of a set, > and would use pop, except you need to preserve the set for future > use. In my case, I'm running a series of tests on the set, and some > tests need an element. > > That's bordering on tautological. It's useful anytime you need it. I don't think your test is very good if it uses the get I wrote above. Your test will only operate on one element of the set and it's easy to write functions which succeed for some elements of the set and fail for others. I'd like to see an actual test that you think needs this that would not be improved by iterating over the list. > Again, looking for a reason for this not existing turned up other > cases where people were wondering how to do this. > Things are added to APIs and libraries because they are useful, not because people wonder why they aren't there. set.get as you propose is not sufficiently analogous to dict.get or list.__getitem__. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com From mwm at mired.org Wed May 16 10:09:41 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 04:09:41 -0400 Subject: [Python-ideas] get method for sets? In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> Message-ID: <20120516040941.0330975d@bhuda.mired.org> On Wed, 16 May 2012 17:11:28 +1000 Nick Coghlan wrote: > On Wed, May 16, 2012 at 4:32 PM, Mike Meyer wrote: > > Is there some reason that there isn't a straightforward way to get an > > element from a set without removing it? Everything I find either > > requires multiple statements or converting the set to another data > > type. > > It seems that some kind of get method would be useful.
The argument > > that "getting an arbitrary element from a set isn't useful" is refuted > > by 1) the existence of the pop method, which does just that, and 2) > > the fact that I (and a number of other people) have run into such a > > need. > > My search for such a reason kept finding people asking how > > to get an element instead. Of course, my key words (set and get) are > > heavily overloaded. > The two primary use cases handled by the current interface are: > 1. Do something for all items in the set (iteration) > 2. Do something for an arbitrary item in the set, and keep track of > which items remain (set.pop) Neither of which fits my use case. > Since this use case is already covered by the iterator protocol, the > question then becomes: Is there a specific reason a dedicated > set-specific solution is needed rather than better educating people > that "the first item" is an acceptable answer when the request is for > "an arbitrary item" (this is particularly true in a world where set > ordering is randomised by default)? Because next(iter(s)) makes the reader wonder "Why is this iterator being created?" It's a less expensive form of writing list(s)[0]. It's also sufficiently non-obvious that the closest I found on google for a discussion of the issue was the "for x in s: break" variant. Which makes me think that at the very least, this idiom ought to be mentioned in the documentation. Or if it's already there, then a pointer added to the set documentation. But my question was actually whether or not there was a reason for it not existing. Has there been a previous discussion of this? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mwm at mired.org Wed May 16 10:12:38 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 04:12:38 -0400 Subject: [Python-ideas] get method for sets? 
In-Reply-To: <87vcjw8ti9.fsf@benfinney.id.au> References: <20120516023215.4699c0b4@bhuda.mired.org> <87vcjw8ti9.fsf@benfinney.id.au> Message-ID: <20120516041238.2ef36576@bhuda.mired.org> On Wed, 16 May 2012 17:39:10 +1000 Ben Finney wrote: > If by “get” you mean to get an *arbitrary* item, not a specific item, > then what's the problem? You already have ‘set.pop’, as you point out. And, as I also pointed out, it's not useful in the case where you need to preserve the set for future use. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From p.f.moore at gmail.com Wed May 16 10:34:52 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 16 May 2012 09:34:52 +0100 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516040941.0330975d@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> Message-ID: On 16 May 2012 09:09, Mike Meyer wrote: > Because next(iter(s)) makes the reader wonder "Why is this iterator > being created?" It's a less expensive form of writing list(s)[0]. It's > also sufficiently non-obvious that the closest I found on google for a > discussion of the issue was the "for x in s: break" variant. Which > makes me think that at the very least, this idiom ought to be > mentioned in the documentation. Or if it's already there, then a > pointer added to the set documentation. I guess a doc patch adding a comment in the documentation of set.pop that if you want an arbitrary element of a set *without* removing it, then next(iter(s)) will give it to you, would be reasonable. Maybe you could write one? But I don't think it's particularly difficult to understand. It's very Python-specific, sure, but it feels idiomatic to me. Paul. From stephen at xemacs.org Wed May 16 10:42:49 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Wed, 16 May 2012 17:42:49 +0900 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516035201.2fb0b3f6@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516065843.GA2542@ando> <20120516031035.5974b5b3@bhuda.mired.org> <20120516072645.GB2542@ando> <20120516035201.2fb0b3f6@bhuda.mired.org> Message-ID: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp> Mike Meyer writes: > On Wed, 16 May 2012 17:26:45 +1000 > > Could this helper function not do the job? > > > > def get(s): > > x = s.pop() > > s.add(x) > > return x > > Sure, if you don't mind munging the set unnecessarily. That's more > readable, but slower and longer than: > > def get(s): > for x in s: > return x Why would you mind munging the set temporarily? Why is speed (of something that almost by definition is undefined if repeated) important? Your example use case of testing doesn't motivate these parts of your requirements. I'm -1 on adding a method that has no motivation in production that I can see. Just redefine your get() function as a function, with a more appropriate name such as "get_item_nondeterministically". It will work on any iterable. (Don't forget to document that it will "use up" an item if the iterable is not a sequence, though.) From mwm at mired.org Wed May 16 10:46:28 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 04:46:28 -0400 Subject: [Python-ideas] get method for sets?
In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> Message-ID: <20120516044628.2aa6dff9@bhuda.mired.org> On Wed, 16 May 2012 01:08:09 -0700 Bruce Leban wrote: > On Wed, May 16, 2012 at 12:40 AM, Mike Meyer wrote: > > On Wed, 16 May 2012 00:02:31 -0700 > > Bruce Leban wrote: > > > Here's one definition of get: > > > def get_from_set1(s): > > > """Return an arbitrary member of a set.""" > > > return min(s, key=hash) > > > > From poking around, at least at one time the fastest implementation > > was the very confusing: > > > > def get_from_set(s): > > for x in s: > > return x > I didn't claim it was fast. I actually wrote that version instead of the > in/return version for a very specific reason: it always returns the same > element. (The for/in/return version might return the same element every > time too but it's not guaranteed.) I didn't ask for a get that would always return the same element. > > Basically, anytime you want to examine an arbitrary element of a set, > > and would use pop, except you need to preserve the set for future > > use. In my case, I'm running a series of tests on the set, and some > > tests need an element. > That's bordering on tautological. It's useful anytime you need it. What do you expect? We've got a container type that has no way to examine an element without modifying the container or wrapping it in an object of another type. The general use case is that you don't need the facilities of the wrapping type (except for their ability to provide a single element) and you don't want to modify the container. Would you also complain that having int accept a string value in lieu of using eval on untrusted input is a case of "it's useful anytime you need it."? > I don't think your test is very good if it uses the get I wrote > above.
Your test will only operate on one element of the set and > it's easy to write functions which succeed for some elements of the > set and fail for others. I'd like to see an actual test that you > think needs this that would not be improved by iterating over the > list. Talk about tautologies! Of course you can write tests that will fail in some cases. You can also write tests that won't fail for your cases. Especially if you know something about the set beforehand. For instance, I happen to know I have a set of ElementTree elements that all have the same tag. I want to check the tag. One of the test cases starts by checking to see if the set is a singleton. Do you really propose something like: if len(s) == 1: for i in s: res = process(i) or: if len(s) == 1: res = process(list(s)[0]) Or, as suggested elsewhere: if len(s) == 1: res = process(next(iter(s))) Or, just to be really obtuse: if len(s) == 1: res = process(set(s).pop()) All of these require creating an intermediate object for the sole purpose of getting an item out of the container without destroying the container. This leads the reader to wonder why it was created, which clashes with pretty much everything else in python. The only really palatable version is sufficiently obscure that nobody else who needed this API found it, or had it suggested to them by those responding to their question. > > Again, looking for a reason for this not existing turned up other > > cases where people were wondering how to do this. > Things are added to APIs and libraries because they are useful, not because > people wonder why they aren't there. set.get as you propose is not > sufficiently analogous to dict.get or list.__getitem__. People wonder why an API is not there because they need it and can't find it. I have a use for this API, so clearly it's useful. When I didn't find it, I figured there might be a reason for it not existing, so I asked the google djinn. 
Rather than providing a reason (djinn being noticeably untrustworthy), it turned up other people who had the same need as I did. I then asked this list the same question. Instead of getting an answer to that question, I get a bunch of claims that "I don't really need that", making me wonder if I've invoked another djinn. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From stephen at xemacs.org Wed May 16 11:23:37 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 16 May 2012 18:23:37 +0900 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516044628.2aa6dff9@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <20120516044628.2aa6dff9@bhuda.mired.org> Message-ID: <87havgqy1y.fsf@uwakimon.sk.tsukuba.ac.jp> Mike Meyer writes: > of getting an answer to that question, I get a bunch of claims that "I > don't really need that", making me wonder if I've invoked another > djinn. Indeed, you asked whether there's a reason it's not in the stdlib, and the bunch of answers you got is the stdlib really doesn't need it, so it's not there. Maybe some of them were rash enough to also claim that *you* don't really need it, but that's sort of off topic in this thread---just ignore them and use the recipe you find most useful! 
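For reference, the recipes traded back and forth in this thread can be sketched side by side. This is only an illustration: the helper names (first_by_next, first_by_loop, peek_via_pop) are made up here and are not part of any proposal or of the standard library. Each one returns an arbitrary element and leaves the set with the same contents it started with.

```python
def first_by_next(iterable):
    """Nick's spelling: works with any iterable (hypothetical helper name)."""
    # Raises StopIteration if the iterable is empty.
    return next(iter(iterable))


def first_by_loop(iterable):
    """The for/return spelling (hypothetical helper name)."""
    for x in iterable:
        return x
    # Without this, an empty iterable would silently return None.
    raise ValueError("empty iterable")


def peek_via_pop(s):
    """Steven's spelling: pop an element, then put it straight back."""
    x = s.pop()  # raises KeyError if the set is empty
    s.add(x)
    return x


s = {"spam", "eggs", "ham"}
before = set(s)
for recipe in (first_by_next, first_by_loop, peek_via_pop):
    item = recipe(s)
    # Whichever element comes back, the set's contents are unchanged.
    assert item in before and s == before
```

None of these promises which element is returned, and the first two will consume an item if handed a one-shot iterator rather than a set.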
From fuzzyman at gmail.com Wed May 16 11:28:54 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Wed, 16 May 2012 10:28:54 +0100 Subject: [Python-ideas] input function: built-in space between string and user-input In-Reply-To: References: Message-ID: On 15 May 2012 22:19, Terry Reedy wrote: > On 5/15/2012 6:46 AM, Suriaprakash.Mariappan at smsc.com wrote: > >> *_print function: built-in space between string and variable:_* >> >> The below python code, >> >> */length = 5/* >> */print('Length is', length)/* >> >> gives an output of >> >> */Length is 5/* >> > > The */.../* and *_..._* bracketing makes your post harder to read. Perhaps > this is used in India, but not elsewhere. Omit next time. > They weren't present in the version I read. Probably a consequence of your mail client not being able to display formatted emails. Michael > > Even though we have not specified a space between 'Length is' and the >> variable length, Python puts it for us so that we get a clean nice >> output and the program is much more readable this way (since we don't >> need to worry about spacing in the strings we use for output). This is >> surely an example of how Python makes life easy for the programmer. >> >> *_input function: built-in space between string and user-input:_* >> >> >> However, the below python code, >> >> */guess = int(input('Enter an integer'))/* >> >> gives an output of >> >> */Enter an integer7/* >> >> >> [Note: Assume 7 is entered by the user.] >> >> *Suggestion: *Similar to the printf function, for the input function >> >> also, it will be nice to have the Python put a space between string and >> user-input, so that the output in the above case will be more readable >> as below. >> >> */Enter an integer 7/* >> > > print() converts objects to strings and adds separators and a terminator > before writing to outfile.write(). In 3.x, the separator, terminator, and > outfile can all be changed from the default.
The user is stuck with the > fact that str(obj) is what it is, so it is handy to automatically tack > something on. > > input() directly writes a prompt string with sys.stdout.write. > There is no need to augment that as the user can make the prompt string > be whatever they want. In any case, a change would break back-compatibility. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From ned at nedbatchelder.com Wed May 16 12:27:38 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 16 May 2012 06:27:38 -0400 Subject: [Python-ideas] input function: built-in space between string and user-input In-Reply-To: References: Message-ID: <4FB3811A.4090601@nedbatchelder.com> On 5/15/2012 5:19 PM, Terry Reedy wrote: > On 5/15/2012 6:46 AM, Suriaprakash.Mariappan at smsc.com wrote: >> *_print function: built-in space between string and variable:_* >> >> The below python code, >> >> */length = 5/* >> */print('Length is', length)/* >> >> gives an output of >> >> */Length is 5/* > > The */.../* and *_..._* bracketing makes your post harder to read. > Perhaps this is used in India, but not elsewhere. Omit next time. That's your mail client's rendering of bold-italic and bold-underscored text from the HTML version of the original email. --Ned. From mikegraham at gmail.com Wed May 16 16:44:20 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 16 May 2012 10:44:20 -0400 Subject: [Python-ideas] get method for sets?
In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> Message-ID: On Wed, May 16, 2012 at 4:08 AM, Nick Coghlan wrote: > That's not an operation that applies just to sets, you can do it with > an iterable, therefore the spelling is one that works with any > iterable: next(iter(s)) It sounds like you're re-implementing the venerable ,= operator. ,= is one of my favorite operators in Python. You know, >>> s set([42]) >>> item ,= s >>> item 42 ;^), Mike From masklinn at masklinn.net Wed May 16 16:51:09 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 16 May 2012 16:51:09 +0200 Subject: [Python-ideas] get method for sets? In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> Message-ID: <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net> On 2012-05-16, at 16:44 , Mike Graham wrote: > On Wed, May 16, 2012 at 4:08 AM, Nick Coghlan wrote: >> That's not an operation that applies just to sets, you can do it with >> an iterable, therefore the spelling is one that works with any >> iterable: next(iter(s)) > > It sounds like you're re-implementing the venerable ,= operator. ,= is > one of my favorite operators in Python. > > You know, >>>> s > set([42]) >>>> item ,= s >>>> item > 42 With the difference that ,= also asserts there is only one item in the iterable, where `next . iter` only does `head`. (but the formatting as a single operator is genius, I usually write it as `item, = s` and the lack of clarity bothers me, thanks for that visual trick) From mikegraham at gmail.com Wed May 16 16:57:55 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 16 May 2012 10:57:55 -0400 Subject: [Python-ideas] get method for sets? 
In-Reply-To: <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net> Message-ID: On Wed, May 16, 2012 at 10:51 AM, Masklinn wrote: > With the difference that ,= also asserts there is only one item in the > iterable, where `next . iter` only does `head`. > > (but the formatting as a single operator is genius, I usually write it > as `item, = s` and the lack of clarity bothers me, thanks for that > visual trick) 1. I should have quoted Mike Meyer's code "if len(s) == 1: res = process(next(iter(s)))", which is what I had in mind. (In that case, it's a feature. :) ) 2. I grouped it this way as a joke. If you do this, everyone will think you're crazy. I've been known to write it (item,) = s, which makes it a little easier to see the comma. If you want to be unambiguous AND confuse everyone, go with >>> s set([42]) >>> [item] = s >>> item 42 Mike From bruce at leapyear.org Wed May 16 17:06:12 2012 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 16 May 2012 08:06:12 -0700 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516044628.2aa6dff9@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <20120516044628.2aa6dff9@bhuda.mired.org> Message-ID: On Wed, May 16, 2012 at 1:46 AM, Mike Meyer wrote: > On Wed, 16 May 2012 01:08:09 -0700 > Bruce Leban wrote: > > On Wed, May 16, 2012 at 12:40 AM, Mike Meyer wrote: > > I didn't claim it was fast. I actually wrote that version instead of the > > in/return version for a very specific reason: it always returns the same > > element. (The for/in/return version might return the same element every > > time too but it's not guaranteed.) > > I didn't ask for a get that would always return the same element. > You didn't ask for a get that *didn't* always return the same element. 
My deterministic version is totally compatible with your ask here and > Would you also complain that having int accept a string value in lieu > of using eval on untrusted input is a case of "it's useful anytime you > need it."? > Not at all. It's useful because it's very common to need to convert strings to numbers and I can show you lots of code that does just that. So we need a method that does that safely. Does it have to be int? No; it could be atoi or parse_int or scanf. But we do need it. > > > I don't think your test is very good if it uses the get I wrote > > above. Your test will only operate on one element of the set and > > it's easy to write functions which succeed for some elements of the > > set and fail for others. I'd like to see an actual test that you > > think needs this that would not be improved by iterating over the > > list. > > Talk about tautologies! Of course you can write tests that will fail > in some cases. You can also write tests that won't fail for your > cases. Especially if you know something about the set beforehand. > Not what I said. It's easy to write a *function* that fails on some elements and your *test* won't test it. Example: a function that fails when operating on non-integer set elements or the largest element or .... Your test only tests one case. > > For instance, I happen to know I have a set of ElementTree elements > that all have the same tag. I want to check the tag. > Then maybe you should be using a different data structure than a set. Maybe set_with_same_tag that declares that constraint and can enforce the constraint if you want. > One of the test cases starts by checking to see if the set is a > singleton. Do you really propose something like: > > if len(s) == 1: > for i in s: > res = process(i) > > This is a legitimate use case. I don't think it's a big deal to have to add a one-line function to your code.
I might even use EAFP: res = process(set_singleton(s)) where def set_singleton(s): [result] = s; return result --- Bruce From ben+python at benfinney.id.au Wed May 16 17:43:06 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 17 May 2012 01:43:06 +1000 Subject: [Python-ideas] get method for sets? References: <20120516023215.4699c0b4@bhuda.mired.org> <87vcjw8ti9.fsf@benfinney.id.au> <20120516041238.2ef36576@bhuda.mired.org> Message-ID: <87r4uk873p.fsf@benfinney.id.au> Mike Meyer writes: > On Wed, 16 May 2012 17:39:10 +1000 > Ben Finney wrote: > > If by ‘get’ you mean to get an *arbitrary* item, not a specific item, > > then what's the problem? You already have ‘set.pop’, as you point out. > > And, as I also pointed out, it's not useful in the case where you need > to preserve the set for future use. Then, if ‘item = next(iter(foo_set))’ doesn't suit you, perhaps you'd like ‘item = set(foo_set).pop()’. Regardless, I think you have your answer: Like most things that can already be done by composing the existing pieces, this corner case hasn't met the deliberately-high bar for making a special method just to do it. I still haven't seen you describe the use case where the existing ways of doing this aren't good enough. -- \ “Men never do evil so completely and cheerfully as when they do | `\ it from religious conviction.” -- Blaise Pascal (1623–1662), | _o__) Pensées, #894. | Ben Finney From steve at pearwood.info Wed May 16 18:20:19 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 17 May 2012 02:20:19 +1000 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516040941.0330975d@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> Message-ID: <4FB3D3C3.2070502@pearwood.info> Mike Meyer wrote: > But my question was actually whether or not there was a reason for it > not existing.
Has there been a previous discussion of this? Aye yai yai, have there ever. http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html If you have an hour or two spare, read this thread: http://mail.python.org/pipermail/python-dev/2009-October/093227.html By the way, I suggest that a better name than "get" is pick(), which once was (but no longer is) suggested by Wikipedia as a fundamental set operation. http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets It seems to me that it has been removed because: - the actual semantics of what it means to get/pick a value from a set are unclear; and - few, if any, set implementations actually provide this method. I still think your best bet is a helper function: def pick(s): return next(iter(s)) -- Steven From steve at pearwood.info Wed May 16 18:28:22 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 17 May 2012 02:28:22 +1000 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516044628.2aa6dff9@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <20120516044628.2aa6dff9@bhuda.mired.org> Message-ID: <4FB3D5A6.602@pearwood.info> Mike Meyer wrote: > I didn't ask for a get that would always return the same element. It seems to me that you haven't exactly been clear about what you want this get() method to actually do. In an earlier email, you said: [quote] My requirements are "I need an element from the set". The behavior of repeated calls is immaterial. [end quote] So a get() which always returns the same element fits your requirements, *as stated*. If you have other requirements, you haven't been forthcoming with them. I've never come across a set implementation which includes something like your get() method. Does anyone know any language whose set implementation has this functionality? 
Wikipedia currently suggests it isn't a natural method of sets: Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set. http://en.wikipedia.org/wiki/Set_%28computer_science%29 For all you say it is a common request, I don't think it's a well-thought-out request. It's one thing to ask "give me any element without modifying the set", but what does that mean exactly? Which element should it return? "Any element, so long as it isn't always the same element twice in a row" perhaps? Would flip-flopping between the first and second elements meet your requirements? The example you give below: > For instance, I happen to know I have a set of ElementTree elements > that all have the same tag. I want to check the tag. > > One of the test cases starts by checking to see if the set is a > singleton. Do you really propose something like: is too much of a special case to really matter. A set with one item avoids all the hard questions, since there is only one item which could be picked. It's the sets with two or more items that are hard. A general case get/pick method has to deal with the hard cases, not just the easy one-element cases. [...] > All of these require creating an intermediate object for the sole > purpose of getting an item out of the container without destroying the > container. This leads the reader to wonder why it was created, You've just explained why it was created -- to get an item out of the set without destroying it. Why is this a problem? We do something similar frequently, often abstracted away inside a helper function: first = list(iterable)[0] num_digits = len(str(some_integer)) etc. -- Steven From masklinn at masklinn.net Wed May 16 18:43:22 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 16 May 2012 18:43:22 +0200 Subject: [Python-ideas] get method for sets? 
In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net> Message-ID: On 2012-05-16, at 16:57 , Mike Graham wrote: > 2. I grouped it this way as a joke. If you do this, everyone will > think you're crazy. I don't mind, it expresses the intent clearly and looks *weird* at first glance which is fine by me: colleagues & readers are unlikely to miss it if they don't know the idiom. I genuinely like it. From pyideas at rebertia.com Wed May 16 20:19:05 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 16 May 2012 11:19:05 -0700 Subject: [Python-ideas] get method for sets? In-Reply-To: <4FB3D3C3.2070502@pearwood.info> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> <4FB3D3C3.2070502@pearwood.info> Message-ID: On Wed, May 16, 2012 at 9:20 AM, Steven D'Aprano wrote: > Mike Meyer wrote: > >> But my question was actually whether or not there was a reason for it >> not existing. Has there been a previous discussion of this? > > Aye yai yai, have there ever. > > http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html > > If you have an hour or two spare, read this thread: > > http://mail.python.org/pipermail/python-dev/2009-October/093227.html > > By the way, I suggest that a better name than "get" is pick(), which once > was (but no longer is) suggested by Wikipedia as a fundamental set > operation. > > http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets > > > It seems to me that it has been removed because: > > - the actual semantics of what it means to get/pick a value from > ?a set are unclear; and > - few, if any, set implementations actually provide this method. 
Objective-C's NSSet calls it "anyObject" and doesn't specify much about it (in particular, its behavior when called repeatedly), mainly just that "the selection is not guaranteed to be random". I haven't poked around to see how it actually behaves in practice. C#'s ISet has First() and Last(), but merely as extension methods. Java, Ruby, and Haskell don't seem to include any such operation in their generic set interfaces. Cheers, Chris -- http://rebertia.com From asampson at cs.washington.edu Wed May 16 20:43:05 2012 From: asampson at cs.washington.edu (Adrian Sampson) Date: Wed, 16 May 2012 11:43:05 -0700 Subject: [Python-ideas] Composability and concurrent.futures Message-ID: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu> The concurrent.futures module in the Python standard library has problems with composability. If I start a ThreadPoolExecutor to run some library functions that internally use ThreadPoolExecutor, I will end up with many more worker threads on my system than I expect. For example, each parallel execution wants to take full advantage of an 8-core machine, I could end up with as many as 8*8=64 competing worker threads, which could significantly hurt performance. This is because each instance of ThreadPoolExecutor (or ProcessPoolExecutor) maintains its own independent worker pool. Especially in situations where the goal is to exploit multiple CPUs, it's essential for any thread pool implementation to globally manage contention between multiple concurrent job schedulers. I'm not sure about the best way to address this problem, but here's one proposal: Add additional executors to the futures library. ComposableThreadPoolExecutor and ComposableProcessPoolExecutor would each use a *shared* thread-pool model. When created, these composable executors will check to see if they are being created within a future worker thread/process initiated by another composable executor. 
If so, the "child" executor will forward all submitted jobs to the executor in the parent thread/process. Otherwise, it will behave normally, starting up its own worker pool. Has anyone else dealt with composition problems in parallel programs? What do you think of this solution -- is there a better way to tackle this deficiency? Adrian From masklinn at masklinn.net Wed May 16 21:21:08 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 16 May 2012 21:21:08 +0200 Subject: [Python-ideas] get method for sets? In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> <4FB3D3C3.2070502@pearwood.info> Message-ID: <32F30142-0503-453A-BC55-90DC763E937C@masklinn.net> On 2012-05-16, at 20:19 , Chris Rebert wrote: > Objective-C's NSSet calls it "anyObject" and doesn't specify much > about it (in particular, its behavior when called repeatedly), mainly > just that "the selection is not guaranteed to be random". I haven't > poked around to see how it actually behaves in practice. It takes the first object it finds which is pretty much solely a property of how it stores its items, its behavior translated into Python code is precisely: next(iter(set)) or list(set)[0] I just tested creating a few thousand sets and filling them with random integer values (using arc4random(3)) and never once did [set anyObject] differ from [[set allObjects] objectAtIndex:0] or from [[set objectEnumerator] nextValue]. From mwm at mired.org Wed May 16 22:00:33 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 16:00:33 -0400 Subject: [Python-ideas] get method for sets? 
In-Reply-To: <87r4uk873p.fsf@benfinney.id.au> References: <20120516023215.4699c0b4@bhuda.mired.org> <87vcjw8ti9.fsf@benfinney.id.au> <20120516041238.2ef36576@bhuda.mired.org> <87r4uk873p.fsf@benfinney.id.au> Message-ID: <20120516160033.7245ce3f@bhuda.mired.org> On Thu, 17 May 2012 01:43:06 +1000 Ben Finney wrote: > Mike Meyer writes: > > On Wed, 16 May 2012 17:39:10 +1000 > > Ben Finney wrote: > > > If by ‘get’ you mean to get an *arbitrary* item, not a specific item, > > > then what's the problem? You already have ‘set.pop’, as you point out. > > And, as I also pointed out, it's not useful in the case where you need > > to preserve the set for future use. > Then, if ‘item = next(iter(foo_set))’ doesn't suit you, perhaps you'd > like ‘item = set(foo_set).pop()’. This is precisely what bugs me about this case. There's not one obvious way to do it. There's a collection of ways that are all in some ways/cases problematical. They all involve creating a scratch object from the set and using its API to (possibly destructively) get the one value that's wanted. In a way, it reminds me of the discussions that eventually led to the if else expression being added. > Regardless, I think you have your answer: Like most things that can > already be done by composing the existing pieces, this corner case > hasn't met the deliberately-high bar for making a special method just to > do it. And it's already been discussed to death. *That's* what I was trying to find out. If it hadn't been, I'd have put together a serious proposal. > I still haven't seen you describe the use case where the existing ways > of doing this aren't good enough. That, of course, is a subjective judgment. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From yselivanov.ml at gmail.com Wed May 16 22:10:00 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 16 May 2012 16:10:00 -0400 Subject: [Python-ideas] get method for sets? In-Reply-To: <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net> Message-ID: <8609D04B-A4CC-44D7-9420-350857523CDC@gmail.com> On 2012-05-16, at 10:51 AM, Masklinn wrote: > With the difference that ,= also asserts there is only one item in the > iterable, where `next . iter` only does `head`. For that, the ,*_= operator exists ;) >>> a = '123' >>> b ,*_= a >>> b '1' I hope I'll never encounter this, though. - Yury From grosser.meister.morti at gmx.net Wed May 16 22:16:25 2012 From: grosser.meister.morti at gmx.net (=?UTF-8?B?TWF0aGlhcyBQYW56ZW5iw7Zjaw==?=) Date: Wed, 16 May 2012 22:16:25 +0200 Subject: [Python-ideas] get method for sets? In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> <4FB3D3C3.2070502@pearwood.info> Message-ID: <4FB40B19.3030604@gmx.net> On 05/16/2012 08:19 PM, Chris Rebert wrote: > On Wed, May 16, 2012 at 9:20 AM, Steven D'Aprano wrote: >> Mike Meyer wrote: >> >>> But my question was actually whether or not there was a reason for it >>> not existing. Has there been a previous discussion of this? >> >> Aye yai yai, have there ever. >> >> http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html >> >> If you have an hour or two spare, read this thread: >> >> http://mail.python.org/pipermail/python-dev/2009-October/093227.html >> >> By the way, I suggest that a better name than "get" is pick(), which once >> was (but no longer is) suggested by Wikipedia as a fundamental set >> operation. 
>> >> http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets >> >> >> It seems to me that it has been removed because: >> >> - the actual semantics of what it means to get/pick a value from >> a set are unclear; and >> - few, if any, set implementations actually provide this method. > > Objective-C's NSSet calls it "anyObject" and doesn't specify much > about it (in particular, its behavior when called repeatedly), mainly > just that "the selection is not guaranteed to be random". I haven't > poked around to see how it actually behaves in practice. > > C#'s ISet has First() and Last(), but merely as extension methods. > > Java, Ruby, and Haskell don't seem to include any such operation in > their generic set interfaces. > Ruby's Set has first() but not last(). That saied I'm -1 on a get/pick method for a set. > Cheers, > Chris > -- > http://rebertia.com > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From grosser.meister.morti at gmx.net Wed May 16 22:28:13 2012 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Wed, 16 May 2012 22:28:13 +0200 Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins` In-Reply-To: <20120509184856.GC3133@bagheera> References: <20120509184856.GC3133@bagheera> Message-ID: <4FB40DDD.6090702@gmx.net> If you look at the __future__ stuff one might get the idea to reverse it: try: from __past__ import unicode_literals, future_builtins except ImportError: pass On 05/09/2012 08:48 PM, Sven Marnach wrote: > With the reintroduction of u"Unicode literals", Python 3.3 will remove > one of the major stumbling stones for supporting Python 2.x and 3.3 > within the same code base. Another rather trivial stumbling stone > could be removed by adding the alias `future_builtins` for the > `builtins` module. 
Currently, you need to use a try/except block, > which isn't too bad, but I think it would be nicer if a line like > > from future_builtins import map > > continues to work, just like __future__ imports continue to work. I > think the above actually *is* a kind of __future__ import which just > happens to be in a regular module because it doesn't need any special > compiler support. > > I know a few module names changed and some modules have been > reorganised to packages, so you will still need try/except blocks for > other imports. However, I think `future_builtins` is special because > its sole raison d'être is forward-compatibility and because of the > analogy with `__future__`. > > Cheers, > Sven From tjreedy at udel.edu Thu May 17 00:41:46 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 16 May 2012 18:41:46 -0400 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120516040941.0330975d@bhuda.mired.org> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> Message-ID: On 5/16/2012 4:09 AM, Mike Meyer wrote: > Because next(iter(s)) makes the reader wonder "Why is this iterator > being created?" If s is a non-iterator iterable, to get at the contents of s non-destructively. > makes me think that at the very least, this idiom ought to be > mentioned in the documentation. http://bugs.python.org/issue14836 > But my question was actually whether or not there was a reason for it > not existing. Because not all simple compositions need to be added to the stdlib and builtins.
-- Terry Jan Reedy From tjreedy at udel.edu Thu May 17 00:53:33 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 16 May 2012 18:53:33 -0400 Subject: [Python-ideas] input function: built-in space between string and user-input In-Reply-To: <4FB3811A.4090601@nedbatchelder.com> References: <4FB3811A.4090601@nedbatchelder.com> Message-ID: On 5/16/2012 6:27 AM, Ned Batchelder wrote: > On 5/15/2012 5:19 PM, Terry Reedy wrote: >> On 5/15/2012 6:46 AM, >> Suriaprakash.Mariappan at smsc.com wrote: >>> *_print function: built-in space between string and variable:_* >>> >>> The below python code, >>> >>> */length = 5/* >>> */print('Length is', length)/* >>> >>> gives an output of >>> >>> */Length is 5/* >> >> The */.../* and *_..._* bracketing makes your post harder to read. >> Perhaps this is used in India, but not elsewhere. Omit next time. > > That's your mail client's rendering of bold-italic and bold-underscored > text from the HTML version of the original email. I am reading the Gmane newsgroup mirror with Thunderbird. I have not seen it do anything similar with other mixed text/plain and text/html messages. So let me rephrase my advice. "The Python mailing lists and newsgroups are, as usual, intended for plain text. Posting html or plaintext and html can have strange and unpredictable effects with various mail and news readers. So if you want people to see what you send, just use plain text, without tab characters." -- Terry Jan Reedy From ethan at stoneleaf.us Thu May 17 01:08:59 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 16 May 2012 16:08:59 -0700 Subject: [Python-ideas] get method for sets? In-Reply-To: <4FB3D3C3.2070502@pearwood.info> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> <4FB3D3C3.2070502@pearwood.info> Message-ID: <4FB4338B.9070109@stoneleaf.us> Steven D'Aprano wrote: > Mike Meyer wrote: > >> But my question was actually whether or not there was a reason for it >> not existing.
Has there been a previous discussion of this? > > > Aye yai yai, have there ever. > > http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html > > If you have an hour or two spare, read this thread: > > http://mail.python.org/pipermail/python-dev/2009-October/093227.html > > By the way, I suggest that a better name than "get" is pick(), which > once was (but no longer is) suggested by Wikipedia as a fundamental set > operation. > > http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets > > > > It seems to me that it has been removed because: > > - the actual semantics of what it means to get/pick a value from > a set are unclear; and > - few, if any, set implementations actually provide this method. > > I still think your best bet is a helper function: > > def pick(s): > return next(iter(s)) Don't forget the doc string! "returns an arbitrary element from set s" ~Ethan~ From greg.ewing at canterbury.ac.nz Thu May 17 01:51:49 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 May 2012 11:51:49 +1200 Subject: [Python-ideas] get method for sets? In-Reply-To: References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <20120516044628.2aa6dff9@bhuda.mired.org> Message-ID: <4FB43D95.4080801@canterbury.ac.nz> Bruce Leban wrote: > Then maybe you should be using a different data structure than a set. > Maybe set_with_same_tag that declares that constraint and can enforce > the constraint if you want. Another class of use cases is where you know that the set contains only one element, and you want to find out what that element is. I encountered one of these in my recent PyWeek game entry. I have a set of selected units, and commands that can be applied to them. Some commands can only be used on a single unit at a time, so there are places in the code where there can only be one element in the set. 
Using a separate SetWithOnlyOneElement type in that case would be tedious and unnecessary. I don't need to enforce the constraint; I know it's satisfied because I wouldn't have ended up at that point in the code if it wasn't. -- Greg From greg.ewing at canterbury.ac.nz Thu May 17 01:56:51 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 May 2012 11:56:51 +1200 Subject: [Python-ideas] get method for sets? In-Reply-To: <4FB3D3C3.2070502@pearwood.info> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516040941.0330975d@bhuda.mired.org> <4FB3D3C3.2070502@pearwood.info> Message-ID: <4FB43EC3.4090000@canterbury.ac.nz> Steven D'Aprano wrote: > By the way, I suggest that a better name than "get" is pick() I was going to suggest peek(), which is more suggestive of a non-modifying function. -- Greg From cs at zip.com.au Thu May 17 02:56:29 2012 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 17 May 2012 10:56:29 +1000 Subject: [Python-ideas] get method for sets? In-Reply-To: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120517005629.GA28044@cskk.homeip.net> On 16May2012 17:42, Stephen J. Turnbull wrote: | Mike Meyer writes: | > On Wed, 16 May 2012 17:26:45 +1000 | | > > Could this helper function not do the job? | > > | > > def get(s): | > > x = s.pop() | > > s.add(x) | > > return x | > | > Sure, if you don't mind munging the set unnecessarily. That's more | > readable, but slower and longer than: | > | > def get(s): | > for x in s: | > return x I was about to suggest Mike's implementation. | Why would you mind munging the set temporarily? Personally, I work with multiple threads quite often. Therefore I habitually avoid data structure modifying operations unless they're necessary. Any time I modify a data structure is a time I have to worry about shared access. | Why is speed (of | something that almost by definition is undefined if repeated) | important?
Besides, modifying a data structure _is_ slower than just looking, usually. There may even be garbage collection:-( | I'm -1 on adding a method that has no motivation in production that I | can see. Just redefine your get() function as a function, with a more | appropriate name such as "get_item_nondeterministically". It will | work on any iterable. (Don't forget to document that it will "use up" | an item if the iterable is not a sequence, though.) Yah: def an(s): for i in s: return i I'm also -1 on a set _method_, though he can always subclass and add his own for his use case. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ If your new theorem can be stated with great simplicity, then there will exist a pathological exception. - Adrian Mathesis From greg.ewing at canterbury.ac.nz Thu May 17 03:51:04 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 May 2012 13:51:04 +1200 Subject: [Python-ideas] get method for sets? In-Reply-To: <4FB3D5A6.602@pearwood.info> References: <20120516023215.4699c0b4@bhuda.mired.org> <20120516034034.048f2eaa@bhuda.mired.org> <20120516044628.2aa6dff9@bhuda.mired.org> <4FB3D5A6.602@pearwood.info> Message-ID: <4FB45988.704@canterbury.ac.nz> On 17/05/12 04:28, Steven D'Aprano wrote: > Which element should it return? "Any > element, so long as it isn't always the same element twice in a row" perhaps? > Would flip-flopping between the first and second elements meet your requirements? It might be useful to have a method specified as returning the same element that a subsequent pop() would return. Then it could be used as a look-ahead for an algorithm involving a pop-loop, or for anything with more liberal requirements.
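A minimal sketch of the look-ahead idea just described, offered only as an illustration. PeekableSet, peek() and drain() are hypothetical names, not part of the standard library or of any proposal in this thread. The sketch does not assume that next(iter(s)) and s.pop() agree on a plain set (that is not a documented guarantee); instead it caches the peeked element so that the next pop() returns it by construction.

```python
class PeekableSet(set):
    """Hypothetical set subclass whose peek() previews the next pop()."""

    _MISSING = object()   # sentinel meaning "nothing cached yet"
    _peeked = _MISSING    # class-level default; instances override it on peek()

    def peek(self):
        """Return, without removing it, the element the next pop() will return."""
        if not self:
            raise KeyError("peek from an empty set")
        # Pick again if nothing is cached or the cached element has been removed.
        if self._peeked is self._MISSING or self._peeked not in self:
            self._peeked = next(iter(self))
        return self._peeked

    def pop(self):
        """Remove and return the element that peek() reports."""
        item = self.peek()    # raises KeyError on an empty set, like set.pop()
        self.remove(item)
        self._peeked = self._MISSING
        return item


def drain(items):
    """Pop-loop with look-ahead: yield elements in the order pop() removes them."""
    while items:
        upcoming = items.peek()
        assert items.pop() is upcoming
        yield upcoming
```

The cache costs one extra attribute per instance; whether that is worth it compared with a plain helper function is exactly the judgment call being argued about here.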
-- Greg From greg.ewing at canterbury.ac.nz Thu May 17 04:00:39 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 May 2012 14:00:39 +1200 Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins` In-Reply-To: <4FB40DDD.6090702@gmx.net> References: <20120509184856.GC3133@bagheera> <4FB40DDD.6090702@gmx.net> Message-ID: <4FB45BC7.20509@canterbury.ac.nz> On 17/05/12 08:28, Mathias Panzenböck wrote: > from __past__ import unicode_literals, future_builtins I seem to remember Guido declaring ages ago that there would never be any imports from the past. So the past import feature would first have to be imported from a reality where he hadn't made that decision. from __alternatetimeline__ import __past__ from __past__ import unicode_literals, future_builtins -- Greg From paul.dubois at gmail.com Thu May 17 04:19:50 2012 From: paul.dubois at gmail.com (Paul Du Bois) Date: Wed, 16 May 2012 19:19:50 -0700 Subject: [Python-ideas] get method for sets? In-Reply-To: <20120517005629.GA28044@cskk.homeip.net> References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp> <20120517005629.GA28044@cskk.homeip.net> Message-ID: > | Mike Meyer writes: > | > def get(s): > | > for x in s: > | > return x On Wed, May 16, 2012 at 5:56 PM, Cameron Simpson wrote: > def an(s): > for i in s: > return i Normally I'm content to lurk, but this thread has been going on for a long time without anyone pointing out that the "for" loop idiom needs an "else: raise KeyError" in order to act pythonically. p From greg.ewing at canterbury.ac.nz Thu May 17 05:03:11 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 May 2012 15:03:11 +1200 Subject: [Python-ideas] get method for sets?
In-Reply-To: References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp> <20120517005629.GA28044@cskk.homeip.net> Message-ID: <4FB46A6F.4050200@canterbury.ac.nz> On 17/05/12 14:19, Paul Du Bois wrote: > On Wed, May 16, 2012 at 5:56 PM, Cameron Simpson wrote: >> > def an(s): >> > for i in s: >> > return i > Normally I'm content to lurk, but this thread has been going on for a > long time without anyone pointing out that the "for" loop idiom needs > an "else: raise KeyError" in order to act pythonically. That depends on what result you want in the empty set case. If returning None is okay, or you know the set can never be empty, then it's fine as written. -- Greg From mwm at mired.org Thu May 17 05:49:01 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 16 May 2012 23:49:01 -0400 Subject: [Python-ideas] get method for sets? In-Reply-To: <4FB46A6F.4050200@canterbury.ac.nz> References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp> <20120517005629.GA28044@cskk.homeip.net> <4FB46A6F.4050200@canterbury.ac.nz> Message-ID: <20120516234901.6ffb3b1c@bhuda.mired.org> On Thu, 17 May 2012 15:03:11 +1200 Greg Ewing wrote: > On 17/05/12 14:19, Paul Du Bois wrote: > > On Wed, May 16, 2012 at 5:56 PM, Cameron Simpson wrote: > >> > def an(s): > >> > for i in s: > >> > return i > > Normally I'm content to lurk, but this thread has been going on for a > > long time without anyone pointing out that the "for" loop idiom needs > > an "else: raise KeyError" in order to act pythonically. > That depends on what result you want in the empty set case. If > returning None is okay, or you know the set can never be empty, > then it's fine as written. Raising KeyError is probably best, as that parallels "pop". In fact, it would be required for the proposed "peek" method that returns what "pop" would have returned at that point. That method would not only have satisfied all 2.5 of the use cases I had, but would probably be useful for algorithms that want a conditional pop. 
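For reference, the helper under discussion with the empty-set case made explicit might look like the sketch below. The name pick() follows Steven's earlier suggestion; raising KeyError is chosen here only to parallel set.pop(), as suggested above, and is an assumption rather than an agreed design. Because the return statement leaves the function, a plain raise after the loop has the same effect as an "else:" clause.

```python
def pick(s):
    """Return an arbitrary element of the iterable s without removing it."""
    for item in s:
        return item
    # Reached only if the loop body never ran, i.e. s is empty.
    raise KeyError("pick from an empty set")
```

As noted earlier in the thread, calling this on an iterator rather than a set or sequence will consume the item it returns.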
http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ethan at stoneleaf.us Thu May 17 17:10:40 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 17 May 2012 08:10:40 -0700 Subject: [Python-ideas] weakrefs Message-ID: <4FB514F0.6000403@stoneleaf.us> From the manual [8.11]: > A weak reference to an object is not enough to keep the object alive: > when the only remaining references to a referent are weak references, > garbage collection is free to destroy the referent and reuse its > memory for something else. This leads to a difference in behaviour between CPython and the other implementations: CPython will (currently) immediately destroy any objects that only have weak references to them with the result that trying to access said object will require making a new one; other implementations (at least PyPy, and presumably the others that don't use ref-count gc's) can "reach into the grave" and pull back objects that don't have any strong references left. I would like to have the guarantees for weakrefs strengthened such that any weakref'ed object that has no strong references left will return None instead of the object, even if the object has not yet been garbage collected. Without this stronger guarantee programs that are relying on weakrefs to disappear when strong refs are gone end up relying on the gc method instead, with the result that the program behaves differently on different implementations. 
~Ethan~ From solipsis at pitrou.net Thu May 17 17:44:29 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 17 May 2012 17:44:29 +0200 Subject: [Python-ideas] weakrefs References: <4FB514F0.6000403@stoneleaf.us> Message-ID: <20120517174429.75965d06@pitrou.net> On Thu, 17 May 2012 08:10:40 -0700 Ethan Furman wrote: > From the manual [8.11]: > > > A weak reference to an object is not enough to keep the object alive: > > when the only remaining references to a referent are weak references, > > garbage collection is free to destroy the referent and reuse its > > memory for something else. > > This leads to a difference in behaviour between CPython and the other > implementations: CPython will (currently) immediately destroy any > objects that only have weak references to them with the result that > trying to access said object will require making a new one; This is only true if the object isn't caught in a reference cycle. > Without this stronger guarantee programs that are relying on weakrefs to > disappear when strong refs are gone end up relying on the gc method > instead, with the result that the program behaves differently on > different implementations. Why would they "rely on weakrefs to disappear when strong refs are gone"? What is the use case? Regards Antoine. 
From ckaynor at zindagigames.com Thu May 17 19:13:15 2012
From: ckaynor at zindagigames.com (Chris Kaynor)
Date: Thu, 17 May 2012 10:13:15 -0700
Subject: [Python-ideas] weakrefs
In-Reply-To: <20120517174429.75965d06@pitrou.net>
References: <4FB514F0.6000403@stoneleaf.us> <20120517174429.75965d06@pitrou.net>
Message-ID:

On Thu, May 17, 2012 at 8:44 AM, Antoine Pitrou wrote:

> On Thu, 17 May 2012 08:10:40 -0700
> Ethan Furman wrote:
> > From the manual [8.11]:
> >
> > > A weak reference to an object is not enough to keep the object alive:
> > > when the only remaining references to a referent are weak references,
> > > garbage collection is free to destroy the referent and reuse its
> > > memory for something else.
> >
> > This leads to a difference in behaviour between CPython and the other
> > implementations: CPython will (currently) immediately destroy any
> > objects that only have weak references to them with the result that
> > trying to access said object will require making a new one;
>
> This is only true if the object isn't caught in a reference cycle.

To further this, consider the following example, run in CPython 2.6:

>>> import weakref
>>> import gc
>>>
>>> class O(object):
...     pass
...
>>> a = O()
>>> b = O()
>>> a.x = b
>>> b.x = a
>>>
>>> w = weakref.ref(a)
>>>
>>>
>>> del a, b
>>>
>>> print w()
<__main__.O object at 0x0000000003C78B38>
>>>
>>> gc.collect()
20
>>>
>>> print w()
None

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From greg.ewing at canterbury.ac.nz Fri May 18 00:49:05 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 18 May 2012 10:49:05 +1200
Subject: [Python-ideas] weakrefs
In-Reply-To: <4FB514F0.6000403@stoneleaf.us>
References: <4FB514F0.6000403@stoneleaf.us>
Message-ID: <4FB58061.6050809@canterbury.ac.nz>

Ethan Furman wrote:

> I would like to have the guarantees for weakrefs strengthened such that
> any weakref'ed object that has no strong references left will return
> None instead of the object, even if the object has not yet been garbage
> collected.

Why do you want this guarantee? It would complicate
implementations for which ref counting is not the
native method of managing memory.

-- Greg

From ethan at stoneleaf.us Fri May 18 18:08:48 2012
From: ethan at stoneleaf.us (stoneleaf)
Date: Fri, 18 May 2012 09:08:48 -0700 (PDT)
Subject: [Python-ideas] weakrefs
In-Reply-To: <4FB514F0.6000403@stoneleaf.us>
References: <4FB514F0.6000403@stoneleaf.us>
Message-ID: <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>

On May 17, 8:10 am, Ethan Furman wrote:

> From the manual [8.11]:
>
>> A weak reference to an object is not enough to keep the object alive:
>> when the only remaining references to a referent are weak references,
>> garbage collection is free to destroy the referent and reuse its
>> memory for something else.
>
> This leads to a difference in behaviour between CPython and the other
> implementations: CPython will (currently) immediately destroy any
> objects that only have weak references to them with the result that
> trying to access said object will require making a new one; other
> implementations (at least PyPy, and presumably the others that don't use
> ref-count gc's) can "reach into the grave" and pull back objects that
> don't have any strong references left.

Antoine Pitrou wrote:

> This is only true if the object isn't caught in a reference cycle.

Good point -- so I would like the proposed change in CPython as well.

Ethan Furman wrote:

> I would like to have the guarantees for weakrefs strengthened such that
> any weakref'ed object that has no strong references left will return
> None instead of the object, even if the object has not yet been garbage
> collected.
>
> Without this stronger guarantee programs that are relying on weakrefs to
> disappear when strong refs are gone end up relying on the gc method
> instead, with the result that the program behaves differently on
> different implementations.

Antoine Pitrou wrote:

> Why would they "rely on weakrefs to disappear when strong refs are
> gone"? What is the use case?

Greg Ewing wrote:

> Why do you want this guarantee? It would complicate
> implementations for which ref counting is not the
> native method of managing memory.

My dbf module provides direct access to dbf files. A retrieved record
is a singleton object, and allows temporary changes that are not
written to disk. Whether those changes are seen by the next incarnation
depends on (I had thought) whether or not the record with the unwritten
changes has gone out of scope.

I see two questions that determine whether this change should be made:

  1) How difficult it would be for the non-ref counting
     implementations to implement

  2) Whether it's appropriate to have objects be changed, but not
     saved, and then discarded when the strong references are gone so
     the next incarnation doesn't see the changes, even if the object
     hasn't been destroyed yet.

~Ethan~

FYI: For dbf I am going to disallow temporary changes so this won't be
an immediate issue for me.
From masklinn at masklinn.net Fri May 18 18:38:00 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 18 May 2012 18:38:00 +0200 Subject: [Python-ideas] weakrefs In-Reply-To: <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com> References: <4FB514F0.6000403@stoneleaf.us> <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com> Message-ID: <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net> On 2012-05-18, at 18:08 , stoneleaf wrote: > > My dbf module provides direct access to dbf files. A retrieved record > is > a singleton object, and allows temporary changes that are not written > to > disk. Whether those changes are seen by the next incarnation depends > on > (I had thought) whether or not the record with the unwritten changes > has > gone out of scope. If a record is a singleton, that singleton-ification would be handled through weakrefs would it not? In that case, until the GC is triggered (and the weakref is invalidated), you will keep getting your initial singleton and there will be no "next record", I fail to see why that would be an issue. > I see two questions that determine whether this change should be made: > > 1) How difficult it would be for the non-ref counting > implementations > to implement > Pretty much impossible I'd expect, the weakrefs can only be broken on GC runs (at object deallocation) and that is generally non-deterministic without specifying precisely which type of GC implementation is used. You'd need a fully deterministic deallocation model to ensure a weakref is broken as soon as the corresponding object has no outstanding strong (and soft, in some VMs like the JVM) reference. > 2) Whether it's appropriate to have objects be changed, but not > saved, > and then discarded when the strong references are gone so the > next > incarnation doesn't see the changes, even if the object hasn't > been > destroyed yet. 

If your saves are synchronized with the weakref being broken (the object
being *effectively* collected) and the singleton behavior is as well,
there will be no difference, I'm not sure what the issue would be, you
might just have a second change cycle using the same unsaved (but still
modified) object.

Although frankly speaking such reliance on non-deterministic events would
scare the shit out of me.

From ethan at stoneleaf.us Sat May 19 04:54:08 2012
From: ethan at stoneleaf.us (stoneleaf)
Date: Fri, 18 May 2012 19:54:08 -0700 (PDT)
Subject: [Python-ideas] weakrefs
In-Reply-To: <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>
References: <4FB514F0.6000403@stoneleaf.us> <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com> <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>
Message-ID: <1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com>

On May 18, 9:38 am, Masklinn wrote:

> On 2012-05-18, at 18:08 , stoneleaf wrote:
>> My dbf module provides direct access to dbf files. A retrieved record
>> is a singleton object, and allows temporary changes that are not
>> written to disk. Whether those changes are seen by the next
>> incarnation depends on (I had thought) whether or not the record with
>> the unwritten changes has gone out of scope.
>
> If a record is a singleton, that singleton-ification would be handled
> through weakrefs would it not?

Indeed, that is the current behavior.

> In that case, until the GC is triggered (and the weakref is
> invalidated), you will keep getting your initial singleton and there
> will be no "next record", I fail to see why that would be an issue.

Because, since I had only been using CPython, I was able to count on
records that had gone out of scope disappearing along with their
_temporary_ changes. If I get that same record back the next time I
loop through the table -- well, then the changes weren't temporary,
were they?
>> I see two questions that determine whether this change should be made: > >> ?1) How difficult it would be for the non-ref counting >> implementations to implement > > Pretty much impossible I'd expect, the weakrefs can only be broken on GC > runs (at object deallocation) and that is generally non-deterministic > without specifying precisely which type of GC implementation is used. > You'd need a fully deterministic deallocation model to ensure a weakref > is broken as soon as the corresponding object has no outstanding strong > (and soft, in some VMs like the JVM) reference. > >> ?2) Whether it's appropriate to have objects be changed, but not >> saved, and then discarded when the strong references are gone so the >> next incarnation doesn't see the changes, even if the object hasn't >> been destroyed yet. > > If your saves are synchronized with the weakref being broken (the object > being *effectively* collected) and the singleton behavior is as well, > there will be no difference, I'm not sure what the issue would be, you > might just have a second change cycle using the same unsaved (but still > modified) object. And that's exactly the problem -- I don't want to see the modifications the second time 'round, and if I can't count on weakrefs invalidating as soon as the strong refs are gone I'll have to completely rethink how I handle records from the table. > Although frankly speaking such reliance on non-deterministic events would > scare the shit out of me. Indeed -- I hadn't realized that I was until somebody using PyPy noticed the problem. 
~Ethan~ From fuzzyman at gmail.com Sat May 19 14:33:35 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Sat, 19 May 2012 13:33:35 +0100 Subject: [Python-ideas] weakrefs In-Reply-To: <1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com> References: <4FB514F0.6000403@stoneleaf.us> <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com> <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net> <1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com> Message-ID: On 19 May 2012 03:54, stoneleaf wrote: > > > On May 18, 9:38 am, Masklinn wrote: > > On 2012-05-18, at 18:08 , stoneleaf wrote: > >> My dbf module provides direct access to dbf files. A retrieved record > >> is > >> a singleton object, and allows temporary changes that are not written > >> to > >> disk. Whether those changes are seen by the next incarnation depends > >> on > >> (I had thought) whether or not the record with the unwritten changes > >> has > >> gone out of scope. > > > > If a record is a singleton, that singleton-ification would be handled > > through weakrefs would it not? > > Indeed, that is the current bahavior. > > > In that case, until the GC is triggered (and the weakref is > > invalidated), you will keep getting your initial singleton and there > > will be no "next record", I fail to see why that would be an issue. > > Because, since I had only been using CPython, I was able to count on > records that had gone out of scope disappearing along with their > _temporary_ changes. If I get that same record back the next time I > loop > through the table -- well, then the changes weren't temporary, were > they? > So you're taking a *dependence* on the reference counting garbage collection of the CPython implementation, and when that doesn't work for you with other implementations trying to force the same semantics on them. 
Your proposal can't reasonably be implemented by other implementations as working out whether there are any references to an object is an expensive operation. A much better technique would be for you to use explicit life-cycle-management (like the with statement) for your objects. Michael > > >> I see two questions that determine whether this change should be made: > > > >> 1) How difficult it would be for the non-ref counting > >> implementations to implement > > > > Pretty much impossible I'd expect, the weakrefs can only be broken on GC > > runs (at object deallocation) and that is generally non-deterministic > > without specifying precisely which type of GC implementation is used. > > You'd need a fully deterministic deallocation model to ensure a weakref > > is broken as soon as the corresponding object has no outstanding strong > > (and soft, in some VMs like the JVM) reference. > > > >> 2) Whether it's appropriate to have objects be changed, but not > >> saved, and then discarded when the strong references are gone so the > >> next incarnation doesn't see the changes, even if the object hasn't > >> been destroyed yet. > > > > If your saves are synchronized with the weakref being broken (the object > > being *effectively* collected) and the singleton behavior is as well, > > there will be no difference, I'm not sure what the issue would be, you > > might just have a second change cycle using the same unsaved (but still > > modified) object. > > And that's exactly the problem -- I don't want to see the > modifications the > second time 'round, and if I can't count on weakrefs invalidating as > soon as > the strong refs are gone I'll have to completely rethink how I handle > records > from the table. > > > Although frankly speaking such reliance on non-deterministic events would > > scare the shit out of me. > > Indeed -- I hadn't realized that I was until somebody using PyPy > noticed the > problem. 
> > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat May 19 17:29:02 2012 From: ethan at stoneleaf.us (stoneleaf) Date: Sat, 19 May 2012 08:29:02 -0700 (PDT) Subject: [Python-ideas] weakrefs In-Reply-To: References: <4FB514F0.6000403@stoneleaf.us> <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com> <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net> <1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com> Message-ID: <4109f083-f8c7-4f58-84ef-2da278242934@ri8g2000pbc.googlegroups.com> On May 19, 5:33?am, Michael Foord wrote: > So you're taking a *dependence* on the reference counting garbage > collection of the CPython implementation, and when that doesn't work for > you with other implementations trying to force the same semantics on them. I am not trying to force anything. I stated what I would like, and followed up with questions to further the discussion. > Your proposal can't reasonably be implemented by other implementations as > working out whether there are any references to an object is an expensive > operation. Then that nixes it. The (debatable) advantages aren't worth a large expenditure in programmer time, nor a large hit in performance. > A much better technique would be for you to use explicit > life-cycle-management (like the with statement) for your objects. I'm leaning strongly towards just not allowing temporary changes, which will also solve my problem. Thanks everyone for the feedback. 
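
One way to read the "explicit life-cycle management" suggestion above is
a context manager that rolls back unsaved edits on exit, so the result
no longer depends on when (or whether) the garbage collector runs. The
sketch below is hypothetical: Record and temporary_changes are made-up
names for illustration and are not the dbf module's API.

```python
from contextlib import contextmanager


class Record:
    """Hypothetical stand-in for a table record (not the dbf API)."""

    def __init__(self, **fields):
        self.fields = dict(fields)


@contextmanager
def temporary_changes(record):
    """Allow edits to record inside the block and undo them on exit."""
    saved = dict(record.fields)      # snapshot of the field values
    try:
        yield record
    finally:
        # Restore in place so other references to record.fields
        # observe the rollback as well.
        record.fields.clear()
        record.fields.update(saved)


rec = Record(name="widget", qty=3)
with temporary_changes(rec) as r:
    r.fields["qty"] = 99             # visible only inside the block
```

The rollback happens at a point the programmer chooses, which behaves
the same on CPython, PyPy, or any other implementation.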
~Ethan~ From bborcic at gmail.com Mon May 21 16:27:35 2012 From: bborcic at gmail.com (Boris Borcic) Date: Mon, 21 May 2012 16:27:35 +0200 Subject: [Python-ideas] [...].join(sep) In-Reply-To: References: Message-ID: anatoly techtonik wrote: > I am certain this was proposed many times, but still - why it is rejected? > > "real man don't use spaces".split().join('+').upper() > instead of > '+'.join("real man don't use spaces".split()).upper() IMO this should really be : '+'.join(' '.split("real man don't use spaces")).upper() From anacrolix at gmail.com Mon May 21 18:17:06 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Tue, 22 May 2012 02:17:06 +1000 Subject: [Python-ideas] Composability and concurrent.futures In-Reply-To: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu> References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu> Message-ID: On Thu, May 17, 2012 at 4:43 AM, Adrian Sampson wrote: > The concurrent.futures module in the Python standard library has problems > with composability. If I start a ThreadPoolExecutor to run some library > functions that internally use ThreadPoolExecutor, I will end up with many > more worker threads on my system than I expect. For example, each parallel > execution wants to take full advantage of an 8-core machine, I could end up > with as many as 8*8=64 competing worker threads, which could significantly > hurt performance. > > This is because each instance of ThreadPoolExecutor (or > ProcessPoolExecutor) maintains its own independent worker pool. Especially > in situations where the goal is to exploit multiple CPUs, it's essential > for any thread pool implementation to globally manage contention between > multiple concurrent job schedulers. > > I'm not sure about the best way to address this problem, but here's one > proposal: Add additional executors to the futures library. > ComposableThreadPoolExecutor and ComposableProcessPoolExecutor would each > use a *shared* thread-pool model. 
When created, these composable executors > will check to see if they are being created within a future worker > thread/process initiated by another composable executor. If so, the "child" > executor will forward all submitted jobs to the executor in the parent > thread/process. Otherwise, it will behave normally, starting up its own > worker pool. > > Has anyone else dealt with composition problems in parallel programs? What > do you think of this solution -- is there a better way to tackle this > deficiency? It's my understanding this is a known flaw with concurrency *in general*. Currently most multi-{threaded,process} applications assume they're the only ones running on the system. As does the likely implementation of the proposed composable pools problem you've posed. A proper interprocess scheduler is required to handle this ideally. (See GCD, and runtime implementations that provide at least some userspace scheduling such as Go, however poor it may be). Secondly, composable pools don't handle recursive relationships well. If a thread in one pool depends on the completion of all the tasks in its own pool to complete before it can itself complete, you'll have deadlock. Personally if I implemented a composable thread pool I'd have it global, creation and submission of tasks would be proxied to it via some composable executor class. As it stands, thread pools are best for task-oriented concurrency rather than parallelism anyway, especially in CPython. In short, I think composable thread pools are a hack at best and won't gain you anything except a slightly reduced threading overhead. If you want optimal utilization, threading isn't the right place to be looking. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From asampson at cs.washington.edu Mon May 21 19:21:01 2012 From: asampson at cs.washington.edu (Adrian Sampson) Date: Mon, 21 May 2012 10:21:01 -0700 Subject: [Python-ideas] Composability and concurrent.futures In-Reply-To: References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu> Message-ID: <951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu> On May 21, 2012, at 9:17 AM, Matt Joiner wrote: > Personally if I implemented a composable thread pool I'd have it > global, creation and submission of tasks would be proxied to it via > some composable executor class. I agree completely. Maybe the implementation I described was overly hacky for the sake of transparent compatibility with the existing (non-composable) executors in concurrent.futures. Ideally, the system would have one global pool which many concurrency APIs -- not just concurrent.futures -- could potentially share. (In a *really* ideal world, the OS would provide thread pool management -- like GCD, which you mentioned, or scheduler activations. But a cross-platform library currently requires a less ambitious solution.) > In short, I think composable thread pools are a hack at best and won't > gain you anything except a slightly reduced threading overhead. If you > want optimal utilization, threading isn't the right place to be > looking. To be clear, I meant to refer to processes *or* threads when discussing the problem originally. The ProcessPoolExecutor is pretty useful (in my experience) for easily getting speedup even on pure-Python CPU-bound workloads. Adrian From tjreedy at udel.edu Mon May 21 20:29:34 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 21 May 2012 14:29:34 -0400 Subject: [Python-ideas] [...].join(sep) In-Reply-To: References: Message-ID: On 5/21/2012 10:27 AM, Boris Borcic wrote: > anatoly techtonik wrote: >> I am certain this was proposed many times, but still - why it is >> rejected? 
>>
>> "real man don't use spaces".split().join('+').upper()
>> instead of
>> '+'.join("real man don't use spaces".split()).upper()
>
> IMO this should really be :
>
> '+'.join(' '.split("real man don't use spaces")).upper()

If the separator were a mandatory argument for .split, then that would
be possible, but not with it being optional, and therefore the second
argument.

>>> ' real  men  usE SPAces   and \t tabs'.split()
['real', 'men', 'usE', 'SPAces', 'and', 'tabs']
>>> ' real  men  usE SPAces   and \t tabs'.split(' ')
['', 'real', '', 'men', '', 'usE', 'SPAces', '', '', 'and', '\t', 'tabs']
>>> ' '.join(' real  men  usE SPAces   and \t tabs'.split())
'real men usE SPAces and tabs'

is a handy way to clean up whitespace

-- Terry Jan Reedy

From techtonik at gmail.com Tue May 22 17:39:16 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 22 May 2012 18:39:16 +0300
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
Message-ID:

Hello again,

I've finally found some time to partially process the replies and came
up with a better solution than subprocess.* and shutil.runret/runout

Disclaimer: I don't say that subprocess suxx - it is powerful and very
awesome under the hood. What I want to say is that its final user
interface is awful - for such a complex thing as this it should have
been passed through several iteration cycles before settling down.

Therefore, inspired by Fabric API, I've finally found the solution -
shutil.run() function:
https://bitbucket.org/techtonik/shutil-run/src

  run(command, combine_stderr=True):

    Run command through a system shell, return output string with
    additional properties:

    output.succeeded - result of the operation True/False
    output.return_code - specific return code
    output.stderr - stderr contents if combine_stderr=False

    `combine_stderr` if set, makes stderr merged into output string,
    otherwise it will be available as `output.stderr` attribute.

Example:

    from shellrun import run

    output = run('ls -la')
    if output.succeeded:
        print(output)
    else:
        print("Error %s" % output.return_code)

That's the most intuitive way I found so far. Objective advantages:

1. Better than
     subprocess.call(cmd, shell=True)
     subprocess.check_call(cmd, shell=True)
     subprocess.check_output(cmd, shell=True)
   because it is just
     shutil.run(cmd)
   i.e. short, simple and _easy to remember_

2. With shutil.run() you don't need to rewrite your check_call() or
   check_output() with Popen() if you need to get return_code in
   addition to stderr contents on error

3. shutil.run() is predictable and consistent - its arguments are not
   dependent on each other, their combination doesn't change the
   function behavior over and over requiring you to iterate over the
   documentation and warnings again and again

4. shutil.run() is the correct next level API over subprocess base
   level. subprocess executes external process - that is its role, but
   automatic ability to execute external process inside another
   external process (shell) looks like a hack to me. Practical, but
   still a hack.

5. No required exception catching, which doesn't work for shell=True
   anyway

6. No need to learn subprocess.PIPE routing magic (not an argument for
   hackers, I know)

Subjective advantages:

1. More beautiful
2. More simple
3. More readable
4. Practical
5. Obvious
6. It is easy to explain

Hopefully, it can find its way in stdlib instead of
http://shell-command.readthedocs.org/

-- anatoly t.

From ericsnowcurrently at gmail.com Tue May 22 18:26:20 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 22 May 2012 10:26:20 -0600
Subject: [Python-ideas] a simple namespace type
Message-ID:

Below I've included a pure Python implementation of a type that I wish
was a builtin. I know others have considered similar classes in the
past without any resulting change to Python, but I'd like to consider
it afresh[1][2].

    class SimpleNamespace:
        """A simple attribute-based namespace."""
        def __init__(self, **kwargs):
            self.__dict__.update(kwargs)  # or self.__dict__ = kwargs
        def __repr__(self):
            keys = sorted(k for k in self.__dict__ if not k.startswith('_'))
            content = ("{}={!r}".format(k, self.__dict__[k]) for k in keys)
            return "{}({})".format(type(self).__name__, ", ".join(content))

This is the sort of class that people implement all the time. There's
even a similar one in the argparse module, which inspired the second
class below[3]. If the builtin object type were dict-based rather than
slot-based then this sort of namespace type would be mostly
superfluous. However, I also understand how that would add an
unnecessary resource burden on _all_ objects. So why not a new type?

Nick Coghlan had this objection recently to a similar proposal[4]:

    Please, no. No new
    just-like-a-namedtuple-except-you-can't-iterate-over-it type, and
    definitely not one exposed in the collections module.

    We've been over this before: collections.namedtuple *is* the
    standard library's answer for structured records. TOOWTDI, and the
    way we have already chosen includes iterability as one of its
    expected properties.

As you can see he's referring to "structured records", but I expect
that his objections could be extended somewhat to this proposal. I see
where he's coming from and agree relative to structured records.
However, I also think that a simple namespace type would be a benefit
to different use cases, namely where you want a simple dynamic
namespace.

Making a simple namespace class is trivial and likely just about
everyone has written one: "class Namespace: pass" or even
"type('Namespace', (), {})". Obviously the type in this proposal has
more meat, but that's certainly not necessary. So why a new type?

The main reason is that as a builtin type the simple namespace type
could be used in builtin modules[5][6][7].

Thoughts?
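
For readers trying this out today: Python 3.3 and later ship a type
along these lines as types.SimpleNamespace. The short sketch below uses
that standard-library class only to illustrate the "simple dynamic
namespace" use case; the variable names are made up for the example.

```python
from types import SimpleNamespace

# Attributes can be supplied up front as keyword arguments...
config = SimpleNamespace(host="localhost", port=8080)

# ...and added or changed later, unlike a namedtuple.
config.debug = True
config.port = 9090

# Two namespaces compare equal when their attributes match.
same = SimpleNamespace(host="localhost", port=9090, debug=True)
is_equal = (config == same)

# The underlying attribute dict stays reachable through vars().
as_dict = vars(config)
```

Unlike a namedtuple, such an object is neither iterable nor immutable,
which is the distinction the proposal is drawing.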

-eric

[1] http://mail.python.org/pipermail/python-dev/2012-May/119387.html
[2] http://mail.python.org/pipermail/python-dev/2012-May/119393.html
[3] http://hg.python.org/cpython/file/dff6c506c2f1/Lib/argparse.py#l1177
[4] http://mail.python.org/pipermail/python-dev/2012-May/119412.html
[5] http://mail.python.org/pipermail/python-dev/2012-May/119395.html
[6] http://mail.python.org/pipermail/python-dev/2012-May/119399.html
[7] http://mail.python.org/pipermail/python-dev/2012-May/119402.html

--------------------------

    class Namespace(SimpleNamespace):
        def __dir__(self):
            return sorted(k for k in self.__dict__ if not k.startswith('_'))
        def __eq__(self, other):
            return self.__dict__ == other.__dict__
        def __ne__(self, other):
            return self.__dict__ != other.__dict__
        def __contains__(self, name):
            return name in self.__dict__

From mwm at mired.org Tue May 22 22:30:53 2012
From: mwm at mired.org (Mike Meyer)
Date: Tue, 22 May 2012 16:30:53 -0400
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To:
References:
Message-ID: <20120522163053.684b43d0@bhuda.mired.org>

On Tue, 22 May 2012 18:39:16 +0300 anatoly techtonik wrote:

> Therefore, inspired by Fabric API, I've finally found the solution -
> shutil.run() function:
> https://bitbucket.org/techtonik/shutil-run/src
>
> run(command, combine_stderr=True):
>
> Run command through a system shell, return output string with
> additional properties:
>
> output.succeeded - result of the operation True/False
> output.return_code - specific return code
> output.stderr - stderr contents if combine_stderr=False
>
> `combine_stderr` if set, makes stderr merged into output string,
> otherwise it will be available as `output.stderr` attribute.
[...]
> That's the most intuitive way I found so far. Objective advantages:
>
> 1. Better than
> subprocess.call(cmd, shell=True)
> subprocess.check_call(cmd, shell=True)
> subprocess.check_output(cmd, shell=True)
> because it is just
> shutil.run(cmd)
> i.e.
short, simple and _easy to remember_ -2 Unless there's some way to turn off shell processing (better yet, have no shell processing be the default, and require that it be turned on), it can't be used securely with tainted strings, so it should *not* be used with tainted strings, which means it's pretty much useless in any environment where security matters. With everything being networked, there may no longer be any such environments. > 3. shutil.run() is predictable and consistent - its arguments are not > dependent on each other, their combination doesn't change the function > behavior over and over requiring you iterate over the documentation > and warnings again and again As proposed, it certainly provides a predictable and consistent vulnerability to code injection attacks. > 4. shutil.run() is the correct next level API over subprocess base > level. subprocess executes external process - that is its role, but > automatic ability to execute external process inside another external > process (shell) looks like a hack to me. Practical, but still a hack. It's only correct if you are in an environment where you don't care about security. If you care about security, you can't use it. If we're going to add yet another system() replacement, let's at least try and make it secure. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Tue May 22 23:41:28 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 23 May 2012 07:41:28 +1000 Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout) In-Reply-To: <20120522163053.684b43d0@bhuda.mired.org> References: <20120522163053.684b43d0@bhuda.mired.org> Message-ID: Right, security implications are one of the reasons why I've held back from proposing Shell Command. The lack of cross platform support is also a pain. This suggestion shares both of those problems. 
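To make the injection concern concrete, here is a minimal sketch. It assumes a POSIX shell for the shell=True call, and the "tainted" string is a made-up input, not something from this thread. It only contrasts the two existing subprocess calling conventions; it says nothing about how shutil.run() itself is implemented.

```python
import subprocess
import sys

# Hypothetical untrusted input: a "filename" that smuggles in a second command.
tainted = "report.txt; echo INJECTED"

# Shell form: the string is parsed by the shell, so ';' starts a new command.
# (Assumes a POSIX shell; other shells use different metacharacters.)
unsafe_output = subprocess.check_output("echo " + tainted, shell=True).decode()

# Argument-list form: no shell is involved, so the whole string reaches the
# child process as a single argv entry and ';' has no special meaning.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])", tainted]
safe_output = subprocess.check_output(child).decode()

print(unsafe_output)
print(safe_output)
```

With the shell form the second command actually runs; with the list form the same text is just data.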
Having dealt with long running child processes lately, I can also say that producing output line-by-line would be on my personal list of requirements. So, yeah, interesting idea, but this is still an area that needs a lot of exploration on PyPI before we select an answer for the stdlib. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Wed May 23 08:49:46 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 23 May 2012 02:49:46 -0400 Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout) In-Reply-To: References: <20120522163053.684b43d0@bhuda.mired.org> Message-ID: On Tue, May 22, 2012 at 5:41 PM, Nick Coghlan wrote: > Having dealt with long running child processes lately, I can also say that > producing output line-by-line would be on my personal list of requirements. You can do that with subprocess, right? Just have to be sure to close stdin/stderr and read p.stdout with readline() repeatedly... I think you might be able to even have the other file descriptors be inputting/outputting if you use threads, but I'm scared of experimenting with these things -- experiments don't tell you that it doesn't work on an OS you don't have. -- Devin From ncoghlan at gmail.com Wed May 23 09:09:12 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 23 May 2012 17:09:12 +1000 Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout) In-Reply-To: References: <20120522163053.684b43d0@bhuda.mired.org> Message-ID: On Wed, May 23, 2012 at 4:49 PM, Devin Jeanpierre wrote: > On Tue, May 22, 2012 at 5:41 PM, Nick Coghlan wrote: >> Having dealt with long running child processes lately, I can also say that >> producing output line-by-line would be on my personal list of requirements. > > You can do that with subprocess, right? Just have to be sure to close > stdin/stderr and read p.stdout with readline() repeatedly... 
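A rough sketch of the readline() approach described above, for illustration only. The child command is a stand-in invented for this example, and subprocess.DEVNULL assumes Python 3.3 or later. Merging stderr into stdout sidesteps the two-pipe deadlock; reading both streams separately would need threads or select.

```python
import subprocess
import sys

# Stand-in child process (an assumption for illustration): prints three lines.
child = [sys.executable, "-c", "for i in range(3): print('line', i)"]

proc = subprocess.Popen(
    child,
    stdin=subprocess.DEVNULL,      # the child gets no input from us
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,      # merge stderr so only one pipe needs draining
    universal_newlines=True,       # text mode: readline() returns str
)

lines = []
# readline() returns '' only at end of file, which ends the loop.
for line in iter(proc.stdout.readline, ""):
    lines.append(line.rstrip("\n"))

proc.stdout.close()
return_code = proc.wait()
print(lines, return_code)
```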
Yep, subprocess is a swiss army knife - you can do pretty much anything with it. That's the complaint, though - *because* it's so configurable, even the existing convenience APIs aren't always that convenient for simple operations.

Thus the current spate of efforts to provide a "friendlier" API for performing shell operations from Python. The dust may settle well enough in the 3.4 time frame for us to declare a "winner" and add something to the standard library, but that's far from certain.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From techtonik at gmail.com Wed May 23 10:38:36 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 11:38:36 +0300
Subject: [Python-ideas] Run attached Python tests in browser
Message-ID:

I am not sure if it belongs here, to python-dev or infrastructure.

Lately I've been looking at http://repl.it/ and found it to be pretty convenient for coding stuff that would otherwise require a Python editor to be installed. So, I thought that it might actually be convenient to use for automatically testing patches in the Python bug tracker without going through the hassle of downloading, patching and running everything locally. Of course, not everything will work, but at least some parts of it could.

--
anatoly t.

From techtonik at gmail.com Wed May 23 10:47:06 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 11:47:06 +0300
Subject: [Python-ideas] shutil.run no security thread
Message-ID:

Ok, let's separately discuss the added value of shutil.run() without touching security at all (subj changed).

Is it ok? Is it a nice idea? Would it be included in stdlib in an ideal world where security implications don't matter?

--
anatoly t.
On Tue, May 22, 2012 at 11:30 PM, Mike Meyer wrote: > On Tue, 22 May 2012 18:39:16 +0300 > anatoly techtonik wrote: > >> Therefore, inspired by Fabric API, I've finally found the solution - >> shutil.run() function: >> https://bitbucket.org/techtonik/shutil-run/src >> >> run(command, combine_stderr=True): >> >> ? ? Run command through a system shell, return output string with >> ? ? additional properties: >> >> ? ? ? ? output.succeeded ? ?- result of the operation True/False >> ? ? ? ? output.return_code ?- specific return code >> ? ? ? ? output.stderr ? ? ? - stderr contents if combine_stderr=False >> >> ? ? ?`combine_stderr` if set, makes stderr merged into output string, >> ? ? ?otherwise it will be available ?as `output.stderr` attribute. > [...] >> That's the most intuitive way I found so far. Objective advantages: >> >> 1. Better than >> ? ? ? ?subprocess.call(cmd, shell=true) >> ? ? ? ?subprocess.check_call(cmd, shell=true) >> ? ? ? ?subprocess.check_output(cmd, shell=True) >> ? ? ?because it is just >> ? ? ? ?shutil.run(cmd) >> ? ? ?i.e. short, simple and _easy to remember_ > > -2 > > Unless there's some way to turn off shell processing (better yet, have > no shell processing be the default, and require that it be turned on), > it can't be used securely with tainted strings, so it should *not* be > used with tainted strings, which means it's pretty much useless in any > environment where security matters. With everything being networked, > there may no longer be any such environments. > >> 3. shutil.run() is predictable and consistent - its arguments are not >> dependent on each other, their combination doesn't change the function >> behavior over and over requiring you iterate over the documentation >> and warnings again and again > > As proposed, it certainly provides a predictable and consistent > vulnerability to code injection attacks. > >> 4. shutil.run() is the correct next level API over subprocess base >> level. 
subprocess executes external process - that is its role, but >> automatic ability to execute external process inside another external >> process (shell) looks like a hack to me. Practical, but still a hack. > > It's only correct if you are in an environment where you don't care > about security. If you care about security, you can't use it. If we're > going to add yet another system() replacement, let's at least try and > make it secure. > > ? ? -- > Mike Meyer ? ? ? ? ? ? ?http://www.mired.org/ > Independent Software developer/SCM consultant, email for more information. > > O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From techtonik at gmail.com Wed May 23 11:04:13 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 23 May 2012 12:04:13 +0300 Subject: [Python-ideas] Important fixes not getting to releases Message-ID: I know why important usability features are not getting into releases - they are taken care too late in release cycle and it all starts with bug tracker which imposes the workflow where usability bugs are always an enhancement. http://bugs.python.org/issue14872 -- anatoly t. From techtonik at gmail.com Wed May 23 11:07:55 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 23 May 2012 12:07:55 +0300 Subject: [Python-ideas] processing subprocess output line-by-line (Was: shutil.run) Message-ID: On Wed, May 23, 2012 at 10:09 AM, Nick Coghlan wrote: > On Wed, May 23, 2012 at 4:49 PM, Devin Jeanpierre > wrote: >> On Tue, May 22, 2012 at 5:41 PM, Nick Coghlan wrote: >>> Having dealt with long running child processes lately, I can also say that >>> producing output line-by-line would be on my personal list of requirements. >> >> You can do that with subprocess, right? Just have to be sure to close >> stdin/stderr and read p.stdout with readline() repeatedly... > > Yep, subprocess is a swiss army knife - you can do pretty much > anything with it. 
That's the complaint, though - *because* it's so
> configurable, even the existing convenience APIs aren't always that
> convenient for simple operations.

It is quite likely that there are use cases where subprocess fails, because they require async control.
http://bugs.python.org/issue14872

And a line-by-line recipe is here:
http://stackoverflow.com/questions/5582933/need-to-avoid-subprocess-deadlock-without-communicate

--
anatoly t.

From techtonik at gmail.com Wed May 23 11:12:51 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 12:12:51 +0300
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To:
References: <20120522163053.684b43d0@bhuda.mired.org>
Message-ID:

On Wed, May 23, 2012 at 12:41 AM, Nick Coghlan wrote:
> Right, security implications are one of the reasons why I've held back from
> proposing Shell Command. The lack of cross platform support is also a pain.
> This suggestion shares both of those problems.

Why is shutil.run() not cross-platform? Is it technically feasible to make shutil.run() (or subprocess.* for that matter) cross-platform?

> Sent from my phone, thus the relative brevity :)

That actually lowers the bounce rate for discussion.
=) From pyideas at rebertia.com Wed May 23 11:26:53 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 23 May 2012 02:26:53 -0700 Subject: [Python-ideas] shutil.run no security thread In-Reply-To: References: Message-ID: > On Tue, May 22, 2012 at 11:30 PM, Mike Meyer wrote: >> On Tue, 22 May 2012 18:39:16 +0300 >> anatoly techtonik wrote: >> >>> Therefore, inspired by Fabric API, I've finally found the solution - >>> shutil.run() function: >>> https://bitbucket.org/techtonik/shutil-run/src >> -2 >> >> Unless there's some way to turn off shell processing (better yet, have >> no shell processing be the default, and require that it be turned on), >> it can't be used securely with tainted strings, so it should *not* be >> used with tainted strings, which means it's pretty much useless in any >> environment where security matters. With everything being networked, >> there may no longer be any such environments. >> >>> 3. shutil.run() is predictable and consistent - its arguments are not >> As proposed, it certainly provides a predictable and consistent >> vulnerability to code injection attacks. >> >>> 4. shutil.run() is the correct next level API over subprocess base >>> level. subprocess executes external process - that is its role, but >>> automatic ability to execute external process inside another external >>> process (shell) looks like a hack to me. Practical, but still a hack. >> >> It's only correct if you are in an environment where you don't care >> about security. If you care about security, you can't use it. If we're >> going to add yet another system() replacement, let's at least try and >> make it secure. On Wed, May 23, 2012 at 1:47 AM, anatoly techtonik wrote: > Ok, let's separately discuss shutil.run() added value without touching > security at all (subj changed). > > Is it ok? Is it nice idea? Would it be included in stdlib in an ideal > world where security implications doesn't matter? 
I hope not, because it'd still have all the /usability/ pitfalls associated with shell interpolation (and the consequent need to escape command arguments). Consider: chris at MBP ~ $ mkdir foo && cd foo chris at MBP foo $ ls chris at MBP foo $ touch '~' # the horror chris at MBP foo $ touch '$EDITOR' # you have a sick mind chris at MBP foo $ ls -l # verify the devious plot total 0 -rw-r--r-- 1 chris staff 0 May 23 02:11 $EDITOR -rw-r--r-- 1 chris staff 0 May 23 02:11 ~ chris at MBP foo $ python Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53) >>> from os import listdir >>> from subprocess import call >>> for entry in listdir('.'): ? ret = call('ls '+entry, shell=True) # ? la your wrapper ... ls: ed: No such file or directory >>> # that's not what I wanted at all! (Less contrived examples left as an exercise for the reader.) Also, this isn't shell-specific, but it still should be made easier to handle properly: What about a file named "--help"? Cheers, Chris -- Sadly, no, `ed` isn't really my editor. http://rebertia.com P.S. Please avoid top-posting in the future. From ncoghlan at gmail.com Wed May 23 14:22:56 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 23 May 2012 22:22:56 +1000 Subject: [Python-ideas] shutil.run no security thread In-Reply-To: References: Message-ID: On Wed, May 23, 2012 at 6:47 PM, anatoly techtonik wrote: > Ok, let's separately discuss shutil.run() added value without touching > security at all (subj changed). > > Is it ok? Is it nice idea? Would it be included in stdlib in an ideal > world where security implications doesn't matter? Sure. That world is called PHP (or C, for that matter). We *care* about security implications, and trying to be secure by default is part of that. Usability isn't everything, and it's OK if software development is sometimes hard. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From techtonik at gmail.com Wed May 23 15:30:32 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 23 May 2012 16:30:32 +0300 Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout) In-Reply-To: <20120522163053.684b43d0@bhuda.mired.org> References: <20120522163053.684b43d0@bhuda.mired.org> Message-ID: About security. On Tue, May 22, 2012 at 11:30 PM, Mike Meyer wrote: > On Tue, 22 May 2012 18:39:16 +0300 > anatoly techtonik wrote: > >> Therefore, inspired by Fabric API, I've finally found the solution - >> shutil.run() function: >> https://bitbucket.org/techtonik/shutil-run/src >> >> run(command, combine_stderr=True): >> >> ? ? Run command through a system shell, return output string with >> ? ? additional properties: >> >> ? ? ? ? output.succeeded ? ?- result of the operation True/False >> ? ? ? ? output.return_code ?- specific return code >> ? ? ? ? output.stderr ? ? ? - stderr contents if combine_stderr=False >> >> ? ? ?`combine_stderr` if set, makes stderr merged into output string, >> ? ? ?otherwise it will be available ?as `output.stderr` attribute. > [...] >> That's the most intuitive way I found so far. Objective advantages: >> >> 1. Better than >> ? ? ? ?subprocess.call(cmd, shell=true) >> ? ? ? ?subprocess.check_call(cmd, shell=true) >> ? ? ? ?subprocess.check_output(cmd, shell=True) >> ? ? ?because it is just >> ? ? ? ?shutil.run(cmd) >> ? ? ?i.e. short, simple and _easy to remember_ > > -2 > > Unless there's some way to turn off shell processing (better yet, have > no shell processing be the default, and require that it be turned on), > it can't be used securely with tainted strings, so it should *not* be > used with tainted strings, which means it's pretty much useless in any > environment where security matters. With everything being networked, > there may no longer be any such environments. What does this "shell processing" involve to understand what to turn off? 
Why is there no way to turn off "shell processing"? What's the primary reason that it is impossible to turn it off?

>> 3. shutil.run() is predictable and consistent - its arguments are not
>> dependent on each other, their combination doesn't change the function
>> behavior over and over requiring you iterate over the documentation
>> and warnings again and again
>
> As proposed, it certainly provides a predictable and consistent
> vulnerability to code injection attacks.

subprocess.* with shell=True provides the same entry point for injection attacks, and security through obscurity doesn't help here. People still use shell=True, because that's sometimes the only way to execute external utilities properly. Even my synapses were silent when I reviewed and used shell=True for the Rietveld upload script and the Spyder IDE.

What will help is a better, simple explanation in a prominent place, with an example that people can really remember, instead of frightening them with warnings. People will eventually ignore the warnings, and after endless experiments with the subprocess.* parameter mess they will just leave shell=True because it works (I did so).

No sane web developer will use subprocess calls on the server side at all, regardless of shell=True or not. For example, how can I be sure that Graphviz is safe from exploits through malicious input? No sane developer will run a shell script on the web side either. For those who still want to - there will be this simple explanation right on the shutil.run() page - with a link to a proper vulnerability analysis instead of an uncertainty-inducing warning.

shutil.run() is aimed at local operations.

>> 4. shutil.run() is the correct next level API over subprocess base
>> level. subprocess executes external process - that is its role, but
>> automatic ability to execute external process inside another external
>> process (shell) looks like a hack to me. Practical, but still a hack.
>
> It's only correct if you are in an environment where you don't care
> about security.
If you care about security, you can't use it. If we're > going to add yet another system() replacement, let's at least try and > make it secure. I am all ears how to make shutil.run() more secure. Right now I must confess that I don't even realize.how serious is this problems, so if anyone can came up with a real-world example with explanation of security concern that could be copied "as-is" into documentation, it will surely be appreciated not only by me. From bborcic at gmail.com Wed May 23 16:50:18 2012 From: bborcic at gmail.com (Boris Borcic) Date: Wed, 23 May 2012 16:50:18 +0200 Subject: [Python-ideas] [...].join(sep) In-Reply-To: References: Message-ID: Terry Reedy wrote: > On 5/21/2012 10:27 AM, Boris Borcic wrote: >> anatoly techtonik wrote: >>> I am certain this was proposed many times, but still - why it is >>> rejected? >>> >>> "real man don't use spaces".split().join('+').upper() >>> instead of >>> '+'.join("real man don't use spaces".split()).upper() >> >> IMO this should really be : >> >> '+'.join(' '.split("real man don't use spaces")).upper() > > It the separator were a mandatory argument for .split, then that would be > possible, not not with it being optional, and therefore the second argument. > > >>> ' real men usE SPAces and tabs'.split() > ['real', 'men', 'usE', 'SPAces', 'and', 'tabs'] > >>> ' real men usE SPAces and tabs'.split(' ') > ['', 'real', '', 'men', '', 'usE', 'SPAces', '', '', 'and', '\t', 'tabs'] > > >>> ' '.join(' real men usE SPAces and tabs'.split()) > 'real men usE SPAces and tabs' > > is a handy way to clean up whitespace > Kind of beside the point, which is that the desire to repair the inconsistency between split and join has a better prospect at the split side of things than at the join side of things. The problems at the split side of things are comparatively minor. 
From bruce at leapyear.org Wed May 23 17:29:28 2012 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 23 May 2012 08:29:28 -0700 Subject: [Python-ideas] [...].join(sep) In-Reply-To: References: Message-ID: On Wed, May 23, 2012 at 7:50 AM, Boris Borcic wrote: > Kind of beside the point, which is that the desire to repair the > inconsistency between split and join has a better prospect at the split > side of things than at the join side of things. The problems at the split > side of things are comparatively minor. > The inconsistency that bugs me is the difference in split behavior between languages. Switching between languages means I have to constantly double check this. What is consistent is that you call string.split(separator) rather than separator.split(string) so changing that doesn't seem at all beneficial. Python split has an optional *maxsplit* parameter: If maxsplit is given, at most maxsplit splits are done (thus, the list will have *at most maxsplit+1* elements). The remainder of the string after the last matched separator is included in the last part. Java split has an optional integer *limit* parameter: ... the pattern will be applied at most limit - 1 times, the array's length will be *no greater than limit* ... The remainder of the string after the last matched separator is included in the last part. C# split has an optional *count* parameter: The maximum number of substrings to return. The remainder of the string after the last matched separator is included in the last part. Ruby split has an optional limit parameter: If limit is a positive number, *at most that number of fields* will be returned. The remainder of the string after the last matched separator is included in the last part. Javascript has an optional limit parameter: It returns *at most limit* parts. The remainder of the string after the last matched separator is *discarded*. And I'm not mentioning the differences in how the separator parameter is interpreted. 
:-)

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

From g.rodola at gmail.com Thu May 24 03:32:42 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Thu, 24 May 2012 03:32:42 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select module
Message-ID:

Including an established async IO framework such as Twisted, gevent or Tornado in the Python stdlib has always been a controversial subject. PEP-3153 (http://www.python.org/dev/peps/pep-3153/) tried to face this problem in the most agnostic way possible, and it's a good starting point IMO. Nevertheless, it's still vague about what the actual API should look like and AFAIK it has remained stagnant so far.

There's one thing in the whole async stack which is basically the same for all implementations though: the poller/reactor. Could it make sense to add something similar to the select module? Unlike the scope of PEP-3153, providing such a layer on top of select(), poll() & co. is easier and could possibly be an incentive to avoid such code duplication.
I'm coming up with this because I recently did something similar in pyftpdlib as an hack on top of asyncore to add support for epoll() and kqueue(), using the excellent Tornado's io loop as source of inspiration: http://code.google.com/p/pyftpdlib/issues/detail?id=203 http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py The way I imagine it: >>> import select >>> dir(select) [..., 'EpollPoller', 'PollPoller', 'SelectPoller', 'KqueuePoller'] >>> poller = select.EpollPoller() >>> poller.register(fd, handler, poller.READ | poller.WRITE) >>> poller.socket_map {2 : } >>> poller.modify(fd, poller.READ) >>> poller.poll() # will call handler.handle_read_event() if/when it's the case ^C KeyboardInterrupt >>> poller.remove(fd) >>> poller.close() The handler is supposed to provide 3 methods: - handle_read_event - handle_write_event - handle_error_event Users willing to support multiple event loops such as wx, gtk etc can do: >>> while 1: ... poller.poll(timeout=0.1, blocking=False) ... otherpoller.poll() Basically, this would be the whole API. Thoughts? --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From g.rodola at gmail.com Thu May 24 03:43:33 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Thu, 24 May 2012 03:43:33 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: 2012/5/24 Giampaolo Rodol? : > The handler is supposed to provide 3 methods: > - handle_read_event > - handle_write_event > - handle_error_event Further note: this is the approach I used in pyftpdlib. An even more abstracted approach would be having poller.poll() return a dict of {fd: events, fd, events, ...}, similarly to what Tornado currently does. This way we wouldn't be forcing the user to provide a handler class with the 3 methods described above. 
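For illustration, the handler-based API sketched above could look roughly like this on top of select.select(). This is only a sketch under stated assumptions: SelectPoller and EchoHandler are hypothetical names, neither exists in the stdlib, and this is not the pyftpdlib implementation.

```python
import select
import socket


class SelectPoller:
    """Hypothetical poller built on select.select(); not a stdlib class."""

    READ = 1
    WRITE = 2

    def __init__(self):
        self.socket_map = {}  # fd -> (handler, event mask)

    def register(self, fd, handler, events):
        self.socket_map[fd] = (handler, events)

    def modify(self, fd, events):
        handler, _ = self.socket_map[fd]
        self.socket_map[fd] = (handler, events)

    def remove(self, fd):
        del self.socket_map[fd]

    def _dispatch(self, fd, method_name):
        # A handler may have unregistered the fd earlier in this same pass.
        entry = self.socket_map.get(fd)
        if entry is not None:
            getattr(entry[0], method_name)()

    def poll(self, timeout=None):
        """Wait once for events and call the matching handler methods."""
        readers = [fd for fd, (_, ev) in self.socket_map.items() if ev & self.READ]
        writers = [fd for fd, (_, ev) in self.socket_map.items() if ev & self.WRITE]
        if not readers and not writers:
            return
        watched = list(set(readers) | set(writers))
        r, w, x = select.select(readers, writers, watched, timeout)
        for fd in x:
            self._dispatch(fd, "handle_error_event")
        for fd in r:
            self._dispatch(fd, "handle_read_event")
        for fd in w:
            self._dispatch(fd, "handle_write_event")


class EchoHandler:
    """Example handler that records whatever arrives on its socket."""

    def __init__(self, sock):
        self.sock = sock
        self.received = []

    def handle_read_event(self):
        self.received.append(self.sock.recv(1024))

    def handle_write_event(self):
        pass

    def handle_error_event(self):
        pass


a, b = socket.socketpair()
poller = SelectPoller()
handler = EchoHandler(b)
poller.register(b.fileno(), handler, poller.READ)
a.sendall(b"ping")
poller.poll(timeout=1.0)   # dispatches handle_read_event() on the handler
poller.remove(b.fileno())
a.close()
b.close()
```

Epoll, poll and kqueue variants would keep the same register/modify/remove/poll surface and differ only in how poll() waits.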
--- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From steve at pearwood.info Thu May 24 04:00:58 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 24 May 2012 12:00:58 +1000 Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout) In-Reply-To: References: <20120522163053.684b43d0@bhuda.mired.org> Message-ID: <4FBD965A.4040801@pearwood.info> anatoly techtonik wrote: > I am all ears how to make shutil.run() more secure. Right now I must > confess that I don't even realize.how serious is this problems, so if > anyone can came up with a real-world example with explanation of > security concern that could be copied "as-is" into documentation, it > will surely be appreciated not only by me. Start here: http://cwe.mitre.org/top25/index.html Code injection attacks include two of the top three security vulnerabilities, over even buffer overflows. One sub-category of code injection: OS Command Injection http://cwe.mitre.org/data/definitions/78.html -- Steven From debatem1 at gmail.com Thu May 24 05:24:39 2012 From: debatem1 at gmail.com (geremy condra) Date: Wed, 23 May 2012 20:24:39 -0700 Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout) In-Reply-To: <4FBD965A.4040801@pearwood.info> References: <20120522163053.684b43d0@bhuda.mired.org> <4FBD965A.4040801@pearwood.info> Message-ID: On Wed, May 23, 2012 at 7:00 PM, Steven D'Aprano wrote: > anatoly techtonik wrote: > > I am all ears how to make shutil.run() more secure. Right now I must >> confess that I don't even realize.how serious is this problems, so if >> anyone can came up with a real-world example with explanation of >> security concern that could be copied "as-is" into documentation, it >> will surely be appreciated not only by me. 
>> > > Start here: > > http://cwe.mitre.org/top25/**index.html > > Code injection attacks include two of the top three security > vulnerabilities, over even buffer overflows. > > One sub-category of code injection: > > OS Command Injection > http://cwe.mitre.org/data/**definitions/78.html I talked about this in my pycon talk this year. It's easy to avoid and disastrous to get wrong. Please don't do it this way. Geremy Condra > > > > > -- > Steven > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Thu May 24 08:47:35 2012 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 24 May 2012 08:47:35 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On 24 May, 2012, at 3:32, Giampaolo Rodol? wrote: >>>> > > The handler is supposed to provide 3 methods: > - handle_read_event > - handle_write_event > - handle_error_event > > Users willing to support multiple event loops such as wx, gtk etc can do: > >>>> while 1: > ... poller.poll(timeout=0.1, blocking=False) > ... otherpoller.poll() > > > Basically, this would be the whole API. Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher). Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4788 bytes Desc: not available URL: From lyricconch at gmail.com Thu May 24 12:47:48 2012 From: lyricconch at gmail.com (=?UTF-8?B?5rW36Z+1?=) Date: Thu, 24 May 2012 18:47:48 +0800 Subject: [Python-ideas] Extended "break" "continue" in "for ... in" block. Message-ID: Hi all... i'd like to propose a syntax extend of "break" and "continue" to let them work together with "yield". 
Syntax: continue_stmt: "continue" [ test ] break_stmt: "break" [ test ] it is only valid in "for ... in" block. when we writing "for in : ", we can say there is a generator ("__g = iter()") providing values (" = next(__g)") and the values are processing by . implenetments computing. (we focus) inside geneator("__iter__" of ) implenetments iteration. (sealed logic) ---- as current ---- "continue" is "next(__g)" (which equals to "__g.send(None)"), "break" leave the block and __g is garbage collected(which implies a __g.close()). "return" "raise" inside leave the block and __g is garbage collected. let's make thing reverse. consider we are write __g' code. generator function implenetments computing. (we focus) outside code( of "for ... in") implenetments continuation. (sealed logic) ---- as proposal ---- "continue " is equiv to "__g.send()". "continue" is alias of "continue None". "break " is equiv to "__g.throw()". "break" is alias of "break GeneratorExit". "return" "raise" inside impies a "break" to __g, with communication between "yield" and "continue", "break", this plays just as Ruby's block except return value of __g lost (may we use an "as" after "for ... in" to fetch return value?). -- = =! From simon.sapin at kozea.fr Thu May 24 13:05:15 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Thu, 24 May 2012 13:05:15 +0200 Subject: [Python-ideas] Extended "break" "continue" in "for ... in" block. In-Reply-To: References: Message-ID: <4FBE15EB.4080706@kozea.fr> Le 24/05/2012 12:47, ?? a ?crit : > we can say there is a generator ("__g = iter()") > [...] > "continue " is equiv to "__g.send()". Hi, iter() returns an iterator, not a generator. All generators are iterators, but not all iterators are generators: an iterator may not have a .send() method. How would "continue something_that_is_not_None" behave with an iterator without a .send() method? 
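The distinction can be checked directly. A small sketch, with a generator function invented for illustration: generators accept send(), while the iterator that iter() returns for a list does not have that method at all.

```python
def counter():
    """Generator that counts upward and lets send() overwrite the count."""
    n = 0
    while True:
        received = yield n
        n = received if received is not None else n + 1


gen = counter()
first = next(gen)       # runs the generator up to its first yield
jumped = gen.send(10)   # resumes it with received == 10

list_iter = iter([1, 2, 3])
generator_has_send = hasattr(gen, "send")
list_iterator_has_send = hasattr(list_iter, "send")

print(first, jumped, generator_has_send, list_iterator_has_send)
```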
Regards, -- Simon Sapin From steve at pearwood.info Thu May 24 13:26:25 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 24 May 2012 21:26:25 +1000 Subject: [Python-ideas] Extended "break" "continue" in "for ... in" block. In-Reply-To: References: Message-ID: <4FBE1AE1.9030800@pearwood.info> ?? wrote: > Hi all... > i'd like to propose a syntax extend of "break" and "continue" > to let them work together with "yield". Can you give an example of how you would use them, and why? -- Steven From g.rodola at gmail.com Thu May 24 13:50:27 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Thu, 24 May 2012 13:50:27 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: 2012/5/24 Ronald Oussoren : > > On 24 May, 2012, at 3:32, Giampaolo Rodol? wrote: >>>>> >> >> The handler is supposed to provide 3 methods: >> - handle_read_event >> - handle_write_event >> - handle_error_event >> >> Users willing to support multiple event loops such as wx, gtk etc can do: >> >>>>> while 1: >> ... ? ? ? poller.poll(timeout=0.1, blocking=False) >> ... ? ? ? otherpoller.poll() >> >> >> Basically, this would be the whole API. > > Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher). > > Ronald poller.poll serves the same purpose of asyncore.loop, yes, but this is supposed to be independent from asyncore. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From ubershmekel at gmail.com Thu May 24 13:59:15 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Thu, 24 May 2012 14:59:15 +0300 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Tue, May 22, 2012 at 7:26 PM, Eric Snow wrote: > Below I've included a pure Python implementation of a type that I wish > was a builtin. 
I know others have considered similar classes in the > past without any resulting change to Python, but I'd like to consider > it afresh[1][2]. > > class SimpleNamespace: > """A simple attribute-based namespace.""" > def __init__(self, **kwargs): > self.__dict__.update(kwargs) # or self.__dict__ = kwargs > def __repr__(self): > keys = sorted(k for k in self.__dict__ if not k.startswith('_')) > content = ("{}={!r}".format(k, self.__dict__[k]) for k, v in keys) > return "{}({})".format(type(self).__name__, ", ".join(content)) > > This is the sort of class that people implement all the time. There's > even a similar one in the argparse module, which inspired the second > class below[3]. If the builtin object type were dict-based rather > than slot based then this sort of namespace type would be mostly > superfluous. However, I also understand how that would add an > unnecessary resource burden on _all_ objects. So why not a new type? > > Nick Coghlan had this objection recently to a similar proposal[4]: > > Please, no. No new > just-like-a-namedtuple-except-you-can't-iterate-over-it type, and > definitely not one exposed in the collections module. > > We've been over this before: collections.namedtuple *is* the standard > library's answer for structured records. TOOWTDI, and the way we have > already chosen includes iterability as one of its expected properties. > [...] I've implemented this a few times as well. I called it "AttributeDict" or "Record". I think adding an __iter__ method would be beneficial. E.g. class SimpleNamespace : def __init__(self, **kwargs): self.__dict__.update(kwargs) # or self.__dict__ = kwargs self.__iter__ = lambda: iter(kwargs.keys()) Why do we need this imo: * sometimes x.something feels better than x['something'] * to ease duck-typing, making mocks, etc. * Named tuple feels clunky for certain dynamic cases (why do I need to create the type for a one-off?) I wonder if SimpleNameSpace should allow __getitem__ as well... 
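Two details in the code quoted above are worth checking before relying on it. This is a sketch under stated assumptions: the class names `InstanceIterNamespace` and `ClassIterNamespace` are hypothetical and only serve the comparison. First, Python looks up special methods such as `__iter__` on the type, not on the instance, so assigning `self.__iter__` in `__init__` does not make the object iterable. Second, the quoted `__repr__` unpacks `k, v` from a list of key strings, which fails for most key names; iterating over `k` alone is presumably what was meant.

```python
class InstanceIterNamespace:
    """Variant from the message: __iter__ stored on the instance."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        # Stored in the instance dict; iter() never consults it.
        self.__iter__ = lambda: iter(kwargs.keys())


class ClassIterNamespace:
    """Variant with __iter__ and __repr__ defined on the class."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __iter__(self):
        # Iterate over attribute names, like a dict iterates over keys.
        return iter(self.__dict__)

    def __repr__(self):
        keys = sorted(k for k in self.__dict__ if not k.startswith('_'))
        # 'for k in keys', not 'for k, v in keys': keys holds plain strings.
        content = ("{}={!r}".format(k, self.__dict__[k]) for k in keys)
        return "{}({})".format(type(self).__name__, ", ".join(content))


try:
    iter(InstanceIterNamespace(a=1))
    instance_iter_works = True
except TypeError:
    instance_iter_works = False

ns = ClassIterNamespace(b=2, a=1)
names = sorted(ns)
text = repr(ns)
```

Whether a namespace type should be iterable at all is the open question in the thread; the sketch only shows how it would have to be spelled if it were.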
Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu May 24 14:03:14 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 24 May 2012 14:03:14 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module References: Message-ID: <20120524140314.303d3bcd@pitrou.net> On Thu, 24 May 2012 13:50:27 +0200 Giampaolo Rodol? wrote: > 2012/5/24 Ronald Oussoren : > > > > On 24 May, 2012, at 3:32, Giampaolo Rodol? wrote: > >>>>> > >> > >> The handler is supposed to provide 3 methods: > >> - handle_read_event > >> - handle_write_event > >> - handle_error_event > >> > >> Users willing to support multiple event loops such as wx, gtk etc can do: > >> > >>>>> while 1: > >> ... ? ? ? poller.poll(timeout=0.1, blocking=False) > >> ... ? ? ? otherpoller.poll() > >> > >> > >> Basically, this would be the whole API. > > > > Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher). > > > > Ronald > > poller.poll serves the same purpose of asyncore.loop, yes, but this is > supposed to be independent from asyncore. I agree with Ronald that it looks like a less-braindead version of asyncore. I don't think the select module is the right place. Also, I don't know why you would specify poller.READ or poller.WRITE explicitly. Usually you are interested in all events, no? Regards Antoine. From ncoghlan at gmail.com Thu May 24 14:21:43 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 24 May 2012 22:21:43 +1000 Subject: [Python-ideas] Extended "break" "continue" in "for ... in" block. In-Reply-To: <4FBE1AE1.9030800@pearwood.info> References: <4FBE1AE1.9030800@pearwood.info> Message-ID: On Thu, May 24, 2012 at 9:26 PM, Steven D'Aprano wrote: > ?? wrote: >> >> Hi all... >> i'd like to propose a syntax extend of "break" and ?"continue" >> to let them work together with "yield". 
> > Can you give an example of how you would use them, and why? It's an approach to driving a coroutine (and one that was discussed back when the coroutine methods were added to generators). Currently, if you're using a generator as a coroutine, you largely *avoid* using it directly as an iterator. Aside from the initial priming of coroutines, most generator based code will either treat them as iterators (via for loops, comprehensions and next() calls), or as coroutines (via send() and throw() calls). The main reason tinkering with for loops has been resisted is that native support for even "continue " (the least controversial part of the suggestion) would likely result in slowing down all for loops to cover the relatively niche coroutine use case. Also, if anything was going to map to throw() it would be "continue raise", not "break": continue -> next(itr) continue -> itr.send() continue raise -> itr.throw() So yeah, this isn't a new proposal, but what's still lacking is a clear justification of what code will actually *gain* from the increase in the language complexity. How often are generator based coroutines actually used outside the context of a larger framework that already takes care of the next/send/throw details? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu May 24 14:37:03 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 24 May 2012 22:37:03 +1000 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodol? wrote: > poller.poll serves the same purpose of asyncore.loop, yes, but this is > supposed to be independent from asyncore. I'd actually like to see something like this pitched as a "concurrent.eventloop" PEP. 
PEP 3153 really wasn't what I was expecting after the discussions at the PyCon US 2011 language summit - I was expecting "here's a common event loop all the async frameworks can hook into", but instead we got something a *lot* more ambitious taht tried to merge the entire IO stack for the async frameworks, rather than just provide a standard way for their event loops to cooperate. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From g.rodola at gmail.com Thu May 24 14:45:01 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Thu, 24 May 2012 14:45:01 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: <20120524140314.303d3bcd@pitrou.net> References: <20120524140314.303d3bcd@pitrou.net> Message-ID: 2012/5/24 Antoine Pitrou : > On Thu, 24 May 2012 13:50:27 +0200 > Giampaolo Rodol? > wrote: >> 2012/5/24 Ronald Oussoren : >> > >> > On 24 May, 2012, at 3:32, Giampaolo Rodol? wrote: >> >>>>> >> >> >> >> The handler is supposed to provide 3 methods: >> >> - handle_read_event >> >> - handle_write_event >> >> - handle_error_event >> >> >> >> Users willing to support multiple event loops such as wx, gtk etc can do: >> >> >> >>>>> while 1: >> >> ... ? ? ? poller.poll(timeout=0.1, blocking=False) >> >> ... ? ? ? otherpoller.poll() >> >> >> >> >> >> Basically, this would be the whole API. >> > >> > Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher). >> > >> > Ronald >> >> poller.poll serves the same purpose of asyncore.loop, yes, but this is >> supposed to be independent from asyncore. > > I agree with Ronald that it looks like a less-braindead version of > asyncore. I don't think the select module is the right place. Yeah, probably. Usually when I post here I'm the first one not being sure whether what I propose is a good idea or not. 
=) Anyway, it must be clear that what I have in mind is not related to asyncore per-se. The proposal is to add a *generic* poller/reactor to select module as an abstraction layer on top of select(), poll(), epoll() and kqueue(), that's all. > Also, I don't know why you would specify poller.READ or poller.WRITE > explicitly. Usually you are interested in all events, no? Nope, that's what asyncore does and that's why it is significantly slower compared to more modern and clever async loops (independenly from the lack of epoll() / kqueue() support in asyncore). You should only be interested in reading for accepting sockets (servers) or when you want to receive data. You should only be interested in writing for connecting sockets (clients) or when you want to send data. Being interested in both when, say, you only intend to receive data is a considerable waste of time, especially when there are many concurrent connections. The performance degradation if you wildly look for both read and write events is *huge*, see benchmarks referring to old vs. new select() implementation here (~8.5x slowdown with 200 concurrent clients): http://code.google.com/p/pyftpdlib/issues/detail?id=203#c6 --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From ncoghlan at gmail.com Thu May 24 14:51:36 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 24 May 2012 22:51:36 +1000 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Thu, May 24, 2012 at 10:37 PM, Nick Coghlan wrote: > On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodol? wrote: >> poller.poll serves the same purpose of asyncore.loop, yes, but this is >> supposed to be independent from asyncore. > > I'd actually like to see something like this pitched as a > "concurrent.eventloop" PEP. 
PEP 3153 really wasn't what I was > expecting after the discussions at the PyCon US 2011 language summit - > I was expecting "here's a common event loop all the async frameworks > can hook into", but instead we got something a *lot* more ambitious > taht tried to merge the entire IO stack for the async frameworks, > rather than just provide a standard way for their event loops to > cooperate. See the final section of my notes here: http://www.boredomandlaziness.org/2011/03/python-language-summit-rough-notes.html Turns out the idea of a PEP 3153 level API *was* raised at the summit, but I'd still like to see a competing PEP that targets the reactor level API directly. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ronaldoussoren at mac.com Thu May 24 14:52:59 2012 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 24 May 2012 14:52:59 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: <20120524140314.303d3bcd@pitrou.net> References: <20120524140314.303d3bcd@pitrou.net> Message-ID: On 24 May, 2012, at 14:03, Antoine Pitrou wrote: > On Thu, 24 May 2012 13:50:27 +0200 > Giampaolo Rodol? > wrote: >> 2012/5/24 Ronald Oussoren : >>> >>> On 24 May, 2012, at 3:32, Giampaolo Rodol? wrote: >>>>>>> >>>> >>>> The handler is supposed to provide 3 methods: >>>> - handle_read_event >>>> - handle_write_event >>>> - handle_error_event >>>> >>>> Users willing to support multiple event loops such as wx, gtk etc can do: >>>> >>>>>>> while 1: >>>> ... poller.poll(timeout=0.1, blocking=False) >>>> ... otherpoller.poll() >>>> >>>> >>>> Basically, this would be the whole API. >>> >>> Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher). >>> >>> Ronald >> >> poller.poll serves the same purpose of asyncore.loop, yes, but this is >> supposed to be independent from asyncore. 
> > I agree with Ronald that it looks like a less-braindead version of > asyncore. I don't think the select module is the right place. What worries me most is that it might only look like a beter version of asyncore. I'd much rather see something based on the event-handling core of Twisted because that code base is used in production and is hence more likely to be correct w.r.t. odd real-world conditions. IIRC doing this was discussed at the language summit in 2011, but as Nick mentions that doesn't seem to be the focus of PEP 3153. I am by the way not using Twisted myself, I'm at this time still using homebrew select loops and asyncore. > > Also, I don't know why you would specify poller.READ or poller.WRITE > explicitly. Usually you are interested in all events, no? You're not always interested in write events, those are only interesting when you have data that must be written to a socket. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4788 bytes Desc: not available URL: From anacrolix at gmail.com Thu May 24 15:05:12 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Thu, 24 May 2012 21:05:12 +0800 Subject: [Python-ideas] Composability and concurrent.futures In-Reply-To: <951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu> References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu> <951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu> Message-ID: > > To be clear, I meant to refer to processes *or* threads when discussing > the problem originally. The ProcessPoolExecutor is pretty useful (in my > experience) for easily getting speedup even on pure-Python CPU-bound > workloads. > FWIW that wasn't the default "use processes" spike. In my experience toying with concurrency in Python, trying to manage the load threads put on the system always ends badly. 
The 2 best supported concurrency mechanisms, threads and processes are constantly t?te-?-t?te, neither are adequate when you start to consider extreme concurrency scenarios. I suggest this because if you're considering composing executors, you're already trying to reduce the overhead (wastage) that processes and threads are incurring on your system for these purposes. -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Thu May 24 15:06:18 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Thu, 24 May 2012 15:06:18 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: <20120524140314.303d3bcd@pitrou.net> Message-ID: 2012/5/24 Ronald Oussoren : > > On 24 May, 2012, at 14:03, Antoine Pitrou wrote: > >> On Thu, 24 May 2012 13:50:27 +0200 >> Giampaolo Rodol? >> wrote: >>> 2012/5/24 Ronald Oussoren : >>>> >>>> On 24 May, 2012, at 3:32, Giampaolo Rodol? wrote: >>>>>>>> >>>>> >>>>> The handler is supposed to provide 3 methods: >>>>> - handle_read_event >>>>> - handle_write_event >>>>> - handle_error_event >>>>> >>>>> Users willing to support multiple event loops such as wx, gtk etc can do: >>>>> >>>>>>>> while 1: >>>>> ... ? ? ? poller.poll(timeout=0.1, blocking=False) >>>>> ... ? ? ? otherpoller.poll() >>>>> >>>>> >>>>> Basically, this would be the whole API. >>>> >>>> Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher). >>>> >>>> Ronald >>> >>> poller.poll serves the same purpose of asyncore.loop, yes, but this is >>> supposed to be independent from asyncore. >> >> I agree with Ronald that it looks like a less-braindead version of >> asyncore. I don't think the select module is the right place. > > What worries me most is that it might only look like a beter version of asyncore. 
Please, forget about asyncore: this has nothing to do with it per se as it's just a reactor - it doesn't aim to provide any connection handling. Given the poor asyncore API I doubt it would be even integrable with it without breaking backward compatibility. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From solipsis at pitrou.net Thu May 24 15:06:59 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 24 May 2012 15:06:59 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module References: Message-ID: <20120524150659.49361158@pitrou.net> On Thu, 24 May 2012 22:37:03 +1000 Nick Coghlan wrote: > On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà wrote: > > poller.poll serves the same purpose of asyncore.loop, yes, but this is > > supposed to be independent from asyncore. > > I'd actually like to see something like this pitched as a > "concurrent.eventloop" PEP. Sounds like a good idea to me. By the way, it should also have some support for delayed calls to be actually useful (something that asyncore *still* doesn't have, AFAIK). Regards Antoine. From ncoghlan at gmail.com Thu May 24 15:23:42 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 24 May 2012 23:23:42 +1000 Subject: [Python-ideas] Composability and concurrent.futures In-Reply-To: References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu> <951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu> Message-ID: It's really up to individual libraries to make it possible for applications to provide the executor explicitly, rather than the library assuming it's OK to just create its own. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ericsnowcurrently at gmail.com Thu May 24 19:17:31 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 24 May 2012 11:17:31 -0600 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Thu, May 24, 2012 at 5:59 AM, Yuval Greenfield wrote: > On Tue, May 22, 2012 at 7:26 PM, Eric Snow > wrote: >> >> Below I've included a pure Python implementation of a type that I wish >> was a builtin. ?I know others have considered similar classes in the >> past without any resulting change to Python, but I'd like to consider >> it afresh[1][2]. >> [...] > > I've implemented this a few times as well. I called it "AttributeDict" or > "Record". > > > I think adding an __iter__ method would be beneficial. E.g. > > class? SimpleNamespace?: > ? ? ?def __init__(self, **kwargs): > ? ? ? ? ?self.__dict__.update(kwargs) ?# or self.__dict__ = kwargs > ? ? ? ? ?self.__iter__ = lambda: iter(kwargs.keys()) I'd like to limit the syntactic overlap with dict as much as possible. Effectively this is just a simple but distinct facade around dict to give a namespace with attribute access. I suppose part of the question is how much of the Mapping interface would belong instead to a hypothetical Namespace interface. (I'm definitely _not_ proposing such an unnecessary extra level of abstraction). Regardless, if you want to do dict things then you can get the underlying dict using vars(ns) or ns.__dict__ on your instance. Alternately you can subclass the SimpleNamespace type to get all the extra goodies you want, as I showed with the Namespace class at the bottom of my first message. > Why do we need this imo: > > * sometimes x.something feels better than x['something'] > * to ease duck-typing, making mocks, etc. > * Named tuple feels clunky for certain dynamic cases (why do I need to > create the type for a one-off?) Yup. > I wonder if SimpleNameSpace should allow __getitem__ as well... 
Same thing: just use vars(ns) or a subclass of SimpleNamespace. -eric From guido at python.org Thu May 24 20:14:28 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 24 May 2012 11:14:28 -0700 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Thu, May 24, 2012 at 10:17 AM, Eric Snow wrote: > On Thu, May 24, 2012 at 5:59 AM, Yuval Greenfield wrote: >> On Tue, May 22, 2012 at 7:26 PM, Eric Snow >> wrote: >>> >>> Below I've included a pure Python implementation of a type that I wish >>> was a builtin. ?I know others have considered similar classes in the >>> past without any resulting change to Python, but I'd like to consider >>> it afresh[1][2]. >>> [...] >> >> I've implemented this a few times as well. I called it "AttributeDict" or >> "Record". I tend to call it "Struct(ure)" -- I guess I like C better than Pascal. :-) >> I think adding an __iter__ method would be beneficial. E.g. >> >> class? SimpleNamespace?: >> ? ? ?def __init__(self, **kwargs): >> ? ? ? ? ?self.__dict__.update(kwargs) ?# or self.__dict__ = kwargs >> ? ? ? ? ?self.__iter__ = lambda: iter(kwargs.keys()) > > I'd like to limit the syntactic overlap with dict as much as possible. +1 > ?Effectively this is just a simple but distinct facade around dict to > give a namespace with attribute access. ?I suppose part of the > question is how much of the Mapping interface would belong instead to > a hypothetical Namespace interface. (I'm definitely _not_ proposing > such an unnecessary extra level of abstraction). Possibly there is a (weird?) parallel with namedtuple. The end result is somewhat similar: you get to use attribute names instead of the accessor syntax (x[y]) of the underlying type. But the "feel" of the type is different, and inherits more of the underlying type (namedtuple is immutable and has a fixed set of keys, whereas the type proposed here is mutable and allows arbitrary keys as long as they look like Python names). 
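The difference in "feel" described above can be made concrete with a short comparison. This is only a sketch: the `SimpleNamespace` class below is a stand-in written for the example (following the `__init__` quoted in the thread), and `Point` is a hypothetical record type.

```python
from collections import namedtuple


class SimpleNamespace:
    """Stand-in for the proposed type: a thin attribute facade over a dict."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


Point = namedtuple("Point", ["x", "y"])

p = Point(1, 2)
ns = SimpleNamespace(x=1, y=2)

# The namespace is mutable and accepts new attribute names at any time.
ns.x = 10
ns.z = 3

# The namedtuple is immutable and its set of fields is fixed.
try:
    p.x = 10
    point_is_mutable = True
except AttributeError:
    point_is_mutable = False

# The namedtuple inherits tuple behaviour: iteration, unpacking, indexing.
x, y = p
first = p[0]

# The namespace exposes its contents only through attributes or vars().
contents = vars(ns)
```

The `vars(ns)` call at the end is the escape hatch suggested earlier in the thread for anyone who wants dict operations on a namespace.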
> Regardless, if you want to do dict things then you can get the > underlying dict using vars(ns) or ns.__dict__ on your instance. > Alternately you can subclass the SimpleNamespace type to get all the > extra goodies you want, as I showed with the Namespace class at the > bottom of my first message. > >> Why do we need this imo: >> >> * sometimes x.something feels better than x['something'] >> * to ease duck-typing, making mocks, etc. >> * Named tuple feels clunky for certain dynamic cases (why do I need to >> create the type for a one-off?) > > Yup. > >> I wonder if SimpleNameSpace should allow __getitem__ as well... > > Same thing: just use vars(ns) or a subclass of SimpleNamespace. -- --Guido van Rossum (python.org/~guido) From g.rodola at gmail.com Thu May 24 20:40:31 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Thu, 24 May 2012 20:40:31 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: 2012/5/24 Nick Coghlan : > On Thu, May 24, 2012 at 10:37 PM, Nick Coghlan wrote: >> On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodol? wrote: >>> poller.poll serves the same purpose of asyncore.loop, yes, but this is >>> supposed to be independent from asyncore. >> >> I'd actually like to see something like this pitched as a >> "concurrent.eventloop" PEP. PEP 3153 really wasn't what I was >> expecting after the discussions at the PyCon US 2011 language summit - >> I was expecting "here's a common event loop all the async frameworks >> can hook into", but instead we got something a *lot* more ambitious >> taht tried to merge the entire IO stack for the async frameworks, >> rather than just provide a standard way for their event loops to >> cooperate. 
> > See the final section of my notes here: > http://www.boredomandlaziness.org/2011/03/python-language-summit-rough-notes.html > > Turns out the idea of a PEP 3153 level API *was* raised at the summit, > but I'd still like to see a competing PEP that targets the reactor > level API directly. > > Cheers, > Nick. It's not clear to me what such a PEP should address in particular, anyway here's a bunch of semi-random ideas. === Idea #1 === 4 classes (SelectPoller, PollPoller, EpollPoller, KqueuePoller) within concurrent.eventloop namespace all sharing the same API: - register(fd, events, callback) # callback gets called with events as arg - modify(fd, events) - unregister(fd) - call_later(timeout, callback, errback=None) - call_every(timeout, callback, errback=None) - poll(timeout=1.0, blocking=True) - close() call_later() and call_every() can return an object having cancel() and reset() methods. The user willing to register a new handler will do: >>> poller.register(sock.fileno(), poller.READ | poller.WRITE, callback) ...then, in the callback: def callback(events): if events & poller.ERROR and not events & poller.READ: disconnect() else: if events & poller.READ: read() if events & poller.WRITE: write() pros: highly customizable cons: too low level, requires manual handling === Idea #2 === same as #1 except: - register(fd, events) - poll(timeout=1.0) # desn't block, return {fd:events, fd:events, ...} === Idea #3 === same as #1 except: - register(fd, events, handler) - poll(timeout=1.0, blocking=True) ...poll() will call handler.handle_X_event() depending on the current event (READ, WRITE or ERROR). An internal map such as {fd:handler, fd:handler} will be maintaned internally. 
- pros: easier to use - cons: more rigid, requires a "contract" with the handler --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From ericsnowcurrently at gmail.com Thu May 24 21:34:07 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 24 May 2012 13:34:07 -0600 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Thu, May 24, 2012 at 12:14 PM, Guido van Rossum wrote: > On Thu, May 24, 2012 at 10:17 AM, Eric Snow wrote: >> ?Effectively this is just a simple but distinct facade around dict to >> give a namespace with attribute access. ?I suppose part of the >> question is how much of the Mapping interface would belong instead to >> a hypothetical Namespace interface. (I'm definitely _not_ proposing >> such an unnecessary extra level of abstraction). > > Possibly there is a (weird?) parallel with namedtuple. The end result > is somewhat similar: you get to use attribute names instead of the > accessor syntax (x[y]) of the underlying type. But the "feel" of the > type is different, and inherits more of the underlying type > (namedtuple is immutable and has a fixed set of keys, whereas the type > proposed here is mutable and allows arbitrary keys as long as they > look like Python names). Yeah, the feel is definitely different. I've been thinking about this because of the code for sys.implementation. Using a structseq would probably been the simplest approach there, but a named tuple doesn't feel right. In contrast, a SimpleNamespace would fit much better. As far as this goes generally, the pattern of a simple, dynamic attribute-based namespace has been implemented a zillion times (and it's easy to do). This is because people find a simple dynamic namespace really handy and they want the attribute-access interface rather than a mapping. In contrast, a namedtuple is, as Nick said, "the standard library's answer for structured records". 
It's an immutable (attribute-based) namespace implementing the Sequence interface. It's a tuple and directly reflects the underlying concept of tuples in Python by giving the values names. SimpleNamespace (and the like) isn't a structured record. It's only job is to be an attribute-based namespace with as simple an interface as possible. So why isn't a type like SimpleNamespace in the stdlib? Because it's trivial to implement. There's a certain trivial-ness threshold a function/type must pass before it gets canonized, and rightly so. Anyway, while many would use something like SimpleNamespace out the the standard library, my impetus was having it as a builtin type so I could use it for sys.implementation. :) FWIW, I have an implementation (pure Python + c extension) of SimpleNamespace on PyPI: http://pypi.python.org/pypi/simple_namespace -eric From tjreedy at udel.edu Thu May 24 22:04:40 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 24 May 2012 16:04:40 -0400 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On 5/24/2012 2:40 PM, Giampaolo Rodol? wrote: > It's not clear to me what such a PEP should address in particular, > anyway here's a bunch of semi-random ideas. I have been reading for perhaps a decade how bad asyncore is. So I hope you stick with trying to thrash out something different, even if the discussion gets tedious or contentions. > === Idea #1 === > > 4 classes (SelectPoller, PollPoller, EpollPoller, KqueuePoller) within > concurrent.eventloop namespace all sharing the same API: For new classes, the first question is what concept (and data/function grouping) they and their instances represent. As a naive event loop user, I might think in terms of event sources (or sets of sources) and corresponding handlers. For events generated by 'file' polling, the particular method would seem like a secondary issue. 
Your proposed classes are named after methods and you give no initialization api. This suggests to me that you mean for all files being polled by the same method to be grouped together. If so, there would only need 0 or 1 instance of each 'class', in while case, they could just as well be modules. In other words, I am unsure what concept these classes would represent. I am perhaps thinking at too high a level. > - register(fd, events, callback) # callback gets called with events as arg > - modify(fd, events) > - unregister(fd) > - call_later(timeout, callback, errback=None) > - call_every(timeout, callback, errback=None) > - poll(timeout=1.0, blocking=True) > - close() > > call_later() and call_every() can return an object having cancel() and > reset() methods. -- Terry Jan Reedy From cs at zip.com.au Fri May 25 00:37:52 2012 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 25 May 2012 08:37:52 +1000 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: <20120524140314.303d3bcd@pitrou.net> References: <20120524140314.303d3bcd@pitrou.net> Message-ID: <20120524223752.GA7468@cskk.homeip.net> On 24May2012 14:03, Antoine Pitrou wrote: | Also, I don't know why you would specify poller.READ or poller.WRITE | explicitly. Usually you are interested in all events, no? Personally, I would want specificity. If I only care about write (eg I'm only sending), I would only specify poller.WRITE and have my handler only know and care about that. Possibly it would be good to be able to raise an exception for events I hadn't handled, but I'd be half inclined to have my handler do that, were it wanted (yes, there is some tension in this sentence). Unless I'm missing something here. Just my 2c, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ I just didn't give up, not riding it out wasn't an option. You don't crash, until you do. The longer you ride it out the more likely you are to ride it out. Throwing it away, saves nothing. 
- J. Pridmore From ronaldoussoren at mac.com Fri May 25 08:39:23 2012 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 25 May 2012 08:39:23 +0200 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On 24 May, 2012, at 20:40, Giampaolo Rodolà wrote: > 2012/5/24 Nick Coghlan : >> On Thu, May 24, 2012 at 10:37 PM, Nick Coghlan wrote: >>> On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà wrote: >>>> poller.poll serves the same purpose of asyncore.loop, yes, but this is >>>> supposed to be independent from asyncore. >>> >>> I'd actually like to see something like this pitched as a >>> "concurrent.eventloop" PEP. PEP 3153 really wasn't what I was >>> expecting after the discussions at the PyCon US 2011 language summit - >>> I was expecting "here's a common event loop all the async frameworks >>> can hook into", but instead we got something a *lot* more ambitious >>> that tried to merge the entire IO stack for the async frameworks, >>> rather than just provide a standard way for their event loops to >>> cooperate. >> >> See the final section of my notes here: >> http://www.boredomandlaziness.org/2011/03/python-language-summit-rough-notes.html >> >> Turns out the idea of a PEP 3153 level API *was* raised at the summit, >> but I'd still like to see a competing PEP that targets the reactor >> level API directly. >> >> Cheers, >> Nick. > > > It's not clear to me what such a PEP should address in particular, > anyway here's a bunch of semi-random ideas. All of these are probably too low level to be the only API because they don't encapsulate error handling. A slightly higher level API would have a callback with received data and a buffered API for sending data. That way the networking library can deal with low-level socket API errors and translate them to useful abstract errors. It would also handle some errors, like an EAGAIN error, itself. Also: how would you use SSL with these APIs?
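The "slightly higher level" shape described above (a callback for received data, a buffered send, EAGAIN handled by the library) might be sketched as follows. This is an illustration under assumptions, not the API proposed in the thread: it uses the `selectors` module that was added to the standard library later (Python 3.4), and `BufferedConnection` and `poll_once` are hypothetical names. SSL is not covered.

```python
import selectors
import socket


class BufferedConnection:
    """Hypothetical handler: delivers received data, buffers outgoing data."""

    def __init__(self, selector, sock, on_data):
        self.selector = selector
        self.sock = sock
        self.on_data = on_data
        self.outbuf = bytearray()
        sock.setblocking(False)
        # Only ask for read events until there is something to write.
        selector.register(sock, selectors.EVENT_READ, self)

    def send(self, data):
        """Queue data and start watching for writability."""
        self.outbuf += data
        self.selector.modify(
            self.sock, selectors.EVENT_READ | selectors.EVENT_WRITE, self)

    def handle(self, events):
        if events & selectors.EVENT_READ:
            try:
                data = self.sock.recv(4096)
            except BlockingIOError:
                pass  # EAGAIN/EWOULDBLOCK: nothing to read after all
            else:
                if not data:
                    self.close()  # peer closed the connection
                    return
                self.on_data(data)
        if events & selectors.EVENT_WRITE and self.outbuf:
            try:
                sent = self.sock.send(self.outbuf)
            except BlockingIOError:
                sent = 0  # EAGAIN/EWOULDBLOCK: retry on the next poll
            del self.outbuf[:sent]
            if not self.outbuf:
                # Buffer drained: stop asking for write events.
                self.selector.modify(self.sock, selectors.EVENT_READ, self)

    def close(self):
        self.selector.unregister(self.sock)
        self.sock.close()


def poll_once(selector, timeout=1.0):
    """Dispatch one batch of ready events to their handlers."""
    for key, events in selector.select(timeout):
        key.data.handle(events)


sel = selectors.DefaultSelector()
a, b = socket.socketpair()
received = []
conn_a = BufferedConnection(sel, a, received.append)
conn_b = BufferedConnection(sel, b, lambda data: None)

conn_b.send(b"hello")
for _ in range(5):
    poll_once(sel, timeout=0.1)
    if received:
        break

conn_a.close()
conn_b.close()
sel.close()
```

Note that write interest is registered only while the output buffer is non-empty, which is the same point made earlier in the thread about not polling for both read and write events on every connection.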
The API would probably end up with functionality similar to Twisted's reactor and transport APIs (and possibly endpoints but I don't know how stable that API is). Ronald From list at qtrac.plus.com Fri May 25 09:53:47 2012 From: list at qtrac.plus.com (Mark Summerfield) Date: Fri, 25 May 2012 08:53:47 +0100 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) Message-ID: <20120525085347.31215c94@dino> Hi, Built-ins: In an effort to keep the core language as small as possible (to keep it "brain sized":-) would it be reasonable to deprecate filter() and map() and to move them to the standard library as happened with reduce()? After all, don't people mostly use list comprehensions and generator expressions for these nowadays? Docs: The Python Module Index http://docs.python.org/dev/py-modindex.html Shows _ | a | b | ... This is prettier than _ | A | B | ... but also harder to click because the letters are smaller; so I would prefer the use of capitals. -- Mark Summerfield, Qtrac Ltd, www.qtrac.eu C++, Python, Qt, PyQt - training and consultancy "Programming in Go" - ISBN 0321774639 http://www.qtrac.eu/gobook.html From ncoghlan at gmail.com Fri May 25 10:28:28 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 May 2012 18:28:28 +1000 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: <20120525085347.31215c94@dino> References: <20120525085347.31215c94@dino> Message-ID: On Fri, May 25, 2012 at 5:53 PM, Mark Summerfield wrote: > Hi, > > Built-ins: > > In an effort to keep the core language as small as possible (to keep it > "brain sized":-) would it be reasonable to deprecate filter() and map() > and to move them to the standard library as happened with reduce()?
> After all, don't people mostly use list comprehensions and generator > expressions for these nowadays? I'd personally agree with filter() moving, but "map(str, seq)" still beats "(str(x) for x in seq)" by a substantial margin for me when it comes to quickly and cleanly encapsulating a common idiom such that it is easier both to read *and* write. The basic problem is that the answer to your question is "no" - for preexisting functions, a lot of people still use filter() and map(), with the comprehension forms reigning supreme only when someone would have had to otherwise use a lambda expression. We won the argument for moving reduce() to functools because it's such a pain to use correctly that it clearly qualified as an attractive nuisance. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From simon.sapin at kozea.fr Fri May 25 10:25:29 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Fri, 25 May 2012 10:25:29 +0200 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: <20120525085347.31215c94@dino> References: <20120525085347.31215c94@dino> Message-ID: <4FBF41F9.90609@kozea.fr> Hi, On 25/05/2012 09:53, Mark Summerfield wrote: > Built-ins: > > In an effort to keep the core language as small as possible (to keep it > "brain sized":-) would it be reasonable to deprecate filter() and map() > and to move them to the standard library as happened with reduce()? > After all, don't people mostly use list comprehensions and generator > expressions for these nowadays? Aside from the pain of porting existing code, what would this achieve? How do filter() and map() bother you if you can just ignore them and not use them? The only upside I can imagine in having fewer builtins is that using variables with the same names is a kind-of bad practice. But it cannot cause a bug if you don't use the builtin at all. > Docs: > > The Python Module Index http://docs.python.org/dev/py-modindex.html > Shows _ | a | b | ...
> This is prettier than _ | A | B | ... > but also harder to click because the letters are smaller; so I would > prefer the use of capitals. Adding CSS padding on links can make the clickable area bigger, so the choice of using capitals or not can be independent of that. Regards, -- Simon Sapin From steve at pearwood.info Fri May 25 11:07:03 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 25 May 2012 19:07:03 +1000 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: <20120525085347.31215c94@dino> References: <20120525085347.31215c94@dino> Message-ID: <4FBF4BB7.1070308@pearwood.info> Mark Summerfield wrote: > Hi, > > Built-ins: > > In an effort to keep the core language as small as possible (to keep it > "brain sized":-) would it be reasonable to deprecate filter() and map() > and to move them to the standard library as happened with reduce()? > After all, don't people mostly use list comprehensions and generator > expressions for these nowadays? So you would put people through the pain of dealing with broken code and deprecation just so that people don't have to remember functions which you think they don't remember anyway? -1 Keeping the core language small is a benefit to core developers. It is not so much a benefit to users of the language -- if a programmer is only using the builtins, they are surely reinventing the wheel (and probably badly). To be an effective programmer, you surely are using functions and classes in the std lib as well as the builtins, which means you have to memorise both what the function is, *and* where it is. Shrinking the builtins while increasing the size of the std lib is not much of a human-memory optimization, and may very well be a pessimation. If you need a memory-jog, it is much easier to find builtins because they are always available to a quick call to dir(), while finding something in a module means searching the docs or the file system. 
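[Editorial illustration, not part of the original message.] The point about dir() making builtins easy to rediscover can be tried directly. This is a minimal sketch assuming Python 3, where reduce() has already moved to functools; the variable name public_builtins is made up for the example.

```python
import builtins
import functools

# Public names that are always available without an import.
public_builtins = [name for name in dir(builtins) if not name.startswith("_")]

# map() and filter() are still builtins in Python 3 ...
print("map" in public_builtins, "filter" in public_builtins)

# ... while reduce() has to be looked up in a module first.
print("reduce" in public_builtins, hasattr(functools, "reduce"))
```

Finding reduce() this way already requires knowing that functools is the module to inspect, which is the extra memory burden the message describes.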
-- Steven From steve at pearwood.info Fri May 25 11:09:29 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 25 May 2012 19:09:29 +1000 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: References: <20120525085347.31215c94@dino> Message-ID: <4FBF4C49.30007@pearwood.info> Nick Coghlan wrote: > I'd personally agree with filter() moving, but "map(str, seq)" still > beats "(str(x) for x in seq)" by a substantial margin for me when it > comes to quickly and cleanly encapsulating a common idiom such that it > is easier both to read *and* write. filter(None, seq) [obj for obj in seq if obj] I think the version with filter is *much* better than the second. -- Steven From ironfroggy at gmail.com Fri May 25 11:32:23 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Fri, 25 May 2012 05:32:23 -0400 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Wed, May 23, 2012 at 9:32 PM, Giampaolo Rodol? wrote: > Including an established async IO framework such as Twisted, gevent or > Tornado in the Python stdlib has always been a controversial subject. > PEP-3153 (http://www.python.org/dev/peps/pep-3153/) tried to face this > problem in the most agnostic way as possible, and it's a good starting > point IMO. > Nevertheless, it's still vague about what the actual API should look > like and AFAIK it remained stagnant so far. > > There's one thing in the whole async stack which is basically the same > for all implementations though: the poller/reactor. > Could it make sense to add something similar to select module? > Differently from PEP-3153, providing such a layer on top of select(), > poll() & co. is easier and could possibly be an incentive to avoid > such code duplication. 
> > I'm coming up with this because I recently did something similar in > pyftpdlib as a hack on top of asyncore to add support for epoll() and > kqueue(), using the excellent Tornado's io loop as source of > inspiration: > http://code.google.com/p/pyftpdlib/issues/detail?id=203 > http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py > > > The way I imagine it: > >>>> import select >>>> dir(select) > [..., 'EpollPoller', 'PollPoller', 'SelectPoller', 'KqueuePoller'] >>>> poller = select.EpollPoller() >>>> poller.register(fd, handler, poller.READ | poller.WRITE) >>>> poller.socket_map > {2 : } >>>> poller.modify(fd, poller.READ) >>>> poller.poll() # will call handler.handle_read_event() if/when it's the case > ^C > KeyboardInterrupt >>>> poller.remove(fd) >>>> poller.close() > > The handler is supposed to provide 3 methods: > - handle_read_event > - handle_write_event > - handle_error_event > > Users willing to support multiple event loops such as wx, gtk etc can do: > >>>> while 1: > ... poller.poll(timeout=0.1, blocking=False) > ... otherpoller.poll() > > > Basically, this would be the whole API. > > Thoughts? > Frankly, I don't think this deserves a PEP at all, or even to consider one *yet*. Building a new API and a new library from scratch seems a frail comparison to testing a library in the real world, it having real uses, and then being incorporated into the stdlib. The problem here, of course, is that all the real-world solutions (i.e., Twisted) include far more than the reactor. > > --- Giampaolo > http://code.google.com/p/pyftpdlib/ > http://code.google.com/p/psutil/ > http://code.google.com/p/pysendfile/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From solipsis at pitrou.net Fri May 25 11:52:06 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 25 May 2012 11:52:06 +0200 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) References: <20120525085347.31215c94@dino> <4FBF4C49.30007@pearwood.info> Message-ID: <20120525115206.1f098cbb@pitrou.net> On Fri, 25 May 2012 19:09:29 +1000 Steven D'Aprano wrote: > Nick Coghlan wrote: > > > I'd personally agree with filter() moving, but "map(str, seq)" still > > beats "(str(x) for x in seq)" by a substantial margin for me when it > > comes to quickly and cleanly encapsulating a common idiom such that it > > is easier both to read *and* write. > > filter(None, seq) > [obj for obj in seq if obj] > > I think the version with filter is *much* better than the second. Only if you remember what the special value None does when passed to filter. The cognitive burden is higher. That said, the idea of moving filter() and map() away won't fly before at least Python 4. Regards Antoine. From jeanpierreda at gmail.com Fri May 25 12:33:22 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 25 May 2012 06:33:22 -0400 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Fri, May 25, 2012 at 5:32 AM, Calvin Spealman wrote: > Frankly, I don't think this deserves a PEP at all, or even to consider > one *yet*. > > Building a new API and a new library from scratch seems a frail > comparison to testing > a library in the real world, it having real uses, and then being > incorporated into the > stdlib. The problem here, of course, is that all the real-world > solutions (i.e., Twisted) > include far more than the reactor.
To be fair, PEP-3153 was built based largely on experience from the Twisted project and input from Twisted developers, who know what they are talking about and how to build a useful system. The entire transport/protocol separation is lifted directly out of it. -- Devin From ironfroggy at gmail.com Fri May 25 13:54:13 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Fri, 25 May 2012 07:54:13 -0400 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Fri, May 25, 2012 at 6:33 AM, Devin Jeanpierre wrote: > On Fri, May 25, 2012 at 5:32 AM, Calvin Spealman wrote: >> Frankly, I don't think this deserves a PEP at all, or even to consider >> one *yet*. >> >> Building a new API and a new library from scratch seems a frail >> comparison to testing >> a library in the real world, it having real uses, and then being >> incorporated into the >> stdlib. The problem here, of course, is that all the real-world >> solutions (ie, Twisted) >> include far more than the reactor. > > To be fair, PEP-3153 was built based largely on experience from the > Twisted project and input from Twisted developers, who know what they > are talking about and how to build a useful system. The entire > transport/protocol separation is lifted directly out of it. > > -- Devin My comments were in response to this post, not PEP-3153 -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From ncoghlan at gmail.com Fri May 25 15:53:04 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 May 2012 23:53:04 +1000 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On May 25, 2012 7:33 PM, "Calvin Spealman" wrote: > On Wed, May 23, 2012 at 9:32 PM, Giampaolo Rodol? 
> wrote: > > Users willing to support multiple event loops such as wx, gtk etc can do: > > > >>>> while 1: > > ... poller.poll(timeout=0.1, blocking=False) > > ... otherpoller.poll() > > > > > > Basically, this would be the whole API. > > > > Thoughts? > > > > Frankly, I don't think this deserves a PEP at all, or even to consider > one *yet*. > > Building a new API and a new library from scratch seems a frail > comparison to testing > a library in the real world, it having real uses, and then being > incorporated into the > stdlib. The problem here, of course, is that all the real-world > solutions (ie, Twisted) > include far more than the reactor. > No, the specific call at the PyCon US 2011 language summit was for a PEP that proposed a *new* event loop for the standard library that: 1. Provides simple event loop functionality in the standard library, as an improved alternative to asyncore for small apps that don't require the full power of a framework like Twisted (think things like little IRC bots, TCP echo servers, or testing of async components) 2. Provides a clean migration path to a production grade reactor like Twisted's 3. Makes it easier for multiple event loop based frameworks (e.g. tkinter, wxPython, PySide, Twisted) to all cooperate within the same process What we're after is something for the stdlib that is to event loops/reactors as wsgiref is to production grade WSGI servers like mod_wsgi and nginx. asyncore isn't it, because the migration path isn't clean. PEP 3153 currently spends a lot of time talking about transports and protocols, but doesn't answer those 3 core questions: 1. How do I write a simple IRC bot or TCP echo server? 2. How do I migrate my simple app to a production grade reactor like Twisted's? 3. How do I run two different concurrent.eventloop compatible reactors in the same process? 
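[Editorial illustration, not part of the original message.] To make the first of the three questions concrete, here is a minimal sketch of a reactor-level poller driving a one-shot echo. The names MiniPoller, READ, WRITE and run_echo_once are hypothetical; they loosely follow the register/modify/unregister/poll shape floated earlier in this thread and are not an existing stdlib API. It is built on select.select() only, so it says nothing about epoll/kqueue backends or error handling.

```python
import select
import socket

READ, WRITE = 1, 2  # hypothetical event flags, not stdlib constants


class MiniPoller:
    """Toy select()-based poller (illustrative only)."""

    def __init__(self):
        self._handlers = {}  # fd -> (events, callback)

    def register(self, fd, events, callback):
        self._handlers[fd] = (events, callback)

    def modify(self, fd, events):
        _, callback = self._handlers[fd]
        self._handlers[fd] = (events, callback)

    def unregister(self, fd):
        del self._handlers[fd]

    def poll(self, timeout=1.0):
        """Wait up to `timeout` seconds and dispatch ready events once."""
        readers = [fd for fd, (ev, _) in self._handlers.items() if ev & READ]
        writers = [fd for fd, (ev, _) in self._handlers.items() if ev & WRITE]
        if not readers and not writers:
            return 0  # nothing registered; avoid calling select() with empty lists
        r, w, _ = select.select(readers, writers, [], timeout)
        ready = {}
        for fd in r:
            ready[fd] = ready.get(fd, 0) | READ
        for fd in w:
            ready[fd] = ready.get(fd, 0) | WRITE
        for fd, events in ready.items():
            handler = self._handlers.get(fd)
            if handler is not None:  # may have been unregistered by a callback
                handler[1](events)
        return len(ready)


def run_echo_once(payload=b"ping"):
    """Echo one message across a connected socket pair via the poller."""
    server, client = socket.socketpair()
    client.settimeout(1.0)
    poller = MiniPoller()

    def on_server_event(events):
        if events & READ:
            server.sendall(server.recv(4096))  # echo the data back

    poller.register(server.fileno(), READ, on_server_event)
    try:
        client.sendall(payload)
        poller.poll(timeout=1.0)
        return client.recv(4096)
    finally:
        poller.unregister(server.fileno())
        server.close()
        client.close()
```

Calling run_echo_once(b"hello") sends the bytes from one end of the pair, lets a single poll() dispatch the read callback, and returns what came back. Questions 2 and 3 (migration to a production reactor, and two reactors sharing a process) are exactly what a sketch this small does not address.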
As far as I can tell, PEP 3153 wants to handle all that by merging the I/O stacks of all the frameworks first, which strikes me as being *way* too ambitious for a first step. If we can't even figure out a common abstraction for the reactor level (à la WSGI), how are we ever going to agree on a standard async I/O abstraction? Cheers, Nick. From ncoghlan at gmail.com Fri May 25 15:54:51 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 May 2012 23:54:51 +1000 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Fri, May 25, 2012 at 11:53 PM, Nick Coghlan wrote: > as wsgiref is to production grade WSGI servers like mod_wsgi and nginx. s/nginx/gunicorn/ Confusing-my-software-stack-levels'ly, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mwm at mired.org Fri May 25 18:29:12 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 25 May 2012 12:29:12 -0400 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: References: <20120525085347.31215c94@dino> Message-ID: <20120525122912.46096701@bhuda.mired.org> On Fri, 25 May 2012 18:28:28 +1000 Nick Coghlan wrote: > On Fri, May 25, 2012 at 5:53 PM, Mark Summerfield wrote: > > In an effort to keep the core language as small as possible (to keep it > > "brain sized":-) would it be reasonable to deprecate filter() and map() > > and to move them to the standard library as happened with reduce()? > > After all, don't people mostly use list comprehensions and generator > > expressions for these nowadays? Wasn't this change discussed for that very reason as part of the move to 3.x? Which makes me wonder why reduce moved but not map and filter, when map and filter have obvious rewrites as list comprehensions, but reduce doesn't? Seems backwards to me.
> The basic problem is that the answer to your question is "no" - for > preexisting functions, a lot of people still use filter() and map(), > with the comprehension forms reigning supreme only when someone would > have had to otherwise use a lambda expression. Personally, I tend to favor list comprehensions most of the time (and I was a pretty heavy user of map and filter in the day), because it's just one less idiom to deal with. The exception is when they'd nest - I use [map(f, l) for l in list-of-lists] rather than nesting the comprehensions, because I then don't have to worry about untangling the nest. But I do agree that since they survived into 3.x, they need to stay put until 4.x. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From guido at python.org Fri May 25 18:47:00 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 25 May 2012 09:47:00 -0700 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: <20120525122912.46096701@bhuda.mired.org> References: <20120525085347.31215c94@dino> <20120525122912.46096701@bhuda.mired.org> Message-ID: On Fri, May 25, 2012 at 9:29 AM, Mike Meyer wrote: > On Fri, 25 May 2012 18:28:28 +1000 > Nick Coghlan wrote: >> On Fri, May 25, 2012 at 5:53 PM, Mark Summerfield wrote: >> > In an effort to keep the core language as small as possible (to keep it >> > "brain sized":-) would it be reasonable to deprecate filter() and map() >> > and to move them to the standard library as happened with reduce()? >> > After all, don't people mostly use list comprehensions and generator >> > expressions for these nowadays? > > Wasn't this changed discussed for that very reason as part of the move > to 3.x? > > Which makes me wonder why reduce moved but not map and filter, when > map and filter have obvious rewrites as list comprehensions, but > reduce doesn't? Seems backwards to me. 
How quickly we forget. The point wasn't sparsity of constructs. The point was readability. Code written using map() or filter(), is usually quite readable -- excesses are possible, but not more so than using list comprehensions. However code that uses reduce() has a high likelihood of being unreadable, and is almost always rewritten more easily using a traditional for loop and some variables that are updated in the loop. >> The basic problem is that the answer to your question is "no" - for >> preexisting functions, a lot of people still use filter() and map(), >> with the comprehension forms reigning supreme only when someone would >> have had to otherwise use a lambda expression. > > Personally, I tend to favor list comprehensions most of the time (and > I was a pretty heavy user of map and filter in the day), because it's > just one less idiom to deal with. The exception is when they'd nest - > I use [map(f, l) for l in list-of-lists] rather than nesting the > comprehensions, because I then don't have to worry about untangling > the nest. There are interesting considerations of readability either way. If you have to write a lambda to use map() or filter(), it is *always* better to use a list comprehension, because of the overhead in creating the stack frame for the lambda. But if you are mapping or filtering using an already-existing function, map()/filter() is more concise and I usually find it more readable, because you don't have to invent a loop control variable. My claim is that for the human reader (who is familiar with map/filter), it is less work for the brain to understand map(f, xs) than [f(x) for x in xs] -- there are more words to parse in the latter, and you have to check that it is the same 'x' in both places. The advantage of map/filter increases when f is a built-in function, since the loop implied by map/filter executes more quickly than the explicit loop (implemented using standard looping byte codes) used by list comprehensions. 
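[Editorial illustration, not part of the original message.] The trade-offs described above can be checked side by side. This is a rough sketch assuming Python 3, where map() returns an iterator and so is wrapped in list(); the helper name measure and the sample data are made up for the example, and no timings are claimed here since they vary by interpreter and machine.

```python
import timeit

xs = list(range(1000))

# Mapping an already-existing function: both spellings give the same list.
via_map = list(map(str, xs))
via_comprehension = [str(x) for x in xs]

# When a lambda would be needed, the comprehension inlines the expression
# instead of paying for one extra function call per element.
via_lambda = list(map(lambda x: x * 2, xs))
via_inline = [x * 2 for x in xs]


def measure(stmt, number=200):
    """Time one statement against the sample list (seconds for `number` runs)."""
    return timeit.timeit(stmt, globals={"xs": xs}, number=number)


if __name__ == "__main__":
    for stmt in (
        "list(map(str, xs))",
        "[str(x) for x in xs]",
        "list(map(lambda x: x * 2, xs))",
        "[x * 2 for x in xs]",
    ):
        print(f"{stmt:35s} {measure(stmt):.4f}s")
```

Running the script prints one timing per spelling, which is enough to see whether the relative ordering described in the message holds on a given interpreter.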
(I hesitate to emphasize the performance too much, since some hypothetical future Python implementation could make the performance the same in all cases. But with today's CPython, Jython and IronPython, it is important to know about relative performance of different constructs; and even PyPy doesn't alter the equation too much here. Still, the readability arguments align pretty much with the performance arguments, so they just strengthen each other.) > But I do agree that since they survived into 3.x, they need to stay > put until 4.x. And beyond. -- --Guido van Rossum (python.org/~guido) From mwm at mired.org Fri May 25 23:37:29 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 25 May 2012 17:37:29 -0400 Subject: [Python-ideas] pmap, preduce, pmapreduce? Message-ID: <20120525173729.5a42a558@bhuda.mired.org> Another crazy idea that may not be possible, based on my finally getting around to watching Guy Steele's talks about what he's up to these days (http://vimeo.com/6624203). Consider a function that takes a list (or a container class which len doesn't consume) and a function, and then applies that function to the list in some way (either element-wise, or in pairs of elements/results), but does it in parallel. It will hold the GIL, but run the function calls in distinct threads, meaning two applications of the function could interfere with each other. Is it possible to place limitations on the function such that this kind of controlled concurrent operation is safe? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From pyideas at rebertia.com Sat May 26 00:52:54 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Fri, 25 May 2012 15:52:54 -0700 Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion) In-Reply-To: <4FBF4C49.30007@pearwood.info> References: <20120525085347.31215c94@dino> <4FBF4C49.30007@pearwood.info> Message-ID: On Fri, May 25, 2012 at 2:09 AM, Steven D'Aprano wrote: > Nick Coghlan wrote: >> I'd personally agree with filter() moving, but "map(str, seq)" still >> beats "(str(x) for x in seq)" by a substantial margin for me when it >> comes to quickly and cleanly encapsulating a common idiom such that it >> is easier both to read *and* write. > > filter(None, seq) > [obj for obj in seq if obj] > > I think the version with filter is *much* better than the second. And I think filter(bool, seq) beats the first. Exact same length, more explicit, one less key to press (Shift). The consistency of using comprehensions all the time has a certain attraction though. Cheers, Chris From jeanpierreda at gmail.com Sat May 26 00:54:36 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 25 May 2012 18:54:36 -0400 Subject: [Python-ideas] from foo import bar.baz Message-ID: Has it irritated anyone else that this syntax is invalid? I've wanted it a couple of times, to be equivalent to: import foo.bar.baz from foo import bar del foo # but only if we didn't import foo already before The idea being that one wants access to foo.bar.baz under the name bar.baz, for readability purposes or what have you. I played around with adding this, but I seem to have really bad luck with extending CPython...
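[Editorial illustration, not part of the original message.] The three-statement equivalent spelled out above can be run today with a real nested package. This sketch uses the stdlib package xml.etree.ElementTree as a stand-in for the hypothetical foo.bar.baz; the sample XML string is made up.

```python
# Stand-in for the proposed "from xml import etree.ElementTree".
import xml.etree.ElementTree   # loads the submodule and binds the name "xml"
from xml import etree          # binds "etree" to the xml.etree package
del xml                        # drop the top-level name again

# The submodule is now reachable as etree.ElementTree, i.e. "bar.baz".
root = etree.ElementTree.fromstring("<a><b/></a>")
print(root.tag)
```

The del step is only appropriate when the top-level name was not already wanted in the namespace, which is the caveat the message attaches to it.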
-- Devin From grosser.meister.morti at gmx.net Sat May 26 01:53:26 2012 From: grosser.meister.morti at gmx.net (Mathias Panzenböck) Date: Sat, 26 May 2012 01:53:26 +0200 Subject: [Python-ideas] from foo import bar.baz In-Reply-To: References: Message-ID: <4FC01B76.4030104@gmx.net> +1 Indeed, I would have expected that "from foo import bar.baz" would work. On 05/26/2012 12:54 AM, Devin Jeanpierre wrote: > Has it irritated anyone else that this syntax is invalid? I've wanted > it a couple of times, to be equivalent to: > > import foo.bar.baz > from foo import bar > del foo # but only if we didn't import foo already before > > The idea being that one wants access to foo.bar.baz under the name > bar.baz, for readability purposes or what have you. > > I played around with adding this, but I seem to have really bad luck > with extending CPython... > > -- Devin From jsbueno at python.org.br Sat May 26 06:17:29 2012 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Sat, 26 May 2012 01:17:29 -0300 Subject: [Python-ideas] pmap, preduce, pmapreduce? In-Reply-To: <20120525173729.5a42a558@bhuda.mired.org> References: <20120525173729.5a42a558@bhuda.mired.org> Message-ID: On 25 May 2012 18:37, Mike Meyer wrote: > Another crazy idea that may not be possible, based on my finally > getting around to watching Guy Steele's talks about what he's up to > these days (http://vimeo.com/6624203). > > Given a function that takes a list (or a container class which len > doesn't consume) and a function, and then applies that function to the > list in some way: either element wise, or in pairs of elements/results, > but does it in parallel. It will hold the GIL, but run the function > calls in distinct threads, meaning two applications of the function > could interfere with each other. Just like the already existing "map" method in concurrent.futures.Executor ? * js -><- * all praise the Python time machine > -- > Mike Meyer
http://www.mired.org/ From jeanpierreda at gmail.com Sat May 26 14:17:35 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sat, 26 May 2012 08:17:35 -0400 Subject: [Python-ideas] pmap, preduce, pmapreduce? In-Reply-To: <20120525173729.5a42a558@bhuda.mired.org> References: <20120525173729.5a42a558@bhuda.mired.org> Message-ID: On Fri, May 25, 2012 at 5:37 PM, Mike Meyer wrote: > Is it possible to place limitations on the function such that this > kind of controlled concurrent operation is safe? I'm not sure what you mean. Tentative answer: restrict it to pure functions. -- Devin From masklinn at masklinn.net Sat May 26 17:34:52 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 26 May 2012 17:34:52 +0200 Subject: [Python-ideas] pmap, preduce, pmapreduce? In-Reply-To: <20120525173729.5a42a558@bhuda.mired.org> References: <20120525173729.5a42a558@bhuda.mired.org> Message-ID: On 2012-05-25, at 23:37 , Mike Meyer wrote: > > Is it possible to place limitations on the function such that this > kind of controlled concurrent operation is safe? This would mean ideally only having pure functions, and at the very least having functions which can't share state (not easily anyway). Python, as a language, has no such provision that I know of beyond "be careful" and "you're on your own". A possible option, though, would be to use `multiprocessing` rather than threads: multiprocessing.pool already provides a `map` operation, and processes can't share state by default (doing so is quite an explicit, and some would say involved, operation). Going through multiprocessing puts other limitations/complexities on the function implementations, but at the very least it wouldn't be possible to *unknowingly* share state.
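[Editorial illustration, not part of the original message.] The "pmap"/"preduce" idea, restricted to pure functions as suggested in the replies, can be sketched with the Executor.map method mentioned earlier in this thread. The helper names parallel_map and parallel_reduce are hypothetical. Threads are used here for simplicity, so nothing is enforced: the safety comes only from the functions being pure, and the reduction additionally assumes the combining function is associative.

```python
import operator
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, max_workers=4):
    """Apply a pure function element-wise; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


def parallel_reduce(func, items, max_workers=4):
    """Combine adjacent pairs in parallel rounds (func must be associative)."""
    values = list(items)
    if not values:
        raise ValueError("parallel_reduce() of empty sequence")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(values) > 1:
            pairs = list(zip(values[0::2], values[1::2]))
            combined = list(pool.map(lambda pair: func(*pair), pairs))
            if len(values) % 2:
                combined.append(values[-1])  # odd element carries over unchanged
            values = combined
    return values[0]


def square(n):
    return n * n


if __name__ == "__main__":
    print(parallel_map(square, range(5)))
    print(parallel_reduce(operator.add, range(1, 6)))
```

Swapping ThreadPoolExecutor for a process-based pool would give the "can't unknowingly share state" property described above, at the cost of requiring picklable functions and arguments.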
From ericsnowcurrently at gmail.com Sat May 26 20:53:11 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 26 May 2012 12:53:11 -0600 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Thu, May 24, 2012 at 1:34 PM, Eric Snow wrote: > On Thu, May 24, 2012 at 12:14 PM, Guido van Rossum wrote: >> On Thu, May 24, 2012 at 10:17 AM, Eric Snow wrote: >>> Effectively this is just a simple but distinct facade around dict to >>> give a namespace with attribute access. I suppose part of the >>> question is how much of the Mapping interface would belong instead to >>> a hypothetical Namespace interface. (I'm definitely _not_ proposing >>> such an unnecessary extra level of abstraction). >> >> Possibly there is a (weird?) parallel with namedtuple. The end result >> is somewhat similar: you get to use attribute names instead of the >> accessor syntax (x[y]) of the underlying type. But the "feel" of the >> type is different, and inherits more of the underlying type >> (namedtuple is immutable and has a fixed set of keys, whereas the type >> proposed here is mutable and allows arbitrary keys as long as they >> look like Python names). > > Yeah, the feel is definitely different. I've been thinking about this > because of the code for sys.implementation. Using a structseq would > probably have been the simplest approach there, but a named tuple doesn't > feel right. In contrast, a SimpleNamespace would fit much better. > > As far as this goes generally, the pattern of a simple, dynamic > attribute-based namespace has been implemented a zillion times (and > it's easy to do). This is because people find a simple dynamic > namespace really handy and they want the attribute-access interface > rather than a mapping. > > In contrast, a namedtuple is, as Nick said, "the standard library's > answer for structured records". It's an immutable (attribute-based) > namespace implementing the Sequence interface.
It's a tuple and > directly reflects the underlying concept of tuples in Python by giving > the values names. > > SimpleNamespace (and the like) isn't a structured record. Its only > job is to be an attribute-based namespace with as simple an interface > as possible. > > So why isn't a type like SimpleNamespace in the stdlib? Because it's > trivial to implement. There's a certain trivial-ness threshold a > function/type must pass before it gets canonized, and rightly so. > > Anyway, while many would use something like SimpleNamespace out of > the standard library, my impetus was having it as a builtin type so I > could use it for sys.implementation. :) > > FWIW, I have an implementation (pure Python + C extension) of > SimpleNamespace on PyPI: > > http://pypi.python.org/pypi/simple_namespace > > -eric Any further thoughts on this? Unless anyone is strongly opposed, I'd like to push this forward. -eric From ironfroggy at gmail.com Sat May 26 23:02:31 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sat, 26 May 2012 17:02:31 -0400 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Sat, May 26, 2012 at 2:53 PM, Eric Snow wrote: > On Thu, May 24, 2012 at 1:34 PM, Eric Snow wrote: >> On Thu, May 24, 2012 at 12:14 PM, Guido van Rossum wrote: >>> On Thu, May 24, 2012 at 10:17 AM, Eric Snow wrote: >>>> Effectively this is just a simple but distinct facade around dict to >>>> give a namespace with attribute access. I suppose part of the >>>> question is how much of the Mapping interface would belong instead to >>>> a hypothetical Namespace interface. (I'm definitely _not_ proposing >>>> such an unnecessary extra level of abstraction). >>> >>> Possibly there is a (weird?) parallel with namedtuple. The end result >>> is somewhat similar: you get to use attribute names instead of the >>> accessor syntax (x[y]) of the underlying type.
But the "feel" of the >>> type is different, and inherits more of the underlying type >>> (namedtuple is immutable and has a fixed set of keys, whereas the type >>> proposed here is mutable and allows arbitrary keys as long as they >>> look like Python names). >> >> Yeah, the feel is definitely different. I've been thinking about this >> because of the code for sys.implementation. Using a structseq would >> probably have been the simplest approach there, but a named tuple doesn't >> feel right. In contrast, a SimpleNamespace would fit much better. >> >> As far as this goes generally, the pattern of a simple, dynamic >> attribute-based namespace has been implemented a zillion times (and >> it's easy to do). This is because people find a simple dynamic >> namespace really handy and they want the attribute-access interface >> rather than a mapping. >> >> In contrast, a namedtuple is, as Nick said, "the standard library's >> answer for structured records". It's an immutable (attribute-based) >> namespace implementing the Sequence interface. It's a tuple and >> directly reflects the underlying concept of tuples in Python by giving >> the values names. >> >> SimpleNamespace (and the like) isn't a structured record. Its only >> job is to be an attribute-based namespace with as simple an interface >> as possible. >> >> So why isn't a type like SimpleNamespace in the stdlib? Because it's >> trivial to implement. There's a certain trivial-ness threshold a >> function/type must pass before it gets canonized, and rightly so. >> >> Anyway, while many would use something like SimpleNamespace out of >> the standard library, my impetus was having it as a builtin type so I >> could use it for sys.implementation. :) >> >> FWIW, I have an implementation (pure Python + C extension) of >> SimpleNamespace on PyPI: >> >> http://pypi.python.org/pypi/simple_namespace >> >> -eric > > Any further thoughts on this? Unless anyone is strongly opposed, I'd > like to push this forward.
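[Editorial illustration, not part of the original message.] For readers who have not written one of the "zillion" implementations mentioned in the quoted text, this is a minimal pure-Python sketch of the pattern. It is an assumption about what such a type typically looks like, not the code of the simple_namespace package referenced above; the field values in the usage lines are made up.

```python
class SimpleNamespace:
    """Mutable attribute-based namespace: a thin facade over an instance dict."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        items = ", ".join(
            f"{key}={value!r}" for key, value in sorted(self.__dict__.items())
        )
        return f"{type(self).__name__}({items})"

    def __eq__(self, other):
        if isinstance(other, SimpleNamespace):
            return self.__dict__ == other.__dict__
        return NotImplemented


# Unlike a namedtuple, attributes can be added or rebound after creation.
ns = SimpleNamespace(name="cpython", version=3)
ns.cache_tag = "x"
print(ns)
```

The contrast with namedtuple drawn in the quoted text shows up in the last lines: there is no fixed field list and no sequence behaviour, only attribute access.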
There is no good name for such a type. "Namespace" is a bad name, because the term "namespace" is already a general term that describes a lot of things in Python (and outside it) and shouldn't share a name with a specific thing, this type. That this specific type would also be within the more general namespace-concept only makes that worse. So, what do you call it? Also, is this here because you don't like typing the square brackets and quotes? If so, does it only save you three characters and is that worth the increase to the language size? A final complaint against: would the existence of this fragment python-learners education to the point that they would defer learning and practicing to use classes properly? Sorry to complain, but someone needs to in python-ideas! ;-) Calvin > -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From ironfroggy at gmail.com Sat May 26 23:05:57 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sat, 26 May 2012 17:05:57 -0400 Subject: [Python-ideas] Add a generic async IO poller/reactor to select module In-Reply-To: References: Message-ID: On Fri, May 25, 2012 at 9:53 AM, Nick Coghlan wrote: > On May 25, 2012 7:33 PM, "Calvin Spealman" wrote: >> >> On Wed, May 23, 2012 at 9:32 PM, Giampaolo Rodol? >> wrote: >> > Users willing to support multiple event loops such as wx, gtk etc can >> > do: >> > >> >>>> while 1: >> > ... ? ? ? poller.poll(timeout=0.1, blocking=False) >> > ... ? ? ? otherpoller.poll() >> > >> > >> > Basically, this would be the whole API. >> > >> > Thoughts? >> > >> >> Frankly, I don't think this deserves a PEP at all, or even to consider >> one *yet*. 
>> >> Building a new API and a new library from scratch seems a frail >> comparison to testing >> a library in the real world, it having real uses, and then being >> incorporated into the >> stdlib. The problem here, of course, is that all the real-world >> solutions (ie, Twisted) >> include far more than the reactor. > > > No, the specific call at the PyCon US 2011 language summit was for a PEP > that proposed a *new* event loop for the standard library that: > 1. Provides simple event loop functionality in the standard library, as an > improved alternative to asyncore for small apps that don't require the full > power of a framework like Twisted (think things like little IRC bots, TCP > echo servers, or testing of async components) > 2. Provides a clean migration path to a production grade reactor like > Twisted's > 3. Makes it easier for multiple event loop based frameworks (e.g. tkinter, > wxPython, PySide, Twisted) to all cooperate within the same process > > What we're after is something for the stdlib that is to event loops/reactors > as wsgiref is to production grade WSGI servers like mod_wsgi and nginx. > asyncore isn't it, because the migration path isn't clean. > > PEP 3153 currently spends a lot of time talking about transports and > protocols, but doesn't answer those 3 core questions: > > 1. How do I write a simple IRC bot or TCP echo server? > 2. How do I migrate my simple app to a production grade reactor like > Twisted's? > 3. How do I run two different concurrent.eventloop compatible reactors in > the same process? > > As far as I can tell, PEP 3153 wants to handle all that by merging the I/O > stacks of all the frameworks first, which strikes me as being *way* too > ambitious for a first step. If we can't even figure out a common abstraction > for the reactor level (ala WSGI), how are we ever going to agree on a > standard async I/O abstraction? 
Obviously, for a man with many opinions I miss out on too many
conversations and too many potential actions. I should take steps to
correct this in the future.

Thanks for clearing this up.

> Cheers,
> Nick.
>

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From mwm at mired.org Sun May 27 00:04:18 2012
From: mwm at mired.org (Mike Meyer)
Date: Sat, 26 May 2012 18:04:18 -0400
Subject: [Python-ideas] pmap, preduce, pmapreduce?
In-Reply-To:
References: <20120525173729.5a42a558@bhuda.mired.org>
Message-ID: <20120526180418.56871823@bhuda.mired.org>

On Sat, 26 May 2012 17:34:52 +0200 Masklinn wrote:
> On 2012-05-25, at 23:37 , Mike Meyer wrote:
> > Is it possible to place limitations on the function such that this
> > kind of controlled concurrent operation is safe?
> This would mean ideally only having pure functions, and at the very
> least having functions which can't share state (not easily anyway).

I'm not sure pure functions are good enough for CPython. If the
function involves looking through a tree of state (shared via the
arguments, even), then the changing reference counts as the code goes
through the tree will hose you, unless the function evaluations are
serialized via the GIL.

> Python, as a language, has no such provision that I know of beyond "be
> careful" and "you're on your own".

Generally true for your code, but I think it tries to keep the
interpreter from tripping over its own feet (via the GIL, etc.).

> A possible option, though, would be to use `multiprocessing` rather than
> threads: multiprocessing.pool already provides a `map` operation, and
> processes can't share state by default (doing so is quite an explicit
> -- and some would say involved -- operation).
Going through > multiprocessing puts other limitations/complexities on the function > implementations, but at the very least it wouldn't be possible to > *unknowingly* share state. I'm familiar with that option, but was hoping to avoid it. Though adding reduce (and maybe a mapreduce?) method to something like concurrent.futures might be nice. Thanks http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ericsnowcurrently at gmail.com Sun May 27 01:33:49 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 26 May 2012 17:33:49 -0600 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Sat, May 26, 2012 at 3:02 PM, Calvin Spealman wrote: > On Sat, May 26, 2012 at 2:53 PM, Eric Snow wrote: >> Any further thoughts on this? ?Unless anyone is strongly opposed, I'd >> like to push this forward. > > There is no good name for such a type. "Namespace" is a bad name, because > the term "namespace" is already a general term that describes a lot of things in > Python (and outside it) and shouldn't share a name with a specific > thing, this type. > That this specific type would also be within the more general namespace-concept > only makes that worse. > > So, what do you call it? Yeah, I've seen it called at least 10 different things. I'm certainly open to whatever works best. I've called it "namespace" because it is one of the two kinds of namespace in Python: mapping ([]-access) and object (dotted-access). The builtin dict fills the one role and the builtin object type almost fills the other. I guess "dotted_namespace" or "attribute_namespace" would work if "namespace" is too confusing. > Also, is this here because you don't like typing the square brackets and quotes? If > so, does it only save you three characters and is that worth the increase to the > language size? 
This is definitely the stick against which to measure!

It boils down to this: for me dotted-access communicates a different,
more stable sort of namespace than does []-access (a la dicts).
Certainly it is less typing, but that isn't really a draw for me.
Dotted access is a little easier to read, which is nice but not the
big deal for me. No, the big deal is the conceptual difference
inherent to access via string vs. access via identifier.

Though Python does not currently have a basic, dynamic,
attribute-based namespace type, it's trivial to make one: "class
Namespace: pass" or "type('Namespace', (), {})". While this has been
done countless times, it's so simple that no one has felt like it
belonged in the language. And I think that's fine, though it wouldn't
hurt to have something a little more than that (see my original
message).

So if it's so easy, why bother adding it? Well, "class Namespace:
pass" is not so simple to do using the C API. That's about it. (I
*do* think people would be glad to have a basic attribute-based
namespace type in the language.)

> A final complaint against: would the existence of this fragment
> python-learners education
> to the point that they would defer learning and practicing to use
> classes properly?

This is an excellent point. I suppose it depends on who was teaching,
and how a new simple "namespace" type was exposed and documented. It
certainly is not a replacement for classes, which have much more
machinery surrounding state/methods/class-ness. If it made it harder
to learn Python then it would definitely have to bring *a lot* to the
table.

> Sorry to complain, but someone needs to in python-ideas! ;-)

Hey, I was more worried about the crickets I was hearing.
:) -eric From ironfroggy at gmail.com Sun May 27 15:42:26 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sun, 27 May 2012 09:42:26 -0400 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: On Sat, May 26, 2012 at 7:33 PM, Eric Snow wrote: > On Sat, May 26, 2012 at 3:02 PM, Calvin Spealman wrote: >> On Sat, May 26, 2012 at 2:53 PM, Eric Snow wrote: >>> Any further thoughts on this? ?Unless anyone is strongly opposed, I'd >>> like to push this forward. >> >> There is no good name for such a type. "Namespace" is a bad name, because >> the term "namespace" is already a general term that describes a lot of things in >> Python (and outside it) and shouldn't share a name with a specific >> thing, this type. >> That this specific type would also be within the more general namespace-concept >> only makes that worse. >> >> So, what do you call it? > > Yeah, I've seen it called at least 10 different things. ?I'm certainly > open to whatever works best. ?I've called it "namespace" because it is > one of the two kinds of namespace in Python: mapping ([]-access) and > object (dotted-access). ?The builtin dict fills the one role and the > builtin object type almost fills the other. ?I guess > "dotted_namespace" or "attribute_namespace" would work if "namespace" > is too confusing. > >> Also, is this here because you don't like typing the square brackets and quotes? If >> so, does it only save you three characters and is that worth the increase to the >> language size? > > This is definitely the stick against which to measure! > > It boils down to this: for me dotted-access communicates a different, > more stable sort of namespace than does []-access (a la dicts). This is probably the best case I've heard for such a type. Intent expression is important! > Certainly it is less typing, but that isn't really a draw for me. > Dotted access is a little easier to read, which is nice but not the > big deal for me. 
?No, the big deal is the conceptual difference > inherent to access via string vs. access via identifier. > > Though Python does not currently have a basic, dynamic, > attribute-based namespace type, it's trivial to make one: "class > Namespace: pass" or "type('Namespace', (), {})". ?While this has been > done countless times, it's so simple that no one has felt like it > belonged in the language. ?And I think that's fine, though it wouldn't > hurt to have something a little more than that (see my original > message). > > So if it's so easy, why bother adding it? ?Well, "class Namespace: > pass" is not so simple to do using the C API. ?That's about it. ?(I > *do* think people would be glad to have a basic attribute-based > namespace type in the langauge. > >> A final complaint against: would the existence of this fragment >> python-learners education >> to the point that they would defer learning and practicing to use >> classes properly? > > This is an excellent point. ?I suppose it depends on who was teaching, > and how a new simple "namespace" type were exposed and documented. ?It > certainly is not a replacement for classes, which have much more > machinery surrounding state/methods/class-ness. ?If it made it harder > to learn Python then it would definitely have to bring *a lot* to the > table. > >> Sorry to complain, but someone needs to in python-ideas! ;-) > > Hey, I was more worried about the crickets I was hearing. ?:) > > -eric The best names I was able to get crowdsourced from #python this morning are: - record - flexobject - attrobject - attrdict - nameddict - namedobject and the absolute worst name: - Object -- Read my blog! I depend on your acceptance of my opinion! I am interesting! 
http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From sven at marnach.net Sun May 27 18:08:26 2012 From: sven at marnach.net (Sven Marnach) Date: Sun, 27 May 2012 17:08:26 +0100 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: Message-ID: <20120527160826.GT14830@bagheera> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400: > - record > - flexobject > - attrobject > - attrdict > - nameddict > - namedobject Since the proposed type is basically an `object` allowing attributes, another option would be `attrobject`. Adding an `__iter__()` method, as proposed earlier in this thread, seems unnecessary; you can simply iterate over `vars(x)` for an `attrobject` instance `x`. Cheers, Sven From bauertomer at gmail.com Sun May 27 19:58:45 2012 From: bauertomer at gmail.com (T.B.) Date: Sun, 27 May 2012 20:58:45 +0300 Subject: [Python-ideas] a simple namespace type In-Reply-To: <20120527160826.GT14830@bagheera> References: <20120527160826.GT14830@bagheera> Message-ID: <4FC26B55.7080000@gmail.com> On 2012-05-27 19:08, Sven Marnach wrote: > Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400: >> - record >> - flexobject >> - attrobject >> - attrdict >> - nameddict >> - namedobject > > Since the proposed type is basically an `object` allowing attributes, > another option would be `attrobject`. > > Adding an `__iter__()` method, as proposed earlier in this thread, > seems unnecessary; you can simply iterate over `vars(x)` for an > `attrobject` instance `x`. > Is this whole class really necessary? As said before, this type is implemented numerous times: * empty class (included in the Python Tutorial) [1] * argparse.Namespace [2] * multiprocessing.managers.Namespace [3] * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4] * many more... Each of them has a different semantics. 
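As a rough illustration of how the semantics differ between two of the
variants listed above, the sketch below compares the tutorial-style
empty class with argparse.Namespace. The class name Empty is made up
for this example; argparse.Namespace is the real stdlib class. This is
only a small comparison, not a survey of every implementation.

```python
import argparse


class Empty:
    """The tutorial-style empty class: attributes only, no extras."""


a, b = Empty(), Empty()
a.x = b.x = 1
# No __eq__ is defined, so comparison falls back to identity.
print(a == b)        # False

# argparse.Namespace accepts keyword arguments, compares by attribute
# contents, and supports "in" for attribute names.
p = argparse.Namespace(x=1)
q = argparse.Namespace(x=1)
print(p == q)        # True
print("x" in p)      # True
print(vars(p))       # {'x': 1}
```

Both objects accept arbitrary new attributes after creation; the
difference is only in what extra behaviour comes along for free.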
Each is suited for a slightly different use case and they are so easy
to implement. So you can customize to your liking - fields can or
can't begin with "_", the later __repr__ comment or the color of the
shed. Still, it seems they do not have a "killer feature" like
namedtuple's efficiency.

Noticeable is how much they resemble a dict. Some let you iterate over
the keys, test for equality and even offer all of the builtin dict
methods (bunch). If you already use vars() for iteration, you might
want a dict.

Funny that except for the easy "class Namespace: pass", the rest fail
repr for recursive/self-referential objects:

>>> from argparse/multiprocessing.managers/simplenamespace import Namespace
>>> ns = Namespace()
>>> ns.a = ns
>>> repr(ns)
...
RuntimeError: maximum recursion depth exceeded

The next snippet uses the fact that dict's __repr__ knows how to handle
recursion to solve the RuntimeError problem:

def __repr__(self):
    return "{}({!r})".format(self.__class__.__name__, self.__dict__)

TB

[1] http://docs.python.org/dev/tutorial/classes.html#odds-and-ends
[2] http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/argparse.py#l1177
[3] http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/multiprocessing/managers.py#l913
[4] http://pypi.python.org/pypi/bunch

From ironfroggy at gmail.com Sun May 27 21:31:53 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 27 May 2012 15:31:53 -0400
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <4FC26B55.7080000@gmail.com>
References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
Message-ID:

On Sun, May 27, 2012 at 1:58 PM, T.B. wrote:
> On 2012-05-27 19:08, Sven Marnach wrote:
>> Calvin Spealman wrote on Sun, 27 May 2012, at 09:42:26 -0400:
>>> - record
>>> - flexobject
>>> - attrobject
>>> - attrdict
>>> - nameddict
>>> - namedobject
>>
>> Since the proposed type is basically an `object` allowing attributes,
>> another option would be `attrobject`.
>> >> Adding an `__iter__()` method, as proposed earlier in this thread, >> seems unnecessary; you can simply iterate over `vars(x)` for an >> `attrobject` instance `x`. >> > > Is this whole class really necessary? As said before, this type is > implemented numerous times: > * empty class (included in the Python Tutorial) [1] > * argparse.Namespace [2] > * multiprocessing.managers.Namespace [3] > * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4] > * many more... All of the re-implementations of essentially the same thing is exactly why a standard version is constantly suggested. That said, it is so simple that it easily has many variants, because it is only the base of the different ideas all these things implement. > Each of them has a different semantics. Each is suited for a slightly > different use case and they are so easy to implement. So you can customize > to your liking - fields can or can't begin with "_", the later __repr__ > comment or the color of the shed. Still, it seems they do not have a "killer > feature" like namedtuple's efficiency. > > Noticeable is how much they resemble a dict. Some let you iterate over the > keys, test for equality and even all of the builtin dict methods (bunch). If > you already use vars() for iteration, you might want a dict. > > Funny that except for the easy "class Namespace: pass", the rest fail repr > for recursive/self-referential objects: > >>>> from argparse/multiprocessing.managers/simplenamespace import Namespace >>>> ns = Namespace() >>>> ns.a = ns >>>> repr(ns) > ... > RuntimeError: maximum recursion depth exceeded > > The next snippet use the fact that dict's __repr__ knows how to handle > recursion to solve the RuntimeError problem: > def __repr__(self): > ? 
?return "{}({!r})".format(self.__class__.__name__, self.__dict__) > > > TB > > [1] http://docs.python.org/dev/tutorial/classes.html#odds-and-ends > [2] http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/argparse.py#l1177 > [3] > http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/multiprocessing/managers.py#l913 > [4] http://pypi.python.org/pypi/bunch > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From eric at trueblade.com Sun May 27 21:35:42 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 27 May 2012 15:35:42 -0400 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> Message-ID: <4FC2820E.2010205@trueblade.com> On 5/27/2012 3:31 PM, Calvin Spealman wrote: > On Sun, May 27, 2012 at 1:58 PM, T.B. wrote: >> On 2012-05-27 19:08, Sven Marnach wrote: >>> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400: >>>> - record >>>> - flexobject >>>> - attrobject >>>> - attrdict >>>> - nameddict >>>> - namedobject >>> >>> Since the proposed type is basically an `object` allowing attributes, >>> another option would be `attrobject`. >>> >>> Adding an `__iter__()` method, as proposed earlier in this thread, >>> seems unnecessary; you can simply iterate over `vars(x)` for an >>> `attrobject` instance `x`. >>> >> >> Is this whole class really necessary? As said before, this type is >> implemented numerous times: >> * empty class (included in the Python Tutorial) [1] >> * argparse.Namespace [2] >> * multiprocessing.managers.Namespace [3] >> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4] >> * many more... 
> > All of the re-implementations of essentially the same thing is exactly why a > standard version is constantly suggested. > > That said, it is so simple that it easily has many variants, because it is only > the base of the different ideas all these things implement. A test of the concept would be: could the uses of the similar classes in the standard library be replaced with the proposed new implementation? Eric. From oscar.j.benjamin at gmail.com Sun May 27 22:05:48 2012 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Sun, 27 May 2012 21:05:48 +0100 Subject: [Python-ideas] a simple namespace type In-Reply-To: <4FC2820E.2010205@trueblade.com> References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> <4FC2820E.2010205@trueblade.com> Message-ID: On 27 May 2012 20:35, Eric V. Smith wrote: > On 5/27/2012 3:31 PM, Calvin Spealman wrote: > > On Sun, May 27, 2012 at 1:58 PM, T.B. wrote: > >> On 2012-05-27 19:08, Sven Marnach wrote: > >>> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400: > >>>> - record > >>>> - flexobject > >>>> - attrobject > >>>> - attrdict > >>>> - nameddict > >>>> - namedobject > >>> > >>> Since the proposed type is basically an `object` allowing attributes, > >>> another option would be `attrobject`. > >>> > >>> Adding an `__iter__()` method, as proposed earlier in this thread, > >>> seems unnecessary; you can simply iterate over `vars(x)` for an > >>> `attrobject` instance `x`. > What about an `__iter__()` method that works like `dict.items()`? Then you can do a round trip with ns = attrobject(**d) and d = dict(ns) allowing you to quickly convert between attribute-based and item-based access in either direction. > >>> > >> > >> Is this whole class really necessary? 
As said before, this type is > >> implemented numerous times: > >> * empty class (included in the Python Tutorial) [1] > >> * argparse.Namespace [2] > >> * multiprocessing.managers.Namespace [3] > >> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4] > >> * many more... > > > > All of the re-implementations of essentially the same thing is exactly > why a > > standard version is constantly suggested. > > > > That said, it is so simple that it easily has many variants, because it > is only > > the base of the different ideas all these things implement. > > A test of the concept would be: could the uses of the similar classes in > the standard library be replaced with the proposed new implementation? > > Eric. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun May 27 22:09:06 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 28 May 2012 06:09:06 +1000 Subject: [Python-ideas] a simple namespace type In-Reply-To: <4FC2820E.2010205@trueblade.com> References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> <4FC2820E.2010205@trueblade.com> Message-ID: Slightly easier bar to reach: could the various incarnations be improved by using a new varobject type as a base class (e.g. I know I often use namedtuple as a base class rather than instantiating them directly, although I do the latter, too). There's also a potentially less controversial alternative: just add an easy spelling for "type(name, (), {})" to the C API. -- Sent from my phone, thus the relative brevity :) On May 28, 2012 5:54 AM, "Eric V. Smith" wrote: > On 5/27/2012 3:31 PM, Calvin Spealman wrote: > > On Sun, May 27, 2012 at 1:58 PM, T.B. wrote: > >> On 2012-05-27 19:08, Sven Marnach wrote: > >>> Calvin Spealman schrieb am Sun, 27. 
May 2012, um 09:42:26 -0400: > >>>> - record > >>>> - flexobject > >>>> - attrobject > >>>> - attrdict > >>>> - nameddict > >>>> - namedobject > >>> > >>> Since the proposed type is basically an `object` allowing attributes, > >>> another option would be `attrobject`. > >>> > >>> Adding an `__iter__()` method, as proposed earlier in this thread, > >>> seems unnecessary; you can simply iterate over `vars(x)` for an > >>> `attrobject` instance `x`. > >>> > >> > >> Is this whole class really necessary? As said before, this type is > >> implemented numerous times: > >> * empty class (included in the Python Tutorial) [1] > >> * argparse.Namespace [2] > >> * multiprocessing.managers.Namespace [3] > >> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4] > >> * many more... > > > > All of the re-implementations of essentially the same thing is exactly > why a > > standard version is constantly suggested. > > > > That said, it is so simple that it easily has many variants, because it > is only > > the base of the different ideas all these things implement. > > A test of the concept would be: could the uses of the similar classes in > the standard library be replaced with the proposed new implementation? > > Eric. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Mon May 28 18:34:38 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 28 May 2012 10:34:38 -0600 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> <4FC2820E.2010205@trueblade.com> Message-ID: On Sun, May 27, 2012 at 2:09 PM, Nick Coghlan wrote: > Slightly easier bar to reach: could the various incarnations be improved by > using a new varobject type as a base class (e.g. I know I often use > namedtuple as a base class rather than instantiating them directly, although > I do the latter, too). Good point. I do the same. > There's also a potentially less controversial alternative: just add an easy > spelling for "type(name, (), {})" to the C API. I really like this. There's a lot of boilerplate to create just a simple type like this in the C API. I'll see what I can come up with. :) As a namespace, it would be good to have a nice repr, but that's not a show stopper. -eric From ericsnowcurrently at gmail.com Tue May 29 02:00:24 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 28 May 2012 18:00:24 -0600 Subject: [Python-ideas] a simple namespace type In-Reply-To: References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> <4FC2820E.2010205@trueblade.com> Message-ID: On Mon, May 28, 2012 at 10:34 AM, Eric Snow wrote: > On Sun, May 27, 2012 at 2:09 PM, Nick Coghlan wrote: >> There's also a potentially less controversial alternative: just add an easy >> spelling for "type(name, (), {})" to the C API. > > I really like this. ?There's a lot of boilerplate to create just a > simple type like this in the C API. ?I'll see what I can come up with. 
> :)

http://bugs.python.org/issue14942

-eric

From techtonik at gmail.com Tue May 29 06:00:51 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 29 May 2012 07:00:51 +0300
Subject: [Python-ideas] from foo import bar.baz
In-Reply-To:
References:
Message-ID:

On Sat, May 26, 2012 at 1:54 AM, Devin Jeanpierre wrote:
> Has it irritated anyone else that this syntax is invalid? I've wanted
> it a couple of times, to be equivalent to:
>
>     import foo.bar.baz
>     from foo import bar
>     del foo # but only if we didn't import foo already before
>
> The idea being that one wants access to foo.bar.baz under the name
> bar.baz, for readability purposes or what have you.

+1

> I played around with adding this, but I seem to have really bad luck
> with extending CPython...

Try PyPy? =)

From julian at grayvines.com Tue May 29 06:46:08 2012
From: julian at grayvines.com (Julian Berman)
Date: Tue, 29 May 2012 00:46:08 -0400
Subject: [Python-ideas] Reimplementing collections.deque as a dynamic array
Message-ID: <1251639012979975459@unknownmsgid>

I've occasionally had a need for a container with constant-time append
to both ends without sacrificing constant-time indexing in the middle.
collections.deque will in these cases narrowly miss the target due to
linear indexing (with the current use case being for two deques
storing the lines of text surrounding the cursor in a text editor
while still being randomly indexed occasionally).

Wikipedia lists at least two common deque implementations:

http://en.wikipedia.org/wiki/Double-ended_queue#Implementations

where switching to a dynamic array would seemingly satisfy my
requirements.

I know from a bit of experience (and a quick SO perusal) that "How do
I index a deque" does occasionally come up. Any thoughts on the value
of such a change?
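For readers unfamiliar with the dynamic-array approach mentioned above,
here is a minimal sketch of a ring-buffer deque in pure Python. The
class name ArrayDeque and its internals are hypothetical and only
illustrate the idea; this is not how collections.deque is implemented,
and pops, slicing and deletion are left out to keep it short.

```python
class ArrayDeque:
    """Hypothetical ring-buffer deque: amortized O(1) appends at both
    ends and O(1) indexing. Illustrative only."""

    def __init__(self):
        self._buf = [None] * 8   # backing storage, doubled when full
        self._head = 0           # physical index of logical item 0
        self._len = 0

    def __len__(self):
        return self._len

    def _grow(self):
        # Copy items in logical order into a buffer twice as large.
        cap = len(self._buf)
        items = [self._buf[(self._head + i) % cap] for i in range(self._len)]
        self._buf = items + [None] * cap
        self._head = 0

    def append(self, item):
        if self._len == len(self._buf):
            self._grow()
        self._buf[(self._head + self._len) % len(self._buf)] = item
        self._len += 1

    def appendleft(self, item):
        if self._len == len(self._buf):
            self._grow()
        # Step the head back one slot, wrapping around the buffer.
        self._head = (self._head - 1) % len(self._buf)
        self._buf[self._head] = item
        self._len += 1

    def __getitem__(self, index):
        if index < 0:
            index += self._len
        if not 0 <= index < self._len:
            raise IndexError("deque index out of range")
        # One modulo operation, independent of the container size.
        return self._buf[(self._head + index) % len(self._buf)]


d = ArrayDeque()
for i in range(10):
    d.append(i)
    d.appendleft(-i)
print(len(d), d[0], d[10], d[-1])   # 20 -9 0 9
```

The trade-off is that an append occasionally has to copy the whole
buffer when it is full, so the constant-time bound on appends is
amortized rather than worst-case.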
JB

From techtonik at gmail.com Tue May 29 07:05:27 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 29 May 2012 08:05:27 +0300
Subject: [Python-ideas] stdlib crowdsourcing
Message-ID:

The problem with stdlib - it is all damn subjective. There is no
process to add functions and modules if you're not well-behaved and
skilled in public debates and don't really have a lot of time to be a
champion of your module/function. In other words - it is hard (if not
impossible for 80% of Python Earth population).

So, many people and projects decide to opt-out. Take a look at Twisted
- a lot of useful stuff, but not in Python stdlib.

So.. Provide a way for people to opt-out from core stuff, but still
allow them to share the changes and update code if necessary. This
will require:

- a local stdlib Python path convention
- snippet normalization function and AST hash dumper
- web site with stats
- source code crawler

How it works:

1. Every project maintains its own stdlib directory with functions
   that they feel are good to have in standard library
2. Functions are placed so that they are imported as if from standard
   library, but this time with stdlib prefix
3. The license for this directory is public domain to remove all legal
   barriers (credits are welcome, but optional)
4. Crawler (probably PyPI) scans this stdlib dir, finds functions,
   normalizes them, calculates hash and submits to web site
   4.1 Normalization is required to find the shared function
       copy/pasted across different projects with different
       indentation level, docstrings, parameters/variable names etc.
   4.2 Hash is calculated upon AST. There are at least three hashes
       for each entry:
       4.2.1 Full hash - all docstrings and variable names are
             preserved, whitespace normalized
       4.2.2 Stripped hash - docstrings are stripped, variable names
             are normalized
       4.2.3 Signature hash - a mark placed in a comment above
             function name, either calculated from function signature
             or generated randomly, used for manual tracking of
             copy/paste e.g. pd:ac546df6b8340a92
5. Web site maintains usage and popularity stuff, accepts votes on
   inclusion of snippets

User stories:

1. "I want to find if there is a better/updated version of my function
   available"
   1.1 I enter hash into web site search form
   1.2 Site gives me a link to my snippet
   1.3 I can see what people proposed to replace this function with
   1.4 I can choose the function with most votes
   1.5 I can flag the functions I may find irrelevant or
   1.6 I can tag the functions that divert in different direction than
       I need to filter them
2. "I want to reuse code snippets without additional dependencies on
   3rd party projects"
   2.1 Just place them into my own stdlib directory
3. "I want to update code snippets when there is an update for them"
   3.1 I run scanner, it extracts signature hashes, stripped hashes
       and looks if web-site version of signature matches normalized
       hash
4. "I want to see what people want to include in the next Python
   version"
   4.1 A call for proposals is made
   4.2 People place wannabes into their stdlib dirs
   4.3 Crawl generates new functions on a web site
   4.4 Functions are categorized
   4.5 Optionally included / declined with a short one-liner reason -
       why
   4.6 Optionally provided with more detailed info why

--- feature creep cut ---

5. "I want to see what functions are popular in other languages"
   5.1 A separate crawler for Ruby, PHP etc. stdlib converts their AST
       into compatible format where possible
   5.2 Submit to site stats
6. "I want to download the function in Ruby format"
   6.1 AST converter tries to do the job automatically where possible
   6.2 If it fails - you are encouraged to fix the converter rules or
       write the replacement for this signature manually

Just an idea.
-- 
anatoly t.

From ncoghlan at gmail.com Tue May 29 08:02:25 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 29 May 2012 16:02:25 +1000
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To:
References:
Message-ID:

Once again, you're completely ignoring all existing knowledge and
expertise on open collaboration and trying to reinvent the world. It's
*not going to happen*.

The standard library is just the curated core, and *yes*, it's damn
hard to get anything added to it (deliberately so). There's a place
where anyone can post anything they want, and see if others find it
useful: PyPI. The standard library provides tools to upload to PyPI,
and, as of 3.3, will even include tools to download and install from
it.

If you don't like our ecosystem (it's hard to tell whether or not you
do: everything you post is about how utterly awful and unusable
everything is, yet you're still here years later).

If you think the PyPI UI is awful or inadequate, follow the example of
crate.io or pythonpackage.com and *create your own*. There's far more
to the Python universe than just core development, stop trying to
shoehorn everything into a place where it doesn't belong.

Finally-giving-up'ly,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From alexandre at peadrop.com Tue May 29 08:50:32 2012
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 29 May 2012 02:50:32 -0400
Subject: [Python-ideas] Reimplementing collections.deque as a dynamic array
In-Reply-To: <1251639012979975459@unknownmsgid>
References: <1251639012979975459@unknownmsgid>
Message-ID:

The current implementation of deque is a doubly linked list of arrays.
Indexing is indeed linear, but still very efficient.
It takes 1 ms to index a deque with a million items. If that's not good
enough, you should try to implement your own container using lists
(which are dynamic arrays in Python). That should be easy to implement,
though this approach will likely be slower for everything but very large
datasets.

From alexandre at peadrop.com Tue May 29 09:00:32 2012
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 29 May 2012 03:00:32 -0400
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To:
References:
Message-ID:

On Tue, May 29, 2012 at 2:02 AM, Nick Coghlan wrote:
>
> If you don't like our ecosystem (it's hard to tell whether or not you
> do: everything you post is about how utterly awful and unusable
> everything is, yet you're still here years later).
>

I understand the discouragement with regard to repeating yourself over
and over again. But, let's keep the discussion friendly here, okay? This
is Python-ideas: crazy proposals are fine. We can simply ignore those
and move on.

From cs at zip.com.au Tue May 29 09:04:44 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Tue, 29 May 2012 17:04:44 +1000
Subject: [Python-ideas] Reimplementing collections.deque as a dynamic array
In-Reply-To: <1251639012979975459@unknownmsgid>
References: <1251639012979975459@unknownmsgid>
Message-ID: <20120529070444.GA31399@cskk.homeip.net>

On 29May2012 00:46, Julian Berman wrote:
| I've occasionally had a need for a container with constant-time append
| to both ends without sacrificing constant-time indexing in the middle.
| collections.deque will in these cases narrowly miss the target due to
| linear indexing (with the current use case being for two deques
| storing the lines of text surrounding the cursor in a text editor
| while still being randomly indexed occasionally).
|
| Wikipedia lists at least two common deque implementations:
|
| http://en.wikipedia.org/wiki/Double-ended_queue#Implementations
|
| where switching to a dynamic array would seemingly satisfy my requirements.
|
| I know from a bit of experience (and a quick SO perusal) that "How do
| I index a deque" does occasionally come up. Any thoughts on the value
| of such a change?

It was pointed out to me recently that Python's list.append() is
amortized constant time. Use two lists, one for append-forward and one
for append-backward. Keep track of the bound. Access to item "i" is
trivially computed from the backward and forward list sizes and ends.

Cheers,
--
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

The mere existence of a problem is no proof of the existence of a solution.
        - Yiddish Proverb

From zuo at chopin.edu.pl Tue May 29 19:34:06 2012
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Tue, 29 May 2012 19:34:06 +0200
Subject: [Python-ideas] a simple namespace type
In-Reply-To:
References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> <4FC2820E.2010205@trueblade.com>
Message-ID: <20120529173406.GA1869@chopin.edu.pl>

Eric Snow dixit (2012-05-28, 10:34):

> On Sun, May 27, 2012 at 2:09 PM, Nick Coghlan wrote:
> > Slightly easier bar to reach: could the various incarnations be improved by
> > using a new varobject type as a base class (e.g. I know I often use
> > namedtuple as a base class rather than instantiating them directly, although
> > I do the latter, too).
>
> Good point. I do the same.
>
> > There's also a potentially less controversial alternative: just add an easy
> > spelling for "type(name, (), {})" to the C API.
>
> I really like this. There's a lot of boilerplate to create just a
> simple type like this in the C API. I'll see what I can come up with.
> :)
>
> As a namespace, it would be good to have a nice repr, but that's not a
> show stopper.
Using classes as 'attribute containers' is suboptimal (which means that
in performance-critical parts of code you would have to implement a
namespace-like type anyway -- if you wanted to have attr-based syntax,
of course).

There should be one obvious way to do it. Now there is no one.

Cheers.
*j

From ericsnowcurrently at gmail.com Wed May 30 06:52:21 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 29 May 2012 22:52:21 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <20120529173406.GA1869@chopin.edu.pl>
References: <20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com> <4FC2820E.2010205@trueblade.com> <20120529173406.GA1869@chopin.edu.pl>
Message-ID:

On Tue, May 29, 2012 at 11:34 AM, Jan Kaliszewski wrote:
> Using classes as 'attribute containers' is suboptimal (which means that
> in performance-critical parts of code you would have to implement a
> namespace-like type anyway -- if you wanted to have attr-based syntax,
> of course).

What are the performance problems of using a type object in this way?

>
> There should be one obvious way to do it. Now there is no one.

Yeah, I feel the same way.

-eric

From g.brandl at gmx.net Wed May 30 08:47:38 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 30 May 2012 08:47:38 +0200
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To:
References:
Message-ID:

On 29.05.2012 09:00, Alexandre Vassalotti wrote:
> On Tue, May 29, 2012 at 2:02 AM, Nick Coghlan wrote:
>
>     If you don't like our ecosystem (it's hard to tell whether or not you
>     do: everything you post is about how utterly awful and unusable
>     everything is, yet you're still here years later).
>
>
> I understand the discouragement with regard to repeating yourself over and over
> again. But, let's keep the discussion friendly here, okay? This is Python-ideas:
> crazy proposals are fine. We can simply ignore those and move on.
I don't see what's unfriendly about that paragraph: it's a quite
accurate matter-of-fact statement...

Georg

From armin.wieser at gmail.com Wed May 30 10:45:31 2012
From: armin.wieser at gmail.com (Armin Wieser)
Date: Wed, 30 May 2012 10:45:31 +0200
Subject: [Python-ideas] PEP for Python folder structure
Message-ID: <4FC5DE2B.5020503@gmail.com>

Hi,

I would like to write a PEP about folder structure in python projects.

You will think that there is no need for that, because everything is
documented (package, module, setuptools). But it should contain
something like [0].

If you aren't into those concepts, have never pushed a package to
PyPI, and have only written some scripts, it's hard to find out how to
structure your folders.

Therefore I think a PEP would be a great way to show how you can do it.

What do you think about it?

[0] http://jcalderone.livejournal.com/39794.html

From littleq0903 at gmail.com Wed May 30 11:05:41 2012
From: littleq0903 at gmail.com (LittleQ)
Date: Wed, 30 May 2012 17:05:41 +0800
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <4FC5DE2B.5020503@gmail.com>
References: <4FC5DE2B.5020503@gmail.com>
Message-ID: <4D11166B36784105A9CCC360A05049A9@gmail.com>

I think one of the good things about Python is "no project structure",
which makes Python easy to learn and easy to use.

Could you show something like your Python project structure, for
example? I'm curious why you think Python needs a basic project
structure : )

Personally I just hate the project structure, because Erlang has a
project structure for each project, and that often got me into a mess.

Best Regards,
Colin Su (LittleQ)
NCCU Computer Science Dept. / PLSM Lab.
About.me: http://about.me/littleq

On Wednesday, May 30, 2012 at 4:45 PM, Armin Wieser wrote:

> Hi,
>
> I would like to write a PEP about folder structure in python projects.
>
> You will think that there is no need for that, because everything is
> documented (package, module, setuptools). But it should contain
> something like [0].
>
> If you aren't into those concepts, have never pushed a package to
> PyPI, and have only written some scripts, it's hard to find out how to
> structure your folders.
>
> Therefore I think a PEP would be a great way to show how you can do it.
>
> What do you think about it?
>
> [0] http://jcalderone.livejournal.com/39794.html

From steve at pearwood.info Wed May 30 11:19:07 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 May 2012 19:19:07 +1000
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <4FC5DE2B.5020503@gmail.com>
References: <4FC5DE2B.5020503@gmail.com>
Message-ID: <20120530091907.GB27475@ando>

On Wed, May 30, 2012 at 10:45:31AM +0200, Armin Wieser wrote:
> Hi,
>
> I would like to write a PEP about folder structure in python projects.

Why?

PEP stands for Python Enhancement Proposal, and relate to suggested
changes to the Python language and standard library. Your blog post
about folder structure:

> [0] http://jcalderone.livejournal.com/39794.html

is interesting, but it has nothing to do with either Python the language
or the standard library, as far as I can tell.

In fact, some of your project suggestions go against best-practice, or
at least common practice:

"Don't put your source in a directory called src"

Really? I think you'll find many people disagree with that.

I think your blog post is a good blog post, and deserves to have people
read it and discuss it. With feedback from others, I think it might even
become a good How To layout projects. But I think it would be a poor
PEP.

Of course, you can write a post in the format of a PEP.
Just don't call it a PEP unless it is a proposal for an enhancement to
Python, or at least related to development of Python, e.g. PEP 8.

--
Steven

From mal at egenix.com Wed May 30 13:01:42 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 30 May 2012 13:01:42 +0200
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <20120530091907.GB27475@ando>
References: <4FC5DE2B.5020503@gmail.com> <20120530091907.GB27475@ando>
Message-ID: <4FC5FE16.5000200@egenix.com>

Steven D'Aprano wrote:
> On Wed, May 30, 2012 at 10:45:31AM +0200, Armin Wieser wrote:
>> Hi,
>>
>> I would like to write a PEP about folder structure in python projects.
>
> Why?
>
> PEP stands for Python Enhancement Proposal, and relate to suggested
> changes to the Python language and standard library. Your blog post
> about folder structure:
>
>> [0] http://jcalderone.livejournal.com/39794.html
>
> is interesting, but it has nothing to do with either Python the language
> or the standard library, as far as I can tell.

We do have informational PEPs for the purpose Armin is describing, but
we usually only try to use those for standardization of things.

I don't think a standard project dir layout is really needed. Helping
new package authors finding the right structure for their project does
help, though.

Perhaps the idea could be turned into a section of the (distutils)
documentation, a how-to or a page on the wiki ?!

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 30 2012)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2012-07-17: Python Meeting Duesseldorf ... 48 days to go
2012-07-02: EuroPython 2012, Florence, Italy ... 33 days to go
2012-05-16: Released eGenix pyOpenSSL 0.13 ... http://egenix.com/go29

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From ncoghlan at gmail.com Wed May 30 13:48:39 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 30 May 2012 21:48:39 +1000
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <4FC5FE16.5000200@egenix.com>
References: <4FC5DE2B.5020503@gmail.com> <20120530091907.GB27475@ando> <4FC5FE16.5000200@egenix.com>
Message-ID:

On Wed, May 30, 2012 at 9:01 PM, M.-A. Lemburg wrote:
> I don't think a standard project dir layout is really needed. Helping new
> package authors finding the right structure for their project does
> help, though.

The basic problem is that it's a matter of "it depends what you're
building and whether or not there are any other constraints on your
layout". Kenneth Reitz has a decent guide that he posted recently ([1]),
but see the comments below the post for some useful caveats and
discussion.

Ultimately though, providing a place to provide opinionated advice on
exactly this kind of question is why the Hitchhiker's Guide to Python
[2] was created.

[1] http://kennethreitz.com/repository-structure-and-python.html
[2] http://docs.python-guide.org/en/latest/index.html

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From Ronny.Pfannschmidt at gmx.de Wed May 30 17:03:31 2012
From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt)
Date: Wed, 30 May 2012 17:03:31 +0200
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr functions easily
Message-ID: <4FC636C3.5080004@gmx.de>

Hi,

i consider my utility class FormatRepr finished,
it's currently available in
( http://pypi.python.org/pypi/reprtools/0.1 )

it supplies a descriptor that allows to simply declare __repr__ methods
based on object attributes.

i think it greatly enhances readability for those things,
as it's DRY and focuses on the parts *i* consider important
(e.g. what accessible attribute gets formatted how)

there is no need to repeat attribute names or
care if something is a property, class-attribute or object attribute
(one of the reasons why a simple .format(**vars(self)) will not always work)

oversimplified example:


.. code-block:: python

  from reprtools import FormatRepr

  class User(object):
      __repr__ = FormatRepr("")

      def __init__(self, name):
          self.name = name


>>> User('test')

-- Ronny

From mikegraham at gmail.com Thu May 31 16:38:37 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Thu, 31 May 2012 10:38:37 -0400
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr functions easily
In-Reply-To: <4FC636C3.5080004@gmx.de>
References: <4FC636C3.5080004@gmx.de>
Message-ID:

On Wed, May 30, 2012 at 11:03 AM, Ronny Pfannschmidt wrote:
> Hi,
>
> i consider my utility class FormatRepr finished,
> it's currently available in
> ( http://pypi.python.org/pypi/reprtools/0.1 )
>
> it supplies a descriptor that allows to simply declare __repr__ methods
> based on object attributes.
>
> i think it greatly enhances readability for those things,
> as it's DRY and focuses on the parts *i* consider important
> (e.g.
what accessible attribute gets formatted how)
>
> there is no need to repeat attribute names or
> care if something is a property, class-attribute or object attribute
> (one of the reasons why a simple .format(**vars(self)) will not always work)
>
> oversimplified example:
>
>
> .. code-block:: python
>
>   from reprtools import FormatRepr
>
>   class User(object):
>       __repr__ = FormatRepr("")
>
>       def __init__(self, name):
>           self.name = name
>
>
> >>> User('test')
>

If we introduce something like this, I think I'd prefer an approach
that didn't encourage hardcoding "User". In my __repr__s, I usually
make the class's name dynamic so it does not make for confusing reprs
in the event of subclassing.

You really don't end up implementing __repr__ all that often and if
you do, writing a simple one isn't hard. I'm -0 on having this in
the stdlib.

Mike

From Ronny.Pfannschmidt at gmx.de Thu May 31 16:43:23 2012
From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt)
Date: Thu, 31 May 2012 16:43:23 +0200
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr functions easily
In-Reply-To:
References: <4FC636C3.5080004@gmx.de>
Message-ID: <4FC7838B.5050709@gmx.de>

On 05/31/2012 04:38 PM, Mike Graham wrote:
> On Wed, May 30, 2012 at 11:03 AM, Ronny Pfannschmidt wrote:
>> Hi,
>>
>> i consider my utility class FormatRepr finished,
>> it's currently available in
>> ( http://pypi.python.org/pypi/reprtools/0.1 )
>>
>> it supplies a descriptor that allows to simply declare __repr__ methods
>> based on object attributes.
>>
>> i think it greatly enhances readability for those things,
>> as it's DRY and focuses on the parts *i* consider important
>> (e.g. what accessible attribute gets formatted how)
>>
>> there is no need to repeat attribute names or
>> care if something is a property, class-attribute or object attribute
>> (one of the reasons why a simple .format(**vars(self)) will not always work)
>>
>> oversimplified example:
>>
>>
>> .. code-block:: python
>>
>>   from reprtools import FormatRepr
>>
>>   class User(object):
>>       __repr__ = FormatRepr("")
>>
>>       def __init__(self, name):
>>           self.name = name
>>
>>
>> >>> User('test')
>>
>
> If we introduce something like this, I think I'd prefer an approach
> that didn't encourage hardcoding "User". In my __repr__s, I usually
> make the class's name dynamic so it does not make for confusing reprs
> in the event of subclassing.

you can just use {__class__.__name__} to have it "softcoded"

>
> You really don't end up implementing __repr__ all that often and if
> you do, writing a simple one isn't hard. I'm -0 on having this in
> the stdlib.
>
> Mike

From alexandre.zani at gmail.com Thu May 31 16:51:45 2012
From: alexandre.zani at gmail.com (Alexandre Zani)
Date: Thu, 31 May 2012 07:51:45 -0700
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr functions easily
In-Reply-To: <4FC7838B.5050709@gmx.de>
References: <4FC636C3.5080004@gmx.de> <4FC7838B.5050709@gmx.de>
Message-ID:

I would prefer an interface where I just pass a list of attribute names
and the utility class figures everything else out.

That said, I'm not sure this does enough to warrant inclusion in the
stdlib. It's easy enough to write a __repr__ with just a few more
characters. I'm not sure that:

__repr__ = FormatRepr("")

is actually more readable than

def __repr__(self):
    return "" % self.username

In fact, the second option might be better because I don't have to learn
anything new to understand it. If I see your version, I have to google
it and then use brain-space to hold that feature in memory. If I know
python, I already know what the second option means.

Alexandre Zani

On Thu, May 31, 2012 at 7:43 AM, Ronny Pfannschmidt wrote:
> On 05/31/2012 04:38 PM, Mike Graham wrote:
>>
>> On Wed, May 30, 2012 at 11:03 AM, Ronny Pfannschmidt wrote:
>>>
>>> Hi,
>>>
>>> i consider my utility class FormatRepr finished,
>>> it's currently available in
>>> ( http://pypi.python.org/pypi/reprtools/0.1 )
>>>
>>> it supplies a descriptor that allows to simply declare __repr__ methods
>>> based on object attributes.
>>>
>>> i think it greatly enhances readability for those things,
>>> as it's DRY and focuses on the parts *i* consider important
>>> (e.g. what accessible attribute gets formatted how)
>>>
>>> there is no need to repeat attribute names or
>>> care if something is a property, class-attribute or object attribute
>>> (one of the reasons why a simple .format(**vars(self)) will not always
>>> work)
>>>
>>> oversimplified example:
>>>
>>>
>>> .. code-block:: python
>>>
>>>   from reprtools import FormatRepr
>>>
>>>   class User(object):
>>>       __repr__ = FormatRepr("")
>>>
>>>       def __init__(self, name):
>>>           self.name = name
>>>
>>>
>>> >>> User('test')
>>>
>>>
>>
>>
>> If we introduce something like this, I think I'd prefer an approach
>> that didn't encourage hardcoding "User". In my __repr__s, I usually
>> make the class's name dynamic so it does not make for confusing reprs
>> in the event of subclassing.
>
>
> you can just use {__class__.__name__} to have it "softcoded"
>
>
>>
>> You really don't end up implementing __repr__ all that often and if
>> you do, writing a simple one isn't hard. I'm -0 on having this in
>> the stdlib.
>>
>> Mike
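
For readers following the FormatRepr thread above, here is a rough sketch of how a descriptor of that kind could work. It is not the reprtools implementation: the names `FormatReprSketch` and `_AttrFormatter` are made up for illustration, and the format template is my own choice rather than the one from Ronny's example. It only assumes the standard `string.Formatter` hooks, and it uses `{__class__.__name__}` so that subclasses do not get a hardcoded class name, which is the concern Mike raised.

```python
import string


class _AttrFormatter(string.Formatter):
    """Resolve "{name}" fields via getattr() on the first positional argument."""

    def get_value(self, key, args, kwargs):
        # getattr() sees properties, class attributes and instance
        # attributes alike, unlike a plain vars(obj) lookup.
        return getattr(args[0], key)


class FormatReprSketch(object):
    """Hypothetical descriptor that builds __repr__ from a format template."""

    _formatter = _AttrFormatter()

    def __init__(self, template):
        self.template = template

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class itself
        # repr() calls the returned object with no arguments.
        return lambda: self._formatter.format(self.template, instance)


class User(object):
    # Illustrative template; {__class__.__name__} avoids hardcoding "User".
    __repr__ = FormatReprSketch("<{__class__.__name__} name={name!r}>")

    def __init__(self, name):
        self.name = name


class Admin(User):
    pass


print(repr(User("test")))   # <User name='test'>
print(repr(Admin("root")))  # <Admin name='root'>
```

Whether this is more readable than a hand-written `__repr__` is exactly what the thread disagrees on; the sketch is only meant to make the mechanism concrete.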