From sylvain.thenault at logilab.fr Tue Sep 3 11:55:49 2013
From: sylvain.thenault at logilab.fr (Sylvain Thénault)
Date: Tue, 3 Sep 2013 11:55:49 +0200
Subject: [code-quality] Something about Pylint
In-Reply-To:
References:
Message-ID: <20130903095549.GC2581@logilab.fr>

On 09 août 20:03, ??? wrote:
> Hello,
> I am a beginner to Pylint, My name is Hou Zhuoming. When I install the pylint 1.0.0 in python 3.3(OS:Windows7), there is a error happened.
>
> And then I read the source code of pylint 1.0.0. I found something is wrong in setup.py model.

Follow up on https://bitbucket.org/logilab/pylint/issue/51/building-pylint-windows-installer-for

--
Sylvain Thénault, LOGILAB, Paris (01.45.32.03.12) - Toulouse (05.62.17.16.42)
Formations Python, Debian, Méth. Agiles: http://www.logilab.fr/formations
Développement logiciel sur mesure: http://www.logilab.fr/services
CubicWeb, the semantic web framework: http://www.cubicweb.org

From skip at pobox.com Tue Sep 17 20:56:19 2013
From: skip at pobox.com (Skip Montanaro)
Date: Tue, 17 Sep 2013 13:56:19 -0500
Subject: [code-quality] Spurious useless-else-on-loop warning
Message-ID:

I have a function which uses for ... else, with an early return.
Here's a cut down version of it:

"""..."""

def func(c, dt, ct):
    for cl, op in zip(c.events[0::2], c.events[1::2]):
        delta = cl.event_datetime - dt
        if delta > ct:
            return dt + ct
        else:
            dt = op.event_datetime
            ct -= delta
    else:
        return dt

Pylint (1.0.0) complains:

early_exit.py:11: [W0120(useless-else-on-loop), func] Else clause on
loop without a break statement

I thought to myself, "self, a return should be treated like a break,"
and set out to fix this problem. I found the suspect code in
checkers/base.py:

def _loop_exits_early(loop):
    """Returns true if a loop has a break statement in its body."""
    loop_nodes = (astroid.For, astroid.While)
    # Loop over body explicitly to avoid matching break statements
    # in orelse.
    for child in loop.body:
        if isinstance(child, loop_nodes):
            continue
        for _ in child.nodes_of_class(astroid.Break, skip_klass=loop_nodes):
            return True
    return False

Looking to see where a test case would go, I found
test/input/func_useless_else_on_loop.py, with this test case:

def test_return_for():
    """else + return is accetable."""
    for i in range(10):
        if i % 2:
            return i
    else:
        print 'math is broken'

Given that _loop_exits_early doesn't check for early returns, how does
test_return_for (and test_return_while, not shown) pass when the test
suite is run? I think these two lines belong in _loop_exits_early:

        for _ in child.nodes_of_class(astroid.Return, skip_klass=loop_nodes):
            return True

assuming there is no way to spell that with just a single call to
child.nodes_of_class().

(I tried updating my repo, but can't remember my bitbucket password,
otherwise I would have checked to see if this problem has already been
solved. My apologies if it has.)

Skip

From ned at nedbatchelder.com Tue Sep 17 21:20:18 2013
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Tue, 17 Sep 2013 15:20:18 -0400
Subject: [code-quality] Spurious useless-else-on-loop warning
In-Reply-To:
References:
Message-ID: <5238AB72.6080603@nedbatchelder.com>

On 9/17/13 2:56 PM, Skip Montanaro wrote:
> I have a function which uses for ... else, with an early return.
> Here's a cut down version of it:
>
> """..."""
>
> def func(c, dt, ct):
>     for cl, op in zip(c.events[0::2], c.events[1::2]):
>         delta = cl.event_datetime - dt
>         if delta > ct:
>             return dt + ct
>         else:
>             dt = op.event_datetime
>             ct -= delta
>     else:
>         return dt
>
> Pylint (1.0.0) complains:
>
> early_exit.py:11: [W0120(useless-else-on-loop), func] Else clause on
> loop without a break statement

But this "else" is useless, isn't it? Just put the "return dt"
statement after the for statement, no else needed.

--Ned.
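
The equivalence Ned describes can be checked with a minimal sketch. The function names and sample data below are hypothetical and are not taken from Skip's code; the sketch only assumes a loop whose body exits with return and never with break, plus one contrasting loop that does use break.

```python
def with_else(values):
    """Loop with an early return and an else clause, as in the post above."""
    for value in values:
        if value % 2:
            return value  # leaves the function, so the else is never reached
    else:
        # With no break in the body, this runs whenever the loop finishes.
        return "no odd value"


def without_else(values):
    """Same logic with the fallback moved after the loop."""
    for value in values:
        if value % 2:
            return value
    return "no odd value"


def with_break(values):
    """Here the else is meaningful: it is skipped only when break fires."""
    for value in values:
        if value % 2:
            break
    else:
        return "no odd value"
    return "found %d" % value


# Hypothetical sample inputs: no odd value, one odd value, empty input.
samples = ([2, 4, 6], [2, 3, 4], [])
results = [(with_else(s), without_else(s)) for s in samples]
```

Each pair in `results` holds the same value twice, which is the sense in which the else adds nothing when the loop body contains no break.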
From ned at nedbatchelder.com Wed Sep 18 02:21:51 2013
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Tue, 17 Sep 2013 20:21:51 -0400
Subject: [code-quality] Spurious useless-else-on-loop warning
In-Reply-To:
References: <5238AB72.6080603@nedbatchelder.com>
Message-ID: <5238F21F.4050608@nedbatchelder.com>

On 9/17/13 7:56 PM, Skip Montanaro wrote:
>
> > But this "else" is useless, isn't it? Just put the "return dt"
> > statement after the for statement, no else needed.
>
> In this particular case, yes, but that's not the point of the post.
> Pylint doesn't check to see what's in the else clause, only whether
> there is an early exit from the body of the loop.
>
> Skip
>

But is there a sensible use-case for a for loop with an else but no
break? I can't think of one. There's no point changing pylint unless
you can find a useful piece of code that it complains about.

--Ned.

From skip at pobox.com Wed Sep 18 03:38:39 2013
From: skip at pobox.com (Skip Montanaro)
Date: Tue, 17 Sep 2013 20:38:39 -0500
Subject: [code-quality] Fwd: Spurious useless-else-on-loop warning
In-Reply-To:
References: <5238AB72.6080603@nedbatchelder.com>
Message-ID:

> But this "else" is useless, isn't it? Just put the "return dt"
> statement after the for statement, no else needed.

In this particular case, yes, but that's not the point of the post.
Pylint doesn't check to see what's in the else clause, only whether
there is an early exit from the body of the loop.

Skip

From skip at pobox.com Wed Sep 18 03:39:06 2013
From: skip at pobox.com (Skip Montanaro)
Date: Tue, 17 Sep 2013 20:39:06 -0500
Subject: [code-quality] Fwd: Spurious useless-else-on-loop warning
In-Reply-To:
References: <5238AB72.6080603@nedbatchelder.com> <5238F21F.4050608@nedbatchelder.com>
Message-ID:

Ned,

I still think you're missing the point of my original post. A return
is as good as a break when considering early exit from a loop.
The pylint code only checks for the presence of a break statement, but the
test cases clearly show a case with a return statement. The test
functions even include docstrings which state that.

That's all I'm trying to point out. That I discovered the problem in a
piece of code which could be rewritten to avoid the warning is quite
beside the point.

Skip

From fuzzyman at gmail.com Wed Sep 18 06:33:56 2013
From: fuzzyman at gmail.com (Michael Foord)
Date: Wed, 18 Sep 2013 05:33:56 +0100
Subject: [code-quality] Fwd: Spurious useless-else-on-loop warning
In-Reply-To:
References: <5238AB72.6080603@nedbatchelder.com> <5238F21F.4050608@nedbatchelder.com>
Message-ID:

On 18 September 2013 02:39, Skip Montanaro wrote:

> Ned,
>
> I still think you're missing the point of my original post. A return
> is as good as a break when considering early exit from a loop. The
> pylint code only checks for the presence of a break statement, but the
> test cases clearly show a case with a return statement. The test
> functions even include docstrings which state that.
>

It's not considering the exit from the loop; it's considering the entry
to the else clause. A break would continue past the body of the loop
without entering the else clause (so the else would have a meaning: only
enter this code if we haven't hit break). Without a break the else clause
will always be entered if the loop terminates normally, so the else is
useless. Just putting the return immediately after the loop is
functionally identical.

That you have some early returns inside the body of the loop is
irrelevant - the "else" is still unnecessary / pointless.

Michael

>
> That's all I'm trying to point out. That I discovered the problem in a
> piece of code which could be rewritten to avoid the warning is quite
> beside the point.
>
> Skip
> _______________________________________________
> code-quality mailing list
> code-quality at python.org
> https://mail.python.org/mailman/listinfo/code-quality
>

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From sylvain.thenault at logilab.fr Wed Sep 18 07:39:47 2013
From: sylvain.thenault at logilab.fr (Sylvain Thénault)
Date: Wed, 18 Sep 2013 07:39:47 +0200
Subject: [code-quality] Fwd: Spurious useless-else-on-loop warning
In-Reply-To:
References: <5238AB72.6080603@nedbatchelder.com> <5238F21F.4050608@nedbatchelder.com>
Message-ID: <20130918053947.GB2604@logilab.fr>

On 18 septembre 05:33, Michael Foord wrote:
> On 18 September 2013 02:39, Skip Montanaro wrote:
> > I still think you're missing the point of my original post. A return
> > is as good as a break when considering early exit from a loop. The
> > pylint code only checks for the presence of a break statement, but the
> > test cases clearly show a case with a return statement. The test
> > functions even include docstrings which state that.
>
> It's not considering the exit from the loop it's considering the entry to
> the else clause. A break would continue past the body of the loop without
> entering the else clause (so the else would have a meaning - only enter
> this code if we haven't hit break). Without a break the else clause will
> always be entered if the loop terminates normally, so the else is useless.
> Just putting the return immediately after the loop is functionally
> identical.
>
> That you have some early returns inside the body of the loop is irrelevant

Yep, that's the point of this check so IMO Pylint doesn't have to be changed.
Perhaps Torsten or someone else from Google who contributed this check
can confirm this is the expected behaviour.

Skip, regarding the test you found:

def test_return_for():
    """else + return is accetable."""
    for i in range(10):
        if i % 2:
            return i
    else:
        print 'math is broken'

the docstring doomed you: the else actually triggers the warning here
(see the associated message file in
test/messages/func_useless_else_on_loop.txt). I'll fix that.

--
Sylvain Thénault, LOGILAB, Paris (01.45.32.03.12) - Toulouse (05.62.17.16.42)
Formations Python, Debian, Méth. Agiles: http://www.logilab.fr/formations
Développement logiciel sur mesure: http://www.logilab.fr/services
CubicWeb, the semantic web framework: http://www.cubicweb.org

From jdahlin at async.com.br Thu Sep 19 04:17:36 2013
From: jdahlin at async.com.br (Johan Dahlin)
Date: Wed, 18 Sep 2013 23:17:36 -0300
Subject: [code-quality] Pylint and multiprocessing
Message-ID:

Hi,

I'm Johan and I'm working on an ERP for the Brazilian market [1]; it's
written in Python. As part of our software development process we run
pyflakes/pep8/pylint/unittests/coverage validation _before_ each commit
can be integrated into the mainline.

The main bottleneck of this validation turns out to be pylint and I
started to investigate how to use multiprocessing to speed that up.

Attached is a first take on just that. This creates a configurable
number of processes and uses them to split out the work on a per
filename basis.

I've been working on 0.26.0 of Pylint, which is the version included
with Ubuntu 13.04. It seems that the part modified by this patch hasn't
changed significantly in trunk, but I can port over my work to a newer
version if there's interest.

The fact that the different processes are not sharing state with each
other probably means that certain checks are not going to work 100%.
I don't fully understand how pylint works internally, perhaps someone
else could chip in.
Anyway, some numbers:

i3-3220 (1 socket, 2 cores, 4 hyperthreads)

Sample set 1: (37kloc, 92 files)
 - unmodified    18.35s     0%
 - n_procs = 1   18.50s    -1%
 - n_procs = 2   11.29s   +38%
 - n_procs = 3   10.96s   +40%
 - n_procs = 4   10.85s   +41%

Sample set 2: (156kloc, 762 files)
 - unmodified    77.62s
 - n_procs = 1   82.33s    -6%
 - n_procs = 2   47.68s   +39%
 - n_procs = 3   46.04s   +41%
 - n_procs = 4   45.19s   +42%

Xeon(R) CPU E5410 (2 sockets, 8 cores, no hyperthreading)

Sample set 2: (156kloc, 762 files)
 - unmodified   140.996s
 - n_procs = 4   48.675s   65%
 - n_procs = 8   36.323s   74%

The numbers seem fairly promising, and simple errors introduced into the
code base are still properly caught. I didn't test any errors that
require deep knowledge of other modules.

The Xeon CPU is pretty old and thus each core is quite a bit slower than
the i3.

Running with n_processes = 1 seems to be a little bit slower, especially
when there's a lot of source code. I suspect that it's due to the fact
that the multiprocessing module is imported and perhaps the overhead of
creating a process. That can be mitigated by keeping the old code path
and only using multiprocessing when n_processes >= 2.

Overall I think this is pretty positive for my use case, the total time
of running the validation steps went down from 7.5 minutes to 5.5
minutes or so. The other parts of the validation can also be split out
over different processes, but that's for another time.

[1]: http://www.stoq.com.br/

--
Johan Dahlin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pylint-multiprocessing.diff
Type: application/octet-stream
Size: 5977 bytes
Desc: not available
URL:

From sylvain.thenault at logilab.fr Thu Sep 19 11:14:27 2013
From: sylvain.thenault at logilab.fr (Sylvain Thénault)
Date: Thu, 19 Sep 2013 11:14:27 +0200
Subject: [code-quality] Pylint and multiprocessing
In-Reply-To:
References:
Message-ID: <20130919091426.GE2536@logilab.fr>

On 18 septembre 23:17, Johan Dahlin wrote:
> Hi,

Hi Johan,

> I'm Johan and I'm working on a python based ERP for the brazilian
> market[1], it's written in Python. As part of our software development
> process we run pyflakes/pep8/pylint/unittests/coverage validation _before_
> each commit can be integrated into the mainline.
>
> The main bottleneck of this validation turns out to be pylint and I started
> to investigate on how to use multiprocessing to speed that up.

Indeed, speed is definitely not Pylint's main quality :) Using
multiprocessing sounds like a neat idea though.

[snip]
> Overall I think this is pretty positive for my use case, the total time of
> running the validation steps went down from 7.5 minutes to 5.5 minutes or
> so. The other parts of the validation can also be split out over different
> processes, but that's for another time.

Would you submit a pull request on bitbucket [1] so we can discuss your
patch there? I'm definitely interested in having such a feature, so I'll
help you get it into a shape that can be integrated (no time guarantee
though ;)

[1] https://bitbucket.org/logilab/pylint

--
Sylvain Thénault, LOGILAB, Paris (01.45.32.03.12) - Toulouse (05.62.17.16.42)
Formations Python, Debian, Méth.
Agiles: http://www.logilab.fr/formations
Développement logiciel sur mesure: http://www.logilab.fr/services
CubicWeb, the semantic web framework: http://www.cubicweb.org

From jdahlin at async.com.br Thu Sep 19 13:04:05 2013
From: jdahlin at async.com.br (Johan Dahlin)
Date: Thu, 19 Sep 2013 08:04:05 -0300
Subject: [code-quality] Pylint and multiprocessing
In-Reply-To: <20130919091426.GE2536@logilab.fr>
References: <20130919091426.GE2536@logilab.fr>
Message-ID:

On Thu, Sep 19, 2013 at 6:14 AM, Sylvain Thénault <sylvain.thenault at logilab.fr> wrote:

> On 18 septembre 23:17, Johan Dahlin wrote:
> > Hi,
>
> Hi Johan,
>
> > I'm Johan and I'm working on a python based ERP for the brazilian
> > market[1], it's written in Python. As part of our software development
> > process we run pyflakes/pep8/pylint/unittests/coverage validation _before_
> > each commit can be integrated into the mainline.
> >
> > The main bottleneck of this validation turns out to be pylint and I started
> > to investigate on how to use multiprocessing to speed that up.
>
> Indeed the pylint quality is definitly not speed :) Using multiprocessing
> sounds like a neat idea though.
>
> [snip]
>
> > Overall I think this is pretty positive for my use case, the total time of
> > running the validation steps went down from 7.5 minutes to 5.5 minutes or
> > so. The other parts of the validation can also be split out over different
> > processes, but that's for another time.
>
> Would you submit a pull-request on bitbucket [1] so we may discuss about your
> patch there? I'm definitly interested in having such feature, so I'll help you
> having it in a shape that may be integrated (no time guarantee though ;)
>

Sure, I had to create an account etc, hope I did everything right, the
pull request can be found here:

https://bitbucket.org/logilab/pylint/pull-request/55/add-multiprocessing-support/diff

--
Johan Dahlin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From skip at pobox.com Mon Sep 23 16:07:27 2013
From: skip at pobox.com (Skip Montanaro)
Date: Mon, 23 Sep 2013 09:07:27 -0500
Subject: [code-quality] __len__ but no __getitem__
Message-ID:

Having demonstrated my confusion about early breaks from loops, I will
proceed to demonstrate my confusion about containers.

I have a queue-like class in which I implement __len__ but not
__getitem__. Pylint complains:

timeddata.py:79: [R0924(incomplete-protocol), TimedDataQueue] Badly
implemented Container, implements __len__ but not __getitem__

I can see where that would be a valid complaint for an array-like
container, but not all containers should support indexing even if they
have a measurable length. Python sets are one example:

>>> s = set(range(10))
>>> len(s)
10
>>> s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object does not support indexing

Queue-like containers seem similar. While I would allow access to the
element at one end or the other (depending on whether I want queue-like
or stack-like behavior), I think it would violate the definition of
those types to allow indexing.

Should pylint really be this strict? Or am I expected to implement
everything necessary for an array-like container and just raise
exceptions in those methods the user really shouldn't access?

Thanks,

Skip

From sylvain.thenault at logilab.fr Mon Sep 23 16:26:49 2013
From: sylvain.thenault at logilab.fr (Sylvain Thénault)
Date: Mon, 23 Sep 2013 16:26:49 +0200
Subject: [code-quality] __len__ but no __getitem__
In-Reply-To:
References:
Message-ID: <20130923142649.GK2650@logilab.fr>

Hi Skip,

On 23 septembre 09:07, Skip Montanaro wrote:
> Having demonstrated my confusion about early breaks from loops, I will
> proceed to demonstrate my confusion about containers.

:)

> I have a queue-like class in which I implement __len__ but not
> __getitem__.
> Pylint complains:
>
> timeddata.py:79: [R0924(incomplete-protocol), TimedDataQueue] Badly
> implemented Container, implements __len__ but not __getitem__
>
> I can see where that would be a valid complaint for an array-like
> container, but not all containers should support indexing even if they
> have a measurable length. Python sets are one example:
>
> >>> s = set(range(10))
> >>> len(s)
> 10
> >>> s[0]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'set' object does not support indexing
>
> Queue-like containers seem similar. While I would allow access to the
> element at one end or the other (depending if I want queue-like or
> stack-like behavior), I think it would violate the definition of those
> types to allow indexing.
>
> Should pylint really be this strict? Or am I expected to implement
> everything necessary for an array-like containiner and just raise
> exceptions in those methods the user really shouldn't access?

No, I'd say you're right. While it sounded like a good idea when
proposed, you're not the first one to be confused by this message, so I
tend to think this check should be either removed or kept only for
well-defined, all-or-nothing protocols (the only one that comes to mind
being the context manager's __enter__ / __exit__, but there may be
others). I would be glad to have others' opinions though.

--
Sylvain Thénault, LOGILAB, Paris (01.45.32.03.12) - Toulouse (05.62.17.16.42)
Formations Python, Debian, Méth. Agiles: http://www.logilab.fr/formations
Développement logiciel sur mesure: http://www.logilab.fr/services
CubicWeb, the semantic web framework: http://www.cubicweb.org

From skip at pobox.com Mon Sep 23 23:14:27 2013
From: skip at pobox.com (Skip Montanaro)
Date: Mon, 23 Sep 2013 16:14:27 -0500
Subject: [code-quality] __len__ but no __getitem__
In-Reply-To: <20130923142649.GK2650@logilab.fr>
References: <20130923142649.GK2650@logilab.fr>
Message-ID:

>> Should pylint really be this strict?
>> Or am I expected to implement everything necessary for an array-like
>> containiner and just raise exceptions in those methods the user really
>> shouldn't access?
>
> No I'ld say you're right. While it sounded a good idea when proposed, you're not
> the first one being confused by this message, so I tend to think this check
> should be either removed or kept for well defined and all-or-nothing protocols
> (the only one coming to my mind being the context manager __enter__ / __exit__,
> but there may be others). I would be glad to have others'opinion though.

It occurs to me that the reverse case is likely still correct. That
is, if __getitem__ is defined, omitting __len__ should happen only
rarely, and probably require an explicit suppression somewhere,
probably in the class definition.

Skip

From fuzzyman at gmail.com Mon Sep 23 23:18:39 2013
From: fuzzyman at gmail.com (Michael Foord)
Date: Mon, 23 Sep 2013 22:18:39 +0100
Subject: [code-quality] __len__ but no __getitem__
In-Reply-To:
References: <20130923142649.GK2650@logilab.fr>
Message-ID:

On 23 September 2013 22:14, Skip Montanaro wrote:

> >> Should pylint really be this strict? Or am I expected to implement
> >> everything necessary for an array-like containiner and just raise
> >> exceptions in those methods the user really shouldn't access?
> >
> > No I'ld say you're right. While it sounded a good idea when proposed,
> > you're not the first one being confused by this message, so I tend to
> > think this check should be either removed or kept for well defined and
> > all-or-nothing protocols (the only one coming to my mind being the
> > context manager __enter__ / __exit__, but there may be others). I would
> > be glad to have others'opinion though.
>
> It occurs to me that the reverse case is likely still correct. That
> is, if __getitem__ is defined, omitting __len__ should happen only
> rarely, and probably require an explicit suppression somewhere,
> probably in the class definition.
>

Several times I've implemented classes with dynamic behaviour in
__getitem__, so they have no strict length (beyond "theoretically
infinite").

Michael

>
> Skip
>

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From skip at pobox.com Mon Sep 23 23:30:20 2013
From: skip at pobox.com (Skip Montanaro)
Date: Mon, 23 Sep 2013 16:30:20 -0500
Subject: [code-quality] __len__ but no __getitem__
In-Reply-To:
References: <20130923142649.GK2650@logilab.fr>
Message-ID:

> Several times I've implemented classes with dynamic behaviour in
> __getitem__, so they have no strict length (beyond "theoretically
> infinite").

Understood, and that's a case where I think you should suppress the
warning. I believe the common case is that if you can get a particular
item you can count all the items without an infinite loop or side
effects. Said another way, __getitem__ + __len__ is a much more common
pattern than __getitem__ without __len__.

Skip
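
The two shapes discussed in this thread can be sketched as follows. This is a minimal illustration and not Skip's or Michael's actual code: `TimedDataQueue` borrows only its name from the Pylint message quoted earlier, and its `put`/`get` methods, the `Squares` class, and the `raises_type_error` helper are all assumptions made for the example.

```python
from collections import deque


class TimedDataQueue:
    """Hypothetical queue: has a length, deliberately offers no indexing."""

    def __init__(self):
        self._items = deque()

    def __len__(self):
        return len(self._items)

    def put(self, item):
        self._items.append(item)

    def get(self):
        # Only the element at the head of the queue is accessible.
        return self._items.popleft()


class Squares:
    """Hypothetical dynamic container: any index works, so no __len__."""

    def __getitem__(self, index):
        return index * index


def raises_type_error(func):
    """Return True if calling func raises TypeError, False otherwise."""
    try:
        func()
    except TypeError:
        return True
    return False


queue = TimedDataQueue()
queue.put("first")
queue.put("second")
squares = Squares()
```

With these definitions `len(queue)` works while `queue[0]` raises TypeError, and `squares[12]` works while `len(squares)` raises TypeError, so each class implements one half of the pair on purpose.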