From fijall at gmail.com  Sun Jan  1 14:50:13 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 1 Jan 2012 15:50:13 +0200
Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany?
In-Reply-To: <4EFF29FF.7040009@python-academy.de>
References: <4EFF29FF.7040009@python-academy.de>
Message-ID:

On Sat, Dec 31, 2011 at 5:27 PM, Mike Müller wrote:

> Hi,
>
> I am just wondering if anybody is interested in sprinting
> on PyPy and in particular NumPyPy in Leipzig sometime in 2012.
>
> I can offer working space for up to 12 people with Wi-Fi as well
> as some basic catering (hot and cold drinks, snacks, pizza etc.)
> for a few days (up to a week).
>
> Accommodation in Leipzig is very reasonably priced. For example,
> there is a decent hotel very close to the venue for 35 Euros/night
> (single) or 50 Euros/night (double) including breakfast.
>
> Being a logistics center, Leipzig is easy to travel to by
> car, train or airplane, including budget airlines.
>
> I would also act as co-sponsor (with the resources stated above) for
> an application for sprint funds (http://pythonsprints.com/cfa/ or
> other sources) that could cover (parts of) the traveling and
> accommodation expenses.
>
> Let me know what you think about it. I am open to ideas.
>
> Mike

That sounds like a really cool idea. Thanks Mike for thinking about us :)

What do others think?

Cheers,
fijal

From fijall at gmail.com  Sun Jan  1 14:52:47 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 1 Jan 2012 15:52:47 +0200
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To: <20111231165855.GB32360@phase.tratt.net>
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
Message-ID:

On Sat, Dec 31, 2011 at 6:58 PM, Laurence Tratt wrote:

> On Sat, Dec 31, 2011 at 05:45:35PM +0100, Armin Rigo wrote:
>
> Hi Armin,
>
>>>   func main():
>>>     i := 0
>>>     while i < 100000:
>>>       i += 1
>>
>> A quick update: on this program, with 100 times the number of
>> iterations, "converge-opt3" runs in 2.6 seconds on my laptop, and
>> "converge-jit" runs in less than 0.7 seconds.  That's already a 4x
>> speed-up :-)  I think that you simply underestimated the warm-up times.
>
> In fairness, I did know that that program benefits from the JIT at least
> somewhat :) I was wondering if there are other micro-benchmarks that the
> PyPy folk found particularly illuminating / surprising when optimising
> PyPy.
>
> There's also something else that's weird. Try "time make regress" with
> --opt=3 and --opt=jit. The latter is often twice as slow as the former.
> I have no useful intuitions as to why at the moment.
>
> Yours,
>
> Laurie

The most coarse-grained test would be PYPYLOG=jit-summary:- <the commands
you want to run>; that should provide you some feedback on tracing and
other warmup-related times, as well as some basic stats.

Note that pypy-jit is almost never slower than pypy-no-jit, but it's
certainly slower than CPython for running tests (most of the time).
Tests are a particular case of jit-unfriendly code, because ideally they
execute each piece of code once.
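For example, a typical invocation looks like this (a sketch only -- the
binary and program names here are hypothetical; the "-" sends the summary
to stderr, while a file name in its place would log to that file):

import os, subprocess

# enable the jit-summary log category for the translated VM being run
env = dict(os.environ, PYPYLOG="jit-summary:-")
subprocess.call(["./converge-jit", "myprog.cv"], env=env)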
Cheers,
fijal

From laurie at tratt.net  Sun Jan  1 16:09:51 2012
From: laurie at tratt.net (Laurence Tratt)
Date: Sun, 1 Jan 2012 15:09:51 +0000
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To:
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
Message-ID: <20120101150951.GB6225@overdrive.tratt.net>

On Sun, Jan 01, 2012 at 03:52:47PM +0200, Maciej Fijalkowski wrote:

> The most coarse-grained test would be PYPYLOG=jit-summary:- <the
> commands you want to run>; that should provide you some feedback on
> tracing and other warmup-related times, as well as some basic stats.

I haven't noticed this particular output type before... but I'm not
really sure what most of the numbers actually mean! Does anyone have any
pointers?

Laurie
--
Personal                                  http://tratt.net/laurie/
The Converge programming language         http://convergepl.org/
https://github.com/ltratt                 http://twitter.com/laurencetratt

From fijall at gmail.com  Sun Jan  1 17:47:08 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 1 Jan 2012 18:47:08 +0200
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To: <20120101150951.GB6225@overdrive.tratt.net>
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
	<20120101150951.GB6225@overdrive.tratt.net>
Message-ID:

On Sun, Jan 1, 2012 at 5:09 PM, Laurence Tratt wrote:

> On Sun, Jan 01, 2012 at 03:52:47PM +0200, Maciej Fijalkowski wrote:
>
>> The most coarse-grained test would be PYPYLOG=jit-summary:- <the
>> commands you want to run>; that should provide you some feedback on
>> tracing and other warmup-related times, as well as some basic stats.
>
> I haven't noticed this particular output type before... but I'm not
> really sure what most of the numbers actually mean! Does anyone have
> any pointers?

Can you paste them somewhere? I'll explain.

From laurie at tratt.net  Sun Jan  1 17:57:57 2012
From: laurie at tratt.net (Laurence Tratt)
Date: Sun, 1 Jan 2012 16:57:57 +0000
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To:
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
	<20120101150951.GB6225@overdrive.tratt.net>
Message-ID: <20120101165757.GD6225@overdrive.tratt.net>

On Sun, Jan 01, 2012 at 06:47:08PM +0200, Maciej Fijalkowski wrote:

>> I haven't noticed this particular output type before... but I'm not
>> really sure what most of the numbers actually mean! Does anyone have
>> any pointers?
> Can you paste them somewhere? I'll explain.

Thanks!
Here's an example (hopefully a decent one!):

  http://pastebin.com/51QpPD7C

Laurie
--
Personal                                  http://tratt.net/laurie/
The Converge programming language         http://convergepl.org/
https://github.com/ltratt                 http://twitter.com/laurencetratt

From fijall at gmail.com  Sun Jan  1 18:02:14 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 1 Jan 2012 19:02:14 +0200
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To: <20120101165757.GD6225@overdrive.tratt.net>
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
	<20120101150951.GB6225@overdrive.tratt.net>
	<20120101165757.GD6225@overdrive.tratt.net>
Message-ID:

On Sun, Jan 1, 2012 at 6:57 PM, Laurence Tratt wrote:

> On Sun, Jan 01, 2012 at 06:47:08PM +0200, Maciej Fijalkowski wrote:
>
>>> I haven't noticed this particular output type before... but I'm not
>>> really sure what most of the numbers actually mean! Does anyone have
>>> any pointers?
>> Can you paste them somewhere? I'll explain.
>
> Thanks! Here's an example (hopefully a decent one!):
>
>   http://pastebin.com/51QpPD7C

So, for example, here tracing has taken 0.8s out of 2.5s total (a lot),
plus 0.2s for the backend. Also, there were 16 loops aborted because the
trace ran on for too long. That might mean a lot of things. If you can
post some example I can probably look at traces.

For example - is the loop iteration from above a good one?

Cheers,
fijal

From laurie at tratt.net  Sun Jan  1 18:13:10 2012
From: laurie at tratt.net (Laurence Tratt)
Date: Sun, 1 Jan 2012 17:13:10 +0000
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To:
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
	<20120101150951.GB6225@overdrive.tratt.net>
	<20120101165757.GD6225@overdrive.tratt.net>
Message-ID: <20120101171310.GE6225@overdrive.tratt.net>

On Sun, Jan 01, 2012 at 07:02:14PM +0200, Maciej Fijalkowski wrote:

>> Here's an example (hopefully a decent one!):
>>
>>   http://pastebin.com/51QpPD7C
> So, for example, here tracing has taken 0.8s out of 2.5s total (a lot),
> plus 0.2s for the backend. If you can post some example I can probably
> look at traces.
>
> For example - is the loop iteration from above a good one?

I don't know if it's a good one or not :) This is a real program
executing under the Converge JIT - the Converge compiler. If you have a
checkout of the Converge VM and have built the VM with a JIT, you can
recreate this by doing:

  $ cd compiler
  $ PYPYLOG=jit-summary:stats ../vm/converge convergec -o Compiler/Code_Gen.cvb Compiler/Code_Gen.cv

> Also, there were 16 loops aborted because the trace ran on for too
> long. That might mean a lot of things.

I'm going to assume this is because (at least as I've done things so far)
the VM is effectively inlining all calls to RPython-level functions (in
other words, if the bytecode calls a builtin function, the latter is
inlined).
It's not clear to me whether this is entirely desirable - sometimes it
might be sensible, but often not. How does PyPy handle this? Does it have
a blanket "don't look inside builtin functions" rule, for example?

Laurie
--
Personal                                  http://tratt.net/laurie/
The Converge programming language         http://convergepl.org/
https://github.com/ltratt                 http://twitter.com/laurencetratt

From alex.gaynor at gmail.com  Sun Jan  1 20:27:08 2012
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Sun, 1 Jan 2012 13:27:08 -0600
Subject: [pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
In-Reply-To: <20120101171310.GE6225@overdrive.tratt.net>
References: <20111231100304.GA22828@phase.tratt.net>
	<20111231152626.GA32360@phase.tratt.net>
	<20111231165855.GB32360@phase.tratt.net>
	<20120101150951.GB6225@overdrive.tratt.net>
	<20120101165757.GD6225@overdrive.tratt.net>
	<20120101171310.GE6225@overdrive.tratt.net>
Message-ID:

On Sun, Jan 1, 2012 at 11:13 AM, Laurence Tratt wrote:

> [...]
> I'm going to assume this is because (at least as I've done things so
> far) the VM is effectively inlining all calls to RPython-level
> functions (in other words, if the bytecode calls a builtin function,
> the latter is inlined). It's not clear to me whether this is entirely
> desirable - sometimes it might be sensible, but often not. How does
> PyPy handle this? Does it have a blanket "don't look inside builtin
> functions" rule, for example?

Take a look at pypy/module/pypyjit/policy.py; it shows the JITPolicy that
decides which functions can be inlined.
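Roughly, the shape of such a policy is something like this (a from-memory
sketch of the general pattern, not the literal file contents):

from pypy.jit.codewriter.policy import JitPolicy

class PyPyJitPolicy(JitPolicy):
    def look_inside_function(self, func):
        # per-module decision: may the tracer inline this function?
        mod = func.__module__ or '?'
        if mod.startswith('pypy.module.pypyjit.'):
            return False      # never trace the JIT's own plumbing
        # ... more per-module decisions here ...
        return True           # default: look inside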
Alex

--
"I disapprove of what you say, but I will defend to the death your right
to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero

From andrewfr_ice at yahoo.com  Tue Jan  3 19:45:20 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Tue, 3 Jan 2012 10:45:20 -0800 (PST)
Subject: [pypy-dev] Problems Installing STM
Message-ID: <1325616320.5710.YahooMailNeo@web120703.mail.ne1.yahoo.com>

Hi Folks:

I want to experiment with the STM module. I read the "Comprehensive
Strategy for Contention Management in STM" paper and compiled the rstm
module. I figured learning to write C/C++ programmes with the module
would be a good way to get a feel for STM.

On the PyPy side: I have downloaded pypy from the repository, overlaid it
with a pre-compiled build (I don't have a big enough machine to compile
from source), and use virtualenv, altering PYTHONPATH to point to the top
of the pypy directory.

When I execute

  python test_rstm.py

the programme starts to compile but ends with no errors. I decided to
test some other programmes to see if my environment is functioning
correctly, so I decided to run bpnn:

  ../pypy/translator/goal/translate.py --stm bpnn.py

(I added the --stm because the translator complained.)

[translation:ERROR]    File "/home/andrew/pypy-stm/pypy/translator/c/funcgen.py", line 697, in OP_CAST_PTR_TO_ADR
[translation:ERROR]      "in %r" % (self.graph,))
[translation:ERROR]  AssertionError: cast_ptr_to_adr(gcref) is a bad idea with STM.  Consider checking config.stm in
[translation] start debugger...
> /home/andrew/pypy-stm/pypy/translator/c/funcgen.py(697)OP_CAST_PTR_TO_ADR()
-> "in %r" % (self.graph,))
(Pdb+)

What am I doing wrong? How should I be going about this?

Cheers,
Andrew

From coolbutuseless at gmail.com  Wed Jan  4 22:02:58 2012
From: coolbutuseless at gmail.com (mike c)
Date: Thu, 5 Jan 2012 07:02:58 +1000
Subject: [pypy-dev] micronumpy 'fromnumeric' patch
Message-ID:

Hi All,

Numpy is somewhat derived from another array library called 'Numeric'.
What were originally functions in 'Numeric' have been promoted to ndarray
methods in numpy. The implementation of this code in numpy is in
numpy/core/fromnumeric.py. The list of functions that were promoted is
given at the end of this email.

However, these functions must be retained for compatibility; all they
should do is call the corresponding ndarray method, e.g. "numpy.ravel(a)"
is just a wrapper which returns "a.ravel()".

I have:
* copied all docstrings from numpy 1.6's numpy/core/fromnumeric.py into
  micronumpy/app_numpy.py
* for methods which we currently have (e.g. max, argmax):
    * made the function just call the relevant method and return the value
    * written a test_fromnumeric_* test using the docstring examples
* for methods which we don't have, raised a NotImplementedError
  (see the sketch of both patterns below)
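To illustrate, the two patterns look roughly like this (hypothetical code
sketching the shape of the patch, not its literal contents):

def ravel(a, order='C'):
    """Return a flattened array.  (Docstring copied from numpy 1.6.)"""
    # implemented: delegate straight to the ndarray method
    return a.ravel()

def searchsorted(a, v, side='left'):
    # no underlying ndarray method in micronumpy yet
    raise NotImplementedError("searchsorted is not implemented yet")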
I realise that this is a lot of boring boilerplate code & docstrings, but
I think it's to everyone's advantage to flesh out the micronumpy
implementation a bit more so that we can see what has/hasn't been done.
This will also be useful to newcomers - like me - who can now find
obvious functions that need implementing.

Could someone please review, fix if necessary and commit. Thanks.

Mike

======================
# __all__ = ['take', 'reshape', 'choose', 'repeat', 'put',
#            'swapaxes', 'transpose', 'sort', 'argsort', 'argmax',
#            'argmin', 'searchsorted', 'alen',
#            'resize', 'diagonal', 'trace', 'ravel', 'nonzero', 'shape',
#            'compress', 'clip', 'sum', 'product', 'prod', 'sometrue',
#            'alltrue', 'any', 'all', 'cumsum', 'cumproduct', 'cumprod',
#            'ptp', 'ndim', 'rank', 'size', 'around', 'round_', 'mean',
#            'std', 'var', 'squeeze', 'amax', 'amin',
#           ]

From arigo at tunes.org  Thu Jan  5 00:30:34 2012
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 5 Jan 2012 00:30:34 +0100
Subject: [pypy-dev] STM
Message-ID:

Hi all,

(Andrew: I'll answer your mail too, but this is independent.)

Sorry if this mail is more general than just pypy-dev, but I'll post it
here at least at first. When thinking more about STM (and also after
reading a paper that cfbolz pointed me to), I would like to argue that
the currently accepted way to add STM to languages is slightly bogus.

So far, the approach is to take an existing language, like Python or C,
and add a keyword like "transaction" that delimits a nested scope. You
end up with syntax like this (example in Python):

def foo():
    before()
    with transaction:
        during1()
        during2()
    after()

In this example, "during1(); during2()" is the content of the
transaction. But the issue with this approach is that there is no way to
structure the transactions differently. What if I want a transaction
that starts somewhere, and ends at some unrelated place?

This is of course related to the way "we" (I, but others before me)
described how transactions could be applied to the CPython interpreter.
When you think about the C source of CPython, there is a precise point at
which the previous transaction should stop and the next one immediately
start (namely, between bytecodes). But there is no way to express this
using the syntax above. What would be needed is something like (example
in C):

void mainloop() {
    while (1) {
        dispatch_next_bytecode();
        call_stuff_between_bytecodes();
    }
}

void call_stuff_between_bytecodes() {
    ...;
    /* here, we would like the previous transaction to stop
       and the next one to start */
    ...;
}

The above issue occurs when trying to apply STM to CPython, but I believe
the issue to be much more general. The solution proposed with the
"transaction" keyword assumes a generally-non-STM program into which we
carefully insert STM here and there. This is reasonable from the point
of view of performance, because running a STM transaction is costly. But
if we ignore this point of performance (for example, by assuming we have
a hybrid HTM-STM-based approach that was carefully tweaked for years,
with the GC and the JIT aware of it), then this argument no longer
applies.

The solution that we really need for CPython requires a different point
of view: a generally-STM program, in which we carefully add here and
there a "yield", i.e. a point at which it's ok to end the current
transaction and start the next one. While this is obvious in the case of
the CPython interpreter, I'd argue that it is a useful point of view in
general. The paper mentioned above says that the current approach to
multithreading is completely non-deterministic, with a large collection
of tools to help us "patch" it here and there to make it more
deterministic --- i.e. multithreaded programs behave randomly, and you
need to carefully use locks, semaphores, and so on, in order to get back
something that behaves as you want, hopefully in all cases. That
approach is, according to this paper, the reason for which programming
with multiple threads is so hard (and I tend to agree). The paper argues
in favour of a would-be system that would be the other way around:
deterministic with explicit non-determinism added.

So, taking this point of view and making it more concrete, I'd like to
argue that what we really need is not a nested "transaction" keyword, in
C or in Python or in any language. That would be just another way to add
some amount of determinism to an inherently non-deterministic approach.
Instead, we need by default all threads to run always in a coarse-grained
transaction; and using a "yield" keyword or function call, we would as
needed make the transactions more fine-grained. (I ignore here the issue
of I/O.)
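(As a concrete sketch of the CPython case, the C main loop above would
become something like the following Python-level pseudo-code;
transaction_boundary() is an assumed primitive --- the stm branch
currently spells it rstm.transaction_boundary() --- that commits the
current transaction and immediately starts the next one:)

def mainloop():
    while True:
        dispatch_next_bytecode()    # runs inside the current transaction
        transaction_boundary()      # the "yield": commit here; the next
                                    # bytecode gets a fresh transaction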
In other words, this would be a step back to the old world of cooperative
multithreading: each thread needs to call "yield" regularly, otherwise it
never allows other threads to run. The difference with the old world of
cooperative multithreading is that nowadays we have STM --- "never allows
other threads to run" has become a figure of speech, a way to reason
about programs *as if* they were really switching between threads only at
the points where we call "yield". What *really* occurs is that each
thread runs a transaction between two yield points, so each thread can
actually run concurrently with the others, and the only issue is that if
the threads actually do conflict, then the performance degrades as some
threads need to repeat their action.

This gives a model where we can write a multithreaded application by
starting with almost no "yield" points, and then, *if performance
requires it*, we can carefully add extra yield points to reduce
conflicts. In other words, it is a model in which we start writing
multithreaded applications using just plain old (rather deterministic)
cooperative multithreading, and then, as needs dictate, we insert some
extra points of non-determinism in order to boost performance.

Following that reasoning, I no longer want to go for a PyPy version
exposing a "with transaction" context manager. Instead, it would rather
be something like a function to call at any point to "yield" control ---
or, in the idea that it's better to avoid having yields at random
unexpected points, maybe an alternative solution that looks more
promising: to start a new thread, the programmer calls a function
"start_transactional_thread(g)" where g is a generator; when g yields, it
marks the end of a transaction and the start of the next one. (The exact
yielded value should be None.) The difference with the previous solution
is that you cannot yield from a random function; your function has to
have some structure, namely be a generator, in order to use this yield.

Also, it turns out that this model is easy to write a non-STM
implementation for, for debugging and testing:

import random

_running = []

def start_transactional_thread(g):
    _running.append(g)

def run():
    "Call this in the main thread to run and wait for all started sub-threads"
    while _running:
        i = random.randrange(len(_running))
        g = _running.pop(i)
        try:
            g.next()
            _running.append(g)
        except StopIteration:
            pass
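(A toy use of this interface might look as follows; do_step() is a
hypothetical stand-in for whatever real work each transaction performs:)

def worker(name, n):
    for i in range(n):
        do_step(name, i)    # executes inside the current transaction
        yield               # transaction ends here; the next one starts

start_transactional_thread(worker("left", 1000))
start_transactional_thread(worker("right", 1000))
run()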
(Interestingly, this run() function is similar to stackless.run(). The
exact relationship between stackless and this model is yet to be
investigated, but it seems that one point of view on what I'm proposing
might be essentially "use stackless, remove some determinism from the
already-unclear tasklet switching order, and add STM under the hood".)

A bientôt,

Armin.

From coolbutuseless at gmail.com  Wed Jan  4 22:04:34 2012
From: coolbutuseless at gmail.com (mike c)
Date: Thu, 5 Jan 2012 07:04:34 +1000
Subject: [pypy-dev] micronumpy 'fromnumeric' patch
In-Reply-To:
References:
Message-ID:

Ooops. Patch attached to this email.

mikefc

-------------- next part --------------
A non-text attachment was scrubbed...
Name: numpypy_fromnumeric.patch
Type: application/octet-stream
Size: 86695 bytes
Desc: not available

From arigo at tunes.org  Thu Jan  5 11:38:35 2012
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 5 Jan 2012 11:38:35 +0100
Subject: [pypy-dev] STM
In-Reply-To:
References:
Message-ID:

Re-hi,

Ah, I realized something else. When considering solutions for CPython,
if we go for the one-transaction-per-bytecode approach, like the approach
taken in the 2 papers so far about CPython+TM, then we need
non-lexically-nested transactions in the C language, i.e. more than what
GCC 4.7 offers. But if instead we go, as I propose, for the
coarse-grained-that-can-be-refined transaction approach, then it's not
true: in the C language we only need to implement an equivalent to the
run() function I pasted, and this function can use a lexically-nested
transaction keyword in C. So I guess my next goal suddenly shifted back
to doing more experiments with CPython and GCC 4.7 :-)

A bientôt,

Armin.

From arigo at tunes.org  Thu Jan  5 11:46:16 2012
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 5 Jan 2012 11:46:16 +0100
Subject: [pypy-dev] Problems Installing STM
In-Reply-To: <1325616320.5710.YahooMailNeo@web120703.mail.ne1.yahoo.com>
References: <1325616320.5710.YahooMailNeo@web120703.mail.ne1.yahoo.com>
Message-ID:

Hi Andrew,

On Tue, Jan 3, 2012 at 19:45, Andrew Francis wrote:

> I have downloaded pypy from the repository, overlaid it with a
> pre-compiled build (I don't have a big enough machine to compile from
> source), and use virtualenv, altering PYTHONPATH to point to the top of
> the pypy directory.

You don't need a PyPy at all, pre-compiled or not. You don't need
virtualenv either.

> python test_rstm.py
> the programme starts to compile but ends with no errors.

As documented at various places, you run the tests by saying:

  python ../../../test_all.py test_rstm.py

typically with "python" being the CPython interpreter.

> I decided to test some other programmes to see if my environment is
> functioning correctly. I decided to run bpnn
>
> ../pypy/translator/goal/translate.py --stm bpnn.py
>
> (I added the --stm because the translator complained)

That's expected: the branch adds stm-only tweaks and doesn't try to
integrate non-stm so far. But that's also a pointless example, because
bpnn.py doesn't use multiple threads. The only fully-compiled
multithreaded demo using stm is in
pypy/translator/stm/test/targetdemo.py, which you can use instead of
bpnn.py in the line above.

>> /home/andrew/pypy-stm/pypy/translator/c/funcgen.py(697)OP_CAST_PTR_TO_ADR()
> -> "in %r" % (self.graph,))
> (Pdb+)

You are getting this because the support is only good enough to run
targetdemo.py.

A bientôt,

Armin.

From dje.gcc at gmail.com  Thu Jan  5 17:10:31 2012
From: dje.gcc at gmail.com (David Edelsohn)
Date: Thu, 5 Jan 2012 11:10:31 -0500
Subject: [pypy-dev] STM
In-Reply-To:
References:
Message-ID:

On Thu, Jan 5, 2012 at 5:38 AM, Armin Rigo wrote:

> Ah, I realized something else. When considering solutions for CPython,
> if we go for the one-transaction-per-bytecode approach, like the
> approach taken in the 2 papers so far about CPython+TM, then we need
> non-lexically-nested transactions in the C language, i.e. more than
> what GCC 4.7 offers. But if instead we go, as I propose, for the
> coarse-grained-that-can-be-refined transaction approach, then it's not
> true: in the C language we only need to implement an equivalent to the
> run() function I pasted, and this function can use a lexically-nested
> transaction keyword in C. So I guess my next goal suddenly shifted
> back to doing more experiments with CPython and GCC 4.7 :-)

Hi, Armin

Yes, transactional memory in programming languages has focused on lexical
scoping.
But hardware implementations of transactional memory have no concept of
lexical scoping and implement something closer to your requirements:

  START TRANSACTION
  COMMIT TRANSACTION

usually with additional facilities for suspending and aborting a
transaction.

A threaded interpreter or JIT is lower-level than the idealized target
envisioned by the programming-language designers who add high-level
software transactional memory constructs.

I suspect your design would work better with hardware transactional
memory, where systems programming languages will expose the hardware
instructions as builtins.

- David

From randall.leeds at gmail.com  Thu Jan  5 17:14:21 2012
From: randall.leeds at gmail.com (Randall Leeds)
Date: Thu, 5 Jan 2012 11:14:21 -0500
Subject: [pypy-dev] STM
In-Reply-To:
References:
Message-ID:

On Thu, Jan 5, 2012 at 11:10, David Edelsohn wrote:

> Yes, transactional memory in programming languages has focused on
> lexical scoping. But hardware implementations of transactional memory
> have no concept of lexical scoping and implement something closer to
> your requirements:
>
>   START TRANSACTION
>   COMMIT TRANSACTION
>
> usually with additional facilities for suspending and aborting a
> transaction.
> [...]
> I suspect your design would work better with hardware transactional
> memory, where systems programming languages will expose the hardware
> instructions as builtins.

Though I've not seen a hardware implementation that allows overlapping
transactions, some do allow nesting. (Admittedly it's been two years now
since I last looked into it.) I'm not sure how that plays into lexical
structure, but intuitively it seems like the with-statement style plays
nicely.

-Randall

From tbaldridge at gmail.com  Thu Jan  5 17:25:04 2012
From: tbaldridge at gmail.com (Timothy Baldridge)
Date: Thu, 5 Jan 2012 10:25:04 -0600
Subject: [pypy-dev] STM
In-Reply-To:
References:
Message-ID:

> I'm not sure how that plays into lexical structure, but intuitively it
> seems like the with-statement style plays nicely.

The lexical scoping method also helps with the general sanity of the
programmer. The idea behind STM is that certain failed transactions need
to be restartable: if the transaction fails, the code needs to be re-run
until it succeeds. Without lexical scoping, we can do some really
bizarre stuff like this:

def iterator():
    for x in range(100):
        yield x
        run_transactions()

for x in iterator():
    run_transactions()
    yield x

Here things get super messy, but I guess if we apply the same methodology
that the JIT uses to locate loops, it shouldn't be too bad.

The biggest thing is that, in essence, all IO needs to be run only once,
and only when the transaction succeeds. This is why Clojure allows
functions to be marked with "io" tags that will throw exceptions if
executed inside a transaction.
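(In Python terms, such an io marker could look something like this --- a
rough sketch with an assumed stm.in_transaction() predicate:)

def io(func):
    # a transaction body may be re-executed on conflict, so refuse to
    # perform irrevocable side effects inside one
    def wrapper(*args, **kwargs):
        if stm.in_transaction():
            raise RuntimeError("I/O inside a transaction may run twice")
        return func(*args, **kwargs)
    return wrapper

@io
def send_packet(sock, data):
    sock.sendall(data)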
But where it gets really fun is when you start looking at all this and
say, "you can only run pure Python code inside a STM... no ctypes, no io,
no nothing". Explaining that to users will be a bit of fun when the
entire program is STM to start with.

That being said, I'm really excited about this, and can't wait to start
playing with it in my projects.

Timothy

From andrewfr_ice at yahoo.com  Thu Jan  5 20:34:20 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Thu, 5 Jan 2012 11:34:20 -0800 (PST)
Subject: [pypy-dev] Problems Installing STM
In-Reply-To:
References: <1325616320.5710.YahooMailNeo@web120703.mail.ne1.yahoo.com>
Message-ID: <1325792060.58765.YahooMailNeo@web120702.mail.ne1.yahoo.com>

Hi Armin:

________________________________
From: Armin Rigo
To: Andrew Francis
Cc: "pypy-dev at codespeak.net"
Sent: Thursday, January 5, 2012 5:46 AM
Subject: Re: [pypy-dev] Problems Installing STM

> You don't need a PyPy at all, pre-compiled or not. You don't need
> virtualenv either.

Thanks for clarifying this.

> As documented at various places, you run the tests by saying:
>   python ../../../test_all.py test_rstm.py

Me bad. Ran the tests. All four passed.

> The only fully-compiled multithreaded demo using stm is in
> pypy/translator/stm/test/targetdemo.py, which you can use instead of
> bpnn.py in the line above.
....
> You are getting this because the support is only good enough to run
> targetdemo.py.

Yes, I realise running bpnn is pointless in regards to STM. But it did
alert me to the --stm option. I ran the targetdemo and the error is:

[translation:ERROR]  AssertionError: cast_ptr_to_adr(gcref) is a bad idea with STM.  Consider checking config.stm in

How is this fixed?

As a sidenote: I would like to get to a stage where I can take simple
Haskell STM examples and transcribe them into RPython. The first being
the "Hello World of STM": the deposit/withdrawal programme.

I am looking at the targetdemo programme. The only place that references
rstm is:

    for i in range(LENGTH):
        add_at_end_of_chained_list(glob.anchor, i)
        rstm.transaction_boundary()
    print "thread done"

I don't see rstm.begin_transaction() being called. Is this implicit when
a thread starts?

In general, what sort of feedback are you looking for at this stage?

Cheers,
Andrew

From arigo at tunes.org  Thu Jan  5 21:08:17 2012
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 5 Jan 2012 21:08:17 +0100
Subject: [pypy-dev] Problems Installing STM
In-Reply-To: <1325792060.58765.YahooMailNeo@web120702.mail.ne1.yahoo.com>
References: <1325616320.5710.YahooMailNeo@web120703.mail.ne1.yahoo.com>
	<1325792060.58765.YahooMailNeo@web120702.mail.ne1.yahoo.com>
Message-ID:

Hi Andrew,

On Thu, Jan 5, 2012 at 20:34, Andrew Francis wrote:

> [translation:ERROR]  AssertionError: cast_ptr_to_adr(gcref) is a bad
> idea with STM.  Consider checking config.stm in
> (pypy.rpython.memory.gc.minimark:427)MiniMarkGC.post_setup at 0x98f82d4>

Ah, sorry. You need also to pass the option "--gc=none". All real GCs
are unsupported so far :-/

>     for i in range(LENGTH):
>         add_at_end_of_chained_list(glob.anchor, i)
>         rstm.transaction_boundary()
>     print "thread done"
>
> I don't see rstm.begin_transaction() being called. Is this implicit
> when a thread starts?

Yes.

> In general, what sort of feedback are you looking for at this stage?

It's unclear. At this stage I'm looking for how we would really like the
final STM-over-PyPy to look, as this question is not clear at all. See
my other mails about this topic...

A bientôt,

Armin.

From william.leslie.ttg at gmail.com  Fri Jan  6 06:52:32 2012
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Fri, 6 Jan 2012 16:52:32 +1100
Subject: [pypy-dev] STM
In-Reply-To:
References:
Message-ID:

On 5 January 2012 10:30, Armin Rigo wrote:

> Hi all,
>
> (Andrew: I'll answer your mail too, but this is independent.)
>
> Sorry if this mail is more general than just pypy-dev, but I'll post it
> here at least at first. When thinking more about STM (and also after
> reading a paper that cfbolz pointed me to), I would like to argue that
> the currently accepted way to add STM to languages is slightly bogus.
>
> So far, the approach is to take an existing language, like Python or C,
> and add a keyword like "transaction" that delimits a nested scope. You
> end up with syntax like this (example in Python):
>
> def foo():
>     before()
>     with transaction:
>         during1()
>         during2()
>     after()
>
> In this example, "during1(); during2()" is the content of the
> transaction. But the issue with this approach is that there is no way
> to structure the transactions differently. What if I want a
> transaction that starts somewhere, and ends at some unrelated place?
Nevertheless, you can implement this generator model using a context manager, if you don't care about protecting the loop header and creation of the context manager (and I don't see why you would): def start_transactional_thread(g): stm.start_thread(_transactional_thread, g) def _transactional_thread(g): iterator = g() while True: with stm.transaction: try: iterator.next() except StopIteration: return The other case was a function to call to commit the transaction (and start a new one?). I would like to think that you shouldn't be able to commit a transaction that you don't know about (following capability discipline), and that concept is easier to represent as a method of some transaction object rather than a function call. This approach is strictly more general than the generator concept and the one that makes the most sense to me. It also more easily extends to distributed transaction management &c. I suspect that *even if you don't allow nesting of transactions*, this model will suit you better. Consider what happens when a JITted loop (which has its own transaction) makes a residual call. If you have the ability to pass around the transaction object you want to talk about, you can commit the existing transaction and create a new one. When you return into the loop, you can create a new transaction and store that somewhere, this transaction becoming the current transaction for the remainder of the loop. The reason I bring this up is that even though you implement transaction handling with its own special llop, you'd never sensibly model this with a generator. If you were limited to generators, you'd not be able to implement this in the llinterp, or the blackhole interpreter either without sufficient magic. -- William Leslie From arigo at tunes.org Fri Jan 6 15:23:32 2012 From: arigo at tunes.org (Armin Rigo) Date: Fri, 6 Jan 2012 15:23:32 +0100 Subject: [pypy-dev] STM In-Reply-To: References: Message-ID: Hi William, On Fri, Jan 6, 2012 at 06:52, William ML Leslie wrote: > This is the way it has been described, and how most common usages will > probably look. ?But I don't think there has ever been any suggestion > that dynamic extent is the scope at which transactions *should* be > implemented, any more than context managers are the the > be-all-and-end-all solution for resource management. I agree, but my issue is precisely that (say) gcc 4.7, as far as I can tell, *imposes* the dynamic extent to be a nested block. There is no way to ask gcc to produce code handling the lower-level but more general alternative. It looks a bit like saying that the expected most common case is covered, but the general case is not; and, too bad, we really need the general case to turn CPython's GIL into transactions, so we can't even start experimenting at all. But indeed, I am describing something slightly different in the rest of the mail, and this other approach can be implemented as a nested block at the level of C. So we are saved, even though it seems to be a bit by chance. (But maybe it's not by chance after all: see below.) > The requirement to be a generator is clever, (...) I think the relationship to Stackless can be made clearer: the generator approach is basically the same as found in some event packages. 
If we move to Stackless or greenlets, then "the time between two yields" is replaced with "the time between two switches", but the basic idea remains the same (even though greenlets allows again random functions to switch away, whereas just using generators constrains the programmer to more discipline). Most probably, we can generalize this example: the approach should work with any event-like system, like twisted's or pygame's. The general idea is to have a main loop that calls pending events; assuming that (often enough) there are several independent events waiting to be processed, then they can be processed in parallel, with one transaction each. It may be that this is a good approach: it gives the power of using multiple processors to (even existing) programs that are *not* written to use multiple threads at all, so they are free from all the dangers of multithreading. > The other case was a function to call to commit the transaction (and > start a new one?). ?I would like to think that you shouldn't be able > to commit a transaction that you don't know about (following > capability discipline) That other approach relies on the assumption that "a transaction" is not really the correct point of view. There are cases where you don't clearly have such a transaction as the central concept. The typical example is CPython with a transactional GIL. In this approach, a transaction corresponds to whatever runs between the last time the GIL was acquired and the next time the GIL is released; i.e. between two points in time that are rather unrelated to each other. In this model it doesn't make sense to give too much emphasis on "the transaction" by itself. By opposition you have a clearer pairing between the end of a transaction (release the GIL) and the start of the next one (re-acquire it). Also, in this model it doesn't even make sense to think about nesting transactions. But I'm not saying that this is the perfect model, or even that it makes real sense. It seems to let one CPython interpreter to run internally on multiple threads when the programmers requests multiple threads, so it seems the most straightforward solution; but maybe it is not, simply because "import thread" may not be the "correct" long-term solution for the programmer. There may be a relation between this --- transactions in the interpreter but normal threads for the Python programmer --- and the unusual requirement of non-nested-scope-like transactions in C. > The reason I bring this up is that even though you implement > transaction handling with its own special llop, you'd never sensibly > model this with a generator. ?If you were limited to generators, you'd > not be able to implement this in the llinterp, or the blackhole > interpreter either without sufficient magic. I'm unsure it changes something if we take an approach that doesn't allow nested transactions. The difference seems to be only on whether you have to pass around the transaction as an object, or whether the transaction is in some global thread-local variable. I have to say that I don't really see the point of nested transactions so far, but that may be only because I've taken too much the point of view of "CPython+transactional GIL"; if it's not the correct one after all, I need to learn more :-) A bient?t, Armin. 
From dje.gcc at gmail.com Fri Jan 6 15:49:43 2012 From: dje.gcc at gmail.com (David Edelsohn) Date: Fri, 6 Jan 2012 09:49:43 -0500 Subject: [pypy-dev] STM In-Reply-To: References: Message-ID: On Fri, Jan 6, 2012 at 9:23 AM, Armin Rigo wrote: > I agree, but my issue is precisely that (say) gcc 4.7, as far as I can > tell, *imposes* the dynamic extent to be a nested block. ?There is no > way to ask gcc to produce code handling the lower-level but more > general alternative. ?It looks a bit like saying that the expected > most common case is covered, but the general case is not; and, too > bad, we really need the general case to turn CPython's GIL into > transactions, so we can't even start experimenting at all. Armin, To some extent it is a convenience for the implementation, but STM requires some way for the system to record the transaction that can be committed or re-tried. Hardware or a hypervisor or an operating system can observe and record an arbitrary sequence of operations. A traditional static compiler cannot observe operations beyond the boundary of the translation unit. And without something like whole-program compilation to assert to the compiler that it has seen the entire call graph, it cannot reach closure. Lexical scoping seems necessary for STM in a traditional static compiler without extensively retrofitting a way to record all operations. On the other hand, PyPy's mechanism for generating its IR from the interpreter implementation is a natural fit. - David From celil.kj at gmail.com Fri Jan 6 15:59:57 2012 From: celil.kj at gmail.com (Celil) Date: Fri, 06 Jan 2012 06:59:57 -0800 Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy Message-ID: <4F070C6D.4010801@gmail.com> I have been thinking about the possibility of creating a C interpreter in Python. Is anybody already working on that? With PyPy this would presumably be quite easy to do. The interpreter will load the C code, create an AST (presumably using pyparsing and the EBNF spec of the C-language), and then populate the Flow Object Space with all the C objects, and create a control flow graph of the application logic. This graph will containe low level lltype objects, and can then be directly connected to the RPython flow-graph generated after the RTyper step. This would allow for seamless interoperability between C and PyPy, and would also greatly simplify the task of porting existing CPython extensions such as numpy. Rather than going through the error prone task of translating the whole code base into RPython, one will be able to simply load the exiting C source code and integrate it directly into the RPython flow graph. It will be possible to import *.h and *.c files directly without any compilation, and they will run nearly as fast thanks to PyPy's JIT technology. This would also allow us to do things like running CPython on top of PyPy. Right now it is possible to run PyPy on top of CPython, but the reverse is not. If CPython could be run on top of PyPy by interpreting its C source code that would be truly amazing. Interpreting C code would greatly help CPython developers by freeing them from the task of having to repeatedly compile their code. 
Celil

From armin at steinhoff.de  Fri Jan  6 17:25:04 2012
From: armin at steinhoff.de (Armin Steinhoff)
Date: Fri, 06 Jan 2012 16:25:04 +0000
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F070C6D.4010801@gmail.com>
References: <4F070C6D.4010801@gmail.com>
Message-ID: <4F072060.8020402@steinhoff.de>

Celil,

a C interpreter implemented on top of the PyPy interpreter makes no sense
if you need speed ... IMHO.

A better approach would be to bind the TCC library (libtcc ->
http://bellard.org/tcc/tcc-doc.html#SEC22) to PyPy. This library allows
you to compile on the fly and creates x86 executable code in memory ...
no link actions, just call it with a library call.

Regards

--Armin

Celil wrote:
> I have been thinking about the possibility of creating a C interpreter
> in Python.
>
> Is anybody already working on that? With PyPy this would presumably be
> quite easy to do. The interpreter will load the C code, create an AST
> [...]

From blendmaster1024 at gmail.com  Fri Jan  6 16:23:28 2012
From: blendmaster1024 at gmail.com (lahwran)
Date: Fri, 6 Jan 2012 08:23:28 -0700
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F070C6D.4010801@gmail.com>
References: <4F070C6D.4010801@gmail.com>
Message-ID:

Perhaps take a look at how https://github.com/albertz/PyCParser does it?
I've never touched the project, but it sounds like exactly what you're
describing.

On Fri, Jan 6, 2012 at 7:59 AM, Celil wrote:
> I have been thinking about the possibility of creating a C interpreter
> in Python.
>
> Is anybody already working on that? With PyPy this would presumably be
> quite easy to do. [...]
From arigo at tunes.org  Fri Jan  6 16:52:37 2012
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 6 Jan 2012 16:52:37 +0100
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F070C6D.4010801@gmail.com>
References: <4F070C6D.4010801@gmail.com>
Message-ID:

Hi Celil,

On Fri, Jan 6, 2012 at 15:59, Celil wrote:

> Is anybody already working on that? With PyPy this would presumably be
> quite easy to do. The interpreter will load the C code, create an AST
> (presumably using pyparsing and the EBNF spec of the C language), and
> then populate the Flow Object Space with all the C objects, and create
> a control flow graph of the application logic.

I think you are confusing several levels. I don't see what you would
gain by this exercise. Interoperability problems between C and PyPy are
not going to be magically solved just because we turn C code into lltyped
flow graphs. That seems rather pointless: from lltyped flow graphs, what
we do is mostly turn them into C code again.

For an example of the confusion:

> Interpreting C code would greatly help CPython developers by freeing
> them from the task of having to repeatedly compile their code.

No: flow graphs need to be created (a Python process that is slower than
gcc), then turned into more C code (more time), and finally compiled...
by gcc itself.

Unless you really have in mind an interpreter-with-JIT for the C
language, fully written in PyPy; but in this case there are no flow
graphs around. Our JIT would give bad results anyway, because C is a
low-level language, not a dynamic language.

A bientôt,

Armin.

From arigo at tunes.org  Fri Jan  6 17:04:16 2012
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 6 Jan 2012 17:04:16 +0100
Subject: [pypy-dev] STM
In-Reply-To:
References:
Message-ID:

Hi David,

On Fri, Jan 6, 2012 at 15:49, David Edelsohn wrote:

> To some extent it is a convenience for the implementation, but STM
> requires some way for the system to record the transaction so that it
> can be committed or retried. Hardware, a hypervisor, or an operating
> system can observe and record an arbitrary sequence of operations. A
> traditional static compiler cannot observe operations beyond the
> boundary of the translation unit. And without something like
> whole-program compilation to assert to the compiler that it has seen
> the entire call graph, it cannot reach closure.
Just for the sake of the argument: this seems wrong to me. You could
declare C functions with an attribute meaning "this function is meant to
be called in a transaction, but it may end the transaction and start the
next one". Or something more general along the lines of "this function
may return after activating a new transaction". When seeing a call to
such a function, the caller must be ready to handle that case: either it
ends the transaction explicitly, or it must itself be marked with the
same attribute to allow the new transaction to be propagated further up.

This might complicate the code in gcc a lot, for all I know; and right
now it looks like it's not needed here, after all.

A bientôt,

Armin.

From celil.kj at gmail.com  Fri Jan  6 17:27:48 2012
From: celil.kj at gmail.com (Celil)
Date: Fri, 06 Jan 2012 08:27:48 -0800
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To:
References: <4F070C6D.4010801@gmail.com>
Message-ID: <4F072104.4010108@gmail.com>

So if I understand correctly, the flow graphs are simply there to help
deduce low-level types for the high-level Python objects so that RPython
can be translated into C, and they play no role in the interpreter?

For some reason I was under the impression that the interpreter, as it
loads a Python module, uses the flow graph to deduce which parts of the
code are a proper subset of RPython, and compiles them to produce more
efficient modules.

Celil

On 1/6/12 7:52 AM, Armin Rigo wrote:
> Hi Celil,
>
> I think you are confusing several levels. I don't see what you would
> gain by this exercise. [...]

From tbaldridge at gmail.com  Fri Jan  6 17:46:05 2012
From: tbaldridge at gmail.com (Timothy Baldridge)
Date: Fri, 6 Jan 2012 10:46:05 -0600
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F072104.4010108@gmail.com>
References: <4F070C6D.4010801@gmail.com>
	<4F072104.4010108@gmail.com>
Message-ID:

On Fri, Jan 6, 2012 at 10:27 AM, Celil wrote:

> So if I understand correctly, the flow graphs are simply there to help
> deduce low-level types for the high-level Python objects so that
> RPython can be translated into C, and they play no role in the
> interpreter?

Correct.
When pypy loads .py modules, it compiles and interprets them as normal Python bytecode. Then, as the bytecode is run, the JIT starts profiling the code and creating machine-code versions of the bytecode instructions.

This is a rather nice feature that I am taking advantage of in my clojure-on-pypy implementation: https://github.com/halgari/clojure-py

In my implementation of Clojure we compile the Clojure code to pure Python bytecode, then let the JIT do its work. According to the JIT, this code is no different from a normal bytecode program.

All that to say, the pypy JIT doesn't look at Python ASTs; it runs directly off the bytecode contents of functions.

Timothy
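(A minimal illustration of the point above; the function f is hypothetical and nothing here is PyPy-specific. The bytecode that the tracer consumes can be inspected with the stdlib dis module, and any code object that compiles down to such instructions looks the same to the JIT, whatever source produced it:)

    import dis

    def f(x):
        return x + 1

    # Prints instructions such as LOAD_FAST, LOAD_CONST, BINARY_ADD and
    # RETURN_VALUE -- the level the tracing JIT actually works at.
    dis.dis(f)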
From stefan_ml at behnel.de Fri Jan 6 18:57:51 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 06 Jan 2012 18:57:51 +0100
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F072060.8020402@steinhoff.de>
References: <4F070C6D.4010801@gmail.com> <4F072060.8020402@steinhoff.de>

Armin Steinhoff, 06.01.2012 17:25:
> A better approach would be to bind the TCC library ( libtcc ->
> http://bellard.org/tcc/tcc-doc.html#SEC22 ) to PyPy.
> This library allows to compile on the fly and creates x86 executable code
> in memory space ... no link actions, just call it with a library call.

... except that it doesn't support the full C standard, so it won't compile all existing code, at least not correctly, from my experience. And it only works on the x86 (32/64-bit) architecture. It also (obviously) won't generate as efficient code as gcc or icc would. So you won't gain as much as you might think.

Stefan

From fijall at gmail.com Fri Jan 6 19:03:41 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 6 Jan 2012 20:03:41 +0200
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy

On Fri, Jan 6, 2012 at 7:57 PM, Stefan Behnel wrote:
> ... except that it doesn't support the full C standard, so it won't compile
> all existing code, at least not correctly, from my experience.

Maybe slightly offtopic, but I would like to point out that C/RPython integration works just fine.

Cheers,
fijal

From andrewfr_ice at yahoo.com Fri Jan 6 20:29:23 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Fri, 6 Jan 2012 11:29:23 -0800 (PST)
Subject: [pypy-dev] Suggestions Re: Problems Installing STM
Message-ID: <1325878163.11063.YahooMailNeo@web120701.mail.ne1.yahoo.com>

Hi Armin:

From: Armin Rigo
Sent: Thursday, January 5, 2012 3:08 PM
Subject: Re: [pypy-dev] Problems Installing STM

>Ah, sorry. You need also to pass the option "--gc=none". All real
>GCs are unsupported so far :-/

Great, this works!

AF> I don't see rstm.begin_transaction() being called. Is this implicit
AF> when a thread starts?

>Yes.

I interpret this as telling the underlying transaction manager that every variable the thread uses is part of a transaction, and therefore to log it (this is akin to taking a lock at the beginning of a programme and unlocking at the end). This is not how I understand STM or transactions to work. In this case, I think the Python way, explicit beats implicit, is the better strategy. Does begin_transaction() work?

AF> In general, what sort of feedback are you looking for at this stage?

>It's unclear. At this stage I'm looking for how we would really like
>the final STM-over-PyPy to look, as this question is not clear at
>all. See my other mails about this topic...

Yes, I am reading the other email messages. However, I feel commenting on existing work belongs in its own separate thread.

Some comments and questions:

1) I am starting to look at the rstm library and running some of the code. The bench examples emit logging data. I think it would be useful to have a logging facility to see what the underlying STM mechanism is doing. I have logging in my join pattern implementation; it helped me understand how patterns *worked* (and whether they were working).

2) Looking at the Haskell and rstm examples, and looking at the targetdemo example, I am not sure how you tell the STM which variables (i.e., the Anchor) are of interest?

Cheers,
Andrew

From arigo at tunes.org Fri Jan 6 21:19:02 2012
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 6 Jan 2012 21:19:02 +0100
Subject: [pypy-dev] Suggestions Re: Problems Installing STM
In-Reply-To: <1325878163.11063.YahooMailNeo@web120701.mail.ne1.yahoo.com>

Hi Andrew,

On Fri, Jan 6, 2012 at 20:29, Andrew Francis wrote:
> I interpret this as telling the underlying transaction manager that every
> variable the thread uses is part of a transaction, and therefore to log it

If you're comparing it with STM as used by Haskell, then it's a bit different. You had better compare it with the new "__transaction" keyword of GCC 4.7, for programs written in C.

> Yes, I am reading the other email messages. However, I feel commenting on
> existing work belongs in its own separate thread.

Existing work might become irrelevant or change in random ways, so for now at least I feel like it's a bit of a waste of time...

A bientôt,

Armin.
From dje.gcc at gmail.com Fri Jan 6 21:21:11 2012
From: dje.gcc at gmail.com (David Edelsohn)
Date: Fri, 6 Jan 2012 15:21:11 -0500
Subject: [pypy-dev] STM

On Fri, Jan 6, 2012 at 11:04 AM, Armin Rigo wrote:
> Just for the sake of the argument: this seems wrong to me. You could
> declare C functions with an attribute meaning "this function is meant
> to be called in a transaction, but it may end the transaction and
> start the next one".
>
> This might complicate the code in gcc a lot, for all I know; and right
> now it looks like it's not needed here, after all.

Yes, I agree that the feature you suggest would be useful. I am trying to point out the complexity of implementing such a feature in a static compiler.

If one limits the translation unit to a single source file and the functions are declared "static" -- in the C language meaning -- then the compiler can track the operations. If the function is external and visible, then it is extremely complicated for the static compiler to create an infrastructure to observe the operations in the transaction for the arbitrary code paths that can intervene.

- David

From armin at steinhoff.de Fri Jan 6 23:06:11 2012
From: armin at steinhoff.de (Armin Steinhoff)
Date: Fri, 06 Jan 2012 22:06:11 +0000
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
Message-ID: <4F077053.5060903@steinhoff.de>

Stefan Behnel wrote:
> ... except that it doesn't support the full C standard, so it won't
> compile all existing code, at least not correctly, from my experience.
> And it only works on the x86 (32/64-bit) architecture. It also
> (obviously) won't generate as efficient code as gcc or icc would. So
> you won't gain as much as you might think.

OK ... I know the restrictions of TCC. Just an off-topic question: is there a comparable library interface for gcc or other C compilers?

Armin Steinhoff

From tbaldridge at gmail.com Fri Jan 6 22:26:09 2012
From: tbaldridge at gmail.com (Timothy Baldridge)
Date: Fri, 6 Jan 2012 15:26:09 -0600
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F077053.5060903@steinhoff.de>

> OK ... I know the restrictions of TCC. Just an off-topic question: is there
> a comparable library interface for gcc or other C compilers?

You might look into clang (llvm)'s bindings. I know tons of people are hooking into those these days.
For instance, the people working on the Qt IDE use clang for source-code analysis. Pretty much the entire project is divided up into smaller libraries. Plus you also get a pretty mature C++ parser/compiler. I've been told writing a proper C++ parser is a good way to age prematurely.

Timothy

From haag498 at googlemail.com Fri Jan 6 22:45:47 2012
From: haag498 at googlemail.com (Jan Haag)
Date: Fri, 06 Jan 2012 22:45:47 +0100
Subject: [pypy-dev] C interpreter written in Python, and running CPython on top of PyPy
In-Reply-To: <4F077053.5060903@steinhoff.de>

On 01/06/2012 11:06 PM, Armin Steinhoff wrote:
> OK ... I know the restrictions of TCC. Just an off-topic question: is
> there a comparable library interface for gcc or other C compilers?

There's clang, which basically consists of a set of libraries hooked up to a common interface. I don't know, however, how usable its interface is.

Jan

From springrider at gmail.com Sat Jan 7 11:07:45 2012
From: springrider at gmail.com (Yan Chunlu)
Date: Sat, 7 Jan 2012 18:07:45 +0800
Subject: [pypy-dev] when will pypy support psycopg2?

based on the irc chat here:

http://www.tismer.com/pypy/irc-logs/pypy/pypy.2011-11-02.log.html

PyByteArray_Type, PyMemoryView_Type and PyInterpreterState are missing from the headers. http://codepad.org/FYkhcZKf

Just wondering, is there any schedule for this? I think psycopg2 is crucial for many developers to adopt pypy. Thanks!

From fijall at gmail.com Sat Jan 7 11:10:22 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 7 Jan 2012 12:10:22 +0200
Subject: [pypy-dev] when will pypy support psycopg2?

On Sat, Jan 7, 2012 at 12:07 PM, Yan Chunlu wrote:
> just wondering, is there any schedule for this? I think psycopg2 is
> crucial for many developers to adopt pypy. Thanks!

Hi

Use psycopg2-ct or pg8000 instead; those work fine in PyPy.

Cheers,
fijal

From springrider at gmail.com Sat Jan 7 11:54:59 2012
From: springrider at gmail.com (Yan Chunlu)
Date: Sat, 7 Jan 2012 18:54:59 +0800
Subject: [pypy-dev] when will pypy support psycopg2?

Sorry, I somehow replied to you personally -- sorry for the noise; I just haven't got used to gmail's new UI....

Quora's adoption brought me a lot of confidence. Thanks a lot! I will test it!

I am curious: is it hard for pypy to add those data types, like PyByteArray_Type, PyMemoryView_Type and PyInterpreterState?

On Sat, Jan 7, 2012 at 6:44 PM, Maciej Fijalkowski wrote:
> On Sat, Jan 7, 2012 at 12:39 PM, Yan Chunlu wrote:
>> okay, thanks! I have installed psycopg2ct and will give it a try.
>>
>> but since my app is serving several million requests per day, I am
>> really worried about whether it is suitable for production usage.
>
> You mean because it's relatively new? You should test it and see how
> it works. FYI, Quora moved to the ctypes-based new DB binding and they
> never complained; it seems to be working just fine for them.
>
>> On Sat, Jan 7, 2012 at 6:22 PM, Maciej Fijalkowski wrote:
>>> On Sat, Jan 7, 2012 at 12:20 PM, Yan Chunlu wrote:
>>>> thanks for the quick reply. Seems psycopg2 is "An (experimental)
>>>> implementation" based on http://pypi.python.org/pypi/psycopg2ct
>>>
>>> the other way around
>>>
>>>> and also, both libs need to make many changes to the code...
>>>
>>> they should not, they all support the DB API
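(A minimal sketch of the drop-in route discussed above, assuming psycopg2ct is installed and that its compat.register() helper is available -- it registers the module under the name "psycopg2", so existing DB-API code should run unchanged:)

    from psycopg2ct import compat
    compat.register()  # alias psycopg2ct as "psycopg2"

    import psycopg2

    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()
    cur.execute("SELECT 1")
    print cur.fetchone()
    cur.close()
    conn.close()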
From fijall at gmail.com Sat Jan 7 11:59:02 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 7 Jan 2012 12:59:02 +0200
Subject: [pypy-dev] when will pypy support psycopg2?

On Sat, Jan 7, 2012 at 12:54 PM, Yan Chunlu wrote:
> I am curious: is it hard for pypy to add those data types,
> like PyByteArray_Type, PyMemoryView_Type and PyInterpreterState?

It's usually a painful experience to add new stuff to cpyext. But also, it does not necessarily give you confidence that it'll work, since emulating the CPython C API is tricky, and tiny problems with refcounts can lead to segfaults that don't occur on CPython for obscure reasons.

From springrider at gmail.com Sat Jan 7 12:11:00 2012
From: springrider at gmail.com (Yan Chunlu)
Date: Sat, 7 Jan 2012 19:11:00 +0800
Subject: [pypy-dev] when will pypy support psycopg2?

Okay, got it. Guess I should take it carefully. Thanks a lot for the help!

On Sat, Jan 7, 2012 at 6:59 PM, Maciej Fijalkowski wrote:
> It's usually a painful experience to add new stuff to cpyext. But
> also, it does not necessarily give you confidence that it'll work,
> since emulating the CPython C API is tricky, and tiny problems with
> refcounts can lead to segfaults that don't occur on CPython for
> obscure reasons.
From mtasic85 at gmail.com Sat Jan 7 12:46:22 2012
From: mtasic85 at gmail.com (Marko Tasic)
Date: Sat, 7 Jan 2012 12:46:22 +0100
Subject: [pypy-dev] Reference counting

Hi,

I've been carefully following your mailing list for years, and using PyPy for different kinds of projects, mostly highly distributed and decentralized systems, and everything is just great compared to CPython, Jython and IronPython. I've even written a partial ctypes wrapper for GObject Introspection, so I can use Gtk as the GUI toolkit on top of PyPy. Today, I cannot imagine writing Python code without executing it on PyPy, because it just feels natural to use it as the default implementation instead of as an alternative to CPython :)

I know that you don't feel that using reference counting as the GC mechanism is a good idea, because of the overhead of maintaining reference counts for each object and its non-concurrent nature, but can you give me any idea where to start and how I can implement reference counting for PyPy? Have in mind that I am not new to Python or to low-level stuff; I just want to measure performance and possibly implement alternative reference-counting strategies.

Cheers,
Marko Tasic

From fijall at gmail.com Sat Jan 7 13:15:27 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 7 Jan 2012 14:15:27 +0200
Subject: [pypy-dev] Reference counting

On Sat, Jan 7, 2012 at 1:46 PM, Marko Tasic wrote:
> Today, I cannot imagine writing Python code without executing it on PyPy,
> because it just feels natural to use it as the default implementation
> instead of as an alternative to CPython :)

Hi Marko, we're very glad to hear that!

> I know that you don't feel that using reference counting as the GC mechanism
> is a good idea, [...] but can you give me any idea where to start and how I
> can implement reference counting for PyPy?

It is a very valid question. In fact, we already have a refcounting implementation, which you get by passing --gc=ref to the translate.py script. Refcounting has several ups and downs, and it's definitely valid to experiment and see the overheads. However, at least the current approach, and pypy in general, has several problems with that:

* refcounting is implemented as a transformation of flow graphs. See pypy/rpython/memory/gctransform/refcounting.py for details. This approach works very well in the sense that we don't have to maintain refcounts by hand in the interpreter (we'll never do that). This is all well and good, however it also means there is a significant redundancy in references.
You would need to either implement a smarter policy or a refcount-removing optimization (typically run after inlining) in order for that to be viable.

* we generally never cared about avoiding cycles in the Python interpreter itself. I don't think pypy's refcounting implementation comes with a cycle detector included, but don't quote me on that.

* there is no JIT support. Typically the JIT requires a bit of support from the GC to cooperate. In order to get good results, you would probably also need to implement an optimization that removes unnecessary refcounts in traces.

I hope that helps clarify some things; feel free to ask more if not!

Cheers,
fijal

From mtasic85 at gmail.com Sat Jan 7 14:54:44 2012
From: mtasic85 at gmail.com (Marko Tasic)
Date: Sat, 7 Jan 2012 14:54:44 +0100
Subject: [pypy-dev] Reference counting

Maciej,

Thank you for the prompt response.

> Hi Marko, we're very glad to hear that!

Off topic, but I have to mention that I've been an ambassador of PyPy for the last few years. In the beginning it was hard, but now the results you've achieved are obvious and promising, and everyone is starting to trust your implementation. However, I have to say that there are many notoriously xenophobic developers. The funny thing is that they trust how their source code is written, but not how it's executed.

> * refcounting is implemented as a transformation of flow graphs. See
> pypy/rpython/memory/gctransform/refcounting.py for details. [...]
> You would need to either implement a smarter policy or a refcount-removing
> optimization (typically run after inlining) in order for that to be viable.

Is there a way to always have an explicit reference-counting field for all objects? I know this is bad, but I want to go even further: I would like to have even more fields in the object structure besides the refcount field. The reason for this is to have a way to test the correctness of alternative RC implementations that handle cycles. I guess that I should turn off inlining?

> * we generally never cared about avoiding cycles in the Python
> interpreter itself. I don't think pypy's refcounting implementation
> comes with a cycle detector included, but don't quote me on that.

I've developed an alternative way of handling cycles related to reference counting without using a tracing GC or traversing the items in container objects such as list, dict, set, etc. I'm still checking the algorithm and testing correctness, but everything has been fine so far. It is not my intention to make anything spectacular, just to research this field.

If anyone has an interest in my reference-counting mechanism, see:
http://code.google.com/p/cosmos-lang/wiki/CosmosRefCounter
http://code.google.com/p/cosmos-lang/source/browse/doc/corc.py

An instance of the "Reference" class can be embedded in an "Abstract" class/instance, so there is no need for a separate object.

> * there is no JIT support. Typically the JIT requires a bit of support
> from the GC to cooperate. In order to get good results, you would
> probably also need to implement an optimization that removes
> unnecessary refcounts in traces.

I don't mind if there is no JIT support, and I don't mind having "unnecessary refcounts". Again, my main interest is testing correctness, not speed. Later on, I will try to optimize the interaction between the GC and the JIT if possible.
Cheers,
Marko Tasic

From fijall at gmail.com Sat Jan 7 15:03:53 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 7 Jan 2012 16:03:53 +0200
Subject: [pypy-dev] Reference counting

On Sat, Jan 7, 2012 at 3:54 PM, Marko Tasic wrote:
> Off topic, but I have to mention that I've been an ambassador of PyPy
> for the last few years.

Thanks :)

> Is there a way to always have an explicit reference-counting field for all
> objects? I know this is bad, but I want to go even further: I would like to
> have even more fields in the object structure besides the refcount field.

What do you mean by an explicit reference count? One that you manipulate from the source of the Python interpreter? Then no. But if you mean "can I have arbitrary fields on objects depending on the GC strategy", then yes. The object layout is completely orthogonal to how the Python interpreter is implemented.

> If anyone has an interest in my reference-counting mechanism, see:
> http://code.google.com/p/cosmos-lang/wiki/CosmosRefCounter
> http://code.google.com/p/cosmos-lang/source/browse/doc/corc.py

People from unladen swallow tried to use the Cosmos GC -- or was it something else?

Again, does that answer your questions?

PS. it's sometimes easier to discuss such stuff on IRC than via mail.

Cheers,
fijal

From amauryfa at gmail.com Sat Jan 7 15:06:24 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Sat, 7 Jan 2012 15:06:24 +0100
Subject: [pypy-dev] when will pypy support psycopg2?

2012/1/7 Maciej Fijalkowski
> It's usually a painful experience to add new stuff to cpyext. But
> also, it does not necessarily give you confidence that it'll work,
> since emulating the CPython C API is tricky, and tiny problems with
> refcounts can lead to segfaults that don't occur on CPython for
> obscure reasons.

I looked at the compilation messages and added the missing parts; it was not that difficult, after all :-)

The module should now compile, provided you apply the attached patch to psycopg. (I suggested the PyDateTime_DELTA_GET_DAYS macros to CPython: http://bugs.python.org/issue13727 )

The resulting module works a little... at least it yielded a meaningful error message:

$ ./pypy-c -c "import psycopg2; conn = psycopg2.connect('dbname=test')"
OperationalError: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

--
Amaury Forgeot d'Arc
-------------- next part --------------
A non-text attachment was scrubbed...
Name: psycopg-pypy.patch
Type: text/x-patch
Size: 1116 bytes

From mtasic85 at gmail.com Sat Jan 7 15:50:19 2012
From: mtasic85 at gmail.com (Marko Tasic)
Date: Sat, 7 Jan 2012 15:50:19 +0100
Subject: [pypy-dev] Reference counting

Maciej,

> What do you mean by an explicit reference count? One that you manipulate
> from the source of the Python interpreter? Then no. But if you mean
> "can I have arbitrary fields on objects depending on the GC strategy",
> then yes. The object layout is completely orthogonal to how the Python
> interpreter is implemented.

I meant that I need to manually increment and decrement the refcount from the Python interpreter, and also have arbitrary fields on objects depending on the GC strategy.

> People from unladen swallow tried to use the Cosmos GC -- or was it
> something else?

I think it is not that one; this is something different. I haven't had any motivation to fix the CPython GC because I don't use CPython any more for production code, unless it requires py2exe/py2app and PySide/Qt. PyPy is simply more challenging, ATM :)

My approach sacrifices memory but does not spend time traversing objects. The idea is not to have pauses caused by tracing GCs. Anyway, your approach probably beats mine, but I want to experiment and research.

> Again, does that answer your questions?

Yes, you did ;)

> PS. it's sometimes easier to discuss such stuff on IRC than via mail.

I'll find some time soon to discuss this with you guys.

Cheers,
Marko Tasic

From exarkun at twistedmatrix.com Sat Jan 7 15:44:50 2012
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Sat, 07 Jan 2012 14:44:50 -0000
Subject: [pypy-dev] when will pypy support psycopg2?
Message-ID: <20120107144450.5955.1091143824.divmod.xquotient.83@localhost6.localdomain6>

On 10:07 am, springrider at gmail.com wrote:
> just wondering, is there any schedule for this? I think psycopg2 is
> crucial for many developers to adopt pypy. Thanks!

Just another data point for you - I switched a somewhat simple service from CPython/psycopg2 to PyPy/pg8000 and everything is working well so far. The service even seems to be faster overall on PyPy.

Jean-Paul

From springrider at gmail.com Sun Jan 8 04:43:43 2012
From: springrider at gmail.com (Yan Chunlu)
Date: Sun, 8 Jan 2012 11:43:43 +0800
Subject: [pypy-dev] when will pypy support psycopg2?

Thanks for the help! But it seems that did not resolve the problem; the error message is about "PyByteArray_Type" etc. I'm not sure what to do with PyDateTime.

On Sat, Jan 7, 2012 at 10:06 PM, Amaury Forgeot d'Arc wrote:
> The module should now compile, provided you apply the attached patch to
> psycopg. The resulting module works a little... at least it yielded a
> meaningful error message:
>
> $ ./pypy-c -c "import psycopg2; conn = psycopg2.connect('dbname=test')"
> OperationalError: could not connect to server: No such file or directory
>     Is the server running locally and accepting
>     connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
From andrewfr_ice at yahoo.com Sun Jan 8 21:24:22 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Sun, 8 Jan 2012 12:24:22 -0800 (PST)
Subject: [pypy-dev] STM
Message-ID: <1326054262.395.YahooMailNeo@web120705.mail.ne1.yahoo.com>

Hi Armin:

From: Armin Rigo
Sent: Wednesday, January 4, 2012 6:30 PM
Subject: [pypy-dev] STM

>While this is obvious in the case of the CPython interpreter, I'd
>argue that it is a useful point of view in general. The paper
>mentioned above says that the current approach to multithreading is
>completely non-deterministic, ....

A silly question: what is the paper you are discussing?

> The solution that we really need for
>CPython requires a different point of view: a generally-STM program,
>in which we carefully add here and there a "yield", i.e. a point at
>which it's ok to end the current transaction and start the next one.
...
>(Interestingly, this run() function is similar to stackless.run().
>The exact relationships between stackless and this model are yet to
>investigate, but it seems that one point of view on what I'm proposing
>might be essentially "use stackless, remove some determinism from the
>already-unclear tasklet switching order, and add STM under the hood".)

Perhaps this is off-topic: in a non-trivial stackless programme using cooperative scheduling, unless you explicitly impose a logical ordering, it is difficult to predict when a particular tasklet will run. The round-robin doesn't mean much. This is because in non-trivial stackless programmes, like most non-trivial multi-threaded programmes, coroutines are responding to external events that arrive with some degree of randomness. To cop a line from the creator of Erlang, "the world is concurrent." This is also why I didn't quite get the extreme measures taken to inject non-determinism into the Newsqueak interpreter so the programmer can't second-guess the scheduler.

I will be the first to admit that I have a half-baked knowledge of STM, I have only just started looking at the C++ transactional language specification, and I have a sketchy knowledge of the CPython interpreter. So I have a lot to learn and probably many misconceptions.

I can sort of see the Stackless angle: in the simplest case, a tasklet that does not share data with other tasklets (in which case it can yield without problems), communicates only via channels, or uses set_atomic(), looks like a coarse-grained transaction of sorts.

Based on the little I know about stuff like transactional synchronisation order, in a Stackless programme I can see a beginTransaction-like statement acting as a natural yield/schedule() point. Marking transaction boundaries is useful: if another transaction is occurring, the tasklet/thread suspends; otherwise the tasklet/thread starts its transaction and runs until it finishes. One may be able to model this with stackless.py.

If I understand what you are proposing, the problem I see with yielding in the coarse-grained tasklet-as-transaction is that you probably need more machinery to ensure ACID properties, since you are now getting thread interweaving and potential race conditions.

>and add STM under the hood".

When I did my join pattern prototype, I read the paper "Scalable Join Patterns." A conversation with the authors led to reading material on Concurrent/Parallel ML and "Transactional Events." The common strategy in all those projects was to use STM-related technologies under the hood to support a higher-level concurrency construct while reducing the performance hit.

Cheers,
Andrew

From arigo at tunes.org Mon Jan 9 14:03:18 2012
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 9 Jan 2012 14:03:18 +0100
Subject: [pypy-dev] STM
In-Reply-To: <1326054262.395.YahooMailNeo@web120705.mail.ne1.yahoo.com>

Hi Andrew,

On Sun, Jan 8, 2012 at 21:24, Andrew Francis wrote:
> A silly question: what is the paper you are discussing?

The Problem with Threads, Edward A. Lee, EECS 2006.

> Perhaps this is off-topic: in a non-trivial stackless programme using
> cooperative scheduling, unless you explicitly impose a logical ordering,
> it is difficult to predict when a particular tasklet will run. The
> round-robin doesn't mean much.

Ok, so basically no real Stackless program relies on a particular order anyway. This is in line with what I expected.

> If I understand what you are proposing, the problem I see with yielding in
> the coarse-grained tasklet-as-transaction is that you probably need more
> machinery to ensure ACID properties, since you are now getting thread
> interweaving and potential race conditions.

Yes, and this machinery is exactly what STM libraries offer. Given that you have multiple transactions that can *mostly* occur in parallel, STM ensures that in *all* cases it will look like the transactions have been run serially, while actually performing them in parallel under the hood.

A bientôt,

Armin.

From arigo at tunes.org Mon Jan 9 14:30:35 2012
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 9 Jan 2012 14:30:35 +0100
Subject: [pypy-dev] STM

Re-Hi,

On Mon, Jan 9, 2012 at 14:03, Armin Rigo wrote:
> The Problem with Threads, Edward A. Lee, EECS 2006.

I should add that I don't agree with Lee's main conclusion in this paper, namely that we need to give up solving the issue directly in the languages we use, and instead turn to new "coordination languages". But I do agree with his attacks against threads, and that provided me with the basics of the current discussion.

A bientôt,

Armin.

From andrewfr_ice at yahoo.com Mon Jan 9 18:38:32 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Mon, 9 Jan 2012 09:38:32 -0800 (PST)
Subject: [pypy-dev] STM
Message-ID: <1326130712.10581.YahooMailNeo@web120701.mail.ne1.yahoo.com>

Hi Armin:

AF> A silly question: what is the paper you are discussing?

>The Problem with Threads, Edward A. Lee, EECS 2006.

Thanks!

AF> Perhaps this is off-topic: in a non-trivial stackless programme using
AF> cooperative scheduling, unless you explicitly impose a logical ordering,
AF> it is difficult to predict when a particular tasklet will run.

>Ok, so basically no real Stackless program relies on a particular order
>anyway. This is in line with what I expected.

Yes. Even taking into consideration stuff like channel preferences, I think it is difficult to second-guess the scheduler in even a moderately complex application. Where you really see the philosophy of the scheduler as an opaque entity is in the Bell Labs family of languages (Go included). The reason I mention Go is that it is tackling the multiple-CPU issue (albeit differently from what is proposed here), and it and Stackless share family resemblances.

Now what I believe is a bigger issue is rationalising the relationship between scheduler(s), tasklets, and threads. Again, maybe STM solutions can be tested with an inexpensive prototype.

AF> If I understand what you are proposing, the problem I see with yielding in
AF> the coarse-grained tasklet-as-transaction is that you probably need more
AF> machinery to ensure ACID properties, since you are now getting thread
AF> interweaving and potential race conditions.

>Yes, and this machinery is exactly what STM libraries offer. Given
>that you have multiple transactions that can *mostly* occur in
>parallel, STM ensures that in *all* cases it will look like the
>transactions have been run serially, while actually performing them in
>parallel under the hood.

A rich STM library like rstm offers a variety of techniques for making this happen. On the other hand, the transactional specification has very specific (and maybe more conservative) ways of achieving serialisation and race-free programmes. My gut feeling is that it is difficult to get a one-size-fits-all solution. Maybe an interim solution is to get to a stage where one has a prototype that allows for pluggable STM solutions? A "try before you buy" strategy.

I am also not clear on whether an STM interface exposed to the Python programmer will be the same as the STM (and other techniques, say lock-free data structures) used internally to support multiple processors.

I think this is an exciting time for stuff like PyPy/Stackless!

Cheers,
Andrew

From andrewfr_ice at yahoo.com Mon Jan 9 19:52:27 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Mon, 9 Jan 2012 10:52:27 -0800 (PST)
Subject: [pypy-dev] STM
Message-ID: <1326135147.3629.YahooMailNeo@web120703.mail.ne1.yahoo.com>

Hi Armin:

>I should add that I don't agree with Lee's main conclusion in this
>paper, namely that we need to give up solving the issue directly in
>the languages we use, and instead turn to new "coordination
>languages". But I do agree with his attacks against threads, and that
>provided me with the basics of the current discussion.

I did a quick reading of the paper. It reminds me of John Ousterhout's "Why Threads are a Bad Idea (for Most Things)". I find Lee dismissive of languages like Erlang that have both syntax and coordination managers. Counter to what Lee believes, I think what will happen is that ideas from non-mainstream languages (Erlang, Haskell, Concurrent ML/JoCaml, the Bell Labs languages) will enter the mainstream, and/or the languages themselves will become popular. Actually, the latter is already happening.

Cheers,
Andrew
From kgardenia42 at googlemail.com Wed Jan 11 00:44:10 2012
From: kgardenia42 at googlemail.com (kgardenia42)
Date: Tue, 10 Jan 2012 15:44:10 -0800
Subject: [pypy-dev] Use pypy if available otherwise fallback to CPython

Hi,

I have some Python code (let's call it main.py) which I'd like to run with pypy if available/installed, but otherwise fall back to CPython.

Clearly, if I call my main.py explicitly with either "pypy" or "python" as follows, then I will get the behavior I want:

    python main.py
    pypy main.py

However, my goal is that main.py is executable and I can just do:

    ./main.py

and it will use pypy if installed, otherwise falling back to normal python.

In the archives of this list I found the following idiom:

    try:
        import __pypy__
    except ImportError:
        __pypy__ = None

However, I don't think this helps me here. It seems like (and correct me if I'm wrong) the code just detects python vs pypy when it has already been run with one or the other. It doesn't accomplish my stated goal, and it seems like that idiom would only make sense if I wanted custom code to run depending on whether I run from python or pypy. Did I miss something?

So otherwise, what can I put in the shebang line of my main.py to accomplish my goal?
I assume if I make it /usr/bin/python then I will just always use CPython (irregardless of whether I do the import __pypy__ trick). So all I can think of is the rather brute-force approach of creating my own wrapper shell script (e.g. /usr/bin/python-selector) which delegates to pypy if available but otherwise uses /usr/bin/python. I then put #!/usr/bin/python-selector as the shebang line of my main.py. Does that make sense? I'm pretty sure I must be missing a trick here and there is a better way. Any guidance welcome. Thanks. From fijall at gmail.com Wed Jan 11 00:51:22 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 11 Jan 2012 01:51:22 +0200 Subject: [pypy-dev] Use pypy if available otherwise fallback to CPython In-Reply-To: References: Message-ID: On Wed, Jan 11, 2012 at 1:44 AM, kgardenia42 wrote: > Hi, > > I have some python code (lets call it main.py) which I'd like to run > with pypy if available/installed but otherwise fallback to CPython. > > Clearly if I call my main.py explicitly with either "pypy" or "python" > as follows then I will get the behavior I want: > > ? python main.py > ? pypy main.py > > However, my goal is that main.py is executable and I can just do: > > ? ./main.py > > and it will use pypy if installed, otherwise fallback to normal python. > > In the archives of this list I found the following idiom: > > try: > ? ?import __pypy__ > except ImportError: > ? ?__pypy__ = None > > However, I don't think this helps me here. ?It seems like (and correct > me if I'm wrong) the code just detects python vs pypy when it has > already been run with one or the other. ?It doesn't accomplish my > stated goal and it seems like that idiom would only make sense if I > wanted custom code to run depending on whether I run from python or > pypy. ?Did I miss something? > > So otherwise, what can I put in the shebang line of my main.py to > accomplish my goal? ?I assume if I make it /usr/bin/python then I will > just always use CPython (irregardless of whether I do the import > __pypy__ trick). ?So all I can think of is the rather brute-force > approach of creating my own wrapper shell script (e.g. > /usr/bin/python-selector) which delegates to pypy if available but > otherwise uses /usr/bin/python. ?I then put #!/usr/bin/python-selector > as the shebang line of my main.py. ?Does that make sense? > > I'm pretty sure I must be missing a trick here and there is a better way. > > Any guidance welcome. > > Thanks. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev You can either do that, or do something like: if_pypy_installed (however you check it depends on your os/distro): os.system(pypy, sys.argv) with correct parsing of argv and putting it there, not sure how to do it ;-) From berdario at gmail.com Wed Jan 11 00:52:24 2012 From: berdario at gmail.com (Dario Bertini) Date: Wed, 11 Jan 2012 00:52:24 +0100 Subject: [pypy-dev] Use pypy if available otherwise fallback to CPython In-Reply-To: References: Message-ID: I usually prepend my python scripts with #! 
#!/usr/bin/env python

this way it'll resolve to your user's current python.

By using it together with a virtualenv[1]:

virtualenv --python=/path/to/pypy envname

you can have what you want, at least if you're in control of the deploy. I guess that if you'll just give the python script to someone who has installed both pypy and cpython, you'd have to rely on some other method.

[1] http://pypi.python.org/pypi/virtualenv

From randall.leeds at gmail.com Wed Jan 11 00:53:40 2012
From: randall.leeds at gmail.com (Randall Leeds)
Date: Tue, 10 Jan 2012 15:53:40 -0800
Subject: [pypy-dev] Use pypy if available otherwise fallback to CPython
In-Reply-To: References: Message-ID: 

On Tue, Jan 10, 2012 at 15:51, Maciej Fijalkowski wrote:
> On Wed, Jan 11, 2012 at 1:44 AM, kgardenia42 wrote:
>> I have some python code (let's call it main.py) which I'd like to run
>> with pypy if available/installed but otherwise fall back to CPython.
>> [...]
>
> You can either do that, or do something like:
>
> if_pypy_installed (however you check it depends on your os/distro):
>     os.system(pypy, sys.argv)
>
> with correct parsing of argv and putting it there, not sure how to do it ;-)

I would use the os.exec* family of functions to just replace the current process with pypy invoked with the same arguments.

From kgardenia42 at googlemail.com Wed Jan 11 01:38:03 2012
From: kgardenia42 at googlemail.com (kgardenia42)
Date: Tue, 10 Jan 2012 16:38:03 -0800
Subject: [pypy-dev] pypy 64 bit Ubuntu packages
Message-ID: 

Hi,

(Not sure if this is the correct place to ask this but I thought someone on this list would be able to help).

I have an Ubuntu package for which I would like to state a dependency on pypy.
I found this excellent PPA resource for pypy Ubuntu packages:

https://launchpad.net/~pypy/+archive/ppa
http://ppa.launchpad.net/pypy/ppa/ubuntu/pool/main/p/pypy/

However, even though these packages have a target architecture of "all", when I install them on an amd64 Ubuntu machine I am unable to run the pypy binary. I'm assuming, perhaps wrongly, that this is because it is a 32bit binary:

foo at ubuntu:~$ file /usr/share/pypy-1.6/bin/pypy
/usr/share/pypy-1.6/bin/pypy: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, stripped

So my question is: did I miss some trick in getting this running on 64bit Ubuntu? Or otherwise, why are the packages targeting architecture "all" if they don't work on amd64? i.e. why not just target x86 architectures if that is what the package will run on?

My fallback option is to download the 64bit release (binary) tarball from pypy.org, which works fine, but it breaks out of the Ubuntu packaging/dependency model.

As before - it is perfectly possible I've just missed something obvious. What do other people do in this case? Are there 64 bit packages I missed?

Any help welcome. Thanks.

From niebaopeng at gmail.com Wed Jan 11 06:35:20 2012
From: niebaopeng at gmail.com (=?UTF-8?B?6IGC5a6d6bmP?=)
Date: Wed, 11 Jan 2012 13:35:20 +0800
Subject: [pypy-dev] a bug of pypy: don't auto close the file handler.
Message-ID: <4F0D1F98.6040505@gmail.com>

description: pypy doesn't close a file opened in a function. The program exits after running for a while because it opens too many files, exceeding the system limit. The code is:

#!/usr/bin/env python
import os,time,signal

last = {}
now = {}

def stack(s,f):
    global now,last
    last = now
    now = {}
    lines=open("/proc/net/dev","rt").readlines()
    #f=open("/proc/net/dev")
    #lines=f.readlines()
    #f.close()
    ls=lines[2:] #del the header line
    for m in range(0,len(ls)):
        l=ls[m] # l = per line
        name,l=l.split(':') # name = dev, l = other
        l=l.split() # l = other, split into items
        bytes_in = l[0]
        bytes_out = l[8]
        packets_in = l[1]
        packets_out = l[9]
        now[name]={}
        now[name]['bytes_in']=bytes_in
        now[name]['bytes_out']=bytes_out
        now[name]['packets_in']=packets_in
        now[name]['packets_out']=packets_out
    if last:
        print '-'*80
        for name in now.iterkeys():
            print "dev: %s\tbytes:\trx:%.2fMb/s\ttx:%.2fMb/s\tpackets\trx:%s\ttx:%s" % (name,
                int(int(now[name]['bytes_in']) - int(last[name]['bytes_in']))/1024.0/1024.0*8.0,
                int(int(now[name]['bytes_out']) - int(last[name]['bytes_out']))/1024.0/1024.0*8.0,
                int(int(now[name]['packets_in']) - int(last[name]['packets_in'])),
                int(int(now[name]['packets_out']) - int(last[name]['packets_out'])))
        print '-'*80

if __name__=='__main__':
    signal.signal(signal.SIGALRM,stack)
    signal.setitimer(signal.ITIMER_REAL,0.01,1.0)
    while 1 :
        signal.pause()

From gbowyer at fastmail.co.uk Wed Jan 11 07:09:25 2012
From: gbowyer at fastmail.co.uk (Greg Bowyer)
Date: Tue, 10 Jan 2012 22:09:25 -0800
Subject: [pypy-dev] a bug of pypy: don't auto close the file handler.
In-Reply-To: <4F0D1F98.6040505@gmail.com>
References: <4F0D1F98.6040505@gmail.com>
Message-ID: <4F0D2795.5020705@fastmail.co.uk>

You never close the file, so you are depending on CPython's ref-counting GC, which will collect the object when it goes out of scope. PyPy's default GC is mark / sweep, so it will not close the file when the file goes out of scope but rather at some _arbitrary_ future point.

Putting the file in a with statement will do the right thing, as will closing it.
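For example, a minimal sketch of the fix applied to your /proc/net/dev read (helper name invented for illustration):

def read_dev_lines():
    # the with-block guarantees the file is closed as soon as the
    # block exits, regardless of the interpreter's GC strategy
    with open("/proc/net/dev", "rt") as f:
        return f.readlines()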
Gritty details can be found in the docs:

http://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies

On 10/01/2012 21:35, 聂宝鹏 wrote:
> description: pypy doesn't close a file opened in a function. The program
> exits after running for a while because it opens too many files,
> exceeding the system limit. The code is:
> [snip - script quoted in full above]

From arigo at tunes.org Wed Jan 11 09:15:25 2012
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 11 Jan 2012 09:15:25 +0100
Subject: [pypy-dev] pypy 64 bit Ubuntu packages
In-Reply-To: References: Message-ID: 

Hi,

On Wed, Jan 11, 2012 at 01:38, kgardenia42 wrote:
> (Not sure if this is the correct place to ask this but I thought
> someone on this list would be able to help).

You might have better chances if you ask on the Ubuntu channels why a 32-bit package is installed on Ubuntu 64.

A bientôt,

Armin.

From fijall at gmail.com Wed Jan 11 09:40:30 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 11 Jan 2012 10:40:30 +0200
Subject: [pypy-dev] pypy 64 bit Ubuntu packages
In-Reply-To: References: Message-ID: 

On Wed, Jan 11, 2012 at 10:15 AM, Armin Rigo wrote:
> You might have better chances if you ask on the Ubuntu channels why a
> 32-bit package is installed on Ubuntu 64.

I think, Armin, it's us wrongly specifying the platform.

From dmitrey15 at ukr.net Thu Jan 12 10:42:54 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Thu, 12 Jan 2012 11:42:54 +0200
Subject: [pypy-dev] Some NumPyPy propositions
Message-ID: <4F0EAB1E.7020502@ukr.net>

hi all,

I would like to make some propositions wrt NumPy port development:

1) It would be nice to have a build and/or install parameter to enable usage of numpypy as numpy, e.g. "setup.py build --numpypy_as_numpy". I know it can be done via some tricks, but an explicit parameter would be better, especially for inexperienced users.
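(For reference, the trick amounts to a one-file alias module placed on PYTHONPATH - a minimal sketch, assuming the star-import form keeps working:)

# numpy.py - alias module so that "import numpy" picks up numpypy on PyPy
from numpypy import *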
2) Many software packages could offer some basic functionality even without a full numpy port, but due to import errors they cannot provide even that. For example:

from numpy import some_func, a_rare_func_unimplemented_in_pypy_yet
...
if user_want_this:
    some_func()
elif user_want_some_rare_possibility:
    a_rare_func_unimplemented_in_pypy_yet()

It would be nice to have the possibility to install the PyPy NumPy port with all not-yet-implemented functions as stubs, e.g.

def flipud(*args,**kw):
    raise numpy_absent_exception('flipud is unimplemented yet')

(and similar stubs for ndarray and matrix methods)

3) Last but not least; I'm the author and developer of the openopt suite (openopt.org, with ~ 200 visitors daily, which is AFAIK about 10% of scipy.org), and of course both openopt and PyPy would essentially increase their user bases once openopt is capable of running on PyPy; yet I see some weeks or months until that happens. I would be glad to make some contributions toward this, but my current financial situation does not allow me to work for free. If at least basic financial support could be obtained, I guess I could port some missing numpy functions / array methods, maybe furthermore some functions from scipy.optimize or scipy.sparse. My CV and contacts are here: http://openopt.org/Dmitrey .

Regards, D.

From michal at bendowski.pl Thu Jan 12 21:24:51 2012
From: michal at bendowski.pl (=?utf-8?Q?Micha=C5=82_Bendowski?=)
Date: Thu, 12 Jan 2012 21:24:51 +0100
Subject: [pypy-dev] Work on the JVM backend
Message-ID: <4DB0241978A5469C832E6314C81780EF@gmail.com>

Hello everyone,

Back in the summer I asked on this mailing list if there's interest in moving the JVM backend forward. Back then there was some enthusiasm, so I got back to it when I had the chance, which unfortunately was a few months later. The suggestion back then was to look into using JPype to integrate more closely with Java-side code, and that's what I would like to do.

But before that, I noticed that the JVM backend fails to translate the standard interpreter, and spent some time lately getting to know the code and trying to get it to work. What I have right now is a version that outputs valid Jasmin files, which unfortunately still contain some invalid bytecodes (longs vs ints from what I've seen, I'll look into it next).

It would be awesome if someone could take a look at my changes. What's the best way to submit them? Bitbucket pull requests? They will need to go through some review - do you have a workflow for that?

Here's a short list of stuff I found and fixed (hopefully):
- support the ll_getlength method of StringBuilders in ootype,
- make compute_unique_id work on built-ins (StringBuilders again),
- provide oo implementations (or stubs) for pypy__rotateLeft, pypy__longlong2float etc.,
- handle rffi.SHORT and rffi.INT showing up in graphs. For now I try to emit something that makes sense (seemed easier), but the right solution is probably to see if the code in question (rbigint, rsha) can be implemented on the java level,
- handle the jit_is_virtual opcode - I had no idea how to "safely ignore" it for now, is False the safe answer?

I hope someone can help me to submit the changes and maybe guide with further work. And once again thanks for all the hard work on such a great project :)

Michał Bendowski
From fwierzbicki at gmail.com Fri Jan 13 00:02:02 2012
From: fwierzbicki at gmail.com (fwierzbicki at gmail.com)
Date: Thu, 12 Jan 2012 15:02:02 -0800
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4DB0241978A5469C832E6314C81780EF@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com>
Message-ID: 

2012/1/12 Michał Bendowski :
> Back in the summer I asked on this mailing list if there's interest in
> moving the JVM backend forward. [...] The suggestion back then was to
> look into using JPype to integrate more closely with Java-side code,
> and that's what I would like to do.

Hi Michal,

I'm afraid I'm only a pypy observer, so I can't help with your specific queries. However, I want to say that I'm cheering for your effort to get the JVM backend up and running again! I tried to look at it a little a few months ago, but couldn't quite get it to translate for me (not sure why, but I ran out of time to play with it). Someday I'd like to see Jython integrate with PyPy somehow, but I'm fuzzy on the how :)

Anyway, though I don't know what was suggested with respect to JPype, I'd like to suggest a look at another Java integration point. JDK 7 was recently released and it contains invokedynamic, which would be likely to be a great help with some sorts of Java integration. Here are some relevant links:

http://jcp.org/en/jsr/detail?id=292
http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/package-summary.html
http://openjdk.java.net/projects/mlvm/
https://wikis.oracle.com/display/mlvm/Home

Again, I don't know the details about how JPype would be used exactly, I just wanted to make sure you knew about invokedynamic.

-Frank Wierzbicki

From alex.gaynor at gmail.com Fri Jan 13 00:06:02 2012
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 12 Jan 2012 17:06:02 -0600
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: References: <4DB0241978A5469C832E6314C81780EF@gmail.com>
Message-ID: 

2012/1/12 fwierzbicki at gmail.com
> [...]
> JDK 7 was recently released and it contains invokedynamic, which would
> be likely to be a great help with some sorts of Java integration.
> Here are some relevant links: [...]
>
> Again, I don't know the details about how JPype would be used exactly,
> I just wanted to make sure you knew about invokedynamic.

Hey Frank,

I'm mostly an observer on the JVM + PyPy front, but my understanding is JPype or the like would be to solve the issue of "how do we test this without compiling all of PyPy".

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero

From fwierzbicki at gmail.com Fri Jan 13 01:00:35 2012
From: fwierzbicki at gmail.com (fwierzbicki at gmail.com)
Date: Thu, 12 Jan 2012 16:00:35 -0800
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: References: <4DB0241978A5469C832E6314C81780EF@gmail.com>
Message-ID: 

On Thu, Jan 12, 2012 at 3:06 PM, Alex Gaynor wrote:
> I'm mostly an observer on the JVM + PyPy front, but my understanding is
> JPype or the like would be to solve the issue of "how do we test this
> without compiling all of PyPy".

Ah I see - a very different point of integration. Thanks for the clarification!

-Frank

From cfbolz at gmx.de Fri Jan 13 13:01:53 2012
From: cfbolz at gmx.de (=?utf-8?B?Q2FybCBGcmllZHJpY2ggQm9seg==?=)
Date: Fri, 13 Jan 2012 13:01:53 +0100
Subject: [pypy-dev] [pypy-commit] pypy default: TEMPORARY: put a limit (4 by default) on the number of "cancelled,
Message-ID: <20120113120836.9EE4E282B5B@codespeak.net>

Yes, this breaks Pyrolog. That's fine, I can probably just increase the number. Could you write a test for this commit, though?

Cheers,

Carl Friedrich

----- Reply message -----
From: "arigo"
To:
Subject: [pypy-commit] pypy default: TEMPORARY: put a limit (4 by default) on the number of "cancelled,
Date: Thu, Jan 12, 2012 17:29

Author: Armin Rigo
Branch:
Changeset: r51285:b09a9354d977
Date: 2012-01-12 17:28 +0100
http://bitbucket.org/pypy/pypy/changeset/b09a9354d977/

Log: TEMPORARY: put a limit (4 by default) on the number of "cancelled, tracing more" that can occur during one tracing. I think this will again fail in some non-PyPy interpreters like Pyrolog. Sorry about that, but it's the quickest way to fix issue985...

diff --git a/pypy/jit/metainterp/pyjitpl.py b/pypy/jit/metainterp/pyjitpl.py
--- a/pypy/jit/metainterp/pyjitpl.py
+++ b/pypy/jit/metainterp/pyjitpl.py
@@ -1553,6 +1553,7 @@
 class MetaInterp(object):
     in_recursion = 0
+    cancel_count = 0

     def __init__(self, staticdata, jitdriver_sd):
         self.staticdata = staticdata
@@ -1975,6 +1976,13 @@
             raise SwitchToBlackhole(ABORT_BAD_LOOP)  # For now
         self.compile_loop(original_boxes, live_arg_boxes, start, resumedescr)
         # creation of the loop was cancelled!
+        self.cancel_count += 1
+        if self.staticdata.warmrunnerdesc:
+            memmgr = self.staticdata.warmrunnerdesc.memory_manager
+            if memmgr:
+                if self.cancel_count > memmgr.max_unroll_loops:
+                    self.staticdata.log('cancelled too many times!')
+                    raise SwitchToBlackhole(ABORT_BAD_LOOP)
         self.staticdata.log('cancelled, tracing more...')
         # Otherwise, no loop found so far, so continue tracing.

diff --git a/pypy/jit/metainterp/warmstate.py b/pypy/jit/metainterp/warmstate.py
--- a/pypy/jit/metainterp/warmstate.py
+++ b/pypy/jit/metainterp/warmstate.py
@@ -244,6 +244,11 @@
         if self.warmrunnerdesc.memory_manager:
             self.warmrunnerdesc.memory_manager.max_retrace_guards = value

+    def set_param_max_unroll_loops(self, value):
+        if self.warmrunnerdesc:
+            if self.warmrunnerdesc.memory_manager:
+                self.warmrunnerdesc.memory_manager.max_unroll_loops = value
+
     def disable_noninlinable_function(self, greenkey):
         cell = self.jit_cell_at_key(greenkey)
         cell.dont_trace_here = True

diff --git a/pypy/rlib/jit.py b/pypy/rlib/jit.py
--- a/pypy/rlib/jit.py
+++ b/pypy/rlib/jit.py
@@ -401,6 +401,7 @@
     'loop_longevity': 'a parameter controlling how long loops will be kept before being freed, an estimate',
     'retrace_limit': 'how many times we can try retracing before giving up',
     'max_retrace_guards': 'number of extra guards a retrace can cause',
+    'max_unroll_loops': 'number of extra unrollings a loop can cause',
     'enable_opts': 'optimizations to enable or all, INTERNAL USE ONLY'
 }
@@ -412,6 +413,7 @@
     'loop_longevity': 1000,
     'retrace_limit': 5,
     'max_retrace_guards': 15,
+    'max_unroll_loops': 4,
     'enable_opts': 'all',
 }
 unroll_parameters = unrolling_iterable(PARAMETERS.items())

From anto.cuni at gmail.com Fri Jan 13 16:02:43 2012
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Fri, 13 Jan 2012 16:02:43 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4DB0241978A5469C832E6314C81780EF@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com>
Message-ID: <4F104793.3060405@gmail.com>

Hello Michał,

On 01/12/2012 09:24 PM, Michał Bendowski wrote:
> But before that, I noticed that the JVM backend fails to translate the
> standard interpreter and spent some time lately getting to know the code
> and trying to get it to work. What I have right now is a version that
> outputs valid Jasmin files, which unfortunately still contain some
> invalid bytecodes (longs vs ints from what I've seen, I'll look into it
> next).

the long vs int problems are likely due to the fact that you are translating on a 64 bit machine. The translator toolchain assumes that the "native" long type of the target platform is the same as the source one, but this is not the case if you are targeting the JVM (where the native int is 32 bit) on a 64 bit linux (where long is 64 bit).

This problem is not easily solvable, so my suggestion is just to translate pypy-jvm inside a 32bit chroot for now.

> It would be awesome if someone could take a look at my changes.
> What's the best way to submit them? Bitbucket pull requests? They will
> need to go through some review - do you have a workflow for that?

we don't have any precise workflow, although a bitbucket pull request might be the easiest thing to do. I'll be glad to review it.

> Here's a short list of stuff I found and fixed (hopefully):
> - support the ll_getlength method of StringBuilders in ootype,
> - make compute_unique_id work on built-ins (StringBuilders again).

not sure what you mean here. What is the relation between compute_unique_id and StringBuilder?

> - provide oo implementations (or stubs) for pypy__rotateLeft, pypy__longlong2float etc.
> - handle rffi.SHORT and rffi.INT showing up in graphs. For now I try to
> emit something that makes sense (seemed easier), but the right solution
> is probably to see if the code in question (rbigint, rsha) can be
> implemented on the java level.

yes, this is another issue that has been around for a long time. In theory, we would like to be able to write per-backend specific code which overrides the default implementation. This would be useful for rbigint and rsha, but also e.g. for rlib.streamio. However, we never wrote the infrastructure to do that.

> - handle the jit_is_virtual opcode - I had no idea how to "safely ignore"
> it for now, is False the safe answer?

yes. Look at translator/c/funcgen.py:848: this is how jit_is_virtual is implemented by the C backend; you can see that it always returns 0.

> I hope someone can help me to submit the changes and maybe guide with further work.

Please put your work on bitbucket, I'll review it. I'd greatly appreciate if you committed small checkins (one for each fix/feature you are doing) instead of one giant commit with all the changes :-)

ciao,
Anto

From anto.cuni at gmail.com Fri Jan 13 16:07:37 2012
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Fri, 13 Jan 2012 16:07:37 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: References: <4DB0241978A5469C832E6314C81780EF@gmail.com>
Message-ID: <4F1048B9.20407@gmail.com>

Hello Frank,

On 01/13/2012 12:02 AM, fwierzbicki at gmail.com wrote:
> Again, I don't know the details about how JPype would be used exactly,
> I just wanted to make sure you knew about invokedynamic.

as Alex pointed out, the idea is to use JPype to be able to test the code which calls JVM methods without any need to translate.

The alternatives are:

1. use CPython + JPype/something else (JPype seems to be no longer developed, but last time I checked it was still the best thing I could find. Suggestions are welcome :-))

2. use Jython, although this might carry its own set of problems.

In any case, the idea is to wrap all the code to interface to the JVM inside rlib/rjvm.py: this way it should be "easy" to write multiple "backends" to actually call the JVM, one for JPype, one for Jython, etc.

ciao,
Anto

From tbaldridge at gmail.com Fri Jan 13 16:37:29 2012
From: tbaldridge at gmail.com (Timothy Baldridge)
Date: Fri, 13 Jan 2012 09:37:29 -0600
Subject: [pypy-dev] GIL hacks in pypy
Message-ID: 

I'm in the process of writing some semi-multithreaded code in PyPy. Now I know that PyPy and CPython both have GILs, but what I need is a CompareAndSwap operation in Python code. So basically I want my python code to grab the GIL, execute a small block of code, then hand the GIL back. Python doesn't really give a way to do this.
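To be concrete, what I'm after is the moral equivalent of this sketch, with a guarantee that no thread switch happens inside it (names are hypothetical, just to illustrate the semantics):

def compare_and_swap(d, key, expected, new):
    # the whole read-compare-store sequence below must be atomic
    if d.get(key) == expected:
        d[key] = new
        return True
    return False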
But in CPython we can do this ugly little hack for getting "free" locks where you basically set the GIL "remaining" bytecodes count to 32 billion, execute your code, then return it to the original value. The cool thing is that you get a lock for basically free. Now I doubt this method would work with jitted pypy code. Is there a good way to get cheap "critical sections" or, more preferably, a CAS operation in PyPy without resorting to (slow) OS locks? Basically I want to tell pypy "don't switch threads for the next 10 bytecodes..."

Thanks,

Timothy

--
"One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." (Robert Firth)

From russel at russel.org.uk Fri Jan 13 16:42:15 2012
From: russel at russel.org.uk (Russel Winder)
Date: Fri, 13 Jan 2012 15:42:15 +0000
Subject: [pypy-dev] pypy 64 bit Ubuntu packages
In-Reply-To: References: Message-ID: <1326469335.8664.30.camel@anglides.russel.org.uk>

Can I come at this from a different angle?

Between 2007-05 and 2009-08 there appears to have been a PyPy package in Debian Unstable, though it never made it to Testing (migrating and being immediately removed doesn't count). If PyPy were in Debian Testing then it would almost certainly be in Ubuntu.

Is it known why PyPy was removed from Debian Unstable? Is it possible to get it back into Debian? Debian packagers would package separately for 32-bit and 64-bit and thus the whole Ubuntu problem in the Launchpad PPA would go away. More importantly (for me and many others) PyPy would be part of the Debian distribution.

PyPy does appear to come as standard with Fedora --- well, 17 and later anyway.

Thanks.

--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder at ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: russel at russel.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder

From arigo at tunes.org Fri Jan 13 18:35:21 2012
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 13 Jan 2012 18:35:21 +0100
Subject: [pypy-dev] GIL hacks in pypy
In-Reply-To: References: Message-ID: 

Hi Timothy,

On Fri, Jan 13, 2012 at 16:37, Timothy Baldridge wrote:
> But in CPython we
> can do this ugly little hack for getting "free" locks where you
> basically set the GIL "remaining" bytecodes count to 32 billion,
> execute your code, then return it to the original value.

Bah! That's a hack indeed.

I think the cleanest solution would be to write the compare-and-swap operation as C code in CPython, and as RPython code in PyPy.

Otherwise, I'm unsure about getting compare-and-swap, but you can definitely do some atomic operations using lists or dicts.

A bientôt,

Armin.

From arigo at tunes.org Fri Jan 13 18:54:55 2012
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 13 Jan 2012 18:54:55 +0100
Subject: [pypy-dev] [pypy-commit] pypy default: TEMPORARY: put a limit (4 by default) on the number of "cancelled,
In-Reply-To: <20120113120836.9EE4E282B5B@codespeak.net>
References: <20120113120836.9EE4E282B5B@codespeak.net>
Message-ID: 

Hi,

On Fri, Jan 13, 2012 at 13:01, Carl Friedrich Bolz wrote:
> Yes, this breaks Pyrolog. That's fine, I can probably just increase the
> number. Could you write a test for this commit, though?

It's a bit hard...
I don't have a good enough understanding of the reason for which we get cancellations. But I fear that we're going to get the following: even if we manage to reproduce the issue in a test, the next step will be "ah, but in this case we should obviously do this or that instead of failing 4 times in a row". Then we do it. And then the test is no longer a test for the original commit...

A bientôt,

Armin.

From tbaldridge at gmail.com Fri Jan 13 18:58:27 2012
From: tbaldridge at gmail.com (Timothy Baldridge)
Date: Fri, 13 Jan 2012 11:58:27 -0600
Subject: [pypy-dev] GIL hacks in pypy
In-Reply-To: References: Message-ID: 

So I guess the next question would be, how does the GIL actually work in PyPy? Let's say you have two threads running jitted code. Does a jitted code loop run once, release the GIL, acquire, run again, etc.? In CPython the GIL is released every X number of bytecodes, but jitted code doesn't have bytecodes, so what is the level of cooperation there? Is the jitted code littered with cooperative "release-lock" instructions?

Thanks,

Timothy

On Fri, Jan 13, 2012 at 11:35 AM, Armin Rigo wrote:
> I think the cleanest solution would be to write the compare-and-swap
> operation as C code in CPython, and as RPython code in PyPy.
>
> Otherwise, I'm unsure about getting compare-and-swap, but you can
> definitely do some atomic operations using lists or dicts.

--
"One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." (Robert Firth)

From hakan at debian.org Fri Jan 13 19:12:13 2012
From: hakan at debian.org (Hakan Ardo)
Date: Fri, 13 Jan 2012 19:12:13 +0100
Subject: [pypy-dev] [pypy-commit] pypy default: TEMPORARY: put a limit (4 by default) on the number of "cancelled,
In-Reply-To: References: <20120113120836.9EE4E282B5B@codespeak.net>
Message-ID: 

Hi,

check out the test committed in fff6b491e07d: it tests a fallback in unroll.py that we don't really want to occur, by injecting an extra optimization stage that generates the issue we want to avoid.

On Fri, Jan 13, 2012 at 6:54 PM, Armin Rigo wrote:
> It's a bit hard... I don't have a good enough understanding of the
> reason for which we get cancellations. [...]

--
Håkan Ardö
From blendmaster1024 at gmail.com Fri Jan 13 21:03:10 2012
From: blendmaster1024 at gmail.com (lahwran)
Date: Fri, 13 Jan 2012 13:03:10 -0700
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4F1048B9.20407@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F1048B9.20407@gmail.com>
Message-ID: 

I haven't looked into it as thoroughly as I'd like, but perhaps JCC could be of use: http://pypi.python.org/pypi/JCC/

On Fri, Jan 13, 2012 at 8:07 AM, Antonio Cuni wrote:
> as Alex pointed out, the idea is to use JPype to be able to test the code
> which calls JVM methods without any need to translate.
> [...]

From stefano at rivera.za.net Sat Jan 14 00:18:52 2012
From: stefano at rivera.za.net (Stefano Rivera)
Date: Sat, 14 Jan 2012 01:18:52 +0200
Subject: [pypy-dev] pypy 64 bit Ubuntu packages
In-Reply-To: <1326469335.8664.30.camel@anglides.russel.org.uk>
References: <1326469335.8664.30.camel@anglides.russel.org.uk>
Message-ID: <20120113231852.GD27066@bach.rivera.co.za>

Sorry, I only got around to joining this list after this thread had already kicked off.

Hi everyone, I'm a Debian & Ubuntu Developer interested in PyPy (and recently got commit access here, but haven't done much with that, yet).

I've got a new PyPy package *just* landed in Debian experimental [0].

[0]: http://packages.qa.debian.org/p/pypy/news/20120113T213634Z.html

> Between 2007-05 and 2009-08 there appears to have been a PyPy package
> in Debian Unstable though it never made it to Testing (migrating and
> being immediately removed doesn't count).

I wasn't involved at the time, but here's what I know. PyPy was intentionally kept out of testing by its maintainer (according to the bug, in consultation with upstream): http://bugs.debian.org/486675

> Is it known why PyPy was removed from Debian Unstable?

See the removal request [1]. It references the RC bugs at the time, which can be found amongst its archived bugs [2].

[1] http://bugs.debian.org/538858
[2] http://bugs.debian.org/cgi-bin/pkgreport.cgi?archive=both;src=pypy

Generally, I think the issues were that PyPy was hard to build on many architectures (due to the massive RAM requirements), and insufficiently mature. Debian supports stable releases for ~3 years; we try and keep things that aren't ready for that out of stable releases.

I didn't get any responses from the previous maintainers when asking if they thought bringing PyPy back was a reasonable idea. So I just went ahead and did it. Let's see how it works out...

> Is it possible to get it back into Debian?

In progress, and I'd appreciate help. As it stands at the moment, it's usable with a (patched [3]) virtualenv, but I want to go further.
I'm keen to support PyPy as an equal interpreter to cpython in Debian (i.e. have it work with packaged python modules). That would be a first for an alternative python implementation in Debian, but it would require a fair amount of work.

[3]: https://github.com/stefanor/virtualenv/tree/pypy

In the meantime, I want to get it building on as many architectures as possible (Debian supports rather a few), and sort out all the test failures we'll see on them. I also have a fair number of out-of-tree PyPy patches applied [4] (e.g. PEP3147 support), that I must see about upstreaming.

[4]: http://anonscm.debian.org/gitweb/?p=collab-maint/pypy.git;a=tree;f=debian/patches

It's in experimental, so people can have a bit of a play with it, and we can sort out all the obvious problems, before we start getting too committed to the packaging approaches I've taken. So, please try it.

> Debian packagers would package separately for 32-bit and 64-bit and
> thus the whole Ubuntu problem in the Launchpad PPA would go away.

Correct. In fact, it'll build separately for all supported Debian architectures: https://buildd.debian.org/status/package.php?p=pypy&suite=experimental

> More importantly (for me and many others) PyPy would be part of the
> Debian distribution.

In time...

> If PyPy were in Debian Testing then it would almost certainly be in
> Ubuntu.

I'm interested in including it in 12.04, as a "technical preview". But I still need to have that discussion with the rest of the Ubuntu Release team. And I want to get some feedback on it in Debian experimental, first.

Now, to respond to the original thread: kgardenia42 said:
> However, even though these packages have a target architecture of
> "all" when I install them on an amd64 Ubuntu machine I am unable to
> run the pypy binary. I'm assuming, perhaps wrongly, that this is
> because it is a 32bit binary

Correct, that PPA is doing things wrong. Architecture "all" packages are architecture-independent.

SR

--
Stefano Rivera http://tumbleweed.org.za/
H: +27 21 465 6908 C: +27 72 419 8559 UCT: x3127

From fijall at gmail.com Sat Jan 14 00:41:21 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 14 Jan 2012 01:41:21 +0200
Subject: [pypy-dev] pypy 64 bit Ubuntu packages
In-Reply-To: <20120113231852.GD27066@bach.rivera.co.za>
References: <1326469335.8664.30.camel@anglides.russel.org.uk> <20120113231852.GD27066@bach.rivera.co.za>
Message-ID: 

On Sat, Jan 14, 2012 at 1:18 AM, Stefano Rivera wrote:
> Hi everyone, I'm a Debian & Ubuntu Developer interested in PyPy (and
> recently got commit access here, but haven't done much with that, yet).
>
> I've got a new PyPy package *just* landed in Debian experimental [0].

Cool, congrats!

From arigo at tunes.org Sat Jan 14 12:29:17 2012
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 14 Jan 2012 12:29:17 +0100
Subject: [pypy-dev] GIL hacks in pypy
In-Reply-To: References: Message-ID: 

Hi Timothy,

On Fri, Jan 13, 2012 at 18:58, Timothy Baldridge wrote:
> Is the jitted code littered with cooperative "release-lock" instructions?

Yes: every compiled loop ends in (about 4-5) assembler instructions that decrement the GIL counter and jump to some release-and-reacquire-the-GIL code if it ends up negative. We don't do it every loop, as it would have a performance impact.
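As a toy model in pure Python of what those instructions do (all names invented here; the real check is emitted directly as assembler):

import threading

class ToyGil(object):
    def __init__(self, check_interval=100):
        self.lock = threading.Lock()
        self.check_interval = check_interval
        self.counter = check_interval

    def loop_end_check(self):
        # assumes the calling thread holds self.lock (i.e. "has the GIL")
        # cheap path: just decrement a counter
        self.counter -= 1
        if self.counter < 0:
            # slow path, taken rarely: let other threads run
            self.lock.release()
            self.lock.acquire()
            self.counter = self.check_interval

A bientôt,

Armin.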
From fijall at gmail.com Sat Jan 14 19:22:03 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 14 Jan 2012 20:22:03 +0200
Subject: [pypy-dev] Some NumPyPy propositions
In-Reply-To: <4F0EAB1E.7020502@ukr.net>
References: <4F0EAB1E.7020502@ukr.net>
Message-ID: 

Hi Dmitrey.

Let me answer your questions one by one.

On Thu, Jan 12, 2012 at 11:42 AM, Dmitrey wrote:
> 1) It would be nice to have a build and/or install parameter to enable
> usage of numpypy as numpy, e.g. "setup.py build --numpypy_as_numpy". [...]

We can trivially have such a package on cheeseshop. It would do: from numpypy import *. You can even create your own package called numpy.py and put it somewhere on PYTHONPATH to achieve the same result.

> 2) Many software packages could offer some basic functionality even
> without a full numpy port, but due to import errors they cannot provide
> even that. [...] It would be nice to have the possibility to install the
> PyPy NumPy port with all not-yet-implemented functions as stubs [...]

This is IMO a very bad idea. It pushes back the problem from import time to some later time, while effectively not fixing it. I would say just "wait a bit" or make your package cooperate with numpypy now, but we expect new functionality to appear relatively rapidly.

> 3) Last but not least; I'm the author and developer of the openopt suite
> [...] If at least basic financial support could be obtained, I guess I
> could port some missing numpy functions / array methods, maybe
> furthermore some functions from scipy.optimize or scipy.sparse.

We indeed have some money, but what we have is relatively little. The numpy work done so far was purely volunteer effort and we generally select people for doing paid jobs who are already core developers or heavy contributors. However, the money we have all comes from some 3rd parties; pypy as a project does not earn any money - there is absolutely nothing that stops you from convincing some other 3rd party to do work on PyPy's numpy.

Cheers,
fijal

From michal at bendowski.pl Sat Jan 14 22:18:16 2012
From: michal at bendowski.pl (=?utf-8?Q?Micha=C5=82_Bendowski?=)
Date: Sat, 14 Jan 2012 22:18:16 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4F104793.3060405@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com>
Message-ID: 

On Friday, 13 January 2012 at 16:02 , Antonio Cuni wrote:
> Hello Michał,
> [...]
> Please put your work on bitbucket, I'll review it. I'd greatly appreciate if
> you committed small checkins (one for each fix/feature you are doing) instead
> of one giant commit with all the changes :-)

OK, I got myself a 32bit environment and created the pull request. I'll be grateful for any feedback. One thing I didn't do was to create regression tests against the problems I found - I didn't know where to put the tests and what (and how) exactly to test. If you can shed some light on it, that would be awesome.

Here's the URL: https://bitbucket.org/pypy/pypy/pull-request/19/improvements-to-the-jvm-backend

Thank you,

Michał
From fijall at gmail.com Sat Jan 14 22:28:01 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 14 Jan 2012 23:28:01 +0200
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com>
Message-ID: 

On Sat, Jan 14, 2012 at 11:18 PM, Michał Bendowski wrote:
> [earlier thread quoted in full; snipped]
> OK, I got myself a 32bit environment and created the pull request. I'll be
> grateful for any feedback. One thing I didn't do was to create regression
> tests against the problems I found - I didn't know where to put the tests
> and what (and how) exactly to test. If you can shed some light on it, that
> would be awesome.

Lack of tests is a no-no in PyPy world :) Look how current tests are implemented in pypy/translator/jvm/test/ and either extend those or the base classes. You run them using py.test (which comes included with pypy), refer to py.test documentation for details

> Here's the URL: https://bitbucket.org/pypy/pypy/pull-request/19/improvements-to-the-jvm-backend

From michal at bendowski.pl Sun Jan 15 00:41:06 2012
From: michal at bendowski.pl (=?utf-8?Q?Micha=C5=82_Bendowski?=)
Date: Sun, 15 Jan 2012 00:41:06 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com>
Message-ID: <4503105A35FC4DD88E354CB63EC4E381@gmail.com>

On Saturday, 14 January 2012 at 22:28 , Maciej Fijalkowski wrote:
> [earlier thread quoted in full; snipped]
> Lack of tests is a no-no in PyPy world :) Look how current tests are
> implemented in pypy/translator/jvm/test/ and either extend those or
> the base classes. You run them using py.test (which comes included
> with pypy), refer to py.test documentation for details

I'll look into it, looks like a whole new codebase to grep through (and I already found a bug in my code). I'll create a new pull request when I'm ready with the tests :)

Michał

From michal at bendowski.pl Sun Jan 15 22:28:05 2012
From: michal at bendowski.pl (=?utf-8?Q?Micha=C5=82_Bendowski?=)
Date: Sun, 15 Jan 2012 22:28:05 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4503105A35FC4DD88E354CB63EC4E381@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com>
Message-ID: <8DE497BA26E043B78A00828550A9AF89@gmail.com>

On Sunday, 15 January 2012 at 0:41 , Michał Bendowski wrote:
> [earlier thread quoted in full; snipped]
> > > > > > > > > > But before that, I noticed that the JVM backend fails to translate the standard interpreter and spent some time lately getting to know the code and trying to get it to work. What I have right now is a version that outputs valid Jasmin files, which unfortunately still contain some invalid bytecodes (longs vs ints from what I've seen, I'll look into it next). > > > > > > > > > > > > the long vs int problems are likely due to the fact that you are translating > > > > on a 64 bit machine. The translator toolchain assumes that the "native" long > > > > type of the target platform is the same as the source one, but this is not the > > > > case if you are targeting the JVM (where long is 32 bit) on a 64 bit linux > > > > (where long is 64 bit). > > > > > > > > This problem is not easily solvable, so my suggestion is just to translate > > > > pypy-jvm inside a 32bit chroot for now. > > > > > > > > > It would be awesome if someone could take a look at my changes. What's the best way to submit them? Bitbucket pull requests? They will need to go through some review - do you have a workflow for that? > > > > > > > > > > > > we don't have any precise workflow, although a bitbucket pull request might be > > > > the easiest thing to do. I'll be glad to review it. > > > > > > > > > Here's a short list of stuff I found and fixed (hopefully): > > > > > - support the ll_getlength method of StringBuilders in ootype, > > > > > - make compute_unique_id work on built-ins (StringBuilders again). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > not sure what you mean here. What is the relation between compute_unique_id > > > > and StringBuilder? > > > > > > > > > - provide oo implementations (or stubs) for pypy__rotateLeft, pypy__longlong2float etc. > > > > > - handle rffi.SHORT and rffi.INT showing up in graphs. For now I try to emit something that makes sense (seemed easier), but the right solution is probably to see if the code in question (rbigint, rsha) can be implemented on the java level. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > yes, this is another issue that has been around for a long time. In theory, we > > > > would like to be able to write per-backend specific code which overrides the > > > > default implementation. This would be useful for rbigint and rsha, but also > > > > e.g. for rlib.streamio. However, we never wrote the infrastructure to do that. > > > > > > > > > - handle the jit_is_virtual opcode - I had no idea how to "safely ignore" it for now, is False the safe answer? > > > > > > > > > > > > yes. Look at translator/c/funcgen.py:848: this is how jit_is_virtual is > > > > implemented by the C backend, you can see that it always returns 0/ > > > > > > > > > I hope someone can help me to submit the changes and maybe guide with further work. > > > > > > > > > > > > Please put your work on bitbucket, I'll review it. I'd greatly appreciate if > > > > you committed small checkins (one for each fix/feature you are doing) instead > > > > of one giant commit with all the changes :-) > > > > > > > > > > > > > > > > > > > > > > > > OK, I got myself a 32bit environment and created the pull request (. I'll be grateful for any feedback. One thing I didn't do was to create regression tests against the problems I found - I didn't know where to put the tests and what (and how) exactly to test. If you can shed some light on it, that would be awesome. 
From fijall at gmail.com Sun Jan 15 22:43:11 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 15 Jan 2012 23:43:11 +0200
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <8DE497BA26E043B78A00828550A9AF89@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com>
Message-ID:

On Sun, Jan 15, 2012 at 11:28 PM, Michał Bendowski wrote:
> OK - I have created another pull request here:
> https://bitbucket.org/pypy/pypy/pull-request/20/improvements-to-the-jvm-backend-this-time
>
> The previous one should be rejected/deleted; it seems impossible from my
> side. I will be grateful for comments about the changes.

[...]

That sounds like a good step forward, however, why are the tests skipped?
They should be passing now. Also primitives (like float2longlong) miss
tests I think.
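[A sketch of the kind of regression test being asked for, in plain Python
terms - it assumes the helpers keep the names they carry in
pypy/rlib/longlong2float.py, so treat it as an outline rather than the
actual test:

    def test_float2longlong_roundtrip():
        from pypy.rlib.longlong2float import float2longlong, longlong2float
        # converting a double to its bits and back must reproduce it exactly
        for x in [0.0, -0.0, 1.5, -2.25, 1e308]:
            assert longlong2float(float2longlong(x)) == x
]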
From michal at bendowski.pl Sun Jan 15 23:00:56 2012
From: michal at bendowski.pl (Michal Bendowski)
Date: Sun, 15 Jan 2012 23:00:56 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To:
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com>
Message-ID:

2012/1/15 Maciej Fijalkowski :
> That sounds like a good step forward, however, why are the tests
> skipped? They should be passing now.

What do you mean? I didn't add any skipping code (except for
append_charpsize). What I did find out was that on a 64 bit system all
JVM tests get skipped (because of pypy/translator/jvm/conftest.py) -
is that what you mean?

> Also primitives (like float2longlong) miss tests I think.

They also miss implementations. Because the JVM lacks unsigned types, the
whole problem of translating the RFFI code for rbigint etc. seems complex.
For now I wanted to move the translation process forward, and worry about
the numeric calculations when we have something running at all. Should I
write tests that skip with a "not implemented yet" message?

Michał
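[The standard workaround for the JVM's missing unsigned types, sketched in
plain Python rather than backend code: hold the value in a wider signed
integer and mask after every operation. An illustration only, not what the
backend currently emits:

    MASK32 = 0xFFFFFFFF

    def uadd32(a, b):
        # 32-bit unsigned addition with wrap-around
        return (a + b) & MASK32

    def ushr32(a, n):
        # logical (unsigned) right shift of a 32-bit value
        return (a & MASK32) >> n

On the Java side the same trick is usually a long plus masking, or the
dedicated >>> operator for shifts.]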
From fijall at gmail.com Sun Jan 15 23:14:23 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 16 Jan 2012 00:14:23 +0200
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To:
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com>
Message-ID:

On Mon, Jan 16, 2012 at 12:00 AM, Michal Bendowski wrote:
> They also miss implementations. Because the JVM lacks unsigned types,
> the whole problem of translating the RFFI code for rbigint etc. seems
> complex. For now I wanted to move the translation process forward, and
> worry about the numeric calculations when we have something running at
> all. Should I write tests that skip with a "not implemented yet"
> message?

[...]

The tests should at least not fail. I would worry about tests a bit
before actual translation, but that might be just me :)
From michal at bendowski.pl Mon Jan 16 01:00:56 2012
From: michal at bendowski.pl (Michał Bendowski)
Date: Mon, 16 Jan 2012 01:00:56 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To:
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com>
Message-ID: <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com>

On Sunday, 15 January 2012 at 23:14 , Maciej Fijalkowski wrote:
> The tests should at least not fail. I would worry about tests a bit
> before actual translation, but that might be just me :)

[...]

longlong2float and float2longlong turn out to be pretty straightforward
in Java, so I implemented them and added tests. I tried to update the
pull request to include this commit, but that crashed BitBucket :/
Maybe you can just pull it from my repo?

Michał
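[For readers not fluent in JVM APIs, the bit-level reinterpretation being
discussed is equivalent to this plain-Python sketch - an illustration only,
since the real helpers live on the Java side of the backend:

    import struct

    def float2longlong(x):
        # reinterpret the 8 bytes of an IEEE-754 double as a signed 64-bit int
        return struct.unpack('<q', struct.pack('<d', x))[0]

    def longlong2float(n):
        # the inverse reinterpretation
        return struct.unpack('<d', struct.pack('<q', n))[0]
]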
From fijall at gmail.com Mon Jan 16 01:20:15 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 16 Jan 2012 02:20:15 +0200
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com> <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com>
Message-ID:

On Mon, Jan 16, 2012 at 2:00 AM, Michał Bendowski wrote:
> longlong2float and float2longlong turn out to be pretty straightforward
> in Java, so I implemented them and added tests. I tried to update the
> pull request to include this commit, but that crashed BitBucket :/
> Maybe you can just pull it from my repo?

[...]

ok

From dmitrey15 at ukr.net Mon Jan 16 09:47:37 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Mon, 16 Jan 2012 10:47:37 +0200
Subject: [pypy-dev] Some NumPyPy propositions
In-Reply-To:
References: <4F0EAB1E.7020502@ukr.net>
Message-ID: <4F13E429.4050500@ukr.net>

Hi,

On 01/14/2012 08:22 PM, Maciej Fijalkowski wrote:
> Hi Dmitrey.
>
> Let me answer your questions one by one.
>
> On Thu, Jan 12, 2012 at 11:42 AM, Dmitrey wrote:
>> hi all,
>>
>> I would like to make some propositions wrt NumPy port development:
>>
>> 1) It would be nice to have a build and/or install parameter to enable
>> usage of numpypy as numpy, e.g. "setup.py build --numpypy_as_numpy". I
>> know it can be done via some tricks, but an explicit parameter would be
>> better, especially for unexperienced users.
>
> We can trivially have such a package on cheeseshop. It would do: from
> numpypy import *. You can even create your own package called numpy.py
> and put it somewhere on PYTHONPATH to achieve the same result.

As a programmer with essential Python experience I could get it done quite
easily, but I guess many Python/NumPy teachers/students and other Python
newbies will find it easier to use other Python distributions like EPD,
Sage or PythonXY than to struggle with a PyPy installation. Well, it's up
to you, of course.

>> 2) Many soft packages have some basic functionality without a full
>> numpy port, but due to importing issues they cannot yield even that.
>> For example:
>>
>> from numpy import some_func, a_rare_func_unimplemented_in_pypy_yet
>> ...
>> if user_want_this:
>>     some_func()
>> elif user_want_some_rare_possibility:
>>     a_rare_func_unimplemented_in_pypy_yet()
>>
>> It would be nice to have the possibility to install the PyPy NumPy port
>> with all unimplemented functions as stubs, e.g.
>>
>> def flipud(*args, **kw):
>>     raise numpy_absent_exception('flipud is unimplemented yet')
>>
>> (and similar stubs for ndarray and matrix methods)
>
> This is IMO a very bad idea. It pushes back the problem from import
> time to some later time, while effectively not fixing it. I would say
> just "wait a bit" or make your package cooperate with numpypy now, but
> we expect new functionality to appear relatively rapidly.

I haven't said to make this the default behaviour, I merely proposed
having a *possibility* to install numpypy like this.
Then people could estimate the difference in CPython and PyPy speed on an
(initially) limited set of tests and, being impressed, contribute some
code or maybe even financial support to your project to finish the NumPy
and probably furthermore the SciPy port.

>> 3) Last but not least; I'm the author and developer of the openopt
>> suite (openopt.org, with ~200 visitors daily, that is AFAIK about 10%
>> of scipy.org) and of course both openopt and PyPy will essentially
>> increase their users when openopt becomes capable of running with
>> PyPy; yet I see that is still some weeks or months away. I would be
>> glad to make some contributions toward this, but my current financial
>> situation cannot allow me to work for free. If at least basic
>> financial support could be obtained, I guess I could port some missing
>> numpy functions / array methods, maybe furthermore some functions from
>> scipy.optimize or scipy.sparse. My CV and contacts are here:
>> http://openopt.org/Dmitrey .
>
> We indeed have some money, but what we have is relatively little. The
> numpy work done so far was purely volunteer effort and we generally
> select people for paid work who are already core developers or heavy
> contributors. However, the money we have all comes from some 3rd
> parties; pypy as a project does not earn any money - there is
> absolutely nothing that stops you from convincing some other 3rd party
> to do work on PyPy's numpy.

I have already mentioned your project and its progress on my site and
forum (openopt.org); you could publish a forum post with your appeal there
(like you've done on the scipy mailing list) - maybe someone will provide
code contributions, financial support or other assistance.

You promised to describe what is already implemented
(http://morepypy.blogspot.com/2012/01/numpypy-progress-report-running.html?showComment=1326227580054#c281969238585943039),
but that hasn't been done yet. I had proposed that you create an online
table where NumPy functions are split into the following categories:

* already ported
* under development (preferably with the name of the person working on it)
* not started yet

Is it possible to get data like this? That way other possible contributors
(maybe including me) could pick an appropriate function (i.e. one they are
capable of) to contribute, and other people could estimate the progress
made.

Also, is it possible to install a recent numpypy without reinstalling the
whole of PyPy?

Regards, D.

> Cheers,
> fijal

From anto.cuni at gmail.com Mon Jan 16 12:09:36 2012
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Mon, 16 Jan 2012 12:09:36 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com> <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com>
Message-ID: <4F140570.6090108@gmail.com>

Hello Michał,

On 01/16/2012 01:00 AM, Michał Bendowski wrote:
> longlong2float and float2longlong turn out to be pretty straightforward
> in Java, so I implemented them and added tests. I tried to update the
> pull request to include this commit, but that crashed BitBucket :/
> Maybe you can just pull it from my repo?

I reviewed your pull request. A few notes:

1. float2longlong & longlong2float can be implemented in a more direct way
by using Double.doubleToRawLongBits and Double.longBitsToDouble
2. in revision 25d4d323cb5f, you implemented _identityhash of builtin
types by returning hash(self). This is wrong, because as the name
suggests, it needs to return different hashes for objects which are not
identical. For example, look at the following code:

>>> from pypy.rlib import objectmodel
>>> a = 'foo'
>>> b = 'f'; b += 'oo'
>>> a == b
True
>>> a is b
False
>>> objectmodel.compute_identity_hash(a)
192311087
>>> objectmodel.compute_identity_hash(b)
-1955336075

The test should probably try to compute_unique_id of two strings which
are equal but not identical, and check that they are different.

3. could you please transplant your checkins to some branch other than
default, e.g. jvm-improvements? This way I could just merge the pull
request and then run the tests on buildbot, before doing the actual merge
to default.

Apart from this, your patch looks very good :-)

ciao & thanks,
Antonio

From michal at bendowski.pl Mon Jan 16 13:03:18 2012
From: michal at bendowski.pl (Michał Bendowski)
Date: Mon, 16 Jan 2012 13:03:18 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4F140570.6090108@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com> <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com> <4F140570.6090108@gmail.com>
Message-ID: <4F208B911C3C4C848DE30D4C24820B83@gmail.com>

On Monday, 16 January 2012 at 12:09 , Antonio Cuni wrote:
> 1. float2longlong & longlong2float can be implemented in a more direct
> way by using Double.doubleToRawLongBits and Double.longBitsToDouble

Good point, thanks.

> 2. in revision 25d4d323cb5f, you implemented _identityhash of builtin
> types by returning hash(self). This is wrong [...] The test should
> probably try to compute_unique_id of two strings which are equal but
> not identical, and check that they are different.

I have copied the hash(self) from ootype._instance - didn't consider
subclasses messing with __hash__. Anyway, as compute_identity_hash is
defined as the RPython equivalent of object.__hash__(x), the "stub
implementation" in _builtin_type (and _instance) should just return
object.__hash__(self), am I right?

> 3. could you please transplant your checkins to some branch other than
> default, e.g. jvm-improvements?

Sure. Do you want me to fix the mentioned problems in new commits or
modify the patches using mq? I'm new to Mercurial and don't really know
what the preferred workflow is.

Michał
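[The test Antonio suggests boils down to this, sketched in plain Python
with the rlib helper he names - the surrounding test harness is omitted,
so treat it as an outline:

    from pypy.rlib.objectmodel import compute_unique_id

    def test_unique_id_of_equal_strings():
        a = 'foo'
        b = 'f'; b += 'oo'      # equal to a, but a distinct object
        assert a == b and a is not b
        assert compute_unique_id(a) != compute_unique_id(b)
]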
From anto.cuni at gmail.com Mon Jan 16 13:53:12 2012
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Mon, 16 Jan 2012 13:53:12 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4F208B911C3C4C848DE30D4C24820B83@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com> <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com> <4F140570.6090108@gmail.com> <4F208B911C3C4C848DE30D4C24820B83@gmail.com>
Message-ID: <4F141DB8.3090804@gmail.com>

On 01/16/2012 01:03 PM, Michał Bendowski wrote:
> I have copied the hash(self) from ootype._instance - didn't consider
> subclasses messing with __hash__. Anyway, as compute_identity_hash is
> defined as the RPython equivalent of object.__hash__(x), the "stub
> implementation" in _builtin_type (and _instance) should just return
> object.__hash__(self), am I right?

indeed, I suppose that also _instance._identityhash should be changed,
even if for this particular case it should not change much (because
hash() is implemented in terms of id() for instances).

It *might* play a role for null instances, because as you can see
_null_mixin overrides __hash__. So, if you decide to change it you should
make sure that there are tests which check for the identityhash of null
instances, or write one if it doesn't exist :-).

object.__hash__ is implemented in terms of id(), so for our use case it
doesn't change much. Personally, I think that using id() is better
because it's obvious that the values of two different objects cannot
collide, but using object.__hash__ works too.

> Sure. Do you want me to fix the mentioned problems in new commits or
> modify the patches using mq? I'm new to Mercurial and don't really know
> what the preferred workflow is.

doing more checkins is fine, we do it all the time. Personally, I prefer
a list of commits which shows the errors and their fixes over a list of
commits which are "perfect" but hide the path that led to them.

ciao,
Anto

From drewes at interstice.com Mon Jan 16 19:44:24 2012
From: drewes at interstice.com (Rich Drewes)
Date: Mon, 16 Jan 2012 10:44:24 -0800
Subject: [pypy-dev] GC error
Message-ID: <4F147008.6040208@interstice.com>

Hello all,

Great work on pypy! I've had good luck with pypy generally, but on a
program that loads a very large data set I am getting a GC related
exception:

----
loading reads, on record 25000000
RPython traceback:
  File "translator_goal_targetpypystandalone.c", line 888, in entry_point
  File "interpreter_function.c", line 876, in funccall__star_1
  File "interpreter_function.c", line 905, in funccall__star_1
  File "rpython_memory_gc_minimark.c", line 2490, in MiniMarkGC_collect_and_reserve
  File "rpython_memory_gc_minimark.c", line 2193, in MiniMarkGC_minor_collection
  File "rpython_memory_gc_minimark.c", line 4535, in MiniMarkGC_collect_oldrefs_to_nursery
  File "rpython_memory_gc_base.c", line 1761, in trace___trace_drag_out
  File "rpython_memory_gc_minimarkpage.c", line 214, in ArenaCollection_malloc
  File "rpython_memory_gc_minimarkpage.c", line 536, in ArenaCollection_allocate_new_page
  File "rpython_memory_gc_minimarkpage.c", line 735, in ArenaCollection_allocate_new_arena
Fatal RPython error: MemoryError
Aborted
----

This is pypy 1.7.0 from ppa.launchpad.net on x86_64 Ubuntu 11.10. The
data being loaded exceeds the size of physical memory, but there is
plenty of swap space.

The same program works with cpython.

I wanted to try to force pypy to use a different GC, but couldn't figure
out how to do that yet. Apparently you can't select the GC from the pypy
command line, and my efforts to use translate.py with an alternate GC
haven't worked so far.

Any suggestions appreciated.

Rich
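[For the archives: the collector is indeed a translation-time choice, not
a runtime flag. A hypothetical translate.py invocation - the option
spelling may differ between releases, so check translate.py --help:

    cd pypy/translator/goal
    python translate.py --opt=2 --gc=semispace targetpypystandalone.py
]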
From michal at bendowski.pl Mon Jan 16 22:38:40 2012
From: michal at bendowski.pl (Michał Bendowski)
Date: Mon, 16 Jan 2012 22:38:40 +0100
Subject: [pypy-dev] Work on the JVM backend
In-Reply-To: <4F141DB8.3090804@gmail.com>
References: <4DB0241978A5469C832E6314C81780EF@gmail.com> <4F104793.3060405@gmail.com> <4503105A35FC4DD88E354CB63EC4E381@gmail.com> <8DE497BA26E043B78A00828550A9AF89@gmail.com> <0FD16011DDA145DF95B29D3A6592F7A1@gmail.com> <4F140570.6090108@gmail.com> <4F208B911C3C4C848DE30D4C24820B83@gmail.com> <4F141DB8.3090804@gmail.com>
Message-ID:

On Monday, 16 January 2012 at 13:53 , Antonio Cuni wrote:
> indeed, I suppose that also _instance._identityhash should be changed
> [...]
> object.__hash__ is implemented in terms of id(), so for our use case it
> doesn't change much. Personally, I think that using id() is better
> because it's obvious that the values of two different objects cannot
> collide, but using object.__hash__ works too.

I changed it to object.__hash__(self). It also occurred to me that
calling compute_unique_id here would probably make sense, as it is the
"stub implementation" already - am I right? Or would that be mixing PyPy
levels?

As for null instances, lltype.identityhash (reused by ootype) suggests
that nulls are not a valid input for identity_hash.

> doing more checkins is fine, we do it all the time. [...]

You can find the fixes in bitbucket.org/benol/pypy in branch
jvm-improvements.

Michał

From romain.py at gmail.com Mon Jan 16 23:02:26 2012
From: romain.py at gmail.com (Romain Guillebert)
Date: Mon, 16 Jan 2012 23:02:26 +0100
Subject: [pypy-dev] GC error
In-Reply-To: <4F147008.6040208@interstice.com>
References: <4F147008.6040208@interstice.com>
Message-ID:

Hi

PyPy may use more memory than cpython because of the JIT; can you try
without the JIT (by passing --jit off to the interpreter)?

Cheers
Romain

On Mon, Jan 16, 2012 at 7:44 PM, Rich Drewes wrote:
[... full quote of the original report snipped ...]
From drewes at interstice.com Tue Jan 17 00:26:50 2012
From: drewes at interstice.com (Rich Drewes)
Date: Mon, 16 Jan 2012 15:26:50 -0800
Subject: [pypy-dev] GC error
In-Reply-To:
References: <4F147008.6040208@interstice.com>
Message-ID: <4F14B23A.6040203@interstice.com>

On 01/16/2012 02:02 PM, Romain Guillebert wrote:
> PyPy may use more memory than cpython because of the JIT; can you try
> without the JIT (by passing --jit off to the interpreter)?

When I run with --jit off it gives a MemoryError at about the same point
in the run, but with no exception trace:

----
drewes at ladastra:/home/drewes/Desktop/forister2$ pypy --jit off grouper2.py --pass2 p5parsed-TSU6.txt
reloading reads . . .
loading reads, on record 1000000
...
loading reads, on record 22000000
loading reads, on record 23000000
MemoryError
----

On another run with --jit off it failed like this:

----
...
loading reads, on record 25000000
loading reads, on record 26000000
Traceback (most recent call last):
  File "app_main.py", line 51, in run_toplevel
  File "grouper2.py", line 993, in
    reads, idquals, l3s, quals=method2pass2(argv[1], matchlist=matchlist, maxrec=maxrec, fastq=True)
  File "grouper2.py", line 217, in method2pass2
    reads.append(l2.rstrip())
MemoryError
----

So, the exact point of the error varies. Is there any part of pypy that
*requires* physical memory and cannot use swap? That seems unlikely to me,
but it is the only guess I can come up with. At the point of failure the
program running with pypy is consuming only about 3.7GB resident memory
and 3.7GB swap (shown in 'top') and there is plenty of swap free. There is
8GB system RAM but there are other things running on the machine too.

When I run my program with cpython, it consumes all available physical
memory and then the swap space (as shown in 'top') grows up to about 17GB.
The program runs to completion. When I run my program with pypy on default
settings with JIT enabled, it throws that exception shown in my last email
before it uses much swap space at all (using only about 4GB swap with
>30GB swap still free). So when pypy throws the memory exception it is
using considerably *less* memory than the cpython version uses at its
peak. The pypy version fails when it has loaded less than half the data
and is using less than a quarter of the swap space that the cpython
program uses without a problem.

[BTW, I didn't find the pypy-issues list when I sent my initial query, and
this list may not be the right place for this, but thank you for
responding.]

Rich

From timo at wakelift.de Tue Jan 17 06:59:31 2012
From: timo at wakelift.de (Timo Paulssen)
Date: Tue, 17 Jan 2012 06:59:31 +0100
Subject: [pypy-dev] GC error
In-Reply-To: <4F14B23A.6040203@interstice.com>
References: <4F147008.6040208@interstice.com> <4F14B23A.6040203@interstice.com>
Message-ID: <4F150E43.9060904@wakelift.de>

Hello,

PyPy has a hard limit on its heap size. It can be specified with this
environment variable:

 PYPY_GC_MAX    The max heap size. If coming near this limit, it will
                first collect more often, then raise an RPython
                MemoryError, and if that is not enough, crash the
                program with a fatal error. Try values like '1.6GB'.

Check out the rest of the variables for the GC in
pypy/rpython/memory/gc/minimark.py

In general, the pypy GC isn't optimised to know what pages are in memory
or in swap, so each major collection will cause every single page to be
touched. If a large part of your pypy process does not fit into physical
memory, you will probably experience very significant slowdowns.

Cheers
  - Timo
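[Putting the suggestions from this thread together, a concrete invocation
for the failing program might look like this - the 20GB value is only an
example, to be sized to the machine:

    PYPY_GC_MAX=20GB pypy --jit off grouper2.py --pass2 p5parsed-TSU6.txt
]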
The program runs to completion.

When I run my program with pypy on default settings with JIT enabled, it throws that exception shown in my last email before it uses much swap space at all (using only about 4GB swap with >30GB swap still free). So when pypy throws the memory exception it is using considerably *less* memory than the cpython version uses at its peak. The pypy version fails when it has loaded less than half the data and is using less than a quarter of the swap space that the cpython program uses without a problem.

[BTW, I didn't find the pypy-issues list when I sent my initial query, and this list may not be the right place for this, but thank you for responding.]

Rich

From timo at wakelift.de  Tue Jan 17 06:59:31 2012
From: timo at wakelift.de (Timo Paulssen)
Date: Tue, 17 Jan 2012 06:59:31 +0100
Subject: [pypy-dev] GC error

Hello,

PyPy has a hard limit on its heap size. It can be specified with this environment variable:

  PYPY_GC_MAX            The max heap size.  If coming near this limit, it
                         will first collect more often, then raise an
                         RPython MemoryError, and if that is not enough,
                         crash the program with a fatal error.  Try values
                         like '1.6GB'.

Check out the rest of the variables for the GC in pypy/rpython/memory/gc/minimark.py

In general, the pypy GC isn't optimised to know what pages are in memory or in swap, so each major collection will cause every single page to be touched. If a large part of your pypy process will not fit into the swap, you will probably experience very significant slowdowns.

Cheers
  - Timo
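To make Timo's advice concrete: these are plain environment variables read at startup, so a bounded-heap run of the program from this thread would look something like the following (a sketch only; the 1.6GB value is just the example from minimark.py and would need tuning for this workload):

    $ PYPY_GC_MAX=1.6GB pypy grouper2.py --pass2 p5parsed-TSU6.txt

Setting the variable inline like this keeps the limit scoped to that one run rather than the whole shell session.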
From anto.cuni at gmail.com  Tue Jan 17 10:16:53 2012
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Tue, 17 Jan 2012 10:16:53 +0100
Subject: [pypy-dev] Work on the JVM backend

On 01/16/2012 10:38 PM, Michał Bendowski wrote:
> You can find the fixes in bitbucket.org/benol/pypy in branch jvm-improvements.

I pushed your changes to the main repo, thank you :-). I also started a test run to check that nothing unrelated breaks:
http://buildbot.pypy.org/builders/own-linux-x86-32/builds/1928

if the tests are ok, I'll merge the branch into default.

ciao,
Anto

From bokr at oz.net  Tue Jan 17 17:10:17 2012
From: bokr at oz.net (Bengt Richter)
Date: Tue, 17 Jan 2012 08:10:17 -0800
Subject: [pypy-dev] GC error

On 01/16/2012 03:26 PM Rich Drewes wrote:
[...]
> When I run my program with pypy on default settings with JIT enabled, it throws that exception shown in my last email before it uses much swap space at all (using only about 4GB swap with >30GB swap still free).
                                                        ^^^
Could that 4GB correspond to a 32-bit representation of size somewhere that needs to be bigger?

Regards,
Bengt Richter

From fijall at gmail.com  Tue Jan 17 17:19:27 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 17 Jan 2012 18:19:27 +0200
Subject: [pypy-dev] GC error

On Mon, Jan 16, 2012 at 8:44 PM, Rich Drewes wrote:
> [Rich's original report and RPython traceback, quoted in full above]

Hi Richard.

I don't quite know how you got the MemoryError, however using swap with python (and pypy) is a very bad idea. Each time you have a garbage collection cycle, it has to walk over all pages that are addressed by the process, which means reading and writing to all the pages that are swapped. This makes the program essentially not do anything any more except reading and writing to the HD and as such, you're very unlikely to achieve anything.

Cheers,
fijal

From fijall at gmail.com  Tue Jan 17 17:21:57 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 17 Jan 2012 18:21:57 +0200
Subject: [pypy-dev] Some NumPyPy propositions

On Mon, Jan 16, 2012 at 10:47 AM, Dmitrey wrote:
> Hi,
>
> On 01/14/2012 08:22 PM, Maciej Fijalkowski wrote:
>> Hi Dmitrey.
>>
>> Let me answer your questions one by one.
>>
>> On Thu, Jan 12, 2012 at 11:42 AM, Dmitrey wrote:
>>> hi all,
>>>
>>> I would like to make some propositions wrt the NumPy port development:
>>>
>>> 1) It would be nice to have a build and/or install parameter to enable usage of numpypy as numpy, e.g. "setup.py build --numpypy_as_numpy". I know it can be done via some tricks, but an explicit parameter would be better, especially for unexperienced users.
>>
>> We can trivially have such a package on cheeseshop. It would do: from numpypy import *. You can even create your own package called numpy.py and put it somewhere on PYTHONPATH to achieve the same result.
>
> As a programmer with essential Python experience I could get it done quite easily, but I guess many Python/NumPy teachers/students and other Python newbies will find it easier to use other Python distributions like EPD, Sage or PythonXY than to work through a PyPy installation. Well, it's up to you, of course.

The thing is, numpy in pypy as of now is not really ready for such people, because it's unfinished. We'll think about what to do once it's more finished, which should not take that long.
>>> 2) Many software packages have some basic functionality without a full numpy port, but due to importing issues they cannot provide even that. For example:
>>>
>>> from numpy import some_func, a_rare_func_unimplemented_in_pypy_yet
>>> ...
>>> if user_want_this:
>>>     some_func()
>>> elif user_want_some_rare_possibility:
>>>     a_rare_func_unimplemented_in_pypy_yet()
>>>
>>> It would be nice to have the possibility to install the PyPy NumPy port with all unimplemented-yet functions as stubs, e.g.
>>>
>>> def flipud(*args, **kw):
>>>     raise numpy_absent_exception('flipud is unimplemented yet')
>>>
>>> (and similar stubs for ndarray and matrix methods)
>>
>> This is IMO a very bad idea. It pushes back the problem from import time to some later time, while effectively not fixing it. I would say just "wait a bit" or make your package cooperate with numpypy now, but we expect new functionality to appear relatively rapidly.
>
> I haven't said to make this the default behaviour, I merely proposed to have a *possibility* of installing numpypy like this. Then people could estimate the difference in CPython and PyPy speed on an (initially) limited set of tests and, being impressed, contribute some code or maybe even financial support to your project to accomplish the NumPy and probably furthermore the SciPy port.

I think people who are willing to contribute are probably also willing to live with the fact that not everything works. I might be wrong of course.

>>> 3) Last but not least: I'm the author and developer of the openopt suite (openopt.org, with ~200 visitors daily, that is AFAIK about 10% of scipy.org) and of course both openopt and PyPy will essentially increase their users when openopt becomes capable of running with PyPy; yet I see this is still some weeks or months away. I would be glad to make some contributions toward this, but my current financial situation cannot allow me to work for free. If at least basic financial support could be obtained, I guess I could port some missing numpy functions / array methods, maybe furthermore some functions from scipy.optimize or scipy.sparse. My CV and contacts are here: http://openopt.org/Dmitrey .
>>
>> We indeed have some money, but what we have is relatively little. The numpy work done so far was purely volunteer effort and we generally select people for doing paid jobs who are already core developers or heavy contributors. However, the money we have all comes from some 3rd parties; pypy as a project does not earn any money -- there is absolutely nothing that stops you from convincing some other 3rd party to fund work on PyPy's numpy.
>
> I had already mentioned your project and related progress on my site and forum (openopt.org); you could publish a forum post with your appeal there (like you've done on the scipy mailing list), maybe someone will provide some code contributions to your project, financial support or other assistance.
>
> You promised to describe what is already implemented (http://morepypy.blogspot.com/2012/01/numpypy-progress-report-running.html?showComment=1326227580054#c281969238585943039), but it's not done yet. I had proposed you create an online table where NumPy functions are split into the following categories:
> * already ported
> * under development (preferably with the name of the person who works on it)
> * not started yet
> Is it possible to get data like this? This way other possible contributors (maybe including me) could select an appropriate (i.e. one they are capable of) function for contribution, and other people could estimate the progress done.

Yes, we're working on such a thing. The temporary location is at https://bitbucket.org/fijal/hack2/src/default/numready, we'll probably create a nightly static HTML.

> Also, is it possible to install recent numpypy without reinstallation of the whole PyPy?

Nope.
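For anyone who wants to try the numpy.py trick Maciej mentions above, the whole shim is one line (a sketch -- the only requirements are the file name and that it sits on PYTHONPATH ahead of any real numpy):

    # numpy.py -- makes "import numpy" resolve to PyPy's numpypy.
    from numpypy import *

Note that names numpypy does not implement yet simply won't exist in the shim, so "from numpy import a_rare_func_unimplemented_in_pypy_yet" still fails at import time with an ImportError, while attribute-style access (numpy.a_rare_func...) fails at lookup time -- exactly the trade-off debated in point 2 above.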
From pachalmars at gmail.com  Tue Jan 17 19:16:33 2012
From: pachalmars at gmail.com (Arnaud F)
Date: Tue, 17 Jan 2012 19:16:33 +0100
Subject: [pypy-dev] Tkinter-pypy not building on windows

Hi all,

When trying to build tkinter from source, I get the following errors:

src/_tkinter.c(33) : warning C4273: 'PyOS_InputHook' : inconsistent dll linkage
        src/_tkinter.c(32) : see previous definition of 'PyOS_InputHook'
src/_tkinter.c(673) : warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
src/_tkinter.c(709) : warning C4996: 'strcat': This function or variable may be unsafe. Consider using strcat_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
src/_tkinter.c(712) : warning C4996: 'strcat': (same warning as above)
src/_tkinter.c(713) : warning C4996: 'strcat': (same warning as above)
src/_tkinter.c(714) : warning C4996: 'strcat': (same warning as above)
src/_tkinter.c(2113) : warning C4133: 'function' : incompatible types - from 'PythonCmd_ClientData *' to 'char *'
src/_tkinter.c(2198) : warning C4090: 'function' : different 'const' qualifiers
src/_tkinter.c(2198) : warning C4028: formal parameter 4 different from declaration
src/_tkinter.c(2204) : warning C4133: 'function' : incompatible types - from 'PythonCmd_ClientData *' to 'char *'
src/_tkinter.c(3133) : error C2065: 'tcl_lock' : undeclared identifier
src/_tkinter.c(3133) : error C2065: 'tcl_lock' : undeclared identifier
src/_tkinter.c(3133) : warning C4022: 'PyThread_acquire_lock' : pointer mismatch for actual parameter 1
src/_tkinter.c(3134) : error C2065: 'tcl_tstate' : undeclared identifier
src/_tkinter.c(3134) : error C2065: 'event_tstate' : undeclared identifier
src/_tkinter.c(3138) : error C2065: 'tcl_tstate' : undeclared identifier
src/_tkinter.c(3138) : warning C4047: '=' : 'int' differs in levels of indirection from 'void *'
src/_tkinter.c(3139) : error C2065: 'tcl_lock' : undeclared identifier
src/_tkinter.c(3139) : error C2065: 'tcl_lock' : undeclared identifier
src/_tkinter.c(3139) : warning C4022: 'PyThread_release_lock' : pointer mismatch for actual parameter 1
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\cl.exe' failed with exit status 2

Since tcl_lock is defined in a '#ifdef WITH_THREAD' section and the error happens in a "#if defined(WITH_THREAD) || defined(MS_WINDOWS)" section, I tried with the WITH_THREAD macro defined and got new errors:

src/_tkinter.c(33) : warning C4273: 'PyOS_InputHook' : inconsistent dll linkage
        src/_tkinter.c(32) : see previous definition of 'PyOS_InputHook'
src/_tkinter.c(673) : warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
src/_tkinter.c(709) : warning C4996: 'strcat': (same warning as above, suggesting strcat_s)
src/_tkinter.c(712) : warning C4996: 'strcat': (same warning as above)
src/_tkinter.c(713) : warning C4996: 'strcat': (same warning as above)
src/_tkinter.c(714) : warning C4996: 'strcat': (same warning as above)
src/_tkinter.c(1367) : error C2065: 'call_mutex' : undeclared identifier
src/_tkinter.c(1367) : warning C4047: 'function' : 'Tcl_Mutex *' differs in levels of indirection from 'int *'
src/_tkinter.c(1367) : warning C4024: 'Tkapp_ThreadSend' : different types for formal and actual parameter 4
src/_tkinter.c(1642) : error C2065: 'var_mutex' : undeclared identifier
src/_tkinter.c(1642) : warning C4047: 'function' : 'Tcl_Mutex *' differs in levels of indirection from 'int *'
src/_tkinter.c(1642) : warning C4024: 'Tkapp_ThreadSend' : different types for formal and actual parameter 4
src/_tkinter.c(2113) : warning C4133: 'function' : incompatible types - from 'PythonCmd_ClientData *' to 'char *'
src/_tkinter.c(2138) : warning C4090: 'function' : different 'const' qualifiers
src/_tkinter.c(2138) : warning C4028: formal parameter 4 different from declaration
src/_tkinter.c(2190) : error C2065: 'command_mutex' : undeclared identifier
src/_tkinter.c(2190) : warning C4047: 'function' : 'Tcl_Mutex *' differs in levels of indirection from 'int *'
src/_tkinter.c(2190) : warning C4024: 'Tkapp_ThreadSend' : different types for formal and actual parameter 4
src/_tkinter.c(2198) : warning C4090: 'function' : different 'const' qualifiers
src/_tkinter.c(2198) : warning C4028: formal parameter 4 different from declaration
src/_tkinter.c(2204) : warning C4133: 'function' : incompatible types - from 'PythonCmd_ClientData *' to 'char *'
src/_tkinter.c(2236) : error C2065: 'command_mutex' : undeclared identifier
src/_tkinter.c(2236) : warning C4047: 'function' : 'Tcl_Mutex *' differs in levels of indirection from 'int *'
src/_tkinter.c(2236) : warning C4024: 'Tkapp_ThreadSend' : different types for formal and actual parameter 4
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\cl.exe' failed with exit status 2

Tested on windows 7 64bits with visual studio 2008 (32 bits) and pypy 1.7.

Thanks,
Arnaud

From dmitrey15 at ukr.net  Tue Jan 17 19:22:37 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Tue, 17 Jan 2012 20:22:37 +0200
Subject: [pypy-dev] Some NumPyPy propositions

Hi,

> I had proposed you create an online table where NumPy functions are split into the following categories:
> * already ported
> * under development (preferably with the name of the person who works on it)
> * not started yet
> Is it possible to get data like this?

>> Yes, we're working on such a thing. The temporary location is at https://bitbucket.org/fijal/hack2/src/default/numready, we'll probably create a nightly static HTML.
Well, I just got this:

  File "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", line 1518, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", line 1506, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", line 1504, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", line 1264, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", line 1262, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", line 1248, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/dmitrey/Install/hack2/numready/numready.py", line 20, in index
    pypy = read_all_numpy_funcs(sys.executable, 'numpypy')
  File "/home/dmitrey/Install/hack2/numready/_numready/process.py", line 21, in read_all_numpy_funcs
    assert not err
AssertionError

Well, it doesn't matter essentially; from the code I just see that it parses dir(numpypy) and doesn't provide information on the funcs being worked on, thus maybe my own efforts to port a func will just be a waste of time, because someone else will get it done before me.

So, I've tried today's nightly build; the functions ravel, flatten and where, which are extremely important (they occur very often), are unimplemented yet.

The extremely important function dot for matrix multiplication has very strange behavior:

>>>> np.dot(np.ones((2,4)), np.array([1,2,3,4]))
20.0

CPython result:

>>> np.dot(np.ones((2,4)), np.array([1,2,3,4]))
array([ 10.,  10.])

I haven't tested any further; it is already enough to make a sad conclusion about current numpypy progress and quality. In either way, I would like to have the possibility to contribute some funcs; what should I do? Can I obtain edit rights (are you using git, or something else, like Mercurial or svn)?

Also, I would like to stay tuned or even participate in the numpypy discussion, but the mailing list already has a high flow wrt JIT backends and other info that is of no interest to me; I guess reading it is only a waste of time for many other PyPy-numpy users. Is it possible to create another mailing list, e.g. PyPy-Math, for discussion of the numpy and scipy ports and other related info?

-----------------------
Regards, D.
http://openopt.org/Dmitrey
they are > capable of) function for contribution, and other people could estimate the > progress done. > > >> Yes, we're working on such thing. The temporary location is at >> https://bitbucket.org/fijal/hack2/src/default/numready, we'll probably >> create a nightly static HTML. > > Well, I just got this: > > File > "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", > line 1518, in __call__ > > return self.wsgi_app(environ, start_response) > > File > "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", > line 1506, in wsgi_app > > response = self.make_response(self.handle_exception(e)) > > File > "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", > line 1504, in wsgi_app > > response = self.full_dispatch_request() > > File > "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", > line 1264, in full_dispatch_request > > rv = self.handle_user_exception(e) > > File > "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", > line 1262, in full_dispatch_request > > rv = self.dispatch_request() > > File > "/usr/local/lib/python2.7/dist-packages/Flask-0.8-py2.7.egg/flask/app.py", > line 1248, in dispatch_request > > return self.view_functions[rule.endpoint](**req.view_args) > > File "/home/dmitrey/Install/hack2/numready/numready.py", line 20, in index > > pypy = read_all_numpy_funcs(sys.executable, 'numpypy') > > File "/home/dmitrey/Install/hack2/numready/_numready/process.py", line 21, > in read_all_numpy_funcs > > assert not err > > AssertionError > > Well, it doesn't matter essentially, from the code I just see it parses > dir(numpypy) and doesn't provide information on the funcs been worked on, > thus maybe my own efforts to port a func will be just a waste of time, > because someone other will make it done before me. I guess the main point why it does that is that it can be fully automatic. It's impossible to keep track manually marking those functions. > > So, I've tried today's nightly build; > functions ravel, flatten and where, that are extremely important (they occur > very often) are unimplemented yet; that's correct, also noone is working on them at the moment. > > Extremely important function dot for matrix multiplication has very strange > behavior: > >>>>> np.dot(np.ones((2,4)),np.array([1,2,3,4])) > 20.0 > CPython result: >>>> np.dot(np.ones((2,4)),np.array([1,2,3,4])) > array([ 10.,? 10.]) that's indeed buggy. > > I haven't tested any further, it already enough to make a sad conclusion > about current numpypy progress and quality. Not sure how you make conclusions about quality, but noone ever said that the entire numpy is implemented. > In either way, I would like to > have possibility to contribute some funcs, what should I do? Can I obtain > git edit rights (or you're using something else, like mercury or svn?) We use mercurial. Either create a branch (like a fork on bitbucket) or create a bug tracker issues (bugs.pypy.org) with a diff. Right now work is only being done on indexing by arrays. > Also, I would like to stay tuned or even participate in numpypy discussion, > but mail list already has high flow wrt JIT backends and other info that has > no interest to me, I guess reading it is only a waste of time for many other > PyPy-numpy users. Is it possible to create other mail list, e.g. PyPy-Math, > for discussion of numpy, scipy ports and other related info? 
From dmitrey15 at ukr.net  Tue Jan 17 19:57:21 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Tue, 17 Jan 2012 20:57:21 +0200
Subject: [pypy-dev] Some NumPyPy propositions

On 01/17/2012 08:35 PM, Maciej Fijalkowski wrote:
> On Tue, Jan 17, 2012 at 8:22 PM, Dmitrey wrote:
>> Well, it doesn't matter essentially; from the code I just see that it parses dir(numpypy) and doesn't provide information on the funcs being worked on, thus maybe my own efforts to port a func will just be a waste of time, because someone else will get it done before me.
> I guess the main point why it does that is that it can be fully automatic. It's impossible to keep track by manually marking those functions.

There are lots of free online tables and spreadsheets, e.g. google or my favorite zoho.com, with many other convenient features. A NumPyPy state info datasheet could be created with the 1st column for implemented funcs and the next several columns for triples like (funcname, person_working_on_it, estimated_finish_date), or something like that, filled in by the persons involved. Potential NumPyPy users or contributors could review the current state of numpypy development without having to download and install each latest nightly pypy build and, possibly, propose their own contributions or financial support to implement funcs they need and still miss. I guess it's not so difficult to create and maintain, but, of course, you could select any other approach.

-----------------------
Regards, D.
http://openopt.org/Dmitrey

From amauryfa at gmail.com  Tue Jan 17 20:44:30 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 17 Jan 2012 20:44:30 +0100
Subject: [pypy-dev] Tkinter-pypy not building on windows

Hi,

2012/1/17 Arnaud F
> When trying to build tkinter from source, I get the following errors:
[...]
> src/_tkinter.c(3133) : error C2065: 'tcl_lock' : undeclared identifier

Which version of tcl are you using? Are you sure it was compiled with threads?
There is a comment about this at the beginning of _tkinter.c:

/* If Tcl is compiled for threads, we must also define TCL_THREAD. We define
   it always; if Tcl is not threaded, the thread functions in
   Tcl are empty. */

But compilation with a thread-less TCL is probably not so well supported.

-- 
Amaury Forgeot d'Arc

From drewes at interstice.com  Tue Jan 17 23:30:00 2012
From: drewes at interstice.com (Rich Drewes)
Date: Tue, 17 Jan 2012 14:30:00 -0800
Subject: [pypy-dev] GC error

On 01/16/2012 09:59 PM, Timo Paulssen wrote:
> Hello,
>
> PyPy has a hard limit on its heap size. It can be specified with this environment variable:
>
>   PYPY_GC_MAX            The max heap size.  If coming near this limit, it will first collect more often, then raise an RPython MemoryError, and if that is not enough, crash the program with a fatal error.  Try values like '1.6GB'.

Thanks for the suggestion.
I tried this, and it did not seem to change the point of failure, though the message looks a bit different:

----
drewes at ladastra:/home/drewes$ set | grep PYPY
PYPY_GC_MAX=6GB
drewes at ladastra:/home/drewes$ pypy grouper2.py --pass2 p5parsed-TSU6.txt

reloading reads . . .
loading reads, on record 1000000 ...
...
loading reads, on record 14000000

Traceback (most recent call last):
  File "app_main.py", line 51, in run_toplevel
  File "grouper2.py", line 982, in <module>
Using too much memory, aborting
Aborted
----

According to 'top', the program is not actually using anywhere near 6GB when the failure occurs. It is only using about 2GB when it fails.

> Check out the rest of the variables for the GC in pypy/rpython/memory/gc/minimark.py

I will do that.

> In general, the pypy GC isn't optimised to know what pages are in memory or in swap, so each major collection will cause every single page to be touched. If a large part of your pypy process will not fit into the swap, you will probably experience very significant slowdowns.

That is good information, thanks.

Rich

From fijall at gmail.com  Tue Jan 17 23:50:04 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 18 Jan 2012 00:50:04 +0200
Subject: [pypy-dev] GC error

On Wed, Jan 18, 2012 at 12:30 AM, Rich Drewes wrote:
> [...]
> According to 'top', the program is not actually using anywhere near 6GB when the failure occurs. It is only using about 2GB when it fails.

That means you're running a 32bit program in a 64bit environment.
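A quick way to confirm that from the interpreter itself (a sketch; struct.calcsize('P') is the size of a pointer in bytes):

    import struct, sys
    # A 32-bit build prints 32 here and has sys.maxint == 2**31 - 1,
    # which caps the usable address space at ~4GB (2-3GB of heap in
    # practice) no matter what PYPY_GC_MAX is set to.
    print(struct.calcsize('P') * 8)
    print(sys.maxint)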
From drewes at interstice.com  Wed Jan 18 05:07:26 2012
From: drewes at interstice.com (Rich Drewes)
Date: Tue, 17 Jan 2012 20:07:26 -0800
Subject: [pypy-dev] GC error

On 01/17/2012 02:50 PM, Maciej Fijalkowski wrote:
> That means you're running a 32bit program in a 64bit environment.

Yup, that was it. For some reason the package from the launchpad ppa that was pulled in was 32 bit.

Rich

From dmitrey15 at ukr.net  Wed Jan 18 11:27:17 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Wed, 18 Jan 2012 12:27:17 +0200
Subject: [pypy-dev] Some NumPyPy propositions

Hi,

On 01/17/2012 08:57 PM, Dmitrey wrote:
> [the online-table/spreadsheet proposal, quoted in full above]

I have compared the contents of numpy and numpypy: numpy has 551 entries (and some of them are modules with lots of other funcs, like linalg or fft), while numpypy has only 121 entries -- thus, I guess, numpypy is still too far from a full numpy port. Well, some numpy funcs are used very rarely and thus can be omitted for now.
I have reviewed the difference between the modules, and IMHO currently these funcs are the most important:

array_equal nanargmax hstack diag nanmin sinh atleast_1d asscalar eye zeros_like logical_or tile cosh empty_like array_equiv asfarray nanargmin asarray log2 vstack logical_xor nansum rot90 copy savetxt logical_not ceil median where isfinite isnan diff tanh cross flipud isscalar insert logical_and nanmax ones_like log log10 arccosh isinf

My skills and current possibilities allow me to work only on a limited subset of the funcs mentioned above; if no one minds, during the next several days I intend to work on the following funcs -- maybe very simple to port, but very important:

array_equal diag asscalar eye zeros_like empty_like array_equiv flipud isscalar ones_like

BTW you mentioned that [nan][arg]min/max works on a nightly build (except for the axis parameter) (http://morepypy.blogspot.com/2012/01/numpypy-progress-report-running.html?showComment=1326227233044#c7372676427534170441), but I didn't see them at all, nor the numpypy.nan number. Maybe they are present in a mercurial branch that was forgotten to be committed to the main trunk?

As for the table, if you think it's hard to create and maintain, I could create it myself, fill it with the currently done functions, and share edit permission with you, and you would share it with other volunteers, to keep them and other people informed about the numpy funcs that are done and under development. I just thought you would prefer to be the creator and single owner of the table, and that it wouldn't take much time to do; that's why I had proposed it be done by you. Also, maybe it would be a good idea to provide a link to the table from the main pypy.org webpage.

-----------------------
Regards, D.
http://openopt.org/Dmitrey
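For a sense of scale: several functions on that short list really are a few lines on top of what numpypy already ships. An app-level sketch (assumptions: numpypy's zeros/ones accept tuple shapes and arrays support scalar item assignment, as snippets elsewhere in this thread suggest; real numpy versions would also preserve dtype):

    import numpypy as np

    def zeros_like(a):
        # Simplified: same shape as `a`, default float dtype.
        return np.zeros(a.shape)

    def ones_like(a):
        return np.ones(a.shape)

    def eye(n):
        # Identity matrix built with plain integer indexing.
        e = np.zeros((n, n))
        for i in range(n):
            e[i, i] = 1.0
        return e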
From p.j.a.cock at googlemail.com  Wed Jan 18 11:42:21 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 18 Jan 2012 10:42:21 +0000
Subject: [pypy-dev] Some NumPyPy propositions

On Wed, Jan 18, 2012 at 10:27 AM, Dmitrey wrote:
> Hi,
>
> I have compared the contents of numpy and numpypy: numpy has 551 entries (and some of them are modules with lots of other funcs, like linalg or fft), while numpypy has only 121 entries -- thus, I guess, numpypy is still too far from a full numpy port. Well, some numpy funcs are used very rarely and thus can be omitted for now.

Absolutely - for now numpypy deserves its earlier name of micronumpy ;)

> I have reviewed the difference between the modules, and IMHO currently these funcs are the most important:
>
> array_equal nanargmax hstack diag nanmin sinh atleast_1d asscalar eye zeros_like logical_or tile cosh empty_like array_equiv asfarray nanargmin asarray log2 vstack logical_xor nansum rot90 copy savetxt logical_not ceil median where isfinite isnan diff tanh cross flipud isscalar insert logical_and nanmax ones_like log log10 arccosh isinf

Some of those are certainly important - but I think every NumPy user would give you a different list. You're missing many of the key functions I use, e.g. https://bugs.pypy.org/issue913

> My skills and current possibilities allow me to work only on a limited subset of the funcs mentioned above; if no one minds, during the next several days I intend to work on the following funcs -- maybe very simple to port, but very important:
>
> array_equal diag asscalar eye zeros_like empty_like array_equiv flipud isscalar ones_like

If they are simple and easy to implement, they'll be useful to somebody :)

> BTW you mentioned that [nan][arg]min/max works on a nightly build (except for the axis parameter) (http://morepypy.blogspot.com/2012/01/numpypy-progress-report-running.html?showComment=1326227233044#c7372676427534170441), but I didn't see them at all, nor the numpypy.nan number. Maybe they are present in a mercurial branch that was forgotten to be committed to the main trunk?

If they are fixed, please make a note of this on https://bugs.pypy.org/issue913

> As for the table, if you think it's hard to create and maintain, I could create it myself, fill it with the currently done functions, and share edit permission with you, and you would share it with other volunteers, to keep them and other people informed about the numpy funcs that are done and under development. I just thought you would prefer to be the creator and single owner of the table, and that it wouldn't take much time to do; that's why I had proposed it be done by you. Also, maybe it would be a good idea to provide a link to the table from the main pypy.org webpage.

Could the table perhaps be generated automatically using dir(numpy) and dir(numpypy) and some introspection?

Peter
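In the spirit of Peter's suggestion, the automatic half of such a table is just set arithmetic over dir() (a sketch; run it once under CPython with numpy and once under PyPy with numpypy, then diff the two dumps -- roughly what the numready tool mentioned earlier automates):

    def public_names(mod):
        # Skip dunder clutter; everything else counts as visible API.
        return sorted(n for n in dir(mod) if not n.startswith('_'))

    try:
        import numpypy as mod   # running under PyPy
    except ImportError:
        import numpy as mod     # running under CPython

    for name in public_names(mod):
        print(name)

What dir() cannot tell you is the "under development (by whom)" column, which is the part that would still need manual upkeep on a wiki page or spreadsheet.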
From cfbolz at gmx.de  Wed Jan 18 12:26:39 2012
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Wed, 18 Jan 2012 12:26:39 +0100
Subject: [pypy-dev] Some NumPyPy propositions

On 01/18/2012 11:27 AM, Dmitrey wrote:
> As for the table, if you think it's hard to create and maintain, I could create it myself, fill it with the currently done functions, and share edit permission with you, and you would share it with other volunteers, to keep them and other people informed about the funcs that are done and under development. I just thought you would prefer to be the creator and single owner of the table, and that it wouldn't take much time to do; that's why I had proposed it be done by you. Also, maybe it would be a good idea to provide a link to the table from the main pypy.org webpage.

I was wondering, wouldn't a simple page on PyPy's bitbucket wiki work best?:

https://bitbucket.org/pypy/pypy/wiki/Home

I don't think something fancy like a table is necessary, there aren't really hundreds of people working on Numpypy. If you could start such a page, I think that would be pretty cool.

Thanks for getting involved!

Cheers,

Carl Friedrich

From dmitrey15 at ukr.net  Wed Jan 18 13:18:17 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Wed, 18 Jan 2012 14:18:17 +0200
Subject: [pypy-dev] Some NumPyPy propositions

On 01/18/2012 01:26 PM, Carl Friedrich Bolz wrote:
> I was wondering, wouldn't a simple page on PyPy's bitbucket wiki work best?:
>
> https://bitbucket.org/pypy/pypy/wiki/Home
>
> I don't think something fancy like a table is necessary, there aren't really hundreds of people working on Numpypy. If you could start such a page, I think that would be pretty cool.

I have essential background in editing an ordinary wiki -- it has been the engine of my website for 5+ years -- but I don't understand the link to the wiki you mentioned. Does editing the wiki require changing something in Python files and then committing it to the mercurial repository? In either case, editing the wiki is more complex than editing an online spreadsheet table.

I have created a sample google spreadsheet:
https://docs.google.com/spreadsheet/ccc?key=0Ak7GVY0fCdaidE90aHRTVDdNY1puQkg5LVR1SEs5NGc

There are 2 possible options:
* Allow anyone to edit (no sign-in required)
* Only people explicitly granted permission can access (google sign-up is required)

Which one would you prefer? Currently I set it to the 1st ("allow everyone"). For the 2nd option, users with edit rights can be added recursively. Also, which changes would you recommend be made (if you support the whole idea, of course)? I have some free time to work on it along with some pypy funcs to implement. If for some reason you prefer another person to be the owner of a google (or other engine) table like this -- no problem, let that person create another table and (if required) copy-paste some data from the table I've done.

BTW, I have registered at bugs.pypy.org, but the confirmation email hasn't arrived for several hours, and re-registration doesn't help. Let me inform you about a couple of bugs I found recently:

CPython:
>>> np.array(1).reshape(1,1)
array([[1]])

PyPy:
>>>> np.array(1).reshape(1,1)
RPython traceback:
  File "interpreter_gateway.c", line 554, in BuiltinCode_funcrun_obj
  File "implement_22.c", line 26204, in BuiltinActivation_UwS_BaseArray_ObjSpace_args_w_
  File "module_micronumpy_interp_numarray.c", line 51133, in BaseArray_descr_reshape
Fatal RPython error: NotImplementedError
Aborted

CPython:
>>> np.array([1,2,3],'float')
array([ 1.,  2.,  3.])

PyPy:
>>>> np.array([1,2,3],'float')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: data type not understood

(without quotes it works fine)

-----------------------
Regards, D.
http://openopt.org/Dmitrey
From pachalmars at gmail.com  Wed Jan 18 18:37:39 2012
From: pachalmars at gmail.com (Arnaud F)
Date: Wed, 18 Jan 2012 18:37:39 +0100
Subject: [pypy-dev] Tkinter-pypy not building on windows

Still no luck.

I made sure that I am using the correct version of tcl/tk and tested the compilation of tkinter with python27, and it worked. With pypy, I still have to define WITH_THREAD (it's defined in pyconfig.h for python27 but not in pypy-1.7); I also had to modify the setup.py since I used tcl/tk 8.5 (and the libs do not have the dot on windows: tcl85 and tk85). Now I get missing-library errors from the linker:

LINK : fatal error LNK1181: cannot open input file 'X11.lib'
LINK : fatal error LNK1181: cannot open input file 'python27.lib'

If I try to remove X11 from the libraries in setup.py and add a path to python27.lib, I get a lot of unresolved external symbols.

Thanks,
Arnaud

2012/1/17 Amaury Forgeot d'Arc
> [...]
> Which version of tcl are you using? Are you sure it was compiled with threads?

From amauryfa at gmail.com  Wed Jan 18 18:55:24 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 18 Jan 2012 18:55:24 +0100
Subject: [pypy-dev] Tkinter-pypy not building on windows

2012/1/18 Arnaud F
> Still no luck. I made sure that I am using the correct version of tcl/tk and tested the compilation of tkinter with python27, and it worked. With pypy, I still have to define WITH_THREAD (it's defined in pyconfig.h for python27 but not in pypy-1.7); I also had to modify the setup.py since I used tcl/tk 8.5 (and the libs do not have the dot on windows: tcl85 and tk85). Now I get missing-library errors from the linker:
>
> LINK : fatal error LNK1181: cannot open input file 'X11.lib'
> LINK : fatal error LNK1181: cannot open input file 'python27.lib'

Well, this is progress!

> If I try to remove X11 from the libraries in setup.py and add a path to python27.lib, I get a lot of unresolved external symbols.

How come you need the X11 library? Didn't you build a dll version of tcl/tk? (tcl85.dll, tk85.dll)

Don't use the CPython python27.lib, you should use the one shipped with pypy; it's possible that it is named libpypy.lib instead. Just rename it if needed.

-- 
Amaury Forgeot d'Arc
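To make the setup.py changes being discussed concrete, the extension definition would end up looking roughly like this (a hypothetical sketch only -- the source list and the include/library paths depend on the tkinter-pypy checkout and on where tcl/tk 8.5 is installed; note the undotted Windows library names and the WITH_THREAD define):

    from distutils.core import setup, Extension

    setup(name='tkinter-pypy',
          ext_modules=[Extension(
              '_tkinter',
              sources=['src/_tkinter.c'],
              define_macros=[('WITH_THREAD', 1)],
              # Windows links against tcl85/tk85, not tcl8.5/tk8.5,
              # and needs no X11 at all.
              libraries=['tcl85', 'tk85'],
              include_dirs=[r'C:\tcl\include'],   # hypothetical paths
              library_dirs=[r'C:\tcl\lib'],
          )])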
From pachalmars at gmail.com  Wed Jan 18 19:45:25 2012
From: pachalmars at gmail.com (Arnaud F)
Date: Wed, 18 Jan 2012 19:45:25 +0100
Subject: [pypy-dev] Tkinter-pypy not building on windows

The reference to X11.lib comes from the original setup.py. The reference to python27.lib seems to come from distutils (build_ext.py, in the get_libraries method).

I didn't find any .lib file in my pypy installation (the standard binaries, version 1.7, 32 bits for windows), just some dlls. Do I need to rebuild pypy from source?

Thanks,
Arnaud

2012/1/18 Amaury Forgeot d'Arc
> [...]
> Don't use the CPython python27.lib, you should use the one shipped with pypy; it's possible that it is named libpypy.lib instead. Just rename it if needed.

From dmitrey15 at ukr.net  Thu Jan 19 17:46:34 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Thu, 19 Jan 2012 18:46:34 +0200
Subject: [pypy-dev] certificate for accepting numpypy new funcs?

Hi all,

Could you provide clarification on accepting new numpypy funcs (not only for me, but for any other possible volunteers)? The doc I've been directed to says only "You have to test exhaustively your module", while I would like to know more explicit rules. For example, "at least 3 tests per func" (however, I guess for funcs of different complexity and variability the expected number of tests should also differ).

Also, are there any strict rules for the testcases to be submitted, or can I, for example, merely write:

if __name__ == '__main__':
    assert array_equal(1, 1)
    assert array_equal([1, 2], [1, 2])
    assert array_equal(N.array([1, 2]), N.array([1, 2]))
    assert array_equal([1, 2], N.array([1, 2]))
    assert array_equal([1, 2], [1, 2, 3]) is False
    print('passed')

Or is there a certain rule for storing files with tests?

If I or someone else submits a func with some tests like in the example above, will you put the func and tests into the proper files yourselves? I'm not too lazy to do it myself, but I am merely not immersed enough in the numpypy dev process, including mercurial branches and the numpypy file structure, and can spend only quite limited time diving into it in the near future.

Of course, you could impose very strict rules for submitting funcs and related testcases to improve quality, but I guess that would reduce the number and willingness of volunteers to assist, hence an appropriate middle ground should be chosen.

BTW, the ndarray.flat bug mentioned in https://bugs.pypy.org/issue1009 still prevents contribution of the funcs diag, eye, array_equal and many others.

-----------------------
Regards, D.
http://openopt.org/Dmitrey
From fijall at gmail.com  Thu Jan 19 18:31:40 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 19 Jan 2012 19:31:40 +0200
Subject: [pypy-dev] certificate for accepting numpypy new funcs?

On Thu, Jan 19, 2012 at 6:46 PM, Dmitrey wrote:
> Hi all,
> Could you provide clarification on accepting new numpypy funcs (not only for me, but for any other possible volunteers)? The doc I've been directed to says only "You have to test exhaustively your module", while I would like to know more explicit rules. For example, "at least 3 tests per func" (however, I guess for funcs of different complexity and variability the expected number of tests should also differ).
> Also, are there any strict rules for the testcases to be submitted, or can I, for example, merely write:
>
> if __name__ == '__main__':
>     assert array_equal(1, 1)
>     assert array_equal([1, 2], [1, 2])
>     assert array_equal(N.array([1, 2]), N.array([1, 2]))
>     assert array_equal([1, 2], N.array([1, 2]))
>     assert array_equal([1, 2], [1, 2, 3]) is False
>     print('passed')

We have pretty exhaustive automated testing suites. Look for example in the pypy/module/micronumpy/test directory for the test file style. They're run with py.test and we require at the very least full code coverage (every line has to be executed; there are tools to check this, like coverage). Also passing "unusual" input, like sys.maxint etc., is usually recommended. With your example, you would check if it works for, say, views and multidimensional arrays. Also "is False" is not considered good style.

> Or is there a certain rule for storing files with tests?
>
> If I or someone else submits a func with some tests like in the example above, will you put the func and tests into the proper files yourselves? I'm not too lazy to do it myself, but I am merely not immersed enough in the numpypy dev process, including mercurial branches and the numpypy file structure, and can spend only quite limited time diving into it in the near future.

We generally require people to put their own tests in as they go with the code (in appropriate places) because you also should not break anything. The usefulness of a patch that has to be sliced and diced and put into places is very limited, and for straightforward mostly-copied code, like array_equal, plain useless, since it's almost as much work to just do it.

> Of course, you could impose very strict rules for submitting funcs and related testcases to improve quality, but I guess that would reduce the number and willingness of volunteers to assist, hence an appropriate middle ground should be chosen.
>
> BTW, the ndarray.flat bug mentioned in https://bugs.pypy.org/issue1009 still prevents contribution of the funcs diag, eye, array_equal and many others.

Yes, flatiter is unfortunately in a sad state. There is however a branch to fix it and people working on it.
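To give a feel for the style Maciej describes, an array_equal test in pypy/module/micronumpy/test would look roughly like this (a sketch only -- the base-class name follows the micronumpy test files of that era and may differ; app-level test methods must be self-contained because they run inside the tested interpreter):

    from pypy.module.micronumpy.test.test_base import BaseNumpyAppTest

    class AppTestArrayEqual(BaseNumpyAppTest):
        def test_array_equal(self):
            from numpypy import array, array_equal
            assert array_equal([1, 2], [1, 2])
            assert array_equal(array([1, 2]), array([1, 2]))
            assert not array_equal([1, 2], [1, 2, 3])
            # views and multidimensional arrays, as asked for above
            a = array([[1, 2], [3, 4]])
            assert array_equal(a, a[:])
            assert not array_equal(a, array([[1, 2], [3, 5]]))

Note the "assert not ..." spelling instead of "... is False", matching the style remark above.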
From dmitrey15 at ukr.net  Thu Jan 19 20:49:40 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Thu, 19 Jan 2012 21:49:40 +0200
Subject: [pypy-dev] certificate for accepting numpypy new funcs?

On 01/19/2012 07:31 PM, Maciej Fijalkowski wrote:
> We generally require people to put their own tests in as they go with the code (in appropriate places) because you also should not break anything. The usefulness of a patch that has to be sliced and diced and put into places is very limited, and for straightforward mostly-copied code, like array_equal, plain useless, since it's almost as much work to just do it.

Well, for this func (array_equal) my docstrings really were copied from cpython numpy (why wouldn't I do that to save some time, while the license allows it?), but:

* why wouldn't I go for it, while the other programmers are busy with other tasks?

* the engines of my func and the CPython numpy one completely differ. At first, in PyPy the CPython code just doesn't work at all (because of the problem with ndarray.flat).
Second, I have implemented a workaround - I just replaced some code
lines with:

    Size = a1.size
    f1, f2 = a1.flat, a2.flat
    # TODO: replace xrange by range in Python3
    for i in xrange(Size):
        if f1.next() != f2.next(): return False
    return True

Here are some results in CPython for the following bench:

from time import time
import numpy as N  # import added for completeness; array_equal below is
                   # the alternative implementation described above
n = 100000
m = 100
a = N.zeros(n)
b = N.ones(n)
t = time()
for i in range(m):
    N.array_equal(a, b)
print('classic numpy array_equal time elapsed (on different arrays): %0.5f'
      % (time()-t))

t = time()
for i in range(m):
    array_equal(a, b)
print('Alternative array_equal time elapsed (on different arrays): %0.5f'
      % (time()-t))

b = N.zeros(n)

t = time()
for i in range(m):
    N.array_equal(a, b)
print('classic numpy array_equal time elapsed (on same arrays): %0.5f'
      % (time()-t))

t = time()
for i in range(m):
    array_equal(a, b)
print('Alternative array_equal time elapsed (on same arrays): %0.5f'
      % (time()-t))

CPython numpy results:
classic numpy array_equal time elapsed (on different arrays): 0.07728
Alternative array_equal time elapsed (on different arrays): 0.00056
classic numpy array_equal time elapsed (on same arrays): 0.11163
Alternative array_equal time elapsed (on same arrays): 9.09458

PyPy results (cannot test the "classic" version because it depends on
some funcs that are unavailable yet):
Alternative array_equal time elapsed (on different arrays): 0.00133
Alternative array_equal time elapsed (on same arrays): 0.95038

So, as you see, even in CPython numpy my version is 138 times faster for
different arrays (yet 90 times slower for identical arrays). However, in
the real world this func usually receives different arrays, and only
sometimes are identical arrays encountered.
Well, with my implementation the time elapsed in the equal-arrays case
essentially depends on their size, but either way I still think my
implementation is better than the CPython one - it's faster and doesn't
require allocating memory for the boolean array that is fed to
logical_and.

I updated my array_equal implementation with the changes mentioned above
and the tests on multidimensional arrays you asked for, and put it at
http://pastebin.com/tg2aHE6x (now I'll update the bugs.pypy.org entry
with the link).

-----------------------
Regards, D.
http://openopt.org/Dmitrey
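For anyone who wants to run the fragment above, a self-contained sketch:
the asarray conversion and the shape guard are additions not present in
Dmitrey's snippet, and it is Python 2 code (xrange, .next()), matching
the TODO note:

    import numpy as N

    def array_equal(a1, a2):
        try:
            a1, a2 = N.asarray(a1), N.asarray(a2)
        except Exception:
            return False
        if a1.shape != a2.shape:   # added guard; the fragment assumes equal sizes
            return False
        f1, f2 = a1.flat, a2.flat
        for _ in xrange(a1.size):  # Python 2; use range/next() on Python 3
            if f1.next() != f2.next():
                return False
        return True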
From wesmckinn at gmail.com Fri Jan 20 01:15:17 2012
From: wesmckinn at gmail.com (Wes McKinney)
Date: Thu, 19 Jan 2012 19:15:17 -0500
Subject: [pypy-dev] certificate for accepting numpypy new funcs?
In-Reply-To: <4F1873D4.8040401@ukr.net>
References: <4F1848EA.1050203@ukr.net> <4F1873D4.8040401@ukr.net>
Message-ID: 

On Thu, Jan 19, 2012 at 2:49 PM, Dmitrey wrote:
> [...]
>
> So, as you see, even in CPython numpy my version is 138 times faster
> for different arrays (yet 90 times slower for identical arrays).

Worth pointing out that the implementations of array_equal and
array_equiv in NumPy are a bit embarrassing, because they require a full
N comparisons instead of short-circuiting as soon as a False value is
found. This is completely silly IMHO:

In [34]: x = np.random.randn(100000)

In [35]: y = np.random.randn(100000)

In [36]: timeit np.array_equal(x, y)
1000 loops, best of 3: 349 us per loop

- W
_______________________________________________
pypy-dev mailing list
pypy-dev at python.org
http://mail.python.org/mailman/listinfo/pypy-dev
From alex.gaynor at gmail.com Fri Jan 20 01:20:54 2012
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 19 Jan 2012 18:20:54 -0600
Subject: [pypy-dev] certificate for accepting numpypy new funcs?
In-Reply-To: 
References: <4F1848EA.1050203@ukr.net> <4F1873D4.8040401@ukr.net>
Message-ID: 

On Thu, Jan 19, 2012 at 6:15 PM, Wes McKinney wrote:
> [...]
>
> Worth pointing out that the implementations of array_equal and
> array_equiv in NumPy are a bit embarrassing, because they require a
> full N comparisons instead of short-circuiting as soon as a False
> value is found.

The correct solution (IMO) is to reuse the original NumPy
implementation, but have logical_and.reduce short-circuit correctly.
This has the nice side effect of allowing all() and any() to use
logical_and.reduce and logical_or.reduce.

Alex

--
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
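For concreteness, what a correctly short-circuiting reduce means - a
pure-Python sketch of the semantics, not numpypy's implementation:

    def short_circuit_reduce(binop, values, absorbing):
        # Stop as soon as the accumulator hits the absorbing element of
        # the operation: False for logical_and, True for logical_or.
        acc = not absorbing
        for v in values:
            acc = binop(acc, v)
            if acc == absorbing:
                break
        return acc

    xs, ys = [1, 2, 3], [1, 2, 4]
    # logical_and.reduce over an elementwise comparison, stopping at the
    # first False - this is what makes all()/array_equal cheap on mismatch.
    equal_all = short_circuit_reduce(lambda a, b: bool(a and b),
                                     (x == y for x, y in zip(xs, ys)),
                                     absorbing=False)
    assert not equal_all   # (avoiding the "is False" style criticized above)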
From wesmckinn at gmail.com Fri Jan 20 01:25:11 2012
From: wesmckinn at gmail.com (Wes McKinney)
Date: Thu, 19 Jan 2012 19:25:11 -0500
Subject: [pypy-dev] certificate for accepting numpypy new funcs?
In-Reply-To: 
References: <4F1848EA.1050203@ukr.net> <4F1873D4.8040401@ukr.net>
Message-ID: 

On Thu, Jan 19, 2012 at 7:20 PM, Alex Gaynor wrote:
> [...]
>
> The correct solution (IMO) is to reuse the original NumPy
> implementation, but have logical_and.reduce short-circuit correctly.
> This has the nice side effect of allowing all() and any() to use
> logical_and.reduce and logical_or.reduce.

To do that, you're going to have to work around the eagerness of
Python -- it sort of makes me cringe to see you guys copying
eager-beaver NumPy when you have a wonderful opportunity to do something
better. Imagine if NumPy and APL/J/K had a lazy functional lovechild
implemented in PyPy. Though maybe you're already 10 steps ahead of me.

Hopefully you could make the JIT automatically take a simple array
expression like this:

bool(logical_and.reduce(equal(a1, a2).ravel()))

examine the array expression, and turn it into an ultra-fast functional
expression that short-circuits immediately:

for x, y in zip(a1, a2):
    if x != y:
        return False
return True

To do that you would need to make all your ufuncs return generators
instead of ndarrays. With the JIT infrastructure you could probably make
this work. If every ufunc yields a generator you could build functional
array pipelines (now I'm talking like Peter Wang). But if you insist on
replicating C NumPy, well...

W
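Wes's generator-pipeline idea in miniature - a sketch of the shape of
such an API, not numpypy's actual design (lazy_equal and
array_equal_lazy are made-up names):

    from itertools import izip  # Python 2; on Python 3, plain zip is already lazy

    def lazy_equal(a1, a2):
        # A 'ufunc' that yields elementwise results on demand instead of
        # materializing a boolean array.
        return (x == y for x, y in izip(a1, a2))

    def array_equal_lazy(a1, a2):
        # all() consumes the generator and stops at the first False, so a
        # mismatch in the first element costs O(1), not O(N).
        return all(lazy_equal(a1, a2))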
From alex.gaynor at gmail.com Fri Jan 20 01:31:01 2012
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 19 Jan 2012 18:31:01 -0600
Subject: [pypy-dev] certificate for accepting numpypy new funcs?
In-Reply-To: 
References: <4F1848EA.1050203@ukr.net> <4F1873D4.8040401@ukr.net>
Message-ID: 

On Thu, Jan 19, 2012 at 6:25 PM, Wes McKinney wrote:
> [...]
>
> To do that, you're going to have to work around the eagerness of
> Python -- it sort of makes me cringe to see you guys copying
> eager-beaver NumPy when you have a wonderful opportunity to do
> something better. Imagine if NumPy and APL/J/K had a lazy functional
> lovechild implemented in PyPy. Though maybe you're already 10 steps
> ahead of me.

Well, you're the first person to ever express the sentiment that we
should do something else :) But I think you'll be pleased, read on!

> To do that you would need to make all your ufuncs return generators
> instead of ndarrays. With the JIT infrastructure you could probably
> make this work. If every ufunc yields a generator you could build
> functional array pipelines (now I'm talking like Peter Wang). But if
> you insist on replicating C NumPy, well...

These don't need to return generators, they just need to return things
that look like ndarrays but are internally lazy. And that's exactly what
we do.

Using .all() instead of logical_and.reduce() (since we don't have
logical_and yet, and even if we did it wouldn't short-circuit without
some extra work), the JIT will generate almost exactly the code you
posted (except in x86 :P).

Alex

--
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
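A toy model of "look like ndarrays, but are internally lazy" -
drastically simplified compared to numpypy's real lazy-expression
machinery, and only meant to show why forcing through .all() can compile
down to Wes's loop:

    class LazyEq(object):
        """Toy stand-in for a lazy elementwise comparison node."""
        def __init__(self, a, b):
            self.a, self.b = a, b      # no comparison happens yet

        def all(self):
            # Forcing through all() can bail out at the first mismatch -
            # effectively the zip loop from Wes's mail.
            for x, y in zip(self.a, self.b):
                if x != y:
                    return False
            return True

    def array_equal(a1, a2):
        return LazyEq(a1, a2).all()

    assert not array_equal([1, 2, 3], [1, 2, 4])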
From fijall at gmail.com Fri Jan 20 09:36:01 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 20 Jan 2012 10:36:01 +0200
Subject: [pypy-dev] certificate for accepting numpypy new funcs?
In-Reply-To: 
References: <4F1848EA.1050203@ukr.net> <4F1873D4.8040401@ukr.net>
Message-ID: 

On Fri, Jan 20, 2012 at 2:31 AM, Alex Gaynor wrote:
> [...]
>
> These don't need to return generators, they just need to return things
> that look like ndarrays but are internally lazy. And that's exactly
> what we do.
>
> Using .all() instead of logical_and.reduce() (since we don't have
> logical_and yet, and even if we did it wouldn't short-circuit without
> some extra work), the JIT will generate almost exactly the code you
> posted (except in x86 :P).

The main difference from a generator is that if you access an element,
you have to force the entire array. But this is not too bad, since
you're mostly interested in all the elements anyway. What's even more
interesting is that this gives a (yet untapped) potential for good
vectorization.

Cheers,
fijal
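The vectorization potential comes from seeing the whole expression
before evaluating any of it. A sketch of the payoff, with a plain Python
loop standing in for the fused machine code a backend could emit
(illustration only):

    import numpy as N

    a, b, c = N.arange(4.0), N.arange(4.0), N.arange(4.0)

    # Eager evaluation: two full-size temporaries before the reduction.
    tmp1 = a * b
    tmp2 = tmp1 + c
    eager = tmp2.sum()

    # What a lazy expression tree can be compiled into: a single fused
    # pass with no temporaries, a shape that SIMD units also like.
    acc = 0.0
    for i in range(len(a)):
        acc += a[i] * b[i] + c[i]

    assert acc == eager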
From fijall at gmail.com Fri Jan 20 09:57:12 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 20 Jan 2012 10:57:12 +0200
Subject: [pypy-dev] [pypy-commit] extradoc extradoc: Planning for today
In-Reply-To: <20120118094627.E56D0820D8@wyvern.cs.uni-duesseldorf.de>
References: <20120118094627.E56D0820D8@wyvern.cs.uni-duesseldorf.de>
Message-ID: 

> +* Debug the ARM backend (bivab)

For what it's worth, I found buggy handling of tmpboxes on the PPC
branch. ARM probably has the same.

From josephjavierperla at gmail.com Fri Jan 20 12:49:29 2012
From: josephjavierperla at gmail.com (Joseph Perla)
Date: Fri, 20 Jan 2012 06:49:29 -0500
Subject: [pypy-dev] pypy dev
Message-ID: 

I'd like to help develop and improve numpypy/scipypy.

j

From josephjavierperla at gmail.com Fri Jan 20 13:16:01 2012
From: josephjavierperla at gmail.com (Joseph Perla)
Date: Fri, 20 Jan 2012 07:16:01 -0500
Subject: [pypy-dev] numpypy / scipypy additions
Message-ID: 

Hello everyone,

I want to add functions to numpypy and also start making scipypy useful
to scientists. How do I commit my code?

First, a little bit about myself: I have been following PyPy's
development for 5 years. I met Armin Rigo and other PyPy devs at
EuroPython 2011 in Florence this past year, where I gave a talk about
minimalist Python web templates: weby templates.

PyPy always seemed like a hugely complicated project far above my
talents. I look forward to finally contributing code myself.

My goal: I am developing probabilistic models along the lines of Latent
Dirichlet Allocation for artificial intelligence applications. I love
Python, so I'm developing my models in Python. Unfortunately, it is
slow. Fortunately, my models are heavy on numerical calculation and
loops, so it will be easy to run my code on PyPy once the numpy and
scipy support is stronger.

So, I downloaded the nightly build. It nearly works! It is missing a few
necessary functions: scipy.special.gammaln, scipy.special.psi,
numpy.reshape, numpy.matrix, and the numpy.random module.

So, I implemented gammaln and psi. They seem to be within 2x the speed
of the Fortran77 code in scipy (it's hard to measure! How do I do
this?). I didn't see anywhere on the web that a scipypy project exists.
I think I want to start it now, and I want to contribute these
functions. An incomplete scipypy will be useful to a lot of people, and
will encourage more new developers to add to it. You probably have a
plan for how you want to integrate the original scipy code, but I think
we should start moving forward with whatever we have as soon as it is
available.

I also know I can implement much of the numpy.random module (as well as
matrix and reshape) easily once I know how to get the codebase and push
changes. I've been using Python and Numpy for years.

Of course I'll use the original numpy code when it's pure Python.

I'm excited to submit, just please let me know how to do that. These
improvements will do a lot for machine learning research, I think.

j
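On the "how do I measure?" question, one common stdlib pattern, sketched
with assumed names: gammaln_pure and the mygammaln module stand in for
Joseph's implementation, and the scipy import is the CPython baseline:

    import timeit

    # Baseline on CPython:
    t = timeit.Timer("gammaln(12.5)",
                     setup="from scipy.special import gammaln")
    print(min(t.repeat(repeat=5, number=100000)))

    # The same statement timed against the pure-Python version (run this
    # one under PyPy too; the large iteration count lets the JIT warm up
    # before the timing is dominated by steady-state speed):
    t = timeit.Timer("gammaln_pure(12.5)",
                     setup="from mygammaln import gammaln_pure")  # hypothetical module
    print(min(t.repeat(repeat=5, number=100000)))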
From josephjavierperla at gmail.com Fri Jan 20 13:19:25 2012
From: josephjavierperla at gmail.com (Joseph Perla)
Date: Fri, 20 Jan 2012 07:19:25 -0500
Subject: [pypy-dev] numpypy / scipypy additions
In-Reply-To: 
References: 
Message-ID: 

I should add that I'm looking into the possibility of writing a
Fortran -> RPython compiler, in order to mechanically port many of the
rest of the functions in scipy and get them into the JIT.

j

On Fri, Jan 20, 2012 at 7:16 AM, Joseph Perla wrote:
> [...]
From ndbecker2 at gmail.com Fri Jan 20 13:47:01 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 20 Jan 2012 07:47:01 -0500
Subject: [pypy-dev] certificate for accepting numpypy new funcs?
References: <4F1848EA.1050203@ukr.net> <4F1873D4.8040401@ukr.net>
Message-ID: 

Alex Gaynor wrote:
> [...]
>
> The correct solution (IMO) is to reuse the original NumPy
> implementation, but have logical_and.reduce short-circuit correctly.
> This has the nice side effect of allowing all() and any() to use
> logical_and.reduce and logical_or.reduce.

I complained on the numpy list about this issue about a year ago. The
usual numpy idiom is

np.any(some_comparison)

which creates a full-size array, comparing every element, before
attempting the 'any' - which is obviously wasteful. Hope numpypy can do
better.
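Neal's complaint in code form - the temporary that the eager idiom
allocates, next to the early exit that a lazy implementation permits (a
sketch, in the thread's Python 2 idiom):

    import numpy as N

    x = N.zeros(100000)
    y = N.ones(100000)

    # Eager: (x == y) materializes 100000 booleans before any() scans them.
    hit = N.any(x == y)

    # Short-circuit equivalent: stop at the first match, no temporary.
    hit_sc = False
    for i in xrange(len(x)):
        if x[i] == y[i]:
            hit_sc = True
            break
    assert hit == hit_sc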
>>> > >>> > Well, for this func (array_equal) my docstrings really were copied from >>> > cpython numpy (why wouln't do this to save some time, while license >>> allows >>> > it?), but >>> > * why would'n go for this (), while other programmers are busy by other >>> > tasks? >>> > * engines of my and CPython numpy funcs complitely differs. At first, in >>> > PyPy the CPython code just doesn't work at all (because of the problem >>> with >>> > ndarray.flat). At 2nd, I have implemented walkaround - just replaced some >>> > code lines by >>> > ? ?Size = a1.size >>> > ? ?f1, f2 = a1.flat, a2.flat >>> > ? ?# TODO: replace xrange by range in Python3 >>> > ? ?for i in xrange(Size): >>> > ? ? ? ?if f1.next() != f2.next(): return False >>> > ? ?return True >>> > >>> > Here are some results in CPython for the following bench: >>> > >>> > from time import time >>> > n = 100000 >>> > m = 100 >>> > a = N.zeros(n) >>> > b = N.ones(n) >>> > t = time() >>> > for i in range(m): >>> > ? ?N.array_equal(a, b) >>> > print('classic numpy array_equal time elapsed (on different arrays): >>> %0.5f' >>> > % (time()-t)) >>> > >>> > >>> > t = time() >>> > for i in range(m): >>> > ? ?array_equal(a, b) >>> > print('Alternative array_equal time elapsed (on different arrays): >>> %0.5f' % >>> > (time()-t)) >>> > >>> > b = N.zeros(n) >>> > >>> > t = time() >>> > for i in range(m): >>> > ? ?N.array_equal(a, b) >>> > print('classic numpy array_equal time elapsed (on same arrays): %0.5f' % >>> > (time()-t)) >>> > >>> > t = time() >>> > for i in range(m): >>> > ? ?array_equal(a, b) >>> > print('Alternative array_equal time elapsed (on same arrays): %0.5f' % >>> > (time()-t)) >>> > >>> > CPython numpy results: >>> > classic numpy array_equal time elapsed (on different arrays): 0.07728 >>> > Alternative array_equal time elapsed (on different arrays): 0.00056 >>> > classic numpy array_equal time elapsed (on same arrays): 0.11163 >>> > Alternative array_equal time elapsed (on same arrays): 9.09458 >>> > >>> > PyPy results (cannot test on "classic" version because it depends on some >>> > funcs that are unavailable yet): >>> > Alternative array_equal time elapsed (on different arrays): 0.00133 >>> > Alternative array_equal time elapsed (on same arrays): 0.95038 >>> > >>> > >>> > So, as you see, even in CPython numpy my version is 138 times faster for >>> > different arrays (yet slower in 90 times for same arrays). However, in >>> real >>> > world usually different arrays come to this func, and only sometimes >>> similar >>> > arrays are encountered. >>> > Well, for my implementation for case of equal arrays time elapsed >>> > essentially depends on their size, but in either way I still think my >>> > implementation is better than CPython, - it's faster and doesn't require >>> > allocation of memory for the boolean array, that will go to the >>> logical_and. >>> > >>> > I updated my array_equal implementation with the changes mentioned above, >>> > some tests on multidimensional arrays you've asked and put it in >>> > http://pastebin.com/tg2aHE6x (now I'll update the bugs.pypy.org entry >>> with >>> > the link). >>> > >>> > >>> > ----------------------- >>> > Regards, D. 
>>> > http://openopt.org/Dmitrey >>> > _______________________________________________ >>> > pypy-dev mailing list >>> > pypy-dev at python.org >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> >>> Worth pointing out that the implementation of array_equal and >>> array_equiv in NumPy are a bit embarrassing because they require a >>> full N comparisons instead of short-circuiting whenever a False value >>> is found. This is completely silly IMHO: >>> >>> In [34]: x = np.random.randn(100000) >>> >>> In [35]: y = np.random.randn(100000) >>> >>> In [36]: timeit np.array_equal(x, y) >>> 1000 loops, best of 3: 349 us per loop >>> >>> - W >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >>> >> >> The correct solution (IMO), is to reuse the original NumPy implementation, >> but have logical_and.reduce short circuit correctly. ?This has the nice >> side effect of allowing all() and any() to use >> logical_and/logical_or.reduce. >> >> Alx >> > > > I have complained on the numpy list about 1 year ago about this issue. ?The > usual numpy idiom is > > np.any (some comparison) > > which will create an array of the full size comparing each element before > attempting the 'any', which is obviously wasteful. ?Hope numpypy can do better. It does better already FYI. It does not completely work with all kinds of possible constructs (like .flat) but works in general. From fijall at gmail.com Fri Jan 20 14:24:07 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 20 Jan 2012 15:24:07 +0200 Subject: [pypy-dev] numpypy / scipypy additions In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 2:19 PM, Joseph Perla wrote: > I should add that I'm looking into the possibility of writing a Fortran -> > RPython compiler in order to mechanically port many of the rest of the > functions scipy and get them into the JIT. Hi. I'm glad to hear more about potential contributors! If I can pick up any order, how about numpy.reshape? reshape method is already there on array so it should be more or less mechanical copy-paste into lib_pypy/numpypy. Regarding scipy - have you read a blog about how to have the entire scipy working with "minimal" effort: http://morepypy.blogspot.com/2011/12/plotting-using-matplotlib-from-pypy.html Pursuing slightly more this strategy should be relatively viable. Cheers, fijal From jperla at princeton.edu Fri Jan 20 15:57:17 2012 From: jperla at princeton.edu (Joseph Javier Perla) Date: Fri, 20 Jan 2012 09:57:17 -0500 Subject: [pypy-dev] segmentation fault Message-ID: I wrote several other missing methods (np.log, np.diag, np.concatenate) and now I have the full program working. I would like to add these other features as well. (Also, np.sum() should be able to do element-wise sum of a generator of n-dimensional arrays.) But, now I get a segfault. I was wondering how I could investigate the segfault issue. There is no error message or traceback, and I cannot catch the error with pdb. Thank you. j -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Fri Jan 20 19:26:53 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 20 Jan 2012 20:26:53 +0200 Subject: [pypy-dev] segmentation fault In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 4:57 PM, Joseph Javier Perla wrote: > I wrote several other missing methods (np.log, np.diag, np.concatenate) and > now I have the full program working. 
I would like to add these other
> features as well. (Also, np.sum() should be able to do an element-wise
> sum of a generator of n-dimensional arrays.)
>
> But now I get a segfault. I was wondering how I could investigate the
> segfault issue. There is no error message or traceback, and I cannot catch
> the error with pdb. Thank you.

You either use gdb, or try to run it step by step, or try it on
untranslated pypy. Without more info I have absolutely no idea.

>
> j
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> http://mail.python.org/mailman/listinfo/pypy-dev
>

From coolbutuseless at gmail.com  Sat Jan 21 02:57:46 2012
From: coolbutuseless at gmail.com (mike c)
Date: Sat, 21 Jan 2012 11:57:46 +1000
Subject: [pypy-dev] numpypy / scipypy additions
In-Reply-To: 
References: 
Message-ID: 

On Fri, Jan 20, 2012 at 11:24 PM, Maciej Fijalkowski wrote:
>
> I'm glad to hear more about potential contributors! If I can pick up
> any order, how about numpy.reshape? The reshape method is already there
> on the array, so it should be more or less a mechanical copy-paste into
> lib_pypy/numpypy.

The reshape func is already in the nightlies, in
lib_pypy/numpypy/core/fromnumeric.py

Mike.

From josephjavierperla at gmail.com  Sat Jan 21 11:58:22 2012
From: josephjavierperla at gmail.com (Joseph Perla)
Date: Sat, 21 Jan 2012 05:58:22 -0500
Subject: [pypy-dev] numpypy / scipypy additions
In-Reply-To: 
References: 
Message-ID: 

Yes, it looks like reshape is there.
I was wondering if I should submit patches to this list with the other
methods, or if there is a different process for that.

Yes, I remember reading that post. It makes sense, but having a few (or
more) methods directly available in Python run under PyPy would provide a
big speed benefit over a C bridge. This makes sense in my case for gammln,
for example.

If there is no scipypy codebase, I'll just start one on github.
j

On Fri, Jan 20, 2012 at 8:57 PM, mike c wrote:
> On Fri, Jan 20, 2012 at 11:24 PM, Maciej Fijalkowski wrote:
>>
>> I'm glad to hear more about potential contributors! If I can pick up
>> any order, how about numpy.reshape? The reshape method is already there
>> on the array, so it should be more or less a mechanical copy-paste into
>> lib_pypy/numpypy.
>
> The reshape func is already in the nightlies, in
> lib_pypy/numpypy/core/fromnumeric.py
>
> Mike.
>

From dmitrey15 at ukr.net  Sat Jan 21 12:21:19 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Sat, 21 Jan 2012 13:21:19 +0200
Subject: [pypy-dev] numpypy: could NaN be added?
Message-ID: <4F1A9FAF.4050405@ukr.net>

hi all,

the absence of numpypy.nan prevents the creation of lots of highly
important funcs like isnan, nanmin, nanmax, nanargmin, nanargmax, nansum
etc. I cannot guarantee I would immediately provide them, especially for
multidimensional arrays, but, at least, I (or other programmers) could
create temporary replacements for the still-absent funcs in our own code
(for example, only for 1-dimensional or at most 2-dimensional arrays),
making that code work in PyPy right now.

So could you put the number into numpypy, or are there some difficult
problems with it?

-----------------------
Regards, D.
http://openopt.org/Dmitrey

From dmitrey15 at ukr.net  Sat Jan 21 12:25:29 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Sat, 21 Jan 2012 13:25:29 +0200
Subject: [pypy-dev] numpypy: could NaN be added?
In-Reply-To: <4F1A9FAF.4050405@ukr.net>
References: <4F1A9FAF.4050405@ukr.net>
Message-ID: <4F1AA0A9.4070907@ukr.net>

On 01/21/2012 01:21 PM, Dmitrey wrote:
> hi all,
>
> the absence of numpypy.nan prevents the creation of lots of highly
> important funcs like isnan, nanmin, nanmax, nanargmin, nanargmax, nansum
> etc. I cannot guarantee I would immediately provide them, especially for
> multidimensional arrays, but, at least, I (or other programmers) could
> create temporary replacements for the still-absent funcs in our own code
> (for example, only for 1-dimensional or at most 2-dimensional arrays),
> making that code work in PyPy right now.
>
> So could you put the number into numpypy, or are there some difficult
> problems with it?

I see nan is somehow present in pypy code, but I cannot access it from
either numpypy or pypy:

>>>> a = np.inf - np.inf
>>>> a
nan
>>>> type(a)

>>>> nan
Traceback (most recent call last):
  File "", line 1, in
NameError: global name 'nan' is not defined
>>>> np.nan
Traceback (most recent call last):
  File "", line 1, in
AttributeError: 'module' object has no attribute 'nan'

-----------------------
Regards, D.
http://openopt.org/Dmitrey

From fijall at gmail.com  Sat Jan 21 12:48:49 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 21 Jan 2012 13:48:49 +0200
Subject: [pypy-dev] numpypy: could NaN be added?
In-Reply-To: <4F1AA0A9.4070907@ukr.net>
References: <4F1A9FAF.4050405@ukr.net> <4F1AA0A9.4070907@ukr.net>
Message-ID: 

On Sat, Jan 21, 2012 at 1:25 PM, Dmitrey wrote:
> On 01/21/2012 01:21 PM, Dmitrey wrote:
>>
>> hi all,
>>
>> the absence of numpypy.nan prevents the creation of lots of highly
>> important funcs like isnan, nanmin, nanmax, nanargmin, nanargmax, nansum
>> etc. I cannot guarantee I would immediately provide them, especially for
>> multidimensional arrays, but, at least, I (or other programmers) could
>> create temporary replacements for the still-absent funcs in our own code
>> (for example, only for 1-dimensional or at most 2-dimensional arrays),
>> making that code work in PyPy right now.
>>
>> So could you put the number into numpypy, or are there some difficult
>> problems with it?
>
> I see nan is somehow present in pypy code, but I cannot access it from
> either numpypy or pypy:
>>>>> a = np.inf - np.inf
>>>>> a
> nan
>>>>> type(a)
>
>>>>> nan
> Traceback (most recent call last):
>   File "", line 1, in
> NameError: global name 'nan' is not defined
>>>>> np.nan
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: 'module' object has no attribute 'nan'
>
> -----------------------
> Regards, D.
> http://openopt.org/Dmitrey
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> http://mail.python.org/mailman/listinfo/pypy-dev

float('nan') works if you want.
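To illustrate, a user-level stopgap along those lines is tiny; the sketch
below assumes nothing beyond float('nan') and the usual IEEE comparison
rules (the nanmin_1d helper name is hypothetical, in the spirit of the
temporary replacements suggested above):

    nan = float('nan')

    def isnan(x):
        # NaN is the only value that does not compare equal to itself.
        return x != x

    def nanmin_1d(seq):
        # Minimum ignoring NaNs, for 1-dimensional input only.
        vals = [x for x in seq if not isnan(x)]
        return min(vals) if vals else nan

    assert isnan(nan)
    assert not isnan(1.0)
    assert nanmin_1d([3.0, nan, 1.0]) == 1.0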
It's super-trivial to provide it from numpy as well From arigo at tunes.org Sat Jan 21 18:10:13 2012 From: arigo at tunes.org (Armin Rigo) Date: Sat, 21 Jan 2012 18:10:13 +0100 Subject: [pypy-dev] GC error In-Reply-To: <4F16457E.4040108@interstice.com> References: <4F147008.6040208@interstice.com> <4F14B23A.6040203@interstice.com> <4F150E43.9060904@wakelift.de> <4F15F668.3050606@interstice.com> <4F16457E.4040108@interstice.com> Message-ID: Hi, On Wed, Jan 18, 2012 at 05:07, Rich Drewes wrote: >> That means you're running a 32bit program in a 64bit environment > > Yup, that was it. ?For some reason the package from the launchpad ppa that > was pulled in was 32 bit. Sorry for the delay in answering. It's indeed a known issue that the ppa.launchpad.net has a 32-bit binary even though it pretends to be a 64-bit one :-/ Please tell us if, using a real 64-bit pypy like the one from our download page, you still have a much higher memory usage than CPython. And also note again that our current GCs do not perform well when the process is swapping, so it might be much slower than CPython... A bient?t, Armin. From fijall at gmail.com Sun Jan 22 20:21:45 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 22 Jan 2012 21:21:45 +0200 Subject: [pypy-dev] [pypy-commit] pypy arm-backend-2: also remove test_compile_asmlen from runner_test after it was moved to the x86 backend In-Reply-To: <20120118113545.7B141820D8@wyvern.cs.uni-duesseldorf.de> References: <20120118113545.7B141820D8@wyvern.cs.uni-duesseldorf.de> Message-ID: Note that this is completely wrong - the test is precisely in runner_test because this is an expected JIT backend interface for jit hooks to work On Wed, Jan 18, 2012 at 1:35 PM, bivab wrote: > Author: David Schneider > Branch: arm-backend-2 > Changeset: r51441:839659291f03 > Date: 2012-01-18 12:33 +0100 > http://bitbucket.org/pypy/pypy/changeset/839659291f03/ > > Log: ? ?also remove test_compile_asmlen from runner_test after it was moved > ? ? ? ?to the x86 backend > > diff --git a/pypy/jit/backend/test/runner_test.py b/pypy/jit/backend/test/runner_test.py > --- a/pypy/jit/backend/test/runner_test.py > +++ b/pypy/jit/backend/test/runner_test.py > @@ -3188,55 +3188,6 @@ > ? ? ? ? res = self.cpu.get_latest_value_int(0) > ? ? ? ? assert res == -10 > > - ? ?def test_compile_asmlen(self): > - ? ? ? ?from pypy.jit.backend.llsupport.llmodel import AbstractLLCPU > - ? ? ? ?if not isinstance(self.cpu, AbstractLLCPU): > - ? ? ? ? ? ?py.test.skip("pointless test on non-asm") > - ? ? ? ?from pypy.jit.backend.x86.tool.viewcode import machine_code_dump > - ? ? ? ?import ctypes > - ? ? ? ?ops = """ > - ? ? ? ?[i2] > - ? ? ? ?i0 = same_as(i2) ? ?# but forced to be in a register > - ? ? ? ?label(i0, descr=1) > - ? ? ? ?i1 = int_add(i0, i0) > - ? ? ? ?guard_true(i1, descr=faildesr) [i1] > - ? ? ? ?jump(i1, descr=1) > - ? ? ? ?""" > - ? ? ? ?faildescr = BasicFailDescr(2) > - ? ? ? ?loop = parse(ops, self.cpu, namespace=locals()) > - ? ? ? ?faildescr = loop.operations[-2].getdescr() > - ? ? ? ?jumpdescr = loop.operations[-1].getdescr() > - ? ? ? ?bridge_ops = """ > - ? ? ? ?[i0] > - ? ? ? ?jump(i0, descr=jumpdescr) > - ? ? ? ?""" > - ? ? ? ?bridge = parse(bridge_ops, self.cpu, namespace=locals()) > - ? ? ? ?looptoken = JitCellToken() > - ? ? ? ?self.cpu.assembler.set_debug(False) > - ? ? ? ?info = self.cpu.compile_loop(loop.inputargs, loop.operations, looptoken) > - ? ? ? ?bridge_info = self.cpu.compile_bridge(faildescr, bridge.inputargs, > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
?bridge.operations, > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?looptoken) > - ? ? ? ?self.cpu.assembler.set_debug(True) # always on untranslated > - ? ? ? ?assert info.asmlen != 0 > - ? ? ? ?cpuname = autodetect_main_model_and_size() > - ? ? ? ?# XXX we have to check the precise assembler, otherwise > - ? ? ? ?# we don't quite know if borders are correct > - > - ? ? ? ?def checkops(mc, ops): > - ? ? ? ? ? ?assert len(mc) == len(ops) > - ? ? ? ? ? ?for i in range(len(mc)): > - ? ? ? ? ? ? ? ?assert mc[i].split("\t")[-1].startswith(ops[i]) > - > - ? ? ? ?data = ctypes.string_at(info.asmaddr, info.asmlen) > - ? ? ? ?mc = list(machine_code_dump(data, info.asmaddr, cpuname)) > - ? ? ? ?lines = [line for line in mc if line.count('\t') == 2] > - ? ? ? ?checkops(lines, self.add_loop_instructions) > - ? ? ? ?data = ctypes.string_at(bridge_info.asmaddr, bridge_info.asmlen) > - ? ? ? ?mc = list(machine_code_dump(data, bridge_info.asmaddr, cpuname)) > - ? ? ? ?lines = [line for line in mc if line.count('\t') == 2] > - ? ? ? ?checkops(lines, self.bridge_loop_instructions) > - > > ? ? def test_compile_bridge_with_target(self): > ? ? ? ? # This test creates a loopy piece of code in a bridge, and builds another > _______________________________________________ > pypy-commit mailing list > pypy-commit at python.org > http://mail.python.org/mailman/listinfo/pypy-commit From fijall at gmail.com Mon Jan 23 09:08:31 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 23 Jan 2012 10:08:31 +0200 Subject: [pypy-dev] [pypy-commit] pypy py3k: The exception handler target "except ValueError as exc" was always compiled as a global variable. Test and fix. In-Reply-To: <20120122111001.A2D7B821FA@wyvern.cs.uni-duesseldorf.de> References: <20120122111001.A2D7B821FA@wyvern.cs.uni-duesseldorf.de> Message-ID: Shouldn't that go to trunk as well? On Sun, Jan 22, 2012 at 1:10 PM, amauryfa wrote: > Author: Amaury Forgeot d'Arc > Branch: py3k > Changeset: r51639:e325e4d3227a > Date: 2012-01-22 12:02 +0100 > http://bitbucket.org/pypy/pypy/changeset/e325e4d3227a/ > > Log: ? ?The exception handler target "except ValueError as exc" was always > ? ? ? ?compiled as a global variable. Test and fix. > > diff --git a/pypy/interpreter/astcompiler/symtable.py b/pypy/interpreter/astcompiler/symtable.py > --- a/pypy/interpreter/astcompiler/symtable.py > +++ b/pypy/interpreter/astcompiler/symtable.py > @@ -417,6 +417,11 @@ > ? ? def visit_alias(self, alias): > ? ? ? ? self._visit_alias(alias) > > + ? ?def visit_ExceptHandler(self, handler): > + ? ? ? ?if handler.name: > + ? ? ? ? ? ?self.note_symbol(handler.name, SYM_ASSIGNED) > + ? ? ? ?ast.GenericASTVisitor.visit_ExceptHandler(self, handler) > + > ? ? def visit_Yield(self, yie): > ? ? ? ? self.scope.note_yield(yie) > ? ? ? ? ast.GenericASTVisitor.visit_Yield(self, yie) > diff --git a/pypy/interpreter/astcompiler/test/test_symtable.py b/pypy/interpreter/astcompiler/test/test_symtable.py > --- a/pypy/interpreter/astcompiler/test/test_symtable.py > +++ b/pypy/interpreter/astcompiler/test/test_symtable.py > @@ -142,6 +142,10 @@ > ? ? ? ? scp = self.func_scope("def f(): x") > ? ? ? ? assert scp.lookup("x") == symtable.SCOPE_GLOBAL_IMPLICIT > > + ? ?def test_exception_variable(self): > + ? ? ? ?scp = self.mod_scope("try: pass\nexcept ValueError as e: pass") > + ? ? ? ?assert scp.lookup("e") == symtable.SCOPE_LOCAL > + > ? ? def test_nested_scopes(self): > ? ? ? ? def nested_scope(*bodies): > ? ? ? ? ? ? 
names = enumerate("f" + string.ascii_letters) > _______________________________________________ > pypy-commit mailing list > pypy-commit at python.org > http://mail.python.org/mailman/listinfo/pypy-commit From fijall at gmail.com Mon Jan 23 09:15:32 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 23 Jan 2012 10:15:32 +0200 Subject: [pypy-dev] [pypy-commit] pypy merge-2.7.2: Implement CPython issue5057: do not const-fold a unicode.__getitem__ In-Reply-To: <20120122195713.336F4821FA@wyvern.cs.uni-duesseldorf.de> References: <20120122195713.336F4821FA@wyvern.cs.uni-duesseldorf.de> Message-ID: Why not? We have only wide build, don't we? On Sun, Jan 22, 2012 at 9:57 PM, amauryfa wrote: > Author: Amaury Forgeot d'Arc > Branch: merge-2.7.2 > Changeset: r51662:693b08144e00 > Date: 2012-01-22 20:24 +0100 > http://bitbucket.org/pypy/pypy/changeset/693b08144e00/ > > Log: ? ?Implement CPython issue5057: do not const-fold a unicode.__getitem__ > ? ? ? ?operation which returns a non-BMP character, this produces .pyc > ? ? ? ?files which depends on the unicode width > > diff --git a/pypy/interpreter/astcompiler/optimize.py b/pypy/interpreter/astcompiler/optimize.py > --- a/pypy/interpreter/astcompiler/optimize.py > +++ b/pypy/interpreter/astcompiler/optimize.py > @@ -5,6 +5,7 @@ > ?from pypy.tool import stdlib_opcode as ops > ?from pypy.interpreter.error import OperationError > ?from pypy.rlib.unroll import unrolling_iterable > +from pypy.rlib.runicode import MAXUNICODE > > > ?def optimize_ast(space, tree, compile_info): > @@ -289,8 +290,30 @@ > ? ? ? ? ? ? ? ? w_idx = subs.slice.as_constant() > ? ? ? ? ? ? ? ? if w_idx is not None: > ? ? ? ? ? ? ? ? ? ? try: > - ? ? ? ? ? ? ? ? ? ? ? ?return ast.Const(self.space.getitem(w_obj, w_idx), subs.lineno, subs.col_offset) > + ? ? ? ? ? ? ? ? ? ? ? ?w_const = self.space.getitem(w_obj, w_idx) > ? ? ? ? ? ? ? ? ? ? except OperationError: > - ? ? ? ? ? ? ? ? ? ? ? ?# Let exceptions propgate at runtime. > - ? ? ? ? ? ? ? ? ? ? ? ?pass > + ? ? ? ? ? ? ? ? ? ? ? ?# Let exceptions propagate at runtime. > + ? ? ? ? ? ? ? ? ? ? ? ?return subs > + > + ? ? ? ? ? ? ? ? ? ?# CPython issue5057: if v is unicode, there might > + ? ? ? ? ? ? ? ? ? ?# be differences between wide and narrow builds in > + ? ? ? ? ? ? ? ? ? ?# cases like u'\U00012345'[0]. > + ? ? ? ? ? ? ? ? ? ?# Wide builds will return a non-BMP char, whereas > + ? ? ? ? ? ? ? ? ? ?# narrow builds will return a surrogate. ?In both > + ? ? ? ? ? ? ? ? ? ?# the cases skip the optimization in order to > + ? ? ? ? ? ? ? ? ? ?# produce compatible pycs. > + ? ? ? ? ? ? ? ? ? ?if (self.space.isinstance_w(w_obj, self.space.w_unicode) > + ? ? ? ? ? ? ? ? ? ? ? ?and > + ? ? ? ? ? ? ? ? ? ? ? ?self.space.isinstance_w(w_const, self.space.w_unicode)): > + ? ? ? ? ? ? ? ? ? ? ? ?unistr = self.space.unicode_w(w_const) > + ? ? ? ? ? ? ? ? ? ? ? ?if len(unistr) == 1: > + ? ? ? ? ? ? ? ? ? ? ? ? ? ?ch = ord(unistr[0]) > + ? ? ? ? ? ? ? ? ? ? ? ?else: > + ? ? ? ? ? ? ? ? ? ? ? ? ? ?ch = 0 > + ? ? ? ? ? ? ? ? ? ? ? ?if (ch > 0xFFFF or > + ? ? ? ? ? ? ? ? ? ? ? ? ? ?(MAXUNICODE == 0xFFFF and 0xD800 <= ch <= OxDFFFF)): > + ? ? ? ? ? ? ? ? ? ? ? ? ? ?return subs > + > + ? ? ? ? ? ? ? ? ? ?return ast.Const(w_const, subs.lineno, subs.col_offset) > + > ? ? ? ? 
return subs
> diff --git a/pypy/interpreter/astcompiler/test/test_compiler.py b/pypy/interpreter/astcompiler/test/test_compiler.py
> --- a/pypy/interpreter/astcompiler/test/test_compiler.py
> +++ b/pypy/interpreter/astcompiler/test/test_compiler.py
> @@ -838,6 +838,30 @@
>          # Just checking this doesn't crash out
>          self.count_instructions(source)
>
> +    def test_const_fold_unicode_subscr(self):
> +        source = """def f():
> +        return u"abc"[0]
> +        """
> +        counts = self.count_instructions(source)
> +        assert counts == {ops.LOAD_CONST: 1, ops.RETURN_VALUE: 1}
> +
> +        # getitem outside of the BMP should not be optimized
> +        source = """def f():
> +        return u"\U00012345"[0]
> +        """
> +        counts = self.count_instructions(source)
> +        assert counts == {ops.LOAD_CONST: 2, ops.BINARY_SUBSCR: 1,
> +                          ops.RETURN_VALUE: 1}
> +
> +        # getslice is not yet optimized.
> +        # Still, check a case which yields the empty string.
> +        source = """def f():
> +        return u"abc"[:0]
> +        """
> +        counts = self.count_instructions(source)
> +        assert counts == {ops.LOAD_CONST: 2, ops.SLICE+2: 1,
> +                          ops.RETURN_VALUE: 1}
> +
>      def test_remove_dead_code(self):
>          source = """def f(x):
>              return 5
> _______________________________________________
> pypy-commit mailing list
> pypy-commit at python.org
> http://mail.python.org/mailman/listinfo/pypy-commit

From amauryfa at gmail.com  Mon Jan 23 16:17:22 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Mon, 23 Jan 2012 16:17:22 +0100
Subject: [pypy-dev] [pypy-commit] pypy py3k: The exception handler target
	"except ValueError as exc" was always compiled as a global variable.
	Test and fix.
In-Reply-To: 
References: <20120122111001.A2D7B821FA@wyvern.cs.uni-duesseldorf.de>
Message-ID: 

2012/1/23 Maciej Fijalkowski
> Shouldn't that go to trunk as well?

No, this is a bug I introduced in the py3k branch.
In 2.7 the exception target can be any assignment target:

    args = ['An exception', None, 'was raised']
    try:
        raise ValueError(1, 2, 3)
    except ValueError as args[1:2]:
        print args
    # prints ['An exception', 1, 2, 3, 'was raised']

In py3k, the exception target can only be a variable name, but when doing
the change some time ago I did not understand that the now useless
"set_context(target, ast.Store)" call had to be replaced by something
equivalent.

> On Sun, Jan 22, 2012 at 1:10 PM, amauryfa wrote:
> > Author: Amaury Forgeot d'Arc
> > Branch: py3k
> > Changeset: r51639:e325e4d3227a
> > Date: 2012-01-22 12:02 +0100
> > http://bitbucket.org/pypy/pypy/changeset/e325e4d3227a/
> >
> > Log: The exception handler target "except ValueError as exc" was always
> >      compiled as a global variable. Test and fix.

--
Amaury Forgeot d'Arc

From amauryfa at gmail.com  Mon Jan 23 16:18:48 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Mon, 23 Jan 2012 16:18:48 +0100
Subject: [pypy-dev] [pypy-commit] pypy merge-2.7.2: Implement CPython
	issue5057: do not const-fold a unicode.__getitem__
In-Reply-To: 
References: <20120122195713.336F4821FA@wyvern.cs.uni-duesseldorf.de>
Message-ID: 

2012/1/23 Maciej Fijalkowski
> Why not? We have only wide build, don't we?

Our unicode is based on wchar_t, which is still 2 bytes on Windows...
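For illustration, the behavioural difference the patch guards against can
be seen with a small Python 2 check; this is a sketch of the behaviour,
not code from the patch:

    import sys

    s = u"\U00012345"            # one non-BMP character
    if sys.maxunicode > 0xFFFF:  # wide (UCS-4) build
        assert len(s) == 1
        assert ord(s[0]) == 0x12345
    else:                        # narrow (UTF-16) build, e.g. on Windows
        assert len(s) == 2       # stored as a surrogate pair
        assert 0xD800 <= ord(s[0]) <= 0xDBFF
    # Const-folding u"\U00012345"[0] at compile time would bake one of
    # these two answers into the .pyc, hence the optimization is skipped.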
> On Sun, Jan 22, 2012 at 9:57 PM, amauryfa wrote:
> > Author: Amaury Forgeot d'Arc
> > Branch: merge-2.7.2
> > Changeset: r51662:693b08144e00
> > Date: 2012-01-22 20:24 +0100
> > http://bitbucket.org/pypy/pypy/changeset/693b08144e00/
> >
> > Log: Implement CPython issue5057: do not const-fold a unicode.__getitem__
> >      operation which returns a non-BMP character; this produces .pyc
> >      files which depend on the unicode width

--
Amaury Forgeot d'Arc

From fijall at gmail.com  Mon Jan 23 23:04:34 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 24 Jan 2012 00:04:34 +0200
Subject: [pypy-dev] bitbucket slowness
Message-ID: 

Hi

From things I just learned: if you use ssh instead of https it's much
faster, like:

ssh://hg@bitbucket.org/pypy/pypy

in .hg/hgrc in your pypy checkout.

cheers,
fijal

From andrewfr_ice at yahoo.com  Mon Jan 23 23:06:19 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Mon, 23 Jan 2012 14:06:19 -0800 (PST)
Subject: [pypy-dev] What Causes An Ambiguous low-level helper
	specialization Error?
Message-ID: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>

Hi Folks:

I am trying to write a simple RPython programme that uses the STM module.
I am modelling my example after targetdemo.py. This is a code fragment:

def T1():
    global fromAccount, toAccount, counter
    transactionA(fromAccount, toAccount, counter)

def T2():
    global fromAccount, toAccount, counter
    transactionB(fromAccount, toAccount, counter)

# __________  Entry point  __________

def entry_point(argv):
    ll_thread.start_new_thread(T1, ())
    ll_thread.start_new_thread(T2, ())
    print "sleeping..."
    while done < NUM_THREADS:
        time.sleep(1)
    print "done sleeping."
    return 0

Here are some of the last few lines of errors I receive:

[translation:ERROR]  AssertionError': ambiguous low-level helper specialization
[translation:ERROR]    .. v0 = simple_call((function RPyThreadStart), func_0)
[translation:ERROR]    .. '(pypy.module.thread.ll_thread:88)ll_start_new_thread
[translation:ERROR] Processing block:
[translation:ERROR]  block at 6 is a
[translation:ERROR]  in (pypy.module.thread.ll_thread:88)ll_start_new_thread
[translation:ERROR]  containing the following operations:
[translation:ERROR]        v0 = simple_call((function RPyThreadStart), func_0)
[translation:ERROR]        v1 = eq(v0, (-1))
[translation:ERROR]        v2 = is_true(v1)
[translation:ERROR]  --end--

What does this error mean and what do I do to avoid it in the future?

Cheers,
Andrew

From benjamin at python.org  Mon Jan 23 23:23:07 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 23 Jan 2012 17:23:07 -0500
Subject: [pypy-dev] What Causes An Ambiguous low-level helper
	specialization Error?
In-Reply-To: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>
References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>
Message-ID: 

2012/1/23 Andrew Francis :
>     ll_thread.start_new_thread(T1)
>     ll_thread.start_new_thread(T2)

Try this:

    ll_thread.ll_start_new_thread(T1)
    ll_thread.ll_start_new_thread(T2)

--
Regards,
Benjamin

From dmitrey15 at ukr.net  Tue Jan 24 18:30:47 2012
From: dmitrey15 at ukr.net (Dmitrey)
Date: Tue, 24 Jan 2012 19:30:47 +0200
Subject: [pypy-dev] why PyPy build takes so much resources?
Message-ID: <4F1EEAC7.5090804@ukr.net>

Requiring 4 GB on a 64-bit machine, and lots of time for the build each
day, is very inconvenient. Well, I can download and install a compiled
version, but then I have to set all paths to the files I'm working with.
Maybe somewhere there are lots of C files with -O3 or the like? Can you
provide a build option, e.g. --without-optimization, to remove those -O3
and -O2 flags or somehow else reduce build time and memory consumption?
(Well, I see the opts

PYPY_GC_MAX_DELTA=200MB pypy --jit loop_longevity=300 ./translate.py -Ojit

from http://pypy.org/download.html#building-from-source
but they essentially increase build time.)

-----------------------
Regards, D.
http://openopt.org/Dmitrey

From amauryfa at gmail.com  Tue Jan 24 18:47:00 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 24 Jan 2012 18:47:00 +0100
Subject: [pypy-dev] why PyPy build takes so much resources?
In-Reply-To: <4F1EEAC7.5090804@ukr.net>
References: <4F1EEAC7.5090804@ukr.net>
Message-ID: 

Hi,

2012/1/24 Dmitrey
> Requiring 4 GB on a 64-bit machine, and lots of time for the build each
> day, is very inconvenient. Well, I can download and install a compiled
> version, but then I have to set all paths to the files I'm working with.
> Maybe somewhere there are lots of C files with -O3 or the like? Can you
> provide a build option, e.g. --without-optimization, to remove those -O3
> and -O2 flags or somehow else reduce build time and memory consumption?

Yes, pypy takes a long time and a lot of memory to translate. But this is
not because of C files: if you look at the output, you'll see that the
generation and compilation of C source files is a small part of the whole
process -- and memory usage is already at its highest then.

To reduce translation time and memory, you can use -O2 instead of -Ojit --
but of course there won't be any JIT. That's the option I often use,
because I'm primarily interested in the correctness of the resulting
interpreter, not its speed.

--
Amaury Forgeot d'Arc

From andrewfr_ice at yahoo.com  Tue Jan 24 21:38:48 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Tue, 24 Jan 2012 12:38:48 -0800 (PST)
Subject: [pypy-dev] What Causes An Ambiguous low-level helper
	specialization Error?
In-Reply-To: 
References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>
Message-ID: <1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com>

Hi Benjamin and folks:

Thanks for the response. I made the change to ll_start_new_thread. Still I
get the Ambiguous low-level helper specialization error. The programme is
small. Here is the programme in its entirety. Hopefully it is a silly
error. Any help would be appreciated.

Cheers,
Andrew

-------

import time
from pypy.module.thread import ll_thread
from pypy.translator.stm import rstm

NUM_THREADS = 2

class Account(object):
    def __init__(self, value):
        self.value = value

class Done(object):
    def __init__(self):
        self.done = 0

fromAccount = Account(1000)
toAccount = Account(2000)
counter = Done()

def transactionA(source, target, done):
    print "transaction A starting"
    source.value -= 50
    target.value += 50
    rstm.transaction_boundary()
    print "Source account %d  TargetAccount %d" % (source.value, target.value)
    print "transaction A done"
    done.done += 1

def transactionB(source, target, done):
    print "transaction B starting"
    t = source.value * .1
    source.value -= t
    target.value += t
    rstm.transaction_boundary()
    print "Source account %d  Target %d" % (source.value, target.value)
    print "transaction B done"
    done.done += 1

def T1():
    #global fromAccount, toAccount, counter
    transactionA(fromAccount, toAccount, counter)

def T2():
    #global fromAccount, toAccount, counter
    transactionB(fromAccount, toAccount, counter)

# __________  Entry point  __________

def entry_point(argv):
    ll_thread.start_new_thread(T1, ())
    ll_thread.start_new_thread(T2, ())
    #ll_thread.ll_start_new_thread(T1)
    #ll_thread.ll_start_new_thread(T2)
    print "sleeping..."
    while counter.done < NUM_THREADS:
        time.sleep(1)
    print "done sleeping."
    # easy way to know if transactions are being serialized correctly
    assert fromAccount.value + toAccount.value == 3000
    return 0

# _____ Define and setup target ___

def target(*args):
    return entry_point, None

def T2():
    global fromAccount, toAccount, counter
    transactionB(fromAccount, toAccount, counter)

# __________  Entry point  __________

def entry_point(argv):
    global fromAccount, toAccount, counter
    #ll_thread.start_new_thread(T1, ())
    #ll_thread.start_new_thread(T2, ())
    ll_thread.ll_start_new_thread(T1)
    ll_thread.ll_start_new_thread(T2)
    print "sleeping..."
    while counter.done < NUM_THREADS:
        time.sleep(1)
    print "done sleeping."
    assert fromAccount.value + toAccount.value == 3000
    return 0

# _____ Define and setup target ___

def target(*args):
    return entry_point, None

________________________________
From: Benjamin Peterson
To: Andrew Francis
Cc: "pypy-dev at codespeak.net"
Sent: Monday, January 23, 2012 5:23 PM
Subject: Re: [pypy-dev] What Causes An Ambiguous low-level helper
specialization Error?

2012/1/23 Andrew Francis :
>     ll_thread.start_new_thread(T1)
>     ll_thread.start_new_thread(T2)

Try this:

    ll_thread.ll_start_new_thread(T1)
    ll_thread.ll_start_new_thread(T2)

--
Regards,
Benjamin

From arigo at tunes.org  Tue Jan 24 22:16:28 2012
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 24 Jan 2012 22:16:28 +0100
Subject: [pypy-dev] What Causes An Ambiguous low-level helper
	specialization Error?
In-Reply-To: <1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com>
References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>
	<1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com>
Message-ID: 

Hi Andrew,

Oops, sorry, it's a bug.

--------------------------------------------------------------------
r51744 80d054b4615d  | arigo | 2012-01-24 22:07 +0100
    pypy/module/thread/ll_thread.py
    pypy/module/thread/test/test_ll_thread.py

Test and fix for "Ambiguous low-level helper specialization" when the
same RPython program contains several calls to start_new_thread().
--------------------------------------------------------------------

Armin

From arigo at tunes.org  Wed Jan 25 18:41:12 2012
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 25 Jan 2012 18:41:12 +0100
Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany?
In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> Message-ID: Hi Mike, hi Fijal, On Sun, Jan 1, 2012 at 14:50, Maciej Fijalkowski wrote: > On Sat, Dec 31, 2011 at 5:27 PM, Mike M?ller wrote: >> I am just wondering if anybody is interested in sprinting >> on PyPy and in particular NumPyPy in Leipzig sometime in 2012. > > That sounds like a really cool idea. Thanks Mike for thinking about us :) > > What do others think? It is indeed a cool idea. I note that you're interested in a sprint on NumPyPy. So I think that Fijal is the best-placed person to suggest a time, maybe next time he's in Europe. I don't think it would make much sense without him, at least. If we can have him, then count me in as well :-) A bient?t, Armin. From andrewfr_ice at yahoo.com Wed Jan 25 18:53:36 2012 From: andrewfr_ice at yahoo.com (Andrew Francis) Date: Wed, 25 Jan 2012 09:53:36 -0800 (PST) Subject: [pypy-dev] What Causes An Ambiguous low-level helper specialization Error? In-Reply-To: References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com> <1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com> Message-ID: <1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com> Hello Armin et al: I did a hg pull and a hg update (and a hg update stm) and hg heads. I didn't see any changes to the aforementioned files. Now I get the following error (I am assuming all bets are off if I am using out-of-date or wrong files): [translation:ERROR]???? return r_str.ll.do_stringformat(hop, sourcevars) [translation:ERROR]??? File "/home/andrew/pypy-stm/pypy/rpython/lltypesystem/rstr.py", line 978, in do_stringformat [translation:ERROR]???? assert isinstance(r_arg, IntegerRepr) [translation:ERROR]? AssertionError [translation] start debugger... > /home/andrew/pypy-stm/pypy/rpython/lltypesystem/rstr.py(978)do_stringformat() -> assert isinstance(r_arg, IntegerRepr) Cheers, Andrew ________________________________ From: Armin Rigo To: Andrew Francis Cc: Benjamin Peterson ; "pypy-dev at codespeak.net" Sent: Tuesday, January 24, 2012 4:16 PM Subject: Re: [pypy-dev] What Causes An Ambiguous low-level helper specialization Error? Hi Andrew, Oups, sorry, it's a bug. -------------------------------------------------------------------- r51744 80d054b4615d? | arigo | 2012-01-24 22:07 +0100 ? ? pypy/module/thread/ll_thread.py ? ? pypy/module/thread/test/test_ll_thread.py Test and fix for "Ambiguous low-level helper specialization" when the same RPython program contains several calls to start_new_thread(). -------------------------------------------------------------------- Armin -------------- next part -------------- An HTML attachment was scrubbed... URL: From anto.cuni at gmail.com Wed Jan 25 18:58:19 2012 From: anto.cuni at gmail.com (Antonio Cuni) Date: Wed, 25 Jan 2012 18:58:19 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> Message-ID: <4F2042BB.8050107@gmail.com> On 01/25/2012 06:41 PM, Armin Rigo wrote: > It is indeed a cool idea. I note that you're interested in a sprint > on NumPyPy. So I think that Fijal is the best-placed person to > suggest a time, maybe next time he's in Europe. I don't think it > would make much sense without him, at least. If we can have him, then > count me in as well :-) Sorry, I didn't notice the original mail until today. 
Anyway, depending on the time, I might be interested too :-) ciao, Anto From amauryfa at gmail.com Wed Jan 25 19:04:33 2012 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 25 Jan 2012 19:04:33 +0100 Subject: [pypy-dev] What Causes An Ambiguous low-level helper specialization Error? In-Reply-To: <1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com> References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com> <1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com> <1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com> Message-ID: Hi, 2012/1/25 Andrew Francis > > > /home/andrew/pypy-stm/pypy/rpython/lltypesystem/rstr.py(978)do_stringformat() > -> assert isinstance(r_arg, IntegerRepr) > This one is easy: you are formatting a float with %d, which is not supported by RPython. For your debug output, use %s instead. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewfr_ice at yahoo.com Wed Jan 25 19:56:09 2012 From: andrewfr_ice at yahoo.com (Andrew Francis) Date: Wed, 25 Jan 2012 10:56:09 -0800 (PST) Subject: [pypy-dev] Unknown Operation Re: What Causes An Ambiguous low-level helper specialization Error? In-Reply-To: References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com> <1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com> <1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com> Message-ID: <1327517769.49485.YahooMailNeo@web120703.mail.ne1.yahoo.com> Hello Amaury: Yes you are right! I commented out the print statement. I'll fix that later (it is there to check if the code is emitting the correct answers. I can change that a bunch of asserts). Out of curiosity, I thought there was some sort of rpython lint utility out there? Perhaps I can use that to help me catch those sort of errors (I wrote a version of the code in Python to catch mistakes). However now I am getting: [rtyper] -=- specialized 2 more blocks -=- [canraise:WARNING] Unknown operation: stm_transaction_boundary [canraise:WARNING] Unknown operation: stm_transaction_boundary [canraise:WARNING] Unknown operation: stm_transaction_boundary [canraise:WARNING] Unknown operation: stm_transaction_boundary [rtyper] -=- specialized 0 more blocks -=- [backendopt:inlining] phase with threshold factor: 32.4 [backendopt:inlining] heuristic: pypy.translator.backendopt.inline.inlining_heuristic . . .[translation:ERROR]? CompilationError: CompilationError(err=""" [translation:ERROR] ?? ?translator_stm_test_bank.c: In function ?pypy_g_transactionA?: [translation:ERROR] ?? ?translator_stm_test_bank.c:257: warning: implicit declaration of function ?OP_STM_TRANSACTION_BOUNDARY? [translation:ERROR] ?? ?translator_stm_test_bank.o: In function `pypy_g_transactionB': [translation:ERROR] ?? ?translator_stm_test_bank.c:(.text+0xc7): undefined reference to `OP_STM_TRANSACTION_BOUNDARY' [translation:ERROR] ?? ?translator_stm_test_bank.o: In function `pypy_g_transactionA': [translation:ERROR] ?? ?translator_stm_test_bank.c:(.text+0x295): undefined reference to `OP_STM_TRANSACTION_BOUNDARY' [translation:ERROR] ?? ?collect2: ld returned 1 exit status [translation:ERROR] ?? ?make: *** [bank-c] Error 1 [translation:ERROR] ?? ?""") [translation] start debugger... 
> /home/andrew/pypy-stm/pypy/translator/platform/__init__.py(138)_handle_error()
-> raise CompilationError(stdout, stderr)

Again, the command I am using to compile the code is:

python ../../goal/translate.py --stm --gc=none bank.py

I have bank.py in the same directory as targetdemo.py and I use the same
command.

I can't wait to see this code compile! Bank accounts are the "hello world"
of transactional programming!

Cheers,
Andrew

________________________________
From: Amaury Forgeot d'Arc
To: Andrew Francis
Cc: Armin Rigo ; "pypy-dev at codespeak.net"
Sent: Wednesday, January 25, 2012 1:04 PM
Subject: Re: [pypy-dev] What Causes An Ambiguous low-level helper
specialization Error?

Hi,

2012/1/25 Andrew Francis
> /home/andrew/pypy-stm/pypy/rpython/lltypesystem/rstr.py(978)do_stringformat()
> -> assert isinstance(r_arg, IntegerRepr)

This one is easy: you are formatting a float with %d, which is not
supported by RPython. For your debug output, use %s instead.

--
Amaury Forgeot d'Arc

From arigo at tunes.org  Wed Jan 25 20:06:42 2012
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 25 Jan 2012 20:06:42 +0100
Subject: [pypy-dev] Unknown Operation Re: What Causes An Ambiguous
	low-level helper specialization Error?
In-Reply-To: <1327517769.49485.YahooMailNeo@web120703.mail.ne1.yahoo.com>
References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>
	<1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com>
	<1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com>
	<1327517769.49485.YahooMailNeo@web120703.mail.ne1.yahoo.com>
Message-ID: 

Hi Andrew,

On Wed, Jan 25, 2012 at 19:56, Andrew Francis wrote:
> I can't wait to see this code compile! Bank accounts are the "hello world"
> of transactional programming!

In case you want to avoid having to hack at RPython: by now, you can also
compile the full PyPy with transactions enabled. You just get a PyPy with
no GC at all (so it cannot run for more than a few seconds before running
out of memory). You then get a sane interface from the built-in module
'transaction', and don't have to write any RPython code. I use the
following command line (it's very fast and doesn't require too much RAM,
because of no GC):

./translate.py -O1 --gc=none --stm targetpypystandalone.py
--no-allworkingmodules --withmod-transaction

A bientôt,

Armin.

From andrewfr_ice at yahoo.com  Wed Jan 25 21:05:44 2012
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Wed, 25 Jan 2012 12:05:44 -0800 (PST)
Subject: [pypy-dev] Unknown Operation Re: What Causes An Ambiguous
	low-level helper specialization Error?
In-Reply-To: 
References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com>
	<1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com>
	<1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com>
	<1327517769.49485.YahooMailNeo@web120703.mail.ne1.yahoo.com>
Message-ID: <1327521944.21459.YahooMailNeo@web120703.mail.ne1.yahoo.com>

Hi Armin:

Thanks for the advice! I'll try your suggestion. However, I really do wish
to work with RPython. Also, how do I look at the transaction interface?

A suggestion: to rule out silly mistakes on my end, perhaps someone could
try to compile bank.py (the code I posted earlier). That code is based on
the example in section 13.4 (Concurrent executions) of the 3rd edition of
"Database System Concepts" by Silberschatz et al. I have used this book to
write simple post-mortem deadlock detection for Stackless Python.
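As for the 'transaction' interface Armin mentions, a guess at minimal
usage follows, reusing the Account class from bank.py; the add/run names
and signatures are assumptions, not confirmed anywhere in this thread:

    # Purely illustrative sketch: assumes the built-in module exposes
    # something like transaction.add(callable, *args) and
    # transaction.run(), which is a guess, not a documented interface.
    import transaction   # from the --withmod-transaction build

    class Account(object):
        def __init__(self, value):
            self.value = value

    fromAccount = Account(1000)
    toAccount = Account(2000)

    def transfer(source, target, amount):
        source.value -= amount
        target.value += amount

    # Queue two transfers, then run them; the point of the module is
    # that the queued callables execute atomically w.r.t. each other.
    transaction.add(transfer, fromAccount, toAccount, 50)
    transaction.add(transfer, toAccount, fromAccount, 10)
    transaction.run()

If the interface is anything along these lines, the invariant from bank.py
(fromAccount.value + toAccount.value staying constant) is the natural check
to assert after run() returns.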
I also intend to write a C version of bank.py using the libstm and rstm libraries. Cheers, Andrew ________________________________ From: Armin Rigo To: Andrew Francis Cc: Amaury Forgeot d'Arc ; "pypy-dev at codespeak.net" Sent: Wednesday, January 25, 2012 2:06 PM Subject: Re: Unknown Operation Re: [pypy-dev] What Causes An Ambiguous low-level helper specialization Error? Hi Andrew, On Wed, Jan 25, 2012 at 19:56, Andrew Francis wrote: > I can't wait to see this code compile! Bank accounts are the "hello world" > of transactional programming! In case you want to avoid having to hack at RPython: by now, you can also compile the full PyPy with transactions enabled.? You just get a PyPy with no GC at all (so it cannot run for more than a few seconds before running out of memory).? You then get a sane interface from the built-in module 'transaction', and don't have to write any RPython code. I use the following command line (it's very fast and doesn't require too much RAM, because of no GC): ./translate.py -O1 --gc=none --stm targetpypystandalone.py --no-allworkingmodules --withmod-transaction A bient?t, Armin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Jan 25 22:29:30 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 25 Jan 2012 23:29:30 +0200 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: <4F2042BB.8050107@gmail.com> References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> Message-ID: On Wed, Jan 25, 2012 at 7:58 PM, Antonio Cuni wrote: > On 01/25/2012 06:41 PM, Armin Rigo wrote: > >> It is indeed a cool idea. ?I note that you're interested in a sprint >> on NumPyPy. ?So I think that Fijal is the best-placed person to >> suggest a time, maybe next time he's in Europe. ?I don't think it >> would make much sense without him, at least. ?If we can have him, then >> count me in as well :-) > > > Sorry, I didn't notice the original mail until today. Anyway, depending on > the time, I might be interested too :-) > > ciao, > Anto Hello. I will be in Europe around EuroPython. What do people think about those dates (it can be 2-3 weeks around). Cheers, fijal From arigo at tunes.org Wed Jan 25 22:39:35 2012 From: arigo at tunes.org (Armin Rigo) Date: Wed, 25 Jan 2012 22:39:35 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> Message-ID: Hi Fijal, On Wed, Jan 25, 2012 at 22:29, Maciej Fijalkowski wrote: > I will be in Europe around EuroPython. What do people think about > those dates (it can be 2-3 weeks around). Sounds good to me. Armin From mmueller at python-academy.de Wed Jan 25 23:47:14 2012 From: mmueller at python-academy.de (=?UTF-8?B?TWlrZSBNw7xsbGVy?=) Date: Wed, 25 Jan 2012 23:47:14 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> Message-ID: <4F208672.1020501@python-academy.de> Am 25.01.2012 22:29, schrieb Maciej Fijalkowski: > On Wed, Jan 25, 2012 at 7:58 PM, Antonio Cuni wrote: >> On 01/25/2012 06:41 PM, Armin Rigo wrote: >> >>> It is indeed a cool idea. I note that you're interested in a sprint >>> on NumPyPy. So I think that Fijal is the best-placed person to >>> suggest a time, maybe next time he's in Europe. I don't think it >>> would make much sense without him, at least. 
If we can have him, then >>> count me in as well :-) >> >> >> Sorry, I didn't notice the original mail until today. Anyway, depending on >> the time, I might be interested too :-) >> >> ciao, >> Anto > > Hello. > > I will be in Europe around EuroPython. What do people think about > those dates (it can be 2-3 weeks around). Hi, Both, the week before (June 25 - 29) and after EuroPython (July 9 - 13) are still available. How about one of these? Cheers, Mike > > Cheers, > fijal > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From andrewfr_ice at yahoo.com Wed Jan 25 23:49:03 2012 From: andrewfr_ice at yahoo.com (Andrew Francis) Date: Wed, 25 Jan 2012 14:49:03 -0800 (PST) Subject: [pypy-dev] SOLVED Re: Unknown Operation Re: What Causes An Ambiguous low-level helper specialization Error? In-Reply-To: References: <1327356379.11009.YahooMailNeo@web120706.mail.ne1.yahoo.com> <1327437528.3806.YahooMailNeo@web120704.mail.ne1.yahoo.com> <1327514016.50190.YahooMailNeo@web120703.mail.ne1.yahoo.com> <1327517769.49485.YahooMailNeo@web120703.mail.ne1.yahoo.com> Message-ID: <1327531743.72542.YahooMailNeo@web120706.mail.ne1.yahoo.com> Hi Folks: Silly error. I didn't notice that targetdemo.py was updated: the API changed! Using the latest version as an example, I got my example to work. Yeah! Thanks to everyone for the help! Cheers, Andrew ________________________________ From: Armin Rigo To: Andrew Francis Cc: Amaury Forgeot d'Arc ; "pypy-dev at codespeak.net" Sent: Wednesday, January 25, 2012 2:06 PM Subject: Re: Unknown Operation Re: [pypy-dev] What Causes An Ambiguous low-level helper specialization Error? Hi Andrew, On Wed, Jan 25, 2012 at 19:56, Andrew Francis wrote: > I can't wait to see this code compile! Bank accounts are the "hello world" > of transactional programming! In case you want to avoid having to hack at RPython: by now, you can also compile the full PyPy with transactions enabled.? You just get a PyPy with no GC at all (so it cannot run for more than a few seconds before running out of memory).? You then get a sane interface from the built-in module 'transaction', and don't have to write any RPython code. I use the following command line (it's very fast and doesn't require too much RAM, because of no GC): ./translate.py -O1 --gc=none --stm targetpypystandalone.py --no-allworkingmodules --withmod-transaction A bient?t, Armin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Jan 25 23:58:24 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 26 Jan 2012 00:58:24 +0200 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: <4F208672.1020501@python-academy.de> References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> Message-ID: On Thu, Jan 26, 2012 at 12:47 AM, Mike M?ller wrote: > Am 25.01.2012 22:29, schrieb Maciej Fijalkowski: >> On Wed, Jan 25, 2012 at 7:58 PM, Antonio Cuni wrote: >>> On 01/25/2012 06:41 PM, Armin Rigo wrote: >>> >>>> It is indeed a cool idea. ?I note that you're interested in a sprint >>>> on NumPyPy. ?So I think that Fijal is the best-placed person to >>>> suggest a time, maybe next time he's in Europe. ?I don't think it >>>> would make much sense without him, at least. ?If we can have him, then >>>> count me in as well :-) >>> >>> >>> Sorry, I didn't notice the original mail until today. 
Anyway, depending on >>> the time, I might be interested too :-) >>> >>> ciao, >>> Anto >> >> Hello. >> >> I will be in Europe around EuroPython. What do people think about >> those dates (it can be 2-3 weeks around). > > Hi, > > Both, the week before (June 25 - 29) and after EuroPython > (July 9 - 13) are still available. > How about one of these? Works for me, preferably before? No idea what others say for example. > > Cheers, > Mike >> >> Cheers, >> fijal >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From arigo at tunes.org Thu Jan 26 09:38:21 2012 From: arigo at tunes.org (Armin Rigo) Date: Thu, 26 Jan 2012 09:38:21 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> Message-ID: Hi, On Wed, Jan 25, 2012 at 23:58, Maciej Fijalkowski wrote: >> Both, the week before (June 25 - 29) and after EuroPython >> (July 9 - 13) are still available. How about one of these? > > Works for me, preferably before? No idea what others say for example. Both work for me. Armin From david.schneider at picle.org Thu Jan 26 10:45:25 2012 From: david.schneider at picle.org (David Schneider) Date: Thu, 26 Jan 2012 10:45:25 +0100 Subject: [pypy-dev] [pypy-commit] pypy arm-backend-2: also remove test_compile_asmlen from runner_test after it was moved to the x86 backend In-Reply-To: References: <20120118113545.7B141820D8@wyvern.cs.uni-duesseldorf.de> Message-ID: <4F2120B5.3040404@picle.org> Maciej Fijalkowski wrote: > Note that this is completely wrong - the test is precisely in > runner_test because this is an expected JIT backend interface for jit > hooks to work > > On Wed, Jan 18, 2012 at 1:35 PM, bivab wrote: >> Author: David Schneider >> Branch: arm-backend-2 >> Changeset: r51441:839659291f03 >> Date: 2012-01-18 12:33 +0100 >> http://bitbucket.org/pypy/pypy/changeset/839659291f03/ >> >> Log: also remove test_compile_asmlen from runner_test after it was moved >> to the x86 backend >> >> diff --git a/pypy/jit/backend/test/runner_test.py b/pypy/jit/backend/test/runner_test.py >> --- a/pypy/jit/backend/test/runner_test.py >> +++ b/pypy/jit/backend/test/runner_test.py >> @@ -3188,55 +3188,6 @@ >> res = self.cpu.get_latest_value_int(0) >> assert res == -10 >> >> - def test_compile_asmlen(self): >> - from pypy.jit.backend.llsupport.llmodel import AbstractLLCPU >> - if not isinstance(self.cpu, AbstractLLCPU): >> - py.test.skip("pointless test on non-asm") >> - from pypy.jit.backend.x86.tool.viewcode import machine_code_dump >> - import ctypes >> - ops = """ >> - [i2] >> - i0 = same_as(i2) # but forced to be in a register >> - label(i0, descr=1) >> - i1 = int_add(i0, i0) >> - guard_true(i1, descr=faildesr) [i1] >> - jump(i1, descr=1) >> - """ >> - faildescr = BasicFailDescr(2) >> - loop = parse(ops, self.cpu, namespace=locals()) >> - faildescr = loop.operations[-2].getdescr() >> - jumpdescr = loop.operations[-1].getdescr() >> - bridge_ops = """ >> - [i0] >> - jump(i0, descr=jumpdescr) >> - """ >> - bridge = parse(bridge_ops, self.cpu, namespace=locals()) >> - looptoken = JitCellToken() >> - self.cpu.assembler.set_debug(False) >> - info = self.cpu.compile_loop(loop.inputargs, loop.operations, looptoken) >> - bridge_info 
>> -                                              bridge.operations,
>> -                                              looptoken)
>> -        self.cpu.assembler.set_debug(True)    # always on untranslated
>> -        assert info.asmlen != 0
>> -        cpuname = autodetect_main_model_and_size()
>> -        # XXX we have to check the precise assembler, otherwise
>> -        # we don't quite know if borders are correct
>> -
>> -        def checkops(mc, ops):
>> -            assert len(mc) == len(ops)
>> -            for i in range(len(mc)):
>> -                assert mc[i].split("\t")[-1].startswith(ops[i])
>> -
>> -        data = ctypes.string_at(info.asmaddr, info.asmlen)
>> -        mc = list(machine_code_dump(data, info.asmaddr, cpuname))
>> -        lines = [line for line in mc if line.count('\t') == 2]
>> -        checkops(lines, self.add_loop_instructions)
>> -        data = ctypes.string_at(bridge_info.asmaddr, bridge_info.asmlen)
>> -        mc = list(machine_code_dump(data, bridge_info.asmaddr, cpuname))
>> -        lines = [line for line in mc if line.count('\t') == 2]
>> -        checkops(lines, self.bridge_loop_instructions)
>> -
>>
>>     def test_compile_bridge_with_target(self):
>>         # This test creates a loopy piece of code in a bridge, and builds another
>> _______________________________________________ >> pypy-commit mailing list >> pypy-commit at python.org >> http://mail.python.org/mailman/listinfo/pypy-commit > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Sorry, I missed that. I refactored the test on the branch so it now works with both backends. Regards, David From anto.cuni at gmail.com Thu Jan 26 11:04:47 2012 From: anto.cuni at gmail.com (Antonio Cuni) Date: Thu, 26 Jan 2012 11:04:47 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> Message-ID: <4F21253F.3020506@gmail.com> On 01/25/2012 11:58 PM, Maciej Fijalkowski wrote: >> Both the week before EuroPython (June 25 - 29) and the week after >> (July 9 - 13) are still available. >> How about one of these? > > Works for me, preferably the week before. No idea what others think, though. Probably the one before works better for me as well. However, I cannot fully promise to be present on those dates. ciao, Anto From dje.gcc at gmail.com Thu Jan 26 21:21:16 2012 From: dje.gcc at gmail.com (David Edelsohn) Date: Thu, 26 Jan 2012 15:21:16 -0500 Subject: [pypy-dev] PyPy PPC64 Prolog translation Message-ID: Trying to keep everyone informed about what I learn while debugging the PPC backend, and to get some help with the next step in the analysis. The translated version of the Prolog interpreter (without JIT) runs for most examples, but crashes on one. I recompiled the interpreter with "-O1 -g" to get a better traceback. $ pyrolog-c iter.pl >?- iterate_findall(10000). Program received signal SIGSEGV, Segmentation fault.
pypy_g_throw (l_exc_1=0xfffb76a99c0, l_scont_4=, l_fcont_4=, l_heap_6=) at interpreter_continuation.c:6580
6580 l_v161545 = (struct pypy_object_vtable0 *)_OP_GET_NEXT_GROUP_MEMBER((&pypy_g_typeinfo), (pypy_halfword_t)l_v161544->_gcheader.h_tid, sizeof(struct pypy_type_info0));
(gdb) where
#0 pypy_g_throw (l_exc_1=0xfffb76a99c0, l_scont_4=, l_fcont_4=, l_heap_6=) at interpreter_continuation.c:6580
#1 0x000000001010f8b8 in pypy_g_driver (l_scont_1=, l_fcont_1=, l_heap_2=) at interpreter_continuation.c:3192
#2 0x0000000010116a30 in pypy_g_run (l_query_201=, l_var_to_pos_4=0xfffb76a9748) at interpreter_translatedmain.c:2120
#3 0x000000001011861c in pypy_g_repl () at interpreter_translatedmain.c:231
#4 0x000000001010bf58 in pypy_g_entry_point (l_argv_1=) at targetprologstandalone.c:242
#5 0x0000000010003708 in pypy_main_function (argc=, argv=) at /home/dje/src/pypy-ppc/pypy/translator/c/src/main.h:62
#6 0x00000fffb7c9f05c in .generic_start_main () from /lib64/power7/libc.so.6
#7 0x00000fffb7c9f27c in .__libc_start_main () from /lib64/power7/libc.so.6
#8 0x0000000000000000 in ?? ()

The following blocks from interpreter_continuation.c:pypy_g_throw() lead to the segfault, but I am not sure why the data structure is initialized with a bad value:

6574 block0:
6575 l_heap_689 = l_heap_6;
6576 l_scont_245 = l_scont_4;
6577 goto block1;
6578
6579 block1:
6580 l_v161066 = (struct pypy_object0 *)l_scont_245;
6581 l_v161067 = (struct pypy_object_vtable0 *)_OP_GET_NEXT_GROUP_MEMBER((&pypy_g_typeinfo), (pypy_halfword_t)l_v161066->_gcheader.h_tid, sizeof(struct pypy_type_info0));
6582 l_v161068 = (struct pypy_prolog_interpreter_continuation_Continuation_vtabl0 *)l_v161067;
6583 l_v161069 = RPyField(l_v161068, c_cls_is_done);
...
6736 block19:
6737 l_v161116 = RPyField(l_scont_245, c_inst_nextcont);
6738 l_scont_245 = l_v161116;
6739 goto block1;

l_scont_245 is correct upon initial entry (assigned from the l_scont_4 argument). Execution flows through block19 3 times before the crash. On the third "iteration", l_scont_245 is set to 0, then dereferenced in block1, causing the segfault. pypy_g_throw() is entered with the chain of continuations (I assume that is what c_inst_nextcont means) already containing a NULL.

>?- iterate_findall(10000).
Breakpoint 1, pypy_g_throw (l_v161056=0xfffb76a99c0, l_scont_4=0xfffb76a98a0, l_fcont_4=0xfffb76a9748, l_heap_6=0xfffb76a9778) at interpreter_continuation.c:6581
(gdb) x/4x l_scont_4
0xfffb76a98a0: 0x00000000 0x00000fb0 0x00000fff 0xb76a9728
(gdb) x/4x 0xfffb76a9728
0xfffb76a9728: 0x00000000 0x000037f0 0x00000fff 0xb76a95e8
(gdb) x/4x 0xfffb76a95e8
0xfffb76a95e8: 0x00000000 0x00003ad8 0x00000000 0x00000000

The third and fourth values are the 64-bit address of the next field in the chain. I am not sure how this chain of continuations is initialized, but one element contains a NULL pointer. Thanks, David From arigo at tunes.org Thu Jan 26 21:40:37 2012 From: arigo at tunes.org (Armin Rigo) Date: Thu, 26 Jan 2012 21:40:37 +0100 Subject: [pypy-dev] PyPy PPC64 Prolog translation In-Reply-To: References: Message-ID: Hi David, On Thu, Jan 26, 2012 at 21:21, David Edelsohn wrote: > $ pyrolog-c iter.pl Where can I find this iter.pl? Armin From mmueller at python-academy.de Fri Jan 27 12:22:51 2012 From: mmueller at python-academy.de (Mike Müller) Date: Fri, 27 Jan 2012 12:22:51 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany?
In-Reply-To: <4F21253F.3020506@gmail.com> References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> <4F21253F.3020506@gmail.com> Message-ID: <4F22890B.1090906@python-academy.de> On 26.01.2012 11:04, Antonio Cuni wrote: > On 01/25/2012 11:58 PM, Maciej Fijalkowski wrote: > > >>> Both the week before EuroPython (June 25 - 29) and the week after >>> (July 9 - 13) are still available. >>> How about one of these? >> >> Works for me, preferably the week before. No idea what others think, though. > > Probably the one before works better for me as well. However, I cannot fully > promise to be present on those dates. Ok. Let's make it the week before EuroPython then. I think the earlier the announcement is out, the higher the chance that people can arrange the dates. Let me know how to proceed from here. I will take full charge of the working space and can help in finding affordable places to stay. Furthermore, I can help if you would like to apply for funding from the PSF or other organizations. I am willing to help where I can. Cheers, Mike From arigo at tunes.org Fri Jan 27 12:34:12 2012 From: arigo at tunes.org (Armin Rigo) Date: Fri, 27 Jan 2012 12:34:12 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: <4F22890B.1090906@python-academy.de> References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> <4F21253F.3020506@gmail.com> <4F22890B.1090906@python-academy.de> Message-ID: Hi Mike, On Fri, Jan 27, 2012 at 12:22, Mike Müller wrote: > Ok. Let's make it the week before EuroPython then. Great, thanks! Note however that we should wait until the EuroPython dates are officially announced (I think they're not far off). If they shift by a few days, it would be annoying if the sprint ends up conflicting. A bientôt, Armin. From mmueller at python-academy.de Fri Jan 27 12:59:53 2012 From: mmueller at python-academy.de (Mike Müller) Date: Fri, 27 Jan 2012 12:59:53 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany? In-Reply-To: References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> <4F21253F.3020506@gmail.com> <4F22890B.1090906@python-academy.de> Message-ID: <4F2291B9.4070809@python-academy.de> Hi Armin, On 27.01.2012 12:34, Armin Rigo wrote: > Hi Mike, > > On Fri, Jan 27, 2012 at 12:22, Mike Müller wrote: >> Ok. Let's make it the week before EuroPython then. > > Great, thanks! Note however that we should wait until the EuroPython > dates are officially announced (I think they're not far off). If they > shift by a few days, it would be annoying if the sprint ends up conflicting. Good point. The dates should be fixed and announced any minute ;): http://mail.python.org/pipermail/europython/2011-December/thread.html#8014 So let's plan to have the sprint dates announced no later than two weeks after the EuroPython dates are written on the EuroPython website. What do you think? Mike From arigo at tunes.org Sat Jan 28 12:31:52 2012 From: arigo at tunes.org (Armin Rigo) Date: Sat, 28 Jan 2012 12:31:52 +0100 Subject: [pypy-dev] PyPy/NumPyPy sprint(s) in Leipzig, Germany?
In-Reply-To: <4F2291B9.4070809@python-academy.de> References: <4EFF29FF.7040009@python-academy.de> <4F2042BB.8050107@gmail.com> <4F208672.1020501@python-academy.de> <4F21253F.3020506@gmail.com> <4F22890B.1090906@python-academy.de> <4F2291B9.4070809@python-academy.de> Message-ID: Hi Mike, On Fri, Jan 27, 2012 at 12:59, Mike Müller wrote: > So let's plan to have the sprint dates announced no later than two weeks > after the EuroPython dates are written on the EuroPython website. > What do you think? Sounds good :-) A bientôt, Armin From jfcgauss at gmail.com Mon Jan 30 16:52:00 2012 From: jfcgauss at gmail.com (Serhat Sevki Dincer) Date: Mon, 30 Jan 2012 17:52:00 +0200 Subject: [pypy-dev] JITability Message-ID: Hi, Suppose you have a function to validate local/international phone numbers (one space between number groups, no space at start/end, no zero at start) like:

def validatePhone(v):
    if not (9 < len(v) < 19 and '0' < v[0]): # recall ' ' < '0'
        return False

    for c in v:
        if '0' <= c <= '9':
            wasspace = False
        else:
            if c != ' ' or wasspace:
                return False
            wasspace = True

    if wasspace:
        return False # last one
    return True

Just out of curiosity, would JITability of this function change if I put "wasspace = False" just before the loop? Note: attaching just in case indentation is lost when sending. -------------- next part -------------- A non-text attachment was scrubbed... Name: phone.py Type: text/x-python Size: 279 bytes Desc: not available URL: From benjamin at python.org Mon Jan 30 16:56:59 2012 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 30 Jan 2012 10:56:59 -0500 Subject: [pypy-dev] JITability In-Reply-To: References: Message-ID: 2012/1/30 Serhat Sevki Dincer : > Hi, > Suppose you have a function to validate local/international phone numbers > (one space between number groups, no space at start/end, no zero at start) > like: >
> def validatePhone(v):
>     if not (9 < len(v) < 19 and '0' < v[0]): # recall ' ' < '0'
>         return False
>
>     for c in v:
>         if '0' <= c <= '9':
>             wasspace = False
>         else:
>             if c != ' ' or wasspace:
>                 return False
>             wasspace = True
>
>     if wasspace:
>         return False # last one
>     return True
>
> Just out of curiosity, would JITability of this function change if I put
> "wasspace = False" just before the loop?

It likely doesn't matter unless you expect phone numbers to be longer than 1000 chars. From fijall at gmail.com Mon Jan 30 17:05:06 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 30 Jan 2012 18:05:06 +0200 Subject: [pypy-dev] JITability In-Reply-To: References: Message-ID: On Mon, Jan 30, 2012 at 5:56 PM, Benjamin Peterson wrote: > 2012/1/30 Serhat Sevki Dincer : >> Hi, >> Suppose you have a function to validate local/international phone numbers >> (one space between number groups, no space at start/end, no zero at start) >> like: >>
>> def validatePhone(v):
>>     if not (9 < len(v) < 19 and '0' < v[0]): # recall ' ' < '0'
>>         return False
>>
>>     for c in v:
>>         if '0' <= c <= '9':
>>             wasspace = False
>>         else:
>>             if c != ' ' or wasspace:
>>                 return False
>>             wasspace = True
>>
>>     if wasspace:
>>         return False # last one
>>     return True
>>
>> Just out of curiosity, would JITability of this function change if I put
>> "wasspace = False" just before the loop?
>
> It likely doesn't matter unless you expect phone numbers to be longer
> than 1000 chars.

Or you call the function more than once, Benjamin. Answering the *actual* question - I think it does not matter, and for more than one reason in this particular case (one of those being that even though bool objects are boxed, there are only two of them: True and False). Cheers, fijal From skip at pobox.com Mon Jan 30 17:26:57 2012 From: skip at pobox.com (skip at pobox.com) Date: Mon, 30 Jan 2012 10:26:57 -0600 (CST) Subject: [pypy-dev] PyPy on Solaris? Message-ID: <20120130162658.181E628BBE4D@montanaro.dyndns.org> I'm trying to build PyPy (1.7) on Solaris and keep getting this error very early in the translate process:

    /opt/app/nonc++/gcc-4.5.2/lib/gcc/i386-pc-solaris2.10/4.5.2/include-fixed/sys/feature_tests.h:345:2:
    error: #error "Compiler or options invalid; UNIX 03 and POSIX.1-2001
    applications require the use of c99"

I saw a similar thread from last September on this topic which sort of suggested that there is no support for Solaris: http://comments.gmane.org/gmane.comp.python.pypy/8377 Has anything changed since then? If so, how do I craft my translate.py command line to get this to build? Thx, -- Skip Montanaro - skip at pobox.com - http://www.smontanaro.net/ From davide.setti at gmail.com Mon Jan 30 18:40:19 2012 From: davide.setti at gmail.com (Davide Setti) Date: Mon, 30 Jan 2012 18:40:19 +0100 Subject: [pypy-dev] JITability In-Reply-To: References: Message-ID: On Mon, Jan 30, 2012 at 5:05 PM, Maciej Fijalkowski wrote: > I think it does not matter, and for > more than one reason in this particular case (one of those being that > even though bool objects are boxed, there are only two of them: True > and False) Probably this is another question: what about using 1 and 0 instead of True and False? In (your) jitviewer I see that the "infinite while loop" 'while 1' is better than 'while True'. Am I right? Is it faster in the validatePhone case too? Is it possible to remove this (very small) overhead with the JIT? Thanks -- Davide Setti code: http://github.com/vad From fijall at gmail.com Mon Jan 30 18:47:57 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 30 Jan 2012 19:47:57 +0200 Subject: [pypy-dev] JITability In-Reply-To: References: Message-ID: On Mon, Jan 30, 2012 at 7:40 PM, Davide Setti wrote: > On Mon, Jan 30, 2012 at 5:05 PM, Maciej Fijalkowski > wrote: >> >> I think it does not matter, and for >> more than one reason in this particular case (one of those being that >> even though bool objects are boxed, there are only two of them: True >> and False) > > Probably this is another question: what about using 1 and 0 instead of True > and False? Not any different in this case. > > In (your) jitviewer I see that the "infinite while loop" 'while 1' is > better than 'while True'. Am I right? Is it faster in the validatePhone case > too? Is it possible to remove this (very small) overhead with the JIT? > > Thanks > -- > > Davide Setti > code: http://github.com/vad The actual loop for while 1: and while True: is not different, only the preamble differs. It's generally faster in interpreted mode because you don't have to check whether the global True has been rebound. Example:

while True:
    ...
    if some_condition:
        True = False
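For the curious, this difference is easy to see in the interpreter itself. Here is a minimal sketch, assuming CPython 2.x semantics, where True is an ordinary global name rather than a keyword (the function names are made up for illustration):

import dis

def with_true():
    while True:  # 'True' is a global name here: loaded and tested each pass
        break

def with_one():
    while 1:     # a constant: the compiler can drop the test entirely
        break

dis.dis(with_true)  # expect a LOAD_GLOBAL (True) plus a conditional jump
dis.dis(with_one)   # expect just an unconditional jump, with no test

On Python 3 the distinction disappears (True is a keyword there), which matches the point above: once the loop itself is JIT-compiled, the body is identical either way and only the preamble differs.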
From andrewfr_ice at yahoo.com Mon Jan 30 21:09:46 2012 From: andrewfr_ice at yahoo.com (Andrew Francis) Date: Mon, 30 Jan 2012 12:09:46 -0800 (PST) Subject: [pypy-dev] Question about Atomic_ops.h in STM Message-ID: <1327954186.2386.YahooMailNeo@web120705.mail.ne1.yahoo.com> Hi Folks: I noticed a compare_and_swap function in atomic_ops.h. On the IRC channel, it was suggested that I look in _rffi_stm.py. However it is not there (I can understand that). How can this function be exposed to an RPython programme? The reason I am asking is that if compare-and-swap functions are available, it ought to be possible to write stuff like a lock-free list. Also this gives me a chance to learn more about RPython. Cheers, Andrew From andrewfr_ice at yahoo.com Mon Jan 30 21:20:33 2012 From: andrewfr_ice at yahoo.com (Andrew Francis) Date: Mon, 30 Jan 2012 12:20:33 -0800 (PST) Subject: [pypy-dev] Question about the STM module and RSTM Message-ID: <1327954833.79806.YahooMailNeo@web120704.mail.ne1.yahoo.com> Hi Folks: I understand that the STM module is still in its infancy, and subject to change at a moment's notice. I enjoy reading the transactional memory related blog posts. Over the weekend, I have been looking at the stm and transaction modules and the C++ based rstm. I haven't written a C++ based programme yet but that will change soon. If the stm library progresses a bit more, I would like to try writing Python versions of some of the STAMP examples. Until then, I am trying to understand the relationship between stm and rstm. It does not seem that stm is simply a wrapper for rstm (a lot of work went into stm!). There are things in rstm that I see, such as the ability to configure parameters, set policies and get diagnostics (e.g., the number of aborts). Are there plans to do this with the stm module? Cheers, Andrew From arigo at tunes.org Mon Jan 30 21:37:31 2012 From: arigo at tunes.org (Armin Rigo) Date: Mon, 30 Jan 2012 21:37:31 +0100 Subject: [pypy-dev] PyPy on Solaris? In-Reply-To: <20120130162658.181E628BBE4D@montanaro.dyndns.org> References: <20120130162658.181E628BBE4D@montanaro.dyndns.org> Message-ID: Hi, On Mon, Jan 30, 2012 at 17:26, wrote: > I'm trying to build PyPy (1.7) on Solaris and keep getting this error very
> early in the translate process:
>
>     /opt/app/nonc++/gcc-4.5.2/lib/gcc/i386-pc-solaris2.10/4.5.2/include-fixed/sys/feature_tests.h:345:2:
>     error: #error "Compiler or options invalid; UNIX 03 and POSIX.1-2001
>     applications require the use of c99"

Sorry, Solaris is unsupported; we ourselves only support Linux, OS X and Win32. You will have to figure out yourself what this error message means and how to work around it. It may be just that gcc needs to be called with the flag -std=c99 on Solaris. You can try to set CC="gcc -std=c99". No real clue, though. I will let others with a similar environment answer you more precisely. A bientôt, Armin.
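For what it's worth, the suggestion above might look like this on the command line. This is only a sketch under assumptions: it assumes the PyPy 1.7 source layout, where translate.py lives under pypy/translator/goal, and that the build honors the CC environment variable; neither has been verified on Solaris:

$ cd pypy/translator/goal
$ CC="gcc -std=c99" python translate.py --opt=jit targetpypystandalone.py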
From arigo at tunes.org Mon Jan 30 21:39:05 2012 From: arigo at tunes.org (Armin Rigo) Date: Mon, 30 Jan 2012 21:39:05 +0100 Subject: [pypy-dev] Question about Atomic_ops.h in STM In-Reply-To: <1327954186.2386.YahooMailNeo@web120705.mail.ne1.yahoo.com> References: <1327954186.2386.YahooMailNeo@web120705.mail.ne1.yahoo.com> Message-ID: Hi Andrew, On Mon, Jan 30, 2012 at 21:09, Andrew Francis wrote: > I noticed a compare_and_swap function in atomic_ops.h. On the IRC channel, > it was suggested that I look in _rffi_stm.py. However it is not there (I can > understand that). How can this function be exposed to an RPython programme? > The reason I am asking is that if compare-and-swap functions are available, it > ought to be possible to write stuff like a lock-free list. Also this gives me > a chance to learn more about RPython. Indeed, compare_and_swap() is not exposed to RPython. We don't need it, because it's only called from "et.c", another C source. RPython has no similar atomic operations so far. Armin From arigo at tunes.org Mon Jan 30 21:46:44 2012 From: arigo at tunes.org (Armin Rigo) Date: Mon, 30 Jan 2012 21:46:44 +0100 Subject: [pypy-dev] Question about the STM module and RSTM In-Reply-To: <1327954833.79806.YahooMailNeo@web120704.mail.ne1.yahoo.com> References: <1327954833.79806.YahooMailNeo@web120704.mail.ne1.yahoo.com> Message-ID: Hi, On Mon, Jan 30, 2012 at 21:20, Andrew Francis wrote: > If the stm library progresses a bit > more, I would like to try writing Python versions of some of the STAMP > examples. You wouldn't be able to write pure Python versions of classical STM examples, because the "transaction" module works at a level different from most STM implementations. You can try to write RPython versions of them, just for fun, but they may break at a moment's notice. In PyPy, we look at STM like we would look at the GC. It may be replaced in a week by a different one, but for the "end user" writing pure Python code, it essentially doesn't make a difference. That's why we have no plan at all to let the user access all the details of the STM library. Even the fact that STM is used is almost an implementation detail, which has just a few visible consequences (similar to how the very old versions of Python had a GC based on refcounting alone, which didn't free reference cycles). A bientôt, Armin. From tbaldridge at gmail.com Tue Jan 31 16:26:46 2012 From: tbaldridge at gmail.com (Timothy Baldridge) Date: Tue, 31 Jan 2012 09:26:46 -0600 Subject: [pypy-dev] How will STM actually integrate with normal Python code Message-ID: As Armin stated in a recent mailing list thread: "In PyPy, we look at STM like we would look at the GC. It may be replaced in a week by a different one, but for the "end user" writing pure Python code, it essentially doesn't make a difference." So, my question is, how exactly will STM integrate into PyPy? I'm going to take a guess here, and perhaps someone can elaborate to correct me. From what I'm reading, PyPy with STM will offer the same promises (or lack of promises) that the JVM and CLR offer their code. For example, this code:

def foo(d):
    if "foo" in d:
        del d["foo"]

will never cause a segmentation fault (due to multiple threads accessing "d" at the same time), but it may throw a KeyError. That is, all Python code will continue to be "thread-safe" as it is in CPython, but race conditions will continue to exist and must be handled in standard ways (locks, CAS, etc.). Am I right in this description?
Thanks, Timothy -- "One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." (Robert Firth) From arigo at tunes.org Tue Jan 31 17:58:53 2012 From: arigo at tunes.org (Armin Rigo) Date: Tue, 31 Jan 2012 17:58:53 +0100 Subject: [pypy-dev] How will STM actually integrate with normal Python code In-Reply-To: References: Message-ID: Hi Timothy, On Tue, Jan 31, 2012 at 16:26, Timothy Baldridge wrote:
> def foo(d):
>     if "foo" in d:
>         del d["foo"]
>
> will never cause a segmentation fault (due to multiple threads
> accessing "d" at the same time), but it may throw a KeyError. That is,
> all Python code will continue to be "thread-safe" as it is in CPython,
> but race conditions will continue to exist and must be handled in
> standard ways (locks, CAS, etc.).

No, precisely not. Such code will continue to work as it is, without race conditions or any of the messy multithread-induced headaches. Locks, CAS, etc. are all superfluous. So what would work and not work? In one word: all "transactions" work exactly as if run serially, one after another. A transaction is just one unit of work; a callback. We use this working code for comparison on top of CPython or a non-STM PyPy: https://bitbucket.org/pypy/pypy/raw/stm/lib_pypy/transaction.py . You add transactions with the add() function, and execute them all with the run() function (the transactions run this way typically perform further add()s). The only source of non-determinism is in run() taking a random transaction as the next one. Of course this demo code runs the transactions serially, but the point is that even "pypy-stm" gives you the illusion of running them serially. So you start by writing code that is *safe*, and then you have to think a bit in order to increase the parallelism, instead of the other way around as with traditional multithreading in non-Python languages. There are rules that are a bit subtle (but not too much) about when transactions can parallelize or not. Basically, as soon as a transaction does I/O, all the other transactions will be stalled; and if transactions very often change the same objects, then you will get a lot of conflicts and restarts. > "In PyPy, we look at STM like we would look at the GC. It may be > replaced in a week by a different one, but for the "end user" writing > pure Python code, it essentially doesn't make a difference." I meant to say that STM, in our case, is just (hopefully) an optimization that lets some programs run on multiple CPUs --- the ones that are based on the 'transaction' module. But it's still just an optimization in the sense that the programs run exactly as if using the transaction.py I described above. In yet other words: notice that transaction.py doesn't even use the 'thread' module. So if we get the same behavior with pypy-stm's built-in 'transaction' module, it means that the example you described is perfectly safe as it is. (Update: today we have a "pypy-stm" that works exactly like transaction.py and exhibits multiple-CPU usage. It's just terribly slow and doesn't free any memory ever :-) But it runs http://paste.pocoo.org/show/543646/ , which is a simple epoll-based server creating new transactions in order to do the CPU-intensive portions of answering the requests. In the code there is no trace of CAS, locks, 'multiprocessing', etc.) A bientôt, Armin.
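To make the programming model concrete, here is a minimal sketch of user code written against the add()/run() interface described above. The add(callback, *args) signature follows the transaction.py linked earlier; process_item and the numbers are made up for illustration:

import transaction

results = []

def process_item(i):
    # Every transaction behaves as if it ran serially with all the
    # others, so this plain list.append needs no lock and no CAS.
    results.append(i * i)
    if i == 0:
        # A transaction may schedule further transactions.
        transaction.add(process_item, 10)

for i in range(5):
    transaction.add(process_item, i)
transaction.run()  # returns once no pending transactions are left
print(sorted(results))  # only the execution *order* was nondeterministic

Whether these transactions actually run in parallel is then pypy-stm's concern, not the program's: the code stays correct under the serial illusion either way.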