From arigo at tunes.org Sat Sep 1 11:19:53 2012 From: arigo at tunes.org (Armin Rigo) Date: Sat, 1 Sep 2012 11:19:53 +0200 Subject: [pypy-dev] STM "version 2" Message-ID: Hi all, To keep you informed of the progress on STM: In the middle of August I found a potentially better approach to STM, which uses copies of objects more extensively (something which is neither natural nor easy to do in C/C++, which is probably why it was not researched before). The main change is that all globally visible objects are and stay read-only. A write that occurs in theory on one such object is --- like before --- done inside a local copy of the object. The difference with before is what occurs on commit. Before, the local copy would have its content copied back over the global object. But now, the local copy "becomes" the next version of the global object. It makes the reads simpler and cheaper, because we don't have to worry about other threads committing changes to the object in parallel. We end up with the old version of the global object that says, in its header, "I'm outdated, here's a pointer to some newer version". Obviously, this approach relies on good GC support to eventually collect the old copies. I ended up documenting it extensively there, in a very terse form so far: * https://bitbucket.org/pypy/extradoc/raw/extradoc/talk/stm2012/stmimpl.rst I adapted the high-level testing framework I wrote last year to check that it is, or appears to be, correct (and found of course a couple of subtle bugs): * https://bitbucket.org/arigo/arigo/raw/default/hack/stm/python/c2.py * https://bitbucket.org/arigo/arigo/raw/default/hack/stm/python/test_c2.py And I rewrote and tested it in C: * https://bitbucket.org/arigo/arigo/raw/default/hack/stm/c2 In the next days I will start to adapt PyPy to it. This approach has the advantage over the previous one to not require complex support for the reads, which should also make the JIT backend work more directly. 
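The versioned-object scheme described above (writes go into a local copy; on commit the copy becomes the next version of the read-only global object, whose header is then marked "I'm outdated, here's a pointer to some newer version") can be sketched as a toy model in plain Python. All names below are invented for illustration and the model is single-threaded: the real implementation has to perform the check-and-link on the header atomically, abort and retry on conflict, and rely on the GC to collect old versions, as stmimpl.rst describes.

```python
class GlobalObj:
    """One immutable version of a globally visible object."""
    def __init__(self, value):
        self.value = value
        self.newer = None   # header slot: non-None means "outdated, see newer"

def latest(obj):
    # Readers chase the 'newer' chain; committed versions are never mutated
    # in place, so no locking is needed to read them.
    while obj.newer is not None:
        obj = obj.newer
    return obj

class Transaction:
    def __init__(self):
        self.local_copies = {}   # version read -> private writable copy

    def read(self, obj):
        obj = latest(obj)
        return self.local_copies.get(obj, obj).value

    def write(self, obj, value):
        obj = latest(obj)
        if obj not in self.local_copies:
            self.local_copies[obj] = GlobalObj(obj.value)
        self.local_copies[obj].value = value

    def commit(self):
        # The local copy *becomes* the next global version: nothing is
        # copied back, the old version is merely linked to it.
        # (Atomicity and retry are elided in this toy model.)
        for old, copy in self.local_copies.items():
            if old.newer is not None:
                raise RuntimeError("conflict: someone else committed first")
            old.newer = copy
        self.local_copies = {}

obj = GlobalObj(10)
t = Transaction()
t.write(obj, 11)
assert t.read(obj) == 11      # the transaction sees its own write
assert obj.value == 10        # the global version is untouched before commit
t.commit()
assert latest(obj).value == 11    # the copy is now the current version
assert obj.newer is not None      # old header points to the newer version
```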
Also, unlike the previous one, this approach works on non-Intel CPUs (the C code has been tested on POWER64). A bientôt, Armin. From matti.picus at gmail.com Wed Sep 5 03:30:00 2012 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 05 Sep 2012 04:30:00 +0300 Subject: [pypy-dev] numpy fails trigonometry with complex numbers, what to do? Message-ID: <5046AB18.8020202@gmail.com> I am trying to complete complex numbers in numpypy. Progress is good, I picked up from previous work on the numpypy-complex2 branch. Complex numbers come with extensive tests, it seems all the corner cases are covered. In porting the tests to numpypy, I came across a problem: numpy returns different results than cmath. Some of the differences are due to the fact that numpy does not raise a ValueError for dividing by 0 or other silly input values, but other differences are inexplicable (note the sign of the imaginary part): >>> numpy.arccos(complex(0.,-0.)) (1.5707963267948966-0j) >>> cmath.acos(complex(0.,-0.)) (1.5707963267948966+0j) >>> or this one: >>> cmath.acos(complex(float('inf'),2.3)) -infj >>> numpy.arccos(complex(float('inf'),2.3)) (0.78539816339744828-inf*j) Should I ignore the inconsistencies, or fix the 700 out of 2300 test instance failures? What should pypy's numpypy do - be consistent with numpy or with cmath? cmath is easier and probably faster (no need to mangle results or input args), so I would prefer cmath to trying to understand the logic behind numpy. Matti 2280c2365844 From wiktor8010 at o2.pl Wed Sep 5 09:39:54 2012 From: wiktor8010 at o2.pl (=?UTF-8?Q?Wiktor_Mizdal?=) Date: Wed, 05 Sep 2012 09:39:54 +0200 Subject: [pypy-dev] =?utf-8?q?pypy_arm_progress?= Message-ID: Hi, the "Almost There - PyPy's ARM Backend" article has "The incomplete list of open topics": -We are looking for a better way to translate PyPy for ARM, than the one described above. 
I am not sure if there currently is hardware with enough memory to directly translate PyPy on an ARM based system; this would require between 1.5 and 2 GB of memory. A fully QEMU based approach could also work, instead of Scratchbox2 that uses QEMU under the hood. -Test the JIT on different hardware. -Experiment with the JIT settings to find the optimal thresholds for ARM. -Continuous integration: We are looking for a way to run the PyPy test suite to make sure everything works as expected on ARM, here QEMU also might provide an alternative. -A long term plan would be to port the backend to ARMv5 ISA and improve the support for systems without a floating point unit. This would require to implement the ISA and create different code paths and improve the instruction selection depending on the target architecture. -Review of the generated machine code the JIT generates on ARM to see if the instruction selection makes sense for ARM. -Build a version that runs on Android. -Improve the tools, i.e. integrate with jitviewer. Which points are already done? Wiktor Mizdal From david.schneider at picle.org Wed Sep 5 10:37:49 2012 From: david.schneider at picle.org (David Schneider) Date: Wed, 5 Sep 2012 10:37:49 +0200 Subject: [pypy-dev] pypy arm progress In-Reply-To: References: Message-ID: <6CE3FA47-2341-4481-A8F4-E0356580C7B2@picle.org> Hi Wiktor, > -We are looking for a better way to translate PyPy for ARM, than the one describe above. I am not sure if there currently is hardware with enough memory to directly translate PyPy on an ARM based system, this would require between 1.5 or 2 Gig of memory. A fully QEMU based approach could also work, instead of Scratchbox2 that uses QEMU under the hood. By now I think using Scratchbox is the best approach to translate PyPy for ARM. The cross-compilation tools have gotten better over the last year and the setup has become easier to reproduce. 
There are some machines like the Calxeda servers[1] that have enough resources to translate PyPy directly on the host, but it is still at least 4 or 5 times slower than cross-translating. > -Test the JIT on different hardware. We have a BeagleBoard-xM[2] and an i.MX53 Quick Start Board[3] that run the JIT backend tests nightly. > -Experiment with the JIT settings to find the optimal thresholds for ARM. Still pending. > -Continuous integration: We are looking for a way to run the PyPy test suite to make sure everything works as expected on ARM, here QEMU also might provide an alternative. Since yesterday we have a buildbot on a dual core x86_64 machine that uses a combination of chroot and qemu-arm to run tests. There is a builder that runs the PyPy unit tests, which is very slow in this setup and can mainly be improved by having more cores to run the tests, and a builder that translates one version of PyPy with the JIT and one without, to run tests on top of them. There are test failures in all of these builders, some of them are apparently related to architecture specific things and in some cases we are seeing seemingly random segfaults in qemu. So there is also some work left to be done. > -A long term plan would be to port the backend to ARMv5 ISA and improve the support for systems without a floating point unit. This would require to implement the ISA and create different code paths and improve the instruction selection depending on the target architecture. This is still pending. > -Review of the generated machine code the JIT generates on ARM to see if the instruction selection makes sense for ARM. This is also pending. > -Build a version that runs on Android. This one is still pending, but there was a discussion about it last week on the mailing list. > -Improve the tools, i.e. integrate with jitviewer. This one is done. 
My main focus currently is on getting the branch and the testing infrastructure into a state that allows it to be merged back into the main development line. Regards, David [1] http://www.calxeda.com/technology/products/ [2] http://beagleboard.org/hardware-xM/ [3] http://l.bivab.de/OS6jOu From fijall at gmail.com Wed Sep 5 11:20:49 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 5 Sep 2012 11:20:49 +0200 Subject: [pypy-dev] numpy fails trigonometry with complex numbers, what to do? In-Reply-To: <5046AB18.8020202@gmail.com> References: <5046AB18.8020202@gmail.com> Message-ID: On Wed, Sep 5, 2012 at 3:30 AM, Matti Picus wrote: > I am trying to complete complex numbers in numpypy. > Progress is good, I picked up from previous work on the numpypy-complex2 > branch. > Complex numbers come with extensive tests, it seems all the corner cases are > covered. > In porting the tests to numpypy, I came across a problem: numpy returns > different results than cmath. > Some of the differences are due to the fact that numpy does not raise a > ValueError for dividing by 0 or other silly input values, > but other differences are inexplicable (note the sign of the imaginary > part): >>>> numpy.arccos(complex(0.,-0.)) > (1.5707963267948966-0j) >>>> cmath.acos(complex(0.,-0.)) > (1.5707963267948966+0j) >>>> > > or this one: >>>> cmath.acos(complex(float('inf'),2.3)) > -infj >>>> numpy.arccos(complex(float('inf'),2.3)) > (0.78539816339744828-inf*j) > > Should I ignore the inconsistencies, or fix the 700 out of 2300 test > instance failures? > What should pypy's numpypy do - be consistent with numpy or with cmath? > cmath is easier and probably faster (no need to mangle results or input > args), so I would prefer cmath to trying to understand the logic behind > numpy. > Matti If you ask me, cmath is correct and numpy just didn't care. Maybe you should submit a bug report to them instead? 
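For reference, both disputed values can be reproduced with CPython's cmath alone, whose behaviour for signed zeros and infinities follows the C99 Annex G conventions discussed later in this thread (a small demo; numpy is not needed to see the expected answers):

```python
import cmath
import math

# acos of 0 - 0j: per Annex G the imaginary part of the result is +0.0,
# i.e. (pi/2 + 0j), matching the cmath line in the session quoted above.
r1 = cmath.acos(complex(0.0, -0.0))
assert abs(r1.real - math.pi / 2) < 1e-12
assert math.copysign(1.0, r1.imag) == 1.0   # +0j, not -0j

# acos of (+inf + 2.3j): Annex G gives +0.0 - inf*1j, which Python prints
# as -infj, again matching cmath rather than numpy's 0.785... answer.
r2 = cmath.acos(complex(float('inf'), 2.3))
assert r2.real == 0.0
assert math.isinf(r2.imag) and r2.imag < 0.0
```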
From mmueller at python-academy.de Wed Sep 5 11:54:58 2012 From: mmueller at python-academy.de (=?ISO-8859-1?Q?Mike_M=FCller?=) Date: Wed, 05 Sep 2012 11:54:58 +0200 Subject: [pypy-dev] numpy fails trigonometry with complex numbers, what to do? In-Reply-To: <5046AB18.8020202@gmail.com> References: <5046AB18.8020202@gmail.com> Message-ID: <50472172.3040304@python-academy.de> On 05.09.12 03:30, Matti Picus wrote: > I am trying to complete complex numbers in numpypy. > Progress is good, I picked up from previous work on the numpypy-complex2 branch. > Complex numbers come with extensive tests, it seems all the corner cases are > covered. > In porting the tests to numpypy, I came across a problem: numpy returns > different results than cmath. > Some of the differences are due to the fact that numpy does not raise a > ValueError for dividing by 0 or other silly input values, > but other differences are inexplicable (note the sign of the imaginary part): >>>> numpy.arccos(complex(0.,-0.)) > (1.5707963267948966-0j) >>>> cmath.acos(complex(0.,-0.)) > (1.5707963267948966+0j) >>>> > > or this one: >>>> cmath.acos(complex(float('inf'),2.3)) > -infj >>>> numpy.arccos(complex(float('inf'),2.3)) > (0.78539816339744828-inf*j) > > Should I ignore the inconsistencies, or fix the 700 out of 2300 test instance > failures? > What should pypy's numpypy do - be consistent with numpy or with cmath? > cmath is easier and probably faster (no need to mangle results or input args), > so I would prefer cmath to trying to understand the logic behind numpy. > Matti > In NumPy you can change how numerical exceptions are handled: http://docs.scipy.org/doc/numpy/reference/routines.err.html http://docs.scipy.org/doc/numpy/user/misc.html#how-numpy-handles-numerical-exceptions >>> import numpy >>> numpy.__version__ '1.6.2' >>> numpy.arccos(complex(float('inf'),2.3)) -c:1: RuntimeWarning: invalid value encountered in arccos (nan-inf*j) # Warning only once. 
>>> numpy.arccos(complex(float('inf'),2.3)) (nan-inf*j) >>> old_settings = numpy.seterr(all='raise') >>> old_settings Out[8]: {'divide': 'warn', 'invalid': 'warn', 'over': 'warn', 'under': 'ignore'} >>> numpy.arccos(complex(float('inf'),2.3)) --------------------------------------------------------------------------- FloatingPointError Traceback (most recent call last) in () ----> 1 numpy.arccos(complex(float('inf'),2.3)) >>> old_settings = numpy.seterr(all='ignore') >>> numpy.arccos(complex(float('inf'),2.3)) (nan-inf*j) HTH, Mike From andrewfr_ice at yahoo.com Wed Sep 5 18:28:29 2012 From: andrewfr_ice at yahoo.com (Andrew Francis) Date: Wed, 5 Sep 2012 09:28:29 -0700 (PDT) Subject: [pypy-dev] STM "version 2" In-Reply-To: References: Message-ID: <1346862509.27638.YahooMailNeo@web140702.mail.bf1.yahoo.com> Hi Armin: ________________________________ From: Armin Rigo To: PyPy Developer Mailing List Sent: Saturday, September 1, 2012 5:19 AM Subject: [pypy-dev] STM "version 2" >To keep you informed of the progress on STM: >In the middle of August I found a potentially better approach to STM, >which uses copies of objects more extensively (something which is >neither natural nor easy to do in C/C++, which is probably why it was >not researched before) >I ended up documenting it extensively there, in a very terse form so far: >* https://bitbucket.org/pypy/extradoc/raw/extradoc/talk/stm2012/stmimpl.rst I did an initial reading of the document. Wow! I will assume that this is the implementation section of https://bitbucket.org/pypy/pypy/raw/stm-thread/pypy/doc/stm.rst I soon intend to start playing with CPython + atomic. I will re-read the STM document. Along the way I would like to see if I can make some diagrams and categorise the techniques used (i.e., optimistic locking vs pessimistic, undo logs vs redo logs) to see if I understand what is happening. Also I can ask questions along the way. 
Perhaps if I understand enough, I can give a lightning talk (5 minutes) at the next Montreal Python User group meeting. Again, this is great stuff! Salut, Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: From editor at downloadatlas.com Thu Sep 6 07:20:19 2012 From: editor at downloadatlas.com (Vera Zubor - DownloadAtlas.com) Date: Thu, 6 Sep 2012 01:20:19 -0400 Subject: [pypy-dev] PyPy got an Editors' Choice award ! Message-ID: Hello PyPy Development Team, we would like to remind you and announce an Editor's Choice award to PyPy Current method of awarding is chosen by us to reflect the high quality of your software product. More informations at http://www.downloadatlas.com/open-source-eac5cece.html#awards Important Note: If you have any questions, please don't hesitate to contact us. Best wishes, Vera Zubor - editor e-mail: editor at downloadatlas.com ----------------------------------------------------------------------------------------------------------------------- This message and any attached files are confidential and intended solely for the addressee(s). Any publication, transmission or other use of the information by a person or entity other than the intended addressee is prohibited. If you receive this in error please contact the sender and delete the material. The sender does not accept liability for any errors or omissions as a result of the transmission. To unsubscribe or change options please login to your account using email address http://user.downloadatlas.com/?account=pypy-dev at python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: editors_choice.png Type: image/png Size: 15176 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: editors_choice_2012.png Type: image/png Size: 13620 bytes Desc: not available URL: From arigo at tunes.org Thu Sep 6 09:12:52 2012 From: arigo at tunes.org (Armin Rigo) Date: Thu, 6 Sep 2012 09:12:52 +0200 Subject: [pypy-dev] numpy fails trigonometry with complex numbers, what to do? In-Reply-To: <5046AB18.8020202@gmail.com> References: <5046AB18.8020202@gmail.com> Message-ID: Hi, On Wed, Sep 5, 2012 at 3:30 AM, Matti Picus wrote: >>>> numpy.arccos(complex(0.,-0.)) > (1.5707963267948966-0j) >>>> cmath.acos(complex(0.,-0.)) > (1.5707963267948966+0j) >>>> cmath.acos(complex(float('inf'),2.3)) > -infj >>>> numpy.arccos(complex(float('inf'),2.3)) > (0.78539816339744828-inf*j) According to the C99 standard Annex G (draft, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf), the cmath answer is the correct one in both cases. I don't know if that really means that numpy didn't care about the details. It sounds a bit strange given that it has tests for it; I fear it rather means that numpy implemented a different standard. But maybe that's me being too optimistic/pessimistic (depending on the point of view). I would indeed ask on numpy mailing lists or submit a bug entry and see their reaction. A bientôt, Armin. From wiktor8010 at o2.pl Thu Sep 6 13:36:46 2012 From: wiktor8010 at o2.pl (=?UTF-8?Q?Wiktor_Mizdal?=) Date: Thu, 06 Sep 2012 13:36:46 +0200 Subject: [pypy-dev] =?utf-8?q?STM_and_concurrent_GC?= Message-ID: Hi, will STM help to implement a concurrent garbage collector? Wiktor Mizdal -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Thu Sep 6 14:43:14 2012 From: arigo at tunes.org (Armin Rigo) Date: Thu, 6 Sep 2012 14:43:14 +0200 Subject: [pypy-dev] STM and concurrent GC In-Reply-To: References: Message-ID: Hi Wiktor, On Thu, Sep 6, 2012 at 1:36 PM, Wiktor Mizdal wrote: > will STM help to implement a concurrent garbage collector? No, that's the wrong way around. 
A concurrent GC might be needed at some point in order to help STM. But you don't want to write a GC *using* the STM features. Instead we need to write a GC taking STM into account, and probably sharing the same read/write barriers; if anything, it makes the job harder than "just" writing STM or "just" writing a concurrent GC. A bientôt, Armin. From tbaldridge at gmail.com Thu Sep 6 19:46:19 2012 From: tbaldridge at gmail.com (Timothy Baldridge) Date: Thu, 6 Sep 2012 12:46:19 -0500 Subject: [pypy-dev] Locals clearing in RPython Message-ID: Let's imagine that I have some code like the following in RPython: def wrapper_func(arg1, arg2): return inner_func(arg2) def inner_func(x): for y in range(x): # do something here pass return -1 bigint = 1000000 wrapper_func(list(range(bigint)), bigint) The problem here is that arg1 is going to be held onto (in CPython at least) until inner_func returns. This means that the list created on the invocation of wrapper_func is going to stick around during the entire execution time of inner_func. This would not make much of a difference normally, but in languages with extensive use of lazy evaluation, holding onto the head of a sequence could cause out-of-memory errors. In Clojure this is fixed up by the compiler via "locals clearing". Basically the compiler inserts "arg1 = None" before the invocation of inner_func. What's the story here in RPython? Since RPython basically compiles down to single-assignment code I'm guessing the Clojure fix won't help me. When is the GC able to go and free data held by arguments? Timothy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benjamin at python.org Thu Sep 6 20:30:39 2012 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 6 Sep 2012 14:30:39 -0400 Subject: [pypy-dev] Locals clearing in RPython In-Reply-To: References: Message-ID: 2012/9/6 Timothy Baldridge : > Let's imagine that I have some code like the following in RPython: > > > def wrapper_func(arg1, arg2): > return inner_func(arg2) > > def inner_func(x): > for y in range(x): > # do something here > pass > return -1 > > bigint = 1000000 > > wrapper_func(list(range(bigint)), bigint) Since that's all evaluated at import time, I don't see what the problem is. -- Regards, Benjamin From tbaldridge at gmail.com Thu Sep 6 20:34:39 2012 From: tbaldridge at gmail.com (Timothy Baldridge) Date: Thu, 6 Sep 2012 13:34:39 -0500 Subject: [pypy-dev] Locals clearing in RPython In-Reply-To: References: Message-ID: On Thu, Sep 6, 2012 at 1:30 PM, Benjamin Peterson wrote: > 2012/9/6 Timothy Baldridge : > > Let's imagine that I have some code like the following in RPython: > > > > > > def wrapper_func(arg1, arg2): > > return inner_func(arg2) > > > > def inner_func(x): > > for y in range(x): > > # do something here > > pass > > return -1 > > > > bigint = 1000000 > > > > wrapper_func(list(range(bigint)), bigint) > > Since that's all evaluated at import time, I don't see what the problem is. > > > Nice, but that completely missed the point of my question. I know this wouldn't be a problem in this exact case. The question is: when is the GC free to free data passed into a function's arguments. Will that function hold on to all data passed in through arguments until the execution of the function terminates? If so is there a way to trigger garbage collection of unneeded argument data before the end of the function's execution? Timothy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arigo at tunes.org Thu Sep 6 21:25:26 2012 From: arigo at tunes.org (Armin Rigo) Date: Thu, 6 Sep 2012 21:25:26 +0200 Subject: [pypy-dev] Locals clearing in RPython In-Reply-To: References: Message-ID: Hi Timothy, On Thu, Sep 6, 2012 at 8:34 PM, Timothy Baldridge wrote: > The question is: when is the GC > free to free data passed into a function's arguments. Will that function > hold on to all data passed in through arguments until the execution of the > function terminates? If so is there a way to trigger garbage collection of > unneeded argument data before the end of the function's execution? RPython is not precisely defined for this question to have a definite answer. However, currently, with all our own GCs, keep-alive works this way: 1. First, an RPython function keeps alive a variable "for as little time as necessary". Even if, as plain Python, the variable would be kept alive until the function returns just because it's stored in a local, it is not the case in RPython. 2. Across a call, *only* the variables that need to remain alive *after* the call is done are pushed on the shadow stack. In particular, tail calls are really just goto's from the point of view of the GC (even if they are not implemented as tail calls from the point of view of the translation to C). Using these general rules to figure out what occurs in your case: > def wrapper_func(arg1, arg2): > return inner_func(arg2) > > def inner_func(x): > for y in range(x): > # do something here > pass > return -1 > > wrapper_func(list(range(bigint)), bigint) The list is passed as 'arg1' but not passed any further. It is saved neither across the call to wrapper_func() nor across the call to inner_func(). So it is dead. If inner_func() triggers a GC cycle, it will be collected. If you need to prevent that (e.g. because 'arg1' is an object with a __del__ and you don't want this __del__ to be called too early) then you need to insert a call to pypy.rlib.objectmodel.keepalive_until_here(). 
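Armin's last paragraph can be turned into a small sketch. The helper genuinely lives at pypy.rlib.objectmodel.keepalive_until_here in the source layout of this era; only during an RPython translation does it pin the variable, so the fallback stub below exists purely to make the sketch runnable as plain Python:

```python
try:
    from pypy.rlib.objectmodel import keepalive_until_here
except ImportError:
    def keepalive_until_here(*objs):
        """Plain-Python stand-in: does nothing at run time."""

def inner_func(x):
    for _ in range(x):
        pass  # do something here
    return -1

def wrapper_func(arg1, arg2):
    result = inner_func(arg2)
    # Without this, RPython's GC may free 'arg1' as soon as inner_func()
    # starts, since 'arg1' is never used afterwards; the call marks it as
    # live up to this point (e.g. to delay a __del__).
    keepalive_until_here(arg1)
    return result

assert wrapper_func(list(range(10)), 10) == -1
```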
A bientôt, Armin. From benjamin at python.org Thu Sep 6 21:26:58 2012 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 6 Sep 2012 15:26:58 -0400 Subject: [pypy-dev] Locals clearing in RPython In-Reply-To: References: Message-ID: 2012/9/6 Timothy Baldridge : > > Nice, but that completely missed the point of my question. I know this > wouldn't be a problem in this exact case. The question is: when is the GC > free to free data passed into a function's arguments. Will that function > hold on to all data passed in through arguments until the execution of the > function terminates? If so is there a way to trigger garbage collection of > unneeded argument data before the end of the function's execution? It depends on the GC. The moving GCs will not keep arguments as stack roots unless they are needed. -- Regards, Benjamin From gelonida at gmail.com Sun Sep 9 03:27:52 2012 From: gelonida at gmail.com (Gelonida N) Date: Sun, 09 Sep 2012 03:27:52 +0200 Subject: [pypy-dev] Windows 7 64-bit Problems installing pip or virutalenv Message-ID: Hi, Just doing some experiments with PyPy I'd like to install pip / virtualenv. With pypy and Linux no problem With CPython (32 bit) and windows 7 64-bit no problem With pypy and Windows 7 64-bit no success. Any hints ???? I even tried it with disabling the antivirus. In fact so far I couldn't download and install any package successfully. Perhaps somebody could recommend a very simple package as a starting point. Experimenting without virtualenv is a serious pain. Installing pypy and easy_install (setuptools) seems to work. Installing virtualenv / pip always fails. I'm running out of ideas. So any help is welcome. Below the details. 
Step 1: Download and unpack pypy ------------------------------------- I open a cmd window. I downloaded pypy1.9.0 and extracted it to a directory (%PYPY_PATH%) Step 2: Installing easy_install (setuptools) ----------------------------------------------- I download http://python-distribute.org/distribute_setup.py and install it with cd %PYPY_PATH% pypy.exe distribute_setup.py This seems to work basically. However a really strange message is: Don't have permissions to write C:\test\pypy-1.9\site-packages\setuptools-0.6c11-py2.7.egg-info, skipping Not sure where this is coming from. I assume that another thread / process didn't close the file properly. Just for fun I tried this also with a privileged window. Same problem: Step 3: Try to install virtualenv ------------------------------------- I tried the following steps in a normal cmd window and with a cmd window with admin privileges. When starting virtualenv from an admin cmd window, I get the following output: C:\test>pypy-1.9.2\bin\easy_install.exe virtualenv Searching for virtualenv Reading http://pypi.python.org/simple/virtualenv/ Reading http://www.virtualenv.org Reading http://virtualenv.openplans.org Best match: virtualenv 1.8.2 Downloading http://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.8.2.tar.gz#md5=174ca075c6b1a42c685415692ec4ce2e Processing virtualenv-1.8.2.tar.gz Writing c:\users\XXXXX\appdata\local\temp\easy_install-ubcvxv\virtualenv-1.8.2\setup.cfg Running virtualenv-1.8.2\setup.py -q bdist_egg --dist-dir c:\users\XXXXX\appdata\local\temp\easy_install-ubcvxv\virtualenv-1.8.2\egg-dist-tmp-cd7mao warning: no previously-included files matching '*' found under directory 'docs\_templates' warning: no previously-included files matching '*' found under directory 'docs\_build' No eggs found in c:\users\XXXX\appdata\local\temp\easy_install-ubcvxv\virtualenv-1.8.2\egg-dist-tmp-cd7mao (setup script problem?) 
C:\test> Step 4) Trying to Install pip ------------------------------- Installing pip fails as well. C:\test>pypy-1.9.2\bin\easy_install.exe pip Searching for pip Reading http://pypi.python.org/simple/pip/ Reading http://www.pip-installer.org Reading http://pip.openplans.org Best match: pip 1.2.1 Downloading http://pypi.python.org/packages/source/p/pip/pip-1.2.1.tar.gz#md5=db8a6d8a4564d3dc7f337ebed67b1a85 Processing pip-1.2.1.tar.gz Writing c:\users\XXXX\appdata\local\temp\easy_install-rxf5c5\pip-1.2.1\setup.cfg Running pip-1.2.1\setup.py -q bdist_egg --dist-dir c:\users\XXXX\appdata\local\temp\easy_install-rxf5c5\pip-1.2.1\egg-dist-tmp-an2nbn warning: no files found matching '*.html' under directory 'docs' warning: no previously-included files matching '*.txt' found under directory 'docs\_build' no previously-included directories found matching 'docs\_build\_sources' No eggs found in c:\users\XXXX\appdata\local\temp\easy_install-rxf5c5\pip-1.2.1\egg-dist-tmp-an2nbn (setup script problem?) error: c:\users\XXXX\appdata\local\temp\easy_install-rxf5c5\pip-1.2.1\docs\index.txt: The process cannot access the file because it is being used by another process. C:\test> If I start easy_install as a normal user, then I first get a popup asking me if I want to allow easy_install to make changes to my computer. Then a new cmd window opens with some messages (downloading, . . . ), then the window disappears, and then I get a popup that Windows is not sure my program installed correctly From berdario at gmail.com Sun Sep 9 09:46:05 2012 From: berdario at gmail.com (Dario Bertini) Date: Sun, 9 Sep 2012 09:46:05 +0200 Subject: [pypy-dev] Windows 7 64-bit Problems installing pip or virutalenv In-Reply-To: References: Message-ID: This is a known problem in pip itself https://bugs.pypy.org/issue702 (I guess that the same might be happening with virtualenv? 
) From gelonida at gmail.com Sun Sep 9 16:33:23 2012 From: gelonida at gmail.com (Gelonida N) Date: Sun, 09 Sep 2012 16:33:23 +0200 Subject: [pypy-dev] Windows 7 64-bit Problems installing virutalenv (pip is now working) In-Reply-To: References: Message-ID: On 09/09/2012 09:46 AM, Dario Bertini wrote: > This is a known problem in pip itself > > https://bugs.pypy.org/issue702 > > (I guess that the same might be happening with virtualenv? ) > Thanks for your answer Dario. I'm now able to install pip. I tested pip and could install pygments and SOAPpy. I also seem to be able to install virtualenv with pip, though it reports some warnings: > C:\test>\tools\pypy-1.9\bin\pip.exe install virtualenv > Downloading/unpacking virtualenv > Downloading virtualenv-1.8.2.tar.gz (2.2MB): 2.2MB downloaded > Running setup.py egg_info for package virtualenv > > warning: no previously-included files matching '*' found under directory 'docs\_templates' > warning: no previously-included files matching '*' found under directory 'docs\_build' > Installing collected packages: virtualenv > Running setup.py install for virtualenv > > warning: no previously-included files matching '*' found under directory 'docs\_templates' > warning: no previously-included files matching '*' found under directory 'docs\_build' > Installing virtualenv-script.py script to C:\tools\pypy-1.9\bin > Installing virtualenv.exe script to C:\tools\pypy-1.9\bin > Installing virtualenv-2.7-script.py script to C:\tools\pypy-1.9\bin > Installing virtualenv-2.7.exe script to C:\tools\pypy-1.9\bin > Successfully installed virtualenv > Cleaning up... > > C:\test> When trying to create a virtualenv with virtualenv.exe targetdir I get a new error this time. The target directory is created and some files already exist. However scripts like 'activate.bat' are still missing. 
I think the important error line is > DistributionNotFound: setuptools>=0.6c11 with 'pip freeze' I see: distribute==0.6.28 an alphabetic comparison of the versions would indicate that my distribute version is too old :-( I can also paste the full error log, but I think this is the line that counts. From max.lavrenov at gmail.com Tue Sep 11 11:41:30 2012 From: max.lavrenov at gmail.com (Max Lavrenov) Date: Tue, 11 Sep 2012 13:41:30 +0400 Subject: [pypy-dev] build error with --shared flag Message-ID: Hello When I tried to build master pypy with the --shared flag I got this error: [translation:ERROR] /usr/bin/ld: /usr/lib/libffi.a(ffi64.o): relocation R_X86_64_32S against `.rodata' can not be used when making a shared object; recompile with -fPIC [translation:ERROR] /usr/lib/libffi.a: could not read symbols: Bad value [translation:ERROR] collect2: error: ld returned 1 exit status [translation:ERROR] make: *** [libpypy-c.so] Error 1 [translation:ERROR] """) What does that mean? Should I rebuild my libffi package with -fPIC? Right now I've temporarily fixed this problem by patching the find_libffi_a() function so it always uses dynamic linking. Will it cause "endless troubles for installing" (according to the comment in clibffi.py)? Thanks. Best regards, Max -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Tue Sep 11 14:16:02 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 11 Sep 2012 14:16:02 +0200 Subject: [pypy-dev] build error with --shared flag In-Reply-To: References: Message-ID: Yes, libffi is not built with -fPIC, for reasons unknown to me. This is a known Debian bug. 
You can instead enable dynamic linking with libffi, which will make your binary less movable between various linux distros (but maybe you don't care) On Tue, Sep 11, 2012 at 11:41 AM, Max Lavrenov wrote: > Hello > > When i tried build master pypy with --shared flag i got error: > > [translation:ERROR] /usr/bin/ld: /usr/lib/libffi.a(ffi64.o): relocation > R_X86_64_32S against `.rodata' can not be used when making a shared object; > recompile with -fPIC > [translation:ERROR] /usr/lib/libffi.a: could not read symbols: Bad value > [translation:ERROR] collect2: error: ld returned 1 exit status > [translation:ERROR] make: *** [libpypy-c.so] Error 1 > [translation:ERROR] """) > > What that means? Should i rebuild my libffi package with -fPIC? Right now > i've temporary fixed this probleb with patch def find_libffi_a(): function > so it always use dynamic linking. > Will it cause "endless troubles for installing" ( according comment from > clibffi.py ) ? > > Thanks. > > Best regards, > Max > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Wed Sep 12 22:18:33 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 12 Sep 2012 22:18:33 +0200 Subject: [pypy-dev] rlib/runicode, python 3 and pypy Message-ID: Hi This is mostly a mail to Antonio, but I'm interested in everyone's opinion. Currently rlib/runicode.py differs on the py3k branch from the default. IMO this is a very very bad idea. This is a push towards more convoluted interpreter/translator interaction rather than less. Also it makes RPython a less defined language rather than more. 
I see the following options: * decide utf-8 encoding in RPython is a bad idea altogether * move codec somewhere else or patch unicodeobject in python 3 and decide to keep the default version on py3k * declare it a temporary hack, but then it confuses people and I'm generally skeptical about the amount of temporary hacks on the py3k branch. The one thing I *don't* want to do is to declare "meh, too bad, RPython is a different language on py3k and on default" as long as the py3k branch stays a part of the PyPy project. Cheers, fijal From haael at interia.pl Sun Sep 16 20:58:46 2012 From: haael at interia.pl (haael) Date: Sun, 16 Sep 2012 20:58:46 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT Message-ID: <50562166.9040701@interia.pl> OK, I read almost all the documentation I found on the web page. But I still don't understand a few things. There are 3 layers in the whole picture: the user application written in Python, the Python interpreter written in RPython, and the RPython interpreter itself. 1. Where are the flow graphs generated from? Are they a representation of the user application, or of the interpreter? 2. Where does the JIT fit here? I read that it traces the execution of the interpreter and indirectly the user application. Does it operate on the flow graphs or something else? 3. Which component actually does the JIT? Is it just a tweak on the code generator or are the flow graphs generated differently? 4. Is there some documentation on how to write a backend (code generator)? The source code is poorly documented and the topic is not mentioned on the web page. What exactly do I need to implement to have a backend?
haael From benjamin at python.org Sun Sep 16 21:41:10 2012 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 16 Sep 2012 15:41:10 -0400 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: <50562166.9040701@interia.pl> References: <50562166.9040701@interia.pl> Message-ID: 2012/9/16 haael : > > OK, I read almost all the documentation I found on the web page. But I still > don't understand few things. > > There are 3 layers in the whole picture. The user application written in > Python, the Python interpreter written in RPython and the RPython > interpreter itself. > > 1. Where do flow graphs are generated from? Is it the representation of the > user application, or the interpreter? The interpreter. > > 2. Where does the JIT fit here? I read it traces the execution of the > iterpreter and indirectly the user application. Does it operate on the flow > graphs or something else? It operates on a serialized version of the flowgraphs. > > 3. Which component actually does the JIT? Is it just a tweak on the code > generator or are the flow graphs generated differently? The flow graphs are taken from the translator and modified by the JIT generator. > > 4. Is there some documentation how to write a backend (code generator)? The > source code is poorly documented and the topic is not mentioned on the web > page. What exactly do I need to implement to have a backend? You mean a JIT backend or a RPython backend? You might find this useful: http://www.aosabook.org/en/pypy.html -- Regards, Benjamin From haael at interia.pl Tue Sep 18 09:35:07 2012 From: haael at interia.pl (haael) Date: Tue, 18 Sep 2012 09:35:07 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> Message-ID: <5058242B.801@interia.pl> >>>> 3. Which component actually does the JIT? Is it just a tweak on the code >>>> generator or are the flow graphs generated differently? 
>>> >>> >>> The flow graphs are taken from the translator and modified by the JIT >>> generator. >> >> >> My question is: >> >> Does JIT involve another "transformation" of the flow graphs? In normal >> (non-JIT) code generation some flow graphs are fed to the backend generator. >> Wich step is different in the JIT case? Does the backend generator get >> different flow graphs or are the same flow graphs compiled differently by a >> tweaked code generator? > > They get the same flowgraphs. So, if I understand well, there is no common JIT code among different backends? The JIT we have is the C-backend specific? Different backends would need a new JIT approach? >>>> 4. Is there some documentation how to write a backend (code generator)? >>>> The >>>> source code is poorly documented and the topic is not mentioned on the >>>> web >>>> page. What exactly do I need to implement to have a backend? >>> >>> >>> You mean a JIT backend or a RPython backend? >> >> >> >> A RPython backend first. Is there any documentation, tutorial, simple toy >> backend or anything I could start with? > > No. In fact, the only RPython backend that is well-maintained is the C one. OK, so where could I start from? Is there for example some list of flow graphs opcodes? >>> You might find this useful: http://www.aosabook.org/en/pypy.html >>> >> >> OK, that was useful. It seems that the JIT generator is some assembler >> embedded into the final binary. Does JIT generator share some code with the >> backend generator? > > No. > >> >> Would it be possible to get rid of the normal code generator (leaving only >> some glue code) and relaying only on the JIT generator, that would produce >> the whole code? > > No. The JIT generator is specialized for dynamic languages not ones > like RPython, which can be translated to C. > >> >> This would reduce the size of the binary and would not hit performance much, >> since loops would be generated as usual, only the non-looping execution >> would be different. 
> > Why would it reduce the size of the binary? That is my poor understanding, I might be wrong. In the current approach in a binary there is a compiled machine code, the flow graph representation and the JIT compiler. I think we could get rid of (most) compiled machine code, leaving only some startup code to spawn the JIT compiler. Then, each code path would be compiled by JIT and executed. Loops would run as fast as usual. Non-loop code would run slower, but I think this would be a minor slowdown. Most importantly, as I understand, the binary contains many versions of the same code paths specialized for different types. If we throw it out, the binary would be smaller. This is not a proposal. It is just a try at understanding things. haael From fijall at gmail.com Tue Sep 18 11:11:55 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 18 Sep 2012 11:11:55 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: <5058242B.801@interia.pl> References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: On Tue, Sep 18, 2012 at 9:35 AM, haael wrote: > >>>>> 3. Which component actually does the JIT? Is it just a tweak on the >>>>> code >>>>> generator or are the flow graphs generated differently? >>>> >>>> >>>> >>>> The flow graphs are taken from the translator and modified by the JIT >>>> generator. >>> >>> >>> >>> My question is: >>> >>> Does JIT involve another "transformation" of the flow graphs? In normal >>> (non-JIT) code generation some flow graphs are fed to the backend >>> generator. >>> Wich step is different in the JIT case? Does the backend generator get >>> different flow graphs or are the same flow graphs compiled differently by >>> a >>> tweaked code generator? >> >> >> They get the same flowgraphs. > > > > So, if I understand well, there is no common JIT code among different > backends? The JIT we have is the C-backend specific? Different backends > would need a new JIT approach? 
Most of the JIT code is not C-backend specific. Backends are along the line of x86, arm, PPC. If you want to create a say LLVM backend, you would reuse most of the JIT code. Regarding your other questions - what sort of backend you have in mind? Because depending on it, it might be easier or harder to write one and answers to all your other questions might be different. Cheers, fijal From haael at interia.pl Tue Sep 18 14:41:12 2012 From: haael at interia.pl (haael) Date: Tue, 18 Sep 2012 14:41:12 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: <50586BE8.4060008@interia.pl> >>>>>> 3. Which component actually does the JIT? Is it just a tweak on the >>>>>> code >>>>>> generator or are the flow graphs generated differently? >>>>> >>>>> >>>>> >>>>> The flow graphs are taken from the translator and modified by the JIT >>>>> generator. >>>> >>>> >>>> >>>> My question is: >>>> >>>> Does JIT involve another "transformation" of the flow graphs? In normal >>>> (non-JIT) code generation some flow graphs are fed to the backend >>>> generator. >>>> Wich step is different in the JIT case? Does the backend generator get >>>> different flow graphs or are the same flow graphs compiled differently by >>>> a >>>> tweaked code generator? >>> >>> >>> They get the same flowgraphs. >> >> >> >> So, if I understand well, there is no common JIT code among different >> backends? The JIT we have is the C-backend specific? Different backends >> would need a new JIT approach? > > Most of the JIT code is not C-backend specific. Backends are along the > line of x86, arm, PPC. If you want to create a say LLVM backend, you > would reuse most of the JIT code. So I don't understand anything again. Where exactly JIT is coded? What is the difference between the build process of a JIT and non-JIT binary? It's not in the flow graphs. It is in the backend. 
How can C backend and, say, CLI backend share code? > Regarding your other questions - what sort of backend you have in > mind? Because depending on it, it might be easier or harder to write > one and answers to all your other questions might be different. Nothing in particular. I just want to gain some knowledge and start hacking PyPy. I used to write compilers and some embedded programming, so I thought that writing a new backend may be the easiest for me. Said again, I just want to start. > Cheers, > fijal > haael From benjamin at python.org Tue Sep 18 16:07:51 2012 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 18 Sep 2012 10:07:51 -0400 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: <50586BE8.4060008@interia.pl> References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> <50586BE8.4060008@interia.pl> Message-ID: 2012/9/18 haael : >> Most of the JIT code is not C-backend specific. Backends are along the >> line of x86, arm, PPC. If you want to create a say LLVM backend, you >> would reuse most of the JIT code. > > > So I don't understand anything again. Where exactly JIT is coded? What is > the difference between the build process of a JIT and non-JIT binary? It's > not in the flow graphs. It is in the backend. How can C backend and, say, > CLI backend share code? Maciej is referring to JIT backends, not the translator backend. -- Regards, Benjamin From benjamin at python.org Tue Sep 18 16:09:27 2012 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 18 Sep 2012 10:09:27 -0400 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: <5058242B.801@interia.pl> References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: 2012/9/18 haael : > OK, so where could I start from? Is there for example some list of flow > graphs opcodes? You can use the graphviewer described in the documentation. 
> In the current approach in a binary there is a compiled machine code, the > flow graph representation and the JIT compiler. I think we could get rid of > (most) compiled machine code, leaving only some startup code to spawn the > JIT compiler. Then, each code path would be compiled by JIT and executed. > Loops would run as fast as usual. Non-loop code would run slower, but I > think this would be a minor slowdown. Most importantly, as I understand, the > binary contains many versions of the same code paths specialized for > different types. If we throw it out, the binary would be smaller. > > This is not a proposal. It is just a try at understanding things. It might be technically possible, but it's definitely not within the design goals of the JIT. > > haael > -- Regards, Benjamin From arigo at tunes.org Tue Sep 18 21:55:43 2012 From: arigo at tunes.org (Armin Rigo) Date: Tue, 18 Sep 2012 21:55:43 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: <50586BE8.4060008@interia.pl> References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> <50586BE8.4060008@interia.pl> Message-ID: Hi Haael, Here is again a high-level overview. Although we use the term "backend" for both, there are two completely unrelated components: the JIT backends and the translation backends. The translation backends are part of the static translation of a PyPy (with or without the JIT) to C code. The translation backends turn control flow graphs into, say, C source code representing them. These control flow graphs are roughly at the same level as Java VM opcodes, except that depending on the backend, they may either contain GC operations (e.g. when translating to Java or CLI) or not any more (e.g. when translating to C). We have control flow graphs for each RPython function in the source code of PyPy, describing an interpreter for Python. 
Now the JIT is an optional part of that, which is written as more RPython code --- and gets statically translated into more control flow graphs, but describing only the JIT itself, not any JITted code. JITted code (in the form of machine code) is produced at runtime, obviously, but using different techniques. It is the job of the JIT backends to produce this machine code in memory. This is unrelated to the translation backends: a JIT backend inputs something that is not a control flow graph (but a linear "trace" of operations), works at runtime (so is itself written in RPython), and outputs machine code in memory (rather than writing C sources into a file). The input for the JIT backend comes from a front-end component: the tracing JIT "metacompiler". It works by following what the interpreter would do for some specific input (i.e. the precise Python code we see at runtime). This means that the JIT front-end starts with the control flow graphs of the interpreter and produces a linear trace out of it, which is fed to the JIT backend. The control flow graphs in question must be available at runtime, so we need to serialize them. The precise format in which the flow graphs are serialized is called "JitCodes". Although very similar to the flow graphs, everything that is unnecessary for the JIT was removed, most importantly the details of the type information --- e.g. all sizes and signedness of integer variables are represented as one "int" type, because the JIT wouldn't have use for more; and similarly any GC pointer to any object is represented as just one "GC pointer" type. I hope this helps :-) A bientôt, Armin.
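The front-end/back-end split Armin describes can be caricatured in a few lines. This is a purely illustrative toy (nothing like PyPy's real tracer or jitcode format): an "interpreter" with one branch records the linear sequence of operations it actually performs for a concrete input, emitting a guard where the branch was decided --- and that flat list is the only thing a JIT backend ever sees.

```python
def traced_abs(x, trace):
    # Record a guard for the branch decision, then only the operations
    # on the path actually taken: the recorded trace stays linear.
    if x >= 0:
        trace.append(("guard_true", "x >= 0"))
        trace.append(("return", "x"))
        return x
    trace.append(("guard_false", "x >= 0"))
    trace.append(("return", "-x"))
    return -x

trace = []
traced_abs(7, trace)
print(trace)   # [('guard_true', 'x >= 0'), ('return', 'x')]
```

Running the same function on a negative input produces a different linear trace with the opposite guard; the backend never has to represent the branch itself, only the guard that would invalidate the trace.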
From cfbolz at gmx.de Tue Sep 18 22:00:18 2012 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Tue, 18 Sep 2012 22:00:18 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: <5058242B.801@interia.pl> References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: Hi Haael, Cool that you want to work on PyPy! haael wrote: >> >> Why would it reduce the size of the binary? > > >That is my poor understanding, I might be wrong. > >In the current approach in a binary there is a compiled machine code, >the flow >graph representation and the JIT compiler. I think we could get rid of >(most) >compiled machine code, leaving only some startup code to spawn the JIT >compiler. Then, each code path would be compiled by JIT and executed. >Loops >would run as fast as usual. Non-loop code would run slower, but I think >this >would be a minor slowdown. Most importantly, as I understand, the >binary >contains many versions of the same code paths specialized for different >types. >If we throw it out, the binary would be smaller. > >This is not a proposal. It is just a try at understanding things. > Will just reply to this part, typing on the phone is annoying. What you write above is actually a good proposal. We have discussed the viability of related schemes in the past. There are two problems that I see with it. 1. While the speed of your proposed system would eventually be the same, it would suffer from much slower warmup, because after startup you would have to generate a lot of machine code before executing the user's code. 2. More fundamentally (and this is where I think you have missed a detail about the JIT so far) the JIT is trace-based. The JIT backends cannot deal with arbitrary control flow, only with linear traces. Therefore it would not be straightforward to use the same JIT backends to bootstrap parts of the interpreter at runtime.
As for tasks you could work on: would you maybe be interested in helping with the ARM JIT backend? Cheers, Carl Friedrich From fijall at gmail.com Wed Sep 19 10:50:18 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 19 Sep 2012 10:50:18 +0200 Subject: [pypy-dev] PyPy at FOSDEM Message-ID: Hi Just a quick question - is anyone from the team planning on attending FOSDEM in Brussels on 2-3rd of Feb 2013? Cheers, fijal From arigo at tunes.org Wed Sep 19 11:38:21 2012 From: arigo at tunes.org (Armin Rigo) Date: Wed, 19 Sep 2012 11:38:21 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: Hi Carl Friedrich, On Tue, Sep 18, 2012 at 10:00 PM, Carl Friedrich Bolz wrote: > 2. More fundamentally (and this is where I think you have missed a detail about the JIT so far) the JIT is trace-based. The JIT backends cannot deal with arbitrary control flow, only with linear traces. You missed an intermediate solution: have the JIT's blackhole interpreter run the jitcodes before warm-up. We don't have to actually JIT-compile everything before being able to run it, which would indeed completely kill warm-up times. This would give a (slow but not unreasonably slow) solution: a very general "RPython interpreter and JIT-compiler" that would input and run some set of serialized jitcodes --- similar to a Java VM, actually. (There are tons of minor issues ahead, like all the stranger operations that don't have a jitcode equivalent so far, e.g. working on "long double" or "long long long" or weakrefs...) Note that in order to make the "RPython interpreter and JIT-compiler" itself, we would need to translate regular RPython code --- which means it doesn't help at all if the goal is to port RPython to non-C translation targets. It's merely a cool hack, and maybe a debugging help to trade fast translation time for a slower result. A bientôt, Armin.
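Armin's "run the jitcodes in the blackhole interpreter before warm-up" idea can be sketched as a toy (entirely hypothetical code, with no relation to the real jitcode format): the same serialized operation list is interpreted directly while cold, and only "compiled" --- here faked with closures standing in for machine code --- once it crosses a hotness threshold, so warm-up cost is paid lazily.

```python
JITCODE = [("add", 2), ("mul", 3)]   # pretend-serialized operations
THRESHOLD = 2                        # runs before we bother compiling

def blackhole_run(code, x):
    # Interpret each operation directly: slow, but zero warm-up cost.
    for op, arg in code:
        x = x + arg if op == "add" else x * arg
    return x

def compile_code(code):
    # Stand-in for emitting machine code: build one closure per op, once.
    ops = [(lambda x, a=arg: x + a) if op == "add" else (lambda x, a=arg: x * a)
           for op, arg in code]
    def compiled(x):
        for f in ops:
            x = f(x)
        return x
    return compiled

runs, compiled = 0, None

def run(x):
    global runs, compiled
    runs += 1
    if compiled is None and runs > THRESHOLD:
        compiled = compile_code(JITCODE)      # warm: compile once
    return compiled(x) if compiled else blackhole_run(JITCODE, x)

print([run(i) for i in range(4)])   # [6, 9, 12, 15] either way
```

The observable results are identical on the cold and warm paths; only where the time is spent changes, which is exactly the warm-up trade-off being discussed.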
From fijall at gmail.com Wed Sep 19 12:30:51 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 19 Sep 2012 12:30:51 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: On Wed, Sep 19, 2012 at 11:38 AM, Armin Rigo wrote: > Hi Carl Friedrich, > > On Tue, Sep 18, 2012 at 10:00 PM, Carl Friedrich Bolz wrote: >> 2. More fundamentally (and this is where I think you have missed a detail about the JIT so far) the JIT is trace-based. The JIT backends cannot deal with arbitrary control flow, only with linear traces. > > You missed an intermediate solution: have the JIT's blackhole > interpreter run the jitcodes before warm-up. We don't have to > actually JIT-compile everything before being able to run it, which > would indeed completely kill warm-up times. This would give a (slow > but not unreasonably slow) solution: a very general "RPython > interpreter and JIT-compiler" that would input and run some set of > serialized jitcodes --- similar to a Java VM, actually. (There are > tons of minor issues ahead, like all the stranger operations that > don't have a jitcode equivalent so far, e.g. working on "long double" > or "long long long" or weakrefs...) > > Note that in order to make the "RPython interpreter and JIT-compiler" > itself, we would need to translate regular RPython code --- which > means it doesn't help at all if the goal is to port RPython to non-C > translation targets. It's merely a cool hack, and maybe a debugging > help to trade fast translation time for a slower result. > > > A bientôt, > > Armin. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev I guess this is what pypyjit.py does, more or less.
You still need the blackhole interpreter to run in something From arigo at tunes.org Wed Sep 19 14:08:43 2012 From: arigo at tunes.org (Armin Rigo) Date: Wed, 19 Sep 2012 14:08:43 +0200 Subject: [pypy-dev] Flow graphs, backends and JIT In-Reply-To: References: <50562166.9040701@interia.pl> <5056BD28.1030809@interia.pl> <5058242B.801@interia.pl> Message-ID: Hi Fijal, On Wed, Sep 19, 2012 at 12:30 PM, Maciej Fijalkowski wrote: > I guess this is what pypyjit.py does, more or less. You still need the > blackhole interpreter to run in something Right, indeed, pypyjit.py fulfills already the "debugging helper" role. That leaves only the "cool hack" role that I can think of right now... :-) A bientôt, Armin. From arigo at tunes.org Wed Sep 19 18:25:47 2012 From: arigo at tunes.org (Armin Rigo) Date: Wed, 19 Sep 2012 18:25:47 +0200 Subject: [pypy-dev] MinGW32 support PyPy with mscr90.dll In-Reply-To: <509BC4043C474AD595CBCE3792F0BF1F@vSHliutaotao> References: <509BC4043C474AD595CBCE3792F0BF1F@vSHliutaotao> Message-ID: Hi Bookaa, On Tue, Jun 5, 2012 at 3:46 AM, bookaa wrote: > ... > I suggest this instruction should be add to PyPy doc: MinGW32 support Sorry for the delay. As you noticed, generally our interest in mingw32 is close to zero, with Visual Studio taking up all of our (already very tiny) interest about Windows. If you want the changes you described to be included in the pypy documentation at http://doc.pypy.org/en/latest/windows.html#using-the-mingw-compiler , you need to update the source yourself (file pypy/doc/windows.rst) and send us a patch. We'd be happy to take it. A bientôt, Armin. From fijall at gmail.com Wed Sep 26 11:43:00 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 26 Sep 2012 11:43:00 +0200 Subject: [pypy-dev] source code documentation in PyPy Message-ID: Hi I would like to suggest we add a requirement to document PyPy source code slightly better.
Step one would be to have few-sentences "where are you now" info at the top of each file. How about we try to stick to a policy where each time anyone does a major work on a file, he adds documentation to the top or checks if what's there is already correct? Opinions? Cheers, fijal From amauryfa at gmail.com Wed Sep 26 11:52:38 2012 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 26 Sep 2012 11:52:38 +0200 Subject: [pypy-dev] source code documentation in PyPy In-Reply-To: References: Message-ID: 2012/9/26 Maciej Fijalkowski : > I would like to suggest we add a requirement to document PyPy source > code slightly better. Step one would be to have few-sentences "where > are you now" info at the top of each file. How about we try to stick > to a policy where each time anyone does a major work on a file, he > adds documentation to the top or checks if what's there is already > correct? +100 -- Amaury Forgeot d'Arc From cdleary at acm.org Sat Sep 29 01:36:45 2012 From: cdleary at acm.org (Chris Leary) Date: Fri, 28 Sep 2012 16:36:45 -0700 Subject: [pypy-dev] MalGen as a benchmark? Message-ID: Found a red-hot, branchy-looking Python kernel in the wild and naturally I thought of you trace compiler folks! ;-) Hope that it might be useful: I think it could make a nice addition to the speed center, seeing as how it's a CPU bound workload on all the machines I have access to (though I haven't profiled it at all so it could potentially be leaning heavily on paths in some unoptimized builtins). MalGen is a set of scripts which generate large, distributed data sets suitable for testing and benchmarking software designed to perform parallel processing on large data sets. The data sets can be thought of as site-entity log files. After an initial seeding, the scripts allow for the data generation to be initiated from a single central node to run the generation concurrently on multiple remote nodes of the cluster. 
-- http://code.google.com/p/malgen/ Specifically, http://code.google.com/p/malgen/source/browse/trunk/bin/cloud/malgen/malgen.py which gets run thusly: :: pypy malgen.py -O /tmp/ -o INITIAL.txt 0 50000000 10000000 21 (Where 5e7 is the "initial block size" and 1e7 is the other-than-inital block size.) This generates the initial seeding they were talking about, followed by a run for each of N blocks on each node (in this hypothetical setup, for 5 blocks on each of four nodes the following is run): :: pypy malgen.py -O /tmp [start_value] The metadata is read out of the INITIAL.txt file and used to determine the size of the block, and the parameter [start_value] is used to bump to the appropriate start id count for the current block. Inner loop: http://code.google.com/p/malgen/source/browse/trunk/bin/cloud/malgen/malgen.py#90 Thoughts? - Leary From alex.gaynor at gmail.com Sat Sep 29 01:39:05 2012 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Fri, 28 Sep 2012 16:39:05 -0700 Subject: [pypy-dev] MalGen as a benchmark? In-Reply-To: References: Message-ID: On Fri, Sep 28, 2012 at 4:36 PM, Chris Leary wrote: > Found a red-hot, branchy-looking Python kernel in the wild and > naturally I thought of you trace compiler folks! ;-) Hope that it > might be useful: I think it could make a nice addition to the speed > center, seeing as how it's a CPU bound workload on all the machines I > have access to (though I haven't profiled it at all so it could > potentially be leaning heavily on paths in some unoptimized builtins). > > MalGen is a set of scripts which generate large, distributed data > sets suitable for testing and benchmarking software designed to > perform parallel processing on large data sets. The data sets can be > thought of as site-entity log files. After an initial seeding, the > scripts allow for the data generation to be initiated from a single > central node to run the generation concurrently on multiple remote > nodes of the cluster. 
> > -- http://code.google.com/p/malgen/ > > Specifically, > http://code.google.com/p/malgen/source/browse/trunk/bin/cloud/malgen/malgen.py > which gets run thusly: > > :: > > pypy malgen.py -O /tmp/ -o INITIAL.txt 0 50000000 10000000 21 > > (Where 5e7 is the "initial block size" and 1e7 is the > other-than-inital block size.) This generates the initial seeding they > were talking about, followed by a run for each of N blocks on each > node (in this hypothetical setup, for 5 blocks on each of four nodes > the following is run): > > :: > > pypy malgen.py -O /tmp [start_value] > > The metadata is read out of the INITIAL.txt file and used to determine > the size of the block, and the parameter [start_value] is used to bump > to the appropriate start id count for the current block. > > Inner loop: > http://code.google.com/p/malgen/source/browse/trunk/bin/cloud/malgen/malgen.py#90 > > Thoughts? > > - Leary > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Looks like it could be a good addition, have you run benchmarks on it yourself? (Also, should we be directing any new benchmarks to the python-speed mailing list?) Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdleary at acm.org Sat Sep 29 03:22:33 2012 From: cdleary at acm.org (Chris Leary) Date: Fri, 28 Sep 2012 18:22:33 -0700 Subject: [pypy-dev] MalGen as a benchmark? In-Reply-To: References: Message-ID: On Fri, Sep 28, 2012 at 4:39 PM, Alex Gaynor wrote: > Looks like it could be a good addition, have you run benchmarks on it > yourself? (Also, should we be directing any new benchmarks to the > python-speed mailing list?) 
It's the setup procedure for the MalStone map-reduce benchmark, but often ends up taking four times as long as the benchmark itself for large datasets! Should I cross post to python-speed? The site says to post here: http://speed.pypy.org/about/ -- any additional information you think I should include to cross post there? Thanks. - Leary From stefan_ml at behnel.de Sat Sep 29 13:19:55 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 29 Sep 2012 13:19:55 +0200 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: References: Message-ID: Armin Rigo, 31.08.2012 10:15: > On Fri, Aug 31, 2012 at 9:27 AM, Maciej Fijalkowski wrote: >> This was an effect of a system update. Binary compatibility on linux is hard :/ > > I hacked *yet again another time* to link openssl statically in the > binary, like it should have been before. The same applies to libffi.so.6 now. Stefan From Ronny.Pfannschmidt at gmx.de Sat Sep 29 13:36:11 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sat, 29 Sep 2012 13:36:11 +0200 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: References: Message-ID: <5066DD2B.1010008@gmx.de> Given that binary compatibility on linux is practically a broken mess thats just burning developer time, we might want standardize on a distro that nightly/releases work on and defer the distro compat to the people at fault - the distributions -- Ronny On 09/29/2012 01:19 PM, Stefan Behnel wrote: > Armin Rigo, 31.08.2012 10:15: >> On Fri, Aug 31, 2012 at 9:27 AM, Maciej Fijalkowski wrote: >>> This was an effect of a system update. Binary compatibility on linux is hard :/ >> >> I hacked *yet again another time* to link openssl statically in the >> binary, like it should have been before. > > The same applies to libffi.so.6 now. 
> > Stefan > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From stefan_ml at behnel.de Sat Sep 29 14:23:18 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 29 Sep 2012 14:23:18 +0200 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: References: Message-ID: Stefan Behnel, 29.09.2012 13:19: > Armin Rigo, 31.08.2012 10:15: >> On Fri, Aug 31, 2012 at 9:27 AM, Maciej Fijalkowski wrote: >>> This was an effect of a system update. Binary compatibility on linux is hard :/ >> >> I hacked *yet again another time* to link openssl statically in the >> binary, like it should have been before. > > The same applies to libffi.so.6 now. Oh, and after getting around that problem, the next on the list is "libtinfo.so.5", which, it seems, is part of ncurses? This is starting to get tedious ... Stefan From santagada at gmail.com Sat Sep 29 15:55:07 2012 From: santagada at gmail.com (Leonardo Santagada) Date: Sat, 29 Sep 2012 10:55:07 -0300 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: <5066DD2B.1010008@gmx.de> References: <5066DD2B.1010008@gmx.de> Message-ID: or statically link everything in... at least people will be able to run pypy on their machines. On Sat, Sep 29, 2012 at 8:36 AM, Ronny Pfannschmidt wrote: > Given that binary compatibility on linux is practically > a broken mess thats just burning developer time, > we might want standardize on a distro that nightly/releases work on > and defer the distro compat to the people at fault - the distributions > > -- Ronny > > > > On 09/29/2012 01:19 PM, Stefan Behnel wrote: >> >> Armin Rigo, 31.08.2012 10:15: >>> >>> On Fri, Aug 31, 2012 at 9:27 AM, Maciej Fijalkowski wrote: >>>> >>>> This was an effect of a system update. 
Binary compatibility on linux is >>>> hard :/ >>> >>> >>> I hacked *yet again another time* to link openssl statically in the >>> binary, like it should have been before. >> >> >> The same applies to libffi.so.6 now. >> >> Stefan >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev -- Leonardo Santagada From fijall at gmail.com Sat Sep 29 16:00:53 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 29 Sep 2012 16:00:53 +0200 Subject: [pypy-dev] MalGen as a benchmark? In-Reply-To: References: Message-ID: On Sat, Sep 29, 2012 at 3:22 AM, Chris Leary wrote: > On Fri, Sep 28, 2012 at 4:39 PM, Alex Gaynor wrote: >> Looks like it could be a good addition, have you run benchmarks on it >> yourself? (Also, should we be directing any new benchmarks to the >> python-speed mailing list?) > > It's the setup procedure for the MalStone map-reduce benchmark, but > often ends up taking four times as long as the benchmark itself for > large datasets! Should I cross post to python-speed? The site says to > post here: http://speed.pypy.org/about/ -- any additional information > you think I should include to cross post there? Thanks. > > - Leary I think if we don't include them, speed.python won't include them for sure. I'll try to deal with it some time today. From Ronny.Pfannschmidt at gmx.de Sat Sep 29 16:16:58 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sat, 29 Sep 2012 16:16:58 +0200 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: References: <5066DD2B.1010008@gmx.de> Message-ID: <506702DA.50506@gmx.de> i vaguely remember that its impossible to statically link glibc, due to libnss On 09/29/2012 03:55 PM, Leonardo Santagada wrote: > or statically link everything in... 
at least people will be able to > run pypy on their machines. > > On Sat, Sep 29, 2012 at 8:36 AM, Ronny Pfannschmidt > wrote: >> Given that binary compatibility on linux is practically >> a broken mess thats just burning developer time, >> we might want standardize on a distro that nightly/releases work on >> and defer the distro compat to the people at fault - the distributions >> >> -- Ronny >> >> >> >> On 09/29/2012 01:19 PM, Stefan Behnel wrote: >>> >>> Armin Rigo, 31.08.2012 10:15: >>>> >>>> On Fri, Aug 31, 2012 at 9:27 AM, Maciej Fijalkowski wrote: >>>>> >>>>> This was an effect of a system update. Binary compatibility on linux is >>>>> hard :/ >>>> >>>> >>>> I hacked *yet again another time* to link openssl statically in the >>>> binary, like it should have been before. >>> >>> >>> The same applies to libffi.so.6 now. >>> >>> Stefan >>> >>> >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > > From fijall at gmail.com Sat Sep 29 17:35:29 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 29 Sep 2012 17:35:29 +0200 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: References: <5066DD2B.1010008@gmx.de> Message-ID: On Sat, Sep 29, 2012 at 3:55 PM, Leonardo Santagada wrote: > or statically link everything in... at least people will be able to > run pypy on their machines. 
you cannot statically link glibc and libffi is compiled without -fPIC on debian/ubuntu to make it harder > > On Sat, Sep 29, 2012 at 8:36 AM, Ronny Pfannschmidt > wrote: >> Given that binary compatibility on linux is practically >> a broken mess thats just burning developer time, >> we might want standardize on a distro that nightly/releases work on >> and defer the distro compat to the people at fault - the distributions >> >> -- Ronny >> >> >> >> On 09/29/2012 01:19 PM, Stefan Behnel wrote: >>> >>> Armin Rigo, 31.08.2012 10:15: >>>> >>>> On Fri, Aug 31, 2012 at 9:27 AM, Maciej Fijalkowski wrote: >>>>> >>>>> This was an effect of a system update. Binary compatibility on linux is >>>>> hard :/ >>>> >>>> >>>> I hacked *yet again another time* to link openssl statically in the >>>> binary, like it should have been before. >>> >>> >>> The same applies to libffi.so.6 now. >>> >>> Stefan >>> >>> >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > > > -- > > Leonardo Santagada > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From andreasdamgaardpedersen at gmail.com Sun Sep 30 08:34:25 2012 From: andreasdamgaardpedersen at gmail.com (Andreas Pedersen) Date: Sun, 30 Sep 2012 08:34:25 +0200 Subject: [pypy-dev] Bachelor Thesis - STM / ATM Message-ID: Hello PyPy team. I would like to write my Bachelor Thesis on Python Parallel programming and your recent blog post Multicore Programming in PyPy and CPython sounds very intriguing. I would like to assist you with implementing STM / AME in PyPy and I can provide you with the equivalent of about 8 weeks of full time work over the next 3 months for this purpose. 
The Guidelines require an overall theme for the thesis, so I was thinking of focusing on a subset of the STM / ATM problem as the full problem seems too big. I don't currently have an overview of the code, so I can't see how far you are and what subproblems you are tackling. Any help on finding a good subject would be appreciated. The problem for me is that I had the misconception my supervisor was more up-to-date on parallel programming in Python than he was. So now I'm under a very tight schedule to get my synopsis written as quickly as possible. Kind Regards - Andreas -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Sun Sep 30 09:40:46 2012 From: arigo at tunes.org (Armin Rigo) Date: Sun, 30 Sep 2012 09:40:46 +0200 Subject: [pypy-dev] Bachelor Thesis - STM / ATM In-Reply-To: References: Message-ID: Hi Andreas, On Sun, Sep 30, 2012 at 8:34 AM, Andreas Pedersen wrote: > I don't currently have an > overview of the code, so I can't see how far you are and what subproblems > you are tackling. Any help on finding a good subject would be appreciated. Someone else is planning his Master Thesis on this topic. I have to warn you that a Bachelor might be too thin at this stage: these are really all research-like topics. I wrote down three (open-ended) topics here: https://bitbucket.org/pypy/pypy/src/default/pypy/doc/project-ideas.rst#stm-software-transactional-memory > The problem for me is that I had the misconception my supervisor was more > up-to-date on parallel programming in Python than he was. So now I'm under a > very tight schedule to get my synopsis written as quickly as possible. :-/ You are welcome in any case. Please join our irc channel if you want faster discussions (channel #pypy, on irc.freenode.net / http://webchat.freenode.net/). A bientôt, Armin. 
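The breakage pattern in the nightly-builds thread — a binary built on one distro failing to find libssl, libffi.so.6 or libtinfo.so.5 on another — can be probed from Python itself. A minimal sketch: the soname list is copied from the failure reports above, and which of them resolve is entirely system-dependent.

```python
import ctypes

def soname_available(name):
    """Return True if the dynamic loader can resolve this soname."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False

# Sonames taken from the failure reports in this thread.
for name in ["libssl.so.1.0.0", "libffi.so.6", "libtinfo.so.5"]:
    print(name, "ok" if soname_available(name) else "missing")
```

Every "missing" line corresponds to exactly the kind of startup failure reported in this thread when a nightly build meets a distribution it was not built on.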
From arigo at tunes.org Sun Sep 30 09:50:33 2012 From: arigo at tunes.org (Armin Rigo) Date: Sun, 30 Sep 2012 09:50:33 +0200 Subject: [pypy-dev] nightly builds require OpenSSL 1.0 ? In-Reply-To: References: <5066DD2B.1010008@gmx.de> Message-ID: Hi, On Sat, Sep 29, 2012 at 5:35 PM, Maciej Fijalkowski wrote: >> or statically link everything in... at least people will be able to >> run pypy on their machines. > > you cannot statically link glibc and libffi is compiled without -fPIC > on debian/ubuntu to make it harder To summarize: it's an infinite amount of mess that we are running away from. If someone, anyone, feels like helping --- and is ready to put in the necessary amount of work, including never-ending future work --- then in this case I'd be happy to leave him the job of correctly configuring "tannit", the machine we use. Otherwise, people with different distributions will have to wait for the next release to be packaged for their distribution. Or else find themselves a machine with sufficient RAM and 1-2 free hours, which is not that hard any more IMHO. It is documented how to get a "squeezed" translation (for 30% more time) in 1.6GB of RAM (32-bit) or 3.0GB of RAM (64-bit) at http://pypy.org/download.html#building-from-source . A bientôt, Armin. From russel at winder.org.uk Sun Sep 30 11:10:19 2012 From: russel at winder.org.uk (Russel Winder) Date: Sun, 30 Sep 2012 10:10:19 +0100 Subject: [pypy-dev] PyPy STM Message-ID: <1348996219.9072.20.camel@lionors.winder.org.uk> Armin, Sarah and I are restarting our work on CSP, and extending to creating actors and a dataflow library. It would be good to make this work on Jython, IronPython and PyPy as well as CPython. However we want to get away from a reliance on multiprocessing since it is rather heavyweight for the sort of parallelism we are after. 
STM as an infrastructure layer in PyPy and CPython would get us away from the GIL and allow for using Python threads bound to kernel threads to allow a single Python process to allow us to create lightweight processes (no application level shared memory). Is the STM variant of PyPy and/or CPython in any form of usable state? If so then we can investigate building from source for ourselves so as to use it as a foundation for building the higher abstraction parallelism for applications programmers. Thanks. -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder at ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel at winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From arigo at tunes.org Sun Sep 30 11:41:31 2012 From: arigo at tunes.org (Armin Rigo) Date: Sun, 30 Sep 2012 11:41:31 +0200 Subject: [pypy-dev] PyPy STM In-Reply-To: <1348996219.9072.20.camel@lionors.winder.org.uk> References: <1348996219.9072.20.camel@lionors.winder.org.uk> Message-ID: Hi Russel, On Sun, Sep 30, 2012 at 11:10 AM, Russel Winder wrote: > However we want to get > away from a reliance on multiprocessing since it is rather heavyweight > for the sort of parallelism we are after. STM as an infrastructure layer > in PyPy and CPython would get us away from the GIL and allow for using > Python threads bound to kernel threads to allow a single Python process > to allow us to create lightweight processes (no application level shared > memory). I'm not really sure I follow you exactly. You want to have 'multiprocessing' using OS threads and no shared memory, rather than using processes? That looks like it will have the same amount of overhead to me. 
But anyway, if that's what you want, then I don't understand where STM comes into the picture. STM is a trade-off solution: you get *shared* memory concurrency, possibly with easier models than threads to program with, against some high but fixed overhead. If your starting point is no shared memory, then STM makes little sense. If you just want several independent Python interpreters in the same OS process, then it is probably possible with some small amount of hacking in either CPython or PyPy. A bientôt, Armin. From fijall at gmail.com Sun Sep 30 15:08:02 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 30 Sep 2012 15:08:02 +0200 Subject: [pypy-dev] MalGen as a benchmark? In-Reply-To: References: Message-ID: On Sat, Sep 29, 2012 at 3:22 AM, Chris Leary wrote: > On Fri, Sep 28, 2012 at 4:39 PM, Alex Gaynor wrote: >> Looks like it could be a good addition, have you run benchmarks on it >> yourself? (Also, should we be directing any new benchmarks to the >> python-speed mailing list?) > > It's the setup procedure for the MalStone map-reduce benchmark, but > often ends up taking four times as long as the benchmark itself for > large datasets! Should I cross post to python-speed? The site says to > post here: http://speed.pypy.org/about/ -- any additional information > you think I should include to cross post there? Thanks. > > - Leary at current svn version it plain doesn't work without seed data From Ronny.Pfannschmidt at gmx.de Sun Sep 30 15:50:11 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sun, 30 Sep 2012 15:50:11 +0200 Subject: [pypy-dev] PyPy STM In-Reply-To: <1348996219.9072.20.camel@lionors.winder.org.uk> References: <1348996219.9072.20.camel@lionors.winder.org.uk> Message-ID: <50684E13.9070104@gmx.de> Hi, the following is a collection of unfinished thoughts. 
after my thesis i'll be experimenting with a relaxed csp-ish model based on python native generator based continuations as well as the new continulet-jit-3 based greenlets. my basic assumption is that having limited amount of shared memory is acceptable. the csp-ish "processes" are to be modeled as generators with yield expressions or green-lets and will assume a strong data locality. all communication will be started by suspending the execution and "switching away" with some payload that kind of starting/stopping seems to lend itself well to stm transactions basically one iteration of a generator will be one transaction and internal communication will also be separate transactions my current hypothesis is that such a model will lend itself to easy parallel execution since iteration steps of different continuations will be mostly completely independent and can just run in parallel. communication itself will be in transactions with higher conflict potential, but i assume that armin will find ways to evade the conflict issues with in-process communication channels, so i'm not going to think about it more till it turns out to be an actual problem in experimentation My main focus with those experiments will be concurrent systems. -- Ronny On 09/30/2012 11:10 AM, Russel Winder wrote: > Armin, > > Sarah and I are restarting our work on CSP, and extending to creating > actors and a dataflow library. It would be good to make this work on > Jython, IronPython and PyPy as well as CPython. However we want to get > away from a reliance on multiprocessing since it is rather heavyweight > for the sort of parallelism we are after. STM as an infrastructure layer > in PyPy and CPython would get us away from the GIL and allow for using > Python threads bound to kernel threads to allow a single Python process > to allow us to create lighweight processes (no application level shared > memory). > > Is the STM variant of PyPy and/or CPython in any form of usable state? 
> If so then we can investigate building from source for ourselves so as > to use it as a foundation for building the higher abstraction > parallelism for applications programmers. > > Thanks. > > > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From arigo at tunes.org Sun Sep 30 16:22:00 2012 From: arigo at tunes.org (Armin Rigo) Date: Sun, 30 Sep 2012 16:22:00 +0200 Subject: [pypy-dev] PyPy STM In-Reply-To: <50684E13.9070104@gmx.de> References: <1348996219.9072.20.camel@lionors.winder.org.uk> <50684E13.9070104@gmx.de> Message-ID: Hi Ronny, On Sun, Sep 30, 2012 at 3:50 PM, Ronny Pfannschmidt wrote: > after my thesis i'll be experimenting with a relaxed csp-ish model > based on python native generator based continuations as well as > the new continulet-jit-3 based greenlets. > > my basic assumption is that having limited amount > of shared memory is acceptable. What you are thinking about is to start from the naturally multicore model of separate address spaces, and add some amount of shared memory. You would use STM to handle the result. It is the opposite of what I'm thinking about, which is to start with a non-multithread, non-tasklet-based program and add multicore capability to it. I would be using STM to "create" multicore capability, whereas you would be using it to "create" shared memory. I am more interested in the first approach than the second because I think it is closer to what untrained programmers start with, but both approaches are potentially valid. Russel: STM is a powerful tool that makes sense of shared memory in multicore situations. I fail to understand why you are looking at it in the absence of shared memory... A bientôt, Armin. 
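The generator-based model discussed in this exchange — one iteration of a generator as one would-be transaction — can be illustrated with a few lines of plain Python. This is a toy round-robin scheduler where each `next()` call stands in for one atomic step; it is purely sequential and uses none of the actual STM machinery, but shows why the steps of different "processes" are natural units to interleave (and, under STM, to run in parallel).

```python
from collections import deque

def worker(name, n, results):
    """A CSP-ish 'process': each yield marks a transaction boundary."""
    total = 0
    for i in range(n):
        total += i
        yield  # one iteration = one would-be transaction
    results[name] = total

def run_round_robin(gens):
    """Interleave generator steps; under STM each step could run in parallel."""
    queue = deque(gens)
    while queue:
        gen = queue.popleft()
        try:
            next(gen)
            queue.append(gen)  # not finished; reschedule
        except StopIteration:
            pass

results = {}
run_round_robin([worker("a", 3, results), worker("b", 5, results)])
print(results)  # {'a': 3, 'b': 10}
```

Because generators cannot be nested as continuations, each step is a flat, self-contained unit — which is exactly the property Ronny gives for preferring them over greenlets in the first experiments.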
From Ronny.Pfannschmidt at gmx.de Sun Sep 30 16:43:04 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sun, 30 Sep 2012 16:43:04 +0200 Subject: [pypy-dev] PyPy STM In-Reply-To: References: <1348996219.9072.20.camel@lionors.winder.org.uk> <50684E13.9070104@gmx.de> Message-ID: <50685A78.50908@gmx.de> On 09/30/2012 04:22 PM, Armin Rigo wrote: > Hi Ronny, > > On Sun, Sep 30, 2012 at 3:50 PM, Ronny Pfannschmidt > wrote: >> after my thesis i'll be experimenting with a relaxed csp-ish model >> based on python native generator based continuations as well as >> the new continulet-jit-3 based greenlets. >> >> my basic assumption is that having limited amount >> of shared memory is acceptable. > > What you are thinking about is to start from the naturally multicore > model of separate address spaces, and add some amount of shared > memory. You would use STM to handle the result. It is the opposite > of what I'm thinking about, which is to start with a non-multithread, > non-tasklet-based program and add multicore capability to it. I would > be using STM to "create" multicore capability, whereas you would be > using it to "create" shared memory. I am more interested in the first > approach than the second because I think it is closer to what > untrained programmers start with, but both approaches are potentially > valid. > i think both approaches are valid, they lend themselves to help and reason about different kinds of problems that make different kinds of serial programs concurrent and parallel. i think for most purposes simple sequential communicating programs are way more easy to reason about than anything else. the transaction module approach already seems to require to chunk up programs in semi-small transactions, that may cause other transactions to be scheduled which seems more and more like twisted's defereds. to my eyes twisted style code is a kind of spaghetti that is very hard to reason about. 
which is why i want to experiment executing multiple longer sequential programs in chunks that may be interleaved and/or parallel the reason why i start with generators instead of green-lets is simply cause they cannot ever be nested. this will allow more simple reasoning. > Russel: STM is a powerful tool that makes sense of shared memory in > multicore situations. I fail to understand why you are looking at it > in the absence of shared memory... Im under the impression the intent is to have non-shared application state while the interpreter states are still shared (i might be using the wrong words here) > > > A bient?t, > > Armin. -- Ronny From cdleary at acm.org Sun Sep 30 19:56:40 2012 From: cdleary at acm.org (Chris Leary) Date: Sun, 30 Sep 2012 10:56:40 -0700 Subject: [pypy-dev] MalGen as a benchmark? In-Reply-To: References: Message-ID: On Sun, Sep 30, 2012 at 6:08 AM, Maciej Fijalkowski wrote: > at current svn version it plain doesn't work without seed data Yeah, the first step is that it has to generate seed data. Since it's the same loop I'd hope that's good enough. Python2.7 seems to run ~2 seconds faster on a ~minute length run. $ perf stat pypy malgen.py -O /tmp/ -o INITIAL.txt 0 5000000 1000000 21 0 Entity 0 Events Generated 500000 Events Generated [..snip..] 
Performance counter stats for 'pypy malgen.py -O /tmp/ -o INITIAL.txt 0 5000000 1000000 21': 65447.670235 task-clock # 0.931 CPUs utilized 6,091 context-switches # 0.000 M/sec 88 CPU-migrations # 0.000 M/sec 39,751 page-faults # 0.001 M/sec 187,721,401,361 cycles # 2.868 GHz [83.34%] 125,966,916,332 stalled-cycles-frontend # 67.10% frontend cycles idle [83.33%] 89,836,165,138 stalled-cycles-backend # 47.86% backend cycles idle [66.63%] 122,596,433,926 instructions # 0.65 insns per cycle # 1.03 stalled cycles per insn [83.30%] 27,158,701,261 branches # 414.968 M/sec [83.35%] 1,309,172,455 branch-misses # 4.82% of all branches [83.35%] 70.276668518 seconds time elapsed $ perf stat python2.7 malgen.py -O /tmp/ -o INITIAL.txt 0 5000000 1000000 21 0 Entity 0 Events Generated 500000 Events Generated [..snip..] Performance counter stats for 'python2.7 malgen.py -O /tmp/ -o INITIAL.txt 0 5000000 1000000 21': 67696.460942 task-clock # 0.991 CPUs utilized 6,192 context-switches # 0.000 M/sec 87 CPU-migrations # 0.000 M/sec 4,168 page-faults # 0.000 M/sec 194,918,427,158 cycles # 2.879 GHz [83.34%] 95,351,613,483 stalled-cycles-frontend # 48.92% frontend cycles idle [83.32%] 53,693,951,677 stalled-cycles-backend # 27.55% backend cycles idle [66.68%] 209,613,931,049 instructions # 1.08 insns per cycle # 0.45 stalled cycles per insn [83.35%] 44,855,636,904 branches # 662.599 M/sec [83.32%] 1,687,165,902 branch-misses # 3.76% of all branches [83.34%] 68.335222479 seconds time elapsed - Leary From fijall at gmail.com Sun Sep 30 23:38:24 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 30 Sep 2012 23:38:24 +0200 Subject: [pypy-dev] MalGen as a benchmark? In-Reply-To: References: Message-ID: On Sun, Sep 30, 2012 at 7:56 PM, Chris Leary wrote: > On Sun, Sep 30, 2012 at 6:08 AM, Maciej Fijalkowski wrote: >> at current svn version it plain doesn't work without seed data > > Yeah, the first step is that it has to generate seed data. 
Since it's > the same loop I'd hope that's good enough. how do you do that? > > Python2.7 seems to run ~2 seconds faster on a ~minute length run. > > $ perf stat pypy malgen.py -O /tmp/ -o INITIAL.txt 0 5000000 1000000 21 > 0 Entity > 0 Events Generated > 500000 Events Generated > [..snip..] > > Performance counter stats for 'pypy malgen.py -O /tmp/ -o INITIAL.txt > 0 5000000 1000000 21': > > 65447.670235 task-clock # 0.931 CPUs utilized > 6,091 context-switches # 0.000 M/sec > 88 CPU-migrations # 0.000 M/sec > 39,751 page-faults # 0.001 M/sec > 187,721,401,361 cycles # 2.868 GHz > [83.34%] > 125,966,916,332 stalled-cycles-frontend # 67.10% frontend > cycles idle [83.33%] > 89,836,165,138 stalled-cycles-backend # 47.86% backend > cycles idle [66.63%] > 122,596,433,926 instructions # 0.65 insns per cycle > # 1.03 stalled cycles > per insn [83.30%] > 27,158,701,261 branches # 414.968 M/sec > [83.35%] > 1,309,172,455 branch-misses # 4.82% of all > branches [83.35%] > > 70.276668518 seconds time elapsed > > $ perf stat python2.7 malgen.py -O /tmp/ -o INITIAL.txt 0 5000000 1000000 21 > 0 Entity > 0 Events Generated > 500000 Events Generated > [..snip..] > > Performance counter stats for 'python2.7 malgen.py -O /tmp/ -o > INITIAL.txt 0 5000000 1000000 21': > > 67696.460942 task-clock # 0.991 CPUs utilized > 6,192 context-switches # 0.000 M/sec > 87 CPU-migrations # 0.000 M/sec > 4,168 page-faults # 0.000 M/sec > 194,918,427,158 cycles # 2.879 GHz > [83.34%] > 95,351,613,483 stalled-cycles-frontend # 48.92% frontend > cycles idle [83.32%] > 53,693,951,677 stalled-cycles-backend # 27.55% backend > cycles idle [66.68%] > 209,613,931,049 instructions # 1.08 insns per cycle > # 0.45 stalled cycles > per insn [83.35%] > 44,855,636,904 branches # 662.599 M/sec > [83.32%] > 1,687,165,902 branch-misses # 3.76% of all > branches [83.34%] > > 68.335222479 seconds time elapsed > > - Leary