From planrichi at gmail.com  Wed Mar  1 07:52:06 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 1 Mar 2017 13:52:06 +0100
Subject: [pypy-dev] PyPy as a Sub-org
Message-ID: 

Hi,

as we discussed during this year's PyPy sprint, PyPy again wants to
participate as a Sub-org in this year's Google Summer of Code. We are
already on the wiki ideas page, but I don't think we have formally
applied yet, so I am doing that by writing this email.

Regards,
Richard

From turnbull.stephen.fw at u.tsukuba.ac.jp  Thu Mar  2 00:03:38 2017
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 2 Mar 2017 14:03:38 +0900
Subject: [pypy-dev] [GSoC2017] PyPy as a Sub-org
In-Reply-To: 
References: 
Message-ID: <22711.42922.606041.882426@turnbull.sk.tsukuba.ac.jp>

Richard Plangger writes:
 > Hi,
 >
 > as we discussed during this year's PyPy sprint, PyPy again wants to
 > participate as a Sub-org in this year's Google Summer of Code. We are
 > already on the wiki ideas page, but I don't think we have formally
 > applied yet, so I am doing that by writing this email.

Your section is visible on the page at http://python-gsoc.org/.  That
was your formal application as a sub-org.  Is there any other problem?

Details:

To actually participate in GSoC under the PSF umbrella, you need to
(1) register at least two mentors X desired slots at
    https://goo.gl/forms/UJb0rHOVQjLna2o53
    Some overlap will be allowed, but the mentors who are working with
    more than one student should be a small minority.
(2) subscribe them to the mailing list
    https://mail.python.org/mailman/listinfo/gsoc-mentors
(3) designate one sub-org admin and one alternate in case the main
    admin is out of contact for more than a day or two
(4) state that you intend to comply with the Python Code of Conduct

See also http://python-gsoc.org/#mentors for more information and
further requirements (that can be satisfied as you go) about mentors
and sub-orgs.

Steve

From yashwardhan.singh at intel.com  Thu Mar  2 20:31:05 2017
From: yashwardhan.singh at intel.com (Singh, Yashwardhan)
Date: Fri, 3 Mar 2017 01:31:05 +0000
Subject: [pypy-dev] Numpy on PyPy : cpyext
Message-ID: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com>

Hi Everyone,

I am using numpy on pypy to train a deep neural network. For my
workload, numpy on pypy is taking twice the time to train as numpy on
CPython. I am using numpy via cpyext.

I read in the documentation: "Performance-wise, the speed is mostly the
same as CPython's NumPy (it is the same code); the exception is that
interactions between the Python side and NumPy objects are mediated
through the slower cpyext layer (which hurts a few benchmarks that do a
lot of element-by-element array accesses, for example)." Is there any
way in which I can profile my application to see how much additional
overhead the cpyext layer is adding, or whether it is numpy on pypy
itself that is slowing things down? I have tried vmprof, but I couldn't
figure out from it how much time the cpyext layer is taking.

Any help will be highly appreciated.

Regards
Yash
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
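A minimal sketch of the kind of measurement being asked about here,
assuming the standalone vmprof package; the output file name and
run_training_step() are made-up placeholders, not part of the original
mail:

    import vmprof

    with open('numpy-workload.prof', 'w+b') as f:
        vmprof.enable(f.fileno())
        run_training_step()   # hypothetical stand-in for the training loop
        vmprof.disable()

The resulting profile can then be inspected with the vmprof viewers
(for example vmprofshow or vmprof.com) to see how much time lands in
C-level frames such as the cpyext glue.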
From fijall at gmail.com  Fri Mar  3 07:40:39 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 3 Mar 2017 13:40:39 +0100
Subject: [pypy-dev] Numpy on PyPy : cpyext
In-Reply-To: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com>
References: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com>
Message-ID: 

Hi Yash

Is your software open source? I'm happy to check it out for you.

I think the C-level profiling for vmprof is relatively new; you would
need to use a pypy nightly in order to get that level of insight.

Additionally, we're working on cpyext improvements *right now*, so stay
tuned. If there is a good case for speeding up numpy, we can get it a
lot faster than it is right now and seek some funding for that. Neural
networks might be one of those!

Best regards,
Maciej Fijalkowski

On Fri, Mar 3, 2017 at 2:31 AM, Singh, Yashwardhan wrote:
> Hi Everyone,
>
> I am using numpy on pypy to train a deep neural network. For my
> workload, numpy on pypy is taking twice the time to train as numpy on
> CPython. I am using numpy via cpyext.
>
> I read in the documentation: "Performance-wise, the speed is mostly the
> same as CPython's NumPy (it is the same code); the exception is that
> interactions between the Python side and NumPy objects are mediated
> through the slower cpyext layer (which hurts a few benchmarks that do a
> lot of element-by-element array accesses, for example)." Is there any
> way in which I can profile my application to see how much additional
> overhead the cpyext layer is adding, or whether it is numpy on pypy
> itself that is slowing things down? I have tried vmprof, but I couldn't
> figure out from it how much time the cpyext layer is taking.
>
> Any help will be highly appreciated.
>
> Regards
> Yash
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

From frankw at mit.edu  Fri Mar  3 09:20:09 2017
From: frankw at mit.edu (Frank Wang)
Date: Fri, 3 Mar 2017 09:20:09 -0500
Subject: [pypy-dev] Disassembling methods called by LOOKUP_METHOD
Message-ID: 

Hi,

I'm trying to figure out the opcodes that the "append" function calls
for arrays. When I use the dis tool, it just says that it looks up a
method "append" using the LOOKUP_METHOD opcode. Is there a tool that
allows me to disassemble built-in functions like "append", or what is
the best way to do this?

Thanks,
Frank
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rymg19 at gmail.com  Fri Mar  3 09:58:58 2017
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 3 Mar 2017 08:58:58 -0600
Subject: [pypy-dev] Disassembling methods called by LOOKUP_METHOD
In-Reply-To: 
References: 
Message-ID: 

You can look at the source code for the objects (all located in
pypy/objspace/std) and find the method implementations there. Here's
append's (from pypy/objspace/std/listobject.py):

    def append(self, w_item):
        """L.append(object) -- append object to end"""
        self.strategy.append(self, w_item)

So it's just appending to an RPython list. If you want to see the
source for that, look in rpython/rtyper.
append's is in rpython/rtyper/rlist.py:

    def rtype_method_append(self, hop):
        v_lst, v_value = hop.inputargs(self, self.item_repr)
        hop.exception_cannot_occur()
        hop.gendirectcall(ll_append, v_lst, v_value)

This ends up calling ll_append in the end (I think the other stuff is
for the JIT?), which is defined in the same file:

    def ll_append(l, newitem):
        length = l.ll_length()
        l._ll_resize_ge(length+1)           # see "a note about overflows" above
        l.ll_setitem_fast(length, newitem)

Now, these ll_* functions are defined in the corresponding file inside
rpython/rtyper/lltypesystem; in this case, it's
rpython/rtyper/lltypesystem/rlist.py:

    self.LIST.become(GcStruct("list", ("length", Signed),
                              ("items", Ptr(ITEMARRAY)),
                              adtmeths = ADTIList({
                                  "ll_newlist": ll_newlist,
                                  "ll_newlist_hint": ll_newlist_hint,
                                  "ll_newemptylist": ll_newemptylist,
                                  "ll_length": ll_length,
                                  "ll_items": ll_items,
                                  "ITEM": ITEM,
                                  "ll_getitem_fast": ll_getitem_fast,
                                  "ll_setitem_fast": ll_setitem_fast,
                                  "_ll_resize_ge": _ll_list_resize_ge,
                                  "_ll_resize_le": _ll_list_resize_le,
                                  "_ll_resize": _ll_list_resize,
                                  "_ll_resize_hint": _ll_list_resize_hint,
                              }),
                              hints = {'list': True})
    )

It's signaling to RPython all the different methods on the low-level
list representation. Here, you want ll_setitem_fast and
_ll_list_resize_ge (I also copy-pasted the functions they call):

    @jit.look_inside_iff(lambda l, newsize, overallocate:
                         jit.isconstant(len(l.items)) and
                         jit.isconstant(newsize))
    @signature(types.any(), types.int(), types.bool(), returns=types.none())
    def _ll_list_resize_hint_really(l, newsize, overallocate):
        """
        Ensure l.items has room for at least newsize elements.  Note that
        l.items may change, and even if newsize is less than l.length on
        entry.
        """
        # This over-allocates proportional to the list size, making room
        # for additional growth.  The over-allocation is mild, but is
        # enough to give linear-time amortized behavior over a long
        # sequence of appends() in the presence of a poorly-performing
        # system malloc().
        # The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
        if newsize <= 0:
            ll_assert(newsize == 0, "negative list length")
            l.length = 0
            l.items = _ll_new_empty_item_array(typeOf(l).TO)
            return
        elif overallocate:
            if newsize < 9:
                some = 3
            else:
                some = 6
            some += newsize >> 3
            new_allocated = newsize + some
        else:
            new_allocated = newsize
        # new_allocated is a bit more than newsize, enough to ensure an
        # amortized linear complexity for e.g. repeated usage of l.append().
        # In case it overflows sys.maxint, it is guaranteed negative, and
        # the following malloc() will fail.
        items = l.items
        newitems = malloc(typeOf(l).TO.items.TO, new_allocated)
        before_len = l.length
        if before_len:   # avoids copying GC flags from the prebuilt_empty_array
            if before_len < newsize:
                p = before_len
            else:
                p = newsize
            rgc.ll_arraycopy(items, newitems, 0, 0, p)
        l.items = newitems

    def _ll_list_resize_ge(l, newsize):
        """This is called with 'newsize' larger than the current length of
        the list.  If the list storage doesn't have enough space, then
        really perform a realloc().  In the common case where we already
        overallocated enough, then this is a very fast operation.
""" cond = len(l.items) < newsize if jit.isconstant(len(l.items)) and jit.isconstant(newsize): if cond: _ll_list_resize_hint_really(l, newsize, True) else: jit.conditional_call(cond, _ll_list_resize_hint_really, l, newsize, True) l.length = newsize def ll_items(l): return l.items def ll_setitem_fast(l, index, item): ll_assert(index < l.length, "setitem out of bounds") l.ll_items()[index] = item ll_setitem_fast.oopspec = 'list.setitem(l, index, item)' On Fri, Mar 3, 2017 at 8:20 AM, Frank Wang wrote: > Hi, > > I'm trying to figure out the opcodes that the "append" function calls for > arrays. When I use the dis tool, it just says that it looks up a method > "append" using the LOOKUP_METHOD opcode. Is there a tool that allows me to > disassemble built-in functions like "append", or what the best way to do > this is? > > Thanks, > Frank > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yashwardhan.singh at intel.com Fri Mar 3 19:20:35 2017 From: yashwardhan.singh at intel.com (Singh, Yashwardhan) Date: Sat, 4 Mar 2017 00:20:35 +0000 Subject: [pypy-dev] Numpy on PyPy : cpyext In-Reply-To: References: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com> Message-ID: <0151F66FF725AC42A760DA612754C5F819912918@ORSMSX104.amr.corp.intel.com> Hi Maciej, I have applied for clearance to publicly upload the code. I will upload it once I get the permission. Regards Yash -----Original Message----- From: Maciej Fijalkowski [mailto:fijall at gmail.com] Sent: Friday, March 3, 2017 4:41 AM To: Singh, Yashwardhan Cc: pypy-dev at python.org Subject: Re: [pypy-dev] Numpy on PyPy : cpyext Hi Yash Is your software open source? I'm happy to check it out for you I think the c-level profiling for vmprof is relatively new, you would need to use pypy nightly in order to get that level of insight. Additionally, we're working on cpyext improvements *right now* stay tuned. If there is a good case for speeding up numpy, we can get it a lot faster than it is right now and seek some funding for that. Neural networks might be one of those! Best regards, Maciej Fijalkowski On Fri, Mar 3, 2017 at 2:31 AM, Singh, Yashwardhan wrote: > Hi Everyone, > > I am using numpy on pypy to train a deep neural network. For my > workload numpy on pypy is taking twice the time to train as numpy on > Cpython. I am using Numpy via cpyext. > > I read in the documentation, "Performance-wise, the speed is mostly > the same as CPython's NumPy (it is the same code); the exception is > that interactions between the Python side and NumPy objects are > mediated through the slower cpyext layer (which hurts a few benchmarks > that do a lot of element-by-element array accesses, for example)." Is > there any way in which I can profile my application to see how much > additional overhead cypext layer is adding or is it the numpy via pypy > which is slowing down the things. I have tried vmprof, but I couldn't > figure out from it how much time cpyext layer is taking. > > Any help will be highly appreciated. 
>
> Regards
> Yash
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

From fijall at gmail.com  Sat Mar  4 13:01:52 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 4 Mar 2017 19:01:52 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
Message-ID: 

Hello everyone

I've been experimenting a bit with faster utf8 operations (and
conversion that does not do much). I'm writing down the results so
they don't get forgotten, as well as trying to put them in rpython
comments.

As far as non-SSE algorithms go, for things like splitlines, split
etc. it is important to walk the utf8 string quickly and check
properties of characters.

So far the current finding has been that a lookup table, for example:

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        return pos + ord(runicode._utf8_code_length[chr1 - 0x80])

is significantly slower than the following code (both don't do error
checking):

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if 0xC2 <= chr1 <= 0xDF:
            return pos + 2
        if chr1 >= 0xE0 and chr1 <= 0xEF:
            return pos + 3
        return pos + 4

The exact difference depends on how many multi-byte characters there
are and how big the strings are. It's up to 40%, but as a general
rule, the more ascii characters there are, the less of an impact it
has, as well as the larger the strings are, the more impact
memory/L2/L3 cache has.

PS. SSE will be faster still, but we might not want SSE for just
splitlines

Cheers,
fijal
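To make the walk concrete, here is the branch-based variant above driven
over a small made-up string (an editor's sketch in Python 2, not part of
the original mail):

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if 0xC2 <= chr1 <= 0xDF:
            return pos + 2
        if chr1 >= 0xE0 and chr1 <= 0xEF:
            return pos + 3
        return pos + 4

    s = 'a\xc3\xa9\xe2\x82\xac'   # 'a', U+00E9 (2 bytes), U+20AC (3 bytes)
    pos = 0
    while pos < len(s):
        pos = next_codepoint_pos(s, pos)   # visits 0, then 1, then 3, stops at 6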
From phyo.arkarlwin at gmail.com  Sat Mar  4 13:36:16 2017
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Sat, 04 Mar 2017 18:36:16 +0000
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ?

In comparison to CPython, is this much slower?

On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski wrote:

> Hello everyone
>
> I've been experimenting a bit with faster utf8 operations (and
> conversion that does not do much). I'm writing down the results so
> they don't get forgotten, as well as trying to put them in rpython
> comments.
>
> As far as non-SSE algorithms go, for things like splitlines, split
> etc. it is important to walk the utf8 string quickly and check
> properties of characters.
>
> So far the current finding has been that a lookup table, for example:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>
> is significantly slower than the following code (both don't do error
> checking):
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
> The exact difference depends on how many multi-byte characters there
> are and how big the strings are. It's up to 40%, but as a general
> rule, the more ascii characters there are, the less of an impact it
> has, as well as the larger the strings are, the more impact
> memory/L2/L3 cache has.
>
> PS. SSE will be faster still, but we might not want SSE for just
> splitlines
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fijall at gmail.com  Sat Mar  4 14:17:22 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 4 Mar 2017 20:17:22 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Er... why would it be slower than CPython?

Anyway, the speeds I'm reporting on are based on C/assembler programs
so far.

On Sat, Mar 4, 2017 at 7:36 PM, Phyo Arkar wrote:
> SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ?
>
> In comparison to CPython, is this much slower?
>
> On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski wrote:
>>
>> Hello everyone
>>
>> I've been experimenting a bit with faster utf8 operations (and
>> conversion that does not do much). I'm writing down the results so
>> they don't get forgotten, as well as trying to put them in rpython
>> comments.
>>
>> As far as non-SSE algorithms go, for things like splitlines, split
>> etc. it is important to walk the utf8 string quickly and check
>> properties of characters.
>>
>> So far the current finding has been that a lookup table, for example:
>>
>>     def next_codepoint_pos(code, pos):
>>         chr1 = ord(code[pos])
>>         if chr1 < 0x80:
>>             return pos + 1
>>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>>
>> is significantly slower than the following code (both don't do error
>> checking):
>>
>>     def next_codepoint_pos(code, pos):
>>         chr1 = ord(code[pos])
>>         if chr1 < 0x80:
>>             return pos + 1
>>         if 0xC2 <= chr1 <= 0xDF:
>>             return pos + 2
>>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>>             return pos + 3
>>         return pos + 4
>>
>> The exact difference depends on how many multi-byte characters there
>> are and how big the strings are. It's up to 40%, but as a general
>> rule, the more ascii characters there are, the less of an impact it
>> has, as well as the larger the strings are, the more impact
>> memory/L2/L3 cache has.
>>
>> PS. SSE will be faster still, but we might not want SSE for just
>> splitlines
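For readers wondering what the lookup table being compared against looks
like, here is a sketch of its shape (the real table is
rpython/rlib/runicode.py's _utf8_code_length; the exact entries it uses
for invalid lead bytes may differ from this reconstruction):

    # one entry per possible lead byte 0x80-0xFF; entry i is the sequence
    # length for lead byte 0x80+i, with 0 marking bytes that cannot start
    # a valid sequence
    _utf8_code_length = ''.join(chr(n) for n in
        [0] * 64 +      # 0x80-0xBF: continuation bytes
        [0, 0] +        # 0xC0-0xC1: overlong encodings
        [2] * 30 +      # 0xC2-0xDF: two-byte sequences
        [3] * 16 +      # 0xE0-0xEF: three-byte sequences
        [4] * 5 +       # 0xF0-0xF4: four-byte sequences
        [0] * 11)       # 0xF5-0xFF: invalid lead bytes

The extra memory load per lookup is presumably what the branch-based
version avoids.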
From fijall at gmail.com  Sat Mar  4 14:58:01 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 4 Mar 2017 20:58:01 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi phyo

The mail is about doing operations in C/assembler. I will have more
detailed python-level benchmarks while I progress with my branch.

On 04 Mar 2017 7:36 PM, "Phyo Arkar" wrote:

SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ?

In comparison to CPython, is this much slower?

On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski wrote:

> Hello everyone
>
> I've been experimenting a bit with faster utf8 operations (and
> conversion that does not do much). I'm writing down the results so
> they don't get forgotten, as well as trying to put them in rpython
> comments.
>
> As far as non-SSE algorithms go, for things like splitlines, split
> etc. it is important to walk the utf8 string quickly and check
> properties of characters.
>
> So far the current finding has been that a lookup table, for example:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>
> is significantly slower than the following code (both don't do error
> checking):
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
> The exact difference depends on how many multi-byte characters there
> are and how big the strings are. It's up to 40%, but as a general
> rule, the more ascii characters there are, the less of an impact it
> has, as well as the larger the strings are, the more impact
> memory/L2/L3 cache has.
>
> PS. SSE will be faster still, but we might not want SSE for just
> splitlines
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fijall at gmail.com  Sat Mar  4 18:45:01 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 5 Mar 2017 00:45:01 +0100
Subject: [pypy-dev] revisit web assembly?
Message-ID: 

I just found that:
https://sourceware.org/ml/binutils/2017-03/msg00044.html

It might be cool to see, maybe we can relatively easily compile to the
web platform. Even an interpreter-only version could be quite
interesting.

Cheers,
fijal

From armin.rigo at gmail.com  Sun Mar  5 04:14:07 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Sun, 5 Mar 2017 10:14:07 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi Maciej,

On 4 March 2017 at 19:01, Maciej Fijalkowski wrote:
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4

If you don't want error checking, then you can simplify a bit the
range checks here. Maybe it gives some more gains, but who knows:

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if chr1 <= 0xDF:
            return pos + 2
        if chr1 <= 0xEF:
            return pos + 3
        return pos + 4

A bientôt,

Armin.
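A quick consistency check (an editor's sketch, not part of the original
thread) that the simplified range checks agree with the explicit ones on
every byte that can lead a well-formed sequence:

    def explicit(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if 0xC2 <= chr1 <= 0xDF:
            return pos + 2
        if 0xE0 <= chr1 <= 0xEF:
            return pos + 3
        return pos + 4

    def simplified(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if chr1 <= 0xDF:
            return pos + 2
        if chr1 <= 0xEF:
            return pos + 3
        return pos + 4

    # every lead byte of valid UTF-8: ASCII plus 0xC2-0xF4
    for lead in range(0x00, 0x80) + range(0xC2, 0xF5):
        s = chr(lead) + '\x80' * 3    # pad with continuation bytes
        assert explicit(s, 0) == simplified(s, 0)

The two only diverge on input that is not valid UTF-8 (lead bytes
0x80-0xC1), which is exactly the error checking being skipped.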
From phyo.arkarlwin at gmail.com  Sun Mar  5 12:08:19 2017
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Sun, 05 Mar 2017 17:08:19 +0000
Subject: [pypy-dev] revisit web assembly?
In-Reply-To: 
References: 
Message-ID: 

Very interesting! I can't wait to write JavaScript in Python.

On Sun, Mar 5, 2017 at 6:15 AM Maciej Fijalkowski wrote:

> I just found that:
> https://sourceware.org/ml/binutils/2017-03/msg00044.html
>
> It might be cool to see, maybe we can relatively easily compile to the
> web platform. Even an interpreter-only version could be quite
> interesting.
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yury at shurup.com  Sun Mar  5 12:39:23 2017
From: yury at shurup.com (Yury V. Zaytsev)
Date: Sun, 5 Mar 2017 18:39:23 +0100 (CET)
Subject: [pypy-dev] revisit web assembly?
In-Reply-To: 
References: 
Message-ID: 

On Sun, 5 Mar 2017, Phyo Arkar wrote:

> Very interesting! I can't wait to write JavaScript in Python.

But what for, if you can write it in Haskell? ;-) [*]

[*]: http://elm-lang.org

Seriously, though, the WebAssembly thing looks quite exciting!

> On Sun, Mar 5, 2017 at 6:15 AM Maciej Fijalkowski wrote:
>       I just found that:
>       https://sourceware.org/ml/binutils/2017-03/msg00044.html
>
>       It might be cool to see, maybe we can relatively easily compile
>       to the web platform. Even an interpreter-only version could be
>       quite interesting.
>
>       Cheers,
>       fijal
>       _______________________________________________
>       pypy-dev mailing list
>       pypy-dev at python.org
>       https://mail.python.org/mailman/listinfo/pypy-dev

-- 
Sincerely yours,
Yury V. Zaytsev

From fijall at gmail.com  Sun Mar  5 14:24:24 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 5 Mar 2017 21:24:24 +0200
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

This is checking for spaces in unicode (so it's known to be valid utf8)

On Sun, Mar 5, 2017 at 11:14 AM, Armin Rigo wrote:
> Hi Maciej,
>
> On 4 March 2017 at 19:01, Maciej Fijalkowski wrote:
>>     def next_codepoint_pos(code, pos):
>>         chr1 = ord(code[pos])
>>         if chr1 < 0x80:
>>             return pos + 1
>>         if 0xC2 <= chr1 <= 0xDF:
>>             return pos + 2
>>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>>             return pos + 3
>>         return pos + 4
>
> If you don't want error checking, then you can simplify a bit the
> range checks here. Maybe it gives some more gains, but who knows:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if chr1 <= 0xDF:
>             return pos + 2
>         if chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
>
> A bientôt,
>
> Armin.

From armin.rigo at gmail.com  Mon Mar  6 02:13:19 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Mon, 6 Mar 2017 08:13:19 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi Maciej,

On 5 March 2017 at 20:24, Maciej Fijalkowski wrote:
> This is checking for spaces in unicode (so it's known to be valid utf8)

Ok, then you might have missed another property of UTF-8: when you
check for "being a substring" in UTF-8, you don't need to do any
decoding. Instead you only need to check "being a substring" with the
two encoded UTF-8 strings. This always works as expected, i.e. you
can never get a positive answer by chance. So for example:

    x in y    can be implemented as    x._utf8 in y._utf8

and in this case, you can find spaces in a unicode string just by
searching for the 10 byte patterns that are spaces-encoded-as-UTF-8
(11 if you also count '\n\r' as one such pattern).

That's also how the 're' module could be rewritten to directly handle
UTF-8 strings, instead of decoding it first.

A bientôt,

Armin.
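The no-false-positives property can be checked directly in CPython 2 (a
small editor's demonstration, not from the thread):

    x = u'\xe9'.encode('utf-8')          # '\xc3\xa9'
    y = u'caf\xe9 ole'.encode('utf-8')   # 'caf\xc3\xa9 ole'
    assert (x in y) == (u'\xe9' in u'caf\xe9 ole')

    # U+00C3 encodes as '\xc3\x83'; it shares its first byte with y's
    # U+00E9 but cannot match by accident, since the continuation byte
    # differs -- both the byte-level and the unicode-level test say False
    z = u'\u00c3'.encode('utf-8')
    assert (z in y) == (u'\u00c3' in u'caf\xe9 ole')

This works because lead bytes and continuation bytes occupy disjoint
ranges, so an encoded pattern can only line up on codepoint boundaries.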
From fijall at gmail.com  Mon Mar  6 02:15:44 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 6 Mar 2017 11:15:44 +0400
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Yes sure, I'm aware of that :-) The problem only shows up with "start"
and "end" parameters being used.

On Mon, Mar 6, 2017 at 11:13 AM, Armin Rigo wrote:
> Hi Maciej,
>
> On 5 March 2017 at 20:24, Maciej Fijalkowski wrote:
>> This is checking for spaces in unicode (so it's known to be valid utf8)
>
> Ok, then you might have missed another property of UTF-8: when you
> check for "being a substring" in UTF-8, you don't need to do any
> decoding. Instead you only need to check "being a substring" with the
> two encoded UTF-8 strings. This always works as expected, i.e. you
> can never get a positive answer by chance. So for example:
>
>     x in y    can be implemented as    x._utf8 in y._utf8
>
> and in this case, you can find spaces in a unicode string just by
> searching for the 10 byte patterns that are spaces-encoded-as-UTF-8
> (11 if you also count '\n\r' as one such pattern).
>
> That's also how the 're' module could be rewritten to directly handle
> UTF-8 strings, instead of decoding it first.
>
> A bientôt,
>
> Armin.

From terri at toybox.ca  Mon Mar  6 23:04:40 2017
From: terri at toybox.ca (Terri Oda)
Date: Mon, 6 Mar 2017 20:04:40 -0800
Subject: [pypy-dev] [GSoC2017] PyPy as a Sub-org
In-Reply-To: <22711.42922.606041.882426@turnbull.sk.tsukuba.ac.jp>
References: <22711.42922.606041.882426@turnbull.sk.tsukuba.ac.jp>
Message-ID: <8c6faba5-f579-1a78-ac0d-f85c3d353b34@toybox.ca>

Just to confirm: you're up on the page now, and I see you have 4
mentors signed up, so it looks like PyPy is all set to go. You should
have all been issued invites to Google's submission system, so just
make sure to sign up there so you can see applications when they start
coming in!

 Terri

On 2017-03-01 9:03 PM, Stephen J. Turnbull wrote:
> Richard Plangger writes:
>  > Hi,
>  >
>  > as we discussed during this year's PyPy sprint, PyPy again wants to
>  > participate as a Sub-org in this year's Google Summer of Code. We
>  > are already on the wiki ideas page, but I don't think we have
>  > formally applied yet, so I am doing that by writing this email.
>
> Your section is visible on the page at http://python-gsoc.org/.  That
> was your formal application as a sub-org.  Is there any other problem?
>
> Details:
>
> To actually participate in GSoC under the PSF umbrella, you need to
> (1) register at least two mentors X desired slots at
>     https://goo.gl/forms/UJb0rHOVQjLna2o53
>     Some overlap will be allowed, but the mentors who are working with
>     more than one student should be a small minority.
> (2) subscribe them to the mailing list
>     https://mail.python.org/mailman/listinfo/gsoc-mentors
> (3) designate one sub-org admin and one alternate in case the main
>     admin is out of contact for more than a day or two
> (4) state that you intend to comply with the Python Code of Conduct
>
> See also http://python-gsoc.org/#mentors for more information and
> further requirements (that can be satisfied as you go) about mentors
> and sub-orgs.
>
> Steve
>
>

From planrichi at gmail.com  Wed Mar  8 12:17:24 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 8 Mar 2017 18:17:24 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi,

as we discussed on the sprint I have now experimented with an SSE/AVX
implementation of 'len(utf8 string)' (this includes a check that it is
valid utf8). Since this is related to this mailing list thread I'll
just add it here!

I ran some small measurements on it. Here is some explanation of the
names:

pypy-seq-.*: sequential implementation in C, nothing fancy, just a baseline
pypy-vec-sse4-.*: implementation using sse4 (128 bit registers)
pypy-vec-avx2-.*: implementation using avx2 (256 bit registers)
libunistring-.*: benchmarking the function u8_check in that GNU library,
NO length is calculated
mystrlenutf8-.*: someone's length calculation (no validity check), only
using 64-bit words instead of per-byte iteration (see here [1])
.*-news-de: html of a German website (has quite a lot of 2-byte code
points), ~ 1 MB
.*-news-cn: worldjournarl.com -> Mandarin (html website with lots of
4-byte code points), ~ 700 KB
.*-tipitaka-thai: xml page of some religious text with lots of 3-byte
code points (~4.5 MB; the original 300 KB file copied many times)

Why is u8u16 missing? Well, as far as I can tell there is no function
in u8u16 that returns the length of a utf8 string and checks if it is
valid at the same time, without rewriting it. u8u16 is really just for
transforming utf8 to utf16.

The benchmark runs read the content from a file (e.g. .*-news-de, a
German html news website) and in a loop invoke the
utf-8-get-length-and-check function written in C 10 times, summing up
the time for each run (using clock_t clock(void) in C, man 3 clock).

.....................
pypy-seq-news-de: Median +- std dev: 76.0 us +- 1.4 us
.....................
pypy-sse4-vec-news-de: Median +- std dev: 5.16 us +- 0.14 us
.....................
pypy-avx2-vec-news-de: Median +- std dev: 384 ns +- 11 ns
.....................
libunistring-news-de: Median +- std dev: 33.0 us +- 0.4 us
.....................
mystrlenutf8-news-de: Median +- std dev: 9.25 us +- 0.22 us
.....................
pypy-seq-news-cn: Median +- std dev: 59.8 us +- 1.2 us
.....................
pypy-sse4-vec-news-cn: Median +- std dev: 7.70 us +- 0.12 us
.....................
pypy-avx2-vec-news-cn: Median +- std dev: 23.3 ns +- 0.4 ns
.....................
libunistring-news-cn: Median +- std dev: 30.5 us +- 0.4 us
.....................
mystrlenutf8-news-cn: Median +- std dev: 6.54 us +- 0.20 us
.....................
pypy-seq-tipitaka-thai: Median +- std dev: 939 us +- 39 us
.....................
pypy-sse4-vec-tipitaka-thai: Median +- std dev: 425 us +- 7 us
.....................
pypy-avx2-vec-tipitaka-thai: Median +- std dev: 19.9 ns +- 0.3 ns
.....................
libunistring-tipitaka-thai: Median +- std dev: 615 us +- 28 us
.....................
WARNING: the benchmark seems unstable, the standard deviation is high
(stdev/median: 17%)
Try to rerun the benchmark with more runs, samples and/or loops

mystrlenutf8-tipitaka-thai: Median +- std dev: 45.1 us +- 7.9 us

What do you think?

I think it would even be a good idea to take a look at AVX512 (which
gives you a crazy amount of 512 bits (or 64 bytes) in your vector
register).

The AVX implementation is a bit fishy (compare
pypy-avx2-vec-tipitaka-thai and pypy-avx2-vec-news-cn). I need to
recheck that; it would not make sense to process 10x 4.5 MB in 20 ns
and 10x 700 KB in 23 ns.

As soon as I have ironed out the issue I'll start to think about
indexing...

Cheers,
Richard

[1] http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html

On 03/04/2017 07:01 PM, Maciej Fijalkowski wrote:
> Hello everyone
>
> I've been experimenting a bit with faster utf8 operations (and
> conversion that does not do much). I'm writing down the results so
> they don't get forgotten, as well as trying to put them in rpython
> comments.
>
> As far as non-SSE algorithms go, for things like splitlines, split
> etc. it is important to walk the utf8 string quickly and check
> properties of characters.
>
> So far the current finding has been that a lookup table, for example:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>
> is significantly slower than the following code (both don't do error
> checking):
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
> The exact difference depends on how many multi-byte characters there
> are and how big the strings are. It's up to 40%, but as a general
> rule, the more ascii characters there are, the less of an impact it
> has, as well as the larger the strings are, the more impact
> memory/L2/L3 cache has.
>
> PS. SSE will be faster still, but we might not want SSE for just
> splitlines
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
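A quick back-of-the-envelope check of why those AVX2 numbers look fishy
(editor's arithmetic, using only the figures quoted in the mail above):

    # 10 passes over the ~4.5 MB thai file in ~20 ns would mean:
    bytes_processed = 10 * 4.5e6
    seconds = 20e-9
    print bytes_processed / seconds / 1e12    # ~2250 TB/s

That is orders of magnitude beyond any memory bandwidth, so the timer is
almost certainly not measuring the loop it is supposed to measure.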
From planrichi at gmail.com  Wed Mar  8 13:09:57 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 8 Mar 2017 19:09:57 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Yes ;) At some point. For now I'm still experimenting with the
operations we think we need.

Cheers,
Richard

On Mar 8, 2017 6:50 PM, "David Edelsohn" wrote:

> And POWER VSX and Z VX? ;-)
>
> - David
>
> On Wed, Mar 8, 2017 at 12:17 PM, Richard Plangger wrote:
> > Hi,
> >
> > as we discussed on the sprint I have now experimented with an SSE/AVX
> > implementation of 'len(utf8 string)' (this includes a check that it is
> > valid utf8). Since this is related to this mailing list thread I'll
> > just add it here!
> >
> > I ran some small measurements on it. Here is some explanation of the
> > names:
> >
> > pypy-seq-.*: sequential implementation in C, nothing fancy, just a baseline
> > pypy-vec-sse4-.*: implementation using sse4 (128 bit registers)
> > pypy-vec-avx2-.*: implementation using avx2 (256 bit registers)
> > libunistring-.*: benchmarking the function u8_check in that GNU library,
> > NO length is calculated
> > mystrlenutf8-.*: someone's length calculation (no validity check), only
> > using 64-bit words instead of per-byte iteration (see here [1])
> >
> > .*-news-de: html of a German website (has quite a lot of 2-byte code
> > points), ~ 1 MB
> > .*-news-cn: worldjournarl.com -> Mandarin (html website with lots of
> > 4-byte code points), ~ 700 KB
> > .*-tipitaka-thai: xml page of some religious text with lots of 3-byte
> > code points (~4.5 MB; the original 300 KB file copied many times)
> >
> > Why is u8u16 missing? Well, as far as I can tell there is no function
> > in u8u16 that returns the length of a utf8 string and checks if it is
> > valid at the same time, without rewriting it. u8u16 is really just for
> > transforming utf8 to utf16.
> >
> > The benchmark runs read the content from a file (e.g. .*-news-de, a
> > German html news website) and in a loop invoke the
> > utf-8-get-length-and-check function written in C 10 times, summing up
> > the time for each run (using clock_t clock(void) in C, man 3 clock).
> >
> > .....................
> > pypy-seq-news-de: Median +- std dev: 76.0 us +- 1.4 us
> > .....................
> > pypy-sse4-vec-news-de: Median +- std dev: 5.16 us +- 0.14 us
> > .....................
> > pypy-avx2-vec-news-de: Median +- std dev: 384 ns +- 11 ns
> > .....................
> > libunistring-news-de: Median +- std dev: 33.0 us +- 0.4 us
> > .....................
> > mystrlenutf8-news-de: Median +- std dev: 9.25 us +- 0.22 us
> > .....................
> > pypy-seq-news-cn: Median +- std dev: 59.8 us +- 1.2 us
> > .....................
> > pypy-sse4-vec-news-cn: Median +- std dev: 7.70 us +- 0.12 us
> > .....................
> > pypy-avx2-vec-news-cn: Median +- std dev: 23.3 ns +- 0.4 ns
> > .....................
> > libunistring-news-cn: Median +- std dev: 30.5 us +- 0.4 us
> > .....................
> > mystrlenutf8-news-cn: Median +- std dev: 6.54 us +- 0.20 us
> > .....................
> > pypy-seq-tipitaka-thai: Median +- std dev: 939 us +- 39 us
> > .....................
> > pypy-sse4-vec-tipitaka-thai: Median +- std dev: 425 us +- 7 us
> > .....................
> > pypy-avx2-vec-tipitaka-thai: Median +- std dev: 19.9 ns +- 0.3 ns
> > .....................
> > libunistring-tipitaka-thai: Median +- std dev: 615 us +- 28 us
> > .....................
> > WARNING: the benchmark seems unstable, the standard deviation is high
> > (stdev/median: 17%)
> > Try to rerun the benchmark with more runs, samples and/or loops
> >
> > mystrlenutf8-tipitaka-thai: Median +- std dev: 45.1 us +- 7.9 us
> >
> > What do you think?
> >
> > I think it would even be a good idea to take a look at AVX512 (which
> > gives you a crazy amount of 512 bits (or 64 bytes) in your vector
> > register).
> >
> > The AVX implementation is a bit fishy (compare
> > pypy-avx2-vec-tipitaka-thai and pypy-avx2-vec-news-cn). I need to
> > recheck that; it would not make sense to process 10x 4.5 MB in 20 ns
> > and 10x 700 KB in 23 ns.
> >
> > As soon as I have ironed out the issue I'll start to think about
> > indexing...
> >
> > Cheers,
> > Richard
> >
> > [1] http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html
> >
> > On 03/04/2017 07:01 PM, Maciej Fijalkowski wrote:
> >> Hello everyone
> >>
> >> I've been experimenting a bit with faster utf8 operations (and
> >> conversion that does not do much). I'm writing down the results so
> >> they don't get forgotten, as well as trying to put them in rpython
> >> comments.
> >>
> >> As far as non-SSE algorithms go, for things like splitlines, split
> >> etc. it is important to walk the utf8 string quickly and check
> >> properties of characters.
> >>
> >> So far the current finding has been that a lookup table, for example:
> >>
> >>     def next_codepoint_pos(code, pos):
> >>         chr1 = ord(code[pos])
> >>         if chr1 < 0x80:
> >>             return pos + 1
> >>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
> >>
> >> is significantly slower than the following code (both don't do error
> >> checking):
> >>
> >>     def next_codepoint_pos(code, pos):
> >>         chr1 = ord(code[pos])
> >>         if chr1 < 0x80:
> >>             return pos + 1
> >>         if 0xC2 <= chr1 <= 0xDF:
> >>             return pos + 2
> >>         if chr1 >= 0xE0 and chr1 <= 0xEF:
> >>             return pos + 3
> >>         return pos + 4
> >>
> >> The exact difference depends on how many multi-byte characters there
> >> are and how big the strings are. It's up to 40%, but as a general
> >> rule, the more ascii characters there are, the less of an impact it
> >> has, as well as the larger the strings are, the more impact
> >> memory/L2/L3 cache has.
> >>
> >> PS. SSE will be faster still, but we might not want SSE for just
> >> splitlines
> >>
> >> Cheers,
> >> fijal
> >> _______________________________________________
> >> pypy-dev mailing list
> >> pypy-dev at python.org
> >> https://mail.python.org/mailman/listinfo/pypy-dev
> >>
> > _______________________________________________
> > pypy-dev mailing list
> > pypy-dev at python.org
> > https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
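For reference, here is a plain-Python sketch of the scalar computation
the kernels in this thread implement: count the codepoints and validate
in a single pass (an editor's illustration; overlong and surrogate
checks are omitted for brevity):

    def utf8_check_and_length(s):
        count = 0
        i = 0
        n = len(s)
        while i < n:
            c = ord(s[i])
            if c < 0x80:
                size = 1
            elif 0xC2 <= c <= 0xDF:
                size = 2
            elif 0xE0 <= c <= 0xEF:
                size = 3
            elif 0xF0 <= c <= 0xF4:
                size = 4
            else:
                return -1                  # invalid lead byte
            if i + size > n:
                return -1                  # truncated sequence
            for j in range(i + 1, i + size):
                if not 0x80 <= ord(s[j]) <= 0xBF:
                    return -1              # bad continuation byte
            i += size
            count += 1
        return count

    assert utf8_check_and_length('a\xc3\xa9') == 2
    assert utf8_check_and_length('\xc3') == -1

The SSE/AVX versions above presumably vectorize this classification of
lead and continuation bytes.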
From shubharamani at yahoo.com  Fri Mar 10 10:15:45 2017
From: shubharamani at yahoo.com (Shubha Ramani)
Date: Fri, 10 Mar 2017 15:15:45 +0000 (UTC)
Subject: [pypy-dev] looptoken.number of bridge
References: <2028230117.2808877.1489158945920.ref@mail.yahoo.com>
Message-ID: <2028230117.2808877.1489158945920@mail.yahoo.com>

1) Is it a true statement to say that the looptoken.number stays the
same between an original loop and a bridge?

2) Is it a true statement to say that "loopname", as passed into
assemble_loop, will apply to a bridge if indeed statement 1) above is
true? That is, if a loop and a bridge share the same original
looptoken.number, do they also share the same loopname (as passed in
from get_printable_location)?

Thanks,

Shubha
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From shubharamani at yahoo.com  Fri Mar 10 10:22:33 2017
From: shubharamani at yahoo.com (Shubha Ramani)
Date: Fri, 10 Mar 2017 15:22:33 +0000 (UTC)
Subject: [pypy-dev] jitted region start address and size
References: <1830734433.2882206.1489159353300.ref@mail.yahoo.com>
Message-ID: <1830734433.2882206.1489159353300@mail.yahoo.com>

1) In assemble_loop, is this correct, and if not, what is the correct
answer?
starting address of jitted region: looppos + rawstart
size of jitted region: size_excluding_failure_stuff - looppos

2) In assemble_bridge, is this correct, and if not, what is the correct
answer?
starting address of jitted region: startpos + rawstart
size of jitted region: codeendpos - startpos

Thanks,

Shubha
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From John.Zhang at anu.edu.au  Mon Mar 13 20:17:11 2017
From: John.Zhang at anu.edu.au (John Zhang)
Date: Tue, 14 Mar 2017 00:17:11 +0000
Subject: [pypy-dev] What is RuntimeTypeInfo?
Message-ID: 

Hi all,

Can anyone do me the favour of explaining the story of RuntimeTypeInfo
in RPython? I couldn't quite understand the description in the online
documentation
(http://rpython.readthedocs.io/en/latest/rtyper.html#opaque-types).
From looking at the source code, and inspecting the generated C source,
RTTI doesn't seem to be used at all at runtime. The generated C source
seems to just return an uninitialised function pointer on the stack
when compiling the following code:

    # the snippet assumes these imports
    from rpython.rtyper.lltypesystem import lltype, rffi
    from rpython.rtyper.rclass import OBJECTPTR
    from rpython.translator.interactive import Translation

    class A:
        pass

    class B(A):
        pass

    def f(a):
        obj = rffi.cast(OBJECTPTR, a)
        return lltype.runtime_type_info(obj)

    t = Translation(f, [A], backend='c')
    t.backendopt(mallocs=True)
    t.view()
    lib = t.compile()

We were looking at RTTI as a possible solution to the loss of type
information at JIT encoding. My collaborators are trying to develop a
JIT back-end targeting a micro virtual machine (Mu). It seems that by
default the JIT transformer throws away the actual type information
(GcStruct etc., which is not representable under RPython, I know) and
only keeps the size. However, in Mu, memory allocation requires
specific type information.
Thus, among other ways, we are trying to see how much we can recover
this object layout/type information. RTTI seems promising based on the
description in the documentation, but I can't picture what it looks
like at run time.

Can anyone provide some insight on this?

Thanks,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From william.leslie.ttg at gmail.com  Tue Mar 14 23:40:07 2017
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Wed, 15 Mar 2017 14:40:07 +1100
Subject: [pypy-dev] What is RuntimeTypeInfo?
In-Reply-To: 
References: 
Message-ID: 

On 14 March 2017 at 11:17, John Zhang wrote:
> Hi all,
> Can anyone do me the favour of explaining the story of RuntimeTypeInfo
> in RPython?

Hi John!

The RTTI are a hook that the backend can implement; there is a fair
bit of flexibility in what values they can take. The hooks in the C
backend live here:

https://bitbucket.org/pypy/pypy/src/699382943bd73bf19565e996d2042d54e7569e31/rpython/translator/c/gc.py?at=default&fileviewer=file-view-default#gc.py-145

This class is one example of a node for an RTTI value. In this one,
the RTTI value is a function that can statically deallocate structs of
this type. There are more in this file; for example, the RTTI for a
struct in a framework GC is just an identifier iiuc. I think you want
to translate your opaque types yourself, possibly

> I couldn't quite understand the description in the online
> documentation
> (http://rpython.readthedocs.io/en/latest/rtyper.html#opaque-types).
> From looking at the source code, and inspecting the generated C
> source, RTTI doesn't seem to be used at all at runtime. The generated
> C source seems to just return an uninitialised function pointer on
> the stack when compiling the following code:
>
>     from rpython.rtyper.rclass import OBJECTPTR
>
>     class A:
>         pass
>
>     class B(A):
>         pass
>
>     def f(a):
>         obj = rffi.cast(OBJECTPTR, a)
>         return lltype.runtime_type_info(obj)
>
>     t = Translation(f, [A], backend='c')
>     t.backendopt(mallocs=True)
>     t.view()
>     lib = t.compile()

This example (which uses the refcount GC) grabs a function from the
type OBJECTPTR that can de-allocate an OBJECTPTR (that is, it can
decrement the refcount of the object `a`). I haven't had a look at why
it would be uninitialised.

> We were looking at RTTI as a possible solution to the loss of type
> information at JIT encoding. My collaborators are trying to develop a
> JIT back-end targeting a micro virtual machine (Mu). It seems that by
> default the JIT transformer throws away the actual type information
> (GcStruct etc., which is not representable under RPython, I know) and
> only keeps the size. However, in Mu, memory allocation requires
> specific type information. Thus, among other ways, we are trying to
> see how much we can recover this object layout/type information. RTTI
> seems promising based on the description in the documentation, but I
> can't picture what it looks like at run time.
> Can anyone provide some insight on this?

The low-level allocation operations specify size as an operand - it
might be better for you to translate the various new_ operations into
a form you can make use of long before jitcode generation.
You'll need to extend the codewriter to allow for those operations, too. -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From matti.picus at gmail.com Wed Mar 15 01:34:13 2017 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 15 Mar 2017 07:34:13 +0200 Subject: [pypy-dev] Freeze of pypy2 and pypy3 for upcoming release Message-ID: An HTML attachment was scrubbed... URL: From armin.rigo at gmail.com Wed Mar 15 03:13:06 2017 From: armin.rigo at gmail.com (Armin Rigo) Date: Wed, 15 Mar 2017 08:13:06 +0100 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: References: Message-ID: Hi, On 15 March 2017 at 04:40, William ML Leslie wrote: > The RTTI are a hook that the backend can implement, there is a fair > bit of flexibility in what values they can take. That's right, but also, the RTTI is actually something from the early days of PyPy and not used any more nowadays. It is used with our test-only refcounting GC but not by "real code". I wouldn't start with that. >> It seems that by default the JIT transformer throws away the actual type information >> (GcStruct etc., which is not representable under RPython, I know) and only keeps >> the size. However, in Mu, memory allocation requires specific type information. That's not really true. We need to keep at least the typeid (a number, also called "tid") in addition to the size. This is stored in the SizeDescr, for GcStructs, which is a small piece of type information built at translation time by cpu.sizeof(). Look for get_size_descr() in jit/backend/llsupport/, and for init_size_descr() in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach whatever info is needed there. As William said, depending on what exactly you need, you need to also tweak jit/codewriter/, which is the code that ultimately invokes the translation-time setting up of SizeDescr. Also, the same applies to the other Descr classes in llsupport/descr.py, at least ArrayDescr. A bient?t, Armin. From cfbolz at gmx.de Wed Mar 15 05:35:09 2017 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Wed, 15 Mar 2017 10:35:09 +0100 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: References: Message-ID: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> Hi John, As an aside, if you strictly need more information in your descrs, nobody stops you from using less of llsupport and instead writing or at least overriding your own descr infrastructure. I would imagine that a number of things from llsupport are not a perfect match for the mu backend. Cheers, Carl Friedrich On March 15, 2017 8:13:06 AM GMT+01:00, Armin Rigo wrote: >Hi, > >On 15 March 2017 at 04:40, William ML Leslie > wrote: >> The RTTI are a hook that the backend can implement, there is a fair >> bit of flexibility in what values they can take. > >That's right, but also, the RTTI is actually something from the early >days of PyPy and not used any more nowadays. It is used with our >test-only refcounting GC but not by "real code". I wouldn't start >with that. > >>> It seems that by default the JIT transformer throws away the actual >type information >>> (GcStruct etc., which is not representable under RPython, I know) >and only keeps >>> the size. However, in Mu, memory allocation requires specific type >information. 
> >That's not really true. We need to keep at least the typeid (a >number, also called "tid") in addition to the size. This is stored in >the SizeDescr, for GcStructs, which is a small piece of type >information built at translation time by cpu.sizeof(). Look for >get_size_descr() in jit/backend/llsupport/, and for init_size_descr() >in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach >whatever info is needed there. > >As William said, depending on what exactly you need, you need to also >tweak jit/codewriter/, which is the code that ultimately invokes the >translation-time setting up of SizeDescr. > >Also, the same applies to the other Descr classes in >llsupport/descr.py, at least ArrayDescr. > > >A bient?t, > >Armin. >_______________________________________________ >pypy-dev mailing list >pypy-dev at python.org >https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From cfbolz at gmx.de Wed Mar 15 06:22:40 2017 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Wed, 15 Mar 2017 11:22:40 +0100 Subject: [pypy-dev] Programming Language Implementation Summer School (PLISS) Message-ID: <89526bb0-177b-2b70-f975-89ecf624a2a9@gmx.de> ============================================================================ Programming Language Implementation Summer School (PLISS) May 20-27, 2017, Bertinoro Italy https://pliss2017.github.io/ ============================================================================ Programming languages are our interface to the myriad of computer systems we interact with on a daily basis. They allow us to craft complex sequences of operations at increasing high levels of abstraction. How are these languages designed? How are they implemented? How do we evaluate them? The First Programming Language Implementation Summer School (PLISS) will be held in Bertinoro, Italy from May 20 to 27, 2017. The Summer School's goal is to prepare early graduate students and advanced undergraduates for research in the field. This will be done through a combination of lectures on language implementation techniques and short talks exploring the state of the art in programming language research and practice. Lectures cover current research and future trends in programming language design and implementation, including: * Writing Just-in-time Compilers with LLVM * Performance Evaluation and Benchmarking * Designing a Commercial Actor Language * High-Performance Fully Concurrent Garbage Collection * Compiling Dynamic Languages * Language-support for Distributed Datastores The instructors are accomplished researchers and practitioners with extensive experience designing and engineering successful languages and tools. We gratefully acknowledge the support of our sponsors in allowing us to make travel grants and fellowships available to support students interested in attending PLISS. More details at https://pliss2017.github.io/ From John.Zhang at anu.edu.au Wed Mar 15 20:14:31 2017 From: John.Zhang at anu.edu.au (John Zhang) Date: Thu, 16 Mar 2017 00:14:31 +0000 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> References: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> Message-ID: Thanks Carl, Armin and William! We will look into it further. 
Cheers, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au On 15 Mar 2017, at 20:35, Carl Friedrich Bolz > wrote: Hi John, As an aside, if you strictly need more information in your descrs, nobody stops you from using less of llsupport and instead writing or at least overriding your own descr infrastructure. I would imagine that a number of things from llsupport are not a perfect match for the mu backend. Cheers, Carl Friedrich On March 15, 2017 8:13:06 AM GMT+01:00, Armin Rigo > wrote: Hi, On 15 March 2017 at 04:40, William ML Leslie > wrote: The RTTI are a hook that the backend can implement, there is a fair bit of flexibility in what values they can take. That's right, but also, the RTTI is actually something from the early days of PyPy and not used any more nowadays. It is used with our test-only refcounting GC but not by "real code". I wouldn't start with that. It seems that by default the JIT transformer throws away the actual type information (GcStruct etc., which is not representable under RPython, I know) and only keeps the size. However, in Mu, memory allocation requires specific type information. That's not really true. We need to keep at least the typeid (a number, also called "tid") in addition to the size. This is stored in the SizeDescr, for GcStructs, which is a small piece of type information built at translation time by cpu.sizeof(). Look for get_size_descr() in jit/backend/llsupport/, and for init_size_descr() in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach whatever info is needed there. As William said, depending on what exactly you need, you need to also tweak jit/codewriter/, which is the code that ultimately invokes the translation-time setting up of SizeDescr. Also, the same applies to the other Descr classes in llsupport/descr.py, at least ArrayDescr. A bient?t, Armin. ________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From John.Zhang at anu.edu.au Wed Mar 15 20:47:59 2017 From: John.Zhang at anu.edu.au (John Zhang) Date: Thu, 16 Mar 2017 00:47:59 +0000 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> References: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> Message-ID: <34397520-113E-4ABE-AED2-5CFAC15AD833@anu.edu.au> Hi Carl, Armin, William, I have thought about modifying the JIT code instruction set, descriptors and runtime rewrite etc. to encode the MuTyped CFG (which is a further type and ops specialisation towards the Mu MicroVM) for Mu back-end. But I presume this will involve a large amount of work? Would this be the case? And thus this wouldn?t be a good idea, right? 
Regards, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au On 15 Mar 2017, at 20:35, Carl Friedrich Bolz > wrote: Hi John, As an aside, if you strictly need more information in your descrs, nobody stops you from using less of llsupport and instead writing or at least overriding your own descr infrastructure. I would imagine that a number of things from llsupport are not a perfect match for the mu backend. Cheers, Carl Friedrich On March 15, 2017 8:13:06 AM GMT+01:00, Armin Rigo > wrote: Hi, On 15 March 2017 at 04:40, William ML Leslie > wrote: The RTTI are a hook that the backend can implement, there is a fair bit of flexibility in what values they can take. That's right, but also, the RTTI is actually something from the early days of PyPy and not used any more nowadays. It is used with our test-only refcounting GC but not by "real code". I wouldn't start with that. It seems that by default the JIT transformer throws away the actual type information (GcStruct etc., which is not representable under RPython, I know) and only keeps the size. However, in Mu, memory allocation requires specific type information. That's not really true. We need to keep at least the typeid (a number, also called "tid") in addition to the size. This is stored in the SizeDescr, for GcStructs, which is a small piece of type information built at translation time by cpu.sizeof(). Look for get_size_descr() in jit/backend/llsupport/, and for init_size_descr() in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach whatever info is needed there. As William said, depending on what exactly you need, you need to also tweak jit/codewriter/, which is the code that ultimately invokes the translation-time setting up of SizeDescr. Also, the same applies to the other Descr classes in llsupport/descr.py, at least ArrayDescr. A bient?t, Armin. ________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at manueljacob.de Wed Mar 15 21:20:07 2017 From: me at manueljacob.de (Manuel Jacob) Date: Thu, 16 Mar 2017 02:20:07 +0100 Subject: [pypy-dev] Remaining test_importlib failures Message-ID: Hi, I'm currently trying to fix the remaining test_importlib failures (http://buildbot.pypy.org/summary/longrepr?testname=%3Aunmodified&builder=pypy-c-jit-linux-x86-64&build=4451&mod=lib-python%2F3%2Ftest%2Ftest_importlib), of which some are a bit obscure. Most (or all) tests from test_importlib.frozen are failing because PyPy doesn't really have frozen modules (in the sense of CPython, where these are a special kind of modules besides normal Python modules and extension modules). Does it sound reasonable to skip these tests completely? Another class of tests is failing because we didn't implement PEP 489 (Multi-phase extension module initialization) so far. This is mostly cpyext-related, which isn't my area of expertise, but I can look into it if noone else is interested. -Manuel From william.leslie.ttg at gmail.com Wed Mar 15 21:44:57 2017 From: william.leslie.ttg at gmail.com (William ML Leslie) Date: Thu, 16 Mar 2017 12:44:57 +1100 Subject: [pypy-dev] What is RuntimeTypeInfo? 
In-Reply-To: <34397520-113E-4ABE-AED2-5CFAC15AD833@anu.edu.au> References: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> <34397520-113E-4ABE-AED2-5CFAC15AD833@anu.edu.au> Message-ID: On 16 March 2017 at 11:47, John Zhang wrote: > Hi Carl, Armin, William, > I have thought about modifying the JIT code instruction set, descriptors and > runtime rewrite etc. to encode the MuTyped CFG (which is a further type and > ops specialisation towards the Mu MicroVM) for Mu back-end. But I presume > this will involve a large amount of work? Would this be the case? And thus > this wouldn?t be a good idea, right? > The more of the metainterp you can make use of the better. I would start by trying to push your graphs through the codewriter and seeing how/if that fails. LLTyped graphs have gone to the effort to encode the types involved into the identifier of the operation, eg, you get float_abs and llong_rshift. If you are not able to do something like that, you can still pass constants as arguments to the operation, and then ignore them in the backend (or use them to generate a particular operation) and have the codewriter preserve them for analysis later. They will get recorded in the constant table on the jitcodes. For an example of this, note that indirect_call operations maintain a set of possible targets as their last argument; it might be illustrative to search for that name. I guess it is worth asking: What other operations are you finding difficult to type? I get that the lltype specialisation of new_ is one, and that address manipulation with composite offsets is another (though it turns out to always be well-typed in practice). -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From planrichi at gmail.com Thu Mar 16 04:13:59 2017 From: planrichi at gmail.com (Richard Plangger) Date: Thu, 16 Mar 2017 09:13:59 +0100 Subject: [pypy-dev] Remaining test_importlib failures In-Reply-To: References: Message-ID: <802f243c-33b7-98f2-0783-1e9d640d9fe5@gmail.com> Hello, > Most (or all) tests from test_importlib.frozen are failing because PyPy > doesn't really have frozen modules (in the sense of CPython, where these > are a special kind of modules besides normal Python modules and > extension modules). Does it sound reasonable to skip these tests > completely? IMHO: Yes I would think so. It might be worthwhile to revisit at some time in the future if people complain that we do not support that. This should also be documented on pypy.readthedocs.org if we decide not to do it now. > Another class of tests is failing because we didn't implement PEP 489 > (Multi-phase extension module initialization) so far. This is mostly > cpyext-related, which isn't my area of expertise, but I can look into it > if noone else is interested. It does not sound very hard. And since it exposes some new public API people will start to use it... So I'm +1 to implement it. 
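Going back to the frozen-module tests: the skip could be as simple as something along these lines (a sketch only; the test class name is just an example, and platform.python_implementation() is the stdlib way to detect the interpreter):

    import platform
    import unittest

    # Sketch: skip CPython-specific frozen-module tests on PyPy
    # instead of letting them fail.
    @unittest.skipIf(platform.python_implementation() != 'CPython',
                     'frozen modules are a CPython implementation detail')
    class FrozenImporterTests(unittest.TestCase):
        pass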
Cheers, Richard From abcdoyle888 at gmail.com Fri Mar 17 01:48:49 2017 From: abcdoyle888 at gmail.com (Dingyuan Wang) Date: Fri, 17 Mar 2017 13:48:49 +0800 Subject: [pypy-dev] Segfaults when compiling PyPy Message-ID: <6b330322-c830-e673-a1be-270330a0f025@gmail.com> Dear all, Is there anyone also having the problem that CPython2.7 or PyPy2 randomly crashes when compiling PyPy (several latest versions on hg)? I'm using Python 2.7.13 (or PyPy2 latest) on Debian stretch. One kind of problems is https://bugs.python.org/issue29242 Another kind is shown below. (at 90736:e668451adc8d) Program received signal SIGSEGV, Segmentation fault. update_refs () at ../Modules/gcmodule.c:332 332 ../Modules/gcmodule.c: No such file or directory. (gdb) bt #0 __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58 #1 0x00007ffff6f2d40a in __GI_abort () at abort.c:89 #2 0x00007ffff6f69bd0 in __libc_message (do_abort=do_abort at entry=2, fmt=fmt at entry=0x7ffff705ec30 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175 #3 0x00007ffff6f6ff96 in malloc_printerr (action=3, str=0x7ffff705ec88 "munmap_chunk(): invalid pointer", ptr=, ar_ptr=) at malloc.c:5046 #4 0x0000555555630f5f in list_dealloc.lto_priv () at ../Objects/listobject.c:316 #5 0x0000555555688456 in dict_dealloc.lto_priv.61 (mp=0x7fffe2e52398) at ../Objects/dictobject.c:1040 #6 subtype_dealloc.lto_priv () at ../Objects/typeobject.c:1035 #7 0x0000555555678af2 in list_ass_slice.lto_priv () at ../Objects/listobject.c:704 #8 0x000055555569253e in assign_slice.lto_priv () at ../Python/ceval.c:4758 #9 0x000055555565300b in PyEval_EvalFrameEx () at ../Python/ceval.c:1868 #10 0x0000555555654c1f in fast_function (nk=, na=, n=, pp_stack=0x7fffffffcc50, func=) at ../Python/ceval.c:4437 #11 call_function (oparg=, pp_stack=0x7fffffffcc50) at ../Python/ceval.c:4372 #12 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 ---Type to continue, or q to quit--- #13 0x0000555555654c1f in fast_function (nk=, na=, n=, pp_stack=0x7fffffffcda0, func=) at ../Python/ceval.c:4437 #14 call_function (oparg=, pp_stack=0x7fffffffcda0) at ../Python/ceval.c:4372 #15 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #16 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #17 0x0000555555654f19 in fast_function (nk=1, na=, n=, pp_stack=0x7fffffffcfb0, func=) at ../Python/ceval.c:4447 #18 call_function (oparg=, pp_stack=0x7fffffffcfb0) at ../Python/ceval.c:4372 #19 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #20 0x0000555555654c1f in fast_function (nk=, na=, n=, pp_stack=0x7fffffffd100, func=) at ../Python/ceval.c:4437 #21 call_function (oparg=, pp_stack=0x7fffffffd100) at ../Python/ceval.c:4372 #22 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #23 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #24 0x0000555555669ea8 in function_call.lto_priv () at ../Objects/funcobject.c:523 #25 0x000055555563b673 in PyObject_Call () at ../Objects/abstract.c:2547 ---Type to continue, or q to quit--- #26 0x00005555556518a5 in ext_do_call (nk=0, na=3, flags=, pp_stack=0x7fffffffd3b8, func=) at ../Python/ceval.c:4666 #27 PyEval_EvalFrameEx () at ../Python/ceval.c:3028 #28 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #29 0x0000555555655698 in fast_function (nk=1, na=, n=, pp_stack=0x7fffffffd5c0, func=) at ../Python/ceval.c:4447 #30 call_function (oparg=, pp_stack=0x7fffffffd5c0) at ../Python/ceval.c:4372 #31 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #32 0x0000555555654c1f in fast_function (nk=, na=, 
n=, pp_stack=0x7fffffffd710, func=) at ../Python/ceval.c:4437 #33 call_function (oparg=, pp_stack=0x7fffffffd710) at ../Python/ceval.c:4372 #34 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #35 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #36 0x0000555555655698 in fast_function (nk=0, na=, n=, pp_stack=0x7fffffffd920, func=) at ../Python/ceval.c:4447 #37 call_function (oparg=, pp_stack=0x7fffffffd920) at ../Python/ceval.c:4372 ---Type to continue, or q to quit--- #38 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #39 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #40 0x000055555564d2d9 in PyEval_EvalCode (co=, globals=, locals=) at ../Python/ceval.c:669 #41 0x000055555567ce3f in run_mod.lto_priv () at ../Python/pythonrun.c:1376 #42 0x0000555555677d52 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362 #43 0x000055555567789e in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:948 #44 0x0000555555628af1 in Py_Main () at ../Modules/main.c:640 #45 0x00007ffff6f192b1 in __libc_start_main (main=0x555555628420
, argc=4, argv=0x7fffffffdd68, init=, fini=, rtld_fini=, stack_end=0x7fffffffdd58) at ../csu/libc-start.c:291 #46 0x000055555562831a in _start () -- Dingyuan Wang -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From abcdoyle888 at gmail.com Fri Mar 17 01:52:17 2017 From: abcdoyle888 at gmail.com (Dingyuan Wang) Date: Fri, 17 Mar 2017 13:52:17 +0800 Subject: [pypy-dev] Segfaults when compiling PyPy In-Reply-To: <6b330322-c830-e673-a1be-270330a0f025@gmail.com> References: <6b330322-c830-e673-a1be-270330a0f025@gmail.com> Message-ID: > Dear all, > > Is there anyone also having the problem that CPython2.7 or PyPy2 > randomly crashes when compiling PyPy (several latest versions on hg)? > I'm using Python 2.7.13 (or PyPy2 latest) on Debian stretch. > > One kind of problems is https://bugs.python.org/issue29242 > > Another kind is shown below. (at 90736:e668451adc8d) > > Program received signal SIGSEGV, Segmentation fault. > update_refs () at ../Modules/gcmodule.c:332 > 332 ../Modules/gcmodule.c: No such file or directory. copy&paste error, the above should be: *** Error in `/usr/bin/python': munmap_chunk():invalid pointer: 0x00007fffe2a1ba50 *** ... Program received signal SIGABRT, Aborted. __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58 58 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. > (gdb) bt > #0 __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58 > #1 0x00007ffff6f2d40a in __GI_abort () at abort.c:89 > #2 0x00007ffff6f69bd0 in __libc_message (do_abort=do_abort at entry=2, > fmt=fmt at entry=0x7ffff705ec30 "*** Error in `%s': %s: 0x%s ***\n") > at ../sysdeps/posix/libc_fatal.c:175 > #3 0x00007ffff6f6ff96 in malloc_printerr (action=3, > str=0x7ffff705ec88 "munmap_chunk(): invalid pointer", ptr= out>, > ar_ptr=) at malloc.c:5046 > #4 0x0000555555630f5f in list_dealloc.lto_priv () > at ../Objects/listobject.c:316 > #5 0x0000555555688456 in dict_dealloc.lto_priv.61 (mp=0x7fffe2e52398) > at ../Objects/dictobject.c:1040 > #6 subtype_dealloc.lto_priv () at ../Objects/typeobject.c:1035 > #7 0x0000555555678af2 in list_ass_slice.lto_priv () > at ../Objects/listobject.c:704 > #8 0x000055555569253e in assign_slice.lto_priv () at ../Python/ceval.c:4758 > #9 0x000055555565300b in PyEval_EvalFrameEx () at ../Python/ceval.c:1868 > #10 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffcc50, > func=) at ../Python/ceval.c:4437 > #11 call_function (oparg=, pp_stack=0x7fffffffcc50) > at ../Python/ceval.c:4372 > #12 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > ---Type to continue, or q to quit--- > #13 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffcda0, > func=) at ../Python/ceval.c:4437 > #14 call_function (oparg=, pp_stack=0x7fffffffcda0) > at ../Python/ceval.c:4372 > #15 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #16 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #17 0x0000555555654f19 in fast_function (nk=1, na=, > n=, pp_stack=0x7fffffffcfb0, > func=) at ../Python/ceval.c:4447 > #18 call_function (oparg=, pp_stack=0x7fffffffcfb0) > at ../Python/ceval.c:4372 > #19 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #20 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffd100, > func=) at ../Python/ceval.c:4437 > #21 call_function (oparg=, pp_stack=0x7fffffffd100) > at ../Python/ceval.c:4372 > #22 PyEval_EvalFrameEx 
() at ../Python/ceval.c:2989 > #23 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #24 0x0000555555669ea8 in function_call.lto_priv () > at ../Objects/funcobject.c:523 > #25 0x000055555563b673 in PyObject_Call () at ../Objects/abstract.c:2547 > ---Type to continue, or q to quit--- > #26 0x00005555556518a5 in ext_do_call (nk=0, na=3, flags=, > pp_stack=0x7fffffffd3b8, func=) > at ../Python/ceval.c:4666 > #27 PyEval_EvalFrameEx () at ../Python/ceval.c:3028 > #28 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #29 0x0000555555655698 in fast_function (nk=1, na=, > n=, pp_stack=0x7fffffffd5c0, > func=) at ../Python/ceval.c:4447 > #30 call_function (oparg=, pp_stack=0x7fffffffd5c0) > at ../Python/ceval.c:4372 > #31 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #32 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffd710, > func=) at ../Python/ceval.c:4437 > #33 call_function (oparg=, pp_stack=0x7fffffffd710) > at ../Python/ceval.c:4372 > #34 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #35 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #36 0x0000555555655698 in fast_function (nk=0, na=, > n=, pp_stack=0x7fffffffd920, > func=) at ../Python/ceval.c:4447 > #37 call_function (oparg=, pp_stack=0x7fffffffd920) > at ../Python/ceval.c:4372 > ---Type to continue, or q to quit--- > #38 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #39 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #40 0x000055555564d2d9 in PyEval_EvalCode (co=, > globals=, locals=) at > ../Python/ceval.c:669 > #41 0x000055555567ce3f in run_mod.lto_priv () at ../Python/pythonrun.c:1376 > #42 0x0000555555677d52 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362 > #43 0x000055555567789e in PyRun_SimpleFileExFlags () > at ../Python/pythonrun.c:948 > #44 0x0000555555628af1 in Py_Main () at ../Modules/main.c:640 > #45 0x00007ffff6f192b1 in __libc_start_main (main=0x555555628420
,
> argc=4, argv=0x7fffffffdd68, init=, fini= out>,
> rtld_fini=, stack_end=0x7fffffffdd58)
> at ../csu/libc-start.c:291
> #46 0x000055555562831a in _start ()
>

--
Dingyuan Wang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From matti.picus at gmail.com Mon Mar 20 15:46:41 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Mon, 20 Mar 2017 21:46:41 +0200
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
Message-ID: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>

An HTML attachment was scrubbed...
URL:

From me at manueljacob.de Mon Mar 20 18:37:28 2017
From: me at manueljacob.de (Manuel Jacob)
Date: Mon, 20 Mar 2017 23:37:28 +0100
Subject: [pypy-dev] Remove cpyext.load_module()?
Message-ID: <2be9b2ea1771a0ebb150f4ca44bdb6c3@manueljacob.de>

Hi,

In order to implement PEP 489 (Multi-phase extension module initialization) on py3.5 I need to change the way app-level Python code interacts with extension module loading internals. Maintaining cpyext.load_module() is a bit annoying. Probably the best way would be to delegate straight to imp.load_dynamic(). But then the question is why we need it in the first place.

I'd like to remove cpyext.load_module() on both default and py3.5.

-Manuel

From phyo.arkarlwin at gmail.com Tue Mar 21 00:53:22 2017
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Tue, 21 Mar 2017 04:53:22 +0000
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
In-Reply-To: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
Message-ID: 

Testing. Saw a warning that it is much slower than pypy2. How much slower?

On Tue, Mar 21, 2017, 02:17 Matti Picus wrote:

> The release is almost ready. Please check the website for obvious typos,
>
> http://pypy.org/download.html (note I fixed a typo in the hashsum titles
> in 536afa5d31cf )
>
> and more importantly the downloads for problems on your platform
>
> https://bitbucket.org/pypy/pypy/downloads
>
> Thanks,
> Matti
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From John.Zhang at anu.edu.au Tue Mar 21 01:29:08 2017
From: John.Zhang at anu.edu.au (John Zhang)
Date: Tue, 21 Mar 2017 05:29:08 +0000
Subject: [pypy-dev] lltype.Signed type in ThreadLocalReference
Message-ID: 

Hi all,

I'm wondering why rthread.ThreadLocalReference is initialised to have lltype.Signed type (rthread.py:387). If in get() it's retrieved as rclass.OBJECTPTR, why not just set the type of the field to be OBJECTPTR? Is this related to some specific optimisation?

The problem I'm having is that in my back-end I cannot cast an integer to a GC-ed heap object reference (which the OBJECTPTR translates to); it can only be cast to a non-GC-ed memory object reference (a different memory space not part of the GC managed heap).

Any ideas?
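For concreteness, the operation I cannot express looks conceptually like this (a simplified sketch; lltype.cast_int_to_ptr stands in for whatever the real rthread code path uses, and may not be exactly it):

    from rpython.rtyper import rclass
    from rpython.rtyper.lltypesystem import lltype

    FIELDTYPE = lltype.Signed    # how the thread-local slot is declared

    def example_get(raw):
        # reinterpret the Signed as a GC object pointer: the
        # int-to-GC-ref cast that the Mu backend cannot express
        return lltype.cast_int_to_ptr(rclass.OBJECTPTR, raw)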
Regards, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Tue Mar 21 12:14:22 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Tue, 21 Mar 2017 11:14:22 -0500 Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform In-Reply-To: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com> References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com> Message-ID: Typo: in this subject header, you wrote 2, 7 instead of 2.7. ;) -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Mar 20, 2017 2:47 PM, "Matti Picus" wrote: > The release is almost ready. Please check the website for obvious typos, > > http://pypy.org/download.html (note I fixed a typo in the hashsum titles > in 536afa5d31cf ) > > and more importantly the downloads for problems on your platform > > https://bitbucket.org/pypy/pypy/downloads > > Thanks, > Matti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at manueljacob.de Tue Mar 21 20:03:55 2017 From: me at manueljacob.de (Manuel Jacob) Date: Wed, 22 Mar 2017 01:03:55 +0100 Subject: [pypy-dev] lltype.Signed type in ThreadLocalReference In-Reply-To: References: Message-ID: <2c8b5b1daf6488e9e917d707f72d3630@manueljacob.de> Hi John, I can't say for sure (Armin probably knows better), but from the commit history it looks like this was specifically changed for the JIT. With commit 5291d2692c2375a4105b43498188e749d4204dc8 the type was changed from llmemory.GCREF to lltype.Signed. I'd recommend changing it back to llmemory.GCREF or rclass.OBJECTPTR in your fork and look whether you'll run into problems later. -Manuel On 2017-03-21 06:29, John Zhang wrote: > Hi all, > I?m wondering why rthread.ThreadLocalReference is initialised to have > lltype.Signed type (rthread.py:387). If in get() it?s retrieved as > rclass.OBJECTPTR, why not just set the type of the field to be > OBJECTPTR? is this related to some specific optimisation? > The problem I?m having is that in my back-end I cannot cast an integer > to a GC-ed heap object reference (which the OBJECTPTR translates to), > it can only be cast to a non-GC-ed memory object reference (a > different memory space not part of the GC managed heap). > Any ideas? 
>
> Regards,
> John Zhang
>
> ------------------------------------------------------
> John Zhang
> Research Assistant
> Programming Languages, Design & Implementation Division
> Computer Systems Group
> ANU College of Engineering & Computer Science
> 108 North Rd
> The Australian National University
> Acton ACT 2601
> john.zhang at anu.edu.au
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev

From John.Zhang at anu.edu.au Wed Mar 22 01:29:04 2017
From: John.Zhang at anu.edu.au (John Zhang)
Date: Wed, 22 Mar 2017 05:29:04 +0000
Subject: [pypy-dev] lltype.Signed type in ThreadLocalReference
In-Reply-To: <2c8b5b1daf6488e9e917d707f72d3630@manueljacob.de>
References: <2c8b5b1daf6488e9e917d707f72d3630@manueljacob.de>
Message-ID: <60C49E26-622C-4C88-8D33-DC9904EB4FCD@anu.edu.au>

Hi Manuel,

I attempted to change it to OBJECTPTR in my local repo and it worked. So I will see how it goes, I guess. Thanks for the reply.

Cheers,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au

On 22 Mar 2017, at 11:03, Manuel Jacob wrote:

Hi John,

I can't say for sure (Armin probably knows better), but from the commit history it looks like this was specifically changed for the JIT. With commit 5291d2692c2375a4105b43498188e749d4204dc8 the type was changed from llmemory.GCREF to lltype.Signed. I'd recommend changing it back to llmemory.GCREF or rclass.OBJECTPTR in your fork and look whether you'll run into problems later.

-Manuel

On 2017-03-21 06:29, John Zhang wrote:

Hi all,

I'm wondering why rthread.ThreadLocalReference is initialised to have lltype.Signed type (rthread.py:387). If in get() it's retrieved as rclass.OBJECTPTR, why not just set the type of the field to be OBJECTPTR? Is this related to some specific optimisation?

The problem I'm having is that in my back-end I cannot cast an integer to a GC-ed heap object reference (which the OBJECTPTR translates to); it can only be cast to a non-GC-ed memory object reference (a different memory space not part of the GC managed heap).

Any ideas?

Regards,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au

_______________________________________________
pypy-dev mailing list
pypy-dev at python.org
https://mail.python.org/mailman/listinfo/pypy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dynamicgl at gmail.com Thu Mar 23 03:41:29 2017
From: dynamicgl at gmail.com (Gelin Yan)
Date: Thu, 23 Mar 2017 15:41:29 +0800
Subject: [pypy-dev] numpy 1.12.1 segfault with pypy 2 5.7 on ubuntu 14.04
Message-ID: 

Hi All

I built pypy 2 5.7 from the source on Ubuntu 14.04 and installed numpy 1.12.1 via pip. When running numpy.test('full'), I noticed there was a segfault.

I tested pypy 2 5.6 with numpy 1.12.1 too. I didn't see any segfault.

Regards

gelin yan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matti.picus at gmail.com Thu Mar 23 03:51:17 2017
From: matti.picus at gmail.com (matti picus)
Date: Thu, 23 Mar 2017 07:51:17 +0000
Subject: [pypy-dev] numpy 1.12.1 segfault with pypy 2 5.7 on ubuntu 14.04
In-Reply-To: 
References: 
Message-ID: 

On Thu, 23 Mar 2017 at 9:42 am, Gelin Yan wrote:

> Hi All
>
> I built pypy 2 5.7 from the source on Ubuntu 14.04 and installed
> numpy 1.12.1 via pip. When running numpy.test('full'), I noticed there was
> a segfault.
>
> I tested pypy 2 5.6 with numpy 1.12.1 too. I didn't see any segfault.
>
> Regards
>
> gelin yan
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

What platform are you using? Do you have limited RAM? Please rerun the tests in verbose mode and preferably open an issue on https://bitbucket.org/pypy/pypy/issues
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From armin.rigo at gmail.com Thu Mar 23 06:39:14 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Thu, 23 Mar 2017 11:39:14 +0100
Subject: [pypy-dev] Remove cpyext.load_module()?
In-Reply-To: <2be9b2ea1771a0ebb150f4ca44bdb6c3@manueljacob.de>
References: <2be9b2ea1771a0ebb150f4ca44bdb6c3@manueljacob.de>
Message-ID: 

Hi,

On 20 March 2017 at 23:37, Manuel Jacob wrote:
> I'd like to remove cpyext.load_module() on both default and py3.5.

If you're talking about the app-level function, it doesn't seem to be called at all (at least on default) apart from test_cpyext.AppTestApi.test_load_error.

A bientôt,

Armin.

From armin.rigo at gmail.com Thu Mar 23 06:46:22 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Thu, 23 Mar 2017 11:46:22 +0100
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
In-Reply-To: 
References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
Message-ID: 

Hi Phyo,

On 21 March 2017 at 05:53, Phyo Arkar wrote:
> Testing. Saw a warning that it is much slower than pypy2. How much slower?

This warning should be removed or at least made much less strong. We didn't measure, but it passes many of the same tests for JIT-code quality now. Where did we leave such a warning?

A bientôt,

Armin.

From armin.rigo at gmail.com Thu Mar 23 07:36:26 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Thu, 23 Mar 2017 12:36:26 +0100
Subject: [pypy-dev] Maybe do a 5.7.1 release soon?
Message-ID: 

Hi all,

https://bitbucket.org/pypy/pypy/issues/2508/dictionary-pop-with-default-fails-with

This is a core regression found by Jason and fixed by Alex Gaynor. Thanks to both! Is this justification enough to plan a 5.7.1 release soon?

A bientôt,

Armin.

From matti.picus at gmail.com Thu Mar 23 13:55:27 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 23 Mar 2017 19:55:27 +0200
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
In-Reply-To: 
References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
Message-ID: <965ee337-3068-0066-225b-0d806abed7ed@gmail.com>

On 23/03/17 12:46, Armin Rigo wrote:
> Hi Phyo,
>
> On 21 March 2017 at 05:53, Phyo Arkar wrote:
>> Testing. Saw a warning that it is much slower than pypy2. How much slower?
> This warning should be removed or at least made much less strong. We
> didn't measure, but it passes many of the same tests for JIT-code
> quality now. Where did we leave such a warning?
>
>
> A bientôt,
>
> Armin.

It's in the release notice/blog post.
Sorry, a bit late to refine that now :(
Matti

From omer.drow at gmail.com Sun Mar 26 12:40:44 2017
From: omer.drow at gmail.com (Omer Katz)
Date: Sun, 26 Mar 2017 16:40:44 +0000
Subject: [pypy-dev] Maybe do a 5.7.1 release soon?
In-Reply-To: 
References: 
Message-ID: 

I think this regression will cause a lot of applications to fail because they rely on this behavior. I think that warrants a new release.

On Thu, Mar 23, 2017, 13:38 Armin Rigo wrote:

> Hi all,
>
>
> https://bitbucket.org/pypy/pypy/issues/2508/dictionary-pop-with-default-fails-with
>
> This is a core regression found by Jason and fixed by Alex Gaynor.
> Thanks to both! Is this justification enough to plan a 5.7.1 release
> soon?
>
>
> A bientôt,
>
> Armin.
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matti.picus at gmail.com Sun Mar 26 13:27:07 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Sun, 26 Mar 2017 20:27:07 +0300
Subject: [pypy-dev] Maybe do a 5.7.1 release soon?
In-Reply-To: 
References: 
Message-ID: <93b08754-71d0-acf2-4af4-60d350beb793@gmail.com>

An HTML attachment was scrubbed...
URL:

From sergey.forum at gmail.com Mon Mar 27 01:38:03 2017
From: sergey.forum at gmail.com (Sergey Kurdakov)
Date: Mon, 27 Mar 2017 08:38:03 +0300
Subject: [pypy-dev] error building source pypy2-v5.7.0 cygwin 32/windows 64
Message-ID: 

Hi,

as my Windows pypy would always crash on my project, while the same project works fine on Linux, and my main development environment is Windows, I decided to try a Cygwin pypy. So I got pypy2-v5.7.0-src.tar.bz2 and ran, on the latest Cygwin-32 on Windows 10/64:

python ../../rpython/bin/rpython -Ojit targetpypystandalone

Aside from a few warnings like

/tmp/usession-release-pypy2.7-v5.7.0-0/platcheck_50.c:92:1: warning: implicit declaration of function ‘mremap’ [-Wimplicit-function-declaration]
/tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_0.c:82:68: warning: implicit declaration of function ‘malloc’
[-Wimplicit-function-declaration]

which seem to indicate that some flags are not correctly set, I almost immediately get the following error:

===
[platform:execute] gcc -shared /tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_0.o /tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_1.o /tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_2.o -Wl,--export-all-symbols -lrt -o /tmp/usession-release-pypy2.7-v5.7.0-0/shared_cache/externmod.dll
Traceback (most recent call last):
  File "../../rpython/bin/rpython", line 20, in <module>
    main()
  File "/home/Sergey/pypy2/rpython/translator/goal/translate.py", line 217, in main
    targetspec_dic, translateconfig, config, args = parse_options_and_load_target()
  File "/home/Sergey/pypy2/rpython/translator/goal/translate.py", line 155, in parse_options_and_load_target
    targetspec_dic = load_target(targetspec)
  File "/home/Sergey/pypy2/rpython/translator/goal/translate.py", line 97, in load_target
    mod = __import__(specname)
  File "targetpypystandalone.py", line 11, in <module>
    from pypy.tool.option import make_objspace
  File "/home/Sergey/pypy2/pypy/tool/option.py", line 3, in <module>
    from pypy.config.pypyoption import get_pypy_config
  File "/home/Sergey/pypy2/pypy/config/pypyoption.py", line 44, in <module>
    if detect_cpu.autodetect().startswith('x86'):
  File "/home/Sergey/pypy2/rpython/jit/backend/detect_cpu.py", line 106, in autodetect
    return detect_model_from_host_platform()
  File "/home/Sergey/pypy2/rpython/jit/backend/detect_cpu.py", line 85, in detect_model_from_host_platform
    if feature.detect_sse2():
  File "/home/Sergey/pypy2/rpython/jit/backend/x86/detect_feature.py", line 20, in detect_sse2
    code = cpu_id(eax=1)
  File "/home/Sergey/pypy2/rpython/jit/backend/x86/detect_feature.py", line 34, in cpu_id
    return cpu_info(''.join(asm))
  File "/home/Sergey/pypy2/rpython/jit/backend/x86/detect_feature.py", line 16, in cpu_info
    free(data, 4096)
  File "/home/Sergey/pypy2/rpython/rtyper/lltypesystem/rffi.py", line 260, in wrapper
    assert len(args) == nb_args
AssertionError
====

Is there any guide for building the latest pypy on Cygwin, or at least a way to fix those errors?

Regards
Sergey
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From macek at sandbox.cz Sun Mar 26 07:06:32 2017
From: macek at sandbox.cz (=?UTF-8?B?VmzDocSPYSBNYWNlaw==?=)
Date: Sun, 26 Mar 2017 13:06:32 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
Message-ID: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>

Hi, recently I asked my friends to run my sort of a benchmark on their machines (attached). The goal was to test the speed of different data access in python2 and python3, 32bit and 64bit. One of my friends sent me the pypy results -- the script ran fast as hell! Astounding.

At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded your binary https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and confirmed my friend's results, wow.

I develop a large Django project that includes a big amount of background data processing. Reads large files, computes, issues much SQL to postgresql via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs.

I'd welcome a speedup here very much.

So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set up the paths and ran. The computation printouts were the same, very promising -- taking into account how complicated the project is! The SQL looked right too. My respect on compatibility!
Unfortunately, the time needed to complete was double that of CPython 2.7 for exactly the same task.

You mention you might have some tips for why it's slow. Are you interested in getting in touch? Although I rather can't share the code and data with you, I'm offering a real world example of significant load that might help Pypy get better.

Thank you,

--
: Vlada Macek : http://macek.sandbox.cz : +420 608 978 164
: UNIX && Dev || Training : Python, Django : PGP key 97330EBD

(Disclaimer: The opinions expressed herein are not necessarily those
of my employer, not necessarily mine, and probably not necessary.)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: access-timer.py
Type: text/x-python
Size: 4329 bytes
Desc: not available
URL:

From fijall at gmail.com Mon Mar 27 11:21:28 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 27 Mar 2017 17:21:28 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
In-Reply-To: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
Message-ID: 

Hi Vlada

Generally speaking, if we can't have a look, there is incredibly little we can do: "I have a program" can be pretty much anything.

It is well known that django ORM is very slow (both on pypy and on cpython) and makes the JIT take forever to warm up. I have absolutely no idea how long your run is at full CPU, but this is definitely one of your suspects.

On Sun, Mar 26, 2017 at 1:06 PM, Vláďa Macek wrote:
> Hi, recently I asked my friends to run my sort of a benchmark on their
> machines (attached). The goal was to test the speed of different data
> access in python2 and python3, 32bit and 64bit. One of my friends sent me
> the pypy results -- the script ran fast as hell! Astounding.
>
> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded
> your binary
> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and
> confirmed my friend's results, wow.
>
> I develop a large Django project that includes a big amount of background
> data processing. Reads large files, computes, issues much SQL to postgresql
> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs.
>
> I'd welcome a speedup here very much.
>
> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set
> up the paths and ran. The computation printouts were the same, very
> promising -- taking into account how complicated the project is! The SQL
> looked right too. My respect on compatibility!
>
> Unfortunately, the time needed to complete was double that of CPython
> 2.7 for exactly the same task.
>
> You mention you might have some tips for why it's slow. Are you interested
> in getting in touch? Although I rather can't share the code and data with
> you, I'm offering a real world example of significant load that might help
> Pypy get better.
>
> Thank you,
>
> --
> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164
> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD
>
> (Disclaimer: The opinions expressed herein are not necessarily those
> of my employer, not necessarily mine, and probably not necessary.)
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

From armin.rigo at gmail.com Tue Mar 28 07:02:29 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Tue, 28 Mar 2017 13:02:29 +0200
Subject: [pypy-dev] error building source pypy2-v5.7.0 cygwin 32/windows 64
In-Reply-To: 
References: 
Message-ID: 

Hi Sergey,

On 27 March 2017 at 07:38, Sergey Kurdakov wrote:
> Is there any guide for building the latest pypy on Cygwin, or at least a
> way to fix those errors?

Cygwin is not officially supported. At some point in the past it used to work, thanks to contributions. If it no longer does, then a few fixes are needed again. It looks unlikely to come from the core PyPy team, but if you or someone else wants to contribute the relevant fixes, you are welcome to :-)

A bientôt,

Armin.

From nanjekyejoannah at gmail.com Tue Mar 28 10:14:31 2017
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Tue, 28 Mar 2017 17:14:31 +0300
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
Message-ID: 

Hello,

I am interested in working on the above project. I need to understand what it is about so that I can make a plan for it. I would love to work on it for GSoC if accepted.

In summary, I want to know the goal and the most important stack involved in working on it.

I am proficient in Python. If the above project idea is not so much in that direction, you can advise a better idea among the ones listed here: http://pypy.readthedocs.io/en/latest/project-ideas.html.

Kind regards,

--
Joannah Nanjekye
+256776468213
F : Nanjekye Captain Joannah
S : joannah.nanjekye
T : @captainjoannah
SO : joannah

"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From planrichi at gmail.com Wed Mar 29 08:01:11 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 29 Mar 2017 08:01:11 -0400
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
In-Reply-To: 
References: 
Message-ID: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com>

Hello Joannah,

Ronan might know more about this topic. But here is a short explanation:

A solid start is to read the following documentation:

http://rpython.readthedocs.io/en/latest/translation.html

It explains how Python source code is analyzed, transformed and compiled.

As you know, there are no type annotations of the kind a "static" language provides (like C++, C, Java, ...).

    def foo(a, b):
        return a * b

The two parameters a and b can carry any type (even ones that are not able to execute binary add).

One step of the transformation described in the link above "annotates" the types and deduces other properties. If you have a call site:

    foo("a", 2)

it will deduce that foo's parameter a is an instance of "SomeString" and b is an instance of "SomeInteger". So it will assume that every call site must provide SomeString for a and SomeInteger for b (or a subtype, but I'm not aware of the full details).

If at another place foo(1, 2) is called (which is valid Python), rpython must complain, because it cannot be statically compiled.
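A tiny sketch of that failure mode (entry_point here is just an illustrative translation target, not code from the tree):

    def foo(a, b):
        return a * b

    def entry_point(argv):
        foo("a", 2)   # annotates a as SomeString, b as SomeInteger
        foo(1, 2)     # conflicting annotation for a: translation fails
        return 0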
What we would like is a way to explicitly annotate the types (be aware that this is just an example, it is up to you how you solve it): @explicit_types(a=SomeInteger, b=SomeInteger) def foo(a,b): return a * b This would mean that rpython would complain as soon as it sees foo("a", 2). Preferably I think it would be good to have a mini language to describe such function properties, or variable properties. Cheers, Richard On 03/28/2017 10:14 AM, joannah nanjekye wrote: > Hello, > > I am interested in working on the above project. I need to understand > what it is about so that I can make a plan for it. I would love to work > on it for GSoC if accepted. > > In summary..I want to know the goal and the most important stack > involved working on it. > > I am proficient in python. If the above project idea is not so much in > that direction you can advise a better idea among the ones listed here > http://pypy.readthedocs.io/en/latest/project-ideas.html. > > Kind regards, > > -- > //Joannah Nanjekye > +256776468213 > F : Nanjekye Captain Joannah > S : joannah.nanjekye > T : @captainjoannah > SO : joannah > > /"You think you know when you learn, are more sure when you can write, > even more when you can teach, but certain when you can program." > Alan J. Perlis/ > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > From rymg19 at gmail.com Wed Mar 29 10:28:02 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 29 Mar 2017 09:28:02 -0500 Subject: [pypy-dev] Details on project idea: Explicit typing in RPython In-Reply-To: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com> References: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com> Message-ID: RPython already has this: https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/signature.py -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Mar 29, 2017 7:01 AM, "Richard Plangger" wrote: > Hello Joannah, > > Ronan might know more about this topic. But here is a short explanation: > > A solid start is to read the following documentation: > > http://rpython.readthedocs.io/en/latest/translation.html > > It explains how Python source code is analyzed, transformed and compiled. > > As you know, there are no type annotations as a "static" language > provides (like C++, C, Java, ...). > > def foo(a,b): > return a * b > > The two parameters a and b can carry any type (even the ones that are > not able to execute binary add). > > One step of the transformation described in the link above "annotates" > the types and deduces other properties. > > If you have a call site: > > foo("a", 2) > > It will deduce that foo's parameter a is an instance of "SomeString" and > b is an instance of "SomeInteger". > > So it will assume that when foo is called every call site must provide > SomeString for a and SomeInteger for b (or a subtype, but I'm not aware > of the full details). > > If at another place foo(1,2) is called (which is valid python), rpython > must complain, because it cannot be statically compiled. > > What we would like is a way to explicitly annotate the types (be aware > that this is just an example, it is up to you how you solve it): > > @explicit_types(a=SomeInteger, b=SomeInteger) > def foo(a,b): > return a * b > > This would mean that rpython would complain as soon as it sees foo("a", 2). 
>
> Preferably I think it would be good to have a mini language to describe
> such function properties, or variable properties.
>
> Cheers,
> Richard
>
> On 03/28/2017 10:14 AM, joannah nanjekye wrote:
> > Hello,
> >
> > I am interested in working on the above project. I need to understand
> > what it is about so that I can make a plan for it. I would love to work
> > on it for GSoC if accepted.
> >
> > In summary, I want to know the goal and the most important stack
> > involved in working on it.
> >
> > I am proficient in Python. If the above project idea is not so much in
> > that direction, you can advise a better idea among the ones listed here:
> > http://pypy.readthedocs.io/en/latest/project-ideas.html.
> >
> > Kind regards,
> >
> > --
> > Joannah Nanjekye
> > +256776468213
> > F : Nanjekye Captain Joannah
> > S : joannah.nanjekye
> > T : @captainjoannah
> > SO : joannah
> >
> > "You think you know when you learn, are more sure when you can write,
> > even more when you can teach, but certain when you can program."
> > Alan J. Perlis
> >
> > _______________________________________________
> > pypy-dev mailing list
> > pypy-dev at python.org
> > https://mail.python.org/mailman/listinfo/pypy-dev
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ronan.lamy at gmail.com Wed Mar 29 11:34:37 2017
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Wed, 29 Mar 2017 16:34:37 +0100
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
In-Reply-To: 
References: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com>
Message-ID: 

On 29/03/17 at 15:28, Ryan Gonzalez wrote:
> RPython already has this:
>
> https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/signature.py

Indeed, @signature is one of 2 prior attempts at doing this in rpython[*]. However its syntax is cumbersome and it's rather limited in the types it can express - you can only use what's in rpython.rlib.types and these functions cannot be combined arbitrarily to build more complex types.

[*] The other one is @enforceargs: https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/objectmodel.py

From fijall at gmail.com Fri Mar 31 03:58:19 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 31 Mar 2017 09:58:19 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
In-Reply-To: 
References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
Message-ID: 

What I meant is that ORM is slow *and* it takes forever to warm up. Your code might not run long enough for the ORM to be warm. It's also very likely it'll end up slower on pypy. One thing you can do is to run PYPYLOG=jit-summary:- pypy <your program> and copy-paste the summary output.

The only way to store the warmed-up state is to keep the process alive (as a daemon) and rerun it further. You can see if it speeds up after two or three runs in one process and make decisions accordingly.

On Thu, Mar 30, 2017 at 2:09 PM, Vláďa Macek wrote:
> Hi Maciej (and others?),
>
> I know I must be one of many who wanted a gain without pain. :-) Just gave
> it a try without having an opportunity for some deeper profiling due to my
> project deadlines. I just thought to get in touch in case I missed
> something apparent to you from the combination I reported.
>
> ORM might be slow, but I compare interpreters, not ORMs.
Here's my > program's final stats of processing the input file (nginx access log): > > CPython 2.7.6 32bit > 130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s > > pypy2-v5.7.0-linux32 > 183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s > > This is longer run than what I tried previously and surely this is not a > "double time". But still significantly slower. > > Each line is analyzed using a regexp, which I read is slow in pypy. > > Both runs have exactly same input and output. Subjectively, the processing > debugging output really got faster gradually for pypy, cpython is constant > speed. Is it normal that the warmup can take minutes? I don't know the details. > > In production, this processing is run from cron every five minutes. Is it > possible to store the warmed-up state between runs? (Note: I have *.pyc > files disabled at home using PYTHONDONTWRITEBYTECODE=1.) > > I know it's annoying I don't share code and I'm sorry. With this mail I > just wanted to give out some numbers for the possibly curious. > > The pypy itself is interesting and I hope I'll return to it someday more > thoroughly. > > Thanks again & have a nice day, > > Vl??a > > > On 27.3.2017 17:21, Maciej Fijalkowski wrote: >> Hi Vlada >> >> Generally speaking, if we can't have a look there is incredibly little >> we can do "I have a program" can be pretty much anything. >> >> It is well known that django ORM is very slow (both on pypy and on >> cpython) and makes the JIT take forever to warm up. I have absolutely >> no idea how long is your run at full CPU, but this is definitely one >> of your suspects >> >> On Sun, Mar 26, 2017 at 1:06 PM, Vl??a Macek wrote: >>> Hi, recently I asked my friends to run my sort of a benchmark on their >>> machines (attached). The goal was to test the speed of different data >>> access in python2 and python3, 32bit and 64bit. One of my friends sent me >>> the pypy results -- the script ran fast as hell! Astounding. >>> >>> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded >>> your binary >>> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and >>> confirmed my friend's results, wow. >>> >>> I develop a large Django project, that includes a big amount of background >>> data processing. Reads large files, computes, issues much SQL to postgresql >>> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs. >>> >>> I'd welcome a speedup here very much. >>> >>> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set >>> up the paths and ran. The computation printouts were the same, very >>> promising -- taking into account how complicated the project is! The SQL >>> looked right too. My respect on compatiblity! >>> >>> Unfortunately, the time needed to complete was double in comparison CPython >>> 2.7 for exactly the same task. >>> >>> You mention you might have some tips for why it's slow. Are you interested >>> in getting in touch? Although I rather can't share the code and data with >>> you, I'm offering a real world example of significant load that might help >>> Pypy get better. >>> >>> Thank you, >>> >>> -- >>> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 >>> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD >>> >>> (Disclaimer: The opinions expressed herein are not necessarily those >>> of my employer, not necessarily mine, and probably not necessary.) 
>>> >

From nanjekyejoannah at gmail.com Fri Mar 31 06:44:24 2017
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Fri, 31 Mar 2017 13:44:24 +0300
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
In-Reply-To: 
References: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com>
Message-ID: 

Thank you, I think this is clearer to me now.

On Wed, Mar 29, 2017 at 6:34 PM, Ronan Lamy wrote:

> On 29/03/17 at 15:28, Ryan Gonzalez wrote:
>
>> RPython already has this:
>>
>> https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/signature.py
>>
>
> Indeed, @signature is one of 2 prior attempts at doing this in rpython[*].
> However its syntax is cumbersome and it's rather limited in the types it
> can express - you can only use what's in rpython.rlib.types and these
> functions cannot be combined arbitrarily to build more complex types.
>
> [*] The other one is @enforceargs: https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/objectmodel.py
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev

--
Joannah Nanjekye
+256776468213
F : Nanjekye Captain Joannah
S : joannah.nanjekye
T : @captainjoannah
SO : joannah

"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From macek at sandbox.cz Thu Mar 30 08:09:55 2017
From: macek at sandbox.cz (=?UTF-8?B?VmzDocSPYSBNYWNlaw==?=)
Date: Thu, 30 Mar 2017 14:09:55 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
In-Reply-To: 
References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
Message-ID: 

Hi Maciej (and others?),

I know I must be one of many who wanted a gain without pain. :-) Just gave it a try without having an opportunity for some deeper profiling due to my project deadlines. I just thought to get in touch in case I missed something apparent to you from the combination I reported.

ORM might be slow, but I compare interpreters, not ORMs. Here's my program's final stats of processing the input file (nginx access log):

CPython 2.7.6 32bit
130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s

pypy2-v5.7.0-linux32
183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s

This is a longer run than what I tried previously, and surely this is not a "double time". But still significantly slower.

Each line is analyzed using a regexp, which I read is slow in pypy.

Both runs have exactly the same input and output. Subjectively, the processing debug output really got gradually faster for pypy, while cpython ran at a constant speed. Is it normal that the warmup can take minutes? I don't know the details.

In production, this processing is run from cron every five minutes. Is it possible to store the warmed-up state between runs? (Note: I have *.pyc files disabled at home using PYTHONDONTWRITEBYTECODE=1.)

I know it's annoying that I don't share code and I'm sorry. With this mail I just wanted to give out some numbers for the possibly curious.

Pypy itself is interesting and I hope I'll return to it someday more thoroughly.

Thanks again & have a nice day,

Vláďa

On 27.3.2017 17:21, Maciej Fijalkowski wrote:
> Hi Vlada
>
> Generally speaking, if we can't have a look, there is incredibly little
> we can do: "I have a program" can be pretty much anything.
> > It is well known that django ORM is very slow (both on pypy and on > cpython) and makes the JIT take forever to warm up. I have absolutely > no idea how long is your run at full CPU, but this is definitely one > of your suspects > > On Sun, Mar 26, 2017 at 1:06 PM, Vl??a Macek wrote: >> Hi, recently I asked my friends to run my sort of a benchmark on their >> machines (attached). The goal was to test the speed of different data >> access in python2 and python3, 32bit and 64bit. One of my friends sent me >> the pypy results -- the script ran fast as hell! Astounding. >> >> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded >> your binary >> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and >> confirmed my friend's results, wow. >> >> I develop a large Django project, that includes a big amount of background >> data processing. Reads large files, computes, issues much SQL to postgresql >> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs. >> >> I'd welcome a speedup here very much. >> >> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set >> up the paths and ran. The computation printouts were the same, very >> promising -- taking into account how complicated the project is! The SQL >> looked right too. My respect on compatiblity! >> >> Unfortunately, the time needed to complete was double in comparison CPython >> 2.7 for exactly the same task. >> >> You mention you might have some tips for why it's slow. Are you interested >> in getting in touch? Although I rather can't share the code and data with >> you, I'm offering a real world example of significant load that might help >> Pypy get better. >> >> Thank you, >> >> -- >> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 >> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD >> >> (Disclaimer: The opinions expressed herein are not necessarily those >> of my employer, not necessarily mine, and probably not necessary.) >> From macek at sandbox.cz Fri Mar 31 07:19:31 2017 From: macek at sandbox.cz (=?UTF-8?B?VmzDocSPYSBNYWNlaw==?=) Date: Fri, 31 Mar 2017 13:19:31 +0200 Subject: [pypy-dev] pypy real world example, a django project data processing. but slow... In-Reply-To: References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz> Message-ID: <79d397bf-9b52-0e5e-f1c7-c60b543f6dfb@sandbox.cz> Thanks! I ran it again on a much larger input and let it print the lines/sec speed on every millionth line (either valid or invalid). 
SPEED 6588 l/s SPEED 8208 l/s SPEED 9172 l/s SPEED 10351 l/s SPEED 16946 l/s SPEED 23263 l/s 662.6 secs, 973701 valid lines (5610778 invalid), 9937 l/s, max density 73 l/s [1c3dac321147] {jit-summary Tracing: 2794 8.313955 Backend: 2245 1.946692 TOTAL: 667.678971 ops: 5768705 recorded ops: 1478597 calls: 231321 guards: 392450 opt ops: 456372 opt guards: 101057 opt guards shared: 61039 forcings: 0 abort: trace too long: 52 abort: compiling: 0 abort: vable escape: 497 abort: bad loop: 0 abort: force quasi-immut: 0 nvirtuals: 284152 nvholes: 146657 nvreused: 90634 vecopt tried: 0 vecopt success: 0 Total # of loops: 583 Total # of bridges: 1778 Freed # of loops: 140 Freed # of bridges: 189 [1c3dac33785b] jit-summary} CPython again for comparison on the same input: SPEED 8819 l/s SPEED 9625 l/s SPEED 10285 l/s SPEED 11384 l/s SPEED 16428 l/s SPEED 20588 l/s 596.8 secs, 973701 valid lines (5610778 invalid), 11032 l/s, max density 73 l/s Interesting that after 5 million lines the PyPy speed exceeded the CPython somehow. Both runs got faster with time, probably due to my insane level of local caching of values (less SQL required). Anyway, I still hesitate whether pypy was really still warming up all that time... Thanks, Vlada On 31.3.2017 09:58, Maciej Fijalkowski wrote: > What I meant is that ORM is slow *and* it takes forever to warmup. > Your code might not run long enough for the ORM to be warm. It's also > very likely it'll end up slower on pypy. one thing you can do is to > run PYPYLOG=jit-summary:- pypy and copy paste the > summary output > > The only way to store the warmed up state is to keep the process alive > (as a daemon) and rerun it further. You can see if it speeds up after > two or three runs in one process and make decisions accordingly. > > On Thu, Mar 30, 2017 at 2:09 PM, Vl??a Macek wrote: >> Hi Maciej (and others?), >> >> I know I must be one of many who wanted a gain without pain. :-) Just gave >> it a try without having an opportunity for some deeper profiling due to my >> project deadlines. I just thought to get in touch in case I missed >> something apparent to you from the combination I reported. >> >> ORM might me slow, but I compare interpreters, not ORMs. Here's my >> program's final stats of processing the input file (nginx access log): >> >> CPython 2.7.6 32bit >> 130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s >> >> pypy2-v5.7.0-linux32 >> 183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s >> >> This is longer run than what I tried previously and surely this is not a >> "double time". But still significantly slower. >> >> Each line is analyzed using a regexp, which I read is slow in pypy. >> >> Both runs have exactly same input and output. Subjectively, the processing >> debugging output really got faster gradually for pypy, cpython is constant >> speed. Is it normal that the warmup can take minutes? I don't know the details. >> >> In production, this processing is run from cron every five minutes. Is it >> possible to store the warmed-up state between runs? (Note: I have *.pyc >> files disabled at home using PYTHONDONTWRITEBYTECODE=1.) >> >> I know it's annoying I don't share code and I'm sorry. With this mail I >> just wanted to give out some numbers for the possibly curious. >> >> The pypy itself is interesting and I hope I'll return to it someday more >> thoroughly. 
>> >> Thanks again & have a nice day, >> >> Vl??a >> >> >> On 27.3.2017 17:21, Maciej Fijalkowski wrote: >>> Hi Vlada >>> >>> Generally speaking, if we can't have a look there is incredibly little >>> we can do "I have a program" can be pretty much anything. >>> >>> It is well known that django ORM is very slow (both on pypy and on >>> cpython) and makes the JIT take forever to warm up. I have absolutely >>> no idea how long is your run at full CPU, but this is definitely one >>> of your suspects >>> >>> On Sun, Mar 26, 2017 at 1:06 PM, Vl??a Macek wrote: >>>> Hi, recently I asked my friends to run my sort of a benchmark on their >>>> machines (attached). The goal was to test the speed of different data >>>> access in python2 and python3, 32bit and 64bit. One of my friends sent me >>>> the pypy results -- the script ran fast as hell! Astounding. >>>> >>>> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded >>>> your binary >>>> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and >>>> confirmed my friend's results, wow. >>>> >>>> I develop a large Django project, that includes a big amount of background >>>> data processing. Reads large files, computes, issues much SQL to postgresql >>>> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs. >>>> >>>> I'd welcome a speedup here very much. >>>> >>>> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set >>>> up the paths and ran. The computation printouts were the same, very >>>> promising -- taking into account how complicated the project is! The SQL >>>> looked right too. My respect on compatiblity! >>>> >>>> Unfortunately, the time needed to complete was double in comparison CPython >>>> 2.7 for exactly the same task. >>>> >>>> You mention you might have some tips for why it's slow. Are you interested >>>> in getting in touch? Although I rather can't share the code and data with >>>> you, I'm offering a real world example of significant load that might help >>>> Pypy get better. >>>> >>>> Thank you, >>>> >>>> -- >>>> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 >>>> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD >>>> >>>> (Disclaimer: The opinions expressed herein are not necessarily those >>>> of my employer, not necessarily mine, and probably not necessary.) >>>> -- : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 : UNIX && Dev || Training : Python, Django : PGP key 97330EBD (Disclaimer: The opinions expressed herein are not necessarily those of my employer, not necessarily mine, and probably not necessary.)
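A minimal sketch of the daemon approach suggested above, assuming the batch job can be wrapped in a function (process_batch is a stand-in for the real import job): one long-lived PyPy process replaces the cron entry, so traces compiled in earlier iterations stay warm for later ones.

    import time

    def process_batch():
        # stand-in for the real log-processing run
        pass

    def main():
        while True:
            process_batch()   # JIT-compiled code stays warm across batches
            time.sleep(300)   # the old five-minute cron cadence

    if __name__ == '__main__':
        main()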