From planrichi at gmail.com  Wed Mar  1 07:52:06 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 1 Mar 2017 13:52:06 +0100
Subject: [pypy-dev] PyPy as a Sub-org
Message-ID: 

Hi,

as we discussed during this year's PyPy sprint, PyPy again wants to
participate as a Sub-org in this year's Google Summer of Code. We are
already on the wiki ideas page, but I don't think we have formally
applied yet, so I am doing that by writing this email.

Regards,
Richard

From turnbull.stephen.fw at u.tsukuba.ac.jp  Thu Mar  2 00:03:38 2017
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 2 Mar 2017 14:03:38 +0900
Subject: [pypy-dev] [GSoC2017] PyPy as a Sub-org
In-Reply-To: 
References: 
Message-ID: <22711.42922.606041.882426@turnbull.sk.tsukuba.ac.jp>

Richard Plangger writes:
 > Hi,
 >
 > as we discussed during this year's PyPy sprint, PyPy again wants to
 > participate as a Sub-org in this year's Google Summer of Code. We are
 > already on the wiki ideas page, but I don't think we have formally
 > applied yet, so I am doing that by writing this email.

Your section is visible on the page at http://python-gsoc.org/.  That
was your formal application as a sub-org.  Is there any other problem?

Details:

To actually participate in GSoC under the PSF umbrella, you need to
(1) register at least two mentors X desired slots at
    https://goo.gl/forms/UJb0rHOVQjLna2o53
    Some overlap will be allowed, but the mentors who are working with
    more than one student should be a small minority.
(2) subscribe them to the mailing list
    https://mail.python.org/mailman/listinfo/gsoc-mentors
(3) designate one sub-org admin and one alternate in case the main
    admin is out of contact for more than a day or two
(4) state that you intend to comply with the Python Code of Conduct

See also http://python-gsoc.org/#mentors for more information and
further requirements (that can be satisfied as you go) about mentors
and sub-orgs.

Steve

From yashwardhan.singh at intel.com  Thu Mar  2 20:31:05 2017
From: yashwardhan.singh at intel.com (Singh, Yashwardhan)
Date: Fri, 3 Mar 2017 01:31:05 +0000
Subject: [pypy-dev] Numpy on PyPy : cpyext
Message-ID: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com>

Hi Everyone,

I am using numpy on pypy to train a deep neural network. For my
workload, numpy on pypy is taking twice the time to train as numpy on
CPython. I am using numpy via cpyext.

I read in the documentation: "Performance-wise, the speed is mostly the
same as CPython's NumPy (it is the same code); the exception is that
interactions between the Python side and NumPy objects are mediated
through the slower cpyext layer (which hurts a few benchmarks that do a
lot of element-by-element array accesses, for example)." Is there any
way in which I can profile my application to see how much additional
overhead the cpyext layer is adding, or whether it is numpy on pypy
itself that is slowing things down? I have tried vmprof, but I couldn't
figure out from it how much time the cpyext layer is taking.

Any help will be highly appreciated.

Regards
Yash
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
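A minimal sketch of the kind of measurement being asked about here,
assuming the standalone vmprof package; the output file name and
run_training_step() are made-up placeholders, not part of the original
mail:

    import vmprof

    with open('numpy-workload.prof', 'w+b') as f:
        vmprof.enable(f.fileno())
        run_training_step()   # hypothetical stand-in for the training loop
        vmprof.disable()

The resulting profile can then be inspected with the vmprof viewers
(for example vmprofshow or vmprof.com) to see how much time lands in
C-level frames such as the cpyext glue.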
From fijall at gmail.com  Fri Mar  3 07:40:39 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 3 Mar 2017 13:40:39 +0100
Subject: [pypy-dev] Numpy on PyPy : cpyext
In-Reply-To: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com>
References: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com>
Message-ID: 

Hi Yash

Is your software open source? I'm happy to check it out for you.

I think the C-level profiling for vmprof is relatively new; you would
need to use a pypy nightly in order to get that level of insight.

Additionally, we're working on cpyext improvements *right now*, so stay
tuned. If there is a good case for speeding up numpy, we can get it a
lot faster than it is right now and seek some funding for that. Neural
networks might be one of those!

Best regards,
Maciej Fijalkowski

On Fri, Mar 3, 2017 at 2:31 AM, Singh, Yashwardhan wrote:
> Hi Everyone,
>
> I am using numpy on pypy to train a deep neural network. For my
> workload, numpy on pypy is taking twice the time to train as numpy on
> CPython. I am using numpy via cpyext.
>
> I read in the documentation: "Performance-wise, the speed is mostly the
> same as CPython's NumPy (it is the same code); the exception is that
> interactions between the Python side and NumPy objects are mediated
> through the slower cpyext layer (which hurts a few benchmarks that do a
> lot of element-by-element array accesses, for example)." Is there any
> way in which I can profile my application to see how much additional
> overhead the cpyext layer is adding, or whether it is numpy on pypy
> itself that is slowing things down? I have tried vmprof, but I couldn't
> figure out from it how much time the cpyext layer is taking.
>
> Any help will be highly appreciated.
>
> Regards
> Yash
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

From frankw at mit.edu  Fri Mar  3 09:20:09 2017
From: frankw at mit.edu (Frank Wang)
Date: Fri, 3 Mar 2017 09:20:09 -0500
Subject: [pypy-dev] Disassembling methods called by LOOKUP_METHOD
Message-ID: 

Hi,

I'm trying to figure out the opcodes that the "append" function calls
for arrays. When I use the dis tool, it just says that it looks up a
method "append" using the LOOKUP_METHOD opcode. Is there a tool that
allows me to disassemble built-in functions like "append", or what is
the best way to do this?

Thanks,
Frank
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rymg19 at gmail.com  Fri Mar  3 09:58:58 2017
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 3 Mar 2017 08:58:58 -0600
Subject: [pypy-dev] Disassembling methods called by LOOKUP_METHOD
In-Reply-To: 
References: 
Message-ID: 

You can look at the source code for the objects (all located in
pypy/objspace/std) and find the method implementations there. Here's
append's (from pypy/objspace/std/listobject.py):

    def append(self, w_item):
        """L.append(object) -- append object to end"""
        self.strategy.append(self, w_item)

So it's just appending to an RPython list. If you want to see the
source for that, look in rpython/rtyper.
append's is in rpython/rtyper/rlist.py:

    def rtype_method_append(self, hop):
        v_lst, v_value = hop.inputargs(self, self.item_repr)
        hop.exception_cannot_occur()
        hop.gendirectcall(ll_append, v_lst, v_value)

This ends up calling ll_append in the end (I think the other stuff is
for the JIT?), which is defined in the same file:

    def ll_append(l, newitem):
        length = l.ll_length()
        l._ll_resize_ge(length+1)           # see "a note about overflows" above
        l.ll_setitem_fast(length, newitem)

Now, these ll_* functions are defined in the corresponding file inside
rpython/rtyper/lltypesystem; in this case, it's
rpython/rtyper/lltypesystem/rlist.py:

    self.LIST.become(GcStruct("list", ("length", Signed),
                              ("items", Ptr(ITEMARRAY)),
                              adtmeths = ADTIList({
                                  "ll_newlist": ll_newlist,
                                  "ll_newlist_hint": ll_newlist_hint,
                                  "ll_newemptylist": ll_newemptylist,
                                  "ll_length": ll_length,
                                  "ll_items": ll_items,
                                  "ITEM": ITEM,
                                  "ll_getitem_fast": ll_getitem_fast,
                                  "ll_setitem_fast": ll_setitem_fast,
                                  "_ll_resize_ge": _ll_list_resize_ge,
                                  "_ll_resize_le": _ll_list_resize_le,
                                  "_ll_resize": _ll_list_resize,
                                  "_ll_resize_hint": _ll_list_resize_hint,
                              }),
                              hints = {'list': True})
    )

It's signaling to RPython all the different methods on the low-level
list representation. Here, you want ll_setitem_fast and
_ll_list_resize_ge (I also copy-pasted the functions they call):

    @jit.look_inside_iff(lambda l, newsize, overallocate:
                         jit.isconstant(len(l.items)) and
                         jit.isconstant(newsize))
    @signature(types.any(), types.int(), types.bool(), returns=types.none())
    def _ll_list_resize_hint_really(l, newsize, overallocate):
        """
        Ensure l.items has room for at least newsize elements.  Note that
        l.items may change, and even if newsize is less than l.length on
        entry.
        """
        # This over-allocates proportional to the list size, making room
        # for additional growth.  The over-allocation is mild, but is
        # enough to give linear-time amortized behavior over a long
        # sequence of appends() in the presence of a poorly-performing
        # system malloc().
        # The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
        if newsize <= 0:
            ll_assert(newsize == 0, "negative list length")
            l.length = 0
            l.items = _ll_new_empty_item_array(typeOf(l).TO)
            return
        elif overallocate:
            if newsize < 9:
                some = 3
            else:
                some = 6
            some += newsize >> 3
            new_allocated = newsize + some
        else:
            new_allocated = newsize
        # new_allocated is a bit more than newsize, enough to ensure an
        # amortized linear complexity for e.g. repeated usage of l.append().
        # In case it overflows sys.maxint, it is guaranteed negative, and
        # the following malloc() will fail.
        items = l.items
        newitems = malloc(typeOf(l).TO.items.TO, new_allocated)
        before_len = l.length
        if before_len:   # avoids copying GC flags from the prebuilt_empty_array
            if before_len < newsize:
                p = before_len
            else:
                p = newsize
            rgc.ll_arraycopy(items, newitems, 0, 0, p)
        l.items = newitems

    def _ll_list_resize_ge(l, newsize):
        """This is called with 'newsize' larger than the current length of
        the list.  If the list storage doesn't have enough space, then
        really perform a realloc().  In the common case where we already
        overallocated enough, then this is a very fast operation.
""" cond = len(l.items) < newsize if jit.isconstant(len(l.items)) and jit.isconstant(newsize): if cond: _ll_list_resize_hint_really(l, newsize, True) else: jit.conditional_call(cond, _ll_list_resize_hint_really, l, newsize, True) l.length = newsize def ll_items(l): return l.items def ll_setitem_fast(l, index, item): ll_assert(index < l.length, "setitem out of bounds") l.ll_items()[index] = item ll_setitem_fast.oopspec = 'list.setitem(l, index, item)' On Fri, Mar 3, 2017 at 8:20 AM, Frank Wang wrote: > Hi, > > I'm trying to figure out the opcodes that the "append" function calls for > arrays. When I use the dis tool, it just says that it looks up a method > "append" using the LOOKUP_METHOD opcode. Is there a tool that allows me to > disassemble built-in functions like "append", or what the best way to do > this is? > > Thanks, > Frank > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yashwardhan.singh at intel.com Fri Mar 3 19:20:35 2017 From: yashwardhan.singh at intel.com (Singh, Yashwardhan) Date: Sat, 4 Mar 2017 00:20:35 +0000 Subject: [pypy-dev] Numpy on PyPy : cpyext In-Reply-To: References: <0151F66FF725AC42A760DA612754C5F819912754@ORSMSX104.amr.corp.intel.com> Message-ID: <0151F66FF725AC42A760DA612754C5F819912918@ORSMSX104.amr.corp.intel.com> Hi Maciej, I have applied for clearance to publicly upload the code. I will upload it once I get the permission. Regards Yash -----Original Message----- From: Maciej Fijalkowski [mailto:fijall at gmail.com] Sent: Friday, March 3, 2017 4:41 AM To: Singh, Yashwardhan Cc: pypy-dev at python.org Subject: Re: [pypy-dev] Numpy on PyPy : cpyext Hi Yash Is your software open source? I'm happy to check it out for you I think the c-level profiling for vmprof is relatively new, you would need to use pypy nightly in order to get that level of insight. Additionally, we're working on cpyext improvements *right now* stay tuned. If there is a good case for speeding up numpy, we can get it a lot faster than it is right now and seek some funding for that. Neural networks might be one of those! Best regards, Maciej Fijalkowski On Fri, Mar 3, 2017 at 2:31 AM, Singh, Yashwardhan wrote: > Hi Everyone, > > I am using numpy on pypy to train a deep neural network. For my > workload numpy on pypy is taking twice the time to train as numpy on > Cpython. I am using Numpy via cpyext. > > I read in the documentation, "Performance-wise, the speed is mostly > the same as CPython's NumPy (it is the same code); the exception is > that interactions between the Python side and NumPy objects are > mediated through the slower cpyext layer (which hurts a few benchmarks > that do a lot of element-by-element array accesses, for example)." Is > there any way in which I can profile my application to see how much > additional overhead cypext layer is adding or is it the numpy via pypy > which is slowing down the things. I have tried vmprof, but I couldn't > figure out from it how much time cpyext layer is taking. > > Any help will be highly appreciated. 
>
> Regards
> Yash
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

From fijall at gmail.com  Sat Mar  4 13:01:52 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 4 Mar 2017 19:01:52 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
Message-ID: 

Hello everyone

I've been experimenting a bit with faster utf8 operations (and
conversion that does not do much). I'm writing down the results so
they don't get forgotten, as well as trying to put them in rpython
comments.

As far as non-SSE algorithms go, for things like splitlines, split
etc. it is important to walk the utf8 string quickly and check
properties of characters.

So far the current finding has been that a lookup table, for example:

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        return pos + ord(runicode._utf8_code_length[chr1 - 0x80])

is significantly slower than the following code (both don't do error
checking):

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if 0xC2 <= chr1 <= 0xDF:
            return pos + 2
        if chr1 >= 0xE0 and chr1 <= 0xEF:
            return pos + 3
        return pos + 4

The exact difference depends on how many multi-byte characters there
are and how big the strings are. It's up to 40%, but as a general
rule, the more ascii characters there are, the less of an impact it
has, as well as the larger the strings are, the more impact
memory/L2/L3 cache has.

PS. SSE will be faster still, but we might not want SSE for just
splitlines

Cheers,
fijal
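To make the walk concrete, here is the branch-based variant above driven
over a small made-up string (an editor's sketch in Python 2, not part of
the original mail):

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if 0xC2 <= chr1 <= 0xDF:
            return pos + 2
        if chr1 >= 0xE0 and chr1 <= 0xEF:
            return pos + 3
        return pos + 4

    s = 'a\xc3\xa9\xe2\x82\xac'   # 'a', U+00E9 (2 bytes), U+20AC (3 bytes)
    pos = 0
    while pos < len(s):
        pos = next_codepoint_pos(s, pos)   # visits 0, then 1, then 3, stops at 6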
From phyo.arkarlwin at gmail.com  Sat Mar  4 13:36:16 2017
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Sat, 04 Mar 2017 18:36:16 +0000
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ?

In comparison to CPython, is this much slower?

On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski wrote:

> Hello everyone
>
> I've been experimenting a bit with faster utf8 operations (and
> conversion that does not do much). I'm writing down the results so
> they don't get forgotten, as well as trying to put them in rpython
> comments.
>
> As far as non-SSE algorithms go, for things like splitlines, split
> etc. it is important to walk the utf8 string quickly and check
> properties of characters.
>
> So far the current finding has been that a lookup table, for example:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>
> is significantly slower than the following code (both don't do error
> checking):
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
> The exact difference depends on how many multi-byte characters there
> are and how big the strings are. It's up to 40%, but as a general
> rule, the more ascii characters there are, the less of an impact it
> has, as well as the larger the strings are, the more impact
> memory/L2/L3 cache has.
>
> PS. SSE will be faster still, but we might not want SSE for just
> splitlines
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fijall at gmail.com  Sat Mar  4 14:17:22 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 4 Mar 2017 20:17:22 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Er... why would it be slower than CPython?

Anyway, the speeds I'm reporting on are based on C/assembler programs
so far.

On Sat, Mar 4, 2017 at 7:36 PM, Phyo Arkar wrote:
> SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ?
>
> In comparison to CPython, is this much slower?
>
> On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski wrote:
>>
>> Hello everyone
>>
>> I've been experimenting a bit with faster utf8 operations (and
>> conversion that does not do much). I'm writing down the results so
>> they don't get forgotten, as well as trying to put them in rpython
>> comments.
>>
>> As far as non-SSE algorithms go, for things like splitlines, split
>> etc. it is important to walk the utf8 string quickly and check
>> properties of characters.
>>
>> So far the current finding has been that a lookup table, for example:
>>
>>     def next_codepoint_pos(code, pos):
>>         chr1 = ord(code[pos])
>>         if chr1 < 0x80:
>>             return pos + 1
>>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>>
>> is significantly slower than the following code (both don't do error
>> checking):
>>
>>     def next_codepoint_pos(code, pos):
>>         chr1 = ord(code[pos])
>>         if chr1 < 0x80:
>>             return pos + 1
>>         if 0xC2 <= chr1 <= 0xDF:
>>             return pos + 2
>>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>>             return pos + 3
>>         return pos + 4
>>
>> The exact difference depends on how many multi-byte characters there
>> are and how big the strings are. It's up to 40%, but as a general
>> rule, the more ascii characters there are, the less of an impact it
>> has, as well as the larger the strings are, the more impact
>> memory/L2/L3 cache has.
>>
>> PS. SSE will be faster still, but we might not want SSE for just
>> splitlines
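For readers wondering what the lookup table being compared against looks
like, here is a sketch of its shape (the real table is
rpython/rlib/runicode.py's _utf8_code_length; the exact entries it uses
for invalid lead bytes may differ from this reconstruction):

    # one entry per possible lead byte 0x80-0xFF; entry i is the sequence
    # length for lead byte 0x80+i, with 0 marking bytes that cannot start
    # a valid sequence
    _utf8_code_length = ''.join(chr(n) for n in
        [0] * 64 +      # 0x80-0xBF: continuation bytes
        [0, 0] +        # 0xC0-0xC1: overlong encodings
        [2] * 30 +      # 0xC2-0xDF: two-byte sequences
        [3] * 16 +      # 0xE0-0xEF: three-byte sequences
        [4] * 5 +       # 0xF0-0xF4: four-byte sequences
        [0] * 11)       # 0xF5-0xFF: invalid lead bytes

The extra memory load per lookup is presumably what the branch-based
version avoids.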
From fijall at gmail.com  Sat Mar  4 14:58:01 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 4 Mar 2017 20:58:01 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi phyo

The mail is about doing operations in C/assembler. I will have more
detailed python-level benchmarks while I progress with my branch.

On 04 Mar 2017 7:36 PM, "Phyo Arkar" wrote:

SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ?

In comparison to CPython, is this much slower?

On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski wrote:

> Hello everyone
>
> I've been experimenting a bit with faster utf8 operations (and
> conversion that does not do much). I'm writing down the results so
> they don't get forgotten, as well as trying to put them in rpython
> comments.
>
> As far as non-SSE algorithms go, for things like splitlines, split
> etc. it is important to walk the utf8 string quickly and check
> properties of characters.
>
> So far the current finding has been that a lookup table, for example:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>
> is significantly slower than the following code (both don't do error
> checking):
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
> The exact difference depends on how many multi-byte characters there
> are and how big the strings are. It's up to 40%, but as a general
> rule, the more ascii characters there are, the less of an impact it
> has, as well as the larger the strings are, the more impact
> memory/L2/L3 cache has.
>
> PS. SSE will be faster still, but we might not want SSE for just
> splitlines
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fijall at gmail.com  Sat Mar  4 18:45:01 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 5 Mar 2017 00:45:01 +0100
Subject: [pypy-dev] revisit web assembly?
Message-ID: 

I just found that:
https://sourceware.org/ml/binutils/2017-03/msg00044.html

It might be cool to see, maybe we can relatively easily compile to the
web platform. Even an interpreter-only version could be quite
interesting.

Cheers,
fijal

From armin.rigo at gmail.com  Sun Mar  5 04:14:07 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Sun, 5 Mar 2017 10:14:07 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi Maciej,

On 4 March 2017 at 19:01, Maciej Fijalkowski wrote:
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4

If you don't want error checking, then you can simplify a bit the
range checks here. Maybe it gives some more gains, but who knows:

    def next_codepoint_pos(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if chr1 <= 0xDF:
            return pos + 2
        if chr1 <= 0xEF:
            return pos + 3
        return pos + 4

A bientôt,

Armin.
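A quick consistency check (an editor's sketch, not part of the original
thread) that the simplified range checks agree with the explicit ones on
every byte that can lead a well-formed sequence:

    def explicit(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if 0xC2 <= chr1 <= 0xDF:
            return pos + 2
        if 0xE0 <= chr1 <= 0xEF:
            return pos + 3
        return pos + 4

    def simplified(code, pos):
        chr1 = ord(code[pos])
        if chr1 < 0x80:
            return pos + 1
        if chr1 <= 0xDF:
            return pos + 2
        if chr1 <= 0xEF:
            return pos + 3
        return pos + 4

    # every lead byte of valid UTF-8: ASCII plus 0xC2-0xF4
    for lead in range(0x00, 0x80) + range(0xC2, 0xF5):
        s = chr(lead) + '\x80' * 3    # pad with continuation bytes
        assert explicit(s, 0) == simplified(s, 0)

The two only diverge on input that is not valid UTF-8 (lead bytes
0x80-0xC1), which is exactly the error checking being skipped.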
From phyo.arkarlwin at gmail.com  Sun Mar  5 12:08:19 2017
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Sun, 05 Mar 2017 17:08:19 +0000
Subject: [pypy-dev] revisit web assembly?
In-Reply-To: 
References: 
Message-ID: 

Very interesting! I can't wait to write JavaScript in Python.

On Sun, Mar 5, 2017 at 6:15 AM Maciej Fijalkowski wrote:

> I just found that:
> https://sourceware.org/ml/binutils/2017-03/msg00044.html
>
> It might be cool to see, maybe we can relatively easily compile to the
> web platform. Even an interpreter-only version could be quite
> interesting.
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yury at shurup.com  Sun Mar  5 12:39:23 2017
From: yury at shurup.com (Yury V. Zaytsev)
Date: Sun, 5 Mar 2017 18:39:23 +0100 (CET)
Subject: [pypy-dev] revisit web assembly?
In-Reply-To: 
References: 
Message-ID: 

On Sun, 5 Mar 2017, Phyo Arkar wrote:

> Very interesting! I can't wait to write JavaScript in Python.

But what for, if you can write it in Haskell? ;-) [*]

[*]: http://elm-lang.org

Seriously, though, the WebAssembly thing looks quite exciting!

> On Sun, Mar 5, 2017 at 6:15 AM Maciej Fijalkowski wrote:
>       I just found that:
>       https://sourceware.org/ml/binutils/2017-03/msg00044.html
>
>       It might be cool to see, maybe we can relatively easily compile
>       to the web platform. Even an interpreter-only version could be
>       quite interesting.
>
>       Cheers,
>       fijal
>       _______________________________________________
>       pypy-dev mailing list
>       pypy-dev at python.org
>       https://mail.python.org/mailman/listinfo/pypy-dev

-- 
Sincerely yours,
Yury V. Zaytsev

From fijall at gmail.com  Sun Mar  5 14:24:24 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 5 Mar 2017 21:24:24 +0200
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

This is checking for spaces in unicode (so it's known to be valid utf8)

On Sun, Mar 5, 2017 at 11:14 AM, Armin Rigo wrote:
> Hi Maciej,
>
> On 4 March 2017 at 19:01, Maciej Fijalkowski wrote:
>>     def next_codepoint_pos(code, pos):
>>         chr1 = ord(code[pos])
>>         if chr1 < 0x80:
>>             return pos + 1
>>         if 0xC2 <= chr1 <= 0xDF:
>>             return pos + 2
>>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>>             return pos + 3
>>         return pos + 4
>
> If you don't want error checking, then you can simplify a bit the
> range checks here. Maybe it gives some more gains, but who knows:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if chr1 <= 0xDF:
>             return pos + 2
>         if chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
>
> A bientôt,
>
> Armin.

From armin.rigo at gmail.com  Mon Mar  6 02:13:19 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Mon, 6 Mar 2017 08:13:19 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi Maciej,

On 5 March 2017 at 20:24, Maciej Fijalkowski wrote:
> This is checking for spaces in unicode (so it's known to be valid utf8)

Ok, then you might have missed another property of UTF-8: when you
check for "being a substring" in UTF-8, you don't need to do any
decoding. Instead you only need to check "being a substring" with the
two encoded UTF-8 strings. This always works as expected, i.e. you
can never get a positive answer by chance. So for example:

    x in y    can be implemented as    x._utf8 in y._utf8

and in this case, you can find spaces in a unicode string just by
searching for the 10 byte patterns that are spaces-encoded-as-UTF-8
(11 if you also count '\n\r' as one such pattern).

That's also how the 're' module could be rewritten to directly handle
UTF-8 strings, instead of decoding it first.

A bientôt,

Armin.
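The no-false-positives property can be checked directly in CPython 2 (a
small editor's demonstration, not from the thread):

    x = u'\xe9'.encode('utf-8')          # '\xc3\xa9'
    y = u'caf\xe9 ole'.encode('utf-8')   # 'caf\xc3\xa9 ole'
    assert (x in y) == (u'\xe9' in u'caf\xe9 ole')

    # U+00C3 encodes as '\xc3\x83'; it shares its first byte with y's
    # U+00E9 but cannot match by accident, since the continuation byte
    # differs -- both the byte-level and the unicode-level test say False
    z = u'\u00c3'.encode('utf-8')
    assert (z in y) == (u'\u00c3' in u'caf\xe9 ole')

This works because lead bytes and continuation bytes occupy disjoint
ranges, so an encoded pattern can only line up on codepoint boundaries.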
From fijall at gmail.com  Mon Mar  6 02:15:44 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 6 Mar 2017 11:15:44 +0400
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Yes sure, I'm aware of that :-) The problem only shows up with "start"
and "end" parameters being used.

On Mon, Mar 6, 2017 at 11:13 AM, Armin Rigo wrote:
> Hi Maciej,
>
> On 5 March 2017 at 20:24, Maciej Fijalkowski wrote:
>> This is checking for spaces in unicode (so it's known to be valid utf8)
>
> Ok, then you might have missed another property of UTF-8: when you
> check for "being a substring" in UTF-8, you don't need to do any
> decoding. Instead you only need to check "being a substring" with the
> two encoded UTF-8 strings. This always works as expected, i.e. you
> can never get a positive answer by chance. So for example:
>
>     x in y    can be implemented as    x._utf8 in y._utf8
>
> and in this case, you can find spaces in a unicode string just by
> searching for the 10 byte patterns that are spaces-encoded-as-UTF-8
> (11 if you also count '\n\r' as one such pattern).
>
> That's also how the 're' module could be rewritten to directly handle
> UTF-8 strings, instead of decoding it first.
>
> A bientôt,
>
> Armin.

From terri at toybox.ca  Mon Mar  6 23:04:40 2017
From: terri at toybox.ca (Terri Oda)
Date: Mon, 6 Mar 2017 20:04:40 -0800
Subject: [pypy-dev] [GSoC2017] PyPy as a Sub-org
In-Reply-To: <22711.42922.606041.882426@turnbull.sk.tsukuba.ac.jp>
References: <22711.42922.606041.882426@turnbull.sk.tsukuba.ac.jp>
Message-ID: <8c6faba5-f579-1a78-ac0d-f85c3d353b34@toybox.ca>

Just to confirm: you're up on the page now, and I see you have 4
mentors signed up, so it looks like PyPy is all set to go. You should
have all been issued invites to Google's submission system, so just
make sure to sign up there so you can see applications when they start
coming in!

 Terri

On 2017-03-01 9:03 PM, Stephen J. Turnbull wrote:
> Richard Plangger writes:
>  > Hi,
>  >
>  > as we discussed during this year's PyPy sprint, PyPy again wants to
>  > participate as a Sub-org in this year's Google Summer of Code. We
>  > are already on the wiki ideas page, but I don't think we have
>  > formally applied yet, so I am doing that by writing this email.
>
> Your section is visible on the page at http://python-gsoc.org/.  That
> was your formal application as a sub-org.  Is there any other problem?
>
> Details:
>
> To actually participate in GSoC under the PSF umbrella, you need to
> (1) register at least two mentors X desired slots at
>     https://goo.gl/forms/UJb0rHOVQjLna2o53
>     Some overlap will be allowed, but the mentors who are working with
>     more than one student should be a small minority.
> (2) subscribe them to the mailing list
>     https://mail.python.org/mailman/listinfo/gsoc-mentors
> (3) designate one sub-org admin and one alternate in case the main
>     admin is out of contact for more than a day or two
> (4) state that you intend to comply with the Python Code of Conduct
>
> See also http://python-gsoc.org/#mentors for more information and
> further requirements (that can be satisfied as you go) about mentors
> and sub-orgs.
>
> Steve
>
>

From planrichi at gmail.com  Wed Mar  8 12:17:24 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 8 Mar 2017 18:17:24 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Hi,

as we discussed on the sprint I have now experimented with an SSE/AVX
implementation of 'len(utf8 string)' (this includes a check that it is
valid utf8). Since this is related to this mailing list thread I'll
just add it here!

I ran some small measurements on it. Here is some explanation of the
names:

pypy-seq-.*: sequential implementation in C, nothing fancy, just a baseline
pypy-vec-sse4-.*: implementation using sse4 (128 bit registers)
pypy-vec-avx2-.*: implementation using avx2 (256 bit registers)
libunistring-.*: benchmarking the function u8_check in that GNU library,
NO length is calculated
mystrlenutf8-.*: someone's length calculation (no validity check), only
using 64-bit words instead of per-byte iteration (see here [1])
.*-news-de: html of a German website (has quite a lot of 2-byte code
points), ~ 1 MB
.*-news-cn: worldjournarl.com -> Mandarin (html website with lots of
4-byte code points), ~ 700 KB
.*-tipitaka-thai: xml page of some religious text with lots of 3-byte
code points (~4.5 MB; the original 300 KB file copied many times)

Why is u8u16 missing? Well, as far as I can tell there is no function
in u8u16 that returns the length of a utf8 string and checks if it is
valid at the same time, without rewriting it. u8u16 is really just for
transforming utf8 to utf16.

The benchmark runs read the content from a file (e.g. .*-news-de, a
German html news website) and in a loop invoke the
utf-8-get-length-and-check function written in C 10 times, summing up
the time for each run (using clock_t clock(void) in C, man 3 clock).

.....................
pypy-seq-news-de: Median +- std dev: 76.0 us +- 1.4 us
.....................
pypy-sse4-vec-news-de: Median +- std dev: 5.16 us +- 0.14 us
.....................
pypy-avx2-vec-news-de: Median +- std dev: 384 ns +- 11 ns
.....................
libunistring-news-de: Median +- std dev: 33.0 us +- 0.4 us
.....................
mystrlenutf8-news-de: Median +- std dev: 9.25 us +- 0.22 us
.....................
pypy-seq-news-cn: Median +- std dev: 59.8 us +- 1.2 us
.....................
pypy-sse4-vec-news-cn: Median +- std dev: 7.70 us +- 0.12 us
.....................
pypy-avx2-vec-news-cn: Median +- std dev: 23.3 ns +- 0.4 ns
.....................
libunistring-news-cn: Median +- std dev: 30.5 us +- 0.4 us
.....................
mystrlenutf8-news-cn: Median +- std dev: 6.54 us +- 0.20 us
.....................
pypy-seq-tipitaka-thai: Median +- std dev: 939 us +- 39 us
.....................
pypy-sse4-vec-tipitaka-thai: Median +- std dev: 425 us +- 7 us
.....................
pypy-avx2-vec-tipitaka-thai: Median +- std dev: 19.9 ns +- 0.3 ns
.....................
libunistring-tipitaka-thai: Median +- std dev: 615 us +- 28 us
.....................
WARNING: the benchmark seems unstable, the standard deviation is high
(stdev/median: 17%)
Try to rerun the benchmark with more runs, samples and/or loops

mystrlenutf8-tipitaka-thai: Median +- std dev: 45.1 us +- 7.9 us

What do you think?

I think it would even be a good idea to take a look at AVX512 (which
gives you a crazy amount of 512 bits (or 64 bytes) in your vector
register).

The AVX implementation is a bit fishy (compare
pypy-avx2-vec-tipitaka-thai and pypy-avx2-vec-news-cn). I need to
recheck that; it would not make sense to process 10x 4.5 MB in 20 ns
and 10x 700 KB in 23 ns.

As soon as I have ironed out the issue I'll start to think about
indexing...

Cheers,
Richard

[1] http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html

On 03/04/2017 07:01 PM, Maciej Fijalkowski wrote:
> Hello everyone
>
> I've been experimenting a bit with faster utf8 operations (and
> conversion that does not do much). I'm writing down the results so
> they don't get forgotten, as well as trying to put them in rpython
> comments.
>
> As far as non-SSE algorithms go, for things like splitlines, split
> etc. it is important to walk the utf8 string quickly and check
> properties of characters.
>
> So far the current finding has been that a lookup table, for example:
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>
> is significantly slower than the following code (both don't do error
> checking):
>
>     def next_codepoint_pos(code, pos):
>         chr1 = ord(code[pos])
>         if chr1 < 0x80:
>             return pos + 1
>         if 0xC2 <= chr1 <= 0xDF:
>             return pos + 2
>         if chr1 >= 0xE0 and chr1 <= 0xEF:
>             return pos + 3
>         return pos + 4
>
> The exact difference depends on how many multi-byte characters there
> are and how big the strings are. It's up to 40%, but as a general
> rule, the more ascii characters there are, the less of an impact it
> has, as well as the larger the strings are, the more impact
> memory/L2/L3 cache has.
>
> PS. SSE will be faster still, but we might not want SSE for just
> splitlines
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
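A quick back-of-the-envelope check of why those AVX2 numbers look fishy
(editor's arithmetic, using only the figures quoted in the mail above):

    # 10 passes over the ~4.5 MB thai file in ~20 ns would mean:
    bytes_processed = 10 * 4.5e6
    seconds = 20e-9
    print bytes_processed / seconds / 1e12    # ~2250 TB/s

That is orders of magnitude beyond any memory bandwidth, so the timer is
almost certainly not measuring the loop it is supposed to measure.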
From planrichi at gmail.com  Wed Mar  8 13:09:57 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 8 Mar 2017 19:09:57 +0100
Subject: [pypy-dev] Speeds of various utf8 operations
In-Reply-To: 
References: 
Message-ID: 

Yes ;) At some point. For now I'm still experimenting with the
operations we think we need.

Cheers,
Richard

On Mar 8, 2017 6:50 PM, "David Edelsohn" wrote:

> And POWER VSX and Z VX? ;-)
>
> - David
>
> On Wed, Mar 8, 2017 at 12:17 PM, Richard Plangger wrote:
> > Hi,
> >
> > as we discussed on the sprint I have now experimented with an SSE/AVX
> > implementation of 'len(utf8 string)' (this includes a check that it is
> > valid utf8). Since this is related to this mailing list thread I'll
> > just add it here!
> >
> > I ran some small measurements on it. Here is some explanation of the
> > names:
> >
> > pypy-seq-.*: sequential implementation in C, nothing fancy, just a baseline
> > pypy-vec-sse4-.*: implementation using sse4 (128 bit registers)
> > pypy-vec-avx2-.*: implementation using avx2 (256 bit registers)
> > libunistring-.*: benchmarking the function u8_check in that GNU library,
> > NO length is calculated
> > mystrlenutf8-.*: someone's length calculation (no validity check), only
> > using 64-bit words instead of per-byte iteration (see here [1])
> >
> > .*-news-de: html of a German website (has quite a lot of 2-byte code
> > points), ~ 1 MB
> > .*-news-cn: worldjournarl.com -> Mandarin (html website with lots of
> > 4-byte code points), ~ 700 KB
> > .*-tipitaka-thai: xml page of some religious text with lots of 3-byte
> > code points (~4.5 MB; the original 300 KB file copied many times)
> >
> > Why is u8u16 missing? Well, as far as I can tell there is no function
> > in u8u16 that returns the length of a utf8 string and checks if it is
> > valid at the same time, without rewriting it. u8u16 is really just for
> > transforming utf8 to utf16.
> >
> > The benchmark runs read the content from a file (e.g. .*-news-de, a
> > German html news website) and in a loop invoke the
> > utf-8-get-length-and-check function written in C 10 times, summing up
> > the time for each run (using clock_t clock(void) in C, man 3 clock).
> >
> > .....................
> > pypy-seq-news-de: Median +- std dev: 76.0 us +- 1.4 us
> > .....................
> > pypy-sse4-vec-news-de: Median +- std dev: 5.16 us +- 0.14 us
> > .....................
> > pypy-avx2-vec-news-de: Median +- std dev: 384 ns +- 11 ns
> > .....................
> > libunistring-news-de: Median +- std dev: 33.0 us +- 0.4 us
> > .....................
> > mystrlenutf8-news-de: Median +- std dev: 9.25 us +- 0.22 us
> > .....................
> > pypy-seq-news-cn: Median +- std dev: 59.8 us +- 1.2 us
> > .....................
> > pypy-sse4-vec-news-cn: Median +- std dev: 7.70 us +- 0.12 us
> > .....................
> > pypy-avx2-vec-news-cn: Median +- std dev: 23.3 ns +- 0.4 ns
> > .....................
> > libunistring-news-cn: Median +- std dev: 30.5 us +- 0.4 us
> > .....................
> > mystrlenutf8-news-cn: Median +- std dev: 6.54 us +- 0.20 us
> > .....................
> > pypy-seq-tipitaka-thai: Median +- std dev: 939 us +- 39 us
> > .....................
> > pypy-sse4-vec-tipitaka-thai: Median +- std dev: 425 us +- 7 us
> > .....................
> > pypy-avx2-vec-tipitaka-thai: Median +- std dev: 19.9 ns +- 0.3 ns
> > .....................
> > libunistring-tipitaka-thai: Median +- std dev: 615 us +- 28 us
> > .....................
> > WARNING: the benchmark seems unstable, the standard deviation is high
> > (stdev/median: 17%)
> > Try to rerun the benchmark with more runs, samples and/or loops
> >
> > mystrlenutf8-tipitaka-thai: Median +- std dev: 45.1 us +- 7.9 us
> >
> > What do you think?
> >
> > I think it would even be a good idea to take a look at AVX512 (which
> > gives you a crazy amount of 512 bits (or 64 bytes) in your vector
> > register).
> >
> > The AVX implementation is a bit fishy (compare
> > pypy-avx2-vec-tipitaka-thai and pypy-avx2-vec-news-cn). I need to
> > recheck that; it would not make sense to process 10x 4.5 MB in 20 ns
> > and 10x 700 KB in 23 ns.
> >
> > As soon as I have ironed out the issue I'll start to think about
> > indexing...
> >
> > Cheers,
> > Richard
> >
> > [1] http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html
> >
> > On 03/04/2017 07:01 PM, Maciej Fijalkowski wrote:
> >> Hello everyone
> >>
> >> I've been experimenting a bit with faster utf8 operations (and
> >> conversion that does not do much). I'm writing down the results so
> >> they don't get forgotten, as well as trying to put them in rpython
> >> comments.
> >>
> >> As far as non-SSE algorithms go, for things like splitlines, split
> >> etc. it is important to walk the utf8 string quickly and check
> >> properties of characters.
> >>
> >> So far the current finding has been that a lookup table, for example:
> >>
> >>     def next_codepoint_pos(code, pos):
> >>         chr1 = ord(code[pos])
> >>         if chr1 < 0x80:
> >>             return pos + 1
> >>         return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
> >>
> >> is significantly slower than the following code (both don't do error
> >> checking):
> >>
> >>     def next_codepoint_pos(code, pos):
> >>         chr1 = ord(code[pos])
> >>         if chr1 < 0x80:
> >>             return pos + 1
> >>         if 0xC2 <= chr1 <= 0xDF:
> >>             return pos + 2
> >>         if chr1 >= 0xE0 and chr1 <= 0xEF:
> >>             return pos + 3
> >>         return pos + 4
> >>
> >> The exact difference depends on how many multi-byte characters there
> >> are and how big the strings are. It's up to 40%, but as a general
> >> rule, the more ascii characters there are, the less of an impact it
> >> has, as well as the larger the strings are, the more impact
> >> memory/L2/L3 cache has.
> >>
> >> PS. SSE will be faster still, but we might not want SSE for just
> >> splitlines
> >>
> >> Cheers,
> >> fijal
> >> _______________________________________________
> >> pypy-dev mailing list
> >> pypy-dev at python.org
> >> https://mail.python.org/mailman/listinfo/pypy-dev
> >>
> > _______________________________________________
> > pypy-dev mailing list
> > pypy-dev at python.org
> > https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
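For reference, here is a plain-Python sketch of the scalar computation
the kernels in this thread implement: count the codepoints and validate
in a single pass (an editor's illustration; overlong and surrogate
checks are omitted for brevity):

    def utf8_check_and_length(s):
        count = 0
        i = 0
        n = len(s)
        while i < n:
            c = ord(s[i])
            if c < 0x80:
                size = 1
            elif 0xC2 <= c <= 0xDF:
                size = 2
            elif 0xE0 <= c <= 0xEF:
                size = 3
            elif 0xF0 <= c <= 0xF4:
                size = 4
            else:
                return -1                  # invalid lead byte
            if i + size > n:
                return -1                  # truncated sequence
            for j in range(i + 1, i + size):
                if not 0x80 <= ord(s[j]) <= 0xBF:
                    return -1              # bad continuation byte
            i += size
            count += 1
        return count

    assert utf8_check_and_length('a\xc3\xa9') == 2
    assert utf8_check_and_length('\xc3') == -1

The SSE/AVX versions above presumably vectorize this classification of
lead and continuation bytes.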
From shubharamani at yahoo.com  Fri Mar 10 10:15:45 2017
From: shubharamani at yahoo.com (Shubha Ramani)
Date: Fri, 10 Mar 2017 15:15:45 +0000 (UTC)
Subject: [pypy-dev] looptoken.number of bridge
References: <2028230117.2808877.1489158945920.ref@mail.yahoo.com>
Message-ID: <2028230117.2808877.1489158945920@mail.yahoo.com>

1) Is it a true statement to say that the looptoken.number stays the
same between an original loop and a bridge?

2) Is it a true statement to say that "loopname", as passed into
assemble_loop, will apply to a bridge if indeed statement 1) above is
true? That is, if a loop and a bridge share the same original
looptoken.number, do they also share the same loopname (as passed in
from get_printable_location)?

Thanks,

Shubha
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From shubharamani at yahoo.com  Fri Mar 10 10:22:33 2017
From: shubharamani at yahoo.com (Shubha Ramani)
Date: Fri, 10 Mar 2017 15:22:33 +0000 (UTC)
Subject: [pypy-dev] jitted region start address and size
References: <1830734433.2882206.1489159353300.ref@mail.yahoo.com>
Message-ID: <1830734433.2882206.1489159353300@mail.yahoo.com>

1) In assemble_loop, is this correct, and if not, what is the correct
answer?
starting address of jitted region: looppos + rawstart
size of jitted region: size_excluding_failure_stuff - looppos

2) In assemble_bridge, is this correct, and if not, what is the correct
answer?
starting address of jitted region: startpos + rawstart
size of jitted region: codeendpos - startpos

Thanks,

Shubha
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From John.Zhang at anu.edu.au  Mon Mar 13 20:17:11 2017
From: John.Zhang at anu.edu.au (John Zhang)
Date: Tue, 14 Mar 2017 00:17:11 +0000
Subject: [pypy-dev] What is RuntimeTypeInfo?
Message-ID: 

Hi all,

Can anyone do me the favour of explaining the story of RuntimeTypeInfo
in RPython? I couldn't quite understand the description in the online
documentation
(http://rpython.readthedocs.io/en/latest/rtyper.html#opaque-types).
From looking at the source code, and inspecting the generated C source,
RTTI doesn't seem to be used at all at runtime. The generated C source
seems to just return an uninitialised function pointer on the stack
when compiling the following code:

    # the snippet assumes these imports
    from rpython.rtyper.lltypesystem import lltype, rffi
    from rpython.rtyper.rclass import OBJECTPTR
    from rpython.translator.interactive import Translation

    class A:
        pass

    class B(A):
        pass

    def f(a):
        obj = rffi.cast(OBJECTPTR, a)
        return lltype.runtime_type_info(obj)

    t = Translation(f, [A], backend='c')
    t.backendopt(mallocs=True)
    t.view()
    lib = t.compile()

We were looking at RTTI as a possible solution to the loss of type
information at JIT encoding. My collaborators are trying to develop a
JIT back-end targeting a micro virtual machine (Mu). It seems that by
default the JIT transformer throws away the actual type information
(GcStruct etc., which is not representable under RPython, I know) and
only keeps the size. However, in Mu, memory allocation requires
specific type information.
Thus, among other ways, we are trying to see how much we can recover
this object layout/type information. RTTI seems promising based on the
description in the documentation, but I can't picture what it looks
like at run time.

Can anyone provide some insight on this?

Thanks,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From william.leslie.ttg at gmail.com  Tue Mar 14 23:40:07 2017
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Wed, 15 Mar 2017 14:40:07 +1100
Subject: [pypy-dev] What is RuntimeTypeInfo?
In-Reply-To: 
References: 
Message-ID: 

On 14 March 2017 at 11:17, John Zhang wrote:
> Hi all,
> Can anyone do me the favour of explaining the story of RuntimeTypeInfo
> in RPython?

Hi John!

The RTTI are a hook that the backend can implement; there is a fair
bit of flexibility in what values they can take. The hooks in the C
backend live here:

https://bitbucket.org/pypy/pypy/src/699382943bd73bf19565e996d2042d54e7569e31/rpython/translator/c/gc.py?at=default&fileviewer=file-view-default#gc.py-145

This class is one example of a node for an RTTI value. In this one,
the RTTI value is a function that can statically deallocate structs of
this type. There are more in this file; for example, the RTTI for a
struct in a framework GC is just an identifier iiuc. I think you want
to translate your opaque types yourself, possibly

> I couldn't quite understand the description in the online
> documentation
> (http://rpython.readthedocs.io/en/latest/rtyper.html#opaque-types).
> From looking at the source code, and inspecting the generated C
> source, RTTI doesn't seem to be used at all at runtime. The generated
> C source seems to just return an uninitialised function pointer on
> the stack when compiling the following code:
>
>     from rpython.rtyper.rclass import OBJECTPTR
>
>     class A:
>         pass
>
>     class B(A):
>         pass
>
>     def f(a):
>         obj = rffi.cast(OBJECTPTR, a)
>         return lltype.runtime_type_info(obj)
>
>     t = Translation(f, [A], backend='c')
>     t.backendopt(mallocs=True)
>     t.view()
>     lib = t.compile()

This example (which uses the refcount GC) grabs a function from the
type OBJECTPTR that can de-allocate an OBJECTPTR (that is, it can
decrement the refcount of the object `a`). I haven't had a look at why
it would be uninitialised.

> We were looking at RTTI as a possible solution to the loss of type
> information at JIT encoding. My collaborators are trying to develop a
> JIT back-end targeting a micro virtual machine (Mu). It seems that by
> default the JIT transformer throws away the actual type information
> (GcStruct etc., which is not representable under RPython, I know) and
> only keeps the size. However, in Mu, memory allocation requires
> specific type information. Thus, among other ways, we are trying to
> see how much we can recover this object layout/type information. RTTI
> seems promising based on the description in the documentation, but I
> can't picture what it looks like at run time.
> Can anyone provide some insight on this?

The low-level allocation operations specify size as an operand - it
might be better for you to translate the various new_ operations into
a form you can make use of long before jitcode generation.
You'll need to extend the codewriter to allow for those operations, too. -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From matti.picus at gmail.com Wed Mar 15 01:34:13 2017 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 15 Mar 2017 07:34:13 +0200 Subject: [pypy-dev] Freeze of pypy2 and pypy3 for upcoming release Message-ID: An HTML attachment was scrubbed... URL: From armin.rigo at gmail.com Wed Mar 15 03:13:06 2017 From: armin.rigo at gmail.com (Armin Rigo) Date: Wed, 15 Mar 2017 08:13:06 +0100 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: References: Message-ID: Hi, On 15 March 2017 at 04:40, William ML Leslie wrote: > The RTTI are a hook that the backend can implement, there is a fair > bit of flexibility in what values they can take. That's right, but also, the RTTI is actually something from the early days of PyPy and not used any more nowadays. It is used with our test-only refcounting GC but not by "real code". I wouldn't start with that. >> It seems that by default the JIT transformer throws away the actual type information >> (GcStruct etc., which is not representable under RPython, I know) and only keeps >> the size. However, in Mu, memory allocation requires specific type information. That's not really true. We need to keep at least the typeid (a number, also called "tid") in addition to the size. This is stored in the SizeDescr, for GcStructs, which is a small piece of type information built at translation time by cpu.sizeof(). Look for get_size_descr() in jit/backend/llsupport/, and for init_size_descr() in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach whatever info is needed there. As William said, depending on what exactly you need, you need to also tweak jit/codewriter/, which is the code that ultimately invokes the translation-time setting up of SizeDescr. Also, the same applies to the other Descr classes in llsupport/descr.py, at least ArrayDescr. A bient?t, Armin. From cfbolz at gmx.de Wed Mar 15 05:35:09 2017 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Wed, 15 Mar 2017 10:35:09 +0100 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: References: Message-ID: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> Hi John, As an aside, if you strictly need more information in your descrs, nobody stops you from using less of llsupport and instead writing or at least overriding your own descr infrastructure. I would imagine that a number of things from llsupport are not a perfect match for the mu backend. Cheers, Carl Friedrich On March 15, 2017 8:13:06 AM GMT+01:00, Armin Rigo wrote: >Hi, > >On 15 March 2017 at 04:40, William ML Leslie > wrote: >> The RTTI are a hook that the backend can implement, there is a fair >> bit of flexibility in what values they can take. > >That's right, but also, the RTTI is actually something from the early >days of PyPy and not used any more nowadays. It is used with our >test-only refcounting GC but not by "real code". I wouldn't start >with that. > >>> It seems that by default the JIT transformer throws away the actual >type information >>> (GcStruct etc., which is not representable under RPython, I know) >and only keeps >>> the size. However, in Mu, memory allocation requires specific type >information. 
> >That's not really true. We need to keep at least the typeid (a >number, also called "tid") in addition to the size. This is stored in >the SizeDescr, for GcStructs, which is a small piece of type >information built at translation time by cpu.sizeof(). Look for >get_size_descr() in jit/backend/llsupport/, and for init_size_descr() >in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach >whatever info is needed there. > >As William said, depending on what exactly you need, you need to also >tweak jit/codewriter/, which is the code that ultimately invokes the >translation-time setting up of SizeDescr. > >Also, the same applies to the other Descr classes in >llsupport/descr.py, at least ArrayDescr. > > >A bient?t, > >Armin. >_______________________________________________ >pypy-dev mailing list >pypy-dev at python.org >https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From cfbolz at gmx.de Wed Mar 15 06:22:40 2017 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Wed, 15 Mar 2017 11:22:40 +0100 Subject: [pypy-dev] Programming Language Implementation Summer School (PLISS) Message-ID: <89526bb0-177b-2b70-f975-89ecf624a2a9@gmx.de> ============================================================================ Programming Language Implementation Summer School (PLISS) May 20-27, 2017, Bertinoro Italy https://pliss2017.github.io/ ============================================================================ Programming languages are our interface to the myriad of computer systems we interact with on a daily basis. They allow us to craft complex sequences of operations at increasing high levels of abstraction. How are these languages designed? How are they implemented? How do we evaluate them? The First Programming Language Implementation Summer School (PLISS) will be held in Bertinoro, Italy from May 20 to 27, 2017. The Summer School's goal is to prepare early graduate students and advanced undergraduates for research in the field. This will be done through a combination of lectures on language implementation techniques and short talks exploring the state of the art in programming language research and practice. Lectures cover current research and future trends in programming language design and implementation, including: * Writing Just-in-time Compilers with LLVM * Performance Evaluation and Benchmarking * Designing a Commercial Actor Language * High-Performance Fully Concurrent Garbage Collection * Compiling Dynamic Languages * Language-support for Distributed Datastores The instructors are accomplished researchers and practitioners with extensive experience designing and engineering successful languages and tools. We gratefully acknowledge the support of our sponsors in allowing us to make travel grants and fellowships available to support students interested in attending PLISS. More details at https://pliss2017.github.io/ From John.Zhang at anu.edu.au Wed Mar 15 20:14:31 2017 From: John.Zhang at anu.edu.au (John Zhang) Date: Thu, 16 Mar 2017 00:14:31 +0000 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> References: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> Message-ID: Thanks Carl, Armin and William! We will look into it further. 
Cheers, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au On 15 Mar 2017, at 20:35, Carl Friedrich Bolz > wrote: Hi John, As an aside, if you strictly need more information in your descrs, nobody stops you from using less of llsupport and instead writing or at least overriding your own descr infrastructure. I would imagine that a number of things from llsupport are not a perfect match for the mu backend. Cheers, Carl Friedrich On March 15, 2017 8:13:06 AM GMT+01:00, Armin Rigo > wrote: Hi, On 15 March 2017 at 04:40, William ML Leslie > wrote: The RTTI are a hook that the backend can implement, there is a fair bit of flexibility in what values they can take. That's right, but also, the RTTI is actually something from the early days of PyPy and not used any more nowadays. It is used with our test-only refcounting GC but not by "real code". I wouldn't start with that. It seems that by default the JIT transformer throws away the actual type information (GcStruct etc., which is not representable under RPython, I know) and only keeps the size. However, in Mu, memory allocation requires specific type information. That's not really true. We need to keep at least the typeid (a number, also called "tid") in addition to the size. This is stored in the SizeDescr, for GcStructs, which is a small piece of type information built at translation time by cpu.sizeof(). Look for get_size_descr() in jit/backend/llsupport/, and for init_size_descr() in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach whatever info is needed there. As William said, depending on what exactly you need, you need to also tweak jit/codewriter/, which is the code that ultimately invokes the translation-time setting up of SizeDescr. Also, the same applies to the other Descr classes in llsupport/descr.py, at least ArrayDescr. A bient?t, Armin. ________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From John.Zhang at anu.edu.au Wed Mar 15 20:47:59 2017 From: John.Zhang at anu.edu.au (John Zhang) Date: Thu, 16 Mar 2017 00:47:59 +0000 Subject: [pypy-dev] What is RuntimeTypeInfo? In-Reply-To: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> References: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> Message-ID: <34397520-113E-4ABE-AED2-5CFAC15AD833@anu.edu.au> Hi Carl, Armin, William, I have thought about modifying the JIT code instruction set, descriptors and runtime rewrite etc. to encode the MuTyped CFG (which is a further type and ops specialisation towards the Mu MicroVM) for Mu back-end. But I presume this will involve a large amount of work? Would this be the case? And thus this wouldn?t be a good idea, right? 
Regards, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au On 15 Mar 2017, at 20:35, Carl Friedrich Bolz > wrote: Hi John, As an aside, if you strictly need more information in your descrs, nobody stops you from using less of llsupport and instead writing or at least overriding your own descr infrastructure. I would imagine that a number of things from llsupport are not a perfect match for the mu backend. Cheers, Carl Friedrich On March 15, 2017 8:13:06 AM GMT+01:00, Armin Rigo > wrote: Hi, On 15 March 2017 at 04:40, William ML Leslie > wrote: The RTTI are a hook that the backend can implement, there is a fair bit of flexibility in what values they can take. That's right, but also, the RTTI is actually something from the early days of PyPy and not used any more nowadays. It is used with our test-only refcounting GC but not by "real code". I wouldn't start with that. It seems that by default the JIT transformer throws away the actual type information (GcStruct etc., which is not representable under RPython, I know) and only keeps the size. However, in Mu, memory allocation requires specific type information. That's not really true. We need to keep at least the typeid (a number, also called "tid") in addition to the size. This is stored in the SizeDescr, for GcStructs, which is a small piece of type information built at translation time by cpu.sizeof(). Look for get_size_descr() in jit/backend/llsupport/, and for init_size_descr() in jit/backend/llsupport/gc.py. You can tweak SizeDescr to attach whatever info is needed there. As William said, depending on what exactly you need, you need to also tweak jit/codewriter/, which is the code that ultimately invokes the translation-time setting up of SizeDescr. Also, the same applies to the other Descr classes in llsupport/descr.py, at least ArrayDescr. A bient?t, Armin. ________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at manueljacob.de Wed Mar 15 21:20:07 2017 From: me at manueljacob.de (Manuel Jacob) Date: Thu, 16 Mar 2017 02:20:07 +0100 Subject: [pypy-dev] Remaining test_importlib failures Message-ID: Hi, I'm currently trying to fix the remaining test_importlib failures (http://buildbot.pypy.org/summary/longrepr?testname=%3Aunmodified&builder=pypy-c-jit-linux-x86-64&build=4451&mod=lib-python%2F3%2Ftest%2Ftest_importlib), of which some are a bit obscure. Most (or all) tests from test_importlib.frozen are failing because PyPy doesn't really have frozen modules (in the sense of CPython, where these are a special kind of modules besides normal Python modules and extension modules). Does it sound reasonable to skip these tests completely? Another class of tests is failing because we didn't implement PEP 489 (Multi-phase extension module initialization) so far. This is mostly cpyext-related, which isn't my area of expertise, but I can look into it if noone else is interested. -Manuel From william.leslie.ttg at gmail.com Wed Mar 15 21:44:57 2017 From: william.leslie.ttg at gmail.com (William ML Leslie) Date: Thu, 16 Mar 2017 12:44:57 +1100 Subject: [pypy-dev] What is RuntimeTypeInfo? 
In-Reply-To: <34397520-113E-4ABE-AED2-5CFAC15AD833@anu.edu.au> References: <5064a0e9-5a21-4e7c-a3ef-d08e279990af@email.android.com> <34397520-113E-4ABE-AED2-5CFAC15AD833@anu.edu.au> Message-ID: On 16 March 2017 at 11:47, John Zhang wrote: > Hi Carl, Armin, William, > I have thought about modifying the JIT code instruction set, descriptors and > runtime rewrite etc. to encode the MuTyped CFG (which is a further type and > ops specialisation towards the Mu MicroVM) for Mu back-end. But I presume > this will involve a large amount of work? Would this be the case? And thus > this wouldn?t be a good idea, right? > The more of the metainterp you can make use of the better. I would start by trying to push your graphs through the codewriter and seeing how/if that fails. LLTyped graphs have gone to the effort to encode the types involved into the identifier of the operation, eg, you get float_abs and llong_rshift. If you are not able to do something like that, you can still pass constants as arguments to the operation, and then ignore them in the backend (or use them to generate a particular operation) and have the codewriter preserve them for analysis later. They will get recorded in the constant table on the jitcodes. For an example of this, note that indirect_call operations maintain a set of possible targets as their last argument; it might be illustrative to search for that name. I guess it is worth asking: What other operations are you finding difficult to type? I get that the lltype specialisation of new_ is one, and that address manipulation with composite offsets is another (though it turns out to always be well-typed in practice). -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From planrichi at gmail.com Thu Mar 16 04:13:59 2017 From: planrichi at gmail.com (Richard Plangger) Date: Thu, 16 Mar 2017 09:13:59 +0100 Subject: [pypy-dev] Remaining test_importlib failures In-Reply-To: References: Message-ID: <802f243c-33b7-98f2-0783-1e9d640d9fe5@gmail.com> Hello, > Most (or all) tests from test_importlib.frozen are failing because PyPy > doesn't really have frozen modules (in the sense of CPython, where these > are a special kind of modules besides normal Python modules and > extension modules). Does it sound reasonable to skip these tests > completely? IMHO: Yes I would think so. It might be worthwhile to revisit at some time in the future if people complain that we do not support that. This should also be documented on pypy.readthedocs.org if we decide not to do it now. > Another class of tests is failing because we didn't implement PEP 489 > (Multi-phase extension module initialization) so far. This is mostly > cpyext-related, which isn't my area of expertise, but I can look into it > if noone else is interested. It does not sound very hard. And since it exposes some new public API people will start to use it... So I'm +1 to implement it. 
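Going back to the frozen-module tests: the skip could be as simple as something along these lines (a sketch only; the test class name is just an example, and platform.python_implementation() is the stdlib way to detect the interpreter):

    import platform
    import unittest

    # Sketch: skip CPython-specific frozen-module tests on PyPy
    # instead of letting them fail.
    @unittest.skipIf(platform.python_implementation() != 'CPython',
                     'frozen modules are a CPython implementation detail')
    class FrozenImporterTests(unittest.TestCase):
        pass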
Cheers, Richard From abcdoyle888 at gmail.com Fri Mar 17 01:48:49 2017 From: abcdoyle888 at gmail.com (Dingyuan Wang) Date: Fri, 17 Mar 2017 13:48:49 +0800 Subject: [pypy-dev] Segfaults when compiling PyPy Message-ID: <6b330322-c830-e673-a1be-270330a0f025@gmail.com> Dear all, Is there anyone also having the problem that CPython2.7 or PyPy2 randomly crashes when compiling PyPy (several latest versions on hg)? I'm using Python 2.7.13 (or PyPy2 latest) on Debian stretch. One kind of problems is https://bugs.python.org/issue29242 Another kind is shown below. (at 90736:e668451adc8d) Program received signal SIGSEGV, Segmentation fault. update_refs () at ../Modules/gcmodule.c:332 332 ../Modules/gcmodule.c: No such file or directory. (gdb) bt #0 __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58 #1 0x00007ffff6f2d40a in __GI_abort () at abort.c:89 #2 0x00007ffff6f69bd0 in __libc_message (do_abort=do_abort at entry=2, fmt=fmt at entry=0x7ffff705ec30 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175 #3 0x00007ffff6f6ff96 in malloc_printerr (action=3, str=0x7ffff705ec88 "munmap_chunk(): invalid pointer", ptr=, ar_ptr=) at malloc.c:5046 #4 0x0000555555630f5f in list_dealloc.lto_priv () at ../Objects/listobject.c:316 #5 0x0000555555688456 in dict_dealloc.lto_priv.61 (mp=0x7fffe2e52398) at ../Objects/dictobject.c:1040 #6 subtype_dealloc.lto_priv () at ../Objects/typeobject.c:1035 #7 0x0000555555678af2 in list_ass_slice.lto_priv () at ../Objects/listobject.c:704 #8 0x000055555569253e in assign_slice.lto_priv () at ../Python/ceval.c:4758 #9 0x000055555565300b in PyEval_EvalFrameEx () at ../Python/ceval.c:1868 #10 0x0000555555654c1f in fast_function (nk=, na=, n=, pp_stack=0x7fffffffcc50, func=) at ../Python/ceval.c:4437 #11 call_function (oparg=, pp_stack=0x7fffffffcc50) at ../Python/ceval.c:4372 #12 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 ---Type to continue, or q to quit--- #13 0x0000555555654c1f in fast_function (nk=, na=, n=, pp_stack=0x7fffffffcda0, func=) at ../Python/ceval.c:4437 #14 call_function (oparg=, pp_stack=0x7fffffffcda0) at ../Python/ceval.c:4372 #15 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #16 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #17 0x0000555555654f19 in fast_function (nk=1, na=, n=, pp_stack=0x7fffffffcfb0, func=) at ../Python/ceval.c:4447 #18 call_function (oparg=, pp_stack=0x7fffffffcfb0) at ../Python/ceval.c:4372 #19 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #20 0x0000555555654c1f in fast_function (nk=, na=, n=, pp_stack=0x7fffffffd100, func=) at ../Python/ceval.c:4437 #21 call_function (oparg=, pp_stack=0x7fffffffd100) at ../Python/ceval.c:4372 #22 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #23 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #24 0x0000555555669ea8 in function_call.lto_priv () at ../Objects/funcobject.c:523 #25 0x000055555563b673 in PyObject_Call () at ../Objects/abstract.c:2547 ---Type to continue, or q to quit--- #26 0x00005555556518a5 in ext_do_call (nk=0, na=3, flags=, pp_stack=0x7fffffffd3b8, func=) at ../Python/ceval.c:4666 #27 PyEval_EvalFrameEx () at ../Python/ceval.c:3028 #28 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #29 0x0000555555655698 in fast_function (nk=1, na=, n=, pp_stack=0x7fffffffd5c0, func=) at ../Python/ceval.c:4447 #30 call_function (oparg=, pp_stack=0x7fffffffd5c0) at ../Python/ceval.c:4372 #31 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #32 0x0000555555654c1f in fast_function (nk=, na=, 
n=, pp_stack=0x7fffffffd710, func=) at ../Python/ceval.c:4437 #33 call_function (oparg=, pp_stack=0x7fffffffd710) at ../Python/ceval.c:4372 #34 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #35 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #36 0x0000555555655698 in fast_function (nk=0, na=, n=, pp_stack=0x7fffffffd920, func=) at ../Python/ceval.c:4447 #37 call_function (oparg=, pp_stack=0x7fffffffd920) at ../Python/ceval.c:4372 ---Type to continue, or q to quit--- #38 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #39 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #40 0x000055555564d2d9 in PyEval_EvalCode (co=, globals=, locals=) at ../Python/ceval.c:669 #41 0x000055555567ce3f in run_mod.lto_priv () at ../Python/pythonrun.c:1376 #42 0x0000555555677d52 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362 #43 0x000055555567789e in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:948 #44 0x0000555555628af1 in Py_Main () at ../Modules/main.c:640 #45 0x00007ffff6f192b1 in __libc_start_main (main=0x555555628420
, argc=4, argv=0x7fffffffdd68, init=, fini=, rtld_fini=, stack_end=0x7fffffffdd58) at ../csu/libc-start.c:291 #46 0x000055555562831a in _start () -- Dingyuan Wang -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From abcdoyle888 at gmail.com Fri Mar 17 01:52:17 2017 From: abcdoyle888 at gmail.com (Dingyuan Wang) Date: Fri, 17 Mar 2017 13:52:17 +0800 Subject: [pypy-dev] Segfaults when compiling PyPy In-Reply-To: <6b330322-c830-e673-a1be-270330a0f025@gmail.com> References: <6b330322-c830-e673-a1be-270330a0f025@gmail.com> Message-ID: > Dear all, > > Is there anyone also having the problem that CPython2.7 or PyPy2 > randomly crashes when compiling PyPy (several latest versions on hg)? > I'm using Python 2.7.13 (or PyPy2 latest) on Debian stretch. > > One kind of problems is https://bugs.python.org/issue29242 > > Another kind is shown below. (at 90736:e668451adc8d) > > Program received signal SIGSEGV, Segmentation fault. > update_refs () at ../Modules/gcmodule.c:332 > 332 ../Modules/gcmodule.c: No such file or directory. copy&paste error, the above should be: *** Error in `/usr/bin/python': munmap_chunk():invalid pointer: 0x00007fffe2a1ba50 *** ... Program received signal SIGABRT, Aborted. __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58 58 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. > (gdb) bt > #0 __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58 > #1 0x00007ffff6f2d40a in __GI_abort () at abort.c:89 > #2 0x00007ffff6f69bd0 in __libc_message (do_abort=do_abort at entry=2, > fmt=fmt at entry=0x7ffff705ec30 "*** Error in `%s': %s: 0x%s ***\n") > at ../sysdeps/posix/libc_fatal.c:175 > #3 0x00007ffff6f6ff96 in malloc_printerr (action=3, > str=0x7ffff705ec88 "munmap_chunk(): invalid pointer", ptr= out>, > ar_ptr=) at malloc.c:5046 > #4 0x0000555555630f5f in list_dealloc.lto_priv () > at ../Objects/listobject.c:316 > #5 0x0000555555688456 in dict_dealloc.lto_priv.61 (mp=0x7fffe2e52398) > at ../Objects/dictobject.c:1040 > #6 subtype_dealloc.lto_priv () at ../Objects/typeobject.c:1035 > #7 0x0000555555678af2 in list_ass_slice.lto_priv () > at ../Objects/listobject.c:704 > #8 0x000055555569253e in assign_slice.lto_priv () at ../Python/ceval.c:4758 > #9 0x000055555565300b in PyEval_EvalFrameEx () at ../Python/ceval.c:1868 > #10 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffcc50, > func=) at ../Python/ceval.c:4437 > #11 call_function (oparg=, pp_stack=0x7fffffffcc50) > at ../Python/ceval.c:4372 > #12 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > ---Type to continue, or q to quit--- > #13 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffcda0, > func=) at ../Python/ceval.c:4437 > #14 call_function (oparg=, pp_stack=0x7fffffffcda0) > at ../Python/ceval.c:4372 > #15 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #16 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #17 0x0000555555654f19 in fast_function (nk=1, na=, > n=, pp_stack=0x7fffffffcfb0, > func=) at ../Python/ceval.c:4447 > #18 call_function (oparg=, pp_stack=0x7fffffffcfb0) > at ../Python/ceval.c:4372 > #19 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #20 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffd100, > func=) at ../Python/ceval.c:4437 > #21 call_function (oparg=, pp_stack=0x7fffffffd100) > at ../Python/ceval.c:4372 > #22 PyEval_EvalFrameEx 
() at ../Python/ceval.c:2989 > #23 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #24 0x0000555555669ea8 in function_call.lto_priv () > at ../Objects/funcobject.c:523 > #25 0x000055555563b673 in PyObject_Call () at ../Objects/abstract.c:2547 > ---Type to continue, or q to quit--- > #26 0x00005555556518a5 in ext_do_call (nk=0, na=3, flags=, > pp_stack=0x7fffffffd3b8, func=) > at ../Python/ceval.c:4666 > #27 PyEval_EvalFrameEx () at ../Python/ceval.c:3028 > #28 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #29 0x0000555555655698 in fast_function (nk=1, na=, > n=, pp_stack=0x7fffffffd5c0, > func=) at ../Python/ceval.c:4447 > #30 call_function (oparg=, pp_stack=0x7fffffffd5c0) > at ../Python/ceval.c:4372 > #31 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #32 0x0000555555654c1f in fast_function (nk=, > na=, n=, pp_stack=0x7fffffffd710, > func=) at ../Python/ceval.c:4437 > #33 call_function (oparg=, pp_stack=0x7fffffffd710) > at ../Python/ceval.c:4372 > #34 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #35 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #36 0x0000555555655698 in fast_function (nk=0, na=, > n=, pp_stack=0x7fffffffd920, > func=) at ../Python/ceval.c:4447 > #37 call_function (oparg=, pp_stack=0x7fffffffd920) > at ../Python/ceval.c:4372 > ---Type to continue, or q to quit--- > #38 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 > #39 0x000055555564d535 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 > #40 0x000055555564d2d9 in PyEval_EvalCode (co=, > globals=, locals=) at > ../Python/ceval.c:669 > #41 0x000055555567ce3f in run_mod.lto_priv () at ../Python/pythonrun.c:1376 > #42 0x0000555555677d52 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362 > #43 0x000055555567789e in PyRun_SimpleFileExFlags () > at ../Python/pythonrun.c:948 > #44 0x0000555555628af1 in Py_Main () at ../Modules/main.c:640 > #45 0x00007ffff6f192b1 in __libc_start_main (main=0x555555628420
,
> argc=4, argv=0x7fffffffdd68, init=, fini= out>,
> rtld_fini=, stack_end=0x7fffffffdd58)
> at ../csu/libc-start.c:291
> #46 0x000055555562831a in _start ()
>

--
Dingyuan Wang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From matti.picus at gmail.com Mon Mar 20 15:46:41 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Mon, 20 Mar 2017 21:46:41 +0200
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
Message-ID: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>

An HTML attachment was scrubbed...
URL:

From me at manueljacob.de Mon Mar 20 18:37:28 2017
From: me at manueljacob.de (Manuel Jacob)
Date: Mon, 20 Mar 2017 23:37:28 +0100
Subject: [pypy-dev] Remove cpyext.load_module()?
Message-ID: <2be9b2ea1771a0ebb150f4ca44bdb6c3@manueljacob.de>

Hi,

In order to implement PEP 489 (Multi-phase extension module initialization) on py3.5 I need to change the way app-level Python code interacts with extension module loading internals. Maintaining cpyext.load_module() is a bit annoying. Probably the best way would be to delegate straight to imp.load_dynamic(). But then the question is why we need it in the first place.

I'd like to remove cpyext.load_module() on both default and py3.5.

-Manuel

From phyo.arkarlwin at gmail.com Tue Mar 21 00:53:22 2017
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Tue, 21 Mar 2017 04:53:22 +0000
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
In-Reply-To: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
Message-ID: 

Testing. Saw a warning that it is much slower than pypy2. How much slower?

On Tue, Mar 21, 2017, 02:17 Matti Picus wrote:

> The release is almost ready. Please check the website for obvious typos,
>
> http://pypy.org/download.html (note I fixed a typo in the hashsum titles
> in 536afa5d31cf )
>
> and more importantly the downloads for problems on your platform
>
> https://bitbucket.org/pypy/pypy/downloads
>
> Thanks,
> Matti
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From John.Zhang at anu.edu.au Tue Mar 21 01:29:08 2017
From: John.Zhang at anu.edu.au (John Zhang)
Date: Tue, 21 Mar 2017 05:29:08 +0000
Subject: [pypy-dev] lltype.Signed type in ThreadLocalReference
Message-ID: 

Hi all,

I'm wondering why rthread.ThreadLocalReference is initialised to have lltype.Signed type (rthread.py:387). If in get() it's retrieved as rclass.OBJECTPTR, why not just set the type of the field to be OBJECTPTR? Is this related to some specific optimisation?

The problem I'm having is that in my back-end I cannot cast an integer to a GC-ed heap object reference (which the OBJECTPTR translates to); it can only be cast to a non-GC-ed memory object reference (a different memory space not part of the GC managed heap).

Any ideas?
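For concreteness, the operation I cannot express looks conceptually like this (a simplified sketch; lltype.cast_int_to_ptr stands in for whatever the real rthread code path uses, and may not be exactly it):

    from rpython.rtyper import rclass
    from rpython.rtyper.lltypesystem import lltype

    FIELDTYPE = lltype.Signed    # how the thread-local slot is declared

    def example_get(raw):
        # reinterpret the Signed as a GC object pointer: the
        # int-to-GC-ref cast that the Mu backend cannot express
        return lltype.cast_int_to_ptr(rclass.OBJECTPTR, raw)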
Regards, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Tue Mar 21 12:14:22 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Tue, 21 Mar 2017 11:14:22 -0500 Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform In-Reply-To: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com> References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com> Message-ID: Typo: in this subject header, you wrote 2, 7 instead of 2.7. ;) -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Mar 20, 2017 2:47 PM, "Matti Picus" wrote: > The release is almost ready. Please check the website for obvious typos, > > http://pypy.org/download.html (note I fixed a typo in the hashsum titles > in 536afa5d31cf ) > > and more importantly the downloads for problems on your platform > > https://bitbucket.org/pypy/pypy/downloads > > Thanks, > Matti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at manueljacob.de Tue Mar 21 20:03:55 2017 From: me at manueljacob.de (Manuel Jacob) Date: Wed, 22 Mar 2017 01:03:55 +0100 Subject: [pypy-dev] lltype.Signed type in ThreadLocalReference In-Reply-To: References: Message-ID: <2c8b5b1daf6488e9e917d707f72d3630@manueljacob.de> Hi John, I can't say for sure (Armin probably knows better), but from the commit history it looks like this was specifically changed for the JIT. With commit 5291d2692c2375a4105b43498188e749d4204dc8 the type was changed from llmemory.GCREF to lltype.Signed. I'd recommend changing it back to llmemory.GCREF or rclass.OBJECTPTR in your fork and look whether you'll run into problems later. -Manuel On 2017-03-21 06:29, John Zhang wrote: > Hi all, > I?m wondering why rthread.ThreadLocalReference is initialised to have > lltype.Signed type (rthread.py:387). If in get() it?s retrieved as > rclass.OBJECTPTR, why not just set the type of the field to be > OBJECTPTR? is this related to some specific optimisation? > The problem I?m having is that in my back-end I cannot cast an integer > to a GC-ed heap object reference (which the OBJECTPTR translates to), > it can only be cast to a non-GC-ed memory object reference (a > different memory space not part of the GC managed heap). > Any ideas? 
>
> Regards,
> John Zhang
>
> ------------------------------------------------------
> John Zhang
> Research Assistant
> Programming Languages, Design & Implementation Division
> Computer Systems Group
> ANU College of Engineering & Computer Science
> 108 North Rd
> The Australian National University
> Acton ACT 2601
> john.zhang at anu.edu.au
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev

From John.Zhang at anu.edu.au Wed Mar 22 01:29:04 2017
From: John.Zhang at anu.edu.au (John Zhang)
Date: Wed, 22 Mar 2017 05:29:04 +0000
Subject: [pypy-dev] lltype.Signed type in ThreadLocalReference
In-Reply-To: <2c8b5b1daf6488e9e917d707f72d3630@manueljacob.de>
References: <2c8b5b1daf6488e9e917d707f72d3630@manueljacob.de>
Message-ID: <60C49E26-622C-4C88-8D33-DC9904EB4FCD@anu.edu.au>

Hi Manuel,

I attempted to change it to OBJECTPTR in my local repo and it worked. So I will see how it goes, I guess. Thanks for the reply.

Cheers,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au

On 22 Mar 2017, at 11:03, Manuel Jacob wrote:

Hi John,

I can't say for sure (Armin probably knows better), but from the commit history it looks like this was specifically changed for the JIT. With commit 5291d2692c2375a4105b43498188e749d4204dc8 the type was changed from llmemory.GCREF to lltype.Signed. I'd recommend changing it back to llmemory.GCREF or rclass.OBJECTPTR in your fork and look whether you'll run into problems later.

-Manuel

On 2017-03-21 06:29, John Zhang wrote:

Hi all,

I'm wondering why rthread.ThreadLocalReference is initialised to have lltype.Signed type (rthread.py:387). If in get() it's retrieved as rclass.OBJECTPTR, why not just set the type of the field to be OBJECTPTR? Is this related to some specific optimisation?

The problem I'm having is that in my back-end I cannot cast an integer to a GC-ed heap object reference (which the OBJECTPTR translates to); it can only be cast to a non-GC-ed memory object reference (a different memory space not part of the GC managed heap).

Any ideas?

Regards,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au

_______________________________________________
pypy-dev mailing list
pypy-dev at python.org
https://mail.python.org/mailman/listinfo/pypy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dynamicgl at gmail.com Thu Mar 23 03:41:29 2017
From: dynamicgl at gmail.com (Gelin Yan)
Date: Thu, 23 Mar 2017 15:41:29 +0800
Subject: [pypy-dev] numpy 1.12.1 segfault with pypy 2 5.7 on ubuntu 14.04
Message-ID: 

Hi All

I built pypy 2 5.7 from the source on Ubuntu 14.04 and installed numpy 1.12.1 via pip. When running numpy.test('full'), I noticed there was a segfault.

I tested pypy 2 5.6 with numpy 1.12.1 too. I didn't see any segfault.

Regards

gelin yan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matti.picus at gmail.com Thu Mar 23 03:51:17 2017
From: matti.picus at gmail.com (matti picus)
Date: Thu, 23 Mar 2017 07:51:17 +0000
Subject: [pypy-dev] numpy 1.12.1 segfault with pypy 2 5.7 on ubuntu 14.04
In-Reply-To: 
References: 
Message-ID: 

On Thu, 23 Mar 2017 at 9:42 am, Gelin Yan wrote:

> Hi All
>
> I built pypy 2 5.7 from the source on Ubuntu 14.04 and installed
> numpy 1.12.1 via pip. When running numpy.test('full'), I noticed there was
> a segfault.
>
> I tested pypy 2 5.6 with numpy 1.12.1 too. I didn't see any segfault.
>
> Regards
>
> gelin yan
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

What platform are you using? Do you have limited RAM? Please rerun the tests in verbose mode and preferably open an issue on https://bitbucket.org/pypy/pypy/issues
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From armin.rigo at gmail.com Thu Mar 23 06:39:14 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Thu, 23 Mar 2017 11:39:14 +0100
Subject: [pypy-dev] Remove cpyext.load_module()?
In-Reply-To: <2be9b2ea1771a0ebb150f4ca44bdb6c3@manueljacob.de>
References: <2be9b2ea1771a0ebb150f4ca44bdb6c3@manueljacob.de>
Message-ID: 

Hi,

On 20 March 2017 at 23:37, Manuel Jacob wrote:
> I'd like to remove cpyext.load_module() on both default and py3.5.

If you're talking about the app-level function, it doesn't seem to be called at all (at least on default) apart from test_cpyext.AppTestApi.test_load_error.

A bientôt,

Armin.

From armin.rigo at gmail.com Thu Mar 23 06:46:22 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Thu, 23 Mar 2017 11:46:22 +0100
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
In-Reply-To: 
References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
Message-ID: 

Hi Phyo,

On 21 March 2017 at 05:53, Phyo Arkar wrote:
> Testing. Saw a warning that it is much slower than pypy2. How much slower?

This warning should be removed or at least made much less strong. We didn't measure, but it passes many of the same tests for JIT-code quality now. Where did we leave such a warning?

A bientôt,

Armin.

From armin.rigo at gmail.com Thu Mar 23 07:36:26 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Thu, 23 Mar 2017 12:36:26 +0100
Subject: [pypy-dev] Maybe do a 5.7.1 release soon?
Message-ID: 

Hi all,

https://bitbucket.org/pypy/pypy/issues/2508/dictionary-pop-with-default-fails-with

This is a core regression found by Jason and fixed by Alex Gaynor. Thanks to both! Is this justification enough to plan a 5.7.1 release soon?

A bientôt,

Armin.

From matti.picus at gmail.com Thu Mar 23 13:55:27 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 23 Mar 2017 19:55:27 +0200
Subject: [pypy-dev] 2, 7 and 3.5 release is almost here - please help check you favorite platform
In-Reply-To: 
References: <77d8e605-f9fc-3bf4-5491-fece4eb76863@gmail.com>
Message-ID: <965ee337-3068-0066-225b-0d806abed7ed@gmail.com>

On 23/03/17 12:46, Armin Rigo wrote:
> Hi Phyo,
>
> On 21 March 2017 at 05:53, Phyo Arkar wrote:
>> Testing. Saw a warning that it is much slower than pypy2. How much slower?
> This warning should be removed or at least made much less strong. We
> didn't measure, but it passes many of the same tests for JIT-code
> quality now. Where did we leave such a warning?
>
>
> A bientôt,
>
> Armin.

It's in the release notice/blog post.
Sorry, a bit late to refine that now :(
Matti

From omer.drow at gmail.com Sun Mar 26 12:40:44 2017
From: omer.drow at gmail.com (Omer Katz)
Date: Sun, 26 Mar 2017 16:40:44 +0000
Subject: [pypy-dev] Maybe do a 5.7.1 release soon?
In-Reply-To: 
References: 
Message-ID: 

I think this regression will cause a lot of applications to fail because they rely on this behavior. I think that warrants a new release.

On Thu, Mar 23, 2017, 13:38 Armin Rigo wrote:

> Hi all,
>
>
> https://bitbucket.org/pypy/pypy/issues/2508/dictionary-pop-with-default-fails-with
>
> This is a core regression found by Jason and fixed by Alex Gaynor.
> Thanks to both! Is this justification enough to plan a 5.7.1 release
> soon?
>
>
> A bientôt,
>
> Armin.
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matti.picus at gmail.com Sun Mar 26 13:27:07 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Sun, 26 Mar 2017 20:27:07 +0300
Subject: [pypy-dev] Maybe do a 5.7.1 release soon?
In-Reply-To: 
References: 
Message-ID: <93b08754-71d0-acf2-4af4-60d350beb793@gmail.com>

An HTML attachment was scrubbed...
URL:

From sergey.forum at gmail.com Mon Mar 27 01:38:03 2017
From: sergey.forum at gmail.com (Sergey Kurdakov)
Date: Mon, 27 Mar 2017 08:38:03 +0300
Subject: [pypy-dev] error building source pypy2-v5.7.0 cygwin 32/windows 64
Message-ID: 

Hi,

as my Windows pypy would always crash on my project, while the same project works fine on Linux, and my main development environment is Windows, I decided to try a Cygwin pypy. So I got pypy2-v5.7.0-src.tar.bz2 and ran, on the latest Cygwin-32 on Windows 10/64:

python ../../rpython/bin/rpython -Ojit targetpypystandalone

Aside from a few warnings like

/tmp/usession-release-pypy2.7-v5.7.0-0/platcheck_50.c:92:1: warning: implicit declaration of function ‘mremap’ [-Wimplicit-function-declaration]
/tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_0.c:82:68: warning: implicit declaration of function ‘malloc’
[-Wimplicit-function-declaration]

which seem to indicate that some flags are not correctly set, I almost immediately get the following error:

===
[platform:execute] gcc -shared /tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_0.o /tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_1.o /tmp/usession-release-pypy2.7-v5.7.0-0/module_cache/module_2.o -Wl,--export-all-symbols -lrt -o /tmp/usession-release-pypy2.7-v5.7.0-0/shared_cache/externmod.dll
Traceback (most recent call last):
  File "../../rpython/bin/rpython", line 20, in <module>
    main()
  File "/home/Sergey/pypy2/rpython/translator/goal/translate.py", line 217, in main
    targetspec_dic, translateconfig, config, args = parse_options_and_load_target()
  File "/home/Sergey/pypy2/rpython/translator/goal/translate.py", line 155, in parse_options_and_load_target
    targetspec_dic = load_target(targetspec)
  File "/home/Sergey/pypy2/rpython/translator/goal/translate.py", line 97, in load_target
    mod = __import__(specname)
  File "targetpypystandalone.py", line 11, in <module>
    from pypy.tool.option import make_objspace
  File "/home/Sergey/pypy2/pypy/tool/option.py", line 3, in <module>
    from pypy.config.pypyoption import get_pypy_config
  File "/home/Sergey/pypy2/pypy/config/pypyoption.py", line 44, in <module>
    if detect_cpu.autodetect().startswith('x86'):
  File "/home/Sergey/pypy2/rpython/jit/backend/detect_cpu.py", line 106, in autodetect
    return detect_model_from_host_platform()
  File "/home/Sergey/pypy2/rpython/jit/backend/detect_cpu.py", line 85, in detect_model_from_host_platform
    if feature.detect_sse2():
  File "/home/Sergey/pypy2/rpython/jit/backend/x86/detect_feature.py", line 20, in detect_sse2
    code = cpu_id(eax=1)
  File "/home/Sergey/pypy2/rpython/jit/backend/x86/detect_feature.py", line 34, in cpu_id
    return cpu_info(''.join(asm))
  File "/home/Sergey/pypy2/rpython/jit/backend/x86/detect_feature.py", line 16, in cpu_info
    free(data, 4096)
  File "/home/Sergey/pypy2/rpython/rtyper/lltypesystem/rffi.py", line 260, in wrapper
    assert len(args) == nb_args
AssertionError
====

Is there any guide for building the latest pypy on Cygwin, or at least a way to fix those errors?

Regards
Sergey
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From macek at sandbox.cz Sun Mar 26 07:06:32 2017
From: macek at sandbox.cz (=?UTF-8?B?VmzDocSPYSBNYWNlaw==?=)
Date: Sun, 26 Mar 2017 13:06:32 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
Message-ID: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>

Hi, recently I asked my friends to run my sort of a benchmark on their machines (attached). The goal was to test the speed of different data access in python2 and python3, 32bit and 64bit. One of my friends sent me the pypy results -- the script ran fast as hell! Astounding.

At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded your binary https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and confirmed my friend's results, wow.

I develop a large Django project that includes a big amount of background data processing. Reads large files, computes, issues much SQL to postgresql via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs.

I'd welcome a speedup here very much.

So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set up the paths and ran. The computation printouts were the same, very promising -- taking into account how complicated the project is! The SQL looked right too. My respect on compatibility!
Unfortunately, the time needed to complete was double that of CPython 2.7 for exactly the same task.

You mention you might have some tips for why it's slow. Are you interested in getting in touch? Although I rather can't share the code and data with you, I'm offering a real world example of significant load that might help Pypy get better.

Thank you,

--
: Vlada Macek : http://macek.sandbox.cz : +420 608 978 164
: UNIX && Dev || Training : Python, Django : PGP key 97330EBD

(Disclaimer: The opinions expressed herein are not necessarily those
of my employer, not necessarily mine, and probably not necessary.)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: access-timer.py
Type: text/x-python
Size: 4329 bytes
Desc: not available
URL:

From fijall at gmail.com Mon Mar 27 11:21:28 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 27 Mar 2017 17:21:28 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
In-Reply-To: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
Message-ID: 

Hi Vlada

Generally speaking, if we can't have a look, there is incredibly little we can do: "I have a program" can be pretty much anything.

It is well known that django ORM is very slow (both on pypy and on cpython) and makes the JIT take forever to warm up. I have absolutely no idea how long your run is at full CPU, but this is definitely one of your suspects.

On Sun, Mar 26, 2017 at 1:06 PM, Vláďa Macek wrote:
> Hi, recently I asked my friends to run my sort of a benchmark on their
> machines (attached). The goal was to test the speed of different data
> access in python2 and python3, 32bit and 64bit. One of my friends sent me
> the pypy results -- the script ran fast as hell! Astounding.
>
> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded
> your binary
> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and
> confirmed my friend's results, wow.
>
> I develop a large Django project that includes a big amount of background
> data processing. Reads large files, computes, issues much SQL to postgresql
> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs.
>
> I'd welcome a speedup here very much.
>
> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set
> up the paths and ran. The computation printouts were the same, very
> promising -- taking into account how complicated the project is! The SQL
> looked right too. My respect on compatibility!
>
> Unfortunately, the time needed to complete was double that of CPython
> 2.7 for exactly the same task.
>
> You mention you might have some tips for why it's slow. Are you interested
> in getting in touch? Although I rather can't share the code and data with
> you, I'm offering a real world example of significant load that might help
> Pypy get better.
>
> Thank you,
>
> --
> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164
> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD
>
> (Disclaimer: The opinions expressed herein are not necessarily those
> of my employer, not necessarily mine, and probably not necessary.)
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

From armin.rigo at gmail.com Tue Mar 28 07:02:29 2017
From: armin.rigo at gmail.com (Armin Rigo)
Date: Tue, 28 Mar 2017 13:02:29 +0200
Subject: [pypy-dev] error building source pypy2-v5.7.0 cygwin 32/windows 64
In-Reply-To: 
References: 
Message-ID: 

Hi Sergey,

On 27 March 2017 at 07:38, Sergey Kurdakov wrote:
> Is there any guide for building the latest pypy on Cygwin, or at least a
> way to fix those errors?

Cygwin is not officially supported. At some point in the past it used to work, thanks to contributions. If it no longer does, then a few fixes are needed again. It looks unlikely to come from the core PyPy team, but if you or someone else wants to contribute the relevant fixes, you are welcome to :-)

A bientôt,

Armin.

From nanjekyejoannah at gmail.com Tue Mar 28 10:14:31 2017
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Tue, 28 Mar 2017 17:14:31 +0300
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
Message-ID: 

Hello,

I am interested in working on the above project. I need to understand what it is about so that I can make a plan for it. I would love to work on it for GSoC if accepted.

In summary, I want to know the goal and the most important stack involved in working on it.

I am proficient in Python. If the above project idea is not so much in that direction, you can advise a better idea among the ones listed here: http://pypy.readthedocs.io/en/latest/project-ideas.html.

Kind regards,

--
Joannah Nanjekye
+256776468213
F : Nanjekye Captain Joannah
S : joannah.nanjekye
T : @captainjoannah
SO : joannah

"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From planrichi at gmail.com Wed Mar 29 08:01:11 2017
From: planrichi at gmail.com (Richard Plangger)
Date: Wed, 29 Mar 2017 08:01:11 -0400
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
In-Reply-To: 
References: 
Message-ID: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com>

Hello Joannah,

Ronan might know more about this topic. But here is a short explanation:

A solid start is to read the following documentation:

http://rpython.readthedocs.io/en/latest/translation.html

It explains how Python source code is analyzed, transformed and compiled.

As you know, there are no type annotations of the kind a "static" language provides (like C++, C, Java, ...).

    def foo(a, b):
        return a * b

The two parameters a and b can carry any type (even ones that are not able to execute binary add).

One step of the transformation described in the link above "annotates" the types and deduces other properties. If you have a call site:

    foo("a", 2)

it will deduce that foo's parameter a is an instance of "SomeString" and b is an instance of "SomeInteger". So it will assume that every call site must provide SomeString for a and SomeInteger for b (or a subtype, but I'm not aware of the full details).

If at another place foo(1, 2) is called (which is valid Python), rpython must complain, because it cannot be statically compiled.
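A tiny sketch of that failure mode (entry_point here is just an illustrative translation target, not code from the tree):

    def foo(a, b):
        return a * b

    def entry_point(argv):
        foo("a", 2)   # annotates a as SomeString, b as SomeInteger
        foo(1, 2)     # conflicting annotation for a: translation fails
        return 0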
What we would like is a way to explicitly annotate the types (be aware that this is just an example, it is up to you how you solve it): @explicit_types(a=SomeInteger, b=SomeInteger) def foo(a,b): return a * b This would mean that rpython would complain as soon as it sees foo("a", 2). Preferably I think it would be good to have a mini language to describe such function properties, or variable properties. Cheers, Richard On 03/28/2017 10:14 AM, joannah nanjekye wrote: > Hello, > > I am interested in working on the above project. I need to understand > what it is about so that I can make a plan for it. I would love to work > on it for GSoC if accepted. > > In summary..I want to know the goal and the most important stack > involved working on it. > > I am proficient in python. If the above project idea is not so much in > that direction you can advise a better idea among the ones listed here > http://pypy.readthedocs.io/en/latest/project-ideas.html. > > Kind regards, > > -- > //Joannah Nanjekye > +256776468213 > F : Nanjekye Captain Joannah > S : joannah.nanjekye > T : @captainjoannah > SO : joannah > > /"You think you know when you learn, are more sure when you can write, > even more when you can teach, but certain when you can program." > Alan J. Perlis/ > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > From rymg19 at gmail.com Wed Mar 29 10:28:02 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 29 Mar 2017 09:28:02 -0500 Subject: [pypy-dev] Details on project idea: Explicit typing in RPython In-Reply-To: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com> References: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com> Message-ID: RPython already has this: https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/signature.py -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Mar 29, 2017 7:01 AM, "Richard Plangger" wrote: > Hello Joannah, > > Ronan might know more about this topic. But here is a short explanation: > > A solid start is to read the following documentation: > > http://rpython.readthedocs.io/en/latest/translation.html > > It explains how Python source code is analyzed, transformed and compiled. > > As you know, there are no type annotations as a "static" language > provides (like C++, C, Java, ...). > > def foo(a,b): > return a * b > > The two parameters a and b can carry any type (even the ones that are > not able to execute binary add). > > One step of the transformation described in the link above "annotates" > the types and deduces other properties. > > If you have a call site: > > foo("a", 2) > > It will deduce that foo's parameter a is an instance of "SomeString" and > b is an instance of "SomeInteger". > > So it will assume that when foo is called every call site must provide > SomeString for a and SomeInteger for b (or a subtype, but I'm not aware > of the full details). > > If at another place foo(1,2) is called (which is valid python), rpython > must complain, because it cannot be statically compiled. > > What we would like is a way to explicitly annotate the types (be aware > that this is just an example, it is up to you how you solve it): > > @explicit_types(a=SomeInteger, b=SomeInteger) > def foo(a,b): > return a * b > > This would mean that rpython would complain as soon as it sees foo("a", 2). 
>
> Preferably I think it would be good to have a mini language to describe
> such function properties, or variable properties.
>
> Cheers,
> Richard
>
> On 03/28/2017 10:14 AM, joannah nanjekye wrote:
> > Hello,
> >
> > I am interested in working on the above project. I need to understand
> > what it is about so that I can make a plan for it. I would love to work
> > on it for GSoC if accepted.
> >
> > In summary, I want to know the goal and the most important stack
> > involved in working on it.
> >
> > I am proficient in Python. If the above project idea is not so much in
> > that direction, you can advise a better idea among the ones listed here:
> > http://pypy.readthedocs.io/en/latest/project-ideas.html.
> >
> > Kind regards,
> >
> > --
> > Joannah Nanjekye
> > +256776468213
> > F : Nanjekye Captain Joannah
> > S : joannah.nanjekye
> > T : @captainjoannah
> > SO : joannah
> >
> > "You think you know when you learn, are more sure when you can write,
> > even more when you can teach, but certain when you can program."
> > Alan J. Perlis
> >
> > _______________________________________________
> > pypy-dev mailing list
> > pypy-dev at python.org
> > https://mail.python.org/mailman/listinfo/pypy-dev
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ronan.lamy at gmail.com Wed Mar 29 11:34:37 2017
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Wed, 29 Mar 2017 16:34:37 +0100
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
In-Reply-To: 
References: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com>
Message-ID: 

On 29/03/17 at 15:28, Ryan Gonzalez wrote:
> RPython already has this:
>
> https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/signature.py

Indeed, @signature is one of 2 prior attempts at doing this in rpython[*]. However its syntax is cumbersome and it's rather limited in the types it can express - you can only use what's in rpython.rlib.types and these functions cannot be combined arbitrarily to build more complex types.

[*] The other one is @enforceargs: https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/objectmodel.py

From fijall at gmail.com Fri Mar 31 03:58:19 2017
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 31 Mar 2017 09:58:19 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
In-Reply-To: 
References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
Message-ID: 

What I meant is that ORM is slow *and* it takes forever to warm up. Your code might not run long enough for the ORM to be warm. It's also very likely it'll end up slower on pypy. One thing you can do is to run PYPYLOG=jit-summary:- pypy <your program> and copy-paste the summary output.

The only way to store the warmed-up state is to keep the process alive (as a daemon) and rerun it further. You can see if it speeds up after two or three runs in one process and make decisions accordingly.

On Thu, Mar 30, 2017 at 2:09 PM, Vláďa Macek wrote:
> Hi Maciej (and others?),
>
> I know I must be one of many who wanted a gain without pain. :-) Just gave
> it a try without having an opportunity for some deeper profiling due to my
> project deadlines. I just thought to get in touch in case I missed
> something apparent to you from the combination I reported.
>
> ORM might be slow, but I compare interpreters, not ORMs.
Here's my > program's final stats of processing the input file (nginx access log): > > CPython 2.7.6 32bit > 130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s > > pypy2-v5.7.0-linux32 > 183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s > > This is longer run than what I tried previously and surely this is not a > "double time". But still significantly slower. > > Each line is analyzed using a regexp, which I read is slow in pypy. > > Both runs have exactly same input and output. Subjectively, the processing > debugging output really got faster gradually for pypy, cpython is constant > speed. Is it normal that the warmup can take minutes? I don't know the details. > > In production, this processing is run from cron every five minutes. Is it > possible to store the warmed-up state between runs? (Note: I have *.pyc > files disabled at home using PYTHONDONTWRITEBYTECODE=1.) > > I know it's annoying I don't share code and I'm sorry. With this mail I > just wanted to give out some numbers for the possibly curious. > > The pypy itself is interesting and I hope I'll return to it someday more > thoroughly. > > Thanks again & have a nice day, > > Vl??a > > > On 27.3.2017 17:21, Maciej Fijalkowski wrote: >> Hi Vlada >> >> Generally speaking, if we can't have a look there is incredibly little >> we can do "I have a program" can be pretty much anything. >> >> It is well known that django ORM is very slow (both on pypy and on >> cpython) and makes the JIT take forever to warm up. I have absolutely >> no idea how long is your run at full CPU, but this is definitely one >> of your suspects >> >> On Sun, Mar 26, 2017 at 1:06 PM, Vl??a Macek wrote: >>> Hi, recently I asked my friends to run my sort of a benchmark on their >>> machines (attached). The goal was to test the speed of different data >>> access in python2 and python3, 32bit and 64bit. One of my friends sent me >>> the pypy results -- the script ran fast as hell! Astounding. >>> >>> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded >>> your binary >>> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and >>> confirmed my friend's results, wow. >>> >>> I develop a large Django project, that includes a big amount of background >>> data processing. Reads large files, computes, issues much SQL to postgresql >>> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs. >>> >>> I'd welcome a speedup here very much. >>> >>> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set >>> up the paths and ran. The computation printouts were the same, very >>> promising -- taking into account how complicated the project is! The SQL >>> looked right too. My respect on compatiblity! >>> >>> Unfortunately, the time needed to complete was double in comparison CPython >>> 2.7 for exactly the same task. >>> >>> You mention you might have some tips for why it's slow. Are you interested >>> in getting in touch? Although I rather can't share the code and data with >>> you, I'm offering a real world example of significant load that might help >>> Pypy get better. >>> >>> Thank you, >>> >>> -- >>> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 >>> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD >>> >>> (Disclaimer: The opinions expressed herein are not necessarily those >>> of my employer, not necessarily mine, and probably not necessary.) 
>>> >

From nanjekyejoannah at gmail.com Fri Mar 31 06:44:24 2017
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Fri, 31 Mar 2017 13:44:24 +0300
Subject: [pypy-dev] Details on project idea: Explicit typing in RPython
In-Reply-To: 
References: <7cc2c179-d09c-9615-aac8-1b0fb06b66f4@gmail.com>
Message-ID: 

Thank you, I think this is clearer to me now.

On Wed, Mar 29, 2017 at 6:34 PM, Ronan Lamy wrote:

> On 29/03/17 at 15:28, Ryan Gonzalez wrote:
>
>> RPython already has this:
>>
>> https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/signature.py
>>
>
> Indeed, @signature is one of 2 prior attempts at doing this in rpython[*].
> However its syntax is cumbersome and it's rather limited in the types it
> can express - you can only use what's in rpython.rlib.types and these
> functions cannot be combined arbitrarily to build more complex types.
>
> [*] The other one is @enforceargs: https://bitbucket.org/pypy/pypy/src/tip/rpython/rlib/objectmodel.py
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev

--
Joannah Nanjekye
+256776468213
F : Nanjekye Captain Joannah
S : joannah.nanjekye
T : @captainjoannah
SO : joannah

"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From macek at sandbox.cz Thu Mar 30 08:09:55 2017
From: macek at sandbox.cz (=?UTF-8?B?VmzDocSPYSBNYWNlaw==?=)
Date: Thu, 30 Mar 2017 14:09:55 +0200
Subject: [pypy-dev] pypy real world example, a django project data processing. but slow...
In-Reply-To: 
References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz>
Message-ID: 

Hi Maciej (and others?),

I know I must be one of many who wanted a gain without pain. :-) Just gave it a try without having an opportunity for some deeper profiling due to my project deadlines. I just thought to get in touch in case I missed something apparent to you from the combination I reported.

ORM might be slow, but I compare interpreters, not ORMs. Here's my program's final stats of processing the input file (nginx access log):

CPython 2.7.6 32bit
130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s

pypy2-v5.7.0-linux32
183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s

This is a longer run than what I tried previously, and surely this is not a "double time". But still significantly slower.

Each line is analyzed using a regexp, which I read is slow in pypy.

Both runs have exactly the same input and output. Subjectively, the processing debug output really got gradually faster for pypy, while cpython ran at a constant speed. Is it normal that the warmup can take minutes? I don't know the details.

In production, this processing is run from cron every five minutes. Is it possible to store the warmed-up state between runs? (Note: I have *.pyc files disabled at home using PYTHONDONTWRITEBYTECODE=1.)

I know it's annoying that I don't share code and I'm sorry. With this mail I just wanted to give out some numbers for the possibly curious.

Pypy itself is interesting and I hope I'll return to it someday more thoroughly.

Thanks again & have a nice day,

Vláďa

On 27.3.2017 17:21, Maciej Fijalkowski wrote:
> Hi Vlada
>
> Generally speaking, if we can't have a look, there is incredibly little
> we can do: "I have a program" can be pretty much anything.
> > It is well known that django ORM is very slow (both on pypy and on > cpython) and makes the JIT take forever to warm up. I have absolutely > no idea how long is your run at full CPU, but this is definitely one > of your suspects > > On Sun, Mar 26, 2017 at 1:06 PM, Vl??a Macek wrote: >> Hi, recently I asked my friends to run my sort of a benchmark on their >> machines (attached). The goal was to test the speed of different data >> access in python2 and python3, 32bit and 64bit. One of my friends sent me >> the pypy results -- the script ran fast as hell! Astounding. >> >> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded >> your binary >> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and >> confirmed my friend's results, wow. >> >> I develop a large Django project, that includes a big amount of background >> data processing. Reads large files, computes, issues much SQL to postgresql >> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs. >> >> I'd welcome a speedup here very much. >> >> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set >> up the paths and ran. The computation printouts were the same, very >> promising -- taking into account how complicated the project is! The SQL >> looked right too. My respect on compatiblity! >> >> Unfortunately, the time needed to complete was double in comparison CPython >> 2.7 for exactly the same task. >> >> You mention you might have some tips for why it's slow. Are you interested >> in getting in touch? Although I rather can't share the code and data with >> you, I'm offering a real world example of significant load that might help >> Pypy get better. >> >> Thank you, >> >> -- >> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 >> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD >> >> (Disclaimer: The opinions expressed herein are not necessarily those >> of my employer, not necessarily mine, and probably not necessary.) >> From macek at sandbox.cz Fri Mar 31 07:19:31 2017 From: macek at sandbox.cz (=?UTF-8?B?VmzDocSPYSBNYWNlaw==?=) Date: Fri, 31 Mar 2017 13:19:31 +0200 Subject: [pypy-dev] pypy real world example, a django project data processing. but slow... In-Reply-To: References: <1b7805ab-a1ef-c16e-4983-fc5ef75ae62a@sandbox.cz> Message-ID: <79d397bf-9b52-0e5e-f1c7-c60b543f6dfb@sandbox.cz> Thanks! I ran it again on a much larger input and let it print the lines/sec speed on every millionth line (either valid or invalid). 
SPEED 6588 l/s SPEED 8208 l/s SPEED 9172 l/s SPEED 10351 l/s SPEED 16946 l/s SPEED 23263 l/s 662.6 secs, 973701 valid lines (5610778 invalid), 9937 l/s, max density 73 l/s [1c3dac321147] {jit-summary Tracing: 2794 8.313955 Backend: 2245 1.946692 TOTAL: 667.678971 ops: 5768705 recorded ops: 1478597 calls: 231321 guards: 392450 opt ops: 456372 opt guards: 101057 opt guards shared: 61039 forcings: 0 abort: trace too long: 52 abort: compiling: 0 abort: vable escape: 497 abort: bad loop: 0 abort: force quasi-immut: 0 nvirtuals: 284152 nvholes: 146657 nvreused: 90634 vecopt tried: 0 vecopt success: 0 Total # of loops: 583 Total # of bridges: 1778 Freed # of loops: 140 Freed # of bridges: 189 [1c3dac33785b] jit-summary} CPython again for comparison on the same input: SPEED 8819 l/s SPEED 9625 l/s SPEED 10285 l/s SPEED 11384 l/s SPEED 16428 l/s SPEED 20588 l/s 596.8 secs, 973701 valid lines (5610778 invalid), 11032 l/s, max density 73 l/s Interesting that after 5 million lines the PyPy speed exceeded the CPython somehow. Both runs got faster with time, probably due to my insane level of local caching of values (less SQL required). Anyway, I still hesitate whether pypy was really still warming up all that time... Thanks, Vlada On 31.3.2017 09:58, Maciej Fijalkowski wrote: > What I meant is that ORM is slow *and* it takes forever to warmup. > Your code might not run long enough for the ORM to be warm. It's also > very likely it'll end up slower on pypy. one thing you can do is to > run PYPYLOG=jit-summary:- pypy and copy paste the > summary output > > The only way to store the warmed up state is to keep the process alive > (as a daemon) and rerun it further. You can see if it speeds up after > two or three runs in one process and make decisions accordingly. > > On Thu, Mar 30, 2017 at 2:09 PM, Vl??a Macek wrote: >> Hi Maciej (and others?), >> >> I know I must be one of many who wanted a gain without pain. :-) Just gave >> it a try without having an opportunity for some deeper profiling due to my >> project deadlines. I just thought to get in touch in case I missed >> something apparent to you from the combination I reported. >> >> ORM might me slow, but I compare interpreters, not ORMs. Here's my >> program's final stats of processing the input file (nginx access log): >> >> CPython 2.7.6 32bit >> 130.1 secs, 177492 valid lines (866160 invalid), 8021 l/s, max density 72 l/s >> >> pypy2-v5.7.0-linux32 >> 183.0 secs, 177492 valid lines (866160 invalid), 5703 l/s, max density 72 l/s >> >> This is longer run than what I tried previously and surely this is not a >> "double time". But still significantly slower. >> >> Each line is analyzed using a regexp, which I read is slow in pypy. >> >> Both runs have exactly same input and output. Subjectively, the processing >> debugging output really got faster gradually for pypy, cpython is constant >> speed. Is it normal that the warmup can take minutes? I don't know the details. >> >> In production, this processing is run from cron every five minutes. Is it >> possible to store the warmed-up state between runs? (Note: I have *.pyc >> files disabled at home using PYTHONDONTWRITEBYTECODE=1.) >> >> I know it's annoying I don't share code and I'm sorry. With this mail I >> just wanted to give out some numbers for the possibly curious. >> >> The pypy itself is interesting and I hope I'll return to it someday more >> thoroughly. 
>> >> Thanks again & have a nice day, >> >> Vl??a >> >> >> On 27.3.2017 17:21, Maciej Fijalkowski wrote: >>> Hi Vlada >>> >>> Generally speaking, if we can't have a look there is incredibly little >>> we can do "I have a program" can be pretty much anything. >>> >>> It is well known that django ORM is very slow (both on pypy and on >>> cpython) and makes the JIT take forever to warm up. I have absolutely >>> no idea how long is your run at full CPU, but this is definitely one >>> of your suspects >>> >>> On Sun, Mar 26, 2017 at 1:06 PM, Vl??a Macek wrote: >>>> Hi, recently I asked my friends to run my sort of a benchmark on their >>>> machines (attached). The goal was to test the speed of different data >>>> access in python2 and python3, 32bit and 64bit. One of my friends sent me >>>> the pypy results -- the script ran fast as hell! Astounding. >>>> >>>> At home I have a 64bit Dell laptop running 32bit Ubuntu 14.04. I downloaded >>>> your binary >>>> https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.7.0-linux32.tar.bz2 and >>>> confirmed my friend's results, wow. >>>> >>>> I develop a large Django project, that includes a big amount of background >>>> data processing. Reads large files, computes, issues much SQL to postgresql >>>> via psycopg2, every 5 minutes. Heavily uses memcache daemon between runs. >>>> >>>> I'd welcome a speedup here very much. >>>> >>>> So let's give it a try. Installed psycopg2cffi (via pip in virtualenv), set >>>> up the paths and ran. The computation printouts were the same, very >>>> promising -- taking into account how complicated the project is! The SQL >>>> looked right too. My respect on compatiblity! >>>> >>>> Unfortunately, the time needed to complete was double in comparison CPython >>>> 2.7 for exactly the same task. >>>> >>>> You mention you might have some tips for why it's slow. Are you interested >>>> in getting in touch? Although I rather can't share the code and data with >>>> you, I'm offering a real world example of significant load that might help >>>> Pypy get better. >>>> >>>> Thank you, >>>> >>>> -- >>>> : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 >>>> : UNIX && Dev || Training : Python, Django : PGP key 97330EBD >>>> >>>> (Disclaimer: The opinions expressed herein are not necessarily those >>>> of my employer, not necessarily mine, and probably not necessary.) >>>> -- : Vlada Macek : http://macek.sandbox.cz : +420 608 978 164 : UNIX && Dev || Training : Python, Django : PGP key 97330EBD (Disclaimer: The opinions expressed herein are not necessarily those of my employer, not necessarily mine, and probably not necessary.)
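A minimal sketch of the daemon approach suggested above, assuming the batch job can be wrapped in a function (process_batch is a stand-in for the real import job): one long-lived PyPy process replaces the cron entry, so traces compiled in earlier iterations stay warm for later ones.

    import time

    def process_batch():
        # stand-in for the real log-processing run
        pass

    def main():
        while True:
            process_batch()   # JIT-compiled code stays warm across batches
            time.sleep(300)   # the old five-minute cron cadence

    if __name__ == '__main__':
        main()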