From nanjekyejoannah at gmail.com  Tue Mar 10 07:51:07 2020
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Tue, 10 Mar 2020 08:51:07 -0300
Subject: [pypy-dev] Question : PyPy on Heptapod
Message-ID:

Hey all,

Do I need to request special privileges in order to submit a patch?

If yes, what is the process? I want to submit an update to a
long-standing patch I had on the old Bitbucket repository:
https://foss.heptapod.net/pypy/pypy/merge_requests/639

Apologies for the delay. I am hoping for a final update this time. The
hustle for time is real.

Joannah

--
Best,
Joannah Nanjekye

*"You think you know when you learn, are more sure when you can write,
even more when you can teach, but certain when you can program." Alan
J. Perlis*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fijall at gmail.com  Tue Mar 10 12:01:18 2020
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 10 Mar 2020 18:01:18 +0200
Subject: [pypy-dev] rpython.org
Message-ID:

Hey everyone,

rpython.org expired because I didn't pay in time. Sorry about that, I
will try to be more on top of it. It's fixed now.

Best,
Maciej

From ronan.lamy at gmail.com  Tue Mar 10 12:30:58 2020
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Tue, 10 Mar 2020 16:30:58 +0000
Subject: [pypy-dev] Question : PyPy on Heptapod
In-Reply-To:
References:
Message-ID: <1fece36f-180e-3274-e48a-caff1ee0daa7@gmail.com>

Le 10/03/2020 à 11:51, joannah nanjekye a écrit :
> Hey all,
>
> Do I need to request special privileges in order to submit a patch?
>
> If yes, what is the process? I want to submit an update to a
> long-standing patch I had on the old Bitbucket repository:
> https://foss.heptapod.net/pypy/pypy/merge_requests/639
>
> Apologies for the delay. I am hoping for a final update this time. The
> hustle for time is real.
>
> Joannah

Hi Joannah,

Yes, you need to request access on Heptapod.
So, first create a user there and then click the button on
https://foss.heptapod.net/pypy (see the full instructions at
https://doc.pypy.org/en/latest/contributing.html#get-access). Also,
remember to update your .hg/hgrc and (preferably) install the hg-evolve
extension (see https://doc.pypy.org/en/latest/mercurial_heptapod.html).

Come to IRC if you have any issues with the new workflow.

Cheers,
Ronan

From nanjekyejoannah at gmail.com  Tue Mar 10 14:28:44 2020
From: nanjekyejoannah at gmail.com (joannah nanjekye)
Date: Tue, 10 Mar 2020 15:28:44 -0300
Subject: [pypy-dev] Question : PyPy on Heptapod
In-Reply-To: <1fece36f-180e-3274-e48a-caff1ee0daa7@gmail.com>
References: <1fece36f-180e-3274-e48a-caff1ee0daa7@gmail.com>
Message-ID:

Great. Thank you. I will follow up on these steps.

Best,
Joannah

On Tue, Mar 10, 2020 at 1:31 PM Ronan Lamy wrote:

> Le 10/03/2020 à 11:51, joannah nanjekye a écrit :
> > Hey all,
> >
> > Do I need to request special privileges in order to submit a patch?
> >
> > If yes, what is the process? I want to submit an update to a
> > long-standing patch I had on the old Bitbucket repository:
> > https://foss.heptapod.net/pypy/pypy/merge_requests/639
> >
> > Apologies for the delay. I am hoping for a final update this time. The
> > hustle for time is real.
> >
> > Joannah
>
> Hi Joannah,
>
> Yes, you need to request access on Heptapod. So, first create a user
> there and then click the button on https://foss.heptapod.net/pypy (see
> the full instructions at
> https://doc.pypy.org/en/latest/contributing.html#get-access). Also,
> remember to update your .hg/hgrc and (preferably) install the hg-evolve
> extension (see https://doc.pypy.org/en/latest/mercurial_heptapod.html).
>
> Come to IRC if you have any issues with the new workflow.
>
> Cheers,
> Ronan
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>

--
Best,
Joannah Nanjekye

*"You think you know when you learn, are more sure when you can write,
even more when you can teach, but certain when you can program." Alan
J. Perlis*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pmiscml at gmail.com  Sun Mar 29 13:52:30 2020
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Sun, 29 Mar 2020 20:52:30 +0300
Subject: [pypy-dev] Explicitly defining a string buffer object (aka StringIO += operator)
Message-ID: <20200329205230.0e473ae2@zenbook>

Hello,

1. Intro
--------

It is a well-known anti-pattern to use a string as a string buffer, to
construct a long (perhaps very long) string piece-wise. A running
example is:

    buf = ""
    for i in range(50000):
        buf += "foo"
    print(buf)

An alternative is to use a buffer-like object explicitly designed for
incremental updates, which for Python is io.StringIO:

    buf = io.StringIO()
    for i in range(50000):
        buf.write("foo")
    print(buf.getvalue())

As can be seen, this requires changing the way the buffer is
constructed (usually in one place) and the way the buffer value is
taken (usually in one place), but more importantly, it requires
changing each line which adds content to the buffer, and there can be
many of those for more complex algorithms. This leads to code less
clear than the original, requires noise-like changes, and complicates
updates to 3rd-party code which needs optimization.

To address this, this RFC proposes to add an __iadd__ method (i.e.
implementing the "+=" operator) to io.StringIO and io.BytesIO objects,
making it an exact alias of the .write() method.
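For illustration, the proposed semantics can be emulated today with a
thin subclass (the class below is a hypothetical sketch, not part of
the proposal; the RFC asks for the alias to be built into io.StringIO
itself, so no subclass would be needed):

```python
import io

class StringBuf(io.StringIO):
    """Hypothetical sketch: io.StringIO plus the proposed "+=" alias."""

    def __iadd__(self, s):
        # Exact alias of .write(); returns self so that "buf += s"
        # rebinds buf to the same buffer object.
        self.write(s)
        return self

buf = StringBuf()
for i in range(50000):
    buf += "foo"
assert buf.getvalue() == "foo" * 50000
```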
This will allow for code very parallel to the original str-using code:

    buf = io.StringIO()
    for i in range(50000):
        buf += "foo"
    print(buf.getvalue())

This will still require updates for buffer construction/getting the
value, but that's usually 2 lines. It will leave the rest of the code
intact, and not obfuscate the original content-construction algorithm.

2. Performance Discussion
-------------------------

The motivation for this change (promoting usage of io.StringIO by
making it look & feel more like str) is performance. But is it really
a problem? It turns out this is such a pervasive anti-pattern that
recent versions of CPython3 have a special optimization for it. Let's
use the following script for testing:

---------
import timeit
import io

def string():
    sb = u""
    for i in range(50000):
        sb += u"a"

def strio():
    sb = io.StringIO()
    for i in range(50000):
        sb.write(u"a")

print(timeit.timeit(string, number=10))
print(timeit.timeit(strio, number=10))
---------

With CPython3.6 the result is:

$ python3.6 str_iadd-vs-StringIO_write.py
0.03350826998939738
0.033480543992482126

In other words, there's no difference between usage of str vs
StringIO. But it wasn't always like that; with CPython2.7.17:

$ python2.7 str_iadd-vs-StringIO_write.py
2.10510993004
0.0399420261383

But Python2 is dead, right? Ok, let's see how Jython3 and IronPython3
fare. To my surprise, there are no (public releases of) such. Both
projects sit firmly in Python2 territory. So, let's try them:

$ java -jar jython-standalone-2.7.2.jar str_iadd-vs-StringIO_write.py
10.8869998455
1.74700021744

Sadly, I wasn't able to get IronPython.2.7.9.zip to run on my Linux
system, so I used the online version at https://tio.run/#python2-iron
(after discovering that https://ironpython.net/try/ is dead).

2.7.9 (IronPython 2.7.9 (2.7.9.0) on Mono 4.0.30319.42000 (64-bit))
26.2704391479
1.55628967285

So, it seems that rumors of Python2 being dead are somewhat exaggerated.
Let's try a project which tries to provide the "missing migration
path" between Python2 and Python3 -
https://github.com/naftaliharris/tauthon

Tauthon 2.8.1+ (heads/master:7da5b76f5b, Mar 29 2020, 18:05:05)
$ tauthon str_iadd-vs-StringIO_write.py
0.792158126831
0.0467159748077

Whoa, Tauthon seems to be faithful to its promise of being half-way
between CPython2 and CPython3.

Anyway, let's get back to Python3. Fortunately, there's PyPy3, so
let's try that:

$ ./pypy3.6-v7.3.0-linux64/bin/pypy3 str_iadd-vs-StringIO_write.py
0.5423258490045555
0.01754526497097686

Let's not forget the little Python brothers and continue with
MicroPython 1.12 (https://github.com/micropython/micropython):

$ micropython str_iadd-vs-StringIO_write.py
41.63419413566589
0.08073711395263672

Pycopy 3.0.6 (https://github.com/pfalcon/pycopy):

$ pycopy str_iadd-vs-StringIO_write.py
25.03198313713074
0.0713810920715332

I also wanted to include TinyPy (http://tinypy.org/) and Snek
(https://github.com/keith-packard/snek) in the shootout, but both
(seem to) lack a StringIO object.

These results can be summarized as follows: of more than half a dozen
Python implementations, CPython3 is the only one which optimizes for
the dubious usage of an immutable string type as an accumulating
character buffer. For all other implementations, unintended usage of
str incurs an overhead of about one order of magnitude, or two orders
of magnitude for implementations optimized for particular usecases
(this includes PyPy, optimized for speed, vs MicroPython/Pycopy,
optimized for small code size and memory usage).

Consequently, other implementations have 2 choices:

1. Succumb to applying the same mis-optimization for the string type
as CPython3. (With the understanding that for speed-optimized
projects, implementing mis-optimizations will eat into the performance
budget, and for memory-optimized projects, it will likely lead to
noticeable memory bloat.)

2.
Struggle against inefficient-by-concept usage, and promote usage of
the correct object types for incremental construction of string
content. This would require improving the ergonomics of the existing
string buffer object, to make its usage less painful both when writing
new code and when refactoring existing code.

As you may imagine, the purpose of this RFC is to raise awareness and
try to make headway with choice 2.

3. Scope Creep, aka "Possible Future Work"
------------------------------------------

The purpose of this RFC is specifically to propose applying a *single*
simple, obvious change: .__iadd__ is just an alias for .write, period.
However, for completeness, it makes sense to consider both
alternatives and where the path of adding "str-like functionality" may
lead us.

1. One alternative to patching StringIO would be to introduce a
completely different type, e.g. StringBuf. But that would largely be
"creating more entities without necessity", given that StringIO
already offers the needed buffering functionality, and just needs a
little touch of polish on its interface. If 2 classes like StringIO
and StringBuf existed, it would be an extra quiz to explain the
difference between them and why they both exist.

2. On the other hand, this RFC fixates on output buffering. But just
imagine how much fun could be had re: input buffers! E.g., we could
define "buf[0]" to have the semantics of "tmp = buf.tell(); res =
buf.read(1); buf.seek(tmp); return res". Ditto for .startswith(),
etc., etc. So, well... This RFC is about making .__iadd__ an alias for
.write, covering with this the output buffering usecase. Whoever may
have an interest in input buffer shortcuts would need to provide a
separate RFC, with separate usecases and argumentation.

4. (Self-)Criticism and Risks
-----------------------------

1. The biggest "criticism" I see is a response a la "there's no
problem with CPython3, so there's nothing to fix".
This is related to the bigger question of "whether a life outside
CPython exists", or put more formally, where the border lies between
Python-the-language and CPython-the-implementation. To address this
point, I tried to collect performance stats for a pretty wide array of
Python implementations.

2. Another potential criticism is that this may open a scope creep to
add more str-like functionality to classes which classically expose a
stream interface. Paragraph 3.2 is dedicated specifically to
addressing this point, by invoking hopefully the best-practice
approach: a request to focus on the currently proposed feature, which
requires very few changes for an arguably noticeable improvement. (At
the same time, should this change be found positive, it may influence
further proposals from interested parties.)

3. Implementing the required functionality is pretty easy with a user
subclass:

    class MyStringBuf(io.StringIO):
        def __iadd__(self, s):
            self.write(s)
            return self

Voila. The problem is performance. Calling such an .__iadd__() method
implemented in Python is 3 times slower than calling .write() directly
(with CPython3.6). But the paradigmatic problem is even bigger: this
RFC seeks to establish the best practice of using a type explicitly
designed for the purpose, with an ergonomic interface. Saying "we lack
such a clearly designated type out of the box, but if you figure out
that it's a problem (if you do figure that out), you can easily
resolve it on your side, albeit with a performance hit compared to it
being provided out of the box" - that's not really welcoming or
ergonomic.

5. Prior Art
------------

As with many things related to Python, the idea is not new.
I found a thread from 2006 dedicated to it:
https://mail.python.org/pipermail/python-list/2006-January/403480.html
(strangely, there's some mixup in the archives, and "thread view"
shows another message as the thread starter, though it's not:
https://mail.python.org/pipermail/python-list/2006-January/357453.html).

The discussion there seemed to end without a clear resolution and was
swamped by discussion of unicode handling complexities in Python2 and
implementation details of the "StringIO" vs "cStringIO" modules (both
of which were deprecated in favor of "io").

--
Best regards,
Paul                          mailto:pmiscml at gmail.com

From mark.doerr at uni-greifswald.de  Sun Mar 29 16:43:51 2020
From: mark.doerr at uni-greifswald.de (mark doerr)
Date: Sun, 29 Mar 2020 22:43:51 +0200
Subject: [pypy-dev] pypy.org : Updating the current pypy.org webpage - removing yatiblog README and adding new nikola system instructions ?
In-Reply-To:
References:
Message-ID: <46090a46-dd71-a4a8-a1e3-6839d031fc1b@uni-greifswald.de>

Hello pypy.org webmasters,

while discussing with Armin Rigo the update of the documentation of
sandbox-2 on pypy.org, I realised that the README (in the root dir of
the pypy.org repo) is still pointing to the old yatiblog pages
(source/README - which contains the old yatiblog instructions).

Is someone taking care of updating the instructions to the new Nikola
system (and removing the yatiblog source folder)?

Otherwise I could offer to do a new merge request with the changes.

With kind regards,

mark

-------- Forwarded Message --------
Subject: Re: pypy-sandbox-2 - documentation fix. Branch uploaded
Date: Sun, 29 Mar 2020 18:41:52 +0200
From: mark doerr
Organization: University Greifswald
To: Armin Rigo

Hi Armin,

on a second look, I realised that pypy moved from yatiblog to nikola.

Is that true? Then the README.rst in root is wrong: it still points to
source/README (which is the old yatiblog pages and gives yatiblog
instructions).
Are the pages in the "pages" folder currently used by your system (I
guess that nikola takes the pages from "pages" to build the static
html)?

Maybe the main README should also be updated? Is someone taking care
of this?

With kind regards,

mark

On 3/29/20 6:09 PM, mark doerr wrote:
> Hi Armin,
>
> I was wondering why you had two different repos, one on github and
> one on foss.
>
> It is good that you bundled the development on one site.
>
> It is not a big problem that you deleted my files, I still have them
> as a local copy.
>
> I will add the changes to the repo on foss.heptapod tonight (I just
> realised that the pypy people use yatiblog - it seems that it
> converts reStructuredText into static html, which is nice.)
>
> In my old commit, I changed the generated html code directly, which
> was not nice ;)
>
> I will send you a note when the changes are in ...
>
>
> À bientôt,
>
> mark
>
>
>
> On 3/29/20 2:19 PM, Armin Rigo wrote:
>> Hi Mark,
>>
>> Oops. We lost your pull request on
>> https://github.com/pypy/pypy-website because we killed that repo
>> (which was never meant to be there). The real repo is at
>> https://foss.heptapod.net/pypy/pypy.org . If you still have your
>> changes locally, could you make a merge request on the real repo?
>> Thank you!
>>
>>
>> À bientôt,
>>
>> Armin.
>
-
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matti.picus at gmail.com  Mon Mar 30 04:10:30 2020
From: matti.picus at gmail.com (Matti Picus)
Date: Mon, 30 Mar 2020 11:10:30 +0300
Subject: [pypy-dev] pypy.org : Updating the current pypy.org webpage - removing yatiblog README and adding new nikola system instructions ?
In-Reply-To: <46090a46-dd71-a4a8-a1e3-6839d031fc1b@uni-greifswald.de>
References: <46090a46-dd71-a4a8-a1e3-6839d031fc1b@uni-greifswald.de>
Message-ID: <6b362c5d-04c5-b556-bffb-3120c818a5de@gmail.com>

An HTML attachment was scrubbed...
URL:

From mark.doerr at uni-greifswald.de  Mon Mar 30 04:41:48 2020
From: mark.doerr at uni-greifswald.de (mark doerr)
Date: Mon, 30 Mar 2020 10:41:48 +0200
Subject: [pypy-dev] pypy.org : Updating the current pypy.org webpage - removing yatiblog README and adding new nikola system instructions ? sphinx vs. nikola
In-Reply-To: <6b362c5d-04c5-b556-bffb-3120c818a5de@gmail.com>
References: <46090a46-dd71-a4a8-a1e3-6839d031fc1b@uni-greifswald.de> <6b362c5d-04c5-b556-bffb-3120c818a5de@gmail.com>
Message-ID: <300d1647-8784-46bc-8b85-512fb78df04a@uni-greifswald.de>

You're welcome, Matti, thanks for your fast reply. I will make a
proposal for an update.

Just one question/comment, since you are at the very beginning of your
new repository: on our SiLA python repository
( https://gitlab.com/SiLA2/sila_python ) we are just using sphinx to
generate the GitLab static html pages:
https://sila2.gitlab.io/sila_python/

With the python_docs_theme they have exactly the same look as the
python.org pages. CI/CD pipeline generation is as simple as (see
https://gitlab.com/SiLA2/sila_python/-/blob/master/.gitlab-ci.yml ):

    pages:
      script:
        - pip install sphinx sphinx-rtd-theme python_docs_theme
        - cd docs ; make html
        - mv _build/html/ ../public/
      artifacts:
        paths:
          - public
      only:
        - master

That's it. No extra tools, no extra dependency, just the python
standard documentation tools everyone uses anyway, and the final
result is very nice and close to what people see from python.org. It
is also very easy to include the pypy source code documentation with
autodoc in the html generation.

I do not know if that was considered; if not, it might be worth
considering before doing the final move to nikola.
With best regards, mark On 3/30/20 10:10 AM, Matti Picus wrote: > > On 29/3/20 11:43 pm, mark doerr wrote: >> >> Hello pypy.org webmasters, >> >> while discussing with Armin Rigo the update of the documentation of >> sandbox-2 on pypy.org, I realised, >> >> that the README (in the root dir of the pypy.org repo) is still >> pointing to the old yatiblog pages (source/README - which contains >> the old yatiblog instructions). >> >> Is someone taking care of updating the instructions to the new Nikola >> system (and removing the yatiblog source folder) ? >> >> Otherwise I could offer to do a new merge request with the changes. >> >> With kind regards, >> >> mark >> >> > > Yes thanks, that would be great. The README-DO-NOT-EDIT should be > merged into the README and adjusted, and all directions should point > to running "make" in the top level directory. "make help" could also > be improved with a bit more verbosity. There may also be some stray > files around that can be deleted. > > > Matti > > > >> >> -------- Forwarded Message -------- >> Subject: Re: pypy-sandbox-2 - documentation fix. Branch uploaded >> Date: Sun, 29 Mar 2020 18:41:52 +0200 >> From: mark doerr >> Organization: University Greifswald >> To: Armin Rigo >> >> >> >> Hi Armin, >> >> on a second look, I realised, that pypy moved from yatiblog to nikola. >> >> Is that true ? Then the README.rst in root is wrong: it still points >> to source/README (which is the old yatiblog pages and gives yatiblog >> instructions). >> >> Are the pages in the "pages" folder be used currently by your system >> (I guess that nikola takes the pages from "pages" to build the static >> html) ? >> >> Maybe the main README should also be updated ? Is someone taking care >> of this ? >> >> With kind regards, >> >> mark >> >> >> >> On 3/29/20 6:09 PM, mark doerr wrote: >>> Hi Armin, >>> >>> I wondering, why you had two different repos, one and github and one >>> on foss. 
>>> >>> It is good that you bundled the development on one site. >>> >>> It is not a big problem, that you deleted my files, I still have >>> them as a local copy. >>> >>> I will add the changes to the repo on foss.heptapod tonight (I just >>> realised, that the pypy people use yatiblog - it seems that it >>> converts reStructured text into static html, which is nice.) >>> >>> In my old commit, I changed the generated html code directly, which >>> was not nice ;) >>> >>> I will send you a note, when changes are in ... >>> >>> >>> A bient?t, >>> >>> mark >>> >>> >>> >>> On 3/29/20 2:19 PM, Armin Rigo wrote: >>>> Hi Mark, >>>> >>>> Oops.? We lost your pull request on >>>> https://github.com/pypy/pypy-website because we killed that repo >>>> (which was never meant to be there).? The real repo is at >>>> https://foss.heptapod.net/pypy/pypy.org .? If you still have your >>>> changes locally, could you make a merge request on the real repo? >>>> Thank you! >>>> >>>> >>>> A bient?t, >>>> >>>> Armin. >>> >> - >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> https://mail.python.org/mailman/listinfo/pypy-dev -- -- Dr. mark doerr Working group: Prof. Uwe T. Bornscheuer Institute of Biochemistry Dept. of Biotechnology & Enzyme Catalysis University of Greifswald Felix-Hausdorff-Str. 4 D-17487 Greifswald germany Phone: +49 (0)3834-420-4411 +49 (0)3834-420-4391 (Secr.) Fax: +49 (0)3834-420-794367 Room: B021 and B047 Web: http://lara.uni-greifswald.de/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Mon Mar 30 06:34:06 2020 From: matti.picus at gmail.com (Matti Picus) Date: Mon, 30 Mar 2020 13:34:06 +0300 Subject: [pypy-dev] pypy.org : Updating the current pypy.org webpage - removing yatiblog README and adding new nikola system instructions ? sphinx vs. 
nikola In-Reply-To: <300d1647-8784-46bc-8b85-512fb78df04a@uni-greifswald.de> References: <46090a46-dd71-a4a8-a1e3-6839d031fc1b@uni-greifswald.de> <6b362c5d-04c5-b556-bffb-3120c818a5de@gmail.com> <300d1647-8784-46bc-8b85-512fb78df04a@uni-greifswald.de> Message-ID: <24f4d0ad-c319-ac39-bd39-44bf92201e2a@gmail.com> An HTML attachment was scrubbed... URL: From mark.doerr at uni-greifswald.de Mon Mar 30 07:09:49 2020 From: mark.doerr at uni-greifswald.de (mark doerr) Date: Mon, 30 Mar 2020 13:09:49 +0200 Subject: [pypy-dev] pypy.org : Updating the current pypy.org webpage - removing yatiblog README and adding new nikola system instructions ? sphinx vs. nikola In-Reply-To: <24f4d0ad-c319-ac39-bd39-44bf92201e2a@gmail.com> References: <46090a46-dd71-a4a8-a1e3-6839d031fc1b@uni-greifswald.de> <6b362c5d-04c5-b556-bffb-3120c818a5de@gmail.com> <300d1647-8784-46bc-8b85-512fb78df04a@uni-greifswald.de> <24f4d0ad-c319-ac39-bd39-44bf92201e2a@gmail.com> Message-ID: I see, Matti, thanks for explaining the decision. These are valid arguments using nikola (the blog post part I was not aware of). Using the CI of heptapod might be really worth to have a closer look into (that also might reduce errors, since everything relies only on one service rather than two or more ). Since the core (text) information is kept in reStructured text files that are somehow rendered, it is not really important, which engine finally does it (and if in the future another engine is better suited, it will be not too difficult to switch).... Then I will make a suggestion for the README as a new topic branch and create a merge request (very likely tomorrow evening). With best regards, mark On 3/30/20 12:34 PM, Matti Picus wrote: > > Those pages do look nice, and are very similar to doc.python.org or > doc.pypy.org (but with a nicer theme). However this repo renders > www.pypy.org, which is parallel to www.python.org, which is the > "non-developer front page" of pypy and python. 
>
> Your pages are hosted as gitlab pages, which heptapod does not
> currently support. This repo's pages are hosted on python
> infrastructure via a cron script that pulls the repo from heptapod
> every 15 minutes.
>
> I thought about using sphinx instead of nikola, but
>
> - it was easier to get the theme up and running with nikola
>
> - the theme supports mobile better than the rtd one
>
> - nikola has some hooks for blog posts. We would like to import the
> morepypy.blogspot.com/ pages and self-host them here rather than on
> blogspot.
>
> - nikola is actually a bit lighter than sphinx, but that is not really
> a criterion for selecting it as the render is almost instantaneous
>
> We could do the build of the docs on CI; the CI runners for heptapod
> are quite new. Maybe worth looking into doing the "python -mpip
> install nikola==8.0.3 jinja2 aiohttp watchdog" and "nikola build" in
> CI, but that would require rethinking the hosting process.
>
> Of course, this is all very ad-hoc, we don't have a community of
> documentation contributors, so nothing is final and contributions are
> very welcome.
>
> Matti
>
> On 30/3/20 11:41 am, mark doerr wrote:
>> You're welcome, Matti, thanks for your fast reply. I will make a
>> proposal for an update. Just one question/comment, since you are at
>> the very beginning of your new repository: on our SiLA python
>> repository ( https://gitlab.com/SiLA2/sila_python ) we are just using
>> sphinx to generate the GitLab static html pages:
>> https://sila2.gitlab.io/sila_python/ With the python_docs_theme they
>> have exactly the same look as the python.org pages. CI/CD pipeline
>> generation is as simple as (see
>> https://gitlab.com/SiLA2/sila_python/-/blob/master/.gitlab-ci.yml ):
>>
>> pages:
>> script:
>> - pip install sphinx sphinx-rtd-theme python_docs_theme
>> - cd docs ; make html
>> - mv _build/html/ ../public/
>> artifacts:
>> paths:
>> - public
>> only:
>> - master
>>
>> That's it.
No extra tools, no extra dependency, just python standard documentation tools everyone uses anyway and the final result is very nice and close to what people see from python.org. >> It is also very easy to include the pypy source code documentation with autodoc in the html generation. >> >> I do not know, if that was considered, if not, it might be worth considering before doing the final move to nikola. >> >> With best regards, >> mark >> >> >> >> On 3/30/20 10:10 AM, Matti Picus wrote: >>> >>> On 29/3/20 11:43 pm, mark doerr wrote: >>>> >>>> Hello pypy.org webmasters, >>>> >>>> while discussing with Armin Rigo the update of the documentation of >>>> sandbox-2 on pypy.org, I realised, >>>> >>>> that the README (in the root dir of the pypy.org repo) is still >>>> pointing to the old yatiblog pages (source/README - which contains >>>> the old yatiblog instructions). >>>> >>>> Is someone taking care of updating the instructions to the new >>>> Nikola system (and removing the yatiblog source folder) ? >>>> >>>> Otherwise I could offer to do a new merge request with the changes. >>>> >>>> With kind regards, >>>> >>>> mark >>>> >>>> >>> >>> Yes thanks, that would be great. The README-DO-NOT-EDIT should be >>> merged into the README and adjusted, and all directions should point >>> to running "make" in the top level directory. "make help" could also >>> be improved with a bit more verbosity. There may also be some stray >>> files around that can be deleted. >>> >>> >>> Matti >>> >>> >>> >>>> >>>> -------- Forwarded Message -------- >>>> Subject: Re: pypy-sandbox-2 - documentation fix. Branch uploaded >>>> Date: Sun, 29 Mar 2020 18:41:52 +0200 >>>> From: mark doerr >>>> Organization: University Greifswald >>>> To: Armin Rigo >>>> >>>> >>>> >>>> Hi Armin, >>>> >>>> on a second look, I realised, that pypy moved from yatiblog to nikola. >>>> >>>> Is that true ? 
Then the README.rst in root is wrong: it still >>>> points to source/README (which is the old yatiblog pages and gives >>>> yatiblog instructions). >>>> >>>> Are the pages in the "pages" folder be used currently by your >>>> system (I guess that nikola takes the pages from "pages" to build >>>> the static html) ? >>>> >>>> Maybe the main README should also be updated ? Is someone taking >>>> care of this ? >>>> >>>> With kind regards, >>>> >>>> mark >>>> >>>> >>>> >>>> On 3/29/20 6:09 PM, mark doerr wrote: >>>>> Hi Armin, >>>>> >>>>> I wondering, why you had two different repos, one and github and >>>>> one on foss. >>>>> >>>>> It is good that you bundled the development on one site. >>>>> >>>>> It is not a big problem, that you deleted my files, I still have >>>>> them as a local copy. >>>>> >>>>> I will add the changes to the repo on foss.heptapod tonight (I >>>>> just realised, that the pypy people use yatiblog - it seems that >>>>> it converts reStructured text into static html, which is nice.) >>>>> >>>>> In my old commit, I changed the generated html code directly, >>>>> which was not nice ;) >>>>> >>>>> I will send you a note, when changes are in ... >>>>> >>>>> >>>>> A bient?t, >>>>> >>>>> mark >>>>> >>>>> >>>>> >>>>> On 3/29/20 2:19 PM, Armin Rigo wrote: >>>>>> Hi Mark, >>>>>> >>>>>> Oops.? We lost your pull request on >>>>>> https://github.com/pypy/pypy-website because we killed that repo >>>>>> (which was never meant to be there).? The real repo is at >>>>>> https://foss.heptapod.net/pypy/pypy.org .? If you still have your >>>>>> changes locally, could you make a merge request on the real repo? >>>>>> Thank you! >>>>>> >>>>>> >>>>>> A bient?t, >>>>>> >>>>>> Armin. >>>>> >>>> - >>>> >>>> _______________________________________________ >>>> pypy-dev mailing list >>>> pypy-dev at python.org >>>> https://mail.python.org/mailman/listinfo/pypy-dev >> -- >> -- >> >> Dr. mark doerr >> Working group: Prof. Uwe T. Bornscheuer >> Institute of Biochemistry >> Dept. 
of Biotechnology & Enzyme Catalysis >> University of Greifswald >> Felix-Hausdorff-Str. 4 >> D-17487 Greifswald >> germany >> >> Phone: >> +49 (0)3834-420-4411 >> +49 (0)3834-420-4391 (Secr.) >> Fax: >> +49 (0)3834-420-794367 >> >> Room: >> B021 and B047 >> >> Web: >> http://lara.uni-greifswald.de/ -- -- Dr. mark doerr Working group: Prof. Uwe T. Bornscheuer Institute of Biochemistry Dept. of Biotechnology & Enzyme Catalysis University of Greifswald Felix-Hausdorff-Str. 4 D-17487 Greifswald germany Phone: +49 (0)3834-420-4411 +49 (0)3834-420-4391 (Secr.) Fax: +49 (0)3834-420-794367 Room: B021 and B047 Web: http://lara.uni-greifswald.de/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmiscml at gmail.com Mon Mar 30 14:58:32 2020 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Mon, 30 Mar 2020 21:58:32 +0300 Subject: [pypy-dev] [Python-ideas] Explicitly defining a string buffer object (aka StringIO += operator) In-Reply-To: References: <20200329205230.0e473ae2@zenbook> Message-ID: <20200330215832.19ccb259@zenbook> Hello, On Mon, 30 Mar 2020 09:58:32 -0700 Brett Cannon wrote: > On Sun, Mar 29, 2020 at 10:58 AM Paul Sokolovsky > wrote: > > > [SNIP] > > > > 1. Succumb to applying the same mis-optimization for string type as > > CPython3. (With the understanding that for speed-optimized projects, > > implementing mis-optimizations will eat into performance budget, and > > for memory-optimized projects, it likely will lead to noticeable > > memory bloat.) > > [SNIP] > > > > 1. The biggest "criticism" I see is a response a-la "there's no > > problem with CPython3, so there's nothing to fix". This is related > > to a bigger questions "whether a life outside CPython exists", or > > put more formally, where's the border between Python-the-language > > and CPython-the-implementation. To address this point, I tried to > > collect performance stats for a pretty wide array of Python > > implementations. 
> > I don't think characterizing this as a "mis-optimization" is fair.
> There is use of in-place add with strings in the wild and CPython
> happens to be able to optimize for it.

Everyone definitely doesn't have to agree with that characterization.
Nor is there a strong need to be offended that it's "unfair". After
all, it's just somebody's opinion. Roughly speaking, the need to be
upset by the "mis-" prefix is about the same as the need to be upset by
"bad" in some random blog post, e.g.
https://snarky.ca/my-impressions-of-elm/

I'm also sure that people familiar with implementation details would
understand the reason for that "mis-" prefix, but let me be explicit
otherwise: a string is one of the fundamental types in many languages,
including Python, and trying to make it too many things at once has its
overheads. Roughly speaking, to support efficient appending, one needs
to be ready to over-allocate string storage and maintain bookkeeping
for this. Another known optimization CPython does is for stuff like
"s = s[off:]", which requires maintaining another "offset" pointer.
Even with this simplistic consideration, the internal structure of
"str" would be about the same as that of "io.StringIO" (which also
needs to over-allocate and maintain a "current offset" pointer). But
why, if there's io.StringIO in the first place?

> Someone was motivated to do
> the optimization so we took it without hurting performance for other
> things. There are plenty of other things that I see people do
> regularly that I don't personally think are best practices, but that
> doesn't mean we should automatically ignore them and not help make
> their code more performant if possible without sacrificing best
> practice performance.

Nowhere did I argue against applying that optimization in CPython.
Surely, in general, the more optimizations, the better. I just stated
the fact that of 8 (well, 11, 11!) Python'ish implementations surveyed,
only 1 implemented it.
And what was implied is that even under the ideal conditions where the
other implementations said "we have the resources to implement and
maintain that optimization" (we're still talking about the "str +="
optimization), it would, at least for some projects, be against their
interests. E.g. MicroPython, Pycopy, Snek optimize for memory usage,
TinyPy for simplicity of implementation. "Too-complex basic types" are
also a known problem for JITs (which become less performant due to the
need to handle multiple cases of the same primitive type, and much
harder to develop and debug).

At the same time, the ergonomics of "str +=" are very good (heck,
that's why people use it). So, I was looking for the simplest possible
change which would allow for the largest part of that ergonomics in an
object type more suitable for content accumulation *across* different
Python'ish implementations.

I have to admit that I was inspired to write down this RFC by PEP 616
"String methods to remove prefixes and suffixes". Who'd have thought
that after so many years, there's still something useful to be added to
string methods (and that it doesn't have to be as complex as one could
devise at full throttle, but much simpler than that)?

> And I'm not sure if you're trying to insinuate that CPython represents
> Python the language

That's an old and painful (to some) topic.

> and thus needs to not optimize for something other
> implementations have/can not optimize for, which if you are

As I clarified, I don't say that CPython shouldn't optimize for things.
I just tried to argue that there's no clearly defined abstraction (*)
for an accumulating string buffer, and argued that it could be easily
"established".

(*) Instead, there are various practical hacks to implement it, as both
the 2006 thread and this one show.
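Two of those practical accumulation approaches can be sketched side by
side in portable Python. This is a hedged illustration of the trade-off
under discussion (the function names are invented for this example):

```python
import io

def concat_naive(parts):
    # Repeated "str +=": without CPython's in-place resize trick, each
    # step may copy the whole accumulated string, giving O(n**2)
    # worst-case behaviour on other implementations.
    s = ""
    for p in parts:
        s += p
    return s

def concat_stringio(parts):
    # io.StringIO already keeps an over-allocated internal buffer and a
    # current-offset pointer, so each write is amortized O(len(p)) on
    # any implementation.
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()
```

Both produce the same result; only the growth behaviour differs, which
is exactly the portability point being argued.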
> Or if you're saying CPython and Python should be > considered separate, then why can't CPython optimize for something it > happens to be positioned to optimize for that other implementations > can't/haven't? Yes, I personally think that CPython and Python should be considered separate. E.g. the topic of this RFC shouldn't be considered just from CPython's point of view, but rather from the angle of "Python doesn't seem to define a useful abstraction of (ergonomic) string builder, here's how different Python implementations can acquire it almost for free". -- Best regards, Paul mailto:pmiscml at gmail.com From steve at pearwood.info Mon Mar 30 22:45:55 2020 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 31 Mar 2020 13:45:55 +1100 Subject: [pypy-dev] [Python-ideas] Re: Explicitly defining a string buffer object (aka StringIO += operator) In-Reply-To: <9AAC52DE-5A72-41E4-9FE8-5487CAD47C46@yahoo.com> References: <9AAC52DE-5A72-41E4-9FE8-5487CAD47C46@yahoo.com> Message-ID: <20200331024555.GF6933@ando.pearwood.info> On Mon, Mar 30, 2020 at 10:24:02AM -0700, Andrew Barnert via Python-ideas wrote: > On Mar 30, 2020, at 10:01, Brett Cannon wrote: [talking about string concatenation] > > I don't think characterizing this as a "mis-optimization" is fair. [...] > Yes. A big part of the reason there?s so much use in the wild is that > for small cases that aren?t in the middle of a bottleneck, it?s > perfectly reasonable for people to add two or three strings and not > care about performance. (Who cares about N**2 when N<=15 and it > happens at most 4 times per run of your program?) When you're talking about N that small (2 or 4, say), it is quite possible that the overhead of constructing a list then looking up and calling a method may be greater than that of string concatenation, even without the optimization. I wouldn't want to bet either way without benchmarks, and I wouldn't trust the benchmarks from one machine to apply to another. 
> So people do it, and
> it's fine. When they really do need to optimize, a quick search of the
> FAQ or StackOverflow or whatever will tell them the right way to do
> it, and they do it, but most of the time it doesn't matter.

Ah, but that's the rub. How often do they know they need to do that
"quick search"? Unless they get bitten by poor performance, and spend
the time to profile their script and discover the cause of the
slowdown, how would they know what the cause was?

If people already know about the string concatenation trap, they don't
need a quick search, and they're probably not writing repeated
concatenation for arbitrary N in the first place. Although I have come
across a few people who are completely dismissive of the idea of using
cross-platform best practices. Even actively hostile to the idea that
they should avoid idioms that will perform badly on other interpreters.

On the third hand, if they don't know about the trap, then it won't be
a quick search, because they don't know what to search for (unless it's
"why is Python so slow?", which won't be helpful).

Disclaimer: intellectually, I like the CPython string concatenation
optimization. It's clever, a Neat Hack, I really admire it. But I can't
help feeling that, *just maybe*, it's a misplaced optimization, and if
it were proposed today, when we are more concerned about alternative
interpreters, we might not have accepted it.

Perhaps if CPython didn't dominate the ecosystem so completely, and
more people wrote cross-platform code that was run across multiple
interpreters, we wouldn't be quite so keen on an optimization that
encourages quadratic behaviour half the time.

So even though I don't *quite* agree with Paul, I can see that from the
perspective of people using alternate interpreters, this CPython
optimization could easily be characterized as a mis-optimization. "Why
is CPython encouraging people to use an idiom that is all but
guaranteed to be hideously slow on everyone else's interpreter?"
Since Brett brought up the notion of fairness, one might even be
forgiven for considering that such an optimization in the reference
interpreter, knowing that most of the other interpreters cannot match
it, is an unfair, aggressive, anti-competitive action. Personally I
wouldn't go quite so far. But I can see why people who are passionate
about alternate interpreters might feel that this optimization is both
harmful and unfair on the greater Python ecosystem.

Apart from cross-platform issues, another risk with the concat
optimization is that it's quite fragile and sensitive to the exact form
of your code. A small, seemingly insignificant change to your code can
have enormous consequences:

    In [1]: strings = ['abc']*500000

    In [2]: %%timeit
       ...: s = ''
       ...: for x in strings:
       ...:     s = s+x
       ...:
    36.4 ms ± 313 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

    In [3]: %%timeit
       ...: s = ''
       ...: for x in strings:
       ...:     s = t = s+x
       ...:
    59.7 s ± 799 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

That's more than a thousand times slower. And I think people often
underestimate how painful it can be to debug performance problems
caused by this. If you haven't been burned by it before, it may not be
obvious just how risky repeated concatenation can be.

Here is an example from real life.
In 2009, about four years after the in-place string concatenation
optimization was added to CPython, Chris Withers asked for help
debugging a problem where Python's httplib was literally hundreds of
times slower than other tools, like wget and Internet Explorer:

https://mail.python.org/pipermail/python-dev/2009-August/091125.html

A few weeks later, Simon Cross realised the problem was probably the
quadratic behaviour of repeated string addition:

https://mail.python.org/pipermail/python-dev/2009-September/091582.html

leading to this quote from Antoine Pitrou:

    "Given differences between platforms in realloc() performance, it
    might be the reason why it goes unnoticed under Linux but
    degenerates under Windows."

https://mail.python.org/pipermail/python-dev/2009-September/091583.html

and Guido's comment:

    "Also agreed that this is an embarrassment."

https://mail.python.org/pipermail/python-dev/2009-September/091592.html

So even in CPython, it isn't inconceivable that the concat optimization
may fail and you will have hideously slow code.

At this point, I think that CPython is stuck with this optimization,
for good or ill. Removing it would just hurt a bunch of code that
currently performs adequately. But I can't help but feel that, knowing
what we know now, there's a good chance that if that optimization were
proposed now rather than in 2.4, we might not accept it.

> Maybe the OP could argue that this was a bad decision by finding
> examples of code that actually relies on that optimization despite
> being intended to be portable to other implementations. It's worth
> comparing the case of calling sum on strings - which is potentially
> abused more often than used harmlessly, so instead of optimizing it,
> CPython made it an error. But without any such known examples, it's
> hard not to call the string concatenation optimization a win.

Does the CPython standard library count? See above.
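The thousand-fold gap in the timings above comes from the fact that
CPython's in-place resize only fires when the target string holds the
sole reference. A hedged sketch of the two loop shapes (pure,
portable Python; the comments describe CPython-specific behaviour, and
the function names are invented for illustration):

```python
def build_one_ref(parts):
    # After "s = s + x", the result is referenced only by "s", so
    # CPython can often resize the string's buffer in place instead of
    # copying it.
    s = ""
    for x in parts:
        s = s + x
    return s

def build_two_refs(parts):
    # Binding the intermediate result to a second name "t" keeps an
    # extra reference alive, defeating the in-place fast path and
    # forcing a full O(len(s)) copy on every iteration.
    s = ""
    for x in parts:
        s = t = s + x
    return s
```

Both functions return identical strings; only the hidden copying
behaviour differs, which is what makes the problem so hard to spot by
reading the code.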
--
Steven

From steve at pearwood.info  Mon Mar 30 23:10:32 2020
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 31 Mar 2020 14:10:32 +1100
Subject: [pypy-dev] [Python-ideas] Re: Explicitly defining a string buffer object (aka StringIO += operator)
In-Reply-To: <1466543D-7DF9-42C5-BFCC-D8E5AB89762D@yahoo.com>
References: <20200330215832.19ccb259@zenbook>
 <1466543D-7DF9-42C5-BFCC-D8E5AB89762D@yahoo.com>
Message-ID: <20200331031032.GG6933@ando.pearwood.info>

On Mon, Mar 30, 2020 at 12:37:48PM -0700, Andrew Barnert via
Python-ideas wrote:
> On Mar 30, 2020, at 12:00, Paul Sokolovsky wrote:
> > Roughly speaking, to support efficient appending, one needs to
> > be ready to over-allocate string storage and maintain bookkeeping for
> > this. Another known optimization CPython does is for stuff like "s =
> > s[off:]", which requires maintaining another "offset" pointer. Even
> > with this simplistic consideration, the internal structure of "str"
> > would be about the same as that of "io.StringIO" (which also needs to
> > over-allocate and maintain a "current offset" pointer). But why, if
> > there's io.StringIO in the first place?
>
> Because io.StringIO does _not_ need to do that.

The same comment can be made that str does not need to implement the
in-place concat optimization either. And yet it does, in CPython if not
in any other interpreter.

It seems to me that Paul makes a good case that, unlike the string
concat optimization, just about every interpreter could add this to
StringIO without difficulty or great cost. Perhaps they could even get
together and agree to all do so. But unless CPython does so too, it
won't do them much good, because hardly anyone will take advantage of
it. When one platform dominates 90% of the ecosystem, one can sensibly
write code that depends on that platform's specific optimizations, but
going the other way, not so much.
The question that comes to my mind is not whether StringIO *needs* to
do this, but whether there is any significant cost to doing it. Of
course there is *some* cost: somebody has to do the work, and it won't
be me. But once done, is there any significant maintenance cost beyond
what there would be without it? Is there any downside?

[...]

> And it doesn't allow you to do random-access seeks to arbitrary
> character positions.

Sorry, I don't see why random access to arbitrary positions is relevant
to a discussion about concatenation. What am I missing?

--
Steven

From pmiscml at gmail.com  Tue Mar 31 15:57:05 2020
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Tue, 31 Mar 2020 22:57:05 +0300
Subject: [pypy-dev] [Python-ideas] Re: Explicitly defining a string buffer object (aka StringIO += operator)
In-Reply-To:
References: <9AAC52DE-5A72-41E4-9FE8-5487CAD47C46@yahoo.com>
 <20200331024555.GF6933@ando.pearwood.info>
Message-ID: <20200331225705.2fb8c7a8@zenbook>

Hello,

On Mon, 30 Mar 2020 23:20:28 -0400
David Mertz wrote:

> I have myself been "guilty" of using the problem style for N < 10. In
> fact, I had forgotten about the optimization even, since my uses are
> negligible time.

I personally don't think it's a "problem style" per se. I'm using it
all the time, and I'm going to keep using it. We all love Python for
being a language you can easily prototype in without concerning
yourself with implementation details unless you want to. My concern is
for Python to be a language which can be progressively and easily
optimized when you need it. Going over the code and changing many lines
along the lines of the diff:

    - buf += piece
    + buf.append(piece)

isn't my idea of "seamless" optimization, especially as I know it is
guaranteed to grow my memory usage. Sadly, I came to the conclusion
that even

    - buf += piece
    + buf.write(piece)

patching hurts my aesthetic feelings.
At least, that's the only explanation I have for why, in one of my
modules,
https://github.com/pfalcon/pycopy-lib/blob/master/utokenize/utokenize.py#L44
I converted one loop from "str +=" to "StringIO.write()", but not
another. I then tried to think about what could help with that, and
having += on StringIO seemed to do the trick.

> For stuff like this, it's fast no matter what:
>
> for clause in query_clauses:
>     sql += clause
>
> Maybe I have a WHERE or two. Maybe an ORDER BY. Etc. But if I'm
> sure there won't be more than 6 such clauses to the query I'm
> building, so what? Or probably likewise with bits of a file path, or
> a URL with optional parameters, and a few other things.
>
> On Mon, Mar 30, 2020 at 11:15 PM David Mertz wrote:
>
> > Does anyone know if any linters find and warn about the `string +=
> > word` in a loop pattern? It feels like a linter would be the place
> > to do that. I don't think we could possibly make it an actual
> > interpreter warning, given borderline-OK uses (or possibly even
> > preferred ones). But a little nagging in tooling could draw
> > attention.

[]

--
Best regards,
 Paul                          mailto:pmiscml at gmail.com
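The "+= on StringIO" idea discussed in this thread can be prototyped
today in portable Python via the in-place operator protocol. This is a
hypothetical sketch only (the class name `StringBuffer` is invented
here; the actual RFC proposes `__iadd__` on `io.StringIO` itself):

```python
import io

class StringBuffer(io.StringIO):
    """StringIO plus the 'buf += piece' ergonomics of plain str."""

    def __iadd__(self, piece):
        # Delegate to write(), which appends at the current position
        # into an over-allocated internal buffer, so each append is
        # amortized O(len(piece)) on any implementation.
        self.write(piece)
        return self

buf = StringBuffer()
for piece in ("Hello", ", ", "world"):
    buf += piece              # reads like str +=, scales like StringIO
print(buf.getvalue())         # -> Hello, world
```

A one-line subclass like this gives the ergonomics Sokolovsky asks for
without waiting on any interpreter change, at the cost of one extra
method call per append.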