[Chicago] PyPy

Tal Liron tal.liron at threecrickets.com
Mon Jul 25 05:46:19 CEST 2011


For the people recommending PyPy right now, a serious question:


Who would you recommend PyPy to? Assume a user or dev who does not care 
about speed benchmarks.


On 07/24/2011 10:13 PM, Brian Herman wrote:

> +1 for PYPY
>
>
> Arigatou gozaimasu,
> (Thank you very much)
> Brian Herman
>
> http://brianjherman.com
> brianherman at acm.org
>
> On Sun, Jul 24, 2011 at 7:57 PM, Alex Gaynor <alex.gaynor at gmail.com>
> wrote:
>
>
>
>     On Sun, Jul 24, 2011 at 5:38 PM, Tal Liron
>     <tal.liron at threecrickets.com> wrote:
>
>         JVM 7 will have some neat features, but they haven't been
>         stabilized yet, and at this point it's mostly experimentation.
>         Fact is, even though JVM 6 has been out for a few years
>         already, many deployments still stick to JVM 5. It does the
>         job, and "upgrades" have their costs, money and otherwise. I
>         choose JVM for my project not because of speed, but because of
>         the maturity of the platform, which includes administration
>         tools, monitoring, security, and several best-in-class 3rd
>         party libraries. It's nice to know that performance is very
>         high up there if I really need it (in which case I just "drop
>         down" to Java, rather than use a dynamic JVM language).
>
>
>         The whole Jython codebase could use some help... it's even
>         messier than CPython's, if that's possible. There's a lot of
>         room for optimization, even before igniting JVM 7 shortcuts,
>         though it will surely be at the cost of regressions and
>         stability. Luckily, there's a decent test suite, which makes it
>         easy to experiment. The Jython community would LOVE help,
>         and it doesn't have to be just in terms of coding. Their
>         recent big project was to move the whole codebase from
>         Subversion to Mercurial. Another big item on the todo list is
>         to get up to date with Python 3. (Jython = Python 2.5
>         formally, though it has quite a few 2.6 additions.)
>
>
>         Jython also has some nice collaboration with JRuby, including
>         people who work on both projects. But what would make me
>         happier is real code sharing, allowing for a dynamic core that
>         would work well for both projects.
>
>
>         Anyway. I guess I'm always confused by what people mean by
>         "faster." What are you trying to code for, exactly? Where is
>         your bottleneck? What is your funding? It's more likely
>         (although not certain) that what you are really looking for is
>         "scalability," for which sheer computational performance is
>         likely not the real issue. If money is coming, getting more
>         expensive, faster machines may do the trick better than any
>         JVM 7 optimization.
>
>
>         If you just want a command line tool that starts fast, JVM is
>         *not* where you want to go. It has notoriously slow startup,
>         because of exactly those mechanisms that make it perform so
>         well once it is running.
>
>
>         Another way to look at "faster" is as a way to save money.
>         Weird, huh? But consider Facebook's HipHop project. (Sorry
>         that all of my examples are from the web arena; it's where I
>         mostly work these days.) The issue was not that PHP was
>         "slow," it was that when you have 1,000 machines running at
>         90% CPU, a faster PHP runtime means that you can use 800
>         machines, instead, for the same workload. A few orders of
>         magnitude forward, and savings can be enormous.
>
>
>         If you have a project with 1,000 machines running at 90% CPU,
>         please hire me! It may be very worthwhile for you to create a
>         more performant Python runtime (JVM-based or not), and I'd
>         love to be paid to do that. :) And it would also make a lot of
>         irrational Python speed freaks happy.
>
>
>         -Tal
>
>
>     <minor derail>
>     No offense, but if you want a more performant Python runtime, it's
>     here today: http://speed.pypy.org/, no need to start from scratch.
>     </minor derail>
>
>     Alex
>
>
>
>         On 07/24/2011 06:18 PM, John Stoner wrote:
>
>             Jython's not bad. I've used it a lot, and it plays well
>             with lots of Java APIs. Pretty slick, actually. I hear
>             Java 1.7 has some new dynamic features at the JVM level. I
>             always imagined Jython would run a lot faster if it took
>             advantage of them. Tal, do you know if there's any work on
>             that? Googling around a bit I'm not seeing much.
>
>             On Sun, Jul 24, 2011 at 4:32 PM, Joshua Herman
>             <zitterbewegung at gmail.com> wrote:
>
>                At least Erlang works for the use cases. I wasn't aware that
>                Jython was that powerful; I will have to play with it.
>
>                On Sun, Jul 24, 2011 at 3:46 PM, Tal Liron
>                <tal.liron at threecrickets.com> wrote:
>             > There is an alternative: Jython, which is Python on the JVM,
>             > and has no GIL. It's real, it works, and has a very open
>             > community. If you want to do high-concurrency in Python, it's
>             > the way to go. (And it has other advantages and disadvantages,
>             > of course.)
>             >
>             >
>             > I am always a bit frightened by community attempts to create
>             > new virtual machines for favorite languages in order to solve
>             > problem X. This shows a huge under-estimation of what it means
>             > to create a robust, reliable, performant generic platform.
>             > Consider how many really reliable versions of the C standard
>             > library there are out there -- and how many decades they took
>             > to mature, even with thousands of expert eyes poring over the
>             > code and testing it. And this is without duck typing (or ANY
>             > typing), data integrity, scoping (+call/cc), tail recursion,
>             > or any of the other huge (and exciting) challenges required to
>             > run a dynamic language like Python.
>             >
>             >
>             > So, it's almost amusing to see projects like Rubinius or
>             > Parrot come to be. Really? This is the best use of our time
>             > and effort? I'm equally impressed by the ballsiness of Erlang
>             > to create a new virtual machine from scratch.
>             >
>             >
>             > But those are rather unique histories. CPython has its own
>             > unique history. Not many people realize this, but Python is
>             > about 6 years older than Java, and the JVM would take another
>             > decade before reaching prominence. JavaScript engines (running
>             > in web browsers only) at the time were terrible, and Perl was
>             > entirely interpreted (no VM). So, in fact, CPython was written
>             > when there was no really good platform for dynamic languages.
>             > It wasn't a matter of hubris ("not invented here") to build a
>             > VM from scratch; there was simply no choice.
>             >
>             >
>             > Right now, though, there are many good choices. People like
>             > Rich Hickey (Clojure) and Martin Odersky (Scala) have it right
>             > in targeting the JVM, although both projects are also
>             > exploring .NET/Mono. If Python were invented today, I imagine
>             > it also would start with "Jython," instead of trying to
>             > reinvent the wheel (well, reinvent a whole damn car fleet,
>             > really, in terms of the work required).
>             >
>             >
>             > One caveat: I think there is room for "meta-VM" projects like
>             > PyPy and LLVM. These signify real progress in architecture,
>             > whereas "yet another dynamic VM" does not.
>             >
>             >
>             > -Tal
>             >
>             >
>             > On 07/24/2011 02:56 PM, Jason Rexilius wrote:
>             >
>             >> I also have to quote:
>             >>
>             >> "rather that, for problems for which shared-memory concurrency
>             >> is appropriate (read: the valid cases to complain about the
>             >> GIL), message passing will not be, because of the
>             >> marshal/unmarshal overhead (plus data size/locality ones)."
>             >>
>             >>
>             >> I have to say this is some of the best discussion in quite a
>             >> while. Dave's passionate response is great, as are the others.
>             >> I think the rudeness, or not, is kinda beside the point.
>             >>
>             >> There is a valid point to be made about marshal/unmarshal
>             >> overhead in situations where data-manipulation-concurrency AND
>             >> _user expectation_ or environmental constraints apply. I think
>             >> that's why people have some grounds to be unhappy with the GIL
>             >> concept (for me it's a concept) in certain circumstances. Tal
>             >> is dead on in that "scalability" means different things.
>             >>
>             >> Oddly, I'm more engaged in this as an abstract comp sci
>             >> question than a specific Python question. The problem set
>             >> applies across languages.
>             >>
>             >> The question I would raise is whether, given that an engineer
>             >> understands the problem he is facing, both tools are in the
>             >> toolbox. Is there an alternative to the GIL for the use cases
>             >> where it is not the ideal solution?
>             >>
>             >> BTW, I will stand up for IPC as one of the tools in the
>             >> toolbox to deal with scale/volume/speed/concurrency problems.
>             >>
>             >>
>             >> On 7/24/11 1:58 PM, Tal Liron wrote:
>             >>>
>             >>> I would say that there's truth in both approaches.
>             >>> "Scalability" means different things at different levels of
>             >>> scale. A web example: the architecture of Twitter or Facebook
>             >>> is nothing like the architecture of even a large Django site.
>             >>> It's not even the same problem field.
>             >>>
>             >>>
>             >>> A good threading model can be extremely efficient at certain
>             >>> scales. For data structures that are mostly read, not
>             >>> written, synchronization is not a performance issue, and you
>             >>> get the best throughput possible in multicore situations. The
>             >>> truly best scalability would be achieved by a combined
>             >>> approach: threading on a single node, message passing between
>             >>> nodes. Programming for that, though, is a nightmare (unless
>             >>> you have a programming language that makes both approaches
>             >>> transparent) and so usually at the large scale the latter
>             >>> approach is chosen. One significant challenge is to make sure
>             >>> that operations that MIGHT use the same data structures are
>             >>> actually performed on the same node, so that threading would
>             >>> be put to use.
>             >>>
>             >>>
>             >>> So, what Dave said applies very well to threading, too: "you
>             >>> still need to know what you're doing and how to decompose
>             >>> your application to use it."
>             >>>
>             >>>
>             >>> Doing concurrency right is hard. Doing message passing right
>             >>> is hard. Functional (persistent data structure) languages are
>             >>> hard, too. Good thing we're all such awesome geniuses,
>             >>> bursting with experience and a desire to learn.
>             >>>
>             >>>
>             >>> -Tal
>             >>>
>             >>>
>             >>> On 07/23/2011 01:40 PM, David Beazley wrote:
>             >>>
>             >>>>> "high performance just create multi processes that message"
>             >>>>> very rarely have I heard IPC and high performance in the
>             >>>>> same sentence.
>             >>>>>
>             >>>>> Alex
>             >>>>>
>             >>>> Your youth and inexperience is the only reason you would
>             >>>> make a statement that ignorant. Go hang out with some people
>             >>>> doing Python and supercomputing for a while and report
>             >>>> back---you will find that almost every significant
>             >>>> application is based on message passing (e.g., MPI). This is
>             >>>> because message passing has proven itself to be about the
>             >>>> only sane way of scaling applications up to run across
>             >>>> thousands to tens of thousands of CPU cores.
>             >>>>
>             >>>> I speak from some experience as I was writing such software
>             >>>> for large Crays, Connection Machines, and other systems when
>             >>>> I first discovered Python back in 1996. As early as 1995,
>             >>>> our group had done performance experiments comparing threads
>             >>>> vs. message passing on some multiprocessor SMP systems and
>             >>>> found that threads just didn't scale or perform as well as
>             >>>> message passing even on machines with as few as 4 CPUs. This
>             >>>> was all highly optimized C code for numerics (i.e., no
>             >>>> Python or GIL).
>             >>>>
>             >>>> That said, in order to code with message passing, you still
>             >>>> need to know what you're doing and how to decompose your
>             >>>> application to use it.
>             >>>>
>             >>>> Cheers,
>             >>>> Dave
>             >>>>
>             >>>>
>             >>>>
>             >>>>
>             >>>>
>             >>>>
>             >>>>
>             >>>>
>             >>>> _______________________________________________
>             >>>> Chicago mailing list
>             >>>> Chicago at python.org
>             >>>> http://mail.python.org/mailman/listinfo/chicago
>
>
>
>
>             -- 
>             blogs:
>             http://johnstoner.wordpress.com/
>             'In knowledge is power; in  wisdom, humility.'
>
>
>
>
>
>
>
>
>     -- 
>     "I disapprove of what you say, but I will defend to the death your
>     right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
>     "The people's good is the highest law." -- Cicero
>
>
>
>
>
>


