Call julia from Python: which package?

Oscar Benjamin oscar.j.benjamin at gmail.com
Fri Dec 17 18:30:12 EST 2021


On Fri, 17 Dec 2021 at 23:11, Chris Angelico <rosuav at gmail.com> wrote:
>
> On Sat, Dec 18, 2021 at 10:01 AM Oscar Benjamin
> <oscar.j.benjamin at gmail.com> wrote:
> >
> > On Fri, 17 Dec 2021 at 22:40, Chris Angelico <rosuav at gmail.com> wrote:
> > >
> > > On Sat, Dec 18, 2021 at 9:24 AM Oscar Benjamin
> > > <oscar.j.benjamin at gmail.com> wrote:
> > > > When I timed the result in Julia and in Python I found that the Julia
> > > > code was slower than the Python code. Of course I don't know how to
> > > > optimise Julia code so I asked one of my colleagues who does (and who
> > > > likes to proselytise about Julia). He pointed me to here where the
> > > > creator of Julia says "BigInts are currently pretty slow in Julia":
> > > > https://stackoverflow.com/questions/37193586/bigints-seem-slow-in-julia#:~:text=BigInts%20are%20currently%20pretty%20slow,that%20basic%20operations%20are%20fast.
> > > > I should make clear here that I used the gmpy2 library in Python for
> > > > the basic integer operations which is a wrapper around the same gmp
> > > > library that is used by Julia. That means that the basic integer
> > > > operations were being done by the same gmp library (a C library) in
> > > > each case. The timing differences between Python and Julia are purely
> > > > about overhead around usage of the same underlying C library.
> > >
> > > Point of note: "Python actually uses a hybrid approach where small
> > > integer values are represented inline and only when values get too
> > > large are they represented as BigInts" might be sorta-kinda true for
> > > Python 2, but it's not true for Python 3. In all versions of CPython
> > > to date, all integers are objects, they're not "represented inline"
> > > (there *are* some languages that do this intrinsically, and I think
> > > that PyPy can sometimes unbox integers, but CPython never will); and
> > > the performance advantage of Py2's machine-sized integers clearly
> > > wasn't worth recreating in Py3. (It would be possible to implement it
> > > in Py3 as an optimization, while still having the same 'int' type for
> > > both, but nobody's done it.)
> > >
> > > So if Python's integers are faster than Julia's, it's not because
> > > there's any sort of hybrid approach, it's simply because CPython is
> > > more heavily optimized for that sort of work.
> > >
> > > (That said: I have no idea whether a StackOverflow answer from 2016 is
> > > still applicable.)
> >
> > To be clear: I wasn't using Python's int type. I used the gmpy2.mpz
> > type which is precisely the same mpz from the same gmp library that
> > Julia uses. I'm told by my colleague that Julia has a lot of overhead
> > when using "heap types" which is possibly the cause of the problem.
>
> Ah, interesting. What's the advantage of using mpz instead of Python's
> builtin int?

In this particular context the advantage was to give parity between the two
languages I was profiling, but in general gmpy2/flint have faster large-integer
arithmetic than Python's int type:

In [13]: from gmpy2 import mpz

In [14]: nums_int = [3**i for i in range(1000)]

In [15]: nums_mpz = [mpz(3)**i for i in range(1000)]

In [16]: def prod(nums):
    ...:     result = nums[0]
    ...:     for num in nums[1:]:
    ...:         result *= num
    ...:     return result
    ...:

In [17]: %time ok = prod(nums_int)
CPU times: user 384 ms, sys: 12 ms, total: 396 ms
Wall time: 398 ms

In [18]: %time ok = prod(nums_mpz)
CPU times: user 124 ms, sys: 0 ns, total: 124 ms
Wall time: 125 ms

That's a worthwhile speedup, but the big difference for SymPy in using
gmpy2 (as an optional dependency) is the mpq rational type, which is
much faster than Fraction for small rational numbers:

In [19]: from gmpy2 import mpq

In [20]: from fractions import Fraction

In [21]: nums_mpq = [mpq(i, 3) for i in range(1000)]

In [22]: nums_frac = [Fraction(i, 3) for i in range(1000)]

In [23]: %time ok = sum(nums_mpq)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 633 µs

In [24]: %time ok = sum(nums_frac)
CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 10.2 ms

For some slow operations SymPy is about 30x faster when gmpy2 is installed.
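If you want to confirm that SymPy has actually picked gmpy2 up, a rough check
(just a sketch; where ZZ lives and how the SYMPY_GROUND_TYPES environment
variable is interpreted can vary between SymPy versions) is to look at the
dtype of the integer ground domain:

In [25]: from sympy.polys.domains import ZZ

In [26]: using_gmpy2 = ZZ.dtype is not int  # gmpy2's mpz if it was found, plain int otherwise

Setting SYMPY_GROUND_TYPES=python in the environment before importing sympy
forces the pure-Python path, which makes it easy to time the same SymPy
operation with and without gmpy2.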

--
Oscar

