bytecode non-backcompatibility

Mike Meyer mwm at mired.org
Tue Apr 26 06:03:27 EDT 2005


Maurice LING <mauriceling at acm.org> writes:

>> All you have to do is convince the developers to declare the set of
>> bytecodes fixed, and they'd be stable. I don't think that will be
>> easy, as it removes one of the methods used to improve the performance
>> of Python. Since rebuilding the bytecode files is trivial (just invoke
>> compileall.py in the library as a script), this isn't really a
>> problem. The only thing that breaks are modules that muck about inside
>> the bytecode. I don't know how much bytecode improvements have sped
>> things up, but I do know that if I ever used a module that mucked
>> around with byte codes, I wasn't aware of it - so this is a tradeoff
>> I'm more than happy to make.
>>
>
> Thanks Mike,
>
> My arguments to the developers will be:
>
> 1. Python had gone from a purely scripting language to a general
> purpose programming language.

I think that's irrelevant. It just means that there are more
applications around that have to be dealt with when you update.

> 2. The current compilation scheme (compiling to bytecode as and when
> it is needed) works well for scripting purposes but is less desirable
> in commercial settings. Less distribution happens when it is used
> purely for scripting purposes, such as system maintenance or tuning.

The solution with the current situation depends on what you mean by
"commercial settings".

> 3. Using Python in commercial settings will usually require
> distribution of resulting software and it may or may not be
> desirable to distribute source codes as well. Unless the application
> is frozen, distributing source code is a must.

People have recommended that you distribute the Python interpreter
with commercial applications. This makes installing them much easier
for the end user. It also means the byte code and the interpreter will
always match.
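
As a rough sketch of what "match" means here: each compiled file
starts with a magic number that identifies the bytecode version, and
the interpreter refuses files whose magic number isn't its own. The
code below assumes a present-day Python 3 (importlib.util.MAGIC_NUMBER
is not the 2005 API), and the helper name and file names are made up
for the illustration.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc header carries this interpreter's magic number."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


with tempfile.TemporaryDirectory() as tmp:
    # Compile a throwaway module with the running interpreter.
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"))
    fresh_ok = pyc_matches_interpreter(pyc)

    # Fake a file from "some other version" by overwriting the magic number.
    with open(pyc, "rb") as f:
        data = f.read()
    stale = os.path.join(tmp, "stale.pyc")
    with open(stale, "wb") as f:
        f.write(b"\x00\x00\r\n" + data[4:])
    stale_ok = pyc_matches_interpreter(stale)

print("fresh file matches:", fresh_ok)
print("stale file matches:", stale_ok)
```

Shipping the interpreter with the application means the first case is
the only one the end user ever sees.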

> 4. One advantage that Java platform has is that it does not require
> the release of source files and still promotes platform-independence.

Java is only "platform-independent" to the degree that the JVM and
libraries you need for your application are already available. A
quick Google search finds applications that require specific JVM
versions - so I suspect that you can't always run Java code built for
a new version of the JVM on older JVMs. So you have to make sure the
target platform has a recent enough implementation of the JVM
installed. You may have problems if you try using a different group's
JVM as well. I haven't looked into that in a number of years.

So you can't just ship Java bytecodes to someone expecting they'll be
able to run them out of the box. They may well need to update their
Java environment, or install a second one. (I recall one time having
four Java products that took in total three different JVMs to run.
Bleah.)

This isn't really different from Python. In both cases, the end user
has to have a suitable version of the bytecode interpreter and
libraries installed. Java is a little better in that they provide
backwards compatibility.

> 5. Unstable bytecodes make updating to a newer version of Python very
> tedious and risk breaking old scripts, if they use C modules.

Unstable bytecodes have nothing to do with these problems. 

The current CPython installation process puts the python command and
the libraries for different versions in different directories. This
allows you to have multiple versions installed so you can keep old
scripts working with the old version until you've had time to test
them. It also makes life much easier on the developers, as they can
have a development version installed on the machine at the same time
as they have a production version without breaking the old scripts. It
also means you have to reinstall all your modules on the new
installation - which is what makes the update process tedious for me.
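
To see the per-version layout on a given machine, something like the
following can be run under each installed interpreter. It is only a
sketch and assumes a present-day Python 3 with the sysconfig module;
the function name install_layout is invented for the example.

```python
import sys
import sysconfig


def install_layout():
    """Describe where this interpreter keeps its own libraries."""
    paths = sysconfig.get_paths()
    return {
        "version": "%d.%d" % sys.version_info[:2],
        "stdlib": paths["stdlib"],
        "site_packages": paths["purelib"],
    }


layout = install_layout()
for key in ("version", "stdlib", "site_packages"):
    print("%-14s %s" % (key, layout[key]))
```

Each interpreter reports its own directories, which is why a module
installed for one version isn't visible to another until you install
it there as well.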

Now, this could be mitigated by having Python libraries installed in
the same location for all versions. You could fix all the bytecode
files by running compileall.py as a script. Sometimes, a module won't
work properly on a new version of Python, and will have to be
updated. You'll have to find those by trial and error and fix them as
you find them. You'll also have to recompile all the C modules, which
will break running scripts on the old interpreter.
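
The rebuild step can also be driven from Python rather than from the
command line. This is a minimal sketch assuming a present-day Python
3, where the command-line form is "python -m compileall <directory>";
the temporary directory and the module name mod.py stand in for a
real library directory.

```python
import compileall
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as lib_dir:
    # A stand-in for a library directory holding one pure-Python module.
    src = os.path.join(lib_dir, "mod.py")
    with open(src, "w") as f:
        f.write("def f():\n    return 42\n")

    # force=True recompiles even if an up-to-date bytecode file exists,
    # which is what you want after switching interpreter versions.
    ok = bool(compileall.compile_dir(lib_dir, force=True, quiet=1))

    # Ask the interpreter where it put the bytecode for this source file.
    pyc_path = importlib.util.cache_from_source(src)
    pyc_exists = os.path.exists(pyc_path)

print("all files compiled:", ok)
print("bytecode written:  ", pyc_exists)
```

Note that this only regenerates bytecode for Python source files. C
extension modules are untouched and still need a real recompile.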

Under the current system, old scripts that are run by the new
interpreter will break if all the modules they need aren't installed
yet. Once they're installed, the scripts should work.

If you have scripts that use C modules that aren't installed in the
standard Python search path, they may well break on the new
interpreter until you recompile them. Having both versions of the
interpreter available makes this tolerable.

The cure I proposed seems worse than the disease. If you've got a
better solution, I'm sure we'd be interested in hearing it.

> 6. Unstable bytecodes also makes research work into Python, such as,
> just-in-time compilation and just-in-time specialization unfavourable
> as they may only be applicable to a specific version of Python. There
> is much less chance of getting a project grant than if the same
> project is applied for Java (stable bytecodes).

This doesn't seem to have stopped the Psyco project from turning out a
jit compiler. I can't speak to grant applications, though. If you're
really interested, you could apply to the PSF for a grant.

> I may be chopped by saying this but by having a stable set of
> bytecodes, we may lose a means of optimization. But we may gain more
> from filling the need for commercial distribution of applications
> written in Python and ease of upgrading...

The commercial distribution needs can be met without going to stable
bytecodes. The ease of upgrading needs aren't met by going to stable
bytecodes.

> At current stage, every time a new version of Python is installed in
> my server or system, I have to test and ensure the needed libraries
> are there... which may be a horror in corporate settings. Couple that
> with messy dependencies of the libraries to be installed. I remembered
> the time I was trying to install Eric 3, the dependencies makes me
> want to give up... I have to install Qt, then pyQt, then something
> else, then Eric3. Imagine you need 10 of those libraries... which may
> happen... I am not yet masochistic enough to take pleasures in this...

Fixing the byte-code problem won't help with the dependency
problem. If anything, it'll make it worse, because you'll have old
libraries that were installed with previous versions of Python to
contend with. If those are acceptable, that's all well and good. But
for complex packages like eric - which depends on sip and qt and a
number of other things - the latest versions tend to rely on having
up-to-date libraries. So you install eric, and watch it fail because
some library is out of date. Update that library and repeat. The
easiest way to deal with this is to find all the libraries it needs,
and just update them all.
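
A first pass at "find all the libraries it needs" is to ask the
running interpreter which of them it can already see. The sketch
below assumes Python 3.8 or later (importlib.metadata did not exist
in 2005), and the names in the wanted list are placeholders rather
than Eric's real dependency list.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Placeholder names: the second one is deliberately nonexistent.
wanted = ["pip", "no-such-distribution-for-this-example"]
result = missing_distributions(wanted)
print("still to install:", result)
```

This only reports what is absent. Whether an installed library is
recent enough for the package you're building is a separate check.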

The dependencies problem is actually pretty easy to solve. In fact, in
many Python environments, it's already solved. On my system, if I
install Eric3 using the provided installation package, the
dependencies will be picked up automatically. Not very helpful if
you're not on such a system, I know. The long-term solution is for
PyPI to grow to include this functionality.

     Thank you,
     <mike

-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


