python-dev Summary for 2006-10-16 through 2006-10-31

[The HTML version of this Summary is available at http://www.python.org/dev/summary/2006-10-16_2006-10-31]

Announcements

Roundup to replace SourceForge tracker

Roundup has been named as the official replacement for the SourceForge issue tracker. Thanks go out to the new volunteer admins, Paul DuBois, Michael Twomey, Stefan Seefeld, and Erik Forsberg, and also to Upfront Systems who will be hosting the tracker. If you'd like to provide input on what the new tracker should do, please join the tracker-discuss mailing list.

Summaries

The buffer protocol and communicating binary format information

Travis E. Oliphant presented a pre-PEP for adding a standard way to describe the shape and intended types of binary-formatted data. It was accompanied by a pre-PEP for extending the buffer protocol to handle such shapes and types. Under the proposal, a new datatype object would describe binary-formatted data with an API like:

datatype((float, (3,2)))
# describes a 3*2*8=48 byte block of memory that should be interpreted
# as 6 doubles laid out as arr[0,0], arr[0,1], ... arr[2,0], arr[2,1]

datatype([( ([1,2],'coords'), 'f4', (3,6)), ('address', 'S30')])
# describes the structure
#     float coords[3*6]   /* Has [1,2] associated with this field */
#     char  address[30]
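
The byte counts in these two descriptions can be checked with the existing struct module. This is only a rough sketch: the format strings below are struct's notation, not the proposed datatype API, and the helper name grid_item is made up for illustration.

```python
import struct

# "=" selects standard sizes with no padding: d is 8 bytes, f is 4 bytes.
GRID_FORMAT = "=6d"        # 3 x 2 doubles, flattened
RECORD_FORMAT = "=18f30s"  # float coords[3*6]; char address[30]

grid_size = struct.calcsize(GRID_FORMAT)      # 3 * 2 * 8 bytes
record_size = struct.calcsize(RECORD_FORMAT)  # 18 * 4 + 30 bytes

# Pack the values 0.0 .. 5.0 in row-major (C) order.
packed = struct.pack(GRID_FORMAT, *[float(i) for i in range(6)])


def grid_item(buf, row, col, ncols=2):
    """Read element [row, col] of a row-major grid of doubles (hypothetical helper)."""
    offset = (row * ncols + col) * struct.calcsize("=d")
    return struct.unpack_from("=d", buf, offset)[0]


last = grid_item(packed, 2, 1)  # the final element of the 3 x 2 block
```

The struct format string carries the item types but not the shape, which is the kind of gap the pre-PEP set out to close.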

Alexander Belopolsky provided a nice example of why you might want to extend the buffer protocol along these lines. Currently, there's not much you can do with a basic buffer object. If you want to pass it to numpy, you have to provide the type and shape information yourself:

>>> from array import array
>>> import numpy
>>> b = buffer(array('d', [1,2,3]))
>>> numpy.ndarray(shape=(3,), dtype=float, buffer=b)
array([ 1.,  2.,  3.])

By extending the buffer protocol appropriately so that the necessary information can be provided, you should be able to pass the buffer directly to numpy and have it understand the format itself:

>>> numpy.array(b)
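
For comparison, here is a small sketch of a buffer that carries its own format and shape, using the memoryview object of current Python 3. It illustrates the idea only; it is not the API from the pre-PEP under discussion.

```python
from array import array

# A memoryview over an array exposes the item format and shape directly,
# so a consumer does not have to be told them separately.
data = array("d", [1.0, 2.0, 3.0])
view = memoryview(data)

item_format = view.format   # struct-style type code of each item
item_size = view.itemsize   # bytes per item
shape = view.shape          # one dimension of length 3

# A flat block of 48 zero bytes reinterpreted as a 3 x 2 grid of doubles.
raw = memoryview(bytes(48))
grid = raw.cast("d", shape=[3, 2])
rows = grid.tolist()
```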

People were uncomfortable with the many datatype variants -- the constructor accepted types, strings, lists or dicts, each of which could specify the structure in a different way. Also, a number of people questioned why the existing ctypes mechanisms for describing binary data couldn't be used instead, particularly since ctypes could already describe things like function pointers and recursive types, which the pre-PEP could not. Travis said he was looking for a way to unify the data formats of the array, struct, numpy and ctypes modules, and felt that the ctypes approach was too verbose for use in the other modules. In particular, he felt that the ctypes use of type objects as binary-format specifiers was problematic because type objects were harder to manipulate at the C level.
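
To make the ctypes side of the argument concrete, here is a minimal sketch of how ctypes describes the same coords/address structure, plus a recursive type and a function pointer type. The class and field names (Record, Node, Callback) are chosen for illustration and do not come from the thread.

```python
import ctypes


class Record(ctypes.Structure):
    """ctypes description of: float coords[3*6]; char address[30]."""
    _fields_ = [
        ("coords", ctypes.c_float * 18),
        ("address", ctypes.c_char * 30),
    ]


class Node(ctypes.Structure):
    """A recursive type: each node holds a pointer to another Node."""


# The fields are assigned after the class exists so it can refer to itself.
Node._fields_ = [
    ("value", ctypes.c_double),
    ("next", ctypes.POINTER(Node)),
]

# A function pointer type: int (*)(double).
Callback = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_double)

tail = Node(2.0)                        # "next" defaults to a NULL pointer
head = Node(1.0, ctypes.pointer(tail))  # head -> tail
```

Each description here is a Python class, which shows both the expressiveness people pointed to and the verbosity Travis objected to.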

The discussion continued on into the next fortnight.

The "lazy strings" patch

Discussion continued on Larry Hastings's lazy strings patch, which would have delayed the evaluation of some string operations, like concatenation and slicing, until the result was actually needed. With his patch, repeated string concatenation could be used instead of the standard .join() idiom, and slices which were never used would never be rendered. Discussions of the patch showed that people were concerned about memory increases when a small slice of a very large string kept the large string around in memory. People also felt that a stronger motivation was necessary to justify complicating the string representation so much. Larry was pointed to some code that his patch would break, which was using ob_sval directly instead of calling PyString_AS_STRING() like it was supposed to. He was also referred to the Python 3000 list, where the recent discussions of string views were relevant and where his proposal might have a better chance of acceptance.
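
The idea of deferring concatenation can be modelled in a few lines of pure Python. This is a toy sketch only: the actual patch changed the C string object, and the LazyConcat class and its render method are invented here for illustration.

```python
class LazyConcat:
    """Toy model of a lazily concatenated string (not the actual C patch)."""

    def __init__(self, left, right=""):
        self.left = left
        self.right = right
        self._rendered = None

    def __add__(self, other):
        # O(1): record the operands instead of copying any characters.
        return LazyConcat(self, other)

    def render(self):
        """Build the real string once, the first time it is needed."""
        if self._rendered is None:
            parts = []
            stack = [self]
            while stack:
                node = stack.pop()
                if isinstance(node, LazyConcat):
                    if node._rendered is not None:
                        parts.append(node._rendered)
                    else:
                        # Push right first so that left is visited first.
                        stack.append(node.right)
                        stack.append(node.left)
                else:
                    parts.append(node)
            self._rendered = "".join(parts)
            self.left = self.right = None  # the operand tree is no longer needed
        return self._rendered


# The standard idiom: collect the pieces, then join once.
joined = "".join(["a", "b", "c", "d"])

# The lazy version: repeated "+" only builds a tree until render() is called.
lazy = LazyConcat("a")
for piece in "bcd":
    lazy = lazy + piece
rendered = lazy.render()
```

Until render() runs, every operand stays referenced by the tree, which is the same kind of retention people worried about for slices of large strings.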

PEP 355 status

BJörn Lindqvist wanted to wrap up the loose ends of PEP 355 and asked whether the problem was the specific path object of PEP 355 or path objects in general. A number of people felt that some reorganization of the path-related functions could be helpful, but that trying to put everything into a single object was a mistake. Some important requirements for a reorganization of the path-related functions:

  • should divide the functions into coherent groups
  • should allow you to manipulate paths foreign to your OS

There were a few suggestions of possible new APIs, but no concrete implementations. People seemed hopeful that the issue could be resurrected for Python 3K, but no one appeared to be taking the lead.
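
As an illustration of the "foreign paths" requirement, the sketch below uses the ntpath and posixpath modules, which already existed at the time, alongside the pure path classes of pathlib, which entered the standard library much later. None of this is the PEP 355 API, and the example paths are made up.

```python
import ntpath
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths never touch the file system, so either flavour works on any OS.
win = PureWindowsPath(r"C:\Python25\Lib\os.py")
posix = PurePosixPath("/usr/lib/python2.5/os.py")

win_parts = (win.drive, win.name, win.suffix)
posix_parent = posix.parent

# The function-based modules can also be imported directly by name,
# regardless of which one os.path points at on the current platform.
win_base = ntpath.basename(r"C:\Python25\Lib\os.py")
posix_joined = posixpath.join("/usr", "lib", "python2.5")
```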

Discussion has largely moved to the python-3000 list, and there is a partial implementation.

[Thanks to Jim Jewett for an update to this summary]

Buildbots, configure changes and extension modules

Grig Gheorghiu, who's been taking care of the Python Community Buildbots, noticed that the buildbots started failing after a checkin that made changes to configure. Martin v. Löwis explained that even though a plain make will trigger a re-run of configure if it has changed, there is an issue with distutils not rebuilding when header files change, and so extension modules are sometimes not rebuilt. Contributions to fix that deficiency in distutils are welcome.

Martin also pointed out a handy way of forcing a buildbot to start with a clean build: ask the buildbot to build a non-existing branch. This causes the checkouts to be deleted and the build to fail. The next regular build will then start from scratch.

Sqlite versions

Skip Montanaro ran into some problems running test_sqlite on OS X, where he was getting a number of "ProgrammingError: library routine called out of sequence" errors. These errors appeared reliably when test_sqlite was run immediately after ctypes' test_find. When he started linking to sqlite 3.1.3 instead of sqlite 3.3.8, the problems went away. Barry Warsaw mentioned that he had run into similar troubles when he tried to upgrade from 3.2.1 to 3.2.8.
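
When chasing a problem like this, it helps to confirm which SQLite library the interpreter is actually linked against. A minimal check using the sqlite3 module's version attributes:

```python
import sqlite3

# Version of the SQLite C library that the sqlite3 module is linked against.
linked_version = sqlite3.sqlite_version            # a string
linked_version_info = sqlite3.sqlite_version_info  # a tuple of integers

print("SQLite library:", linked_version)

# Tuples compare element by element, so version checks are straightforward.
older_than_3_3_8 = linked_version_info < (3, 3, 8)
```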

Threads, generators, exceptions and segfaults

Mike Klaas managed to provoke a segfault in Python 2.5 using threads, generators and exceptions. Tim Peters was able to whittle Mike's problem down to a relatively simple test case, where a generator was created within a thread, and then the thread vanished before the generator had exited. The segfault was a result of Python's attempt to clean up the abandoned generator, during which it tried to access the generator's already free()'d thread state. No clear solution to this problem had been decided on at the time of this summary.
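
The general shape of that scenario can be sketched as follows. This is an assumed reconstruction from the description above, not Tim's actual test case, and all of the names are made up. Current Python versions handle it without crashing.

```python
import threading

events = []
leaked = []


def ticker():
    try:
        yield "started"
    finally:
        # Runs when the suspended generator is finally cleaned up.
        events.append("generator finalized")


def worker():
    gen = ticker()
    events.append(next(gen))  # the generator is now suspended inside "try"
    leaked.append(gen)        # it outlives the thread that started it


thread = threading.Thread(target=worker)
thread.start()
thread.join()                 # the creating thread is gone at this point

abandoned = leaked.pop()
abandoned.close()             # cleanup happens from a different thread
```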

ctypes and win64

Previously, Thomas Heller had asked that ctypes be removed from the Python 2.5 win64 MSI installers since it did not work for that platform at the time. Since then, Thomas integrated some patches in the trunk so that _ctypes could be built for win64/AMD64. Backporting these fixes to Python 2.5 would have meant that, while the MSI installer would still not include it, _ctypes could be built from a source distribution on win64/AMD64. It was unclear whether this would constitute a bugfix (in which case the backport would be okay) or a feature (in which case it wouldn't).

Python 2.3.X and 2.4.X retired

Anthony Baxter pushed out a Python 2.4.4 release and was pushing out the Python 2.3.6 source release as well. He indicated that once 2.3.6 was out, both of these branches could be officially retired.

Producing bytecode from Python 2.5 ASTs

Michael Spencer offered up his compiler2 module, a rewrite of the compiler module which allows bytecode to be produced from _ast.AST objects. Currently, it produces almost identical output to __builtin__.compile for all the stdlib modules and their tests. He asked for feedback on what would be necessary to make it ready for the stdlib, but received no responses.
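
For readers who want to experiment with the AST-to-bytecode step, the sketch below uses the ast module and the built-in compile() of current Python, which accepts AST objects directly. It does not use compiler2, whose API is not described here.

```python
import ast

source = "result = 6 * 7"

# Step 1: source text to an AST.
tree = ast.parse(source, filename="<example>", mode="exec")

# Step 2: AST to a code object; compile() accepts the tree in place of text.
code = compile(tree, filename="<example>", mode="exec")

# Step 3: run the code object in a fresh namespace.
namespace = {}
exec(code, namespace)
```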

Epilogue

This is a summary of traffic on the python-dev mailing list from October 16, 2006 through October 31, 2006. It is intended to inform the wider Python community of on-going developments on the list on a semi-monthly basis. An archive of previous summaries is available online.

An RSS feed of the titles of the summaries is available. New summaries are also posted to comp.lang.python and comp.lang.python.announce, which can be read through their email gateways, python-list and python-announce respectively (see http://mail.python.org).

This python-dev summary is the 15th written by Steve Bethard.

To contact me, please send email:

  • Steve Bethard (steven.bethard at gmail.com)

Do not post to comp.lang.python if you wish to reach me.

The Python Software Foundation is the non-profit organization that holds the intellectual property for Python. It also tries to advance the development and use of Python. If you find the python-dev Summary helpful, please consider making a donation. You can make a donation at http://python.org/psf/donations.html . Every cent counts, so even a small donation by credit card, check, or PayPal helps.

Commenting on Topics

To comment on anything mentioned here, just post to comp.lang.python (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!

How to Read the Summaries

This summary is written using reStructuredText. Any unfamiliar punctuation is probably markup for reST (otherwise it is probably regular expression syntax or a typo :); you can safely ignore it. We do suggest learning reST, though; it's simple and is accepted for PEP markup and can be turned into many different formats like HTML and LaTeX.