python-dev Summary for 2006-08-01 through 2006-08-15

[The HTML version of this Summary is available at http://www.python.org/dev/summary/2006-08-01_2006-08-15]

Summaries

Mixing str and unicode dict keys

Ralf Schmitt noted that in Python head, inserting str and unicode keys into the same dictionary would sometimes raise UnicodeDecodeErrors:

>>> d = {}
>>> d[u'm\xe1s'] = 1
>>> d['m\xe1s'] = 1
Traceback (most recent call last):
  ...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)

This error showed up as a result of Armin Rigo's patch to stop dict lookup from hiding exceptions, which meant that the UnicodeDecodeError raised when a str object is compared to a non-ASCII unicode object was no longer silenced. In the end, people agreed that UnicodeDecodeError should not be raised for equality comparisons, and that in general, __eq__() methods should not raise exceptions. But comparing str and unicode objects is often a programming error, so in addition to just returning False, equality comparisons between str and non-ASCII unicode objects now issue a warning with the UnicodeDecodeError message.
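The point that __eq__() methods should not raise can be illustrated with a small sketch in present-day Python 3, where dict lookup also lets exceptions from key comparison escape. StrictKey, LenientKey and lookup_raises are hypothetical names made up for this illustration; they are not taken from the thread or from the patch.

```python
class StrictKey:
    """Hypothetical key whose __eq__ raises when given a foreign type."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        if not isinstance(other, StrictKey):
            raise TypeError("cannot compare StrictKey with %s"
                            % type(other).__name__)
        return self.value == other.value


class LenientKey(StrictKey):
    """Same key, but __eq__ declines foreign types instead of raising."""

    def __eq__(self, other):
        if not isinstance(other, LenientKey):
            return NotImplemented  # the comparison ends up as "not equal"
        return self.value == other.value

    # Defining __eq__ in a class body would otherwise disable hashing.
    __hash__ = StrictKey.__hash__


def lookup_raises(key_type):
    """Return True if a dict lookup with a colliding hash raises TypeError."""
    table = {key_type(1): "stored"}
    try:
        # hash(1) equals hash(key_type(1)), so the two keys get compared.
        1 in table
    except TypeError:
        return True
    return False
```

Under these assumptions, lookup_raises(StrictKey) reports that the exception escapes the lookup, while lookup_raises(LenientKey) does not, because the comparison simply comes out as unequal.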

Contributing threads:

Rounding floats to ints

Bob Ippolito pointed out a long-standing bug in the struct module where floats were automatically converted to ints. Michael Urman showed a simple case that would provoke an exception if the bug were fixed:

pack('>H', round(value * 32768))

The source of this bug is the expectation that round() returns an int, when it actually returns a float. There was then some discussion about splitting the round functionality into two functions: __builtin__.round() which would round floats to ints, and math.round() which would round floats to floats. There was also some discussion about the optional argument to round() which currently specifies the number of decimal places to round to -- a number of folks felt that it was a mistake to round to decimal places when a float can only truly reflect binary places.
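One way to avoid relying on the type that round() returns is to convert explicitly before packing. The following is a minimal sketch for present-day Python 3, where struct rejects floats for integer formats; the helper names pack_fixed_point and float_is_rejected are assumptions made for this example.

```python
import struct


def pack_fixed_point(value):
    """Pack a fraction as a big-endian unsigned 16-bit fixed-point value."""
    # Convert explicitly so the code does not depend on what round() returns.
    scaled = int(round(value * 32768))
    return struct.pack('>H', scaled)


def float_is_rejected():
    """Return True if struct refuses a float for an integer format code."""
    try:
        struct.pack('>H', 16384.0)
    except struct.error:
        return True
    return False
```

The concern about decimal places can be seen in the same setting: round(2.675, 2) gives 2.67 rather than 2.68, because 2.675 has no exact binary representation.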

In the end, there were no definite conclusions about the future of round(), but it seemed like the discussion might be resumed on the Python 3000 list.

Contributing threads:

Assigning to function calls

Neal Becker proposed that code like X() += 2 be allowed so that you could call __iadd__ on objects immediately after creation. People pointed out that allowing augmented assignment is misleading when no assignment can occur, and it would be better just to call the method directly, e.g. X().__iadd__(2).
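A short sketch of the suggested alternative, written for present-day Python 3. Accumulator and is_valid_syntax are hypothetical names used only for this illustration.

```python
class Accumulator:
    """Hypothetical class that supports in-place addition."""

    def __init__(self):
        self.total = 0

    def __iadd__(self, amount):
        self.total += amount
        return self


def is_valid_syntax(source):
    """Return True if the source text compiles as Python statements."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Calling the method directly works on a freshly created object.
result = Accumulator().__iadd__(2)
```

Here is_valid_syntax("Accumulator() += 2") is false, since a call is not a valid target for augmented assignment, while the direct method call compiles and leaves result.total at 2.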

Contributing threads:

PEP 357: Integer clipping and __index__

After some further discussion on the __index__ issue of last fortnight, Travis E. Oliphant proposed a patch for __index__ that introduced three new C API functions:

  • PyIndex_Check(obj) -- checks for nb_index
  • PyObject* PyNumber_Index(obj) -- calls nb_index if possible or raises a TypeError
  • Py_ssize_t PyNumber_AsSsize_t(obj, err) -- converts the object to a Py_ssize_t, raising err on overflow

After a few minor edits, this patch was checked in.
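At the Python level, the effect of __index__ can be sketched as follows in present-day Python 3. operator.index() plays the role of PyNumber_Index here; the Position class and the index_rejects helper are hypothetical names invented for this example.

```python
import operator


class Position:
    """Hypothetical integer-like object usable wherever an index is needed."""

    def __init__(self, n):
        self.n = n

    def __index__(self):
        return self.n


items = ['a', 'b', 'c']

# Indexing accepts any object that defines __index__.
second = items[Position(1)]

# operator.index() returns the underlying integer.
as_int = operator.index(Position(2))

# Slice bounds that do not fit in a Py_ssize_t are clipped, not rejected.
clipped = items[:Position(10 ** 100)]


def index_rejects(obj):
    """Return True if obj cannot be used as an index (TypeError)."""
    try:
        operator.index(obj)
    except TypeError:
        return True
    return False
```

Floats do not define __index__, so index_rejects(1.5) is true, which is the TypeError case described for PyNumber_Index.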

Contributing threads:

OpenSSL and Windows binaries

Jim Jewett pointed out that a default build of OpenSSL includes the patented IDEA cipher, and asked whether that needed to be kept out of the Windows binary versions. There was some concern about dropping a feature, but Gregory P. Smith pointed out that IDEA isn't directly exposed to any Python user, and suggested that IDEA should never be required by any sane SSL connection. Martin v. Löwis promised to look into making the change.

Update: The change was checked in before 2.5 was released.

Contributing threads:

Type of range object members

Alexander Belopolsky proposed making the members of the range() object use Py_ssize_t instead of C longs. Guido indicated that this was basically wasted effort -- in the long run, the members should be PyObject* so that they can handle Python longs correctly, so converting them to Py_ssize_t would be an intermediate step that wouldn't help in the transition.
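For comparison, range objects in present-day Python 3 do accept bounds larger than a C long. This small sketch only illustrates that behaviour; it says nothing about how the 2.5-era object was implemented, and the variable names are arbitrary.

```python
# Bounds well beyond the range of a 64-bit C long.
start = 10 ** 20
big = range(start, start + 5)

third = big[2]                 # indexing computes with Python integers
contains = (start + 3) in big  # membership test works without iterating
```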

There was then some discussion about the int and long types in Python 3000, with Guido suggesting two separate implementations that would be mostly hidden at the Python level.

Contributing thread:

Distutils version number

A user noted that Python 2.4.3 shipped with distutils 2.4.1 while the version number of distutils in the repository was only 2.4.0, and requested that Python 2.5 include the newer distutils. In fact, the newest distutils was already the one in the repository, but its version number had not been bumped appropriately. For a short while, the distutils version number was generated automatically from the Python one, but Marc-Andre Lemburg volunteered to bump it manually so that it would be easier to use the SVN distutils with a different Python version.

Contributing threads:

Dict containment and unhashable items

tomer filiba suggested that dict.__contains__ should return False instead of raising a TypeError in situations like:

>>> a={1:2, 3:4}
>>> [] in a
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list objects are unhashable

Guido suggested that swallowing the TypeError here would be a mistake as it would also swallow any TypeErrors produced by faulty __hash__() methods.
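Guido's objection can be sketched with a wrapper that behaves the way the proposal would. This is written for present-day Python 3; lenient_contains and BrokenHash are hypothetical names, and the bug inside __hash__() is deliberate.

```python
def lenient_contains(mapping, key):
    """Hypothetical containment test that swallows TypeError."""
    try:
        return key in mapping
    except TypeError:
        return False


class BrokenHash:
    """Hypothetical class whose __hash__ contains a programming error."""

    def __hash__(self):
        return len(None)  # deliberate bug: raises TypeError


table = {1: 2, 3: 4}

unhashable_result = lenient_contains(table, [])        # False, as proposed
buggy_result = lenient_contains(table, BrokenHash())   # also False
```

Both calls return False, so the genuine bug in BrokenHash.__hash__() becomes indistinguishable from an ordinary unhashable key.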

Contributing threads:

Returning longs from __hash__()

Armin Rigo pointed out that Python 2.5's change that allows id() to return ints or longs would have caused some breakage for custom hash functions like:

def __hash__(self):
    return id(self)

Though it has long been documented that the result of id() is not suitable as a hash value, code like this is apparently common. So Martin v. Löwis and Armin arranged for PyLong_Type.tp_hash to be called in the code for hash().
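The effect on user code can be sketched in present-day Python 3, where hash() still accepts an arbitrarily large integer from __hash__() and reduces it to the platform's hash width. The class names and the fits_in_ssize_t helper are assumptions made for this example.

```python
import sys


class IdHashed:
    """The pattern from the thread: hash an object by its id()."""

    def __hash__(self):
        return id(self)


class HugeHash:
    """Hypothetical class returning a value far too large for a C integer."""

    def __hash__(self):
        return 2 ** 100


def fits_in_ssize_t(n):
    """Return True if n lies within the range of a Py_ssize_t."""
    return -sys.maxsize - 1 <= n <= sys.maxsize
```

In both cases hash() returns a value for which fits_in_ssize_t() is true, even though HugeHash.__hash__() itself returns a much larger number.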

Contributing thread:

instancemethod builtin

Nick Coghlan suggested adding an instancemethod() builtin along the lines of staticmethod() and classmethod() which would allow arbitrary callables to act more like functions. In particular, Nick was considering code like:

class C(object):
    method = some_callable

Currently, if some_callable did not define the __get__() method, C().method would not bind the C instance as the first argument. By introducing instancemethod(), this problem could be solved like:

class C(object):
    method = instancemethod(some_callable)

There wasn't much of a reaction one way or another, so it looked like the idea would at least temporarily be shelved.
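The binding behaviour Nick described can be approximated in pure Python with a small descriptor. The following is only a sketch in present-day Python 3 under that assumption, not the builtin that was proposed; the Describe class is a hypothetical callable invented for the example.

```python
import types


class instancemethod:
    """Sketch of a descriptor that binds any callable to the instance."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func  # accessed on the class: nothing to bind
        return types.MethodType(self.func, obj)


class Describe:
    """Hypothetical callable object; it defines no __get__ of its own."""

    def __call__(self, instance, suffix):
        return type(instance).__name__ + suffix


some_callable = Describe()


class C(object):
    plain = some_callable                  # not bound on attribute access
    bound = instancemethod(some_callable)  # instance passed as first argument
```

With these definitions, C().bound("!") returns "C!", while C().plain("!") fails with a TypeError because the instance is never passed in.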

Contributing thread:

Unicode versions and unicodedata

Armin Ronacher noted that Python 2.5 implements Unicode 4.1 but while a ucd_3_2_0 object is available (implementing Unicode 3.2), no ucd_4_1_0 object is available. Martin v. Löwis explained that the ucd_3_2_0 object is only available because IDNA needs it, and that there are no current plans to expose any other Unicode versions (and that ucd_3_2_0 may go away when IDNA no longer needs it).
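The two databases can be compared directly through their version strings. This sketch assumes a Python 3 installation in which ucd_3_2_0 is still present; as noted above, that object is not guaranteed to stay.

```python
import unicodedata

# Version of the Unicode database used by the module's main functions.
current_version = unicodedata.unidata_version

# Version reported by the frozen database kept for IDNA.
legacy_version = unicodedata.ucd_3_2_0.unidata_version
```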

Contributing thread:

ElementTree and Namespaces

Elements (and attributes) can be associated with a namespace, such as

http://www.w3.org/XML/1998/namespace:id

The xmlns attribute creates a "prefix" (alias) for a namespace, so that you can abbreviate the above as

xml:id

ElementTree treats the prefix as just an aid to human readers, and creates its own abbreviations that are consistent throughout a document. Some tools (including W3C recommendations for canonicalization) treat the prefix itself as meaningful.
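The prefix handling can be observed with the xml.etree.ElementTree module in present-day Python 3. In this sketch the namespace URI http://example.com/ns and the prefix "a" are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The document declares its own prefix "a" for a made-up namespace.
SOURCE = ('<a:root xmlns:a="http://example.com/ns" xml:id="r1">'
          '<a:child/></a:root>')

root = ET.fromstring(SOURCE)

# Names are stored with the full namespace URI, not the prefix.
tag = root.tag
id_attr = root.get('{http://www.w3.org/XML/1998/namespace}id')

# On output, ElementTree chooses its own prefix for the namespace.
serialized = ET.tostring(root, encoding='unicode')
```

After parsing, tag holds the name in the form {http://example.com/ns}root, and the serialized text uses a generated prefix in place of the original "a".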

ElementTree may support this in version 1.3, but it wasn't going to be there in time for 2.5, and it wasn't judged important enough to keep etree out of the release.

If you need it sooner, then http://codespeak.net/lxml supports the etree API and does retain prefixes.

Contributing thread:

[Thanks to Jim Jewett for this summary.]

Epilogue

This is a summary of traffic on the python-dev mailing list from August 01, 2006 through August 15, 2006. It is intended to inform the wider Python community of on-going developments on the list on a semi-monthly basis. An archive of previous summaries is available online.

An RSS feed of the titles of the summaries is available. You can also watch comp.lang.python or comp.lang.python.announce for new summaries (or follow them through their email gateways, python-list and python-announce respectively, as found at http://mail.python.org).

This python-dev summary is the 10th written by Steve Bethard.

To contact me, please send email:

  • Steve Bethard (steven.bethard at gmail.com)

Do not post to comp.lang.python if you wish to reach me.

The Python Software Foundation is the non-profit organization that holds the intellectual property for Python. It also tries to advance the development and use of Python. If you find the python-dev Summary helpful please consider making a donation. You can make a donation at http://python.org/psf/donations.html . Every cent counts so even a small donation with a credit card, check, or by PayPal helps.

Commenting on Topics

To comment on anything mentioned here, just post to comp.lang.python (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!

How to Read the Summaries

This summary is written using reStructuredText. Any unfamiliar punctuation is probably markup for reST (otherwise it is probably regular expression syntax or a typo :); you can safely ignore it. We do suggest learning reST, though; it's simple and is accepted for PEP markup and can be turned into many different formats like HTML and LaTeX.