[New-bugs-announce] [issue28055] pyhash's siphash24 assumes alignment of the data pointer

Matthias Klose report at bugs.python.org
Fri Sep 9 19:29:02 EDT 2016


New submission from Matthias Klose:

pyhash's siphash24 assumes that the data pointer is aligned: it casts a void pointer (src) to a uint64_t pointer, which raises the required alignment from 1 to 4 bytes. That's invalid code; siphash24 can't assume that the pointer to the data being hashed is 4-byte aligned.

Seen as a bus error when running an ARM32 binary on an AArch64 kernel.

./python -c 'import datetime; print(hash(datetime.datetime(2015, 1, 1)))'

The datetime type is defined as

#define _PyTZINFO_HEAD \
    PyObject_HEAD \
    Py_hash_t hashcode; \
    char hastzinfo; /* boolean flag */

typedef struct
{
    _PyTZINFO_HEAD
    unsigned char data[_PyDateTime_DATE_DATASIZE];
} PyDateTime_Date;

and data is used to calculate the hash of the object. Because data is not 4-byte aligned, you get the bus error. Inserting three fill bytes so that the data member becomes 4-byte aligned solves the issue; however, that is an ABI change which makes the new datetime ABI incompatible, and we don't know about the alignment of objects outside the standard library.
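To make the misalignment concrete, here is a minimal sketch of the layout quoted above. The demo_head and demo_date types are hypothetical stand-ins, not the real CPython definitions: the header is assumed to be two pointer-sized fields, intptr_t stands in for Py_hash_t, and the array length 4 is only a placeholder for _PyDateTime_DATE_DATASIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyObject_HEAD: two pointer-sized fields.
   This is an assumption for illustration, not the CPython definition. */
typedef struct {
    intptr_t refcnt;
    void *type;
} demo_head;

/* Same field order as the struct quoted above:
   header, hash code, one-byte flag, then the data bytes. */
typedef struct {
    demo_head head;
    intptr_t hashcode;        /* stands in for Py_hash_t */
    char hastzinfo;           /* boolean flag */
    unsigned char data[4];    /* placeholder for _PyDateTime_DATE_DATASIZE */
} demo_date;

/* Number of bytes by which data[] lies past a 4-byte boundary,
   measured from the start of the struct. */
static size_t demo_data_misalignment(void)
{
    return offsetof(demo_date, data) % 4;
}
```

Under these assumptions, on common 32-bit and 64-bit ABIs the one-byte flag leaves data one byte past a 4-byte boundary, so a pointer to data cannot safely be dereferenced as a wider integer type on a target that traps on unaligned access.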

The solution is to use memcpy instead of the cast to uint64_t. For now this is limited to the little-endian ARM targets, but I don't see why memcpy cannot always be used on little-endian targets instead of the cast.
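The idea can be sketched as follows. This is not the contents of pyhash.diff; load_u64_unaligned is a hypothetical helper name chosen for illustration, and the value is returned in the host's native byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the name used in pyhash.c or in the patch).
   Reads 8 bytes from src, which may have any alignment. */
static uint64_t load_u64_unaligned(const void *src)
{
    uint64_t v;
    /* memcpy copies byte by byte as far as the language is concerned,
       so src needs no particular alignment. */
    memcpy(&v, src, sizeof v);
    return v;
}

/* The pattern the report objects to, shown only as a comment:
 *
 *     uint64_t v = *(const uint64_t *)src;
 *
 * This is only valid if src is suitably aligned for uint64_t. */
```

Compilers commonly turn a fixed-size memcpy like this into a plain load on targets that allow unaligned access, so the portable form is not expected to cost much there, although that depends on the compiler and flags.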

----------
assignee: doko
components: Interpreter Core
files: pyhash.diff
keywords: patch
messages: 275493
nosy: doko
priority: normal
severity: normal
status: open
title: pyhash's siphash24 assumes alignment of the data pointer
versions: Python 3.5, Python 3.6
Added file: http://bugs.python.org/file44514/pyhash.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28055>
_______________________________________

