[Python-Dev] Cost-Free Slice into FromString constructors--Long

Tim Peters tim.peters at gmail.com
Thu May 25 18:20:52 CEST 2006


[Jean-Paul Calderone]
>> ...
>> Hmm, one reason could be that the general solution doesn't work:
>>
>>   exarkun at kunai:~$ python
>>   Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
>>   [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
>>   Type "help", "copyright", "credits" or "license" for more information.
>>   >>> long(buffer('1234', 0, 3))
>>   Traceback (most recent call last):
>>     File "<stdin>", line 1, in ?
>>   ValueError: null byte in argument for long()
>>   >>> long(buffer('123a', 0, 3))
>>   Traceback (most recent call last):
>>     File "<stdin>", line 1, in ?
>>   ValueError: invalid literal for long(): 123a

[Bob Ippolito]
> One problem with buffer() is that it does a memcpy of the buffer.

It does not, at least not in the cases above.  In fact, that's
essentially _why_ they fail now.  Here's what actually happens (in
outline) for the first example:  a buffer object is created that
merely points to the "1234" string object, recording the offset of 0
and the length of 3.  PyLong_FromString() only sees the starting
address and, as it always does, parses until it hits a character
that doesn't make sense for the input base.  That happens to be the
NUL at the end of the "1234" string object guts (Python string objects
are always NUL-terminated, BTW).

PyLong_FromString() is perfectly happy, but it converted "1234" rather
than "123".  It's PyLong_FromString()'s _caller_ that's unhappy,
because PyLong_FromString() also told the caller that it stopped
parsing at offset 4, and then this check in the caller fires:

	if (end != s + len) {
		PyErr_SetString(PyExc_ValueError,
				"null byte in argument for long()");

The assumption here is bad:  the caller assumes that if end != s+len,
it must be the case that PyLong_FromString() stopped parsing _before_
hitting the end, and did so _because_ it hit an embedded NUL byte.
Neither is true in this case, so the error message is senseless.
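
The mismatch described above can be reproduced in plain C without any
Python internals.  What follows is only a sketch: strtol() stands in
for PyLong_FromString() (the two differ in details such as trailing
garbage and overflow handling), and parse_view() is a hypothetical name
for the caller-side check, not anything in the CPython source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the caller's check.  The parser (strtol here,
 * as an assumption) sees only the start address `s`, so it keeps going
 * until the first character that is not a decimal digit, regardless of
 * the `len` the view asked for.
 * Returns 0 if parsing stopped exactly at s + len, -1 otherwise. */
static int parse_view(const char *s, size_t len,
                      long *value, size_t *stopped_at)
{
    char *end;

    assert(s != NULL && value != NULL && stopped_at != NULL);
    *value = strtol(s, &end, 10);
    *stopped_at = (size_t)(end - s);   /* offset where parsing stopped */
    return (end == s + len) ? 0 : -1;
}
```

Called on the NUL-terminated string "1234" with a length of 3, this
sketch converts all four digits and reports a stop offset of 4, so the
end-pointer comparison fails even though there is no embedded NUL byte
and parsing did not stop early.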

In any case, none of this would have happened if the buffer object had
done a memcpy of the 3 requested bytes, and NUL-terminated it.
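
For contrast, a minimal sketch of that copying approach, again with
strtol() as an assumed stand-in for the real parser and with
parse_copied_view() as a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the `len` requested bytes into a fresh buffer,
 * NUL-terminate the copy, and parse the copy instead of the original.
 * Returns 0 on success with the result in *value; returns -1 if the view
 * is empty, allocation fails, or the parser did not consume all `len`
 * bytes of the copy. */
static int parse_copied_view(const char *s, size_t len, long *value)
{
    char *copy;
    char *end;
    int ok;

    assert(s != NULL && value != NULL);
    if (len == 0)
        return -1;
    copy = (char *)malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);
    copy[len] = '\0';                  /* parser now stops at the view's end */

    *value = strtol(copy, &end, 10);
    ok = (end == copy + len);
    free(copy);
    return ok ? 0 : -1;
}
```

With the copy in place, a view of the first 3 bytes of "1234" or of
"123a" converts to 123 in this sketch, at the price of an allocation
and a memcpy per call.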

> A zero-copy version of buffer (a view on some object that implements
> the buffer API) would be nice.

Above we already had that.  The internal parsing APIs don't currently
support the buffer's "offset & length" view of the world, so have no
chance of working as hoped with any kind of buffer object now.
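
To make "offset & length" support concrete, here is a rough sketch of
what a length-aware parse could look like.  It is purely illustrative:
parse_decimal_bounded() is a hypothetical function, it handles only
unsigned decimal digits, and it produces a C long rather than an
arbitrary-precision Python long.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical length-aware parser: examines exactly `len` bytes starting
 * at `s` and never reads s[len], so no terminating NUL is required.
 * Returns 0 on success with the result in *value; returns -1 on an empty
 * view, a non-digit byte inside the view, or overflow of a C long. */
static int parse_decimal_bounded(const char *s, size_t len, long *value)
{
    long acc = 0;
    size_t i;

    assert(s != NULL && value != NULL);
    if (len == 0)
        return -1;
    for (i = 0; i < len; i++) {
        int digit;

        if (s[i] < '0' || s[i] > '9')
            return -1;
        digit = s[i] - '0';
        if (acc > (LONG_MAX - digit) / 10)
            return -1;                 /* acc * 10 + digit would overflow */
        acc = acc * 10 + digit;
    }
    *value = acc;
    return 0;
}
```

Because the loop is bounded by the length rather than by a sentinel
character, a zero-copy view of the first 3 bytes of "1234" or "123a"
yields 123 here without touching the byte that follows the view.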

