[Python-Dev] PEP 393 review
"Martin v. Löwis"
martin at v.loewis.de
Mon Aug 29 21:20:53 CEST 2011
On 29.08.2011 11:03, Dirkjan Ochtman wrote:
> On Sun, Aug 28, 2011 at 21:47, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> result strings. In PEP 393, a buffer must be scanned for the
>> highest code point, which means that each byte must be inspected
>> twice (a second time when the copying occurs).
>
> This may be a silly question: are there things in place to optimize
> this for the case where two strings are combined? E.g. highest
> character in combined string is max(highest character in either of the
> strings).
Unicode_Concat goes like this:

    maxchar = PyUnicode_MAX_CHAR_VALUE(u);
    if (PyUnicode_MAX_CHAR_VALUE(v) > maxchar)
        maxchar = PyUnicode_MAX_CHAR_VALUE(v);

    /* Concat the two Unicode strings */
    w = (PyUnicodeObject *) PyUnicode_New(
        PyUnicode_GET_LENGTH(u) +
        PyUnicode_GET_LENGTH(v),
        maxchar);
    if (w == NULL)
        goto onError;
    PyUnicode_CopyCharacters(w, 0, u, 0, PyUnicode_GET_LENGTH(u));
    PyUnicode_CopyCharacters(w, PyUnicode_GET_LENGTH(u), v, 0,
                             PyUnicode_GET_LENGTH(v));

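
To see the idea in isolation, here is a minimal standalone sketch that does not use the CPython API at all. The type toy_str and the functions toy_new, toy_concat and toy_char_width are hypothetical names made up for this illustration, and the toy always stores 32-bit code points rather than the compact representations of PEP 393. It only shows that a concatenation can take the larger of the two cached maxima instead of rescanning the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string that caches its highest code point.
   This is NOT CPython's PyUnicodeObject. */
typedef struct {
    size_t length;
    uint32_t maxchar;   /* highest code point stored in data */
    uint32_t *data;
} toy_str;

/* Bytes per character implied by maxchar (1, 2 or 4), as in PEP 393. */
int toy_char_width(uint32_t maxchar)
{
    if (maxchar < 0x100)
        return 1;
    if (maxchar < 0x10000)
        return 2;
    return 4;
}

/* Allocate a string of n code points with a known maxchar; no scan. */
toy_str *toy_alloc(size_t n, uint32_t maxchar)
{
    toy_str *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = malloc((n ? n : 1) * sizeof *s->data);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    s->length = n;
    s->maxchar = maxchar;
    return s;
}

/* Build from a raw buffer: here the buffer must be scanned once. */
toy_str *toy_new(const uint32_t *buf, size_t n)
{
    uint32_t maxchar = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] > maxchar)
            maxchar = buf[i];

    toy_str *s = toy_alloc(n, maxchar);
    if (s != NULL && n > 0)
        memcpy(s->data, buf, n * sizeof *buf);
    return s;
}

/* Concatenate without rescanning: the maximum of the result is the
   larger of the two cached maxima. */
toy_str *toy_concat(const toy_str *u, const toy_str *v)
{
    assert(u != NULL && v != NULL);
    uint32_t maxchar = u->maxchar;
    if (v->maxchar > maxchar)
        maxchar = v->maxchar;

    toy_str *w = toy_alloc(u->length + v->length, maxchar);
    if (w == NULL)
        return NULL;
    if (u->length > 0)
        memcpy(w->data, u->data, u->length * sizeof *u->data);
    if (v->length > 0)
        memcpy(w->data + u->length, v->data, v->length * sizeof *v->data);
    return w;
}

void toy_free(toy_str *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Only toy_new looks at every code point to find the maximum; toy_concat copies each input once and never inspects the characters for their value, which is the same shape as the Unicode_Concat code above.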
> Also, this PEP makes me wonder if there should be a way to distinguish
> between language PEPs and (CPython) implementation PEPs, by adding a
> tag or using the PEP number ranges somehow.
Well, no. This would equally apply to every single patch, and is just
not feasible. Instead, alternative implementations typically target a
CPython version, and then find out what features they need to implement
to claim conformance.
Regards,
Martin