[Python-Dev] PEP 393 Summer of Code Project

Wed Aug 24 04:39:49 CEST 2011

On Tue, Aug 23, 2011 at 08:15, Antoine Pitrou <solipsis at pitrou.net> wrote:
> So why would you need three separate implementation of the unrolled
> loop? You already have a macro named WRITE_FLEXIBLE_OR_WSTR.

The WRITE_FLEXIBLE_OR_WSTR macro does a check for kind and then
writes.  Using this macro for the fast path would be inefficient, to
have a real fast path, you would need a outer if to check for kind and
then in each condition body the matching access to the string (1, 2,
or 4 bytes) and for each body also write 4 or 8 times (guarded by
#ifdef, depending on platform).

As all these cases bloated up the C code, we went for the simple
solution with the goal of profiling the code again afterwards to see
where the new performance bottlenecks would be.

> Even without taking into account the unrolled loop, I wonder how much
> slower UTF-8 decoding becomes with that approach, by the way. Instead of
> testing the "kind" variable at each loop iteration, using a
> stringlib-like approach may be a better deal IMO.

To me this feels like this would complicate the C source code and
decrease readability.  For each function you would need a wrapper
which does the kind checking logic and then, in a separate file, the
implementation of the function which then gets included three times
for each character width.

Regards,
Torsten