flaming vs accuracy [was Re: Performance of int/long in Python 3]

Neil Hodgson nhodgson at iinet.net.au
Thu Mar 28 23:57:20 EDT 2013


MRAB:

> Implementing the regex module (http://pypi.python.org/pypi/regex) would
> have been more difficult if the internal representation had been UTF-8,
> because of the need to decode, and the implementation would also have
> been slower for that reason.

    One way to build regex support for UTF-8 is to write the regex code
against a fixed-width representation and then interpose an object that
converts between the UTF-8 data and the code points that code expects.
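
    A rough Python sketch of that arrangement, for illustration only. The
names Utf8Cursor, match_here and search are made up for this example and
are not taken from the regex module or any other library. The matcher
handles only literals and '.', and the cursor assumes well-formed UTF-8.
The point is that the matching code only ever sees whole code points,
while the cursor is the one place that knows about the byte encoding.

```python
class Utf8Cursor:
    """Hypothetical adapter: walks UTF-8 bytes, hands out whole code points."""

    def __init__(self, data: bytes, pos: int = 0):
        self.data = data
        self.pos = pos  # byte offset of the next code point

    def at_end(self) -> bool:
        return self.pos >= len(self.data)

    def next(self) -> int:
        """Decode the code point at self.pos and advance past it.

        Assumes the input is valid UTF-8; no error checking is done.
        """
        lead = self.data[self.pos]
        if lead < 0x80:              # 0xxxxxxx: 1 byte
            length, cp = 1, lead
        elif lead >> 5 == 0b110:     # 110xxxxx: 2 bytes
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:    # 1110xxxx: 3 bytes
            length, cp = 3, lead & 0x0F
        else:                        # 11110xxx: 4 bytes
            length, cp = 4, lead & 0x07
        for b in self.data[self.pos + 1 : self.pos + length]:
            cp = (cp << 6) | (b & 0x3F)  # fold in 6 bits per continuation byte
        self.pos += length
        return cp


def match_here(pattern: str, cur: Utf8Cursor) -> bool:
    """Toy fixed-width matcher: compares code points; '.' matches any one."""
    for want in pattern:
        if cur.at_end():
            return False
        got = cur.next()
        if want != "." and ord(want) != got:
            return False
    return True


def search(pattern: str, data: bytes):
    """Return the (start, end) byte offsets of the first match, or None."""
    start = Utf8Cursor(data)
    while True:
        probe = Utf8Cursor(data, start.pos)
        if match_here(pattern, probe):
            return (start.pos, probe.pos)
        if start.at_end():
            return None
        start.next()  # step to the next code point boundary, not the next byte


data = "naïve café".encode("utf-8")
print(search("caf.", data))  # (7, 12): '.' consumed the two-byte 'é' as one unit
```

    The decoding cost MRAB mentions is still there, inside next(); the
sketch only shows that it can be confined to the adapter rather than
spread through the matching code.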

    The C++11 standard library contains a regex template that can be 
instantiated over a UTF-8 representation in this way.

    Neil



