[pypy-dev] PyPy 2 unicode class
Armin Rigo
arigo at tunes.org
Thu Jan 23 18:13:41 CET 2014
Hi Oscar,
Thanks for explaining the caching in detail :-)
On Thu, Jan 23, 2014 at 2:27 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com> wrote:
> big saving. If the string comes from anything other than utf-8 the indexing
> cache can be built while decoding (and reencoding as utf-8 under the hood).
Actually, you need to walk the string even to do
"u = s.decode('utf-8')". The reason is that you need to check whether
the byte string is well-formed UTF-8. So it seems we can build the
cache eagerly in all cases.
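
To illustrate the idea, here is a minimal sketch of one pass that both checks well-formedness and records an indexing cache. It is not PyPy's actual code: the names scan_utf8 and byte_offset, the stride of 4, and the cache layout (the byte offset of every stride-th code point) are all assumptions made for the example.

```python
def scan_utf8(data, stride=4):
    """Validate UTF-8 in one pass and build an index cache as we go.

    Returns (number of code points, list of byte offsets), where entry j
    of the list is the byte offset of code point j * stride.
    Hypothetical helper, for illustration only.
    """
    offsets = []
    pos = 0
    count = 0
    n = len(data)
    while pos < n:
        if count % stride == 0:
            offsets.append(pos)
        lead = data[pos]
        # lo/hi bound the first continuation byte; this rejects overlong
        # forms, surrogates and code points above U+10FFFF.
        lo, hi = 0x80, 0xBF
        if lead < 0x80:
            length = 1
        elif 0xC2 <= lead <= 0xDF:
            length = 2
        elif 0xE0 <= lead <= 0xEF:
            length = 3
            if lead == 0xE0:
                lo = 0xA0
            elif lead == 0xED:
                hi = 0x9F
        elif 0xF0 <= lead <= 0xF4:
            length = 4
            if lead == 0xF0:
                lo = 0x90
            elif lead == 0xF4:
                hi = 0x8F
        else:
            raise ValueError("invalid lead byte at offset %d" % pos)
        if pos + length > n:
            raise ValueError("truncated sequence at offset %d" % pos)
        for k in range(1, length):
            if not (lo <= data[pos + k] <= hi):
                raise ValueError("invalid continuation byte at offset %d"
                                 % (pos + k))
            lo, hi = 0x80, 0xBF  # later continuation bytes use the full range
        pos += length
        count += 1
    return count, offsets


def byte_offset(data, offsets, index, stride=4):
    """Map a code point index to a byte offset using the cache.

    Assumes data was already validated by scan_utf8 with the same stride.
    """
    pos = offsets[index // stride]
    for _ in range(index % stride):
        lead = data[pos]
        pos += 1 if lead < 0x80 else 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
    return pos
```

With such a cache, finding code point i costs one list lookup plus at most stride - 1 short forward steps, and building it adds no extra pass over the bytes beyond the one the validity check already needs.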
See you soon,
Armin.