PEP 393 vs UTF-8 Everywhere

Chris Angelico rosuav at gmail.com
Fri Jan 20 18:36:03 EST 2017


On Sat, Jan 21, 2017 at 10:15 AM, Thomas Nyberg <tomuxiong at gmx.com> wrote:
> But I have one extra question. Is string indexing guaranteed to be
> constant-time for python? I thought so, but I couldn't find it documented
> anywhere. (Not that I think it practically matters, since it couldn't really
> change if it weren't for all the reasons you mentioned.) I found this which
> at least details (if not explicitly "guarantees") the complexity properties of
> other datatypes:
>

No, it isn't; this question came up in the context of MicroPython,
which chose to go UTF-8 internally instead of PEP 393. But the
considerations for uPy (MicroPython) are different - it's not designed
to handle gobs of data, so constant-time vs linear indexing isn't going
to have as much impact. In normal work, though, predictable string
performance matters. If indexing is linear, any loop that indexes its
way through a string becomes quadratic. You can't afford to deploy a
web application, test it, and then have someone send a large amount of
data at it, causing massive O(n^2) blowouts.
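
To make the linear-vs-constant point concrete, here is a rough sketch
of what indexing by code point costs when the text is held as UTF-8
bytes. The function name utf8_char_at is made up for illustration; this
is not how MicroPython or CPython actually implement it, just the
general shape of the problem.

```python
def utf8_char_at(buf: bytes, index: int) -> str:
    """Return the code point at `index` by scanning UTF-8 bytes from the start.

    Hypothetical helper for illustration only; assumes `buf` is valid UTF-8.
    """
    if index < 0:
        raise IndexError("negative indices are not handled in this sketch")
    seen = -1      # index of the most recent code point we have started
    start = None   # byte offset where the requested code point begins
    for pos, byte in enumerate(buf):
        # Continuation bytes look like 0b10xxxxxx; any other byte
        # starts a new code point.
        if byte & 0xC0 != 0x80:
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                # Reached the next code point, so the wanted one is complete.
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("string index out of range")
    # The requested code point was the last one in the buffer.
    return buf[start:].decode("utf-8")


text = "naïve €5"
buf = text.encode("utf-8")
# Each call rescans from byte 0, so this loop does O(n^2) work overall.
chars = [utf8_char_at(buf, i) for i in range(len(text))]
```

Each call has to walk from the start of the buffer, so the loop at the
bottom visits on the order of n^2/2 bytes for an n-character string. A
fixed-width representation such as PEP 393's can compute the offset
directly instead.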

ChrisA


