[issue19219] speed up marshal.loads()
Serhiy Storchaka
report at bugs.python.org
Fri Oct 11 13:57:43 CEST 2013
Serhiy Storchaka added the comment:
> - unmarshalling ASCII strings is faster: you can pass 127 to PyUnicode_New without scanning for non-ASCII chars
You should ensure that the loaded bytes are ASCII-only. Otherwise broken or malicious marshalled data will compromise your program. Decoding UTF-8 is as fast as decoding ASCII (with checks) and almost as fast as memcpy.
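To illustrate the check, here is a minimal Python sketch. The real code would be C inside marshal, so this only shows the idea. The helper name `load_ascii_string` and the error message are hypothetical, not part of any patch.

```python
def load_ascii_string(payload: bytes) -> str:
    """Hypothetical helper: decode a payload whose type code claims ASCII."""
    try:
        # decode("ascii") scans every byte and rejects anything >= 0x80,
        # so a corrupted or malicious payload cannot slip through.
        return payload.decode("ascii")
    except UnicodeDecodeError as exc:
        raise ValueError("bad marshal data (non-ASCII byte in ASCII string)") from exc


good = load_ascii_string(b"spam")

try:
    load_ascii_string(b"sp\xe4m")  # 0xE4 is not ASCII
    rejected = False
except ValueError:
    rejected = True
```

Skipping this scan is what would make the ASCII fast path unsafe on untrusted input.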
As for output, we could use the cached UTF-8 representation of the string (it always exists for ASCII-only strings) before calling PyUnicode_AsUTF8String().
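The cached representation can be observed from Python through ctypes. This is a sketch only: it assumes CPython, and the helper name `utf8_view` is made up for illustration. It calls the existing C API function PyUnicode_AsUTF8AndSize() twice on the same string and compares the returned addresses.

```python
import ctypes

# CPython-only: bind the C API function through ctypes.
as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
as_utf8.restype = ctypes.c_void_p
as_utf8.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]


def utf8_view(s):
    """Hypothetical helper: return (address, bytes) of the UTF-8 buffer of s."""
    size = ctypes.c_ssize_t()
    addr = as_utf8(s, ctypes.byref(size))
    # string_at copies size.value bytes starting at addr into a bytes object.
    return addr, ctypes.string_at(addr, size.value)


text = "marshal"
addr1, data1 = utf8_view(text)
addr2, data2 = utf8_view(text)
```

Both calls return the same address, so no new buffer is built for the second one. PyUnicode_AsUTF8String(), by contrast, creates a new bytes object on each call.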
I'm fine with the buffering and the codes for short strings and tuples (I have not examined the code closely yet), but special-casing ASCII does not look good to me.
----------
nosy: +serhiy.storchaka
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19219>
_______________________________________