[New-bugs-announce] [issue3460] PyUnicode_Join could perhaps be simpler

Mon Jul 28 21:04:38 CEST 2008

New submission from Antoine Pitrou <pitrou at free.fr>:

In py3k, PyUnicode_Join inherits some complexity from the 2.x days.
However, some of the precautions taken there may not be needed anymore,
since, as far as I can tell, joining str objects in py3k no longer invokes
a codec. Witness the following comment:

    /* Grrrr.  A codec may be invoked to convert str objects to
     * Unicode, and so it's possible to call back into Python code
     * during PyUnicode_FromObject(), and so it's possible for a sick
     * codec to change the size of fseq (if seq is a list).  Therefore
     * we have to keep refetching the size -- can't assume seqlen
     * is invariant.
     */
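
To make the hazard in that comment concrete, here is a small pure-Python
model of it. This is only an illustration, not the C code: the function
names and the misbehaving converter are hypothetical, and the converter
stands in for a codec that mutates the list being joined.

```python
def convert_all_cached_length(items, convert):
    """Model of a loop that reads the length once, before converting."""
    n = len(items)                      # cached: assumed invariant
    out = []
    for i in range(n):
        out.append(convert(items[i]))   # fails if items shrank meanwhile
    return out


def convert_all_refetched_length(items, convert):
    """Model of a loop that re-reads the length on every iteration."""
    out = []
    i = 0
    while i < len(items):               # refetched each time round
        out.append(convert(items[i]))
        i += 1
    return out


def make_sick_convert(items):
    """Hypothetical misbehaving converter: truncates the list when called."""
    def convert(obj):
        result = str(obj)
        del items[1:]                   # side effect on the sequence itself
        return result
    return convert
```

With a three-element list, the cached-length version raises IndexError on
the second item (in C the equivalent mistake could read past the end of
the list), while the refetching version simply stops early. If nothing can
call back into Python code during the loop, the cached version is safe.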

If seqlen can be assumed invariant, it might also be possible to preallocate
the target buffer all at once (like bytes.join does) rather than resize it
incrementally.
Marc-Andre, what do you think?
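
For concreteness, here is a rough Python model of the two-pass approach I
have in mind. It is a sketch only: join_preallocated is a hypothetical
name, it works on bytes-like pieces for simplicity, and it is not a copy
of how bytes.join is actually written in C.

```python
def join_preallocated(sep, parts):
    """Two-pass join: size the buffer once, then copy each piece into place."""
    parts = list(parts)      # snapshot, so the length is fixed for both passes
    if not parts:
        return b""

    # Pass 1: compute the exact size of the result.
    total = sum(len(p) for p in parts) + len(sep) * (len(parts) - 1)
    buf = bytearray(total)   # single allocation, never resized below

    # Pass 2: copy separators and pieces to their final offsets.
    pos = 0
    for i, piece in enumerate(parts):
        if i:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(piece)] = piece
        pos += len(piece)
    return bytes(buf)
```

For example, join_preallocated(b", ", [b"a", b"bc", b"def"]) gives the same
result as b", ".join([b"a", b"bc", b"def"]). The point is only that the
size is known before any copying starts, which requires that the sequence
cannot change between the two passes.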

----------
components: Unicode
messages: 70367
nosy: lemburg, pitrou
priority: normal
severity: normal
status: open
title: PyUnicode_Join could perhaps be simpler
type: performance
versions: Python 3.0

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3460>
_______________________________________