''.join() with encoded strings

Diez B. Roggisch deets at nospam.web.de
Mon Feb 27 12:55:57 EST 2006


Sandra-24 wrote:

> I'd love to know why calling ''.join() on a list of encoded strings
> automatically results in converting to the default encoding. First of
> all, it's undocumented, so if I didn't have non-ascii characters in my
> utf-8 data, I'd never have known until one day I did, and then the code
> would break. Secondly you can't override (for valid reasons) the
> default encoding, so that's not a way around it. So ''.join becomes
> pretty useless when dealing with the real (non-ascii) world.
> 
> I won't miss the str class when it finally goes (in v3?).
> 
> How can I join my encoded strings efficiently?

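By not mixing unicode objects with ordinary byte strings. The implicit
conversion to the default encoding only happens when at least one item
in the list is a unicode object; a list that contains only byte strings
is joined as-is. Use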

u''.join(some_unicode_objects)

to get a joined unicode object.
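
A minimal sketch of that approach, assuming the data is UTF-8 as in the
question. The sample data and the names encoded_parts, unicode_parts,
joined and joined_bytes are made up for illustration: decode at the
boundary, join the unicode objects, and encode once when bytes are
needed again.

```python
# Hypothetical sample data: UTF-8 encoded byte strings with non-ASCII content.
encoded_parts = [b'caf\xc3\xa9', b' ', b'na\xc3\xafve']

# Decode every byte string to a unicode object first (assumes UTF-8 input).
unicode_parts = [part.decode('utf-8') for part in encoded_parts]

# Joining only unicode objects involves no implicit conversion.
joined = u''.join(unicode_parts)

# Encode once, at the point where bytes are needed again (e.g. for output).
joined_bytes = joined.encode('utf-8')

# If every item is a byte string in the same encoding, joining the raw
# bytes directly gives the same result without any decoding step.
same_bytes = b''.join(encoded_parts)
```

Which of the two joins fits depends on whether the rest of the program
works with text or with raw bytes; the point is only that the list
passed to join() holds one kind of string.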

Diez


