Swedish characters in Python strings

Brian Quinlan brian at sweetapp.com
Tue Oct 15 16:31:26 EDT 2002


Magnus wrote:
: >>> import MP3Info
: >>> title = getattr(MP3Info.MP3Info(open('file.mp3', 'rb')), 'title')
: >>> title
: 'K\xf6ttbullar i n\xe4san'

I took a quick look at the ID3 specification, which reveals that Unicode
was not introduced until ID3v2, so there is no way to determine the
encoding of tags written before that.

With ID3v2 (as of version 2.4), the possible encodings are ISO-8859-1,
UTF-16 with a BOM, UTF-16BE (no BOM) and UTF-8.
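
As a rough sketch of how those four encodings map onto Python codecs:
each ID3v2 text frame body starts with a one-byte encoding marker
(0 to 3 in ID3v2.4). The helper name decode_text_frame and the
ID3V2_CODECS table below are made up for illustration and are not part
of MP3Info; the code assumes a current Python where bytes and str are
distinct types.

```python
# Encoding marker at the start of an ID3v2.4 text frame -> Python codec.
ID3V2_CODECS = {
    0: "iso-8859-1",
    1: "utf-16",      # BOM present; the codec reads it to pick the byte order
    2: "utf-16-be",   # no BOM, always big-endian
    3: "utf-8",
}


def decode_text_frame(payload):
    """Decode a text frame body (hypothetical helper, not MP3Info's API).

    `payload` is the frame body as bytes: one encoding byte followed by
    the encoded text, optionally null-terminated.
    """
    codec = ID3V2_CODECS[payload[0]]
    text = payload[1:].decode(codec)
    # Drop any trailing null terminator left over after decoding.
    return text.rstrip("\x00")
```

For example, decode_text_frame(b"\x00K\xf6ttbullar") gives back the
text "Köttbullar" as a Unicode string.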

When reading ID3v2, MP3Info should present the title as a Unicode object
(though I have no idea if it actually does this or not). 

When reading ID3v1 tags, MP3Info would have no choice but to present the
title as a byte string, since the tag does not say how it was encoded.
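
In that case the caller has to guess the encoding. A minimal sketch,
assuming the title Magnus got back was written as ISO-8859-1 (a guess,
since the tag itself does not say); decode_id3v1_title is a made-up
helper name, and the code assumes a current Python where byte strings
are bytes objects:

```python
def decode_id3v1_title(raw, encoding="iso-8859-1"):
    """Turn an ID3v1 byte string into text using a guessed encoding.

    The default is an assumption, not something the tag records; pass a
    different codec name if the file came from somewhere else.
    """
    return raw.decode(encoding)


raw_title = b"K\xf6ttbullar i n\xe4san"
title = decode_id3v1_title(raw_title)
# title is now the text "Köttbullar i näsan"
```

If the guess is wrong the result will be garbage rather than an error,
because every byte value is valid in ISO-8859-1.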

Cheers,
Brian




