Python for Vcard Parsing in UTF16

Adam Atlas adam at atlas.st
Tue Apr 24 07:59:42 EDT 2007


On Apr 21, 7:28 pm, R Wood <r... at therandymon.com> wrote:
> To me this was a natural task for Perl.  Turns out however, there's a catch.  
> Apple exports the file in UTF-16 to ensure anyone with Chinese characters in
> their addressbook gets a legitimate Vcard file.

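Here's a function that, given a `str` containing a vCard in some
unknown encoding, guesses the encoding by trying a few candidates in
turn. It returns the first decoding that contains the BEGIN:VCARD
marker, as a `unicode` object, or None if no candidate fits.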

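def fix_encoding(s):
    # Marker that any correctly decoded vCard must contain.
    m = u'BEGIN:VCARD'
    for c in ('ascii', 'utf_16_be', 'utf_16_le', 'utf_8'):
        try:
            u = unicode(s, c)
        except UnicodeDecodeError:
            continue
        # A decode can succeed with the wrong codec, so check the marker.
        if m in u:
            return u
    # No candidate produced a recognizable vCard.
    return None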
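
For anyone on Python 3, where `str` is already Unicode and the input
would be `bytes`, the same idea might look roughly like the sketch
below. The names `fix_encoding_py3`, `MARKER` and `CANDIDATES` are made
up for illustration. Stripping a leading byte order mark is an addition
that the function above does not do, on the assumption that callers
would rather not see U+FEFF at the start of the text.

```python
MARKER = "BEGIN:VCARD"
CANDIDATES = ("ascii", "utf_16_be", "utf_16_le", "utf_8")

def fix_encoding_py3(data):
    """Return the decoded vCard text, or None if no candidate fits."""
    for codec in CANDIDATES:
        try:
            text = data.decode(codec)
        except UnicodeDecodeError:
            continue
        # A decode can succeed with the wrong codec, so check the marker.
        if MARKER in text:
            # Assumption: drop a leading byte order mark, which the
            # explicit-endian UTF-16 codecs and utf_8 leave in place.
            return text.lstrip("\ufeff")
    return None
```

Something like `fix_encoding_py3(open(path, "rb").read())` would then
give back the text or None, with `path` standing in for wherever the
exported file lives.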



