Python for vCard Parsing in UTF-16

R Wood rwood at therandymon.com
Sat Apr 21 19:28:27 EDT 2007


Greetings -

A recent Perl experiment hasn't turned out so well, which has piqued my
interest in Python.  The project is this: take a vCard file exported from
Apple's Address Book and use a language that is good at parsing text to convert
it into a mutt alias file.  There are better ways to use Mutt with the Mac's
address book, but I want to be able to periodically convert my working
address book file into an alias file I can then transfer across all my different
machines - two Macs, two Linux boxes, and one FreeBSD box.  It's basically a
couple of regexes that look for FN: followed by a name, join all the words of
the name into a single alias key separated by underscores, and follow that with
the full name and the email addresses.  You would wind up with

alias Linus_Torvalds Linus Torvalds <lt at linux.com>
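
For illustration, the conversion described above might be sketched in Python
roughly as follows.  This is only a sketch under assumptions: the function name
vcard_to_aliases and the two regexes are hypothetical, the input is assumed to
be vCard text that has already been decoded from UTF-16, and the field layout
(one FN: line and one or more EMAIL...: lines per card, ending in END:VCARD) is
assumed from the description rather than checked against a real export.

```python
import re

# Assumed field layout: "FN:Full Name" and "EMAIL;type=...:address" lines.
# The optional "item1." style group prefix on EMAIL is also an assumption.
FN_RE = re.compile(r"^FN:(.+)$")
EMAIL_RE = re.compile(r"^(?:[\w-]+\.)?EMAIL[^:]*:(.+)$")


def vcard_to_aliases(text):
    """Return one mutt alias line per card found in decoded vCard text."""
    aliases = []
    name, emails = None, []
    for raw in text.splitlines():
        line = raw.strip()
        fn = FN_RE.match(line)
        email = EMAIL_RE.match(line)
        if fn:
            name = fn.group(1).strip()
        elif email:
            emails.append(email.group(1).strip())
        elif line.upper() == "END:VCARD":
            # Emit an alias only for cards with both a name and an address.
            if name and emails:
                key = "_".join(name.split())
                addresses = ", ".join("%s <%s>" % (name, e) for e in emails)
                aliases.append("alias %s %s" % (key, addresses))
            name, emails = None, []
    return aliases


# Placeholder card; the address is made up for the example.
sample = (
    "BEGIN:VCARD\n"
    "VERSION:3.0\n"
    "FN:Linus Torvalds\n"
    "EMAIL;type=INTERNET:lt@example.com\n"
    "END:VCARD\n"
)
for alias in vcard_to_aliases(sample):
    print(alias)
```

Cards with several addresses end up as one alias with a comma-separated
address list; cards with no address are skipped.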

To me this was a natural task for Perl.  It turns out, however, that there's a
catch.  Apple exports the file in UTF-16 to ensure anyone with Chinese
characters in their address book gets a legitimate vCard file.  And of course
Perl somewhat chokes on UTF-16.  I've found several ways to do it, but they
involve complicated downloads and installations of Perl modules, and that
defeats the purpose of making it simple.  In an ideal world you should be able
to say "try this cool script" and be done with it.  Once you have to say "go to
CPAN, download and compile this module, then ..." it gets less exciting.
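
For comparison, Python's standard library can decode UTF-16 without any extra
modules.  A minimal sketch, assuming a current Python 3 and a file that starts
with a byte-order mark (BOM); the function names decode_vcard_bytes and
read_vcard are hypothetical, and whether Address Book always writes a BOM is
an assumption here:

```python
def decode_vcard_bytes(data):
    """Decode raw UTF-16 bytes into a text string.

    The "utf-16" codec reads the BOM to pick the byte order and strips it
    from the result.  Without a BOM it falls back to the platform's native
    byte order, which may or may not match the file.
    """
    return data.decode("utf-16")


def read_vcard(path):
    """Read a UTF-16 vCard file from disk and return its text."""
    with open(path, "rb") as handle:
        return decode_vcard_bytes(handle.read())


# Round trip on an in-memory example: encoding with "utf-16" writes a BOM.
raw = "FN:Linus Torvalds\r\n".encode("utf-16")
print(repr(decode_vcard_bytes(raw)))
```

The decoded string can then be searched with ordinary regexes, including names
that contain non-ASCII characters.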

I know nothing about Python except that it interests me and has interested me
since I first learned the Rekall database frontend (Linux) runs on it.  I just 
ordered Learning Python and if that works out satisfactorily I'm going to go 
back for Programming Python.  In the meantime, I thought I would pose the 
question to this newsgroup: would Python be useful for a parsing exercise like 
this one?


