Reading Windows CSV file with LCID entries under Linux.

Thomas Troeger thomas.troeger.ext at siemens.com
Mon Sep 22 10:43:23 EDT 2008


Dear all,

I've run into a problem with Windows Locale ID information and 
codepages. I'm writing a Python application that parses a CSV file 
in which each line has the format "LCID;Text1;Text2". Each line can 
contain a different locale ID (LCID), and the text fields contain data 
that is encoded in the codepage associated with that LCID. My 
current data file contains the codes 1033 for English US and 1031 for 
German (as listed in 
http://www.microsoft.com/globaldev/reference/lcid-all.mspx). 
Unfortunately, I cannot find out which codepage (cp1252 or 
whatever) belongs to which LCID.

My question is: How can I convert this data into something more 
reasonable like Unicode? Basically, what I want is output of the form 
"Text1;Text2", with both fields encoded as UTF-8. Can this be done with 
Python? How can I find out which codepage I have to use for 1033 and 1031?
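
The conversion described above might look roughly like the following 
sketch (Python 3). The LCID-to-codec table is an assumption, not something 
taken from the data file: it maps both 1033 and 1031 to cp1252, the 
Windows ANSI codepage commonly used for English (US) and German, and 
would need to be checked and extended for other LCIDs. The names 
LCID_TO_CODEC and convert_line are made up for illustration, and the 
plain split on ";" assumes the fields are not quoted.

```python
# Assumed LCID -> Python codec table (hypothetical; verify before relying on it).
LCID_TO_CODEC = {
    1033: "cp1252",  # English (United States)
    1031: "cp1252",  # German (Germany)
}


def convert_line(raw_line: bytes) -> bytes:
    """Turn the bytes b'LCID;Text1;Text2' into UTF-8 encoded b'Text1;Text2'."""
    # Work on raw bytes first: the codec is not known until the LCID is read.
    lcid_field, text1, text2 = raw_line.rstrip(b"\r\n").split(b";", 2)
    lcid = int(lcid_field.decode("ascii"))
    codec = LCID_TO_CODEC[lcid]  # raises KeyError for an LCID not in the table
    # Decode each field to a Unicode string, then re-encode the result as UTF-8.
    fields = [text1.decode(codec), text2.decode(codec)]
    return ";".join(fields).encode("utf-8")


if __name__ == "__main__":
    sample = b"1031;Gr\xfc\xdfe;Stra\xdfe\r\n"  # made-up cp1252 test data
    print(convert_line(sample).decode("utf-8"))
```

The file has to be opened in binary mode ("rb") for this to work, since 
each line may need a different codec.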

Any help appreciated,
Thomas.


