Unicode strings, struct, and files

Tom Plunket tomas at fancy.org
Mon Oct 9 00:26:05 EDT 2006


I am building a file with the help of the struct module.

I would like to be able to put Unicode strings into this file, but I'm
not sure how to do it.

The format I'm trying to write is basically this C structure:

struct MyFile
{
   int magic;
   int flags;
   short otherFlags;
   char pad[22];

   wchar_t line1[32];
   wchar_t line2[32];

   // ... other data which is easy.  :)
};

(I'm writing data on a PC to be read on a big-endian machine.)
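
As a sanity check on the layout, here is a small sketch of the sizes involved. It assumes a 4-byte int, a 2-byte short, and a 16-bit wchar_t on the target, none of which the C structure itself guarantees; HEADER_FORMAT and the other names are made up for illustration.

```python
import struct

# Assumed target sizes: 4-byte int, 2-byte short, 16-bit wchar_t.
# '>' selects big-endian byte order and disables alignment padding.
HEADER_FORMAT = '>IIH22x'   # magic, flags, otherFlags, 22 pad bytes

header_size = struct.calcsize(HEADER_FORMAT)
line_size = 32 * 2          # 32 wchar_t elements of 2 bytes each (assumption)

print(header_size, line_size)
```

If wchar_t is 32 bits on the target instead, each line field would be 128 bytes rather than 64.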

So I can write the four leading members with the output of
struct.pack('>IIH22x', magic, flags, otherFlags).  Unfortunately, I
can't figure out how to write the Unicode strings, since:

message = unicode('Hello, world')
myFile.write(message)

results in 'message' being implicitly encoded back to a plain byte
string before being written.  Is the only way to do this something
hideous like this:

for c in message:
   myFile.write(struct.pack('>H', ord(c)))

?
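
A possibly less hideous alternative, offered as a sketch rather than a confirmed answer: encode the whole string with the 'utf-16-be' codec and let struct's 's' format pad the result with NUL bytes. This assumes the reader expects 16-bit wchar_t holding big-endian UTF-16 code units; pack_wide_field is a hypothetical helper name, not part of any library.

```python
import struct

def pack_wide_field(text, length=32):
    """Encode text as big-endian UTF-16, NUL-padded to `length` 16-bit units.

    Assumes the target's wchar_t is 16 bits wide (not guaranteed by C).
    """
    data = text.encode('utf-16-be')
    if len(data) > length * 2:
        raise ValueError('text needs more than %d 16-bit units' % length)
    # The 's' format pads values shorter than the field with NUL bytes.
    return struct.pack('>%ds' % (length * 2), data)

message = u'Hello, world'
field = pack_wide_field(message)
```

The result could then be written with myFile.write(field), provided the file was opened in binary mode. Two caveats: characters outside the Basic Multilingual Plane take two 16-bit units each, and this sketch does not reserve room for a terminating NUL when the text fills all 32 units.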

Thanks from a unicode n00b,
-tom!


