Unicode strings, struct, and files
Tom Plunket
tomas at fancy.org
Mon Oct 9 03:34:45 EDT 2006
John Machin wrote:
> > message = unicode('Hello, world')
> > myFile.write(message)
> >
> > results in 'message' being converted back to a string before being
> > written. Is the way to do this to do something hideous like this:
> >
> > for c in message:
> >     myFile.write(struct.pack('>H', ord(unicode(c))))
>
> I'd suggest UTF-encoding it as a string, using the encoding that
> matches whatever wchar means on the target machine, for example
> assuming bigendian and sizeof(wchar) == 2:

Ahh, this is the info that my trawling through the documentation
didn't let me find!

Thanks a bunch.

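
For anyone following along, here is a minimal sketch of why a single encode call can stand in for the per-character loop. It assumes every character fits in 16 bits (the Basic Multilingual Plane) and that the target wants big-endian 2-byte wchars, as in the quoted suggestion. The names `looped` and `encoded` are mine, not from the thread, and `ord(c)` stands in for the original `ord(unicode(c))`:

```python
import struct

message = u'Hello, world'

# The loop from the original question: one big-endian unsigned
# 16-bit value per character, joined into a single byte string.
looped = b''.join(struct.pack('>H', ord(c)) for c in message)

# The suggested alternative: let the codec do the same work in one call.
encoded = message.encode('utf_16_be')

# For characters below U+10000 the two byte strings are identical.
assert looped == encoded
```

Outside that range the two diverge: the codec emits a surrogate pair (two 16-bit units), which a single '>H' cannot hold.
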
> utf_line1 = unicode_line1.encode('utf_16_be')
> etc
> struct.pack(">.........64s64s", ......, utf_line1, utf_line2)
> Presumes (1) you have already checked that you don't have more than 32
> characters in each "line" (2) padding with unichr(0) is acceptable.
This works frighteningly well. ;)
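
A rough sketch of the whole recipe, including the two presumptions John listed, in case it helps someone searching the archive later. This is only my reading of it: the other fields elided in the format string above are left out, and `FIELD_CHARS`, `FIELD_BYTES`, `encode_field` and `pack_lines` are hypothetical names made up for illustration.

```python
import struct

FIELD_CHARS = 32                 # assumed limit: 32 two-byte characters
FIELD_BYTES = FIELD_CHARS * 2    # 64 bytes, matching each '64s' field


def encode_field(text):
    """Encode text as big-endian UTF-16, refusing anything too long."""
    data = text.encode('utf_16_be')
    # struct would silently truncate an over-long 's' field, so check first.
    if len(data) > FIELD_BYTES:
        raise ValueError('field exceeds %d bytes' % FIELD_BYTES)
    return data


def pack_lines(line1, line2):
    """Pack two lines into fixed 64-byte fields (illustrative only)."""
    # struct pads each 's' field with null bytes up to its declared size,
    # which is the "padding with unichr(0)" mentioned above.
    return struct.pack('>64s64s', encode_field(line1), encode_field(line2))


record = pack_lines(u'Hello, world', u'second line')
first = record[:FIELD_BYTES].decode('utf_16_be').rstrip(u'\x00')
```

The explicit length check matters because an 's' field that receives too many bytes is cut to fit without complaint, which could split a character in half.
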
-tom!