[IronPython] Binary files and byte strings
Jonathan Jacobs
korpse-ironpython at kaydash.za.net
Tue Dec 20 16:06:26 CET 2005
Hi,
I have a CPython script to parse specific data files and allow me to
manipulate them, mostly relying on the struct module. IronPython doesn't
seem to have an implementation of this (yet?) so I used PyPy's
implementation and discovered that IronPython's sys module doesn't
define a "byteorder" attribute, which was easily worked around.
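
A workaround along these lines might look like the sketch below. It is
only an illustration: the helper name detect_byteorder is hypothetical,
and falling back to .NET's BitConverter.IsLittleEndian is an assumption
about how one could ask the runtime when sys.byteorder is absent.

```python
import sys

def detect_byteorder():
    """Return "little" or "big" even when sys.byteorder is missing."""
    order = getattr(sys, "byteorder", None)
    if order is not None:
        return order
    try:
        # Only importable when running on .NET (assumption for this sketch).
        from System import BitConverter
    except ImportError:
        # Last-resort guess: x86 hardware is little-endian.
        return "little"
    return "little" if BitConverter.IsLittleEndian else "big"

byteorder = detect_byteorder()
```

Code that expects sys.byteorder can then read the byteorder name
instead.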
I then ran into the problem that IronPython's file objects don't
implement a "tell" function, so I added one that simply returned
stream.Position.
After this effort I fired up my script and was bombarded by all manner
of assertion failures telling me that the file I was parsing was *not*
in a valid format. I double-checked the file by running the script
under CPython, which parsed it without a hitch. After some debugging it
looked like the stream was being repositioned by the reader (perhaps
due to buffering?), which left stream.Position unusable.
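
The suspected buffering effect can be reproduced with Python's own io
module, as a rough analogy. This sketch uses modern Python and does not
touch IronPython's classes: io.BytesIO stands in for the underlying
stream and io.BufferedReader for the reader that sits on top of it.

```python
import io

payload = bytes(range(64))
raw = io.BytesIO(payload)                        # the underlying stream
reader = io.BufferedReader(raw, buffer_size=16)  # the buffering reader

header = reader.read(4)      # the caller has consumed only 4 bytes

logical_pos = reader.tell()  # position as the caller sees it
stream_pos = raw.tell()      # position of the underlying stream
```

The reader fills its buffer in one go, so the underlying stream ends up
ahead of what the caller has consumed. A tell() that reports the
underlying position therefore gives the wrong answer.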
Grokking the PythonFile class showed that binary-mode files were
implemented using a StreamReader (as opposed to a NewLineReader for
text-mode files), which meant that the data was being decoded as
text. That is not particularly useful in the case of binary files and
really only serves to mangle the data into an unusable mess.
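
To illustrate the kind of damage meant here, the sketch below pushes
some binary data through a text decoder and back. It assumes the
decoder treats the bytes as UTF-8 and substitutes a replacement
character for invalid sequences; that is an assumption about how the
reader is configured, not something confirmed from the IronPython
source.

```python
# Sample binary data: a PNG-style signature followed by a few raw bytes.
data = b"\x89PNG\r\n\x1a\n\xff\xfe\x00\x01"

# Decode as UTF-8 text, replacing each invalid byte with U+FFFD.
as_text = data.decode("utf-8", errors="replace")

# Encoding the text again does not give back the original bytes.
round_tripped = as_text.encode("utf-8")
survived = (round_tripped == data)
```

Every byte that is not valid UTF-8 is lost for good, so no later step
can recover the original file contents from the decoded text.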
In the end I opted for just using stream.Read to get the original
information out in the form of a byte[] and using
StringOps.FromByteArray (after turning it into a public function as I
couldn't find any other way to turn my byte[] into a byte-string) to get
this data back to the user in something they could use.
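
The idea of carrying raw bytes inside a string can be sketched as
follows. The two helpers are hypothetical stand-ins and not
IronPython's code: the sketch assumes that the conversion maps each
byte to the character with the same ordinal value, which keeps the
data intact.

```python
import struct

def from_byte_array(raw_bytes):
    """Hypothetical stand-in: one character per byte, same ordinal."""
    return "".join(chr(b) for b in bytearray(raw_bytes))

def to_byte_array(byte_string):
    """Inverse of from_byte_array, for checking the round trip."""
    return bytes(bytearray(ord(ch) for ch in byte_string))

# A made-up 6-byte record: a 32-bit magic number and a 16-bit version.
record = struct.pack("<IH", 0xDEADBEEF, 513)

byte_string = from_byte_array(record)
restored = to_byte_array(byte_string)
magic, version = struct.unpack("<IH", restored)
```

Because each character holds exactly one byte value, nothing is decoded
or re-encoded along the way and struct sees the bytes unchanged.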
Now, I'm not sure if I missed something here, but reading binary files
(not sure about writing, I'm too scared) seems to be rather broken.
Another thing that struck me was how IronPython's "str" type is married
to .NET's string type. I don't know if there is some magic deeper down
to deal with this, but Python's "str" type is a byte-string and
"unicode" is actual text, whereas .NET's "string" type is designed to
represent text as a series of Unicode characters.
Hopefully I've said something right.
--
Jonathan
When you meet a master swordsman,
show him your sword.
When you meet a man who is not a poet,
do not show him your poem.
-- Rinzai, ninth century Zen master