[Pythonmac-SIG] convert binary plist to xml string

Bob Ippolito bob at redivi.com
Mon May 16 05:22:40 CEST 2005


On May 15, 2005, at 10:22 PM, Bill Janssen wrote:

>> If "works pretty well" means that it's one of the most common sources
>> of bugs in the Python standard library (that corrupt data for years
>> before being found)
>
> Do you mean it contained (contains?) bugs, or that it's easy to misuse
> and someone misused it in some other part of the std library?  I've
> found it quite handy and efficient in packing and unpacking binary
> structures.  But one does have to pay attention -- I wish the default
> byte-ordering was network byte order rather than native.

I mean misuse causes bugs, primarily of the 'i' vs. 'I' kind (signed
vs. unsigned), but there have been others.  Recent example: the
zipfile module had a bug where 'i' was used instead of 'I', which
imposed an artificial 2G file limit and deviated from the format
documentation.  It took a COUPLE YEARS for someone to figure this
out.  Here's the kicker though.  Neither the person who generated the
patch nor the person who audited and committed it noticed that there
was a bug of the very same kind on the NEXT LINE (same block of code
anyway, I don't remember specifically).  This is still unfixed (each
file must start before the 2G boundary, so effectively the 2G file
limit still exists unless you only use one file), but I did submit a
patch.  I just haven't committed it yet.
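
To make the 'i' vs. 'I' difference concrete, here is a minimal sketch.
It is not the zipfile code; the 3 GiB offset is a made-up value, and the
behavior shown is that of the struct module in current Python 3, which
may differ from older releases.

```python
import struct

offset = 3 * 1024**3  # hypothetical 3 GiB offset, past the signed 32-bit range

# Unsigned 32-bit ("I") can hold any value up to 2**32 - 1.
packed = struct.pack("<I", offset)

# Signed 32-bit ("i") refuses the same value when packing...
try:
    struct.pack("<i", offset)
    signed_pack_failed = False
except struct.error:
    signed_pack_failed = True

# ...and silently misreads the same four bytes when unpacking.
(misread,) = struct.unpack("<i", packed)
(correct,) = struct.unpack("<I", packed)
```

The two format strings differ by one letter's case, and only the value
past the 2G boundary exposes the mistake.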

>> and that it's harder to write struct based code
>> than it is to write the equivalent in C
>
> I wouldn't go that far, but it may be a wash.  I try to avoid C,  
> though.

I certainly would:
first, second, third, fourth, fifth, ..... = struct.unpack("iiIIiIiiILllfd", data)

It is SO EASY to screw this up: pick the wrong type code, or
misalign a type code with its field.  It needs to look more like
something sane:

class Point(struct.BigEndianStruct):
    x = struct.SInt32()
    y = struct.SInt32()

This isn't an API proposal, but it needs to be something remotely  
sane that:

- Keeps the types and fields together, for god's sake
- Lets you use named fields and nested structs
- Uses named types with *precise meanings* and not semi-obscure  
letters that differ based on the platform
- Lets you specify padding.  Right now, for packed structs, you need
to read EACH ELEMENT SEPARATELY
- Can read a struct off of a file-like-object without explicitly  
calculating the size first
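
A few of the points above can be approximated with the struct module as
it exists in current Python.  The following is only a rough sketch, not
an API proposal: Record, its read() method, and the fields/byte_order
attributes are hypothetical names invented for illustration, layered on
the real struct.Struct and collections.namedtuple.  It still uses the
one-letter type codes and does not handle nested structs.

```python
import io
import struct
from collections import namedtuple


class Record:
    """Hypothetical helper that keeps each field name next to its type code."""

    # An explicit byte order (">", "<" or "!") selects standard sizes and
    # turns off the platform-dependent alignment of native mode.
    byte_order = ">"
    fields = ()  # sequence of (name, struct format code) pairs

    @classmethod
    def _layout(cls):
        fmt = cls.byte_order + "".join(code for _, code in cls.fields)
        # Pad bytes ("x") consume input but produce no value, so skip them.
        names = [name for name, code in cls.fields if not code.endswith("x")]
        return struct.Struct(fmt), namedtuple(cls.__name__ + "Value", names)

    @classmethod
    def read(cls, fileobj):
        """Read exactly one record from a file-like object."""
        layout, value_type = cls._layout()
        data = fileobj.read(layout.size)  # size is computed from the fields
        if len(data) != layout.size:
            raise EOFError("truncated record")
        return value_type._make(layout.unpack(data))


class Point(Record):
    fields = (
        ("x", "i"),     # signed 32-bit
        ("pad", "2x"),  # two explicit padding bytes
        ("y", "i"),     # signed 32-bit
    )


raw = struct.pack(">i2xi", -1, 7)
point = Point.read(io.BytesIO(raw))
```

Here point.x and point.y are accessed by name, and the caller never
computes the record size by hand.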

There are plenty of other places in the standard library that really
suck too, but this is the very worst thing in Python that I have ever
needed to use.  Specifically, it's the one module I need to use
rather often, but ALWAYS have to have the docstrings open to
triple-check that it's done right.  The C API is also like this to an
extent (there should be, but isn't, a naming convention that lets you
know when some function is going to borrow or steal a reference)...
but that's more or less to be expected of an ancient C API.

-bob


