distinction between unzipping bytes and unzipping a file

Chris Mellon arkanes at gmail.com
Fri Jan 9 16:12:42 EST 2009


On Fri, Jan 9, 2009 at 3:08 PM, Chris Mellon <arkanes at gmail.com> wrote:
> On Fri, Jan 9, 2009 at 2:32 PM, webcomm <ryandw at gmail.com> wrote:
>> On Jan 9, 3:15 pm, Steve Holden <st... at holdenweb.com> wrote:
>>> webcomm wrote:
>>> > Hi,
>>> > In python, is there a distinction between unzipping bytes and
>>> > unzipping a binary file to which those bytes have been written?
>>>
>>> > The following code is, I think, an example of writing bytes to a file
>>> > and then unzipping...
>>>
>>> > import base64
>>> > import zipfile
>>> >
>>> > # datum is a base64-encoded string of data downloaded from a web service
>>> > decoded = base64.b64decode(datum)
>>> > f = open('data.zip', 'wb')
>>> > f.write(decoded)
>>> > f.close()
>>> > x = zipfile.ZipFile('data.zip', 'r')
>>>
>>> > After looking at the preceding code, the provider of the web service
>>> > gave me this advice...
>>> > "Instead of trying to create a file, take the unzipped bytes and get a
>>> > Unicode string of text from it."
>>>
>>> Not terribly useful advice, but one presumes he, she, or it was trying
>>> to be helpful.
>>>
>>> > If so, I'm not sure how to do what he's suggesting, or if it's really
>>> > different from what I've done.
>>>
>>> Well, what you have done appears pretty wrong to me, but let's take a
>>> look. What's datum? You appear to be treating it as base64-encoded data;
>>> is that correct? Have you examined it?
>>
>> It's data that has been compressed then base64 encoded by the web
>> service.  I'm supposed to download it, then decode, then unzip.  They
>> provide a C# example of how to do this on page 13 of
>> http://forums.regonline.com/forums/docs/RegOnlineWebServices.pdf
>>
>> If you have a minute, see also this thread...
>> http://groups.google.com/group/comp.lang.python/browse_thread/thread/d72d883409764559/5b9eceeee3e77dd4?hl=en&lnk=gst&q=webcomm#5b9eceeee3e77dd4
>>
>
> When they say "zip", they're talking about a zlib compressed stream of
> bytes, not a zip archive.
>
> You want to base64 decode the data, then zlib decompress it, then
> finally interpret it as (I think) UTF-16, as that's what Windows
> usually means when it says "Unicode".
>
> import base64
> import zlib
>
> decoded = base64.b64decode(datum)
> decompressed = zlib.decompress(decoded)
> result = decompressed.decode('utf-16')
>


And of course, as *soon* as I write that, I read the appendix of the
documentation in full and it turns out I'm wrong. Ignore me *sigh*.

It would really help if you could post a sample file somewhere.
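
On the original question of bytes versus a file: if the decoded payload
does turn out to be a real zip archive (an assumption, I haven't seen the
data), zipfile.ZipFile accepts any seekable file-like object, so the bytes
can be wrapped in an in-memory buffer instead of being written to disk
first. A minimal sketch, where read_zip_from_base64 is a made-up helper
name and not anything from the web service's docs:

```python
import base64
import io
import zipfile


def read_zip_from_base64(datum):
    """Return {member name: bytes} for a base64-encoded zip archive.

    Hypothetical helper: assumes `datum` decodes to a genuine zip archive.
    """
    decoded = base64.b64decode(datum)
    # BytesIO wraps the bytes in a seekable file-like object, which is
    # all ZipFile needs; nothing is written to disk.
    with zipfile.ZipFile(io.BytesIO(decoded), 'r') as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

Either way the archive contents come out the same; the temporary file
just adds a step. If the payload is not a zip archive, ZipFile raises
zipfile.BadZipFile on both routes.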


