Problem with zipfile and newlines

John Machin sjmachin at lexicon.net
Mon Mar 10 07:25:53 EDT 2008


On Mar 10, 8:31 pm, "Neil Crighton" <neilcrigh... at gmail.com> wrote:
> I'm using the zipfile library to read a zip file in Windows, and it
> seems to be adding too many newlines to extracted files. I've found
> that for extracted text-encoded files, removing all instances of '\r'
> in the extracted file seems to fix the problem, but I can't find an
> easy solution for binary files.
>
> The code I'm using is something like:
>
> from zipfile import Zipfile
> z = Zipfile(open('zippedfile.zip'))
> extractedfile = z.read('filename_in_zippedfile')
>

"Too many newlines" is fixed by removing all instances of '\r'. What
are you calling a newline? '\r'??

How do you know there are too many thingies? What operating system
were the original files created on?

When you do:
    # using a more meaningful name :-)
    extractedfilecontents = z.read('filename_in_zippedfile')
then:
    print repr(extractedfilecontents)
what do you see at the end of what you regard as each line:
(1) \n
(2) \r\n
(3) \r
(4) something else
?
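
If eyeballing the repr() output is tedious, a rough sketch like the one below can tally the three candidates for you. The function name count_line_endings is made up for illustration, and it assumes the contents are a byte string, as returned by z.read():

```python
def count_line_endings(data):
    """Tally '\r\n' pairs, bare '\n' and bare '\r' in a byte string."""
    crlf = data.count(b'\r\n')
    # Every '\r\n' pair contains one '\r' and one '\n', so subtract the
    # pairs to get the characters that stand alone.
    lf = data.count(b'\n') - crlf
    cr = data.count(b'\r') - crlf
    return {'crlf': crlf, 'lf': lf, 'cr': cr}


# Hypothetical sample data, not taken from the original poster's file.
sample = b'a\r\nb\nc\rd\r\r\n'
print(count_line_endings(sample))
```

A large 'cr' count alongside an equally large 'crlf' count would suggest '\r\r\n' sequences, i.e. an extra '\r' being added somewhere along the way.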

Do you fiddle with extractedfilecontents (other than trying to fix it)
before writing it to the file?

When you write out a text file, do you do:
    open('foo.txt', 'w').write(extractedfilecontents)
or
    open('foo.txt', 'wb').write(extractedfilecontents)
?

When you write out a binary file, do you do:
    open('foo.bin', 'w').write(extractedfilecontents)
or
    open('foo.bin', 'wb').write(extractedfilecontents)
?
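
Why the mode matters: on Windows, a file opened with 'w' is in text mode, and every '\n' written is translated to '\r\n', so data that already contains '\r\n' ends up with '\r\r\n'. With 'wb' the bytes are written untouched. Here is a small sketch of the difference, written for Python 3, where passing newline='\r\n' to open() reproduces the Windows translation on any platform. The file names and sample contents are invented for illustration:

```python
import os
import tempfile

# Hypothetical contents standing in for what z.read() returned.
contents = b'line one\r\nline two\r\n'

tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, 'text_mode.txt')
binary_path = os.path.join(tmpdir, 'binary_mode.txt')

# Text mode: each '\n' is translated to '\r\n' on the way out,
# which is what Windows text mode does by default.
with open(text_path, 'w', newline='\r\n') as f:
    f.write(contents.decode('ascii'))

# Binary mode: the bytes go to disk exactly as given.
with open(binary_path, 'wb') as f:
    f.write(contents)

# Read both back in binary mode to see what actually landed on disk.
with open(text_path, 'rb') as f:
    text_result = f.read()
with open(binary_path, 'rb') as f:
    binary_result = f.read()

print(repr(text_result))
print(repr(binary_result))
```

If the text-mode output shows '\r\r\n' where the binary-mode output shows '\r\n', that would account for the "too many newlines".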


