Problem with zipfile and newlines
John Machin
sjmachin at lexicon.net
Mon Mar 10 07:25:53 EDT 2008
On Mar 10, 8:31 pm, "Neil Crighton" <neilcrigh... at gmail.com> wrote:
> I'm using the zipfile library to read a zip file in Windows, and it
> seems to be adding too many newlines to extracted files. I've found
> that for extracted text-encoded files, removing all instances of '\r'
> in the extracted file seems to fix the problem, but I can't find an
> easy solution for binary files.
>
> The code I'm using is something like:
>
> from zipfile import Zipfile
> z = Zipfile(open('zippedfile.zip'))
> extractedfile = z.read('filename_in_zippedfile')
>
"Too many newlines" is fixed by removing all instances of '\r'. What
are you calling a newline? '\r'??
How do you know there are too many of them? What operating system
were the original files created on?
When you do:
# using a more meaningful name :-)
extractedfilecontents = z.read('filename_in_zippedfile')
then:
print repr(extractedfilecontents)
what do you see at the end of what you regard as each line:
(1) \n
(2) \r\n
(3) \r
(4) something else
?
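If eyeballing the repr() output is tedious for a large file, the same question can be answered by counting. This is only a rough sketch: count_line_endings is a made-up helper name, not part of zipfile, and it assumes the contents are held as a byte string.

```python
def count_line_endings(data):
    """Count the three common line-ending styles in a byte string."""
    crlf = data.count(b'\r\n')
    # Every '\r\n' pair also contains one '\n' and one '\r', so subtract
    # the pairs to get the bare ones.
    lf = data.count(b'\n') - crlf
    cr = data.count(b'\r') - crlf
    return {'crlf': crlf, 'lf': lf, 'cr': cr}


# Made-up sample data mixing all three styles, ending in '\r\r\n'.
sample = b'a\r\nb\nc\rd\r\r\n'
counts = count_line_endings(sample)
```

A non-zero 'cr' count next to a high 'crlf' count would suggest sequences like '\r\r\n', which is what you would expect if '\r\n' data had been written through a text-mode file on Windows.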
Do you fiddle with extractedfilecontents (other than trying to fix it)
before writing it to the file?
When you write out a text file, do you do:
open('foo.txt', 'w').write(extractedfilecontents)
or
open('foo.txt', 'wb').write(extractedfilecontents)
?
When you write out a binary file, do you do:
open('foo.txt', 'w').write(extractedfilecontents)
or
open('foo.txt', 'wb').write(extractedfilecontents)
?
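For comparison, here is a minimal sketch of a round trip that uses binary mode at both ends: 'rb' when opening the archive and 'wb' when writing the extracted bytes. The helper name extract_member, the sample contents, and the output name foo.bin are invented for illustration; I am not claiming this is what your code does.

```python
import os
import tempfile
import zipfile


def extract_member(zip_path, member, out_path):
    """Read one member from a zip archive and write it out unchanged."""
    # Open the archive in binary mode so the OS does no newline translation.
    with open(zip_path, 'rb') as fh:
        with zipfile.ZipFile(fh) as z:
            contents = z.read(member)
    # Write in binary mode too, so the bytes land on disk exactly as read.
    with open(out_path, 'wb') as out:
        out.write(contents)
    return contents


# Build a small test archive (hypothetical contents with '\r\n' line
# endings plus a few bytes that are not text).
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, 'zippedfile.zip')
original = b'line one\r\nline two\r\n\x00\x1a\xff'
with zipfile.ZipFile(zip_path, 'w') as z:
    z.writestr('filename_in_zippedfile', original)

out_path = os.path.join(workdir, 'foo.bin')
extract_member(zip_path, 'filename_in_zippedfile', out_path)

# Read the result back in binary mode to compare byte for byte.
with open(out_path, 'rb') as fh:
    roundtrip = fh.read()
```

If roundtrip equals original here but your own extracted files differ from what was zipped, the file modes are the first place to look.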