Out of memory while reading excel file

Pavol Lisy pavol.lisy at gmail.com
Fri May 12 15:13:44 EDT 2017


On 5/11/17, Peter Otten <__peter__ at web.de> wrote:
> Mahmood Naderan via Python-list wrote:
>
>> Excuse me, I changed
>>
>> csv.writer(outstream)
>>
>> to
>>
>> csv.writer(outstream, delimiter =' ')
>>
>>
>> It puts space between cells and omits "" around some content.
>
> If your data doesn't contain any spaces that's fine. Otherwise you need a
> way to distinguish between space as a delimiter and space inside a field,
> e.g. by escaping it:
>
> >>> w = csv.writer(sys.stdout, delimiter=" ", quoting=csv.QUOTE_NONE, escapechar="\\")
> >>> w.writerow(["a", "b c"])
> a b\ c
> 8
>
>> However,
>> between two lines there is a new empty line. In other words, the first
>> line is the first row of the Excel file. The second line is empty ("\n")
>> and the third line is the second row of the Excel file.
>>
>> Any thought?
>
> In text mode Windows translates "\n" to b"\r\n" in the file. Python allows
> you to override that:
>
> >>> help(open)
> Help on built-in function open in module io:
>
> open(...)
>     open(file, mode='r', buffering=-1, encoding=None,
>          errors=None, newline=None, closefd=True, opener=None) -> file object
>
> <snip>
>
>     newline controls how universal newlines works (it only applies to text
>     mode). It can be None, '', '\n', '\r', and '\r\n'.  It works as
>     follows:
>
> <snip>
>
>     * On output, if newline is None, any '\n' characters written are
>       translated to the system default line separator, os.linesep. If
>       newline is '' or '\n', no translation takes place. If newline is any
>       of the other legal values, any '\n' characters written are translated
>       to the given string.
>
> So you need to specify newlines:
>
> with open(dest, "w", newline="") as outstream:
>     ...
>

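But the lineterminator parameter
( https://docs.python.org/3.6/library/csv.html#csv.Dialect.lineterminator )
defaults to '\r\n' on Linux too! So the writer itself already emits "\r\n",
and without newline="" text mode on Windows translates that "\n" to "\r\n"
once more, which gives "\r\r\n" and the extra empty line.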

import csv
import io

# default dialect: every row ends with '\r\n', whatever the platform
b = io.StringIO()
w = csv.writer(b)
w.writerows([["a", "b c"], ['a', 'b,c']])
b.getvalue()  # 'a,b c\r\na,"b,c"\r\n'

b = io.StringIO()
w = csv.writer(b, lineterminator='\n')
w.writerows([["a", "b c"], ['a', 'b,c']])
b.getvalue()  # 'a,b c\na,"b,c"\n'
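
To see the same thing on a real file, here is a rough sketch that writes the
rows to a temporary file and looks at the raw bytes. The helper name
written_bytes is made up for this example. Passing newline="\r\n" to open()
is only my way of imitating what the default newline=None does on Windows, so
that the doubled "\r" shows up on any platform.

```python
import csv
import os
import tempfile

rows = [["a", "b c"], ["a", "b,c"]]


def written_bytes(newline, **writer_kwargs):
    """Write rows with csv.writer to a temp file and return the raw bytes.

    Hypothetical helper, only for this illustration.
    """
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as outstream:
            csv.writer(outstream, **writer_kwargs).writerows(rows)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# newline="\r\n" stands in for newline=None on Windows (an assumption of this
# sketch): the "\n" inside the writer's "\r\n" is translated a second time.
doubled = written_bytes("\r\n")

# newline="" switches translation off, so only the csv lineterminator counts.
clean = written_bytes("")

# newline="" plus lineterminator="\n" gives plain Unix line endings.
unix = written_bytes("", lineterminator="\n")

print(doubled)  # b'a,b c\r\r\na,"b,c"\r\r\n'
print(clean)    # b'a,b c\r\na,"b,c"\r\n'
print(unix)     # b'a,b c\na,"b,c"\n'
```

So newline="" decides whether the file layer touches the line endings at all,
and lineterminator decides what the writer puts there in the first place.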

PL.


