[Chennaipy] Issue with parsing an uploaded excel generated CSV file | Flask, Werkzeug question

Shrayas rajagopal shrayasr at gmail.com
Thu Oct 23 16:49:34 CEST 2014


On Thu, Oct 23, 2014 at 7:41 PM, Shrayas rajagopal <shrayasr at gmail.com>
wrote:

> Has any of you encountered this? Have you managed to fix it? Any help
> would be appreciated. Thanks.


Guys,

Seems like I solved the issue and also _kind of_ understood why it is
happening.

Basically, the issue was caused because the line terminators are *expected*
to be either \r or \n, and saving the file via Excel seemed to give it a
\r\n ending.

Firstly, I should thank Daniel who, on the same[1] mailing list I linked in
my first mail, spoke about writing an iterator/generator and replacing the
\r\n with just \n on every line.
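
A minimal sketch of that generator idea, written for Python 3 rather than the
Python 2 used elsewhere in this mail. The name `normalized_lines`, the sample
data and the UTF-8 encoding are my assumptions, not something taken from
Daniel's post. It relies on the stream yielding one line per iteration, which
for a binary stream means it breaks at \n.

```python
import csv
import io


def normalized_lines(stream, encoding="utf-8"):
    """Yield text lines from a binary stream with trailing \\r and \\n removed."""
    for raw in stream:
        yield raw.decode(encoding).rstrip("\r\n")


# io.BytesIO stands in for the uploaded file's stream.
upload = io.BytesIO(b"name,city\r\nasha,chennai\r\n")
rows = list(csv.reader(normalized_lines(upload)))
```

Because the generator is lazy, the whole upload never has to sit in memory as
one string.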

With this, I had a quick thought and tried it out. As the documentation for
FileStorage[2] indicates, it is possible to do a `stream.read()` to get the
contents of the currently open stream.

So, all I did was read the stream, split it at \r and pass it to the csv
reader (csv.DictReader here), and voila! It worked. Here is the same piece of
code, in working condition:

[...]

@app.route("/", methods=["POST"])
def index_post():
    payload = request.files["foobar"]
    # Read the whole upload and split it into lines at "\r" before parsing
    lines = csv.DictReader(payload.read().split("\r"))
    for line in lines:
        print line
    return redirect(url_for("index"))

[...]
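
For anyone on Python 3, the same idea can be sketched without Flask.
`rows_from_upload` is a hypothetical helper and UTF-8 is an assumed encoding.
I have swapped in `str.splitlines()` for `split("\r")` because it breaks at
\r\n, \r and \n alike, whereas splitting at \r alone leaves a leading \n on
each piece when the data uses \r\n.

```python
import csv


def rows_from_upload(data, encoding="utf-8"):
    """Parse uploaded CSV bytes into a list of dicts keyed by the header row."""
    text = data.decode(encoding)
    # splitlines() treats \r\n, \r and \n as line boundaries and drops them.
    return list(csv.DictReader(text.splitlines()))


crlf = rows_from_upload(b"name,city\r\nasha,chennai\r\n")
cr_only = rows_from_upload(b"name,city\rasha,chennai\r")
```

In the handler above this would be called with the bytes returned by
`request.files["foobar"].read()`. A quoted field that itself contains a line
break would be split too, so treat this as a sketch for simple files only.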

That was just a hunch though. I wanted to know why it worked and why it
didn't work initially, and I ended up finding answers for both.

1. Why it didn't work

  Turns out there is a property called lineterminator[3] on the csv Dialect
  object that lets us specify a line terminator.

  But under that documentation there's a small box that says, and I quote:

  "Note The reader is hard-coded to recognise either '\r' or '\n' as
  end-of-line, and ignores lineterminator. This behavior may change in the
  future."

  So it wouldn't matter even if I set the lineterminator and gave the csv a
  new dialect. The reader would just ignore it.
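
That note is easy to check. In this sketch (Python 3, with "|" picked as a
deliberately odd terminator purely for illustration) the writer honours
`lineterminator` while the reader ignores it.

```python
import csv
import io

# The writer ends each row with the custom terminator...
buf = io.StringIO()
csv.writer(buf, lineterminator="|").writerow(["a", "b"])
written = buf.getvalue()

# ...but the reader does not split on it: "|" stays inside the field.
parsed = list(csv.reader(["a,b|c,d"], lineterminator="|"))
```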

2. Why it worked

  The second question was why it worked when I gave csv.reader a list.

  Going back to the documentation for csv.reader[4], I found out that
  _csvfile_ can accept any _Iterator_ object as its input. I quote:

  "csvfile can be any object which supports the iterator protocol and returns a
  string each time its next() method is called — file objects and list
  objects are both suitable."
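
To illustrate the quoted sentence: a list, a generator and a file-like object
holding the same lines all parse to the same rows. This is a Python 3 sketch
with made-up sample data.

```python
import csv
import io

lines = ["name,city", "asha,chennai"]

from_list = list(csv.reader(lines))
from_generator = list(csv.reader(line for line in lines))
from_file_like = list(csv.reader(io.StringIO("name,city\nasha,chennai\n")))
```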

So overall it seemed to me like the csv.reader was looking for a \n line
terminator and saving the file in Excel gave it a \r\n line terminator, which
is why the initial exception was being thrown. When I split it by the \r and
passed it in to the reader, it worked since the line endings would now have
been simple \ns.

Hope this helps someone else.

---
[1]: http://librelist.com/browser/flask/2013/7/18/filestorage-and-excel-file/#762ba54312006274ba80a09b7d59090d
[2]: http://werkzeug.pocoo.org/docs/0.9/datastructures/#werkzeug.datastructures.FileStorage
[3]: https://docs.python.org/2/library/csv.html#csv.Dialect.lineterminator
[4]: https://docs.python.org/2/library/csv.html#csv.reader