[Tutor] mapping header row to data rows in file
Peter Otten
__peter__ at web.de
Fri Jun 28 19:09:22 CEST 2013
Sivaram Neelakantan wrote:
> I apologise for mailing you directly but this one seems to work but I
> don't seem to understand it. Could you please explain this?
[I don't see anything private about your questions, so I'm taking the
liberty of bringing this back to the list]
> a) for row in reader(f)...
> reader(f) is called 6 times or not?
No, the reader() function is called once before the first iteration of the
loop. You can think of
for x in expr():
    ...
as syntactic sugar for
tmp = iter(expr())
while True:
    try:
        x = next(tmp)
    except StopIteration:
        break
    ...
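If you want to convince yourself that expr() runs only once per loop, here is
a small sketch (Python 3 syntax; expr, calls, seen_for and seen_while are names
made up for this illustration) that counts the calls and runs both forms:

```python
calls = []  # records every call of expr()

def expr():
    calls.append("called")
    return ["a", "b", "c"]

# the ordinary for-loop
seen_for = []
for x in expr():
    seen_for.append(x)

# the spelled-out equivalent
seen_while = []
tmp = iter(expr())
while True:
    try:
        x = next(tmp)
    except StopIteration:
        break
    seen_while.append(x)

print(seen_for)    # ['a', 'b', 'c']
print(seen_while)  # ['a', 'b', 'c']
print(len(calls))  # 2: one call per loop, not one per iteration
```

Each loop visits three items, yet expr() is entered only once for each loop.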
> b) why isn't the print in reader() not printing each row each time
> reader() is called
It is called just once, so the print statements in its body run just once,
too. The function returns a "generator" built from the
"generator expression"
(Row(*values) for values in rows)
which corresponds to the "tmp" variable in the long version of the for-loop
above. A generator lazily produces one value each time you call next() on
it:
>>> g = (i*i for i in [1, 2, 3])
>>> next(g) # same as g.next() in Python 2 or g.__next__() in Python 3
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
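You rarely call next() by hand. A for-loop or list() does it for you and
treats StopIteration as the signal to stop. The following sketch (Python 3
syntax, variable names chosen only for illustration) also shows that a
generator can be consumed only once, which is why a single next(rows) in
reader() takes the header off before the remaining rows are processed:

```python
g = (i * i for i in [1, 2, 3])

first = next(g)      # takes the first value off the generator
remaining = list(g)  # list() keeps calling next() until StopIteration
again = list(g)      # the generator is exhausted now

print(first)      # 1
print(remaining)  # [4, 9]
print(again)      # []
```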
There is an alternative way to create a generator that is perhaps easier to
grasp:
>>> def f():
...     for i in [1, 2, 3]:
...         yield i*i
...
>>> g = f()
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
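To watch when the body of such a function actually runs you can make it leave
a trace. This is only a sketch in Python 3 syntax; squares() and the log list
are hypothetical names, not part of your script:

```python
log = []  # hypothetical helper that records what the generator does

def squares():
    log.append("start")
    for i in [1, 2, 3]:
        log.append("before yield %d" % (i * i))
        yield i * i
    log.append("done")

g = squares()
log_after_call = list(log)   # calling squares() runs none of its body

first = next(g)
log_after_first = list(log)  # the body ran up to the first yield

rest = list(g)               # runs the body to its end

print(log_after_call)   # []
print(log_after_first)  # ['start', 'before yield 1']
print(first, rest)      # 1 [4, 9]
print(log[-1])          # done
```

Nothing is appended to log until the first next() call, and "done" appears
only after the last value has been requested.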
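On each next() call the code in f() resumes where it left off and runs until
it encounters the next "yield"; the yielded value becomes the result of
next(). When f() ends without reaching another "yield", StopIteration is
raised.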
> c) what does Row(*values) do?
It unpacks the values sequence into separate positional arguments. For
example, if values is a list of length 3 like values = ["a", "b", "c"] then
Row(*values)
is equivalent to
Row(values[0], values[1], values[2])
or
Row("a", "b", "c")
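Here is a short sketch of the same thing with a namedtuple (Python 3 syntax;
the three field names are borrowed from your header just for illustration):

```python
from collections import namedtuple

Row = namedtuple("Row", ["Symbol", "Series", "Date"])
values = ["STER", "EQ", "22-Nov-2012"]

a = Row(*values)                          # unpacked: three arguments
b = Row(values[0], values[1], values[2])  # spelled out by hand

print(a == b)  # True
print(a.Date)  # 22-Nov-2012

# Without the star the whole list is passed as one single argument,
# so Row complains about the missing fields.
star_needed = False
try:
    Row(values)
except TypeError:
    star_needed = True
print(star_needed)  # True
```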
>
> --8<---------------cut here---------------start------------->8---
> def reader(instream):
>     rows = csv.reader(instream)
>     # rows = (line.split(",") for line in instream)
>     rows = ([field.strip() for field in row] for row in rows)
>     print type(rows)
>     names = next(rows)
>     print names
>     Row = namedtuple("Row", names)
>     return (Row(*values) for values in rows)
>
> with open("AA.csv", "r") as f:
>     for row in reader(f):
>         print row
>
> $ python csvproc.py
> <type 'generator'>
> ['Symbol', 'Series', 'Date', 'Prev_Close']
> Row(Symbol='STER', Series='EQ', Date='22-Nov-2012', Prev_Close='9')
> Row(Symbol='STER', Series='EQ', Date='29-Nov-2012', Prev_Close='10')
> Row(Symbol='STER', Series='EQ', Date='06-Dec-2012', Prev_Close='11')
> Row(Symbol='STER', Series='EQ', Date='06-Jun-2013', Prev_Close='9')
> Row(Symbol='STER', Series='EQ', Date='07-Jun-2013', Prev_Close='9')
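In case you want to experiment without the file, here is a self-contained
sketch of the same reader() in Python 3 syntax. The io.StringIO object stands
in for AA.csv, and its contents are an assumption reconstructed from the
output you posted, not your actual data:

```python
import csv
import io
from collections import namedtuple

def reader(instream):
    rows = csv.reader(instream)
    # strip surrounding whitespace from every field, lazily
    rows = ([field.strip() for field in row] for row in rows)
    names = next(rows)              # the first row is the header
    Row = namedtuple("Row", names)
    # the remaining rows are converted one at a time, on demand
    return (Row(*values) for values in rows)

# stand-in for AA.csv (assumed contents)
sample = io.StringIO(
    "Symbol, Series, Date, Prev_Close\n"
    "STER, EQ, 22-Nov-2012, 9\n"
    "STER, EQ, 29-Nov-2012, 10\n"
)

result = list(reader(sample))
for row in result:
    print(row)
```

Note that the fields stay strings; Prev_Close is '9', not 9, unless you
convert it yourself.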