continue vs. pass in this IO reading and writing

kbtyo ahlusar.ahluwalia at gmail.com
Thu Sep 3 11:57:30 EDT 2015


On Thursday, September 3, 2015 at 11:52:16 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:38 AM, kbtyo wrote:
> > Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which
> > means "do nothing")") that the else block is not entered. For exma
> 
> Seems like a cut-off paragraph here, but yes. In a try/except/else
> block, the 'else' block executes only if the 'try' didn't raise an
> exception of the specified type(s).
> 
> > Do you mind elaborating on what you meant by "compatible headers?". The files that I am processing may or may not have the same headers (but if they do they should add the respective values only).
> >
> 
> Your algorithm is basically: Take the entire first file, including its
> header, and then append all other files after skipping their first
> lines. If you want a smarter form of CSV merge, I would recommend
> using the 'csv' module, and probably doing a quick check of all files
> before you begin, so as to collect up the full set of headers. That'll
> also save you the hassle of playing around with StopIteration as you
> read in the headers.
> 
> ChrisA
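
To make the try/except/else point concrete, here is a minimal sketch. The function name first_line is hypothetical and used only for illustration. It assumes the header is read with next(), as in the StopIteration discussion above. The 'else' branch runs only when the 'try' body finishes without raising anything; an exception that is not listed in 'except' would simply propagate, and 'else' would be skipped in that case too.

```python
def first_line(lines):
    """Return a (branch, value) pair showing which branch handled the input."""
    it = iter(lines)
    try:
        header = next(it)  # raises StopIteration when the input is empty
    except StopIteration:
        # Reached only when next() found nothing to read.
        return ("except", None)
    else:
        # Reached only when the try body completed without raising.
        return ("else", header)
```

Calling first_line([]) takes the 'except' branch, while first_line(["a,b"]) takes the 'else' branch and returns the header line.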


I have files that may have different headers. If a header is new, it should be appended (along with its values). If a header is a duplicate, its values should just be added to the existing column.

I have used the csv and collections modules. For some reason, when I apply this algorithm, not all of my data is written out (the output is ridiculously small considering how much goes in: think KB of output vs. MB of input):

from glob import iglob
import csv
from collections import OrderedDict

files = sorted(iglob('*.csv'))
header = OrderedDict()
data = []

for filename in files:
    with open(filename, 'r') as fin:
        csvin = csv.DictReader(fin)
        header.update(OrderedDict.fromkeys(csvin.fieldnames))
        data.append(next(csvin))

with open('output_filename_version2.csv', 'w') as fout:
    csvout = csv.DictWriter(fout, fieldnames=list(header))
    csvout.writeheader()
    csvout.writerows(data)
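
One possible explanation for the small output, offered as a sketch rather than a definitive fix: next(csvin) returns only the first data row of each file, so data ends up with one row per file. The version below assumes that every row of every file should be kept, that Python 3 is in use (hence newline=''), and that cells for columns a file does not have should be left empty. The function name merge_csv and its return value are hypothetical, added only so the behaviour can be checked.

```python
import csv
from collections import OrderedDict


def merge_csv(filenames, out_name):
    """Merge CSV files into out_name, using the union of all headers."""
    header = OrderedDict()  # keeps columns in first-seen order
    rows = []

    for filename in filenames:
        with open(filename, 'r', newline='') as fin:
            reader = csv.DictReader(fin)
            # fieldnames is None for an empty file, so fall back to [].
            header.update(OrderedDict.fromkeys(reader.fieldnames or []))
            # Keep every remaining row, not just the first one.
            rows.extend(reader)

    with open(out_name, 'w', newline='') as fout:
        # restval fills in columns that a given row does not have.
        writer = csv.DictWriter(fout, fieldnames=list(header), restval='')
        writer.writeheader()
        writer.writerows(rows)

    return list(header), len(rows)
```

It could be called as merge_csv(sorted(iglob('*.csv')), 'output_filename_version2.csv'). Note that if the output file already exists in the same directory, the glob would pick it up as an input on the next run.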


