To write headers once with different values in separate row in CSV

Sahlusar ahlusar.ahluwalia at gmail.com
Tue Jun 23 13:15:56 EDT 2015


I have the following script for writing out to CSV two items in a list to a CSV in such a format, such that if we have:

L = [('A', 1), ('B', 2), ('C', 3), ('D', 4),('A', 5), ('B', 6), ('D', 8)]

we want

A B C D
1 2 3 4
5 6   8

And with this 

L = [('A', 1), ('B', 2), ('C', 3), ('D', 4),('D', 5), ('B', 6), ('A', 8)]

A B C D
1 2 3 4
8 6   5

I have this script: 

    def makerows(pairs):
        headers = []
        columns = {}
        for k, v in pairs:
            if k in columns:
                columns[k].extend((v,))
            else:
                headers.append(k)
                columns[k] = [k, v]
        m = max(len(c) for c in columns.values())
        for c in columns.values():
            c.extend('' for i in range(len(c), m))
        L = [columns[k] for k in headers]
        rows = list(zip(*L))
        return rows

//With this test//


    if __name__ == '__main__':
        
        lists = [
            [('A', 1), ('B', 2), ('C', 3), ('D', 4),('A', 5), ('B', 6), ('C', 7), ('D', 8)],
            [('A', 1), ('B', 2), ('C', 3), ('D', 4),('A', 5), ('B', 6), ('D', 8)],
            [('A', 1), ('B', 2), ('C', 3), ('D', 4),('D', 5), ('B', 6), ('A', 8)],
        ]
        from pprint import pprint
        for data in lists:
            print(data)
            pprint(makerows(data))


output:
    
    [('A', 1), ('B', 2), ('C', 3), ('D', 4), ('A', 5), ('B', 6), ('C', 7), ('D', 8)]
    [('A', 'B', 'C', 'D'), (1, 2, 3, 4), (5, 6, 7, 8)]
    [('A', 1), ('B', 2), ('C', 3), ('D', 4), ('A', 5), ('B', 6), ('D', 8)]
    [('A', 'B', 'C', 'D'), (1, 2, 3, 4), (5, 6, '', 8)]
    [('A', 1), ('B', 2), ('C', 3), ('D', 4), ('D', 5), ('B', 6), ('A', 8)]  

[('A', 'B', 'C', 'D'), (1, 2, 3, 4), (8, 6, '', 5)]

However, when I extrapolate this same logic with a list like:

('Response.MemberO.PMembers.PMembers.Member.CurrentEmployer.EmployerAddress.TimeAtPreviousAddress.', None), where the headers/columns are the first item (only to be written out once) with different values. I receive an output CSV with repeating headers and values all printed in one long string (when opened in an application like Excel). 

I use this script:

def makerows(pairs):

        headers = []
        columns = {}
        for k, v in pairs:
            if k in columns:
                columns[k].extend((v,))
            else:
                headers.append(k)
                columns[k] = [k, v]
        m = max(len(c) for c in columns.values())
        for c in columns.values():
            c.extend(' ' for i in range(len(c), m))
        L = [columns[k] for k in headers]
        rows = list(zip(*L))
        return rows
    
    
    def main():
        with open('sample.xml', 'r', encoding='utf-8') as f: 
            xml_string = f.read() 
        xml_string= xml_string.replace('�', '') #optional to remove ampersands. 
        root = ElementTree.XML(xml_string) 
        for item in root:
            print(root)
        writer = csv.writer(open("test_out.csv", 'wt'))
        writer.writerows(makerows(flatten_dict(root)))
        
    if __name__ == "__main__":
            main()


************FYI, flatten_dict() parses XML elements and their text in to key, value pairs and then converts them into a list. That is not the underlying issue. Any thoughts or suggestions would be very helpful. Thank you. 



More information about the Python-list mailing list