csv into multiple columns using split function using python

Peter Otten __peter__ at web.de
Wed Nov 30 04:14:18 EST 2016


handar94 at gmail.com wrote:

> I am trying to split a specific column of csv into multiple column and
> then appending the split values at the end of each row.
> 
> ------------------------------------------------
> import csv
> fOpen1=open('Meta_D1.txt')
> 
> reader=csv.reader(fOpen1)
> mylist=[elem[1].split(',') for elem in reader]
> mylist1=[]
> 
> for elem in mylist1:
>     mylist1.append(elem)
> 
> 
> #writing to a csv file
> with open('out1.csv', 'wb') as fp:
>     myf = csv.writer(fp, delimiter=',')
>     myf.writerows(mylist1)
> 
> ---------------------------------------------------
> Here is the link to file I am working on 2 column.
> https://spaces.hightail.com/space/4hFTj
> 
> Can someone guide me further?

Use helper functions to process one row and the column you want to split:

import csv

def split_column(column):
    """
    >>> split_column("foo,bar,baz")
    ['foo', 'bar', 'baz']
    """
    return column.split(",")

def process_row(row):
    """
    >>> process_row(["foo", "one,two,three", "bar"])
    ['foo', 'one,two,three', 'bar', 'one', 'two', 'three']
    """
    new_row = row + split_column(row[1])
    return new_row

def convert_csv(infile, outfile):
    # newline="" lets the csv module handle line endings itself
    with open(infile, newline="") as instream:
        rows = csv.reader(instream)
        with open(outfile, "w", newline="") as outstream:
            writer = csv.writer(outstream, delimiter=",")
            writer.writerows(process_row(row) for row in rows)

if __name__ == "__main__":
    convert_csv(infile="infile.csv", outfile="outfile.csv")
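
To see what process_row() actually receives, here is a small sketch that feeds csv.reader an in-memory sample instead of a file. The sample line is made up, and it assumes the second column is quoted in your file so that the csv module keeps its embedded commas together as one field:

```python
import csv
import io

# Hypothetical sample data; the second column is quoted, so csv.reader
# returns it as a single field that still contains the commas.
sample = 'foo,"one,two,three",bar\n'

def split_column(column):
    return column.split(",")

def process_row(row):
    # Append the split values to the end of the unchanged row.
    return row + split_column(row[1])

rows = [process_row(row) for row in csv.reader(io.StringIO(sample))]
print(rows)
```

If the second column is not quoted in the real file, csv.reader will already have split it at the commas, and row[1] will hold only the first part.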

That makes it easy to identify (and fix) the parts that do not work to your 
satisfaction. Let's say you want to remove the original unsplit second 
column. You know you only have to modify process_row(). You change the 
doctest first:

def process_row(row):
    """
    >>> process_row(["foo", "one,two,three", "bar"])
    ['foo', 'bar', 'one', 'two', 'three']
    """
    new_row = row + split_column(row[1])
    return new_row

and verify that it fails:

$ python3 -m doctest split_column2.py
**********************************************************************
File "/somewhere/split_column2.py", line 12, in split_column2.process_row
Failed example:
    process_row(["foo", "one,two,three", "bar"])
Expected:
    ['foo', 'bar', 'one', 'two', 'three']
Got:
    ['foo', 'one,two,three', 'bar', 'one', 'two', 'three']
**********************************************************************
1 items had failures:
   1 of   1 in split_column2.process_row
***Test Failed*** 1 failures.

Then fix the function until the test run produces no output:

$ python3 -m doctest split_column2.py

If you don't trust the "no output means everything is OK" philosophy, use the 
--verbose flag:

$ python3 -m doctest --verbose split_column2.py
Trying:
    process_row(["foo", "one,two,three", "bar"])
Expecting:
    ['foo', 'bar', 'one', 'two', 'three']
ok
Trying:
    split_column("foo,bar,baz")
Expecting:
    ['foo', 'bar', 'baz']
ok
2 items had no tests:
    split_column2
    split_column2.convert_csv
2 items passed all tests:
   1 tests in split_column2.process_row
   1 tests in split_column2.split_column
2 tests in 4 items.
2 passed and 0 failed.
Test passed.
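
For reference, one possible way to make the revised doctest pass is sketched below. It is only one of several reasonable fixes, and it assumes the column to split is always the second one (index 1), as in the examples above:

```python
def split_column(column):
    """
    >>> split_column("foo,bar,baz")
    ['foo', 'bar', 'baz']
    """
    return column.split(",")

def process_row(row):
    """
    >>> process_row(["foo", "one,two,three", "bar"])
    ['foo', 'bar', 'one', 'two', 'three']
    """
    # Keep every column except the second, then append its split parts.
    return row[:1] + row[2:] + split_column(row[1])
```

Slicing builds a new list, so the row passed in is left unmodified.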




