*.csv to *.txt after adding columns

Bryan Britten britten.bryan at gmail.com
Tue Sep 17 22:28:54 EDT 2013


Dave -

I can't print the output because there are close to 1,000,000 records. It would be extremely inefficient and resource-intensive to look at every row. Like I said, when I take just one file and run the code over the first few records, I get what I'd expect to see. Here's an example (non-redacted code):

INPUT:

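import csv

# Despite the name, this holds the data directory path, not a file handle.
fileHandle = 'C:/Users/Bryan/Data Analysis/Crime Analysis/Data/'

# Tab-separated header for the output, with Date and Time as separate columns.
varNames = 'ID\tCaseNum\tDate\tTime\tBlock\tIUCR\tPrimaryType\tDescription\tLocDesc\tArrest\tDomestic\tBeat\tDistrict\tWard\tCommArea\tFBICode\tXCoord\tYCoord\tYear\tUpdatedOn\tLat\tLong\tLoc\n'

# Output file is opened here, but this test run only prints to the console.
outFile = open(fileHandle + 'ChiCrime01_02.txt', 'w')
inFile = open(fileHandle + 'ChiCrime01_02.csv', 'rb')
reader = csv.reader(inFile, delimiter=',')
rowNum = 0
for row in reader:
    if rowNum < 5:  # only look at the first five rows
        if rowNum >= 1:
            # Split the combined date/time field and insert the two parts after it,
            # then drop the original combined field.
            date, time = row[2].split()
            row.insert(3, date)
            row.insert(4, time)
            row.remove(row[2])
            print '\t'.join(row)
            rowNum+=1
        else:
            # The first row is the CSV's own header; print the new header instead.
            print varNames
            rowNum+=1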


OUTPUT:

ID      CaseNum Date    Time    Block   IUCR    PrimaryType     Description     LocDesc Arrest  Domestic        Beat    District        Ward    CommArea        FBICode XCoord  YCoord  Year    UpdatedOn       Lat     Long    Loc

2924745 HJ602602        12/31/2002      23:59   006XX W 117TH ST        841     THEFT   FINANCIAL ID THEFT:$300 &UNDER  RESIDENCE PORCH/HALLWAY FALSE   FALSE   524     5       34      53      6       1173831 1827387 2002    3/30/2006 21:10 41.68175482     -87.63931351    (41.681754819160666, -87.63931350564216)

2523290 HJ101091        12/31/2002      23:59   002XX W 112TH PL        1310    CRIMINAL DAMAGE TO PROPERTY     APARTMENT       FALSE   FALSE   522             34      49      14      1176848 1830383 2002    3/30/2006 21:10 41.68990907     -87.62817988    (41.689909072449474, -87.62817987594765)

2527332 HJ105139        12/31/2002      23:55   005XX E 89TH PL 486     BATTERY DOMESTIC BATTERY SIMPLE RESIDENCE       FALSE   TRUE    633             6       44      08B     1181369 1845794 2002    3/30/2006 21:10 41.73209609     -87.61115533    (41.732096089465905, -87.61115532670617)

2524251 HJ100175        12/31/2002      23:55   012XX S KARLOV AVE      041A    BATTERY AGGRAVATED: HANDGUN     SIDEWALK        FALSE   FALSE   1011            24      29      04B     1149196 1894387 2002    3/30/2006 21:10 41.86612296     -87.72776536    (41.86612295941429, -87.72776535755746)


Like I said, the output is exactly what I want, but it doesn't seem to be writing to the file and I don't know why. I said I didn't know whether it was raising an exception because I'm new to Python and wasn't sure whether some methods fail "silently", letting the code continue but producing the wrong results, such as not writing my files.
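
For comparison, here is a minimal sketch of the same loop that writes each row to the output file instead of printing it, and lets a with block close (and therefore flush) both files. It is only a sketch under some assumptions: it is written for Python 3 (print-free, text-mode files opened with newline=''), the function name convert and its header argument are made up for illustration, and the demo runs on a throwaway two-row file whose rows are truncated from the output above and whose CSV header row is a placeholder. Whether a missing write or close is what goes wrong in the full script is not something the example above can confirm.

```python
import csv
import os
import tempfile


def convert(in_path, out_path, header):
    """Copy a CSV to a tab-separated file, splitting column 2 into date and time.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # The with block closes both files on exit, even if a row raises an error.
    with open(in_path, newline='') as in_file, \
            open(out_path, 'w', newline='') as out_file:
        reader = csv.reader(in_file)
        next(reader, None)  # skip the CSV's own header row
        out_file.write('\t'.join(header) + '\n')
        written = 0
        for row in reader:
            # 'MM/DD/YYYY HH:MM' -> two fields, replacing the combined one
            date, time = row[2].split(None, 1)
            row[2:3] = [date, time]
            out_file.write('\t'.join(row) + '\n')
            written += 1
    return written


# Demo on a temporary file (placeholder header, rows truncated to four columns).
demo_dir = tempfile.mkdtemp()
csv_path = os.path.join(demo_dir, 'sample.csv')
txt_path = os.path.join(demo_dir, 'sample.txt')
with open(csv_path, 'w', newline='') as f:
    f.write('ID,Case Number,Date,Block\n')
    f.write('2924745,HJ602602,12/31/2002 23:59,006XX W 117TH ST\n')
    f.write('2523290,HJ101091,12/31/2002 23:59,002XX W 112TH PL\n')

n_written = convert(csv_path, txt_path,
                    ['ID', 'CaseNum', 'Date', 'Time', 'Block'])
with open(txt_path) as f:
    lines = f.read().splitlines()
print(n_written, 'rows written; first row:', lines[1])
```

Returning the row count gives a cheap check on a million-record file without printing every row.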

Lastly, why does everyone seem to push for os.path.join versus the method I have used? Is it just a 'standard' that people like to see?
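
To make the comparison concrete, here is a small sketch of the two approaches side by side. Plain concatenation only works when the directory string already ends in a separator, while os.path.join adds one when it is missing. The variable names are made up for illustration; the directory and file name are the ones from the code above.

```python
import os.path

data_dir = 'C:/Users/Bryan/Data Analysis/Crime Analysis/Data/'
no_slash = data_dir.rstrip('/')  # same directory, trailing slash dropped

# Concatenation: correct only if the trailing slash is present.
concat_ok = data_dir + 'ChiCrime01_02.txt'
concat_bad = no_slash + 'ChiCrime01_02.txt'  # ends in 'DataChiCrime01_02.txt'

# os.path.join: inserts a separator only when one is needed.
joined_a = os.path.join(data_dir, 'ChiCrime01_02.txt')
joined_b = os.path.join(no_slash, 'ChiCrime01_02.txt')

print(concat_bad)
print(os.path.normpath(joined_a) == os.path.normpath(joined_b))
```

Both forms give a usable path when the directory string is written carefully; the join version just removes one way to get it wrong.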

Thanks for your help


