text file reformatting

Braden Faulkner bradenf at hotmail.com
Sun Oct 31 15:52:21 EDT 2010


I also am having issues with this.

> Date: Sun, 31 Oct 2010 14:48:09 -0500
> From: python.list at tim.thechases.com
> To: iwawi123 at gmail.com
> Subject: Re: text file reformatting
> CC: python-list at python.org
> 
> > PRJ01001 4 00100END
> > PRJ01002 3 00110END
> >
> > I would like to copy only some columns to a new file and put them in
> > certain places (to match previous data). The definition file (def.csv)
> > could be something like this:
> >
> > VARIABLE	FIELDSTARTS	FIELD SIZE	NEW PLACE IN NEW DATA FILE
> > ProjID	;	1	;	5	;	1
> > CaseID	;	6	;	3	;	10
> > UselessV  ;	10	;	1	;
> > Zipcode	;	12	;	5	;	15
> >
> > So the new datafile should look like this:
> >
> > PRJ01    001       00100END
> > PRJ01    002       00110END
> 
> 
> How flexible is the def.csv format?  The difficulty I see with 
> your def.csv format is that it leaves undefined gaps (presumably 
> to be filled in with spaces) and that you also have a blank "new 
> place in new file" value.  If instead, you could specify the 
> width to which you want to pad it and omit variables you don't 
> want in the output, ordering the variables in the same order you 
> want them in the output:
> 
>   Variable; Start; Size; Width
>   ProjID; 1; 5; 10
>   CaseID; 6; 3; 10
>   Zipcode; 12; 5; 5
>   End; 17; 3; 3
> 
> (note that I lazily use the same method to copy the END from the 
> source to the destination, rather than coding specially for it) 
> you could do something like this (untested)
> 
>    import csv
>    # Start values in def.csv are 1-based; Python slices are 0-based.
>    with open('def.csv', newline='') as f:
>      next(f) # discard the header row
>      r = csv.reader(f, delimiter=';')
>      fields = [
>        (varname.strip(),
>         slice(int(start) - 1, int(start) - 1 + int(size)),
>         int(width))  # ljust() needs an integer width
>        for varname, start, size, width
>        in r
>        ]
>    with open('data.txt') as src, open('out.txt', 'w') as out:
>      for row in src:
>        row = row.rstrip('\n')  # keep the newline out of the slices
>        for varname, slc, width in fields:
>          out.write(row[slc].ljust(width))
>        out.write('\n')
> 
> Hope that's fairly easy to follow and makes sense.  There might 
> be some fence-posting errors (particularly your use of "1" as the 
> initial offset, while python uses "0" as the initial offset for 
> strings)
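
To make the offset point concrete, here is a minimal sketch of turning a
1-based start column plus a size into a 0-based Python slice. The helper
name field_slice is made up for illustration; the sample row and the column
numbers come from the original question.

```python
def field_slice(start, size):
    """Convert a 1-based start column and a field size to a 0-based slice."""
    begin = start - 1          # column 1 is index 0 in a Python string
    return slice(begin, begin + size)

row = "PRJ01001 4 00100END"

print(row[field_slice(1, 5)])    # ProjID
print(row[field_slice(6, 3)])    # CaseID
print(row[field_slice(12, 5)])   # Zipcode

# Using the start value directly as a 0-based index is off by one:
print(row[slice(1, 1 + 5)])
```

Without the subtraction, every field is shifted one character to the right.
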
> 
> If you can't modify the def.csv format, then things are a bit 
> more complex and I'd almost be tempted to write a script to try 
> and convert your existing def.csv format into something simpler 
> to process like what I describe.
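
A rough sketch of such a conversion script, under a few assumptions of my
own: the original def.csv is ';'-separated with stray whitespace around the
values, a blank "new place" means the field is dropped, and each field is
padded up to the start of the next one (the last field keeps its own size).
The name convert_def and the inline sample text are only for illustration.

```python
import csv
import io

# Sample shaped like the original def.csv; the exact whitespace in the
# real file is an assumption.
ORIGINAL_DEF = """\
VARIABLE;FIELDSTARTS;FIELD SIZE;NEW PLACE IN NEW DATA FILE
ProjID ; 1 ; 5 ; 1
CaseID ; 6 ; 3 ; 10
UselessV ; 10 ; 1 ;
Zipcode ; 12 ; 5 ; 15
"""

def convert_def(text):
    """Return (name, start, size, width) tuples for the fields that are kept."""
    reader = csv.reader(io.StringIO(text), delimiter=';')
    next(reader)  # discard the header row
    kept = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        name, start, size, place = [cell.strip() for cell in row]
        if place:  # a blank "new place" means the field is not copied
            kept.append((name, int(start), int(size), int(place)))
    kept.sort(key=lambda field: field[3])  # order by output column
    converted = []
    for i, (name, start, size, place) in enumerate(kept):
        if i + 1 < len(kept):
            width = kept[i + 1][3] - place  # pad up to the next field
        else:
            width = size  # last field needs no padding
        converted.append((name, start, size, width))
    return converted

simple = convert_def(ORIGINAL_DEF)
for name, start, size, width in simple:
    print("%s; %d; %d; %d" % (name, start, size, width))
```

This does not check for overlapping output fields (a computed width smaller
than the field size), and it does not add a row for the trailing END, which
the original def.csv never defines.
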
> 
> -tkc