[TriZPUG] More Fun With Text Processing
Josh Johnson
jj at email.unc.edu
Fri Apr 3 20:07:30 CEST 2009
It might be; I'll check. The output is coming through pexpect, so it might
be weird (I know the tty module messes with the line endings).
JJ
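
One way to check for tabs and line-ending damage before picking a parsing strategy is sketched below. This is a minimal sketch, not something posted in the thread: the function names and the sample string are hypothetical, and it assumes the captured output is already a text string. Output read through a pseudo-terminal usually arrives with "\r\n" line endings, so the sketch normalizes those first and only uses the csv module if tab characters are actually present.

```python
import csv
import io


def normalize_newlines(raw):
    """Convert CRLF and bare CR line endings to plain LF."""
    return raw.replace("\r\n", "\n").replace("\r", "\n")


def parse_if_tabbed(raw):
    """Return a list of dicts if the text is tab-delimited, else None.

    Assumption: the first non-empty line is the header row.
    """
    text = normalize_newlines(raw).strip("\n")
    if "\t" not in text:
        return None  # no tabs found; a different strategy is needed
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical sample, not real output from the program in question.
SAMPLE = "TARGET\tVOLUME GROUP\tMIRROR\r\n1.1\tHIGHAVAIL\t2.1\r\n1.3\tBACKUP\t\r\n"

if __name__ == "__main__":
    for row in parse_if_tabbed(SAMPLE):
        print(row)
```

Printing repr() of a captured line is a quick way to see whether the separators are tabs or runs of spaces, and whether stray "\r" characters are still attached.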
Chris Rossi wrote:
> Or maybe it's already outputting tab characters?
>
> Chris
>
>
> On Fri, Apr 3, 2009 at 11:51 AM, Stephan Altmueller
> <stephan_altmueller at unc.edu> wrote:
>
> Josh,
>
> I think the first thing you should do is nail down the exact file
> format. If you have missing values and spaces in your format, you have
> no unambiguous way to decide which column an entry belongs to.
>
> Can you make the command-line program insert some sort of delimiter,
> such as commas?
>
> -- Stephan
>
> Josh Johnson wrote:
> > Ok all,
> > Since we've got a brain trust of Pythonistas who know how to deal
> > with strings, here's a problem I'm facing right now that I'd like
> > some input on:
> >
> > I've got a tabular list, the output of a command-line program,
> > and I need to parse it into some sort of structure.
> >
> > Here's an example of the data (the headings and column widths will vary):
> > TARGET  VOLUME GROUP  LENGTH      AVAILABLE   NPE      MIRROR
> > 1.1     HIGHAVAIL     5001.023GB  4501.008GB  1192337  2.1
> > 1.3     BACKUP        5001.023GB  4250.759GB  1192337
> > 1.4     BACKUP        3000.613GB  3000.353GB  715402
> > 2.2     HIGHAVAIL     5001.023GB  5001.015GB  1192337  1.2
> > 2.3     BACKUP        5001.023GB  5000.763GB  1192337
> > 2.4     BACKUP        3000.613GB  3000.353GB  715402
> >
> > I'd like a structure I can work with, like say, a list of hashes.
> >
> > My initial approach involves treating the header row as the guide for
> > the field lengths, and then extracting substrings for each field in
> > each row.
> >
> > I also thought about just doing a split on spaces, but some of the
> > fields could have spaces in their data.
> >
> > What do you guys think?
> >
> > JJ
> > _______________________________________________
> > TriZPUG mailing list
> > TriZPUG at python.org
> > http://mail.python.org/mailman/listinfo/trizpug
> > http://trizpug.org is the Triangle Zope and Python Users Group
>
>
> --
> -------------------------------------------------
> Stephan Altmueller
> Applications Analyst, Enterprise Applications
> Office of Arts and Sciences Information Services
> University of North Carolina at Chapel Hill
> CB 3056, 06 Howell Hall
> Chapel Hill, NC 27599-3056
> 919.448.5936 (direct line)
> stephan_altmueller at unc.edu
> AIM: oasisaltmuell
> http://oasis.unc.edu
>
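
The header-guided approach from the original question (use the header row to find the field boundaries, then slice each row) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the program's documented format: it assumes headings are separated by two or more spaces, that a heading may contain single spaces (as in "VOLUME GROUP"), and that every value is left-aligned under its heading. The name parse_fixed_width and the sample text are hypothetical. Right-aligned columns, or values wider than their heading's column, would need a different boundary rule.

```python
import re


def parse_fixed_width(text):
    """Parse a fixed-width table into a list of dicts keyed by heading.

    Assumptions (not confirmed by the source program):
    - the first non-empty line is the header row,
    - headings are separated by at least two spaces,
    - each value starts at the same offset as its heading.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]

    # A heading is one or more words joined by single spaces.
    fields = [(m.group(), m.start())
              for m in re.finditer(r"\S+(?: \S+)*", header)]
    starts = [start for _, start in fields]
    # Each column runs up to the next column's start; the last is open-ended.
    ends = starts[1:] + [None]

    records = []
    for row in rows:
        record = {}
        for (name, start), end in zip(fields, ends):
            value = row[start:end].strip()
            record[name] = value or None  # missing values become None
        records.append(record)
    return records


# Hypothetical sample laid out like the data in the question.
SAMPLE = """\
TARGET  VOLUME GROUP  LENGTH      AVAILABLE   NPE      MIRROR
1.1     HIGHAVAIL     5001.023GB  4501.008GB  1192337  2.1
1.3     BACKUP        5001.023GB  4250.759GB  1192337
"""

if __name__ == "__main__":
    for record in parse_fixed_width(SAMPLE):
        print(record)
```

Slicing by offset rather than splitting on whitespace is what lets a value contain spaces and lets a missing MIRROR entry come back as None instead of shifting the other columns.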