Editing tabular data [was: PEP8 79 char max]

Neil Cerutti neilc at norwich.edu
Thu Aug 1 10:04:53 EDT 2013


On 2013-08-01, Chris Angelico <rosuav at gmail.com> wrote:
> On Wed, Jul 31, 2013 at 8:02 PM, Grant Edwards <invalid at invalid.invalid> wrote:
>> On 2013-07-31, Skip Montanaro <skip at pobox.com> wrote:
>>>> I don't understand.  That just moves them to a different
>>>> file -- doesn't it?  You've still got to deal with editing a
>>>> large table of data (for example when I want to add
>>>> instructions to your assembler).
>>>
>>> My guess is it would be more foolproof to edit that stuff
>>> with a spreadsheet.
>>
>> Many years ago, I worked with somebody who used a spreadsheet
>> like that.  I tried it and found it to be way too cumbersome.
>> The overhead involved in putting tables into a slew of
>> different files and starting up LibreOffice to edit/view them
>> is huge compared to just editing them with emacs in a file
>> along with the source code.  Maybe my computer is too
>> old/slow.  Maybe it's just due to how bad I am at
>> Excel/LibreOffice...
>
> I'm glad someone else feels that way!
>
> At work, we have a number of CSV files (at my boss's
> insistence; I would much rather they be either embedded in the
> source, or in some clearer and simpler format) which I like to
> manipulate in SciTE, rather than OO/LibreOffice. (I'll not
> distinguish those two. Far as I'm concerned, they're one
> product with two names.) My boss can't understand why I do
> this. I can't understand why he objects to having to edit code
> files to alter internal data. I have pointed him to [1] but to
> no avail.
>
> The one thing I would do, though, is align with tabs rather
> than spaces. That gives you an 8:1 (if you keep your tabs at
> eight, which I do) improvement in maintainability, because
> edits that don't cross a boundary don't require fiddling with
> the layout.
>
> [1] http://thedailywtf.com/Articles/Soft_Coding.aspx

Thanks for that link. Good food for thought.

Here's an excerpt from one of my more questionable tables:

Attribute, Description, Fund, Amount
AFSO,Air Force Special Ops Command,,
CSEN,English Proficiency Met,,
CSMT,Math Proficiency Met,,
GBFP,MBA Full Program,,
GBMP,MBA Prereq Met,,
GCEC,Continuing Education Civilian Tuition Rate,,
GCEM,Continuing Education Military Tuition Rate,,
GCFP,MCA Prereq Needed,,
GCMP,MCE Prereq Met,,
GCRT,Certificate Student,,
GE25,25% to XCompany,,
GE40,40% to XCompany,,
GEMP,Employee,Fac,100%
GI03,CISSP Scholarship,CISSP,1500
GIHR,Grad In-House Recruiting,,
GRMS,Graduate Military Scholarship,Milit,1200

It lists each student attribute, a description, the fund that
attribute requires, if any, and the amount. A tiny amount of DSL
is involved, with Faculty Scholarship paying 100% of tuition
instead of a fixed number. Another special code, _ (not shown
above), means the fund takes an arbitrary amount determined by a
person we have to literally query to discover.
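
To make the Amount rules concrete, here is a minimal sketch in
Python 3 of how such a cell might be interpreted. The function
name parse_amount and the (kind, value) return shape are
hypothetical, chosen for illustration; this is not the actual
code behind the table.

```python
from decimal import Decimal


def parse_amount(text):
    """Interpret one Amount cell (hypothetical helper).

    Returns a (kind, value) tuple:
      ("none", None)        empty cell, no amount
      ("arbitrary", None)   "_", amount decided by a person
      ("percent", Decimal)  e.g. "100%", a share of tuition
      ("fixed", Decimal)    e.g. "1500", a fixed amount
    """
    text = text.strip()
    if not text:
        return ("none", None)
    if text == "_":
        return ("arbitrary", None)
    if text.endswith("%"):
        return ("percent", Decimal(text[:-1]))
    # Anything else must be a plain number; Decimal raises
    # decimal.InvalidOperation on garbage, which surfaces typos.
    return ("fixed", Decimal(text))
```

Each additional special code would add another branch here,
which is where the mess could start to grow.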

I think I can see the potential problems. Two special codes for
amount are manageable, but the more special cases I end up
creating, the more of a mess I get. Plus, I haven't really
documented the file.

Most of the information is irrelevant, though I do like receiving
an exception when Admissions tries to sneak in a new attribute
without telling me.
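
A rough sketch of that fail-loudly lookup, assuming Python 3.3+
and the standard csv module. TABLE is a three-row excerpt of the
table above; load_attributes, lookup, and UnknownAttribute are
hypothetical names used only for illustration.

```python
import csv
import io

# Three rows copied from the excerpt above, header included.
TABLE = """\
Attribute, Description, Fund, Amount
GEMP,Employee,Fac,100%
GI03,CISSP Scholarship,CISSP,1500
GIHR,Grad In-House Recruiting,,
"""


class UnknownAttribute(KeyError):
    """Raised for an attribute code missing from the table."""


def load_attributes(f):
    # skipinitialspace drops the blanks after the commas in the
    # header row, so the keys are "Fund", not " Fund".
    reader = csv.DictReader(f, skipinitialspace=True)
    return {row["Attribute"]: row for row in reader}


def lookup(attributes, code):
    try:
        return attributes[code]
    except KeyError:
        # Fail loudly so a new, unannounced code gets noticed.
        raise UnknownAttribute(code) from None


attributes = load_attributes(io.StringIO(TABLE))
```

With the real file, io.StringIO(TABLE) would be replaced by an
open file object.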

If I instead had a function that handled only the interesting
attributes it might be pretty small. I'll have to think on this.
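
As a sketch of what such a function could look like, using only
the three rows in the excerpt that carry a fund. The name
fund_for and the tuition parameter are hypothetical, and the
real table may contain interesting attributes not shown here.

```python
from decimal import Decimal


def fund_for(attribute, tuition):
    """Return (fund, amount) for attributes that carry a fund.

    Returns None for every other attribute code.
    """
    if attribute == "GEMP":
        return ("Fac", tuition)  # 100% of tuition
    if attribute == "GI03":
        return ("CISSP", Decimal("1500"))
    if attribute == "GRMS":
        return ("Milit", Decimal("1200"))
    return None
```

The trade-off is that a brand-new attribute code falls through
to None silently, so the exception described above is lost
unless known-but-uninteresting codes are still checked somewhere.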

-- 
Neil Cerutti


