JSON Object to CSV Question

Fri Jun 19 17:04:23 EDT 2015

@Joonas:

The previous example was a typo. Please use the below example as a case
study.

    {'D_B': ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'],
     'F_Int32': ['0', '0', '0', '0', '0', '0', '0', '0', '0',
                 '0', '0', '0', '0', '0', '0', '0', '0', '0',
                 '0', '0', '0', '0', '0', '0', '0'],
     'OTF': '0',
     'PBDS_Double': ['0', '0', '0', '0', '0', '0', '0', '0'],
     'SCS_String': ['1', '2']}

I have already asked all of the questions regarding the XML, and I have to
work within their parameters. The CSV, for example, may look like this:

DLA,FC,PC,WC,CN,Description,Code,CMC
0,00000,0,0,,,0,0

I have made some headway, as stated previously. I have now been able to
experiment with a list comprehension that demonstrates how the headers
will need to look for keys with repeating values.

    >>> key = 'spam'
    >>> L = ['foo', 'bar', 'baz']
    >>> [('{}{}'.format(key, i), value) for i, value in enumerate(L, 1)]
    [('spam1', 'foo'), ('spam2', 'bar'), ('spam3', 'baz')]

Is there a way that you know of that will:

1) Allow me to preserve the following two functions:

import csv
import json
import sys

def hook(obj):
    return obj

def flatten(obj):
    for k, v in obj:
        if isinstance(v, list):
            yield from flatten(v)
        else:
            yield k, v

if __name__ == "__main__":
    with open("somefileneame.json") as f:
        data = json.load(f, object_pairs_hook=hook)

    pairs = list(flatten(data))

    writer = csv.writer(sys.stdout)
    header = writer.writerow([k for k, v in pairs])
    # writer.writerows for any other iterable object
    row = writer.writerow([v for k, v in pairs])

and

2) Let me recursively check whether a key has a nested array and, if so,
apply the [('{}{}'.format(key, i), value) for i, value in enumerate(L, 1)]
list comprehension to it?
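
To make the question concrete, here is a minimal sketch of the behaviour I
am after; it is not a finished solution. It rests on a few assumptions of
mine: the helper is_object() and the optional prefix argument are
hypothetical additions, joining parent and child keys with "_" is only a
guess at a naming scheme, and the sample string is made up for illustration.
hook() is unchanged and flatten(data) is still called the same way.

```python
import csv
import io
import json


def hook(pairs):
    # Unchanged: every JSON object stays a list of (key, value) tuples.
    return pairs


def is_object(value):
    # Hypothetical helper. With the hook above, a JSON object arrives as a
    # list of tuples, while a JSON array is a list that never holds tuples.
    return (isinstance(value, list) and bool(value)
            and all(isinstance(item, tuple) for item in value))


def flatten(pairs, prefix=""):
    for key, value in pairs:
        name = prefix + key
        if is_object(value):
            # Nested object: recurse, prefixing child keys with the parent
            # key (the "_" separator is an assumption).
            yield from flatten(value, name + "_")
        elif isinstance(value, list):
            # Nested array: number the key with the list comprehension,
            # then flatten again in case the items are nested themselves.
            numbered = [('{}{}'.format(name, i), item)
                        for i, item in enumerate(value, 1)]
            yield from flatten(numbered)
        else:
            yield name, value


# Made-up sample input, loosely modelled on the case study above.
sample = ('{"D_B": ["0", "0"], "OTF": "0", '
          '"PBDS": {"Double": ["0", "0"]}, "SCS_String": ["1", "2"]}')
pairs = list(flatten(json.loads(sample, object_pairs_hook=hook)))

out = io.StringIO()
writer = csv.writer(out)
writer.writerow([k for k, v in pairs])  # header row
writer.writerow([v for k, v in pairs])  # value row
print(out.getvalue())
```

With that sample, the header row comes out as
D_B1,D_B2,OTF,PBDS_Double1,PBDS_Double2,SCS_String1,SCS_String2 and the
value row as 0,0,0,0,0,1,2. Keys that already end in a digit (F_Int32, for
example) would produce headers that are hard to read, such as F_Int321, so
the numbering scheme probably needs a separator as well.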

On Fri, Jun 19, 2015 at 4:32 PM, Joonas Liik <liik.joonas at gmail.com> wrote:

> this.. might not throw an error, but you have 2 keys with the same name
> "F", and 1 of them will probably be discarded..., you have data
> corruption even before you try to process it.
>
> {
>  "F": "False",
> "F": {
> "Int32": ["0",
> "0",
> "0"]
> },
>  }
>
> you mentioned Excel at one point.
> perhaps you could mock up what you'd like your finished data to look
> like in a spreadsheet (google docs for instance, since that's easy to
> link to) and reference there.
>
> just having a list of headers doesn't say much about the data format you
> want.
>
> "client wants csv" hmm..they want "csv" or they want "csv that fits
> this very particular description that fits our special decoder or the
> like" ?
>
> do you know how the client will use this data. could that info be used
> to simplify the output to some degree?
>
> and finally..
> the client gives you malformed xml??
> I'm very sorry to hear that. also does the client know they are
> emitting invalid xml?
>     is it really xml?
>     is it valid in another language? (html is much more lenient for
> instance, an html parser might be able to glean more meaning)
>     by what definition is it malformed? is it outright structurally
> broken, does it fail to meet some schema?
>                       does it fail to meet some expectation you have
> for some reason ("client said it has these properties")
>
> you also mentioned you use JSON because it maps nicely to python
> dicts.. this is true ofc.. but why not just read that in to a python
> dict in the first place?
>
> > DB1 : 0, DB2: 0, DB3: 0 etc. and F1: 0, F1: 0. DB1, DB2 would be the
> > headers and the 0s as values in the CSV file.
>
> DB1 etc seems ok at first glance however...
> say there are 2 nested items and each of them have a DB property which
> is an array, you will have name collisions.
> you need more thought in to naming the headers at the very least.
>
> if this is meant for a spreadsheet.. then you will end up with 2 very
> very very long rows, it will NOT be readable by any stretch of the
> imagination.. do you want this?
>
> i'm afraid you'll essentially end up with a translation that looks
> something like
>
> {A:{b:"c",q:"w"}}
> ===============
> "A.b", "A.q"
> "c", "w"
>
> if you just want key-value pairs there are better options out
> there..besides csv..
>