[Numpy-discussion] variable number of columns in loadtxt/genfromtxt

Paul Hobson pmhobson at gmail.com
Tue Sep 25 20:02:50 EDT 2012


On Tue, Sep 25, 2012 at 9:35 AM, Andreas Hilboll <lists at hilboll.de> wrote:
>> On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll <lists at hilboll.de> wrote:
>>> I commonly have to deal with legacy ASCII files, which don't have a
>>> constant number of columns. The standard is 10 values per row, but
>>> sometimes there are fewer columns. loadtxt doesn't support this, and in
>>> genfromtxt, the rows which have fewer than 10 values are excluded from
>>> the resulting array.
>>>
>>> Is there any way around this?
>>
>> the trick is: what does it mean when there are fewer values in a row?
>> There is no way to universally define that.
>>
>> Anyway, I'd just punt on using a standard ASCII file reader. In the
>> time it took to write this question, you'd be halfway to writing a
>> custom file parser. It's really easy in Python, at least if you
>> don't need absolutely top performance (which loadtxt and genfromtxt
>> don't give you anyway).
>
> Actually, that's just what I did before writing this question ;) I was
> just wondering if there were some solution available which I didn't know
> about.
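
For anyone reading along, the custom-parser route discussed above can be
quite short. This is only a minimal sketch, not the parser Andreas wrote: it
assumes whitespace-separated numeric values, a target width of 10 columns,
and that missing trailing values should become NaN (that padding rule is an
assumption, since the files themselves don't define what a short row means).
The name parse_ragged is hypothetical.

```python
import math


def parse_ragged(lines, ncols=10, fill=math.nan):
    """Parse whitespace-separated numeric rows, padding short rows with `fill`.

    Assumption: values missing from a short row are the trailing ones.
    """
    rows = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) > ncols:
            raise ValueError(
                f"line {lineno}: expected at most {ncols} values, "
                f"got {len(fields)}"
            )
        values = [float(field) for field in fields]
        # Pad on the right so every row ends up with exactly `ncols` entries.
        values.extend([fill] * (ncols - len(values)))
        rows.append(values)
    return rows


# Made-up sample data: one full row, two short rows, one blank line.
sample = ["1 2 3 4 5 6 7 8 9 10", "1 2 3", "", "4 5 6 7"]
table = parse_ragged(sample)
```

Because every row now has the same length, the result can be handed to
numpy.array(table) to get a regular 2-D float array. For a real file, pass
the open file object in place of the sample list.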

This may or may not be relevant, but pandas does a pretty good job of
handling this sort of thing...
http://nbviewer.maxdrawdown.com/3785198

Notebook Viewer hasn't quite caught up with the dev version of
ipython. I've attached a screen shot too.
-paul
-------------- next part --------------
A non-text attachment was scrubbed...
Name: variable_cols.png
Type: image/png
Size: 44286 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120925/735e2cd1/attachment.png>

