parsing tab-separated data efficiently into numpy/pylab arrays

per perfreem at gmail.com
Fri Mar 13 18:19:12 EDT 2009


hi all,

what's the most efficient / preferred python way of parsing
tab-separated data into arrays? for example, if i have a file containing
two columns, one corresponding to names and the other to numbers:

col1    \t     col 2
joe    \t  12.3
jane   \t 155.0

i'd like to parse this into an array such that i can do mydata[:, 0] and
mydata[:, 1] to easily access each column.

right now i can iterate through the file, parse each line manually using
the split('\t') method, build a list out of the results, and then convert
it to an array. but there must be a better way?
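
for concreteness, the manual approach described above might look roughly like the sketch below. the SAMPLE string and the parse_tsv name are made up for illustration, and it assumes exactly two columns with one header line. an object-dtype array is used so that strings and floats can sit in the same 2-d array.

```python
import io
import numpy as np

# Hypothetical sample mirroring the two-column file described above.
SAMPLE = "col1\tcol 2\njoe\t12.3\njane\t155.0\n"

def parse_tsv(handle):
    """Read tab-separated rows, skipping the header, into a 2-D object array."""
    next(handle)  # skip the header line
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        name, value = line.split("\t")
        rows.append((name.strip(), float(value)))
    # dtype=object lets strings and floats share one 2-D array
    return np.array(rows, dtype=object)

mydata = parse_tsv(io.StringIO(SAMPLE))
print(mydata[:, 0])  # all the names
print(mydata[:, 1])  # all the numbers
```

with a real file, open(path) would take the place of io.StringIO(SAMPLE).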

also, my first column is just a name, and so it is variable in length
-- is there still a way to store it as an array so i can access
mydata[:, 0] to get all the names (as a list)?
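
one possible direction, offered only as a sketch and not as the definitive answer: numpy's genfromtxt can read mixed string/number columns into a structured array, where the string width is inferred from the data. the field names "name" and "value" and the SAMPLE string are assumptions made for this example. note that columns are then accessed by field name rather than with mydata[:, 0].

```python
import io
import numpy as np

# Hypothetical sample mirroring the two-column file described above.
SAMPLE = "col1\tcol 2\njoe\t12.3\njane\t155.0\n"

data = np.genfromtxt(
    io.StringIO(SAMPLE),
    delimiter="\t",
    skip_header=1,            # skip the "col1 / col 2" header line
    names=("name", "value"),  # assumed field names, not taken from the file
    dtype=None,               # infer a type per column (string, float)
    encoding="utf-8",
    autostrip=True,           # drop whitespace around each field
)

names = list(data["name"])    # the name column as a plain list
values = data["value"]        # the numeric column as a float array
print(names)
print(values)
```

the result is a 1-d array of records, so data[0] is the first row and data["name"] is the whole first column.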

thank you.


