[Tutor] Most efficient way to read large csv files with properly converted mixed data types.
Alan Gauld
alan.gauld at yahoo.co.uk
Sat Jun 25 03:31:57 EDT 2016
On 25/06/16 08:04, Ek Esawi wrote:
> genfromtxt or (2) looping through each line in the file and split, strip,
> and assign data type to each entry.
>
> I am wondering if there is a better and more efficient alternative,
> especially to method 2 without using numpy or pandas.
The csv module will be more reliable, and probably faster, than
looping with split/strip. You will still need to convert the fields
from strings to native Python types, however. Depending on how you
currently do that, it may be possible to improve the process by
mapping a converter function to each column (similar to what genfromtxt does).
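A minimal sketch of that converter-mapping idea, using only the stdlib csv module. The column names and data here are hypothetical, and a real file would be opened with open(path, newline="") instead of io.StringIO:

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file.
sample = io.StringIO(
    "name,age,score\n"
    "alice,30,91.5\n"
    "bob,25,88.0\n"
)

# One converter per column, applied positionally -- similar in
# spirit to the converters argument of numpy's genfromtxt.
converters = [str, int, float]

reader = csv.reader(sample)
header = next(reader)  # skip the header row

rows = [
    [convert(field) for convert, field in zip(converters, row)]
    for row in reader
]

print(rows)  # [['alice', 30, 91.5], ['bob', 25, 88.0]]
```

Because the converters are just callables, you can swap in anything from int or float to a custom date parser without touching the loop itself.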
> Alan Gauld mentioned namedtuples for another question. I read a little
> about collections and in particular namedtuples
> but was not sure how to apply them here, if they
> are applicable to begin with.
named tuples provide an alternative to dictionaries for read-only
data and are more readable than standard tuples, but whether they
have a role to play in this particular case is not clear; we'd need
to know a lot more about how you plan to access the data once it's converted.
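For illustration, here is one way namedtuples could combine with the csv module. The field names and sample data are hypothetical; the point is that namedtuple fields can be taken straight from the header row:

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data standing in for a real CSV file.
sample = io.StringIO(
    "name,age,score\n"
    "alice,30,91.5\n"
    "bob,25,88.0\n"
)

reader = csv.reader(sample)
# Build the record type from the header row itself.
Record = namedtuple("Record", next(reader))

records = [Record(name, int(age), float(score))
           for name, age, score in reader]

print(records[0].score)  # 91.5 -- readable, attribute-style access
```

Each record is immutable, and records[0].score reads more clearly than records[0][2], which is the main attraction over plain tuples.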
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos