[Tutor] Most efficient way to read large csv files with properly converted mixed data types.

Alan Gauld alan.gauld at yahoo.co.uk
Sat Jun 25 03:31:57 EDT 2016


On 25/06/16 08:04, Ek Esawi wrote:

> genfromtxt or (2) looping through each line in the file and split, strip,
> and assign data type to each entry.
> 
> I am wondering if there is a better and more efficient alternative,
> especially to method 2 without using numpy or pandas. 

The csv module will be more reliable, and probably faster, than looping
with split/strip yourself. You will still need to do the data conversions
from strings to native types, however. Depending on how you currently do
that, it may be possible to improve the process by using a mapping of
column to conversion function (similar to what genfromtxt does).
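
As a rough illustration, something along these lines could work. It is
only a sketch: the file name 'data.csv' and the column layout (an integer
id, a text name and a float price) are made up for the example.

import csv

# one conversion function per column, in column order
converters = [int, str, float]

rows = []
with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)          # skip the header row, if there is one
    for record in reader:
        # apply each converter to the matching field
        rows.append([conv(field) for conv, field in zip(converters, record)])

The csv module handles quoting and embedded delimiters for you, which is
where hand-rolled split/strip code usually goes wrong.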

> Alan Gauld mentioned namedtuples for another question. I read a little 
> about collections and in particular namedtuples
> but was not sure how to apply them here, if they
> are applicable to begin with.

Named tuples provide an alternative to dictionaries for read-only
data and are more readable than standard tuples, but whether they
have a role to play in this particular case is not clear; we'd need
to know a lot more about how you plan to access the data once it's converted.
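
For illustration only, a namedtuple version of the rows above might look
like this (the field names are again just placeholders):

from collections import namedtuple

Record = namedtuple('Record', ['id', 'name', 'price'])

row = Record(1, 'widget', 2.5)
print(row.name)     # fields can be accessed by name...
print(row[2])       # ...or by index, like an ordinary tuple

That readability gain only matters if your code accesses individual
fields by name; if you just pass whole rows around, plain tuples or
lists are fine.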

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
