converting strings to their most efficient types: '1' --> 1, 'A' --> 'A', '1.2' --> 1.2
James Stroud
jstroud at mbi.ucla.edu
Sat May 19 07:17:22 EDT 2007
John Machin wrote:
> The approach that I've adopted is to test the values in a column for all
> types, and choose the non-text type that has the highest success rate
> (provided the rate is greater than some threshold e.g. 90%, otherwise
> it's text).
>
> For large files, taking a 1/N sample can save a lot of time with little
> chance of misdiagnosis.
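
The quoted approach might be sketched roughly as below. This is only an illustration, not John Machin's actual code: the names try_parse, guess_column_type and sample_every are hypothetical, the candidate types are limited to int and float, and every value is assumed to be a string.

```python
def try_parse(value, caster):
    """Return True if caster (e.g. int or float) accepts the string."""
    try:
        caster(value)
        return True
    except ValueError:
        return False


def guess_column_type(values, threshold=0.9, sample_every=1):
    """Guess a column's type from a 1/N sample of its values.

    sample_every is the N in "1/N sample"; threshold is the minimum
    success rate a non-text type needs, otherwise the column is text.
    """
    sample = values[::sample_every]
    if not sample:
        return str
    best_type, best_rate = str, 0.0
    # int is tried before float, so a tie goes to the narrower type.
    for caster in (int, float):
        rate = sum(try_parse(v, caster) for v in sample) / len(sample)
        if rate > best_rate:
            best_type, best_rate = caster, rate
    return best_type if best_rate > threshold else str
```

Calling guess_column_type(column, sample_every=10) would test every tenth value only. A strided slice is the simplest way to take such a sample; a random sample may be safer if the file is sorted or has periodic structure.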
Why stop there? You could lower the minimum sampling fraction 1/N
further by a straightforward application of Bayesian statistics, using
the results from previous tables as priors.
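
One possible reading of that suggestion, as a minimal sketch: treat a column's success rate for a given type as Beta-distributed, fold the hit and miss counts from earlier tables in as prior pseudo-counts, and stop sampling once the posterior probability that the rate exceeds the threshold is decisive. The function names, the uniform Beta(1, 1) starting point and the use of raw counts as the prior are all assumptions made for illustration; the post does not specify any of them.

```python
import math


def beta_tail(a, b, threshold, steps=10000):
    """Approximate P(rate > threshold) for rate ~ Beta(a, b).

    Uses midpoint integration of the Beta density on (threshold, 1);
    assumes a >= 1, b >= 1 and 0 <= threshold < 1.
    """
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    width = (1.0 - threshold) / steps
    total = 0.0
    for i in range(steps):
        x = threshold + (i + 0.5) * width
        log_pdf = (log_norm + (a - 1) * math.log(x)
                   + (b - 1) * math.log(1.0 - x))
        total += math.exp(log_pdf)
    return total * width


def prob_rate_above(prior_hits, prior_misses, hits, misses, threshold=0.9):
    """Posterior probability that the true success rate beats threshold.

    prior_hits / prior_misses are hypothetical pseudo-counts carried
    over from previous tables; hits / misses come from the new sample.
    """
    a = 1 + prior_hits + hits
    b = 1 + prior_misses + misses
    return beta_tail(a, b, threshold)
```

With no prior, ten successful conversions out of ten leave the posterior fairly uncertain about a 90% threshold. The same ten on top of a strong prior from earlier tables give a much higher probability, which is what would justify a smaller sample.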
James