Reading 'scientific' csv using Pandas?

Shakti Kumar shakti.shrivastava13 at gmail.com
Mon Nov 19 05:33:26 EST 2018


Hi Martin,

On Sun, 18 Nov 2018 at 23:59, Martin Schöön <martin.schoon at gmail.com> wrote:
>
> Den 2018-11-18 skrev Shakti Kumar <shakti.shrivastava13 at gmail.com>:
> > On Sun, 18 Nov 2018 at 18:18, Martin Schöön <martin.schoon at gmail.com> wrote:
> >>
> >> Now I hit a bump in the road when some of the data is not in plain
> >> decimal notation (xxx,xx) but in 'scientific' (xx,xxxe-xx) notation.
> >>
> >
> > Martin, I believe this should be done by pandas itself while reading
> > the csv file,
> > I took an example in scientific notation and checked this out,
> >
> > my sample.csv file is,
> > col1,col2
> > 1.1,0
> > 10.24e-05,1
> > 9.492e-10,2
> >
> That was a quick answer!
>
> My pandas is up to date.
>
> In your example you use the US convention of using "." for decimals
> and "," to separate data. This works perfect for me too.
>
> However, my data files use European conventions: decimal "," and TAB
> to separate data:
>
> col1    col2
> 1,1     0
> 10,24e-05       1
> 9,492e-10       2
>

A quick fix would be to replace all commas in your file with stops (.).
If the file also contains commas or stops that are not part of the
numbers, you can restrict the replacement to the columns you are
interested in.
Meanwhile I will look for a cleaner way of loading this csv in pandas;
I have never come across this comma notation before :)
Members of python-list at python.org, any better solution?
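
A minimal sketch of the per-column replacement described above, using the
sample from your message. The names SAMPLE and NUMERIC_COLUMNS are made up
for illustration, and I am assuming that only col1 holds decimal-comma
numbers; adjust the list to match your real files.

```python
import io

import pandas as pd

# Sample in the layout from Martin's message: TAB-separated, decimal comma.
SAMPLE = "col1\tcol2\n1,1\t0\n10,24e-05\t1\n9,492e-10\t2\n"

# Hypothetical choice: only col1 holds decimal-comma numbers.
NUMERIC_COLUMNS = ["col1"]

# Read every field as text so pandas does not try to interpret the numbers.
df = pd.read_csv(io.StringIO(SAMPLE), sep="\t", dtype=str)

for name in NUMERIC_COLUMNS:
    # Swap the decimal comma for a stop, then convert the text to floats.
    df[name] = df[name].str.replace(",", ".", regex=False).astype(float)

print(df["col1"].tolist())
```

Columns that are not listed (col2 here) stay as text, so any commas or
stops they contain are left alone.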

> I use
>
> EUData = pd.read_csv('file.csv', skiprows=1, sep='\t',
> decimal=',', engine='python')
>
> to read from such files. This works so so. 'Common floats' (3,1415 etc)
> works just fine but 'scientific' stuff (1,6023e23) does not work.
>
> /Martin
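
One thing that might be worth trying, and this is an assumption on my part
rather than something I have checked against your files: leave out
engine='python' so that pandas uses its default parser, and keep sep and
decimal as you have them. The sketch below runs on the small sample from
your message; SAMPLE is just an illustrative name, and I left out
skiprows=1 because the sample has only the header line.

```python
import io

import pandas as pd

# Sample in the layout from Martin's message: TAB-separated, decimal comma.
SAMPLE = "col1\tcol2\n1,1\t0\n10,24e-05\t1\n9,492e-10\t2\n"

# No engine= argument here, so pandas picks its default parser.
eu_data = pd.read_csv(io.StringIO(SAMPLE), sep="\t", decimal=",")

# If the exponent form was understood, col1 shows up as a float column.
print(eu_data.dtypes)
print(eu_data["col1"].tolist())
```

If col1 still comes back as object rather than float64, the per-column
replacement remains the fallback.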



-- 
Shakti.


