[Tutor] Most efficient way to replace ", " with "." in a array and/or dataframe

Cameron Simpson cs at cskk.id.au
Sat Sep 21 22:27:18 EDT 2019


On 21Sep2019 20:42, Markos <markos at c2o.pro.br> wrote:
>I have a table.csv file with the following structure:
>
>, Polyarene conc ,, mg L-1 ,,,,,,,
>Spectrum, Py, Ace, Anth,
>1, "0,456", "0,120", "0,168"
>2, "0,456", "0,040", "0,280"
>3, "0,152", "0,200", "0,280"
>
>I open as dataframe with the command:
>data = pd.read_csv ('table.csv', sep = ',', skiprows = 1)
[...]
>And the data_array variable gets the fields in string format:
>[['0,456' '0,120' '0,168']
[...]

Please see the documentation for the read_csv function here:

  https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html?highlight=read_csv#pandas.read_csv

In particular, because your values are formatted in the European style 
with "," as the decimal marker (and possibly "." as the thousands 
marker), you want to set the "decimal=" parameter of read_csv to ",".

This is better than trying to mangle the data yourself: just specify 
the dialect correctly (i.e. set decimal= in your call) and let read_csv 
do the conversion to numbers.
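
As a rough sketch of what I mean (untested against your real file): the 
inline text below is a simplified stand-in for your table.csv, with the 
first line and the stray spaces and trailing commas removed, so the 
csv_text name and the column selection are my assumptions, not your 
actual layout. With your real file you would keep your own filename and 
skiprows= setting and just add decimal=",".

```python
import io

import pandas as pd

# Hypothetical stand-in for table.csv: header row plus the three data rows,
# without the leading title line or the spaces after the separators.
csv_text = (
    'Spectrum,Py,Ace,Anth\n'
    '1,"0,456","0,120","0,168"\n'
    '2,"0,456","0,040","0,280"\n'
    '3,"0,152","0,200","0,280"\n'
)

# decimal="," tells read_csv that "0,456" means the number 0.456.
data = pd.read_csv(io.StringIO(csv_text), sep=",", decimal=",")

# The concentration columns should now be floats rather than strings.
data_array = data[["Py", "Ace", "Anth"]].to_numpy()

print(data.dtypes)
print(data_array)
```

If the file also uses "." as a thousands marker, read_csv has a 
"thousands=" parameter for that as well.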

Cheers,
Cameron Simpson <cs at cskk.id.au>


