[Tutor] Most efficient way to replace ", " with "." in a array and/or dataframe

Albert-Jan Roskam sjeik_appie at hotmail.com
Sun Sep 22 03:39:48 EDT 2019



On 22 Sep 2019 04:27, Cameron Simpson <cs at cskk.id.au> wrote:

On 21Sep2019 20:42, Markos <markos at c2o.pro.br> wrote:
>I have a table.csv file with the following structure:
>
>, Polyarene conc ,, mg L-1 ,,,,,,,
>Spectrum, Py, Ace, Anth,
>1, "0,456", "0,120", "0,168"
>2, "0,456", "0,040", "0,280"
>3, "0,152", "0,200", "0,280"
>
>I open as dataframe with the command:
>data = pd.read_csv ('table.csv', sep = ',', skiprows = 1)
[...]
>And the data_array variable gets the fields in string format:
>[['0,456' '0,120' '0,168']
[...]

>Please see the documentation for the read_csv function here:
>
> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html?highlight=read_csv#pandas.read_csv

Do you think it's a deliberate design choice that 'decimal' and 'thousands' were used here as params, and not a 'locale' param? It seems nice to be able to specify e.g. locale='dutch' and then have all the right lc_numeric, lc_monetary, lc_time settings used. Or even locale='nl_NL.1252', and then you also wouldn't need 'encoding' as a separate param. Or might that be bad on Windows, where there's no locale-gen? Just wondering...
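
For comparison, here is a rough sketch of the two approaches. The sample text is a simplified copy of the table quoted above (no spaces after the separators), and the helper names read_with_decimal and parse_with_locale are made up for illustration. The first uses the existing 'decimal' param of read_csv; the second uses the standard library's locale module, which only works if the named locale is installed on the machine.

```python
import io
import locale

import pandas as pd

# Simplified copy of the quoted table: decimal commas inside quoted fields.
CSV_TEXT = '''Spectrum,Py,Ace,Anth
1,"0,456","0,120","0,168"
2,"0,456","0,040","0,280"
3,"0,152","0,200","0,280"
'''


def read_with_decimal(text):
    """Parse the CSV text, treating ',' as the decimal mark (hypothetical helper)."""
    return pd.read_csv(io.StringIO(text), sep=",", decimal=",")


def parse_with_locale(value, locale_name):
    """Parse one number string under a named locale (hypothetical helper).

    setlocale() changes process-wide state, so the previous LC_NUMERIC
    setting is saved and restored. Raises locale.Error if locale_name
    is not available on this system.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.atof(value)
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


if __name__ == "__main__":
    data = read_with_decimal(CSV_TEXT)
    print(data.dtypes)  # the concentration columns should come out as floats
```

A call such as parse_with_locale("0,456", "nl_NL.UTF-8") depends on that locale name existing on the host, and locale names differ between platforms, whereas decimal=',' behaves the same everywhere. Whether that is the reason for the read_csv design I can't say.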

Albert-Jan


