[issue35195] Pandas read_csv() is 3.5X Slower on Python 3.7.1 vs Python 3.6.7 & 3.5.2 On Windows 10

Dragoljub report at bugs.python.org
Thu Nov 8 17:56:21 EST 2018


New submission from Dragoljub <dragoljub at gmail.com>:

xref: https://github.com/pandas-dev/pandas/issues/23516

Example:
import io
import pandas as pd
import numpy as np
# Build a 1,000,000 x 10 frame of random floats and serialize it to an in-memory CSV.
df = pd.DataFrame(np.random.randn(1000000, 10), columns=['COL{}'.format(i) for i in range(10)])
csv = io.StringIO(df.to_csv(index=False))
df2 = pd.read_csv(csv)  # 3.5X slower on Python 3.7.1

pd.read_csv() reads data at 30 MB/sec on Python 3.7.1, compared with 100 MB/sec on Python 3.6.7.

This issue seems to be present only on Windows 10 builds, both x86 and x64.

Possibly some IO changes in Python 3.7 contributed to this slowdown on Windows but not on Linux?
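
For anyone wanting to compare interpreters, a timing sketch along the following lines might reproduce the MB/sec figures. The helper name measure_throughput, the repeat count, and the reduced default row count are assumptions made for illustration; they are not part of the original report, and the numbers will vary by machine.

```python
import io
import sys
import time


def measure_throughput(parse, text, repeats=3):
    """Return the best observed rate of parse(io.StringIO(text)) in MB/sec."""
    size_mb = len(text.encode("utf-8")) / 1e6
    best = float("inf")
    for _ in range(repeats):
        buf = io.StringIO(text)  # fresh buffer so every run starts at offset 0
        start = time.perf_counter()
        parse(buf)
        best = min(best, time.perf_counter() - start)
    return size_mb / best


def main(rows=100000):
    # Imported here so the helper above stays usable without pandas installed.
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.randn(rows, 10),
                      columns=['COL{}'.format(i) for i in range(10)])
    text = df.to_csv(index=False)
    rate = measure_throughput(pd.read_csv, text)
    print('Python {}: {:.1f} MB/sec'.format(sys.version.split()[0], rate))


if __name__ == '__main__':
    main()
```

Running the same script under each interpreter (for example 3.6.7 and 3.7.1) on the same machine gives directly comparable rates; taking the best of several runs reduces noise from other processes.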

----------
components: IO
messages: 329490
nosy: Dragoljub
priority: normal
severity: normal
status: open
title: Pandas read_csv() is 3.5X Slower on Python 3.7.1 vs Python 3.6.7 & 3.5.2 On Windows 10
type: performance
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35195>
_______________________________________

