[SciPy-User] numpy I/O question
Matwey V. Kornilov
matwey.kornilov at gmail.com
Sun Jan 2 11:09:37 EST 2011
These files are pipe streams, but when they are dumped they are about 50M.
The replacement that you described requires O(N) (where N is the line length),
but C++ operator>> requires O(1) for the same parsing.
I was hoping there was a way to split data for numpy by regexp instead of by
delimiter,
i.e.
np.genfromtxt(StringIO(data), regexp=r"-?[\d\.]+")
instead of
np.genfromtxt(StringIO(data), delimiter=None)
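
Note that the regexp= keyword above is only a wished-for argument, not one
np.genfromtxt accepts. A rough sketch of the same idea using the standard re
module might look like the following. The sample string and the parse_line
helper are made up for illustration, and the pattern is the one from the call
above; it would also match malformed tokens such as "1.2.3", which float()
rejects, so real data may need a stricter pattern.

```python
import re
import numpy as np

# Hypothetical sample: numbers separated by inconsistent delimiters.
# The real stream format is not shown in this thread, so this is an assumption.
data = "1.5;2.25 -3|4\n5,6.0\t7 8.5\n"

# Pattern taken from the message above: optional minus sign, then digits/dots.
NUMBER = re.compile(r"-?[\d.]+")

def parse_line(line):
    """Return every numeric token found in one line as a float."""
    return [float(tok) for tok in NUMBER.findall(line)]

# Scan line by line, skipping blank lines; each line is read once.
rows = [parse_line(line) for line in data.splitlines() if line.strip()]

# Assumes every line yields the same number of values.
arr = np.array(rows)
print(arr.shape)
```

Each line is scanned once by the regexp, so no separate replacement pass over
the input is needed before the numbers are converted.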
Yury V. Zaytsev wrote:
> On Sun, 2011-01-02 at 18:51 +0300, Matwey V. Kornilov wrote:
>>
>> I will be asked 'why should we use Python, which can't even parse as well
>> as C++ does?' `sed` isn't a solution.
>
> How big are these files in question?
>
> Why can't you just load them in memory and do the replacement before
> feeding them into NumPy if you don't want to pre-process files
> beforehand? This is just 2-3 lines of code.
>