very high-level IO functions?
York
yorklee70 at gmail.com
Wed Sep 21 17:56:32 EDT 2005
Thank you, Tom.
-York
Tom Anderson wrote:
> On Mon, 19 Sep 2005, Bruno Desthuilliers wrote:
>
>> York wrote:
>> (snip)
>>
>>> I love python. However, as a biologist, I like some of the high-level
>>> functions in R. I don't want to spend my time parsing a data file.
>>
>>
>> http://www.python.org/doc/current/lib/module-csv.html
>>
>>> Then in my python script, I call R to read the data file and write it
>>> into a MySQL table. If python can do this easily, I don't need R at
>>> all.
>>
>>
>> So you don't need R at all.
>
>
> Did you even read the OP's post? Specifically, this bit:
>
> The R language has very high-level IO functions; its read.table can read
> an entire .csv file and recognize the types of each column.
>                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Python's csv module gives you lists of strings; it makes no effort to
> recognise the types of the data. AFAIK, python doesn't have any IO
> facilities like this.
>
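To make the point about the csv module concrete, here is a minimal sketch showing that csv.reader hands back every cell as a string, even the ones that look like numbers. The sample data is made up for illustration and stands in for a real file.

```python
import csv
import io

# Hypothetical sample data, standing in for a real .csv file.
sample = io.StringIO("gene,length,expressed\nabc1,1200,true\n")

rows = list(csv.reader(sample))

# Collect the distinct Python types of all cells; only str shows up,
# because csv.reader does no type conversion of its own.
cell_types = {type(cell) for row in rows for cell in row}
```

So "1200" stays the string "1200" until something converts it, which is the gap the rest of the quoted post fills.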
> Larry's point that automagical type detection is risky because it can
> make mistakes is a good one, but that doesn't mean that magic is useless
> - on the contrary, for the majority of cases, it works fine, and is
> extremely convenient.
>
> The good news is that it's reasonably easy to write such a function: you
> just need a function 'type_convert' which takes a string and returns an
> object of the right type; then you can do:
>
>     import csv
>
>     def read_table(f):
>         for row in csv.reader(f):
>             # one converted list per row of the file
>             yield [type_convert(cell) for cell in row]
>
> This is a very, very rough cut - it doesn't strip comments, skip blank
> lines, handle a header line or support different separators, etc, but
> all that's pretty easy to add.
> Also, note that this returns an iterator rather than a list; use
> list(read_table(f)) if you want an actual list, or change the
> implementation of the function.
>
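The extras listed in the paragraph above could be added roughly like this. It is only a sketch under some assumptions: the parameter names delimiter, header and comment are invented here, comments are only recognised when they take up a whole line, and filtering lines before parsing means quoted fields containing newlines are not supported. type_convert follows the same idea as the one in the quoted post and is repeated so the example runs on its own.

```python
import csv
import io


def _bool(s):
    s = s.lower()
    if s == "true":
        return True
    if s == "false":
        return False
    raise ValueError(s)


def type_convert(s):
    # Try each converter in turn; str accepts anything, so it is the fallback.
    for convert in (int, float, complex, _bool, str):
        try:
            return convert(s)
        except ValueError:
            pass
    raise ValueError(s)


def read_table(f, delimiter=",", header=False, comment="#"):
    """Yield converted rows, skipping blank lines and whole-line comments.

    With header=True the first remaining line supplies column names and
    each row is yielded as a dict; otherwise rows are plain lists.
    """
    # csv.reader accepts any iterable of lines, so filter before parsing.
    lines = (line for line in f
             if line.strip() and not line.lstrip().startswith(comment))
    reader = csv.reader(lines, delimiter=delimiter)
    names = next(reader, None) if header else None
    for row in reader:
        values = [type_convert(cell) for cell in row]
        if header:
            yield dict(zip(names, values))
        else:
            yield values


# Hypothetical tab-separated input with a comment, a header and a blank line.
data = io.StringIO("# sample\nname\tcount\tratio\n\nabc\t3\t0.5\n")
rows = list(read_table(data, delimiter="\t", header=True))
```

It is still a generator, so list(...) is needed to get an actual list, as noted in the quoted text.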
> type_convert is itself fairly simple:
>
>     def _bool(s):  # helper function for booleans
>         s = s.lower()
>         if s == "true":
>             return True
>         elif s == "false":
>             return False
>         else:
>             raise ValueError(s)
>
>     # Tried in order; str accepts anything, so it acts as the fallback.
>     types = (int, float, complex, _bool, str)
>
>     def type_convert(s):
>         for convert in types:
>             try:
>                 return convert(s)
>             except ValueError:
>                 pass
>         raise ValueError(s)
>
> This whole thing isn't quite as sophisticated as R's type.convert as
> used by read.table; R reads the whole table in, then tries to find a
> type for each column which will fit all the values in that column,
> whereas I do each cell individually. Again, it wouldn't be too hard to
> do this the other way round.
>
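Doing it "the other way round", one type per column, might look something like the sketch below. This is not how R implements it, just the same idea in Python: read everything, then for each column pick the first converter that accepts every value. The names column_converter and read_table_by_column are made up for this example, and it assumes all rows have the same number of fields and that there are no blank lines.

```python
import csv
import io


def _bool(s):
    s = s.lower()
    if s == "true":
        return True
    if s == "false":
        return False
    raise ValueError(s)


# Tried in order; str accepts anything, so every column gets some type.
CONVERTERS = (int, float, complex, _bool, str)


def column_converter(values):
    """Return the first converter that accepts every value in the column."""
    for convert in CONVERTERS:
        try:
            for value in values:
                convert(value)
        except ValueError:
            continue  # this converter rejected a value; try the next one
        return convert
    return str


def read_table_by_column(f):
    rows = list(csv.reader(f))   # read the whole table in first
    columns = list(zip(*rows))   # transpose rows into columns
    converters = [column_converter(col) for col in columns]
    return [[conv(cell) for conv, cell in zip(converters, row)]
            for row in rows]


# Hypothetical data: the second column mixes 1 and 2.5, the third 2.5 and x.
sample = io.StringIO("1,1,2.5,true\n2,2.5,x,false\n")
table = read_table_by_column(sample)
```

The difference from the per-cell version shows up in mixed columns: here the second column becomes all floats and the third stays all strings, instead of each cell getting its own type.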
> Anyway, hope this helps. Bear in mind that there are python bindings for
> the R engine, so you could just use R's version of read.table in python.
>
> tom
>