very high-level IO functions?

York yorklee70 at gmail.com
Wed Sep 21 17:56:32 EDT 2005


Thank you, Tom.


-York


Tom Anderson wrote:
> On Mon, 19 Sep 2005, Bruno Desthuilliers wrote:
> 
>> York wrote:
>> (snip)
>>
>>> I love python. However, as a biologist, I like some of the high-level 
>>> functions in R. I don't want to spend my time parsing a data file.
>>
>>
>> http://www.python.org/doc/current/lib/module-csv.html
>>
>>> Then in my python script, I call R to read the data file and write it 
>>> into a MySQL table. If python can do this easily, I don't need R at 
>>> all.
>>
>>
>> So you don't need R at all.
> 
> 
> Did you even read the OP's post? Specifically, this bit:
> 
> R language has very high-level IO functions, its read.table can read a 
> whole .csv file and recognize the types of each column.
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Python's csv module gives you lists of strings; it makes no effort to 
> recognise the types of the data. AFAIK, Python doesn't have any IO 
> facilities like this.
> 
> Larry's point that automagical type detection is risky because it can 
> make mistakes is a good one, but that doesn't mean that magic is useless 
> - on the contrary, for the majority of cases, it works fine, and is 
> extremely convenient.
> 
> The good news is that it's reasonably easy to write such a function: you 
> just need a function 'type_convert' which takes a string and returns an 
> object of the right type; then you can do:
> 
> import csv
> 
> def read_table(f):
>     # f is an open text file (or any iterable of lines)
>     for row in csv.reader(f):
>         yield [type_convert(cell) for cell in row]
> 
> This is a very, very rough cut - it doesn't strip comments, skip blank 
> lines, or handle a header line or different separators, etc, but all 
> that's pretty easy to add. Also, note that this is a generator, so it 
> returns an iterator rather than a list; use list(read_table(f)) if you 
> want an actual list, or change the implementation of the function.
> 
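The missing pieces listed above (comments, blank lines, a header line, other separators) could be added roughly as follows. This is only a sketch under assumptions: the parameter names `delimiter`, `header` and `comment` are made up for illustration, comment lines are assumed to start with `#`, and filtering lines before parsing will misbehave on quoted fields that contain embedded newlines. The per-cell conversion is repeated in compact form so the example runs on its own.

```python
import csv
import io


def type_convert(s):
    """Return s as int, float, complex or bool if it parses, else as str."""
    for convert in (int, float, complex):
        try:
            return convert(s)
        except ValueError:
            pass
    lowered = s.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    return s


def read_table(f, delimiter=",", header=True, comment="#"):
    """Yield one dict per row (header=True) or one list per row (header=False)."""
    # Drop blank lines and comment lines before the csv module sees them.
    lines = (line for line in f
             if line.strip() and not line.lstrip().startswith(comment))
    reader = csv.reader(lines, delimiter=delimiter)
    # With header=True the first remaining line supplies the column names.
    names = next(reader, None) if header else None
    for row in reader:
        values = [type_convert(cell) for cell in row]
        yield dict(zip(names, values)) if names is not None else values


sample = io.StringIO(
    "# counts per gene\n"
    "name,count,ratio,ok\n"
    "\n"
    "abc,3,0.5,true\n"
    "xyz,7,1e-3,False\n"
)
for record in read_table(sample):
    print(record)
```

Keying each row by the header names is one possible design; yielding plain lists and returning the header separately would work just as well.
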
> type_convert is itself fairly simple:
> 
> def _bool(s): # helper function for booleans
>     s = s.lower()
>     if s == "true":
>         return True
>     elif s == "false":
>         return False
>     else:
>         raise ValueError(s)
> 
> # tried in order; str accepts any string, so it is the fallback
> types = (int, float, complex, _bool, str)
> 
> def type_convert(s):
>     for convert in types:
>         try:
>             return convert(s)
>         except ValueError:
>             pass
>     raise ValueError(s)  # only reachable if str is removed from types
> 
> This whole thing isn't quite as sophisticated as R's type.convert; R 
> reads the whole table in, then tries to find a type for each column 
> which will fit all the values in that column, whereas I do each cell 
> individually. Again, it wouldn't be too hard to do this the other way 
> round.
> 
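Doing it "the other way round", one type per column instead of one per cell, might look something like the sketch below. It is not a reimplementation of what R does: the names `CONVERTERS`, `convert_column` and `read_table_by_column` are invented for this example, the whole file is assumed to fit in memory, and every row is assumed to have the same number of fields.

```python
import csv
import io


def _bool(s):
    lowered = s.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(s)


# Tried in order; str accepts any string, so it is the fallback.
CONVERTERS = (int, float, complex, _bool, str)


def convert_column(cells):
    """Convert a column with the first converter that accepts every cell."""
    for convert in CONVERTERS:
        try:
            return [convert(cell) for cell in cells]
        except ValueError:
            continue
    return list(cells)  # not reached while str is in CONVERTERS


def read_table_by_column(f, delimiter=","):
    """Read all rows, then choose a single type for each column."""
    # csv.reader yields an empty list for a blank line; skip those.
    rows = [row for row in csv.reader(f, delimiter=delimiter) if row]
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("rows have differing numbers of fields")
    # Transpose to columns, convert each column, transpose back to rows.
    columns = [convert_column(column) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]


table = read_table_by_column(io.StringIO("1,2.5,true,a\n2,3,False,4\n"))
for row in table:
    print(row)
```

The visible difference from per-cell conversion is in mixed columns: the `3` in the second column comes back as a float because its neighbour is `2.5`, and the `4` in the last column stays a string because its neighbour is `a`.
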
> Anyway, hope this helps. Bear in mind that there are python bindings for 
> the R engine, so you could just use R's version of read.table in python.
> 
> tom
> 


