perl to python

Roy Smith roy at panix.com
Wed May 12 08:11:43 EDT 2004


Kirk Job-Sluder <kirk at eyegor.jobsluder.net> wrote:
> And here is the fundamental question.  Why should I spend my time
> writing a module in python to emulate another tool, when I can simply
> use that other tool?  Why should I, as a researcher who must process
> large quantities of data, spend my time and my employer's money
> reinventing the wheel? 

At the risk of veering this thread in yet another direction, 
anybody who does analysis of large amounts of data should take a look at 
Gary Perlman's excellent, free, and generally under-appreciated |STAT 
package.

http://www.acm.org/~perlman/stat/

It's been around in one version or another for something like 20 years.  
It fills an interesting little niche that's part data manipulation and 
part statistics.

> Here is the solution in awk:
> BEGIN { FS="\t" } 
> {printf("%s %s %s %s\n", $4, $3, $2, $1)}

In |STAT, that would be simply "colex 4 3 2 1".
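
(Since the thread is nominally about moving from perl to python: the 
same column shuffle in Python is only a few lines.  A quick sketch, 
assuming tab-separated input with at least four fields per line on 
stdin:

    import sys

    # print fields 4, 3, 2, 1 of each tab-separated line
    for line in sys.stdin:
        f = line.rstrip("\n").split("\t")
        sys.stdout.write(" ".join([f[3], f[2], f[1], f[0]]) + "\n")

But that's still more typing than the colex one-liner.)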

There's nothing you can do in |STAT that you couldn't do with more 
general-purpose tools like awk, perl, or python, but |STAT often has a 
quicker, simpler way to do many common statistical tasks.  It's a good 
tool to have in your toolbox.

One of the cool tools is "validata".  You feed it a file and it 
applies some heuristics to guess which data in it might be invalid.  
For instance, if a file looks like columns of numbers, and the third 
column is all integers except for one entry which is a floating point 
number, it'll guess that entry might be an error and flag it.  
It's great when you're analyzing 5000 log files of 100,000 lines each 
and one of them makes your script crash for no apparent reason.
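
If you wanted a poor man's version of that check in Python, the 
heuristic itself is simple: guess a type for every value, find the 
majority type per column, and flag the disagreements.  A toy sketch of 
the idea (my own code, not Perlman's implementation), assuming 
whitespace-separated columns on stdin:

    import sys

    def classify(s):
        # crude type guess: int, float, or word
        try:
            int(s)
            return "int"
        except ValueError:
            pass
        try:
            float(s)
            return "float"
        except ValueError:
            return "word"

    def flag_oddballs(lines):
        rows = [line.split() for line in lines]
        if not rows:
            return
        ncols = min([len(r) for r in rows])
        for col in range(ncols):
            types = [classify(r[col]) for r in rows]
            majority = max(set(types), key=types.count)
            # if nearly every value agrees, the stragglers are suspect
            if types.count(majority) >= 0.9 * len(types):
                for i, t in enumerate(types):
                    if t != majority:
                        print("line %d, col %d: %r is a %s in a column of %ss"
                              % (i + 1, col + 1, rows[i][col], t, majority))

    flag_oddballs([line for line in sys.stdin if line.strip()])

validata does considerably more than this, but that's the flavor.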


