Large File Parsing
Tim Roberts
timr at probo.com
Tue Jun 17 00:49:39 EDT 2003
Robert S Shaffer <r.shaffer9 at verizon.net> wrote:
>
>I have up to a 3 million record file to parse, remove duplicates and
>sort by size then numeric value. Is this the best way to do this in
>Python?
In my opinion, no; the best way would be to use a simple chain of
command-line filters:
cut -f 1 -d , inputfile | sort -n | uniq > outputfile
There is no need to reinvent the wheel when perfectly good solutions exist.
Even if you are using Windows, you can download either Cygwin or
UnxUtils, each of which provides all of these tools.
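
For comparison, here is a rough Python sketch of the same three steps as the pipeline above: take the first comma-separated field, drop duplicates, and sort. The function name is made up for illustration. It assumes every record starts with a non-negative integer and that the distinct values fit in memory; skipping blank lines is also my assumption, not something the pipeline does.

```python
def unique_sorted_first_fields(lines):
    """Return the distinct first comma-separated fields, sorted.

    Hypothetical helper: assumes each non-blank line begins with a
    non-negative integer.
    """
    # Keep only the text before the first comma (like `cut -f 1 -d ,`).
    # A set discards duplicates (like `uniq` on sorted input).
    fields = {
        line.rstrip("\n").split(",", 1)[0]
        for line in lines
        if line.strip()  # assumption: blank lines are ignored
    }
    # Sort by size (string length), then by numeric value.
    return sorted(fields, key=lambda s: (len(s), int(s)))


sample = ["10,a\n", "9,b\n", "10,c\n", "100,d\n"]
print(unique_sorted_first_fields(sample))  # ['9', '10', '100']
```

To run it on a real file, pass an open file object instead of the sample list and write each returned value on its own line.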
--
- Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.