"Newbie" questions - "unique" sorting ?
John Fitzsimons
xpm4senn001 at sneakemail.com
Fri Jun 20 20:56:36 EDT 2003
On Fri, 20 Jun 2003 04:23:24 -0700, "Cousin Stanley"
<CousinStanley at hotmail.com> wrote:
Hi Cousin Stanley,
http://fastq.com/~sckitching/Python/word_list.zip
< snip >
>The version you have should run without any command-line arguments
>but requires that the input file be named word_source.txt ...
Okay, before trying the new file I re-tried the original. Since I
have some big files, I tried it on one of them. Here is what I did.
Started with:
    word_source.txt    3,116KB
Result was the addition of:
    word_dups.txt      3,116KB
    word_target.txt      882KB
Now what was I supposed to do? I thought I had to rename one of the
above as word_source.txt and then re-run to get rid of duplicate lines.
It didn't seem to work, so I have obviously done something wrong. Have
I got the steps correct? If so, do I rename the first or the second
file?
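For reference, the "unique sorting" being asked about can be sketched in a few lines. This is only a minimal sketch under assumptions, not Cousin Stanley's script: the function names and the path_in/path_out parameters are hypothetical, and it is written for current Python 3 rather than the interpreter of 2003.

```python
def unique_sorted_lines(lines):
    """Return the distinct lines in sorted order, ignoring line endings."""
    # Strip only the line terminator so leading/trailing spaces still count.
    stripped = (line.rstrip("\r\n") for line in lines)
    # A set drops duplicates; sorted() then orders what is left.
    return sorted(set(stripped))


def write_unique_sorted(path_in, path_out):
    """Read path_in, write its unique lines in sorted order to path_out.

    Both paths are hypothetical names supplied by the caller.
    Returns the number of lines written.
    """
    with open(path_in, encoding="utf-8", errors="replace") as src:
        result = unique_sorted_lines(src)
    with open(path_out, "w", encoding="utf-8") as dst:
        for line in result:
            dst.write(line + "\n")
    return len(result)
```

Something like write_unique_sorted("word_source.txt", "word_unique.txt") would do the whole job in one pass, with no renaming step in between. The output name here is made up for the example.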
< snip >
>I've up-loaded a second version that does require arguments
>for path_in and path_out but leaves the temporary dups file
>named as word_dups.txt ...
< snip >
I want to try that option too BUT want to get the first one working
before that. I will possibly prefer the second option so that I can
better remember the format.
Though, if I can work it out, I might someday split the last part
into another file like zapdups_sorted.py. At my age I don't find
remembering things overly easy, particularly DOS syntax and
step-by-step procedures.
The first step worked amazingly well. I thought it might choke on
such a large file. Great job! Thanks again. :-)
I wonder how the file size affects things? Is Python on Windows
limited by RAM? Or does it use the hard disk if there isn't sufficient
memory?
Could I get this to work on, say, a 50MB text file? Has anyone here
used Python on text files this large? If so, how did things go on a
Windows platform?
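On the memory question, one way to keep usage down on a large file is to read it a line at a time and remember only the distinct lines seen so far, instead of loading the whole file at once. The following is a rough sketch under assumptions, not a tested answer for any particular machine: iter_unique and dedupe_file are hypothetical names, the code targets current Python 3, and it keeps the first occurrence of each line in its original order rather than sorting.

```python
def iter_unique(lines):
    """Yield each distinct line once, in first-seen order."""
    seen = set()  # grows with the number of distinct lines, not the file size
    for line in lines:
        key = line.rstrip("\r\n")
        if key not in seen:
            seen.add(key)
            yield key


def dedupe_file(path_in, path_out):
    """Copy path_in to path_out, skipping repeated lines.

    Both paths are hypothetical names supplied by the caller.
    Returns the number of lines written.
    """
    count = 0
    with open(path_in, encoding="utf-8", errors="replace") as src, \
         open(path_out, "w", encoding="utf-8") as dst:
        # Iterating over the file object reads one line at a time.
        for line in iter_unique(src):
            dst.write(line + "\n")
            count += 1
    return count
```

With this approach the peak memory depends mostly on how many different lines the file contains. If the process does need more memory than there is physical RAM, the operating system can page to disk, which generally works but is slower.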
Regards, John.