[Tutor] Python - help with something most essential
Abdur-Rahmaan Janhangeer
arj.python at gmail.com
Mon Jun 12 04:21:24 EDT 2017
I might add that using
    with open( . . .
instead of
    foo = open( . . .
also shows some maturity in Python, since the file is closed automatically
when the block ends.
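
For instance, the generator in after.py quoted below could keep reading line by line and still close the file without waiting for garbage collection. This is only a minimal sketch, assuming Python 3; read_lines is an illustrative name of mine, not something from the original scripts:

```python
def read_lines(filename):
    # The with block closes the file once the loop finishes or the
    # generator itself is closed, so nothing is left to garbage collection.
    with open(filename) as f:
        for line in f:
            yield line.strip()
```

The caller can loop over read_lines(filename) exactly as countAnagrams() loops over readFile(filename).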
Abdur-Rahmaan Janhangeer,
Mauritius
abdurrahmaanjanhangeer.wordpress.com
On 11 Jun 2017 12:33, "Peter Otten" <__peter__ at web.de> wrote:
> Japhy Bartlett wrote:
>
> > I'm not sure that they cared about how you used file.readlines(), I think
> > the memory comment was a hint about instantiating Counter()s
>
> Then they would have been clueless ;)
>
> Both Schtvveer's original script and his subsequent "Verschlimmbesserung"
> -- a beautiful German word for making things worse when trying to improve
> them -- use only two Counters at any given time. The second version is very
> inefficient because it builds the same Counter over and over again -- but
> this does not affect peak memory usage much.
>
> Here's the original version that triggered the comment:
>
> [Schtvveer Schvrveve]
>
> > import sys
> > from collections import Counter
> >
> > def main(args):
> >     filename = args[1]
> >     word = args[2]
> >     print countAnagrams(word, filename)
> >
> > def countAnagrams(word, filename):
> >
> >     fileContent = readFile(filename)
> >
> >     counter = Counter(word)
> >     num_of_anagrams = 0
> >
> >     for i in range(0, len(fileContent)):
> >         if counter == Counter(fileContent[i]):
> >             num_of_anagrams += 1
> >
> >     return num_of_anagrams
> >
> > def readFile(filename):
> >
> >     with open(filename) as f:
> >         content = f.readlines()
> >
> >     content = [x.strip() for x in content]
> >
> >     return content
> >
> > if __name__ == '__main__':
> >     main(sys.argv)
> >
>
> referenced as before.py below, and here's a variant that removes
> readlines(), range(), and the [x.strip() for x in content] list
> comprehension, the goal being minimal changes, not code as I would write it
> from scratch.
>
> # after.py
> import sys
> from collections import Counter
>
> def main(args):
>     filename = args[1]
>     word = args[2]
>     print countAnagrams(word, filename)
>
> def countAnagrams(word, filename):
>
>     fileContent = readFile(filename)
>     counter = Counter(word)
>     num_of_anagrams = 0
>
>     for line in fileContent:
>         if counter == Counter(line):
>             num_of_anagrams += 1
>
>     return num_of_anagrams
>
> def readFile(filename):
>     # this relies on garbage collection to close the file
>     # which should normally be avoided
>     for line in open(filename):
>         yield line.strip()
>
> if __name__ == '__main__':
>     main(sys.argv)
>
> How to measure memory usage? I found
> <https://stackoverflow.com/questions/774556/peak-memory-usage-of-a-linux-unix-process>
> and as test data I use files containing 10**5 and 10**6 integers. With that
> setup (snipping everything but memory usage from the time -v output):
>
> $ /usr/bin/time -v python before.py anagrams5.txt 123
> 6
> Maximum resident set size (kbytes): 17340
> $ /usr/bin/time -v python before.py anagrams6.txt 123
> 6
> Maximum resident set size (kbytes): 117328
>
>
> $ /usr/bin/time -v python after.py anagrams5.txt 123
> 6
> Maximum resident set size (kbytes): 6432
> $ /usr/bin/time -v python after.py anagrams6.txt 123
> 6
> Maximum resident set size (kbytes): 6432
>
> See the pattern? before.py uses O(N) memory, after.py O(1).
>
> Run your own tests if you need more datapoints or prefer a different method
> to measure memory consumption.
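
One such different method, as a rough sketch only: the standard library's tracemalloc module can report the peak of Python-level allocations from inside the process. It assumes Python 3, and the names count_eager, count_lazy and peak_bytes are mine, standing in for the before.py and after.py approaches. The test file of consecutive integers is also an assumption about what the data looked like. The byte counts it prints are not comparable to the resident set size that time -v reports, because tracemalloc only sees memory allocated by Python itself.

```python
import os
import tempfile
import tracemalloc
from collections import Counter


def count_eager(word, filename):
    # Like before.py: the whole file is held in a list before counting.
    with open(filename) as f:
        lines = [line.strip() for line in f.readlines()]
    target = Counter(word)
    return sum(1 for line in lines if Counter(line) == target)


def count_lazy(word, filename):
    # Like after.py: only one line is alive at any given time.
    target = Counter(word)
    with open(filename) as f:
        return sum(1 for line in f if Counter(line.strip()) == target)


def peak_bytes(func, *args):
    # Return (result, peak traced allocation in bytes) for one call.
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Assumed test data: the integers 0 .. 10**5 - 1, one per line.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("\n".join(str(i) for i in range(10**5)))
        path = tmp.name
    try:
        for func in (count_eager, count_lazy):
            count, peak = peak_bytes(func, "123", path)
            print(func.__name__, count, peak)
    finally:
        os.remove(path)
```

If the pattern above holds, the peak for count_eager should grow with the number of lines while the peak for count_lazy stays roughly flat.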