From syd at plug.ca Thu Oct 4 16:31:44 2007
From: syd at plug.ca (Sydney Weidman)
Date: Thu, 04 Oct 2007 15:31:44 -0500
Subject: [Python Wpg] Prescribed way to run doctest on all modules in many packages
Message-ID: <1191529904.17262.5.camel@sweidman-laptop>

I was wondering if any doctest experts out there have a "clever" way to
run doctests in all modules in a given set of packages. Currently I'm
doing this:

"""
rundoctest.py: run all doctests in a directory
"""
import doctest
from glob import glob

def _test():
    for f in glob('*.py'):
        doctest.testfile(f, optionflags=doctest.ELLIPSIS)

if __name__ == '__main__':
    _test()

which is terse, but seems clumsy, since I have to have one of these
"rundoctest.py" things in every package.

Any suggestions? Thanks in advance!

- Syd

From high.res.mike at gmail.com Thu Oct 4 16:42:06 2007
From: high.res.mike at gmail.com (Mike Pfaiffer)
Date: Thu, 04 Oct 2007 15:42:06 -0500
Subject: [Python Wpg] Prescribed way to run doctest on all modules in many packages
In-Reply-To: <1191529904.17262.5.camel@sweidman-laptop>
References: <1191529904.17262.5.camel@sweidman-laptop>
Message-ID: <4705501E.8000901@gmail.com>

Sydney Weidman wrote:
> I was wondering if any doctest experts out there have a "clever" way to
> run doctests in all modules in a given set of packages. Currently I'm
> doing this:

I don't know if this info will help. Check out http://freshmeat.net to
find some testing packages. As I recall there is at least one which
shows up there semi-regularly. I can't recall the name (then again I'm
having trouble recalling much lately). I haven't looked at it in great
detail but I gather it will perform brute force checks of all parts of
a program.
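On the original question of running doctests across several packages
from one place: one possible approach, sketched here for current
Python 3 rather than the Python 2 of this thread, is to walk each
package with the standard library's pkgutil.walk_packages and hand
every module to doctest.testmod. The helper name run_package_doctests
is made up for illustration, and the sketch assumes every module can be
imported without side effects.

```python
import doctest
import importlib
import pkgutil


def run_package_doctests(package_names, optionflags=doctest.ELLIPSIS):
    """Run doctests in every importable module under the named packages.

    Returns (failed, attempted) totals across all modules.
    """
    failed = attempted = 0
    for package_name in package_names:
        package = importlib.import_module(package_name)
        modules = [package]
        # Plain modules have no __path__, so only real packages are walked.
        search_path = getattr(package, "__path__", None)
        if search_path is not None:
            prefix = package_name + "."
            for info in pkgutil.walk_packages(search_path, prefix):
                modules.append(importlib.import_module(info.name))
        for module in modules:
            # testmod collects examples from the docstrings of the module
            # and of the functions and classes defined in it.
            result = doctest.testmod(module, optionflags=optionflags)
            failed += result.failed
            attempted += result.attempted
    return failed, attempted
```

Calling run_package_doctests(['pkg_a', 'pkg_b']) from a single script
would stand in for a rundoctest.py in each package (the package names
here are placeholders). Note that it checks docstring examples via
testmod rather than treating each file as a text document the way
testfile does.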
Later
Mike

From syd at plug.ca Wed Oct 10 16:36:28 2007
From: syd at plug.ca (Sydney Weidman)
Date: Wed, 10 Oct 2007 15:36:28 -0500
Subject: [Python Wpg] PUG Meeting location schedule for 2007 - 2008
Message-ID: <1192048588.18335.14.camel@sweidman-laptop>

Here are the room assignments for PUG through March 2008. We'll have to
book for April, May and June after the spring session courses get
scheduled.

It's pretty simple: room 3M58. Third floor, Manitoba Hall, room 58.

Oct 24, 2007: 3M58
Nov 28, 2007: 3M58
Dec 26, 2007: Boxing Day. No meeting, U of W closed
Jan 23, 2008: 3M58
Feb 27, 2008: 3M58
Mar 26, 2008: 3M58

Room 3M58 is a smaller seminar room with rectangular (8 ft) tables and a
wall for a projection screen. Seats about 15 to 20. We can be relocated
if it looks like we're going to have a big crowd.

Regards,
Syd

From syd at plug.ca Tue Oct 16 12:31:43 2007
From: syd at plug.ca (Sydney Weidman)
Date: Tue, 16 Oct 2007 11:31:43 -0500
Subject: [Python Wpg] Testing for Python version
Message-ID: <1192552303.9533.13.camel@sweidman-laptop>

Hi, all!

I'm trying to create doctests that will work on multiple platforms, each
with different versions of Python. OSX includes version 2.3.0, which
doesn't have the doctest.ELLIPSIS constant available, so I need to skip
using that constant on OSX. It seems as if the idiom for testing Python
version is sys.version_info[:3] >= (2,4,0):

import sys
import doctest

def _test(verbose):
    if sys.version_info[:3] >= (2,4,0):
        doctest.testmod(verbose=verbose,optionflags=doctest.ELLIPSIS)
    else:
        doctest.testmod(verbose=verbose)

Does this seem like the "right" way to test python version?

I can't seem to find any definitive documentation about how to do this.
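The comparison above works because Python orders tuples element by
element. A minimal sketch of that behaviour, written for current
Python 3 rather than the 2.x interpreters discussed here; the names
version_at_least and choose_optionflags are hypothetical helpers, not
part of the standard library:

```python
import doctest
import sys


def version_at_least(required, current=None):
    """Return True if `current` (default: this interpreter) >= `required`.

    Tuples compare element by element, so (2, 3, 0) < (2, 4, 0) and
    (2, 10, 0) > (2, 4, 0), which a plain string comparison gets wrong.
    """
    if current is None:
        current = sys.version_info[:3]
    return tuple(current) >= tuple(required)


def choose_optionflags(current=None):
    """Pick doctest flags: ELLIPSIS only from version 2.4.0 onwards."""
    if version_at_least((2, 4, 0), current):
        return doctest.ELLIPSIS
    return 0
```

Passing an explicit `current` tuple, as in
choose_optionflags((2, 3, 0)), makes the branch for an older
interpreter easy to exercise without having one installed.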
- syd

From syd at plug.ca Tue Oct 16 12:58:18 2007
From: syd at plug.ca (Sydney Weidman)
Date: Tue, 16 Oct 2007 11:58:18 -0500
Subject: [Python Wpg] Testing for Python version
In-Reply-To: <2617889.2811192553748370.JavaMail.root@zimbra>
References: <2617889.2811192553748370.JavaMail.root@zimbra>
Message-ID: <1192553898.10308.0.camel@sweidman-laptop>

On Tue, 2007-16-10 at 12:55 -0400, Jason Hildebrand wrote:
> Hi Syd,
>
> Why not simply check for the feature(s) you need? I have seen lots of
> python code which takes this approach. For example:
>
> def _test(verbose):
>     if hasattr(doctest, 'ELLIPSIS'):
>         doctest.testmod(verbose=verbose,optionflags=doctest.ELLIPSIS)
>     else:
>         doctest.testmod(verbose=verbose)
>
> peace,
> Jason

A much better idea! Thanks!

- syd

From jason at peaceworks.ca Tue Oct 16 12:55:48 2007
From: jason at peaceworks.ca (Jason Hildebrand)
Date: Tue, 16 Oct 2007 12:55:48 -0400 (EDT)
Subject: [Python Wpg] Testing for Python version
In-Reply-To: <1192552303.9533.13.camel@sweidman-laptop>
Message-ID: <2617889.2811192553748370.JavaMail.root@zimbra>

Hi Syd,

Why not simply check for the feature(s) you need? I have seen lots of
python code which takes this approach. For example:

def _test(verbose):
    if hasattr(doctest, 'ELLIPSIS'):
        doctest.testmod(verbose=verbose,optionflags=doctest.ELLIPSIS)
    else:
        doctest.testmod(verbose=verbose)

peace,
Jason

--
Jason Hildebrand
PeaceWorks Computer Consulting
#2 - 396 Assiniboine Ave, Winnipeg
204 480 0314 --or-- 519 725 7875, ext 620.

----- "Sydney Weidman" wrote:
> Hi, all!
>
> I'm trying to create doctests that will work on multiple platforms,
> each with different versions of Python. OSX includes version 2.3.0,
> which doesn't have doctest.ELLIPSIS constant available, so I need to
> skip using that constant on OSX. It seems as if the idiom for testing
> Python version is sys.version_info[:3] >= (2,4,0):
>
> import sys
> import doctest
>
> def _test(verbose):
>     if sys.version_info[:3] >= (2,4,0):
>         doctest.testmod(verbose=verbose,optionflags=doctest.ELLIPSIS)
>     else:
>         doctest.testmod(verbose=verbose)
>
> Does this seem like the "right" way to test python version?
>
> I can't seem to find any definitive documentation about how to do
> this.
>
> - syd
>
> _______________________________________________
> Winnipeg mailing list
> Winnipeg at python.org
> http://mail.python.org/mailman/listinfo/winnipeg

From stuartw at mts.net Thu Oct 18 07:28:19 2007
From: stuartw at mts.net (Stuart Williams)
Date: Thu, 18 Oct 2007 06:28:19 -0500
Subject: [Python Wpg] Meeting next week
Message-ID: <18199.17235.744139.568879@gavel.swilliams.ca>

Our monthly meeting is next Wednesday. I remember the discussion about
agenda, I just can't remember the content or outcome. Anyone?

Stuart.

From high.res.mike at gmail.com Tue Oct 23 17:46:33 2007
From: high.res.mike at gmail.com (Mike Pfaiffer)
Date: Tue, 23 Oct 2007 16:46:33 -0500
Subject: [Python Wpg] Meeting tomorrow
Message-ID: <471E6BB9.6020607@gmail.com>

Sorry I won't be able to make it (yet again) this month. A throat
infection this time. As I was mentioning to a couple of people... It's
been one thing after another since spring. A couple of days ago I was
sounding like a cross between Jack Klugman and a Cylon. With everything
else happening right now, I am really wiped out. The doctor says I
should be fine by the end of the week.

As for Python-related stuff: when I'm feeling better and not watching
the world through a "fish bowl" I plan on writing a short program for
door prize give-aways. The reason is I'll be doing a short presentation
for the MWCS in two weeks (an N64 emulator under OS X) and I have a
coupon for a free bag of chips from Subway.
I'd give it to the Python group (or someone who lives nearby) but it
would be more effort for someone to collect it than it would be for
them to just buy one. Anyhow, I plan for multiple prize submissions with
one prize per winner. I will post the code here when it's done. OTOH, if
someone already has the code or is just looking for some practice go
ahead and post it.

Later
Mike

From stuartw at mts.net Wed Oct 24 08:39:00 2007
From: stuartw at mts.net (Stuart Williams)
Date: Wed, 24 Oct 2007 07:39:00 -0500
Subject: [Python Wpg] Winnipeg Python Users Group meeting tonight
Message-ID: <18207.15588.20151.76124@gavel.swilliams.ca>

Don't forget the Winnipeg PUG meeting tonight, usual time and location
as per http://winnipug.ca.

It will be a series of 4 or 5 short lightning talks about how we've
used Python recently.

Stuart.

From sara_arenson at yahoo.ca Wed Oct 24 13:49:29 2007
From: sara_arenson at yahoo.ca (Sara Arenson)
Date: Wed, 24 Oct 2007 13:49:29 -0400 (EDT)
Subject: [Python Wpg] Winnipeg Python Users Group meeting tonight
In-Reply-To: <18207.15588.20151.76124@gavel.swilliams.ca>
Message-ID: <458953.35154.qm@web90502.mail.mud.yahoo.com>

Hi guys,

Tonight's meeting sounds interesting, but I can't make it.

See you guys next time, hopefully.

Later,
Sara

--- Stuart Williams wrote:
> Don't forget the Winnipeg PUG meeting tonight, usual time and location
> as per http://winnipug.ca.
>
> It will be a series of 4 or 5 short lightning talks about how we've
> used Python recently.
>
> Stuart.

From syd at plug.ca Thu Oct 25 12:30:49 2007
From: syd at plug.ca (Sydney Weidman)
Date: Thu, 25 Oct 2007 11:30:49 -0500
Subject: [Python Wpg] QuickReference and "any" in no particular order
Message-ID: <1193329849.7151.9.camel@sweidman-laptop>

I had to try something with "any", so here it is:

import random

def run():
    draw = sorted(random.sample(range(1,50),6))
    print "6-49 Draw: %s" % (draw,)
    mypicks = sorted(random.sample(range(1,50),6))
    print "My 6-49 Picks: %s" % (mypicks,)
    if any([pick in draw for pick in mypicks]):
        print "You have winning numbers!!\n\n"
        return True
    else:
        print "Sorry, try again...\n\n"
        return False

if __name__ == '__main__':
    if any([run() for i in range(0,10)]):
        print "You had some winners in the series"
    else:
        print "No winners in the series"

Also, I found the Python Quick Reference:

http://rgruet.free.fr/#QuickRef

Very handy!

See you all next month!

- Syd

From high.res.mike at gmail.com Thu Oct 25 16:53:08 2007
From: high.res.mike at gmail.com (Mike Pfaiffer)
Date: Thu, 25 Oct 2007 15:53:08 -0500
Subject: [Python Wpg] Door prize draw
Message-ID: <47210234.6040403@gmail.com>

I was feeling a little more like doing something today. So I whipped up
a quick and dirty door prize draw program (no need to go crazy with it).
Save the text to prize.py. Here is the source. If I can resolve my
problems with Shaw it will be posted there too. Enjoy.

Later
Mike

P.S. Produced on a Mac Mini with Textwrangler.

---------------------
# This program distributes door prizes for clubs

# Import modules as needed
from random import seed, random

# Seed random number generator
seed()

# Print introductory message
print """Door prize distribution program:

This program will distribute a number of door prizes to all people who
enter their names in to this program. The program presumes no errors
in data entry.

Note: to collect the prize the person must actually be present.

The first step is to enter the prizes and the second is to enter the names
of the people participating in the draw. The third will be the actual draw.

"""

# Enter prizes
prize_count = -1
print "Enter prize list. Terminate with 'end'."
print
prize = ''
while (prize != 'end'):
    prize = raw_input('Prize --> ')
    if prize != 'end':
        prize_count = prize_count + 1
        if prize_count == 0:
            prize_list = [prize]
        else:
            prize_list = prize_list + [prize]
    else:
        print "Prize entry terminated."
        print

# List prizes
print "Prize list:"
for counter in range(prize_count + 1):
    print prize_list[counter]
print

# Enter names
name_count = -1
print "Enter names. Terminate with 'end'."
print
name = ''
while name != 'end':
    name = raw_input('Name --> ')
    if name != 'end':
        name_count = name_count + 1
        template = [name, 'no prize']
        if name_count == 0:
            name_list = [template]
        else:
            name_list = name_list + [template]
    else:
        print "Name entry terminated."
        print

# List names
print "Name list:"
for counter in range(name_count + 1):
    print name_list[counter][0]
print

# Award prizes
print "The prizes go to..."
counter = 0
while counter <= (prize_count):
    award = int(random() * (name_count + 1))
    if name_list[award][1] == 'no prize':
        name_list[award][1] = prize_list[counter]
        counter = counter + 1

# Display results
for counter in range(name_count + 1):
    print name_list[counter]

From m.hohner at uwinnipeg.ca Fri Oct 26 09:20:25 2007
From: m.hohner at uwinnipeg.ca (Michael Hohner)
Date: Fri, 26 Oct 2007 08:20:25 -0500
Subject: [Python Wpg] Door prize draw
In-Reply-To: <47210234.6040403@gmail.com>
References: <47210234.6040403@gmail.com>
Message-ID: <4721A3BD.96A9.0081.0@uwinnipeg.ca>

Mike;

Slight improvement:

U-of-Ws-Computer:~ mhohner$ diff prize_orig.py prize.py
0a1,2
> import random
>
52c54
<     award = int(random() * (name_count + 1))
---
>     award = int(random.random() * (name_count + 1))

Otherwise I get:

  File "prize.py", line 52, in ?
    award = int(random() * (name_count + 1))
NameError: name 'random' is not defined

Thanks! - neat idea!

Cheers,

Michael

Michael Hohner,
Systems & Media Services Coordinator
University of Winnipeg Library

VOICE: (204) 786-9812
FAX: (204) 783-8910
EMAIL: m.hohner at uwinnipeg.ca
WEB: http://library.uwinnipeg.ca/

From high.res.mike at gmail.com Fri Oct 26 12:21:36 2007
From: high.res.mike at gmail.com (Mike Pfaiffer)
Date: Fri, 26 Oct 2007 11:21:36 -0500
Subject: [Python Wpg] Door prize draw
In-Reply-To: <4721A3BD.96A9.0081.0@uwinnipeg.ca>
References: <47210234.6040403@gmail.com> <4721A3BD.96A9.0081.0@uwinnipeg.ca>
Message-ID: <47221410.8030905@gmail.com>

Michael Hohner wrote:
> Mike;
>
> Slight improvement:

No problem. I'll add them. More compatibility and all that. Funny you
should get an error. I tested it on my Linux box as well with no
problem.

Later
Mike

From aklaassen at gmail.com Mon Oct 29 03:18:06 2007
From: aklaassen at gmail.com (Aaron Klaassen)
Date: Mon, 29 Oct 2007 15:18:06 +0800
Subject: [Python Wpg] Just hook it up to my veins
Message-ID:

So...after lurking on this mailing list for the better part of two years
(I even went to a meeting once!), I've finally got some time on my hands
and I figure I should pick up this "Python" thing that everybody's
talking about.

Any suggestions for specific online tutorials (or books that are worth
the dollars)? I'm not a new programmer, just new to Python (I've mostly
been a C/Java/php guy). I mean, I can type "python tutorial" into Google
as well as the next guy, but maybe someone here can suggest something
that would save me the digging around for something worthwhile.

Thanks.
Aaron.

From syd at plug.ca Mon Oct 29 08:24:05 2007
From: syd at plug.ca (Sydney Weidman)
Date: Mon, 29 Oct 2007 07:24:05 -0500
Subject: [Python Wpg] Just hook it up to my veins
In-Reply-To:
References:
Message-ID: <1193660645.31760.10.camel@localhost.localdomain>

On Mon, 2007-10-29 at 15:18 +0800, Aaron Klaassen wrote:
> So...after lurking on this mailing list for the better part of two
> years (I even went to a meeting once!), I've finally got some time on
> my hands and I figure I should pick up this "Python" thing that
> everybody's talking about.
>
> Any suggestions for specific online tutorials (or books that are worth
> the dollars)? I'm not a new programmer, just new to Python (I've
> mostly been a C/Java/php guy). I mean, I can type "python tutorial"
> into Google as well as the next guy, but maybe someone here can
> suggest something that would save me the digging around for something
> worthwhile.
>
> Thanks.
> Aaron.

The two online tutorials that I found most helpful were:

Think Like a Computer Scientist, Python version:

http://www.ibiblio.org/obp/thinkCSpy/

And although it has not been updated since 2004, Dive Into Python should
also be helpful:

http://www.diveintopython.org/toc/index.html

I've also grown very fond of O'Reilly's Python Cookbook recently, which
provides hundreds of examples of how to solve typical programming
problems in Pythonic ways.

And then, of course, there's Stuart's various introductory talks which
are all excellent.

Hope this helps.
- syd

From syd at plug.ca Tue Oct 30 10:21:42 2007
From: syd at plug.ca (Sydney Weidman)
Date: Tue, 30 Oct 2007 09:21:42 -0500
Subject: [Python Wpg] Descriptors
Message-ID: <1193754102.8084.4.camel@sweidman-laptop>

This page about descriptors and the descriptor protocol touches on some
of what Stuart was talking about at the last meeting with respect to
when __getattr__ method is invoked:

http://users.rcn.com/python/download/Descriptor.htm

I found the document interesting and helpful, as was Stuart's
presentation.

- Syd

From stuartw at mts.net Wed Oct 31 08:38:22 2007
From: stuartw at mts.net (Stuart Williams)
Date: Wed, 31 Oct 2007 07:38:22 -0500
Subject: [Python Wpg] Descriptors
In-Reply-To: <1193754102.8084.4.camel@sweidman-laptop>
References: <1193754102.8084.4.camel@sweidman-laptop>
Message-ID:

Good resource. I'm aware of property but haven't used it consistently.
Here's a snippet of the old code I presented last week and a new
untested version which uses property instead of __getattr__.
# Old
class Event:
    def __str__(self):
        return '%2d %6s %s' % (self.EventClass, self.Duration, \
            (self.TextData[:30] if self.TextData else ''))

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        if name != 'spname':
            raise AttributeError(name)
        # cache the value so later lookups bypass __getattr__
        self.spname = self.TextData.split()[0]
        return self.spname

# New
class Event(object):
    def __str__(self):
        return '%2d %6s %s' % (self.EventClass, self.Duration, \
            (self.TextData[:30] if self.TextData else ''))

    def get_spname(self):
        # single underscore: a '__spname' string passed to hasattr()
        # is not name-mangled, so it would never match self.__spname
        if not hasattr(self, '_spname'):
            self._spname = self.TextData.split()[0]
        return self._spname

    def set_spname(self, value):
        # needed to make spname a data descriptor
        raise AttributeError

    spname = property(get_spname, set_spname, None, "Stored procedure name")

From peter at pogma.com Wed Oct 31 13:03:24 2007
From: peter at pogma.com (Peter O'Gorman)
Date: Wed, 31 Oct 2007 12:03:24 -0500
Subject: [Python Wpg] remove dup mails
Message-ID: <4728B55C.9060406@pogma.com>

I mentioned at the meeting that fetchmail went mad and downloaded my
mail messages repeatedly leaving me with multiple copies of several
hundred messages. The files were not identical, but only differed in
"Received" headers. This is the Python script I came up with (it took a
while, I had to spend a good deal of time reading the docs).

I'm sure that there are better ways to do this, and would not mind a
critique, but this did work.

Thanks,
Peter

#! /usr/bin/python
import os
import sys
import email
import hashlib

dups = {}

for root, dirs, files in os.walk('/home/pogma/Maildir'):
    for fname in files:
        try:
            fobj = open(os.path.join(root,fname))
            msg = email.message_from_file(fobj)
            fobj.close()
        except:
            fobj.close()
            continue
        msg.__delitem__('Received')
        hash = hashlib.md5(msg.as_string()).hexdigest()
        if not dups.has_key(hash):
            dups[hash] = os.path.join(root,fname)
        else:
            os.unlink(os.path.join(root,fname))

From stuartw at mts.net Wed Oct 31 22:09:50 2007
From: stuartw at mts.net (Stuart Williams)
Date: Wed, 31 Oct 2007 21:09:50 -0500
Subject: [Python Wpg] remove dup mails
In-Reply-To: <4728B55C.9060406@pogma.com>
References: <4728B55C.9060406@pogma.com>
Message-ID:

This looks great! I can't think of significantly better ways of doing
it. Here are some style suggestions.

Follow PEP 8 (http://www.python.org/dev/peps/pep-0008/) for
indentation, etc. Note that __delitem__ is a special method name, the
hook that "del obj[key]" calls, so see how I used del below (untested).
Also, file is one of the objects that follows the new context
management protocol which allows you to use it with "with" a la
http://docs.python.org/whatsnew/pep-343.html and get rid of the try
and both closes. Lastly, dicts support the "in" operator.

So here's a slightly different version:

#! /usr/bin/python
from __future__ import with_statement

import os
import sys
import email
import hashlib

dups = {}

for root, dirs, files in os.walk('a'):
    for fname in files:
        with open(os.path.join(root,fname)) as fobj:
            msg = email.message_from_file(fobj)
        del msg['Received']
        hash = hashlib.md5(msg.as_string()).hexdigest()
        if not hash in dups:
            dups[hash] = os.path.join(root,fname)
        else:
            print 'unlink'
            # os.unlink(os.path.join(root,fname))

From stuartw at mts.net Wed Oct 31 22:16:58 2007
From: stuartw at mts.net (Stuart Williams)
Date: Wed, 31 Oct 2007 21:16:58 -0500
Subject: [Python Wpg] remove dup mails
In-Reply-To:
References: <4728B55C.9060406@pogma.com>
Message-ID:

One more thought, not Python-related. Wouldn't msg['Message-id']
reliably replace the hash as a unique handle on the message?

From peter at pogma.com Wed Oct 31 22:37:08 2007
From: peter at pogma.com (Peter O'Gorman)
Date: Wed, 31 Oct 2007 21:37:08 -0500
Subject: [Python Wpg] remove dup mails
In-Reply-To:
References: <4728B55C.9060406@pogma.com>
Message-ID: <47293BD4.8010403@pogma.com>

Stuart Williams wrote:
> One more thought, not Python-related. Wouldn't msg['Message-id']
> reliably replace the hash as a unique handle on the message?

I considered that, but did not trust that the Message-id was, in fact,
a unique message identifier. I don't know enough about email.

> On 10/31/07, Stuart Williams wrote:
>> This looks great! I can't think of significantly better ways of doing
>> it. Here are some style suggestions.
>>
>> Follow PEP 8 (http://www.python.org/dev/peps/pep-0008/) for
>> indentation, etc. Note that __delitem__ is a special name, intended
>> to implement dicts, so see how I used it below (untested). Also, file
>> is one of the objects that follows the new context management
>> protocol which allows you to use it with "with" a la
>> http://docs.python.org/whatsnew/pep-343.html and get rid of the try
>> and both closes. Lastly dicts support the "in" operator.

Thank you for the review and pointers.

>> So here's a slightly different version:
>>     for fname in files:
>>         with open(os.path.join(root,fname)) as fobj:

I'm going off to read about 'with' right now!

Peter
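For anyone else reading up on 'with', here is a rough sketch of the
same de-duplication idea split into two small functions. It is written
for current Python 3 rather than the Python 2.5 of this thread, so it
uses the binary-mode email calls message_from_binary_file and as_bytes.
The function names message_digest and find_duplicates are made up for
illustration, and the sketch only reports duplicates instead of
unlinking them.

```python
import email
import hashlib
import os


def message_digest(path):
    """Hash a message file while ignoring its Received headers."""
    # The with block closes the file even if parsing raises, which is
    # what the try/close pair in the original script was doing by hand.
    with open(path, "rb") as fobj:
        msg = email.message_from_binary_file(fobj)
    # del removes every header of that name; no error if none exist.
    del msg["Received"]
    # MD5 serves only as a fingerprint here, not for security.
    return hashlib.md5(msg.as_bytes()).hexdigest()


def find_duplicates(maildir):
    """Return paths whose digest matches an earlier file under maildir."""
    seen = {}
    duplicates = []
    for root, dirs, files in os.walk(maildir):
        for fname in sorted(files):
            path = os.path.join(root, fname)
            digest = message_digest(path)
            if digest in seen:
                duplicates.append(path)
            else:
                seen[digest] = path
    return duplicates
```

Printing the result of find_duplicates on a copy of the mail directory
first, and only then passing each path to os.unlink, keeps the same
dry-run safety as the commented-out unlink above.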