[Mailman-Developers] chunkify suggestion, with patch.
Darrell Fuhriman
darrell@grumblesmurf.net
Mon, 17 Jul 2000 13:59:26 -0700 (PDT)
> Also, one problem with this code is that the domain_sort function is
> really rather slow. If anyone has suggestions for speeding it up, that'd
> be great. :)
OK, I looked at bulk_mailer and realized it has a much
faster/simpler/slicker way of doing this work. So I stole the idea. :)
Here's a new patch for SMTPDirect.py. Who decides what gets included in
the release, BTW?
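For anyone who wants the idea without reading the diff, here is a minimal sketch of it. It is written in modern Python 3, not the Python the patch targets, and it is not part of the patch. Sorting on the reversed address puts recipients that share a domain suffix next to each other, and the sorted list is then cut into chunks of at most chunksize. The function names mirror the patch, but the key-based sort and the slicing are simplifications for illustration, and the sample addresses are made up.

```python
def domain_sort(recips):
    """Return a new list ordered by each address read backwards.

    Comparing reversed strings compares the TLD first, then the rest of
    the domain, so addresses at the same domain end up adjacent.
    Unlike the patch, this does not modify the list passed in.
    """
    return sorted(recips, key=lambda addr: addr[::-1])


def chunkify(recips, chunksize):
    """Split the domain-sorted recipients into lists of at most chunksize.

    chunksize is assumed to be a positive integer; range() raises
    ValueError if it is 0.
    """
    ordered = domain_sort(recips)
    return [ordered[i:i + chunksize]
            for i in range(0, len(ordered), chunksize)]


if __name__ == "__main__":
    # Hypothetical recipient list, for illustration only.
    sample = ["a@foo.com", "b@bar.net", "c@foo.com", "d@bar.net", "e@baz.org"]
    for chunk in chunkify(sample, 2):
        print(chunk)
```

As in the patch, a domain can still be split across two chunks when a chunk fills up partway through it.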
Darrell
--- Mailman/Handlers/SMTPDirect.py~ Fri Jun 2 21:59:45 2000
+++ Mailman/Handlers/SMTPDirect.py Mon Jul 17 13:50:16 2000
@@ -106,48 +106,48 @@
 def chunkify(recips, chunksize):
-    # First do a simple sort on top level domain. It probably doesn't buy us
-    # much to try to sort on MX record -- that's the MTA's job. We're just
-    # trying to avoid getting a max recips error. Split the chunks along
-    # these lines (as suggested originally by Chuq Von Rospach and slightly
-    # elaborated by BAW).
-    chunkmap = {'com': 1,
-                'net': 2,
-                'org': 2,
-                'edu': 3,
-                'us' : 3,
-                'ca' : 3,
-                }
-    buckets = {}
-    for r in recips:
-        tld = None
-        i = string.rfind(r, '.')
-        if i >= 0:
-            tld = r[i+1:]
-        bin = chunkmap.get(tld, 0)
-        bucket = buckets.get(bin, [])
-        bucket.append(r)
-        buckets[bin] = bucket
+    # If we turn down the chunksize (i.e. SMTP_MAX_RCPTS), and have
+    # the addresses sorted by domain, it's much nicer to the MTA and
+    # to the users. (In the majordomo world, this is what bulk_mailer
+    # would do.)
+    # In an ideal world, a single domain wouldn't be split across
+    # multiple chunks unless some other threshold had been met.
+    # I'll save that for sometime when it's not 2:30am. :)
+
+    recips = domain_sort(recips)
+
     # Now start filling the chunks
     chunks = []
     currentchunk = []
-    chunklen = 0
-    for bin in buckets.values():
-        for r in bin:
-            currentchunk.append(r)
-            chunklen = chunklen + 1
-            if chunklen >= chunksize:
-                chunks.append(currentchunk)
-                currentchunk = []
-                chunklen = 0
-        if currentchunk:
+    for recip in recips:
+        if len(currentchunk) >= chunksize:
             chunks.append(currentchunk)
             currentchunk = []
-            chunklen = 0
+        currentchunk.append(recip)
+    if len(currentchunk) != 0:
+        chunks.append(currentchunk)
     return chunks
+
+
+def domain_sort(recips):
+    # first we need to reverse every element of the list
+    for i in range(0,len(recips)):
+        recips[i] = string_reverse(recips[i])
+
+    recips.sort()
+    for i in range(0,len(recips)):
+        recips[i] = string_reverse(recips[i])
+    return recips
+def string_reverse(str):
+    tmp = array.array('c', str)
+    tmp.reverse()
+    str = tmp.tostring()
+    return str
+
+
 def pre_deliver(envsender, msgtext, failures, chunkq):
     while 1:
         # Get the next recipient chunk, if there is one