Bug in Mailman version 2.1a1

Variable	Value
`sys.version`	2.0b2 (#6, Oct 7 2000, 22:07:24) [C]
`sys.executable`	/usr/local/bin/python
`sys.prefix`	/usr/local
`sys.exec_prefix`	/usr/local
`sys.path`	/usr/local
`sys.platform`	sco_sv3

Variable	Value
`DOCUMENT_ROOT`	/home
`SERVER_ADDR`	207.234.31.38
`HTTP_ACCEPT_ENCODING`	gzip, deflate
`SERVER_PORT`	80
`PATH_TRANSLATED`	/home/homebrew
`REMOTE_ADDR`	207.234.31.45
`UNIQUE_ID`	OjWeyc-qHyYAAAVukiM
`HTTP_ACCEPT_LANGUAGE`	ie-ee,en-us;q=0.5
`GATEWAY_INTERFACE`	CGI/1.1
`SERVER_NAME`	sco.theporch.com
`TZ`	CST6CDT
`HTTP_USER_AGENT`	Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
`QUERY_STRING`
`HTTP_ACCEPT`	image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, */*
`REQUEST_URI`	/mailman/admin/homebrew
`REMOTE_PORT`	1205
`SCRIPT_FILENAME`	/home/mailman/cgi-bin/admin
`SCRIPT_URL`	/mailman/admin/homebrew
`HTTP_HOST`	www.theporch.com:8080
`REQUEST_METHOD`	GET
`SERVER_SIGNATURE`	Apache/1.3.14 Server at sco.theporch.com Port 80
`SCRIPT_URI`	http://sco.theporch.com/mailman/admin/homebrew
`SCRIPT_NAME`	/mailman/admin
`SERVER_ADMIN`	root@sco.theporch.com
`SERVER_SOFTWARE`	Apache/1.3.14 (Unix) PHP/4.0.3pl1 mod_ssl/2.7.1 OpenSSL/0.9.7-dev
`PYTHONPATH`	/home/mailman
`PATH_INFO`	/homebrew
`SERVER_PROTOCOL`	HTTP/1.1
`HTTP_CONNECTION`	Keep-Alive
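
For anyone tracing how the request variables above feed into link generation, the relative-URL branch of ScriptURL from the attached Utils.py can be paraphrased roughly as below. This is only a Python 3 sketch under assumptions, not the code that ran on the server: the name script_url and its parameters are illustrative, the web_page_url value http://sco.theporch.com/mailman/ is hypothetical (the report does not show the list's actual setting), and CGIEXT is assumed to be empty.

```python
from urllib.parse import urlparse


def script_url(target, web_page_url, request_uri, cgiext=""):
    """Rough Python 3 paraphrase of the relative-link logic in ScriptURL.

    All parameter names here are illustrative; the attached code reads
    REQUEST_URI from the environment and CGIEXT from mm_cfg instead.
    """
    # Simplification: always normalise the trailing slash.
    if not web_page_url.endswith("/"):
        web_page_url += "/"
    base_path = urlparse(web_page_url).path
    if request_uri.startswith(base_path):
        # Relative addressing: climb one level per '/' left in the path,
        # ignoring anything after a query string.
        rest = request_uri[len(base_path):]
        query_at = rest.find("?")
        if query_at > 0:
            depth = rest.count("/", 0, query_at)
        else:
            depth = rest.count("/")
        return "../" * depth + target + cgiext
    # Otherwise fall back to an absolute URL under web_page_url.
    return web_page_url + target + cgiext


# REQUEST_URI is taken from the table above; the base URL is an assumption.
print(script_url("admin", "http://sco.theporch.com/mailman/",
                 "/mailman/admin/homebrew"))
```

With a base URL whose path does not prefix REQUEST_URI, the sketch falls through to the absolute form, which is the case worth checking when HTTP_HOST and SERVER_NAME disagree as they do here.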

There are many reasons not to introduce or override the Reply-To: header. One is that some posters depend on their own Reply-To: settings to convey their valid return address. Another is that modifying Reply-To: makes it much more difficult to send private replies. See `Reply-To' Munging Considered Harmful for a general discussion of this issue. See Reply-To Munging Considered Useful for a dissenting opinion.

Some mailing lists have restricted posting privileges, with a parallel list devoted to discussions. Examples are `patches' or `checkin' lists, where software changes are posted by a revision control system, but discussion about the changes occurs on a developers mailing list. To support these types of mailing lists, select Explicit address and set the Reply-To: address below to point to the parallel list."""), ('administrivia', mm_cfg.Radio, ('No', 'Yes'), 0, "(Administrivia filter) Check postings and intercept ones" " that seem to be administrative requests?", "Administrivia tests will check postings to see whether" " it's really meant as an administrative request (like" " subscribe, unsubscribe, etc), and will add it to the" " the administrative requests queue, notifying the " " administrator of the new request, in the process. "), ('umbrella_list', mm_cfg.Radio, ('No', 'Yes'), 0, 'Send password reminders to, eg, "-owner" address instead of' ' directly to user.', "Set this to yes when this list is intended to cascade only to" " other mailing lists. When set, meta notices like confirmations" " and password reminders will be directed to an address derived" " from the member\'s address - it will have the value of" ' \"umbrella_member_suffix\" appended to the' " member\'s account name."), ('umbrella_member_suffix', mm_cfg.String, WIDTH, 0, 'Suffix for use when this list is an umbrella for other lists,' ' according to setting of previous "umbrella_list" setting.', 'When \"umbrella_list\" is set to indicate that this list has' " other mailing lists as members, then administrative notices" " like confirmations and password reminders need to not be sent" " to the member list addresses, but rather to the owner of those" " member lists. In that case, the value of this setting is" " appended to the member\'s account name for such notices." " \'-owner\' is the typical choice. 
This setting has no" ' effect when \"umbrella_list\" is \"No\".'), ('send_reminders', mm_cfg.Radio, ('No', 'Yes'), 0, 'Send monthly password reminders or no? Overrides the previous ' 'option.'), ('send_welcome_msg', mm_cfg.Radio, ('No', 'Yes'), 0, 'Send welcome message when people subscribe?', "Turn this on only if you plan on subscribing people manually " "and don't want them to know that you did so. This option " "is most useful for transparently migrating lists from " "some other mailing list manager to Mailman."), ('admin_immed_notify', mm_cfg.Radio, ('No', 'Yes'), 0, 'Should administrator get immediate notice of new requests, ' 'as well as daily notices about collected ones?', "List admins are sent daily reminders of pending admin approval" " requests, like subscriptions to a moderated list or postings" " that are being held for one reason or another. Setting this" " option causes notices to be sent immediately on the arrival" " of new requests, as well."), ('admin_notify_mchanges', mm_cfg.Radio, ('No', 'Yes'), 0, 'Should administrator get notices of subscribes/unsubscribes?'), ('dont_respond_to_post_requests', mm_cfg.Radio, ('Yes', 'No'), 0, 'Send mail to poster when their posting is held for approval?', "Approval notices are sent when mail triggers certain of the" " limits except routine list moderation and spam" " filters, for which notices are not sent. This" " option overrides ever sending the notice."), ('max_message_size', mm_cfg.Number, 7, 0, 'Maximum length in Kb of a message body. Use 0 for no limit.'), ('host_name', mm_cfg.Host, WIDTH, 0, 'Host name this list prefers.', "The host_name is the preferred name for email to mailman-related" " addresses on this host, and generally should be the mail" " host's exchanger address, if any. This setting can be useful" " for selecting among alternative names of a host that has" " multiple addresses."), ('web_page_url', mm_cfg.String, WIDTH, 0, '''Base URL for Mailman web interface. 
The URL must end in a single "/". See also the details for an important warning when changing this value.''', """This is the common root for all Mailman URLs referencing this mailing list. It is also used in the listinfo overview of mailing lists to identify whether or not this list resides on the virtual host identified by the overview URL; i.e. if this value is found (anywhere) in the URL, then this list is considered to be on that virtual host. If not, then it is excluded from the listing.

Warning: setting this value to an invalid base URL will render the mailing list unusable. You will also not be able to fix this from the web interface! In that case, the site administrator will have to fix the mailing list from the command line."""), ] if mm_cfg.ALLOW_OPEN_SUBSCRIBE: sub_cfentry = ('subscribe_policy', mm_cfg.Radio, ('none', 'confirm', 'require approval', 'confirm+approval'), 0, "What steps are required for subscription?
", "None - no verification steps (Not" " Recommended )
" "confirm (*) - email confirmation step" " required
" "require approval - require list administrator" " approval for subscriptions
" "confirm+approval - both confirm and approve" "

(*) when someone requests a subscription," " mailman sends them a notice with a unique" " subscription request number that they must" " reply to in order to subscribe.
This" " prevents mischievous (or malicious) people" " from creating subscriptions for others" " without their consent." ) config_info['privacy'] = [ "List access policies, including anti-spam measures," " covering members and outsiders." ' (See also the Archival Options' ' section for separate archive-privacy settings.)' % (self.GetScriptURL('admin')), "Subscribing", ('advertised', mm_cfg.Radio, ('No', 'Yes'), 0, 'Advertise this list when people ask what lists are on ' 'this machine?'), sub_cfentry, "Membership exposure", ('private_roster', mm_cfg.Radio, ('Anyone', 'List members', 'List admin only'), 0, 'Who can view subscription list?', "When set, the list of subscribers is protected by" " member or admin password authentication."), ('obscure_addresses', mm_cfg.Radio, ('No', 'Yes'), 0, "Show member addrs so they're not directly recognizable" ' as email addrs?', "Setting this option causes member email addresses to be" " transformed when they are presented on list web pages (both" " in text and as links), so they're not trivially" " recognizable as email addresses. The intention is to" " to prevent the addresses from being snarfed up by" " automated web scanners for use by spammers."), "General posting filters", ('moderated', mm_cfg.Radio, ('No', 'Yes'), 0, 'Must posts be approved by an administrator?'), ('member_posting_only', mm_cfg.Radio, ('No', 'Yes'), 0, 'Restrict posting privilege to list members?' ' (member_posting_only)', "Use this option if you want to restrict posting to list members." " If you want list members to be able to" " post, plus a handful of other posters, see the posters " " setting below"), ('posters', mm_cfg.EmailList, (5, WIDTH), 1, 'Addresses of members accepted for posting to this' ' list without implicit approval requirement. (See' ' "Restrict ... 
to list members"' ' for whether or not this is in addition to allowing posting' ' by list members', "Adding entries here will have one of two effects," " according to whether another option restricts posting to" " members.

The cost is that the list will not accept unhindered any" " postings relayed from other addresses, unless

For backwards compatibility with Mailman 1.1, if the regexp" " does not contain an `@', then the pattern is matched against" " just the local part of the recipient address. If that match" " fails, or if the pattern does contain an `@', then the pattern" " is matched against the entire recipient address. " "

Matching against the local part is deprecated; in a future" " release, the patterm will always be matched against the " " entire recipient address."), ('max_num_recipients', mm_cfg.Number, 5, 0, 'Ceiling on acceptable number of recipients for a posting.', "If a posting has this number, or more, of recipients, it is" " held for admin approval. Use 0 for no ceiling."), ('forbidden_posters', mm_cfg.EmailList, (5, WIDTH), 1, 'Addresses whose postings are always held for approval.', "Email addresses whose posts should always be held for" " approval, no matter what other options you have set." " See also the subsequent option which applies to arbitrary" " content of arbitrary headers."), ('bounce_matching_headers', mm_cfg.Text, (6, WIDTH), 0, 'Hold posts with header value matching a specified regexp.', "Use this option to prohibit posts according to specific header" " values. The target value is a regular-expression for" " matching against the specified header. The match is done" " disregarding letter case. Lines beginning with '#' are" " ignored as comments." "

Note that leading whitespace is trimmed from the" " regexp. This can be circumvented in a number of ways, eg" " by escaping or bracketing it." "

See also the forbidden_posters option for"
             " a related mechanism."),

            ('anonymous_list', mm_cfg.Radio, ('No', 'Yes'), 0,
             'Hide the sender of a message, replacing it with the list '
             'address (Removes From, Sender and Reply-To fields)'),
            ]

        config_info['nondigest'] = [
            "Policies concerning immediately delivered list traffic.",

            ('nondigestable', mm_cfg.Toggle, ('No', 'Yes'), 1,
             'Can subscribers choose to receive mail immediately,'
             ' rather than in batched digests?'),

            ('msg_header', mm_cfg.Text, (4, WIDTH), 0,
             'Header added to mail sent to regular list members',
             "Text prepended to the top of every immediately-delivery"
             " message. " + Utils.maketext('headfoot.html', raw=1)),

            ('msg_footer', mm_cfg.Text, (4, WIDTH), 0,
             'Footer added to mail sent to regular list members',
             "Text appended to the bottom of every immediately-delivery"
             " message. " + Utils.maketext('headfoot.html', raw=1)),
            ]

        config_info['bounce'] = Bouncer.GetConfigInfo(self)
        return config_info

    def Create(self, name, admin, crypted_password):
        if Utils.list_exists(name):
            raise Errors.MMListAlreadyExistsError, name
        Utils.ValidateEmail(admin)
        Utils.MakeDirTree(os.path.join(mm_cfg.LIST_DATA_DIR, name))
        self._full_path = os.path.join(mm_cfg.LIST_DATA_DIR, name)
        self._internal_name = name
        # Don't use Lock() since that tries to load the non-existant config.db
        self.__lock.lock()
        self.InitVars(name, admin, crypted_password)
        self._ready = 1
        self.InitTemplates()
        self.Save()
        # Touch these files so they have the right dir perms no matter what.
        # A "just-in-case" thing. This shouldn't have to be here.
        ou = os.umask(002)
        try:
            path = os.path.join(self._full_path, 'next-digest')
            fp = open(path, "a+")
            fp.close()
            fp = open(path+'-topics', "a+")
            fp.close()
        finally:
            os.umask(ou)

    def __save(self, dict):
        # Marshal this dictionary to file, and rotate the old version to a
        # backup file. The dictionary must contain only builtin objects. We
        # must guarantee that config.db is always valid so we never rotate
        # unless the we've successfully written the temp file.
        fname = os.path.join(self._full_path, 'config.db')
        fname_tmp = fname + '.tmp.%s.%d' % (socket.gethostname(), os.getpid())
        fname_last = fname + '.last'
        fp = None
        try:
            fp = open(fname_tmp, 'w')
            # marshal doesn't check for write() errors so this is safer.
            fp.write(marshal.dumps(dict))
            fp.close()
        except IOError, e:
            syslog('error',
                   'Failed config.db write, retaining old state.\n%s' % e)
            if fp is not None:
                os.unlink(fname_tmp)
            raise
        # Now do config.db.tmp.xxx -> config.db -> config.db.last rotation
        # as safely as possible.
        try:
            # might not exist yet
            os.unlink(fname_last)
        except OSError, e:
            if e.errno <> errno.ENOENT: raise
        try:
            # might not exist yet
            os.link(fname, fname_last)
        except OSError, e:
            if e.errno <> errno.ENOENT: raise
        os.rename(fname_tmp, fname)

    def Save(self):
        # Refresh the lock, just to let other processes know we're still
        # interested in it. This will raise a NotLockedError if we don't have
        # the lock (which is a serious problem!). TBD: do we need to be more
        # defensive?
        self.__lock.refresh()
        # If more than one client is manipulating the database at once, we're
        # pretty hosed. That's a good reason to make this a daemon not a
        # program.
        self.IsListInitialized()
        # copy all public attributes to marshalable dictionary
        dict = {}
        for key, value in self.__dict__.items():
            if key[0] <> '_':
                dict[key] = value
        # Make config.db unreadable by `other', as it contains all the
        # list members' passwords (in clear text).
        omask = os.umask(007)
        try:
            self.__save(dict)
        finally:
            os.umask(omask)
            self.SaveRequestsDb()
        self.CheckHTMLArchiveDir()

    def __load(self, dbfile):
        # Attempt to load and unmarshal the specified database file, which
        # could be config.db or config.db.last. On success return a 2-tuple
        # of (dictionary, None). On error, return a 2-tuple of the form
        # (None, errorobj).
        try:
            fp = open(dbfile)
        except IOError, e:
            if e.errno <> errno.ENOENT: raise
            return None, e
        try:
            try:
                dict = marshal.load(fp)
                if type(dict) <> DictType:
                    return None, 'Unmarshal expected to return a dictionary'
            except (EOFError, ValueError, TypeError, MemoryError), e:
                return None, e
        finally:
            fp.close()
        return dict, None

    def Load(self, check_version=1):
        if not Utils.list_exists(self.internal_name()):
            raise Errors.MMUnknownListError
        # We first try to load config.db, which contains the up-to-date
        # version of the database. If that fails, perhaps because it is
        # corrupted or missing, then we load config.db.last as a fallback.
        dbfile = os.path.join(self._full_path, 'config.db')
        lastfile = dbfile + '.last'
        dict, e = self.__load(dbfile)
        if dict is None:
            # Had problems with config.db. Either it's missing or it's
            # corrupted. Try config.db.last as a fallback.
            syslog('error', '%s db file was corrupt, using fallback: %s'
                   % (self.internal_name(), lastfile))
            dict, e = self.__load(lastfile)
            if dict is None:
                # config.db.last is busted too. Nothing much we can do now.
                syslog('error', '%s fallback was corrupt, giving up'
                       % self.internal_name())
                raise Errors.MMCorruptListDatabaseError, e
            # We had to read config.db.last, so copy it back to config.db.
            # This allows the logic in Save() to remain unchanged. Ignore
            # any OSError resulting from possibly illegal (but unnecessary)
            # chmod.
            try:
                shutil.copy(lastfile, dbfile)
            except OSError, e:
                if e.errno <> errno.EPERM:
                    raise
        # Copy the unmarshaled dictionary into the attributes of the mailing
        # list object.
        self.__dict__.update(dict)
        self._ready = 1
        if check_version:
            self.CheckValues()
            self.CheckVersion(dict)

    def CheckVersion(self, stored_state):
        """Migrate prior version's state to new structure, if changed."""
        if (self.data_version >= mm_cfg.DATA_FILE_VERSION and
            type(self.data_version) == type(mm_cfg.DATA_FILE_VERSION)):
            return
        else:
            self.InitVars() # Init any new variables,
            self.Load(check_version = 0) # then reload the file
            if self.Locked():
                from versions import Update
                Update(self, stored_state)
                self.data_version = mm_cfg.DATA_FILE_VERSION
        if self.Locked():
            self.Save()

    def CheckValues(self):
        """Normalize selected values to known formats."""
        if '' in urlparse(self.web_page_url)[:2]:
            # Either the "scheme" or the "network location" part of the parsed
            # URL is empty; substitute faulty value with (hopefully sane)
            # default.
            self.web_page_url = mm_cfg.DEFAULT_URL
        if self.web_page_url and self.web_page_url[-1] <> '/':
            self.web_page_url = self.web_page_url + '/'

    def IsListInitialized(self):
        if not self._ready:
            raise Errors.MMListNotReadyError

    #xxx
    def AddMember(self, name, password, digest=0, remote=None):
        self.IsListInitialized()
        # normalize the name, it could be of the form
        #
        # User Name
        # person@place.com (User Name)
        # etc
        #
        email = Address.address()
        email.addr = Utils.ParseAddrs(name)
        email.name = Utils.ParseNames(name)
        # Remove spaces... it's a common thing for people to add...
        email.addr = string.join(string.split(email.addr), '')
        # lower case only the domain part
        email.addr = Utils.LCDomain(email.addr)
        # Validate the e-mail address to some degree.
        Utils.ValidateEmail(email.addr)
        if self.IsMember(email.addr):
            raise Errors.MMAlreadyAMember
        if email.addr == string.lower(self.GetListEmail()):
            # Trying to subscribe the list to itself!
            raise Errors.MMBadEmailError
        if digest and not self.digestable:
            raise Errors.MMCantDigestError
        elif not digest and not self.nondigestable:
            raise Errors.MMMustDigestError
        if self.subscribe_policy == 0:
            # no confirmation or approval necessary:
            self.ApprovedAddMember(email.addr, password, digest)
        elif self.subscribe_policy == 1 or self.subscribe_policy == 3:
            # confirmation:
            from Pending import Pending
            cookie = Pending().new(email.addr, password, digest)
            if remote is not None:
                by = " " + remote
                remote = " from %s" % remote
            else:
                by = ""
                remote = ""
            recipient = self.GetMemberAdminEmail(email.addr)
            text = Utils.maketext('verify.txt',
                                  {"email" : email.addr,
                                   "listaddr" : self.GetListEmail(),
                                   "listname" : self.real_name,
                                   "cookie" : cookie,
                                   "hostname" : remote,
                                   "requestaddr": self.GetRequestEmail(),
                                   "remote" : remote,
                                   "listadmin" : self.GetAdminEmail(),
                                   })
            msg = Message.UserNotification(
                recipient, self.GetRequestEmail(),
                '%s -- confirmation of subscription -- request %d' %
                (self.real_name, cookie),
                text)
            msg['Reply-To'] = self.GetRequestEmail()
            HandlerAPI.DeliverToUser(self, msg)
            if recipient != email.addr:
                who = "%s (%s)" % (email.addr, string.split(recipient, '@')[0])
            else:
                who = email.addr
            syslog('subscribe', '%s: pending %s %s' %
                   (self.internal_name(), who, by))
            raise Errors.MMSubscribeNeedsConfirmation
        else:
            # subscription approval is required. add this entry to the admin
            # requests database.
            self.HoldSubscription(email.addr, password, digest)
            raise Errors.MMNeedApproval, \
                  'subscriptions to %s require administrator approval' % \
                  self.real_name

    def ProcessConfirmation(self, cookie):
        from Pending import Pending
        got = Pending().confirmed(cookie)
        if not got:
            raise Errors.MMBadConfirmation
        else:
            (email_addr, password, digest) = got
        try:
            if self.subscribe_policy == 3: # confirm + approve
                self.HoldSubscription(email_addr, password, digest)
                raise Errors.MMNeedApproval, \
                      'subscriptions to %s require administrator approval' % \
                      self.real_name
            self.ApprovedAddMember(email_addr, password, digest)
        finally:
            self.Save()

    def ApprovedAddMember(self, name, password, digest,
                          ack=None, admin_notif=None):
        res = self.ApprovedAddMembers([name], [password],
                                      digest, ack, admin_notif)
        # There should be exactly one (key, value) pair in the returned dict,
        # extract the possible exception value
        res = res.values()[0]
        if res is None:
            # User was added successfully
            return
        else:
            # Split up the exception list and reraise it here
            e, v = res
            raise e, v

    def ApprovedAddMembers(self, names, passwords, digest,
                           ack=None, admin_notif=None):
        """Subscribe members in list `names'.

        Passwords can be supplied in the passwords list. If an empty
        password is encountered, a random one is generated and used.

        Returns a dict where the keys are addresses that were tried
        subscribed, and the corresponding values are either two-element
        tuple containing the first exception type and value that was
        raised when trying to add that address, or `None' to indicate
        that no exception was raised.
        """
        if ack is None:
            if self.send_welcome_msg:
                ack = 1
            else:
                ack = 0
        if admin_notif is None:
            if self.admin_notify_mchanges:
                admin_notif = 1
            else:
                admin_notif = 0
        if type(passwords) is not ListType:
            # Type error -- ignore whatever value(s) we were given
            passwords = [None] * len(names)
        lenpws = len(passwords)
        lennames = len(names)
        if lenpws < lennames:
            passwords.extend([None] * (lennames - lenpws))
        result = {}
        dirty = 0
        #xxx
        email = Address.address()
        for i in range(lennames):
            try:
                # normalize the name, it could be of the form
                #
                # User Name
                # person@place.com (User Name)
                # etc
                #
                email = Address.address()
                email.addr = Utils.ParseAddrs(names[i])
                email.name = Utils.ParseNames(names[i])
                Utils.ValidateEmail(email.addr)
                email.addr = Utils.LCDomain(email.addr)
            except (Errors.MMBadEmailError, Errors.MMHostileAddress):
                # We don't really need the traceback object for the exception,
                # and as using it in the wrong way prevents garbage collection
                # from working smoothly, we strip it away
                result[email.addr] = sys.exc_info()[:2]
            # WIBNI we could `continue' within `try' constructs...
            if result.has_key(email.addr):
                continue
            if self.IsMember(email.addr):
                result[email.addr] = [Errors.MMAlreadyAMember, email.addr]
                continue
            self.__AddMember(email, digest)
            self.SetUserOption(email.addr, mm_cfg.DisableMime,
                               1 - self.mime_is_default_digest,
                               save_list=0)
            # Make sure we set a "good" password
            password = passwords[i]
            if not password:
                password = Utils.MakeRandomPassword()
            self.passwords[string.lower(email.addr)] = password
            # An address has been added successfully, make sure the
            # list config is saved later on
            dirty = 1
            result[email.addr] = None
        if dirty:
            self.Save()
            if digest:
                kind = " (D)"
            else:
                kind = ""
            for email.addr in result.keys():
                if result[email.addr] is None:
                    syslog('subscribe', '%s: new%s %s' %
                           (self.internal_name(), kind, email.addr))
                    if ack:
                        self.SendSubscribeAck(
                            email.addr,
                            self.passwords[string.lower(email.addr)],
                            digest)
                    if admin_notif:
                        adminaddr = self.GetAdminEmail()
                        subject = ('%s subscription notification' %
                                   self.real_name)
                        text = Utils.maketext(
                            "adminsubscribeack.txt",
                            {"listname" : self.real_name,
                             "member" : email.addr,
                             })
                        msg = Message.UserNotification(
                            self.owner, mm_cfg.MAILMAN_OWNER, subject, text)
                        HandlerAPI.DeliverToUser(self, msg)
        return result

    def DeleteMember(self, name, whence=None, admin_notif=None, userack=1):
        self.IsListInitialized()
        # FindMatchingAddresses *should* never return more than 1 address.
        # However, should log this, just to make sure.
        aliases = Utils.FindMatchingAddresses(name, self.members,
                                              self.digest_members)
        if not len(aliases):
            raise Errors.MMNoSuchUserError

        def DoActualRemoval(alias, me=self):
            kind = "(unfound)"
            try:
                del me.passwords[alias]
            except KeyError:
                pass
            if me.user_options.has_key(alias):
                del me.user_options[alias]
            try:
                del me.members[alias]
                kind = "regular"
            except KeyError:
                pass
            try:
                del me.digest_members[alias]
                kind = "digest"
            except KeyError:
                pass

        map(DoActualRemoval, aliases)
        if userack and self.goodbye_msg and len(self.goodbye_msg):
            self.SendUnsubscribeAck(name)
        self.ClearBounceInfo(name)
        self.Save()
        if admin_notif is None:
            if self.admin_notify_mchanges:
                admin_notif = 1
            else:
                admin_notif = 0
        if admin_notif:
            subject = '%s unsubscribe notification' % self.real_name
            text = Utils.maketext(
                'adminunsubscribeack.txt',
                {'member' : name,
                 'listname': self.real_name,
                 })
            msg = Message.UserNotification(self.owner,
                                           mm_cfg.MAILMAN_OWNER,
                                           subject, text)
            HandlerAPI.DeliverToUser(self, msg)
        if whence:
            whence = "; %s" % whence
        else:
            whence = ""
        syslog('subscribe', '%s: deleted %s%s' %
               (self.internal_name(), name, whence))

    def IsMember(self, address):
        return len(Utils.FindMatchingAddresses(address, self.members,
                                               self.digest_members))

    def HasExplicitDest(self, msg):
        """True if list name or any acceptable_alias is included among the
        to or cc addrs."""
        # this is the list's full address
        listfullname = '%s@%s' % (self.internal_name(), self.host_name)
        recips = []
        # check all recipient addresses against the list's explicit addresses,
        # specifically To: Cc: and Resent-to:
        to = []
        for header in ('to', 'cc', 'resent-to', 'resent-cc'):
            to.extend(msg.getaddrlist(header))
        for fullname, addr in to:
            # It's possible that if the header doesn't have a valid
            # (i.e. RFC822) value, we'll get None for the address. So skip
            # it.
            if addr is None:
                continue
            addr = string.lower(addr)
            localpart = string.split(addr, '@')[0]
            if (# TBD: backwards compatibility: deprecated
                localpart == self.internal_name() or
                # Exact match against the complete list address. TBD:
                # this test should be case-insensitive.
                addr == listfullname):
                return 1
            recips.append((addr, localpart))
        #
        # helper function used to match a pattern against an address. Do it
        def domatch(pattern, addr):
            try:
                if re.match(pattern, addr):
                    return 1
            except re.error:
                # The pattern is a malformed regexp -- try matching safely,
                # with all non-alphanumerics backslashed:
                if re.match(re.escape(pattern), addr):
                    return 1
        #
        # Here's the current algorithm for matching acceptable_aliases:
        #
        # 1. If the pattern does not have an `@' in it, we first try matching
        #    it against just the localpart. This was the behavior prior to
        #    2.0beta3, and is kept for backwards compatibility.
        #    (deprecated).
        #
        # 2. If that match fails, or the pattern does have an `@' in it, we
        #    try matching against the entire recip address.
        for addr, localpart in recips:
            for alias in string.split(self.acceptable_aliases, '\n'):
                stripped = string.strip(alias)
                if not stripped:
                    # ignore blank or empty lines
                    continue
                if '@' not in stripped and domatch(stripped, localpart):
                    return 1
                if domatch(stripped, addr):
                    return 1
        return 0

    def parse_matching_header_opt(self):
        """Return a list of triples [(field name, regex, line), ...]."""
        # - Blank lines and lines with '#' as first char are skipped.
        # - Leading whitespace in the matchexp is trimmed - you can defeat
        #   that by, eg, containing it in gratuitous square brackets.
        all = []
        for line in string.split(self.bounce_matching_headers, '\n'):
            stripped = string.strip(line)
            if not stripped or (stripped[0] == "#"):
                # Skip blank lines and lines *starting* with a '#'.
                continue
            else:
                try:
                    h, e = re.split(":[ \t]*", stripped, 1)
                    try:
                        re.compile(e)
                        all.append((h, e, stripped))
                    except re.error, cause:
                        # The regexp in this line is malformed -- log it
                        # and ignore it
                        syslog('config', '%s - bad regexp %s [%s] '
                               'in bounce_matching_header line %s'
                               % (self.real_name, `e`, `cause`, `stripped`))
                except ValueError:
                    # Whoops - some bad data got by:
                    syslog('config', '%s - bad bounce_matching_header line %s'
                           % (self.real_name, `stripped`))
        return all

    def HasMatchingHeader(self, msg):
        """True if named header field (case-insensitive) matches regexp.

        Case insensitive.

        Returns constraint line which matches or empty string for no
        matches."""
        pairs = self.parse_matching_header_opt()
        for field, matchexp, line in pairs:
            fragments = msg.getallmatchingheaders(field)
            subjs = []
            l = len(field)
            for f in fragments:
                # Consolidate header lines, stripping header name & whitespace.
                if (len(f) > l
                    and f[l] == ":"
                    and string.lower(field) == string.lower(f[0:l])):
                    # Non-continuation line - trim header name:
                    subjs.append(f[l+2:])
                elif not subjs:
                    # Whoops - non-continuation that matches?
                    subjs.append(f)
                else:
                    # Continuation line.
                    subjs[-1] = subjs[-1] + f
            for s in subjs:
                # This is safe because parse_matching_header_opt only
                # returns valid regexps
                if re.search(matchexp, s, re.I):
                    return line
        return 0

    def Locked(self):
        return self.__lock.locked()

    def Lock(self, timeout=0):
        self.__lock.lock(timeout)
        # Must reload our database for consistency. Watch out for lists that
        # don't exist.
try: self.Load() except Errors.MMUnknownListError: self.Unlock() raise def Unlock(self): self.__lock.unlock(unconditionally=1) def __repr__(self): if self.Locked(): status = " (locked)" else: status = "" return ("<%s.%s %s%s at %s>" % (self.__module__, self.__class__.__name__, `self._internal_name`, status, hex(id(self))[2:])) def internal_name(self): return self._internal_name def fullpath(self): return self._full_path --------------268321D1775935CAA82A6949 Content-Type: text/plain; charset=us-ascii; name="Utils.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="Utils.py" # Copyright (C) 1998,1999,2000 by the Free Software Foundation, Inc. # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. """Miscellaneous essential routines. This includes actual message transmission routines, address checking and message and address munging, a handy-dandy routine to map a function on all the mailing lists, and whatever else doesn't belong elsewhere. 
""" import sys import os import string import re from UserDict import UserDict from types import StringType # XXX: obsolete, should use re module import regsub import random import urlparse from Mailman import mm_cfg from Mailman import Errors #xxx from Mailman import Address ##try: ## import md5 ##except ImportError: ## md5 = None from Mailman import Crypt def list_exists(listname): """Return true iff list `listname' exists.""" # It is possible that config.db got removed erroneously, in which case we # can fall back to config.db.last dbfile = os.path.join(mm_cfg.LIST_DATA_DIR, listname, 'config.db') lastfile = dbfile + '.last' return os.path.exists(dbfile) or os.path.exists(lastfile) def list_names(): """Return the names of all lists in default list directory.""" got = [] for fn in os.listdir(mm_cfg.LIST_DATA_DIR): if list_exists(fn): got.append(fn) return got # a much more naive implementation than say, Emacs's fill-paragraph! def wrap(text, column=70): """Wrap and fill the text to the specified column. Wrapping is always in effect, although if it is not possible to wrap a line (because some word is longer than `column' characters) the line is broken at the next available whitespace boundary. Paragraphs are also always filled, unless the line begins with whitespace. This is the algorithm that the Python FAQ wizard uses, and seems like a good compromise. """ wrapped = '' # first split the text into paragraphs, defined as a blank line paras = re.split('\n\n', text) for para in paras: # fill lines = [] fillprev = 0 for line in string.split(para, '\n'): if not line: lines.append(line) continue if line[0] in string.whitespace: fillthis = 0 else: fillthis = 1 if fillprev and fillthis: # if the previous line should be filled, then just append a # single space, and the rest of the current line lines[-1] = string.rstrip(lines[-1]) + ' ' + line else: # no fill, i.e. 
                # retain newline
                lines.append(line)
            fillprev = fillthis
        # wrap each line
        for text in lines:
            while text:
                if len(text) <= column:
                    line = text
                    text = ''
                else:
                    bol = column
                    # find the last whitespace character
                    while bol > 0 and text[bol] not in string.whitespace:
                        bol = bol - 1
                    # now find the last non-whitespace character
                    eol = bol
                    while eol > 0 and text[eol] in string.whitespace:
                        eol = eol - 1
                    # watch out for text that's longer than the column width
                    if eol == 0:
                        # break on whitespace after column
                        eol = column
                        while eol < len(text) and \
                              text[eol] not in string.whitespace:
                            eol = eol + 1
                        bol = eol
                        while bol < len(text) and \
                              text[bol] in string.whitespace:
                            bol = bol + 1
                        bol = bol - 1
                    line = text[:eol+1] + '\n'
                    # find the next non-whitespace character
                    bol = bol + 1
                    while bol < len(text) and text[bol] in string.whitespace:
                        bol = bol + 1
                    text = text[bol:]
                wrapped = wrapped + line
            wrapped = wrapped + '\n'
            # end while text
        wrapped = wrapped + '\n'
        # end for text in lines
    # the last two newlines are bogus
    return wrapped[:-2]


def QuotePeriods(text):
    return string.join(string.split(text, '\n.\n'), '\n .\n')


# TBD: what other characters should be disallowed?
_badchars = re.compile('[][()<>|;^,]')

def ValidateEmail(str):
    """Verify that an email address isn't grossly invalid."""
    # Pretty minimal, cheesy check.  We could do better...
    if not str:
        raise Errors.MMBadEmailError
    if _badchars.search(str) or str[0] == '-':
        raise Errors.MMHostileAddress
    if string.find(str, '/') <> -1 and \
       os.path.isdir(os.path.split(str)[0]):
        # then
        raise Errors.MMHostileAddress
    user, domain_parts = ParseEmail(str)
    # this means local, unqualified addresses, are not allowed
    if not domain_parts:
        raise Errors.MMBadEmailError
    if len(domain_parts) < 2:
        raise Errors.MMBadEmailError


# User J. Person <person@allusers.com>
_addrcre1 = re.compile('<(.*)>')
# person@allusers.com (User J. Person)
_addrcre2 = re.compile('([^(]*)\s(.*)')

#xxx
# We want to get the names now, not just the addresses
# User J. Person <person@allusers.com>
_namecre1 = re.compile('([^<]*)')
# person@allusers.com (User J. Person)
_namecre2 = re.compile('$(.*)$')

def ParseAddrs(addresses):
    """Parse common types of email addresses:

    User J. Person <person@allusers.com>
    person@allusers.com (User J. Person)

    TBD: I wish we could use rfc822.parseaddr() but 1) the interface is not
    convenient, and 2) it doesn't work for the second type of address.

    Argument is a list of addresses, return value is a list of the parsed
    email addresses.  The argument can also be a single string, in which case
    the return value is a single string.  All addresses are string.strip()'d.

    """
    single = 0
    if type(addresses) == type(''):
        single = 1
        addrs = [addresses]
    else:
        addrs = addresses
    parsed = []
    for a in addrs:
        mo = _addrcre1.search(a)
        if mo:
            parsed.append(mo.group(1))
            continue
        mo = _addrcre2.search(a)
        if mo:
            parsed.append(mo.group(1))
            continue
        parsed.append(a)
    if single:
        return string.strip(parsed[0])
    return map(string.strip, parsed)


# Added by BB, 9/24/00
# Gets the name of the user from the email address
def ParseNames(addresses):
    single = 0
    if type(addresses) == type(''):
        single = 1
        addrs = [addresses]
    else:
        addrs = addresses
    parsed = []
    for a in addrs:
        print "Utils: %s" % (a)  #XYZ
        mo = _namecre1.search(a)
        if mo:
            parsed.append(mo.group(1))
            print "Utils: %s" % (mo.group(1))  #XYZ
            continue
        mo = _namecre2.search(a)
        if mo:
            parsed.append(mo.group(1))
            print "Utils: %s" % (mo.group(1))  #XYZ
            continue
        parsed.append(a)
    if single:
        return parsed[0]
    return parsed


def GetPathPieces(envar='PATH_INFO'):
    path = os.environ.get(envar)
    if path:
        return filter(None, string.split(path, '/'))
    return None


def ScriptURL(target, web_page_url=None, absolute=0):
    """target - scriptname only, nothing extra
    web_page_url - the list's configvar of the same name
    absolute - a flag which if set, generates an absolute url
    """
    if web_page_url is None:
        web_page_url = mm_cfg.DEFAULT_URL
        if web_page_url[-1] <> '/':
            web_page_url = web_page_url + '/'
    fullpath = os.environ.get('REQUEST_URI')
    if fullpath is None:
        fullpath = os.environ.get('SCRIPT_NAME', '') + \
                   os.environ.get('PATH_INFO', '')
    baseurl = urlparse.urlparse(web_page_url)[2]
    if not absolute and fullpath[:len(baseurl)] == baseurl:
        # Use relative addressing
        fullpath = fullpath[len(baseurl):]
        i = string.find(fullpath, '?')
        if i > 0:
            count = string.count(fullpath, '/', 0, i)
        else:
            count = string.count(fullpath, '/')
        path = ('../' * count) + target
    else:
        path = web_page_url + target
    return path + mm_cfg.CGIEXT


def MakeDirTree(path, perms=0775, verbose=0):
    made_part = '/'
    path_parts = filter(None, string.split(path, '/'))
    for item in path_parts:
        made_part = os.path.join(made_part, item)
        if os.path.exists(made_part):
            if not os.path.isdir(made_part):
                raise "RuntimeError", ("Couldn't make dir tree for %s.  (%s"
                                       " already exists)" % (path, made_part))
        else:
            ou = os.umask(0)
            try:
                os.mkdir(made_part, perms)
            finally:
                os.umask(ou)
            if verbose:
                print 'made directory: ', made_part


# This takes an email address, and returns a tuple containing (user,host)
def ParseEmail(email):
    user = None
    domain = None
    email = string.lower(email)
    at_sign = string.find(email, '@')
    if at_sign < 1:
        return (email, None)
    user = email[:at_sign]
    rest = email[at_sign+1:]
    domain = string.split(rest, '.')
    return (user, domain)


def LCDomain(addr):
    "returns the address with the domain part lowercased"
    atind = string.find(addr, '@')
    if atind == -1: # no domain part
        return addr
    return addr[:atind] + '@' + string.lower(addr[atind + 1:])


# Return 1 if the 2 addresses match.  0 otherwise.
# Might also want to match if there's any common domain name...
# There's password protection anyway.
def AddressesMatch(addr1, addr2):
    "True when username matches and host addr of one addr contains other's."
    addr1, addr2 = map(LCDomain, [addr1, addr2])
    if not mm_cfg.SMART_ADDRESS_MATCH:
        return addr1 == addr2
    user1, domain1 = ParseEmail(addr1)
    user2, domain2 = ParseEmail(addr2)
    if user1 != user2:
        return 0
    if domain1 == domain2:
        return 1
    elif not domain1 or not domain2:
        return 0
    for i in range(-1 * min(len(domain1), len(domain2)), 0):
        # By going from most specific component of host part we're likely
        # to hit a difference sooner.
        if domain1[i] != domain2[i]:
            return 0
    return 1


def GetPossibleMatchingAddrs(name):
    """returns a sorted list of addresses that could possibly match
    a given name.

    For Example, given scott@pobox.com, return ['scott@pobox.com'],
    given scott@blackbox.pobox.com return ['scott@blackbox.pobox.com',
                                           'scott@pobox.com']"""
    name = string.lower(name)
    user, domain = ParseEmail(name)
    res = [name]
    if domain:
        domain = domain[1:]
        while len(domain) >= 2:
            res.append("%s@%s" % (user, string.join(domain, ".")))
            domain = domain[1:]
    return res


def List2Dict(list):
    """List2Dict returns a dict keyed by the entries in the list
    passed to it."""
    res = {}
    for item in list:
        res[item] = 1
    return res


def FindMatchingAddresses(name, dict, *dicts):
    """Given an email address, and any number of dictionaries keyed by
    email addresses, returns the subset of the list that matches the
    given address.
    Should sort based on exactness of match, just in case."""
    dicts = list(dicts)
    dicts.insert(0, dict)
    if not mm_cfg.SMART_ADDRESS_MATCH:
        for d in dicts:
            if d.has_key(string.lower(name)):
                return [name]
        return []
    #
    # GetPossibleMatchingAddrs return string.lower'd values
    #
    p_matches = GetPossibleMatchingAddrs(name)
    res = []
    for pm in p_matches:
        for d in dicts:
            if d.has_key(pm):
                res.append(pm)
    return res


_vowels = ('a', 'e', 'i', 'o', 'u')
_consonants = ('b', 'c', 'd', 'f', 'g', 'h', 'k', 'm', 'n',
               'p', 'r', 's', 't', 'v', 'w', 'x', 'z')
_syllables = []

for v in _vowels:
    for c in _consonants:
        _syllables.append(c+v)
        _syllables.append(v+c)

def MakeRandomPassword(length=6):
    syls = []
    while len(syls)*2 < length:
        syls.append(random.choice(_syllables))
    return string.join(syls, '')[:length]

def GetRandomSeed():
    chr1 = int(random.random() * 52)
    chr2 = int(random.random() * 52)
    def mkletter(c):
        if 0 <= c < 26:
            c = c + 65
        if 26 <= c < 52:
            c = c - 26 + 97
        return c
    return "%c%c" % tuple(map(mkletter, (chr1, chr2)))


def SetSiteAdminPassword(pw):
    fp = open_ex(mm_cfg.SITE_PW_FILE, 'w', perms=0640)
    fp.write(Crypt.crypt(pw, GetRandomSeed()))
    fp.close()

def CheckSiteAdminPassword(pw1):
    try:
        f = open(mm_cfg.SITE_PW_FILE)
        pw2 = f.read()
        f.close()
        return Crypt.crypt(pw1, pw2[:2]) == pw2
    # There probably is no site admin password if there was an exception
    except IOError:
        return 0


def QuoteHyperChars(str):
    arr = regsub.splitx(str, '[<>"&]')
    i = 1
    while i < len(arr):
        if arr[i] == '<':
            arr[i] = '&lt;'
        elif arr[i] == '>':
            arr[i] = '&gt;'
        elif arr[i] == '"':
            arr[i] = '&quot;'
        else:     #if arr[i] == '&':
            arr[i] = '&amp;'
        i = i + 2
    return string.join(arr, '')


# Just changing these two functions should be enough to control the way
# that email address obscuring is handled.

def ObscureEmail(addr, for_text=0):
    """Make email address unrecognizable to web spiders, but invertable.
    When for_text option is set (not default), make a sentence fragment
    instead of a token."""
    if for_text:
        return string.replace(addr, "@", " at ")
    else:
        return string.replace(addr, "@", "--at--")

def UnobscureEmail(addr):
    """Invert ObscureEmail() conversion."""
    # Contrived to act as an identity operation on already-unobscured
    # emails, so routines expecting obscured ones will accept both.
    return string.replace(addr, "--at--", "@")


def chunkify(members, chunksize=None):
    """
    return a list of lists of members
    """
    if chunksize is None:
        chunksize = mm_cfg.DEFAULT_ADMIN_MEMBER_CHUNKSIZE
    members.sort()
    res = []
    while 1:
        if not members:
            break
        chunk = members[:chunksize]
        res.append(chunk)
        members = members[chunksize:]
    return res


class SafeDict(UserDict):
    """Dictionary which returns a default value for unknown keys.

    This is used in maketext so that editing templates is a bit more robust.
    """
    def __init__(self, d=None):
        # optional initial dictionary is a Python 1.5.2-ism.  Do it this way
        # for portability
        UserDict.__init__(self)
        if d is not None:
            self.update(d)

    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            if type(key) == StringType:
                return '%('+key+')s'
            else:
                return '<Missing key: %s>' % `key`


def maketext(templatefile, dict=None, raw=0):
    """Make some text from a template file.

    Reads the `templatefile', relative to mm_cfg.TEMPLATE_DIR, does string
    substitution by interpolating in the `dict', and if `raw' is false,
    wraps/fills the resulting text by calling wrap().
    """
    if dict is None:
        dict = {}
    file = os.path.join(mm_cfg.TEMPLATE_DIR, templatefile)
    fp = open(file)
    template = fp.read()
    fp.close()
    text = template % SafeDict(dict)
    if raw:
        return text
    return wrap(text)


# given a Message.Message object, test for administrivia (eg subscribe,
# unsubscribe, etc).  the test must be a good guess -- messages that return
# true get sent to the list admin instead of the entire list.
#
def IsAdministrivia(msg):
    lines = map(string.lower, string.split(msg.body, "\n"))
    # check to see how many lines that actually have text in them there are
    admin_data = {"subscribe": (0, 3),
                  "unsubscribe": (0, 1),
                  "who": (0,0),
                  "info": (0,0),
                  "lists": (0,0),
                  "set": (3, 3),
                  "help": (0,0),
                  "password": (2, 2),
                  "options": (0,0),
                  "remove": (0, 0)}
    lines_with_text = 0
    for line in lines:
        if string.strip(line):
            lines_with_text = lines_with_text + 1
        if lines_with_text > mm_cfg.DEFAULT_MAIL_COMMANDS_MAX_LINES:
            return 0
    sig_ind = string.find(msg.body, "\n-- ")
    if sig_ind != -1:
        body = msg.body[:sig_ind]
    else:
        body = msg.body
    if admin_data.has_key(string.lower(string.strip(body))):
        return 1
    try:
        if admin_data.has_key(string.lower(string.strip(msg["subject"]))):
            return 1
    except KeyError:
        pass
    for line in lines[:5]:
        if not string.strip(line):
            continue
        words = string.split(line)
        if admin_data.has_key(words[0]):
            min_args, max_args = admin_data[words[0]]
            if min_args <= len(words[1:]) <= max_args:
                if (words[0] == 'set' and
                    (words[2] not in ['on', 'off'])):
                    continue
                return 1
    return 0


def mkdir(dir, mode=02775):
    """Wraps os.mkdir() in a umask saving try/finally.

    Two differences from os.mkdir():
    - umask is forced to 0 during mkdir()
    - default mode is 02775
    """
    ou = os.umask(0)
    try:
        os.mkdir(dir, mode)
    finally:
        os.umask(ou)

def open_ex(filename, mode='r', bufsize=-1, perms=0664):
    """Use os.open() to open a file in a particular mode.

    Returns a file-like object instead of a file descriptor.
    Also umask is forced to 0 during the open().
    `b' flag is currently unsupported."""
    modekey = mode
    trunc = os.O_TRUNC
    if mode == 'r':
        trunc = 0
    elif mode[-1] == '+':
        trunc = 0
        modekey = mode[:-1]
    else:
        trunc = os.O_TRUNC
    flags = {'r' : os.O_RDONLY,
             'w' : os.O_WRONLY | os.O_CREAT,
             'a' : os.O_RDWR   | os.O_CREAT | os.O_APPEND,
             'rw': os.O_RDWR   | os.O_CREAT,
             # TBD: should also support `b'
             }.get(modekey)
    if flags is None:
        raise TypeError, 'Unsupported file mode: ' + mode
    flags = flags | trunc
    ou = os.umask(0)
    try:
        try:
            fd = os.open(filename, flags, perms)
            fp = os.fdopen(fd, mode, bufsize)
            return fp
        # transform any os.errors into IOErrors
        except OSError, e:
            e.__class__ = IOError
            raise IOError, e, sys.exc_info()[2]
    finally:
        os.umask(ou)


def GetRequestURI(fallback=None):
    """Return the full virtual path this CGI script was invoked with.

    Newer web servers seem to supply this info in the REQUEST_URI
    environment variable -- which isn't part of the CGI/1.1 spec.
    Thus, if REQUEST_URI isn't available, we concatenate SCRIPT_NAME
    and PATH_INFO, both of which are part of CGI/1.1.

    Optional argument `fallback' (default `None') is returned if both of
    the above methods fail.

    """
    if os.environ.has_key('REQUEST_URI'):
        return os.environ['REQUEST_URI']
    elif os.environ.has_key('SCRIPT_NAME') and os.environ.has_key('PATH_INFO'):
        return os.environ['SCRIPT_NAME'] + os.environ['PATH_INFO']
    else:
        return fallback


# Wait on a dictionary of child pids
def reap(kids, func=None):
    while kids:
        if func:
            func()
        pid, status = os.waitpid(-1, os.WNOHANG)
        if pid <> 0:
            try:
                del kids[pid]
            except KeyError:
                # Huh?  How can this happen?
                pass


# Useful conversion routines
# unhexlify(hexlify(s)) == s
#
# Python 2.0 has these in the binascii module
try:
    from binascii import hexlify, unhexlify
except ImportError:
    # Not the most efficient of implementations, but good enough for older
    # versions of Python.
    def hexlify(s):
        acc = []
        def munge(byte, append=acc.append, a=ord('a'), z=ord('0')):
            if byte > 9: append(byte+a-10)
            else: append(byte+z)
        for c in s:
            hi, lo = divmod(ord(c), 16)
            munge(hi)
            munge(lo)
        return string.join(map(chr, acc), '')

    def unhexlify(s):
        acc = []
        append = acc.append
        # In Python 2.0, we can use the int() built-in
        int16 = string.atoi
        for i in range(0, len(s), 2):
            append(chr(int16(s[i:i+2], 16)))
        return string.join(acc, '')


def write(*args, **kws):
    file = sys.stdout
    sep = ' '
    end = '\n'
    if kws.has_key('file'):
        file = kws['file']
        del kws['file']
    if kws.has_key('nl'):
        if not kws['nl']:
            end = ' '
        del kws['nl']
    if kws.has_key('sep'):
        sep = kws['sep']
        del kws['sep']
    if kws:
        raise TypeError('unexpected keywords')
    file.write(string.join(map(str, args), sep) + end)

--------------268321D1775935CAA82A6949
Content-Type: text/plain; charset=us-ascii;
 name="add_members"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="add_members"

#! /usr/bin/env python
#
# Copyright (C) 1998,1999,2000 by the Free Software Foundation, Inc.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# argv[1] should be the name of the list.
# argv[2] should be the list of non-digested users.
# argv[3] should be the list of digested users.
# Make sure that the list of email addresses doesn't contain any comments,
# like majordomo may throw in.  For now, you just have to remove them manually.

"""Add members to a list from the command line.

Usage: add_members [-n <file>] [-d <file>] [-c <y|n>] [-w <y|n>] [-h] listname

Where:

    --non-digest-members-file <file>
    -n <file>
        A file containing addresses of the members to be added, one
        address per line.  This list of people become non-digest
        members.  If <file> is `-', read addresses from stdin.

    --digest-members-file <file>
    -d <file>
        Similar to above, but these people become digest members.

    --changes-msg=<y|n>
    -c <y|n>
        set whether or not to send the list members the `there's going to be
        big changes to your list' message.  defaults to no.

    --welcome-msg=<y|n>
    -w <y|n>
        set whether or not to send the list members a welcome message,
        overriding whatever the list's `send_welcome_msg' setting is.

    --help
    -h
        Print this help message and exit.

    listname
        The name of the Mailman list you are adding members to.  It must
        already exist.

You must supply at least one of -n and -d options.  At most one of the
files can be `-'.
"""

import sys
import os
import string
import getopt

import paths
from Mailman import MailList
from Mailman import Utils
from Mailman import Message
from Mailman import Errors
from Mailman import mm_cfg
from Mailman.Handlers import HandlerAPI


def usage(status, msg=''):
    if msg:
        print msg
    print __doc__ % globals()
    sys.exit(status)


def ReadFile(filename):
    lines = []
    if filename == "-":
        fp = sys.stdin
    else:
        fp = open(filename)
    lines = filter(None, map(string.strip, fp.readlines()))
    fp.close()
    return lines


def SendExplanation(mlist, users):
    adminaddr = mlist.GetAdminEmail()
    d = {'listname'    : mlist.real_name,
         'listhost'    : mlist.host_name,
         'listaddr'    : mlist.GetListEmail(),
         'listinfo_url': mlist.GetScriptURL('listinfo', absolute=1),
         'requestaddr' : mlist.GetRequestEmail(),
         'adminaddr'   : adminaddr,
         'version'     : mm_cfg.VERSION,
         }
    text = Utils.maketext('convert.txt', d)
    subject = 'Big change in %(listname)s@%(listhost)s mailing list' % d
    msg = Message.OutgoingMessage(text)
    msg['From'] = adminaddr
    msg['Subject'] = subject
    HandlerAPI.DeliverToUser(mlist, msg, {'recips': users})


def main():
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   'n:d:c:w:h',
                                   ['non-digest-members-file=',
                                    'digest-members-file=',
                                    'changes-msg=',
                                    'welcome-msg=',
                                    'help'])
    except getopt.error, msg:
        usage(1, msg)

    if not len(args) == 1:
        usage(1)

    listname = string.lower(args[0])
    nfile = None
    dfile = None
    send_changes_msg = 0
    send_welcome_msg = -1
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            usage(0)
        elif opt in ('-d', '--digest-members-file'):
            dfile = arg
        elif opt in ('-n', '--non-digest-members-file'):
            nfile = arg
        elif opt in ('-c', '--changes-msg'):
            if arg == 'y':
                send_changes_msg = 1
            elif arg == 'n':
                send_changes_msg = 0
            else:
                usage(1)
        elif opt in ('-w', '--welcome-msg'):
            if arg == 'y':
                send_welcome_msg = 1
            elif arg == 'n':
                send_welcome_msg = 0
            else:
                usage(1)

    if dfile is None and nfile is None:
        usage(1)

    if dfile == "-" and nfile == "-":
        print "Sorry, can't read both digest *and* normal members from stdin."
        sys.exit(1)

    try:
        ml = MailList.MailList(listname)
    except Errors.MMUnknownListError:
        usage(1, 'You must first create the list by running: newlist %s' %
              listname)

    if send_welcome_msg == -1:
        send_welcome_msg = ml.send_welcome_msg

    try:
        dmembers = []
        if dfile:
            try:
                dmembers = ReadFile(dfile)
            except IOError:
                pass

        nmembers = []
        if nfile:
            try:
                nmembers = ReadFile(nfile)
            except IOError:
                pass

        if not dmembers and not nmembers:
            usage(1)

        if nmembers:
            nres = ml.ApprovedAddMembers(nmembers, None, 0, send_welcome_msg)
        else:
            nres = {}

        if dmembers:
            dres = ml.ApprovedAddMembers(dmembers, None, 1, send_welcome_msg)
        else:
            dres = {}

        for result in (nres, dres):
            for name in result.keys():
                if result[name] is None:
                    pass
                else:
                    # `name' was not subscribed, find out why.  On failures,
                    # result[name] is set from sys.exc_info()[:2]
                    e, v = result[name]
                    if e is Errors.MMAlreadyAMember:
                        print 'Already subscribed (skipping):', name
                    elif issubclass(e, Errors.EmailAddressError):
                        if name == '': name = '( blank line )'
                        print "Not a valid email address:", name

        if send_changes_msg:
            SendExplanation(ml, nmembers + dmembers)
    finally:
        ml.Unlock()


main()

--------------268321D1775935CAA82A6949--


From mats@laplaza.org Thu Dec 28 20:44:01 2000
From: mats@laplaza.org (Mats Wichmann)
Date: Thu, 28 Dec 2000 12:44:01 -0800
Subject: [Mailman-Developers] Off-line for a while :(
In-Reply-To:
Message-ID: <5.0.2.1.1.20001228124104.00a5b6e0@mail.laplaza.org>

At 01:06 PM 12/21/2000 -0500, Barry A. Warsaw wrote:

>Hello all,
>
>As of 6pm last night, my old DSL line was finally yanked, so
>I'm effectively off-line.  (I'm writing this from Guido's
>house via a web interface to corporate email -- tedious and
>a whole state away so I won't be doing this too often.)

...summarizing my long-standing gripe with mailman's
enforced web interface for things like canning spam
and approving user requests (last year I was 1000+ miles
away from my servers over 50% of the time)...
I still hope someday there will be an email "bypass" to
get around that traffic jam.

Mats


From barry@wooz.org Sun Dec 31 01:06:23 2000
From: barry@wooz.org (Barry A. Warsaw)
Date: Sat, 30 Dec 2000 20:06:23 -0500
Subject: [Mailman-Developers] Re: Most everything is busted
References:
Message-ID: <14926.34447.60988.553140@anthem.concentric.net>

>>>>> "TP" == Tim Peters writes:

    TP> + news->mail for c.l.py hasn't delivered anything for well
    TP> over 24 hours.

    TP> + No mail to Python-Dev has showed up in the archives (let
    TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200
    TP> (IST).

    TP> + The other Python mailing lists appear equally dead.

There's a stupid, stupid bug in Mailman 2.0, which I've just fixed and
(hopefully) unjammed things on the Mailman end[1].  We're still
probably subject to the Postfix delays unfortunately; I think those
are DNS related, and I've gotten a few other reports of DNS oddities,
which I've forwarded off to the DC sysadmins.  I don't think that
particular problem will be fixed until after the New Year.

relax-and-enjoy-the-quiet-ly y'rs,
-Barry

[1] For those who care: there's a resource throttle in qrunner which
limits the number of files any single qrunner process will handle.
qrunner does a listdir() on the qfiles directory and ignores any .msg
file it finds (it only does the bulk of the processing on the
corresponding .db files).  But it performs the throttle check on every
file in listdir() so depending on the order that listdir() returns and
the number of files in the qfiles directory, the throttle check might
get triggered before any .db file is seen.  Wedge city.

This is serious enough to warrant a Mailman 2.0.1 release, probably
mid-next week.
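
[Editorial illustration] The failure mode in footnote [1] can be modelled in a few lines. This is only a sketch in modern Python 3, not qrunner's code and not the actual 2.0.1 fix: `MAX_FILES`, `process_buggy`, `process_fixed`, and the file names are all hypothetical, chosen to contrast counting every directory entry against the throttle with counting only the `.db` files that are really processed.

```python
# Hypothetical model of the throttle problem described in the footnote.
# None of these names come from Mailman.

MAX_FILES = 3  # assumed per-run throttle limit


def process_buggy(filenames, limit=MAX_FILES):
    """Count every directory entry against the throttle (the bug)."""
    handled = []
    seen = 0
    for name in filenames:
        seen += 1
        if seen > limit:
            break                # throttle trips on skipped files too
        if name.endswith('.msg'):
            continue             # .msg files are ignored
        handled.append(name)
    return handled


def process_fixed(filenames, limit=MAX_FILES):
    """Count only the .db files that are actually processed."""
    handled = []
    for name in filenames:
        if not name.endswith('.db'):
            continue
        if len(handled) >= limit:
            break
        handled.append(name)
    return handled


# A listing order in which all the .msg files happen to come first.
listing = ['a.msg', 'b.msg', 'c.msg', 'd.msg',
           'a.db', 'b.db', 'c.db', 'd.db']
print(process_buggy(listing))   # the throttle trips before any .db file
print(process_fixed(listing))   # up to `limit` .db files are handled
```

With this ordering the first function never reaches a `.db` file, which matches the "wedge" described above; the second makes progress on every run regardless of how the directory listing is ordered.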


From colin@pythontech.co.uk Sun Dec 31 13:25:40 2000
From: colin@pythontech.co.uk (Colin Hogben)
Date: Sun, 31 Dec 2000 13:25:40 +0000 (GMT)
Subject: [Mailman-Developers] Installing on virtual server - info offered
Message-ID: <14927.13268.779520.716464@gumby.pythontech.co.uk>

Hi,

I wanted to run Mailman on a virtual web hosting account, namely on a
Cobalt RaQ.  I didn't find any advice in the mailing list archives,
only a couple of other people asking how.  So I disregarded the "You
will need root access on the machine hosting your Mailman
installation" in the README, had a go and (modulo a few hoops jumped
through) appear to have succeeded.

As a contribution to the community I would like to write up some notes
on what I did, and make them available.  Can you suggest where and in
what form would be the best way to do that?  Maybe a wiki or faqomatic
where others can add their own input / caveats etc?

Regards,
--
Colin Hogben


From davek@mail.commercedata.com Sun Dec 31 18:32:08 2000
From: davek@mail.commercedata.com (Dave Klingler)
Date: Sun, 31 Dec 2000 11:32:08 -0700 (MST)
Subject: [Mailman-Developers] Installing on virtual server - info offered
Message-ID: <200012311832.LAA02970@mail.commercedata.com>

Hi folks.

I also appear to have solved most of the bugaboos, after trying about
five or six different approaches (whew!).  I'm still trying to get the
Redhat cron to run qrunner correctly, but I'd like to compare notes.

BTW, I would love it if there were some way to separate the CGI prefix
from the MTA and admin script prefix - this makes it tough to install on
virtual machines.  I spent the morning browsing the code to figure out
how it all works and eventually try to spot a way to do it.  It'd be
lovely and wonderful to just edit a mailman.conf file and have everything
work automagically; that way it'd be much easier to create new virtual
Mailman installations whenever one creates a new host without rebuilding
the package.  I realize that right now, anyway, Mailman isn't architected
that way, but one can dream...

Thanks to everyone for a neat package!

Happy New Year!

Dave Klingler

> I wanted to run Mailman on a virtual web hosting account, namely on a
> Cobalt RaQ.  I didn't find any advice in the mailing list archives,
> only a couple of other people asking how.  So I disregarded the "You
> will need root access on the machine hosting your Mailman
> installation" in the README, had a go and (modulo a few hoops jumped
> through) appear to have succeeded.
>
> As a contribution to the community I would like to write up some notes
> on what I did, and make them available.  Can you suggest where and in
> what form would be the best way to do that?  Maybe a wiki or faqomatic
> where others can add their own input / caveats etc?
>
> Regards,
> --
> Colin Hogben


From atrac@infomed.sld.cu Sat Dec 23 17:31:22 2000
From: atrac@infomed.sld.cu (Eduardo Bonzon)
Date: Sat, 23 Dec 2000 14:31:22 -0300
Subject: [Mailman-Developers] (no subject)
Message-ID: <000c01c06d06$2e963080$97a09ea9@eduardo>

This is a multi-part message in MIME format.

------=_NextPart_000_0009_01C06CED.06667200
Content-Type: text/plain;
 charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Dr.Eduardo Bonzon Castellanos
atrac@infomed.sld.cu

------=_NextPart_000_0009_01C06CED.06667200
Content-Type: text/html;
 charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Dr.Eduardo Bonzon Castellanos
atrac@infomed.sld.cu

------=_NextPart_000_0009_01C06CED.06667200--


From jam@jamux.com Fri Dec 1 14:13:29 2000
From: jam@jamux.com (John A. Martin)
Date: Fri, 01 Dec 2000 09:13:29 -0500
Subject: [Mailman-Developers] Monthly reminder Errors-To etc
Message-ID: <20001201141330.24A914800B@athene.jamux.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It must be my mistake because otherwise there would be a hue and cry
by now if all Mailmans (Mailmen? :-)) were doing the same.

My Mailman-1.1 sent monthly reminders thus

- -------------- cut here ---->8 ---< head
Return-Path:
Delivered-To: jam@jamux.com
Received: from milan.essential.org (milan.essential.org [216.0.124.12])
        by athene.jamux.com (Postfix) with ESMTP id 1AD3448031
        for ; Wed, 1 Nov 2000 05:01:39 -0500 (EST)
Received: from venice.essential.org (venice.essential.org [216.0.124.17])
        by milan.essential.org (8.9.3/8.9.3) with ESMTP id FAA06070
        for ; Wed, 1 Nov 2000 05:01:38 -0500
Received: from venice.essential.org (localhost [127.0.0.1])
        by venice.essential.org (Postfix) with ESMTP id DC46429B60
        for ; Wed, 1 Nov 2000 05:01:37 -0500 (EST)
From: mailman-owner@venice.essential.org
Subject: lists.essential.org mailing list memberships reminder
To: jam@essential.org
X-No-Archive: yes
Precedence: bulk
X-Mailman-Version: 1.1
Precedence: bulk
List-Id: Promoting new uses of agricultural residue...
Message-Id: <20001101100137.DC46429B60@venice.essential.org>
Date: Wed, 1 Nov 2000 05:01:37 -0500 (EST)
- ---- 8<------- cut here ----------> tail

the bogus "List-Id" being a known problem creating a minor annoyance.

Now my shiny new Mailman-2.0 sent them thus[1]

- -------------- cut here ---->8 ---< head
Return-Path:
Delivered-To: jam@jamux.com
Received: from milan.essential.org (milan.essential.org [216.0.124.12])
        by athene.jamux.com (Postfix) with ESMTP id 728B14800B
        for ; Fri, 1 Dec 2000 05:02:06 -0500 (EST)
Received: from venice.essential.org (venice.essential.org [216.0.124.17])
        by milan.essential.org (8.9.3/8.9.3) with ESMTP id FAA29820
        for ; Fri, 1 Dec 2000 05:02:05 -0500
Received: from venice.essential.org (localhost [127.0.0.1])
        by venice.essential.org (Postfix) with ESMTP id 9418729BA8
        for ; Fri, 1 Dec 2000 05:02:05 -0500 (EST)
Subject: lists.essential.org mailing list memberships reminder
From: mailman-owner@venice.essential.org
To: jam@essential.org
X-No-Archive: yes
X-Ack: no
Sender: ababa-admin@venice.essential.org
Errors-To: ababa-admin@venice.essential.org
X-BeenThere: ababa@lists.essential.org
Precedence: bulk
Message-Id: <20001201100205.9418729BA8@venice.essential.org>
Date: Fri, 1 Dec 2000 05:02:05 -0500 (EST)
- ---- 8<------- cut here ----------> tail

with the bogus envelope sender, "Errors-To" and header-sender
presumably causing to be inundated with delivery error notices.

Is this an error that has crept into my Mailman configuration, or is
this what Mailman will do until the item on the To Do list is
resolved?

I am making an unadvertised aaaaa-list hoping to shield from a repeat
performance.  Will this work?

ALSO, when the problem of the monthly reminders is worked upon I urge
again for a X-Addressed-To or some such header field to help
understand those delivery error notifications from AOL and many
universities that, correctly IMHO, attempt to reduce the exposure of
potentially private information by excluding the original message body
from delivery error notices.  Add recipient address munging and
without some counter measure such a delivery error notice is useless.

        jam

Footnotes:
[1]  To protect the innocent, I have replaced the name of the
     alphabetically first mailing list on this Mailman with a
     fictitious list name.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: OpenPGP encrypted mail preferred. See

iEYEARECAAYFAjonsgAACgkQUEvv1b/iXy8c0ACfZNqc1MiPgJdMkMoZM3fRRLM1
iZ0AniKuBuz+GswVatO8iz6vi9+eqx2n
=uD/d
-----END PGP SIGNATURE-----


From sigma@pair.com Thu Dec 7 11:59:54 2000
From: sigma@pair.com (sigma@pair.com)
Date: Thu, 7 Dec 2000 06:59:54 -0500 (EST)
Subject: [Mailman-Developers] Huge qrunner process
Message-ID: <20001207115954.1364.qmail@smx.pair.com>

Can anyone enlighten me about why the qrunner process might need a
tremendous amount of memory?  Running Mailman 2.0 on a FreeBSD 4.1.1-STABLE
server with Python 2.0.  There are about 1500 lists, but the qfiles
directory only has 32 files in it.  Nonetheless, each minute when qrunner
runs, it looks like this in top:

59606 mailman   -2  20   135M 74456K getblk   0:01  8.93%  3.52% python

135 MB?  It seems excessive.  Any insight would be appreciated :)

Thanks,
Kevin


From sigma@pair.com Thu Dec 7 17:51:49 2000
From: sigma@pair.com (sigma@pair.com)
Date: Thu, 7 Dec 2000 12:51:49 -0500 (EST)
Subject: [Mailman-Developers] Re: Huge qrunner process
Message-ID: <20001207175149.24828.qmail@smx.pair.com>

Following up on my own message... qrunner isn't the only culprit.
senddigests is worse, since it doesn't stop itself from running
indefinitely.  There is a definite memory leak or inefficiency somewhere.
Just the following fragment of code, edited from senddigests, is enough to
send the memory usage sky-high:

def main():
    for listname in Utils.list_names():
        mlist = MailList.MailList(listname, lock=0)
        del mlist

The "del mlist" doesn't help.  I've noticed that one pathological list has a
12MB config.db file.  If loading config.db is inefficient by a factor of
eleven, then that could explain the swelling to 135MB.  With that list
removed, the memory usage peak is just 89MB.
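
[Editorial illustration] One way to check whether an object created in a loop like the fragment above is really freed is to hold only a weak reference to it and see whether that reference goes dead. The sketch below is modern Python 3 and rests on assumptions: `FakeList` is a hypothetical stand-in for a large list object, not Mailman's `MailList`, and the `weakref` module is not available in the Python 2.0 used in this thread.

```python
import gc
import weakref


class FakeList:
    """Hypothetical stand-in for a large list object (not MailList)."""

    def __init__(self, name):
        self.name = name
        self.data = [0] * 1000


def load_and_check(names):
    """Create one object per name; report whether each one was freed."""
    freed = []
    for name in names:
        obj = FakeList(name)
        ref = weakref.ref(obj)   # a weak reference does not keep obj alive
        del obj
        gc.collect()             # also break any reference cycles
        freed.append(ref() is None)
    return freed


print(load_and_check(['list-a', 'list-b', 'list-c']))
```

If every entry is true, the objects themselves are being released and the growth would have to come from elsewhere, for example memory the allocator keeps after a large load; a false entry would point to something still holding a reference.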
Is there a way to tell if Python is deleting the MailList object when mlist gets reassigned, so I can find out if there is a leak each time a list is loaded? Otherwise it's just inefficient memory usage in proportion to the size of config.db Thanks, Kevin ----- Forwarded message from sigma@pair.com ----- >From mailman-developers-admin@python.org Thu Dec 07 12:00:16 2000 Delivered-To: sigma@smx.pair.com Delivered-To: sigma@pair.com Delivered-To: mailman-developers@new.python.org Delivered-To: mailman-developers@python.org Message-ID: <20001207115954.1364.qmail@smx.pair.com> From: sigma@pair.com To: mailman-developers@python.org X-Mailer: ELM [version 2.4ME+ PL40 (25)] Subject: [Mailman-Developers] Huge qrunner process Sender: mailman-developers-admin@python.org Errors-To: mailman-developers-admin@python.org X-BeenThere: mailman-developers@python.org X-Mailman-Version: 2.0 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Mailman mailing list developers List-Unsubscribe: , List-Archive: Date: Thu, 7 Dec 2000 06:59:54 -0500 (EST) Can anyone enlighten me about why the qrunner process might need a tremendous amount of memory? Running Mailman 2.0 on a FreeBSD 4.1.1-STABLE server with Python 2.0. There are about 1500 lists, but the qfiles directory only has 32 files in it. Nonetheless, each minute when qrunner runs, it looks like this in top: 59606 mailman -2 20 135M 74456K getblk 0:01 8.93% 3.52% python 135 MB? It seems excessive. Any insight would be appreciated :) Thanks, Kevin _______________________________________________ Mailman-Developers mailing list Mailman-Developers@python.org http://www.python.org/mailman/listinfo/mailman-developers ----- End of forwarded message from sigma@pair.com ----- From barry@digicool.com Thu Dec 7 19:17:26 2000 From: barry@digicool.com (Barry A. 
Warsaw) Date: Thu, 7 Dec 2000 14:17:26 -0500 Subject: [Mailman-Developers] New checkins coming Message-ID: <14895.57926.570880.976401@anthem.concentric.net> I'm going to start checking in a bunch of new changes, first to clean up some of the code by using Python 2.0 features (most notably string methods), and then by integrating Juan Carlos's i18n patches. This means that if you plan on using the cvs snapshot you will have to be running Python 2.0. Second, there are no guarantees of stability for a little while until the i18n changes stabilize. For now, stick with Mailman 2.0. If necessary we'll create a cvs branch for 2.0.1, but I only want to do that if we have to. -Barry From ckolar@admin.aurora.edu Thu Dec 7 22:21:20 2000 From: ckolar@admin.aurora.edu (Christopher Kolar) Date: Thu, 07 Dec 2000 16:21:20 -0600 Subject: [Mailman-Developers] msg_header: insertion of sender name/address Message-ID: <5.0.1.4.2.20001207161856.03b5e630@admin.aurora.edu> I had to read it twice, but this is not another question about whether user names can be merged into a message for custom stuff. Any idea if there is a way to grab the Sender/From lines and put them into the message body? I think that the problem is that the mail systems that she is using munge up the display of senders. Cheers. --chris >From: Patrick_Healy@nywd.uscourts.gov >X-Mailer: ccMail Link to SMTP R8.30.00.7 >Date: Thu, 07 Dec 2000 16:39:40 -0500 >To: >Subject: Mailman > > >Mr. Kolar, > >Sorry to bother you with this question, but I've not received any responses to >my question on the list. I really appreciate any help you could give me >on this >problem. > >In a nutshell - The federal judiciary uses Cc:Mail as it's enterprise email >system (soon to be replaced by Notes). The Mailman lists that I've >created and >administer work great to those clients. However, they have no indication >in the >body of the message of who sent the message. 
> >This means that unless the sender remembers to sign there note, it essentially >becomes anonymous. > >Is there an environment variable that I can insert into the custom header that >will cause the sender's name to be displayed? > >Thanks! > >Pat Healy >U.S. District Court, Western District of New York From Dan Mick Thu Dec 7 23:07:36 2000 From: Dan Mick (Dan Mick) Date: Thu, 7 Dec 2000 15:07:36 -0800 (PST) Subject: [Mailman-Developers] msg_header: insertion of sender name/address Message-ID: <200012072306.PAA04841@utopia.west.sun.com> Well, it looks like the header and footer get the entire mlist's dictionary, plus mm_cfg.CGIEXT, to interpolate with, but nothing per-message. I suppose the right answer would be to change Decorate.py (or add a new module parallel to it) to add in selected fields from the message dictionary (assuming there are no name clashes between the mlist dictionary and the message dictionary, anyway). d = Utils.SafeDict(mlist.__dict__) d['cgiext'] = mm_cfg.CGIEXT # interpolate into the header try: header = string.replace(mlist.msg_header % d, '\r\n', '\n') I don't see a way it can be done without hackery. > I had to read it twice, but this is not another question about whether user > names can be merged into a message for custom stuff. Any idea if there is > a way to grab the Sender/From lines and put them into the message body? I > think that the problem is that the mail systems that she is using munge up > the display of senders. > > Cheers. > > --chris > > > >From: Patrick_Healy@nywd.uscourts.gov > >X-Mailer: ccMail Link to SMTP R8.30.00.7 > >Date: Thu, 07 Dec 2000 16:39:40 -0500 > >To: > >Subject: Mailman > > > > > >Mr. Kolar, > > > >Sorry to bother you with this question, but I've not received any responses to > >my question on the list. I really appreciate any help you could give me > >on this > >problem. > > > >In a nutshell - The federal judiciary uses Cc:Mail as it's enterprise email > >system (soon to be replaced by Notes). 
The Mailman lists that I've
> >created and
> >administer work great to those clients. However, they have no indication
> >in the
> >body of the message of who sent the message.
> >
> >This means that unless the sender remembers to sign there note, it essentially
> >becomes anonymous.
> >
> >Is there an environment variable that I can insert into the custom header that
> >will cause the sender's name to be displayed?
> >
> >Thanks!
> >
> >Pat Healy
> >U.S. District Court, Western District of New York
> >
> _______________________________________________
> Mailman-Developers mailing list
> Mailman-Developers@python.org
> http://www.python.org/mailman/listinfo/mailman-developers

From claw@kanga.nu Thu Dec 7 23:11:24 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 07 Dec 2000 15:11:24 -0800
Subject: [Mailman-Developers] msg_header: insertion of sender name/address
In-Reply-To: Message from Dan Mick of "Thu, 07 Dec 2000 15:07:36 PST." <200012072306.PAA04841@utopia.west.sun.com>
References: <200012072306.PAA04841@utopia.west.sun.com>
Message-ID: <22777.976230684@kanga.nu>

On Thu, 7 Dec 2000 15:07:36 -0800 (PST) Dan Mick wrote:

> Well, it looks like the header and footer get the entire mlist's
> dictionary, plus mm_cfg.CGIEXT, to interpolate with, but nothing
> per-message. I suppose the right answer would be to change
> Decorate.py (or add a new module parallel to it) to add in
> selected fields from the message dictionary (assuming there are no
> name clashes between the mlist dictionary and the message
> dictionary, anyway).

Simpler would be to put a procmail/maildrop/whatever script in front of Mailman that copies the From: header into the body of the message. The handling of MIME problems of course is a different matter.
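
[Illustrative sketch, not part of the original message or of Mailman.] The pre-filter idea above could look roughly like this in current Python (3.6+), using the standard email package, which postdates this thread. The function name prepend_sender is made up for the example, and multipart (MIME) messages are deliberately passed through untouched, since handling them is exactly the "different matter" mentioned above.

```python
import email
import email.policy


def prepend_sender(raw_message: str) -> str:
    """Return the message with a 'From: ...' line added at the top of the body.

    Hypothetical helper: only single-part text/plain messages are changed;
    anything else (including multipart MIME) is returned unmodified.
    """
    msg = email.message_from_string(raw_message, policy=email.policy.default)
    sender = msg.get("From")
    if sender is None or msg.is_multipart():
        return raw_message
    if msg.get_content_type() != "text/plain":
        return raw_message
    body = msg.get_content()
    # set_content() replaces the old payload and its Content-* headers.
    msg.set_content("From: %s\n\n%s" % (sender, body))
    return msg.as_string()
```

A delivery filter would read the message from standard input, pass it through prepend_sender(), and hand the result to the list's posting command.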
--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From marc_news@valinux.com Fri Dec 8 00:22:34 2000
From: marc_news@valinux.com (Marc MERLIN)
Date: Thu, 7 Dec 2000 16:22:34 -0800
Subject: [Mailman-Developers] about qrunner and locking
Message-ID: <20001207162234.D25463@marc.merlins.org>

Summary, since the mail is long:
I'm trying to find out why qrunner needs to lock a list before delivering a message. I'm getting corruption over NFS under high, concurrent load (I'm setting up and testing the new Sourceforge list servers).
The basic question of this email is: can I have qrunner ship emails without modifying the lists' config.db?

Longer version with explanations and details:

So, I set up that NFS-shared mailman tree I was talking about a little while ago.
As a reminder:
/var/local/mailman is NFS exported
/var/local/mailman/qfiles is symlinked to ../mailman.local.

I then applied the following two patches to mailman:

--- mailman-2.0.orig/cron/qrunner Mon Sep 18 14:28:42 2000
+++ mailman-2.0/cron/qrunner Wed Dec 6 14:02:28 2000
@@ -96,7 +96,7 @@
 import signal
 signal.signal(signal.SIGCHLD, signal.SIG_DFL)
-QRUNNER_LOCK_FILE = os.path.join(mm_cfg.LOCK_DIR, 'qrunner.lock')
+QRUNNER_LOCK_FILE = os.path.join(mm_cfg.QUEUE_DIR, 'qrunner.lock')
 LogStdErr('error', 'qrunner', manual_reprime=0, tee_to_stdout=0)

--- mailman-2.0.orig/Mailman/Logging/StampedLogger.py Mon Mar 20 22:25:58 2000
+++ mailman-2.0/Mailman/Logging/StampedLogger.py Wed Dec 6 16:20:03 2000
@@ -16,6 +16,7 @@
 import os
 import time
+import socket
 from Logger import Logger
 class StampedLogger(Logger):
@@ -66,7 +67,9 @@
                 label = "(%d)" % os.getpid()
             else:
                 label = "%s(%d):" % (self.__label, os.getpid())
-            prefix = stamp + label
+            hostname = socket.gethostname() + " "
+            prefix = stamp + hostname + label
         Logger.write(self, "%s %s" % (prefix, msg))
         if msg and msg[-1] == '\n':
             self.__bol = 1

The plan here is to have two mailing
list servers sharing the same list configs, but running two different queues to avoid the mailman -> exim bottleneck (I have 2 machines with 2 CPUs, and I don't want 3 CPUs idle because a single qrunner is holding a global lock. Sure, it is rather fast with exim, but since I have two machines (required for failover), I don't really want one sitting idle, and want to do load balancing too) To stress test everything, I sent 1000 local messages on each machine at the same time, and while they were able to get their own qrunner locks, they had to fight for the list lock (all messages were to the same list, which only had one user). Well, it was fast: 15:12:10 Start of injection of 1000 Emails on usw-sf-list1 and usw-sf-list2 15:14:06 1000 Emails accepted and queued on usw-sf-list1 15:14:11 1000 Emails accepted and queued on usw-sf-list2 15:19:07 usw-sf-list1's mailman shipped all the mails 15:19:37 usw-sf-list2's mailman shipped all the mails Note that usw-sf-list2 was doing this over NFS, and was only marginally slower considering this. As expected, there was some corruption in the shared log files (both logs/smtp and logs/post are missing about 10% of the lines they should have). Now, what's less fun is this in my error logs: Dec 07 15:14:39 2000 usw-sf-list2 (17221) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last Dec 07 15:14:40 2000 usw-sf-list2 (17285) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last Dec 07 15:14:40 2000 usw-sf-list2 (17282) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last Dec 07 15:14:40 2000 usw-sf-list2 post(17285): Traceback (innermost last): usw-sf-list2 post(17285): File "/var/local/mailman/scripts/post", line 94, in ? 
usw-sf-list2 post(17285): main() usw-sf-list2 post(17285): File "/var/local/mailman/scripts/post", line 73, in main usw-sf-list2 post(17285): mlist = MailList.MailList(listname, lock=0) usw-sf-list2 post(17285): File "/var/local/mailman/Mailman/MailList.py", line 79, in __init__ Dec 07 15:14:40 2000 usw-sf-list2 (17291) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last usw-sf-list2 post(17285): self.Load() usw-sf-list2 post(17285): File "/var/local/mailman/Mailman/MailList.py", line 908, in Load usw-sf-list2 post(17285): shutil.copy(lastfile, dbfile) usw-sf-list2 post(17285): File "/usr/lib/python1.5/shutil.py", line 52, in copy usw-sf-list2 post(17285): copyfile(src, dst) usw-sf-list2 post(17285): File "/usr/lib/python1.5/shutil.py", line 18, in copyfile usw-sf-list2 post(17285): fdst = open(dst, 'wb') usw-sf-list2 post(17285): IOError : [Errno 116] Stale NFS file handle: '/var/local/mailman/lists/test/config.db' Dec 07 15:14:41 2000 usw-sf-list2 (17349) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last Dec 07 15:14:50 2000 usw-sf-list2 (17905) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last Dec 07 15:14:50 2000 usw-sf-list2 (17913) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last Dec 07 15:14:50 2000 usw-sf-list2 post(17905): Traceback (innermost last): usw-sf-list2 post(17905): File "/var/local/mailman/scripts/post", line 94, in ? 
usw-sf-list2 post(17905): main()
usw-sf-list2 post(17905): File "/var/local/mailman/scripts/post", line 73, in main
usw-sf-list2 post(17905): mlist = MailList.MailList(listname, lock=0)
usw-sf-list2 post(17905): File "/var/local/mailman/Mailman/MailList.py", line 79, in __init__
usw-sf-list2 post(17905): self.Load()
usw-sf-list2 post(17905): File "/var/local/mailman/Mailman/MailList.py", line 908, in Load
usw-sf-list2 post(17905): shutil.copy(lastfile, dbfile)
usw-sf-list2 post(17905): File "/usr/lib/python1.5/shutil.py", line 52, in copy
usw-sf-list2 post(17905): copyfile(src, dst)
usw-sf-list2 post(17905): File "/usr/lib/python1.5/shutil.py", line 18, in copyfile
usw-sf-list2 post(17905): fdst = open(dst, 'wb')
usw-sf-list2 post(17905): IOError : [Errno 116] Stale NFS file handle: '/var/local/mailman/lists/test/config.db'
Dec 07 15:14:50 2000 usw-sf-list2 (17922) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last
Dec 07 15:15:44 2000 usw-sf-list2 (21410) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last
Dec 07 15:15:44 2000 usw-sf-list2 (21419) test db file was corrupt, using fallback: /var/local/mailman/lists/test/config.db.last

and the fact that 3 mails (out of 2000) didn't make it to my mailbox.

I think I can live with the occasional log corruption (I can also lock the log files before writing to them), but of course, mail loss is not as good.

I've looked at the qrunner code a bit, and I'm trying to understand why it needs a lock on the list's config.db.
I suppose NFS is to blame for this, and somehow, even though both machines lock the test list, the locking is somehow not NFS safe (I thought it would be though).

But then comes the question: why does qrunner have to modify the list's config.db when it ships a message?
I suppose the relevant piece of code in qrunner is:

    try:
        keepqueued = dispose_message(mlist, msg, msgdata)
        # Did the delivery generate child processes?
        # Don't store them in
        # the message data files.
        kids = msgdata.get('_kids')
        if kids:
            allkids.update(kids)
            del msgdata['_kids']
        if not keepqueued:
            # We're done with this message
            dequeue(root)

but I have to admit to not understanding what it does.

Is there any way to have qrunner send messages without modifying the list config and thus without having to lock the list either?

Thanks
Marc
--
Microsoft is to operating systems & security ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key

From marc_news@valinux.com Fri Dec 8 01:46:26 2000
From: marc_news@valinux.com (Marc MERLIN)
Date: Thu, 7 Dec 2000 17:46:26 -0800
Subject: [Mailman-Developers] about qrunner and locking
In-Reply-To: <20001207162234.D25463@marc.merlins.org>; from marc_news@valinux.com on Thu, Dec 07, 2000 at 04:22:34PM -0800
References: <20001207162234.D25463@marc.merlins.org>
Message-ID: <20001207174626.H25463@marc.merlins.org>

On Thu, Dec 07, 2000 at 04:22:34PM -0800, Marc MERLIN wrote:
> and the fact that 3 mails (out of 2000) didn't make it to my mailbox.

I just found out that the 3 mails in question weren't totally lost, they bounced:

One like this:
test@lists.sourceforge.net:
unknown local-part "test" in domain "lists.sourceforge.net"

(Err, what?
Oh, I get it, I use the exim stat config.db to look for a list config, and exim happened to stat for the file exactly when mailman deleted config.db and replaced it with config.db.last I guess.
Can mailman ensure that config.db is always here? (I suppose mv isn't really atomic, is it?)
and two like this:
test@lists.sourceforge.net:
Child process of list_transport transport returned 1 from command:
/var/local/mailman/mail/wrapper

Those were on the second server (doing NFS mounting) and those two match the two python traces I had in my error log, see my previous message
(IOError : [Errno 116] Stale NFS file handle: '/var/local/mailman/lists/test/config.db')

I supposed that happened while config.db was being moved...
It still seems like a bug though

Marc
--
Microsoft is to operating systems & security ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key

From chuqui@plaidworks.com Fri Dec 8 06:29:55 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 7 Dec 2000 22:29:55 -0800
Subject: [Mailman-Developers] glitch in 2.0 check_perms
Message-ID:

Just found a glitch in 2.0 check_perms. I was getting a failure on one of my lists when trying to process an admin request, because of a permission problem. I found that the listname/request.db file was mode 644 instead of 664, so that anything trying to open it for writing failed.

I don't know how it got that way; none of the other lists have had this problem. But when I found the problem, I ran check_perms, and it reported no problem, which is clearly wrong...

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From linuxcurry@yahoo.com Fri Dec 8 08:20:26 2000
From: linuxcurry@yahoo.com (Rajat)
Date: Fri, 8 Dec 2000 00:20:26 -0800 (PST)
Subject: [Mailman-Developers] giving mailman egroups like functionality?
Message-ID: <20001208082026.36476.qmail@web9608.mail.yahoo.com>

hello friends,

am new to this list ... but have been using mailman for quite some time. am also aware of how to program in python .... but am no big expert in python!

one of my implementation projects requires what egroups gives ....
Now mailman is an excellent tool for mailing list management with an excellent web-based GUI .... what i want is how to:

1.) provide a web page through which users can log in (that i have taken care of - assume that users are already in my ldap database and they get into their user area, say like a mailbox) and can create their own mailing lists, rather than having someone make them from a shell!

2.) then every time they log in, they see which lists they are moderators for, plus those lists which they have subscribed to.

3.) when they click on a list they are moderator for, it should give the link to http://hostname/mailman/admin/listname

4.) and when they click the link to a list they are subscribed to, they go to the link http://hostname/mailman/listinfo/listname

5.) also, how can i execute the commands present in my ~Mailman/bin directory .. eg: list_lists

i would really appreciate it if you could help me out in doing this ... plus, if you think these functions are not required in mailman or do not conform to mailman standards, then please at least help me out with how to go about it .. either,

a) changing the python code of mailman
b) any way i can achieve this by executing any existing binary commands which come with Mailman, through perl or PHP.

The requirements are on an urgent basis ... so would appreciate all your help

Thnx
Regards
Rajat

P.S. : i have been told that s/w like lyris list manager does this .. but am a hardcore mailman fan .. and i would never like to use any other mailing list manager :)

__________________________________________________
Do You Yahoo!?
Yahoo! Shopping - Thousands of Stores. Millions of Products.
http://shopping.yahoo.com/ From thomas@xs4all.net Fri Dec 8 08:36:11 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 8 Dec 2000 09:36:11 +0100 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: <20001207174626.H25463@marc.merlins.org>; from marc_news@valinux.com on Thu, Dec 07, 2000 at 05:46:26PM -0800 References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> Message-ID: <20001208093611.E4396@xs4all.nl> On Thu, Dec 07, 2000 at 05:46:26PM -0800, Marc MERLIN wrote: > On Thu, Dec 07, 2000 at 04:22:34PM -0800, Marc MERLIN wrote: > > and the fact that 3 mails (out of 2000) didn't make it to my mailbox. > Oh, I get it, I use the exim stat config.db to look for a list config, and > exim happened to stat for the file exactly when mailman deleted config.db > and replaced it with config.db.last I guess. > Can mailman ensure that config.db is always here? (I suppose mv isn't really > atomic, is it?) Over NFS, almost nothing is atomic. And even if you do grab an atomic operation, you can get .... > (IOError : [Errno 116] Stale NFS file handle: > '/var/local/mailman/lists/test/config.db') I'm afraid that there isn't a good solution to your problem, right now. In all honesty, and I say this with all my professional years of experience in this area, NFS sucks large granite elephant testicles through a very thin straw. (To butcher a Pratchett quote, "NFS is like a vampire; it bites, it sucks, and it leaves you lifeless") Which is really a pity, since Barry did a lot of work to get NFS locking right. But that isn't going to help much unless the other parts of Mailman can also handle this properly as well. And replacing a file by another is a very tricky operation, over NFS. It's entirely OS-dependant (or rather NFS-implementation-dependant) what will lead to a stale NFS handle. Some OSes do it if a file is moved or deleted (or replaced) and another process still has the file open. 
Some do it when they have the file cached. Some do it when the moon's full, or at least something I haven't been able to figure out :)

I was able to get NFS-locking to work properly, but I was only locking on one machine, because I didn't have a machine to spare to run the web-interface. When you work with different machines, you can get all kinds of problems with attribute-caching and data-caching and what not. Not to mention OS bugs, which pop up especially under heavy load. I've seen quite a few of those, too.

At work here, we're really hoping NFSv4 will be universally adopted, though it'll probably bankrupt our supplier of black goat's blood and headless chickens. (But hey, there's still SCSI!)

Probably the best solution is to write a network daemon to do the locking & config info, rather than rely on NFS and locking over NFS... I'm not sure how much work that is, though. Actually, maybe an even better solution, and probably about as much work, is to put all the mailman config stuff into a separate database, and just allow connections from several points.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From thomas@xs4all.net Fri Dec 8 08:49:35 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 8 Dec 2000 09:49:35 +0100
Subject: [Mailman-Developers] Re: Huge qrunner process
In-Reply-To: <20001207175149.24828.qmail@smx.pair.com>; from sigma@pair.com on Thu, Dec 07, 2000 at 12:51:49PM -0500
References: <20001207175149.24828.qmail@smx.pair.com>
Message-ID: <20001208094935.F4396@xs4all.nl>

On Thu, Dec 07, 2000 at 12:51:49PM -0500, sigma@pair.com wrote:

> There is a definite memory leak or inefficiency somewhere. Just the following
> fragment of code, edited from senddigests, is enough to send the memory usage
> sky-high:
>
> def main():
>     for listname in Utils.list_names():
>         mlist = MailList.MailList(listname, lock=0)
>         del mlist
>
> The "del mlist" doesn't help.
> Is there a way to tell if Python is deleting the MailList object when mlist > gets reassigned, so I can find out if there is a leak each time a list is > loaded? Otherwise it's just inefficient memory usage in proportion to the > size of config.db Python uses reference-counting, so the mailinglist should go away as soon as all references to it go away. However: python -i bin/withlist mailman-devel-test >>> import sys >>> sys.getrefcount(m) 12 There are 12 references to the mailinglist object. One is the argument passed to 'getrefcount', one is the local variable 'm', but the other 10 are unaccounted for. I think it's safe to say there's a reference cycle in there somewhere ;) The easiest way to fix this is probably to install Python 2.0 with the garbage collector. It's a new feature, which tries to collect as much cyclic garbage as possible. If anything, it can help figure out where those cycles exist. Barry ? Would it be a good idea, in the mean time, to explicitly break the cycle in some way, say 'mlist._release()' or some such, document it as internal, and use it wisely in senddigests/qrunner ? That would require finding the cycles, of course ;P -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From sigma@pair.com Fri Dec 8 11:16:10 2000 From: sigma@pair.com (sigma@pair.com) Date: Fri, 8 Dec 2000 06:16:10 -0500 (EST) Subject: [Mailman-Developers] Re: Huge qrunner process In-Reply-To: <20001208094935.F4396@xs4all.nl> from Thomas Wouters at "Dec 8, 0 09:49:35 am" Message-ID: <20001208111610.4097.qmail@smx.pair.com> > Python uses reference-counting, so the mailinglist should go away as soon as > all references to it go away. However: > > python -i bin/withlist mailman-devel-test > >>> import sys > >>> sys.getrefcount(m) > 12 I suspected as much :( > There are 12 references to the mailinglist object. One is the argument > passed to 'getrefcount', one is the local variable 'm', but the other 10 are > unaccounted for. 
I think it's safe to say there's a reference cycle in there > somewhere ;) The easiest way to fix this is probably to install Python 2.0 > with the garbage collector. It's a new feature, which tries to collect as > much cyclic garbage as possible. If anything, it can help figure out where > those cycles exist. We are already running Python 2.0 on this machine :( I suppose I could litter the code with debugging statements and see how the reference count goes up. I don't see any kind of destroy-no-matter-what function for objects. Thanks, Kevin From thomas@xs4all.net Fri Dec 8 12:03:37 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 8 Dec 2000 13:03:37 +0100 Subject: [Mailman-Developers] Re: Huge qrunner process In-Reply-To: <20001208111610.4097.qmail@smx.pair.com>; from sigma@pair.com on Fri, Dec 08, 2000 at 06:16:10AM -0500 References: <20001208094935.F4396@xs4all.nl> <20001208111610.4097.qmail@smx.pair.com> Message-ID: <20001208130337.D15654@xs4all.nl> On Fri, Dec 08, 2000 at 06:16:10AM -0500, sigma@pair.com wrote: > I don't see any kind of destroy-no-matter-what function for objects. There isn't. There should be no reason to have it, object disappear when their references go away, and you shouldn't want to destroy one that is still being referenced -- it would lead to nasty crashes. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From sigma@pair.com Fri Dec 8 12:06:10 2000 From: sigma@pair.com (sigma@pair.com) Date: Fri, 8 Dec 2000 07:06:10 -0500 (EST) Subject: [Mailman-Developers] Re: Huge qrunner process In-Reply-To: <20001208130337.D15654@xs4all.nl> from Thomas Wouters at "Dec 8, 0 01:03:37 pm" Message-ID: <20001208120610.5715.qmail@smx.pair.com> Except in the case where you're certain that the object is really unreferenced, like the circular reference case. Perhaps as a quick hack, I could rewrite qrunner and/or senddigests to launch a new script for each list in the loop. 
That would workaround the memory problem. Kevin > On Fri, Dec 08, 2000 at 06:16:10AM -0500, sigma@pair.com wrote: > > > I don't see any kind of destroy-no-matter-what function for objects. > > There isn't. There should be no reason to have it, object disappear when > their references go away, and you shouldn't want to destroy one that is > still being referenced -- it would lead to nasty crashes. > > -- > Thomas Wouters > > Hi! I'm a .signature virus! copy me into your .signature file to help me spread! > From thomas@xs4all.net Fri Dec 8 12:10:58 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 8 Dec 2000 13:10:58 +0100 Subject: [Mailman-Developers] Re: Huge qrunner process In-Reply-To: <20001208120610.5715.qmail@smx.pair.com>; from sigma@pair.com on Fri, Dec 08, 2000 at 07:06:10AM -0500 References: <20001208130337.D15654@xs4all.nl> <20001208120610.5715.qmail@smx.pair.com> Message-ID: <20001208131058.E15654@xs4all.nl> On Fri, Dec 08, 2000 at 07:06:10AM -0500, sigma@pair.com wrote: > Except in the case where you're certain that the object is really > unreferenced, like the circular reference case. No, it *is* referenced. That it's referenced by yourself, or by an object that you yourself reference, or how long the circle is, is unimportant. Just deallocating it could still lead to crashes. > Perhaps as a quick hack, I could rewrite qrunner and/or senddigests to > launch a new script for each list in the loop. That would workaround the > memory problem. Probably, yes. It would require spawning a new python interpreter each time though :P -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From barry@digicool.com Fri Dec 8 15:29:24 2000 From: barry@digicool.com (Barry A. 
Warsaw) Date: Fri, 8 Dec 2000 10:29:24 -0500 Subject: [Mailman-Developers] about qrunner and locking References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> Message-ID: <14896.65108.807206.488209@anthem.concentric.net> >>>>> "TW" == Thomas Wouters writes: TW> I'm afraid that there isn't a good solution to your problem, TW> right now. In all honesty, and I say this with all my TW> professional years of experience in this area, NFS sucks large TW> granite elephant testicles through a very thin straw. (To TW> butcher a Pratchett quote, "NFS is like a vampire; it bites, TW> it sucks, and it leaves you lifeless") TW> Which is really a pity, since Barry did a lot of work to get TW> NFS locking right. Yeah, sigh. I've been playing a lot with zodb/zeo lately, and I think this is the right long term solution to move to. Background for those who don't know: zodb is the Zope Object Database, ZEO is Zope Enterprise Objects. Think of them this way: zodb is a framework for making Python objects transparently persistent. It's an object database for Python programs with some very nice features, like transaction support, pluggable backend storages, mountable databases, etc. zeo is a client/server storage that enables things like replication, multiple processes, and more. It's all open source and very cool stuff. It's got support of a fairly large community, and will likely be supported by folks at Pythonlabs. We've talked about all this stuff before, but the question now is: is it better to jump in sooner rather than later? There are some other short term gains we can make by splitting the qfiles into three queues: incoming, outgoing, and bounces. We've talked about that before too. The advantage is that a list's database does not need to be touched when a message is in outgoing -- except to handle smtp errors, but we can hack around that I think. 
So the idea is that a message is received by Mailman and goes into incoming. It flows through the pipeline to get prepared for delivery, and then goes into outgoing. From there qrunner (probably a separate outgoing-qrunner) simply moves messages from outgoing to the smtpd.

We'd have to handle collisions for multiple qrunner processes, potentially on separate machines. One way that doesn't involve locking shenanigans is to divide the hash space up and assign a segment to each out-qrunner process. Since the messages get hashed into that space completely randomly, there should be decent coverage by multiple out-qrunners. You could modify the segments based on relative performance, nfs delays, and other factors.

-Barry

From barry@digicool.com Fri Dec 8 15:38:12 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 8 Dec 2000 10:38:12 -0500
Subject: [Mailman-Developers] about qrunner and locking
References: <20001207162234.D25463@marc.merlins.org>
Message-ID: <14897.100.883156.91474@anthem.concentric.net>

MM> But then comes the question: why does qrunner have to modify
MM> the list's config.db when it ships a message? I suppose the
MM> relevant piece of code in qrunner is:

| try:
|     keepqueued = dispose_message(mlist, msg, msgdata)
|     # Did the delivery generate child processes? Don't store them in
|     # the message data files.
|     kids = msgdata.get('_kids')
|     if kids:
|         allkids.update(kids)
|         del msgdata['_kids']
|     if not keepqueued:
|         # We're done with this message
|         dequeue(root)

MM> but I have to admit to not understanding what it does.

This isn't directly related to your problem, but some pipeline modules can create subprocesses, although the only one that does this currently is ToUsenet.py. This code makes sure that all those children are waited on so they don't zombie.

What /really/ ought to happen is that there is a separate queue for usenet postings since once the message is prepared for usenet, it doesn't need to touch the list database again.
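
[Illustrative sketch, not Mailman code.] The hash-space split proposed earlier in this exchange for multiple out-qrunners could look roughly like the following in current Python. The names segment_for and owns, the choice of SHA-1 over a queue entry's base name, and the equal-sized segments are all assumptions made for the example.

```python
import hashlib


def segment_for(filebase: str, num_runners: int) -> int:
    """Map a queue entry's base name to the index of the runner that owns it."""
    digest = hashlib.sha1(filebase.encode("utf-8")).digest()
    # Treat the 20-byte digest as one integer in the range [0, 2**160).
    position = int.from_bytes(digest, "big")
    # Split that range into num_runners equal, contiguous segments.
    return position * num_runners // (1 << 160)


def owns(filebase: str, runner_index: int, num_runners: int) -> bool:
    """True if this runner should deliver the given queue entry."""
    return segment_for(filebase, num_runners) == runner_index
```

Each out-qrunner would scan the shared outgoing directory and skip every entry for which owns() is false, so no two runners pick up the same message and no lock is needed. Tuning for relative performance or NFS delays would mean replacing the equal split with per-runner segment boundaries.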
-Barry From barry@digicool.com Fri Dec 8 15:39:32 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 10:39:32 -0500 Subject: [Mailman-Developers] glitch in 2.0 check_perms References: Message-ID: <14897.180.320303.525084@anthem.concentric.net> >>>>> "CVR" == Chuq Von Rospach writes: CVR> Justfound a glitch in 2.0 check_perms. I was getting a CVR> failure on one of my lists when trying to process an admin CVR> request, because of a permission problem. I found that the CVR> listname/request.db file was mode 644 instead of 664, so that CVR> anything trying to open it for writing failed. CVR> I don't know how it got that way, none of the other lists CVR> have had this problem. But when I found the problem, I ran CVR> check_perms, and it reported no problem, which is clearly CVR> wrong... I found the same thing recently when I was moving the python.org lists. I'll fix check_perms to look at request.db. -Barry From barry@digicool.com Fri Dec 8 16:38:31 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 11:38:31 -0500 Subject: [Mailman-Developers] Re: Huge qrunner process References: <20001207175149.24828.qmail@smx.pair.com> <20001208094935.F4396@xs4all.nl> Message-ID: <14897.3719.961623.8009@anthem.concentric.net> >>>>> "TW" == Thomas Wouters writes: TW> Barry ? Would it be a good idea, in the mean time, to TW> explicitly break the cycle in some way, say 'mlist._release()' TW> or some such, document it as internal, and use it wisely in TW> senddigests/qrunner ? That would require finding the cycles, TW> of course ;P Which isn't something I want to spend a lot of time on. One of the beauties of zodb is that it tracks usage of objects, moving them from memory out into disk storage when they're unreferenced (depending on various tuning parameters). I think using zodb/zeo here would help a lot, or at least make it manageable. -Barry From barry@digicool.com Fri Dec 8 16:40:29 2000 From: barry@digicool.com (Barry A. 
Warsaw)
Date: Fri, 8 Dec 2000 11:40:29 -0500
Subject: [Mailman-Developers] Re: Huge qrunner process
References: <20001208094935.F4396@xs4all.nl> <20001208111610.4097.qmail@smx.pair.com> <20001208130337.D15654@xs4all.nl>
Message-ID: <14897.3837.702831.163643@anthem.concentric.net>

One other thing to notice about qrunner: it keeps a cache of MailList objects referenced by name (see _listcache). It does this to avoid the overhead of having to reinstantiate the MailList object each time it finds a message destined for a particular list.

You could try to redefine open_list() so that it doesn't cache the objects. I don't know how much that'll help, but it's worth a try.

-Barry

From claw@kanga.nu Fri Dec 8 17:26:13 2000
From: claw@kanga.nu (J C Lawrence)
Date: Fri, 08 Dec 2000 09:26:13 -0800
Subject: [Mailman-Developers] about qrunner and locking
In-Reply-To: Message from Thomas Wouters of "Fri, 08 Dec 2000 09:36:11 +0100." <20001208093611.E4396@xs4all.nl>
References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl>
Message-ID: <23337.976296373@kanga.nu>

On Fri, 8 Dec 2000 09:36:11 +0100 Thomas Wouters wrote:

> Over NFS, almost nothing is atomic. And even if you do grab an
> atomic operation, you can get ....

The only operation which I've found guaranteed atomic across all NFS implementations is creat().

> I'm afraid that there isn't a good solution to your problem, right
> now. In all honesty, and I say this with all my professional years
> of experience in this area, NFS sucks large granite elephant
> testicles through a very thin straw. (To butcher a Pratchett
> quote, "NFS is like a vampire; it bites, it sucks, and it leaves
> you lifeless")

You're beginning to sound a bit like me, only you're more polite.
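
[Illustrative sketch, not Mailman's LockFile code.] On the atomicity point, a related technique that is widely documented for lock files on NFS (for instance in the Linux open(2) manual page) is to create a uniquely named file and hard-link it to the lock name, treating the lock as held only if the unique file's link count becomes 2. A rough version in current Python follows; try_lock, unlock, and the temp-file naming scheme are invented for the example.

```python
import os
import socket


def try_lock(lockfile: str) -> bool:
    """Try once to take the lock; return True on success."""
    # A name that no other host or process should pick.
    unique = "%s.%s.%d" % (lockfile, socket.gethostname(), os.getpid())
    open(unique, "w").close()
    try:
        try:
            os.link(unique, lockfile)
        except OSError:
            # Over NFS, link() can report failure even though it took effect
            # on the server, so the link count below is the real test.
            pass
        return os.stat(unique).st_nlink == 2
    finally:
        # The lock name, if we got it, keeps the file alive.
        os.unlink(unique)


def unlock(lockfile: str) -> None:
    """Release a lock previously taken with try_lock()."""
    os.unlink(lockfile)
```

A caller would retry try_lock() with a delay and a timeout. This covers only the lock itself; it does nothing about stale file handles when config.db is replaced underneath a reader.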
-- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From barry@digicool.com Fri Dec 8 17:51:30 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 12:51:30 -0500 Subject: [Mailman-Developers] Thoughts on splitting qrunner Message-ID: <14897.8098.811401.711293@anthem.concentric.net> http://www.zope.org/Members/bwarsaw/MailmanDesignNotes/SplittingQrunner Comments please, either here or in the Wiki! -Barry From csf@moscow.com Fri Dec 8 18:17:36 2000 From: csf@moscow.com (Michael Yount) Date: Fri, 8 Dec 2000 10:17:36 -0800 Subject: [Mailman-Developers] Thoughts on splitting qrunner In-Reply-To: <14897.8098.811401.711293@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 08, 2000 at 12:51:30PM -0500 References: <14897.8098.811401.711293@anthem.concentric.net> Message-ID: <20001208101736.C648@moscow.com> On 08 Dec 12:51, Barry A. Warsaw wrote: > > http://www.zope.org/Members/bwarsaw/MailmanDesignNotes/SplittingQrunner > > Comments please, either here or in the Wiki! > -Barry > Concerning local bounces, what we did in Mj2 is to cache the bad addresses, and at the end of the delivery process, mail an error message to the list owners. This avoids problems with locks by keeping delivery and bounce processing entirely separate. Michael From barry@digicool.com Fri Dec 8 18:24:45 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 13:24:45 -0500 Subject: [Mailman-Developers] Thoughts on splitting qrunner References: <14897.8098.811401.711293@anthem.concentric.net> <20001208101736.C648@moscow.com> Message-ID: <14897.10093.961509.671023@anthem.concentric.net> >>>>> "MY" == Michael Yount writes: MY> Concerning local bounces, what we did in Mj2 is to cache the MY> bad addresses, and at the end of the delivery process, mail an MY> error message to the list owners. 
This avoids problems with MY> locks by keeping delivery and bounce processing entirely MY> separate. I thought about that. In Mailman, I guess you'd send it to the -admin address so it gets bounce processed first. It's definitely a good idea, thanks. -Barry From chuqui@plaidworks.com Fri Dec 8 18:49:40 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Fri, 8 Dec 2000 10:49:40 -0800 Subject: [Mailman-Developers] Thoughts on splitting qrunner In-Reply-To: <14897.8098.811401.711293@anthem.concentric.net> References: <14897.8098.811401.711293@anthem.concentric.net> Message-ID: At 12:51 PM -0500 12/8/00, Barry A. Warsaw wrote: >http://www.zope.org/Members/bwarsaw/MailmanDesignNotes/SplittingQrunner > I like it. I'd like to suggest one other thing for qrunner. Make qrunner the queue-mom (so to speak), and have it manage what gets spawned when. The idea is that we end up with a single cron entry that fires every minute. that thing looks around and decides what needs to be spawned -- sort of sucking cron into mailman. Why not use cron? Here are a few reasons: 1) tweaking stuff in cron requires someone with CLI access, so it falls on the site admin to do things. By sucking cron into mailman, you can add a web access to all of this, which allows it to be managed remotely. 2) down the road, I see a strong positive in being able to split this stuff out further, on a per-list basis for at least some stuff. 
(the current cron-based setup is monolithic to the system, unless you want a cron file the size of a small truck, and in that case, maintenance is horrific) 3) It makes the whole queueing system less sensitive to upgrades (either by losing your customizations by re-installing the generic cron file, or forgetting to install the updated cronfile, or not realizing that the upgrade has a new cron file that needs to be merged with your custom changes to the existing cron processes) -- it makes it a lot easier to manage for the admin, plus it allows us as developers to write tools to auto-update the queueing stuff if needed. With cron -- good luck.. thoughts? -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From chuqui@plaidworks.com Fri Dec 8 18:36:25 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Fri, 8 Dec 2000 10:36:25 -0800 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: <14896.65108.807206.488209@anthem.concentric.net> References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> Message-ID: > TW> professional years of experience in this area, NFS sucks large > TW> granite elephant testicles through a very thin straw. Now that's a visual I did not need... (grin) (completely off the subject, I was part of the group that worked on the first third party ports of NFS to non-sun hardware, back in the days when I was working at sun... Bonus points for knowing what the first non-sun hardware NFS ran on...) >Background for those who don't know: zodb is the Zope Object Database, >ZEO is Zope Enterprise Objects. My only worry about this is adding enough complexity and overhead that mailman loses its attractiveness to the small site. >Pythonlabs. 
We've talked about all this stuff before, but the >question now is: is it better to jump in sooner rather than later? Probably sooner, if that's the direction we want to go -- but that simply defines 2.1 as "bug fixes and really easy stuff", and puts us in 3.0 development sooner, rather than later. So to a good degree it means 2.1 or 3.0 determinations are made based on "easy" rather than "high priority" to minimize re-doing stuff when it's rearchitected. >We'd have to handle collisions for multiple qrunner processes, >potentially on separate machines. One way that doesn't involve >locking shenanigans is to divide the hash space up and assign a >segment to each out-qrunner process. here's another way that should work: each record has a locking field in it. When qrunner wants to execute that item, it reads the field. If the field is NULL, it writes its ID (whatever it is, guaranteed unique) into that locking field. It then waits a beat, and reads it back. if it reads back its own ID, it knows it owns the record and can execute it. If it reads back someone else's ID, it lost the lock, but someone else owns the record so it can skip it and move on. you can simulate atomic locks with a little thought and cooperative processes, by everyone writing to the store and then seeing who won. A LOT easier from an administrative view than partitioning hashes and the like, IMHO. -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From barry@digicool.com Fri Dec 8 19:02:18 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 14:02:18 -0500 Subject: [Mailman-Developers] Thoughts on splitting qrunner References: <14897.8098.811401.711293@anthem.concentric.net> Message-ID: <14897.12346.911588.80167@anthem.concentric.net> >>>>> "CVR" == Chuq Von Rospach writes: >> http://www.zope.org/Members/bwarsaw/MailmanDesignNotes/SplittingQrunner CVR> I like it. Cool! 
CVR> I'd like to suggest one other thing for qrunner. Make qrunner CVR> the queue-mom (so to speak), and have it manage what gets CVR> spawned when. The idea is that we end up with a single cron CVR> entry that fires every minute. that thing looks around and CVR> decides what needs to be spawned -- sort of sucking cron into CVR> mailman. I like it a lot. This could also help us move away from the one-shot process architecture we currently have. queue-mom (I love that name, even if it doesn't quite capture what this thing is becoming :) could include the watchdog features to make sure any long-running processes are still running. Kind of the init of Mailman. -Barry From chuqui@plaidworks.com Fri Dec 8 19:01:04 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Fri, 8 Dec 2000 11:01:04 -0800 Subject: [Mailman-Developers] Thoughts on splitting qrunner In-Reply-To: <14897.12346.911588.80167@anthem.concentric.net> References: <14897.8098.811401.711293@anthem.concentric.net> <14897.12346.911588.80167@anthem.concentric.net> Message-ID: At 2:02 PM -0500 12/8/00, Barry A. Warsaw wrote: >I like it a lot. This could also help us move away from the one-shot >process architecture we currently have. It gives us a lot of possibility down the road. > queue-mom (I love that name, >even if it doesn't quite capture what this thing is becoming :) heh. I'm writing (slowly) a replacement for bulk_mailer for some of my systems, and its code-name is maildude....it was originally queuemon (queue monitor), until I ripped all the queueing out of it and went with QPS (why write a queueing system when you can borrow one off the shelf?) > could >include the watchdog features to make sure any long-running processes >are still running. Kind of the init of Mailman. that's the paradigm I wanted. not cron, but inetd. And that concept might help us get away from the idea of spawning processes every minute, too -- make them persistent and sleeping, with a watchdog to restart if they die. 
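The persistent-processes-plus-watchdog idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name restart_dead and the job table are hypothetical, the children are trivial Python one-liners standing in for real queue runners, and nothing here reflects how Mailman itself is or will be structured.

```python
import subprocess
import sys


def restart_dead(children, commands):
    """Start every job that is missing or has exited; return the names started."""
    started = []
    for name, argv in commands.items():
        proc = children.get(name)
        # poll() returns None while the child is still running.
        if proc is None or proc.poll() is not None:
            children[name] = subprocess.Popen(argv)
            started.append(name)
    return started


# Hypothetical job table: one long sleeper and one job that exits at once.
commands = {
    "sleeper": [sys.executable, "-c", "import time; time.sleep(30)"],
    "short": [sys.executable, "-c", "pass"],
}
children = {}
first_pass = restart_dead(children, commands)   # nothing running yet
children["short"].wait()                        # let the short job die
second_pass = restart_dead(children, commands)  # only the dead job comes back

# Clean up the demonstration children.
for proc in children.values():
    if proc.poll() is None:
        proc.kill()
    proc.wait()
```

A real watchdog would call something like restart_dead() from a sleep loop and would probably want to rate-limit restarts of a job that keeps dying.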
-- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From barry@digicool.com Fri Dec 8 19:13:34 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 14:13:34 -0500 Subject: [Mailman-Developers] about qrunner and locking References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> Message-ID: <14897.13022.921116.705151@anthem.concentric.net> >>>>> "CVR" == Chuq Von Rospach writes: >> Background for those who don't know: zodb is the Zope Object >> Database, ZEO is Zope Enterprise Objects. CVR> My only worry about this is adding enough complexity and CVR> overhead that mailman loses it's attractiveness to the small CVR> site. If I was proposing to swallow all of Zope, I think that'd be a valid criticism. I think ZODB is self-contained, transparent, and small enough to outweigh any complexity. In fact it may be a complexity win because of the headaches involved in the current architecture. Certainly it'll be miles better than trying to cook our own, which I fear would end up looking a lot like ZODB, feature-wise. >> Pythonlabs. We've talked about all this stuff before, but the >> question now is: is it better to jump in sooner rather than >> later? CVR> Probably sooner, if that's the direction we want to go -- but CVR> that simply defines 2.1 as "bug fixes and really easy stuff", CVR> and puts us in 3.0 development sooner, rather than later So CVR> to a good degree it means 2.1 or 3.0 determinations are made CVR> based on "easy" rather than "high priority" to minimize CVR> re-doing stuff when it's rearchitected. Yep, although you have to add i18n into the mix for 2.1, which is probably enough if we want to release it early in 2001. >> We'd have to handle collisions for multiple qrunner processes, >> potentially on separate machines. 
One way that doesn't involve >> locking shenanigans is to divide the hash space up and assign a >> segment to each out-qrunner process. CVR> here's another way that should work: each record has a CVR> locking field in it. When qrunner wants to execute that item, CVR> it reads the field. If the field is NULL, it writes its ID CVR> (whatever it is, guaranteed unique) into that locking CVR> field. It then waits a beat, and reads it back. if it reads CVR> back its own ID, it knows it owns the record and can execute CVR> it. If it reads back someone else's ID, it lost the lock, but CVR> someone else owns the record so it can skip it and move on. CVR> you can simulate atomic locks with a little thought and CVR> cooperative processes, by everyone writing to the store and CVR> then seeing who won. A LOT easier from an administrative CVR> view than partitioning hashes and the like, IMHO. Hmm, I do worry about using writes to coordinate the various processes. I think we're heading toward the NFS atomicity problem again. Administratively, partitioning the hash space shouldn't be too hard -- we can simply have a variable that says how many concurrent qrunners to start and divide the hash space up evenly. If we want to weight the hash partitions, then I think it would be simple to have a list of partition weights. The length of the list would be the number of concurrent qrunner processes, and then it's a simple matter of taking the ratio for each individual weight. What's difficult is getting the right qrunners to start on separate machines, and splitting the hash space up across machines, but that'd be difficult anyway. -Barry From barry@digicool.com Fri Dec 8 19:17:08 2000 From: barry@digicool.com (Barry A. 
Warsaw) Date: Fri, 8 Dec 2000 14:17:08 -0500 Subject: [Mailman-Developers] Thoughts on splitting qrunner References: <14897.8098.811401.711293@anthem.concentric.net> <14897.12346.911588.80167@anthem.concentric.net> Message-ID: <14897.13236.522682.327156@anthem.concentric.net> >>>>> "CVR" == Chuq Von Rospach writes: CVR> that's the paradigm I wanted. not cron, but inetd. And that CVR> concept might help us get away fro the idea of spawning CVR> processes every minute, too -- make them persistent and CVR> sleeping, with a watchdog to restart if they die. 'Zackly! From claw@kanga.nu Fri Dec 8 22:15:29 2000 From: claw@kanga.nu (J C Lawrence) Date: Fri, 08 Dec 2000 14:15:29 -0800 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: Message from Chuq Von Rospach of "Fri, 08 Dec 2000 10:36:25 PST." References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> Message-ID: <17693.976313729@kanga.nu> On Fri, 8 Dec 2000 10:36:25 -0800 Chuq Von Rospach wrote: >> Background for those who don't know: zodb is the Zope Object >> Database, ZEO is Zope Enterprise Objects. > My only worry about this is adding enough complexity and overhead > that mailman loses it's attractiveness to the small site. I argue similarly. To echo you Chuq, a primary goal should be ability to integrate. That't given I'm increasingly coming to question the use of Python pickles in the first place, let alone use of custom database implementations. I don't see that the value is there for the increased complexity and isolation of the system. We already have enough problems given the fact that that the membership base is kept in a pickle that I don't see much reason to go further down that rat hole. Yes, pickles are nice -- for private data that will never be seen or accessed by an external system. 
>> We'd have to handle collisions for multiple qrunner processes, >> potentially on separate machines. One way that doesn't involve >> locking shenanigans is to divide the hash space up and assign a >> segment to each out-qrunner process. > here's another way that should work: each record has a locking > field in it. When qrunner wants to execute that item, it reads the > field. If the field is NULL, it writes its ID (whatever it is, > guaranteed unique) into that locking field. It then waits a beat, > and reads it back. if it reads back its own ID, it knows it owns > the record and can execute it. If it reads back someone else's ID, > it lost the lock, but someone else owns the record so it can skip > it and move on. You missed a few race conditions in there. As I wrote earlier the only NFS operation which is guaranteed to be unique across implementation is creat(). Given that, lock files based on the hash of the message would seem to be the answer. If you can create a lock file with a filename based on the hash of the message, you have rights to deliver it. If not, well, someone else obviously has those rights. The problem that remains is lockfile aging. > you can simulate atomic locks with a little thought and > cooperative processes, by everyone writing to the store and then > seeing who won. A LOT easier from an administrative view than > partitioning hashes and the like, IMHO. This assumes that writes are atomic. The problem is that they occasionally aren't. -- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From barry@digicool.com Fri Dec 8 23:16:51 2000 From: barry@digicool.com (Barry A. 
Warsaw) Date: Fri, 8 Dec 2000 18:16:51 -0500 Subject: [Mailman-Developers] about qrunner and locking References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> <17693.976313729@kanga.nu> Message-ID: <14897.27619.230944.677429@anthem.concentric.net> >>>>> "JCL" == J C Lawrence writes: JCL> I argue similarly. To echo you Chuq, a primary goal should JCL> be ability to integrate. That't given I'm increasingly JCL> coming to question the use of Python pickles in the first JCL> place, let alone use of custom database implementations. I JCL> don't see that the value is there for the increased JCL> complexity and isolation of the system. We already have JCL> enough problems given the fact that that the membership base JCL> is kept in a pickle that I don't see much reason to go JCL> further down that rat hole. Yes, pickles are nice -- for JCL> private data that will never be seen or accessed by an JCL> external system. I completely agree that keeping the list database in marshals (not pickles, actually) is broken. The question is what to do about it. As I see it, we're going to have to adopt some kind of database API and write against that. The most general -- and most Pythonic -- one that I know of is ZODB because all you need to do is derive your class from a Persistent class and you're good to go (modulo a couple of special rules). Being able to write Python the way you're used to writing it is a big win. With other approaches and dbi's you can hit very uncomfortable impedance mismatches. Or you have to write lots of strings with embedded SQL, etc. ZODB has backend storages to interface to many different underlying databases. The default FileStorage probably isn't appropriate for Mailman because it is a versioning storage, and we don't need versioning (so we don't want to pay the hit in performance and disk usage). 
There are gdbm storages, Berkeley db storages, Oracle storages, etc. ZEO is, in fact, implemented as a storage. Something I believe is possible, is the ability to "mount" storages into your object tree. If they work the way I think they work, they'd be pretty cool. Here's an example of how I envision using this: Let's say you run a Mailman site that has intranet mailing lists and extranet mailing lists. Let's say further that for external lists, people can either be clients or people who haven't registered with your site at all except for their email address. In this scenario, you could be getting list rosters from three different sources (e.g. internal LDAP server, client list in a relational db, Mailman-only database). You'd like Mailman's lists to be able to incorporate any of these users in its list object. So, a rough design might be that you've got UserRepositories that can return User objects. You can then create Rosters that contain collections of Users, and list objects would contain a list of Rosters which would make up the membership addresses. Now, in the above example you'd have three different UserRepositories (LDAP, client rdbms, Mailman objects) which would be mounted into the ZODB object space. So when a list is crafting its recipient lists, it doesn't care where the Users are coming from. This also comes into play when users want to change their address or delivery options. Maybe the options are stored in the backend database, maybe they're stored only in Mailman's db. It shouldn't matter, and Mailman's object system should map those into the same space transparently. Same for Rosters perhaps, e.g. maybe Rosters coming from the intranet database aren't writable through Mailman because the backend database prohibits it. That's one way to address the "we're not going to let employees unsubscribe from this list" issue. 
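A minimal sketch of the repository/roster layering described above, in plain Python. Every name in it (DictRepository, Roster, MailingList, recipients, unsubscribe) is hypothetical and chosen for illustration; none of it is Mailman or ZODB API, and the in-memory dictionary merely stands in for an LDAP server, a relational database, or a Mailman-only store.

```python
class DictRepository:
    """Stand-in backend mapping address -> display name, held in memory."""

    def __init__(self, users, writable=True):
        self._users = dict(users)
        self.writable = writable

    def addresses(self):
        return sorted(self._users)

    def remove(self, address):
        # A read-only backend refuses changes made through the list manager.
        if not self.writable:
            raise PermissionError("backend does not allow changes")
        del self._users[address]


class Roster:
    """A collection of users drawn from exactly one repository."""

    def __init__(self, repository):
        self.repository = repository

    def addresses(self):
        return self.repository.addresses()

    def unsubscribe(self, address):
        self.repository.remove(address)


class MailingList:
    """A list whose membership is the union of its rosters."""

    def __init__(self, name, rosters):
        self.name = name
        self.rosters = list(rosters)

    def recipients(self):
        # Merge rosters in order, dropping duplicate addresses.
        seen = []
        for roster in self.rosters:
            for address in roster.addresses():
                if address not in seen:
                    seen.append(address)
        return seen


# Hypothetical data: a read-only staff roster and a writable public one.
staff = Roster(DictRepository({"ann@corp.example": "Ann"}, writable=False))
public = Roster(DictRepository({"bob@isp.example": "Bob",
                                "ann@corp.example": "Ann"}))
announce = MailingList("announce", [staff, public])
```

The list only ever talks to rosters, so swapping one backend for another would not touch MailingList, and the read-only flag is one way a backend could veto unsubscription.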
So the real advantage of using ZODB is that it's very friendly to Python programs, and should provide the right kind of framework for plugging in all kinds of backend database sources. -Barry From ralph@inputplus.demon.co.uk Fri Dec 8 23:16:23 2000 From: ralph@inputplus.demon.co.uk (Ralph Corderoy) Date: Fri, 08 Dec 2000 23:16:23 +0000 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: Message from Chuq Von Rospach of "Fri, 08 Dec 2000 10:36:25 PST." Message-ID: <200012082316.XAA23350@inputplus.demon.co.uk> Hi Chuq, > here's another way that should work: each record has a locking field > in it. When qrunner wants to execute that item, it reads the field. > If the field is NULL, it writes its ID (hwatever it is, guaranteed > unique) into that locking field. It then waits a beat, and reads it > back. if it reads back its own ID, it knows it owns the record and > can execute it. If it reads back someone else's ID, it lost the lock, > but someone else owns the record so it can skip it and move on. Sorry if this is wandering a little off topic, but what's a `beat'? What stops A reads, gets NULL B reads, gets NULL A writes `A' A waits a beat A reads `A' and has the lock B held up for a little writes `B' B waits a beat B reads `B' and also has the lock I think I'm missing something and that's annoying :-) Ralph. From claw@kanga.nu Sat Dec 9 02:48:39 2000 From: claw@kanga.nu (J C Lawrence) Date: Fri, 08 Dec 2000 18:48:39 -0800 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Fri, 08 Dec 2000 18:16:51 EST." 
<14897.27619.230944.677429@anthem.concentric.net> References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> <17693.976313729@kanga.nu> <14897.27619.230944.677429@anthem.concentric.net> Message-ID: <9548.976330119@kanga.nu> On Fri, 8 Dec 2000 18:16:51 -0500 Barry A Warsaw wrote: >>>>>> "JCL" == J C Lawrence writes: JCL> I argue similarly. To echo you Chuq, a primary goal should be JCL> ability to integrate. That't given I'm increasingly coming to JCL> question the use of Python pickles in the first place, let JCL> alone use of custom database implementations. I don't see that JCL> the value is there for the increased complexity and isolation JCL> of the system. We already have enough problems given the fact JCL> that the membership base is kept in a pickle that I don't JCL> see much reason to go further down that rat hole. Yes, pickles JCL> are nice -- for private data that will never be seen or JCL> accessed by an external system. > I completely agree that keeping the list database in marshals (not > pickles, actually) is broken. Oops. My bad. > The question is what to do about it. I think we have to bite the bullet and make an architectural change and not try and do bolt ons that continue the current business. We need to classify the persistent data into management sets, define access policies for those sets, and then provide abstracted interfaces for them. (any more buzzwords I missed?) For example there appear to be five basic data sets that Mailman is concerned with: Inbound mail Outbound mail Global MLM configurations List configurations User account configurations The queueing improvements we've recently discussed would seem to handle the first two reasonably well, and certainly don't seem to commit Mailman to a design choice that will be expensive later. The last one is the really interesting one. 
Not just because users want the ability to have membership lists be entirely dynamic, but because the same abstractions we use for user accounts are likely to be applicable to the other two. I'm now going to kick back with a good beer, crank up Tubular Bells, and think about this for a bit. > So the real advantage of using ZODB is that it's very friendly to > Python programs, and should provide the right kind of framework > for plugging in all kinds of backend database sources. There's a danger here of assuming that everything is a Python shaped nail. I have misgivings over making a Pythonic wrapper that consumes the entire membership resource versus a callout that returns the membership base view requested by whatever means the installer provides. I don't like over-arching APIs. I do like small self contained and well constrained tools that assemble in useful fashions. Or, if you wish, I'd rather take a Postfix-ish view of a multitude of cooperating single-purpose tools in a defined framework, rather than a sendmail-ish one-thing-does-everything-with-plugins approach. I have the feeling that with a little thought we could break the base design of Mailman down into a finely grained granular mesh of single purpose tools. Then changing the basic setup simply means replacing one of the base tools with some other simple tool that generates the same type of output from different resources. 
-- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From chuqui@plaidworks.com Sat Dec 9 07:20:23 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Fri, 8 Dec 2000 23:20:23 -0800 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: <9548.976330119@kanga.nu> References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> <17693.976313729@kanga.nu> <14897.27619.230944.677429@anthem.concentric.net> <9548.976330119@kanga.nu> Message-ID: At 6:48 PM -0800 12/8/00, J C Lawrence wrote: >There's a danger here of assuming that everything is a Python shaped >nail. I have misgivings over making a Pythonic wrapper that >consumes the entire membership resource versus a callout that >returns the membership base view requested by whatever means the >installer provides. On the other hand, look at where the sponsorship of Mailman is coming from, and because of that, I don't have a huge problem with it -- but to some degree that implies mostly that we design an API that can by default use Python (mumbles), but allows you to write a python glue set to some other (mumble) if you want. I think making as much of the *default* configuration python is a very good idea, as long as you allow for the ability of users to replace those default pieces with more sophisticated (or different) pieces if you want. I have this terrible nightmare of coming out the other side of 3.0 where the INSTALL document opens with "first, download these 17 things from freshmeat and install them. you can now start configuring Mailman..." Don't take that to imply entirely self-contained, just that reaching out to outside code has to be carefully considered and well thought out. it's stupid to re-invent the MTA or Apache, for instance. 
it looks (from discussions today) that it'd be silly NOT to integrate queue management stuff (aka, "suck in cron") into Mailman. >I don't like over-arching APIs. Maybe we shouldn't be using the term API here, because what I think we're talking about are to some degree class objects. And if you think of it in terms of OOP, you may have an over-arching object (class "mail list manager"), but that gives you the opportunity to override subclasses to do what you want -- and ONLY have to rewrite those pieces you need to accomplish what you want. The concept of an API for "subscriber database" seems overwhelming, but if you have a class for that, if it's done right, redoing the subclass that interfaces with the outside store and switching from zdb to dbi/mysql shouldn't be too significant. I think I'm arguing semantics here, because I think we're using the term API to more or less the same definition as class definitions in OOP, but it might help us to think about this if we look at it in classes architecturally. >I have the feeling that with a little thought we could break the >base design of Mailman down into a finely grained granular mesh of >single purpose tools. Then changing the basic setup simply means >replacing one of the base tools with some other simple tool that >generates the same type of output from different resources. Hear hear. I wish I'd said that. Here's something to chew on, just because it bubbled up out of the muck and seems worthy of mentioning. In all this talk of 2.1 and 3.0, of zope database stores and my site-wide subscriber beastie, and multiple-machine send queues and the like, I'm wondering if we're maybe starting to define a thing that reaches beyond the needs and capabilities of the small site. So as we move forward, maybe we need to keep in our mind the issues of "mailman light" versus "mailman robusto". 
this ties in nicely I think with what JC is bringing up, and would help us test the robustness of the APIs/classes - there be a baseline Mailman (mailman light) with basic reasonable functionality, and plug-in replacements with more functionality where we want to move towards the robusto version. For instance, we might not want to use zdb for the light version, but some simpler but endemic format (like marshals) -- but you lose performance, you lose NFS locking, and other features a small site won't need anyway -- but you don't need to install lots of other stuff to get running. And if you think in terms of that, then 2.1 is the bugfix/easy-upgrade release, and then perhaps 2.5 is where all of those interfaces get defined and mailman-light is implemented, and 3.0 is where robusto is implemented to supplement the light. It would give us the ability to build parallel development teams for robusto and light (or for modules within each) down the road, and keep us honest in terms of the APIs, because if we cheat on an API, we break one or the other... -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From chuqui@plaidworks.com Sat Dec 9 07:29:20 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Fri, 8 Dec 2000 23:29:20 -0800 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: <200012082316.XAA23350@inputplus.demon.co.uk> References: <200012082316.XAA23350@inputplus.demon.co.uk> Message-ID: At 11:16 PM +0000 12/8/00, Ralph Corderoy wrote: >Sorry if this is wandering a little off topic, but what's a `beat'? >What stops > A writes `A' > A waits a beat A short wait. Just long enough to let things settle so you get reliable data. It's really a musical term that I've absconded with... -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. 
Cover us. From chuqui@plaidworks.com Sat Dec 9 08:22:37 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Sat, 9 Dec 2000 00:22:37 -0800 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? Message-ID: I'm passing this along mostly as a FYI, but also as a sanity check. I sent this out to list-managers tonight, to bring up an issue that sort of crystalized this afternoon and made me realize that I think we have the beginnings of a problem in mail list land. Your thoughts are welcome....If I'm right, well, oh, boy. If I'm wrong -- I'd love to find out my idea won't work, but I think it's not only possible, but fairly easy. ---- I somewhat hesitate to bring this up, but I heard of another situation today that seems to fit in, and I think it's time to raise the issue. I'm beginning to think that mailback validation as an anti-spam technique has been beaten. Worse, I think there are now spam systems written that will beat them in an automated way. I will say up front I don't have a smoking gun. If and when I find one, I'll say so. But I'm now beginning to think the spammers have figured out how to beat mailbacks. Someone we know runs a list on egroups. Twice today he was spammed by the porn spammers -- from subscribed accounts. This isn't the first time I've heard of this in the last few weeks, but he's someone I know runs a pretty clean ship. to get hit by two separate porn spammers on the same day, in independent attacks, that raises a real warning flag, because where the porn spammers innovate, everyone else follows. In the last few years, there have been some significant, fundamental changes in the internet (duh). Now that I've spent a few hours thinking like a spammer, I realize these changes make it trivial for a *smart* spammer with some basic resources to circumvent mailbacks. 
Here's how:

First, you get access to some domains -- the key to mailbacks is that you have to have physical access to the mailback address to finish the confirmation. In today's internet, however -- that isn't a big deal. You register one for yourself, hook yourself up using dynamic DNS while attached via PPP to UUnet or one of the ISPs, and you have a fully functional mailserver. Or if you prefer, simply break into some lameoid's home machine sitting on a cable modem and borrow imstupid.org while he's not paying attention. Either way, you now have a spammer with a set of available domains, which he's either bought, borrowed or stolen, and access to the return mail sent to those domains.

This spammer's built a validation-bot. It's fed a list of mailing lists, and it spends all of its time figuring out what MLM each one uses (not hard), and subscribing accounts to them. It can send the appropriate subscribe messages, read the confirmations, and send appropriate confirmations. Even better, if the MLM supports nomail, you turn off deliveries, so you don't run the risk of inbound e-mail alerting anyone on imstupid.org (if you think about it, the only thing that has to be on imstupid.org is a set of aliases forwarding to your real machine, and only for the period of time you're setting up the subscriptions. If you're real lucky, you find out you can hack their DNS and set up really.imstupid.org, and send EVERYTHING offsite).

The spammer lets his bot run for a while, and keeps a database of which address is subscribed to which list. He can even subscribe multiples from multiple domains if he wants, and let them lie fallow. When you block off one, it falls back and sends from the next.

He now owns your list, at least until you figure out what's going on and nuke the subscribed address. But if you think about it, once that validation handshake is complete, there's never ANY further validation.
So he can set up temporary shop, validate to his heart's content, and then later on, after all the temporary stuff is safely hidden away, spam from anywhere, safely. Because he knows the address that will get him on the list.

If this is true, and it's beginning to look like egroups is a target of one attack, and I've heard rumors of some mailman lists being hit as well, then lists that depend on mailback validation have a problem. And I think there's been a feeling that mailbacks are the one true way of validation to the point where there hasn't been much (if any) thought about improved techniques or alternatives.

And if I, having spent four hours on the "how would I do this?" train of thought, can find a fairly easy to implement design, so can those that aren't so pure of heart and don't say their prayers at night. This isn't something the "buy a CD for $200" lameoid spammers can do (but I'll bet a really good spammer could build a system to do it that's turnkey. There's enough wide open hardware out on the net, especially overseas, that you could get a good 6 month run before enough stuff shut you down to make it not worth it...), but the porn spammers and gambling spammers and the spammers for hire? It's perfect for them.

I've felt for a while that the list community was way too comfortable with mailbacks as "safe and unbeatable". I'm now seeing what I think is evidence that this is no longer true. And I'm afraid that because we have sat back and not innovated here, we're going to end up behind the eight ball. And I don't see any easy answers if I'm right -- only that if I am wrong, I won't be wrong forever.

So I'm throwing it to the list, to see if there's information others have that might corroborate what I think I'm seeing (that you may not have realized for what it might be), or to poke holes in my analysis, or to start thinking of how to deal with it.

There I go, being a troublemaker again... (grin, sort of)

thoughts?
chuq -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From lindsey@ncsa.uiuc.edu Sat Dec 9 09:09:26 2000 From: lindsey@ncsa.uiuc.edu (Christopher Lindsey) Date: Sat, 9 Dec 2000 03:09:26 -0600 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? In-Reply-To: ; from chuqui@plaidworks.com on Sat, Dec 09, 2000 at 12:22:37AM -0800 References: Message-ID: <20001209030926.A26087@ncsa.uiuc.edu> > I'm passing this along mostly as a FYI, but also as a sanity check. I > sent this out to list-managers tonight, to bring up an issue that > sort of crystalized this afternoon and made me realize that I think > we have the beginnings of a problem in mail list land. Your thoughts > are welcome....If I'm right, well, oh, boy. If I'm wrong -- I'd love > to find out my idea won't work, but I think it's not only possible, > but fairly easy. Hi Chuq, Yes, this has definitely been troublesome. I've blocked many commercial sites like findmail.com (egroups) and remarq.com from my lists because of their secret archiving that displays email addresses to the public, but at least they don't spam the lists back. But of course anyone can browse these sites and get addresses to their heart's content, then forge MAIL FROM: to sneak mail into the lists. I'm not sure what the right thing is to do. MLMs like sympa ( http://listes.cru.fr/sympa/ ) are definitely moving in the right direction with S/MIME signatures/encryption and X509 user certs, but that still doesn't stop someone from using throwaway certs to spam several lists or from harvesting addresses. The problem is that when these methods are used for authentication they just prove that the email address sending the stuff is who we think he or she is. 
But at least you can't forge the source email address to look like it's coming from a list member who is allowed to post (well, it's harder :) I think that there's an implicit level of trust that has to be honored in mailing list management. Even SASL-based SMTP authentication from ISPs isn't going to prevent throw-away accounts from being used. Until we can get a fingerprint or cornea scan (or even a driver's license) with each mailing list subscription and compare it against a master database (which I'm not advocating), you can't be 100% sure of the users. For now I'd say that the best method is a social one; require references when people want to subscribe to your list. Ask them which lists they participate on, an example post from another list, etc. But ultimately it becomes a judgement call by the listowner either way. Just my humble opinion on the matter... Chris ---------------------------------------------------------------------- Christopher Lindsey, Senior System Engineer National Center for Supercomputing Applications (NCSA) From ralph@inputplus.demon.co.uk Sat Dec 9 10:32:36 2000 From: ralph@inputplus.demon.co.uk (Ralph Corderoy) Date: Sat, 09 Dec 2000 10:32:36 +0000 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: Message from Chuq Von Rospach of "Fri, 08 Dec 2000 23:29:20 PST." Message-ID: <200012091032.KAA02371@inputplus.demon.co.uk> Hi, > > Sorry if this is wandering a little off topic, but what's a `beat'? > > What stops > > > > A writes `A' > > A waits a beat > > A short wait. Just long enough to let things settle so you get > reliable data. It's really a musical term that's I've absconded > with... This won't work. A doesn't know how long to wait because it doesn't know how quickly B might get around to trampling the `A'. Ralph. 
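Ralph's objection is that no fixed wait makes write-then-check safe, because the other process can always be slower than the wait. The usual alternative is to let the filesystem arbitrate with a single atomic operation. The following is a minimal sketch of that idea, not Mailman's own locking code: `try_lock` and `unlock` are hypothetical names, and the sketch assumes a POSIX filesystem where creating a hard link to an existing name fails atomically. It ignores lock lifetimes and stale-lock cleanup, which a real implementation would need.

```python
import os
import socket
import tempfile


def try_lock(lockfile):
    """Try once to claim `lockfile`; return True if the claim succeeded.

    Illustrative helper (hypothetical name). The claim is made by
    hard-linking a private file to the shared lock name. link() either
    succeeds or fails because the name already exists, so two processes
    cannot both win, and nobody has to guess how long to wait.
    """
    # Private file name, unique per host and process, next to the lock.
    private = "%s.%s.%d" % (lockfile, socket.gethostname(), os.getpid())
    with open(private, "w") as fp:
        fp.write(private)
    try:
        os.link(private, lockfile)   # the atomic step
        return True
    except FileExistsError:
        return False                 # someone else holds the lock
    finally:
        os.unlink(private)           # the private name is no longer needed


def unlock(lockfile):
    """Release a lock previously claimed with try_lock()."""
    os.unlink(lockfile)


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "config.db.lock")
    print(try_lock(lock))   # first claim
    print(try_lock(lock))   # second claim while the lock is held
    unlock(lock)
```

A caller that fails to get the lock would sleep and retry; the sleep then only affects how quickly it notices the lock is free, not whether the lock is correct.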
From marc_news@valinux.com Sat Dec 9 16:03:49 2000
From: marc_news@valinux.com (Marc MERLIN)
Date: Sat, 9 Dec 2000 08:03:49 -0800
Subject: [Mailman-Developers] about qrunner and locking
In-Reply-To: <14897.100.883156.91474@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 08, 2000 at 10:38:12AM -0500
References: <20001207162234.D25463@marc.merlins.org> <14897.100.883156.91474@anthem.concentric.net>
Message-ID: <20001209080348.A14951@marc.merlins.org>

On Fri, Dec 08, 2000 at 10:38:12AM -0500, Barry A. Warsaw wrote:
>
>     MM> But then comes the question: why does qrunner have to modify
>     MM> the list's config.db when it ships a message? I suppose the
>     MM> relevant piece of code in qrunner is:
>
>     | try:
>     |     keepqueued = dispose_message(mlist, msg, msgdata)
>     |     # Did the delivery generate child processes? Don't store them in
>     |     # the message data files.
>     |     kids = msgdata.get('_kids')
>     |     if kids:
>     |         allkids.update(kids)
>     |         del msgdata['_kids']
>     |     if not keepqueued:
>     |         # We're done with this message
>     |         dequeue(root)
>
>     MM> but I have to admit to not understanding what it does.
>
> This isn't directly related to your problem, but some pipeline modules
> can create subprocesses, although the only one that does this
> currently is ToUsenet.py. This code makes sure that all those
> children are waited on so they don't zombie. What /really/ ought to
> happen is that there is a separate queue for usenet postings since
> once the message is prepared for usenet, it doesn't need to touch the
> list database again.

I read the other messages with interest (thanks to all those who contributed), so let me ask: what happens if I remove the piece of code above, and just not lock the config.db at all in qrunner? (in my case, I will not be doing usenet gatewaying, so the children problem doesn't seem to apply to me)

Marc
-- 
Microsoft is to operating systems & security .... ....
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key

From chuqui@plaidworks.com Sat Dec 9 16:26:06 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Sat, 9 Dec 2000 08:26:06 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: <20001209030926.A26087@ncsa.uiuc.edu>
References: <20001209030926.A26087@ncsa.uiuc.edu>
Message-ID:

At 3:09 AM -0600 12/9/00, Christopher Lindsey wrote:
> Yes, this has definitely been troublesome. I've blocked many commercial sites like findmail.com (egroups) and remarq.com from my lists because of their secret archiving that displays email addresses to the public, but at least they don't spam the lists back. But of course anyone can browse these sites and get addresses to their heart's content, then forge MAIL FROM: to sneak mail into the lists.

Ya know, I hadn't thought of that -- I've worked at closing off my list archives from the spam harvesters, but I'd never thought of the list archives as a source of addresses to use to spam ONTO the lists. (shudder). That's a real, legitimate issue, because you're basically handing them access. damn. I have to go rethink that again.

And I realized, after I posted, that as long as there are free e-mail sites (netscape.net, hotmail, etc), you don't even need to create or hack domains to do this. Over a period of a week, create a thousand email accounts on the various free sites. Then you can set up the mailbots to start using them to subscribe and spam. As admins get accounts nuked by the free sites, simply disable them, move to other ones in your collection, and create some more. Even under the best of circumstances, it'd be tough to impossible for the admins of a place like hotmail to keep ahead of that, and their only real block is an IP block -- and if you have multiple IPs... This charade could go on for a long time.

> ) are definitely moving in the right direction with S/MIME signatures/encryption and X509 user certs, but that still doesn't stop someone from using throwaway certs to spam several lists or from harvesting addresses.

And it doesn't help the reality that most users can't/won't do this, and it simply means you'll scare away legitimate users, which is like being so scared of having the cow stolen you weld the barn door shut. The cow doesn't get stolen, but it eventually starves to death...

> For now I'd say that the best method is a social one; require references when people want to subscribe to your list.

That works if you have active listowners and a small list. Imagine me doing that for a large list with dozens of subscriptions a day -- on my big mailman site, I'd have to hire staff to even START doing that. Not practical, unfortunately.

But Murr Rhame on list-managers said something that made me think of a possible answer -- new subscribers automatically go into "hold for approval" mode. It'd be another flag in the user record (like digest or nomail), and when you subscribe, it's turned on. All messages are held for the admin to approve. Once an admin can trust a new account, he turns off the flag and they post without restriction. There are some topics and lists where this would be a good thing to have, because of the incendiary aspects of the topic, or because (in my case) there are problems with trolls....

-- 
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From darrell@grumblesmurf.net Sat Dec 9 17:01:35 2000
From: darrell@grumblesmurf.net (Darrell Fuhriman)
Date: 09 Dec 2000 09:01:35 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: Chuq Von Rospach's message of "Sat, 9 Dec 2000 08:26:06 -0800" References: <20001209030926.A26087@ncsa.uiuc.edu> Message-ID: Chuq Von Rospach writes: > But Murr Rhame on list-managers said something that made me think of a > possible answer -- new subscribers automatically go into "hold for > approval" mode. it'd be another flag in the user record (like digest Yes, Lyris does this. It's a feature I've wanted to add to mailman, but haven't had the bandwidth to do it. It would be even nicer if it were smart enough to set a threshold and as soon as they've had that many posts approved, the restriction is automatically lifted. There's a couple other things that Lyris does that would be cool, but I can't remember what they are right now. :) Darrell From chongo@mellis.com Sat Dec 9 18:21:02 2000 From: chongo@mellis.com (Landon Curt Noll) Date: Sat, 9 Dec 2000 10:21:02 -0800 (PST) Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? Message-ID: <200012091821.KAA14884@delta.mellis.com> > I think that there's an implicit level of trust that has to be honored > in mailing list management. Even SASL-based SMTP authentication from > ISPs isn't going to prevent throw-away accounts from being used. > Until we can get a fingerprint or cornea scan (or even a driver's > license) with each mailing list subscription and compare it against > a master database (which I'm not advocating), you can't be 100% > sure of the users. > > For now I'd say that the best method is a social one; require > references when people want to subscribe to your list. Ask them which > lists they participate on, an example post from another list, etc. > But ultimately it becomes a judgement call by the listowner either way. I think Chris is right on target here. Judgement and non-automated checks are one good way of reducing the number of spammers. For one list, I depend on the person telling me something or asking me something about topic. 
If all I get is a join request, I write back asking them if they really wanted to join the list and "BTW: which version of XXX are you using?" or "What would you like to see improved in XXX?" or "In what ways are you using XXX?" or something like that. Yea, I suppose a spammer could go through the effort to learn about XXX to get on the list ... that would be a lot of special work just to target a small list.

=-=

Processing requests ``by hand'' is important in order to use ``Judgement and non-automated checks''. Clearly this does not work for huge lists or where the list owner is too busy to process each and every request.

An alternative is to perform a ``credit check'' first:

    *) Is the request from a recently created domain?
    *) Did the DNS lookup of the IP address of the sender match the envelope?
    *) Did the request come from an area flagged by the RBL, DUL, RSS or dorkslayers?
    *) Did the request come from a known spam haven or place from which you had problems before?

None of these checks are perfect, but that is OK. The false negatives of things like the RSS are not a problem either. Anything that can pass all of the credit checks gets processed automatically. Anything else you process ``by hand'' using ``Judgement and non-automated checks''.

=-=

Prevention helps. Avoid placing your list or list-request address on the web (or on Usenet) in text form. Put your instructions / list information as text in the form of a jpg. So far spammers are not scanning jpg images for text and groking them ... I hope!

=-=

Do not use a single EMail address for the list. Make use of sub-domains. For example instead of:

    XXX-list@foo.org

Use:

    XXX-list@Jrandom-text.XXX.m.foo.org

where `Jrandom-text' is different for each and every list member. Here `Jrandom-text' could be a random collection of chars produced by your friendly neighborhood Lavarand server. :-)

Perform automated in-bound checks on EMail going into 'Jrandom-text.XXX.m.foo.org'. Use the 'credit checks' listed above.
Look for the EMail being relayed thru the subscription address. Look for procmail-like spam triggers. If the message passes the sanity check, allow it to go thru the system in an automated fashion. If a flag is raised, process it by hand.

When EMail is sent out to the list, send it out with each message being from XXX-list@Jrandom-text.XXX.m.foo.org style addresses. All list users know (or care) is that they have / use / receive EMail from a single address. They need not know / or care that the single address is different from person to person.

Another advantage to the ``non-single EMail address for the list'' is that if an address becomes a problem, you can turn it off without impacting the entire list. Say the user who is assigned 'XXX-list@table.XXX.m.foo.org' gets hit by a virus that spams their address book. Say this virus is able to duck under the ``credit checks''. Say despite all of your efforts the spam goes onto the list. Well you:

    *) disable 'XXX-list@table.XXX.m.foo.org'
    *) apologize to the XXX-list

Later on if the member cleans up their mess, allow them to re-join by assigning them a new address: 'XXX-list@lztyg.XXX.m.foo.org'.

If people object to Jrandom-text, one can always use a randomly selected word-number combination.

chongo <> /\oo/\

From lm@bitmover.com Sat Dec 9 18:31:38 2000
From: lm@bitmover.com (Larry McVoy)
Date: Sat, 9 Dec 2000 10:31:38 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: <200012091821.KAA14884@delta.mellis.com>
References: <200012091821.KAA14884@delta.mellis.com>
Message-ID: <20001209103138.A20749@work.bitmover.com>

[lotso spam avoidance discussion]

Folks, I am not a Mailman developer, I use it to host the mailing lists for musictogether.com and like it a lot. But I have an idea/question.

Is it possible to put a list into a mode where it is partially moderated and partially unmoderated?
Where moderated applies not only to posting but to all other operations as well, such as querying the list members.

If you could, I think this would fix the spammer problem somewhat. Here's how. Suppose you set up a new list. When people subscribe, they are automatically in the moderated category. They do their first post, the moderator looks at it, it's either obviously OK or not. If it is OK, the moderator OK's it and switches the user from moderated to unmoderated status.

This puts some amount of work onto the list owner, but the amount of work quickly dies down, and in fact, I think it is probably less work than other approaches.

Can mailman do this? If not, could it? Would it be a good idea?

-- 
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

From chongo@mellis.com Sat Dec 9 18:41:40 2000
From: chongo@mellis.com (Landon Curt Noll)
Date: Sat, 9 Dec 2000 10:41:40 -0800 (PST)
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
Message-ID: <200012091841.KAA14945@delta.mellis.com>

> Is it possible to put a list into a mode where it is partially moderated and partially unmoderated? Where moderated applies not only to posting but to all other operations as well, such as querying the list members.

You raise a good point here. In my system that is under development, I use the number of postings and the posting rate as part of the credit checks.

You could say, for example, the first 3 EMails fail the ``credit check'' and must be processed by hand. Again failing the ``credit check'' does not mean rejection, just that the message must go to a human for social checks and their best judgement.

Another thing I have thought of is to check on the 'days since last message in relation to the average member' to watch for sleeper EMail addresses being enabled by a spammer.
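The probation ideas raised in this thread (hold a new subscriber's posts until a moderator has approved some number of them, and hold anything that fails a ``credit check'') can be sketched as a small routing function. This is only an illustration under stated assumptions: `Member`, `PROBATION_POSTS`, `route_post` and `record_approval` are hypothetical names, the threshold of 3 is taken from the example above, and none of this is an existing Mailman feature.

```python
from dataclasses import dataclass

# Assumed threshold; the thread's example is "the first 3 EMails".
PROBATION_POSTS = 3


@dataclass
class Member:
    """Hypothetical per-subscriber record used only for this sketch."""
    address: str
    approved_posts: int = 0   # posts a moderator has approved so far
    trusted: bool = False     # set by hand, or automatically at threshold


def route_post(member, failed_checks=()):
    """Return "deliver" or "hold" for a post from a subscribed member.

    `failed_checks` is any collection of credit-check names that fired
    (for example an RBL hit or a DNS mismatch). "hold" means the post
    goes to a human for judgement; it is not a rejection.
    """
    if failed_checks:
        return "hold"
    if member.trusted or member.approved_posts >= PROBATION_POSTS:
        return "deliver"
    return "hold"


def record_approval(member):
    """Call when a moderator approves a held post from this member."""
    member.approved_posts += 1
    if member.approved_posts >= PROBATION_POSTS:
        member.trusted = True
```

Keeping the counter separate from the `trusted` flag lets a list owner lift the restriction by hand after a single good post, or put a member back on probation later without losing the history.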
chongo <> /\oo/\

From claw@kanga.nu Sat Dec 9 18:44:13 2000
From: claw@kanga.nu (J C Lawrence)
Date: Sat, 09 Dec 2000 10:44:13 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: Message from Chuq Von Rospach of "Sat, 09 Dec 2000 00:22:37 PST."
References:
Message-ID: <26289.976387453@kanga.nu>

On Sat, 9 Dec 2000 00:22:37 -0800 Chuq Von Rospach wrote:

> I'm beginning to think that mailback validation as an anti-spam technique has been beaten. Worse, I think there are now spam systems written that will beat them in an automated way.

I've written on this before to the Mailman lists. I have similar suspicions. Like you I have no smoking guns, but I have suggestive evidence.

> I will say up front I don't have a smoking gun. If and when I find one, I'll say so. But I'm now beginning to think the spammers have figured out how to beat mailbacks.

It's hardly complex -- just look for key strings in messages coming to an account, and then bounce back messages accordingly. Given someone with minimal scripting knowledge, what, 30 minutes? Four simple patterns will cover 95% of the lists out there.

> Someone we know runs a list on egroups. Twice today he was spammed by the porn spammers -- from subscribed accounts. This isn't the first time I've heard of this in the last few weeks, but he's someone I know runs a pretty clean ship. To get hit by two separate porn spammers on the same day, in independent attacks, raises a real warning flag, because where the porn spammers innovate, everyone else follows.

Occam's razor indicates that this could be done equally well thru mail forgery of a blameless member.

> He now owns your list, at least until you figure out what's going on and nuke the subscribed address. But if you think about it, once that validation handshake is complete, there's never ANY further validation. So he can set up temporary shop, validate to his heart's content, and then later on, after all the temporary stuff is safely hidden away, spam from anywhere, safely. Because he knows the address that will get him on the list.

Bingo. This is one of the base reasons I now hand moderate my main lists. I'm looking hard at going back to a posting_authority setup (members prove themselves worthy of automatic posting (no moderator overview)), but Mailman does not currently lend itself to that model. Yet. (Using approved posted in Mailman is not sufficiently maintainable)

> If this is true, and it's beginning to look like egroups is a target of one attack, and I've heard rumors of some mailman lists being hit as well, then lists that depend on mailback validation have a problem. And I think there's been a feeling that mailbacks are the one true way of validation to the point where there hasn't been much (if any) thought about improved techniques or alternatives.

When you get down to it this is a question of trust models, and is a subset of the problem of reputational systems. It's a non-trivial problem.

> I've felt for a while that the list community was way too comfortable with mailbacks as "safe and unbeatable". I'm now seeing what I think is evidence that this is no longer true. And I'm afraid that because we have sat back and not innovated here, we're going to end up behind the eight ball. And I don't see any easy answers if I'm right -- only that if I am wrong, I won't be wrong forever.

I'm at the point where I'm willing to lay money on your being not only right, but being visibly demonstrated as right within the next calendar year.

We have two problems:

  1) Determining that a given member of a list is not a spammer.

  2) Determining that a given post is not a SPAM

The first can be largely addressed via putting in mechanisms where N moderator approved posts are required before being granted posting authority.
It's a barrier to entry technique -- not secure, but certainly not profitable for the spammer in terms of ROI. As a side comment, this is one of the features I'd like to see rolled in the next Mailman design we're discussing (given the model I'm musing, it should be trivial).

The second is a horrible nasty problem in this age of mail forgery and the ease of harvesting member addresses from lists (especially once you are a subscriber). Given that a spammer can subscribe and can then harvest addresses with (presumably) posting authority with no more than a couple hours worth of scripting and a little time waiting while his bot runs, the simple MESSAGE_FROM_XXX_IS_OKAY metric is likely to last no longer.

So what's the final solution? I don't think there is an elegant solution without involving presumed non-forgeable proofs of identity (ie public key crypto). Doing that requires a broadscale PKI structure (a horrible problem in and of itself), severe changes in user habits, and a host of other invasive non-trivial changes.

Its going to happen tho. TLS/SMTP is just not enough.

-- 
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Sat Dec 9 19:24:10 2000
From: claw@kanga.nu (J C Lawrence)
Date: Sat, 09 Dec 2000 11:24:10 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: Message from Christopher Lindsey of "Sat, 09 Dec 2000 03:09:26 CST."
<20001209030926.A26087@ncsa.uiuc.edu> References: <20001209030926.A26087@ncsa.uiuc.edu> Message-ID: <29542.976389850@kanga.nu> On Sat, 9 Dec 2000 03:09:26 -0600 Christopher Lindsey wrote: >> I'm passing this along mostly as a FYI, but also as a sanity >> check. I sent this out to list-managers tonight, to bring up an >> issue that sort of crystalized this afternoon and made me realize >> that I think we have the beginnings of a problem in mail list >> land. Your thoughts are welcome....If I'm right, well, oh, >> boy. If I'm wrong -- I'd love to find out my idea won't work, but >> I think it's not only possible, but fairly easy. > Hi Chuq, > Yes, this has definitely been troublesome. I've blocked many > commercial sites like findmail.com (egroups) and remarq.com from > my lists because of their secret archiving that displays email > addresses to the public, but at least they don't spam the lists > back. But of course anyone can browse these sites and get > addresses to their heart's content, then forge MAIL FROM: to sneak > mail into the lists. > I'm not sure what the right thing is to do. MLMs like sympa ( > http://listes.cru.fr/sympa/ > ) are definitely moving in the right direction with S/MIME > signatures/encryption and X509 user certs, but that still doesn't > stop someone from using throwaway certs to spam several lists or > from harvesting addresses. You are failing to distinguish between two problems: 1) Is this post from someone I know? 2) Is this post from who I think it is from? #1 is handled by any form of digital signature, and is handled especially well by non-centrally signed/verified forms (eg PGP, GPG, etc). All that's needed is for the list member to convey or have conveyed to you their public key. #2 is handled by having a public key for a member. It doesn't matter if it is signed or if Verisign or some other set of pains vouch for it. It merely matters that a current post from them cross-checks with their signature. 
So what problem does that leave?

  Is this post, which comes from a member for whom I have a key, and
  which checks against his key, SPAM?

It doesn't matter where the keys or certs come from. It doesn't
matter if a trusted authority is involved or not. It is a human
question. I'm sure that spammers are just as capable as we are at
getting demo certs from Thawte, or in cooking up new GPG keys willy
nilly.

Again, this is a question of trust networks, and is a subset of the
problem of reputational systems. Unfortunately, unlike the systems
needing reputational systems that I normally spend my time looking
at, this is not an area which panders to central databases, and
over-arching solutions. SPAM detection at the individual membership
level is a question of individual evaluation.

> The problem is that when these methods are used for authentication
> they just prove that the email address sending the stuff is who we
> think he or she is. But at least you can't forge the source email
> address to look like it's coming from a list member who is allowed
> to post (well, it's harder :)

It raises the bar to requiring that SPAMers compromise members' keys
in a wholesale fashion. While certainly possible (I spent an hour
last weekend seeing how many exploitable Windows systems I could
find within that hour. I gave up after about 20 minutes when the
count passed treble digits), the barrier to entry is much larger and
the ROI is much smaller (a compromised key only gains access to a
few lists and posting venues).

> I think that there's an implicit level of trust that has to be
> honored in mailing list management. Even SASL-based SMTP
> authentication from ISPs isn't going to prevent throw-away
> accounts from being used. Until we can get a fingerprint or
> cornea scan (or even a driver's license) with each mailing list
> subscription and compare it against a master database (which I'm
> not advocating), you can't be 100% sure of the users.

No.
Think in terms of trust, not identity. Do you really care that you
can track that particular membership back to one identifiable human
body? Really? Over the entire planet? Or do you really just care
that JoeBlow has posted signal in the past and you feel that you can
trust him to post signal in the future?

Or more simply, if you are not going to operate on past posting
behaviour as your trust metric:

  Why do you trust member X to not post spam?

  What are the criteria you use for making that decision?

  Why are those criteria trustable?

  What are the risks?

I like past behaviour as it is simple, non-invasive (I don't need to
know who anybody is), and it fits transparently in as an invisible
extension of traditional list moderation models.

> For now I'd say that the best method is a social one; require
> references when people want to subscribe to your list. Ask them
> which lists they participate on, an example post from another
> list, etc. But ultimately it becomes a judgement call by the
> listowner either way.

For a few years I ran a list where membership was by invitation only
-- a current member had to invite you. It worked well. Membership
grew steadily, more than 70% of the members regularly posted to the
list, and signal was high.

Later I moved the list to free subscription with posting authority
granted on application only, with applications needing to be
accompanied by a proposed first post. This too worked well with
minor caveats. List membership grew roughly twice as fast, but the
posting percentage fell to around 40%. The caveats were in
maintaining the approved poster list, and in particular determining
and enforcing policy for removing posting authority was painful.

I currently run an open subscription model with hand moderation of
all posts. Again, this has worked well. Subscription rates are
roughly 4 times higher than ever before, but posting percentages are
down in the <10% range. It is of course labour intensive.
Recently I've moved to not only hand moderating the list, but hand editing posts (eg to remove HTML, over quoting, inflammatory content, etc) marking each post I edit with a comment as to the changes. While I haven't been doing this for long, subscription rates have noticably increased by perhaps 20% (tho the sample time is small). The work load on me is non-trivial. It moved me from the position of gatekeeper to editor, a position I'm willing to occupy, but am not keen on. In the end you are trading editorial control (your trust model and signal definition) for work. Ultimately you are attempting to automate the process of determining signal. The initial approximation is by determining signal sources. The approximation we are discussing above is in determining if signal sources really are signal sources. -- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From vince@vjs.org Sat Dec 9 22:32:53 2000 From: vince@vjs.org (Vince Sabio) Date: Sat, 9 Dec 2000 17:32:53 -0500 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? In-Reply-To: References: <20001209030926.A26087@ncsa.uiuc.edu> Message-ID: ** Sometime around 09:01 -0800 12/09/2000, Darrell Fuhriman sent us: >Chuq Von Rospach writes: > > > But Murr Rhame on list-managers said something that made me think of a > > possible answer -- new subscribers automatically go into "hold for > > approval" mode. it'd be another flag in the user record (like digest > >Yes, Lyris does this. It's a feature I've wanted to add to >mailman, but haven't had the bandwidth to do it. It would be >even nicer if it were smart enough to set a threshold and as soon >as they've had that many posts approved, the restriction is >automatically lifted. Lyris actually uses this method. 
The list owner selects the number of [approved] posts that constitute the probationary period for that list, and all new members are subject to moderation for that number of approvals. I refer to this as a "semi-moderated" list. In addition, Lyris supports "spot moderation," where an individual can be moderated, either permanently or for X number of posts, by one of the list administrators. >There's a couple other things that Lyris does that would be cool, >but I can't remember what they are right now. :) Another nice feature is text/regex-based filtering; posts can be rejected with admin-customized messages based on text strings in the messages. This allows flames to be quelled pretty quickly, without resorting to moderation. It *could* be used as a spam filter, though I must admit that I do not have spam filters in place. All of my production lists are semi-moderated and post-by-subscriber-only; we've probably been spammed twice in the past 3 years, and each time it was by someone who had been posting to the list for long enough to get past the newbie-moderation threshold. I do not personally have any examples of forged-subscriber spam, but it is a risk that has bothered me for many years. I date back to the days of Kevin Lipsitz and the "Tempting Tear-Outs" spams; Kevin targeted primarily mailing lists (vs. individual addresses), and used several methods to [attempt to] subvert basic list security. Back then, few lists were moderated, even fewer required posts from subscribers only, and semi-moderation wasn't even a gleam in anyone's eye. Kevin used PAML to attack mailing lists across the 'Net, and he largely had full rampage ability on those lists. He was also pretty sharp technically (for a dork). A Lipstiz clone today would have his work cut out for him, but he could still easily subvert a list with much less effort than Chuq's domain-snatching idea: 1. Use PAML (or similar) to subscribe to discussion lists far and wide. 
Automation of subscription confirmations is a snap. 2. Collect mail from those lists, and parse & save addresses of the posters; be sure to correlate addresses with mailing lists. 3. For each list, sort the addresses in order of volume; this will help identify the prolific posters, thus helping to subvert semi-moderated lists. 4. Post via forged smtp mail from *and* header From:. Short of S/MIME and similar measures that most of us would consider to be extreme (right now, anyway; probably won't be considered extreme measures for much longer), there is little that the owner of a large, busy discussion list can do to protect his list from an attack such as this. Sure, you could moderate the list, but many of my lists see 50 to 100 posts/day, and the max I've ever had posted to a single list in one day was more than 450. That's a lot of moderation. I'd sooner buy a copy of MailShield to protect the server. Like Chuq, I shudder at the thought of someone forging subscriber addresses to spam mailing lists. - Vince From markf@wingedpig.com Sat Dec 9 23:17:15 2000 From: markf@wingedpig.com (Mark Fletcher) Date: Sat, 09 Dec 2000 15:17:15 -0800 Subject: [Mailman-Developers] FYI -- mailback validations nolonger safe? References: <20001209030926.A26087@ncsa.uiuc.edu> Message-ID: <3A32BD7B.FD1F06CE@wingedpig.com> Apologies if some of this is repeated in other posts, I haven't had a chance to read through everything yet... Chuq Von Rospach wrote: > > At 3:09 AM -0600 12/9/00, Christopher Lindsey wrote: > > > Yes, this has definitely been troublesome. I've blocked many > > commercial sites like findmail.com (egroups) and remarq.com from my > > lists because of their secret archiving that displays email addresses > > to the public, but at least they don't spam the lists back. But > > of course anyone can browse these sites and get addresses to their > > heart's content, then forge MAIL FROM: to sneak mail into the lists. 
>
> Ya know, I hadn't thought of that -- I've worked at closing off my
> list archives from the spam harvesters, but I'd never thought of the
> list archives as a source of addresses to use to spam ONTO the lists.
> (shudder). That's a real, legitimate issue, because you're basically
> handing them access.
>

A couple of quick corrections. eGroups no longer archives lists hosted
elsewhere, although there are still a few legacy lists. We stopped that
about a year ago. I also think that remarq.com has stopped that as well.

As for archives, eGroups obscures email addresses to prevent spam
harvesting. We never saw an instance of successful spam harvesting of
email addresses from the archives because of this.

... snip ...

> But Murr Rhame on list-managers said something that made me think of
> a possible answer -- new subscribers automatically go into "hold for
> approval" mode. it'd be another flag in the user record (like digest
> or nomail), and when you subscribe, it's turned on. All messages are
> held for the admin to approve. Once an admin can trust a new account,
> he turns off the flag and they post without restriction.
>

eGroups has had this for quite some time, and many listowners have had
success using it.

There are two types of spam problems with lists. One is harvesting of
email addresses, the other is sending spam directly to groups. Given
the current state of Internet email, neither can be fully addressed.
But the good news is that spammers generally are impatient, and are
looking for the biggest bang for the buck (most email addresses for
least effort). So, subscribing to a group and harvesting email
addresses by looking at the messages you receive is not popular with
spammers (in our experience). It takes too long and yields too few
addresses.
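The "hold for approval" flag quoted above, combined with the approved-post threshold mentioned elsewhere in this thread, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Member`, `ModeratedList`, and `probation_posts` are hypothetical names, not Mailman or eGroups code, and the sketch does nothing about forged From: addresses.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical user record; the flag is switched on at subscribe time."""
    address: str
    hold_for_approval: bool = True
    approved_posts: int = 0


class ModeratedList:
    """Toy list that holds posts from members still on probation."""

    def __init__(self, probation_posts=3):
        # Number of approved posts after which the flag clears (assumed policy).
        self.probation_posts = probation_posts
        self.members = {}
        self.held = []        # (address, body) pairs awaiting the admin
        self.delivered = []   # (address, body) pairs sent to the list

    def subscribe(self, address):
        address = address.lower()
        self.members[address] = Member(address)

    def post(self, sender, body):
        member = self.members.get(sender.lower())
        if member is None:
            return "rejected"             # post-by-subscriber-only
        if member.hold_for_approval:
            self.held.append((member.address, body))
            return "held"
        self.delivered.append((member.address, body))
        return "delivered"

    def approve(self, index=0):
        """Admin approves one held post; clear the flag at the threshold."""
        address, body = self.held.pop(index)
        self.delivered.append((address, body))
        member = self.members[address]
        member.approved_posts += 1
        if member.approved_posts >= self.probation_posts:
            member.hold_for_approval = False
```

With `probation_posts=2`, a new member's first two posts are held, and the third goes straight through once both have been approved.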
The biggest source of spam complaints on eGroups is the case of a spammer subscribing to a bunch of groups and then sending their spam to the groups, which if I understand correctly is what happened to your friend, Chuq. But besides the 'moderate new users' function, and the anti-cross posting features of eGroups, I'm not sure what else you can do to eliminate that problem. As an aside, I have actually seen software designed to send spam to mailing lists. It comes with a database of hundreds of lists (lots of ONElist/eGroups lists included). It assumes you have subscribed to the lists already. You compose your spam template, and it sends out individual messages to each of the groups. By doing so, it defeats the anti-cross posting feature of eGroups. It was targeted to people who subscribed to the numerous (at the time) 'make money fast' groups on eGroups and elsewhere (basically groups where subscribers spam each other). So it wasn't really a problem for our normal users. Mark From chuqui@plaidworks.com Sat Dec 9 23:20:08 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Sat, 9 Dec 2000 15:20:08 -0800 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? In-Reply-To: References: <20001209030926.A26087@ncsa.uiuc.edu> Message-ID: At 5:32 PM -0500 12/9/00, Vince Sabio wrote: >Short of S/MIME and similar measures that most of us would consider >to be extreme (right now, anyway; probably won't be considered >extreme measures for much longer), there is little that the owner of >a large, I've been mulling this over all day, and I have a couple of ideas on it. I'm sure they're not original, but they might open up concepts for MLM authors to consider. my first premise: this has to be "solved" at the MLM level. The real answer is authenticated e-mail addresses, and that implies S/MIME and all of the logistical overhead and development that implies. 
In practice, for many places, that's simply not an option until
SOMEONE figures out how to get AOL to support it, because without AOL,
a significant chunk of the audience can't do it, making it worthless
(and while individual list admins can tell AOL to take a flying leap,
the typical one won't, and many of us can't. So any MLM that comes up
with a solution that effectively locks out AOL is a MLM that dies in
the marketplace...)

my second premise: that we put the onus of managing this first on the
MLM software, second on the list admin, and third on the end user. The
higher the bar you place between your user base and the list, the
fewer will bother jumping over it. And the harder you make it for an
admin to be admin, the more likely they are to turn the bloody stuff
off, or choose a different MLM. We have to remember we're talking
about 'solving' what we see might be an emerging problem, which means
we aren't going to have admins beating down our doors screaming FIX
THIS. Instead, we have to fix it and keep it from ever BEING that kind
of problem, which means the barriers for entry and use have to be kept
minimal.

either that, or we wait until it does fall apart, and pray we can put
it back together quickly. not my idea of fun.

I've come up with two ideas that seem promising.

First is not new. It's moving the validation from the point of
subscription to the posting time (actually, do both). This involves
assigning a user an access password, which is attached to the message
they want posted. This can be strongly automated, which is good. It
only solves the forged subscriber address part, which means it's not a
complete solution, but at least it deals with the most pernicious
aspects of all of this, a harvester posting via forged addresses of
legitimate subscribers. Passwords can be pulled off a web site,
similar to what users do now when they forget a password at most sites
with registration, and have it e-mailed to the subscribed address.
It's then attached via the subject line, first line of the body,
x-header, I don't care. the MLM has to be paranoid about stripping
these passwords without overstripping legitimate content, to protect
it. At that point, we can at least get back to knowing the user
posting the message has access to the e-mail address's mailbox, which
is about as secure as we can get with e-mail. They aren't just sucking
addresses at random and re-using them. Passwords, if you want, can
time out, and if you really want, the admin can set their length, from
one-time to permanent, depending on their paranoia.

Second idea puts the onus on the list admin. There is one other
identifying piece of info we know about the poster that can't be
forged. it is the IP address of the machine that relays the mail to
your MLM machine. All of the OTHER received lines can be forged, but
the one your server adds to tell you who it got the mail from -- the
direct connection -- can't be (or you have bigger problems).

In this scheme, then, messages from a user are held for approval, and
the list admin has to teach the MLM which IP addresses to accept mail
from with that "From" address. Now, a given "From" address may relay
in from more than one address, but the list of those addresses is
finite -- so we can build an authentication list for EACH user fairly
easily, over time. The admin will be pretty busy early on, but the
main work is done by the MLM itself, and the end-user in almost all
cases doesn't have to worry about it. And we can base this on a human
teaching a machine "right" and "wrong", using a piece of known-valid
data.

There are opportunities for automation here, of course, such as
automatically validating AOL users where the SMTP relay is an AOL
machine, that can help the admin minimize their pain, but you run some
risk of opening up some holes.
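As a rough illustration of the relay-address scheme described above, the following Python sketch keeps a per-sender set of relay IPs that the admin has approved and holds anything arriving from an unknown pairing. It is not Mailman code: `RelayAllowlist`, `check`, and `approve` are hypothetical names, and `peer_ip` is assumed to be the address of the host that connected directly to the list server (so mail arriving through a backup MX would show the MX's address, not the original sender's relay).

```python
import ipaddress


class RelayAllowlist:
    """Toy per-sender list of relay IPs the admin has taught the MLM."""

    def __init__(self):
        self.known = {}   # sender address -> set of approved relay IPs
        self.held = []    # (sender, relay_ip, message) awaiting the admin

    def check(self, sender, peer_ip, message):
        """Accept if this sender has been approved from this relay, else hold."""
        sender = sender.lower()
        relay = ipaddress.ip_address(peer_ip)   # raises ValueError if malformed
        if relay in self.known.get(sender, set()):
            return "accept"
        self.held.append((sender, relay, message))
        return "hold"

    def approve(self, index=0):
        """Admin approves a held message and teaches the sender/relay pairing."""
        sender, relay, message = self.held.pop(index)
        self.known.setdefault(sender, set()).add(relay)
        return message
```

The first post from each new sender/relay pairing is held; once the admin approves it, later posts from the same pairing pass without intervention, while the same From: address arriving via an unfamiliar relay is held again.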
It seems like both approaches will work, both can be done TODAY,
without waiting for significant technological advances, client
enhancements, maturation of technologies or building of new
infrastructures, and they layer on top of what we already are doing in
reasonably non-invasive ways. I think the SMTP-relay authorization (to
some degree, a list-specific variation of the SMTP-after-POP email
setup...) has some interesting possibilities, and I wonder if there
are other pieces of data that we "know" about a user once we get the
email that we can use to validate without worrying about their
corruption or forging.

And yes, I know about TCP spoofing, but frankly, I think if spammers
get that sophisticated in their attacks, it's unlikely anything
reasonable will stop them. but I'm willing to try, and I think we
solve the cases we can solve, and continue to move forward from
there...

>busy discussion list can do to protect his list from an attack such
>as this. Sure, you could moderate the list, but many of my lists see
>50 to 100 posts/day, and the max I've ever had posted to a single
>list in one day was more than 450. That's a lot of moderation.

Moderation is a tool, but not a solution. I'd have to hire staff to do
nothing but moderate my big machine. That's the wrong way to look at
this. I'd rather hire staff to find ways to FIX it so we don't have to
put human filters in the way. the SMTP-relay IP address is nice,
because while there's some pain while you're teaching your server, and
it adds SOME continuing overhead to the admin's load due to new users,
moving users and network changes within other people's networks -- the
primary load is managed by the server, not the admin. And it doesn't
impact the end-user or require new user skills or client technologies
(or training users to apply passwords to messages, or... ) -- it's
purely server based.

>Like Chuq, I shudder at the thought of someone forging subscriber
>addresses to spam mailing lists.

it's a scary thought.
brr. -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From sam@spacething.org Sun Dec 10 06:38:07 2000 From: sam@spacething.org (Sam Stickland) Date: Sat, 9 Dec 2000 22:38:07 -0800 Subject: [Mailman-Developers] Writing a custom subscribe interface References: <20001209030926.A26087@ncsa.uiuc.edu> <29542.976389850@kanga.nu> Message-ID: <002101c0627a$23af5840$3925073e@monster> Hi, I have a need to write a custom web-subscribe interface for mailman. I need a single screen that displays lists of my choosing, marking which ones you are subscribed to, and a series of buttons for subscribing and unsubscribing to those lists. It's a little more complex than that as I want to disable certain lists from subscription if you are subscribed to others (this doesn't need to be built into mailman, just this interface. I don't care if people subscribe themselves to the other lists via email requests etc.). I'm prepared to do this myself (and in fact I'm probably going to start digging around the mailman source as soon as I've sent this) - but I'm wondering if anyone here has already done something similar to this. Could save me some time :) (Also if someone had the time to point me the bits of the mailman internals I need to achieve this it would be appreciated) Thanks, Sam From lindsey@ncsa.uiuc.edu Sat Dec 9 23:35:34 2000 From: lindsey@ncsa.uiuc.edu (Christopher Lindsey) Date: Sat, 9 Dec 2000 17:35:34 -0600 Subject: [Mailman-Developers] FYI -- mailback validations nolonger safe? In-Reply-To: <3A32BD7B.FD1F06CE@wingedpig.com>; from markf@wingedpig.com on Sat, Dec 09, 2000 at 03:17:15PM -0800 References: <20001209030926.A26087@ncsa.uiuc.edu> <3A32BD7B.FD1F06CE@wingedpig.com> Message-ID: <20001209173534.F26614@ncsa.uiuc.edu> > A couple of quick corrections. eGroups no longer archives lists hosted > elsewhere, although there are still a few legacy lists. 
We stopped that > about a year ago. I also think that remarq.com has stopped that as well. Yes, remarq appears to have stopped now. We still have some NCSA lists (well, at least one) archived at eGroups, but I suspect it's one of those legacy archives since it still has the old subscription information from almost three years ago on it. :) > As for archives, eGroups obscures email addresses to prevent spam > harvesting. We never saw an instance of successful spam harvesting of > email addresses from the archives because of this. The addresses are now obscured, but when it was done through findmail the addresses were there for the world to see. I'm not targeting eGroups or Remarq, but just listing them as examples of what can happen. In these cases, the two companies started archiving and then addressed the problems that they had created later. And that's the whole point -- you can make your server as secure as possible, hide email addresses in your archives and do anything else imaginable, but one irresponsible subscriber makes the whole setup worthless. They just need to setup an archive that doesn't hide email addresses, and voila... S/MIME or PGP signatures would of course prevent the addresses being used for spamming, but would still allow direct spam. That's why I use unique email addresses for most lists that I subscribe to; at least then I can track the origins of a spam. Coupled with an MLM that signs outbound messages, I'd be pretty spam-free since I could disregard anything that wasn't signed. [apologies for double quoting -- I don't remember the original poster] > > But Murr Rhame on list-managers said something that made me think of > > a possible answer -- new subscribers automatically go into "hold for > > approval" mode. it'd be another flag in the user record (like digest > > or nomail), and when you subscribe, it's turned on. All messages are > > held for the admin to approve. 
Once an admin can trust a new account, > > he turns off the flag and they post without restriction. It's a pretty standard feature in MLMs... Even old and crusty majordomo 1.94.x can require subscriber approval. Chris (who's thinking that maybe we should remove Spaf et al from the Cc: list?) ---------------------------------------------------------------------- Christopher Lindsey, Senior System Engineer National Center for Supercomputing Applications (NCSA) From jam@jamux.com Sun Dec 10 01:36:36 2000 From: jam@jamux.com (John A. Martin) Date: Sat, 09 Dec 2000 20:36:36 -0500 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? In-Reply-To: (Chuq Von Rospach; Sat, 09 Dec 2000 15:20:08 -0800) Message-ID: <20001210013636.9D1954800C@athene.jamux.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >>>>> "CVR" == Chuq Von Rospach >>>>> "Re: [Mailman-Developers] FYI -- mailback validations no longer safe?" >>>>> Sat, 9 Dec 2000 15:20:08 -0800 CVR> Second idea puts the onus on the list admin. There is one CVR> other identifying piece of info we know about the poster that CVR> can't be forged. it is the IP address of the machine that CVR> relays the mail to your MLM machine. All of the OTHER CVR> received lines can be forged, but the one your server adds to CVR> tell you who it got the mail from -- the direct connection -- CVR> can't be (or you have bigger problems). Would you unconditionally accept postings received at your list host from a backup MX? Once the SMTP-relay check is deployed the spammer will just relay through one of the target's MX hosts[1]. Checking back through the trace of backup mx hosts could get messy considering the variations in received header fields, no? jam Footnotes: [1] I've noticed senders that get rejected by MTA anti-spam measures try a backup MX host shortly thereafter. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.4 (GNU/Linux) Comment: OpenPGP encrypted mail preferred. 
See iEYEARECAAYFAjoy3f4ACgkQUEvv1b/iXy8LPgCdFDtLWwICvI9LJEL+dpmXqnqQ c1wAn1Y5liEbzdKzgj2+n8ZtNm8Pvw9T =mMZC -----END PGP SIGNATURE----- From claw@kanga.nu Sun Dec 10 04:47:04 2000 From: claw@kanga.nu (J C Lawrence) Date: Sat, 09 Dec 2000 20:47:04 -0800 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? In-Reply-To: Message from Vince Sabio of "Sat, 09 Dec 2000 17:32:53 EST." References: <20001209030926.A26087@ncsa.uiuc.edu> Message-ID: <13466.976423624@kanga.nu> On Sat, 9 Dec 2000 17:32:53 -0500 Vince Sabio wrote: > Like Chuq, I shudder at the thought of someone forging subscriber > addresses to spam mailing lists. I've already had posts (three times now) to my lists purportedly from members of that list (From: header etc), where the members in question deny ever having written those posts, and careful examination of the Received headers supports their claim that it wasn't them. The simple transaltion is that it sure as heck *looks* like people are already forging mail as members of lists. The fact that the posts happened to be on-topic is slightly droll and beyond my ability to explain. -- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From claw@kanga.nu Sun Dec 10 04:51:12 2000 From: claw@kanga.nu (J C Lawrence) Date: Sat, 09 Dec 2000 20:51:12 -0800 Subject: [Mailman-Developers] FYI -- mailback validations no longer safe? In-Reply-To: Message from Chuq Von Rospach of "Sat, 09 Dec 2000 15:20:08 PST." 
References: <20001209030926.A26087@ncsa.uiuc.edu>
Message-ID: <13748.976423872@kanga.nu>

On Sat, 9 Dec 2000 15:20:08 -0800
Chuq Von Rospach wrote:

> In practice, for many places, that's simply not an option until
> SOMEONE figures out how to get AOL to support it, because without
> AOL, a significant chunk of the audience can't do it, making it
> worthless (and while individual list admins can tell AOL to take a
> flying leap, the typical one won't, and many of us can't. So any
> MLM that comes up with a solution that effectively locks out AOL
> is a MLM that dies in the marketplace...)

Does anyone here have contacts at AOL? I used to know people in
their NOC and in their IS team (which doesn't really apply to this
end). I'll see if I can't dig them up to see what might be needed to
get some basic things done (kinda difficult at this time of year).

--
J C Lawrence                                       claw@kanga.nu
---------(*)                        : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Sun Dec 10 05:06:46 2000
From: claw@kanga.nu (J C Lawrence)
Date: Sat, 09 Dec 2000 21:06:46 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: Message from Chuq Von Rospach of "Sat, 09 Dec 2000 20:51:07 PST."
References: <20001209030926.A26087@ncsa.uiuc.edu> <13466.976423624@kanga.nu>
Message-ID: <14736.976424806@kanga.nu>

On Sat, 9 Dec 2000 20:51:07 -0800
Chuq Von Rospach wrote:

> At 8:47 PM -0800 12/9/00, J C Lawrence wrote:

>> The fact that the posts happened to be on-topic is slightly droll
>> and beyond my ability to explain.

> I've had that happen, and it was finally tracked to another list
> member who was pissed at the first, and attempting to ruin their
> reputation on the list. Don't downplay the underlying personal
> interactions and politics of a list, especially one with strong
> emotions, where otherwise mature people act like three year olds
> over a stale donut.
Yeah, that was my first suspicion and it really fit all the way up
to the fact that two of them were actually pretty good posts that
sparked useful and deep running threads, and the third was
non-committal to the point of being ho-hum (technically on-topic,
simply re-iterated prior traffic). The members and I in each case
just agreed to >boggle< and keep running without comment, under the
suspicion that comment would only encourage the activity.

Problem is, I know how trivially simple it is:

  $ dig python.org MX
  ...
  python.org.             23h47m13s IN MX 50 dinsdale.python.org.
  ...
  $ telnet dinsdale.python.org smtp
  HELO kanga.nu
  MAIL FROM:
  RCPT TO: mailman-developers@python.org
  DATA
  ...
  .
  QUIT

(which in one case appears to be exactly what happened)

--
J C Lawrence                                       claw@kanga.nu
---------(*)                        : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From sam@spacething.org Sun Dec 10 13:27:27 2000
From: sam@spacething.org (Sam Stickland)
Date: Sun, 10 Dec 2000 05:27:27 -0800
Subject: [Mailman-Developers] Writing a custom subscribe interface
References: <20001209030926.A26087@ncsa.uiuc.edu> <29542.976389850@kanga.nu> <002101c0627a$23af5840$3925073e@monster>
Message-ID: <00d701c062ac$f20fe9c0$3925073e@monster>

No need to reply to this. I figured it all out (and taught myself
Python in the process :) ), and everything works fine now.

Sam

----- Original Message -----
From: "Sam Stickland"
To: "Mailman development"
Sent: Saturday, December 09, 2000 10:38 PM
Subject: [Mailman-Developers] Writing a custom subscribe interface

> Hi,
>
> I have a need to write a custom web-subscribe interface for mailman. I need
> a single screen that displays lists of my choosing, marking which ones you
> are subscribed to, and a series of buttons for subscribing and unsubscribing
> to those lists.
It's a little more complex than that as I want to disable > certain lists from subscription if you are subscribed to others (this > doesn't need to be built into mailman, just this interface. I don't care if > people subscribe themselves to the other lists via email requests etc.). > > I'm prepared to do this myself (and in fact I'm probably going to start > digging around the mailman source as soon as I've sent this) - but I'm > wondering if anyone here has already done something similar to this. Could > save me some time :) > > (Also if someone had the time to point me the bits of the mailman internals > I need to achieve this it would be appreciated) > > Thanks, > > Sam From jam@jamux.com Mon Dec 11 13:01:22 2000 From: jam@jamux.com (John A. Martin) Date: Mon, 11 Dec 2000 08:01:22 -0500 Subject: [Mailman-Developers] Cookie error? Message-ID: <20001211130122.1D4DC4800C@athene.jamux.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Is there something to be done for this? - -------------- cut here ---->8 ---< head Dec 11 06:59:25 2000 admin(31509): @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ admin(31509): [----- Mailman Version: 2.0 -----] admin(31509): [----- Traceback ------] admin(31509): Traceback (innermost last): admin(31509): File "/home/mailman/scripts/driver", line 96, in run_main admin(31509): main() admin(31509): File "/home/mailman/Mailman/Cgi/private.py", line 157, in main admin(31509): cookie='archive') admin(31509): File "/home/mailman/Mailman/SecurityManager.py", line 68, in WebAuthenticate admin(31509): return self.CheckCookie(key) admin(31509): File "/home/mailman/Mailman/SecurityManager.py", line 117, in CheckCookie admin(31509): c = Cookie.Cookie(cookiedata) admin(31509): File "/home/mailman/Mailman/Cookie.py", line 509, in __init__ admin(31509): if input: self.load(input) admin(31509): File "/home/mailman/Mailman/Cookie.py", line 546, in load admin(31509): self.__ParseString(rawdata) admin(31509): File "/home/mailman/Mailman/Cookie.py", line 
573, in __ParseString admin(31509): M.set(K, apply(self.net_setfunc, (V,)), V) admin(31509): File "/home/mailman/Mailman/Cookie.py", line 421, in set admin(31509): raise CookieError("Attempt to set a reserved key: %s" % key) admin(31509): CookieError: Attempt to set a reserved key: expires admin(31509): [----- Python Information -----] admin(31509): sys.version = 1.5.2 (#1, Feb 1 2000, 16:32:16) [GCC egcs-2.91.66 19990314/Linux (egcs- admin(31509): sys.executable = /usr/bin/python admin(31509): sys.prefix = /usr admin(31509): sys.exec_prefix= /usr admin(31509): sys.path = /usr admin(31509): sys.platform = linux-i386 admin(31509): [----- Environment Variables -----] admin(31509): DOCUMENT_ROOT: /home/listproc/httpd/html/ admin(31509): SERVER_ADDR: 216.0.124.17 admin(31509): SERVER_PORT: 80 admin(31509): PATH_TRANSLATED: /home/listproc/httpd/html/child-proofing/ admin(31509): GATEWAY_INTERFACE: CGI/1.1 admin(31509): HTTP_USER_AGENT: EmailSiphon admin(31509): REMOTE_ADDR: 63.203.128.204 admin(31509): SERVER_NAME: lists.essential.org admin(31509): SCRIPT_FILENAME: /home/mailman/cgi-bin/private admin(31509): HTTP_ACCEPT: www/source, text/html, video/mpeg, image/jpeg, image/x-tiff, image/x-rgb, image/x-xbm, image/gif, */*, application/postscript admin(31509): REQUEST_URI: /mailman/private/child-proofing/ admin(31509): QUERY_STRING: admin(31509): SERVER_PROTOCOL: HTTP/1.0 admin(31509): PATH_INFO: /child-proofing/ admin(31509): HTTP_HOST: lists.essential.org admin(31509): REQUEST_METHOD: GET admin(31509): SERVER_SIGNATURE:

Apache/1.3.14 Server at lists.essential.org Port 80

admin(31509): SCRIPT_NAME: /mailman/private admin(31509): SERVER_ADMIN: root@localhost admin(31509): SERVER_SOFTWARE: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_perl/1.23 admin(31509): PYTHONPATH: /home/mailman admin(31509): HTTP_COOKIE: browser=3FCB80CC3A34BF2E; expires=Tue, 31-Dec-2002 05:00:00 GMT; domain=.zdnet.com; path=/ admin(31509): REMOTE_PORT: 4868 - ---- 8<------- cut here ----------> tail jam -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.4 (GNU/Linux) Comment: OpenPGP encrypted mail preferred. See iEYEARECAAYFAjo00AkACgkQUEvv1b/iXy+e2ACghQLA7YjnTJd694ZkvNaC/0kC JAAAoJxIM+E3kMreQPtzqtKenh+rPTi4 =Uzey -----END PGP SIGNATURE----- From lohner@debian.org Mon Dec 11 15:03:45 2000 From: lohner@debian.org (Nils Lohner) Date: Mon, 11 Dec 2000 16:03:45 +0100 Subject: [Mailman-Developers] issues... Message-ID: <200012111503.QAA16804@topaze.ecf.teradyne.com> 1. The scripts should probably check if you have sufficient privileges before launching... /etc/mailman > newlist rt-general Traceback (innermost last): File "/usr/sbin/newlist", line 227, in ? main() File "/usr/sbin/newlist", line 122, in main os.setgid(MAILMAN_GID) OSError: [Errno 1] Operation not permitted /etc/mailman > sudo newlist rt-general Enter the email of the person running the list: 2. General list information can be found at _the_mailing_list_overview_page _. When clicking on that link (in Netscape, at least...) and not having changed the name in /etc/mailman/mm_cfg.py it tries to contact http://www.localhost.com/cgi-bin/mailman/listinfo I don't know where the 'www.' and '.com' are coming from, but they shouldn't be there. After changing localhost to the hostname it worked fine. THanks, Nils. 
From marc_news@valinux.com Mon Dec 11 16:25:38 2000 From: marc_news@valinux.com (Marc MERLIN) Date: Mon, 11 Dec 2000 08:25:38 -0800 Subject: [Mailman-Developers] about qrunner and locking In-Reply-To: <20001209080348.A14951@marc.merlins.org>; from marc_news@valinux.com on Sat, Dec 09, 2000 at 08:03:49AM -0800 References: <20001207162234.D25463@marc.merlins.org> <14897.100.883156.91474@anthem.concentric.net> <20001209080348.A14951@marc.merlins.org> Message-ID: <20001211082538.H29247@marc.merlins.org> > > MM> But then comes the question: why does qrunner have to modify > > MM> the list's config.db when it ships a message? I suppose the > > MM> relevant piece of code in qrunner is: > > > > | try: > > | keepqueued = dispose_message(mlist, msg, msgdata) > > | # Did the delivery generate child processes? Don't store them in > > | # the message data files. > > | kids = msgdata.get('_kids') > > | if kids: > > | allkids.update(kids) > > | del msgdata['_kids'] > > | if not keepqueued: > > | # We're done with this message > > | dequeue(root) > > > > MM> but I have to admit to not understanding what it does. > > > > This isn't directly related to your problem, but some pipeline modules > > can create subprocesses, although the only one that does this > > currently is ToUsenet.py. This code makes sure that all those > > children are waited on so they don't zombie. What /really/ ought to > > happen is that there is a separate queue for usenet postings since > > once the message is prepared for usenet, it doesn't need to touch the > > list database again. > > I read the other messages with interest (thanks to all those who > contributed), so let me ask: what happens if I remove the piece of code > above, and just not lock the config.db at all in qrunner? 
> (in my case, I will not be doing usenet gatewaying, so the children problem > doesn't seem to apply to me) I'm not asking for a money back garantee here, don't be afraid to speak up :-) In other words, is there a good chance that it will work, or on the contrary do you think that I'll probably break things horribly by doing this? (I am assuming that I can remove locking of config.db after removing the piece of code shown above, or better yet, leave the code alone, just comment out the locking, and make sure I don't do usenet gatewaying) Thanks Marc -- Microsoft is to operating systems & security .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key From claw@kanga.nu Tue Dec 12 01:17:51 2000 From: claw@kanga.nu (J C Lawrence) Date: Mon, 11 Dec 2000 17:17:51 -0800 Subject: [Mailman-Developers] (no subject) Message-ID: <31823.976583871@kanga.nu> I'm working on a design proposal for Mailman v3 and have arrived at a few questions: 1) Is there a GPL distributed queue processing system ala IBM's MQ about? I've not been able to find one. The specific problem I'm attempting to solve in a performant manner is: How does a given delivery process on a given system obtain access to a message it is qualified to deliver with a minimum of lock contention or other overhead while preventing any other process from attempting to also access the message it has acquired from that point forward, but while also allowing another process to recover that message should the original process or host fail? Note that the queue directory and the delivery process may be on different systems (eg NFS) and that there may be multiple competing queue delivery processes on any given system as well as across multiple systems. NB Needs to be GPL as Mailman is a GNU project. 2) How much interest is there in optionally supporting VERP? 
I've no interest in making Mailman v3 VERP-only, but I'd like to see it as an option, and specifically, an option which can be turned on and off dynamically on a configurable percentage of posts basis (eg N% of the messages exploded from an arbitrary list post are VERPed). I've got the basics worked out (assuming the local MTA supports plus+addressing), but some of the smaller bits are fiddly.

3) The current design requires messages to have MessageIDs which are unique within the range of MessageIDs currently within the new Mailman queue system (I'm using MessageIDs as the tag of choice for moderation by email). To do this, I'm proposing that Mailman do a couple of new things:

  1) Insert MessageID headers with created values in messages that don't contain any MessageID.

  2) Detect collisions within its rather small/arbitrary window, and auto-discard/reject subsequent messages with a duplicate MessageID. This would not be a rigorous dupe check, but would only check for dupes against the messages already in the Mailman queue (ie received and not yet sent back out).

Any problems in messing with MessageIDs in this way? I'm not keen on MessageID re-writing, however it's worth noting that the above two rules would only come into effect when the mail system has already broken down at some level (MUA emitting non-unique or no IDs, mail dupes, etc).

4) While it seems a subtle small point, it's bugging me. Given user account support, and messages to a given user bouncing, should that user be unsubscribed from only that list, or from all lists at that site? Where this is actually bugging me most is for virtual domains and whether or not lists in a virtual domain should be transparent or opaque to a bounce on a list in a different virtual domain?

The admin in me says, "Hell yes!" The commercial reality nut in me demurs (think about list hosting for small companies and their PR image given transparent virtual hosting).
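The two Message-ID rules in point 3 can be sketched roughly as follows. This is only an illustration in present-day Python, not code from Mailman: the `QueueWindow` class, its in-memory set, and the `lists.example.invalid` domain are hypothetical names made up for the example, and a real queue would presumably keep this state on disk next to the queue files.

```python
from email import message_from_string
from email.utils import make_msgid


class QueueWindow:
    """Hypothetical tracker of the Message-IDs currently held in the queue."""

    def __init__(self):
        self._ids = set()

    def admit(self, msg):
        """Return True if msg may enter the queue, False if it is a dupe."""
        msgid = msg["Message-ID"]
        if msgid is None:
            # Rule 1: create an ID for a message that arrived without one.
            # The domain is a placeholder assumption for this sketch.
            msgid = make_msgid(domain="lists.example.invalid")
            msg["Message-ID"] = msgid
        if msgid in self._ids:
            # Rule 2: collision with a message still in the queue.
            return False
        self._ids.add(msgid)
        return True

    def release(self, msg):
        """Forget the ID once the message has been sent back out."""
        self._ids.discard(msg["Message-ID"])
```

Because `release` drops the ID as soon as the message leaves the queue, the check only covers the small window described above; a later message reusing the same ID would be admitted again.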
For those interested the basic model is built upon arbitrary process queues and pipes. Also please remember I'm coming up with a proposal, not a mandate (ie Barry etc haven't seen or commented on this yet). -- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From ken@kyler.com Tue Dec 12 01:28:26 2000 From: ken@kyler.com (Ken Kyler) Date: Mon, 11 Dec 2000 20:28:26 -0500 Subject: [Mailman-Developers] (no subject) In-Reply-To: <31823.976583871@kanga.nu> Message-ID: > 4) While it seems a subtlesmall point, its bugging me. Given user > account support, and messages to a given user bouncing, should > that user be unsubscribed from only that list, or from all > lists at that site? Where this is actually bugging me most is > for virtual domains and whether or not lists in a virtual > domains should be transparent or opaque to a bounce on a list > in a different virtual domain? > > The admin in me says, "Hell yes!" The commercial reality nut > in me demurrs (think about list hosting for small companies and > their PR image given transparent virtual hosting). I'd like a per-list option to do just that. Some virtual hosts might like that option and others won't. Ken Kyler From claw@kanga.nu Tue Dec 12 01:42:34 2000 From: claw@kanga.nu (J C Lawrence) Date: Mon, 11 Dec 2000 17:42:34 -0800 Subject: [Mailman-Developers] (no subject) In-Reply-To: Message from "Ken Kyler" of "Mon, 11 Dec 2000 20:28:26 EST." References: Message-ID: <1328.976585354@kanga.nu> On Mon, 11 Dec 2000 20:28:26 -0500 Ken Kyler wrote: >> 4) While it seems a subtlesmall point, its bugging me. Given >> user account support, and messages to a given user bouncing, >> should that user be unsubscribed from only that list, or from all >> lists at that site? 
Where this is actually bugging me most is >> for virtual domains and whether or not lists in a virtual domains >> should be transparent or opaque to a bounce on a list in a >> different virtual domain? >> >> The admin in me says, "Hell yes!" The commercial reality nut in >> me demurrs (think about list hosting for small companies and >> their PR image given transparent virtual hosting). > I'd like a per-list option to do just that. Some virtual hosts > might like that option and others won't. Awwww crap. I was kinda hoping to cut out another level of abstraction. (Must remember to add subject line next time). -- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From chuqui@plaidworks.com Tue Dec 12 02:26:26 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Mon, 11 Dec 2000 18:26:26 -0800 Subject: [Mailman-Developers] (no subject) In-Reply-To: <31823.976583871@kanga.nu> References: <31823.976583871@kanga.nu> Message-ID: At 5:17 PM -0800 12/11/00, J C Lawrence wrote: > 1) Is there a GPL distributed queue processing system ala IBM's MQ > about? I've not been able to find one. wehn I evaluated it a while back, it wasn't stable on solaris, but it had the functionality I wanted. > 2) How much interest is there in optionally supporting VERP? strong here. I did some testing in the last week on a system that is effectively VERPing stuff (but written in perl), which gave me a good idea how to ramp it up to fast volume and get away from the DNS delays and SMTP stuff. it could easily turn into an optional module. > 1) Insert MessageID headers with created values in messages > that don't contain any MessageID. that's no problem, although in theory, the MTA should do it for you. The only way I can think of this (if everyone acts properly) happening is someone somehow delivering a message to Mailman that never touches an MTA. I'm not sure that's possible. 
> 2) Detect collisions within its rather small/arbitrary
> window, and auto-discard/reject messages subsequent messages
> with a duplicate MessageID. This would not a rigorous dupe
> check, but would only check for dupes against the messages
> already in the Mailman queue (ie received and not yet sent
> back out).

It's not that expensive to keep a hash of message IDs, where the key is the Message-ID and the value is a timestamp. And, say, once a day, you delete records where the timestamp is older than (configurable) days. If you're going to dupe check at all, why not do it for real?

> Any problems in messing with MessageIDs in this way?

not that I can think of.

>(MUA emitting
> non-unique or no IDs, mail dupes, etc).

it's not the MUA that's responsible for Message-IDs, it's the MTA. And per the RFCs, every MTA is responsible for adding one if it finds it's missing. So the only way I can see Mailman ever seeing a message without a Message-ID is a local user who somehow uses a local delivery tool like binmail to deliver a posting without it seeing the local MTA, or if the local MTA is broken. Or if it's a forgery inserted directly into the system somehow. All of which imply the local system is broken or compromised, so IMHO, Mailman doesn't need to really worry about it.

> 4) While it seems a subtlesmall point, its bugging me. Given user
> account support, and messages to a given user bouncing, should
> that user be unsubscribed from only that list, or from all
> lists at that site?

I unsubscribe from the site. I'm sure at some point, an email sent from A might bounce and still be valid if sent from B, but that case is so rare I wouldn't think of wasting time on it, because the only way I can see that happen (minus broken systems, of course) is someone who decides to try to unsubscribe by blocking a list, instead of following the directions. And I don't see we need to write code into mailman to help users not follow the instructions....
(grin) > Where this is actually bugging me most is > for virtual domains and whether or not lists in a virtual > domains should be transparent or opaque to a bounce on a list > in a different virtual domain? since we've talked about a single data store for subscriber data, I think you do it globally. If they really want opaqueness across virtual domains, run mujltiples copies of Mailman. that'll still be an option, after all. >For those interested the basic model is built upon arbitrary process >queues and pipes. which is a nice system -- it's how I finally did my big muther list server, but instead of gnu queue, I'm using QPS. -- Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com) Apple Mail List Gnome (mailto:chuq@apple.com) We're visiting the relatives. Cover us. From claw@kanga.nu Tue Dec 12 03:16:28 2000 From: claw@kanga.nu (J C Lawrence) Date: Mon, 11 Dec 2000 19:16:28 -0800 Subject: [Mailman-Developers] (no subject) In-Reply-To: Message from Chuq Von Rospach of "Mon, 11 Dec 2000 18:26:26 PST." References: <31823.976583871@kanga.nu> Message-ID: <10142.976590988@kanga.nu> On Mon, 11 Dec 2000 18:26:26 -0800 Chuq Von Rospach wrote: > At 5:17 PM -0800 12/11/00, J C Lawrence wrote: >> 1) Is there a GPL distributed queue processing system ala IBM's >> MQ about? I've not been able to find one. > > wehn I evaluated it a while back, it wasn't stable on solaris, but > it had the functionality I wanted. Yeah, I just spent some time playing around there. Its not encouraging right now. >> 1) Insert MessageID headers with created values in messages that >> don't contain any MessageID. > that's no problem, although in theory, the MTA should do it for > you. The only way I can think of this (if everyone acts properly) > happening is someone somehow delivering a message to Mailman that > never touches an MTA. I'm not sure that's possible. Not exactly. 
My architecture has the ability to create messages internally that are then passed back thru the processing system. I'm not interested in passing back out to the MTA (wasted cycles and need to know what machine has a valid MTA on it), or in generating IDs at the point of message generation (which is a template), so I'd rather just punt and just build IDs when I need them. >> 2) Detect collisions within its rather small/arbitrary window, >> and auto-discard/reject messages subsequent messages with a >> duplicate MessageID. This would not a rigorous dupe check, but >> would only check for dupes against the messages already in the >> Mailman queue (ie received and not yet sent back out). > It's not that expensive to keep a hash of message IDs, where the > key is the Message-ID, the value is a timestamp. And, say, once a > day, you delete records where the timestamp is older than > (configurable) days. If you're gong ot dupe check at all, why not > do it for real? I could. At this point the ONLY reason I'm interested in message IDa is for the moderation interface which needs to be assured that no two messages in the moderation queue for a given list have the same ID. I guess a little DBM file wouldn't hurt, but I don't think I'll spec it. >> (MUA emitting non-unique or no IDs, mail dupes, etc). > it's not the MUA that's responsible for message-iid's, it's the > MTA. Oops, you're right. I forgot that. >> 4) While it seems a subtle small point, its bugging me. Given >> user account support, and messages to a given user bouncing, >> should that user be unsubscribed from only that list, or from all >> lists at that site? > I unsubscribe from the site. 
I'm sure at some point, an email sent > from A might bounce and still be valid if sent from B, but that > case is so rare I wouldn't think of wasting time on it, because > the only way I can see taht happen (minus broken systems, of > course) is someone who decides to try to unsubscribe by blocking a > list, isntead of following the directions. And I don't see we need > to write code into mailman to help users not follow the > instructions.... (grin) I kinda like the way you think. >> Where this is actually bugging me most is for virtual domains and >> whether or not lists in a virtual domains should be transparent >> or opaque to a bounce on a list in a different virtual domain? > since we've talked about a single data store for subscriber data, > I think you do it globally. If they really want opaqueness across > virtual domains, run mujltiples copies of Mailman. that'll still > be an option, after all. Neater. >> For those interested the basic model is built upon arbitrary >> process queues and pipes. > which is a nice system -- it's how I finally did my big muther > list server, but instead of gnu queue, I'm using QPS. I'm ending up with a sort of pseudo-queue model. Still a "list-mom" cron job, but it works by orchestrating a series of arbitrary process pipes as orphaned children. I'd like to go for a full queue implementation, but I think the culture shock and overhead for the small case might be a bit much. -- J C Lawrence claw@kanga.nu ---------(*) : http://www.kanga.nu/~claw/ --=| A man is as sane as he is dangerous to his environment |=-- From chuqui@plaidworks.com Tue Dec 12 03:27:06 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Mon, 11 Dec 2000 19:27:06 -0800 Subject: [Mailman-Developers] (no subject) In-Reply-To: <10142.976590988@kanga.nu> References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> Message-ID: At 7:16 PM -0800 12/11/00, J C Lawrence wrote: > >Not exactly. 
My architecture has the ability to create messages
>internally that are then passed back thru the processing system.

oh, yeah. duh.

>I kinda like the way you think.

that should scare you...

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From claw@kanga.nu Tue Dec 12 03:51:32 2000
From: claw@kanga.nu (J C Lawrence)
Date: Mon, 11 Dec 2000 19:51:32 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Mon, 11 Dec 2000 19:27:06 PST."
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu>
Message-ID: <13025.976593092@kanga.nu>

On Mon, 11 Dec 2000 19:27:06 -0800
Chuq Von Rospach wrote:

> At 7:16 PM -0800 12/11/00, J C Lawrence wrote:

>> Not exactly. My architecture has the ability to create messages
>> internally that are then passed back thru the processing system.

> oh, yeah. duh.

>> I kinda like the way you think.

> that should scare you...

According to my wife you should be terrified about now.

FWLIW I'm working on the following leading notes:

----

Assumption: The localhost is Unix-like.

---

ObTheme: All config files should be human readable unless those files are dynamically created and contain data which will be easily and automatically recreated.

ObTheme: Unless a data set is inherently private to Mailman, Mailman will not mandate a storage format or location for that data set, and will allow that data set to be the result of a locally defined arbitrary replaceable process.

ObTheme: Every single program or process may be replaced with something else, as long as that other thing accepts the same inputs, generates outputs within spec, and performs a somewhat similar function.

---

There are basically three approaches to scalability in use for this sort of application:

  1) Using multiple simultaneous processes/threads to parallelise a given task.
  2) Using multiple systems running parallel to parallelise a given task.

  3) Using multiple systems, each one dedicated to some portion(s) or sub-set of the overall task (might be all working in parallel on the entire problem (lock contention! failure modes!)).

The intent is to be able to transparently support all three models on a per-list basis or per-installation basis or some arbitrary mix of the two (some sections of the problem for some lists handled by dedicated systems, other sections of the problem for all the other lists handled either by a different pool of systems or processes).

---

Observation: MLMs are primarily IO bound devices, and are specifically IO bound on output. Internal processing on mail servers, even given crypto authentication and expensive membership generation processes (eg heavy SQL DB joins etc), is an order of magnitude smaller problem than just getting the outbound mail off the system.

Consider a mid-size list of 1K members. It is a busy list and receives 500 messages a day, each of which is exploded to all 1K members:

  -- That's 500 authentication cycles per day.
  -- That's 500 membership list generations.
  -- That's 500,000 outbound messages
  -- That's 500,000/MAX_RCPT_TOS SMTP transactions

Even given a MAX_RCPT_TOS of 500 (a bit large in my mind) that's 1K high latency multi-process SMTP transactions versus 500 crypts or SQL queries.

---

Observation: In the realm of MLM installations there are two end points to the scalability problem:

  1) Sites with lists with very large numbers of members

  2) Sites with large numbers of lists which have few members.

Sites with large numbers of lists with large numbers of members (and presumably large numbers of messages per list) are the pessimal case, which is not one Mailman is currently targeting to solve.

In the first case the MLM is outbound bound. The second case may be local storage IO bound as it spends significant time walking local filesystems during queue processing while the outbound IO rates are comparatively small (and unbursty). Possibly. SourceForge falls into the second case.

---

Observation: Traffic bursts are bad. Minimally the MLM should attempt to smooth out delivery rates to a given MTA to be no higher than N messages/time. This doesn't mean the MLM doesn't deliver mail quickly, just that in the case of a mail burst (suddenly 20 million messages sitting in the outbound queue), the MLM will give the MTA the opportunity to try and react intelligently rather than overwhelming it near instantly with all 20M messages dumped in the MTA spool over 30 seconds while the spool filesystem gags.

---

There are five basic transition points for a message passing thru a mailing list server:

  1) Receipt of message by local MTA
  2) Receipt by list server
  3) Approval/editing/moderation
  4) Processing of message and emission of any resultant message(s)
  5) Delivery of message to MTA for final delivery.

#1 is significant only because we can rely on the MTA to distinguish between valid list-related addresses and non-list addresses.

#2 is just that. The message is received by the MLM and put somewhere where it will later be processed. The intent is that this is a lightweight LDA process that does nothing but write queue files. The MLM's business is to make life as easy as possible on the MTA. This is part of that.

#3 Mainly occurs for moderation, and includes editing, approval, authentication, and any other requisite steps. The general purpose of this step is to determine what (if any) subsequent processing there will be of this message.

#4 Any requisite processing on the message occurs, and any messages generated by that processing are placed in the outbound queue.

#5 An equivalent to the current queue runner process empties the queue by creating SMTP transactions for the entries in the queue.
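As a rough illustration of how step #5 could respect the traffic-burst observation above (no more than N messages per unit of time to a given MTA), here is a minimal sketch in present-day Python. Nothing in it comes from Mailman itself: `paced_batches`, `drain`, and the `hand_to_mta` callback are hypothetical names, and a real queue runner would read queue files rather than an in-memory list.

```python
import time


def paced_batches(items, max_per_tick):
    """Yield successive batches of at most max_per_tick queue entries."""
    if max_per_tick < 1:
        raise ValueError("max_per_tick must be at least 1")
    for start in range(0, len(items), max_per_tick):
        yield items[start:start + max_per_tick]


def drain(items, max_per_tick, tick_seconds, hand_to_mta, sleep=time.sleep):
    """Hand entries to the MTA, pausing tick_seconds between batches.

    hand_to_mta is a caller-supplied function (an assumption of this
    sketch) that performs the actual delivery of one entry.
    Returns the number of batches that were sent.
    """
    batches = 0
    for batch in paced_batches(items, max_per_tick):
        if batches:
            # Wait before every batch except the first, so a burst is
            # spread out instead of landing in the MTA spool at once.
            sleep(tick_seconds)
        for item in batch:
            hand_to_mta(item)
        batches += 1
    return batches
```

Passing `sleep` as a parameter is only there so the pacing can be exercised without real delays; with the default it simply blocks between batches.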
The basic view I'm taking of the list server is that it is a staged sequence of processes, each invoked distinctly, orchestrated in the background by cron.

Note: Bounce processing and request processing are not detailed at this point as their rate of occurrence outside of DoS attacks is comparatively low and they are far cheaper than list broadcasts in general.

---

List processing is a sequence of accepting a message, performing various operations on it which cause state changes to the message and the list processing system, and optionally emitting some number of messages at the end.

As such this lends itself to process queues and process pipes.

---

We don't want an over-arching API, or the attempt to solve the entire problem with either one hammer, or one sort of hammer. The intent is to build something that the end user/SysAdm can adapt to his local installation without either stretching or breaking the model, and without needing to build an installation which is necessarily structurally very different from either the very light weight single machine small list system, or the larger EGroups/Topica equivalent.

By using process queues based on canonical names in known filesystem locations and pre-defined data exchange formats between processes we can make the processes themselves arbitrary black boxes so long as they accept the appropriate inputs and generate the expected output.

----

--
J C Lawrence                                       claw@kanga.nu
---------(*)                        : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From root@theporch.com Tue Dec 12 04:03:40 2000
From: root@theporch.com (Phillip Porch)
Date: Mon, 11 Dec 2000 22:03:40 -0600 (CST)
Subject: [Mailman-Developers] Bug in Mailman version 2.1a1 (fwd)
Message-ID:

This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info.
------=_NextPart_000_0000_01C063A3.9CBFABC0
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-ID:

Got this message when trying to click on the admin page for one of my lists.

--
Phillip P. Porch NIC:PP1573 finger for
http://www.theporch.com UTM - 16 514546E 3994565N GnuPG key

---------- Forwarded message ----------
Date: Mon, 11 Dec 2000 18:53:12 -0600
From: Phillip Porch
To: Phillip Porch
Subject: Bug in Mailman version 2.1a1

------=_NextPart_000_0000_01C063A3.9CBFABC0
Content-Type: TEXT/HTML; NAME="Bug in Mailman version 2.1a1.htm"
Content-Transfer-Encoding: QUOTED-PRINTABLE
Content-ID:
Content-Description:
Content-Disposition: ATTACHMENT; FILENAME="Bug in Mailman version 2.1a1.htm"

Bug in Mailman version 2.1a1


We're sorry, we hit a bug!

If you would like to help us identify the problem, please email a copy of this page to the webmaster for this site with a description of what happened. Thanks!

Traceback:

Traceback (most recent call last):
  File "/home/mailman/scripts/driver", line 105, in run_main
    main()
  File "/home/mailman/Mailman/Cgi/admin.py", line 67, in main
    FormatAdminOverview(_('No such list %s') % listname)
TypeError: not all arguments converted

Python information:

Variable	Value
`sys.version`	2.0b2 (#6, Oct 7 2000, 22:07:24) [C]
`sys.executable`	/usr/local/bin/python
`sys.prefix`	/usr/local
`sys.exec_prefix`	/usr/local
`sys.path`	/usr/local
`sys.platform`	sco_sv3

Environment variables:

Variable	Value
`DOCUMENT_ROOT`	/home
`SERVER_ADDR`	207.234.31.38
`HTTP_ACCEPT_ENCODING`	gzip, deflate
`SERVER_PORT`	80
`PATH_TRANSLATED`	/home/homebrew
`REMOTE_ADDR`	207.234.31.45
`UNIQUE_ID`	OjWeyc-qHyYAAAVukiM
`HTTP_ACCEPT_LANGUAGE`	ie-ee,en-us;q=0.5
`GATEWAY_INTERFACE`	CGI/1.1
`SERVER_NAME`	sco.theporch.com
`TZ`	CST6CDT
`HTTP_USER_AGENT`	Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
`QUERY_STRING`
`HTTP_ACCEPT`	image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, */*
`REQUEST_URI`	/mailman/admin/homebrew
`REMOTE_PORT`	1205
`SCRIPT_FILENAME`	/home/mailman/cgi-bin/admin
`SCRIPT_URL`	/mailman/admin/homebrew
`HTTP_HOST`	www.theporch.com:8080
`REQUEST_METHOD`	GET
`SERVER_SIGNATURE`	Apache/1.3.14 Server at sco.theporch.com Port 80
`SCRIPT_URI`	http://sco.theporch.com/mailman/admin/homebrew
`SCRIPT_NAME`	/mailman/admin
`SERVER_ADMIN`	root@sco.theporch.com
`SERVER_SOFTWARE`	Apache/1.3.14 (Unix) PHP/4.0.3pl1 mod_ssl/2.7.1 OpenSSL/0.9.7-dev
`PYTHONPATH`	/home/mailman
`PATH_INFO`	/homebrew
`SERVER_PROTOCOL`	HTTP/1.1
`HTTP_CONNECTION`	Keep-Alive
From root@theporch.com Tue Dec 12 04:07:36 2000
From: root@theporch.com (Phillip Porch)
Date: Mon, 11 Dec 2000 22:07:36 -0600 (CST)
Subject: [Mailman-Developers] Bug in Mailman version 2.1a1 (fwd)
In-Reply-To:
Message-ID:

On Mon, 11 Dec 2000, Phillip Porch wrote:

> Date: Mon, 11 Dec 2000 22:03:40 -0600 (CST)
> From: Phillip Porch
> To: mailman-developers@python.org
> Subject: [Mailman-Developers] Bug in Mailman version 2.1a1 (fwd)
>
> Got this message when trying to click on the admin page for one of my
> lists.

Hate to follow up my post.... I had a typo in my URL I was trying to go to... Duuuh. Sorry to waste the bandwidth. Please ignore my previous post. Things are working fine.

--
Phillip P. Porch NIC:PP1573 finger for http://www.theporch.com UTM - 16 514546E 3994565N GnuPG key

From chuqui@plaidworks.com Tue Dec 12 04:49:36 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Mon, 11 Dec 2000 20:49:36 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: <13025.976593092@kanga.nu>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu>
Message-ID:

At 7:51 PM -0800 12/11/00, J C Lawrence wrote:

>ObTheme: All config files should be human readable unless those
>files are dynamically created and contain data which will be easily
>and automatically recreated.

ObTheme: All configuration should be possible via the web, even if the system is misconfigured and non-functional. Anything that can NOT be safely reconfigured without breaking the system should not be configurable via the web. (in other words, anything you can change, you should be able to change remotely, unless you can break the system. If you can break the system, you shouldn't be allowed near it trivially...)

> 1) Using multiple simultaneous processes/threads to parallelise
>    a given task.
>
> 2) Using multiple systems running parallel to parallelise a given
>    task.
>
> 3) Using multiple systems, each one dedicated to some portion(s)
>    or sub-set of the overall task (might be all working in
>    parallel on the entire problem (lock contention! failure
>    modes!)).

that's my model perfectly, although I think 2 and 3 are reversed. it's cleaner architecturally to go to divesting and distributing functionality before 'clustering'. In fact, I'm not sure clustering (which I'll use to term multiple mailman systems running in parallel) implies a system really, really large, when you realize that the primary resource eaters (like delivery) can effectively be infinitely distributed.

I'm not sure how big a Mailman system you'd need to require parallelizing the core process, as long as you can divest off other pieces to a farm that could grow without bounds. So maybe we don't need that next (complicated) step, and make it parallelized and distributable for everything except that core control process, but manage the complexity of that control process to keep everything out of it except the absolute necessity.

>Observation: MLMs are primarily IO bound devices, and are
>specifically IO bound on output. Internal processing on mail
>servers, even given crypto authentication and expensive membership
>generation processes (eg heavy SQL DB joins etc) are an order of
>magnitude smaller problem than just getting the outbound mail off
>the system.

some of that is the MUA's problem, actually, but they get tied together. you don't, for instance, want an MLM who will dump 50K pieces of email an hour into the queues of an MUA that can only process 40K...

But in general, you're correct. Especially if you define DNS delays and SMTP protocol delays caused by the receiving machine to be "output" (grin)

>Sites with large numbers of lists with large numbers of members (and
>presumably large numbers of messages per list) are the pessimal
>case, and is not one Mailman is currently targeting to solve.

but if you define the distribution capabilities correctly, this case is solved by throwing even more hardware at it, and the owners of this pessimal case presumably have a budget for it. If you see someone trying to run Sourceforge on a 486 and a 128K DSL line, you laugh at them.

>Observation: Traffic bursts are bad. Minimally the MLM should
>attempt to smooth out delivery rates to a given MTA to be no higher
>than N messages/time.

The obverse of that is that end-users seriously dislike delays, especially on conversational lists. It turns into the old "user expectation" problem -- it's better to hold ALL mail for 15 minutes so users come to expect it than to normally deliver mail in 2 minutes, except during the worst bulges...

But in general, the MLM should deliver as fast as it reasonably can without overloading the MUA, which implies some kind of monitoring setup for the MUA, or some user-controlled throttling system. the latter unfortunately, implies teaching admins how to monitor and adjust, a support issue. The former implies writing an interface for every MTA -- a development AND support issue.

>20Million messages sitting in the outbound queue), that the MLM will
>give the MTA the opportunity to try and react intelligently rather
>than overwhelming it near instantly with all 20M messages dumped in
>the MTA spool over 30 seconds while the spool filesystem gags.

I will not make comments about qmail. I will not make comments about qmail. I will be good. I will be good. (grin)

> 1) Receipt of message by local MTA

1a) passthrough of message via a security wrapper from MTA to list server... (I think it's important we remember that, because we can't lose it, and it involves a layer of passthrough and a process spawning, so it's somewhat heavyweight -- but indispensable)

> 2) Receipt by list server
> 3) Approval/editing/moderation
> 4) Processing of message and emission of any resultant message(s)
> 5) Delivery of message to MTA for final delivery.

6) delivery of message to non-MTA recipients (the archiver, the logging thing, the digester, the bounce processor....)

>#1 is significant only because we can rely on the MTA to
>distinguish between valid list-related addresses and non-list
>addresses.

although one thing I've toyed with is to give a subdomain to the MLM, and simply pass everything to it (in sendmail terms, using virtusertable to pass @list.foo.bar to mailman@foo.bar). Then you take the MLM out of having to know what lists exist and administrative needs to keep that interface in sync. The downside is it doesn't fit the design of some users (but that can be fixed by education if we can prove why it's better), and you get into having to handle some MTA functions, such as DSN compatible bounce messages.

I've more or less decided that when I rewrite my internal corporate mail list, I'll do that rather than generate alias listings (for, oh, 12,000 groups) and the hassles and overheads of all that. That'll be especially useful if we do what I hope, which is set it up so the server has no data at all, but authenticates via LDAP to get list information on demand out of the corporate databases. There are some definite advantages to not knowing whether something exists until the need to know exists -- and as Mailman starts edging towards interfacing to non-Mailman data sources for list information, that ability grows in importance.

6) is the processing needed to support other functions that act on messages. The idea is that instead of delivering to the MTA, we have a suite of functions that deliver the message to whatever needs to process it. Those can be asynchronous and don't need to be as timely as (5), and have different enough design needs that I split them out from the MTA delivery (although traditionally, stuff like digests are managed by doing an MTA transfer out of the MLM and back in to a different program...)

It also assumes that these non-delivery things are separate processes from the act of making them available to those things, to keep (6) lightweight as possible.

>Note: Bounce processing and request processing are not detailed at
>this point as their rate of occurrence outside of DoS attacks is
>comparatively low and are far cheaper than list broadcasts in
>general.

and besides, they are basically independent, asynchronous processes that don't need to be managed by any of the core logic, other than handing messages into their queue and making sure they stay running. same with, IMHO, storing messages for archives, storing messages for digests, updating archives, processing digests (but the processed digest is fed back into the core logic for delivery), and whatever else we decide it needs to do that isn't part of the core, time-sensitive code base.

(in fact, there's no reason why you couldn't have multiple flavors of these things, feeding archives into an mbox, another archiver into mhonarc or pipermail, something that updates the search engine indexes, and text and mime digesters... by turning them into their own logic streams with their own queues, you effectively have just made them all plug-in swappable, because you're writing to a queue, and not worrying about what happens once it's there. you merely need to make sure it goes in the right queue, in the approved format.)

>We don't want an over-arching API, or the attempt to solve the
>entire problem with either one hammer, or one sort of hammer.

I like hammers! My thumb doesn't, not since the divorce, at least...

kewl. good stuff here.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

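The closing idea in the message above (give each archiver, digester, or indexer its own queue, so the core only has to write to queues) can be sketched roughly as follows. This is an illustrative in-memory sketch only, not Mailman code: the `FanOut` class, its methods, and the consumer names are all hypothetical, and a real system would presumably use on-disk queues shared between separate processes.

```python
import queue


class FanOut:
    """Copy each accepted message into one queue per registered consumer."""

    def __init__(self):
        self._queues = {}

    def register(self, name):
        """Create (or return) the queue belonging to a named consumer."""
        return self._queues.setdefault(name, queue.Queue())

    def publish(self, message):
        """Put the message on every consumer queue; consumers are never called directly."""
        for q in self._queues.values():
            q.put(message)


def drain(q):
    """Return everything currently waiting in a queue, in arrival order."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


if __name__ == "__main__":
    core = FanOut()
    archive_q = core.register("archiver")   # hypothetical consumer names
    digest_q = core.register("digester")
    core.publish("post 1")
    core.publish("post 2")
    # Each consumer sees every post and can process it on its own schedule.
    print(drain(archive_q))
    print(drain(digest_q))
```

Because the publisher only knows queue names, swapping one archiver for another means changing who reads a queue, not changing the code that writes to it.
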
From claw@kanga.nu Tue Dec 12 07:15:31 2000
From: claw@kanga.nu (J C Lawrence)
Date: Mon, 11 Dec 2000 23:15:31 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Mon, 11 Dec 2000 20:49:36 PST."
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu>
Message-ID: <31145.976605331@kanga.nu>

On Mon, 11 Dec 2000 20:49:36 -0800 Chuq Von Rospach wrote:

> At 7:51 PM -0800 12/11/00, J C Lawrence wrote:

>> ObTheme: All config files should be human readable unless those
>> files are dynamically created and contain data which will be
>> easily and automatically recreated.

> ObTheme: All configuration should be possible via the web, even if
> the system is misconfigured and non-functional. Anything that can
> NOT be safely reconfigured without breaking the system should not
> be configurable via the web. (in other words, anything you can
> change, you should be able to change remotely, unless you can
> break the system. If you can break the system, you shouldn't be
> allowed near it trivially...)

Agreed. I'm currently working under the generous assumption that it's possible to cook up a web interface design for almost anything, so I'm punting there for now.

>> 1) Using multiple simultaneous processes/threads to parallelise a
>> given task.
>>
>> 2) Using multiple systems running parallel to parallelise a given
>> task.
>>
>> 3) Using multiple systems, each one dedicated to some portion(s)
>> or sub-set of the overall task (might be all working in parallel
>> on the entire problem (lock contention! failure modes!)).

> that's my model perfectly, although I think 2 and 3 are
> reversed.

> it's cleaner architecturally to go to divesting and distributing
> functionality before 'clustering'.
> In fact, I'm not sure
> clustering (which I'll use to term multiple mailman systems
> running in parallel) implies a system really, really large, when
> you realize that the primary resource eaters (like delivery) can
> effectively be infinitely distributed.

Yup, I've accounted for that so far in the design.

> I'm not sure how big a Mailman system you'd need to require
> parallelizing the core process, as long as you can divest off
> other pieces to a farm that could grow without bounds. So maybe we
> don't need that next (complicated) step, and make it parallelized
> and distributable for everything except that core control process,
> but manage the complexity of that control process to keep
> everything out of it except the absolute necessity.

I'm working on the principle that there is no core process, and there are no musical conductors or other time beaters, just discrete nodes and processes competing for resources.

>> Observation: MLMs are primarily IO bound devices, and are
>> specifically IO bound on output. Internal processing on mail
>> servers, even given crypto authentication and expensive
>> membership generation processes (eg heavy SQL DB joins etc) are
>> an order of magnitude smaller problem than just getting the
>> outbound mail off the system.

> some of that is the MUA's problem, actually, but they get tied
> together. you don't, for instance, want an MLM who will dump 50K
> pieces of email an hour into the queues of an MUA that can only
> process 40K...

I think you mean MTA above and below.

> But in general, you're correct. Especially if you define DNS
> delays and SMTP protocol delays caused by the receiving machine to
> be "output" (grin)

Or just the simple FS commit requirements for MTA spools. It's a heavy process.

>> Sites with large numbers of lists with large numbers of members
>> (and presumably large numbers of messages per list) are the
>> pessimal case, and is not one Mailman is currently targeting to
>> solve.

> but if you define the distribution capabilities correctly, this
> case is solved by throwing even more hardware at it, and the
> owners of this pessimal case presumably have a budget for it. If
> you see someone trying to run Sourceforge on a 486 and a 128K DSL
> line, you laugh at them.

True, except that lock contention becomes a major problem and scheduling strategies become critical.

>> Observation: Traffic bursts are bad. Minimally the MLM should
>> attempt to smooth out delivery rates to a given MTA to be no
>> higher than N messages/time.

> The obverse of that is that end-users seriously dislike delays,
> especially on conversational lists. It turns into the old "user
> expectation" problem -- it's better to hold ALL mail for 15
> minutes so users come to expect it than to normally deliver mail
> in 2 minutes, except during the worst bulges... But in general,
> the MLM should deliver as fast as it reasonably can without
> overloading the MUA, which implies some kind of monitoring setup
> for the MUA, or some user-controlled throttling system. the latter
> unfortunately, implies teaching admins how to monitor and adjust,
> a support issue. The former implies writing an interface for every
> MTA -- a development AND support issue.

My intent so far is just "deliver no more than N messages per minute" per outbound queue runner. It knocks the peaks off the problem, and the base structure is easy to extend from there (and I don't want to think about that now).

>> 1) Receipt of message by local MTA

> 1a) passthrough of message via a security wrapper from MTA to list
> server... (I think it's important we remember that, because we
> can't lose it, and it involves a layer of passthrough and a
> process spawning, so it's somewhat heavyweight -- but
> indispensable)

I should note that my base design is very heavy in terms of process forks (which happen to be quite light weight under Linux, but that's another matter).
The general structural approach I'm taking is:

  There's a directory full of scripts/programs.

  Run them all, in directory sort order, on this message to
  determine if we should do XXX with it.

Now the default case could have those directories empty, meaning that Mailman will default to internal/cheap implementations, but it's much easier to just have default implementations of the scripts for those directories and then punt normally.

ObNote: Of course the default scripts could be python scripts and could be processed in-line as modules rather than forking children.

>> 2) Receipt by list server
>> 3) Approval/editing/moderation
>> 4) Processing of message and emission of any resultant message(s)
>> 5) Delivery of message to MTA for final delivery.

> 6) delivery of message to non-MTA recipients (the archiver,
> the logging thing, the digester, the bounce processor....)

I'm actually doing archiving by injecting new messages back into the inbound queue which are addressed to the archiver. Digest processing should probably be handled this way, but I've currently got it as a pre-post script (not entirely keen on that). Bounces are handled entirely OOB to the rest of the MLM, rather similarly in fact to request messages.

>> #1 is significant only because we can rely on the MTA to
>> distinguish between valid list-related addresses and non-list
>> addresses.

> although one thing I've toyed with is to give a subdomain to the
> MLM, and simply pass everything to it (in sendmail terms, using
> virtusertable to pass @list.foo.bar to mailman@foo.bar). Then you
> take the MLM out of having to know what lists exist and
> administrative needs to keep that interface in sync.

There are a couple other list servers that demand that approach. The problem is that it really doesn't fit well with people/sites that don't control their own DNS.

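The "directory full of scripts, run in sort order" approach described earlier in this message can be sketched as below. This is a minimal sketch under several assumptions that are not in the original: hooks are restricted to `*.py` files run with the current interpreter (the message allows any script or program), the message is passed on standard input, exit status 0 means "go ahead", and processing stops at the first hook that objects. The function name `run_hooks` and the file names in the demo are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_hooks(hook_dir, message_bytes):
    """Run every *.py hook in hook_dir in sorted name order.

    Each hook receives the raw message on stdin. Returns (True, None) if
    all hooks exit 0, otherwise (False, name_of_first_failing_hook).
    """
    for name in sorted(os.listdir(hook_dir)):
        if not name.endswith(".py"):
            continue  # assumption: only Python hooks, for portability
        path = os.path.join(hook_dir, name)
        result = subprocess.run([sys.executable, path], input=message_bytes)
        if result.returncode != 0:
            return False, name
    return True, None


if __name__ == "__main__":
    # Demo with two throwaway hooks; numeric prefixes control the order.
    hook_dir = tempfile.mkdtemp()
    with open(os.path.join(hook_dir, "10_accept.py"), "w") as f:
        f.write("import sys\nsys.stdin.read()\nsys.exit(0)\n")
    with open(os.path.join(hook_dir, "20_reject.py"), "w") as f:
        f.write("import sys\nsys.stdin.read()\nsys.exit(1)\n")
    print(run_hooks(hook_dir, b"Subject: test\n\nhello\n"))
```

Numeric filename prefixes are one simple way to make "directory sort order" explicit; forking one child per hook is the cost the thread goes on to discuss.
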
> I've more or less decided that when I rewrite my internal
> corporate mail list, I'll do that rather than generate alias
> listings (for, oh, 12,000 groups) and the hassles and overheads of
> all that. That'll be especially useful if we do what I hope, which
> is set it up so the server has no data at all, but authenticates
> via LDAP to get list information on demand out of the corporate
> databases. There are some definite advantages to not knowing
> whether something exists until the need to know exists -- and as
> Mailman starts edging towards interfacing to non-Mailman data
> sources for list information, that ability grows in importance.

FWIW the design can do that right now. A message comes in, various parameters are extracted from it, and the parameters are handed to a directory of scripts, the accumulated stdout of which forms the distribution list for that message. The distribution list can be passed thru a pre-processor (dupe removal, domain sorting, MX sorting, whatever) to do any final processing of the distribution list before attaching it to the message and putting it in the outbound queue.

So, want LDAP? Want SQL? Want local DBM? Want all three? No problem.

> 6) is the processing needed to support other functions that act
> on messages. The idea is that instead of delivering to the MTA, we
> have a suite of functions that deliver the message to whatever
> needs to process it. Those can be asynchronous and don't need to
> be as timely as (5), and have different enough design needs that I
> split them out from the MTA delivery (although traditionally,
> stuff like digests are managed by doing an MTA transfer out of the
> MLM and back in to a different program...)

I have one set of queue processing functions geared solely for list posts. There are then several parallel queue sequences for processing non-posts (such as bounces, command requests, etc).
Additionally, you can trivially set things up so that post explosions occur on N machines, while command processing and bounce processing occur only on say machine X (which is perhaps on the other side of the DMZ and access rights to your internal backing stores).

I don't see the different queues needing markedly different designs, but needing to be able to have their processes supports cleanly divisible. The base structures end up markedly similar after that.

> It also assumes that these non-delivery things are separate
> processes from the act of making them available to those things,
> to keep (6) lightweight as possible.

Process fork overhead is a problem I've not confronted yet. It's going to need looking at. That and distributed lock contention. Both are pretty ugly with my current model.

> and besides, they are basically independent, asynchronous
> processes that don't need to be managed by any of the core logic,
> other than handing messages into their queue and making sure they
> stay running. same with, IMHO, storing messages for archives,

BTW I'd like to have the MLM archive messages such that a member can request, "SEND ME POST XXX" and have the MLM send it to him. Ditto for digests. This is in addition to any web archiving.

> storing messages for digests, updating archives, processing
> digests (but the processed digest is fed back into the core logic
> for delivery), and whatever else we decide it needs to do that
> isn't part of the core, time-sensitive code base.

I've been thinking about this. I *REALLY* don't think there's much time sensitive code in a MLM. There's a lot of data you want to get out of the way as quickly as possible, because if you don't it's just going to build up and make a bigger problem, but the actual speed with which individual bits prior to the explosion occur seems arbitrary outside of a latency viewpoint. Okay, it takes N ticks for a post to start exploding versus it takes 5N ticks to start exploding?
Am I really going to care when handling the explosion takes several hundred N? Even with say 50 forks per inbound list post prior to explosion, the comparative overhead compared to explosion is still trivial.

> (in fact, there's no reason why you couldn't have multiple flavors
> of these things, feeding archives into an mbox, another archiver
> into mhonarc or pipermail, something that updates the search
> engine indexes, and text and mime digesters... by turning them
> into their own logic streams with their own queues, you
> effectively have just made them all plug-in swappable, because
> you're writing to a queue, and not worrying about what happens
> once it's there. you merely need to make sure it goes in the right
> queue, in the approved format.

Hurm. Good point. I like that idea. Just inject messages back into the system targeted for the appropriate stream. Nice.

Heck, perhaps I should shove this thing as-is into Barry's ZWiki.

> We're visiting the relatives. Cover us.

I missed you. Please wait while I reload.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Tue Dec 12 07:43:07 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Mon, 11 Dec 2000 23:43:07 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: <31145.976605331@kanga.nu>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <31145.976605331@kanga.nu>
Message-ID:

At 11:15 PM -0800 12/11/00, J C Lawrence wrote:

>I'm working on the principle that there is no core process, and there
>are no musical conductors or other time beaters, just discrete nodes
>and processes competing for resources.

there has to be one, somewhere. It may be a policeman, directing traffic around the place, or it may be an overseer, of the inetd mode, or maybe even a watchdog, similar to init.
But I doubt you can really build something (or want to) where you don't have something that comes first and decides who does what. Every orchestra has a conductor... it doesn't necessarily have to be a heavyweight from a code point of view, but something has to be there to make sure everyone else does their job.

>> some of that is the MUA's problem, actually, but they get tied
>> together. you don't, for instance, want an MLM who will dump 50K
>> pieces of email an hour into the queues of an MUA that can only
>> process 40K...
>
>I think you mean MTA above and below.

sigh. Yup. Sorry.

>My intent so far is just "deliver no more than N messages per minute"
>per outbound queue runner. It knocks the peaks off the problem, and
>the base structure is easy to extend from there (and I don't want to
>think about that now).

and leaves it up to the admin to tune. That's probably fine for 3.0. full queue watching and self-throttling can wait. it's nice to have, but we probably shouldn't try to do everything at once. Just to leave the hooks for later...

>I should note that my base design is very heavy in terms of process
>forks (which happen to be quite light weight under Linux, but that's
>another matter).

There are definitely places for threads, but to be honest, I see some tendency of people to go thread-happy. it's the "new puppy", so everything needs to be designed around threads... Given the amount of I/O we have going on, the fork overhead is going to get lost in the noise in most cases.

> There's a directory full of scripts/programs.
>
> Run them all, in directory sort order, on this message to
> determine if we should do XXX with it.

and who does this? this missing core policeman process, of course (grin).

but -- I'd suggest against this approach. There are problems. to start, the approach is pretty darn I/O heavy. you'd be better off loading all of this stuff into an internal database, and making it a memory-resident table, not a disk-based system.
Administratively, it has some issues as well, since you're more or less requiring that someone with a CLI deal with a lot of the configuration -- or opening you up to all sorts of web-based attacks.

Instead, you store scripts, and the CLI admin manages that process, but configuration is within Mailman, and web based.

i've been working on a new API for the moderator/autobounce/admin/anti-spam stuff. I'll post that in a day or so, what I have, because I think the way I'm putting it together is relevant to how I think the overall control system could be done.

>Now the default case could have those directories empty, meaning
>that Mailman will default to internal/cheap implementations, but its
>much easier to just have default implementations of the scripts for
>those directories and then punt normally.

again, I'd make as much as possible separate scripts, but have a default processing logic suite in the control data structures in the Mailman system internals. You want to embed nothing (IMHO), because it reduces the complexity of all of the pieces and forces you to keep the interfaces clean and rigorous.

>There are a couple other list servers that demand that approach.
>The problem is that it really doesn't fit well with people/sites
>that don't control their own DNS.

yah. that's the rub.

>So, want LDAP? Want SQL? Want local DBM? Want all three? No
>problem.

I sure wouldn't mind being able to plug in someone else's code in that server -- but the reality is, it can't use 99% of a typical MLM, since it's all controlled upstairs by a corporate system, so it's overkill. That MLM is basically two scripts, one to eat a data set and generate the list setup, and another to authenticate and resend. that's all it does, so it's quite lightweight.

>I don't see the different queues needing markedly different designs,
>but needing to be able to have their processes supports cleanly
>divisible. The base structures end up markedly similar after that.

Other than, say, imagining a system where archives are on a different machine (or two), and the search engine on a third (or fourth), so you want to be able to distribute the processing cleanly.... And the realization that archives and digest stuff can be held into a low-priority queue and turned into idle-time processing tasks. A big plus if you've got a busy system a little closer to the edge than you like.

>> It also assumes that these non-delivery things are separate
>> processes from the act of making them available to those things,
>> to keep (6) lightweight as possible.
>
>Process fork overhead is a problem I've not confronted yet.

And I wouldn't worry about it much. don't think it's going to be a problem, other than in the MLM->MTA interface where you might be doing a lot of spawning and forking to parallelize, VERP, or whatever. And that can be minimized and avoided with some careful design. In the rest of the system, don't bother. When I'm talking about lightweight, I was meaning code complexity and feature creep. You want to stuff as much into external code pieces that are brought in via queueing and messaging, and keep it out of the control piece.

>That and distributed lock contention.
>Both are pretty ugly with my current model.

Locks are a b-tch. period. Both because they don't go multi-machine well at all, and because whatever you choose it'll be missing or broken on various releases of various OSes.

>BTW I'd like to have the MLM archive messages such that a member can
>request, "SEND ME POST XXX" and have the MLM send it to him. Ditto
>for digests. This is in addition to any web archiving.

and another flavor of digest, what I call the HTML-TOC. Simply a message full of digest info (poster, subject, maybe the first couple of lines), and a URL to pull it out of archives. Some folks want a digest to skim, some folks only want header data -- so why send all those bytes that won't be read?

>I've been thinking about this.
>I *REALLY* don't think there's much
>time sensitive code in a MLM.

The process of sending list mail is time sensitive, but most of the issues involving time tend to be in the MTA. On a typical MLM, a user might not notice if messages don't turn around in 5 minutes or 15, but if they're consistently turning around at 30 minutes, many will. they may not even recognize why they're unhappy -- but many get unhappy. and the worst aspects of this are out of everyone's control, since the biggest delays are caused by receiving sites, not the sending site -- so you end up, if you need to, spending a lot of time minimizing the pain those sites cause you, through parallelism, domain sorting, etc.

>> We're visiting the relatives. Cover us.
>
>I missed you. Please wait while I reload.

Kevlar is your friend. bake at 350 for an hour with a little garlic and garnish with chives.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From claw@kanga.nu Tue Dec 12 08:26:58 2000
From: claw@kanga.nu (J C Lawrence)
Date: Tue, 12 Dec 2000 00:26:58 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Mon, 11 Dec 2000 23:43:07 PST."
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <31145.976605331@kanga.nu>
Message-ID: <4830.976609618@kanga.nu>

On Mon, 11 Dec 2000 23:43:07 -0800 Chuq Von Rospach wrote:

> At 11:15 PM -0800 12/11/00, J C Lawrence wrote:

>> My intent so far is just "deliver no more than N messages per
>> minute" per outbound queue runner. It knocks the peaks off the
>> problem, and the base structure is easy to extend from there (and
>> I don't want to think about that now).

> and leaves it up to the admin to tune. That's probably fine for
> 3.0. full queue watching and self-throttling can wait. it's nice
> to have, but we probably shouldn't try to do everything at
> once.
> Just to leave the hooks for later...

Precisely.

>> I should note that my base design is very heavy in terms of
>> process forks (which happen to be quite light weight under Linux,
>> but that's another matter).

> There are definitely places for threads, but to be honest, I see
> some tendency of people to go thread-happy. it's the "new puppy",
> so everything needs to be designed around threads... Given the
> amount of I/O we have going on, the fork overhead is going to get
> lost in the noise in most cases.

That's my hope.

>> There's a directory full of scripts/programs.
>>
>> Run them all, in directory sort order, on this message to
>> determine if we should do XXX with it.

> and who does this? this missing core policeman process, of course
> (grin).

Nope. The individual process which somehow got nominated for picking up a message sitting in a list pending queue. So, it picks up the message, asks for its distribution list, gets it, and shoves them both over into the outbound queue. Later some arbitrary outbound queue processor wins/gets control of that message, opens an SMTP session, and shovels the message down to the list of RCPT TOs.

Nobody is responsible for more than their tiny area of the field. There is a pseudo orchestra leader, but all he really does is fork processes that go see if there is anything in the queues to process, and if so, start on them.

> but -- I'd suggest against this approach. There are problems. to
> start, the approach is pretty darn I/O heavy. you'd be better off
> loading all of this stuff into an internal database, and making it
> a memory-resident table, not a disk-based system.

Kinda tough for LDAP or SQL where the list of members is dynamic and depends on the message itself (non-traditional lists). But yes, it hurts. The default case will be some sort of local/cheap DB with a single process.
The idea is that the above architecture is there should it be needed.

> Administratively, it has some issues as well, since you're more or
> less requiring that someone with a CLI deal with a lot of the
> configuration -- or opening you up to all sorts of web-based
> attacks.

Semi. The idea is that the CLI guy installs the base set of scripts that are potentially available to a given list. The list owner then picks from that library for his list, and assembles and orders them (building a symlink table on disk) via his web interface (drop and combo boxes).

> Instead, you store scripts, and the CLI admin manages that
> process, but configuration is within Mailman, and web based.

Precisely.

> i've been working on a new API for the
> moderator/autobounce/admin/anti-spam stuff. I'll post that in a
> day or so, what I have, because I think the way I'm putting it
> together is relevant to how I think the overall control system
> could be done.

I haven't really thought about bounce processing at all yet.

> You want to embed nothing (IMHO), because it reduces the
> complexity of all of the pieces and forces you to keep the
> interfaces clean and rigorous.

Yeah.

>> I don't see the different queues needing markedly different
>> designs, but needing to be able to have their processes supports
>> cleanly divisible. The base structures end up markedly similar
>> after that.

> Other than, say, imagining a system where archives are on a
> different machine (or two), and the search engine on a third (or
> fourth), so you want to be able to distribute the processing
> cleanly.... And the realization that archives and digest stuff can
> be held into a low-priority queue and turned into idle-time
> processing tasks. A big plus if you've got a busy system a little
> closer to the edge than you like.
I haven't thought about system load sensitivities yet, but I don't see any innate reason they couldn't be another variable thrown into the, "What am I currently allowed to process" equation.

>> Process fork overhead is a problem I've not confronted yet.

> And I wouldn't worry about it much. don't think it's going to be
> a problem, other than in the MLM->MTA interface where you might be
> doing a lot of spawning and forking to parallelize, VERP, or
> whatever.

My idea for VERP is trivially simple:

The member script which generates the list of RCPT TOs which are attached to a pending message will periodically add a second token (a hash value) after the email address, separated by whitespace. Note: instead of text a DBM would work just as well, perhaps better.

The process that then picks up a message from outbound notices the hash token and constructs a special envelope for that address only, using the hash string as +suffix to the envelope return address.

Want VERP all the time? Members always generates hash values. Or just a percentage of the time, or as a function of how long it was since we last caught a bounce from that address, or as a function of how much we like that domain.

The idea is that VERPed messages are built on the instant of handing them off to an MTA.

> And that can be minimized and avoided with some careful design. In
> the rest of the system, don't bother. When I'm talking about
> lightweight, I was meaning code complexity and feature creep. You
> want to stuff as much into external code pieces that are brought
> in via queueing and messagings, and keep it out of the control
> piece.

Bingo.

>> BTW I'd like to have the MLM archive messages such that a member
>> can request, "SEND ME POST XXX" and have the MLM send it to him.
>> Ditto for digests. This is in addition to any web archiving.

> and another flavor of digest, what I call the HTML-TOC.
> Simply a
> message full of digest info (poster, subject, maybe the first
> couple of lines), and a URL to pull it out of archives. Some folks
> want a digest to skim, some folks only want header data -- so why
> send all those bytes that won't be read?

Ahh, excellent point, Digest really should be an OOB process handled by their own queue. Yup. Absolutely.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From thomas@xs4all.net Tue Dec 12 10:11:17 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 12 Dec 2000 11:11:17 +0100
Subject: [Mailman-Developers] 2 bugs, but I need a confirmation :-)
Message-ID: <20001212111117.J4396@xs4all.nl>

I *think* I found two (more) bugs in Mailman, but my setup is sufficiently hacked that I can't test it for sure. One of the two I tested on python-list, and I think I saw it reproduced. The other one I'm not going to test, because it's potentially destructive.

1) Subscription-confirmation-response-emails to *-request, with multiple attachments, fail. The problem is that Mailman tries to interpret the MIME boundary and content-type headers and what not as commands, rather than taking the first attachment and parsing that.

This wasn't a real problem when I tested it on python-list, because my mailer doesn't put enough headers in the first MIME part, but customers of ours have seen honest problems with this. People mailing with HTML mail enabled, for instance, but also people who get a signature attached to the email, without being able to prevent it. This enforced signature is becoming more and more popular in clueless paranoid companies :P

2) '\n.\n' screws up Mailman. This comes in two flavours :) If the '\n.\n' sequence is late enough in the email, Mailman doesn't notice, and the rest of the mail (including the '\n.\n') silently vanishes.
If the sequence is a bit higher, Mailman does notice: sendmail stops the transmission while Mailman still has data to send. Mailman considers the mail not sent, and tries again later -- but the first part of the mail is sent to all recipients just fine.

This is a problem in particular with digests. One of our employees found out she could skip the mailman-enforced signature by adding '\n.\n' to the end of her own signature. She forgot about digests, however, and 5 other employees got 300+ copies of the start of each digest, up to her signature.

Obviously, I don't want to test this on python.org's lists, unless Barry or someone else is ready to edit the qfiles to remove the '\n.\n' sequence. Is there a mailman-test-list on python.org or some 'vanilla' installation that this could be tested on ?

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From thomas@xs4all.net Tue Dec 12 14:03:28 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 12 Dec 2000 15:03:28 +0100
Subject: [Mailman-Developers] about qrunner and locking
In-Reply-To: <14897.27619.230944.677429@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 08, 2000 at 06:16:51PM -0500
References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> <17693.976313729@kanga.nu> <14897.27619.230944.677429@anthem.concentric.net>
Message-ID: <20001212150327.K4396@xs4all.nl>

On Fri, Dec 08, 2000 at 06:16:51PM -0500, Barry A. Warsaw wrote:

> This also comes into play when users want to change their address or
> delivery options. Maybe the options are stored in the backend
> database, maybe they're stored only in Mailman's db. It shouldn't
> matter, and Mailman's object system should map those into the same
> space transparently. Same for Rosters perhaps, e.g.
maybe Rosters
> coming from the intranet database aren't writable through Mailman
> because the backend database prohibits it. That's one way to address
> the "we're not going to let employees unsubscribe from this list"
> issue.

For the record, that is one solution I could live with extremely well. If we had that, I'd immediately build an employee-database (or adapt the half-hearted and hardly-used one we have right now) and hook Mailman onto it. And withdraw the 'approved unsubscription' patch. Not sure if it's going to solve everyone's problems, but it should, mostly.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From claw@kanga.nu Wed Dec 13 02:41:19 2000
From: claw@kanga.nu (J C Lawrence)
Date: Tue, 12 Dec 2000 18:41:19 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Mon, 11 Dec 2000 18:26:26 PST."
References: <31823.976583871@kanga.nu>
Message-ID: <7746.976675279@kanga.nu>

On Mon, 11 Dec 2000 18:26:26 -0800
Chuq Von Rospach wrote:

> At 5:17 PM -0800 12/11/00, J C Lawrence wrote:

>> 1) Is there a GPL distributed queue processing system ala IBM's
>> MQ about? I've not been able to find one.
>
> when I evaluated it a while back, it wasn't stable on solaris, but
> it had the functionality I wanted.

Any experience with GNQS?

http://www.gnqs.org/

It looks a little weak for what we need (mostly at the parallelisation points), but interesting otherwise.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Wed Dec 13 03:38:34 2000
From: claw@kanga.nu (J C Lawrence)
Date: Tue, 12 Dec 2000 19:38:34 -0800
Subject: [Mailman-Developers] about qrunner and locking
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Fri, 08 Dec 2000 18:16:51 EST."
<14897.27619.230944.677429@anthem.concentric.net>
References: <20001207162234.D25463@marc.merlins.org> <20001207174626.H25463@marc.merlins.org> <20001208093611.E4396@xs4all.nl> <14896.65108.807206.488209@anthem.concentric.net> <17693.976313729@kanga.nu> <14897.27619.230944.677429@anthem.concentric.net>
Message-ID: <12746.976678714@kanga.nu>

On Fri, 8 Dec 2000 18:16:51 -0500
Barry A Warsaw wrote:

> I completely agree that keeping the list database in marshals (not
> pickles, actually) is broken. The question is what to do about
> it.

A couple questions before I continue with my architecture musings:

-- What do you think about moving to either a queue based system (preferably using a generic queue manager) or a system which is queue-like and thus easily moved over later to a queue manager once a better one becomes available?

-- What do you think about abstracting all of Mailman's data access needs to utility programs with well defined contracts such that simply replacing those tools with something of the same name that accesses a different data store and otherwise fulfills the same contract changes Mailman's data access?

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From dgc@uchicago.edu Thu Dec 14 02:28:18 2000
From: dgc@uchicago.edu (David Champion)
Date: Wed, 13 Dec 2000 20:28:18 -0600
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: ; from dj@pcisys.net on Wed, Dec 13, 2000 at 03:34:30PM -0700
References:
Message-ID: <20001213202818.R1405@smack.uchicago.edu>

I shifted this to mailman-developers because I want to talk about changes in qrunner that D.J. Atkinson brought up.

On 2000.12.13, in , "D.J. Atkinson" wrote:
>
> I posted a message over the weekend where I saw qrunner only processing
> part of the queue. It turned out that there were three messages in the
> queue with 3 unresolvable names each.
(3 messages to the same list)
> Each of these queued files took 400 seconds to time out, by which time, we
> were past the default max qrunner process length (15 minutes), and qrunner
> exited.
>
> I've of course now increased the process length to 30 minutes, and
> everything seems to be OK. But that's only temporary, I'm sure. As list
> volume builds, it will become a problem again. It would be great if there
> were a more graceful way of dealing with this than currently exists.

How about altering qrunner's algorithm to split the queue on timeout, appending the head of the queue to the tail?

A - fails
B - succeeds
C - fails
D - fails/unprocessed; qrunner times out
E - unprocessed
F - unprocessed

With this change, your next queue runner will process this queue:

E
F
A
C
D

Eventually (ahem) the queue will contain only those batches which are hard to deliver, and they'll be re-ordered with each run to give equal attempts over time.

Actually, that's not true if the queue is reduced to containing only A, C, and D, and qrunner always times out on D; D will never get the same time as A and C. Leaving D at the head of the queue (that is, splitting the queue ahead of the current batch, rather than behind it) solves that problem until the case occurs in which D contains enough bad or slow addresses to stop the queue even though it's first. Two solutions to this: 1) never stop qrunner during the first queued batch (always wait for it to exit); or 2) split the queue ahead or behind of the current batch randomly.

Does this seem to anyone else to solve the problem? I haven't looked at the code yet, so this is just cursory thought.

--
-D. dgc@uchicago.edu NSIT University of Chicago

From claw@kanga.nu Thu Dec 14 02:38:01 2000
From: claw@kanga.nu (J C Lawrence)
Date: Wed, 13 Dec 2000 18:38:01 -0800
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: Message from David Champion of "Wed, 13 Dec 2000 20:28:18 CST."
<20001213202818.R1405@smack.uchicago.edu>
References: <20001213202818.R1405@smack.uchicago.edu>
Message-ID: <9182.976761481@kanga.nu>

On Wed, 13 Dec 2000 20:28:18 -0600
David Champion wrote:

> On 2000.12.13, in
> ,
> "D.J. Atkinson" wrote:

> How about altering qrunner's algorithm to split the queue on
> timeout, appending the head of the queue to the tail?

You are talking about different queue runners. He's talking about the queue runner in sendmail, and you are talking about Mailman.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From dachinetwanadoo.fr@wanadoo.fr Thu Dec 14 04:47:15 2000
From: dachinetwanadoo.fr@wanadoo.fr (Dominique)
Date: Thu, 14 Dec 2000 05:47:15 +0100
Subject: [Mailman-Developers] mailinglist submission
Message-ID: <001e01c06588$f1365000$0a00000a@fti2p27t44fti>

This is a multi-part message in MIME format.

------=_NextPart_000_001B_01C06591.50215940
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

ok thanks dachi'

------=_NextPart_000_001B_01C06591.50215940
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

thanks

dachi'

------=_NextPart_000_001B_01C06591.50215940--

From chuqui@plaidworks.com Thu Dec 14 04:58:21 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Wed, 13 Dec 2000 20:58:21 -0800
Subject: [Mailman-Developers] 2 bugs, but I need a confirmation :-)
In-Reply-To: <20001212111117.J4396@xs4all.nl>
References: <20001212111117.J4396@xs4all.nl>
Message-ID:

At 11:11 AM +0100 12/12/00, Thomas Wouters wrote:

>2) '\n.\n' screws up Mailman. This comes in two flavours :) If the '\n.\n'
>sequence is late enough in the email, Mailman doesn't notice, and the rest
>of the mail (including the '\n.\n') silently vanishes.

that's because that's a standard end of message delimiter in the SMTP protocols. It may not even be Mailman, but the MTA.

>This is a problem in particular with digests.

yes, it's related to one I found, where ^@ (NUL) does the same, because it also is read as an End of File flag, so the digest truncates. But the one you have is actually part of the SMTP standards, so at some level, the answer is "don't do that", I think, or disable it in your MTA configuration. I know it can be turned off in sendmail, but don't remember the command offhand.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Thu Dec 14 05:06:38 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Wed, 13 Dec 2000 21:06:38 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: <7746.976675279@kanga.nu>
References: <31823.976583871@kanga.nu> <7746.976675279@kanga.nu>
Message-ID:

At 6:41 PM -0800 12/12/00, J C Lawrence wrote:
>
>
>Any experience with GNQS?
>
> http://www.gnqs.org/

it's what I'm using now (I think I called it QPS before. my bad). I'm pretty happy with it, but I'm not distributing anything right now -- one machine, a number of queues, solaris and it's nicely stable.
--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From claw@kanga.nu Thu Dec 14 05:21:51 2000
From: claw@kanga.nu (J C Lawrence)
Date: Wed, 13 Dec 2000 21:21:51 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Wed, 13 Dec 2000 21:06:38 PST."
References: <31823.976583871@kanga.nu> <7746.976675279@kanga.nu>
Message-ID: <24098.976771311@kanga.nu>

On Wed, 13 Dec 2000 21:06:38 -0800
Chuq Von Rospach wrote:

> At 6:41 PM -0800 12/12/00, J C Lawrence wrote:
>>
>>
>> Any experience with GNQS?
>>
>> http://www.gnqs.org/

> it's what I'm using now (I think I called it QPS before. my
> bad). I'm pretty happy with it, but I'm not distributing anything
> right now -- one machine, a number of queues, solaris and it's
> nicely stable.

What's performance like?

I'm also looking at MQM:

http://www.cs.orst.edu/~pancake/ptools/mqm/

Which looks a little more mature and comes with perhaps the most abbreviated BSD-style license I've seen.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From jam@jamux.com Thu Dec 14 05:47:22 2000
From: jam@jamux.com (John A. Martin)
Date: Thu, 14 Dec 2000 00:47:22 -0500
Subject: [Mailman-Developers] Access to admin page times out - what to do?
Message-ID: <20001214054722.BDE8E4800C@athene.jamux.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Access to a Mailman 2.0 list times out without the browser displaying anything. This list has more than 450 held messages amounting to 8MB in mailman/data. How should this be repaired without disrupting this or other lists?

It looks like another list on the same host with about 50 held messages amounting to about 700KB can be accessed OK while another with 290 messages amounting to 3MB cannot.
Longer term, it would be well to let the web pages be served up in parts rather than only in their entirety, if indeed that is what causes the failures described above?

jam

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: OpenPGP encrypted mail preferred. See

iEYEARECAAYFAjo4XtUACgkQUEvv1b/iXy+rjACgnqtwCTeEzW1IX1BkLbLZnj7d
/YcAniepLedQVf8xFfG9u2NeURiflxiQ
=qonr
-----END PGP SIGNATURE-----

From claw@kanga.nu Thu Dec 14 06:56:21 2000
From: claw@kanga.nu (J C Lawrence)
Date: Wed, 13 Dec 2000 22:56:21 -0800
Subject: [Mailman-Developers] Access to admin page times out - what to do?
In-Reply-To: Message from "John A. Martin" of "Thu, 14 Dec 2000 00:47:22 EST." <20001214054722.BDE8E4800C@athene.jamux.com>
References: <20001214054722.BDE8E4800C@athene.jamux.com>
Message-ID: <32113.976776981@kanga.nu>

On Thu, 14 Dec 2000 00:47:22 -0500
John A Martin wrote:

> Access to a Mailman 2.0 list times out without the browser
> displaying anything. This list has more than 450 held messages
> amounting to 8MB in mailman/data. How should this be repaired
> without disrupting this or other lists? It looks like another
> list on the same host with about 50 held messages amounting to
> about 700KB can be accessed OK while another with 290 messages
> amounting to 3MB cannot.

One of two problems:

1) A stale lock file in ~/locks

2) It really is a timeout (unlikely).

For #1, remove the locks in ~/locks. If it really is the latter, move some of the held messages (~/heldmsg--*) off to the side, and then try accessing the admin interface again.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From dj@pcisys.net Thu Dec 14 15:22:22 2000
From: dj@pcisys.net (D.J. Atkinson)
Date: Thu, 14 Dec 2000 08:22:22 -0700 (MST)
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: <9182.976761481@kanga.nu>
Message-ID:

>
>You are talking about different queue runners.
He's talking about
>the queue runner in sendmail, and you are talking about Mailman.

No. I was talking about the qrunner in Mailman. With synchronous DNS sendmail won't accept messages if it can't pick out an MX host. Those sit in the mailman queue and bog things down.

--
o o o o o o o . . . _______ o _____ _____ ____________________ ____] D D [_||___ ._][__n__n___|DD[ [ \_____ | D.J. Atkinson | | dj@pcisys.net |
>(____________|__|_[___________]_|__________________|_|_______________|
_/oo OOOO OOOO oo` 'ooooo ooooo` 'o!o o!o` 'o!o o!o`
-+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+-
Visit my web page at http://www.pcisys.net/~dj

From dj@pcisys.net Thu Dec 14 17:23:13 2000
From: dj@pcisys.net (D.J. Atkinson)
Date: Thu, 14 Dec 2000 10:23:13 -0700 (MST)
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: <20001213202818.R1405@smack.uchicago.edu>
Message-ID:

Thanks David,

From what I've seen on how Mailman's qrunner works, this would help my situation tremendously.

As long as this is going to the developers list, what do you all think of the possibility of adding the "filebase" to the log line of the smtp-failure log and/or the smtp log? I know this would increase the size of the logs, so maybe it would be an option/flag set in the Defaults.py/mm_cfg.py files? This would have been very helpful in tracking down those files that were sucking all the time out of my qrunner jobs.

Regards,
DJ

On Wed, 13 Dec 2000, David Champion wrote:

>
>I shifted this to mailman-developers because I want to talk about
>changes in qrunner that D.J. Atkinson brought up.
>
>
>On 2000.12.13, in ,
> "D.J. Atkinson" wrote:
>>
>> I posted a message over the weekend where I saw qrunner only processing
>> part of the queue. It turned out that there were three messages in the
>> queue with 3 unresolvable names each.
(3 messages to the same list)
>> Each of these queued files took 400 seconds to time out, by which time, we
>> were past the default max qrunner process length (15 minutes), and qrunner
>> exited.
>>
>> I've of course now increased the process length to 30 minutes, and
>> everything seems to be OK. But that's only temporary, I'm sure. As list
>> volume builds, it will become a problem again. It would be great if there
>> were a more graceful way of dealing with this than currently exists.
>
>How about altering qrunner's algorithm to split the queue on timeout,
>appending the head of the queue to the tail?
>
>A - fails
>B - succeeds
>C - fails
>D - fails/unprocessed; qrunner times out
>E - unprocessed
>F - unprocessed
>
>With this change, your next queue runner will process this queue:
>
>E
>F
>A
>C
>D
>
>Eventually (ahem) the queue will contain only those batches which are
>hard to deliver, and they'll be re-ordered with each run to give equal
>attempts over time.
>
>Actually, that's not true if the queue is reduced to containing only A,
>C, and D, and qrunner always times out on D; D will never get the same
>time as A and C. Leaving D at the head of the queue (that is,
>splitting the queue ahead of the current batch, rather than behind it)
>solves that problem until the case occurs in which D contains enough
>bad or slow addresses to stop the queue even though it's first. Two
>solutions to this: 1) never stop qrunner during the first queued batch
>(always wait for it to exit); or 2) split the queue ahead or behind of
>the current batch randomly.
>
>Does this seem to anyone else to solve the problem? I haven't looked
>at the code yet, so this is just cursory thought.
>
>--
> -D. dgc@uchicago.edu NSIT University of Chicago
>

--
o o o o o o o . . . _______ o _____ _____ ____________________ ____] D D [_||___ ._][__n__n___|DD[ [ \_____ | D.J.
Atkinson | | dj@pcisys.net |
>(____________|__|_[___________]_|__________________|_|_______________|
_/oo OOOO OOOO oo` 'ooooo ooooo` 'o!o o!o` 'o!o o!o`
-+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+-
Visit my web page at http://www.pcisys.net/~dj

From dj@pcisys.net Thu Dec 14 17:28:16 2000
From: dj@pcisys.net (D.J. Atkinson)
Date: Thu, 14 Dec 2000 10:28:16 -0700 (MST)
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: <20001213202818.R1405@smack.uchicago.edu>
Message-ID:

One more thing...

>Actually, that's not true if the queue is reduced to containing only A,
>C, and D, and qrunner always times out on D; D will never get the same
>time as A and C. Leaving D at the head of the queue (that is,
>splitting the queue ahead of the current batch, rather than behind it)
>solves that problem until the case occurs in which D contains enough
>bad or slow addresses to stop the queue even though it's first. Two
>solutions to this: 1) never stop qrunner during the first queued batch
>(always wait for it to exit); or 2) split the queue ahead or behind of
>the current batch randomly.

I'm obviously not the expert, but my observations indicate that qrunner does complete the current message batch before checking to see if it's exceeded the "QRUNNER_PROCESS_LIFETIME" value, so you could always set it to the next message in the queue.

--
o o o o o o o . . . _______ o _____ _____ ____________________ ____] D D [_||___ ._][__n__n___|DD[ [ \_____ | D.J.
Atkinson | | dj@pcisys.net |
>(____________|__|_[___________]_|__________________|_|_______________|
_/oo OOOO OOOO oo` 'ooooo ooooo` 'o!o o!o` 'o!o o!o`
-+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+-
Visit my web page at http://www.pcisys.net/~dj

From stopha@geneseo.edu Thu Dec 14 19:20:13 2000
From: stopha@geneseo.edu (Maryann Stopha)
Date: Thu, 14 Dec 2000 14:20:13 -0500
Subject: [Mailman-Developers] List of list admins
Message-ID:

> This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible.

--B_3059648413_1052182
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit

Hi,
I need to get a list of all the admins for our Mailman system, and wondered if anyone could help me with this. I have tried to grep them out, but it has not worked with getting info out of the .db files. Does anyone know a simple way to accomplish this?

Thanks,
Maryann Stopha
--
Web Development Professional
Computing & Information Technology
SUNY Geneseo
716.245.5577
stopha@geneseo.edu

--B_3059648413_1052182
Content-type: text/html; charset="US-ASCII"
Content-transfer-encoding: quoted-printable

List of list admins

Hi,
I need to get a list of all the admins for our Mailman system, and wondered if anyone could help me with this. I have tried to grep them out, but it has not worked with getting info out of the .db files. Does anyone know a simple way to accomplish this?

Thanks,
Maryann Stopha
--
Web Development Professional
Computing & Information Technology
SUNY Geneseo
716.245.5577
stopha@geneseo.edu
--B_3059648413_1052182--

From csf@moscow.com Thu Dec 14 20:10:55 2000
From: csf@moscow.com (Michael Yount)
Date: Thu, 14 Dec 2000 12:10:55 -0800
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: ; from chuqui@plaidworks.com on Thu, Dec 14, 2000 at 09:48:25AM -0800
References:
Message-ID: <20001214121055.B964@moscow.com>

I made the same recommendation a few months ago. Later scrutiny of how defer mode works (with 8.9.3, IIRC) uncovered that using it caused relaying checks to be bypassed entirely. I retracted the recommendation in early April:

http://csf.colorado.edu/archive/2000/mj2-dev/msg00219.html

Looking at the check_rcpt section of the stock configuration files for sendmail 8.11.1, it appears that this is still the case.

If so, it is probably wise for administrators who, like me, have a modest understanding of sendmail to use defer mode only on a restricted interface.

Michael

On 14 Dec 09:48, Chuq Von Rospach wrote:
>
> Try setting
>
> O DeliveryMode=defer
>
> in your sendmail.cf. That causes sendmail to accept the mail without
> making a DNS lookup on it first. Note that this also implies
> DeliveryMode=queue, so stuff won't be delivered immediately. That
> means (if you already aren't) that you need to do queue runs
> aggressively using -q.
>

From claw@kanga.nu Thu Dec 14 20:08:18 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 12:08:18 -0800
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: Message from "D.J. Atkinson" of "Thu, 14 Dec 2000 08:22:22 MST."
References:
Message-ID: <6333.976824498@kanga.nu>

On Thu, 14 Dec 2000 08:22:22 -0700 (MST)
D J Atkinson wrote:

>> You are talking about different queue runners. He's talking
>> about the queue runner in sendmail, and you are talking about
>> Mailman.

> No. I was talking about the qrunner in Mailman. With synchronous
> DNS sendmail won't accept messages if it can't pick out an MX
> host.
> Those sit in the mailman queue and bog things down.

In which case the solution is to configure your MTA to not DNS verify deliveries from localhost.

--
J C Lawrence claw@kanga.nu
---------(*) : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Thu Dec 14 20:20:58 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 12:20:58 -0800
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: <20001214121055.B964@moscow.com>
References: <20001214121055.B964@moscow.com>
Message-ID:

At 12:10 PM -0800 12/14/00, Michael Yount wrote:

>I made the same recommendation a few months ago. Later
>scrutiny of how defer mode works (with 8.9.3, IIRC) uncovered that using
>it caused relaying checks to be bypassed entirely.

Oh, grumble.

>If so, it is probably wise for administrators who, like me, have a
>modest understanding of sendmail to use defer mode only on a restricted
>interface.

what that means for Mailman is you can't tweak the sendmail.cf, and therefore can't use DELIVERY_MODULE = SMTPDirect. You'd have to instead use the Sendmail DELIVERY_MODULE (which I haven't tested, and which doesn't (sigh) use MAX_RCPTS). And you could add the -odd to SENDMAIL_CMD in the mm_cfg file to get this. But it changes other stuff, so...

Oh, grumble. It works if you spin the dead chicken three times while wearing teal socks, but not if you spin it counterclockwise in red socks, during a full moon..

Thanks for the heads-up, Michael. Another option, I guess, is to run two sendmails on two ports, a public one using mode=queue on 25, and a second one on some other port mode=defer, but that's at best security by obscurity.

Ah, the joys of sendmail... (postfix fans can step in and start giggling any time they want...)

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.
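
A rough sketch of the queue-splitting idea David Champion floated earlier in this thread (on timeout, move the batches already attempted to the tail so the untried ones go first next run). This is not taken from Mailman's qrunner; `rotate_on_timeout` and all of its arguments are made-up names for illustration, and `keep_current_first` models his "split ahead of the current batch" variant.

```python
def rotate_on_timeout(queue, succeeded, stopped_at, keep_current_first=False):
    """Return the queue order for the next run after a timeout.

    queue              -- batch names in the order they were attempted
    succeeded          -- set of batches that were delivered (these are dropped)
    stopped_at         -- index of the batch being processed at the timeout
    keep_current_first -- True: split *ahead* of the current batch, so it is
                          retried first; False: split *behind* it.
    """
    split = stopped_at if keep_current_first else stopped_at + 1
    # Batches already attempted that still need delivery move to the tail.
    head = [batch for batch in queue[:split] if batch not in succeeded]
    # Batches not yet attempted (or the current one) lead the next run.
    tail = list(queue[split:])
    return tail + head


if __name__ == "__main__":
    queue = ["A", "B", "C", "D", "E", "F"]
    # B succeeded; the runner timed out while working on D (index 3).
    print(rotate_on_timeout(queue, {"B"}, stopped_at=3))
    print(rotate_on_timeout(queue, {"B"}, stopped_at=3, keep_current_first=True))
```

With the A-F example from David's message, the first call gives the E, F, A, C, D order he describes; the second keeps D at the head instead.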
From marc_news@valinux.com Thu Dec 14 20:36:00 2000
From: marc_news@valinux.com (Marc MERLIN)
Date: Thu, 14 Dec 2000 12:36:00 -0800
Subject: [Mailman-Developers] List of list admins
In-Reply-To: ; from stopha@geneseo.edu on Thu, Dec 14, 2000 at 02:20:13PM -0500
References:
Message-ID: <20001214123600.A18209@marc.merlins.org>

On Thu, Dec 14, 2000 at 02:20:13PM -0500, Maryann Stopha wrote:
> Hi,
> I need to get a list of all the admins for our Mailman system, and wondered
> if anyone could help me with this. I have tried to grep them out, but it
> has not worked with getting info out of the .db files. Does anyone know a
> simple way to accomplish this?

cd ~mailman/lists; for i in *; do echo "$i: "`~mailman/bin/dumpdb $i/config.db | grep "'owner'"`; done

Marc
--
Microsoft is to operating systems & security ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key

From chuqui@plaidworks.com Thu Dec 14 21:42:37 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 13:42:37 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: <24098.976771311@kanga.nu>
References: <31823.976583871@kanga.nu> <7746.976675279@kanga.nu> <24098.976771311@kanga.nu>
Message-ID:

At 9:21 PM -0800 12/13/00, J C Lawrence wrote:
>
>> http://www.gnqs.org/
>
>> it's what I'm using now (I think I called it QPS before. my
>> bad). I'm pretty happy with it, but I'm not distributing anything
>> right now -- one machine, a number of queues, solaris and it's
>> nicely stable.
>
>What's performance like?

Seems okay, but I'm not really stressing it.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.
From claw@kanga.nu Thu Dec 14 22:33:55 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 14:33:55 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Thu, 14 Dec 2000 13:42:37 PST."
References: <31823.976583871@kanga.nu> <7746.976675279@kanga.nu> <24098.976771311@kanga.nu>
Message-ID: <19765.976833235@kanga.nu>

On Thu, 14 Dec 2000 13:42:37 -0800
Chuq Von Rospach wrote:

> At 9:21 PM -0800 12/13/00, J C Lawrence wrote:

>> >> http://www.gnqs.org/

>> What's performance like?

> Seems okay, but I'm not really stressing it.

I'm currently going back and forth on using an external queue
system. It's a nice idea, and there are definite administrative and
control benefits, but it's also moderately complex, is likely
unfamiliar to many/most SysAdms, adds a new level of fault
intolerance/dependency to distributed systems and exposes several
security concerns that aren't entirely attractive.

My current noodling is tending towards implementing a queue-like
system using light-weight self-discovering processes that could
easily be individually removed and their interconnects replaced by
calls to a real queueing system (whatever that may be) on an
ad-hoc/per-site basis.

The implementation complexity isn't that bad (the documentation
would be), but the number of interface abstractions is kinda scary.

--
J C Lawrence                                 claw@kanga.nu
---------(*)                  : http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Thu Dec 14 22:52:00 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 14:52:00 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: <19765.976833235@kanga.nu>
References: <31823.976583871@kanga.nu> <7746.976675279@kanga.nu> <24098.976771311@kanga.nu> <19765.976833235@kanga.nu>
Message-ID:

At 2:33 PM -0800 12/14/00, J C Lawrence wrote:

>I'm currently going back and forth on using an external queue
>system.
>It's a nice idea, and there are definite administrative and
>control benefits, but it's also moderately complex, [...]

You forgot compatibility and stability issues. Tie yourself to a queue
system, you only run on systems that queue system runs on, and you're
only as stable as that queue system is. So you end up adding a lot of
dependencies onto a piece of code you don't control that's core to the
program.

>My current noodling is tending towards implementing a queue-like
>system using light-weight self-discovering processes that could
>easily be individually removed and their interconnects replaced by
>calls to a real queueing system (whatever that may be) on an
>ad-hoc/per-site basis.

One thing I'm doodling with for my SMTP back end is starting up a
server that places a socket, then starting up "N" SMTP processes as
clients that grab addresses from the server one at a time for
delivery. This gets me completely away from this "slow DNS" stuff,
since any one slow address slows only itself, and since the system I'm
looking at is 100% verped/customized (ala Lyris's footers, at the
minimum), I'm not worrying about the added overhead (you could
potentially do batches through an interface). Using sockets means the
clients can go off-machine for free, as long as they know where to
look.

Now, maybe it could be something like that, a controlling process that
uses both threads and forks (and perhaps remote commands through rsh
or ssh) to spawn instances as needed...

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From barry@digicool.com Thu Dec 14 23:03:15 2000
From: barry@digicool.com (Barry A.
Warsaw) Date: Thu, 14 Dec 2000 18:03:15 -0500 Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject)) References: <31823.976583871@kanga.nu> Message-ID: <14905.20915.851635.680942@anthem.concentric.net> Cool, I've read through this thread and I think I see where you're heading. I like a lot of what you guys have posted, and much of it I agree with. I need to break down the separate threads so that each can be captured in the Wiki for posterity. >>>>> "JCL" == J C Lawrence writes: JCL> 4) While it seems a subtlesmall point, its bugging me. JCL> Given user account support, and messages to a given user JCL> bouncing, should that user be unsubscribed from only that JCL> list, or from all lists at that site? Where this is actually JCL> bugging me most is for virtual domains and whether or not JCL> lists in a virtual domains should be transparent or opaque to JCL> a bounce on a list in a different virtual domain? Here's what I've been thinking about. There should be a conceptual user account, with a primary key that may be internal to the system. Users can associate multiple email addresses with their account and can authenticate with the system using any of these addresses as their login. Any one of those email addresses can be deleted at any time with no restrictions, or the entire account can be wiped. I contend that people will remember (one of) their email addresses better than they will remember a login name they've had to specially craft for the site. Plus, it's likely they've already crafted some unique login @aol.com or @hotmail.com, so why have to potentially craft and remember yet another one just to interact with Mailman at this site? I'm leary about having shorter login names as abbreviations of the email address because today's unique-to-5-chars login may be tomorrow's collision. Authentication gets trickier when we're pulling users from external databases. Do we authenticate with the userid/password of that external database? 
Create our own for the mailing list account? I suspect we may need to allow multiple authentication paths for each user. Once authenticated, a user can edit their options, one of which is a mapping of email addresses in their account to mailing lists. They want to join mailman-users with "barry at digicool.com" but dc-bass with "barry at wooz.org". They can do this on their options pages. Each mailing list in fact may have a vector of addresses to try for this user. Perhaps there's a default for all lists unless specifically overridden. Perhaps a user can create personal distribution vectors and then can assign a distro vector to a mailing list. When an address is disabled due to bouncing, Mailman can fallover to the next address in the distro vector. This way, users can plan ahead if they know they're moving. Plus it gives Mailman a way to notify a user when their primary delivery address begins to hard fail. To add a new address to your distro vector, a confirmation transaction will have to be approved by both an address on the vector already, and the new address being added to the vector. The actual mechanism of this confirmation will be discussed in a separate thread. So what about virtual domains? >>>>> "CVR" == Chuq Von Rospach writes: CVR> I unsubscribe from the site. I'm sure at some point, an email CVR> sent from A might bounce and still be valid if sent from B, CVR> but that case is so rare I wouldn't think of wasting time on CVR> it, because the only way I can see taht happen (minus broken CVR> systems, of course) is someone who decides to try to CVR> unsubscribe by blocking a list, isntead of following the CVR> directions. And I don't see we need to write code into CVR> mailman to help users not follow the instructions.... (grin) Agreed. CVR> since we've talked about a single data store for subscriber CVR> data, I think you do it globally. If they really want CVR> opaqueness across virtual domains, run mujltiples copies of CVR> Mailman. 
that'll still be an option, after all. I completely agree. In fact, as a user I probably want /more/ globalization of my options and distro vectors, not less. That may be at odds with what some sites want, but too bad. They either run multiple copies or hack the code themselves. What I mean is, that I'm a member of 10 lists on python.org, 3 or 4 on zope.org, a dozen on SourceForge, and handfuls on other sites. We can excuse the non-Mailman sites their shortsightedness, but I would really love to be able to manage my subscriptions to all those lists in a seamless, transparent manner. Maybe this is done by editing my accounts with an app on my local system that interfaces to all those Mailmen via CORBA. Maybe it's a Java applet that I configure locally with the list of sites, and it screenscrapes and presents a consolidated view to me. Maybe it only works with a cooperative federation of Mailmen sharing ZEO connections. Whatever. I'm not expecting this to be implemented first, or even ever, but it's my vision of how a user /should/ be able to interact with the system. -Barry From chuqui@plaidworks.com Thu Dec 14 23:27:32 2000 From: chuqui@plaidworks.com (Chuq Von Rospach) Date: Thu, 14 Dec 2000 15:27:32 -0800 Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject)) In-Reply-To: <14905.20915.851635.680942@anthem.concentric.net> References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> Message-ID: At 6:03 PM -0500 12/14/00, Barry A. Warsaw wrote: >Here's what I've been thinking about. There should be a conceptual >user account, with a primary key that may be internal to the system. absoldefinitely. My current big machine uses the email address as the primary key in the databases. Makes sense at some level -- until you realize the email address changes (a lot). So in the redesign I'm doing, I'm assigning a user_id to an account, and it's unique to that account. 
you can then attach an email address to the account, and not have to worry about it wandering out into the rest of the database so it's easy to update. (one of these minutes I'll finish the key parts of the schema and post it). Assigning multiple addresses to an account, and defining one as the "receiver" address and the rest as poster addresses is a small enhancement from there. It complicates bounce processing some, but a well-designed search system onto the email address (more on that later, if there's interest) gets around it. turns out you rarely have to do brute force searches for an address if you put a little thought into it. >I contend that people will remember (one of) their email addresses >better than they will remember a login name they've had to specially >craft for the site. true. in fact, I probably wouldn't advertise an account/login name. Instead, I'd use a password and any defined email, perhaps with a few carefully chosen heuristics to help find them if they're confused (for instance, users use earthlink.com and earthlink.net interchangeably). You still have problems wehre companies re-arrange their e-mail name space and don't tell the workers (and that happens more often than you might think) but leave in aliases for the old names, but you won't get 100%. To get into the account, you need one email address attached to it, and the password. To get the passowrd, you need to know any attached email address, and it's sent to that address. That way, they don't need to remember anything, but if they want to, they can. >Authentication gets trickier when we're pulling users from external >databases. Do we authenticate with the userid/password of that >external database? Create our own for the mailing list account? I >suspect we may need to allow multiple authentication paths for each >user. 
And another option is "none" -- where Mailman is simply the delivery
agent for an address system controlled elsewhere and which users aren't
allowed to update via Mailman. Once you start going to external
databases, either they're likely to be holding stores for a standard
mailman database, or they're likely to be severely restricted access,
or read-only from the Mailman point of view.

>Each mailing list in fact may have a vector of addresses to try for
>this user. Perhaps there's a default for all lists unless
>specifically overridden. Perhaps a user can create personal
>distribution vectors and then can assign a distro vector to a mailing
>list.

As a side note -- if we do this, we need to make sure we can assign
different addresses to different lists, all under the same account. So
if someone wants to put each list to a different address, they can...
that starts turning into an N x N mapping, so it can get complex (and
it implies that the account has an account ID, which points to 1 -> N
email addresses, which each have an email ID, which is what's used to
do the actual subscription. So the schema starts getting complex...)

>To add a new address to your distro vector, a confirmation transaction
>will have to be approved by both an address on the vector already, and
>the new address being added to the vector.

After the first one, why? (note for future murmuring: leave an
interface for the ability to build different validation setups, or
allow them to validate via one of many. don't hardwire mailbacks as THE
validation setup...) -- you have an audit trail back to the person, so
if they decide to try to spam someone you know who they are, and who to
shoot. As long as you don't lose the authenticity trail, once is all
you need (that would, I think, require authenticating another address
before allowing deletion of the one that's authenticated, and disabling
any account when all of the authenticated addresses are disabled by
bounce processing...)
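
The account/address/vector ideas traded in this exchange can be sketched in a few lines of Python. Everything here is hypothetical: the class names, fields, and methods are invented for illustration and are not Mailman code. The sketch assumes an internal user_id as the key, per-list "distro vectors" with a default, fallover to the next confirmed, non-bouncing address, and Chuq's two rules (another authenticated address must remain before one is removed, and the account counts as disabled once no usable address is left).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Address:
    email: str
    confirmed: bool = False  # passed some validation step (mailback or other)
    bouncing: bool = False   # disabled by bounce processing


@dataclass
class Account:
    user_id: int  # internal key; deliberately not an email address
    addresses: Dict[str, Address] = field(default_factory=dict)
    # list name -> ordered addresses to try (a "distro vector")
    vectors: Dict[str, List[str]] = field(default_factory=dict)
    default_vector: List[str] = field(default_factory=list)

    def add_address(self, email: str, confirmed: bool = False) -> None:
        key = email.lower()
        self.addresses[key] = Address(key, confirmed)

    def usable(self) -> List[Address]:
        return [a for a in self.addresses.values()
                if a.confirmed and not a.bouncing]

    @property
    def disabled(self) -> bool:
        # No confirmed, deliverable address left: treat the account as off.
        return not self.usable()

    def delivery_address(self, listname: str) -> Optional[str]:
        # Walk the list's vector (or the default) and fall over to the
        # first address that is confirmed and not bouncing.
        for email in self.vectors.get(listname, self.default_vector):
            addr = self.addresses.get(email)
            if addr and addr.confirmed and not addr.bouncing:
                return email
        return None

    def remove_address(self, email: str) -> None:
        key = email.lower()
        if not [a for a in self.usable() if a.email != key]:
            raise ValueError("confirm another address before removing this one")
        del self.addresses[key]
```

Usage would be along the lines of giving an account two confirmed addresses, pointing one list at the second address only, and letting every other list use the default vector; marking the first address as bouncing then makes delivery_address() return the second one for those lists.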
>So what about virtual domains? Um, well... At a high level, it's simply another table in the schema, where you attach a host (and associated UI) around the core of mailman. So you could potentially (he says hopefully) have a list live on multiple domains, or even the same domain under multiple UIs through careful abuse of this interface... (why do the latter? how about allowing me to have a "test" UI on the domain for development purposes, but having two 'mailman systems' on one virtual host under two URLs?) >What I mean is, that I'm a member of 10 lists on python.org, 3 or 4 on >zope.org, a dozen on SourceForge, and handfuls on other sites. We can >excuse the non-Mailman sites their shortsightedness, but I would >really love to be able to manage my subscriptions to all those lists >in a seamless, transparent manner. but -- maybe the sites don't want that done? For marketing purposes, for identification purposes, for whatever purposes? I don't even WANT to start thinking about sharing user data across physical machines. Virtual hosts are enough joy here. I"m of two minds here. One mind sees a reason why virtual hosts don't want to share -- but in that case, do we build this into the system, and if so, how? It's one set of data, and if they need to be left alone, aren't they better off running their own unique instance of mailman? (and can we validate the security of their data? I don't think we want to go there) But my other mind looks at trying to do all this, and shudders. So I guess I'm in the school, at least right now, of saying "we have one mailman engine, N lists, M vhosts. And for every vhost, there are a subset of those N lists published, but if you access the admin page through that vhost, you get that vhost's UI -- but it's a portal into the global mailman data setup. If they have to be kept separate, run multiple instances of mailman with separate data stores. 
The only burble I have with that is -- making sure the user knows where
a given list exists in the space, so they can access it if they run
into it while doing admin on a site where it's not published. Either
that, or (I'd probably prefer it this way) you can't get info on
subscribing to a list from other than a vhost it's advertised on, but
anything you're already subscribed to, you can manage. Basically, I
guess, I'm treating public lists on other vhosts as private lists on
this vhost... (I think that works. yet?))

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From barry@digicool.com Fri Dec 15 00:05:25 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 19:05:25 -0500
Subject: [Mailman-Developers] Components and pluggablility
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu>
Message-ID: <14905.24645.631076.945270@anthem.concentric.net>

I like the idea of process queues, but I don't want to take the
federation-of-processes architecture too far. Yes, we want a component
architecture, but where I see the process boundaries is at the message
queue level.

For the delivery of messages, I see Mailman's primary job as
moderation-and-munge. Messages come into the system from the MTA,
nntp-scraper, web-board poster, or are internally crafted. All these
things end up in the incoming queue. They need to be approved,
rewritten, moderated, and eventually sent on to various outbound
queues: nntp-poster, smtp-delivery, archiver, etc.

Some of these are completely independent of the Mailman databases.
E.g. it is a mistake that SMTPDirect is in the message pipeline in 2.0
because once a message hits this component, its future disposition is
(largely) independent of the rest of the system.
So in my view, when Mailman decides that a message can be delivered to
a membership list, it's dropped fully formed in an outbound queue. The
file formats are the interface b/w Mailman and the queue runners and
should be platform (i.e. Python) independent. That way, I can ship a
simple queue runner that takes messages from the outbound queue and
hands them off to the smtpd, but /you/ could drop in a different runner
process that uses GNQS to distribute load across an infinitely
expandable smtpd server farm.

[Side note. Here's another reason why I'm keen on ZODB/ZEO as the
underlying persistency mechanism for internal Mailman data: I believe
we can parallelize the moderate-and-munge part of message processing.
Because the ZEO protocols serialize writes at commit time, you could
have multiple moderate-and-munge processes running on a server farm and
guarantee db consistency across them. What I don't know is how ZEO
would perform given a write-intensive environment (and maybe Mailman
isn't as write intensive as I think it is). But even if it sucks, it
simply means that the moderate-and-munge part won't be efficiently
parallelizable until that's fixed.]

>>>>> "JCL" == J C Lawrence writes:
>>>>> "CVR" == Chuq Von Rospach writes:

JCL> There are five basic transition points for a message passing
JCL> thru a mailing list server:

| 1) Receipt of message by local MTA

| 1a) passthrough of message via a security wrapper from MTA to
| list server... (I think it's important we remember that, because
| we can't lose it, and it involves a layer of passthrough and a
| process spawning, so it's somewhat heavyweight -- but
| indispensable)

No problems here, because I see these as being outside the bounds of
the MLM. The MLM has an incoming queue and it expects messages in a
particular format (very likely just RFC822 text files). These arrive
here via whatever tortuous path is necessary: MTA->security wrapper,
nntpd->news scraper, web board cgi poster, etc.
| 2) Receipt by list server | 3) Approval/editing/moderation What I've been calling moderate-and-munge. | 4) Processing of message and emission of any resultant message(s) Here's where the output queues and process boundaries come it. Once they're in the outbound queues, Mailman's out of the loop. | 5) Delivery of message to MTA for final delivery. Again, that's the responsibility of the mta-qrunner, be it a simple minded Python process like today's qrunner, or batch processing system like you've been investigating. These processes are not completely independent of Mailman though, e.g. for handling hard errors at smtp transaction time or URL generation for summary digests. Some of these can be handled by re-injection into the message queues (i.e. generate a bounce message and stick it in the bounce queue), but some may need an rpc interface. | 6) delivery of message to non-MTA recipients (the archiver, the | logging thing, the digester, the bounce processor....) Each of these should be separate queues with defined process interfaces, but again there may be synchronous information communicated back to Mailman. The archiver discussions we've had come to mind here. CVR> and besides, they are basically independent, asynchronous CVR> processes that don't need to be managed by any of the core CVR> logic, other than handing messages into their queue and CVR> making sure they stay running. same with, IMHO, storing CVR> messages for archives, storing messages for digests, updating CVR> archives, processing digests (but the processed digest is fed CVR> back into the core logic for delivery), and whatever else we CVR> decide it needs to do that isn't part of the core, CVR> time-sensitive code base. (in fact, there's no reason why you CVR> couldn't have multiple flavors of these things, feeding CVR> archives into an mbox, another archiver into mhonarc or CVR> pipermail, something that updates the search engine indexes, CVR> and text adn mime digesters... 
by turning them into their own CVR> logic streams with their own queues, you effectivley have CVR> just made them all plug-in swappable, because you're writing CVR> to a queue, and not worrying about what happens once its CVR> there. you merely need to make sure it goes in the right CVR> queue, in the approved format. I agree! -Barry From barry@digicool.com Fri Dec 15 00:08:34 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Thu, 14 Dec 2000 19:08:34 -0500 Subject: [Mailman-Developers] Configuration Safety References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> Message-ID: <14905.24834.431074.188380@anthem.concentric.net> CVR> ObTheme: All configuration should be possible via the web, CVR> even if the system is misconfigured and CVR> non-functional. Anything that can NOT be safely reconfigured CVR> without breaking the system should not be configurable via CVR> the web. (in other words, anything you can change, you should CVR> be able to change remotely, unless you can break the CVR> ssytem. If you cna break the system, you shouldn't be allowed CVR> near it trivially...) Agreed completely. web_page_url comes to mind here. :( -Barry From barry@digicool.com Fri Dec 15 00:18:34 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Thu, 14 Dec 2000 19:18:34 -0500 Subject: [Mailman-Developers] Responsiveness References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> Message-ID: <14905.25434.202694.941787@anthem.concentric.net> >>>>> "CVR" == Chuq Von Rospach writes: CVR> The obverse of that is that end-users seriously dislike CVR> delays, especially on conversational lists. It turns into the CVR> old "user expectation" problem -- it's better to hold ALL CVR> mail for 15 minutes so users come to expect it than to CVR> normally deliver mail in 2 minutes, except during the worst CVR> bulges... That's been my experience as well. 
People expect email to do strange things to conversations, like show
you replies before you've seen the original message, or have turnaround
times on the order of quarter-hours or whatever. It's when the behavior
they expect changes that people start to notice.

Case in point: I recently moved most of the python.org mailing lists to
a new, faster machine with better network connectivity. Turnaround time
tanked and I started getting complaints that messages weren't being
seen in inboxes after 6 hours or so. To make matters worse, those
messages were /in/ the archive!

Ah ha! /etc/syslog.conf was configured to log mail.* and syslogd was
starving the MTA.

CVR> But in general, the MLM should deliver as fast as
CVR> it reasonably can without overloading the MUA, which implies
CVR> some kind of monitoring setup for the MUA, or some
CVR> user-controlled throttling system. the latter unfortunately,
CVR> implies teaching admins how to monitor and adjust, a support
CVR> issue. The former implies writing an interface for every MTA
CVR> -- a development AND support issue.

Let me change that a little bit. The MLM should /process/ messages as
fast as possible, getting them through the moderate-and-munge and into
the outbound queue at top speed possible. Once that message is sitting
in that outbound queue, it's that queue's runner process that can be
configured to throttle, distribute, batch, whatever it takes.

It's not the MLM's problem (not to say it isn't the problem of the
simple-minded qrunner script we distribute and enable by default). All
I need to do is document the file format for the outbound queue files
and site administrators can take it from there.

-Barry

From barry@digicool.com Fri Dec 15 00:20:36 2000
From: barry@digicool.com (Barry A.
Warsaw) Date: Thu, 14 Dec 2000 19:20:36 -0500 Subject: [Mailman-Developers] Web Interface References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <31145.976605331@kanga.nu> Message-ID: <14905.25556.797790.749195@anthem.concentric.net> >>>>> "JCL" == J C Lawrence writes: JCL> I'm currently working under the generous assumption that its JCL> possible to cook up a web interface design for almost JCL> anything, so I'm punting there for now. Valid assumption. But of course, this is another componentization dimension. -Barry From barry@digicool.com Fri Dec 15 00:59:41 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Thu, 14 Dec 2000 19:59:41 -0500 Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject)) References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> Message-ID: <14905.27901.891059.922294@anthem.concentric.net> CVR> So in the redesign I'm doing, I'm assigning a user_id to an CVR> account, and it's unique to that account. you can then attach CVR> an email address to the account, and not have to worry about CVR> it wandering out into the rest of the database so it's easy CVR> to update. (one of these minutes I'll finish the key parts of CVR> the schema and post it). Assigning multiple addresses to an CVR> account, and defining one as the "receiver" address and the CVR> rest as poster addresses is a small enhancement from there. Very cool. CVR> It complicates bounce processing some, but a well-designed CVR> search system onto the email address (more on that later, if CVR> there's interest) gets around it. turns out you rarely have CVR> to do brute force searches for an address if you put a little CVR> thought into it. Good. CVR> true. in fact, I probably wouldn't advertise an account/login CVR> name. I agree. 
CVR> Instead, I'd use a password and any defined email, perhaps CVR> with a few carefully chosen heuristics to help find them if CVR> they're confused (for instance, users use earthlink.com and CVR> earthlink.net interchangeably). Hadn't thought about that, and that is a good point. CVR> You still have problems wehre companies re-arrange their CVR> e-mail name space and don't tell the workers (and that CVR> happens more often than you might think) but leave in aliases CVR> for the old names, but you won't get 100%. Right. I don't think 100% coverage is necessary. I really want to handle the situations where someone knows his email is going to change, or one where they purposely want different lists to deliver to different addresses. Here's another complication: are the delivery options set per-address or per-list? Maybe I want all deliveries to "barry at wooz" to be digests. Maybe I want lists A, B, and C to be regular deliveries. CVR> To get into the account, you need one email address attached CVR> to it, and the password. To get the passowrd, you need to CVR> know any attached email address, and it's sent to that CVR> address. That way, they don't need to remember anything, but CVR> if they want to, they can. Agreed. CVR> And another option is "none" -- where Mailman is simply the CVR> delivery agent for an address system controlled elsewhere and CVR> whic users aren't allowed to update via Mailman. Once you CVR> start going to external databases, either they're likely to CVR> be holding stores for a standard mailman database, or they're CVR> likely to be severely restricted access, or read-only from CVR> the Mailman point of view. Good point. CVR> As a side note -- if we do this, we need to make sure we can CVR> assign different addresses to different lists, all under the CVR> same account. Yup, absolutely, which is why I posed the config option question above. CVR> So if someone wants to put eachlist to a CVR> different address, they can... 
that starts turning into an N CVR> x N mapping, so it can get complex (and it implies that the CVR> account has an account ID, whic points to 1 -> N email CVR> addresses, which each have an email ID, whic is what's used CVR> to do the actual subscription. So the schema starts getting CVR> complex...) I see a level of indirection coming to the rescue. MailingLists have Rosters and Rosters have EmailAddresses which in turn link back to the UserAccount. A MailingList might actually want to deliver to multiple Rosters, which is where I think the umbrella list stuff could be improved. I.e. you have a Roster for mailman-developers and a Roster for mailman-users and mailman-announce contains a computed Roster composed of those first two, along with it's own Roster. Now you send a message out to mailman-announce and everybody gets it (although what do you do about Subject: munging, footer addition List-* headers and the like?). I also see creation of Rosters for list owners, site administrators and so on, so you could do things like compute a Roster for all-list-owners@mysite.com if you had urgent information for your list admins. CVR> After the first one, why? (note for future mumuring: leave an CVR> interface for the ability to build different validation CVR> setups, or allow them to validate via one of many. don't CVR> hardware mailbacks as THE validation setup...) Agreed (and I have some thoughts on the mailback thread which I'll try to get to separately). CVR> -- you have an CVR> audit trail back to the person, so if they decide to try to CVR> spam someone you know who they are, and who to shoot. As long CVR> as you don't lose the authenticity trail, once is all you CVR> need (that would, I think, require authenticating another CVR> address before allowing deletion of the one that's CVR> authenticated, and disabling any account when all of the CVR> authenticated addresses are disabled by bounce processing...) 
I'm concerned about the scenario where I subscribe to a list, then add your address to my account, then disable my address because I'm "going on vacation". Now suddenly you're getting flooded with postings that make no sense to you. You don't know anything about Mailman, so you're not even sure where to start complaining. And it could get very annoying very quickly. Sure, once the admins are aware of the problem they can trace it back to me and I'll get slapped around, but in the meantime you're really pissed off.

Whereas if your address was confirmed when I tried to add it, you might still be pretty confused, but you shouldn't be annoyed. (Come to think of it, without protections, this is a nice annoyance spam route or DoS.)

>What I mean is, that I'm a member of 10 lists on python.org, 3 or 4 on
>zope.org, a dozen on SourceForge, and handfuls on other sites. We can
>excuse the non-Mailman sites their shortsightedness, but I would
>really love to be able to manage my subscriptions to all those lists
>in a seamless, transparent manner.

CVR> but -- maybe the sites don't want that done? For marketing purposes, for identification purposes, for whatever purposes?

No doubt. It's what I as a user of Mailman want though. :)

CVR> I don't even WANT to start thinking about sharing user data across physical machines. Virtual hosts are enough joy here.

No, no, no, neither do I. I was just musing. Not to be seriously considered for 3.0 (if at all!).

CVR> So I guess I'm in the school, at least right now, of saying "we have one mailman engine, N lists, M vhosts. And for every vhost, there are a subset of those N lists published, but if you access the admin page through that vhost, you get that vhost's UI -- but it's a portal into the global mailman data setup. If they have to be kept separate, run multiple instances of mailman with separate data stores.

I'm firmly in agreement.
CVR> Either that, or (I'd probably prefer it this way) you can't get info on subscribing to a list from other than a vhost it's advertised on, but anything you're already subscribed to, you can manage. Basically, i guess, I'm treating public lists on other vhosts as private lists on this vhost... (I think that works. yet?))

Yes, that's what I've been thinking too.

-Barry

From chuqui@plaidworks.com Fri Dec 15 00:31:11 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 16:31:11 -0800
Subject: [Mailman-Developers] Responsiveness
In-Reply-To: <14905.25434.202694.941787@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.25434.202694.941787@anthem.concentric.net>
Message-ID:

At 7:18 PM -0500 12/14/00, Barry A. Warsaw wrote:

>Let me change that a little bit. The MLM should /process/ messages as
>fast as possible, getting them through the moderate-and-munge and into
>the outbound queue at top speed possible. Once that message is
>sitting in that outbound queue, it's that queue's runner process that
>can be configured to throttle, distribute, batch, whatever it takes.

buy that man a beer.

>It's not the MLM's problem (not to say it isn't the problem of the
>simple-minded qrunner script we distribute and enable by default).
>All I need to do is document the file format for the outbound queue
>files and site administrators can take it from there.

except I see this as still part of the MLM, since it's the tool doing the MLM->MTA handoff, not part of the MTA itself.

chuq

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.
From klm@digicool.com Fri Dec 15 02:01:30 2000
From: klm@digicool.com (Ken Manheimer)
Date: Thu, 14 Dec 2000 21:01:30 -0500 (EST)
Subject: [Mailman-Developers] Re: Mailman-Developers digest, Vol 1 #741 - 8 msgs
In-Reply-To: <20001215005707.230CDE841@mail.python.org>
Message-ID:

This is one case where i really mean for the message subject to refer to the digest as a whole - because today's material, this digest, is *chock-full* of material, and i want to make a suggestion about it: maybe it's time to do some harvesting - take a look at carving out some space in barry's post-2.0 mailman wiki:

http://www.zope.org/Members/bwarsaw/MailmanDesignNotes/FrontPage

and filling in some areas covered by these postings. (It may be worthwhile to do a top-down, "what are the topic areas" first kind of approach - and then list various variations being discussed, for controversial areas; it looks like there's a *lot* of agreement about some substantial portions of a Mailman Next Generation world view.)

I know i'm unable to do more than scan most of it, and even if you are savoring every bit, it's flowing by - tomorrow you'll either have to rehash it or have it fade - unless you get it in some stable, growable home...

Ken
klm@digicool.com

From claw@kanga.nu Fri Dec 15 02:27:17 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 18:27:17 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from Chuq Von Rospach of "Thu, 14 Dec 2000 15:27:32 PST."
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net>
Message-ID: <9024.976847237@kanga.nu>

On Thu, 14 Dec 2000 15:27:32 -0800 Chuq Von Rospach wrote:

> At 6:03 PM -0500 12/14/00, Barry A. Warsaw wrote:

>> Here's what I've been thinking about. There should be a conceptual user account, with a primary key that may be internal to the system.

> absoldefinitely.
> My current big machine uses the email address as the primary key in the databases. Makes sense at some level -- until you realize the email address changes (a lot).

> So in the redesign I'm doing, I'm assigning a user_id to an account, and it's unique to that account. you can then attach an email address to the account, and not have to worry about it wandering out into the rest of the database so it's easy to update. (one of these minutes I'll finish the key parts of the schema and post it). Assigning multiple addresses to an account, and defining one as the "receiver" address and the rest as poster addresses is a small enhancement from there.

I *like* email-address based IDs. You can get the best of both worlds with little effort, much as you describe. Abstract UserIDs are a nice idea that users hate for the same reason they hate passwords -- because it is yet another abstract string that they have to remember.

> true. in fact, I probably wouldn't advertise an account/login name. Instead, I'd use a password and any defined email, perhaps with a few carefully chosen heuristics to help find them if they're confused (for instance, users use earthlink.com and earthlink.net interchangeably).

Yeouch.

> And another option is "none" -- where Mailman is simply the delivery agent for an address system controlled elsewhere and which users aren't allowed to update via Mailman. Once you start going to external databases, either they're likely to be holding stores for a standard mailman database, or they're likely to be severely restricted access, or read-only from the Mailman point of view.

This gets worse: What about list "addresses" where the membership list is entirely dynamic such that a message sent to it will be broadcast to a list of addresses based on the content of the message?
Heck, remove the entire concept of mailing lists entirely and make the very existence of a list and its configs and membership dynamic: You send a message to @lists.domain, Mailman receives it, does an LDAP query for every user with a attribute on their record, and broadcasts the message to that generated list of addresses. Marketeers will love this sort of thing when run against customer databases.

The advantage that an MLM brings to the table is in scalability, bounce, and unsubscribe handling. Account customisations (digests, metoo, nomail, etc) of course also pass outside of Mailman's purview once account data goes outside.

>> Each mailing list in fact may have a vector of addresses to try for this user. Perhaps there's a default for all lists unless specifically overridden. Perhaps a user can create personal distribution vectors and then can assign a distro vector to a mailing list.

> As a side note -- if we do this, we need to make sure we can assign different addresses to different lists, all under the same account. So if someone wants to put each list to a different address, they can... that starts turning into an N x N mapping, so it can get complex (and it implies that the account has an account ID, which points to 1 -> N email addresses, which each have an email ID, which is what's used to do the actual subscription. So the schema starts getting complex...)

It's complicated until people get involved. Then it gets messy.

The problem is that this is an insolvable problem at our level. Management is going to have one need, SysAdm another, and the end users a third. At different times and different sites and in different instances, each one is going to win some of the time. We can't solve this one. What we can do is say:

If you are going to take control of membership list definition, you have to take control of all of it, and that includes resolution of ambiguities!
Not very pleasant mayhap, but whatever else we do someone is (quite rightfully) going to scream

> I don't even WANT to start thinking about sharing user data across physical machines. Virtual hosts are enough joy here.

This is a problem we can handle by punting. We provide an internal membership solution and make it easily replaceable by externals. The external solutions can then do whatever their little masochistic hearts desire, but yet again they have to take responsibility for the whole problem, and that includes ambiguity resolution.

> I'm of two minds here. One mind sees a reason why virtual hosts don't want to share -- but in that case, do we build this into the system, and if so, how? It's one set of data, and if they need to be left alone, aren't they better off running their own unique instance of mailman? (and can we validate the security of their data? I don't think we want to go there)

Virtual hosts are a hack, and an ugly hack at that. I don't see that they are worth wasting time on when multiple installation appeases the privacy concerns.

<>

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Fri Dec 15 02:35:02 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 18:35:02 -0800
Subject: [Mailman-Developers] Responsiveness
In-Reply-To: Message from Chuq Von Rospach of "Thu, 14 Dec 2000 16:31:11 PST."
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.25434.202694.941787@anthem.concentric.net>
Message-ID: <9493.976847702@kanga.nu>

On Thu, 14 Dec 2000 16:31:11 -0800 Chuq Von Rospach wrote:

> At 7:18 PM -0500 12/14/00, Barry A. Warsaw wrote:

>> It's not the MLM's problem (not to say it isn't the problem of the simple-minded qrunner script we distribute and enable by default).
>> All I need to do is document the file format for the outbound queue files and site administrators can take it from there.

> except I see this as still part of the MLM, since it's the tool doing the MLM->MTA handoff, not part of the MTA itself.

We need to define what an MLM is and does:

We're entering a scary territory which goes far beyond the standard models of subscriber-based list to more dynamic work-flow and collaborative flow oriented entities where really the only thing the MLM does is provide a named membership list, which it will yield on query after passing some authentication mechanism such that the membership list is associated with a particular message.

Actual delivery of messages, MTAs, transports, authentication mechanisms, membership definitions, account definitions, etc, are really outside of its purview. All an MLM does is provide names which can be associated with lists of addresses, and a means for associating those lists of addresses with a message.

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From barry@digicool.com Fri Dec 15 02:49:23 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 21:49:23 -0500
Subject: [Mailman-Developers] Responsiveness
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.25434.202694.941787@anthem.concentric.net>
Message-ID: <14905.34483.401067.491743@anthem.concentric.net>

>>>>> "CVR" == Chuq Von Rospach writes:

>> It's not the MLM's problem (not to say it isn't the problem of the simple-minded qrunner script we distribute and enable by default). All I need to do is document the file format for the outbound queue files and site administrators can take it from there.

CVR> except I see this as still part of the MLM, since it's the tool doing the MLM->MTA handoff, not part of the MTA itself.

Fair enough. It's definitely not part of the MTA.
-Barry

From claw@kanga.nu Fri Dec 15 02:50:29 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 18:50:29 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Thu, 14 Dec 2000 19:59:41 EST." <14905.27901.891059.922294@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net>
Message-ID: <11218.976848629@kanga.nu>

On Thu, 14 Dec 2000 19:59:41 -0500 Barry A Warsaw wrote:

> Here's another complication: are the delivery options set per-address or per-list? Maybe I want all deliveries to "barry at wooz" to be digests. Maybe I want lists A, B, and C to be regular deliveries.

We can generalise this problem into insolvability. Arguably customisation at the list level is the lowest common denominator (any other config can be built from there), so we should do that. This is especially pleasant as, should someone want to do something else, they can just replace the membership tools and do their own thing to provide their own model.

Repeat after me:

Mailman is not trying to solve all problems!

Mailman is trying to provide a toolkit such that users can solve problems in ways they prefer!

We're providing a toolkit, a set of relationships, and a reference implementation of how they *might* be related and installed in practice. From there it's up to SysAdms and end users.

CVR> As a side note -- if we do this, we need to make sure we can assign different addresses to different lists, all under the same account.

> Yup, absolutely, which is why I posed the config option question above.

Note: We need to allow account customisation for accounts we maintain *AND* for accounts that external agencies maintain.
We also need to allow lists to be populated both with accounts that can be customised, and accounts that can't (eg corporate addresses and outside subscriptions).

I actually acount at the point that I think Mailman shouldn't have a default membership implementation, but again, just one or more reference implementations.

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From barry@digicool.com Fri Dec 15 02:57:08 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 21:57:08 -0500
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu>
Message-ID: <14905.34948.21105.707959@anthem.concentric.net>

>>>>> "JCL" == J C Lawrence writes:

JCL> What about list "addresses" where the membership list is entirely dynamic such that a message sent to it will be broadcast to a list of addresses based on the content of the message?

You've just described Roundup's nosy lists:

http://software-carpentry.codesourcery.com/entries/track/Roundup/Roundup.html

T'would be very cool to support this in Mailman. I'm calling this Computed Rosters.

JCL> If you are going to take control of membserhip list definition, you have take control of all of it, and that encludes resolution of ambiguities!

Excellent point. I agree.

-Barry

From barry@digicool.com Fri Dec 15 03:07:36 2000
From: barry@digicool.com (Barry A.
 Warsaw)
Date: Thu, 14 Dec 2000 22:07:36 -0500
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <11218.976848629@kanga.nu>
Message-ID: <14905.35576.871041.973403@anthem.concentric.net>

>>>>> "JCL" == J C Lawrence writes:

JCL> We're providing a toolkit, a set of relationships, and a reference implementation of how they *might* be related an installed in practice. From there its up to SysAdms and end users.

Looked at slightly differently, Mailman is a framework with interfaces to key components. Those components can be customized or replaced via subclassing or interface conformance. Either way...

JCL> I actually acount at the point that I think Mailman shouldn't have a default membership implementation, but again, just one or more reference implementations.

...it adds up to Mailman providing those reference implementations in an easy to install and use distro, but allowing the flexibility for sites to drop-in replace bits and pieces as necessary.

-Barry

From barry@digicool.com Fri Dec 15 03:12:05 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 22:12:05 -0500
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
References: <20001213202818.R1405@smack.uchicago.edu>
Message-ID: <14905.35845.531685.710323@anthem.concentric.net>

>>>>> "DJA" == D J Atkinson writes:

DJA> As long as this is going to the developers list, what do you all think of the possibility of adding the "filebase" to the log line of the smtp-failure log and/or the smtp log? I know this would increase the size of the logs, so maybe it would be an option/flag set in the Defaults.py/mm_cfg.py files? This would have been very helpful in tracking down those files that were sucking all the time out of my qrunner jobs.

Great idea!
From claw@kanga.nu Fri Dec 15 03:07:34 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 19:07:34 -0800
Subject: [Mailman-Developers] Re: Components and pluggablility
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Thu, 14 Dec 2000 19:05:25 EST." <14905.24645.631076.945270@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.24645.631076.945270@anthem.concentric.net>
Message-ID: <12343.976849654@kanga.nu>

On Thu, 14 Dec 2000 19:05:25 -0500 Barry A Warsaw wrote:

> I like the idea of process queues, but I don't want to take the federation-of-processes architecture too far. Yes, we want a component architecture, but where I see the process boundaries is at the message queue level.

There are in essence seven queues:

  1) inbound          Message arrives at the MLM
  2) authentication   Do I accept it?
  3) moderation       Does the list accept it?
  4) pending          Associate a distribution list with message
  5) outbound         Send it.
  6) bounce           Demote the subscriber
  7) command          A combo of #2, #3, and command processing.

There's a possible eighth for OOB stuff like archiving and digests, which I mostly see as a fork off the side of the pending queue.

> So in my view, when Mailman decides that a message can be delivered to a membership list, it's dropped fully formed in an outbound queue.

Not exactly. It drops a message, any relevant meta data, and a distribution list in the outbound queue. A delivery process then takes that and does what it will with them (eg VERP, application of templates, etc).

Process pipes...

> The file formats are the interface b/w Mailman and the queue runners and should be platform (i.e. Python) independent.

Bingo. This is a point I've invested considerable time into.
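[Editorial sketch, not part of the original thread.] The idea of dropping "a message, any relevant meta data, and a distribution list" into an outbound queue through a platform-independent file format could look roughly like the following. Everything here is an assumption made for illustration: the function names `enqueue_outbound` and `read_outbound`, the `.msg`/`.json` file pair, and the choice of JSON as the neutral format are hypothetical and are not the format Mailman used or that the posters proposed.

```python
import json
import os
import uuid


def enqueue_outbound(queue_dir, message_text, metadata, recipients):
    """Write one outbound entry as two plain files and return its id.

    Hypothetical layout: <id>.msg holds the raw message, <id>.json holds
    the metadata and the distribution list. Any language can read both.
    """
    entry_id = uuid.uuid4().hex
    msg_path = os.path.join(queue_dir, entry_id + ".msg")
    meta_path = os.path.join(queue_dir, entry_id + ".json")

    with open(msg_path, "w", encoding="utf-8") as f:
        f.write(message_text)

    # Write the sidecar under a temporary name and rename it last, so a
    # runner that looks for *.json never sees a half-written entry.
    tmp_path = meta_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"metadata": metadata, "recipients": list(recipients)}, f)
    os.replace(tmp_path, meta_path)
    return entry_id


def read_outbound(queue_dir, entry_id):
    """Return (message_text, metadata, recipients) for one queue entry."""
    with open(os.path.join(queue_dir, entry_id + ".msg"), encoding="utf-8") as f:
        message_text = f.read()
    with open(os.path.join(queue_dir, entry_id + ".json"), encoding="utf-8") as f:
        sidecar = json.load(f)
    return message_text, sidecar["metadata"], sidecar["recipients"]
```

Because the runner only depends on the two files, a replacement delivery process written in another language could consume the same directory, which is the point being made about file formats as the interface.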
> That way, I can ship a simple queue runner that takes messages from the outbound queue and hands them off to the smtpd, but /you/ could drop in a different runner process that uses GNQS to distribute load across an infinitely expandable smtpd server farm.

If you continue the same abstraction across all queues and the staging processes of queues, you build something that isn't inherently a queue run-system, it merely looks like one and can in fact be fairly trivially hung off a queue based system (MQM or whatever). Consider the following setup:

Three machines: HostA is the primary MX and receives the list mail along with all mail for the rest of the site. HostB has a private hole in the firewall and is the only host to have access to the backing stores for authentication and membership data. HostC has a nicely tuned MTA built for outbound processing.

Given a queue based system, supporting that, or something several dozen times more complex, becomes trivial. The problem is in making the architecture that runs on a single host without an external queue manager the same as the system above where different hosts each take responsibility for different queues in the message system.

It can be done, it just requires a little elegance.

> [Side note. Here's another reason why I'm keen on ZODB/ZEO as the underlying persistency mechanism for internal Mailman data: I believe we can parallelize the moderate-and-munge part of message processing. Because the ZEO protocols serialize writes at commit time, you could have multiple moderate-and-munge processes running on a server farm and guarantee db consistency across them.

There are problems with this due to the fact that external transactions (such as SMTP sends) are asynchronous and not nested in ZODB transactions.

>>>>>> "JCL" == J C Lawrence writes:
>>>>>> "CVR" == Chuq Von Rospach writes:

> These processes are not completely independent of Mailman though, e.g.
> for handling hard errors at smtp transaction time or URL generation for summary digests. Some of these can be handled by re-injection into the message queues (i.e. generate a bounce message and stick it in the bounce queue), but some may need an rpc interface.

Thus the pending queue above -- it allows a message to undergo a set of pre-post filters prior to landing in the outbound queue. Archiving, digests, all sorts of things can happen at that point.

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From barry@digicool.com Fri Dec 15 03:13:44 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 22:13:44 -0500
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
References: <20001213202818.R1405@smack.uchicago.edu>
Message-ID: <14905.35944.21010.553125@anthem.concentric.net>

>>>>> "DJA" == D J Atkinson writes:

DJA> I'm obviously not the expert, but my observations indicate that qrunner does complete the current message batch before checking to see if it's exceeded the "QRUNNER_PROCESS_LIFETIME" value, so you could always set it to the next message in the queue.

Actually, it doesn't. It checks QRUNNER_PROCESS_LIFETIME before processing every file in the directory listing.

-Barry

From claw@kanga.nu Fri Dec 15 03:12:26 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 19:12:26 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Thu, 14 Dec 2000 21:57:08 EST."
 <14905.34948.21105.707959@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu> <14905.34948.21105.707959@anthem.concentric.net>
Message-ID: <13040.976849946@kanga.nu>

On Thu, 14 Dec 2000 21:57:08 -0500 Barry A Warsaw wrote:

> You've just described Roundup's nosy lists:
> http://software-carpentry.codesourcery.com/entries/track/Roundup/Roundup.html

Brilliant stuff!

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From barry@digicool.com Fri Dec 15 03:20:45 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 22:20:45 -0500
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
References: <20001214121055.B964@moscow.com>
Message-ID: <14905.36365.931133.268502@anthem.concentric.net>

>>>>> "CVR" == Chuq Von Rospach writes:

CVR> what that means for Mailman is you can't tweak the sendmail.cf, and therefore can't use DELIVERY_MODULE = SMTPDirect. You'd have to instead use the Sendmail DELIVERY_MODULE (which I haven't tested, and which doesn't (sigh) use MAX_RCPTS). And you could add the -odd to SENDMAIL_CMD in the mm_cfg file to get this. But it changes other stuff, so...

Two things to note. First, if people want to get serious about Sendmail.py, someone please fix it not to go through the shell (i.e. don't use os.popen()). It's not hard to fix, just tedious and I don't want to spend the time on it. I'll take contributions of course. Be aware that in 2.1 DELIVERY_MODULE will probably not be part of the mainline message pipeline.

Second, someone posted patches to Python's smtplib.py so that the same interface can be used to "sendmail -bs" (i.e. sendmail run from the command line with SMTP commands on standard input). I don't remember who or where that was posted, but it's worth searching for and proofreading.
Or if the original author is reading this, resend me the url and I'll see about getting it into Python 2.1.

Does sendmail do the same synchronous DNS lookups when used in -bs mode?

giggle-ly y'rs,
-Barry

From claw@kanga.nu Fri Dec 15 03:24:58 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 19:24:58 -0800
Subject: [Mailman-Developers] (no subject)
In-Reply-To: Message from Chuq Von Rospach of "Thu, 14 Dec 2000 14:52:00 PST."
References: <31823.976583871@kanga.nu> <7746.976675279@kanga.nu> <24098.976771311@kanga.nu> <19765.976833235@kanga.nu>
Message-ID: <14188.976850698@kanga.nu>

On Thu, 14 Dec 2000 14:52:00 -0800 Chuq Von Rospach wrote:

> one thing I'm doodling with for my SMTP back end is starting up a server that places a socket, then starting up "N" SMTP processes as clients that grab addresses from the server one at a time for delivery. This gets me completely away from this "slow DNS" stuff, since any one slow address slows only itself, and since the system I'm looking at is 100% verped/customized (ala Lyris's footers, at the minimum), I'm not worrying about the added overhead (you could potentially do batches through an interface). using sockets means the clients can go off-machine for free, as long as they know where to look.

If the number of slow addresses at any instant exceeds N, the entire system stops for that period. The benefit of parallelisation in this case is that in *general* traffic will continue to flow given one or more bad addresses, and the assumption is that the rate of bad/slow addresses will never/rarely coincide to the point that the entire queue is bogged.

> now, maybe it could be something like that, a controlling process that uses both threads and forks (and perhaps remote commands through rsh or ssh) to spawn instances as needed...

I see this as orthogonal to Mailman or the MLM process. You could drop such a solution transparently into the outbound queue process.
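[Editorial sketch, not part of the original thread.] The "N clients grabbing addresses one at a time" scheme discussed above can be approximated in a single process with a shared work queue and N worker threads. This is a simplified stand-in that assumes threads instead of the socket server and separate SMTP processes Chuq describes; `deliver_all` and the `send` callable are hypothetical names, and no real SMTP is performed.

```python
import queue
import threading


def deliver_all(addresses, send, workers=4):
    """Deliver to every address using `workers` parallel workers.

    `send` is a caller-supplied callable(address) that raises on failure
    (hypothetical; in practice it would wrap one SMTP transaction).
    Returns a dict mapping each address to None on success or to the
    exception that was raised.
    """
    pending = queue.Queue()
    for address in addresses:
        pending.put(address)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                address = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to grab, this worker is done
            try:
                send(address)
                outcome = None
            except Exception as exc:  # a slow or bad address only costs this worker
                outcome = exc
            with lock:
                results[address] = outcome

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The sketch also shows the limit raised in the reply: with `workers=N`, once N addresses are stalled at the same moment every worker is blocked and nothing else moves until one of them returns.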
--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From barry@digicool.com Fri Dec 15 03:37:51 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 22:37:51 -0500
Subject: [Mailman-Developers] Re: Components and pluggablility
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.24645.631076.945270@anthem.concentric.net> <12343.976849654@kanga.nu>
Message-ID: <14905.37391.141053.210120@anthem.concentric.net>

>>>>> "JCL" == J C Lawrence writes:

| 1) inbound          Message arrives at the MLM
| 2) authentication   Do I accept it?
| 3) moderation       Does the list accept it?

Remind me again about the difference between 2 and 3, and why 2 is under the purview of the MLM.

| 4) pending          Associate a distribution list with message
| 5) outbound         Send it.
| 6) bounce           Demote the subscriber
| 7) command          A combo of #2, #3, and command processing.

I'm not totally sold that 4 needs a process boundary, or that 2, 3, and 4 aren't part of the same process, structured as interfaces in the framework.

> So in my view, when Mailman decides that a message can be delivered to a membership list, it's dropped fully formed in an outbound queue.

JCL> Not exactly. It drops a message, any relevant meta data, and a distribution list in the outbound queue. A delivery process then takes that and does what it will with them (eg VERP, application of templates, etc).

In Mailman 2.0 the distribution list is also metadata -- it lives in the .db file. Do you see that differently?

> [Side note. Here's another reason why I'm keen on ZODB/ZEO as the underlying persistency mechanism for internal Mailman data: I believe we can parallelize the moderate-and-munge part of message processing.
> Because the ZEO protocols serialize writes at commit time, you could have multiple moderate-and-munge processes running on a server farm and guarantee db consistency across them.

JCL> There are problems with this due to the fact that external transactions (such as SMTP sends) are asynchronous and not nested in ZODB transactions.

But I see the smtp sends as being in a separate process, i.e. the outbound qrunner. The easiest way I see of getting smtp hard failures back into Mailman is simply to craft an internal message and drop it in the bounce queue.

> These processes are not completely independent of Mailman though, e.g. for handling hard errors at smtp transaction time or URL generation for summary digests. Some of these can be handled by re-injection into the message queues (i.e. generate a bounce message and stick it in the bounce queue), but some may need an rpc interface.

JCL> Thus the pending queue above -- it allows a message to undergo a set of pre-post filters prior to landing in the outbound queue. Archiving, digests, all sorts of things can happen at that point.

Maybe I'm taking the term "distribution list" too narrowly in your description above. Do you mean "list of recipient addresses" or do you mean "internal queue routing destinations", or something else?

-Barry

From barry@digicool.com Fri Dec 15 03:54:10 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 22:54:10 -0500
Subject: [Mailman-Developers] 2 bugs, but I need a confirmation :-)
References: <20001212111117.J4396@xs4all.nl>
Message-ID: <14905.38370.371699.315689@anthem.concentric.net>

>>>>> "TW" == Thomas Wouters writes:

TW> 1) Subscription-confirmation-response-emails to *-request, with multiple attachments, fail. The problem is that Mailman tries to interpret the MIME boundary and content-type headers and what not as commands, rather than taking the first attachment and parsing that.
TW> This wasn't a real problem when I tested it on python-list, because my mailer doesn't put enough headers in the first MIME part, but customers of ours have seen honest problems with this. People mailing with HTML mail enabled, for instance, but also people who get a signature attached to the email, without being able to prevent it. This enforced signature is becoming more and more popular in clueless paranoid companies :P

The standard MIME parsing modules in Python suck. I want to fix this for 2.1, but haven't yet had time. mimectl might be the way to go, but I also think I want a DOM-like approach so I can slice and dice MIME messages, then just say "okay, spit out the results".

TW> 2) '\n.\n' screws up Mailman. This comes in two flavours :) If the '\n.\n' sequence is late enough in the email, Mailman doesn't notice, and the rest of the mail (including the '\n.\n') silently vanishes. If the sequence is a bit higher, Mailman does notice: sendmail stops the transmission while Mailman still has data to send. Mailman considers the mail not sent, and tries again later -- but the first part of the mail is sent to all recipients just fine.

Hmm, a simplistic approach on my personal test lists, using a rather stock Postfix doesn't seem to suffer from this. I have a couple of test lists floating around on various domains, although I think they all use Postfix. Thomas, I'll email you separately and you can try some things out.

-Barry

From claw@kanga.nu Fri Dec 15 04:59:12 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 20:59:12 -0800
Subject: [Mailman-Developers] Re: Components and pluggablility
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Thu, 14 Dec 2000 22:37:51 EST."
<14905.37391.141053.210120@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.24645.631076.945270@anthem.concentric.net> <12343.976849654@kanga.nu> <14905.37391.141053.210120@anthem.concentric.net>
Message-ID: <22532.976856352@kanga.nu>

On Thu, 14 Dec 2000 22:37:51 -0500 Barry A Warsaw wrote:

> "JCL" == J C Lawrence writes:

>> 1) inbound         Message arrives at the MLM
>> 2) authentication  Do I accept it?
>> 3) moderation      Does the list accept it?

> Remind me again about the difference between 2 and 3, and why 2 is under the purview of the MLM.

>> 4) pending         Associate a distribution list with message
>> 5) outbound        Send it.
>> 6) bounce          Demote the subscriber
>> 7) command         A combo of #2, #3, and command processing.

> I'm not totally sold that 4 needs a process boundary, or that 2, 3, and 4 aren't part of the same process, structured as interfaces in the framework.

Oddly enough I didn't write what I meant to write, and what I meant also wasn't what my own notes and documents say (which happen to be more correct than I was).

I'm looking at the following process. First the base model:

Configuration of what exactly happens to a message is done by dropping scripts/programs in specially named directories (it is expected that typically only SymLinks will be dropped (which makes the web interface easy -- it just creates and moves symlinks about)).

In general, processing of any item will consist of taking that item and iteratively passing it to every script in the appropriately named directory, invoking those scripts in directory sort order (cf SysV init scripts with /etc/rc.# etc). This makes the web config interface easy -- it just varies the numerical prefix on the scripts to enforce a processing order.

Of course each script must follow a defined contract as far as arguments, IO, return codes, and other behaviour.

The processing sequence for a message:

Message arrives in inbound queue.
Message is picked up and passed on to moderation, which consists of:

  a) Extracting a set of meta data from the message and any associated resources and then associating that meta data with the message (this is done as an efficiency support ala pre-processing (very useful for later template expansion)).

  b) Iteratively passing the message thru all the scripts in the moderation directory, in order, until either one returns non-zero or all scripts have been run. All scripts returning zero means that the message is moved to the pending queue. Various non-zero returns have other effects ranging from instant deletion, leaving the message in the inbound queue, moving to the moderation queue ("holding pen" is more accurate), auto-bouncing some sort of reply...

Some event happens to move a message from the moderation queue to the pending queue (or it goes straight there, bypassing moderation).

Message is found in the pending queue and two things happen in order:

  a) It is passed iteratively, with its meta data, through every script in the membership directory. A non-zero return means that the message stays in pending. The combined output (stdout) of all membership scripts forms the distribution list for that message. Upon getting a membership (everything returns zero), the resultant list is passed thru a specially named script (if it exists) for post-processing (dupe removal, VERP instruction insertion, domain/MX sorting, etc). The final distribution list is associated with the message much like the meta data already is.

  b) The message is passed iteratively, with its meta data and distribution list, through the contents of the pre-post directory. This does whatever it wants to do (archiving, digests, templating, VERP spit outs, whatever). A non-zero return will leave the message in the pending queue for a later pass.
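[Editor's sketch] The directory-driven pass described above could look roughly like the following in Python. This is only an illustration under stated assumptions, none of which come from the thread: each script is taken to be an executable that receives the message file path as its single argument, and `run_stage` is a hypothetical name. It runs the scripts in sort order, stops at the first non-zero return, and collects stdout so a membership stage can build a distribution list from it.

```python
import os
import subprocess


def run_stage(script_dir, message_path):
    """Run every executable in script_dir against one message, in sort order.

    Returns (return_code, failing_script_name, collected_stdout).
    A return code of 0 means every script accepted the message.
    The calling convention (message path as the only argument) is an
    assumption made for this sketch.
    """
    outputs = []
    for name in sorted(os.listdir(script_dir)):
        path = os.path.join(script_dir, name)
        # Skip anything that is not a runnable file (READMEs, subdirectories).
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        result = subprocess.run([path, message_path],
                                capture_output=True, text=True)
        if result.returncode != 0:
            # First non-zero return stops the pass; the caller decides
            # whether that means hold, delete, bounce, or retry later.
            return result.returncode, name, outputs
        outputs.append(result.stdout)
    return 0, None, outputs
```

For a membership stage the caller would join and split the collected stdout to get addresses; for a moderation stage only the return code matters. How specific non-zero codes map to dispositions is left open here, as it is in the message.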
Finally, having gained a distribution list and been pre-post processed (if there's anything there), the message is moved to the outbound queue with its distribution list.

The message is found in the outbound queue and is handed off to whatever the transport is (MTA, NNTP, whatever).

Now where this gets interesting is that every script and tool along the path above is free to edit the message, edit the meta data, edit the distribution list, to cause the current message to be silently deleted, and/or to create new or derived messages and to inject them into any other message queue in the system.

>> So in my view, when Mailman decides that a message can be delivered to a membership list, it's dropped fully formed in an outbound queue.

Yes, with its associated distribution list.

JCL> Not exactly. It drops a message, any relevant meta data, and a distribution list in the outbound queue. A delivery process then takes that and does what it will with them (eg VERP, application of templates, etc).

> In Mailman 2.0 the distribution list is also metadata -- it lives in the .db file. Do you see that differently?

I see it as a class of meta data, but specifically one that is bound to and unique to the message in question, not to the list or the list config (consider the previous discussion of nosy lists). As such a distribution list is genned for every message, and then attached to that message for later editing/delivery.

>> These processes are not completely independent of Mailman though, e.g. for handling hard errors at smtp transaction time or URL generation for summary digests. Some of these can be handled by re-injection into the message queues (i.e. generate a bounce message and stick it in the bounce queue), but some may need an rpc interface.

JCL> Thus the pending queue above -- it allows a message to undergo a set of pre-post filters prior to landing in the outbound queue.
JCL> Archiving, digests, all sorts of things can happen at that point.

> Maybe I'm taking the term "distribution list" too narrowly in your description above. Do you mean "list of recipient addresses" or do you mean "internal queue routing destinations", or something else?

I don't understand your question. Given the above, could you rephrase?

--
J C Lawrence                                       claw@kanga.nu
---------(*)                          http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From barry@digicool.com Fri Dec 15 06:17:58 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 15 Dec 2000 01:17:58 -0500
Subject: [Mailman-Developers] Re: Components and pluggablility
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.24645.631076.945270@anthem.concentric.net> <12343.976849654@kanga.nu> <14905.37391.141053.210120@anthem.concentric.net> <22532.976856352@kanga.nu>
Message-ID: <14905.46998.231057.257569@anthem.concentric.net>

>>>>> "JCL" == J C Lawrence writes:

JCL> Configuration of what exactly happens to a message is done by dropping scripts/programs in specially named directories (it is expected that typically only SymLinks will be dropped (which makes the web interface easy -- it just creates and moves symlinks about)).

At a high level, what you're describing is a generalization of MM2's message handler pipeline. In that respect, I'm in total agreement. It's a nice touch to have separate pipelines between each queue boundary, with return codes directing the machinery as to the future disposition of the message.

But I don't like the choice of separate scripts/programs as the basic components of this pipeline. Let me rotate that just a few degrees to the left, squint my eyes, and change the scripts to Python modules, and return codes to return values or exceptions.
Then I'm sold, and I think you can do everything you want (including using separate scripts if you want), and are more efficient for the common situations.

First, we don't need to mess with symlinks to make processing order configurable. We simply change the order of entries in a sequence (read: Python list). It's a trivial matter to allow list admins to select the names of the components they want, the order, etc. and to keep this information on a per-list basis.

Actually, the web interface I imagine doesn't give list admins configurability at that fine a grain. Instead, a site administrator can set up list "styles" or patterns, one of which includes canned filter sets; i.e. predefined component orderings created, managed, named, and made available by the site administrator.

Second, it's more efficient because I imagine Mailman 3.0 will be largely a long running server process, so modules need only be imported once as the system warms up. Even re-importing in a one-shot architecture will be more efficient than starting and stopping scripts all the time, because of the way Python modules cache their bytecodes (pyc files).

Third, you can still do separate scripts/programs if you want or need. Say there's something you can only do by writing a separate Java program to interface with your corporate backend Subject: header munger. You should be able to easily write a pipeline module that hides all that in the implementation. You can even design your own efficient backend IPC protocol to talk to whatever external resource you need to talk to. I contend that the overhead and complexity of forking off scripts, waiting for their exit codes, process management, etc. etc. just isn't necessary in the common case, where 5 or 50 lines of Python will do the job nicely.

Fourth, yes, maybe it's a little harder to write these components in Perl, bash, Icon or whatever. That doesn't bother me.
I'm not going to make it impossible, and in fact, I think that if that were to become widely necessary, a generic process-forking module could be written and distributed.

I don't think this is very far afield of what you're describing, and it has performance and architectural benefits IMO. We still formalize the interface that pipeline modules must conform to, probably spelled like a Python class definition, with elaborations accomplished through subclassing.

Does this work for you? Is there something a script/program component model gives you that the class/module approach does not?

-Barry

From claw@kanga.nu Fri Dec 15 07:00:02 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 23:00:02 -0800
Subject: [Mailman-Developers] Re: Components and pluggablility
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Fri, 15 Dec 2000 01:17:58 EST." <14905.46998.231057.257569@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.24645.631076.945270@anthem.concentric.net> <12343.976849654@kanga.nu> <14905.37391.141053.210120@anthem.concentric.net> <22532.976856352@kanga.nu> <14905.46998.231057.257569@anthem.concentric.net>
Message-ID: <817.976863602@kanga.nu>

On Fri, 15 Dec 2000 01:17:58 -0500 Barry A Warsaw wrote:

>>>>>> "JCL" == J C Lawrence writes:

JCL> Configuration of what exactly happens to a message is done by dropping scripts/programs in specially named directories (it is expected that typically only SymLinks will be dropped (which makes the web interface easy -- it just creates and moves symlinks about)).

> At a high level, what you're describing is a generalization of MM2's message handler pipeline. In that respect, I'm in total agreement. It's a nice touch to have separate pipelines between each queue boundary, with return codes directing the machinery as to the future disposition of the message.
> But I don't like the choice of separate scripts/programs as the basic components of this pipeline. Let me rotate that just a few degrees to the left, squint my eyes, and change the scripts to Python modules, and return codes to return values or exceptions. Then I'm sold, and I think you can do everything you want (including using separate scripts if you want), and are more efficient for the common situations.

Fair dinkum, given the below caveat.

> First, we don't need to mess with symlinks to make processing order configurable. We simply change the order of entries in a sequence (read: Python list). It's a trivial matter to allow list admins to select the names of the components they want, the order, etc. and to keep this information on a per-list basis.

> Actually, the web interface I imagine doesn't give list admins configurability at that fine a grain. Instead, a site administrator can set up list "styles" or patterns, one of which includes canned filter sets; i.e. predefined component orderings created, managed, named, and made available by the site administrator.

I'll discuss this later below (it comes down to a multi-level list setup/definition deal).

> Second, it's more efficient because I imagine Mailman 3.0 will be largely a long running server process, so modules need only be imported once as the system warms up.

I have been working specifically on the assumption that it will not be a long running process, and that instead it will be automated by cron starting up a helper app periodically which will fork an appropriate number of sub-processes to run the various queues (with simple checks to make sure that the total number of queue running processes of a given type on a given host doesn't exceed some configured value).
The base reason for this assumption is that it makes the queue processing more analogous to traditional queue managers, making the potential transition from Mailman's internal (cron based) automation to a real queue manager semi-transparent. The assumption in this was that the tool used to move a message between queues was an external, explicitly stand-alone script. The supporting reason being that a simple replacement of that script by something that called the appropriate queue management tools for the queue manager du jour would allow the removal of the Mailman "listmom" and its replacement by the queue manager, be it LSF, QPS MQM, GNU Queue, or something else.

This is what I mean by "light weight self-discovering processes that behave in a queue-like manner". The processes are small and light. They figure out what needs to be done locally per their host-specific configurations, and then do that in a queue-like manner.

What's this host-specific stuff? More later.

ObNote: There actually need to be separate and discrete tools for both moving a given message into a specific queue (ie different tools for inbound, pending, outbound, etc) and different tools for injecting messages (that didn't exist before) into each queue. Doing it this way allows a site to roll part of the system to a queue manager and allow the rest to remain default. This could be done by a single tool linked to different names, or passing the queue name as an argument and allowing an easy call-out as above to a module-wrapped external tool.

> Even re-importing in a one-shot architecture will be more efficient than starting and stopping scripts all the time, because of the way Python modules cache their bytecodes (pyc files).

I'm sold given the comment on the next paragraph.

> Third, you can still do separate scripts/programs if you want or need. Say there's something you can only do by writing a separate Java program to interface with your corporate backend Subject: header munger.
> You should be able to easily write a pipeline module that hides all that in the implementation. You can even design your own efficient backend IPC protocol to talk to whatever external resource you need to talk to. I contend that the overhead and complexity of forking off scripts, waiting for their exit codes, process management, etc. etc. just isn't necessary in the common case, where 5 or 50 lines of Python will do the job nicely.

Then we should provide a template Python module that accepts the appropriate arguments, passes them to a template external program, and grabs its stdout and RC. Configuring users could/would then merely take this, rename it, customise it, and roll it in transparently.

> Fourth, yes, maybe it's a little harder to write these components in Perl, bash, Icon or whatever. That doesn't bother me. I'm not going to make it impossible, and in fact, I think that if that were to become widely necessary, a generic process-forking module could be written and distributed.

Umm, yeah. Shame nobody thought of that.

> I don't think this is very far afield of what you're describing, and it has performance and architectural benefits IMO. We still formalize the interface that pipeline modules must conform to, probably spelled like a Python class definition, with elaborations accomplished through subclassing.

Bingo.

> Does this work for you? Is there something a script/program component model gives you that the class/module approach does not?

Not inherently, given a method for easy call outs as mentioned above.

Now onto the business of the host-specific configurations, what I've been looking at is something as below. The global list configuration consists of the following directories and files:

  ~/cgi-bin/*            (MLM CGIs)
  ~/config               (global MLM config)
  ~/config.force         (global MLM config (can't change))
  ~/config.              (config specifics for this host)
  ~/scripts/*            (all the tools and scripts that do things)
  ~/scripts/member/*     (membership scripts)
  ~/scripts/moderate/*   (moderation scripts)
  ~/scripts/pre-post/*   (scripts run before posting)
  ~/inbound/*            (messages awaiting processing by the MLM)
  ~/outbound/*           (messages to be sent by the MLM)
  ~/services/*           (the processes that actually run mailman)
  ~/templates/*          (well, templates)
  ~/groups/              (groups of list configs)
  ~/groups/default/      (There has to be a default)
  ~/groups/default/...   (Basically a full duplicate of the root setup, mostly done as symlinks)
  ~/groups//config       (deltas from ~/config)
  ...etc

Then on the list base:

  ~lists//config         (list config as deltas from group config)
  ~lists//group          (symlink to ~/groups/)
  ~lists//moderate/*     (messages held for moderation)
  ~lists//pending/*      (messages waiting to be processed)
  ~lists//scripts/*      (what does all the work)

The assumption so far is that the queues are represented as discrete files on disk, much like the current held messages in v2, with file names mapping to address/function of the message (ie list name plus command/request/post/bounce/reject/something) with filename extensions for various meta data sets, etc, (this helps keep things human readable). There are aspects of this I'm not happy with (eg for distribution lists on account of size (consider a 1M member list)).

The idea is that the config files are simple collections of variable assignments much like the current Defaults.py or mm_cfg.py. Further, they are read in the following order:

  ~/config
  ~/groups//config
  ~/lists//config
  ~/groups//config.force
  ~/config.force

Where the web interface would present the options that are locked by a higher level config (ie in a force file) as present but unconfigurable.

Now, the next thing, outside of populating the initial root directory with files (such as the various configures python modules etc), everything else gets done from the web.
One account has access to the root and can create and edit groups etc. Another account has access to the list configs, and then of course there are moderator-only accounts. All of this of course gets exported thru the standard authentication methods so that it can get replaced by .

--
J C Lawrence                                       claw@kanga.nu
---------(*)                          http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Fri Dec 15 07:07:50 2000
From: claw@kanga.nu (J C Lawrence)
Date: Thu, 14 Dec 2000 23:07:50 -0800
Subject: [Mailman-Developers] Re: Components and pluggablility
In-Reply-To: Message from J C Lawrence of "Thu, 14 Dec 2000 23:00:02 PST." <817.976863602@kanga.nu>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.24645.631076.945270@anthem.concentric.net> <12343.976849654@kanga.nu> <14905.37391.141053.210120@anthem.concentric.net> <22532.976856352@kanga.nu> <14905.46998.231057.257569@anthem.concentric.net> <817.976863602@kanga.nu>
Message-ID: <1302.976864070@kanga.nu>

On Thu, 14 Dec 2000 23:00:02 -0800 J C Lawrence wrote:

> What's this host-specific stuff? More later.

Oops, I forgot to paste a particular bit from my notes:

The basic pattern of execution:

Messages arrive from the MTA and are put in files in ~/inbound with the name of the file matching both the list address it was sent to and what needs to be done to it (bounce processing, list post, etc).

A cron job runs and reads ~/config., and on the basis of the contents of that file, forks up to three child processes and then silently dies:

  i) Process inbound messages
  ii) Process messages in ~/lists//pending
  iii) Process messages in ~/outbound

The ~/config.hostname file indicates which if any or all of the above three children should be forked, and further, can specify both which groups or individual lists to process for all three.
It is this which allows functionality to be distributed across hosts with individual hosts assuming specific named loads.

Note: It's actually more than three children as that doesn't (yet) account for command and bounce processing (which I've put into separate queues), but it gives you the general idea.

--
J C Lawrence                                       claw@kanga.nu
---------(*)                          http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Fri Dec 15 07:36:28 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 23:36:28 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <14905.27901.891059.922294@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net>
Message-ID:

At 7:59 PM -0500 12/14/00, Barry A. Warsaw wrote:

> CVR> Instead, I'd use a password and any defined email, perhaps with a few carefully chosen heuristics to help find them if they're confused (for instance, users use earthlink.com and earthlink.net interchangeably).
>
>Hadn't thought about that, and that is a good point.

you learn these the hard way. Guess how many ways there are to misspell hotmail.com...

>Right. I don't think 100% coverage is necessary.

My motto: get the first 90% right, then work on the next 90% of what's left. Especially with email, 100% solutions don't exist, because of disasters like lotus notes and other non-conformant systems.

>Here's another complication: are the delivery options set per-address or per-list? Maybe I want all deliveries to "barry at wooz" to be digests. Maybe I want lists A, B, and C to be regular deliveries.

Neither, really. For every account, you can subscribe one or more addresses to a list. For every address subscribed, you have a set of list options.
So a person could get both the messages and digest to separate addresses, and have a third address validated for posting but get nothing. Useful if that person is doing offline munging into a private archive (or if you're using this form to gateway into some scripting system as the admin, where the scripting isn't tied into Mailman. you could then have a single account, but attach all of the gateway addresses to it, and configure each one separately. Much neater administratively.)

>I see a level of indirection coming to the rescue. MailingLists have Rosters and Rosters have EmailAddresses which in turn link back to the UserAccount.

Let me think about this in pseudo-SQL terms a sec.

A user creates an account. that is given an acct_id, which is unique to the system. He attaches 1..N email addresses to his account. Each email gets an email_id, which is also unique to the system, so we now have a 1->N mapping from account to a set of email addresses. One or more of these addresses have been validated in some way to guarantee ownership.

question: are email addresses unique to the system? to the user? I'd argue they have to be, if for no other reason than if foo@bar.com is attached to two accounts and someone logs on via it, which acct does he get? So email addresses are unique but you can't use the email address as the primary key because it changes. So any time you add one or change one, you have to validate against uniqueness before accepting it.

from the other side, the admin creates a list, which is assigned a list_id. when a user subscribes to that list, the relationship is between a user's email_id and the list_id, and there's a unique set of preferences attached to that relationship. the only way a list can find out who the user is is to refer back through the email_id to get the acct_id, which, frankly, I don't think we want to allow anyway, on privacy issues. you only get the information you need to do the job.
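[Editor's sketch] The pseudo-SQL above can be written down as a small schema. The following uses Python's standard `sqlite3` module purely for illustration; the table and column names beyond `acct_id`, `email_id` and `list_id` are assumptions, as are the helper function names and the choice to lowercase addresses before storing them. It shows the two properties argued for: addresses are unique system-wide but are not the primary key, and a change of address touches only the email table.

```python
import sqlite3

# Hypothetical schema: names other than acct_id/email_id/list_id are invented.
SCHEMA = """
CREATE TABLE account (
    acct_id  INTEGER PRIMARY KEY,
    password TEXT NOT NULL
);
CREATE TABLE email (
    email_id  INTEGER PRIMARY KEY,
    acct_id   INTEGER NOT NULL REFERENCES account(acct_id),
    address   TEXT NOT NULL UNIQUE,        -- unique, but not the key
    validated INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE mailing_list (
    list_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE subscription (
    email_id INTEGER NOT NULL REFERENCES email(email_id),
    list_id  INTEGER NOT NULL REFERENCES mailing_list(list_id),
    delivery TEXT NOT NULL DEFAULT 'regular',  -- per-subscription option
    PRIMARY KEY (email_id, list_id)
);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_address(db, acct_id, address):
    """Attach an address to an account; raises IntegrityError on a duplicate."""
    with db:
        cur = db.execute(
            "INSERT INTO email (acct_id, address) VALUES (?, ?)",
            (acct_id, address.lower()))
    return cur.lastrowid


def change_address(db, email_id, new_address):
    """Change of address: only the email row changes, subscriptions are untouched."""
    with db:
        db.execute("UPDATE email SET address = ? WHERE email_id = ?",
                   (new_address.lower(), email_id))


def recipients(db, list_id):
    """What a list sees: addresses and delivery options, never acct_id."""
    rows = db.execute(
        "SELECT e.address, s.delivery FROM subscription s "
        "JOIN email e ON e.email_id = s.email_id "
        "WHERE s.list_id = ? ORDER BY e.address", (list_id,))
    return rows.fetchall()
```

Validation of ownership (the `validated` flag) is only stored here, not enforced; how and when an address gets validated is a policy question the thread is still arguing about.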
> A MailingList might actually want to deliver to multiple Rosters, which is where I think the umbrella list stuff could be improved. I.e. you have a Roster for mailman-developers and a Roster for mailman-users and mailman-announce contains a computed Roster composed of those first two, along with its own Roster. Now you send a message out to mailman-announce and everybody gets it (although what do you do about Subject: munging, footer addition, List-* headers and the like?).

and you need to do duplicate suppression, too. I think the headers have to come from the list subscribed to, because that's the list the user will try to unsubscribe from, and all the documentation in the world won't explain why headers for mailman-announce won't work for you because you're really on mailman-users.

> I also see creation of Rosters for list owners, site administrators and so on, so you could do things like compute a Roster for all-list-owners@mysite.com if you had urgent information for your list admins.

and a meta-list for all subscribers for the same. But then you get into the issue of who has permission to post to what, and you end up with an authentication database (which is not a bad idea, FWIW.) Especially if you want to get into whether the site admin wants to allow list admins to set reply-to coercion or not...

> CVR> -- you have an audit trail back to the person, so if they decide to try to spam someone you know who they are, and who to shoot. As long as you don't lose the authenticity trail, once is all you need (that would, I think, require authenticating another address before allowing deletion of the one that's authenticated, and disabling any account when all of the authenticated addresses are disabled by bounce processing...)
>
>I'm concerned about the scenario where I subscribe to a list, then add your address to my account, then disable my address because I'm "going on vacation".
I thought about it at the hockey game, and now I disagree with myself. all addresses are validated. Otherwise, it gets gnarly.

First, if you validate an address and add others, and the first bounces -- what do you do? you can't subscribe the others until they're validated. That is the wrong time from user expectation to ask for that. At best, you confuse the hell out of someone.

Second, it opens you up to mailforward attacks (create a hotmail account. Sign up for 900 lists. Forward that account to someone you hate. disappear). At least with validations, a user sees it coming, and knowing they'll get warning, it'll only get used by stupid users...

So yea, on further review, forget I suggested that.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Fri Dec 15 07:54:11 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 23:54:11 -0800
Subject: [Mailman-Developers] Re: Slow Performance on semi-large lists
In-Reply-To: <14905.36365.931133.268502@anthem.concentric.net>
References: <20001214121055.B964@moscow.com> <14905.36365.931133.268502@anthem.concentric.net>
Message-ID:

At 10:20 PM -0500 12/14/00, Barry A. Warsaw wrote:

>Does sendmail do the same synchronous DNS lookups when used in -bs mode?

yes, but you can add another flag to turn it off (I think... -odd)

fitting flag, now that I look at it.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.
From chuqui@plaidworks.com Fri Dec 15 07:52:57 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 23:52:57 -0800
Subject: [Mailman-Developers] Responsiveness
In-Reply-To: <14905.34483.401067.491743@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.25434.202694.941787@anthem.concentric.net> <14905.34483.401067.491743@anthem.concentric.net>
Message-ID:

At 9:49 PM -0500 12/14/00, Barry A. Warsaw wrote:

>Fair enough. It's definitely not part of the MTA.

MTAs deliver stuff, real fast. but they're real dumb. You have to tell them exactly what to deliver, and so the MLM has to hand off finished pieces.

think of the MTA as the fedex delivery truck. it doesn't address boxes or tape the packages or fill them with popcorn worms. All that is done in the warehouse of the shipper -- and that's the MLM. It goes on the truck, and the truck speeds away into the sunset with your package, and all you have is a tracking number...

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Fri Dec 15 07:50:32 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Thu, 14 Dec 2000 23:50:32 -0800
Subject: [Mailman-Developers] Responsiveness
In-Reply-To: <9493.976847702@kanga.nu>
References: <31823.976583871@kanga.nu> <10142.976590988@kanga.nu> <13025.976593092@kanga.nu> <14905.25434.202694.941787@anthem.concentric.net> <9493.976847702@kanga.nu>
Message-ID:

At 6:35 PM -0800 12/14/00, J C Lawrence wrote:

>> except I see this as still part of the MLM, since it's the tool doing the MLM->MTA handoff, not part of the MTA itself.
>
>We need to define what an MLM is and does:

Here's how I define the MLM: it has access to a subscriber database, and when a piece of email is received it is in charge of deciding who it is sent to, and generating the content necessary to have it delivered. Optionally, it manages the subscriber database.

that means that the MTA sees that the mail is addressed to a list and hands it off to the MLM. The MLM owns it until it hands the outbound mail to the MTA for delivery. it does everything except acceptance into the system and delivery back out.

>Actual delivery of messages, MTAs, transports, authentication mechanisms, membership definitions, account definitions, etc, are really outside of its purview.

The MLM is really an overriding architecture that handles content flow from reception through delivery (but not including delivery), and all of the support subsystems to make that possible. The key one is subscriber maintenance, but archival, digesting, bounce processing are all also part of the MLM architecture, even if they're managed by external subsystems through a defined API. what's internal and what's external comes down to needs and how strongly you want stuff integrated. As we webify everything (even email), integration and (dare I say it! I dare!) convergence tend to encourage that strong integration (and that we're even thinking of using Zope pieces here shows how far that integration is infiltrating real life already...)

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.
From chuqui@plaidworks.com Fri Dec 15 08:14:16 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Fri, 15 Dec 2000 00:14:16 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <11218.976848629@kanga.nu>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <11218.976848629@kanga.nu>
Message-ID:

At 6:50 PM -0800 12/14/00, J C Lawrence wrote:

>We can generalise this problem into insolvability.

great! let's define 3.0 done, then, and go have a beer (ducking)

> Mailman is not trying to solve all problems!

yes, it is. We will merely defer some of them to future releases (or generations)

> Mailman is trying to provide a toolkit such that users can solve problems in ways they prefer!

but we need to understand all this enough to be able to write them, even if we don't write them ourselves, lest those toolkits not do what's needed of them.

>I actually acount at the point that I think Mailman shouldn't have a default membership implementation, but again, just one or more reference implementations.

We go back to where I stepped in here (stepped into it?) a few days back. We define a subscriber management interface, splitting Mailman into two pieces: subscriber management and MLM actions. Then the subscriber piece is implemented twice to prove the API -- once by porting the existing Mailman interface to it, and once by (I'd hope) my proposed site-wide authentication scheme. Assuming it works for those two cases, it likely works for almost every case, and if someone wants to hook it up to an oracle database via LDAP for corporate mailing lists -- they write another module to the API.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.
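[Editor's sketch] The split proposed in the message above, a subscriber management interface with more than one implementation behind it, might be spelled in Python roughly as below. Everything here is hypothetical: `SubscriberStore`, `InMemoryStore` and `distribution_list` are names invented for the sketch, the method set is a guess at a minimum, and the in-memory class merely stands in for a reference implementation. An LDAP- or database-backed module would be another subclass of the same interface.

```python
from abc import ABC, abstractmethod


class SubscriberStore(ABC):
    """Hypothetical contract between the MLM and a subscriber backend."""

    @abstractmethod
    def subscribe(self, list_name, address):
        """Add an address to a list's roster."""

    @abstractmethod
    def unsubscribe(self, list_name, address):
        """Remove an address from a list's roster."""

    @abstractmethod
    def members(self, list_name):
        """Return the roster as a sorted list of addresses."""


class InMemoryStore(SubscriberStore):
    """Stand-in reference implementation; keeps rosters in a dict of sets."""

    def __init__(self):
        self._rosters = {}

    def subscribe(self, list_name, address):
        self._rosters.setdefault(list_name, set()).add(address.lower())

    def unsubscribe(self, list_name, address):
        self._rosters.get(list_name, set()).discard(address.lower())

    def members(self, list_name):
        return sorted(self._rosters.get(list_name, set()))


def distribution_list(store, list_name):
    """The MLM side: it only ever talks to the interface, not the backend."""
    return store.members(list_name)
```

The point of the exercise is that `distribution_list` does not change when the backend does, which is the "implement it twice to prove the API" test described above.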
From chuqui@plaidworks.com Fri Dec 15 08:19:54 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Fri, 15 Dec 2000 00:19:54 -0800
Subject: [Mailman-Developers] 2 bugs, but I need a confirmation :-)
In-Reply-To: <14905.38370.371699.315689@anthem.concentric.net>
References: <20001212111117.J4396@xs4all.nl> <14905.38370.371699.315689@anthem.concentric.net>
Message-ID:

At 10:54 PM -0500 12/14/00, Barry A. Warsaw wrote:

>The standard MIME parsing modules in Python suck. I want to fix this
>for 2.1, but haven't yet had time. mimectl might be the way to go,
>but I also think I want a DOM-like approach so I can slice and dice
>MIME messages, then just say "okay, spit out the results".

please, do, because it falls right in place with two things I'm grappling with -- the auto-moderation/anti-spam stuff (initial design ideas coming soon, honest!) and my active filter content stuff, so I can integrate what de-mime does for me with greater granularity (and since there are now viruses in Shockwave, I hate to say "I was right", but -- I was right. it's not just .exe or Visual Basic or Word macros, but any active content has to be screened...)

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Fri Dec 15 08:16:25 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Fri, 15 Dec 2000 00:16:25 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <14905.34948.21105.707959@anthem.concentric.net>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu> <14905.34948.21105.707959@anthem.concentric.net>
Message-ID:

At 9:57 PM -0500 12/14/00, Barry A.
Warsaw wrote:

>You've just described Roundup's nosy lists:

so now we've grafted Brad Templeton's old Knews (keyword usenet) idea with google and implemented both into mailman so the list server can figure out if you really want the email, granular to the individual message?

Cool

My grandkids might finish coding it... (giggle)

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Fri Dec 15 08:10:13 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Fri, 15 Dec 2000 00:10:13 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <9024.976847237@kanga.nu>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu>
Message-ID:

At 6:27 PM -0800 12/14/00, J C Lawrence wrote:

>I *like* email-address based IDs. You can get the best of both
>worlds with little effort much as you describe,

But it gets gnarly fast. Trust me. I'm slogging around umpty-bazillion of them in my database every day. it makes Change of address ugly, and it's also a long string, which can affect your searching and linking performance.

>Abstract UserIDs are a nice idea that users hate for the same
>reason they hate passwords -- because it's yet another abstract string
>that they have to remember.

that's why I don't think the user ever sees it. No need for them to. I'm using the ID to do table linking as primary key, but presenting the email address to the user as their external identifier. it's just then that when they COA the address, none of the table relationships have to be changed, just the email table.

> > they're confused (for instance, users use earthlink.com and
>> earthlink.net interchangeably).
>
>Yeouch.

man, you don't want to know some of the things users do consistently. let's not even talk about what MSN has done to its users (email.msn.com?
classic.msn.com? msn.com?), or worldnet, or (sigh) compuserve (numerics, anyone? how about .cs.com instead of compuserve.com? yada yada). Or users who put spaces into their names like AOL lets them. Or....

> > And another option is "none" -- where Mailman is simply the
>> delivery agent for an address system controlled elsewhere and which
>> users aren't allowed to update via Mailman. Once you start going
>> to external databases, either they're likely to be holding stores
>> for a standard mailman database, or they're likely to be severely
>> restricted access, or read-only from the Mailman point of view.
>
>This gets worse:

and -- lest we forget -- we always have the option of drawing lines in the sand and saying "this we don't do. This we can't do. this we do later. This we leave an API, and if you want it, submit the results..."

>Heck, remove the entire concept of mailing lists entirely and make
>the very existence of a list and its configs and membership dynamic:
>
> You send a message to @lists.domain, Mailman receives
> it, does an LDAP query for every user with a attribute
> on their record, and broadcasts the message to that generated list
> of addresses.

you just defined the server I'm looking at rewriting, probably next summer. We did a preliminary of that a couple of months ago. it gets -- well, very interesting.

>Marketeers will love this sort of thing when run against customer
>databases. The advantage that an MLM brings to the table is in
>scalability, bounce, and unsubscribe handling.

not necessarily. (he says, carefully). That's one reason my really big machine is fully custom. Nobody could do it the way it was needed, at the scale it was needed. And many times, they don't WANT all the features or offer the user that many choices. or have needs that aren't easily transmogrified into an MLM's paradigm.
By the time Mailman gets to this point, it might (or might not) handle my big machine, but by then, my big machine will be headed off into other directions again that Mailman won't be able to touch (and I really can't say more than that...), but I've actually got the next two years work on the big machine, more or less, drafted out already.

>Account customisations (digests, metoo, nomail, etc) of course also
>pass outside of Mailman's purview once account data goes outside.

and you start running into data ownership issues up the ying yang. that was something my site-subscriber thingie was aimed at starting to deal with, how you can have an integrated subscription service to a site with a shared data suite and multiple sets of integrated data that's specific to different modules and private from each other.

>It's complicated until people get involved. Then it gets messy.

let's just forget the people, and write systems that only computers can operate. Easier that way...

>The problem is that this is an insolvable problem at our level.

I'm not sure I agree. But it's going to come down to authentication and ownership of data, and perhaps down to the column level. Definitely to the table level. And it's going to require work to do right and avoid security issues.

>Management is going to have one need, SysAdm another, and the end
>users a third. At different times and different sites and in
>different instances, each one is going to win some of the time.
>
>We can't solve this one.

yes we can, at least to a good degree, through careful authentication and hierarchies of data, with the possibility of data delegation. And that is both at the people level (site mom authorizes list mom to operate a list and set the subject notice, but not coerce reply-tos. List mom authorizes content dude to moderate messages, but not tweak list configurations.) and at the procedure level. Which actually sounds gnarlier than it is.

>Virtual hosts are a hack, and an ugly hack at that.
I don't see
>that they are worth wasting time on when multiple installation
>appeases the privacy concerns.

they're a necessary hack, too. We can't blow them off trivially. *I* need them for various things.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From claw@kanga.nu Fri Dec 15 22:52:11 2000
From: claw@kanga.nu (J C Lawrence)
Date: Fri, 15 Dec 2000 14:52:11 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from Chuq Von Rospach of "Thu, 14 Dec 2000 23:36:28 PST."
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net>
Message-ID: <20660.976920731@kanga.nu>

On Thu, 14 Dec 2000 23:36:28 -0800 Chuq Von Rospach wrote:

> For every account, you can subscribe one or more addresses to a
> list. For every address subscribed, you have a set of list
> options. So a person could get both the messages and digest to
> separate addresses, and have a third address validated for posting
> but get nothing. Useful if that person is doing offline munging
> into a private archive (or if you're using this form to gateway
> into some scripting system as the admin, where the scripting isn't
> tied into Mailman. you could then have a single account, but
> attach all of the gateway addresses to it, and configure each one
> separately. Much neater administratively.)

We can generalise usefully here and punt the rest of the problem into end-user configuration space. I suggest we make the following divide:

At the general level list owners/moderators deal with accounts, not email addresses. They set moderation controls etc on accounts. It is an account that subscribes to a list, resulting in every address on that account being a "member" etc.
Equally, at the general level, from a member's view, participation in a list is configured at the email address level. This address will or will not receive postings, and has this specific configuration in regard to that list.

AccountX/AddressY is on ListZ and receives mail from ListZ. ListOwner for ListZ approves AccountX for automatic posting to the list (no moderation). AccountX/AddressQ then posts to the list, and has his post go straight thru as his *account* is approved, not the posting address.

If someone wants something different they can get something different by taking responsibility for the membership problem in its entirety and doing something else (subclassing, plugins, et al as previously discussed). However the above does what users natively expect: They, the human, joined the list, and the list therefore should be intelligent enough to know that even tho they subscribed with their work email address, when they post from home or from hotmail that it really is them and not some anonymous stranger.

>> I see a level of indirection coming to the rescue. MailingLists
>> have Rosters and Rosters have EmailAddresses which in turn link
>> back to the UserAccount.

> Let me think about this in pseudo-SQL terms a sec.

> A user creates an account. that is given an acct_id, which is
> unique to the system. He attaches 1..N email addresses to his
> account. Each email gets an email_ID, which is also unique to the
> system, so we now have a 1-> mapping from account to a set of
> email addresses. One or more of these addresses have been
> validated in some way to guarantee ownership.

> question: are email addresses unique to the system? to the user?
> I'd argue they have to be, if for no other reason than if
> foo@bar.com is attached to two accounts and someone logs on via
> it, which acct does he get? So email addresses are unique but you
> can't use the email address as the primary key because it
> changes.
So any time you add one or change one, you have to
> validate against uniqueness before accepting it.

Precisely.

> from the other side, the admin creates a list, which is assigned a
> list_id. when a user subscribes to that list, the relationship is
> between a user's email_id and the list_id, and there's a unique
> set of preferences attached to that relationship. the only way a
> list can find out who the user is is to refer back through the
> list_id to get the account_id, which, frankly, I don't think we
> want to allow anyway, on privacy issues. you only get the
> information you need to do the job.

Of course. We don't reveal that data to the list owner. However the decisions that the list owner makes in regard to a given address are applied to the account, not the address. Yes, this means that with sufficient experimentation and observation, a list owner could deduce most of an account definition ("I approved address XXX for posting and now YYY is also approved? They must be on the same account!"). I don't see that as a problem.

> Second, it opens you up to mailforward attacks (create a hotmail
> account. Sign up for 900 lists. Forward that account to someone
> you hate. disappear). At least with validations, a user sees it
> coming, and knowing they'll get warning, it'll only get used by
> stupid users...

I don't see this as our problem, simply because it's not one we either have control over or can defend against.

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Fri Dec 15 23:14:13 2000
From: claw@kanga.nu (J C Lawrence)
Date: Fri, 15 Dec 2000 15:14:13 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from Chuq Von Rospach of "Fri, 15 Dec 2000 00:16:25 PST."
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu> <14905.34948.21105.707959@anthem.concentric.net>
Message-ID: <22788.976922053@kanga.nu>

On Fri, 15 Dec 2000 00:16:25 -0800 Chuq Von Rospach wrote:

> At 9:57 PM -0500 12/14/00, Barry A. Warsaw wrote:

>> You've just described Roundup's nosy lists:

> so now we've grafted Brad Templeton's old Knews (keyword usenet)
> idea with google and implemented both into mailman so the list
> server can figure out if you really want the email, granular to
> the individual message?

Yes, but better, since now (post v3) an individual can implement a knews-style implementation without having to re-invent the wheel, and without even necessarily needing to have to. Consider a mailing list system ala:

----
From: Ka-Ping Yee [mailto:ping@lfw.org]
Sent: Monday, May 22, 2000 2:45 AM
To: Roundup users
Subject: {3} Welcome to the Singularity discussion system!

Issue: {3} http://headspaces.com/singularity/3
Description: -> Welcome to the Singularity discussion system!
Priority: -> 5
Keywords: -> admin
Nosy: -> ping@lfw.org

Hello everyone! Thank you all for joining the Singularity discussion system and helping to carry forward the dialogue and all the great work that took place at this weekend's Foresight Gathering.

This system works much like a typical electronic mailing list, but it does have some key features which should help improve information flow (in one of my sessions, someone used the term "noise management" and i thought it was great).

1. Only messages opening new threads are sent to the entire mailing list. This happens when you compose a new message and send it directly to singularity@headspaces.com.

2. Starting a new thread in this way opens an "issue" in the discussion system. The issue gets its title from your subject line. Each issue maintains its own list of interested people. Initially, this list just contains you, the initial poster.

3.
Issues show up on the website at headspaces.com. They have a priority level that you can edit, and they are listed so you can see which ones have been most active recently. Issues are assigned keywords so they can be classified and searched. To specify keywords, put some words in square brackets in your initial subject line, such as [nano upload]. If you don't specify keywords, the system will pick words from your subject.

4. Messages forwarded from the system will have a number in curly-braces at the beginning of the subject line, like {37}. This number identifies the issue and causes further mail to collect on the issue's web page.

5. When you reply to mail, preserve this id number on your subject line. (A preceding "Re:" is no problem.) You will be placed on the issue's "nosy list" of interested parties, and your message will get forwarded only to the other people on the issue's list. If you "Cc:" anyone on your message, they will also get added to the issue's list, bringing them into the discussion.

6. If you want to join in on an issue after it has already started, just go to the web page for the issue (click on the issue id in the main list) to catch up on all the discussion that has already taken place. Join in by sending a message to singularity@headspaces.com with the issue id as the first thing in your subject line.

Of course, we are going to need a little discipline to use the system best. For example, if an issue starts to veer off-topic, please open a new issue so that interested parties can join in. And when you're about to create a new issue, have a look at the list of existing issues first to see if there's already an issue relevant to what you want to say.

It's a bit like running lots of mailing lists at once, where creating a new list is effortless.
You don't need to worry about subscribing -- you just participate -- and you usually don't need to worry about unsubscribing: issues will die down as they get resolved, and if a new relevant point is raised, you will probably have wanted to see it anyway. You can always edit the "nosy" field of the issue directly if you need to.

As i mentioned in the closing session, the software is new and experimental. We've used software like this where i work, however, and it's worked out very well for us. I hope that it helps keep the discussion going without sending you an excessive volume of e-mail.

Please feel free to ask me any questions you have about the system -- just reply to this message, and they will become part of this issue. Enjoy!

-- ?!ng
----

And then consider how difficult that would be to implement under the proposed Mailman v3 (I figure no more than a week (a few days more like) over a default installation).

> Cool My grandkids might finish coding it... (giggle)

Uhh huh.

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From claw@kanga.nu Fri Dec 15 23:27:17 2000
From: claw@kanga.nu (J C Lawrence)
Date: Fri, 15 Dec 2000 15:27:17 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from Chuq Von Rospach of "Fri, 15 Dec 2000 00:10:13 PST."
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu>
Message-ID: <23620.976922837@kanga.nu>

On Fri, 15 Dec 2000 00:10:13 -0800 Chuq Von Rospach wrote:

> At 6:27 PM -0800 12/14/00, J C Lawrence wrote:

>> Heck, remove the entire concept of mailing lists entirely and
>> make the very existence of a list and its configs and membership
>> dynamic:
>>
>> You send a message to @lists.domain, Mailman receives
>> it, does an LDAP query for every user with a attribute
>> on their record, and broadcasts the message to that generated
>> list of addresses.

> you just defined the server I'm looking at rewriting, probably
> next summer. We did a preliminary of that a couple of months
> ago. it gets -- well, very interesting.

Care to comment?

>> Virtual hosts are a hack, and an ugly hack at that. I don't see
>> that they are worth wasting time on when multiple installation
>> appeases the privacy concerns.

> they're a necessary hack, too. We can't blow them off
> trivially. *I* need them for various things.

They are needed now, yes, this I don't argue. Post IPv6 I don't see much use for them any more and I expect their use to almost vanish. Heck, more simply, once application-centric VLANs (ala IPSec for instance) become common (which is already happening but is really pending IPv6 to get the address space) the whole problem is going to get messy on a far different score (multiple addressing and wide-scale use of non-routable network blocks without accepted authentication).
<>

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Sat Dec 16 04:46:08 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Fri, 15 Dec 2000 20:46:08 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <23620.976922837@kanga.nu>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <9024.976847237@kanga.nu> <23620.976922837@kanga.nu>
Message-ID:

At 3:27 PM -0800 12/15/00, J C Lawrence wrote:

> > you just defined the server I'm looking at rewriting, probably
>> next summer. We did a preliminary of that a couple of months
>> ago. it gets -- well, very interesting.
>
>Care to comment?

it's not strongly defined yet, but we've talked about taking our existing corporate server, and re-doing it. Right now, the group list is either corporate generated (organizational aliases) or user-built (through a web-based system). it feeds me the data set a few times a day, and I massage it into a massive sendmail alias setup and some added control datasets. The hope is to redo this so that the interface is LDAP, so whenever an email comes in, we pull the data out of the database to see if it's a defined list, and if so, whether the user is validated to send to it (and if they are, send it, of course). So that my MLM would keep zero data local, and generate everything dynamically on request.

In a case like this, I have no control of the data, I'm a read-only leaf node. So I can't do sub/unsub or even bounce processing (although we're talking about exactly how to deal with that. it's a problem).

> >They are needed now, yes, this I don't argue. Post IPv6 I don't see
>much use for them any more and I expect their use to almost vanish.

and once we fully implement style sheets in browsers, we can kill a lot of bad HTML hacks.
you're correct, but that's far enough out in practical terms I'm not sure we want to build that as a design idea.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Sat Dec 16 04:39:19 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Fri, 15 Dec 2000 20:39:19 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <20660.976920731@kanga.nu>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <20660.976920731@kanga.nu>
Message-ID:

At 2:52 PM -0800 12/15/00, J C Lawrence wrote:

> At the general level list owners/moderators deal with accounts,
> not email addresses. They set moderation controls etc on
> accounts. It is an account that subscribes to a list, resulting
> in every address on that account being a "member" etc.

I disagree with this. The way I see it is this:

o Site admin: he has overall control of the mailman installation, and manages the system. He has CLI access, and admin access to all lists. Basically, he's god.

o List admin: manages a list on the site. He controls subscription policies and configuration setups for that list, unless the site admin has locked a value (for instance, the site admin can lock reply-to coercion for a site-wide standard).

o list moderator: handles message approval or denial. May be list admin, but it's a separate function. you may not want a moderator to have access to list configuration options.

o list user: manages an individual account.

list admins don't deal with accounts, per se. They set list defaults that are adopted by the user when they subscribe, and which can be modified by the user, unless a list admin locks a value (no digest, for instance).
List admins have address maintenance duties, but I think that's different than what you're implying here. List admins don't act on a user or account (except to fix something on request), they act on lists.

>If someone wants something different they can get something
>different by taking responsibility for the membership problem in its
>entirety and doing something else (subclassing, plugins, et al as
>previously discussed). However the above does what users natively
>expect: They, the human, joined the list, and the list therefore
>should be intelligent enough to know that even tho they subscribed
>with their work email address, when they post from home or from
>hotmail that it really is them and not some anonymous stranger.

agreed.

>Of course. We don't reveal that data to the list owner. However
>the decisions that the list owner makes in regard to a given address
>are applied to the account, not the address.

I'm not sure I agree. here's why. Let's go back to my "subscribe N addresses to a list from account M", to allow them to be configured differently. The relationship for the list is to the email address, not the account, but the API has to have a function that allows you to query whether this address is a valid address. For privacy reasons, I don't think you hand over the addresses that aren't subscribed, but instead, you hand off an address and ask whether it can be allowed to post. I think you create a data leak by linking to the account that you don't want here, and you remove the ability to multiply subscribe to a list -- and while there aren't a lot of people who do this, I *do* know people who are subscribed both to messages and digests to lists, and want that option. (I do, too, when I'm developing or testing MLM stuff.....)

> > Second, it opens you up to mailforward attacks (create a hotmail
>> account. Sign up for 900 lists. Forward that account to someone
>> you hate. disappear).
At least with validations, a user sees it
>> coming, and knowing they'll get warning, it'll only get used by
>> stupid users...
>
>I don't see this as our problem, simply because it's not one we
>either have control over or can defend against.

It is, because we're the tool being used. By validating the address when it's added to the list, you simplify the user experience, make it easier on Mailman, and inhibit this attack significantly because the luser will (we hope) know he'll get flagged immediately and so not use us for the attack, and failing that, the attack-ee will be warned early that something is happening and hopefully be able to stop it before it hits hard.

it's a win win situation, since validating is what the user is going to expect, makes our processing easier, and limits exposure to this attack. And it basically has no significant negative I can think of.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From claw@kanga.nu Sat Dec 16 20:07:47 2000
From: claw@kanga.nu (J C Lawrence)
Date: Sat, 16 Dec 2000 12:07:47 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from Chuq Von Rospach of "Fri, 15 Dec 2000 20:39:19 PST."
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <20660.976920731@kanga.nu>
Message-ID: <32559.976997267@kanga.nu>

On Fri, 15 Dec 2000 20:39:19 -0800 Chuq Von Rospach wrote:

> At 2:52 PM -0800 12/15/00, J C Lawrence wrote:

>> At the general level list owners/moderators deal with accounts,
>> not email addresses. They set moderation controls etc on
>> accounts. It is an account that subscribes to a list, resulting
>> in every address on that account being a "member" etc.

> I disagree with this.
The way I see it is this:

> -- Site admin: he has overall control of the mailman installation,
> and manages the system. He has CLI access, and admin access to all
> lists. Basically, he's god.

> -- List admin: manages a list on the site. He controls
> subscription policies and configuration setups for that list,
> unless the site admin has locked a value (for instance, the site
> admin can lock reply-to coercion for a site-wide standard).

> -- list moderator: handles message approval or denial. May be list
> admin, but it's a separate function. you may not want a moderator
> to have access to list configuration options.

> -- list user: manages an individual account.

> list admins don't deal with accounts, per se. They set list
> defaults that are adopted by the user when they subscribe, and
> which can be modified by the user, unless a list admin locks a
> value (no digest, for instance). List admins have address
> maintenance duties, but I think that's different than what you're
> implying here. List admins don't act on a user or account (except
> to fix something on request), they act on lists.

Okay, let's split this out. There are five levels and a pseudonymous sixth:

1) Site owner -- SysAdm for the host
2) Group owner -- Sets group defaults
3) List owner -- Sets list defaults
4) List moderator -- Controls day-to-day operation of list
5) Member account -- Individual humans
6) Email address

The site owner has CLI access etc. The Group owner runs, say, a vhost and defines the default for a class of lists as well as handling creation of lists within that class. The List owner configures and defines a given list. The list moderator implements the human side of enforcing list policy and handles the day to day chores of running a list (post approval etc). A member account subscribes to a given list with one or more of its email addresses, each of which has a configuration for that list which is unique to that email address.

But, this is slightly confusing and deceptive.
A list moderator in the course of their normal duties may do the following (among other things):

-- Kick someone off a list
-- State that all future posts from a given member will not be hand moderated
-- State that the next N posts from a specific member will be hand moderated
-- State that all future posts from a given member will not be hand moderated and will be automatically posted.
-- Write an arbitrary note that is then associated with a member such that any moderator for that list will see that note when presented with data concerning that member (eg a post held for moderation).

In each case what the list moderator sees in terms of the definition of the member, and what they define the above actions against, is an email address. They flag email address XXX as being hand moderated from here out. They do something else to email address YYY.

My point however is that in the general case for these commands, the list owner is actually not interested in a specific email address, but in the account. When the moderator decides to auto-approve a specific account for autoposting, he generally is intending to approve the human and not just the one email address.

Yes, that Chuq guy posts signal. Please auto-post everything he writes.

That claw guy however is a dweeb and keeps arguing about stoopid things. I want to hand moderate all his posts from here on.

While moderator decisions are typically phrased in terms of email addresses, the actual intent is typically in terms of humans. There are of course exceptions:

I want to hand moderate all his posts from his work address because they auto-append legal cruft I want to delete.

He only posts from Yahoo when he's drunk -- unsubscribe that address.

And a host of others. Of course account groupings (dealing by human above) can be approximated by just issuing enough commands for every intersection of an account and a list (should you know the intersections), but that's a royal pain.
If I as moderator decide that Bubba is just a source of noise and really needs to be kicked from my list I want all of his addresses kicked, not just the one, and in fact I want my list config to remember that fact so that should Bubba try and re-join he's either auto-refused or his subscription is held for moderator approval (despite the fact that everyone else gets let thru automatically).

We need to be able to implement moderator rulings at both the address and account levels. This does not mean that a moderator should be able to query, "show me all the addresses that are part of account XXX", but that he should be able to issue a command which summates to:

Do XXX to every intersection of the account that owns email address EEE and this list.

Where XXX and EEE are arbitrary.

>> Of course. We don't reveal that data to the list owner. However
>> the decisions that the list owner makes in regard to a given
>> address are applied to the account, not the address.

> I'm not sure I agree. here's why. Let's go back to my "subscribe N
> addresses to a list from account M", to allow them to be
> configured differently. The relationship for the list is to the
> email address, not the account, but the API has to have a
> function that allows you to query whether this address is a valid
> address. For privacy reasons, I don't think you hand over the
> addresses that aren't subscribed, but instead, you hand off an
> address and ask whether it can be allowed to post. I think you
> create a data leak by linking to the account that you don't want
> here, and you remove the ability to multiply subscribe to a list
> -- and while there aren't a lot of people who do this, I *do* know
> people who are subscribed both to messages and digests to lists,
> and want that option. (I do, too, when I'm developing or testing
> MLM stuff.....)
While there is a data leak, yes, there isn't in the general case, and should that fact be of sufficient concern the admin command processing interface (which I'm looking at in exactly the same replaceable component manner as members, and everything else) can be replaced with something that will *ONLY* process addresses (and which quite likely removes all concept of member accounts in the first place).

My intent above is not to build a system that approaches a Grecian Ideal, but to build something that is maximally workable for both list members and list moderators, and which can be adapted towards local definitions of "ideal".

>>> Second, it opens you up to mailforward attacks (create a hotmail account. Sign up for 900 lists. Forward that account to someone you hate. disappear). At least with validations, a user sees it coming, and knowing they'll get warning, it'll only get used by stupid users...

>> I don't see this as our problem, simply because it's not one we either have control over or can defend against.

> It is, because we're the tool being used. By validating the address when it's added to the list, you simplify the user experience, make it easier on Mailman, and inhibit this attack significantly because the luser will (we hope) know he'll get flagged immediately and so not use us for the attack, and failing that, the attack-ee will be warned early that something is happening and hopefully be able to stop it before it hits hard.

I'm not arguing against mailback validation. I consider that's actually a very good thing (as it demonstrates control of the address and some level of acknowledgement of the change). I think we're talking across each other. There is nothing that can prevent the following:

  I create a hotmail account.

  I subscribe that hotmail account to 500 lists of various flavours.
  I now configure that hotmail account to auto-forward all received messages to

There is nothing we can do about this because the forwarding decision is outside of our control or purview. The subscriptions were perfectly legitimate from our stance -- they were acknowledged and everything. It's the forwarding decision that was bad, and that we had nothing to do with.

For those interested the reverse vector of this just happened to a list I'm on. Some fine fellow got a Hotmail account, subscribed it to several dozen porn lists and then forwarded that account to the list I was on (which had open posting). We can't stop it. We can't even impede it other than by applying standard authentication rules to incoming posts (envelope, From:, whatever).

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From ken@kyler.com Sat Dec 16 20:28:42 2000
From: ken@kyler.com (Ken Kyler)
Date: Sat, 16 Dec 2000 15:28:42 -0500
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <32559.976997267@kanga.nu>
Message-ID:

> Okay, let's split this out. There are five levels and a pseudonymous sixth:
>
> 1) Site owner -- SysAdm for the host
> 2) Group owner -- Sets group defaults
> 3) List owner -- Sets list defaults
> 4) List moderator -- Controls day-to-day operation of list
> 5) Member account -- Individual humans
> 6) Email address

Using the virtual host paradigm, it becomes...

1) SysAdmin -- God
2) Domain owner -- Virtual host administrator
3) Group owner -- Sets group defaults
4) List owner -- Sets list defaults
5) List moderator -- Controls day-to-day operation of list
6) Member account -- Individual humans
7) Email address

I mention this as I hope to steer things to a more virtual host friendly way.
Ken Kyler

From claw@kanga.nu Sun Dec 17 03:21:51 2000
From: claw@kanga.nu (J C Lawrence)
Date: Sat, 16 Dec 2000 19:21:51 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from "Ken Kyler" of "Sat, 16 Dec 2000 15:28:42 EST."
References:
Message-ID: <4551.977023311@kanga.nu>

On Sat, 16 Dec 2000 15:28:42 -0500 Ken Kyler wrote:

>> Okay, let's split this out. There are five levels and a pseudonymous sixth:
>>
>> 1) Site owner -- SysAdm for the host
>> 2) Group owner -- Sets group defaults
>> 3) List owner -- Sets list defaults
>> 4) List moderator -- Controls day-to-day operation of list
>> 5) Member account -- Individual humans
>> 6) Email address

> Using the virtual host paradigm, it becomes...
>
> 1) SysAdmin -- God
> 2) Domain owner -- Virtual host administrator
> 3) Group owner -- Sets group defaults
> 4) List owner -- Sets list defaults
> 5) List moderator -- Controls day-to-day operation of list
> 6) Member account -- Individual humans
> 7) Email address

I don't see that there's enough gain added by splitting VHosts out from groups to make the extra complexity worth it. A group can be a vhost, or it can be a more logical grouping. Given the likely tenuous life of vhosts (they've already started rolling out IPv6 on a production basis in a few places) I really don't have much temptation to bend that far when groups can accomplish the same end (if need be just create multiple groups representing factions of a vhost).
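
One way to picture the "a group can be a vhost, or any other grouping" argument above is a settings lookup that falls through list, group and site levels in turn. This is only a sketch under assumptions: the thread says that group owners and list owners set defaults, but not how those defaults are resolved, and the setting names below are invented for illustration rather than taken from Mailman.

```python
from collections import ChainMap

# Hypothetical settings; these are not Mailman's real option names.
site_defaults = {"max_message_size_kb": 40, "template_dir": "site"}
# A group may mirror a vhost or be any other logical grouping of lists.
group_defaults = {"template_dir": "python-org"}
list_settings = {"max_message_size_kb": 100}


def effective_settings(list_cfg, group_cfg, site_cfg):
    """Resolve a setting from the list first, then the group, then the site."""
    return ChainMap(list_cfg, group_cfg, site_cfg)


cfg = effective_settings(list_settings, group_defaults, site_defaults)
print(cfg["max_message_size_kb"])  # set on the list itself
print(cfg["template_dir"])         # not set on the list, so the group supplies it
```

Under this reading a virtual host needs no level of its own: it is simply one group whose defaults happen to describe that host.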
--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Mon Dec 18 06:56:50 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Sun, 17 Dec 2000 22:56:50 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <32559.976997267@kanga.nu>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <20660.976920731@kanga.nu> <32559.976997267@kanga.nu>
Message-ID:

(sorry for being slow to respond. I spent the weekend upgrading the powerbook to a 20 gig disk and setting it up to dual boot linux, so it's been in pieces for various large hunks o' weekend...)

>Okay, let's split this out. There are five levels and a pseudonymous sixth:
>
> 1) Site owner -- SysAdm for the host
> 2) Group owner -- Sets group defaults

If the group owner manages a virtual site, why not call it that?

If we want to get technical, you have the owner of the mailman instance (since a given machine can have multiple ones), the owner of the virtual host (which may be the only user of the mailman, or may share it), the list owner, the list moderator, and the list user. I don't see any advantage to breaking it out into finer gradations, or generalizing the functionality beyond that.

>The Group owner runs, say, a vhost and defines the default for a class of lists as well as handling creation of lists within that class.

Also UI and graphic definitions for the site, since each site is going to wrap a different (do I dare use the term? I dare) skin over mailman, and we need to make sure we support that properly.

>But, this is slightly confusing and deceptive. A list moderator in the course of their normal duties may do the following (among other things):

true, although at the discretion of the list owner.
It may be the owner reserves these functions to himself, or to a subset of moderators. You shouldn't assume that a moderator WILL have these abilities. the moderator MAY have them.

> -- Write an arbitrary note that is then associated with a member such that any moderator for that list will see that note when presented with data concerning that member (eg a post held for moderation).

you know, you just wandered down something I've played with in the past but keep forgetting about (mostly, i want it while I'm dealing with a problem, but not enough to create it the rest of the time) -- the problem/case book. Needs to be list-specific for privacy reasons, but there needs to be a way for admins to track users and issues, and a generalized note-taking/history-keeping function attached to a user_ID and a list would be great for this. ("what do you mean you never start fights, last January, you...")

>There are of course exceptions:
>
> I want to hand moderate all his posts from his work address because they auto-append legal cruft I want to delete.
>
> He only posts from Yahoo when he's drunk -- unsubscribe that address.

I expect these situations are rare enough that I wonder if it's worth even considering them in the design. I'm trying to think of the last time I might have used something like this, and I can't think of one. they're a nice addition, but I think it's solving a problem I'm not convinced shows up often enough to worry about.

>While there is a data leak, yes, there isn't in the general case, and should that fact be of sufficient concern the admin command processing interface (which I'm looking at in exactly the same replaceable component manner as members, and everything else) can be replaced with something that will *ONLY* process addresses (and which quite likely removes all concept of member accounts in the first place).

hmm. Okay, for now. I think.

>I'm not arguing against mailback validation.
>I consider that's actually a very good thing (as it demonstrates control of the address and some level of acknowledgement of the change).

Neither am I. we need it.

>There is nothing we can do about this because the forwarding decision is outside of our control or purview.

true. nor should we. but if we allow un-validated accounts into mailman, we create the same environment within mailman, because they can config up the same general thing inside mailman, albeit possibly on a smaller scale (depending on the size of the mailman installation). And we can (and should) fix it. I'm not trying to fix the hotmail-forward problem. I'm trying to keep mailman from allowing the same attack vector.

but the hotmail attack thing is an indication of just how complex and gnarly email is on the net these days, because there really isn't much of an easy way to stop something like that. Fortunately, it's fairly rare.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From schorsch@schorsch.com Mon Dec 18 17:40:31 2000
From: schorsch@schorsch.com (Georg Mischler)
Date: Mon, 18 Dec 2000 12:40:31 -0500 (EST)
Subject: [Mailman-Developers] Random HTML archiving failures possibly solved
Message-ID:

Hi all,

There have been a number of reports of the HTML archiving failing mysteriously, which were apparently impossible for the experts to reproduce. I think I have just found a bug in Mailbox.py from 2.0 that can cause this behaviour. Since I'm CVS challenged, I am unable to check whether it has already been fixed since then, but here it goes anyway.
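
For anyone who wants to check the report in this message, the sketch below runs the two `_fromlinepattern` values quoted in it against a few sample "From " lines using Python's standard `re` module. The sample lines are assumptions: the message does not show the exact line that failed, so the layout with the numeric timezone after the year is a guess at what "a negative timezone" looked like in the affected mailbox.

```python
import re

# The pattern as quoted from Mailbox.py 2.0 in this message.
OLD_PATTERN = (r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+'
               r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+\d\d\d\d\s*$')

# The proposed fix: allow an optional sign before the final four digits.
NEW_PATTERN = (r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+'
               r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+[+-]?\d\d\d\d\s*$')

# Assumed sample separator lines (not taken from a real mailbox).
SAMPLES = [
    "From schorsch@schorsch.com Mon Dec 18 17:40:31 2000",
    "From schorsch@schorsch.com Mon Dec 18 12:40:31 2000 -0500",
    "From schorsch@schorsch.com Mon Dec 18 12:40:31 2000 +0100",
]


def matches(pattern, line):
    """Return True if the line is recognised as a message separator."""
    return re.match(pattern, line) is not None


for line in SAMPLES:
    print(matches(OLD_PATTERN, line), matches(NEW_PATTERN, line), line)
```

Under that assumed layout the original pattern rejects the lines that end in a signed offset, while the patched pattern accepts all three.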
The pattern that checks for the unix style "From " lines fails when it encounters a negative timezone:

    _fromlinepattern = r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+' \
                       r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+\d\d\d\d\s*$'

The consequence is that when a mailbox file has a message from such a timezone at the beginning, Mailman will think it contains no messages at all. A more robust approach (assuming that a plus sign in front of the timezone is also legal) would probably look similar to this:

    _fromlinepattern = r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+' \
                       r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+[+-]?\d\d\d\d\s*$'

At least this fixes the problem on my system here...

On another thought, wouldn't it be even better to use rfc822.parsedate_tz() here as well? I realize this implies some processing overhead, but I'd prefer robustness before the last two percent of increased performance.

Have fun!

-schorsch

--
Georg Mischler -- simulations developer -- schorsch at schorsch.com
+schorsch.com+ -- lighting design tools -- http://www.schorsch.com/

From chuqui@plaidworks.com Sun Dec 10 04:51:07 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Sat, 9 Dec 2000 20:51:07 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: <13466.976423624@kanga.nu>
References: <20001209030926.A26087@ncsa.uiuc.edu> <13466.976423624@kanga.nu>
Message-ID:

At 8:47 PM -0800 12/9/00, J C Lawrence wrote:

>The fact that the posts happened to be on-topic is slightly droll and beyond my ability to explain.

I've had that happen, and it was finally tracked to another list member who was pissed at the first, and attempting to ruin their reputation on the list. Don't downplay the underlying personal interactions and politics of a list, especially one with strong emotions, where otherwise mature people act like three year olds over a stale donut.
--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From chuqui@plaidworks.com Sun Dec 10 05:07:32 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Sat, 9 Dec 2000 21:07:32 -0800
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To: <20001210013636.9D1954800C@athene.jamux.com>
References: <20001210013636.9D1954800C@athene.jamux.com>
Message-ID:

At 8:36 PM -0500 12/9/00, John A. Martin wrote:

> CVR> received lines can be forged, but the one your server adds to tell you who it got the mail from -- the direct connection -- can't be (or you have bigger problems).
>
>Would you unconditionally accept postings received at your list host from a backup MX?

I'd say it's up to the list admin. that's the advantage of allowing the admin to approve given IP addresses as approved addresses for that email. it can be dealt with on a case by case basis. And if you run into a case where an approved IP is abused, you remove it from the approval list and manually moderate those messages.

--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

We're visiting the relatives. Cover us.

From spaf@cerias.purdue.edu Sun Dec 10 06:01:43 2000
From: spaf@cerias.purdue.edu (Gene Spafford)
Date: Sun, 10 Dec 2000 01:01:43 -0500
Subject: [Mailman-Developers] FYI -- mailback validations no longer safe?
In-Reply-To:
References: <20001210013636.9D1954800C@athene.jamux.com>
Message-ID:

Hi, all.

I got added to the original messages because I maintain lists and I'm interested in how systems get subverted. But I really don't have anything to add, and I think I've seen everything that might make a difference.

So, please leave me out of subsequent messages?

Thanks.
--spaf

From ken@kyler.com Mon Dec 18 23:58:05 2000
From: ken@kyler.com (Ken Kyler)
Date: Mon, 18 Dec 2000 18:58:05 -0500
Subject: [Mailman-Developers] Patience please - admindb.py question
Message-ID:

When handling administrative requests for a list, the text of a message is displayed in TEXTAREA form elements. After looking through the code in admindb.py, it appears that any edits to the message are discarded. Correct?

I'm just now starting to poke through the code. I'm just now starting to learn Python. I'll try to limit my dumb questions as much as possible.

Is there a design document publicly available?

Ken
--
ken@kyler.com
http://www.kyler.com

From claw@kanga.nu Tue Dec 19 00:57:52 2000
From: claw@kanga.nu (J C Lawrence)
Date: Mon, 18 Dec 2000 16:57:52 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: Message from Chuq Von Rospach of "Sun, 17 Dec 2000 22:56:50 PST."
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <20660.976920731@kanga.nu> <32559.976997267@kanga.nu>
Message-ID: <26103.977187472@kanga.nu>

On Sun, 17 Dec 2000 22:56:50 -0800 Chuq Von Rospach wrote:

>> Okay, let's split this out. There are five levels and a pseudonymous sixth:
>>
>> 1) Site owner -- SysAdm for the host
>> 2) Group owner -- Sets group defaults

> If the group owner manages a virtual site, why not call it that?

Because groups are a logical construct and may be both larger and smaller than virtual domains. A group may consist of the lists assigned to a particular list owner, lists sharing a common topic, a virtual domain, or any other structure or divide you may care to consider.
> If we want to get technical, you have the owner of the mailman instance (since a given machine can have multiple ones), the owner of the virtual host (which may be the only user of the mailman, or may share it), the list owner, the list moderator, and the list user. I don't see any advantage to breaking it out into finer gradations, or generalizing the functionality beyond that.

The idea of the above group concept is that groups could be used for virtual hosts, or any other grouping desired. It really doesn't matter. Consider Python.Org: All the python lists could share a group and therefore a set of common templates, CSS files etc for their own unique/shared configs. All the pythonic app lists could share another group, the SIG lists another group, etc. In my case I'd group lists into those run by me, and those run by others (ie not "official representatives of Me (tm)").

>> The Group owner runs, say, a vhost and defines the default for a class of lists as well as handling creation of lists within that class.

Vhosts are merely the simplest way of tagging the group concept so people can grab it and relate it to something they already know and understand.

> Also UI and graphic definitions for the site, since each site is going to wrap a different (do I dare use the term? I dare) skin over mailman, and we need to make sure we support that properly.

Quite. This would be done at the group level.

>> -- Write an arbitrary note that is then associated with a member such that any moderator for that list will see that note when presented with data concerning that member (eg a post held for moderation).

> you know, you just wandered down something I've played with in the past but keep forgetting about (mostly, i want it while I'm dealing with a problem, but not enough to create it the rest of the time) -- the problem/case book.
> Needs to be list-specific for privacy reasons, but there needs to be a way for admins to track users and issues, and a generalized note-taking/history-keeping function attached to a user_ID and a list would be great for this. ("what do you mean you never start fights, last January, you...")

I've got some ideas there, mostly centered about tacking a CRM tool off the side, linked to the AccountID I discussed earlier. Entries in the CRM would be tagged with a ListID and would be flagged as public (can be viewed by other list moderators) or private to the ListID. This would exist in parallel to Mailman per se, linked only by a tag that some module inserts into the Mailman genned HTML...

>> There are of course exceptions:
>>
>> I want to hand moderate all his posts from his work address because they auto-append legal cruft I want to delete.
>>
>> He only posts from Yahoo when he's drunk -- unsubscribe that address.

> I expect these situations rare enough I wonder if it's worth even considering in the design.

I had a tough time coming up with good examples. Okay, better:

  "Hey moderator, yesterday was my last day at XXX and I forgot to unsubscribe. Sorry about that, but would you mind unsubscribing me? In the meantime I'm reading through my home address on the same account."

> I'm trying to think the last time I might have used something like this, and I can't think of one. they're a nice addition, but I think it's solving a problem I'm not convinced shows up often enough to worry about.

Which? By email address or by account?

> but the hotmail attack thing is an indication of just how complex and gnarly email is on the net these days, because there really isn't much of an easy way to stop something like that. Fortunately, it's fairly rare.

One of the programming lists I'm on just had someone take a hotmail account which they'd already subscribed to 20+ daily porn picture services, and set it to auto-forward to the list I'm on.
Aside from the fact that there are some very able photoshop people out there who should have better things to do with their time, it was quite difficult finding the account to remove/block (it was an open posting list).

--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--

From chuqui@plaidworks.com Tue Dec 19 04:37:57 2000
From: chuqui@plaidworks.com (Chuq Von Rospach)
Date: Mon, 18 Dec 2000 20:37:57 -0800
Subject: [Mailman-Developers] Users, Bounces, and Virtual Domains (was (no subject))
In-Reply-To: <26103.977187472@kanga.nu>
References: <31823.976583871@kanga.nu> <14905.20915.851635.680942@anthem.concentric.net> <14905.27901.891059.922294@anthem.concentric.net> <20660.976920731@kanga.nu> <32559.976997267@kanga.nu>