[Mailman-Developers] Scrubber.py confusion, 2.1b3

Michael Meltzer mjm@michaelmeltzer.com
Mon, 12 Aug 2002 03:21:10 -0400


I have been going over some of the Scrubber.py code, and two things stand
out for me:

1) A lot of work went into making the filename unique in "save_attachment",
so it looks like a straight bug that the URL returned does not include the
"extra" part. It looks to me like the last line should be

url = baseurl + 'attachments/%s/%s' % (msgdir, filename + extra)

Frankly, I think the forming of the name could be better, like filenamebase +
"-" + counter + "." + ext, but that is more of a feature request.

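A rough sketch of the naming idea, not the actual Scrubber.py code: the
helper names unique_name and attachment_url are hypothetical, and this is
written for current Python rather than the version Mailman 2.1 targets. The
point is that the URL is built from the same final name the file was stored
under, so the two cannot drift apart.

```python
import os


def unique_name(directory, filename):
    """Return a name not yet used in directory, formed as base-N.ext.

    Hypothetical helper: the first free name wins, starting with the
    original filename and then base-1.ext, base-2.ext, and so on.
    """
    base, ext = os.path.splitext(filename)   # ext keeps its leading "."
    candidate = filename
    counter = 0
    while os.path.exists(os.path.join(directory, candidate)):
        counter += 1
        candidate = '%s-%d%s' % (base, counter, ext)
    return candidate


def attachment_url(baseurl, msgdir, stored_name):
    """Build the URL from the name the file was actually stored under."""
    return baseurl + 'attachments/%s/%s' % (msgdir, stored_name)
```

Note that the exists-then-create check is racy if two processes save
attachments at once, so a real version would need locking or an exclusive
create.
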
2) It looks like this code is abusing the directory: an unlimited number of
files, up to something like 2^32, will be placed in one directory. This is
not good for system performance. Even with the latest dirhash methods in
the operating system, file creates and file-exists checks will degrade to a
linear search very quickly. Been there and killed the patient that way. It
is hard to spot until you ramp the systems up. I am playing around with
adding two more time-based directory levels to the system,
"attachments/YYYYMM/DD/". BTW, that is what made spotting bug #1 so easy :-)

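A minimal sketch of the time-based layout, assuming the directories are
created on demand. The function name dated_attachment_dir and its arguments
are made up for illustration and are not part of Scrubber.py, and
os.makedirs with exist_ok needs Python 3.2 or later.

```python
import datetime
import os


def dated_attachment_dir(root, day=None):
    """Create and return the attachments/YYYYMM/DD directory for a day.

    Returns (relative, full): the relative part is what would go into
    the URL, the full path is where the file would be written.
    """
    if day is None:
        day = datetime.date.today()
    relative = 'attachments/%04d%02d/%02d' % (day.year, day.month, day.day)
    full = os.path.join(root, *relative.split('/'))
    os.makedirs(full, exist_ok=True)   # no error if it already exists
    return relative, full
```

This caps any one directory at a single day's worth of attachments, at the
cost of having to carry the date part through to the URL.
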
MJM