OT: MoinMoin and Mediawiki?

Ian Bicking ianb at colorstudy.com
Wed Jan 12 18:37:03 EST 2005


Paul Rubin wrote:
>>If you are just trying to avoid too many files in a directory, another
>>option is to put files in subdirectories like:
>>
>>base = struct.pack('i', hash(page_name))
>>base = base.encode('base64').strip().strip('=')
>>filename = os.path.join(base, page_name)
> 
> 
> Using subdirectories certainly keeps directory size down, and it's a
> good idea for MoinMoin given the way MoinMoin uses the file system.
> But for really big wikis, I think using the file system like that
> isn't workable even with subdirectories.  Plus, there's the issue of
> how to find backlinks and how to do full text search.

If the data has to live somewhere, and you need relatively random 
access to it (i.e., access to any whole page, not necessarily a chunk 
of a page), then the filesystem does that pretty well, and you get 
features like the operating system's file caching for free.  I can't 
see a reason not to use the filesystem, really.
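
To make the subdirectory idea concrete, here is a minimal sketch in 
current Python.  It swaps a few things relative to the quoted snippet, 
so treat it as an assumption-laden variant, not the original method: it 
uses hashlib instead of hash() (string hashes are randomized per process 
in Python 3), and hex digits instead of base64 (base64 output can contain 
"/").  The names page_path, root and width are made up for illustration, 
and page names that themselves contain a path separator are not handled.

```python
import hashlib
import os


def page_path(root, page_name, width=2):
    """Return a path for page_name under a hash-derived subdirectory of root."""
    # hashlib gives the same digest on every run, unlike the built-in
    # hash(), which is randomized per process for strings in Python 3.
    digest = hashlib.sha1(page_name.encode("utf-8")).hexdigest()
    # Hex digits are always safe in a directory name; base64 output can
    # contain "/", which os.path.join would treat as a path separator.
    return os.path.join(root, digest[:width], page_name)


if __name__ == "__main__":
    print(page_path("pages", "FrontPage"))
```

With width=2 there are at most 256 subdirectories, so each one holds 
roughly 1/256 of the pages.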

Backlinks are a relatively easy index to maintain yourself: rescan 
each page whenever it is modified.  The result of that indexing can be 
stored efficiently in yet another file (well, maybe one file per 
page).
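
A rough sketch of that kind of index, under stated assumptions: links 
are written as [[Target]] or [[Target|label]] (MoinMoin's own link 
syntax differs), each page gets one JSON file of outgoing links and one 
of incoming links, and page names are safe to use as file names.  The 
function names and file layout are hypothetical.

```python
import json
import os
import re

# Assumed link syntax: [[Target]] or [[Target|label]].
LINK_RE = re.compile(r"\[\[([^\]|]+)")


def _load(path):
    """Read a set of page names from a JSON file; a missing file means empty."""
    try:
        with open(path, encoding="utf-8") as f:
            return set(json.load(f))
    except FileNotFoundError:
        return set()


def _store(path, names):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(names), f)


def update_backlinks(index_dir, page_name, new_text):
    """Rescan one modified page and update the per-page backlink files."""
    os.makedirs(index_dir, exist_ok=True)
    out_path = os.path.join(index_dir, page_name + ".out.json")
    old_targets = _load(out_path)
    new_targets = {t.strip() for t in LINK_RE.findall(new_text)}
    # Only targets whose link was added or removed need their file touched.
    for target in old_targets ^ new_targets:
        in_path = os.path.join(index_dir, target + ".in.json")
        sources = _load(in_path)
        if target in new_targets:
            sources.add(page_name)
        else:
            sources.discard(page_name)
        _store(in_path, sources)
    _store(out_path, new_targets)


def backlinks(index_dir, page_name):
    """Return the sorted names of pages that link to page_name."""
    return sorted(_load(os.path.join(index_dir, page_name + ".in.json")))
```

Calling update_backlinks() from the page-save code keeps the index 
current without ever rescanning the whole wiki.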

For full text search, you'll want existing code to do it for you.  
MySQL includes such code, but there is also plenty of software that 
does the same thing on top of plain files.

A database would be important if you wanted to do arbitrary queries 
combining several sources of data.  And that's certainly possible in a 
wiki, but that's not so much a scaling issue as a 
flexibility-in-reporting issue.
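
As an illustration of that kind of query, here is a small sketch using 
the sqlite3 module from the standard library.  The schema (a pages table 
with a last_editor column, a links table) and the sample rows are 
invented for the example; the point is only that one SQL statement can 
combine page metadata with link data, e.g. "pages last edited by X that 
nothing links to".

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical wiki schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pages (name TEXT PRIMARY KEY, last_editor TEXT);
        CREATE TABLE links (source TEXT, target TEXT);
    """)
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?)",
        [("FrontPage", "alice"), ("HelpContents", "bob"), ("OldDraft", "bob")],
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?)",
        [("FrontPage", "HelpContents")],
    )
    return conn


def orphans_by_editor(conn, editor):
    """Pages last edited by `editor` that no other page links to."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM pages AS p
        LEFT JOIN links AS l ON l.target = p.name
        WHERE p.last_editor = ? AND l.source IS NULL
        ORDER BY p.name
        """,
        (editor,),
    )
    return [name for (name,) in rows]
```

Answering the same question from per-page files would mean writing a 
dedicated scan for each new report.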

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org



More information about the Python-list mailing list