Dealing with marketing types...

Paul Rubin http
Sun Jun 12 02:28:56 EDT 2005


Andrew Dalke <dalke at dalkescientific.com> writes:
> I know little about it, though I read at
> http://goathack.livejournal.org/docs.html
> ] LiveJournal source is lots of Perl mixed up with lots of MySQL
> 
> I found more details at
> http://jeremy.zawodny.com/blog/archives/001866.html
> 
> It's a bunch of things - Perl, C, MySQL-InnoDB, MyISAM, Akamai,
> memcached.  The linked slides say "lots of MySQL usage." 60 servers.

LJ uses MySQL extensively, but what I don't know is whether it serves
up individual pages with the obvious bunch of queries the way a smaller
BBS might.  I have the impression that it's more carefully tuned than that.
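For the curious, that kind of tuning usually amounts to the read-through
caching pattern memcached exists for: serve a hit straight out of RAM
and only do the obvious query on a miss.  A minimal sketch, assuming the
python-memcached and MySQLdb modules; the table, key names, and TTL are
made up for illustration, not LJ's actual schema:

    # Read-through cache: try memcached first, fall back to one MySQL
    # query on a miss, then park the result in RAM for the next request.
    import memcache
    import MySQLdb

    mc = memcache.Client(['127.0.0.1:11211'])
    db = MySQLdb.connect(host='localhost', user='web', passwd='secret', db='site')

    def get_post(post_id, ttl=60):
        key = 'post:%d' % post_id
        post = mc.get(key)                  # RAM hit: no SQL at all
        if post is None:                    # miss: do the one obvious query
            cur = db.cursor()
            cur.execute("SELECT subject, body FROM posts WHERE id = %s", (post_id,))
            row = cur.fetchone()
            cur.close()
            if row is None:
                return None
            post = {'subject': row[0], 'body': row[1]}
            mc.set(key, post, time=ttl)     # cache it for the next ttl seconds
        return post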

> I don't see that example as validating your statement that
> LAMP doesn't scale for mega-numbers of hits any better than
> whatever you might call "printing press" systems.

What example?  Slashdot?  It uses way more hardware than it needs to,
at least ten servers and I think a lot more.  If LJ is using 6x as
many servers and taking 20x (?) as much traffic as Slashdot, then LJ is
handling roughly 3x the traffic per server, i.e. doing something more
efficiently than Slashdot.

> How permanent though does the history need to be?  Your
> approach wipes history when the user clears the cookie and it
> might not be obvious that doing so should clear the history.

The cookie is set at user login and it only has to persist through the
login session.  It's not as if the info only exists in the cookie and
nowhere else.
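To make that concrete: the cookie is just a session key handed out at
login, and the history itself lives in a server-side store, so clearing
the cookie ends the session without destroying anything.  A minimal
sketch; the sessions/history stores and helper names are made up:

    # The cookie value is just a session key; the history lives in a
    # server-side store keyed by user, so it survives a cleared cookie.
    import os, binascii

    sessions = {}   # session key (the cookie value) -> user name
    history = {}    # user name -> list of page ids seen (the real copy)

    def login(user):
        key = binascii.hexlify(os.urandom(16))   # goes out in a Set-Cookie header
        sessions[key] = user
        return key

    def record_view(session_key, page_id):
        user = sessions.get(session_key)
        if user is not None:                     # ignore stale/cleared cookies
            history.setdefault(user, []).append(page_id)

    def history_for(user):
        return history.get(user, [])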

> > As for "big", hmm, I'd say as production web sites go, 100k users is
> > medium sized, Slashdot is "largish", Ebay is "big", Google is huge.
> 
> I'd say that few sites have >100k users, much less
> daily users with personalized information. As a totally made-up
> number, only a few dozen sites (maybe a couple hundred?) would
> need to worry about those issues.

Yes, but for those of us interested in how big sites are put together,
those are the types of sites we have to think about ;-).  I'd say
there's more than a few hundred of them, but it's not like there's
millions.  And some of them really can't afford to waste so much
hardware--look at the constant Wikipedia fundraising pitches for more
server iron because the Wikimedia software (PHP/MySQL, natch) can't
handle the load.

> If that's indeed the case then I'll also argue that each of
> them is going to have app-specific choke points which are best
> hand-optimized and not framework optimized.  Is there enough
> real-world experience to design an EnterpriseWeb-o-Rama (your
> "printing press") which can handle those examples you gave
> any better than starting off with a LAMP system and hand-caching
> the parts that need it?

Yes, of course there is.  Look at the mainframe transaction systems of
the 60's, 70's, and 80's, for example.  Look at Google.  Then there's
all the experience we have with LAMP systems.  By putting some effort
into seeing where the resources in those systems actually go, I believe
we can do a much better job.  In particular, sites like Slashdot are
really not update-intensive in the normal database sense.  They can be
handled almost entirely with some serial log files plus some RAM
caching.  At that point almost all the SQL overhead and a lot of the
context switching can go away.
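Boiled down to a toy, I mean something like this: each comment is one
append to a flat log file, reads are served out of an in-memory dict,
and the log gets replayed at startup to rebuild the cache.  The file
name and record format here are made up for illustration:

    # Toy "serial log plus RAM cache": every comment is one append to a
    # flat file, reads come from a dict, and the log is replayed at startup.
    import os, time

    LOG = 'comments.log'
    cache = {}        # story id -> list of (timestamp, user, text)

    def post_comment(story_id, user, text):
        stamp = time.time()
        text = text.replace('\t', ' ').replace('\n', ' ')
        line = '%f\t%s\t%s\t%s\n' % (stamp, story_id, user, text)
        f = open(LOG, 'a')
        f.write(line)                    # one serial append; no transactions, no SQL
        f.close()
        cache.setdefault(story_id, []).append((stamp, user, text))

    def comments_for(story_id):
        return cache.get(story_id, [])   # reads never touch the disk

    def rebuild_cache():
        # Replay the log once at startup to repopulate the RAM cache.
        if not os.path.exists(LOG):
            return
        for line in open(LOG):
            stamp, story_id, user, text = line.rstrip('\n').split('\t', 3)
            cache.setdefault(story_id, []).append((float(stamp), user, text))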


