[Web-SIG] Nodejs cluster

anatoly techtonik techtonik at gmail.com
Tue Mar 18 08:12:49 CET 2014


On Tue, Mar 18, 2014 at 5:16 AM, est <electronixtar at gmail.com> wrote:

>
> IPython.parallel
>
>
> http://ipython.org/ipython-doc/stable/install/install.html#dependencies-for-ipython-parallel-parallel-computing
>
> It's based on ZeroMQ (PyZMQ) and the `ssh` command. I don't think that's
> lightweight enough for busy web clusters.
>

You will need to secure your web cluster computations anyway. SSH may be
slower than HTTPS, I agree, but I'd still like to see the benchmarks. IPython
is good for handling long processing tasks. For a myriad of tiny code+data
workers I'd choose Stackless. Not sure about the web server part.


> By QMachine I assume that's
>
> https://github.com/wilkinson/qmachine
>
> For web server cluster it's really not a good idea to amplify HTTP
> requests. One client request amplifies several other HTTP requests on
> server clusters.
>

Right. Because your workers are not trusted you need to distribute the load
and validate results with multiple passes.
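
A minimal sketch of that validation idea, assuming each worker is a plain
callable and results are hashable. The names `validated_result` and `quorum`
are made up for illustration and do not come from QMachine:

```python
from collections import Counter


def validated_result(task, workers, quorum=2):
    """Run the same task on several untrusted workers and accept a result
    only if at least `quorum` of them returned the same value."""
    results = [worker(task) for worker in workers]
    value, votes = Counter(results).most_common(1)[0]
    if votes < quorum:
        raise ValueError("no result reached the quorum")
    return value


# Hypothetical workers: two honest ones and one that returns a wrong answer.
def honest(n):
    return n * n


def faulty(n):
    return n * n + 1


print(validated_result(7, [honest, faulty, honest]))
```

The cost is visible right away: every task runs several times, which is the
request amplification mentioned above.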


> What I propose is something like Zed Shaw's Mongrel2 project
> (http://mongrel2.org/): use a very lightweight server-side serialization
> protocol as cluster IPC, so you can pass state/data between nodes (workers)
> easily. It should be agnostic to frameworks and libraries; the objective is
> to unite Python modules in the realtime web world. For the
> request-response web world, a synchronous gateway like WSGI is good
> enough: between requests, share nothing
> <https://docs.djangoproject.com/en/dev/faq/general/#does-django-scale>.
>
> But for the realtime web, server-side state is very much required. There
> needs to be an fd pool for DBs, external services, and things like
> server-side push technologies.
>

"Realtime web" is a very broad term; it needs a more precise definition. I see
only one difference between "web" and a standard protocol: the client can only
initiate (send) operations, and its requests are limited to the HTTP(S)
protocol. Is that true? All other parts of the system can communicate with
whatever protocols they like.

So, to unify the network under some standard, we need a common base. Stick to
the limitations of the client to make all nodes work the same. Limit the
choice to the bare minimum and extend where it is needed.

> Let's assume the following scenario:
>
> One user submits a blog post, and his followers get browser/iOS/Android push
> notifications. Because users are connected to different nodes in one big
> cluster, we need some kind of mechanism to broadcast this message.
>
> In such an architecture we can write simpler code like this:
>
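> from django.db.models.signals import post_save
> from django.dispatch import receiver
>
> # BlogPostModel and the *_followers objects are placeholders that are
> # assumed to be defined elsewhere in the application.
> @receiver(post_save, sender=BlogPostModel)
> def my_handler(sender, **kwargs):
>     msg = "User X just posted a new blog, check it out at http://..."
>     browser_followers.send(msg)
>     ios_followers.send(msg)
>     android_followers.send(msg)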
>
> Currently this library really shines.
>
> https://pypi.python.org/pypi/telegraphy/
>
> Telegraphy architecture is like this:
>
> [image: Inline image 1]
>
> What I propose is to merge the web-app part and the AutobahnPython gateway
> part into *one*, based on a community-honored standard.
>

Just a side note - XML-RPC is a bad way of operation and I am going to
promote that belief.

The key component that is not depicted here is the client's limitations (it
can only request events, and can accept events only after a websocket
connection is established with a single server). The channel descriptions
(WS, HTTP) are not informative enough to capture this limitation, which this
architecture has to deal with.
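
To make that limitation concrete, here is a rough in-memory sketch, assuming
each client is attached to exactly one node and that a list of received
messages stands in for its websocket. `Node` and `Cluster` are hypothetical
names, not part of Telegraphy or Autobahn:

```python
class Node:
    """One cluster node holding the connections of its own clients."""

    def __init__(self, name):
        self.name = name
        # client id -> messages pushed to it (stands in for a websocket)
        self.clients = {}

    def connect(self, client_id):
        self.clients[client_id] = []

    def deliver_local(self, recipients, msg):
        # A node can only push to clients connected to it.
        for client_id in recipients:
            if client_id in self.clients:
                self.clients[client_id].append(msg)


class Cluster:
    """Naive broadcast: hand the message to every node in turn."""

    def __init__(self, nodes):
        self.nodes = nodes

    def broadcast(self, recipients, msg):
        for node in self.nodes:
            node.deliver_local(recipients, msg)


node_a, node_b = Node("a"), Node("b")
node_a.connect("alice")
node_b.connect("bob")
Cluster([node_a, node_b]).broadcast({"alice", "bob"}, "new post")
print(node_a.clients, node_b.clients)
```

The sender does not need to know which node holds which follower, but every
node sees every message; that inter-node channel is the part the diagram
leaves out.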

When a client (browser) establishes a connection to an HTTP site, can it open
a websocket to a site in another domain? If not, then cross-domain
interaction should also be included in the problem description before
unifying Django and Autobahn. If this limitation exists, the generic
clustering problem will include managing DNS infrastructure (to make sure
the client can send requests to any node in the cluster), or clustering will
require frontends on the servers to reroute requests on established websocket
connections to the appropriate cluster nodes.
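
As an illustration of the frontend option only, a frontend could pick the
node from a stable hash of some client identifier. This is a sketch under
that assumption; `node_for` and the node names are invented for the example:

```python
import hashlib


def node_for(client_id, nodes):
    """Map a client identifier to one node, the same one on every call."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node names.
nodes = ["node-1", "node-2", "node-3"]
print(node_for("alice", nodes))
```

Plain modulo hashing remaps most clients whenever the node list changes, so a
real frontend would more likely use consistent hashing or an explicit
connection table.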

Not sure I got the positioning of NodeJS cluster right, so feel free to fix
that.