Real-time Applications and will Django adapt to it?

While talking about Django at PyCon India this year, the most common question I got was whether it supports Single Page applications or an event driven architecture. I was quick to point out that such realtime applications are usually a handled by separate services at the backend (running on a different port?) and a JavaScript MVC framework on the front. Sometimes, Django supported by a REST library like Django REST or Piston is used as the backend too.

But the more I thought about it, those questions felt more and more important. Then, I had a chance to read about Meteor. I tried some of their tutorials and they were fantastic. I was so impressed at that point that I felt this framework had the power to change web development in a big way. I was even tempted to write an article similar to Why Meteor will kill Ruby on Rails for Django. But then, better sense prevailed and I dug deeper.

So I wrote this post to really understand the Real-time web application phenomena and pen my thoughts around it. It will look at how a traditional web framework like Django might tackle it or whether it really can. It is not a comprehensive analysis by any means but some of my thoughts regarding it.

What is a Real-time Application?

Real-time web applications are what used to be known as AJAX applications or Single-page applications. Except that, now we are not talking about just submitting a form without refreshing anymore. Upon an event, the server is supposed to push data into the browser enabling a two-way communication (thereby, making the terms Server and Client moot). To the Facebook crowd, this means that the application can notify them about events like your friends liking your post or starting a new chat conversation live - as and when it happens.

The application will give you the impression that it is “alive”, in the sense that traditional websites were pretty much stale after the page has loading. To use an analogy, traditional websites are like cooked food that get spoiled with time but real-time websites are like living, breathing organisms - constantly interacting with and responding to the environment. As we will see, both have their own advantages and purposes.

The word Real-time itself comes from the field of Real-time Computing. There are various forms of real-time computing based on how strict the systems perceives the response times. Realtime Web applications are probably the least strict and could be termed as ‘soft realtime’.

What kind of sites needs them?

Practically any kind of website can benefit from realtime updates. Off the top of my head, here are some examples:

E-commerce Site: Offers against an item updates its discounted price realtime, user reviews
News Site: Minute-by-minute update of an emerging story, errata of an article, reader reactions, polls
Investment Site: Stock prices, exchange rates, any investment trigger set by user/customer advisor
Utilities Site: Updates your utility usage in real-time, notifies of any upcoming downtimes
Micro-blogging: Pretty much everything

Of course, there are several kinds of sites which would not be ideal for the real-time treatment. Sites with relatively static content like blogs or wikis. Imagine if a blog site was designed to constantly poll for post updates and it suddenly becomes viral. This stretches the limits of the server’s scalability for a largely unchanging content. The most optimal approach today would be to pre-compute and cache the article’s contents and serve the comments via Disqus to create an almost complete realtime experience. But, as we would soon see, this could change in the future as the fundamental nature of the web changes.

Websites are Turning into Multiplayer Games

Broadly, our examples have two kinds of real-time updates - from the server itself and from the peers. The former involves notifications like changes in the discounted price or external events like stock price change. But peer updates, from other users, are becoming extremely valuable as user-generated content becomes the basis for many modern sites. Especially for the constant inflow of information which keeps such sites fresh and interesting. In addition, the social factor adds to inherent human tendencies to form groups, to know and interact with each other.

In many ways, this is exactly how a multiplayer game works. There are global events like weather changes or a sudden disaster and there are player generated events like a melee attack or a chat conversation. Games, being some of first programs that I had written; I am quite fond of them and know quite a bit about how they work. Real-time web applications are designed like multiplayer games especially those that were designed to work with a central server rather than say over a LAN. Multiplayer games have been in existence for about three decades now so real-time web applications are able to leverage much of the work that has gone into creating them.

Technically, there are several similarities too. The event loop that every game programmer writes while beginning to write a game is at the heart of most event driven servers like Node.js or Tornado. Typically, the game client and server are written in two different languages for e.g. Eve Online using Stackless Python on the server and possibly C++ with Python scripting on the client side. This is because, like web applications, the server side needs to interact with a database for bookkeeping purposes and would be more IO bound rather than CPU/GPU-bound. Thus, the needs are different and games being extremely performance hungry creations, developers often use the best language or tool or framework for the client and server sides. They often end up being different.

Of course, in the case of web applications, the de facto language was JavaScript. Over the years, several JavaScript APIs exposing the underlying system have emerged which further cemented JavaScript’s position as the client-side language of choice. However with several languages targeting JavaScript and with browsers supporting source-maps, other options are like pyjs and ClojureScript have now emerged.

How does Meteor Solve it?

Meteor and Derby claim to be a new breed of web applications frameworks which are built for the needs of the real-time web. These frameworks are written in JavaScript to eliminate the need to duplicate the logic in the client and server. While using Django or Rails, model declarations and behaviour had to be written in Python/Rails on the sever side and typically rewritten in JavaScript for MVC frameworks like AngularJS or Knockout on the client side, as well. Depending on the amount of logic shared between the client and server, this would become a development and maintenance nightmare.

These new frameworks also allow automatic synchronisation of data. In other words, part or whole of the server information is replicated between the server and all the connected clients. If you have ever programmed a multiplayer game then you would realise how hard it is to maintain a consistent state across all the clients. By automatically synchronising the database information, you have an extremely powerful abstraction for rapidly creating real-time web applications.

A high-level understanding of how real-time web frameworks work

However, like Drew (the creator of Dropbox) pointed out treating anything over the network as local accessible is a “leaky abstraction” due to network latencies. Programmers are very likely to under-engineer the various challenges that networks can bring up like a slower mobile connection or even a server crash. Users hate it when the real-time component stops working. In fact, once they start seeing ‘Connection Lost’ on a simple chat window the worse thing that could happen is that they lose their train of thought. But when an entire site becomes unresponsive, I believe they would start distrusting the site itself. Perhaps, it is critical not to entirely rely on the data synchronisation mechanism for all kinds of functionality.

Regarding the advantages of using the same language to share logic between the client and server, the previous discussion about multiplayer games comes to mind. Often the requirements of a web server are quite different from that of a client. Even if you avoid the Callback Hell with Futures, JavaScript might not be everyone’s first choice for server side programming. Until recently, it didn’t matter which language you used at the server as long as it returned the expected results say HTML, XML or JSON. People can get very attached to their favourite language and unsurprisingly so; considering the large amount of time one needs to spend in mastering every nook and corner of a programming language. Expecting everyone to adopt JavaScript might not be possible.

The payoff is, of course, that the shared data structures and logic will reduce the need to write them twice in two different languages. Unlike multiplayer games, this is a big deal in web programming due to the sheer amount of shared bookkeeping happening at both ends. However, is having JavaScript at both ends the only way out? We can think of at least one possible alternative approach. But before that, we need to look whether we can continue using traditional frameworks.

Can Django Adapt?

Realtime web is a very real challenge that Django faces. There are some elegant solutions like running Django with Gevent. But these solutions look like hacks in the face of a comprehensive solution like Meteor.

Django community is quite aware of this. At last year’s DjangoCon US, the customary “Why I Hate Django” keynote was by Geoff Schmidt, a principal developer of Meteor. He starts off, in an ominous note, by comparing advent of real-time apps as possibly an extinction event for Django similar to the asteriod impact which nearly drove dinosaurs to extinction. Actually, Geoff is a big fan of Django and he tried to adapt it to his needs but was not quite happy with the results.

And he is not alone. Guido, in his Pycon keynote Q&A, mentioned how it would be difficult for traditional web frameworks like Django to completely adapt to an event driven model. He believes that newer frameworks might actually tackle that need better.

Before we answer the question whether Django can adapt, we need to find out what Django’s shortcomings are. Or rather, which parts of the framework are beginning to show its age in the new real-time paradigm.

Templates are no longer necessary

To be exact, HTML templates getting used lesser and lesser. Just send me the meat - seems to be the mantra of the real-time applications. Content is often rendered at the client side so the wire essentially carries data in the form of XML or JSON. Originally when Django was created, all clients couldn’t support client side rendering using JavaScript. But with increasingly powerful mobile clients, the situation is quickly changing.

However this doesn’t mean that templating will no longer be required in frameworks. A case could be made for XML or JSON templates. But Python data structures can be mapped to JSON and back in a straightforward manner (just like JavaScript).

However, the previously mentioned Django solutions like Piston and Django REST does not encourage using serialised Python data structures directly for a good reason - Security. Data coming from the outside world cannot be trusted. You will need to define classes with appropriate data types to perform basic type validation. Essentially, you might end up repeating the model class definitions you already wrote for the front end.

HTTP and the WSGI interface

If you read the above examples closely, you will notice that real-time web works best for sites involving short bursts of data. For sites with long form content like blogs, wikis or even news sites, it is probably best to stick with traditional web frameworks or even static content. Even if you have a very high bandwidth connection, it would simply be too chatty to check for published information (unless you expect to have too many typos)

In fact, the web is specifically suited for the dissemination of long-form content. It works best for a request-reply mechanism for retrieval of documents (or hypertext if you like to be pedantic). It is stateless hence suited for caching, wherever possible. In fact, this explains why hacks like long-polling had to be created for the browser to support bidirectional communication until web sockets arrived.

This explains the design of WSGI, a synchronous protocol to handle request and response. It can handle only one request at a time and hence not ideally suited for creating realtime applications. Since Django is designed to be deployed on a server with a WSGI interface, this means that asynchronous code needs to bypass this mechanism.

This seems like a genuine shortcoming of Django. There might be hacks to work around it like frequent polling etc. But it would be much better if the framework can integrate with the asynchronous model.

Today, writing a REST-based API using Django and interacting with it using a JavaScript MVC library seems to be a popular way of creating a single page application. To make it realtime, you might have to fiddle around with Gevent or Tornado and web sockets.

Can you have the cake and eat it too?

It is possible to look at the language impedance mismatch problem in a different way. Why not have your favourite language, even if it is not JavaScript, run on the server and the client? Since JavaScript is increasingly used as a target language (I am refraining from calling it the ‘assembly language of the web’), why not convert the client part of the code in, say Python, into JavaScript?

If we can have an intelligent Python compiler which can say target byte codes for the server part and the shared code; and then target JavaScript (perhaps in asm.js too for better performance) and the shared code for the client part - then it might just work. Other details will also have to be worked out like event loops at both ends which can pass messages asynchronously, data synchronisation through a compact binary format and data validation. It is a project much bigger than writing a library but it might be a decent solution that is general enough to be applied to most languages.

Conclusion

Django is a good solution for a vast majority of web applications today. But the expectations are rapidly changing to show real-time updates. Realtime web applications need a lot of design changes in existing web frameworks like Django. The existing solutions require a lot of components and sometimes repetitive code. Newer frameworks like Meteor and Derby seem to be well suited for these needs and rapid development. But the design and scalability of real-time application will still be tricky. Finally, if you are a non-JavaScript developer there might be still be hope.