A Guide to ASGI in Django 3.0 and its Performance

    In December 2019, Django 3.0 was released with an interesting new feature: support for ASGI servers. I was intrigued by what this meant. When I checked the performance benchmarks of asynchronous Python web frameworks, they were ridiculously faster than their synchronous counterparts, often by a factor of 3x–5x.

    So I set out to test Django 3.0 performance with a very simple Docker setup. The results — though not spectacular — were still impressive. But before that, you might need a little bit of background about ASGI.

    Before ASGI there was WSGI

    It was 2003, and Python web frameworks like Zope and Quixote shipped with their own web servers or had their own home-grown interfaces to talk to popular web servers like Apache.

    Being a Python web developer meant committing to learn an entire stack, and relearning everything if you needed another framework. As you can imagine, this led to fragmentation. PEP 333, “Python Web Server Gateway Interface v1.0”, tried to solve this problem by defining a simple standard interface called WSGI (Web Server Gateway Interface). Its brilliance was in its simplicity.

    In fact, the entire WSGI specification can be simplified (conveniently leaving out some hairy details) as the server side invoking a callable object (i.e. anything from a Python function to a class with a __call__ method) provided by the framework or the application. If you have a component that can play both roles, then you have created a “middleware”, an intermediate layer in this pipeline. Thus, WSGI components can be easily chained together to handle requests.
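
    To make the "callable plus middleware" idea concrete, here is a minimal sketch using only the standard library. The names hello_app, UppercaseMiddleware and call_wsgi are hypothetical and exist only for this illustration; call_wsgi stands in for a real WSGI server and skips most of what the specification requires of one.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application: a callable the server invokes per request."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, World"]


class UppercaseMiddleware:
    """Hypothetical middleware: a server to the app, an app to the server."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Play the server role: invoke the wrapped application.
        body_chunks = self.app(environ, start_response)
        # Play the application role: return an iterable of bytes.
        return [chunk.upper() for chunk in body_chunks]


def call_wsgi(app):
    """Stand in for a WSGI server: build an environ and invoke the callable."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_wsgi(UppercaseMiddleware(hello_app))
print(status, body)
```

    Because the middleware accepts and exposes the same callable signature, any number of such layers can be stacked around an application.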

    [Illustration: When connecting became easy, merriment followed]

    WSGI became so popular that it was adopted not just by the large web frameworks like Django and Pylons but also by microframeworks like Bottle. Your favourite framework could be plugged into any WSGI-compatible application server and it would work flawlessly. It was so easy and intuitive that there was really no excuse not to use it.

    Road ‘Blocks’ to Scale

    So if we were perfectly fine with WSGI, why did we have to come up with ASGI? The answer will be quite evident if you have followed the path of a web request. Check out my animation of how a web request flows into Django. Notice how the framework is waiting after querying the database before sending the response. This is the drawback of synchronous processing.

    Frankly, this drawback was not obvious or pressing until Node.js came onto the scene in 2009. Ryan Dahl, the creator of Node.js, was bothered by the C10K problem, i.e. why popular web servers like Apache cannot handle 10,000 or more concurrent connections (on typical web server hardware, they would run out of memory). He asked, “What is the software doing while it queries the database?”

    [Illustration: Looks like she has been waiting forever]

    The answer was, of course, nothing. It was waiting for the database to respond. Ryan argued that web servers should not be waiting on I/O activities at all. Instead, they should switch to serving other requests and get notified when the slow activity is completed. Using this technique, Node.js could serve many more concurrent users using less memory, and on a single thread!

    It was becoming increasingly clear that asynchronous event-based architectures are the right way to solve many kinds of concurrency problems. Probably that is why Python’s creator Guido van Rossum himself worked towards language-level support with the Tulip project, which later became the asyncio module. Eventually Python 3.5 added the async and await syntax to support asynchronous event loops (the two words became reserved keywords in Python 3.7). This has pretty significant consequences for not just how Python code is written but how it is executed as well.

    Two Worlds of Python

    Though writing asynchronous code in Python might seem as easy as sliding an async keyword in front of a function definition, you have to be very careful not to break an important rule - Do not freely mix synchronous and asynchronous code.

    This is because synchronous code can block the event loop that asynchronous code runs on. Such situations can bring your application to a standstill. As Andrew Godwin writes, this splits your code into two worlds, “Synchronous” and “Asynchronous”, with different libraries and calling styles.
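
    A small experiment shows what "blocking the event loop" means in practice. This is only a sketch: the task names and the 0.2 second delay are made up for the illustration, with time.sleep standing in for any synchronous call (a database driver, an HTTP client) made from inside a coroutine.

```python
import asyncio
import time


async def blocking_task():
    # time.sleep() is synchronous: it holds the event loop, so nothing else runs.
    time.sleep(0.2)


async def cooperative_task():
    # asyncio.sleep() yields control back to the event loop while waiting.
    await asyncio.sleep(0.2)


async def timed(task_factory, count=3):
    """Run `count` copies of a coroutine concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(task_factory() for _ in range(count)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_task))
cooperative_elapsed = asyncio.run(timed(cooperative_task))
print(f"blocking: {blocking_elapsed:.2f}s, cooperative: {cooperative_elapsed:.2f}s")
```

    The three cooperative tasks overlap and finish in roughly the time of one, while the three blocking tasks run back to back even though they were scheduled concurrently.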

    [Illustration: When two worlds collide, results can be quite unexpected]

    Coming back to WSGI, this means we cannot simply write an asynchronous callable and plug it in. WSGI was written for a synchronous world. We will need a new mechanism to invoke asynchronous code. But if everyone writes their own mechanisms we would be back to the incompatibility hell we started with. So we need a new standard similar to WSGI for asynchronous code. Hence, ASGI was born.

    ASGI had some other goals as well. But before that let’s look at two similar web applications greeting “Hello World” in WSGI and ASGI style.

    In WSGI:

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        # WSGI expects an iterable of bytestrings, not a bare bytes object
        return [b"Hello, World"]
    

    In ASGI:

    async def application(scope, receive, send):
        # ASGI header names and values must both be bytes
        await send({"type": "http.response.start", "status": 200, "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": b"Hello World"})
    

    Notice the change in the arguments passed into the callables. The scope argument is similar to the earlier environ argument. The send argument corresponds to start_response. But the receive argument is new. It allows clients to nonchalantly slip messages to the server in protocols like WebSockets that allow bidirectional communications.
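
    To see the three arguments in action without installing a server, you can drive an ASGI callable by hand. The sketch below uses only asyncio; call_asgi is a hypothetical stand-in for a real ASGI server, and its scope dictionary contains just a few of the keys a real server would supply.

```python
import asyncio


async def application(scope, receive, send):
    """Minimal ASGI HTTP application; header names and values are bytes."""
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello World"})


async def call_asgi(app):
    """Stand in for an ASGI server handling a single GET request."""
    scope = {"type": "http", "method": "GET", "path": "/", "headers": []}
    sent = []

    async def receive():
        # An empty request body, delivered as one message.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_asgi(application))
print([m["type"] for m in messages])
```

    The application never returns a response. It pushes messages through send, which is what lets the same interface carry streaming responses and WebSocket traffic.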

    Like WSGI, the ASGI callables can be chained one after the other to handle web requests (as well as other protocol requests). In fact, ASGI is a superset of WSGI and can call WSGI callables. ASGI also has support for long polling, slow streaming and other exciting response types without side-loading resulting in faster responses.

    Thus, ASGI introduces new ways to build asynchronous web interfaces and handle bi-directional protocols. Neither the client nor the server needs to wait for the other to communicate; it can happen at any time, asynchronously. Existing WSGI-based web frameworks, being written in synchronous code, would not support this event-driven way of working.

    Django Evolves

    This also brings us to the crux of the problem with bringing all the async goodness to Django - all of Django was written in synchronous style code. If we need to write any asynchronous code then there needs to be a clone of the entire Django framework written in asynchronous style. In other words, create two worlds of Django.

    Well, don’t panic: we might not have to write an entire clone, as there are clever ways to reuse bits of code between the two worlds. But as Andrew Godwin, who leads Django’s Async Project, rightly remarks, “it’s one of the biggest overhauls of Django in its history”. It is an ambitious project involving reimplementation of components like the ORM, request handler, template renderer, etc. in asynchronous style. This will be done in phases and over several releases. Here is how Andrew envisions it (not to be taken as a committed schedule):

    • Django 3.0 - ASGI Server
    • Django 3.1 - Async Views (see an example below)
    • Django 3.2/4.0 - Async ORM

    You might be wondering about the rest of the components, like template rendering, forms, cache, etc. They may remain synchronous, or an asynchronous implementation may be fitted in somewhere on the future roadmap. But the above are the key milestones in evolving Django to work in an asynchronous world.

    That brings us to the first phase.

    Django talks ASGI

    In 3.0, Django can work in an “async outside, sync inside” mode. This allows it to talk to all known ASGI servers such as:

    • Daphne - an ASGI reference server, written in Twisted
    • Uvicorn - a fast ASGI server based on uvloop and httptools
    • Hypercorn - an ASGI server based on the sans-io hyper, h11, h2, and wsproto libraries

    It is important to reiterate that internally Django is still processing requests synchronously in a threadpool. But the underlying ASGI server would be handling requests asynchronously.
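
    The “async outside, sync inside” idea can be sketched with the standard library alone. This is not Django's actual handler; sync_handler, the executor and the demo function are hypothetical names for this illustration, and the pool size of 4 is arbitrary. The point is only that an ASGI callable can hand synchronous work to a worker thread so the event loop stays free.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool; real servers and Django manage their own threads.
executor = ThreadPoolExecutor(max_workers=4)


def sync_handler(path):
    """Stand-in for ordinary synchronous view code."""
    return f"Hello from {path}".encode()


async def application(scope, receive, send):
    """ASGI callable that hands the synchronous work to a worker thread."""
    loop = asyncio.get_running_loop()
    # The event loop stays free to serve other requests while the thread runs.
    body = await loop.run_in_executor(executor, sync_handler, scope["path"])
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


async def demo():
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await application({"type": "http", "path": "/polls/"}, receive, send)
    return sent


sent_messages = asyncio.run(demo())
print(sent_messages[-1]["body"])
```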

    This means your existing Django projects require no changes. Think of this change as merely a new interface by which HTTP requests can enter your Django application.

    But this is a significant first step in transforming Django from “outside-in”. You could also start using Django on the ASGI server which is usually faster.

    How to use ASGI?

    Every Django project (since version 1.4) ships with a wsgi.py file, which is a WSGI handler module. While deploying to production, you will point your WSGI server, like gunicorn, to this file. For instance, you might have seen this line in your Docker Compose file:

    command: gunicorn mysite.wsgi:application
    

    If you create a new Django project (e.g. by running the django-admin startproject command), then you will find a brand new file asgi.py alongside wsgi.py. You will need to point your ASGI server (like daphne) to this ASGI handler file. For example, the above line would be changed to:

    command: daphne mysite.asgi:application
    

    Note that this requires the presence of an asgi.py file.

    Running Existing Django Projects under ASGI

    None of the projects created before Django 3.0 have an asgi.py. So how do you go about creating one? It is quite easy.

    Here is a side-by-side comparison (docstrings and comments omitted) of the wsgi.py and asgi.py for a Django project:

    wsgi.py:

    import os

    from django.core.wsgi import get_wsgi_application

    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

    application = get_wsgi_application()

    asgi.py:

    import os

    from django.core.asgi import get_asgi_application

    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

    application = get_asgi_application()

    If you are squinting too hard to find the differences, let me help you - everywhere ‘wsgi’ is replaced by ‘asgi’. Yes, it is as straightforward as taking your existing wsgi.py and running a string replacement s/wsgi/asgi/g.
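
    If you would rather do the replacement in Python than in sed, a sketch might look like the following. The WSGI_SOURCE string mirrors the wsgi.py contents described in this post; a real file will also carry docstrings and comments, which the same replacement rewrites too.

```python
WSGI_SOURCE = """\
import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

application = get_wsgi_application()
"""

# Equivalent of s/wsgi/asgi/g applied to the file contents.
asgi_source = WSGI_SOURCE.replace("wsgi", "asgi")
print(asgi_source)
```

    In a real project you would read the text from your existing wsgi.py and write the result to a new asgi.py next to it, assuming nothing else in the file happens to contain the string “wsgi”.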

    Gotchas

    You should take care not to call any sync code in the ASGI Handler in your asgi.py. For example, if you make a call to some web API within your ASGI handler for some reason, then it must be an asyncio callable.

    ASGI vs WSGI Performance

    I did a very simple performance test trying out the Django polls project in ASGI and WSGI configurations. Like all performance tests, you should take my results with liberal doses of salt. My Docker setup includes Nginx and PostgreSQL. The actual load testing was done using the versatile Locust tool.

    The test case was opening a poll form in the Django polls application and submitting a random vote. It makes n requests per second when there are n users. The wait time is between 1 and 2 seconds.

    [Illustration: Being fast isn't quite enough, you need to avoid failures]

    The results shown below indicate an increase of roughly 66% in the number of simultaneous users handled without failures when running in ASGI mode compared to WSGI mode.

    Users            100   200   300   400   500   600   700
    WSGI Failures     0%    0%    0%    5%   12%   35%   50%
    ASGI Failures     0%    0%    0%    0%    0%   15%   20%

    As the number of simultaneous requests ramps up, the WSGI or ASGI handler will not be able to cope beyond a certain point, resulting in errors or failures. The requests per second after the WSGI failures start vary wildly. The ASGI performance is much more stable even after failures.

    As the table shows, the number of simultaneous users handled without errors is around 300 for WSGI and 500 for ASGI on my machine. This is about a 66% increase in the number of users the servers can handle without error. Your mileage might vary.
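
    For anyone who wants to check the arithmetic, the figure can be derived from the failure table. The helper error_free_capacity is a hypothetical name for this sketch; it simply picks the largest tested user count with 0% failures.

```python
users = [100, 200, 300, 400, 500, 600, 700]
failures = {
    "WSGI": [0, 0, 0, 5, 12, 35, 50],  # failure percentages from the table
    "ASGI": [0, 0, 0, 0, 0, 15, 20],
}


def error_free_capacity(failure_rates):
    """Largest tested user count that produced no failures."""
    return max(u for u, f in zip(users, failure_rates) if f == 0)


wsgi_capacity = error_free_capacity(failures["WSGI"])
asgi_capacity = error_free_capacity(failures["ASGI"])
increase = (asgi_capacity - wsgi_capacity) / wsgi_capacity * 100
print(wsgi_capacity, asgi_capacity, f"{increase:.1f}%")  # prints: 300 500 66.7%
```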

    Frequent Questions

    I did a talk about ASGI and Django at BangPypers recently and the audience raised a lot of interesting questions (even after the event). So I thought I would address them here (in no particular order):

    Q. Is Django Async the same as Channels?

    Channels was created to support asynchronous protocols like WebSockets and long-polling HTTP. Django applications still run synchronously. Channels is an official Django project but not part of core Django.

    The Django Async project will support writing Django applications with asynchronous code in addition to synchronous code. Async is a part of Django core.

    Both were led by Andrew Godwin.

    These are independent projects in most cases. You can have a project that uses either or both. For example, if you need to support a chat application over WebSockets, then you can use Channels without using Django’s ASGI interface. On the other hand, if you want to write an async function as a Django view, then you will have to wait for Django’s Async support for views.

    Q. Any new dependencies in Django 3.0?

    Installing just Django 3.0 will install the following into your environment:

    $ pip freeze
    asgiref==3.2.3
    Django==3.0.2
    pytz==2019.3
    sqlparse==0.3.0
    

    The asgiref library is a new dependency. It contains sync-to-async and async-to-sync function wrappers so that you can call sync code from async and vice versa. It also contains a StatelessServer and a WSGI-to-ASGI adapter.

    Q. Will upgrading to Django 3.0 break my project?

    Version 3.0 might sound like a big change from its previous version, Django 2.2. But that is slightly misleading. The Django project does not follow semantic versioning exactly (where a major version number change may break the API), and the differences are explained in the Release Process page.

    You will notice very few serious backward incompatible changes in the Django 3.0 release notes. If your project does not use any of them, then you can upgrade without any modifications.

    Then why did the version number jump from 2.2 to 3.0? This is explained in the release cadence section:

    Starting with Django 2.0, version numbers will use a loose form of semantic versioning such that each version following an LTS will bump to the next “dot zero” version. For example: 2.0, 2.1, 2.2 (LTS), 3.0, 3.1, 3.2 (LTS), etc.

    Since the last release, Django 2.2, was a long-term support (LTS) release, the following release had to increase the major version number to 3.0. That’s pretty much it!

    Q. Can I continue to use WSGI?

    Yes. Asynchronous programming could be seen as an entirely optional way to write code in Django. The familiar synchronous way of using Django would continue to work and be supported.

    Andrew writes:

    Even if there is a fully asynchronous path through the handler, WSGI compatibility has to also be maintained; in order to do this, the WSGIHandler will coexist alongside a new ASGIHandler, and run the system inside a one-off eventloop - keeping it synchronous externally, and asynchronous internally.

    This will allow async views to do multiple asynchronous requests and launch short-lived coroutines even inside of WSGI, if you choose to run that way. If you choose to run under ASGI, however, you will then also get the benefits of requests not blocking each other and using less threads

    Q. When can I write async code in Django?

    As explained earlier in the Async Project roadmap, it is expected that Django 3.1 will introduce async views which will support writing asynchronous code like:

    import asyncio
    from django.http import HttpResponse

    async def view(request):
        await asyncio.sleep(0.5)
        return HttpResponse("Hello, async world!")
    

    You are free to mix async and sync views, middleware, and tests as much as you want; Django will ensure that you always end up with the right execution context. Sounds pretty awesome, right?

    At the time of writing, the patch is almost certain to land in 3.1 and is awaiting a final review.

    Wrapping Up

    We covered a lot of background about what led to Django supporting asynchronous capabilities. We tried to understand ASGI and how it compares to WSGI. We also found some performance improvements in terms of increased number of simultaneous requests in ASGI mode. A number of frequently asked questions related to Django’s support of ASGI were also addressed.

    I believe asynchronous support in Django could be a game changer. It will be one of the first large Python web frameworks to evolve into handling asynchronous requests. It is admirable that it is done with a lot of care not to break backward compatibility.

    I usually do not make my tech predictions public (frankly many of them have been proven right over the years). So here goes my tech prediction — Most Django deployments will use async functionality in five years.

    That should be enough motivation to check it out!

    Thanks to Andrew Godwin for his comments on an early draft. All illustrations courtesy of Old Book Illustrations.


    Ray Tracer in Python (Part 3) - Show Notes of "3D Balls in 2D Space"

    Graphics is what made Mathematics enjoyable for me. I first heard of trigonometric functions like sine and cosine when I read GW-BASIC manual. Geometry was easy to visualize with the rudimentary graphics of LINE and CIRCLE statements. While I could see many struggle with Mathematics, I always found it interesting.

    So my challenge was to make this math-heavy episode interesting so that you see how I see it. I needed to give personalities to Ray and Sphere before I could show their intersection formula. This needed a lot of illustration and animation work. But I believe the end result was worth it.

    This time there is a lot of furious typing and less talking because of the number of lines entered in this part. I did not want to fast forward code writing segments because it doesn’t help the learners. In any case, YouTube can speed up videos if you choose to.

    These are the topics we will cover in this episode:

    • Introduction
      • Why meshes in movies and spheres in raytracers
      • Simplified ray-tracing
      • Ray-sphere intersection
      • Aspect Ratio Corrections
    • First sub-problem: 3D Balls in 2D Space
    • Coding the solution
      • Hex colors
      • Classes for Engine, Ray, Sphere, etc.
      • Rendering Algorithm

    Here is the video:

    Code for part three is tagged on the Puray Github project

    Bonus (Traffic Lights) Code is available for download.

    Show Notes

    Books and articles that can help understand this part:

    Note: References may contain affiliate links


    Ray Tracer in Python (Part 2) - Show Notes of "Revealing the True Colors"

    It is always a good idea to create a visible output at the start of a long project. If you are making a game, start with showing something moving on the screen. It keeps you motivated and gives you something cool to show your friends as progress. In the second part of our ray tracer tutorial, I will introduce you to PPM, a very simple image format that will be used for our renders. You don’t need to install any image libraries, and yet PPM files can be read by most image viewers.
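
    To give a feel for how simple the format is, here is a minimal sketch of a plain-text (P3) PPM writer. It is not the code from the video or the Puray project; write_ppm and the tiny 2x2 test image are hypothetical names and data for this illustration. It takes a file object rather than a filename, so it can write to an in-memory buffer as easily as to disk.

```python
import io


def write_ppm(fileobj, pixels):
    """Write rows of (r, g, b) tuples, each 0-255, as a plain-text P3 PPM image."""
    height = len(pixels)
    width = len(pixels[0])
    # Header: magic number, dimensions, maximum channel value.
    fileobj.write(f"P3\n{width} {height}\n255\n")
    for row in pixels:
        fileobj.write(" ".join(f"{r} {g} {b}" for r, g, b in row) + "\n")


# A 2x2 image: red, green / blue, white.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
buffer = io.StringIO()
write_ppm(buffer, image)
print(buffer.getvalue())
```

    Swapping the StringIO buffer for a file opened in text mode produces an image file that most viewers can open.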

    The color class is a lot simpler than what I originally designed. Features like gamma correction and linear interpolation seemed like an overkill for a project like this. But I plan to add some convenience constructors later. Hopefully, this will be a good introduction to how colors are manipulated in computer graphics.

    These are the topics we will cover in this episode:

    • Introduction
      • Compressed Images are Hard
      • What Images are Made of
      • RGB triplets
    • First sub-problem: Revealing the True Colors
    • Coding the solution
      • Use the shebang line
      • Separate classes in separate files
      • Why fileobj instead of filename?

    Here is the video:

    Code for part two is tagged on the Puray Github project

    Bonus (Gradient) Code is available for download.

    Show Notes

    Books and articles that can help understand this part:

    Note: References may contain affiliate links


    Ray Tracer in Python (Part 1) - Show Notes of "Points in 3D Space"

    I’m really excited to start a new video tutorial series on creating a ray tracer from scratch. This is a set of intermediate-level Python tutorials. Recently, realtime ray tracing became a hot topic in the gaming community after various Minecraft ray tracing videos started popping up. Of course, you need a monster of a machine to get decent framerates. However, we will be making a non-realtime ray tracer entirely in Python.

    These are the topics we will cover in this episode:

    • Introduction
    • Who should watch this tutorial?
    • What will you cover?
    • Ray tracing in a Nutshell
    • Getting Familiarized with my Emacs
    • First sub-problem: Points in 3D Space
    • Coding the solution
      • TDD (Test Driven Development)
      • Data Structures
      • 3D Vectors

    Here is the video:

    Code for part one is tagged on the Puray Github project

    Show Notes

    Books I recommend for learning Python:

    Some links to learn the Vector math I cover, in more detail:

    Note: References may contain affiliate links

    A Dream Forever in Making

    I have been forever interested in Computer Graphics since creating computer games is what really got me interested in programming (or “coding” as it is the fashionable term now). In early days, ray traced images used to blow my mind compared to the blocky graphics that 3D games generated.

    But trying to learn the algorithms was frustrating for two reasons - the mathematics seemed too dense and it took a really long time for each render. In 2015, I spent a weekend playing with various algorithms to create a simple ray tracer in Python by heavily leveraging NumPy.

    I felt the NumPy parts looked “un-pythonic”, so I reimplemented it without NumPy. It hit a sweet spot between functionality and readability. You needn’t be a math guru to figure out how it worked. I had to share what I learnt, not because there was a lack of ray tracing tutorials but because I wanted to make an accessible tutorial with a gentle learning curve.

    However, the process of creating video tutorials has changed over the years. Gone are the days of a simple screen recording or screencast. Now we have to have slick intros, animations and click-baity thumbnails with a face overlay having a shocked expression. But honestly I am in awe of how much time people spend on making each video (it is way, way more than you think) and how frequently they make them (“new video every week”).

    The challenge is even harder when you need to break down all that math and physics behind computer graphics into simple concepts in a logical flow. That also takes way more time than you would expect. Sometimes you find that one clear diagram would explain an idea perfectly but nobody has made one so far so you need to draw a fresh one. Or you need to cut down your explanation because listeners are getting lost in the details. Plus your real life slows you down with work deadlines and goof ups like out of focus video recordings. This process of iterating until my script (and code) became streamlined took me months.

    My approach has always been about posting higher quality stuff at low frequencies. So I am happy if the end result was worth the wait (and I hope it is). This could be a tutorial that might outlive many of my other videos and that is satisfying in itself. Hope you’ll enjoy this journey with me as much as I did.

    Check back for part 2 (EDIT: it is out)!


    Black Holes and Python

    [Image: Black hole at the center of the M87 galaxy. Image courtesy: BBC]

    The first picture of a black hole is probably one of the most exciting developments in the world of science. The blurry ring of fiery orange might not seem difficult to produce. In fact, it involved years of effort by an international team of scientists, including computer scientists.

    Reading the account, I am excited about the role Python played in this endeavour. This is interesting because when we talk about a scientific discovery we usually talk about the people - the scientists who made leaps of intuitions and found correlations that no one else had. But increasingly technology is playing a significant role in discoveries by sifting through enormous amounts of data and extracting valuable insights.

    Python’s popularity in scientific computing would not be a surprise to most Python programmers today. But back in 2009, when I attended the first PyCon India at IISc Bangalore, I was surprised to see talks on experimental physics and fluid simulations. When I asked Prof Prabhu why Python is so popular in scientific computing, he said “it is very accessible to us – non-programmers”.

    Casually browsing through the software used by the astronomers, you will find mentions of Python libraries like Numpy, Scipy, Matplotlib, Pandas and Jupyter. Remarkably, entire projects such as eht-python are written only in Python. Python is not just the language of choice, it is the lingua franca of scientific computing.

    Yet, if you think about it, there are better programming languages, be it in terms of speed, type safety or brevity. But Python overcomes these limitations and sometimes succeeds due to some pragmatic language design decisions.

    Speed Can Be Delegated

    Ironically, plain Python code can perform very poorly for computation-intensive tasks. But libraries like NumPy are the de facto standard when it comes to any form of number crunching. NumPy provides an N-dimensional array object with several high-level operations like cross product or transpose. The C engine of the library accelerates these operations close to raw machine speed.

    In the early days of Python, it was expected that performance-intensive parts would be written in other languages like C or FORTRAN and a wrapper interface would be used to invoke them. Over time, a wealth of libraries like NumPy made it unnecessary to write any custom C code. Why reinvent the wheel when you can just “import” and use it?

    Libraries that Play Well

    [Image: import gravity - Obligatory XKCD 353]

    Working with third party C libraries is not for the weak hearted. In 2002, when I was adapting the algorithm in a paper for my project on wavelet-based image compression, I learnt this the hard way. We needed to use an existing Fast Fourier transform library written in C.

    The library worked when you used it as is. But if you tried to extend a data structure, you might end up with a null pointer exception. Manual memory management by working out all the code paths turned out to be very stressful. The library was well documented but we practically needed to understand every line before tweaking it.

    Eventually, we gave up and started implementing most of the project in Python. It was much easier to work with higher level data structures like dictionaries and lists without going through the dance of malloc and free. Even better, the Python code was pretty much a direct translation of the mathematics in the paper to code.

    Python libraries tend to compose quite well (while C libraries don’t). This is partly due to its dynamic typing and automatic memory management, but I personally feel it is mainly due to good conventions. Most of the Python idioms are well documented and this leads to minimum surprises. For instance, a deeply nested class hierarchy is frowned upon because “flat is better”.

    Interactive Exploration

    Research is explorative. We do not know what we may find. Even if we do, we cannot wait for ages to find out because we might be chasing a dead end. An interactive interface is a key tool for a researcher or scientist. A Jupyter notebook is close to the ideal with its live code and embedded visualization abilities.

    If you need to try a computation with a different set of parameters, you can invoke it and view the results. Even plot it to visualize it better. Then you could take the results and feed it to another computation. This recorded transcript is a valuable data pipeline that can be replayed by a different user for verification or with a different set of observations.

    If you think about it, a conversational interface could be more approachable to a non-programmer. Alan Kay was very impressed with an early interactive programming environment called JOSS developed in RAND that appealed to economists. I find it endearing that it replied to any command it did not understand with a “Eh?” or “SORRY”.

    Imagine using today’s voice recognition technology to build such a conversational virtual assistant for scientists. Considering much of science (especially physics) involves mathematics – spelling out complex equations can quickly get tedious. Listening to tables of numbers is no fun either. So unless the conversation steps up with an amazing level of artificial intelligence (imagine a reply like “I have run simulations on every known element, and none can serve as a viable replacement for the palladium core."), we are probably stuck with current interfaces.

    Future of Python

    [Image: Katie Bouman, one of the key contributors, with the amazing amount of data]

    M87 EHT project involved processing petabytes of information (which is publicly available). They plan to add new telescopes in the future, increasing the volume of data by orders of magnitude. In general, the computational demands of science will keep growing and even enter new domains. The question is - will Python keep up or get replaced?

    Python has a strong ecosystem with hundreds of libraries. It will be hard for another language to reproduce that. It is a very easy language to pick up. The readability is so good that Python code is often compared to pseudocode. I believe it has changed the expectation of what code should look like. Any new language should have equal or better readability to inspire a switch.

    While there are several other promising languages like Julia or Rust, I am confident that Python will remain the scientist’s favourite programming language for a long while. Despite its limitations, Python has found a sweet spot between ease and power.

    Every year, technological progress keeps accelerating. This can translate into progress for humanity if we can make technology more accessible. We need physicists, mathematicians, biologists, economists, farmers and so on to use cheap computing power to build better things.

    Python does play a significant role by making coding less intimidating and more collaborative. That’s why I believe you will see it bringing more people to computers and being a part of more future breakthroughs.
