Fitting a Django Application in One File

    Earlier this week, Anthony, a French economics university student, wanted to talk to me over Zoom about my ray tracer tutorials. He and his friend were new to Python but were excited to implement their own ray tracer after following my videos. One of the questions that popped up in the conversation was: “Can we put all the classes in one file instead of breaking it into individual files per class?” I said, “Of course” and noticed a wave of relief on their faces. In fact, I explained, my earlier implementation was all in one file and was later broken up for better pedagogy.

    But the charm of an entire project in a single file is compelling. I remember seeing a Sinatra web application a few years ago containing the entire application and assets like HTML templates and CSS in a single file. Presenting all the components in the same file gave a complete high-level overview of the project by simply scrolling up and down.

    Normally, at this point, someone would suggest a microframework. But it is not that easy.

    Microframeworks

    Microframeworks take a minimalistic approach by omitting certain components or directing you to a few recommended components. For instance, Bottle contains basic form handling capabilities but has no protection against CSRF or clickjacking.

    So the approach is generally to use another library like bottle-utils-csrf. This leaves the task of integration to the developer. This is not to pooh-pooh tiny web frameworks. I love the idea (especially Bottle which I think is really cute). But for public facing sites, I prefer the safety and convenience of Django.

    So I am tempted to try this one-file trick in Django. Let’s try to make a non-trivial web application with forms, templates and images. How does one go about doing something like that?

    Note: if you prefer to watch the video version, click on the video below:

    Django Applications in One File

    Minimal

    Let’s start small by creating a minimal Hello World application in Django. It might be amusing to some Django developers that we will not start with the startproject command. In fact, that command is not necessary for Django to work at all. All that initial directory structure and files like settings.py are for your convenience.

    First, create a simple file called app.py with the following:

    import sys

    from django.conf import settings
    from django.urls import path
    from django.http import HttpResponse

    settings.configure(
        DEBUG=True,  # For debugging
        SECRET_KEY="a-bad-secret",  # Insecure! change this
        ROOT_URLCONF=__name__,
    )


    def home(request):
        return HttpResponse("Welcome!")


    urlpatterns = [
        path("", home),
    ]

    if __name__ == "__main__":
        from django.core.management import execute_from_command_line

        execute_from_command_line(sys.argv)
    

    Yes, that’s all you need. There are some bad practices like hard coding the secret key (easily fixed). But the sheer elegance of everything being in hardly a screenful is quite rewarding.

    The command to run this file is: python app.py runserver 8080

    Now we will skip a couple of steps (they are in the video) and move on to a simple “Coming Soon” landing page.

    Coming Soon Application

    The idea of a coming-soon page is to gauge interest in a product before it is released. Such pages must have a clear call to action (CTA) like asking for your email. Ideally it should have minimum friction and yet collect all the relevant information.

    Let’s look at my updated app.py:

    import os
    import sys
    
    from django.conf import settings
    from django.urls import path
    from django.http import HttpResponse, HttpResponseRedirect
    from django.core.wsgi import get_wsgi_application
    from django.template import RequestContext, Template
    from django import forms
    
    CSV_LIST = "thelist.csv"
    
    settings.configure(
        DEBUG=(os.environ.get("DEBUG", "") == "1"),
        ALLOWED_HOSTS=["*"],  # Disable host header validation
        ROOT_URLCONF=__name__,
        SECRET_KEY=os.environ.get("SECRET_KEY", "a-bad-secret"),
        TEMPLATES=[{"BACKEND": "django.template.backends.django.DjangoTemplates"}],
        # MIDDLEWARE_CLASSES was removed in Django 2.0 (and path() needs 2.0+),
        # so the middleware must be listed under MIDDLEWARE to take effect.
        MIDDLEWARE=[
            "django.middleware.common.CommonMiddleware",
            "django.middleware.csrf.CsrfViewMiddleware",
            "django.middleware.clickjacking.XFrameOptionsMiddleware",
        ],
    )
    
    
    class EnlistForm(forms.Form):
        email = forms.EmailField(
            required=True,
            label=False,
            widget=forms.EmailInput(attrs={"placeholder": "Email"}),
        )
        referrer = forms.CharField(required=False, widget=forms.HiddenInput())


    def home(request):
        if request.method == "POST":
            form = EnlistForm(request.POST)
            if form.is_valid():
                email = form.cleaned_data["email"]
                referrer = form.cleaned_data["referrer"]
                ip = request.META.get("REMOTE_ADDR")
                print(f"Got email of {email}")
                with open(CSV_LIST, "a") as csv:
                    csv.write(f"{email},{referrer},{ip}\n")
                return HttpResponseRedirect("/thanks/")
        else:
            form = EnlistForm(initial={"referrer": request.META.get("HTTP_REFERER")})
        context = RequestContext(
            request, {"content": "Sign up for early access", "form": form}
        )
        return HttpResponse(MAIN_HTML.render(context))


    def thanks(request):
        context = RequestContext(
            request,
            {"content": "Thank you for signing up. We will contact you!", "form": None},
        )
        return HttpResponse(MAIN_HTML.render(context))


    urlpatterns = [
        path("", home),
        path("thanks/", thanks),
    ]
    
    app = get_wsgi_application()
    
    
    MAIN_HTML = Template(
    	"""
    <html>
      <head>
    	<title>Coming Soon | Flying Cars</title>
    	<meta name="viewport" content="width=device-width, initial-scale=1.0">
    	<style>
     	@import url('https://fonts.googleapis.com/css2?family=Exo:wght@400;500;600;700;800;900&display=swap');
     	*{
       	margin: 0;
       	padding: 0;
       	box-sizing: border-box;
       	font-family: 'Exo', sans-serif;
     	}
     	html,body{
       	display: grid;
       	height: 100%;
       	width: 100%;
       	place-items: center;
       	background-color: #343434;
       	/* Thanks to Hero Patterns for the background */
       	background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 56 28' width='56' height='28'%3E%3Cpath fill='%23000000' fill-opacity='0.4' d='M56 26v2h-7.75c2.3-1.27 4.94-2 7.75-2zm-26 2a2 2 0 1 0-4 0h-4.09A25.98 25.98 0 0 0 0 16v-2c.67 0 1.34.02 2 .07V14a2 2 0 0 0-2-2v-2a4 4 0 0 1 3.98 3.6 28.09 28.09 0 0 1 2.8-3.86A8 8 0 0 0 0 6V4a9.99 9.99 0 0 1 8.17 4.23c.94-.95 1.96-1.83 3.03-2.63A13.98 13.98 0 0 0 0 0h7.75c2 1.1 3.73 2.63 5.1 4.45 1.12-.72 2.3-1.37 3.53-1.93A20.1 20.1 0 0 0 14.28 0h2.7c.45.56.88 1.14 1.29 1.74 1.3-.48 2.63-.87 4-1.15-.11-.2-.23-.4-.36-.59H26v.07a28.4 28.4 0 0 1 4 0V0h4.09l-.37.59c1.38.28 2.72.67 4.01 1.15.4-.6.84-1.18 1.3-1.74h2.69a20.1 20.1 0 0 0-2.1 2.52c1.23.56 2.41 1.2 3.54 1.93A16.08 16.08 0 0 1 48.25 0H56c-4.58 0-8.65 2.2-11.2 5.6 1.07.8 2.09 1.68 3.03 2.63A9.99 9.99 0 0 1 56 4v2a8 8 0 0 0-6.77 3.74c1.03 1.2 1.97 2.5 2.79 3.86A4 4 0 0 1 56 10v2a2 2 0 0 0-2 2.07 28.4 28.4 0 0 1 2-.07v2c-9.2 0-17.3 4.78-21.91 12H30zM7.75 28H0v-2c2.81 0 5.46.73 7.75 2zM56 20v2c-5.6 0-10.65 2.3-14.28 6h-2.7c4.04-4.89 10.15-8 16.98-8zm-39.03 8h-2.69C10.65 24.3 5.6 22 0 22v-2c6.83 0 12.94 3.11 16.97 8zm15.01-.4a28.09 28.09 0 0 1 2.8-3.86 8 8 0 0 0-13.55 0c1.03 1.2 1.97 2.5 2.79 3.86a4 4 0 0 1 7.96 0zm14.29-11.86c1.3-.48 2.63-.87 4-1.15a25.99 25.99 0 0 0-44.55 0c1.38.28 2.72.67 4.01 1.15a21.98 21.98 0 0 1 36.54 0zm-5.43 2.71c1.13-.72 2.3-1.37 3.54-1.93a19.98 19.98 0 0 0-32.76 0c1.23.56 2.41 1.2 3.54 1.93a15.98 15.98 0 0 1 25.68 0zm-4.67 3.78c.94-.95 1.96-1.83 3.03-2.63a13.98 13.98 0 0 0-22.4 0c1.07.8 2.09 1.68 3.03 2.63a9.99 9.99 0 0 1 16.34 0z'%3E%3C/path%3E%3C/svg%3E");
     	}
     	::selection{
       	color: #fff;
       	background: #FC4782;
     	}
     	.wrapper{
       	color: #eee;
       	max-width: 900px;
       	text-align: center;
       	padding: 0 50px;
     	}
     	.signup {
       	margin-top: 30px;
       	margin-bottom: 10px;
     	}
     	.content {
       	margin-top: 40px;
       	margin-bottom: 10px;
     	}
    	</style>
      </head>
      <body>
    	<div class="wrapper">
      	<svg width="600" height="300" version="1.1" viewBox="0 0 600 300" xmlns="http://www.w3.org/2000/svg"><g transform="translate(0,312)"><path d="m218.36-289.66h163.29c29.039 0 52.417 23.378 52.417 52.417v45.564c0 29.039-23.378 52.417-52.417 52.417h-163.29c-29.039 0-52.417-23.378-52.417-52.417v-45.564c0-29.039 23.378-52.417 52.417-52.417z" fill="#204a87" stop-color="#000000" stroke="#eeeeec" stroke-linecap="round" stroke-linejoin="round" stroke-width="4"/><g fill="#729fcf" stroke="#eeeeec" stroke-linejoin="round" stroke-width="4"><path d="m240.88-162.15c21.473-37.192 42.946-74.385 64.419-111.58" stop-color="#000000"/><g stroke-linecap="round"><path d="m276.15-249.32h-72.081" stop-color="#000000"/><path d="m259.9-221.32h-56.025" stop-color="#000000"/><path d="m267.69-235.32h-63.714" stop-color="#000000"/><path d="m370.37-162.15c-21.473-37.192-42.946-74.385-64.419-111.58" stop-color="#000000"/><path d="m332-249.93h72.081" stop-color="#000000"/><path d="m348.25-221.93h56.025" stop-color="#000000"/><path d="m340.46-235.93h63.714" stop-color="#000000"/><path d="m240.88-162.15c21.473-26.526 42.946-53.051 64.419-79.577" stop-color="#000000"/></g><path d="m370.37-162.15c-21.473-26.526-42.946-53.051-64.419-79.577" stop-color="#000000"/></g><g fill="#eeeeec"><path d="m183.74-116.06-6.3858 17.316h12.795zm-2.6568-4.6378h5.337l13.261 34.795h-4.8942l-3.1696-8.9261h-15.685l-3.1696 8.9261h-4.9641z" style="text-decoration-color:#000000;text-decoration-line:none"/><path d="m222.66-120.7h4.7078v34.795h-4.7078z" style="text-decoration-color:#000000;text-decoration-line:none"/><path d="m271.14-102.22q1.5149.51273 2.9365 2.1907 1.445 1.678 2.8899 4.6145l4.7777 9.5087h-5.0573l-4.4514-8.9261q-1.7246-3.4959-3.356-4.6378-1.6081-1.142-4.4048-1.142h-5.1273v14.706h-4.7078v-34.795h10.627q5.9663 0 8.9028 2.4937t2.9365 7.5278q0 3.2861-1.5382 5.4535-1.5149 2.1674-4.4281 3.0064zm-11.793-14.613v12.352h5.9196q3.4026 0 5.1273-1.5615 1.7479-1.5848 
1.7479-4.6378t-1.7479-4.5912q-1.7246-1.5615-5.1273-1.5615z" style="text-decoration-color:#000000;text-decoration-line:none"/><path d="m329.38-118.02v4.9641q-2.3772-2.214-5.0806-3.3094-2.6802-1.0954-5.7099-1.0954-5.9663 0-9.1358 3.659-3.1696 3.6357-3.1696 10.534 0 6.8752 3.1696 10.534 3.1696 3.6357 9.1358 3.6357 3.0297 0 5.7099-1.0954 2.7035-1.0954 5.0806-3.3094v4.9175q-2.4704 1.678-5.2438 2.517-2.7501.83901-5.8264.83901-7.9006 0-12.445-4.8243-4.5446-4.8476-4.5446-13.214 0-8.39 4.5446-13.214 4.5446-4.8476 12.445-4.8476 3.123 0 5.873.839 2.7734.8157 5.1972 2.4704z" style="text-decoration-color:#000000;text-decoration-line:none"/><path d="m366.18-116.06-6.3858 17.316h12.795zm-2.6568-4.6378h5.337l13.261 34.795h-4.8942l-3.1696-8.9261h-15.685l-3.1696 8.9261h-4.9641z" style="text-decoration-color:#000000;text-decoration-line:none"/><path d="m421.6-102.22q1.5149.51273 2.9365 2.1907 1.445 1.678 2.8899 4.6145l4.7777 9.5087h-5.0573l-4.4514-8.9261q-1.7246-3.4959-3.356-4.6378-1.6081-1.142-4.4048-1.142h-5.1272v14.706h-4.7078v-34.795h10.627q5.9663 0 8.9028 2.4937t2.9365 7.5278q0 3.2861-1.5382 5.4535-1.5149 2.1674-4.4281 3.0064zm-11.793-14.613v12.352h5.9196q3.4026 0 5.1272-1.5615 1.7479-1.5848 1.7479-4.6378t-1.7479-4.5912q-1.7246-1.5615-5.1272-1.5615z" style="text-decoration-color:#000000;text-decoration-line:none"/></g></g></svg>
      	<h1>All Your Traffic Problems Solved!</h1>
      	<h2>Feel the future with affordable levitating cars.</h2>
      	<div class="content">
        	{{ content }}
        	{% if form %}
          	<form action="." method="post" class="enlist_form">
            	{% csrf_token %}
            	{{ form.non_field_errors }}
            	{{ form.email.errors }}
            	{{ form.referrer }}
            	{{ form.referrer.errors }}
            	{{ form.email }}
            	<button type="submit">Add Me</button>
          	</form>
        	{% endif %}
      	</div>
    	</div>
      </body>
    </html>
    """
    )
    
    
    if __name__ == "__main__":
        from django.core.management import execute_from_command_line

        execute_from_command_line(sys.argv)
    

    Except for the absence of individual files, most of the code should be familiar to a Django developer. There is a large HTML template (including two SVG images) embedded as a string.

    Note that I do not use the ORM here. Django does seem to need a directory structure for that (unless a reader can show me how to do it in a single file).

    Hopefully this shows how minimal Django could be. You might be able to use your favourite framework in places which you didn’t think were possible.


    3 Effective Examples of Django Async Views without Sleeping

    In August this year, Django 3.1 arrived with support for Django async views. This was fantastic news but most people raised the obvious question – What can I do with it? There have been a few tutorials about Django asynchronous views that demonstrate asynchronous execution while calling asyncio.sleep. But that merely led to the refinement of the popular question – What can I do with it besides sleep-ing?

    The short answer is – it is a very powerful technique to write efficient views. For a detailed overview of what asynchronous views are and how they can be used, keep on reading. If you are new to asynchronous support in Django and would like more background, read my earlier article: A Guide to ASGI in Django 3.0 and its Performance.

    Django Async Views

    Django now allows you to write views which can run asynchronously. First let’s refresh your memory by looking at a simple and minimal synchronous view in Django:

    def index(request):
        return HttpResponse("Made a pretty page")
    

    It takes a request object and returns a response object. In a real world project, a view does many things like fetching records from a database, calling a service or rendering a template. But these steps run synchronously, that is, one after the other.

    In Django’s MTV (Model Template View) architecture, Views are disproportionately more powerful than the others (I find it comparable to a controller in MVC architecture though these things are debatable). Once you enter a view you can perform almost any logic necessary to create a response. This is why Asynchronous Views are so important. They let you do more things concurrently.

    It is quite easy to write an asynchronous view. For example the asynchronous version of our minimal example above would be:

    async def index_async(request):
        return HttpResponse("Made a pretty page asynchronously.")
    

    This is a coroutine function rather than a regular function. Calling it directly does not run its body; it only returns a coroutine object, which an event loop must then execute. But you do not have to worry about that difference since Django takes care of all that.
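
    To make that difference concrete, here is a minimal sketch that runs outside Django. It assumes a plain string in place of HttpResponse and a made-up "demo-request" value in place of a real request object, so it only illustrates how a coroutine function behaves.

```python
import asyncio
import inspect


async def index_async(request):
    # A plain string stands in for HttpResponse so the sketch needs no Django.
    return f"Made a pretty page asynchronously for {request}."


# Calling the coroutine function does not execute its body yet;
# it only creates a coroutine object.
coro = index_async("demo-request")
is_coroutine = inspect.iscoroutine(coro)

# An event loop has to drive the coroutine to completion.
result = asyncio.run(coro)
print(is_coroutine, result)
```

    Inside Django you never write the asyncio.run call yourself; the framework does the equivalent for you.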

    Note that this particular view is not invoking anything asynchronously. If Django is running in the classic WSGI mode, then a new event loop is created (automatically) to run this coroutine. So in this case, it might be slightly slower than the synchronous version. But that’s because you are not using it to run tasks concurrently.

    So then why bother writing asynchronous views? The limitations of synchronous views become apparent only at a certain scale. When it comes to large scale web applications probably nothing beats Facebook.

    Views at Facebook

    In August, Facebook released a static analysis tool to detect and prevent security issues in Python. But what caught my eye was how the views were written in the examples they had shared. They were all async!

    # views/user.py
    async def get_profile(request: HttpRequest) -> HttpResponse:
       profile = load_profile(request.GET['user_id'])
       ...
     
    # controller/user.py
    async def load_profile(user_id: str):
       user = load_user(user_id) # Loads a user safely; no SQL injection
       pictures = load_pictures(user.id)
       ...
     
    # model/media.py
    async def load_pictures(user_id: str):
       query = f"""
          SELECT *
          FROM pictures
          WHERE user_id = {user_id}
       """
       result = run_query(query)
       ...
     
    # model/shared.py
    async def run_query(query: str):
       connection = create_sql_connection()
       result = await connection.execute(query)
       ...
    

    Note that this is not Django but something similar. Currently, Django runs the database code synchronously. But that may change sometime in the future.

    If you think about it, it makes perfect sense. Synchronous code can be blocked while waiting for an I/O operation for several microseconds. However, its equivalent asynchronous code would not be tied up and can work on other tasks. Therefore it can handle more requests with lower latencies. Handling more requests gives Facebook (or any other large site) the ability to serve more users on the same infrastructure.

    Illustration
    Scalability Problems in the 1800s, I suppose

    Even if you are not close to reaching Facebook scale, you could use Python’s asyncio as a more predictable threading mechanism to run many things concurrently. A thread scheduler could interrupt in between destructive updates of shared resources, leading to difficult-to-debug race conditions. Compared to threads, coroutines can achieve a higher level of concurrency with far less overhead.
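
    A small sketch of what "more predictable" means here, using only the standard library. The names counter and add_many are made up for illustration. Coroutines only switch at an await, so a read followed by a write with no await in between cannot be interleaved by another task.

```python
import asyncio

counter = 0  # shared state, deliberately unprotected by any lock


async def add_many(times):
    global counter
    for _ in range(times):
        current = counter       # read the shared value
        counter = current + 1   # write it back; no await in between, so no other task can run here
        await asyncio.sleep(0)  # explicit switch point: other tasks may run now


async def main():
    # Ten tasks update the same counter concurrently.
    await asyncio.gather(*(add_many(1000) for _ in range(10)))
    return counter


total = asyncio.run(main())
print(total)
```

    With preemptive threads, the same read-then-write pair would normally need a lock, because the scheduler is free to switch threads between the two statements.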

    Misleading Sleep Examples

    As I joked earlier, most of the Django async views tutorials show an example involving sleep. Even the official Django release notes had this example:

    async def my_view(request):
        await asyncio.sleep(0.5)
        return HttpResponse('Hello, async world!')
    

    To a Python async guru this code might hint at possibilities that were not previously available. But to the vast majority, this code is misleading in many ways.

    Firstly, the sleep happening synchronously or asynchronously makes no difference to the end user. The poor chap who just opened the URL linked to that view will have to wait for 0.5 seconds before it returns a cheeky “Hello, async world!". If you are a complete novice, you may have expected an immediate reply and somehow the “hello” greeting to appear asynchronously half a second later. Of course, that sounds silly but then what is this example trying to do compared to a synchronous time.sleep() inside a view?

    The answer is, as with most things in the asyncio world, in the event loop. If the event loop had some other task waiting to be run then that half second window would give it an opportunity to run that. Note that it may take longer than that window to complete. Cooperative multitasking assumes that every task works quickly and promptly hands control back to the event loop.
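
    Here is a rough, Django-free sketch of that difference. The view and ticker coroutines are hypothetical stand-ins: view plays the role of the sleeping view, and ticker is "some other task" waiting on the same event loop. The log shows when each one gets to run.

```python
import asyncio
import time


async def ticker(log):
    """Stand-in for some other task waiting on the event loop."""
    for i in range(3):
        log.append(f"tick {i}")
        await asyncio.sleep(0.01)


async def view(log, blocking):
    log.append("view start")
    if blocking:
        time.sleep(0.1)           # holds the event loop; nothing else can run
    else:
        await asyncio.sleep(0.1)  # hands control back to the event loop
    log.append("view end")


async def run(blocking):
    log = []
    await asyncio.gather(view(log, blocking), ticker(log))
    return log


blocking_log = asyncio.run(run(blocking=True))
async_log = asyncio.run(run(blocking=False))
print(blocking_log)
print(async_log)
```

    In the blocking run the ticker cannot start until the view has finished, while in the async run all the ticks fit inside the view's waiting window. The user of the view waits about the same time either way; the gain is for everything else sharing the loop.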

    Secondly, it does not seem to accomplish anything useful. Some command-line interfaces use sleep to give enough time for users to read a message before disappearing. But it is the opposite for web applications - a faster response from the web server is the key to a better user experience. So by slowing the response what are we trying to demonstrate in such examples?

    Illustration
    Letting them Sleep Would Be a Better Idea

    The best explanation for such simplified examples I can give is convenience. It needs a bit more setup to show examples which really need asynchronous support. That’s what we are trying to explore here.

    Better examples

    A rule of thumb to remember before writing an asynchronous view is to check whether it is I/O-bound or CPU-bound. A view which spends most of its time in a CPU-bound activity, e.g. matrix multiplication or image manipulation, would not really benefit from being rewritten as an async view. You should focus on the I/O-bound activities.

    Invoking Microservices

    Most large web applications are moving away from a monolithic architecture to one composed of many microservices. Rendering a view might require the results of many internal or external services.

    In our example, an ecommerce site for books renders its front page - like most popular sites - tailored to the logged in user by displaying recommended books. The recommendation engine is typically implemented as a separate microservice that makes recommendations based on past buying history and perhaps a bit of machine learning by understanding how successful its past recommendations were.

    In this case, we also need the results of another microservice that decides which promotional banners to display as a rotating banner or slideshow to the user. These banners are not tailored to the logged in user but change depending on the items currently on sale (active promotional campaign) or date.

    Let’s look at what a synchronous version of such a page might look like:

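    import httpx
    from django.shortcuts import render

    # PROMO_SERVICE_URL and RECCO_SERVICE_URL are assumed to be defined elsewhere (e.g. in settings)


    def sync_home(request):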
        """Display homepage by calling two services synchronously"""
        context = {}
        try:
            response = httpx.get(PROMO_SERVICE_URL)
            if response.status_code == httpx.codes.OK:
                context["promo"] = response.json()
            response = httpx.get(RECCO_SERVICE_URL)
            if response.status_code == httpx.codes.OK:
                context["recco"] = response.json()
        except httpx.RequestError as exc:
            print(f"An error occurred while requesting {exc.request.url!r}.")
        return render(request, "index.html", context)
    

    Here instead of the popular Python requests library we are using the httpx library because it supports making synchronous and asynchronous web requests. The interface is almost identical.

    The problem with this view is that the time taken to invoke these services adds up since they happen sequentially. The Python process is suspended until the first service responds, which could take a long time in a worst case scenario.

    Let’s try to run them concurrently using a simplistic (and ineffective) await call:

    async def async_home_inefficient(request):
        """Display homepage by calling two awaitables synchronously (does NOT run concurrently)"""
        context = {}
        try:
            async with httpx.AsyncClient() as client:
                response = await client.get(PROMO_SERVICE_URL)
                if response.status_code == httpx.codes.OK:
                    context["promo"] = response.json()
                response = await client.get(RECCO_SERVICE_URL)
                if response.status_code == httpx.codes.OK:
                    context["recco"] = response.json()
        except httpx.RequestError as exc:
            print(f"An error occurred while requesting {exc.request.url!r}.")
        return render(request, "index.html", context)
    

    Notice that the view has changed from a function to a coroutine (due to the async def keywords). Also note that there are two places where we await a response from each of the services. You don’t have to try to understand every line here, as we will explain with a better example.

    Interestingly, this view does not work concurrently and takes the same amount of time as the synchronous view. If you are familiar with asynchronous programming, you might have guessed why: simply awaiting a coroutine does not run it alongside anything else in the same view. Each await suspends the view and yields control back to the event loop until that call completes, so the second request does not even start until the first one has returned.

    Let’s look at a proper way to run things concurrently:

    import asyncio

    import httpx
    from django.shortcuts import render


    async def async_home(request):
        """Display homepage by calling two services asynchronously (proper concurrency)"""
        context = {}
        try:
            async with httpx.AsyncClient() as client:
                response_p, response_r = await asyncio.gather(
                    client.get(PROMO_SERVICE_URL), client.get(RECCO_SERVICE_URL)
                )
    
                if response_p.status_code == httpx.codes.OK:
                    context["promo"] = response_p.json()
                if response_r.status_code == httpx.codes.OK:
                    context["recco"] = response_r.json()
        except httpx.RequestError as exc:
            print(f"An error occurred while requesting {exc.request.url!r}.")
        return render(request, "index.html", context)
    

    If the two services we are calling have similar response times, then this view should complete in about half the time compared to the synchronous version. This is because the calls happen concurrently as we would want.

    Let’s try to understand what is happening here. There is an outer try…except block to catch request errors while making either of the HTTP calls. Then there is an inner async…with block which gives a context having the client object.

    The most important line is the one with the asyncio.gather call taking the coroutines created by the two client.get calls. The gather call will execute them concurrently and return only when both of them are completed. The result is a list of responses, in the same order as the awaitables passed in, which we unpack into two variables, response_p and response_r. If there were no errors, these responses are added to the context sent for template rendering.

    Microservices are typically internal to the organization, hence the response times are low and less variable. Yet, it is never a good idea to rely solely on synchronous calls for communicating between microservices. As the dependencies between services increase, they create long chains of request and response calls. Such chains can slow down services.
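
    If you want to see the timing difference without setting up two services, the following sketch may help. It assumes a made-up fake_service coroutine that merely waits 0.2 seconds in place of a real client.get call, so the sleep is only a placeholder for network latency.

```python
import asyncio
import time


async def fake_service(name, delay):
    """Hypothetical stand-in for client.get(...): waits, then returns a payload."""
    await asyncio.sleep(delay)
    return {"service": name}


async def sequential():
    # Each await finishes before the next call even starts.
    promo = await fake_service("promo", 0.2)
    recco = await fake_service("recco", 0.2)
    return [promo, recco]


async def concurrent():
    # Both calls are in flight at the same time.
    return await asyncio.gather(
        fake_service("promo", 0.2), fake_service("recco", 0.2)
    )


def timed(coro_func):
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


seq_result, seq_time = timed(sequential)
con_result, con_time = timed(concurrent)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

    Both versions return the same data in the same order; only the total waiting time differs.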

    Why Live Scraping is Bad

    We need to address web scraping because so many asyncio examples use it. I am referring to cases where multiple external websites or pages within a website are concurrently fetched and scraped for information like live stock market (or bitcoin) prices. The implementation would be very similar to what we saw in the Microservices example.

    But this is very risky since a view should return a response to the user as quickly as possible. So trying to fetch external sites which have variable response times or throttling mechanisms could result in a poor user experience or, even worse, a browser timeout. Since microservice calls are typically internal, response times can be controlled with proper SLAs.

    Ideally, scraping should be done in a separate process scheduled to run periodically (using celery or rq). The view should simply pick up the scraped values and present them to the users.
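
    The shape of that arrangement can be sketched in a few lines. Everything here is hypothetical: PRICE_CACHE is an in-process dictionary standing in for a shared store (such as a cache or a database table that a separate worker process can write to), and scrape_prices returns made-up data instead of fetching anything.

```python
import time

# Hypothetical stand-in for a shared store; a real deployment with a separate
# worker process would need a cache or database both sides can reach.
PRICE_CACHE = {"prices": {}, "fetched_at": None}


def scrape_prices():
    """Placeholder for the slow, unreliable scraping step (returns made-up data)."""
    return {"EXAMPLE": 101.5}


def refresh_prices(scraper=scrape_prices):
    """Meant to be called by a periodic background job, never by a view."""
    PRICE_CACHE["prices"] = scraper()
    PRICE_CACHE["fetched_at"] = time.time()


def prices_view(request=None):
    """The view only reads the stored values, so it returns immediately."""
    return dict(PRICE_CACHE["prices"])


refresh_prices()
print(prices_view())
```

    The point of the split is that a slow or throttled external site only delays the background job, never the response to a user.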

    Serving Files

    Django addresses the problem of serving files by trying hard not to do it itself. This makes sense from a “Do not reinvent the wheel” perspective. After all, there are several better solutions to serve static files like nginx.

    Illustration
    'Serving simultaneously is not for everyone'

    But often we need to serve files with dynamic content. Files often reside in (slower) disk-based storage, even though we now have much faster SSDs. While this file operation is quite easy to accomplish with Python, it could be expensive in terms of performance for large files. Regardless of the file’s size, this is a potentially blocking I/O operation, and the time spent waiting on it could be used to run another task concurrently.

    Imagine we need to serve a PDF certificate in a Django view. However the date and time of downloading the certificate needs to be stored in the metadata of the PDF file, for some reason (possibly for identification and validation).

    We will use the aiofiles library here for asynchronous file I/O. The API is almost the same as Python’s familiar built-in file API. Here is how the asynchronous view could be written:

    import datetime

    import aiofiles
    from django.http import HttpResponse


    async def serve_certificate(request):
        timestamp = datetime.datetime.now().isoformat()
    
        response = HttpResponse(content_type="application/pdf")
        response["Content-Disposition"] = "attachment; filename=certificate.pdf"
        async with aiofiles.open("homepage/pdfs/certificate-template.pdf", mode="rb") as f:
            contents = await f.read()
            response.write(contents.replace(b"%timestamp%", bytes(timestamp, "utf-8")))
        return response
    

    This example illustrates why we need asynchronous template rendering in Django. But until that gets implemented, you could use the aiofiles library to pull local files without skipping a beat.

    There are downsides to directly using local files instead of Django’s staticfiles. In the future, when you migrate to a different storage space like Amazon S3, make sure you adapt your code accordingly.

    Handling Uploads

    On the flip side, uploading a file is also a potentially long, blocking operation. For security and organizational reasons, Django stores all uploaded content into a separate ‘media’ directory.

    If you have a form that allows uploading a file, then we need to anticipate that some pesky user would upload an impossibly large one. Thankfully Django passes the file to the view as chunks of a certain size. Combined with aiofiles’ ability to write a file asynchronously, we could support highly concurrent uploads.

    async def handle_uploaded_file(f):
        async with aiofiles.open(f"uploads/{f.name}", "wb+") as destination:
            for chunk in f.chunks():
                await destination.write(chunk)
    
    
    async def async_uploader(request):
        if request.method == "POST":
            form = UploadFileForm(request.POST, request.FILES)
            if form.is_valid():
                await handle_uploaded_file(request.FILES["file"])
                return HttpResponseRedirect("/")
        else:
            form = UploadFileForm()
        return render(request, "upload.html", {"form": form})
    

    Again this is circumventing Django’s default file upload mechanism, so you need to be careful about the security implications.
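
    One of those implications is that the destination path is built from a client-supplied file name. As a defensive measure, you could reduce the name to its final path component before writing. The sketch below uses only the standard library; safe_upload_path is a hypothetical helper, not part of Django or aiofiles, and UPLOAD_ROOT simply mirrors the "uploads" directory used above.

```python
from pathlib import Path, PureWindowsPath

UPLOAD_ROOT = Path("uploads")  # mirrors the directory used in the example above


def safe_upload_path(filename, root=UPLOAD_ROOT):
    """Hypothetical helper: keep only the final component of a client-supplied name."""
    # PureWindowsPath treats both "/" and "\\" as separators, so this strips
    # directory parts written in either style.
    name = PureWindowsPath(filename).name
    if name in ("", ".", ".."):
        raise ValueError(f"Unusable upload filename: {filename!r}")
    return root / name


print(safe_upload_path("../../etc/passwd"))
print(safe_upload_path("C:\\temp\\report.pdf"))
```

    A real project would likely also want to restrict file types and avoid overwriting existing files, which this sketch does not attempt.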

    Where To Use

    The Django Async project has full backward compatibility as one of its main goals. So you can continue to use your old synchronous views without rewriting them into async. Asynchronous views are not a panacea for all performance issues, so most projects will still continue to use synchronous code since it is quite straightforward to reason about.

    In fact, you can use both async and sync views in the same project. Django will take care of calling the view in the appropriate manner. However, if you are using async views it is recommended to deploy the application on ASGI servers.

    This gives you the flexibility to try asynchronous views gradually, especially for I/O-intensive work. You need to be careful to pick only async libraries or mix them with sync code carefully (using the async_to_sync and sync_to_async adapters).
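
    To give a feel for why such adapters matter, here is a standard-library sketch of the general idea of calling blocking code from async code. It swaps in asyncio.to_thread (Python 3.9+) rather than the sync_to_async adapter itself, and legacy_lookup and heartbeat are made-up names for a blocking call and a task that shares the event loop.

```python
import asyncio
import time


def legacy_lookup(user_id):
    """Hypothetical blocking function, e.g. a call into a synchronous library."""
    time.sleep(0.1)
    return {"user_id": user_id}


async def heartbeat(counter):
    """Keeps counting for as long as the event loop is free to run it."""
    while True:
        counter["ticks"] += 1
        await asyncio.sleep(0.01)


async def main():
    counter = {"ticks": 0}
    beat = asyncio.create_task(heartbeat(counter))
    # Run the blocking call in a worker thread instead of on the event loop.
    user = await asyncio.to_thread(legacy_lookup, 42)
    beat.cancel()
    return user, counter["ticks"]


user, ticks = asyncio.run(main())
print(user, ticks)
```

    Because the blocking call runs in a worker thread, the heartbeat keeps ticking while it waits. Calling legacy_lookup directly inside main would have frozen the loop for the whole 0.1 seconds.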

    Hopefully this writeup gave you some ideas.

    Thanks to Chillar Anand and Ritesh Agrawal for reviewing this post. All illustrations courtesy of Old Book Illustrations.


    Ray Tracer in Python (Part 6) - Show Notes of "Firing All Cores"

    When it comes to code optimization, there is an overused statement about Premature Optimization. Here is the rarely-shown full quote:

    “Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” – Donald Knuth

    I understand this as not worrying about small optimizations all the time. It is best to focus on the areas which have the most return on (time) investment. Usually you need to do extensive performance measurements to find those areas but there are some natural places to look.

    So far, we have not been too concerned about performance, especially when it affected readability (else we wouldn’t be writing it in Python). However, in the previous part we got a huge speedup by switching to the PyPy interpreter.

    To me the next big opportunity lies in using all the cores. As an experienced Python programmer, I know this means using the multiprocessing module. But I also need to explain why other options, notably Python threads, would not help.

    To demonstrate this I wrote a multi-threaded C program, a multi-threaded Python program and a multi-processing Python program - all doing the same thing. What was most entertaining was to watch the GIL in action where it allows only one thread to execute at a time.
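
    The Python half of that comparison can be sketched roughly as follows. This is not the program from the video: the workload (count_primes) and the helper (timed) are hypothetical, chosen only because the work is CPU-bound. On a standard CPython build with the GIL, the thread version should take about as long as running the jobs one after another, while the process version can spread them across cores.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed(executor_cls, jobs, workers=4):
    """Run count_primes over `jobs` with the given executor class.

    Returns (results, elapsed_seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [50_000] * 4
    _, thread_secs = timed(ThreadPoolExecutor, jobs)
    _, process_secs = timed(ProcessPoolExecutor, jobs)
    print(f"threads:   {thread_secs:.2f}s")
    print(f"processes: {process_secs:.2f}s")
```

    The actual timings depend on your machine and interpreter, so run it yourself rather than trusting anyone's numbers.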

    Race Conditions

    Race conditions are often explained using a bank ATM example. I think it can be explained in a non-technical way using many real life scenarios. I wanted to explain it using a cooking example. It’s weird how many asynchronous and parallel concepts can be taught using what we do in the kitchen. So much so that technical job interviews can be made more effective by simply checking your culinary skills (“Sorry we don’t have a whiteboard, just a cutting board for you.”).

    Ideally the entire explanation should be animated. My drawing skills are decent but I cannot animate even a bouncing ball. Don’t laugh, just try it. Firstly, there are several cues that any animated object gives which can only be detected if you watch in slow motion. The Twelve Principles of Animation is a good start.

    Secondly, it is a lot of work. A good animator will know how to give maximum expression with the minimum frames. Timing is also crucial in animation. Even a subtle timing error makes animations look janky or confusing. Mere mortals would need to spend days to eventually produce a 5 second clip.

    [Figure: Animating the Race Conditions]

    So I took the safe approach by showing a slideshow. I think it looks decent. I might post it as a separate clip on Race Conditions so that it doesn’t get lost in this video.
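
    For those who prefer code to kitchens, here is a minimal sketch of a race condition and its usual fix. The Counter class and the run helper are illustrative names, not code from the video. The unsafe version reads and writes the shared value in separate steps, so updates from other threads can be lost; the safe version wraps the same steps in a lock.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read-modify-write in separate steps: another thread may update
        # self.value between the read and the write, and that update is lost.
        current = self.value
        self.value = current + 1

    def safe_increment(self):
        # The lock turns the read-modify-write into one critical section.
        with self._lock:
            self.value += 1


def run(method_name: str, threads: int = 4, repeats: int = 10_000) -> int:
    """Call the named Counter method from several threads; return the total."""
    counter = Counter()
    increment = getattr(counter, method_name)

    def worker():
        for _ in range(repeats):
            increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


if __name__ == "__main__":
    print("unsafe:", run("unsafe_increment"))  # may fall short of 40000
    print("safe:  ", run("safe_increment"))
```

    Whether the unsafe version actually loses updates on a given run depends on thread scheduling, which is exactly what makes race conditions so hard to reproduce.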

    These are the topics we will cover in this episode:

    • Introduction
      • An Embarrassingly Parallel Problem
      • What’s a Thread?
      • Threads in C
      • Threads in Python
      • Race Condition Animated
      • Why Python has GIL
      • Multi-processing in Python
    • Sub-problem: Firing All Cores
    • Coding the solution
      • Command-line argument
      • Dividing pixels among processes
      • Passing Data via Files
      • Value for Progressbar
      • Bug Hunting
    • Performance Comparison
    • Topics Not Covered
    • Learning More
    • Final Words
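
    As a taste of the "dividing pixels among processes" step, here is one possible way to split the image rows into contiguous chunks, one per process. The function name split_rows and the row-wise strategy are assumptions made for this sketch; the video may divide the work differently.

```python
def split_rows(height: int, process_count: int) -> list[range]:
    """Divide rows 0..height-1 into contiguous ranges, one per process.

    The first `height % process_count` ranges get one extra row, so the
    chunk sizes differ by at most one."""
    base, extra = divmod(height, process_count)
    ranges = []
    start = 0
    for i in range(process_count):
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for chunk in split_rows(10, 3):
        print(list(chunk))
```

    Each process can then render only its own range of rows and hand back the result.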

    Here is the video:

    Code for part six is tagged on the Puray GitHub project.

    Show Notes

    Books and articles that can help understand this part:

    Note: References may contain affiliate links

    Wrapping up

    This concludes the ray tracing series that I had conceived more than a year back. If you make your own ray tracer, please don’t forget to tag me.


    Ray Tracer in Python (Part 5) - Show Notes of "Some Light Reflections"

    One of the fascinating ways to build a complex system like a computer is to build it from scratch using simple logic gates (a good book which shows how is The Elements of Computing Systems). The same goes for the late John Conway’s Game of Life, which starts with a simple set of rules but shows very life-like behaviour. This is called emergent complexity. I guess the only way to understand something that seems complex is to understand its basic principles.

    Look carefully at the, by now familiar, image that we will be rendering by the end of this part:

    [Figure: Render of the two balls scene]

    Notice within the pinkish-purple ball there is a reflection of the red ball reflecting the pink ball. This kind of detailing that unfolds on closer inspection is what gives realistic renders their beauty. Yet it is governed by the most basic laws of physics, like the laws of reflection.

    Reflection

    The law of reflection tells us that the angle of reflection equals the angle of incidence. Applied to the world of vectors, the same law yields a rather different-looking formula. The reflected ray is then traced in turn, and the process continues.

    If you have been following the series closely so far, then you might point out that we have already dealt with this earlier. Yes, diffuse and specular shading are special cases of a material reflecting light. But since we are now considering mirror-like reflecting surfaces it is time to look at the general formula for reflection.

    The computation for each pixel now increases manyfold, so the overall render time grows in proportion to the maximum depth of reflections we allow.
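
    The vector form of the law of reflection is usually written as r = d - 2(d·n)n, where d is the incoming direction and n is the unit normal at the point of contact. A minimal sketch using plain tuples follows; the function names are illustrative and not taken from the Puray code.

```python
def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def reflect(direction, normal):
    """Reflect `direction` about a surface with unit `normal`.

    Implements r = d - 2 (d . n) n, which follows from the angle of
    reflection being equal to the angle of incidence."""
    k = 2 * dot(direction, normal)
    return tuple(d - k * n for d, n in zip(direction, normal))


if __name__ == "__main__":
    # A ray heading down and to the right hits a floor whose normal points up.
    print(reflect((1, -1, 0), (0, 1, 0)))
```

    The component along the normal flips sign while the component along the surface is unchanged, which is what a mirror does.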

    Procedural Materials

    A plainly colored object is not very interesting, so you will find even the earliest ray-traced images containing a chessboard pattern:

    [Figure: The Compleat Angler (1978) by Turner Whitted]

    So I introduce a chessboard pattern generated procedurally into the scene. Procedural textures are fascinating and a lot of fun to make. Compared to image textures, they have almost infinite detail. Sort of like analog versus digital.

    The chessboard pattern’s formula is easy to guess so it is an ideal introduction. Once you start playing around there is an entire universe of textures to explore with marble, Voronoi and Perlin noise patterns. Some even go to the extent of building entire scenes with only procedural textures. This is deeply satisfying but probably pointless.
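
    If you would rather not guess, here is one common way to write a chessboard pattern. It assumes the pattern lies on the x-z ground plane and picks a color from the parity of the square the point falls in. The function name, parameters and default colors are made up for this sketch and may differ from the material class in the video.

```python
import math


def chessboard_color(x, z,
                     color_a=(1.0, 1.0, 1.0),
                     color_b=(0.0, 0.0, 0.0),
                     square_size=1.0):
    """Return color_a or color_b depending on which square (x, z) falls in.

    math.floor (rather than int) keeps the pattern regular across the
    origin, because int() rounds towards zero for negative coordinates."""
    ix = math.floor(x / square_size)
    iz = math.floor(z / square_size)
    return color_a if (ix + iz) % 2 == 0 else color_b


if __name__ == "__main__":
    for z in range(4):
        print("".join("#" if chessboard_color(x + 0.5, z + 0.5)[0] else "."
                      for x in range(8)))
```

    Because the color is computed from the coordinates, the pattern stays sharp no matter how closely the camera looks at it.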

    Plugin Scenes

    Most toy raytracers are happy generating their scene in the main program. But this quickly becomes frustrating when you want to render a couple of examples. The straightforward solution would be to define a scene as data say using JSON and import the scene given as an argument. This is how games like Doom load levels.

    JSON, YAML and other configuration languages are deceptively simple to read, but you could spend a lot of time writing them due to their tiny quirks with commas and whitespace. They are also not suited to scenes generated procedurally, which happens quite a bit in ray tracers. You would soon wish these languages were Turing complete. So I decided to ditch all that and use plain old Python instead.

    To be honest, I was not comfortable allowing an arbitrary Python file to describe a scene. But the power and flexibility it offers are well worth the security tradeoff. Inspired by Django, I used importlib to import the scene modules.

    I can now define a new procedural texture material class inside a scene! This makes the ray tracer quite extensible like a plugin system. I love this approach and look forward to trying this in future projects.
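
    The loading side of such a plugin system can be quite small. The sketch below uses importlib.import_module, which is a real standard library call, but the attribute names it expects from a scene module (WIDTH, HEIGHT, OBJECTS) are assumptions for illustration and not necessarily what Puray uses.

```python
import importlib


def load_scene(module_name: str) -> dict:
    """Import a scene module by name and collect its settings.

    The module is ordinary Python, so it may build its object list with
    loops, random numbers or its own classes. Importing it runs that
    code, so only load scene files you trust."""
    module = importlib.import_module(module_name)
    return {
        "width": module.WIDTH,
        "height": module.HEIGHT,
        "objects": module.OBJECTS,
    }
```

    A scene file is then just a module on the import path that defines those names, for example one that sets WIDTH and HEIGHT and builds OBJECTS with a list comprehension.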

    Accelerating Python

    Towards the end we see a dramatic 7X speedup of the ray tracer due to the use of PyPy. I also mention my rough rule of thumb to increase Python performance:

    Processor Bound? Try PyPy.
    IO Bound? Try AsyncIO.

    If you are learning to improve the performance of your Python program, this would be pretty bad advice. In that case, make sure you first profile your program and identify the performance hotspots. Then try different ways to optimize those places. After you have tried all that and the performance is still bad, then you can use my rule of thumb for unconventional ways to get great results.

    Ray tracing concepts

    With this part, I have covered all the basic ray tracing concepts that I had planned to cover. The next part will be about improving the performance of the ray tracer by using multiple cores.

    Many have contacted me asking whether I would be covering topics like Dielectrics, Depth of Field, Anti-aliasing etc. I think there are enough books like Ray Tracing Gems and Ray Tracing in One Weekend which cover all that and much more. If you have followed this series then reading those books would be much easier and you would have a ready implementation to tinker with.

    Nevertheless, I may work on a follow-up if people find that useful and time permits. So do let me know.

    These are the topics we will cover in this episode:

    • Introduction
      • Laws of Reflection
      • Stack Overflow
      • Scene Definition
    • Sub-problem: Some Light Reflections
    • Coding the solution
      • Chessboard Material
      • Ground Plane
      • Config vs Code as Config
      • Speedup

    Here is the video:

    Code for part five is tagged on the Puray GitHub project.

    Show Notes

    Books and articles that can help understand this part:

    Further reading on ray tracing:

    Note: References may contain affiliate links


    Ray Tracer in Python - Show Notes of "Ray Tracing a Coronavirus"

    The first ray-traced animation I had ever seen was in my first year of Engineering. My senior showed us a gold-plated logo of the institution slowly spinning and glinting in the light. It had an amazing level of detail - a triangular frame, a gear wheel on the left, a tower on the right and even a palm tree in the center. We were dying to know how he did it. When he showed us, it blew our minds - it was all code! Every shape was made by combining (or subtracting) primitive solids like spheres, cylinders and cubes. It featured on the masthead of the first edition of our online magazine (distributed on 3½-inch floppies).

    Of course, these days 3D modelling is mostly done by artists in a graphics program like 3ds Max or Maya. But not everything is tinkered by the hand of an artist. Procedural code is still used to create thousands of unique soldiers in an army scene or an infinite world in a multiplayer game. Computer generated art can often surprise or amuse you.

    So when I had to design the model of the omnipresent SARS-CoV-2 virus for a poster, even though I was very tempted to use such tools, I opted to generate the model in code. My Python raytracer (currently in its fourth part of the video series) was decent enough to create 3D-looking spheres. Having only spheres is a big constraint but you can be more creative under constraints.

    Deconstruction

    There is a process fundamental to artistically recreating any real world object – Deconstruction. When you learn to draw you are asked to “block out” the simple shapes you are trying to draw. So a face becomes an elongated sphere, a nose becomes a triangle etc. If your big shapes are correctly proportioned then it will be a lot easier to add in the details.

    In the good old days, computer artists used graph paper. My senior, for instance, traced the logo from the college magazine, enlarged it on graph paper and wrote down all the coordinates. These days you could use a 2D drawing program like Inkscape to overlay a photograph and trace over it. But the idea is essentially the same - find all the large blocking shapes that make your image.

    [Figure: Graph paper sketch of Prince of Persia by Jordan Mechner in 1987]

    Reconstruction

    Generating the model in the computer can be a process of trial and error. In the days of slower computers, rendering a simple object took nearly an hour. Even with today’s dramatically faster machines, rendering a complex scene could take hours. There is simply a ton of numbers to crunch.

    In the case of the coronavirus, I needed to figure out how to make a crown. Essentially I needed to position points evenly around a circle. People say a lot of things about high school trigonometry and how much of a waste of time it was. But I found it quite useful, not only to arrange the spikes of the crown but also to give each one a slight random shift. This means that each time you render, the virus looks slightly different. I had a lot of fun animating several renders to a beat in the video.
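
    The trigonometry involved is modest. Here is a minimal 2D sketch of spacing points evenly around a circle and nudging each one by a small random angle. The function name crown_points and its parameters are assumptions for illustration; the code in the video places spheres in 3D and may differ.

```python
import math
import random


def crown_points(count, radius, jitter=0.0, seed=None):
    """Return `count` (x, y) points spread around a circle of `radius`.

    Point i sits at the angle 2*pi*i/count, shifted by a random amount
    in [-jitter, +jitter] radians. Passing a seed makes a render
    repeatable; leaving it out gives a new arrangement every time."""
    rng = random.Random(seed)
    points = []
    for i in range(count):
        angle = 2 * math.pi * i / count
        angle += rng.uniform(-jitter, jitter)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


if __name__ == "__main__":
    for x, y in crown_points(8, 1.0, jitter=0.1):
        print(f"{x:+.3f} {y:+.3f}")
```

    Every point stays exactly on the circle because only the angle is perturbed, not the radius.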

    Presentation

    I couldn’t stop playing with the model even after the poster was made. I tweaked it into various color combinations that I found online. Ironically, it looks distinctively charming in any avatar.

    It shows that simple things can be more awesome than the most professionally done creations, especially if you can make your own. Looking forward to seeing what others would create with this.

    Here is the video:

    Code for part four is the starting point of the video.

    Final code is also tagged in the repository.

    Show Notes

    Links:
