Jan 3, 2010

ESI using varnish and django

In my previous post [2] I wrote that Varnish [1] has much more to offer than upstream caching. This time I have decided to explore its ESI support [3].

ESI: Edge Side Includes (ESI) is an XML-based markup language that provides a means to assemble resources in HTTP clients. Unlike other in-markup languages, ESI is designed to leverage client tools like caches to improve end-user perceived performance, reduce processing overhead on the origin server, and enhance availability. ESI allows for dynamic content assembly at the edge of the network [4].

One of the most common difficulties that leads you not to cache a page for a logged-in user is that you want to display some custom information for that user. For example, you want to be able to display:

  *  " welcome joes, [ link to his profile ] "

However, the rest of the page will be common to all users. The diagram below explains the composition of the page:



The yellow part of the page is common to all users, whereas the green part should be customized for every user.

This is a very common pattern. You can also have a header, a footer and a navigation block that don't change very often, while the rest of the page is more dynamic: recent activity, latest articles, and so on. The idea here is to use Varnish to assemble content coming from different URLs, with a different cache lifetime for each piece.

So in our example we will cache the "dyn_page" (yellow in the diagram) for five minutes and we will never cache the user info. I am going to start by dumping the code for this toy app and then explain it as we go.
Here is the code of views.py:



import time

from django.views.generic.simple import direct_to_template
from django.views.decorators.cache import cache_page, never_cache


@never_cache
def user_info(request):
    # Per-user block (green in the diagram): never cached
    return direct_to_template(request, "esi_app/user_info.html")


@cache_page(60*5)
def view_dyn_page(request):
    # Simulate the fact that this page takes a lot of time to be built
    time.sleep(2)
    return direct_to_template(request, "esi_app/dyn_page.html")


@never_cache
def view_page(request):
    USE_ESI = True
    if not USE_ESI:
        # Simulate the fact that the dynamic part of the page
        # takes a lot of time to be built
        time.sleep(2)
    return direct_to_template(request,
                              template="esi_app/page.html",
                              extra_context={'USE_ESI': USE_ESI})

views.py contains three views: two that render the individual blocks (green and yellow in the diagram above) and one that renders the complete page. Note the "USE_ESI" variable, which we will use in our template. I have added a sleep of 2 seconds to simulate an operation that takes a lot of time, so that the caching strategy makes more sense and my ab tests later on are more meaningful.
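
The two decorators matter to Varnish only through the HTTP headers they produce. As far as I know, cache_page(60*5) results in a "Cache-Control: max-age=300" header, and never_cache in a header that forbids reuse (max-age=0 in this version of Django). The sketch below shows how a shared cache could turn such a header into a lifetime. shared_cache_ttl is a hypothetical helper, not part of Django or Varnish, and the real rules applied by Varnish depend on its version and on your VCL.

```python
def shared_cache_ttl(cache_control):
    """Return how many seconds a shared cache may keep a response (0 = do not cache)."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Directives that forbid storing or reusing the response in a shared cache.
    if any(name in directives for name in ("no-store", "no-cache", "private")):
        return 0

    # s-maxage is aimed at shared caches and takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(int(directives[name]), 0)
            except ValueError:
                return 0

    # Simplification: with no explicit lifetime, this sketch does not cache.
    return 0


print(shared_cache_ttl("max-age=300"))  # header expected for dyn_page
print(shared_cache_ttl("max-age=0"))    # header expected for user_info
```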

Here is the code of urls.py:

from django.conf.urls.defaults import *
urlpatterns = patterns('esi_app.views',
    url(r'^page/', 'view_page', name='view_page'),
    url(r'^dyn_page/', 'view_dyn_page', name='view_dyn_page'),
    url(r'^user_info/', 'user_info', name='user_info'),
)

Here is the code for base.html:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <title>{% block title %} {% endblock %}</title>
    <style type="text/css">
        #user_info {
            background:lightgreen;
        }
        #content {
            background:lightyellow;
        }
    </style>
</head>
<body>
    <div id="user_info">
        {% block user_info %} {% endblock %}
    </div>
    <div id="content">
        {% block content %} {% endblock %}
    </div>
</body>
</html>




Here is the code for page.html:


{% extends "esi_app/base.html" %}
{% block user_info %}
    {% if USE_ESI %}
        <esi:include src="http://192.168.1.18:6081/esi/user_info/"/>
    {% else %}
        {% include "esi_app/user_info.html" %}
    {% endif %}
{% endblock %}
{% block content %}
    {% if USE_ESI %}
        <esi:include src="http://192.168.1.18:6081/esi/dyn_page/"/>
    {% else %}
        {% include "esi_app/dyn_page.html" %}
    {% endif %}
{% endblock %}

This template uses the "USE_ESI" variable to decide whether the page is built using an ESI server or not. This allows graceful degradation and helps you debug your page. In a real-life situation this variable might come from a Django context processor. The idea here is that "/esi/page/" is built from "/esi/dyn_page/" and "/esi/user_info/".
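
As an illustration of that last point, a context processor could enable ESI only when the request really came through Varnish. This is a sketch under assumptions: it relies on Varnish adding an X-Varnish header to the requests it sends to the backend, the name use_esi is made up, and FakeRequest exists only so the example runs without Django.

```python
def use_esi(request):
    """Context processor: expose USE_ESI to every template.

    Assumes Varnish adds an "X-Varnish" header to backend requests, which
    Django exposes as request.META["HTTP_X_VARNISH"].
    """
    return {"USE_ESI": "HTTP_X_VARNISH" in request.META}


class FakeRequest:
    """Stand-in for django.http.HttpRequest, for demonstration only."""

    def __init__(self, meta):
        self.META = meta


# Request that went through Varnish: emit the <esi:include> tags.
print(use_esi(FakeRequest({"HTTP_X_VARNISH": "1234567"})))
# Request sent straight to Cherokee: fall back to plain template includes.
print(use_esi(FakeRequest({})))
```

With something like this registered in the settings, the view would no longer need to pass USE_ESI itself.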

Here is the code for dyn_page.html:

{% load webdesign %}
<h1>{% lorem 2 w random %}</h1>
<p>{% lorem 2 p random %}</p>

Here is the code for user_info.html:

{% if user.is_authenticated %}
    Welcome, {{ user }} -- <a href="/admin/logout">Logout</a>
{% else %}
    <a href="/admin/login">Login</a>
{% endif %}

Then you need to configure Varnish so that it does the ESI processing on the page whose URL is "/esi/page/".
Here is the Varnish configuration, in /etc/varnish/default.vcl:


backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
sub vcl_recv {
    # Ask the backend for uncompressed content
    unset req.http.Accept-Encoding;
    #unset req.http.Vary;
}
sub vcl_fetch {
    # Only the assembled page goes through ESI processing
    if (req.url == "/esi/page/") {
        esi;
    }
}

The code above is rather self-explanatory: it tells Varnish to do ESI substitution on the page located at "/esi/page/". As in my previous post, Cherokee listens on port 8080 and Varnish on port 6081. The trickiest part is vcl_recv: by removing the Accept-Encoding header from the request, Varnish prevents the backend from gzipping the content, so that the ESI tags stay readable in the response.


All the machinery is now in place, so you can use your favorite browser to see the result:
  * go to http://192.168.1.18:6081/esi/page/ to view the page assembled by Varnish
  * go to http://192.168.1.18:8080/esi/page/ to view the page returned by Cherokee

Curl is another tool that is useful when playing with cached pages:
  * curl [url] will display the HTML source code in the console
  * curl -I [url] will display the response headers in the console
  * curl [url] -H "Accept-Encoding: gzip,deflate" will request a compressed response
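
If you prefer Python to curl, the same check can be scripted. The sketch below lists the ESI includes that are still present in a page: the page returned by Cherokee should contain two, the one returned by Varnish none. The names ESIIncludeFinder and esi_sources are my own, and fetching the page itself (for example with urllib) is left out so the example does not depend on my network.

```python
from html.parser import HTMLParser


class ESIIncludeFinder(HTMLParser):
    """Collect the src attribute of every <esi:include> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags also end up here.
        if tag == "esi:include":
            self.sources.append(dict(attrs).get("src"))


def esi_sources(html):
    """Return the list of ESI include URLs that were not substituted."""
    finder = ESIIncludeFinder()
    finder.feed(html)
    finder.close()
    return finder.sources


raw = '<div id="user_info"><esi:include src="http://192.168.1.18:6081/esi/user_info/"/></div>'
assembled = '<div id="user_info"><a href="/admin/login">Login</a></div>'
print(esi_sources(raw))
print(esi_sources(assembled))
```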

Here is the page rendered by Cherokee:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <title> </title>
    <style type="text/css">
        #user_info {
            background:lightgreen;
        }
        #content {
            background:lightyellow;
        }
    </style>
</head>
<body>
    <div id="user_info">
        <esi:include src="http://192.168.1.18:6081/esi/user_info/"/>
    </div>
    <div id="content">
        <esi:include src="http://192.168.1.18:6081/esi/dyn_page/"/>
    </div>
</body>
</html>

Here is the page after the substitution done by Varnish:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <title> </title>
    <style type="text/css">
        #user_info {
            background:lightgreen;
        }
        #content {
            background:lightyellow;
        }
    </style>
</head>
<body>
    <div id="user_info">
    <a href="/admin/login">Login</a>
    </div>
    <div id="content">
<h1>ipsa sed</h1>

<p><p>Nemo perferendis delectus pariatur aliquid repellendus repellat explicabo facilis, molestiae veritatis odit, accusantium repellat culpa ab laboriosam, iste laborum amet et harum, iusto illum ipsa a quos necessitatibus voluptatem consectetur cumque? Doloremque atque delectus ipsa ad veniam incidunt cum exercitationem voluptates labore, sapiente ducimus deserunt expedita aperiam temporibus omnis magnam qui architecto, pariatur voluptates nesciunt nam ab dolore omnis, quo voluptatem nihil accusamus aperiam excepturi exercitationem? Consectetur mollitia neque quod quas.</p>



<p>Tempore amet voluptate ipsum suscipit placeat exercitationem labore nam voluptas, debitis esse dignissimos, fugiat illum asperiores suscipit deleniti maiores consequuntur, doloribus architecto repellendus dicta nemo corporis explicabo? A fuga ex, voluptates quam dignissimos aspernatur, reprehenderit accusantium id magni ut debitis adipisci esse voluptas tempora, quas doloribus blanditiis voluptatum veniam nam magni et adipisci fuga pariatur provident? Aut ipsum quam quia earum quod cum sapiente officia inventore delectus, expedita ex quia ipsam consectetur exercitationem ut ad sunt illum minus voluptatum, accusantium maxime facere eos numquam explicabo, rerum ab dolorem repellendus, praesentium debitis tempora aut facere sapiente odit veniam quae?</p></p>

    </div>
</body>
</html>

A quick and dirty ab test will show you the value of this technique: Varnish is very fast at assembling content coming from different sources. Several months ago Adrian Holovaty wrote an article [5] about an alternative approach to this class of problem.

I would be glad to hear from you about other Varnish tricks that can be used on top of a Django web application.


[1] http://varnish.projects.linpro.no/ 
[2] http://yml-blog.blogspot.com/2010/01/response-time-optimisation-with-varnish.html
[3] http://varnish.projects.linpro.no/wiki/ESIfeatures 
[4] http://www.w3.org/TR/esi-lang
[5] http://www.holovaty.com/writing/django-two-phased-rendering/

Jan 2, 2010

Response time optimisation with Varnish


This blog post shows you how to optimize the tool chain on your server to improve its performance by an order of magnitude without changing a single line in your Django project. To do so I will again use django-cms [1] as a guinea pig, because it does a fair amount of processing to display a page but is still easy to install. Note: the django-cms example has the cache middleware activated by default.


Then I will run ab tests on a particular page and compare the results. These tests are performed on my laptop, an HP dv6-1030. The important information is not the figures themselves but rather the variation in response time.



Before starting my tests I moved django-cms so that it is mounted under "/". To do this you need to change the configuration in the file called example_uwsgi.py.


import os
import django.core.handlers.wsgi

# Set the django settings and define the wsgi app
os.environ['DJANGO_SETTINGS_MODULE'] = 'example.settings'
application = django.core.handlers.wsgi.WSGIHandler()

# Mount the application to the url
applications = {'/':application, }



Then you need to change the rule behavior in the Cherokee admin to reflect this change. Cherokee admin makes this task a breeze.









Before diving head first into the meat of this article, here is a diagram of the architecture that we are going to work with:





The goal of this article is to show you the incredible boost that Varnish can give to certain types of web applications.



Varnish [2] is a state-of-the-art, high-performance HTTP accelerator. It uses the advanced features in Linux 2.6, FreeBSD 6/7 and Solaris 10 to achieve its high performance.
Some of the features include:
  • VCL - a very flexible configuration language
  • Load balancing with health checking of backends
  • Partial support for ESI
  • URL rewriting
  • Graceful handling of "dead" backends
  • ...

The bottom line is that just installing it and using it with a vanilla configuration on Ubuntu will increase the responsiveness of your site by a margin that is hard to believe: we are talking here about an improvement factor ranging from 50 to 600 times.



The first thing to do is to install Varnish [2]. On Ubuntu, Varnish is very easy to install and configure since a package exists. Once it is installed you need to define the backend; in Varnish jargon this means telling Varnish where Cherokee is located.


backend default {
    .host = "127.0.0.1";
    .port = "8080";
}



Here are some ab tests that I ran to illustrate this article; 8080 and 6081 are the ports for Cherokee and Varnish respectively.

Cherokee

ab -n 100 -c 1 http://192.168.1.18:8080/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.18 (be patient).....done


Server Software: Cherokee/0.99.37
Server Hostname: 192.168.1.18
Server Port: 8080

Document Path: /
Document Length: 3440 bytes

Concurrency Level: 1
Time taken for tests: 15.285 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 364300 bytes
HTML transferred: 344000 bytes
Requests per second: 6.54 [#/sec] (mean)
Time per request: 152.851 [ms] (mean)
Time per request: 152.851 [ms] (mean, across all concurrent requests)
Transfer rate: 23.28 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     0    2.4      0    24
Processing:   133   153   17.3    149   235
Waiting:      133   153   17.3    149   235
Total:        134   153   17.2    149   235

Percentage of the requests served within a certain time (ms)
50% 149
66% 158
75% 164
80% 166
90% 172
95% 175
98% 230
99% 235
100% 235 (longest request)


ab -n 100 -c 50 http://192.168.1.18:8080/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.18 (be patient).....done


Server Software: Cherokee/0.99.37
Server Hostname: 192.168.1.18
Server Port: 8080

Document Path: /
Document Length: 3440 bytes

Concurrency Level: 50
Time taken for tests: 8.202 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 364300 bytes
HTML transferred: 344000 bytes
Requests per second: 12.19 [#/sec] (mean)
Time per request: 4101.021 [ms] (mean)
Time per request: 82.020 [ms] (mean, across all concurrent requests)
Transfer rate: 43.37 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     1    1.1      1     3
Processing:   740  3283 1158.0   3906  4438
Waiting:      740  3283 1158.0   3906  4438
Total:        743  3284 1157.1   3906  4438

Percentage of the requests served within a certain time (ms)
50% 3906
66% 4048
75% 4112
80% 4182
90% 4285
95% 4341
98% 4359
99% 4438
100% 4438 (longest request)

ab -n 100 -c 100 http://192.168.1.18:8080/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.18 (be patient).....done


Server Software: Cherokee/0.99.37
Server Hostname: 192.168.1.18
Server Port: 8080

Document Path: /
Document Length: 3440 bytes

Concurrency Level: 100
Time taken for tests: 8.236 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 364300 bytes
HTML transferred: 344000 bytes
Requests per second: 12.14 [#/sec] (mean)
Time per request: 8235.626 [ms] (mean)
Time per request: 82.356 [ms] (mean, across all concurrent requests)
Transfer rate: 43.20 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        3     4    0.8      4     5
Processing:   699  4533 2337.0   4650  8228
Waiting:      699  4533 2337.0   4650  8228
Total:        704  4537 2336.2   4654  8230

Percentage of the requests served within a certain time (ms)
50% 4654
66% 5884
75% 6717
80% 7017
90% 7749
95% 8115
98% 8221
99% 8230
100% 8230 (longest request)

Varnish


ab -n 100 -c 1 http://192.168.1.18:6081/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.18 (be patient).....done


Server Software: Cherokee/0.99.37
Server Hostname: 192.168.1.18
Server Port: 6081

Document Path: /
Document Length: 3440 bytes

Concurrency Level: 1
Time taken for tests: 0.030 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 374800 bytes
HTML transferred: 344000 bytes
Requests per second: 3320.49 [#/sec] (mean)
Time per request: 0.301 [ms] (mean)
Time per request: 0.301 [ms] (mean, across all concurrent requests)
Transfer rate: 12153.53 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     0    0.0      0     0
Processing:     0     0    0.9      0     8
Waiting:        0     0    0.9      0     8
Total:          0     0    0.9      0     8

Percentage of the requests served within a certain time (ms)
50% 0
66% 0
75% 0
80% 0
90% 0
95% 0
98% 4
99% 8
100% 8 (longest request)

ab -n 100 -c 50 http://192.168.1.18:6081/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.18 (be patient).....done


Server Software: Cherokee/0.99.37
Server Hostname: 192.168.1.18
Server Port: 6081

Document Path: /
Document Length: 3440 bytes

Concurrency Level: 50
Time taken for tests: 0.012 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 378548 bytes
HTML transferred: 347440 bytes
Requests per second: 8522.97 [#/sec] (mean)
Time per request: 5.866 [ms] (mean)
Time per request: 0.117 [ms] (mean, across all concurrent requests)
Transfer rate: 31507.35 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     1    1.2      0     3
Processing:     0     3    2.0      3     8
Waiting:        0     3    2.0      2     8
Total:          0     4    2.8      4    11

Percentage of the requests served within a certain time (ms)
50% 4
66% 5
75% 6
80% 7
90% 9
95% 10
98% 11
99% 11
100% 11 (longest request)

ab -n 100 -c 100 http://192.168.1.18:6081/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.18 (be patient).....done


Server Software: Cherokee/0.99.37
Server Hostname: 192.168.1.18
Server Port: 6081

Document Path: /
Document Length: 3440 bytes

Concurrency Level: 100
Time taken for tests: 0.013 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 374800 bytes
HTML transferred: 344000 bytes
Requests per second: 7662.25 [#/sec] (mean)
Time per request: 13.051 [ms] (mean)
Time per request: 0.131 [ms] (mean, across all concurrent requests)
Transfer rate: 28045.03 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2     3    0.6      3     4
Processing:     7     7    0.7      7     9
Waiting:        5     6    0.6      6     7
Total:          9    10    1.2     10    13

Percentage of the requests served within a certain time (ms)
50% 10
66% 10
75% 10
80% 11
90% 12
95% 13
98% 13
99% 13
100% 13 (longest request)
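
To compare the six runs at a glance, the "Requests per second" figures above can be turned into speedup factors. This is just arithmetic on the numbers printed by ab; the variable names are mine.

```python
# Requests per second (mean) copied from the ab runs above, keyed by concurrency.
cherokee = {1: 6.54, 50: 12.19, 100: 12.14}
varnish = {1: 3320.49, 50: 8522.97, 100: 7662.25}

# How many times more requests per second Varnish serves at each concurrency level.
speedups = {c: varnish[c] / cherokee[c] for c in cherokee}

for concurrency, factor in speedups.items():
    print("-c %3d: Varnish serves about %.0f times more requests per second"
          % (concurrency, factor))
```

For these three runs the factor comes out between roughly 500 and 700.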

Conclusion

Installing Varnish in front of your web server is probably the first step you should take in the endless journey of optimising your web application. It is interesting to note that, in addition to dramatically improving the response time, Varnish will also reduce the load on your application server stack [uWSGI + Django + db].



This blog post barely scratches the surface of how Django can take advantage of caches. Django gives you the possibility to cache information at different stages of the request/response cycle: you can cache the output of specific views, you can cache only the pieces that are difficult to produce, you can cache a portion of a template, or you can cache your entire site. Django also works well with "upstream" caches, such as Varnish and browser-based caches. These are the types of caches that you don't directly control but to which you can provide hints (via HTTP headers) about which parts of your site should be cached, and how. If you want more information about this you can read Django's cache documentation [3].

Varnish is also a beast in itself: you can fine-tune it to suit your particular situation, and you can use it to do much more in your infrastructure than just upstream caching of your dynamic web site.





[1] http://yml-blog.blogspot.com/2009/12/flup-vs-uwsgi-with-cherokee.html
[2] http://varnish.projects.linpro.no/
[3] http://docs.djangoproject.com/en/1.1/topics/cache/#topics-cache