Tuesday, August 25, 2009

Java| Thread Dump

My Load Test » Java Thread Dump

Java Thread Dump

A Java thread dump is a way of finding out what every thread in the JVM is doing at a particular point in time. This is especially useful if your Java application sometimes seems to hang when running under load, as an analysis of the dump will show where the threads are stuck.

You can generate a thread dump under Unix/Linux by running kill -QUIT <pid>, and under Windows by pressing Ctrl + Break in the console window where the JVM is running.

A great example of where this would be useful is the well-known Dining Philosophers deadlocking problem. Taking example code from Concurrency: State Models & Java Programs, we can cause a deadlock situation and then create a thread dump.

Thursday, August 13, 2009

Django|SerializedDataField

Custom Fields in Django | David Cramer's Blog

I was helping someone today in the Django IRC channel and the question came across about storing a denormalized data set in a single field. Typically I do such things by either serializing the data, or by separating the values with a token (comma for example).

Django has a built-in field type for CommaSeparatedIntegerField, but most of the time I'm storing strings, as I already have the integers available elsewhere. I began to answer the person's question by giving him an example of serialization plus custom properties, until I realized that it would be much easier to just write this as a Field subclass.

So I quickly did, and replaced a few lines of repetitive code with two new field classes in our source:

Update: There were some issues with my understanding of how the metaclass was working. I've corrected the code and it should function properly now.

SerializedDataField

This field is typically used to store raw data, such as a dictionary, or a list of items, or could even be used for more complex objects.

    from django.db import models

    try:
        import cPickle as pickle
    except ImportError:
        import pickle

    import base64


    class SerializedDataField(models.TextField):
        """Because Django for some reason feels it's needed to repeatedly call
        to_python even after it's been converted, this does not support strings."""
        __metaclass__ = models.SubfieldBase

        def to_python(self, value):
            # Anything that is not a string has already been deserialized.
            if value is None:
                return
            if not isinstance(value, basestring):
                return value
            value = pickle.loads(base64.b64decode(value))
            return value

        def get_db_prep_save(self, value):
            # Pickle, then base64-encode so the result is safe in a text column.
            if value is None:
                return
            return base64.b64encode(pickle.dumps(value))
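The storage format itself is easy to try outside Django. What follows is only a minimal Python 3 sketch of the same pickle-then-base64 round trip; the helper names serialize and deserialize are made up for illustration and are not part of the field above.

```python
import base64
import pickle


def serialize(value):
    """Pickle a Python object and base64-encode it into ASCII text."""
    return base64.b64encode(pickle.dumps(value)).decode("ascii")


def deserialize(text):
    """Reverse serialize(): base64-decode the text, then unpickle it."""
    return pickle.loads(base64.b64decode(text))


# Hypothetical sample data, just to exercise the round trip.
original = {"tags": ["django", "fields"], "count": 3}
stored = serialize(original)      # a plain str, fine for a text column
restored = deserialize(stored)
```

The base64 step is what makes the pickled bytes safe to keep in a TextField, at the cost of roughly a third more storage.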

SeparatedValuesField

An alternative to the CommaSeparatedIntegerField, it allows you to store any separated values. You can also optionally specify a token parameter.

    from django.db import models


    class SeparatedValuesField(models.TextField):
        __metaclass__ = models.SubfieldBase

        def __init__(self, *args, **kwargs):
            self.token = kwargs.pop('token', ',')
            super(SeparatedValuesField, self).__init__(*args, **kwargs)

        def to_python(self, value):
            if not value:
                return
            if isinstance(value, list):
                return value
            return value.split(self.token)

        def get_db_prep_value(self, value):
            if not value:
                return
            assert isinstance(value, (list, tuple))
            return self.token.join([unicode(s) for s in value])

        def value_to_string(self, obj):
            value = self._get_val_from_obj(obj)
            return self.get_db_prep_value(value)
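To see what the field does to a value on the way in and out, here is a rough stand-alone Python 3 sketch of the join/split logic. The function names join_values and split_values are hypothetical and exist only for this illustration.

```python
def join_values(values, token=","):
    """Turn a list or tuple into one delimited string for storage."""
    return token.join(str(v) for v in values)


def split_values(text, token=","):
    """Turn the stored string back into a list of strings."""
    if not text:
        return None
    return text.split(token)


# A custom token, as with the field's optional token parameter.
stored = join_values(["red", "green", "blue"], token="|")
restored = split_values(stored, token="|")
```

Two things follow from this design: everything comes back as a string (integers are not converted back), and a value that contains the token itself will be split in two, so pick a token that cannot appear in the data.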

Friday, August 7, 2009

Django|Tips to keep your Django/mod_python memory usage down

Tips to keep your Django/mod_python memory usage down - WebFaction

Tips to keep your Django/mod_python memory usage down

Updated Jan 28 at 04:44 CDT (first posted May 30 at 09:57 CDT) by Remi in Django, Memory, Tips

Most people manage to run their Django site on mod_python within the memory limits of their "Shared 1" or "Shared 2" plans but a few people are struggling to stay within the limits.

So here are a few tips that you can use to try and keep your memory usage down when using Django on mod_python:

  • Make sure that you set DEBUG to False in settings.py: if it isn't, set it to False and restart apache. Amongst other things, DEBUG mode stores all SQL queries in memory so your memory usage will quickly increase if you don't turn it off.
  • Use "ServerLimit" in your apache config: by default apache will spawn lots of processes, which will use lots of memory. You can use the "ServerLimit" directive to limit these processes. A value of 3 or 4 is usually enough for most sites if your static data is not served by your Django instance (see below).
  • Check that no big objects are being loaded in memory: for instance, check that your code isn't loading hundreds or thousands of database records in memory all at once. Also, if your application lets people download or upload big files, check that these big files are not being loaded in memory all at once.
  • Serve your static data from our main server: this is general advice for all Django sites: make sure that your static data (images, stylesheets, ...) is served directly by our main apache server. This will save your Django app from having to serve all these little extra requests. Details on how to do that can be found here and here.
  • Use "MaxRequestsPerChild" in your apache config: sometimes there are slow memory leaks that you can't do anything about (they can be in the tools that you use, for instance). If this is the case then you can use the "MaxRequestsPerChild" directive to tell apache to only serve a certain number of requests before killing the process and starting a fresh one. Reasonable values are usually between 100 and 1000. Another more extreme/uglier version of this technique is to set up a cron job to run "stop/start" once in a while.
  • Find out and understand how much memory you're using: to find out what your processes are and how much memory they're using, you can run the "ps -u <username> -o pid,rss,command" command, like this:
    [testweb14@web14 bin]$ ps -u testweb14 -o pid,rss,command
      PID   RSS COMMAND
    23111  1404 -bash
    27988  3848 /home/testweb14/webapps/django/apache2/bin/httpd -f /home/testweb14/webapps/django/apache2/conf/httpd.c
    27989 10312 /home/testweb14/webapps/django/apache2/bin/httpd -f /home/testweb14/webapps/django/apache2/conf/httpd.
    27990  9804 /home/testweb14/webapps/django/apache2/bin/httpd -f /home/testweb14/webapps/django/apache2/conf/httpd.c
    28078   760 ps -u testweb14 -o pid,rss,command
    [testweb14@web14 bin]$
    As you can see we have three "httpd" processes running that use respectively 3848KB, 10312KB and 9804KB of memory (there are various ways to interpret the memory used by a process on Linux and we have chosen to use the "Resident Set Size" (RSS) of your processes).

    The first one is the apache "supervisor" and the other two are the "workers" (in this example, "ServerLimit" is set to 2). The memory used by the supervisor usually doesn't change too much, but the memory used by the workers can increase greatly if you have bad memory leaks in your application.

    So the total memory used by our Apache/django instance in this example is 3848KB + 10312KB + 9804KB = 23964KB, or about 23MB.
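If you check this often, the arithmetic above is easy to script. Here is a minimal Python sketch that sums the RSS column for the httpd processes. The function name total_rss_kb is invented for this example, and the sample text reuses the numbers from the listing above with the command lines shortened to the executable path.

```python
# Sample `ps -u <username> -o pid,rss,command` output (commands abbreviated).
SAMPLE_PS_OUTPUT = """\
  PID   RSS COMMAND
23111  1404 -bash
27988  3848 /home/testweb14/webapps/django/apache2/bin/httpd
27989 10312 /home/testweb14/webapps/django/apache2/bin/httpd
27990  9804 /home/testweb14/webapps/django/apache2/bin/httpd
28078   760 ps -u testweb14 -o pid,rss,command
"""


def total_rss_kb(ps_output, name):
    """Sum the RSS (in KB) of every process whose executable contains name."""
    total = 0
    for line in ps_output.splitlines()[1:]:   # skip the header row
        fields = line.split(None, 2)          # pid, rss, full command line
        if len(fields) == 3 and name in fields[2].split()[0]:
            total += int(fields[1])
    return total


httpd_kb = total_rss_kb(SAMPLE_PS_OUTPUT, "httpd")
httpd_mb = round(httpd_kb / 1024.0, 1)
```

On a real account you would feed it the live output instead of the sample, for example via subprocess.check_output(["ps", "-u", username, "-o", "pid,rss,command"]).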

Wednesday, August 5, 2009

Django|Speed up with NginX, Memcached, and django-compress

How to Speed up Your Django Sites with NginX, Memcached, and django-compress | Code Spatter

How to Speed up Your Django Sites with NginX, Memcached, and django-compress

Posted on April 23rd, 2009 by Greg Allard in Django, Programming, Server Administration

A lot of these steps will speed up any kind of application, not just django projects, but there are a few django specific things. Everything has been tested on IvyLees which is running in a Debian/Ubuntu environment.

These three simple steps will speed up your server and allow it to handle more traffic.

Reducing the Number of HTTP Requests

Yahoo has developed a firefox extension called YSlow. It analyzes all of the traffic from a website and gives a score on a few categories where improvements can be made.

It recommends combining all of your css files into one file and all of your js files into one file, or into as few files as possible. There is a pluggable, open source django application available to help with that task. After setting up django-compress, a website will have css and js files that are minified (excess white space and characters are removed to reduce file size). The application will also give the files version numbers so that they can be cached by the web browser and won't need to be downloaded again until a change is made and a new version of the file is created. How to set up the server to send a far-future expiration is shown below in the lightweight server section.

Setting up Memcached

Django makes it really simple to set up caching backends and memcached is easy to install.

sudo aptitude install memcached python-setuptools

We will need setuptools so that we can do the following command.

sudo easy_install python-memcached

Once that is done you can start the memcached server by doing the following:

sudo memcached -d -u www-data -p 11211 -m 64

-d will start it in daemon mode, -u is the user for it to run as, -p is the port, and -m is the maximum number of megabytes of memory to use.

Now open up the settings.py file for your project and add the following line:

CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

Find the MIDDLEWARE_CLASSES section and add this to the beginning of the list:

    'django.middleware.cache.UpdateCacheMiddleware',

and this to the end of the list:

    'django.middleware.cache.FetchFromCacheMiddleware',
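Put together, the relevant part of settings.py might look something like the sketch below. The three middleware classes in the middle are only an assumption standing in for whatever your project already lists; the point is the position of the two cache entries.

```python
# settings.py (fragment): a sketch, not a complete settings file.
CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

MIDDLEWARE_CLASSES = (
    # First in the list, so it is the last to see the outgoing response.
    'django.middleware.cache.UpdateCacheMiddleware',

    # Assumed existing entries; keep whatever your project already has here.
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',

    # Last in the list, so it is the last to see the incoming request.
    'django.middleware.cache.FetchFromCacheMiddleware',
)
```

The ordering matters because Django runs middleware top to bottom on the request and bottom to top on the response.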

For more about caching with django see the django docs on caching. You can reload the server now to try it out.

sudo /etc/init.d/apache2 reload

To make sure that memcached is set up correctly you can telnet into it and get some statistics.

telnet localhost 11211

Once you are in, type stats and it will show some information (press ctrl ] and then ctrl d to exit). If nearly all of the counters are zero, either it isn't working or you haven't visited your site since the caching was set up. See the memcached site for more information.
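The same check can be done from a script instead of telnet. This is a rough Python 3 sketch that speaks memcached's plain-text protocol directly; parse_stats and fetch_stats are names invented here, and the sample reply at the bottom uses made-up numbers so the parser can be tried without a running server.

```python
import socket


def parse_stats(raw):
    """Parse the text reply to memcached's 'stats' command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)            # "STAT", name, value
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host="127.0.0.1", port=11211, timeout=2.0):
    """Send 'stats' to a running memcached and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):  # the reply ends with END
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_stats(data.decode("ascii"))


# Made-up reply, only to show the shape of the data.
sample_reply = "STAT get_hits 42\r\nSTAT get_misses 7\r\nEND\r\n"
sample = parse_stats(sample_reply)
```

With memcached running as set up above, fetch_stats() should return the live counters. A get_hits value that grows as you browse the site is a good sign that the cache is being used.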

Don't Use Apache for Static Files

Apache has some overhead involved that makes it good for serving php, python, or ruby applications, but you do not need that for static files like your images, style sheets, and javascript. There are a few options for lightweight servers that you can put in front of apache to handle the static files. Lighttpd (lighty) and nginx (engine x) are two good options. Adding this layer in front of your application will also act as an application firewall, so there is a security bonus on top of the speed bonus.

There is this guide to install a django setup with nginx and apache from scratch. If you followed my guide to set up your server or already have apache set up for your application, then there are a few steps to get nginx handling your static files.

sudo aptitude install nginx

Edit the config file for your site (sudo nano /etc/apache2/sites-available/default) and change the port from 80 to 8080 and change the ip address (might be *) to 127.0.0.1. The lines will look like the following

NameVirtualHost 127.0.0.1:8080
<VirtualHost 127.0.0.1:8080>

Also edit the ports.conf file (sudo nano /etc/apache2/ports.conf) so that it will listen on 8080.

Listen 8080

Don't restart the server yet; you want to configure nginx first. Edit the default nginx config file (sudo nano /etc/nginx/sites-available/default) and find where it says

        location / {
                root   /var/www/nginx-default;
                index  index.html index.htm;
        }

and replace it with

location / {
    # Forward everything else to Apache, which now listens on 127.0.0.1:8080.
    proxy_pass http://127.0.0.1:8080;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    client_max_body_size 10m;
    client_body_buffer_size 128k;
    proxy_connect_timeout 90;
    proxy_send_timeout 90;
    proxy_read_timeout 90;
    proxy_buffer_size 4k;
    proxy_buffers 4 32k;
    proxy_busy_buffers_size 64k;
    proxy_temp_file_write_size 64k;
}

location /files/ {
    root /var/www/myproject/;
    expires max;
}

/files/ is where I've stored all of my static files and /var/www/myproject/ is where my project lives and it contains the files directory.

Set static files to expire far in the future

expires max; will tell your users' browsers to cache the files from that directory for a long time. Only use that if you are sure those files won't change. You can use expires 24h; if you aren't sure.

Configure gzip

Edit the nginx configuration to use gzip on all of your static files (sudo nano /etc/nginx/nginx.conf). Where it says gzip on; make sure it looks like the following:

    gzip  on;
    gzip_comp_level 2;
    gzip_proxied any;
    gzip_types      text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
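For a rough feel of what a low compression level does to text files, here is a small Python 3 sketch using the standard gzip module. The stylesheet content is made up and very repetitive, so real files will compress less dramatically; this only illustrates the idea and says nothing about nginx's exact output.

```python
import gzip

# Made-up, highly repetitive stylesheet content for demonstration only.
css = b"body { margin: 0; padding: 0; }\n" * 200

# Level 2 mirrors the gzip_comp_level value used in the config above.
compressed = gzip.compress(css, compresslevel=2)
restored = gzip.decompress(compressed)

saved_bytes = len(css) - len(compressed)
```

Images such as png, gif and jpeg are already compressed, which is why they are left out of gzip_types.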

The servers should be ready to be restarted.

sudo /etc/init.d/apache2 reload
sudo /etc/init.d/nginx reload

If you are having any problems I suggest reading through this guide and seeing if you have something set up differently.

Speedy Django Sites

Those three steps should speed up your server and allow for more simultaneous visitors. There is a lot more that can be done, but getting these three easy things out of the way first is a good start.

Django|compression and other best practices

Speed up Django with far-future expires, compression and other best practices — Greg Brown

Speed up Django with far-future expires, compression and other best practices

As a web developer with a shoddy rural internet connection, I'm always interested in speeding up my sites. One technique for doing this is far-future expires — i.e. telling the browser to cache media requests forever, then changing the uri when the media changes. In this article, I outline how to implement this and several other techniques in django.
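The "change the uri when the media changes" half of that idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how django-compress computes its version numbers; the function name versioned_name and the choice of a content hash are assumptions, and the ? placeholder simply mirrors the output_filename pattern used later in this post.

```python
import hashlib


def versioned_name(pattern, content, length=8):
    """Fill the '?' in pattern with a short hash of the file content."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    return pattern.replace("?", digest, 1)


# Two made-up revisions of the same stylesheet.
old_css = b"body { color: black; }"
new_css = b"body { color: navy; }"

old_name = versioned_name("compress/c-?.css", old_css)
new_name = versioned_name("compress/c-?.css", new_css)
```

Because the name changes whenever the content does, the browser can be told to cache each file for years: an edited stylesheet shows up under a new uri that the browser has never seen.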

Goals

  1. Reduce http requests for css and js files to a bare minimum.
  2. Add far-future-expires headers to all static content
  3. Gzip all css and js content
  4. Reduce css/js filesize by minification

The django-compress App

First up, I installed the django-compress app. I was about to build my own solution when I realised this one did exactly what I needed; gotta love the django community. Configuring is straightforward: the project wiki has articles on installation, configuration, and usage.

I copied the compress/ directory into my django/library/ folder (which is on the python path) and added "compress" to my INSTALLED_APPS.

Apache Configuration

Once I had django-compress up and running, I had achieved goal #1. To achieve #2 and #3 I needed to configure apache to send the right headers along with each request. To do this, I put the following directive in my httpd.conf file:

<DirectoryMatch /path-to-django-projects/([^/]+)/media>

    Order allow,deny
    Allow from all

    # Insert mod_deflate filter
    SetOutputFilter DEFLATE
    # Netscape 4.x has some problems...
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    # Netscape 4.06-4.08 have some more problems
    BrowserMatch ^Mozilla/4\.0[678] no-gzip
    # MSIE masquerades as Netscape, but it is fine
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
    # Don't compress images
    SetEnvIfNoCase Request_URI \
    \.(?:gif|jpe?g|png)$ no-gzip dont-vary
    # Make sure proxies don't deliver the wrong content
    Header append Vary User-Agent env=!dont-vary

    # MOD EXPIRES SETUP
    ExpiresActive on
    ExpiresByType text/javascript "access plus 10 years"
    ExpiresByType application/x-javascript "access plus 10 years"
    ExpiresByType text/css "access plus 10 years"
    ExpiresByType image/png  "access plus 10 years"
    ExpiresByType image/x-png  "access plus 10 years"
    ExpiresByType image/gif  "access plus 10 years"
    ExpiresByType image/jpeg  "access plus 10 years"
    ExpiresByType image/pjpeg  "access plus 10 years"
    ExpiresByType application/x-flash-swf  "access plus 10 years"
    ExpiresByType application/x-shockwave-flash  "access plus 10 years"

    # No etags as we're using far-future expires
    FileETag none

</DirectoryMatch>

Notes

  • <DirectoryMatch /path-to-django-projects/([^/]+)/media> is equivalent to writing <Directory /path-to-django-projects/site-name/media> for each site.
  • mod_deflate configuration directives from the Apache site.
  • Note that I'm sending far-future-expires headers for images and flash too — at this stage, that means I have to manually change the filenames whenever I change the content.

This means that for all my django sites' media directories:

  • static content (except images) is gzipped via mod_deflate
  • everything gets a header telling the browser to cache it for 10 years

Note you will need mod_deflate and mod_expires enabled in your apache config. If you have apache 2.2 it should just be a matter of copying (or symlinking) the relevant files from apache2/mods-available/ to apache2/mods-enabled/.

Minification

Step #4 was the trickiest of the lot, and many would argue that it's not really worth the trouble. Depending on how verbosely you comment your js and css, it may or may not be worthwhile for you — personally, I just thought I may as well go the whole hog. In the end, I probably only saved a few percent worth of bandwidth for my small content sites, but it'll be more significant with js-heavy web-apps.

For js minification, django-compress comes with jsmin built in. I've found this to be ideal for the job, and it is enabled by default.

For CSS, django-compress comes with CSSTidy, a CSS parser and optimiser, built in, in the form of csstidy_python. (You can also use a csstidy binary if you have one installed.) Personally, I find CSSTidy messes with my css, and more significantly, messes with that of my css framework of choice, 960.gs. I was after something that simply stripped whitespace, newlines and comments, without parsing the code. After scouring the web, I came across Slimmer, a lightweight Python app that did exactly what I needed. After installing it, I added the following file to the django-compress app's filters directory.

# compress/filters/slimmer_css/__init__.py

import slimmer
from compress.filter_base import FilterBase


class SlimmerCSSFilter(FilterBase):
    def filter_css(self, css):
        return slimmer.css_slimmer(css)

Then it was simply a matter of adding the following line to my settings.py file, as per the django-compress documentation:

COMPRESS_CSS_FILTERS = ('compress.filters.slimmer_css.SlimmerCSSFilter',) 

So my complete django-compress configuration in settings.py was as follows:

# compress app settings
COMPRESS_CSS = {
    'all': {
        'source_filenames': (
            'css/lib/reset.css',
            'css/lib/text.css',
            'css/lib/960.css',
            'css/style.css',
        ),
        'output_filename': 'compress/c-?.css',
        'extra_context': {
            'media': 'screen,projection',
        },
    },

    # other CSS groups go here
}

COMPRESS_JS = {
    'all': {
        'source_filenames': ('js/lib/jquery.js', 'js/behaviour.js',),
        'output_filename': 'compress/j-?.js',
    },
}

COMPRESS = True
COMPRESS_VERSION = True
COMPRESS_CSS_FILTERS = ('compress.filters.slimmer_css.SlimmerCSSFilter',)
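To make "stripping whitespace, newlines and comments without parsing the code" a bit more concrete, here is a deliberately naive Python sketch of that kind of filter. It is not Slimmer's actual algorithm, the function name naive_css_minify is invented, and it ignores edge cases such as punctuation inside quoted strings.

```python
import re


def naive_css_minify(css):
    """Strip comments and needless whitespace from CSS using regexes only."""
    # Drop /* ... */ comments, including multi-line ones.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    # Collapse every run of whitespace (spaces, tabs, newlines) to one space.
    css = re.sub(r"\s+", " ", css)
    # Remove spaces around braces, semicolons and commas.
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)
    # Remove spaces after colons ("color: red" becomes "color:red").
    css = re.sub(r":\s+", ":", css)
    return css.strip()


sample = "/* reset */\nbody {\n    margin: 0;\n    color: red;\n}\n"
minified = naive_css_minify(sample)
```

For the sample above this yields body{margin:0;color:red;}, which shows why the savings are modest on small sites: only comments and layout whitespace go away, and the rules themselves are untouched.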

Other best practices

I keep all my css within the <head> tags, and js at the bottom of the page. This is because the browser needs to download all the css before it can start rendering the page, but doesn't need the js. It doesn't actually speed up the site, but it gives the impression of loading faster, and the user is unlikely to click on anything before the js has loaded anyway.

For a definitive guide, see Yahoo's performance rules. I also recommend Yahoo's YSlow, and if you are one of the 3 remaining web developers without it, Firebug.