Concurrency and Python


Asynchronous Programming

What happens if you have a lot of sockets that are waiting to read or write data? Asynchronous programming lets you write code that basically says, "Call my callback when you actually have something for me." Although this approach is used all the time in C, it's even nicer in Python because Python has first-class functions.

These days, many servers are written asynchronously. nginx, often described as a lightweight alternative to Apache, is both very fast and highly concurrent. Squid, the popular open source Web proxy, is also written asynchronously. This makes a lot of sense if you think about what a Web proxy does: it spends all of its time managing a large number of sockets, funneling data between clients and servers.

Asynchronous programming starts with operating system APIs such as select, poll, kqueue, aio, and epoll. These APIs let you write code that basically says, "These are the file descriptors I'm working with. Which of them is ready for me to do some reading or writing?" In Python, libraries like the built-in asyncore module and the popular Twisted framework take these low-level APIs and orchestrate callback systems on top of them.
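To make the "which of these file descriptors is ready?" question concrete, here is a minimal sketch using the standard library's select.select. The connected socket pair and the names left, right, ready_before, and ready_after are assumptions made for illustration; a real server would be watching many client sockets in a loop.

```python
import select
import socket

# A connected pair of sockets stands in for a client and a server.
left, right = socket.socketpair()

# Nothing has been written yet, so a zero-timeout poll finds no readable socket.
readable, _, _ = select.select([left, right], [], [], 0)
ready_before = list(readable)

# Write to one end; the other end now has data waiting to be read.
left.sendall(b"ping")
readable, _, _ = select.select([left, right], [], [], 1.0)
ready_after = list(readable)

# Only read from sockets that select reported as ready, so recv will not block.
received = right.recv(4) if right in ready_after else b""
print(received)

left.close()
right.close()
```

Libraries such as asyncore and Twisted wrap this kind of polling loop and invoke your callback for each descriptor that turns up ready.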

Let's look at an example of asynchronous code. First, the linear (non-asynchronous) code in Example 4.

def handle_request(request):
    # Blocks here until the database replies.
    data = talk_to_database()
    print("Processing request with data from database.")
Example 4: Non-asynchronous Code

Rewritten asynchronously, you end up with something like Example 5. (You could move use_data into a new top-level function after handle_request, but nesting it is convenient because the closure keeps access to request.)

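def handle_request(request):
    def use_data(data):
        # Runs later, once the database has replied.
        print("Processing request with data from database.")
    # Returns immediately with a deferred rather than the data itself.
    deferred = talk_to_database()
    deferred.addCallback(use_data)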
Example 5: Asynchronous Code

Notice that the talk_to_database function no longer returns a value directly. Rather, it returns a deferred object to which you can attach callbacks.
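To see what "a deferred object to which you can attach callbacks" might look like underneath, here is a toy sketch. ToyDeferred, the pending list, and the log list are hypothetical names invented for illustration; this is not Twisted's implementation, which also handles errors, chaining, and double-firing. Only the method names addCallback and callback mirror Twisted's Deferred.

```python
class ToyDeferred(object):
    """A hypothetical, minimal stand-in for a deferred; not Twisted's code."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, callback):
        # If the result is already here, run the callback right away.
        if self._fired:
            self._result = callback(self._result)
        else:
            self._callbacks.append(callback)
        return self

    def callback(self, result):
        # Deliver the result, passing it through each callback in order.
        self._fired = True
        self._result = result
        for cb in self._callbacks:
            self._result = cb(self._result)
        self._callbacks = []


pending = []  # Deferreds still waiting on the pretend database.
log = []      # Records what the callbacks did.


def talk_to_database():
    # Returns immediately; the answer arrives later via deferred.callback().
    deferred = ToyDeferred()
    pending.append(deferred)
    return deferred


def handle_request(request):
    def use_data(data):
        log.append("%s got %s" % (request, data))
    talk_to_database().addCallback(use_data)


handle_request("request-1")
log_before_reply = list(log)  # Nothing has happened yet.

# Later, the event loop sees the reply and fires the deferred.
pending.pop().callback("row 42")
print(log)
```

The key point is that handle_request returns before use_data has run; the work only completes when something else fires the deferred.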

This is called "continuation passing style". Rather than waiting for a function to simply return, you must pass a callback detailing how to continue once the data is obtained. Because you must use continuation passing style anytime you call a function that might block, it soon permeates your codebase. This can be painful, and it also means you can't use a library that performs I/O unless that library is itself written in continuation passing style, because a single blocking call stalls every other connection.
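The way the style spreads through a codebase can be sketched with plain callbacks. The functions load_user and render_page and the on_done parameter are hypothetical; the pretend database here replies immediately, whereas a real one would reply from the event loop some time later.

```python
def talk_to_database(on_done):
    # Pretend database call; a real one would invoke on_done later.
    on_done({"user": "alice"})


def load_user(request, on_done):
    # Calls something that might block, so it needs a continuation of its own.
    def got_row(row):
        on_done(row["user"])
    talk_to_database(got_row)


def render_page(request, on_done):
    # It only calls load_user, yet it cannot simply return a value either.
    def got_user(user):
        on_done("Hello, %s" % user)
    load_user(request, got_user)


pages = []
render_page("GET /", pages.append)
print(pages)
```

Only talk_to_database touches the database, but every function above it in the call chain had to grow a callback parameter.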

On the other hand, living in the asynchronous ghetto has its benefits. Aside from the clear concurrency benefits, the Twisted codebase is widely regarded as well-written code, and it provides implementations for most popular protocols.

Subroutines Versus Coroutines

In the beginning, there was the GOTO. It didn't take any parameters, and it was a one-way trip.

A coroutine is like a subroutine, except it doesn't necessarily return. With subroutines, you can do things like:

f -> g -> h (return to g, return to f)

With coroutines, you can do things like:

f -> g -> h -> f

Coroutines can be used for simple cooperative multitasking. The Python Cookbook has a great recipe for coroutines based on generators. Example 6 is a simple version of it.

import itertools

def my_coro(name):
    count = 0
    while True:
        count += 1
        print("%s %s" % (name, count))
        yield  # Hand control back to the scheduler.

coros = [my_coro('coro1'), my_coro('coro2')]
for coro in itertools.cycle(coros):  # A round-robin scheduler :)
    next(coro)  # Resume the coroutine until its next yield.
# Produces (runs until interrupted):
#
# coro1 1
# coro2 1
# coro1 2
# coro2 2
# ...
Example 6: Generator-based Coroutines

Using generators to implement coroutines is definitely a cute hack. By the way, this same trick can be used in Twisted to alleviate some of the need to use callbacks everywhere.
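As a rough illustration of how the generator trick can stand in for explicit callbacks, the sketch below drives a generator that yields deferreds. ToyDeferred, run, pending, and log are hypothetical names for this example only. Twisted's own facility for this is the inlineCallbacks decorator in twisted.internet.defer, which is considerably more thorough.

```python
class ToyDeferred(object):
    """Hypothetical minimal deferred; fires its callbacks when a result arrives."""

    def __init__(self):
        self._callbacks = []

    def addCallback(self, callback):
        self._callbacks.append(callback)

    def callback(self, result):
        for cb in self._callbacks:
            cb(result)


def run(generator):
    """Drive a generator that yields ToyDeferred objects."""
    def step(value):
        try:
            # Resume the generator, handing it the result it was waiting for.
            deferred = generator.send(value)
        except StopIteration:
            return
        # When the next deferred fires, resume the generator again.
        deferred.addCallback(step)
    step(None)  # Sending None starts a fresh generator.


pending = []
log = []


def talk_to_database():
    deferred = ToyDeferred()
    pending.append(deferred)
    return deferred


def handle_request(request):
    # Reads like the linear code in Example 4, but does not block.
    data = yield talk_to_database()
    log.append("%s got %s" % (request, data))


run(handle_request("request-1"))
log_before_reply = list(log)  # The generator is paused at its yield.

pending.pop().callback("row 42")  # The reply arrives; the generator resumes.
print(log)
```

The callback still exists, but it lives once inside run rather than in every function that needs data.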

On the other hand, there are some limitations to this technique. Specifically, yield can only appear directly in the body of the generator itself. What happens if my_coro calls some function f and f wants to yield? There are some workarounds, but the limitation is actually pretty core to Python. (Because Python isn't stackless, it can't support true continuations in the same way that Scheme can.) I've written about this topic in detail on my blog.
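One common workaround, sketched here, is to make f a generator as well and have the caller re-yield everything f produces. The step strings and the steps list are made up for illustration. In Python 3.3 and later, the for loop below can be written as yield from f(name).

```python
def f(name):
    # f wants to pause, so it has to be a generator too.
    yield "%s: f step 1" % name
    yield "%s: f step 2" % name


def my_coro(name):
    yield "%s: before f" % name
    # A plain call f(name) only creates a generator object without running it,
    # so my_coro must pass along each value that f yields.
    for item in f(name):
        yield item
    yield "%s: after f" % name


steps = list(my_coro("coro1"))
print(steps)
```

The cost is that every function between the scheduler and the point that wants to pause must cooperate, much as continuation passing style spreads through callback-based code.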

