Multi-threaded map() for Python

2007-09-05 at 04:39 | In devel, lang:en, talk | 8 Comments

The idea of a multi-processing map() for Python is quite nice. But what about a multi-threaded one? Threads usually carry less overhead than processes. If the mapping function is reasonably free of side effects (HTTP GETs qualify, since they are idempotent), the result doesn’t depend on which parallel execution model you’ve selected. If it does have side effects, running it in parallel is error-prone under either model. I’ve implemented a very simple, exception-aware threaded map() that uses one thread per call. This is the basic usage scenario:

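# url, count and the @measured timing decorator are not shown here.
# map() below is the threaded map() described above, not the built-in.

@measured
def single_threaded():
  # Fetch the same URL count times, one request after another.
  return [urlopen(url) for x in range(count)]

@measured
def multi_threaded():
  # Same requests, but each urlopen() call runs in its own thread.
  return map(lambda x: urlopen(url), range(count))

ps_s = single_threaded()
ps_m = multi_threaded()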

The results for url = "http://ya.ru/" and count = 1000:

single_threaded() is finished in 121.333 s
multi_threaded() is finished in 29.692 s
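
The post does not show the measured decorator, so here is only a guess at how timing lines like the ones above could be produced. It is a minimal sketch in present-day Python 3; the body of measured and the use of time.perf_counter() are my assumptions, not the original code.

```python
import functools
import time


def measured(func):
    """Hypothetical timing decorator: print how long each call takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned or raised, so a timing line is always printed.
            elapsed = time.perf_counter() - start
            print("%s() is finished in %.3f s" % (func.__name__, elapsed))
    return wrapper
```

The wrapper passes the return value through unchanged, so decorated functions can be used exactly as in the usage scenario.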

A multi-threaded map() is rather useful, isn’t it?

P. S. The first exception in a map() thread will be re-raised (with its traceback) in the main thread while others will be suppressed.
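
To make the one-thread-per-call idea and the exception rule concrete, here is a minimal sketch in present-day Python 3. It is not the implementation from this post: the name threaded_map and every detail of its body are assumptions chosen for illustration. "First" here means the first exception recorded by any worker thread.

```python
import threading


def threaded_map(func, items):
    """Hypothetical sketch: apply func to each item in its own thread.

    Results come back in input order. If any call raises, the first
    recorded exception is re-raised in the calling thread.
    """
    items = list(items)
    results = [None] * len(items)
    errors = []                       # exceptions raised inside worker threads
    errors_lock = threading.Lock()

    def worker(index, item):
        try:
            results[index] = func(item)
        except Exception as exc:      # the traceback stays attached to exc
            with errors_lock:
                errors.append(exc)

    threads = [threading.Thread(target=worker, args=(i, item))
               for i, item in enumerate(items)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    if errors:
        # Re-raise the first recorded exception; later ones are dropped.
        raise errors[0]
    return results
```

One thread per call keeps the code short, but with count = 1000 it starts 1000 threads at once. A bounded pool would be the usual refinement, and the standard library's concurrent.futures.ThreadPoolExecutor.map offers that today.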
