Python for Scientific Programming
1 Goals
- Teach you the basics of Python
- Amaze you with Python's power
- Show you why Python is attractive to scientists
- Fry your brain (just a little)
2 Outline
- Python
- Basics
- Philosophy
- Magic
- For scientists
- IPython
- Numpy
- Matplotlib
- Cython
- Extended worked development example
3 Style
- This is a highly interactive session.
- I expect that you will have different backgrounds, levels of familiarity with Python, styles of thought, professional needs, etc.
- I will throw a lot of material at you quickly …
- If you don't stop me to ask questions, I will assume that you already know everything I am saying, and I will go even faster.
- Questions are very welcome at any stage. In fact your questions are necessary for me to know how to be of any use to you.
- It is your responsibility to tell me what you need to know, and when you require help.
- But given the limited time available, I may have to keep some discussions shorter than would be ideal.
- I do not expect to have the time to cover all the material that appears below, in depth. Which bits are skimmed over quickly (or skipped entirely) depends on you.
4 Python
4.1 Interactivity
No lengthy compilation steps. Feedback is immediately available. You can try out ideas interactively and even inject new code (or corrected code) into running programs.
A great boon to understanding.
Interactively try out all the examples that follow.
4.2 Introspection
dir()
dir(1)
help(dir)
help(range)
type(1)
type(type(1))
type(type(type(1)))
4.2.1 fake dir
def dir(*args):
    """A quieter version of dir.

    Ignores special names (those starting and ending with two
    underscores). Useful for reducing beginners' confusion, and
    removing the temptation to abuse special names."""
    return [name for name in __builtins__.dir(*args)
            if not (name.startswith('__') and name.endswith('__'))]
4.3 Namespaces and callables
Let's explore two basic stumbling blocks: get Python to give you the real and imaginary components and the conjugate of c, where
c = 1+2j
imag(c)       # NameError
c.imag
c.real
c.conjugate   # This is not an error. You got the object you asked
              # for. It's a callable. If you want it to do anything,
              # you will have to call it
c.conjugate()
4.4 Moral of the story
- Look for names in the correct namespace
- Call callables when you want their result; don't call them if you want the callable itself.
- Functions are data
4.5 Basic data structures
4.5.1 lists
a = [1,2,3,4]
len(a)
3 in a
a[0]
a[3]
a[4]
a[2:4]   # slicing
a[:2]
a[:]
dir(a)
Lists are heterogeneous dynamically resizeable arrays.
4.5.2 tuples
b = (1,2,3,4)
b = 1,2,3,4
len(b)
3 in b
b[0]
b[2:]   # etc.
dir(b)
Tuples are immutable. Tuples are lightweight structs with anonymous members.
4.5.3 strings
s = 'hello'
len(s)
'e' in s
s[0]
s[2:4]   # etc.
4.5.4 dicts
numbers = {}
numbers[1] = 'one'
numbers[2] = 'two'
len(numbers)
numbers[1]
numbers[3]
g2e = {'eins':'one', 'zwei':'two', 'drei':'three'}
len(g2e)
g2e['eins']
g2e['vier']
Dicts provide very fast (hash-table based) random-access lookup from key to value.
Keys must be immutable (hashable).
Dicts have no meaningful order in the Python 2 series used here (since Python 3.7 they preserve insertion order).
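To make the hashability rule concrete, here is a small sketch (the names `grid` and `coordinates` are invented for illustration): a tuple is accepted as a key, while a list is rejected with a `TypeError`.

```python
# Tuples are immutable and hashable, so they can serve as dict keys.
grid = {}
grid[(0, 0)] = 'origin'
grid[(1, 2)] = 'somewhere else'

# Lists are mutable and unhashable: using one as a key raises TypeError.
coordinates = [1, 2]
try:
    grid[coordinates] = 'this cannot work'
    list_key_accepted = True
except TypeError:
    list_key_accepted = False

# Converting the list to a tuple gives a usable key.
found = grid[tuple(coordinates)]
```

Inspect `list_key_accepted` and `found` at the prompt to see what happened.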
4.6 Modules
import math
from math import sin, cos
import math as calc
from math import sin as cos
Ask Python for the sine of one sixth of pi.
4.7 Functions
def square(n):
    "Return the square of the input."
    return n*n

def ping(n):
    print "Ping", n
    if n > 0:
        pong(n-1)

def pong(n):
    print "Pong", n
    ping(n)
4.8 Sequence unpacking
a,b,c = (1,2,3)
a,b,c = 4,5,6
a,b = b,a
c1, c2, c3, c4, c5 = 'hello'
a,b = range(2)
a,b,c = range(10,13)
a,b,c,d = range(100,108,2)
4.9 Multiple return values
def powers(n):
    return n*n, n*n*n

square_of_2, cube_of_2 = powers(2)
square_and_cube_of_3 = powers(3)
4.10 Variadic functions
def one(*args):
    return args

one(1)
one(1,2)
one('hello', 1.3, 12)
one(*(1,2,3))
one(*'hello')
4.11 Default arguments
def two(a=6, b=7):
    return a,b

two()
two(3)
two(3,4)
4.12 Keyword arguments
two(b=4)

def three(*args, **kwds):
    return args, kwds

three()
three(1,2,3)
three(a=1, b=2, c=3)
three(**{'a':1, 'b':2, 'c':3})
three(*'hello', **{'a':1, 'b':2, 'c':3})
4.13 Looping
n = 10
while n > 0:
    print n
    n -= 1

for item in sequence:
    print item

for position, item in enumerate(sequence):
    print position, item
Python's for loops abstract iteration over containers.
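As a rough illustration of that abstraction, the sketch below feeds several different containers through one and the same loop. The helper name `collect` is made up for this example, and the code avoids `print` statements so that it runs unchanged on Python 2.7 and Python 3.

```python
def collect(container):
    """Return a list of whatever a for loop yields from the container."""
    items = []
    for item in container:
        items.append(item)
    return items

from_list = collect([10, 20, 30])
from_tuple = collect((10, 20, 30))
from_string = collect('abc')            # a string yields its characters
# Iterating over a dict yields its keys (sorted here for a stable result).
from_dict = sorted(collect({'eins': 'one', 'zwei': 'two'}))
from_pairs = collect(enumerate('ab'))   # enumerate yields (position, item)
```

The loop body never needs to know what kind of container it was given.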
4.14 Anonymous functions
lambda x:x+1
(lambda x:x+1)(2)
lambda a,b,c: a*b+c
lambda a,b,c: a*b+c(2,3,4)
(lambda a,b,c: a*b+c)(2,3,4)
4.15 How to read lambda expressions
lambda XXX : YYY
The function which takes the arguments XXX and returns YYY.
4.16 Higher order functions
def add_23(n):
    return n+23

map(add_23, range(10))
map(lambda n:n+23, range(10))

def is_odd(n):
    return n%2

def is_even(n):
    return not n%2

filter(is_odd, range(10))
filter(lambda n:n%2, range(10))
Remember:
- Functions are data.
- Functions are objects.
- (Almost) everything is an object in Python!
4.17 Comprehensions
List comprehensions
[ (x,x*x) for x in range(10) ]
[ (x, x*x) for x in range(10) if x%2 ]
[ letter*N for N in range(1,5) for letter in 'abcd' ]
Dict comprehensions (new in Python 2.7)
{ x:x*x for x in range(10) }
Set comprehensions (new in Python 2.7)
{ x*x for x in range(10) }
Generator expressions
( (x,x*x) for x in range(10) )
( (x, x*x) for x in range(10) if x%2 )
( letter*N for N in range(1,5) for letter in 'abcd' )
Lazy version of list comprehensions
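One way to see the laziness is to record when the work actually happens. This is a small sketch; `noisy_square` and the `calls` list are invented for the demonstration.

```python
calls = []

def noisy_square(x):
    """Square x, recording each call so we can see when work happens."""
    calls.append(x)
    return x * x

eager = [noisy_square(x) for x in range(5)]   # all five calls happen now
calls_after_list = len(calls)

del calls[:]                                  # forget the recorded calls
lazy = (noisy_square(x) for x in range(5))    # no squares computed yet
calls_after_genexp = len(calls)

first = next(lazy)                            # computes exactly one item
calls_after_next = len(calls)

rest = list(lazy)                             # drains the remaining items
```

The list comprehension does all its work up front, while the generator expression computes each item only when asked for it.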
4.18 Boolean contexts
[ '%s is %s' % (item, bool(item)) for item in [0, 1, 0.0, 0.1, 0j, 1j, [], [0], (), (0,), {}, {'a':'A'}, None]]
4.19 Using a list to represent a queue
queue = []
queue.append(1)
queue.append(2)
queue.append(3)
queue.pop(0)
queue.pop(0)
queue.pop(0)
queue.pop(0)
4.20 Queue Abstract Data Type
An abstract data type is one defined by the operations you can carry out on it.
def make_queue():
    return []

def add_to_queue(queue, item):
    queue.append(item)

def remove_from_front_of_queue(queue):
    return queue.pop(0)

queue = make_queue()
add_to_queue(queue, 1)
add_to_queue(queue, 2)
add_to_queue(queue, 3)
remove_from_front_of_queue(queue)
remove_from_front_of_queue(queue)
remove_from_front_of_queue(queue)
remove_from_front_of_queue(queue)
4.21 Exceptions
def remove_from_front_of_queue(queue):
    if queue:
        # pop(0) removes from the front; a bare pop() would remove from the back
        return queue.pop(0)
    raise Exception("Cannot remove from queue: queue is empty.")
4.22 Classes
class Queue(object):

    def __init__(queue):
        queue.data = []

    def add(queue, item):
        queue.data.append(item)

    def remove(queue):
        if queue.data:
            return queue.data.pop(0)
        raise Exception("Cannot remove from queue: queue is empty.")
        #raise EmptyQueue("Cannot remove from queue: queue is empty.")

queue = Queue()
queue.add(1)
queue.add(2)
queue.add(3)
queue.remove()
queue.remove()
queue.remove()
queue.remove()
NB, always use the name self for the first argument of methods: It's the strongest convention in Python, and using any other name will lead to confusion. I used the name queue instead, for didactic purposes, because it ties in with the preceding narrative. I wouldn't dream of perpetrating such crimes in production code!
4.23 Inheritance
class EmergencyQueue(Queue):

    def add_to_front(queue, item):
        queue.data.insert(0,item)
4.24 Custom Exceptions
class EmptyQueue(Exception): pass
4.25 Handling exceptions
# This relies on Queue.remove raising EmptyQueue: swap in the
# commented-out raise statement in the Queue class above.
orders = EmergencyQueue()
orders.add(('Monkey', 'banana', 12))
orders.add_to_front(('Burning house', 'fire engine', 3))
orders.add(('Dog', 'bone', 2))

try:
    while True:
        customer, item, quantity = orders.remove()
        print "Sending %d %ss to %s" % (quantity, item, customer)
except EmptyQueue:
    print "No more work to be done. Going home."
4.26 Everything is an object
- Numbers, strings, lists, dictionaries
- Functions
- Types and classes
- Modules
- Stack frames
4.27 Functions are data
Once you grasp the fact that functions are just objects which happen to be callable, Python will fit your brain much better.
Any objects that are callable can be thought of as functions.
l = []
store = l.append
store('apple')
store('banana')
l
map(int, '012345')
4.28 Duck Typing
If it looks like a duck, and it quacks like a duck, then it's a duck.
What a Python object can do is far more important than its type.
I don't care who your parents are, as long as you can do the job.
Avoid checking types, check capabilities. This is often achieved by avoiding Look Before You Leap style
if type(candidate) is not JugglingClanMember:
    print "I'm not even going to let you try"
else:
    print candidate.juggle_balls(a,b,c,d,e)
and using Easier to Ask for Forgiveness than Permission style instead
try:
    print candidate.juggle_balls(a,b,c,d,e)
except AttributeError:
    print "Hmm, you don't seem to know how to juggle at all."
except JugglingError:
    print "I'm sorry, I seem to have given you too many balls."
4.29 Uniformity
Python is far from perfect. Yes, there are inconsistencies, special cases, dark corners and warts. But compared to many other languages, Python is very regular, consistent and logically clean.
If you get a good grasp of certain key ideas, you will be able to understand Python much better, and will be a much more empowered Python programmer.
- Everything is an object.
- Every object is a namespace (and every namespace is an object).
- Functions are data. Call them if you want their result right now; don't call them if you need the function itself rather than its result.
- Assignment is (re-)binding, nothing more.
- Duck Typing.
- Duck Typing.
- Duck Typing!!
Unfortunately we do not have the time to (formally) cover these topics with the care they deserve.
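As a brief, informal illustration of the "assignment is (re-)binding" idea, consider the following sketch (all names are invented for the example). Binding a second name to a list does not copy it, and rebinding a name never affects the object it used to refer to.

```python
a = [1, 2, 3]
b = a                # b is bound to the same list object as a
b.append(4)          # mutates that one shared object
same_object = a is b
a_after_append = list(a)

b = [1, 2, 3, 4]     # rebinds the name b to a brand new list
equal_but_distinct = (a == b) and (a is not b)

def rebind(items):
    items = items + [99]   # rebinds the local name only
    return items

def mutate(items):
    items.append(99)       # changes the object the caller passed in

original = [0]
rebind(original)
after_rebind = list(original)   # the caller's list is untouched
mutate(original)
after_mutate = list(original)   # the caller's list has grown
```

The same rule explains argument passing: a function receives a new name bound to the caller's object, so mutation is visible outside but rebinding is not.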
4.30 Closures
def make_adder(n):
    def adder(m):
        return n+m
    return adder

add3 = make_adder(3)
add9 = make_adder(9)
add3(10), add9(10)
- Function factories
- Stateful functions
- Functions accepting arguments at different times
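The "stateful functions" use of closures can be sketched as follows. `make_counter` is a made-up example. The state lives in a one-element list because Python 2 has no way to rebind a variable of the enclosing function, whereas Python 3 offers `nonlocal` for that.

```python
def make_counter(start=0):
    """Return a function which remembers how many times it was called."""
    state = [start]            # a mutable cell captured by the closure

    def counter():
        state[0] += 1          # mutate the cell; no rebinding needed
        return state[0]

    return counter

tick = make_counter()
tock = make_counter(10)        # an independent counter with its own state
results = [tick(), tick(), tock(), tick()]
```

Each call to `make_counter` creates a fresh `state` list, so the counters do not interfere with one another.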
4.31 Classes vs closures
class Make_Adder(object):

    def __init__(self, n):
        self.n = n

    def __call__(self, m):
        return self.n + m

add2 = Make_Adder(2)
add8 = Make_Adder(8)
add2(10), add8(10)
4.32 Metaprogramming
Python metaprogramming is based around two simple facts
- pretty much everything is a Python object which can be manipulated, in pure Python, at run time
- Python is very dynamic: very little is fixed at compile time; most things can be modified at run time.
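A minimal sketch of these two facts in action (the class and function names are invented for illustration): a method is added to a class after an instance already exists, attributes are accessed by name, and a class is created by calling `type` directly.

```python
class Particle(object):
    def __init__(self, mass, velocity):
        self.mass = mass
        self.velocity = velocity

p = Particle(2.0, 3.0)            # created before the class is modified

def kinetic_energy(self):
    return 0.5 * self.mass * self.velocity ** 2

Particle.kinetic_energy = kinetic_energy   # add a method at run time
energy = p.kinetic_energy()                # existing instances see it

# Attributes can be set and looked up by name, chosen at run time.
setattr(p, 'charge', -1)
charge = getattr(p, 'charge')

# Classes themselves are objects, and can be created by calling type.
Point = type('Point', (object,), {'dimensions': 2})
```

Nothing here uses special syntax: it is ordinary Python code operating on ordinary Python objects.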
4.32.1 Decorators: the idea
def one(_):
    return 1

def inc(n):
    return n+1

def inc_by(n):
    def inc(m):
        return n+m
    return inc

def addtwo(a,b):
    return a+b

addtwo = one(addtwo)
addtwo = inc(addtwo)
addtwo = inc(addtwo)
addtwo = inc_by(6)(addtwo)
4.32.2 Decorators: the syntax
@inc_by(6)
@inc
@inc
@one
def multiplytwo(a,b):
    return a*b

print multiplytwo
4.32.3 Our first useful decorator
from functools import wraps

def report_args(the_function_we_are_decorating):
    @wraps(the_function_we_are_decorating)   # Ignore this line at first
    def decorated_version_of_the_function(*args, **kwds):
        print the_function_we_are_decorating.__name__, ' got args: ', args, kwds
        return the_function_we_are_decorating(*args, **kwds)
    return decorated_version_of_the_function
4.32.4 Structure of general function decorator
from functools import wraps

def increment_result_by(the_parameter_of_the_decorator):
    def decorator_with_parameter(the_function_we_are_decorating):
        @wraps(the_function_we_are_decorating)
        def decorated_version_of_the_function(*args, **kwds):
            return (the_function_we_are_decorating(*args, **kwds)
                    + the_parameter_of_the_decorator)
        return decorated_version_of_the_function
    return decorator_with_parameter
Use it like this
@increment_result_by(66)
def dodgy_multiply(a,b):
    return a*b

dodgy_multiply(10, 60)
If the usage confuses you, just remember that it's equivalent to
def dodgy_add(a,b):
    return a+b

dodgy_add = increment_result_by(66)(dodgy_add)
dodgy_add(200, 400)
4.32.5 General function decorator structure distilled
Three nested functions:
1. Accepts parameters of the decorator
2. The decorator itself: accepts to-be-decorated function as argument
3. The decorated (replacement) function: accepts original function's arguments
The decorator (2.) and/or the replacement function (3.) are usually closures over the parameters of 1.
def parameter_receiver(parm1, maybe_parm2, maybe_kwds='etc'):      # 1.
    def the_decorator(original_function):                          # 2.
        def replacement_of_original_function(*args, **kwds):       # 3.
            # Extra functionality
            result = original_function(*args, **kwds)
            # More extra functionality
            return result
        return replacement_of_original_function
    return the_decorator
4.32.6 Dead ringers
The decorated function usually replaces the function to be decorated. It should therefore usually resemble the original function in terms of __name__, docstring etc.
functools.wraps is a utility which helps to make the decorated function look as much like the original as necessary/possible.
@wraps(original_function)
def proxy_function(...)
means proxy_function is the decorated version of original_function (the former wraps the latter), so copy stuff like __name__ and __doc__ across.
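The effect of `functools.wraps` is easiest to see by comparing a decorator which uses it with one which does not. The decorator and function names in this sketch are invented for the comparison.

```python
from functools import wraps

def plain_decorator(function):
    def replacement(*args, **kwds):
        return function(*args, **kwds)
    return replacement

def careful_decorator(function):
    @wraps(function)               # copy __name__, __doc__ etc. across
    def replacement(*args, **kwds):
        return function(*args, **kwds)
    return replacement

@plain_decorator
def area(width, height):
    "Return the area of a rectangle."
    return width * height

@careful_decorator
def volume(width, height, depth):
    "Return the volume of a box."
    return width * height * depth

names = (area.__name__, volume.__name__)
docs = (area.__doc__, volume.__doc__)
```

Both decorated functions still compute the right answer, but only `volume` kept its own name and docstring, so `help` and tracebacks remain informative for it.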
4.32.7 Class decorator
def list_instances(class_):
    # Get a handle on the original constructor: The enhanced version
    # will need to run it after it has been replaced
    original__init__ = class_.__init__
    # Storage space for remembering all the instances
    class_._all_instances_ever_made = []
    # Enhanced constructor, does whatever original constructor did,
    # plus remembering every instance it creates.
    def __init__(self, *args, **kwds):
        original__init__(self, *args, **kwds)
        self._all_instances_ever_made.append(self)
    # Install the enhanced constructor in the class
    class_.__init__ = __init__
    class_.get_all_instances_ever_made = lambda self:self._all_instances_ever_made
    # Don't forget to return the class, otherwise None will be
    # returned implicitly and will replace the class.
    return class_

@list_instances
class Foo(object):

    def __init__(self, thing):
        print "Constructing with ", thing
        self.thing = thing

    def hello(self):
        print "Hi, I'm a Foo and I've got a ", self.thing

Foo('apple')
Foo('banana')
for f in Foo('orange').get_all_instances_ever_made():
    f.hello()
4.32.8 Decorators are really simple!
Really!
- We don't need the syntax
- It's just pure Python manipulating Python objects
- Functions and classes are objects (data)
(Yes, the implementation of any given decorator might be complicated, but the decorator idea is really simple.)
4.32.9 Class generation
def Structish(name, item_names):
    """Create simple classes.

    Create a class with the given name and the given set of attribute
    names. The attributes may be accessed by name or by position with
    itemgetting syntax."""

    # Core of the class we are creating
    class new_struct(list):
        def __init__(self, *args):
            return list.__init__(self,args)

    # Set the class' internal name to what the user specified
    new_struct.__name__ = name

    # Factory of getters and setters which will be used to translate
    # attribute name to sequence position.
    def make_getter_and_setter(item_position):
        def getter(self):
            return self[item_position]
        def setter(self, value):
            self[item_position] = value
        return getter, setter

    # For each requested attribute name, create a property which gets
    # and sets the corresponding item in the sequence.
    for item_position, item_name in enumerate(item_names.split()):
        setattr(new_struct, item_name,
                property(*make_getter_and_setter(item_position)))

    # Now that we have stuck all the necessary extra bits on to the
    # class, return it as the final product.
    return new_struct

Vec3D = Structish('Vec3D', 'x y z')
v = Vec3D(1,2,3)
assert v[0] == v.x
assert v[1] == v.y
assert v[2] == v.z
v[1] = 10
assert v.y == 10
v.z = 20
assert v.z == 20
A related idea appears in the standard library as namedtuple in the collections module.
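For comparison, here is a short sketch of `collections.namedtuple` used for the same three-component vector. The main difference from the hand-made class above is that namedtuples, being tuples, are immutable.

```python
from collections import namedtuple

Vec3D = namedtuple('Vec3D', 'x y z')
v = Vec3D(1, 2, 3)
by_name = (v.x, v.y, v.z)
by_position = (v[0], v[1], v[2])

# _replace returns a modified copy instead of changing v in place.
w = v._replace(y=10)

# Direct assignment to a field is refused.
try:
    v.y = 10
    assignment_allowed = True
except AttributeError:
    assignment_allowed = False
```

Access by name and by position both work, as they did for the class generated above.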
4.33 Back to Earth
We've just looked at some powerful techniques.
Some of you may find them esoteric.
Some of you may think you'll never use them. Maybe, but if you understood any of it, it will make you a better Python programmer overall.
5 IPython
IPython is an enhanced interactive shell for Python.
Its feature set is huge. We will use IPython as our interactive shell from now on, and hopefully you'll pick up some useful ideas along the way.
For now, make sure that you are aware of some of the most obvious features:
- Enhanced, intelligent name completion
- Help access with ? and ??
- OS interaction with !
- Input and output history, persistent command history
- Pretty printed input, output and tracebacks
- Type %magic
Some major features which we will not have time to discuss at all, include:
- Notebook
- Parallel and distributed computing support
5.1 edit magic
Make sure to try IPython's edit magic, and to set your EDITOR environment variable to something you like; by default it's vi which is an experience many of you will not enjoy!
I will rely on your ability to use this in the major exercise at the end of the course.
6 Numpy
Much of the High Performance Computing done in Python is based around Numpy: a library providing optimized multi-dimensional arrays and associated operations.
We will use Numpy in our main exercise.
6.1 Creating arrays
There are many ways to initialize Numpy arrays:
[Hint: start ipython with ipython --pylab to have all the required names imported automatically.]
array([2,9,3,6])
arange(10)
linspace(-10, 10, 11)
logspace(1,10, 4, base=2)
empty(4)
zeros((4,2))
ones((4,2), complex)
ones((4,2,3))
a = empty((2,3))
ones_like(a)
a.shape
a.reshape(6,1)
a
ogrid[20:30:2]
ogrid[-10:10:5j]
ogrid[20:30:2, -10:10:5j]
and many, many more.
6.2 Element access and slicing
Using the following as our sample array
shape = (4,5,6)
a = arange(prod(shape)).reshape(shape)
try to understand the results given by the following
a[0, 0, 0]
a[-1, 1, -2]
a[1, :, 2]
a[2, 3]
a[:, 1, -2]
a[0, 2:, 1:3]
a[1:3, 2:-1, 3:-1]
6.3 ufuncs
Ufuncs are Numpy functions which act elementwise on the members of an array.
sin(linspace(0, pi/2, 100))
log2(logspace(0,10, 21, base=2))
[name for (name,obj) in locals().items() if type(obj) is ufunc]
Most operators are overloaded to act as ufuncs on Numpy arrays
arange(10) < 5
ogrid[-1:1:11j] ** 2
x,y = ogrid[-1:1:11j, -1:1:5j]
x
y
x+y
(x+y) ** 2
imshow((x+y) ** 2)
6.4 Temporaries
Beware of temporaries: the expression (x+y) ** 2 creates a temporary array containing the result of x+y, which is then used to create the final result. When dealing with large arrays, you may wish/need to avoid these temporaries.
Numexpr is a package which (among other things) automates the process of eliminating such temporaries.
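Without reaching for Numexpr, one manual way to avoid the temporary is the `out` argument which Numpy ufuncs accept. The sketch below assumes Numpy is importable as `np` (rather than relying on the names pylab injects) and uses small arrays purely for illustration.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 5)
y = np.linspace(0.0, 2.0, 5)

# Straightforward version: x + y is stored in a temporary array,
# and squaring it allocates a second array for the final result.
simple = (x + y) ** 2

# In-place version: one buffer is allocated up front and then reused.
result = np.empty_like(x)
np.add(x, y, out=result)                  # result now holds x + y
np.multiply(result, result, out=result)   # square it in place
```

The second form is less readable, so it is only worth the trouble when the arrays are large enough for the extra allocation to matter.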
7 Matplotlib
Matplotlib is a rich library for the graphical presentation of data, implemented in Python. By design, its interface resembles that of Matlab in many ways.
The best way to use Matplotlib interactively is probably via IPython's pylab:
ipython --pylab
To get something pretty on the screen quickly, try
x = linspace(-10, 10, 1000)
y = sinc(x)
plot(x,y)
Then try
clf()
hist(randn(10000), 50)
7.1 Examples
To get an idea of Matplotlib's capabilities, use IPython's loadpy magic to run some of these examples. For instance
%loadpy http://matplotlib.sourceforge.net/examples/animation/dynamic_image.py
or
%loadpy http://matplotlib.sourceforge.net/examples/event_handling/data_browser.py
loadpy gives you a chance to edit the loaded code before executing it: if nothing seems to be happening, press ENTER once more.
8 There's much more
While NumPy, SciPy, Matplotlib and IPython are likely to be familiar and useful to the majority of Python-using scientists, many more domain specific packages exist. Whatever your field, there is a fair chance that some good third-party modules exist that could help you in your work.
9 Cython
Cython can be thought of as
- An extension of the Python language
- A static compiler for the Python language
- A tool for writing Python extension modules
- A tool for accessing C/C++/Fortran code from Python
9.1 Cython as a compiler
Create an ordinary python module, let's call it triangle.py:
def triangle(n):
    total = 0
    for i in range(1,n+1):
        total += i
    return total
Compile it with cython:
cython triangle.py
Look at the generated file, triangle.c.
9.2 Cython and distutils
Create a file (traditionally called setup.py) containing
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("triangleC", ["triangle.py"])]

setup(
    name = "Cython-test",
    cmdclass = {'build_ext':build_ext},
    ext_modules = ext_modules)
With this in place, all you need is
python setup.py build_ext --inplace
from triangleC import triangle
map(triangle, range(10))
Confirm that triangleC.triangle has identical semantics to triangle.triangle.
[Here are the files for your convenience: triangle.py, setup.py]
9.3 It's still too dynamic
Test relative performance of the pure Python module and the equivalent Cython-generated extension module:
python -m timeit -s "from triangle import triangle" "triangle(100)"
python -m timeit -s "from triangleC import triangle" "triangle(100)"
On my machine I get a speedup factor of 1.5 (2). Not bad, but not very impressive. We can do much better: the extension module still does too much unnecessary work, such as repeatedly checking the type of the data involved, even though the type will always be int.
9.4 Cython annotation
Ask Cython to annotate your source:
cython -a triangle.py
and use a web browser to view triangle.html. The more yellow you see, the more potential there is to speed up that line.
Clicking on a source line in the annotator output will toggle display of the C code that the line generates.
9.5 Cython type declarations
Cython allows you to introduce type declarations into your program. The primary syntax for doing this is not compatible with pure Python, so we must rename the source file to triangle.pyx.
Try adding/removing the following declarations, and see how they affect the output of Cython's annotator, and how they affect the execution speed. (Don't forget to change triangle.py to triangle.pyx in setup.py.)
Declare the parameter type
def triangle(int n): ...
Declare the type of a local variable
def ... :
    ...
    cdef int i
    ...
Declare the return type
cpdef int triangle(int n): ...
Try to turn everything white. [You should be able to turn everything except the first and last lines of triangle completely white.]
On my machine timeit now reports a speedup factor of 46 (66). [Solution: triangle.pyx, setup_pyx.py]
9.6 Pure C functions
The triangle function receives Python objects as arguments and returns Python objects. In order to deal with these, some dynamic code must remain: this is the reason the first and last lines of triangle could not be turned white.
We can turn triangle completely white, at the cost of making it inaccessible (directly) from Python: just change the cpdef on the first line of triangle to cdef. Observe that this has two effects:
- triangle turns purely white in the annotator
- triangle disappears from the triangleC module
What is the point of having a function which is not callable from Python? You can call it indirectly, from some other function in the module which is callable from Python.
9.7 There's more to Cython
In the time available, we can only scratch the surface of Cython, but we have already seen the central ideas on which it is based:
- Write high-level Python, get low-level C
- Annotate where necessary to improve performance
Some of the features we haven't discussed include
- Cythonized classes
- C++ integration
- Fortran integration
- Numpy integration
- Automagical compilation
- Accessing low-level libraries
- Inlining C/C++
We will use Cython again in the main exercise which follows.
10 Putting it together
10.1 Extended example
We need a toy program which
- is easy to implement in a short time
- does not require deep domain-specific expertise
- is CPU intensive
- gives easily verifiable and pretty results
We'll write a Mandelbrot set explorer.
The explorer should show a square portion of the complex plane. Points on the plane should be given a colour representing the number of iterations it took to prove that they did not belong to the set.
It should be possible to use keystrokes to perform the following actions:
- pan the view port
- zoom in and out
- change the resolution
- change the iteration cutoff
Clicking in the viewport should zoom in, recentering on the clicked point.
(Depending on how quickly we work our way through the material, I expect to skip over many of the details of the development of the larger program we are now going to write. I have tried to provide complete working examples at many intermediate stages, to help you work through the details in your own time.)
10.2 Simplest thing that works
Let's start off by writing a program which displays the Mandelbrot set. We won't pay too much attention to style or efficiency for now: we just want something that works.
Run this code with ipython --pylab -i mandelbrot_0.py
Pick a value for the resolution which gives output in a reasonably quick time.
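For readers without the course files to hand, the core calculation can be sketched in pure Python as below. This is not the course's mandelbrot_0.py: the function names, the default iteration limit of 50 and the text-mode display are all assumptions made for illustration.

```python
def escape_time(c, max_iterations=50):
    """Number of iterations after which |z| exceeds 2, starting from z = 0
    and repeating z -> z*z + c. Returns max_iterations if it never does."""
    z = 0j
    for iteration in range(max_iterations):
        z = z * z + c
        if abs(z) > 2:
            return iteration
    return max_iterations

def mandelbrot_grid(x, y, width, resolution, max_iterations=50):
    """Escape times on a resolution x resolution grid covering the square
    of the given width centred on the point (x, y)."""
    step = float(width) / resolution
    left = x - width / 2.0
    top = y + width / 2.0
    return [[escape_time(complex(left + (col + 0.5) * step,
                                 top - (row + 0.5) * step),
                         max_iterations)
             for col in range(resolution)]
            for row in range(resolution)]

grid = mandelbrot_grid(-0.5, 0.0, 3.0, 30)

# Crude text rendering: '#' marks points which never escaped.
for row in grid:
    print(''.join('#' if n == 50 else '.' for n in row))
```

In a pylab session the same `grid` could be handed to `imshow` to get a coloured picture instead of the text rendering.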
10.3 Enable viewport panning
Add the following method to the Mandelbrot class
def pan(self, direction):
    dx,dy = direction
    self._x += dx * self._width
    self._y += dy * self._width
    print "Panning to", (self._x, self._y)
    self.calculate_and_draw()
You should now be able to pan the viewport by, for example, a quarter window's width to the right, by typing mandel.pan((0.25, 0)) in the interactive session.
10.4 Interactive class update
We can change the behaviour of an existing instance of our class.
def pan(self, direction):
    print "Panning has been disabled"

Mandelbrot.pan = pan
mandel.pan((1,2))
The trouble with this is that the behaviour of the class instance no longer matches the source code. This is great for quickly testing ideas, but can easily lead to confusion.
10.5 More careful interactive class update
- Edit the definition of Mandelbrot.pan. Give it some behaviour that you will easily recognize.
- Copy the whole Mandelbrot class definition to the clipboard
- type paste in your IPython session
- type mandel.__class__ = Mandelbrot in your IPython session
- Call mandel.pan(...) in your IPython session and confirm that the new behaviour is active.
Note that we changed the type of mandel from the old version of the class, to the new one.
10.6 Update all existing instances
Here is how you might track down all existing instances of the Mandelbrot class:
import gc
need_update = [ instance for instance in gc.get_objects()
                if type(instance) is Mandelbrot ]
Now use IPython's edit magic to change the definition of a method in the Mandelbrot class. Then you can make all the instances you found earlier pick up the new definition, by changing their class to the new version of Mandelbrot.
for instance in need_update:
    instance.__class__ = Mandelbrot
10.7 Instance-updating decorator
Write a class decorator, based on the last idea, which automatically updates the instances of the decorated class, when that class is redefined.
This is useful for trying out new ideas in your classes without losing the existing state of your instances: An interactive development aid.
Here is how we might use it with our Mandelbrot class. Use it to add zoom, change_resolution and max_iterations methods. Test them and fix any problems interactively, reusing the same single instance of Mandelbrot.
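One possible shape for such a decorator is sketched below. It is only an illustration of the idea, not the course's solution: the name `update_instances`, the matching of old and new classes by name and module, and the `Viewer` stand-in class are all assumptions.

```python
import gc

def update_instances(new_class):
    """Class decorator: rebind instances of any older class with the same
    name and module to the newly defined class."""
    for obj in gc.get_objects():
        old_class = type(obj)
        if (old_class is not new_class
                and getattr(old_class, '__name__', None) == new_class.__name__
                and getattr(old_class, '__module__', None) == new_class.__module__):
            obj.__class__ = new_class
    return new_class

@update_instances
class Viewer(object):
    def __init__(self, width):
        self.width = width
    def describe(self):
        return 'old behaviour'

viewer = Viewer(3.0)

@update_instances
class Viewer(object):          # the "edited" version of the class
    def __init__(self, width):
        self.width = width
    def describe(self):
        return 'new behaviour, width %s' % self.width
    def zoom(self, factor=2):
        self.width /= factor
```

After the second definition, `viewer` keeps its state (`width`) but picks up the new methods. Assigning to `__class__` can fail when the old and new classes have incompatible layouts (for example, differing `__slots__`), so treat this as a development aid rather than something to rely on in production code.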
10.8 Zooming, Resolution and Iteration limit
Hints:
def zoom(self, factor=2):
    self._width /= factor
    print "Zooming to width", self._width
    self.calculate_and_draw()

def change_resolution(self, factor=1.1):
    self._resolution = int(factor * self._resolution)
    print "Changing resolution to", self._resolution
    self.calculate_and_draw()

def max_iterations(self, new_value=None):
    if new_value:
        self._max_iterations = new_value
        print "Setting maximum iterations to", self._max_iterations
        self.calculate_and_draw()
    return self._max_iterations
10.9 Optimization
As we increase the resolution and zoom in (requiring an increase in iteration limit), the time needed to calculate the image grows rapidly. Deeper exploration of the fractal becomes impractical at this speed. We need to optimize.
Before optimizing, always profile. Let's use IPython's prun magic to help us. Interactively change the image parameters until the image update takes a few seconds, then profile with
prun mandel.update_image()
On my machine this suggests that around 93% of the time is spent in the escape_time function (and the abs function called inside it). Let's use Cython to speed up escape_time, while keeping as much of the rest of the code as possible in dynamic, flexible, pure Python.
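Outside IPython, much the same information can be obtained from the standard library's `cProfile` and `pstats` modules. The sketch below profiles a small stand-in for the image update. The functions here are simplified assumptions, not the course's Mandelbrot class.

```python
import cProfile
import pstats

def escape_time(c, max_iterations=200):
    z = 0j
    for iteration in range(max_iterations):
        z = z * z + c
        if abs(z) > 2:
            return iteration
    return max_iterations

def update_image(resolution=40):
    """Stand-in for the image update: compute escape times on a grid."""
    step = 3.0 / resolution
    return [[escape_time(complex(-2.0 + col * step, -1.5 + row * step))
             for col in range(resolution)]
            for row in range(resolution)]

profiler = cProfile.Profile()
profiler.enable()
image = update_image()
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats('cumulative').print_stats(5)   # five most expensive entries
```

The printed table lists call counts and times per function, which is how a hot spot such as `escape_time` (and the `abs` calls inside it) shows up.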
10.10 Cythonizing escape_time
Simply moving out escape_time into a separate module, compiling it with Cython and importing it, results in a speedup factor of around 1.1 on my machine.
Use the Cython annotator to guide you in adding type declarations to escape_time. After doing this, I get a speedup factor of almost 8 relative to mandelbrot_3.py.
The use of abs seems to be the major cause of remaining yellowness. Let's hand code something equivalent. I now get a speedup factor close to 16 relative to mandelbrot_3.py.
[Note, the speedup factor is highly dependent on the value of max_iterations: while max_iterations = 1000 gives me a speedup factor of 16, max_iterations = 100 gives me a factor of only 2. It is also strongly affected by the region of the complex plane that is being analysed.]
Solution: mandelbrot_4.py, cython_mandel4.pyx, setup_4.py
10.11 Push the loop down
Mandelbrot.update_image calls escape_time in a loop. On each iteration, an unboxed complex number is extracted from a Numpy array and placed into a Python object, which is then sent into escape_time where it is unboxed again and dealt with at a low level. Each point is (pointlessly) hopping back and forth across the Python-C boundary.
It would be much better if we could push that whole loop down into Cython.
Solution: mandelbrot_5.py, cython_mandel5.pyx, setup_5.py
On my machine, this gives a speedup factor of almost 270 relative to mandelbrot_3.py.
Swap the order of the nested loops in escape_time and see what effect it has on performance. For large arrays you should find that one version is appreciably faster than the other. Why?
10.11.1 Speed comparison
For convenient comparison of the performance of different versions, I have provided this script. Edit the value of versions in the source to specify which versions you want to compare, then execute with
ipython --pylab -i compare_speed.py
10.12 Mouse clicks
Having to type code at the IPython prompt, in order to navigate the Mandelbrot set, is not very convenient. We'll add the ability to navigate with keystrokes and mouse clicks in the viewer.
Look through the documentation on Matplotlib event handling.
Add click-to-zoom support for the viewer thus:
class Mandelbrot(object):

    def __init__(self, *args, **kwds):
        # ...
        fig = figure()
        fig.canvas.mpl_connect('button_press_event', self.on_click)
        # ...

    def on_click(self, event):
        # Don't do anything if the click lies outside the axes
        if event.xdata is None or event.ydata is None:
            return
        # Convert axis coordinates into image coordinates and recenter
        # image on the clicked point.
        self._x += self._pixel_to_position(event.xdata, 'x')
        self._y += self._pixel_to_position(event.ydata, 'y')
        print "Event coords:", event.xdata, event.ydata
        print "Image coords:", self._x, self._y
        # Mouse buttons which zoom after recentering
        if event.button != 1:
            # As you view the fractal boundary in greater detail, you
            # need more iterations to distinguish the features.
            self._max_iterations *= 1.25
            # Not calling self.max_iterations() as this would
            # recalculate the image unnecessarily, *before* we zoom.
            print "Setting maximum iterations to", self._max_iterations
            self.zoom()
        # Any other mouse buttons only recentre
        else:
            print "Recentering on", (self._x, self._y)
            self.calculate_and_draw()

    def _pixel_to_position(self, pixel, axis):
        # In Matplotlib coordinates, the y-axis points downwards, so
        # we reverse the result for the y-axis
        result = (float(pixel) / self._resolution - 0.5) * self._width
        return -result if axis == 'y' else result
Add a handler which allows you to navigate with keystrokes. For example
- arrow keys to pan the viewer
- + / - to zoom in/out
- R / r to increase/decrease the resolution
- Sequence of digits followed by ENTER to set the iteration limit.
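One way to organise such a handler is a dispatch table mapping key names to actions, sketched below. The class `KeyNavigator`, the pan step of 0.25, the sign conventions for up/down and the stand-in classes are all assumptions for illustration. Only the method names `pan`, `zoom`, `change_resolution` and `max_iterations` come from the preceding sections. The key names ('left', 'enter', and so on) are those Matplotlib is expected to report in `event.key`, which you should verify in your own session.

```python
class KeyNavigator(object):
    """Translate key names into calls on a Mandelbrot-like object."""

    def __init__(self, mandel):
        self.mandel = mandel
        self.digits = ''          # digits typed so far, awaiting ENTER
        self.actions = {
            'left':  lambda: mandel.pan((-0.25, 0)),
            'right': lambda: mandel.pan((0.25, 0)),
            'up':    lambda: mandel.pan((0, 0.25)),
            'down':  lambda: mandel.pan((0, -0.25)),
            '+':     lambda: mandel.zoom(2),
            '-':     lambda: mandel.zoom(0.5),
            'R':     lambda: mandel.change_resolution(1.1),
            'r':     lambda: mandel.change_resolution(1 / 1.1),
        }

    def on_key(self, event):
        key = event.key
        if key is None:
            return
        if key.isdigit():
            self.digits += key                 # accumulate the number
        elif key == 'enter':
            if self.digits:
                self.mandel.max_iterations(int(self.digits))
            self.digits = ''
        elif key in self.actions:
            self.actions[key]()                # unknown keys are ignored


class RecordingMandel(object):
    """Stand-in which records calls instead of drawing anything."""
    def __init__(self):
        self.calls = []
    def pan(self, direction):
        self.calls.append(('pan', direction))
    def zoom(self, factor=2):
        self.calls.append(('zoom', factor))
    def change_resolution(self, factor=1.1):
        self.calls.append(('change_resolution', factor))
    def max_iterations(self, new_value=None):
        self.calls.append(('max_iterations', new_value))


class FakeEvent(object):
    """Mimics the one attribute of a Matplotlib key event that we use."""
    def __init__(self, key):
        self.key = key


recorder = RecordingMandel()
navigator = KeyNavigator(recorder)
for key in ['left', '+', '2', '5', '0', 'enter', 'x']:
    navigator.on_key(FakeEvent(key))
```

In the real viewer the handler would be registered next to the click handler, with something like `fig.canvas.mpl_connect('key_press_event', navigator.on_key)`. Be aware that Matplotlib may already bind some of these keys to its own toolbar actions.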
10.13 Development style
Let's take a step back to think about how we developed this program.
- Quick and dirty first version. Easily done with Python's expressive nature, interactive development, and helpful packages such as IPython, Numpy and Matplotlib.
- Used Cython to optimize just the tiny portion of the code that needed to go faster. (We are running some very efficient low-level code, both in the libraries that we use, and in the Cython-generated portion that we wrote: but we have been spared most of the pain of dealing with the low-level details ourselves.)
- Continued flexible development with most code in high-level Python.
I submit that this style of development works well in many situations, especially ones where the nature of the code and what it is applied to are exploratory and where rapid interactive feedback is valuable, as is often the case in scientific programming.
11 Summary
- Basic Python is easy to learn
- Python is highly expressive
- Python's consistent underlying principles make it relatively easy to get a solid understanding of the language.
- A lot of magic is within easy reach in Python
- Python is being used widely for scientific programming, and many packages exist which make the experience productive and pleasant
- Write your code in a high-level language, then selectively optimize where necessary. Cython is a great tool for doing this in Python.