CompMattSci - Boosting your Colab project with Cython

What happens when your standard Python numerical computing libraries don't contain an operation you need? If you're forced to write a custom operation and it requires a for loop (i.e. you can't easily vectorize the operation using standard numpy/scipy operations), Cython may be a lifesaver. I recently encountered this when using the Method of Lines to solve some partial differential equations that model unsteady-state diffusion and reaction within a spherical catalyst pellet (more on this interesting chemical engineering problem in a future blog post). I'll present an example of setting up a custom finite difference calculation using Cython within a Google Colab notebook.

Within Colab, first load cython:

%load_ext cython

Next, cimport both cython and numpy, then write your function (remembering to declare return and all variable types). For some background, fd_interior_spherical_axisymmetic() numerically approximates the spatial derivate term for the axisymmetric spherical diffusion problem (see DOI: 10.1371/journal.pone.0135506 for details).

%%cython
cimport cython
cimport numpy as np

cdef inline np.ndarray[np.float64_t, ndim=1] fd_interior_spherical_axisymmetic(\
                                            np.ndarray[np.float64_t, ndim=1] f,
                                            np.float64_t dr_nondim):
  """
  f : numpy array for the concentration as a function of r (r from 0 to 1)
  dr : the distance between radial points (non-dimensionalized)

  returns :
    numerical approximation of the spatial derivative terms at the interior 
    discretized points of an axisymmetric, spherical particle, i.e. : 
    1/r^2 d(r^2 df/dr)/dr
  """

  cdef int i = 1
  cdef int endpts = 2
  cdef np.ndarray[np.float64_t, ndim=1] answer
  answer = np.zeros(f.shape[0]-endpts, dtype=np.float64)

  for i in range(1,f.shape[0]-1):
    answer[i-1]=((i+1)*f[i+1]- 2*i*f[i] + (i-1)*f[i-1])/(i*dr_nondim**2)

  return answer

def fd_interior_spherical_axisymmetic_wrapper(np.ndarray[np.float64_t, ndim=1] f,
                                              np.float64_t dr_nondim):
  return fd_interior_spherical_axisymmetic(f,dr_nondim)

import numpy as np
import timeit

Let's now define a pure pythonic function to do the same:

def pure_python_compute(f, dr_nondim):
  answer = np.zeros(len(f)-2)
  for i in range(1,f.shape[0]-1):
    answer[i-1]=((i+1)*f[i+1]- 2*i*f[i] + (i-1)*f[i-1])/(i*dr_nondim**2)
  return answer

Finally, let's compare their running times:

testarr = np.linspace(0,100,10000)
%timeit pure_python_compute(testarr,1)
%timeit fd_interior_spherical_axisymmetic_wrapper(testarr,1)

Which shows:

10 loops, best of 3: 20.7 ms per loop
10000 loops, best of 3: 65.3 µs per loop

This speedup of ~300x is absolutely critical given the number of times this function must be evaluated when solving a set of PDEs using the Method of Lines.