Conway's Life Vectorised

My first attempt at writing Conway's Life in python using tkinter produced life_slow.py. Unfortunately it lives up to its name. On a Pi 4 it achieves 0.70 fps (it prints its frame rate as it runs), and therefore it will take almost half an hour to find the end state of the initial configuration of five live cells.

The update of the grid was been written as

def evolve(grid):
    size=len(grid)
    new_grid=np.zeros((size,size),dtype=np.uint8)
    for i in range(0,size):
        for j in range(0,size):
# Note =-grid[i,j] is wrong, as grid is unsigned
            neighbours=0-grid[i,j];
            for ii in range(max(0,i-1),min(i+1,size-1)+1):
                for jj in range(max(0,j-1),min(j+1,size-1)+1):
                    if (grid[ii,jj]==1): neighbours+=1
            if (neighbours==2):
                new_grid[i,j]=grid[i,j]
            elif (neighbours==3):
                new_grid[i,j]=1
    return new_grid

For each grid cell i,j it first counts the number of neighbours by setting up the ii and jj loops to scan over the eight neighbours and the cell itself, and the cell's own state is subtracted. The min and max functions ensure that the scan does not go beyond the edges of the array -- an element in the corner has just four cells (including itself) to scan, not nine.

The above is how one might write the code in a traditional compiled language, such as C, C++ or Fortran, but loops work slowly in interpretted languages like python. Each statement is translated every time it is reached, then evaluated, and the translation forgotten. A compiler would translate the whole program into a form directly comprehensible to the CPU first, and then it is run, but that is not how python works.

The above is also slightly, but only slightly, too wary about exceeding array bounds. Numpy does not mind if one specifies array indices which are too large, and automatically reduces them to the largest valid value. Too small is another matter, for numpy regards negative indices as being valid relative to the maximum value, with -1 being the maximum valid value. So if a is an array of six element, which will have indicies 0 to 5, then a[1:2] is the second and third elements, a[5:6] is the last element, and a[-1:0] is equivalent to a[5:-1] and is hence an empty array, not the first element.

The quick and easy way of accelerating the above is to use numba to compile just the above function. This is done in life_jit.py. This improves things to about 20 fps, almost thirty times as fast. But I regard numba as rather dangerous in python. Not all functions can be compiled with it, and global variables in compiled functions become read-only constants, which can cause code to start behaving differently without warning. On some platforms (such as MacOS) numba can be hard to install.

However, a good python programmer worried about speed would never write code like the above when there are alternatives such as

def evolve(grid):
    size=len(grid)
    new_grid=np.empty((size,size),dtype=np.uint8)
    for i in range (0,size):
        if (i==0):
            column=grid[0,:]+grid[1,:]
        elif (i==size-1):
            column=grid[size-2,:]+grid[size-1,:]
        else:
            column=grid[i-1,:]+grid[i,:]+grid[i+1,:]
        neighbours=column-grid[i,:]
        neighbours[1:size]+=column[0:size-1]
        neighbours[0:size-1]+=column[1:size]
        amask=np.logical_and((neighbours==2), (grid[i,:]==1))
        amask=np.logical_or(amask, (neighbours==3))
        new_grid[i,:]=amask.astype(np.uint8)

    return new_grid

This code, called life_vec.py, avoids most loops in favour of operations on arrays, and achieves 18 fps, so 90% of the speed of the numba version.

It is arguably less clear. It takes a little thought to realise that the array neighbours really does end up with a count of the number of neighbours for a column of the grid. It does, for firstly neighbouring columns are summed into the column array, and then this is accumulated with offsets of -1, 0 and 1, which effectively sums over neighbouring rows.

Some of the syntax is perhaps a little obscure. The expression (neighbours==2) produces a logical array of the same size as the array neighbours whose values are true when the condition is true for the corresponding element in neighbours, and similarly for (grid==1). The np.logical_ operations perform the logical operations element-wise on a pair of arrays, and then the .astype line converts the logical array to an integer array.

The code is not perfectly optimised. But now the interpreter needs to translate the lines in the inner loop size times per update of the 2D array, which is 120 times using the example initialisation. In the code with multiple loops, the line

  if (grid[ii,jj]==1): neighbours+=1

was being passed through the interpreter about 9xsize2 times for each update, which, with size=120, is almost 130,000 times per update.

Python programmers refer to the process of replacing loops with scalar operations on array elements by operations on whole dimensions of an array at once as vectorisation, hence the name of this example.

The code given on the main Life page adds a basic rle file reader, and pads rather than clears the grid if the grid size is increased.

Further Ideas

The Life program above allows one to stop the simulation. When stopped, one can toggle individual cells with a mouse-click, and even add one type of "glider" with a right mouse click. There are many different sorts of small pattern which have been identified within Life, and some can be observed by replacing the lines between

## Initial configuration

and

## End initial configuration

with

size=40
grid=np.zeros((size,size),dtype=np.uint8)

grid[9][21]=1
grid[10][20]=1
grid[10][21]=1
grid[11][20]=1
grid[11][21]=1
grid[12][20]=1

grid[2][2]=1
grid[2][3]=1
grid[2][4]=1
         
glider(grid,2,32)

for i in range(21,31):
    grid[i][8]=1

grid[25][35]=1
grid[26][35]=1
grid[27][35]=1
grid[28][35]=1
grid[29][34]=1
grid[25][34]=1
grid[25][33]=1
grid[26][32]=1
grid[29][32]=1

speed=.25

For this small size a delay is added to make it easier to see how things evolve. The above shows patterns with periods of one, two and fifteen, as well as two patterns which translate themselves indefinitely and so disappear off the grid.

Other simple starting positions can be found on Wikipedia.