More MPI

Using what we've learned so far, we can now sum up an array in parallel.

suma.py
import mpi
a = range(10)

# divide up the work
n = len(a)/mpi.size
ilo = mpi.rank*n
ihi = (mpi.rank+1)*n-1
if mpi.rank+1 == mpi.size:
  ihi = len(a)-1

# sum one piece of the array
s = 0
for i in range(ilo,ihi+1):
  s += a[i]

# call allreduce and print
s = mpi.allreduce(s,mpi.SUM)
if mpi.rank == 0:
  print 'sum=',s
$ mpiexec -np 4 python ./suma.py
sum= 45
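The index arithmetic in suma.py can be checked without MPI at all. Here is a plain-Python sketch (the process count of 4 and array length of 10 are assumptions matching the run above) that computes each rank's [ilo, ihi] range and verifies that the pieces cover every index exactly once:

```python
# Sketch: verify the block-partition arithmetic from suma.py serially.
# size=4 and len(a)=10 are assumptions matching the run above.
a = list(range(10))
size = 4
n = len(a) // size            # integer division, as in the pyMPI example
covered = []
for rank in range(size):
    ilo = rank * n
    ihi = (rank + 1) * n - 1
    if rank + 1 == size:      # the last rank picks up the remainder
        ihi = len(a) - 1
    covered.extend(range(ilo, ihi + 1))
print(covered == list(range(len(a))))   # each index covered exactly once
```

Because 10 doesn't divide evenly by 4, ranks 0-2 each get 2 elements and the last rank gets 4; the check confirms nothing is skipped or double-counted.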

In addition to using allreduce, you can send a message directly between two processes with send and recv. The send function takes the data and a destination rank (in this case proc 1); the recv function specifies the source rank it will receive data from (in this case proc 0). Note the variable named 'rc' that comes back from recv. We'll talk more about that in a moment.

send.py
import random
import mpi

if mpi.rank==0:
  n = random.randint(1,100)
  print 'sending',n
  mpi.send(n,1)
elif mpi.rank==1:
  n,rc = mpi.recv(0)
  print 'received',n
$ mpiexec -np 2 python ./send.py
sending 71
received 71

Problem: Note that we print on both rank 0 and rank 1. Does this create the possibility of overlapping output? Why, or why not?

If you don't care where your data comes from, you can receive from any process by passing mpi.ANY_SOURCE to recv. Afterwards, you can use the rc variable to figure out what the source actually was.

send2.py
import random
import mpi

if mpi.rank==0:
  n = random.randint(1,100)
  print 'sending',n
  mpi.send(n,1)
elif mpi.rank==1:
  n,rc = mpi.recv(mpi.ANY_SOURCE)
  print 'received',n,'source=',rc.source
$ mpiexec -np 2 python ./send2.py
sending 1
received 1 source= 0

In the example above, the variable 'rc.source' contains the rank of the mpi process that sent the message.

Using the 'suma' code above, we were able to sum up an array in parallel. We "cheated" a little, though, because each process computed the array "a" independently. Here's how we can communicate the array from rank 0 to the other processes instead.

suma2.py
import mpi

if mpi.rank == 0:
  # compute the array
  a = range(10)
  # send the array to everyone else
  for i in range(1,mpi.size):
    mpi.send(a,i)
else:
  # receive the array
  a,rc = mpi.recv(0)

# divide up the work
n = len(a)/mpi.size
ilo = mpi.rank*n
ihi = (mpi.rank+1)*n-1
if mpi.rank+1 == mpi.size:
  ihi = len(a)-1

# sum one piece of the array
s = 0
for i in range(ilo,ihi+1):
  s += a[i]

# call allreduce and print
s = mpi.allreduce(s,mpi.SUM)
if mpi.rank == 0:
  print 'sum=',s
$ mpiexec -np 4 python ./suma2.py
sum= 45
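To see how wasteful the whole-array approach is, just count the items rank 0 transmits. A quick back-of-the-envelope sketch (the array length of 10 and process count of 4 are assumptions matching the runs above):

```python
# Rough count of the items rank 0 sends: whole array to every other
# rank (as in suma2) versus just each rank's slice.
length, size = 10, 4
n = length // size
whole_array = (size - 1) * length   # suma2: a full copy to each of the other ranks
slices = length - n                 # slice version: everything except rank 0's own piece
print(whole_array)                  # 30 items
print(slices)                       # 8 items
```

The gap grows with both the array length and the number of processes, so on a real cluster the slice-based version saves substantial communication.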

Of course, the problem with the above program is that it sends the entire array to each child process when each one only needs a piece. The rank 0 process should send each of the other ranks only the part it needs. This program fixes that problem, and introduces a new piece of Python syntax: the array slice on line 14.

suma3.py
 1  import mpi
 2
 3  n = 0
 4  if mpi.rank == 0:
 5    # compute the array
 6    a = range(10)
 7    n = len(a)/mpi.size
 8    # send the array to everyone else
 9    for r in range(1,mpi.size):
10      ilo = r*n
11      ihi = (r+1)*n-1
12      if r+1 == mpi.size:
13        ihi = len(a)-1
14      mpi.send(a[ilo:ihi+1],r)
15  else:
16    # receive the array
17    a,rc = mpi.recv(0)
18
19  if mpi.rank == 0:
20    n = len(a)/mpi.size
21  else:
22    n = len(a)
23
24  # sum one piece of the array
25  s = 0
26  for i in range(n):
27    s += a[i]
28
29  # call allreduce and print
30  s = mpi.allreduce(s,mpi.SUM)
31  if mpi.rank == 0:
32    print 'sum=',s
$ mpiexec -np 4 python ./suma3.py
sum= 45
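The slice syntax used in suma3 is worth a standalone look. A Python slice a[ilo:ihi+1] copies the elements from index ilo through ihi inclusive, because the stop index is exclusive:

```python
# Python slices: a[start:stop] includes start but excludes stop.
a = list(range(10))
print(a[2:5])            # elements at indices 2, 3, 4
ilo, ihi = 6, 9
piece = a[ilo:ihi + 1]   # indices 6..9 inclusive, like the piece suma3
print(piece)             # sends to the last rank
```

That exclusive stop index is exactly why suma3 writes a[ilo:ihi+1] rather than a[ilo:ihi].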

This seems like a lot of code for such a simple task, but that's typical: the parallel version of a program is usually longer and more intricate than the serial one.


Problems:

  1. What's wrong with this program?
    bad60.py
    import mpi
    a = range(10)

    # divide up the work
    n = len(a)/mpi.size
    ilo = mpi.rank*n
    ihi = (mpi.rank-1)*n+1
    if mpi.rank+1 == mpi.size:
      ihi = len(a)-1

    # sum one piece of the array
    s = 0
    for i in range(ilo,ihi+1):
      s += a[i]

    # call allreduce and print
    s = mpi.allreduce(s,mpi.SUM)
    if mpi.rank == 0:
      print 'sum=',s
  2. Modify the program "suma3" so that a is not simply set by range, but instead contains random numbers. After computing the answer in parallel, compute it again on process 0 and make sure you get the right answer.

  3. Ping pong test: Process 0 should send, then receive 100 messages. Process 1 should receive, then send 100 messages. At the end, print out the amount of time it took to do all that sending and receiving.

    The amount of time it takes to run a piece of code can be measured with the time function from Python's time module.

    timed.py
    import time

    tstart = time.time()
    ### measuring time below
    s = 0
    for i in range(1,10**7):
      s += i
    ### measuring time above
    tend = time.time()
    print 'time=',tend - tstart
    $ python ./timed.py
    time= 1.32153916359