More MPI
Using what we've learned so far, we can now sum up an
array in parallel.
| suma.py |
| 1 | import mpi |
| 2 | a = range(10) |
| 3 | |
| 4 | # divide up the work |
| 5 | n = len(a)/mpi.size |
| 6 | ilo = mpi.rank*n |
| 7 | ihi = (mpi.rank+1)*n-1 |
| 8 | if mpi.rank+1 == mpi.size: |
| 9 |     ihi = len(a)-1 |
| 10 | |
| 11 | # sum one piece of the array |
| 12 | s = 0 |
| 13 | for i in range(ilo,ihi+1): |
| 14 |     s += a[i] |
| 15 | |
| 16 | # call allreduce and print |
| 17 | s = mpi.allreduce(s,mpi.SUM) |
| 18 | if mpi.rank == 0: |
| 19 |     print 'sum=',s |
$ mpiexec -np 4 python ./suma.py
sum= 45
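To see how the index arithmetic in suma.py divides the work, it can help to run it for every rank in a single ordinary Python process. The sketch below is only an illustration and does not use MPI: the helper name `bounds` is hypothetical, the loop over `rank` stands in for the processes that mpiexec would start, and the final `sum` plays the role of allreduce. It is written in Python 3 syntax so it can be run without pyMPI.

```python
# Simulate the index arithmetic from suma.py for every rank, without MPI.
def bounds(rank, size, length):
    """Return the inclusive (ilo, ihi) index range that `rank` sums."""
    n = length // size           # integer division, like len(a)/mpi.size in Python 2
    ilo = rank * n
    ihi = (rank + 1) * n - 1
    if rank + 1 == size:         # the last rank absorbs any leftover elements
        ihi = length - 1
    return ilo, ihi

a = list(range(10))
size = 4                         # stands in for mpi.size
partial_sums = []
for rank in range(size):         # stands in for the ranks running side by side
    ilo, ihi = bounds(rank, size, len(a))
    partial_sums.append(sum(a[ilo:ihi + 1]))
    print(rank, (ilo, ihi), partial_sums[-1])

total = sum(partial_sums)        # the role played by mpi.allreduce(s, mpi.SUM)
print('sum=', total)
```

With 10 elements and 4 ranks, the first three ranks each get 2 elements and the last rank gets the remaining 4, which is why the special case for the last rank is needed.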
In addition to using allreduce, you can send a message directly
between two processes using send and recv. The send function
takes the data and a destination (in this case proc 1); the recv function takes
the source it will receive data from (in this case proc 0). Note the
variable named 'rc' that comes back from recv. We'll talk more
about that in a moment.
| send.py |
| 1 | import random |
| 2 | import mpi |
| 3 | |
| 4 | if mpi.rank==0: |
| 5 |     n = random.randint(1,100) |
| 6 |     print 'sending',n |
| 7 |     mpi.send(n,1) |
| 8 | elif mpi.rank==1: |
| 9 |     n,rc = mpi.recv(0) |
| 10 |     print 'received',n |
$ mpiexec -np 2 python ./send.py
sending 71
received 71
Problem: Note that we print on both rank 0 and rank 1. Does this
create the possibility of overlapping output? Why, or why not?
If you don't care where you receive your data from, you can
specify that too with mpi.ANY_SOURCE. Afterwards, you can use
the rc variable to figure out what the source was.
| send2.py |
| 1 | import random |
| 2 | import mpi |
| 3 | |
| 4 | if mpi.rank==0: |
| 5 |     n = random.randint(1,100) |
| 6 |     print 'sending',n |
| 7 |     mpi.send(n,1) |
| 8 | elif mpi.rank==1: |
| 9 |     n,rc = mpi.recv(mpi.ANY_SOURCE) |
| 10 |     print 'received',n,'source=',rc.source |
$ mpiexec -np 2 python ./send2.py
sending 1
received 1 source= 0
In the example above, the variable 'rc.source' contains the
rank of the mpi process that sent the message.
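The difference between naming a source and accepting any source can be illustrated with a toy in-memory model. This is not pyMPI and not how MPI is implemented: the class `ToyComm`, the `Status` tuple, and the `ANY_SOURCE` value of -1 are all hypothetical stand-ins, and everything runs in one process. Unlike a real recv, which waits for a message to arrive, this sketch raises an error when nothing matches.

```python
from collections import deque, namedtuple

ANY_SOURCE = -1                        # hypothetical sentinel for this toy model
Status = namedtuple('Status', 'source')

class ToyComm:
    """In-memory stand-in for message passing between ranks (not pyMPI)."""

    def __init__(self, size):
        # One inbox per rank; each entry is a (source, data) pair.
        self.inbox = [deque() for _ in range(size)]

    def send(self, data, dest, source):
        # Real MPI knows the sender's rank; here the caller passes it in.
        self.inbox[dest].append((source, data))

    def recv(self, rank, source=ANY_SOURCE):
        # Return the oldest queued message for `rank` that matches `source`.
        box = self.inbox[rank]
        for i, (src, data) in enumerate(box):
            if source == ANY_SOURCE or src == source:
                del box[i]
                return data, Status(source=src)
        raise LookupError('no matching message queued')

comm = ToyComm(size=3)
comm.send('from rank 2', dest=1, source=2)
comm.send('from rank 0', dest=1, source=0)

n1, rc1 = comm.recv(rank=1, source=0)  # explicit source: skips rank 2's message
print(n1, 'source=', rc1.source)
n2, rc2 = comm.recv(rank=1)            # any source: takes whatever is queued
print(n2, 'source=', rc2.source)
```

When the receiver accepts any source, the returned status is the only way to learn who sent the message, which is the job rc.source does in send2.py.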
Using the 'suma' code above, we were able to sum up an array
in parallel. We "cheated" a little, because each process
computed array "a" independently. Here's how we can
communicate the array to the other processes.
| suma2.py |
| 1 | import mpi |
| 2 | |
| 3 | if mpi.rank == 0: |
| 4 |     # compute the array |
| 5 |     a = range(10) |
| 6 |     # send the array to everyone else |
| 7 |     for i in range(1,mpi.size): |
| 8 |         mpi.send(a,i) |
| 9 | else: |
| 10 |     # receive the array |
| 11 |     a,rc = mpi.recv(0) |
| 12 | |
| 13 | # divide up the work |
| 14 | n = len(a)/mpi.size |
| 15 | ilo = mpi.rank*n |
| 16 | ihi = (mpi.rank+1)*n-1 |
| 17 | if mpi.rank+1 == mpi.size: |
| 18 |     ihi = len(a)-1 |
| 19 | |
| 20 | # sum one piece of the array |
| 21 | s = 0 |
| 22 | for i in range(ilo,ihi+1): |
| 23 |     s += a[i] |
| 24 | |
| 25 | # call allreduce and print |
| 26 | s = mpi.allreduce(s,mpi.SUM) |
| 27 | if mpi.rank == 0: |
| 28 |     print 'sum=',s |
$ mpiexec -np 4 python ./suma2.py
sum= 45
Of course, the problem with the above program
is that it sends the entire array to each child process
when it only needs to send a piece. The rank 0
process should send only what each of the other
ranks needs. This program fixes that problem,
and introduces a new piece of Python syntax,
the array slice on line 14.
| suma3.py |
| 1 | import mpi |
| 2 | |
| 3 | n = 0 |
| 4 | if mpi.rank == 0: |
| 5 |     # compute the array |
| 6 |     a = range(10) |
| 7 |     n = len(a)/mpi.size |
| 8 |     # send each of the other ranks its piece |
| 9 |     for r in range(1,mpi.size): |
| 10 |         ilo = r*n |
| 11 |         ihi = (r+1)*n-1 |
| 12 |         if r+1 == mpi.size: |
| 13 |             ihi = len(a)-1 |
| 14 |         mpi.send(a[ilo:ihi+1],r) |
| 15 | else: |
| 16 |     # receive this rank's piece |
| 17 |     a,rc = mpi.recv(0) |
| 18 | |
| 19 | if mpi.rank == 0: |
| 20 |     n = len(a)/mpi.size |
| 21 | else: |
| 22 |     n = len(a) |
| 23 | |
| 24 | # sum one piece of the array |
| 25 | s = 0 |
| 26 | for i in range(n): |
| 27 |     s += a[i] |
| 28 | |
| 29 | # call allreduce and print |
| 30 | s = mpi.allreduce(s,mpi.SUM) |
| 31 | if mpi.rank == 0: |
| 32 |     print 'sum=',s |
$ mpiexec -np 4 python ./suma3.py
sum= 45
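This seems like a long program for such a simple task, but
parallel programming is more difficult than serial programming:
even a simple sum requires dividing up the work, distributing
the data, and combining the results.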
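The slice syntax and the saving it buys can be checked in ordinary Python without MPI. The sketch below is an illustration under assumptions: `size` is a stand-in for mpi.size, the names `chunks`, `sent_whole`, and `sent_sliced` are hypothetical, and the code uses Python 3 syntax. It builds the piece each rank would hold and counts how many elements rank 0 sends when it ships the whole array to every other rank versus one slice per rank.

```python
a = list(range(10))
size = 4                        # stands in for mpi.size
n = len(a) // size

chunks = []
for r in range(size):
    ilo = r * n
    ihi = (r + 1) * n - 1
    if r + 1 == size:           # last rank takes the leftover elements
        ihi = len(a) - 1
    # a[ilo:ihi+1] copies elements ilo through ihi; the end index is exclusive
    chunks.append(a[ilo:ihi + 1])

print(chunks)

# Elements rank 0 sends to the other ranks under each scheme
sent_whole = (size - 1) * len(a)                # whole array to each other rank
sent_sliced = sum(len(c) for c in chunks[1:])   # one slice per other rank
print(sent_whole, sent_sliced)
```

Because the end index of a slice is exclusive, the inclusive upper bound ihi has to be written as ihi+1, which is what the send call in suma3.py does.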
Problems:
- What's wrong with this program?
| bad60.py |
| 1 | import mpi |
| 2 | a = range(10) |
| 3 | |
| 4 | # divide up the work |
| 5 | n = len(a)/mpi.size |
| 6 | ilo = mpi.rank*n |
| 7 | ihi = (mpi.rank-1)*n+1 |
| 8 | if mpi.rank+1 == mpi.size: |
| 9 |     ihi = len(a)-1 |
| 10 | |
| 11 | # sum one piece of the array |
| 12 | s = 0 |
| 13 | for i in range(ilo,ihi+1): |
| 14 |     s += a[i] |
| 15 | |
| 16 | # call allreduce and print |
| 17 | s = mpi.allreduce(s,mpi.SUM) |
| 18 | if mpi.rank == 0: |
| 19 |     print 'sum=',s |
- Modify the program "suma3" so that a is not simply set by range, but instead
contains random numbers. After computing the answer in parallel, compute it
again on process 0 and make sure you get the right answer.
- Ping pong test: Process 0 should send a message and then receive one, 100 times over.
Process 1 should receive a message and then send one, 100 times over. At the end, print
out the amount of time it took to do all that sending and receiving.
The amount of time it takes to run a program can be determined using
the time function.
| timed.py |
| 1 | import time |
| 2 | |
| 3 | tstart = time.time() |
| 4 | ### measuring time below |
| 5 | s = 0 |
| 6 | for i in range(1,10**7): |
| 7 |     s += i |
| 8 | ### measuring time above |
| 9 | tend = time.time() |
| 10 | print 'time=',tend - tstart |
$ python ./timed.py
time= 1.32153916359
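The timing pattern in timed.py can be wrapped in a small helper so that the same measurement can be reused around any piece of work, such as a loop of sends and receives. This is a minimal sketch in Python 3 syntax: the names `timed` and `workload` are hypothetical, and the workload here is just a shorter version of the summing loop, not an MPI exchange.

```python
import time

def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    tstart = time.time()
    result = fn(*args)           # the code being measured
    tend = time.time()
    return result, tend - tstart

def workload(limit):
    # Same kind of loop as timed.py, with a configurable upper bound.
    s = 0
    for i in range(1, limit):
        s += i
    return s

s, elapsed = timed(workload, 10**5)
print('s=', s, 'time=', elapsed)
```

The elapsed time will vary from run to run and from machine to machine, so only the computed sum is reproducible.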