Originally posted by: AbC
A variety of tools work with the IBM XL compilers. Some improve productivity during development (the IBM debugger, RDp), some help exploit architectural characteristics (compiler reports), and some help utilize the hardware.
The IBM Parallel Environment (PE) program product is a distributed-memory message-passing system supported on AIX and Linux. It is a separate IBM product. For detailed information, refer to
http://www-03.ibm.com/systems/software/parallel/index.html .
PE is designed for developing and executing parallel Fortran, C, or C++ programs. PE supports the two basic parallel programming models: SPMD and MPMD. In the SPMD (Single Program Multiple Data) model, every parallel task runs the same program, but each task works on a different set of data. In the MPMD (Multiple Program Multiple Data) model, each task may run a different program.
Now let’s look at how the XL compilers and PE work together to exploit a parallel computing environment. PE provides a set of invocation commands for compiling programs that run in the parallel environment. Each invocation command invokes the XL compiler with specific options and links in the special libraries (the Partition Manager and message passing interface libraries) needed to run in that environment. The names of these commands start with “mp”: for example, mpcc_r for C programs, mpxlf90_r for Fortran programs, and mpCC_r for C++ programs. To run a program in the parallel environment, PE also provides the poe command, which invokes the Parallel Operating Environment (POE) to load and execute programs on remote processor nodes.
We will walk through a few steps with a simple program to illustrate how the XL compilers and PE work together. In this example, a C program (main.c) calls Fortran procedures (in arr_cal.f90) for the computation and then prints the result in main:
main.c
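#include <stdio.h>
/* Fortran procedures from arr_cal.f90, callable from C through bind(c) */
void initialize_arr(float *arr, int n);
float summation(float *arr, int n);
int main(void)
{
    int N = 500;
    float arr[N], tot;          /* variable-length array (C99) */
    initialize_arr(arr, N);     /* fill arr with random numbers */
    tot = summation(arr, N);    /* add up the N elements */
    printf("tot = %f\n", tot);
    return 0;
}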
arr_cal.f90
! Fill arr with pseudorandom numbers
subroutine initialize_arr(arr, n) bind(c)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(kind=c_int), value :: n   ! declared before it is used as the bound of arr
  real(kind=c_float) :: arr(n)
  call random_number(arr)
end subroutine
! Return the sum of the n elements of arr
function summation(arr, n) result(tot) bind(c)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(kind=c_int), value :: n
  real(kind=c_float) :: arr(n), tot
  tot = sum(arr)
end function
On AIX, the following commands are used to compile and link the program.
$ mpxlf90_r -c arr_cal.f90
$ mpcc_r -c main.c
$ mpxlf90_r arr_cal.o main.o -o test1
The executable can be run on a cluster of machines by using the poe command. Before using poe, create a host file that specifies the hosts on which the program is to run. In addition, the same directory (with the same absolute path as the current directory on the local host) must exist on all the remote hosts.
host.list
! comments: have the following hosts available
machine1.ibm.com
machine1.ibm.com
machine1.ibm.com
machine2.ibm.com
The commands to copy the executable to the listed hosts and then run it there are as follows:
$ mcp ./test1 -procs 4 -hfile host.list
$ poe ./test1 -procs 4 -hfile host.list
tot = 248.259613
tot = 248.259613
tot = 248.259613
tot = 248.259613
The -procs option specifies how many tasks are created. In this case, four tasks are created, and each runs the same program (test1) on a host named in the host file (host.list). The mcp command copies the executable to the remote hosts. The -labelio option labels each line of output from the parallel tasks with the id of the task that produced it.
$ poe ./test1 -procs 4 -hfile host.list -labelio yes
2:tot = 248.259613
3:tot = 248.259613
1:tot = 248.259613
0:tot = 248.259613
This example is a simple program to illustrate how the XL compilers work with PE. If you need to develop a program that exploits a distributed environment, PE is an essential tool. PE also provides a parallel debugger (pdb) for debugging parallel programs.
In this blog, we briefly described the PE product and how it is used with the XL compilers to exploit a parallel environment. Real programs can, of course, be much more complicated and useful than the simple one discussed here. Some applications decompose a problem into smaller pieces and distribute the pieces to different hosts to work on. When the work finishes, the application collects the partial results from the hosts and combines them into the final result. We also showed how to use the poe command to run the program on remote hosts.