3.1.7.11.4. Hybrid parallel
The following is an example of compiling and running a hybrid parallel Fortran program (MPI between processes, thread parallelism within each process) as a multi-node job.
- Prepare the source program. The sample program is provided as /home/system/sample/Fortran/mpi/sample_hybrid.f.
 1       PROGRAM sample_mpi
 2       INCLUDE 'mpif.h'
 3       INTEGER,parameter :: RANGE=9000
 4
 5       INTEGER :: rank, size, root, ierror
 6       INTEGER :: i, j, data, result
 7       REAL(8), DIMENSION(RANGE, RANGE) :: a,b,c
 8
 9       result = 0
10       c(:,:) = 0.0d0
11
12       CALL MPI_INIT( ierror )
13       CALL MPI_COMM_RANK( MPI_COMM_WORLD, rank, ierror )
14       CALL MPI_COMM_SIZE( MPI_COMM_WORLD, size, ierror )
15
16       IF ( rank .EQ. 0 ) THEN
17          WRITE(*,*) 'MPI communication start. size=',size
18       ENDIF
19
20       DO j=1 ,RANGE
21          DO i=1, RANGE
22             a(i,j) = i+j*0.5
23             b(i,j) = i+j/(rank+1)
24             c(i,j) = a(i,j)+b(i,j)
25          ENDDO
26       ENDDO
27
28       data = c(1,1)/(rank+1)
29       root = 0
30       CALL MPI_REDUCE( data, result, 1, MPI_INTEGER,
31      &MPI_SUM, root, MPI_COMM_WORLD, ierror )
32
33       IF (rank .EQ. 0) THEN
34          WRITE(*,*) 'MPI communication end'
35          WRITE(*,*) 'result(',result,')'
36       ENDIF
37
38       CALL MPI_FINALIZE( ierror )
39       STOP
40       END PROGRAM sample_mpi
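- Compile the sample program. The -Kparallel option requests automatic thread parallelization, and optmsg=2 prints the optimization messages shown below.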
[_LNlogin]$ mpifrtpx -V -Kfast,parallel,optmsg=2 -Nlst=t -o sample_mpi sample_hybrid.f
frtpx: Fujitsu Fortran Compiler 4.1.0 tcsds-1.2.24
jwd_fortpx: Fujitsu Fortran Compiler 4.1.0 (Feb 26 2020 07:41:18)
 Fortran diagnostic messages: program name(sample_mpi)
  jwd5003p-i "sample_hybrid.f", line 10: Array description is parallelized.
  jwd6003s-i "sample_hybrid.f", line 10: SIMD conversion is applied to array description.
  jwd8663o-i "sample_hybrid.f", line 10: This loop is not software pipelined because the software pipelining does not improve the performance.
  jwd8202o-i "sample_hybrid.f", line 10: Loop unrolled 4 times.
  jwd5001p-i "sample_hybrid.f", line 20: DO loop with DO variable 'j' is parallelized.
  jwd6001s-i "sample_hybrid.f", line 21: SIMD conversion is applied to DO loop with DO variable 'i'.
  jwd8204o-i "sample_hybrid.f", line 21: This loop is software pipelined.
  jwd8205o-i "sample_hybrid.f", line 21: The software-pipelined loop is chosen at run time when the iteration count is greater than or equal to 128.
GNU assembler version 2.30 (aarch64-linux-gnu) using BFD version version 2.30-49.el7
GNU ld version 2.30-49.el7
  Supported emulations:
   aarch64linux
   aarch64elf
   aarch64elf32
   aarch64elf32b
   aarch64elfb
   armelf
   armelfb
   aarch64linuxb
   aarch64linux32
   aarch64linux32b
   armelfb_linux_eabi
   armelf_linux_eabi
   i386pep
   i386pe
flistpx: Fujitsu Listing Processor 4.1.0 (Jan 9 2020 14:46:36)
- Prepare the job script. A sample job script is provided as /home/system/sample/Fortran/mpi/job_mpi.sh.
#!/bin/sh
#PJM -L "node=2"
#PJM -L "rscgrp=small"
#PJM -L "elapse=10:00"
#PJM --mpi max-proc-per-node=4
#PJM -x PJM_LLIO_GFSCACHE=/vol000N
#PJM -g groupname
#PJM -s

# execute job
export OMP_NUM_THREADS=12
mpiexec -n 8 ./sample_mpi
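The process and thread counts in the job script have to agree with each other: the number of nodes times the processes per node should equal the value given to mpiexec -n, and the threads started on one node should not exceed its cores. The following is a minimal sketch of that arithmetic. The variable names are illustrative, and the figure of 48 compute cores per node is an assumption that this section does not state; substitute the value for the actual system.

```shell
#!/bin/sh
# Values copied from the sample job script.
nodes=2                # #PJM -L "node=2"
procs_per_node=4       # #PJM --mpi max-proc-per-node=4
threads_per_proc=12    # export OMP_NUM_THREADS=12

# ASSUMPTION (not stated in this section): 48 compute cores per node.
cores_per_node=48

# Total MPI ranks; this should match the value passed to "mpiexec -n".
total_ranks=$((nodes * procs_per_node))

# Threads running on one node; this should not exceed the core count.
threads_per_node=$((procs_per_node * threads_per_proc))

echo "total MPI ranks:  $total_ranks"
echo "threads per node: $threads_per_node"

if [ "$threads_per_node" -gt "$cores_per_node" ]; then
    echo "warning: more threads than cores on a node" >&2
fi
```

With the sample values this gives 8 ranks, matching mpiexec -n 8, and 48 threads per node.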
- Submit the job with the pjsub command.
[_LNlogin]$ pjsub job_mpi.sh
[INFO] PJM 0000 pjsub Job 122 submitted.
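When the submission is driven from a script, the job ID can be taken from the pjsub message and used to build the name of the standard output file. The sketch below works on the message text copied from the sample above rather than calling pjsub, and it assumes that the job name is the script file name, as in this example. The variable names are illustrative.

```shell
#!/bin/sh
# Message text copied from the sample; in a real script this would be the
# captured output of the pjsub command.
msg='[INFO] PJM 0000 pjsub Job 122 submitted.'

# ASSUMPTION: the job name is the script file name, as in this example.
script=job_mpi.sh

# Extract the number between "Job " and " submitted".
jobid=$(printf '%s\n' "$msg" | sed -n 's/.*Job \([0-9][0-9]*\) submitted.*/\1/p')

# Standard output file name in the form "job name.job ID.out".
outfile="${script}.${jobid}.out"

echo "job ID:      $jobid"
echo "output file: $outfile"
```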
- Check the execution result. The standard output is written to the file "Job name.Job ID.out".
[_LNlogin]$ cat job_mpi.sh.122.out
MPI communication start. size= 8
MPI communication end
result( 4 )
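The printed result can be checked by hand from lines 22 to 28 of the sample program, since each rank contributes only the truncated value of c(1,1)/(rank+1). The following sketch repeats that arithmetic outside MPI with awk, for the 8 ranks used in the job script. It is an illustration of the calculation, not part of the sample, and it relies on the integer division in line 23 and the truncation on assignment to the INTEGER variable in line 28.

```shell
#!/bin/sh
# Recompute the value each rank passes to MPI_REDUCE and add the values up,
# as MPI_SUM does on the root rank.
result=$(awk 'BEGIN {
    size = 8                            # mpiexec -n 8
    sum = 0
    for (rank = 0; rank < size; rank++) {
        a = 1 + 1 * 0.5                 # line 22: a(1,1) = i + j*0.5
        b = 1 + int(1 / (rank + 1))     # line 23: j/(rank+1) is integer division
        c = a + b                       # line 24: c(1,1)
        data = int(c / (rank + 1))      # line 28: truncated on assignment
        sum += data
    }
    print sum
}')

echo "result = $result"
```

Rank 0 contributes 3, rank 1 contributes 1, and the remaining ranks contribute 0, which gives the value 4 shown in the job output.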