6.3. How to execute background jobs
This section describes how to execute an MPI program in the background.
6.3.1. Background execution
The following is an example of how to specify background execution.
$ mpiexec -n 5 --vcoordfile vcoord_file ./a.out &
Option         Contents
-n             Specifies the number of parallel processes.
--vcoordfile   Specifies that parallel processes be allocated based on the list of coordinates specified in the VCOORD_FILE file.
See also
The coordinates in the VCOORD_FILE file are written in the same format as the rank-map-hostfile parameter of the --mpi option of the pjsub command. For details, refer to the following in the manual “MPI User’s Guide”: the explanation of --vcoordfile in “Options that can be specified in global_options”, and “VCOORD file format”.
6.3.2. Job creation example
Create the VCOORD_FILE files
Create each VCOORD_FILE file in the rank-map-hostfile parameter format. Make sure that the coordinates do not overlap between mpiexec commands that are executed simultaneously; overlapping coordinates cause an error. In this example, the files are named vcode1, vcode2, and vcode3. Their contents are shown below.
[vcode1]
(0)
(1)
(2)
(3)
(4)
[vcode2]
(5)
(6)
(7)
(8)
(9)
[vcode3]
(10)
(11)
(12)
(13)
(14)
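As a rough sketch, the same three files could be generated with a small shell loop instead of being written by hand. This assumes one one-dimensional coordinate per line; the variable names num_files and procs_per_file are introduced here only for illustration, and the coordinate form should be adapted to whatever the rank-map-hostfile format requires for the node shape in use.

```shell
#!/bin/bash
# Illustrative sketch: generate vcode1..vcode3 with 5 one-dimensional
# coordinates each, numbered consecutively so that no coordinate repeats.
# Assumption: one coordinate per line.
num_files=3        # number of mpiexec commands executed simultaneously
procs_per_file=5   # matches "-n 5" in the example

for ((j = 0; j < num_files; j++)); do
  file="vcode$((j + 1))"
  : > "$file"                                  # create or truncate the file
  for ((r = 0; r < procs_per_file; r++)); do
    echo "($((j * procs_per_file + r)))" >> "$file"
  done
done
```

Numbering the coordinates consecutively across the files is one simple way to keep them from overlapping.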
Create a job script
Create the job script as follows.
#!/bin/bash -x
#
#PJM -L "node=15" # Total number of nodes required by the processes executed at the same time
#PJM -L "rscgrp=small"
#PJM -L "elapse=01:00:00"
#PJM -g groupname
#PJM -x PJM_LLIO_GFSCACHE=/vol000N
#PJM -s
#
mpiexec -n 5 --vcoordfile vcode1 ./sample_mpi &
mpiexec -n 5 --vcoordfile vcode2 ./sample_mpi &
mpiexec -n 5 --vcoordfile vcode3 ./sample_mpi &
wait # Wait until the background jobs have finished
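A plain wait only blocks until the background commands finish. If the exit status of each mpiexec matters, one possible variant is to record each PID with $! and wait on the PIDs individually. The helper name wait_all below is hypothetical and is not part of the job system; this is a sketch, not a required pattern.

```shell
#!/bin/bash
# Illustrative helper (hypothetical name): wait for every given PID and
# return non-zero if any of the background commands failed.
wait_all() {
  local failed=0 pid
  for pid in "$@"; do
    if ! wait "$pid"; then
      echo "background command with PID $pid failed" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Intended use inside the job script (shown as comments because mpiexec
# is only meaningful inside a job):
#   pids=()
#   mpiexec -n 5 --vcoordfile vcode1 ./sample_mpi & pids+=("$!")
#   mpiexec -n 5 --vcoordfile vcode2 ./sample_mpi & pids+=("$!")
#   mpiexec -n 5 --vcoordfile vcode3 ./sample_mpi & pids+=("$!")
#   wait_all "${pids[@]}" || exit 1
```

Waiting on each PID keeps the job script alive until every background command has ended, as the plain wait does, while also letting the script react to a failed run.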
6.3.3. Notes on background execution
- The maximum number of background executions that can run simultaneously is 128. If the number exceeds 128, the background job is rejected with the message “[ERR.] PLE 0050 plexec cannot be executed any further.”
- When executing multiple background jobs, you must specify a VCOORD_FILE file.
- The coordinates in the VCOORD_FILE files must not overlap between background jobs running simultaneously. An error occurs if coordinates are duplicated.
- If the job script does not wait for completion with wait, processing of the job script ends regardless of whether background jobs are still running, and the job itself also ends.
- In principle, background execution cannot be used together with dynamic process generation. It can be used only when all of the following conditions are satisfied:
  - Multiple mpiexec commands are not executed at the same time.
  - The number of MPI_COMM_WORLD communicators of dynamic processes that exist simultaneously does not exceed 65535.
  - In one execution of the mpiexec command, the total number of calls to the MPI_COMM_SPAWN or MPI_COMM_SPAWN_MULTIPLE routine does not exceed 4294967295.
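The first and third notes above (the limit of 128 simultaneous executions and the ban on overlapping coordinates) can be checked before any mpiexec is launched. The following is a minimal sketch under two assumptions: each simultaneous mpiexec uses exactly one VCOORD file, and each file holds one coordinate per line so that coordinates can be compared as text. The function name check_vcoord_files is hypothetical.

```shell
#!/bin/bash
# Illustrative pre-flight check (hypothetical helper name).
# Arguments: the VCOORD files of all mpiexec commands that will run at once.
# Assumption: one coordinate per line; coordinates are compared as text.
check_vcoord_files() {
  local dup
  if [ "$#" -gt 128 ]; then
    echo "more than 128 simultaneous background executions requested" >&2
    return 1
  fi
  # Drop blank lines, strip spaces and tabs, then list any repeated line.
  dup=$(grep -hv '^[[:space:]]*$' -- "$@" | tr -d ' \t' | sort | uniq -d)
  if [ -n "$dup" ]; then
    echo "overlapping coordinates: $dup" >&2
    return 1
  fi
  return 0
}

# Example call before launching the background commands:
#   check_vcoord_files vcode1 vcode2 vcode3 || exit 1
```

Because the comparison is purely textual, a coordinate repeated within a single file is reported as well.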