
The system operations and development unit has carried out operations and maintenance management, user management, and user support for the supercomputer systems operated by R-CCS, in addition to research and development on the advanced management and operation of those systems. We have improved the job scheduler, file system, and MPI library (communication library) while analyzing the systems' operational statistics. We have also carried out network maintenance and management within R-CCS.
We have analyzed the execution status of jobs and optimized parameters of the scheduler and other components. We have also resolved various problems in system operations and have optimized system management, operations, and maintenance.
On the K computer, we improved system utilization by about 20% by optimizing the scheduling parameters.
Figure: example of system utilization improvement of the K computer
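As a rough illustration of the metric discussed above, system utilization can be thought of as the fraction of available node-hours that jobs actually consumed during an observation window. The sketch below works under that assumption; the `Job` record, the `utilization` function, and the sample numbers are all hypothetical and do not reflect the actual accounting used on the K computer.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; times are hours since the window start."""
    nodes: int
    start_hour: float
    end_hour: float


def utilization(jobs, total_nodes, window_hours):
    """Return consumed node-hours divided by available node-hours."""
    used = 0.0
    for job in jobs:
        # Count only the part of the job that falls inside the window.
        start = max(job.start_hour, 0.0)
        end = min(job.end_hour, window_hours)
        if end > start:
            used += job.nodes * (end - start)
    return used / (total_nodes * window_hours)


# Illustrative values only: a 100-node system observed for 24 hours.
sample_jobs = [
    Job(nodes=50, start_hour=0.0, end_hour=24.0),
    Job(nodes=25, start_hour=6.0, end_hour=18.0),
    Job(nodes=100, start_hour=20.0, end_hour=30.0),  # runs past the window
]
print(f"utilization: {utilization(sample_jobs, 100, 24.0):.3f}")
```

Under this definition, scheduling parameters that reduce idle gaps between jobs raise the numerator while the denominator stays fixed, which is one way a tuning change could show up as higher utilization.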
We have improved the system software (MPI library, compilers, etc.), enabling users to improve the performance of their applications without any code changes. We have also developed tools that support the use of the computer systems.
Figure: example of performance improvement of “Alltoallv” on the K computer
We have provided user support, including user management and consulting services.