System Software Development Team

Research summary

The supercomputer Fugaku will consist of more than 150,000 compute nodes, each of which has a manycore CPU that integrates many CPU cores on a single chip. In the Post-K (Fugaku) development project, we are tackling several challenging issues so that the system software can provide an efficient execution environment. Three of them are introduced below.

Issues on managing manycore CPU
The execution time of an application depends on how CPU cores and memory are allocated between the application and the system software, and on which system software services the application requests. We are developing new system software that provides the best execution environment for each application by selecting the system software services it needs. Application programs that run on Linux can run on it without any modification.
Issues on communication latency
Communication latency is not expected to improve as drastically as the performance of a compute node. As a result, communication latency will become the dominant part of an application's total execution time. We are designing new communication software in cooperation with application developers to reduce this latency.
Issues on big data movement
In many traditional large-scale supercomputers, such as the K computer, the data set used by an application is transferred from global storage to local storage, and then from local storage to the compute nodes. After the application finishes, the result data set is transferred from the compute nodes back to global storage via local storage. Reducing the data transfer time in such a system requires a high-performance network, which is expensive and consumes a lot of power. We are designing new system software that places an application's data set in the right location and reduces the data transfer time without extra network hardware.
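The staged data path described above can be illustrated with a rough cost model. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, the two hops are assumed to run one after the other with no overlap, and the same bandwidth is assumed in both directions. It does not describe the team's software.

```python
def staged_transfer_seconds(size_gb, global_local_gb_per_s, local_node_gb_per_s):
    """Time to move one data set across both hops of the staged path.

    Assumption: the hops are sequential, so their times simply add up.
    """
    return size_gb / global_local_gb_per_s + size_gb / local_node_gb_per_s


def job_io_seconds(input_gb, output_gb, global_local_gb_per_s, local_node_gb_per_s):
    """Total staging time for one job: stage-in of the input plus stage-out of the result."""
    stage_in = staged_transfer_seconds(input_gb, global_local_gb_per_s, local_node_gb_per_s)
    stage_out = staged_transfer_seconds(output_gb, global_local_gb_per_s, local_node_gb_per_s)
    return stage_in + stage_out


if __name__ == "__main__":
    # Hypothetical figures, chosen only to make the arithmetic easy to follow.
    total = job_io_seconds(input_gb=100, output_gb=50,
                           global_local_gb_per_s=10, local_node_gb_per_s=25)
    print(f"staging time: {total:.1f} s")
```

In this model every byte crosses two hops on the way in and two on the way out, so placing data closer to where it is used removes a term from the sum without requiring a faster network.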

Representative papers

1. Atsushi Hori, Min Si, Balazs Gerofi, Masamichi Takagi, Jai Dayal, Pavan Balaji, and Yutaka Ishikawa:
"Process-in-process: techniques for practical address-space sharing"
International Symposium on High-Performance Parallel and Distributed Computing (HPDC '18), (Karsten Schwan Best Paper Award), (2018).
2. Min Si, Antonio J. Peña, Jeff R. Hammond, Pavan Balaji, Masamichi Takagi, and Yutaka Ishikawa:
"Dynamic Adaptable Asynchronous Progress Model for MPI RMA Multiphase Applications"
IEEE Trans. Parallel Distrib. Syst, Vol. 29, No. 9, pp. 1975-1989, (2018).
3. Tatiana V. Martsinkevich, Balazs Gerofi, Guo-Yuan Lien, Seiya Nishizawa, Wei-keng Liao, Takemasa Miyoshi, Hirofumi Tomita, Alok N. Choudhary, and Yutaka Ishikawa:
"DTF: An I/O Arbitration Framework for Multi-component Data Processing Workflows"
Proceedings of the 33rd ISC High Performance Conference, (2018).
4. Masayuki Hatanaka, Masamichi Takagi, Atsushi Hori, and Yutaka Ishikawa:
"Offloaded MPI Persistent Collectives using Persistent Generalized Request Interface"
EuroMPI/USA 2017, (2017).
5. Balazs Gerofi, Rolf Riesen, Robert W. Wisniewski, and Yutaka Ishikawa:
"Toward Full Specialization of the HPC System Software Stack: Reconciling Application Containers and Lightweight Multi-kernels"
International Workshop on Runtime and Operating Systems for Supercomputers (ROSS), held in conjunction with ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), (Best Paper Award), (2017).
6. Takemasa Miyoshi, Guo-Yuan Lien, Shinsuke Satoh, Tomoo Ushio, Kotaro Bessho, Hirofumi Tomita, Seiya Nishizawa, Ryuji Yoshida, Sachiho A. Adachi, Jianwei Liao, Balazs Gerofi, Yutaka Ishikawa, Masaru Kunii, Juan Ruiz, Yasumitsu Maejima, Shigenori Otsuka, Michiko Otsuka, Kozo Okamoto, and Hiromu Seko:
"'Big Data Assimilation' Toward Post-Petascale Severe Weather Prediction: An Overview and Progress"
Proceedings of the IEEE 104(11): 2155-2179, (2016).
7. Jianwei Liao, Balazs Gerofi, Guo-Yuan Lien, Seiya Nishizawa, Takemasa Miyoshi, Hirofumi Tomita, and Yutaka Ishikawa:
"Toward a General I/O Arbitration Framework for netCDF based Big Data Processing"
International European Conference on Parallel and Distributed Computing (Euro-Par), (2016).
8. Balazs Gerofi, Masamichi Takagi, Gou Nakamura, Tomoki Shirasawa, Atsushi Hori, and Yutaka Ishikawa:
"On the Scalability, Performance Isolation and Device Driver Transparency of the IHK/McKernel Hybrid Lightweight Kernel"
IEEE International Parallel and Distributed Processing Symposium (IPDPS), (2016).
9. Balazs Gerofi, Masamichi Takagi, Yutaka Ishikawa, Rolf Riesen, Evan Powers, and Robert W. Wisniewski:
"Exploring the Design Space of Combining Linux with Lightweight Kernels for Extreme Scale Computing"
International Workshop on Runtime and Operating Systems for Supercomputers (ROSS), held in conjunction with ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), (Best Paper Award), (2015).
10. Taku Shimosawa, Balazs Gerofi, Masamichi Takagi, Gou Nakamura, Tomoki Shirasawa, Yuji Saeki, Masaaki Shimizu, Atsushi Hori, and Yutaka Ishikawa:
"Interface for Heterogeneous Kernels: A Framework to Enable Hybrid OS Designs targeting High Performance Computing on Manycore Architectures"
Proceedings of IEEE International Conference on High Performance Computing, (2014).

Team Leader Yutaka Ishikawa
