RIKEN Center for Computational Science

OVERVIEW: About the Advanced Institute for Computational Science

Processor Research Team

Developing Parallel-computing Models and Acceleration Technologies for Large-scale High-performance Computing

To achieve high-performance computing on a supercomputer such as the K computer or the supercomputer Fugaku, a huge number of computing nodes must cooperate, communicating with one another over an inter-node network. However, overall performance may be degraded by the considerable overhead of global communication and synchronization among the nodes. We are developing computing accelerators that achieve large-scale processing with less performance degradation by introducing a new parallel computing model based on a “Data-Flow” model with localized communication and synchronization. We are also developing data-flow accelerators in which custom-computing circuits are automatically generated by a high-level synthesis compiler for each target application. Such specially customized hardware structures allow us to achieve high-performance processing even for applications that conventional CPUs handle poorly. These research results are helping to advance the use of the K computer and are aiding the exploration of new computing models and architectures for future supercomputers.

Research Content

Low-power and high-performance numerical computing using our own hardware compiler to generate custom-computing acceleration
As the advancement of semiconductor technology based on Moore’s Law slows down, it will become difficult in the near future to improve computing performance with multi-core microprocessors. One promising solution to this problem is a reconfigurable custom computing machine, in which the software code of a target application is converted to run on customized accelerator hardware implemented on field-programmable gate arrays (FPGAs).

To date, we have developed a high-level synthesis compiler that generates stream-computing hardware modules based on a data-flow computing model, as well as a system that performs high-performance computing with the generated modules implemented on FPGAs. In the case of a tsunami simulation, for instance, we achieved twice the sustained performance and an eight-fold improvement in power efficiency using FPGAs compared with GPUs. These improvements were achieved by employing efficient subsystem structures tailored to the target application, including customized memory subsystems and data paths with additional pipelines. In addition, we have developed a real-time data-compression hardware module using multiple FPGAs to enhance memory and network bandwidth for high-performance computing.
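
As a rough software analogy for the stream-computing style described above, the following Python sketch chains two processing stages so that each value flows from one stage to the next without any global synchronization. It is only an illustration under stated assumptions: the function names (stencil_stream, scale_stream, run_pipeline), the 3-point stencil, and its weights are hypothetical and are not taken from the team's compiler or from the tsunami simulation.

```python
from collections import deque
from typing import Iterable, Iterator, List, Sequence


def stencil_stream(samples: Iterable[float],
                   weights: Sequence[float] = (0.25, 0.5, 0.25)) -> Iterator[float]:
    """Apply a weighted stencil to a stream while holding only a small window.

    The bounded deque stands in for a small on-chip buffer: the stage never
    needs the whole data set in memory, only the last len(weights) values.
    (Hypothetical stage, for illustration only.)
    """
    window = deque(maxlen=len(weights))
    for x in samples:
        window.append(x)
        # Emit one output per input once the window is full.
        if len(window) == len(weights):
            yield sum(w * v for w, v in zip(weights, window))


def scale_stream(samples: Iterable[float], factor: float) -> Iterator[float]:
    """A second stage that consumes the first stage's output value by value."""
    for x in samples:
        yield factor * x


def run_pipeline(data: Iterable[float]) -> List[float]:
    """Connect the stages; each stage only talks to its direct neighbor."""
    return list(scale_stream(stencil_stream(data), 2.0))


if __name__ == "__main__":
    print(run_pipeline([1.0, 2.0, 3.0, 4.0]))
```

In hardware, each stage would correspond to a pipelined circuit and the window to a small buffer placed next to it; the generator chain only mimics that structure, not its performance.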

Going forward, we will further advance these developments and also develop a new system that makes it easy to achieve high performance on massively large-scale, complex computers. We aim to establish a new computing model and architecture for high-performance computing in the Post-Moore era.


Team Leader Kentaro Sano

Biography: Detail
Annual Report (PDF, 933 KB)