Developing a Numerical Library for Fast, High-precision Simulation
To exploit the full computing potential of the K computer and Fugaku, and to produce simulations with higher efficiency and accuracy, it is necessary to take advantage of numerical software libraries that have been tuned to the needs of computer science and applied mathematics. We are researching and developing a large-scale, high-performance numerical library called KMATHLIB. Simulation programs commonly rely on library algorithms for solving systems of linear equations, computing eigenvalues, performing three-dimensional fast Fourier transforms, and generating long-period random numbers. We maintain the corresponding libraries, such as EigenExa, KMATH_FFT3D, and KMATH_RANDOM. We are modifying the existing KMATHLIB algorithms for use as components of KMATHLIB2, which works in a scalable manner on Fugaku.
Furthermore, we are promoting R&D of innovative algorithms that can deal with the unprecedented challenges raised by the K computer, such as the eigenvalue problem for nonsymmetric matrices and higher-order tensor calculations. In addition to modifying and developing such algorithms, we are extending a high-precision calculation framework developed for the K computer for use on Fugaku. We have also developed a method that reduces accumulated errors and guarantees the reproducibility of calculations by controlling the number of effective digits and removing the non-determinism hidden in parallel calculations and in repeated runs of those calculations. Collaboration is another important issue. We are collaborating with researchers and companies in Japan and overseas to establish fundamental technologies for numerical libraries that can remain in productive use over the long term.
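The reproducibility problem mentioned above can be illustrated with a small sketch. It is not the team's method: it uses only the Python standard library, and the helper names naive_sum and chunked_sum, along with the idea of "workers", are hypothetical. It shows that an ordinary floating-point sum changes with the number of workers and with the order of the inputs, while an exactly rounded sum (math.fsum, swapped in here as a stand-in for a reproducible reduction) does not.

```python
import math

def naive_sum(values):
    """Left-to-right floating-point accumulation; rounds after every add."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, n_workers):
    """Mimic a parallel reduction: each hypothetical worker sums one
    contiguous chunk, then the partial sums are combined in order."""
    size = math.ceil(len(values) / n_workers)
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)

values = [1e16, 1.0, -1e16, 1.0]     # the exact sum is 2.0
reordered = [1e16, -1e16, 1.0, 1.0]  # same numbers, different order

print(chunked_sum(values, 1))   # 1.0 (one worker)
print(chunked_sum(values, 2))   # 0.0 (two workers)
print(naive_sum(reordered))     # 2.0 (same data, different order)

# math.fsum returns the correctly rounded sum, so it is independent of order.
print(math.fsum(values), math.fsum(reordered))  # 2.0 2.0
```

The three naive results differ only because rounding happens at different points, which is the kind of hidden non-determinism a reproducible library has to remove. Exact summation is one of several possible remedies and is used here purely for illustration.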
World's Largest Dense Eigenvalue Computation
The solution of real symmetric dense eigenvalue problems is one of the fundamental matrix computations. To date, several new high-performance eigensolvers have been developed for petascale and post-petascale systems. One of these, the EigenExa eigensolver, has been developed in Japan. EigenExa provides two routines: eigen_s, which is based on traditional tridiagonalization, and eigen_sx, which employs a new method via a pentadiagonal matrix. Recently, we conducted a detailed performance evaluation of EigenExa using 4,800 nodes of the Oakleaf-FX supercomputer system. In this paper, we report the results of our evaluation, which focuses mainly on the differences between the two routines.
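To make "traditional tridiagonalization" concrete, here is a minimal serial sketch of Householder tridiagonalization in plain Python. It is a textbook formulation, not EigenExa's code or API: the function name householder_tridiagonalize and the test matrix are assumptions chosen for illustration, and the pentadiagonal route taken by eigen_sx is not shown.

```python
import math

def householder_tridiagonalize(a):
    """Reduce a real symmetric matrix (list of lists) to tridiagonal form.

    Returns T = Q^T A Q, which has the same eigenvalues as A.
    """
    n = len(a)
    t = [row[:] for row in a]  # work on a copy
    for k in range(n - 2):
        # Part of column k below the first subdiagonal entry must be zeroed.
        x = [t[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(c * c for c in x))
        if norm_x == 0.0:
            continue  # column is already in the desired form
        # Pick the sign that avoids cancellation when forming v[0].
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(c * c for c in v))
        # Unit reflector vector, padded with zeros for rows 0..k.
        w = [0.0] * (k + 1) + [c / norm_v for c in v]
        # Two-sided update T <- H T H with H = I - 2 w w^T.
        p = [sum(t[i][j] * w[j] for j in range(n)) for i in range(n)]
        kappa = sum(w[i] * p[i] for i in range(n))
        q = [p[i] - kappa * w[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                t[i][j] -= 2.0 * (w[i] * q[j] + q[i] * w[j])
    return t

# Small symmetric test matrix (illustrative only).
A = [[4.0, 1.0, -2.0, 2.0],
     [1.0, 2.0, 0.0, 1.0],
     [-2.0, 0.0, 3.0, -2.0],
     [2.0, 1.0, -2.0, -1.0]]
T = householder_tridiagonalize(A)
for row in T:
    print(["%9.5f" % c for c in row])
```

Because each step is an orthogonal similarity transformation, the trace and Frobenius norm of the matrix are preserved, which gives a quick sanity check. A distributed solver has to organize these same updates across many nodes, and the cost of doing so is where the performance differences discussed in this abstract arise.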
The results clearly indicate both the advantages and disadvantages of eigen_sx over eigen_s, which will contribute to further performance improvement of EigenExa. The obtained results are also expected to be useful for other parallel dense matrix computations, in addition to eigenvalue problems. We have successfully solved the world's largest-scale dense eigenvalue problem (dimension one million) with EigenExa, using all nodes (82,944 processors) of the K computer, in 3,464 seconds. EigenExa achieved 1.7 PFLOPS (16% of the K computer’s peak performance), the world's highest performance for solving an eigenvalue problem of a dense matrix.
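The scale of these figures can be checked with a few lines of arithmetic. The sketch below takes the dimension, processor count, and sustained rate from the text. Two values are assumptions not stated here: 8-byte double-precision matrix entries, and a peak of 128 GFLOPS per processor (the commonly published figure for the K computer's SPARC64 VIIIfx).

```python
N = 1_000_000                    # matrix dimension (from the text)
PROCESSORS = 82_944              # processors used (from the text)
SUSTAINED_PFLOPS = 1.7           # achieved performance (from the text)
BYTES_PER_ENTRY = 8              # assumption: double precision
PEAK_GFLOPS_PER_PROCESSOR = 128  # assumption: published SPARC64 VIIIfx peak

# Storage for one dense N x N matrix.
matrix_bytes = N * N * BYTES_PER_ENTRY
matrix_tb = matrix_bytes / 1e12
per_processor_mb = matrix_bytes / PROCESSORS / 1e6

# Aggregate peak and the fraction that was sustained.
peak_pflops = PROCESSORS * PEAK_GFLOPS_PER_PROCESSOR / 1e6
fraction_of_peak = SUSTAINED_PFLOPS / peak_pflops

print(f"one dense matrix: {matrix_tb:.1f} TB")              # 8.0 TB
print(f"per processor:    {per_processor_mb:.1f} MB")        # 96.5 MB
print(f"aggregate peak:   {peak_pflops:.2f} PFLOPS")         # 10.62 PFLOPS
print(f"fraction of peak: {fraction_of_peak:.0%}")           # 16%
```

Under these assumptions a single copy of the matrix occupies about 8 TB, so the problem is only tractable when distributed across the machine, and 1.7 PFLOPS corresponds to the 16% of peak quoted above.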
Team leader, Large-scale Parallel Numerical Computing Technology Research Team, AICS, RIKEN (-present)
Guest Scientist, High Performance Computing Center Stuttgart
Assistant Professor, University of Electro-Communications
Researcher, Japan Atomic Energy Research Institute
Graduated from Applied Systems and Science, Graduate School, Division of Engineering, Kyoto University