EigenExa
Overview
EigenExa, a part of KMATHLIB, is a high-performance eigensolver.
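EigenExa was developed at the RIKEN Center for Computational Science to provide high-performance, scalable, large-scale eigenvalue computation on supercomputing environments ranging from post-petascale machines to Fugaku. Within the FS2020 development project, collaborative research is under way with several priority-issue applications. Versions 2.5 and later, which target fast computation on Fugaku and other supercomputers in Japan, are currently being prepared for release.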
Downloads
- EigenExa version 2.12 (tar.gz, 2376KB) (October 25, 2022)
- EigenExa version 2.11 (tar.gz, 2375KB) (December 1, 2021)
- [Upgrade] Experimental support for a Hermitian solver, eigen_h.
- [Serious] Fix a numerical error introduced in v2.10: the tall-skinny QR coded in eigen_prd_t4x.F of eigen_sx was too sensitive to tiny values and to forced-double truncation. Whether the error appears may depend on the compiler version and the code generator.
- Modify the pointer attribute to 'allocatable' to avoid automatic deallocation on exit from callee routines.
- Modify some code to pass the strict checks of debugging tools for Fortran 95 and some Fortran 2003 extensions.
- Fix some bugs, for example, a missing private attribute on some variables for OpenMP.
- EigenExa version 2.10 (tar.gz, 2307KB) (October 17, 2021)
- [Serious] Bug fix for an invalid allreduce result in DC. It happened very rarely, when the data to be transferred was shorter than the number of participating processes.
- [Serious] Bug fix for inconsistent API interpretation of DLAED4 when K is less than or equal to 2. This bug occurred when many deflations were carried out and sub-matrices shrank to size 1 or 2, so it was rarely seen.
- [Serious] Bug fix for non-deterministic behavior of the DC branch, which happened when an uninitialized variable was referenced in the branch condition and was affected by side effects of other modules, etc. It was fixed when 2.9 was released but was not noted in that release.
- Reduce the internal data capacity in the TRD and DC routines.
- Fix the installation of Fortran modules.
- EigenExa version 2.9 (tar.gz, 2302KB) (September 24, 2021)
- Make the flops count in the DC kernels precise.
- Modify trbak not to multiply D^{-1} and TRSM.
- Add enable/disable-switch for building the shared library.
- Add detection of memory allocation failures.
- EigenExa version 2.8 (tar.gz, 1533KB) (August 20, 2021)
- Modify the DC kernel to reduce intermediate buffer storage.
- Bug fix in a t1 loop structure.
- Update the error check routine.
- Fix the Makefile to add the missing Fortran module.
- EigenExa version 2.7 (tar.gz, 1507KB) (April 9, 2021)
- Modify the compilation rules for the static/shared libraries defined in src/Makefile.am.
- Performance tweak: the compilation options no longer use -fPIC when building a static library.
- The license document is now packaged as a separate file (in version 2.6 the license notice was stated in the User's Manual).
- EigenExa version 2.6 (tgz, 1202KB) (November 1, 2020)
- This version applies a communication-avoiding technique to the Householder tridiagonalization, together with a new process mapping for load balancing in the divide-and-conquer method.
- EigenExa version 2.4b (tgz, 329KB) (August 20, 2018)
- [Serious] Bug fix for incorrect data redistribution, which might violate allocated memory. The bug could occur when the number of processes, P=Px*Py, is large and Px and Py are nearly equal but not equal.
- This version only fixes this serious bug.
- EigenExa version 2.3m (tgz, 500KB) (August 20, 2018)
- [Serious] Bug fix for incorrect data redistribution, which might violate allocated memory. The bug could occur when the number of processes, P=Px*Py, is large and Px and Py are nearly equal but not equal.
- This version only fixes this serious bug.
- EigenExa version 2.4p1 (tgz, 334KB) (patch, 13KB) (May 25, 2017)
- [Serious] Bug fix for incorrect data redistribution in eigen_s.
- Major change to an Autoconf and Automake based build framework.
- If you need older versions, please contact us (imamura.toshiyuki [at] riken.jp).
Publications
- Shuhei Kudo, and Toshiyuki Imamura: Cache-efficient implementation and batching of tridiagonalization on manycore CPUs: HPC Asia 2019, Vanburgh Hotel, Guangzhou, China, Jan. 15th, 2019, Won the Best Paper Award in HPC Asia 2019.
- Takeshi Fukaya, Toshiyuki Imamura, and Yusaku Yamamoto: A Case Study on Modeling the Performance of Dense Matrix Computation: Tridiagonalization in the EigenExa Eigensolver on the K Computer. IPDPS Workshops 2018: May 2018, pp. 1113-1122.
- Yusuke Hirota, and Toshiyuki Imamura, Performance Analysis of a Dense Eigenvalue Solver on the K Computer, Proc. the 36th JSST Annual International Conference on Simulation Technology, Oct. 2017.
- Toshiyuki Imamura, Takeshi Fukaya, Yusuke Hirota, Susumu Yamada, and Masahiko Machida: “CAHTR: Communication-Avoiding Householder Tridiagonalization”, Proceedings of ParCo2015, Advances in Parallel Computing, Vol.27: Parallel Computing: On the Road to Exascale, p.381-390, doi:10.3233/978-1-61499-621-7-381, 2016.
- Yusuke Mukunoki, Susumu Yamada, Narimasa Sasa, Toshiyuki Imamura, and Masahiko Machida, Performance of Quadruple Precision Eigenvalue Solver Libraries QPEigenK & QPEigenG on the K Computer, poster presentation, ISC2016, Best poster award in ‘HPC in Asia’.
- Takeshi Fukaya, and Toshiyuki Imamura: “Performance Evaluation of the EigenExa Eigensolver on Oakleaf-FX: Tridiagonalization Versus Pentadiagonalization”, Proceedings of the Parallel and Distributed Processing Symposium Workshop (IPDPSW, PDSEC 2015), p.960-969, doi:10.1109/IPDPSW.2015.128, 2015.
- Toshiyuki Imamura, Susumu Yamada, and Masahiko Machida: “Eigen-G: GPU-based eigenvalue solver for real-symmetric dense matrices”, Proceedings of PPAM2013, Lecture Notes in Computer Science (LNCS), Vol.8384, p.673-682, doi:10.1007/978-3-642-55224-3, 2014.
- Toshiyuki Imamura, Susumu Yamada, and Masahiko Machida: “A High Performance SYMV Kernel on a Fermi-core GPU”, Proceedings of VECPAR2012, Lecture Notes in Computer Science (LNCS), Vol.7851, p.59-71, doi:10.1007/978-3-642-38718-0, 2013.
- Toshiyuki Imamura, Susumu Yamada, and Masahiko Machida, “Development of a High Performance Eigensolver on the Peta-Scale Next Generation Supercomputer System”, Progress in Nuclear Science and Technology, the Atomic Energy Society of Japan, Vol. 2, pp.643-650, 2011.
- Huu Phuong Pham, Toshiyuki Imamura, Susumu Yamada, and Masahiko Machida, “Novel approach in a divide and conquer algorithm for eigenvalue problems of real symmetric band matrices”, Proceedings of Joint International Conference on Supercomputing in Nuclear Applications and Monte Carlo 2010 (SNA+MC2010), Tokyo, Japan, 2010.
- Toshiyuki Imamura, Takuma Kano, Susumu Yamada, Masahiko Okumura, and Masahiko Machida, “High-Performance Quantum Simulation for Coupled Josephson Junctions on the Earth Simulator: A challenge to Schrodinger Equation on 256^4 Grids”, International Journal of High Performance Computing Applications, SAGE publications, Vol. 24, No. 3, pp. 319-334, 2010.
- Toshiyuki IMAMURA, Susumu YAMADA, and Masahiko MACHIDA, “Narrow-band reduction approach of a DRSM eigensolver on a multicore-based cluster system”, Parallel Computing: From Multicores to GPU’s to Petascale, Chapman, B. et al.(eds), Advances in Parallel Computing 19, IOS Press, pp.91-98, 2010.
- Susumu Yamada, Toshiyuki Imamura, Takuma Kano and Masahiko Machida, “High-Performance Computing for Exact Numerical Approaches to Quantum Many-Body Problems on the Earth Simulator”, ACM/IEEE SC06, Tampa, USA, 2006. (Selected as one of the finalist papers of Gordon Bell Prize 2006).[CD-ROM]