High Performance Artificial Intelligence Systems Research Team
Team Leader: Mohamed Wahib
mohamed.attia [at] riken.jp (please change [at] to @; Lab location: Tokyo)
Biography
- 2022-present: Team Leader, High Performance Artificial Intelligence Systems Research Team, R-CCS, RIKEN
- 2017: Senior Research Scientist, Real World Big Data Computing Open Innovation Laboratory (RWBC-OIL), AIST; Visiting Researcher, AICS (renamed R-CCS in 2018), RIKEN; Specially Appointed Researcher, Tokyo Tech
- 2012: Postdoctoral Researcher, AICS, RIKEN
- 2012: Ph.D., Hokkaido University
Keywords
- High Performance Artificial Intelligence Systems
- Intelligent Programming Systems
- Performance Modeling of AI Systems (e.g., Deep Learning)
- Scalable Deep Learning
- Convergence of AI and Simulation
Research summary
The High Performance Artificial Intelligence Systems Research Team is an R-CCS laboratory focused on the convergence of HPC and AI, namely high-performance systems, software, and algorithms research for artificial intelligence/machine learning. In collaboration with other institutes engaged in HPC- and AI-related research, in Japan and globally, the team seeks to develop next-generation AI technology that exploits state-of-the-art high-performance computing facilities, including Fugaku. Specifically, we conduct research on next-generation AI systems along the following topics:
- Extreme speedup and scalability of deep learning: Achieve extreme scalability of deep learning in large-scale supercomputing environments, including Fugaku (formerly post-K), by extending the latest deep learning algorithms and frameworks (a minimal data-parallel sketch follows this list).
- Performance analysis of deep learning: Accelerate computational kernels for AI on state-of-the-art hardware architectures by analyzing deep learning and other machine learning/AI algorithms, measuring their performance, and constructing performance models (see the roofline sketch below).
- Acceleration of modern AI algorithms: Accelerate advanced AI algorithms that demand massive computational resources, such as ultra-deep neural networks and high-resolution GANs over images, using extreme-scale deep learning systems.
- Acceleration of HPC algorithms using machine learning: Accelerate HPC algorithms and applications using empirical models based on machine learning (see the regression sketch below).
- Intelligent programming systems: Use AI to auto-generate programs that can adapt to and withstand the complexity and divergence of hardware design.
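To make the first topic concrete, here is a minimal sketch of synchronous data-parallel training with PyTorch's DistributedDataParallel, one common route to scaling deep learning across many processors. The model, batch, and hyperparameters are hypothetical placeholders; this illustrates the general pattern, not the team's own framework or its Fugaku configuration.

```python
# Minimal synchronous data-parallel training sketch (launch with torchrun).
# All model/data choices below are illustrative placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")          # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda()       # placeholder model
    model = DDP(model)                               # all-reduces gradients across ranks
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for _ in range(10):
        x = torch.randn(64, 1024, device="cuda")     # placeholder per-rank batch
        loss = model(x).square().mean()
        opt.zero_grad()
        loss.backward()                              # gradient all-reduce overlaps backprop
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Because each rank processes its own batch, the global batch size grows with the number of ranks, which is exactly the regime where data shuffling and I/O questions such as those studied in the IPDPS 2022 paper below become important.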
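For the performance-analysis topic, the simplest example of a performance model is the roofline model, which bounds a kernel's attainable throughput by the minimum of peak compute and memory bandwidth times arithmetic intensity. The sketch below uses invented hardware numbers purely for illustration.

```python
# Toy roofline model: attainable GFLOP/s = min(peak, bandwidth * intensity).
# The peak/bandwidth figures are hypothetical, not measurements of any machine.
def attainable_gflops(peak_gflops: float, bw_gb_s: float,
                      flops: float, bytes_moved: float) -> float:
    intensity = flops / bytes_moved        # FLOPs per byte of memory traffic
    return min(peak_gflops, bw_gb_s * intensity)

# A memory-bound kernel (0.25 FLOP/byte) is capped by bandwidth ...
print(attainable_gflops(7000, 900, flops=2e9, bytes_moved=8e9))    # -> 225.0
# ... while a compute-bound kernel (250 FLOP/byte) hits the compute roof.
print(attainable_gflops(7000, 900, flops=2e12, bytes_moved=8e9))   # -> 7000
```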
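For acceleration of HPC algorithms with machine learning, one common pattern is to fit an empirical surrogate that predicts a kernel's runtime from its tuning parameters, then pick the configuration the surrogate ranks fastest instead of exhaustively timing every variant. The sketch below trains a random forest on synthetic timings; the feature set, the runtime formula, and all constants are invented for illustration.

```python
# Hypothetical empirical performance model: predict runtime from (size, tile),
# then rank candidate tile sizes. Synthetic data stands in for real timings.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.integers(1, 65, size=(500, 2)).astype(float)   # columns: problem size, tile size
# Invented ground truth: runtime falls with tile size up to 32, then overhead grows.
y = X[:, 0] / np.minimum(X[:, 1], 32.0) + 0.05 * X[:, 1] + rng.normal(0, 0.1, 500)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Rank candidate tiles for one problem size by predicted runtime (lowest first).
candidates = np.array([[48.0, t] for t in (4, 8, 16, 32, 64)])
for pred, tile in sorted(zip(surrogate.predict(candidates), (4, 8, 16, 32, 64))):
    print(f"tile={tile:2d}  predicted runtime={pred:.2f}")
```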
Representative papers
- Jintao Meng, Chen Zhuang, Peng Chen, Mohamed Wahib, Bertil Schmidt, Xiao Wang, Haidong Lan, Dou Wu, Minwen Deng, Yanjie Wei, Shengzhong Feng: "Automatic Generation of High-Performance Convolution Kernels on ARM CPUs for Deep Learning", IEEE Transactions on Parallel & Distributed Systems, vol. 34, April 2022.
- Jintao Meng, Peng Chen, Mingjun Yang, Mohamed Wahib, Yanjie Wei, Shengzhong Feng, Wei Liu, Junzhou Huang: "Boosting the Predictive Performance with Aqueous Solubility Dataset Curation", Scientific Data, March 2022.
- Truong Thao Nguyen, Francois Trahay, Jens Domke, Aleksandr Drozd, Emil Vatai, Jianwei Liao, Mohamed Wahib, Balazs Gerofi: "Why Globally Re-shuffle? Revisiting Data Shuffling in Large Scale Deep Learning", 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022).
- Albert Kahira, Truong Thao Nguyen, Leonardo Bautista Gomez, Ryousei Takano, Rosa Badia, Mohamed Wahib: "An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks", 30th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2021).
- Peng Chen, Mohamed Wahib, Xiao Wang, Shinichiro Takizawa, Takahiro Hirofuchi, Hirotaka Ogawa, Satoshi Matsuoka: "Performance Portable Back-projection Algorithms on CPUs: Agnostic Data Locality and Vectorization Optimizations", 35th ACM International Conference on Supercomputing (ICS 2021).
- Peng Chen, Mohamed Wahib, Xiao Wang, Takahiro Hirofuchi, Hirotaka Ogawa, Ander Biguri, Richard Boardman, Thomas Blumensath, Satoshi Matsuoka: "Scalable FBP Decomposition for Cone-Beam CT Reconstruction", International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2021).
- Fareed Mohammad Qararyah, Mohamed Wahib, Doga Dikbayır, Mehmet Esat Belviranlı, Didem Unat: "A Computational-Graph Partitioning Method for Training Memory-Constrained DNNs", Parallel Computing (Elsevier), vol. 104, pp. 102-117, July 2021.
- Jens Domke, Emil Vatai, Aleksandr Drozd, Peng Chen, Yosuke Oyama, Lingqi Zhang, Shweta Salaria, Daichi Mukunoki, Artur Podobas, Mohamed Wahib, Satoshi Matsuoka: "Matrix Engines for HPC: A Performance Study from the Applications Perspective", 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021).
- Mohamed Wahib, Haoyu Zhang, Truong Thao Nguyen, Aleksandr Drozd, Jens Domke, Lingqi Zhang, Ryousei Takano, Satoshi Matsuoka: "Scaling Deep Learning Workloads Beyond Memory Capacity", International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2020).
- Peng Chen, Mohamed Wahib, Shinichiro Takizawa, Satoshi Matsuoka: "A Versatile Software Systolic Execution Model for GPU Memory Bound Kernels", International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2019).