The 254th R-CCS Cafe (Oct 13, 2023)

Date Fri, Oct 13, 2023
Time 3:00 pm - 4:40 pm (3 pm - 4:40 pm Talks, 4:40 pm - Free discussion and coffee break)
City Kobe, Japan/Online

Lecture Hall (6th floor) at R-CCS, Online seminar on Zoom

  • If you are not affiliated with R-CCS and would like to attend R-CCS Cafe, please email us at r-ccs-cafe[at]
Language Presentation Language: English
Presentation Material: English

Hidehiko Kohshiro

Computational Materials Science Research Team
Postdoctoral Researcher

Boma Adhi

Processor Research Team
Postdoctoral Researcher

Niclas Jansson

KTH Royal Institute of Technology, Sweden

Talk Titles and Abstracts

1st Speaker: Hidehiko Kohshiro

Building Tensor Network Software
The tensor network method is an approximation technique for representing higher-order tensors using lower-order tensors.
Traditionally employed in quantum physics and statistical mechanics, tensor network methods have expanded into various other areas in recent years.
Notably, in the field of quantum computing, they have gained significant attention as the basis for high-performance quantum circuit simulators.
With the ongoing advancements in tensor network-related fields, numerous frameworks and applications have emerged.
In this talk, we will provide an overview of the structure of tensor network applications, primarily focusing on a quantum circuit simulator, and present the current status and future challenges of tensor network software, including those under development by our team.
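
As a rough illustration of the idea in this abstract (not the speaker's software), the sketch below stores an n-qubit GHZ state as a chain of small tensors, a matrix product state, instead of a full vector of 2^n amplitudes. The names `CORE`, `amplitude`, and `parameter_counts` are hypothetical, chosen only for this example. The GHZ state happens to be represented exactly with bond dimension 2; for general states, the bond dimension is truncated and the representation becomes an approximation.

```python
import math

# Site tensor of a matrix product state (MPS) for the n-qubit GHZ state.
# CORE[b] is the 2x2 matrix selected by the physical index b (the qubit value).
# Assumption: the same core is reused on every site, which holds for GHZ.
CORE = {
    0: [[1.0, 0.0], [0.0, 0.0]],
    1: [[0.0, 0.0], [0.0, 1.0]],
}


def amplitude(bits):
    """Contract the MPS for one basis state, e.g. bits = (0, 0, 0)."""
    vec = [1.0, 1.0]  # left boundary vector
    for b in bits:
        m = CORE[b]
        # Vector-matrix product over the bond index.
        vec = [
            vec[0] * m[0][0] + vec[1] * m[1][0],
            vec[0] * m[0][1] + vec[1] * m[1][1],
        ]
    # Close the chain with the right boundary vector (1, 1) and normalize.
    return (vec[0] + vec[1]) / math.sqrt(2.0)


def parameter_counts(n):
    """Return (entries in the full state vector, entries in the n MPS cores)."""
    return 2 ** n, n * 2 * 2 * 2
```

For 30 qubits, `parameter_counts(30)` compares about a billion stored amplitudes with 240 core entries, which is the kind of saving that makes tensor network simulators attractive when the state has limited entanglement.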

2nd Speaker: Boma Adhi

Unleashing CGRA Potential for HPC
A Coarse-Grained Reconfigurable Array (CGRA) is a reconfigurable computing architecture traditionally used for accelerators in low-power embedded devices. It is akin to an FPGA but trades bit-level programmability for ASIC-like performance and efficiency. Recently, some CGRA-like devices have been commercialized as AI accelerators, and CGRAs are seen as a promising alternative to GPUs and FPGAs for the many HPC and AI-related workloads that can be transformed into data-flow style computation. RIKEN CGRA is our in-house CGRA architecture, developed by the Processor Research Team and targeted at future HPC systems. Adapting a traditional CGRA architecture optimized for low-power accelerators into a performance-oriented HPC accelerator is not a trivial task. This talk highlights our previous and future design-space exploration efforts to optimize the architecture, namely intra-CGRA interconnect optimization, FMA and transcendental operations on CGRA, programmable buffers, systolic-array style execution on CGRA, predication support, the CPU-CGRA interconnect, the CGRA memory interface, and FPGA-based emulation in an actual HPC environment.

3rd Speaker: Niclas Jansson

Neko: A Modern, Portable, and Scalable Framework for High-Fidelity Computational Fluid
Recent trends and advancements in including more diverse and heterogeneous hardware in High-Performance Computing are challenging scientific software developers in their pursuit of good performance and efficient numerical methods. As a result, the well-known maxim “software outlives hardware” may no longer necessarily hold true, and researchers are today forced to refactor their codes to leverage these powerful new heterogeneous systems. We present Neko, a portable framework for high-fidelity spectral element flow simulations. Unlike prior works, Neko adopts a modern object-oriented Fortran 2008 approach, allowing multi-tier abstractions of the solver stack and facilitating various hardware backends, ranging from general-purpose processors and accelerators down to exotic vector processors and Field-Programmable Gate Arrays (FPGAs), via Neko’s device abstraction layer. Focusing on the performance and accuracy of Neko, we show the first direct numerical simulation (DNS) of a Flettner rotor submerged in a turbulent boundary layer, observing excellent agreement of lift with experimental data. Using a mesh with five million spectral elements, which turns into more than a billion unique degrees of freedom, the simulation requires less than three days to complete on accelerated systems, compared to weeks on traditional non-accelerated systems. Finally, we present performance measurements on a wide range of accelerated computing platforms, including the EuroHPC pre-exascale system LUMI, where Neko achieves excellent parallel efficiency for a large DNS of turbulent fluid flow using up to 80% of the entire LUMI supercomputer.

Important Notes

  • Please turn off your video and microphone when you join the meeting.
  • The broadcast may be interrupted or terminated depending on network conditions or other unexpected events.
  • The program schedule and contents may be modified without prior notice.
  • Depending on your device and network environment, you may not be able to watch the session.
  • All rights concerning the broadcast material belong to the organizer and the presenters. Copying, modifying, or redistributing all or part of the broadcast material without prior permission from RIKEN is prohibited.

(Oct 4, 2023)