The 282nd R-CCS Cafe (Dec 19, 2025)
| Date | Fri, Dec 19, 2025 |
|---|---|
| Time | 3:00 pm - 4:30 pm (3:00 pm - 4:00 pm: three talks; 4:00 pm - 4:30 pm: free discussion) |
| City | Kobe, Japan / Online |
| Place | Lecture Hall (6th floor) at R-CCS / Online seminar on Zoom |
| Language | Presentation language: English; Presentation material: English |
| Speakers | Lukas Broers (Computational Materials Science Research Team), Qianxiang Ma (Large-Scale Parallel Numerical Computing Technology Research Team), Ke Cui (Data Management Platform Development Unit) |
Talk Titles and Abstracts
1st Speaker: Lukas Broers
Title:
Scalable Simulation of Quantum Many-Body Dynamics with Or-Represented Quantum Algebra
Abstract:
High-performance numerical methods are essential for advancing quantum many-body physics, as well as for enabling the integration of supercomputers with emerging quantum computing platforms. We have developed a scalable and general-purpose numerical framework for quantum simulations based on or-represented quantum algebra (ORQA). This framework applies to arbitrary spin systems and naturally integrates with quantum circuit simulation in the Heisenberg picture, which is particularly relevant to recent large-scale experiments on superconducting qubit processors [Kim et al., Nature 618, 500 (2023)]. As a benchmark, we simulate the kicked Ising model on a 127-qubit heavy-hexagon lattice, successfully tracking the time evolution of local magnetization using up to one trillion Pauli strings. Our simulations exhibit strong scaling up to 2^17 parallel processes with near-linear communication overhead. Furthermore, we show that our framework extends naturally to a broader range of quantum systems, superseding the capabilities of recently established Pauli propagation methods. We also present possible future directions for utilizing our algorithm.
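For readers unfamiliar with Heisenberg-picture Pauli-string propagation, the sketch below is a toy Python analogue of the general idea, not the ORQA framework itself: an observable is stored as a dictionary of Pauli strings with real coefficients, conjugated backwards through rotation gates, and truncated by a coefficient cutoff. The gate set (single-qubit X and two-qubit ZZ rotations), the 5-site chain geometry, the parameter values, and the truncation rule are all illustrative assumptions.

```python
# Toy sketch of Heisenberg-picture Pauli-string propagation with truncation.
# NOT the ORQA framework: representation, gate set, and cutoff are illustrative.
import math

PROD = {  # single-qubit Pauli products: (a, b) -> (phase, a*b)
    ("I", "I"): (1, "I"), ("I", "X"): (1, "X"), ("I", "Y"): (1, "Y"), ("I", "Z"): (1, "Z"),
    ("X", "I"): (1, "X"), ("X", "X"): (1, "I"), ("X", "Y"): (1j, "Z"), ("X", "Z"): (-1j, "Y"),
    ("Y", "I"): (1, "Y"), ("Y", "X"): (-1j, "Z"), ("Y", "Y"): (1, "I"), ("Y", "Z"): (1j, "X"),
    ("Z", "I"): (1, "Z"), ("Z", "X"): (1j, "Y"), ("Z", "Y"): (-1j, "X"), ("Z", "Z"): (1, "I"),
}

def conjugate_rotation(obs, pauli, theta, cutoff=1e-8):
    """Apply O -> U^dag O U for U = exp(-i*theta/2 * pauli) to every string in obs."""
    new = {}
    for q, c in obs.items():
        anti = sum(1 for a, b in zip(pauli, q) if a != "I" and b != "I" and a != b)
        if anti % 2 == 0:                       # commuting strings are unchanged
            new[q] = new.get(q, 0.0) + c
            continue
        phase, prod = 1, []
        for a, b in zip(pauli, q):              # qubit-wise product pauli * q
            ph, r = PROD[(a, b)]
            phase *= ph
            prod.append(r)
        prod = "".join(prod)
        new[q] = new.get(q, 0.0) + c * math.cos(theta)
        # For anticommuting strings the accumulated phase is purely imaginary,
        # so the new coefficient below is real.
        new[prod] = new.get(prod, 0.0) + c * math.sin(theta) * (1j * phase).real
    return {q: c for q, c in new.items() if abs(c) > cutoff}  # drop tiny terms

def expectation_all_zero(obs):
    """<0...0| O |0...0>: only strings made of I and Z contribute, each with value +1."""
    return sum(c for q, c in obs.items() if set(q) <= {"I", "Z"})

# Kicked-Ising-style example on a 5-site chain (heavy-hexagon lattice omitted).
# Gates are conjugated step by step; the ordering within a step is illustrative.
n, theta_h, theta_j, steps = 5, 0.6, -math.pi / 2, 3
obs = {"".join("Z" if i == n // 2 else "I" for i in range(n)): 1.0}  # central magnetization
for _ in range(steps):
    for i in range(n - 1):  # ZZ couplings
        zz = "".join("Z" if j in (i, i + 1) else "I" for j in range(n))
        obs = conjugate_rotation(obs, zz, theta_j)
    for i in range(n):      # transverse-field kicks
        x = "".join("X" if j == i else "I" for j in range(n))
        obs = conjugate_rotation(obs, x, theta_h)
print(len(obs), expectation_all_zero(obs))
```

The cutoff plays the role that more sophisticated truncation and distribution strategies play at scale; without it the number of Pauli strings can grow exponentially with circuit depth.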
2nd Speaker: Qianxiang Ma
Title:
Mixed-precision Interpolative Decomposition on GPU
Abstract:
Low-rank approximation is a powerful tool for reducing storage and arithmetic complexity while preserving linearity. Traditional rank-revealing factorizations, such as the singular value decomposition (SVD) and pivoted QR, have been adapted to GPUs by increasing reliance on GEMM kernels, yet they remain throughput-limited. We present hyacin, a GPU implementation for constructing low-rank approximations via interpolative decomposition. hyacin combines column-pivoted Cholesky QR with extended integer quantization, an abstraction of the Ozaki-scheme error-free floating-point transformation, as well as mixed floating-point precisions. Integer-emulated GEMMs accelerate Gram matrix formation, while dynamically tuned quantization orders balance accuracy and performance.
On a single NVIDIA B200 GPU, our method computes low-rank approximations of large rectangular complex double-precision matrices over 14× faster than cuSOLVER's bi-diagonalization SVD and MAGMA's column-pivoted QR, while preserving LAPACK-quality pivoting. Against cuSOLVER's randomized SVD, hyacin remains consistently superior: 4.5× and 1.8× faster with and without robustness options such as power iterations, respectively. Comparable speedups are observed across other GPU architectures, including the H100, A100, and RTX 4090. These results demonstrate that hyacin delivers LAPACK-level accuracy at unprecedented speed, outperforming both deterministic and randomized state-of-the-art GPU factorizations while maintaining strong numerical stability.
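As a plain reference for what an interpolative decomposition produces, here is a double-precision NumPy/SciPy sketch built on column-pivoted QR. It only illustrates the target factorization A ≈ A[:, cols] · T; it does not reproduce hyacin's column-pivoted Cholesky QR, integer-emulated GEMMs, or mixed precisions, and the test matrix and rank are arbitrary.

```python
# Reference-precision sketch of an interpolative decomposition (ID) from
# column-pivoted QR; the GPU and mixed-precision techniques from the talk
# are deliberately NOT reproduced here.
import numpy as np
from scipy.linalg import qr, solve_triangular

def interpolative_decomposition(A, k):
    """Return (cols, T) with A ~= A[:, cols] @ T, using k skeleton columns."""
    # Column-pivoted QR: A[:, perm] = Q @ R with decreasing diagonal of R.
    Q, R, perm = qr(A, mode="economic", pivoting=True)
    # Interpolation coefficients for the trailing columns: R11^{-1} R12.
    T_tail = solve_triangular(R[:k, :k], R[:k, k:], lower=False)
    # Assemble T so that the skeleton columns reproduce themselves exactly.
    T = np.empty((k, A.shape[1]), dtype=A.dtype)
    T[:, perm[:k]] = np.eye(k, dtype=A.dtype)
    T[:, perm[k:]] = T_tail
    return perm[:k], T

# Quick check on a synthetic numerically low-rank matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 600)) @ np.diag(0.5 ** np.arange(600)) @ rng.standard_normal((600, 600))
cols, T = interpolative_decomposition(A, k=60)
err = np.linalg.norm(A - A[:, cols] @ T) / np.linalg.norm(A)
print(f"relative Frobenius error with 60 columns: {err:.2e}")
```

Because the factor A[:, cols] consists of actual columns of A, the ID preserves properties such as sparsity and interpretability that a truncated SVD would lose, which is part of why pivot quality matters in the comparison above.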
3rd Speaker: Ke Cui
Title:
Lossless Text Compression with Large Language Model via Token Prediction Statistics
Abstract:
Lossless text compression preserves bitwise fidelity while reducing storage and transmission costs, making it essential for text data such as documents, source code, and system logs. Recently, large language models (LLMs) have been explored as predictive models for lossless compression by converting token sequences into rank sequences derived from prediction probabilities, then compressing the rank sequences with general-purpose compressors. However, existing approaches largely ignore the statistical characteristics of rank sequences and rely on inefficient fixed-length context management strategies.
To address these limitations, we propose an enhanced LLM-based lossless text compression approach that introduces a variable-length KV-cache context management strategy to eliminate redundant computation. We conduct a systematic statistical analysis of natural language text, programming code, and computer system log data, revealing distinct rank distributions across these text data types. These distributional properties can be further exploited to enable more effective encoding.
Motivated by the distributional characteristics of different text types, we design domain-specific encoding schemes tailored to natural language, source code, and computer system logs. In addition, for semi-structured computer system log data, we demonstrate how Bloom-filter-based token filtering and few-shot prompting with structural templates can be combined to refine rank distributions further and improve the compression ratio. Experimental results show that, compared with existing LLM-based lossless text compression methods, the proposed approach achieves significantly higher compression ratios.
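To make the tokens-to-ranks pipeline concrete, the sketch below converts a text into the rank of each true next token under a causal language model's prediction and compresses the rank stream with a general-purpose compressor. The model choice (GPT-2 via Hugging Face transformers), the single full-context forward pass, and the fixed two-byte rank encoding are assumptions for illustration only; the talk's variable-length KV-cache management, domain-specific encoders, Bloom-filter token filtering, and few-shot prompting are not reproduced.

```python
# Minimal sketch of the tokens -> ranks -> general-purpose-compressor pipeline
# used by LLM-based lossless text compressors.  Model, context handling, and
# rank encoding are illustrative stand-ins, not the proposed method.
import zlib
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "The quick brown fox jumps over the lazy dog. " * 8
ids = tok(text, return_tensors="pt").input_ids[0]

with torch.no_grad():
    logits = model(ids.unsqueeze(0)).logits[0]          # (seq_len, vocab)

ranks = []
for t in range(1, len(ids)):
    order = torch.argsort(logits[t - 1], descending=True)  # model's ranking of the vocab
    ranks.append((order == ids[t]).nonzero().item())        # rank of the true next token

# Highly predictable text yields mostly-zero ranks, which zlib squeezes well.
# Two bytes per rank (GPT-2 vocab < 65536) keeps the mapping lossless; the
# decoder must rerun the same model deterministically to invert it.
payload = zlib.compress(b"".join(int(r).to_bytes(2, "big") for r in ranks))
print(len(ids), "tokens ->", len(payload), "bytes")
```

The skewed rank distributions analyzed in the abstract are exactly what a better back-end encoding can exploit: the more probability mass the model places on rank 0, the more the rank stream departs from the uniform case that general-purpose compressors handle worst.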
Important Notes
- Please turn off your video and microphone when you join the meeting.
- The broadcast may be interrupted or terminated depending on network conditions or other unexpected events.
- The program schedule and contents may be modified without prior notice.
- Depending on your device and network environment, you may not be able to watch the session.
- All rights concerning the broadcast material belong to the organizer and the presenters; copying, modifying, or redistributing all or part of the broadcast material without the prior permission of RIKEN is prohibited.
(Dec 12, 2025)



