The 282nd R-CCS Cafe (December 19, 2025)
| Date | Friday, December 19, 2025 |
|---|---|
| Time | 15:00 - 16:30 (15:00 - 16:00: talks by three speakers; from 16:00: open discussion, participation optional) |
| City | Kobe, Hyogo / Online |
| Venue | 6th-floor Lecture Hall, RIKEN Center for Computational Science (R-CCS) / Remote seminar via Zoom |
| Language | Presentations and slides in English |
| Speakers | Lukas Broers (Computational Materials Science Research Team), Qianxiang Ma (Large-Scale Parallel Numerical Computing Technology Research Team), Ke Cui (AI Training and Inference Data Management Infrastructure Development Unit) |
Titles and Abstracts
1st Speaker: Lukas Broers
Title:
Scalable Simulation of Quantum Many-Body Dynamics with Or-Represented Quantum Algebra
Abstract:
High-performance numerical methods are essential for advancing quantum many-body physics and for integrating supercomputers with emerging quantum computing platforms. We have developed a scalable, general-purpose numerical framework for quantum simulations based on or-represented quantum algebra (ORQA). The framework applies to arbitrary spin systems and integrates naturally with quantum circuit simulation in the Heisenberg picture, which is particularly relevant to recent large-scale experiments on superconducting qubit processors [Kim et al., Nature 618, 500 (2023)]. As a benchmark, we simulate the kicked Ising model on a 127-qubit heavy-hexagon lattice, successfully tracking the time evolution of local magnetization using up to one trillion Pauli strings. Our simulations exhibit strong scaling up to 2^17 parallel processes with near-linear communication overhead. Furthermore, we show that our framework extends naturally to a broader range of quantum systems, going beyond the capabilities of recently established Pauli propagation methods. We also present possible future directions for utilizing our algorithm.
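For readers unfamiliar with Heisenberg-picture propagation of observables, the following is a minimal Python sketch of the general idea only, not the ORQA implementation: an observable is stored as a weighted sum of Pauli strings, a gate acts by conjugation so that strings branch and accumulate coefficients, and the expansion is truncated to a fixed budget of strings. All function names and conventions here are illustrative assumptions.

```python
# Minimal sketch of Heisenberg-picture Pauli propagation with truncation.
# An observable is a dict mapping Pauli strings (tuples of 'I','X','Y','Z') to coefficients.
import math
from collections import defaultdict

def conjugate_rx(obs, q, theta):
    """Propagate an observable through an X rotation on qubit q:
    X -> X,  Z -> cos(t) Z + sin(t) Y,  Y -> cos(t) Y - sin(t) Z."""
    c, s = math.cos(theta), math.sin(theta)
    out = defaultdict(float)
    for string, coef in obs.items():
        p = string[q]
        if p in ('I', 'X'):
            out[string] += coef                       # commutes with the rotation
        else:
            other = 'Y' if p == 'Z' else 'Z'
            sign = s if p == 'Z' else -s
            out[string] += c * coef                   # surviving component
            out[string[:q] + (other,) + string[q+1:]] += sign * coef  # branched component
    return out

def truncate(obs, budget):
    """Keep only the `budget` Pauli strings with the largest |coefficient|."""
    top = sorted(obs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:budget]
    return defaultdict(float, top)

def expectation_all_zero(obs):
    """<0...0| O |0...0>: only strings built from I and Z contribute."""
    return sum(c for s, c in obs.items() if set(s) <= {'I', 'Z'})

# Track <Z_0> on 3 qubits after one layer of X rotations (the "kick" of a kicked-Ising step).
n = 3
obs = defaultdict(float, {tuple('Z' if i == 0 else 'I' for i in range(n)): 1.0})
for q in range(n):
    obs = conjugate_rx(obs, q, 0.3)
obs = truncate(obs, budget=10_000)
print(expectation_all_zero(obs))   # ~cos(0.3) ≈ 0.955
```

In a full simulation, entangling gates make the number of strings grow rapidly, which is why truncation budgets (here up to one trillion strings) and distributed-memory parallelism become the central concerns.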
2nd Speaker: Qianxiang Ma
Title:
Mixed-precision Interpolative Decomposition on GPU
Abstract:
Low-rank approximation is a powerful tool for reducing storage and arithmetic complexity while preserving linearity. Traditional rank-revealing factorizations, such as the singular value decomposition (SVD) and pivoted QR, have been adapted to GPUs by increasing their reliance on GEMM kernels, yet they remain throughput-limited. We present hyacin, a GPU implementation for constructing low-rank approximations via interpolative decomposition. hyacin combines column-pivoted Cholesky QR with extended integer quantization, an abstraction of the Ozaki-scheme error-free floating-point transformation, as well as mixed floating-point precisions. Integer-emulated GEMMs accelerate Gram matrix formation, while dynamically tuned quantization orders balance accuracy and performance.
On a single NVIDIA B200 GPU, our method computes low-rank approximations of large rectangular complex double-precision matrices over 14× faster than cuSOLVER's bidiagonalization SVD and MAGMA's column-pivoted QR, while preserving LAPACK-quality pivoting. Against cuSOLVER's randomized SVD, hyacin remains consistently superior: 4.5× and 1.8× faster with and without robustness options such as power iterations, respectively. Comparable speedups are observed across other GPU architectures, including the H100, A100, and RTX 4090. These results demonstrate that hyacin delivers LAPACK-level accuracy at unprecedented speed, outperforming both deterministic and randomized state-of-the-art GPU factorizations while maintaining strong numerical stability.
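As background, the following NumPy/SciPy sketch shows what an interpolative decomposition computes: column-pivoted QR selects k "skeleton" columns J of A and a coefficient matrix T such that A ≈ A[:, J] @ T. This is a plain double-precision CPU reference; the Cholesky-QR, integer-emulated GEMM, and mixed-precision techniques that make hyacin fast on GPUs are not reproduced here.

```python
# CPU reference sketch of an interpolative decomposition (ID) via column-pivoted QR.
import numpy as np
from scipy.linalg import qr, solve_triangular

def interpolative_decomposition(A, k):
    """Return skeleton column indices J (length k) and T with A ≈ A[:, J] @ T."""
    # Column-pivoted QR: A[:, piv] = Q @ R, with pivoting pushing "important" columns first.
    Q, R, piv = qr(A, mode='economic', pivoting=True)
    J = piv[:k]                               # skeleton columns chosen by the pivoting
    R11, R12 = R[:k, :k], R[:k, k:]
    # Coefficients: identity on the skeleton columns, R11^{-1} R12 on the rest.
    T_piv = np.hstack([np.eye(k), solve_triangular(R11, R12)])
    T = np.empty_like(T_piv)
    T[:, piv] = T_piv                         # undo the pivoting so columns line up with A
    return J, T

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 40)) @ rng.standard_normal((40, 1500))  # rank-40 matrix
J, T = interpolative_decomposition(A, k=40)
print(np.linalg.norm(A - A[:, J] @ T) / np.linalg.norm(A))  # ≈ machine precision
```

Because the skeleton consists of actual columns of A, the factorization preserves structure such as sparsity or nonnegativity in the column basis, which is one reason ID is attractive compared with an SVD of the same rank.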
3rd Speaker: Ke Cui
Title:
Lossless Text Compression with Large Language Model via Token Prediction Statistics
Abstract:
Lossless text compression preserves bitwise fidelity while reducing storage and transmission costs, making it essential for text data such as documents, source code, and system logs. Recently, large language models (LLMs) have been explored as predictive models for lossless compression by converting token sequences into rank sequences derived from prediction probabilities, then compressing the rank sequences with general-purpose compressors. However, existing approaches largely ignore the statistical characteristics of rank sequences and rely on inefficient fixed-length context management strategies.
To address these limitations, we propose an enhanced LLM-based lossless text compression approach that introduces a variable-length KV-cache context management strategy to eliminate redundant computation. We conduct a systematic statistical analysis of natural language text, programming code, and computer system log data, revealing distinct rank distributions across these text data types. These distributional properties can be further exploited to enable more effective encoding.
Motivated by the distributional characteristics of different text types, we design domain-specific encoding schemes tailored to natural language, source code, and computer system logs. In addition, for semi-structured computer system log data, we demonstrate how Bloom-filter-based token filtering and few-shot prompting with structural templates can be combined to refine rank distributions further and improve the compression ratio. Experimental results show that, compared with existing LLM-based lossless text compression methods, the proposed approach achieves significantly higher compression ratios.
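As a rough illustration of the rank-based pipeline described in the abstract above (predict, replace each symbol by its rank under the model, then hand the skewed rank stream to a general-purpose compressor), here is a self-contained Python sketch. A tiny adaptive byte-frequency model stands in for the LLM so the example runs without model weights; the actual method ranks tokens by LLM prediction probabilities and manages the context with a KV cache, neither of which is reproduced here.

```python
# Rank coding with an adaptive frequency model as a stand-in for an LLM predictor.
import zlib
from collections import Counter

def byte_ranks(data: bytes) -> bytes:
    """Replace each byte by its rank under an adaptive frequency model (most frequent -> 0)."""
    counts, out = Counter(), bytearray()
    for b in data:
        order = sorted(range(256), key=lambda x: (-counts[x], x))  # deterministic tie-break
        out.append(order.index(b))
        counts[b] += 1
    return bytes(out)

def bytes_from_ranks(ranks: bytes) -> bytes:
    """Invert byte_ranks by replaying the same adaptive model on the decoder side."""
    counts, out = Counter(), bytearray()
    for r in ranks:
        order = sorted(range(256), key=lambda x: (-counts[x], x))
        b = order[r]
        out.append(b)
        counts[b] += 1
    return bytes(out)

text = b"GET /index.html 200\n" * 200        # repetitive, log-like input
ranks = byte_ranks(text)
assert bytes_from_ranks(ranks) == text       # the rank transform itself is lossless
print(Counter(ranks).most_common(3))         # rank stream is heavily skewed toward small values
compressed = zlib.compress(ranks)            # final stage: a general-purpose compressor
```

The skew of the rank distribution is exactly what the domain-specific encoders in the talk exploit: the better the predictor, the more mass concentrates on rank 0, and the cheaper the final entropy-coding stage becomes.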
Notes
- Please mute your microphone and turn off your camera when joining.
- The live stream may have to be suspended or interrupted depending on conditions at the venue or the network on the day.
- The program content and schedule are subject to change without notice.
- Depending on your device or network environment, you may not be able to view the stream.
- The copyright of the internet broadcast belongs to the organizers and the presenters. Reposting, copying, or modifying the streamed video, audio, or their contents on other websites or in other works without permission from RIKEN is prohibited.
(December 12, 2025)



