RIKEN Center for Computational Science (R-CCS)


R-CCS Cafe

R-CCS Cafe is held roughly twice a month as an informal, open forum where researchers gathered at the RIKEN Center for Computational Science (R-CCS) can discuss their work freely, serving as a stepping stone toward interdisciplinary fusion. In principle, anyone who is interested is welcome to attend.

  • Purpose: To promote research collaboration that crosses disciplinary boundaries and to open up new fields of study, by providing a forum for information exchange and mutual understanding among researchers and creating opportunities for collaboration.
  • Venue: R-CCS 6th-floor lecture hall or 1st-floor seminar room
  • Language: Talks in Japanese or English; slides in English
  • Other: Speakers should aim to make their presentations understandable to people from other fields, and participants are encouraged to ask questions actively.

The 170th R-CCS Cafe, Part 2
Date: Friday, June 7, 2019, 13:55 - 14:50
Venue: R-CCS 6th-floor lecture hall

・Title: Systemization of performance optimization techniques
・Speaker: Kazuo Minami (Unit Leader, Tuning Technology Unit, Operations and Computer Technologies Division)
※ Talk and slides in English

Abstract:

Modern supercomputers are highly parallel machines that combine inter-node process parallelism with intra-node, inter-core thread parallelism, and the memory hierarchy within a node, including the caches, is also complicated. Applications that run on supercomputers cannot fully exploit the hardware's performance unless they are highly parallelized and each node is tuned according to that hardware. Therefore, "programming conscious of parallelism" and "programming conscious of execution performance" are essential techniques for users, researchers, and programmers working with present-day supercomputers, which are equipped with tens of thousands of processors and contain various enhancements and new functions. We refer to these collectively as performance-optimization techniques for application programs. Performance optimization is not always done by the application's developer, and it is difficult to interpret applications developed by others, evaluate their performance, and discover and solve their problems. Systemizing performance-optimization techniques will therefore provide useful information for engineers and researchers who want to optimize the execution performance of applications. In this talk, I will discuss the systemization of performance-optimization techniques for single CPUs and for high parallelism, covering the following topics:
- Classification of applications from the viewpoint of single-CPU performance
- Explanation of busy time
- Relationship between busy time and performance
- Relationship between busy time and the classification of applications
- Relationship between busy time and performance tuning
- Maximum performance estimation when busy time depends only on bandwidth
- Accumulated tuning techniques for each application classification
- Classification of problems regarding high parallelism
- Accumulated tuning techniques for each problem classification
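The bandwidth-limited estimate in the list above amounts to a roofline-style bound. A minimal sketch, with hypothetical peak and bandwidth figures (not taken from the talk):

```python
# Roofline-style bound: when busy time depends only on memory bandwidth,
# attainable performance = min(peak FLOP/s, bandwidth * arithmetic intensity).

def attainable_gflops(peak_gflops, bw_gb_s, flops, bytes_moved):
    """Upper bound for a kernel doing `flops` floating-point operations
    while moving `bytes_moved` bytes to and from memory."""
    intensity = flops / bytes_moved            # FLOP per byte
    return min(peak_gflops, bw_gb_s * intensity)

# Hypothetical node: 1000 GFLOP/s peak, 200 GB/s memory bandwidth.
# A STREAM-triad-like loop (2 FLOP per 24 bytes) is bandwidth-bound:
print(attainable_gflops(1000.0, 200.0, flops=2, bytes_moved=24))

# A kernel with high arithmetic intensity (here 100 FLOP per byte),
# such as a blocked dense matrix multiply, hits the compute ceiling instead:
print(attainable_gflops(1000.0, 200.0, flops=100, bytes_moved=1))
```

Which regime a loop falls into is exactly the application classification the list refers to: bandwidth-bound loops call for data-layout and cache tuning, compute-bound loops for instruction-level tuning.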

The 169th R-CCS Cafe
Date: Friday, May 31, 2019, 15:00 - 16:10
Venue: R-CCS 6th-floor lecture hall

・Title: Statistical emulation to quantify uncertainties in tsunami modelling using high performance computing
・Speaker: Serge Guillas (Professor, University College London)
※ Talk and slides in English

Abstract:

In this talk, we present solutions to the investigation of uncertainties in tsunami impacts in three settings.
First, we consider landslides as a source of tsunamis from the Indus Canyon in the Western Indian Ocean. We employ statistical emulation, i.e. surrogate modelling, to efficiently quantify uncertainties associated with slump-generated tsunamis on the slopes of the canyon. We simulated 60 slump scenarios to train the emulator and predicted 500,000 trial scenarios in order to study the near-field tsunami hazard probabilistically. The results show that the most likely tsunami amplitudes and velocities could impact vessels and maritime facilities. We demonstrate that the emulator-based approach is an important tool for probabilistic hazard analysis, since it can generate thousands of tsunami scenarios in a few seconds, compared to days of computation on High Performance Computing facilities for a single run of the dispersive tsunami solver that we use here.
We then examine future tsunami hazard from the Makran subduction zone in the Western Indian Ocean. Since tsunamis pose a high risk to ports in the form of high velocities and vorticity, we capture these phenomena in high resolution (down to 10 m) using carefully constructed unstructured meshes for the port of Karachi. The seabed deformations triggered by the earthquake sources vary in magnitude. These sources are parametrized via geometric descriptions and a newly introduced amplification parameter for the vertical deformation due to the sediments. An emulator approximates the functional relationship between the inputs and the outputs (maximum velocity and free-surface elevation), and a hazard assessment is performed using the emulator.
Finally, we create emulators that respect the nature of time-series outputs. We introduce a novel statistical emulation of the input-output dependence of these computer models: functional registration and Functional Principal Components techniques improve the predictions of the emulator. Our phase-registration method captures fine variations in amplitude. Smoothness in the time series of outputs is modelled, and we are thus able to select more representative, and more parsimonious, regression functions than a fixed-basis method such as a Fourier basis. We apply this approach to high-resolution tsunami wave propagation and coastal inundation for the Cascadia region in the Pacific Northwest.
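The train-on-few, predict-on-many workflow above can be sketched with a generic Gaussian-process surrogate. This is a minimal illustration only: the 1-D `simulator` function, kernel, and point counts are hypothetical stand-ins, not the dispersive tsunami solver or the speaker's actual emulator.

```python
import numpy as np

# Generic Gaussian-process surrogate ("emulator"): train on a few runs of
# an expensive simulator, then predict thousands of trial scenarios cheaply.

def rbf_kernel(a, b, length=0.15):
    # Squared-exponential covariance between two 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_predict(x_train, y_train, x_test, jitter=1e-6):
    # Posterior mean of a zero-mean GP conditioned on the training runs.
    K = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    weights = np.linalg.solve(K, y_train)
    return rbf_kernel(x_test, x_train) @ weights

simulator = lambda x: np.sin(2 * np.pi * x)        # toy "expensive" model
x_train = np.linspace(0.0, 1.0, 12)                # cf. 60 slump scenarios
x_test = np.random.default_rng(0).uniform(0.0, 1.0, 5000)  # cf. 500,000 trials
pred = gp_predict(x_train, simulator(x_train), x_test)
print(np.max(np.abs(pred - simulator(x_test))))    # small emulation error
```

The cost pattern mirrors the abstract: the expensive code runs only 12 times here, while the surrogate evaluates 5,000 scenarios in milliseconds.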

The 168th R-CCS Cafe, Part 1
Date: Monday, May 20, 2019, 13:00 - 13:35
Venue: R-CCS 1st-floor seminar room

※ The 168th R-CCS Cafe is held as the invited talks of the 1st LBNL/R-CCS Workshop "New Frontiers of Computer Architecture and System Software towards Post-Moore Era".

・Invited Talk 2: Bandwidth Steering in HPC using Silicon Nanophotonics
・Speaker: George Michelogiannakis (Lawrence Berkeley National Laboratory)
※ Talk and slides in English

Abstract:

Communication threatens to become an increasing bottleneck to performance scaling in the post-exascale era as bytes-per-FLOP ratios continue to decline. We describe bandwidth steering in HPC, which takes advantage of emerging photonic switches to efficiently change the connectivity of the lower layers of a hierarchical topology, reconstructing locality that was lost to system fragmentation and impossible to recover through task placement. This allows more aggressive oversubscription of the higher layers to reduce cost with no performance penalty. We demonstrate bandwidth steering with a scalable algorithm in an experimental testbed and at system scale using simulations. At system scale, bandwidth steering reduces static power consumption per unit throughput by 51% and dynamic power consumption by 10% compared to a reference topology. In addition, it reduces average network latency by up to 87% and improves throughput by an average of 4.3x.
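The locality-reconstruction idea can be illustrated with a toy two-level (leaf/spine) model: leaf-to-leaf traffic normally crosses the spine, and steering a direct photonic circuit between two busy leaves shortens those paths. The traffic matrix and hop counts below are illustrative assumptions, not the paper's testbed or results.

```python
# Toy model of bandwidth steering in a two-level leaf/spine topology:
# a leaf-to-leaf flow normally takes 3 switch hops (leaf -> spine -> leaf);
# a steered photonic circuit between two leaves cuts that flow to 1 hop.

def avg_hops(traffic, steered):
    """traffic: {(leaf_a, leaf_b): load}; pairs in `steered` get a direct link."""
    total = sum(traffic.values())
    weighted = sum(load * (1 if pair in steered else 3)
                   for pair, load in traffic.items())
    return weighted / total

# A fragmented job concentrates 80% of its load on one leaf pair.
traffic = {(0, 1): 80.0, (0, 2): 10.0, (1, 3): 10.0}
print(avg_hops(traffic, steered=set()))     # baseline: every flow uses the spine
print(avg_hops(traffic, steered={(0, 1)}))  # steer the hot pair: average drops
```

Once the hot pair bypasses the spine, most of the spine's capacity is freed, which is what permits the oversubscription of the higher layers mentioned in the abstract.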

The 168th R-CCS Cafe, Part 2
Date: Monday, May 20, 2019, 13:35 - 14:10
Venue: R-CCS 1st-floor seminar room

・Invited Talk 3: qFirm: Digital Firmware for Classical Control of Qubits
・Speaker: Farzad Fatollahi-Fard (Lawrence Berkeley National Laboratory)
※ Talk and slides in English

Abstract:

As the field of quantum computing grows, various levels of abstraction must be developed to make the technology easier for users to adopt. We propose a layer called qFirm, sitting between a control processor and the digital/analog interface of a classical control system. This layer provides the vital interface for converting quantum instructions into the analog signals that control the quantum device, as well as for reading back the device's results. It thus provides the essential glue logic in the classical control stack of a quantum control system.

The 168th R-CCS Cafe, Part 3
Date: Monday, May 20, 2019, 14:10 - 14:45
Venue: R-CCS 1st-floor seminar room

・Invited Talk 4: Extending Classical Processors to Support Future Large Scale Quantum Accelerators
・Speaker: Anastasiia Butko (Lawrence Berkeley National Laboratory)
※ Talk and slides in English

Abstract:

Extensive research in materials science, together with outstanding engineering efforts, has allowed quantum technology to improve significantly, enabling the continued scaling of quantum circuit size. However, quantum circuit scaling by itself does not guarantee any practical use without corresponding progress in classical control hardware and software. Operating a large-scale universal quantum computer with thousands of qubits will require extensive classical computational resources. Control hardware comprises multiple layers, each responsible for a specific set of tasks, e.g. controllers, digital-to-analogue and analogue-to-digital converters, filters, waveform generators, etc. At this early stage of quantum architecture development, there is no clear understanding of where in this stack of complex digital and analogue circuits the upcoming challenges will be addressed. However, we expect that the control processor will become crucial to the successful implementation and adoption of future quantum computers.
In our talk, we discuss the challenges that classical processors will face while controlling future large-scale quantum systems, and how these challenges will affect processor micro-architecture: guaranteeing on-time quantum gate execution; continuous qubit-state measurement, storage, and analysis; support for massive parallelism; and advanced bit manipulation on top of the measured data.

The 168th R-CCS Cafe, Part 4
Date: Monday, May 20, 2019, 15:05 - 15:40
Venue: R-CCS 1st-floor seminar room

・Invited Talk 5: How open source designs will drive the next generation of HPC systems
・Speaker: David Daniel Donofrio (Lawrence Berkeley National Laboratory)
※ Talk and slides in English

Abstract:

As we approach the end of Moore's law, modern, complex HPC systems increasingly rely upon specialized accelerators to deliver continued performance increases for specific computational workloads. Developers of these accelerators, especially in many low-volume scientific applications, face a stark choice: spend millions on commercial licenses for processors and other IP, or face the significant risk of developing custom hardware. Rapid prototyping methods need to be explored to make the design, verification, and programming tools for these new accelerators more accessible to the broader scientific community. To increase access and innovation while reducing cost, there has been a consistent march towards open-source solutions for each of these components, including Facebook's Open Compute Project and Intel's OpenHPC effort, as well as a burgeoning community surrounding RISC-V based processors.
Looking beyond accelerators tightly integrated with HPC systems, we see opportunities for open-source hardware in programmable logic embedded within high-performance sensors and detectors for aggressive data reduction, or used in conjunction with FPGAs and other reconfigurable computing platforms. This talk will explore the emerging open-source hardware effort and showcase new platforms for the rapid generation of future HPC accelerators.

The 168th R-CCS Cafe, Part 5
Date: Monday, May 20, 2019, 15:40 - 16:15
Venue: R-CCS 1st-floor seminar room

・Invited Talk 6: PARADISE: Modeling and Simulation of Emerging Post-CMOS Devices and Architectures
・Speaker: Dilip Vasudevan (Lawrence Berkeley National Laboratory)
※ Talk and slides in English

Abstract:

An increasing number of technologies are being proposed to preserve digital computing performance scaling as lithographic scaling slows. These include new devices, specialized architectures, memories, and 3D integration. Currently, no end-to-end tool flow is available to rapidly perform architectural-level evaluation using device-level models for a variety of emerging technologies at once. We propose PARADISE, an open-source, comprehensive methodology for evaluating emerging technologies with a vertical simulation flow from the individual device level all the way up to the architectural level. To demonstrate its effectiveness, we use PARADISE to perform end-to-end simulation and analysis of heterogeneous architectures using CNFETs, TFETs, and NCFETs, along with multiple hardware designs. To demonstrate its accuracy, we show that PARADISE has only a 6% mean deviation in delay and 9% in power compared to previous studies using commercial synthesis tools.


The 167th R-CCS Cafe
Date: Monday, April 15, 2019, 11:00 - 12:00
Venue: R-CCS 1st-floor seminar room

・Title: The Adaptable IO Framework: an exascale-capable IO system for storage, IO, and in situ data processing
・Speaker: Scott Klasky (Oak Ridge National Laboratory (ORNL))
※ Talk and slides in English

Abstract:

The US Exascale Computing Project (ECP) is focused on accelerating the delivery of a capable exascale computing ecosystem that delivers 50 times more computational science and data-analytic application power than is possible with DOE HPC systems such as Titan (ORNL) and Sequoia (LLNL). As next-generation applications and experiments grow in concurrency and complexity, the data they produce often grows to extreme levels, limiting scientific knowledge discovery. In this presentation, I will talk about the new set of applications and experiments that push the edge of scientific data processing and simulation. I will present some of the exciting new research aimed at coping with this tsunami of data, along with the challenges of implementing it effectively on next-generation computer architectures. I will also focus on the ADIOS framework, a next-generation framework to ingest, reduce, and move data on HPC systems and over the WAN to other computational resources, as well as on in situ data-processing infrastructure and next-generation data-compression algorithms.
※ A tutorial on the tools introduced in this talk will be held in the afternoon of the same day.
[Time table] 13:00 - 16:00 ADIOS2 Tutorial
Lecturers: Scott Klasky (ORNL), Norbert Podhorszki (ORNL)
If you are interested, please see the announcement of the ADIOS2 Adaptable I/O Framework tutorial (April 15, Kobe).

The 166th R-CCS Cafe, Part 1
Date: Monday, March 25, 2019, 15:00 - 16:00
Venue: R-CCS 1st-floor seminar room

・Title: Status report on numerical climate model development in the United States
・Speaker: Ryuji Yoshida (University of Colorado Boulder, CIRES/NOAA ESRL)
※ Talk in Japanese, slides in English

Abstract:

I will introduce one example of numerical climate model development in the United States. The speaker participates in the development of the global model E3SM through the U.S. Department of Energy's SciDAC project. E3SM is being developed to run high-resolution simulations of the Earth's environment in order to address energy problems, advancing our understanding of the water cycle, the cryosphere-ocean system, and the biosphere. To improve computational capability, the project is also investing in machine learning and GPU computing, and the model is planned to run on the U.S. Department of Energy's next-generation "Shasta" supercomputer. This next-generation machine has only just been announced by Cray and is expected to feature a variety of compute cores, including AMD, Intel, Arm, and GPUs. Model developers are devising various methods to adapt to these as-yet-unseen next-generation machines.

The 166th R-CCS Cafe, Part 2
Date: Monday, March 25, 2019, 16:00 - 17:00
Venue: R-CCS 1st-floor seminar room

・Title: Agent-based model (ABM) for city-scale traffic simulation: a case study on San Francisco
・Speaker: Bingyu Zhao (University of California at Berkeley)
※ Talk and slides in English

Abstract:

Agent-Based Modelling (ABM) is a promising tool for city-scale traffic simulation, used to understand the complex behaviour of an entire urban transportation system under different scenarios. In an ABM, traffic is intuitively simulated as the movements and interactions of large numbers of agents, each capable of finding the route for an individual traveller or vehicle. In this talk, the development of such an ABM simulation tool to reproduce the traffic patterns of the city of San Francisco will be presented. The model features a detailed road network and an hour-long simulation time step to capture realistic variations in traffic conditions. Agent speed is determined according to a simplified volume-delay macroscopic relationship, which is more efficient than applying microscopic rules (e.g., car following) for evaluating city-scale traffic conditions. Two particular challenges of building such an ABM will be discussed: data availability and computational cost. The key inputs to the ABM are sourced from standard, publicly available datasets, including the travel-demand surveys published by local transport authorities and the road-network data from OpenStreetMap. In addition, an efficient priority-queue-based Dijkstra algorithm is implemented to overcome the computational bottleneck of agent routing. The ABM is designed to run on High Performance Computing (HPC) clusters, thereby improving computational speed significantly. Preliminary validation of the ABM is conducted by comparing its results with a published model. Overall, the ABM has been demonstrated to run efficiently and produce reliable results. Use cases of the ABM tool will be demonstrated through two examples: evaluating the value of real-time traffic information, and assessing the outcomes of complex network-level emission-mitigation measures.
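The priority-queue-based Dijkstra routing mentioned above can be sketched with Python's standard-library binary heap. The four-node toy network and its travel times are hypothetical, not the San Francisco road graph.

```python
import heapq

def dijkstra(graph, source):
    """Shortest travel times from `source` using a binary-heap priority queue.
    graph: {node: [(neighbour, edge_cost), ...]} (directed)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry; node already settled
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical 4-node road network with travel times in minutes.
net = {0: [(1, 2.0), (2, 5.0)], 1: [(2, 1.0), (3, 7.0)], 2: [(3, 3.0)]}
print(dijkstra(net, 0))  # {0: 0.0, 1: 2.0, 2: 3.0, 3: 6.0}
```

With lazy deletion of stale entries (the `continue` branch), each routing query runs in O((V + E) log V), which is what makes per-agent routing feasible at city scale when queries are distributed across HPC cluster nodes.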