As high performance computing (HPC) and Big Data processing converge, it has become increasingly common to deploy large-scale data analytics workloads on high-end supercomputers. Such applications often take the form of complex workflows with many different components, assimilating data from scientific simulations as well as measurements streamed from sensor networks such as radars and satellites. For example, as part of Japan's next-generation flagship (post-K) supercomputer project, RIKEN is investigating the feasibility of a highly accurate weather forecasting system that would provide a real-time outlook for severe, sudden localized ("guerrilla") rainstorms. One of the main performance bottlenecks of this application is inefficient communication among workflow components, which currently takes place over the parallel file system. This presentation reports an initial study of a direct communication framework for complex workflows that eliminates unnecessary file I/O among components. Specifically, we propose an I/O arbitrator layer that provides direct parallel data transfer among job components that rely on the netCDF interface for their I/O operations, requiring only minimal modifications to application code. We present the design and a preliminary evaluation of the framework on the K Computer, using RIKEN's experimental weather forecasting workflow as a case study.
Date and time: Wednesday, June 1, 2016, 15:30 – 16:30
Venue: AICS, 6th floor auditorium
・Title: Toward a General I/O Arbitration Framework for netCDF based Big Data Processing
・Speaker: Liao Jianwei (Flagship 2020 Project, System Software Development Team)