PNNL @ SC21

PNNL computer scientists will be presenting the latest research in high-performance computing (HPC) at SC21, the International Conference for High Performance Computing, Networking, Storage, and Analysis.

November 14–19, 2021

The SC21 conference will feature tutorials, workshops, and presentations on HPC, machine learning, artificial intelligence, modeling and simulation, and the application of these capabilities to accelerate scientific discovery. The full program of presentations is planned for November 14–19. Learn more about Pacific Northwest National Laboratory's (PNNL's) presence at SC21 below.

DOE @ SC21

If you're attending SC21 in person, be sure to visit the Department of Energy (DOE) booth. You can learn more about DOE's presence at SC21 at scdoe.info.  

Meet a Recruiter

Curious about open positions in HPC or computational sciences at PNNL? Bring your questions to our recruiting team. They're holding a virtual Q&A on Zoom on Thursday, November 18, from 9:00 to 11:00 a.m. Pacific Time/11:00 a.m. to 1:00 p.m. Central Time. Come learn more about open positions and what it's like to work at PNNL. We can’t wait to meet you!

PNNL Speakers, Presentations, and Posters

November 14, 2021

Toward Modern C++ Language Support for MPI

Sayan Ghosh, Andrew Lumsdaine (University of Washington Joint Appointee)

Presentation | 12:00 p.m. – 12:30 p.m. PST/2:00 p.m. – 2:30 p.m. CST

The C++ programming language has made significant inroads in improving performance and productivity across a broad spectrum of applications and hardware. The C++ language bindings to the Message Passing Interface (MPI) were removed in MPI 3.0 on the rationale that they added minimal functionality over the existing C bindings, relative to modern C++ practice, while imposing a significant maintenance burden on the MPI standard specification. READ MORE

A High-Performance Sparse Tensor Algebra Compiler in MLIR

Ruiqin Tian, Luanzheng Guo, Gokcen Kestor

Presentation | 1:30 p.m. – 2:10 p.m. PST/3:30 p.m. – 4:10 p.m. CST

Sparse tensor algebra is widely used in many applications. The performance of sparse tensor algebra kernels strongly depends on the characteristics of the input tensors. Therefore, many storage formats are designed for tensors to achieve optimal performance for particular applications and architectures, which makes it challenging to implement and optimize every tensor operation of interest on a given architecture. READ MORE

November 15, 2021

IA3 2021: 11th Workshop on Irregular Applications: Architectures and Algorithms

Antonino Tumeo, Marco Minutoli, Vito Giovanni Castellana, John Feo

Workshop | 7:00 a.m. – 3:30 p.m. PST/9:00 a.m. – 5:30 p.m. CST

Due to the heterogeneous datasets they process, data-intensive applications employ a diverse set of methods and data structures, exhibiting irregular memory accesses, control flows, and communication patterns. Current supercomputing systems are organized around components optimized for data locality and bulk synchronous computations. READ MORE

HPC Graph Toolkits and the GraphBLAS Forum

Antonino Tumeo, John Feo, Mahantesh Halappanavar

Presentation | 3:15 p.m. – 4:45 p.m. PST/5:15 p.m. – 6:45 p.m. CST

HPC systems are diverse. Programmers can’t afford to customize software from scratch for each case. We need frameworks that hide hardware behind high-level abstractions. Workflows are complex with graphs, databases, simulations, machine learning, and more. READ MORE

November 16, 2021

Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores

Ang Li, Tong Geng

Presentation | 1:30 p.m. – 2:00 p.m. PST/3:30 p.m. – 4:00 p.m. CST

Accelerating neural networks with quantization has been widely studied. Unfortunately, prior efforts with diverse precisions are usually restricted by limited precision support on GPUs. To break such restrictions, we introduce the first arbitrary precision neural network framework to fully exploit quantization benefits on Ampere GPU tensor cores. READ MORE

SODA-OPT: System-Level Design in MLIR for HLS

Nicolas Bohm Agostini, Antonino Tumeo

Poster | 6:30 a.m. – 3:00 p.m. PST/8:30 a.m. – 5:00 p.m. CST

High-level synthesis enables the generation of hardware descriptions from applications implemented in high-level languages. State-of-the-art tools, however, typically require the application to be manually translated to C/C++ and carefully annotated to improve final design performance. READ MORE

Towards a Scalable and Distributed High-Performance SHAD C++ Library

Vito Giovanni Castellana

Poster | 6:30 a.m. – 3:00 p.m. PST/8:30 a.m. – 5:00 p.m. CST

SHAD is the Scalable High-performance Algorithms and Data-structures C++ library, providing general purpose building blocks and supporting high-level custom utilities. SHAD is designed with scalability, flexibility, productivity, and portability in mind, and serves as a playground for research in parallel programming models, runtime systems, and their applications. READ MORE

FPGA-Accelerated Ripples

Reece Neff, Marco Minutoli, Antonino Tumeo

Poster* | 6:30 a.m. – 3:00 p.m. PST/8:30 a.m. – 5:00 p.m. CST

Influence maximization is an important graph algorithm that is gaining traction in areas where social networks and other related graphs are processed and analyzed. The algorithm's long run time opens the door for optimization, but it is challenging to parallelize and port onto novel architectures because of its irregular and memory-hungry behavior. READ MORE

*This poster is one of four finalists for the Best Research Poster award. There will be a 12-minute presentation on November 17 in the Best Research Poster presentation session from 8:30 a.m. – 8:50 a.m. PST/10:30 a.m. – 10:50 a.m. CST.

Breadth-First Search on Xilinx Versal

Guilherme Prado Alves, Marco Minutoli, Antonino Tumeo

Poster | 6:30 a.m. – 3:00 p.m. PST/8:30 a.m. – 5:00 p.m. CST

The new Xilinx Versal Platform provides a highly heterogeneous system to programmers. How these diverse resources can be utilized effectively is an open question. This project implements breadth-first search on this platform, utilizing all available regions to accelerate this workload. READ MORE

Hardware Acceleration of Complex Machine Learning Models through Modern High-Level Synthesis

Serena Curzel, Antonino Tumeo

Poster | 6:30 a.m. – 3:00 p.m. PST/8:30 a.m. – 5:00 p.m. CST

Machine-learning algorithms continue to receive significant attention from industry and research. As the models increase in complexity and accuracy, their computational and memory demands also grow, pushing for more powerful, heterogeneous architectures. READ MORE

November 17, 2021

Single-Node Partitioned-Memory for Huge Graph Analytics: Cost and Performance Trade-Offs

Sayan Ghosh, Nathan Tallent, Marco Minutoli, Mahantesh Halappanavar, Ananth Kalyanaraman (WSU Joint Appointee)

Presentation | 9:30 a.m. – 12:00 p.m. PST/11:30 a.m. – 2:00 p.m. CST

Because of cost, nonvolatile memory NVDIMMs such as Intel Optane are attractive in single-node big-memory systems. We evaluate performance and cost trade-offs when using Optane as volatile memory for huge-graph analytics. READ MORE

Advanced Architecture Testbeds: Community Resources for Enhanced HPC Research

Kevin Barker

Presentation | 10:15 a.m. – 11:15 a.m. PST/12:15 p.m. – 1:15 p.m. CST

This presentation brings together panelists from advanced architecture testbed efforts including Swiss National Supercomputing Centre’s User lab, PNNL’s Center for Advanced Technology Evaluation (CENATE) testbed, Heterogeneous Advanced Architecture Platforms at Sandia National Laboratories, the Rogues Gallery at Georgia Tech, Experimental Computing Lab at Oak Ridge National Laboratory, and the Maui HPC Center to discuss next-generation architectures and challenges. READ MORE

Scaling Subgraph Isomorphism on Distributed Multi-GPU Systems Using Trie-Based Data Structure

Arif Khan, Mahantesh Halappanavar, Edoardo Serra (Boise State University Joint Appointee)

Presentation | 2:30 p.m. – 3:00 p.m. PST/4:30 p.m. – 5:00 p.m. CST

Subgraph isomorphism is a pattern-matching problem with wide use in domains such as cheminformatics, bioinformatics, databases, and social network analysis. It is computationally expensive and proven NP-hard. The massive parallelism in GPUs is well suited for solving subgraph isomorphism. However, current GPU implementations fall far short of the achievable performance. READ MORE

November 18, 2021

Multi-Accelerator Pattern Allocation Policy for Multi-Tenant GPU Servers

Joshua Suetterlein, Joseph Manzano

Presentation | 1:30 p.m. – 2:00 p.m. PST/3:30 p.m. – 4:00 p.m. CST

Multi-accelerator servers are increasingly being deployed in shared multi-tenant environments (such as in cloud data centers) to meet the demands of large-scale compute-intensive workloads. In addition, these accelerators are increasingly being interconnected in complex topologies and workloads are exhibiting a wider variety of inter-accelerator communication patterns. READ MORE

Scalable PGAS-Based State Vector Simulation of Quantum Circuits

Ang Li, Bo Fang

Presentation | 2:00 p.m. – 2:30 p.m. PST/4:00 p.m. – 4:30 p.m. CST

High-performance quantum circuit simulation on classical HPC systems remains necessary in the noisy intermediate-scale quantum era. Observing that the major obstacle to scalable state-vector quantum simulation arises from massively fine-grained, irregular data exchange with remote nodes, in this paper we present a state-vector quantum circuit simulation that applies emerging partitioned global address space-based communication models for efficient large-scale quantum circuit simulation. READ MORE

November 19, 2021

HPC for Urgent Decision Making

Vinay Amatya

Workshop | 6:30 a.m. – 10:00 a.m. PST/8:30 a.m. – 12:00 p.m. CST

For responding to natural disasters, pandemics, and other time-sensitive societal issues, technological advances are creating exciting new opportunities with the potential to move HPC beyond traditional computational workloads. Combining high-velocity data and live analytics with HPC models can aid in responding to urgent real-world problems, ultimately saving lives and reducing economic loss. READ MORE

Guarding Numerics Amidst Rising Heterogeneity

Ang Li

Presentation | 7:00 a.m. – 7:20 a.m. PST/9:00 a.m. – 9:20 a.m. CST

New heterogeneous computing platforms—especially GPUs and other accelerators—are being introduced at a brisk pace, motivated by the goals of exploiting parallelism and reducing data movement. Unfortunately, their sheer variety and the optimization options supported by them have been observed to alter the computed numerical results to the extent that reproducible results are no longer possible to obtain without extra effort. READ MORE

RSDHA: Redefining Scalability for Diversely Heterogeneous Architectures

Antonino Tumeo

Panel Discussion | 7:10 a.m. – 8:10 a.m. PST/9:10 a.m. – 10:10 a.m. CST

The panel discussion at RSDHA will seek answers to two primary questions:

  • How could the traditional HPC applications adopt the architectural, programming, and runtime approaches employed by the state-of-the-art diversely heterogeneous systems?
  • How could the diversely heterogeneous architectures for mobile and autonomous systems take examples from traditional HPC to beat the multi-node scalability challenges as they become increasingly connected? READ MORE