April 7, 2022 – Researchers supporting the development of NWChemEx, an open-source exascale software platform for high-performance quantum chemistry simulations, demonstrated a new modular software design that can be extended to future architectures with minimal software engineering effort and provides a sustainable path for software development, with kernels that can be plugged into many high-level algorithms. Their work, funded by the Exascale Computing Project, addresses two challenges: finding programming models that perform well on the heterogeneous architectures of the world’s supercomputers (which may combine multi-core processors with GPU accelerators from multiple vendors), and the limitations that single-source portability layers impose on optimizing scientific software workflows. The team demonstrated similar performance profiles on NVIDIA, AMD, and Intel GPUs for numerical integration of the exchange-correlation potential in Kohn-Sham Density Functional Theory (KS-DFT), a quantum chemistry method essential to the simulation of molecules and materials. Their findings were published in the September 2021 issue of Parallel Computing.
The researchers’ modular, object-oriented software design separates the expression of scientific workflows from the implementation details of individual algorithmic kernels. This allows a developer to express the overall algorithm once in a single high-level source language while implementing a handful of performance-critical kernels on a per-architecture basis, ensuring the sustainability of software development efforts. On each architecture of interest, each kernel is implemented and optimized as a plugin loaded at runtime or at compile time, depending on the application. The developer can therefore target multiple architectures simultaneously without altering the high-level algorithmic workflow, and the design remains extensible to new hardware. The modular design also makes it possible to quickly test and prototype new implementation strategies for individual architectures without interfering with the implementations of other kernels or architectures.
Work is underway to unify the GPU and CPU implementations of the KS-DFT module in NWChemEx as well as to extend the GPU implementation to FPGAs and ASICs while maintaining a high-level algorithmic specification. If successful, the implementation would be a first of its kind in the field of energy-efficient scientific computing with high impact in the era of post-exascale computing.
David B. Williams-Young, Abhishek Bagusetty, Wibe A. de Jong, Douglas Doerfler, Hubertus J. J. van Dam, Álvaro Vázquez-Mayagoitia, Theresa L. Windus, Chao Yang. “Achieving Performance Portability in Gaussian Basis Set Density Functional Theory on Accelerator-Based Architectures in NWChemEx.” 2021. Parallel Computing (September). https://doi.org/10.1016/j.parco.2021.102829.