X-Stack: Programming Environments for Scientific Computing

Department of Energy - Office of Science
Posted on

Application Deadline

Type

Fellowships

Reference Number

DE-FOA-0002460

The DOE SC program in Advanced Scientific Computing Research (ASCR) hereby announces its interest in basic research in computer science exploring innovative approaches to creating, verifying, validating, optimizing, maintaining, and executing scientific software targeting distributed, heterogeneous, high-performance computing platforms. Next-generation systems for scientific computing are anticipated to be both heterogeneous and distributed, potentially pushing current trends to an extreme degree [1,5]:

Heterogeneous: It is now common for high-performance-computing (HPC) systems to feature one or more computational accelerators, and ASCR's upcoming Exascale systems, Aurora and Frontier, will have node architectures containing multiple Central Processing Units (CPUs) and Graphics Processing Units (GPUs) (for more information, see https://science.osti.gov/ascr/Facilities/User-Facilities/Upgrades). Next-generation systems may feature many different kinds of computational accelerators, including, but not limited to, GPUs, Coarse-Grained Reconfigurable Architectures (CGRAs), Field-Programmable Gate Arrays (FPGAs), machine-learning accelerators, and processing-in-memory capabilities. In addition, to the extent that scientific-application workflows span multiple computing systems, applications in an individual scientific workflow may run on hardware with different architectures. Not only do programming models for these heterogeneous systems need to enable execution across a variety of different kinds of hardware, but data movement and layout are also critical programming considerations [1,3].

Distributed: HPC systems are commonly composed of hundreds or thousands of individual nodes connected to each other by a state-of-the-art local network. While some systems do support a global address space (i.e., are shared-memory systems), most do not (i.e., are distributed-memory systems), and data is copied between nodes as needed [e.g., using the Message Passing Interface (MPI); a minimal sketch of this pattern appears below]. The nodes of a single system often share the same hardware architecture, although scientific workflows can span different kinds of systems. Scientific applications often need a large amount of memory to hold the state of the systems being analyzed or simulated, and as a result, their ability to run efficiently on large HPC systems is essential to their utility. The cost of moving data between nodes, the time and space complexity of relevant algorithms as the number of nodes used by the application increases, the effectiveness of load balancing across nodes, and other factors are all critical to scientific-application design [3].
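As a concrete illustration of the distributed-memory pattern described above, the following is a minimal C++ sketch of explicit data movement between ranks using MPI's standard C API; the halo-exchange shape, array size, and values are illustrative assumptions, not requirements of this announcement.

    // Minimal sketch: each MPI rank owns a local slice of a (conceptually)
    // global array and explicitly copies boundary values to its neighbor,
    // the "halo exchange" pattern common in stencil-based scientific codes.
    // Build with an MPI compiler wrapper, e.g., mpicxx; run with mpirun.
    #include <mpi.h>
    #include <cstdio>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // Each rank's local slice; values here are illustrative.
        std::vector<double> local(1000, static_cast<double>(rank));

        // Neighbors on a 1-D decomposition; MPI_PROC_NULL makes the
        // exchange a no-op at the ends of the rank line.
        int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        // Send our rightmost element right; receive our left halo value.
        double halo_left = 0.0;
        MPI_Sendrecv(&local.back(), 1, MPI_DOUBLE, right, 0,
                     &halo_left,    1, MPI_DOUBLE, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        std::printf("rank %d received halo value %f\n", rank, halo_left);
        MPI_Finalize();
        return 0;
    }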
Fully unlocking the potential benefits of these next-generation systems depends on a high-productivity, sustainable development cycle that yields acceptable application performance [2,9]. As stated in [2], "Hardware, software and problem complexities are dramatically reducing the number of developers who can effectively use CSE [Computational Science and Engineering] environments to address grand challenge problems. New models are needed to spur development of productive and sustainable tools that expand access to and usability of CSE capabilities." Crucially, the development cycle of scientific applications includes both the implementation of new functionality and the porting of existing functionality to new systems.

Moreover, for a development cycle to be productive, its stages must be productive, including, but not limited to, design, implementation, verification, optimization, and integration. Fortunately, state-of-the-art methods in program analysis and synthesis, leveraging formal methods, machine learning, and other techniques, promise future programming environments and software stacks with a significant degree of automation [6,7]. These techniques also enable state-of-the-art methods for program verification and repair [7]. Given that verification of an application on a new platform, or after any other modification, is often an expensive and time-consuming task that sits on the critical path to using new platforms and to ensuring scientific integrity [4,9], improving the productivity and effectiveness of the testing process is a high priority. The automated test synthesis research area defined below addresses this priority.

Additionally, due to availability, interoperability, and performance constraints, no single parallel-programming model [e.g., OpenMP (https://www.openmp.org/), OpenACC (https://www.openacc.org/), SYCL, Kokkos, RAJA, CUDA, and other vendor-specific models [8]] is optimal across all HPC platforms. Given DOE's diverse scientific-computing ecosystem, assisting programmers in transitioning existing applications between different programming models is a high priority; a sketch of what such a transition involves appears below. The parallel-programming-model translation research area defined below addresses this priority.
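To make the parallel-programming-model translation priority concrete, the sketch below pairs a simple OpenMP target-offload loop (shown in a comment) with one possible hand translation of the same AXPY-style kernel into SYCL, checked against a serial reference; such an equivalence check is also the kind of test that automated test synthesis aims to generate. This is a minimal sketch assuming a SYCL 2020 implementation (e.g., DPC++ or AdaptiveCpp); the kernel, sizes, and values are illustrative.

    // Original OpenMP version of the kernel, the "source" programming model:
    //   #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    //   for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
    #include <sycl/sycl.hpp>
    #include <cassert>
    #include <vector>

    int main() {
        const int n = 1 << 16;
        const float a = 2.0f;
        std::vector<float> x(n, 1.0f), y(n, 3.0f), y_ref(n, 3.0f);

        // Serial reference used to verify the translated kernel.
        for (int i = 0; i < n; ++i) y_ref[i] = a * x[i] + y_ref[i];

        sycl::queue q;  // default device: GPU, CPU, or other accelerator
        {
            sycl::buffer<float> bx(x.data(), sycl::range<1>(n));
            sycl::buffer<float> by(y.data(), sycl::range<1>(n));
            q.submit([&](sycl::handler& h) {
                sycl::accessor ax(bx, h, sycl::read_only);
                sycl::accessor ay(by, h, sycl::read_write);
                h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                    ay[i] = a * ax[i] + ay[i];
                });
            });
        }  // buffer destructors synchronize results back to the host here

        // Equivalence check between the translated kernel and the reference.
        for (int i = 0; i < n; ++i) assert(y[i] == y_ref[i]);
        return 0;
    }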
Categories: Science and Technology and other Research and Development.

More Information

Aurora, United States