ParaView Catalyst

Name	ParaView Catalyst
Developer	Kitware, Inc.
Released	2012
Programming language	C++, Python
Operating system	Linux, Windows, macOS
License	BSD license

Contents

Overview
Architecture and Components
Integration and Use Cases
Scripting and APIs
Performance and Scalability
Development and Community
Security and Deployment Considerations

ParaView Catalyst is an in situ visualization and analysis library that processes simulation data while the simulation is running, inside high-performance computing workflows. It was developed to reduce I/O overhead by coupling simulation applications directly to visualization pipelines, so that analysis can run without first writing full datasets to disk for post hoc processing. Catalyst is used in computational science projects, national laboratories, and industrial research centers to accelerate visualization of large-scale simulations.

Overview

Catalyst emerged from collaboration among Kitware, Inc., national laboratories such as Sandia National Laboratories and Los Alamos National Laboratory, and academic groups involved in programs such as ASC and exascale initiatives. It complements visualization systems such as ParaView and VisIt and libraries such as VTK by providing a lightweight, embeddable interface that exposes processing pipelines to simulation codes. Catalyst can be integrated with community codes in domains represented by projects such as GROMACS, OpenFOAM, FLASH, and LAMMPS, and with climate models linked to initiatives such as CMIP and agencies such as NASA and DOE. The design targets large supercomputers such as Summit, Fugaku, and Titan.

Architecture and Components

Catalyst’s architecture is organized around a small set of components: an embeddable adaptor, a pipeline description format, data adapters that bridge simulation memory layouts, and execution engines that invoke filters and writers from visualization toolkits. The adaptor handles runtime integration with host applications that are parallelized with MPI and may also use programming models such as OpenMP and CUDA. Pipeline descriptions are commonly authored as Python scripts that instantiate filters from VTK and ParaView-style algorithms, connect sources to sinks such as image writers or XDMF exporters, and control at which timesteps output is produced; such workflows are used at facilities including NERSC and OLCF. Data adapters map simulation mesh topologies (structured grids, unstructured meshes, particle collections) to the visualization toolkit’s data model, often matching formats used by HDF5 and ADIOS2.
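As a minimal sketch of the idea behind a data adapter, the example below exposes an interleaved simulation buffer as named per-field views without copying it. It uses only the Python standard library. The names FieldView and make_field_views are hypothetical and chosen for illustration; they are not part of the Catalyst API, and a real adapter would hand the memory to the visualization toolkit's own array types instead.

```python
from array import array


class FieldView:
    """Strided, copy-free view of one field in an interleaved buffer.

    Hypothetical helper for illustration; not a Catalyst class.
    """

    def __init__(self, buffer, offset, stride):
        self._view = memoryview(buffer)  # shares memory with the simulation buffer
        self._offset = offset
        self._stride = stride

    def __len__(self):
        return (len(self._view) - self._offset + self._stride - 1) // self._stride

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._view[self._offset + i * self._stride]


def make_field_views(buffer, field_names):
    """Map an interleaved buffer [f0, f1, ..., f0, f1, ...] to named views."""
    stride = len(field_names)
    if len(buffer) % stride != 0:
        raise ValueError("buffer length is not a multiple of the field count")
    return {name: FieldView(buffer, k, stride) for k, name in enumerate(field_names)}


# Simulation state for 3 points, stored as (pressure, temperature) pairs.
state = array("d", [1.0, 300.0, 2.0, 310.0, 3.0, 320.0])
fields = make_field_views(state, ["pressure", "temperature"])

# The simulation updates its own memory; the views see the change without a copy.
state[2] = 2.5
pressures = list(fields["pressure"])
temperatures = list(fields["temperature"])
```

Because the views share memory with the simulation buffer, the analysis side always reads the current state, which is the property that lets an in situ library avoid duplicating large arrays.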

Integration and Use Cases

Catalyst is embedded into simulation codes to perform tasks such as in situ visualization, feature detection, volume rendering, thresholding, and statistical reduction. Use cases include turbulence research on systems funded by agencies such as NSF and DOE, astrophysical simulations run by teams affiliated with Princeton University or Caltech, and engineering workflows at companies such as General Electric and Boeing. Catalyst is often used alongside I/O and data-management frameworks such as ADIOS and coupled to monitoring systems in projects coordinated by laboratories such as Lawrence Berkeley National Laboratory and Argonne National Laboratory. It supports producing derived diagnostics for efforts such as the Exascale Computing Project and collaborations tied to initiatives such as USQCD.
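Statistical reduction in a parallel setting can be illustrated with mergeable partial statistics: each rank summarizes its local values, and the summaries are then combined. The sketch below simulates the ranks with plain Python lists instead of MPI; PartialStats and the sample data are hypothetical and are not taken from Catalyst.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass(frozen=True)
class PartialStats:
    """Summary of one rank's values that can be merged with other summaries."""

    count: int
    total: float
    minimum: float
    maximum: float

    @staticmethod
    def from_values(values):
        values = list(values)
        if not values:
            # Identity element: merging it leaves the other summary unchanged.
            return PartialStats(0, 0.0, math.inf, -math.inf)
        return PartialStats(len(values), math.fsum(values), min(values), max(values))

    def merge(self, other):
        return PartialStats(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self):
        return self.total / self.count if self.count else math.nan


# Hypothetical local values on three ranks; the last rank owns no cells.
rank_data = [[1.0, 4.0], [2.0, 8.0, 5.0], []]
partials = [PartialStats.from_values(values) for values in rank_data]

# In an MPI code this merge would be a reduction across ranks.
global_stats = reduce(PartialStats.merge, partials)
```

Only four numbers per rank cross the network in such a scheme, regardless of how many values each rank holds.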

Scripting and APIs

A typical Catalyst integration defines its pipelines through Python bindings, so pipeline scripts can use ordinary Python language features and packages from ecosystems such as PyPI. The API exposes functions to initialize, execute, and finalize pipelines, and provides adapters for common array and mesh representations used by scientific libraries such as NumPy, PETSc, and Trilinos. Scripted pipelines can incorporate shaders and rendering techniques built on OpenGL and OSPRay and on GPU programming models such as CUDA and OpenCL. Interaction models mirror those of ParaView’s client-server architecture while allowing lightweight embedding in MPI-parallel applications used at centers such as NERSC and Argonne National Laboratory.
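The initialize, execute, and finalize lifecycle can be sketched with a stand-in pipeline driven from a toy time-stepping loop. Everything in this example is an assumption made for illustration: InSituPipeline, run_simulation, and the output_every parameter are hypothetical names, and the block does not call the real Catalyst bindings.

```python
class InSituPipeline:
    """Hypothetical stand-in for an analysis pipeline; not a Catalyst class."""

    def __init__(self, output_every):
        self.output_every = output_every  # process every N-th timestep
        self.initialized = False
        self.finalized = False
        self.processed_steps = []

    def initialize(self):
        self.initialized = True

    def execute(self, step, time, field):
        """Process the field if this timestep is due; return True if it was."""
        if not self.initialized or self.finalized:
            raise RuntimeError("execute() called outside initialize/finalize")
        if step % self.output_every != 0:
            return False
        # Stand-in for real analysis: record the field maximum.
        self.processed_steps.append((step, max(field)))
        return True

    def finalize(self):
        self.finalized = True


def run_simulation(pipeline, num_steps, dt):
    """Toy solver loop that hands its state to the pipeline after each step."""
    field = [0.0, 1.0, 2.0]
    pipeline.initialize()
    try:
        for step in range(num_steps):
            field = [value + dt for value in field]  # stand-in for a solver update
            pipeline.execute(step, step * dt, field)
    finally:
        pipeline.finalize()  # release pipeline resources even if the solver fails


pipeline = InSituPipeline(output_every=2)
run_simulation(pipeline, num_steps=5, dt=0.5)
```

Calling execute on every step and letting the pipeline decide whether work is due keeps the output-frequency policy in the pipeline description and out of the solver.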

Performance and Scalability

Catalyst emphasizes minimizing data movement by operating directly on simulation memory and by supporting parallel execution across MPI ranks and GPU resources on systems such as Summit and Perlmutter. Its scalability strategy draws on practices established on Top500-class systems and in exascale-oriented toolchains. Performance tuning often involves adapting to parallel file systems such as Lustre and GPFS and integrating compression libraries such as ZFP and the compression schemes used in ADIOS2. Benchmarks from collaborations with institutions such as Oak Ridge National Laboratory and Lawrence Livermore National Laboratory report reduced I/O and shorter end-to-end wall-clock times for workflows in computational fluid dynamics, climate modeling, and materials science.
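The I/O argument can be made concrete with a back-of-the-envelope estimate that compares writing full fields at every timestep with writing only rendered images. All numbers below are illustrative assumptions, not measurements from any benchmark, and the function names are hypothetical.

```python
def post_hoc_bytes(num_cells, num_fields, bytes_per_value, num_steps):
    """Bytes written when every field is dumped in full at every timestep."""
    return num_cells * num_fields * bytes_per_value * num_steps


def in_situ_image_bytes(width, height, bytes_per_pixel, images_per_step, num_steps):
    """Bytes written when only uncompressed rendered images are saved."""
    return width * height * bytes_per_pixel * images_per_step * num_steps


# Assumed workload: 1e9 cells, 5 double-precision fields, 100 output steps.
full = post_hoc_bytes(num_cells=10**9, num_fields=5, bytes_per_value=8, num_steps=100)

# Assumed in situ output: four 1920x1080 RGB images per output step.
images = in_situ_image_bytes(1920, 1080, 3, images_per_step=4, num_steps=100)

reduction_factor = full / images
print(f"post hoc: {full / 1e12:.1f} TB, in situ: {images / 1e9:.2f} GB, "
      f"factor: {reduction_factor:.0f}x")
```

Under these assumptions the full dumps total 4 TB while the images total about 2.5 GB. The ratio depends entirely on the chosen mesh size, field count, and image settings.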

Development and Community

Catalyst’s development is coordinated by Kitware, Inc., with contributions from national laboratories, universities including the University of Utah and the University of Chicago, and community projects in the scientific computing ecosystem. The project follows open-source collaboration patterns similar to those of projects hosted by the Apache Software Foundation and engages users through workshops at conferences including the Supercomputing Conference, the SciPy Conference, and the IEEE Visualization Conference. Community-driven extensions integrate with software from groups like Census and research centers including Argonne National Laboratory and Lawrence Berkeley National Laboratory.

Security and Deployment Considerations

Deploying Catalyst in production HPC environments requires attention to the security models used by centers such as Oak Ridge National Laboratory and Argonne National Laboratory, including job scheduler constraints from systems such as Slurm and the identity management approaches used at NERSC. Administrators must consider the attack surfaces introduced by embedded Python interpreters, by remote execution components present in client-server systems such as ParaView, and by dependencies on libraries from vendors such as NVIDIA and Intel Corporation. Recommended practices mirror those adopted by research IT groups at institutions such as MIT, Stanford University, and Harvard University: isolating pipeline execution, applying secure build toolchains, and auditing third-party components.
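One way a site might audit pipeline scripts before an embedded interpreter runs them is to compare each script's SHA-256 digest against a list of reviewed digests. This is a minimal sketch of that idea, not a mechanism provided by Catalyst; the helper names and the sample script contents are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_approved(script_bytes: bytes, approved_digests) -> bool:
    """Check whether a script's digest appears in the reviewed allowlist."""
    digest = sha256_hex(script_bytes)
    return any(hmac.compare_digest(digest, known) for known in approved_digests)


# Hypothetical pipeline script that has been reviewed and recorded.
reviewed_script = b"# reviewed pipeline\noutput_every = 10\n"
approved = {sha256_hex(reviewed_script)}

# Any modification after review changes the digest and fails the check.
tampered_script = reviewed_script + b"import os\n"

ok_reviewed = is_approved(reviewed_script, approved)
ok_tampered = is_approved(tampered_script, approved)
```

A check like this covers only script integrity; isolating the interpreter and auditing compiled dependencies remain separate concerns.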

Category:Visualization software