GPU assembly support on AMD and CUDA #7042

Open
multitalentloes wants to merge 23 commits into OPM:master from multitalentloes:add_gpu_thermalgaswaterassembly2

@multitalentloes
Member

This PR contains the GPU assembly support extracted from #7018.
It still carries more code from that branch than the assembly strictly needs; the goal is to reduce this PR to the smallest change set that still supports the new GPU linearization.

@multitalentloes added the manual:irrelevant label (This PR is a minor fix and should not appear in the manual) on May 7, 2026
@multitalentloes force-pushed the add_gpu_thermalgaswaterassembly2 branch 2 times, most recently from d3c8416 to 60c8298, on May 8, 2026 07:59
@atgeirr requested a review from Copilot on May 11, 2026 08:13
Copilot AI left a comment

Pull request overview

This PR extracts and wires up GPU matrix assembly/linearization support for Flow (CUDA + HIP/AMD), building on prior GPU POC work, and introduces GPU-oriented model/problem shims plus supporting GPU ISTL utilities.

Changes:

  • Add GPU TPFA linearization/assembly path with supporting GPU data transfers (intensive quantities, boundary info, residual/Jacobian buffers).
  • Introduce simplified GPU-friendly model/problem wrappers used by the GPU assembly kernels.
  • Extend GPU ISTL utilities (typed diagonal block pointers; GpuView access changes) and update build system to compile a flow_gpu variant.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| tests/gpuistl/test_GpuSparseTable.cu | Extends kernel test to validate SparseTable::dataSize() usage on device. |
| opm/simulators/linalg/gpuistl/GpuView.hpp | Adjusts operator[] to return const T&; adds an extra include. |
| opm/simulators/linalg/gpuistl/detail/gpusparse_matrix_operations.hpp | Declares typed diagonal block pointer extraction API. |
| opm/simulators/linalg/gpuistl/detail/gpusparse_matrix_operations.cu | Implements getDiagPtrsTyped() and instantiates for MiniMatrix block sizes. |
| opm/simulators/flow/Transmissibility.hpp | Adds accessor for thermal boundary half-transmissibility map. |
| opm/simulators/flow/Transmissibility_impl.hpp | Implements new transmissibility map accessor. |
| opm/simulators/flow/SimplifiedGpuBlackOilModel.hpp | Adds simplified FI black-oil model wrapper and GPU copy/view helpers. |
| opm/simulators/flow/SimplifiedFlowProblemGPU.hpp | Adds simplified GPU problem wrapper for boundary thermal transmissibility (alpha) + module params. |
| opm/simulators/flow/NewTranFluxModule.hpp | Generalizes pressure-diff calculation template to accept alternative module params type. |
| opm/simulators/flow/FlowProblem.hpp | Exposes thermalLawManager() accessor. |
| opm/simulators/aquifers/BlackoilAquiferModel.hpp | Disables SupportsFaceTag include (commented). |
| opm/simulators/aquifers/BlackoilAquiferModel_impl.hpp | Disables grid SupportsFaceTag static_assert (commented). |
| opm/models/discretization/common/tpfalinearizer.hh | Major: adds GPU assembly path (domain/neighbor/boundary transfers, kernels, GPU Jacobian/residual handling). |
| opm/models/discretization/common/fvbaseproperties.hh | Introduces to_gpu_type(_t) mapping and a GpuFIBlackOilModel property hook. |
| opm/models/discretization/common/fvbasediscretization.hh | Adds helpers to fetch all cached IQs (timeIdx 0/1) and a numDof() accessor. |
| opm/models/blackoil/blackoillocalresidualtpfa.hh | Refactors boundary flux handling for GPU/non-static fluid system; adds GPU-related helpers. |
| opm/models/blackoil/blackoilintensivequantities.hh | Adjusts assignment for host/device; adds GPU-switch constructors; changes withOtherFluidSystem() to accept a pointer. |
| opm/models/blackoil/blackoilextbomodules.hh | Replaces a raw throw with OPM_THROW. |
| opm/models/blackoil/blackoildiffusionmodule.hh | Adds accessors for diffusion/tortuosity arrays; marks update as host/device. |
| opm/models/blackoil/blackoilconvectivemixingmodule.hh | Minor fluid-system index usage adjustments; adds a fluidsystem include. |
| flow/flow_gpu.hpp | Adds GPU flow typetag mapping and declarations shared between CUDA/HIP entrypoints. |
| flow/flow_gpu.hip | Adds HIP entrypoint implementation for GPU flow variant. |
| flow/flow_gpu.cu | Adds CUDA entrypoint implementation for GPU flow variant. |
| flow/flow_gpu_main.cpp | Adds standalone main for flow_gpu binary. |
| CMakeLists.txt | Adds gpu Flow model variant and selects .cu/.hip source accordingly. |
| CMakeLists_files.cmake | Adds per-file CUDA flags for new GPU sources and installs new public headers. |
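
The summary mentions a to_gpu_type(_t) mapping in fvbaseproperties.hh, but the page does not show its definition. As a rough illustration of how such a host-to-device type mapping is commonly written, here is a minimal sketch. HostBuffer and DeviceBuffer are hypothetical stand-ins, and the actual trait in the PR may be defined quite differently.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for a host container and its device-side counterpart.
// They are NOT types from the PR.
template <class T>
struct HostBuffer {
    std::vector<T> data;
};

template <class T>
struct DeviceBuffer {
    T* ptr = nullptr;
    std::size_t size = 0;
};

// Primary template: a type without a dedicated GPU counterpart maps to itself.
template <class T>
struct to_gpu_type {
    using type = T;
};

// Specialization: a host buffer maps to a device buffer of the same element type.
template <class T>
struct to_gpu_type<HostBuffer<T>> {
    using type = DeviceBuffer<T>;
};

// Convenience alias, following the usual _t convention.
template <class T>
using to_gpu_type_t = typename to_gpu_type<T>::type;

static_assert(std::is_same_v<to_gpu_type_t<double>, double>);
static_assert(std::is_same_v<to_gpu_type_t<HostBuffer<double>>, DeviceBuffer<double>>);
```

A trait of this shape lets generic code name the device-side type of a component without hard-coding it at each call site.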

Comment on lines +34 to +35
#include <iostream>

Comment on lines +19 to +30
#include <opm/common/utility/gpuistl_if_available.hpp>
#include <opm/common/utility/VectorWithDefaultAllocator.hpp>

namespace Opm {

template<class Scalar, template <class> class Storage = Opm::VectorWithDefaultAllocator>
class SimplifiedFlowProblemGPU
{
public:
    using ModuleParams = BlackoilModuleParams<ConvectiveMixingModuleParam<Scalar, Storage>>;

    SimplifiedFlowProblemGPU() = default;

        } else if (boundaryFaceIndex == 2) {
            return alpha2_[globalIndex];
        } else {
            OPM_THROW(std::logic_error, "Invalid boundary face index: " + std::to_string(boundaryFaceIndex));
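
The fragment above selects a per-face alpha array through an if/else chain and throws on an invalid face index. One possible alternative, sketched below under stated assumptions, stores the arrays in a std::array indexed by face so that the lookup becomes a bounds check plus an index. BoundaryAlphaSketch, its NumFaces parameter, and the set/get names are hypothetical; the real class uses a Storage template and OPM_THROW, which this sketch replaces with std::vector and a plain throw so that it compiles on its own. Throwing is also a host-only mechanism, so device code would need a different error path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: NumFaces is an assumption, since the excerpt does not
// show how many boundary faces the real class supports.
template <class Scalar, std::size_t NumFaces>
class BoundaryAlphaSketch
{
public:
    // Allocates one array per boundary face, each with numCells entries.
    explicit BoundaryAlphaSketch(std::size_t numCells, Scalar init = Scalar{0})
    {
        for (auto& perFace : alpha_) {
            perFace.assign(numCells, init);
        }
    }

    // Setter with debug-only index checks.
    void set(std::size_t boundaryFaceIndex, std::size_t globalIndex, Scalar value)
    {
        assert(boundaryFaceIndex < NumFaces);
        assert(globalIndex < alpha_[boundaryFaceIndex].size());
        alpha_[boundaryFaceIndex][globalIndex] = value;
    }

    // Lookup that mirrors the error message of the excerpt for a bad face index.
    Scalar get(std::size_t boundaryFaceIndex, std::size_t globalIndex) const
    {
        if (boundaryFaceIndex >= NumFaces) {
            throw std::logic_error("Invalid boundary face index: "
                                   + std::to_string(boundaryFaceIndex));
        }
        return alpha_[boundaryFaceIndex].at(globalIndex);
    }

private:
    std::array<std::vector<Scalar>, NumFaces> alpha_;
};
```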
Comment on lines +31 to +53
#include <opm/models/blackoil/blackoilmodel.hh>
#include <opm/models/utils/propertysystem.hh>

#include <opm/common/ErrorMacros.hpp>
#include <opm/models/utils/propertysystem.hh>

#include <opm/grid/CpGrid.hpp>
#include <opm/grid/utility/ElementChunks.hpp>

#include <opm/models/blackoil/blackoilconvectivemixingmodule.hh>
#include <opm/models/blackoil/blackoilmoduleparams.hh>
#include <opm/models/parallel/threadmanager.hpp>
#include <opm/simulators/utils/DeferredLoggingErrorHelpers.hpp>

#include <opm/material/fluidmatrixinteractions/EclMultiplexerMaterialParams.hpp>

#include <cstddef>
#include <stdexcept>
#include <type_traits>

#ifdef _OPENMP
#include <omp.h>
#endif

  #include <opm/simulators/aquifers/AquiferFetkovich.hpp>
  #include <opm/simulators/aquifers/AquiferNumerical.hpp>
- #include <opm/simulators/aquifers/SupportsFaceTag.hpp>
+ // #include <opm/simulators/aquifers/SupportsFaceTag.hpp>

  // Grid needs to support Facetag
  using Grid = std::remove_const_t<std::remove_reference_t<decltype(simulator.vanguard().grid())>>;
- static_assert(SupportsFaceTag<Grid>::value, "Grid has to support assumptions about face tag.");
+ // static_assert(SupportsFaceTag<Grid>::value, "Grid has to support assumptions about face tag.");
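
The change above disables the face-tag check for every build by commenting it out. If the intent is only to skip the check in the GPU variant, one option is to keep the assertion and gate it on a compile-time flag. The sketch below shows that pattern with hypothetical names throughout (SupportsFaceTagSketch, CpuGridSketch, GpuGridSketch, checkGrid); it does not use the real SupportsFaceTag trait or any OPM grid type.

```cpp
#include <type_traits>

// Hypothetical trait standing in for SupportsFaceTag; defaults to "not supported".
template <class Grid>
struct SupportsFaceTagSketch : std::false_type {};

struct CpuGridSketch {}; // hypothetical grid that provides face tags
struct GpuGridSketch {}; // hypothetical grid used by a GPU build, without face tags

template <>
struct SupportsFaceTagSketch<CpuGridSketch> : std::true_type {};

// The assertion only fires when the caller requires face tags, so the CPU path
// keeps its compile-time guarantee while the GPU path opts out explicitly.
template <class Grid, bool RequireFaceTag>
constexpr bool checkGrid()
{
    static_assert(!RequireFaceTag || SupportsFaceTagSketch<Grid>::value,
                  "Grid has to support assumptions about face tag.");
    return true;
}

static_assert(checkGrid<CpuGridSketch, true>());
static_assert(checkGrid<GpuGridSketch, false>());
```

Instantiating checkGrid<GpuGridSketch, true>() would fail to compile, which is the behaviour the original static_assert provided.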
Comment thread opm/models/blackoil/blackoilintensivequantities.hh
Comment on lines +663 to +685
    // TODO: make this more efficient (avoid copy!) and still valid for gpu
    // TODO: did not want to delve into the special IntQuants vector with special allocators
    std::vector<IntensiveQuantities> allIntensiveQuantities0()
    {
        std::vector<IntensiveQuantities> allIntensiveQuantities;
        auto& timeZeroIntQuants = intensiveQuantityCache_[0];
        allIntensiveQuantities.insert(allIntensiveQuantities.end(),
                                      timeZeroIntQuants.begin(),
                                      timeZeroIntQuants.end());

        return allIntensiveQuantities;
    }

    std::vector<IntensiveQuantities> allIntensiveQuantities1()
    {
        std::vector<IntensiveQuantities> allIntensiveQuantities;
        auto& timeOneIntQuants = intensiveQuantityCache_[1];
        assert(!timeOneIntQuants.empty());
        allIntensiveQuantities.insert(allIntensiveQuantities.end(),
                                      timeOneIntQuants.begin(),
                                      timeOneIntQuants.end());
        return allIntensiveQuantities;
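
The TODO in this hunk asks for a way to avoid the copy. A minimal sketch of one approach is to return a const reference to the cached storage, with a single accessor parameterized on the time index. IntensiveQuantityCacheSketch and its all() method are hypothetical names, and the sketch assumes the cache is a plain std::vector. The TODO notes that the real cache uses a vector with special allocators, so the return type would need to match that container.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the discretization's two-level cache (timeIdx 0 and 1).
template <class IntensiveQuantities>
class IntensiveQuantityCacheSketch
{
public:
    explicit IntensiveQuantityCacheSketch(std::size_t numDof)
    {
        cache_[0].resize(numDof);
        cache_[1].resize(numDof);
    }

    // Returns a reference to the cached storage; no elements are copied.
    // The reference stays valid only as long as the cache is not resized.
    const std::vector<IntensiveQuantities>& all(unsigned timeIdx) const
    {
        assert(timeIdx < 2);
        return cache_[timeIdx];
    }

private:
    std::array<std::vector<IntensiveQuantities>, 2> cache_;
};
```

A caller that uploads to the GPU could then pass all(0).data() and all(0).size() directly to the transfer routine instead of building a temporary vector first.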
Labels

manual:irrelevant This PR is a minor fix and should not appear in the manual

4 participants