CMN 2026

A minimum-residual multiscale Galerkin framework with applications on h- and r-adaptivity using neural networks

Uriarte, Carlos (Basque Center for Applied Mathematics (BCAM))
Calo, Victor (Curtin University)

In session: TS001B - Neural Networks for Solving PDEs II


We present a robust minimum-residual multiscale framework that addresses the instability of standard Galerkin formulations in the pre-asymptotic regime. Our approach combines the structural decomposition of variational multiscale methods with a minimum-residual framework based on optimal test functions. The formulation is equivalent to a mixed saddle-point system, which naturally decouples the fine-scale Galerkin solution into a stable coarse component and a computable fine-scale correction. A central contribution of this work is the extension of the neural-network-based r-adaptive strategy established in [1] to this robust multiscale setting. Following that precedent, we reinterpret the mesh generation process as a differentiable computational graph in which the node positions are defined by unconstrained real-valued learnable parameters passed through a differentiable softmax layer. This parameterization allows us to treat the mesh nodes as trainable weights in a neural architecture, which we train via gradient descent to minimize a global loss function defined by the fine-scale residual energy. For efficiency, we rely on the JAX library for just-in-time compilation and automatic differentiation. A key innovation in our implementation is the derivation of a “surrogate” loss function that exploits the symmetry (self-adjointness) of the minimum-residual formulation. In standard bi-level optimization, computing gradients with respect to the mesh parameters requires differentiating through the linear solver (implicit differentiation), which is expensive. By exploiting the stationarity of the Ritz energy, our surrogate loss detaches the forward state solution from the computational graph. This yields the same gradients as the adjoint method while using only the forward assembly traces, which significantly accelerates the training loop.
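The softmax mesh parameterization and the detached-state surrogate loss described above can be illustrated with a minimal JAX sketch. It is not the authors' implementation: it assumes a 1D Poisson problem -u'' = f on (0, 1) with homogeneous Dirichlet conditions, piecewise-linear elements, midpoint quadrature, and the plain Ritz energy as a stand-in for the fine-scale residual energy. All names (`mesh_nodes`, `source`, `assemble`, `full_loss`, `surrogate_loss`) and the choice of f are hypothetical.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)  # double precision for the gradient comparison

def mesh_nodes(theta):
    # Softmax maps unconstrained parameters to positive element lengths that sum to 1;
    # their cumulative sum gives an ordered set of nodes on [0, 1].
    h = jax.nn.softmax(theta)
    return jnp.concatenate([jnp.zeros(1), jnp.cumsum(h)])

def source(x):
    # Hypothetical right-hand side with a sharp feature near x = 0.
    return 100.0 * jnp.exp(-10.0 * x)

def assemble(theta):
    # P1 stiffness matrix and load vector on the interior nodes (dense, for brevity).
    x = mesh_nodes(theta)
    h = jnp.diff(x)
    inv_h = 1.0 / h
    main = inv_h[:-1] + inv_h[1:]
    off = -inv_h[1:-1]
    K = jnp.diag(main) + jnp.diag(off, 1) + jnp.diag(off, -1)
    mid = 0.5 * (x[:-1] + x[1:])
    elem = 0.5 * h * source(mid)   # midpoint rule, split evenly between the two element nodes
    F = elem[:-1] + elem[1:]
    return K, F

def ritz_energy(u, K, F):
    return 0.5 * u @ K @ u - F @ u

def full_loss(theta):
    # Reference: differentiates through the linear solve.
    K, F = assemble(theta)
    u = jnp.linalg.solve(K, F)
    return ritz_energy(u, K, F)

def surrogate_loss(theta):
    # Surrogate: the state is detached, so only the assembly of K and F is differentiated.
    # Because K u = F makes the energy stationary in u, the gradient is unchanged.
    K, F = assemble(theta)
    u = jax.lax.stop_gradient(jnp.linalg.solve(K, F))
    return ritz_energy(u, K, F)

# Plain gradient descent on the mesh parameters (step size chosen by hand).
grad_fn = jax.jit(jax.grad(surrogate_loss))
theta = jnp.zeros(8)
for _ in range(100):
    theta = theta - 0.05 * grad_fn(theta)
```

In this simplified setting the two losses take the same value and, up to the linear-solver residual, have the same gradient; the surrogate simply avoids backpropagating through `jnp.linalg.solve`.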
When geometric optimization saturates, an h-refinement step driven by local residual indicators enriches the discretization. Numerical results confirm that this hybrid framework ensures discrete robustness, effectively stabilizing the adaptive process through the unstable pre-asymptotic regime and optimally guiding it toward the naturally stable asymptotic regime where saturation guarantees convergence.

REFERENCES

[1] D. Aballay, et al., “An r-adaptive finite element method using neural networks for parametric self-adjoint elliptic problems”, J. Comp. Phys. ... (2026)