Leveraging automatic differentiation in modern machine learning frameworks for (neural) topology optimization

Journal Article (2026)
Author(s)

Suryanarayanan Manoj Sanu (TU Delft - Mechanical Engineering)

Miguel A. Bessa (Brown University)

Alejandro M. Aragón (TU Delft - Mechanical Engineering)

Research Group
Computational Design and Mechanics
DOI related publication
https://doi.org/10.1007/s00158-026-04299-6 (final published version)
Publication Year
2026
Language
English
Journal title
Structural and Multidisciplinary Optimization
Issue number
5
Volume number
69
Article number
126
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automatic differentiation (AD) was introduced into topology optimization (TO) more than two decades ago to compute accurate gradients through complex computational workflows. Nevertheless, its adoption within the TO community has remained limited, largely due to the strong reliance on adjoint-based sensitivity analysis—which typically offers superior memory efficiency and runtime performance—and the practical difficulties of integrating large-scale simulations into specialized AD frameworks. The recent rise of machine learning (ML) has opened new opportunities for TO through the advanced AD capabilities of modern ML frameworks such as JAX and PyTorch. A growing body of work at the intersection of ML and TO now focuses on tightly coupling ML components with classical TO workflows. Neural TO is a prominent example, in which an untrained neural network parameterizes the material density field and optimization proceeds over the network parameters. To enable such ML–TO hybrid workflows, a deeper understanding of how AD systems operate in these frameworks is essential. This article explains the practical principles of AD in modern ML frameworks and their relation to classical adjoint-based sensitivity analysis. We present implementation strategies for wrapping essential operations—such as finite element solvers—into AD-compatible components without reimplementing them from scratch. These ideas are illustrated through two compact code examples: a classical TO pipeline with selectively AD-wrapped components and a neural TO workflow.
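As a rough illustration of the wrapping idea described in the abstract, the sketch below registers a linear solve as a custom AD primitive in JAX, so that reverse-mode AD calls an adjoint solve instead of tracing through the solver. It is not the article's code: the 1D bar model, the SIMP-style interpolation, and the names `assemble_stiffness`, `solve`, and `compliance` are hypothetical choices made for this example, and `jnp.linalg.solve` merely stands in for an existing finite element solver.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)  # double precision for the gradient check


def assemble_stiffness(rho, penal=3.0, e_min=1e-6):
    """Stiffness matrix of a 1D bar of unit-length elements, fixed at node 0.

    Hypothetical toy model: element stiffness follows a SIMP-style
    interpolation, k_e = e_min + rho_e**penal * (1 - e_min).
    """
    k = e_min + rho**penal * (1.0 - e_min)
    # Free node j touches element j-1 and, except at the tip, element j.
    diag = k + jnp.concatenate([k[1:], jnp.zeros(1)])
    off = -k[1:]
    return jnp.diag(diag) + jnp.diag(off, 1) + jnp.diag(off, -1)


@jax.custom_vjp
def solve(K, f):
    """Solve K u = f. Stand-in for an existing (possibly external) solver."""
    return jnp.linalg.solve(K, f)


def solve_fwd(K, f):
    u = jnp.linalg.solve(K, f)
    return u, (K, u)  # residuals saved for the backward pass


def solve_bwd(residuals, u_bar):
    K, u = residuals
    # Adjoint solve: K^T lam = u_bar. Then K_bar = -lam u^T and f_bar = lam.
    lam = jnp.linalg.solve(K.T, u_bar)
    return -jnp.outer(lam, u), lam


solve.defvjp(solve_fwd, solve_bwd)


def compliance(rho):
    """Compliance f^T u of the bar under a unit load at the free end."""
    K = assemble_stiffness(rho)
    f = jnp.zeros(rho.shape[0]).at[-1].set(1.0)
    u = solve(K, f)
    return f @ u


rho = jnp.array([0.5, 0.6, 0.7, 0.8])
grad_ad = jax.grad(compliance)(rho)

# Closed-form check for springs in series: c = sum(1 / k_e).
k = 1e-6 + rho**3 * (1.0 - 1e-6)
grad_exact = -3.0 * rho**2 * (1.0 - 1e-6) / k**2
print(jnp.allclose(grad_ad, grad_exact))
```

In this sketch only `solve` carries a hand-written backward rule; the assembly and the objective are differentiated by JAX as usual. The backward rule is the classical adjoint equation, which is one way to see the relation between reverse-mode AD and adjoint-based sensitivity analysis that the abstract refers to.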