Analysis of Large Non-Linear Systems Underlying Simulations of Technical Processes

Consider a production process for which a "rigorous" model exists, i.e. a mathematical model of the physical and chemical processes taking place in the plant. Such a model is called a chemical flowsheet (because it models the flows and reactions of substances through all process units of the plant), and the numerical simulation of the plant is called a flowsheet simulation. Mathematically, the whole process can be described by one large set of non-linear equations:

$f(x) = 0$ s.t. $g(x) \geq 0$, $h(x) = s$.

The vector x collects all quantities needed to describe the process. For a realistic production plant these can be thousands of quantities (temperatures, pressures, compositions, flowrates, temperature-dependent thermodynamic properties like activity coefficients, ...). Moreover, since the equations are non-linear, they can be very hard to solve. Hence, flowsheet simulations can be very time-intensive.
The goal is to learn something about the behavior of the process. In that respect, we are usually only interested in what happens when we change a very small subset of parameters. For example, we may be interested in how the product purities and the electric and heat duties needed for operation change when we change the composition of the input stream to a reactor. In the simulation we then need to vary the values of a small set of quantities s (the so-called specifications) and observe the desired quantities. Imposing these specifications in the mathematical description above is encoded in

$h(x) = s$,

which in words means: the (possibly thousands of) parameters x must be chosen such that the handful of specified quantities take the values s we want them to have. We could now vary the specified values s over the range of interest and record the quantities we are interested in (e.g. product purities, heat duties, quantities for evaluating safety, ...). This, however, cannot be done by brute force, because the simulations are far too time-intensive. Moreover, the solution of the equations can fail numerically for some sets of specifications. Such failed simulations usually take even longer than successful ones, and a priori we do not know for which specification values s this will happen.
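To make the roles of f, g, h and s concrete, here is a minimal sketch on a toy "flowsheet" that is not taken from the papers: a single mixer with two feeds. The unknowns x = (F1, F2, F, z), the feed compositions `Z1` and `Z2`, the non-negativity constraint used as g, and all function names are assumptions made for illustration. The model equations f and the specifications h(x) = s are stacked into one residual and solved with a plain Newton iteration. A real flowsheet solver is far more elaborate.

```python
Z1, Z2 = 0.2, 0.8  # assumed compositions of the two feed streams


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting; raises on a singular matrix."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ArithmeticError("singular Jacobian")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def residual(x, s):
    """Stacked system: f(x) = 0 (two balances) and h(x) - s = 0 (two specs)."""
    F1, F2, F, z = x
    return [F - F1 - F2,                 # total mass balance
            F * z - Z1 * F1 - Z2 * F2,   # component balance (non-linear in F, z)
            F1 - s[0],                   # specification: first feed flow
            z - s[1]]                    # specification: product composition


def jacobian(x):
    F1, F2, F, z = x
    return [[-1.0, -1.0, 1.0, 0.0],
            [-Z1, -Z2, z, F],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def simulate(s, max_iter=20, tol=1e-10):
    """Return (status, x) with status 'ok', 'infeasible' or 'failed'."""
    x = [1.0, 1.0, 2.0, 0.5]  # initial guess
    for _ in range(max_iter):
        r = residual(x, s)
        if max(abs(v) for v in r) < tol:
            break
        try:
            dx = solve_linear(jacobian(x), [-v for v in r])
        except ArithmeticError:
            return "failed", None        # numerical failure of the solve
        x = [xi + di for xi, di in zip(x, dx)]
    else:
        return "failed", None            # no convergence within max_iter
    if min(x) < 0.0:
        return "infeasible", x           # violates g(x) >= 0
    return "ok", x


# Brute-force sweep over one specification (product composition z).
for z_spec in (0.3, 0.4, 0.6, 0.8, 0.9):
    status, x = simulate((1.0, z_spec))
    print(z_spec, status)
```

Even in this tiny example, some specification values cannot be met: the product composition cannot reach or exceed that of the richer feed with non-negative flows. For a large flowsheet the corresponding regions in s are not known in advance.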

In my research, I work on algorithms that use different machine learning models (classification and regression) to explore the behavior of chemical production processes very efficiently by means of rigorous simulations. Summarized in one sentence: we generate sets of simulations by alternating between running rigorous simulations and training machine learning models that predict where the next rigorous simulations may be most useful for learning about the process. These adaptive sampling methods may be regarded as "Design of Experiments for Simulations". For further information please see the papers

P. O. Ludl, R. Heese, J. Höller, N. Asprion and M. Bortz
Using machine learning models to explore the solution space of large nonlinear systems underlying flowsheet simulations with constraints
Front. Chem. Sci. Eng. 16, 183 (2022), doi:10.1007/s11705-021-2073-7

J. Höller, M. Bubel, R. Heese, P. O. Ludl, P. Schwartz, J. Schwientek, N. Asprion, M. Wlotzka and M. Bortz
Adaptively exploring the feature space of flowsheets
AIChE Journal (2024) e18404, doi:10.1002/aic.18404
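The alternation between simulating and model training described above can be illustrated with a deliberately simplified sketch. It is not the algorithm from the papers. Everything here is an assumption for illustration: a one-dimensional specification on an integer grid, a stand-in `simulate` function that "fails" beyond a threshold, a 1-nearest-neighbour classifier as the machine learning model, and a distance-based `usefulness` score that penalises candidates predicted to fail.

```python
def simulate(i):
    """Hypothetical stand-in for a rigorous simulation at specification s = i / 100.

    Returns an observed quantity, or None when the (pretend) solver fails.
    """
    s = i / 100
    return None if i > 70 else s ** 2


def predict_converges(i, samples):
    """1-nearest-neighbour classifier 'trained' on the simulations run so far."""
    nearest = min(samples, key=lambda j: abs(i - j))
    return samples[nearest] is not None


def usefulness(i, samples):
    """Prefer candidates far from existing samples; penalise predicted failures."""
    distance = min(abs(i - j) for j in samples)
    return 4 * distance if predict_converges(i, samples) else distance


def explore(budget=12):
    """Alternate between scoring candidates with the model and simulating the best one."""
    candidates = range(101)                            # grid of specification values
    samples = {i: simulate(i) for i in (0, 50, 100)}   # small initial design
    while len(samples) < budget:
        best = max(candidates, key=lambda i: usefulness(i, samples))
        samples[best] = simulate(best)                 # the expensive step
    return samples


samples = explore()
failed = sorted(i for i, y in samples.items() if y is None)
print("simulated at:", sorted(samples))
print("failed at:", failed)
```

With this scoring, most of the simulation budget goes to the region where the solver is predicted to converge, and only a few simulations are spent on locating the failing region. A more faithful version would also use a regression model, for example to place points where the predicted quantities are most uncertain.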

Patrick Ludl
