Computers to the rescue: unraveling the secrets of models

The purpose of science is to explain what we observe around us, i.e., the universal system.^[1] Models are a fundamental tool for scientists. But what exactly is a model in science? I put this question to Copilot and received the following answer:

In science, a model is a simplified representation of a system or phenomenon, designed to help understand, explain, predict, or control its behavior. Models can be physical, mathematical, or computational.

Here’s a quick breakdown:

Physical Models: Scale models like a globe or a wind tunnel prototype.
Mathematical Models: Equations or formulas that describe relationships, like Newton’s laws of motion.
Computational Models: Simulations run on computers, like climate models or CFD (Computational Fluid Dynamics) for fluid flow.

They are essential tools for scientists because they provide insights into systems that are too complex, large, small, or dangerous to study directly.

Models are also created and used by companies, e.g., to describe the behavior of a population of clients or the dynamics of supply and demand in financial markets. Such models are typically used in what-if scenario analysis, where the company tries to quantify the effects of specific internal decisions or specific external conditions.

Mathematical models can be abstract representations of phenomenological principles obtained empirically (e.g., Darcy’s law in porous media flow, the law of mass action in chemical reactions), or of conservation laws (e.g., conservation of energy). One can define such an abstract model $M$ as \begin{align*} M(\mathbf{y}, \mathbf{u}; \boldsymbol{\theta}(\mathbf{y})) = \mathbf{F}\;, \end{align*} subject to constraints $C$ \begin{align*} C(\mathbf{y}, \mathbf{u}; \boldsymbol{\theta}(\mathbf{y})) \geqslant \mathbf{0}\;. \end{align*} Here $\mathbf{y}$ denotes the dependent/outcome/response variables, $\mathbf{u}$ the (possibly correlated) explanatory/independent variables, $\mathbf{F}$ a source term, and $\boldsymbol{\theta}(\mathbf{y})$ the set of parameters within the model. The model is homogeneous when $M(\mathbf{0},\mathbf{u}; \boldsymbol{\theta}(\mathbf{0})) \equiv \mathbf{0}$. In case of $\mathbf{F} \neq \mathbf{0}$, the model is called inhomogeneous.

For spatiotemporal models, we have $\mathbf{y} := \mathbf{y}(t,\mathbf{x})$, $\mathbf{u} := \mathbf{u}(t,\mathbf{x})$, and $\boldsymbol{\theta}(\mathbf{y}) := \boldsymbol{\theta}(t,\mathbf{x}, \mathbf{y}(t,\mathbf{x}))$, where $t$ denotes time and $\mathbf{x}$ the coordinates in a specific spatial geometry (Euclidean 3D; this can be the entire space, a 2D surface, a cylinder, etc.). Time-varying models depend on time only. When there is a dependency on time, the model is called dynamic; if there is no time dependency, the model is called static. For dynamic models, we must provide an initial condition, i.e., the known value of $\mathbf{y}$ at $t \!=\! 0$. This condition is usually absorbed in the constraints $C$.^[2] A solution of a dynamic model can become stationary; in this case we say that the model reaches an equilibrium. Nonlinear models can have multiple equilibria.^[3] Some dynamic models may exhibit time-periodic behavior for large times (e.g., oscillatory solutions), or they are time-periodic from the start.

The quantities $\mathbf{y}$, $\mathbf{u}$, and $\boldsymbol{\theta}(\mathbf{y})$ can be deterministic, but they can also have an assumed or estimated (from empirical data) probability distribution. The model $M$ is given here implicitly, and can represent a linear or a nonlinear model.^[4] Similarly, the constraints $C$ can be linear or nonlinear relations.
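
To make the notation less abstract, here is a small worked example. It assumes logistic growth as a stand-in model (my choice for illustration only), with a hypothetical growth rate $\theta_1 > 0$ and carrying capacity $\theta_2 > 0$ that do not depend on $y$, no explanatory variables $\mathbf{u}$, and a scalar outcome $y(t)$: \begin{align*} M(y; \boldsymbol{\theta}) := \frac{\mathrm{d}y}{\mathrm{d}t} - \theta_1\, y \left(1 - \frac{y}{\theta_2}\right) = 0\;, \qquad y(0) = y_0 > 0\;. \end{align*} The source term is $F = 0$ and $M(0; \boldsymbol{\theta}) = 0$, so this model is homogeneous. It is also dynamic and nonlinear. Setting $\mathrm{d}y/\mathrm{d}t = 0$ gives \begin{align*} \theta_1\, y \left(1 - \frac{y}{\theta_2}\right) = 0 \quad \Longrightarrow \quad y = 0 \quad \text{or} \quad y = \theta_2\;, \end{align*} i.e., two equilibria for one set of parameters. For any $y_0 > 0$ the solution becomes stationary at $y = \theta_2$ for large times.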

The parameters $\boldsymbol{\theta}(\cdot)$ are determined to give the model an explicit state. However, the state doesn’t need to be unique: for nonlinear systems there may be more than one solution for a single set of parameters. Determining these parameters may be challenging for complex models with a large number of parameters. Sometimes parameter determination is part of the modeling exercise, but in most cases the parameters are determined via experiments or measurements. In an experiment, one observes or measures a set of points $\{\mathbf{u}_i,\mathbf{y}_i\}$. These measurements are fed to the model $M$. Via regression methods, the model parameters $\boldsymbol{\theta}(\cdot)$ are determined in such a way that the model fits the provided measurements in the statistically most likely way: \begin{align*} M(\mathbf{y}_i, \mathbf{u}_i; \hat{\boldsymbol{\theta}}(\mathbf{y}_i)) \approx \mathbf{F}\;, \qquad \text{for each $i$,} \end{align*} where each measurement satisfies the constraints $C$. The procedure of determining $\hat{\boldsymbol{\theta}}(\cdot)$ is called model calibration. One should be careful not to over-parametrize the model, i.e., to include more parameters than can be estimated from the data. In that case there might be hidden (linear or nonlinear) relations between the parameters.^[5]
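
A minimal sketch of calibration, under assumptions that are mine and not part of the discussion above: the model is the linear relation $y = \theta_0 + \theta_1 u$, the "measurements" are synthetic data with Gaussian noise, and all names (`calibrate_linear`, `theta_true`, and so on) are hypothetical. For independent Gaussian noise, the least-squares fit used here coincides with the maximum-likelihood estimate. The last lines hint at over-parametrization: with two parameters multiplying the same variable, only their sum can be estimated.

```python
import numpy as np


def calibrate_linear(u, y):
    """Least-squares estimate of (theta0, theta1) in y ~ theta0 + theta1 * u."""
    design = np.column_stack([np.ones_like(u), u])  # one column per parameter
    theta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta_hat


# Synthetic "measurements" {u_i, y_i}: an assumption for illustration only.
rng = np.random.default_rng(seed=0)
theta_true = np.array([1.0, 2.5])
u = np.linspace(0.0, 10.0, 50)
y = theta_true[0] + theta_true[1] * u + rng.normal(scale=0.1, size=u.size)

theta_hat = calibrate_linear(u, y)
print("calibrated parameters:", theta_hat)

# Over-parametrized variant: y ~ theta0 + (theta1 + theta2) * u.
# The design matrix has three columns but only rank 2, so theta1 and
# theta2 cannot be estimated separately from the data.
design_over = np.column_stack([np.ones_like(u), u, u])
rank_over = np.linalg.matrix_rank(design_over)
print("columns:", design_over.shape[1], "rank:", rank_over)
```

The rank deficiency is the simplest instance of a hidden linear relation between parameters; in nonlinear models such relations are usually harder to spot.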

In most cases, $M$ provides some kind of algebraic relation between $\mathbf{y}$ and $\mathbf{u}$, and/or a relation in terms of derivatives or integrals involving $\mathbf{y}$ and/or $\mathbf{u}$. The model $M$ with constraints $C$ has a closed-form solution if there exists a function $F_{\hat{\boldsymbol{\theta}}}$ (not to be confused with the source term $\mathbf{F}$) such that \begin{align*} \mathbf{y} = F_{\hat{\boldsymbol{\theta}}}(\mathbf{u})\;, \end{align*} where $\hat{\boldsymbol{\theta}}$ represents the parameters of the calibrated model.

When a closed-form solution cannot be obtained, as is the case for almost all complex real-world models, one can still obtain an approximate solution. An approximation is usually based on simplification(s) of the original model. However, researchers sometimes want to stick with the original model, because simplified models cannot answer their original research questions. In that case, the following approach is taken: instead of assuming a continuous dependency of $\mathbf{y}$ on $t$ and $\mathbf{x}$, one tries to obtain the solution $\mathbf{y}$ at discrete points in time and space. Time is restricted to a time interval and divided into a set of time points. Space is divided into a set of nodes. This approach is called discretization. Intuitively, when one increases the number of time points and nodes, i.e., refines the grid, the discrete solution will in some sense converge to the “real”, albeit unknown, solution. However, this will only be true if the discretization itself satisfies certain constraints.^[6]
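
Both points, convergence under grid refinement and the need for constraints on the discretization, can be seen in a small sketch. The assumptions here are mine: the scalar decay model $\mathrm{d}y/\mathrm{d}t = -\theta y$ with $y(0) = y_0$ serves as a stand-in (it has the known solution $y_0 e^{-\theta t}$, so the error can be measured), and the explicit Euler scheme is just one possible discretization. For this model explicit Euler only behaves well when the time step satisfies $\Delta t < 2/\theta$.

```python
import math


def explicit_euler_decay(theta, y0, t_end, n_steps):
    """Approximate y(t_end) for dy/dt = -theta * y, y(0) = y0, with explicit Euler."""
    dt = t_end / n_steps  # uniform time step on [0, t_end]
    y = y0
    for _ in range(n_steps):
        y = y + dt * (-theta * y)  # y_{k+1} = y_k + dt * f(y_k)
    return y


theta, y0, t_end = 2.0, 1.0, 1.0
exact = y0 * math.exp(-theta * t_end)

# Refining the grid: the error shrinks as the number of time points grows.
errors = [
    abs(explicit_euler_decay(theta, y0, t_end, n) - exact)
    for n in (10, 100, 1000)
]
print("errors under refinement:", errors)

# A constraint on the discretization: for a faster decay rate, a time step
# above 2 / theta makes the numerical solution grow instead of decay.
large_theta = 50.0
too_coarse = explicit_euler_decay(large_theta, y0, t_end, 10)    # dt = 0.1 > 2/50
fine_enough = explicit_euler_decay(large_theta, y0, t_end, 100)  # dt = 0.01 < 2/50
print("dt too large:", too_coarse, "| dt small enough:", fine_enough)
```

With the coarse grid the computed value blows up even though the true solution decays to almost zero, which is exactly the kind of failure the constraints on a discretization are meant to rule out.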

The process of discretization makes the model tractable for computers. This matters even more when the number of unknowns, i.e., the values of $\mathbf{y}$ at discrete points in time and space, increases significantly due to grid refinement. At this point, we enter the world of computational models. One must realize that computational models are, by construction, approximations of mathematical models. There are various ways to perform a discretization, and in general each results in a different approximation of $\mathbf{y}$.
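
To illustrate that different discretizations give different approximations, the sketch below applies two textbook schemes, forward (explicit) and backward (implicit) Euler, to the same problem on the same grid. As an assumption for illustration, the problem is the scalar decay model $\mathrm{d}y/\mathrm{d}t = -\theta y$, $y(0) = y_0$, whose exact solution $y_0 e^{-\theta t}$ is known; the function names are hypothetical.

```python
import math


def forward_euler(theta, y0, t_end, n_steps):
    """Explicit scheme: y_{k+1} = y_k - dt * theta * y_k."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = y * (1.0 - theta * dt)
    return y


def backward_euler(theta, y0, t_end, n_steps):
    """Implicit scheme: y_{k+1} = y_k - dt * theta * y_{k+1}, solved for y_{k+1}."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 + theta * dt)
    return y


theta, y0, t_end = 2.0, 1.0, 1.0
exact = y0 * math.exp(-theta * t_end)

for n in (10, 1000):
    fwd = forward_euler(theta, y0, t_end, n)
    bwd = backward_euler(theta, y0, t_end, n)
    print(f"n={n}: forward={fwd:.5f}  exact={exact:.5f}  backward={bwd:.5f}")
```

On the same grid, one scheme underestimates the exact value and the other overestimates it; the gap between them closes as the grid is refined.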

In upcoming blog posts I will further explore computational models by means of examples taken from my professional and academic experience.

[1] This statement has always been controversial, as it can be descriptive or normative: is it what scientists do, or is it what they should do? Also note that science is actually a method.
[2] In case of boundary-value problems, the relevant boundary conditions are also absorbed in the constraints $C$.
[3] Dynamical systems theory is an area in mathematics that studies the behavior of complex systems.
[4] Mathematical system(s) theory is concerned with the study and control of input/output phenomena.
[5] Uncertainty quantification studies the impact of varying parameters on the model behavior.
[6] There is an entire mathematical discipline called “numerical analysis” to tackle these questions.
