In part I of this series, we introduced the fundamental concepts of surrogate modeling. In part II, we saw surrogate modeling in action through a case study that walked through the full analysis pipeline.
To recap, the surrogate modeling technique trains a cheap yet accurate statistical model to serve as a surrogate for computationally expensive simulations, thus significantly improving the efficiency of product design and analysis.
In part III, we will briefly discuss the following three trends that have emerged in surrogate modeling research and application:
Gradients are defined as the sensitivity of the output with respect to the inputs. Thanks to rapid developments in techniques like the adjoint method and automatic differentiation, it is now common for engineering simulation code to not only compute the output f(x) given the input vector x, but also compute the gradients ∂f(x)/∂x at the same time, at negligible additional cost.
Consequently, we can expand our training data pairs (xᵢ, f(xᵢ)) to training data triples (xᵢ, f(xᵢ), ∂f(xᵢ)/∂xᵢ). By leveraging the additional gradient information, the trained surrogate model can reach a higher accuracy compared with a model trained only on (xᵢ, f(xᵢ)), given that both models use the same number of training data points.
We can also state the benefit of including the gradients in an equivalent way: it allows us to reduce the number of data points needed to achieve a given accuracy. This is a desirable feature in practice. Recall that generating each training data point requires running the expensive simulation code once. If we can cut down the total number of training data points, we can train the surrogate model with a smaller computational budget, thereby improving the training efficiency.
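To make this concrete, here is a minimal sketch of gradient-enhanced fitting in NumPy. It assumes a hypothetical 1D "simulation" (a sine function standing in for an expensive solver) that returns both a value and a gradient at each sample, and fits a polynomial surrogate by stacking the value equations and the gradient equations into one least-squares system. With 3 sample points, the gradients double the number of equations, letting us fit 6 coefficients instead of 3:

```python
import numpy as np

# Hypothetical expensive simulation: returns f(x) = sin(3x)
# together with its gradient 3*cos(3x) (as an adjoint or AD
# capable solver would, at negligible extra cost).
def simulate(x):
    return np.sin(3 * x), 3 * np.cos(3 * x)

# Three training points; each "simulation run" yields a
# (value, gradient) triple with the input.
x_train = np.array([0.0, 0.4, 0.8])
y, dy = simulate(x_train)

# Polynomial surrogate f_hat(x) = sum_k c_k x^k with 6 unknowns.
degree = 5
# Value equations: rows of [1, x, x^2, ..., x^5]
V = np.vander(x_train, degree + 1, increasing=True)
# Gradient equations: rows of [0, 1, 2x, ..., 5x^4]
D = np.zeros_like(V)
for k in range(1, degree + 1):
    D[:, k] = k * x_train ** (k - 1)

# Stack both sets of equations: 3 value rows + 3 gradient rows
# constrain all 6 coefficients, versus only 3 rows without gradients.
A = np.vstack([V, D])
b = np.concatenate([y, dy])
coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)

# Evaluate the surrogate at an unseen point.
x_test = 0.6
pred = np.polyval(coeffs[::-1], x_test)  # polyval wants highest degree first
```

The same idea carries over to more sophisticated surrogates: gradient-enhanced Kriging, for example, augments the Gaussian-process correlation structure with derivative observations rather than stacking rows in a linear system, but the payoff is identical: more information per simulation run.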
#statistics #data-science #modeling #surrogate-modeling #machine-learning