Gaussian Process Surrogate Models for Efficient Estimation of Structural Response Distributions and Order Statistics
Abstract
Engineering disciplines often rely on extensive simulations to ensure that structures are designed to withstand harsh conditions while avoiding over-engineering for unlikely scenarios. Assessments such as Serviceability Limit State (SLS) involve evaluating weather events, including estimating loads not expected to be exceeded more than a specified number of times (e.g., 100) throughout the structure’s design lifetime. Although physics-based simulations provide robust and detailed insights, they are computationally expensive, making it challenging to generate statistically valid representations of a wide range of weather conditions.
To address these challenges, we propose an approach using Gaussian Process (GP) surrogate models trained on a limited set of simulation outputs to directly generate the structural response distribution. We apply this method to an SLS assessment that estimates an order statistic, the 100th highest response, of a structure exposed to 25 years of historical weather observations. Our results indicate that the GP surrogate models provide comparable results to full simulations at a fraction of the computational cost.
I Introduction
Accurate estimation of structural responses under diverse weather conditions must account for two sources of variability: the variability of the weather environment (e.g., waves, wind, currents) and the variability of the structural response in a given random weather state. Precise long-term estimation requires considering both.
Order statistics, which involve analyzing specific ranked values within a dataset, are particularly useful in this context. These statistics can involve extreme values like the maximum or the minimum, as well as other values such as the 100th largest response. By examining these ranked values, order statistics provide valuable insights into the behavior of structures under various conditions, which is crucial for both reliability and serviceability assessments.
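As a small illustration of what an order statistic is, the following sketch extracts the k-th largest value from a list of responses. The function name `kth_largest` and the synthetic Gaussian data are assumptions made for this example only; they do not come from the simulator used in this paper.

```python
import heapq
import random


def kth_largest(values, k):
    """Return the k-th largest element of values (k=1 is the maximum)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    # heapq.nlargest returns the k largest values in descending order,
    # so the last entry is the k-th largest.
    return heapq.nlargest(k, values)[-1]


rng = random.Random(0)
# Synthetic stand-in for a set of structural responses.
responses = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

print("maximum:", kth_largest(responses, 1))
print("100th largest:", kth_largest(responses, 100))
```

Using `heapq.nlargest` avoids sorting the full dataset, which may matter when only a high-rank value is needed from a very large pool of responses.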
While traditional physics-based simulation methods can calculate order statistics, this is often impractical due to the computational expense. This is especially true when dealing with long time periods, such as 25 or 100 years, which are typically used in the design of structures [1].
A common practice in the engineering field is thus to use surrogate models, which approximate the results of high-fidelity simulations. These models, such as Gaussian Process (GP) models, can achieve similar accuracy with significantly reduced computational cost [2].
In this paper, we propose a method for creating GP-based surrogate models suitable for order statistics calculation. Our approach assumes that the structural characteristics remain constant during the studied timeframe, which is a common simplification in practical structural response simulations [3].
Our method introduces several aspects that distinguish it from existing methods. Specifically, we do not use surrogate models to estimate the structural responses directly. Instead, we estimate the parameters of the structural response distribution. This allows us to generate samples from the predicted distribution, enabling efficient calculation of order statistics without the need for extensive simulations. Additionally, our approach is designed to work with stochastic simulators where both the responses and the number of data points returned vary stochastically.
Our method is particularly valuable for Serviceability Limit State (SLS) calculations [4], where the evaluation of structural responses under a wide range of weather conditions is crucial but difficult to achieve with traditional methods like environmental contours [5].
To demonstrate our method, we conducted an SLS assessment estimating the 100th largest response for a structure exposed to 25 years' worth of historical weather observations. This proof-of-concept uses a simplified stochastic simulation model that balances realistic dynamics and computational efficiency. We benchmark our method against a brute-force approach that calculates order statistics directly using the simulator. Our method showed comparable results at a fraction of the computational cost.
II Problem Statement and Approach
The specific problem addressed in this paper is the need for an efficient and accurate method to estimate order statistics of the structural response, such as the 100th largest response, within a selected time interval. For systems where the response is stochastic, this is challenging with traditional methods: the inherent variability of the responses means that a high number of simulations is required to capture it accurately.
Our proposed method maps weather data inputs to predicted distributions of structural responses using a surrogate model, and then generates data to mimic the simulator output. This enables efficient estimation of selected order statistics, effectively bypassing the need for generating the structural responses using a simulator.
Our approach is inspired by [6], which uses a Gaussian Process to model the parameters of the output distribution. Here, we extend this method by not only modeling the distributional parameters, but also generating realizations of the predicted structural response from the predictive distributions.
The simulator is considered to be a stochastic black-box function, represented by
$$f(\mathbf{x}) = \{y_1, y_2, \dots, y_n\} \tag{1}$$
where each $y_i$ represents the response within a certain time interval, as explained in further detail in Section III.1.1. The number of values returned by the simulator, $n$, is a random variable conditional on the input $\mathbf{x}$. This means that both the responses $y_i$ and the count $n$ are stochastic outputs of the simulator.
We assume outputs of the simulator at a point $\mathbf{x}$ are samples from a distribution $D(\theta(\mathbf{x}))$, governed by the underlying physics of the system, where $\theta(\mathbf{x})$ are the parameters of the distribution. For example, if $D$ is a Gumbel distribution, then $\theta = (\mu, \beta)$, its location and scale. In other words, we assume a fixed distribution type with unique parametrization at each $\mathbf{x}$.
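To make the simulator assumptions concrete, the sketch below implements a toy stochastic black box whose outputs are Gumbel distributed and whose output count is itself random. Everything in it is hypothetical: the function `toy_simulator`, the scalar weather input, and the linear dependence of the location, scale, and count on the input are illustrative choices, not the simulation model used in this paper.

```python
import math
import random


def sample_gumbel(rng, loc, scale):
    """Draw one Gumbel(loc, scale) variate by inverting the CDF
    F(y) = exp(-exp(-(y - loc) / scale))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def toy_simulator(x, rng):
    """Hypothetical stand-in for the stochastic simulator.

    x is a single non-negative scalar describing weather severity.
    Both the number of returned responses and their values are random.
    """
    loc = 1.0 + 0.5 * x                      # assumed location as a function of x
    scale = 0.2 + 0.1 * x                    # assumed scale as a function of x
    n = 1 + rng.randrange(5 + int(10 * x))   # assumed random count, n >= 1
    return [sample_gumbel(rng, loc, scale) for _ in range(n)]


rng = random.Random(1)
for _ in range(3):
    out = toy_simulator(1.0, rng)
    print(len(out), [round(y, 3) for y in out[:3]])
```

Repeated calls at the same input return lists of different lengths and different values, which is the behaviour the surrogate has to reproduce.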
Producing the surrogate model’s estimate of a single run works as follows:
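1. Use the Gaussian Process to map the input $\mathbf{x}$ to predicted distribution parameters $\hat{\theta}$.
2. Create the distribution $D(\hat{\theta})$ using the predicted parameters $\hat{\theta}$.
3. Generate a sample of the count $\hat{n}$, conditional on $\mathbf{x}$, then generate $\hat{n}$ samples from the distribution $D(\hat{\theta})$, representing the output of the simulation model.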
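The three steps above can be sketched as follows, assuming a Gumbel response distribution. The functions `predict_params` and `sample_count` are hypothetical placeholders: in practice the first would return the parameters predicted by the trained GP, and the second would draw the count from whatever count model is adopted. The closed-form expressions used here are illustrative only.

```python
import math
import random


def predict_params(x):
    """Hypothetical placeholder for the GP prediction x -> (loc, scale)."""
    return 1.0 + 0.5 * x, 0.2 + 0.1 * x


def sample_count(x, rng):
    """Hypothetical model for the number of responses returned at x."""
    return 1 + rng.randrange(5 + int(10 * x))


def surrogate_run(x, rng):
    """Mimic one simulator run at weather input x."""
    loc, scale = predict_params(x)  # step 1: predict distribution parameters

    def draw():                     # step 2: the Gumbel(loc, scale) distribution
        u = rng.random()
        while u == 0.0:             # guard against log(0)
            u = rng.random()
        return loc - scale * math.log(-math.log(u))

    n = sample_count(x, rng)        # step 3: sample the count, then the responses
    return [draw() for _ in range(n)]


rng = random.Random(42)
weather = [rng.uniform(0.0, 2.0) for _ in range(1000)]  # synthetic weather states
pooled = [y for x in weather for y in surrogate_run(x, rng)]

k = 100
kth = sorted(pooled, reverse=True)[k - 1]
print("pooled responses:", len(pooled), "k-th largest:", kth)
```

Pooling the generated responses over all weather states and ranking them yields the order statistic without calling the simulator.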
While this mapping could be performed with many different models, using a Gaussian Process allows us to quantify the uncertainty in our estimation and to propagate uncertainty about the true surrogate model to our estimates of the order statistics.
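One way to picture this propagation is a Monte Carlo loop: draw the distribution parameters from the GP's predictive distribution, generate responses, compute the order statistic, and repeat. The sketch below is a simplified assumption, not the procedure of this paper. The function `gp_predict_loc` is a hypothetical stand-in for a GP predictive mean and standard deviation, only the Gumbel location is treated as uncertain, the count per weather state is fixed, and parameters are drawn independently at each input, whereas a joint draw from the GP would respect correlations between inputs.

```python
import math
import random
import statistics


def gp_predict_loc(x):
    """Hypothetical GP predictive mean and standard deviation of the
    Gumbel location parameter at input x."""
    return 1.0 + 0.5 * x, 0.05


def sample_gumbel(rng, loc, scale):
    """Draw one Gumbel(loc, scale) variate by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def kth_largest_draw(weather, k, rng, scale=0.3, n_per_state=5):
    """One realization of the k-th largest response under one draw of
    the uncertain parameters."""
    pooled = []
    for x in weather:
        mean, std = gp_predict_loc(x)
        loc = rng.gauss(mean, std)  # sample the uncertain parameter
        pooled.extend(sample_gumbel(rng, loc, scale) for _ in range(n_per_state))
    return sorted(pooled, reverse=True)[k - 1]


rng = random.Random(7)
weather = [rng.uniform(0.0, 2.0) for _ in range(200)]  # synthetic weather states
draws = [kth_largest_draw(weather, 10, rng) for _ in range(50)]

print("mean:", statistics.mean(draws), "std:", statistics.stdev(draws))
```

The spread of `draws` reflects both the stochastic response and the uncertainty in the predicted parameters, so it can be summarized as an interval around the order statistic estimate.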
The GP model assumes that the function mapping inputs to outputs is a realization of a Gaussian process, defined by its mean function and covariance function, which encodes certain assumptions about the function we aim to predict, such as smoothness properties or periodicity. In this study, we use the Matérn covariance function, which is suitable for modeling functions with varying smoothness [7].
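For reference, the Matérn family has closed forms for common smoothness values. The sketch below writes out the cases with smoothness 3/2 and 5/2 as functions of the distance between two inputs. The source does not state which smoothness value is used in this study, so these two cases, along with the function names and default hyperparameters, are illustrative assumptions.

```python
import math


def matern32(r, length_scale=1.0, variance=1.0):
    """Matern covariance with smoothness nu = 3/2 at distance r."""
    s = math.sqrt(3.0) * abs(r) / length_scale
    return variance * (1.0 + s) * math.exp(-s)


def matern52(r, length_scale=1.0, variance=1.0):
    """Matern covariance with smoothness nu = 5/2 at distance r."""
    s = math.sqrt(5.0) * abs(r) / length_scale
    return variance * (1.0 + s + s * s / 3.0) * math.exp(-s)


for r in (0.0, 0.5, 1.0, 2.0):
    print(r, round(matern32(r), 4), round(matern52(r), 4))
```

Both kernels equal the variance at zero distance and decay as the inputs move apart; a larger smoothness value corresponds to smoother sample functions, and the length scale controls how quickly the correlation decays.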
Further details on the simulation model, surrogate model, and quantities of interest calculation are covered in the following sections.