Multiobjective Parameter Estimation for NARMAX models - An Overview¶
Example created by Gabriel Bueno Leandro, Samir Milani Martins and Wilson Rocha Lacerda Junior
Multiobjective parameter estimation represents a fundamental shift in the way we approach the parameter tuning problem for NARMAX models. Instead of seeking a single set of parameter values that optimally fits the model to the data, multiobjective approaches aim to identify a set of solutions, known as the Pareto set, that provide a trade-off between competing objectives. These objectives often encompass a spectrum of model performance criteria, such as goodness of fit, model complexity, and robustness.
Reference¶
For further information, check this reference: https://doi.org/10.1080/00207170601185053.
Use case: Buck converter¶
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sysidentpy.model_structure_selection import FROLS
from sysidentpy.multiobjective_parameter_estimation import AILS
from sysidentpy.basis_function._basis_function import Polynomial
from sysidentpy.utils.display_results import results
from sysidentpy.utils.plotting import plot_results
from sysidentpy.metrics import root_relative_squared_error
from sysidentpy.utils.narmax_tools import set_weights
from sysidentpy.parameter_estimation import LeastSquares
Dynamic Behavior¶
df_train = pd.read_csv(r"datasets/buck_id.csv")
df_valid = pd.read_csv(r"datasets/buck_valid.csv")
# Plotting the measured output (identification and validation data)
plt.figure(1)
plt.title("Output")
plt.plot(df_train.sampling_time, df_train.y, label="Identification", linewidth=1.5)
plt.plot(df_valid.sampling_time, df_valid.y, label="Validation", linewidth=1.5)
plt.xlabel("Samples")
plt.ylabel("Voltage")
plt.legend()
plt.show()
# Plotting the measured input (identification and validation data)
plt.figure(2)
plt.title("Input")
plt.plot(df_train.sampling_time, df_train.input, label="Identification", linewidth=1.5)
plt.plot(df_valid.sampling_time, df_valid.input, label="Validation", linewidth=1.5)
plt.ylim(2.1, 2.6)
plt.ylabel("u")
plt.xlabel("Samples")
plt.legend()
plt.show()
Buck Converter Static Function¶
The duty cycle, represented by the symbol $D$, is defined as the ratio of the time the system is on ($T_{on}$) to the total operation cycle time ($T$). Mathematically, this can be expressed as $D=\frac{T_{on}}{T}$. The complement of the duty cycle, represented by $D'$, is defined as the ratio of the time the system is off ($T_{off}$) to the total operation cycle time ($T$) and can be expressed as $D'=\frac{T_{off}}{T}$.
The load voltage ($V_o$) is related to the source voltage ($V_d$) by the equation $V_o = D \cdot V_d = (1 - D') \cdot V_d$. For this particular converter, it is known that $D' = \frac{\bar{u}-1}{3}$, which means that the static function of this system can be derived from theory to be:
$V_o = \frac{4V_d}{3} - \frac{V_d}{3}\cdot \bar{u}$
If we assume that the source voltage $V_d$ is equal to 24 V, then we can rewrite the above expression as follows:
$V_o = (4 - \bar{u})\cdot 8$
# Static data
Vd = 24
Uo = np.linspace(0, 4, 50)
Yo = (4 - Uo) * Vd / 3
Uo = Uo.reshape(-1, 1)
Yo = Yo.reshape(-1, 1)
plt.figure(3)
plt.title("Buck Converter Static Curve")
plt.xlabel("$\\bar{u}$")
plt.ylabel("$\\bar{y}$")
plt.plot(Uo, Yo, linewidth=1.5, linestyle="-", marker="o")
plt.show()
Buck Converter Static Gain¶
The gain of a Buck converter is a measure of how its output voltage changes in response to changes in its input voltage. Mathematically, the gain can be calculated as the derivative of the converter’s static function, which describes the relationship between its input and output voltages. In this case, the static function of the Buck converter is given by the equation:
$V_o = (4 - \bar{u})\cdot 8$
Taking the derivative of this equation with respect to $\bar{u}$, we find that the gain of the Buck converter is equal to $-8$. In other words, for every unit increase in the input $\bar{u}$, the output voltage $V_o$ decreases by 8 units, so:
$gain = V_o' = -8$
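Before moving on, a quick numerical check of this claim (an illustrative sketch, not part of the original experiment): approximating the derivative of the static curve by finite differences should return $-8$ everywhere.
# Finite-difference approximation of the static gain (illustrative check)
u_bar = np.linspace(0, 4, 50)
Vo = (4 - u_bar) * 8
print(np.allclose(np.gradient(Vo, u_bar), -8))  # True: constant slope of -8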
# Defining the gain
gain = -8 * np.ones(len(Uo)).reshape(-1, 1)
plt.figure(4)
plt.title("Buck Converter Static Gain")
plt.xlabel("$\\bar{u}$")
plt.ylabel("$\\bar{gain}$")
plt.plot(Uo, gain, linewidth=1.5, label="gain", linestyle="-", marker="o")
plt.legend()
plt.show()
Building a dynamic model using the mono-objective approach¶
x_train = df_train.input.values.reshape(-1, 1)
y_train = df_train.y.values.reshape(-1, 1)
x_valid = df_valid.input.values.reshape(-1, 1)
y_valid = df_valid.y.values.reshape(-1, 1)
basis_function = Polynomial(degree=2)
model = FROLS(
order_selection=True,
n_info_values=8,
extended_least_squares=False,
ylag=2,
xlag=2,
info_criteria="aic",
estimator=LeastSquares(),
basis_function=basis_function,
)
model.fit(X=x_train, y=y_train)
Affine Information Least Squares Algorithm (AILS)¶
AILS is a multiobjective parameter estimation algorithm based on a set of affine information pairs. The multiobjective approach proposed in the referenced paper and implemented in SysIdentPy leads to a convex multiobjective optimization problem, which can be solved by AILS. AILS is a least-squares-type, non-iterative scheme for finding the Pareto-set solutions of the multiobjective problem.
So, with the model structure defined (we will be using the one built using the dynamic data above), one can estimate the parameters using the multiobjective approach.
The information about the static function and the static gain, besides the usual dynamic input/output data, can be used to build the pairs of affine information needed to estimate the parameters of the model. We can model the cost function as:
$ \gamma(\hat\theta) = w_1\cdot J_{LS}(\hat{\theta})+w_2\cdot J_{SF}(\hat{\theta})+w_3\cdot J_{SG}(\hat{\theta}) $
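where $J_{LS}$, $J_{SF}$, and $J_{SG}$ are the cost functions associated with the prediction error, the static function, and the static gain, respectively, and $w_1$, $w_2$, $w_3$ are the weights of one combination. As a minimal illustration of this scalarization (the numbers below are made up, not outputs of the algorithm):
J_ls, J_sf, J_sg = 0.99, 0.15, 0.02  # hypothetical normalized cost values
w1, w2, w3 = 0.5, 0.3, 0.2  # one weight combination; the weights sum to 1
gamma = w1 * J_ls + w2 * J_sf + w3 * J_sg  # scalar cost for this combination
print(gamma)  # approximately 0.544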
Multiobjective parameter estimation considering 3 different objectives: the prediction error, the static function and the static gain¶
# you can use any model structure you want in your use case, but in this notebook
# we will use the one obtained above to compare with other works
mo_estimator = AILS(final_model=model.final_model)
# setting the log-spaced weights of each objective function
w = set_weights(static_function=True, static_gain=True)
# you can also use something like
# w = np.array(
# [
# [0.98, 0.7, 0.5, 0.35, 0.25, 0.01, 0.15, 0.01],
# [0.01, 0.1, 0.3, 0.15, 0.25, 0.98, 0.35, 0.01],
# [0.01, 0.2, 0.2, 0.50, 0.50, 0.01, 0.50, 0.98],
# ]
# )
# to set the weights. Each row corresponds to one objective
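A quick sanity check on the weight matrix (assuming, as the table below also suggests, that each column holds one weight combination whose three entries sum to 1):
print(w.shape)  # (3, n_weight_combinations): one row per objective
print(np.allclose(w.sum(axis=0), 1))  # each combination's weights sum to 1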
AILS has an estimate method that returns the cost functions (J), the Euclidean norm of the cost functions (E), the parameters estimated for each weight combination (theta), the regressor matrices built from the static gain and static function affine information (HR and QR, respectively), and position, which indexes one of the returned solutions (inspected below).
J, E, theta, HR, QR, position = mo_estimator.estimate(
X=x_train, y=y_train, gain=gain, y_static=Yo, X_static=Uo, weighing_matrix=w
)
result = {
"w1": w[0, :],
"w2": w[2, :],
"w3": w[1, :],
"J_ls": J[0, :],
"J_sg": J[1, :],
"J_sf": J[2, :],
"||J||:": E,
}
pd.DataFrame(result)
| | w1 | w2 | w3 | J_ls | J_sg | J_sf | \|\|J\|\| |
|---|---|---|---|---|---|---|---|
| 0 | 0.006842 | 0.003078 | 0.990080 | 0.999970 | 1.095020e-05 | 0.000013 | 0.245244 |
| 1 | 0.007573 | 0.002347 | 0.990080 | 0.999938 | 2.294665e-05 | 0.000016 | 0.245236 |
| 2 | 0.008382 | 0.001538 | 0.990080 | 0.999885 | 6.504913e-05 | 0.000018 | 0.245223 |
| 3 | 0.009277 | 0.000642 | 0.990080 | 0.999717 | 4.505541e-04 | 0.000021 | 0.245182 |
| 4 | 0.006842 | 0.098663 | 0.894495 | 1.000000 | 7.393246e-08 | 0.000015 | 0.245251 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2290 | 0.659632 | 0.333527 | 0.006842 | 0.995896 | 3.965699e-04 | 1.000000 | 0.244489 |
| 2291 | 0.730119 | 0.263039 | 0.006842 | 0.995632 | 5.602981e-04 | 0.972842 | 0.244412 |
| 2292 | 0.808139 | 0.185020 | 0.006842 | 0.995364 | 8.321071e-04 | 0.868299 | 0.244300 |
| 2293 | 0.894495 | 0.098663 | 0.006842 | 0.995100 | 1.364999e-03 | 0.660486 | 0.244160 |
| 2294 | 0.990080 | 0.003078 | 0.006842 | 0.992584 | 9.825987e-02 | 0.305492 | 0.261455 |

2295 rows × 7 columns
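Besides the full set of solutions, estimate also returns position, which (as assumed here) indexes the weight combination with the smallest Euclidean norm of the cost functions. A minimal way to inspect it:
print(position)  # index of the flagged weight combination
print(E[position])  # its Euclidean norm ||J||
print(w[:, position])  # the corresponding objective weights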
Now we can set the model parameters to the theta estimated for any of the weight combinations:
model.theta = theta[-1, :].reshape(
-1, 1
) # setting the theta estimated for the last combination of the weights
# the model structure is exactly the same, but the order of the regressors is
# changed in the estimate method. That's why you have to update model.final_model
model.final_model = mo_estimator.final_model
yhat = model.predict(X=x_valid, y=y_valid)
rrse = root_relative_squared_error(y_valid, yhat)
print(f"RRSE: {rrse}")
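Since plot_results is already imported, we can also compare the free-run simulation with the measured validation output (n sets how many samples to display):
plot_results(y=y_valid, yhat=yhat, n=1000)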
r = pd.DataFrame(
results(
model.final_model,
model.theta,
model.err,
model.n_terms,
err_precision=3,
dtype="sci",
),
columns=["Regressors", "Parameters", "ERR"],
)
r
| | Regressors | Parameters | ERR |
|---|---|---|---|
| 0 | 1 | 2.2930E+00 | 9.999E-01 |
| 1 | y(k-1) | 2.3307E-01 | 2.042E-05 |
| 2 | y(k-2) | 6.3209E-01 | 1.108E-06 |
| 3 | x1(k-1) | -5.9333E-01 | 4.688E-06 |
| 4 | y(k-1)^2 | 2.7673E-01 | 3.922E-07 |
| 5 | y(k-2)y(k-1) | -5.3228E-01 | 8.389E-07 |
| 6 | x1(k-1)y(k-1) | 1.6667E-02 | 5.690E-07 |
| 7 | y(k-2)^2 | 2.5766E-01 | 3.827E-06 |
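As a closing sketch (not part of the original notebook), we can sweep a subset of the Pareto solutions and evaluate each one on the validation data, showing how the choice of weights affects the dynamic prediction error:
# Evaluate every 100th Pareto solution on the validation data (illustrative)
indices = range(0, theta.shape[0], 100)
rrse_values = []
for i in indices:
    model.theta = theta[i, :].reshape(-1, 1)
    yhat_i = model.predict(X=x_valid, y=y_valid)
    rrse_values.append(root_relative_squared_error(y_valid, yhat_i))
best = list(indices)[int(np.argmin(rrse_values))]
print(f"Best sampled weight combination: {best}, RRSE: {min(rrse_values):.4f}")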