Codes

sysidentpy base

Base classes for NARX estimator.

class sysidentpy.base.GenerateRegressors[source]

Polynomial NARMAX model

Provides the main functions to generate the regressor dictionary and regressor codes for polynomial basis.

regressor_space(non_degree, xlag, ylag, n_inputs)[source]

Create the code representation of the regressors.

This function generates a codification of all possible regressors given the maximum lags of the input and output. The codes are used to write the final terms of the model in a readable form, for example [1001] -> y(k-1). The code format is based on a dissertation from UFMG; see the reference below.

Parameters
  • non_degree (int) – The desired maximum nonlinearity degree.

  • ylag (int) – The maximum lag of output regressors.

  • xlag (int) – The maximum lag of input regressors.

Returns

  • max_lag (int) – This value can be used by other functions.

  • regressor_code (ndarray of int) – Matrix codification of all possible regressors.

Examples

The codification is defined as:

>>> 100n = y(k-n)
>>> 200n = u(k-n)
>>> [100n 100n] = y(k-n)y(k-n)
>>> [200n 200n] = u(k-n)u(k-n)
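
The codification above can be decoded mechanically. The following is a minimal sketch, not the library's implementation: decode_regressor and decode_term are hypothetical helper names, and the sketch assumes only the two prefixes listed above (1 for the output y, 2 for the input u).

```python
def decode_regressor(code):
    """Translate one integer code into a readable factor, e.g. 1002 -> 'y(k-2)'."""
    kind, lag = divmod(code, 1000)
    names = {1: "y", 2: "u"}  # assumption: 1 = output, 2 = input, as listed above
    if kind not in names or lag < 1:
        raise ValueError(f"unsupported regressor code: {code}")
    return f"{names[kind]}(k-{lag})"


def decode_term(codes):
    """Join the factors of one model term, e.g. [1001, 2001] -> 'y(k-1)u(k-1)'."""
    return "".join(decode_regressor(code) for code in codes)
```

For instance, decode_term([1001, 1001]) gives the squared output term y(k-1)y(k-1).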

References

[1] Master's thesis: Barbosa, Alípio Monteiro. Técnicas de otimização bi-objetivo para a determinação da estrutura de modelos NARX (2010).

class sysidentpy.base.HouseHolder[source]

Householder reflection and transformation.

class sysidentpy.base.InformationMatrix[source]

Class for methods regarding preprocessing of columns

shift_column(col_to_shift, lag)[source]

Shift values based on a lag.

Parameters
  • col_to_shift (array-like of shape = n_samples) – The samples of the input or output.

  • lag (int) – The respective lag of the regressor.

Returns

tmp_column – The shifted array of the input or output.

Return type

array-like of shape = n_samples

Examples

>>> y = [1, 2, 3, 4, 5]
>>> shift_column(y, 1)
[0, 1, 2, 3, 4]
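
The shifting behaviour shown in the example can be reproduced with plain lists. This is a sketch under the assumption that the first lag positions are padded with zeros, as the example output suggests; shift_column_sketch is a hypothetical standalone function, not the library method.

```python
def shift_column_sketch(col_to_shift, lag):
    """Delay a signal by `lag` samples, padding the start with zeros."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n_samples = len(col_to_shift)
    padding = [0] * min(lag, n_samples)
    # Keep only the samples that still fit after the delay.
    kept = list(col_to_shift[: max(n_samples - lag, 0)])
    return padding + kept
```

The output always has the same length as the input, so a lag larger than the signal yields a column of zeros.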
initial_lagged_matrix(X, y, xlag, ylag)[source]

Build a lagged matrix concerning each lag for each column.

Parameters
  • y (array-like) – Target data used in the training phase.

  • X (array-like) – Input data used in the training phase.

  • ylag (int) – The maximum lag of output regressors.

  • xlag (int) – The maximum lag of input regressors.

Returns

lagged_data – The lagged matrix built with respect to each lag and column.

Return type

ndarray of floats

Examples

Let X and y be the input and output values of shape Nx1. If the chosen lags are 2 for both input and output the initial lagged matrix will be formed by Y[k-1], Y[k-2], X[k-1], and X[k-2].
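
The example above can be sketched with plain lists. The column order (output lags first, then input lags) and the zero padding of the first rows are assumptions made for illustration; shift and initial_lagged_matrix_sketch are hypothetical names, not the library's code.

```python
def shift(values, lag):
    """Delay `values` by `lag` samples, padding the start with zeros."""
    n_samples = len(values)
    kept = [float(v) for v in values[: max(n_samples - lag, 0)]]
    return [0.0] * min(lag, n_samples) + kept


def initial_lagged_matrix_sketch(x, y, xlag, ylag):
    """Rows are time steps k; columns are y(k-1..ylag) followed by x(k-1..xlag)."""
    columns = [shift(y, lag) for lag in range(1, ylag + 1)]
    columns += [shift(x, lag) for lag in range(1, xlag + 1)]
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]
```

With lags of 2 for both signals, row k holds Y[k-1], Y[k-2], X[k-1], and X[k-2], matching the description above.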

build_information_matrix(X, y, xlag, ylag, non_degree)[source]

Build the information matrix.

Each column of the information matrix represents a candidate regressor. The set of candidate regressors is determined by the xlag, ylag, and non_degree values entered by the user.
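
As an illustration of how candidate regressors grow with the nonlinearity degree, the sketch below multiplies lagged columns in every combination up to non_degree. It is a simplified sketch: polynomial_columns is a hypothetical name, and whether the library also adds a constant column is not stated here, so none is added.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_columns(lagged_rows, non_degree):
    """Return the column-index combinations and the matrix of their products."""
    n_columns = len(lagged_rows[0])
    combos = []
    for degree in range(1, non_degree + 1):
        combos.extend(combinations_with_replacement(range(n_columns), degree))
    # Each candidate regressor is the product of the lagged columns in its combo.
    matrix = [[prod(row[i] for i in combo) for combo in combos]
              for row in lagged_rows]
    return combos, matrix
```

With two lagged columns and non_degree=2 this yields five candidates: the two linear terms, the two squares, and the cross product.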

Parameters
  • y (array-like) – Target data used in the training phase.

  • X (array-like) – Input data used in the training phase.

  • ylag (int) – The maximum lag of output regressors.

  • xlag (int) – The maximum lag of input regressors.

  • non_degree (int) – The desired maximum nonlinearity degree.

Returns

lagged_data – The lagged matrix built with respect to each lag and column.

Return type

ndarray of floats

sysidentpy narmax

Build Polynomial NARMAX Models

class sysidentpy.polynomial_basis.narmax.PolynomialNarmax(non_degree=2, ylag=2, xlag=2, order_selection=False, info_criteria='aic', n_terms=None, n_inputs=1, n_info_values=10, estimator='recursive_least_squares', extended_least_squares=True, aux_lag=1, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02)[source]

Polynomial NARMAX model

Parameters
  • non_degree (int, default=2) – The nonlinearity degree of the polynomial function.

  • ylag (int, default=2) – The maximum lag of the output.

  • xlag (int, default=2) – The maximum lag of the input.

  • order_selection (bool, default=False) – Whether to use information criteria for order selection.

  • info_criteria (str, default="aic") – The information criteria method to be used.

  • n_terms (int, default=None) – The number of model terms to be selected. Note that n_terms overrides the information criteria values.

  • n_inputs (int, default=1) – The number of inputs of the system.

  • n_info_values (int, default=10) – The number of iterations of the information criteria method.

  • estimator (str, default="recursive_least_squares") – The parameter estimation method.

  • extended_least_squares (bool, default=True) – Whether to use the extended least squares method for parameter estimation. Note that we define a specific set of noise regressors.

  • aux_lag (int, default=1) – Temporary lag value used only for parameter estimation. This value is overwritten by the max_lag value and will be removed in v0.1.4.

  • lam (float, default=0.98) – Forgetting factor of the Recursive Least Squares method.

  • delta (float, default=0.01) – Normalization factor of the P matrix.

  • offset_covariance (float, default=0.2) – The offset covariance factor of the affine least mean squares filter.

  • mu (float, default=0.01) – The convergence coefficient (learning rate) of the filter.

  • eps (float, default=2.220446049250313e-16) – Normalization factor of the normalized filters.

  • gama (float, default=0.2) – The leakage factor of the Leaky LMS method.

  • weight (float, default=0.02) – Weight factor that controls the proportions of the error norms, offering an extra degree of freedom in the adaptation of the LMS mixed norm method.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.polynomial_basis import PolynomialNarmax
>>> from sysidentpy.metrics import root_relative_squared_error
>>> from sysidentpy.utils.generate_data import get_miso_data, get_siso_data
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=True,
...                                                    sigma=0.2,
...                                                    train_percentage=90)
>>> model = PolynomialNarmax(non_degree=2,
...                          order_selection=True,
...                          n_info_values=10,
...                          extended_least_squares=False,
...                          ylag=2, xlag=2,
...                          info_criteria='aic',
...                          estimator='least_squares',
...                          )
>>> model.fit(x_train, y_train)
>>> yhat = model.predict(x_valid, y_valid)
>>> rrse = root_relative_squared_error(y_valid, yhat)
>>> print(rrse)
0.001993603325328823
>>> results = pd.DataFrame(model.results(err_precision=8,
...                                      dtype='dec'),
...                        columns=['Regressors', 'Parameters', 'ERR'])
>>> print(results)
    Regressors Parameters         ERR
0        x1(k-2)     0.9000  0.95556574
1         y(k-1)     0.1999  0.04107943
2  x1(k-1)y(k-1)     0.1000  0.00335113

References

[1] Manuscript: Orthogonal least squares methods and their application to non-linear system identification. https://eprints.soton.ac.uk/251147/1/778742007_content.pdf

[2] Manuscript (Portuguese): Identificação de Sistemas não Lineares Utilizando Modelos NARMAX Polinomiais – Uma Revisão e Novos Resultados

error_reduction_ratio(psi, y, process_term_number)[source]

Perform the Error Reduction Ratio algorithm.

Parameters
  • y (array-like of shape = n_samples) – The target data used in the identification process.

  • psi (ndarray of floats) – The information matrix of the model.

  • process_term_number (int) – Number of Process Terms defined by the user.

Returns

  • err (array-like of shape = number_of_model_elements) – The respective ERR calculated for each regressor.

  • piv (array-like of shape = number_of_model_elements) – Contains the index to put the regressors in the correct order based on err values.

  • psi_orthogonal (ndarray of floats) – The updated and orthogonal information matrix.

References

[1] Manuscript: Orthogonal least squares methods and their application to non-linear system identification. https://eprints.soton.ac.uk/251147/1/778742007_content.pdf

[2] Manuscript (Portuguese): Identificação de Sistemas não Lineares Utilizando Modelos NARMAX Polinomiais – Uma Revisão e Novos Resultados
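
To make the ERR quantity concrete, the sketch below orthogonalizes the regressor columns with classical Gram-Schmidt and computes, for each orthogonal vector w, the ratio (w·y)² / ((w·w)(y·y)). This is only an illustration: it keeps the columns in the given order instead of selecting them greedily, err_fixed_order is a hypothetical name, and the library (which ships a HouseHolder class) may orthogonalize differently.

```python
def dot(a, b):
    """Inner product of two equally sized sequences."""
    return sum(p * q for p, q in zip(a, b))


def err_fixed_order(columns, y):
    """ERR of each regressor column, taken in the order given."""
    yy = dot(y, y)
    orthogonal = []  # orthogonalized columns found so far
    err = []
    for col in columns:
        w = [float(v) for v in col]
        for q in orthogonal:
            coef = dot(q, col) / dot(q, q)
            w = [wi - coef * qi for wi, qi in zip(w, q)]
        ww = dot(w, w)
        if ww == 0.0:
            err.append(0.0)  # column is linearly dependent on earlier ones
            continue
        err.append(dot(w, y) ** 2 / (ww * yy))
        orthogonal.append(w)
    return err
```

When the columns span y exactly, the ERR values sum to 1, which is why they can be read as the fraction of output energy explained by each term.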

fit(X, y)[source]

Fit polynomial NARMAX model.

This is an ‘alpha’ version of the ‘fit’ function, intended to be easy to use. Given the training data X and y, it fits the model.

Parameters
  • X (ndarray of floats) – The input data to be used in the training process.

  • y (ndarray of floats) – The output data to be used in the training process.

Returns

  • model (ndarray of ints) – The model code representation.

  • piv (array-like of shape = number_of_model_elements) – Contains the index to put the regressors in the correct order based on err values.

  • theta (array-like of shape = number_of_model_elements) – The estimated parameters of the model.

  • err (array-like of shape = number_of_model_elements) – The respective ERR calculated for each regressor.

  • info_values (array-like of shape = n_regressor) – Vector with the values of Akaike’s information criterion for models with N terms (where N is the vector position + 1).

predict(X, y, steps_ahead=None)[source]

Return the predicted values given an input.

The predict function allows a friendly usage by the user. Given a previously trained model, predict values given a new set of data.

This method accepts y values mainly for n-steps-ahead prediction (to be implemented in the future).

Parameters
  • X (ndarray of floats) – The input data to be used in the prediction process.

  • y (ndarray of floats) – The output data to be used in the prediction process.

  • steps_ahead (int, default=None) – The forecast horizon.

Returns

yhat – The predicted values of the model.

Return type

ndarray of floats

information_criterion(X, y)[source]

Determine the model order.

This function uses an information criterion to determine the model size. The available criteria are:

  • ‘Akaike’ – Akaike’s Information Criterion with critical value 2 (AIC) (default).

  • ‘Bayes’ – Bayes Information Criterion (BIC).

  • ‘FPE’ – Final Prediction Error (FPE).

  • ‘LILC’ – Khundrin’s law of iterated logarithm criterion (LILC).

Parameters
  • y (array-like of shape = n_samples) – Target values of the system.

  • X (array-like of shape = n_samples) – Input system values measured by the user.

Returns

output_vector – Vector with the values of Akaike’s information criterion for models with N terms (where N is the vector position + 1).

Return type

array-like of shape = n_regressor

results(theta_precision=4, err_precision=8, dtype='dec')[source]

Write the model regressors, parameters and ERR values.

This function returns the model regressors, their respective parameters, and their ERR values as a string matrix.

Parameters
  • theta_precision (int, default=4) – Precision of the displayed parameter values.

  • err_precision (int, default=8) – Precision of the displayed ERR values.

  • dtype (string, default='dec') – Type of representation: 'sci' for scientific notation, 'dec' for decimal notation.

Returns

output_matrix – A string matrix where:

  • the first column contains each regressor element;

  • the second column contains the associated parameter;

  • the third column contains the error reduction ratio associated with each regressor.

Return type

string

compute_info_value(n_theta, n_samples, e_var)[source]

Compute the information criteria value.

This function returns the information criterion value for a given number of regressors. The information criterion can be AIC, BIC, LILC, or FPE.

Parameters
  • n_theta (int) – Number of parameters of the model.

  • n_samples (int) – Number of samples given the maximum lag.

  • e_var (float) – Variance of the residues

Returns

info_criteria_value – The computed value given the information criteria selected by the user.

Return type

float
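
For orientation, the four criteria are commonly written in the NARMAX literature as a fit term N·ln(e_var) plus a penalty that grows with the number of parameters. The sketch below uses those textbook forms; it is an assumption that the library uses exactly these expressions, and info_value is a hypothetical name.

```python
from math import log


def info_value(n_theta, n_samples, e_var, criterion="aic"):
    """Textbook information criterion: fit term plus complexity penalty."""
    fit = n_samples * log(e_var)
    if criterion == "aic":
        penalty = 2 * n_theta
    elif criterion == "bic":
        penalty = n_theta * log(n_samples)
    elif criterion == "fpe":
        penalty = n_samples * log((n_samples + n_theta) / (n_samples - n_theta))
    elif criterion == "lilc":
        penalty = 2 * n_theta * log(log(n_samples))
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return fit + penalty
```

In all four cases a lower value is better, so the model size is usually chosen where the curve over the number of terms reaches its minimum or flattens.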

sysidentpy simulation

class sysidentpy.polynomial_basis.simulation.SimulatePolynomialNarmax(n_inputs=1, estimator='recursive_least_squares', extended_least_squares=True, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02, estimate_parameter=False)[source]

simulate(X_train=None, y_train=None, X_test=None, y_test=None, model_code=None, steps_ahead=None, theta=None, plot=True)[source]

Simulate a model defined by the user.

Parameters
  • X_train (ndarray of floats) – The input data to be used in the training process.

  • y_train (ndarray of floats) – The output data to be used in the training process.

  • X_test (ndarray of floats) – The input data to be used in the prediction process.

  • y_test (ndarray of floats) – The output data (initial conditions) to be used in the prediction process.

  • model_code (ndarray of int) – Flattened list of input or output regressors.

  • steps_ahead (int, default=None) – The forecast horizon.

  • theta (array-like of shape = number_of_model_elements) – The parameters of the model.

  • plot (bool, default=True) – Indicate if the user wants to plot or not.

Returns

  • yhat (ndarray of floats) – The predicted values of the model.

  • results (string) – A string matrix where the first column contains each regressor element, the second column contains the associated parameter, and the third column contains the error reduction ratio associated with each regressor.

sysidentpy narx_neural_network

sysidentpy general_estimators

Build NARX Models Using general estimators

class sysidentpy.general_estimators.narx.NARX(non_degree=1, ylag=2, xlag=2, n_inputs=1, base_estimator=None, fit_params={})[source]

NARX model built on top of general estimators

Currently, any estimator that has fit/predict methods can be used as an autoregressive model. We use our GenerateRegressors and InformationMatrix classes to create the lagged features, so a simple fit and predict interface is enough to run infinity-steps-ahead prediction.

Parameters
  • non_degree (int, default=1) – The nonlinearity degree of the polynomial function.

  • ylag (int, default=2) – The maximum lag of the output.

  • xlag (int, default=2) – The maximum lag of the input.

  • n_inputs (int, default=1) – The number of inputs of the system.

  • fit_params (dict, default={}) – Optional parameters passed to the fit function of the base estimator.

  • base_estimator (default=None) – The scikit-learn estimator used as the base estimator.

  • verbose (bool, default=False) – Print messages

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.metrics import mean_squared_error
>>> from sysidentpy.utils.generate_data import get_siso_data
>>> from sysidentpy.general_estimators import NARX
>>> from sklearn.linear_model import BayesianRidge # to use as base estimator
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=False,
...                                                    sigma=0.01,
...                                                    train_percentage=80)
>>> BayesianRidge_narx = NARX(base_estimator=BayesianRidge(),
...                           xlag=2,
...                           ylag=2
... )
>>> BayesianRidge_narx.fit(x_train, y_train)
>>> yhat = BayesianRidge_narx.predict(x_valid, y_valid)
>>> print(mean_squared_error(y_valid, yhat))
0.000131

data_preparation(X, y)[source]

Return the lagged matrix and the y values given the maximum lags.

Parameters
  • X (ndarray of floats) – The input data.

  • y (ndarray of floats) – The output data.

Returns

  • y (ndarray of floats) – The y values considering the lags.

  • reg_matrix (ndarray of floats) – The information matrix of the model.

fit(X, y)[source]

Train a NARX model on top of the base estimator.

This is a training pipeline designed to be easy to use. All the lagged features are built using the SysIdentPy classes, and the fit method of the scikit-learn base estimator is used to fit the model.

Parameters
  • X (ndarray of floats) – The input data to be used in the training process.

  • y (ndarray of floats) – The output data to be used in the training process.

Returns

base_estimator – The model fitted.

Return type

sklearn estimator

predict(X, y_initial)[source]

Return the predicted values given an input and initial values.

The predict function is designed to be easy to use. Given a trained model, it predicts values for a new set of data.

This method accepts y values mainly for n-steps-ahead prediction (to be implemented in the future).

Currently only infinity-steps-ahead prediction is supported, but running 1-step-ahead prediction manually is straightforward.

Parameters
  • X (ndarray of floats) – The input data to be used in the prediction process.

  • y_initial (ndarray of floats) – The output data (initial values) to be used in the prediction process.

Returns

yhat – The predicted values of the model.

Return type

ndarray of floats
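
The difference between infinity-steps-ahead (free run) and 1-step-ahead prediction can be illustrated on a toy model. The sketch below assumes a hypothetical first-order model y(k) = a·y(k-1) + b·x(k-1) with known coefficients; it does not use the NARX class, and free_run and one_step_ahead are illustrative names.

```python
def one_step_ahead(x, y, a, b):
    """Predict y(k) from the measured y(k-1) and x(k-1)."""
    return [y[0]] + [a * y[k - 1] + b * x[k - 1] for k in range(1, len(y))]


def free_run(x, y_initial, a, b):
    """Predict y(k) from the model's own previous prediction."""
    yhat = [y_initial]
    for k in range(1, len(x)):
        # Feed the last prediction back instead of a measurement.
        yhat.append(a * yhat[k - 1] + b * x[k - 1])
    return yhat
```

Free run needs only the initial condition and the input, so prediction errors can accumulate over time; 1-step-ahead prediction is corrected by a fresh measurement at every step.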

sysidentpy residues

class sysidentpy.residues.residues_correlation.ResiduesAnalysis[source]

Bases: object

Residues analysis for Polynomial NARX model.

residuals(X, y, yhat)[source]

Perform the residual analysis of the output to validate the model.

Parameters
  • y (array-like of shape = n_samples) – The target data used in the identification process.

  • yhat (array-like of shape = n_samples) – The prediction values of the identification process.

  • X (ndarray of floats) – The input data.

Returns

  • output_autocorr (ndarray of floats) – 1st column: normalized autocorrelation of the residuals. 2nd and 3rd columns: upper and lower limits of a 95% confidence interval.

  • output_crosscorr (ndarray of floats) – 1st column: correlation between the residuals and the input. 2nd and 3rd columns: upper and lower limits of a 95% confidence interval.

Examples

>>> y = [3, -0.5, 2, 7]
>>> autocorr(y)
[62.25 11.5   2.5  21.  ]
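
The numbers in the example are the raw (unnormalized) autocorrelation sums of y. The sketch below reproduces them and adds the normalization and the usual 95% bound of 1.96/sqrt(N) for a white sequence. The bound formula and the absence of mean removal are assumptions; the function names are illustrative, not the library's private helpers.

```python
from math import sqrt


def autocorr(values):
    """Raw autocorrelation sums for lags 0 .. n-1."""
    n = len(values)
    return [sum(values[i] * values[i + lag] for i in range(n - lag))
            for lag in range(n)]


def normalized_autocorr(residuals):
    """Autocorrelation scaled so that lag 0 equals 1."""
    raw = autocorr(residuals)
    return [value / raw[0] for value in raw]


def confidence_bound(n_samples):
    """Approximate 95% bound for the correlation of a white sequence."""
    return 1.96 / sqrt(n_samples)
```

A model is usually considered adequate when the normalized values for lags greater than zero stay inside plus or minus the bound.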
plot_result(y, yhat, e_acf, xe_ccf, figsize=(10, 8), n=100)[source]

Plot the free run simulation and residues analysis.

Parameters
  • y (array-like of shape = n_samples) – The target data used in the identification process.

  • yhat (array-like of shape = n_samples) – The prediction values of the identification process.

  • e_acf (ndarray of floats) – 1st column: normalized autocorrelation of the residuals. 2nd and 3rd columns: upper and lower limits of a 95% confidence interval.

  • xe_ccf (ndarray of floats) – 1st column: correlation between the residuals and the input. 2nd and 3rd columns: upper and lower limits of a 95% confidence interval.

sysidentpy metrics

Common metrics to assess performance on NARX models.

sysidentpy.metrics._regression.forecast_error(y, y_predicted)[source]

Calculate the forecast error in a regression model.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The difference between the true target values and the predicted (forecast) values.

Return type

ndarray of floats

References

[1] Wikipedia entry on the Forecast error

https://en.wikipedia.org/wiki/Forecast_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> forecast_error(y, y_predicted)
[0.5, -0.5, 0, -1]

sysidentpy.metrics._regression.mean_forecast_error(y, y_predicted)[source]

Calculate the mean of forecast error of a regression model.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The mean of the differences between the true target values and the predicted (forecast) values.

Return type

float

References

[1] Wikipedia entry on the Forecast error

https://en.wikipedia.org/wiki/Forecast_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_forecast_error(y, y_predicted)
-0.25

sysidentpy.metrics._regression.mean_squared_error(y, y_predicted)[source]

Calculate the Mean Squared Error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MSE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

[1] Wikipedia entry on the Mean Squared Error

https://en.wikipedia.org/wiki/Mean_squared_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_squared_error(y, y_predicted)
0.375

sysidentpy.metrics._regression.root_mean_squared_error(y, y_predicted)[source]

Calculate the Root Mean Squared Error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The RMSE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

[1] Wikipedia entry on the Root Mean Squared Error

https://en.wikipedia.org/wiki/Root-mean-square_deviation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> root_mean_squared_error(y, y_predicted)
0.612
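
The MSE and RMSE examples can be checked by hand with a few lines of plain Python. The sketch below uses the standard definitions and hypothetical names (mse, rmse), not the library functions.

```python
from math import sqrt


def mse(y, y_predicted):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y, y_predicted)) / len(y)


def rmse(y, y_predicted):
    """Square root of the MSE, expressed in the units of y."""
    return sqrt(mse(y, y_predicted))
```

For y = [3, -0.5, 2, 7] and y_predicted = [2.5, 0.0, 2, 8], the squared errors are 0.25, 0.25, 0, and 1, so the MSE is 1.5 / 4 = 0.375 and the RMSE is its square root.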
sysidentpy.metrics._regression.normalized_root_mean_squared_error(y, y_predicted)[source]

Calculate the normalized Root Mean Squared Error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The nRMSE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

[1] Wikipedia entry on the normalized Root Mean Squared Error

https://en.wikipedia.org/wiki/Root-mean-square_deviation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> normalized_root_mean_squared_error(y, y_predicted)
0.081

sysidentpy.metrics._regression.root_relative_squared_error(y, y_predicted)[source]

Calculate the Root Relative Squared Error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The RRSE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> root_relative_squared_error(y, y_predicted)
0.206

sysidentpy.metrics._regression.mean_absolute_error(y, y_predicted)[source]

Calculate the Mean absolute error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MAE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float or ndarray of floats

References

[1] Wikipedia entry on the Mean absolute error

https://en.wikipedia.org/wiki/Mean_absolute_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_absolute_error(y, y_predicted)
0.5

sysidentpy.metrics._regression.mean_squared_log_error(y, y_predicted)[source]

Calculate the Mean Squared Logarithmic Error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MSLE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

Examples

>>> y = [3, 5, 2.5, 7]
>>> y_predicted = [2.5, 5, 4, 8]
>>> mean_squared_log_error(y, y_predicted)
0.039

sysidentpy.metrics._regression.median_absolute_error(y, y_predicted)[source]

Calculate the Median Absolute Error.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MdAE, a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

[1] Wikipedia entry on the Median absolute deviation

https://en.wikipedia.org/wiki/Median_absolute_deviation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> median_absolute_error(y, y_predicted)
0.5

sysidentpy.metrics._regression.explained_variance_score(y, y_predicted)[source]

Calculate the Explained Variance Score.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The EVS. A value of 1.0 means the model outputs exactly match the true target values; lower values mean worse results.

Return type

float

References

[1] Wikipedia entry on the Explained Variance

https://en.wikipedia.org/wiki/Explained_variation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> explained_variance_score(y, y_predicted)
0.957

sysidentpy.metrics._regression.r2_score(y, y_predicted)[source]

Calculate the R2 score.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The R2 score, which can be positive or negative. A value of 1.0 means the model outputs exactly match the true target values; lower values mean worse results.

Return type

float

Notes

This is not a symmetric function.

References

[1] Wikipedia entry on the Coefficient of determination

https://en.wikipedia.org/wiki/Coefficient_of_determination

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> r2_score(y, y_predicted)
0.948
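
The R2 score and the explained variance score differ only in whether the mean of the errors is removed. The sketch below uses the standard definitions with hypothetical names (r2, explained_variance); it is not the library code.

```python
def mean(values):
    return sum(values) / len(values)


def r2(y, y_predicted):
    """1 minus the ratio of residual to total sum of squares."""
    y_bar = mean(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, y_predicted))
    ss_tot = sum((t - y_bar) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def explained_variance(y, y_predicted):
    """1 minus the ratio of error variance to target variance."""
    errors = [t - p for t, p in zip(y, y_predicted)]
    e_bar, y_bar = mean(errors), mean(y)
    var_err = mean([(e - e_bar) ** 2 for e in errors])
    var_y = mean([(t - y_bar) ** 2 for t in y])
    return 1 - var_err / var_y
```

For the data used in the examples the errors have a mean of -0.25, which is why the explained variance score (about 0.957) is slightly higher than the R2 score (about 0.948).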
sysidentpy.metrics._regression.symmetric_mean_absolute_percentage_error(y, y_predicted)[source]

Calculate the SMAPE score.

Parameters
  • y (array-like of shape = number_of_outputs) – Represent the target values.

  • y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The SMAPE, a non-negative value expressed as a percentage.

Return type

float

Notes

One supposed problem with SMAPE is that it is not symmetric since over-forecasts and under-forecasts are not treated equally.

References

[1] Wikipedia entry on the Symmetric mean absolute percentage error

https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> symmetric_mean_absolute_percentage_error(y, y_predicted)
57.87
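
The example value is consistent with the SMAPE variant that divides each absolute error by the mean of the absolute target and absolute prediction. The sketch below implements that variant; smape is a hypothetical name, and skipping terms with a zero denominator is an assumption of this sketch.

```python
def smape(y, y_predicted):
    """Symmetric mean absolute percentage error, in percent."""
    total = 0.0
    for target, predicted in zip(y, y_predicted):
        denominator = (abs(target) + abs(predicted)) / 2
        if denominator:  # skip pairs where both values are zero
            total += abs(predicted - target) / denominator
    return 100 * total / len(y)
```

For the example data the second pair (-0.5 versus 0.0) alone contributes 200%, which shows how strongly values near zero dominate this metric.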

sysidentpy estimators

Least Squares Methods for parameter estimation

class sysidentpy.parameter_estimation.estimators.Estimators(aux_lag=1, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02)[source]

Ordinary Least Squares for linear parameter estimation

least_squares(psi, y)[source]

Estimate the model parameters using Least Squares method.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

References

[1] Manuscript: Sorenson, H. W. (1970). Least-squares estimation: from Gauss to Kalman. IEEE Spectrum, 7(7), 63-68. http://pzs.dstu.dp.ua/DataMining/mls/bibl/Gauss2Kalman.pdf

[2] Book (Portuguese): Aguirre, L. A. (2007). Introdução à identificação de sistemas: técnicas lineares e não-lineares aplicadas a sistemas reais. Editora da UFMG. 3a edição.

[3] Manuscript: Markovsky, I., & Van Huffel, S. (2007). Overview of total least-squares methods. Signal Processing, 87(10), 2283-2302. https://eprints.soton.ac.uk/263855/1/tls_overview.pdf

[4] Wikipedia entry on Least Squares

https://en.wikipedia.org/wiki/Least_squares

total_least_squares(psi, y)[source]

Estimate the model parameters using Total Least Squares method.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

References

[1] Manuscript: Golub, G. H., & Van Loan, C. F. (1980). An analysis of the total least squares problem. SIAM Journal on Numerical Analysis, 17(6), 883-893.

[2] Manuscript: Markovsky, I., & Van Huffel, S. (2007). Overview of total least-squares methods. Signal Processing, 87(10), 2283-2302. https://eprints.soton.ac.uk/263855/1/tls_overview.pdf

[3] Wikipedia entry on Total Least Squares

https://en.wikipedia.org/wiki/Total_least_squares

recursive_least_squares(psi, y)[source]

Estimate the model parameters using the Recursive Least Squares method.

The implementation considers the forgetting factor.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

A more in-depth documentation of all methods for parameters estimation will be available soon. For now, please refer to the mentioned references.

References

[1] Book (Portuguese): Aguirre, L. A. (2007). Introdução à identificação de sistemas: técnicas lineares e não-lineares aplicadas a sistemas reais. Editora da UFMG. 3a edição.
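
As a rough illustration of recursive least squares with a forgetting factor, the sketch below updates the parameters one sample at a time. It is written with plain lists for readability; initializing P as the identity divided by delta is an assumption about how the delta parameter is used, and rls_sketch is a hypothetical name.

```python
def rls_sketch(psi, y, lam=0.98, delta=0.01):
    """Recursive least squares with forgetting factor `lam`."""
    n = len(psi[0])
    theta = [0.0] * n
    # Assumption: P starts as the identity scaled by 1 / delta.
    P = [[1.0 / delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for row, target in zip(psi, y):
        P_row = [sum(P[i][j] * row[j] for j in range(n)) for i in range(n)]
        denominator = lam + sum(row[i] * P_row[i] for i in range(n))
        gain = [value / denominator for value in P_row]
        error = target - sum(r * t for r, t in zip(row, theta))
        theta = [t + g * error for t, g in zip(theta, gain)]
        row_P = [sum(row[i] * P[i][j] for i in range(n)) for j in range(n)]
        P = [[(P[i][j] - gain[i] * row_P[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta
```

A forgetting factor below 1 gives older samples exponentially smaller weight, which lets the estimate track slowly varying parameters at the cost of noisier estimates.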

affine_least_mean_squares(psi, y)[source]

Estimate the model parameters using the Affine Least Mean Squares.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

A more in-depth documentation of all methods for parameters estimation will be available soon. For now, please refer to the mentioned references.

References

[1] Book: Poularikas, A. D. (2017). Adaptive filtering: Fundamentals of least mean squares with MATLAB®. CRC Press.

least_mean_squares(psi, y)[source]

Estimate the model parameters using the Least Mean Squares filter.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Haykin, S., & Widrow, B. (Eds.). (2003). Least-mean-square

adaptive filters (Vol. 31). John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_sign_error(psi, y)[source]

Parameter estimation using the Sign-Error Least Mean Squares filter.

The sign-error LMS algorithm uses the sign of the error vector to change the filter coefficients.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
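
Examples

The sign-error update described above can be sketched as follows, assuming NumPy. The name `sign_error_lms_sketch` and the step size `mu` are assumptions for illustration only.

```python
import numpy as np


def sign_error_lms_sketch(psi, y, mu=0.002):
    """Sign-error LMS: only the sign of the error scales the update."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        theta = theta + mu * np.sign(error) * psi[k]
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(5000, 2))
true_theta = np.array([0.5, -1.2])
y = psi @ true_theta
theta_se = sign_error_lms_sketch(psi, y)
```

Because the step length does not shrink with the error, the estimate keeps fluctuating around the solution with an amplitude on the order of `mu`.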

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

normalized_least_mean_squares(psi, y)[source]

Parameter estimation using the Normalized Least Mean Squares filter.

The normalization is used to avoid numerical instability when updating the estimated parameters.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
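
Examples

A minimal sketch of the normalized update, assuming NumPy. The name `nlms_sketch`, the step size `mu`, and the small regularization constant `eps` are assumptions made for this illustration.

```python
import numpy as np


def nlms_sketch(psi, y, mu=0.5, eps=1e-8):
    """Normalized LMS: the step is divided by the regressor energy."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        phi = psi[k]
        error = y[k] - phi @ theta
        # eps keeps the division well defined when phi is close to zero
        theta = theta + mu * error * phi / (eps + phi @ phi)
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(500, 2))
true_theta = np.array([0.5, -1.2])
y = psi @ true_theta
theta_nlms = nlms_sketch(psi, y)
```

Dividing by the regressor energy makes the effective step size independent of the scale of `psi`, which is what protects the update from large regressor values.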

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_sign_error(psi, y)[source]

Parameter estimation using the Normalized Sign-Error LMS filter.

The normalization is used to avoid numerical instability when updating the estimated parameters, and the sign of the error vector is used to change the filter coefficients.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_sign_regressor(psi, y)[source]

Parameter estimation using the Sign-Regressor LMS filter.

The sign-regressor LMS algorithm uses the sign of the information matrix to change the filter coefficients.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
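
Examples

The sign-regressor update can be sketched as follows, assuming NumPy. The name `sign_regressor_lms_sketch` and the step size `mu` are illustrative assumptions.

```python
import numpy as np


def sign_regressor_lms_sketch(psi, y, mu=0.05):
    """Sign-regressor LMS: the regressor is replaced by its element-wise sign."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        theta = theta + mu * error * np.sign(psi[k])
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(2000, 2))
true_theta = np.array([0.5, -1.2])
y = psi @ true_theta
theta_sr = sign_regressor_lms_sketch(psi, y)
```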

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_sign_regressor(psi, y)[source]

Parameter estimation using the Normalized Sign-Regressor LMS filter.

The normalization is used to avoid numerical instability when updating the estimated parameters and the sign of the information matrix is used to change the filter coefficients.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_sign_sign(psi, y)[source]

Parameter estimation using the Sign-Sign LMS filter.

The sign-sign LMS algorithm uses both the sign of the information matrix and the sign of the error vector to change the filter coefficients.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
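
Examples

A minimal sketch of the sign-sign update, assuming NumPy. The name `sign_sign_lms_sketch` and the step size `mu` are assumptions made for this illustration.

```python
import numpy as np


def sign_sign_lms_sketch(psi, y, mu=0.002):
    """Sign-sign LMS: both the error and the regressor are reduced to signs."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        # every coefficient moves by exactly +mu, -mu, or 0 at each sample
        theta = theta + mu * np.sign(error) * np.sign(psi[k])
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(5000, 2))
true_theta = np.array([0.5, -1.2])
y = psi @ true_theta
theta_ss = sign_sign_lms_sketch(psi, y)
```

The fixed step length keeps the arithmetic very cheap, but the estimate never settles exactly and keeps moving around the solution in increments of `mu`.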

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_sign_sign(psi, y)[source]

Parameter estimation using the Normalized Sign-Sign LMS filter.

The normalization is used to avoid numerical instability when updating the estimated parameters and both the sign of the information matrix and the sign of the error vector are used to change the filter coefficients.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_leaky(psi, y)[source]

Parameter estimation using the Normalized Leaky LMS filter.

When the leakage factor, gama, is set to 0 then there is no leakage in the estimation process.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_leaky(psi, y)[source]

Parameter estimation using the Leaky LMS filter.

When the leakage factor, gama, is set to 0 then there is no leakage in the estimation process.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
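
Examples

The effect of the leakage factor can be sketched as follows, assuming NumPy. The name `leaky_lms_sketch` and the step size `mu` are illustrative assumptions; `gama` follows the spelling used in the description above, and the exact form of the update is a common textbook formulation rather than a confirmed copy of the library's code.

```python
import numpy as np


def leaky_lms_sketch(psi, y, mu=0.05, gama=0.0):
    """Leaky LMS: the previous estimate is shrunk by (1 - mu * gama)."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        theta = (1.0 - mu * gama) * theta + mu * error * psi[k]
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(2000, 2))
true_theta = np.array([0.5, -1.2])
y = psi @ true_theta

theta_no_leak = leaky_lms_sketch(psi, y, gama=0.0)  # reduces to plain LMS
theta_leaky = leaky_lms_sketch(psi, y, gama=0.1)    # biased toward zero
```

With `gama=0` the recursion is the ordinary LMS update; a positive `gama` pulls the estimate toward zero, which trades a small bias for better numerical behavior when the regressors are poorly excited.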

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_fourth(psi, y)[source]

Parameter estimation using the LMS Fourth filter.

The LMS Fourth (least-mean-fourth) algorithm updates the filter coefficients using the cube of the error, which corresponds to minimizing the fourth power of the error.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
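
Examples

A minimal sketch of a least-mean-fourth style update, assuming NumPy. The name `lmf_sketch` and the step size `mu` are assumptions for illustration; the update shown is the usual textbook form, in which the error enters cubed, and may differ in detail from the library's code.

```python
import numpy as np


def lmf_sketch(psi, y, mu=0.005):
    """Least mean fourth: gradient step on the fourth power of the error."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        theta = theta + mu * error**3 * psi[k]
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(20000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_lmf = lmf_sketch(psi, y)
lmf_error = np.linalg.norm(theta_lmf - true_theta)
```

Because the step is proportional to the cubed error, large initial errors can destabilize the recursion and small errors produce very small corrections, so a conservative `mu` and many samples are used here.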

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Book: Hayes, M. H. (2009). Statistical digital signal processing

and modeling. John Wiley & Sons.

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Manuscript: Gui, G., Mehbodniya, A., & Adachi, F. (2013).

Least mean square/fourth algorithm with application to sparse channel estimation. arXiv preprint arXiv:1304.3911. https://arxiv.org/pdf/1304.3911.pdf

[4] Manuscript: Nascimento, V. H., & Bermudez, J. C. M. (2005, March).

When is the least-mean fourth algorithm mean-square stable? In Proceedings (ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005 (Vol. 4, pp. iv-341). IEEE. http://www.lps.usp.br/vitor/artigos/icassp05.pdf

[5] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_mixed_norm(psi, y)[source]

Parameter estimation using the Mixed-norm LMS filter.

The weight factor controls the proportions of the error norms and offers an extra degree of freedom within the adaptation.

Parameters
  • psi (ndarray of floats) – The information matrix of the model.

  • y_train (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements
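
Examples

The role of the weight factor can be sketched as follows, assuming NumPy. The name `mixed_norm_lms_sketch`, the step size `mu`, and the argument name `weight` are assumptions for illustration; constant factors from the gradient are absorbed into `mu`, so this is a sketch of the idea rather than the library's exact update.

```python
import numpy as np


def mixed_norm_lms_sketch(psi, y, mu=0.01, weight=0.5):
    """Mixed-norm LMS: blend of the LMS (e) and LMF (e**3) error terms."""
    theta = np.zeros(psi.shape[1])
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        # weight=1 gives plain LMS, weight=0 gives the least-mean-fourth update
        scale = weight + (1.0 - weight) * error**2
        theta = theta + mu * error * scale * psi[k]
    return theta


# Noise-free synthetic data generated from known parameters (hypothetical values).
rng = np.random.default_rng(0)
psi = rng.normal(size=(3000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta_mn = mixed_norm_lms_sketch(psi, y)
```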

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

[1] Chambers, J. A., Tanrikulu, O., & Constantinides, A. G. (1994).

Least mean mixed-norm adaptive filtering. Electronics letters, 30(19), 1574-1575. https://ieeexplore.ieee.org/document/326382

[2] Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação,

análise estatística e novas estratégias de algoritmos LMS de passo variável.

[3] Wikipedia entry on Least Mean Squares

https://en.wikipedia.org/wiki/Least_mean_squares_filter

sysidentpy utils

Utilities for data validation

sysidentpy.utils._check_arrays.check_infinity(X, y)[source]

Check that X and y have no NaN or Inf samples.

If there are any NaN or Inf samples, a ValueError is raised.

Parameters
  • X (ndarray of floats) – The input data.

  • y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_nan(X, y)[source]

Check that X and y have no NaN or Inf samples.

If there are any NaN or Inf samples, a ValueError is raised.

Parameters
  • X (ndarray of floats) – The input data.

  • y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_length(X, y)[source]

Check that X and y have the same number of samples.

If the lengths of X and y differ, a ValueError is raised.

Parameters
  • X (ndarray of floats) – The input data.

  • y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_dimension(X, y)[source]

Check if X and y have only real values.

If there are any string or object samples, a ValueError is raised.

Parameters
  • X (ndarray of floats) – The input data.

  • y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_X_y(X, y)[source]

Validate input and output data using some crucial tests.

Parameters
  • X (ndarray of floats) – The input data.

  • y (ndarray of floats) – The output data.
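
Examples

The checks documented above can be combined roughly as follows. This is a sketch assuming NumPy; the name `check_X_y_sketch`, the order of the checks, and the error messages are assumptions for illustration and are not taken from the library.

```python
import numpy as np


def check_X_y_sketch(X, y):
    """Raise ValueError if X or y fails any of the basic data checks."""
    X = np.asarray(X)
    y = np.asarray(y)
    for name, arr in (("X", X), ("y", y)):
        # Reject string, object, and complex data before the numeric tests.
        if not np.issubdtype(arr.dtype, np.number) or np.iscomplexobj(arr):
            raise ValueError(f"{name} must contain only real numeric values")
        if np.isnan(arr).any():
            raise ValueError(f"{name} contains NaN samples")
        if np.isinf(arr).any():
            raise ValueError(f"{name} contains Inf samples")
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have the same number of samples")


# Valid data passes silently; invalid data raises ValueError.
check_X_y_sketch(np.ones((5, 1)), np.ones((5, 1)))
try:
    check_X_y_sketch(np.array([[1.0], [np.nan]]), np.ones((2, 1)))
    nan_rejected = False
except ValueError:
    nan_rejected = True
```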

sysidentpy generate data

Utilities for data generation

sysidentpy.utils.generate_data.get_siso_data(n=5000, colored_noise=False, sigma=0.05, train_percentage=90)[source]

Generate simulated single-input single-output (SISO) data, split into identification and validation sets.

Parameters
  • n (int) – The number of samples.

  • colored_noise (bool) – Select white noise or colored noise (autoregressive noise).

  • sigma (float) – The standard deviation of the random distribution to generate the noise.

  • train_percentage (int) – The percentage of the data to be used as train data.

Returns

  • x_train, x_valid (array-like) – The input data to be used in identification and validation, respectively.

  • y_train, y_valid (array-like) – The output data to be used in identification and validation, respectively.
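
Examples

A data generator with this signature could be sketched as follows, assuming NumPy. The simulated system, the uniform input, the AR(1) noise coefficient, the `seed` argument, and the name `get_siso_data_sketch` are all assumptions chosen for illustration; they are not taken from the library.

```python
import numpy as np


def get_siso_data_sketch(n=5000, colored_noise=False, sigma=0.05,
                         train_percentage=90, seed=0):
    """Simulate a hypothetical SISO system and split the data."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)      # input signal
    noise = rng.normal(0.0, sigma, size=n)  # white noise
    if colored_noise:
        # Hypothetical autoregressive (AR(1)) filter applied to the white noise.
        white = noise.copy()
        for k in range(1, n):
            noise[k] = 0.8 * noise[k - 1] + white[k]
    y = np.zeros(n)
    for k in range(2, n):
        # Hypothetical nonlinear system used only for this example.
        y[k] = 0.2 * y[k - 1] + 0.1 * y[k - 1] * x[k - 1] + 0.9 * x[k - 2] + noise[k]
    split = int(n * train_percentage / 100)
    x, y = x.reshape(-1, 1), y.reshape(-1, 1)
    return x[:split], x[split:], y[:split], y[split:]


x_train, x_valid, y_train, y_valid = get_siso_data_sketch(n=1000)
```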

sysidentpy.utils.generate_data.get_miso_data(n=5000, colored_noise=False, sigma=0.05, train_percentage=90)[source]

Generate simulated multiple-input single-output (MISO) data, split into identification and validation sets.

Parameters
  • n (int) – The number of samples.

  • colored_noise (bool) – Select white noise or colored noise (autoregressive noise).

  • sigma (float) – The standard deviation of the random distribution to generate the noise.

  • train_percentage (int) – The percentage of the data to be used as train data.

Returns

  • x_train, x_valid (array-like) – The input data to be used in identification and validation, respectively.

  • y_train, y_valid (array-like) – The output data to be used in identification and validation, respectively.

Indices and tables