# sysidentpy base

Base classes for NARMAX estimator.

class sysidentpy.narmax_base.GenerateRegressors[source]

Polynomial NARMAX model

Provides the main functions to generate the regressor dictionary and regressor codes for polynomial basis.

create_narmax_code(non_degree, xlag, ylag, n_inputs)[source]

Create the code representation of the regressors.

This function generates a codification of all possible regressors given the maximum lag of the input and output. The codification is used to write the final terms of the model in a readable form, for example [1001] -> y(k-1). The code format is based on a dissertation from UFMG; see the reference below.

Parameters
• non_degree (int) – The desired maximum nonlinearity degree.

• ylag (int) – The maximum lag of output regressors.

• xlag (int) – The maximum lag of input regressors.

Returns

• max_lag (int) – The maximum lag. This value can be used by other functions.

• regressor_code (ndarray of int) – Matrix codification of all possible regressors.

Examples

The codification is defined as:

>>> 100n = y(k-n)
>>> 200n = u(k-n)
>>> [100n 100n] = y(k-n)y(k-n)
>>> [200n 200n] = u(k-n)u(k-n)
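
The codification above can be read back into a term with a few lines of Python. This is a minimal sketch, not the library's implementation: `decode_regressor` is a hypothetical helper, and it assumes that the leading digit selects the variable (1 for the output, 2 for the first input, written `x1` as in the examples later in this reference), that the last three digits hold the lag, and that 0 pads rows with fewer factors than the nonlinearity degree.

```python
def decode_regressor(code):
    """Translate one regressor code, e.g. [2001, 1001], into readable form.

    Hypothetical helper for illustration only.
    """
    factors = []
    for term in code:
        if term == 0:  # assumed padding for rows with fewer factors
            continue
        kind, lag = divmod(int(term), 1000)  # 1001 -> (1, 1), 2002 -> (2, 2)
        name = "y" if kind == 1 else f"x{kind - 1}"
        factors.append(f"{name}(k-{lag})")
    # An all-zero row is rendered as the constant term (an assumption).
    return "".join(factors) if factors else "1"


for row in ([1001, 0], [2001, 1001], [2002, 0]):
    print(row, "->", decode_regressor(row))
```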


References

[1] Master's thesis: Barbosa, Alípio Monteiro. Técnicas de otimização bi-objetivo para a determinação da estrutura de modelos NARX (2010).

class sysidentpy.narmax_base.HouseHolder[source]

Householder reflection and transformation.

class sysidentpy.narmax_base.InformationMatrix[source]

Class for methods regarding preprocessing of columns

shift_column(col_to_shift, lag)[source]

Shift values based on a lag.

Parameters
• col_to_shift (array-like of shape = n_samples) – The samples of the input or output.

• lag (int) – The respective lag of the regressor.

Returns

tmp_column – The shifted array of the input or output.

Return type

array-like of shape = n_samples

Examples

>>> y = [1, 2, 3, 4, 5]
>>> shift_column(y, 1)
[0, 1, 2, 3, 4]
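
The shifting behaviour in the example can be reproduced with NumPy. The function below is a sketch under the assumption that the vacated leading positions are filled with zeros, as the example output suggests; `shift_column_sketch` is a hypothetical name, not the library method.

```python
import numpy as np


def shift_column_sketch(col_to_shift, lag):
    """Delay a 1-D signal by `lag` samples, padding the start with zeros."""
    col = np.asarray(col_to_shift, dtype=float).ravel()
    shifted = np.zeros_like(col)
    if lag == 0:
        shifted[:] = col
    elif 0 < lag < col.size:
        shifted[lag:] = col[:-lag]  # sample k receives the value from k - lag
    return shifted


print(shift_column_sketch([1, 2, 3, 4, 5], 1))
```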

initial_lagged_matrix(X, y, xlag, ylag)[source]

Build a lagged matrix concerning each lag for each column.

Parameters
• model (ndarray of int) – The model code representation.

• y (array-like) – Target data used on training phase.

• X (array-like) – Input data used on training phase.

• ylag (int) – The maximum lag of output regressors.

• xlag (int) – The maximum lag of input regressors.

Returns

lagged_data – The lagged matrix built with respect to each lag and column.

Return type

ndarray of floats

Examples

Let X and y be the input and output values of shape Nx1. If the chosen lags are 2 for both input and output the initial lagged matrix will be formed by Y[k-1], Y[k-2], X[k-1], and X[k-2].
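
The construction described above can be sketched as follows. This is an illustration only: `lagged_matrix_sketch` is a hypothetical function, the column order (output lags first, then input lags) follows the sentence above, and zero padding of the first rows is an assumption. Lags are assumed to be smaller than the number of samples.

```python
import numpy as np


def lagged_matrix_sketch(X, y, xlag, ylag):
    """Stack Y[k-1..k-ylag] and X[k-1..k-xlag] as columns (zero padded)."""
    X = np.asarray(X, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()

    def shift(col, lag):
        out = np.zeros_like(col)
        out[lag:] = col[: col.size - lag]
        return out

    y_cols = [shift(y, lag) for lag in range(1, ylag + 1)]
    x_cols = [shift(X, lag) for lag in range(1, xlag + 1)]
    return np.column_stack(y_cols + x_cols)


X = np.array([10.0, 20.0, 30.0, 40.0])
y = np.array([1.0, 2.0, 3.0, 4.0])
lagged = lagged_matrix_sketch(X, y, xlag=2, ylag=2)
print(lagged)  # columns: Y[k-1], Y[k-2], X[k-1], X[k-2]
```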

build_output_matrix(y, ylag)[source]

Build the information matrix of output values.

Each column of the information matrix represents a candidate regressor. The set of candidate regressors is based on the xlag, ylag, and non_degree values entered by the user.

Parameters
• model (ndarray of int) – The model code representation.

• y (array-like) – Target data used on training phase.

• ylag (int) – The maximum lag of output regressors.

• non_degree (int) – The desired maximum nonlinearity degree.

Returns

The lagged matrix built with respect to each lag and column.

Return type

lagged_data = ndarray of floats

build_input_matrix(X, xlag)[source]

Build the information matrix of input values.

Each column of the information matrix represents a candidate regressor. The set of candidate regressors is based on the xlag, ylag, and non_degree values entered by the user.

Parameters
• model (ndarray of int) – The model code representation.

• X (array-like) – Input data used on training phase.

• xlag (int) – The maximum lag of input regressors.

• non_degree (int) – The desired maximum nonlinearity degree.

Returns

The lagged matrix built with respect to each lag and column.

Return type

lagged_data = ndarray of floats

build_input_output_matrix(X, y, xlag, ylag)[source]

Build the information matrix.

Each column of the information matrix represents a candidate regressor. The set of candidate regressors is based on the xlag, ylag, and non_degree values entered by the user.

Parameters
• model (ndarray of int) – The model code representation.

• y (array-like) – Target data used on training phase.

• X (array-like) – Input data used on training phase.

• ylag (int) – The maximum lag of output regressors.

• xlag (int) – The maximum lag of input regressors.

• non_degree (int) – The desired maximum nonlinearity degree.

Returns

The lagged matrix built with respect to each lag and column.

Return type

lagged_data = ndarray of floats

# sysidentpy FROLS

Forward Regression Orthogonal Least Squares algorithm.

This class uses the FROLS algorithm ([1], [2]) to build NARMAX models. The NARMAX model is described as:

$y_k= \mathcal{F}^\ell[y_{k-1}, \dotsc, y_{k-n_y},x_{k-d}, x_{k-d-1}, \dotsc, x_{k-d-n_x}, e_{k-1}, \dotsc, e_{k-n_e}] + e_k$

where $$n_y\in \mathbb{N}^*$$, $$n_x \in \mathbb{N}$$, and $$n_e \in \mathbb{N}$$ are the maximum lags for the system output, input, and noise, respectively; $$x_k \in \mathbb{R}^{n_x}$$ is the system input and $$y_k \in \mathbb{R}^{n_y}$$ is the system output at discrete time $$k \in \mathbb{N}^n$$; $$e_k \in \mathbb{R}^{n_e}$$ stands for uncertainties and possible noise at discrete time $$k$$. In this case, $$\mathcal{F}^\ell$$ is some nonlinear function of the input and output regressors with nonlinearity degree $$\ell \in \mathbb{N}$$ and $$d$$ is a time delay typically set to $$d=1$$.
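
As a concrete instance of this model class, the sketch below simulates a degree-2 polynomial system and recovers its parameters by ordinary least squares. The coefficients 0.2, 0.1, and 0.9 mirror the regressors shown in the example output in this reference; the noise level, the input distribution, and the absence of lagged noise terms (a NARX special case) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, n)        # input signal x_k
e = 0.001 * rng.standard_normal(n)   # white noise e_k (assumed level)

# y_k = 0.2 y_{k-1} + 0.1 y_{k-1} x_{k-1} + 0.9 x_{k-2} + e_k
y = np.zeros(n)
for k in range(2, n):
    y[k] = 0.2 * y[k - 1] + 0.1 * y[k - 1] * x[k - 1] + 0.9 * x[k - 2] + e[k]

# Information matrix for the three known regressors, rows k = 2..n-1.
psi = np.column_stack([y[1:-1], y[1:-1] * x[1:-1], x[:-2]])
theta_hat, *_ = np.linalg.lstsq(psi, y[2:], rcond=None)
print(theta_hat)  # estimates for y(k-1), x1(k-1)y(k-1), x1(k-2)
```

Structure selection algorithms such as FROLS address the harder problem where the three regressors are not known in advance and must be chosen from all candidate terms.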

Parameters
• ylag (int, default=2) – The maximum lag of the output.

• xlag (int, default=2) – The maximum lag of the input.

• elag (int, default=2) – The maximum lag of the residues.

• order_selection (bool, default=False) – Whether to use information criteria for order selection.

• info_criteria (str, default="aic") – The information criteria method to be used.

• n_terms (int, default=None) – The number of model terms to be selected. Note that n_terms overrides the information criteria values.

• n_info_values (int, default=10) – The number of iterations of the information criteria method.

• estimator (str, default="least_squares") – The parameter estimation method.

• extended_least_squares (bool, default=False) – Whether to use the extended least squares method for parameter estimation. Note that we define a specific set of noise regressors.

• aux_lag (int, default=1) – Temporary lag value used only for parameter estimation. This value is overwritten by the max_lag value and will be removed in v0.1.4.

• lam (float, default=0.98) – Forgetting factor of the Recursive Least Squares method.

• delta (float, default=0.01) – Normalization factor of the P matrix.

• offset_covariance (float, default=0.2) – The offset covariance factor of the affine least mean squares filter.

• mu (float, default=0.01) – The convergence coefficient (learning rate) of the filter.

• eps (float) – Normalization factor of the normalized filters.

• gama (float, default=0.2) – The leakage factor of the Leaky LMS method.

• weight (float, default=0.02) – Weight factor that controls the proportions of the error norms and offers an extra degree of freedom within the adaptation of the LMS mixed norm method.

• model_type (str, default="NARMAX") – The user can choose "NARMAX", "NAR" and "NFIR" models.
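
The info_criteria option listed above refers to criteria such as AIC, which trade residual variance against the number of estimated parameters. The function below is a sketch of one common form of AIC, N·ln(var(e)) + 2·n_theta; whether the library uses exactly this expression is an assumption, and `aic_sketch` is a hypothetical name.

```python
import numpy as np


def aic_sketch(residuals, n_theta):
    """One common form of Akaike's criterion for a model with n_theta terms."""
    residuals = np.asarray(residuals, dtype=float)
    n_samples = residuals.size
    return n_samples * np.log(np.var(residuals)) + 2 * n_theta


# Smaller residual variance lowers the score; each extra term adds 2.
print(aic_sketch([1.0, -1.0, 1.0, -1.0], n_theta=2))
```

When the criterion is evaluated for models with 1, 2, ..., n_info_values terms, the model size with the lowest score is the usual choice.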

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.model_structure_selection import FROLS
>>> from sysidentpy.basis_function._basis_function import Polynomial
>>> from sysidentpy.utils.display_results import results
>>> from sysidentpy.metrics import root_relative_squared_error
>>> from sysidentpy.utils.generate_data import get_miso_data, get_siso_data
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=True,
...                                                    sigma=0.2,
...                                                    train_percentage=90)
>>> basis_function = Polynomial(degree=2)
>>> model = FROLS(basis_function=basis_function,
...               order_selection=True,
...               n_info_values=10,
...               extended_least_squares=False,
...               ylag=2, xlag=2,
...               info_criteria='aic',
...               estimator='least_squares',
...               )
>>> model.fit(x_train, y_train)
>>> yhat = model.predict(x_valid, y_valid)
>>> rrse = root_relative_squared_error(y_valid, yhat)
>>> print(rrse)
0.001993603325328823
>>> r = pd.DataFrame(
...     results(
...         model.final_model, model.theta, model.err,
...         model.n_terms, err_precision=8, dtype='sci'
...         ),
...     columns=['Regressors', 'Parameters', 'ERR'])
>>> print(r)
Regressors Parameters         ERR
0        x1(k-2)     0.9000       0.0
1         y(k-1)     0.1999       0.0
2  x1(k-1)y(k-1)     0.1000       0.0


References

[1] Manuscript: Orthogonal least squares methods and their application to non-linear system identification. https://eprints.soton.ac.uk/251147/1/778742007_content.pdf

[2] Manuscript (Portuguese): Identificação de Sistemas não Lineares Utilizando Modelos NARMAX Polinomiais – Uma Revisão e Novos Resultados

# sysidentpy metamss

Meta-Model Structure Selection: Building Polynomial NARMAX model

This class uses the MetaMSS ([1], [2], [3]) algorithm to build NARMAX models. The NARMAX model is described as:

$y_k= \mathcal{F}^\ell[y_{k-1}, \dotsc, y_{k-n_y},x_{k-d}, x_{k-d-1}, \dotsc, x_{k-d-n_x}, e_{k-1}, \dotsc, e_{k-n_e}] + e_k$

where $$n_y\in \mathbb{N}^*$$, $$n_x \in \mathbb{N}$$, and $$n_e \in \mathbb{N}$$ are the maximum lags for the system output, input, and noise, respectively; $$x_k \in \mathbb{R}^{n_x}$$ is the system input and $$y_k \in \mathbb{R}^{n_y}$$ is the system output at discrete time $$k \in \mathbb{N}^n$$; $$e_k \in \mathbb{R}^{n_e}$$ stands for uncertainties and possible noise at discrete time $$k$$. In this case, $$\mathcal{F}^\ell$$ is some nonlinear function of the input and output regressors with nonlinearity degree $$\ell \in \mathbb{N}$$ and $$d$$ is a time delay typically set to $$d=1$$.

Parameters
• ylag (int, default=2) – The maximum lag of the output.

• xlag (int, default=2) – The maximum lag of the input.

• loss_func (str, default="metamss_loss") – The loss function to be minimized.

• estimator (str, default="least_squares") – The parameter estimation method.

• estimate_parameter (bool, default=True) – Whether to estimate the model parameters.

• extended_least_squares (bool, default=False) – Whether to use the extended least squares method for parameter estimation. Note that we define a specific set of noise regressors.

• lam (float, default=0.98) – Forgetting factor of the Recursive Least Squares method.

• delta (float, default=0.01) – Normalization factor of the P matrix.

• offset_covariance (float, default=0.2) – The offset covariance factor of the affine least mean squares filter.

• mu (float, default=0.01) – The convergence coefficient (learning rate) of the filter.

• eps (float) – Normalization factor of the normalized filters.

• gama (float, default=0.2) – The leakage factor of the Leaky LMS method.

• weight (float, default=0.02) – Weight factor that controls the proportions of the error norms and offers an extra degree of freedom within the adaptation of the LMS mixed norm method.

• maxiter (int, default=30) – The maximum number of iterations.

• alpha (int, default=23) – The descending coefficient of the gravitational constant.

• g_zero (int, default=100) – The initial value of the gravitational constant.

• k_agents_percent (int, default=2) – Percent of agents applying force to the others in the last iteration.

• norm (int, default=-2) – The information criteria method to be used.

• power (int, default=2) – The number of model terms to be selected. Note that n_terms overrides the information criteria values.

• n_agents (int, default=10) – The number of agents to search the optimal solution.

• dimension (int, default=15) – The dimension of the search space.

• p_zeros (float, default=0.5) – The probability of getting zeros in the construction of the population.

• p_ones (float, default=0.5) – The probability of getting ones in the construction of the population.
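
The roles of g_zero, alpha, and maxiter listed above can be illustrated with the exponential decay schedule commonly used in the gravitational search algorithm, G(t) = g_zero · exp(-alpha · t / maxiter). This is a sketch of that standard schedule; that the library uses exactly this expression is an assumption, and `gravitational_constant` is a hypothetical helper.

```python
import numpy as np


def gravitational_constant(iteration, maxiter=30, alpha=23, g_zero=100):
    """Standard GSA decay: large G early (exploration), small G late."""
    return g_zero * np.exp(-alpha * iteration / maxiter)


schedule = [gravitational_constant(t) for t in range(0, 31, 10)]
print(schedule)
```

A larger alpha makes the constant fall faster, so agents stop attracting each other earlier in the search.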

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.model_structure_selection import MetaMSS
>>> from sysidentpy.metrics import root_relative_squared_error
>>> from sysidentpy.basis_function._basis_function import Polynomial
>>> from sysidentpy.utils.display_results import results
>>> from sysidentpy.utils.generate_data import get_siso_data
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=400,
...                                                    colored_noise=False,
...                                                    sigma=0.001,
...                                                    train_percentage=80)
>>> basis_function = Polynomial(degree=2)
>>> model = MetaMSS(
...     basis_function=basis_function,
...     norm=-2,
...     xlag=7,
...     ylag=7,
...     estimator="least_squares",
...     k_agents_percent=2,
...     estimate_parameter=True,
...     maxiter=30,
...     n_agents=10,
...     p_value=0.05,
...     loss_func='metamss_loss'
... )
>>> model.fit(x_train, y_train, x_valid, y_valid)
>>> yhat = model.predict(x_valid, y_valid)
>>> rrse = root_relative_squared_error(y_valid, yhat)
>>> print(rrse)
0.001993603325328823
>>> r = pd.DataFrame(
...     results(
...         model.final_model, model.theta, model.err,
...         model.n_terms, err_precision=8, dtype='sci'
...         ),
...     columns=['Regressors', 'Parameters', 'ERR'])
>>> print(r)
Regressors Parameters         ERR
0        x1(k-2)     0.9000       0.0
1         y(k-1)     0.1999       0.0
2  x1(k-1)y(k-1)     0.1000       0.0


References

[1] Manuscript: Meta-Model Structure Selection: Building Polynomial NARX Model for Regression and Classification. https://arxiv.org/pdf/2109.09917.pdf

[2] Manuscript (Portuguese): Identificação de Sistemas Não Lineares Utilizando o Algoritmo Híbrido e Binário de Otimização por Enxame de Partículas e Busca Gravitacional. DOI: 10.17648/sbai-2019-111317

[3] Master's thesis: Meta model structure selection: an algorithm for building polynomial NARX models for regression and classification

# sysidentpy er

Entropic Regression Algorithm

Build a polynomial NARMAX model using the Entropic Regression algorithm ([1]). This algorithm is based on the Matlab package available at: https://github.com/almomaa/ERFit-Package

The NARMAX model is described as:

$y_k= \mathcal{F}^\ell[y_{k-1}, \dotsc, y_{k-n_y},x_{k-d}, x_{k-d-1}, \dotsc, x_{k-d-n_x}, e_{k-1}, \dotsc, e_{k-n_e}] + e_k$

where $$n_y\in \mathbb{N}^*$$, $$n_x \in \mathbb{N}$$, and $$n_e \in \mathbb{N}$$ are the maximum lags for the system output, input, and noise, respectively; $$x_k \in \mathbb{R}^{n_x}$$ is the system input and $$y_k \in \mathbb{R}^{n_y}$$ is the system output at discrete time $$k \in \mathbb{N}^n$$; $$e_k \in \mathbb{R}^{n_e}$$ stands for uncertainties and possible noise at discrete time $$k$$. In this case, $$\mathcal{F}^\ell$$ is some nonlinear function of the input and output regressors with nonlinearity degree $$\ell \in \mathbb{N}$$ and $$d$$ is a time delay typically set to $$d=1$$.

Parameters
• ylag (int, default=2) – The maximum lag of the output.

• xlag (int, default=2) – The maximum lag of the input.

• k (int, default=2) – The kth nearest neighbor to be used in estimation.

• q (float, default=0.99) – Quantile to compute, which must be between 0 and 1 inclusive.

• p (default=inf) – The Lp measure of the distance in the kNN estimator.

• n_perm (int, default=200) – Number of permutations to be used in the shuffle test.

• estimator (str, default="least_squares") – The parameter estimation method.

• skip_forward (bool, default=False) – To be used for difficult and highly uncertain problems. Skipping the forward selection results in a more accurate solution, but comes with a higher computational cost.

• lam (float, default=0.98) – Forgetting factor of the Recursive Least Squares method.

• delta (float, default=0.01) – Normalization factor of the P matrix.

• offset_covariance (float, default=0.2) – The offset covariance factor of the affine least mean squares filter.

• mu (float, default=0.01) – The convergence coefficient (learning rate) of the filter.

• eps (float) – Normalization factor of the normalized filters.

• gama (float, default=0.2) – The leakage factor of the Leaky LMS method.

• weight (float, default=0.02) – Weight factor that controls the proportions of the error norms and offers an extra degree of freedom within the adaptation of the LMS mixed norm method.

• model_type (str, default="NARMAX") – The user can choose "NARMAX", "NAR" and "NFIR" models.
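
The n_perm and q options listed above describe a permutation (shuffle) test: a dependence statistic is recomputed on shuffled data n_perm times, and the q quantile of those values serves as the significance threshold. The sketch below shows that mechanism only. It swaps in the absolute Pearson correlation as the statistic, whereas Entropic Regression uses a kNN-based mutual information estimate; `shuffle_test_sketch` is a hypothetical helper.

```python
import numpy as np


def shuffle_test_sketch(x, y, n_perm=200, q=0.99, seed=0):
    """Return (is_significant, observed, threshold) for a shuffle test.

    The statistic is |Pearson correlation|, a stand-in chosen for brevity.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()

    def statistic(a, b):
        return abs(np.corrcoef(a, b)[0, 1])

    observed = statistic(x, y)
    # Shuffling x destroys any dependence on y, giving a null distribution.
    null = np.array([statistic(rng.permutation(x), y) for _ in range(n_perm)])
    threshold = np.quantile(null, q)
    return observed > threshold, observed, threshold


rng = np.random.default_rng(42)
x = rng.standard_normal(200)
y = x + 0.1 * rng.standard_normal(200)  # strongly dependent on x
significant, observed, threshold = shuffle_test_sketch(x, y)
print(significant, observed, threshold)
```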

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.model_structure_selection import ER
>>> from sysidentpy.basis_function._basis_function import Polynomial
>>> from sysidentpy.utils.display_results import results
>>> from sysidentpy.metrics import root_relative_squared_error
>>> from sysidentpy.utils.generate_data import get_miso_data, get_siso_data
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=True,
...                                                    sigma=0.2,
...                                                    train_percentage=90)
>>> basis_function = Polynomial(degree=2)
>>> model = ER(basis_function=basis_function,
...              ylag=2, xlag=2
...              )
>>> model.fit(x_train, y_train)
>>> yhat = model.predict(x_valid, y_valid)
>>> rrse = root_relative_squared_error(y_valid, yhat)
>>> print(rrse)
0.001993603325328823
>>> r = pd.DataFrame(
...     results(
...         model.final_model, model.theta, model.err,
...         model.n_terms, err_precision=8, dtype='sci'
...         ),
...     columns=['Regressors', 'Parameters', 'ERR'])
>>> print(r)
Regressors Parameters         ERR
0        x1(k-2)     0.9000       0.0
1         y(k-1)     0.1999       0.0
2  x1(k-1)y(k-1)     0.1000       0.0


References

[1] Abd AlRahman R. AlMomani, Jie Sun, and Erik Bollt. How Entropic Regression Beats the Outliers Problem in Nonlinear System Identification. Chaos 30, 013107 (2020).

[2] Alexander Kraskov, Harald Stögbauer, and Peter Grassberger. Estimating mutual information. Physical Review E 69, 066138 (2004).

# sysidentpy simulation

Simulation methods for NARMAX models

class sysidentpy.simulation._simulation.SimulateNARMAX(*, estimator='recursive_least_squares', extended_least_squares=False, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02, estimate_parameter=True, calculate_err=False, model_type='NARMAX', basis_function=None)[source]

Simulation of Polynomial NARMAX model

The NARMAX model is described as:

$y_k= \mathcal{F}^\ell[y_{k-1}, \dotsc, y_{k-n_y},x_{k-d}, x_{k-d-1}, \dotsc, x_{k-d-n_x}, e_{k-1}, \dotsc, e_{k-n_e}] + e_k$

where $$n_y\in \mathbb{N}^*$$, $$n_x \in \mathbb{N}$$, and $$n_e \in \mathbb{N}$$ are the maximum lags for the system output, input, and noise, respectively; $$x_k \in \mathbb{R}^{n_x}$$ is the system input and $$y_k \in \mathbb{R}^{n_y}$$ is the system output at discrete time $$k \in \mathbb{N}^n$$; $$e_k \in \mathbb{R}^{n_e}$$ stands for uncertainties and possible noise at discrete time $$k$$. In this case, $$\mathcal{F}^\ell$$ is some nonlinear function of the input and output regressors with nonlinearity degree $$\ell \in \mathbb{N}$$ and $$d$$ is a time delay typically set to $$d=1$$.

Parameters
• estimator (str, default="recursive_least_squares") – The parameter estimation method.

• extended_least_squares (bool, default=False) – Whether to use the extended least squares method for parameter estimation. Note that we define a specific set of noise regressors.

• estimate_parameter (bool, default=True) – Whether to estimate the model parameters. Must be True if the user does not provide pre-estimated parameters.

• calculate_err (bool, default=False) – Whether to apply the ERR algorithm to the predefined regressors.

• lam (float, default=0.98) – Forgetting factor of the Recursive Least Squares method.

• delta (float, default=0.01) – Normalization factor of the P matrix.

• offset_covariance (float, default=0.2) – The offset covariance factor of the affine least mean squares filter.

• mu (float, default=0.01) – The convergence coefficient (learning rate) of the filter.

• eps (float) – Normalization factor of the normalized filters.

• gama (float, default=0.2) – The leakage factor of the Leaky LMS method.

• weight (float, default=0.02) – Weight factor to control the proportions of the error norms and offers an extra degree of freedom within the adaptation of the LMS mixed norm method.
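
The lam and delta options above belong to the recursive least squares estimator. The sketch below shows the textbook recursion with a forgetting factor. Initialising the covariance as P = I / delta is an assumption about how delta is used, and `rls_sketch` is a hypothetical function rather than the library's estimator.

```python
import numpy as np


def rls_sketch(psi, y, lam=0.98, delta=0.01):
    """Textbook recursive least squares with forgetting factor lam."""
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_theta = psi.shape[1]
    theta = np.zeros(n_theta)
    P = np.eye(n_theta) / delta  # assumed initialisation of the P matrix
    for phi, target in zip(psi, y):
        gain = P @ phi / (lam + phi @ P @ phi)
        theta = theta + gain * (target - phi @ theta)  # correct by the a priori error
        P = (P - np.outer(gain, phi) @ P) / lam        # discount old information
    return theta


rng = np.random.default_rng(3)
psi = rng.standard_normal((200, 2))
true_theta = np.array([0.5, -1.5])
theta_rls = rls_sketch(psi, psi @ true_theta)
print(theta_rls)
```

Values of lam below 1 weight recent samples more heavily, which helps when the parameters drift over time.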

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.simulation import SimulateNARMAX
>>> from sysidentpy.basis_function._basis_function import Polynomial
>>> from sysidentpy.utils.display_results import results
>>> from sysidentpy.metrics import root_relative_squared_error
>>> from sysidentpy.utils.generate_data import get_miso_data, get_siso_data
>>> x_train, x_valid, y_train, y_valid = get_siso_data(n=1000,
...                                                    colored_noise=True,
...                                                    sigma=0.2,
...                                                    train_percentage=90)
>>> basis_function = Polynomial(degree=2)
>>> s = SimulateNARMAX(basis_function=basis_function)
>>> model = np.array(
...     [
...     [1001,    0], # y(k-1)
...     [2001, 1001], # x1(k-1)y(k-1)
...     [2002,    0], # x1(k-2)
...     ]
...                 )
>>> # theta must be a numpy array of shape (n, 1) where n is the number of regressors
>>> theta = np.array([[0.2, 0.9, 0.1]]).T
>>> yhat = s.simulate(
...     X_test=x_valid,
...     y_test=y_valid,
...     model_code=model,
...     theta=theta,
...     )
>>> r = pd.DataFrame(
...     results(
...         s.final_model, s.theta, s.err,
...         s.n_terms, err_precision=8, dtype='sci'
...         ),
...     columns=['Regressors', 'Parameters', 'ERR'])
>>> print(r)
Regressors Parameters         ERR
0        x1(k-2)     0.9000       0.0
1         y(k-1)     0.1999       0.0
2  x1(k-1)y(k-1)     0.1000       0.0

simulate(*, X_train=None, y_train=None, X_test=None, y_test=None, model_code=None, steps_ahead=None, theta=None, forecast_horizon=None)[source]

Simulate a model defined by the user.

Parameters
• X_train (ndarray of floats) – The input data to be used in the training process.

• y_train (ndarray of floats) – The output data to be used in the training process.

• X_test (ndarray of floats) – The input data to be used in the prediction process.

• y_test (ndarray of floats) – The output data (initial conditions) to be used in the prediction process.

• model_code (ndarray of int) – Flattened list of input or output regressors.

• steps_ahead (int, default=None) – The forecast horizon.

• theta (array-like of shape = number_of_model_elements) – The parameters of the model.

• plot (bool, default=True) – Indicate if the user wants to plot or not.

Returns

• yhat (ndarray of floats) – The predicted values of the model.

• results (string) –

Where:

First column represents each regressor element; Second column represents associated parameter; Third column represents the error reduction ratio associated to each regressor.

error_reduction_ratio(psi, y, process_term_number, regressor_code)[source]

Perform the Error Reduction Ratio algorithm.

Parameters
• y (array-like of shape = n_samples) – The target data used in the identification process.

• psi (ndarray of floats) – The information matrix of the model.

• process_term_number (int) – Number of Process Terms defined by the user.

Returns

• err (array-like of shape = number_of_model_elements) – The respective ERR calculated for each regressor.

• piv (array-like of shape = number_of_model_elements) – Contains the index to put the regressors in the correct order based on err values.

• psi_orthogonal (ndarray of floats) – The updated and orthogonal information matrix.
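
The idea behind the error reduction ratio can be sketched with a forward selection loop: at each step the candidate that explains the largest share of the output energy, (wᵀy)² / ((wᵀw)(yᵀy)), is selected, and the remaining candidates are orthogonalised against it. The sketch uses plain Gram-Schmidt orthogonalisation for readability, which is an assumption and may differ from the library's numerical procedure; `err_sketch` is a hypothetical name.

```python
import numpy as np


def err_sketch(psi, y, n_terms):
    """Forward selection by error reduction ratio (Gram-Schmidt variant)."""
    W = np.array(psi, dtype=float)  # working copy, orthogonalised in place
    y = np.asarray(y, dtype=float).ravel()
    yy = y @ y
    remaining = list(range(W.shape[1]))
    err, piv = [], []
    for _ in range(n_terms):
        best_j, best_err = None, -1.0
        for j in remaining:
            w = W[:, j]
            ww = w @ w
            if ww < 1e-12:  # skip columns already explained by earlier picks
                continue
            ratio = (w @ y) ** 2 / (ww * yy)
            if ratio > best_err:
                best_j, best_err = j, ratio
        if best_j is None:
            break
        piv.append(best_j)
        err.append(best_err)
        remaining.remove(best_j)
        w = W[:, best_j]
        for j in remaining:  # remove the selected direction from the rest
            W[:, j] -= (w @ W[:, j]) / (w @ w) * w
    return np.array(err), np.array(piv)


rng = np.random.default_rng(0)
candidates = rng.standard_normal((200, 3))
target = 2.0 * candidates[:, 1] + 0.5 * candidates[:, 0]
err, piv = err_sketch(candidates, target, n_terms=2)
print(piv, err)
```

Because the selected directions are mutually orthogonal, the ERR values add up to at most 1, and their sum indicates how much of the output energy the chosen terms explain.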

References

[1] Manuscript: Orthogonal least squares methods and their application to non-linear system identification. https://eprints.soton.ac.uk/251147/1/778742007_content.pdf

[2] Manuscript (Portuguese): Identificação de Sistemas não Lineares Utilizando Modelos NARMAX Polinomiais – Uma Revisão e Novos Resultados

# sysidentpy basis function

class sysidentpy.basis_function._basis_function.Polynomial(degree=2)[source]

Build polynomial basis function. Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree.

$y_k = \sum_{i=1}^{p}\Theta_i \times \prod_{j=0}^{n_x}u_{k-j}^{b_i, j}\prod_{l=1}^{n_e}e_{k-l}^{d_i, l}\prod_{m=1}^{n_y}y_{k-m}^{a_i, m}$

where $$p$$ is the number of regressors, $$\Theta_i$$ are the model parameters, and $$a_i, m, b_i, j$$ and $$d_i, l \in \mathbb{N}$$ are the exponents of the output, input and noise terms, respectively.

Parameters

degree (int (max_degree), default=2) – The maximum degree of the polynomial features.

Notes

Be aware that the number of features in the output array scales significantly as the number of inputs, the max lag of the input and output, and degree increases. High degrees can cause overfitting.
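
The growth mentioned in the note can be made concrete. With n lagged columns and maximum degree d, the number of monomials of degree at most d (including the constant term) is C(n + d, d). The sketch below builds such a feature matrix with `itertools` and checks the count; `polynomial_features_sketch` is a hypothetical function and its column ordering is an assumption, not the library's.

```python
from itertools import combinations_with_replacement
from math import comb

import numpy as np


def polynomial_features_sketch(data, degree=2):
    """All monomials of the columns of `data` with total degree <= degree."""
    data = np.asarray(data, dtype=float)
    n_samples, n_cols = data.shape
    # Prepending a column of ones lets lower-degree terms appear as products.
    base = np.column_stack([np.ones(n_samples), data])
    combos = combinations_with_replacement(range(n_cols + 1), degree)
    return np.column_stack([np.prod(base[:, list(c)], axis=1) for c in combos])


lagged = np.arange(20, dtype=float).reshape(5, 4)  # 5 samples, 4 lagged columns
psi = polynomial_features_sketch(lagged, degree=2)
print(psi.shape)

for d in range(1, 5):
    print(d, comb(4 + d, d))  # candidate count for 4 columns and degree d
```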

fit(data, max_lag, predefined_regressors=None)[source]

Build the Polynomial information matrix.

Each column of the information matrix represents a candidate regressor. The set of candidate regressors is based on the xlag, ylag, and degree values defined by the user.

Parameters
• data (ndarray of floats) – The lagged matrix built with respect to each lag and column.

• max_lag (int) – The maximum lag of the model.

• predefined_regressors (ndarray of int) – The index of the selected regressors by the Model Structure Selection algorithm.

Returns

The lagged matrix built with respect to each lag and column.

Return type

psi = ndarray of floats

class sysidentpy.basis_function._basis_function.Fourier(n=1, p=6.283185307179586, degree=1, ensemble=True)[source]

Build Fourier basis function. Generate a new feature matrix consisting of all Fourier features with respect to the number of harmonics.

Parameters

degree (int (max_degree), default=1) – The maximum degree of the polynomial features.

Notes

Be aware that the number of features in the output array scales significantly as the number of inputs and the max lag of the input and output increase.

fit(data, max_lag, predefined_regressors=None)[source]

Build the Fourier information matrix.

Each column of the information matrix represents a candidate regressor. The set of candidate regressors is based on the xlag, ylag, and degree values defined by the user.

Parameters
• data (ndarray of floats) – The lagged matrix built with respect to each lag and column.

• max_lag (int) – The maximum lag of the model.

• predefined_regressors (ndarray of int) – The index of the selected regressors by the Model Structure Selection algorithm.

Returns

The lagged matrix built with respect to each lag and column.

Return type

psi = ndarray of floats

# sysidentpy general_estimators

Build NARX Models Using general estimators

class sysidentpy.general_estimators.narx.NARX(*, ylag=2, xlag=2, model_type='NARMAX', basis_function=None, base_estimator=None, fit_params={})[source]

NARX model built on top of general estimators

Currently, any estimator that provides fit and predict methods can be used as an autoregressive model. The GenerateRegressors and InformationMatrix classes handle the creation of the lagged features, so a simple fit and predict interface is enough to run infinity-steps-ahead prediction.

Parameters
• non_degree (int, default=1) – The nonlinearity degree of the polynomial function.

• ylag (int, default=2) – The maximum lag of the output.

• xlag (int, default=2) – The maximum lag of the input.

• n_inputs (int, default=1) – The number of inputs of the system.

• fit_params (dict, default=None) – Optional parameters of the fit function of the baseline estimator

• base_estimator (default=None) – The scikit-learn base estimator to be used.

• verbose (bool, default=False) – Print messages

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.metrics import mean_squared_error
>>> from sysidentpy.utils.generate_data import get_siso_data
>>> from sysidentpy.general_estimators import NARX
>>> from sklearn.linear_model import BayesianRidge
>>> from sysidentpy.basis_function._basis_function import Polynomial
>>> from sysidentpy.utils.display_results import results
>>> from sysidentpy.utils.plotting import plot_residues_correlation, plot_results
>>> from sysidentpy.residues.residues_correlation import compute_residues_autocorrelation, compute_cross_correlation
>>> x_train, x_valid, y_train, y_valid = get_siso_data(
...    n=1000,
...    colored_noise=False,
...    sigma=0.01,
...    train_percentage=80
... )
>>> basis_function = Polynomial(degree=2)
>>> BayesianRidge_narx = NARX(
...     base_estimator=BayesianRidge(),
...     xlag=2,
...     ylag=2,
...     basis_function=basis_function,
...     model_type="NARMAX",
... )
>>> BayesianRidge_narx.fit(x_train, y_train)
>>> yhat = BayesianRidge_narx.predict(x_valid, y_valid)
>>> print("MSE: ", mean_squared_error(y_valid, yhat))
>>> plot_results(y=y_valid, yhat=yhat, n=1000)
>>> ee = compute_residues_autocorrelation(y_valid, yhat)
>>> plot_residues_correlation(data=ee, title="Residues", ylabel="$e^2$")
>>> x1e = compute_cross_correlation(y_valid, yhat, x_valid)
>>> plot_residues_correlation(data=x1e, title="Residues", ylabel="$x_1e$")
0.000131

fit(*, X=None, y=None)[source]

Train a NARX model using the base estimator.

This is a training pipeline designed for friendly usage. All the lagged features are built using the SysIdentPy classes, and the fit method of the scikit-learn base estimator is used to fit the model.

Parameters
• X (ndarrays of floats) – The input data to be used in the training process.

• y (ndarrays of floats) – The output data to be used in the training process.

Returns

base_estimator – The model fitted.

Return type

sklearn estimator

Return the predicted values given an input and initial values.

The predict function is designed for friendly usage. Given a trained model, it predicts values for a new set of data.

This method accepts y values mainly for n-steps-ahead prediction (to be implemented in the future).

Currently only infinity-steps-ahead prediction is supported, but running 1-step-ahead prediction manually is straightforward.

Parameters
• X (ndarray of floats) – The input data to be used in the prediction process.

• y (ndarray of floats) – The output data to be used in the prediction process.

Returns

yhat – The predicted values of the model.

Return type

ndarray of floats

# sysidentpy bpsogsa¶

Binary Hybrid Particle Swarm Optimization and Gravitational Search Algorithm

class sysidentpy.metaheuristics.bpsogsa.BPSOGSA(maxiter=30, alpha=23, g_zero=100, k_agents_percent=2, norm=-2, power=2, n_agents=10, dimension=15, p_zeros=0.5, p_ones=0.5)[source]

Binary Hybrid Particle Swarm Optimization and Gravitational Search Algorithm [1], [2], [3], [4], [5].

Parameters
• maxiter (int, default=30) – The maximum number of iterations.

• alpha (int, default=23) – The descending coefficient of the gravitational constant.

• g_zero (int, default=100) – The initial value of the gravitational constant.

• k_agents_percent (int, default=2) – Percent of agents applying force to the others in the last iteration.

• norm (int, default=-2) – The order of the norm used to compute the distance between agents.

• power (int, default=2) – The exponent applied to the distance between agents when computing the acceleration.

• n_agents (int, default=10) – The number of agents to search the optimal solution.

• dimension (int, default=15) – The dimension of the search space.

• p_zeros (float, default=0.5) – The probability of getting zeros in the construction of the population.

• p_ones (float, default=0.5) – The probability of getting ones in the construction of the population.

Examples

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from sysidentpy.metaheuristics import BPSOGSA
>>> opt = BPSOGSA(maxiter=100,
...               k_agents_percent=2,
...               n_agents=10,
...               dimension=20
...               )
>>> opt.optimize()
>>> plt.plot(opt.best_by_iter)
>>> plt.show()
>>> print(opt.optimal_fitness_value)


References

1

A New Hybrid PSOGSA Algorithm for Function Optimization https://www.mathworks.com/matlabcentral/fileexchange/35939-hybrid-particle-swarm-optimization-and-gravitational-search-algorithm-psogsa

2

Manuscript: Particle swarm optimization: developments, applications and resources.

3

Manuscript: S-shaped versus v-shaped transfer functions for binary particle swarm optimization

4

Manuscript: BGSA: Binary Gravitational Search Algorithm.

5

Manuscript: A taxonomy of hybrid metaheuristics

optimize()[source]

Run the BPSOGSA algorithm.

This algorithm is based on the Matlab implementation provided by the author of the BPSOGSA algorithm [1], [2], [3], [4], [5].

References

1

A New Hybrid PSOGSA Algorithm for Function Optimization. https://www.mathworks.com/matlabcentral/fileexchange/35939-hybrid-particle-swarm-optimization-and-gravitational-search-algorithm-psogsa

2

Manuscript: Particle swarm optimization: developments, applications and resources.

3

Manuscript: S-shaped versus v-shaped transfer functions for binary particle swarm optimization.

4

Manuscript: BGSA: Binary Gravitational Search Algorithm.

5

Manuscript: A taxonomy of hybrid metaheuristics.

generate_random_population(random_state=None)[source]

Generate the initial population of agents randomly.

Returns

population – The initial population of agents.

Return type

ndarray of zeros and ones
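A minimal sketch of how such a binary population could be drawn with NumPy. The helper name `random_binary_population` and the `(dimension, n_agents)` orientation are assumptions made for illustration; they are not taken from the SysIdentPy source.

```python
import numpy as np


def random_binary_population(dimension, n_agents, p_zeros=0.5, p_ones=0.5, seed=None):
    """Draw a (dimension, n_agents) matrix of zeros and ones (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Each entry is 0 with probability p_zeros and 1 with probability p_ones.
    return rng.choice([0, 1], size=(dimension, n_agents), p=[p_zeros, p_ones])


population = random_binary_population(dimension=15, n_agents=10, seed=42)
print(population.shape)
```

Passing a fixed seed makes the initial population reproducible, which is useful when comparing runs of the optimizer.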

mass_calculation(fitness_value)[source]

Calculate the inertial masses of the agents.

Parameters

fitness_value (ndarray) – The fitness value of each agent.

Returns

agent_mass – The mass of each agent.

Return type

ndarray of floats
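The mass update of the standard Gravitational Search Algorithm can be sketched as follows, assuming a minimization problem (lower fitness is better). The function name `agent_masses` is hypothetical, and the library may handle edge cases differently.

```python
import numpy as np


def agent_masses(fitness_value):
    """Normalized GSA masses for a minimization problem (illustrative sketch)."""
    fitness_value = np.asarray(fitness_value, dtype=float)
    best, worst = fitness_value.min(), fitness_value.max()
    if best == worst:
        # All agents are equally fit: share the mass uniformly.
        return np.full(fitness_value.shape, 1.0 / fitness_value.size)
    # The best agent gets raw mass 1, the worst gets raw mass 0.
    raw = (fitness_value - worst) / (best - worst)
    # Normalize so that the masses sum to one.
    return raw / raw.sum()


masses = agent_masses([0.2, 0.5, 0.9])
print(masses)
```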

calculate_gravitational_constant(iteration)[source]

Update the gravitational constant.

Parameters

iteration (int) – The current iteration (time step).

Returns

gravitational_constant – The gravitational constant at the time defined by the iteration.

Return type

float
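In the standard GSA formulation the gravitational constant decays exponentially with the iteration count. The sketch below assumes that form, with `g_zero`, `alpha` and `maxiter` playing the roles described in the class parameters; the exact expression used by the library is not shown in this page.

```python
import numpy as np


def gravitational_constant(iteration, maxiter=30, alpha=23, g_zero=100):
    """Exponentially decaying gravitational constant (assumed standard GSA form)."""
    return g_zero * np.exp(-alpha * iteration / maxiter)


values = [gravitational_constant(k) for k in range(0, 31, 10)]
print(values)
```

A large `alpha` makes the attraction between agents fade quickly, shifting the search from exploration to exploitation as the iterations advance.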

calculate_acceleration(population, agent_mass, gravitational_constant, iteration)[source]

Calculate the acceleration of each agent.

Parameters
• population (ndarray of zeros and ones) – The population defined by the agents.

• agent_mass (ndarray of floats) – The mass of each agent.

• gravitational_constant (float) – The gravitational constant at the time defined by the iteration.

• iteration (int) – The current iteration.

Returns

acceleration – The acceleration of each agent.

Return type

ndarray of floats

update_velocity_position(population, acceleration, velocity, iteration)[source]

Update the velocity and position of each agent.

Parameters
• population (ndarray of zeros and ones) – The population defined by the agents.

• acceleration (ndarray of floats) – The acceleration of each agent.

• velocity (ndarray of floats) – The velocity of each agent.

• iteration (int) – The current iteration.

Returns

• velocity (ndarray of floats) – The updated velocity of each agent.

• population (ndarray of zeros and ones) – The updated population defined by the agents.

# sysidentpy residues¶

sysidentpy.residues.residues_correlation.compute_residues_autocorrelation(y, yhat)[source]
sysidentpy.residues.residues_correlation.calculate_residues(y, yhat)[source]
sysidentpy.residues.residues_correlation.get_unnormalized_e_acf(e)[source]
sysidentpy.residues.residues_correlation.compute_cross_correlation(y, yhat, arr)[source]
class sysidentpy.residues.residues_correlation.ResiduesAnalysis[source]

Bases: object

Residues analysis for Polynomial NARX model.

residuals(X, y, yhat)[source]

Perform the residual analysis of the output to validate the model.

Parameters
• y (array-like of shape = n_samples) – The target data used in the identification process.

• yhat (array-like of shape = n_samples) – The prediction values of the identification process.

• X (ndarray of floats) – The input data.

Returns

• output_autocorr (ndarray of floats) – 1st column - Normalized autocorrelation of the residuals. 2nd/3rd columns - Upper and lower limits of a 95% confidence interval.

• output_crosscorr (ndarray of floats) – 1st column - Correlation between the residuals and the input. 2nd/3rd columns - Upper and lower limits of a 95% confidence interval.

Examples

>>> y = [3, -0.5, 2, 7]
>>> autocorr(y)
[62.25 11.5   2.5  21.  ]
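The values in the example above are the unnormalized autocorrelation of the sequence at lags 0 to 3, which can be reproduced with `numpy.correlate`. The second helper sketches the normalization and the usual large-sample 95% bounds of ±1.96/√N. Both function names are hypothetical, and removing the mean before normalizing is an assumption rather than a statement about the library's internals.

```python
import numpy as np


def unnormalized_autocorrelation(e):
    """Autocorrelation at lags 0..N-1 without normalization."""
    e = np.asarray(e, dtype=float)
    full = np.correlate(e, e, mode="full")
    # Index N-1 of the "full" output corresponds to lag 0.
    return full[e.size - 1:]


def normalized_autocorrelation_with_bounds(e):
    """Normalized autocorrelation plus approximate 95% confidence bounds."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()  # assumption: center the residuals first
    acf = unnormalized_autocorrelation(e)
    acf = acf / acf[0]  # lag 0 becomes exactly 1
    bound = 1.96 / np.sqrt(e.size)
    return acf, bound, -bound


raw_acf = unnormalized_autocorrelation([3, -0.5, 2, 7])
norm_acf, upper, lower = normalized_autocorrelation_with_bounds([3, -0.5, 2, 7])
print(raw_acf)
```

For a valid model the normalized autocorrelation at nonzero lags is expected to stay mostly inside the two bounds.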

plot_result(y, yhat, e_acf, xe_ccf, figsize=(10, 8), n=100)[source]

Plot the free run simulation and residues analysis.

Parameters
• y (array-like of shape = n_samples) – The target data used in the identification process.

• yhat (array-like of shape = n_samples) – The prediction values of the identification process.

• e_acf (ndarray of floats) – 1st column - Normalized autocorrelation of the residuals. 2nd/3rd columns - Upper and lower limits of a 95% confidence interval.

• xe_ccf (ndarray of floats) – 1st column - Correlation between the residuals and the input. 2nd/3rd columns - Upper and lower limits of a 95% confidence interval.

# sysidentpy metrics¶

Common metrics to assess performance on NARX models.

sysidentpy.metrics._regression.forecast_error(y, y_predicted)[source]

Calculate the forecast error in a regression model.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The difference between the true target values and the predicted or forecast value in regression or any other phenomenon.

Return type

ndarray of floats

References

1

Wikipedia entry on the Forecast error https://en.wikipedia.org/wiki/Forecast_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> forecast_error(y, y_predicted)
[0.5, -0.5, 0, -1]

sysidentpy.metrics._regression.mean_forecast_error(y, y_predicted)[source]

Calculate the mean of forecast error of a regression model.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The mean value of the difference between the true target values and the predicted or forecast value in regression or any other phenomenon.

Return type

float

References

1

Wikipedia entry on the Forecast error https://en.wikipedia.org/wiki/Forecast_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_forecast_error(y, y_predicted)
-0.25

sysidentpy.metrics._regression.mean_squared_error(y, y_predicted)[source]

Calculate the Mean Squared Error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MSE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

1

Wikipedia entry on the Mean Squared Error https://en.wikipedia.org/wiki/Mean_squared_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_squared_error(y, y_predicted)
0.375

sysidentpy.metrics._regression.root_mean_squared_error(y, y_predicted)[source]

Calculate the Root Mean Squared Error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The RMSE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

1

Wikipedia entry on the Root Mean Squared Error https://en.wikipedia.org/wiki/Root-mean-square_deviation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> root_mean_squared_error(y, y_predicted)
0.612

sysidentpy.metrics._regression.normalized_root_mean_squared_error(y, y_predicted)[source]

Calculate the normalized Root Mean Squared Error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The nRMSE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

1

Wikipedia entry on the normalized Root Mean Squared Error https://en.wikipedia.org/wiki/Root-mean-square_deviation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> normalized_root_mean_squared_error(y, y_predicted)
0.081
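The three squared-error metrics above are closely related, as the following NumPy sketch shows. Normalizing the RMSE by the range of the target values is an assumption here; it is consistent with the documented result (0.0816, shown truncated as 0.081), but other normalizations such as the mean are also in common use.

```python
import numpy as np

y = np.array([3, -0.5, 2, 7])
y_predicted = np.array([2.5, 0.0, 2, 8])

mse = np.mean((y - y_predicted) ** 2)   # mean of the squared errors
rmse = np.sqrt(mse)                     # same units as y
nrmse = rmse / (y.max() - y.min())      # assumed: normalized by the range of y

print(mse, rmse, nrmse)
```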

sysidentpy.metrics._regression.root_relative_squared_error(y, y_predicted)[source]

Calculate the Root Relative Squared Error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The RRSE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> root_relative_squared_error(y, y_predicted)
0.206
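The sketch below works through the RRSE arithmetic for the example data. The textbook definition divides the sum of squared errors by the spread of the targets around their mean, which gives about 0.227. The documented value of 0.206 is obtained when the denominator uses the predicted values minus the mean of the targets. That second form is only an inference from the documented output, not something confirmed from the library source.

```python
import numpy as np

y = np.array([3, -0.5, 2, 7])
y_predicted = np.array([2.5, 0.0, 2, 8])

sse = np.sum((y - y_predicted) ** 2)

# Textbook RRSE: error relative to a predictor that always outputs mean(y).
rrse_textbook = np.sqrt(sse / np.sum((y - y.mean()) ** 2))

# Assumed variant that reproduces the documented 0.206.
rrse_documented = np.sqrt(sse / np.sum((y_predicted - y.mean()) ** 2))

print(rrse_textbook, rrse_documented)
```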

sysidentpy.metrics._regression.mean_absolute_error(y, y_predicted)[source]

Calculate the Mean absolute error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MAE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float or ndarray of floats

References

1

Wikipedia entry on the Mean absolute error https://en.wikipedia.org/wiki/Mean_absolute_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> mean_absolute_error(y, y_predicted)
0.5

sysidentpy.metrics._regression.mean_squared_log_error(y, y_predicted)[source]

Calculate the Mean Squared Logarithmic Error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MSLE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

Examples

>>> y = [3, 5, 2.5, 7]
>>> y_predicted = [2.5, 5, 4, 8]
>>> mean_squared_log_error(y, y_predicted)
0.039
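A short sketch of the MSLE computation, assuming the common definition based on log(1 + value), which requires non-negative targets and predictions. It yields about 0.0397 for the example data, matching the documented result shown truncated as 0.039.

```python
import numpy as np

y = np.array([3, 5, 2.5, 7])
y_predicted = np.array([2.5, 5, 4, 8])

# log1p(v) computes log(1 + v), so zero-valued samples remain valid.
msle = np.mean((np.log1p(y) - np.log1p(y_predicted)) ** 2)

print(msle)
```

Because the error is measured on a logarithmic scale, the metric penalizes relative differences rather than absolute ones.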

sysidentpy.metrics._regression.median_absolute_error(y, y_predicted)[source]

Calculate the Median Absolute Error.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The MdAE is a non-negative value. A value of 0.0 means the model outputs exactly match the true target values.

Return type

float

References

1

Wikipedia entry on the Median absolute deviation https://en.wikipedia.org/wiki/Median_absolute_deviation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> median_absolute_error(y, y_predicted)
0.5

sysidentpy.metrics._regression.explained_variance_score(y, y_predicted)[source]

Calculate the Explained Variance Score.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The best possible EVS is 1.0, which means the model outputs exactly match the true target values. Lower values mean worse results.

Return type

float

References

1

Wikipedia entry on the Explained Variance https://en.wikipedia.org/wiki/Explained_variation

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> explained_variance_score(y, y_predicted)
0.957

sysidentpy.metrics._regression.r2_score(y, y_predicted)[source]

Calculate the R2 score.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The R2 score can be positive or negative. A value of 1.0 means the model outputs exactly match the true target values. Lower values mean worse results.

Return type

float

Notes

This is not a symmetric function.

References

1

Wikipedia entry on the Coefficient of determination https://en.wikipedia.org/wiki/Coefficient_of_determination

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> r2_score(y, y_predicted)
0.948
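The explained variance score and the R2 score differ only in how they treat a constant bias in the errors, as this sketch with the standard definitions illustrates. The EVS uses the variance of the errors, which ignores their mean, while R2 uses the raw sum of squared errors.

```python
import numpy as np

y = np.array([3, -0.5, 2, 7])
y_predicted = np.array([2.5, 0.0, 2, 8])
errors = y - y_predicted

# EVS: a constant offset in the errors does not lower the score.
evs = 1 - np.var(errors) / np.var(y)

# R2: any systematic offset counts as error.
r2 = 1 - np.sum(errors ** 2) / np.sum((y - y.mean()) ** 2)

print(evs, r2)
```

The two scores coincide when the mean error is zero; here the mean error is -0.25, so R2 is slightly lower than the EVS.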

sysidentpy.metrics._regression.symmetric_mean_absolute_percentage_error(y, y_predicted)[source]

Calculate the SMAPE score.

Parameters
• y (array-like of shape = number_of_outputs) – Represent the target values.

• y_predicted (array-like of shape = number_of_outputs) – Target values predicted by the model.

Returns

loss – The SMAPE is a non-negative value, expressed as a percentage.

Return type

float

Notes

One supposed problem with SMAPE is that it is not symmetric since over-forecasts and under-forecasts are not treated equally.

References

1

Wikipedia entry on the Symmetric mean absolute percentage error https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error

Examples

>>> y = [3, -0.5, 2, 7]
>>> y_predicted = [2.5, 0.0, 2, 8]
>>> symmetric_mean_absolute_percentage_error(y, y_predicted)
57.87
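One common SMAPE definition divides each absolute error by the average of the absolute target and predicted values. The sketch below assumes that definition; it gives about 57.88 for the example data, in line with the documented value (shown truncated as 57.87).

```python
import numpy as np

y = np.array([3, -0.5, 2, 7])
y_predicted = np.array([2.5, 0.0, 2, 8])

# Assumes no sample has |y| + |y_predicted| equal to zero.
denominator = (np.abs(y) + np.abs(y_predicted)) / 2
smape = 100 * np.mean(np.abs(y_predicted - y) / denominator)

print(smape)
```

Note how the second sample, with a small target and a zero prediction, contributes the maximum possible term (200%) and dominates the average.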


# sysidentpy estimators¶

Least Squares Methods for parameter estimation

class sysidentpy.parameter_estimation.estimators.Estimators(max_lag=1, lam=0.98, delta=0.01, offset_covariance=0.2, mu=0.01, eps=2.220446049250313e-16, gama=0.2, weight=0.02)[source]

Ordinary Least Squares for linear parameter estimation

least_squares(psi, y)[source]

Estimate the model parameters using Least Squares method.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

References

1

Manuscript: Sorenson, H. W. (1970). Least-squares estimation: from Gauss to Kalman. IEEE spectrum, 7(7), 63-68. http://pzs.dstu.dp.ua/DataMining/mls/bibl/Gauss2Kalman.pdf

2

Book (Portuguese): Aguirre, L. A. (2007). Introdução identificação de sistemas: técnicas lineares e não-lineares aplicadas a sistemas reais. Editora da UFMG. 3a edição.

3

Manuscript: Markovsky, I., & Van Huffel, S. (2007). Overview of total least-squares methods. Signal processing, 87(10), 2283-2302. https://eprints.soton.ac.uk/263855/1/tls_overview.pdf

4

Wikipedia entry on Least Squares https://en.wikipedia.org/wiki/Least_squares
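As an illustration, the ordinary least squares estimate for an information matrix `psi` and output `y` can be computed with `numpy.linalg.lstsq`. This is a generic sketch on synthetic, noise-free data, not the library's implementation; `theta_true` and the data are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
psi = rng.normal(size=(200, 3))                 # synthetic information matrix
theta_true = np.array([[0.5], [-1.2], [2.0]])   # illustrative parameters
y = psi @ theta_true                            # noise-free output

# Solve min ||psi @ theta - y||^2 in a numerically stable way.
theta, *_ = np.linalg.lstsq(psi, y, rcond=None)

print(theta.ravel())
```

Using a dedicated solver avoids forming and inverting `psi.T @ psi` explicitly, which can be ill-conditioned.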

total_least_squares(psi, y)[source]

Estimate the model parameters using Total Least Squares method.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

References

1

Manuscript: Golub, G. H., & Van Loan, C. F. (1980). An analysis of the total least squares problem. SIAM journal on numerical analysis, 17(6), 883-893.

2

Manuscript: Markovsky, I., & Van Huffel, S. (2007). Overview of total least-squares methods. Signal processing, 87(10), 2283-2302. https://eprints.soton.ac.uk/263855/1/tls_overview.pdf

3

Wikipedia entry on Total Least Squares https://en.wikipedia.org/wiki/Total_least_squares
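The classical total least squares solution can be obtained from the singular value decomposition of the augmented matrix [psi y], following the approach described in the references above. The function name `total_least_squares_sketch` is hypothetical and the code is a sketch of the textbook method, which may differ in detail from the library.

```python
import numpy as np


def total_least_squares_sketch(psi, y):
    """Textbook TLS via the SVD of the augmented matrix [psi, y]."""
    y = np.reshape(y, (-1, 1))
    n = psi.shape[1]
    _, _, vh = np.linalg.svd(np.hstack([psi, y]), full_matrices=False)
    v = vh.T
    # The right singular vector of the smallest singular value is
    # proportional to [theta; -1].
    return -v[:n, n:] / v[n, n]


rng = np.random.default_rng(1)
psi = rng.normal(size=(100, 2))
theta_true = np.array([[1.5], [-0.7]])
theta = total_least_squares_sketch(psi, psi @ theta_true)
print(theta.ravel())
```

Unlike ordinary least squares, this formulation accounts for errors in both the regressors and the output.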

recursive_least_squares(psi, y)[source]

Estimate the model parameters using the Recursive Least Squares method.

The implementation considers the forgetting factor.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book (Portuguese): Aguirre, L. A. (2007). Introdução identificação de sistemas: técnicas lineares e não-lineares aplicadas a sistemas reais. Editora da UFMG. 3a edição.
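A compact sketch of recursive least squares with a forgetting factor. It assumes that `lam` is the forgetting factor and that the covariance matrix is initialized as the identity divided by `delta`, mirroring the constructor defaults; the function name and these roles are assumptions, not confirmed details of the library.

```python
import numpy as np


def recursive_least_squares_sketch(psi, y, lam=0.98, delta=0.01):
    """RLS with forgetting factor lam (illustrative, not the library code)."""
    y = np.reshape(y, (-1, 1))
    n_samples, n_theta = psi.shape
    theta = np.zeros((n_theta, 1))
    p = np.eye(n_theta) / delta  # assumed covariance initialization
    for i in range(n_samples):
        x = psi[i].reshape(-1, 1)
        gain = p @ x / (lam + x.T @ p @ x)
        error = y[i, 0] - (x.T @ theta).item()  # a priori prediction error
        theta = theta + gain * error
        p = (p - gain @ x.T @ p) / lam
    return theta


rng = np.random.default_rng(2)
psi = rng.normal(size=(500, 2))
theta_true = np.array([[0.8], [-0.3]])
theta = recursive_least_squares_sketch(psi, psi @ theta_true)
print(theta.ravel())
```

Values of `lam` below 1 discount old samples, which lets the estimate track slowly varying parameters.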

affine_least_mean_squares(psi, y)[source]

Estimate the model parameters using the Affine Least Mean Squares.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Poularikas, A. D. (2017). Adaptive filtering: Fundamentals of least mean squares with MATLAB®. CRC Press.

least_mean_squares(psi, y)[source]

Estimate the model parameters using the Least Mean Squares filter.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Haykin, S., & Widrow, B. (Eds.). (2003). Least-mean-square adaptive filters (Vol. 31). John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter
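The basic LMS recursion updates the parameters along the current regressor, scaled by the prediction error and a step size. In the sketch below `mu` is assumed to be that step size; the function name is hypothetical, and some formulations (possibly including the library's) use 2·mu instead of mu.

```python
import numpy as np


def least_mean_squares_sketch(psi, y, mu=0.01):
    """Textbook LMS filter: theta <- theta + mu * error * x."""
    y = np.reshape(y, (-1, 1))
    n_samples, n_theta = psi.shape
    theta = np.zeros((n_theta, 1))
    for i in range(n_samples):
        x = psi[i].reshape(-1, 1)
        error = y[i, 0] - (x.T @ theta).item()  # a priori prediction error
        theta = theta + mu * error * x
    return theta


rng = np.random.default_rng(3)
psi = rng.normal(size=(3000, 2))
theta_true = np.array([[0.4], [-0.9]])
theta = least_mean_squares_sketch(psi, psi @ theta_true)
print(theta.ravel())
```

The sign-based variants documented below follow the same loop and replace `error`, `x`, or both with their signs in the update.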

least_mean_squares_sign_error(psi, y)[source]

Parameter estimation using the Sign-Error Least Mean Squares filter.

The sign-error LMS algorithm uses the sign of the error vector to change the filter coefficients.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

normalized_least_mean_squares(psi, y)[source]

Parameter estimation using the Normalized Least Mean Squares filter.

The normalization is used to avoid numerical instability when updating the estimated parameters.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter
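The normalization mentioned above divides each update by the squared norm of the current regressor, so the effective step size no longer depends on the input scale. This sketch assumes `mu` is the step size and `eps` a small constant that prevents division by zero, as suggested by the constructor defaults; the function name is hypothetical.

```python
import numpy as np


def normalized_least_mean_squares_sketch(psi, y, mu=0.01, eps=np.finfo(np.float64).eps):
    """Textbook normalized LMS filter (illustrative sketch)."""
    y = np.reshape(y, (-1, 1))
    n_samples, n_theta = psi.shape
    theta = np.zeros((n_theta, 1))
    for i in range(n_samples):
        x = psi[i].reshape(-1, 1)
        error = y[i, 0] - (x.T @ theta).item()
        # Dividing by the regressor energy keeps the update well scaled.
        theta = theta + mu * error * x / (eps + (x.T @ x).item())
    return theta


rng = np.random.default_rng(4)
psi = 50.0 * rng.normal(size=(500, 2))   # deliberately badly scaled input
theta_true = np.array([[0.4], [-0.9]])
# A larger step size than the default is used so the short run converges.
theta = normalized_least_mean_squares_sketch(psi, psi @ theta_true, mu=0.5)
print(theta.ravel())
```

With the same badly scaled input, a plain LMS update with a fixed step size of 0.5 would diverge, which is the instability the normalization is meant to avoid.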

least_mean_squares_normalized_sign_error(psi, y)[source]

Parameter estimation using the Normalized Sign-Error LMS filter.

The normalization is used to avoid numerical instability when updating the estimated parameters, and the sign of the error vector is used to change the filter coefficients.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_sign_regressor(psi, y)[source]

Parameter estimation using the Sign-Regressor LMS filter.

The sign-regressor LMS algorithm uses the sign of the information matrix to change the filter coefficients.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_sign_regressor(psi, y)[source]

Parameter estimation using the Normalized Sign-Regressor LMS filter.

The normalization is used to avoid numerical instability when updating the estimated parameters and the sign of the information matrix is used to change the filter coefficients.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_sign_sign(psi, y)[source]

Parameter estimation using the Sign-Sign LMS filter.

The sign-sign LMS algorithm uses both the sign of the information matrix and the sign of the error vector to change the filter coefficients.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_sign_sign(psi, y)[source]

Parameter estimation using the Normalized Sign-Sign LMS filter.

The normalization is used to avoid numerical instability when updating the estimated parameters and both the sign of the information matrix and the sign of the error vector are used to change the filter coefficients.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_normalized_leaky(psi, y)[source]

Parameter estimation using the Normalized Leaky LMS filter.

When the leakage factor, gama, is set to 0, there is no leakage in the estimation process.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_leaky(psi, y)[source]

Parameter estimation using the Leaky LMS filter.

When the leakage factor, gama, is set to 0, there is no leakage in the estimation process.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_fourth(psi, y)[source]

Parameter estimation using the LMS Fourth filter.

The least mean fourth approach minimizes the fourth power of the error rather than its square.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references below.

References

1

Book: Hayes, M. H. (2009). Statistical digital signal processing and modeling. John Wiley & Sons.

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Manuscript: Gui, G., Mehbodniya, A., & Adachi, F. (2013). Least mean square/fourth algorithm with application to sparse channel estimation. arXiv preprint arXiv:1304.3911. https://arxiv.org/pdf/1304.3911.pdf

4

Manuscript: Nascimento, V. H., & Bermudez, J. C. M. (2005, March). When is the least-mean fourth algorithm mean-square stable? In Proceedings.(ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. (Vol. 4, pp. iv-341). IEEE. http://www.lps.usp.br/vitor/artigos/icassp05.pdf

5

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter

least_mean_squares_mixed_norm(psi, y)[source]

Parameter estimation using the Mixed-norm LMS filter.

The weight factor controls the proportions of the error norms and offers an extra degree of freedom within the adaptation.

Parameters
• psi (ndarray of floats) – The information matrix of the model.

• y (array-like of shape = y_training) – The data used to train the model.

Returns

theta – The estimated parameters of the model.

Return type

array-like of shape = number_of_model_elements

Notes

More in-depth documentation of all parameter estimation methods will be available soon. For now, please refer to the references listed below.

References

1

Chambers, J. A., Tanrikulu, O., & Constantinides, A. G. (1994). Least mean mixed-norm adaptive filtering. Electronics letters, 30(19), 1574-1575. https://ieeexplore.ieee.org/document/326382

2

Dissertation (Portuguese): Zipf, J. G. F. (2011). Classificação, análise estatística e novas estratégias de algoritmos LMS de passo variável.

3

Wikipedia entry on Least Mean Squares https://en.wikipedia.org/wiki/Least_mean_squares_filter
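Examples

As a rough illustration of how a weight factor can blend the two error norms, the sketch below mixes a least-mean-squares term and a least-mean-fourth term in one recursion. The names lmmn_sketch, mu, and weight are assumptions for this example, constant gradient factors are absorbed into mu, and the code is not taken from the library.

```python
import numpy as np


def lmmn_sketch(psi, y, mu=0.05, weight=0.5):
    """Illustrative mixed-norm LMS recursion (hypothetical helper).

    weight = 1 reduces to a plain LMS step; weight = 0 reduces to an
    LMF-style step driven by the cubed error.
    """
    psi = np.asarray(psi, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    theta = np.zeros(psi.shape[1])  # assumed zero initial estimate
    for k in range(psi.shape[0]):
        error = y[k] - psi[k] @ theta
        # Blend of the squared-error and fourth-power-error gradients.
        gain = error * (weight + (1.0 - weight) * error**2)
        theta = theta + mu * gain * psi[k]
    return theta


# Noise-free toy data generated from known parameters.
rng = np.random.default_rng(0)
psi = rng.uniform(-1, 1, size=(5000, 2))
true_theta = np.array([0.5, -0.3])
y = psi @ true_theta
theta = lmmn_sketch(psi, y, mu=0.1, weight=0.5)
```

With a nonzero weight, the squared-error term keeps the recursion moving when the error is small, which is where a pure fourth-power step slows down.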

# sysidentpy utils¶

Utilities for data validation

sysidentpy.utils._check_arrays.check_random_state(seed)[source]

Turn seed into a np.random.RandomState instance.

Parameters
• seed ({None, int, numpy.random.Generator, numpy.random.RandomState}, optional) – If seed is None (or np.random), the numpy.random.RandomState singleton is used. If seed is an int, a new RandomState instance is used, seeded with seed. If seed is already a Generator or RandomState instance, then that instance is used.

Returns

seed – Random number generator.

Return type

{numpy.random.Generator, numpy.random.RandomState}
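Examples

The dispatch described above can be sketched as follows. This is an illustrative reimplementation under the assumption that the function behaves exactly as documented; check_random_state_sketch is a hypothetical name, and np.random.mtrand._rand is the private NumPy attribute that holds the global RandomState singleton.

```python
import numbers

import numpy as np


def check_random_state_sketch(seed=None):
    """Return a random number generator for the given seed (illustrative)."""
    if seed is None or seed is np.random:
        # Global singleton shared by the legacy np.random.* functions.
        return np.random.mtrand._rand
    if isinstance(seed, (numbers.Integral, np.integer)):
        # A fresh, reproducible generator seeded with the integer.
        return np.random.RandomState(seed)
    if isinstance(seed, (np.random.RandomState, np.random.Generator)):
        # Already a generator: pass it through unchanged.
        return seed
    raise ValueError(f"{seed!r} cannot be used to seed a random generator")


first = check_random_state_sketch(42).rand(3)
second = check_random_state_sketch(42).rand(3)
```

Two generators created from the same integer seed produce the same sequence, so first and second hold identical values.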

sysidentpy.utils._check_arrays.check_infinity(X, y)[source]

Check that X and y have no NaN or Inf samples.

If there are any NaN or Inf samples, a ValueError is raised.

Parameters
• X (ndarray of floats) – The input data.

• y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_nan(X, y)[source]

Check that X and y have no NaN or Inf samples.

If there are any NaN or Inf samples, a ValueError is raised.

Parameters
• X (ndarray of floats) – The input data.

• y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_length(X, y)[source]

Check that X and y have the same number of samples.

If the lengths of X and y differ, a ValueError is raised.

Parameters
• X (ndarray of floats) – The input data.

• y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_dimension(X, y)[source]

Check if X and y have only real values.

If there are any string or object samples, a ValueError is raised.

Parameters
• X (ndarray of floats) – The input data.

• y (ndarray of floats) – The output data.

sysidentpy.utils._check_arrays.check_X_y(X, y)[source]

Validate input and output data using some crucial tests.

Parameters
• X (ndarray of floats) – The input data.

• y (ndarray of floats) – The output data.
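Examples

A combined validation along the lines of the checks documented in this module might look like the sketch below. It is an assumption about how such tests can be chained, not the library's code; validate_xy_sketch and its error messages are hypothetical.

```python
import numpy as np


def validate_xy_sketch(X, y):
    """Raise ValueError if X or y fails a basic sanity check (illustrative)."""
    X = np.atleast_1d(np.asarray(X))
    y = np.atleast_1d(np.asarray(y))
    for name, arr in (("X", X), ("y", y)):
        # Accept only boolean, integer, or floating-point data;
        # strings, objects, and complex values are rejected.
        if arr.dtype.kind not in "biuf":
            raise ValueError(f"{name} must contain only real numeric values")
        if np.isnan(arr).any():
            raise ValueError(f"{name} contains NaN samples")
        if np.isinf(arr).any():
            raise ValueError(f"{name} contains Inf samples")
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have the same number of samples")
    return X, y


X_ok, y_ok = validate_xy_sketch([[1.0], [2.0]], [[0.5], [0.7]])
```

Running the type check first means the NaN and Inf tests only ever see numeric arrays, which is why np.isnan and np.isinf can be called without further guards.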

# sysidentpy generate data¶

Utilities for data generation

sysidentpy.utils.generate_data.get_siso_data(n=5000, colored_noise=False, sigma=0.05, train_percentage=90)[source]

Generate simulated single-input single-output (SISO) data for identification and validation.

Parameters
• n (int) – The number of samples.

• colored_noise (bool) – Select white noise or colored noise (autoregressive noise).

• sigma (float) – The standard deviation of the random distribution to generate the noise.

• train_percentage (int) – The percentage of the data to be used as train data.

Returns

• x_train, x_valid (array-like) – The input data to be used in identification and validation, respectively.

• y_train, y_valid (array-like) – The output data to be used in identification and validation, respectively.
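Examples

To make the roles of n, sigma, and train_percentage concrete, the sketch below simulates a small nonlinear system with white noise and splits it into identification and validation sets. The system equation, the function name make_siso_sketch, and the seed argument are assumptions chosen for illustration; the system the library actually simulates may differ.

```python
import numpy as np


def make_siso_sketch(n=5000, sigma=0.05, train_percentage=90, seed=0):
    """Simulate an arbitrary SISO system and split it (illustrative)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, size=n)    # input signal
    e = rng.normal(0, sigma, size=n)  # white noise with standard deviation sigma
    y = np.zeros(n)
    for k in range(2, n):
        # Hypothetical NARX system, chosen only for this example.
        y[k] = 0.2 * y[k - 1] + 0.1 * y[k - 1] * x[k - 1] + 0.9 * x[k - 2] + e[k]

    n_train = n * train_percentage // 100  # samples kept for identification
    x = x.reshape(-1, 1)
    y = y.reshape(-1, 1)
    return x[:n_train], x[n_train:], y[:n_train], y[n_train:]


x_train, x_valid, y_train, y_valid = make_siso_sketch(n=1000, train_percentage=80)
```

The split is chronological rather than shuffled, so the validation samples follow the identification samples in time, which matters for models built on lagged regressors.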

sysidentpy.utils.generate_data.get_miso_data(n=5000, colored_noise=False, sigma=0.05, train_percentage=90)[source]

Generate simulated multiple-input single-output (MISO) data for identification and validation.

Parameters
• n (int) – The number of samples.

• colored_noise (bool) – Select white noise or colored noise (autoregressive noise).

• sigma (float) – The standard deviation of the random distribution to generate the noise.

• train_percentage (int) – The percentage of the data to be used as train data.

Returns

• x_train, x_valid (array-like) – The input data to be used in identification and validation, respectively.

• y_train, y_valid (array-like) – The output data to be used in identification and validation, respectively.

# sysidentpy display results¶

sysidentpy.utils.display_results.results(final_model=None, theta=None, err=None, n_terms=None, theta_precision=4, err_precision=8, dtype='dec')[source]

Write the model regressors, parameters and ERR values.

This function returns the model regressors, its respective parameter and ERR value on a string matrix.

Parameters
• theta_precision (int (default: 4)) – Precision of shown parameters values.

• err_precision (int (default: 8)) – Precision of shown ERR values.

• dtype (string (default: 'dec')) – Type of representation: sci - Scientific notation; dec - Decimal notation.

Returns

output_matrix

Where:

• The first column represents each regressor element.

• The second column represents the associated parameter.

• The third column represents the error reduction ratio associated with each regressor.

Return type

string
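Examples

The effect of the precision and dtype options can be illustrated with plain Python string formatting. This is a sketch of the two notations only, under the assumption that 'dec' means fixed-point and 'sci' means exponent notation; format_value_sketch is a hypothetical helper, not part of the library.

```python
def format_value_sketch(value, precision=4, dtype="dec"):
    """Format a number in decimal or scientific notation (illustrative)."""
    if dtype == "dec":
        return f"{value:.{precision}f}"  # fixed-point, e.g. for parameters
    if dtype == "sci":
        return f"{value:.{precision}E}"  # mantissa and exponent
    raise ValueError("dtype must be 'dec' or 'sci'")


decimal_text = format_value_sketch(0.123456, precision=4, dtype="dec")
scientific_text = format_value_sketch(0.000123456, precision=4, dtype="sci")
```

Scientific notation is the more readable choice for very small ERR values, which would otherwise show up as a run of leading zeros in decimal notation.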

This method saves the model “model” to the location “path” using the .syspy extension.

Parameters
• model (the model variable to be saved) –

• file_name (file name, along with .syspy extension) –

• path (location where the model will be saved (optional)) –

Returns

A file file_name.syspy located at “path”, containing the estimated model.

This method loads the model from the file “file_name.syspy” located at path “path”.

Parameters
• file_name (file name (str), along with .syspy extension of the file containing model to be loaded) –

• path (location where "file_name.syspy" is (optional)) –

Returns