Solvers

This module contains solver routines for fitting models to the timeseries of stations.

Solution Object

class disstans.solvers.Solution(models, parameters, covariances=None, weights=None, obs_indices=None, reg_indices=None)[source]

Class that contains the solution output by the solver functions in this module, and distributes the parameters, covariances, and weights (where present) into the respective models.

Parameters
  • models (ModelCollection) – Model collection object that describes which models were used by the solver.

  • parameters (ndarray) – Model collection parameters of shape \((\text{num_solved}, \text{num_components})\).

  • covariances (Optional[ndarray], default: None) – Full model variance-covariance matrix that has a square shape with dimensions \(\text{num_solved} * \text{num_components}\).

  • weights (Optional[ndarray], default: None) – Model parameter regularization weights of shape \((\text{num_solved}, \text{num_components})\).

  • obs_indices (Optional[ndarray], default: None) – Observation mask of shape \((\text{num_parameters}, )\) with True at indices where the parameter was actually estimated, and False where the estimation was skipped due to observability or other reasons. None defaults to all True, which implies that \(\text{num_parameters} = \text{num_solved}\).

  • reg_indices (Optional[ndarray], default: None) – Regularization mask of shape \((\text{num_solved}, )\) with True where a parameter was subject to regularization, and False otherwise. None defaults to all False.

static aggregate_models(results_dict, mdl_description, key_list=None, stack_parameters=False, stack_covariances=False, stack_weights=False, zeroed=False)[source]

For a dictionary of Solution objects (e.g. one per station) and a given model description, aggregate the model parameters, variances and parameter regularization weights (where present) into combined NumPy arrays.

If the shape of the individual parameters etc. do not match between objects in the dictionary for the same model, None is returned instead, without raising an error (e.g. if the same model is site-specific and has different numbers of parameters).

Parameters
  • results_dict (dict[str, Solution]) – Dictionary of Solution objects.

  • mdl_description (str) – Name of the model to aggregate the parameters, variances and weights for.

  • key_list (Optional[list[str]], default: None) – If provided, aggregate only the selected keys in the dictionary. None defaults to all keys.

  • stack_parameters (bool, default: False) – If True, stack the parameters, otherwise just return None.

  • stack_covariances (bool, default: False) – If True, stack the covariances, otherwise just return None.

  • stack_weights (bool, default: False) – If True, stack the weights, otherwise just return None.

  • zeroed (bool, default: False) – If False, use parameters and covariances, else parameters_zeroed and covariances_zeroed.

Return type

tuple[Optional[ndarray], Optional[ndarray], Optional[ndarray]]

Returns

  • stacked_parameters – If stack_parameters=True, the stacked model parameters.

  • stacked_covariances – If stack_covariances=True and covariances are present in the models, the stacked component covariances, None if not present everywhere.

  • stacked_weights – If stack_weights=True and regularization weights are present in the models, the stacked weights, None if not present everywhere.

converged

True, if all values in the estimated parameters were finite (i.e., the solution converged), False otherwise.

covariances

Full covariance matrix, where unsolved and unobservable covariances are set to NaN.

covariances_by_model(models, zeroed=False)[source]

Helper function that uses get_model_indices() to quickly return the covariances for (a) specific model(s).

Parameters
Return type

ndarray

Returns

Covariances of the model subset.

property covariances_zeroed: ndarray

Returns the model covariances but sets the unobservable ones to zero to distinguish between unobservable ones and observable ones that could not be estimated because of a solver failure (and which are NaNs).

get_model_indices(models, for_cov=False)[source]

Given a model name or multiple names, returns an array of integer indices that can be used to extract the relevant entries from parameters and covariances.

Parameters
Return type

ndarray

Returns

Integer index array for the models.

property model_list: list[str]

List of models present in the solution.

num_components

Number of components the solution was computed for.

num_parameters

Total number of parameters (solved or unsolved) in solution.

obs_mask

Observability mask where True indicates observable parameters.

parameters

Full parameter matrix, where unsolved and unobservable parameters are set to NaN.

parameters_by_model(models, zeroed=False)[source]

Helper function that uses get_model_indices() to quickly return the parameters for (a) specific model(s).

Parameters
Return type

ndarray

Returns

Parameters of the model subset.

property parameters_zeroed: ndarray

Returns the model parameters but sets the unobservable ones to zero to distinguish between unobservable ones and observable ones that could not be estimated because of a solver failure (and which are NaNs).

weights

Full weights matrix, where unsolved weights are set to NaN.

weights_by_model(models)[source]

Helper function that uses get_model_indices() to quickly return the weights for specific model parameters.

Parameters

models (str | list[str]) – Name(s) of model(s).

Return type

ndarray

Returns

Weights of the model parameter subset.

Solver Functions

Either used directly, or through fit().

lasso_regression

disstans.solvers.lasso_regression(ts, models, penalty, reweight_max_iters=None, reweight_func=None, reweight_max_rss=1e-09, reweight_init=None, reweight_coupled=True, formal_covariance=False, use_data_variance=True, use_data_covariance=True, use_internal_scales=True, cov_zero_threshold=1e-06, return_weights=False, check_constraints=True, cvxpy_kw_args={'solver': 'SCS'})[source]

Performs linear, L1-regularized least squares using CVXPY.

The timeseries are the observations \(\mathbf{d}\), and the models’ mapping matrices are stacked together to form a single, sparse mapping matrix \(\mathbf{G}\). Given the penalty hyperparameter \(\lambda\), the solver then computes the model parameters \(\mathbf{m}\) that minimize the cost function

\[f(\mathbf{m}) = \left\| \mathbf{Gm} - \mathbf{d} \right\|_2^2 + \lambda \left\| \mathbf{m}_\text{reg} \right\|_1\]

where \(\mathbf{\epsilon} = \mathbf{Gm} - \mathbf{d}\) is the residual and the subscript \(_\text{reg}\) masks to zero the model parameters not designated to be regularized (see regularize).

If the observations \(\mathbf{d}\) include a covariance matrix \(\mathbf{C}_d\) (incorporating var_cols and possibly also cov_cols), this data will be used. In this case, \(\mathbf{G}\) and \(\mathbf{d}\) are replaced by their weighted versions

\[\mathbf{G} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G}\]

and

\[\mathbf{d} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{d}\]

If reweight_max_iters is specified, sparsity of the solution parameters is promoted by iteratively reweighting the penalty parameter for each regularized parameter based on its current value, approximating the L0 norm rather than the L1 norm (see Notes).

The formal model covariance \(\mathbf{C}_m\) is defined as being zero except in the rows and columns corresponding to non-zero parameters (as defined my the absolute magnitude set by cov_zero_threshold), where it is defined exactly as the unregularized version (see linear_regression()), restricted to those same rows and columns. (This definition might very well be mathematically or algorithmically wrong - there probably needs to be some dependence on the reweighting function.)

Parameters
  • ts (Timeseries) – Timeseries to fit.

  • models (ModelCollection) – Model collection used for fitting.

  • penalty (float | list[float] | ndarray) – Penalty hyperparameter \(\lambda\). It can either be a single value used for all components, or a list or NumPy array specifying a penalty for each component in the data.

  • reweight_max_iters (Optional[int], default: None) – If an integer, number of solver iterations (see Notes), resulting in reweighting. None defaults to no reweighting.

  • reweight_func (Optional[ReweightingFunction], default: None) – If reweighting is active, the reweighting function instance to be used.

  • reweight_max_rss (float, default: 1e-09) – When reweighting is active and the maximum number of iterations has not yet been reached, let the iteration stop early if the solutions do not change much anymore (see Notes). Set to 0 to deactivate early stopping.

  • reweight_init (list[ndarray] | dict[str, ndarray] | ndarray, default: None) – When reweighting is active, use this array to initialize the weights. It has to have size \(\text{num_components} * \text{num_reg}\), where \(\text{num_components}=1\) if covariances are not used (and the actual number of timeseries components otherwise) and \(\text{num_reg}\) is the number of regularized model parameters. It can be a single NumPy array or a list of NumPy arrays, in which case it (or the array created by concatenating the list) need to already have the right output shape (no check is performed). If it is a dictionary, the keys need to be model names, and the values are then the NumPy arrays which will be arranged properly to match the mapping matrix.

  • reweight_coupled (bool, default: True) – If True and reweighting is active, the L1 penalty hyperparameter is coupled with the reweighting weights (see Notes).

  • formal_covariance (bool, default: False) – If True, calculate the formal model covariance.

  • use_data_variance (bool, default: True) – If True and ts contains variance information, this uncertainty information will be used.

  • use_data_covariance (bool, default: True) – If True, ts contains variance and covariance information, and use_data_variance is also True, this uncertainty information will be used.

  • use_internal_scales (bool, default: True) – If True, the reweighting takes into account potential model-specific internal scaling parameters, otherwise ignores them.

  • cov_zero_threshold (float, default: 1e-06) – When extracting the formal covariance matrix, assume parameters with absolute values smaller than cov_zero_threshold are effectively zero. Internal scales are always respected.

  • return_weights (bool, default: False) – When reweighting is active, set to True to return the weights after the last update.

  • check_constraints (bool, default: True) – If True, check whether models have sign constraints that should be enforced.

  • cvxpy_kw_args (dict, default: {'solver': 'SCS'}) – Additional keyword arguments passed on to CVXPY’s solve() function.

Return type

Solution

Returns

Result of the regression.

Notes

The L0-regularization approximation used by setting reweight_max_iters >= 0 is based on [candes08]. The idea here is to iteratively reduce the cost (before multiplication with \(\lambda\)) of regularized, but significant parameters to 1, and iteratively increasing the cost of a regularized, but small parameter to a much larger value.

This is achieved by introducing an additional parameter vector \(\mathbf{w}\) of the same shape as the regularized parameters, inserting it into the L1 cost, and iterating between solving the L1-regularized problem, and using a reweighting function on those weights:

  1. Initialize \(\mathbf{w}^{(0)} = \mathbf{1}\) (or use the array from reweight_init).

  2. Solve the modified weighted L1-regularized problem minimizing \(f(\mathbf{m}^{(i)}) = \left\| \mathbf{Gm}^{(i)} - \mathbf{d} \right\|_2^2 + \lambda \left\| \mathbf{w}^{(i)} \circ \mathbf{m}^{(i)}_\text{reg} \right\|_1\) where \(\circ\) is the element-wise multiplication and \(i\) is the iteration step.

  3. Update the weights element-wise using a predefined reweighting function \(\mathbf{w}^{(i+1)} = w(\mathbf{m}^{(i)}_\text{reg})\).

  4. Repeat from step 2 until reweight_max_iters iterations are reached or the root sum of squares of the difference between the last and current solution is less than reweight_max_rss.

The reweighting function is set through the argument reweight_func, see ReweightingFunction and its derived classes.

If reweighting is active and reweight_coupled=True, \(\lambda\) is moved into the norm and combined with \(\mathbf{w}\), such that the reweighting applies to the product of both. Furthermore, if reweight_init is also not None, then the penalty is ignored since it should already be contained in the passed weights array. (If reweight_coupled=False, \(\lambda\) is always applied separately, regardless of whether initial weights are passed or not.)

Note that the orders of magnitude between the penalties computed by the different reweighting functions for the same input parameters can differ significantly, even with the same penalty.

References

candes08(1,2,3)

Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted \(\ell_1\) Minimization. Journal of Fourier Analysis and Applications, 14(5), 877–905. doi:10.1007/s00041-008-9045-x.

linear_regression

disstans.solvers.linear_regression(ts, models, formal_covariance=False, use_data_variance=True, use_data_covariance=True, check_constraints=True)[source]

Performs linear, unregularized least squares using linalg.

The timeseries are the observations \(\mathbf{d}\), and the models’ mapping matrices are stacked together to form a single mapping matrix \(\mathbf{G}\). The solver then computes the model parameters \(\mathbf{m}\) that minimize the cost function

\[f(\mathbf{m}) = \left\| \mathbf{Gm} - \mathbf{d} \right\|_2^2\]

where \(\mathbf{\epsilon} = \mathbf{Gm} - \mathbf{d}\) is the residual.

If the observations \(\mathbf{d}\) include a covariance matrix \(\mathbf{C}_d\), this information can be used. In this case, \(\mathbf{G}\) and \(\mathbf{d}\) are replaced by their weighted versions

\[\mathbf{G} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G}\]

and

\[\mathbf{d} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{d}\]

The formal model covariance is defined as the pseudo-inverse

\[\mathbf{C}_m = \left( \mathbf{G}^T \mathbf{C}_d \mathbf{G} \right)^g\]
Parameters
  • ts (Timeseries) – Timeseries to fit.

  • models (ModelCollection) – Model collection used for fitting.

  • formal_covariance (bool, default: False) – If True, calculate the formal model covariance.

  • use_data_variance (bool, default: True) – If True and ts contains variance information, this uncertainty information will be used.

  • use_data_covariance (bool, default: True) – If True, ts contains variance and covariance information, and use_data_variance is also True, this uncertainty information will be used.

  • check_constraints (bool, default: True) – If True, check whether models have sign constraints that should be enforced.

Return type

Solution

Returns

Result of the regression.

ridge_regression

disstans.solvers.ridge_regression(ts, models, penalty, formal_covariance=False, use_data_variance=True, use_data_covariance=True, check_constraints=True)[source]

Performs linear, L2-regularized least squares using linalg.

The timeseries are the observations \(\mathbf{d}\), and the models’ mapping matrices are stacked together to form a single mapping matrix \(\mathbf{G}\). Given the penalty hyperparameter \(\lambda\), the solver then computes the model parameters \(\mathbf{m}\) that minimize the cost function

\[f(\mathbf{m}) = \left\| \mathbf{Gm} - \mathbf{d} \right\|_2^2 + \lambda \left\| \mathbf{m}_\text{reg} \right\|_2^2\]

where \(\mathbf{\epsilon} = \mathbf{Gm} - \mathbf{d}\) is the residual and the subscript \(_\text{reg}\) masks to zero the model parameters not designated to be regularized (see regularize).

If the observations \(\mathbf{d}\) include a covariance matrix \(\mathbf{C}_d\), this information can be used. In this case, \(\mathbf{G}\) and \(\mathbf{d}\) are replaced by their weighted versions

\[\mathbf{G} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G}\]

and

\[\mathbf{d} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{d}\]

The formal model covariance is defined as the pseudo-inverse

\[\mathbf{C}_m = \left( \mathbf{G}^T \mathbf{C}_d \mathbf{G} + \lambda \mathbf{I}_\text{reg} \right)^g\]

where the subscript \(_\text{reg}\) masks to zero the entries corresponding to non-regularized model parameters.

Parameters
  • ts (Timeseries) – Timeseries to fit.

  • models (ModelCollection) – Model collection used for fitting.

  • penalty (float | list[float] | ndarray) – Penalty hyperparameter \(\lambda\). It can either be a single value used for all components, or a list or NumPy array specifying a penalty for each component in the data.

  • formal_covariance (bool, default: False) – If True, calculate the formal model covariance.

  • use_data_variance (bool, default: True) – If True and ts contains variance information, this uncertainty information will be used.

  • use_data_covariance (bool, default: True) – If True, ts contains variance and covariance information, and use_data_variance is also True, this uncertainty information will be used.

  • check_constraints (bool, default: True) – If True, check whether models have sign constraints that should be enforced.

Return type

Solution

Returns

Result of the regression.

Reweighting Functions

For use with lasso_regression() and spatialfit().

ReweightingFunction

class disstans.solvers.ReweightingFunction(eps, scale=1)[source]

Base class for reweighting functions for spatialfit(), that convert a model parameter magnitude into a penalty value. For large magnitudes, the penalties approach zero, and for small magnitudes, the penalties approach “infinity”, i.e. very large values.

Usually, the eps value determines where the transition between significant and insignificant (i.e., essentially zero) parameters, and the scale modifies the maximum penalty for insignificant parameters (with the exact maximum penalty dependent on the chose reweighting function).

At initialization, the eps parameter is set. Inheriting child classes need to define the __call__() method that determines the actual reweighting mechanism. Instantiated reweighting functions objects can be used as functions but still provide access to the eps parameter.

Parameters
  • eps (float) – Stability parameter to use for the reweighting function.

  • scale (float, default: 1) – Scale parameter applied to reweighting result.

abstract __call__(value)[source]

Call self as a function.

Return type

ndarray

eps

Stability parameter to use for the reweighting function.

static from_name(name, *args, **kw_args)[source]

Search the local namespace for a reweighting function of that name and return an initialized instance of it.

Parameters
  • name (str) – Name (or abbreviation) of the reweighting function.

  • *args – Argument list passed on to function initialization.

  • **kw_args – Keyword arguments passed on to function initialization.

Return type

ReweightingFunction

Returns

An instantiated reweighting function object.

scale

Scaling factor applied to reweighting result.

InverseReweighting

class disstans.solvers.InverseReweighting(eps, scale=1)[source]
__call__(m)[source]

Reweighting function based on the inverse of the input based on [candes08]:

\[w(m_j) = \frac{\text{scale}}{|m_j| + \text{eps}}\]

The maximum penalty (\(y\)-intercept) can be approximated as \(\frac{\text{scale}}{\text{eps}}\), and the minimum penalty approaches zero asymptotically.

Parameters

m (ndarray) – \(\mathbf{m}\)

Return type

ndarray

Returns

Weights

InverseSquaredReweighting

class disstans.solvers.InverseSquaredReweighting(eps, scale=1)[source]
__call__(m)[source]

Reweighting function based on the inverse squared of the input based on [candes08]:

\[w(m_j) = \frac{\text{scale}}{m_j^2 + \text{eps}^2}\]

The maximum penalty (\(y\)-intercept) can be approximated as \(\frac{\text{scale}}{\text{eps}^2}\), and the minimum penalty approaches zero asymptotically.

Parameters

m (ndarray) – \(\mathbf{m}\)

Return type

ndarray

Returns

Weights

LogarithmicReweighting

class disstans.solvers.LogarithmicReweighting(eps, scale=1)[source]
__call__(m)[source]

Reweighting function based on the logarithm of the input based on [andrecut11]:

\[w(m_j) = \text{scale} \cdot \log_\text{num_reg} \frac{ \| \mathbf{m} \|_1 + \text{num_reg} \cdot \text{eps}}{|m_j| + \text{eps}}\]

(where \(0 < \text{eps} \ll \frac{1}{\text{num_reg}}\)). This reweighting function’s calculated penalties for individual elements depends on the overall size and 1-norm of the input weight vector.

The maximum penalty (\(y\)-intercept) can be approximated as \(\text{scale} \cdot \log_\text{num_reg} \frac{\| \mathbf{m} \|_1}{\text{eps}}\). If there is only a single nonzero value, its penalty will be zero (at \(|m_j|=\| \mathbf{m} \|_1\)). In the intermediate cases where multiple values are nonzero, their penalties will be distributed on a logarithmic slope between the \(y\)-intercept and zero.

Parameters

m (ndarray) – \(\mathbf{m}\)

Return type

ndarray

Returns

Weights

References

andrecut11

Andrecut, M. (2011). Stochastic Recovery Of Sparse Signals From Random Measurements. Engineering Letters, 19(1), 1-6.