Solvers
This module contains solver routines for fitting models to the timeseries of stations.
Solution Object
- class disstans.solvers.Solution(models, parameters, covariances=None, weights=None, obs_indices=None, reg_indices=None)[source]
Class that contains the solution output by the solver functions in this module, and distributes the parameters, covariances, and weights (where present) into the respective models.
- Parameters
models (ModelCollection) – Model collection object that describes which models were used by the solver.
parameters (ndarray) – Model collection parameters of shape \((\text{num_solved}, \text{num_components})\).
covariances (Optional[ndarray], default: None) – Full model variance-covariance matrix that has a square shape with dimensions \(\text{num_solved} * \text{num_components}\).
weights (Optional[ndarray], default: None) – Model parameter regularization weights of shape \((\text{num_solved}, \text{num_components})\).
obs_indices (Optional[ndarray], default: None) – Observation mask of shape \((\text{num_parameters}, )\) with True at indices where the parameter was actually estimated, and False where the estimation was skipped due to observability or other reasons. None defaults to all True, which implies that \(\text{num_parameters} = \text{num_solved}\).
reg_indices (Optional[ndarray], default: None) – Regularization mask of shape \((\text{num_solved}, )\) with True where a parameter was subject to regularization, and False otherwise. None defaults to all False.
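As a standalone sketch (plain NumPy, not the disstans implementation), the obs_indices mask describes how the \(\text{num_solved}\) estimated rows are scattered back into the full \(\text{num_parameters}\)-row parameter matrix, with skipped rows left as NaN; all sizes below are hypothetical:

```python
import numpy as np

# Hypothetical sizes: 5 total parameters, 2 components, but only
# 3 parameters were observable and therefore actually solved for.
num_parameters, num_components = 5, 2
solved = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])  # shape (num_solved, num_components)
obs_indices = np.array([True, False, True, True, False])

# Scatter the solved parameters into the full matrix;
# unsolved rows remain NaN.
full = np.full((num_parameters, num_components), np.nan)
full[obs_indices, :] = solved
print(full)
```

This mirrors the statement above that with obs_indices left as None (all True), the full and solved parameter matrices coincide.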
- static aggregate_models(results_dict, mdl_description, key_list=None, stack_parameters=False, stack_covariances=False, stack_weights=False, zeroed=False)[source]
For a dictionary of Solution objects (e.g., one per station) and a given model description, aggregate the model parameters, variances, and parameter regularization weights (where present) into combined NumPy arrays.
If the shapes of the individual parameters etc. do not match between objects in the dictionary for the same model, None is returned instead, without raising an error (e.g., if the same model is site-specific and has different numbers of parameters).
- Parameters
results_dict (dict[str, Solution]) – Dictionary of Solution objects.
mdl_description (str) – Name of the model to aggregate the parameters, variances, and weights for.
key_list (Optional[list[str]], default: None) – If provided, aggregate only the selected keys in the dictionary. None defaults to all keys.
stack_parameters (bool, default: False) – If True, stack the parameters, otherwise just return None.
stack_covariances (bool, default: False) – If True, stack the covariances, otherwise just return None.
stack_weights (bool, default: False) – If True, stack the weights, otherwise just return None.
zeroed (bool, default: False) – If False, use parameters and covariances, else parameters_zeroed and covariances_zeroed.
- Return type
tuple[Optional[ndarray], Optional[ndarray], Optional[ndarray]]
- Returns
stacked_parameters – If stack_parameters=True, the stacked model parameters.
stacked_covariances – If stack_covariances=True and covariances are present in the models, the stacked component covariances, None if not present everywhere.
stacked_weights – If stack_weights=True and regularization weights are present in the models, the stacked weights, None if not present everywhere.
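The shape-checking behavior can be illustrated with a standalone NumPy sketch (hypothetical per-station arrays and station names, not the actual disstans implementation): the per-station arrays for one model are stacked only if their shapes agree, and None is returned instead of raising when they differ:

```python
import numpy as np

def stack_if_consistent(per_station):
    """Stack per-station parameter arrays for one model, or return None
    if the shapes differ between stations (e.g., site-specific models)."""
    arrays = list(per_station.values())
    if len({a.shape for a in arrays}) != 1:
        return None
    return np.stack(arrays)

# Two stations, same model with 4 parameters and 3 components each:
params = {"ALBH": np.ones((4, 3)), "DRAO": np.zeros((4, 3))}
print(stack_if_consistent(params).shape)  # (2, 4, 3)

# Mismatched shapes (site-specific model) yield None, no error:
print(stack_if_consistent({"ALBH": np.ones((4, 3)),
                           "DRAO": np.zeros((6, 3))}))  # None
```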
- converged
True if all values in the estimated parameters were finite (i.e., the solution converged), False otherwise.
- covariances
Full covariance matrix, where unsolved and unobservable covariances are set to NaN.
- covariances_by_model(models, zeroed=False)[source]
Helper function that uses get_model_indices() to quickly return the covariances for (a) specific model(s).
- Parameters
zeroed (bool, default: False) – If False, use covariances, else covariances_zeroed.
- Return type
- Returns
Covariances of the model subset.
- property covariances_zeroed: ndarray
Returns the model covariances but sets the unobservable ones to zero to distinguish between unobservable ones and observable ones that could not be estimated because of a solver failure (and which are NaNs).
- get_model_indices(models, for_cov=False)[source]
Given a model name or multiple names, returns an array of integer indices that can be used to extract the relevant entries from parameters and covariances.
- Parameters
for_cov (bool, default: False) – If False, return the indices for parameters, otherwise for covariances.
- Return type
- Returns
Integer index array for the models.
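A minimal sketch of the indexing idea (hypothetical model names and parameter counts, not the disstans internals): with models stacked in a fixed order, the rows belonging to one model form a contiguous integer range determined by cumulative offsets:

```python
import numpy as np

# Hypothetical parameter counts per model, in stacking order.
model_sizes = {"Polynomial": 2, "Annual": 2, "Transient": 5}

def model_indices(name):
    """Integer indices of the rows in the stacked parameter matrix
    that belong to the named model."""
    offset = 0
    for mdl, size in model_sizes.items():
        if mdl == name:
            return np.arange(offset, offset + size)
        offset += size
    raise KeyError(name)

print(model_indices("Annual"))     # [2 3]
print(model_indices("Transient"))  # [4 5 6 7 8]
```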
- num_components
Number of components the solution was computed for.
- num_parameters
Total number of parameters (solved or unsolved) in the solution.
- obs_mask
Observability mask where True indicates observable parameters.
- parameters
Full parameter matrix, where unsolved and unobservable parameters are set to NaN.
- parameters_by_model(models, zeroed=False)[source]
Helper function that uses get_model_indices() to quickly return the parameters for (a) specific model(s).
- Parameters
zeroed (bool, default: False) – If False, use parameters, else parameters_zeroed.
- Return type
- Returns
Parameters of the model subset.
- property parameters_zeroed: ndarray
Returns the model parameters but sets the unobservable ones to zero to distinguish between unobservable ones and observable ones that could not be estimated because of a solver failure (and which are NaNs).
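The distinction between NaN (a solver failure on an observable parameter) and zero (an unobservable parameter) can be sketched in plain NumPy, assuming a hypothetical full parameter matrix and observability mask:

```python
import numpy as np

# Full parameter matrix: NaNs mark entries without an estimate.
parameters = np.array([[1.0], [np.nan], [np.nan]])
# Observability: parameter 1 was unobservable; parameter 2 was
# observable, but the solver failed to estimate it.
obs_mask = np.array([True, False, True])

# "Zeroed" view: unobservable entries become 0, solver failures stay NaN.
parameters_zeroed = parameters.copy()
parameters_zeroed[~obs_mask, :] = 0.0
print(parameters_zeroed.ravel())  # [ 1.  0. nan]
```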
- weights
Full weights matrix, where unsolved weights are set to NaN.
Solver Functions
Either used directly, or through fit().
lasso_regression
- disstans.solvers.lasso_regression(ts, models, penalty, reweight_max_iters=None, reweight_func=None, reweight_max_rss=1e-09, reweight_init=None, reweight_coupled=True, formal_covariance=False, use_data_variance=True, use_data_covariance=True, use_internal_scales=True, cov_zero_threshold=1e-06, return_weights=False, check_constraints=True, cvxpy_kw_args={'solver': 'SCS'})[source]
Performs linear, L1-regularized least squares using CVXPY.
The timeseries are the observations \(\mathbf{d}\), and the models’ mapping matrices are stacked together to form a single, sparse mapping matrix \(\mathbf{G}\). Given the penalty hyperparameter \(\lambda\), the solver then computes the model parameters \(\mathbf{m}\) that minimize the cost function
\[f(\mathbf{m}) = \left\| \mathbf{Gm} - \mathbf{d} \right\|_2^2 + \lambda \left\| \mathbf{m}_\text{reg} \right\|_1\]
where \(\mathbf{\epsilon} = \mathbf{Gm} - \mathbf{d}\) is the residual and the subscript \(_\text{reg}\) masks to zero the model parameters not designated to be regularized (see regularize).
If the observations \(\mathbf{d}\) include a covariance matrix \(\mathbf{C}_d\) (incorporating var_cols and possibly also cov_cols), this data will be used. In this case, \(\mathbf{G}\) and \(\mathbf{d}\) are replaced by their weighted versions
\[\mathbf{G} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G}\]
and
\[\mathbf{d} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{d}\]
If reweight_max_iters is specified, sparsity of the solution parameters is promoted by iteratively reweighting the penalty parameter for each regularized parameter based on its current value, approximating the L0 norm rather than the L1 norm (see Notes).
The formal model covariance \(\mathbf{C}_m\) is defined as being zero except in the rows and columns corresponding to non-zero parameters (as defined by the absolute magnitude set by cov_zero_threshold), where it is defined exactly as the unregularized version (see linear_regression()), restricted to those same rows and columns. (This definition might very well be mathematically or algorithmically wrong - there probably needs to be some dependence on the reweighting function.)
- Parameters
ts (Timeseries) – Timeseries to fit.
models (ModelCollection) – Model collection used for fitting.
penalty (float | list[float] | ndarray) – Penalty hyperparameter \(\lambda\). It can either be a single value used for all components, or a list or NumPy array specifying a penalty for each component in the data.
reweight_max_iters (Optional[int], default: None) – If an integer, number of solver iterations (see Notes), resulting in reweighting. None defaults to no reweighting.
reweight_func (Optional[ReweightingFunction], default: None) – If reweighting is active, the reweighting function instance to be used.
reweight_max_rss (float, default: 1e-09) – When reweighting is active and the maximum number of iterations has not yet been reached, let the iteration stop early if the solutions do not change much anymore (see Notes). Set to 0 to deactivate early stopping.
reweight_init (list[ndarray] | dict[str, ndarray] | ndarray, default: None) – When reweighting is active, use this array to initialize the weights. It has to have size \(\text{num_components} * \text{num_reg}\), where \(\text{num_components}=1\) if covariances are not used (and the actual number of timeseries components otherwise) and \(\text{num_reg}\) is the number of regularized model parameters. It can be a single NumPy array or a list of NumPy arrays, in which case it (or the array created by concatenating the list) needs to already have the right output shape (no check is performed). If it is a dictionary, the keys need to be model names, and the values are then the NumPy arrays which will be arranged properly to match the mapping matrix.
reweight_coupled (bool, default: True) – If True and reweighting is active, the L1 penalty hyperparameter is coupled with the reweighting weights (see Notes).
formal_covariance (bool, default: False) – If True, calculate the formal model covariance.
use_data_variance (bool, default: True) – If True and ts contains variance information, this uncertainty information will be used.
use_data_covariance (bool, default: True) – If True, ts contains variance and covariance information, and use_data_variance is also True, this uncertainty information will be used.
use_internal_scales (bool, default: True) – If True, the reweighting takes into account potential model-specific internal scaling parameters, otherwise ignores them.
cov_zero_threshold (float, default: 1e-06) – When extracting the formal covariance matrix, assume parameters with absolute values smaller than cov_zero_threshold are effectively zero. Internal scales are always respected.
return_weights (bool, default: False) – When reweighting is active, set to True to return the weights after the last update.
check_constraints (bool, default: True) – If True, check whether models have sign constraints that should be enforced.
cvxpy_kw_args (dict, default: {'solver': 'SCS'}) – Additional keyword arguments passed on to CVXPY's solve() function.
- Return type
- Returns
Result of the regression.
Notes
The L0-regularization approximation used by setting reweight_max_iters >= 0 is based on [candes08]. The idea here is to iteratively reduce the cost (before multiplication with \(\lambda\)) of regularized but significant parameters to 1, and to iteratively increase the cost of regularized but small parameters to a much larger value.
This is achieved by introducing an additional parameter vector \(\mathbf{w}\) of the same shape as the regularized parameters, inserting it into the L1 cost, and iterating between solving the L1-regularized problem and using a reweighting function on those weights:
1. Initialize \(\mathbf{w}^{(0)} = \mathbf{1}\) (or use the array from reweight_init).
2. Solve the modified weighted L1-regularized problem minimizing \(f(\mathbf{m}^{(i)}) = \left\| \mathbf{Gm}^{(i)} - \mathbf{d} \right\|_2^2 + \lambda \left\| \mathbf{w}^{(i)} \circ \mathbf{m}^{(i)}_\text{reg} \right\|_1\) where \(\circ\) is the element-wise multiplication and \(i\) is the iteration step.
3. Update the weights element-wise using a predefined reweighting function: \(\mathbf{w}^{(i+1)} = w(\mathbf{m}^{(i)}_\text{reg})\).
4. Repeat from step 2 until reweight_max_iters iterations are reached or the root sum of squares of the difference between the last and current solution is less than reweight_max_rss.
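The iteration above can be demonstrated end-to-end in a toy setting (a standalone NumPy sketch, not the CVXPY-based disstans solver): with \(\mathbf{G} = \mathbf{I}\) and all parameters regularized, the weighted L1 problem has the closed-form soft-thresholding solution \(m_j = \mathrm{sign}(d_j) \max(|d_j| - \lambda w_j / 2, 0)\), and the inverse reweighting of [candes08] serves as the update:

```python
import numpy as np

d = np.array([5.0, 0.05, -3.0, 0.02])  # observations (G = identity)
lam, eps = 0.1, 1e-4
w = np.ones_like(d)                    # step 1: w^(0) = 1

m_old = np.zeros_like(d)
for _ in range(10):                    # reweight_max_iters
    # step 2: closed-form weighted lasso solution for G = identity
    m = np.sign(d) * np.maximum(np.abs(d) - lam * w / 2.0, 0.0)
    # step 3: inverse reweighting, w_j = 1 / (|m_j| + eps)
    w = 1.0 / (np.abs(m) + eps)
    # step 4: early stopping on the root sum of squares of the change
    if np.sqrt(np.sum((m - m_old) ** 2)) < 1e-9:
        break
    m_old = m

print(m)  # small entries are driven exactly to zero, large ones survive
```

After the first pass, small parameters receive huge weights and are thresholded to exactly zero, while significant parameters end up nearly unpenalized, which is the L0-norm-like behavior described above.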
The reweighting function is set through the argument reweight_func; see ReweightingFunction and its derived classes.
If reweighting is active and reweight_coupled=True, \(\lambda\) is moved into the norm and combined with \(\mathbf{w}\), such that the reweighting applies to the product of both. Furthermore, if reweight_init is also not None, then the penalty is ignored since it should already be contained in the passed weights array. (If reweight_coupled=False, \(\lambda\) is always applied separately, regardless of whether initial weights are passed or not.)
Note that the orders of magnitude between the penalties computed by the different reweighting functions for the same input parameters can differ significantly, even with the same penalty.
References
- candes08
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted \(\ell_1\) Minimization. Journal of Fourier Analysis and Applications, 14(5), 877–905. doi:10.1007/s00041-008-9045-x.
linear_regression
- disstans.solvers.linear_regression(ts, models, formal_covariance=False, use_data_variance=True, use_data_covariance=True, check_constraints=True)[source]
Performs linear, unregularized least squares using linalg.
The timeseries are the observations \(\mathbf{d}\), and the models' mapping matrices are stacked together to form a single mapping matrix \(\mathbf{G}\). The solver then computes the model parameters \(\mathbf{m}\) that minimize the cost function
\[f(\mathbf{m}) = \left\| \mathbf{Gm} - \mathbf{d} \right\|_2^2\]
where \(\mathbf{\epsilon} = \mathbf{Gm} - \mathbf{d}\) is the residual.
If the observations \(\mathbf{d}\) include a covariance matrix \(\mathbf{C}_d\), this information can be used. In this case, \(\mathbf{G}\) and \(\mathbf{d}\) are replaced by their weighted versions
\[\mathbf{G} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G}\]
and
\[\mathbf{d} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{d}\]
The formal model covariance is defined as the pseudo-inverse
\[\mathbf{C}_m = \left( \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G} \right)^g\]
\[\mathbf{C}_m = \left( \mathbf{G}^T \mathbf{C}_d \mathbf{G} \right)^g\]- Parameters
ts (Timeseries) – Timeseries to fit.
models (ModelCollection) – Model collection used for fitting.
formal_covariance (bool, default: False) – If True, calculate the formal model covariance.
use_data_variance (bool, default: True) – If True and ts contains variance information, this uncertainty information will be used.
use_data_covariance (bool, default: True) – If True, ts contains variance and covariance information, and use_data_variance is also True, this uncertainty information will be used.
check_constraints (bool, default: True) – If True, check whether models have sign constraints that should be enforced.
- Return type
- Returns
Result of the regression.
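The weighted normal equations and the formal covariance above can be sketched directly in NumPy (a standalone illustration with synthetic data, not the disstans implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
G = np.column_stack([np.ones_like(t), t])   # constant offset + linear rate
m_true = np.array([1.0, 2.0])
var = 0.01 * np.ones_like(t)                # data variances
d = G @ m_true + rng.normal(0.0, np.sqrt(var))

Cd_inv = np.diag(1.0 / var)                 # inverse data covariance
GtW = G.T @ Cd_inv
m_hat = np.linalg.solve(GtW @ G, GtW @ d)   # minimize the weighted L2 cost
Cm = np.linalg.pinv(GtW @ G)                # formal covariance (pseudo-inverse)

print(np.round(m_hat, 2))  # close to m_true
```

With uniform variances the weighting reduces to ordinary least squares; the pseudo-inverse keeps the formal covariance defined even when \(\mathbf{G}\) is rank-deficient.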
ridge_regression
- disstans.solvers.ridge_regression(ts, models, penalty, formal_covariance=False, use_data_variance=True, use_data_covariance=True, check_constraints=True)[source]
Performs linear, L2-regularized least squares using linalg.
The timeseries are the observations \(\mathbf{d}\), and the models' mapping matrices are stacked together to form a single mapping matrix \(\mathbf{G}\). Given the penalty hyperparameter \(\lambda\), the solver then computes the model parameters \(\mathbf{m}\) that minimize the cost function
\[f(\mathbf{m}) = \left\| \mathbf{Gm} - \mathbf{d} \right\|_2^2 + \lambda \left\| \mathbf{m}_\text{reg} \right\|_2^2\]
where \(\mathbf{\epsilon} = \mathbf{Gm} - \mathbf{d}\) is the residual and the subscript \(_\text{reg}\) masks to zero the model parameters not designated to be regularized (see regularize).
If the observations \(\mathbf{d}\) include a covariance matrix \(\mathbf{C}_d\), this information can be used. In this case, \(\mathbf{G}\) and \(\mathbf{d}\) are replaced by their weighted versions
\[\mathbf{G} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G}\]
and
\[\mathbf{d} \rightarrow \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{d}\]
The formal model covariance is defined as the pseudo-inverse
\[\mathbf{C}_m = \left( \mathbf{G}^T \mathbf{C}_d^{-1} \mathbf{G} + \lambda \mathbf{I}_\text{reg} \right)^g\]
where the subscript \(_\text{reg}\) masks to zero the entries corresponding to non-regularized model parameters.
- Parameters
ts (Timeseries) – Timeseries to fit.
models (ModelCollection) – Model collection used for fitting.
penalty (float | list[float] | ndarray) – Penalty hyperparameter \(\lambda\). It can either be a single value used for all components, or a list or NumPy array specifying a penalty for each component in the data.
formal_covariance (bool, default: False) – If True, calculate the formal model covariance.
use_data_variance (bool, default: True) – If True and ts contains variance information, this uncertainty information will be used.
use_data_covariance (bool, default: True) – If True, ts contains variance and covariance information, and use_data_variance is also True, this uncertainty information will be used.
check_constraints (bool, default: True) – If True, check whether models have sign constraints that should be enforced.
- Return type
- Returns
Result of the regression.
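The masked L2 penalty can be sketched as follows (standalone NumPy with synthetic data, not the disstans implementation): the diagonal matrix \(\mathbf{I}_\text{reg}\) is one only for the regularized parameters, so the constant offset below is left unpenalized:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20)
G = np.column_stack([np.ones_like(t), t, t**2])  # offset, rate, quadratic
d = 3.0 + 0.1 * t                                # data without quadratic signal
lam = 10.0

# Only the rate and quadratic terms are regularized, not the offset.
I_reg = np.diag([0.0, 1.0, 1.0])
m_ridge = np.linalg.solve(G.T @ G + lam * I_reg, G.T @ d)
m_plain = np.linalg.solve(G.T @ G, G.T @ d)

print(np.round(m_ridge, 3))  # penalized coefficients shrink toward zero
```

The unregularized solve recovers the exact coefficients [3.0, 0.1, 0.0], while the ridge solve shrinks the penalized rate and quadratic terms.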
Reweighting Functions
For use with lasso_regression() and spatialfit().
ReweightingFunction
- class disstans.solvers.ReweightingFunction(eps, scale=1)[source]
Base class for reweighting functions for spatialfit() that convert a model parameter magnitude into a penalty value. For large magnitudes, the penalties approach zero, and for small magnitudes, the penalties approach "infinity", i.e. very large values.
Usually, the eps value determines where the transition between significant and insignificant (i.e., essentially zero) parameters occurs, and the scale modifies the maximum penalty for insignificant parameters (with the exact maximum penalty dependent on the chosen reweighting function).
At initialization, the eps parameter is set. Inheriting child classes need to define the __call__() method that determines the actual reweighting mechanism. Instantiated reweighting function objects can be used as functions but still provide access to the eps parameter.
- Parameters
- eps
Stability parameter to use for the reweighting function.
- static from_name(name, *args, **kw_args)[source]
Search the local namespace for a reweighting function of that name and return an initialized instance of it.
- Parameters
name (str) – Name (or abbreviation) of the reweighting function.
*args – Argument list passed on to function initialization.
**kw_args – Keyword arguments passed on to function initialization.
- Return type
- Returns
An instantiated reweighting function object.
- scale
Scaling factor applied to reweighting result.
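The name lookup can be sketched as a simple registry (hypothetical stand-in class and hypothetical abbreviation "inv", not the actual disstans namespace search):

```python
# Hypothetical stand-in for one of the documented reweighting classes.
class InverseReweighting:
    def __init__(self, eps, scale=1):
        self.eps, self.scale = eps, scale

# Registry mapping names (and abbreviations) to classes.
_REWEIGHTING_FUNCS = {"inv": InverseReweighting,
                      "InverseReweighting": InverseReweighting}

def from_name(name, *args, **kw_args):
    """Look up a reweighting function class by name and instantiate it."""
    try:
        cls = _REWEIGHTING_FUNCS[name]
    except KeyError:
        raise KeyError(f"Unknown reweighting function: {name}") from None
    return cls(*args, **kw_args)

rw = from_name("inv", eps=1e-4, scale=2.0)
print(type(rw).__name__, rw.eps)  # InverseReweighting 0.0001
```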
InverseReweighting
- class disstans.solvers.InverseReweighting(eps, scale=1)[source]
- __call__(m)[source]
Reweighting function based on the inverse of the input, following [candes08]:
\[w(m_j) = \frac{\text{scale}}{|m_j| + \text{eps}}\]The maximum penalty (\(y\)-intercept) can be approximated as \(\frac{\text{scale}}{\text{eps}}\), and the minimum penalty approaches zero asymptotically.
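Numerically, the formula above can be evaluated directly (a plain NumPy check, independent of disstans):

```python
import numpy as np

eps, scale = 1e-4, 1.0
w = lambda m: scale / (np.abs(m) + eps)  # inverse reweighting

print(w(np.array([0.0, 1e-4, 1.0])))
# a zero parameter gets the maximum penalty scale/eps = 1e4,
# a large parameter gets a penalty approaching zero
```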
InverseSquaredReweighting
- class disstans.solvers.InverseSquaredReweighting(eps, scale=1)[source]
- __call__(m)[source]
Reweighting function based on the inverse squared of the input, following [candes08]:
\[w(m_j) = \frac{\text{scale}}{m_j^2 + \text{eps}^2}\]The maximum penalty (\(y\)-intercept) can be approximated as \(\frac{\text{scale}}{\text{eps}^2}\), and the minimum penalty approaches zero asymptotically.
LogarithmicReweighting
- class disstans.solvers.LogarithmicReweighting(eps, scale=1)[source]
- __call__(m)[source]
Reweighting function based on the logarithm of the input, following [andrecut11]:
\[w(m_j) = \text{scale} \cdot \log_\text{num_reg} \frac{ \| \mathbf{m} \|_1 + \text{num_reg} \cdot \text{eps}}{|m_j| + \text{eps}}\](where \(0 < \text{eps} \ll \frac{1}{\text{num_reg}}\)). This reweighting function's calculated penalties for individual elements depend on the overall size and 1-norm of the input weight vector.
The maximum penalty (\(y\)-intercept) can be approximated as \(\text{scale} \cdot \log_\text{num_reg} \frac{\| \mathbf{m} \|_1}{\text{eps}}\). If there is only a single nonzero value, its penalty will be zero (at \(|m_j|=\| \mathbf{m} \|_1\)). In the intermediate cases where multiple values are nonzero, their penalties will be distributed on a logarithmic slope between the \(y\)-intercept and zero.
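This behavior can be checked numerically (a direct NumPy evaluation of the formula, with eps chosen much smaller than 1/num_reg):

```python
import numpy as np

def log_reweight(m, eps=1e-6, scale=1.0):
    """Logarithmic reweighting evaluated element-wise on a weight vector."""
    num_reg = m.size
    ratio = (np.abs(m).sum() + num_reg * eps) / (np.abs(m) + eps)
    return scale * np.log(ratio) / np.log(num_reg)  # log base num_reg

# A single nonzero parameter receives an (almost) zero penalty,
# while the zero entries sit near the maximum penalty
# log_num_reg(||m||_1 / eps):
print(log_reweight(np.array([1.0, 0.0, 0.0, 0.0])))
```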
References
- andrecut11
Andrecut, M. (2011). Stochastic Recovery Of Sparse Signals From Random Measurements. Engineering Letters, 19(1), 1-6.