Timeseries
This module contains the Timeseries base class and other formats included by default in DISSTANS.
Timeseries (Parent Class)
- class disstans.timeseries.Timeseries(dataframe, src, data_unit, data_cols, var_cols=None, cov_cols=None, remove_initial_offset=False)
Object that expands the functionality of a DataFrame object for better integration into DISSTANS. Apart from the data itself, it contains information about the source and units of the data. It also performs input checks and uses property setters/getters to ensure consistency. In addition, it makes it possible to perform math on timeseries directly.
- Parameters
dataframe (DataFrame) – The timeseries’ data as a DataFrame. The index should be time, whereas data columns can be both data and their uncertainties.
src (str) – Source description.
data_unit (str) – Data unit.
data_cols (list[str]) – List of strings with the names of the columns of dataframe that contain the data. The length corresponds to the number of components num_components.
var_cols (Optional[list[str]], default: None) – List of strings with the names of the columns of dataframe that contain the data’s variance. Must have the same length as data_cols. None defaults to no data variance columns.
cov_cols (Optional[list[str]], default: None) – List of strings with the names of the columns of dataframe that contain the data’s covariance. Must have length (num_components * (num_components - 1)) / 2, where the order of the elements is determined by their row-by-row, sequential position in the covariance matrix (see Notes). None defaults to no covariance columns.
remove_initial_offset (bool, default: False) – If True, the data timeseries will be shifted such that it starts at zero. The offset will be recorded in offset if it needs to be recovered.
Notes
In terms of mapping the covariance matrix of observations into the format for the Timeseries class, consider this example for observations with three components:
var_cols[0]    cov_cols[0]    cov_cols[1]
(symmetric)    var_cols[1]    cov_cols[2]
(symmetric)    (symmetric)    var_cols[2]
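The row-by-row ordering described in the Notes can be sketched in plain Python. This is only an illustration of the layout, not DISSTANS code; the helper names cov_pairs and full_covariance are hypothetical.

```python
def cov_pairs(num_components):
    """Return the (row, column) pairs of the upper triangle, row by row.

    The position of a pair in the returned list is its index into cov_cols.
    """
    return [(i, j)
            for i in range(num_components)
            for j in range(i + 1, num_components)]


def full_covariance(variances, covariances):
    """Assemble the symmetric covariance matrix for a single observation."""
    n = len(variances)
    pairs = cov_pairs(n)
    if len(covariances) != len(pairs):
        raise ValueError("expected n * (n - 1) / 2 covariance values")
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = variances[i]
    for k, (i, j) in enumerate(pairs):
        matrix[i][j] = covariances[k]
        matrix[j][i] = covariances[k]  # fill the "(symmetric)" entries
    return matrix


# Three components: cov_cols[0] -> (0, 1), cov_cols[1] -> (0, 2), cov_cols[2] -> (1, 2)
pairs = cov_pairs(3)
matrix = full_covariance([4.0, 9.0, 16.0], [1.0, 2.0, 3.0])
```

For three components, the pairs come out in the same order as cov_cols[0], cov_cols[1] and cov_cols[2] in the matrix above.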
- __add__(other)
Special function that allows two timeseries instances (or a timeseries and an equivalently shaped NumPy array) to be added together element-wise.
- Parameters
other (Timeseries) – Timeseries to add to instance.
- Return type
Timeseries
- Returns
New timeseries object containing the sum of the two timeseries.
See also
prepare_math
Prepares the two instances for the mathematical operation. Refer to it for more details about how the two objects are cast together.
Example
Add two Timeseries ts1 and ts2 and save the result as ts3:
ts3 = ts1 + ts2
- __getitem__(columns)
Convenience special function that provides a shorthand notation to access the timeseries’ columns.
- Parameters
columns (str | list[str]) – String or list of strings of the columns to return.
- Return type
Series | DataFrame
- Returns
The requested data as a Series (if a single column) or DataFrame (if multiple columns).
Example
If ts is a Timeseries instance and columns a list of column names, the following two are equivalent:
ts.df[columns]
ts[columns]
- __mul__(other)
Special function that allows two timeseries instances (or a timeseries and an equivalently shaped NumPy array) to be multiplied together element-wise.
- Parameters
other (Timeseries) – Timeseries to multiply with instance.
- Return type
Timeseries
- Returns
New timeseries object containing the product of the two timeseries.
See also
prepare_math
Prepares the two instances for the mathematical operation. Refer to it for more details about how the two objects are cast together.
Example
Multiply two Timeseries ts1 and ts2 and save the result as ts3:
ts3 = ts1 * ts2
- __radd__(other)
Reflected operation of __add__() (necessary if first operand is a NumPy array).
- Return type
Timeseries
- __rmul__(other)
Reflected operation of __mul__() (necessary if first operand is a NumPy array).
- Return type
Timeseries
- __rsub__(other)
Reflected operation of __sub__() (necessary if first operand is a NumPy array).
- Return type
Timeseries
- __rtruediv__(other)
Reflected operation of __truediv__() (necessary if first operand is a NumPy array).
- Return type
Timeseries
- __str__()
Special function that returns a readable summary of the timeseries. Accessed, for example, by Python’s print() built-in function.
- Return type
str
- Returns
Timeseries summary.
- __sub__(other)
Special function that allows another timeseries instance (or an equivalently shaped NumPy array) to be subtracted from a timeseries element-wise.
- Parameters
other (Timeseries) – Timeseries to subtract from instance.
- Return type
Timeseries
- Returns
New timeseries object containing the difference of the two timeseries.
See also
prepare_math
Prepares the two instances for the mathematical operation. Refer to it for more details about how the two objects are cast together.
Example
Subtract Timeseries ts2 from ts1 and save the result as ts3:
ts3 = ts1 - ts2
- __truediv__(other)
Special function that allows a timeseries instance to be divided element-wise by another timeseries (or by an equivalently shaped NumPy array).
- Parameters
other (Timeseries) – Timeseries to divide instance by.
- Return type
Timeseries
- Returns
New timeseries object containing the quotient of the two timeseries.
See also
prepare_math
Prepares the two instances for the mathematical operation. Refer to it for more details about how the two objects are cast together.
Example
Divide Timeseries ts1 by ts2 and save the result as ts3:
ts3 = ts1 / ts2
- add_uncertainties(timeseries=None, var_data=None, var_cols=None, cov_data=None, cov_cols=None)
Add variance and covariance data and column names to the timeseries.
- Parameters
timeseries (Optional[Timeseries], default: None) – Another timeseries object that contains uncertainty information. If set, the function will ignore the rest of the arguments.
var_data (Optional[ndarray], default: None) – New data variance.
var_cols (Optional[list[str]], default: None) – List of variance column names.
cov_data (Optional[ndarray], default: None) – New data covariance. Setting this but not var_data requires there to already be data variance.
cov_cols (Optional[list[str]], default: None) – List of covariance column names.
- Return type
None
Notes
If ts is a Timeseries instance, just using:
ts.vars = new_variance
ts.covs = new_covariance
will only work when the respective columns already exist in the dataframe. (The same holds for renaming variance columns that do not exist.) If they do not exist, the calls will result in an error because no column names exist, in an effort to make the inner workings more transparent and rigorous.
This function makes it possible to override the default behavior, and can also generate column names by itself if none are specified.
- convert_units(factor, new_data_unit)
Convert the data and covariances to a new data unit by providing a conversion factor.
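As a rough illustration of what a unit conversion implies for the uncertainties: if the data are multiplied by a factor, variances and covariances scale with the square of that factor. The sketch below uses plain lists and a hypothetical helper scale_units; it does not reproduce the library's implementation.

```python
def scale_units(data, variances, covariances, factor):
    """Scale data by factor and (co)variances by factor squared.

    Each argument is a list of rows, one row per observation.
    """
    new_data = [[value * factor for value in row] for row in data]
    new_vars = [[value * factor ** 2 for value in row] for row in variances]
    new_covs = [[value * factor ** 2 for value in row] for row in covariances]
    return new_data, new_vars, new_covs


# One observation with two components, converted from cm to mm (factor 10)
data_cm = [[2.0, -3.0]]
vars_cm2 = [[0.5, 1.5]]
covs_cm2 = [[0.25]]
data_mm, vars_mm2, covs_mm2 = scale_units(data_cm, vars_cm2, covs_cm2, 10.0)
```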
- copy(only_data=False, src=None)
Return a deep copy of the timeseries instance.
- Parameters
- Returns
The copy of the timeseries instance.
- Return type
Timeseries
- cut(t_min=None, t_max=None, i_min=None, i_max=None, keep_inside=True)
Cut the timeseries to contain only data between certain times or indices. If both a minimum (maximum) timestamp and index are provided, the later (earlier, respectively) one is used (i.e., the more restrictive one). Also provides the reverse operation, i.e., only removing data between dates.
This operation changes the timeseries in-place; if it should be done on a new timeseries, use copy() first.
- Parameters
t_min (Timestamp | str | None, default: None) – A timestamp or timestamp-convertible string of the earliest observation to keep.
t_max (Timestamp | str | None, default: None) – A timestamp or timestamp-convertible string of the latest observation to keep.
i_min (Optional[int], default: None) – The index of the earliest observation to keep.
i_max (Optional[int], default: None) – The index of the latest observation to keep.
keep_inside (bool, default: True) – If True, keeps data inside of the specified date range. If False, keeps only data outside the specified date range.
- Return type
None
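The "more restrictive bound wins" rule can be sketched as follows. This is a simplified stand-in that works on a plain list of datetimes and returns a new list rather than modifying anything in place; the helper names and the choice of inclusive bounds are assumptions of this sketch.

```python
from datetime import datetime


def cut_bounds(times, t_min=None, t_max=None, i_min=None, i_max=None):
    """Return the effective (start, end) timestamps, both treated as inclusive."""
    start, end = times[0], times[-1]
    if t_min is not None:
        start = max(start, t_min)
    if i_min is not None:
        start = max(start, times[i_min])  # the later lower bound wins
    if t_max is not None:
        end = min(end, t_max)
    if i_max is not None:
        end = min(end, times[i_max])  # the earlier upper bound wins
    return start, end


def cut_times(times, keep_inside=True, **bounds):
    """Keep the timestamps inside (or, if keep_inside is False, outside) the bounds."""
    start, end = cut_bounds(times, **bounds)
    return [t for t in times if (start <= t <= end) == keep_inside]


times = [datetime(2020, 1, day) for day in range(1, 11)]
kept = cut_times(times, t_min=datetime(2020, 1, 3), i_min=4,
                 t_max=datetime(2020, 1, 8))
removed = cut_times(times, keep_inside=False,
                    t_min=datetime(2020, 1, 3), t_max=datetime(2020, 1, 8))
```

Here i_min=4 points at 5 January, which is later than t_min, so it becomes the effective lower bound.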
- classmethod from_array(timevector, data, src, data_unit, data_cols, var=None, var_cols=None, cov=None, cov_cols=None)
Constructor method to create a Timeseries instance from a NumPy ndarray.
- Parameters
timevector (Series | DatetimeIndex) – Series of Timestamp or alternatively a DatetimeIndex containing the timestamps of each observation.
data (ndarray) – 2D NumPy array of shape \((\text{n_observations},\text{n_components})\) containing the data.
src (str) – Source description.
data_unit (str) – Data unit.
data_cols (list[str]) – List of strings with the names of the columns of data.
var (Optional[ndarray], default: None) – 2D NumPy array of shape \((\text{n_observations},\text{n_components})\) containing the data variances. None defaults to no data uncertainty.
var_cols (Optional[list[str]], default: None) – List of strings with the names of the columns of var, i.e., the data’s variance. Must have the same length as data_cols. If var is given but var_cols is not, it defaults to appending '_var' to data_cols.
cov (Optional[ndarray], default: None) – 2D NumPy array of shape \((\text{n_observations},\text{n_components}(\text{n_components}-1)/2)\) containing the data covariances (as defined in Timeseries). None defaults to no data uncertainty.
cov_cols (Optional[list[str]], default: None) – List of strings with the names of the columns of cov, i.e., the data’s covariance. Must have length (num_components * (num_components - 1)) / 2, as defined in Timeseries. If cov is given but cov_cols is not, it defaults to appending '_cov' to the two respective entries of data_cols.
- Return type
Timeseries
- Returns
The generated Timeseries object.
See also
date_range()
Quick function to generate a timevector.
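The default column names mentioned for var_cols and cov_cols could be generated along these lines. The helper names are hypothetical, and the exact way the two data column names are joined for a covariance column (here with an underscore) is an assumption; only the '_var' and '_cov' suffixes come from the description above.

```python
def default_var_cols(data_cols):
    """Append '_var' to every data column name."""
    return [f"{col}_var" for col in data_cols]


def default_cov_cols(data_cols):
    """Name one covariance column per pair of data columns, row by row.

    Assumption: the two names are joined with an underscore before '_cov'.
    """
    n = len(data_cols)
    return [f"{data_cols[i]}_{data_cols[j]}_cov"
            for i in range(n)
            for j in range(i + 1, n)]


data_cols = ["east", "north", "up"]
var_cols = default_var_cols(data_cols)
cov_cols = default_cov_cols(data_cols)
```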
- classmethod from_fit(data_unit, data_cols, fit)
Import a fit dictionary and create a Timeseries instance.
- Parameters
- Return type
Timeseries
- Returns
Timeseries instance created from fit.
See also
disstans.models.Model.evaluate
Evaluating a model produces the fit dictionary.
- get_arch()
Build a dictionary describing the architecture of this timeseries, to be used when creating a network JSON configuration file.
Without subclassing Timeseries, this function will return an empty dictionary by default, since it is unknown how to recreate a general Timeseries object from just a JSON-compatible dictionary.
- Return type
dict
See also
disstans.network.Network.to_json
Export the Network configuration as a JSON file.
disstans.timeseries.Timeseries.get_arch
Get the architecture dictionary of a Timeseries instance.
disstans.models.Model.get_arch
Get the architecture dictionary of a Model instance.
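To illustrate why a file-based subclass can provide a useful architecture dictionary while the general base class cannot, here is a minimal sketch of the round trip through JSON. The class FileTimeseries, the file path, and the dictionary keys "type" and "kw_args" are all hypothetical stand-ins chosen for this example and are not taken from DISSTANS.

```python
import json


class FileTimeseries:
    """Hypothetical stand-in for a subclass that loads its data from a file."""

    def __init__(self, path, show_warnings=True, data_unit="mm"):
        self.path = path
        self.show_warnings = show_warnings
        self.data_unit = data_unit

    def get_arch(self):
        """Return everything needed to call the constructor again."""
        return {"type": type(self).__name__,
                "kw_args": {"path": self.path,
                            "show_warnings": self.show_warnings,
                            "data_unit": self.data_unit}}


original = FileTimeseries("data/STA1.tseries")  # hypothetical path
arch_json = json.dumps(original.get_arch())     # JSON-compatible by construction
arch = json.loads(arch_json)
restored = FileTimeseries(**arch["kw_args"])
```

Because the constructor arguments alone identify the data file, the dictionary is enough to rebuild the object, provided the file is still available.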
- index_map
Matrix that contains the rolling indices of each matrix element used by get_cov_indices().
- property length: timedelta64
Returns the length of the timeseries.
- mask_out(dcol)
Mask out an entire data column (and if present, its uncertainty column) by setting the entire column to NaN. Converts it to a sparse representation to save memory.
- offset
Offset applied to the timeseries data such that it starts at zero.
- static prepare_math(left, right, operation)
Tests two timeseries’ ability to be cast together in a mathematical operation, and returns output characteristics. Currently, only addition, subtraction, multiplication, and division are supported.
All uncertainty information is lost during mathematical operations.
One of the objects can be a NumPy array. In this case, the array has to have the exact same shape as the data in the Timeseries instance. Furthermore, the resulting Timeseries object will have the same src, data_unit and data_cols attributes as the Timeseries operand (instead of a combination of both).
- Parameters
left (Timeseries | ndarray) – Left term of the operation.
right (Timeseries | ndarray) – Right term of the operation.
operation (Literal['+', '-', '*', '/']) – Operation to perform.
- Return type
tuple
- Returns
left_data – View of the 2D left data array of the operation with shape (len(out_time), num_components).
right_data – View of the 2D right data array of the operation with shape (len(out_time), num_components).
out_src – Combines the sources of each object to a new string.
out_data_unit – Combines the data units of each object into a new unit.
out_data_cols – List of strings containing the new data column names.
out_time – Index object containing the indices of all timestamps common to both.
- Raises
TypeError – If one of the operands is not a Timeseries or ndarray, or if both are ndarray (since then this function would never be called anyway).
ValueError – If the number of data columns is not equal between the two operands, or if the data units are not the same when adding or subtracting.
AssertionError – If one of the operands is a NumPy array but does not have the same number of rows as the other operand.
Warning
This method is called under-the-hood whenever a mathematical operation is performed, and should not need to be used by normal users.
See also
__add__
Addition for two Timeseries or a Timeseries and a NumPy array.
__radd__
Addition for a NumPy array and a Timeseries.
__sub__
Subtraction for two Timeseries or a Timeseries and a NumPy array.
__rsub__
Subtraction for a NumPy array and a Timeseries.
__mul__
Multiplication for two Timeseries or a Timeseries and a NumPy array.
__rmul__
Multiplication for a NumPy array and a Timeseries.
__truediv__
Division for two Timeseries or a Timeseries and a NumPy array.
__rtruediv__
Division for a NumPy array and a Timeseries.
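The idea of casting two timeseries together on their common timestamps before applying an element-wise operation can be sketched without any dependencies. Here each timeseries is modeled as a dictionary from an ISO date string to a list of component values; this representation and the function name combine are hypothetical and only meant to show the principle.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def combine(left, right, operation):
    """Apply operation element-wise on the timestamps common to both inputs."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    func = OPERATIONS[operation]
    out_time = sorted(set(left) & set(right))  # timestamps present in both
    result = {}
    for t in out_time:
        if len(left[t]) != len(right[t]):
            raise ValueError("number of data columns differs")
        result[t] = [func(a, b) for a, b in zip(left[t], right[t])]
    return result


ts1 = {"2020-01-01": [1.0, 2.0], "2020-01-02": [3.0, 4.0], "2020-01-03": [5.0, 6.0]}
ts2 = {"2020-01-02": [10.0, 20.0], "2020-01-03": [30.0, 40.0], "2020-01-04": [50.0, 60.0]}
ts3 = combine(ts1, ts2, "+")
```

Only 2 and 3 January appear in the result, since those are the only timestamps shared by both inputs.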
- property reliability: float
Returns the reliability (between 0 and 1) defined as the number of available observations divided by the number of expected observations. The expected observations are calculated by taking the median timespan between observations, and then dividing the total time span by that timespan.
(Essentially, this assumes that there are not any “close-by” observations, e.g., two observations for the same day but a different hour in a dataset of otherwise daily observations.)
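A small sketch of this calculation, using only the standard library, is given below. Adding one to the expected count (so that both end points are counted and a gap-free series scores exactly 1) is an assumption of this sketch; the library's exact formula may differ.

```python
from datetime import datetime
from statistics import median


def reliability(times):
    """Ratio of available to expected observations for sorted timestamps."""
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    typical_gap = median(gaps)
    total_span = (times[-1] - times[0]).total_seconds()
    expected = total_span / typical_gap + 1  # assumption: count both end points
    return len(times) / expected


# Daily observations from 1 to 10 January, with 4 and 7 January missing
days = [1, 2, 3, 5, 6, 8, 9, 10]
times = [datetime(2020, 1, day) for day in days]
score = reliability(times)
full_score = reliability([datetime(2020, 1, day) for day in range(1, 11)])
```

With a median gap of one day over a nine-day span, ten observations are expected and eight are available.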
- property shape: tuple[int, int]
Returns the shape tuple (similar to NumPy) of the timeseries, which is of shape \((\text{n_observations},\text{n_components})\).
- property var_cov: DataFrame
Returns the variance as well as covariance columns from df, to be indexed by var_cov_map to yield the full variance-covariance matrix.
- var_cov_map
Contains the column indices needed to create the full variance-covariance matrix for a single time.
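The role of such an index map can be illustrated as follows. The sketch assumes that a row of uncertainty values lists all variance columns first, followed by the covariance columns in row-by-row order; that column order and the helper names are assumptions made for this example and are not confirmed by the text above.

```python
def build_index_map(num_components):
    """Map each matrix element to a position in a [variances, covariances] row."""
    n = num_components
    index = [[0] * n for _ in range(n)]
    next_cov = n  # assumption: covariance columns follow the n variance columns
    for i in range(n):
        index[i][i] = i
        for j in range(i + 1, n):
            index[i][j] = next_cov
            index[j][i] = next_cov  # symmetric entry reuses the same column
            next_cov += 1
    return index


def expand_row(row, index):
    """Turn one flat row of (co)variances into the full symmetric matrix."""
    return [[row[k] for k in index_row] for index_row in index]


index = build_index_map(3)
row = [4.0, 9.0, 16.0, 1.0, 2.0, 3.0]  # var0, var1, var2, cov01, cov02, cov12
matrix = expand_row(row, index)
```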
Specialized Classes
GipsyTimeseries
- class disstans.timeseries.GipsyTimeseries(path, show_warnings=True, data_unit='mm', **kw_args)
Subclasses Timeseries.
Timeseries subclass for GNSS measurements in JPL’s Gipsy(X) .tseries file format.
- Parameters
Additional keyword arguments will be passed onto Timeseries.
Notes
The column format is described on JPL’s website:
Columns          Description
Column 1         Decimal year computed with 365.25 days/yr
Columns 2-4      East, North and Vertical [m]
Columns 5-7      East, North and Vertical standard deviation [m]
Columns 8-10     East, North and Vertical correlation [-]
Column 11        Time in Seconds past J2000
Columns 12-17    Time in YEAR MM DD HR MN SS
Time is GPS time, and the time series are relative to each station’s first epoch.
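Two conversions that a reader of such a file typically needs are sketched below: turning seconds past J2000 (1 January 2000, 12:00:00) into a calendar timestamp, and turning standard deviations plus a correlation into variances and a covariance. The helper names are hypothetical, no leap-second handling is applied, and nothing here is taken from the library's own loader.

```python
from datetime import datetime, timedelta

J2000 = datetime(2000, 1, 1, 12, 0, 0)
M_TO_MM = 1000.0  # the file stores meters, the default data_unit is 'mm'


def j2000_to_datetime(seconds):
    """Convert seconds past J2000 to a naive datetime (same time scale as input)."""
    return J2000 + timedelta(seconds=seconds)


def to_var_cov(sd_a, sd_b, correlation):
    """Return (var_a, var_b, cov_ab) from two standard deviations and a correlation."""
    return sd_a ** 2, sd_b ** 2, correlation * sd_a * sd_b


timestamp = j2000_to_datetime(43200)  # half a day after the J2000 epoch
var_e, var_n, cov_en = to_var_cov(0.002 * M_TO_MM, 0.003 * M_TO_MM, 0.5)
```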
- get_arch()
Returns a JSON-compatible dictionary with all the information necessary to recreate the Timeseries instance (provided the data file is available).
- Returns
JSON-compatible dictionary sufficient to recreate the GipsyTimeseries instance.
- Return type
dict
Timeseries.get_arch
For further information.
UNRTimeseries
- class disstans.timeseries.UNRTimeseries(path, show_warnings=True, data_unit='mm', **kw_args)
Subclasses Timeseries.
Timeseries subclass for GNSS measurements in UNR’s .tenv3 file format.
- Parameters
Additional keyword arguments will be passed onto Timeseries.
Notes
The column format is described on UNR’s website:
Columns          Description
Column 1         Station name
Column 2         Date
Column 3         Decimal year
Column 4         Modified Julian day
Columns 5-6      GPS week and day
Column 7         Longitude [°] of reference meridian
Columns 8-9      Easting [m] from ref. mer., integer and fraction
Columns 10-11    Northing [m] from equator, integer and fraction
Columns 12-13    Vertical [m], integer and fraction
Column 14        Antenna height [m]
Columns 15-17    East, North, Vertical standard deviation [m]
Column 18        East-North correlation coefficient [-]
Column 19        East-Vertical correlation coefficient [-]
Column 20        North-Vertical correlation coefficient [-]
Newer files also contain the following three columns:
Column 21        Latitude [°]
Column 22        Longitude [°]
Column 23        Altitude [m]
The time series are relative to each station’s first integer epoch.
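The split into integer and fraction columns, and the three correlation coefficients, suggest the following kind of preprocessing. This is a sketch with hypothetical helper names, assuming the integer and fraction parts are simply added; note that the East-North, East-Vertical, North-Vertical order of columns 18-20 matches the row-by-row covariance order used by Timeseries for components (East, North, Vertical).

```python
def combine_parts(integer_part, fraction_part):
    """Recombine a coordinate stored as separate integer and fraction columns."""
    return integer_part + fraction_part


def covariances_from_correlations(sds, correlations):
    """Convert correlation coefficients into covariances.

    sds: (sd_east, sd_north, sd_vertical)
    correlations: (east-north, east-vertical, north-vertical)
    """
    sd_e, sd_n, sd_u = sds
    corr_en, corr_eu, corr_nu = correlations
    return [corr_en * sd_e * sd_n,
            corr_eu * sd_e * sd_u,
            corr_nu * sd_n * sd_u]


easting = combine_parts(1234, 0.25)
covs = covariances_from_correlations((2.0, 3.0, 4.0), (0.5, 0.25, -0.5))
```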
- get_arch()
Returns a JSON-compatible dictionary with all the information necessary to recreate the Timeseries instance (provided the data file is available).
- Returns
JSON-compatible dictionary sufficient to recreate the UNRTimeseries instance.
- Return type
dict
Timeseries.get_arch
For further information.