The gmwmx2
R
package allows to estimate
linear model with correlated residuals in presence of missing data.
More precisely, we assume the following model:
where is a design matrix of observed predictors, is the regression parameter vector and is a zero-mean process following an unspecified joint distribution with positive-definite covariance function characterizing the second-order dependence structure of the process and parameterized by the vector .
We then define the a random variable which describes the missing observation mechanism. More specifically, the vector is a binary-valued stationary process independent of with expectation , , and covariance matrix whose structure is assumed known up to the parameter vector
We then assume that we only observe the stochastic process , where denotes the Hadamard product. Hence, represents the observed process vector with null elements in the positions where observations are missing.
Using to denote the Kronecker product, we then define as the design matrix with zero-valued vectors for the rows where observations are missing in (where represents a vector of ones of dimension ).
We estimate the parameters with the least square estimator:
We compute the estimated residuals as .
We then estimate with the Maximum Likelihood Estimator the parameters of the missingness process assuming that is generated from a Markov model with the following transition probabilities:
We then estimate the parameters using a Generalized method of Wavelet Moments approach (Guerrier et al. 2013) and using the fact that the variance-covariance matrix of is given by:
where and is the identity matrix of dimension .
More precisely, we rely on the result of Xu et al. (2017) that provide a computationally efficient form of the theoretical Allan variance (equivalent to the Haar wavelet variance up to a constant) for zero-mean stochastic processes such as to avoid computing these large matrices multiplication in the objective function. Indeed in Xu et al. (2017) they generalize the results in Zhang (2008) to zero-mean non-stationary processes by using averages of the diagonals and super-diagonals of the covariance matrix of . What this implies is that the GMWM, which uses this form, does not require the storage of the covariance matrix of , but only of a vector of dimension which is then plugged into an explicit formula consisting in a linear combination of the elements of this vector (these elements being averages of the diagonal and super-diagonals of the covariance matrix).
While the GMWMX as described above and in more details in Voirol et al. (2024), is a general method for
estimating large linear model with complex dependence structure in
presence of missing data, the gmwmx2
R
package
is currently developed specifically to estimate tectonic velocities from
position times series in graticule distance coordinates system (GD)
provided by the Nevada geodetic Laboratory (Blewitt 2024; Blewitt, Hammond, and Kreemer
2018).
To estimate the trajectory model (see e.g., Bevis and Brown (2014) for more details), we construct the design matrix such that -th component of the vector can be described as follows with representing the ordered time point (epoch) indicated in Modified Julian Date and representing the reference epoch located exactly in the middle of start and end point of the time series:
where is the initial position at the reference epoch , is the velocity parameter, and are the periodic motion parameters ( and represent the annual and semi-annual seasonal terms, respectively with and ). The offset terms models earthquakes, equipment changes or human intervention in which is the magnitude of the step at epochs , is the total number of offsets, is the Heaviside step function defined as and the last term allow to model post-seismic deformation (see e.g., Sobrero et al. (2020)) where is the number of post seismic relaxation time specified, is the time when the relaxation starts in Modified Julian Date (MJD), is the relaxation period in days for the post-seismic relaxation and is the amplitude of the transient. Note that by default the estimates of the functional parameters are provided in unit/day.
When loading data from a specific station using the function
gmwmx2::download_station_ngl()
, we extract from the Nevada
Geodetic Laboratory the position time series in GD coordinates, the time
steps associated with a equipment or software change and the time steps
associated with an earthquake near the station. All these objects are
stored in a object of class gnss_ts_ngl
.
When applying the function gmwmx2::gmwmx2()
to an object
of class gnss_ts_ngl
, we construct the design matrix
by considering an offset term for all equipment or software changes
steps and all earthquakes indicated by the NGL. We also specify a
post-seismic relaxation term for all earthquakes indicated by the NGL.
If no relaxation time is specified in the argument
vec_earthquakes_relaxation_time
, we consider a default
relaxation time of
days.
It is generally recognized that noise in GNSS time series is best described by a combination of colored noise plus white noise (He et al. 2017; Langbein 2008; Williams et al. 2004; Bos et al. 2013) where the white noise generally model noise at high frequencies and the colored noise model the lower frequencies of the spectrum. In a large study on the noise properties of GNSS signals, concluded that the optimal noise models for 80โ90% of GNSS time series signals are the power law and white noise model or white noise and flicker/pink noise with model. The package currently support both stochastic model specification.
More precisely, the power spectrum of a power-law noise has the following form: where is the frequency, is a constant, the sampling frequency and the exponent is called the spectral index.
Many stochastic noise can be expressed as such, for example, a spectral index produces a white noise, a spectral index produces a red noise or random walk and a spectral index produce a flicker noise, also called pink noise.
Granger (1980) and Hosking (1981) showed that power-law noise with a spectral index between and can be obtained by using fractional differencing of Gaussian noise:
where is the backward-shift operator and a vector with independent and identically distributed (IID) Gaussian noise.
Following from Hoskingโs definition of the fractional differencing, we obtain
with the coefficient that can be computed using the following recurrence relation (Kasdin and Walter 1992):
Assuming an infinite sequence of zero-mean white noise , with variance , and a spectral index then the autocovariance is (Bos et al. 2008):
When the argument
stochastic_model
is set to "wn + pl"
, the
stochastic model considered includes both white noise and power-law with
the specified above stationary autocovariance structure. The parameters
estimated are:
,
(constrained to be greater than
)
and
.
When the argument stochastic_model
is set to
"wn + fl"
, the stochastic model considered includes both
white noise and flicker noise (not stationary power-law noise with a
spectral index
)
where the variance covariance of the flicker noise
is obtained as follows (see e.g., (Bos et al.
2008)):
where with the coefficients computed considering a spectral index .
The stochastic parameters estimated are: , and is fixed to .
Let us showcase how to estimate the tectonic velocity in for one specific component (North, East or Vertical) of one station.
Let us first load the gmwmx2
package.
station_data <- download_station_ngl("CHML")
plot(station_data)
plot(station_data, component = "N")
fit1 <- gmwmx2(station_data, n_seasonal = 2, component = "N", stochastic_model = "wn + pl")
summary(fit1)
## Summary of Estimated Model
## -------------------------------------------------------------
## Functional parameters
## -------------------------------------------------------------
## Parameter Estimate Std_Deviation 95% CI Lower 95% CI Upper
## -------------------------------------------------------------
## Intercept -0.70924596 0.00876879 -0.72643246 -0.69205945
## Trend 0.00007198 0.00000833 0.00005565 0.00008831
## Sin (Annual) 0.00135377 0.00034179 0.00068387 0.00202367
## Cos (Annual) -0.00062838 0.00036176 -0.00133741 0.00008065
## Sin (Semi-Annual) 0.00017180 0.00023845 -0.00029556 0.00063915
## Cos (Semi-Annual) 0.00030877 0.00024595 -0.00017328 0.00079082
## Jump: MJD 56248 0.01588617 0.00156627 0.01281633 0.01895601
## Jump: MJD 55391 0.00780478 0.00153504 0.00479616 0.01081339
## Jump: MJD 55563 0.00230763 0.00174920 -0.00112074 0.00573600
## Jump: MJD 55603 0.00286880 0.00237791 -0.00179182 0.00752942
## Jump: MJD 55606 -0.00044386 0.00351636 -0.00733580 0.00644808
## Jump: MJD 56011 -0.00008945 0.00165713 -0.00333736 0.00315846
## Jump: MJD 56085 0.00073436 0.00170239 -0.00260227 0.00407099
## Jump: MJD 56748 -0.00038578 0.00139555 -0.00312101 0.00234945
## Jump: MJD 57281 -0.00145742 0.00153903 -0.00447386 0.00155901
## Earthquake: MJD 55391 0.01576163 0.00742010 0.00121850 0.03030476
## Earthquake: MJD 55563 0.00174787 0.02573410 -0.04869004 0.05218579
## Earthquake: MJD 55603 -0.28371260 0.53172585 -1.32587612 0.75845092
## Earthquake: MJD 55606 0.27719703 0.52581943 -0.75339011 1.30778417
## Earthquake: MJD 56011 -0.00480858 0.01325307 -0.03078411 0.02116696
## Earthquake: MJD 56085 -0.00772227 0.01245006 -0.03212393 0.01667940
## Earthquake: MJD 56748 -0.00910404 0.00542215 -0.01973126 0.00152318
## Earthquake: MJD 57281 -0.02236505 0.00645616 -0.03501889 -0.00971122
## -------------------------------------------------------------
## Stochastic parameters
## -------------------------------------------------------------
## White Noise Variance : 0.00000122
## Stationary powerlaw Spectral index: -0.83907240
## Stationary powerlaw Variance: 0.00000336
## -------------------------------------------------------------
## Missingness parameters
## -------------------------------------------------------------
## P(Z_{i+1} = 0 | Z_{i} = 1): 0.00279460
## P(Z_{i+1} = 1 | Z_{i} = 0): 0.31578947
## \hat{E[Z]}: 0.99122807
## -------------------------------------------------------------
## Running time: 1.31 seconds
## -------------------------------------------------------------
By default, the estimated parameters are provided in m/day, we can
optionally scale the estimated functional parameters so that they are
returned in m/year with the argument scale_parameters
.
summary(fit1, scale_parameters = TRUE)
## Summary of Estimated Model
## -------------------------------------------------------------
## Functional parameters
## -------------------------------------------------------------
## Parameter Estimate Std_Deviation 95% CI Lower 95% CI Upper
## -------------------------------------------------------------
## Intercept -259.05208570 3.20279877 -265.32945595 -252.77471545
## Trend 0.02629027 0.00304291 0.02032629 0.03225426
## Sin (Annual) 0.49446304 0.12483968 0.24978175 0.73914432
## Cos (Annual) -0.22951487 0.13213206 -0.48848895 0.02945921
## Sin (Semi-Annual) 0.06274864 0.08709451 -0.10795347 0.23345075
## Cos (Semi-Annual) 0.11277837 0.08983307 -0.06329122 0.28884795
## Jump: MJD 56248 5.80242306 0.57208186 4.68116322 6.92368289
## Jump: MJD 55391 2.85069445 0.56067178 1.75179795 3.94959095
## Jump: MJD 55563 0.84286159 0.63889600 -0.40935155 2.09507473
## Jump: MJD 55603 1.04782942 0.86853158 -0.65446119 2.75012003
## Jump: MJD 55606 -0.16211875 1.28435052 -2.67939950 2.35516201
## Jump: MJD 56011 -0.03267118 0.60526627 -1.21897127 1.15362890
## Jump: MJD 56085 0.26822620 0.62179917 -0.95047779 1.48693018
## Jump: MJD 56748 -0.14090556 0.50972464 -1.13994749 0.85813637
## Jump: MJD 57281 -0.53232443 0.56212924 -1.63407749 0.56942863
## Earthquake: MJD 55391 5.75693615 2.71019228 0.44505689 11.06881541
## Earthquake: MJD 55563 0.63841028 9.39938149 -17.78403892 19.06085948
## Earthquake: MJD 55603 -103.62602673 194.21286783 -484.27625302 277.02419955
## Earthquake: MJD 55606 101.24621682 192.05554589 -275.17573617 477.66816980
## Earthquake: MJD 56011 -1.75633307 4.84068270 -11.24389683 7.73123069
## Earthquake: MJD 56085 -2.82055781 4.54738426 -11.73326718 6.09215156
## Earthquake: MJD 56748 -3.32525134 1.98044104 -7.20684446 0.55634177
## Earthquake: MJD 57281 -8.16883558 2.35811162 -12.79064944 -3.54702173
## -------------------------------------------------------------
## Stochastic parameters
## -------------------------------------------------------------
## White Noise Variance : 0.00000122
## Stationary powerlaw Spectral index: -0.83907240
## Stationary powerlaw Variance: 0.00000336
## -------------------------------------------------------------
## Missingness parameters
## -------------------------------------------------------------
## P(Z_{i+1} = 0 | Z_{i} = 1): 0.00279460
## P(Z_{i+1} = 1 | Z_{i} = 0): 0.31578947
## \hat{E[Z]}: 0.99122807
## -------------------------------------------------------------
## Running time: 1.31 seconds
## -------------------------------------------------------------
plot(fit1)
fit2 <- gmwmx2(station_data, n_seasonal = 2, component = "N", stochastic_model = "wn + fl")
summary(fit2)
## Summary of Estimated Model
## -------------------------------------------------------------
## Functional parameters
## -------------------------------------------------------------
## Parameter Estimate Std_Deviation 95% CI Lower 95% CI Upper
## -------------------------------------------------------------
## Intercept -0.70924596 0.01136945 -0.73152967 -0.68696225
## Trend 0.00007198 0.00001083 0.00005074 0.00009321
## Sin (Annual) 0.00135377 0.00043414 0.00050287 0.00220467
## Cos (Annual) -0.00062838 0.00045622 -0.00152254 0.00026579
## Sin (Semi-Annual) 0.00017180 0.00027821 -0.00037348 0.00071707
## Cos (Semi-Annual) 0.00030877 0.00028861 -0.00025689 0.00087443
## Jump: MJD 56248 0.01588617 0.00191916 0.01212469 0.01964764
## Jump: MJD 55391 0.00780478 0.00182956 0.00421891 0.01139065
## Jump: MJD 55563 0.00230763 0.00190189 -0.00142001 0.00603527
## Jump: MJD 55603 0.00286880 0.00235741 -0.00175163 0.00748923
## Jump: MJD 55606 -0.00044386 0.00358112 -0.00746272 0.00657500
## Jump: MJD 56011 -0.00008945 0.00192695 -0.00386620 0.00368730
## Jump: MJD 56085 0.00073436 0.00199984 -0.00318524 0.00465397
## Jump: MJD 56748 -0.00038578 0.00187908 -0.00406870 0.00329714
## Jump: MJD 57281 -0.00145742 0.00187987 -0.00514189 0.00222704
## Earthquake: MJD 55391 0.01576163 0.00930876 -0.00248320 0.03400647
## Earthquake: MJD 55563 0.00174787 0.02766816 -0.05248072 0.05597647
## Earthquake: MJD 55603 -0.28371260 0.52755951 -1.31771023 0.75028503
## Earthquake: MJD 55606 0.27719703 0.52174624 -0.74540681 1.29980088
## Earthquake: MJD 56011 -0.00480858 0.01509158 -0.03438754 0.02477039
## Earthquake: MJD 56085 -0.00772227 0.01454627 -0.03623244 0.02078791
## Earthquake: MJD 56748 -0.00910404 0.00725836 -0.02333016 0.00512207
## Earthquake: MJD 57281 -0.02236505 0.00811536 -0.03827086 -0.00645925
## -------------------------------------------------------------
## Stochastic parameters
## -------------------------------------------------------------
## White Noise Variance : 0.00000193
## Flicker Noise Variance: 0.00000251
## -------------------------------------------------------------
## Missingness parameters
## -------------------------------------------------------------
## P(Z_{i+1} = 0 | Z_{i} = 1): 0.00279460
## P(Z_{i+1} = 1 | Z_{i} = 0): 0.31578947
## \hat{E[Z]}: 0.99122807
## -------------------------------------------------------------
## Running time: 1.09 seconds
## -------------------------------------------------------------
plot(fit2)