Validation protocol inspired by Simulation-Based Calibration (Talts et al. 2018), also known as fake-data simulation.
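
The protocol relies on a self-consistency property: if \(\theta^{(n)}\) is drawn from the prior and \(\theta^{(n, 1)}, \dots, \theta^{(n, L)}\) are posterior draws conditioned on data simulated from \(\theta^{(n)}\), then the rank \(r^{(n)} = \sum_{l = 1}^{L} I(\theta^{(n, l)} < \theta^{(n)})\) is uniformly distributed over \(\{0, \dots, L\}\). The notation here follows the SBC paper and is illustrative rather than taken from the package internals.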

validate_calibration(spec, N, T = 1000, x = NULL, seed = 9000,
  nCores = NULL, ...)

Arguments

spec

A Specification object.

N

An integer with the number of repetitions of the validation protocol.

T

An optional integer with the length of the time series to be simulated. It defaults to 1000 observations.

x

An optional numeric matrix with covariates for Markov-switching regression. It defaults to NULL (no covariates).

seed

An optional integer with the seed used for the simulations. It defaults to 9000.

nCores

An optional integer with the number of cores to use to run the protocol in parallel. It defaults to half the number of available cores.

...

Arguments to be passed to draw_samples.

Value

A named list with two elements. The first element, chains, is a data.frame with Markov chain Monte Carlo convergence diagnostics (number of divergences, number of times the maximum tree depth is reached, maximum number of leapfrog steps, warm-up and sampling times) and posterior predictive checks (observation ranks, Kolmogorov-Smirnov statistic for the observed sample versus the posterior predictive samples). The second element, parameters, compares true versus estimated values for the unknown quantities (mean, standard deviation, quantiles and other posterior summary measures, Monte Carlo standard error, effective sample size, R-hat, and rank).
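
As an illustration of this structure, the returned list might be inspected as follows (a minimal sketch, assuming myVal is the object returned by the call in the Examples section below):

str(myVal, max.level = 1)   # named list with two elements: chains and parameters
head(myVal$chains)          # MCMC diagnostics and posterior predictive checks
head(myVal$parameters)      # true versus estimated values for each unknown quantity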

Details

  1. Compile the prior predictive model (i.e. no likelihood statement in the Stan code).

  2. Draw \(N\) samples of the parameter vector \(\theta\) and the observation vector \(\mathbf{y}_t\) from the prior predictive density.

  3. Compile the posterior predictive model (i.e. Stan code includes both prior density and likelihood statement).

  4. For all \(n \in \{1, \dots, N\}\) (a schematic version of this loop is sketched after the list):

    1. Feed \(\mathbf{y}_t^{(n)}\) to the full model.

    2. Draw one posterior predictive sample of the observation variable \(\mathbf{y}_{t, new}^{(n)}\).

    3. Collect Hamiltonian Monte Carlo diagnostics: number of divergences, number of times the maximum tree depth is reached, maximum number of leapfrog steps, warm-up and sampling times.

    4. Collect posterior sampling diagnostics: posterior summary measures (mean, sd, quantiles), comparison against the true value (rank), MCMC convergence measures (Monte Carlo SE, ESS, R Hat).

    5. Collect posterior predictive diagnostics: observation ranks, Kolmogorov-Smirnov statistic for observed sample vs posterior predictive samples.
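
As a concrete illustration of steps 2 and 4, the toy sketch below runs the same logic on a conjugate Gaussian-mean model, where the posterior has a closed form and no Stan compilation is needed. The model and every name in it are illustrative and not part of the package.

# Toy simulation-based calibration loop (steps 2, 4.1 and the rank of 4.4)
# on a conjugate model: mu ~ N(0, 10^2), y_t | mu ~ N(mu, 1).
set.seed(9000)
N <- 50    # repetitions of the protocol
T <- 300   # length of each simulated series
L <- 100   # posterior draws per repetition

ranks <- numeric(N)
for (n in seq_len(N)) {
  mu_true <- rnorm(1, mean = 0, sd = 10)        # step 2: theta from the prior
  y       <- rnorm(T, mean = mu_true, sd = 1)   # step 2: y simulated from theta
  post_var  <- 1 / (1 / 10^2 + T)               # closed-form posterior for mu
  post_mean <- post_var * sum(y)
  draws     <- rnorm(L, post_mean, sqrt(post_var))
  ranks[n]  <- sum(draws < mu_true)             # rank of the true value
}
hist(ranks, breaks = 10, main = "SBC ranks (toy model)")  # uniform if calibrated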

Examples

# NOT RUN {
mySpec   <- hmm(
  K = 2, R = 1,
  observation = Gaussian(
    mu    = Gaussian(0, 10),
    sigma = Student(
      mu = 0, sigma = 10, nu = 1, bounds = list(0, NULL)
    )
  ),
  initial     = Dirichlet(alpha = c(1, 1)),
  transition  = Dirichlet(alpha = c(1, 1)),
  name = "Univariate Gaussian Hidden Markov Model"
)

myVal <- validate_calibration(
  mySpec, N = 50, T = 300, seed = 90, nCores = 10, iter = 500
)
# }
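
A possible follow-up check, assuming the parameters element exposes the rank column described under Value (the column name is an assumption, not guaranteed by the package):

hist(myVal$parameters$rank, main = "Parameter ranks")  # should look uniform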