Treed Distributed Lag Mixture Models (Deprecated)

TDLMM is a method for estimating a Treed Distributed Lag Mixture Model. It operates by building an ensemble of pairs of regression trees. Each tree in a tree-pair partitions the time span of the exposure data and estimates a piecewise constant distributed lag effect. The two trees are then intersected to create an interaction surface for estimating the interaction between two exposures. Exposures are selected for each tree stochastically and each exposure or interaction has a unique shrinkage variance component. This allows for exposure variable selection in addition to the estimation of the distributed lag mixture model.
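The idea of intersecting two lag partitions can be sketched in base R. This is a conceptual illustration only and not the package's internal code; the number of lags, the break points, and all effect values below are made up for the example.

```r
# Conceptual illustration only; not the dlmtree internals.
n_lags <- 10                      # hypothetical number of exposure time points

# Each tree of a pair partitions lags 1..n_lags (break points are made up).
breaks_a <- c(0, 3, 7, n_lags)    # tree 1: lags 1-3, 4-7, 8-10
breaks_b <- c(0, 5, n_lags)       # tree 2: lags 1-5, 6-10

# Terminal node index for every lag under each tree
node_a <- cut(seq_len(n_lags), breaks = breaks_a, labels = FALSE)
node_b <- cut(seq_len(n_lags), breaks = breaks_b, labels = FALSE)

# Piecewise constant main effects: one (hypothetical) value per terminal node
effect_a <- c(0.0, 0.4, 0.0)[node_a]
effect_b <- c(-0.2, 0.0)[node_b]

# Intersecting the two partitions gives a 3 x 2 grid of lag-by-lag regions,
# each with its own constant interaction effect (filled column-wise).
region_effect <- matrix(c(0, 0.1, 0,
                          0, 0.0, 0.3), nrow = 3, ncol = 2)

# Expand to an n_lags x n_lags surface: rows index lags of exposure 1,
# columns index lags of exposure 2.
interaction_surface <- region_effect[node_a, node_b]
dim(interaction_surface)
```

In the model itself the partitions and effects are sampled rather than fixed, and the ensemble sums contributions over many such tree pairs.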

Usage

tdlmm(
  formula,
  data,
  exposure.data,
  n.trees = 20,
  n.burn = 2000,
  n.iter = 5000,
  n.thin = 5,
  family = "gaussian",
  binomial.size = 1,
  formula.zi = NULL,
  keep_XZ = FALSE,
  mixture.interactions = "noself",
  tree.params = c(0.95, 2),
  step.prob = c(0.25, 0.25, 0.25),
  mix.prior = 1,
  shrinkage = "exposures",
  subset = NULL,
  verbose = TRUE,
  diagnostics = FALSE,
  initial.params = NULL,
  ...
)

Arguments

formula: object of class formula, a symbolic description of the fixed effect model to be fitted, e.g. y ~ a + b
data: data frame containing variables used in the formula
exposure.data: named list containing equally sized numerical matrices of exposure data, each with the same number of rows as data
n.trees: integer for number of trees in ensemble
n.burn: integer for length of burn-in
n.iter: integer for number of iterations to run model after burn-in
n.thin: integer thinning factor, i.e. keep every n.thin-th iteration (every fifth iteration with the default n.thin = 5)
family: 'gaussian' for continuous response, 'logit' for binomial response with logit link, or 'zinb' for zero-inflated negative binomial with logit link
binomial.size: integer type scalar (if all equal, default = 1) or vector defining binomial size for 'logit' family
formula.zi: object of class formula, a symbolic description of the zero-inflation (ZI) model to be fitted, e.g. y ~ a + b. This applies only to the 'zinb' family, when the covariates for the ZI model differ from those of the NB model. By default it is the same as the main formula
keep_XZ: FALSE (default) or TRUE: keep the model scale exposure and covariate data
mixture.interactions: 'noself' (default) which estimates interactions only between two different exposures, 'all' which also allows interactions within the same exposure, or 'none' which eliminates all interactions and estimates only main effects of each exposure
tree.params: numerical vector of alpha and beta hyperparameters controlling tree depth (see Bayesian CART, 1998), default: alpha = 0.95, beta = 2
step.prob: numerical vector of probabilities for the tree update steps 1) grow/prune, 2) change, 3) switch exposure; defaults to c(0.25, 0.25, 0.25), i.e. equal probability for each step
mix.prior: positive scalar hyperparameter for sparsity of exposures
shrinkage: character, one of "all", "trees", "exposures" (the default in the signature above), or "none"; turns on horseshoe-like shrinkage priors for different parts of the model
subset: integer vector to analyze only a subset of data and exposures
verbose: TRUE (default) or FALSE: print output
diagnostics: TRUE or FALSE (default): keep model diagnostics such as terminal nodes, acceptance details, etc.
initial.params: initial parameters for the fixed effects model: NULL = none (default), "glm" = generate using GLM, or a user-defined vector whose length must equal the number of parameters in the fixed effects model
...: NA
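To give a sense of what tree.params controls, the Bayesian CART prior of Chipman, George and McCulloch (1998) sets the probability that a node at depth d splits to alpha * (1 + d)^(-beta). The sketch below evaluates that formula for the default values; it assumes the package uses the prior in this standard form.

```r
# Split probability by node depth under the Bayesian CART tree prior,
# assuming the standard form alpha * (1 + depth)^(-beta).
tree.params <- c(0.95, 2)         # defaults from the usage above
alpha <- tree.params[1]
beta  <- tree.params[2]

depth   <- 0:3                    # root node has depth 0
p_split <- alpha * (1 + depth)^(-beta)

round(p_split, 4)                 # 0.9500 0.2375 0.1056 0.0594
```

With these defaults a split at the root is very likely, while deeper splits become rare quickly, which favors shallow trees.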

Value

object of class 'tdlmm'
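
A minimal sketch of preparing inputs and obtaining a fitted object follows. The data are simulated noise, the exposure names (pm25, no2) are hypothetical, and the run lengths are deliberately far too short for inference; the call is skipped when the dlmtree package is not installed.

```r
set.seed(1)
n      <- 100   # observations
n_lags <- 20    # exposure time points per observation

# Hypothetical outcome and covariates
dat <- data.frame(y = rnorm(n), a = rnorm(n), b = rnorm(n))

# Named list of equally sized matrices: one row per observation in `dat`,
# one column per time point.
exposures <- list(
  pm25 = matrix(rnorm(n * n_lags), nrow = n, ncol = n_lags),
  no2  = matrix(rnorm(n * n_lags), nrow = n, ncol = n_lags)
)

# Structural checks matching the argument descriptions
stopifnot(
  !is.null(names(exposures)),
  all(vapply(exposures, nrow, integer(1)) == nrow(dat)),
  length(unique(vapply(exposures, ncol, integer(1)))) == 1
)

# Fit only if dlmtree is available; short chains keep the sketch fast.
if (requireNamespace("dlmtree", quietly = TRUE)) {
  fit <- dlmtree::tdlmm(y ~ a + b, data = dat, exposure.data = exposures,
                        n.trees = 5, n.burn = 50, n.iter = 100, n.thin = 5,
                        verbose = FALSE)
  class(fit)
}
```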

Details

We recommend running the model for at least 5000 burn-in iterations followed by 15000 sampling iterations with a thinning factor of 5. Note that these values exceed the defaults for n.burn and n.iter. Convergence can be checked by re-running the model and confirming that the results are consistent across runs.
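
The arithmetic behind these settings can be checked in base R. The sketch assumes thinning keeps every n.thin-th post-burn-in iteration; exactly which iterations the sampler retains is an assumption, but the count does not depend on it.

```r
# Recommended run lengths from the text above
n.burn <- 5000
n.iter <- 15000
n.thin <- 5

# Indices of post-burn-in iterations retained under the assumed thinning rule
kept <- seq(from = n.thin, to = n.iter, by = n.thin)

length(kept)                # 3000 retained posterior draws
total <- n.burn + n.iter    # 20000 MCMC iterations in total

# For comparison, the defaults (n.iter = 5000, n.thin = 5) retain 1000 draws
default_kept <- 5000 / 5
```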