TDLMM is a method for estimating a Treed Distributed Lag Mixture Model. It operates by building an ensemble of pairs of regression trees. Each tree in a tree-pair partitions the time span of the exposure data and estimates a piecewise constant distributed lag effect. The two trees are then intersected to create an interaction surface for estimating the interaction between two exposures. Exposures are selected for each tree stochastically and each exposure or interaction has a unique shrinkage variance component. This allows for exposure variable selection in addition to the estimation of the distributed lag mixture model.
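
The tree-pair idea can be illustrated with a small base R sketch. The 10 time points, the split locations, and the effect sizes are hypothetical values chosen by hand for illustration; this is not the package's internal implementation, where splits and effects are sampled rather than fixed.

```r
# Hypothetical setup: 10 exposure time points and one tree pair.
n_times <- 10
times <- seq_len(n_times)

# Each tree partitions the time span of its exposure.
# Tree 1 (exposure A) splits after time 4; tree 2 (exposure B) after 3 and 7.
node1 <- cut(times, breaks = c(0, 4, n_times), labels = FALSE)
node2 <- cut(times, breaks = c(0, 3, 7, n_times), labels = FALSE)

# One constant per terminal node gives a piecewise constant lag effect.
effect1 <- c(0.5, 0.0)[node1]        # main effect of exposure A by time
effect2 <- c(0.0, -0.2, 0.1)[node2]  # main effect of exposure B by time

# Intersecting the two partitions gives a 2 x 3 grid of cells,
# each with its own constant interaction effect.
cell_effect <- matrix(0, nrow = 2, ncol = 3)
cell_effect[1, 1] <- 0.3
cell_effect[2, 3] <- -0.1

# surface[s, t]: interaction of exposure A at time s with exposure B at time t.
surface <- cell_effect[node1, node2]
```

Here `surface` is a 10 x 10 matrix that is constant within each of the six rectangular cells formed by the two partitions.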

Usage

tdlmm(
  formula,
  data,
  exposure.data,
  n.trees = 20,
  n.burn = 2000,
  n.iter = 5000,
  n.thin = 5,
  family = "gaussian",
  binomial.size = 1,
  formula.zi = NULL,
  keep_XZ = FALSE,
  mixture.interactions = "noself",
  tree.params = c(0.95, 2),
  step.prob = c(0.25, 0.25, 0.25),
  mix.prior = 1,
  shrinkage = "exposures",
  subset = NULL,
  verbose = TRUE,
  diagnostics = FALSE,
  initial.params = NULL,
  ...
)

Arguments

formula

object of class formula, a symbolic description of the fixed effect model to be fitted, e.g. y ~ a + b

data

data frame containing variables used in the formula

exposure.data

named list of equally sized numerical matrices of exposure data, each having the same number of rows as data

n.trees

integer for number of trees in ensemble

n.burn

integer for length of burn-in

n.iter

integer for number of iterations to run model after burn-in

n.thin

integer thinning factor, e.g. n.thin = 10 keeps every tenth iteration

family

'gaussian' for continuous response, 'logit' for binomial response with logit link, or 'zinb' for zero-inflated negative binomial with logit link

binomial.size

integer type scalar (if all equal, default = 1) or vector defining binomial size for 'logit' family

formula.zi

object of class formula, a symbolic description of the zero-inflation (ZI) model to be fitted, e.g. y ~ a + b. This applies only to ZINB, when the covariates for the ZI model differ from those of the NB model. By default, this is the same as the main formula

keep_XZ

FALSE (default) or TRUE: keep the model scale exposure and covariate data

mixture.interactions

'noself' (default) which estimates interactions only between two different exposures, 'all' which also allows interactions within the same exposure, or 'none' which eliminates all interactions and estimates only main effects of each exposure

tree.params

numerical vector of alpha and beta hyperparameters controlling tree depth (see Bayesian CART, 1998), default: alpha = 0.95, beta = 2

step.prob

numerical vector of probabilities for the tree update steps 1) grow/prune, 2) change, 3) switch exposure; defaults to c(0.25, 0.25, 0.25), i.e. equal probability of each step for tree updates

mix.prior

positive scalar hyperparameter for sparsity of exposures

shrinkage

character, one of "all", "trees", "exposures" (default), or "none"; turns on horseshoe-like shrinkage priors for different parts of the model

subset

integer vector to analyze only a subset of data and exposures

verbose

TRUE (default) or FALSE: print output

diagnostics

TRUE or FALSE (default): keep model diagnostics such as terminal nodes, acceptance details, etc.

initial.params

initial parameters for the fixed effects model: NULL = none (default), "glm" = generate using GLM, or user-defined, in which case the length must equal the number of parameters in the fixed effects model

...

NA
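
As a rough guide to how tree.params controls tree depth, the sketch below assumes the standard Bayesian CART prior, in which a node at depth d (root at depth 0) splits with probability alpha * (1 + d)^(-beta). The helper name split_prob is hypothetical and not part of the package.

```r
# Prior probability that a node at a given depth splits,
# assuming the Bayesian CART form alpha * (1 + depth)^(-beta).
split_prob <- function(depth, alpha = 0.95, beta = 2) {
  alpha * (1 + depth)^(-beta)
}

# Split probabilities for depths 0 to 3 under the default hyperparameters.
round(split_prob(0:3), 4)
# 0.9500 0.2375 0.1056 0.0594
```

Under this assumption, the defaults make a first split very likely but deeper splits increasingly rare, which favors shallow trees. A larger beta strengthens that preference.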

Value

object of class 'tdlmm'

Details

tdlmm

We recommend running the model for at least 5000 burn-in iterations followed by 15000 sampling iterations with a thinning factor of 5. Note that these values exceed the defaults (n.burn = 2000, n.iter = 5000). Convergence can be checked by re-running the model and confirming that the results are consistent across runs.
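
The recommended run length could be applied as in the sketch below. The simulated data, the variable names (y, age, pm25, no2), and the wrapper fit_tdlmm are hypothetical; only the tdlmm arguments come from the Usage section. The sketch assumes tdlmm is provided by an installed package named dlmtree, and the call is wrapped in a function so that nothing is fitted until it is invoked.

```r
set.seed(1)
n <- 200       # number of observations
n_times <- 37  # exposure time points per observation

# Covariate data: one row per observation.
dat <- data.frame(y = rnorm(n), age = rnorm(n))

# Exposure data: a named list of equally sized numeric matrices,
# one row per observation and one column per time point.
exposures <- list(
  pm25 = matrix(rnorm(n * n_times), nrow = n, ncol = n_times),
  no2  = matrix(rnorm(n * n_times), nrow = n, ncol = n_times)
)

# Sanity checks on the inputs before fitting.
stopifnot(
  all(vapply(exposures, nrow, integer(1)) == nrow(dat)),
  length(unique(vapply(exposures, ncol, integer(1)))) == 1
)

# Wrapper using the recommended run length: 5000 burn-in iterations,
# 15000 sampling iterations, thinning factor 5.
# Assumes the dlmtree package is installed.
fit_tdlmm <- function(data, exposure.data) {
  dlmtree::tdlmm(
    formula = y ~ age,
    data = data,
    exposure.data = exposure.data,
    n.burn = 5000,
    n.iter = 15000,
    n.thin = 5,
    family = "gaussian"
  )
}

# Convergence check: fit twice and compare the results.
# fit1 <- fit_tdlmm(dat, exposures)
# fit2 <- fit_tdlmm(dat, exposures)
```

The two fits are left commented out because each one runs 20000 MCMC iterations in total.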