This document illustrates the main features of tdlnm
within the dlmtree
package. Additional information regarding the statistical methodology and computational details are given in Mork & Wilson (2021).
The distributed lag nonlinear model (DLNM) is a tool for estimating an exposure-response function due to a series of possibly correlated, lagged exposures. In particular, we consider the exposure-response function \(f(\mathbf{x}_i)=\sum_t w(x_{it},t)\) where \(w(x_{it},t)\) represents the DLNM.
Often, \(w\) is constrained to vary smoothly over time and estimated via spline-based methods. TDLNM allows for smoothness, but also enables a `hard’ cutoff in exposure effects at a given timepoint, which allows for more precision in identifying periods of effect/no-effect.
TDLNM method operates using a set of regression trees that each assume piecewise constant relationships across the exposure-time space. TDLNM outperforms spline-based models when the exposure-time surface is not smooth and performs similarly in smooth settings. TDLNM also produces lower variance estimates while maintain high coverage and more precisely identifies critical windows within the exposure-time surface.
To install package dlmtree
, use the devtools
package and run function install_github
.
library(devtools)
install_github("danielmork/dlmtree")
For this example we will use a data simulation available in the dlmtree
package.
library(dlmtree)
set.seed(1)
D <- sim.tdlnm(effect = "A", error.to.signal = 1)
# Also try effects B (linear), C (logistic), and D (logistic and smooth in time)
data
parameterexposure.data
as a matrix, where the columns relate to the time periods of exposure (this could also be interpreted as lags where the first column is the most recent exposure)exposure.se
for a smooth effect (TDLNMse) in exposure-concentration, in our paper we used half the standard deviation of the exposure data, which gave consistent results across a range of simulation scenariosexposure.splits
across the range of exposure-concentration values. This could be given as a fixed set of values, or a single value which will split evenly across the quantiles of the exposure data. Setting this to \(0\) results in a DLM (linear effects)family
parametersplits <- seq(min(D$exposure), max(D$exposure), length.out = 52)
res <- tdlnm(formula = y ~ .,
data = D$dat,
exposure.data = as.matrix(D$exposure),
#exposure.se = sd(D$exposure)/2,
exposure.splits = splits,
n.trees = 20, n.burn = 2000, n.iter = 5000, n.thin = 5,
family = "gaussian")
## Preparing data...
## exposure.splits entered as numeric vector, assuming values are exposure splitting points
## Burn-in % complete
## [0--------25--------50--------75--------100]
## ''''''''''''''''''''''''''''''''''''''''''
## MCMC iterations (est time: 5 seconds)
## [0--------25--------50--------75--------100]
## ''''''''''''''''''''''''''''''''''''''''''
## Compiling results...
summary
functionexp(res_sum$gamma.mean)
and exp(res_sum$gamma.ci)
res_sum <- summary(res,
cenval = D$cenval,
pred.at = seq(min(D$exposure), max(D$exposure), length.out = 100),
conf.level = 0.95)
## Centered DLNM at exposure value 1
res_sum
## ---
## TDLNM summary
##
## Model run info
## - 20 trees
## - 2000 burn-in iterations
## - 5000 post-burn iterations
## - 5 thinning factor
## - 0.95 confidence level
##
## Fixed effect coefficients:
## Mean Lower.Bound Upper.Bound
## (Intercept) -4.150 -25.682 17.959
## *c1 0.683 0.586 0.785
## *c2 1.288 1.198 1.386
## *c3 0.301 0.200 0.400
## *c4 -0.994 -1.084 -0.904
## *c5 -0.575 -0.673 -0.475
## *b1 -2.451 -2.646 -2.245
## *b2 -1.569 -1.770 -1.366
## *b3 1.723 1.522 1.923
## *b4 0.580 0.379 0.776
## *b5 -1.512 -1.717 -1.304
## ---
## * = CI does not contain zero
##
## DLNM effect:
## range = [-1.05, 0.064]
## signal-to-noise = 1.072
## critical windows: 11-15
boxplot(res$gamma[,-1], xlab = "Parameters", ylab = "Effect") # Boxplot estimated effects
points(1:10, res_sum$gamma.mean[-1], col = "red", pch = 16) # Posterior mean
points(1:10, D$params, col = "green", pch = 16) # Green dots = truth
plot(res_sum,
plot.type = "mean", # (default), also 'se', 'ci-min', 'ci-max', 'slice'
main = "Plot title",
xlab = "Time axis",
ylab = "Exposure-concentration axis",
flab = "Effect color label",
start.time = 1) # change to reorient x-axis
plot(res_sum, plot.type = "slice", val = 3, main = "Slice at concentration 3")
plot(res_sum, plot.type = "slice", time = 13, main = "Slice at time 13")
truth <- D$dlnm.fun(sapply(1:37, function(i) res_sum$pred.at), D$cenval, F)
# RMSE
sqrt(mean((res_sum$matfit - truth)^2))
## [1] 0.0318009
# Coverage
mean(res_sum$cilower < truth & res_sum$ciupper > truth)
## [1] 0.98
# True positive effect classification
(length(which(res_sum$cilower > 0 & truth > 0)) +
length(which(res_sum$ciupper < 0 & truth < 0))) /
length(which(truth != 0))
## [1] 1
# False positive effect classification
(length(which(res_sum$cilower > 0 & truth == 0)) +
length(which(res_sum$ciupper < 0 & truth == 0))) /
length(which(truth == 0))
## [1] 0