Package 'medshift'

Title: Causal mediation analysis for stochastic interventions
Description: Estimators of a parameter arising in the decomposition of the population intervention (in)direct effect of stochastic interventions in causal mediation analysis, including efficient one-step, targeted minimum loss (TML), re-weighting (IPW), and substitution estimators. The parameter estimated constitutes a part of each of the population intervention (in)direct effects. These estimators may be used in assessing population intervention (in)direct effects under stochastic treatment regimes, including incremental propensity score interventions and modified treatment policies. The methodology was first discussed by I Díaz and NS Hejazi (2020) <doi:10.1111/rssb.12362>.
Authors: Nima Hejazi [aut, cre, cph], Iván Díaz [aut], Mark van der Laan [ctb, ths], Jeremy Coyle [ctb]
Maintainer: Nima Hejazi <[email protected]>
License: MIT + file LICENSE
Version: 0.1.4
Built: 2024-10-31 02:50:50 UTC
Source: https://github.com/nhejazi/medshift

Help Index


Confidence Intervals for Stochastic Mediation Parameters

Description

Compute confidence intervals for objects of class medshift, which contain estimates produced by medshift.

Usage

## S3 method for class 'medshift'
confint(object, parm = seq_len(object$psi), level = 0.95, ...)

Arguments

object

An object of class medshift, as produced by invoking medshift, for which a confidence interval is to be computed.

parm

A numeric vector indicating indices of object$est for which to return confidence intervals.

level

A numeric indicating the confidence interval level.

...

Other arguments. Not currently used.
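
As a rough illustration of the kind of interval involved, the sketch below builds a Wald-type interval from a point estimate and its standard error. It assumes a normal approximation; wald_ci and the numeric inputs are hypothetical and are not part of medshift, whose own method may differ in its details.

```r
# Hypothetical helper (not part of medshift): Wald-type confidence interval
# from a point estimate and its standard error, under a normal approximation.
wald_ci <- function(est, se, level = 0.95) {
  z <- stats::qnorm(1 - (1 - level) / 2)
  c(lwr = est - z * se, est = est, upr = est + z * se)
}

# Illustrative values only; in practice these would come from a fitted object.
ci <- wald_ci(est = 0.50, se = 0.10, level = 0.95)
print(ci)
```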


Inverse probability weighted (IPW) estimator

Description

Inverse probability weighted (IPW) estimator

Usage

est_ipw(data, delta, g_learners, e_learners, w_names, z_names, ...)

Arguments

data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the original input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

delta

A numeric value indicating the degree of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a given observational unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

g_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the propensity score, i.e., g = P(A | W).

e_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting a propensity score that conditions on the mediators, i.e., e = P(A | Z, W).

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by a call to the wrapper function medshift.

z_names

A character vector of the names of the columns that correspond to mediators (Z). The input for this argument is automatically generated by medshift.

...

Other arguments currently ignored.
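
One way to read the re-weighting idea is sketched below: each outcome is weighted by the ratio of the shifted propensity score g_delta(A | W) to the mediator-conditional propensity score e(A | Z, W). This is an assumed form of the estimator, not a copy of the package's code; it uses simulated data and plain glm fits in place of sl3 learners, and every object name is hypothetical.

```r
# Simulated binary-treatment data (illustrative only).
set.seed(1)
n <- 500
W <- rbinom(n, 1, 0.5)
A <- rbinom(n, 1, plogis(-0.5 + W))
Z <- rbinom(n, 1, plogis(-1 + A + 0.5 * W))
Y <- rbinom(n, 1, plogis(-1 + A + Z + 0.5 * W))
delta <- 2

# Propensity scores: g = P(A = 1 | W) and e = P(A = 1 | Z, W).
g1 <- fitted(glm(A ~ W, family = binomial()))
e1 <- fitted(glm(A ~ Z + W, family = binomial()))

# Incremental propensity score shift: multiply the odds of treatment by delta.
g1_shift <- delta * g1 / (delta * g1 + 1 - g1)

# Evaluate both mechanisms at the observed treatment level.
g_obs <- ifelse(A == 1, g1_shift, 1 - g1_shift)
e_obs <- ifelse(A == 1, e1, 1 - e1)

# Assumed IPW form: average of the weighted outcomes.
ipw_est <- mean(g_obs / e_obs * Y)
print(ipw_est)
```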


Efficient One-Step Estimator

Description

Efficient One-Step Estimator

Usage

est_onestep(data, delta, g_learners, e_learners, m_learners, phi_learners,
  w_names, z_names, cv_folds = 10)

Arguments

data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the original input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

delta

A numeric value indicating the degree of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a given observational unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

g_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the propensity score, i.e., g = P(A | W).

e_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting a propensity score that conditions on the mediators, i.e., e = P(A | Z, W).

m_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the outcome regression, i.e., m(A, Z, W).

phi_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in a regression of a pseudo-outcome on the baseline covariates, i.e., phi(W) = E[m(A = 1, Z, W) - m(A = 0, Z, W) | W].

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by medshift.

z_names

A character vector of the names of the columns that correspond to mediators (Z). The input for this argument is automatically generated by medshift.

cv_folds

A numeric specifying the number of folds to be created for cross-validation. Use of cross-validation / cross-fitting allows for entropy conditions on the AIPW estimator to be relaxed. Note: for compatibility with make_folds, this value must be greater than or equal to 2; the default is to create 10 folds.
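
The fold structure behind cross-fitting can be sketched in base R as follows. The helper assign_folds is hypothetical and only mimics the general idea of splitting observations into validation sets; it is not the make_folds function referred to above.

```r
# Hypothetical helper: randomly assign n observations to cv_folds
# validation sets of near-equal size.
assign_folds <- function(n, cv_folds = 10) {
  stopifnot(cv_folds >= 2, cv_folds <= n)
  sample(rep(seq_len(cv_folds), length.out = n))
}

set.seed(11)
n <- 25
cv_folds <- 5
fold_id <- assign_folds(n, cv_folds = cv_folds)

# For each fold, nuisance functions would be fit on the training rows and
# evaluated on the held-out validation rows.
splits <- lapply(seq_len(cv_folds), function(v) {
  list(train = which(fold_id != v), valid = which(fold_id == v))
})
print(lengths(lapply(splits, `[[`, "valid")))
```

Because every observation is predicted by a model that never saw it, the nuisance estimates used in the estimator are out-of-sample, which is the property the relaxed entropy conditions rely on.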


Substitution estimator

Description

Substitution estimator

Usage

est_substitution(data, delta, g_learners, m_learners, w_names, z_names, ...)

Arguments

data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the original input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

delta

A numeric value indicating the degree of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a given observational unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

g_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the propensity score, i.e., g = P(A | W).

m_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the outcome regression, i.e., m(A, Z, W).

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by a call to the wrapper function medshift.

z_names

A character vector of the names of the columns that correspond to mediators (Z). The input for this argument is automatically generated by a call to the wrapper function medshift.

...

Other arguments currently ignored.
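
A minimal sketch of a substitution (plug-in) estimate is given below, assuming the target is the average over observed (Z, W) of m(a, Z, W) weighted by the shifted treatment distribution g_delta(a | W). This form is an assumption made for illustration rather than a transcription of the package's code; plain glm fits stand in for sl3 learners and all names are hypothetical.

```r
# Simulated binary-treatment data (illustrative only).
set.seed(2)
n <- 500
dat <- data.frame(W = rbinom(n, 1, 0.5))
dat$A <- rbinom(n, 1, plogis(-0.5 + dat$W))
dat$Z <- rbinom(n, 1, plogis(-1 + dat$A + 0.5 * dat$W))
dat$Y <- rbinom(n, 1, plogis(-1 + dat$A + dat$Z + 0.5 * dat$W))
delta <- 2

# Propensity score g = P(A = 1 | W) and its incremental (odds-scale) shift.
g1 <- fitted(glm(A ~ W, family = binomial(), data = dat))
g1_shift <- delta * g1 / (delta * g1 + 1 - g1)

# Outcome regression m(A, Z, W), predicted under A = 1 and A = 0.
m_fit <- glm(Y ~ A + Z + W, family = binomial(), data = dat)
m1 <- predict(m_fit, newdata = transform(dat, A = 1), type = "response")
m0 <- predict(m_fit, newdata = transform(dat, A = 0), type = "response")

# Plug-in estimate: average m over the shifted treatment distribution.
sub_est <- mean(m1 * g1_shift + m0 * (1 - g1_shift))
print(sub_est)
```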


Fit propensity score regression while conditioning on mediators

Description

Fit propensity score regression while conditioning on mediators

Usage

fit_e_mech(data, valid_data = NULL, learners, z_names, w_names)

Arguments

data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the original input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

valid_data

A holdout data set, with columns exactly matching those appearing in the preceding argument data, to be used for estimation via cross-fitting. Optional, defaulting to NULL.

learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting a propensity score that conditions on the mediators, i.e., e = P(A | Z, W).

z_names

A character vector of the names of the columns that correspond to mediators (Z). The input for this argument is automatically generated by medshift.

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by a call to the wrapper function medshift.


Fit propensity score with incremental stochastic shift intervention

Description

Fit propensity score with incremental stochastic shift intervention

Usage

fit_g_mech(data, valid_data = NULL, delta, learners, w_names)

Arguments

data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the original input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

valid_data

A holdout data set, with columns exactly matching those appearing in the preceding argument data, to be used for estimation via cross-fitting. Optional, defaulting to NULL.

delta

A numeric value indicating the degree of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a given observational unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the propensity score, i.e., g = P(A | W).

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by medshift.
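
The effect of delta on a binary treatment's propensity score can be sketched directly. The function below applies the incremental propensity score shift described above, in which the odds of treatment are multiplied by delta; ipsi_shift and odds are hypothetical names used only for this illustration.

```r
# Hypothetical helper: shifted propensity score under an incremental
# propensity score intervention that multiplies the odds of treatment by delta.
ipsi_shift <- function(g, delta) {
  delta * g / (delta * g + 1 - g)
}

# Convert a probability to odds.
odds <- function(p) p / (1 - p)

g <- c(0.2, 0.5, 0.8)
g_shifted <- ipsi_shift(g, delta = 2)

# The odds under the shift are delta times the original odds.
print(cbind(g = g, g_shifted = g_shifted,
            odds_ratio = odds(g_shifted) / odds(g)))
```

Setting delta = 1 leaves the propensity score unchanged, while values above (below) 1 raise (lower) each unit's chance of receiving the intervention without ever pushing it to exactly 0 or 1.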


Fit outcome regression

Description

Fit outcome regression

Usage

fit_m_mech(data, valid_data = NULL, learners, z_names, w_names)

Arguments

data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the original input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

valid_data

A holdout data set, with columns exactly matching those appearing in the preceding argument data, to be used for estimation via cross-fitting. Optional, defaulting to NULL.

learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the outcome regression, i.e., m(A, Z, W).

z_names

A character vector of the names of the columns that correspond to mediators (Z). The input for this argument is automatically generated by medshift.

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by medshift.


Fit intervention-specific exponential tilt nuisance parameter

Description

Fit intervention-specific exponential tilt nuisance parameter

Usage

fit_phi_mech(train_data, valid_data, learners, m_output, w_names)

Arguments

train_data

A data.table containing the observed data, with columns in the order specified by the NPSEM (Y, Z, A, W), with column names set appropriately based on the input data. Such a structure is merely a convenience utility for passing data to the various core estimation routines and is automatically generated by medshift.

valid_data

A holdout data set, with columns exactly matching those appearing in the preceding argument train_data, to be used for estimation via cross-fitting. Not optional for this nuisance parameter.

learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in a regression of a pseudo-outcome on the baseline covariates, i.e., phi(W) = E[m(A = 1, Z, W) - m(A = 0, Z, W) | W].

m_output

Object containing results from fitting the outcome regression, as produced by fit_m_mech.

w_names

A character vector of the names of the columns that correspond to baseline covariates (W). The input for this argument is automatically generated by medshift.
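
The pseudo-outcome regression described for the learners argument can be sketched in base R. The example below is an illustration under simplifying assumptions: simulated data, lm fits in place of sl3 learners, a single train/holdout split, and hypothetical object names throughout.

```r
# Simulated data with a treatment-by-mediator interaction (illustrative only).
set.seed(3)
n <- 400
dat <- data.frame(W = rnorm(n))
dat$A <- rbinom(n, 1, plogis(0.3 * dat$W))
dat$Z <- rnorm(n, mean = 0.5 * dat$A + 0.2 * dat$W)
dat$Y <- rnorm(n, mean = dat$A + dat$Z + 0.5 * dat$A * dat$Z + dat$W)

# Single train/holdout split, standing in for one cross-fitting fold.
train <- dat[1:200, ]
valid <- dat[201:400, ]

# Outcome regression m(A, Z, W), fit on the training data.
m_fit <- lm(Y ~ A * Z + W, data = train)

# Pseudo-outcome on the training data: m(1, Z, W) - m(0, Z, W).
train$pseudo <- predict(m_fit, newdata = transform(train, A = 1)) -
  predict(m_fit, newdata = transform(train, A = 0))

# Regress the pseudo-outcome on W, then predict phi(W) on the holdout data.
phi_fit <- lm(pseudo ~ W, data = train)
phi_valid <- predict(phi_fit, newdata = valid)
print(summary(phi_valid))
```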


Likelihood Factor for Incremental Propensity Score Interventions

Description

Likelihood Factor for Incremental Propensity Score Interventions

Format

R6Class object.

Value

LF_base object.

Constructor

define_lf(LF_ipsi, name, type = "density", likelihood_base, shift_param, treatment_task, control_task, ...)

name

A character, giving the name of the likelihood factor. Should match a node name in the nodes specified by the npsem slot of tmle3_Task.

likelihood_base

A trained Likelihood object, for use in generating a re-scaled likelihood factor.

shift_param

A numeric, specifying the magnitude of the desired incremental propensity score shift (a multiplier of the odds of receiving treatment).

treatment_task

A tmle3_Task object created by setting the intervention to the treatment condition: do(A = 1).

control_task

A tmle3_Task object created by setting the intervention to the control condition: do(A = 0).

...

Not currently used.

Fields

likelihood_base

A trained Likelihood object, for use in generating a re-scaled likelihood factor.

shift_param

A numeric, specifying the magnitude of the desired incremental propensity score shift (a multiplier of the odds of receiving treatment).

treatment_task

A tmle3_Task object created by setting the intervention to the treatment condition: do(A = 1).

control_task

A tmle3_Task object created by setting the intervention to the control condition: do(A = 0).

...

Additional arguments passed to the base class.

References

Kennedy, Edward H (2019). "Nonparametric Causal Effects Based on Incremental Propensity Score Interventions." Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2017.1422737

Díaz, Iván and Hejazi, Nima S (2020). "Causal Mediation Analysis for Stochastic Interventions." Journal of the Royal Statistical Society, Series B. https://doi.org/10.1111/rssb.12362


Nonparametric estimation of the population intervention (in)direct effects

Description

Nonparametric estimation of the population intervention (in)direct effects

Usage

medshift(W, A, Z, Y, ids = seq_along(Y), delta,
  g_learners = sl3::Lrnr_glm$new(), e_learners = sl3::Lrnr_glm$new(),
  m_learners = sl3::Lrnr_glm$new(), phi_learners = sl3::Lrnr_glm$new(),
  estimator = c("onestep", "tmle", "substitution", "reweighted"),
  estimator_args = list(cv_folds = 10, max_iter = 10000, step_size = 1e-06))

Arguments

W

A matrix, data.frame, or similar corresponding to a set of baseline covariates.

A

A numeric vector corresponding to a treatment variable. The parameter of interest is defined as a location shift of this quantity.

Z

A numeric vector, matrix, data.frame, or similar corresponding to a set of mediators (on the causal pathway between the intervention A and the outcome Y).

Y

A numeric vector corresponding to an outcome variable.

ids

A numeric vector of observation-level IDs, allowing for observational units to be related through a hierarchical structure. The default is to assume all units are IID. When repeated IDs are included, both the cross-validation procedures used for estimation and inferential procedures respect these IDs.

delta

A numeric value indicating the degree of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

g_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the propensity score, i.e., g = P(A | W).

e_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting a propensity score that conditions on the mediators, i.e., e = P(A | Z, W).

m_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the outcome regression, i.e., m(A, Z, W).

phi_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in a regression of a pseudo-outcome on the baseline covariates, i.e., phi(W) = E[m(A = 1, Z, W) - m(A = 0, Z, W) | W].

estimator

The desired estimator of the parameter of interest to be computed. Currently, choices are limited to a one-step estimator ("onestep"), a targeted minimum loss estimator ("tmle"), a substitution estimator ("substitution"), and a re-weighted estimator ("reweighted").

estimator_args

A list of extra arguments to be passed (via ...) to the function call for the specified estimator. The default is so chosen as to allow the number of folds used in computing the one-step estimator to be easily tweaked. Refer to the documentation for functions est_onestep, est_ipw, and est_substitution for details on what other arguments may be specified through this mechanism. For the option "tmle", there is heavy reliance on the architecture provided by tmle3.
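
A possible call pattern, based only on the Usage signature above, is sketched below. The simulated data, the choice delta = 2, and the choice of 5 folds are illustrative assumptions; the call is guarded so that it runs only when medshift and sl3 are installed, and it relies on the default learners shown in Usage.

```r
# Simulated data (illustrative only): two baseline covariates, a binary
# treatment, a binary mediator, and a binary outcome.
set.seed(429)
n <- 300
W <- data.frame(W1 = rbinom(n, 1, 0.5), W2 = rnorm(n))
A <- rbinom(n, 1, plogis(-0.5 + W$W1))
Z <- rbinom(n, 1, plogis(-1 + A + 0.5 * W$W1))
Y <- rbinom(n, 1, plogis(-1 + A + Z + 0.3 * W$W2))

# Run only when both packages are available; arguments mirror the Usage
# signature, with the default sl3::Lrnr_glm learners left in place.
if (requireNamespace("medshift", quietly = TRUE) &&
    requireNamespace("sl3", quietly = TRUE)) {
  fit <- medshift::medshift(
    W = W, A = A, Z = Z, Y = Y,
    delta = 2,
    estimator = "onestep",
    estimator_args = list(cv_folds = 5)
  )
  print(summary(fit))
}
```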


Parameter for the Population Intervention (In)direct Effects

Description

Parameter definition class. See https://doi.org/10.1111/rssb.12362.

Format

R6Class object.

Value

Param_base object.

Constructor

define_param(Param_medshift, shift_param, ..., outcome_node)

observed_likelihood

A Likelihood corresponding to the observed likelihood.

shift_param

A numeric, specifying the magnitude of the desired incremental propensity score shift (a multiplier of the odds of receiving treatment).

...

Not currently used.

outcome_node

A character, giving the name of the node that should be treated as the outcome.

Fields

cf_likelihood

The counterfactual likelihood under the joint stochastic intervention on exposure and mediators.

lf_ipsi

Object derived from LF_base for assessing the joint intervention on exposure and mediators.

treatment_task

A tmle3_Task created by setting the intervention to the treatment condition: do(A = 1).

control_task

A tmle3_Task object created by setting the intervention to the control condition: do(A = 0).

shift_param

A numeric, specifying the magnitude of the desired incremental propensity score shift (a multiplier of the odds of receiving treatment).


One-step or TML estimation of the population intervention direct effect

Description

One-step or TML estimation of the population intervention direct effect

Usage

pide(W, A, Z, Y, ids = seq(1, length(Y)), delta, estimator = c("onestep",
  "tmle"), ci_level = 0.95, ...)

Arguments

W

A matrix, data.frame, or similar corresponding to a set of baseline covariates.

A

A numeric vector corresponding to a treatment variable. The parameter of interest is defined as a location shift of this quantity.

Z

A numeric vector, matrix, data.frame, or similar corresponding to a set of mediators (on the causal pathway between the intervention A and the outcome Y).

Y

A numeric vector corresponding to an outcome variable.

ids

A numeric vector of observation-level IDs, allowing for observational units to be related through a hierarchical structure. The default is to assume all units are IID. When repeated IDs are included, both the cross-validation procedures used for estimation and inferential procedures respect these IDs.

delta

A numeric value indicating the degree of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

estimator

The desired estimator of the population intervention direct effect to be computed. Currently, choices are limited to a one-step estimator ("onestep") and a targeted minimum loss estimator ("tmle").

ci_level

A numeric indicating the desired coverage level of the confidence interval to be computed.

...

Additional arguments passed to medshift. Consult the documentation of that function for details.


Print Method for Class medshift

Description

The print method for objects of class medshift.

Usage

## S3 method for class 'medshift'
print(x, ...)

Arguments

x

An object of class medshift.

...

Other options (not currently used).


Summary for Stochastic Mediation Parameter Objects

Description

Print a convenient summary for objects of S3 class medshift.

Usage

## S3 method for class 'medshift'
summary(object, ..., ci_level = 0.95)

Arguments

object

An object of class medshift, as produced by invoking the function medshift, for which a confidence interval is to be constructed.

...

Other arguments. Not currently used.

ci_level

A numeric indicating the level of the confidence interval to be computed.


Hypothesis test of direct effect with mediated stochastic interventions using the multiplier bootstrap

Description

Hypothesis test of direct effect with mediated stochastic interventions using the multiplier bootstrap

Usage

test_de(W, A, Z, Y, ids = seq(1, length(Y)), delta_grid = seq(from = 0.5,
  to = 5, by = 0.9), mult_type = c("rademacher", "gaussian"),
  ci_level = 0.95, g_learners, e_learners, m_learners, phi_learners,
  cv_folds = 10, n_mult = 10000)

Arguments

W

A matrix, data.frame, or similar corresponding to a set of baseline covariates.

A

A numeric vector corresponding to a treatment variable. The parameter of interest is defined as a location shift of this quantity.

Z

A numeric vector, matrix, data.frame, or similar corresponding to a set of mediators (on the causal pathway between the intervention A and the outcome Y).

Y

A numeric vector corresponding to an outcome variable.

ids

A numeric vector of observation-level IDs, allowing for observational units to be related through a hierarchical structure. The default is to assume all units are IID. When repeated IDs are included, both the cross-validation procedures used for estimation and inferential procedures respect these IDs.

delta_grid

A numeric vector of values giving the various degrees of shift in the intervention to be used in defining the causal quantity of interest. In the case of binary interventions, this takes the form of an incremental propensity score shift, acting as a multiplier of the odds with which a given observational unit receives the intervention (EH Kennedy, 2018, JASA; doi:10.1080/01621459.2017.1422737).

mult_type

A character identifying the type of multipliers to be used in the multiplier bootstrap. Choices are "rademacher" or "gaussian", with the default being the former.

ci_level

A numeric indicating the (1 - alpha) level of the simultaneous confidence band to be computed around the estimates of the direct effect. The error level of the test reported in the returned p-value is simply alpha, i.e., one minus this quantity.

g_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the propensity score, i.e., g = P(A | W).

e_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting a propensity score that conditions on the mediators, i.e., e = P(A | Z, W).

m_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting the outcome regression, i.e., m(A, Z, W).

phi_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in a regression of a pseudo-outcome on the baseline covariates, i.e., phi(W) = E[m(A = 1, Z, W) - m(A = 0, Z, W) | W].

cv_folds

A numeric specifying the number of folds to be created for cross-validation. Use of cross-validation / cross-fitting allows for entropy conditions on the AIPW estimator to be relaxed. Note: for compatibility with make_folds, this value must be greater than or equal to 2; the default is to create 10 folds.

n_mult

A numeric scalar giving the number of repetitions of the multipliers to be used in computing the multiplier bootstrap.
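
The multiplier bootstrap behind a simultaneous confidence band can be sketched generically. In the example below, the matrix eif is a random stand-in for centered influence function estimates, one column per value in delta_grid; the construction is a textbook Rademacher multiplier bootstrap under that assumption and is not taken from the package's code.

```r
set.seed(42)
n <- 200   # number of observations
K <- 4     # number of shift values in the grid

# Stand-in for estimated influence function values (n rows, K columns),
# centered so that each column has mean zero.
eif <- matrix(rnorm(n * K), nrow = n, ncol = K)
eif <- scale(eif, center = TRUE, scale = FALSE)
se <- apply(eif, 2, stats::sd)

# For each repetition, draw Rademacher multipliers (+1 or -1), re-weight the
# rows, and record the largest standardized statistic across the grid.
n_mult <- 1000
max_stat <- replicate(n_mult, {
  xi <- sample(c(-1, 1), size = n, replace = TRUE)
  max(abs(colSums(xi * eif) / (sqrt(n) * se)))
})

# Critical value for a 95% simultaneous band: estimate +/- crit * se / sqrt(n).
crit <- unname(stats::quantile(max_stat, probs = 0.95))
print(crit)
```

Because the critical value is taken from the distribution of the maximum over the grid, it exceeds the pointwise normal quantile, which is what allows the band to cover all shift values at once.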


TML Estimator for the Counterfactual Mean of a Joint Stochastic Intervention Defining the Population Intervention (In)direct Effects

Description

O = (W, A, Z, Y)
W = Covariates (possibly multivariate)
A = Treatment (binary or categorical)
Z = Mediators (binary or categorical; possibly multivariate)
Y = Outcome (binary or bounded continuous)

Usage

tmle_medshift(shift_type = "ipsi", delta, e_learners, phi_learners,
  max_iter = 10000, step_size = 1e-06, ...)

Arguments

shift_type

A character defining the type of shift to be applied to the exposure – an incremental propensity score intervention, by default.

delta

A numeric, specifying the magnitude of the shift.

e_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in fitting a cleverly parameterized propensity score that conditions on the mediators, i.e., e = P(A | Z, W).

phi_learners

A Stack (or other learner class that inherits from Lrnr_base), containing a single or set of instantiated learners from sl3, to be used in a regression of a pseudo-outcome on the baseline covariates, i.e., phi(W) = E[m(A = 1, Z, W) - m(A = 0, Z, W) | W].

max_iter

A numeric setting the maximum iterations allowed in the targeting step based on universal least favorable submodels.

step_size

A numeric giving the step size (delta_epsilon in tmle3) to be used in the targeting step based on universal least favorable submodels.

...

Additional arguments (currently unused).


TML Estimator for the Counterfactual Mean of a Joint Stochastic Intervention Defining the Population Intervention (In)direct Effects

Description

TML Estimator for the Counterfactual Mean of a Joint Stochastic Intervention Defining the Population Intervention (In)direct Effects