
Connectome-based predictive modeling (CPM) is a data-driven approach to predicting individual behavior from brain connectivity data. Originally proposed by Shen et al. (2017), CPM has since been widely used. This function implements the CPM algorithm and provides a convenient interface to it.

Usage

cpm(
  conmat,
  behav,
  ...,
  confounds = NULL,
  thresh_method = c("alpha", "sparsity"),
  thresh_level = 0.01,
  kfolds = NULL,
  bias_correct = TRUE,
  return_edges = c("none", "sum", "all")
)

Arguments

conmat

A matrix of connectome data. Observations in rows, edges in columns (duplicated edges are assumed to have been removed).

behav

A numeric vector containing behavior data. Its length must equal the number of observations (rows) in conmat.

...

For future extension. Currently ignored.

confounds

A matrix of confounding variables. Observations in rows, variables in columns. If NULL, no confounding variables are used.

thresh_method, thresh_level

The thresholding method and level used for edge selection. If thresh_method is "alpha", edges are selected using the critical value of the correlation coefficient, with thresh_level as the significance level. If thresh_method is "sparsity", edges are selected using quantiles of the correlation coefficient, so that network sparsity is controlled.

kfolds

Number of cross-validation folds. If NULL, it is set to the number of observations, i.e., leave-one-subject-out cross-validation.

bias_correct

Logical value indicating whether the connectome data should be bias-corrected. If TRUE, the connectome data are centered and scaled to unit variance, based on the training data, before model fitting and prediction. See Rapuano et al. (2020) for more details.

return_edges

A character string indicating how the selected edges are returned. If "none", no edges are returned. If "sum", the sum of selected edges across folds is returned. If "all", the selected edges for each fold are returned as a 3D array, which is memory-consuming.
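
To make the thresh_method and bias_correct descriptions above more concrete, here is a minimal base-R sketch of what they could amount to on a single training split. It is an illustration under stated assumptions, not this function's actual code: the two-sided test for "alpha", the symmetric quantile cut for "sparsity", and all variable names are assumptions made for the example.

```r
set.seed(1)
n <- 50
conmat <- matrix(rnorm(n * 20), nrow = n)  # 50 observations, 20 edges (simulated)
behav <- rnorm(n)

# Hold out the first observation as the test set
train_x <- conmat[-1, , drop = FALSE]
test_x <- conmat[1, , drop = FALSE]
train_y <- behav[-1]
n_train <- nrow(train_x)

# Bias correction (assumed form): center and scale every edge using
# training-set statistics only, then apply the same transform to the test data
mu <- colMeans(train_x)
sigma <- apply(train_x, 2, sd)
train_z <- scale(train_x, center = mu, scale = sigma)
test_z <- scale(test_x, center = mu, scale = sigma)

# Correlation of each edge with behavior in the training data
r <- as.vector(cor(train_z, train_y))

# "alpha": critical |r| of a two-sided test (assumption) at level alpha
alpha <- 0.01
df <- n_train - 2
t_crit <- qt(1 - alpha / 2, df = df)
r_crit <- t_crit / sqrt(df + t_crit^2)
pos_alpha <- r > r_crit
neg_alpha <- r < -r_crit

# "sparsity": keep the most positive and most negative fraction of edges,
# using quantiles of r (assumed symmetric cut)
level <- 0.1
pos_sparse <- r >= quantile(r, 1 - level)
neg_sparse <- r <= quantile(r, level)
```

With "alpha", the number of selected edges depends on the data and can be zero; with "sparsity", a roughly fixed fraction of edges is kept in every fold.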

Value

A list with the following components:

folds

The fold in which each observation was used as test data during cross-validation.

real

The real (observed) behavior data. This is the same as the input behav if confounds is NULL; otherwise it is the residuals of behav after regressing out confounds.

pred

The predicted behavior data. Each column corresponds to a model (both edges, positive edges, negative edges), and the row names are the observation names (the same as those of behav).

edges

The selected edges, if return_edges is not "none". If return_edges is "sum", it is a matrix with rows corresponding to edges and columns corresponding to networks. If return_edges is "all", it is a 3D array with dimensions corresponding to folds, edges, and networks.
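
The description of real above says that confounds are regressed out of behav. A minimal sketch of that idea in base R follows; it assumes an ordinary least-squares fit with an intercept, which may differ from what the function does internally, and the confound names (age, motion) are made up for the illustration.

```r
set.seed(7)
n <- 60

# Simulated confounds: observations in rows, variables in columns
confounds <- cbind(age = rnorm(n, mean = 30, sd = 5), motion = runif(n))

# Simulated behavior that partly depends on one confound
behav <- 0.2 * confounds[, "age"] + rnorm(n)

# Regress behavior on all confounds and keep the residuals
fit <- lm(behav ~ confounds)
real <- unname(resid(fit))

# The residuals are uncorrelated with every confound (up to rounding)
round(cor(real, confounds), 10)
```

Predictions are then compared against these residuals rather than against the raw behav values.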

References

Shen, X., Finn, E. S., Scheinost, D., Rosenberg, M. D., Chun, M. M., Papademetris, X., & Constable, R. T. (2017). Using connectome-based predictive modeling to predict individual behavior from brain connectivity. Nature Protocols, 12(3), 506–518. https://doi.org/10.1038/nprot.2016.178

Rapuano, K. M., Rosenberg, M. D., Maza, M. T., Dennis, N. J., Dorji, M., Greene, A. S., Horien, C., Scheinost, D., Constable, R. T., & Casey, B. J. (2020). Behavioral and brain signatures of substance use vulnerability in childhood. Developmental Cognitive Neuroscience, 46, 100878. https://doi.org/10.1016/j.dcn.2020.100878

Examples

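# simulate 100 observations (rows) with 100 edges (columns) each
conmat <- matrix(rnorm(100 * 100), nrow = 100)
behav <- rnorm(100)
# default settings: "alpha" threshold at 0.01, leave-one-out cross-validation
cpm(conmat, behav)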
#> CPM results based on leave-one-out cross validation.
# use different threshold method and level
cpm(conmat, behav, thresh_method = "sparsity", thresh_level = 0.05)
#> CPM results based on leave-one-out cross validation.
# use a 10-fold cross-validation
cpm(conmat, behav, kfolds = 10)
#> CPM results based on 10-fold cross validation.
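
For readers who want to see the procedure these calls carry out, the following base-R sketch walks through a simplified leave-one-out CPM for the positive network only, along the lines of Shen et al. (2017). It is a hedged illustration, not this function's implementation: the simulated signal, the helper r_crit_for(), the 0.05 threshold, and the use of lm() on summed edge strength are all assumptions chosen for the example.

```r
set.seed(42)
n <- 40
p <- 30
conmat <- matrix(rnorm(n * p), nrow = n)
# Simulated behavior: the first three edges carry signal
behav <- rowSums(conmat[, 1:3]) + rnorm(n, sd = 0.5)

# Hypothetical helper: critical |r| for a two-sided test with n observations
r_crit_for <- function(n, alpha) {
  t_crit <- qt(1 - alpha / 2, df = n - 2)
  t_crit / sqrt(n - 2 + t_crit^2)
}

pred <- rep(NA_real_, n)
for (i in seq_len(n)) {
  # Split: observation i is the test set, the rest is training data
  train_x <- conmat[-i, , drop = FALSE]
  train_y <- behav[-i]

  # Edge selection on training data only
  r <- as.vector(cor(train_x, train_y))
  pos <- r > r_crit_for(nrow(train_x), alpha = 0.05)

  # Summarize each observation by the summed strength of selected edges
  train_df <- data.frame(
    y = train_y,
    strength = rowSums(train_x[, pos, drop = FALSE])
  )
  fit <- lm(y ~ strength, data = train_df)

  # Predict the held-out observation from its own network strength
  test_df <- data.frame(strength = sum(conmat[i, pos]))
  pred[i] <- predict(fit, newdata = test_df)
}

# Agreement between predicted and observed behavior
cor(pred, behav)
```

Because edge selection and model fitting are repeated inside every fold, the held-out observation never influences which edges are chosen for its own prediction.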