| Title: | Build and Compare Statistical Models |
|---|---|
| Description: | Build and compare nested statistical models with sets of equal and different independent variables. An analysis using this package is Marquardt et al. (2021) <https://github.com/p-mq/Percentile_based_averaging>. |
| Authors: | Dr J. Peter Amin Marquardt [aut, cre] (ORCID: <https://orcid.org/0000-0002-5596-1357>) |
| Maintainer: | Dr J. Peter Amin Marquardt <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.3 |
| Built: | 2026-06-10 08:39:01 UTC |
| Source: | https://github.com/p-mq/blanketstatsments |
Build a formula for use in statistical models from vectors of strings. Copied from the basecamb package to avoid a dependency.
.build_model_formula(outcome, predictors, censor_event = NULL)

| Argument | Description |
|---|---|
| outcome | character denoting the column with the outcome. |
| predictors | vector of characters denoting the columns with the predictors. |
| censor_event | character denoting the column with the censoring event, for use in Survival-type models. |

formula for use in statistical models
J. Peter Marquardt
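As an illustration of building a formula from strings, here is a minimal sketch in base R. The helper name `sketch_formula` is hypothetical and is not the package's internal function; it only assumes that the left-hand side becomes a `Surv()` term when a censoring column is given.

```r
# Hypothetical helper for illustration only (not part of the package).
sketch_formula <- function(outcome, predictors, censor_event = NULL) {
  # Join predictor names into the right-hand side, e.g. "age + sex"
  rhs <- paste(predictors, collapse = " + ")
  # Assumption: survival-type models wrap time and event in Surv()
  lhs <- if (is.null(censor_event)) {
    outcome
  } else {
    sprintf("survival::Surv(%s, %s)", outcome, censor_event)
  }
  stats::as.formula(paste(lhs, "~", rhs))
}

f_lin <- sketch_formula("wt.loss", c("age", "sex"))
f_cox <- sketch_formula("time", "age", censor_event = "status")
print(f_lin)
print(f_cox)
```

Building the formula as text and converting it once with `as.formula()` keeps column names as plain strings until the model is fitted.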
Calculate concordance statistics for a list of statistical models on the same data set
blanket_c_statistic(df, model_list, modality = "logistic", verbose = FALSE)

| Argument | Description |
|---|---|
| df | data.frame containing the data set. If evaluating independently, use the test set. |
| model_list | list of statistical models of type lm, glm or coxph to be evaluated. |
| modality | character specifying model type. Currently accepts 'linear', 'logistic', and 'cox' |
| verbose | logical. TRUE activates printout messages. |

list of doubles with the AUC values for the evaluated models on the specified data set.
J. Peter Marquardt
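To make the idea of one concordance value per model concrete, the following sketch applies `survival::concordance()` to a named list of Cox models fitted on the same data. This is an assumption-laden illustration: the estimator the package uses internally may differ from the default (Harrell-type) concordance computed here, and the variable names are made up for the example.

```r
library(survival)

lung <- survival::lung

# Two Cox models on the same data set, differing only in their predictors
fits <- list(
  age      = coxph(Surv(time, status) ~ age, data = lung),
  age_ecog = coxph(Surv(time, status) ~ age + ph.ecog, data = lung)
)

# One concordance value per model, returned as a named list of doubles
c_stats <- lapply(fits, function(fit) unname(concordance(fit)$concordance))
str(c_stats)
```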
Perform a blanket redundancy analysis on a list of existing models
blanket_redundancy_analysis(model_list, data, r2_threshold = 0.9, nk = 0, verbose = FALSE)

| Argument | Description |
|---|---|
| model_list | a list of statistical regression models of class linear, logistic or coxph |
| data | data.frame used to create the models |
| r2_threshold | float threshold value to consider a parameter redundant |
| nk | number of knots used in spline fitting |
| verbose | logical. TRUE activates printouts of key findings |

a list of objects of class "redun"
J. Peter Marquardt
[blanket_stats()]
data <- survival::lung
models_to_run <- list(
  'OS' = list('outcome' = 'time', 'modality' = 'cox', 'event_censor' = 'status'),
  'weight_loss' = list('outcome' = 'wt.loss', 'modality' = 'linear', 'event_censor' = NA))
predictor_sets <- list('age' = c('age'), 'age_ecog' = c('age', 'ph.ecog'))
covariates <- c('sex')
bl_stats <- blanket_statsments(data, models_to_run, predictor_sets, covariates)
blanket_redundancy_analysis(bl_stats, data)
Run the same model (type, outcome, and covariates) with different sets of predictors
blanket_stats(df, outcome, predictor_sets, covariates = c(), modality = "linear", event_censor = NA, verbose = FALSE)

| Argument | Description |
|---|---|
| df | data.frame containing the data set. |
| outcome | character designating the column with the outcome of interest |
| predictor_sets | named list of character vectors containing columns with predictors |
| covariates | vector of characters denoting columns with covariables |
| modality | character denoting model type. Currently limited to 'linear', 'logistic', and 'cox' |
| event_censor | character denoting column with censor event. For coxph models only |
| verbose | logical. TRUE activates printout messages. |

named list of models
J. Peter Marquardt
data <- survival::lung
outcome <- 'time'
predictor_sets <- list('age' = c('age'), 'age_ecog' = c('age', 'ph.ecog'))
covariates <- c('sex')
modality <- 'cox'
event_censor <- 'status'
bl_stats <- blanket_stats(data, outcome, predictor_sets, covariates, modality, event_censor)
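Conceptually, running the same model over several predictor sets amounts to fitting one model per set while keeping outcome and covariates fixed. The sketch below shows that pattern with plain `survival::coxph()` and `lapply()`. It is an illustration of the idea under that assumption, not the package's implementation.

```r
library(survival)

lung <- survival::lung
predictor_sets <- list(age = c("age"), age_ecog = c("age", "ph.ecog"))
covariates <- c("sex")

# Fit one Cox model per predictor set; covariates are added to every model
fits <- lapply(predictor_sets, function(preds) {
  rhs <- paste(c(preds, covariates), collapse = " + ")
  coxph(as.formula(paste("Surv(time, status) ~", rhs)), data = lung)
})

# The result is a named list of models, keyed by predictor set name
names(fits)
```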
Wraps blanket_stats. Run a list of models with different modalities/outcomes for a list of different predictor sets with the same covariables.
blanket_statsments(df, models_to_run, predictor_sets, covariates = c(), verbose = FALSE)

| Argument | Description |
|---|---|
| df | data.frame containing the data set. |
| models_to_run | either a named list or data.frame type, with every entry/row having the keys/columns outcome, modality, and event_censor |
| predictor_sets | named list of lists containing the set of predictors. See blanket_stats for details |
| covariates | vector of characters denoting columns with covariables |
| verbose | logical. TRUE activates printout messages. |

named list of named lists of models
J. Peter Marquardt
data <- survival::lung
models_to_run <- list(
  'OS' = list('outcome' = 'time', 'modality' = 'cox', 'event_censor' = 'status'),
  'weight_loss' = list('outcome' = 'wt.loss', 'modality' = 'linear', 'event_censor' = NA))
predictor_sets <- list('age' = c('age'), 'age_ecog' = c('age', 'ph.ecog'))
covariates <- c('sex')
bl_stats <- blanket_statsments(data, models_to_run, predictor_sets, covariates)
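The "named list of named lists of models" structure can be pictured as an outer loop over model specifications and an inner loop over predictor sets. The following sketch reproduces that shape with base R and survival; `fit_one` is a hypothetical helper, and only the 'cox' and 'linear' modalities are handled here for brevity.

```r
library(survival)

lung <- survival::lung
models_to_run <- list(
  OS          = list(outcome = "time", modality = "cox", event_censor = "status"),
  weight_loss = list(outcome = "wt.loss", modality = "linear", event_censor = NA)
)
predictor_sets <- list(age = "age", age_ecog = c("age", "ph.ecog"))
covariates <- "sex"

# Hypothetical helper: fit one model for one specification and predictor set
fit_one <- function(spec, preds) {
  rhs <- paste(c(preds, covariates), collapse = " + ")
  if (spec$modality == "cox") {
    f <- as.formula(sprintf("Surv(%s, %s) ~ %s", spec$outcome, spec$event_censor, rhs))
    coxph(f, data = lung)
  } else {
    f <- as.formula(sprintf("%s ~ %s", spec$outcome, rhs))
    lm(f, data = lung)
  }
}

# Outer level: model specification; inner level: predictor set
nested <- lapply(models_to_run, function(spec) {
  lapply(predictor_sets, function(preds) fit_one(spec, preds))
})
```

A single model is then addressed by two names, for example `nested$OS$age_ecog`.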
Build a Cox proportional hazards model from data and meta-parameters
build_cox_model(df, event_time, event_censor, predictors, covariates = c(), verbose = FALSE)

| Argument | Description |
|---|---|
| df | data.frame containing the data set |
| event_time | character denoting column with event time |
| event_censor | character denoting column specifying events/censoring |
| predictors | character vector denoting columns with independent variables of interest |
| covariates | character vector denoting columns with independent variables not of interest. Covariates are mathematically identical to predictors but will be ignored in reporting |
| verbose | logical. TRUE activates printout messages |

A Cox proportional hazards model
J. Peter Marquardt
data <- survival::lung
mod <- build_cox_model(data, 'time', 'status', c('age', 'sex'))
Build a generic regression model from data and meta-parameters. Currently only available for linear and logistic types.
build_reg_model(df, outcome, predictors, covariates = c(), modality = "linear", verbose = FALSE)

| Argument | Description |
|---|---|
| df | data.frame containing the data set |
| outcome | character denoting column with the outcome of interest |
| predictors | character vector denoting columns with independent variables of interest |
| covariates | character vector denoting columns with independent variables not of interest. Covariates are mathematically identical to predictors but will be ignored in reporting |
| modality | character designating type. Currently limited to 'linear' and 'logistic'. |
| verbose | logical. TRUE activates printout messages |

A regression model of linear or logistic type
J. Peter Marquardt
mod <- build_reg_model(data.frame('outcome' = c(1, 2), 'pred' = c(3, 4)), 'outcome', c('pred'))
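For orientation, the two modalities correspond to familiar base R fits. The sketch below assumes that 'linear' maps to `lm()` and 'logistic' to `glm()` with a binomial family; the source does not state which calls the package uses internally, and the `dead` column is created only for this example.

```r
lung <- survival::lung

# Assumption for the example: derive a 0/1 outcome from status (2 = dead)
lung$dead <- as.integer(lung$status == 2)

# Linear regression on a continuous outcome
fit_linear <- lm(wt.loss ~ age + sex, data = lung)

# Logistic regression on a binary outcome
fit_logistic <- glm(dead ~ age + sex, data = lung, family = binomial())

coef(fit_linear)
coef(fit_logistic)
```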
Calculate Uno's concordance statistic for any model. Caution: if you want to evaluate a model trained on a different data set, df should be limited to the test set.
calculate_Uno_c(df, model, verbose = FALSE)

| Argument | Description |
|---|---|
| df | data.frame containing the data set. If evaluating independently, use the test set. |
| model | statistical model of type coxph to be evaluated. |
| verbose | logical. TRUE activates printout messages. |

double AUC value for the evaluated model on the specified data set.
J. Peter Marquardt
data <- survival::lung
cancer_mod <- survival::coxph(survival::Surv(time, status) ~ age, data = data)
calculate_Uno_c(data, cancer_mod)
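For comparison, the survival package itself offers a time-weighted concordance. The sketch below contrasts the default weighting with `timewt = "n/G2"`, which the survival documentation associates with Uno's weighting. This is an assumption-based illustration; `calculate_Uno_c()` may rely on a different implementation and can return a different value.

```r
library(survival)

lung <- survival::lung
fit <- coxph(Surv(time, status) ~ age, data = lung)

# Default weighting (Harrell-type concordance)
harrell_c <- unname(concordance(fit)$concordance)

# Inverse-probability-of-censoring weighting, as suggested by Uno
uno_c <- unname(concordance(fit, timewt = "n/G2")$concordance)

c(harrell = harrell_c, uno = uno_c)
```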
Perform a redundancy analysis on an existing model
redundancy_analysis(model, data, r2_threshold = 0.9, nk = 0)

| Argument | Description |
|---|---|
| model | a statistical regression model of class linear, logistic or coxph |
| data | data.frame used to create the model |
| r2_threshold | float threshold value to consider a parameter redundant |
| nk | number of knots used in spline fitting |

an object of class "redun"
J. Peter Marquardt
data <- survival::lung
mod <- build_reg_model(data, 'meal.cal', c('sex', 'age'))
redundancy_analysis(mod, data)
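The role of `r2_threshold` can be illustrated with a deliberately simplified base R sketch: each independent variable is regressed linearly on all the others, and a variable is flagged when the resulting R² reaches the threshold. This is only the basic idea under a linear assumption (comparable to `nk = 0`); it is not the algorithm that produces the "redun" object, and the chosen variables are arbitrary.

```r
lung <- survival::lung
vars <- c("age", "sex", "ph.ecog", "ph.karno")
d <- stats::na.omit(lung[, vars])  # complete cases only
r2_threshold <- 0.9

# R^2 from predicting each variable using all remaining variables
r2 <- vapply(vars, function(v) {
  f <- stats::reformulate(setdiff(vars, v), response = v)
  summary(stats::lm(f, data = d))$r.squared
}, numeric(1))

# Variables that the others explain at or above the threshold
redundant <- names(r2)[r2 >= r2_threshold]

print(round(r2, 2))
print(redundant)
```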
Table results of a blanket redundancy analysis on a list of existing models
table_blanket_redundancies(blanket_redundancies, digits = 2)

| Argument | Description |
|---|---|
| blanket_redundancies | list of lists of redun objects generated by blanket_redundancy_analysis() |
| digits | integer number of decimals to include |

a data.frame tabling the key results
J. Peter Marquardt
[table_predictors()], [blanket_redundancy_analysis()]
data <- survival::lung
models_to_run <- list(
  'OS' = list('outcome' = 'time', 'modality' = 'cox', 'event_censor' = 'status'),
  'weight_loss' = list('outcome' = 'wt.loss', 'modality' = 'linear', 'event_censor' = NA))
predictor_sets <- list('age' = c('age'), 'age_ecog' = c('age', 'ph.ecog'))
covariates <- c('sex')
bl_stats <- blanket_statsments(data, models_to_run, predictor_sets, covariates)
bl_redun <- blanket_redundancy_analysis(bl_stats, data)
table_blanket_redundancies(bl_redun)
Table the results of the models produced by blanket_statsments().
table_blanket_statsments(df, blanket_statsment_models)

| Argument | Description |
|---|---|
| df | data.frame containing the data set. |
| blanket_statsment_models | list of models produced by blanket_statsments() |

data.frame with tabled results
J. Peter Marquardt
[blanket_statsments()] for models and [table_predictors()] for tabling results
data <- survival::lung
models_to_run <- list(
  'OS' = list('outcome' = 'time', 'modality' = 'cox', 'event_censor' = 'status'),
  'weight_loss' = list('outcome' = 'wt.loss', 'modality' = 'linear', 'event_censor' = NA))
predictor_sets <- list('age' = c('age'), 'age_ecog' = c('age', 'ph.ecog'))
covariates <- c('sex')
bl_stats <- blanket_statsments(data, models_to_run, predictor_sets, covariates)
tbl <- table_blanket_statsments(data, bl_stats)
Extract coefficients and p-values only for regression models and table them
table_predictors(df, model, predictors)

| Argument | Description |
|---|---|
| df | data.frame containing the data set. If evaluating independently, use the test set. |
| model | statistical model to be evaluated. |
| predictors | vector of characters designating columns of interest. Non-specified independent variables will not be included. |

data.frame with coefficients and p-values for predictor variables
J. Peter Marquardt
data <- survival::lung
mod <- build_reg_model(data, 'age', 'sex')
tbl <- table_predictors(data, mod, 'sex')
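To show what restricting a coefficient table to the predictors of interest can look like, the sketch below pulls estimates and p-values from `summary()` of a plain `lm()` fit and keeps only the named predictors. The column names of the resulting data.frame are assumptions for this example and need not match the output of `table_predictors()`.

```r
lung <- survival::lung
fit <- lm(age ~ sex + ph.ecog, data = lung)

# Coefficient matrix with estimates, standard errors, t values and p-values
coefs <- summary(fit)$coefficients

# Keep only the predictors of interest; ph.ecog is treated as a covariate
predictors <- c("sex")
tbl <- data.frame(
  predictor = predictors,
  estimate  = coefs[predictors, "Estimate"],
  p_value   = coefs[predictors, "Pr(>|t|)"],
  row.names = NULL
)
print(tbl)
```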