Run the SWAG algorithm on generalized linear models (GLMs) specified by a family object, fitting each model with the fastglm package.

swaglm(
  X,
  y,
  p_max = 2L,
  family = NULL,
  method = 0L,
  alpha = 0.3,
  verbose = FALSE,
  seed = 123L
)

Arguments

X

A numeric matrix of predictors.

y

A numeric vector of responses.

p_max

An integer specifying the maximum model dimension (number of predictors in a model) to explore.

family

A family object, such as stats::binomial(). When left as NULL (the default in the usage above), the binomial family is used.

method

An integer scalar with value 0 for the column-pivoted QR decomposition, 1 for the unpivoted QR decomposition, 2 for the LLT Cholesky, or 3 for the LDLT Cholesky. See ?fastglm::fastglm.

alpha

A double specifying the quantile of the selection criterion. At each dimension, the models selected by this quantile are used to construct the models explored at the next dimension.

verbose

A boolean controlling whether progress messages are printed as each dimension is completed.

seed

An integer giving the random seed used when creating the set of models to explore at the next dimension.
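
The roles of alpha and seed can be illustrated with a small base-R sketch of one step from dimension 2 to dimension 3. Everything in it is an assumption made for illustration: the AIC values, the screened variables, the cap m_max, the rule of keeping models whose criterion is at or below the alpha-quantile, and the way candidates are extended are all hypothetical, and the actual construction inside swaglm may differ.

```r
set.seed(123)  # plays the role of the seed argument

alpha <- 0.3        # selection quantile
screened <- 1:5     # hypothetical variables kept by the screening step
m_max <- 5L         # hypothetical cap on the number of models per dimension

# All models of dimension 2 built from the screened variables (one row each)
models_d2 <- t(utils::combn(screened, 2))
# Hypothetical AIC values, one per row of models_d2
aic_d2 <- c(210, 250, 190, 260, 240, 205, 270, 230, 280, 290)

# Keep the models whose AIC is at or below the alpha-quantile (assumed rule)
cutoff <- stats::quantile(aic_d2, probs = alpha)
idx_selected <- which(aic_d2 <= cutoff)
selected <- models_d2[idx_selected, , drop = FALSE]

# Extend each selected model by one screened variable it does not yet contain
candidates <- list()
for (i in seq_len(nrow(selected))) {
  for (v in setdiff(screened, selected[i, ])) {
    candidates[[length(candidates) + 1L]] <- sort(c(selected[i, ], v))
  }
}
# Drop duplicated variable combinations
candidates <- unique(do.call(rbind, candidates))

# If too many candidates remain, draw a random subset (this is where the seed matters)
if (nrow(candidates) > m_max) {
  keep <- sort(sample(nrow(candidates), m_max))
  candidates <- candidates[keep, , drop = FALSE]
}
candidates
```

In this sketch, a smaller alpha keeps fewer models at each dimension and therefore produces fewer candidates at the next one, while the seed only matters once the candidate set has to be subsampled.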

Value

An object of class swaglm, structured as a list containing:

  • lst_estimated_beta: A list containing the estimated coefficients of each fitted model. Each entry (one per dimension) is a matrix in which each row holds the estimated coefficients of one model.

  • lst_p_value: A list containing the p-values associated with the estimated coefficients of each fitted model. Each entry (one per dimension) is a matrix in which each row holds the p-values of one model.

  • lst_AIC: A list containing the AIC values of each model at each dimension. Each entry corresponds to the AIC values of the models explored at that dimension.

  • lst_var_mat: A list in which each entry is a matrix whose rows each specify the combination of variables that composes one model.

  • lst_selected_models: A list containing the selected models at each dimension.

  • lst_index_selected_models: A list containing the row indices of the selected models at each dimension.

  • vec_selected_variables_dimension_1: A vector containing the indices of the variables selected at the screening step.

  • y: The response vector used in the estimation.

  • X: The predictor matrix used in the estimation.

  • p_max: The maximum dimension explored by the algorithm.

  • alpha: The selection quantile used at each step.

  • family: The GLM family used in the estimation (e.g. binomial()).

  • method: The method used by fastglm for estimation.
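
As an illustration of how these fields fit together, the sketch below finds the lowest-AIC model across all dimensions. It runs on a small mock list that imitates only lst_AIC and lst_var_mat; the object name mock and all values in it are hypothetical, and on a real result the same indexing would be applied to the object returned by swaglm().

```r
# Mock object imitating two of the documented fields (hypothetical values)
mock <- list(
  lst_AIC = list(
    matrix(c(2337, 2487, 2603), ncol = 1),  # dimension 1: three models
    matrix(c(2170, 2300), ncol = 1)         # dimension 2: two models
  ),
  lst_var_mat = list(
    matrix(1:3, ncol = 1),                  # models {1}, {2}, {3}
    matrix(c(1L, 1L, 2L, 3L), ncol = 2)     # models {1, 2}, {1, 3}
  )
)

# Smallest AIC reached at each dimension
min_aic <- vapply(mock$lst_AIC, min, numeric(1))

# Dimension and row of the overall best model
best_dim <- which.min(min_aic)
best_row <- which.min(mock$lst_AIC[[best_dim]])

# Variables composing that model (row of the matching lst_var_mat entry)
best_vars <- mock$lst_var_mat[[best_dim]][best_row, ]
best_vars
```

The key point is that row i of lst_AIC[[d]] and row i of lst_var_mat[[d]] describe the same model, so an index computed on one can be used on the other.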

Examples

# Parameters for data generation
set.seed(12345)
n <- 2000
p <- 100

# Create the design matrix and the vector of coefficients
Sigma <- diag(rep(1 / p, p))
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma)
beta <- c(-15, -10, 5, 10, 15, rep(0, p - 5))

# Generate responses from a logistic regression with an intercept of 1
z <- 1 + X %*% beta
pr <- 1 / (1 + exp(-z))
y <- as.factor(rbinom(n, 1, pr))
y <- as.numeric(y) - 1  # convert the factor back to numeric 0/1

# Define the SWAG parameters; p_max is an upper bound (the run below stops at
# dimension 15, the number of variables kept by the screening step)
quantile_alpha <- 0.15
p_max <- 20
swag_obj <- swaglm::swaglm(
  X = X, y = y, p_max = p_max, family = stats::binomial(),
  alpha = quantile_alpha, verbose = TRUE, seed = 123
)
#> Completed models of dimension 1
#> Completed models of dimension 2
#> Completed models of dimension 3
#> Completed models of dimension 4
#> Completed models of dimension 5
#> Completed models of dimension 6
#> Completed models of dimension 7
#> Completed models of dimension 8
#> Completed models of dimension 9
#> Completed models of dimension 10
#> Completed models of dimension 11
#> Completed models of dimension 12
#> Completed models of dimension 13
#> Completed models of dimension 14
#> Completed models of dimension 15
str(swag_obj)
#> List of 13
#>  $ lst_estimated_beta                :List of 15
#>   ..$ : num [1:100, 1:2] 0.654 0.583 0.552 0.572 0.617 ...
#>   ..$ : num [1:105, 1:3] 0.72 0.669 0.699 0.775 0.648 ...
#>   ..$ : num [1:105, 1:4] 0.654 0.766 0.67 0.714 0.663 ...
#>   ..$ : num [1:105, 1:5] 0.715 0.713 0.806 0.864 0.78 ...
#>   ..$ : num [1:105, 1:6] 0.848 0.853 0.899 0.952 0.864 ...
#>   ..$ : num [1:102, 1:7] 1.02 1.01 1.02 1.01 1.02 ...
#>   ..$ : num [1:84, 1:8] 1.01 1.01 1.01 1.02 1.01 ...
#>   ..$ : num [1:60, 1:9] 1.01 1.01 1.01 1.01 1.02 ...
#>   ..$ : num [1:43, 1:10] 1.01 1.01 1.01 1.01 1.01 ...
#>   ..$ : num [1:21, 1:11] 1.01 1.01 1.01 1.01 1.01 ...
#>   ..$ : num [1:16, 1:12] 1.01 1.01 1.01 1.01 1.01 ...
#>   ..$ : num [1:9, 1:13] 1.01 1.01 1.01 1.01 1.01 ...
#>   ..$ : num [1:5, 1:14] 1.01 1.01 1.01 1.01 1.01 ...
#>   ..$ : num [1:2, 1:15] 1.01 1.01 -15.38 -15.37 -10.85 ...
#>   ..$ : num [1, 1:16] 1.01 -15.37 -10.84 5.83 10.45 ...
#>  $ lst_p_value                       :List of 15
#>   ..$ : num [1:100, 1:2] 3.63e-37 4.11e-33 4.81e-32 3.44e-32 3.02e-34 ...
#>   ..$ : num [1:105, 1:3] 1.19e-39 9.13e-38 5.09e-38 5.33e-41 2.44e-36 ...
#>   ..$ : num [1:105, 1:4] 5.56e-37 6.11e-40 8.41e-38 8.03e-39 2.43e-35 ...
#>   ..$ : num [1:105, 1:5] 8.26e-39 2.83e-36 4.00e-42 1.82e-43 1.15e-40 ...
#>   ..$ : num [1:105, 1:6] 4.79e-42 2.24e-42 1.75e-44 8.96e-44 1.80e-43 ...
#>   ..$ : num [1:102, 1:7] 3.25e-46 3.76e-45 4.09e-46 9.00e-46 5.29e-46 ...
#>   ..$ : num [1:84, 1:8] 3.26e-45 2.42e-45 6.91e-45 6.47e-46 8.55e-46 ...
#>   ..$ : num [1:60, 1:9] 4.66e-45 6.27e-45 5.89e-45 2.17e-45 6.17e-46 ...
#>   ..$ : num [1:43, 1:10] 4.23e-45 4.08e-45 5.19e-45 5.75e-45 5.14e-45 ...
#>   ..$ : num [1:21, 1:11] 4.72e-45 4.49e-45 5.71e-45 6.33e-45 5.16e-45 ...
#>   ..$ : num [1:16, 1:12] 5.16e-45 4.93e-45 4.20e-45 5.81e-45 4.11e-45 ...
#>   ..$ : num [1:9, 1:13] 4.54e-45 5.20e-45 5.02e-45 4.20e-45 4.13e-45 ...
#>   ..$ : num [1:5, 1:14] 5.54e-45 4.56e-45 5.05e-45 4.57e-45 5.05e-45 ...
#>   ..$ : num [1:2, 1:15] 5.57e-45 5.60e-45 8.30e-69 9.53e-69 2.39e-41 ...
#>   ..$ : num [1, 1:16] 5.63e-45 1.05e-68 2.63e-41 3.49e-17 3.91e-42 ...
#>  $ lst_AIC                           :List of 15
#>   ..$ : num [1:100, 1] 2337 2487 2603 2494 2361 ...
#>   ..$ : num [1:105, 1] 2170 2300 2171 2006 2327 ...
#>   ..$ : num [1:105, 1] 2335 1996 2299 2163 2199 ...
#>   ..$ : num [1:105, 1] 2162 2023 1953 1816 1985 ...
#>   ..$ : num [1:105, 1] 1800 1811 1754 1577 1818 ...
#>   ..$ : num [1:102, 1] 1504 1500 1507 1503 1508 ...
#>   ..$ : num [1:84, 1] 1501 1497 1497 1499 1503 ...
#>   ..$ : num [1:60, 1] 1494 1497 1497 1497 1500 ...
#>   ..$ : num [1:43, 1] 1494 1494 1495 1495 1496 ...
#>   ..$ : num [1:21, 1] 1496 1496 1497 1497 1498 ...
#>   ..$ : num [1:16, 1] 1497 1497 1496 1498 1496 ...
#>   ..$ : num [1:9, 1] 1498 1498 1498 1498 1498 ...
#>   ..$ : num [1:5, 1] 1500 1500 1500 1500 1500
#>   ..$ : num [1:2, 1] 1501 1501
#>   ..$ : num [1, 1] 1503
#>  $ lst_var_mat                       :List of 15
#>   ..$ : int [1:100, 1] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ : int [1:105, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:105, 1:3] 1 1 1 1 2 1 2 1 3 2 ...
#>   ..$ : int [1:105, 1:4] 1 2 1 1 1 2 2 1 1 1 ...
#>   ..$ : int [1:105, 1:5] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:102, 1:6] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:84, 1:7] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:60, 1:8] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:43, 1:9] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:21, 1:10] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:16, 1:11] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:9, 1:12] 1 1 1 1 1 1 1 1 1 2 ...
#>   ..$ : int [1:5, 1:13] 1 1 1 1 1 2 2 2 2 2 ...
#>   ..$ : int [1:2, 1:14] 1 1 2 2 3 3 4 4 5 5 ...
#>   ..$ : int [1, 1:15] 1 2 3 4 5 8 16 34 35 50 ...
#>  $ lst_selected_models               :List of 15
#>   ..$ : int [1:15, 1] 1 2 3 4 5 8 16 34 35 50 ...
#>   ..$ : int [1:16, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:16, 1:3] 1 1 2 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:16, 1:4] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:16, 1:5] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:16, 1:6] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:13, 1:7] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ : int [1:9, 1:8] 1 1 1 1 1 1 1 1 1 2 ...
#>   ..$ : int [1:7, 1:9] 1 1 1 1 1 1 1 2 2 2 ...
#>   ..$ : int [1:4, 1:10] 1 1 1 1 2 2 2 2 3 3 ...
#>   ..$ : int [1:3, 1:11] 1 1 1 2 2 2 3 3 3 4 ...
#>   ..$ : int [1:2, 1:12] 1 1 2 2 3 3 4 4 5 5 ...
#>   ..$ : int [1, 1:13] 1 2 3 4 5 8 16 34 35 50 ...
#>   ..$ : int [1, 1:14] 1 2 3 4 5 8 16 34 35 50 ...
#>   ..$ : int [1, 1:15] 1 2 3 4 5 8 16 34 35 50 ...
#>  $ lst_index_selected_models         :List of 15
#>   ..$ : int [1:15, 1] 1 2 3 4 5 8 16 34 35 50 ...
#>   ..$ : int [1:16, 1] 1 2 3 4 5 6 8 9 11 12 ...
#>   ..$ : int [1:16, 1] 2 4 11 19 27 30 36 40 53 56 ...
#>   ..$ : int [1:16, 1] 4 21 36 37 53 63 71 72 74 78 ...
#>   ..$ : int [1:16, 1] 4 15 27 30 43 53 57 61 65 71 ...
#>   ..$ : int [1:16, 1] 1 2 3 4 5 6 7 8 9 11 ...
#>   ..$ : int [1:13, 1] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ : int [1:9, 1] 1 2 3 4 7 8 10 25 56
#>   ..$ : int [1:7, 1] 1 2 3 4 5 24 35
#>   ..$ : int [1:4, 1] 7 8 12 13
#>   ..$ : int [1:3, 1] 3 5 11
#>   ..$ : int [1:2, 1] 1 3
#>   ..$ : int [1, 1] 1
#>   ..$ : int [1, 1] 2
#>   ..$ : int [1, 1] 1
#>  $ vec_selected_variables_dimension_1: int [1:15, 1] 1 2 3 4 5 8 16 34 35 50 ...
#>  $ y                                 : num [1:2000, 1] 0 1 0 1 1 1 1 0 0 0 ...
#>  $ X                                 : num [1:2000, 1:100] 0.0679 -0.0418 0.1332 0.0199 -0.0145 ...
#>  $ p_max                             : int 20
#>  $ alpha                             : num 0.15
#>  $ family                            :List of 13
#>   ..$ family    : chr "binomial"
#>   ..$ link      : chr "logit"
#>   ..$ linkfun   :function (mu)  
#>   ..$ linkinv   :function (eta)  
#>   ..$ variance  :function (mu)  
#>   ..$ dev.resids:function (y, mu, wt)  
#>   ..$ aic       :function (y, n, mu, wt, dev)  
#>   ..$ mu.eta    :function (eta)  
#>   ..$ initialize: language {     if (NCOL(y) == 1) { ...
#>   ..$ validmu   :function (mu)  
#>   ..$ valideta  :function (eta)  
#>   ..$ simulate  :function (object, nsim)  
#>   ..$ dispersion: num 1
#>   ..- attr(*, "class")= chr "family"
#>  $ method                            : int 0
#>  - attr(*, "class")= chr "swaglm"
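
Each model explored above is a GLM fitted with fastglm. To look at one model in isolation, a direct call along the following lines may help. This is a sketch on freshly simulated data (not the data above); it assumes the fastglm package is installed, and the explicit intercept column reflects that fastglm takes a model matrix rather than a formula. Whether swaglm builds its design matrices in exactly this way is an assumption.

```r
set.seed(1)
n <- 200

# Two simulated predictors and a binary response from a logistic model
X_small <- matrix(rnorm(n * 2), ncol = 2)
y_small <- rbinom(n, 1, plogis(0.5 + X_small[, 1] - X_small[, 2]))

# fastglm expects a model matrix, so the intercept column is added by hand
X_int <- cbind(1, X_small)

if (requireNamespace("fastglm", quietly = TRUE)) {
  # method = 0L selects the column-pivoted QR decomposition
  fit <- fastglm::fastglm(
    x = X_int, y = y_small,
    family = stats::binomial(), method = 0L
  )
  print(fit$coefficients)  # intercept followed by the two slopes
}
```

Changing method to 1L, 2L or 3L switches the decomposition as described for the method argument.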