This function takes a population of models and makes three plots: feature prevalence in the population, feature abundance by class, and feature prevalence by class.

Usage

analyzeImportanceFeatures(
  clf_res,
  X,
  y,
  makeplot = TRUE,
  name = "",
  verbose = TRUE,
  pdf.dims = c(width = 25, height = 20),
  filter.perc = 0.05,
  filter.cv.prev = 0.25,
  nb.top.features = 100,
  scaled.importance = FALSE,
  k_penalty = 0.75/100,
  k_max = 0
)

Arguments

clf_res:

the result of an experiment or of multiple experiments (a list of experiments)

X:

the dataset X on which abundance and prevalence are computed

y:

the target class

makeplot:

make a pdf file with the resulting plots (default:TRUE)

name:

the suffix of the pdf file (default:"")

verbose:

print out information (default:TRUE)

pdf.dims:

dimensions of the pdf object (default: c(width = 25, height = 20))

filter.perc:

filter features by their prevalence in the population, given as a proportion between 0 and 1 (default:0.05)

filter.cv.prev:

keep only features found in at least this proportion of the cross-validation experiments (default: 0.25, i.e. 25 percent)

nb.top.features:

the maximum number (default: 100) of most important features to be shown. If this value is NULL or NA, all features will be returned

scaled.importance:

the scaled importance is the importance multiplied by the prevalence in the folds. If TRUE (default: FALSE), the mean MDA is scaled by the prevalence of the feature in the folds, and the features are then ordered by this scaled value

k_penalty:

the sparsity penalty needed to select the best models of the population (default:0.75/100).

k_max:

select the best population below a given threshold. If 0 (default), no selection is performed.
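
The way filter.cv.prev, scaled.importance and nb.top.features are described above can be illustrated with a small base R sketch. This is not the package's implementation: the matrix `mda` (features in rows, cross-validation folds in columns, NA where a feature was not selected) and all of its values are hypothetical, and the steps only follow the wording of the argument descriptions.

```r
# Hypothetical toy data: one row per feature, one column per CV fold.
# NA means the feature was not selected in that fold (assumed convention).
mda <- matrix(
  c(0.10, 0.20,   NA, 0.10, 0.10,
    0.40, 0.40,   NA,   NA,   NA,
    0.90,   NA,   NA,   NA,   NA,
    0.05, 0.05, 0.05, 0.05, 0.05),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("f1", "f2", "f3", "f4"), paste0("fold", 1:5))
)

filter.cv.prev  <- 0.25  # same value as the documented default
nb.top.features <- 2     # small value chosen for the illustration

# Prevalence of each feature across folds (fraction of folds where it appears)
cv.prev <- rowMeans(!is.na(mda))

# Keep only features found in at least filter.cv.prev of the folds
keep <- cv.prev >= filter.cv.prev

# Mean importance over the folds where the feature is present
mean.mda <- rowMeans(mda[keep, , drop = FALSE], na.rm = TRUE)

# Scaled importance: mean importance multiplied by the fold prevalence
scaled <- mean.mda * cv.prev[keep]

# Order by decreasing scaled importance and keep the top features
top <- head(sort(scaled, decreasing = TRUE), nb.top.features)
print(top)
```

In this toy case f3 has the highest raw importance but appears in only one fold out of five, so it is dropped by the cross-validation prevalence filter before any ranking takes place.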

Value

the plots are returned if makeplot is FALSE; if makeplot is TRUE, they are written to a pdf file
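
The two by-class quantities named in the description can be sketched in base R as follows. The layout of `X` (features in rows, samples in columns), the class coding in `y`, and the definition of prevalence as the fraction of samples with a non-zero value are all assumptions made for this illustration; the function itself may compute them differently.

```r
# Hypothetical toy data: features in rows, samples in columns (assumed layout)
X <- matrix(
  c(0, 2, 0, 4,
    1, 1, 3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("f1", "f2"), paste0("s", 1:4))
)
y <- c(-1, -1, 1, 1)  # hypothetical class labels, one per sample

# Sample indices grouped by class
idx.by.class <- split(seq_along(y), y)

# Prevalence by class: fraction of samples in each class with a non-zero value
# (assumed definition of prevalence)
prev.by.class <- sapply(idx.by.class, function(idx) {
  rowMeans(X[, idx, drop = FALSE] > 0)
})

# Abundance by class: mean value of each feature within each class
abund.by.class <- sapply(idx.by.class, function(idx) {
  rowMeans(X[, idx, drop = FALSE])
})

print(prev.by.class)
print(abund.by.class)
```

Both results are matrices with one row per feature and one column per class, which is a convenient shape for side-by-side bar plots.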