ASTUTE R package
ASTUTE is an R package designed to integrate cancer genomic and transcriptomic data in order to perform genotype-phenotype mapping. It leverages regularized regression with a LASSO penalty to uncover associations between somatic mutations and gene expression profiles.
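To make the idea of LASSO-penalized regression on mutation data more concrete, here is a minimal sketch using the glmnet package on simulated data. It is not ASTUTE's internal implementation: the simulated matrix, the mutation names, the single simulated gene, and the use of the cross-validated `lambda.min` are all assumptions made for illustration.

```r
# Illustrative sketch only, on simulated data; not ASTUTE's internal code.
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)
  n_samples <- 60
  n_mutations <- 8

  # Hypothetical binary mutation matrix: rows are samples, columns are mutations.
  mutations <- matrix(
    rbinom(n_samples * n_mutations, size = 1, prob = 0.3),
    nrow = n_samples,
    dimnames = list(paste0("S", seq_len(n_samples)),
                    paste0("MUT", seq_len(n_mutations)))
  )
  storage.mode(mutations) <- "double"

  # Simulated expression of one gene: only MUT1 shifts its level.
  gene_expression <- 5 + 2 * mutations[, "MUT1"] + rnorm(n_samples, sd = 0.3)

  # alpha = 1 selects the LASSO penalty; lambda is chosen by cross-validation.
  cv_fit <- glmnet::cv.glmnet(x = mutations, y = gene_expression, alpha = 1)

  # Coefficients at the selected lambda; mutations without an effect are
  # expected to be shrunk towards (or exactly to) zero.
  lasso_coef <- as.matrix(coef(cv_fit, s = "lambda.min"))
  print(round(lasso_coef, 2))
}
```

The point of the penalty is sparsity: among many candidate mutations, only those with a clear association to expression should keep a nonzero coefficient.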
In its basic implementation, ASTUTE requires two main inputs: (1) a binary matrix where rows are patients (i.e., samples) and columns are mutations, in which each cell is 1 if the corresponding mutation was observed in the sample and 0 otherwise; (2) a matrix of log2(x+1)-transformed normalized expression values for the same patients.
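As a rough illustration of these two inputs, the sketch below builds a toy binary alterations matrix and a matching log2(x+1)-transformed expression matrix in base R. The patient, mutation, and gene names and all values are made up, and the expression matrix is assumed to be arranged with patients in rows like the alterations matrix.

```r
# Toy inputs in the shape described above; all names and values are hypothetical.
samples <- paste0("patient_", 1:4)

# Binary alterations matrix: rows are patients, columns are mutations.
alterations <- matrix(
  c(1, 0, 0,
    0, 1, 0,
    1, 1, 0,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(samples, c("MUT_A", "MUT_B", "MUT_C"))
)

# Hypothetical normalized expression values for the same patients, same order.
normalized_expression <- matrix(
  c( 0,  15,
     3, 255,
     7,  63,
    31,   1),
  nrow = 4, byrow = TRUE,
  dimnames = list(samples, c("GENE_A", "GENE_B"))
)

# Apply the log2(x + 1) transformation expected for the expression input.
expression <- log2(normalized_expression + 1)

# Basic sanity checks before running any analysis.
stopifnot(all(alterations %in% c(0, 1)))
stopifnot(identical(rownames(alterations), rownames(expression)))
print(expression)
```

Checking that both matrices share the same patients in the same order is a cheap way to catch misaligned inputs early.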
In this vignette, we give an overview of the package by presenting some of its main functions.
The ASTUTE package can be installed from GitHub using the R package devtools as follows.
library("devtools")
install_github("ramazzottilab/ASTUTE", ref = 'master')
The package includes an example dataset with alterations and expression data for a set of selected genes from 50 lung adenocarcinoma samples from the Cancer Genome Atlas Research Network, “Comprehensive molecular profiling of lung adenocarcinoma.” Nature 511, no. 7511 (2014): 543.
ASTUTE performs genotype-phenotype mapping by associating somatic mutations to gene expression profiles.
set.seed(12345)
resExample <- ASTUTE( alterations = datasetExample$alterations, 
                      expression = datasetExample$expression, 
                      regularization = TRUE, 
                      nboot = NA, 
                      num_processes = NA, 
                      verbose = FALSE )
print(names(resExample))
## [1] "input_data"   "inference"    "parameters"   "goodness_fit" "fold_changes"
## [6] "pvalues"      "qvalues"
The output of this analysis is a list of 7 elements: (1) input_data: list providing the input data (i.e., alterations and expression data); (2) inference: results of the inference by bootstrap (i.e., alpha alterations matrix, beta matrix, and intercept estimates); (3) parameters: list with the parameters used for the inference (i.e., regularization TRUE/FALSE and nboot); (4) goodness_fit: goodness of fit estimated as the cosine similarity between observations and predictions; (5) fold_changes: log2 fold change estimates; (6) pvalues: p-value estimates; (7) qvalues: p-value estimates corrected for false discovery.
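The goodness-of-fit measure mentioned above, cosine similarity between observations and predictions, can be sketched in base R as follows. The helper `cosine_similarity` and the example vectors are hypothetical and written here only for illustration; they are not part of the ASTUTE API.

```r
# Cosine similarity between two numeric vectors of equal length.
# Values near 1 mean the vectors point in the same direction;
# 0 means they are orthogonal. Hypothetical helper, not an ASTUTE function.
cosine_similarity <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  sum(observed * predicted) /
    (sqrt(sum(observed^2)) * sqrt(sum(predicted^2)))
}

# Made-up observed and predicted expression values for one gene.
observed  <- c(2.0, 4.0, 6.0, 8.0)
predicted <- c(1.0, 2.0, 3.0, 4.0)

# Predictions proportional to the observations give a similarity close to 1.
fit_proportional <- cosine_similarity(observed, predicted)

# Orthogonal vectors give a similarity of 0.
fit_orthogonal <- cosine_similarity(c(1, 0), c(0, 1))

print(c(proportional = fit_proportional, orthogonal = fit_orthogonal))
```

Note that cosine similarity is insensitive to overall scale, so it measures agreement in the pattern of predictions rather than in their absolute magnitude.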
In the example provided above, we did not perform bootstrap (nboot = NA), so no p-value or q-value estimates are provided.
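For runs that do produce p-values, the relationship between p-values and false-discovery-corrected q-values can be illustrated with base R's `p.adjust`. The Benjamini-Hochberg method is an assumption here, since the vignette does not state which correction ASTUTE applies, and the p-values are made up.

```r
# Made-up p-values for five hypothetical mutation-gene associations.
p_values <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Benjamini-Hochberg correction (assumed; the method used by ASTUTE is not
# stated in this vignette).
q_values <- p.adjust(p_values, method = "BH")

# Side-by-side view of raw and corrected values.
print(data.frame(p_value = p_values, q_value = round(q_values, 4)))

# Associations passing a 5% false discovery threshold.
significant <- q_values <= 0.05
print(significant)
```

Corrected values are never smaller than the raw p-values, so fewer associations pass a given threshold after correction when many tests are performed.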
## R Under development (unstable) (2025-02-24 r87814)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ASTUTE_1.2.0     BiocStyle_2.35.0
## 
## loaded via a namespace (and not attached):
##  [1] lsa_0.73.3          cli_3.6.4           knitr_1.50         
##  [4] rlang_1.1.5         xfun_0.51           textshaping_1.0.0  
##  [7] jsonlite_1.9.1      SnowballC_0.7.1     htmltools_0.5.8.1  
## [10] ragg_1.3.3          sass_0.4.9          glmnet_4.1-8       
## [13] rmarkdown_2.29      grid_4.5.0          evaluate_1.0.3     
## [16] jquerylib_0.1.4     fastmap_1.2.0       foreach_1.5.2      
## [19] yaml_2.3.10         lifecycle_1.0.4     bookdown_0.42      
## [22] BiocManager_1.30.25 compiler_4.5.0      codetools_0.2-20   
## [25] fs_1.6.5            Rcpp_1.0.14         htmlwidgets_1.6.4  
## [28] lattice_0.22-6      systemfonts_1.2.1   digest_0.6.37      
## [31] R6_2.6.1            parallel_4.5.0      splines_4.5.0      
## [34] shape_1.4.6.1       Matrix_1.7-3        bslib_0.9.0        
## [37] tools_4.5.0         iterators_1.0.14    survival_3.8-3     
## [40] pkgdown_2.1.1.9000  cachem_1.1.0        desc_1.4.3