*(for student mentee)
+(for equal contributions, co-corresponding author)
Preprints (Selected)
Liu, Limeng*, Wang, GuanNan & Safo, Sandra E. (2024). Multivariate Sparse Functional Linear Discriminant Analysis: An Application to Inflammatory Bowel Disease Classification
J. Butts*, C. Wendt, R. Bowler, C.P. Hersh, Q. Long, L. Eberly, S. E. Safo. (2023). Extensions of Heterogeneity in Integration and Prediction (HIP) with R Shiny Application. arXiv preprint
Domonique W. Hodge, Sandra E. Safo, Qi Long, "Multiple imputation using dimension reduction techniques for high-dimensional data", arXiv:1905.05274
Peer-Reviewed (Selected)
Safo, S.E., & Lu, H*. Scalable Randomized Kernel-Based Methods for Multiview Data Integration and Prediction with Application to COVID-19. Accepted for publication in Biostatistics, 2024. RandMVLearn
Jessica Butts*, Leif Verace, Christine Wendt, Russel P Bowler, Craig P Hersh, Qi Long, Lynn Eberly, Sandra E Safo, HIP: a method for high-dimensional multi-view data integration and prediction accounting for subgroup heterogeneity, Briefings in Bioinformatics, Volume 25, Issue 6, November 2024, bbae470, https://doi.org/10.1093/bib/bbae470 HIP (Arxiv)
Sarthak Jain*, Sandra E Safo, DeepIDA-GRU: a deep learning pipeline for integrative discriminant analysis of cross-sectional and longitudinal multiview data with applications to inflammatory bowel disease classification, Briefings in Bioinformatics, Volume 25, Issue 4, July 2024, bbae339, https://doi.org/10.1093/bib/bbae339
Jiuzhou Wang*, Sandra E Safo, Deep IDA: a deep learning approach for integrative discriminant analysis of multi-omics data with feature ranking—an application to COVID-19, Bioinformatics Advances, Volume 4, Issue 1, 2024, vbae060, https://doi.org/10.1093/bioadv/vbae060
Islam, Jessica Y., Eric Hurwitz, Dongmei Li, Marlene Camacho-Rivera, Jing Sun, Sandra Safo, Jennifer M. Ross et al. "Associations of county-level social determinants of health with COVID-19 related hospitalization among people with HIV: a retrospective analysis of the US National COVID Cohort Collaborative (N3C)." AIDS and Behavior (2024): 1-13.
Dimple Vaidya, Kenneth J. Wilkins, Eric Hurwitz, Jessica Y. Islam, Dongmei Li, Jing Sun, Sandra E. Safo, Jennifer M. Ross, Shukri Hassan, Elaine Hill, Bohdan Nosyk, Cara D. Varley, Nada Fadul, Marlene Camacho-Rivera, Charisse Madlock-Brown, Rena C. Patel (2024) Assessing associations between individual-level social determinants of health and COVID-19 hospitalizations: investigating racial/ethnic disparities among people living with HIV in the U.S. National COVID Cohort Collaborative (N3C), Journal of Clinical and Translational Science
Seth D König, Sandra Safo, Kai Miller, Alexander B. Herman, David P. Darrow (2024) Flexible multi-step hypothesis testing of human ECoG data using cluster-based permutation tests with GLMEs, NeuroImage
Hengkang Wang*, Han Lu*, Ju Sun, Sandra E Safo (2024) Interpretable Deep Learning Methods for Multiview Learning iDeepViewLearn. BMC Bioinformatics.
Palzer, E. F., & Safo, S. E. (2024). mvlearnR and Shiny App for multiview learning. arXiv preprint arXiv:2311.16181. Bioinformatics Advances.
Kunz, M*., Rott, K*., Hurwitz, E., Kunisaki, K., Islam, J., Sun, J., . . . Patel, R., & Safo, S. E. (2024) The Intersections of COVID-19, HIV, and race/ethnicity: Machine Learning Methods to Identify and Model Risk Factors for Severe COVID-19 in a Large U.S. National Dataset. AIDS and Behavior
Castro-Pearson*, S., Samorodnitsky, S*., Yang, K*., Lotfi-Emran, S., Ingraham, N. E., Bramante, C., ... , Safo, S.E., and Tignanelli, C. J. (2023). Development of a proteomic signature associated with severe disease for patients with COVID-19 using data from 5 multicenter, randomized, controlled, and prospective studies. Scientific Reports, 13(1), 20315.
Yang, K., Kang, Z., Guan, W., Lotfi-Emran, S., Mayer, Z. J., Guerrero, C. R., ... & Safo, S.E. (2023). Developing A Baseline Metabolomic Signature Associated with COVID-19 Severity: Insights from Prospective Trials Encompassing 13 US Centers. Metabolites, 13(11), 1107.
Safo, S., Haine, L.*, Baker, J., Reilly, C., Duprez, D., Neaton, J., . . . Staub, T. Derivation of a protein risk score for cardiovascular disease for a multiethnic HIV+ cohort. Journal of the American Heart Association (JAHA)
Lipman, D.*, Safo, S.+, & Chekouo, T. Integrative multi-omics approach for identifying molecular signatures and pathways and deriving and validating molecular scores for COVID-19 severity and status. BMC Genomics
W. Zhang*, C. Wendt, R. Bowler, C. P. Hersh, and S. E. Safo. Robust Integrative Biclustering for Multi-view Data. 2022 iSSVD (Arxiv) Statistical Methods in Medical Research
Palzer EF*, Wendt C, Bowler R, Hersh CP, Safo SE, Lock EF. sJIVE: "Supervised Joint and Individual Variation Explained", 2022 Computational Statistics and Data Analysis
Haileab Hilafu and Sandra E. Safo, "Sparse sliced inverse regression for high dimensional data analysis", 2022 BMC Bioinformatics
Danika Lipman*, Sandra E. Safo+, Thierry Chekouo, "Multi-omic analysis reveals enriched pathways associated with COVID-19 and COVID-19 severity", 2022 PLOS ONE
Thierry Chekouo and Sandra E. Safo+, "Bayesian Integrative Analysis and Prediction with Application to Atherosclerosis Cardiovascular Disease", BIPNet, 2021, Biostatistics
Sandra E. Safo, Eun Jeong Min, and Lillian Haine*, "Sparse Linear Discriminant Analysis for Multiview Structured Data", SIDA and SIDANet, 2021, Biometrics
Groene EA*, Valeris-Chacin RJ*, Stadelman AM*, Safo SE, Cusick SE. Maternal HIV and child anthropometric outcomes over time: an analysis of Zimbabwe demographic health surveys. AIDS. 2021 Mar 1;35(3):477-484. doi: 10.1097/QAD.0000000000002772. PubMed PMID: 33252491; PubMed Central PMCID: PMC7855570.
Haileab Tesfe Hilafu, Sandra E. Safo, and Lillian Haine, "Sparse reduced-rank regression for integrating omics data", BMC Bioinformatics, 2020
Staimez, Lisa & Rhee, M.K. & Deng, Y.* & Safo, Sandra & Butler, S.M. & Legvold, B. & Jackson, Sandra & Ford, C.N. & Wilson, P.W.F. & Long, Q. & Phillips, L.S. (2020), "Retinopathy develops at similar glucose levels but higher HbA1c levels in people with black African ancestry compared to white European ancestry: evidence for the need to individualize HbA1c interpretation", Diabetic Medicine
Min E J, Safo SE, Long Q. "Penalized Co-inertia analysis with applications to -omics data", Bioinformatics, 2019 Mar 15;35(6):1018-1025. doi: 10.1093/bioinformatics/bty726
Sandra E. Safo, Jeongyoun Ahn, Yongho Jeon, and Sungkyu Jung, "Sparse Generalized Eigenvalue Problem with Application to Canonical Correlation Analysis for Integrative Analysis of Methylation and Gene Expression Data", Biometrics
Safo SE, Long Q, "Sparse linear discriminant analysis in structured covariates space". Statistical Analysis and Data Mining: The ASA Data Science Journal 2018;1–14. https://doi.org/10.1002/sam.11376
Luiz Gustavo Gardinassi, Jianguo Xia, Sandra E Safo, Shuzhao Li, "Bioinformatics Tools for the Interpretation of Metabolomics Data", Current Pharmacology Reports, 2017, DOI 10.1007/s40495-017-0107-0
Z. Li*, S. E. Safo, and Q. Long, “Incorporating biological information in sparse principal component analysis with application to genomic data”, BMC Bioinformatics, 2017.
Mary K. Rhee, Sandra E. Safo, Sandra L. Jackson, Weingiong Xue, Qi Long, Darin E. Olson, Diana Barb, J. Sonya Haw, Anne M. Tomolo, and Lawrence S. Phillips. Inpatient Glucose Values: Determination of Normal and Utility in Opportunistic Diabetes Screening. American Journal of Preventive Medicine, epub ahead of print, 2017
Jackson SL, Staimez LR, Safo S, Long Q, Rhee MK, Cunningham SA, Olson DE, Tomolo AM, Ramakrishnan U, Narayan VKM, Phillips LS. J Diabetes Complications. 2017 Sep;31(9):1430-1436. doi: 10.1016/j.jdiacomp.2017.06.001. Epub 2017 Jun 6.
S. Safo, S. Li, and Q. Long, “Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information,” Biometrics, in press, 2017.
Sandra L Jackson, Lisa Staimez, Sandra E. Safo, Qi Long, Mary K Rhee, Solveig A Cunningham, Darin E Olson, Anne M Tomolo, Usha Ramakrishnan, KM Venkat Narayan, Lawrence S Phillips (2015+) Participation in a National VA Lifestyle Change Program is Associated with Improved Diabetes Control.
S. L. Jackson, S. Safo, L. R. Staimez, Q. Long, M. K. Rhee, S. A. Cunningham, D. E. Olson, A. M. Tomolo, U. Ramakrishnan, K. V. Narayan, and others, “Reduced cardiovascular disease incidence with a national lifestyle change program,” American Journal of Preventive Medicine, vol. 52, iss. 4, pp. 459-468, 2017.
S. Jackson, S. Safo, L. Staimez, D. Olson, K. Narayan, Q. Long, J. Lipscomb, M. Rhee, P. Wilson, A. Tomolo, and others, “Glucose challenge test screening for prediabetes and early diabetes,” Diabetic Medicine: A Journal of the British Diabetic Association, vol. 34, iss. 5, pp. 716-724, 2017.
Sandra E. Safo and Jeongyoun Ahn (2016). General Sparse Multi-class Linear Discriminant Analysis. Computational Statistics and Data Analysis, 66, 81-90 http://www.sciencedirect.com/science/article/pii/S0167947316000207
Sandra Safo, Xiao Song and Kevin K. Dobbin (2015). Sample size determination for training cancer classifiers from microarray and RNA-seq data. Annals of Applied Statistics 9 (2) 1053-1075 http://arxiv.org/abs/1509.04897