Ying Ding

PhD
  • Professor, Associate Dean for Graduate Academic Affairs
  • Faculty in Biostatistics
  • Faculty Senate President

My methodological research focuses on (1) semiparametric theory and methods for complex time-to-event data, (2) subgroup identification/inference and individualized treatment effect estimation in precision medicine and public health, (3) multi-omics data analysis and integration, and (4) deep learning and causal inference. My collaborative research has a broad scope. I first established collaborations in cancer studies. Since then, I have been actively working on neuropsychiatric disorders and aging-related research (e.g., age-related macular degeneration and Alzheimer’s disease (AD)) using large-scale genetic and multi-omics data. Currently, I am also working on pediatric pulmonary diseases such as childhood asthma.

Education

Ph.D. (2010) Department of Biostatistics, University of Michigan, MI
M.A. (2005) Department of Mathematics, Indiana University Bloomington, IN
B.S. (2003) Department of Mathematics, Nanjing University, China

Teaching
Applied Survival Analysis BIOST2150 Fall 2024
Applied Survival Analysis BIOST2066 Fall 2019, 2020, 2021, 2022, 2023
Survival Analysis BIOST2054/STAT2261 Spring 2018, 2019, 2022, 2023

Applied Mixed Models BIOST2086 Spring 2013, Spring 2014, Spring 2016, 2017
Biostatistics Seminar BIOST2025 Spring 2014, Fall 2014

Awards
  • 2021 James L. Craig Excellence in Education Award 
  • 2022 Inducted to Delta Omega Honor Society in Public Health
  • 2022 Ascending Star Award, Health Sciences, University of Pittsburgh
  • 2023 American Statistical Association LiDS Section Outstanding Service Award
Research Grants

Funding (PI Grant) 

  1. Funding Agency: NIH/NIGMS
    Grant Number: R01GM141076
    Grant Title: New statistical methods and software for modeling complex multivariate survival data with large-scale covariates
    Role on Grant: Principal Investigator
    Years Inclusive: 6/1/2022 – 5/31/2026
    Total Direct Costs: $800,000
  2. Funding Agency: Pitt CTSI
    Grant Title: Precision Care in asthma using EHR analytics
    Role on Grant: MPI
    Years Inclusive: 6/1/2022 – 5/31/2023
    Total Direct Costs: $45,000
  3. Funding Agency: NIH/NEI
    Grant Number: R21EY030488
    Grant Title: Deep-learning-based prediction of AMD and its progression with GWAS and fundus image data
    Role on Grant: MPI (contact PI)
    Years Inclusive: 8/1/2020 – 5/31/2023 (with 1-year NCE)
    Total Direct Costs: $270,000
  4. Funding Agency: NIH/Clinical and Translational Science Institute, University of Pittsburgh
    Grant Title: Deep Learning with GWAS to Predict AMD Progression
    Role on Grant: Principal Investigator
    Years Inclusive: 1/1/2019 – 12/31/2019
    Total Direct Costs: $10,000
  5. Funding Agency: NIH/NIMH
    Grant Number: R03MH108849
    Grant Title: Novel and Robust Methods for Differential Protein Network Analysis of Proteomics Data in Schizophrenia Research
    Role on Grant: Principal Investigator
    Years Inclusive: 7/1/2016 – 6/30/2018
    Total Direct Costs: $100,000
  6. Funding Agency: UPMC 
    Grant Title: Competitive Medical Research Fund
    Role on Grant: Principal Investigator
    Years Inclusive: 7/1/2015 – 12/31/2017
    Total Direct Costs: $25,000
Selected Publications

*: corresponding/senior author; +: co-first author; _: PhD student advisee

2024:

  • Hu H, Wang X, Feng S, Xu Z, Liu J, Heidrich-O’Hare E, Chen Y, Yue M, Zeng L, Ding Y, Huang H, Duerr R, Chen W. (2024). A unified model-based framework for doublet/multiplet detection in single-cell multiomics data. Nature Communications 15, 5562. https://doi.org/10.1038/s41467-024-49448-x  

  • Liu J, Bo N, Forno E, Ding Y*. (2024). Predicting Pediatric Asthma Severe Outcomes using Machine Learning Methods for EHR Data with Repeated Clinic Visits. Journal of Statistical Research. 58(1): 131-149. https://doi.org/10.3329/jsr.v58i1.75419

  • Bo N+, Wei Y+, Zeng L, Kang C, Ding Y*. (2024). A Meta-Learner Framework to Estimate Individualized Treatment Effects for Survival Outcomes (An earlier version won the 2022 JSM LiDS section student paper award). Journal of Data Science. https://doi.org/10.6339/24-JDS1119

  • Chen L, Wang Y, Cai C, Ding Y, Kim RS, Lipchik C, Fumagalli D, Gavin PG, Yothers G, Allegra CJ, Petrelli NJ, Suga JM, Hopkins JO, Saito NG, Evans T, Jujjavarapu S, Wolmark N, Lucas PC, O’Connell MJ, Paik S, Sun M, Pogue-Geile KL, Lu X. (2024). Machine Learning Predicts Oxaliplatin Benefit in Colon Cancer Adjuvant Therapies. Journal of Clinical Oncology. PMID: 38315963. DOI: 10.1200/JCO.23.01080

2023:

2022:

  • Sun T, Cheng Y, Ding Y*. (2022) An Information Ratio based Goodness-of-fit Test for Copula Models on Censored Data. Biometrics. https://doi.org/10.1111/biom.13807
  • Ding Y*, Sun T. (2022) Copula Models and Diagnostics for Multivariate Interval-Censored Data. In: Sun J, Chen D-G, editors. Emerging Topics in Modeling Interval-Censored Survival Data, p. 141–165. New York: Springer.
  • Wang X, Xu Z, Zhou X, Zhang Y, Huang H, Ding Y, Duerr RH, Chen W. (2022) SECANT: a biology-guided semi-supervised method for clustering, classification, and annotation of single-cell multi-omics. PNAS Nexus. https://doi.org/10.1101/2020.11.06.371849
  • Sun T, Ding Y. (2022) Neural Network on Interval Censored Data with Application to the Prediction of Alzheimer’s Disease. Biometrics. https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.13734
  • Ganjdanesh A+, Zhang Z+, Chew EY, Ding Y, Chen W*, Huang H* (2022) LONGL-Net: A Temporal Correlation Structure Guided Deep Learning Framework for Predicting Longitudinal Age-related Macular Degeneration Severity. PNAS Nexus. PMID: 35360552 DOI: 10.1093/pnasnexus/pgab003

2021:

  • Wei Y, Hsu JC, Chen W, Chew EY, Ding Y*. (2021) Identification and Inference for Subgroups with Differential Treatment Efficacy from Randomized Controlled Trials with Survival Outcomes through Multiple Testing. (The earlier version won the Best Poster Award in ASA Pittsburgh Chapter 2019 Meeting.) Statistics in Medicine. PMID: 34542190 DOI: 10.1002/sim.9196
  • Wei Y, Wang X, Chew EY, Ding Y*. (2021) Confident Identification of Subgroups from SNP Testing in RCTs with Binary Outcomes. Biometrical Journal. https://doi.org/10.1002/bimj.202000170
  • Yan Q, Jiang Y, Huang H, Xin H, Swaroop A, Chew EY, Weeks DE, Chen W*, Ding Y*. (2021) GWAS-based Machine Learning for Prediction of Age-Related Macular Degeneration Risk. Translational Vision Science & Technology (TVST). https://doi.org/10.1167/tvst.10.2.29
  • Cui X, Dickhaus T, Ding Y, Hsu JC. Handbook of Multiple Comparisons. Chapman & Hall/CRC, 2021. ISBN 9780367140670
  • Ding Y*, Wei Y, Wang X, Hsu JC. Testing SNPs in Targeted Drug Development. Book Chapter In: Cui X, Dickhaus T, Ding Y, Hsu JC. Handbook of Multiple Comparisons. Chapman & Hall/CRC, 2021

2020:

  • Sun T, Wei Y, Chen W, Ding Y*. (2020) Genome-wide Association Study-based Deep Learning for Survival Prediction. (The earlier version won the 2019 LiDS Conference Student Poster Award.) Statistics in Medicine. https://doi.org/10.1002/sim.8743.
  • Chen L-W, Cheng Y, Ding Y, Li R. (2020) Quantile Association Regression on Bivariate Survival Data. Canadian Journal of Statistics. DOI: 10.1002/cjs.11577.
  • Wang X+, Sun Z+, Zhang Y, Xu Z, Huang H, Duerr R, Chen K, Ding Y*, Chen W*. (2020) BREM-SC: A Bayesian Random Effects Mixture Model for Joint Clustering Single Cell Multi-omics Data. (The paper won the 2020 ICSA Student Paper Award.) Nucleic Acids Research. 48(11): 5814–5824. DOI: 10.1093/nar/gkaa314. PMID: 32379315.
  • Sun T, Ding Y*. (2020) CopulaCenR: Copula based Regression Models for Bivariate Censored Data in R. The R Journal. https://doi.org/10.32614/RJ-2020-025.
  • Yan Q, Weeks DE, Xin H, Huang H, Swaroop A, Chew EY, Ding Y*, Chen W*. (2020) Deep-learning-based Prediction of Late Age-Related Macular Degeneration Progression. Nature Machine Intelligence. 2(2):141-150 DOI: 10.1038/s42256-020-0154-9 PMID: 32285025.
  • Ding Y*, Wei Y, Wang X. Logical Inference on Treatment Efficacy When Subgroups Exist. Book Chapter In: Ting N, Cappelleri JC, Ho S, Chen DG. Design and Analysis of Subgroups with Biopharmaceutical Applications. New York: Springer, 2020.

2019:

  • Wei Y+, Liu Y+, Sun T, Chen W, Ding Y*. (2019) Gene-based Association Analysis for Bivariate Time-to-event Data through Functional Regression with Copula Models. (The earlier version won the 2019 LiDS Conference Student Paper Award.) Biometrics. DOI: 10.1111/biom.13165
  • Sun T, Ding Y*. (2019) Copula-based semiparametric transformation model for bivariate data under general interval censoring. (The earlier version won the 2019 ENAR Distinguished Student Paper Award.) Biostatistics. DOI: 10.1093/biostatistics/kxz032
  • Sun Z, Chen L, Xin H, Huang Q, Cillo AR, Tabib T, Kolls JK, Bruno TC, Lafyatis R, Vignali DAA, Chen K, Ding Y*, Hu M*, Chen W*. (2019) BAMM-SC: A Bayesian mixture model for clustering droplet-based single cell transcriptomic data from population studies. (The earlier version won the 2019 ENAR Distinguished Student Paper Award.) Nature Communications. 10(1):1649 DOI: 10.1038/s41467-019-09639-3. PMID: 30967541
  • Sun T+, Liu Y+, Cook RJ, Chen W, Ding Y*. (2019). Copula-based Score Test for Bivariate Time-to-event Data, with Application to a Genetic Study of AMD Progression. (The earlier version won the Best Poster Award in ASA Pittsburgh Chapter 2017 Meeting.) Lifetime Data Analysis. DOI: 10.1007/s10985-018-09459-5. PMID: 30560439
  • Lin HM, Xu H, Ding Y, Hsu JC. (2019). Correct and Logical Inference on Efficacy in Subgroups and Their Mixture for Binary Outcomes. Biometrical Journal. 61(2): 8-26. PMID: 30353566

2018:

  • Ding Y*, Li GY, Liu Y, Ruberg SJ, Hsu JC. (2018). Confident Inference For SNP Effects On Treatment Efficacy. Annals of Applied Statistics. 12(3): 1727-1748.
  • Ding Y*,+, Kong S+, Kang S, Chen W. (2018). A Semiparametric Imputation Approach for Regression with Censored Covariate, with Application to an AMD Progression Study. Statistics in Medicine. 37: 3293–3308. PMID: 29845616
  • Yan Q+, Ding Y+, Liu Y, Sun T, Fritsche LG, Clemons T, Ratnapriya R, Klein ML, Cook RJ, Liu Y, Fan R, Wei L, Abecasis GR, Swaroop A, Chew EY, AREDS2 research group, Weeks DE, Chen W. (2018). Genome-wide Analysis of Disease Progression in Age-related Macular Degeneration. Human Molecular Genetics. 27(5):929-940. PMID: 29346644
  • Sun Z, Wang T, Deng K, Wang X-F, Lafyatis R, Ding Y, Hu M, Chen W. (2018). DIMM-SC: A Dirichlet mixture model for clustering droplet-based single cell transcriptomic data. Bioinformatics. 34(1): 139-146. PMID: 29036318

2017 and before:

  • Ding Y, Liu Y, Yan Q, Fritsche LG, Cook RJ, Clemons T, Ratnapriya R, Klein ML, Abecasis GR, Swaroop A, Chew EY, Weeks DE, Chen W. (2017). Bivariate Analysis of Age-Related Macular Degeneration Progression Using Genetic Risk Scores. Genetics. 206(1):119-133. PMID: 28341650
  • Wang T, Ren Z, Ding Y, Zhou F, Sun Z, MacDonald ML, Sweet RA, Chen W. (2016). FastGGM: An efficient algorithm for the inference of Gaussian graphical model in biological networks. PLoS Computational Biology. 12(2): e1004755. PMID: 26872036
  • Fan R, Wang Y, Yan Q, Ding Y, Weeks DE, Lu Z, Ren H, Cook RJ, Xiong M, Swaroop A, Chew EY, Chen W. (2016). Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions. Genetic Epidemiology. 40(2): 133-43. PMID: 26782979
  • Ding Y*, Lin HM, Hsu JC. (2016). Subgroup Mixable Inference on Treatment Efficacy in Mixture Populations, with an Application to Time-to-Event Outcomes. Statistics in Medicine. 35(10):1580-94. PMID: 26646305
  • Ding Y*, Nan B. (2015). Estimating Mean Survival Time: When is it Possible? Scandinavian Journal of Statistics 42(2):397-413. PMID: 26019387 PMCID: PMC4442028
  • Shen L, Ding Y, Battioui C. A Framework of Statistical Methods for Identification of Subgroups with Differential Treatment Effects in Randomized Trials. (2015) In: Chen Z, Liu A, Qu Y, Tang L, Ting N & Tsong Y, eds. Applied Statistics in Biomedicine and Clinical Trials Design: Selected Papers from 2013 ICSA/ISBS Joint Statistical Meetings. New York: Springer.
  • Ding Y, Fu H. (2013). Bayesian Indirect and Mixed Treatment Comparisons Across Longitudinal Time Points. Statistics in Medicine 32 (15):2613-28. PMID: 23229717
  • Banerjee M, Ding Y, Noone A. (2012). Identifying Representative Trees from Ensembles. Statistics in Medicine 31(15):1601-16. PMID: 22302520
  • Ding Y, Nan B. (2011). A Sieve M-theorem for Bundled Parameters in Semiparametric Models, with Application to the Efficient Estimation in a Linear Model for Censored Data. Annals of Statistics 39(6): 3032-3061. PMID: 24436500 PMCID: PMC3890689
  • Ding Y, Choi H, Nesvizhskii AI. (2008). Adaptive Discriminant Function Analysis and Reranking of MS/MS Database Search Results for Improved Peptide Identification in Shotgun Proteomics. Journal of Proteome Research 7(11): 4878-89. PMID: 18788775 PMCID: PMC3744223
  • Complete List of Published Work in My Bibliography