Jun Zhang of the Department of Biostatistics defends her dissertation on "Interpretable Analysis of Multivariate Functional Data".
Committee Chairperson: Robert T. Krafty, PhD, Department of Biostatistics
Committee Members: Stewart J. Anderson, PhD, Department of Biostatistics
George C. Tseng, ScD, Department of Biostatistics
Greg J. Siegle, PhD, Department of Psychiatry
Graduate faculty of the University and all other interested parties are invited to attend via Zoom https://pitt.zoom.us/j/99847179454
Multivariate functional data have become increasingly common in medical studies. For example, participants' brain activity is recorded using modalities such as EEG from multiple locations on the scalp and summarized as power within multiple time-varying frequency bands. Two popular tools for analyzing multivariate functional data are multivariate functional principal component analysis (MFPCA), which provides low-dimensional measures that account for most of the variation, and multivariate functional linear discriminant analysis (MFLDA), which provides low-dimensional measures that discriminate between groups with different outcomes. Existing approaches to conducting MFPCA and MFLDA are limited in that they are difficult to interpret, since the component or the classifier is a nontrivial function of every variate at all time points. This dissertation proposes novel MFPCA and MFLDA methods for multivariate functional data that provide scientifically interpretable components or classifiers, which are both sparse among variates and localized in time within each variate.
In the first part of the dissertation, we develop a novel approach to conducting interpretable principal component analysis on multivariate functional data. The method also handles a multilevel structure in the multivariate functional data; for example, each subject (subject level) has curves repeatedly measured at different electrodes (electrode level) on the scalp. We decompose the total variation into subject-level and replicate-within-subject-level variation and provide interpretable components that can be both sparse among variates (e.g., frequency bands) and have localized support over time within each frequency band. The sparsity and localization of components are obtained through a novel localized sparse-variate functional principal component analysis (LVPCA), achieved by solving an innovative rank-one based convex optimization problem with block Frobenius and matrix L1-norm based penalties. Finally, we apply the proposed methods to the Blunted and Discordant Affect (BADA) study to summarize the joint variation across multiple frequency bands for both whole-brain variability between subjects and location variation within subjects, and successfully identify the components that are associated with the symptom of dissociation.
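The split of total variation into subject-level and replicate-within-subject-level parts can be illustrated with a plain sum-of-squares identity. The sketch below is only an illustration under stated assumptions: it assumes a balanced design stored as an array of shape (subjects, electrodes, time points), the function name `multilevel_sums_of_squares` is hypothetical, and it does not implement LVPCA or its penalties.

```python
import numpy as np

def multilevel_sums_of_squares(Y):
    """Split total variation of Y (subjects x electrodes x time points)
    into a subject-level part and an electrode-within-subject part.

    Assumes a balanced design: every subject has the same electrodes.
    """
    grand_mean = Y.mean(axis=(0, 1))        # mean curve over all subjects and electrodes
    subject_mean = Y.mean(axis=1)           # one mean curve per subject
    between = subject_mean - grand_mean     # subject-level deviations
    within = Y - subject_mean[:, None, :]   # electrode-level deviations within each subject

    n_electrodes = Y.shape[1]
    total_ss = float(((Y - grand_mean) ** 2).sum())
    between_ss = float(n_electrodes * (between ** 2).sum())
    within_ss = float((within ** 2).sum())
    return total_ss, between_ss, within_ss
```

For a balanced design the total sum of squares equals the sum of the two parts, which is the sense in which the variation is "decomposed"; components would then be extracted separately from each level.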
In the second part of the dissertation, we develop a novel approach to conducting interpretable linear discriminant analysis on multivariate functional data. The proposed localized sparse-variate functional linear discriminant analysis (LVLDA) is achieved through a two-step procedure: first projecting the high-dimensional data onto the localized functional basis obtained through the LVPCA proposed in the first part of the dissertation, and then performing sparse LDA on the low-dimensional space to select the basis functions that are relevant to the between-class covariance. We apply the proposed methods to the AgeWise study to uncover physiological discriminants between older adults with and without poor self-reported sleep quality using waking EEG theta power, salivary melatonin, and core body temperature.
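The two-step structure (project, then discriminate sparsely) can be sketched as follows. This is a rough illustration, not the LVLDA estimator: ordinary PCA stands in for LVPCA, a soft-thresholded two-class Fisher direction stands in for sparse LDA, and all function names, the simulated curves, and the threshold `lam` are assumptions made for the example.

```python
import numpy as np

def simulate_curves(seed=0, n_per_class=30, n_points=50):
    """Hypothetical data: class 1 curves carry a bump on part of the time axis."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)
    bump = 2.0 * ((t > 0.4) & (t < 0.6))
    y = np.repeat([0, 1], n_per_class)
    X = rng.normal(scale=0.5, size=(2 * n_per_class, n_points)) + np.outer(y, bump)
    return X, y

def pca_scores(X, k):
    """Step 1 (stand-in for LVPCA): scores on the top-k principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:k].T                      # columns are basis functions on the grid
    return Xc @ basis, basis

def soft_threshold(w, lam):
    """Shrink entries toward zero; entries smaller than lam become exactly zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def sparse_fisher_direction(S, y, lam):
    """Step 2 (stand-in for sparse LDA): thresholded two-class Fisher direction."""
    S0, S1 = S[y == 0], S[y == 1]
    pooled = ((len(S0) - 1) * np.cov(S0, rowvar=False)
              + (len(S1) - 1) * np.cov(S1, rowvar=False)) / (len(S) - 2)
    w = np.linalg.solve(np.atleast_2d(pooled), S1.mean(axis=0) - S0.mean(axis=0))
    w = w / np.linalg.norm(w)
    return soft_threshold(w, lam)

X, y = simulate_curves(seed=0)
scores, basis = pca_scores(X, k=5)
weights = sparse_fisher_direction(scores, y, lam=0.1)
classifier_curve = basis @ weights        # discriminant expressed on the time grid
```

Basis functions whose weight is thresholded to zero drop out of `classifier_curve`, which is the sense in which the second step selects among the basis functions found in the first.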