1. Title:
Optimal Discriminant Analysis in High-Dimensional Latent Factor Models
2. Speaker:
Xin Bing
3. Abstract:
In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features onto a lower-dimensional space and then base the classification on the resulting lower-dimensional projections. In this paper, we formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure and to guide the choice of projection. We propose a computationally efficient classifier that takes certain principal components (PCs) of the observed features as projections, with the number of retained PCs selected in a data-driven way. A general theory is established for analyzing such two-step classifiers based on any projections. We derive explicit rates of convergence of the excess risk of the proposed PC-based classifier. The obtained rates are further shown to be optimal up to logarithmic factors in the minimax sense. Our theory allows the lower dimension to grow with the sample size and remains valid even when the feature dimension (greatly) exceeds the sample size. Extensive simulations corroborate our theoretical findings. The proposed method also performs favorably relative to other existing discriminant methods on three real data examples.
4. About the speaker:
Dr. Xin Bing joined the Department of Statistical Sciences at the University of Toronto in 2022. He received his Ph.D. in Statistics in 2021 from the Department of Statistics and Data Science at Cornell University. Prior to his Ph.D. studies, he received a BS in Mathematics in 2013 from Shandong University, China, and an MS in Statistics in 2016 from the University of Washington, Seattle. His research interests generally lie in developing new methodology with theoretical guarantees to tackle modern statistical problems in areas such as high-dimensional statistics, low-rank matrix estimation, multivariate analysis, model-based clustering, latent factor models, topic models, minimax estimation, high-dimensional inference, and statistical and computational trade-offs. He is also interested in applications of statistical methods to genetics, neuroscience, immunology, and other areas.
5. Invited by:
Prof. Nie Tianyang (聂天洋) and Associate Research Fellow Du Kai (杜凯)
6. Time:
Tuesday, May 23, 10:00-11:00
7. Venue:
Lecture Hall 1248, Block B, Zhixin Building (知新楼), Central Campus
8. Organizer:
School of Mathematics, Shandong University