Sparsity and dimension reduction
Keywords: “Statistics”, “Dimension reduction”, “Matrix factorization”, “Data visualization”, “Single-cell”, “Gene expression”, “RNA-seq”, “Sparse PLS”, “Logistic regression”, “High-dimensional data”, “Classification”
Summary
In the era of large-scale (huge sample size) and/or high-dimensional (numerous variables or features) data, data exploration and representation are central questions. A wide range of frameworks in statistics and machine learning is now available to solve supervised and unsupervised problems despite the dimension and complexity of the data. In particular, we will discuss sparsity in the context of dimension reduction, focusing on variable (feature) selection and latent space projection. The presentation will be illustrated with several sparse methods designed for visualization, regression, or classification of high-dimensional data, and with examples from genomic data analysis.