The statistical analysis of Next-Generation Sequencing data raises many computational challenges regarding modeling and inference, especially because of the high dimensionality of genomic data. The research work in this manuscript concerns hybrid …

For nearly 20 years, sequencing technologies have been on the rise, producing ever more data that are often high-dimensional, meaning that the number $p$ of covariates (such as genes) is far larger than the number $n$ of observations. Analysing such data is a statistical challenge and requires dimension reduction approaches. Compression methods are particularly useful for interpreting data, through visualisation or clustering. In particular, projection-based methods such as principal component analysis (PCA) generally solve a matrix factorization problem; for instance, PCA corresponds to a singular value decomposition (SVD) of the centred data matrix.
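
To make the link between PCA and the SVD concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not the method developed in the manuscript: the function name `pca_via_svd`, the simulated Gaussian data, and the sizes $n = 20$, $p = 500$ are all assumptions chosen to mimic the $p \gg n$ setting.

```python
import numpy as np

def pca_via_svd(X, k):
    """Illustrative PCA: project the rows of X onto the first k principal axes."""
    # Centre each covariate (column) so the SVD matches the PCA of X.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values sorted in decreasing order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                  # principal axes, shape (k, p)
    scores = U[:, :k] * s[:k]            # observation coordinates, shape (n, k)
    # Variance carried by each retained axis.
    explained_var = s[:k] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_var

# Hypothetical high-dimensional data: n = 20 observations, p = 500 covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))

scores, components, explained_var = pca_via_svd(X, k=2)
```

The two columns of `scores` can be plotted directly for visualisation or passed to a clustering routine, which is the kind of use of compressed data mentioned above.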

© 2019 Ghislain DURIF