Compound identification is often achieved by matching experimental mass spectra to the mass spectra stored in a reference library on the basis of mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high-dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches that combine the dot product and Pearson’s correlation with penalized linear regression are proposed in this study.
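The idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the first step is taken to be a prefilter that keeps the library spectra with the highest normalized dot product, the second step is a closed-form ridge regression on those candidates, and the compound with the largest coefficient is reported. The function names, the `top_k` and `lam` parameters, and the random data are all hypothetical.

```python
import numpy as np

def cosine_scores(library, query):
    """Normalized dot product between the query and each library spectrum (one per row)."""
    norms = np.linalg.norm(library, axis=1) * np.linalg.norm(query)
    return library @ query / np.where(norms == 0, 1.0, norms)

def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def two_step_identify(library, query, top_k=5, lam=1.0):
    """Assumed two-step scheme: dot-product prefilter, then ridge on the survivors."""
    scores = cosine_scores(library, query)
    candidates = np.argsort(scores)[::-1][:top_k]  # indices of the top_k most similar spectra
    X = library[candidates].T                      # rows: m/z bins, columns: candidate compounds
    beta = ridge_coefficients(X, query, lam)
    return int(candidates[np.argmax(beta)]), beta

# Synthetic data (illustrative only): 50 library compounds, 20 m/z bins,
# so there are more compounds than m/z values and X^T X on the full library is singular.
rng = np.random.default_rng(0)
library = rng.random((50, 20))
query = library[7] + 0.01 * rng.random(20)  # noisy copy of compound 7
best, beta = two_step_identify(library, query)
print("identified compound index:", best)
```

The penalty term `lam * I` is what makes the linear system solvable when the number of candidate compounds exceeds the number of m/z values. A lasso fit or a Pearson's correlation prefilter could be substituted in either step under the same structure.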
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; and Kim, Seongho
"Compound Identification Using Penalized Linear Regression on Metabolomics,"
Journal of Modern Applied Statistical Methods:
1, Article 20.
Available at: http://digitalcommons.wayne.edu/jmasm/vol15/iss1/20