In the analysis of biological samples, control over experimental design and data acquisition procedures alone cannot ensure well-conditioned 1H NMR spectra with maximal information recovery for data analysis. A third major element affects the accuracy and robustness of results: data pre-processing/pre-treatment, to which too little attention is usually devoted, particularly in metabolomic studies. The usual approach is to use proprietary software provided by the analytical instruments' manufacturers to conduct the entire pre-processing strategy. This widespread practice has a number of advantages, such as a user-friendly interface with graphical facilities, but it involves non-negligible drawbacks: a lack of methodological information and automation, a dependence on subjective human choices, only standard processing possibilities and an absence of objective quality criteria to evaluate pre-processing quality. To meet these needs, this paper introduces PepsNMR, an R package dedicated to the whole processing chain prior to multivariate data analysis, including, among other tools, solvent signal suppression, internal calibration, phase, baseline and misalignment corrections, bucketing and normalisation. Methodological aspects are discussed and the package is compared to the gold standard procedure with two metabolomic case studies. The use of PepsNMR on these data shows better information recovery and predictive power based on objective and quantitative quality criteria. Other key assets of the package are workflow processing speed, reproducibility, reporting and flexibility, graphical outputs and documented routines.
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Laboratory medicine & medical technology
Author, co-author :
Martin, Manon; Université Catholique de Louvain - UCL > Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA/IMMAQ)
Legat, Benoît; Université Catholique de Louvain - UCL > Ecole Polytechnique de Louvain (EPL)
Leenders, Justine; Université de Liège - ULiège > Département des sciences cliniques > Labo de biologie des tumeurs et du développement
Vanwinsberghe, Julien; Université Louis Pasteur (Strasbourg) - ULP
Rousseau, Réjane; Université Catholique de Louvain - UCL > Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA/IMMAQ)
Boulanger, Bruno; Mont St Guibert > Statistical Department > Eli Lilly & Company
Eilers, Paul H.C.; Erasmus Universiteit Rotterdam - EUR > Department of Biostatistics
De Tullio, Pascal; Université de Liège - ULiège > Département de pharmacie > Chimie pharmaceutique
Govaerts, Bernadette; Université Catholique de Louvain - UCL > Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA/IMMAQ)
Language :
English
Title :
PepsNMR for 1H NMR metabolomic data pre-processing