October 10, 2005
This paper presents an approach to the evaluation and validation of mass spectrometry data for the construction of an 'early warning' diagnostic procedure. We describe the implementation of a designed experiment and emphasize the consistent and correct use of validation-based evaluation, which is a key requirement for an unbiased assessment of the diagnostic ability of mass spectrometry data in this setting. Strict adherence to validation as a scientific principle will, however, typically imply that the analyst must make choices. Like all choices in statistical analysis, validation comes at a cost! We present a detailed and extensive discussion of the issues involved and propose that a much greater emphasis on, and requirement for, validation should enter clinical proteomic science.
Similar papers
March 21, 2014
Proteomics will celebrate its 20th year in 2014. In this relatively short period of time, it has invaded most areas of biology and its use will probably continue to spread in the future. These two decades have seen a considerable increase in the speed and sensitivity of protein identification and characterization, even from complex samples. Indeed, what was a challenge twenty years ago is now little more than a daily routine. Although not completely over, the technological ch...
April 5, 2009
In the last ten years, the field of proteomics has expanded at a rapid rate. A range of exciting new technology has been developed and enthusiastically applied to an enormous variety of biological questions. However, the degree of stringency required in proteomic data generation and analysis appears to have been underestimated. As a result, there are likely to be numerous published findings that are of questionable quality, requiring further confirmation and/or validation. Th...
July 8, 2008
Biomarker discovery for clinical purposes is one of the major areas in which proteomics is used. However, despite considerable effort, the successes have been relatively scarce. In this perspective paper, we try to highlight and analyze the main causes for this limited success, and to suggest alternate strategies, which will avoid them, without eluding the foreseeable weak points of these strategies. Two major strategies are analyzed, namely, the switch from body fluids to ce...
February 9, 2016
Mass spectrometry-based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of the proteomic expression introduces new statistical challenges in summarizing and analyzing the acquired data. Statistical methods for optimally processing proteomic data are currently a growi...
October 10, 2017
High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity and mechanisms underlying human health and disease. Large-scale metabolomics data, generated using targeted or nontargeted platforms, are increasingly common. Appropriate statistical analysis of these complex high-dimensional data is critical for extracting meaningful results from such large-scale human metabo...
October 10, 2017
Background. Emerging technologies now allow for mass spectrometry-based profiling of up to thousands of small molecule metabolites (metabolomics) in an increasing number of biosamples. While offering great promise for revealing insight into the pathogenesis of human disease, standard approaches have yet to be established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes including disease outcomes. To ...
January 6, 2011
Mass spectrometry-based proteomics has become the tool of choice for identifying and quantifying the proteome of an organism. Though recent years have seen a tremendous improvement in instrument performance and the computational tools used, significant challenges remain, and there are many opportunities for statisticians to make important contributions. In the most widely used "bottom-up" approach to proteomics, complex mixtures of proteins are first subjected to enzymatic cl...
April 11, 2018
The process of biomarker discovery is typically lengthy and costly, involving the phases of discovery, qualification, verification, and validation before clinical evaluation. Being able to efficiently identify the truly relevant markers in discovery studies can significantly simplify the process. However, in discovery studies the sample size is typically small while the number of markers being explored is much larger. Hence discovery studies suffer from sparsity and high dime...
May 28, 2013
This review presents how R, the popular statistical environment and programming language, can be used in the frame of proteomics data analysis. A short introduction to R is given, with special emphasis on some of the features that make R and its add-on packages a premium software for sound and reproducible data analysis. The reader is also advised on how to find relevant R software for proteomics. Several use cases are then presented, illustrating data input/output, quality c...
January 16, 2022
Accurate information about protein content in the organism is instrumental for a better understanding of human biology and disease mechanisms. While the presence of certain types of proteins can be life-threatening, the abundance of others is an essential condition for an individual's overall well-being. Protein microarray is a technology that enables the quantification of thousands of proteins in hundreds of human samples in a parallel manner. In a series of studies involvin...