This very condensed paper is intended for specialists in mass spectrometry. It focuses on the automatic detection of contaminant masses. The number of contaminants may be small relative to the total number of peptides analyzed, but automatic determination of contaminant masses could improve the overall process of peptide identification. The experimental data was obtained only from MALDI-TOF instruments, but the concepts described could apply to other instruments as well.
The authors indicate that masses were obtained after baseline correction, noise reduction, de-isotoping, and mass calibration. The key element in identifying contaminant masses is the selection of a proper clustering process. For this task, the authors describe a similarity measure, a clustering algorithm, and validation criteria. The clustering algorithm is based on the “tentative add” idea. The validation criteria include the frequency with which a mass is observed.
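To make the three ingredients concrete, here is a minimal sketch of how a similarity measure, a clustering step, and a frequency criterion could fit together. It is not the authors' algorithm: the greedy grouping below is only a stand-in for their “tentative add” procedure, and the function names, the relative-difference measure, the tolerance parameter, and the sample masses are all assumptions made for illustration.

```python
def ppm_distance(m1, m2):
    """Relative difference between two masses, in parts per million (assumed measure)."""
    return abs(m1 - m2) / ((m1 + m2) / 2.0) * 1e6


def cluster_masses(observations, radius_ppm):
    """Greedily group (spectrum_id, mass) pairs along the mass axis.

    Masses are visited in ascending order. A mass joins the current
    cluster if it lies within radius_ppm of that cluster's mean mass;
    otherwise it starts a new cluster.
    """
    clusters = []
    for spectrum_id, mass in sorted(observations, key=lambda obs: obs[1]):
        if clusters:
            current = clusters[-1]
            centre = sum(m for _, m in current) / len(current)
            if ppm_distance(mass, centre) <= radius_ppm:
                current.append((spectrum_id, mass))
                continue
        clusters.append([(spectrum_id, mass)])
    return clusters


def cluster_frequency(cluster, n_spectra):
    """Fraction of spectra that contribute at least one mass to the cluster."""
    return len({spectrum_id for spectrum_id, _ in cluster}) / n_spectra


# Hypothetical observations from three spectra (values are illustrative only).
observations = [
    ("s1", 842.5094), ("s2", 842.5100), ("s3", 842.5080),
    ("s1", 1045.5640),
    ("s2", 2211.1040), ("s3", 2211.1050),
]

clusters = cluster_masses(observations, radius_ppm=30.0)
for cluster in clusters:
    mean_mass = sum(m for _, m in cluster) / len(cluster)
    print(round(mean_mass, 4), cluster_frequency(cluster, n_spectra=3))
```

With these sample values the masses fall into three groups, and the group that appears in every spectrum is the one a frequency criterion would flag as a likely contaminant.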
The results presented by the authors show an optimal cluster setting of 30 parts per million (ppm). This is the cluster radius at which the masses are clustered without ambiguity. Notably, among the probable contaminant masses, the authors found some of the well-known masses of trypsin peptides. The data set analyzed comprised 78,384 masses from 3,029 proteins.
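For readers who want a feel for the scale of a 30 ppm radius, the short sketch below converts a relative tolerance into an absolute mass window. The helper name and the example mass of 1,000 Da are assumptions chosen for illustration, not values taken from the paper.

```python
def ppm_window(mass, tolerance_ppm):
    """Return the (low, high) mass window for a relative tolerance given in ppm."""
    half_width = mass * tolerance_ppm / 1e6
    return mass - half_width, mass + half_width


# Hypothetical example: a 30 ppm radius around a peptide mass of 1,000 Da.
low, high = ppm_window(1000.0, 30.0)
print(low, high)
```

At 1,000 Da the radius corresponds to about 0.03 Da on either side, and it grows in proportion to the mass, so the same setting gives about 0.06 Da at 2,000 Da.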
Overall, this is a helpful paper on the identification of contaminants, which is an important issue for mass spectrometry analysts.