Proteome Analyst: custom predictions with explanations in a web-based tool for high-throughput proteome annotations

Szafron, Duane
Lu, Paul
Greiner, Russell
Wishart, David S.
Poulin, Brett
Eisner, Roman
Lu, Zhiyong
Anvik, John
Macdonell, Cam
Fyshe, Alona
Oxford University Press
Proteome Analyst (PA) is a publicly available, high-throughput, web-based system for predicting various properties of each protein in an entire proteome. Using machine-learned classifiers, PA can predict, for example, the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein. In addition, PA is currently the most accurate and most comprehensive system for predicting subcellular localization, the location within a cell where a protein performs its main function. Two other capabilities of PA are notable. First, PA can create a custom classifier to predict a new property, without requiring any programming, based on labeled training data (i.e. a set of examples, each with the correct classification label) provided by a user. PA has been used to create custom classifiers for potassium-ion channel proteins and other general function ontologies. Second, PA provides a sophisticated explanation feature that shows why one prediction is chosen over another. The PA system produces a Naïve Bayes classifier, which is amenable to a graphical and interactive approach to explanations for its predictions; transparent predictions increase the user's confidence in, and understanding of, PA.
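To illustrate why a Naïve Bayes classifier lends itself to explanations, the following is a minimal sketch, not PA's actual implementation. It assumes each protein is described by a set of keyword tokens and uses a multinomial model with Laplace smoothing; the token names, the labels, and the functions `train`, `predict`, and `explain` are all hypothetical. Because the score for a label is a sum of per-token log-probabilities, the preference for one label over another can be broken down token by token.

```python
import math
from collections import Counter, defaultdict

def train(examples, alpha=1.0):
    """Fit a toy Naive Bayes model from (tokens, label) pairs (hypothetical format)."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        for tok in set(tokens):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "tokens": token_counts,
            "vocab": vocab, "alpha": alpha}

def log_likelihood(model, token, label):
    """Laplace-smoothed log P(token | label)."""
    counts = model["tokens"][label]
    total = sum(counts.values())
    alpha = model["alpha"]
    return math.log((counts[token] + alpha) / (total + alpha * len(model["vocab"])))

def log_score(model, tokens, label):
    """Log prior plus the sum of per-token log-likelihoods (unknown tokens ignored)."""
    n = sum(model["labels"].values())
    score = math.log(model["labels"][label] / n)
    for tok in set(tokens) & model["vocab"]:
        score += log_likelihood(model, tok, label)
    return score

def predict(model, tokens):
    """Return the label with the highest log score."""
    return max(model["labels"], key=lambda lab: log_score(model, tokens, lab))

def explain(model, tokens, label_a, label_b):
    """Per-token log-odds in favour of label_a over label_b, largest effect first."""
    contribs = [
        (tok, log_likelihood(model, tok, label_a) - log_likelihood(model, tok, label_b))
        for tok in set(tokens) & model["vocab"]
    ]
    return sorted(contribs, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical labeled training data, as a user might supply for a custom classifier.
examples = [
    (["kinase", "atp-binding"], "enzyme"),
    (["kinase", "transferase"], "enzyme"),
    (["ion-channel", "transmembrane"], "transport"),
    (["transmembrane", "potassium"], "transport"),
]
model = train(examples)
query = ["kinase", "atp-binding"]
print(predict(model, query))
for token, contribution in explain(model, query, "enzyme", "transport"):
    print(f"{token}: {contribution:+.3f}")
```

A positive contribution means the token favours the first label. Listing these per-token terms is one simple way to show why one prediction is chosen over another.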
Sherpa Romeo green journal. Permission to archive final published version.
Proteome Analyst, Custom predictions, Custom classifiers, Proteome annotations
Szafron, D., Lu, P., Greiner, R., Wishart, D. S., Poulin, B., Eisner, R., ... Meeuwis, D. (2004). Proteome Analyst: Custom predictions with explanations in a web-based tool for high-throughput proteome annotations. Nucleic Acids Research, 32(2), W365–W371.