TY - JOUR
T1 - PROTREC
T2 - A probability-based approach for recovering missing proteins based on biological networks
AU - Kong, Weijia
AU - Wong, Bertrand Jern Han
AU - Gao, Huanhuan
AU - Guo, Tiannan
AU - Liu, Xianming
AU - Du, Xiaoxian
AU - Wong, Limsoon
AU - Goh, Wilson Wen Bin
N1 - Publisher Copyright: © 2021 The Authors
PY - 2022/1/6
Y1 - 2022/1/6
N2 - A novel network-based approach for predicting missing proteins (MPs) is proposed here. This approach, PROTREC (short for PROtein RECovery), outperforms existing network-based methods – such as Functional Class Scoring (FCS), Hypergeometric Enrichment (HE), and Gene Set Enrichment Analysis (GSEA) – across a variety of proteomics datasets derived from different proteomics data acquisition paradigms: higher PROTREC scores are much more closely correlated with higher recovery rates of MPs across sample replicates. The PROTREC score, unlike methods reporting p-values, can be directly interpreted as the probability that an unreported protein in a proteomic screen is actually present in the sample being screened. Significance: Mass spectrometry (MS) has developed rapidly in recent years; however, a substantial proportion of proteins still goes undetected, giving rise to the missing-protein problem. A few existing protein recovery methods are based on biological networks, but their performance is unsatisfactory. We propose a new protein recovery method, PROTREC, a Bayesian-inspired approach based on biological networks, which shows exceptional performance across multiple validation strategies. It does not rely on peptide information, so it avoids the ambiguity issue that most protein assembly methods face.
KW - Bioinformatics
KW - Missing proteins
KW - Networks
KW - Protein complexes
KW - Proteomics
KW - Statistics
UR - http://www.scopus.com/inward/record.url?scp=85117253996&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85117253996&partnerID=8YFLogxK
U2 - 10.1016/j.jprot.2021.104392
DO - 10.1016/j.jprot.2021.104392
M3 - Article
C2 - 34626823
AN - SCOPUS:85117253996
SN - 1874-3919
VL - 250
JO - Journal of Proteomics
JF - Journal of Proteomics
M1 - 104392
ER -