PROTREC: A probability-based approach for recovering missing proteins based on biological networks

Weijia Kong, Bertrand Jern Han Wong, Huanhuan Gao, Tiannan Guo, Xianming Liu, Xiaoxian Du, Limsoon Wong^*, Wilson Wen Bin Goh^*

doi:10.1016/j.jprot.2021.104392

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

A novel network-based approach for predicting missing proteins (MPs) is proposed here. This approach, PROTREC (short for PROtein RECovery), dominates existing network-based methods – such as Functional Class Scoring (FCS), Hypergeometric Enrichment (HE), and Gene Set Enrichment Analysis (GSEA) – across a variety of proteomics datasets derived from different proteomics data acquisition paradigms: Higher PROTREC scores are much more closely correlated with higher recovery rates of MPs across sample replicates. The PROTREC score, unlike methods reporting p-values, can be directly interpreted as the probability that an unreported protein in a proteomic screen is actually present in the sample being screened.

Significance: Mass spectrometry (MS) has developed rapidly in recent years; however, an obvious proportion of proteins is still undetected, leading to missing protein problems. A few existing protein recovery methods are based on biological networks, but the performance is not satisfactory. We propose a new protein recovery method, PROTREC, a Bayesian-inspired approach based on biological networks, which shows exceptional performance across multiple validation strategies. It does not rely on peptide information, so it avoids the ambiguity issue that most protein assembly methods face.

Original language	English
Article number	104392
Journal	Journal of Proteomics
Volume	250
DOIs	https://doi.org/10.1016/j.jprot.2021.104392
Publication status	Published - Jan 6 2022
Externally published	Yes

Bibliographical note

Publisher Copyright:
© 2021 The Authors

ASJC Scopus Subject Areas

Biophysics
Biochemistry

Keywords

Bioinformatics
Missing proteins
Networks
Protein complexes
Proteomics
Statistics

Access to Document

10.1016/j.jprot.2021.104392

Cite this

@article{8b1885284c5341f9acac8327d0ebd770,

title = "PROTREC: A probability-based approach for recovering missing proteins based on biological networks",

abstract = "A novel network-based approach for predicting missing proteins (MPs) is proposed here. This approach, PROTREC (short for PROtein RECovery), dominates existing network-based methods – such as Functional Class Scoring (FCS), Hypergeometric Enrichment (HE), and Gene Set Enrichment Analysis (GSEA) – across a variety of proteomics datasets derived from different proteomics data acquisition paradigms: Higher PROTREC scores are much more closely correlated with higher recovery rates of MPs across sample replicates. The PROTREC score, unlike methods reporting p-values, can be directly interpreted as the probability that an unreported protein in a proteomic screen is actually present in the sample being screened. Significance: Mass spectrometry (MS) has developed rapidly in recent years; however, an obvious proportion of proteins is still undetected, leading to missing protein problems. A few existing protein recovery methods are based on biological networks, but the performance is not satisfactory. We propose a new protein recovery method, PROTREC, a Bayesian-inspired approach based on biological networks, which shows exceptional performance across multiple validation strategies. It does not rely on peptide information, so it avoids the ambiguity issue that most protein assembly methods face.",

keywords = "Bioinformatics, Missing proteins, Networks, Protein complexes, Proteomics, Statistics",

author = "Weijia Kong and Wong, {Bertrand Jern Han} and Huanhuan Gao and Tiannan Guo and Xianming Liu and Xiaoxian Du and Limsoon Wong and Goh, {Wilson Wen Bin}",

note = "Publisher Copyright: {\textcopyright} 2021 The Authors",

year = "2022",

month = jan,

day = "6",

doi = "10.1016/j.jprot.2021.104392",

language = "English",

volume = "250",

journal = "Journal of Proteomics",

issn = "1874-3919",

publisher = "Elsevier",

}

TY - JOUR

T1 - PROTREC

T2 - A probability-based approach for recovering missing proteins based on biological networks

AU - Kong, Weijia

AU - Wong, Bertrand Jern Han

AU - Gao, Huanhuan

AU - Guo, Tiannan

AU - Liu, Xianming

AU - Du, Xiaoxian

AU - Wong, Limsoon

AU - Goh, Wilson Wen Bin

PY - 2022/1/6

Y1 - 2022/1/6

N2 - A novel network-based approach for predicting missing proteins (MPs) is proposed here. This approach, PROTREC (short for PROtein RECovery), dominates existing network-based methods – such as Functional Class Scoring (FCS), Hypergeometric Enrichment (HE), and Gene Set Enrichment Analysis (GSEA) – across a variety of proteomics datasets derived from different proteomics data acquisition paradigms: Higher PROTREC scores are much more closely correlated with higher recovery rates of MPs across sample replicates. The PROTREC score, unlike methods reporting p-values, can be directly interpreted as the probability that an unreported protein in a proteomic screen is actually present in the sample being screened. Significance: Mass spectrometry (MS) has developed rapidly in recent years; however, an obvious proportion of proteins is still undetected, leading to missing protein problems. A few existing protein recovery methods are based on biological networks, but the performance is not satisfactory. We propose a new protein recovery method, PROTREC, a Bayesian-inspired approach based on biological networks, which shows exceptional performance across multiple validation strategies. It does not rely on peptide information, so it avoids the ambiguity issue that most protein assembly methods face.

AB - A novel network-based approach for predicting missing proteins (MPs) is proposed here. This approach, PROTREC (short for PROtein RECovery), dominates existing network-based methods – such as Functional Class Scoring (FCS), Hypergeometric Enrichment (HE), and Gene Set Enrichment Analysis (GSEA) – across a variety of proteomics datasets derived from different proteomics data acquisition paradigms: Higher PROTREC scores are much more closely correlated with higher recovery rates of MPs across sample replicates. The PROTREC score, unlike methods reporting p-values, can be directly interpreted as the probability that an unreported protein in a proteomic screen is actually present in the sample being screened. Significance: Mass spectrometry (MS) has developed rapidly in recent years; however, an obvious proportion of proteins is still undetected, leading to missing protein problems. A few existing protein recovery methods are based on biological networks, but the performance is not satisfactory. We propose a new protein recovery method, PROTREC, a Bayesian-inspired approach based on biological networks, which shows exceptional performance across multiple validation strategies. It does not rely on peptide information, so it avoids the ambiguity issue that most protein assembly methods face.

KW - Bioinformatics

KW - Missing proteins

KW - Networks

KW - Protein complexes

KW - Proteomics

KW - Statistics

UR - http://www.scopus.com/inward/record.url?scp=85117253996&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85117253996&partnerID=8YFLogxK

U2 - 10.1016/j.jprot.2021.104392

DO - 10.1016/j.jprot.2021.104392

M3 - Article

C2 - 34626823

AN - SCOPUS:85117253996

SN - 1874-3919

VL - 250

JO - Journal of Proteomics

JF - Journal of Proteomics

M1 - 104392

ER -
