Advanced bioinformatics methods for practical applications in proteomics

Wilson Wen Bin Goh*, Limsoon Wong

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

25 Citations (Scopus)

Abstract

Mass spectrometry (MS)-based proteomics has undergone rapid advancements in recent years, creating challenging problems for bioinformatics. We focus on four aspects where bioinformatics plays a crucial role (and proteomics is needed for clinical application): peptide-spectra matching (PSM) based on the new data-independent acquisition (DIA) paradigm, resolving missing proteins (MPs), dealing with biological and technical heterogeneity in data and statistical feature selection (SFS). DIA is a brute-force strategy that provides greater width and depth but, because it indiscriminately captures spectra such that signal from multiple peptides is mixed, getting good PSMs is difficult. We consider two strategies: simplification of DIA spectra to pseudo-data-dependent acquisition spectra or, alternatively, brute-force search of each DIA spectra against known reference libraries. The MP problem arises when proteins are never (or inconsistently) detected by MS. When observed in at least one sample, imputation methods can be used to guess the approximate protein expression level. If never observed at all, network/protein complex-based contextualization provides an independent prediction platform. Data heterogeneity is a difficult problem with two dimensions: technical (batch effects), which should be removed, and biological (including demography and disease subpopulations), which should be retained. Simple normalization is seldom sufficient, while batch effect-correction algorithms may create errors. Batch effect-resistant normalization methods are a viable alternative. Finally, SFS is vital for practical applications. While many methods exist, there is no best method, and both upstream (e.g. normalization) and downstream processing (e.g. multiple-testing correction) are performance confounders. We also discuss signal detection when class effects are weak.
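
As an illustration of the downstream processing step named in the abstract (multiple-testing correction), the following is a minimal sketch of the Benjamini-Hochberg procedure applied to per-protein p-values. It is not taken from the article: the choice of Benjamini-Hochberg, the function names `benjamini_hochberg` and `select_features`, the `alpha` threshold and the example p-values are all assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative helper (hypothetical name); p_values is a list of floats,
    one per protein, from any per-feature statistical test.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that adjusted values
    # never increase as the raw p-value decreases (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


def select_features(p_values, alpha=0.05):
    """Return indices of features whose adjusted p-value is <= alpha."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(adjusted) if q <= alpha]


if __name__ == "__main__":
    # Hypothetical p-values for four proteins (made up for illustration).
    example_p = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example_p))
    print(select_features(example_p, alpha=0.03))
```

Changing `alpha`, or swapping in a different correction, changes which proteins are reported as significant, which is one way a downstream step can confound comparisons between feature-selection methods.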

Original language: English
Pages (from-to): 347-355
Number of pages: 9
Journal: Briefings in Bioinformatics
Volume: 20
Issue number: 1
Publication status: Published - Jan 18 2019
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2017 The Author.

ASJC Scopus Subject Areas

  • Information Systems
  • Molecular Biology

Keywords

  • bioinformatics
  • biostatistics
  • biotechnology
  • networks
  • proteomics
