LGB-Stack: Stacked Generalization with LightGBM for Highly Accurate Predictions of Polymer Bandgap

Kai Leong Goh, Atsushi Goto*, Yunpeng Lu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Recently, the Ramprasad group reported a quantitative structure-property relationship (QSPR) model for predicting the bandgap (Egap) values of 4209 polymers, which yielded a test set R² score of 0.90 and a test set root-mean-square error (RMSE) of 0.44 at a train/test split ratio of 80/20. In this paper, we present a new QSPR model named LGB-Stack, which performs a two-level stacked generalization using the light gradient boosting machine (LightGBM). At level 1, multiple weak models are trained, and at level 2, they are combined into a strong final model. Four molecular fingerprints were generated from the simplified molecular input line entry system (SMILES) notations of the polymers. They were trimmed using recursive feature elimination and used as the initial input features for training the weak models. The output predictions of the weak models were then used as the new input features for training the final model, which completes the LGB-Stack training process. Our results show that the best test set R² and RMSE scores of LGB-Stack at the train/test split ratio of 80/20 were 0.92 and 0.41, respectively. These scores improved further to 0.94 and 0.34, respectively, when a train/test split ratio of 95/5 was used.
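The two-level procedure described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes synthetic data in place of polymer fingerprints, uses a one-feature least-squares line as a stand-in for each LightGBM weak model, uses a linear least-squares fit as the level-2 model, and builds the level-2 inputs from out-of-fold predictions (a common stacking practice that the abstract does not confirm). All function names are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares line y = a*x + b; stand-in for one level-1 weak model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def predict_line(model, xs):
    a, b = model
    return [a * x + b for x in xs]

def fit_linear(rows, ys):
    """Least squares with intercept via the normal equations (level-2 stand-in)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]; a tiny diagonal term guards against singularity.
    A = [[sum(x[i] * x[j] for x in X) + (1e-9 if i == j else 0.0) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [v - f * w for v, w in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def oof_predictions(X, y, n_folds=5):
    """Out-of-fold level-1 predictions, one column per weak model (one per feature)."""
    n, d = len(X), len(X[0])
    oof = [[0.0] * d for _ in range(n)]
    for f in range(n_folds):
        hold = [i for i in range(n) if i % n_folds == f]
        keep = [i for i in range(n) if i % n_folds != f]
        for j in range(d):
            m = fit_line([X[i][j] for i in keep], [y[i] for i in keep])
            for i, p in zip(hold, predict_line(m, [X[i][j] for i in hold])):
                oof[i][j] = p
    return oof

def fit_stack(X, y):
    """Level 1: weak models. Level 2: final model trained on their predictions."""
    meta = fit_linear(oof_predictions(X, y), y)
    # Refit each weak model on the full training set for use at prediction time.
    weak = [fit_line([r[j] for r in X], y) for j in range(len(X[0]))]
    return weak, meta

def predict_stack(model, X):
    weak, meta = model
    preds = []
    for r in X:
        z = [a * r[j] + b for j, (a, b) in enumerate(weak)]
        preds.append(meta[0] + sum(w * v for w, v in zip(meta[1:], z)))
    return preds

def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Synthetic data (an assumption for illustration, not polymer data).
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [2 * r[0] - r[1] + 0.5 * r[2] + rng.gauss(0, 0.1) for r in X]

# 80/20 train/test split; the rows are already in random order.
split = int(0.8 * len(X))
model = fit_stack(X[:split], y[:split])
pred = predict_stack(model, X[split:])
r2, err = r2_score(y[split:], pred), rmse(y[split:], pred)
print(f"test R2 = {r2:.3f}, test RMSE = {err:.3f}")
```

In a setting closer to the paper, each `fit_line` call would presumably be replaced by a gradient-boosted regressor (for example LightGBM's `LGBMRegressor`) trained on one trimmed fingerprint, with the same kind of model at level 2. The scores printed here describe only the synthetic data and have no bearing on the values reported in the abstract.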

Original language: English
Pages (from-to): 29787-29793
Number of pages: 7
Journal: ACS Omega
Volume: 7
Issue number: 34
Publication status: Published - Aug 30 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 American Chemical Society. All rights reserved.

ASJC Scopus Subject Areas

  • General Chemistry
  • General Chemical Engineering
