Regularized Phrase-Based Topic Model for Automatic Question Classification with Domain-Agnostic Class Labels

S. Supraja, Andy W.H. Khong*, S. Tatinati

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)

Abstract

Classification of questions according to domain-agnostic class labels relies on a suitable feature extraction process. We propose representing questions with phrases, which is more effective than representing them with words. The proposed phrase-based topic modeling technique employs asymmetric priors that are scaled with a new C-value for nested regular expressions. In addition, to suppress high-frequency words in phrases, we deploy term weightages computed using the modified distinguishing feature selector. The proposed approach also incorporates a new topic regularization mechanism to facilitate efficient mapping of questions to class labels. We validate the performance of the proposed model on four datasets across different domain-agnostic class labels comprising question types, reasoning capabilities, and cognitive complexities. The results show that the proposed technique outperforms existing methods in terms of macro-average F1 score.
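For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of the standard macro-average F1 score: the per-class F1 values are computed independently and then averaged with equal weight, so rare class labels count as much as frequent ones. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper; the authors' actual evaluation code is not shown in this record.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # predicted p, but it was not p
            fn[t] += 1   # missed an instance of t
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class has no support
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical question-type labels, used only for illustration.
y_true = ["what", "what", "why", "how"]
y_pred = ["what", "why", "why", "how"]
score = macro_f1(y_true, y_pred)
```

Here the per-class F1 values are 2/3 for `what`, 2/3 for `why`, and 1 for `how`, so the macro average is 7/9.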

Original language: English
Pages (from-to): 3604-3616
Number of pages: 13
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 29
Publication status: Published - 2021
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

ASJC Scopus Subject Areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Keywords

  • Automatic question classification
  • nested phrase mining
  • regular expression
  • term weighting schemes
  • topic modeling
