Abstract
Classifying questions according to domain-agnostic class labels relies on a suitable feature extraction process. We propose representing questions with phrases, which is more effective than using words. The proposed phrase-based topic modeling technique employs asymmetric priors that are scaled with a new C-value for nested regular expressions. In addition, to suppress high-frequency words in phrases, we apply term weights computed using the modified distinguishing feature selector. The proposed approach also incorporates a new topic regularization mechanism to facilitate efficient mapping of questions to class labels. We validate the performance of the proposed model on four datasets across different domain-agnostic class labels comprising question types, reasoning capabilities, and cognitive complexities. The results show that the proposed technique outperforms existing methods in terms of macro-average F1 score.
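For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how a macro-average F1 score is commonly computed: the F1 score is calculated per class and then averaged without weighting by class size. The function name `macro_f1` and the sample labels are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper, not from the paper)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical question-type labels, for illustration only.
y_true = ["what", "what", "why", "how"]
y_pred = ["what", "why", "why", "how"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally to the average, this metric penalizes poor performance on rare classes more than accuracy or micro-averaged F1 would.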
Original language | English |
---|---|
Pages (from-to) | 3604-3616 |
Number of pages | 13 |
Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume | 29 |
DOIs | |
Publication status | Published - 2021 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2014 IEEE.
ASJC Scopus Subject Areas
- Computer Science (miscellaneous)
- Acoustics and Ultrasonics
- Computational Mathematics
- Electrical and Electronic Engineering
Keywords
- Automatic question classification
- nested phrase mining
- regular expression
- term weighting schemes
- topic modeling