Abstract
This study uses text classification (TC) to observe “interpretese”, the distinctive linguistic patterns employed by interpreters, in simultaneous interpreting (SI) at United Nations Security Council conferences. A text vectorization method known as TF*IDF is improved with Shannon’s entropy and used to convert interpreted and non-interpreted target-language speeches into vectors. After dimensionality reduction, stacking ensemble learning classifies the vectors into two labeled categories: interpreted speech and non-interpreted speech. Accurate classifications would support the interpretese hypothesis. To explore the universality of interpretese, the study detects it in both directions of SI: when interpreters work from their first into their second language, and when they work from their second into their first language. The results demonstrate successful classifications in both interpreting directions, thereby supporting the interpretese hypothesis. Notably, classification accuracy is higher when the interpreters work into their first language than into their second language, suggesting that interpretese is more pronounced in the former direction and that interpreting direction affects interpreters’ language processing. Classification algorithms vary in their performance on the classification tasks, underscoring the importance of using stacking for ensemble learning to achieve reliable results and justify algorithm selection.
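The abstract does not say how Shannon’s entropy is combined with TF*IDF, so the following is only a minimal sketch of one common way to do it, not the authors’ method. It assumes each TF*IDF weight is scaled by a concentration factor, 1 minus the normalized entropy of the term’s distribution across documents, so that terms spread evenly over all speeches are down-weighted. The function name `entropy_weighted_tfidf`, the smoothed IDF formula, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def entropy_weighted_tfidf(docs):
    """Vectorize tokenized documents with an entropy-scaled TF*IDF.

    ASSUMPTION (not from the paper):
        weight(t, d) = tf(t, d) * idf(t) * (1 - H(t) / log(N))
    where H(t) is the Shannon entropy of term t's frequency
    distribution over the N documents.
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    vocab = sorted({tok for doc in docs for tok in doc})

    # Document frequency and corpus-wide frequency of each term.
    df = {t: sum(1 for c in counts if t in c) for t in vocab}
    total = {t: sum(c[t] for c in counts) for t in vocab}

    def term_entropy(term):
        # Shannon entropy of how the term's occurrences spread over documents.
        h = 0.0
        for c in counts:
            if c[term]:
                p = c[term] / total[term]
                h -= p * math.log(p)
        return h

    # Maximum possible entropy: the term is uniform over all documents.
    max_h = math.log(n_docs) if n_docs > 1 else 1.0

    vectors = []
    for doc, c in zip(docs, counts):
        length = len(doc) or 1
        row = []
        for t in vocab:
            tf = c[t] / length
            idf = math.log((1 + n_docs) / (1 + df[t])) + 1  # smoothed IDF
            concentration = 1.0 - term_entropy(t) / max_h
            row.append(tf * idf * concentration)
        vectors.append(row)
    return vocab, vectors


# Hypothetical toy corpus standing in for tokenized speeches.
toy_docs = [
    ["the", "council", "meets"],
    ["the", "council", "votes"],
    ["the", "delegate", "speaks"],
]
vocab, vectors = entropy_weighted_tfidf(toy_docs)
```

In a fuller pipeline, vectors like these could then be reduced in dimension and passed to a stacking ensemble, for example scikit-learn’s `StackingClassifier`, with the interpreted and non-interpreted labels as targets. That step is omitted here to keep the sketch dependency-free.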
Original language | English |
---|---|
Journal | IEEE Access |
DOIs | |
Publication status | Accepted/In press - 2025 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
ASJC Scopus Subject Areas
- General Computer Science
- General Materials Science
- General Engineering
Keywords
- ensemble learning
- entropy
- interpretese
- interpreting directions
- text classification
- TF*IDF