Details
Original language | English |
---|---|
Pages (from-to) | 430-459 |
Number of pages | 30 |
Journal | International Journal of Corpus Linguistics |
Volume | 28 |
Issue number | 3 |
Publication status | Published - 19 Jul 2023 |
Externally published | Yes |
Abstract
This paper elaborates on the notion of uncertainty in the context of annotation in large text corpora, specifically focusing on (but not limited to) historical languages. Such uncertainty might be due to inherent properties of the language, for example, linguistic ambiguity and overlapping categories of linguistic description, but could also be caused by a lack of annotation expertise. By examining annotation uncertainty in more detail, we identify the sources, deepen our understanding of the nature and different types of uncertainty encountered in daily annotation practice, and discuss practical implications of our theoretical findings. This paper can be seen as an attempt to reconcile the perspectives of the main scientific disciplines involved in corpus projects, linguistics and computer science, to develop a unified view and to highlight the potential synergies between these disciplines.
Keywords
- annotation, fuzziness, grammatical change, uncertainty
ASJC Scopus subject areas
- Arts and Humanities (all)
- Language and Linguistics
- Social Sciences (all)
- Linguistics and Language
Cite this
In: International Journal of Corpus Linguistics, Vol. 28, No. 3, 19.07.2023, p. 430-459.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Annotation uncertainty in the context of grammatical change
AU - Merten, Marie Luis
AU - Wever, Marcel
AU - Geierhos, Michaela
AU - Tophinke, Doris
AU - Hüllermeier, Eyke
N1 - Publisher Copyright: © 2023 John Benjamins Publishing Company.
PY - 2023/7/19
Y1 - 2023/7/19
KW - annotation
KW - fuzziness
KW - grammatical change
KW - uncertainty
UR - http://www.scopus.com/inward/record.url?scp=85168533774&partnerID=8YFLogxK
U2 - 10.1075/ijcl.20113.mer
DO - 10.1075/ijcl.20113.mer
M3 - Article
AN - SCOPUS:85168533774
VL - 28
SP - 430
EP - 459
JO - International Journal of Corpus Linguistics
JF - International Journal of Corpus Linguistics
SN - 1384-6655
IS - 3
ER -