The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

Publication: Contribution to book/report/anthology/conference proceedings › Paper in conference proceedings › Research › Peer-reviewed

Authors

  • Nailia Mirzakhmedova
  • Johannes Kiesel
  • Milad Alshomary
  • Maximilian Heinrich
  • Nicolas Handke
  • Xiaoni Cai
  • Valentin Barriere
  • Doratossadat Dastgheib
  • Omid Ghahroodi
  • MohammadAli SadraeiJavaheri
  • Ehsaneddin Asgari
  • Lea Kawaletz
  • Henning Wachsmuth
  • Benno Stein

Details

Original language: English
Title of host publication: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Place of publication: Torino, Italia
Pages: 16121-16134
Number of pages: 14
ISBN (electronic): 9782493814104
Publication status: Published - 1 May 2024

Abstract

While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were drawn from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.
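
As a purely illustrative sketch of the annotation scheme described in the abstract (each argument labeled by three crowdworkers against 54 human values), the following Python snippet aggregates three hypothetical binary label vectors by majority vote. The field layout and the two-out-of-three rule are assumptions for illustration only, not the dataset's published format or aggregation procedure.

from typing import List

# Hypothetical setup: 54 value categories, 3 crowdworker annotations per argument.
NUM_VALUES = 54
NUM_ANNOTATORS = 3

def majority_vote(annotations: List[List[int]]) -> List[int]:
    # Combine the per-annotator binary vectors into one multi-label vector:
    # a value is assigned if at least two of the three annotators marked it
    # (assumed rule, for illustration only).
    assert len(annotations) == NUM_ANNOTATORS
    assert all(len(a) == NUM_VALUES for a in annotations)
    return [1 if sum(votes) >= 2 else 0 for votes in zip(*annotations)]

# Toy argument with three synthetic annotations over the 54 values.
annotations = [[0] * NUM_VALUES for _ in range(NUM_ANNOTATORS)]
annotations[0][3] = annotations[1][3] = 1   # two annotators mark value 3 -> assigned
annotations[2][10] = 1                      # one annotator marks value 10 -> not assigned

labels = majority_vote(annotations)
print(labels[3], labels[10])                # prints: 1 0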


Cite

The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. / Mirzakhmedova, Nailia; Kiesel, Johannes; Alshomary, Milad et al.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Ed. / Nicoletta Calzolari; Min-Yen Kan; Veronique Hoste; Alessandro Lenci; Sakriani Sakti; Nianwen Xue. Torino, Italia, 2024. pp. 16121-16134.


Mirzakhmedova, N, Kiesel, J, Alshomary, M, Heinrich, M, Handke, N, Cai, X, Barriere, V, Dastgheib, D, Ghahroodi, O, SadraeiJavaheri, M, Asgari, E, Kawaletz, L, Wachsmuth, H & Stein, B 2024, The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. in N Calzolari, M-Y Kan, V Hoste, A Lenci, S Sakti & N Xue (eds), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia, pp. 16121-16134. <https://aclanthology.org/2024.lrec-main.1402/>
Mirzakhmedova, N., Kiesel, J., Alshomary, M., Heinrich, M., Handke, N., Cai, X., Barriere, V., Dastgheib, D., Ghahroodi, O., SadraeiJavaheri, M., Asgari, E., Kawaletz, L., Wachsmuth, H., & Stein, B. (2024). The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. In N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, & N. Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 16121-16134). https://aclanthology.org/2024.lrec-main.1402/
Mirzakhmedova N, Kiesel J, Alshomary M, Heinrich M, Handke N, Cai X et al. The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. In Calzolari N, Kan MY, Hoste V, Lenci A, Sakti S, Xue N, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia. 2024. p. 16121-16134
Mirzakhmedova, Nailia ; Kiesel, Johannes ; Alshomary, Milad et al. / The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Ed. / Nicoletta Calzolari ; Min-Yen Kan ; Veronique Hoste ; Alessandro Lenci ; Sakriani Sakti ; Nianwen Xue. Torino, Italia, 2024. pp. 16121-16134
Download (BibTeX)
@inproceedings{3626ffb7fb31427a9aea4c967d7fb8d6,
title = "The Touch{\'e}23-ValueEval Dataset for Identifying Human Values behind Arguments",
abstract = "While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touch{\'e}23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset`s size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touch{\'e}23-ValueEval dataset was utilized in the SemEval 2023 Task 4. ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.",
keywords = "Corpus (Creation, Annotation, etc.), Document Classification, Text categorisation",
author = "Nailia Mirzakhmedova and Johannes Kiesel and Milad Alshomary and Maximilian Heinrich and Nicolas Handke and Xiaoni Cai and Valentin Barriere and Doratossadat Dastgheib and Omid Ghahroodi and MohammadAli SadraeiJavaheri and Ehsaneddin Asgari and Lea Kawaletz and Henning Wachsmuth and Benno Stein",
note = "Publisher Copyright: {\textcopyright} 2024 ELRA Language Resource Association: CC BY-NC 4.0.",
year = "2024",
month = may,
day = "1",
language = "English",
pages = "16121--16134",
editor = "Nicoletta Calzolari and Min-Yen Kan and Veronique Hoste and Alessandro Lenci and Sakriani Sakti and Nianwen Xue",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",

}

Download (RIS)

TY - GEN

T1 - The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

AU - Mirzakhmedova, Nailia

AU - Kiesel, Johannes

AU - Alshomary, Milad

AU - Heinrich, Maximilian

AU - Handke, Nicolas

AU - Cai, Xiaoni

AU - Barriere, Valentin

AU - Dastgheib, Doratossadat

AU - Ghahroodi, Omid

AU - SadraeiJavaheri, MohammadAli

AU - Asgari, Ehsaneddin

AU - Kawaletz, Lea

AU - Wachsmuth, Henning

AU - Stein, Benno

N1 - Publisher Copyright: © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

PY - 2024/5/1

Y1 - 2024/5/1

N2 - While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were drawn from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.

AB - While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were drawn from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.

KW - Corpus (Creation, Annotation, etc.)

KW - Document Classification

KW - Text categorisation

UR - http://www.scopus.com/inward/record.url?scp=85195967977&partnerID=8YFLogxK

M3 - Conference contribution

SP - 16121

EP - 16134

BT - Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

A2 - Calzolari, Nicoletta

A2 - Kan, Min-Yen

A2 - Hoste, Veronique

A2 - Lenci, Alessandro

A2 - Sakti, Sakriani

A2 - Xue, Nianwen

CY - Torino, Italia

ER -
