The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Nailia Mirzakhmedova
  • Johannes Kiesel
  • Milad Alshomary
  • Maximilian Heinrich
  • Nicolas Handke
  • Xiaoni Cai
  • Valentin Barriere
  • Doratossadat Dastgheib
  • Omid Ghahroodi
  • MohammadAli SadraeiJavaheri
  • Ehsaneddin Asgari
  • Lea Kawaletz
  • Henning Wachsmuth
  • Benno Stein

Details

Original language: English
Title of host publication: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Place of publication: Torino, Italia
Pages: 16121-16134
Number of pages: 14
ISBN (electronic): 9782493814104
Publication status: Published - 1 May 2024

Abstract

While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.

Keywords

    Corpus (Creation, Annotation, etc.), Document Classification, Text categorisation

Cite this

The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. / Mirzakhmedova, Nailia; Kiesel, Johannes; Alshomary, Milad et al.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ed. / Nicoletta Calzolari; Min-Yen Kan; Veronique Hoste; Alessandro Lenci; Sakriani Sakti; Nianwen Xue. Torino, Italia, 2024. p. 16121-16134.


Mirzakhmedova, N, Kiesel, J, Alshomary, M, Heinrich, M, Handke, N, Cai, X, Barriere, V, Dastgheib, D, Ghahroodi, O, SadraeiJavaheri, M, Asgari, E, Kawaletz, L, Wachsmuth, H & Stein, B 2024, The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. in N Calzolari, M-Y Kan, V Hoste, A Lenci, S Sakti & N Xue (eds), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia, pp. 16121-16134. <https://aclanthology.org/2024.lrec-main.1402/>
Mirzakhmedova, N., Kiesel, J., Alshomary, M., Heinrich, M., Handke, N., Cai, X., Barriere, V., Dastgheib, D., Ghahroodi, O., SadraeiJavaheri, M., Asgari, E., Kawaletz, L., Wachsmuth, H., & Stein, B. (2024). The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. In N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, & N. Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 16121-16134). https://aclanthology.org/2024.lrec-main.1402/
Mirzakhmedova N, Kiesel J, Alshomary M, Heinrich M, Handke N, Cai X et al. The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. In Calzolari N, Kan MY, Hoste V, Lenci A, Sakti S, Xue N, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia. 2024. p. 16121-16134
Mirzakhmedova, Nailia ; Kiesel, Johannes ; Alshomary, Milad et al. / The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). editor / Nicoletta Calzolari ; Min-Yen Kan ; Veronique Hoste ; Alessandro Lenci ; Sakriani Sakti ; Nianwen Xue. Torino, Italia, 2024. pp. 16121-16134
@inproceedings{3626ffb7fb31427a9aea4c967d7fb8d6,
title = "The Touch{\'e}23-ValueEval Dataset for Identifying Human Values behind Arguments",
abstract = "While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touch{\'e}23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touch{\'e}23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.",
keywords = "Corpus (Creation, Annotation, etc.), Document Classification, Text categorisation",
author = "Nailia Mirzakhmedova and Johannes Kiesel and Milad Alshomary and Maximilian Heinrich and Nicolas Handke and Xiaoni Cai and Valentin Barriere and Doratossadat Dastgheib and Omid Ghahroodi and MohammadAli SadraeiJavaheri and Ehsaneddin Asgari and Lea Kawaletz and Henning Wachsmuth and Benno Stein",
note = "Publisher Copyright: {\textcopyright} 2024 ELRA Language Resource Association: CC BY-NC 4.0.",
year = "2024",
month = may,
day = "1",
language = "English",
pages = "16121--16134",
editor = "Nicoletta Calzolari and Min-Yen Kan and Veronique Hoste and Alessandro Lenci and Sakriani Sakti and Nianwen Xue",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",

}


TY - GEN

T1 - The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

AU - Mirzakhmedova, Nailia

AU - Kiesel, Johannes

AU - Alshomary, Milad

AU - Heinrich, Maximilian

AU - Handke, Nicolas

AU - Cai, Xiaoni

AU - Barriere, Valentin

AU - Dastgheib, Doratossadat

AU - Ghahroodi, Omid

AU - SadraeiJavaheri, MohammadAli

AU - Asgari, Ehsaneddin

AU - Kawaletz, Lea

AU - Wachsmuth, Henning

AU - Stein, Benno

N1 - Publisher Copyright: © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

PY - 2024/5/1

Y1 - 2024/5/1

N2 - While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.

AB - While human values play a crucial role in making arguments persuasive, we currently lack the necessary extensive datasets to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated an additional 4780 new arguments, doubling the dataset's size to 9324 arguments. These arguments were sourced from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was utilized in SemEval 2023 Task 4: ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.

KW - Corpus (Creation, Annotation, etc.)

KW - Document Classification

KW - Text categorisation

UR - http://www.scopus.com/inward/record.url?scp=85195967977&partnerID=8YFLogxK

M3 - Conference contribution

SP - 16121

EP - 16134

BT - Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

A2 - Calzolari, Nicoletta

A2 - Kan, Min-Yen

A2 - Hoste, Veronique

A2 - Lenci, Alessandro

A2 - Sakti, Sakriani

A2 - Xue, Nianwen

CY - Torino, Italia

ER -
