Details
Original language | English |
---|---|
Title of host publication | ICMR '24 |
Subtitle of host publication | Proceedings of the 2024 International Conference on Multimedia Retrieval |
Pages | 506-514 |
Number of pages | 9 |
ISBN (electronic) | 9798400706028 |
Publication status | Published - 7 Jun 2024 |
Event | 2024 International Conference on Multimedia Retrieval, ICMR 2024 - Phuket, Thailand. Duration: 10 Jun 2024 → 14 Jun 2024 |
Abstract
The proliferation of news sources on the web amplifies the problem of disinformation and misinformation, impacting public perception and societal stability. These issues necessitate the identification of bias in news broadcasts, whereby the analysis and understanding of speaker roles and news contexts are essential prerequisites. Although there is prior research on multimodal speaker role recognition (mostly) in the news domain, modern feature representations have not been explored yet, and no comprehensive public dataset is available. In this paper, we propose novel approaches to classify speaker roles (e.g., “anchor,” “reporter,” “expert”) and categorise scenes into news situations (e.g., “report,” “interview”) in news videos, to enhance the understanding of news content. To bridge the gap of missing datasets, we present a novel annotated dataset for various speaker roles and news situations from diverse (national) media outlets. Furthermore, we suggest a rich set of features and employ aggregation and post-processing techniques. In our experiments, we compare classifiers like Random Forest and XGBoost for identifying speaker roles and news situations in video segments. Our approach outperforms recent state-of-the-art methods, including end-to-end multimodal deep network and unimodal transformer-based models. Through detailed feature combination analysis, generalisation and explainability insights, we underscore our models’ capabilities and set new directions for future research.
Keywords
- news situations, news videos, speaker roles, video classification
ASJC Scopus subject areas
- Computer Science(all)
- Computer Graphics and Computer-Aided Design
- Human-Computer Interaction
- Software
Cite this
ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval. 2024. p. 506-514.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Identification of Speaker Roles and Situation Types in News Videos
AU - Cheema, Gullal S.
AU - Arafat, Judi
AU - Tseng, Chiao I.
AU - Bateman, John A.
AU - Ewerth, Ralph
AU - Müller-Budack, Eric
N1 - Publisher Copyright: © 2024 Copyright held by the owner/author(s).
PY - 2024/6/7
Y1 - 2024/6/7
AB - The proliferation of news sources on the web amplifies the problem of disinformation and misinformation, impacting public perception and societal stability. These issues necessitate the identification of bias in news broadcasts, whereby the analysis and understanding of speaker roles and news contexts are essential prerequisites. Although there is prior research on multimodal speaker role recognition (mostly) in the news domain, modern feature representations have not been explored yet, and no comprehensive public dataset is available. In this paper, we propose novel approaches to classify speaker roles (e.g., “anchor,” “reporter,” “expert”) and categorise scenes into news situations (e.g., “report,” “interview”) in news videos, to enhance the understanding of news content. To bridge the gap of missing datasets, we present a novel annotated dataset for various speaker roles and news situations from diverse (national) media outlets. Furthermore, we suggest a rich set of features and employ aggregation and post-processing techniques. In our experiments, we compare classifiers like Random Forest and XGBoost for identifying speaker roles and news situations in video segments. Our approach outperforms recent state-of-the-art methods, including end-to-end multimodal deep network and unimodal transformer-based models. Through detailed feature combination analysis, generalisation and explainability insights, we underscore our models’ capabilities and set new directions for future research.
KW - news situations
KW - news videos
KW - speaker roles
KW - video classification
UR - http://www.scopus.com/inward/record.url?scp=85199157357&partnerID=8YFLogxK
U2 - 10.1145/3652583.3658101
DO - 10.1145/3652583.3658101
M3 - Conference contribution
AN - SCOPUS:85199157357
SP - 506
EP - 514
BT - ICMR '24
T2 - 2024 International Conference on Multimedia Retrieval, ICMR 2024
Y2 - 10 June 2024 through 14 June 2024
ER -