Details
Original language | English |
---|---|
Title of host publication | ICMR 2020 |
Subtitle of host publication | Proceedings of the 2020 International Conference on Multimedia Retrieval |
Place of Publication | New York |
Pages | 16-25 |
Number of pages | 10 |
ISBN (electronic) | 9781450370875 |
Publication status | Published - 8 Jun 2020 |
Event | 10th ACM International Conference on Multimedia Retrieval, ICMR 2020 - Dublin, Ireland (Duration: 8 Jun 2020 → 11 Jun 2020) |
Abstract
The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.
Keywords
- Cross-modal consistency, Cross-modal entity verification, Deep learning, Image repurposing detection, Multimodal retrieval
ASJC Scopus subject areas
- Computer Science(all)
- Computer Networks and Communications
- Computer Science Applications
- Computer Graphics and Computer-Aided Design
Cite this
ICMR 2020: Proceedings of the 2020 International Conference on Multimedia Retrieval. New York, 2020. p. 16-25.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Multimodal analytics for real-world news using measures of cross-modal entity consistency
AU - Müller-Budack, Eric
AU - Theiner, Jonas
AU - Diering, Sebastian
AU - Idahl, Maximilian
AU - Ewerth, Ralph
N1 - Funding information: This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no 812997, and the German Research Foundation (DFG: Deutsche Forschungsgemeinschaft, project number: 388420599). We are very grateful to Avishek Anand (L3S Research Center, Leibniz Universität Hannover) for his valuable comments that improved the quality of the paper.
PY - 2020/6/8
Y1 - 2020/6/8
AB - The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.
KW - Cross-modal consistency
KW - Cross-modal entity verification
KW - Deep learning
KW - Image repurposing detection
KW - Multimodal retrieval
UR - http://www.scopus.com/inward/record.url?scp=85086904454&partnerID=8YFLogxK
U2 - 10.1145/3372278.3390670
DO - 10.1145/3372278.3390670
M3 - Conference contribution
AN - SCOPUS:85086904454
SP - 16
EP - 25
BT - ICMR 2020
CY - New York
T2 - 10th ACM International Conference on Multimedia Retrieval, ICMR 2020
Y2 - 8 June 2020 through 11 June 2020
ER -