Details
Original language | English |
---|---|
Pages (from-to) | 57492-57503 |
Number of pages | 12 |
Journal | IEEE ACCESS |
Volume | 12 |
Publication status | Published - 22 Apr 2024 |
Abstract
Core to much of modern deep learning is the notion of representation learning, learning representations of things that are useful for performing some task(s) related to those things. Encoder-only language models, for example, learn representations of language useful for performing language-related tasks, often classification. While fruitful in many applications, inherent is the assumption that only one classification is to be made for a particular input. This poses challenges when multiple classifications are to be made about different portions of a single record, such as emotion recognition in conversation (ERC) where the objective is to classify the emotion in each utterance of a dialog. Existing methods for this task typically either involve redundant computation, non-trivial post-processing outside of the core language model backbone, or both. To address this, we generalize recent work for deriving player-specific embeddings from multi-player sequences of events in sport for domain-agnostic application while also enabling it to leverage inter-entity relationships. Seeing the efficacy of the method in regression and classification tasks, we explore how it can be used to cluster player representations, proposing a novel approach for distribution-aware deep-clustering in the absence of labels. We demonstrate how the proposed methods yield state-of-the-art performance on the disparate tasks of ERC in Natural Language Processing (NLP), long-tail partial-label-learning (LT-PLL) in Computer Vision (CV), and player form clustering in sports analytics.
Keywords
- emotion recognition in conversation, long-tail partial-label-learning, Representation learning, sports analytics
In: IEEE ACCESS, Vol. 12, 22.04.2024, p. 57492-57503.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Embedding and Clustering Multi-Entity Sequences
AU - Heaton, Connor
AU - Mitra, Prasenjit
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024/4/22
Y1 - 2024/4/22
KW - emotion recognition in conversation
KW - long-tail partial-label-learning
KW - Representation learning
KW - sports analytics
UR - http://www.scopus.com/inward/record.url?scp=85191342903&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3391820
DO - 10.1109/ACCESS.2024.3391820
M3 - Article
AN - SCOPUS:85191342903
VL - 12
SP - 57492
EP - 57503
JO - IEEE ACCESS
JF - IEEE ACCESS
SN - 2169-3536
ER -