Details
Original language | English |
---|---|
Title of host publication | 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 9092-9096 |
Number of pages | 5 |
ISBN (electronic) | 9781665405409 |
ISBN (print) | 978-1-6654-0541-6 |
Publication status | Published - 2022 |
Event | 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore; Duration: 23 May 2022 → 27 May 2022 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
Volume | 2022-May |
ISSN (Print) | 1520-6149 |
Abstract
Covid-19 has caused a huge health crisis worldwide in the past two years. Although early detection of the virus through nucleic acid screening can considerably reduce its spread, the efficiency of this diagnostic process is limited by its complexity and cost. Hence, an effective and inexpensive way to detect Covid-19 early is still needed. Considering that the cough of an infected person contains a large amount of information, we propose an algorithm for the automatic recognition of Covid-19 from cough signals. Our approach generates static log-Mel spectrograms with deltas and delta-deltas from the cough signal and subsequently extracts feature maps through a Convolutional Neural Network (CNN). Following the advances in transformers in the realm of deep learning, our proposed architecture exploits a novel adaptive position embedding structure which can learn the position information of the features from the CNN output. Overlaying this embedding on the CNN output makes the transformer structure rapidly lock onto the attention feature location, which yields better classification. On the INTERSPEECH 2021 Computational Paralinguistics Challenge CCS (Coughing Sub Challenge) database, the proposed architecture improves on the baseline in our experiments, reaching 72.6% UAR (Unweighted Average Recall).
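The abstract reports its result as UAR (Unweighted Average Recall), the mean of the per-class recalls, so that a small class counts as much as a large one. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `unweighted_average_recall` and the toy labels are assumptions made up for illustration and are not taken from the paper or the challenge data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (hypothetical helper, illustration only)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall is computed per class, then averaged without class-size weights.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Made-up, imbalanced toy labels: 8 negative and 2 positive samples.
y_true = ["negative"] * 8 + ["positive"] * 2
y_pred = ["negative"] * 8 + ["negative", "positive"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

On these toy labels the recall is 8/8 for the negative class and 1/2 for the positive class, so the UAR is 0.75 while plain accuracy is 0.90. This gap is why UAR is a common choice for imbalanced tasks such as detecting a minority of positive cases.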
Keywords
- Adaptive Position Embedding Transformer, Computer Audition, Convolutional Neural Network, log-Mel Spectrogram, SARS-CoV2 Detection
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Computer Science(all)
- Signal Processing
- Engineering(all)
- Electrical and Electronic Engineering
Cite this
2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2022. p. 9092-9096 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2022-May).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Convoluational Transformer With Adaptive Position Embedding For Covid-19 Detection From Cough Sounds
AU - Yan, Tianhao
AU - Meng, Hao
AU - Liu, Shuo
AU - Parada-Cabaleiro, Emilia
AU - Ren, Zhao
AU - Schuller, Björn W.
PY - 2022
Y1 - 2022
AB - Covid-19 has caused a huge health crisis worldwide in the past two years. Although early detection of the virus through nucleic acid screening can considerably reduce its spread, the efficiency of this diagnostic process is limited by its complexity and cost. Hence, an effective and inexpensive way to detect Covid-19 early is still needed. Considering that the cough of an infected person contains a large amount of information, we propose an algorithm for the automatic recognition of Covid-19 from cough signals. Our approach generates static log-Mel spectrograms with deltas and delta-deltas from the cough signal and subsequently extracts feature maps through a Convolutional Neural Network (CNN). Following the advances in transformers in the realm of deep learning, our proposed architecture exploits a novel adaptive position embedding structure which can learn the position information of the features from the CNN output. Overlaying this embedding on the CNN output makes the transformer structure rapidly lock onto the attention feature location, which yields better classification. On the INTERSPEECH 2021 Computational Paralinguistics Challenge CCS (Coughing Sub Challenge) database, the proposed architecture improves on the baseline in our experiments, reaching 72.6% UAR (Unweighted Average Recall).
KW - Adaptive Position Embedding Transformer
KW - Computer Audition
KW - Convolutional Neural Network
KW - log-Mel Spectrogram
KW - SARS-CoV2 Detection
UR - http://www.scopus.com/inward/record.url?scp=85131243911&partnerID=8YFLogxK
U2 - 10.1109/icassp43922.2022.9747513
DO - 10.1109/icassp43922.2022.9747513
M3 - Conference contribution
AN - SCOPUS:85131243911
SN - 978-1-6654-0541-6
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 9092
EP - 9096
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Y2 - 23 May 2022 through 27 May 2022
ER -