Analyzing the memory ordering models of the Apple M1

Publication: Contribution to journal, Article, Research, Peer-reviewed

Authors

  • Lars Wrenger
  • Dominik Töllner
  • Daniel Lohmann

Details

Original language: English
Article number: 103102
Number of pages: 8
Journal: Journal of Systems Architecture
Volume: 149
Early online date: 4 March 2024
Publication status: Published - Apr 2024

Abstract

The Apple M1 ARM processor family incorporates two memory consistency models: the conventional ARM weak memory ordering and the Total store ordering (TSO) model from the x86 architecture utilized by Apple's x86 emulator, Rosetta 2. The presence of both memory ordering models on the same hardware enables us to thoroughly benchmark and compare their performance characteristics and worst-case workloads. In this paper, we assess the performance implications of TSO on the Apple M1 processor architecture. Based on the multi-threading workloads of the SPEC2017 CPU FP benchmark suite, our findings indicate that TSO is, on average, 8.94 percent slower than ARM's weaker memory ordering. Through synthetic benchmarks, we further explore the workloads that experience the most significant performance degradation due to TSO. We also take a deeper look into the specific atomic instructions provided by the ARMv8.3 specification and their synchronization overheads.
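
To make the difference between the two models concrete, the following is a minimal sketch of the classic message-passing pattern. It is not taken from the paper; the function name `message_passing_once` and the value 42 are assumptions chosen for illustration. Under a weak model such as ARM's, the release store and acquire load are what prevent the consumer from seeing the flag without the payload. Under TSO, the hardware already preserves store-store and load-load order, so ordinary stores and loads are enough at the machine level.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

/// Runs one round of message passing and returns the payload value
/// that the consumer observes after it has seen the flag.
fn message_passing_once() -> u32 {
    let payload = AtomicU32::new(0); // the data being published
    let ready = AtomicBool::new(false); // the flag that publishes it

    thread::scope(|s| {
        // Producer: write the payload, then set the flag.
        s.spawn(|| {
            payload.store(42, Ordering::Relaxed);
            // Release: the payload store above may not be reordered
            // after this store.
            ready.store(true, Ordering::Release);
        });

        // Consumer: wait for the flag, then read the payload.
        let consumer = s.spawn(|| {
            // Acquire: the payload load below may not be reordered
            // before this load.
            while !ready.load(Ordering::Acquire) {
                thread::yield_now();
            }
            payload.load(Ordering::Relaxed)
        });

        consumer.join().unwrap()
    })
}
```

If both orderings on `ready` were weakened to `Ordering::Relaxed`, the language would no longer guarantee that the consumer reads 42, and weakly ordered hardware may actually exhibit the reordering. That extra ordering guarantee, applied to every store and load, is the cost that a TSO mode trades against the weaker default.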

Cite

Analyzing the memory ordering models of the Apple M1. / Wrenger, Lars; Töllner, Dominik; Lohmann, Daniel.
In: Journal of Systems Architecture, Vol. 149, 103102, 04.2024.

Wrenger L, Töllner D, Lohmann D. Analyzing the memory ordering models of the Apple M1. Journal of Systems Architecture. 2024 Apr;149:103102. Epub 2024 Mar 4. doi: 10.1016/j.sysarc.2024.103102
Wrenger, Lars; Töllner, Dominik; Lohmann, Daniel. / Analyzing the memory ordering models of the Apple M1. In: Journal of Systems Architecture. 2024; Vol. 149.
@article{b22c44b5838441f7b74c74c5434d90b8,
title = "Analyzing the memory ordering models of the Apple M1",
abstract = "The Apple M1 ARM processor family incorporates two memory consistency models: the conventional ARM weak memory ordering and the Total store ordering (TSO) model from the x86 architecture utilized by Apple's x86 emulator, Rosetta 2. The presence of both memory ordering models on the same hardware enables us to thoroughly benchmark and compare their performance characteristics and worst-case workloads. In this paper, we assess the performance implications of TSO on the Apple M1 processor architecture. Based on the multi-threading workloads of the SPEC2017 CPU FP benchmark suite, our findings indicate that TSO is, on average, 8.94 percent slower than ARM's weaker memory ordering. Through synthetic benchmarks, we further explore the workloads that experience the most significant performance degradation due to TSO. We also take a deeper look into the specific atomic instructions provided by the ARMv8.3 specification and their synchronization overheads.",
keywords = "Apple M1, ARM, Memory ordering, TSO",
author = "Lars Wrenger and Dominik T{\"o}llner and Daniel Lohmann",
note = "Funding Information: We thank our reviewers for their valuable feedback. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – LO 1719/8-1.",
year = "2024",
month = apr,
doi = "10.1016/j.sysarc.2024.103102",
language = "English",
volume = "149",
journal = "Journal of Systems Architecture",
issn = "1383-7621",
publisher = "Elsevier",

}

TY - JOUR

T1 - Analyzing the memory ordering models of the Apple M1

AU - Wrenger, Lars

AU - Töllner, Dominik

AU - Lohmann, Daniel

N1 - Funding Information: We thank our reviewers for their valuable feedback. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – LO 1719/8-1.

PY - 2024/4

Y1 - 2024/4

N2 - The Apple M1 ARM processor family incorporates two memory consistency models: the conventional ARM weak memory ordering and the Total store ordering (TSO) model from the x86 architecture utilized by Apple's x86 emulator, Rosetta 2. The presence of both memory ordering models on the same hardware enables us to thoroughly benchmark and compare their performance characteristics and worst-case workloads. In this paper, we assess the performance implications of TSO on the Apple M1 processor architecture. Based on the multi-threading workloads of the SPEC2017 CPU FP benchmark suite, our findings indicate that TSO is, on average, 8.94 percent slower than ARM's weaker memory ordering. Through synthetic benchmarks, we further explore the workloads that experience the most significant performance degradation due to TSO. We also take a deeper look into the specific atomic instructions provided by the ARMv8.3 specification and their synchronization overheads.

KW - Apple M1

KW - ARM

KW - Memory ordering

KW - TSO

UR - http://www.scopus.com/inward/record.url?scp=85186716348&partnerID=8YFLogxK

U2 - 10.1016/j.sysarc.2024.103102

DO - 10.1016/j.sysarc.2024.103102

M3 - Article

AN - SCOPUS:85186716348

VL - 149

JO - Journal of Systems Architecture

JF - Journal of Systems Architecture

SN - 1383-7621

M1 - 103102

ER -