Details
Originalsprache | Englisch |
---|---|
Aufsatznummer | 103102 |
Seitenumfang | 8 |
Fachzeitschrift | Journal of Systems Architecture |
Jahrgang | 149 |
Frühes Online-Datum | 4 März 2024 |
Publikationsstatus | Veröffentlicht - Apr. 2024 |
Abstract
The Apple M1 ARM processor family incorporates two memory consistency models: the conventional ARM weak memory ordering and the Total store ordering (TSO) model from the x86 architecture utilized by Apple's x86 emulator, Rosetta 2. The presence of both memory ordering models on the same hardware enables us to thoroughly benchmark and compare their performance characteristics and worst-case workloads. In this paper, we assess the performance implications of TSO on the Apple M1 processor architecture. Based on the multi-threading workloads of the SPEC2017 CPU FP benchmark suite, our findings indicate that TSO is, on average, 8.94 percent slower than ARM's weaker memory ordering. Through synthetic benchmarks, we further explore the workloads that experience the most significant performance degradation due to TSO. We also take a deeper look into the specific atomic instructions provided by the ARMv8.3 specification and their synchronization overheads.
ASJC Scopus Sachgebiete
- Informatik (insg.)
- Software
- Informatik (insg.)
- Hardware und Architektur
Zitieren
- Standard
- Harvard
- Apa
- Vancouver
- BibTex
- RIS
in: Journal of Systems Architecture, Jahrgang 149, 103102, 04.2024.
Publikation: Beitrag in Fachzeitschrift › Artikel › Forschung › Peer-Review
}
TY - JOUR
T1 - Analyzing the memory ordering models of the Apple M1
AU - Wrenger, Lars
AU - Töllner, Dominik
AU - Lohmann, Daniel
N1 - Funding Information: We thank our reviewers for their valuable feedback. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – LO 1719/8-1 .
PY - 2024/4
Y1 - 2024/4
N2 - The Apple M1 ARM processor family incorporates two memory consistency models: the conventional ARM weak memory ordering and the Total store ordering (TSO) model from the x86 architecture utilized by Apple's x86 emulator, Rosetta 2. The presence of both memory ordering models on the same hardware enables us to thoroughly benchmark and compare their performance characteristics and worst-case workloads. In this paper, we assess the performance implications of TSO on the Apple M1 processor architecture. Based on the multi-threading workloads of the SPEC2017 CPU FP benchmark suite, our findings indicate that TSO is, on average, 8.94 percent slower than ARM's weaker memory ordering. Through synthetic benchmarks, we further explore the workloads that experience the most significant performance degradation due to TSO. We also take a deeper look into the specific atomic instructions provided by the ARMv8.3 specification and their synchronization overheads.
AB - The Apple M1 ARM processor family incorporates two memory consistency models: the conventional ARM weak memory ordering and the Total store ordering (TSO) model from the x86 architecture utilized by Apple's x86 emulator, Rosetta 2. The presence of both memory ordering models on the same hardware enables us to thoroughly benchmark and compare their performance characteristics and worst-case workloads. In this paper, we assess the performance implications of TSO on the Apple M1 processor architecture. Based on the multi-threading workloads of the SPEC2017 CPU FP benchmark suite, our findings indicate that TSO is, on average, 8.94 percent slower than ARM's weaker memory ordering. Through synthetic benchmarks, we further explore the workloads that experience the most significant performance degradation due to TSO. We also take a deeper look into the specific atomic instructions provided by the ARMv8.3 specification and their synchronization overheads.
KW - Apple M1
KW - ARM
KW - Memory ordering
KW - TSO
UR - http://www.scopus.com/inward/record.url?scp=85186716348&partnerID=8YFLogxK
U2 - 10.1016/j.sysarc.2024.103102
DO - 10.1016/j.sysarc.2024.103102
M3 - Article
AN - SCOPUS:85186716348
VL - 149
JO - Journal of Systems Architecture
JF - Journal of Systems Architecture
SN - 1383-7621
M1 - 103102
ER -