
Microarchitectural attacks have, in recent years, emerged as a central concern for system security, particularly as computing systems become increasingly heterogeneous—utilizing a mix of CPUs, GPUs, accelerators, and other processing units. The research presented in "(M)WAIT for It: Bridging the Gap between Microarchitectural and Architectural Attacks" introduces a novel primitive that enables attackers to convert microarchitectural states into architectural states, without relying on precise timing measurements. This innovation broadens the potential for exploitation and increases the attack surface on modern platforms.
This post walks through the basics and advanced aspects of microarchitectural attacks, demonstrates technical code examples, explains how such attacks are detected and how systems are scanned for them, and shows how this knowledge affects real-world cyber-defense and offense, using (M)WAIT and related attacks as case studies.
We will also touch upon insights provided by the seminal survey "Microarchitectural Attacks in Heterogeneous Systems" and the new transient execution attack "Inception", CVE-2023-20569, on AMD Zen CPUs.
Microarchitectural attacks represent a class of exploits that use the subtle behaviors and implementation details of computer hardware—specifically, those not intentionally expressed in the architectural specifications or APIs. The rapid evolution in processor design and the increasing adoption of heterogeneous systems have inadvertently expanded the range of exploitable microarchitectural states.
Heterogeneous systems—those with multiple, diverse types of processing units—introduce new challenges and risks for system security. Differences in how these components interact and share resources (e.g., memory, caches, interconnects) can leak sensitive data.
The (M)WAIT primitive enables attackers to exploit these low-level states and amplify them into architectural-level leaks, replacing hard-to-measure, high-noise microarchitectural effects with programmatically observable events such as instruction results or register values.
In a typical processor, architectural state refers to the predictable, programmer-visible parts of computation: registers, memory, and the results of instructions. Microarchitectural state consists of the hidden details: pipeline stages, branch predictors, CPU caches, execution queues, buffer statuses, and so on.
While the architectural state is what programs interact with, microarchitectural state frequently leaks information implicitly. Attackers exploit this through various channels, typically falling into two main categories:
Side-channels: These are unintended outputs of a system (such as execution time, cache usage stats, or power consumption) that can be measured to infer secret data, such as cryptographic keys. Example: Measuring how long encryption takes may reveal bits of the key.
Covert-channels: These are unauthorized channels created intentionally by a malicious process to transmit data in a sneaky manner, typically by manipulating microarchitectural state (like cache lines or branch predictor entries) so another process can observe changes and reconstruct the message.
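To make the covert-channel idea concrete, here is a minimal sketch that simulates it in pure Python. The `ToyCache` class and the `send_bit`, `receive_bit` and `transmit` helpers are hypothetical names for illustration. On real hardware the receiver would infer hit or miss from access latency; in this toy model the cache simply reports it.

```python
class ToyCache:
    """Toy model of a shared cache: a set of resident line numbers."""

    def __init__(self):
        self.lines = set()

    def flush(self, line):
        # Evict the line if it is resident.
        self.lines.discard(line)

    def access(self, line):
        # Report whether the access hit, then make the line resident.
        hit = line in self.lines
        self.lines.add(line)
        return hit


def send_bit(cache, bit, line=1):
    # Sender: touch the agreed line to signal 1, leave it alone to signal 0.
    if bit:
        cache.access(line)


def receive_bit(cache, line=1):
    # Receiver: a hit means the sender touched the line since the last flush.
    hit = cache.access(line)
    cache.flush(line)  # reset the line for the next round
    return 1 if hit else 0


def transmit(message: bytes) -> bytes:
    # Send every bit of the message through the simulated cache line.
    cache = ToyCache()
    received = bytearray()
    for byte in message:
        value = 0
        for i in range(8):
            cache.flush(1)
            send_bit(cache, (byte >> i) & 1)
            value |= receive_bit(cache) << i
        received.append(value)
    return bytes(received)
```

Calling `transmit(b"key")` passes each bit through the shared line and reassembles the bytes on the receiving side, without any direct communication between the two helpers.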
The (M)WAIT attack primitive, introduced at USENIX Security 2023, presents a fundamentally new way to turn microarchitectural states into architectural leakage without relying on fine-grained timers (which are often restricted or fuzzed by defenses).
It uses MWAIT or related x86 instructions (such as MONITOR/MWAIT and their user-mode counterparts UMONITOR/UMWAIT) to pause execution until a monitored memory location is written.
After MWAIT wakes up, the architectural effects can depend on the hidden microarchitectural state from before the wait.
(M)WAIT thereby lets attackers amplify microarchitectural phenomena into observable differences in program behavior, without relying on or measuring high-resolution timers.
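The primitive builds on the MONITOR/MWAIT instructions, which prepare the CPU to sleep and resume based on memory accesses or specific events.
MWAIT in Assembly
; Note: actual usage of MONITOR/MWAIT requires ring0 (kernel privilege)
lea rax, [monitored_addr] ; MONITOR: RAX = address to watch (monitored_addr is a placeholder label)
xor ecx, ecx ; MONITOR: ECX = extensions (none)
xor edx, edx ; MONITOR: EDX = hints (none)
MONITOR ; Set up monitoring
xor eax, eax ; MWAIT: EAX = hints (C-state)
xor ecx, ecx ; MWAIT: ECX = extensions (none)
MWAIT ; Wait for monitored address to be written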
If these instructions are exposed (e.g., via virtualization or privileged operations), attackers can orchestrate the attack sequence across VMs or protected kernel code.
Most side-channel attacks have historically relied on timing measurements (using RDTSC, high-resolution timers, etc.) to infer the state of microarchitectural components. Many modern OS and hardware defenses fuzz, slow down, or remove high-resolution timers. (M)WAIT evades these mitigations by leveraging architectural activities as indicators.
No need for explicit timing—simply compare architectural behaviors after (M)WAIT!
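The sketch below is a toy model of that idea, not real instruction semantics. It is loosely inspired by a wait instruction that reports whether it woke because of a timeout or because the monitored line was touched. `ToyWaitCore`, `PROBE_ADDR`, `victim`, `leak_bit` and `leak_byte` are hypothetical names, and the single returned flag stands in for whatever architectural result the real primitive exposes.

```python
class ToyWaitCore:
    """Toy model of a monitor/wait pair that reports why the wait ended."""

    def __init__(self):
        self.armed_addr = None
        self.woken_by_write = False

    def monitor(self, addr):
        # Arm the monitor on one address.
        self.armed_addr = addr
        self.woken_by_write = False

    def write(self, addr):
        # Model another party touching a line while the monitor is armed.
        if addr == self.armed_addr:
            self.woken_by_write = True

    def wait(self):
        # Return 1 if the wait "timed out", 0 if the monitored line was touched.
        timed_out = not self.woken_by_write
        self.armed_addr = None
        return 1 if timed_out else 0


PROBE_ADDR = 0x1000  # arbitrary address used only inside this model


def victim(core, secret_bit):
    # The victim touches the probe line only when its secret bit is 1.
    if secret_bit:
        core.write(PROBE_ADDR)


def leak_bit(core, secret_bit):
    # Attacker: arm, let the victim run, then read the flag. No clock is used.
    core.monitor(PROBE_ADDR)
    victim(core, secret_bit)
    timeout_flag = core.wait()
    return 0 if timeout_flag else 1


def leak_byte(secret):
    # Recover one byte bit by bit through the flag alone.
    core = ToyWaitCore()
    return sum(leak_bit(core, (secret >> i) & 1) << i for i in range(8))
```

The attacker code never reads a timer: the secret is recovered from an ordinary return value, which is the property that makes timer restrictions ineffective against this style of attack.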
By eliminating the need for explicit timing, (M)WAIT attacks remain practical where high-resolution timers are unavailable (e.g., under sandboxing), or on mixed-CPU platforms.
As summarized in Microarchitectural Attacks in Heterogeneous Systems, modern computing environments increasingly rely on heterogeneous hardware.
Inception is a class of transient execution attacks, reminiscent of Spectre and Meltdown, that specifically targets AMD Zen CPUs. The write-up "Inception: how a simple XOR can cause arbitrary data leakage" demonstrates that even after several high-profile mitigations, new attack vectors can be crafted from simple architectural instructions.
At its core, Inception leverages the transient execution window—where the CPU has speculatively fetched instructions and executed them before confirming if they're architecturally correct.
Here's a simplified outline of the victim side in C:
unsigned char* shared_region = ...; // mmap'ed shared memory
volatile unsigned char temp; // Sink so the loads below are not optimized away
void secret_leaking_function(char secret) {
    // Transiently, access data depending on the secret bit
    if (secret & 0x1) {
        // Access one cache line
        temp = shared_region[64];
    } else {
        // Access another cache line
        temp = shared_region[128];
    }
}
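Attacker's monitoring side (C, x86 with GCC or Clang):
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h> // _mm_clflush, _mm_mfence, __rdtscp

static unsigned char probe[4096]; // Example buffer; a real attack probes memory shared with the victim

// Time one read of *addr in TSC cycles
static uint64_t time_read(volatile unsigned char *addr) {
    unsigned int aux;
    _mm_mfence();
    uint64_t start = __rdtscp(&aux);
    (void)*addr;
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

int main(void) {
    volatile unsigned char *address = &probe[64]; // Example address
    _mm_clflush((const void *)address); // Flush the line from the cache
    _mm_mfence();
    // ... the victim runs here ...
    uint64_t latency = time_read(address);
    printf("Latency: %llu cycles\n", (unsigned long long)latency);
    // Low latency means cache hit; high latency means miss
    return 0;
}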
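Turning raw latencies into a leaked bit needs a hit/miss threshold. The following is a minimal sketch of that decoding step, assuming the victim above touches offset 64 when the secret bit is 1 and offset 128 otherwise. `calibrate_threshold` and `decode_bit` are hypothetical helper names, and the midpoint-of-medians threshold is only one simple choice.

```python
from statistics import median


def calibrate_threshold(hit_samples, miss_samples):
    # Midpoint between the typical cached and uncached latency.
    return (median(hit_samples) + median(miss_samples)) / 2


def decode_bit(latency_line64, latency_line128, threshold):
    # A fast reload of offset 64 means the victim took the "secret bit is 1" path.
    hit64 = latency_line64 < threshold
    hit128 = latency_line128 < threshold
    if hit64 == hit128:
        return None  # both or neither line cached: measurement is ambiguous
    return 1 if hit64 else 0
```

Returning `None` for ambiguous rounds lets the caller repeat the measurement instead of guessing, which matters because real latency samples are noisy.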
Note that Inception concerns how the transient execution window is created, not how the result is read out: the transiently accessed data still has to be recovered through a covert channel such as the cache probe above.
Detecting and mitigating microarchitectural attacks is fundamentally challenging. Defenses must cover several layers, including access to wait instructions (e.g., MONITOR/MWAIT).
Many microarchitectural vulnerabilities are published as CVEs. On Linux, the spectre-meltdown-checker tool helps identify whether your hardware/OS configuration is susceptible.
sudo apt-get install spectre-meltdown-checker
sudo spectre-meltdown-checker
Sample output (abridged):
CVE-2017-5753 [Spectre Variant 1] : Mitigated
CVE-2018-3639 [Spectre Variant 4] : Vulnerable
CVE-2023-20569 [Inception] : Vulnerable
...
Scanning for MONITOR/MWAIT Usage
Use objdump or grep to scan binaries for the presence of these instructions.
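# List all binaries in a directory using MONITOR/MWAIT-family instructions
for f in /usr/bin/*; do
    # Match the mnemonic column only, so symbol names such as <monitor_thread> are ignored
    if objdump -d "$f" 2>/dev/null | grep -q -E '[[:space:]]u?(monitor|mwait)x?([[:space:]]|$)'; then
        echo "$f uses MONITOR/MWAIT"
    fi
done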
If an unprivileged binary or user code contains these, it's a red flag.
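A plain text match on "monitor" also hits symbol names and comments in the disassembly. One way to reduce false positives is to look only at the mnemonic column. The sketch below assumes GNU objdump's tab-separated `-d` layout (address, raw bytes, instruction); `WAIT_MNEMONICS` and `find_wait_instructions` are hypothetical names.

```python
# Mnemonics of the x86 monitor/wait instruction family.
WAIT_MNEMONICS = {"monitor", "mwait", "monitorx", "mwaitx", "umonitor", "umwait"}


def find_wait_instructions(disassembly):
    """Return the wait-family mnemonics found in `objdump -d` text output."""
    found = []
    for line in disassembly.splitlines():
        # Expected layout: "  address:<TAB>raw bytes<TAB>mnemonic operands"
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # headers, labels, and byte-only continuation lines
        parts = fields[2].split()
        if parts and parts[0] in WAIT_MNEMONICS:
            found.append(parts[0])
    return found
```

The function takes the disassembly as a string, so it can be fed from `subprocess.run(["objdump", "-d", path], capture_output=True, text=True).stdout` or from saved output.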
Suppose you want to automate vulnerability scanning across a fleet:
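import json
import subprocess

def check_inception_vulnerability():
    # --batch json prints a JSON array with one entry per checked CVE
    proc = subprocess.run(['spectre-meltdown-checker', '--batch', 'json'], capture_output=True, text=True)
    # Assumed entry shape: {"CVE": "...", "VULNERABLE": true/false, ...}
    for entry in json.loads(proc.stdout):
        if entry.get("CVE") == "CVE-2023-20569":
            if entry.get("VULNERABLE"):
                print("System is vulnerable to Inception.")
            else:
                print("System is protected or not affected.")
            return
    print("CVE-2023-20569 was not checked by this version of the tool.")

check_inception_vulnerability()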
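For a fleet, the per-host reports can be collected first and evaluated centrally. The sketch below assumes each report is the JSON array printed by `--batch json` and that every entry carries `CVE` and `VULNERABLE` fields; those field names, and the helpers `is_vulnerable` and `vulnerable_hosts`, are assumptions for illustration.

```python
import json


def is_vulnerable(report_json, cve):
    """Return True/False for the given CVE, or None if the report lacks it."""
    for entry in json.loads(report_json):
        if entry.get("CVE") == cve:
            return bool(entry.get("VULNERABLE"))
    return None


def vulnerable_hosts(reports, cve="CVE-2023-20569"):
    """Return the sorted hostnames whose report marks the CVE as vulnerable."""
    return sorted(host for host, report in reports.items() if is_vulnerable(report, cve))
```

Here `reports` maps a hostname to the raw JSON text gathered from that host. Hosts whose report does not mention the CVE are not counted as vulnerable, so an outdated checker version should be flagged separately.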
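Audit frameworks such as SELinux's audit logging or Linux's auditd work at the level of system calls and policy decisions, so they cannot observe individual instructions such as MONITOR/MWAIT. They can still log related activity, such as one process attaching to another:
# Example: log every ptrace call (auditd cannot see instruction execution itself)
# ptrace_watch is an arbitrary key name for finding these records later
auditctl -a always,exit -F arch=b64 -S ptrace -k ptrace_watch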
Attackers have demonstrated (see references) that in cloud environments where VMs share physical CPUs, side- and covert-channel attacks can recover cross-VM secrets, infer co-residency, or exfiltrate cryptographic material.
Notably:
Web browsers and sandboxed services often restrict timers (e.g., window.performance.now() returns fuzzed/low-resolution values). (M)WAIT-like amplification can still leak data out.
New designs with on-chip AI accelerators, shared cache, or memory provide additional vectors for microarchitectural state crossover. Simple firmware vulnerabilities can now have far-reaching impact.
The introduction of (M)WAIT as a primitive for converting microarchitectural state into observable architecture-level leakage profoundly impacts threat modeling and defense for modern systems. The key takeaways:
Defensive recommendations:
Looking forward, system architects must collaborate across hardware, OS, and application layers to mitigate these subtle, but devastating, attack vectors.
(M)WAIT for It: Bridging the Gap between Microarchitectural and Architectural Attacks
Ruiyi Zhang et al. USENIX Security 2023.
https://www.usenix.org/conference/usenixsecurity23/presentation/zhang-ruiyi
Microarchitectural Attacks in Heterogeneous Systems
Hoda Naghibijouybari et al. ACM Computing Surveys.
https://dl.acm.org/doi/10.1145/3544102
Inception: how a simple XOR can cause arbitrary data leakage
Daniël Trujillo, Johannes Wikner, and Kaveh Razavi, ETH Zurich, CVE-2023-20569.
https://comsec.ethz.ch/research/microarch/inception/
spectre-meltdown-checker Tool
https://github.com/speed47/spectre-meltdown-checker
Official Linux audit framework documentation
https://linux.die.net/man/8/auditctl
This post is intended as an educational resource for cybersecurity professionals, system architects, and security operations teams monitoring evolving microarchitectural threats in modern computing environments.