RemembERR
Leveraging Microprocessor Errata for Design Testing and Validation.
Our work aggregates and annotates every publicly available Intel Core and AMD microprocessor erratum between 2008 and 2022 — 2,563 raw entries in total. By resolving intra- and inter-document duplicates (2,057 Intel entries → 743 unique; 506 AMD entries → 385 unique), we distilled 1,128 unique errata in RemembERR. RemembERR turns scattered, free-form PDF listings into a cohesive, cross-vendor, machine-readable database.
The RemembERR database is available on GitHub.
What are microprocessor errata?
After silicon ships, subtle design bugs emerge only under real-world conditions. Vendors publish these as errata, each describing:
- Triggers: the exact stimuli needed to provoke the bug.
- Contexts: the operating modes or environments where it manifests.
- Effects: the observable incorrect behavior.
Unfortunately, these documents lack structure, consistency, and cross-referencing, which made large-scale analysis and automated tooling difficult, especially before the LLM era.
How did we build RemembERR?
1. Collection and De-duplication
- Scraped the latest errata PDF for every Intel Core generation (1–12) and AMD family (since 2008).
- Used title-similarity heuristics and manual inspection to merge duplicates, reducing 2,563 raw entries to 1,128 unique errata.
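As a rough illustration of what a title-similarity heuristic can look like, here is a minimal sketch built on Python's standard `difflib`. The threshold of 0.9, the greedy grouping strategy, and the example titles are all assumptions made for illustration; they are not taken from the RemembERR tooling, and any candidate group would still need the manual inspection described above.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    # Lowercase and collapse whitespace so formatting differences do not matter.
    return " ".join(title.lower().split())


def similar(a: str, b: str, threshold: float = 0.9) -> bool:
    # Ratio is in [0, 1]; 1.0 means the normalized titles are identical.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def group_candidates(titles, threshold: float = 0.9):
    """Greedily group titles that resemble the first title of an existing group."""
    groups = []
    for title in titles:
        for group in groups:
            if similar(group[0], title, threshold):
                group.append(title)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([title])
    return groups


# Hypothetical erratum titles, invented for this example.
titles = [
    "Processor May Hang During Power State Transition",
    "Processor may hang during  power state transition",
    "Performance Counter May Report Incorrect Value",
]
groups = group_candidates(titles)
for group in groups:
    print(len(group), "candidate(s):", group[0])
```

Groups with more than one member are only duplicate candidates; a human reviewer decides whether they describe the same bug.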
2. Unified Three-Tier Schema
- Concrete level: Vendor-specific details (e.g., “write MSR 0x1A2 with value 0x5”).
- Abstract level: Generalized patterns (e.g., “MSR configuration write”).
- Class level: Top-level categories (e.g., Trg_CFG for dynamic configuration).
We defined 60 abstract categories across Triggers, Contexts, and Effects. Each erratum was then mapped to these categories by an automated filter followed by a four-eyes manual process, which reached >80 % initial agreement; all remaining mismatches were resolved iteratively.
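To make the three tiers concrete, the sketch below models a single annotation as a small Python data structure. The class name `Annotation`, its field names, and the `ABSTRACT_TO_CLASS` lookup table are hypothetical and are not the schema used in the RemembERR repository; only the example values come from the text above.

```python
from dataclasses import dataclass

# Hypothetical lookup from abstract category to class; only this one pairing
# is taken from the example in the text.
ABSTRACT_TO_CLASS = {
    "MSR configuration write": "Trg_CFG",
}


@dataclass(frozen=True)
class Annotation:
    concrete: str  # vendor-specific detail, as worded in the erratum
    abstract: str  # generalized pattern shared across vendors
    cls: str       # top-level class that the abstract category belongs to


def annotate(concrete: str, abstract: str) -> Annotation:
    # The class follows from the abstract category, so it is looked up
    # rather than entered by hand.
    return Annotation(concrete, abstract, ABSTRACT_TO_CLASS[abstract])


ann = annotate("write MSR 0x1A2 with value 0x5", "MSR configuration write")
print(ann)
```

Deriving the class from the abstract category keeps the two upper tiers consistent by construction, which is one plausible way to organize such a schema.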
3. Four-Eyes Classification
- Two researchers independently annotated all errata with the three-tier schema.
- Guided by a regex-based highlighting tool, they discussed each discrepancy over seven rounds, spending ~30 hours each in high-focus work to finalize annotations.
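One simple way to quantify agreement between two annotators, and to list the discrepancies that need discussion, is sketched below. It treats every (erratum, category) pair as one yes/no decision and reports the fraction of matching decisions. This metric, the function name `compare`, and the toy data are assumptions for illustration; the source does not specify how its agreement figure was computed.

```python
def compare(ann_a: dict, ann_b: dict, categories: list):
    """Return (agreement rate, list of (erratum_id, category) mismatches)."""
    total = 0
    agree = 0
    mismatches = []
    # Only errata annotated by both researchers can be compared.
    for erratum_id in sorted(ann_a.keys() & ann_b.keys()):
        for category in categories:
            total += 1
            in_a = category in ann_a[erratum_id]
            in_b = category in ann_b[erratum_id]
            if in_a == in_b:
                agree += 1
            else:
                mismatches.append((erratum_id, category))
    return agree / total, mismatches


# Hypothetical categories and annotations, invented for this example.
categories = ["msr_write", "power_transition", "vm_guest"]
annotator_a = {"E1": {"msr_write"}, "E2": {"power_transition", "vm_guest"}}
annotator_b = {"E1": {"msr_write"}, "E2": {"power_transition"}}

rate, mismatches = compare(annotator_a, annotator_b, categories)
print(f"agreement: {rate:.0%}, to discuss: {mismatches}")
```

The mismatch list is the natural input for a discussion round: once the annotators settle each entry, the comparison can be rerun until the list is empty.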
4. Public Artifacts
We open-source the complete RemembERR database, annotation tools, and Dockerized experiments on GitHub: https://github.com/comsec-group/rememberr
Key Insights & Gaps in Design Testing
By mining RemembERR’s annotated errata, we expose critical validation blind spots:
- Multi-Trigger Complexity: 49 % of errata require two or more distinct trigger types to manifest. Simple, single-condition tests miss almost half of all documented bugs.
- Dominant Trigger Patterns: The most frequent triggers across vendors are MSR configuration writes, power-state transitions, and power throttling, indicating that dynamic power management is a recurrent failure point.
- Context Concentration: Over two-thirds of errata occur in virtual-machine guest environments or require System Management Mode, yet many validation suites lack thorough VM-based testing.
- Disjunctive Observations: While triggers must all occur (conjunctive), observing any single effect, from machine-check exceptions to incorrect register values, is sufficient to detect the bug.
- Workaround Deficit: A substantial share of errata lack any vendor-suggested workaround (35.9 % for Intel and 28.1 % for AMD), leaving system integrators without guidance.
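The conjunctive-trigger and disjunctive-effect observations translate directly into set operations. The sketch below shows this on toy records; the record layout, the category names, and the helper names are hypothetical and chosen only to illustrate the logic, not to mirror the RemembERR data format.

```python
def bug_can_manifest(required_triggers: set, applied: set) -> bool:
    # Conjunctive: every required trigger must be among the applied stimuli.
    return required_triggers <= applied


def bug_detected(possible_effects: set, observed: set) -> bool:
    # Disjunctive: observing any one of the documented effects is enough.
    return bool(possible_effects & observed)


def multi_trigger_share(errata: list) -> float:
    # Fraction of errata that need two or more distinct triggers.
    return sum(len(e["triggers"]) >= 2 for e in errata) / len(errata)


# Hypothetical records, invented for this example.
errata = [
    {"id": "E1", "triggers": {"msr_write", "power_transition"},
     "effects": {"machine_check", "hang"}},
    {"id": "E2", "triggers": {"msr_write"},
     "effects": {"wrong_register_value"}},
]

e1 = errata[0]
print(bug_can_manifest(e1["triggers"], {"msr_write"}))
print(bug_can_manifest(e1["triggers"], {"msr_write", "power_transition"}))
print(bug_detected(e1["effects"], {"hang"}))
print(multi_trigger_share(errata))
```

A test that applies only one stimulus can never satisfy the subset check for a multi-trigger erratum, which is why single-condition tests miss such bugs regardless of how carefully effects are monitored.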
How can RemembERR be used?
- Directed Simulation & Formal Verification: Feed concrete trigger sets and contexts into test generators or formal properties to target the hardest-to-find bugs.
- Focused Silicon Testing: Prioritize power-state, MSR, and peripheral stimuli under VM contexts, then monitor for any deviation, from spurious faults to hangs, to maximize coverage.
- Trend Analysis & Process Improvement: Track which bug classes persist across generations (e.g., “heredity” clusters reveal that some bugs survived across 11 generations) so vendors can refine pre-silicon validation.
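As one possible starting point for test prioritization, the sketch below counts which trigger pairs co-occur most often among errata in a chosen context, so that a test generator could target those combinations first. The record layout, the category names, and the function `frequent_trigger_pairs` are hypothetical; the actual database would have to be loaded and adapted to this shape first.

```python
from collections import Counter
from itertools import combinations


def frequent_trigger_pairs(errata: list, context=None):
    """Count co-occurring trigger pairs, optionally restricted to one context."""
    counts = Counter()
    for erratum in errata:
        if context is not None and context not in erratum["contexts"]:
            continue
        # Sorting makes each pair order-independent before counting.
        for pair in combinations(sorted(erratum["triggers"]), 2):
            counts[pair] += 1
    return counts.most_common()


# Hypothetical records, invented for this example.
errata = [
    {"id": "E1", "triggers": {"msr_write", "power_transition"},
     "contexts": {"vm_guest"}},
    {"id": "E2", "triggers": {"msr_write", "power_transition", "throttling"},
     "contexts": {"vm_guest"}},
    {"id": "E3", "triggers": {"msr_write", "throttling"},
     "contexts": {"smm"}},
]

ranked = frequent_trigger_pairs(errata, context="vm_guest")
for pair, count in ranked:
    print(count, pair)
```

Errata with a single trigger contribute no pairs here, so a real prioritization would likely handle them separately.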
Frequently Asked Questions
Why classify at three levels?
Concrete details ensure reproducibility; abstract categories enable pattern discovery; class-level groups support high-level validation strategy.
Is this just academic?
No — design teams can immediately integrate RemembERR into existing flows, generating targeted test cases that cover historical blind spots.
Acknowledgements
We would like to thank the anonymous reviewers for their valuable feedback. This work was supported by a Microsoft Swiss JRC grant, the Swiss State Secretariat for Education, Research and Innovation under contract number MB22.00057 (ERC-StG PROMISE), and the Swiss National Science Foundation under NCCR Automation, grant agreement 51NF40 180545.