LLMpedia: The first transparent, open encyclopedia generated by LLMs

MITRE ATT&CK Evaluations

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Mitre ATT&CK Hop 4
Expansion Funnel: Raw 88 → Dedup 0 → NER 0 → Enqueued 0
1. Extracted: 88
2. After dedup: 0 (None)
3. After NER: 0
4. Enqueued: 0
MITRE ATT&CK Evaluations
Name: MITRE ATT&CK Evaluations
Type: Evaluation framework
Founded: 2019
Location: Bedford, Massachusetts
Parent organization: MITRE Corporation

MITRE ATT&CK Evaluations are comparative testing programs developed to assess defensive capabilities against known adversary behaviors using the ATT&CK knowledge base. They provide reproducible scenarios drawn from documented incidents to benchmark vendor-supplied and open-source detection products, such as those from CrowdStrike and Microsoft, enabling defenders across the Department of Defense, the National Security Agency, U.S. Cyber Command, and private sector organizations such as Lockheed Martin, Raytheon Technologies, and Northrop Grumman to measure effectiveness. The evaluations inform procurement, research at institutions like Carnegie Mellon University and the Massachusetts Institute of Technology, and operational improvements in organizations including Amazon, Google, Facebook, and JPMorgan Chase.

Overview

The evaluations are rooted in the ATT&CK knowledge base compiled by the MITRE Corporation, which maps adversary techniques observed in incidents such as those attributed to groups like APT29, APT28, FIN7, Lazarus Group, and Sandworm. Participants include commercial vendors such as Symantec, McAfee, Sophos, Trend Micro, Palo Alto Networks, and Elastic, alongside open-source projects connected to the MITRE ATT&CK community. Results are consumed by end users in sectors served by Cisco Systems, Fortinet, and BAE Systems, and by regulators in agencies like the Federal Bureau of Investigation and the Department of Homeland Security. The program draws comparisons to historical verification efforts like the Common Criteria and techniques used in evaluations by the National Institute of Standards and Technology.
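For illustration, the ATT&CK mapping that underpins such evaluations can be thought of as a lookup from adversary groups to the technique IDs they are documented to use. The following is a minimal sketch in Python; the technique IDs are real ATT&CK identifiers, but the specific group-to-technique assignments shown here are illustrative assumptions, not an authoritative excerpt of the knowledge base.

```python
# Minimal sketch: a slice of an ATT&CK-style mapping from adversary groups
# to technique IDs. The group assignments below are illustrative only.

TECHNIQUES = {
    "T1566": "Phishing",
    "T1059": "Command and Scripting Interpreter",
    "T1003": "OS Credential Dumping",
    "T1021": "Remote Services",
    "T1041": "Exfiltration Over C2 Channel",
}

# Hypothetical group-to-technique associations used only for this sketch.
GROUP_TECHNIQUES = {
    "APT29": {"T1566", "T1059", "T1003"},
    "FIN7": {"T1566", "T1059"},
}

def techniques_for(group: str) -> list[str]:
    """Return human-readable technique names associated with a group."""
    return sorted(TECHNIQUES[tid] for tid in GROUP_TECHNIQUES.get(group, set()))

if __name__ == "__main__":
    print(techniques_for("APT29"))
```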

Methodology

The methodology leverages case studies from incidents affecting organizations such as Sony Pictures Entertainment, Target Corporation, Equifax, and Maersk, including the NotPetya outbreak, to construct reproducible adversary emulations. Test design references threat actor reports from organizations including FireEye, CrowdStrike, Mandiant, Kaspersky Lab, and ESET. The framework maps adversary behaviors to ATT&CK techniques and procedures, using telemetry from endpoint and network sensors similar to deployments of Microsoft Defender, Carbon Black, SentinelOne, and VMware products. Controls for bias and repeatability take inspiration from scientific standards used at the National Institutes of Health and statistical methods taught at Stanford University.
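The mapping step described above can be sketched as tagging raw endpoint telemetry with candidate ATT&CK technique IDs. The event fields and keyword rules below are simplified assumptions for illustration, not how any named vendor implements detection.

```python
# Sketch: tagging simplified endpoint telemetry with ATT&CK technique IDs.
# The event format and keyword rules are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class TelemetryEvent:
    host: str
    process: str
    command_line: str

# Naive keyword rules mapping command-line content to sub-technique IDs.
RULES = {
    "powershell": "T1059.001",  # Command and Scripting Interpreter: PowerShell
    "lsass": "T1003.001",       # OS Credential Dumping: LSASS Memory
    "schtasks": "T1053.005",    # Scheduled Task/Job: Scheduled Task
}

def map_event(event: TelemetryEvent) -> list[str]:
    """Return ATT&CK technique IDs whose keywords appear in the event."""
    text = f"{event.process} {event.command_line}".lower()
    return [tid for keyword, tid in RULES.items() if keyword in text]

example = TelemetryEvent("host-01", "powershell.exe", "-NoProfile -EncodedCommand ...")
print(map_event(example))  # ['T1059.001']
```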

Evaluation Process

The evaluations execute emulations of adversary campaigns using tools and scripts comparable to those documented by Cobalt Strike researchers and captured in VirusTotal datasets. Scenarios progress through stages such as initial access, execution, persistence, and exfiltration, referencing documented incidents involving SolarWinds, CCleaner, and Stuxnet-era compromises. Test infrastructures often mirror enterprise environments such as those operated by Bank of America, Wells Fargo, BP, and ExxonMobil to reflect realistic telemetry. Data collection integrates logs and alerts from products by Splunk, Elastic, IBM Security, and Google Chronicle to generate ATT&CK-mapped detections.
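As a hedged sketch of how an emulation run might be recorded, the following models scenario steps by ATT&CK tactic and tallies the detection outcome observed for each step. The step names, tactic labels, and outcome categories are simplified assumptions, not the evaluation program's actual schema.

```python
# Sketch: recording an emulation run as steps tagged with an ATT&CK tactic
# and a detection outcome. Labels and categories are simplified assumptions.

from collections import Counter

steps = [
    {"tactic": "initial-access", "technique": "T1566.001", "outcome": "detected"},
    {"tactic": "execution",      "technique": "T1059.001", "outcome": "detected"},
    {"tactic": "persistence",    "technique": "T1053.005", "outcome": "telemetry-only"},
    {"tactic": "exfiltration",   "technique": "T1041",     "outcome": "missed"},
]

# Summarize outcomes per tactic so coverage gaps by stage are visible.
summary = Counter((s["tactic"], s["outcome"]) for s in steps)
for (tactic, outcome), count in sorted(summary.items()):
    print(f"{tactic:15s} {outcome:15s} {count}")
```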

Results and Findings

Published findings highlight detection coverage for techniques attributed to actors such as Cozy Bear, Fancy Bear, Evil Corp, and Charming Kitten. Reports compare signal-to-noise performance across vendors including McAfee, Symantec, CrowdStrike, SentinelOne, and Sophos, and expose telemetry gaps similar to critiques leveled at legacy systems during incidents like the Equifax breach and WannaCry. Analyses often prompt improvements in detections by vendors partnered with Microsoft, Palo Alto Networks, Trend Micro, and Cisco. Papers arising from the evaluations are cited in academic work at the Massachusetts Institute of Technology, the University of Cambridge, the University of Oxford, and Princeton University.
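One way to read such comparisons is as simple coverage and noise ratios computed over per-step results. The sketch below uses made-up vendor labels and counts purely to show the arithmetic, not actual evaluation data.

```python
# Sketch: computing detection coverage and a rough signal-to-noise figure
# from per-vendor counts. Vendor labels and numbers are invented examples.

results = {
    "vendor_a": {"steps": 100, "detected": 87, "alerts": 250, "false_positives": 40},
    "vendor_b": {"steps": 100, "detected": 92, "alerts": 900, "false_positives": 410},
}

for name, r in results.items():
    coverage = r["detected"] / r["steps"]
    # Treat the fraction of alerts that are not false positives as "signal".
    signal_ratio = (r["alerts"] - r["false_positives"]) / r["alerts"]
    print(f"{name}: coverage={coverage:.0%}, signal={signal_ratio:.0%}")
```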

Impact and Criticism

The evaluations' impact extends to procurement decisions made by institutions such as the U.S. Department of Defense, the General Services Administration, and NATO, and by major corporations including Apple, IBM, and Siemens. Critics from advisory bodies and analysts at Gartner, Forrester Research, Bloomberg, and The Wall Street Journal note potential limitations: the representativeness of scenarios, vendor preparation, and how well tests reflect operational deployments in organizations like Walmart and Target Corporation. These debates mirror earlier controversies around evaluations by AV-Comparatives and certification regimes like the Federal Information Processing Standards.

Vendor Participation and Transparency

Vendors including CrowdStrike, Microsoft, Palo Alto Networks, Trend Micro, Sophos, Elastic, and SentinelOne participate to varying degrees, submitting artifacts and telemetry for mapping to ATT&CK techniques. Transparency practices echo standards championed by the Open Web Application Security Project and peer-reviewed disclosures seen at DEF CON and Black Hat USA. Discussions among vendors, government purchasers such as National Security Agency acquisition staff, and academic researchers at Carnegie Mellon University drive evolving guidance on disclosure, reproducibility, and responsible use.

Category:Cybersecurity