Accuracy, Robustness & Security

Performance, resilience to errors and adversarial inputs, cybersecurity.

// Do once → satisfies all three

ONE pre-deployment test report (accuracy, robustness, adversarial, cybersecurity) plus a live drift-monitoring dashboard.

Art.15 requires declared performance levels; ISO 42001 and NIST AI RMF require the evidence behind them. One report plus one dashboard covers both.

ISO 42001

Annex A.6.2.4 · Cl.9.1

NIST AI RMF

MEASURE 2.5 · MEASURE 2.6 · MEASURE 2.7

EU AI Act

Art.15

// Evidence auditors expect

✓ Test report covering accuracy, robustness, cybersecurity
✓ Adversarial / red-team test results
✓ Drift-monitoring dashboard with thresholds
✓ Pen-test / vulnerability scan against model serving stack
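
A drift-monitoring threshold can be made concrete with a small sketch. The example below uses the Population Stability Index (PSI), which is one common drift statistic and is not prescribed by any of the three frameworks. The function names `psi` and `drift_status`, the equal-width binning, and the 0.10 / 0.25 cut-offs (a widely used rule of thumb) are all assumptions chosen for illustration.

```python
import math
from bisect import bisect_right


def psi(reference, live, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample.

    Bins are equal-width over the reference range (an illustrative choice).
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    # Interior bin edges; values outside the reference range fall in the end bins.
    edges = [lo + (hi - lo) * i / n_bins for i in range(1, n_bins)]

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_p, live_p = proportions(reference), proportions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_p, live_p))


def drift_status(score, warn=0.10, alert=0.25):
    """Map a PSI score to a dashboard status (thresholds are assumptions)."""
    if score >= alert:
        return "alert"
    if score >= warn:
        return "warn"
    return "ok"


# Hypothetical data: a feature observed at validation time and in production.
reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]
print(drift_status(psi(reference, reference)))
print(drift_status(psi(reference, shifted)))
```

In practice the thresholds would be set per feature and recorded in the test report, so that the dashboard and the declared levels refer to the same numbers.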

// Common pitfalls

⚠ Test set drawn only from the training distribution, with no out-of-distribution evaluation.
⚠ Accuracy reported only in aggregate, hiding worst-case performance on subgroups.
⚠ No cybersecurity testing of the model-serving stack or the prompt-injection surface.
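
The aggregate-accuracy pitfall is easy to illustrate. The sketch below reports accuracy per subgroup alongside the overall figure and names the worst-performing group. The function names, the group labels and the toy data are hypothetical; a real report would use the subgroups relevant to the system's intended purpose.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for each subgroup label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        totals[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / totals[g] for g in totals}


def summarize(y_true, y_pred, groups):
    """Overall accuracy plus the worst-performing subgroup."""
    per_group = accuracy_by_group(y_true, y_pred, groups)
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    worst = min(per_group, key=per_group.get)
    return {
        "overall": overall,
        "per_group": per_group,
        "worst_group": worst,
        "worst_accuracy": per_group[worst],
    }


# Toy data: group "a" is always predicted correctly, group "b" never is.
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
groups = ["a"] * 8 + ["b"] * 2

report = summarize(y_true, y_pred, groups)
print(report["overall"])         # 0.8
print(report["worst_group"])     # b
print(report["worst_accuracy"])  # 0.0
```

Here an overall accuracy of 0.8 conceals a subgroup on which every prediction is wrong, which is the kind of gap a worst-case column in the test report is meant to expose.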

ISO 42001

Annex A.6.2.4 covers verification and validation; Cl.9.1 covers ongoing measurement.

Clause 9.1

Monitoring, measurement, analysis and evaluation

Define what to monitor, by which methods and how often, then analyse and evaluate AI performance.

Annex A.4.5

System and computing resources

Document the system and computing resources (infrastructure, compute) supporting AI systems.

Annex A.6.2.4

AI system verification and validation

Define and apply measures to verify and validate AI systems against requirements.

NIST AI RMF

MEASURE 2.5–2.7 cover validity, reliability, safety, security and resilience.

MAP 2.3

Scientific integrity and TEVV

Scientific integrity and Test, Evaluation, Verification & Validation considerations documented.

MEASURE 1.1

Metrics and methods selected

Approaches and metrics for measuring AI risks are selected.

MEASURE 2.1

Test sets, metrics, details documented

Test sets, metrics and methodology details documented for evaluation.

MEASURE 2.5

Validity & reliability assessed

AI system is regularly evaluated for validity, reliability and performance.

MEASURE 2.6

Safety risks evaluated

AI system is evaluated for safety risks.

MEASURE 2.7

Security & resilience evaluated

Security and resilience including adversarial robustness evaluated.
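
As a minimal sketch of a robustness measurement, the example below compares accuracy on clean inputs with the worst accuracy seen over several randomly perturbed copies. This is a random-noise check only. It is not an adversarial attack or a red-team exercise, which search for worst-case inputs deliberately. The names `robustness_gap` and `toy_predict`, the Gaussian noise model and the toy classifier are assumptions for illustration.

```python
import random


def accuracy(predict, inputs, labels):
    """Fraction of inputs for which predict(x) matches the label."""
    return sum(predict(x) == y for x, y in zip(inputs, labels)) / len(labels)


def robustness_gap(predict, inputs, labels, noise_scale, trials=20, seed=0):
    """Clean accuracy versus worst accuracy under additive Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the report is reproducible
    clean = accuracy(predict, inputs, labels)
    noisy_scores = []
    for _ in range(trials):
        perturbed = [[v + rng.gauss(0.0, noise_scale) for v in x] for x in inputs]
        noisy_scores.append(accuracy(predict, perturbed, labels))
    worst = min(noisy_scores)
    return {"clean": clean, "worst_noisy": worst, "gap": clean - worst}


def toy_predict(x):
    """Hypothetical stand-in for a model: threshold on the first feature."""
    return int(x[0] > 0.5)


inputs = [[0.0], [1.0], [0.1], [0.9]]
labels = [0, 1, 0, 1]

print(robustness_gap(toy_predict, inputs, labels, noise_scale=0.0))
print(robustness_gap(toy_predict, inputs, labels, noise_scale=5.0))
```

Recording the noise scale, the number of trials and the seed next to the result keeps the figure reproducible when the evaluation is repeated.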

EU AI Act

Art.15 requires high-risk AI systems to achieve an appropriate level of accuracy, robustness and cybersecurity, with accuracy levels and the relevant metrics declared in the accompanying instructions for use.

Article 15

Accuracy, robustness & cybersecurity

Appropriate level of accuracy, robustness and cybersecurity, maintained throughout the lifecycle.

Article 55

Obligations for GPAI with systemic risk

Model evaluations, systemic risk assessment & mitigation, incident reporting, cybersecurity.
