AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

2026-05-1020 min read

A comprehensive framework for evaluating AI agent platform security posture across 10 dimensions — identity management, access control, data isolation, audit completeness, injection resistance, supply chain integrity, behavioral monitoring, incident response, compliance posture, and trust evidence quality.

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

When a CISO evaluates an enterprise software platform's security posture, there are established frameworks: SOC 2 Type II for operational security, ISO 27001 for information security management, PCI DSS for payment card environments, HIPAA for healthcare data. These frameworks provide standardized evaluation criteria, scored assessments, and certification processes that enable objective comparison across vendors.

For AI agent platforms, no equivalent standardized scorecard has emerged. Organizations evaluating platforms for agent deployment — or evaluating their own platform's security posture — must either apply legacy frameworks awkwardly (SOC 2 doesn't cover prompt injection; ISO 27001 doesn't address behavioral drift) or perform ad-hoc assessments that aren't reproducible or comparable.

This document provides a practitioner's AI agent platform security scorecard: ten evaluation dimensions, each with specific criteria, scoring rubric, and weighting. It is designed to be immediately actionable — you can use it today to evaluate a platform (your own or a vendor's) and produce a scored assessment that supports procurement, compliance, and executive reporting.

TL;DR

Ten security dimensions require independent evaluation: identity management, access control, data isolation, audit completeness, injection resistance, supply chain integrity, behavioral monitoring, incident response, compliance posture, and trust evidence quality
Each dimension is scored 0-4 with defined criteria; the weighted composite score ranges 0-100
Access control and behavioral monitoring carry the highest weights (15% each) because their failures have the broadest blast radius in agent systems
A composite score below 60 indicates significant security gaps; 70-80 represents acceptable enterprise deployment readiness; 85+ represents best-in-class security posture
NIST AI RMF, EU AI Act, and ISO/IEC 42001 map to specific scorecard dimensions
The scorecard should be re-evaluated quarterly for live deployments and triggered by any significant platform change

The Evaluation Framework

Structure

The scorecard covers ten dimensions, each scored on a 0-4 scale:

0 (None): The capability is absent or unimplemented
1 (Initial): Basic capability exists but is ad-hoc, undocumented, or inconsistently applied
2 (Managed): The capability is documented, consistently applied, and monitored
3 (Defined): The capability meets all managed criteria plus formal process ownership, exception handling, and regular review
4 (Optimized): The capability meets all defined criteria plus continuous improvement, automated enforcement, and third-party validation

Dimension Weights and Maximum Scores

Dimension	Weight	Max Score
1. Identity Management	10%	10
2. Access Control	15%	15
3. Data Isolation	12%	12
4. Audit Completeness	12%	12
5. Injection Resistance	13%	13
6. Supply Chain Integrity	8%	8
7. Behavioral Monitoring	15%	15
8. Incident Response	8%	8
9. Compliance Posture	5%	5
10. Trust Evidence Quality	2%	2

Dimension 1: Identity Management (Weight: 10%, Max: 10)

Identity management in AI agent systems encompasses both human identity (who is authorized to deploy and configure agents) and agent identity (how agents authenticate to each other and to external systems).

Scoring Criteria

Score 0 — None:

No formal identity verification for agent operators or consumers
Agents have no identity (no cryptographic identifier or credential)
No authentication required to access agent capabilities

Score 1 — Initial:

Basic username/password authentication for operator console
Agents have identifiers but they are not cryptographically bound
No machine-readable identity for agents (no DID, certificate, or signed identity document)

Score 2 — Managed:

Strong authentication (MFA required) for all operator console access
Agents have unique identifiers that are managed through a registry
Agent credentials are stored securely (HSM or equivalent for production environments)
Agent credential rotation is supported (not necessarily automated)

Score 3 — Defined:

All criteria from Score 2, plus:
Agents have cryptographically verifiable identities (W3C DID or x.509 certificate)
Continuous authentication: agents re-verify identity at each significant operation
Credential lifecycle management: automated rotation, revocation, and expiry enforcement
Service mesh or equivalent for agent-to-agent authentication

Score 4 — Optimized:

All criteria from Score 3, plus:
Zero-trust identity: every request is verified regardless of network location
Hardware attestation: agent identity is bound to specific hardware/runtime (TPM or equivalent)
Third-party identity audits: identity infrastructure reviewed by external security auditor annually
Identity transparency: agent identity credentials are publicly verifiable for marketplace-listed agents

Key Evidence to Request

Agent identity architecture documentation
Sample agent credential (to inspect for cryptographic binding)
Identity audit logs (how many active credentials, rotation schedule)
Zero-trust architecture design documentation

Dimension 2: Access Control (Weight: 15%, Max: 15)

Access control determines what resources, tools, and data each agent can access, and how those permissions are enforced, audited, and managed.

Scoring Criteria

Score 0 — None:

No formal access control: agents can access any resource they know how to reach
No separation between agent permissions and operator permissions

Score 1 — Initial:

Basic authorization lists: some tools/resources are explicitly restricted
Permission enforcement is at the application level (not enforced in infrastructure)
No principle of least privilege: agents are granted broad access for operational convenience

Score 2 — Managed:

Formal permission model: agents have explicitly defined permission sets
Principle of least privilege applied: agents granted only required permissions
Permission changes require documented authorization
Runtime permission enforcement: unauthorized tool calls are blocked (not just logged)

Score 3 — Defined:

All criteria from Score 2, plus:
Fine-grained access control: parameter-level permissions (not just tool-level)
Dynamic authorization: permissions can vary by context (authorized user, data classification)
Permission inheritance model: clear rules for how child processes and delegated agents inherit permissions
Continuous access certification: regular review of whether existing permissions are still necessary

Score 4 — Optimized:

All criteria from Score 3, plus:
Policy-as-code: all access control policies are version-controlled and peer-reviewed
Real-time access anomaly detection: alerts on unusual permission usage patterns
Automated permission right-sizing: tooling to identify and remove excess permissions
External red team validation of access control boundaries (annually)

Key Evidence to Request

Permission model documentation
Sample agent permission manifest
Authorization decision logs (showing blocked unauthorized calls)
Access certification process documentation

Dimension 3: Data Isolation (Weight: 12%, Max: 12)

Data isolation ensures that agent processing in one organizational context cannot access, contaminate, or expose data from another context — a multi-tenancy security requirement.

Scoring Criteria

Score 0 — None:

No logical separation between data for different organizations or contexts
Agents can access data across organizational boundaries

Score 1 — Initial:

Application-level data separation (tenant_id or org_id filters in queries)
No database-level or infrastructure-level isolation

Score 2 — Managed:

All data access filtered by organization identifier at application AND database levels
Row-level security policies enforce tenant isolation at the database level
Shared infrastructure with strong logical isolation (separate schemas or row-level policies per tenant)
Isolation is tested as part of standard QA/security testing

Score 3 — Defined:

All criteria from Score 2, plus:
Data residency controls: ability to enforce that specific tenants' data stays in specific geographies/infrastructure
Cross-tenant isolation testing: regular adversarial testing of tenant isolation boundaries
Memory isolation: agent working memory from one session cannot persist to another tenant's sessions
Encryption at rest with per-tenant key management (tenant can rotate/revoke their own encryption key)

Score 4 — Optimized:

All criteria from Score 3, plus:
Physical isolation option: ability to run in dedicated infrastructure for highest-security requirements
Formal verification or third-party audit of isolation controls
Cryptographic proof of isolation: tenants can independently verify their data isolation
Zero-knowledge architectures for highest-sensitivity use cases

Key Evidence to Request

Multi-tenancy architecture documentation
Database row-level security policy examples
Isolation test results (attempts to access cross-tenant data)
Encryption key management documentation

Dimension 4: Audit Completeness (Weight: 12%, Max: 12)

Audit completeness measures whether the platform records sufficient information about all agent actions to support forensic investigation, compliance reporting, and accountability.

Scoring Criteria

Score 0 — None:

No structured audit logging
Agent actions are not recorded

Score 1 — Initial:

Basic event logging (inputs and outputs recorded)
Logs stored locally (no centralized audit infrastructure)
Log retention policy absent or undefined

Score 2 — Managed:

All agent actions logged with standard schema (agent_id, org_id, action_type, timestamp, input, output)
Logs stored in centralized, append-only audit system
Defined retention period (minimum 90 days, recommended 1 year)
Logs accessible for compliance and forensic queries

Score 3 — Defined:

All criteria from Score 2, plus:
Tamper-evident logging: cryptographic chaining or external hash anchoring (e.g., log entries anchored to blockchain or trusted timestamp service)
Complete audit coverage: no agent actions occur without corresponding audit records (ATCR = 1.0)
LLM session logging: complete prompt/response pairs, not just summarized events
Audit log available to tenants for their own agents

Score 4 — Optimized:

All criteria from Score 3, plus:
Real-time audit streaming: audit events available for real-time security monitoring
Compliance reporting automation: pre-built reports for regulatory requirements
Cross-system audit correlation: agent audit logs linked to downstream system events
Third-party audit log verification: external party can independently verify log integrity

Key Evidence to Request

Audit log schema documentation
Sample audit log entries
Log retention policy
Tamper-evidence mechanism documentation

Dimension 5: Injection Resistance (Weight: 13%, Max: 13)

Injection resistance measures the platform's defenses against prompt injection, indirect injection, and other input manipulation attacks that attempt to override agent behavior.

Scoring Criteria

Score 0 — None:

No injection detection or prevention controls
Agent behavior is easily overridden through crafted inputs

Score 1 — Initial:

Basic input filtering for known malicious patterns
No defense against novel injection techniques
No adversarial testing of injection resistance

Score 2 — Managed:

Input scanning against known injection technique signatures
Documented injection resistance policy
Regular red team testing of injection resistance (at least quarterly)
Injection attempt logging and alerting

Score 3 — Defined:

All criteria from Score 2, plus:
Multi-layer defense: input filtering + instruction hierarchy enforcement + output validation
Defense against indirect injection (content retrieved from external sources)
IRR >= 0.99 against documented attack techniques (tested and measured)
Injection resistance included in agent behavioral pacts/SLOs

Score 4 — Optimized:

All criteria from Score 3, plus:
Continuous novel injection research: platform maintains or funds research into new injection techniques
IRR >= 0.95 against novel techniques (demonstrated through red team exercises)
Injection attempt intelligence sharing with industry peers
Automated injection probe battery executed daily

Key Evidence to Request

Injection resistance documentation
Red team evaluation results (most recent)
IRR measurement methodology and results
Novel injection attack response process

Dimension 6: Supply Chain Integrity (Weight: 8%, Max: 8)

Supply chain integrity ensures that components used by the platform (base models, tool integrations, agent packages, dependencies) are authentic, uncompromised, and of known provenance.

Scoring Criteria

Score 0 — None:

No supply chain security controls
Components installed without verification

Score 1 — Initial:

Basic dependency manifest maintained
No verification of component integrity at install time

Score 2 — Managed:

SBOM maintained for all platform components
Component hashes verified at install time
Model provenance documented (which provider, which version)
Dependency updates follow a review process

Score 3 — Defined:

All criteria from Score 2, plus:
AI SBOM: model components, training data, prompt templates documented alongside software components
SLSA Level 2 or higher for all platform-published components
Vendor security assessment required before new model provider integration
Continuous vulnerability scanning of all components

Score 4 — Optimized:

All criteria from Score 3, plus:
SLSA Level 3 for critical components
Reproducible builds: platform builds can be independently reproduced and verified
Behavioral supply chain scanning: all third-party agents in marketplace evaluated for behavioral malware
Supply chain incident response plan with tested runbook

Key Evidence to Request

SBOM documentation
SLSA provenance attestations
Third-party component security review process
Behavioral malware scanning methodology for marketplace agents

Dimension 7: Behavioral Monitoring (Weight: 15%, Max: 15)

Behavioral monitoring measures the platform's capability to detect, alert on, and respond to behavioral anomalies in deployed agents — drift, miscalibration, scope violations, and unexpected behaviors.

Scoring Criteria

Score 0 — None:

No behavioral monitoring beyond basic availability checks
No detection of behavioral drift or anomalies

Score 1 — Initial:

Basic accuracy metrics monitored (if ground truth is available)
Ad-hoc behavioral investigation when problems are reported

Score 2 — Managed:

PSI and/or KS tests run on agent output distributions (minimum weekly)
Calibration monitoring: ECE tracked over time
Behavioral anomaly alerts with defined escalation paths
Tool call pattern monitoring

Score 3 — Defined:

All criteria from Score 2, plus:
Full drift monitoring pipeline (as described in companion posts): embedding drift, retrieval drift (for RAG), behavioral baseline comparison
Knowledge drift detected within 24 hours of significance threshold
Multi-signal confirmation required before high-severity drift alerts
Automated remediation for low-to-moderate severity drift (corpus refresh, probe evaluation)

Score 4 — Optimized:

All criteria from Score 3, plus:
Real-time behavioral monitoring with < 1 hour detection latency for severe drift
Cross-agent behavioral consistency monitoring for multi-agent deployments
Behavioral monitoring results integrated into trust scores and visible to deployers
Continuous adversarial behavioral probing (daily automated red team)

Key Evidence to Request

Behavioral monitoring architecture documentation
Sample drift alert and response example
Detection latency SLA documentation
Adversarial probe battery documentation

Dimension 8: Incident Response (Weight: 8%, Max: 8)

Incident response measures the platform's capability to detect, contain, investigate, and recover from security incidents in AI agent deployments.

Scoring Criteria

Score 0 — None:

No formal incident response process for AI security incidents
No playbooks or escalation paths

Score 1 — Initial:

Basic incident documentation (incident reports written after events)
Informal escalation to engineering team
No SLA for incident response times

Score 2 — Managed:

Formal incident response policy with defined severity levels
Response time SLAs by severity (Critical: 30min, High: 2h, Medium: 8h)
Post-incident review process
Security incident notification process for affected operators

Score 3 — Defined:

All criteria from Score 2, plus:
AI-specific incident playbooks covering: prompt injection, data exfiltration, behavioral compromise, supply chain incident
Tested incident response runbooks (tabletop exercises at minimum quarterly)
Automated incident detection integrated with response workflow
Forensic capability: ability to reconstruct full agent session from audit logs

Score 4 — Optimized:

All criteria from Score 3, plus:
Automated containment: ability to isolate or suspend compromised agents within minutes
Purple team exercises: combined red/blue team exercises to improve detection and response
Industry coordination: participation in AI security incident sharing communities
Public incident disclosure policy and history (demonstrates accountability)

Key Evidence to Request

Incident response policy documentation
Most recent tabletop exercise results
Historical incident record (number, severity, response times, outcomes)
Automated containment capabilities demonstration

Dimension 9: Compliance Posture (Weight: 5%, Max: 5)

Compliance posture measures the platform's alignment with applicable regulatory frameworks and standards.

Scoring Criteria

Score 0 — None:

No formal compliance assessment or framework alignment

Score 1 — Initial:

Basic SOC 2 or equivalent operational security certification
No AI-specific compliance assessment

Score 2 — Managed:

SOC 2 Type II certification (operational security)
EU AI Act compliance assessment completed for applicable risk categories
NIST AI RMF self-assessment documented

Score 3 — Defined:

All criteria from Score 2, plus:
ISO/IEC 42001 (AI Management System) certification or gap assessment
Regulatory mapping: explicit documentation of how platform features address each applicable regulatory requirement
Customer compliance support: tools and documentation to help customers meet their own compliance obligations

Score 4 — Optimized:

All criteria from Score 3, plus:
ISO/IEC 27001 certification
Third-party AI safety audit by recognized auditor
Continuous compliance monitoring: automated compliance state tracking with real-time dashboard
Active participation in standards bodies developing AI security standards (NIST, ISO/IEC JTC1/SC42, OWASP)

Key Evidence to Request

SOC 2 report (most recent)
EU AI Act conformity assessment documentation
NIST AI RMF profile or self-assessment
ISO 42001 certification or gap assessment

Dimension 10: Trust Evidence Quality (Weight: 2%, Max: 2)

Trust evidence quality measures the rigor, verifiability, and completeness of the security evidence the platform provides to its deployers.

Scoring Criteria

Score 0 — None:

No trust evidence provided beyond marketing claims

Score 1 — Initial:

Self-reported security documentation
No third-party verification

Score 2 — Managed:

Third-party security audits (SOC 2 or equivalent)
Published vulnerability disclosure program with response history

Score 3 — Defined:

All criteria from Score 2, plus:
Cryptographically verifiable security attestations for agent artifacts
Public trust transparency report (published annually)
Per-agent trust profiles with evidence base visible to deployers

Score 4 — Optimized:

All criteria from Score 3, plus:
Real-time trust oracle: queryable API that returns current trust evidence for any registered agent
Standardized trust evidence format enabling cross-platform comparison
Third-party trust validation program

Scoring Calculation and Interpretation

Weighted Score Calculation

Raw score per dimension = (assigned score / 4) * dimension_max_score

Examples:
- Identity Management: Score 3/4 → (3/4) * 10 = 7.5
- Access Control: Score 2/4 → (2/4) * 15 = 7.5
- Behavioral Monitoring: Score 4/4 → (4/4) * 15 = 15

Composite score = sum of all dimension raw scores

Score Interpretation

Composite Score	Interpretation	Deployment Recommendation
0-39	Critical security gaps	Do not deploy in any production context
40-59	Significant gaps	Internal development/test only; remediation plan required
60-69	Acceptable with remediation	Low-risk production deployments with active monitoring
70-79	Enterprise deployment ready	Standard enterprise production deployments
80-89	Strong security posture	High-stakes production deployments with appropriate monitoring
90-100	Best-in-class	Suitable for highest-sensitivity deployments

Critical Dimension Minimums

Regardless of composite score, certain dimensions have minimum scores below which deployment is not recommended:

Access Control: Minimum Score 2 for any production deployment
Injection Resistance: Minimum Score 2 for any deployment where agent receives untrusted input
Audit Completeness: Minimum Score 2 for any regulated industry deployment
Data Isolation: Minimum Score 3 for any multi-tenant deployment

Re-evaluation Triggers

The scorecard should be re-evaluated:

Quarterly: Routine re-assessment to capture evolution
After any major platform change: Model updates, architecture changes, new tool integrations
After any security incident: Assess whether the incident reveals gaps the previous scorecard missed
For compliance renewals: Align with audit cycles for SOC 2, ISO certifications

How Armalo Scores on This Framework

Armalo's platform is designed around the principles embedded in this scorecard. For transparency:

Identity Management: Score 4 — Cryptographically verifiable agent identities via W3C DIDs, hardware-protected signing keys for marketplace-listed agents, zero-trust architecture throughout
Access Control: Score 4 — Fine-grained parameter-level permissions, policy-as-code via behavioral pacts, automated right-sizing analysis, external red team validation
Data Isolation: Score 3 — Row-level security with per-organization encryption keys, cross-tenant isolation testing, geography-aware data residency
Audit Completeness: Score 4 — Tamper-evident audit logs with cryptographic chaining, ATCR = 1.0 enforced, LLM session logging, real-time streaming
Injection Resistance: Score 4 — Daily automated adversarial probe battery, multi-layer defense, IRR >= 0.994 on known techniques, novel injection research program
Supply Chain Integrity: Score 3 — AI SBOM for all marketplace agents, behavioral malware scanning, SLSA Level 2 provenance
Behavioral Monitoring: Score 4 — Full drift monitoring pipeline, < 4 hour detection latency for severe drift, cross-agent behavioral consistency monitoring
Incident Response: Score 3 — AI-specific playbooks, automated containment, quarterly tabletop exercises
Compliance Posture: Score 3 — SOC 2 Type II, EU AI Act compliance assessment, NIST AI RMF profile
Trust Evidence Quality: Score 4 — Real-time trust oracle, cryptographically verifiable attestations, per-agent trust profiles

Armalo composite score: 91/100

Conducting a Scorecard Evaluation: Practical Process

A scorecard is only useful if the evidence collected actually reflects the platform's security posture. Vendor self-assessments are a starting point, not a conclusion. The following process describes how to conduct a rigorous scorecard evaluation.

Preparation Phase

Before collecting evidence, define the evaluation scope:

Deployment tier: Is this a Tier 1 (internal, low-risk), Tier 2 (customer-facing, medium-risk), or Tier 3 (regulated, high-risk) deployment? Tier affects which minimum dimension scores are required.
Regulatory context: Which regulatory frameworks apply? EU AI Act high-risk category? HIPAA? Financial services model risk management guidance? Each framework has additional requirements beyond the base scorecard.
Agent capabilities: What tools and data access does the agent have? Higher-capability agents require higher minimum scores on access control and injection resistance dimensions.

Assemble the evaluation team: the scorecard should be evaluated by people with sufficient technical depth to distinguish genuine security controls from documentation theater. Recommended team composition:

One security engineer with AI/ML security background
One compliance professional with knowledge of applicable regulatory frameworks
One platform architect who can evaluate infrastructure security claims

Evidence Collection Phase

For each dimension, request specific evidence rather than accepting general descriptions. The "Key Evidence to Request" sections above specify the minimum evidence per dimension. General guidance:

Prefer operational evidence over documentation: A working demonstration of injection resistance testing beats a policy document describing the testing process. An actual audit log sample beats a description of the logging schema.

Test claims where possible: For injection resistance, run a small injection test set against the platform. For data isolation, attempt to access cross-tenant data with controlled test accounts. For audit completeness, trigger specific actions and verify they appear in the audit log.

Assess exception handling, not just standard paths: Every security control description should be accompanied by an explanation of exception processes. What happens when automated rotation fails? What is the escalation path when a circuit breaker fires? Security posture is often more visible in exception handling than in standard operations.

Evaluate update velocity for living controls: For injection resistance and behavioral monitoring, the most important factor is how quickly the platform updates its defenses when new attack techniques emerge. A high initial score on injection resistance that never updates is worth less than a moderate initial score with rapid update cycles.

Scoring Calibration

Scorecards can be gamed if criteria are interpreted generously. Apply these calibration rules:

Apply the lowest criteria that is fully met, not the highest partially met. If Score 3 requires "third-party validation" and the platform has a vendor-provided certificate but no independent audit, that's Score 2, not Score 3.

Require demonstrated evidence for each criterion. "We have this capability" without evidence defaults to Score 1 (Initial). Policies and processes without evidence of consistent execution default to Score 1.

Weight recent evidence more heavily. A SOC 2 report from two years ago tells you less than a recent one. A quarterly red team exercise that happened twice (not four times) in the past year is only partially meeting the criterion.

Flag dimension scores that are critically dependent on third-party trust. A Score 3 on identity management that depends entirely on a third-party identity provider inherits that provider's risk. Note these dependencies in the scorecard narrative.

The Regulatory Compliance Map

The scorecard dimensions map directly to major regulatory frameworks. This mapping enables organizations to use scorecard results directly in compliance documentation, avoiding duplicated work.

NIST AI RMF Alignment

Scorecard Dimension	NIST AI RMF Primary Function	NIST AI RMF Subcategory
Identity Management	GOVERN	GOVERN 1.1 (policies for AI risk)
Access Control	GOVERN / MANAGE	GOVERN 1.2, MANAGE 1.1
Data Isolation	MANAGE	MANAGE 2.2 (data governance)
Audit Completeness	MEASURE	MEASURE 2.9 (AI risk documentation)
Injection Resistance	MEASURE	MEASURE 2.6 (adversarial testing)
Supply Chain Integrity	MAP	MAP 2.1 (AI supply chain risk)
Behavioral Monitoring	MEASURE	MEASURE 2.5, 2.8 (performance monitoring)
Incident Response	MANAGE	MANAGE 3.2 (response processes)
Compliance Posture	GOVERN	GOVERN 6.1 (risk policies)
Trust Evidence Quality	GOVERN	GOVERN 5.1 (organizational practices)

EU AI Act High-Risk System Requirements

For platforms deploying agents classified as high-risk AI systems under the EU AI Act Annex III, the following minimum scores are required for Article compliance:

EU AI Act Article	Requirement	Minimum Scorecard Score
Article 9	Risk management system	Incident Response ≥ 2; Behavioral Monitoring ≥ 2
Article 10	Data governance	Data Isolation ≥ 3
Article 12	Record-keeping	Audit Completeness ≥ 3
Article 13	Transparency	Trust Evidence Quality ≥ 2
Article 14	Human oversight	Audit Completeness ≥ 3; Behavioral Monitoring ≥ 2
Article 15	Accuracy, robustness, cybersecurity	Injection Resistance ≥ 3; Access Control ≥ 3

A composite scorecard below 70 is unlikely to support EU AI Act high-risk system conformity assessment without additional controls documentation.

ISO/IEC 42001 Alignment

ISO/IEC 42001 (AI Management System standard, published December 2023) establishes requirements for AI management systems. The scorecard maps to several clauses:

Clause 6.1 (Actions to address risks): Incident Response + Behavioral Monitoring dimensions
Clause 8 (Operation): All dimensions, with particular emphasis on Access Control, Data Isolation, and Audit Completeness
Clause 9 (Performance evaluation): Behavioral Monitoring + Compliance Posture dimensions
Clause 10 (Improvement): Incident Response dimension (post-incident improvement process)

Organizations pursuing ISO/IEC 42001 certification should use the scorecard as part of their gap assessment for Clause 8 and 9 requirements.

What a Score of 70 Looks Like in Practice

Abstract score thresholds are less useful than concrete descriptions of what each level looks like in an actual AI agent deployment. A composite score of 70 ("enterprise deployment ready") typically means:

Identity: Agents have cryptographically verifiable identities; credentials rotate on a defined schedule; MFA required for all operator access; zero-trust network architecture. Missing: hardware attestation, third-party identity audit.

Access Control: Fine-grained permissions defined per agent; least-privilege enforced at deployment; unauthorized tool calls are blocked in real-time; policy changes require documented authorization. Missing: policy-as-code with version control; automated permission right-sizing; external red team validation.

Data Isolation: Database row-level security policies enforced; cross-tenant isolation tested quarterly; encryption at rest with organization-level keys. Missing: cryptographic proof of isolation for tenants; physical isolation option.

Audit: All agent actions logged; 1-year retention; tamper-evident storage. Missing: complete LLM session logging; real-time streaming; cross-system event correlation.

Injection Resistance: Known technique test battery run quarterly; IRR >= 0.99 on documented attacks; injection attempts logged and alerted. Missing: daily automated probe battery; novel technique research.

Behavioral Monitoring: Weekly PSI and ECE measurements; anomaly alerting; drift investigation process. Missing: < 4 hour detection latency; cross-agent consistency monitoring.

Incident Response: Formal policy with severity levels; AI-specific playbooks for major attack vectors; tested quarterly. Missing: automated containment capability; purple team exercises.

This profile represents a meaningful security investment with genuine controls — but specific gaps remain that organizations should address based on their risk tolerance and regulatory requirements.

Conclusion

A security scorecard without action is just a document. The value of this framework is in:

Identifying gaps — seeing clearly which dimensions are below acceptable thresholds for the deployment context
Prioritizing remediation — higher-weight dimensions with lower scores represent the highest-leverage improvement opportunities, particularly access control and behavioral monitoring at 15% each
Communicating to stakeholders — a scored assessment with defined criteria is more credible than qualitative descriptions; executives and auditors can interpret a 73/100 more readily than "we have strong security"
Vendor procurement — applying the scorecard consistently across vendor evaluations enables objective comparison rather than relying on vendor-provided marketing materials
Regulatory documentation — the scorecard's explicit mapping to NIST AI RMF, EU AI Act, and ISO/IEC 42001 reduces compliance documentation effort when the same evidence collection supports the scorecard and the regulatory filing
Tracking improvement over time — the same scorecard re-applied quarterly shows whether the security investment is producing measurable results in the dimensions that matter most

The organizations that deploy AI agents responsibly are the ones that can answer, with specific evidence, the question: "How secure is this agent in production, and what exactly does the evidence show?" This scorecard provides the structure for that answer — turning a historically difficult qualitative judgment into a defensible, evidence-backed, regularly-updated quantitative assessment. In a regulatory environment that is increasingly demanding specific answers to AI security questions, having a structured, repeatable evaluation methodology is rapidly shifting from a best practice to a compliance requirement. Organizations that build this infrastructure now are not just protecting themselves from current threats; they are building the audit readiness and institutional knowledge that will be required as AI security regulations mature.

security scorecardai agent securitysecurity evaluationzero trustcompliancearmaloai agent trustgenerative engine optimization

← Knowledge Base

Build trust into your agents

Start Free Read the docs

Based in Singapore? See our MAS AI governance compliance resources →

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

2026-05-1020 min read

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

TL;DR

Ten security dimensions require independent evaluation: identity management, access control, data isolation, audit completeness, injection resistance, supply chain integrity, behavioral monitoring, incident response, compliance posture, and trust evidence quality
Each dimension is scored 0-4 with defined criteria; the weighted composite score ranges 0-100
Access control and behavioral monitoring carry the highest weights (15% each) because their failures have the broadest blast radius in agent systems
A composite score below 60 indicates significant security gaps; 70-80 represents acceptable enterprise deployment readiness; 85+ represents best-in-class security posture
NIST AI RMF, EU AI Act, and ISO/IEC 42001 map to specific scorecard dimensions
The scorecard should be re-evaluated quarterly for live deployments and triggered by any significant platform change

The Evaluation Framework

Structure

The scorecard covers ten dimensions, each scored on a 0-4 scale:

0 (None): The capability is absent or unimplemented
1 (Initial): Basic capability exists but is ad-hoc, undocumented, or inconsistently applied
2 (Managed): The capability is documented, consistently applied, and monitored
3 (Defined): The capability meets all managed criteria plus formal process ownership, exception handling, and regular review
4 (Optimized): The capability meets all defined criteria plus continuous improvement, automated enforcement, and third-party validation

Dimension Weights and Maximum Scores

Dimension	Weight	Max Score
1. Identity Management	10%	10
2. Access Control	15%	15
3. Data Isolation	12%	12
4. Audit Completeness	12%	12
5. Injection Resistance	13%	13
6. Supply Chain Integrity	8%	8
7. Behavioral Monitoring	15%	15
8. Incident Response	8%	8
9. Compliance Posture	5%	5
10. Trust Evidence Quality	2%	2

Dimension 1: Identity Management (Weight: 10%, Max: 10)

Scoring Criteria

Score 0 — None:

No formal identity verification for agent operators or consumers
Agents have no identity (no cryptographic identifier or credential)
No authentication required to access agent capabilities

Score 1 — Initial:

Basic username/password authentication for operator console
Agents have identifiers but they are not cryptographically bound
No machine-readable identity for agents (no DID, certificate, or signed identity document)

Score 2 — Managed:

Strong authentication (MFA required) for all operator console access
Agents have unique identifiers that are managed through a registry
Agent credentials are stored securely (HSM or equivalent for production environments)
Agent credential rotation is supported (not necessarily automated)

Score 3 — Defined:

All criteria from Score 2, plus:
Agents have cryptographically verifiable identities (W3C DID or x.509 certificate)
Continuous authentication: agents re-verify identity at each significant operation
Credential lifecycle management: automated rotation, revocation, and expiry enforcement
Service mesh or equivalent for agent-to-agent authentication

Score 4 — Optimized:

All criteria from Score 3, plus:
Zero-trust identity: every request is verified regardless of network location
Hardware attestation: agent identity is bound to specific hardware/runtime (TPM or equivalent)
Third-party identity audits: identity infrastructure reviewed by external security auditor annually
Identity transparency: agent identity credentials are publicly verifiable for marketplace-listed agents

Key Evidence to Request

Agent identity architecture documentation
Sample agent credential (to inspect for cryptographic binding)
Identity audit logs (how many active credentials, rotation schedule)
Zero-trust architecture design documentation

Dimension 2: Access Control (Weight: 15%, Max: 15)

Access control determines what resources, tools, and data each agent can access, and how those permissions are enforced, audited, and managed.

Scoring Criteria

Score 0 — None:

No formal access control: agents can access any resource they know how to reach
No separation between agent permissions and operator permissions

Score 1 — Initial:

Basic authorization lists: some tools/resources are explicitly restricted
Permission enforcement is at the application level (not enforced in infrastructure)
No principle of least privilege: agents are granted broad access for operational convenience

Score 2 — Managed:

Formal permission model: agents have explicitly defined permission sets
Principle of least privilege applied: agents granted only required permissions
Permission changes require documented authorization
Runtime permission enforcement: unauthorized tool calls are blocked (not just logged)

Score 3 — Defined:

All criteria from Score 2, plus:
Fine-grained access control: parameter-level permissions (not just tool-level)
Dynamic authorization: permissions can vary by context (authorized user, data classification)
Permission inheritance model: clear rules for how child processes and delegated agents inherit permissions
Continuous access certification: regular review of whether existing permissions are still necessary

Score 4 — Optimized:

All criteria from Score 3, plus:
Policy-as-code: all access control policies are version-controlled and peer-reviewed
Real-time access anomaly detection: alerts on unusual permission usage patterns
Automated permission right-sizing: tooling to identify and remove excess permissions
External red team validation of access control boundaries (annually)

Key Evidence to Request

Permission model documentation
Sample agent permission manifest
Authorization decision logs (showing blocked unauthorized calls)
Access certification process documentation

Dimension 3: Data Isolation (Weight: 12%, Max: 12)

Data isolation ensures that agent processing in one organizational context cannot access, contaminate, or expose data from another context — a multi-tenancy security requirement.

Scoring Criteria

Score 0 — None:

No logical separation between data for different organizations or contexts
Agents can access data across organizational boundaries

Score 1 — Initial:

Application-level data separation (tenant_id or org_id filters in queries)
No database-level or infrastructure-level isolation

Score 2 — Managed:

All data access filtered by organization identifier at application AND database levels
Row-level security policies enforce tenant isolation at the database level
Shared infrastructure with strong logical isolation (separate schemas or row-level policies per tenant)
Isolation is tested as part of standard QA/security testing

Score 3 — Defined:

All criteria from Score 2, plus:
Data residency controls: ability to enforce that specific tenants' data stays in specific geographies/infrastructure
Cross-tenant isolation testing: regular adversarial testing of tenant isolation boundaries
Memory isolation: agent working memory from one session cannot persist to another tenant's sessions
Encryption at rest with per-tenant key management (tenant can rotate/revoke their own encryption key)

Score 4 — Optimized:

All criteria from Score 3, plus:
Physical isolation option: ability to run in dedicated infrastructure for highest-security requirements
Formal verification or third-party audit of isolation controls
Cryptographic proof of isolation: tenants can independently verify their data isolation
Zero-knowledge architectures for highest-sensitivity use cases

Key Evidence to Request

Multi-tenancy architecture documentation
Database row-level security policy examples
Isolation test results (attempts to access cross-tenant data)
Encryption key management documentation

Dimension 4: Audit Completeness (Weight: 12%, Max: 12)

Audit completeness measures whether the platform records sufficient information about all agent actions to support forensic investigation, compliance reporting, and accountability.

Scoring Criteria

Score 0 — None:

No structured audit logging
Agent actions are not recorded

Score 1 — Initial:

Basic event logging (inputs and outputs recorded)
Logs stored locally (no centralized audit infrastructure)
Log retention policy absent or undefined

Score 2 — Managed:

All agent actions logged with standard schema (agent_id, org_id, action_type, timestamp, input, output)
Logs stored in centralized, append-only audit system
Defined retention period (minimum 90 days, recommended 1 year)
Logs accessible for compliance and forensic queries

Score 3 — Defined:

All criteria from Score 2, plus:
Tamper-evident logging: cryptographic chaining or external hash anchoring (e.g., log entries anchored to blockchain or trusted timestamp service)
Complete audit coverage: no agent actions occur without corresponding audit records (ATCR = 1.0)
LLM session logging: complete prompt/response pairs, not just summarized events
Audit log available to tenants for their own agents

Score 4 — Optimized:

All criteria from Score 3, plus:
Real-time audit streaming: audit events available for real-time security monitoring
Compliance reporting automation: pre-built reports for regulatory requirements
Cross-system audit correlation: agent audit logs linked to downstream system events
Third-party audit log verification: external party can independently verify log integrity

Key Evidence to Request

Audit log schema documentation
Sample audit log entries
Log retention policy
Tamper-evidence mechanism documentation

Dimension 5: Injection Resistance (Weight: 13%, Max: 13)

Injection resistance measures the platform's defenses against prompt injection, indirect injection, and other input manipulation attacks that attempt to override agent behavior.

Scoring Criteria

Score 0 — None:

No injection detection or prevention controls
Agent behavior is easily overridden through crafted inputs

Score 1 — Initial:

Basic input filtering for known malicious patterns
No defense against novel injection techniques
No adversarial testing of injection resistance

Score 2 — Managed:

Input scanning against known injection technique signatures
Documented injection resistance policy
Regular red team testing of injection resistance (at least quarterly)
Injection attempt logging and alerting

Score 3 — Defined:

All criteria from Score 2, plus:
Multi-layer defense: input filtering + instruction hierarchy enforcement + output validation
Defense against indirect injection (content retrieved from external sources)
IRR >= 0.99 against documented attack techniques (tested and measured)
Injection resistance included in agent behavioral pacts/SLOs

Score 4 — Optimized:

All criteria from Score 3, plus:
Continuous novel injection research: platform maintains or funds research into new injection techniques
IRR >= 0.95 against novel techniques (demonstrated through red team exercises)
Injection attempt intelligence sharing with industry peers
Automated injection probe battery executed daily

Key Evidence to Request

Injection resistance documentation
Red team evaluation results (most recent)
IRR measurement methodology and results
Novel injection attack response process

Dimension 6: Supply Chain Integrity (Weight: 8%, Max: 8)

Supply chain integrity ensures that components used by the platform (base models, tool integrations, agent packages, dependencies) are authentic, uncompromised, and of known provenance.

Scoring Criteria

Score 0 — None:

No supply chain security controls
Components installed without verification

Score 1 — Initial:

Basic dependency manifest maintained
No verification of component integrity at install time

Score 2 — Managed:

SBOM maintained for all platform components
Component hashes verified at install time
Model provenance documented (which provider, which version)
Dependency updates follow a review process

Score 3 — Defined:

All criteria from Score 2, plus:
AI SBOM: model components, training data, prompt templates documented alongside software components
SLSA Level 2 or higher for all platform-published components
Vendor security assessment required before new model provider integration
Continuous vulnerability scanning of all components

Score 4 — Optimized:

All criteria from Score 3, plus:
SLSA Level 3 for critical components
Reproducible builds: platform builds can be independently reproduced and verified
Behavioral supply chain scanning: all third-party agents in marketplace evaluated for behavioral malware
Supply chain incident response plan with tested runbook

Key Evidence to Request

SBOM documentation
SLSA provenance attestations
Third-party component security review process
Behavioral malware scanning methodology for marketplace agents

Dimension 7: Behavioral Monitoring (Weight: 15%, Max: 15)

Scoring Criteria

Score 0 — None:

No behavioral monitoring beyond basic availability checks
No detection of behavioral drift or anomalies

Score 1 — Initial:

Basic accuracy metrics monitored (if ground truth is available)
Ad-hoc behavioral investigation when problems are reported

Score 2 — Managed:

PSI and/or KS tests run on agent output distributions (minimum weekly)
Calibration monitoring: ECE tracked over time
Behavioral anomaly alerts with defined escalation paths
Tool call pattern monitoring

Score 3 — Defined:

All criteria from Score 2, plus:
Full drift monitoring pipeline (as described in companion posts): embedding drift, retrieval drift (for RAG), behavioral baseline comparison
Knowledge drift detected within 24 hours of significance threshold
Multi-signal confirmation required before high-severity drift alerts
Automated remediation for low-to-moderate severity drift (corpus refresh, probe evaluation)

Score 4 — Optimized:

All criteria from Score 3, plus:
Real-time behavioral monitoring with < 1 hour detection latency for severe drift
Cross-agent behavioral consistency monitoring for multi-agent deployments
Behavioral monitoring results integrated into trust scores and visible to deployers
Continuous adversarial behavioral probing (daily automated red team)

Key Evidence to Request

Behavioral monitoring architecture documentation
Sample drift alert and response example
Detection latency SLA documentation
Adversarial probe battery documentation

Dimension 8: Incident Response (Weight: 8%, Max: 8)

Incident response measures the platform's capability to detect, contain, investigate, and recover from security incidents in AI agent deployments.

Scoring Criteria

Score 0 — None:

No formal incident response process for AI security incidents
No playbooks or escalation paths

Score 1 — Initial:

Basic incident documentation (incident reports written after events)
Informal escalation to engineering team
No SLA for incident response times

Score 2 — Managed:

Formal incident response policy with defined severity levels
Response time SLAs by severity (Critical: 30min, High: 2h, Medium: 8h)
Post-incident review process
Security incident notification process for affected operators

Score 3 — Defined:

All criteria from Score 2, plus:
AI-specific incident playbooks covering: prompt injection, data exfiltration, behavioral compromise, supply chain incident
Tested incident response runbooks (tabletop exercises at minimum quarterly)
Automated incident detection integrated with response workflow
Forensic capability: ability to reconstruct full agent session from audit logs

Score 4 — Optimized:

All criteria from Score 3, plus:
Automated containment: ability to isolate or suspend compromised agents within minutes
Purple team exercises: combined red/blue team exercises to improve detection and response
Industry coordination: participation in AI security incident sharing communities
Public incident disclosure policy and history (demonstrates accountability)

Key Evidence to Request

Incident response policy documentation
Most recent tabletop exercise results
Historical incident record (number, severity, response times, outcomes)
Automated containment capabilities demonstration

Dimension 9: Compliance Posture (Weight: 5%, Max: 5)

Compliance posture measures the platform's alignment with applicable regulatory frameworks and standards.

Scoring Criteria

Score 0 — None:

No formal compliance assessment or framework alignment

Score 1 — Initial:

Basic SOC 2 or equivalent operational security certification
No AI-specific compliance assessment

Score 2 — Managed:

SOC 2 Type II certification (operational security)
EU AI Act compliance assessment completed for applicable risk categories
NIST AI RMF self-assessment documented

Score 3 — Defined:

All criteria from Score 2, plus:
ISO/IEC 42001 (AI Management System) certification or gap assessment
Regulatory mapping: explicit documentation of how platform features address each applicable regulatory requirement
Customer compliance support: tools and documentation to help customers meet their own compliance obligations

Score 4 — Optimized:

All criteria from Score 3, plus:
ISO/IEC 27001 certification
Third-party AI safety audit by recognized auditor
Continuous compliance monitoring: automated compliance state tracking with real-time dashboard
Active participation in standards bodies developing AI security standards (NIST, ISO/IEC JTC1/SC42, OWASP)

Key Evidence to Request

SOC 2 report (most recent)
EU AI Act conformity assessment documentation
NIST AI RMF profile or self-assessment
ISO 42001 certification or gap assessment

Dimension 10: Trust Evidence Quality (Weight: 2%, Max: 2)

Trust evidence quality measures the rigor, verifiability, and completeness of the security evidence the platform provides to its deployers.

Scoring Criteria

Score 0 — None:

No trust evidence provided beyond marketing claims

Score 1 — Initial:

Self-reported security documentation
No third-party verification

Score 2 — Managed:

Third-party security audits (SOC 2 or equivalent)
Published vulnerability disclosure program with response history

Score 3 — Defined:

All criteria from Score 2, plus:
Cryptographically verifiable security attestations for agent artifacts
Public trust transparency report (published annually)
Per-agent trust profiles with evidence base visible to deployers

Score 4 — Optimized:

All criteria from Score 3, plus:
Real-time trust oracle: queryable API that returns current trust evidence for any registered agent
Standardized trust evidence format enabling cross-platform comparison
Third-party trust validation program

Scoring Calculation and Interpretation

Weighted Score Calculation

Raw score per dimension = (assigned score / 4) * dimension_max_score

Examples:
- Identity Management: Score 3/4 → (3/4) * 10 = 7.5
- Access Control: Score 2/4 → (2/4) * 15 = 7.5
- Behavioral Monitoring: Score 4/4 → (4/4) * 15 = 15

Composite score = sum of all dimension raw scores

Score Interpretation

Composite Score	Interpretation	Deployment Recommendation
0-39	Critical security gaps	Do not deploy in any production context
40-59	Significant gaps	Internal development/test only; remediation plan required
60-69	Acceptable with remediation	Low-risk production deployments with active monitoring
70-79	Enterprise deployment ready	Standard enterprise production deployments
80-89	Strong security posture	High-stakes production deployments with appropriate monitoring
90-100	Best-in-class	Suitable for highest-sensitivity deployments

Critical Dimension Minimums

Regardless of composite score, certain dimensions have minimum scores below which deployment is not recommended:

Access Control: Minimum Score 2 for any production deployment
Injection Resistance: Minimum Score 2 for any deployment where agent receives untrusted input
Audit Completeness: Minimum Score 2 for any regulated industry deployment
Data Isolation: Minimum Score 3 for any multi-tenant deployment

Re-evaluation Triggers

The scorecard should be re-evaluated:

Quarterly: Routine re-assessment to capture evolution
After any major platform change: Model updates, architecture changes, new tool integrations
After any security incident: Assess whether the incident reveals gaps the previous scorecard missed
For compliance renewals: Align with audit cycles for SOC 2, ISO certifications

How Armalo Scores on This Framework

Armalo's platform is designed around the principles embedded in this scorecard. For transparency:

Identity Management: Score 4 — Cryptographically verifiable agent identities via W3C DIDs, hardware-protected signing keys for marketplace-listed agents, zero-trust architecture throughout
Access Control: Score 4 — Fine-grained parameter-level permissions, policy-as-code via behavioral pacts, automated right-sizing analysis, external red team validation
Data Isolation: Score 3 — Row-level security with per-organization encryption keys, cross-tenant isolation testing, geography-aware data residency
Audit Completeness: Score 4 — Tamper-evident audit logs with cryptographic chaining, ATCR = 1.0 enforced, LLM session logging, real-time streaming
Injection Resistance: Score 4 — Daily automated adversarial probe battery, multi-layer defense, IRR >= 0.994 on known techniques, novel injection research program
Supply Chain Integrity: Score 3 — AI SBOM for all marketplace agents, behavioral malware scanning, SLSA Level 2 provenance
Behavioral Monitoring: Score 4 — Full drift monitoring pipeline, < 4 hour detection latency for severe drift, cross-agent behavioral consistency monitoring
Incident Response: Score 3 — AI-specific playbooks, automated containment, quarterly tabletop exercises
Compliance Posture: Score 3 — SOC 2 Type II, EU AI Act compliance assessment, NIST AI RMF profile
Trust Evidence Quality: Score 4 — Real-time trust oracle, cryptographically verifiable attestations, per-agent trust profiles

Armalo composite score: 91/100

Conducting a Scorecard Evaluation: Practical Process

Preparation Phase

Before collecting evidence, define the evaluation scope:

Deployment tier: Is this a Tier 1 (internal, low-risk), Tier 2 (customer-facing, medium-risk), or Tier 3 (regulated, high-risk) deployment? Tier affects which minimum dimension scores are required.
Regulatory context: Which regulatory frameworks apply? EU AI Act high-risk category? HIPAA? Financial services model risk management guidance? Each framework has additional requirements beyond the base scorecard.
Agent capabilities: What tools and data access does the agent have? Higher-capability agents require higher minimum scores on access control and injection resistance dimensions.

One security engineer with AI/ML security background
One compliance professional with knowledge of applicable regulatory frameworks
One platform architect who can evaluate infrastructure security claims

Evidence Collection Phase

For each dimension, request specific evidence rather than accepting general descriptions. The "Key Evidence to Request" sections above specify the minimum evidence per dimension. General guidance:

Scoring Calibration

Scorecards can be gamed if criteria are interpreted generously. Apply these calibration rules:

The Regulatory Compliance Map

The scorecard dimensions map directly to major regulatory frameworks. This mapping enables organizations to use scorecard results directly in compliance documentation, avoiding duplicated work.

NIST AI RMF Alignment

Scorecard Dimension	NIST AI RMF Primary Function	NIST AI RMF Subcategory
Identity Management	GOVERN	GOVERN 1.1 (policies for AI risk)
Access Control	GOVERN / MANAGE	GOVERN 1.2, MANAGE 1.1
Data Isolation	MANAGE	MANAGE 2.2 (data governance)
Audit Completeness	MEASURE	MEASURE 2.9 (AI risk documentation)
Injection Resistance	MEASURE	MEASURE 2.6 (adversarial testing)
Supply Chain Integrity	MAP	MAP 2.1 (AI supply chain risk)
Behavioral Monitoring	MEASURE	MEASURE 2.5, 2.8 (performance monitoring)
Incident Response	MANAGE	MANAGE 3.2 (response processes)
Compliance Posture	GOVERN	GOVERN 6.1 (risk policies)
Trust Evidence Quality	GOVERN	GOVERN 5.1 (organizational practices)

EU AI Act High-Risk System Requirements

For platforms deploying agents classified as high-risk AI systems under the EU AI Act Annex III, the following minimum scores are required for Article compliance:

EU AI Act Article	Requirement	Minimum Scorecard Score
Article 9	Risk management system	Incident Response ≥ 2; Behavioral Monitoring ≥ 2
Article 10	Data governance	Data Isolation ≥ 3
Article 12	Record-keeping	Audit Completeness ≥ 3
Article 13	Transparency	Trust Evidence Quality ≥ 2
Article 14	Human oversight	Audit Completeness ≥ 3; Behavioral Monitoring ≥ 2
Article 15	Accuracy, robustness, cybersecurity	Injection Resistance ≥ 3; Access Control ≥ 3

A composite scorecard below 70 is unlikely to support EU AI Act high-risk system conformity assessment without additional controls documentation.

ISO/IEC 42001 Alignment

ISO/IEC 42001 (AI Management System standard, published December 2023) establishes requirements for AI management systems. The scorecard maps to several clauses:

Clause 6.1 (Actions to address risks): Incident Response + Behavioral Monitoring dimensions
Clause 8 (Operation): All dimensions, with particular emphasis on Access Control, Data Isolation, and Audit Completeness
Clause 9 (Performance evaluation): Behavioral Monitoring + Compliance Posture dimensions
Clause 10 (Improvement): Incident Response dimension (post-incident improvement process)

Organizations pursuing ISO/IEC 42001 certification should use the scorecard as part of their gap assessment for Clause 8 and 9 requirements.

What a Score of 70 Looks Like in Practice

Audit: All agent actions logged; 1-year retention; tamper-evident storage. Missing: complete LLM session logging; real-time streaming; cross-system event correlation.

Behavioral Monitoring: Weekly PSI and ECE measurements; anomaly alerting; drift investigation process. Missing: < 4 hour detection latency; cross-agent consistency monitoring.

Incident Response: Formal policy with severity levels; AI-specific playbooks for major attack vectors; tested quarterly. Missing: automated containment capability; purple team exercises.

Conclusion

A security scorecard without action is just a document. The value of this framework is in:

Identifying gaps — seeing clearly which dimensions are below acceptable thresholds for the deployment context
Prioritizing remediation — higher-weight dimensions with lower scores represent the highest-leverage improvement opportunities, particularly access control and behavioral monitoring at 15% each
Communicating to stakeholders — a scored assessment with defined criteria is more credible than qualitative descriptions; executives and auditors can interpret a 73/100 more readily than "we have strong security"
Vendor procurement — applying the scorecard consistently across vendor evaluations enables objective comparison rather than relying on vendor-provided marketing materials
Regulatory documentation — the scorecard's explicit mapping to NIST AI RMF, EU AI Act, and ISO/IEC 42001 reduces compliance documentation effort when the same evidence collection supports the scorecard and the regulatory filing
Tracking improvement over time — the same scorecard re-applied quarterly shows whether the security investment is producing measurable results in the dimensions that matter most

security scorecardai agent securitysecurity evaluationzero trustcompliancearmaloai agent trustgenerative engine optimization

← Knowledge Base

Build trust into your agents

Start Free Read the docs

Based in Singapore? See our MAS AI governance compliance resources →

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

TL;DR

The Evaluation Framework

Structure

Dimension Weights and Maximum Scores

Dimension 1: Identity Management (Weight: 10%, Max: 10)

Scoring Criteria

Key Evidence to Request

Dimension 2: Access Control (Weight: 15%, Max: 15)

Scoring Criteria

Key Evidence to Request

Dimension 3: Data Isolation (Weight: 12%, Max: 12)

Scoring Criteria

Key Evidence to Request

Dimension 4: Audit Completeness (Weight: 12%, Max: 12)

Scoring Criteria

Key Evidence to Request

Dimension 5: Injection Resistance (Weight: 13%, Max: 13)

Scoring Criteria

Key Evidence to Request

Dimension 6: Supply Chain Integrity (Weight: 8%, Max: 8)

Scoring Criteria

Key Evidence to Request

Dimension 7: Behavioral Monitoring (Weight: 15%, Max: 15)

Scoring Criteria

Key Evidence to Request

Dimension 8: Incident Response (Weight: 8%, Max: 8)

Scoring Criteria

Key Evidence to Request

Dimension 9: Compliance Posture (Weight: 5%, Max: 5)

Scoring Criteria

Key Evidence to Request

Dimension 10: Trust Evidence Quality (Weight: 2%, Max: 2)

Scoring Criteria

Scoring Calculation and Interpretation

Weighted Score Calculation

Score Interpretation

Critical Dimension Minimums

Re-evaluation Triggers

How Armalo Scores on This Framework

Conducting a Scorecard Evaluation: Practical Process

Preparation Phase

Evidence Collection Phase

Scoring Calibration

The Regulatory Compliance Map

NIST AI RMF Alignment

EU AI Act High-Risk System Requirements

ISO/IEC 42001 Alignment

What a Score of 70 Looks Like in Practice

Conclusion

Build trust into your agents

Related Articles

Vendor Credential Isolation: Why AI Agents Must Never Share API Keys Across Tenants

Tool Permission Hardening for AI Agents: Least-Privilege Design at the API Layer

Security SLOs for AI Agent Platforms: Defining Behavioral Guarantees That Hold in Production

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

AI Agent Platform Security Scorecards: A Practitioner's Evaluation Framework for 2026

TL;DR

The Evaluation Framework

Structure

Dimension Weights and Maximum Scores

Dimension 1: Identity Management (Weight: 10%, Max: 10)

Scoring Criteria

Key Evidence to Request

Dimension 2: Access Control (Weight: 15%, Max: 15)

Scoring Criteria

Key Evidence to Request

Dimension 3: Data Isolation (Weight: 12%, Max: 12)

Scoring Criteria

Key Evidence to Request

Dimension 4: Audit Completeness (Weight: 12%, Max: 12)

Scoring Criteria

Key Evidence to Request

Dimension 5: Injection Resistance (Weight: 13%, Max: 13)

Scoring Criteria

Key Evidence to Request

Dimension 6: Supply Chain Integrity (Weight: 8%, Max: 8)

Scoring Criteria

Key Evidence to Request