How a metric breach becomes a boardroom finding

From observed value
to stakeholder narrative

Three agents, one human gate, one deterministic rules engine. The LLM only touches the final sentence.

Input: Breach Event (caller-supplied)

Everything the engine needs arrives in a single request: the system's identity (DNA), the breach signal, and which personas need an assessment.

BreachEvent: POST /assess

System DNA

agency_level autonomous

action_type transactional

exposure_surface ext_regulated_consumer

domain financial_services

data_sensitivity sensitive_personal

Breach Signal

quality_attribute hallucination_rate

measured_value 0.09 (9%)

threshold_value 0.03 (3%)

prior_breach_count 1

Target personas (target_personas)

compliance_officer

quality_lead

product_owner

business_owner

⚠ The threshold is not DNA-dependent: it's caller-supplied from your eval tooling or quality policy. Prism determines the consequence of breaching it, not what the threshold should be.
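
As a rough illustration of the request described above, the sketch below models the BreachEvent as Python dataclasses and serialises it to a dictionary. The field names and values come from the example on this page; the class names and the nesting keys (`dna`, `signal`) are assumptions, since the actual wire format of POST /assess is not shown here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SystemDNA:
    agency_level: str
    action_type: str
    exposure_surface: str
    domain: str
    data_sensitivity: str


@dataclass(frozen=True)
class BreachSignal:
    quality_attribute: str
    measured_value: float
    threshold_value: float   # caller-supplied, not derived from DNA
    prior_breach_count: int


@dataclass(frozen=True)
class BreachEvent:
    dna: SystemDNA            # key name is an assumption
    signal: BreachSignal      # key name is an assumption
    target_personas: tuple[str, ...]


event = BreachEvent(
    dna=SystemDNA(
        agency_level="autonomous",
        action_type="transactional",
        exposure_surface="ext_regulated_consumer",
        domain="financial_services",
        data_sensitivity="sensitive_personal",
    ),
    signal=BreachSignal(
        quality_attribute="hallucination_rate",
        measured_value=0.09,
        threshold_value=0.03,
        prior_breach_count=1,
    ),
    target_personas=(
        "compliance_officer",
        "quality_lead",
        "product_owner",
        "business_owner",
    ),
)

# Nested dataclasses are converted recursively into plain dicts.
payload = asdict(event)
```

The resulting `payload` could then be JSON-encoded as the body of the request.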

Agent 1: Telemetry Architect

Catalogue validation

Runs before the impact pipeline. Confirms the attribute is well-defined, has a declared metric scale, and has measurement tooling assigned. Blocks bad inputs rather than letting them produce silently wrong scores.

Reads from ontology

quality_attributes.yaml: v2.1, 22 attributes

system_dna.yaml: DNA vocabulary

impact_rule.schema.json: v2.1, authoritative

Enriched attribute record: added to payload

+ metric_scale { min:0, max:1, direction:"lower_is_better", unit:"ragas_score" }

+ measurement_tools [ "RAGAS", "DeepEval" ]

+ applicable_domains [ "financial_services", "healthcare" ]

Blocks these anti-patterns

Duplicate attributes: Jaccard dedup gate

Missing metric_scale: engine cannot normalise without it

Unknown DNA ids in applicable_system_types

Saturated quadrant: exits to no_proposal if nothing new to add
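
Two of the checks above can be sketched in a few lines. This is a minimal illustration, not the Telemetry Architect's actual logic: it assumes the Jaccard gate compares token sets taken from attribute ids, and the `id` field name, the `tokens` helper, and the 0.8 cut-off are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tokens(attribute_id: str) -> set[str]:
    """Split a snake_case id into a lowercase token set (assumed tokenisation)."""
    return set(attribute_id.lower().replace("_", " ").split())


def validate_attribute(candidate: dict, catalogue: list[dict],
                       dup_threshold: float = 0.8) -> list[str]:
    """Return a list of blocking problems; an empty list means the input passes."""
    problems = []
    # The engine cannot normalise a measurement without a declared scale.
    if "metric_scale" not in candidate:
        problems.append("missing metric_scale")
    # Dedup gate: flag candidates that are near-identical to an existing entry.
    for existing in catalogue:
        if jaccard(tokens(candidate["id"]), tokens(existing["id"])) >= dup_threshold:
            problems.append(f"duplicate of {existing['id']}")
    return problems


catalogue = [{"id": "hallucination_rate"}]
bad = validate_attribute({"id": "hallucination_rate"}, catalogue)
good = validate_attribute(
    {"id": "answer_relevance", "metric_scale": {"min": 0, "max": 1}}, catalogue
)
```

Returning a list of problems rather than raising on the first one lets a caller report every blocking issue in a single pass.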

Agent 2: Impact Mapper

Core reasoning

This is where a technical breach becomes a business risk. The Impact Mapper takes the enriched payload and applies chain-of-thought reasoning in DNA axis order: agency before metric. It produces one schema-valid Impact Rule JSON per persona.

Reasoning axis order: agency evaluated before metric delta

1. agency_level autonomous = highest Csys multiplier (1.4×)

2. action_type transactional = irreversible (T = 1.0)

3. exposure_surface external_regulated → MAS jurisdiction

4. Qa metric delta (0.09 – 0.03) normalised → ~0.97

Vs = ( Qa × Ws × Csys ) × R × Mr
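
The formula above is a plain product, which the sketch below evaluates directly. Qa (~0.97) and Csys (1.4) are the values from the reasoning steps; the values used for Ws, R and Mr are placeholders chosen only to make the arithmetic visible, because this page does not state them. The function name is also hypothetical.

```python
def stakeholder_score(qa: float, ws: float, csys: float, r: float, mr: float) -> float:
    """Evaluate Vs = (Qa * Ws * Csys) * R * Mr exactly as written above."""
    return (qa * ws * csys) * r * mr


vs = stakeholder_score(
    qa=0.97,   # from step 4: normalised metric delta
    ws=0.5,    # placeholder, not given in the source
    csys=1.4,  # from step 1: autonomous agency multiplier
    r=1.0,     # placeholder, not given in the source
    mr=1.0,    # placeholder, not given in the source
)
```

Because every factor is multiplicative, halving any single input halves Vs, which makes each factor's contribution easy to audit.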

Produces: Impact Rule JSON (per persona)

+ rule_id "hallucination_rate__autonomous__compliance_officer__financial_services"

+ severity_at_breach "critical"

+ causal_chain mechanism named, not category

+ regulatory_references clause named, not just framework

+ scoring_block { Qa, Ws, Csys, R }

+ narrative_template placeholder tokens only: no LLM yet

Anti-patterns it must avoid

Generic causal chain: must name the mechanism, not the category

Framework-level refs: must cite specific clause (MAS FEAT Principle 3.2, not "MAS FEAT")

Severity inflation: not everything is Critical

Wrong domain scoping: MAS FEAT for an HR rule is specific and wrong

Agent 3: Calibration Critic

Quality gate

Reviews the Impact Mapper's draft. Scores on six dimensions. Returns for revision (max 2 cycles) or flags for human review. Specificity ≠ correctness: the critic catches wrong-clause citations, not just absent ones.

Six scored dimensions

schema_valid deterministic: passes impact_rule.schema.json?

severity_calibrated LLM-judged: consistent with golden rule anchors?

regulatory_specific LLM-judged: clause named, not just framework?

causal_chain_specific LLM-judged: mechanism named, not category?

domain_scoping_correct deterministic: catalogue-level attribute scope

instance_domain_scoping_correct deterministic: instance-level applicability

Example output (all pass except one):

schema_valid ✓

severity_calibrated ✓

regulatory_specific ✓

causal_chain_specific: revise

⚠ note: "causal_chain states 'regulatory risk' without naming the MAS supervisory notification window. Cite MAS Notice SFA 04-N02 paragraph 6.2 or omit."

All 6 pass → /review/pending/ · Any fail → return to Agent 2 (max 2 cycles) · Fail after 2 cycles → human flag
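
The routing rule above can be written as a small pure function. This is a sketch under assumptions: the return labels and the function name are hypothetical, and `cycles_used` is taken to mean the number of revision cycles already spent on the draft.

```python
MAX_REVISION_CYCLES = 2

DIMENSIONS = (
    "schema_valid",
    "severity_calibrated",
    "regulatory_specific",
    "causal_chain_specific",
    "domain_scoping_correct",
    "instance_domain_scoping_correct",
)


def route(scores: dict[str, bool], cycles_used: int) -> str:
    """Decide where a critic-reviewed draft goes next."""
    # A dimension missing from the scores counts as a failure.
    if all(scores.get(dim, False) for dim in DIMENSIONS):
        return "review_pending"       # queued under /review/pending/
    if cycles_used < MAX_REVISION_CYCLES:
        return "return_to_agent_2"    # another revision cycle is allowed
    return "human_flag"               # revision budget exhausted


all_pass = {dim: True for dim in DIMENSIONS}
one_fail = {**all_pass, "causal_chain_specific": False}
```

With the example output shown above, `route(one_fail, 0)` sends the draft back to the Impact Mapper, and the same failure after two cycles escalates to a human.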

Human Review CLI

Gate: the only path in

No agent output reaches the rules engine without a human setting meta.status = "approved". Rules with any other status are invisible to the engine.
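
One way to picture this gate is as a filter applied when rules are loaded. The sketch below assumes each rule is a dictionary with a nested `meta.status` field, as named above; the function name and the sample statuses other than "approved" are hypothetical.

```python
def visible_rules(rules: list[dict]) -> list[dict]:
    """Keep only rules a human has approved; everything else is invisible."""
    return [
        rule for rule in rules
        if rule.get("meta", {}).get("status") == "approved"
    ]


rules = [
    {"rule_id": "a", "meta": {"status": "approved"}},
    {"rule_id": "b", "meta": {"status": "pending"}},   # hypothetical status
    {"rule_id": "c"},                                  # no meta block at all
]
loaded = visible_rules(rules)
```

Treating a missing `meta` block the same as an unapproved status keeps the filter fail-closed: a rule is only loaded when approval is explicit.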

What the reviewer sees

rule_id "hallucination_rate__autonomous__compliance_officer__financial_services"

severity_at_breach "critical"

reversibility 0.34 (computed: 1÷(1+0.9×0.9))

critic_scores all_pass

nearest_golden "hallucination_rate__autonomous__compliance_officer"

Approve → /ontology/impacts/ · Reject → /review/rejected/ · Annotate + re-queue

Rules engine → LLM narrator

Output: Scored Results (per persona)

The engine fires deterministically: DNA filter first, then Vs scoring per persona. The LLM's only job is to translate the structured causal_chain into stakeholder language. All reasoning is already done.
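
The two-step order described above (filter by DNA, then score per persona) might look like the sketch below. It is illustrative only: the rule fields `persona` and `dna`, the exact-match filter, and the default Mr of 1.0 are assumptions, and the numbers in the sample rules are placeholders rather than real rule values.

```python
def dna_matches(rule_dna: dict, system_dna: dict) -> bool:
    """A rule applies when every DNA value it declares matches the system's."""
    return all(system_dna.get(axis) == value for axis, value in rule_dna.items())


def fire(rules: list[dict], system_dna: dict, personas: list[str],
         mr: float = 1.0) -> dict[str, float]:
    """Deterministic pass: DNA filter first, then Vs per requested persona."""
    results = {}
    for rule in rules:
        if rule["persona"] not in personas:
            continue
        if not dna_matches(rule["dna"], system_dna):
            continue
        s = rule["scoring_block"]
        # Vs = (Qa * Ws * Csys) * R * Mr
        results[rule["persona"]] = (s["Qa"] * s["Ws"] * s["Csys"]) * s["R"] * mr
    return results


system_dna = {"agency_level": "autonomous", "domain": "financial_services"}
rules = [
    {"persona": "compliance_officer",
     "dna": {"agency_level": "autonomous"},
     "scoring_block": {"Qa": 1.0, "Ws": 0.5, "Csys": 1.0, "R": 1.0}},
    {"persona": "quality_lead",
     "dna": {"agency_level": "assistive"},   # hypothetical value, filtered out
     "scoring_block": {"Qa": 1.0, "Ws": 0.5, "Csys": 1.0, "R": 1.0}},
]
scores = fire(rules, system_dna, ["compliance_officer", "quality_lead"])
```

No model call appears anywhere in this pass; the narrator would only receive the already-scored results.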

Compliance Officer

Vs 0.87

Critical

regulatory · individual accountability

Business Owner

Vs 0.74

Critical

commercial · reputational

Quality Lead

Vs 0.61

High

measurement integrity · failure classification

Product Owner

Vs 0.55

High

user experience · operational

→ The narrative is generated from structured data: every figure and regulatory reference traces back to an auditable rule field. The LLM does not reason. It translates.
