How a metric breach becomes a boardroom finding

From observed value to stakeholder narrative

Three agents, one human gate, one deterministic rules engine. The LLM only touches the final sentence.

Input: Breach Event (caller supplied)

Everything the engine needs arrives in a single request — the system's identity (DNA), the breach signal, and which personas need an assessment.

BreachEvent — POST /assess
System DNA
agency_level    autonomous
Axis — Agency Level
How much independent decision-making authority the system has. This is the first reasoning axis — evaluated before any quality metric. It drives the agency multiplier in Csys and anchors the reversibility formula R = 1÷(1+A×T).
assistive: A = 0.0
supervised: A = 0.5
autonomous (this system): A = 1.0
At A=1.0 combined with T=1.0, R = 0.5 — every impact score is halved. The system makes irreversible decisions without human gates.
action_type    transactional
Axis — Action Type
What kind of output the system produces and how reversible it is. Drives the transactionality value T in R = 1÷(1+A×T). Higher transactionality = harder to undo = lower R = amplified impact scores.
generative: T = 0.1
advisory: T = 0.3
operational: T = 0.6
filing: T = 0.8
transactional (this system): T = 1.0
Payments and credit decisions are hardest to reverse. With autonomous agency, this system sits at maximum reversibility penalty.
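The reversibility arithmetic above can be sketched in a few lines. This is a minimal illustration, not the engine's code: the lookup tables are copied from the two axis cards, while the function and dictionary names are assumptions.

```python
# Lookup tables copied from the agency_level and action_type cards above.
AGENCY = {"assistive": 0.0, "supervised": 0.5, "autonomous": 1.0}
TRANSACTIONALITY = {
    "generative": 0.1,
    "advisory": 0.3,
    "operational": 0.6,
    "filing": 0.8,
    "transactional": 1.0,
}


def reversibility(agency_level: str, action_type: str) -> float:
    """Compute R = 1 / (1 + A * T) from the two DNA values."""
    a = AGENCY[agency_level]
    t = TRANSACTIONALITY[action_type]
    return 1.0 / (1.0 + a * t)


# The system in this walkthrough: autonomous and transactional.
print(reversibility("autonomous", "transactional"))  # 0.5
# An assistive system has A = 0, so R stays at 1.0 whatever the action type.
print(reversibility("assistive", "transactional"))   # 1.0
```

Because A and T both top out at 1.0, R never drops below 0.5 with these tables.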
exposure_surface    external_regulated_consumer
Axis — Exposure Surface
Who the system's outputs reach, and under what regulatory regime. Drives the exposure multiplier in Csys and scopes which regulatory frameworks the Impact Mapper applies.
internal_private: no reg scope
internal_regulated: limited
external_b2b: contract scope
external_consumer: consumer scope
external_regulated_consumer (this system): full MAS / 1.3×
Triggers MAS FEAT, MAS TRM, and PDPA lookups. Sets the highest exposure multiplier (1.3×) in Csys.
domain         financial_services
Axis — Domain (optional)
The industry sector. Optional — when supplied, scopes regulatory reference lookups to sector-specific frameworks. A rule without a domain is treated as universal.
financial_services (this system): MAS, PDPA
healthcare: MOH, PDPA
legal: Legal Profession Act
hr_workforce: Tripartite Guidelines
general_enterprise: no domain-specific reg
public_sector: IMDA AI Gov. Framework
data_sensitivity sensitive_personal
Axis — Data Sensitivity (optional)
The sensitivity classification of data the system processes. When supplied, affects PDPA breach notification obligations and persona consequence framing. Validated at API ingestion — unknown values raise HTTP 422.
public: no PDPA obligations
internal: minimal obligations
personal: PDPA applies
sensitive_personal (this system): PDPA + 3-day notify
PDPA sets a mandatory breach notification window of 3 days. This is the most restrictive classification: route all incidents to the compliance_officer persona first.
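The ingestion check described for data_sensitivity might look like the sketch below. The allowed values come from the card above; the error class and function name are hypothetical, and no particular web framework is assumed.

```python
from typing import Optional

# Allowed values, copied from the data_sensitivity card above.
ALLOWED_DATA_SENSITIVITY = ("public", "internal", "personal", "sensitive_personal")


class IngestionError(Exception):
    """Hypothetical error type carrying the HTTP status the API would return."""

    def __init__(self, message: str, status_code: int = 422) -> None:
        super().__init__(message)
        self.status_code = status_code


def validate_data_sensitivity(value: Optional[str]) -> Optional[str]:
    """Return the value unchanged if valid. The axis is optional, so None passes."""
    if value is None:
        return None
    if value not in ALLOWED_DATA_SENSITIVITY:
        raise IngestionError(f"unknown data_sensitivity: {value!r}")
    return value


print(validate_data_sensitivity("sensitive_personal"))
```

Rejecting unknown values at the boundary keeps a typo from silently dropping the PDPA notification obligation further down the pipeline.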
Breach Signal
quality_attribute hallucination_rate
measured_value   0.09  (9%)
threshold_value  0.03  (3%)
prior_breach_count  1
target_personas
compliance_officer
quality_lead
product_owner
business_owner
The threshold is not DNA-dependent — it's caller-supplied from your eval tooling or quality policy. AIQIDE determines the consequence of breaching it, not what the threshold should be.
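Put together, the request body for this walkthrough could be assembled as below. The field names and values are the ones shown in the cards; the nesting under system_dna and the class names are assumptions, since the actual request schema is not shown here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class SystemDNA:
    agency_level: str
    action_type: str
    exposure_surface: str
    domain: Optional[str] = None            # optional axis
    data_sensitivity: Optional[str] = None  # optional axis


@dataclass
class BreachEvent:
    system_dna: SystemDNA  # nesting is an assumption
    quality_attribute: str
    measured_value: float
    threshold_value: float
    prior_breach_count: int
    target_personas: List[str]


event = BreachEvent(
    system_dna=SystemDNA(
        agency_level="autonomous",
        action_type="transactional",
        exposure_surface="external_regulated_consumer",
        domain="financial_services",
        data_sensitivity="sensitive_personal",
    ),
    quality_attribute="hallucination_rate",
    measured_value=0.09,
    threshold_value=0.03,
    prior_breach_count=1,
    target_personas=[
        "compliance_officer",
        "quality_lead",
        "product_owner",
        "business_owner",
    ],
)

# JSON body that would be sent to POST /assess.
body = json.dumps(asdict(event), indent=2)
print(body)
```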
Agent 1: Telemetry Architect (catalogue validation)

Runs before the impact pipeline. Confirms the attribute is well-defined, has a declared metric scale, and has measurement tooling assigned. Blocks bad inputs rather than letting them produce silently wrong scores.

Reads from ontology
quality_attributes.yaml — v2.1, 22 attributes
system_dna.yaml       — DNA vocabulary
impact_rule.schema.json — v2.1, authoritative
Enriched attribute record — added to payload
+ metric_scale     { min:0, max:1, direction:"lower_is_better", unit:"ragas_score" }
+ measurement_tools [ "RAGAS", "DeepEval" ]
+ applicable_domains [ "financial_services", "healthcare" ]
Blocks these anti-patterns
Duplicate attributes — Jaccard dedup gate
Missing metric_scale — engine cannot normalise without it
Unknown DNA ids in applicable_system_types
Saturated quadrant — exits to no_proposal if nothing new to add
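A Jaccard dedup gate of the kind named above can be sketched as follows. What the gate actually compares (names, descriptions, or something else) and its cut-off are not stated here, so the tokenisation and the 0.8 threshold are assumptions for illustration only.

```python
import re
from typing import Iterable, Set


def tokens(text: str) -> Set[str]:
    """Lower-case and split on anything that is not a letter or digit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|. Two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_duplicate(candidate: str, existing: Iterable[str], threshold: float = 0.8) -> bool:
    """True if the candidate is at least `threshold` similar to any existing entry."""
    cand = tokens(candidate)
    return any(jaccard(cand, tokens(name)) >= threshold for name in existing)


catalogue = ["hallucination_rate", "toxicity_rate"]
print(is_duplicate("Hallucination Rate", catalogue))  # True
print(is_duplicate("latency_p95", catalogue))         # False
```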
Agent 2: Impact Mapper (core reasoning)

This is where a technical breach becomes business risk. The Impact Mapper takes the enriched payload and applies chain-of-thought reasoning in DNA axis order: agency before metric. It produces a schema-valid Impact Rule JSON per persona.

Reasoning axis order — agency evaluated before metric delta
1. agency_level      autonomous = highest Csys multiplier (1.4×)
2. action_type       transactional = irreversible (T = 1.0)
3. exposure_surface  external_regulated_consumer → MAS jurisdiction
4. Qa metric delta   (0.09 - 0.03) normalised → ~0.97
Vs = ( Qa × Ws × Csys ) × R × Mr
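The formula is a plain product, which the sketch below makes explicit. Qa ≈ 0.97 and R = 0.5 are taken from this walkthrough; the values used for Ws, Csys, and Mr are placeholders chosen only to make the arithmetic visible (Mr is not defined in this section, and how the agency and exposure multipliers combine into Csys is not shown), so the result is not meant to reproduce the published per-persona scores.

```python
def vs_score(qa: float, ws: float, csys: float, r: float, mr: float = 1.0) -> float:
    """Vs = (Qa * Ws * Csys) * R * Mr, exactly as written in the formula above."""
    return (qa * ws * csys) * r * mr


example = vs_score(
    qa=0.97,   # normalised metric delta, from the reasoning list above
    ws=1.0,    # placeholder persona weight (assumption)
    csys=1.4,  # placeholder: agency multiplier only (assumption)
    r=0.5,     # reversibility for autonomous + transactional
    mr=1.0,    # placeholder, Mr is not defined in this section (assumption)
)
print(round(example, 3))  # 0.679
```

Since every term is a multiplier, halving R halves Vs regardless of the other inputs.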
Produces — Impact Rule JSON (per persona)
+ rule_id             "hallucination_rate__autonomous__compliance_officer__financial_services"
+ severity_at_breach  "critical"
+ causal_chain       mechanism named, not category
+ regulatory_references  clause named, not just framework
+ scoring_block      { Qa, Ws, Csys, R }
+ narrative_template  placeholder tokens only — no LLM yet
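The rule_id shown above appears to join attribute, agency level, persona, and domain with double underscores. The sketch below encodes that reading; the pattern is inferred from the example ids on this page rather than from a documented naming rule, and the function name is hypothetical.

```python
from typing import Optional


def build_rule_id(
    quality_attribute: str,
    agency_level: str,
    persona: str,
    domain: Optional[str] = None,
) -> str:
    """Join the parts with '__'. The domain is omitted for universal rules."""
    parts = [quality_attribute, agency_level, persona]
    if domain:
        parts.append(domain)
    return "__".join(parts)


print(build_rule_id("hallucination_rate", "autonomous", "compliance_officer", "financial_services"))
print(build_rule_id("hallucination_rate", "autonomous", "compliance_officer"))
```

Leaving the domain off yields an id without a sector suffix, which fits the statement that a rule without a domain is treated as universal.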
Anti-patterns it must avoid
Generic causal chain — must name the mechanism, not the category
Framework-level refs — must cite specific clause (MAS FEAT Principle 3.2, not "MAS FEAT")
Severity inflation — not everything is Critical
Wrong domain scoping — MAS FEAT for an HR rule is specific and wrong
Agent 3: Calibration Critic (quality gate)

Reviews the Impact Mapper's draft. Scores on six dimensions. Returns for revision (max 2 cycles) or flags for human review. Specificity ≠ correctness — the critic catches wrong-clause citations, not just absent ones.

Six scored dimensions
schema_valid                     deterministic: passes impact_rule.schema.json?
severity_calibrated              LLM-judged: consistent with golden rule anchors?
regulatory_specific              LLM-judged: clause named, not just framework?
causal_chain_specific            LLM-judged: mechanism named, not category?
domain_scoping_correct           deterministic: catalogue-level attribute scope
instance_domain_scoping_correct  deterministic: instance-level applicability

Example output — all pass except one:

schema_valid ✓
severity_calibrated ✓
regulatory_specific ✓
causal_chain_specific — revise
note: "causal_chain states 'regulatory risk' without naming the MAS supervisory notification window. Cite MAS Notice SFA 04-N02 paragraph 6.2 or omit."
All 6 pass → /review/pending/  ·  Any fail → return to Agent 2 (max 2 cycles)  ·  Fail after 2 cycles → human flag
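The routing rule above reduces to a small decision function. This is a sketch under assumptions: the return labels other than /review/pending/ are hypothetical, and cycles_used is taken to mean the number of revision cycles already completed.

```python
from typing import Dict

MAX_REVISION_CYCLES = 2  # "max 2 cycles", from the routing line above


def route_draft(scores: Dict[str, bool], cycles_used: int) -> str:
    """Decide where a critic-scored draft goes next."""
    if all(scores.values()):
        return "/review/pending/"   # all dimensions pass: queue for human review
    if cycles_used < MAX_REVISION_CYCLES:
        return "return_to_agent_2"  # hypothetical label: send back for revision
    return "human_flag"             # hypothetical label: still failing after 2 cycles


scores = {
    "schema_valid": True,
    "severity_calibrated": True,
    "regulatory_specific": True,
    "causal_chain_specific": False,  # the failing dimension in the example above
    "domain_scoping_correct": True,
    "instance_domain_scoping_correct": True,
}
print(route_draft(scores, cycles_used=0))  # return_to_agent_2
print(route_draft(scores, cycles_used=2))  # human_flag
```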
Human Review CLI
Gate: Human Review (the only path in)

No agent output reaches the rules engine without a human setting meta.status = "approved". Rules with any other status are invisible to the engine.
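The approval gate can be pictured as a filter the engine applies before it loads any rule. A minimal sketch, assuming rules are plain dictionaries with a nested meta.status field; the function name and the non-approved status values are hypothetical.

```python
from typing import Any, Dict, List


def visible_rules(rules: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only rules whose meta.status is exactly "approved"."""
    return [r for r in rules if r.get("meta", {}).get("status") == "approved"]


rules = [
    {"rule_id": "rule_a", "meta": {"status": "approved"}},
    {"rule_id": "rule_b", "meta": {"status": "pending"}},  # hypothetical status
    {"rule_id": "rule_c"},                                 # no meta at all
]
print([r["rule_id"] for r in visible_rules(rules)])  # ['rule_a']
```

Matching on the exact string means a missing or misspelled status fails closed: the rule stays invisible rather than firing unreviewed.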

What the reviewer sees
rule_id          "hallucination_rate__autonomous__compliance_officer__financial_services"
severity_at_breach  "critical"
reversibility    0.50 (computed: 1÷(1+1.0×1.0))
critic_scores   all_pass
nearest_golden  "hallucination_rate__autonomous__compliance_officer"
Approve → /ontology/impacts/  ·  Reject → /review/rejected/  ·  Annotate + re-queue
Rules engine → LLM narrator
Output: Scored Results (per persona)

The engine fires deterministically — DNA filter first, then Vs scoring per persona. The LLM's only job is to translate the structured causal_chain into stakeholder language. All reasoning is already done.

Persona             Vs    Severity  Tags
Compliance Officer  0.87  Critical  regulatory, individual accountability
Business Owner      0.74  Critical  commercial, reputational
Quality Lead        0.61  High      measurement integrity, failure classification
Product Owner       0.55  High      user experience, operational
The narrative is generated from structured data — every figure and regulatory reference traces back to an auditable rule field. The LLM does not reason. It translates.