Item status
Live
In progress
Planned
Blocked
Edge axes
Wired · Flowing · Conforming · Complete
Systems
Scout
Exploratory testing agent
AI
3
2 live, 1 progress
Features
12
5 live, 3 progress, 4 planned
Config
8
all live
Probes Corpus Coach with persona bots, raises flags. Live in prod.
Corpus Coach
Subject under test (RAG)
AI
4
3 live, 1 planned
Features
5
4 live, 1 progress
Config
6
all live
Finance-domain mock RAG. Stand-in target for Scout + signal source for Crystal Ball.
Crystal Ball
Dashboard + threshold gate
AI
1
optional, planned
Features
8
4 live, 2 progress, 2 planned
Config
5
3 live, 1 progress, 1 planned
Deterministic spine. Optional viewer — customers may BYO dashboard.
Engine (AIQIDE)
Core value — scoring, DNA, narrative
AI
3
all live
Features
6
3 live, 1 progress, 2 planned
Config
5
3 live, 2 progress (r1)
Scoring + System DNA (incl. arch_pattern r1) + narrative + audit. The product.
Edges — data flow + adapter health
Scout → Corpus Coach
Probe channel. Scout sends crafted queries.
Wired · Flowing · Conforming 99% · Complete 5/5
Corpus Coach → Crystal Ball
Eval signal feed. Scores per attribute.
Wired · Flowing · Conforming 87% · Complete 8/12
Scout → Crystal Ball
Flag dispatch. Scout findings.
Wired · Flowing · Conforming 100% · Complete 4/4
Crystal Ball ↔ Engine
Breach → Engine; narrative → CB.
Wired · Flowing · Conforming 92% · Complete 6/8
External adapters — engine outbound, customer integrations
Engine → Customer dashboard
BYO-dashboard adapter. JSON output.
Wired · Flowing · Conforming 100% · Complete 3/5
Engine → Slack / Teams
Persona narrative push to channels.
Wired (planned) · Flowing · Conforming — · Complete 0/3
Crystal Ball → CSV / Data export
Audit trail extraction.
Wired (planned) · Flowing · Conforming — · Complete 0/2
Corpus Coach ← External corpora
Customer corpus ingest.
Wired (partial) · Flowing (broken) · Conforming — · Complete 1/4
Detail
Click any tile or edge above
Drilldown shows items by category (AI / Features / Config) or edge contract details.
Reading the view. Tiles = systems. Pills show item counts per category and aggregate health. Edges show adapter health on four axes: Wired, Flowing, Conforming, Complete. Plumbing analogy: the pipe is there (Wired), water is flowing (Flowing), it is the right water (Conforming), and there is enough of it (Complete).
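The four axes can be sketched as a small record plus a check that reports the first axis that is not fully satisfied. This is a minimal illustration only: `EdgeHealth`, `first_gap`, the field names, and the rule that "fully satisfied" means 100% conforming and all contract items delivered are assumptions, not the product's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EdgeHealth:
    """One edge on the four axes. Field names are illustrative, not the product schema."""
    name: str
    wired: bool                  # pipe is there
    flowing: bool                # water is flowing
    conforming: Optional[float]  # right water: share of payloads matching the contract (None = unmeasured)
    delivered: int               # enough water: contract items delivered ...
    expected: int                # ... out of contract items expected


def first_gap(edge: EdgeHealth) -> Optional[str]:
    """Return the first axis that is not fully satisfied, or None if all four hold.

    Assumption: axes are checked in pipeline order, and "fully satisfied"
    means 100% conforming and delivered == expected.
    """
    if not edge.wired:
        return "Wired"
    if not edge.flowing:
        return "Flowing"
    if edge.conforming is None or edge.conforming < 1.0:
        return "Conforming"
    if edge.delivered < edge.expected:
        return "Complete"
    return None


# Values copied from the edge pills in this view.
scout_to_cb = EdgeHealth("Scout → Crystal Ball", True, True, 1.00, 4, 4)
coach_to_cb = EdgeHealth("Corpus Coach → Crystal Ball", True, True, 0.87, 8, 12)
engine_to_dash = EdgeHealth("Engine → Customer dashboard", True, True, 1.00, 3, 5)
# Planned edge: not wired yet, so the later axes are never reached.
engine_to_slack = EdgeHealth("Engine → Slack / Teams", False, False, None, 0, 3)
```

Checking in pipeline order keeps the report actionable: an unwired edge is reported as "Wired" rather than as a conformance or completeness problem.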
Quality bands
Good ≥ 0.85
Watch 0.70 to < 0.85
Poor < 0.70
Unknown
N/A
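The band legend above can be read as a simple classifier. The sketch below is one possible reading: the function name, the `applicable` flag, and the choice to treat a missing score as Unknown are assumptions; only the 0.85 and 0.70 cut-offs come from the legend.

```python
from typing import Optional

GOOD_MIN = 0.85   # from the legend: Good >= 0.85
WATCH_MIN = 0.70  # from the legend: Poor < 0.70


def quality_band(score: Optional[float], applicable: bool = True) -> str:
    """Map a 0..1 score to a band name.

    Assumptions: `applicable=False` means the metric does not apply (N/A),
    and `score=None` means it applies but has not been measured (Unknown).
    """
    if not applicable:
        return "N/A"
    if score is None:
        return "Unknown"
    if score >= GOOD_MIN:
        return "Good"
    if score >= WATCH_MIN:
        return "Watch"
    return "Poor"
```

With the composites shown in this view, 0.86 (Engine) lands in Good, while 0.81, 0.79 and 0.84 land in Watch.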
Edge quality
Fidelity · Freshness · Calibration · Information
Systems — quality of measured set
Scout
Exploratory testing agent
0.81 Composite
avg of 8 measured quality signals
Corpus Coach
Subject under test (RAG)
0.79 Composite
avg of 4 measured RAG attributes — material set TBD
Crystal Ball
Dashboard + threshold gate
0.84 Composite
avg of 5 product + governance metrics
Engine (AIQIDE)
Core value — quality of audit
0.86 Composite
meta — how good is the audit Engine produces
Coverage Gap Audit — Corpus Coach
System DNA: supervised · advisory · external_regulated_consumer · financial_services · personal | architecture_pattern: rag (sibling field, r1 pending deploy)
Measured today (4)
- 0.91 faithfulness
- 0.74 retrieval_relevancy (threshold 0.80)
- 0.69 context_precision (threshold 0.75)
- 0.83 response_groundedness
Required, not yet wired (5+)
- missing context_recall
- missing answer_relevancy
- missing fairness_subgroup
- partial bias_demographic (instrument exists, not running)
- missing prompt_injection_resistance
Unknown — pending r1 derivation
- tbd Calibration thresholds per DNA combo
- tbd Architecture-specific attr promotions
- tbd Regulatory-specific extensions (MAS FEAT)
Honesty. The "Required" set above is an informed estimate. The defensible material attribute set requires the System DNA + architecture_pattern derivation (r1 decision, pending deploy a3ce43f) plus the action_type × architecture_pattern cell matrix exercise. Until both land, the gap register can be sketched but not signed off. The gap register is the artefact the Coverage Gap Audit produces: manual today, deterministic once r1 lands.
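A sketched gap register for Corpus Coach might be assembled as follows. The scores, thresholds, and statuses are the ones shown in this panel; the function name, the `below_threshold` label, and the dictionary layout are hypothetical, and the "required" set remains the informed estimate described above rather than a derived one.

```python
from typing import Dict, List, Tuple

# Measured today (values from the panel).
measured: Dict[str, float] = {
    "faithfulness": 0.91,
    "retrieval_relevancy": 0.74,
    "context_precision": 0.69,
    "response_groundedness": 0.83,
}

# Only the thresholds the panel actually shows; the others are not stated.
thresholds: Dict[str, float] = {
    "retrieval_relevancy": 0.80,
    "context_precision": 0.75,
}

# Required but not yet wired (informed estimate, "5+").
required_status: Dict[str, str] = {
    "context_recall": "missing",
    "answer_relevancy": "missing",
    "fairness_subgroup": "missing",
    "bias_demographic": "partial",  # instrument exists, not running
    "prompt_injection_resistance": "missing",
}


def gap_register(
    measured: Dict[str, float],
    thresholds: Dict[str, float],
    required_status: Dict[str, str],
) -> List[Tuple[str, str]]:
    """List (attribute, gap) pairs: threshold breaches first, then unwired attributes."""
    rows: List[Tuple[str, str]] = []
    for attr, score in measured.items():
        limit = thresholds.get(attr)
        if limit is not None and score < limit:
            rows.append((attr, "below_threshold"))  # hypothetical label
    for attr, status in required_status.items():
        if attr not in measured:
            rows.append((attr, status))
    return rows


register = gap_register(measured, thresholds, required_status)
```

Here `register` holds seven rows: two threshold breaches (retrieval_relevancy, context_precision) followed by the five unwired attributes.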
Edges — quality of data flowing
Fidelity (signal arrives intact), Freshness (data actionable), Calibration (thresholds informed by signal scale), Information (does the edge carry enough to act on)
Scout → Corpus Coach
Probe channel.
Fidelity 100% · Freshness <200ms · Calibration n/a · Information 5/5
Corpus Coach → Crystal Ball
Eval signal feed.
Fidelity 87% · Freshness <1s · Calibration partial · Information 4/12+
Scout → Crystal Ball
Flag dispatch.
Fidelity 100% · Freshness <500ms · Calibration n/a · Information 4/4
Crystal Ball ↔ Engine
Breach ↔ narrative.
Fidelity 92% · Freshness <2s · Calibration partial · Information 6/8
Detail
Click any tile or edge above
Drilldown: per-attribute quality, threshold, source, trend.
Reading the view. Composite scores are averaged over the measured set only; they do not claim coverage of the material set. The Coverage Gap Audit panel makes the gap explicit. The Engine quality lens is meta: it rates how good the audit Engine produces is, not the system being audited.
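To make the distinction concrete, the composite and the coverage of the measured set can be computed side by side. This is a sketch under stated assumptions: the function name is hypothetical, and the material-set size of 9 (4 measured plus the 5 listed as required) is a lower bound, since the panel lists "5+" required attributes and the material set is still TBD.

```python
from statistics import mean
from typing import Dict, Tuple


def composite_and_coverage(
    measured: Dict[str, float], material_count: int
) -> Tuple[float, float]:
    """Return (composite over measured attributes, share of the material set measured)."""
    composite = round(mean(measured.values()), 2)
    coverage = len(measured) / material_count
    return composite, coverage


# Corpus Coach scores as shown in the Coverage Gap Audit panel.
corpus_coach = {
    "faithfulness": 0.91,
    "retrieval_relevancy": 0.74,
    "context_precision": 0.69,
    "response_groundedness": 0.83,
}

# Assumption: 4 measured + 5 required-but-unwired = 9 material attributes (lower bound).
composite, coverage = composite_and_coverage(corpus_coach, material_count=9)
```

The composite reproduces the 0.79 shown on the Corpus Coach tile, while coverage is at most 4/9 (about 44%) of the estimated material set.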
Why the gap matters. A high composite score over a small measured set is the canonical "looks fine, isn't" failure mode. Scoring 0.91 on faithfulness while not measuring fairness_subgroup is a reassuring number that hides the regulatory exposure. The view forces both numbers into the same eyeline.
Quality band per phase + feature
Good — meeting target
Watch — below target / known gap
Poor — failing
Unknown / planned
User journey — five phases
Click a phase to drill into how three personas (Compliance officer / CTO / Business owner) experience it, what features support it, and what the non-AI version looks like.
Good
1
Arrive
"Something needs my attention — or I'm here for routine review."
User does
- Opens alert / email / Slack
- Logs in to dashboard
- Joins scheduled review
What they see
- Notification with one-line summary
- Login + workspace picker
- Welcome with last-visited state
Outcome: Situational awareness. Knows context.
Watch
2
Scan
"Where do I focus today across my portfolio?"
User does
- Reviews portfolio summary
- Filters by risk / project / status
- Sorts by trend or breach severity
What they see
- All projects with quality bands
- Trend lines per key dimension
- Latest run + breach indicators
Outcome: Knows where to dig.
Watch
3
Drill
"What's the state of one project I care about?"
User does
- Opens a project
- Reviews quality dimensions
- Reads recent runs + breaches
What they see
- Project quality scorecard
- Per-dimension trend chart
- Open breaches with recency
Outcome: Understands one system's posture.
Good
4
Diagnose
"What's wrong, what does it mean, and why does it matter to me?"
User does
- Opens a finding card
- Reads narrative in own language
- Inspects evidence + regulatory mapping
What they see
- Plain-language explanation
- Sample failures + raw signal
- Linked regulatory clauses
Outcome: Decision-grade understanding.
Watch
5
Act
"Closes the loop — what gets done, by whom, with what evidence."
User does
- Requests new rule / escalates
- Assigns action / mutes / accepts
- Jumps to external eval tool
- Exports evidence pack for audit
What they see
- Action menu + status
- External tool deep-link
- Audit-pack download
- Annotation history
Outcome: Loop closed. Trail recorded.
Phase detail
Click a phase above
Personas, supporting features, non-AI baseline, quality questions answered.
Reading the view. Business-user lens. Engines, rule libraries, narrative LLMs, RAG infrastructure are hidden. The user experiences capabilities. AI is the delivery mechanism for some capabilities, not the capability itself. Each phase's "without AI" note shows what the deterministic baseline looks like.