System implementation progress

What's built, what's adapter-healthy, what's still in flight.
Views: Status view · Quality view · Journey view
Item status: Live · In progress · Planned · Blocked
Edge axes: Wired · Flowing · Conforming · Complete

Scout

Exploratory testing agent
AI (3): 2 live, 1 progress
Features (12): 5 live, 3 progress, 4 planned
Config (8): all live
Probes Corpus Coach with persona bots, raises flags. Live in prod.

Corpus Coach

Subject under test (RAG)
AI (4): 3 live, 1 planned
Features (5): 4 live, 1 progress
Config (6): all live
Finance-domain mock RAG. Stand-in target for Scout + signal source for Crystal Ball.

Crystal Ball

Dashboard + threshold gate
AI (1): optional, planned
Features (8): 4 live, 2 progress, 2 planned
Config (5): 3 live, 1 progress, 1 planned
Deterministic spine. Optional viewer — customers may BYO dashboard.

Engine (AIQIDE)

Core value — scoring, DNA, narrative
AI (3): all live
Features (6): 3 live, 1 progress, 2 planned
Config (5): 3 live, 2 progress (r1)
Scoring + System DNA (incl. arch_pattern r1) + narrative + audit. The product.
Scout → Corpus Coach
Probe channel. Scout sends crafted queries.
Wired · Flowing · Conforming 99% · Complete 5/5
Corpus Coach → Crystal Ball
Eval signal feed. Scores per attribute.
Wired · Flowing · Conforming 87% · Complete 8/12
Scout → Crystal Ball
Flag dispatch. Scout findings.
Wired · Flowing · Conforming 100% · Complete 4/4
Crystal Ball ↔ Engine
Breach → Engine; narrative → CB.
Wired · Flowing · Conforming 92% · Complete 6/8
Engine → Customer dashboard
BYO-dashboard adapter. JSON output.
Wired · Flowing · Conforming 100% · Complete 3/5
Engine → Slack / Teams
Persona narrative push to channels.
Wired (planned) · Flowing · Conforming n/a · Complete 0/3
Crystal Ball → CSV / Data export
Audit trail extraction.
Wired (planned) · Flowing · Conforming n/a · Complete 0/2
Corpus Coach ← External corpora
Customer corpus ingest.
Wired (partial) · Flowing (broken) · Conforming n/a · Complete 1/4

Click any tile or edge above

Drilldown shows items by category (AI / Features / Config) or edge contract details.
Reading the view. Tiles = systems. Pills show item counts per category and aggregate health. Edges show adapter health on four axes: Wired, Flowing, Conforming, Complete. Plumbing analogy: pipe there / water flowing / right water / enough water.
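The four edge axes can be modelled as a small record. The sketch below is illustrative only: the `EdgeHealth` class, its field names, and the rule that an axis counts as "open" when Conforming is below 100% or Complete is short of its total are assumptions, not the dashboard's actual schema. The two sample edges use the values shown in the edge list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EdgeHealth:
    """Hypothetical record for one edge; field names are assumptions."""
    name: str
    wired: bool                  # pipe there
    flowing: bool                # water flowing
    conforming: Optional[float]  # right water: fraction 0..1, None if unmeasured
    done: int                    # enough water: numerator of the Complete pill
    total: int                   # denominator of the Complete pill

    def open_axes(self) -> List[str]:
        """Return the axes that are not yet fully satisfied (assumed rule)."""
        axes = []
        if not self.wired:
            axes.append("Wired")
        if not self.flowing:
            axes.append("Flowing")
        if self.conforming is None or self.conforming < 1.0:
            axes.append("Conforming")
        if self.done < self.total:
            axes.append("Complete")
        return axes


probe = EdgeHealth("Scout → Corpus Coach", True, True, 0.99, 5, 5)
byo = EdgeHealth("Engine → Customer dashboard", True, True, 1.00, 3, 5)

print(probe.open_axes())  # ['Conforming']
print(byo.open_axes())    # ['Complete']
```

Keeping the axes separate, rather than folding them into one score, mirrors the plumbing analogy: a fully conforming edge can still be incomplete.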
Quality bands: Good ≥ 0.85 · Watch 0.70 – 0.85 · Poor < 0.70 · Unknown N/A
Edge quality: Fidelity · Freshness · Calibration · Information
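As a minimal sketch, the quality bands can be expressed as a lookup. The function name `quality_band` is hypothetical, and because the legend lists 0.85 under both Good and Watch, the code assumes the "≥" wins and a score of exactly 0.85 is Good.

```python
from typing import Optional


def quality_band(score: Optional[float]) -> str:
    """Map a 0..1 quality score to the legend's band names."""
    if score is None:
        return "Unknown"  # not measured (shown as N/A)
    if score >= 0.85:
        return "Good"
    if score >= 0.70:
        return "Watch"    # assumed half-open range [0.70, 0.85)
    return "Poor"


# Composite scores taken from the four tiles in this view.
for system, composite in [("Scout", 0.81), ("Corpus Coach", 0.79),
                          ("Crystal Ball", 0.84), ("Engine (AIQIDE)", 0.86)]:
    print(system, quality_band(composite))
```

Under that boundary assumption, only Engine's composite lands in Good; the other three sit in Watch.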

Scout

Exploratory testing agent
Composite: 0.81 (avg of 8 measured quality signals)
AI quality: 0.81
Feature quality: 0.88
Config quality: 0.85
2 unknown: Probe novelty + persona drift not yet measured

Corpus Coach

Subject under test (RAG)
Composite: 0.79 (avg of 4 measured RAG attributes; material set TBD)
AI quality: 0.79
Feature quality: 0.91
Config quality: 0.87
5+ gaps: Likely material attrs not measured (see panel)

Crystal Ball

Dashboard + threshold gate
Composite: 0.84 (avg of 5 product + governance metrics)
AI quality: (no value shown)
Feature quality: 0.87
Config quality: 0.78
Threshold cal limited: Defensible only where DNA classified

Engine (AIQIDE)

Core value — quality of audit
Composite: 0.86 (meta: how good is the audit Engine produces)
AI quality: 0.86
Feature quality: 0.90
Config quality: 0.75
r1 pending: Catalogue extension + arch_pattern unblock material set

Coverage Gap Audit — Corpus Coach

System DNA: supervised · advisory · external_regulated_consumer · financial_services · personal  |  architecture_pattern: rag (sibling field, r1 pending deploy)

Measured today (4)

  • 0.91 faithfulness
  • 0.74 retrieval_relevancy — threshold 0.80
  • 0.69 context_precision — threshold 0.75
  • 0.83 response_groundedness
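
The measured list above can be checked against its thresholds mechanically. This is a sketch under assumptions: the `below_threshold` helper and the dict layout are hypothetical, and attributes with no threshold shown in the panel are simply skipped rather than given an invented one.

```python
from typing import Dict, Optional, Tuple

# (score, threshold) pairs as shown in the panel; None = no threshold listed.
measured: Dict[str, Tuple[float, Optional[float]]] = {
    "faithfulness": (0.91, None),
    "retrieval_relevancy": (0.74, 0.80),
    "context_precision": (0.69, 0.75),
    "response_groundedness": (0.83, None),
}


def below_threshold(attrs: Dict[str, Tuple[float, Optional[float]]]) -> Dict[str, float]:
    """Return attribute -> shortfall for every score under its threshold."""
    return {
        name: round(threshold - score, 2)
        for name, (score, threshold) in attrs.items()
        if threshold is not None and score < threshold
    }


breaches = below_threshold(measured)
for name, shortfall in breaches.items():
    print(f"{name}: {shortfall:.2f} below threshold")
```

On the panel's numbers this flags retrieval_relevancy and context_precision, each 0.06 under its threshold.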

Required, not yet wired (5+)

  • missing context_recall
  • missing answer_relevancy
  • missing fairness_subgroup
  • partial bias_demographic — instrument exists, not running
  • missing prompt_injection_resistance

Unknown — pending r1 derivation

  • tbd Calibration thresholds per DNA combo
  • tbd Architecture-specific attr promotions
  • tbd Regulatory-specific extensions (MAS FEAT)
Honesty. The "Required" set above is an informed estimate. The defensible material attribute set requires the System DNA + architecture_pattern derivation (r1 decision, pending deploy a3ce43f) plus the action_type × architecture_pattern cell matrix exercise. Until both land, the gap register can be sketched but not signed off. The gap register is the artefact the Coverage Gap Audit produces: manual today, deterministic once r1 lands.
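Once a required set exists, the gap register reduces to a set difference. The sketch below uses the panel's own estimated "Required" list as a stand-in for the r1-derived material set, which does not exist yet; the `gap_register` function and its status labels are hypothetical.

```python
from typing import Dict, Set

measured: Set[str] = {
    "faithfulness", "retrieval_relevancy",
    "context_precision", "response_groundedness",
}

# Stand-in only: the panel's informed estimate, not the r1-derived set.
required_estimate: Set[str] = measured | {
    "context_recall", "answer_relevancy", "fairness_subgroup",
    "bias_demographic", "prompt_injection_resistance",
}

# Instrument exists but is not running.
partial: Set[str] = {"bias_demographic"}


def gap_register(required: Set[str], have: Set[str], partly: Set[str]) -> Dict[str, str]:
    """List every required attribute that is not measured, with its status."""
    return {
        attr: ("partial" if attr in partly else "missing")
        for attr in sorted(required - have)
    }


for attr, status in gap_register(required_estimate, measured, partial).items():
    print(f"{status:8} {attr}")
```

The function is deterministic for a given required set; what remains manual today is producing that set.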
Fidelity (signal arrives intact), Freshness (data actionable), Calibration (thresholds informed by signal scale), Information (does the edge carry enough to act on)
Scout → Corpus Coach
Probe channel.
Fidelity 100% · Freshness <200ms · Calibration n/a · Information 5/5
Corpus Coach → Crystal Ball
Eval signal feed.
Fidelity 87% · Freshness <1s · Calibration partial · Information 4/12+
Scout → Crystal Ball
Flag dispatch.
Fidelity 100% · Freshness <500ms · Calibration n/a · Information 4/4
Crystal Ball ↔ Engine
Breach ↔ narrative.
Fidelity 92% · Freshness <2s · Calibration partial · Information 6/8

Click any tile or edge above

Drilldown: per-attribute quality, threshold, source, trend.
Reading the view. Composite scores are averaged over the measured set only; they do not claim coverage of the material set. The Coverage Gap Audit panel makes the gap explicit. The Engine quality lens is meta: it rates how good the audit Engine produces is, not the system being audited.

Why the gap matters. A high composite score over a small measured set is the canonical "looks fine, isn't" failure mode. Scoring 0.91 on faithfulness while not measuring fairness_subgroup yields a reassuring number that hides the regulatory exposure. The view forces both numbers into the same eyeline.
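To make the point concrete, the composite and its coverage can be reported side by side. This is a rough sketch using the Corpus Coach figures from the Coverage Gap Audit panel; the `composite_and_coverage` helper is hypothetical, and the nine-attribute denominator is the panel's estimate ("5+" gaps), so the coverage figure is an upper bound.

```python
from statistics import mean
from typing import Dict, Optional, Tuple


def composite_and_coverage(scores: Dict[str, Optional[float]]) -> Tuple[float, float]:
    """Average the measured scores and report the share of attributes measured."""
    measured = [s for s in scores.values() if s is not None]
    return mean(measured), len(measured) / len(scores)


# None marks an attribute that is required (by estimate) but not measured.
corpus_coach: Dict[str, Optional[float]] = {
    "faithfulness": 0.91,
    "retrieval_relevancy": 0.74,
    "context_precision": 0.69,
    "response_groundedness": 0.83,
    "context_recall": None,
    "answer_relevancy": None,
    "fairness_subgroup": None,
    "bias_demographic": None,
    "prompt_injection_resistance": None,
}

composite, coverage = composite_and_coverage(corpus_coach)
print(f"composite {composite:.2f} over {coverage:.0%} of the estimated material set")
# composite 0.79 over 44% of the estimated material set
```

The 0.79 matches the Corpus Coach tile; the 44% is the number the tile alone does not show.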
Quality band per phase + feature: Good (meeting target) · Watch (below target / known gap) · Poor (failing) · Unknown / planned
Click a phase to drill into how three personas (Compliance officer / CTO / Business owner) experience it, what features support it, and what the non-AI version looks like.
Phase 1: Arrive (Good)
"Something needs my attention — or I'm here for routine review."
User does
  • Opens alert / email / Slack
  • Logs in to dashboard
  • Joins scheduled review
What they see
  • Notification with one-line summary
  • Login + workspace picker
  • Welcome with last-visited state
Outcome: Situational awareness. Knows context.
Phase 2: Scan (Watch)
"Where do I focus today across my portfolio?"
User does
  • Reviews portfolio summary
  • Filters by risk / project / status
  • Sorts by trend or breach severity
What they see
  • All projects with quality bands
  • Trend lines per key dimension
  • Latest run + breach indicators
Outcome: Knows where to dig.
Phase 3: Drill (Watch)
"What's the state of one project I care about?"
User does
  • Opens a project
  • Reviews quality dimensions
  • Reads recent runs + breaches
What they see
  • Project quality scorecard
  • Per-dimension trend chart
  • Open breaches with recency
Outcome: Understands one system's posture.
Phase 4: Diagnose (Good)
"What's wrong, what does it mean, and why does it matter to me?"
User does
  • Opens a finding card
  • Reads narrative in own language
  • Inspects evidence + regulatory mapping
What they see
  • Plain-language explanation
  • Sample failures + raw signal
  • Linked regulatory clauses
Outcome: Decision-grade understanding.
Phase 5: Act (Watch)
"Closes the loop — what gets done, by whom, with what evidence."
User does
  • Requests new rule / escalates
  • Assigns action / mutes / accepts
  • Jumps to external eval tool
  • Exports evidence pack for audit
What they see
  • Action menu + status
  • External tool deep-link
  • Audit-pack download
  • Annotation history
Outcome: Loop closed. Trail recorded.

Click a phase above

Personas, supporting features, non-AI baseline, quality questions answered.
Reading the view. Business-user lens. Engines, rule libraries, narrative LLMs, and RAG infrastructure are hidden; the user experiences capabilities. AI is the delivery mechanism for some capabilities, not the capability itself. Each phase's "without AI" note shows what the deterministic baseline looks like.