Decision Architecture
The decision architecture, complete.
A decision moves through an architecture of seven parts, and the model is only one of the seven.
The Seven-Component Decision Architecture
Main Pipeline
1. Frame: guards the question being asked
2. Generator: the model producing candidates
3. Evaluation: independent grading, not self-grading
4. Gate: Accept, Return, or Escalate
Decision Paths from Gate
- Accept → Output delivered
- Return → Refinement Loop → Generator
- Escalate → Human Judgment
When Decisions Need More
5. Refinement Loop: iterates until quality threshold
6. Human Judgment: the irreducible call at the edges
Support Layer
7. Record: captures everything; history feeds back to Frame
The Seven Components
The architecture, not the model
Every component is necessary. None is sufficient alone.
Most AI systems are a single model with accessories. OrbisFramework is the seven-component decision architecture the model sits inside. The model produces candidates. Everything around it is what makes those candidates trustworthy enough for a high-stakes decision. Each component is external, inspectable, and replaceable. The techniques named on each card are how that component is built today. They will be superseded. The architecture they implement is what endures.
The clearest single example is CLIO, the cognitive loop via in-situ optimization. Rather than training reasoning into a model's weights, CLIO builds it as an external loop around an ordinary model: the system formulates its own approach, monitors its confidence as it works, adapts when confidence falls, and exposes the whole process to a person who can watch the uncertainty and interject. That is the philosophy in one technique: capability that lives in the architecture rather than the model, and stays inspectable because of it. CLIO appears on two cards below, the Refinement Loop and the Record, because one steerable loop touches both at once.
Frame
Guards against answering the wrong question
The Frame validates that the question is well-formed, in scope, and within the system's capabilities, then assembles the evidence the decision will rest on through retrieval (RAG) rather than the model's memory. It also pulls comparable past decisions from the Record, the History loop, so the case is informed by how like cases turned out rather than framed from scratch. A wrong question answered correctly is still a wrong answer.
Where the field converged: EGuR (Stein), the experience-guided reasoner, drawing on the Record as its case memory.
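The Frame's checks can be sketched in a few lines. This is a minimal illustration, not OrbisFramework's interface: `frame_question`, `FrameRejected`, `IN_SCOPE_TOPICS`, and the two callables `retrieve` and `similar_cases` are hypothetical names, and the "ends with a question mark" test is only a crude stand-in for real well-formedness validation.

```python
from dataclasses import dataclass, field

# Hypothetical scope list; a real deployment would define its own.
IN_SCOPE_TOPICS = {"credit", "claims"}


@dataclass
class Frame:
    question: str
    topic: str
    evidence: list = field(default_factory=list)
    precedents: list = field(default_factory=list)


class FrameRejected(Exception):
    """Raised when the question should not reach the generator."""


def frame_question(question, topic, retrieve, similar_cases):
    """Validate the question, then assemble evidence and precedents.

    `retrieve` stands in for a retrieval index and `similar_cases` for the
    Record's case memory; both are caller-supplied callables (assumptions).
    """
    # Crude placeholder for a well-formedness check.
    if not question.strip().endswith("?"):
        raise FrameRejected("not well-formed: expected a question")
    if topic not in IN_SCOPE_TOPICS:
        raise FrameRejected(f"out of scope: {topic}")
    evidence = retrieve(question)
    if not evidence:
        raise FrameRejected("no retrievable evidence to rest the decision on")
    return Frame(question, topic, evidence, similar_cases(question))
```

Rejecting early keeps a wrong question from ever reaching the generator, which is the point of placing the Frame first.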
Generator
The AI model producing candidates
This is what most people think of when they think of AI: the model that produces candidate outputs. It is one component of seven. The generator's job is to produce candidates, not to evaluate them and not to decide what to do with them.
Current techniques: self-consistency (Wang), best-of-N sampling (Cobbe), tree of thoughts (Yao), test-time compute scaling (Snell).
Evaluation
Guards against the generator grading its own exam
Outputs must be scored by criteria separate from the generation process. The generator cannot grade its own exam. Evaluation scores each output on multiple dimensions against documented thresholds, and the gate uses those scores to route the output.
Where the field converged: process supervision (Lightman), Constitutional AI (Bai), LLM-as-judge (Zheng), AI safety via debate (Irving), the Deliberative Reasoning Network, DRN (Xu).
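As a rough sketch of multi-dimensional scoring against documented thresholds: the dimension names, threshold values, and the `evaluate` function below are illustrative assumptions, and each grader is a caller-supplied callable that is presumed to be independent of the generator.

```python
from dataclasses import dataclass

# Hypothetical dimensions and thresholds, documented in one place.
THRESHOLDS = {"accuracy": 0.8, "grounding": 0.7, "policy": 0.9}


@dataclass(frozen=True)
class Evaluation:
    scores: dict   # dimension -> score in [0, 1]
    failed: tuple  # dimensions that fell below their threshold


def evaluate(candidate, graders):
    """Score a candidate with graders that are separate from the generator."""
    missing = set(THRESHOLDS) - set(graders)
    if missing:
        # Refuse to evaluate if any documented dimension has no grader.
        raise ValueError(f"no grader for: {sorted(missing)}")
    scores = {dim: graders[dim](candidate) for dim in THRESHOLDS}
    failed = tuple(d for d, s in scores.items() if s < THRESHOLDS[d])
    return Evaluation(scores, failed)
```

Returning the failed dimensions, not just a single number, gives the gate and the refinement loop something specific to act on.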
Gate
Guards against a system that can only pass or discard
The gate has three routes. Accept: the output meets threshold and proceeds. Return: the output fails threshold and goes back for refinement. Escalate: the output has reached an edge where human judgment is required. Escalation is a designed route, not a failure.
Where the field converged: conformal prediction (Vovk), model calibration (Kadavath), semantic entropy (Kuhn), certainty-guided reasoning, CGR (Nogueira), agentic uncertainty quantification, AUQ (Zhang).
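The three routes reduce to a small routing function. The sketch below is an assumption-laden illustration: `Route`, `gate`, and the `at_edge` and `attempts_left` parameters are hypothetical, and checking the edge before the score is one possible ordering, not one the architecture prescribes.

```python
from enum import Enum


class Route(Enum):
    ACCEPT = "accept"
    RETURN = "return"
    ESCALATE = "escalate"


def gate(score, threshold, at_edge, attempts_left):
    """Route an evaluated output along one of three paths."""
    if at_edge:
        # Assumption: an edge needs a human regardless of the score.
        return Route.ESCALATE
    if score >= threshold:
        return Route.ACCEPT
    if attempts_left > 0:
        return Route.RETURN
    # Out of retries: persistent failure goes to a human, not the bin.
    return Route.ESCALATE
```

Note that there is no "discard" branch: every output that is not accepted either gets another attempt or reaches a person.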
Refinement Loop
Iterates until quality threshold
Outputs that fail evaluation return for another attempt. The feedback is specific, not just 'try again.' Maximum retry and cost limits prevent infinite loops. Persistent failures escalate to humans rather than retrying forever.
Where the field converged: Self-Refine (Madaan), Reflexion (Shinn), chain-of-verification (Dhuliawala), CLIO (Cheng), and the documented limit of self-correction (Huang).
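A bounded loop with specific feedback might look like the following. This is a sketch under stated assumptions: `refine`, its default limits, and the callable signatures (`generate(feedback)` returning a candidate and its cost, `evaluate(candidate)` returning a score and feedback) are all hypothetical.

```python
def refine(generate, evaluate, threshold=0.8, max_attempts=3, max_cost=10.0):
    """Retry with specific feedback until the threshold or a limit is hit.

    generate(feedback) -> (candidate, cost)
    evaluate(candidate) -> (score, feedback)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback, spent = None, 0.0
    for attempt in range(1, max_attempts + 1):
        candidate, cost = generate(feedback)
        spent += cost
        score, feedback = evaluate(candidate)
        if score >= threshold:
            return {"status": "accepted", "candidate": candidate,
                    "attempts": attempt}
        if spent >= max_cost:
            break  # cost limit reached before the retry limit
    # Persistent failure escalates to a human rather than retrying forever.
    return {"status": "escalated", "candidate": candidate,
            "attempts": attempt}
```

The evaluator's feedback from one attempt is passed into the next call to `generate`, so each retry is told what to fix rather than simply asked again.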
Human Judgment
The irreducible call at the edges
Some decisions cannot be delegated to automation. At the edges of the system's competence, a named human makes the call. Human judgment is a designed route, not a failure mode. The system recognizes when it has reached an edge and routes accordingly.
Where the field converged: learning to defer (Madras), appropriate reliance and human-AI complementarity (Bansal), scalable oversight (Amodei).
Record
Captures everything. Read later for drift.
Every frame, candidate, evaluation, gate action, refinement, and human judgment is recorded. The record preserves contemporaneous state. It is built to be read by someone not present at the decision. And it is actually read, for drift detection and continuous improvement.
Where the field converged: agentic memory (Park), CLIO (Cheng), chain-of-thought faithfulness (Turpin).
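One way to picture the Record is an append-only log plus a reader that looks for drift. The sketch below is illustrative only: the `Record` class, its entry fields, and the `escalation_rate` metric are assumptions, and a production record would use durable storage instead of an in-memory list.

```python
import json
import time


class Record:
    """Append-only log of every step in a decision (in-memory sketch)."""

    def __init__(self):
        self._entries = []

    def append(self, decision_id, component, payload):
        entry = {"decision_id": decision_id, "component": component,
                 "payload": payload, "at": time.time()}
        # Serialize at write time so later mutation cannot alter history.
        self._entries.append(json.dumps(entry, sort_keys=True))

    def read(self, component=None):
        """Return all entries, or only those written by one component."""
        rows = [json.loads(e) for e in self._entries]
        return [r for r in rows if component in (None, r["component"])]


def escalation_rate(record):
    """Share of gate actions that escalated; a shift over time hints at drift."""
    gates = record.read("gate")
    if not gates:
        return 0.0
    return sum(g["payload"]["route"] == "escalate" for g in gates) / len(gates)
```

Comparing a metric like `escalation_rate` across time windows is one simple way the record gets read after the fact, instead of merely being kept.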
The Whole Architecture
Only as sound as its weakest necessary part
The decision is the whole, not any single part. Soundness is judged by the weakest necessary component because one unguarded component caps the reliability of everything else. The assessment measures that cap and shows you which component sets it.
Where the field converged: agentic failure-mode studies (Cemri), the spiral of hallucination (Zhang).
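The weakest-link rule is a minimum over component scores. The sketch below assumes each component has already been scored on a 0-to-1 scale, which the text does not specify; the `soundness` function and the component keys are hypothetical names.

```python
COMPONENTS = ("frame", "generator", "evaluation", "gate",
              "refinement", "human_judgment", "record")


def soundness(scores):
    """Return the overall cap and the component that sets it."""
    missing = [c for c in COMPONENTS if c not in scores]
    if missing:
        # An unscored component cannot be assumed sound.
        raise ValueError(f"unscored components: {missing}")
    weakest = min(COMPONENTS, key=lambda c: scores[c])
    return scores[weakest], weakest
```

Using a minimum instead of an average means a strong generator cannot compensate for an unguarded gate: the overall figure moves only when the weakest component improves.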
The Three-Route Gate
Accept, Return, or Escalate
Escalation is a designed route, not a failure. A system that can only pass or discard has no answer for the cases that fall between.
Evaluated Output → Gate (decision point)
- Accept → Meets threshold
- Return → For refinement
- Escalate → Human required
The Four Edges
Where the system escalates to human judgment
The system must recognize when a case has reached the edge of its competence. At these edges, it routes to a named human, on purpose.
System Competence: cases within bounds
Edge of the Standard: case fits no category
- Tell: Confidence spread thin, no clear peak
- Requires: Judgment
Edge of the Measurable: evaluators disagree
- Tell: Disagreement no extra data resolves
- Requires: Adjudication
Edge of the Frame: anomaly breaks presuppositions
- Tell: Data that should not exist
- Requires: Reframing
Edge of Competence: beyond trained capability
- Tell: No precedent, only a guess
- Requires: Expertise
At every edge, the system routes to a named human role. Escalation is designed, not failure.
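The four tells can be approximated with simple signals. This is a heuristic sketch, not a specification: `detect_edge`, the `peak` and `spread` cutoffs, and the order of the checks are assumptions, and the evaluator-spread test only detects disagreement; it cannot tell whether extra data would resolve it.

```python
from enum import Enum


class Edge(Enum):
    # Each value names what the edge requires from the human.
    STANDARD = "judgment"
    MEASURABLE = "adjudication"
    FRAME = "reframing"
    COMPETENCE = "expertise"


def detect_edge(category_probs, evaluator_scores, violates_frame, precedents,
                peak=0.5, spread=0.3):
    """Return the first edge whose tell fires, or None if within bounds."""
    if violates_frame:
        return Edge.FRAME        # data that should not exist
    if precedents == 0:
        return Edge.COMPETENCE   # no precedent, only a guess
    if max(category_probs) < peak:
        return Edge.STANDARD     # confidence spread thin, no clear peak
    if max(evaluator_scores) - min(evaluator_scores) > spread:
        return Edge.MEASURABLE   # evaluators disagree
    return None
```

A result of `None` means the case stays inside the automated pipeline; any `Edge` value would be handed to the gate as the reason to escalate, along with the kind of human input it calls for.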
