Decision Architecture

The decision architecture, complete.

A decision moves through an architecture of seven parts, and the model is only one of the seven.

The Seven-Component Decision Architecture

Main Pipeline

1. Frame: Guards the question being asked
2. Generator: The model producing candidates
3. Evaluation: Independent grading, not self-grading
4. Gate: Accept, Return, or Escalate

Decision Paths from Gate

Accept → Output delivered
Return → Refinement Loop → Generator
Escalate → Human Judgment

When Decisions Need More

5. Refinement Loop: Iterates until quality threshold
6. Human Judgment: The irreducible call at the edges

Support Layer

7. Record: Captures everything. History feeds back to Frame.

The Seven Components

The architecture, not the model

Every component is necessary. None is sufficient alone.

Most AI systems are a single model with accessories. OrbisFramework is the seven-component decision architecture the model sits inside. The model produces candidates. Everything around it is what makes those candidates trustworthy enough for a high-stakes decision. Each component is external, inspectable, and replaceable. The techniques named on each card are how that component is built today. They will be superseded. The architecture they implement is what endures.

The clearest single example is CLIO, the cognitive loop via in-situ optimization. Rather than training reasoning into a model's weights, CLIO builds it as an external loop around an ordinary model: the system formulates its own approach, monitors its confidence as it works, adapts when confidence falls, and exposes the whole process to a person who can watch the uncertainty and interject. That is the philosophy in one technique: capability that lives in the architecture rather than the model, and stays inspectable because of it. CLIO appears on two cards below, Refinement Loop and Record, because one steerable loop touches both at once.

Frame

Guards against answering the wrong question

The Frame validates that the question is well-formed, in scope, and within the system's capabilities, then assembles the evidence the decision will rest on through retrieval (RAG) rather than from the model's memory. It also pulls comparable past decisions from the Record (the History loop), so the case is informed by how similar cases turned out rather than framed from scratch. A wrong question answered correctly is still a wrong answer.

Where the field converged: EGuR (Stein), the experience-guided reasoner, drawing on the Record as its case memory.
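
As an illustration only, the Frame's checks might be sketched as below. The names `FramedCase`, `frame`, `in_scope`, `retrieve`, and `similar_cases` are hypothetical and do not come from OrbisFramework; the three callables stand in for a scope policy, a RAG retriever, and a lookup against the Record. The capability check is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class FramedCase:
    """A validated question plus the material the decision will rest on."""
    question: str
    evidence: list = field(default_factory=list)
    precedents: list = field(default_factory=list)


def frame(question, in_scope, retrieve, similar_cases):
    """Validate the question, then assemble evidence and precedents.

    in_scope, retrieve and similar_cases are caller-supplied callables
    (assumptions for this sketch, not a real API).
    """
    text = question.strip()
    if not text:
        raise ValueError("question is empty")
    if not in_scope(text):
        raise ValueError("question is out of scope")
    return FramedCase(
        question=text,
        evidence=retrieve(text),         # retrieval, not the model's memory
        precedents=similar_cases(text),  # the History loop from the Record
    )
```

A rejected question raises before any candidate is generated, which is the point of putting the Frame first.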

Generator

The AI model producing candidates

This is what most people think of when they think of AI: the model that produces candidate outputs. It is one component of seven. The generator's job is to produce candidates, not to evaluate them and not to decide what to do with them.

Current techniques: self-consistency (Wang), best-of-N sampling (Cobbe), tree of thoughts (Yao), test-time compute scaling (Snell).

Evaluation

Guards against the generator grading its own exam

Outputs must be scored by criteria separate from the generation process. The generator cannot grade its own exam. Evaluation applies multiple dimensions with documented thresholds, producing scores that the gate will use to route the output.

Where the field converged: process supervision (Lightman), Constitutional AI (Bai), LLM-as-judge (Zheng), AI safety via debate (Irving), the Deliberative Reasoning Network, DRN (Xu).
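
One way to picture "multiple dimensions with documented thresholds" is the sketch below. The dimension names, the threshold values, and the `evaluate` function are illustrative assumptions, not OrbisFramework's actual criteria. The scorers are passed in from outside so that nothing in the generator scores its own output.

```python
# Illustrative thresholds only; a real deployment would document its own.
THRESHOLDS = {"accuracy": 0.8, "grounding": 0.7, "safety": 0.9}


def evaluate(candidate, scorers, thresholds=THRESHOLDS):
    """Score a candidate on every dimension using external scorers.

    scorers maps a dimension name to a callable returning a score in [0, 1].
    Returns a per-dimension report the gate can route on.
    """
    report = {}
    for dimension, minimum in thresholds.items():
        score = scorers[dimension](candidate)
        report[dimension] = {
            "score": score,
            "threshold": minimum,
            "passed": score >= minimum,
        }
    return report
```

Keeping the threshold next to each score in the report means a later reader can see what the output was measured against, not just whether it passed.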

Gate

Guards against a system that can only pass or discard

The gate has three routes. Accept: the output meets threshold and proceeds. Return: the output fails threshold and goes back for refinement. Escalate: the output has reached an edge where human judgment is required. Escalation is a designed route, not a failure.

Where the field converged: conformal prediction (Vovk), model calibration (Kadavath), semantic entropy (Kuhn), certainty-guided reasoning, CGR (Nogueira), agentic uncertainty quantification, AUQ (Zhang).
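
The three routes can be sketched as a small routing function. This is a minimal illustration under stated assumptions: the `Route` enum, the `gate` function, the `at_edge` flag, and the retry limit are hypothetical names, and the evaluation report is assumed to be a dict of dimensions, each carrying a boolean `passed`.

```python
from enum import Enum


class Route(Enum):
    ACCEPT = "accept"
    RETURN = "return"
    ESCALATE = "escalate"


def gate(report, at_edge=False, attempts=0, max_attempts=3):
    """Route an evaluated output.

    report: {dimension: {"passed": bool, ...}} (assumed shape).
    at_edge: True when the case has reached an edge of competence.
    """
    if at_edge:
        return Route.ESCALATE  # a designed route, checked first
    if all(entry["passed"] for entry in report.values()):
        return Route.ACCEPT
    if attempts >= max_attempts:
        return Route.ESCALATE  # persistent failure goes to a human
    return Route.RETURN
```

Checking `at_edge` before the thresholds is a design choice in this sketch: a case at an edge escalates even if its scores look acceptable.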

Refinement Loop

Iterates until quality threshold

Outputs that fail evaluation return for another attempt. The feedback is specific, not just 'try again.' Maximum retry and cost limits prevent infinite loops. Persistent failures escalate to humans rather than retrying forever.

Where the field converged: Self-Refine (Madaan), Reflexion (Shinn), chain-of-verification (Dhuliawala), CLIO (Cheng), and the documented limit of self-correction (Huang).
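
A bounded refinement loop along these lines might look as follows. The function name `refine`, the callable signatures, and the default limits are assumptions made for the sketch; it is not an implementation of any of the techniques cited above.

```python
def refine(generate, evaluate, max_attempts=3, max_cost=1.0):
    """Retry with specific feedback until the output passes or a limit is hit.

    generate(feedback) -> (candidate, cost): hypothetical generator call.
    evaluate(candidate) -> list of specific failure messages, empty if it passes.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback = []
    spent = 0.0
    for attempt in range(1, max_attempts + 1):
        candidate, cost = generate(feedback)
        spent += cost
        failures = evaluate(candidate)
        if not failures:
            return {"status": "accepted", "candidate": candidate,
                    "attempts": attempt, "cost": spent}
        if spent >= max_cost:
            break  # cost limit reached; stop retrying
        feedback = failures  # specific feedback, not just "try again"
    # Persistent failure: hand the last candidate and its failures to a human.
    return {"status": "escalated", "candidate": candidate,
            "failures": failures, "attempts": attempt, "cost": spent}
```

Both limits end in escalation rather than a silent discard, so a case that cannot be fixed automatically still reaches a person.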

Human Judgment

The irreducible call at the edges

Some decisions cannot be delegated to automation. At the edges of the system's competence, a named human makes the call. Human judgment is a designed route, not a failure mode. The system recognizes when it has reached an edge and routes accordingly.

Where the field converged: learning to defer (Madras), appropriate reliance and human-AI complementarity (Bansal), scalable oversight (Amodei).

Record

Captures everything. Read later for drift.

Every frame, candidate, evaluation, gate action, refinement, and human judgment is recorded. The record preserves contemporaneous state. It is built to be read by someone not present at the decision. And it is actually read, for drift detection and continuous improvement.

Where the field converged: agentic memory (Park), CLIO (Cheng), chain-of-thought faithfulness (Turpin).
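
As a rough sketch, the Record can be thought of as an append-only log of timestamped events. The `Record` class, its method names, and the event fields below are hypothetical; the deep copy is one simple way to preserve contemporaneous state in memory, and a real system would likely use durable storage.

```python
import copy
import json
from datetime import datetime, timezone


class Record:
    """Append-only log of everything that happened in a decision."""

    def __init__(self):
        self._events = []

    def append(self, decision_id, component, payload):
        event = {
            "decision_id": decision_id,
            "component": component,  # e.g. "frame", "gate", "human_judgment"
            "at": datetime.now(timezone.utc).isoformat(),
            # Copy so later mutation cannot rewrite what was recorded.
            "payload": copy.deepcopy(payload),
        }
        self._events.append(event)
        return event

    def history(self, decision_id):
        """All events for one decision, in the order they occurred."""
        return [e for e in self._events if e["decision_id"] == decision_id]

    def to_jsonl(self):
        """Serialize for a reader who was not present at the decision."""
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._events)
```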

The Whole Architecture

Only as sound as its weakest necessary part

The decision is the whole, not any single part. Soundness is judged by the weakest necessary component because one unguarded component caps the reliability of everything else. That cap is what the assessment measures, and it shows you which component sets it.

Where the field converged: agentic failure-mode studies (Cemri), the spiral of hallucination (Zhang).
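
The weakest-part rule amounts to taking a minimum rather than an average. The sketch below assumes each component has been given a score between 0 and 1; that scale, the component keys, and the `soundness` function are illustrative assumptions, since the page does not describe how the assessment scores components.

```python
COMPONENTS = (
    "frame", "generator", "evaluation", "gate",
    "refinement", "human_judgment", "record",
)


def soundness(scores):
    """Return (weakest_component, cap) for a dict of per-component scores."""
    missing = [c for c in COMPONENTS if c not in scores]
    if missing:
        # An unscored component is unguarded; refuse to report a cap without it.
        raise ValueError(f"unscored components: {missing}")
    weakest = min(COMPONENTS, key=lambda c: scores[c])
    return weakest, scores[weakest]
```

An average would let six strong components hide one weak one; the minimum does not.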

The Three-Route Gate

Accept, Return, or Escalate

Escalation is a designed route, not a failure. A system that can only pass or discard has no answer for the cases that fall between.

Evaluated Output → Gate (Decision Point)

Accept: Meets threshold
Return: For refinement
Escalate: Human required

The Four Edges

Where the system escalates to human judgment

The system must recognize when a case has reached the edge of its competence. At these edges, it routes to a named human, on purpose.

System Competence: Cases within bounds

Edge of the Standard: Case fits no category
Tell: Confidence spread thin, no clear peak
Requires: Judgment

Edge of the Measurable: Evaluators disagree
Tell: Disagreement no extra data resolves
Requires: Adjudication

Edge of the Frame: Anomaly breaks presuppositions
Tell: Data that should not exist
Requires: Reframing

Edge of Competence: Beyond trained capability
Tell: No precedent, only a guess
Requires: Expertise

At every edge, the system routes to a named human role. Escalation is designed, not failure.
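
To make the tells concrete, here is one hedged sketch of how they might be checked. Everything in it is an assumption for illustration: the `detect_edge` function, the proxies chosen for each tell (normalized entropy for "confidence spread thin", score spread for evaluator disagreement, an externally supplied anomaly flag, a precedent count), and the threshold values. Real detection would need calibration against actual cases.

```python
import math

# Edge -> (tell, what it requires), taken from the four edges above.
EDGES = {
    "standard": ("Confidence spread thin, no clear peak", "Judgment"),
    "measurable": ("Disagreement no extra data resolves", "Adjudication"),
    "frame": ("Data that should not exist", "Reframing"),
    "competence": ("No precedent, only a guess", "Expertise"),
}


def normalized_entropy(probs):
    """Shannon entropy over its maximum: near 0 is one peak, 1 is flat."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


def detect_edge(category_probs, evaluator_scores, precedent_count,
                anomaly=False, flat=0.9, spread=0.3):
    """Return the key of the edge reached, or None if within bounds."""
    if anomaly:
        return "frame"
    if precedent_count == 0:
        return "competence"
    if evaluator_scores and max(evaluator_scores) - min(evaluator_scores) > spread:
        return "measurable"  # a proxy only; it cannot tell if more data would help
    if normalized_entropy(category_probs) > flat:
        return "standard"
    return None
```

The returned key indexes `EDGES`, so the caller can look up what kind of human input the case requires before routing it to a named role.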

For security architecture, deployment options, and technical stack detail:

See Security and Deployment

Self-Assessment

How sound is your decision architecture?

Take the audit to assess your current system across all seven components.


See it implemented

OrbisFramework implements the complete architecture.

Schedule a strategic session to discuss how the architecture applies to your workflows.
