Verification

IRIS includes a verification system that lets programs carry machine-checked proofs of correctness. Verification is optional – programs work without proofs – but the infrastructure exists to verify properties ranging from type safety to cost bounds.

This page explains what verification means in IRIS, how it works, what it can prove, and where the boundaries are.

Why Verify?

Programs in IRIS can modify themselves. Evolution breeds new code. The self-improvement daemon hot-swaps components. In a system where code changes autonomously, you need a way to know that changes are safe.

The verification system serves three purposes:

  1. Safety gate for evolution – the daemon won’t deploy a replacement that fails verification
  2. Gradient for search – graded verification scores let evolution follow a gradient toward provable programs, not just binary pass/fail
  3. Trust boundary – the proof kernel is the only component that IRIS cannot replace with evolved code (Löb's theorem ceiling)

The Proof Kernel

The kernel is the trusted computing base. It’s an LCF-style proof system: the only code that can construct Theorem values lives inside the kernel. No external code can forge a proof.

Properties

  • Small, auditable Rust with zero unsafe blocks
  • 20 inference rules, each producing a Theorem with a BLAKE3 proof hash
  • Every theorem records: context, node, type, cost bound, and a cryptographic hash of the derivation chain

The 20 Rules

The rules form a sound fragment of System F with refinements, cost annotations, and structural recursion.

Core lambda calculus (rules 1–7, 15):

  1. assume – Variable lookup: if x:A is in the context, then x has type A
  2. intro – Lambda introduction: given a proof of the body, produce a function type
  3. elim – Application: given f:A->B and a:A, produce f(a):B
  4. refl – Reflexivity: any node equals itself
  5. symm – Symmetry: if a = b then b = a
  6. trans – Transitivity: if a = b and b = c then a = c
  7. congr – Congruence: if f = g and a = b then f(a) = g(b)
  15. let_bind – Let binding: compose the costs of the bound expression and body

Pattern matching and control flow (rules 16, 20):

  16. match_elim – All match arms agree on type; cost is scrutinee + max(arm costs)
  20. guard_rule – Conditional: both branches must agree on type

Induction (rules 13–14, 17):

  13. nat_ind – Natural number induction: P(0) and P(n)->P(n+1) implies P for all n
  14. structural_ind – Structural induction: one case per constructor proves the property for all values
  17. fold_rule – Catamorphism: cost is input + base + (step cost * input size)

Polymorphism (rules 18–19):

  18. type_abst – ForAll introduction (System F)
  19. type_app – ForAll elimination with well-formedness check on substituted types

Cost and refinement (rules 8–12):

  8. type_check_node – Trust a node’s content-addressed type annotation
  9. cost_subsume – Weaken a cost bound: if actual cost <= declared cost, accept
  10. cost_leq_rule – Produce a witness that one cost bound is <= another
  11. refine_intro – If e:T and P(e) holds, then e:{x:T|P}
  12. refine_elim – If e:{x:T|P}, extract both e:T and the predicate P

Why LCF?

The LCF architecture means the kernel is the single point of trust. The checker, the evolution engine, and user code all produce proofs by calling kernel rules – they cannot bypass them. If the 20 rules are sound, then every Theorem value in the system is sound, regardless of what untrusted code produced it.
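The discipline is small enough to sketch. The following Python model is illustrative only – the real kernel is Rust, uses BLAKE3 rather than blake2b, and its actual rule signatures are not shown in this document – but it captures the two LCF invariants: only kernel rules mint Theorem values, and every theorem carries a hash of its derivation chain.

```python
import hashlib

class Theorem:
    """Only the kernel rules below construct these; untrusted code merely holds them."""
    def __init__(self, judgment, proof_hash, _token=None):
        if _token is not _KERNEL_TOKEN:       # crude stand-in for Rust module privacy
            raise PermissionError("only the kernel can mint Theorem values")
        self.judgment = judgment              # e.g. ("x", "Int") meaning x : Int
        self.proof_hash = proof_hash          # hash of the derivation chain

_KERNEL_TOKEN = object()

def _hash(rule, *parents):
    # blake2b stands in for BLAKE3 (which is not in the Python stdlib)
    h = hashlib.blake2b(digest_size=16)
    h.update(rule.encode())
    for p in parents:
        h.update(p)
    return h.digest()

def assume(ctx, var):
    """Rule 1: if var:A is in the context, conclude var : A."""
    ty = dict(ctx)[var]
    return Theorem((var, ty), _hash("assume", f"{var}:{ty}".encode()), _KERNEL_TOKEN)

def intro(param, param_ty, body_thm):
    """Rule 2: from a proof of the body, conclude a function type.
    The new theorem's hash chains over the premise's hash."""
    expr, body_ty = body_thm.judgment
    judgment = (f"\\{param}. {expr}", f"{param_ty} -> {body_ty}")
    return Theorem(judgment, _hash("intro", body_thm.proof_hash), _KERNEL_TOKEN)

# An untrusted "checker" can only obtain theorems by calling kernel rules:
x_thm = assume([("x", "Int")], "x")
ident = intro("x", "Int", x_thm)   # proves \x. x : Int -> Int
```

Rust enforces the privacy invariant with module visibility rather than a runtime token; `_KERNEL_TOKEN` is a stand-in for that boundary.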

Refinement Types

The proof kernel supports refinement types via refine_intro and refine_elim rules, and the LIA (Linear Integer Arithmetic) solver can check predicates. However, refinement type annotations in .iris surface syntax are parsed but not lowered to the SemanticGraph – the lowerer discards them.

Refinement types work at the kernel API level when proofs are constructed programmatically (e.g., by the checker or evolution engine), but not from surface syntax today.

-- Parsed but not lowered -- serves as documentation
let safe_div x y : Int -> {y : Int | y != 0} -> Int = x / y

The kernel’s refine_intro rule requires an independent proof that the predicate holds – you can’t introduce a refinement without evidence. refine_elim extracts the base type, discarding the predicate. This prevents circular reasoning.
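A minimal Python model of the two rules (judgments encoded as tuples; all names here are hypothetical, not the kernel’s API):

```python
def refine_intro(base_judgment, predicate, predicate_proof):
    """Rule 11: introduce e : {x:T | P} only with independent evidence for P(e)."""
    if predicate_proof is None:
        raise ValueError("refine_intro requires a proof that the predicate holds")
    expr, base_ty = base_judgment
    return (expr, ("refine", base_ty, predicate))

def refine_elim(judgment):
    """Rule 12: recover the base typing; the predicate is dropped here, so
    elimination never yields the evidence a fresh refine_intro would need."""
    expr, (tag, base_ty, _predicate) = judgment
    assert tag == "refine"
    return (expr, base_ty)
```

For example, `refine_intro(("y", "Int"), "y != 0", witness)` succeeds only when a witness (e.g., from the LIA solver) is supplied; eliminating the result gives back plain `("y", "Int")`.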

Supported Predicates

The LIA solver handles:

  • Comparisons: x = y, x < y, x <= y, x != y
  • Arithmetic: constants, variables, addition, scalar multiplication, negation, modulo
  • Logic: and, or, not, implies
  • Divisibility: x % d = 0

When the solver can’t decide a predicate, it falls back to property-based testing (random sampling with counterexample generation).
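A sketch of the fallback path (Python; the sampling range, sample count, and result shape are assumptions, not the solver’s actual interface):

```python
import random

def property_test(pred, n_samples=1000, seed=0):
    """Fallback when the LIA solver cannot decide: random sampling over a
    bounded integer range, reporting a concrete counterexample on failure."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = rng.randint(-10**6, 10**6)
        if not pred(x):
            return ("fail", x)        # counterexample in hand
    return ("pass", None)             # confidence, not certainty
```

`property_test(lambda x: x * x >= 0)` passes every sample, while `property_test(lambda x: x % 2 == 0)` quickly returns an odd counterexample – illustrating why a "pass" here is probabilistic rather than a proof.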

Example Programs

IRIS ships with annotated programs in programs/verified/. The requires/ensures annotations are parsed but discarded by the lowerer – they don’t affect compilation or execution. They serve as documentation of intended invariants.

What iris check actually verifies is the graph structure: node types agree across edges, match arms are consistent, guards have compatible branches, and fold costs are well-formed. It does not check the requires/ensures predicates.

Absolute value

let abs x : Int -> Int
  requires x >= -1000000 && x <= 1000000
  ensures result >= 0
  = if x >= 0 then x else 0 - x

The checker verifies that both guard branches produce values of the same type and that the graph is well-formed. The requires/ensures clauses are documentation.

Safe division

let safe_div x y : Int -> Int -> Int
  requires y != 0
  = x / y

The requires y != 0 documents the precondition but is not enforced – division by zero will still error at runtime.

Bounded addition

let bounded_add x y : Int -> Int -> Int
  requires x >= -500000 && x <= 500000
  requires y >= -500000 && y <= 500000
  ensures result >= -1000000
  ensures result <= 1000000
  ensures result == x + y
  = x + y

Clamping

let clamp x lo hi : Int -> Int -> Int -> Int
  requires lo <= hi
  ensures result >= lo
  ensures result <= hi
  = if x < lo then lo else if x > hi then hi else x

The Checker

The checker is untrusted code that drives the kernel. It walks a program’s graph bottom-up, applying kernel rules at each node, and collects the results.

How iris check works

iris check programs/verified/abs.iris
# [OK] abs: 5/5 obligations satisfied (score: 1.00)
  1. Parse and compile the .iris file to a SemanticGraph
  2. Topologically sort the graph nodes
  3. For each node, apply the appropriate kernel rule
  4. Count successes and failures
  5. Report the score: satisfied / total obligations
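The steps above can be sketched as a single pass (Python; `graphlib` is in the standard library, while the node and rule shapes are hypothetical):

```python
from graphlib import TopologicalSorter

def check_graph(nodes, predecessors, apply_rule):
    """Bottom-up pass: topologically sort the graph, apply a kernel rule per
    node, and report satisfied/total obligations as the graded score."""
    order = TopologicalSorter({n: predecessors.get(n, set()) for n in nodes})
    satisfied = total = 0
    for node in order.static_order():     # dependencies come before dependents
        total += 1
        try:
            apply_rule(node)              # would invoke a kernel rule; may raise
            satisfied += 1
        except Exception:
            pass                          # a failure lowers the score, doesn't abort
    return satisfied / total if total else 1.0
```

Note that a failed obligation does not stop the walk – the checker keeps going so the final ratio reflects how much of the program is provable.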

Graded Verification

Instead of binary pass/fail, the checker computes a score in [0.0, 1.0]. This is critical for evolution – a program that satisfies 8/10 obligations is better than one that satisfies 2/10, even though both “fail.”

The score is scaled by verification tier:

  • Tier 0 – core lambda calculus; score ceiling 0.3; decidable (< 10ms)
  • Tier 1 – adds fold, recursion, induction; ceiling 0.6; mostly decidable (< 1s)
  • Tier 2 – adds effects and extern; ceiling 0.9; requires annotation
  • Tier 3 – adds neural and property testing; ceiling 1.0; probabilistic
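Assuming the ceiling scales the raw obligation ratio proportionally (an assumption – the document only fixes the ceilings themselves), the tiered score might look like:

```python
TIER_CEILING = {0: 0.3, 1: 0.6, 2: 0.9, 3: 1.0}

def graded_score(satisfied, total, tier):
    """Scale the raw satisfied/total ratio by the tier's ceiling, so even a
    fully verified Tier 0 program scores at most 0.3."""
    raw = satisfied / total if total else 1.0
    return raw * TIER_CEILING[tier]
```

Under this scheme a Tier 0 program with all obligations met scores 0.3, while a Tier 3 program meeting 8 of 10 scores 0.8 – the graded signal evolution can climb.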

Proof-Guided Mutation

When verification fails, the checker produces a ProofFailureDiagnosis with a MutationHint:

  • AddTerminationCheck – missing loop termination
  • FixTypeSignature(expected, actual) – type mismatch
  • AddCostAnnotation – node needs a cost bound
  • WrapInGuard – add a runtime guard (e.g., division-by-zero check)

The evolution engine uses these hints to guide mutations toward provable programs. Instead of random search, the system knows what’s wrong and how to fix it.
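A minimal sketch of the mapping (Python; the diagnosis kinds and field names are hypothetical – the real ProofFailureDiagnosis fields are not shown in this document):

```python
from dataclasses import dataclass

@dataclass
class ProofFailureDiagnosis:
    kind: str            # hypothetical: "nontermination", "type_mismatch", ...
    expected: str = ""
    actual: str = ""

def mutation_hint(d: ProofFailureDiagnosis) -> str:
    """Translate a verification failure into a directed mutation,
    replacing blind random search with a targeted fix."""
    return {
        "nontermination": "AddTerminationCheck",
        "type_mismatch": f"FixTypeSignature({d.expected}, {d.actual})",
        "missing_cost": "AddCostAnnotation",
        "partial_op": "WrapInGuard",
    }[d.kind]
```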

Cost Analysis

Cost annotations declare the computational complexity of a function:

let sum xs : List Int -> Int [cost: Linear(xs)] = fold 0 (+) xs

The cost lattice forms a partial order:

Zero < Constant(k) < Linear(n) < NLogN(n) < Polynomial(n, d)

Composite forms handle sequential (Sum), parallel (Par), repeated (Mul), and branching (Sup for max, Inf for min) costs.
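A sketch of the order (Python; the tuple encoding and the treatment of Sum are assumptions, not the kernel’s actual representation):

```python
# Cost terms: ("Zero",), ("Constant", k), ("Linear",), ("NLogN",), ("Poly", d),
# plus composites like ("Sum", a, b). Ranks encode the documented chain
# Zero < Constant < Linear < NLogN < Polynomial.
RANK = {"Zero": 0, "Constant": 1, "Linear": 2, "NLogN": 3, "Poly": 4}

def cost_leq(a, b):
    """Partial-order check of the kind cost_subsume needs: actual <= declared."""
    if a[0] == "Sum":
        # sequential composition: asymptotically bounded if both parts are
        return cost_leq(a[1], b) and cost_leq(a[2], b)
    if a[0] == "Poly" and b[0] == "Poly":
        return a[1] <= b[1]            # lower degree is cheaper
    return RANK[a[0]] <= RANK[b[0]]
```

For example, `cost_leq(("Linear",), ("Poly", 2))` holds – the subsumption the text describes, accepting a Linear implementation where Polynomial was declared.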

The kernel’s cost_subsume rule can weaken cost bounds (e.g., accept a Linear implementation where Polynomial was declared), but cost subsumption is not yet wired into the main checker pipeline. Today, cost annotations function as evolution fitness objectives – lower cost is better – and as documentation of intended complexity: they guide evolution, not verification.

Lean Formalization

A separate Lean 4 codebase (lean/IrisKernel/) formalizes the proof kernel’s type theory.

What’s formalized

  • All 20 inference rules as constructors of an inductive Derivation type
  • The cost lattice partial order (CostLeq) with reflexivity, transitivity, and zero/unknown axioms
  • Type well-formedness: a type is well-formed if it and all its references exist
  • Key metatheorems: reflexivity produces valid judgments, cost weakening is transitive, Zero is bottom, Unknown is top
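The shape of such a formalization can be sketched in Lean 4 (constructor names here are illustrative, not the actual lean/IrisKernel/ source):

```lean
-- Hypothetical sketch: the cost order and the derivation type.
inductive Cost where
  | zero
  | constant (k : Nat)
  | linear
  | unknown

inductive CostLeq : Cost → Cost → Prop where
  | refl (c : Cost) : CostLeq c c
  | trans {a b c : Cost} : CostLeq a b → CostLeq b c → CostLeq a c
  | zeroBot (c : Cost) : CostLeq .zero c          -- Zero is bottom
  | unknownTop (c : Cost) : CostLeq c .unknown    -- Unknown is top

inductive Derivation : Type where
  | assume (x ty : String) : Derivation           -- rule 1
  | intro (body : Derivation) : Derivation        -- rule 2
  | elim (f a : Derivation) : Derivation          -- rule 3
  -- ... the remaining 17 rules as further constructors

-- A metatheorem of the kind listed above: Zero is below everything.
theorem zero_le (c : Cost) : CostLeq .zero c := .zeroBot c
```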

What’s not formalized yet

  • Full weakening lemma (context extension preserves derivations) – partially proven, needs structural induction over all derivation forms
  • Some cost algebra embedding rules (e.g., a <= Sum(a, b))
  • Full soundness proof of the combined rule set

What “formalized” means

The Lean code proves properties about the type-theoretic design, not about the Rust implementation. The Rust kernel could have bugs that the Lean proofs don’t catch. The Lean formalization is a design validation tool, not a code extraction pipeline.

Soundness Argument

The 20 rules are sound because they correspond to well-understood type-theoretic constructs:

  1. Rules 1–7, 15 = simply-typed lambda calculus (known sound since the 1940s)
  2. Rules 18–19 = System F polymorphism (sound, with well-formedness guard on substitutions)
  3. Rules 11–12 = refinement types / liquid types (sound: intro requires independent evidence)
  4. Rules 13–14 = standard Peano and structural induction (sound, requires exhaustive cases)
  5. Rules 9–10, 17 = cost annotations in a separate dimension (don’t affect type soundness)
  6. Rule 8 = trusts content-addressed node annotations (conservative)
  7. Rules 16, 20 = standard sum elimination and conditional (sound: all branches agree on type)

The LCF architecture ensures that even if untrusted code (the checker, evolution, user programs) tries to construct invalid proofs, the kernel rejects them.

What Verification Does NOT Do

To be clear about the boundaries:

  • Verification is optional. Most programs today carry no proofs. You can write and run IRIS code without ever invoking the checker.
  • Contracts and refinement types are parsed but not enforced. requires, ensures, and {x : T | P} syntax exists, but the lowerer discards them. The kernel has the rules; the wiring from syntax to kernel doesn’t exist.
  • Cost annotations are metadata. The kernel has cost rules, but they’re not wired into the main checker pipeline. Costs guide evolution, not verification.
  • The Rust kernel is not verified. It’s small, auditable, and has zero unsafe – but no code extraction or formal linkage to the Lean formalization.
  • Property testing is probabilistic. When the LIA solver can’t decide, random testing provides confidence, not certainty.