Verification

IRIS includes a verification system that lets programs carry machine-checked proofs of correctness. Verification is optional – programs work without proofs – but the infrastructure exists to verify properties ranging from type safety to cost bounds.

This page explains what verification means in IRIS, how it works, what it can prove, and where the boundaries are.

Why Verify?

Programs in IRIS can modify themselves. Evolution breeds new code. The self-improvement daemon hot-swaps components. In a system where code changes autonomously, you need a way to know that changes are safe.

The verification system serves three purposes:

  1. Safety gate for evolution – the daemon won’t deploy a replacement that fails verification
  2. Gradient for search – graded verification scores let evolution follow a gradient toward provable programs, not just binary pass/fail
  3. Trust boundary – the proof kernel is the only component that IRIS cannot replace with evolved code (Löb's theorem ceiling)

The Proof Kernel

The kernel is the trusted computing base. It’s an LCF-style proof system: the only code that can construct Theorem values lives inside the kernel. No external code can forge a proof.

Properties

  • Small, auditable Rust with zero unsafe blocks
  • 20 inference rules, each producing a Theorem with a BLAKE3 proof hash
  • Every theorem records: context, node, type, cost bound, and a cryptographic hash of the derivation chain

The 20 Rules

The rules form a sound fragment of System F with refinements, cost annotations, and structural recursion.

Core lambda calculus (rules 1–7, 15):

  1. assume – Variable lookup: if x:A is in the context, then x has type A
  2. intro – Lambda introduction: given a proof of the body, produce a function type
  3. elim – Application: given f:A->B and a:A, produce f(a):B
  4. refl – Reflexivity: any node equals itself
  5. symm – Symmetry: if a = b then b = a
  6. trans – Transitivity: if a = b and b = c then a = c
  7. congr – Congruence: if f = g and a = b then f(a) = g(b)
  15. let_bind – Let binding: compose the costs of the bound expression and body

Pattern matching and control flow (rules 16, 20):

  16. match_elim – All match arms agree on type; cost is scrutinee + max(arm costs)
  20. guard_rule – Conditional: both branches must agree on type

Induction (rules 13–14, 17):

  13. nat_ind – Natural number induction: P(0) and P(n)->P(n+1) implies P for all n
  14. structural_ind – Structural induction: one case per constructor proves the property for all values
  17. fold_rule – Catamorphism: cost is input + base + (step cost * input size)

Polymorphism (rules 18–19):

  18. type_abst – ForAll introduction (System F)
  19. type_app – ForAll elimination with well-formedness check on substituted types

Cost and refinement (rules 8–12):

  8. type_check_node – Trust a node’s content-addressed type annotation
  9. cost_subsume – Weaken a cost bound: if actual cost <= declared cost, accept
  10. cost_leq_rule – Produce a witness that one cost bound is <= another
  11. refine_intro – If e:T and P(e) holds, then e:{x:T|P}
  12. refine_elim – If e:{x:T|P}, extract both e:T and the predicate P

Why LCF?

The LCF architecture means the kernel is the single point of trust. The checker, the evolution engine, and user code all produce proofs by calling kernel rules – they cannot bypass them. If the 20 rules are sound, then every Theorem value in the system is sound, regardless of what untrusted code produced it.
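The discipline is small enough to sketch. The following Python model is illustrative only – the real kernel is Rust, uses BLAKE3 rather than blake2b, and its actual rule signatures are not shown in this document – but it captures the two LCF invariants: only kernel rules mint Theorem values, and every theorem carries a hash of its derivation chain.

```python
import hashlib

class Theorem:
    """Only the kernel rules below construct these; untrusted code merely holds them."""
    def __init__(self, judgment, proof_hash, _token=None):
        if _token is not _KERNEL_TOKEN:       # crude stand-in for Rust module privacy
            raise PermissionError("only the kernel can mint Theorem values")
        self.judgment = judgment              # e.g. ("x", "Int") meaning x : Int
        self.proof_hash = proof_hash          # hash of the derivation chain

_KERNEL_TOKEN = object()

def _hash(rule, *parents):
    # blake2b stands in for BLAKE3 (which is not in the Python stdlib)
    h = hashlib.blake2b(digest_size=16)
    h.update(rule.encode())
    for p in parents:
        h.update(p)
    return h.digest()

def assume(ctx, var):
    """Rule 1: if var:A is in the context, conclude var : A."""
    ty = dict(ctx)[var]
    return Theorem((var, ty), _hash("assume", f"{var}:{ty}".encode()), _KERNEL_TOKEN)

def intro(param, param_ty, body_thm):
    """Rule 2: from a proof of the body, conclude a function type.
    The new theorem's hash chains over the premise's hash."""
    expr, body_ty = body_thm.judgment
    judgment = (f"\\{param}. {expr}", f"{param_ty} -> {body_ty}")
    return Theorem(judgment, _hash("intro", body_thm.proof_hash), _KERNEL_TOKEN)

# An untrusted "checker" can only obtain theorems by calling kernel rules:
x_thm = assume([("x", "Int")], "x")
ident = intro("x", "Int", x_thm)   # proves \x. x : Int -> Int
```

Rust enforces the privacy invariant with module visibility rather than a runtime token; `_KERNEL_TOKEN` is a stand-in for that boundary.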

Refinement Types

The proof kernel supports refinement types via refine_intro and refine_elim rules, and the LIA (Linear Integer Arithmetic) solver can check predicates. However, refinement type annotations in .iris surface syntax are parsed but not lowered to the SemanticGraph – the lowerer discards them.

Refinement types work at the kernel API level when proofs are constructed programmatically (e.g., by the checker or evolution engine), but not from surface syntax today.

-- Parsed but not lowered -- serves as documentation
let safe_div x y : Int -> {y : Int | y != 0} -> Int = x / y

The kernel’s refine_intro rule requires an independent proof that the predicate holds – you can’t introduce a refinement without evidence. refine_elim extracts the base type, discarding the predicate. This prevents circular reasoning.
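A minimal Python model of the two rules (judgments encoded as tuples; all names here are hypothetical, not the kernel’s API):

```python
def refine_intro(base_judgment, predicate, predicate_proof):
    """Rule 11: introduce e : {x:T | P} only with independent evidence for P(e)."""
    if predicate_proof is None:
        raise ValueError("refine_intro requires a proof that the predicate holds")
    expr, base_ty = base_judgment
    return (expr, ("refine", base_ty, predicate))

def refine_elim(judgment):
    """Rule 12: recover the base typing; the predicate is dropped here, so
    elimination never yields the evidence a fresh refine_intro would need."""
    expr, (tag, base_ty, _predicate) = judgment
    assert tag == "refine"
    return (expr, base_ty)
```

For example, `refine_intro(("y", "Int"), "y != 0", witness)` succeeds only when a witness (e.g., from the LIA solver) is supplied; eliminating the result gives back plain `("y", "Int")`.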

Supported Predicates

The LIA solver handles:

  • Comparisons: x = y, x < y, x <= y, x != y
  • Arithmetic: constants, variables, addition, scalar multiplication, negation, modulo
  • Logic: and, or, not, implies
  • Divisibility: x % d = 0

When the solver can’t decide a predicate, it falls back to property-based testing (random sampling with counterexample generation).
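A sketch of the fallback path (Python; the sampling range, sample count, and result shape are assumptions, not the solver’s actual interface):

```python
import random

def property_test(pred, n_samples=1000, seed=0):
    """Fallback when the LIA solver cannot decide: random sampling over a
    bounded integer range, reporting a concrete counterexample on failure."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = rng.randint(-10**6, 10**6)
        if not pred(x):
            return ("fail", x)        # counterexample in hand
    return ("pass", None)             # confidence, not certainty
```

`property_test(lambda x: x * x >= 0)` passes every sample, while `property_test(lambda x: x % 2 == 0)` quickly returns an odd counterexample – illustrating why a "pass" here is probabilistic rather than a proof.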

Example Programs

IRIS ships with annotated programs in programs/verified/. The requires/ensures annotations are parsed but discarded by the lowerer – they don’t affect compilation or execution. They serve as documentation of intended invariants.

What iris check actually verifies is the graph structure: node types agree across edges, match arms are consistent, guards have compatible branches, and fold costs are well-formed. It does not check the requires/ensures predicates.

Absolute value

let abs x : Int -> Int
  requires x >= -1000000 && x <= 1000000
  ensures result >= 0
  = if x >= 0 then x else 0 - x

The checker verifies that both guard branches produce values of the same type and that the graph is well-formed. The requires/ensures clauses are documentation.

Safe division

let safe_div x y : Int -> Int -> Int
  requires y != 0
  = x / y

The requires y != 0 documents the precondition but is not enforced – division by zero will still error at runtime.

Bounded addition

let bounded_add x y : Int -> Int -> Int
  requires x >= -500000 && x <= 500000
  requires y >= -500000 && y <= 500000
  ensures result >= -1000000
  ensures result <= 1000000
  ensures result == x + y
  = x + y

Clamping

let clamp x lo hi : Int -> Int -> Int -> Int
  requires lo <= hi
  ensures result >= lo
  ensures result <= hi
  = if x < lo then lo else if x > hi then hi else x

The Checker

The checker is untrusted code that drives the kernel. It walks a program’s graph bottom-up, applying kernel rules at each node, and collects the results.

How iris check works

iris check programs/verified/abs.iris
# [OK] abs: 5/5 obligations satisfied (score: 1.00)
  1. Parse and compile the .iris file to a SemanticGraph
  2. Topologically sort the graph nodes
  3. For each node, apply the appropriate kernel rule
  4. Count successes and failures
  5. Report the score: satisfied / total obligations
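The steps above can be sketched as a single pass (Python; `graphlib` is in the standard library, while the node and rule shapes are hypothetical):

```python
from graphlib import TopologicalSorter

def check_graph(nodes, predecessors, apply_rule):
    """Bottom-up pass: topologically sort the graph, apply a kernel rule per
    node, and report satisfied/total obligations as the graded score."""
    order = TopologicalSorter({n: predecessors.get(n, set()) for n in nodes})
    satisfied = total = 0
    for node in order.static_order():     # dependencies come before dependents
        total += 1
        try:
            apply_rule(node)              # would invoke a kernel rule; may raise
            satisfied += 1
        except Exception:
            pass                          # a failure lowers the score, doesn't abort
    return satisfied / total if total else 1.0
```

Note that a failed obligation does not stop the walk – the checker keeps going so the final ratio reflects how much of the program is provable.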

Graded Verification

Instead of binary pass/fail, the checker computes a score in [0.0, 1.0]. This is critical for evolution – a program that satisfies 8/10 obligations is better than one that satisfies 2/10, even though both “fail.”

The score is scaled by verification tier:

  • Tier 0 – core lambda calculus; score ceiling 0.3; decidable (< 10ms)
  • Tier 1 – adds fold, recursion, induction; ceiling 0.6; mostly decidable (< 1s)
  • Tier 2 – adds effects and extern; ceiling 0.9; requires annotation
  • Tier 3 – adds neural and property testing; ceiling 1.0; probabilistic
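Assuming the ceiling scales the raw obligation ratio proportionally (an assumption – the document only fixes the ceilings themselves), the tiered score might look like:

```python
TIER_CEILING = {0: 0.3, 1: 0.6, 2: 0.9, 3: 1.0}

def graded_score(satisfied, total, tier):
    """Scale the raw satisfied/total ratio by the tier's ceiling, so even a
    fully verified Tier 0 program scores at most 0.3."""
    raw = satisfied / total if total else 1.0
    return raw * TIER_CEILING[tier]
```

Under this scheme a Tier 0 program with all obligations met scores 0.3, while a Tier 3 program meeting 8 of 10 scores 0.8 – the graded signal evolution can climb.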

Proof-Guided Mutation

When verification fails, the checker produces a ProofFailureDiagnosis with a MutationHint:

  • AddTerminationCheck – missing loop termination
  • FixTypeSignature(expected, actual) – type mismatch
  • AddCostAnnotation – node needs a cost bound
  • WrapInGuard – add a runtime guard (e.g., division-by-zero check)

The evolution engine uses these hints to guide mutations toward provable programs. Instead of random search, the system knows what’s wrong and how to fix it.
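A minimal sketch of the mapping (Python; the diagnosis kinds and field names are hypothetical – the real ProofFailureDiagnosis fields are not shown in this document):

```python
from dataclasses import dataclass

@dataclass
class ProofFailureDiagnosis:
    kind: str            # hypothetical: "nontermination", "type_mismatch", ...
    expected: str = ""
    actual: str = ""

def mutation_hint(d: ProofFailureDiagnosis) -> str:
    """Translate a verification failure into a directed mutation,
    replacing blind random search with a targeted fix."""
    return {
        "nontermination": "AddTerminationCheck",
        "type_mismatch": f"FixTypeSignature({d.expected}, {d.actual})",
        "missing_cost": "AddCostAnnotation",
        "partial_op": "WrapInGuard",
    }[d.kind]
```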

Cost Analysis

Cost annotations declare the computational complexity of a function:

let sum xs : List Int -> Int [cost: Linear(xs)] = fold 0 (+) xs

The cost lattice forms a partial order:

Zero < Constant(k) < Linear(n) < NLogN(n) < Polynomial(n, d)

Composite forms handle sequential (Sum), parallel (Par), repeated (Mul), and branching (Sup for max, Inf for min) costs.
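A sketch of the order (Python; the tuple encoding and the treatment of Sum are assumptions, not the kernel’s actual representation):

```python
# Cost terms: ("Zero",), ("Constant", k), ("Linear",), ("NLogN",), ("Poly", d),
# plus composites like ("Sum", a, b). Ranks encode the documented chain
# Zero < Constant < Linear < NLogN < Polynomial.
RANK = {"Zero": 0, "Constant": 1, "Linear": 2, "NLogN": 3, "Poly": 4}

def cost_leq(a, b):
    """Partial-order check of the kind cost_subsume needs: actual <= declared."""
    if a[0] == "Sum":
        # sequential composition: asymptotically bounded if both parts are
        return cost_leq(a[1], b) and cost_leq(a[2], b)
    if a[0] == "Poly" and b[0] == "Poly":
        return a[1] <= b[1]            # lower degree is cheaper
    return RANK[a[0]] <= RANK[b[0]]
```

For example, `cost_leq(("Linear",), ("Poly", 2))` holds – the subsumption the text describes, accepting a Linear implementation where Polynomial was declared.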

The kernel’s cost_subsume rule can weaken cost bounds (e.g., accept a Linear implementation where Polynomial was declared), but cost subsumption is not yet wired into the main checker pipeline. Today, cost annotations function as evolution fitness objectives – lower cost is better – and as documentation of intended complexity: they guide evolution, not verification.

Lean Formalization

A separate Lean 4 codebase (lean/IrisKernel/) formalizes the proof kernel’s type theory.

What’s formalized

  • All 20 inference rules as constructors of an inductive Derivation type
  • The cost lattice partial order (CostLeq) with reflexivity, transitivity, and zero/unknown axioms
  • Type well-formedness: a type is well-formed if it and all its references exist
  • Key metatheorems: reflexivity produces valid judgments, cost weakening is transitive, Zero is bottom, Unknown is top
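The shape of such a formalization can be sketched in Lean 4 (constructor names here are illustrative, not the actual lean/IrisKernel/ source):

```lean
-- Hypothetical sketch: the cost order and the derivation type.
inductive Cost where
  | zero
  | constant (k : Nat)
  | linear
  | unknown

inductive CostLeq : Cost → Cost → Prop where
  | refl (c : Cost) : CostLeq c c
  | trans {a b c : Cost} : CostLeq a b → CostLeq b c → CostLeq a c
  | zeroBot (c : Cost) : CostLeq .zero c          -- Zero is bottom
  | unknownTop (c : Cost) : CostLeq c .unknown    -- Unknown is top

inductive Derivation : Type where
  | assume (x ty : String) : Derivation           -- rule 1
  | intro (body : Derivation) : Derivation        -- rule 2
  | elim (f a : Derivation) : Derivation          -- rule 3
  -- ... the remaining 17 rules as further constructors

-- A metatheorem of the kind listed above: Zero is below everything.
theorem zero_le (c : Cost) : CostLeq .zero c := .zeroBot c
```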

What’s not formalized yet

  • Full weakening lemma (context extension preserves derivations) – partially proven, needs structural induction over all derivation forms
  • Some cost algebra embedding rules (e.g., a <= Sum(a, b))
  • Full soundness proof of the combined rule set

What “formalized” means

The Lean code proves properties about the type-theoretic design, not about the Rust implementation. The Rust kernel could have bugs that the Lean proofs don’t catch. The Lean formalization is a design validation tool, not a code extraction pipeline.

Soundness Argument

The 20 rules are sound because they correspond to well-understood type-theoretic constructs:

  1. Rules 1–7, 15 = simply-typed lambda calculus (known sound since the 1940s)
  2. Rules 18–19 = System F polymorphism (sound, with well-formedness guard on substitutions)
  3. Rules 11–12 = refinement types / liquid types (sound: intro requires independent evidence)
  4. Rules 13–14 = standard Peano and structural induction (sound, requires exhaustive cases)
  5. Rules 9–10, 17 = cost annotations in a separate dimension (don’t affect type soundness)
  6. Rule 8 = trusts content-addressed node annotations (conservative)
  7. Rules 16, 20 = standard sum elimination and conditional (sound: all branches agree on type)

The LCF architecture ensures that even if untrusted code (the checker, evolution, user programs) tries to construct invalid proofs, the kernel rejects them.

What Verification Does NOT Do

To be clear about the boundaries:

  • Verification is optional. Most programs today carry no proofs. You can write and run IRIS code without ever invoking the checker.
  • Contracts and refinement types are parsed but not enforced. requires, ensures, and {x : T | P} syntax exists, but the lowerer discards them. The kernel has the rules; the wiring from syntax to kernel doesn’t exist.
  • Cost annotations are metadata. The kernel has cost rules, but they’re not wired into the main checker pipeline. Costs guide evolution, not verification.
  • The Rust kernel is not verified. It’s small, auditable, and has zero unsafe – but no code extraction or formal linkage to the Lean formalization.
  • Property testing is probabilistic. When the LIA solver can’t decide, random testing provides confidence, not certainty.