Q9 — Formal Verification Reference

TLA+ Properties & Principles

What does REF formally specify, what invariants does TLC check, and what Rigorous Digital Engineering principles guide the verification architecture? A complete map of REF's three specification layers.

Consensus Spec
REFConsensus v1.9 (EV-008)
TLC States (Consensus)
152,658,351 generated · 0 on queue
Token Spec
TokenLifecycle (EV-009)
Methodology
Rigorous Digital Engineering
Core Position
REF uses TLA+ to express safety invariants as mathematical properties — checked exhaustively by TLC within finite bounds — not as prose assertions. Every property traces to a security requirement above and an implementation artifact below.
01 — Specification Architecture

Three Layers, One Verification Chain

REF's TLA+ specifications are organized into three layers, each addressing a distinct correctness domain. Together they form the assume-guarantee chain that connects cryptographic primitives to economic incentives.

Layer 1 — Byzantine Agreement
REFConsensus.tla v1.9
EV-008 · 4 validators (1 Byzantine) · 4 invariants · 152.7M states · 11m 06s · COMPLETE
Layer 2 — Token Lifecycle
TokenLifecycle.tla
EV-009 · 2 purchases, 3 nullifiers · 7 invariants · 269 states · COMPLETE
Layer 3 — Economic Deterrence
Dominance Proof
Analytic + Monte Carlo · 500 scenarios · 100% honesty dominance
"A specification is not documentation — it is a machine-checkable claim about what your system can and cannot do."
Rigorous Digital Engineering methodology
02 — Consensus Layer Properties

Byzantine Agreement Invariants

Four invariants checked by TLC across 152,658,351 states with non-vacuous Byzantine behavior. Configuration: 4 validators (v1–v4), 1 Byzantine (v4), 2 competing values, MaxHeight=1, MaxView=0. Zero violations. Zero states remaining on queue.

Agreement
Safety
Agreement == ∀ v1, v2 ∈ Validators: decision[v1] ≠ NIL ∧ decision[v2] ≠ NIL ⇒ decision[v1] = decision[v2]
If any two validators have both reached a decision, those decisions must be identical. This is the core BFT safety property — the network cannot fork, even when validator v4 is actively Byzantine (251,966 ByzPropose executions during run).
TLC checked · EV-008 · 152,658,351 states · 0 violations
LockedValueSafety
Safety
LockedValueSafety == ∀ v ∈ Honest: lockedValue[v] ≠ NIL ∧ lockedRound[v] ≥ 0 ⇒ (* validator only votes for locked value or NIL until unlock *)
Once an honest validator locks on a value, it will not vote for a conflicting value in subsequent rounds unless properly unlocked. This prevents equivocation and ensures view changes preserve safety.
TLC checked · EV-008 · 0 violations
UniqueHonestProposalPerRound
Integrity
UniqueHonestProposalPerRound == ∀ h ∈ Heights, r ∈ Rounds: ∀ p1, p2 ∈ Proposals: p1.proposer ∈ Honest ∧ p2.proposer ∈ Honest ∧ p1.height = h ∧ p2.height = h ∧ p1.round = r ∧ p2.round = r ⇒ p1 = p2
An honest validator proposes at most one value per round per height. Combined with Agreement, this ensures that equivocation is only possible from Byzantine nodes — and even then, it cannot break safety.
TLC checked · EV-008 · 0 violations
PrepareVotesReferenceProposals
Integrity
PrepareVotesReferenceProposals == ∀ m ∈ Msgs: m.type = "PREPARE" ∧ m.from ∈ Honest ⇒ ∃ p ∈ proposals: m.value = p.value
Every honest PREPARE vote references a value that was actually proposed. Honest validators never vote for phantom values — they only endorse proposals that exist in the proposal registry.
TLC checked · EV-008 · 0 violations
Byzantine Non-Vacuity Evidence

The v1.9 spec includes explicit Byzantine actions: ByzPropose (251,966 executions), ByzPrepare (1,901,481 executions), ByzCommit (12,863,638 executions). The adversary was active across millions of state transitions — this is not a vacuous pass on a system with no Byzantine behavior.

Known Limitation — Documented for Transparency

MaxView=0 in the EV-008 harness. This configuration does not exercise view-change safety properties. LockedValueSafety is checked structurally but view changes that test lock propagation across rounds require MaxView ≥ 1. This is the highest-priority extension for Phase 2 model checking.

03 — Token Lifecycle Properties

Purchase-to-Review Binding Invariants

Seven invariants checked by TLC across 269 states (exhaustive for the tiny harness: 2 purchases, 3 nullifiers, 2 content values). Clean pass on February 11, 2026. Zero violations. Zero states remaining on queue.

TypeOK
Well-Typedness
TypeOK == /\ purchases ⊆ Purchases /\ tokens ∈ [Purchases → TokenRecord ∪ {⊥}] /\ reviews ⊆ ReviewRecord /\ nullifier_set ⊆ Nullifiers
All four state variables are well-typed. Purchases is a set of valid purchase IDs, tokens maps purchases to token records, reviews is a set of valid review records, and nullifier_set contains only valid nullifiers.
TLC checked · EV-009 · 269 states · 0 violations
AtMostOneTokenPerPurchase
Integrity
∀ p ∈ purchases: |{t ∈ tokens : t.purchase = p}| ≤ 1
Each purchase generates at most one cryptographic token. Prevents token multiplication attacks where an adversary attempts to generate multiple review rights from a single transaction.
TLC checked · EV-009 · 0 violations
AtMostOneReviewPerToken
Integrity
∀ t ∈ tokens: |{r ∈ reviews : r.token = t}| ≤ 1
Each token can produce exactly one review. Combined with the previous invariant, this enforces the one-purchase → one-review binding end-to-end.
TLC checked · EV-009 · 0 violations
NoOrphanReview
Integrity
∀ r ∈ reviews: ∃ t ∈ tokens: r.token = t ∧ t.status = valid
Every published review has a corresponding valid token. No review can exist in the system without cryptographic proof of a legitimate purchase behind it.
TLC checked · EV-009 · 0 violations
NullifierUniqueness
Non-repudiation
∀ n ∈ nullifier_set: |{t ∈ tokens : t.nullifier = n}| = 1
Every nullifier in the spent set maps to exactly one token. Prevents nullifier collision attacks and ensures the double-spend check is sound.
TLC checked · EV-009 · 0 violations
NullifierImpliesReview
Consistency
∀ n ∈ nullifier_set: ∃ r ∈ reviews: r.nullifier = n
No nullifier is marked spent without a review actually being published. Guarantees atomicity — token consumption and review publication are an indivisible operation.
TLC checked · EV-009 · 0 violations
GlobalNullifierUniqueness
Cross-Purchase
\* Stronger than NullifierUniqueness: \* No two tokens across different purchases share a nullifier ∀ p1, p2 ∈ Purchases: p1 ≠ p2 ⇒ tokens[p1].nullifier ≠ tokens[p2].nullifier
Nullifiers are globally unique across all purchases — not just within a single purchase. This prevents cross-purchase nullifier reuse, which would allow a review from purchase A to block the review slot of purchase B.
TLC checked · EV-009 · 0 violations
Naming Reconciliation Note

Two properties from the Q9 source document — NoDoubleSpend and TokenStateMonotonicity — are not named INVARIANT lines in the TLC configuration. NoDoubleSpend is structurally enforced by the state machine: the SubmitReview action's guard requires nullifier ∉ nullifier_set, making double-spend impossible by construction. TokenStateMonotonicity (ISSUED → REDEEMED → SPENT) is likewise enforced by action guards on each transition. Both properties hold — they are enforced structurally rather than checked as separate invariants.

04 — Mechanism Design Property

Economic Deterrence Invariant

Where cryptography makes fraud structurally impossible, mechanism design makes the residual attack surface — merchant collusion — economically irrational.

HonestyDominance
IC/IR
α · C₀ · Nγ > k ⇒ v* = 0 is unique profit maximum \* R(v) = k·ln(1+v) — reward (diminishing returns) \* C(v,N) = C₀·e^(αv)·N^γ — cost (exponential scaling) \* D(v) = C'(v) - R'(v) increasing for α>0 \* D(0) > 0 ⇒ no fraud is profitable at any scale
When the dominance condition holds, the marginal cost of the very first fake review already exceeds its marginal benefit. Fraud is not merely expensive — it is economically dominated at every volume. Validated across 500 Monte Carlo parameter scenarios with 100% pass rate.
Analytic proof complete · Monte Carlo 500/500 · Runtime enforceable
Verification Evidence

Analytic: D(v) = C'(v) − R'(v). Since C'(v) = α·C₀·eαv·Nγ and R'(v) = k/(1+v), the condition D(0) > 0 reduces to α·C₀·Nγ > k.

Empirical: mc_results_quick.csv — 500 parameter draws, all 500 satisfy honesty_dominates = True over tested ranges (α ∈ [0.037, 0.205], C₀ ∈ [549, 39125], γ ∈ [1.77, 2.90], N ∈ [1041, 19969]).

05 — RDE Principles Embodied

Rigorous Digital Engineering in Practice

The TLA+ specifications embody four principles from Rigorous Digital Engineering that structure the entire verification approach.

Refinement Chain

Each TLA+ spec traces to security requirements above (threat model, security properties) and implementation artifacts below (circuit signals, API schemas). The spec bridges "what must hold" and "what the code does."

Assume-Guarantee Composition

Layers state what they assume and guarantee. Crypto guarantees (binding, soundness) feed consensus assumptions. Consensus guarantees (agreement, finality) feed economic IC/IR conditions. Documented in THREAT_MODEL.md §6.

Design by Contract

Preconditions guard all state transitions. Leader ∈ Committee before Propose. nullifier ∉ nullifier_set before SubmitReview. token.status = ISSUED before Redeem. Violations are structurally impossible, not merely detected.

Bounded Model Checking

Properties are verified within explicit finite bounds (4 validators, MaxView=0 for consensus; 2 purchases, 3 nullifiers for tokens). Limitations are documented, not hidden. Unbounded verification requires theorem provers (TLAPS), planned for Phase 2.

06 — Cross-Layer Composition

The Assume-Guarantee Chain

REF's three layers compose through explicit contracts. Each layer's guarantees become the next layer's assumptions.

Cryptographic Layer → Consensus Layer → Economic Layer
Crypto Guarantees

Binding commitments (Poseidon), unforgeable signatures (Ed25519), sound proofs (Groth16 128-bit), domain-separated tags prevent cross-protocol replay.

Consensus Assumes

Attestations are unforgeable (honest merchant ⇒ valid signature). Tokens are binding (witness cannot be forged). Nullifiers are collision-resistant.

feeds into
Consensus Guarantees

Agreement (no conflicting decisions). LockedValueSafety (no equivocation). UniqueHonestProposalPerRound. PrepareVotesReferenceProposals.

Economics Assumes

Finalized tokens cannot be forged or duplicated (consensus Agreement). Network-wide nullifier uniqueness maintained (GlobalNullifierUniqueness). N validators participate at measured rate.

07 — Verification Status Summary

What Is Checked, What Is Planned

Transparency requires distinguishing verified claims from planned work. Every entry traces to a specific artifact and TLC log.

Property
Layer
Status
Evidence
Agreement
Consensus
✓ Checked
EV-008 · 152.7M states
LockedValueSafety
Consensus
✓ Checked
EV-008 · 152.7M states
UniqueHonestProposalPerRound
Consensus
✓ Checked
EV-008 · 152.7M states
PrepareVotesReferenceProposals
Consensus
✓ Checked
EV-008 · 152.7M states
TypeOK
Token
✓ Checked
EV-009 · 269 states
AtMostOneTokenPerPurchase
Token
✓ Checked
EV-009 · 269 states
AtMostOneReviewPerToken
Token
✓ Checked
EV-009 · 269 states
NoOrphanReview
Token
✓ Checked
EV-009 · 269 states
NullifierUniqueness
Token
✓ Checked
EV-009 · 269 states
NullifierImpliesReview
Token
✓ Checked
EV-009 · 269 states
GlobalNullifierUniqueness
Token
✓ Checked
EV-009 · 269 states
NoDoubleSpend
Token
⊢ Structural
Action guard on SubmitReview
TokenStateMonotonicity
Token
⊢ Structural
Action guards enforce forward-only
HonestyDominance
Mechanism
✓ Validated
Analytic + MC (500/500)
View-Change Safety (MaxView ≥ 1)
Consensus
○ Planned
Requires larger harness
Liveness under Partial Synchrony
Consensus
○ Planned
Requires fairness + TLAPS
Spec↔Code Equivalence
Cross-Layer
○ Planned
Requires Cryptol/SAW
Intellectual Honesty Note

Bounded model checking proves properties hold within the explored state space — not universally. The green items are verified by TLC logs. The cyan items hold by construction (action guards) but are not separate INVARIANT lines. The amber items are gaps where formal methods expertise would accelerate the work. The refinement from TLA+ to running Circom/Python code is human-maintained, not machine-checked (DEF-02).