Deep Dive

BFT Consensus & Distributed Ledger

Classical PBFT (Castro–Liskov, 1999) with Ed25519-signed quorum certificates, view-change protocol for leader failure recovery, periodic checkpointing with garbage collection, and a replicated distributed ledger for agent economic state.

PBFT (Castro–Liskov) · Ed25519 Quorum Certs · View Change Protocol · State Machine Replication

Three-Phase Commit

PBFT message flow visualized

The protocol proceeds in three broadcast phases after the client sends a request. The primary assigns a sequence number in PRE-PREPARE, replicas echo agreement in PREPARE, and signal execution readiness in COMMIT. Each phase requires a quorum of 2f+1 matching messages before advancing.

Phase 0  REQUEST      client → primary
Phase 1  PRE-PREPARE  primary broadcast
Phase 2  PREPARE      quorum ≥ 2f+1
Phase 3  COMMIT       quorum ≥ 2f+1
Phase 4  EXECUTE      sequential

Participants: client C; primary R₀; replicas R₁, R₂, R₃.
pbft_three_phase.py
class PBFTReplica:
    """Three-phase commit with Ed25519-signed messages."""

    async def handle_request(self, request: ClientRequest):
        if not self.is_primary:
            return self.forward_to_primary(request)

        seq = self.next_sequence()
        digest = sha256(serialize(request)).hexdigest()

        pre_prepare = PBFTMessage(
            msg_type=MessageType.PRE_PREPARE,
            view=self.current_view,
            sequence=seq,
            digest=digest,
            sender_id=self.replica_id,
            signature=self.sign(f"AEOS/pbft/pp/{self.current_view}/{seq}/{digest}")
        )
        await self.broadcast(pre_prepare)

    async def handle_prepare(self, msg: PBFTMessage):
        self.prepare_log[msg.sequence].add(msg)
        # Advance at most once: extra PREPAREs past the quorum
        # must not trigger a second COMMIT broadcast.
        if (self.phase[msg.sequence] < Phase.PREPARED
                and len(self.prepare_log[msg.sequence]) >= self.quorum_size):
            self.phase[msg.sequence] = Phase.PREPARED
            commit = self._build_commit(msg.view, msg.sequence, msg.digest)
            await self.broadcast(commit)

    async def handle_commit(self, msg: PBFTMessage):
        self.commit_log[msg.sequence].add(msg)
        # Execute only from PREPARED, and only once per sequence slot.
        if (self.phase[msg.sequence] == Phase.PREPARED
                and len(self.commit_log[msg.sequence]) >= self.quorum_size):
            self.phase[msg.sequence] = Phase.COMMITTED
            result = self.execute(msg.sequence)
            await self.reply_to_client(result)
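The quorum gate inside handle_prepare can be exercised in isolation. A minimal standalone sketch (the names F, QUORUM, and on_prepare are illustrative, not part of the AEOS codebase), using a set-based vote log so a duplicate sender never double-counts:

```python
# Standalone sketch of the quorum rule gating each phase transition:
# a slot advances only once 2f + 1 distinct replicas have voted.
from collections import defaultdict

F = 1               # tolerated Byzantine faults
QUORUM = 2 * F + 1  # 3 matching messages for n = 4

prepare_log: dict[int, set[str]] = defaultdict(set)

def on_prepare(sequence: int, sender_id: str) -> bool:
    """Record a PREPARE vote; return True once the slot is PREPARED."""
    prepare_log[sequence].add(sender_id)  # set ⇒ one vote per replica
    return len(prepare_log[sequence]) >= QUORUM

assert not on_prepare(42, "R1")  # 1 vote
assert not on_prepare(42, "R1")  # duplicate sender does not double-count
assert not on_prepare(42, "R2")  # 2 votes
assert on_prepare(42, "R3")      # 3 votes = 2f+1 → PREPARED
```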

Byzantine Tolerance

n = 3f + 1 fault bound

PBFT requires n ≥ 3f + 1 replicas to tolerate f Byzantine faults. This bound ensures that any two quorums of 2f+1 overlap in at least one correct replica—the fundamental property that prevents conflicting commits.

n = 4:  tolerates f = 1,  quorum 2f+1 = 3
n = 7:  tolerates f = 2,  quorum 2f+1 = 5
n = 10: tolerates f = 3,  quorum 2f+1 = 7
n = 13: tolerates f = 4,  quorum 2f+1 = 9

overlap ≥ (2f+1) + (2f+1) − (3f+1) = f+1 ≥ 1 correct
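The cluster values above follow mechanically from n; a short sketch (fault_bound is an illustrative helper, not AEOS API) reproduces them:

```python
# Derive tolerated faults f and quorum size 2f+1 from cluster size n.
def fault_bound(n: int) -> tuple[int, int]:
    """Return (f, quorum) for an n-replica PBFT cluster."""
    f = (n - 1) // 3  # largest f satisfying n >= 3f + 1
    return f, 2 * f + 1

for n in (4, 7, 10, 13):
    f, quorum = fault_bound(n)
    print(f"n = {n:2d}  tolerates f = {f}  quorum = {quorum}")
```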

Quorum Certificate

Ed25519-signed proof of agreement

A QuorumCertificate bundles the message metadata with a dictionary of Ed25519 signatures from distinct replicas. The certificate is valid when the signature count meets or exceeds the quorum threshold of 2f+1.

QuorumCertificate: Consensus Proof

msg_type: MessageType = COMMIT
view: u64 = 7
sequence: u64 = 1042
digest: SHA-256 = 0x9f3a...c8d1

Signatures (replica_id → Ed25519):
R₀ → 0x7a2f...e4b1
R₁ → 0x3c8d...a9f2
R₂ → 0x1e5b...d7c3
R₃ → (no signature)

valid = len(signatures) ≥ quorum_size → 3 ≥ 3 ✓
quorum_certificate.py
from dataclasses import dataclass

from nacl.exceptions import BadSignatureError
from nacl.signing import VerifyKey

@dataclass
class QuorumCertificate:
    """Aggregated proof of replica agreement."""

    msg_type: MessageType
    view: int
    sequence: int
    digest: str
    network_size: int             # n = 3f + 1 replicas
    signatures: dict[str, bytes]  # replica_id → Ed25519 sig

    @property
    def f(self) -> int:
        """Max Byzantine faults from network size."""
        return (self.network_size - 1) // 3

    @property
    def quorum_size(self) -> int:
        return 2 * self.f + 1

    def is_valid(self, network_keys: dict[str, VerifyKey]) -> bool:
        """Verify quorum of distinct valid signatures."""
        valid_count = 0
        # dict keys are unique, so each replica counts at most once
        for replica_id, signature in self.signatures.items():
            verify_key = network_keys.get(replica_id)
            if verify_key is None:
                continue  # unknown replica

            message = domain_sep(
                f"AEOS/pbft/{self.msg_type.value}",
                f"{self.view}/{self.sequence}/{self.digest}"
            )
            try:
                verify_key.verify(message, signature)
                valid_count += 1
            except BadSignatureError:
                continue

        return valid_count >= self.quorum_size

View Change Protocol

Leader failure & recovery

When the primary is suspected faulty, replicas initiate a view change. The new primary is deterministically selected as primary = view mod n. Prepared certificates are carried forward to preserve the safety invariant across view transitions.

Normal Operation (view = v, primary = R₀)
→ timeout expired: replica broadcasts VIEW_CHANGE to all
→ 2f+1 VIEW_CHANGE certs collected: new primary sends NEW_VIEW
→ Resumed (view = v+1, primary = R₁)
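Deterministic rotation means every replica computes the same new primary with no extra communication. A one-line sketch (primary_for_view is an illustrative name) of primary = view mod n:

```python
# Round-robin primary selection: primary = view mod n.
N = 4  # replicas R0..R3

def primary_for_view(view: int) -> str:
    """Every replica derives the same primary from the view number alone."""
    return f"R{view % N}"

print([primary_for_view(v) for v in range(6)])
# Cycles R0, R1, R2, R3, R0, R1
```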
view_change.py
async def initiate_view_change(self):
    """Trigger view change when primary is suspected faulty."""
    self.current_view += 1
    new_primary = self.current_view % self.network_size

    # Collect prepared certificates to carry forward
    prepared_certs = {
        seq: cert for seq, cert in self.certificates.items()
        if cert.msg_type == MessageType.PREPARE
    }

    view_change = PBFTMessage(
        msg_type=MessageType.VIEW_CHANGE,
        view=self.current_view,
        sequence=self.last_stable_checkpoint,
        digest=serialize_certs(prepared_certs),
        sender_id=self.replica_id,
        signature=self.sign(
            f"AEOS/pbft/vc/{self.current_view}/{self.last_stable_checkpoint}"
        )
    )
    await self.broadcast(view_change)

async def handle_new_view(self, msg: PBFTMessage):
    """Handle NEW_VIEW, sent by the new primary after it collects 2f+1 VIEW_CHANGE messages."""
    if not self.verify_new_view_proof(msg):
        return  # invalid proof

    self.current_view = msg.view
    # Re-propose any operations that were prepared but not committed
    for cert in msg.prepared_certificates:
        await self.handle_request(cert.original_request)

Safety Proof

Formal proof sketch for safety

The safety guarantee—that no two correct replicas commit different values for the same sequence in the same view—follows from three key properties of quorum intersection, certificate integrity, and view change preservation.

1. Quorum Overlap

Any two quorums of size 2f+1 drawn from n = 3f+1 replicas must overlap in at least f+1 replicas.

|Q₁ ∩ Q₂| ≥ |Q₁| + |Q₂| − n = (2f+1) + (2f+1) − (3f+1) = f + 1

Since at most f replicas are Byzantine, at least one correct replica exists in every intersection.
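The overlap bound can also be checked exhaustively for small f; this sketch enumerates every pair of quorums:

```python
# Brute-force check: every pair of (2f+1)-sized quorums drawn from
# n = 3f+1 replicas shares at least f+1 members.
from itertools import combinations

for f in (1, 2):
    n, q = 3 * f + 1, 2 * f + 1
    min_overlap = min(
        len(set(a) & set(b))
        for a in combinations(range(n), q)
        for b in combinations(range(n), q)
    )
    assert min_overlap == f + 1  # bound is tight
    print(f"f={f}: min quorum overlap = {min_overlap}")
```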

2. No Conflicting Commits

A correct replica in the overlap received both PREPARE quorums. Since it is correct, it only signs one digest per (view, sequence) pair.

∀ correct replica r: prepared(r, v, s, d₁) ∧ prepared(r, v, s, d₂) ⟹ d₁ = d₂

Two conflicting values would require the correct replica to have signed both—a contradiction.

3. View Change Preserves Safety

The new primary in view v+1 collects prepared certificates from 2f+1 replicas and re-proposes any operation that reached the PREPARED state.

VIEW_CHANGE carries prepared_certs for (v, s, d); NEW_VIEW re-proposes: ∀ prepared(v, s, d) → pre_prepare(v+1, s, d)

Safety is preserved across arbitrary view transitions by certificate forwarding.

Checkpointing

Periodic snapshots & garbage collection

Every CHECKPOINT_PERIOD = 100 sequences, replicas publish a signed digest of their current state. Once 2f+1 matching checkpoint proofs are collected, the checkpoint becomes stable and log entries at or below the stable sequence are garbage-collected.

01  Checkpoint trigger: sequence % CHECKPOINT_PERIOD == 0 triggers a checkpoint broadcast

02  State digest: each replica signs the checkpoint with the AEOS/ckpt/ domain prefix over the state hash

03  Stable checkpoint: a quorum of 2f+1 matching proofs promotes the checkpoint to stable status

04  Garbage collection: delete log entries ≤ stable_seq; retain only the latest stable snapshot

checkpointing.py
CHECKPOINT_PERIOD = 100

async def maybe_checkpoint(self, sequence: int):
    """Trigger checkpoint every CHECKPOINT_PERIOD sequences."""
    if sequence % CHECKPOINT_PERIOD != 0:
        return

    state_digest = sha256(serialize(self.state)).hexdigest()
    checkpoint = PBFTMessage(
        msg_type=MessageType.CHECKPOINT,
        view=self.current_view,
        sequence=sequence,
        digest=state_digest,
        sender_id=self.replica_id,
        signature=self.sign(
            f"AEOS/ckpt/{sequence}/{state_digest}"
        )
    )
    await self.broadcast(checkpoint)

async def handle_checkpoint(self, msg: PBFTMessage):
    """Collect checkpoint proofs; stabilize at quorum."""
    key = (msg.sequence, msg.digest)
    self.checkpoint_proofs[key].add(msg)

    # Ignore quorums for checkpoints we have already passed.
    if (msg.sequence > self.last_stable_checkpoint
            and len(self.checkpoint_proofs[key]) >= self.quorum_size):
        self.last_stable_checkpoint = msg.sequence
        self.stable_digest = msg.digest
        self._garbage_collect(msg.sequence)

def _garbage_collect(self, stable_seq: int):
    """Prune log entries at or below the stable checkpoint."""
    for seq in list(self.prepare_log.keys()):
        if seq <= stable_seq:
            del self.prepare_log[seq]
            # commit/phase entries may be absent for slots that never advanced
            self.commit_log.pop(seq, None)
            self.phase.pop(seq, None)

Message Types

8 protocol message types

Every PBFT message is typed, view-tagged, sequence-numbered, and Ed25519-signed. The type enum covers the full protocol lifecycle from client request through checkpoint stabilization.

Code  Type         Description
0     REQUEST      Client submits operation to the primary replica for ordering
1     PRE_PREPARE  Primary assigns sequence number and broadcasts to all replicas
2     PREPARE      Replica acknowledges PRE-PREPARE; quorum = 2f+1 matching messages
3     COMMIT       Replica signals readiness to execute; quorum = 2f+1 required
4     REPLY        Replica returns execution result to the requesting client
5     VIEW_CHANGE  Replica suspects primary failure; triggers leader rotation protocol
6     NEW_VIEW     New primary proves legitimacy with 2f+1 VIEW_CHANGE certificates
7     CHECKPOINT   Replica publishes signed state digest every CHECKPOINT_PERIOD sequences
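Read one-to-one from the table, a plausible IntEnum definition (illustrative; the actual AEOS names and values may differ):

```python
# Protocol message types as an IntEnum, matching the table above.
from enum import IntEnum

class MessageType(IntEnum):
    REQUEST     = 0  # client → primary
    PRE_PREPARE = 1  # primary assigns sequence number
    PREPARE     = 2  # replica echo, quorum 2f+1
    COMMIT      = 3  # execution readiness, quorum 2f+1
    REPLY       = 4  # result back to client
    VIEW_CHANGE = 5  # primary suspected faulty
    NEW_VIEW    = 6  # new primary's legitimacy proof
    CHECKPOINT  = 7  # periodic signed state digest

print(len(MessageType))     # 8
print(MessageType(5).name)  # VIEW_CHANGE
```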

Phase Enum

5 execution phases

Each operation tracks its progress through a deterministic state machine. Phase transitions are monotonic—once an operation advances, it never regresses within the same view.

0  IDLE          Waiting for client request
1  PRE_PREPARED  Primary assigned sequence
2  PREPARED      Quorum of PREPARE received
3  COMMITTED     Quorum of COMMIT received
4  EXECUTED      Operation applied to state

phase_enum.py
from enum import IntEnum

class Phase(IntEnum):
    """Monotonic execution phase per sequence slot."""
    IDLE         = 0  # no request assigned
    PRE_PREPARED = 1  # primary broadcast PRE_PREPARE
    PREPARED     = 2  # received 2f+1 matching PREPARE
    COMMITTED    = 3  # received 2f+1 matching COMMIT
    EXECUTED     = 4  # operation applied to state machine

    def can_advance_to(self, target: "Phase") -> bool:
        """Phases advance monotonically within a view."""
        return target.value == self.value + 1

Distributed Ledger

State machine replication for AEOS

The DistributedLedger wraps a PBFTNetwork to provide deterministic replication of agent economic state. Operations are deduplicated via op_hash before submission to the consensus layer, ensuring exactly-once semantics across replicas.

agent_registered: New agent identity committed to replicated state

contract_created: Binding agreement anchored with multi-sig attestation

obligation_fulfilled: Milestone delivery confirmed and escrow released

dispute_filed: Dispute evidence chain submitted for arbitration

Deduplication via op_hash:

operation (e.g. agent_registered) → SHA-256(op) → op_hash = 0x3f... → seen_ops check (skip if duplicate) → PBFT submit (consensus round)

distributed_ledger.py
class DistributedLedger:
    """Replicated economic state machine over PBFTNetwork."""

    def __init__(self, network: PBFTNetwork):
        self.network = network
        self.seen_ops: set[str] = set()
        self.state: dict[str, Any] = {}

    async def submit_operation(
        self, op_type: str, payload: dict
    ) -> ConsensusResult:
        """Submit with deduplication via content-addressed hash."""
        op_hash = sha256(
            f"{op_type}:{serialize(payload)}".encode()
        ).hexdigest()

        if op_hash in self.seen_ops:
            return ConsensusResult(status="duplicate", seq=-1)

        self.seen_ops.add(op_hash)
        result = await self.network.submit(
            operation={"type": op_type, "payload": payload, "hash": op_hash}
        )

        if result.committed:
            self._apply(op_type, payload)
        return result

    def _apply(self, op_type: str, payload: dict):
        """Deterministic state transition after consensus."""
        match op_type:
            case "agent_registered":
                self.state[payload["did"]] = payload
            case "contract_created":
                self.state[payload["contract_id"]] = payload
            case "obligation_fulfilled":
                contract = self.state[payload["contract_id"]]
                contract["milestones_complete"] += 1
            case "dispute_filed":
                self.state[payload["dispute_id"]] = payload
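The dedup path can be exercised standalone; in this sketch json.dumps with sort_keys stands in for the document's serialize helper, so identical operations collapse to one submission:

```python
# Content-addressed dedup, mirroring submit_operation's op_hash check.
import json
from hashlib import sha256

seen_ops: set[str] = set()

def op_hash(op_type: str, payload: dict) -> str:
    # Deterministic serialization so equal payloads hash identically.
    body = f"{op_type}:{json.dumps(payload, sort_keys=True)}"
    return sha256(body.encode()).hexdigest()

def submit(op_type: str, payload: dict) -> str:
    """Return 'duplicate' for an already-seen operation, else 'submitted'."""
    h = op_hash(op_type, payload)
    if h in seen_ops:
        return "duplicate"
    seen_ops.add(h)
    return "submitted"

print(submit("agent_registered", {"did": "did:aeos:alice"}))  # submitted
print(submit("agent_registered", {"did": "did:aeos:alice"}))  # duplicate
```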

Continue exploring the protocol

BFT consensus provides the finality layer. Explore how identities anchor the economic graph, how tokenization governs programmable assets, and how the full protocol stack fits together.