BFT Consensus & Distributed Ledger
Classical PBFT (Castro–Liskov, 1999) with Ed25519-signed quorum certificates, view-change protocol for leader failure recovery, periodic checkpointing with garbage collection, and a replicated distributed ledger for agent economic state.
Three-Phase Commit
PBFT message flow visualized
The protocol proceeds in three broadcast phases after the client sends a request. The primary assigns a sequence number in PRE-PREPARE, replicas echo agreement in PREPARE, and signal execution readiness in COMMIT. Each phase requires a quorum of 2f+1 matching messages before advancing.
```python
class PBFTReplica:
    """Three-phase commit with Ed25519-signed messages."""

    async def handle_request(self, request: ClientRequest):
        if not self.is_primary:
            return self.forward_to_primary(request)
        seq = self.next_sequence()
        digest = sha256(serialize(request)).hexdigest()
        pre_prepare = PBFTMessage(
            msg_type=MessageType.PRE_PREPARE,
            view=self.current_view,
            sequence=seq,
            digest=digest,
            sender_id=self.replica_id,
            signature=self.sign(f"AEOS/pbft/pp/{self.current_view}/{seq}/{digest}")
        )
        await self.broadcast(pre_prepare)

    async def handle_prepare(self, msg: PBFTMessage):
        self.prepare_log[msg.sequence].add(msg)
        if (len(self.prepare_log[msg.sequence]) >= self.quorum_size
                and self.phase[msg.sequence] != Phase.PREPARED):  # advance once
            self.phase[msg.sequence] = Phase.PREPARED
            commit = self._build_commit(msg.view, msg.sequence, msg.digest)
            await self.broadcast(commit)

    async def handle_commit(self, msg: PBFTMessage):
        self.commit_log[msg.sequence].add(msg)
        if (len(self.commit_log[msg.sequence]) >= self.quorum_size
                and self.phase[msg.sequence] != Phase.COMMITTED):  # advance once
            self.phase[msg.sequence] = Phase.COMMITTED
            result = self.execute(msg.sequence)
            await self.reply_to_client(result)
```

Byzantine Tolerance
n = 3f + 1 fault bound
PBFT requires n ≥ 3f + 1 replicas to tolerate f Byzantine faults. This bound ensures that any two quorums of 2f+1 overlap in at least one correct replica—the fundamental property that prevents conflicting commits.
| Cluster size n | Tolerated faults f |
|---|---|
| 4 | 1 |
| 7 | 2 |
| 10 | 3 |
| 13 | 4 |
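The fault bound in the table can be computed directly from the cluster size; a minimal sketch (the helper name `fault_tolerance` is illustrative, not part of the protocol code):

```python
def fault_tolerance(n: int) -> tuple[int, int]:
    """Max Byzantine faults f = (n - 1) // 3 and quorum size 2f + 1."""
    f = (n - 1) // 3
    return f, 2 * f + 1

for n in (4, 7, 10, 13):
    f, quorum = fault_tolerance(n)
    print(f"n={n}: f={f}, quorum={quorum}")  # n=4: f=1, quorum=3
```

Note that sizes between the listed values (e.g. n = 5 or 6) tolerate no more faults than n = 4, which is why BFT clusters are typically provisioned at exactly 3f + 1.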
Quorum Certificate
Ed25519-signed proof of agreement
A QuorumCertificate bundles the message metadata with a dictionary of Ed25519 signatures from distinct replicas. The certificate is valid when the signature count meets or exceeds the quorum threshold of 2f+1.
```python
@dataclass
class QuorumCertificate:
    """Aggregated proof of replica agreement."""
    msg_type: MessageType
    view: int
    sequence: int
    digest: str
    network_size: int             # n, needed to derive f
    signatures: dict[str, bytes]  # replica_id → Ed25519 sig; keys are distinct

    @property
    def f(self) -> int:
        """Max Byzantine faults from network size."""
        return (self.network_size - 1) // 3

    @property
    def quorum_size(self) -> int:
        return 2 * self.f + 1

    def is_valid(self, network_keys: dict[str, VerifyKey]) -> bool:
        """Verify a quorum of valid signatures from distinct replicas."""
        # Dict keys guarantee one signature per replica, so no double-counting.
        valid_count = 0
        for replica_id, signature in self.signatures.items():
            verify_key = network_keys.get(replica_id)
            if verify_key is None:
                continue  # unknown replica
            message = domain_sep(
                f"AEOS/pbft/{self.msg_type.value}",
                f"{self.view}/{self.sequence}/{self.digest}"
            )
            try:
                verify_key.verify(message, signature)
                valid_count += 1
            except BadSignatureError:
                continue  # invalid signature does not count
        return valid_count >= self.quorum_size
```

View Change Protocol
Leader failure & recovery
When the primary is suspected faulty, replicas initiate a view change. The new primary is deterministically selected as primary = view mod n. Prepared certificates are carried forward to preserve the safety invariant across view transitions.
1. Normal operation: view = v, primary = R₀
2. Primary failure: timeout expired
3. VIEW_CHANGE: broadcast to all replicas
4. NEW_VIEW: 2f+1 certificates collected
5. Resumed: view = v+1, primary = R₁
```python
async def initiate_view_change(self):
    """Trigger a view change when the primary is suspected faulty."""
    self.current_view += 1
    new_primary = self.current_view % self.network_size  # deterministic rotation
    # Collect prepared certificates to carry forward into the new view
    prepared_certs = {
        seq: cert for seq, cert in self.certificates.items()
        if cert.msg_type == MessageType.PREPARE
    }
    view_change = PBFTMessage(
        msg_type=MessageType.VIEW_CHANGE,
        view=self.current_view,
        sequence=self.last_stable_checkpoint,
        digest=serialize_certs(prepared_certs),  # certs ride in the digest field
        sender_id=self.replica_id,
        signature=self.sign(
            f"AEOS/pbft/vc/{self.current_view}/{self.last_stable_checkpoint}"
        )
    )
    await self.broadcast(view_change)

async def handle_new_view(self, msg: PBFTMessage):
    """New primary broadcasts NEW_VIEW after collecting 2f+1 VIEW_CHANGE."""
    if not self.verify_new_view_proof(msg):
        return  # invalid proof
    self.current_view = msg.view
    # Re-propose any operations that were prepared but not committed
    for cert in msg.prepared_certificates:
        await self.handle_request(cert.original_request)
```

Safety Proof
Formal proof sketch for safety
The safety guarantee is that no two correct replicas commit different values for the same sequence number. It follows from three key properties: quorum intersection, certificate integrity, and view-change preservation.
Quorum Overlap
Any two quorums of size 2f+1 drawn from n = 3f+1 replicas must overlap in at least f+1 replicas.
|Q₁ ∩ Q₂| ≥ |Q₁| + |Q₂| − n
          = (2f+1) + (2f+1) − (3f+1)
          = f + 1
Since at most f replicas are Byzantine, at least 1 correct replica exists in every intersection.
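The intersection bound can also be checked exhaustively for small clusters; a quick brute-force sketch (the helper `min_quorum_overlap` is illustrative only):

```python
from itertools import combinations

def min_quorum_overlap(n: int, f: int) -> int:
    """Smallest intersection over all pairs of quorums of size 2f + 1."""
    quorum = 2 * f + 1
    return min(
        len(set(q1) & set(q2))
        for q1 in combinations(range(n), quorum)
        for q2 in combinations(range(n), quorum)
    )

# For n = 3f + 1, every pair of quorums shares at least f + 1 replicas.
print(min_quorum_overlap(4, 1))   # f=1 → overlap 2
print(min_quorum_overlap(7, 2))   # f=2 → overlap 3
```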
No Conflicting Commits
A correct replica in the overlap received both PREPARE quorums. Since it is correct, it only signs one digest per (view, sequence) pair.
∀ correct replica r:
prepared(r, v, s, d₁) ∧ prepared(r, v, s, d₂)
⟹ d₁ = d₂
Two conflicting values would require the correct replica to have signed both—a contradiction.
View Change Preserves Safety
The new primary in view v+1 collects prepared certificates from 2f+1 replicas and re-proposes any operation that reached the PREPARED state.
VIEW_CHANGE carries:
prepared_certs for (v, s, d)
NEW_VIEW re-proposes:
∀ prepared(v, s, d) → pre_prepare(v+1, s, d)
Safety is preserved across arbitrary view transitions by certificate forwarding.
Checkpointing
Periodic snapshots & garbage collection
Every CHECKPOINT_PERIOD = 100 sequences, replicas publish a signed digest of their current state. Once 2f+1 matching checkpoint proofs are collected, the checkpoint becomes stable and log entries at or below the stable sequence are garbage-collected.
- Checkpoint trigger: `sequence % CHECKPOINT_PERIOD == 0` triggers a checkpoint broadcast
- State digest: each replica signs the checkpoint with the `AEOS/ckpt/` domain prefix over the state hash
- Stable checkpoint: a quorum of 2f+1 matching proofs promotes the checkpoint to stable status
- Garbage collection: delete log entries ≤ stable_seq; retain only the latest stable snapshot
```python
CHECKPOINT_PERIOD = 100

async def maybe_checkpoint(self, sequence: int):
    """Trigger a checkpoint every CHECKPOINT_PERIOD sequences."""
    if sequence % CHECKPOINT_PERIOD != 0:
        return
    state_digest = sha256(serialize(self.state)).hexdigest()
    checkpoint = PBFTMessage(
        msg_type=MessageType.CHECKPOINT,
        view=self.current_view,
        sequence=sequence,
        digest=state_digest,
        sender_id=self.replica_id,
        signature=self.sign(f"AEOS/ckpt/{sequence}/{state_digest}")
    )
    await self.broadcast(checkpoint)

async def handle_checkpoint(self, msg: PBFTMessage):
    """Collect checkpoint proofs; stabilize at quorum."""
    key = (msg.sequence, msg.digest)
    self.checkpoint_proofs[key].add(msg)
    if (len(self.checkpoint_proofs[key]) >= self.quorum_size
            and msg.sequence > self.last_stable_checkpoint):  # advance only
        self.last_stable_checkpoint = msg.sequence
        self.stable_digest = msg.digest
        self._garbage_collect(msg.sequence)

def _garbage_collect(self, stable_seq: int):
    """Prune log entries at or below the stable checkpoint."""
    for seq in list(self.prepare_log.keys()):
        if seq <= stable_seq:
            del self.prepare_log[seq]
            self.commit_log.pop(seq, None)  # may be absent for this seq
            self.phase.pop(seq, None)
```

Message Types
8 protocol message types
Every PBFT message is typed, view-tagged, sequence-numbered, and Ed25519-signed. The type enum covers the full protocol lifecycle from client request through checkpoint stabilization.
| Code | Type | Description |
|---|---|---|
| 0 | REQUEST | Client submits operation to the primary replica for ordering |
| 1 | PRE_PREPARE | Primary assigns sequence number and broadcasts to all replicas |
| 2 | PREPARE | Replica acknowledges PRE_PREPARE; quorum = 2f+1 matching messages |
| 3 | COMMIT | Replica signals readiness to execute; quorum = 2f+1 required |
| 4 | REPLY | Replica returns execution result to the requesting client |
| 5 | VIEW_CHANGE | Replica suspects primary failure; triggers leader rotation protocol |
| 6 | NEW_VIEW | New primary proves legitimacy with 2f+1 VIEW_CHANGE certificates |
| 7 | CHECKPOINT | Replica publishes signed state digest every CHECKPOINT_PERIOD sequences |
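The table maps directly onto an enum; a sketch of how `MessageType` could be defined, assuming the integer values match the Code column above:

```python
from enum import IntEnum

class MessageType(IntEnum):
    """PBFT protocol message types; values match the wire-format codes."""
    REQUEST = 0       # client submits operation to the primary
    PRE_PREPARE = 1   # primary assigns sequence number
    PREPARE = 2       # replica acknowledges PRE_PREPARE
    COMMIT = 3        # replica signals execution readiness
    REPLY = 4         # execution result returned to client
    VIEW_CHANGE = 5   # primary suspected faulty
    NEW_VIEW = 6      # new primary proves legitimacy
    CHECKPOINT = 7    # signed state digest for garbage collection
```

An `IntEnum` keeps the wire codes stable while letting signatures embed `msg_type.value` in domain-separated strings, as the `QuorumCertificate` verification above does.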
Phase Enum
5 execution phases
Each operation tracks its progress through a deterministic state machine. Phase transitions are monotonic—once an operation advances, it never regresses within the same view.
- IDLE: waiting for client request
- PRE_PREPARED: primary assigned sequence
- PREPARED: quorum of PREPARE received
- COMMITTED: quorum of COMMIT received
- EXECUTED: operation applied to state
```python
from enum import IntEnum

class Phase(IntEnum):
    """Monotonic execution phase per sequence slot."""
    IDLE = 0          # no request assigned
    PRE_PREPARED = 1  # primary broadcast PRE_PREPARE
    PREPARED = 2      # received 2f+1 matching PREPARE
    COMMITTED = 3     # received 2f+1 matching COMMIT
    EXECUTED = 4      # operation applied to state machine

    def can_advance_to(self, target: "Phase") -> bool:
        """Phases advance monotonically within a view."""
        return target.value == self.value + 1
```

Distributed Ledger
State machine replication for AEOS
The DistributedLedger wraps a PBFTNetwork to provide deterministic replication of agent economic state. Operations are deduplicated via op_hash before submission to the consensus layer, ensuring exactly-once semantics across replicas.
- `agent_registered`: new agent identity committed to replicated state
- `contract_created`: binding agreement anchored with multi-sig attestation
- `obligation_fulfilled`: milestone delivery confirmed and escrow released
- `dispute_filed`: dispute evidence chain submitted for arbitration
Deduplication via op_hash: an operation (e.g. `agent_registered`) is hashed with SHA-256 to produce `op_hash = 0x3f...`; the `seen_ops` check skips duplicates, and novel operations proceed to PBFT submission for a consensus round.
```python
class DistributedLedger:
    """Replicated economic state machine over PBFTNetwork."""

    def __init__(self, network: PBFTNetwork):
        self.network = network
        self.seen_ops: set[str] = set()
        self.state: dict[str, Any] = {}

    async def submit_operation(
        self, op_type: str, payload: dict
    ) -> ConsensusResult:
        """Submit with deduplication via content-addressed hash."""
        # serialize() must be deterministic for hashing to deduplicate
        op_hash = sha256(
            f"{op_type}:{serialize(payload)}".encode()
        ).hexdigest()
        if op_hash in self.seen_ops:
            return ConsensusResult(status="duplicate", seq=-1)
        self.seen_ops.add(op_hash)
        result = await self.network.submit(
            operation={"type": op_type, "payload": payload, "hash": op_hash}
        )
        if result.committed:
            self._apply(op_type, payload)
        return result

    def _apply(self, op_type: str, payload: dict):
        """Deterministic state transition after consensus."""
        match op_type:
            case "agent_registered":
                self.state[payload["did"]] = payload
            case "contract_created":
                self.state[payload["contract_id"]] = payload
            case "obligation_fulfilled":
                contract = self.state[payload["contract_id"]]
                contract["milestones_complete"] = (
                    contract.get("milestones_complete", 0) + 1
                )
            case "dispute_filed":
                self.state[payload["dispute_id"]] = payload
```

Continue exploring the protocol
BFT consensus provides the finality layer. Explore how identities anchor the economic graph, how tokenization governs programmable assets, and how the full protocol stack fits together.