Advanced Consensus and BFT

Raft and Paxos assume nodes fail only by crashing. Byzantine fault tolerant (BFT) protocols also handle arbitrary or malicious behavior.

Beginner View

Crash Fault vs Byzantine Fault

Crash fault: node stops responding
Byzantine fault: node responds incorrectly, inconsistently, or maliciously

Most enterprise systems use crash-fault consensus (Raft/Paxos). BFT is warranted only when participants cannot all be assumed honest.

Replica Count Intuition

Crash fault tolerance: usually 2f + 1 replicas for f faults
Byzantine fault tolerance: usually 3f + 1 replicas for f Byzantine faults

For the same f, BFT needs more replicas and larger quorums, which is one reason it is more expensive.
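
To make the arithmetic concrete, here is a minimal sketch that computes cluster and quorum sizes from the formulas above. The function name `cluster_sizing` is hypothetical, and the quorum sizes (f + 1 for crash faults, 2f + 1 for Byzantine faults) assume the minimum cluster size for each model.

```python
def cluster_sizing(f: int, byzantine: bool) -> tuple[int, int]:
    """Return (replicas, quorum) for the smallest cluster tolerating f faults.

    Assumes the textbook minimums: 2f + 1 replicas with a majority quorum
    for crash faults, 3f + 1 replicas with a 2f + 1 quorum for Byzantine faults.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if byzantine:
        return 3 * f + 1, 2 * f + 1
    return 2 * f + 1, f + 1


for f in (1, 2, 3):
    cft = cluster_sizing(f, byzantine=False)
    bft = cluster_sizing(f, byzantine=True)
    print(f"f={f}: crash-fault (replicas, quorum)={cft}, BFT (replicas, quorum)={bft}")
```

Tolerating a single fault takes 3 replicas under the crash model but 4 under the Byzantine model, and the gap widens as f grows.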

Senior Deep Dive

Why BFT Costs More

More message rounds for agreement
Signature/verification overhead
Higher network fan-out and latency
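
The fan-out difference can be roughly estimated with a message count per committed decision. This is a simplified sketch, not a performance model: it assumes a leader-based crash-fault protocol with one request and one reply per follower, and a PBFT-style normal case with all-to-all prepare and commit phases. It ignores client traffic, batching, signature aggregation, and view changes. Both function names are hypothetical.

```python
def crash_fault_messages(n: int) -> int:
    """Leader sends to n - 1 followers and each follower replies once."""
    return 2 * (n - 1)


def pbft_style_messages(n: int) -> int:
    """Simplified PBFT-style normal case for n replicas."""
    pre_prepare = n - 1          # leader -> every backup
    prepare = (n - 1) * (n - 1)  # every backup -> every other replica
    commit = n * (n - 1)         # every replica -> every other replica
    return pre_prepare + prepare + commit


# Compare the minimum cluster sizes for f = 1, 2, 3 faults.
for f in (1, 2, 3):
    cft_n, bft_n = 2 * f + 1, 3 * f + 1
    print(f"f={f}: crash-fault n={cft_n} -> {crash_fault_messages(cft_n)} msgs, "
          f"BFT n={bft_n} -> {pbft_style_messages(bft_n)} msgs")
```

Under these assumptions the crash-fault count grows linearly with cluster size while the PBFT-style count grows quadratically, and each of those messages typically carries a signature or authenticator that must be verified.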

Protocol Families

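PBFT: classic three-phase protocol (pre-prepare, prepare, commit); all-to-all messaging gives quadratic communication overhead
HotStuff: pipeline-friendly design with leader-driven, linear communication and simpler leader changes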
Tendermint-style: practical BFT for validator-based systems

Decision Framework

Use crash-fault consensus when:

Single organization control plane
Strongly authenticated infra and low adversarial risk

Use BFT when:

Multi-organization governance
Adversarial environment cannot be ignored
Cost of inconsistent/malicious state is catastrophic
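
The checklist above could be encoded as a small helper for design reviews. This is only an illustrative sketch: the `Deployment` fields and `recommend_model` function are hypothetical names, and treating any single BFT criterion as sufficient is an assumption, since a real decision would also weigh cost and latency.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    multi_org_governance: bool       # nodes run by independent organizations
    adversarial_environment: bool    # malicious participants are plausible
    catastrophic_if_corrupted: bool  # inconsistent or malicious state is unacceptable


def recommend_model(d: Deployment) -> str:
    """Return "BFT" if any BFT criterion holds, otherwise "crash-fault"."""
    if d.multi_org_governance or d.adversarial_environment or d.catastrophic_if_corrupted:
        return "BFT"
    return "crash-fault"


internal_registry = Deployment(False, False, False)
settlement_network = Deployment(True, True, True)
print(recommend_model(internal_registry))   # crash-fault
print(recommend_model(settlement_network))  # BFT
```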

Failure Models and Risk Mapping

Environment	Recommended model	Reason
Internal service registry	Crash fault	Trusted infra, lower cost
Cross-company settlement network	BFT	Independent trust domains
Public validator network	BFT	Adversarial participants expected

Operational Considerations

Benchmark consensus latency under realistic geo RTT
Use hardware crypto acceleration if signature-heavy
Define quorum-loss runbooks and emergency governance flows
Continuously test node equivocation/Byzantine simulation in staging
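
As one example of what an equivocation check in staging might look like, the sketch below flags any validator that voted for two different blocks at the same height and round. The `Vote` record and its fields are hypothetical, and the code assumes signatures were already verified upstream.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Vote(NamedTuple):
    validator: str
    height: int
    round: int
    block_hash: str


def find_equivocations(votes: Iterable[Vote]) -> dict:
    """Map (validator, height, round) to the conflicting block hashes it voted for."""
    seen = defaultdict(set)
    for v in votes:
        seen[(v.validator, v.height, v.round)].add(v.block_hash)
    # More than one distinct hash for the same slot means the validator equivocated.
    return {slot: hashes for slot, hashes in seen.items() if len(hashes) > 1}


votes = [
    Vote("v1", 10, 0, "aa"),
    Vote("v2", 10, 0, "aa"),
    Vote("v2", 10, 0, "bb"),  # v2 votes for a second block in the same slot
]
print(find_equivocations(votes))
```

A check like this can run continuously against collected vote logs while a fault injector makes selected staging nodes misbehave.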

Interview Questions

Q: Why can Raft not handle Byzantine faults by design?

A: Raft assumes fail-stop or crash faults and honest message behavior. If nodes lie or equivocate, Raft's quorum logic cannot guarantee safety.

Q: Explain why BFT generally needs `3f + 1` replicas.

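A: To tolerate $f$ Byzantine nodes, any two quorums must overlap in at least one honest node. With $3f+1$ replicas and quorums of $2f+1$, any two quorums share at least $f+1$ replicas, so at least one shared replica is honest. A quorum of $2f+1$ is also reachable when $f$ replicas stay silent, which preserves liveness.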
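
The overlap argument can be written out as a short derivation, assuming $n = 3f+1$ replicas and two quorums $Q_1$, $Q_2$ of size $2f+1$ each:

$$
|Q_1 \cap Q_2| \;\ge\; |Q_1| + |Q_2| - n \;=\; 2(2f+1) - (3f+1) \;=\; f + 1
$$

At most $f$ of those $f+1$ shared replicas can be Byzantine, so at least one is honest, and an honest replica never votes for two conflicting values. With only $3f$ replicas, a quorum reachable despite $f$ silent nodes has size $2f$, the overlap bound drops to $2(2f) - 3f = f$, and the shared replicas could all be Byzantine.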

Q: When is BFT over-engineering for enterprise systems?

A: If nodes are under one trusted operator and threat model is mostly crashes/outages, crash-fault consensus is usually enough. BFT cost is rarely justified without adversarial trust boundaries.

Q: Compare PBFT and HotStuff at a high level.

A: PBFT uses multiple communication phases with heavier view-change complexity. HotStuff streamlines leader change with a pipelined three-phase protocol and simpler proofs.

Q: How do trust assumptions drive consensus choice?

A: If participants can be malicious, choose BFT; if they are trusted but can crash, choose CFT like Raft/Paxos. Consensus should match the strongest realistic failure mode.

Q: What are practical performance bottlenecks in BFT systems?

A: Signature verification, all-to-all messaging, and WAN latency dominate. Performance degrades quickly with replica count unless batching and crypto acceleration are used.

Q: How would you justify BFT to a product team concerned about latency?

A: Frame BFT as risk reduction for high-value, multi-party trust domains where incorrect commits are catastrophic. Then scope BFT to critical write paths and keep read paths optimized separately.

Q: What staging tests would you run for Byzantine behavior?

A: Inject equivocation, forged signatures, delayed/reordered messages, and split views under load. Verify safety invariants (no conflicting commits) and bounded recovery time.
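
The "no conflicting commits" invariant can be sketched as a check over the commit logs collected from honest nodes after a fault-injection run. The log shape (`node id -> {height: block hash}`) and the function name `conflicting_commits` are assumptions for illustration.

```python
def conflicting_commits(commit_logs: dict) -> dict:
    """Return {height: {block_hash: [nodes]}} for heights where honest nodes disagree.

    commit_logs maps each honest node id to its {height: block_hash} commit log.
    An empty result means the safety invariant held for this run.
    """
    by_height: dict = {}
    for node, log in commit_logs.items():
        for height, block_hash in log.items():
            by_height.setdefault(height, {}).setdefault(block_hash, []).append(node)
    return {h: hashes for h, hashes in sorted(by_height.items()) if len(hashes) > 1}


logs = {
    "n1": {1: "a", 2: "b"},
    "n2": {1: "a", 2: "c"},  # disagrees with n1 at height 2
    "n3": {1: "a"},          # lagging nodes are fine; only disagreement is a violation
}
violations = conflicting_commits(logs)
print(violations)
```

Nodes that are behind do not violate safety; only two honest nodes committing different values at the same height do.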
