Advanced Consensus and BFT

Raft and Paxos assume that nodes fail only by crashing and otherwise follow the protocol. Byzantine fault tolerant (BFT) protocols also tolerate nodes that behave arbitrarily or maliciously.


Beginner View

Crash Fault vs Byzantine Fault

  • Crash fault: node stops responding
  • Byzantine fault: node responds incorrectly, inconsistently, or maliciously

Most enterprise systems use crash-fault consensus (Raft/Paxos). BFT is typically reserved for deployments where participants cannot be trusted to follow the protocol.

Replica Count Intuition

  • Crash fault tolerance: usually 2f + 1 replicas for f faults
  • Byzantine fault tolerance: usually 3f + 1 replicas for f Byzantine faults

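This is one reason BFT is more expensive: tolerating a single fault (f = 1) takes 4 replicas instead of 3, and the gap widens as f grows.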
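The replica-count rules above can be sketched as two small helpers. The function names `min_replicas` and `max_faults` are hypothetical and chosen only for illustration; the formulas are the usual 2f + 1 and 3f + 1 bounds stated in the list.

```python
def min_replicas(f: int, byzantine: bool) -> int:
    """Smallest cluster size that tolerates f faulty nodes."""
    if f < 0:
        raise ValueError("f must be non-negative")
    # 3f + 1 for Byzantine faults, 2f + 1 for crash faults.
    return 3 * f + 1 if byzantine else 2 * f + 1


def max_faults(n: int, byzantine: bool) -> int:
    """Largest number of faulty nodes a cluster of n replicas tolerates."""
    if n < 1:
        raise ValueError("n must be positive")
    # Inverse of the formulas above, rounded down.
    return (n - 1) // 3 if byzantine else (n - 1) // 2


if __name__ == "__main__":
    for f in (1, 2, 3):
        crash = min_replicas(f, byzantine=False)
        bft = min_replicas(f, byzantine=True)
        print(f"f={f}: crash-fault needs {crash}, BFT needs {bft}")
```

For f = 1, 2, 3 this prints 3, 5, 7 replicas for crash-fault tolerance and 4, 7, 10 for BFT. Note that adding a replica does not always help: a 6-node BFT cluster still tolerates only one Byzantine fault.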


Senior Deep Dive

Why BFT Costs More

  • More message rounds for agreement
  • Signature/verification overhead
  • Higher network fan-out and latency
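To make the fan-out difference concrete, the following is a rough message-count model, not a benchmark. It assumes a leader-based crash-fault protocol where the leader sends one message to each follower and gets one acknowledgement back, and a PBFT-style normal case where the prepare and commit phases are all-to-all. Client traffic, heartbeats, checkpoints, and view changes are deliberately ignored, and both function names are hypothetical.

```python
def crash_fault_messages(n: int) -> int:
    """Messages for one decision in a leader-based crash-fault protocol.

    Simplified model: leader -> (n - 1) followers, then (n - 1) acks.
    """
    return 2 * (n - 1)


def pbft_messages(n: int) -> int:
    """Messages for one decision in a PBFT-style normal case.

    Simplified model:
      pre-prepare: primary sends to the n - 1 backups
      prepare:     each of the n - 1 backups multicasts to the other n - 1
      commit:      each of the n replicas multicasts to the other n - 1
    """
    pre_prepare = n - 1
    prepare = (n - 1) * (n - 1)
    commit = n * (n - 1)
    return pre_prepare + prepare + commit


if __name__ == "__main__":
    for n in (4, 7, 10, 31):
        print(f"n={n}: crash-fault {crash_fault_messages(n)}, PBFT-style {pbft_messages(n)}")
```

Under this model the PBFT-style count simplifies to 2n(n - 1), so it grows quadratically while the crash-fault count grows linearly. Each of those messages typically also carries a signature or MAC that must be verified.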

Protocol Families

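  • PBFT: classic three-phase protocol (pre-prepare, prepare, commit); all-to-all messaging gives communication cost quadratic in the number of replicas
  • HotStuff: leader-based design with linear communication per phase; its pipelined variant folds leader replacement into the normal protocol flow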
  • Tendermint-style: practical BFT for validator-based systems

Decision Framework

Use crash-fault consensus when:

  • A single organization operates the control plane
  • Infrastructure is strongly authenticated and adversarial risk is low

Use BFT when:

  • Governance spans multiple organizations
  • An adversarial environment cannot be ruled out
  • The cost of inconsistent or maliciously altered state is catastrophic
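The checklist above can be written down as a small decision helper. This is a sketch under one stated assumption: any single BFT criterion is treated as sufficient to recommend BFT, which is one possible reading of the list and not a rule the text states. The `Deployment` type and `recommend_fault_model` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    single_operator: bool            # one organization runs every node
    adversarial_participants: bool   # some participants may act maliciously
    catastrophic_if_corrupted: bool  # an incorrect commit is unrecoverable


def recommend_fault_model(d: Deployment) -> str:
    """Return "BFT" or "crash-fault" for a deployment profile.

    Assumption: any one BFT criterion is enough to recommend BFT.
    """
    needs_bft = (
        not d.single_operator
        or d.adversarial_participants
        or d.catastrophic_if_corrupted
    )
    return "BFT" if needs_bft else "crash-fault"


if __name__ == "__main__":
    registry = Deployment(True, False, False)
    settlement = Deployment(False, False, True)
    print(recommend_fault_model(registry))    # crash-fault
    print(recommend_fault_model(settlement))  # BFT
```

In practice the inputs are judgment calls from a threat model, so a helper like this is more useful as a review checklist than as an automated gate.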

Failure Models and Risk Mapping

| Environment | Recommended model | Reason |
| --- | --- | --- |
| Internal service registry | Crash fault | Trusted infra, lower cost |
| Cross-company settlement network | BFT | Independent trust domains |
| Public validator network | BFT | Adversarial participants expected |

Operational Considerations

  • Benchmark consensus latency under realistic cross-region round-trip times (RTT)
  • Use hardware crypto acceleration if the workload is signature-heavy
  • Define quorum-loss runbooks and emergency governance flows
  • Continuously run Byzantine fault simulations, including node equivocation, in staging
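For the equivocation testing mentioned in the last item, a staging harness needs some way to spot a node that voted for two different values in the same slot. The sketch below assumes votes have already been signature-checked and are identified by sender, view, sequence number, and payload digest. The `Vote` type and `find_equivocators` function are hypothetical and not taken from any specific BFT library.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Vote(NamedTuple):
    sender: str   # replica identifier
    view: int     # view (or round) number
    seq: int      # sequence number (or height)
    digest: str   # hash of the proposed value


def find_equivocators(votes: Iterable[Vote]) -> Set[str]:
    """Return senders that voted for more than one digest in the same (view, seq)."""
    seen: Dict[Tuple[str, int, int], Set[str]] = defaultdict(set)
    for v in votes:
        seen[(v.sender, v.view, v.seq)].add(v.digest)
    # A duplicate of the same vote is harmless; two different digests are not.
    return {sender for (sender, _, _), digests in seen.items() if len(digests) > 1}


if __name__ == "__main__":
    observed = [
        Vote("r1", 0, 1, "aaa"),
        Vote("r2", 0, 1, "aaa"),
        Vote("r3", 0, 1, "aaa"),
        Vote("r3", 0, 1, "bbb"),  # r3 equivocates in view 0, seq 1
        Vote("r1", 1, 1, "bbb"),  # different view, so not equivocation
    ]
    print(find_equivocators(observed))
```

Keying on the full (sender, view, seq) triple matters: voting for a different value after a view change is legitimate in most protocols, while two conflicting votes in the same view are not.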

Interview Questions

Q: Why can Raft not handle Byzantine faults by design?

A: Raft assumes fail-stop or crash faults and honest message behavior. If nodes lie or equivocate, Raft's quorum logic cannot guarantee safety.

Q: Explain why BFT generally needs 3f + 1 replicas.

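A: With n = 3f + 1 replicas, a node can wait for at most n - f = 2f + 1 responses, because the f faulty replicas may never answer. Any two quorums of size 2f + 1 overlap in at least 2(2f + 1) - (3f + 1) = f + 1 replicas, so the overlap always contains at least one honest replica, and an honest replica never votes for two conflicting values. With fewer replicas, the overlap could consist entirely of Byzantine nodes.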
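The quorum-overlap argument can be checked by brute force for small clusters. This sketch enumerates every pair of quorums and reports the smallest overlap; `min_quorum_overlap` is a hypothetical helper written for this illustration.

```python
from itertools import combinations


def min_quorum_overlap(n: int, quorum_size: int) -> int:
    """Smallest intersection between any two quorums of the given size among n replicas."""
    quorums = [set(q) for q in combinations(range(n), quorum_size)]
    return min(len(a & b) for a in quorums for b in quorums)


if __name__ == "__main__":
    for f in (1, 2, 3):
        n, q = 3 * f + 1, 2 * f + 1
        overlap = min_quorum_overlap(n, q)
        # An overlap larger than f must contain at least one honest replica.
        print(f"f={f}: n={n}, quorum={q}, min overlap={overlap}, honest member guaranteed={overlap > f}")

    # With only 3f replicas (f = 1, n = 3), a node can wait for n - f = 2 votes.
    print("n=3, quorum=2, min overlap:", min_quorum_overlap(3, 2))
```

With 3f + 1 replicas the minimum overlap is f + 1 in each case. With only 3 replicas and f = 1, the overlap drops to a single replica, which may be the Byzantine one, so two conflicting quorums could both form.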

Q: When is BFT over-engineering for enterprise systems?

A: If all nodes are run by one trusted operator and the threat model is mostly crashes and outages, crash-fault consensus is usually enough. The cost of BFT is rarely justified unless the system spans adversarial trust boundaries.

Q: Compare PBFT and HotStuff at a high level.

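A: PBFT runs three phases (pre-prepare, prepare, commit) with all-to-all messaging and has a comparatively complex view-change procedure. HotStuff routes votes through the leader, which keeps communication linear per phase, and it folds leader replacement into its pipelined three-phase flow, which makes the protocol simpler to reason about.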

Q: How do trust assumptions drive consensus choice?

A: If participants can be malicious, choose BFT; if they are trusted but can crash, choose a crash fault tolerant (CFT) protocol such as Raft or Paxos. The consensus protocol should match the most severe failure mode that is realistic for the deployment.

Q: What are practical performance bottlenecks in BFT systems?

A: Signature verification, all-to-all messaging, and WAN latency dominate. Performance degrades quickly with replica count unless batching and crypto acceleration are used.
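The effect of batching can be estimated with back-of-the-envelope arithmetic. The sketch below assumes one consensus instance costs 2n(n - 1) protocol messages, which is the normal-case count for a PBFT-style protocol with all-to-all prepare and commit phases and no client traffic, and that one instance can order a whole batch of requests. Both function names are hypothetical, and real systems will deviate from this model.

```python
def pbft_messages_per_instance(n: int) -> int:
    """Normal-case messages for one PBFT-style consensus instance (simplified model)."""
    return 2 * n * (n - 1)


def messages_per_request(n: int, batch_size: int) -> float:
    """Protocol messages amortized over the requests ordered in one batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return pbft_messages_per_instance(n) / batch_size


if __name__ == "__main__":
    for n in (4, 7, 13):
        costs = ", ".join(
            f"batch {b}: {messages_per_request(n, b):.2f}" for b in (1, 10, 100)
        )
        print(f"n={n}: {costs}")
```

Since each message usually implies at least one signature or MAC check, the same amortization applies to verification work. The trade-off is added latency while a batch fills.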

Q: How would you justify BFT to a product team concerned about latency?

A: Frame BFT as risk reduction for high-value, multi-party trust domains where incorrect commits are catastrophic. Then scope BFT to critical write paths and keep read paths optimized separately.

Q: What staging tests would you run for Byzantine behavior?

A: Inject equivocation, forged signatures, delayed/reordered messages, and split views under load. Verify safety invariants (no conflicting commits) and bounded recovery time.
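The "no conflicting commits" invariant can be checked after a fault-injection run by comparing what each honest replica committed at each sequence number. This is a minimal sketch that assumes the test harness can export per-replica commit logs as a mapping from sequence number to value digest; `conflicting_commits` is a hypothetical helper.

```python
from typing import Dict, List


def conflicting_commits(
    commit_logs: Dict[str, Dict[int, str]]
) -> Dict[int, Dict[str, List[str]]]:
    """Find sequence numbers where replicas committed different digests.

    commit_logs maps replica id -> {sequence number: committed digest}.
    Returns {seq: {digest: [replicas]}} for violating sequence numbers only.
    Pass logs from honest replicas only; faulty ones may report anything.
    """
    by_seq: Dict[int, Dict[str, List[str]]] = {}
    for replica, log in commit_logs.items():
        for seq, digest in log.items():
            by_seq.setdefault(seq, {}).setdefault(digest, []).append(replica)
    return {seq: digests for seq, digests in by_seq.items() if len(digests) > 1}


if __name__ == "__main__":
    logs = {
        "r1": {1: "aaa", 2: "bbb"},
        "r2": {1: "aaa", 2: "bbb"},
        "r3": {1: "aaa", 2: "ccc"},  # diverges at seq 2
    }
    violations = conflicting_commits(logs)
    print(violations if violations else "safety invariant holds")
```

A replica that has not yet committed a sequence number is not a violation; only two different digests at the same sequence number are. A test run should fail if the returned mapping is non-empty.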