Redis: Overview & Architecture
Redis (Remote Dictionary Server) is an open-source, in-memory data structure store used as a database, cache, message broker, and streaming engine. Its combination of simplicity, speed, and rich data structures makes it ubiquitous in production systems.
Beginner Concept: The "Librarian's Memory" Analogy
Imagine you ask a massive library for a specific book on 18th Century Rome.
- Traditional Database (Disk): The librarian takes your request, walks five floors down into the basement (the hard drive), pulls out a ledger, finds the row, writes it down, and walks back up. In database terms, a disk-backed lookup typically costs milliseconds.
- Redis (In-Memory): The librarian snaps her fingers and recites the exact paragraph you asked for straight from her own short-term memory (RAM). In database terms, the lookup typically costs microseconds, well under a millisecond.
The tradeoff? When the librarian leaves work (the server reboots), her short-term memory is wiped unless she wrote notes down first (persistence, covered under Redis Data Persistence). That's why Redis is perfect for caching and active sessions, but risky as the sole source of truth for critical long-term billing data.
Why Redis is Fast
Redis is often described as "single-threaded", but this requires nuance:
Single-threaded event loop (command processing)
↓
Epoll/Kqueue (I/O multiplexing, handles thousands of connections)
↓
Background threads (AOF fsync, object eviction, lazy delete)
Single-Threaded Command Execution
All Redis commands execute sequentially in a single thread. This design:
- Eliminates locking overhead (no mutexes needed for data structures)
- Makes each individual command atomic by default (multi-command sequences still need MULTI/EXEC or a Lua script)
- Simplifies reasoning about state consistency
- Avoids context-switching overhead between threads
Redis 6.0+: Added optional I/O threading (off by default, enabled with the io-threads setting): network reads and writes are parallelized while command execution remains single-threaded. This eases the network I/O bottleneck for high-connection workloads.
Senior Deep Dive: I/O Multiplexing with Epoll
How can a single-threaded server handle 100,000 concurrent client connections without crashing? Through Linux epoll (or macOS kqueue).
In a classic blocking server (like older Tomcat with its blocking connector), every connected client occupies one OS thread. If 10,000 clients connect, the server needs 10,000 threads, and the kernel must constantly switch between them (context switching). The CPU ends up spending much of its time managing threads instead of doing useful work.
Redis reverses this using the Reactor Pattern:
[100,000 Connected Clients]
        │ (Network Sockets)
        ▼
[ epoll() Syscall, Kernel Space ] ← "Only these 4 sockets have bytes ready to read"
        │
        ▼
[ Event Loop Queue ]
        │
        ▼
[ Single Main Thread ] ← Pops the 4 commands, processes them sequentially, and loops.
Redis relies on the fact that reading from RAM takes on the order of nanoseconds. Because each command executes so quickly, running them one by one on a single thread is often faster than paying the CPU overhead of thread synchronization, locking, and context switching.
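The reactor loop described above can be sketched in a few lines of Python with the standard-library `selectors` module, which wraps epoll on Linux and kqueue on macOS. This is a toy illustration rather than how Redis itself is written: the line-based `SET`/`GET` protocol and the names `handle_command` and `run_once` are assumptions made for the example.

```python
import selectors
import socket


def handle_command(store, line):
    """Run one toy text command (not real RESP) against an in-memory dict."""
    parts = line.split()
    if len(parts) == 3 and parts[0].upper() == "SET":
        store[parts[1]] = parts[2]
        return "OK"
    if len(parts) == 2 and parts[0].upper() == "GET":
        return store.get(parts[1], "(nil)")
    return "ERR unknown command"


def run_once(sel, store, timeout=1.0):
    """One loop iteration: ask the kernel which sockets are readable,
    then serve their commands sequentially on this single thread."""
    served = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if not data:  # peer closed the connection
            sel.unregister(conn)
            conn.close()
            continue
        # Simplification: assumes each recv() holds whole newline-terminated commands.
        for line in data.decode().splitlines():
            conn.sendall((handle_command(store, line) + "\n").encode())
            served += 1
    return served


# Demo: two "clients" share one selector; only sockets with pending data are touched.
sel = selectors.DefaultSelector()
store = {}
client_a, server_a = socket.socketpair()
client_b, server_b = socket.socketpair()
for server_side in (server_a, server_b):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ)

client_a.sendall(b"SET greeting hello\n")
run_once(sel, store)
print(client_a.recv(4096))  # b'OK\n'

client_b.sendall(b"GET greeting\n")
run_once(sel, store)
print(client_b.recv(4096))  # b'hello\n'
```

No locks appear anywhere in the sketch: because only one thread ever touches `store`, each command runs to completion before the next one starts.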
Memory Architecture
Redis stores all data in RAM (optionally persisted to disk):
Memory Layout:
┌─────────────────────────────────────┐
│ Redis Object (robj)                 │
│ ├── type (string, list, hash...)    │
│ ├── encoding (ziplist, hashtable)   │
│ └── ptr → actual data               │
└─────────────────────────────────────┘
Memory Encoding Optimization
Redis automatically uses compact encodings for small collections to save memory:
| Data Type | Small Encoding | Large Encoding | Threshold |
|---|---|---|---|
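| Hash | ziplist/listpack | hashtable | >512 fields or a field/value >64 bytes (defaults; hash-max-listpack-entries, hash-max-listpack-value) |
| List | listpack (7.2+) or quicklist | quicklist | Node size governed by list-max-listpack-size (default -2, i.e. 8 KB per node) |
| Set | intset (integers only) or listpack (7.2+) | hashtable | >512 integers (set-max-intset-entries) or >128 elements (set-max-listpack-entries) |
| Sorted Set | ziplist/listpack | skiplist + hashtable | >128 elements or an element >64 bytes (zset-max-listpack-entries, zset-max-listpack-value) |
| String | int (integer values) | embstr/raw | Non-integer value; embstr up to 44 bytes, raw beyond |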
Why this matters: A Redis hash that stays under the configured limits (hash-max-listpack-entries and hash-max-listpack-value) is stored as a flat array (ziplist, or listpack in Redis 7.0+) instead of a full hash table, which dramatically reduces memory overhead. Designing your key structure to stay within these thresholds is a key memory optimization.
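The threshold rule can be modelled in a few lines. The sketch below is a simplified model of the decision, not Redis source code: the function name `pick_hash_encoding` is hypothetical, and the limits passed in the demo are deliberately tiny illustrative values rather than any server's real configuration.

```python
def pick_hash_encoding(fields: dict, max_entries: int, max_value_bytes: int) -> str:
    """Return 'listpack' while the hash is small, 'hashtable' once any limit is exceeded."""
    if len(fields) > max_entries:
        return "hashtable"
    for name, value in fields.items():
        # Both field names and values count against the per-element byte limit.
        if len(name.encode()) > max_value_bytes or len(value.encode()) > max_value_bytes:
            return "hashtable"
    return "listpack"


# Illustrative limits only: at most 4 fields, each name/value at most 8 bytes.
small = {"name": "alice", "plan": "pro"}
wide = {"bio": "a value longer than eight bytes"}
many = {f"f{i}": "x" for i in range(5)}

print(pick_hash_encoding(small, 4, 8))  # listpack
print(pick_hash_encoding(wide, 4, 8))   # hashtable
print(pick_hash_encoding(many, 4, 8))   # hashtable
```

On a real server, `OBJECT ENCODING <key>` reports which encoding a key currently uses, and `CONFIG GET hash-max-*` shows the limits in effect.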
Redis Data Persistence
| Mode | Mechanism | Recovery Point | Use Case |
|---|---|---|---|
| RDB (Snapshot) | Fork + binary dump at intervals | At last snapshot | Fast recovery, small files |
| AOF (Append-Only File) | Log every write command | Near real-time | Durability, audit trail |
| No persistence | Pure in-memory | Data lost on restart | Cache-only deployments |
| RDB + AOF | Both modes combined | AOF granularity | Production recommended |
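# RDB: save a snapshot if at least 1000 keys changed within 60 seconds
save 60 1000
# AOF: enable the append-only file (off by default)
appendonly yes
# AOF: fsync once per second (a compromise between durability and performance)
appendfsync everysec
# Options: always (safest, slowest), everysec (default), no (fastest; the OS decides when to flush)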
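The append-only idea is easy to see in miniature: every write is appended to a log before it is applied, and the state is rebuilt after a restart by replaying the log. The sketch below is a toy model under stated assumptions, not Redis's actual file format: the class name `TinyAOFStore` and the JSON-lines log are invented for illustration, and the `fsync_every_write` flag only loosely mirrors the difference between `appendfsync always` and leaving flushing to the OS.

```python
import json
import os
import tempfile


class TinyAOFStore:
    """In-memory dict whose writes are logged to an append-only file."""

    def __init__(self, path, fsync_every_write=False):
        self.path = path
        self.fsync_every_write = fsync_every_write
        self.data = {}
        self._replay()  # rebuild state from the existing log, if any
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                op, key, value = json.loads(line)
                if op == "SET":
                    self.data[key] = value
                elif op == "DEL":
                    self.data.pop(key, None)

    def _append(self, record):
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        if self.fsync_every_write:
            os.fsync(self._log.fileno())  # force the write down to disk

    def set(self, key, value):
        self._append(["SET", key, value])
        self.data[key] = value

    def delete(self, key):
        self._append(["DEL", key, None])
        self.data.pop(key, None)

    def close(self):
        self._log.close()


# Demo: write, "crash" (close), then recover by replaying the log.
path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
store = TinyAOFStore(path)
store.set("user:1", "alice")
store.set("user:2", "bob")
store.delete("user:2")
store.close()

recovered = TinyAOFStore(path)
print(recovered.data)  # {'user:1': 'alice'}
recovered.close()
```

The log grows with every write, including writes that were later overwritten or deleted, which is why a real append-only log needs periodic compaction.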
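RDB fork() latency: When Redis forks to create a snapshot, the kernel must copy the process's page tables. On a 10 GB instance, this fork can cause a latency spike on the order of 50–100 ms, depending on hardware and virtualization. Use the built-in latency monitor (latency-monitor-threshold plus LATENCY LATEST) or the latest_fork_usec field of INFO to detect this.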
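A rough back-of-the-envelope estimate shows why the fork cost grows with dataset size. The calculation below assumes 4 KiB pages and 8-byte page-table entries (typical for x86-64 without huge pages), counts only the lowest level of the page table, and uses 10 GiB as the dataset size; the helper name `page_table_bytes` is hypothetical.

```python
def page_table_bytes(dataset_bytes, page_size=4096, entry_size=8):
    """Estimate the size of the last-level page-table entries for a dataset."""
    pages = -(-dataset_bytes // page_size)  # ceiling division
    return pages * entry_size


ten_gib = 10 * 1024**3
print(page_table_bytes(ten_gib) / 1024**2)  # 20.0 (MiB of page-table entries)
```

So even though the snapshot child shares the data pages copy-on-write, fork() still has to duplicate roughly 20 MiB of page-table entries for a dataset of this size before the parent can resume serving commands.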
Redis Architecture Patterns
Standalone
Single Redis instance: simple, but a single point of failure.
Sentinel (High Availability)
Master ──▶ Replica 1
       └─▶ Replica 2
Sentinel 1 / Sentinel 2 / Sentinel 3 (quorum-based monitoring)
- Sentinels vote to promote a replica if master fails
- Client libraries use Sentinel to discover the current master
- Failover time: typically 10–30 seconds
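The voting rule can be summarised as two separate checks: enough Sentinels must report the master as down (the configured quorum), and the Sentinel that runs the failover must then be authorized by a majority of all Sentinels. The sketch below is a simplified model of that decision, not Sentinel's actual implementation; the function and parameter names are hypothetical.

```python
def sentinel_decision(total_sentinels, down_reports, quorum, leader_votes):
    """Return (objectively_down, failover_authorized) for one master.

    down_reports: Sentinels that currently consider the master unreachable.
    leader_votes: Sentinels that voted for the would-be failover leader.
    """
    objectively_down = down_reports >= quorum
    majority = total_sentinels // 2 + 1
    # Authorization needs a majority of all Sentinels, and never fewer than the quorum.
    needed_votes = max(quorum, majority)
    failover_authorized = objectively_down and leader_votes >= needed_votes
    return objectively_down, failover_authorized


print(sentinel_decision(3, 2, 2, 2))  # (True, True)
print(sentinel_decision(5, 2, 2, 2))  # (True, False)
print(sentinel_decision(3, 1, 2, 3))  # (False, False)
```

The second case shows why a low quorum does not by itself allow a failover: with five Sentinels, two can agree the master is down, but three votes are still needed to proceed.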
Cluster (Horizontal Scaling)
Slot 0–5460     Slot 5461–10922     Slot 10923–16383
[Master A]      [Master B]          [Master C]
[Replica A]     [Replica B]         [Replica C]
- Hash slots (0–16383) are sharded across masters
- CLUSTER KEYSLOT mykey tells you which slot a key maps to
- Keys in different slots cannot be used together in multi-key operations
- Use hash tags such as {user}.orders and {user}.profile to force co-location on the same slot
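The slot mapping itself is small enough to reproduce client-side. The sketch below follows the rule published in the Redis Cluster specification (CRC16, XMODEM variant, modulo 16384, hashing only the hash-tag contents when a non-empty `{...}` is present); the function names are my own, and it is meant for understanding rather than as a replacement for `CLUSTER KEYSLOT`.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 cluster hash slots, honoring hash tags."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Keys sharing a hash tag land on the same slot; untagged keys usually do not.
print(key_slot("{user}.orders") == key_slot("{user}.profile"))  # True
print(key_slot("user.orders"), key_slot("user.profile"))
```

Because only the tag is hashed, every key tagged `{user}` lands on one slot, so a very popular tag can turn a single node into a hotspot.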
Quick Start with Docker
# Run Redis locally
docker run --name redis -p 6379:6379 -d redis:latest
# Connect via CLI
docker exec -it redis redis-cli
Spring Boot Integration
Add the dependency to your pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
Configure in application.yml:
spring:
  data:
    redis:
      host: localhost
      port: 6379
      password: your_password # if set
      timeout: 2000ms
Common Use Cases
| Use Case | Redis Features Used | Senior Consideration |
|---|---|---|
| Session store | Strings + TTL | Sticky sessions vs distributed session |
| Rate limiting | INCR + EXPIRE or Sorted Sets | Token bucket vs sliding window |
| Distributed locks | SET NX EX (Redlock) | Clock drift, lock renewal, fencing tokens |
| Leaderboards | Sorted Sets | ZRANGEBYSCORE (by score) vs ZRANGE/ZREVRANGE (by rank) |
| Real-time feeds | Streams or Pub/Sub | At-most-once vs at-least-once delivery |
| Cache | Strings/Hashes + TTL + eviction | Stampede, dogpile, warm-up strategy |
| Queue | Lists (BLPOP/RPUSH) or Streams | Visibility timeout, dead-letter queue |
| Geospatial | GEO commands | GEOSEARCH for proximity queries (GEORADIUS is deprecated since 6.2) |
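To make the rate-limiting row concrete, here is a sliding-window limiter modelled in plain Python. It is an in-process sketch of the sorted-set approach, with comments naming the Redis command each step would roughly correspond to; the class name `SlidingWindowLimiter` is hypothetical, and a real deployment would need the steps to run atomically on the server (for example in a Lua script or MULTI/EXEC).

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within the last `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.hits = {}  # client id -> deque of request timestamps

    def allow(self, client_id, now):
        timestamps = self.hits.setdefault(client_id, deque())
        # Drop entries that fell out of the window (ZREMRANGEBYSCORE in Redis).
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:  # ZCARD
            return False
        timestamps.append(now)  # ZADD with the timestamp as score
        return True


limiter = SlidingWindowLimiter(limit=3, window_seconds=10)
results = [limiter.allow("client-1", t) for t in (0, 1, 2, 3, 10.5)]
print(results)  # [True, True, True, False, True]
```

Compared with a fixed window built on INCR + EXPIRE, the sliding window avoids the burst of up to twice the limit that can occur around a window boundary, at the cost of storing one entry per request.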
Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data types | Rich (12+ types) | Strings only |
| Persistence | RDB + AOF | None |
| Replication | Built-in | None (requires client) |
| Pub/Sub | Yes | No |
| Lua scripting | Yes | No |
| Cluster | Native cluster | Consistent hash (client-side) |
| Threading | Single-threaded exec + I/O threads | Multi-threaded |
| Memory efficiency | Slightly higher overhead | Lower overhead for simple strings |
Choose Redis for almost all new projects. Choose Memcached only for pure string caching at extreme throughput where Redis Cluster latency is measurable.
Redis Command Complexity Reference
| Command | Complexity | Notes |
|---|---|---|
| GET, SET, INCR | O(1) | Hash table lookup |
| HGETALL | O(n) | n = number of fields |
| LRANGE | O(S+N) | S = offset from head, N = elements returned |
| ZADD | O(log N) | Skip list insertion |
| SMEMBERS | O(n) | Returns all members |
| SORT | O(N+M*log(M)) | N = elements, M = returned elements; dangerous on large sets |
| KEYS pattern | O(n) | Never use in production; use SCAN instead |
Production rule: Never run KEYS * in production. It blocks the single-threaded event loop and causes latency spikes. Use SCAN with a cursor instead.
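The cursor loop behind SCAN looks like this. The sketch takes any callable that behaves like a SCAN call (returning the next cursor and a batch of keys) so that it can run without a server; `iter_keys` and `make_fake_scan` are hypothetical helper names, and the fake uses a list offset as its cursor, which is an assumption for the demo only, since real Redis cursors are opaque values.

```python
import fnmatch


def iter_keys(scan, match="*", count=100):
    """Yield keys batch by batch until the server returns cursor 0."""
    cursor = 0
    while True:
        cursor, keys = scan(cursor=cursor, match=match, count=count)
        yield from keys
        if cursor == 0:  # a returned cursor of 0 means the iteration is complete
            break


def make_fake_scan(all_keys):
    """Stand-in for a server: pages through a sorted key list."""
    keys = sorted(all_keys)

    def scan(cursor=0, match="*", count=10):
        batch = keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(keys):
            next_cursor = 0
        # Like SCAN, filtering happens after the batch is picked, so a batch may be short or empty.
        return next_cursor, [k for k in batch if fnmatch.fnmatchcase(k, match)]

    return scan


fake_scan = make_fake_scan(["order:1", "user:1", "user:2", "user:3", "user:4", "user:5"])
print(list(iter_keys(fake_scan, match="user:*", count=2)))
# ['user:1', 'user:2', 'user:3', 'user:4', 'user:5']
```

With the redis-py client, the same loop should work by passing a client's `scan` method as the callable, and the client also ships a `scan_iter` helper that wraps this pattern. Each call does a small, bounded amount of work, so other commands can run between batches; note that SCAN may return a key more than once, so callers that need uniqueness should deduplicate.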
Key Naming Conventions
# Pattern: object-type:id:field
user:1234:profile
user:1234:sessions
# Hash tag for cluster slot co-location
{user:1234}:profile
{user:1234}:orders # same slot as above
{user:1234}:sessions
# Versioned keys for safe migrations
user:v2:1234:profile
# Avoid: key names longer than 100 bytes waste memory
# Avoid: key names with spaces or special chars (use : and _ as delimiters)