Volume IV: Distributed Systems

🔒 Topic 28: Distributed Locking

Cache::lock is not enough for multi-server environments.

"In a single-server app, a mutex (lock) is simple.
In a distributed system with 10 servers, a lock becomes a nightmare.
Redis locks fail during network partitions.
Database locks are slow.
You need a distributed locking algorithm: Redlock, ZooKeeper, or etcd."

⚠️ THE DISTRIBUTED LOCKING TRAP

Many Laravel developers assume Cache::lock() is sufficient for distributed locking. With a single Redis instance that every server can always reach, it works. The trouble starts when that assumption breaks: a lock can expire while its holder is paused or cut off by a network partition, or a Redis failover can promote a replica that never received the lock key. In both cases two servers can hold the same lock at the same time. Understanding these failure modes is essential for building reliable distributed systems.

🔴 The Problem: Why Local Locks Fail in Distributed Systems

SINGLE SERVER (LOCKS WORK FINE)
═══════════════════════════════════════════════════════════════════

┌──────────────────────────────────┐
│ ONE SERVER                       │
│   Process A: lock acquired ✓     │
│   Process B: wait (blocked)      │
│   Process C: wait (blocked)      │
└──────────────────────────────────┘

Works perfectly. One source of truth.

MULTIPLE SERVERS (LOCKS ARE DANGEROUS)
═══════════════════════════════════════════════════════════════════

With a local lock, every server believes it is the only holder:

┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│ Server 1         │  │ Server 2         │  │ Server 3         │
│                  │  │                  │  │                  │
│ Lock acquired ✓  │  │ Lock acquired ✓  │  │ Lock acquired ✓  │
│ (thinks it's     │  │ (thinks it's     │  │ (thinks it's     │
│ the only one)    │  │ the only one)    │  │ the only one)    │
└──────────────────┘  └──────────────────┘  └──────────────────┘

With one shared Redis instance, only the first request wins:

┌──────────────────────────────────────────────────────────────────┐
│ REDIS (Single)                                                   │
│                                                                  │
│ Server 1: SET lock:user:123 "server1" NX EX 10 → SUCCESS         │
│ Server 2: SET lock:user:123 "server2" NX EX 10 → FAIL (locked)   │
│ Server 3: SET lock:user:123 "server3" NX EX 10 → FAIL (locked)   │
└──────────────────────────────────────────────────────────────────┘

✅ Redis as single source of truth: works!

BUT WHAT HAPPENS DURING NETWORK PARTITION?
═══════════════════════════════════════════════════════════════════

┌──────────────────┐         ┌──────────────────┐
│ Server 1         │   ✗✗✗   │ Server 2         │
│ (can reach       │         │ (can't reach     │
│ Redis)           │         │ Redis)           │
└────────┬─────────┘         └────────┬─────────┘
         │                            │
         ▼                            ▼
  ┌─────────────┐              ┌─────────────┐
  │ Redis       │              │ (No         │
  │ Alive       │              │ Redis)      │
  └─────────────┘              └─────────────┘

Server 1: Can acquire locks normally
Server 2: Can't acquire locks (Redis unreachable)
          → But what if it THINKS it can?

A client that cannot reach Redis should simply fail to acquire the lock. The danger is code that does something else: if Server 2 falls back to a local lock when Redis is unreachable, it takes a lock that Server 1 cannot see, and now TWO servers have the lock! The same overlap happens if Server 1 is cut off mid-task and its lock expires before it finishes.
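The expiry case can be replayed without a real Redis server. The following is a minimal sketch, assuming a hypothetical in-memory `FakeRedis` class (not part of Laravel or any Redis client) that mimics `SET key value NX PX ttl` and an owner-checked release. Time is passed in explicitly so the race is deterministic.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for a single Redis instance.
// The current time is an argument so the expiry race is easy to replay.
final class FakeRedis
{
    /** @var array<string, array{owner: string, expiresAt: int}> */
    private array $keys = [];

    // Mimics: SET key owner NX PX ttlMs
    public function setNx(string $key, string $owner, int $ttlMs, int $nowMs): bool
    {
        $held = $this->keys[$key] ?? null;
        if ($held !== null && $held['expiresAt'] > $nowMs) {
            return false; // someone still holds an unexpired lock
        }
        $this->keys[$key] = ['owner' => $owner, 'expiresAt' => $nowMs + $ttlMs];
        return true;
    }

    // Release only if the caller still owns the key (compare-and-delete).
    public function release(string $key, string $owner, int $nowMs): bool
    {
        $held = $this->keys[$key] ?? null;
        if ($held === null || $held['expiresAt'] <= $nowMs || $held['owner'] !== $owner) {
            return false;
        }
        unset($this->keys[$key]);
        return true;
    }
}

$redis = new FakeRedis();

// t=0s: server1 takes the lock for 10 seconds.
$server1Acquired = $redis->setNx('lock:user:123', 'server1', 10_000, 0);
// t=5s: server2 is correctly refused.
$server2Early = $redis->setNx('lock:user:123', 'server2', 10_000, 5_000);
// t=12s: server1 is stalled (partition, long pause) and its lock has expired.
$server2Late = $redis->setNx('lock:user:123', 'server2', 10_000, 12_000);
// t=13s: server1 resumes and tries to release. The owner check stops it from
// deleting server2's lock, but server1's work has already overlapped.
$server1Release = $redis->release('lock:user:123', 'server1', 13_000);
```

The owner check on release prevents one server from deleting another server's lock, but it does nothing about the overlap itself. Closing that gap is what fencing tokens are for.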

Laravel's Cache::lock (What It Is, What It Isn't)

LARAVEL CACHE LOCK (SINGLE SERVER OK, DISTRIBUTED RISKY)
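// Laravel's atomic lock, backed by the configured cache store (Redis here)
use Illuminate\Support\Facades\Cache;

// Second argument is the TTL in seconds: the lock auto-expires after 10s
$lock = Cache::lock('processing.user.123', 10);

if ($lock->get()) {
    try {
        // Process the user
        $this->processUser(123);
    } finally {
        // Always release, even if processing throws
        $lock->release();
    }
}

// This works while every server talks to the same reachable Redis instance
// and the work finishes before the 10-second TTL runs out.
// During a partition, a long pause, or a Redis failover, it can fail.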
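LIMITATIONS OF Cache::lock
• Single point of failure: one Redis instance decides who holds the lock
• The lock is TTL-based: it can expire while the holder is still working
• No fencing tokens: the resource being written to cannot detect a stale holder
• Not split-brain safe: after a failover, a promoted replica may not know about the lock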

🔴 The Redlock Algorithm (Redis Distributed Lock)

REDLOCK: ACQUIRE LOCK FROM MULTIPLE REDIS INSTANCES
═══════════════════════════════════════════════════════════════════

Instead of one Redis, use 5 independent Redis masters:

┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐
│ Redis 1 │  │ Redis 2 │  │ Redis 3 │  │ Redis 4 │  │ Redis 5 │
└────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘
     │            │            │            │            │
     └────────────┴────────────┼────────────┴────────────┘
                               │
                        ┌──────▼──────┐
                        │  Server 1   │
                        │  tries to   │
                        │  acquire    │
                        └─────────────┘

Algorithm:
1. Generate a unique lock value (e.g. a UUID) and record the start time
2. Try to acquire the lock on ALL 5 Redis instances, using the same key, value, and TTL
3. Use a short per-instance timeout (e.g. 5 ms) so a dead instance cannot stall the attempt
4. If the lock was acquired on a MAJORITY (≥3 instances) and the elapsed time is less than the TTL:
   → Lock is acquired; it stays valid for the TTL minus the time spent acquiring it
5. If not, release the lock on all instances and retry after a random delay

Why it's more reliable:
• Even if 2 Redis instances die, you still have a majority (3/5)
• A network partition affecting 2 instances won't break the lock
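The majority rule and the validity calculation can be sketched in plain PHP. This is an illustration under stated assumptions, not a production client: `FakeInstance`, `redlockAcquire`, and the injected `$nowMs` clock are hypothetical names, key expiry is not modeled, and the drift allowance (1% of the TTL plus 2 ms) is an assumed value.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for one independent Redis master (no TTL handling).
final class FakeInstance
{
    /** @var array<string, string> key => owner token */
    private array $keys = [];
    private bool $reachable;

    public function __construct(bool $reachable = true)
    {
        $this->reachable = $reachable;
    }

    public function setNx(string $key, string $token): bool
    {
        if (!$this->reachable || isset($this->keys[$key])) {
            return false;
        }
        $this->keys[$key] = $token;
        return true;
    }

    public function deleteIfOwner(string $key, string $token): void
    {
        if ($this->reachable && ($this->keys[$key] ?? null) === $token) {
            unset($this->keys[$key]);
        }
    }
}

/**
 * @param FakeInstance[] $instances
 * @param callable(): int $nowMs clock in milliseconds (injected for testing)
 * @return array{token: string, validityMs: int}|null
 */
function redlockAcquire(array $instances, string $key, int $ttlMs, callable $nowMs): ?array
{
    $token = bin2hex(random_bytes(16)); // unique value identifying this holder
    $start = $nowMs();

    $granted = 0;
    foreach ($instances as $instance) {
        if ($instance->setNx($key, $token)) {
            $granted++;
        }
    }

    $quorum = intdiv(count($instances), 2) + 1;
    $driftMs = intdiv($ttlMs, 100) + 2; // assumed clock-drift allowance
    $validityMs = $ttlMs - ($nowMs() - $start) - $driftMs;

    if ($granted >= $quorum && $validityMs > 0) {
        return ['token' => $token, 'validityMs' => $validityMs];
    }

    // No quorum, or acquisition took too long: undo partial acquisitions.
    foreach ($instances as $instance) {
        $instance->deleteIfOwner($key, $token);
    }
    return null;
}

// Fake clock: every call advances time by 3 ms.
$clock = static function (): int {
    static $t = 0;
    $t += 3;
    return $t;
};

$instances = [
    new FakeInstance(), new FakeInstance(), new FakeInstance(),
    new FakeInstance(false), new FakeInstance(false), // two masters are down
];

$first = redlockAcquire($instances, 'lock:user:123', 10_000, $clock);
$second = redlockAcquire($instances, 'lock:user:123', 10_000, $clock);
```

Here the first caller reaches three of five instances and gets the lock with a validity slightly below the TTL. The second caller reaches none of the free instances and gets `null`.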
REDLOCK IN LARAVEL (WITH PACKAGE)
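// Install a Redlock client with Composer. Several PHP ports exist, and their
// package names, namespaces, and server-array formats differ, so check the
// README of the one you install before copying this verbatim.
// $ composer require redlock-php/redlock

use RedLock\RedLock;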

$servers = [
    ['host' => 'redis1.example.com', 'port' => 6379],
    ['host' => 'redis2.example.com', 'port' => 6379],
    ['host' => 'redis3.example.com', 'port' => 6379],
    ['host' => 'redis4.example.com', 'port' => 6379],
    ['host' => 'redis5.example.com', 'port' => 6379],
];

$redlock = new RedLock($servers);

$lock = $redlock->lock('user_processing_123', 10000); // 10 seconds TTL

if ($lock) {
    try {
        // Perform critical operation
        $this->processUser(123);
    } finally {
        $redlock->unlock($lock);
    }
}
⚠️ REDLOCK CONTROVERSY

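Redlock has been criticized by distributed systems experts, most notably Martin Kleppmann. Its safety depends on timing assumptions (bounded clock drift, process pauses, and network delay), and it does not issue fencing tokens, so a client that stalls past its lock's expiry (for example, during a GC pause) can still act on a stale lock. For locks that only prevent duplicate work, it is usually "good enough", and unlike a single Redis lock it survives the loss of a Redis node. For locks that protect correctness, pair the lock with fencing tokens or use a consensus-based system.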

🦓 ZooKeeper (The Gold Standard for Distributed Locks)

ZOOKEEPER: STRONGLY CONSISTENT DISTRIBUTED LOCKS
═══════════════════════════════════════════════════════════════════

ZooKeeper uses the ZAB protocol (ZooKeeper Atomic Broadcast) for strong consistency:

┌─────────────────────────────────────────────┐
│      ZooKeeper Ensemble (3 or 5 nodes)      │
│    ┌─────────┐  ┌─────────┐  ┌─────────┐    │
│    │ Leader  │  │Follower │  │Follower │    │
│    └────┬────┘  └────┬────┘  └────┬────┘    │
│         │            │            │         │
│         └────────────┼────────────┘         │
│                      │                      │
│               Consensus (ZAB)               │
└──────────────────────┬──────────────────────┘
                       │
               ┌───────▼───────┐
               │  Application  │
               │  Server 1..N  │
               └───────────────┘

Guarantees:
• Linearizable writes (strong consistency)
• Sequential consistency for reads
• No split-brain (network partition? the minority side stops accepting writes)
• Fencing tokens (sequence numbers let a resource reject writes from dead or stale lock holders)
ZOOKEEPER LOCK IN LARAVEL
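// Install the PHP ZooKeeper extension (the package name varies by
// distribution; the extension is also published on PECL as "zookeeper")
// $ sudo apt-get install php-zookeeper

use Zookeeper;

class ZooKeeperLock
{
    private Zookeeper $zk;
    private ?string $nodePath = null;

    public function __construct(string $hosts)
    {
        $this->zk = new Zookeeper($hosts);
    }

    // Simplified try-lock: $timeout is unused until the watch logic is added.
    // Assumes the parent path /locks/{$lockName} already exists.
    public function acquireLock(string $lockName, int $timeout = 30): bool
    {
        $lockPath = "/locks/{$lockName}";

        // create() expects an ACL list before the flags; this one is fully open
        $acl = [['perms' => Zookeeper::PERM_ALL, 'scheme' => 'world', 'id' => 'anyone']];

        // Create sequential, ephemeral node (removed automatically if our session dies)
        $path = $this->zk->create(
            $lockPath . '/lock-',
            '',
            $acl,
            Zookeeper::EPHEMERAL | Zookeeper::SEQUENCE
        );

        // Check if this is the smallest sequence number
        $children = $this->zk->getChildren($lockPath);
        sort($children);

        if (basename($path) === $children[0]) {
            $this->nodePath = $path;
            return true;  // We have the lock
        }

        // Wait for the previous lock to be released
        // ... watch implementation omitted for brevity

        // Not acquired: remove our node so it does not block later waiters
        $this->zk->delete($path);
        return false;
    }

    public function releaseLock(): void
    {
        if ($this->nodePath !== null) {
            $this->zk->delete($this->nodePath);
            $this->nodePath = null;
        }
    }
}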
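The omitted waiting step hinges on one decision: which node should a waiter watch? In the usual ZooKeeper lock recipe each waiter watches only the node directly ahead of it, so a release wakes one client rather than all of them. The sketch below shows just that decision as a pure function; `evaluateLockQueue` and `sequenceOf` are hypothetical helper names, and no ZooKeeper connection is involved.

```php
<?php
declare(strict_types=1);

// Extract the numeric suffix ZooKeeper appends to sequential nodes.
// 'lock-0000000012' -> 12
function sequenceOf(string $node): int
{
    return (int) substr($node, strrpos($node, '-') + 1);
}

/**
 * Given our own node name and all children under the lock path, decide
 * whether we hold the lock and, if not, which node to watch.
 *
 * @param string[] $children e.g. ['lock-0000000007', 'lock-0000000003']
 * @return array{acquired: bool, watch: ?string}
 */
function evaluateLockQueue(string $myNode, array $children): array
{
    // Order the queue by sequence number, lowest first.
    usort(
        $children,
        static fn (string $a, string $b): int => sequenceOf($a) <=> sequenceOf($b)
    );

    $position = array_search($myNode, $children, true);
    if ($position === false) {
        throw new InvalidArgumentException("Node {$myNode} is not in the queue");
    }

    if ($position === 0) {
        return ['acquired' => true, 'watch' => null];
    }

    // Watch only our immediate predecessor.
    return ['acquired' => false, 'watch' => $children[$position - 1]];
}

$queue = ['lock-0000000007', 'lock-0000000003', 'lock-0000000005'];

$head = evaluateLockQueue('lock-0000000003', $queue);
$middle = evaluateLockQueue('lock-0000000005', $queue);
$tail = evaluateLockQueue('lock-0000000007', $queue);
```

When the watched node disappears, the waiter re-reads the children and evaluates again, because the predecessor may have crashed rather than released while an even earlier node still holds the lock.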

⚑ etcd (Used by Kubernetes, Modern Alternative)

ETCD DISTRIBUTED LOCK
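// Install an etcd PHP client (verify the package name on Packagist; the
// lock()/acquire()/release() calls below are illustrative and depend on
// the client you choose)
// $ composer require otevrel/etcd-php

use Etcd\Client;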

$client = new Client('http://localhost:2379');

// Acquire a distributed lock
$lock = $client->lock('user_processing_123');

if ($lock->acquire(10)) {  // 10 second timeout
    try {
        // Process the user
        $this->processUser(123);
    } finally {
        $lock->release();
    }
}

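// etcd also supports:
// - Watches on the lock key, so waiters are notified when it is released
// - Lease-based locks (the key disappears when the lease's TTL expires)
// - Re-entrant locks (only if the client library implements them)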
ETCD VS ZOOKEEPER VS REDIS
Feature          | Redis (Single) | Redlock               | ZooKeeper        | etcd
Consistency      | Weak           | Best-effort           | Strong           | Strong
Fencing tokens   | No             | No                    | Yes              | Yes
Split-brain safe | No             | Partial               | Yes              | Yes
Performance      | Very Fast      | Fast                  | Medium           | Fast
Complexity       | Low            | Medium                | High             | Medium
Best for         | Single server  | Most distributed apps | Critical systems | K8s, modern stacks

🛡️ Fencing Tokens (The Hidden Requirement)

THE GHOST LOCK PROBLEM (No Fencing Token)
═══════════════════════════════════════════════════════════════════

Time 0ms:   Process A acquires lock (token=1) and starts processing
Time 100ms: Process A has GC pause (or network delay)
Time 200ms: Lock expires, Process B acquires lock (token=2)
Time 300ms: Process A wakes up, continues processing
Time 400ms: Process A and Process B both think they have lock!

Result: CORRUPTED DATA!

WITH FENCING TOKEN (Solution):
═══════════════════════════════════════════════════════════════════

Time 0ms:   Process A acquires lock (token=1) and starts processing
Time 100ms: Process A has GC pause
Time 200ms: Lock expires, Process B acquires lock (token=2)
Time 300ms: Process A wakes up, tries to write to database
            Database rejects write because token=1 < current token=2
Time 400ms: Only Process B writes. Data is safe!

LARAVEL IMPLEMENTATION (with ZooKeeper/etcd):
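// Fencing token in action. The check runs inside a database transaction,
// so every server compares against the same "last accepted token" instead
// of a per-process variable. Assumes the table has an `id` primary key.
use Illuminate\Support\Facades\DB;

class FencedResource
{
    public function write(int $resourceId, string $data, int $fencingToken): void
    {
        DB::transaction(function () use ($resourceId, $data, $fencingToken) {
            // Lock the row and read the token of the last accepted write
            $currentToken = DB::table('resource')
                ->where('id', $resourceId)
                ->lockForUpdate()
                ->value('last_write_token');

            if ($currentToken !== null && $fencingToken < (int) $currentToken) {
                throw new \Exception("Fencing token expired. Write rejected.");
            }

            // Perform the write and record the token that authorized it
            DB::table('resource')->where('id', $resourceId)->update([
                'data' => $data,
                'last_write_token' => $fencingToken,
            ]);
        });
    }
}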
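The fencing timeline can be replayed end to end in plain PHP. This is a sketch with hypothetical names: `TokenIssuingLock` stands in for a lock service that hands out a larger token on every acquisition, and `FencedStore` stands in for the protected resource.

```php
<?php
declare(strict_types=1);

// Hypothetical lock service: each successful acquisition gets a larger token.
final class TokenIssuingLock
{
    private int $counter = 0;

    public function acquire(): int
    {
        return ++$this->counter;
    }
}

// Hypothetical storage service that remembers the highest token it has seen.
final class FencedStore
{
    private int $highestToken = 0;

    /** @var string[] */
    private array $log = [];

    public function write(string $data, int $fencingToken): bool
    {
        if ($fencingToken < $this->highestToken) {
            return false; // stale holder: reject the write
        }
        $this->highestToken = $fencingToken;
        $this->log[] = $data;
        return true;
    }

    /** @return string[] */
    public function log(): array
    {
        return $this->log;
    }
}

$lock = new TokenIssuingLock();
$store = new FencedStore();

// Time 0ms: Process A acquires the lock (token=1), then pauses.
$tokenA = $lock->acquire();
// Time 200ms: the lock expires and Process B acquires it (token=2).
$tokenB = $lock->acquire();
// Process B writes first, so the store now knows about token=2.
$bWrote = $store->write('from B', $tokenB);
// Time 300ms: Process A wakes up and writes with its old token.
$aWrote = $store->write('from A', $tokenA);
```

The rejection depends on the store having already seen the newer token. If Process A's write had arrived before any write from Process B, it would have been accepted, which is harmless because nothing had interleaved yet.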

🗄️ Database-Based Distributed Locks (Slow but Reliable)

MYSQL GET_LOCK (Simple but Works)
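// MySQL named locks are server-wide: the name is shared by every database
// on the same MySQL server, and the lock belongs to the current connection
use Illuminate\Support\Facades\DB;

$lockName = 'user_processing_123';

// Acquire lock, waiting up to 10 seconds
// (returns 1 on success, 0 on timeout, NULL on error)
$result = DB::selectOne('SELECT GET_LOCK(?, 10) AS acquired', [$lockName]);

if ((int) $result->acquired === 1) {
    try {
        // Process the user
        $this->processUser(123);
    } finally {
        // Must run on the same connection that acquired the lock
        DB::selectOne('SELECT RELEASE_LOCK(?) AS released', [$lockName]);
    }
}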
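DATABASE LOCK LIMITATIONS
• Slower than Redis: every acquire and release is a round trip to the database
• The lock belongs to one connection: if that connection drops, the lock is released mid-task
• Works only against a single MySQL server: named locks are not replicated
• No fencing tokens: the resource being written to cannot detect a stale holder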

🎯 Choosing the Right Distributed Lock for Your Laravel App

Requirement                                 | Recommendation
Single server, simple locking               | Cache::lock() is fine
Multiple servers, network delays acceptable | Redlock (5+ Redis instances) or etcd
Strong consistency required (financial)     | ZooKeeper or etcd with fencing tokens
High performance, low latency               | Redlock (5 Redis instances)
Running on Kubernetes                       | etcd (Kubernetes uses it natively)
Simple setup, no new infrastructure         | MySQL GET_LOCK (but understand limits)

📌 THE RULE: Start with Cache::lock(). If you outgrow a single server, move to Redlock with 5 Redis instances. For financial or safety-critical systems, use ZooKeeper or etcd with fencing tokens. Never assume a single Redis instance is safe for distributed locking.

Topic 28 Summary: Distributed Locking

Solution                   | Safety                          | Performance | Complexity
Cache::lock (single Redis) | ⚠️ Low (network partition risk) | Very High   | Low
Redlock (5+ Redis)         | Medium                          | High        | Medium
ZooKeeper                  | High (with fencing)             | Medium      | High
etcd                       | High                            | High        | Medium
MySQL GET_LOCK             | Medium                          | Low         | Low

📌 THE RULE: In distributed systems, every lock can fail. Design your system to handle lock failures gracefully. Use fencing tokens to prevent ghost locks. And never, ever assume a lock is perfectly safe.
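
One way to handle a failed acquisition gracefully is a bounded retry with a random, growing delay, after which the caller skips or requeues the work instead of proceeding unlocked. The helper below is a sketch: `acquireWithRetry` and its parameters are hypothetical, and the sleep function is injected so the example does not actually wait.

```php
<?php
declare(strict_types=1);

/**
 * Try to acquire a lock a bounded number of times, sleeping a random,
 * growing delay between attempts so competing clients drift apart.
 *
 * @param callable(): bool    $tryAcquire one non-blocking acquisition attempt
 * @param callable(int): void $sleepMs    sleep function, injected for testing
 */
function acquireWithRetry(
    callable $tryAcquire,
    int $maxAttempts,
    int $baseDelayMs,
    callable $sleepMs
): bool {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        if ($tryAcquire()) {
            return true;
        }
        if ($attempt < $maxAttempts) {
            // Random delay between 0 and (base * attempt) milliseconds
            $sleepMs(random_int(0, $baseDelayMs * $attempt));
        }
    }
    return false; // the caller decides: skip, requeue, or report
}

// Usage with a fake lock that becomes free on the third attempt.
$calls = 0;
$delays = [];
$acquired = acquireWithRetry(
    function () use (&$calls): bool {
        return ++$calls >= 3;
    },
    5,
    50,
    function (int $ms) use (&$delays): void {
        $delays[] = $ms; // record instead of sleeping
    }
);

// A lock that never frees up: the helper gives up after two attempts.
$gaveUp = acquireWithRetry(fn (): bool => false, 2, 50, fn (int $ms) => null);
```

In real code the sleep callback would wrap something like `usleep($ms * 1000)`, and `$tryAcquire` would wrap whichever lock client is in use.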
NEXT TOPIC PREVIEW

Topic 29: Observability (Logging, Metrics, Tracing). You can't improve what you can't measure. The three pillars of observability: Logs (what happened), Metrics (how often), Traces (where time went).