📁 Volume II: Laravel Performance Tuning Kit

🗂️ Topic 11: Database Partitioning

Vertical & Horizontal — When one table becomes too big for one server.

"Your orders table has 100 million rows.
Indexes help, but the B-Tree is now 10 levels deep.
Every query touches 10 disk pages. Your I/O is screaming.
Read replicas won't save you. You need to partition."

⚠️ THE SCALE PROBLEM

Read/write splitting (Topic 10) scales your read capacity, but it doesn't reduce the size of your tables. When a single table reaches 100M+ rows, even with perfect indexes, queries become slow. The B-Tree depth increases, and every query requires more I/O. Partitioning is the next step.

📐 Type 1: Vertical Partitioning

THE PROBLEM: WIDE TABLES

Some columns are large (TEXT, BLOB, JSON) and rarely accessed. But they still occupy space in every row, slowing down full table scans and wasting buffer pool memory.

│ │ BEFORE: ONE WIDE TABLE │ ═══════════════════════════════════════════════════════════════════ │ │ ┌─────────────────────────────────────────────────────────────────────────────────────┐ │ │ users table │ │ │ ┌────────┬──────────┬─────────────┬──────────┬─────────────┬──────────────────┐ │ │ │ │ id │ name │ email │ password │ avatar (BLOB)│ bio (TEXT) │ │ │ │ ├────────┼──────────┼─────────────┼──────────┼─────────────┼──────────────────┤ │ │ │ │ 1 │ John │ j@x.com │ hash... │ [500KB] │ "Lorem ipsum..." │ │ │ │ │ 2 │ Jane │ jn@x.com │ hash... │ [500KB] │ "Dolor sit..." │ │ │ │ │ 3 │ Bob │ b@x.com │ hash... │ [500KB] │ "Consectetur..." │ │ │ │ └────────┴──────────┴─────────────┴──────────┴─────────────┴──────────────────┘ │ │ └─────────────────────────────────────────────────────────────────────────────────────┘ │ │ PROBLEM: SELECT id, name, email FROM users WHERE active = 1 │ → MySQL still reads the 500KB avatar and 10KB bio from disk! │ → Wasted I/O. Slow queries. │ │ ═══════════════════════════════════════════════════════════════════ │ │ AFTER: VERTICAL PARTITIONING │ ═══════════════════════════════════════════════════════════════════ │ │ ┌─────────────────────────────────────┐ ┌─────────────────────────────────────────┐ │ │ users_main (lean) │ │ users_profile (large columns) │ │ │ ┌────────┬──────────┬─────────────┐│ │ ┌─────────┬─────────────┬──────────────┐│ │ │ │ id │ name │ email ││ │ │ user_id │ avatar │ bio ││ │ │ ├────────┼──────────┼─────────────┼│ │ ├─────────┼─────────────┼──────────────┤│ │ │ │ 1 │ John │ j@x.com ││ │ │ 1 │ [500KB] │ "Lorem..." ││ │ │ │ 2 │ Jane │ jn@x.com ││ │ │ 2 │ [500KB] │ "Dolor..." ││ │ │ │ 3 │ Bob │ b@x.com ││ │ │ 3 │ [500KB] │ "Consect..." ││ │ │ └────────┴──────────┴─────────────┘│ │ └─────────┴─────────────┴──────────────┘│ │ └─────────────────────────────────────┘ └─────────────────────────────────────────┘ │ │ BENEFIT: SELECT id, name, email FROM users_main WHERE active = 1 │ → Only reads the small table! 10x faster, 10x less I/O! │

VERTICAL PARTITIONING IN LARAVEL

// Model for main user data (frequently accessed)
class User extends Model
{
    protected $table = 'users_main';
    protected $fillable = ['id', 'name', 'email', 'password'];
    
    public function profile()
    {
        return $this->hasOne(UserProfile::class, 'user_id');
    }
}

// Model for large/binary data (rarely accessed)
class UserProfile extends Model
{
    protected $table = 'users_profile';
    protected $fillable = ['user_id', 'avatar', 'bio'];
}

// Usage: When you need avatar, eager load
$user = User::with('profile')->find(1);
$avatar = $user->profile->avatar;

// Usage: When you only need basic info (no avatar)
$users = User::select('id', 'name', 'email')->get();  // Fast! No large columns!

WHEN TO USE VERTICAL PARTITIONING

Table has large TEXT, BLOB, or JSON columns
These large columns are rarely accessed
You frequently query only the small columns
You want to reduce buffer pool pressure

📊 Type 2: Horizontal Partitioning (Sharding)

THE PROBLEM: TOO MANY ROWS

Your table has 500 million rows. Even with perfect indexes, the B-Tree is too deep. Writes are slow. Maintenance (ALTER TABLE) takes hours. Backups take days.

│ │ BEFORE: ONE MASSIVE TABLE │ ═══════════════════════════════════════════════════════════════════ │ │ ┌─────────────────────────────────────────────────────────────────────────────────────┐ │ │ orders table (500 million rows) │ │ │ ┌─────────┬─────────────┬────────────┬──────────────┬─────────────────────────┐ │ │ │ │ id │ user_id │ amount │ status │ created_at │ │ │ │ ├─────────┼─────────────┼────────────┼──────────────┼─────────────────────────┤ │ │ │ │ 1 │ 123 │ $99.99 │ completed │ 2020-01-01 00:00:00 │ │ │ │ │ 2 │ 456 │ $49.99 │ pending │ 2020-01-01 00:01:00 │ │ │ │ │ ... │ ... │ ... │ ... │ ... │ │ │ │ │ 500M │ 789 │ $19.99 │ completed │ 2024-01-01 23:59:59 │ │ │ │ └─────────┴─────────────┴────────────┴──────────────┴─────────────────────────┘ │ │ └─────────────────────────────────────────────────────────────────────────────────────┘ │ │ PROBLEM: │ • SELECT queries scanning months of data are slow │ • DELETE old data is impossible (locks table for hours) │ • ALTER TABLE ADD COLUMN takes 2 hours (rebuilds entire table) │ • Backups are 500GB+ │ │ ═══════════════════════════════════════════════════════════════════ │ │ AFTER: HORIZONTAL PARTITIONING (BY DATE) │ ═══════════════════════════════════════════════════════════════════ │ │ ┌─────────────────────────┐ ┌─────────────────────────┐ ┌─────────────────────────┐ │ │ orders_2024 │ │ orders_2025 │ │ orders_2026 │ │ │ (Jan-Dec 2024) │ │ (Jan-Dec 2025) │ │ (Jan-Dec 2026) │ │ │ ┌───────┬───────┬─────┐│ │ ┌───────┬───────┬─────┐│ │ ┌───────┬───────┬─────┐│ │ │ │ id │ user │ amt ││ │ │ id │ user │ amt ││ │ │ id │ user │ amt ││ │ │ ├───────┼───────┼─────┤│ │ ├───────┼───────┼─────┤│ │ ├───────┼───────┼─────┤│ │ │ │ 1 │ 123 │ $99 ││ │ │ 1M │ 456 │ $49 ││ │ │ 2M │ 789 │ $19 ││ │ │ │ ... │ ... │ ... ││ │ │ ... │ ... │ ... ││ │ │ ... │ ... │ ... ││ │ │ └───────┴───────┴─────┘│ │ └───────┴───────┴─────┘│ │ └───────┴───────┴─────┘│ │ └─────────────────────────┘ └─────────────────────────┘ └─────────────────────────┘ │ │ BENEFITS: │ • Query only the partition you need (WHERE created_at BETWEEN '2025-01-01' AND '2025-12-31') │ • Drop old partitions instantly (DROP TABLE orders_2020) │ • ALTER TABLE affects only one partition (seconds, not hours) │ • Backups can be partitioned │

👥 Sharding by User ID (Multi-Tenant)

│ │ SHARDING BY USER_ID (Horizontal Scaling Across Servers) │ ═══════════════════════════════════════════════════════════════════ │ │ Application Layer (Laravel) │ ┌─────────────────────────────────────────────────────────────────────────────────────┐ │ │ $shard = $user_id % 4; │ │ │ DB::connection("shard_$shard")->table('user_data')->where('user_id', $user_id); │ │ └─────────────────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ ▼ ▼ ▼ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ Shard 1 │ │ Shard 2 │ │ Shard 3 │ │ Shard 4 │ │ │ (users 0- │ │ (users │ │ (users │ │ (users │ │ │ 2.5M) │ │ 2.5M-5M) │ │ 5M-7.5M) │ │ 7.5M-10M) │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │ BENEFITS: │ • Each shard is 1/4 the size (2.5M rows instead of 10M) │ • Queries only hit one shard (4x faster) │ • Can add more shards as users grow (rebalancing is hard) │ • Each shard can be on a different server (horizontal scaling) │ │ TRADE-OFFS: │ • Cross-shard queries are impossible (e.g., COUNT(*) across all shards) │ • Need a routing layer (application logic) │ • Rebalancing when adding shards is complex │

SHARDING IN LARAVEL (Dynamic Connection)

// config/database.php
'connections' => [
    'shard_1' => ['host' => 'db1.example.com', ...],
    'shard_2' => ['host' => 'db2.example.com', ...],
    'shard_3' => ['host' => 'db3.example.com', ...],
    'shard_4' => ['host' => 'db4.example.com', ...],
],

// AppServiceProvider or custom class
class ShardManager
{
    public static function getConnection($userId)
    {
        $shard = $userId % 4 + 1;  // 1-4
        return DB::connection("shard_$shard");
    }
    
    public static function getTable($userId)
    {
        $shard = $userId % 4 + 1;
        return DB::connection("shard_$shard")->table('user_data');
    }
}

// Usage in controller
$userData = ShardManager::getTable($userId)
    ->where('user_id', $userId)
    ->first();

📅 Date-Based Partitioning (MySQL Native)

MYSQL'S BUILT-IN PARTITIONING (EASIER)

MySQL supports partitioning natively. You don't need multiple tables or connections. One logical table, multiple physical partitions.

-- Create partitioned table by year
CREATE TABLE orders (
    id INT NOT NULL,
    user_id INT NOT NULL,
    amount DECIMAL(10,2),
    status VARCHAR(20),
    created_at DATE NOT NULL
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p2025 VALUES LESS THAN (2026),
    PARTITION p2026 VALUES LESS THAN (2027),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Query automatically uses only relevant partition
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
-- MySQL only scans p2024 partition!

-- Drop old data instantly (no DELETE)
ALTER TABLE orders DROP PARTITION p2022;  -- Seconds, not hours!

-- Add new partition for next year
ALTER TABLE orders ADD PARTITION (PARTITION p2027 VALUES LESS THAN (2028));

MYSQL PARTITIONING: PROS & CONS

Pros	Cons
Transparent to application (no code changes)	All partitions on same server (no horizontal scaling)
Instant partition drop (DROP PARTITION)	Partition key must be in every UNIQUE index
Automatic partition pruning	Some limitations on foreign keys
Easier than manual sharding	Not supported by all storage engines (InnoDB works)

📊 Performance Gains from Partitioning

Scenario	Without Partitioning	With Partitioning	Improvement
Query by date (1 year of 10M rows)	Scans 10M rows	Scans 1M rows	10x faster
DELETE old data (3 years old)	Locks table for hours (DELETE millions)	DROP PARTITION in seconds	1,000x faster
ALTER TABLE ADD COLUMN on 100M rows	1-2 hours (rebuilds whole table)	Minutes (affects only active partitions)	10-20x faster
Backup size	500GB (all data)	50GB per partition (can backup separately)	10x smaller per backup
Query by user_id (sharded across 4 servers)	Scans 40M rows on 1 server	Scans 10M rows on 1 of 4 servers	4x faster + 4x capacity

🎯 Choosing the Right Partitioning Strategy

Situation	Solution	Complexity	When to Apply
Wide table (large TEXT/BLOB columns)	Vertical Partitioning	Low	As soon as you notice large columns
Time-series data (logs, events, orders)	MySQL Native Range Partitioning (by date)	Low	When table > 10M rows
Multi-tenant SaaS (data per customer)	Sharding by tenant_id (multiple databases)	Medium	When single database can't fit all tenants
User data for millions of users	Sharding by user_id (multiple servers)	High	When table > 50M rows or > 500GB
Need to keep old data but query rarely	Archive table + Partitioning	Medium	When you have compliance requirements

🔴 THE RULE: Start simple. Use indexes. Add read replicas. Then vertical partitioning. Then MySQL range partitioning. Only when all else fails, implement application-level sharding.

Premature partitioning adds complexity. Measure first, then partition.

⚠️ The Downsides You Must Know

MYSQL PARTITIONING LIMITATIONS

Max 8192 partitions per table (but you'll never need that many)
Partition key must be in every UNIQUE index (including PRIMARY KEY)
Foreign keys not supported with partitioning (you have to manage manually)
Full-text indexes not supported on partitioned tables
ALTER TABLE ... REBUILD PARTITION still locks the partition

APPLICATION-LEVEL SHARDING DRAWBACKS

Cross-shard queries are impossible (e.g., SELECT COUNT(*) across all users)
Rebalancing is a nightmare (moving data between shards)
JOINs across shards don't work (you need denormalization or application-side joins)
Transactions across shards are complex (2-phase commit needed)
Your application code becomes more complex (routing logic everywhere)

RECOMMENDATION

Use MySQL native partitioning first. It's transparent to your application and solves 80% of use cases. Only implement application-level sharding when you've exhausted all other options (indexes, replicas, vertical partitioning, MySQL partitioning).

📝 Topic 11 Summary: Database Partitioning

Type	What	When to Use	Complexity
Vertical	Split wide tables by column groups	Large TEXT/BLOB columns rarely accessed	Low
MySQL Range	Split by date or numeric range (native)	Time-series data, logs, events	Low
MySQL Hash	Split by hash of a key (native)	Even distribution across partitions	Low
Application Sharding	Split across multiple database servers	Massive data (100M+ rows), need horizontal scaling	High

📌 THE RULE: Vertical partitioning first (separate large columns). Then MySQL range partitioning (by date). Then MySQL hash partitioning. Only then consider application-level sharding.

Partitioning is powerful but adds complexity. Don't do it until you need it. When you do need it, start with the simplest solution.

NEXT TOPIC PREVIEW

Topic 12: Statelessness Doctrine — Why your server must be killable. Move sessions to Redis. Move files to S3. Move config to env vars. Scale horizontally without fear.