📁 Volume II: Laravel Performance Tuning Kit
🗂️ Topic 11: Database Partitioning
Vertical & Horizontal — When one table becomes too big for one server.
"Your orders table has 100 million rows.
Indexes help, but the B-Tree is now 10 levels deep.
Every query touches 10 disk pages. Your I/O is screaming.
Read replicas won't save you. You need to partition."
⚠️ THE SCALE PROBLEM
Read/write splitting (Topic 10) scales your read capacity, but it doesn't reduce the size of your tables. When a single table reaches 100M+ rows, even with perfect indexes, queries become slow. The B-Tree depth increases, and every query requires more I/O. Partitioning is the next step.
📐 Type 1: Vertical Partitioning
THE PROBLEM: WIDE TABLES
Some columns are large (TEXT, BLOB, JSON) and rarely accessed. But they still occupy space in every row, slowing down full table scans and wasting buffer pool memory.
│
│ BEFORE: ONE WIDE TABLE
│ ═══════════════════════════════════════════════════════════════════
│
│ ┌─────────────────────────────────────────────────────────────────────────────────────┐
│ │ users table │
│ │ ┌────────┬──────────┬─────────────┬──────────┬─────────────┬──────────────────┐ │
│ │ │ id │ name │ email │ password │ avatar (BLOB)│ bio (TEXT) │ │
│ │ ├────────┼──────────┼─────────────┼──────────┼─────────────┼──────────────────┤ │
│ │ │ 1 │ John │ j@x.com │ hash... │ [500KB] │ "Lorem ipsum..." │ │
│ │ │ 2 │ Jane │ jn@x.com │ hash... │ [500KB] │ "Dolor sit..." │ │
│ │ │ 3 │ Bob │ b@x.com │ hash... │ [500KB] │ "Consectetur..." │ │
│ │ └────────┴──────────┴─────────────┴──────────┴─────────────┴──────────────────┘ │
│ └─────────────────────────────────────────────────────────────────────────────────────┘
│
│ PROBLEM: SELECT id, name, email FROM users WHERE active = 1
│ → MySQL still reads the 500KB avatar and 10KB bio from disk!
│ → Wasted I/O. Slow queries.
│
│ ═══════════════════════════════════════════════════════════════════
│
│ AFTER: VERTICAL PARTITIONING
│ ═══════════════════════════════════════════════════════════════════
│
│ ┌─────────────────────────────────────┐ ┌─────────────────────────────────────────┐
│ │ users_main (lean) │ │ users_profile (large columns) │
│ │ ┌────────┬──────────┬─────────────┐│ │ ┌─────────┬─────────────┬──────────────┐│
│ │ │ id │ name │ email ││ │ │ user_id │ avatar │ bio ││
│ │ ├────────┼──────────┼─────────────┼│ │ ├─────────┼─────────────┼──────────────┤│
│ │ │ 1 │ John │ j@x.com ││ │ │ 1 │ [500KB] │ "Lorem..." ││
│ │ │ 2 │ Jane │ jn@x.com ││ │ │ 2 │ [500KB] │ "Dolor..." ││
│ │ │ 3 │ Bob │ b@x.com ││ │ │ 3 │ [500KB] │ "Consect..." ││
│ │ └────────┴──────────┴─────────────┘│ │ └─────────┴─────────────┴──────────────┘│
│ └─────────────────────────────────────┘ └─────────────────────────────────────────┘
│
│ BENEFIT: SELECT id, name, email FROM users_main WHERE active = 1
│ → Only reads the small table! 10x faster, 10x less I/O!
│
VERTICAL PARTITIONING IN LARAVEL
// Model for main user data (frequently accessed)
class User extends Model
{
protected $table = 'users_main';
protected $fillable = ['id', 'name', 'email', 'password'];
public function profile()
{
return $this->hasOne(UserProfile::class, 'user_id');
}
}
// Model for large/binary data (rarely accessed)
class UserProfile extends Model
{
protected $table = 'users_profile';
protected $fillable = ['user_id', 'avatar', 'bio'];
}
// Usage: When you need avatar, eager load
$user = User::with('profile')->find(1);
$avatar = $user->profile->avatar;
// Usage: When you only need basic info (no avatar)
$users = User::select('id', 'name', 'email')->get(); // Fast! No large columns!
WHEN TO USE VERTICAL PARTITIONING
- Table has large TEXT, BLOB, or JSON columns
- These large columns are rarely accessed
- You frequently query only the small columns
- You want to reduce buffer pool pressure
📊 Type 2: Horizontal Partitioning (Sharding)
THE PROBLEM: TOO MANY ROWS
Your table has 500 million rows. Even with perfect indexes, the B-Tree is too deep. Writes are slow. Maintenance (ALTER TABLE) takes hours. Backups take days.
│
│ BEFORE: ONE MASSIVE TABLE
│ ═══════════════════════════════════════════════════════════════════
│
│ ┌─────────────────────────────────────────────────────────────────────────────────────┐
│ │ orders table (500 million rows) │
│ │ ┌─────────┬─────────────┬────────────┬──────────────┬─────────────────────────┐ │
│ │ │ id │ user_id │ amount │ status │ created_at │ │
│ │ ├─────────┼─────────────┼────────────┼──────────────┼─────────────────────────┤ │
│ │ │ 1 │ 123 │ $99.99 │ completed │ 2020-01-01 00:00:00 │ │
│ │ │ 2 │ 456 │ $49.99 │ pending │ 2020-01-01 00:01:00 │ │
│ │ │ ... │ ... │ ... │ ... │ ... │ │
│ │ │ 500M │ 789 │ $19.99 │ completed │ 2024-01-01 23:59:59 │ │
│ │ └─────────┴─────────────┴────────────┴──────────────┴─────────────────────────┘ │
│ └─────────────────────────────────────────────────────────────────────────────────────┘
│
│ PROBLEM:
│ • SELECT queries scanning months of data are slow
│ • DELETE old data is impossible (locks table for hours)
│ • ALTER TABLE ADD COLUMN takes 2 hours (rebuilds entire table)
│ • Backups are 500GB+
│
│ ═══════════════════════════════════════════════════════════════════
│
│ AFTER: HORIZONTAL PARTITIONING (BY DATE)
│ ═══════════════════════════════════════════════════════════════════
│
│ ┌─────────────────────────┐ ┌─────────────────────────┐ ┌─────────────────────────┐
│ │ orders_2024 │ │ orders_2025 │ │ orders_2026 │
│ │ (Jan-Dec 2024) │ │ (Jan-Dec 2025) │ │ (Jan-Dec 2026) │
│ │ ┌───────┬───────┬─────┐│ │ ┌───────┬───────┬─────┐│ │ ┌───────┬───────┬─────┐│
│ │ │ id │ user │ amt ││ │ │ id │ user │ amt ││ │ │ id │ user │ amt ││
│ │ ├───────┼───────┼─────┤│ │ ├───────┼───────┼─────┤│ │ ├───────┼───────┼─────┤│
│ │ │ 1 │ 123 │ $99 ││ │ │ 1M │ 456 │ $49 ││ │ │ 2M │ 789 │ $19 ││
│ │ │ ... │ ... │ ... ││ │ │ ... │ ... │ ... ││ │ │ ... │ ... │ ... ││
│ │ └───────┴───────┴─────┘│ │ └───────┴───────┴─────┘│ │ └───────┴───────┴─────┘│
│ └─────────────────────────┘ └─────────────────────────┘ └─────────────────────────┘
│
│ BENEFITS:
│ • Query only the partition you need (WHERE created_at BETWEEN '2025-01-01' AND '2025-12-31')
│ • Drop old partitions instantly (DROP TABLE orders_2020)
│ • ALTER TABLE affects only one partition (seconds, not hours)
│ • Backups can be partitioned
│
👥 Sharding by User ID (Multi-Tenant)
│
│ SHARDING BY USER_ID (Horizontal Scaling Across Servers)
│ ═══════════════════════════════════════════════════════════════════
│
│ Application Layer (Laravel)
│ ┌─────────────────────────────────────────────────────────────────────────────────────┐
│ │ $shard = $user_id % 4; │
│ │ DB::connection("shard_$shard")->table('user_data')->where('user_id', $user_id); │
│ └─────────────────────────────────────────────────────────────────────────────────────┘
│ │ │ │
│ ▼ ▼ ▼ ▼
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ │ Shard 1 │ │ Shard 2 │ │ Shard 3 │ │ Shard 4 │
│ │ (users 0- │ │ (users │ │ (users │ │ (users │
│ │ 2.5M) │ │ 2.5M-5M) │ │ 5M-7.5M) │ │ 7.5M-10M) │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
│
│ BENEFITS:
│ • Each shard is 1/4 the size (2.5M rows instead of 10M)
│ • Queries only hit one shard (4x faster)
│ • Can add more shards as users grow (rebalancing is hard)
│ • Each shard can be on a different server (horizontal scaling)
│
│ TRADE-OFFS:
│ • Cross-shard queries are impossible (e.g., COUNT(*) across all shards)
│ • Need a routing layer (application logic)
│ • Rebalancing when adding shards is complex
│
SHARDING IN LARAVEL (Dynamic Connection)
// config/database.php
'connections' => [
'shard_1' => ['host' => 'db1.example.com', ...],
'shard_2' => ['host' => 'db2.example.com', ...],
'shard_3' => ['host' => 'db3.example.com', ...],
'shard_4' => ['host' => 'db4.example.com', ...],
],
// AppServiceProvider or custom class
class ShardManager
{
public static function getConnection($userId)
{
$shard = $userId % 4 + 1; // 1-4
return DB::connection("shard_$shard");
}
public static function getTable($userId)
{
$shard = $userId % 4 + 1;
return DB::connection("shard_$shard")->table('user_data');
}
}
// Usage in controller
$userData = ShardManager::getTable($userId)
->where('user_id', $userId)
->first();
📅 Date-Based Partitioning (MySQL Native)
MYSQL'S BUILT-IN PARTITIONING (EASIER)
MySQL supports partitioning natively. You don't need multiple tables or connections. One logical table, multiple physical partitions.
-- Create partitioned table by year
CREATE TABLE orders (
id INT NOT NULL,
user_id INT NOT NULL,
amount DECIMAL(10,2),
status VARCHAR(20),
created_at DATE NOT NULL
)
PARTITION BY RANGE (YEAR(created_at)) (
PARTITION p2022 VALUES LESS THAN (2023),
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p2025 VALUES LESS THAN (2026),
PARTITION p2026 VALUES LESS THAN (2027),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Query automatically uses only relevant partition
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
-- MySQL only scans p2024 partition!
-- Drop old data instantly (no DELETE)
ALTER TABLE orders DROP PARTITION p2022; -- Seconds, not hours!
-- Add new partition for next year
ALTER TABLE orders ADD PARTITION (PARTITION p2027 VALUES LESS THAN (2028));
MYSQL PARTITIONING: PROS & CONS
| Pros | Cons |
| Transparent to application (no code changes) | All partitions on same server (no horizontal scaling) |
| Instant partition drop (DROP PARTITION) | Partition key must be in every UNIQUE index |
| Automatic partition pruning | Some limitations on foreign keys |
| Easier than manual sharding | Not supported by all storage engines (InnoDB works) |
📊 Performance Gains from Partitioning
| Scenario |
Without Partitioning |
With Partitioning |
Improvement |
| Query by date (1 year of 10M rows) |
Scans 10M rows |
Scans 1M rows |
10x faster |
| DELETE old data (3 years old) |
Locks table for hours (DELETE millions) |
DROP PARTITION in seconds |
1,000x faster |
| ALTER TABLE ADD COLUMN on 100M rows |
1-2 hours (rebuilds whole table) |
Minutes (affects only active partitions) |
10-20x faster |
| Backup size |
500GB (all data) |
50GB per partition (can backup separately) |
10x smaller per backup |
| Query by user_id (sharded across 4 servers) |
Scans 40M rows on 1 server |
Scans 10M rows on 1 of 4 servers |
4x faster + 4x capacity |
🎯 Choosing the Right Partitioning Strategy
| Situation |
Solution |
Complexity |
When to Apply |
| Wide table (large TEXT/BLOB columns) |
Vertical Partitioning |
Low |
As soon as you notice large columns |
| Time-series data (logs, events, orders) |
MySQL Native Range Partitioning (by date) |
Low |
When table > 10M rows |
| Multi-tenant SaaS (data per customer) |
Sharding by tenant_id (multiple databases) |
Medium |
When single database can't fit all tenants |
| User data for millions of users |
Sharding by user_id (multiple servers) |
High |
When table > 50M rows or > 500GB |
| Need to keep old data but query rarely |
Archive table + Partitioning |
Medium |
When you have compliance requirements |
🔴 THE RULE: Start simple. Use indexes. Add read replicas. Then vertical partitioning. Then MySQL range partitioning. Only when all else fails, implement application-level sharding.
Premature partitioning adds complexity. Measure first, then partition.
⚠️ The Downsides You Must Know
MYSQL PARTITIONING LIMITATIONS
- Max 8192 partitions per table (but you'll never need that many)
- Partition key must be in every UNIQUE index (including PRIMARY KEY)
- Foreign keys not supported with partitioning (you have to manage manually)
- Full-text indexes not supported on partitioned tables
- ALTER TABLE ... REBUILD PARTITION still locks the partition
APPLICATION-LEVEL SHARDING DRAWBACKS
- Cross-shard queries are impossible (e.g., SELECT COUNT(*) across all users)
- Rebalancing is a nightmare (moving data between shards)
- JOINs across shards don't work (you need denormalization or application-side joins)
- Transactions across shards are complex (2-phase commit needed)
- Your application code becomes more complex (routing logic everywhere)
RECOMMENDATION
Use MySQL native partitioning first. It's transparent to your application and solves 80% of use cases. Only implement application-level sharding when you've exhausted all other options (indexes, replicas, vertical partitioning, MySQL partitioning).
📝 Topic 11 Summary: Database Partitioning
| Type |
What |
When to Use |
Complexity |
| Vertical |
Split wide tables by column groups |
Large TEXT/BLOB columns rarely accessed |
Low |
| MySQL Range |
Split by date or numeric range (native) |
Time-series data, logs, events |
Low |
| MySQL Hash |
Split by hash of a key (native) |
Even distribution across partitions |
Low |
| Application Sharding |
Split across multiple database servers |
Massive data (100M+ rows), need horizontal scaling |
High |
📌 THE RULE: Vertical partitioning first (separate large columns). Then MySQL range partitioning (by date). Then MySQL hash partitioning. Only then consider application-level sharding.
Partitioning is powerful but adds complexity. Don't do it until you need it. When you do need it, start with the simplest solution.
NEXT TOPIC PREVIEW
Topic 12: Statelessness Doctrine — Why your server must be killable. Move sessions to Redis. Move files to S3. Move config to env vars. Scale horizontally without fear.