Partitioning, Bucketing, Sharding, Replication, Edge: The Clean Mental Model

A precise mental model for partitioning, bucketing, sharding, replication, and edge computing - when to use each and why they exist at different layers.

Engineers mix up partitioning, bucketing, sharding, and replication constantly.

Edge computing is usually not confused with those terms, but it often gets incorrectly proposed as a substitute for them (e.g., “let’s use edge instead of sharding”).

These concepts live at different layers and solve different bottlenecks.

This post gives you a clean mental model and practical examples so you can choose the right tool.


The One Diagram to Remember

Everything in this post reduces to this hierarchy:

DATA ORGANIZATION (within one database instance)

    Partitioning
    Bucketing


DATA DISTRIBUTION (across multiple database instances)

    Sharding
    Replication


COMPUTE PLACEMENT (where code runs)

    Edge computing

Quick Definitions (one sentence each)

Partitioning: split a table into logical segments inside one database to reduce scan and maintenance cost.

Bucketing: hash rows into fixed groups inside one database to improve join and parallel scan efficiency.

Sharding: split data across multiple database instances to scale storage and write throughput.

Replication: copy the same data to multiple nodes to improve availability and read scalability.

Edge computing: run compute closer to users to reduce latency.


A More Visual Real-Life Model: Shipping Warehouses

Imagine an online store with packages.

  • A database instance is one warehouse.
  • A database cluster is multiple warehouses.
  • Users are customers around the world.

Now map the terms:

  • Partitioning: split packages into sections like “January”, “February”, “Returns”, so workers search less.
  • Bucketing: inside each section, assign packages into fixed lanes (Lane 0..31) by hashing the order id.
  • Sharding: open more warehouses; each warehouse owns a subset of orders.
  • Replication: keep copies of the same inventory data in multiple warehouses for failover and read load.
  • Edge computing: place small local depots near customers to serve common requests faster (mostly caching + routing).

If you prefer a verb memory trick:

  • Partitioning: split by value
  • Bucketing: split by hash
  • Sharding: split across warehouses
  • Replication: copy to more warehouses
  • Edge: move compute closer to customers

The 30-Second Decision Tree

When someone says “we need sharding”, run this checklist:

Is the problem read throughput or high CPU from concurrent requests?
    Yes -> replication and caching, maybe edge.

Is the problem latency for users far away?
    Yes -> CDN and edge.

Is the problem one huge table and slow queries?
    Yes -> indexes, query fixes, then partitioning.

Is the problem that one database cannot handle writes or size?
    Yes -> sharding.
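The checklist above can be encoded as a tiny function. This is just an illustration of the ordering (cheaper fixes are tried first); the flag names and recommendation strings are made up for the sketch:

```python
# Hypothetical helper encoding the checklist above.
# Order matters: the cheaper interventions come first.
def recommend(read_heavy, far_users, huge_table, write_limited):
    if read_heavy:
        return "replication + caching, maybe edge"
    if far_users:
        return "CDN + edge"
    if huge_table:
        return "indexes, query fixes, then partitioning"
    if write_limited:
        return "sharding"
    return "measure first"
```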

Layer 1: Partitioning (split by value)

Partitioning divides one logical table into smaller logical segments inside the same database.

Example structure:

events table

    events_2026_01
    events_2026_02
    events_2026_03

Example SQL:

CREATE TABLE events (
    id bigint,
    created_at timestamptz
) PARTITION BY RANGE (created_at);

Why partitioning exists:

  • Faster queries via partition pruning
  • Faster deletes and archival
  • Smaller indexes
  • Easier maintenance

What partitioning does NOT do:

  • Does not increase total storage capacity
  • Does not increase write throughput beyond one database
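A toy Python sketch of partition pruning under the monthly range scheme above. The dict-of-lists layout is only an illustration of the idea, not how a real engine stores partitions:

```python
from collections import defaultdict

# Toy model: one list per monthly partition, keyed by "YYYY-MM".
partitions = defaultdict(list)

def insert(event_id, created_at):            # created_at like "2026-01-15"
    partitions[created_at[:7]].append((event_id, created_at))

def query_month(month):
    # Partition pruning: only the matching partition is scanned,
    # never the whole events table.
    return partitions[month]

insert(1, "2026-01-15")
insert(2, "2026-02-03")
assert query_month("2026-01") == [(1, "2026-01-15")]
```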

Layer 2: Bucketing (hash distribution within one database)

Bucketing distributes rows into a fixed number of buckets using a hash function.

Example:

users table

    bucket 0
    bucket 1
    bucket 2
    bucket 3

Example logic:

import hashlib

def bucket_for(user_id, bucket_count):
    # md5 gives a stable hash across processes and runs, unlike
    # Python's built-in hash(), which is salted per interpreter.
    hash_val = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16)
    return hash_val % bucket_count

Why bucketing exists:

  • Faster joins
  • Better parallelism
  • Reduced scan work

The Magic of Joins & Parallelism

Bucketing shines when you have two large tables (e.g., orders and order_items) both bucketed by the same key (order_id) into the same number of buckets.

1. Co-located Joins

In a distributed system, joining two large tables often triggers a Shuffle Join. The engine has to move millions of rows across the network (like sending massive trucks between warehouses) so that matching IDs end up on the same machine. This network “traffic jam” is a silent killer of performance.

With Bucketing, you perform a Bucket-to-Bucket Join. Since the engine knows that orders in Bucket 0 can only match order_items in Bucket 0, and both are already sitting on the same disk, it avoids the shuffle entirely. No trucks, no traffic jams.

2. Independent Parallelism

Because buckets are independent, you can assign 32 workers to join 32 pairs of buckets simultaneously. No worker needs to talk to another.

Example pseudocode for a parallel join:

from concurrent.futures import ThreadPoolExecutor

# Each bucket pair is independent, so the engine can fan the
# joins out across worker threads with no coordination.
def parallel_join(bucket_count):
    with ThreadPoolExecutor(max_workers=bucket_count) as pool:
        for i in range(bucket_count):
            # This pair can be joined completely in isolation
            pool.submit(process_join, orders_bucket[i], items_bucket[i])

Important constraint:

All buckets still live inside the same database or storage system.

Bucketing improves performance, not capacity.
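A minimal sketch of a bucket-to-bucket join, assuming both tables were written with the same key (order_id) and the same bucket count. The data and helper names are made up for the illustration:

```python
def bucket_for(order_id, bucket_count):
    # Must be the same function used when writing BOTH tables.
    return order_id % bucket_count

BUCKETS = 4
orders      = [[] for _ in range(BUCKETS)]
order_items = [[] for _ in range(BUCKETS)]

for oid in [1, 2, 5]:
    orders[bucket_for(oid, BUCKETS)].append(oid)
for oid, item in [(1, "book"), (5, "pen")]:
    order_items[bucket_for(oid, BUCKETS)].append((oid, item))

# Join bucket i only against bucket i -- no cross-bucket shuffle.
joined = []
for i in range(BUCKETS):
    items_by_oid = {}
    for oid, item in order_items[i]:
        items_by_oid.setdefault(oid, []).append(item)
    for oid in orders[i]:
        joined.append((oid, items_by_oid.get(oid, [])))
```

Each loop iteration touches exactly one pair of buckets, which is what makes the "32 workers, 32 pairs" parallelism possible.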


Partitioning vs Bucketing

Partitioning = split by value

Example:

orders

    partition: 2026-01
    partition: 2026-02

Bucketing = split by hash

Example:

orders_2026-01

    bucket 0
    bucket 1
    bucket 2
    bucket 3

In practice: partitioning reduces scan scope; bucketing improves join and parallel processing efficiency.

They are often used together.
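When combined, each row lands in a (partition, bucket) slot, exactly like the orders_2026-01 diagram above. A sketch with an illustrative helper name:

```python
def slot_for(order_id, created_at, bucket_count=4):
    partition = created_at[:7]         # split by value: the month
    bucket = order_id % bucket_count   # split by hash: the lane
    return partition, bucket

# Order 42 placed in January lands in partition "2026-01", bucket 2.
assert slot_for(42, "2026-01-15") == ("2026-01", 2)
```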


Layer 3: Sharding (distribution across multiple databases)

Sharding distributes data across multiple database instances.

Example:

Shard 0 → DB instance A
Shard 1 → DB instance B
Shard 2 → DB instance C
Shard 3 → DB instance D

Example routing:

def shard_for(user_id, shard_count):
    # Simple modulo routing; changing shard_count remaps most
    # keys, which is why rebalancing is painful.
    return user_id % shard_count

Why sharding exists:

  • Increase total storage capacity
  • Increase write throughput
  • Increase connection capacity

Tradeoffs:

  • Cross-shard queries are complex
  • Transactions are harder
  • Rebalancing is difficult
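The rebalancing tradeoff is easy to demonstrate with the modulo routing above: adding a single shard moves most keys to a new home.

```python
def shard_for(user_id, shard_count):
    return user_id % shard_count

# How many of 10,000 users change shards when we grow from 4 to 5?
moved = sum(1 for uid in range(10_000)
            if shard_for(uid, 4) != shard_for(uid, 5))
# 8,000 of 10,000 users (80%) move -- which is why real systems
# use consistent hashing or pre-split ranges instead of raw modulo.
```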

Bucketing vs Sharding (critical distinction)

Both use hash distribution.

Difference is scope.

Bucketing:

One database instance

    bucket 0
    bucket 1
    bucket 2

Sharding:

Database instance A
Database instance B
Database instance C

Bucketing improves performance inside one database.

Sharding increases capacity across multiple databases.


Layer 4: Replication (copy data)

Replication creates identical copies of data across nodes.

Example:

Primary
    → Replica A
    → Replica B
    → Replica C

Why replication exists:

  • High availability
  • Failover
  • Read scaling

Replication does not increase write capacity.
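A common pattern built on replication is read/write splitting: all writes go to the primary, while reads fan out across replicas. A sketch, assuming hypothetical connection objects:

```python
import random

class Router:
    """Route writes to the primary and spread reads over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def for_write(self):
        # All writes still hit one node -- replication does not
        # increase write capacity.
        return self.primary

    def for_read(self):
        # Replicas may lag the primary, so route reads here only
        # when slightly stale data is acceptable.
        return random.choice(self.replicas)

router = Router("primary", ["replica_a", "replica_b"])
```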


Sharding vs Replication (the data distinction)

Both involve multiple servers, but they handle data differently.

Replication: every server has all the data (clones).

  • Goal: Scaling reads and high availability.
  • Limit: You are still limited by the storage capacity of one single machine.

Sharding: every server has a piece of the data (splits).

  • Goal: Scaling writes and total storage capacity.
  • Limit: Complex cross-shard queries and transactions.

Layer 5: Edge computing (move compute closer to users)

Edge computing moves compute to locations closer to users.

Typical flow:

User
  ↓
Edge location
  ↓
Origin API
  ↓
Origin database

Edge improves latency and user experience, typically via caching and request routing.

Edge does not change database capacity.
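Most edge wins come from caching at the edge location. A minimal sketch of the cache-then-origin flow shown above, where fetch_from_origin is a stand-in for the real round trip to the origin API:

```python
cache = {}

def fetch_from_origin(path):
    # Placeholder for the slow round trip to the origin API.
    return f"response for {path}"

def edge_handler(path):
    # Serve from the edge cache when possible; fall back to origin.
    # Note: no matter how many edge locations exist, cache misses
    # still land on the same origin database.
    if path not in cache:
        cache[path] = fetch_from_origin(path)
    return cache[path]
```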


How Real Systems Evolve

Typical progression:

  1. Indexes and query optimization
  2. Partitioning (when tables grow)
  3. Replication (when reads dominate)
  4. Caching and edge (when latency and bandwidth matter)
  5. Sharding (only when one primary cannot keep up)

Most systems never need step 5.


Final Mental Model

Organize data

    Partitioning
    Bucketing

Scale capacity

    Sharding

Increase availability

    Replication

Reduce latency

    Edge

Each term maps to a different bottleneck.

The fastest way to lose a design review is to use the right word for the wrong layer.
