
Partitioning, Bucketing, Sharding, Replication: Why Edge Computing Isn’t One of Them

Engineers mix these words all the time. Here’s the mental model, the trade-offs, and the exact moment you should stop partitioning and start sharding.

I’ve watched smart engineers burn 30 minutes in a design review arguing about “sharding” when they actually meant “read replicas”.

Same vibe as mixing bits and bytes: the words sound similar, they show up in the same conversations, and your brain decides they’re interchangeable.

They’re not.

This post gives you a clean mental model and practical examples so you can answer precisely and choose the right tool.

Before we go technical, here’s the real-life version.

The Real-Life Mental Model

Imagine you run a bubble tea business.

  • You have a big recipe book (your data).
  • You have a main kitchen (your database).
  • You also have multiple shops around the city (multiple database nodes).

Now map the buzzwords:

  • Partitioning: you split the recipe book into chapters (milk tea, fruit tea, toppings) so staff can find things faster.
  • Bucketing: within a chapter, you put recipes into numbered folders (1..32) so a runner can jump to the right folder instantly.
  • Sharding: you open more kitchens and each kitchen owns part of the menu or customers.
  • Replication: you photocopy the same recipe book to every shop so if one shop is on fire, the business still runs.
  • Edge computing: you prep common toppings at each shop so customers don’t wait for the main kitchen.

If you prefer a simpler memory trick, use verbs:

  • Partitioning: split (inside one DB)
  • Bucketing: hash (into fixed groups)
  • Sharding: split across machines (many DBs)
  • Replication: copy (same data, many nodes)
  • Edge: move compute closer (many locations)

The Cheat Sheet (one sentence each)

  • Partitioning: split one table into pieces inside one database so queries and maintenance get easier.
  • Bucketing: hash data into a fixed number of buckets (usually inside partitions) so reads/joins can prune work.
  • Sharding: split the dataset across multiple database instances so you can scale beyond one machine.
  • Replication: copy the same dataset to multiple nodes so you survive failures and scale reads.
  • Edge computing: move compute closer to users; it’s a deployment topology, not a data-splitting method.

If you remember only one thing, remember this:

Partitioning and bucketing organize data. Sharding distributes data. Replication duplicates data. Edge moves compute.

The core idea: two different layers

Most confusion comes from mixing “data layout” decisions with “where code runs” decisions.

Here’s the “diagram” I wish people drew on whiteboards:

DATA LAYOUT (how data is arranged)

	One DB instance:
		- Partitioning  (table -> partitions)
		- Bucketing     (partition/file -> buckets)

	Many DB instances:
		- Sharding      (dataset -> shards)
		- Replication   (dataset -> copies)


COMPUTE TOPOLOGY (where code runs)

	- Centralized (one region)
	- Regional    (a few regions)
	- Edge        (many POPs, close to users)

Now let’s go through each term like an engineer who has to operate it at 3 a.m.

The 30-Second Decision Tree

When someone says “we need sharding”, run this quick checklist:

Is the problem reads?
	Yes -> replication and caching, maybe edge.

Is the problem latency for users far away?
	Yes -> CDN and edge, plus regional deployments.

Is the problem one huge table and slow queries?
	Yes -> indexes, query fixes, then partitioning.

Is the problem that one database cannot handle writes/size even after tuning?
	Yes -> sharding.

This is not perfect, but it stops 80 percent of bad architecture decisions.

Partitioning: “same database, smaller chunks”

Partitioning is when a single logical table is split into smaller logical pieces inside the same database (same logical cluster; often the same instance/host in traditional RDBMS).

Why you do it:

  • You want partition pruning (only scan relevant chunks).
  • You want maintenance to be cheap (drop/archive old partitions fast).
  • You want to reduce index bloat by keeping each partition smaller.

Classic example: time-based events.

Real-life example: in a bubble tea shop, today’s orders are in a tray on the counter, last month’s receipts are in a box in the back. You still have one shop, you just organized the paperwork so you do not dig through everything.

-- Postgres-style range partitioning (illustrative)
CREATE TABLE events (
	event_id   bigserial,
	created_at timestamptz NOT NULL,
	user_id    bigint NOT NULL,
	payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_01 PARTITION OF events
	FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

When you query January, the planner can skip everything else.
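
The other win from the bullet list above is maintenance: each month is just a table you can create ahead of time and later drop or detach. A minimal sketch of generating that DDL in Python (illustrative only; in practice a scheduled job or a tool like pg_partman handles this):

from datetime import date

def monthly_partition_ddl(year: int, month: int) -> str:
	# Builds the CREATE TABLE statement for one monthly partition of `events`.
	start = date(year, month, 1)
	end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
	name = f"events_{start:%Y_%m}"
	return (
		f"CREATE TABLE {name} PARTITION OF events\n"
		f"    FOR VALUES FROM ('{start}') TO ('{end}');"
	)

print(monthly_partition_ddl(2026, 2))  # prints the DDL for events_2026_02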

What partitioning is not:

  • It’s not horizontal scaling. You still have one database box with one CPU/memory ceiling.

Bucketing: “predictable hashing for less work”

Bucketing is usually a data-warehouse / lakehouse trick: you hash one or more columns into a fixed number of buckets.

Why you do it:

  • Faster joins (same key hashes to the same bucket).
  • Better parallelism (each bucket can be processed independently).
  • More predictable planning (a fixed number of buckets to schedule/scan).

Example mental model:

partition: 2026-01-25
	bucket(user_id) -> 0..31

Query for user_id=123 only needs bucket hash(123)%32

If you’re mostly building OLTP apps (typical web backends), you might never use bucketing directly. You’ll still bump into it indirectly in Spark/Hive/Iceberg/BigQuery-style pipelines.

Real-life example: you keep 32 labeled folders for receipts, based on the last 5 bits of the receipt number. You do not open every folder, you open one.
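
If you do work in that world, bucketing is typically declared when you write the table. A minimal PySpark sketch (assumes a running Spark session and an existing events table; the exact options vary by engine and table format):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bucketing-demo").getOrCreate()
events = spark.table("events")  # assumed source table

(events.write
	.partitionBy("event_date")   # coarse split: one partition per day
	.bucketBy(32, "user_id")     # fixed 32 buckets by hash of user_id
	.sortBy("user_id")
	.mode("overwrite")
	.saveAsTable("events_bucketed"))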

Sharding: “now it’s multiple databases”

Sharding is the moment you accept reality:

One database machine can’t keep up anymore (data size, write throughput, connection limits, or I/O).

So you split the dataset across multiple database instances.

Real-life trigger: your single kitchen has a line out the door every night. You can optimize the menu (indexes), prep faster (caching), or hire more staff (bigger machine). But at some point you need a second kitchen (horizontal scale).

The part people underestimate: routing

Once you shard, every request needs to find the right shard.

That can be done by:

  • Application routing (your app chooses the shard).
  • A proxy/router layer.
  • A database feature (some systems offer it, with trade-offs).

Illustrative routing logic:

def shard_for(user_id: int, shard_count: int) -> int:
	# Use a stable function in real systems.
	# (Python's built-in hash() is intentionally randomized per process.)
	return user_id % shard_count

def get_user(user_id: int):
	shard_id = shard_for(user_id, shard_count=64)
	conn = shard_pool[shard_id]
	return conn.query("SELECT * FROM users WHERE id = %s", [user_id])

Picking a shard key (the real design problem)

Good shard keys:

  • Spread load evenly.
  • Match your access patterns.
  • Avoid “hot shards”.

Bad shard keys:

  • Put all writes into one shard (e.g., monotonically increasing IDs with range-based routing).
  • Make your most common queries cross-shard.
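
The first bad pattern is worth seeing concretely: range-based routing plus ever-increasing IDs means every new row lands on the newest shard. A toy illustration (the ranges are hypothetical, not a real router):

def range_shard(order_id: int) -> int:
	# Hypothetical static ranges: shard 0 holds IDs < 1M, shard 1 holds 1M..2M, etc.
	ranges = [1_000_000, 2_000_000, 3_000_000]
	for shard_id, upper in enumerate(ranges):
		if order_id < upper:
			return shard_id
	return len(ranges) - 1  # everything newer piles onto the last shard

# New orders have the highest IDs, so all current write traffic hits shard 2.
print({range_shard(order_id) for order_id in range(2_500_000, 2_500_100)})  # {2}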

The pain points (you pay these forever)

  • Cross-shard joins: usually become “fetch from many shards, merge in app”.
  • Transactions: become hard across shards (two-phase commit, sagas, or “don’t do that”).
  • Rebalancing: adding shards means moving data around, safely (see the sketch below).
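
To see why rebalancing hurts, take the naive modulo routing from earlier: going from 4 shards to 5 reassigns most keys, and every reassigned key is data you have to move while staying online. A quick back-of-the-envelope check (illustrative, not a benchmark; consistent hashing and directory-based routing exist largely to shrink this):

def shard_for(user_id: int, shard_count: int) -> int:
	return user_id % shard_count

keys = range(100_000)
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved / len(keys):.0%} of keys change shards")  # 80% here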

Sharding is powerful, but it’s not a “performance trick”. It’s a platform decision.

Real-life example: Kitchen A serves half the city, Kitchen B serves the other half. Each kitchen has fewer orders, but now you need a dispatcher that routes each order to the correct kitchen.

Replication: “copies of the same data”

Replication is about availability and read scaling.

The mental model is simple:

  • Sharding: each node has different rows.
  • Replication: each node has the same rows.

Most common topology:

Primary (writes)  ->  Replica A (reads)
                  ->  Replica B (reads)

Trade-offs you choose:

  • Synchronous replication: safer, but writes can get slower.
  • Asynchronous replication: faster writes, but replicas can lag (eventual consistency).

Operationally, replication gives you:

  • Failover paths.
  • Read-only traffic offloading.
  • Disaster recovery options.

But it does not reduce dataset size per node (every replica stores everything).
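
At the application layer this usually shows up as a simple read/write split. A minimal sketch (the pools and their query method are placeholders, and it glosses over replica lag):

import random

def run_write(primary, sql, params):
	# All writes go to the primary.
	return primary.query(sql, params)

def run_read(primary, replicas, sql, params, needs_fresh_data=False):
	# Reads that must see the latest write go to the primary;
	# everything else is spread across (possibly lagging) replicas.
	pool = primary if needs_fresh_data else random.choice(replicas)
	return pool.query(sql, params)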

Real-life example: every shop gets the same recipe book. Great for resilience and training, but printing and keeping them all updated takes effort.

Edge computing: “move code closer to users”

Edge computing is not a database layout technique.

It’s a compute deployment model where code runs at many locations close to users (CDN POPs, edge regions, on-device gateways).

The most common “edge” story in real systems is this:

  1. User request hits an edge cache.
  2. If cached: respond immediately.
  3. If not cached: fetch from origin (regional app + database), then cache.

Illustrative flow:

User -> Edge POP
          | cache hit? yes -> return
          | no
          v
        Origin API -> DB/Cache

Edge makes reads fast and cheap. But it doesn’t magically solve write scaling or strong consistency.
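
In code, that flow is plain cache-aside running at the POP. A minimal sketch (edge_cache and fetch_from_origin are hypothetical stand-ins for whatever your edge platform provides):

import time

def handle_request(path, edge_cache, fetch_from_origin, ttl_seconds=60):
	# edge_cache: a local key-value store at this POP (here just a dict).
	entry = edge_cache.get(path)
	if entry and entry["expires_at"] > time.time():
		return entry["body"]  # cache hit: answer at the edge

	body = fetch_from_origin(path)  # cache miss: go back to the origin
	edge_cache[path] = {"body": body, "expires_at": time.time() + ttl_seconds}
	return body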

Real-life example: you keep popular toppings at each shop (edge) so drinks are quick. But your central inventory system (origin) still decides how much you actually have.

Putting It All Together: A Simple Ecommerce Story

Let’s say you run an ecommerce site.

  • Users browse product pages (lots of reads).
  • Users place orders (writes).
  • Your orders table grows forever.
  • You sell globally.

Here is the “grown-up” architecture you usually end up with:

  1. Edge for product pages

    • Product pages and images are cached close to users.
    • Most traffic never hits your database.
  2. Replication for read-heavy database queries

    • Your primary handles writes.
    • Replicas serve reporting dashboards and heavy read endpoints.
  3. Partitioning for the orders table

    • Partition orders by month so “last 7 days” queries are fast.
    • Archiving older partitions becomes a simple operation.
  4. Sharding only when you truly outgrow one primary

    • If writes become too high, you shard by customer_id (or another key that matches your access patterns).
    • Now every request must route to the right shard.
    • Cross-customer analytics becomes harder, so you often move analytics to a separate system.

This story is why “edge vs sharding” is a category mistake. They solve different pains.

So… what should you do first?

If you’re building a normal product, the “grown-up” progression is often:

  1. Schema + indexes + query fixes (most “scaling problems” are just bad queries)
  2. Partitioning (when tables get huge and time slicing makes sense)
  3. Replication (when reads dominate)
  4. Caching / CDN / edge (when latency and bandwidth matter)
  5. Sharding (when one machine truly can’t handle it)

Not every system reaches step 5. Many shouldn’t.

Common mistakes I keep seeing

  1. Calling read replicas “shards”

    • If all nodes have the same data, it’s replication.
  2. Choosing a shard key that matches the org chart, not the traffic

    • “Shard by country” sounds neat until one country becomes 80% of your users.
  3. Sharding to fix one slow query

    • If one query is slow on one node, it will probably be slow on every shard too.

  3.5. Using a bad shard function

    • The routing function must be stable and consistent across restarts and deployments.
    • “It worked on my laptop” is not a sharding strategy.

  4. Ignoring rebalancing day

    • “We’ll just add shards later” is how you end up doing data migration during an incident.
  5. Assuming edge = consistent storage

    • Edge is great for caching, auth, personalization, and routing.
    • Your source of truth still lives somewhere (usually regional).

Final thoughts

When people say “we should shard”, they usually mean one of four things:

  • “Our queries are slow” (often an indexing/query problem)
  • “Our reads are too high” (replication/caching)
  • “Our latency is too high globally” (edge)
  • “One database can’t handle the writes or size” (actual sharding)

Different problem. Different tool.

If you want, tell me what kind of workload you’re targeting (OLTP vs analytics, write-heavy vs read-heavy, global vs single-region) and I’ll help you choose partitioning vs replicas vs sharding vs edge for that specific scenario.

