I’ve watched smart engineers burn 30 minutes in a design review arguing about “sharding” when they actually meant “read replicas”.
Same vibe as mixing bits and bytes: the words sound similar, they show up in the same conversations, and your brain decides they’re interchangeable.
They’re not.
This post gives you a clean mental model and practical examples so you can answer precisely and choose the right tool.
Before we go technical, here’s the real-life version.
The Real-Life Mental Model
Imagine you run a bubble tea business.
- You have a big recipe book (your data).
- You have a main kitchen (your database).
- You also have multiple shops around the city (multiple database nodes).
Now map the buzzwords:
- Partitioning: you split the recipe book into chapters (milk tea, fruit tea, toppings) so staff can find things faster.
- Bucketing: within a chapter, you put recipes into numbered folders (1..32) so a runner can jump to the right folder instantly.
- Sharding: you open more kitchens and each kitchen owns part of the menu or customers.
- Replication: you photocopy the same recipe book to every shop so if one shop is on fire, the business still runs.
- Edge computing: you prep common toppings at each shop so customers don’t wait for the main kitchen.
If you prefer a simpler memory trick, use verbs:
- Partitioning: split (inside one DB)
- Bucketing: hash (into fixed groups)
- Sharding: split across machines (many DBs)
- Replication: copy (same data, many nodes)
- Edge: move compute closer (many locations)
The Cheat Sheet (one sentence each)
- Partitioning: split one table into pieces inside one database so queries and maintenance get easier.
- Bucketing: hash data into a fixed number of buckets (usually inside partitions) so reads/joins can prune work.
- Sharding: split the dataset across multiple database instances so you can scale beyond one machine.
- Replication: copy the same dataset to multiple nodes so you survive failures and scale reads.
- Edge computing: move compute closer to users; it’s a deployment topology, not a data-splitting method.
If you remember only one thing, remember this:
Partitioning and bucketing organize data. Sharding distributes data. Replication duplicates data. Edge moves compute.
The core idea: two different layers
Most confusion comes from mixing “how data is laid out” decisions with “where code runs” decisions.
Here’s the split I wish people drew on whiteboards:
- Data layer (how rows are organized and placed): partitioning, bucketing, sharding, replication.
- Compute layer (where code runs): edge computing.
Now let’s go through each term like an engineer who has to operate it at 3 a.m.
The 30-Second Decision Tree
When someone says “we need sharding”, run this quick checklist:
- Are queries slow even on one machine? Fix indexes and queries first.
- Are reads overwhelming the primary? Add replicas or caching.
- Is latency bad for faraway users? Push caching and compute to the edge.
- Is one machine genuinely out of write throughput or storage? Only now talk about sharding.
This is not perfect, but it stops 80 percent of bad architecture decisions.
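The checklist above can be sketched as a tiny lookup table (the symptom names are illustrative, not a real taxonomy):

```python
def first_tool_to_reach_for(symptom: str) -> str:
    # Hypothetical mapping from complaint to the usual first fix.
    table = {
        "queries are slow": "indexes and query tuning",
        "reads overwhelm the primary": "replication or caching",
        "global latency is high": "edge / CDN",
        "one machine cannot absorb writes or data size": "sharding",
    }
    return table.get(symptom, "measure first, then decide")
```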
Partitioning: “same database, smaller chunks”
Partitioning is when a single logical table is split into smaller logical pieces inside the same database (same logical cluster; often the same instance/host in traditional RDBMS).
Why you do it:
- You want partition pruning (only scan relevant chunks).
- You want maintenance to be cheap (drop/archive old partitions fast).
- You want to reduce index bloat by keeping each partition smaller.
Classic example: time-based events.
Real-life example: in a bubble tea shop, today’s orders are in a tray on the counter, last month’s receipts are in a box in the back. You still have one shop, you just organized the paperwork so you do not dig through everything.
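To make the pruning idea concrete, here is a toy Python sketch (not real database internals): events live in per-month chunks, and a query touches only the chunk it needs.

```python
from collections import defaultdict
from datetime import date

# Toy month-partitioned "events" table: one list per "YYYY-MM" chunk.
# A real database does this with native partitioning DDL; this only
# illustrates why the planner can skip irrelevant chunks.
partitions = defaultdict(list)

def insert_event(event):
    partitions[event["created_at"].strftime("%Y-%m")].append(event)

def events_for_month(year, month):
    # Partition pruning: scan exactly one chunk, skip all the others.
    return partitions.get(f"{year:04d}-{month:02d}", [])

insert_event({"id": 1, "created_at": date(2024, 1, 15)})
insert_event({"id": 2, "created_at": date(2024, 2, 3)})
january = events_for_month(2024, 1)  # touches only the 2024-01 chunk
```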
When you query January, the planner can skip everything else.
What partitioning is not:
- It’s not horizontal scaling. You still have one database box with one CPU/memory ceiling.
Bucketing: “predictable hashing for less work”
Bucketing is usually a data-warehouse / lakehouse trick: you hash one or more columns into a fixed number of buckets.
Why you do it:
- Faster joins (same key hashes to the same bucket).
- Better parallelism (each bucket can be processed independently).
- More predictable planning (a fixed number of buckets to schedule/scan).
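A minimal sketch of the hashing step, assuming 32 buckets; `crc32` here stands in for whatever stable hash function the engine actually uses:

```python
import zlib

NUM_BUCKETS = 32  # fixed when the table is created

def bucket_for(key: str) -> int:
    # Must be a stable hash: the same key lands in the same bucket
    # across runs. (Python's built-in hash() is salted per process,
    # so avoid it for anything like this.)
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS
```

Two tables bucketed the same way on the same key can then be joined bucket-by-bucket, which is where the join speedup comes from.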
If you’re mostly building OLTP apps (typical web backends), you might never use bucketing directly. You’ll still bump into it indirectly in Spark/Hive/Iceberg/BigQuery-style pipelines.
Real-life example: you keep 32 labeled folders for receipts, based on the last 5 bits of the receipt number. You do not open every folder, you open one.
Sharding: “now it’s multiple databases”
Sharding is the moment you accept reality:
One database machine can’t keep up anymore (data size, write throughput, connection limits, or I/O).
So you split the dataset across multiple database instances.
Real-life trigger: your single kitchen has a line out the door every night. You can optimize the menu (indexes), prep faster (caching), or hire more staff (bigger machine). But at some point you need a second kitchen (horizontal scale).
The part people underestimate: routing
Once you shard, every request needs to find the right shard.
That can be done by:
- Application routing (your app chooses the shard).
- A proxy/router layer.
- A database feature (some systems offer it, with trade-offs).
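A toy sketch of application-side routing; the shard names are made up, and real code would hold a connection pool per shard:

```python
import hashlib

# Hypothetical shard list; in practice these would be connection strings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(customer_id: str) -> str:
    # Stable hash so the same customer always routes to the same shard,
    # across restarts and deployments.
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]
```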
Picking a shard key (the real design problem)
Good shard keys:
- Spread load evenly.
- Match your access patterns.
- Avoid “hot shards”.
Bad shard keys:
- Put all writes into one shard (e.g., monotonically increasing IDs with range-based routing).
- Make your most common queries cross-shard.
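To see why monotonically increasing IDs plus range routing create a hot shard, compare the two routing functions in this toy sketch (the shard counts and ranges are made up):

```python
import hashlib

def range_shard(order_id: int, num_shards: int = 4, per_shard: int = 1_000_000) -> int:
    # Range routing: ids 0..999_999 -> shard 0, the next million -> shard 1, ...
    return min(order_id // per_shard, num_shards - 1)

def hash_shard(order_id: int, num_shards: int = 4) -> int:
    digest = hashlib.sha256(str(order_id).encode()).hexdigest()
    return int(digest, 16) % num_shards

# New orders have ever-growing ids, so range routing sends every fresh
# write to the newest shard while the others sit idle.
recent_ids = range(2_000_000, 2_000_100)
range_targets = {range_shard(i) for i in recent_ids}  # one hot shard
hash_targets = {hash_shard(i) for i in recent_ids}    # spread across shards
```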
The pain points (you pay these forever)
- Cross-shard joins: usually become “fetch from many shards, merge in app”.
- Transactions: become hard across shards (two-phase commit, sagas, or “don’t do that”).
- Rebalancing: adding shards means moving data around, safely.
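For example, a “top orders” query across shards becomes scatter-gather: fetch partial results from each shard, then merge in the application. A toy sketch:

```python
# Each inner list stands in for one shard's partial result; real code
# would fan the query out over the network, ideally in parallel.
def top_orders(per_shard_results, limit=3):
    merged = [order for shard in per_shard_results for order in shard]
    merged.sort(key=lambda o: o["total"], reverse=True)
    return merged[:limit]
```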
Sharding is powerful, but it’s not a “performance trick”. It’s a platform decision.
Real-life example: Kitchen A serves half the city, Kitchen B serves the other half. Each kitchen has fewer orders, but now you need a dispatcher that routes each order to the correct kitchen.
Replication: “copies of the same data”
Replication is about availability and read scaling.
The mental model is simple:
- Sharding: each node has different rows.
- Replication: each node has the same rows.
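A toy sketch of the single-primary-plus-replicas shape (replication is applied synchronously here for simplicity; an async setup would let replicas lag behind the primary):

```python
import itertools

class ReplicatedStore:
    """Toy single-primary replication: writes hit the primary and are
    copied to every replica; reads round-robin across replicas."""

    def __init__(self, num_replicas: int = 2):
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]
        self._next_replica = itertools.cycle(self.replicas)

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:  # synchronous: applied before returning
            replica[key] = value

    def read(self, key):
        # Offload reads to replicas; the primary keeps taking writes.
        return next(self._next_replica).get(key)
```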
Trade-offs you choose:
- Synchronous replication: safer, but writes can get slower.
- Asynchronous replication: faster writes, but replicas can lag (eventual consistency).
Operationally, replication gives you:
- Failover paths.
- Read-only traffic offloading.
- Disaster recovery options.
But it does not reduce dataset size per node (every replica stores everything).
Real-life example: every shop gets the same recipe book. Great for resilience and training, but printing and keeping them all updated takes effort.
Edge computing: “move code closer to users”
Edge computing is not a database layout technique.
It’s a compute deployment model where code runs at many locations close to users (CDN POPs, edge regions, on-device gateways).
The most common “edge” story in real systems is this:
- User request hits an edge cache.
- If cached: respond immediately.
- If not cached: fetch from origin (regional app + database), then cache.
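That cache-aside flow, as a toy sketch (the origin function is a stand-in for your regional app plus database):

```python
def make_edge_handler(fetch_from_origin, cache):
    # One edge POP: serve from the local cache when possible, otherwise
    # go to origin and remember the answer for the next nearby user.
    def handle(path):
        if path in cache:
            return ("edge-hit", cache[path])
        body = fetch_from_origin(path)
        cache[path] = body
        return ("origin-miss", body)
    return handle

origin_calls = []
def origin(path):
    origin_calls.append(path)  # count round-trips to origin
    return f"page:{path}"

handle = make_edge_handler(origin, cache={})
first = handle("/products/42")   # goes to origin
second = handle("/products/42")  # served at the edge
```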
Edge makes reads fast and cheap. But it doesn’t magically solve write scaling or strong consistency.
Real-life example: you keep popular toppings at each shop (edge) so drinks are quick. But your central inventory system (origin) still decides how much you actually have.
Putting It All Together: A Simple Ecommerce Story
Let’s say you run an ecommerce site.
- Users browse product pages (lots of reads).
- Users place orders (writes).
- Your orders table grows forever.
- You sell globally.
Here is the “grown-up” architecture you usually end up with:
Edge for product pages
- Product pages and images are cached close to users.
- Most traffic never hits your database.
Replication for read-heavy database queries
- Your primary handles writes.
- Replicas serve reporting dashboards and heavy read endpoints.
Partitioning for the orders table
- Partition `orders` by month so “last 7 days” queries are fast.
- Archiving older partitions becomes a simple operation.
Sharding only when you truly outgrow one primary
- If writes become too high, you shard by `customer_id` (or another key that matches your access patterns).
- Now every request must route to the right shard.
- Cross-customer analytics becomes harder, so you often move analytics to a separate system.
This story is why “edge vs sharding” is a category mistake. They solve different pains.
So… what should you do first?
If you’re building a normal product, the “grown-up” progression is often:
- Schema + indexes + query fixes (most “scaling problems” are just bad queries)
- Partitioning (when tables get huge and time slicing makes sense)
- Replication (when reads dominate)
- Caching / CDN / edge (when latency and bandwidth matter)
- Sharding (when one machine truly can’t handle it)
Not every system reaches step 5. Many shouldn’t.
Common mistakes I keep seeing
Calling read replicas “shards”
- If all nodes have the same data, it’s replication.
Choosing a shard key that matches the org chart, not the traffic
- “Shard by country” sounds neat until one country becomes 80% of your users.
Sharding to fix one slow query
- If one query is slow on one node, it will probably be slow on every shard too.
Using a bad shard function
- The routing function must be stable and consistent across restarts and deployments.
- “It worked on my laptop” is not a sharding strategy.
Ignoring rebalancing day
- “We’ll just add shards later” is how you end up doing data migration during an incident.
Assuming edge = consistent storage
- Edge is great for caching, auth, personalization, and routing.
- Your source of truth still lives somewhere (usually regional).
Final thoughts
When people say “we should shard”, they usually mean one of four things:
- “Our queries are slow” (often an indexing/query problem)
- “Our reads are too high” (replication/caching)
- “Our latency is too high globally” (edge)
- “One database can’t handle the writes or size” (actual sharding)
Different problem. Different tool.
If you want, tell me what kind of workload you’re targeting (OLTP vs analytics, write-heavy vs read-heavy, global vs single-region) and I’ll help you choose partitioning vs replicas vs sharding vs edge for that specific scenario.
