
1 Byte = 8 Bits, So How Many Bytes Is Your Name?

If you have no idea how many bytes a song lyric or even a single letter takes, this post will unlock a core computing skill you're missing.

I once spoke with a colleague who could recite “1 byte = 8 bits” in his sleep, yet froze when asked, “How many bytes is your name?”

Let’s fix that.

The Parking Lot Model

Picture a parking lot with 8 slots. Each slot is either empty (0) or has a car (1). That whole lot is a byte.

Slot #  |    7 |   6 |   5 |   4 |  3 |  2 |  1 |  0
Price   | $128 | $64 | $32 | $16 | $8 | $4 | $2 | $1
  • 1 byte = 1 parking lot with 8 slots
  • 1 bit = 1 slot (empty = 0, car parked = 1)
  • Each slot has a price tag based on position: slot 0 costs $1, slot 1 costs $2, slot 7 costs $128
  • Total revenue = sum of occupied slot prices

Here's how a few parking configurations play out:


00000001 → only slot 0 has a car → revenue = 1

00000010 → only slot 1 has a car → revenue = 2

11111111 → all 8 slots are full → revenue = 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255

The number (0–255) just depends on which slots have cars.
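The revenue rule is only a few lines of code. Here's a small Python sketch (the helper name `lot_revenue` is my own):

```python
def lot_revenue(slots: str) -> int:
    """Sum the prices of occupied slots in an 8-bit string like '01011001'."""
    prices = [128, 64, 32, 16, 8, 4, 2, 1]  # slot 7 down to slot 0
    return sum(price for price, slot in zip(prices, slots) if slot == "1")

print(lot_revenue("00000001"))  # 1
print(lot_revenue("11111111"))  # 255
```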

ASCII

Why do computers group 8 bits (2³) into 1 byte? Why not 6 or 10?

Short answer: It’s the perfect size for text. 🐧

ASCII Character Set

To store text, computers need to map characters to numbers. With 8 bits, you get 256 slots (0–255). That’s enough space for every English letter (a–z, A–Z), every digit, and the common symbols, with room to spare.

Think of it as 256 unique parking configurations. Each one maps to a specific character.
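If you have Python handy, the built-ins `ord()` and `chr()` let you peek at this mapping directly:

```python
# ord(): character -> number, chr(): number -> character
print(ord("A"))  # 65
print(ord("z"))  # 122
print(chr(89))   # 'Y'
```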

But here’s the twist: Standard ASCII actually only needs 7 bits (128 slots).

So, why 8? Honestly, it was a mix of lucky hardware decisions and the need for a standardized “chunk” size that could handle more than just simple text.

This is a visualization of how chars are distributed in bytes:

ASCII = 1 Lot. UTF-8 (Emojis) = 2-4 Lots.

ASCII Encoding

So, how do we convert between characters and bytes like the visualization above?

Let’s encode “Y” (from YELL) as an example.

Step 1: Character set: Character → Number

Look it up in the ASCII table: Y = 89

Step 2: Encoding: Number → Binary

Here’s where most people get stuck. Use the Bit-test method. Memorize this sequence: 128, 64, 32, 16, 8, 4, 2, 1

Start with 89 and work left to right:

  • 89 ≥ 128? ❌ → 0
  • 89 ≥ 64? ✅ → 1 (89 − 64 = 25)
  • 25 ≥ 32? ❌ → 0
  • 25 ≥ 16? ✅ → 1 (25 − 16 = 9)
  • 9 ≥ 8? ✅ → 1 (9 − 8 = 1)
  • 1 ≥ 4? ❌ → 0
  • 1 ≥ 2? ❌ → 0
  • 1 ≥ 1? ✅ → 1
128 | 64 | 32 | 16 | 8 | 4 | 2 | 1
  0 |  1 |  0 |  1 | 1 | 0 | 0 | 1

Result: 01011001. That’s one byte.
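The bit-test steps translate almost word-for-word into code. A minimal Python sketch (`bit_test` is a name I made up):

```python
def bit_test(n: int) -> str:
    """Decimal -> 8-bit binary using the bit-test method: walk the
    prices left to right, park a car wherever you can afford it."""
    bits = ""
    for price in (128, 64, 32, 16, 8, 4, 2, 1):
        if n >= price:
            bits += "1"
            n -= price
        else:
            bits += "0"
    return bits

print(bit_test(ord("Y")))  # '01011001'
```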


There’s also the “Divide by 2” method for converting decimal to binary, but I find it tedious and rarely use it.
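In case you're curious anyway, here's a divide-by-2 sketch in Python: repeatedly divide by 2, collect the remainders, and read them bottom-up.

```python
def divide_by_two(n: int) -> str:
    """Decimal -> binary via repeated division by 2."""
    bits = ""
    while n > 0:
        n, remainder = divmod(n, 2)
        bits = str(remainder) + bits  # each remainder is the next-lower bit
    return bits.zfill(8)  # pad to a full byte

print(divide_by_two(89))  # '01011001'
```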

The mental shift:

  • Humans think in characters
  • Computers think in bytes
  • Encodings are the translation layer between the two
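In Python, that translation layer is literally the `encode`/`decode` pair:

```python
text = "Y"
data = text.encode("ascii")  # character -> bytes (the computer's view)
print(data)                  # b'Y'
print(list(data))            # [89]
print(data.decode("ascii"))  # bytes -> character (back to the human's view)
```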

Bigger Than ASCII

ASCII works for English. But what about 🦈, 你, ộ, or ắ?

256 values won’t cut it. We need something bigger.

Unicode Character Set

Q: Is Unicode the same type of thing as ASCII but bigger?

A: Sort of, but not exactly.

ASCII has 2 parts:

  • Character set: Convert character to decimal
  • Encoding: Convert decimal to binary

Unicode is just a Character set. It assigns a unique number (code point) to every character, but it doesn’t dictate how those numbers are stored as bits.

Char | Code Point | Decimal
🦈   | U+1F988    | 129,416
你   | U+4F60     | 20,320
ệ    | U+1EC7     | 7,879
ấ    | U+1EA5     | 7,845

Unicode 17.0 has 150,000+ code points.
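You can check these code points yourself: Python's `ord()` returns the Unicode code point of any character.

```python
for char in "🦈你":
    print(char, hex(ord(char)), ord(char))
# 🦈 0x1f988 129416
# 你 0x4f60 20320
```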

The encoding part? That’s where UTF-8 comes in.

UTF-8 Encoding

Code point → bytes. How?

ASCII was simple: one character = one byte.

But 🦈 is 129,416. That’s way beyond 255. One parking lot (one byte) can’t hold it.

So, how do we distribute the cars across multiple lots to represent bigger numbers?

We need more lots.

UTF-8: The Multi-Lot Chain System

Chaining lots creates a problem: where does one character end and the next begin?

UTF-8 solves this with byte templates. The first few bits tell you how many bytes to expect.

Bytes needed | Template
1            | 0xxxxxxx
2            | 110xxxxx 10xxxxxx
3            | 1110xxxx 10xxxxxx 10xxxxxx
4            | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

The prefixes (110, 1110, 11110) tell the computer: “Hey, I’m a multi-byte character, read X more bytes.”
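You can see those prefixes on real data. 你 encodes to three bytes, and their leading bits match the 3-byte template:

```python
# Print the actual bits of each UTF-8 byte; the prefix marks its role.
for byte in "你".encode("utf-8"):
    print(f"{byte:08b}")
# 11100100  <- 1110 prefix: lead byte of a 3-byte character
# 10111101  <- 10 prefix: continuation byte
# 10100000  <- 10 prefix: continuation byte
```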

Example: Encoding 🦈

  1. Find code point: 🦈 = U+1F988 = 129,416
  2. Check range: 129,416 falls in 65,536 – 1,114,111 → needs 4 bytes
  3. Convert to binary: represent the code point in binary (21 bits)
  4. Stuff into template: distribute those bits into the 4-byte template
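Those four steps can be hand-rolled with a little bit twiddling. A Python sketch, checked against the built-in encoder:

```python
# Hand-roll the 4-byte UTF-8 template for U+1F988 and verify against Python.
cp = 0x1F988                           # 129,416
assert cp.bit_length() <= 21           # fits in the template's 21 payload bits
b1 = 0b11110000 | (cp >> 18)           # 11110xxx: top 3 bits of the code point
b2 = 0b10000000 | ((cp >> 12) & 0x3F)  # 10xxxxxx: next 6 bits
b3 = 0b10000000 | ((cp >> 6) & 0x3F)   # 10xxxxxx: next 6 bits
b4 = 0b10000000 | (cp & 0x3F)          # 10xxxxxx: last 6 bits
print([b1, b2, b3, b4])                # [240, 159, 166, 136]
print(list("🦈".encode("utf-8")))      # [240, 159, 166, 136]
```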

Quick Reference

Code Point Range   | Size    | Example | Code Point | Decimal | UTF-8 bytes (dec)
0 – 127 (ASCII)    | 1 byte  | Y       | U+0059     | 89      | 89
128 – 2,047        | 2 bytes | é       | U+00E9     | 233     | 195, 169
2,048 – 65,535     | 3 bytes | 你      | U+4F60     | 20,320  | 228, 189, 160
65,536 – 1,114,111 | 4 bytes | 🦈      | U+1F988    | 129,416 | 240, 159, 166, 136

The takeaway: UTF-8’s genius isn’t just that it supports emojis. It stays compatible with ASCII and scales to the entire Unicode space.
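That backward compatibility is easy to demonstrate: any pure-ASCII string produces byte-for-byte identical output under both encodings.

```python
text = "YELL"
print(text.encode("ascii") == text.encode("utf-8"))  # True
print(list(text.encode("utf-8")))                    # [89, 69, 76, 76]
```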

Quantum computing (Bonus)

Let’s break the physics of our parking lot.

A classical bit is simple: The slot is either empty (0) or occupied (1).

A quantum bit (qubit) is weird: the car enters superposition.

  • Superposition: The car is in a ghostly state: both parked and empty at the same time.
  • Measurement: When you check the slot, the car is forced to “pick a side.” It instantly collapses into a normal 0 or 1.

It’s like Schrödinger’s Parking Spot. Spooky? Yes. But it’s what lets quantum computers explore an enormous number of possibilities at once.

Made with the laziness 🦥
by a busy guy
