What is Base64 Encoding? A Developer's Guide

Base64 encoding shows up everywhere in modern software — embedded images in CSS, email attachments, JWT tokens, and API authentication headers. Yet many developers treat it as a black box, copying snippets without understanding what actually happens under the hood. This guide demystifies Base64 completely.

What is Base64?

Base64 is a binary-to-text encoding scheme that represents arbitrary binary data using a set of 64 printable ASCII characters. The name comes directly from that alphabet size: 64 characters.

The standard Base64 alphabet consists of:

  • A–Z (26 uppercase letters)
  • a–z (26 lowercase letters)
  • 0–9 (10 digits)
  • + and / (2 special characters)
  • = (used as padding, not part of the core 64)

Every character in this alphabet is a safe, printable ASCII character. That is the entire point: Base64 lets you take arbitrary binary data — bytes that might be control characters, null bytes, or values that break text protocols — and represent it as plain text that can travel safely through any system designed for ASCII.

Base64 is defined in RFC 4648, which also covers related encodings like Base32 and Base16 (hex).

How Base64 Encoding Works

The core algorithm converts every 3 bytes of binary input into 4 Base64 characters. Here is the step-by-step process:

  1. Take 3 bytes of input (24 bits total).
  2. Split those 24 bits into four 6-bit groups.
  3. Map each 6-bit value (0–63) to the corresponding Base64 alphabet character.
  4. Output those 4 characters.

Because 2^6 = 64, each 6-bit group maps to exactly one character in the 64-character alphabet.
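
The four steps above can be sketched in a few lines of Python. This is only an illustration for a single 3-byte group; the names ALPHABET and encode_group are made up for this example, and real code should use the standard library's base64 module instead.

```python
import string

# Standard alphabet: values 0-25 = A-Z, 26-51 = a-z, 52-61 = 0-9, 62 = "+", 63 = "/"
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (hypothetical helper)."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Step 1: pack the 3 bytes into one 24-bit integer.
    n = int.from_bytes(three_bytes, "big")
    # Steps 2-4: take four 6-bit values, most significant first, and map each to a character.
    return "".join(ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))

print(encode_group(b"Man"))  # TWFu
```

The mask 0x3F keeps the low 6 bits after each shift, which is why every index falls in the range 0 to 63.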

The padding rule: If the input length is not a multiple of 3, the final group is padded:

  • 1 remaining byte → 2 Base64 characters + ==
  • 2 remaining bytes → 3 Base64 characters + =

This ensures the output length is always a multiple of 4 characters, which simplifies parsing.

Size increase: Every 3 bytes become 4 characters, so Base64 output is about 33% larger than the input (a 4:3 ratio). Padding rounds the output up to the next multiple of 4 characters, so very short inputs grow proportionally more.
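
The padding rule and the size increase can both be written as small formulas. The sketch below checks them against Python's base64 module; padding_count and encoded_length are hypothetical helper names introduced only for this example.

```python
import base64

def padding_count(n_bytes: int) -> int:
    """Number of '=' characters appended for an input of n_bytes."""
    return (3 - n_bytes % 3) % 3

def encoded_length(n_bytes: int) -> int:
    """Length of the padded output: 4 characters per 3-byte group, rounded up."""
    return 4 * ((n_bytes + 2) // 3)

for data in (b"M", b"Ma", b"Man", b"Many"):
    encoded = base64.b64encode(data).decode("ascii")
    # Cross-check both formulas against the standard library.
    assert len(encoded) == encoded_length(len(data))
    assert encoded.count("=") == padding_count(len(data))
    print(f"{len(data)} byte(s) -> {encoded!r}")

for n in (10, 1_000, 1_000_000):
    growth = encoded_length(n) / n - 1
    print(f"{n} bytes -> {encoded_length(n)} characters ({growth:.1%} larger)")
```

For large inputs the growth settles close to one third; for tiny inputs the rounding to a multiple of 4 dominates.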

Step-by-Step Encoding Example

Let’s encode the string "Man" (3 bytes: 0x4D, 0x61, 0x6E).

Input bytes:    0x4D       0x61       0x6E
Binary:       01001101   01100001   01101110
              \_____/\_____/\_____/\_____/
6-bit groups:  010011  010110  000101  101110
Decimal:         19      22       5      46
Base64 chars:    T       W       F       u

So "Man" encodes to "TWFu". You can verify this yourself with our Base64 tool.

Now let’s encode "Ma" (only 2 bytes, so padding is needed):

Input bytes:    0x4D       0x61
Binary:       01001101   01100001   00000000  (padded)
6-bit groups:  010011  010110  000100  (padding)
Decimal:         19      22       4
Base64 chars:    T       W       E       =

Result: "TWE=" — one = padding character because we had 2 input bytes.
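
To reproduce the tables above for other short inputs, the bit manipulation can be done on strings of "0" and "1". This is a teaching sketch, not an efficient encoder; six_bit_groups is a hypothetical helper name.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def six_bit_groups(data: bytes) -> list:
    """Return (bits, decimal value, Base64 character) for each 6-bit group of data."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits on the right so the string splits evenly into 6-bit groups.
    bits += "0" * (-len(bits) % 6)
    groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
    return [(group, int(group, 2), ALPHABET[int(group, 2)]) for group in groups]

for sample in (b"Man", b"Ma"):
    print(sample)
    for bits, value, char in six_bit_groups(sample):
        print(f"  {bits}  {value:>2}  {char}")
```

The helper returns only the groups that carry input bits, so "Ma" yields three rows; the fourth output character is the = padding.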

Common Use Cases

Data URIs

HTML and CSS can embed binary files directly as Base64-encoded strings, eliminating separate HTTP requests:

<!-- Inline PNG image as a data URI -->
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==" alt="1x1 pixel" />
/* Inline SVG icon */
.icon {
  background-image: url("data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==");
}

Email Attachments (MIME)

The MIME standard (RFC 2045) uses Base64 to encode binary attachments — PDFs, images, Word documents — within email messages. The attachment content is Base64-encoded and included as a text block with a Content-Transfer-Encoding: base64 header.
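
As a rough illustration, Python's standard email package applies this encoding when binary bytes are attached. The addresses, subject, and attachment bytes below are placeholders invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Quarterly report"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg.set_content("The report is attached.")

# Stand-in bytes; a real attachment would be read from a file opened in "rb" mode.
pdf_bytes = b"%PDF-1.4\n" + bytes(range(256))
msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf", filename="report.pdf")

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])  # base64
print(attachment.get_payload()[:40])            # start of the Base64 text block
```

Calling attachment.get_content() decodes the block back to the original bytes.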

HTTP Basic Authentication

HTTP Basic Auth encodes credentials as username:password in Base64 and sends them in the Authorization header:

Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=

Note: Base64 is not encryption. Anyone who intercepts this header can decode it immediately. Always use HTTPS.
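
A minimal sketch of both directions, assuming UTF-8 credentials (basic_auth_header is a hypothetical helper, not part of any library):

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

header = basic_auth_header("username", "password")
print(header)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=

# Anyone who sees the header can reverse it without a key.
scheme, _, token = header.partition(" ")
recovered = base64.b64decode(token).decode("utf-8")
print(recovered)  # username:password
```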

JSON and XML with Binary Data

JSON has no native binary type. When you need to embed binary data — a cryptographic signature, a thumbnail image, a serialized object — inside a JSON payload, Base64 is the standard approach:

{
  "filename": "report.pdf",
  "mimeType": "application/pdf",
  "content": "JVBERi0xLjQKJcOkw7zDtsOfCjIgMCBvYmoK..."
}
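
In Python, a round trip through JSON might look like the sketch below. The raw bytes are a short stand-in rather than a real PDF, and the field names simply mirror the payload above.

```python
import base64
import json

raw = bytes([0x25, 0x50, 0x44, 0x46, 0x00, 0xFF, 0x10])  # stand-in binary content

# Sender: Base64-encode the bytes, then store the result as an ordinary JSON string.
payload = json.dumps({
    "filename": "report.pdf",
    "mimeType": "application/pdf",
    "content": base64.b64encode(raw).decode("ascii"),
})

# Receiver: parse the JSON, then decode the string back to bytes.
received = json.loads(payload)
restored = base64.b64decode(received["content"])
print(restored == raw)  # True
```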

JWT Tokens

JSON Web Tokens consist of three Base64URL-encoded sections separated by dots: header, payload, and signature. The header and payload are Base64URL-encoded JSON; the signature is Base64URL-encoded binary data.

eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1c2VyMTIzIn0.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
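
The sample token above can be taken apart with the standard library alone. This sketch only decodes the sections and does not verify the signature, so it is not a substitute for a JWT library; b64url_decode is a hypothetical helper that restores the omitted padding.

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    """Decode unpadded Base64URL by restoring the '=' padding first."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)

token = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1c2VyMTIzIn0.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
header_b64, payload_b64, signature_b64 = token.split(".")

header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))
signature = b64url_decode(signature_b64)  # raw bytes, not JSON

print(header)          # {'alg': 'HS256'}
print(payload)         # {'sub': 'user123'}
print(len(signature))  # 32
```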

Base64 is NOT Encryption

This cannot be overstated: Base64 is an encoding, not encryption. It provides zero security. Any encoded string can be decoded in milliseconds by anyone with access to it — no key, no password required.

Common misconceptions:

  • “The data is encoded so it’s safe to transmit.” No. Encoding only changes the representation, and anyone can reverse it without a key. Encryption makes the data unreadable to anyone who lacks the key.
  • “It obscures the data.” Only to humans glancing quickly. Programmatically, it is fully transparent.

Use Base64 when you need to safely transport binary data through text-based channels. Use actual encryption (AES, RSA, ChaCha20) when you need data confidentiality.

Base64 Variants

Standard Base64 (RFC 4648 §4)

Uses + and / as the characters for values 62 and 63, with = padding. This is the most common variant and the one implemented by JavaScript's btoa() and atob().

URL-Safe Base64 (RFC 4648 §5)

Replaces + with - and / with _. This eliminates characters that have special meaning in URLs and file paths, making the encoded string safe to use as a URL parameter or filename without percent-encoding.

Standard:  "f+/z"  (+ and / have special meaning in URLs and may need percent-encoding)
URL-safe:  "f-_z"  (safe in any URL component)

JWTs use URL-safe Base64 (without padding) for exactly this reason.
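
A short Python check makes the relationship concrete. The two input bytes are chosen only because they happen to produce both special characters.

```python
import base64

data = b"\xfb\xff"

standard = base64.b64encode(data)
url_safe = base64.urlsafe_b64encode(data)
print(standard)  # b'+/8='
print(url_safe)  # b'-_8='

# The two variants differ only in the characters used for values 62 and 63.
assert standard.translate(bytes.maketrans(b"+/", b"-_")) == url_safe

# Unpadded form, as used in JWTs.
unpadded = url_safe.rstrip(b"=")
print(unpadded)  # b'-_8'
```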

MIME Base64

The same alphabet as standard Base64, but the output is broken into 76-character lines separated by \r\n. Used in email encoding.
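
Python's base64.encodebytes produces this line-wrapped form. One caveat for this sketch: it separates lines with a bare "\n", and the conversion to "\r\n" normally happens when the message is serialized for transport.

```python
import base64

data = bytes(range(256)) * 2  # 512 arbitrary bytes

wrapped = base64.encodebytes(data)
lines = wrapped.splitlines()

print(len(lines))                         # number of output lines
print(max(len(line) for line in lines))   # 76

# Removing the line breaks gives the same text as plain standard Base64.
assert b"".join(lines) == base64.b64encode(data)
assert base64.decodebytes(wrapped) == data
```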

Code Examples

JavaScript

// Encoding and decoding strings (browser and Node.js)
const encoded = btoa("Hello, World!");
console.log(encoded); // "SGVsbG8sIFdvcmxkIQ=="

const decoded = atob("SGVsbG8sIFdvcmxkIQ==");
console.log(decoded); // "Hello, World!"

// For binary data (Node.js Buffer)
const buffer = Buffer.from([0xff, 0xfe, 0xfd]);
const b64 = buffer.toString("base64");
console.log(b64); // "//79"

// URL-safe Base64 (Node.js)
const urlSafe = buffer.toString("base64url");
console.log(urlSafe); // "__79"

// Decode back to buffer
const original = Buffer.from(b64, "base64");

// Note: btoa/atob only handle Latin-1. For Unicode strings:
function encodeUnicode(str) {
  return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, (_, p1) =>
    String.fromCharCode(parseInt(p1, 16))
  ));
}

Python

import base64

# Encoding bytes
data = b"Hello, World!"
encoded = base64.b64encode(data)
print(encoded)           # b'SGVsbG8sIFdvcmxkIQ=='
print(encoded.decode())  # 'SGVsbG8sIFdvcmxkIQ==' (as string)

# Decoding
decoded = base64.b64decode(b"SGVsbG8sIFdvcmxkIQ==")
print(decoded)  # b'Hello, World!'

# URL-safe variant
url_encoded = base64.urlsafe_b64encode(b"\xff\xfe\xfd")
print(url_encoded)  # b'__79' (uses - and _ instead of + and /)

url_decoded = base64.urlsafe_b64decode(url_encoded)

# Encoding a file
with open("image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()
    print(f"data:image/png;base64,{image_b64}")

Performance Considerations

Base64 has three performance implications worth knowing:

Size overhead: The 33% size increase is fixed and unavoidable. A 1 MB image becomes ~1.33 MB when Base64-encoded. For data URIs in CSS, this can meaningfully increase page weight if overused. Inline only small assets (icons under ~4 KB); load larger images via regular URLs.

CPU cost: Encoding and decoding are lightweight operations — modern CPUs can process hundreds of MB/s. For typical use cases (tokens, small payloads, auth headers), CPU cost is negligible. For processing large files in bulk — video, large datasets — be mindful of the overhead and consider streaming approaches.
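
One possible streaming approach, sketched in Python, is to encode in chunks whose size is a multiple of 3 bytes, so that only the final chunk can produce padding and the pieces concatenate into valid Base64. The helper name b64encode_stream and the default chunk size are assumptions for this example. It also assumes src.read() returns full chunks until end of input, which holds for regular files and in-memory buffers but not necessarily for sockets or pipes.

```python
import base64
import io

def b64encode_stream(src, dst, chunk_size: int = 3 * 1024) -> None:
    """Base64-encode the binary stream src into dst without loading it all at once."""
    if chunk_size % 3:
        raise ValueError("chunk_size must be a multiple of 3")
    while chunk := src.read(chunk_size):
        dst.write(base64.b64encode(chunk))

# Usage with in-memory streams; open files in "rb" / "wb" mode work the same way.
data = bytes(range(256)) * 50
src, dst = io.BytesIO(data), io.BytesIO()
b64encode_stream(src, dst, chunk_size=300)
print(dst.getvalue() == base64.b64encode(data))  # True
```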

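Compression interaction: Base64 text uses only 64 distinct characters, so transport-level gzip or brotli typically recovers much of the 33% overhead. If you compress the data yourself, order matters: compress the raw bytes first, then Base64-encode the result. Encoding first tends to obscure byte-level repetition and makes the compressor less effective.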
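
The effect of transport-level compression on Base64 text can be observed with a small experiment. The sketch below uses zlib as a stand-in for gzip and seeded pseudo-random bytes as incompressible sample data; exact sizes will vary with the input and compressor settings.

```python
import base64
import random
import zlib

rng = random.Random(0)
raw = bytes(rng.getrandbits(8) for _ in range(30_000))  # incompressible stand-in data

encoded = base64.b64encode(raw)      # about one third larger than raw
compressed = zlib.compress(encoded)  # what a compressing transport would send

print(f"raw bytes:          {len(raw)}")
print(f"Base64 text:        {len(encoded)}")
print(f"Base64 + zlib:      {len(compressed)}")
```

On this kind of input the compressed Base64 text should land close to the raw size, because each Base64 character carries only 6 bits of information.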

Try encoding and decoding strings instantly with our Base64 tool.

FAQ

Why is Base64 output always a multiple of 4 characters?

Padding guarantees it. Input is processed in 3-byte groups that each produce 4 characters; when the final group is short, = characters fill the remaining positions. One = means the last group held 2 bytes; == means it held 1 byte. A decoder can therefore tell how many bytes the final group represents without needing a separate length field.

Can I use Base64 to store passwords?

No. Base64 is reversible encoding, not a one-way hash. Anyone with access to a Base64-encoded password can decode it instantly. Passwords must be stored using a proper password-hashing algorithm like bcrypt, scrypt, or Argon2.

What is the difference between Base64 and hex encoding?

Both encode binary data as ASCII text. Hex (Base16) uses 16 characters (0–9, a–f) and represents each byte as 2 characters, producing a 100% size increase. Base64 represents 3 bytes with 4 characters, producing a 33% size increase. Base64 is more compact; hex is more human-readable for inspecting raw bytes.
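
The size difference is easy to confirm in Python with an arbitrary 30-byte input:

```python
import base64

data = bytes(range(30))

hex_text = data.hex()                             # 2 characters per byte
b64_text = base64.b64encode(data).decode("ascii")  # 4 characters per 3 bytes

print(len(data), len(hex_text), len(b64_text))  # 30 60 40
```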

Why does Base64 sometimes have no padding?

URL-safe Base64 often omits the = padding because = is a reserved character in URLs and the padding is technically redundant: a decoder can infer the correct length from the string length modulo 4. Many JWTs and tokens omit padding for this reason. Many decoders accept both padded and unpadded input, but some (including Python's base64.b64decode) reject unpadded strings, so you may need to restore the padding before decoding.
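
For decoders that insist on padding, the missing = characters can be restored from the length alone. A minimal Python sketch, where add_padding is a hypothetical helper:

```python
import base64
import binascii

unpadded = "TWE"  # "Ma" encoded, with its trailing "=" removed

try:
    base64.b64decode(unpadded)
except binascii.Error as exc:
    print("rejected without padding:", exc)

def add_padding(text: str) -> str:
    """Append the '=' characters needed to reach a multiple of 4."""
    return text + "=" * (-len(text) % 4)

print(add_padding(unpadded))                    # TWE=
print(base64.b64decode(add_padding(unpadded)))  # b'Ma'
```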
