Developer Tools · 9 min read · Published April 19, 2025 · Updated April 23, 2025

Base64 Encoding Explained: A Developer's Complete Reference

Base64 converts binary data to ASCII text so it can be safely transmitted in text-based protocols. Learn how it works, when to use it, and common use cases including JWT, data URLs, and email attachments.

Base64 is one of those encoding schemes that every developer encounters repeatedly but few fully understand. You see it in JWT tokens, data URLs for images, email attachments, API authentication headers, and even embedded inside XRechnung attachments. This guide explains what Base64 actually is, how the encoding algorithm works, when to use it, and how it relates to other encoding schemes.

What Problem Does Base64 Solve?

Many text-based protocols — email (SMTP), HTTP headers, JSON, XML, and HTML — were designed to handle sequences of printable ASCII characters. These protocols break when they encounter raw binary data (images, PDFs, audio files, encrypted bytes) because binary data contains byte values that are not valid ASCII characters, or are interpreted as control characters (null bytes, newlines, carriage returns) that corrupt the protocol.

Base64 solves this by encoding arbitrary binary data as a sequence of characters drawn from a fixed set of 64 safe, printable ASCII characters. The result is always a valid text string that can be embedded in these text-based protocols without corruption.

How Base64 Encoding Works

The standard Base64 alphabet consists of 64 characters: A-Z (26), a-z (26), 0-9 (10), + (1), / (1). Since 64 = 2⁶, each Base64 character encodes exactly 6 bits of binary data.

The encoding algorithm works in groups of 3 bytes (24 bits) at a time:

  1. Take 3 bytes (24 bits) of input data.
  2. Split into four 6-bit groups.
  3. Map each 6-bit value to its corresponding Base64 character from the alphabet table.
  4. If the input is not a multiple of 3 bytes, pad with = characters: one = if 2 bytes remain, two = if 1 byte remains.

Example: The word "Man" in ASCII is 77, 97, 110. In binary: 01001101 01100001 01101110. Split into 6-bit groups: 010011, 010110, 000101, 101110. Mapped to Base64: T, W, F, u. Result: "TWFu".
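
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production encoder; the function name encode_base64 is made up for this example, and in practice the standard library's base64.b64encode does the same job.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_base64(data: bytes) -> str:
    """Encode bytes with the standard Base64 alphabet, 3 bytes at a time."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Left-align the chunk in a 24-bit integer, zero-filling missing bytes.
        n = int.from_bytes(chunk, "big") << (8 * (3 - len(chunk)))
        # Split the 24 bits into four 6-bit values.
        sextets = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 meaningful characters; the rest is padding.
        keep = len(chunk) + 1
        out.append("".join(ALPHABET[s] for s in sextets[:keep]) + "=" * (4 - keep))
    return "".join(out)

print(encode_base64(b"Man"))  # TWFu
```

Comparing the output against base64.b64encode for a handful of inputs is a quick way to confirm the sketch matches the standard behavior.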

Base64 vs. URL-Safe Base64

Standard Base64 uses + and / characters, which have special meanings in URLs. URL-safe Base64 (also called Base64url) replaces + with - and / with _, making the output safe to use in URLs without percent-encoding. URL-safe Base64 is used in:

  • JWT (JSON Web Tokens): All three parts of a JWT are Base64url-encoded.
  • OAuth 2.0 PKCE code_challenge values.
  • Identifiers and binary values passed in URL paths and API query parameters.
  • WebAuthn / FIDO2 credential identifiers.
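
As a small illustration of the difference, the following Python snippet encodes the same two bytes with both alphabets. The input bytes are chosen only because they happen to produce both special characters.

```python
import base64

raw = bytes([0xFB, 0xFF])  # chosen so the output contains both special characters

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# JWT-style usage drops the trailing padding as well.
unpadded = url_safe.rstrip("=")
print(unpadded)  # -_8
```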

Decoding JWT Tokens

A JWT consists of three Base64url-encoded parts separated by dots: header.payload.signature. The header and payload are Base64url-encoded JSON objects; the signature is the Base64url-encoded output of a signing algorithm such as HMAC, RSA, or ECDSA, computed over the first two parts.

To inspect a JWT payload manually: take the second segment (between the first and second dot), replace - with + and _ with /, add padding (= characters) if necessary to make the length a multiple of 4, then Base64-decode. The result is the JSON payload containing claims like sub, exp, and iat.
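
The manual steps can be sketched in Python as follows. The function name decode_jwt_payload, the helper _b64url, and the token itself are made up for this example, and the signature is a placeholder. The sketch only inspects the payload and does not verify the signature, so it should not be used to trust a token.

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected header.payload.signature")
    payload = segments[1]
    # Restore the padding that JWTs strip, so the length is a multiple of 4.
    payload += "=" * (-len(payload) % 4)
    # urlsafe_b64decode maps '-' and '_' back to '+' and '/' before decoding.
    return json.loads(base64.urlsafe_b64decode(payload))

def _b64url(obj: dict) -> str:
    """Helper for building the sample token: compact JSON, Base64url, no padding."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

# Hypothetical token for illustration; the signature is a placeholder.
token = ".".join([
    _b64url({"alg": "HS256", "typ": "JWT"}),
    _b64url({"sub": "1234567890", "exp": 1735689600, "iat": 1735686000}),
    "placeholder-signature",
])
claims = decode_jwt_payload(token)
print(claims["sub"])  # 1234567890
```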

Important: Base64 encoding is NOT encryption. The payload of a JWT is publicly visible to anyone who captures the token. Never put sensitive data (passwords, credit card numbers) in a JWT payload without encrypting it separately.

Data URLs

Data URLs (RFC 2397) allow you to embed binary data directly in HTML, CSS, or anywhere a URL is expected. The format is: data:[mediatype];base64,[base64-encoded-data]. For example, a small PNG icon embedded directly in an HTML img tag would look like: <img src="data:image/png;base64,iVBORw0KGgo...">.
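
A data URL can be assembled with a short helper. The sketch below uses a made-up function name, to_data_url, and encodes only the 8-byte PNG file signature rather than a complete image, so the result is not a displayable picture.

```python
import base64

def to_data_url(data: bytes, media_type: str) -> str:
    """Build a Base64 data URL for the given bytes and media type."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

# The fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

url = to_data_url(PNG_SIGNATURE, "image/png")
print(url)  # data:image/png;base64,iVBORw0KGgo=
```

Because every PNG file starts with the same signature bytes, Base64-encoded PNG data starts with the same characters, iVBORw0KGgo.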

Data URLs eliminate the need for separate HTTP requests for small assets but increase the size of the HTML or CSS file. They are ideal for icons, fonts, and other small assets that would otherwise require a separate network round-trip.

Base64 in Email: MIME Encoding

MIME (Multipurpose Internet Mail Extensions) uses Base64 to encode email attachments. When you send a PDF invoice via email, the email client Base64-encodes the PDF bytes and embeds them in the email body as text. The receiving client decodes the Base64 back to binary and presents it as a downloadable attachment. This is why email files (.eml) are larger than the sum of their attachments: Base64 encoding adds approximately 33% overhead, plus a little more for the line breaks that MIME inserts.
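
The round trip can be observed with Python's standard email package. In this sketch the addresses, the file name, and the stand-in PDF bytes are all hypothetical; the point is only that binary content added as an attachment is Base64-encoded in the message and decodes back to the original bytes.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Stand-in bytes for a PDF; not a valid document.
pdf_bytes = b"%PDF-1.7 hypothetical invoice bytes \x00\xff" * 10

msg = EmailMessage()
msg["Subject"] = "Invoice"
msg["From"] = "billing@example.com"
msg["To"] = "customer@example.com"
msg.set_content("Please find the invoice attached.")
# Binary attachments are Base64-encoded by default.
msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                   filename="invoice.pdf")

wire = msg.as_bytes()  # the text form that would be sent

# Receiving side: parse the message and decode the attachment.
received = message_from_bytes(wire, policy=policy.default)
attachment = next(received.iter_attachments())
print(attachment["Content-Transfer-Encoding"])  # base64
restored = attachment.get_content()
```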

Base64 in XRechnung and ZUGFeRD

XRechnung supports embedded attachments via the AdditionalDocumentReference element. Binary attachments (PDFs, images, other documents) must be Base64-encoded when embedded in the XML. An attachment might look like: <cbc:EmbeddedDocumentBinaryObject mimeCode="application/pdf" filename="delivery-note.pdf">[base64 data]</cbc:EmbeddedDocumentBinaryObject>.
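
A sketch of producing such an element with Python's standard library is shown below. The function name embedded_document and the sample bytes are made up for this example, and the namespace URI is assumed to be the UBL 2.x CommonBasicComponents namespace; check it against the schema version you target. A real invoice would place this element inside the surrounding attachment structure, which is omitted here.

```python
import base64
import xml.etree.ElementTree as ET

# Assumed namespace for the cbc prefix (UBL 2.x CommonBasicComponents).
CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
ET.register_namespace("cbc", CBC)

def embedded_document(data: bytes, mime_code: str, filename: str) -> ET.Element:
    """Build an EmbeddedDocumentBinaryObject element holding Base64 text."""
    element = ET.Element(
        f"{{{CBC}}}EmbeddedDocumentBinaryObject",
        {"mimeCode": mime_code, "filename": filename},
    )
    element.text = base64.b64encode(data).decode("ascii")
    return element

element = embedded_document(b"hypothetical PDF bytes", "application/pdf",
                            "delivery-note.pdf")
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```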

Performance and Size Considerations

Base64 encoding increases data size by a factor of 4/3 (approximately 33% overhead), rounded up to a multiple of 4 characters when padded. A 1 MB PDF becomes roughly 1.33 MB when Base64-encoded, or about 1.37 MB in MIME format, where a line break is inserted every 76 characters. For small payloads, this is usually negligible. For large files transmitted over the network, consider whether Base64 embedding is appropriate or whether a separate binary file transfer (multipart HTTP, presigned S3 URL) is more efficient.
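
The encoded size can be computed up front. The helper below, with the made-up name base64_length, gives the length of padded Base64 output without line breaks; encoders that wrap lines MIME-style add slightly more.

```python
import base64

def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output (no line breaks) for n_bytes of input."""
    # Every started group of 3 input bytes becomes 4 output characters.
    return 4 * ((n_bytes + 2) // 3)

print(base64_length(1_000_000))  # 1333336
```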

Common Base64 Mistakes

  • Forgetting padding: Base64 strings must have a length that is a multiple of 4. Missing = padding causes decode errors in strict parsers.
  • Confusing Base64 with encryption: Base64 is not a security mechanism. It provides no confidentiality.
  • Using standard Base64 in URLs: Always use URL-safe Base64 (Base64url) for data embedded in URLs or URL query parameters.
  • Line breaks in Base64: Some Base64 encoders insert line breaks every 76 characters (MIME standard). Strip line breaks before decoding if your decoder does not handle them.
  • Wrong character encoding: Base64 operates on bytes, not text. Convert strings to bytes with an explicit encoding (usually UTF-8) before Base64-encoding them, and use the same encoding after decoding, especially for non-ASCII characters.
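
Several of these mistakes can be handled in one small helper. The following Python sketch, with the made-up function name decode_lenient, strips whitespace and restores padding before decoding standard Base64 strictly; it is one possible approach, not a universal rule for how decoders should behave.

```python
import base64

def decode_lenient(text: str) -> bytes:
    """Decode standard Base64 that may contain line breaks or lack padding."""
    compact = "".join(text.split())       # strip line breaks and other whitespace
    compact += "=" * (-len(compact) % 4)  # restore missing padding
    # validate=True rejects characters outside the standard alphabet.
    return base64.b64decode(compact, validate=True)

# Text must become bytes before encoding; UTF-8 is made explicit here.
encoded = base64.b64encode("Grüße".encode("utf-8")).decode("ascii")
print(decode_lenient(encoded).decode("utf-8"))  # Grüße
```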

Frequently Asked Questions

Is Base64 the same as encryption?

No. Base64 is an encoding scheme, not an encryption algorithm. Anyone can decode a Base64 string instantly without any key. It provides absolutely no security but does allow binary data to be safely transmitted in text-based systems.

Why does Base64 end with == sometimes?

The = padding characters indicate that the input was not a multiple of 3 bytes. One = means the final group contained only two bytes of input; two == means it contained only one byte. Some systems (like JWT) omit padding and require it to be inferred from the string length.
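
A quick way to see this is to encode inputs of one, two, and three bytes in Python:

```python
import base64

results = {
    text: base64.b64encode(text.encode("ascii")).decode("ascii")
    for text in ("M", "Ma", "Man")
}
for text, encoded in results.items():
    print(text, encoded)
# M TQ==
# Ma TWE=
# Man TWFu
```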

Can I use Base64 in CSS?

Yes. You can use data URLs with Base64-encoded images directly in CSS background-image properties. This is commonly done for small icons and patterns to reduce HTTP requests, though it increases CSS file size.
