Base64 / URL / HTML Encoder-Decoder
Encode and decode Base64, URLs, HTML entities, JWT tokens, and hex — all in your browser.
Understanding Encoding Formats
Base64 is a binary-to-text encoding scheme defined in RFC 4648 that represents arbitrary binary data using a 64-character alphabet: A–Z (26), a–z (26), 0–9 (10), and the two symbols + and / (2), giving 64 printable ASCII characters in total. The name 'Base64' directly refers to this alphabet size. Because every 3 input bytes map to exactly 4 output characters, Base64 increases data size by approximately 33%. This predictable overhead is the accepted cost for transmitting binary data safely through text-only channels — email (RFC 2045 MIME), XML attribute values, JSON strings, HTTP headers, and CSS data URIs. RFC 4648 also defines Base64url, a URL-safe variant that replaces + with - and / with _ to avoid conflicts with URL syntax.
URL encoding (percent-encoding, standardized in RFC 3986) is a mechanism for encoding arbitrary byte sequences in a URI by replacing each unsafe byte with a percent sign followed by two hexadecimal digits representing the byte value. For example, a space becomes %20, an ampersand becomes %26, and a German ü (UTF-8 bytes 0xC3 0xBC) becomes %C3%BC. RFC 3986 defines 'unreserved characters' (A–Z, a–z, 0–9, -, _, ., ~) that do not require percent-encoding. Every web developer encounters URL encoding in query string parameters, HTML form submissions, OAuth redirect URIs, and deep-link construction. Correctly encoding each component of a URL (scheme, host, path, query, fragment) requires understanding which characters are safe in each component.
HTML entity encoding converts characters with reserved meaning in HTML markup (primarily <, >, &, " and ') into named or numeric entity references (&lt;, &gt;, &amp;, &quot;, &#39;); purely numeric code point forms such as &#60; for < are equivalent. This prevents browsers from interpreting user-supplied content as HTML markup, making entity encoding one of the primary defenses against XSS (Cross-Site Scripting) vulnerabilities. The OWASP XSS Prevention Cheat Sheet recommends encoding all untrusted data before inserting it into HTML output. Our HTML encoder handles all five reserved characters and common special characters, producing output that is safe to insert into HTML documents.
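As a rough illustration of the five-character escaping described above, here is a minimal sketch. The names HTML_ENTITIES and escapeHtml are hypothetical and chosen for this example; it is not necessarily how our encoder is implemented, and it covers only HTML element and quoted-attribute contexts.

```javascript
// Minimal HTML-escaping sketch (hypothetical helper, for illustration only).
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text) {
  // One pass over the input, so the "&" inside a generated entity
  // is never escaped a second time.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml('<b title="x">Tom & Jerry</b>'));
// &lt;b title=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/b&gt;
```

Escaping in a single replace call avoids the classic double-encoding bug that appears when & is replaced after the other characters.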
A JWT (JSON Web Token, RFC 7519) is a compact, URL-safe format for representing claims between two parties. A JWT consists of three Base64url-encoded sections separated by dots: the header (specifying the algorithm, e.g., HS256 or RS256), the payload (containing the claims, e.g., sub, iat, exp, roles), and the signature. Because the header and payload are only Base64url-encoded, not encrypted, their contents are readable by anyone holding the token. The signature provides integrity: it proves the token was issued by the party holding the signing key (a shared secret for HMAC algorithms, a private key for RSA/ECDSA) and has not been tampered with. JWTs are widely used as tokens in OAuth 2.0, OpenID Connect, and API authorization in modern web applications.
Hexadecimal (base-16) encoding represents each byte of binary data as exactly two hexadecimal digits (0–9, a–f), producing output that is twice the size of the input. Hexadecimal is the standard representation for cryptographic outputs: SHA-256 hashes are displayed as 64 hex characters, MD5 as 32, TLS certificate fingerprints as hex strings. It is used in debugging (memory dumps, network packet inspection), color codes in CSS (#RRGGBB), UUID representation, and low-level data inspection. Our hex encoder converts arbitrary text (encoded as UTF-8) to its hexadecimal representation and back.
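The text-to-hex conversion described above can be sketched as follows. The function names textToHex and hexToText are hypothetical, and the sketch assumes an environment that provides TextEncoder and TextDecoder (modern browsers, Node.js).

```javascript
// Hex encode/decode sketch (hypothetical helpers, for illustration only).
function textToHex(text) {
  const bytes = new TextEncoder().encode(text); // text -> UTF-8 bytes
  // Each byte becomes exactly two lowercase hex digits.
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

function hexToText(hex) {
  if (hex.length % 2 !== 0 || /[^0-9a-f]/i.test(hex)) {
    throw new Error("Invalid hex string");
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return new TextDecoder().decode(bytes); // UTF-8 bytes -> text
}

console.log(textToHex("Hi")); // 4869
console.log(textToHex("ü"));  // c3bc
```

The padStart(2, "0") call matters: without it, bytes below 0x10 would produce a single digit and the output could no longer be split back into bytes.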
How to Use This Tool
- Select a tab — Choose from Base64, URL, HTML, JWT, or Hex encoding. Each tab is optimized for its specific encoding format, with relevant input placeholders and contextual tips.
- Choose direction — Toggle between Encode and Decode using the direction button (except for JWT, which is decode-only — the header and payload of a JWT are always Base64url-encoded, not encrypted).
- Enter your input — Paste or type your data into the input field. For Base64 and Hex, the tool handles both plain ASCII and full Unicode input (encoding it as UTF-8 bytes before applying the transformation). For URL encoding, paste the component you want to encode — not the entire URL.
- Process — Click the button or press Ctrl+Enter to encode/decode. The result appears immediately in the output field.
- Copy result — Click "Copy" to copy the output to your clipboard. The copy button confirms success with a 'Copied!' indication.
All processing is performed locally in your browser using native JavaScript APIs (btoa/atob, encodeURIComponent/decodeURIComponent, TextEncoder/TextDecoder). No data is sent to any server. This makes the tool safe for encoding sensitive credentials, private JWT payloads, and confidential document attachments.
How It Works
Base64 encoding works by grouping input bytes into blocks of 3. Each 3-byte block (24 bits) is split into four 6-bit groups. Each 6-bit value (0–63) maps to a character in the Base64 alphabet. If the input length is not divisible by 3, one or two padding = characters are appended to make the output length a multiple of 4. For Unicode text, our tool first encodes the string as UTF-8 bytes using the TextEncoder API, then applies Base64 — this is the correct approach for non-ASCII characters like German umlauts (ä, ö, ü) or any text outside ASCII range.
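A minimal sketch of the UTF-8-then-Base64 approach described above. The names encodeBase64 and decodeBase64 are hypothetical, and this is an illustration under the assumption that btoa, atob, TextEncoder, and TextDecoder are available; it is not necessarily our tool's exact code.

```javascript
// UTF-8-safe Base64 sketch (hypothetical helpers, for illustration only).
function encodeBase64(text) {
  const bytes = new TextEncoder().encode(text); // text -> UTF-8 bytes
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b); // one char per byte
  return btoa(binary); // btoa only accepts characters in the range 0-255
}

function decodeBase64(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes); // UTF-8 bytes -> text
}

console.log(encodeBase64("ü"));        // w7w=
console.log(decodeBase64("w7w="));     // ü
```

Calling btoa("你好") directly throws, because btoa rejects characters above U+00FF; going through UTF-8 bytes first is what makes arbitrary Unicode input work.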
URL encoding uses JavaScript's native encodeURIComponent() for encoding and decodeURIComponent() for decoding. encodeURIComponent() leaves A–Z, a–z, 0–9 and the characters - _ . ! ~ * ' ( ) unencoded and percent-encodes everything else, including characters with special meaning in URLs (/, ?, #, =, &). This is slightly more permissive than the RFC 3986 unreserved set (A–Z, a–z, 0–9, -, _, ., ~): the characters ! * ' ( ) are reserved in RFC 3986 but pass through unencoded. For multi-byte UTF-8 characters, each byte is percent-encoded individually; for example, ü (U+00FC) becomes %C3%BC in a URL.
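The following sketch shows encodeURIComponent() on a component value, plus a stricter variant for systems that expect only the RFC 3986 unreserved characters to remain unencoded. encodeURIComponent() leaves ! ' ( ) * untouched, so the wrapper (a hypothetical helper named encodeRfc3986) encodes those as well.

```javascript
// Component encoding with the built-in function.
console.log(encodeURIComponent("a b&c=d/e")); // a%20b%26c%3Dd%2Fe
console.log(encodeURIComponent("ü"));         // %C3%BC

// Stricter variant (hypothetical helper): also encodes ! ' ( ) *,
// which encodeURIComponent() leaves as-is.
function encodeRfc3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRfc3986("it's (ok)!*")); // it%27s%20%28ok%29%21%2A
```

decodeURIComponent() reverses both forms, so the stricter output stays compatible with standard decoders.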
JWT decoding splits the token at the two dot delimiters into header, payload, and signature segments. The header and payload segments are Base64url-decoded: the - and _ characters are replaced with + and / (reversing the URL-safe substitution), padding is added if missing, and the result is decoded with atob(). The decoded bytes are parsed as UTF-8 JSON and displayed as formatted, indented JSON objects. The signature segment is displayed as raw Base64url. It cannot be verified without the secret key (or, for RSA/ECDSA, the issuer's public key), and docutools.pro intentionally does not request or store keys.
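The decoding steps above can be sketched as follows. The names base64UrlToText and decodeJwt are hypothetical; this is an illustration of the described steps rather than our tool's exact code, and it performs no signature verification.

```javascript
// JWT decode sketch (hypothetical helpers). Decodes only; verifies nothing.
function base64UrlToText(segment) {
  // Reverse the URL-safe substitution, then restore any missing padding.
  let b64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  b64 += "=".repeat((4 - (b64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes); // bytes -> UTF-8 text
}

function decodeJwt(token) {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("Expected three dot-separated segments");
  }
  const [header, payload, signature] = parts;
  return {
    header: JSON.parse(base64UrlToText(header)),
    payload: JSON.parse(base64UrlToText(payload)),
    signature, // left as raw Base64url; it is not checked here
  };
}
```

To pretty-print a decoded part, JSON.stringify(decodeJwt(token).payload, null, 2) produces indented JSON.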
Use Cases
XRechnung and ZUGFeRD Embedded Attachments
The XRechnung standard allows embedding binary attachments (supporting PDFs, specifications, delivery notes) via the AdditionalDocumentReference element. The binary content must be encoded as standard Base64 (RFC 4648) for the EmbeddedDocumentBinaryObject element. Our encoder produces precisely this format. ZUGFeRD embedding works identically in the CII XML attachment. Use our tool to encode small PDFs or images before manually editing an invoice XML.
API Authentication Debugging
HTTP Basic Authentication encodes credentials as Base64(username:password) in the Authorization header. OAuth 2.0 client credentials are sometimes Base64-encoded in similar headers. JWTs carry claims in a decodable payload. Our tool covers all three: Basic Auth encoding, JWT payload inspection, and OAuth 2.0 token debugging. When an API returns a 401 Unauthorized, decoding the token your client is sending is the first debugging step.
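For debugging, the Basic Authentication header value can be built and taken apart with a few lines. This is a sketch under the assumption that credentials are UTF-8 encoded; the names basicAuthHeader and decodeBasicAuthHeader are hypothetical.

```javascript
// Basic Auth sketch (hypothetical helpers, for debugging only).
function basicAuthHeader(username, password) {
  const bytes = new TextEncoder().encode(`${username}:${password}`);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return "Basic " + btoa(binary);
}

function decodeBasicAuthHeader(header) {
  const b64 = header.replace(/^Basic\s+/i, "");
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  const text = new TextDecoder().decode(bytes);
  const idx = text.indexOf(":"); // split on the first colon only
  if (idx === -1) throw new Error("Missing ':' separator");
  return { username: text.slice(0, idx), password: text.slice(idx + 1) };
}

console.log(basicAuthHeader("Aladdin", "open sesame"));
// Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Splitting on the first colon only means a password containing colons survives the round trip; a username containing a colon cannot be represented in this scheme.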
Web Security and XSS Prevention
When building web applications that display user-supplied content, every string must be HTML-entity-encoded before insertion into the page. Our HTML encoder transforms the five dangerous characters (<, >, &, ", ') into their safe entity forms. Use this to check that your backend sanitization is producing the right output, or to manually encode a test string to verify your template engine is configured correctly.
Cryptographic Output Inspection
SHA-256 hashes, HMAC signatures, and TLS certificate fingerprints are commonly displayed as hexadecimal strings. (bcrypt hashes are an exception: they use their own Base64-like text format rather than hex.) Our hex encoder converts between raw text and hex, helping you verify hash outputs, inspect certificate thumbprints, or understand the byte-level representation of cryptographic keys and nonces.
Example: Base64-Encoded Attachment in XRechnung
Here is how a short text is Base64-encoded (RFC 4648) and embedded as an attachment in an XRechnung invoice XML. Any Base64 decoder worldwide — including our tool — returns exactly the same original text from this encoded string.
Input (plain text):
Invoice INV-2025-001 dated 2025-01-15
Base64 output (RFC 4648):
SW52b2ljZSBJTlYtMjAyNS0wMDEgZGF0ZWQgMjAyNS0wMS0xNQ==
In XRechnung UBL XML:
<cac:AdditionalDocumentReference>
<cbc:ID>Attachment-1</cbc:ID>
<cac:Attachment>
<cbc:EmbeddedDocumentBinaryObject mimeCode="text/plain"
filename="invoice.txt">
SW52b2ljZSBJTlYtMjAyNS0wMDEgZGF0ZWQgMjAyNS0wMS0xNQ==
</cbc:EmbeddedDocumentBinaryObject>
</cac:Attachment>
</cac:AdditionalDocumentReference>
The trailing == is padding: it indicates that the input length was not a multiple of 3 bytes and 2 padding characters were appended to make the output length a multiple of 4.
Tips & Limitations
Tips
- Use Ctrl+Enter as a keyboard shortcut for fast encoding or decoding without reaching for the mouse.
- When encoding Unicode text (German umlauts ä/ö/ü, accented characters, CJK scripts), our tool uses UTF-8 encoding — the universal standard for text APIs. This ensures your Base64 string decodes correctly on any platform.
- JWT decoding shows header and payload without verifying the signature. Never trust a JWT's claims in production code without server-side signature verification using the issuer's public key or shared secret.
- Base64url (used in JWTs and some OAuth flows) replaces + with - and / with _ for URL-safe tokens. Our JWT decoder handles Base64url automatically; the Base64 tab produces standard Base64 (with + and /).
Limitations
- Base64 is encoding, not encryption. Anyone holding a Base64 string can decode it instantly without a key. For confidentiality, encrypt the data with AES-256 or RSA first, then Base64-encode the ciphertext if needed for transport.
- JWT signature verification requires the secret key (for HMAC algorithms) or the issuer's public key (for RSA/ECDSA). Our tool displays the decoded payload but cannot verify or forge signatures — this is intentional.
- URL encoding encodes a single component value, not a full URL. If you paste an entire URL (https://example.com/path?key=value) into the URL encoder, the slashes, colons, and question marks will be percent-encoded, breaking the URL structure. Encode each query parameter value individually.
- Very large inputs (multi-MB files) may slow the browser since all processing is entirely client-side. For large binary file Base64 encoding, consider command-line tools (base64 on Linux/macOS, certutil -encode on Windows).
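The URL-encoding limitation above is easy to reproduce. The sketch below contrasts encoding a whole URL with encoding only a parameter value; withQueryParam is a hypothetical helper and app.example is a placeholder host.

```javascript
// Encoding the whole URL destroys its structure:
console.log(encodeURIComponent("https://example.com/path?key=value"));
// https%3A%2F%2Fexample.com%2Fpath%3Fkey%3Dvalue

// Encoding only the key and value keeps the URL intact
// (hypothetical helper, for illustration only).
function withQueryParam(baseUrl, key, value) {
  const separator = baseUrl.includes("?") ? "&" : "?";
  return (
    baseUrl + separator + encodeURIComponent(key) + "=" + encodeURIComponent(value)
  );
}

console.log(
  withQueryParam("https://example.com/path", "redirect", "https://app.example/cb?x=1")
);
// https://example.com/path?redirect=https%3A%2F%2Fapp.example%2Fcb%3Fx%3D1
```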
Frequently Asked Questions
Is Base64 encoding the same as encryption?
No. Treating Base64 as encryption is one of the most common and consequential misconceptions in software security. Base64 is an encoding scheme, not encryption. It has no key, no secret, and no computational difficulty. Anyone who receives a Base64 string can decode it instantly using any Base64 library or online tool. It is designed for safe transport of binary data over text channels, not for confidentiality. If you need to protect data, use proper authenticated encryption (AES-256-GCM, ChaCha20-Poly1305) before considering whether Base64 transport encoding is also needed.
Why is my Base64 output approximately 33% larger than the input?
Base64 maps every 3 input bytes to exactly 4 output characters, a ratio of 4/3 ≈ 1.333. This means a 1 MB binary file produces approximately 1.33 MB of Base64 text (slightly more if the output is wrapped into lines, as MIME does). The overhead is the unavoidable cost of representing 8-bit bytes using only 6 bits of information per character (log₂(64) = 6). This 33% size increase is why data URIs for images increase page load sizes and why Base64 is not suitable for data compression: it always expands the data.
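The size relationship can be written as a one-line calculation. The function name base64Length is hypothetical; it gives the length of padded, unwrapped Base64 output for a given number of input bytes.

```javascript
// Padded Base64 output length (hypothetical helper, for illustration only).
function base64Length(byteCount) {
  // Every started group of 3 input bytes produces 4 output characters.
  return 4 * Math.ceil(byteCount / 3);
}

console.log(base64Length(37));      // 52
console.log(base64Length(1000000)); // 1333336
```

The first call matches the 37-byte invoice example on this page, whose encoded form is 52 characters long.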
Can I decode a JWT without the secret key?
Yes — the header and payload of a JWT are only Base64url-encoded, not encrypted. Anyone holding the token string can decode and read the claims (sub, email, roles, exp, iat, etc.) without knowing any key. This is by design: JWTs are meant to be readable, not secret. The signature ensures the token was issued by the correct party and has not been tampered with. This means you should never put sensitive data (passwords, PII you want to conceal) in a JWT payload unless you additionally encrypt the token (a JWE, as defined in RFC 7516).
Is my data sent to a server when I use this tool?
No. All encoding and decoding operations run entirely in your browser using native JavaScript APIs (btoa/atob for Base64, encodeURIComponent/decodeURIComponent for URLs, TextEncoder/TextDecoder for Unicode). Your data never leaves your device. This is especially important for JWT tokens, which may contain sensitive user claims, and for passwords encoded in HTTP Basic Auth headers.
What is the difference between Base64 and Base64url?
Standard Base64 (RFC 4648 §4) uses + and / as the 62nd and 63rd characters, and pads output with = to a multiple of 4. Base64url (RFC 4648 §5) replaces + with - and / with _ so the encoded string can appear in URLs and filenames without percent-encoding. JWTs use Base64url and omit the padding. Many API contexts where Base64 appears in URLs also expect Base64url, whereas HTTP Basic Authentication headers use standard Base64. When in doubt about which variant to use, check whether the target system documentation mentions RFC 4648 §4 (standard) or §5 (URL-safe).
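Converting between the two variants is a character substitution plus padding handling. A minimal sketch, with hypothetical helper names toBase64Url and fromBase64Url, and assuming the unpadded form is wanted for Base64url (as JWTs use):

```javascript
// Base64 <-> Base64url sketch (hypothetical helpers, for illustration only).
function toBase64Url(b64) {
  // Swap the two URL-unsafe characters and drop trailing padding.
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  // Restore padding so the length is a multiple of 4 again.
  return b64 + "=".repeat((4 - (b64.length % 4)) % 4);
}

console.log(toBase64Url("a+b/cw=="));  // a-b_cw
console.log(fromBase64Url("a-b_cw")); // a+b/cw==
```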
How do I encode an image as a Base64 data URL for CSS or HTML?
Read the image file as binary bytes, Base64-encode the bytes using RFC 4648 standard encoding, then construct the data URL by prepending the MIME type prefix: data:[mimeType];base64,<encoded>. For a PNG: data:image/png;base64,iVBORw0KGgo.... This string can be used directly as the src of an <img> tag or as a url() value in CSS. Important: the Base64 form is about 33% larger than the original image file, and an embedded image cannot be cached separately by the browser. Use data URLs only for small images (icons, small logos) where the extra HTTP request for a separate file would cost more.
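The construction can be sketched as below, assuming the file's bytes are already available as a Uint8Array (for example from a file input in the browser). The name bytesToDataUrl is hypothetical, and the sample bytes are just the 8-byte PNG file signature, not a complete image.

```javascript
// Data URL sketch (hypothetical helper, for illustration only).
function bytesToDataUrl(bytes, mimeType) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b); // one char per byte
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// The PNG file signature; every PNG starts with these 8 bytes.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
console.log(bytesToDataUrl(pngSignature, "image/png"));
// data:image/png;base64,iVBORw0KGgo=
```

This is why Base64-encoded PNG files always begin with the characters iVBORw0KGgo.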
What does the = padding at the end of Base64 mean?
Base64 output must be a multiple of 4 characters. When the input length is not divisible by 3, padding is added: one = means the last group held only 2 input bytes; == means the last group held only 1 input byte. No padding characters means the input length was exactly divisible by 3. Some systems (notably JWT Base64url) omit the padding and expect decoders to infer it from the string length. If you see an 'incorrect padding' decoding error, try adding one or two = characters to the end of the string.
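The three padding cases can be observed directly with btoa(). The helper name paddingCount is hypothetical; it returns how many = characters padded Base64 output ends with for a given input length in bytes.

```javascript
// Padding sketch (hypothetical helper, for illustration only).
function paddingCount(byteLength) {
  return (3 - (byteLength % 3)) % 3;
}

for (const text of ["Man", "Ma", "M"]) {
  console.log(text, btoa(text), paddingCount(text.length));
}
// Man TWFu 0
// Ma TWE= 1
// M TQ== 2
```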
How do I use Base64 to embed a PDF attachment in XRechnung XML?
Read the PDF file as binary bytes and encode them using standard Base64 (RFC 4648 §4). In JavaScript, FileReader.readAsDataURL() returns the content already Base64-encoded, but as a data URL, so strip the data:application/pdf;base64, prefix; in Python, open the file in binary mode, call file.read(), and pass the bytes to base64.b64encode(). Place the encoded string in the cbc:EmbeddedDocumentBinaryObject element of AdditionalDocumentReference, with mimeCode="application/pdf" and filename attributes. Important: XRechnung validates that the mimeCode is one of the permitted types (application/pdf, image/png, image/jpeg, text/csv, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet). Attachments exceeding the recommended size (typically <5 MB) may be rejected by some portal implementations.
What character encoding should I use before Base64-encoding text?
Always encode text as UTF-8 bytes before applying Base64. UTF-8 is the universally interoperable encoding for text data in APIs, XML, JSON, and email. Using UTF-16 or Latin-1 produces valid Base64 but the recipient must know which encoding was used — and many systems assume UTF-8. Our tool uses the browser's TextEncoder API, which always produces UTF-8. This correctly handles German umlauts (ä, ö, ü, ß), French accents, Greek, Arabic, CJK characters, and any other Unicode text.
How do I decode a JWT to debug authentication issues?
Paste the full JWT (all three dot-separated segments) into the JWT tab of our tool and click Decode. The header shows the algorithm (alg: HS256, RS256, ES256, etc.) and token type. The payload shows the claims: sub (subject/user ID), iat (issued-at timestamp), exp (expiry timestamp — check this first if getting 401 errors), aud (audience), iss (issuer), and any custom claims added by your identity provider. The exp claim is a Unix timestamp — use our Timestamp Converter tool to translate it to a human-readable date to see if the token has expired.
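Checking the exp claim by hand amounts to multiplying the Unix timestamp by 1000 and comparing it with the current time. A small sketch, assuming the payload has already been decoded into an object; describeExpiry is a hypothetical helper name.

```javascript
// exp-claim check sketch (hypothetical helper, for debugging only).
function describeExpiry(payload, nowMs = Date.now()) {
  if (typeof payload.exp !== "number") {
    return { hasExp: false };
  }
  const expiresAtMs = payload.exp * 1000; // exp is in seconds, Date uses ms
  return {
    hasExp: true,
    expiresAt: new Date(expiresAtMs).toISOString(),
    expired: nowMs >= expiresAtMs,
  };
}

console.log(describeExpiry({ sub: "42", exp: 1516239022 }));
// { hasExp: true, expiresAt: '2018-01-18T01:30:22.000Z', expired: true }
```

This only helps with diagnosis. A server must still verify the signature before trusting exp or any other claim.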