Base85 Encoding: A More Compact Binary-to-Text Encoding

Introduction

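Base85 is a binary-to-text encoding scheme that uses 85 printable ASCII characters to represent binary data. It packs every 4 bytes of input into 5 characters, so the encoded text is about 25% larger than the original data, compared with about 33% for Base64. Base85 does not compress data; it only lets binary data pass through text-only channels with less overhead than Base64. Variants of the scheme are used in PostScript and PDF files, in Git binary patches, and for serializing binary data into text formats.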

Sample code snippet

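To encode data using Base85 in Python, you can use the base64.b85encode() function from the Python standard library (Python 3.4 and later), as shown below. This function uses the alphabet found in Git binary patches; the Adobe-style Ascii85 variant is available as base64.a85encode().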

import base64

data = b'This is a sample data to be encoded'
encoded_data = base64.b85encode(data)

print(encoded_data.decode('ascii'))

To decode the encoded data, you can use the base64.b85decode() function, like this:

decoded_data = base64.b85decode(encoded_data)

print(decoded_data.decode('ascii'))

Discussion

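The best-known form of Base85, Ascii85, was developed by Paul E. Rutter for the btoa utility and was later adopted by Adobe Systems, which uses it to encode binary data in PostScript and PDF files. It takes four bytes at a time, treats them as a 32-bit number, and writes that number as five base-85 digits. Eighty-five is the smallest base for which five digits are enough: 85^5 = 4,437,053,125 exceeds 2^32 = 4,294,967,296, while 84^5 = 4,182,119,424 does not. In Ascii85 each digit maps to one of the 85 printable ASCII characters from '!' to 'u'. Adobe's version also wraps the output in the delimiters <~ and ~>, and a group of four zero bytes may be abbreviated to the single character 'z'; neither the delimiters nor 'z' belongs to the 85-character alphabet.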
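To make the four-bytes-to-five-characters step concrete, here is a minimal sketch that encodes a single group by hand and compares it with the standard library. The helper name encode_group is made up for this illustration, and the sketch deliberately ignores the 'z' shortcut and partial final groups.

```python
import base64


def encode_group(chunk: bytes) -> str:
    """Encode exactly four bytes as five Ascii85 characters (illustrative helper)."""
    if len(chunk) != 4:
        raise ValueError("expected exactly 4 bytes")
    # Treat the four bytes as one 32-bit big-endian number.
    value = int.from_bytes(chunk, "big")
    digits = []
    for _ in range(5):
        value, digit = divmod(value, 85)  # peel off base-85 digits, least significant first
        digits.append(digit)
    # Emit the most significant digit first; digit 0 maps to '!' (ASCII 33).
    return "".join(chr(33 + d) for d in reversed(digits))


sample = b"Man "
by_hand = encode_group(sample)
from_stdlib = base64.a85encode(sample).decode("ascii")
print(by_hand, from_stdlib)  # both print 9jqo^
```

Python's base64.b85encode() performs the same arithmetic but maps the digits onto a different 85-character alphabet, so its output for the same input looks different.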

According to Brendan Eich, the creator of JavaScript, the shorter output of Base85 makes it preferable to Base64 in some settings but not in all of them. He tweeted, “Base85 is better than Base64 for source code and ASCII art, but not for over-the-wire data encoding.”

Suggestions on how to use it

Base85 encoding is useful for turning binary data into printable ASCII characters. It does not reduce the size of the data; its advantage is that it adds less overhead than Base64 when binary data has to travel through a text-only channel. Here are some suggestions on how to use Base85 encoding:

  • Compress large files with a real compression algorithm first, then Base85-encode the compressed bytes if the result must be sent as text.
  • Encode binary data to printable ASCII characters for safe transmission in text-only protocols.
  • Store serialized data structures as text where no binary field is available, at a smaller size cost than Base64.
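Because Base85 itself never makes data smaller, any size reduction has to come from a separate compression step. The following is a minimal sketch of that combination, assuming zlib as the compressor; the helper names pack and unpack are hypothetical and chosen only for this example.

```python
import base64
import zlib


def pack(raw: bytes) -> str:
    """Compress the bytes, then encode the result as Base85 text."""
    compressed = zlib.compress(raw)
    return base64.b85encode(compressed).decode("ascii")


def unpack(text: str) -> bytes:
    """Reverse pack(): decode the Base85 text, then decompress."""
    compressed = base64.b85decode(text.encode("ascii"))
    return zlib.decompress(compressed)


original = b"status=ok;" * 200  # highly repetitive, so it compresses well
text = pack(original)
restored = unpack(text)
print(len(original), len(text), restored == original)
```

The order matters: compressing after encoding would work on text that already carries the 25% overhead, and the compressed result would be binary again.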

Relevant tables

Here’s a comparison of the number of characters needed to encode the same data using Base64 and Base85 encoding. The figures correspond to a 16-byte input:

Encoding Scheme    Encoded Length
Base64             24 characters
Base85             20 characters
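The lengths of 24 and 20 characters are what a 16-byte input produces, which can be checked with the standard library. This is a small sketch; the variable names are arbitrary.

```python
import base64

payload = bytes(range(16))  # any 16-byte input yields the same lengths

b64_len = len(base64.b64encode(payload))  # 4 characters per 3 bytes, padded
b85_len = len(base64.b85encode(payload))  # 5 characters per 4 bytes

print(b64_len, b85_len)  # prints: 24 20
```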

Common misconceptions about its use

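One common misconception about Base85 encoding is that it is always more efficient than Base64 encoding. In its bare form, Base85 output is never longer than padded Base64 output for the same input, but the advantage is small or absent for very short inputs: 3 bytes take 4 characters in both schemes. When the Adobe delimiters <~ and ~> are included, they add 4 characters, so short inputs can end up longer than their Base64 equivalent. A related misconception is that Base85 compresses data; like Base64, it always makes the data larger than the original bytes.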
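How the schemes compare on short inputs is easy to measure. The sketch below records the encoded length for a few input sizes using Base64, Python's b85encode(), and Ascii85 with the Adobe <~ ~> framing; the sizes and the filler byte are arbitrary choices for the demonstration.

```python
import base64

rows = []
for size in (1, 3, 4, 16):
    data = b"x" * size  # non-zero filler, so the Ascii85 'z' shortcut never applies
    rows.append((
        size,
        len(base64.b64encode(data)),              # padded Base64
        len(base64.b85encode(data)),              # bare Base85
        len(base64.a85encode(data, adobe=True)),  # Ascii85 wrapped in <~ and ~>
    ))

for size, b64_len, b85_len, framed_len in rows:
    print(f"{size:>2} bytes: base64={b64_len} base85={b85_len} framed ascii85={framed_len}")
```

With the framing included, the Ascii85 output is longer than Base64 for the 1-, 3-, and 4-byte inputs and only draws level at 16 bytes.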

Frequently asked questions and answers

Q: What characters are used in Base85 encoding?

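A: It depends on the variant, but every variant uses 85 printable ASCII characters. Ascii85, the form used by Adobe, uses the characters from '!' to 'u'. Python's base64.b85encode() uses the digits 0-9, the letters A-Z and a-z, and 23 punctuation characters. The <~ and ~> delimiters seen in Adobe output mark the start and end of the data and are not part of the alphabet.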
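As an illustration, the two alphabets can be written out and checked against what the standard library emits. This is a sketch; the variable names are arbitrary, and the test data is chosen so that it contains no all-zero group, which Ascii85 would abbreviate to 'z'.

```python
import base64
import string

# Ascii85: the 85 consecutive characters from '!' (33) to 'u' (117).
ascii85_alphabet = "".join(chr(code) for code in range(33, 118))

# Alphabet used by base64.b85encode(): digits, letters, then 23 punctuation marks.
b85_alphabet = (
    string.digits
    + string.ascii_uppercase
    + string.ascii_lowercase
    + "!#$%&()*+-;<=>?@^_`{|}~"
)

data = bytes(range(1, 256))  # no group of four zero bytes
a85_text = base64.a85encode(data).decode("ascii")
b85_text = base64.b85encode(data).decode("ascii")

print(len(ascii85_alphabet), len(b85_alphabet))
print(set(a85_text) <= set(ascii85_alphabet), set(b85_text) <= set(b85_alphabet))
```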

Q: Is Base85 encoding better than Base64 encoding?

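A: It depends on what matters most. Base85 output is about 6% shorter than Base64 output for the same data (5 characters per 4 bytes versus 4 characters per 3 bytes), and the absolute saving grows with the size of the data. Base64 is better in terms of encoding speed, simplicity, and support across tools. Some Base85 alphabets also contain characters such as quotes, backslashes, and angle brackets, which need escaping in many text formats.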