Better Programming

Advice for programmers.

Follow publication

Member-only story

Use Binary Encoding Instead of JSON

Shilpi Gupta
Better Programming
Published in
6 min readJun 16, 2020
Photo by Christopher Gower on Unsplash.

Why Should I Care?

In memory, the data is kept as data structures like objects, lists, arrays, etc. But when you want to send the data over a network or store it in a file, you need to encode the data as a self-contained sequence of bytes. The translation from the in-memory representation to a byte sequence is called encoding and the inverse is called decoding. With time, the schema for data that an application handles or stores may evolve, a new field can get added, or an old one can be removed. Therefore, the encoding used needs to support both backward (new code should be able to read the data written by the old code) and forward (old code should be able to read the data written by the new code) compatibility.

In this article, we will discuss different encoding formats, how binary encoding formats are better than JSON, XML, and how these support schema evolution.

Types of Encoding Formats

There are two types of encoding formats:

  1. Textual formats
  2. Binary formats

Textual Formats

Textual formats are somewhat human-readable. Some of the common formats are JSON, CSV, and XML. Textual formats are easy to use and understand but can result in different problems:

  1. Textual formats can contain a lot of ambiguity. For example, in XML and CSV, you cannot distinguish between strings and numbers. JSON can distinguish between string and numbers but cannot distinguish between integers and floating numbers and doesn’t specify a precision. This becomes a problem when dealing with large numbers. An example of numbers larger than 253 occurs on Twitter, which uses a 64-bit number to identify each tweet. The JSON returned by Twitter’s API includes tweet IDs twice — once as a JSON number and once as a decimal string — to work around the fact that the numbers are not correctly parsed by JavaScript applications.
  2. CSV doesn’t contain any schema, leaving it to the application to define the meaning of each row and column.
  3. Textual formats take more space than binary encoding. For example, as JSON and XML are…

Create an account to read the full story.

The author made this story available to Medium members only.
If you’re new to Medium, create a new account to read this story on us.

Or, continue in mobile web

Already have an account? Sign in

Shilpi Gupta
Shilpi Gupta

Written by Shilpi Gupta

https://www.onepercentbetter.dev/ | Software Developer| Previously @ Adobe| Expedia| Google WTM Scholar’17 | Computer Engineering’18 from DTU (aka DCE)

Responses (36)

Write a response