Understanding the Kubernetes YAML Syntax

Everything JSON can do and then some

Published in Better Programming, Aug 23, 2019

As stated on the Wikipedia page for JSON, YAML is a superset of JSON (as of YAML 1.2): every valid JSON document is also a valid YAML document, and YAML adds features of its own on top. The name originally stood for “Yet Another Markup Language” and is now the recursive acronym “YAML Ain’t Markup Language.”

In other words, YAML can express everything that JSON can, and then some.

To illustrate this, let’s take a YAML file from the Kubernetes documentation page called Understanding Kubernetes Objects and convert it into JSON using http://convertjson.com, an online utility that can convert YAML to JSON.

Here is the original deployment.yaml file:

apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

And here is this file converted to JSON:

{
   "apiVersion": "apps/v1",
   "kind": "Deployment",
   "metadata": {
      "name": "nginx-deployment"
   },
   "spec": {
      "selector": {
         "matchLabels": {
            "app": "nginx"
         }
      },
      "replicas": 2,
      "template": {
         "metadata": {
            "labels": {
               "app": "nginx"
            }
         },
         "spec": {
            "containers": [
               {
                  "name": "nginx",
                  "image": "nginx:1.7.9",
                  "ports": [
                     {
                        "containerPort": 80
                     }
                  ]
               }
            ]
         }
      }
   }
}

There are a few obvious things to note here.

1. The YAML copy takes less space than the JSON copy.

2. YAML requires fewer characters than JSON does.

3. YAML allows for comments, while JSON doesn’t.

It’s immediately obvious that YAML takes less space and uses fewer characters than JSON. Even taking into account the comments in the YAML example, the YAML code uses 414 characters while the JSON code uses 697 characters.
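If you want to measure this yourself, here is a minimal sketch using Python's standard json module. The deployment dictionary is simply the example above typed in by hand, and the variable names are my own. It doesn't try to reproduce the exact counts quoted above. It only shows that a good share of the JSON copy's size comes from indentation, since the same data can also be written compactly:

```python
import json

# The deployment from the example above, as a plain Python dictionary.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nginx-deployment"},
    "spec": {
        "selector": {"matchLabels": {"app": "nginx"}},
        "replicas": 2,
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {
                "containers": [
                    {
                        "name": "nginx",
                        "image": "nginx:1.7.9",
                        "ports": [{"containerPort": 80}],
                    }
                ]
            },
        },
    },
}

# Same data, two layouts: indented for humans, compact for machines.
pretty = json.dumps(deployment, indent=3)
compact = json.dumps(deployment, separators=(",", ":"))

print("pretty JSON characters: ", len(pretty))
print("compact JSON characters:", len(compact))
```

The compact form is harder to read, though, which is exactly the trade-off YAML avoids by using indentation as its structure.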

The ability to use comments in YAML is also very nice. Comments are indicated by the # character, as in this line:

apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2

Everything after the # is a comment.
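To see the other side of this, here is a small sketch (Python standard library only, variable names are mine) that feeds a JSON parser the same kind of trailing comment. A strict JSON parser such as Python's json module should refuse it:

```python
import json

# A JSON document with a YAML-style comment added to it.
json_with_comment = """{
   "apiVersion": "apps/v1" # for versions before 1.9.0 use apps/v1beta2
}"""

try:
    json.loads(json_with_comment)
    rejected = False
except json.JSONDecodeError as err:
    # The parser does not know what to do with the '#' character.
    rejected = True
    print("JSON parser rejected the comment:", err.msg)
```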

Although you might wonder why anyone would use JSON over YAML, JSON does have some advantages. JavaScript provides built-in tooling to parse and deal with JSON (after all, JSON stands for JavaScript Object Notation).

It isn’t just JavaScript, either: many other languages ship a JSON parser in their standard library, Python and Go among them, whereas YAML support often means installing a third-party package.
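As a quick illustration of that built-in tooling, here is a minimal sketch in Python using only the standard json module. The manifest text is a trimmed-down version of our deployment, and the variable names are my own:

```python
import json

# A trimmed-down version of the deployment, as JSON text.
manifest_text = """{
   "kind": "Deployment",
   "metadata": {"name": "nginx-deployment"},
   "spec": {"replicas": 2}
}"""

manifest = json.loads(manifest_text)   # JSON text -> Python dict
kind = manifest["kind"]

manifest["spec"]["replicas"] = 3       # change a value in place
updated_text = json.dumps(manifest, indent=3)  # Python dict -> JSON text

print(kind)
print(updated_text)
```

No extra packages are needed, which is a big part of JSON's appeal for tooling.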

Fundamentals of YAML Syntax

Let’s dive into some of the fundamentals of YAML syntax.

The YAML files you’ll write for Kubernetes are built from maps (also called dictionaries) of key-value pairs. A YAML map corresponds to a JSON object: a collection of keys and their values.

Here’s a map of five key-value pairs:

key1: value1
key2: value2
key3: value3
key4: value4
key5: value5

The equivalent in JSON would be:

{
   "key1": "value1",
   "key2": "value2",
   "key3": "value3",
   "key4": "value4",
   "key5": "value5"
}

A single key can itself contain a map:

key1:
  subkey1: subvalue1
  subkey2: subvalue2
  subkey3: subvalue3

A nested map is just a nested object. The equivalent in JSON would be:

{
   "key1": {
      "subkey1": "subvalue1",
      "subkey2": "subvalue2",
      "subkey3": "subvalue3"
   }
}

YAML also has lists:

list:
  - item1
  - item2
  - item3
  - item4
  - item5

YAML lists are just an array of values for a particular key. The equivalent in JSON would be:

{
   "list": [
      "item1",
      "item2",
      "item3",
      "item4",
      "item5"
   ]
}

Lists can also contain maps:

list:
  - item1
  - 
    mapItem1: value
    mapItem2: value
  - item3
  - item4
  - item5

A list item that contains a map is simply an object in an array. The equivalent in JSON would be:

{
   "list": [
      "item1",
      {
         "mapItem1": "value",
         "mapItem2": "value"
      },
      "item3",
      "item4",
      "item5"
   ]
}
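To make the mapping between the two formats concrete, here is a minimal sketch of a function that turns parsed JSON into block-style YAML lines. The to_yaml function is a hypothetical helper of my own, not part of any library, and it assumes the values are only maps, lists, plain strings, and numbers (it does no quoting or escaping). For real work you would reach for a proper YAML library.

```python
import json


def to_yaml(value, indent=0):
    """Render dicts, lists and simple scalars as block-style YAML lines.

    Simplified on purpose: no quoting, escaping, or multi-line strings.
    """
    pad = " " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                # A nested map or list goes on the following, deeper lines.
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml(item, indent + 2))
            else:
                lines.append(f"{pad}{key}: {item}")
    elif isinstance(value, list):
        for item in value:
            if isinstance(item, (dict, list)) and item:
                nested = to_yaml(item, indent + 2)
                # The dash shares a line with the first nested entry.
                lines.append(f"{pad}- {nested[0].lstrip()}")
                lines.extend(nested[1:])
            else:
                lines.append(f"{pad}- {item}")
    else:
        lines.append(f"{pad}{value}")
    return lines


doc = json.loads(
    '{"list": ["item1", {"mapItem1": "value", "mapItem2": "value"}, "item3"]}'
)
print("\n".join(to_yaml(doc)))
```

Note that this sketch writes the first key of a map on the same line as the dash (- mapItem1: value), which is equivalent to putting the dash on its own line as in the example above.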

Taking Another Look at Our YAML File

Given the examples above, let’s take another look at our deployment.yaml file and see if we can read it a little better.

Take this section:

YAML:

spec:
  selector:
    matchLabels:
      app: nginx

JSON:

"spec": {
   "selector": {
      "matchLabels": {
         "app": "nginx"
      }
   }
}

The indentation is still there, but in YAML it does the structural work itself, so the quotes and curly braces are no longer needed.

Let’s take a look further down the YAML file where lists are used:

spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80

And the JSON equivalent:

"spec": {
   "containers": [
      {
         "name": "nginx",
         "image": "nginx:1.7.9",
         "ports": [
            {
               "containerPort": 80
            }
         ]
      }
   ]
}

As you can see in the JSON equivalent, the value at "containers" is an array, and the first (and only) item in that array is an object.

That object has three keys in it: name, image, and ports. The value for ports is an array that contains one item, which is an object.
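The same walk-through can be sketched in Python, treating the parsed data as ordinary dictionaries and lists. The pod_spec variable below is just the container section typed in by hand, and the names are my own:

```python
# The container section from the deployment, as parsed data.
pod_spec = {
    "containers": [
        {
            "name": "nginx",
            "image": "nginx:1.7.9",
            "ports": [{"containerPort": 80}],
        }
    ]
}

containers = pod_spec["containers"]   # a list: the YAML lines starting with "-"
first_container = containers[0]       # a dict: the first (and only) list item
keys = list(first_container)          # the keys of that object
port = first_container["ports"][0]["containerPort"]  # list -> dict -> value

print(keys)
print(port)
```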

We’re getting pretty good at reading YAML now!

Required Fields in Kubernetes YAML Files

The Kubernetes docs list four required fields for the YAML file of an object you want to create:

apiVersion
kind
metadata
spec

You’ll see that our provided YAML file contains all four of these top-level keys.
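As a rough illustration, here is a sketch of a check for those four keys on an already-parsed manifest. The missing_required_fields function is a hypothetical helper of mine, not something Kubernetes provides, and the real API server validates far more than this:

```python
# The four top-level keys listed above.
REQUIRED_FIELDS = ("apiVersion", "kind", "metadata", "spec")


def missing_required_fields(manifest):
    """Return the required top-level keys that the manifest lacks."""
    return [field for field in REQUIRED_FIELDS if field not in manifest]


# A trimmed-down version of our deployment, as parsed data.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nginx-deployment"},
    "spec": {"replicas": 2},
}

print(missing_required_fields(deployment))
print(missing_required_fields({"kind": "Deployment"}))
```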

The Kubernetes docs describe the apiVersion as:

“Which version of the Kubernetes API you’re using to create this object.”

There are different Kubernetes API versions. We are using apps/v1, and that API can be found in the Kubernetes documentation.

You'll see that the Kubernetes API is pretty massive. Going over every aspect of it, even just apps/v1, is beyond the scope of this article.

Nonetheless, there are different kinds and versions of the API and you'll have to become familiar with which API is appropriate for the Kubernetes object you're creating.
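One detail that helps here: an apiVersion such as apps/v1 has the form group/version, while objects in the core group (Pods, for example) use a bare version such as v1. Here is a minimal sketch of splitting the two apart. The split_api_version function is a hypothetical helper of my own:

```python
def split_api_version(api_version):
    """Split an apiVersion string into (group, version).

    The core group has no prefix, so its group comes back as "".
    """
    group, _, version = api_version.rpartition("/")
    return group, version


print(split_api_version("apps/v1"))  # group "apps", version "v1"
print(split_api_version("v1"))       # core group: empty group, version "v1"
```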

The next required field is kind. The docs define this as:

“What kind of object you want to create.”

There are a large number of different Kubernetes objects you can create. They are listed in this Stack Overflow answer.

Next up in the list of required fields is metadata. The docs define metadata as:

“Data that helps uniquely identify the object, including a name string, UID, and optional namespace.”

This is fairly easy to understand. In our deployment.yaml file, we set the object’s name under metadata to nginx-deployment.

This value can then be used by other objects to refer to this object and it also provides us with greater context as to the purpose of this Kubernetes object.

Last of the required fields, but definitely not least, is spec. The docs go over this:

“You’ll also need to provide the object spec field. The precise format of the object spec is different for every Kubernetes object, and contains nested fields specific to that object. The Kubernetes API reference can help you find the spec format for all of the objects you can create using Kubernetes.”

The spec field is where you'll describe the object in greater detail, following the format given in the Kubernetes API reference.

For our specific YAML file, we would refer to the API reference for Deployment v1 apps, as that matches the kind and apiVersion we specified.

Conclusion

Hopefully, this has given you a much better understanding of how Kubernetes YAML files work.

YAML is very easy on the eyes compared to JSON, but the way it drops JSON’s quotes, braces, and brackets in favor of indentation can make its structure harder to see at first. YAML is also not as commonly used as JSON.

I plan to dive deeper into the Kubernetes API and how to properly configure the spec field for these objects. Stay tuned!