YAML Tutorial: get started with YAML in 5 minutes

Mar 23, 2021 - 7 min read
Ryan Thelin
editor-page-cover

YAML is a data serialization language that allows you to store complex data in a compact and readable format. It’s important for DevOps and virtualization because it’s essential in making efficient data management systems and automation.

While often overlooked by developers, it’s a powerful and simple tool that can greatly improve your job prospects with just a couple of hours of learning.

Today, we’ll help you learn YAML fast with a hands-on tutorial and explore how you can use it in your next data-driven solution.

Here’s what we’ll cover today:


Add a YAML certificate to your resume

Get hands-on experience and earn official YAML certification in less than an hour that you can add to your resume, LinkedIn profile, or your personal website.

Introduction to YAML


What is YAML?

YAML is a data serialization language for storing information in a human-readable form. It originally stood for “Yet Another Markup Language” but has since been changed to “YAML Ain’t Markup Language” to distinguish itself as different from a true markup language.

It is similar to XML and JSON files but uses a more minimalist syntax even while maintaining similar capabilities. YAML is commonly used to create configuration files in Infrastructure as Code (IoC) programs or to manage containers in the DevOps development pipeline.

More recently, YAML has been used to create automation protocols that can execute a series of commands listed in a YAML file. This means your systems can be more independent and responsive without additional developer attention.

As more and more companies embrace DevOps and virtualization, YAML is quickly becoming a must-have skill for modern developer positions. YAML is also easy to incorporate with existing technologies through the support of popular technologies like Python using PyYAML library, Docker, or Ansible.


YAML vs JSON vs XML

YAML (.yml):

  • Human-readable code
  • Minimalist syntax
  • Solely designed for data
  • Similar inline style to JSON (is a superset of JSON)
  • Allows comments
  • Strings without quotation marks
  • Considered the “cleaner” JSON
  • Advanced features (extensible data types, relational anchors, and mapping types preserving key order)

Use Case: YAML is best for data-heavy apps that use DevOps pipelines or VMs. It’s also helpful for when other developers on your team will work with this data often and therefore need it to be more readable.

JSON

  • Harder to read
  • Explicit, strict syntax requirements
  • Similar inline style to YAML (some YAML parsers can read JSON files)
  • No comments
  • Strings require double quotes

Use Case: JSON is favored in web development as it’s best for serialization formats and transmitting data over HTTP connections.

XML

  • Harder to read
  • More verbose
  • Acts as a markup language, while YAML is for data formatting
  • Contains more features than YAML, like tag attributes
  • More rigidly defined document schema

Use Case: XML is best for complex projects that require fine control over validation, schema, and namespace. XML is not human-readable and requires more bandwidth and storage capacity, but offers unparalleled control.

Enjoying the article? Scroll down to sign up for our free, bi-monthly newsletter.


Salient Features of YAML

Here are some of the best features YAML has to offer.


Multi-document support

You can have multiple YAML documents in a single YAML file to make file organization or data parsing easier.

The separation between each document is marked by three dashes (---)

---
player: playerOne
action: attack (miss)
---
player: playerTwo
action: attack (hit)
---

Built-in commenting

YAML allows you to add comments to files using the hash symbol (#) similar to Python comments.

key: #Here is a single-line comment 
   - value line 5
   #Here is a 
   #multi-line comment
 - value line 13

Readable syntax

YAML files use an indentation system similar to Python to show the structure of your program. You’re required to use spaces to create indentation rather than tabs to avoid confusion.

It also cuts much of the “noise” formatting found in JSON and XML files such as quotation marks, brackets, and braces.

Together, these formatting specifications increase the readability of YAML files beyond XML and JSON.

Imaro:
   author: Charles R. Saunders
   language: English
   publication-year: 1981
   pages: 224
YAML Example
{
  "Imaro": {
    "author": "Charles R. Saunders",
    "language": "English",
     "publication-year": "1981",
     "pages": 224,
  }
}
JSON Example

Notice that the same information is conveyed; however, the removal of double quotes, commas, and brackets throughout the YAML file makes it much easier to read at a glance.


Implicit and Explicit typing

YAML offers versatility in typing by auto-detecting data types while also supporting explicit typing options. To tag data as a certain type, simply include !![typeName] before the value.

# The value should be an int:
is-an-int: !!int 14.10
# Turn any value to a string:
is-a-str: !!str 67.43
# The next value should be a boolean:
is-a-bool: !!bool true

No executable commands

As a data-representation format, YAML does not contain executables. It’s therefore very safe to exchange YAML files with external parties.

YAML must be integrated with other languages, like Perl or Java, to add executables.


Keep the learning going.

Earn a YAML certificate in less than an hour. Educative’s hands-on courses let you pick up the essential skills and certifications you need to stand out to top recruiters.

Introduction to YAML


YAML Syntax

YAML has a few basic concepts that make up the majority of data.


Key-value pairs

In general, most things in a YAML file are a form of key-value pair where the key represents the pair’s name and the value represents the data linked to that name. Key-value pairs are the basis for all other YAML constructions.

<key>: <value>

Scalars and mapping

Scalars represent a single stored value. Scalars are assigned to key names using mapping. You define a mapping with a name, colon, and space, then a value for it to hold.

YAML supports common types like integer and floating-point numeric values, as well as non-numeric types Boolean and String.

Each can be represented in different ways, like hexadecimal, octal, or exponent. There are also special types for mathematical concepts like infinity, -infinity, and Not a Number (NAN)

integer: 25
hex: 0x12d4 #evaluates to 4820
octal: 023332 #evaluates to 9946

float: 25.0
exponent: 12.3015e+05 #evaluates to 1230150.0

boolean: Yes
string: "25"

infinity: .inf # evaluates to infinity
neginf: -.Inf #evaluates to negative infinity
not: .NAN #Not a Number

String

Strings are a collection of characters that represent a sentence or phrase. You either use | to print each string as a new line or > to print it as a paragraph.

Strings in YAML do not need to be in double-quotes.

str: Hello World
data: |
   These
   Newlines
   Are broken up
data: >
   This text is
   wrapped and is a
   single paragraph

Sequence

Sequences are data structures similar to a list or array that hold multiple values under the same key. They’re defined using a block or inline flow style.

Block style uses spaces to structure the document. It’s easier to read but is less compact compared to flow style.

--- 
# Shopping List Sequence in Block Style
shopping: 
- milk
- eggs
- juice

Flow style allows you to write sequences inline using square brackets, similar to an array declaration in a programming language like Python or JavaScript. Flow style is more compact but harder to read at a glance.

--- 
# Shopping List Sequence in Flow Style
shopping: [milk, eggs, juice]

Dictionaries

Dictionaries are collections of key-value pairs all nested under the same subgroup. They’re helpful to divide data into logical categories for later use.

Dictionaries are defined like mappings in that you enter the dictionary name, a colon, and a space followed by 1 or more indented key-value pairs.

# An employee record
Employees: 
- dan:
    name: Dan D. Veloper
    job: Developer
    team: DevOps
- dora:
   name: Dora D. Veloper
   job: Project Manager
   team: Web Subscriptions

Dictionaries can contain more complex structures as well, such as sequences. Nesting sequences is a good trick to represent complex relational data.


Advanced concepts to learn next

Congratulations on taking your first step toward learning YAML. While often overlooked, YAML is a simple and effective tool to pick up for your DevOps toolkit.

Some next advanced topics to look at are:

  • Anchors
  • Templates
  • YAML with external tools (Docker, Ansible, etc.)
  • Advanced sequence/mapping types
  • Advanced data types (timestamp, null, etc.)

To help you pick up YAML fast, Educative has created the course Introduction to YAML. This mini-course covers all YAML syntax in-depth from simple mappings to advanced anchoring techniques.

After less than an hour, you’ll have cracked all the essential YAML skills and earned a YAML certification for your DevOps resume.

Happy learning!


Continue reading about DevOps and data management


WRITTEN BYRyan Thelin

Join a community of 500,000 monthly readers. A free, bi-monthly email with a roundup of Educative's top articles and coding tips.