Introduction to YAML

Get introduced to YAML in this lesson and learn about the benefits of YAML.

Welcome to this course

Welcome to this course. I’m glad that you have enrolled to this course on YAML. This course is an interactive version of my best-selling book - Introduction to YAML: Demystifying YAML Data Serialization. This is a great way to start your journey towards learning the YAML data serialization language.

So, let’s begin our learning journey.

Overview of YAML

YAML stands for YAML Ain’t Markup Language. It rhymes with “camel”.

Previously, it stood for Yet Another Markup Language. It was later changed to emphasize that the language is intended for data and not documents.

  • YAML is a very simple, text-based, human-readable language used to exchange data between people and computers.

  • YAML is not a programming language. It is mostly used for storing configuration information.

  • YAML files store data, and they do not include commands and instructions.

  • YAML is a data serialization language similar to XML or JSON. However, it is much more human readable and concise.

In case you are wondering what data serialization is, here is the definition.

Data serialization is the process of converting data objects, or object states present in complex data structures, into a stream of bytes for storage, transfer, and distribution in a form that can allow recovery of its original structure.

Messaging is sending information across various software components or applications.

Where is YAML used?

  • YAML can represent any data structure that can be represented using either XML or JSON format.

  • YAML can store and manage any textual data in Unicode format, like configuration or log files. This makes it software and hardware platform independent for storing, transporting, and sharing data. It is important to understand this format because any changes to configuration files can cause your application to work incorrectly.

YAML is used in configuration files, log files, object persistence, caching, and messaging (sending information across various software components or applications).

Persistence is the process of preserving the state of an object longer than the lifespan or duration of the process of creating the object. This is achieved by storing the object’s state in a non-volatile memory like a hard disk instead of a volatile memory like RAM.

Caching is a technique of storing a copy of a repeatedly requested resource and serving it back from the copy itself instead of from the remote server. This makes the application more responsive.

Benefits of YAML

YAML has the following characteristics:

  • Human-readable - YAML is very much human readable. It allows you to represent complex data structures in a human readable manner. To prove this point even the homepage of the YAML’s Official site (https://yaml.org) is displayed as a YAML document.
  • Simple and clean syntax - YAML is easy to learn and simple to read. It can easily express wide variety of different data structures.
  • Easy to implement and use - It is easy to implement and use.
  • Strict - The YAML specification has very little leeway for flexibility, which increases its robustness.
  • Unambiguous - YAML unambiguously specifies the data structures of the serialized data, so there’s no need to rely on comments or documentation. Because the data structures are unambiguous, it makes it easier to use automated tools (scripts) for reading and writing YAML. For example, you could write a script that reads the data structure from a file written in YAML and converts it to another format (such as JSON or XML).
  • Version control friendly - YAML stores plain text, so it can be added to a version control such as Git or Subversion repository without any issues.
  • Portable across programming languages - YAML supports representing sequences as lists and mappings as dictionaries (hashes in some languages) in a language-independent manner.
  • Powerful - It is more powerful than JSON when it comes to specifying complex data structures. It’s a superset of JSON, which means that all valid JSON documents are also valid YAML.
  • Matches native data structures of modern programming languages such as Python, Ruby, and JavaScript. There are multiple YAML parsers in different languages, so you can use the same language for both generating and parsing YAML.
  • Consistent data model to support generic tools. It has powerful tools, such as PyYAML.
  • Supports one-pass or one-direction processing - Parsing YAML is linear. There are no forward or backward pointers to deal with, so there are fewer parsing ambiguities.
  • Expressive and extensible - By extensibility, we mean existing applications continue to work even when new data is added (or removed).
  • Fast - YAML is fast to load and easy to process in memory.
  • Secure - Many security issues in programming languages are related to parsing untrusted input (such as JSON). Python, Ruby, Java, JavaScript, PHP, and more make it possible for attackers to exploit these vulnerabilities bypassing unexpected input strings to the parser. YAML is designed to prevent these types of exploits by specifying what type of data each part of the YAML stream must consist of.

Due to the above benefits, numerous modern tools and applications rely on YAML.

Understanding YAML is important because any changes to the files that store configuration data in the YAML format can cause an application based on it to work incorrectly.

History of YAML

The following are different versions of the YAML language

  • 1.0,
  • 1.1,
  • 1.2

Timeline

  • In early 2004, Clark Evans, Oren Ben-Kiki, and Ingy döt Net released the YAML 1.0 specification, which was its first release.

  • In 2005, the YAML 1.1 standard was released. By this time JSON became popular. Accidentally JSON was a subset of YAML (both syntactically and semantically).

  • YAML 1.2 was released in 2009 to make sure it is a valid superset of JSON. YAML 1.2 has been updated to include more security features, making it more resistant to attacks.

  • YAML 1.2.2 is the latest revision of the YAML specification. It was released on 1st October 2021. It mainly focuses on improving clarity, readability and removal of ambiguity in the specification.

Summary

  • YAML stands for YAML Ain’t Markup Language.
  • YAML is a very simple data format for exchanging data.
  • It is similar to XML or JSON.
  • YAML files usually store configuration data.
  • YAML is a human-readable, clean, simple, easy to implement, consistent, portable, expressive, and extensible data serialization format.

Quiz about YAML fundamentals

1

YAML stands for

A)

Yet Another Markup Language

B)

YAML Ain’t Markup Language

C)

Your Another Markup Language

D)

Your Awesome Markup Language

Question 1 of 30 attempted