How an abstract syntax tree is formed

Let’s take a look at the following expression:

6 + 8 * 2

Whenever the compiler encounters any expression to be executed, it goes through the following three stages:

Lexical analysis
Syntax analysis
Code generation

Our point of interest here is the second step, syntax analysis. This is the step that produces the abstract syntax tree (AST).

We’ll take our expression example again. In the first step of lexical analysis, the code will be broken down into smaller pieces called tokens.

The tree also forms the basis of the hierarchy of steps to perform. For example, in our expression, the * operation will be performed before the + operation.

One of the best uses of the AST is to find syntax errors in the code.

Design requirements

There are some core requirements of the compiler to process the AST correctly:

Variable types must be preserved.
The order of statements to be executed must be well defined.
Both components of the binary tree must be stored and identified correctly.