How an abstract syntax tree is formed

Share

Let’s take a look at the following expression:

6 + 8 * 2

Whenever the compiler encounters any expression to be executed, it goes through the following three stages:

  1. Lexical analysis
  2. Syntax analysis
  3. Code generation

Our point of interest here is the second step, syntax analysis. This is the step that produces the abstract syntax tree (AST).

We’ll take our expression example again. In the first step of lexical analysis, the code will be broken down into smaller pieces called tokens.

The process of lexical analysis

In the next step of syntax analysis, the tokens are converted into a tree called the abstract syntax tree. The structure of the tree is similar to the code structure.

The abstract syntax tree of 6 + 8 * 2

The tree also forms the basis of the hierarchy of steps to perform. For example, in our expression, the * operation will be performed before the + operation.

One of the best uses of the AST is to find syntax errors in the code.

Design requirements

There are some core requirements of the compiler to process the AST correctly:

  • Variable types must be preserved.
  • The order of statements to be executed must be well defined.
  • Both components of the binary tree must be stored and identified correctly.
Copyright ©2024 Educative, Inc. All rights reserved