Code generation

[Diagram: Parsing → Transformation → Code generation]
Step 3 of 3

The final phase of a compiler is code generation. Sometimes compilers will do things that overlap with transformation, but for the most part code generation just means taking our AST and stringifying it back into code.

Code generators work in several different ways. Some compilers reuse the tokens from earlier; others create a separate representation of the code so that they can print nodes linearly. From what I can tell, though, most use the same AST we just created, which is what we’re going to focus on.

Effectively our code generator will know how to “print” all of the different node types of the AST, and it will recursively call itself to print nested nodes until everything is printed into one long string of code.

And that's it! That's all the different pieces of a compiler.

Now that isn’t to say every compiler looks exactly like what I described here. Compilers serve many different purposes, and they might need more steps than I have detailed.

But now you should have a general high-level idea of what most compilers look like.

Now we will pass the previously created AST to the codeGenerator function to convert it into a string.

function codeGenerator(node) {
  // We'll break things down by the `type` of the `node`.
  switch (node.type) {
    // If we have a `Program` node, we will map through each node in the `body`,
    // run them through the code generator, and join them with a newline.
    case 'Program':
      return node.body.map(codeGenerator)
        .join('\n');

    // For `ExpressionStatement` we'll call the code generator on the nested
    // expression and we'll add a semicolon...
    case 'ExpressionStatement':
      return (
        codeGenerator(node.expression) +
        ';' // << (...because we like to code the *correct* way)
      );

    // For `CallExpression` we will print the `callee`, add an open
    // parenthesis, map through each node in the `arguments` array and run
    // them through the code generator, joining them with a comma, and then
    // add a closing parenthesis.
    case 'CallExpression':
      return (
        codeGenerator(node.callee) +
        '(' +
        node.arguments.map(codeGenerator)
          .join(', ') +
        ')'
      );

    // For `Identifier` we'll just return the `node`'s name.
    case 'Identifier':
      return node.name;

    // For `NumberLiteral` we'll just return the `node`'s value.
    case 'NumberLiteral':
      return node.value;

    // And if we haven't recognized the node, we'll throw an error.
    default:
      throw new TypeError(node.type);
  }
}

// The AST we created earlier.
var newAST = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: {
        type: 'Identifier',
        name: 'add'
      },
      arguments: [{
        type: 'NumberLiteral',
        value: '2'
      }, {
        type: 'CallExpression',
        callee: {
          type: 'Identifier',
          name: 'subtract'
        },
        arguments: [{
          type: 'NumberLiteral',
          value: '4'
        }, {
          type: 'NumberLiteral',
          value: '2'
        }]
      }]
    }
  }]
};

// Run the whole tree through the generator and print the resulting string.
var output = codeGenerator(newAST);
console.log(output); // add(2, subtract(4, 2));
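
A switch statement is only one way to organize the "print each node type" idea. As a rough sketch of an alternative (this is not how the generator above is written), you could keep one small printer function per node type in a lookup table and let a single function dispatch on `node.type`. The names `printers`, `generate`, and `sampleAst` are made up for this illustration, and it only covers the same five node types as above.

```javascript
// Hypothetical table-driven variant: one printer per node type.
const printers = {
  // Print every statement in the body, one per line.
  Program: (node) => node.body.map(generate).join('\n'),
  // Print the nested expression and terminate it with a semicolon.
  ExpressionStatement: (node) => generate(node.expression) + ';',
  // Print callee(arg1, arg2, ...), recursing into each argument.
  CallExpression: (node) =>
    generate(node.callee) +
    '(' +
    node.arguments.map(generate).join(', ') +
    ')',
  // Leaves: nothing to recurse into, just return the stored text.
  Identifier: (node) => node.name,
  NumberLiteral: (node) => node.value,
};

function generate(node) {
  // Only accept types we defined ourselves (not inherited keys like `toString`).
  if (!Object.prototype.hasOwnProperty.call(printers, node.type)) {
    throw new TypeError(node.type);
  }
  return printers[node.type](node);
}

// A small hand-written AST for `add(2, 3)`.
const sampleAst = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'add' },
      arguments: [
        { type: 'NumberLiteral', value: '2' },
        { type: 'NumberLiteral', value: '3' },
      ],
    },
  }],
};

console.log(generate(sampleAst)); // add(2, 3);
```

The behavior is the same as the switch version, including the TypeError for unknown node types. The difference is mostly taste: with a table, supporting a new node type means adding one entry rather than another case.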