As programmers, we rely on code formatters in our day-to-day jobs. They make code easier to read by laying it out consistently, but have you ever asked yourself how they work?
Let's start our story with a file that contains a simple hello world example:
fun main() {
    print("Hello, World!")
}
The first step is to read this text file and convert it into a list of tokens. A token is an object that represents a keyword, number, bracket, string, etc., together with its position in the source code, for example:
data class Token(
    val kind: TokenKind,
    val literal: String,
    val line: Int,
)
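Keeping the line number lets us report errors that point at the right place, such as: Error in File Main Line 10: Missing semicolon :D
This step is called scanning, lexing, or tokenizing (the component that does it is the scanner, lexer, or tokenizer), and at the end we have a list of tokens, for example: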
{ FUN_KEYWORD, "fun", 1 }
{ IDENTIFIER, "main", 1 }
{ LEFT_PAREN, "(", 1 }
{ RIGHT_PAREN, ")", 1 }
{ LEFT_BRACE, "{", 1 }
{ IDENTIFIER, "print", 2 }
{ LEFT_PAREN, "(", 2 }
{ STRING, "Hello, World!", 2 }
{ RIGHT_PAREN, ")", 2 }
{ RIGHT_BRACE, "}", 3 }
In code, the result is a list of tokens:
val tokens: List<Token> = tokenizer(input)
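To make the scanning step more concrete, here is a minimal sketch of a tokenizer that handles only what the hello world example needs. The TokenKind values mirror the token list above; the character-by-character loop and the error handling are my own assumptions, not how any particular formatter does it.

```kotlin
// Token kinds taken from the token list above; a real language needs many more.
enum class TokenKind {
    FUN_KEYWORD, IDENTIFIER, STRING,
    LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE
}

data class Token(val kind: TokenKind, val literal: String, val line: Int)

// Hypothetical scanner: walks the input one character at a time.
// Assumption: string literals never span multiple lines and have no escapes.
fun tokenizer(input: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var line = 1
    var i = 0
    while (i < input.length) {
        val c = input[i]
        when {
            c == '\n' -> { line++; i++ }
            c.isWhitespace() -> i++
            c == '(' -> { tokens.add(Token(TokenKind.LEFT_PAREN, "(", line)); i++ }
            c == ')' -> { tokens.add(Token(TokenKind.RIGHT_PAREN, ")", line)); i++ }
            c == '{' -> { tokens.add(Token(TokenKind.LEFT_BRACE, "{", line)); i++ }
            c == '}' -> { tokens.add(Token(TokenKind.RIGHT_BRACE, "}", line)); i++ }
            c == '"' -> {
                // String literal: take everything up to the closing quote.
                val end = input.indexOf('"', i + 1)
                require(end != -1) { "Unterminated string on line $line" }
                tokens.add(Token(TokenKind.STRING, input.substring(i + 1, end), line))
                i = end + 1
            }
            c.isLetter() || c == '_' -> {
                // Keyword or identifier: consume letters, digits and underscores.
                val start = i
                while (i < input.length && (input[i].isLetterOrDigit() || input[i] == '_')) i++
                val word = input.substring(start, i)
                val kind = if (word == "fun") TokenKind.FUN_KEYWORD else TokenKind.IDENTIFIER
                tokens.add(Token(kind, word, line))
            }
            else -> error("Unexpected character '$c' on line $line")
        }
    }
    return tokens
}
```

Calling tokenizer on the hello world source should give the ten tokens listed earlier, each tagged with the line it came from.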
After this step, you can forget about the text file and work only with the list of tokens. Now we convert groups of tokens into nodes according to the language grammar. For example, when we see FUN_KEYWORD we know we are building a function declaration node, so we expect a name, parentheses, parameters, and so on.
In this step, we need a data structure that represents the program in a way we can traverse and validate later. It is called an Abstract Syntax Tree (AST). Each node in the AST represents a statement, such as an if, a while, a function declaration, or a variable declaration, or an expression, such as an assignment or a unary operation. Each node stores the information that later steps need, for example:
Function Declaration
data class Function(
    var name: String,
    var arguments: List<Argument>,
    var body: List<Statement>,
)
Variable Declaration
data class Var(
    var name: String,
    var value: Expression,
)
The parser takes the list of tokens and returns the root node of the AST:
val astNode = parse(tokens)
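Here is a rough sketch of what parse could look like for the hello world tokens, written as a tiny recursive descent parser. The Parser class, the expect helper, and the Call node are hypothetical names I made up for this example, and the grammar is cut down to a function with no parameters whose body is a list of single-argument calls.

```kotlin
enum class TokenKind {
    FUN_KEYWORD, IDENTIFIER, STRING,
    LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE
}

data class Token(val kind: TokenKind, val literal: String, val line: Int)

// Hypothetical AST nodes, trimmed to what the hello world example needs.
sealed class Statement
data class Call(val callee: String, val argument: String) : Statement()
data class Function(val name: String, val body: List<Statement>)

class Parser(private val tokens: List<Token>) {
    private var position = 0

    // Consume the next token, failing with its line number if the kind is wrong.
    private fun expect(kind: TokenKind): Token {
        val token = tokens.getOrNull(position)
            ?: error("Unexpected end of input, expected $kind")
        check(token.kind == kind) {
            "Line ${token.line}: expected $kind but found ${token.kind}"
        }
        position++
        return token
    }

    // function := "fun" IDENTIFIER "(" ")" "{" call* "}"
    fun parseFunction(): Function {
        expect(TokenKind.FUN_KEYWORD)
        val name = expect(TokenKind.IDENTIFIER).literal
        expect(TokenKind.LEFT_PAREN)
        expect(TokenKind.RIGHT_PAREN)
        expect(TokenKind.LEFT_BRACE)
        val body = mutableListOf<Statement>()
        while (tokens.getOrNull(position)?.kind == TokenKind.IDENTIFIER) {
            body.add(parseCall())
        }
        expect(TokenKind.RIGHT_BRACE)
        return Function(name, body)
    }

    // call := IDENTIFIER "(" STRING ")"
    private fun parseCall(): Call {
        val callee = expect(TokenKind.IDENTIFIER).literal
        expect(TokenKind.LEFT_PAREN)
        val argument = expect(TokenKind.STRING).literal
        expect(TokenKind.RIGHT_PAREN)
        return Call(callee, argument)
    }
}

fun parse(tokens: List<Token>): Function = Parser(tokens).parseFunction()
```

Each grammar rule becomes one function, and expect is where the stored line number pays off: a wrong token produces an error that says where it happened.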
With the AST in hand, we can traverse and validate it. For example, suppose we want all developers to declare variables without using _ in the name. To check that, we traverse the AST, find all Var nodes, and check each one:
fun checkVarDeclaration(node: Var) {
    if (node.name.contains("_")) {
        reportError("Oops, your variable name ${node.name} contains _")
    }
}
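The check above looks at one Var node, so we still need the traversal that finds them. Below is a small sketch of that walk. To keep it self-contained I assume a simplified AST where a value is just a String and a Function can appear as a statement, and I collect messages in a list instead of calling reportError; all of these are my assumptions.

```kotlin
// Simplified, hypothetical AST: values are plain strings here.
sealed class Statement
data class Var(val name: String, val value: String) : Statement()
data class Function(val name: String, val body: List<Statement>) : Statement()

// Walk the tree depth-first and collect one message per offending Var node.
fun checkVarDeclarations(
    node: Statement,
    errors: MutableList<String> = mutableListOf()
): List<String> {
    when (node) {
        is Var -> if (node.name.contains("_")) {
            errors.add("Oops, your variable name ${node.name} contains _")
        }
        // Nodes that contain other statements recurse into their children.
        is Function -> node.body.forEach { checkVarDeclarations(it, errors) }
    }
    return errors
}
```

Because Statement is sealed, the when has to cover every node type, so adding a new kind of node later forces you to decide how the check should treat it.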
But now we need to format the code, so how do we do that? It works the same way: we traverse the AST and, for each node, write it back out as text, but this time formatted, for example:
fun formatVarDeclaration(node: Var): String {
    val builder = StringBuilder()
    // indentation is the whitespace for the current nesting level,
    // kept by the formatter while it walks the tree
    builder.append(indentation)
    builder.append("var ")
    builder.append(node.name)
    builder.append(" = ")
    builder.append(formatValue(node.value))
    builder.append("\n")
    return builder.toString()
}
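The same idea extends to nodes that contain other nodes: the formatter passes the nesting depth down as it recurses, and every node derives its indentation from it. This is only a sketch, assuming four spaces per level, a simplified AST where values are plain strings, and functions without parameters; formatNode and indentUnit are hypothetical names.

```kotlin
// Simplified, hypothetical AST: values are plain strings here.
sealed class Statement
data class Var(val name: String, val value: String) : Statement()
data class Function(val name: String, val body: List<Statement>) : Statement()

// Assumption: four spaces per nesting level.
val indentUnit = "    "

// Write a node back to text; depth is how many blocks enclose it.
fun formatNode(node: Statement, depth: Int = 0): String {
    val indentation = indentUnit.repeat(depth)
    return when (node) {
        is Var -> "${indentation}var ${node.name} = ${node.value}\n"
        is Function -> buildString {
            append(indentation).append("fun ${node.name}() {\n")
            // Children are formatted one level deeper than their parent.
            node.body.forEach { append(formatNode(it, depth + 1)) }
            append(indentation).append("}\n")
        }
    }
}
```

Notice that the original spacing of the input never matters here: the output depends only on the AST, which is why badly indented code comes out clean.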
At the end of this step, we end up with a string that represents the same input file but formatted and then we write it back to the file.
This is the basic implementation of a code formatter. A real production formatter must handle more cases. For example, what if the code is not valid? Should we format only valid code? Should we read the whole program every time we want to format or compile the code?
Now back to Turtle graphics. In this project I had already done all the required steps and had a ready AST, so I just wrote it back out as code, as you saw above ^_^. In my case, I read the code from the UI, format it, and write it back to the UI.
If you are interested and want to read more I suggest