# Top Down Operator Precedence Read [Top Down Operator Precedence](https://tdop.github.io/). Read [Top down operator precedence parsing in Go](http://www.cristiandima.com/top-down-operator-precedence-parsing-in-go/). Main flow: ```txt Scanner -> Parser ``` ## Scanner Input: string Output: series of tokens Token is type of text. Example input is `1 + 2`: `1`, `2` is token `int`, `+` is token `plus`. ## Parser From tokens we scanned, we parsed them to [AST tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree). ### Expression Each node of AST tree is called expression. We implement expression like this: ```go type Expression struct { Token Token Value interface{} Children []Expression } ``` Each expression has `Token`. `Value` and `Children` is optional. Example: expression `int 3` has `Token int`, `Value 3` but doesn't have `Children`, expression `and A B` has `Token and`, `Children A B` but doesn't have `Value`. With input `A + B * C`, we parse to expression like this: ```txt + / \ A * / \ B C ``` ### Token precedence Each token has [precedence](https://en.wikipedia.org/wiki/Order_of_operations#Programming_languages). Precedence decides order of operator. Example `A + B * C`, `*` has higher precedence than `+` so `A + (B * C)`. ### Token program Each token has programs, program to decide what to do if we meet that token when we parse. Token program can be 2 types: `nud` or `led`. | short | long | explain | | ----- | --------------- | ---------------------------------------------------------------- | | `nud` | null denotation | code denoted by a value (int, string, ...) token or prefix token | | `led` | left denotation | code denoted by an infix token | Example prefix token is `(`, `not`, `-` (negative sign). Example infix token is `and`, `or`, `==`. ### Pratt algorithm To do what we want, we implement Pratt algorithm. Core algorithm looks like this: ```go func Parse(precedence int) Expression { token := Scan() result := nud(token) for { peekToken := Peek() if precedence >= peekToken.Precedence() { break } token := Scan() result = led(token, result) } return result } func nud(token Token) Expression { return Expression{ Token: token, // deal with value and children } } func led(token Token, expr Expression) Expression { rightExpr := Parse(token.Precedence()) // do something special return Expression { Token: token, // deal with value Children: []Expression{ expr, rightExpr, } } } ``` | mystery | explain | | -------------------------------------- | ------------------------------------------------------------------------- | | `precedence argument` | precedence of previous token | | `Scan()` | return next token and ahh it gone | | `Peek()` | return next token but it's still there | | `precedence >= peekToken.Precedence()` | previous token is already powerful than next token, stop | | `nud(token)` | return expression, this token must be value or prefix | | `led(token, result)` | return expression with result as right argument, this token must be infix | Must remember is `nud()` and `led` in example are for general. Each token should define how `nud()` and `led()` do, if not define let user handle error. To parse, call `Parse(0)`. This algorithm is hard I know. But it will be easier if we read through example ### Example Assume `+`, `-` precedence is 1, `*` precedence is 2. Input: `A + B * C - D` Function calls happen as follows: ```txt Parse(precedence = 0) (1) nud(A) result in Expression(A) 0 < peek.Precedence (peek is +, precedence is 1), enter loop led(+, Expression(A)) result in Expression(+) save Expression(A) as first child call Parse(precedence = 1) (2) and save result as second child Tree: + / \ A ? Parse(precedence = 1) (2) nud(B) result in Expression(B) 1 < peek.Precedence (peek is *, precedence is 2), enter loop led(*, Expression(B)) result in Expression(*) save Expression(B) as first child call Parse(precedence = 2) (3) and save result as second child Tree: * / \ B ? Parse(precedence = 2) (3) nud(C) result in Expression(C) 2 > peek.Precedence(peek is -, precedence is 1), stop loop return Expression(C) Tree: C Back to Parse(precedence = 1) (2) Expression(*) has Expression(C) as second child continue loop 1 = peek.Precedence (peek is -, precedence is 1), stop loop return Expression(*) with Expression(B), Expression(C) as children Tree: * / \ B C Back to Parse(precedence = 0) (1) Expression(+) has Expression(*) as second child continue loop 0 < peek.Precedence (peek is +, precedence is 1) led(-, Expression(+)) result in Expression(-) save Expression(+) as first child call Parse(precedence = 1) (4) and save result as second child Tree: - / \ + ? / * A / \ B C Parse(precedence = 1) (4) nud(D) result in Expression(D) 1 < peek.Precedence(peek is EOF, precedence is 0), stop loop return Expression(D) Tree: D Back to Parse(precedence = 0) (1) Expression(-) has Expression(D) as second child continue loop 0 = peek.Precedence(peek is EOF, precedence is 0), stop loop return Expression(-) with Expression(+), Expression(D) as children Tree: - / \ + D / * A / \ B C ```