Abstract:

Created 2 years ago by Peter Kankowski
Last changed 6 months ago
Filed under Interpreters and compilers

Reddit Digg Delicious Buzz Facebook Twitter

Adding variables to expression evaluator

In the previous part of expression evaluator series, we completed a simple compiler for arithmetic expressions. Now let's add support for variables and improve the code.

Lexer

The first improvement is separating lexer from parser. In the previous version, splitting the expression into tokens was intermixed with token processing. Now, there is a separate function, NextToken, which retrieves the next token. A token may be:

How can you return all these kind of data types from NextToken? The function returns a char, which can be an operator or two special values (TOKEN_NUM and TOKEN_VAR). In the latter case, the calling function should look for the value in _number and _var variables.

Variables

The names and addresses of variables are stored in a hash table. A simple fixed-size (1024 elements) table and Bernstein hash function were used. Both provide good performance for this task.

The code to process variables is now very simple. When we encounter a variable name, we check if it is an assignment or usage of previously assigned variable. If the variable is assigned, we emit the code to calculate the expression and store the result in memory. If the value of variable is used in the expression, we just load it from memory:

if(_cur_token == TOKEN_VAR) {
    EVAL_CHAR* var_name = _op_start;
    if( NextToken() == '=') {
        double** stored_to = _var;
        NextToken();
        ParseSummands();
        *stored_to = _code.EmitStoreNumber(ExprCode::P_LEAVE);
    } else {
        if(NULL == *_var) {
            _op_start = var_name;
            ReportError(EEE_UNDEF_VAR);
        }
        _code.EmitLoadVariable(*_var);
    }
    if(negative)
        _code.EmitOperation(OP_CHANGE_SIGN);
}

The evaluator accepts C-style assignments like a = b = 2 or 3 + b = 2 (which means b = 2 and print 3 + b). We could implement Basic-style assignment, which allows assignment only to one variable in an expression, but this would require more complex code. We would have to push token back if the variable is the first token, but there is no assignment (e.g., a + 5). So, simpler C-style assignments were chosen.

As an exercise, you can implement pre-defined variables. For example, add pi variable for "π" number.

Conclusion

This code illustrates some techniques widely used in compilers. Most translators (including commercial C++ compilers) store variables in hash tables and separate lexer from parser. By studying this toy example, you can learn something about "real" compilers.

Download the expression evaluator, 43 KB

 

The previous part: Reverse Polish notation

6 comments

Michael, 3 years ago
Awesome article, thanks for writing this series.

I've thoroughly enjoyed the education on machine code, which I'm applying to AIMGP.
Peter Kankowski, 3 years ago
Thank you for kind words :). I will continue the series.

And what is AIMGP?
Michael, 3 years ago
It stands for Automatic Induction of Machine-code Genetic Programming, it's a high performance technique for implementing Linear Genetic Programming.

It's an AI search algorithm where you generate machine code "organisms", test their fitness through execution of the raw machine code(on a set of test data), then perform crossover/mutation etc on the fitter organisms.

One of the more complex parts is learning machine code and generating it on the fly at run time.
ace, 3 years ago
One potentially interesting tool.

A parser/lexer/tree generator written with the goal to be better than all yacc-based-and-like tools, written in Java but with the C target possible. BCD license.

http://www.antlr.org/

I haven't tried it. The C runtime has around 10000 lines of C (!) which I consider a little too much for something small. But maybe it's good for experimenting or doing something big.
Peter Kankowski, 3 years ago
Expression evaluator in ANTLR
Anthony Akentiev, 2 years ago
Thank you, Petya! Can't wait to see upcoming articles. Well done. There were not so much new information to me, but it was still interesting.

p.s.
I love Red Dragon book (it is one of my fav. computer books) and i don't think that it is so "boring". I highly recomend it even as a starter.
Your name:
Comment: