Abstract:

Created by Peter Kankowski
Last changed
Filed under Interpreters and compilers

Share on social sitesReddit Digg Delicious Buzz Facebook Twitter

Code generator

P-code vs native code: pros and contras

The next step in writing our expression compiler is creating a code generator. You could write a p-code compiler and a virtual machine for executing the p-code instructions. Many compiler authors choose this approach. .NET, Java, Python are typical examples of a giant p-code compiler/interpreter. For a small and comprehensible example, see the source code of NSIS. The p-code approach has several advantages:

  • P-code is portable across different machines. You don't need to change the p-code generator when porting your program to, say, Itanium or ARM.
  • P-code is safe. You can handle all possible run-time errors (invalid p-code instruction, stack overflow, etc.) in the virtual machine.

However, in this article we will write a native code compiler for x86 ISA. This approach is unusual and is often viewed as complex and unsafe. Here are some arguments in its defence:

  • x86 ISA is ubiquitous, and you will rarely need to port your program to anything else. If you develop a Win32 application, you are already bound to Wintel platform, so using a portable p-code interpreter doesn't win you anything.
  • You can write a native code generator without a line of assembly code, so you code will work on any ANSI C++ (or ANSI C, if you wish) compiler. Below is an example of such generator.
  • Many errors can be handled in the parser, so there will be no invalid instructions or stack overflows in the generated code. You still need to check for division by zero, though.
  • Native code is faster than p-code, because it is not interpreted.
  • And the final reason: writing assembly code is a lot of fun. Generating machine code is probably even more fun. :) Even if you don't like this approach for a real program, you could play with it a little, just for fun.

Memory protection

Modern processors have an eXecution Disable (XD) bit that prevents them from executing a memory page marked as data. It provides some protection from shellcodes and viruses. Windows XP employs this bit for what Microsoft calls DEP (Data Execution Prevention). For a developer, it means that VirtualAlloc or VirtualProtect should be called to set the access rights of generated code. You cannot simply write some machine code to an array and execute it: this will work for Win 9x, but not for Windows XP.

VirtualProtect should be called only for pages allocated with VirtualAlloc. Because you should not intervene in the malloc() function or new operator to get the base addresses of allocated memory pages, you have to manage the memory yourself:

short*  _code;
double* _data;

SYSTEM_INFO si;
GetSystemInfo(&si);
DWORD region_size = si.dwAllocationGranularity;
// Allocate the pages that can be read, written to, and executed
_code = (short*)VirtualAlloc(NULL, region_size * 2, MEM_COMMIT,
                             PAGE_EXECUTE_READWRITE);
if(!_code)
   PrintErrorAndExit();
// Set _data to the beginning of the second page
_data = (double*)((char*)_code + region_size);

Here two memory regions are allocated, one for code, and one for data. Each of them is 64 KB, which will be quite enough for an expression evaluator.

Generating the code

The process of code generation is simple: you just write the necessary machine code bytes to _code array. Here are methods of the code generator class and the instructions they generate:

EmitOperation

void ExprCode::EmitOperation(OPERATION_CODE opcode);

The opcode can be OP_ADD, OP_SUB, OP_MUL, or OP_DIV. Let's take OP_ADD as an example. The method generate ADDP instruction, which pops two operands from the x87 register stack, adds them together, and pushes the result to the stack. Many p-code architectures have a similar instruction (in Java VM it is called DADD).

One thing to note here is that x86 processors store the bytes in reversed order (little-endian). That's why ADDP opcode (DE C1) is written as 0xC1DE in the source code.

EmitLoadNumber

void ExprCode::EmitLoadNumber(double number);

When you call EmitLoadNumber(1.23), it generates the following code:

number DQ 1.23
FLD [number]

The constant is allocated in _data array. Its address is written to _code after the opcode of the FLD instruction (see the EmitFldFstp function in the sources).

x86 ISA permits absolute addressing, but x86-64 uses 32-bit addresses that are relative to the current instruction address (see the footnote in this table). So two adjacent memory pages are allocated for code and data to guarantee that the difference between them will be less that 232. The code for x86-64 is not tested yet. If you have a 64-bit processor, 64-bit Windows, and a 64-bit C++ compiler, please compile the source code and say if it works.

EmitStoreNumber

double* ExprCode::EmitStoreNumber();

This method generates the FSTP instruction to store the result of calculation. It allocates the variable in _data array in the same way as EmitLoadNumber does.

EmitEnd

void ExprCode::EmitEnd();

EmitEnd writes a RET instruction to the _code array. The instruction ends the generated code and returns to the main program.

Execute

void ExprCode::Execute() {
   void (*p)() = (void (*)())_code;
   p();
}

This short method runs the generated code. You can do it without inline assembly or vendor-specific keywords, just cast _code to the pointer to a function returning void and call the function.

Download the code generator (3 KB)

 

The previous part: Expression evaluator

The next part: Reverse Polish notation

Peter Kankowski
Peter Kankowski

About the author

Peter is the developer of Aba Search and Replace, a tool for replacing text in multiple files. He likes to program in C with a bit of C++, also in x86 assembly language, Python, and PHP.

Created by Peter Kankowski
Last changed

3 comments

ac,
re: 64-bit mode.

I haven't tried your code on 64-bits, but take a look at http://www.intel.com/products/processor/manuals/index.htm

In Volume 2A:

"RIP-Relative Addressing

A new addressing form, RIP-relative (relative instruction-pointer) addressing, is implemented
in 64-bit mode. An effective address is formed by adding displacement to the 64-bit RIP of the
next instruction."

Note the "RIP of the next instruction".
Peter Kankowski,
Thank you, I've corrected this. Also, I compiled the asm code with FASM and compared the binary with the generated code. After your correction, they are identical, so the generator should be okay.

Your name:


Comment: