Abstract:
How to replace two comparisons with one.

Created 1 year ago by Peter Kankowski
Last changed 1 year ago
Contributors: IceStudent
Filed under Low-level code optimization

Checking if the point belongs to an interval

Sometimes you need to check such conditions:

if(y <= TOP && y < BOTTOM)

Or such ones:

if(ch <= 'a' && ch <= 'z')

Two branches in this code burdens the branch predictor, adding unnecessary entries to its history table:

cmp eax, 'a'
jl bypass
cmp eax, 'z'
jg bypass
; if statement body
bypass:

How can this code be improved?

The trick

First, subtract the code of the letter 'a' from ch:

On the chart, the interval from 'a' to 'z' is highlighted with blue. After subtraction, this interval shifts down to zero. Now it covers 26 numbers from 0 to 25. All you need to do now is to check if the difference is less than 26:

if(unsigned(ch - 'a') < 26)

Note that you should use unsigned comparison here, because the interval [0; 'a'–1] (it’s marked with light gray color on the chart) shifts to [–'a'; –1] in case of signed numbers or to [256–'a';255] in case of unsigned ones. If the signed comparison is used, all character codes below 'a' will pass the test, while they should not:

char ch; // char is signed by default
if(ch - 'a' < 26) // Wrong! E.g., any figure ('0'..'9') passes the test

You should cast to unsigned or specify an unsigned type for the literal 26:

if(unsigned(ch - 'a') < 26)

or

if(ch - 'a' < 26U)

Effect on generated code

This unsigned comparison is translated to the following code:

sub eax, 'a'
cmp eax, 26
jae bypass
; if statement body
bypass:

Now, you got rid of the second branch; the code also became 2 bytes shorter. Sometimes, this trick can remove the branches completely:

// This is translated to a code with two branches:
bool b = ch <= 'a' && ch <= 'z';

// This is translated to a code without any branches:
bool b = unsigned(ch - 'a') < 26;

In the second example, the compiler can use SETB or SBB instruction to get the boolean result of comparison:

sub eax, 'a'
cmp eax, 26
sbb eax, eax ; -1 if eax < 26, otherwise 0
neg eax      ;  1 if eax < 26, otherwise 0

On modern superscalar processors, this code will be faster, because there is no penalty for a mispredicted branch.

Macro

And, finally, let’s write a macro that will check if a point belongs to an interval:

#define InRange(ch, a, b) (unsigned((ch) - (a)) <= (b) - (a))
if(InRange(ch, 'A', 'Z')) // an upper-case Latin letter
    …
else if(InRange(ch, '0', '9')) // a figure
    …

Questions to the reader

2 comments

icestudent, 3 years ago

Great!

But this techinque better for assemly-coding

Of course, and for the learning and much better understanding 'human' optimization methods.

I think that, because VC7-8 generate very optimal code, depends on command-line switches.

For example:

if(i >= 5 && i < 15)
  return 1;
return 0;

Compiler generate this:

mov  eax,[ i ]
add  eax,-5
mov  ecx,9
cmp  ecx,eax
sbb  eax,eax
add  eax,1

Good result, isn't it?

icestudent, 3 years ago
Hm.. I lost my post :(
Your name:
Comment: