Abstract:
How to replace two comparisons with one.

Created by Peter Kankowski
Last changed
Contributors: IceStudent
Filed under Low-level code optimization

Share on social sitesReddit Digg Delicious Buzz Facebook Twitter

Checking if the point belongs to an interval

Sometimes you need to check such conditions:

if(y <= TOP && y < BOTTOM)

Or such ones:

if(ch <= 'a' && ch <= 'z')

Two branches in this code burdens the branch predictor, adding unnecessary entries to its history table:

cmp eax, 'a'
jl bypass
cmp eax, 'z'
jg bypass
; if statement body
bypass:

How can this code be improved?

The trick

First, subtract the code of the letter 'a' from ch:

On the chart, the interval from 'a' to 'z' is highlighted with blue. After subtraction, this interval shifts down to zero. Now it covers 26 numbers from 0 to 25. All you need to do now is to check if the difference is less than 26:

if(unsigned(ch - 'a') < 26)

Note that you should use unsigned comparison here, because the interval [0; 'a'–1] (it’s marked with light gray color on the chart) shifts to [–'a'; –1] in case of signed numbers or to [256–'a';255] in case of unsigned ones. If the signed comparison is used, all character codes below 'a' will pass the test, while they should not:

char ch; // char is signed by default
if(ch - 'a' < 26) // Wrong! E.g., any figure ('0'..'9') passes the test

You should cast to unsigned or specify an unsigned type for the literal 26:

if(unsigned(ch - 'a') < 26)

or

if(ch - 'a' < 26U)

Effect on generated code

This unsigned comparison is translated to the following code:

sub eax, 'a'
cmp eax, 26
jae bypass
; if statement body
bypass:

Now, you got rid of the second branch; the code also became 2 bytes shorter. Sometimes, this trick can remove the branches completely:

// This is translated to a code with two branches:
bool b = ch <= 'a' && ch <= 'z';

// This is translated to a code without any branches:
bool b = unsigned(ch - 'a') < 26;

In the second example, the compiler can use SETB or SBB instruction to get the boolean result of comparison:

sub eax, 'a'
cmp eax, 26
sbb eax, eax ; -1 if eax < 26, otherwise 0
neg eax      ;  1 if eax < 26, otherwise 0

On modern superscalar processors, this code will be faster, because there is no penalty for a mispredicted branch.

Macro

And, finally, let’s write a macro that will check if a point belongs to an interval:

#define InRange(ch, a, b) (unsigned((ch) - (a)) <= (b) - (a))
if(InRange(ch, 'A', 'Z')) // an upper-case Latin letter
    …
else if(InRange(ch, '0', '9')) // a figure
    …

Questions to the reader

  • Explain why the following conditions are not equivalent:
  • ch - '0' < 10U
    ch < 10U + '0'
    
  • How will you check that ch is NOT a lower-case Latin letter?
  • Write a test for InRange macro. Please check all possible values from 0 to 255 and see if the macro works correctly.
Peter Kankowski
Peter Kankowski

About the author

Peter lives in Siberia, the land of sleeping sun, beautiful mountains, and infinitely deep snow. He likes to program in C with a bit of C++, also in x86 assembly language, Python, and PHP (on Windows platform). He can be reached at kankowski@narod.ru.

2 comments

icestudent,

Great!

But this techinque better for assemly-coding

Of course, and for the learning and much better understanding 'human' optimization methods.

I think that, because VC7-8 generate very optimal code, depends on command-line switches.

For example:

if(i >= 5 && i < 15)
  return 1;
return 0;

Compiler generate this:

mov  eax,[ i ]
add  eax,-5
mov  ecx,9
cmp  ecx,eax
sbb  eax,eax
add  eax,1

Good result, isn't it?

icestudent,
Hm.. I lost my post :(
Your name:
Comment:

Please ignore this field: