Suppose you have a null-terminating string, and its length is stored in a variable (so called fucked string). The function is looking for a character in the string.
Naive implementation
const char* memchr_naive(const char* s, int ch, size_t len) {
const char* end = s + len;
for(; s < end; s++)
if(*s == ch)
return s;
return NULL;
}
Sentinel implementation
There is a better way to do it: temporarily replace the terminating null with the searched character. It will be a "sentinel". Now the program can do one comparison on each iteration instead of two. After the loop, we should check if we found the sentinel or a character in the string:
char* memchr_sentinel(char* s, int ch, size_t len) {
char* sentinel = (char*)(s + len);
*sentinel = (char)ch;
while(*s != ch)
s++;
*sentinel = '\0';
return (s < sentinel) ? s : NULL;
}
This method is 5-30% faster. With loop unrolling and vectorizing tricks by Agner Fog, it shows even better results for long strings:
Time in clock cycles | ||
---|---|---|
for short strings | for long string | |
CRT | 1280 | 1560 |
naive | 1500 | 2670 |
sentinel | 1430 | 2130 |
sentinel unrolled | 1370 | 1430 |
sentinel vectorized | 1380 | 1180 |
sentinel vectorized asm | 1550 | 1160 |
Conclusion
You can use the sentinel method not only for memchr, but also for string search and array search functions. The main limitation is that there should be enough space for the sentinel after the end of the string. The string should be located in a writeable section of exe file (it cannot be "const"!).
Dowload the code (30 KB)
5 comments
But I wonder: Am I the only one who has problems posting comments? It takes ages (and lots of timeouts) to see my post once I pressed the "Submit Comment"-Button.
I will think about switching to another hosting if the performance problems will continue. Thank you for reporting them.
The string passed in argument is const char, so should not be modified.If the string happens to be placed in some read-only memory, this is going to basically crash (or in the best case, not work as expected).
Also, you did mention that this will work on a NULL-terminated string ; if len happens to be the size of a buffer but the string is actually much smaller than len, this is going to be less efficient than the naive implementation. And of course, this new memchr_sentinel will not work on substrings...