When debugging a complex parser, you often have to deal with buffer overruns. Consider a parser that reads 1-2 characters past the end of the buffer when given some specific input. The bytes after the buffer are usually readable, but if they are not, the program crashes. The bug is hard to catch because it occurs only with a specific memory layout.
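As a hypothetical illustration (the function names below are made up for this sketch and do not come from any real parser), this is the kind of lookahead that typically causes such a bug: the code checks the character after a '\r' without first checking that it is still inside the buffer.

```c
#include <stddef.h>

/* Hypothetical example: counts "\r\n" pairs in buf[0..len).
   BUG: when the last character is '\r', buf[i + 1] reads buf[len],
   which is one byte past the end of the buffer. */
size_t CountCrLfBuggy(const char* buf, size_t len) {
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            count++;
    }
    return count;
}

/* Fixed version: the loop stops early enough that buf[i + 1]
   is always inside the buffer. */
size_t CountCrLf(const char* buf, size_t len) {
    size_t count = 0;
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            count++;
    }
    return count;
}
```

The buggy version misbehaves only when the input ends with '\r', and it crashes only if the byte after the buffer happens to be unreadable, which is why such a bug can survive ordinary testing.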
In such cases, you need to make the bug occur every time. Your program will crash more often, but that's good, because it helps you find the root of the problem.
Here is a debugging version of malloc that places the buffer at the very end of a memory page and makes the following page inaccessible, so reading even one extra character causes an access violation:
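#include <windows.h>
#include <assert.h>

VOID* DbgMalloc(SIZE_T size) {
    SYSTEM_INFO sys_info;
    GetSystemInfo(&sys_info);
    SIZE_T page = sys_info.dwPageSize;
    assert((page & (page - 1)) == 0); // the page size is a power of two
    // Round the size up to a page boundary
    SIZE_T rounded_size = (size + page - 1) & ~(page - 1);
    // Reserve and commit the data pages plus one trailing guard page
    BYTE* start = (BYTE*)VirtualAlloc(NULL, rounded_size + page, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (start == NULL)
        return NULL; // allocation failed; report it the way malloc does
    // Make the trailing page inaccessible
    DWORD old_protect;
    BOOL res = VirtualProtect(start + rounded_size, page, PAGE_NOACCESS, &old_protect);
    assert(res); UNREFERENCED_PARAMETER(res);
    // Place the block so that it ends exactly where the guard page begins
    return start + (rounded_size - size);
}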
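To make the pointer arithmetic concrete, here is a small portable sketch of the same calculation. The helper names (RoundUpToPage, OffsetInAllocation) are hypothetical, and the 4096-byte page used in the note below is only an assumption for the example; the real page size comes from GetSystemInfo.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to a multiple of page; page must be a power of two. */
size_t RoundUpToPage(size_t size, size_t page) {
    assert(page != 0 && (page & (page - 1)) == 0);
    return (size + page - 1) & ~(page - 1);
}

/* Offset of the returned block from the start of the allocation,
   chosen so that the block ends exactly where the guard page begins. */
size_t OffsetInAllocation(size_t size, size_t page) {
    return RoundUpToPage(size, page) - size;
}
```

For a 10-byte request with 4096-byte pages, the rounded size is 4096 and the block starts at offset 4086, so the byte just past the block is the first byte of the guard page. One consequence is that the returned pointer is generally not aligned the way a pointer from malloc is, which may matter for code that relies on aligned access.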
Now you can allocate memory with this function when running automated tests. If the test coverage is close to 100%, you will be able to catch nearly all reads and writes past the end of the buffers allocated this way.
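The listing above does not show how to release such a block. Here is a minimal sketch of a matching free function, assuming the block came from DbgMalloc as shown (the names DbgFree and PageBase are hypothetical and are not taken from the original article). Because the block starts less than one page after the address returned by VirtualAlloc, rounding the pointer down to a page boundary recovers that address.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds an address down to a page boundary; page must be a power of two. */
uintptr_t PageBase(uintptr_t addr, uintptr_t page) {
    assert(page != 0 && (page & (page - 1)) == 0);
    return addr & ~(page - 1);
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical counterpart to DbgMalloc: releases the whole region,
   including the guard page. */
void DbgFree(void* ptr) {
    if (ptr == NULL)
        return;
    SYSTEM_INFO sys_info;
    GetSystemInfo(&sys_info);
    void* base = (void*)PageBase((uintptr_t)ptr, sys_info.dwPageSize);
    /* MEM_RELEASE needs the base address from VirtualAlloc and a size of 0. */
    BOOL res = VirtualFree(base, 0, MEM_RELEASE);
    assert(res); (void)res;
}
#endif
```

Only the address calculation is portable; the call to VirtualFree is compiled on Windows only. This sketch must not be used on pointers that came from the regular malloc.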
10 comments
See http://support.microsoft.com/kb/286470
MSVC++ malloc uses it's own heap manager for small blocks of data, so gflags will not work for them. In this case, you have to "reinvent the wheel" and use DbgMalloc.
So, gflags can be useful. I will try this tool, thank you again.
http://www.microsoft.com/technet/prodtechnol/windows/appcompatibility/appverifier.mspx
The debug heap of MSVS has some handy functions as well, for example _CrtSetDbgFlag.
> I viewed the source code of MSVC 6.0 CRT
Even in 6.0 the CRT doesn't use its own manager for small blocks unless it's explicitly turned on. Apparently turning it on can often improve the overall performance even in the code built with the later compiler versions if the code does a lot of small allocations.
(Off topic: There's a simple rule, convenient for us non-native English speakers: If you mean “it is” or “it has”, write “it's”; otherwise write “its”.)
Karlis, thank you very much; I will try AppVerifier. There was an article about CRT debug heap functions on this blog.
I also found a way to do this at link time (i.e. completely automated), but I never got around to documenting it.