Beyond the debugger

It is said that printf (although I usually use fprintf) is enough for debugging, and that it is what seasoned programmers use. Certainly you can get by using only printf, but is it really better than a debugger? I'm not going to argue either case. Instead, with a focus on C, I will in this article go over tools and techniques that every C programmer should know, and that make the debugger the slower or less useful tool in many situations.

The first debugging tool C programmers usually learn is not the debugger, but an indispensable complement to it: Valgrind (there is also the less capable but faster alternative memleax(1)). Valgrind is primarily known for its memory checker. It has many other capabilities that you are advised to read up on, but the memory checker is assuredly its most important and most used part. It uses emulation and function swapping to find memory leaks and most memory access bugs. Valgrind provides bug-hunting tools that debuggers do not implement, and its memory checker is useful for somewhere in the vicinity of half of all bugs in C code. Running it is usually the first thing to do when debugging your code, as you will quickly see whether an error is due to a memory access bug, such as reading uninitialised memory or accessing memory out of bounds. If the process crashes because of a memory segmentation violation (usually from dereferencing a null or uninitialised pointer), valgrind(1) will tell you where, and whether there were any other memory access errors leading up to the crashing fault.

As stated above, Valgrind uses emulation. This can be a pain point for test cases. For long-running or indefinitely running processes it can be critical that there are no memory leaks. Imagine a process leaking 100 bytes a hundred times per second, that is, 10,000 bytes per second (an extreme case, since you can often make memory allocations rare, but I know from experience with third-party libraries that some libraries do in fact allocate such small amounts of memory very frequently). Now imagine this process running uninterrupted for just a day. It would leak 824 MiB. In one week this adds up to 5.6 GiB, in four weeks to 22.5 GiB, and in one year to 294 GiB. Hopefully there is a system in place to restart it before it uses up all the memory on the system. But if you don't want this to happen, you would add tests to find memory leaks. You write some tests, run them with valgrind(1), and parse the output. However, because Valgrind uses emulation, this can be very slow, and there are also other types of resource leaks that you want to find, some of which involve resources even scarcer than physical memory.

The solution is to utilise two obscure properties: one of the C preprocessor and one of libc. libc, the C standard library, uses weak linking, which means that you can simply reimplement its functions in your own test code. Most interesting functions can be implemented very simply (apart from the memory allocation functions, the important functions are trivial, and there are trivial ways to implement memory allocation), but if not, libc (or whatever other library you are dealing with) may provide hidden, strongly linked implementations that can be called from wrapping code. As for your own functions, if you put round brackets around the function name when you define or declare them, the C preprocessor can be used to replace calls to the function (although only direct calls; calls via function pointers cannot be replaced). This can be used to fully replace functions or to wrap them in test code.

Now you can add resource leak detection to your test code, which can run natively. Be aware, however, that your replacements of malloc(3) and company will themselves be replaced by Valgrind's replacements, so your test code will need to check whether its replacement is being used. This is also important in cases where libc does not use weak linking. Furthermore, it is important because valgrind(1) might not work with your libc. Replacing predefined functions this way is also the easiest way to test how your program handles failures that are hard to trigger, memory allocation failure probably being the most difficult. Keep in mind, though, that memory allocation failures are usually asynchronous, as the operating system is usually configured to over-commit memory (some applications allocate a terabyte even if you only have a few gigabytes) and to allocate committed memory dynamically during page faults.
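The following is a minimal sketch of the preprocessor half of this idea, not a definitive implementation. It redirects direct calls to malloc(3) and free(3) with macros instead of reimplementing the libc symbols, so it does not rely on weak linking at all. The names test_malloc, test_free, live_allocations, fail_countdown, duplicate and duplicate_calls are all made up for the illustration. Strictly speaking, the standard reserves the names malloc and free once stdlib.h is included, so treat the macro definitions as something that works with common compilers rather than something the standard promises.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical test-harness state; none of these names come from a library. */
static long live_allocations;      /* successful allocations not yet freed */
static long fail_countdown = -1;   /* 0: fail the next allocation, -1: never fail */
static int duplicate_calls;        /* number of traced calls to duplicate() */

/* The wrappers are defined before the macros below, so inside them
 * malloc and free still refer to the real libc functions. */
static void *test_malloc(size_t size)
{
    if (fail_countdown == 0) {
        fail_countdown = -1;
        return NULL;               /* injected allocation failure */
    }
    if (fail_countdown > 0)
        fail_countdown--;
    void *ptr = malloc(size);
    if (ptr != NULL)
        live_allocations++;
    return ptr;
}

static void test_free(void *ptr)
{
    if (ptr != NULL)
        live_allocations--;
    free(ptr);
}

/* From here on, direct calls are redirected by the preprocessor. */
#define malloc(size) test_malloc(size)
#define free(ptr) test_free(ptr)
#define duplicate(text) (duplicate_calls++, (duplicate)(text))

/* Code under test. Because the name is inside round brackets it is not
 * followed by "(", so the function-like macro above does not expand here. */
static char *(duplicate)(const char *text)
{
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);     /* expands to test_malloc(size) */
    if (copy != NULL)
        memcpy(copy, text, size);
    return copy;
}

/* Passes only if nothing leaks and a failed allocation is handled. */
static void test_duplicate(void)
{
    char *copy = duplicate("hello");
    assert(copy != NULL);
    assert(strcmp(copy, "hello") == 0);
    assert(live_allocations == 1);
    free(copy);
    assert(live_allocations == 0); /* leak check */

    fail_countdown = 0;            /* make the next allocation fail */
    char *failed = duplicate("hello");
    assert(failed == NULL);
    assert(live_allocations == 0);
    printf("duplicate: %d traced calls, no leaks\n", duplicate_calls);
}
```

Calling test_duplicate() from a test program's main function runs the checks natively, without emulation. The same counting approach could be applied to other resources, such as file descriptors, by wrapping the functions that acquire and release them.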

As for printf, print statements can provide very clean and easy-to-read debug traces, and they can even be more detailed than what a typical debugger can provide (the GNU Debugger (GDB) supports Python scripting that can be used to provide just as much information, but it can be a chore). This can certainly be very useful, but adding print statements can also be a chore, and a debugger may be the better alternative. However, the C preprocessor is a very versatile tool that many languages are sadly missing. Modern PC CPUs have (slow) branch tracing, and GDB does support it. But regardless of whether your CPU has support for branch tracing, you can do branch tracing without a debugger, using only the C preprocessor: simply replace some keywords. The C preprocessor can expand macros inside macros (even macros passed as arguments), but when it reaches the name of the macro it is already expanding, it stops. So if you for example define A as (A+1), every A will be replaced with (A+1) (it does not create an infinite loop). This means that if you build your program with

make CC='gcc -include /usr/include/stdio.h -D'\'\
'if(...)=if((__VA_ARGS__) && (printf("%i\n", __LINE__),1))'\'

every if-statement whose condition is met will print the line number where the if-statement is found. Naturally, more details can be printed, such as the condition, the filename and even the function name. This technique works for any keyword that is followed by a pair of brackets (importantly, even though the preprocessor does not allow whitespace after the macro name when a parameterised macro is defined, it does allow whitespace after the macro name when the macro is expanded). Of course, you don't want to type this incantation every time. What you do instead is either add the macros to a private header file, or add a wrapper script for the compiler and put it in ~/.local/bin (and make sure ~/.local/bin is in your PATH). Instead of defining macros with the same names as the keywords, you can give them more descriptive names (and have different versions), and use the preprocessor to replace the keywords with these macros in select sections of your code. If you are using C++20, you can use __VA_OPT__ and __VA_ARGS__ combined with print statements to print the input and output of every call to select functions. This can also be done in GNU C11, but it requires _Generic, which unfortunately isn't as flexible and gets very ugly (due to having to add casts to suppress warnings and even errors generated for unmatched cases). Using X-macros, however, you can provide ways, from your code, to print your data types.
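As a rough sketch of the header-file variant, the tracing macro can be given a descriptive name and mapped onto the keyword only for a select section. The names TRACED_IF, trace_hits and sign_of are invented for this example, and the counter is only there so that a test can check how many branches fired. The headers are included before the keyword is redefined, since redefining a keyword ahead of a standard header is not allowed by the standard.

```c
#include <stdio.h>

/* Hypothetical counter, added so that tests can check how many branches fired. */
static int trace_hits;

/* Descriptively named tracing variant of "if": reports every condition that
 * evaluates to true, with file, line, function and the condition's text.
 * The "if" inside the body is not expanded again when this macro is reached
 * through the "if" macro below, because a macro is never re-expanded inside
 * its own expansion. */
#define TRACED_IF(...) \
    if ((__VA_ARGS__) && (trace_hits++, \
        fprintf(stderr, "%s:%d: %s: if (%s)\n", \
                __FILE__, __LINE__, __func__, #__VA_ARGS__), 1))

/* ---- traced section: the keyword itself is replaced ---- */
#define if(...) TRACED_IF(__VA_ARGS__)

static int sign_of(int n)
{
    if (n < 0)
        return -1;
    else if (n == 0)
        return 0;
    return 1;
}

#undef if
/* ---- end of traced section: "if" is an ordinary keyword again ---- */
```

With this in place, a call such as sign_of(-3) writes one line to standard error naming the file, the line, the function sign_of and the condition n < 0, while sign_of(5) writes nothing because neither condition is met. Tracing to standard error rather than standard output is a choice made here to keep the program's normal output clean.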

There are other debugging tools as well, but the only other one that I've found useful is strace(1) (and my simpler sctrace(1)), which traces system calls. Apropos strace(1), there is a less used debugging technique where dprintf(3) is used to print debug information, which can be included in releases, to a negative file descriptor, so that it only becomes visible to the user via system call tracing. This technique can be improved on by adding an environment variable that decides whether to print to a negative file descriptor or a real file descriptor.
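A possible shape for the environment variable variant is sketched below. The variable name DEBUG_FD, the macro DEBUG and the functions debug_fd and parse_port are assumptions made for the illustration, not established conventions. How the failed call on the negative descriptor shows up in a trace depends on how your libc implements dprintf(3), so check the trace on your own system before relying on it.

```c
#define _POSIX_C_SOURCE 200809L   /* for dprintf(3) */
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* DEBUG_FD is a hypothetical variable name chosen for this sketch.
 * Unset, empty or malformed values select -1, an invalid descriptor. */
static int debug_fd(void)
{
    const char *value = getenv("DEBUG_FD");
    char *end;
    long fd;

    if (value == NULL || *value == '\0')
        return -1;
    errno = 0;
    fd = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || fd < 0 || fd > INT_MAX)
        return -1;
    return (int)fd;
}

/* The descriptor is looked up on every call, which keeps the sketch short;
 * a real program would probably cache it. */
#define DEBUG(...) ((void)dprintf(debug_fd(), __VA_ARGS__))

/* Example of instrumented code that could ship in a release build. */
static int parse_port(const char *text)
{
    int port = atoi(text);
    DEBUG("parse_port: \"%s\" -> %d\n", text, port);
    return port;
}
```

Run normally, the program prints nothing extra. Running it under strace(1) is then the way to look for the message, and running it with DEBUG_FD=2 in the environment sends the same message to standard error instead.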

That said, there are still situations where using a debugger is less work, but it is difficult to formulate when. There are also very useful features in GDB, such as checkpoints, which allow you to step back to a previous state and are useful in combination with its ability to modify memory. There are also conditional breakpoints and the ability to suspend either just the trapped thread or all threads.