Create a stack monitor in 7 easy steps
One of the most painful bugs to hunt down in an embedded system is a stack that overflows its boundaries and starts to overwrite nearby memory. The symptoms of a stack overflow usually appear randomly, only when the perfect storm of interrupts and function calls occurs, which makes them difficult to detect. A stack monitor can catch the overflow when it happens. Here are seven steps that developers can take to create a stack monitor and ensure the stack remains in its allocated memory region.
Step #1 – Perform a worst case stack analysis
Many compilers and tool chains will automatically set the stack size to 0x400 bytes, which is equivalent to a kilobyte of RAM. A stack size of a kilobyte is usually sufficient for many applications but this is supposed to be computer science and not a guessing game. So how can an engineer be sure that the stack is properly sized? The answer is to perform a worst case stack analysis.
A worst case stack analysis can be performed in many different ways, and a full treatment is beyond the scope of this article. In general, though, a developer needs to fully understand a number of items. First, an understanding of the call depth of their application is necessary. How many functions are calling functions that call functions before returning back up the chain? Each of the return addresses is stored on the stack. Second, the developer needs to understand the number and size of all variables within those functions to estimate how much stack space each function will use. Finally, the developer will need to determine how many interrupts could be nested at the same time, along with the size of each interrupt frame.
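To make the bookkeeping concrete, here is a minimal sketch in C of the sum described above: the frames along the deepest call chain plus the frames of every interrupt that could be stacked on top of it. The struct, the function name, and any byte counts fed into it are hypothetical; real numbers have to come from your own toolchain reports or measurements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one function on the deepest call chain. */
typedef struct {
    const char *name;        /* function name, for the analyst's reference */
    uint32_t    frame_bytes; /* locals + saved registers + return address  */
} stack_frame_t;

/* Add up the frames of the deepest call chain, then add the frame of
 * every interrupt that could be nested on top of it at the same time. */
uint32_t worst_case_stack_bytes(const stack_frame_t *chain, size_t chain_len,
                                const uint32_t *isr_frames, size_t isr_count)
{
    uint32_t total = 0u;

    for (size_t i = 0; i < chain_len; i++) {
        total += chain[i].frame_bytes;
    }
    for (size_t i = 0; i < isr_count; i++) {
        total += isr_frames[i];
    }
    return total;
}
```

With made-up frames of 32, 96, and 64 bytes and two nested interrupt frames of 104 bytes each, the estimate comes to 400 bytes.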
Step #2 – Set the stack size
The worst case stack analysis will reveal the size that the stack should be. But calculating stack size is difficult, so even after a careful analysis of the system, it doesn’t hurt to multiply the final number by 1.5 just to make sure that there is a reasonable buffer included for unforeseen circumstances. The stack size can then be changed either through the project properties or through the linker file, depending on preference and tool capabilities.
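The 1.5 multiplier is easy to apply with integer math. The sketch below also rounds the result up to an 8-byte boundary; that alignment is an assumption on my part (many cores want an aligned stack, but check the ABI for yours), and the names are hypothetical.

```c
#include <stdint.h>

#define STACK_ALIGN_BYTES 8u /* assumed alignment; check your core's ABI */

/* Scale the analysed worst case by 1.5, then round up to the alignment. */
uint32_t padded_stack_size(uint32_t worst_case_bytes)
{
    /* x + x/2 is 1.5x without floating point (x/2 truncates for odd x). */
    uint32_t padded = worst_case_bytes + (worst_case_bytes / 2u);

    /* Round up to the next multiple of STACK_ALIGN_BYTES (a power of two). */
    return (padded + (STACK_ALIGN_BYTES - 1u)) & ~(STACK_ALIGN_BYTES - 1u);
}
```

A 400-byte worst case becomes 600 bytes, which would then be the value entered in the project properties or the linker file.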
Step #3 – Select a protection method
Properly sizing the stack is good progress towards preventing the stack from overflowing and clobbering nearby memory regions, but it still doesn’t allow for detection of such an overflow event. In an embedded system there are a number of ways to detect such an event. The first is to use a memory protection unit and set the stack boundary. If the stack crosses the boundary the MPU can fire an interrupt and the system can then log the issue and follow procedure to recover the system.
A second approach, if an RTOS is in use, is to enable stack overflow detection in the RTOS. Many RTOSs have this detection enabled by default, but I have seen articles recommending turning this feature off to improve performance! Disabling stack overflow detection is NOT recommended, or else you may feel the cold embrace of a stack overflow bug.
Finally, in a resource constrained system where an MPU isn’t available or an RTOS isn’t in use, a developer can very easily create their own stack monitor.
Step #4 – Add guard section to the linker
A developer can create a stack guard section in a number of different ways but one useful way is to use the linker file. The linker file can be updated to include a guard size and location. The size is completely arbitrary. A rule of thumb is to make the guard large enough so that if an overflow were to occur it wouldn’t overflow the guard area. An example of what the guard section might look like can be seen in Figure 1.
Figure 1 – Example of what the linker may look like
Step #5 – Populate guard space with pattern
Creating a guard section is great but it isn’t terribly useful unless there is a known pattern populating it. A guard pattern can then later be checked by the application code to determine if an overflow has occurred. Any pattern can be placed in the guard area but I’ve found it useful during subsequent debugging to use a pattern that is human readable. The use of the pattern 0xC0DE is one of my favorites. Figure 1 shows an example of what construction of a populated guard area might look like. The exact implementation will vary based on the toolchain that is used.
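As a rough illustration of the fill step, the sketch below writes 0xC0DE into every slot of a guard region at startup. The array is only a stand-in so the code can be compiled anywhere; on a real target the guard would be the memory reserved in the linker file. The guard size and all names here are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define GUARD_PATTERN 0xC0DEu /* fill value suggested in the text */
#define GUARD_WORDS   16u     /* assumed size: 16 half-words = 32 bytes */

/* Stand-in for the guard region. On a target this memory would be the
 * region reserved by the linker file, not an ordinary array. */
uint16_t stack_guard[GUARD_WORDS];

/* Write the known pattern into every slot of the guard. Call this once
 * at startup, before the stack has had a chance to grow. */
void stack_guard_fill(uint16_t *guard, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        guard[i] = (uint16_t)GUARD_PATTERN;
    }
}
```

A 16-bit slot matches the width of the pattern, so a memory dump of an intact guard reads as a run of C0DE values, which is easy to spot in a debugger.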
Step #6 – Periodically check the pattern
The application code should be set up to periodically check that the entire guard section still contains the correct pattern. A change in the pattern indicates that the stack has overflowed into the guard. Application code for this check is relatively simple. A developer just needs to loop through each pattern word and verify that it is still correct. Figure 2 shows an example loop using a pointer that is checking the stack guard fill pattern. If a change were detected, the application could then branch off, try to log the system state, and begin recovery procedures.
Figure 2 – Example of what the guard application may look like
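A check loop along these lines could be as small as the sketch below. It assumes the guard was filled with 0xC0DE in 16-bit slots; the function name and parameters are hypothetical, and how often to call it (a timer tick, the main loop) is left to the application.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_PATTERN 0xC0DEu /* must match the value used to fill the guard */

/* Return true when every slot of the guard still holds the pattern.
 * The pointer is volatile so each slot is really read from memory; an
 * overflowing stack changes it behind the compiler's back. */
bool stack_guard_intact(const volatile uint16_t *guard, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        if (guard[i] != GUARD_PATTERN) {
            return false; /* pattern disturbed: treat as a stack overflow */
        }
    }
    return true;
}
```

When the function returns false, the caller would branch to whatever logging and recovery path the system defines.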
Step #7 – Test the guard
The final step to creating a stack monitor is, of course, to test it! One of the best ways to test the monitor is to write a small piece of code that will modify the stack guard pattern. The periodic check of the stack guard should then detect that the pattern has changed, an indication that the stack has overflowed.
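One way to write that small piece of test code is a self-test that damages a single guard slot on purpose, confirms the check notices, and then repairs the slot. This is only a sketch under the same assumptions as before (16-bit slots filled with 0xC0DE); the helper and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_PATTERN 0xC0DEu /* must match the value used to fill the guard */

/* Count the slots that no longer hold the pattern. */
static size_t guard_bad_slots(const volatile uint16_t *guard, size_t words)
{
    size_t bad = 0u;
    for (size_t i = 0; i < words; i++) {
        if (guard[i] != GUARD_PATTERN) {
            bad++;
        }
    }
    return bad;
}

/* Corrupt one slot, confirm the damage is detected, then restore it.
 * Returns true only if the guard started clean, the corruption was
 * seen, and the guard is clean again afterwards. */
bool stack_guard_self_test(volatile uint16_t *guard, size_t words)
{
    if (words == 0u || guard_bad_slots(guard, words) != 0u) {
        return false; /* nothing to test, or guard already damaged */
    }

    guard[words - 1u] = (uint16_t)~GUARD_PATTERN; /* simulate an overflow */
    bool detected = (guard_bad_slots(guard, words) == 1u);

    guard[words - 1u] = (uint16_t)GUARD_PATTERN;  /* repair the guard */
    return detected && (guard_bad_slots(guard, words) == 0u);
}
```

Running this once at startup, after the guard has been filled, gives some confidence that the periodic check will fire when a real overflow occurs.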
A tested stack monitor goes a long way towards improving the reliability and robustness of the system. Once the stack monitor is able to detect the overflow, additional application code is necessary to decide what to do with that information. Logging the call depth, register values, and application state is an action that will help a developer reproduce the overflow and discover the root cause.
The stack is often overlooked by developers when they start software development. But stack overflow is one of those difficult-to-find bugs that can cause headaches unless developers make the effort to monitor for it. Detecting a stack overflow isn’t difficult and the minor performance hit of a monitor is well worth it!
Jacob Beningo is a Certified Software Development Professional (CSDP) whose expertise is in embedded software. He works with companies to decrease costs and time to market while maintaining a quality and robust product. Feel free to contact him at email@example.com or through his website www.beningo.com, and sign up for his monthly Embedded Bytes Newsletter.