
Detecting Stack Overflows with the Micrium OS Kernel

04/102/2018 | 06:21 PM
Jean Labrosse


In a previous blog, I showed how you can display the stack of a Micrium OS Kernel-based application using µC/Probe. In this post, I’ll describe the importance of sizing your stacks at design time and checking task stacks at run-time to catch stack overflows. I will first explore how to determine the size of task stacks and then describe several stack overflow detection methods, listed in order from most to least preferable based on the likelihood of detecting the overflow.

How do you determine the size of a task stack?

In a Micrium OS Kernel-based (and most real-time kernels) application, each task requires its own stack.  The size of the stack required by a task is application specific.  It’s possible to manually figure out the stack space needed by adding up:

  • The memory required by all function call nesting. For each function call hierarchy level:
    • Depending on the CPU architecture, one pointer for the return address of a function call.  Some CPUs actually save the return address in a special register reserved for that purpose (often called the Link Register).  However, if the function calls another function, the Link Register must be saved by the caller, so it might be wise to assume that the pointer is pushed onto the stack anyway.
    • The memory required by the arguments passed in those function calls.  Arguments are often passed in CPU registers but again, if a function calls other functions the register contents will be saved onto the stack anyway.  I would thus highly recommend that you assume that arguments are passed on the stack for the purpose of determining the size of a task’s stack.
    • Storage of local variables for those functions
    • Additional stack space for state saving operations inside the functions
  • The storage for a full CPU context (depends on the CPU) plus FPU registers as needed
  • The storage of another full CPU context for each nested ISR (if the CPU doesn’t have a separate stack to handle ISRs)
  • The stack space needed for local variables used by those ISRs. 

Adding all this up is a tedious chore and the resulting number is a minimum requirement.  Most likely you would not size the stack that precisely, so that you leave room for “surprises”.  The number you come up with should probably be multiplied by some safety factor, possibly 1.5 to 2.0.  The stack usage calculation assumes that the exact path of the code is known at all times, which is not always possible.  Specifically, when calling a function such as printf() it might be difficult or nearly impossible to even guess just how much stack space printf() will require.  Also, indirect function calls through tables of function pointers could be problematic.  Generally speaking, start with a fairly large stack space and monitor the stack usage at run-time to see just how much stack space is actually used after the application runs for a while.  For more information, you can visit “Exploring the Micrium OS Kernel Built-In Performance Measurements” in the Blog section of the Silicon Labs website (www.silabs.com).
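To make the arithmetic concrete, here is a minimal sketch that adds up the contributions listed above and applies a safety factor. The structure, the function names and the example numbers are all hypothetical; the real context sizes depend on your CPU and compiler.

```c
#include <stddef.h>

/* Hypothetical inputs for a manual stack estimate, all in bytes. */
typedef struct {
    size_t call_depth;      /* deepest function call nesting                 */
    size_t bytes_per_call;  /* return address + arguments + locals per level */
    size_t cpu_context;     /* full CPU context, plus FPU registers if used  */
    size_t isr_nesting;     /* nested ISRs that share the task stack         */
    size_t isr_locals;      /* local variables used by each of those ISRs    */
} stack_estimate_t;

/* Minimum stack size in bytes, before any safety margin. */
size_t stack_min_bytes(const stack_estimate_t *e)
{
    size_t calls = e->call_depth * e->bytes_per_call;
    size_t isrs  = e->isr_nesting * (e->cpu_context + e->isr_locals);

    return calls + e->cpu_context + isrs;
}

/* Apply a safety factor given in percent (150 = 1.5x, 200 = 2.0x), rounding up. */
size_t stack_with_margin(size_t min_bytes, unsigned factor_percent)
{
    return (min_bytes * factor_percent + 99u) / 100u;
}
```

With five call levels of 32 bytes each, a 64-byte context and two nested ISRs using 16 bytes of locals, the minimum comes to 384 bytes, and a 1.5 factor raises that to 576 bytes.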

Also, avoid writing recursive code because stack usage is typically non-deterministic with this type of code.

There are really cool and clever compilers/linkers such as Keil and IAR that provide this information in a link map.  Specifically, for each function, the link map indicates the worst-case stack usage.  However, these tools will not account for indirect calls (i.e. function pointers) or assembly language routines.  GCC has partial support by providing per-function stack usage but not a call-graph. This feature clearly enables you to better evaluate stack usage for each task.  It is still necessary to add the stack space for a full CPU context plus another full CPU context for each nested ISR (if the CPU does not have a separate stack to handle ISRs), plus whatever stack space is needed by those ISRs.  Again, allow for a safety net and multiply this value by some factor.  

If your kernel monitors stack usage at run-time then it’s a good idea to display that information and keep an eye on your stacks while developing and testing the product.  Stack overflows are common and can lead to some curious behaviors.  In fact, whenever someone mentions that his or her application behaves “strangely,” insufficient stack size is the first thing that comes to mind.

What are Stack Overflows?

Just so that we are on the same page, below is a description of what a stack overflow is.  For the sake of discussion, it’s assumed here that stacks grow from high-memory to low-memory.  Of course, the same issue occurs when the stack grows in the other direction. Refer to Figure 1.

Figure 1 – Stack Overflow

F1-(1) The CPU’s SP (Stack Pointer) register points somewhere inside the stack space allocated for a task.  The task is about to call the function foo() as shown in Listing 1.

Listing 1 – Example of possible stack overflow
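A function consistent with the description in F1-(2) through F1-(4) might look like the following sketch. The array size of 10 is an assumption, chosen so that the return address, i and the array together account for the 12 stack entries (48 bytes) mentioned in F1-(4). foo_frame_bytes() is a hypothetical helper that only spells out that arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of foo(): one scalar plus a local array. */
void foo(void)
{
    int32_t i;
    int32_t array[10];

    for (i = 0; i < 10; i++) {   /* any use of the locals touches the stack */
        array[i] = i;
    }
    (void)array;
}

/* Bytes consumed on entry, assuming 4-byte stack entries and a pushed
   return address: 1 (return address) + 1 (i) + 10 (array) entries. */
size_t foo_frame_bytes(void)
{
    return sizeof(uint32_t) * (1u + 1u + 10u);
}
```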

F1-(2) Calling foo() causes the CPU to save the return address of the caller onto the stack.  Of course, that depends greatly on the CPU and the compiler.

F1-(3) The compiler then adjusts the stack pointer to accommodate for local variables. Unfortunately, at this point, we overflowed the stack (the SP points outside the storage area assigned for the stack) and just about anything foo() does will corrupt whatever data is beyond the stack base.  In fact, depending on the code flow, the array might never be used, in which case the problem would not be immediately apparent.  However, if foo() calls another function, there is a high likelihood that this will cause something outside the stack to be touched.

F1-(4) So, when foo() starts to execute code, the stack pointer has an offset of 48 bytes from where it was prior to calling foo() (assuming a stack entry is 4 bytes wide).

F1-(5) We typically don’t know what resides here.  It could be the stack of another task, it could be variables, data structures or an array used by the application.  Overwriting whatever resides here can cause strange behaviors: values computed by another task may not be what you expected and could cause decisions in your code to take the wrong path, or your system may work fine under normal conditions but then fail.  We just don’t know and it’s actually quite difficult to predict.  In fact, the behavior can change each time you make changes to your code.

Detecting Stack Overflows

There are a number of techniques that can be used to detect stack overflows.  Some make use of hardware while some are performed entirely in software.  As we will see shortly, having the capability in hardware is preferable since stack overflows can be detected nearly immediately as they happen, which can help avoid those strange behaviors and aid in solving them faster.  

Hardware stack overflow detection mechanisms generally trigger an exception handler.  The exception handler typically saves the current PC (Program Counter) and possibly other CPU registers onto the current task’s stack.  Of course, because the exception occurs when we are attempting to access data outside of the stack, the handler would overwrite some variables or another stack in your application, assuming there is RAM beyond the base of the overflowed stack.

In most cases the application developer will need to decide what to do about a stack overflow condition.  Should the exception handler place the embedded system in a known safe state and reset the CPU or simply do nothing?  If you decide to reset the CPU, you might figure out a way to store the fact that an overflow occurred and which task caused the overflow so you can notify a user upon reset.

Technique 1: Using a Stack Limit Register

Some processors (unfortunately very few of them) have simple yet highly effective stack pointer overflow detection registers.  This feature will, however, be available on processors based on the Armv8-M CPU architecture.  When the CPU’s stack pointer goes below (or above, depending on stack growth) the value set in this register (let’s call it the SP_Limit register), an exception is generated. The drawing in Figure 2 shows how this works.

Figure 2 – Using a Stack Limit Register to Detect Stack Overflows

F2-(1) The SP_Limit register is loaded by the context switch code of the kernel when the task is switched in.

F2-(2) The location where the SP_Limit points to could be at the very base of the stack or, preferably, at a location that would allow the exception handler enough room to save enough registers on the offending stack to handle the exception.

F2-(3) As the stack grows, if the SP register ever goes below the SP_Limit, an exception is generated.  As we’ve seen, when your code calls a function and uses local variables, the SP register can easily be positioned outside the stack upon entry of a function.  One way to reduce the likelihood of this happening is to move the SP_Limit further away from the Stack Base Address.

The Micrium OS Kernel was designed from the get-go to support CPUs with a stack limit register.  Each task contains its own value to load into the SP_Limit and this value is placed in the Task Control Block (TCB).  The value of the SP_Limit register used by the CPU’s stack overflow detection hardware needs to be changed whenever the Micrium OS Kernel performs a context switch.  The sequence of events to do this must be performed in the following order:

1- Set SP_Limit to 0.  This ensures the stack pointer is never below the SP_Limit register.  Note that I assumed here that the stack grows from high memory to low memory but the concept works in a similar fashion if the stack grows in the opposite direction.

2- Load the SP register.

3- Get the value of the SP_Limit that belongs to the new task from its TCB.  Set the SP_Limit register to this value.
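The reason the order matters can be shown with a small simulation. This is only a sketch: sim_cpu_t, sim_tcb_t and the functions below are hypothetical stand-ins, and on real hardware these steps are done in the kernel's assembly language context switch code. The simulated CPU raises a fault whenever SP is below SP_Limit, as described above for a stack that grows downward.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated CPU registers (hypothetical). */
typedef struct {
    uintptr_t sp;
    uintptr_t sp_limit;
    bool      fault;        /* set if SP is ever below SP_Limit */
} sim_cpu_t;

/* Hypothetical, trimmed-down Task Control Block. */
typedef struct {
    uintptr_t saved_sp;
    uintptr_t stk_limit;
} sim_tcb_t;

static void check(sim_cpu_t *cpu)
{
    if (cpu->sp < cpu->sp_limit) {
        cpu->fault = true;
    }
}

static void write_sp(sim_cpu_t *cpu, uintptr_t v)       { cpu->sp = v;       check(cpu); }
static void write_sp_limit(sim_cpu_t *cpu, uintptr_t v) { cpu->sp_limit = v; check(cpu); }

/* Switch in 'next' using the three steps in the order given above. */
void sim_ctx_switch(sim_cpu_t *cpu, const sim_tcb_t *next)
{
    write_sp_limit(cpu, 0);                /* 1: SP can never be below 0        */
    write_sp(cpu, next->saved_sp);         /* 2: load the new task's SP         */
    write_sp_limit(cpu, next->stk_limit);  /* 3: arm the limit for the new task */
}

/* Wrong order, for contrast: SP is loaded while the old limit is still armed. */
void sim_ctx_switch_wrong(sim_cpu_t *cpu, const sim_tcb_t *next)
{
    write_sp(cpu, next->saved_sp);
    write_sp_limit(cpu, next->stk_limit);
}
```

If the new task's stack sits lower in memory than the old task's limit, the wrong order trips the detection hardware in the middle of the switch even though neither task has overflowed.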

The SP_Limit register provides a simple way to detect stack overflows.  

Technique 2: Using an MPU – Stacks Are Contiguous

Arm Cortex-M processors are typically equipped with an MPU (Memory Protection Unit), which monitors the address bus to see if your code is allowed to access certain memory locations or I/O ports.  MPUs are relatively simple devices to use but are somewhat complex to set up.  However, if all you want to do is detect stack overflows then an MPU can be put to good use without a great deal of initialization code.  The MPU is already on your chip, meaning it’s available at no extra cost to you, so why not use it?  In the discussion that follows, we’ll set up an MPU region that says “if ever you write to this region, the MPU will trigger a CPU exception.”

One way to set up your stacks is to locate ALL of the stacks together in contiguous memory, starting the stacks at the base of RAM, and locating the C stack as the first stack at the base of RAM as shown in Figure 3.

Figure 3 – Locating Task Stacks Contiguously

As the kernel context switches between tasks, it moves a single MPU ‘protection window’ (I will call it the “RED Zone”) from task to task as shown in Figure 4.  Note that the RED Zone is located below the base address of each of the stacks.  This allows you to make use of the full stack area before the MPU detects an overflow.

Figure 4 – Moving the RED Zone During Context Switches

As shown, the RED Zone can be positioned below the stack base address. The size of the RED Zone depends on a number of factors.  For example, the size of the RED Zone on the MPU of a Cortex-M CPU must be a power of 2 (32, 64, 128, 256, etc.).  Also, stacks must be aligned to the size of the RED Zone.  On processors based on the Armv8-M architecture, this restriction has been removed and MPU region size granularity is 32 bytes.  However, with the Armv8-M, you’d use its stack limit register feature.  The larger the RED Zone, the more likely we can detect a stack overflow when a function call allocates large arrays on the stack.  However, locating RED Zones below the stack base address has other issues. For one thing, you cannot allocate buffers on a task’s stack and pass that pointer to another task because it’s possible that the allocated buffer would be overlaid by the RED Zone thus causing an exception.  However, allocating buffers on a task’s stack is not good practice anyway, so getting slapped by an MPU violation is a kind punishment.  
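The size and alignment rules can be checked with a few lines of code before an MPU region is programmed. The sketch below assumes the power-of-2 rule described above with a 32-byte minimum; the function names are hypothetical and no actual MPU registers are touched.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'size' is a power of two of at least 32 bytes. */
bool redzone_size_ok(uint32_t size)
{
    return (size >= 32u) && ((size & (size - 1u)) == 0u);
}

/* True if a RED Zone of 'size' bytes can sit immediately below 'stk_base'.
   The zone's own base is then stk_base - size, which is aligned to 'size'
   exactly when the stack base is. */
bool redzone_fits_below(uint32_t stk_base, uint32_t size)
{
    return redzone_size_ok(size) &&
           (stk_base >= size) &&
           ((stk_base % size) == 0u);
}

/* Base address of the RED Zone region for a given stack base. */
uint32_t redzone_base(uint32_t stk_base, uint32_t size)
{
    return stk_base - size;
}
```

For example, a stack based at 0x20000400 accepts a 64-byte RED Zone starting at 0x200003C0, while a stack based at 0x20000410 does not, because that address is not a multiple of 64.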

You may also ask: “Why should the C stack be located at the start of RAM?” Because in most cases, once multitasking has started, the C stack is never used and is thus lost.  Overflowing into RAM that is no longer used might not be a big deal but, technically, it should not be allowed.  Having the C stack’s RAM there simply gives us a place to store the CPU registers that are stacked on the offending task’s stack during an MPU exception sequence.

Technique 3: Using an MPU – Stacks are non-contiguous

If you are not able to allocate storage for your tasks in contiguous memory as I outlined in the previous section then we need to use the MPU differently.  What we can do here is to reserve a portion of RAM towards the base of the stack and, if anything gets written in that area, generate an exception.  The kernel would reconfigure the MPU during a context switch to protect the new task’s stack.  This is shown in Figure 5.

Figure 5 – Locating the RED Zone inside a Task’s Stack

Again, the size of the RED Zone depends on a number of factors.  As previously discussed, for the MPU on a Cortex-M CPU (except for Armv8-M), the size must be a power of 2 (32, 64, 128, 256, etc.).  Also, stacks must be aligned to the size of the RED Zone.  The larger the RED Zone, the more likely we can detect a stack overflow when a function call allocates large arrays on the stack.  However, in this case, the RED Zone takes away storage space from the stack because, by definition, a write to the RED Zone will generate an exception and thus cannot be performed by the task.   If the size of a stack is 512 bytes (i.e. 128 stack entries for a 32-bit wide stack), a 64-byte RED Zone would consume 12.5% of your available stack and thus leave only 448 bytes for your task, so you might need to allocate larger stacks to compensate.  
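The trade-off is simple arithmetic, sketched below with hypothetical helper names. The overhead is returned in tenths of a percent so that 12.5% can be represented without floating point.

```c
#include <stddef.h>

/* Bytes left for the task once the RED Zone is carved out of its stack. */
size_t usable_stack_bytes(size_t stk_bytes, size_t redzone_bytes)
{
    return (redzone_bytes < stk_bytes) ? (stk_bytes - redzone_bytes) : 0u;
}

/* RED Zone overhead in tenths of a percent (125 means 12.5%). */
unsigned redzone_overhead_tenths_pct(size_t stk_bytes, size_t redzone_bytes)
{
    if (stk_bytes == 0u) {
        return 0u;
    }
    return (unsigned)((redzone_bytes * 1000u) / stk_bytes);
}

/* Stack size to allocate so that 'needed' bytes remain usable. */
size_t stack_size_for(size_t needed, size_t redzone_bytes)
{
    return needed + redzone_bytes;
}
```

For the 512-byte stack and 64-byte RED Zone above, this gives 448 usable bytes and an overhead of 125 tenths of a percent. To keep a full 512 bytes usable, you would allocate 576 bytes, before rounding up to whatever alignment the MPU requires.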

As shown in Figure 6, if a function call ‘skips over’ the RED Zone by allocating local storage for an array or a large data structure then the code might not ever write in the RED Zone and thus bypass the stack overflow detection mechanism altogether.  In other words, if the RED Zone is too small, foo() might just use i and array[0] to array[5] but nothing that happens to overlap the RED Zone.

Figure 6 – Bypassing the RED Zone

To avoid this, local variables and arrays should always be initialized as shown in Listing 2.  

Listing 2 – Initializing local variables to better detect stack overflows
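In code, initializing the locals on entry might look like the sketch below. The shape of foo() and the array size are assumptions carried over from the earlier discussion, and the return value exists only so that the sketch can be checked.

```c
#include <stdint.h>
#include <string.h>

/* Returns 0; every local is written as soon as the function is entered. */
int32_t foo(void)
{
    int32_t i = 0;
    int32_t array[10];

    memset(array, 0, sizeof(array));   /* touch the whole array on entry */

    /* ... the rest of foo() would follow here ... */

    return i + array[9];
}
```

Be aware that an optimizing compiler may remove stores it considers dead. Declaring the locals volatile, or inspecting the generated code, is one way to make sure the writes really reach the stack.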

 

Technique 4: Software-based RED Zones

The Micrium OS Kernel has a built-in RED Zone stack overflow detection mechanism, but it’s implemented in software.  This software-based approach is enabled by setting OS_CFG_TASK_STK_REDZONE_EN to DEF_ENABLED in os_cfg.h. When enabled, the Micrium OS Kernel creates a monitored zone at the end of a task's stack which is filled upon task creation with a special value.  The actual value is not that critical and we used 0xABCD2345 as an example (but it could be anything).  However, it’s wise to avoid values that could be used in the application such as zero.  The size of the RED Zone is defined by OS_CFG_TASK_STK_REDZONE_DEPTH.  By default, the size of the RED Zone is eight CPU_STK elements deep.  The effectively usable stack space is thus reduced by 8 stack entries.  This is shown in Figure 7.

The Micrium OS Kernel checks the RED Zone at each context switch.  If the RED Zone has been overwritten or if the stack pointer is out-of-bounds, the Micrium OS Kernel informs the user by calling OSRedzoneHitHook().  The hook allows the user to gracefully shut down the application since at this point the stack corruption may have caused irreversible damage.  The hook, if defined, must ultimately call CPU_SW_EXCEPTION() or otherwise stop the Micrium OS Kernel from proceeding with corrupted data.
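The idea behind the check can be sketched in a few lines. This is not the kernel's implementation: stk_t, REDZONE_DEPTH, REDZONE_FILL and both functions are hypothetical stand-ins that mirror the defaults described above, for a stack that grows toward low memory. In the kernel, a failed check is what would lead to OSRedzoneHitHook() being called.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REDZONE_DEPTH  8u            /* entries, mirrors the default above */
#define REDZONE_FILL   0xABCD2345u   /* example fill value from the text   */

typedef uint32_t stk_t;              /* stand-in for CPU_STK               */

/* Fill the RED Zone at the low-address end of the stack at task creation. */
void redzone_init(stk_t *stk_base)
{
    for (size_t i = 0; i < REDZONE_DEPTH; i++) {
        stk_base[i] = REDZONE_FILL;
    }
}

/* Returns true if the stack looks healthy: SP in bounds and RED Zone intact. */
bool redzone_check(const stk_t *stk_base, size_t stk_entries, const stk_t *sp)
{
    if (sp < stk_base + REDZONE_DEPTH || sp > stk_base + stk_entries) {
        return false;                /* stack pointer is out of bounds */
    }
    for (size_t i = 0; i < REDZONE_DEPTH; i++) {
        if (stk_base[i] != REDZONE_FILL) {
            return false;            /* something wrote into the RED Zone */
        }
    }
    return true;
}
```

Because the check only runs at a context switch, the corruption has already happened by the time it is noticed. That is the price of doing this in software rather than with a stack limit register or an MPU.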

Since the RED Zone is typically small, it’s ever so important to initialize local variables, large arrays or data structures upon entry of a function in order to detect the overflow using this mechanism. 

The software RED Zone is nice because it’s portable across any CPU architecture.  However, the drawback is that it consumes possibly valuable CPU cycles during a context switch.

Figure 7 – Software-based RED Zone

 

Technique 5: Determining the actual stack usage at run-time

Although not actually an automatic stack overflow detection mechanism, determining the ideal size of a stack at run-time is highly useful and is a feature available in the Micrium OS Kernel.  Specifically, you’d allocate more stack space than you anticipate the task will use, then monitor and possibly display the actual maximum stack usage at run-time.  This is fairly easy to do.  First, the task stack needs to be cleared (i.e. filled with zeros) when the task is created.  You should note that we could have used a different value than zero.  Next, a low priority task (the statistics task in the Micrium OS Kernel) walks the stack of each created task, from the bottom towards the top, counting the number of zero entries. When the statistics task finds a non-zero value, the process is stopped and the usage of the stack can be computed (in number of stack entries used or as a percentage).  From this, you can adjust the size of the stacks (by recompiling the code) to allocate a more reasonable value (either increase or decrease the amount of stack space for each task).  For this to be effective, however, you need to run the application long enough and under stress for the stack to grow to its highest value.  This is illustrated in Figure 8.

Figure 8 – Determining Actual Stack Usage at Run-Time

The Micrium OS Kernel provides a function that determines stack usage of a task at run-time, OSTaskStkChk() and, in fact, the Micrium OS Kernel’s statistics task, OS_StatTask() calls this function repeatedly for each task created every 1/10th of a second.  This is what µC/Probe displays as described in my other article: See “Exploring the Micrium OS Kernel Built-In Performance Measurements” in the Blog section of the Silicon Labs website (www.silabs.com).
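The walk itself is short. The sketch below is a simplified illustration of the technique, not the code behind OSTaskStkChk(); stk_t and the function names are hypothetical, and the stack is assumed to grow toward low memory and to have been zero-filled at task creation.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t stk_t;   /* stand-in for CPU_STK */

/* Count never-used entries, walking up from the low-address end of the stack. */
size_t stk_free_entries(const stk_t *stk_base, size_t stk_entries)
{
    size_t free_cnt = 0;

    while (free_cnt < stk_entries && stk_base[free_cnt] == 0u) {
        free_cnt++;
    }
    return free_cnt;
}

/* Entries used at the deepest point the stack has reached so far. */
size_t stk_used_entries(const stk_t *stk_base, size_t stk_entries)
{
    return stk_entries - stk_free_entries(stk_base, stk_entries);
}

/* The same figure as a percentage of the stack size. */
unsigned stk_used_percent(const stk_t *stk_base, size_t stk_entries)
{
    if (stk_entries == 0u) {
        return 0u;
    }
    return (unsigned)((stk_used_entries(stk_base, stk_entries) * 100u) / stk_entries);
}
```

The walk stops at the first non-zero entry, so zeros that the task itself stored higher up the stack do not distort the result. The figure can only under-report if the deepest entries the task ever wrote happen to be zero.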

Summary

This blog described different techniques to detect stack overflows.  Stack overflows can occur either in single or multi-threaded environments.  Even though we can detect overflows, there is typically no way to safely continue execution after one occurs and, in many cases, the only recourse is to reset the CPU or halt execution altogether.  However, before taking such a drastic measure it’s recommended for your code to bring your embedded system to a known and safe state. For example, you might turn off motors, actuators, open or close valves and so on. Even though you are in a shutdown state you might still be able to use kernel services to perform this work. 

 
