Almost all Silicon Labs Starter/Development Kits include an onboard J-Link debugger, which is great not only for debugging and flashing your embedded application, but also for running SystemView.
SystemView, as you probably know, is the tool for recording and analyzing your Micrium OS Kernel events in real time.
However, the onboard J-Link can be slow depending on the rate of kernel events that your embedded application is creating. Overflow events occur when the SystemView buffer on your embedded target is full.
In this blog I'll discuss the basic steps to prevent overflows, and then I'll describe the ultimate way to prevent them.
1. Increase the buffer size to store the events:
Open the configuration file SEGGER_SYSVIEW_Conf.h and set the buffer size to 4096 as shown below:
#define SEGGER_SYSVIEW_RTT_BUFFER_SIZE 4096
2. In case you are running Simplicity Studio, close Simplicity Studio and let SystemView run by itself.
3. In case you are running Probe, close Probe and let SystemView run by itself.
4. Open the configuration file os_cfg_trace.h and decrease the number of events by disabling the following features:
#define OS_CFG_TRACE_API_ENTER_EN DEF_DISABLED
#define OS_CFG_TRACE_API_EXIT_EN  DEF_DISABLED
5. If you are still having overflows after making the changes above, then the ultimate way to prevent overflows is to buy a much faster External J-Link from SEGGER: https://www.segger.com/products/debug-probes/j-link/
Most of our Starter Kits have a Debug Connector that you can use to connect the external J-Link.
The following section describes how to connect your external J-Link to a Silicon Labs Starter Kit.
1. First you need to configure your Starter Kit to reroute the debugging circuitry to the external debug connector.
Open Simplicity Studio, select your Starter Kit, locate the section Debug Mode: MCU and then press the link Change as shown in the image below:
2. You may be asked to download an adapter firmware image. If so, press the button Yes.
3. The default Debug Mode is called MCU which means that your debugger is the Onboard J-Link.
4. Select from the drop-down the option IN which means that your debugger is an External J-Link as shown in the image below:
5. Connect your external J-Link to the debug connector on your Silicon Labs Starter Kit.
Depending on your kit, it may be one of the ones shown below.
The J-Link 19-pin 0.05" Cortex-M Debug Connector shown in Figure 3 may require a J-Link 19-pin Cortex-M Adapter available from SEGGER.
On the other hand, the standard 20-pin 0.1" JTAG Debug Connector shown in Figure 4 does not require any adapters and can be connected directly to your external J-Link.
For more information on how to configure your Starter Kit in the appropriate debugging mode, please consult your Starter Kit's User's Guide, the section On-Board Debugger, Debug Modes.
SystemView Installer: https://www.segger.com/downloads/free-utilities/#SystemView
SystemView User's Manual: https://www.segger.com/downloads/free-utilities/#Manuals
J-Link Debug Probes: https://www.segger.com/products/debug-probes/j-link/
J-Link 19-pin Cortex-M Adapter: https://www.segger.com/products/debug-probes/j-link/accessories/adapters/19-pin-cortex-m-adapter/
Whether your embedded application currently runs on Silicon Labs hardware or on another vendor's silicon, the migration path is the same, as illustrated in Figure 1.
You should start from a working Micrium OS example and then move your embedded application over to the example project.
Once you move your embedded application to the Micrium OS baseline example project, you need to change the code that calls the FreeRTOS API.
The attached PDF document describes the differences between the two kernels and offers plenty of side-by-side examples and mapping tables to help you migrate your embedded application.
Today, we’ve announced the acquisition of Sigma Designs’ Z-Wave business. Adding Z-Wave to our wireless portfolio gives ecosystem providers and developers of smart home solutions access to the broadest range of wireless connectivity options available today.
Together, we’ll open the door to millions of potential users of smart home technologies by expanding access to a large and varied network of ecosystems and partners. Z-Wave’s reputation as a leading mesh networking technology for the smart home with traction in more than 2,400 certified interoperable Z-Wave devices from more than 700 manufacturers and service providers worldwide, coupled with Silicon Labs’ position as the leader in silicon, software, and solutions for the IoT, make this a great match.
Silicon Labs has been actively driving the IoT for years, and we recognize the large following Z-Wave has among developers and end customers in the smart home. With our experience in mesh technologies, we are uniquely positioned to advance the standard and grow adoption with input from the Z-Wave Alliance and partners.
Adding Z-Wave to Silicon Labs’ extensive portfolio of connectivity options allows us to create a unified vision for the technologies underpinning the smart home market: a secure, interoperable customer experience is at the heart of how smart home products are designed, deployed and managed. Our vision for the smart home is one where various technologies work securely together, where any device using any of our connectivity technologies easily joins the home network, and where security updates or feature upgrades occur automatically or on a pre-determined schedule.
In a previous blog, I showed how you can display the stacks of a Micrium OS Kernel-based application using µC/Probe. In this post, I'll describe the importance of sizing your stacks at design time and checking task stacks at run-time to catch stack overflows. I will first explore how to determine the size of task stacks and then present several stack overflow detection methods, listed from most to least preferable based on the likelihood of detecting the overflow.
In a Micrium OS Kernel-based application (as with most real-time kernels), each task requires its own stack. The size of the stack required by a task is application specific. It's possible to manually figure out the stack space needed by adding up: the storage for all nested function calls (return addresses plus local variables), the stack space needed by ISRs (including ISR nesting, if the CPU does not use a separate ISR stack), and room for a full CPU context save.
Adding all this up is a tedious chore, and the resulting number is only a minimum requirement. Most likely you would not size the stack that precisely, so that you can plan for “surprises”; the number you come up with should be multiplied by some safety factor, possibly 1.5 to 2.0. The calculation also assumes that the exact path of the code is known at all times, which is not always possible. Specifically, when calling a function such as printf(), it might be difficult or nearly impossible to even guess just how much stack space printf() will require. Indirect function calls through tables of function pointers can be similarly problematic. Generally speaking, start with a fairly large stack space and monitor the stack usage at run-time to see just how much stack space is actually used after the application runs for a while. For more information, see “Exploring the Micrium OS Kernel Built-In Performance Measurements” in the Blog section of the Silicon Labs website (www.silabs.com).
Also, avoid writing recursive code because stack usage is typically non-deterministic with this type of code.
Some clever compilers/linkers, such as Keil's and IAR's, provide this information in the link map: for each function, the link map indicates the worst-case stack usage. However, these tools do not account for indirect calls (i.e. calls through function pointers) or assembly language routines. GCC has partial support, providing per-function stack usage but not a call graph. This information clearly enables you to better evaluate stack usage for each task. It is still necessary to add the stack space for a full CPU context, plus another full CPU context for each nested ISR (if the CPU does not have a separate stack to handle ISRs), plus whatever stack space those ISRs need. Again, allow for a safety net and multiply this value by some factor.
If your kernel monitors stack usage at run-time then it’s a good idea to display that information and keep an eye on your stacks while developing and testing the product. Stack overflows are common and can lead to some curious behaviors. In fact, whenever someone mentions that his or her application behaves “strangely,” insufficient stack size is the first thing that comes to mind.
Just so that we are on the same page, below is a description of what a stack overflow is. For the sake of discussion, it’s assumed here that stacks grow from high-memory to low-memory. Of course, the same issue occurs when the stack grows in the other direction. Refer to Figure 1.
Figure 1 – Stack Overflow
F1-(1) The CPU’s SP (Stack Pointer) register points somewhere inside the stack space allocated for a task. The task is about to call the function foo() as shown in Listing 1.
Listing 1 – Example of possible stack overflow
F1-(2) Calling foo() causes the CPU to save the return address of the caller onto the stack. Of course, that depends greatly on the CPU and the compiler.
F1-(3) The compiler then adjusts the stack pointer to accommodate for local variables. Unfortunately, at this point, we overflowed the stack (the SP points outside the storage area assigned for the stack) and just about anything foo() does will corrupt whatever data is beyond the stack base. In fact, depending on the code flow, the array might never be used, in which case the problem would not be immediately apparent. However, if foo() calls another function, there is a high likelihood that this will cause something outside the stack to be touched.
F1-(4) So, when foo() starts to execute code, the stack pointer has an offset of 48 bytes from where it was prior to calling foo() (assuming a stack entry is 4 bytes wide).
F1-(5) We typically don’t know what resides here. It could be the stack of another task, it could be variables, data structures or an array used by the application. Overwriting whatever resides here can cause strange behaviors: values computed by another task may not be what you expected and could cause decisions in your code to take the wrong path, or your system may work fine under normal conditions but then fail. We just don’t know and it’s actually quite difficult to predict. In fact, the behavior can change each time you make changes to your code.
There are a number of techniques that can be used to detect stack overflows. Some make use of hardware while some are performed entirely in software. As we will see shortly, having the capability in hardware is preferable since stack overflows can be detected nearly immediately as they happen, which can help avoid those strange behaviors and aid in solving them faster.
Hardware stack overflow detection mechanisms generally trigger an exception handler. The exception handler typically saves the current PC (Program Counter) and possibly other CPU registers onto the current task’s stack. Of course, because the exception occurs when we are attempting to access data outside of the stack, the handler would overwrite some variables or another stack in your application; assuming there is RAM beyond the base of the overflowed stack.
In most cases the application developer will need to decide what to do about a stack overflow condition. Should the exception handler place the embedded system in a known safe state and reset the CPU or simply do nothing? If you decide to reset the CPU, you might figure out a way to store the fact that an overflow occurred and which task caused the overflow so you can notify a user upon reset.
Some processors (unfortunately very few of them) have simple yet highly effective stack pointer overflow detection registers. This feature is, however, available on processors based on the Armv8-M CPU architecture. When the CPU’s stack pointer goes below (or above, depending on stack growth) the value set in this register (let’s call it the SP_Limit register), an exception is generated. The drawing in Figure 2 shows how this works.
Figure 2 – Using a Stack Limit Register to Detect Stack Overflows
F2-(1) The SP_Limit register is loaded by the context switch code of the kernel when the task is switched in.
F2-(2) The location SP_Limit points to could be at the very base of the stack or, preferably, at a location that leaves the exception handler enough room to save the CPU registers it needs on the offending stack.
F2-(3) As the stack grows, if the SP register ever goes below the SP_Limit, an exception is generated. As we’ve seen, when your code calls a function and uses local variables, the SP register can easily be positioned outside the stack upon entry to a function. One way to reduce the likelihood of this happening is to move the SP_Limit further away from the stack base address.
The Micrium OS Kernel was designed from the get-go to support CPUs with a stack limit register. Each task contains its own value to load into the SP_Limit and this value is placed in the Task Control Block (TCB). The value of the SP_Limit register used by the CPU’s stack overflow detection hardware needs to be changed whenever the Micrium OS Kernel performs a context switch. The sequence of events to do this must be performed in the following order:
1- Set SP_Limit to 0. This ensures the stack pointer is never below the SP_Limit register. Note that I assumed here that the stack grows from high memory to low memory but the concept works in a similar fashion if the stack grows in the opposite direction.
2- Load the SP register.
3- Get the value of the SP_Limit that belongs to the new task from its TCB. Set the SP_Limit register to this value.
The SP_Limit register provides a simple way to detect stack overflows.
Arm Cortex-M processors are typically equipped with an MPU (Memory Protection Unit), which monitors the address bus to see whether your code is allowed to access certain memory locations or I/O ports. MPUs are relatively simple devices but can be complex to set up. However, if all you want to do is detect stack overflows, an MPU can be put to good use without a great deal of initialization code. The MPU is already on your chip, meaning it’s available at no extra cost to you, so why not use it? In the discussion that follows, we’ll set up an MPU region that says “if you ever write to this region, the MPU will trigger a CPU exception.”
One way to setup your stacks is to locate ALL of the stacks together in contiguous memory, starting the stacks at the base of RAM, and locating the C stack as the first stack at the base of RAM as shown in Figure 3.
Figure 3 – Locating Task Stacks Contiguously
As the kernel context switches between tasks, it moves a single MPU ‘protection window’ (I will call it the “RED Zone”) from task to task as shown in Figure 4. Note that the RED Zone is located below the base address of each of the stacks. This allows you to make use of the full stack area before the MPU detects an overflow.
Figure 4 – Moving the RED Zone During Context Switches
As shown, the RED Zone can be positioned below the stack base address. The size of the RED Zone depends on a number of factors. For example, the size of the RED Zone on the MPU of a Cortex-M CPU must be a power of 2 (32, 64, 128, 256, etc.). Also, stacks must be aligned to the size of the RED Zone. On processors based on the Armv8-M architecture, this restriction has been removed and MPU region size granularity is 32 bytes. However, with the Armv8-M, you’d use its stack limit register feature. The larger the RED Zone, the more likely we can detect a stack overflow when a function call allocates large arrays on the stack. However, locating RED Zones below the stack base address has other issues. For one thing, you cannot allocate buffers on a task’s stack and pass that pointer to another task because it’s possible that the allocated buffer would be overlaid by the RED Zone thus causing an exception. However, allocating buffers on a task’s stack is not good practice anyway, so getting slapped by an MPU violation is a kind punishment.
You may also ask: “Why should the C stack be located at the start of RAM?” Because in most cases, once multitasking has started, the C stack is never used again, and that RAM is otherwise lost. Overflowing into RAM that is no longer used might not be a big deal but, technically, it should not be allowed. Having the C stack there simply gives us otherwise-unused RAM in which to store the CPU registers that are stacked on the offending task’s stack during an MPU exception sequence.
If you are not able to allocate storage for your task stacks in contiguous memory as outlined in the previous section, then we need to use the MPU differently. What we can do here is reserve a portion of RAM towards the base of each task’s stack and generate an exception if anything is written in that area. The kernel reconfigures the MPU during each context switch to protect the new task’s stack. This is shown in Figure 5.
Figure 5 – Locating the RED Zone inside a Task’s Stack
Again, the size of the RED Zone depends on a number of factors. As previously discussed, for the MPU on a Cortex-M CPU (except for Armv8-M), the size must be a power of 2 (32, 64, 128, 256, etc.). Also, stacks must be aligned to the size of the RED Zone. The larger the RED Zone, the more likely we can detect a stack overflow when a function call allocates large arrays on the stack. However, in this case, the RED Zone takes away storage space from the stack because, by definition, a write to the RED Zone will generate an exception and thus cannot be performed by the task. If the size of a stack is 512 bytes (i.e. 128 stack entries for a 32-bit wide stack), a 64-byte RED Zone would consume 12.5% of your available stack and thus leave only 448 bytes for your task, so you might need to allocate larger stacks to compensate.
As shown in Figure 6, if a function call ‘skips over’ the RED Zone by allocating local storage for an array or a large data structure, then the code might never write into the RED Zone and thus bypass the stack overflow detection mechanism altogether. In other words, if the RED Zone is too small, foo() might use only i and array, touching nothing that happens to overlap the RED Zone.
Figure 6 – Bypassing the RED Zone
To avoid this, local variables and arrays should always be initialized as shown in Listing 2.
Listing 2 – Initializing local variables to better detect stack overflows
The Micrium OS Kernel has a built-in RED Zone stack overflow detection mechanism, but it’s implemented in software. This software-based approach is enabled by setting OS_CFG_TASK_STK_REDZONE_EN to DEF_ENABLED in os_cfg.h. When enabled, the Micrium OS Kernel creates a monitored zone at the end of a task's stack, which is filled upon task creation with a special value. The actual value is not that critical; we used 0xABCD2345 as an example (but it could be anything). However, it’s wise to avoid values that could be used in the application, such as zero. The size of the RED Zone is defined by OS_CFG_TASK_STK_REDZONE_DEPTH. By default, the RED Zone is eight CPU_STK elements deep. The effectively usable stack space is thus reduced by 8 stack entries. This is shown in Figure 7.
The Micrium OS Kernel checks the RED Zone at each context switch. If the RED Zone has been overwritten or if the stack pointer is out-of-bounds the Micrium OS Kernel informs the user by calling OSRedzoneHitHook(). The hook allows the user to gracefully shutdown the application since at this point the stack corruption may have caused irreversible damage. The hook, if defined, must ultimately call CPU_SW_EXCEPTION() or otherwise stop the Micrium OS Kernel from proceeding with corrupted data.
Since the RED Zone is typically small, it’s ever so important to initialize local variables, large arrays or data structures upon entry of a function in order to detect the overflow using this mechanism.
The software RED Zone is nice because it’s portable across any CPU architecture. However, the drawback is that it consumes possibly valuable CPU cycles during a context switch.
Figure 7 – Software-based RED Zone
Although not actually an automatic stack overflow detection mechanism, determining the ideal size of a stack at run-time is highly useful and is a feature available in the Micrium OS Kernel. Specifically, you’d allocate more stack space than you anticipate the task will use, then monitor and possibly display the actual maximum stack usage at run-time. This is fairly easy to do. First, the task stack needs to be cleared (i.e. filled with zeros) when the task is created. You should note that a value other than zero could have been used. Next, a low-priority task (the statistics task in the Micrium OS Kernel) walks the stack of each created task, from the bottom towards the top, counting the number of zero entries. When the statistics task finds a non-zero value, the process is stopped and the usage of the stack can be computed (in number of stack entries used or as a percentage). From this, you can adjust the size of the stacks (by recompiling the code) to allocate a more reasonable value (either increasing or decreasing the amount of stack space for each task). For this to be effective, however, you need to run the application long enough and under stress for the stack to grow to its highest value. This is illustrated in Figure 8.
Figure 8 – Determining Actual Stack Usage at Run-Time
The Micrium OS Kernel provides a function that determines stack usage of a task at run-time, OSTaskStkChk() and, in fact, the Micrium OS Kernel’s statistics task, OS_StatTask() calls this function repeatedly for each task created every 1/10th of a second. This is what µC/Probe displays as described in my other article: See “Exploring the Micrium OS Kernel Built-In Performance Measurements” in the Blog section of the Silicon Labs website (www.silabs.com).
This blog described different techniques to detect stack overflows. Stack overflows can occur in both single- and multi-threaded environments. Even though we can detect overflows, there is typically no way to safely continue execution after one occurs and, in many cases, the only recourse is to reset the CPU or halt execution altogether. However, before taking such a drastic measure, it’s recommended that your code bring your embedded system to a known and safe state. For example, you might turn off motors and actuators, open or close valves, and so on. Even though you are in a shutdown state, you might still be able to use kernel services to perform this work.
The Micrium OS Kernel has a rich set of built-in instrumentation that collects real-time performance data. This data can be used to provide invaluable insight into your kernel‑based application, allowing you to have a better understanding of the run-time behavior of your system. Having this information readily available can, in some cases, uncover potential real-time programming errors and allow you to optimize your application.
In Part I of this post we examined, via µC/Probe, a number of the statistics built into the Micrium OS Kernel, including those for stack usage, CPU usage (total and per-task), context-switch counts, and signaling times for task semaphores and queues.
In this post, we'll examine the kernel’s built-in ability to measure interrupt-disable and scheduler lock time on a per‑task basis. Once again, we used µC/Probe to display, at run‑time, these values.
Kernels often need to disable interrupts to manage critical sections of code. In fact, it’s often useful, if not necessary, to get a sense of just how much time interrupts are disabled in an application. Micriµm added this capability in one of its utility modules, µC/CPU, which is provided with the Micrium OS Kernel. Disabling and enabling interrupts is the fastest and most efficient way to ensure that code is accessed atomically. However, this must be done with great care so as not to overly impact the responsiveness of your application.
Throughout Micriµm’s code, you will notice the sequence as shown in Listing 1.
Listing 1, Protecting a Critical Section of Code
The ‘:’ indicates a section of code that executes. The actual sequence is unimportant for this discussion. CPU_SR_ALLOC() is a macro that creates a local variable called cpu_sr. cpu_sr is used to save the current state of the CPU’s interrupt disable flag, which is typically found in the CPU’s status register, thus the ‘sr’ in the name.
CPU_CRITICAL_ENTER() is also a macro, and it’s used to save the state of the CPU interrupt disable flag by placing the current value of the status register in cpu_sr before disabling further interrupts. CPU_CRITICAL_EXIT() simply restores the state of the status register (from the saved cpu_sr).
You enable interrupt disable time measurement by setting a configuration #define called CPU_CFG_INT_DIS_MEAS_EN to DEF_ENABLED. In this case, the macros CPU_CRITICAL_ENTER() and CPU_CRITICAL_EXIT() are each automatically altered to include a call to a function. CPU_CRITICAL_ENTER() will call CPU_IntDisMeasStart() immediately upon disabling interrupts and CPU_CRITICAL_EXIT() will call CPU_IntDisMeasStop() just before re-enabling interrupts. These functions are presented in Listings 2 and 3, respectively.
Listing 2, Interrupt Disable Time Measurement – Start Measurement Function
L2-(1) Here we keep track of how often interrupts are disabled. This value is not actually involved in the interrupt disable time measurement.
L2-(2) A counter is used to track nesting of CPU_CRITICAL_ENTER() calls. In practice, however, it’s rare, if ever, that we actually nest this macro. However, the code is included in case your application needs to do this. That being said, it’s not recommended that you do.
Here we read the value of a free-running counter by calling CPU_TS_TmrRd(). If you enable interrupt disable time measurements, you (or the implementer of the port for the CPU you are using) will need to setup a free-running counter that would ideally be a 32‑bit up counter that is clocked at the same rate as the CPU. The reason a 32-bit counter is preferable is because we use Time Stamping (thus the ‘TS’ acronym) elsewhere for delta‑time measurements and a 32-bit counter avoids having to account for overflows when measuring relatively long times. Of course, you can also use a 16-bit counter but most likely it would be clocked at a slower rate to avoid overflows. The value read from the timer is stored in a global variable (global to the µC/CPU module) called CPU_IntDisMeasStart_cnts.
L2-(3) We then increment the nesting counter.
Listing 3, Interrupt Disable Time Measurement – End Measurement Function
L3-(4) A local variable of data type CPU_TS_TMR is allocated and is used in the delta-time measurement. This variable typically matches the word width of the free-running counter and is always an unsigned integer.
L3-(5) We decrement the nesting counter and, if this is the last nested level, we proceed with reading the free-running counter in order to obtain a time-stamp of the end of interrupt disable time.
L3-(6) The time difference is computed by subtracting the start time from the end time (i.e. stop time). The calculation always yields the proper delta time because the free-running counter is an up counter and, we used unsigned math.
L3-(7) We keep track of two separate interrupt disable time peak detectors. One peak detector is used to determine the maximum interrupt disable time of each task (i.e. CPU_IntDisMeasMaxCur_cnts) and is reset during each context switch. CPU_IntDisMeasMax_cnts is the global maximum interrupt disable time and is never reset.
Interrupt disable time is measured in free-running counter counts. To get the execution time in µs, you would need to know the clocking rate of the free-running counter. For example, if you get 1000 counts and the counter is clocked at 100 MHz then the interrupt disable time would be 10 µs.
As you’d expect, the code to measure interrupt disable time adds measurement artifacts that should be accounted for. This overhead is found in the variable CPU_IntDisMeasOvrhd_cnts and is determined during initialization. However, the overhead is NOT accounted for in the code shown above. CPU_IntDisMeasOvrhd_cnts also represents free-running counter counts. Depending on the CPU architecture, the measurement overhead (i.e. calling CPU_IntDisMeasStart() and CPU_IntDisMeasStop()) typically amounts to between 50 and 75 CPU instructions.
µC/Probe is able to display the interrupt disable time on a per-task basis for the Micrium OS Kernel when you select to display the kernel awareness capabilities. Also, µC/Probe takes care of the conversion of counts to µs and thus, all values are displayed in µs as shown in Figure 1.
Each row represents the value for one task. Not shown in the figure is that each task has a name, so you can of course know which time is associated with which task. The highlighted row clearly shows that one task disables interrupts longer than the average. This could be a problem, or it might be within expected limits; either way, it’s something the developer might want to investigate.
The per-task interrupt disable time came in handy recently, as it helped us discover that a driver was disabling interrupts for over 2,500 microseconds! This value stuck out like a sore thumb, so we were able to quickly identify and correct the issue. Without this measurement, I’m not sure we would have been able to identify and correct the problem as quickly as we did.
Figure 1 – Per-Task CPU Interrupt Disable Times
In order to avoid long interrupt disable times, the Micrium OS Kernel locks the scheduler in some of its own services, and it also allows you to lock the scheduler from your application. This has the same effect as temporarily making the current task the highest-priority task. However, while the scheduler is locked, interrupts are still accepted and processed.
As shown in Figure 2, the Micrium OS Kernel measures the maximum scheduler lock time on a per‑task basis. Again, not shown in the figure is that there is a name associated with each task allowing you to know which time is associated with what task.
The values are displayed in microseconds and for most tasks shown in this example, the scheduler never gets locked. However, the Micrium OS Kernel’s timer task locks the scheduler for close to 15 microseconds. In my example, I created 20 soft-timers and selected (a configuration option) to lock the scheduler to ensure that timers are treated atomically when they are updated. If mutual exclusion semaphores are enabled when you use the Micrium OS Kernel then this is the mechanism used to gain exclusive access to soft timers.
As a general rule, you should avoid locking the scheduler in your application but, if you do so, try to lock it for shorter periods than what the Micrium OS Kernel does.
Figure 2 – Per-Task Scheduler Lock Time
In this post we examined, on a per‑task basis, statistics built into the Micrium OS Kernel for measuring interrupt disable time as well as scheduler lock time. These metrics can be invaluable in determining whether your embedded application satisfies some of its requirements. With µC/Probe, you get to see performance metrics and kernel status information live.
The Micrium OS Kernel has a rich set of built-in instrumentation that collects real-time performance data. This data can be used to provide invaluable insight into your kernel‑based application, allowing you to have a better understanding of the run-time behavior of your system. Having this information readily available can, in some cases, uncover potential real-time programming errors and allow you to optimize your application. In this two-part series of posts, we will explore the statistics yielded by the kernel's instrumentation, and we'll also consider a unique way of visualizing this information.
Many IDEs provide, as part of their debugger, what is called a Kernel Awareness plug-in. Kernel Awareness allows you to see the status of certain kernel data structures (mostly tasks) using a tabular format. A notable problem with IDE-based Kernel Awareness plug-ins is that the information is displayed only when you stop the target (i.e. when you reach a breakpoint or when you step through code). This is quite useful and often sufficient if you are looking at such things as maximum stack usage or how often a task executed, but you only get a snapshot of an instant in time. This is similar to taking a picture versus watching a movie. However, there are situations where you simply cannot stop the target and examine kernel or other variables: engine control, conveyor belts, ECG monitoring, networking communications and more. In other words, such systems have to be monitored while the target is running.
Micrium offers a powerful tool called µC/Probe which is a target-agnostic, Windows-based application that allows you to display or change the value of variables on a running target with little or no CPU intervention. Although µC/Probe can be used with systems that do not incorporate Micrium's kernels, the tool includes Kernel Awareness screens which provide a ‘dashboard’ or ‘instrument panel’ for µC/OS-II, µC/OS-III and the Micrium OS Kernel. The ‘Task’ window of the Kernel-Awareness screen is shown in Figure 1.
Figure 1 – Micrium OS Kernel Awareness in µC/Probe
The Task Window is one of the many views into performance and status data collected by the Micrium OS Kernel and displayed by µC/Probe. Specifically, µC/Probe exposes status information for Semaphores, Mutexes, Event Flags, Message Queues, and more as shown in Figure 2.
Figure 2 – Additional views of the Micrium OS Kernel status in µC/Probe
With µC/Probe, you can display or change the value of any application variable (i.e. your variables), at run-time. Values can be represented numerically, using gauges or meters, bar graphs, charts, using virtual LEDs, on an Excel spreadsheet, using TreeView and through other graphical components. Kernel variables can be similarly accessed, and the Kernel Awareness screens provide a pre-populated interface for reading these variables. In this post and the next, we'll consider several of the fields in the 'Task' window of the Micrium OS Kernel Awareness screen. We'll also look into the instrumentation underlying the fields.
Each row in the 'Task' window represents a task managed by the Micrium OS Kernel. The task name is shown in the third column and is obviously quite useful to have as it associates the data in a row with a specific task that was created. The name is what was specified during the OSTaskCreate() call.
As you may know, a real-time kernel like the Micrium OS Kernel requires that you allocate a stack for each task in your application. The Micrium OS Kernel has an optional built-in task called the statistics task which runs every 100 ms (the frequency is configurable at compile-time) and calculates how much stack each task has used up. The information collected by the statistics task is stored in Task Control Blocks (or TCBs), and µC/Probe displays that information as shown in Figure 3.
The 3rd column indicates the total available stack size of each task. The value is in stack units, so to find out how many bytes that represents, you simply multiply the value shown by the size of a stack unit for the CPU architecture you use. In this case, a stack unit was 4 bytes wide so, for many of the tasks, the total available stack size is 400 bytes.
The 1st column (#Used) indicates worst case stack usage of the task (again in stack units).
The 2nd column shows how many stack units are left, i.e. the total size minus what’s used:
#Free = Size - #Used
Figure 3 – µC/Probe Task Stacks View
The bar graph is probably the most useful representation of stack usage since, at a glance, you can tell whether or not application tasks have sufficient stack space. In fact, the bar graph is color coded. If stack usage for a given task is below 70%, the bar graph is GREEN. If stack usage is between 70% and 90%, the bar graph is displayed in YELLOW. Finally, if the stack usage exceeds 90%, the bar graph changes to RED. You should thus consider increasing the size of stack for tasks that have YELLOW bar graphs and for sure increase the size for any tasks that have RED bar graphs.
Stack overflows are the number one issue you're likely to encounter when you develop kernel‑based applications. If a task's stack appears RED in µC/Probe, you stand a good chance of experiencing strange behaviors because an overflow might alter variables that are located just above the top of the stack (lower memory locations). To avoid such problems, you should increase the size of any RED stacks.
The Micrium OS Kernel computes overall CPU usage and updates this value every time the statistics task runs. µC/Probe displays this information within the Kernel Awareness window as shown in Figure 4.
Figure 4 – µC/Probe Global CPU Usage
The gauge shows two things: the current CPU usage as a percentage (needle) and the peak CPU usage (small RED triangle). As shown, peak CPU usage was a tad above 10%. The CPU usage of the idle task is not counted in the calculation; otherwise, the gauge would always show 100% and that would not be very useful. The ‘Total CPU Usage’ gauge considers the execution of tasks as well as ISRs during the measurement period.
The chart on the right actually shows CPU usage over the past 60 seconds. The chart is updated every second and scrolls to the left. The most recent CPU usage is thus displayed on the far right. The RED line shows peak CPU usage and the BLUE trace shows current CPU usage.
The Micrium OS Kernel also computes the execution time of each task. This is then used to figure out the relative CPU usage of each task. µC/Probe displays this information as a bar graph as shown in Figure 5.
The field at the bottom is the CPU usage of the Micrium OS Kernel’s idle task. Since the CPU of the depicted system is not overly busy, the idle task consumes over 90% of the CPU time. The idle task is typically a good place to add code to put the CPU in low power mode, as is often required for battery powered applications.
The BLACK vertical bar with the small triangle pointing upwards represents the peak CPU usage of that task. The peak usage is actually tracked by the Micrium OS Kernel and can be a useful indicator about the behavior of your tasks. The per-task CPU usage statistic gives you a good idea of where your CPU is spending its time. This can help confirm your expectations and possibly help you determine task priorities.
You should note that the CPU usage for each task also includes time spent in ISRs while the task is running. In other words, the Micrium OS Kernel doesn’t subtract out ISR time while a task is running as this would add too much overhead in the calculation.
Figure 5 – µC/Probe Per-Task CPU Usage
You will notice that adding all the values in the figure yields a total of only 97.06% rather than 100%. The reason for this is that µC/Probe doesn’t update all the values at exactly the same time.
As shown in Figure 6, each task contains a counter that keeps track of how often it actually had control of the CPU. This feature is helpful to see if a task executes as often as you expect.
A task counter that is zero (0) or not incrementing indicates that the task doesn’t get a chance to execute. This can be normal if the event that the task is waiting for never occurs. For example, a task may be waiting for an Ethernet packet to arrive. If the cable is not connected, then the task counter for that task will not increment.
Another situation where the task context switch counter would not increment is as shown in Figure 7. Here, an ISR is posting messages to a task. Unfortunately, the ISR is posting to the wrong queue and thus the recipient never receives those messages.
Figure 6 – µC/Probe Per-Task CtxSwCtr
Figure 7 – ISR Posting to the Wrong Message Queue
Semaphores are typically used as a signaling mechanism to notify a task that an event occurred. An event can be generated by an ISR or another task.
The Micrium OS Kernel has a built-in semaphore for every task and thus an ISR or another task can directly signal a task. This greatly simplifies system design, reduces RAM footprint and is much more efficient than the conventional method of allocating a separate kernel object for this purpose.
Figure 8 shows that the task semaphore consists of a counter that indicates how often the task has been signaled since it was last pended on.
A zero value in the ‘Ctr’ column indicates either that the task semaphore was not signaled or that the task received the signal and already processed the event. A non-zero (and possibly incrementing) count would indicate that you are signaling faster than you can process the signals. You might need to increase the priority of the signaled task or investigate whether there is another cause.
The ‘Signal Time’ indicates how long it took (in microseconds) between the occurrence of the signal and when the task woke up to process it.
The ‘Signal Time (Max)’ is a peak detector of the ‘Signal Time’. In other words, this column shows the worst case response time for the signal.
Figure 8 – Task Semaphores
Message queues are typically used as a way to notify a task that an event has occurred as well as provide additional information beyond the fact that the event occurred. A message can be sent from an ISR or another task. The message queue is implemented as a FIFO (First-In-First-Out) or a LIFO (Last-In-First-Out).
The Micrium OS Kernel has a built-in message queue for every task and thus an ISR or another task can directly send messages to a specific task in your application. This greatly simplifies system design, reduces RAM footprint and is much more efficient than the conventional method of allocating a separate message queue for this purpose.
Figure 9 shows that the task message queue contains 5 fields that are displayed by µC/Probe.
The 1st column indicates the current number of entries in the message queue. If messages are being consumed at an adequate rate then this value should typically be 0.
The 3rd column contains the total number of messages that can be queued at any given time. In other words, the size of the queue.
The 2nd column indicates the maximum message count observed for the queue. If the value ever reaches the number in the 3rd column then your application is not able to handle messages as fast as they are produced. You might consider either raising the priority of the receiving task or increasing the size of the message queue to avoid losing any messages.
Figure 9 – Task Message Queue
The ‘Msg Sent Time’ indicates how long it took (in microseconds) between when the message was posted and when it was received and processed by the task.
The ‘Msg Sent Time (Max)’ is a peak detector of the ‘Msg Sent Time’. In other words, this column shows the worst case response time for processing the message.
The example code I used didn’t make use of the task message queues.
In this post we examined, via µC/Probe, a number of the statistics built into the Micrium OS Kernel, including those for stack usage, CPU usage (total and per-task), context-switch counts, and signaling times for task semaphores and queues. We'll examine the kernel's built-in code for measuring interrupt-disable and scheduler-lock times in the next post.
Title: Benchmarking Bluetooth Mesh, Thread, and Zigbee Network Performance
Date: June 13 & 14, 2018
Duration: 1 hour
Speaker: Tom Pannell, senior director of marketing for IoT products at Silicon Labs
When it comes to mesh networking wireless connectivity, one size does not fit all. The Bluetooth, Thread, and Zigbee protocols each present unique advantages depending on the use case, and it’s important to understand how they perform in the key areas of power consumption, throughput, and scalability. The inner workings of these mesh networking technologies go beyond a list of key features, and Silicon Labs has released the industry’s first comprehensive network performance results based on large-scale, multicast testing of each of these stacks.
In this webinar Tom Pannell, senior director of marketing for IoT products, will provide an overview of this benchmarking and discuss how developers can make more informed choices when designing products for the IoT.
Title: Why the IoT Needs Upgradable Security
Date: May 9 & 10, 2018
Duration: 1 hour
Speaker: Lars Lydersen, senior director of product security at Silicon Labs
In this webinar, Lars Lydersen, senior director of product security at Silicon Labs, delves into the historical data of how adversary capability has evolved, and discusses how these can be extrapolated into the future. This understanding gives a necessary background to evaluate what security functionality is necessary in an IoT design. Even with advanced security functionality, there will always be unknown unknowns, and it will be necessary to secure against attacks and adversaries of the future. Lars will also discuss the typical properties of an attack that can be thwarted via updates, why enabling software updates is needed, and the consequences for IoT designs.
This webinar is now available to view on-demand. If you have any questions regarding the content, please post them here in our community.
Silicon Labs recently had the opportunity to speak with Larry Poon, chief operating officer of IMONT, a start-up software company taking a radical approach to connecting IoT devices by circumventing the cloud. Larry shared how IMONT’s interoperable software connects any type of device to other devices, regardless of the manufacturer. Graham Nice from Skelmir, one of IMONT’s key integration partners, joined our conversation to explain how companies are reacting to IMONT’s new IoT option for connectivity – and how he sees a potential move in the future away from the cloud.
So tell me about IMONT – what exactly do you offer?
We develop device connectivity software. If a company wants software to connect their devices to other devices, we can help them do so in a unique way.
We lower the barrier to entry and the ongoing operational costs of scaling out – we do this by being cloudless and hubless. We’re also much more secure, and we’re interoperable. For example, if a utility company wants to offer a smart home solution that includes devices from other manufacturers, they can connect them all using our software. Otherwise, they would have to use different apps to connect products from different manufacturers. By not using the cloud, we save a lot of money for certain customers, such as smart home operators. And obviously, if you don’t use the cloud, it’s more secure.
Can you tell me how your platform avoids using the cloud? And why is it more secure?
The software is mesh-based, and we do everything locally. So if we have to do any transaction or use analytics, we use the edge. That is a big advantage of our system - we never have to connect the device to the cloud. Also, when I say we have no hub, I mean any device in the configuration can be the hub – we don’t require a separate hub. All of the data is within each device itself; therefore, you don’t have to move anything to the cloud. But the cloud option is there because we have made it flexible enough with MQTT for cloud transmission, if a customer wants it.
You can offer this because of your software expertise, whereas a hardware company needs a hub, unless they write software for the edge?
That’s right. Let’s say Samsung, a device manufacturer, wants its products to connect to other devices in a smart home. Everyone wants choices, so it’s hard to find a home with all Samsung devices. In order for all of those devices to be connected, Samsung would typically create a hub, then use their cloud service to interoperate with the other manufacturers’ cloud services, which is not the most efficient way of doing it. But with our system, we’re already there, we’ve already written the code to connect manufacturers; therefore, we are able to avoid using the cloud and a hub.
How do you approach customers with your value proposition?
We’ve been around since August 2016 – so awareness is key right now. We’re a young company, small and lean. We’re knocking on the doors of anyone offering IoT systems, but we partner with companies like Silicon Labs to offer this solution to your customers, who could be looking for this type of solution. We also partner with implementation partners who can get this done for them.
Have you seen people searching for your type of solution, or are you educating people about the option?
It’s a little of both. Every time we talk to someone about it, they say exactly what you say – “oh, this is kind of novel, I never thought about it that way.” But then there’s a certain group of people who are beginning to say, “we don’t really need the cloud.” New articles are starting to crop up about cloudless approaches, but it’s just starting to get noticed. Anyone we end up talking to likes the idea once they hear it – but to go so far as say people are actively looking for a cloudless solution, we’re slowly getting there.
Is data an issue if you’re not using the cloud?
No, our customers can collect all of the data they want – we give them that flexibility, and they can move it to the cloud if they want.
So there’s no real drawback to moving away from the cloud?
No, we don’t think there is. People have no option but to move away from the cloud; data is too expensive.
Graham, tell me about the Java integration and how your companies work together?
Our company is turning 20 years old this year. We started out providing our virtual machine for running Java on set-top boxes in the German-speaking European Pay-TV market. Since then, our customers have deployed over 120 million devices using various iterations of that core virtual machine. We have a history of deploying predominantly in the digital TV space around the world.
In the past six years, we’ve worked in the IoT market, supporting Java-based IoT industry standards and proprietary solutions. In the case of IMONT, we had worked with one of the founders previously and he reached out to us to use our VM to host his new solution.
Since IMONT’s software runs on Java, our role is to help IMONT’s customers get up and running extremely quickly on various platforms and devices.
As a close partner, what is your impression of the market reaction to IMONT?
IMONT has a disruptive approach to deploying IoT. Everybody is all about the cloud, but the cloud has some significant drawbacks. For one, it’s horrendously expensive, and you have vast amounts of data constantly feeding up to the cloud, chewing up bandwidth. You also still have privacy concerns - a lot of consumers have an issue with their personal data being moved to the cloud. All of that data incurs costs to operators. The reaction IMONT is getting from service providers is, first, that it can’t be done. But then IMONT proves them wrong. Yes, it can be done, and when operators see the cost benefits, it becomes a very compelling proposition. There are a lot of people realizing that the cloud isn’t the way forward and edge computing makes more sense. IMONT provides the framework for edge computing, and hopefully we provide the vehicle to get their technology running on low-end devices, bringing the cost point down for service providers in the home. But it’s not just the home; industrial IoT deployments are a market for IMONT as well.
Larry, how did you start using Silicon Labs’ products?
Our partnership with D-Link strengthened our ties with Silicon Labs. D-Link offers a lot of devices built with Silicon Labs’ technology, so we started making our software work with Silicon Labs.
Where do you see IoT going in the next 5-8 years?
From our perspective, we see devices getting smarter than they already are, yielding greater power efficiency and eventually operating independently of the cloud. We also expect the number and types of IoT device deployments to continue to explode, but consumers are pushing for greater security and seamless connectivity, so we will see significant improvements in those areas, as well.