Hello,
I'm working on a fairly complex project on the EFR32MG12 that includes a proprietary radio, an external LoRa transceiver, and a number of internal procedures/tasks that I manage with Micrium OS.
Typically it works well, but when I stress the device with an intensive procedure, after a while (10-20 minutes) it gets stuck in the default handler. Unfortunately I cannot isolate the procedure/function that triggers this. I can only set a breakpoint in Default_Handler so that I know when it happens, but I cannot tell what is causing it. I've looked at the NVIC registers, but none of the NVIC_IABRx bits are set.
Any idea on how to debug this is really welcome.
Thank you,
Davide.
Default handler debug
Check if this helps:
https://www.silabs.com/community/mcu/32-bit/knowledge-base.entry.html/2014/05/26/debug_a_hardfault-78gc.html
Thank you delu,
I followed that post, but what I found is still not enough to solve the problem. The print of registers made with the proposed routine gives me:
HardFault:
SCB->CFSR 0x00000400
SCB->HFSR 0x40000000
SCB->MMFAR 0xe000ed34
SCB->BFAR 0xe000ed38
SP 0x20004a60
R0 0x200207a8
R1 0x00000000
R2 0x20020828
R3 0x20020801
R12 0x000012a7
LR 0x00023a55
PC 0x0002350c
PSR 0x81000000
CFSR has the IMPRECISERR bit set, and HFSR has the FORCED bit set. In reference [1], this kind of error is associated with an overclocked chip, but I don't think that is my case. I attach my init_mcu.c file here in case it helps.
Moreover, if I try to inspect what I have at the address close to the content of LR register (assuming the value is valid) I see this:
According to the map file, this code is part of the _free_r function of libc_nano (as a side note, I'm not using malloc or free in my own code, but I'm using mbedtls, which might use them).
[1] Debugging a HardFault on Cortex-M - https://www.iar.com/knowledge/support/technical-notes/debugger/debugging-a-hardfault-on-cortex-m/
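For anyone decoding a similar dump, here is a minimal host-side sketch that turns the CFSR and HFSR values into flag names, using the bit positions from the ARMv7-M architecture. The helper names (`decode_cfsr`, `decode_hfsr`, `bfar_is_valid`, `thumb_code_address`) are made up for illustration and are not part of any Silicon Labs or CMSIS API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag { uint32_t mask; const char *name; };

/* CFSR bit positions (MMFSR: 0-7, BFSR: 8-15, UFSR: 16-31). */
static const struct flag cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL" },    { 1u << 1,  "DACCVIOL" },
    { 1u << 3,  "MUNSTKERR" },   { 1u << 4,  "MSTKERR" },
    { 1u << 5,  "MLSPERR" },     { 1u << 7,  "MMARVALID" },
    { 1u << 8,  "IBUSERR" },     { 1u << 9,  "PRECISERR" },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR" },
    { 1u << 12, "STKERR" },      { 1u << 13, "LSPERR" },
    { 1u << 15, "BFARVALID" },   { 1u << 16, "UNDEFINSTR" },
    { 1u << 17, "INVSTATE" },    { 1u << 18, "INVPC" },
    { 1u << 19, "NOCP" },        { 1u << 24, "UNALIGNED" },
    { 1u << 25, "DIVBYZERO" },
};

/* HFSR bit positions. */
static const struct flag hfsr_flags[] = {
    { 1u << 1, "VECTTBL" }, { 1u << 30, "FORCED" }, { 1u << 31, "DEBUGEVT" },
};

/* Write the names of all set flags into out, separated by spaces. */
static void append_flags(uint32_t value, const struct flag *flags, size_t n,
                         char *out, size_t out_len)
{
    assert(out_len > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (value & flags[i].mask) {
            size_t used = strlen(out);
            snprintf(out + used, out_len - used, "%s%s",
                     used ? " " : "", flags[i].name);
        }
    }
}

void decode_cfsr(uint32_t cfsr, char *out, size_t out_len)
{
    append_flags(cfsr, cfsr_flags, sizeof cfsr_flags / sizeof cfsr_flags[0],
                 out, out_len);
}

void decode_hfsr(uint32_t hfsr, char *out, size_t out_len)
{
    append_flags(hfsr, hfsr_flags, sizeof hfsr_flags / sizeof hfsr_flags[0],
                 out, out_len);
}

/* BFAR only holds a fault address when BFARVALID (bit 15) is set. */
int bfar_is_valid(uint32_t cfsr) { return (cfsr >> 15) & 1u; }

/* A stacked LR carries the Thumb bit in bit 0; clear it for map-file lookup. */
uint32_t thumb_code_address(uint32_t lr) { return lr & ~1u; }
```

With the values from the dump, `decode_cfsr(0x00000400, ...)` gives IMPRECISERR and `decode_hfsr(0x40000000, ...)` gives FORCED. BFARVALID is clear, so the BFAR value printed above is not a fault address. With an imprecise bus fault the stacked PC can also be some instructions past the access that actually failed. One commonly suggested debugging aid on Cortex-M3/M4 is to set the DISDEFWBUF bit in the Auxiliary Control Register so that such faults become precise, at some cost in performance.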
I don't know if it makes sense, but I found some references online where newlib-nano is declared not thread safe (for instance this: https://mcuoneclipse.com/2017/07/02/using-freertos-with-newlib-and-newlib-nano/). Now I would like to see whether using a thread-safe version of it solves the problem, but that is not so easy. In the project I already have a thread-safe version of malloc (sl_malloc, provided by silabs). However, I use a lot of printf, memcmp, memcpy, memmove and some time.h functions in the project, and if these functions call malloc (and not sl_malloc), I cannot make them thread safe.
Should I replace all the calls to memcpy with some kind of my_memcpy, where my_memcpy is made thread safe with a CORE_CRITICAL_SECTION()? Or is there a better/cleaner way?
You can try this. Since you're using another library that is not thread safe, I am not sure if there is anything else you can do, short of using a thread safe library.
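As an alternative to wrapping every call, newlib documents two hooks, `__malloc_lock` and `__malloc_unlock`, that its allocator calls around heap operations and that an application can override. The sketch below is a host-side model of that idea, not tested on the EFR32. `heap_lock`, `heap_unlock`, `port_enter_critical` and `port_exit_critical` are hypothetical names, and the two `port_` functions only simulate an interrupt mask. On target they would be replaced by the platform's critical-section or mutex calls. It also assumes the newlib-nano build in use actually calls the hooks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the interrupt mask (assumption: on target this would
 * be the real critical-section API of the platform). */
static int irq_disabled;

static int port_enter_critical(void)
{
    int prev = irq_disabled;   /* remember the state before masking */
    irq_disabled = 1;
    return prev;
}

static void port_exit_critical(int prev)
{
    irq_disabled = prev;       /* restore whatever state we started from */
}

/* newlib may take the lock again while already holding it, so the lock
 * has to nest: only the outermost unlock restores the interrupt state. */
static unsigned lock_depth;
static int saved_irq_state;

void heap_lock(void)
{
    int prev = port_enter_critical();
    if (lock_depth++ == 0) {
        saved_irq_state = prev;
    }
}

void heap_unlock(void)
{
    assert(lock_depth > 0);
    if (--lock_depth == 0) {
        port_exit_critical(saved_irq_state);
    }
}

/* On target, the newlib hooks would simply forward to the functions above:
 *
 *   void __malloc_lock(struct _reent *r)   { (void)r; heap_lock(); }
 *   void __malloc_unlock(struct _reent *r) { (void)r; heap_unlock(); }
 */
```

Note that memcpy, memcmp and memmove do not allocate memory and only touch the buffers passed to them, so they should not need a wrapper. The heap users are more likely printf-style functions and any library code that calls malloc/free directly.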
Can you check the call hierarchy to see what is calling this function?
I isolated the problem: it happens during the computation of the MD5 (done with the mbedtls library) of a large region of FLASH.
There was a problem in the application logic that led the device to keep calculating the MD5 in a kind of while(1) loop. This was done in a low-priority task, which was probably interrupted hundreds of times during the calculation.
I worked around the problem by changing the application logic so that the MD5 is calculated only once in the procedure.
Honestly, I don't know if it makes sense for me to go further. The problem is related to (or at least it happens in) a standard library (newlib-nano) when it is used by another third-party library (mbedtls). For now I accept patching it with the workaround, and if it causes further problems I'll continue the investigation.