I have a simple situation based on the Server / Client Thread example. I re-created the server / client example as two Thread projects built from scratch. I copied the emberAfNetworkStatusCallback() implementation from both examples, but I rearranged the stateEventHandler() implementation for my custom product. The "server" also acts as the commissioner; steering is set to EMBER_JOINING_ALLOW_ALL_STEERING and a simple join key is used. The "server" Router forms the network correctly, and the "client" Router joins it correctly.
The problem begins after powering off the "server": the "client" goes to the EMBER_JOINED_NETWORK_NO_PARENT network status. If I then reset the network and try to re-join it (after powering the "server" back on), the join fails with reason EMBER_JOIN_FAILURE_REASON_SECURITY (0x04). I also tried resetting the network and re-forming it on the "server", without success.
I don't know the source of the problem. If I call emberResetNetwork() on both routers, they should start from a clean state, right? The Thread API does not offer anything else to control join permission. Sometimes, after resetting the network on the "server" several times, the "client" can join the network, but why? What is the right behaviour for a node after it goes to the EMBER_JOINED_NETWORK_NO_PARENT status? Should it keep trying to attach or resume the network continuously? Should I reset the network only when joining a totally new network?
I probably found the cause of the stack instability. I re-created the same projects with the IAR toolchain and they work without any problems. I would really appreciate confirmation from Silicon Labs about GCC validation of the Thread stack. Anyway, I'll run some more tests.
It's not a compiler problem. The stack is totally unstable. Its behaviour is unpredictable.
Leader: MGM12P22F1024GE - Tried GCC v7.2.1 and IAR 8.32
Router (client): MGM12P22F1024GA - Tried GCC v7.2.1 and IAR 8.32
I also tried the client example on the SLWSTK6000A demo boards.
The plugins and stack configuration are the same as in the server/client example. I only customized the hwconf and the serial ports used.
Before flashing any module I do a full flash erase.
- I read this on the Thread 2.10 API reference page on the Silicon Labs site: "To forget the network and return to a status of EMBER_NO_NETWORK, please read cautions for emberResetNetworkState()" (in the few lines of the emberJoinNetwork() description). I also read these few, enigmatic words in the emberResetNetwork() description: "This function erases the network state stored in nonvolatile memory after which the network status will be EMBER_NO_NETWORK. This function should not be called to rejoin a former network; use emberResumeNetwork() instead. There may be difficulties joining a former network after resetting the network state, due to security considerations."
What does that mean? What security considerations? Basically, what should I do to leave a network and join another one? Or to leave, join another one, and then re-join the first one? A detailed description of how the Thread stack behaves here would be appreciated.
- I tried changing the network ID on both the leader (the router calling emberFormNetwork()) and the client (emberJoinNetwork()). Same behaviour: the client's join fails with reason 0x02, then after some retries it joins.
- I tried powering the client router off and on; it resumed the network correctly. Then I tried powering the leader off and on. Only after about a minute does the client router go to the EMBER_JOINED_NETWORK_ATTACHING network state and then re-attach to the leader correctly. That seems OK.
- I tried switching the client router off and on again. It resumes the network, but it cannot receive the leader's advertisements. They seem to be on the same network (same name, channel, PAN ID), but the leader's advertisement is not received. Why? I also tried resetting the module and resuming the network again, without any result. If I reset the network I will lose the possibility to join that network again. After waiting some minutes, the client began to receive the leader's advertisements again. Why? Is it a known delay or random behaviour?
- I flashed another client router; the join fails with code 0x02. I tried to join as an end device, same result. It joined the network only after some minutes of retries (one retry every 5 seconds). In the meantime the client router lost its attachment once. Why? All devices are within 3 meters of each other.
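Going by the API reference text quoted in the first point above, a node that wants to get back onto a network it was on before would resume rather than reset, and a reset is only for forgetting a network. Below is a minimal sketch of that decision, not code from the example apps. NodeState, RejoinAction and choose_rejoin_action() are hypothetical names invented for illustration, the 17-byte ID buffer is an arbitrary choice, and the comments only name the calls from the reference that each outcome would presumably map to.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical application-side record of what the node has stored.
 * This is NOT a stack type; it only exists for this sketch. */
typedef struct {
  bool has_saved_network;     /* network state present in nonvolatile memory */
  char saved_network_id[17];  /* arbitrary buffer size, NUL-terminated */
} NodeState;

/* Hypothetical outcomes; the comments name the calls quoted from the
 * API reference that each one would presumably map to. */
typedef enum {
  REJOIN_JOIN,            /* nothing stored: emberJoinNetwork() */
  REJOIN_RESUME,          /* same network as before: emberResumeNetwork() */
  REJOIN_RESET_THEN_JOIN  /* different network: emberResetNetworkState(), then join */
} RejoinAction;

/* Decide how to get onto target_network_id given what is stored. */
RejoinAction choose_rejoin_action(const NodeState *node,
                                  const char *target_network_id)
{
  assert(node != NULL && target_network_id != NULL);

  if (!node->has_saved_network) {
    return REJOIN_JOIN;
  }
  if (strcmp(node->saved_network_id, target_network_id) == 0) {
    /* The reference says not to reset in order to rejoin a former network. */
    return REJOIN_RESUME;
  }
  return REJOIN_RESET_THEN_JOIN;
}
```

With this split, a reset happens only when the node is deliberately moving to a different network, which is the one case the quoted text does not warn against.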
In this forum there are very few posts about Thread. Is anyone using it without problems? I think there is a major bug somewhere, maybe in the stack configuration created by the App Builder for the MGM12P.
I found another "bug" in the example. The server (and also the client) code does not handle the EMBER_JOINED_NETWORK_NO_PARENT state and asserts in that case, on the assumption that the server should never be in that state. BUT the server does go into that state, certainly when there are more routers in the same network. After powering routers off and on a few times, can it lose its parent? Routers have parents too. That state must be handled with an emberAttachToNetwork() call.
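A minimal sketch of what handling that state instead of asserting could look like. It is a standalone mapping from network status to the next step, not the example's stateEventHandler(). NetStatus, NetAction and action_for_status() are local stand-ins invented here, not the stack's own types, and the entries for the saved-network, joining and attached cases are assumptions about how an application might want to react.

```c
#include <assert.h>

/* Local stand-ins for the network statuses discussed in this thread.
 * They are NOT the stack's EmberNetworkStatus values. */
typedef enum {
  NET_NO_NETWORK,
  NET_SAVED_NETWORK,
  NET_JOINING,
  NET_JOINED_ATTACHING,
  NET_JOINED_ATTACHED,
  NET_JOINED_NO_PARENT
} NetStatus;

/* Hypothetical next steps for the application state machine. */
typedef enum {
  ACT_FORM_OR_JOIN,  /* nothing stored: form (server) or join (client) */
  ACT_RESUME,        /* stored network: would map to emberResumeNetwork() */
  ACT_WAIT,          /* stack is busy joining or attaching: do nothing yet */
  ACT_RUN_APP,       /* attached: normal application traffic */
  ACT_ATTACH         /* parent lost: would map to emberAttachToNetwork() */
} NetAction;

/* Map a status to the next step, with no assert on NO_PARENT. */
NetAction action_for_status(NetStatus status)
{
  switch (status) {
    case NET_NO_NETWORK:       return ACT_FORM_OR_JOIN;
    case NET_SAVED_NETWORK:    return ACT_RESUME;
    case NET_JOINING:          /* fall through */
    case NET_JOINED_ATTACHING: return ACT_WAIT;
    case NET_JOINED_ATTACHED:  return ACT_RUN_APP;
    case NET_JOINED_NO_PARENT: return ACT_ATTACH;
  }
  /* Only reached for a value outside the enum. */
  assert(0 && "unknown network status");
  return ACT_WAIT;
}
```

In a real handler the ACT_ATTACH case would probably be retried on a timer rather than called once, since the parent may stay unreachable for a while.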
The server and client sample apps are a somewhat simple example of communication over a Thread network. The application state machine is definitely not robust enough for all situations. So, the applications of these examples would have to be redesigned to fit your needs. A network capture would better show what is happening over the air. The nodes should be passing MLE messages to each other to maintain routing and network information. My recommendation would be to start a capture from the multiple nodes you are using to see what is happening over the air.
A couple of notes: