What advice can Silicon Labs provide regarding very large and/or very dense network design with ZigBee PRO / EmberZNet PRO?
Following are some useful tips on designing for large, dense networks, such as building automation networks for offices, hotels, or large residential areas.
While mesh networks generally benefit from more router coverage (for a more well-connected mesh), large building automation networks (where you may have radios in every light ballast and light switch) tend to be overly dense, particularly if you make all of the line-powered devices routers. The stack can only track a finite number of neighboring routers (16 in the case of EmberZNet), and any nodes outside of this neighbor table will require multi-hop routes (through a known neighbor) despite being within direct radio range of the sender, as only the known neighbors at the time of route discovery are considered as potential next hops in a new route. (You could potentially overcome this behavior by forcing a 1-hop source route to the target, but without the destination being in the neighbor table, it's unclear how you would assess whether the target is reachable within radio range. There's a recent EmberZNet feature request suggesting a potential stack callback for incoming Link Status messages, such that a node could determine who is within direct radio range even if it is not one of the preferred nodes lucky enough to be in the neighbor table. While this idea has merit, a node in a very dense network like this would be hearing a *lot* of incoming Link Status messages, so there would be a lot of callback activity for the application to manage.) In a similar vein, NWK/APS broadcast traffic is only repeated if the packet is heard from a known neighbor (otherwise the NWK security frame counter cannot be verified, so the risk of a rogue injected broadcast is too great), so broadcasts don't get around the tracked-neighbor concerns in dense networks.
One potential solution is to make a subset of your router-capable nodes behave as either end devices (non-sleepy [RxOnWhenIdle] ones, generally, so they don't have to poll to receive messages/ACKs) or non-relaying routers. Either behavior is supported at compilation / EZSP configuration time by our software, although how you determine which nodes and how many should be "neutered" in this way is up to the system designer and generally involves some proprietary scheme (enforced directly by the installer or by some advanced software algorithm that considers out-of-band information before the node enters the network). To make the node a non-sleepy end device, just select the "End Device" node type for the node in question in your AppBuilder setup (or join as EMBER_END_DEVICE if you're interfacing with the stack directly rather than using our ZCL-based App Framework). To make the node behave as a non-relaying router, just define EMBER_DISABLE_RELAY globally for your SoC build, or set EzspConfigId EZSP_CONFIG_DISABLE_RELAY with value 0x01 for EZSP host app builds. Contrasting these two solutions, the non-sleepy end device requires a router/coordinator node to be designated as its "parent" and will need to poll this node (or send a message through it) periodically -- about once every 5 minutes, typically -- to ensure that it isn't aged out as a stale child; otherwise it will lose its connection to the network and will need to rejoin to get a proper parent connection again. The end device also doesn't participate directly in routing, instead using its parent node as the first hop for any outgoing messages or the last hop for anything incoming, so it won't hear broadcasts unless relayed by its parent, needs to rely on its parent for route discovery, can't utilize Many-to-One Routing as a concentrator, and can't act as a parent to joining/rejoining end devices.
However, it also doesn't need to maintain a neighbor table, child table, or a route table (making it less RAM-intensive), can run with our "zigbee-pro-leaf-stack" library (which saves about 5-6KB over the standard ZigBee Pro stack library), doesn't send Link Status messages, and won't echo broadcasts, so those are all potential advantages over the non-relaying router. The non-relaying router uses the standard EMBER_ROUTER node type and joins/behaves just like a standard router node except that it will not relay any unicast traffic or route request broadcasts or respond to route requests with any route replies unless the source or destination of the message/route is the node itself or one of its end device children. The node still sends Link Status messages, still echoes broadcasts as expected, and still allows end device children to join/rejoin through it (unless EMBER_MAX_END_DEVICE_CHILDREN is defined/configured as 0 for the build). Also, while it's not officially supported, it is possible to set/clear the allow relaying flag (extern int8u emAllowRelay) at runtime on SoC devices, whereas changing node types can be done at runtime but is more onerous as it requires leaving the network and joining again with new node type.
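As a minimal configuration sketch of the two non-relaying-router mechanisms named above (symbols are those given in the text; exact placement depends on your SDK version, so treat this as illustrative rather than authoritative):

```c
/* SoC build: compile the router image with relaying disabled by adding
 * this symbol globally (e.g., via the Macros section in AppBuilder). */
#define EMBER_DISABLE_RELAY

/* EZSP host build: the equivalent at NCP configuration time would look
 * something like the following, done during network initialization: */
ezspSetConfigurationValue(EZSP_CONFIG_DISABLE_RELAY, 0x01);
```

Either way, the node still joins as EMBER_ROUTER and keeps sending Link Status messages and echoing broadcasts, as described above.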
In a network of any appreciable size, broadcasts should be avoided or minimized wherever possible, and the radius of propagation should be limited in cases where the nodes of interest are known to be in close proximity (within a limited number of hops) of the sender. (See the note above about hop counts being slightly higher in these networks due to density.) Furthermore, the ZigBee PRO stack limits broadcast traffic to about 8 broadcasts in any 9-second window, so any NWK-layer broadcast activity (APS broadcasts, APS multicasts, route discoveries, ZDO announcements or broadcast requests, PAN ID / channel / NWK key update notifications) by any node, even one with a 1-hop radius, will count against this limit. If any node encounters more than this amount of broadcast traffic, it will discard the packet, because the packet can't be tracked against the list of known broadcasts and we don't want to risk repeating some old broadcast and propagating it unnecessarily.
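The "8 broadcasts in any 9-second window" budget can be modeled with a toy broadcast table like the one below. This is not stack code -- just a self-contained illustration of why a ninth broadcast inside the window gets dropped rather than tracked:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the broadcast table behavior described above: 8 slots,
 * each slot "owned" by a broadcast for a 9-second window.  A new
 * broadcast that finds no free or aged-out slot is discarded. */
#define BCAST_TABLE_SIZE 8
#define BCAST_WINDOW_MS  9000L

static long bcastTimes[BCAST_TABLE_SIZE]; /* -1 == free slot */

static void bcastInit(void) {
  for (int i = 0; i < BCAST_TABLE_SIZE; i++) {
    bcastTimes[i] = -1;
  }
}

/* Returns true if the broadcast can be accepted/tracked at time nowMs. */
static bool bcastTrySend(long nowMs) {
  for (int i = 0; i < BCAST_TABLE_SIZE; i++) {
    /* Reuse a slot that is free or whose broadcast has aged out. */
    if (bcastTimes[i] < 0 || nowMs - bcastTimes[i] >= BCAST_WINDOW_MS) {
      bcastTimes[i] = nowMs;
      return true;
    }
  }
  return false; /* table full: packet would be dropped, as noted above */
}
```

Running this model, eight broadcasts in quick succession are accepted, a ninth in the same window is rejected, and capacity frees up only as old entries age out.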
While the broadcast table (used to track outstanding broadcasts, which limits how many can be active at a time) can theoretically be resized, the standard EmberZNet PRO stack does not allow for this, as changing this parameter can potentially violate ZigBee PRO stack compliance and can impact interoperability with other nodes (even those running the EmberZNet PRO stack) that have different broadcast table parameters. However, some customers designing fully proprietary networks (where compliance and third-party interoperability are not such a concern and where they can specify the build parameters of all nodes involved in the PAN) are able to benefit from a configurable broadcast table to suit the broadcast bandwidth expectations of their design, and in this case a special stack library variant can be made available on a case-by-case basis to customers who understand the risks. If you believe this applies to you, please contact your Silicon Labs support representative via the technical support portal (https://siliconlabs.force.com). Also note that even if the broadcast table is resized to allow more broadcasts in the network in a short period of time, sending broadcasts has a significant impact on the throughput of the network and is not intended to be a reliable form of communication, since no ACKs are produced by recipients and clear channel access may not be available to all nodes at the time of relaying the message.
One type of broadcast that you'd expect to see in a large network, especially one with central control points as in a building automation network, is a Many-to-One Route Request (MTORR). Having central collection/control points in the network act as network concentrators (such as via the Concentrator Support plugin in our Application Framework) is recommended to make communication between this node and others more efficient, such that fewer 1-to-1 route discoveries are necessary and route tables of non-concentrator nodes are not overtaxed by the communication flowing in/out of these concentrators. While an MTORR is still a broadcast and is needed periodically to refresh/maintain the routes to the concentrator (and to derive reverse routes back from the concentrator), MTORRs are more regular/periodic in nature than 1-to-1 route requests and originate at specific points, so the traffic impact is much more deterministic and easier to model. The rate of MTORRs, even when used to correct non-functional routes, can be adjusted via the Min/Max Time Between Broadcasts parameters in the Concentrator Support plugin, along with thresholds for producing MTORRs in reaction to route errors and delivery failures. As the network grows in size and/or number of concentrators, it is important to adjust the Maximum Time Between Broadcasts to be longer so that the network is not overwhelmed by several concentrators doing periodic MTORRs in a short window of time. Note that you will want to leave enough broadcast bandwidth for not just the MTORRs but also the occasional address discovery (ZDO IEEE Address Request or Network Address Request), which may be needed by the stack to resolve the node ID of a binding/address table destination or (in the case of IEEE Address Request) to elicit a unicast reply from a non-concentrator so that a Route Record is produced for the concentrator's benefit after the many-to-one route has changed or is unknown.
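A back-of-envelope sizing helper can make the "leave enough broadcast bandwidth" advice concrete. The sketch below (my own arithmetic, not a stack API) bounds the minimum per-concentrator MTORR period given the ~8-broadcasts-per-9-seconds budget and some number of broadcast slots reserved for address discovery and other ZDO traffic:

```c
#include <assert.h>

/* Assumption-laden sizing sketch: the network-wide budget is roughly
 * 8 broadcasts per 9-second window.  Reserving some of that for
 * address discovery etc. leaves a share that bounds how often N
 * concentrators may each emit a periodic MTORR. */
static unsigned minMtorrPeriodSec(unsigned numConcentrators,
                                  unsigned reservedBroadcasts) {
  const unsigned budget = 8;      /* broadcasts allowed per window  */
  const unsigned windowSec = 9;   /* window length in seconds       */
  unsigned share = budget - reservedBroadcasts; /* MTORRs per window */
  /* N concentrators at period T give N*(windowSec/T) broadcasts per
   * window; requiring that to stay <= share yields
   * T >= N * windowSec / share (rounded up). */
  return (numConcentrators * windowSec + share - 1) / share;
}
```

For example, 4 concentrators sharing half the budget would each need a Maximum Time Between Broadcasts of at least ~9 seconds; in practice you would set it far longer, since MTORRs are only one consumer of the budget.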
(This style of proactive source route repair is employed by the Concentrator Support plugin in EmberZNet PRO 5.1.2 and later.)
In order to ensure the benefits of source routing and many-to-one routing are achieved, the concentrator should make sure to suppress route discovery (disallow the setting of EMBER_APS_OPTION_ENABLE_ROUTE_DISCOVERY) for outgoing messages, instead deriving source routes from incoming Route Records. Similarly, nodes talking to a concentrator should also suppress route discovery in their unicasts to the concentrator, instead waiting for the MTORR to set up the inbound route from the concentrator. While these adjustments may introduce some latency into the initial communication between concentrator and partner node (as the routes are being set up), the benefit achieved from avoiding the 1-to-1 route discovery broadcast is important to the scalability of the network. If the non-concentrator node has difficulty knowing which destinations are concentrators in order to suppress route discovery where appropriate, the optional stack callback emberIncomingManyToOneRouteRequestHandler() may be used, which requires defining EMBER_APPLICATION_HAS_INCOMING_ROUTE_ERROR_HANDLER during the build process. Also note that if using the binding table to communicate with a concentrator, EMBER_MANY_TO_ONE_BINDING type can be used when creating the EmberBindingTableEntry, and this will suppress the stack's normal address discovery if delivery to the binding destination fails, even if EMBER_APS_OPTION_ENABLE_ADDRESS_DISCOVERY is enabled. However, it will not suppress route discovery if the EMBER_APS_OPTION_ENABLE_ROUTE_DISCOVERY or EMBER_APS_OPTION_FORCE_ROUTE_DISCOVERY option is enabled for the current unicast.
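One way an application can act on the advice above is to remember which nodes have advertised themselves as concentrators (e.g., from the incoming-MTORR stack callback) and drop the route discovery option when unicasting to them. The sketch below uses hypothetical helper names and placeholder option bit values; in a real SoC application the table would be fed from emberIncomingManyToOneRouteRequestHandler() and the options would be the real EMBER_APS_OPTION_* masks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Application-side sketch: track known concentrators so later unicasts
 * to them can omit route discovery and rely on the MTORR-built route. */
typedef uint16_t NodeId;
#define MAX_CONCENTRATORS 8
static NodeId concentrators[MAX_CONCENTRATORS];
static int concentratorCount = 0;

static void noteConcentrator(NodeId source) {
  for (int i = 0; i < concentratorCount; i++) {
    if (concentrators[i] == source) return;  /* already known */
  }
  if (concentratorCount < MAX_CONCENTRATORS) {
    concentrators[concentratorCount++] = source;
  }
}

static bool isKnownConcentrator(NodeId dest) {
  for (int i = 0; i < concentratorCount; i++) {
    if (concentrators[i] == dest) return true;
  }
  return false;
}

/* Placeholder bit values for illustration only. */
#define OPT_RETRY           0x0040
#define OPT_ROUTE_DISCOVERY 0x0100

/* Suppress route discovery toward known concentrators. */
static uint16_t apsOptionsFor(NodeId dest) {
  uint16_t options = OPT_RETRY;
  if (!isKnownConcentrator(dest)) {
    options |= OPT_ROUTE_DISCOVERY;
  }
  return options;
}
```

The design point here is simply that the "is this destination a concentrator?" decision is made once per incoming MTORR and cached, rather than guessed at per-unicast.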
Note that your Trust Center (coordinator) node should always implement concentrator support so that it can facilitate responses to authentication requests arriving from parents of joining/rejoining nodes, which may not already be in the route table at the time the request arrives. Failing to enable this feature on the Trust Center can result in problems joining nodes over multiple hops.
To ensure the concentrator has source routes for nodes without having to resort to soliciting them, the designer of an application for a concentrator node in a large network should make an effort to accommodate source routes for as many nodes as resources will allow. This means using a High RAM Concentrator (as opposed to a Low RAM Concentrator, which will force the nodes to send Route Records to the concentrator for every unicast, adding to the traffic load) and setting the source route table size to be large enough for (ideally) the largest expected set of nodes with whom the concentrator will communicate during the life of the PAN. For SoC applications, this means setting EMBER_SOURCE_ROUTE_TABLE_SIZE as large as feasible. (If more than 255 nodes are expected, designers can alter the data type of the SourceRouteTableEntry table index used in app/util/source-route.c to accommodate a 16-bit index as necessary.) For EZSP host applications employing an NCP, the NCP's own source route table should ideally be as large as RAM will accommodate, even if the host is planning to maintain source route data on its own, since there are cases (such as Trust Center authentication requests and incoming ZDO requests) where the stack will need to respond on its own without host involvement, so a route will need to be derived using only the NCP's available route / source route tables. (Make sure to check the "Enable Concentrator Support at the NCP" option in the AppBuilder Concentrator Support plugin if your version includes this.) If the host is planning on storing source routes as well, it should attempt to store as many as resources allow, such that it can provide the NCP with suitable source route data for outgoing unicasts.
Also note that, for a large/dense network with many hops, the source routes appended to messages may be quite large, so the maximum application payload size for outgoing source-routed unicasts will be impacted by this overhead. For more information about payload sizes, see Knowledge Base article (at community.silabs.com) entitled "What is the maximum ZigBee message payload length in secure and non-secure modes?". As such, fragmentation (which can be provided via AppBuilder's Fragmentation plugin) may be necessary for certain messages with large payloads.
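To give a feel for the overhead: the ZigBee NWK source route subframe costs roughly 2 bytes of fixed header (relay count and relay index) plus 2 bytes per relay hop, so a deep route eats noticeably into the payload. The helper below is a rough estimate only -- `basePayload` stands in for whatever maximum APS payload your configuration allows without source routing (see the referenced Knowledge Base article for real numbers):

```c
#include <assert.h>

/* Rough payload estimate: NWK source route subframe = 1 byte relay
 * count + 1 byte relay index + 2 bytes per relay hop.  basePayload is
 * the max application payload without source routing (configuration-
 * dependent; see the KB article referenced above). */
static int sourceRoutedPayload(int basePayload, int relayCount) {
  int overhead = 2 + 2 * relayCount;
  int remaining = basePayload - overhead;
  return remaining > 0 ? remaining : 0; /* clamp: no room left at all */
}
```

So a 5-relay source route consumes 12 bytes of whatever payload budget you started with, which is one reason fragmentation may become necessary for larger messages.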
In addition to source route overhead, applications for large networks such as this may also want to consider increasing the following parameters (which can be redefined through build symbols such as via the Macros section of the AppBuilder "Includes" tab):
EMBER_PACKET_BUFFER_COUNT - More traffic in/out/through nodes due to network size/density will mean more packet data needs to be buffered by routers. A single packet buffer can hold up to 32 bytes of data, so a large packet may require as many as 4 packet buffers to store. Packet buffers are also needed for messages buffered on a parent for sleepy end device children and for other transient network data used by the stack. For EZSP host applications, Silicon Labs recommends allocating any unused RAM on the NCP towards packet buffers to ensure the NCP has enough buffering, particularly since packet buffers are also used to store EZSP frames being queued for transmission to the host. (This is the default behavior of the Application Framework already.) The Application Framework uses a default of 75 packet buffers for SoC builds, which is sufficient for most applications, but if you begin seeing EMBER_NO_BUFFERS status results when transmitting messages during heavy network usage in your tests, it may be time to increase this value.
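The buffers-per-packet arithmetic above is just a ceiling division, which this small helper makes explicit (32 bytes per buffer, per the description; a maximum-size ~127-byte 802.15.4 frame therefore needs 4 buffers):

```c
#include <assert.h>

/* Each packet buffer holds up to 32 bytes, so a message of N bytes
 * occupies ceil(N / 32) buffers. */
static int buffersNeeded(int packetBytes) {
  return (packetBytes + 31) / 32;
}
```

When estimating EMBER_PACKET_BUFFER_COUNT, remember that in-flight packets, parent-buffered messages for sleepy children, and (on an NCP) queued EZSP frames all draw from the same pool.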
EMBER_APS_UNICAST_MESSAGE_COUNT - The number of pending (waiting for ACK / timeout) APS unicasts that can be outstanding from a sending node at any one time. Default is 10.
EMBER_MAX_END_DEVICE_CHILDREN - The number of end device children that can be supported by a single parent router/coordinator. Default is 6; maximum is 64.
EMBER_BINDING_TABLE_SIZE - The number of bindings that the node can leverage for binding-based communication by the stack. Maximum is 126 based on the limitations of the SimEEPROM storage model used to hold these, but 100 is the maximum in the EZSP NCP firmware (except for EM260 and EM351-based firmware, which is limited to 12 bindings). If more bindings than this are needed, the application will need to manage bindings itself, which can be achieved by defining EMBER_APPLICATION_HANDLES_BINDING_ZDO_REQUESTS for SoC builds or, for NCP builds, by setting EzspConfigId EZSP_CONFIG_APPLICATION_ZDO_FLAGS to include the bitmask EMBER_APP_HANDLES_ZDO_BINDING_REQUESTS. This setting will cause all ZDO requests with clusterId of BINDING_TABLE_REQUEST, BIND_REQUEST, or UNBIND_REQUEST to be passed up to the [host] application for processing & response rather than letting the stack handle them in the usual way.
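For the application-managed case, the app essentially keeps its own binding store and answers Bind/Unbind ZDO requests from it. The sketch below is a hypothetical, self-contained version of such a store (the types and names are mine, not stack APIs); a real implementation would populate it from the incoming ZDO requests that the EMBER_APPLICATION_HANDLES_BINDING_ZDO_REQUESTS setting routes to the application:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application-owned binding store, sized beyond the
 * stack's 100/126-entry limits.  Entries would be created/removed in
 * response to Bind_req / Unbind_req ZDO messages passed up by the
 * stack when the application handles binding ZDO requests itself. */
typedef struct {
  uint16_t nodeId;    /* destination short address    */
  uint8_t  endpoint;  /* destination endpoint         */
  uint16_t clusterId; /* bound cluster                */
  bool     inUse;
} AppBinding;

#define APP_BINDING_TABLE_SIZE 200
static AppBinding appBindings[APP_BINDING_TABLE_SIZE];

static bool appAddBinding(uint16_t nodeId, uint8_t ep, uint16_t cluster) {
  for (int i = 0; i < APP_BINDING_TABLE_SIZE; i++) {
    if (!appBindings[i].inUse) {
      appBindings[i] = (AppBinding){ nodeId, ep, cluster, true };
      return true;
    }
  }
  return false; /* table full */
}

static int appCountBindings(void) {
  int n = 0;
  for (int i = 0; i < APP_BINDING_TABLE_SIZE; i++) {
    if (appBindings[i].inUse) n++;
  }
  return n;
}
```

The trade-off is that the application now owns persistence and Mgmt_Bind_rsp reporting for this table, which the stack would otherwise have handled.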
Hopefully, this guide has been helpful to you. If you have any concerns about your large network design, please contact Silicon Labs technical support for advice.
How can I achieve the best RF performance with a keyfob using Si4010/11/12? What are the design and tuning tricks?
The Si4010 RFICs can be used with small loop antennas without external matching components. A native small loop antenna has quite a high input impedance (around 10 kOhm), but for optimum operation the Si4010 transmitter's PA requires about 400-500 ohms of antenna impedance. This impedance transformation is provided by the ratio of the top antenna capacitance (which can be realized as a printed interdigital capacitor, C1) and the internal PA capacitance bank of the Si4010 (inside the chip, C2), where C2 > C1.
All designs are pretty sensitive to the top antenna capacitance (shown as C1): a slightly smaller capacitance results in lower radiated power at the fundamental, while a slightly larger capacitance results in higher radiated harmonics. The table below summarizes the effects of the top antenna capacitance on the RF performance (in this comparison the form factor size - loop antenna area - and the frequency are assumed to be fixed).
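As a rough illustration of the C1/C2 transformation (this is the standard first-order capacitive-divider approximation, not a figure from the Si4010 documentation, so treat the formula and values as an assumption): the PA sees approximately Z_pa = Z_ant * (C1 / (C1 + C2))^2, which with Z_ant around 10 kOhm lands in the 400-500 ohm range when C2 is roughly 3-4 times C1:

```c
#include <assert.h>

/* First-order capacitive-divider estimate (assumption, not datasheet
 * math): the antenna's high parallel impedance zAnt is stepped down by
 * the square of the divider ratio c1 / (c1 + c2) at the PA tap. */
static double paImpedance(double zAnt, double c1, double c2) {
  double k = c1 / (c1 + c2);
  return zAnt * k * k;
}
```

For example, with zAnt = 10000 ohms, C1 = 2 pF, and C2 = 7.4 pF, the estimate falls between 400 and 500 ohms, consistent with the PA's preferred load and with C2 > C1 as stated above.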
The internal PA capacitance (C2) is automatically tuned by the RFIC itself and can be monitored ("wPA_CAP" register with Silicon Labs IDE).
Here are some dimension parameters for reference:
If the loop antenna area is around 25x25mm and the operating frequency is 434MHz, the typical top antenna capacitance is about 2pF.
More suggested design tricks for the small loop keyfob antenna:
Inductive tapping: realized as a smaller loop on our reference keyfob designs, it provides the Si4010's PA biasing without the need for any external choke inductor (since the Si4010 PA outputs have to be connected to the supply voltage).
How are GPIO interrupts on the EM3xx SoC platform handled when they arrive in ATOMIC() context?
In the Ember HAL, as used on the EM3xx SoCs, the ATOMIC(...) macro is used to temporarily disable interrupts during execution of the enclosed code statements. However, what happens if an external interrupt (via IRQ pin) is triggered while interrupts are disabled? Is the IRQ interrupt lost? The short answer is No, and this applies to both edge-triggered and level-triggered interrupts.
Let's examine why not:
An edge-sensitive interrupt would cause the pending interrupt flag for the IRQ to be set. This flag is held (asserted), and re-enabling interrupts is what allows that signal to propagate. The ATOMIC() block, as defined by the Ember HAL, shifts the priority encoding of the interrupts (via BASEPRI) such that the NVIC won't service the interrupt until the ATOMIC() block restores BASEPRI and the event can propagate. That means the interrupt is serviced effectively before the next line of code that immediately follows the ATOMIC() block.
A level-triggered interrupt would behave the same way even if, by the time the ATOMIC() block finishes, the level has changed. This is because the event gets latched in the NVIC and serviced once BASEPRI allows it.
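The latching behavior can be shown with a toy model (this is a simulation of the described semantics, not the real Ember HAL or NVIC): an IRQ arriving while "BASEPRI" masks it sets a pending latch, and the latched interrupt is serviced as the block exits, before the code following it runs:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the ATOMIC()/NVIC interaction described above. */
static bool basepriMasked = false;  /* stands in for BASEPRI masking  */
static bool irqPending = false;     /* stands in for the NVIC latch   */
static int  irqServiceCount = 0;

static void nvicCheck(void) {
  if (irqPending && !basepriMasked) {
    irqPending = false;
    irqServiceCount++;              /* the "ISR" fires here */
  }
}

/* IRQ pin asserts: latch the event; service it only if unmasked. */
static void irqAssert(void) {
  irqPending = true;
  nvicCheck();
}

/* Simulated ATOMIC(): mask, run the code, unmask, then service any
 * latched interrupt before execution continues past the block. */
#define SIM_ATOMIC(code) do { \
    basepriMasked = true;     \
    { code }                  \
    basepriMasked = false;    \
    nvicCheck();              \
  } while (0)
```

In the model, an IRQ raised inside SIM_ATOMIC() does not run its handler until the block ends, but it is never lost, matching the edge- and level-triggered behavior described above.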
If I'm building an application that doesn't need to be ZigBee-compliant (or interoperable with non-EmberZNet nodes), what kinds of proprietary features are available to leverage?
In a proprietary network scenario, you have a bit more flexibility in that you don't have to follow strict ZigBee conventions and can utilize some EmberZNet features that are optional but non-standard and are not guaranteed to interoperate with other vendors' ZigBee Pro stacks. If operating in this mode, it's best to use a project-wide define to set EMBER_STACK_PROFILE as 0 to indicate a proprietary (non-ZigBeePro-compliant) network and prevent other nodes (using the standard ZigBee Pro stack profile of 2 or the non-Pro stack profile of 1) from trying to join your PAN.
Once you do this, various ZigBee-related defines that you're at liberty to redefine include:
* EMBER_MAX_DEPTH (default of 15; used to calculate EMBER_MAX_HOPS as 2x max depth)
* EMBER_ROUTE_TABLE_SIZE (able to be set below ZigBee-specified minimum of 10)
* EMBER_INDIRECT_TRANSMISSION_TIMEOUT (length of time in ms that parents hold messages for sleepy children; stack allows this to be set to 30000)
* EMBER_SEND_MULTICASTS_TO_SLEEPY_ADDRESS (default is false, but can set it to true in proprietary networks)
Other EmberZNet-proprietary features you might want to leverage include:
* Just-in-Time (JIT) messaging / Alarm Messages for sleepy children - See related Knowledge Base article @ http://community.silabs.com/t5/Silicon-Labs-Knowledge-Base/How-do-I-use-Alarm-Messages/ta-p/113347 and example usage in createAndStoreJitMessage() in app/sensor/common.c.
* Piggy-backed payloads on APS ACKs - See emberSendReply() API in stack/include/message.h.
* Mobile End Devices - For devices that change parents (or move out of parent range) often, they can join as EMBER_MOBILE_END_DEVICE (selectable in AppBuilder) and our stack will store the child entries in RAM with a more aggressive timeout (EMBER_MOBILE_NODE_POLL_TIMEOUT) rather than writing to SimEEPROM memory each time the child comes and goes.
* Raw messages - See API in stack/include/raw-message.h. Note that while emberSendRawMessage() will send pretty much anything, on the receiving end the stack will consume (not pass to your application) anything with a MAC header frame control that ZigBee would normally employ (beacons, MAC ACKs, unicast MAC commands, many MAC data frames, etc.) However, unhandled MAC frame control masks can be passed up to the application for filtering via either emberMacPassthroughMessageHandler() (SoC only) or through a pre-configured frame filter that can be set by emberSetMacFilterMatchList() for filter-matching in emberMacFilterMatchMessageHandler(). See example in the Inter-PAN plugin in app/framework/plugin/interpan as well as MAC filter types defined in stack/include/ember-types.h.
* Distributed Trust Center mode - Allows you to avoid the risk of having a single point of failure at the centralized Trust Center (coordinator); every router authenticates joins locally rather than consulting the ZC. Enable with EMBER_DISTRIBUTED_TRUST_CENTER_MODE in the ZC's initial security bitmask passed to emberSetInitialSecurityBitmask(); it is then inherited by each node that joins.