Below are the definitions of the sequence number and the IV Index from chapters 3.8.3 and 3.8.4 of the Mesh Profile Specification v1.0.1.
The sequence number, a 24-bit value contained in the sequence number field of the Network PDU, is primarily designed to protect against replay attacks. Elements within the same node may or may not share the sequence number space with each other. Having a different sequence number in each new Network PDU for every message source (identified by the unicast address contained in the SRC field) is critical for the security of the mesh network.
With a 24-bit sequence number, an element can transmit 16,777,216 messages before repeating a nonce. If an element transmits a message on average once every five seconds (representing a fairly high frequency message for known use cases), the element can transmit for 2.6 years before the nonce repeats.
Each element shall use strictly increasing sequence numbers for the Network PDUs it generates. Before the sequence number approaches the maximum value (0xFFFFFF), the element shall update the IV Index using the IV Update procedure (see Section 3.10.5). This is done to ensure that the sequence number will never wrap around.
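The figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and a year is assumed to be 365.25 days.

```python
SEQ_BITS = 24                          # width of the sequence number field
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average year length


def seq_space(bits=SEQ_BITS):
    """Number of distinct sequence numbers available per IV Index."""
    return 1 << bits


def years_until_exhausted(interval_s, bits=SEQ_BITS):
    """Years an element can transmit at one message every interval_s seconds."""
    return seq_space(bits) * interval_s / SECONDS_PER_YEAR


print(seq_space())                         # 16777216
print(round(years_until_exhausted(5), 2))  # 2.66
```

The result, about 2.66 years, matches the 2.6 years quoted from the specification.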
The sequence number should be non-volatile: the software stack must make sure that a device never reuses a sequence number it has already used, so storing the value in flash is the natural choice. However, writing to flash on every sequence number increment would likely wear out the flash well before the end of the product's lifetime. As a compromise, the stack writes the sequence number to flash only once every pstore_write_interval_elem_seq increments; this value is adjustable and is located in the dcd.c file in your project. The device may lose power unexpectedly between two writes, in which case the latest sequence number has not been written to flash, and on the next boot the device reads the older value stored there. To avoid reusing any sequence number, the node always increases the stored sequence number by pstore_write_interval_elem_seq after a reset. In addition, the stack detects whether the sequence number actually increased between the two resets; if it did not, the stack does not increase the sequence number. For more information, see the example in Figure 1.
NOTE: This is not standard but Silicon Labs' current solution and it may change in the future.
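The storing strategy described above can be modeled roughly as follows. This is a toy simulation, not the Silicon Labs stack: the class, its methods, and in particular the `flash_used` marker are hypothetical. The article does not say how the stack detects that the sequence number was used between two resets, so a persisted flag is assumed here.

```python
class SeqStore:
    """Toy model of a flash-backed sequence number (not the real stack)."""

    def __init__(self, write_interval=0x10000):
        # Plays the role of pstore_write_interval_elem_seq.
        self.write_interval = write_interval
        # Simulated flash contents: these survive a reset.
        self.flash_seq = 0
        self.flash_used = False  # hypothetical "used since last checkpoint" marker
        self.flash_writes = 0
        # Simulated RAM contents: lost on reset, reloaded in boot().
        self.seq = 0

    def _checkpoint(self, value):
        """Write a new sequence number to flash and clear the marker."""
        self.flash_seq = value
        self.flash_used = False
        self.flash_writes += 1

    def boot(self):
        """Reload after a reset, skipping ahead if numbers may have been used."""
        if self.flash_used:
            self._checkpoint(self.flash_seq + self.write_interval)
        self.seq = self.flash_seq
        return self.seq

    def send(self):
        """Consume one sequence number and return it."""
        used = self.seq
        if not self.flash_used:
            self.flash_used = True  # assumed: one extra flash write per interval
            self.flash_writes += 1
        self.seq += 1
        if self.seq - self.flash_seq >= self.write_interval:
            self._checkpoint(self.seq)
        return used


store = SeqStore(write_interval=0x10000)
store.boot()
print([store.send() for _ in range(3)])  # [0, 1, 2]
print(hex(store.boot()))                 # 0x10000 (packets were sent before reset)
print(hex(store.boot()))                 # 0x10000 (nothing sent, so no increase)
```

In this model a reset can waste up to one interval of sequence numbers, which is the price paid for writing to flash only about twice per interval instead of once per packet.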
The IV Index is a 32-bit value that is a shared network resource (i.e., all nodes in a mesh network share the same value of the IV Index and use it for all subnets they belong to).
The IV Index starts at 0x00000000 and is incremented during the IV Update procedure as described in Section 3.10.5. The timing when the value is incremented does not have to be exact, since the least significant bit is communicated within every Network PDU. Since the IV Index value is a 32-bit value, a mesh network can function approximately 5 trillion years before the IV Index will wrap.
The IV Index is shared within a network via Secure Network beacons (see Section 3.9.3). IV updates received on a subnet are processed and propagated to that subnet. The propagation happens by the device transmitting Secure Network beacons with the updated IV Index for that particular subnet. If a device on a primary subnet receives an update on the primary subnet, it shall propagate the IV update to all other subnets. If a device on a primary subnet receives an IV update on any other subnet, the update shall be ignored.
If a node is absent from a mesh network for a period of time, it can scan for Secure Network beacons (see Section 3.10.1) or use the IV Index Recovery procedure (see Section 3.10.6), and therefore set the IV Index value autonomously.
IV Update Procedure:
The sequence number is 24 bits long, so, as in the example above, at a cadence of one message every 5 seconds it is exhausted after about 2.6 years. To avoid this, a node can start the IV Update procedure to move the network to a new IV Index, which allows the sequence number to be reset to 0. The IV Update procedure can be initiated by any node in the primary subnet. Table 1 summarizes the IV Update procedure.
| IV Index | IV Update Flag | IV Update Procedure State | IV Index Accepted | IV Index used when transmitting |
|---|---|---|---|---|
| n | 0 | Normal | n-1, n | n |
| m (m=n+1) | 1 | In Progress | m-1, m | m-1 |
| m | 0 | Normal | m-1, m | m |
Table 1: IV Update procedure summary
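The rules summarized in Table 1 can be expressed as a small function. This is a sketch only: the function name is made up, and the behavior in the Normal state (accept the current and previous IV Index, transmit with the current one) is taken from my reading of Section 3.10.5 of the specification rather than from this article.

```python
def iv_update_view(iv_index, update_in_progress):
    """Return (accepted IV Indexes, IV Index used when transmitting)."""
    if update_in_progress and iv_index == 0:
        # An update in progress means the index was already incremented.
        raise ValueError("IV Update cannot be in progress at IV Index 0")
    # Both the current and the previous IV Index are accepted on receive.
    accepted = [i for i in (iv_index - 1, iv_index) if i >= 0]
    # While the update is in progress, the node still transmits with the old index.
    tx = iv_index - 1 if update_in_progress else iv_index
    return accepted, tx


print(iv_update_view(5, False))  # ([4, 5], 5)  Normal at n = 5
print(iv_update_view(6, True))   # ([5, 6], 5)  In Progress at m = 6
print(iv_update_view(6, False))  # ([5, 6], 6)  Normal at m = 6
```

Transmitting with the old index while already accepting the new one gives every node time to learn about the update before anyone depends on it.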
IV Recovery Procedure:
A node shall support the IV index recovery procedure because a node that is away from the network for a long time may miss IV Update procedures, in which case it can no longer communicate with the other nodes. In order to recover the IV Index, the node must listen for a Secure Network beacon, which contains the Network ID and the current IV Index.
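The decision a node makes when it hears a beacon during recovery might look like the sketch below. The function and parameter names are hypothetical, and the upper bound of 42 on the accepted jump comes from my reading of Section 3.10.6 of the specification, not from this article. Beacon authentication and the limits on how often recovery may run are left out.

```python
MAX_IV_JUMP = 42  # assumption: largest accepted jump, per Section 3.10.6


def recover_iv_index(known_iv, own_network_id, beacon_network_id, beacon_iv):
    """Return the IV Index to use after seeing one Secure Network beacon."""
    if beacon_network_id != own_network_id:
        return known_iv  # beacon belongs to another network
    if beacon_iv < known_iv or beacon_iv > known_iv + MAX_IV_JUMP:
        return known_iv  # outside the accepted range, ignore it
    return beacon_iv     # catch up with the network


net = bytes.fromhex("0011223344556677")    # made-up 8-byte Network ID
other = bytes.fromhex("8899aabbccddeeff")  # a different made-up Network ID
print(recover_iv_index(3, net, net, 7))    # 7
print(recover_iv_index(3, net, other, 7))  # 3
print(recover_iv_index(3, net, net, 100))  # 3
```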
There are some limitations on initiating the IV update and recovery procedures, described in chapters 3.10.5 and 3.10.6 of the Mesh Profile Specification v1.0.1. In general, they are as follows:
Because of the limitations mentioned above, the IV update & recovery procedure is not easy to test. For that reason, the IV test mode removes the 96-hour limit. All other behavior of the devices remains unchanged.
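The effect of the test mode can be pictured with a minimal sketch. It assumes the 96-hour limit means that a node must stay in its current IV Update state for at least 96 hours before changing it (my reading of Section 3.10.5); the function name and parameters are made up for illustration and are not BGAPI calls.

```python
MIN_HOURS_IN_STATE = 96  # the 96-hour limit mentioned above


def may_change_iv_update_state(hours_in_current_state, iv_test_mode=False):
    """Return True if the node is allowed to change its IV Update state now."""
    if iv_test_mode:
        return True  # test mode removes the time limit, nothing else
    return hours_in_current_state >= MIN_HOURS_IN_STATE


print(may_change_iv_update_state(1))                     # False
print(may_change_iv_update_state(96))                    # True
print(may_change_iv_update_state(1, iv_test_mode=True))  # True
```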
The secure network beacon must be enabled for IV update & recovery, since it carries the information that a node is updating the IV index.
NOTE: The Bluetooth Mesh app may not support configuring the secure network beacon state, so you may need to enable it locally with the mesh test class commands. This is not a problem if you use the Host Provisioner, since it supports this configuration.
The example contains two projects, iv_update and iv_recovery. They need to work together to show how the IV update & recovery procedure works and to demonstrate the sequence number storing strategy. Let's call the node flashed with the iv_update firmware node U, and the node flashed with the iv_recovery firmware node R.
Below is a list of BGAPI commands and events related to the IV update & recovery procedure, all of which are demonstrated in the example.
As you can see from the last section, the examples are based on the light and switch examples, so they keep the basic functionality and behavior of the original examples. The modifications are listed below:
On the Node U side - iv_update project:
On the Node R side - iv_recovery project:
Every time node U boots, if it is provisioned, it prints the current and remaining sequence numbers. They are also printed every time a button is pressed to send a packet. You can try sending some packets and then resetting the device to see what happens. What happens if you reset the node directly without sending any packets? See the result in Figure 1, where pstore_write_interval_elem_seq is 0x10000.
Figure 1. Sequence Number Increasing
Figure 2 shows the IV update and IV recovery procedure with 2 nodes in the same network. The left side shows the output of the node that initiates the IV update procedure (node U), and the right side shows the output of the node that catches up with the IV update (node R). The IV update & recovery procedure runs 4 times, as described below:
Figure 2. IV Update & Recovery Procedure
Thank you for this great article and examples.
I have implemented a similar IV Update & Recovery Update Procedure in my application, but I am also periodically calling cmd_mesh_node_save_replay_protection_list(). However, I'm concerned about causing too much unnecessary FLASH wear.
Could you comment on what frequency would you say it's best to call that function in order to maximize FLASH life and still be somewhat protected against replay attacks? I imagine the recommended rate will depend on the frequency of messages sent by the nodes, but I would also like to get some data about Silicon Labs BGM13P Flash endurance cycles.
How often this command is called doesn't affect replay attack protection. The variables for replay attack protection are all kept in RAM; syncing them to flash only matters for power cycling. After a power cycle, the node can load the latest values from flash instead of starting from 0. The API is intended for saving the RPL before power loss.
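To illustrate the point, here is a toy model of a replay protection list that lives in RAM and is copied to flash only on request. It is not the stack's implementation: the class and method names are invented, and `save()` merely plays the role of cmd_mesh_node_save_replay_protection_list().

```python
class ReplayProtectionList:
    """Toy RPL: checks happen in RAM, flash is only a backup for power loss."""

    def __init__(self):
        self.ram = {}    # source address -> (iv_index, seq) of last accepted message
        self.flash = {}  # simulated flash copy, survives a power cycle

    def accept(self, src, iv_index, seq):
        """Accept a message only if it is newer than the last one from src."""
        last = self.ram.get(src)
        if last is not None and (iv_index, seq) <= last:
            return False  # replayed or old message
        self.ram[src] = (iv_index, seq)
        return True

    def save(self):
        """Copy the RAM state to flash (stand-in for the save command)."""
        self.flash = dict(self.ram)

    def power_cycle(self):
        """RAM is lost; reload whatever was last saved to flash."""
        self.ram = dict(self.flash)


rpl = ReplayProtectionList()
print(rpl.accept(0x0005, 0, 10))  # True
print(rpl.accept(0x0005, 0, 10))  # False, rejected without any save() call
rpl.save()
rpl.power_cycle()
print(rpl.accept(0x0005, 0, 10))  # False, state was restored from flash
```

While the node stays powered, replays are rejected no matter how often `save()` is called; the saved copy only matters for what the node remembers after a power cycle.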