8-bit Knowledge Base

      • The detailed structure of bootload record

        yucheng | 10/304/2016 | 11:32 PM

        The bootloader can only process bootload records created from a hex file with the Hex2Boot tool. A bootload record is a pure binary file composed of the bootloader commands described in section 5, Bootloader Protocol, of AN945.

        The figure below illustrates the detailed structure of a bootload record.

        The bootload record file starts with a Setup command that passes the flash keys to the bootloader and selects the active flash bank. The Erase command that follows is decoded by the bootloader, which erases the selected flash page. The Write command is then decoded by the bootloader, which writes the payload data to flash starting at the indicated address.




        Hex2Boot can also merge the Erase and Write commands into a single Erase command. On receiving such a record, the bootloader first erases the page and then writes the data in the record to flash.

        Whether the bootloader erases a flash page depends only on the bootload record. By default, only the flash pages that need to be written are erased first.
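        The command ordering described above can be sketched as a small Python model. This is only an illustration, not the Hex2Boot implementation: the function name plan_records, the tuple layout, and the 512-byte PAGE_SIZE are assumptions, and the real frame bytes (flash keys, bank, command codes) defined in AN945 are deliberately left out.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 512  # assumed flash page size; check the device data sheet

# (command name, flash address, payload) -- a hypothetical in-memory form,
# not the binary frame layout defined in AN945
Record = Tuple[str, int, bytes]


def plan_records(image: Dict[int, bytes], merge_erase_write: bool = False) -> List[Record]:
    """Return the command sequence for an image given as {start address: data}.

    Each chunk is assumed to fit inside one flash page.
    """
    records: List[Record] = [("setup", 0, b"")]  # flash keys and bank omitted
    erased_pages = set()
    for address in sorted(image):
        data = image[address]
        page = address // PAGE_SIZE
        first_touch = page not in erased_pages
        erased_pages.add(page)
        if first_touch and merge_erase_write:
            # Merged form: one Erase command that also carries the data
            records.append(("erase", address, data))
        else:
            if first_touch:
                # Only pages that will be written are erased
                records.append(("erase", page * PAGE_SIZE, b""))
            records.append(("write", address, data))
    return records
```

        Calling plan_records({0x0000: b"\x02\x01\x00"}) yields Setup, Erase, Write in that order, while merge_erase_write=True yields Setup followed by a single Erase that carries the data.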



      • How does the EFM8 device enter bootloader mode

        yucheng | 10/304/2016 | 11:28 PM

        1.1 Entering Bootloader Mode
        The EFM8 device enters bootloader mode under any of these three conditions.
        • On any reset, the bootloader will start if flash address 0x0000 is 0xFF (i.e. first byte of the reset vector is not programmed). This ensures new or erased parts start the bootloader for production programming. For robustness, the bootloader erases flash page 0 first and writes flash address 0x0000 last. This ensures that any interrupted bootload operations will restart when the interrupting event clears.
        • To start the bootloader on demand, the application firmware can set the signature value 0xA5 in R0 in Bank 0 (data address 0x00) and then initiate a software reset of the device. If the bootloader sees a software reset with the signature value in R0, it will start bootloader execution instead of jumping to the application.
        • To provide fail-safe operation in case the application is corrupted, the bootloader starts on either power-on reset (POR) or pin resets if a pin is held low for longer than 50 µs. A full list of entry pins for each device and package is available in Table 3.1 Summary of Pins for Bootloader Mode Entry on page 7 of AN945. The pin for this bootloader entry method can also be found by looking at the efm8_device.h file in the bootloader source code. There is no option to disable this entry method.


        1.2 Detailed Implementation in the Bootloader Source Code
        After a system reset or power-on, the MCU first executes boot_startup.asm in the bootloader firmware.
        • If flash address 0x0000 is 0xFF, the MCU jumps to boot_start to start the bootloader; otherwise it continues with the next check.
        • If RSTSRC indicates a software reset and R0 holds the signature value 0xA5, the MCU jumps to boot_start. A software reset without the signature falls through to app_start, and any other reset source jumps to pin_test for boot pin checking.
        • In pin_test, which applies only to power-on and pin resets, the MCU jumps to boot_start if the boot pin is held low for longer than 50 µs; otherwise it jumps to app_start to execute the application code.



        ; Start bootloader if reset vector is not programmed
            MOV     DPTR,#00H
            CLR     A
            MOVC    A,@A+DPTR
            CPL     A
            JZ      boot_start
        ; Start bootloader if software reset and R0 == signature
            MOV     A,RSTSRC
            CJNE    A,#010H,pin_test
            MOV     A,R0
            XRL     A,#BL_SIGNATURE
            JZ      boot_start
        ; Start the application by jumping to the reset vector
        app_start:
            LJMP    00H
        ; Start bootloader if POR|Pin reset and boot pin held low
        pin_test:
            ANL     A,#03H                  ; A = RSTSRC
            JZ      app_start               ; POR or PINR only
            MOV     R0,#(BL_PIN_LOW_CYCLES / 7)
        ?C0001:                             ; deglitch loop
            JB      BL_START_PIN,app_start  ; +3
            DJNZ    R0,?C0001               ; +4 = 7 cycles per loop
        ; Setup the stack and jump to the bootloader
        boot_start:
            MOV     SP, #?BL_STACK-1
            LJMP    ?C_START
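
        The decision flow of the listing can be restated as a host-side Python model, which may be easier to reason about than the assembly. This is a sketch only: the function name startup_target and the boolean pin_held_low (standing in for the deglitch loop) are assumptions, and the constants are simply the literal values that appear in the listing and the text.

```python
BL_SIGNATURE = 0xA5        # signature value named in the text
RSTSRC_SW_ONLY = 0x10      # value compared by CJNE in the listing
RSTSRC_POR_OR_PIN = 0x03   # mask applied by ANL in the listing


def startup_target(flash_byte0: int, rstsrc: int, r0: int, pin_held_low: bool) -> str:
    """Return the label the startup code would reach: boot_start or app_start."""
    # Reset vector not programmed: always start the bootloader
    if flash_byte0 == 0xFF:
        return "boot_start"
    # Software reset: start the bootloader only if R0 carries the signature
    if rstsrc == RSTSRC_SW_ONLY:
        return "boot_start" if r0 == BL_SIGNATURE else "app_start"
    # pin_test: only power-on and pin resets check the boot pin
    if (rstsrc & RSTSRC_POR_OR_PIN) == 0:
        return "app_start"
    # pin_held_low models the pin staying low for the whole deglitch loop
    return "boot_start" if pin_held_low else "app_start"
```

        For example, startup_target(0x02, 0x10, 0xA5, False) returns "boot_start", which corresponds to the application requesting the bootloader through R0 and a software reset.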



      • What happens if the RX FIFO of the EFM8LB1 I2C Slave overruns?

        Jiehui | 10/304/2016 | 11:24 PM


         What happens if the RX FIFO of the EFM8LB1 I2C Slave overruns?


        The EFM8LB1 I2C Slave peripheral includes two separate 2-byte FIFOs, one for transmit and one for receive. If the RX FIFO fills up during an I2C write transfer, only the first two bytes are stored in it. These are (Slave Address + Data Byte 0) if the ADDRCHK bit is set to 1, or (Data Byte 0 + Data Byte 1) if ADDRCHK is 0. The rest of the bytes are discarded. Therefore, it is the responsibility of software to empty the FIFO before it overruns.
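
        The overrun behavior described above can be illustrated with a tiny Python model of a 2-byte FIFO. The class RxFifoModel and its method names are hypothetical and only mirror the description in this answer; they are not part of any Silicon Labs library.

```python
class RxFifoModel:
    """Toy model of a 2-byte receive FIFO that discards bytes when full."""

    DEPTH = 2  # FIFO depth stated in the text

    def __init__(self) -> None:
        self.entries = []   # bytes currently held, oldest first
        self.dropped = 0    # bytes discarded because the FIFO was full

    def receive(self, byte: int) -> bool:
        """Store an incoming byte; return False if it was discarded."""
        if len(self.entries) < self.DEPTH:
            self.entries.append(byte)
            return True
        self.dropped += 1
        return False

    def read(self) -> int:
        """Software reads the oldest byte, freeing one slot."""
        return self.entries.pop(0)
```

        If four bytes arrive before software reads anything, the model keeps the first two and counts the other two as dropped, which is why firmware should drain the FIFO promptly.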

      • EFM8LB1 I2C Slave TX/RX FIFOs

        Jiehui | 10/304/2016 | 11:19 PM


        Are the EFM8LB1 I2C slave TX/RX FIFO and Shift register shared or separate?


        There are 2 separate two-byte FIFOs for RX and TX paths.

        The shift registers for I2C write and read transfers are also separate.

      • Changing part package in configurator

        jstine | 10/281/2016 | 06:24 PM


        How do I change which package of a particular part Configurator uses?  What do I do if Configurator does not list the package I need?


        The package used by Configurator can be changed by editing the .hwconf file.  Under the device tag, there will be a name field that includes the package.  In the example below, QFN32 can be changed to QFP48:


        <device:XMLDevice xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:device="http://www.silabs.com/sls/hc/document/device.ecore" name="EFM8UB20F64G-A-QFN32" partId="com.silabs.mcu.si8051.efm8ub2_g.efm8ub20f64g">
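
        If the edit needs to be repeated across several projects, it could be scripted. The following Python sketch assumes the device tag looks like the example above; the function name change_package is hypothetical, and it is worth keeping a backup of the .hwconf file before overwriting it.

```python
import re

# The example device tag from this article, used here as sample input
SAMPLE = (
    '<device:XMLDevice xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:device="http://www.silabs.com/sls/hc/document/device.ecore" '
    'name="EFM8UB20F64G-A-QFN32" '
    'partId="com.silabs.mcu.si8051.efm8ub2_g.efm8ub20f64g">'
)


def change_package(hwconf_text: str, old_pkg: str, new_pkg: str) -> str:
    """Replace the package suffix in the name field of the device tag."""
    pattern = re.compile(
        r'(<device:XMLDevice\b[^>]*?\bname="[^"]*-)' + re.escape(old_pkg) + r'(")'
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + new_pkg + m.group(2), hwconf_text, count=1
    )
    if count == 0:
        raise ValueError("device name with package %r not found" % old_pkg)
    return new_text
```

        Calling change_package(SAMPLE, "QFN32", "QFP48") returns the same tag with the name field ending in QFP48 and every other attribute untouched.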

      • IAR EW8051 printf/printf_P formatter options and code size

        ChrisM | 10/281/2016 | 06:23 PM


        How can I reduce the amount of RAM and flash used to support printf/printf_P formatter options?


        Printf/Sprintf Formatter Options


        printf() and scanf() have various levels of support for the C89 format specifiers (e.g. %d and %f).  However, full support for all C89 formatters uses a lot of RAM and flash, which may not be practical for small embedded systems.  To reduce the RAM and flash requirements, consider reducing the formatter support based on the three options listed in the table below.


        Printf/sprintf Formatter Option    Formatter Support
        Large                              Full C89 support
        Medium                             No float support (%f, %g, %G, %e, and %E not supported)
        Small                              Only supports %%, %d, %o, %c, %s, and %x with no support for field width or precision
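
        To decide which option a project can get away with, the format strings in the source can be checked against the table. The Python sketch below does this for a single format string. It is a rough host-side helper, not an IAR tool: the name minimum_formatter is hypothetical, and treating flags and length modifiers such as %ld as requiring Medium is a conservative assumption, since the table only mentions field width and precision.

```python
import re

# flags, width, precision, length modifier, conversion character
_SPEC = re.compile(r"%([-+ #0]*)(\d+|\*)?(\.(?:\d+|\*)?)?(hh|h|ll|l|L)?([a-zA-Z%])")
_SMALL_CONVERSIONS = set("%docsx")   # specifiers listed for Small in the table
_FLOAT_CONVERSIONS = set("fgGeE")    # specifiers that Medium does not support
_LEVELS = ["Small", "Medium", "Large"]


def minimum_formatter(fmt: str) -> str:
    """Return the smallest formatter option that covers every specifier in fmt."""
    needed = 0
    for flags, width, precision, length, conv in _SPEC.findall(fmt):
        if conv in _FLOAT_CONVERSIONS:
            level = 2
        elif conv in _SMALL_CONVERSIONS and not (flags or width or precision or length):
            level = 0
        else:
            level = 1  # assumption: anything else needs at least Medium
        needed = max(needed, level)
    return _LEVELS[needed]
```

        For instance, minimum_formatter("%d") returns "Small", while adding a field width ("%5d") moves the result to "Medium" and any float specifier moves it to "Large".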


        To configure the printf/scanf formatter options, perform the following steps in IAR EW8051:

        1. Right click the project and click Options

        2. Click the General Options category

        3. Click the Library Options tab

        4. Choose an option from the Printf formatter and Scanf formatter drop down list



        The Small option uses the least amount of RAM and flash, but only supports limited integer format options.  To conserve the most memory, use the Small printf formatter option with the int16_t type.


        For example:


        int16_t number = 10;
        printf("%d", number);


        The Large option will use substantially more flash as floating point is not supported in hardware and requires software libraries on 8051 devices.


        See FORMATTERS USED BY PRINTF AND SPRINTF in the "The CLIB runtime environment" section of EW8051_CompilerReference for complete details.


        Printf_P/sprintf_P Formatter Options


        Printf_P must be used when the format string is stored in code memory.  The same formatter options are available for printf_P, but they cannot be selected in the Library Options dialog.  Instead, developers must use one of the following linker options:


        Printf/sprintf Formatter Option    Linker Command Line Option
        Large                              <default>
        Medium                             -e_medium_read_P=_formatted_read_P
        Small                              -e_small_read_P=_formatted_read_P


        The Large formatter option is the default and thus no linker command line option should be added to set this option.


        To set the printf_P formatter option to Medium or Small, perform the following steps:

        1. Right click the project and click Options

        2. Click the Linker category

        3. Click the Extra Options tab

        4. Check the Use command line options checkbox

        5. Add the appropriate command line option for medium or small as specified in the table above



        See SPECIFYING READ AND WRITE FORMATTERS in EW8051_CompilerReference under the 8051-specific CLIB functions section for complete details.

      • IAR EW8051 printf() error Pe167 when storing constants and strings in CODE memory

        ChrisM | 10/281/2016 | 06:23 PM


        Why do I receive the following error message when using IAR EW8051 with the "CODE memory" setting for the "Location for constants and strings" Target setting?



        // Sample Code
        printf("%d", count);


        // Error Message
        Error[Pe167]: argument of type "char __code *" is incompatible with parameter of type "char const *"




        When using the CODE memory setting, string literals are stored in the __code memory space.  The IAR 8051 implementations of printf() and sprintf() expect to find the format string parameter in RAM.  Users must call the 8051-specific versions of these functions when format strings are stored in __code space.


        Data Memory Function    Code Memory Function
        printf                  printf_P
        scanf                   scanf_P
        sprintf                 sprintf_P


        Additional 8051-specific functions that target code memory can be found in pgmspace.h.


        See PLACING CONSTANTS AND STRINGS IN CODE MEMORY in EW8051_CompilerReference for complete details.

      • Converting 8-bit IDE Memory Upload to Binary file

        BrianL | 10/281/2016 | 06:02 PM

        In the legacy 8-bit MCU IDE, there is an option to upload the contents of the memory to a text file. This can be useful to record the firmware image on a device:






        However, the format that is produced is non-standard - every byte of flash is recorded in text, separated with a new line, e.g.:





        It may be useful to convert this format into a binary format. Attached to this case is a simple Python script that performs this conversion. The usage for this script is:

        MemoryUpload2Bin.py -m <path to memory upload txt> -b <path to binary to be created>
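
        The attached script is not reproduced here, but the idea can be sketched in a few lines of Python. The sketch below is not MemoryUpload2Bin.py: it assumes each non-empty line of the upload holds exactly one byte written in hexadecimal (with or without a 0x prefix), which should be checked against an actual upload file, and the function names are hypothetical.

```python
def upload_text_to_bytes(text: str) -> bytes:
    """Convert text with one hex byte per line into raw bytes."""
    values = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        token = line.strip()
        if not token:
            continue  # skip blank lines
        value = int(token, 16)  # accepts both "3F" and "0x3F"
        if not 0 <= value <= 0xFF:
            raise ValueError("line %d is not a byte: %r" % (line_number, token))
        values.append(value)
    return bytes(values)


def convert_file(upload_path: str, binary_path: str) -> None:
    """Read a memory upload text file and write the binary image."""
    with open(upload_path, "r") as source:
        image = upload_text_to_bytes(source.read())
    with open(binary_path, "wb") as target:
        target.write(image)
```

        If the real upload format differs (for example, decimal values or an address column), only the parsing line inside the loop would need to change.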




      • Converting binary firmware image files to Intel HEX files (bin to hex)

        BrianL | 10/281/2016 | 06:01 PM

        Binary files can be generated from several toolsets, or even by uploading the memory contents of a device using Simplicity Studio. However, it may be useful to convert these .bin files into a .hex file. A Python script has been attached to this article to perform this task.


        The usage of this file is:


        Bin2Hex.py -b <path to binary file> -o <path to hex to be created>
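
        For readers without access to the attachment, the conversion can be sketched as follows. This is not the attached Bin2Hex.py; the function name bin_to_intel_hex and its parameters are hypothetical. It emits standard Intel HEX data records (type 00) plus the end-of-file record, and it is limited to 16-bit addresses, which is assumed to be sufficient for 8-bit devices with up to 64 KB of flash.

```python
def bin_to_intel_hex(data: bytes, base_address: int = 0, record_size: int = 16) -> str:
    """Return the Intel HEX text for data placed at base_address."""
    if base_address < 0 or base_address + len(data) > 0x10000:
        raise ValueError("this sketch only handles 16-bit addresses")
    lines = []
    for offset in range(0, len(data), record_size):
        chunk = data[offset:offset + record_size]
        address = base_address + offset
        # byte count, address high, address low, record type 00 (data), payload
        record = bytes([len(chunk), address >> 8, address & 0xFF, 0x00]) + chunk
        # checksum is the two's complement of the sum of all record bytes
        checksum = (-sum(record)) & 0xFF
        lines.append(":" + (record + bytes([checksum])).hex().upper())
    lines.append(":00000001FF")  # end-of-file record
    return "\n".join(lines) + "\n"
```

        The three bytes 02 00 03 at address 0 become the record :03000000020003F8 followed by the end-of-file record.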


      • Branch instruction variable execution time on devices with a flash prefetch engine

        BrianL | 10/281/2016 | 05:57 PM


        Flash Prefetching:


        Some Silicon Labs 8-bit MCUs have a peripheral known as a Flash Prefetch Engine. This is required on devices that support MCU core SYSCLK frequencies greater than 25 MHz. Since the flash memory on the device has a maximum clock frequency of 25 MHz, and the MCU core can execute one byte of code per SYSCLK cycle, it becomes necessary to fetch multiple bytes of flash per SYSCLK cycle if SYSCLK exceeds the flash maximum clock speed.


        For example, if the MCU core was running at 50 MHz, but the flash was still only fetching one byte per 25 MHz flash clock, the MCU core would need to be idle every other SYSCLK cycle in order to wait for new instructions. To combat this, the prefetch engine allows the device to fetch multiple bytes of flash per SYSCLK clock cycle.



        Branch Instructions, variable latency:


        Branch instructions generally have a variable execution time, depending on whether the branch is taken or not taken. This occurs because the core automatically assumes that the branch will not be taken, and fetches the next byte of code after the branch instruction automatically. However, if the branch instruction is executed, and it turns out that the branch is actually taken, the core has to dump any instructions that are currently executing from the branch-not-taken code, load the code from the branch-taken destination, and begin executing this code instead. For example, here is the JZ instruction latency table from the EFM8LB1 reference manual:


                                                        Clock Cycles
        Mnemonic  Description            Bytes  FLRT = 0  FLRT = 1  FLRT = 2
        JZ rel    Jump if A equals zero  2      2 or 3    2 or 6    2 or 8
        Table 1. JZ rel variable instruction execution times


        As you can see, each column for number of clocks to execute this instruction has two values - one for the branch-not-taken case (the smaller number), and one for the branch-taken case (the larger number). The branch-not-taken case will always be faster, since the core, by default, is already in the process of executing the instructions after the branch statement by the time the branch instruction is resolved. If the branch instruction resolves to 'branch not taken', the core continues executing as normal, causing no delays in execution.


        For other instructions, such as SJMP, the branch is always taken, requiring that the core always take this longer route to execution:


                                                                Clock Cycles
        Mnemonic  Description                    Bytes  FLRT = 0  FLRT = 1  FLRT = 2
        SJMP rel  Short jump (relative address)  2      3         6         8

        Table 1. SJMP rel instruction execution times



        Prefetch Engine's impact on instruction execution times:


        Another thing to note, however, is that the instruction execution time is now dependent on the FLRT setting in the prefetch engine. This setting has the following description in the EFM8LB1 reference manual:


        "Flash Read Timing:
        This field should be programmed to the smallest allowed value, according to the system clock speed. When transitioning to a faster clock speed, program FLRT before changing the clock. When changing to a slower clock speed, change the clock before changing FLRT."


        FLRT Setting    Description
        0               SYSCLK < 25 MHz.
        1               SYSCLK < 50 MHz.
        2               SYSCLK < 75 MHz.


        Effectively, FLRT=0 means that one byte of flash is fetched per SYSCLK cycle, FLRT=1 means that two bytes are fetched at once, and FLRT=2 means that four bytes are fetched at once. This setting also determines the flash clock timing: the flash clock is SYSCLK for FLRT=0, SYSCLK/2 for FLRT=1, and SYSCLK/3 for FLRT=2.
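
        The FLRT table, the resulting flash clock, and the ordering rule from the quoted reference manual text can be captured in a short Python sketch. The function names are hypothetical and the limits are taken directly from the table above, so they should be confirmed against the reference manual for the device in use.

```python
FLRT_LIMITS_HZ = (25_000_000, 50_000_000, 75_000_000)  # from the FLRT table


def flrt_for_sysclk(sysclk_hz: int) -> int:
    """Return the smallest FLRT value allowed for the given SYSCLK."""
    for flrt, limit in enumerate(FLRT_LIMITS_HZ):
        if sysclk_hz < limit:
            return flrt
    raise ValueError("SYSCLK of %d Hz is outside the table" % sysclk_hz)


def flash_clock_hz(sysclk_hz: int, flrt: int) -> float:
    """Flash clock is SYSCLK divided by 1, 2, or 3 for FLRT = 0, 1, or 2."""
    return sysclk_hz / (flrt + 1)


def clock_change_steps(old_hz: int, new_hz: int) -> list:
    """Order the two register updates as the quoted description requires."""
    flrt_step = "set FLRT = %d" % flrt_for_sysclk(new_hz)
    clock_step = "set SYSCLK = %d Hz" % new_hz
    # Speeding up: FLRT first. Slowing down: clock first.
    return [flrt_step, clock_step] if new_hz > old_hz else [clock_step, flrt_step]
```

        At SYSCLK = 72 MHz this gives FLRT = 2 and a 24 MHz flash clock, which stays under the 25 MHz flash limit mentioned earlier.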


        How does this impact instruction execution times? When a jump instruction is executed and the jump is taken, the core must load the code from the destination address. This takes one flash clock cycle, during which the core must be idle. For FLRT=1 and FLRT=2, one flash clock cycle equals 2 and 3 core cycles, respectively. Some additional latency is added by loading the new flash address, discarding the instructions currently in the core, and so on. Generally, compared with FLRT=0, the jump-taken execution time is 3 clock cycles longer for FLRT=1 and 5 clock cycles longer for FLRT=2.


        More variation in execution times:


        However, these numbers are not absolute rules. More quirks with the flash prefetch engine can speed up or slow down execution, depending on the location of the jump instruction itself, as well as the destination location.


        As an experiment, the SJMP instruction was evaluated on an EFM8LB1 device running at SYSCLK = 72 MHz with FLRT=2. For the first test, the SJMP instruction was placed at address 0x0100, with its destination address varied between 0x0102 and 0x011A.


        SJMP Rel  Cycles  Destination Address  Address Mod 4
        0         3       0x0102               2
        1         3       0x0103               3
        2         3       0x0104               0
        3         3       0x0105               1
        4         3       0x0106               2
        5         5       0x0107               3
        6         6       0x0108               0
        7         6       0x0109               1
        8         6       0x010A               2
        9         8       0x010B               3
        10        6       0x010C               0
        11        6       0x010D               1
        12        6       0x010E               2
        13        8       0x010F               3
        14        6       0x0110               0
        15        6       0x0111               1
        16        6       0x0112               2
        17        8       0x0113               3
        18        6       0x0114               0
        19        6       0x0115               1
        20        6       0x0116               2
        21        8       0x0117               3
        22        6       0x0118               0
        23        6       0x0119               1
        24        6       0x011A               2

         Table 2. SJMP instruction at 0x0100 


        As you can see, the SJMP execution time varies from 3 cycles (when the destination is within 4 bytes of the SJMP instruction) all the way to 8 cycles (when the destination is farther away from the SJMP instruction and the destination address ends in 0x3, 0x7, 0xB, or 0xF; that is, when the address modulo 4 equals 3).
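
        The pattern is easier to see when the Table 2 measurements are grouped by the destination address modulo 4. The Python sketch below does this using the cycle counts copied from Table 2; the function name cycles_by_alignment and the min_rel cutoff are hypothetical helpers for this illustration.

```python
SJMP_ADDRESS = 0x0100  # location of the 2-byte SJMP instruction in the first test

# Cycle counts from Table 2, indexed by the SJMP rel value (0..24)
TABLE_2_CYCLES = [3, 3, 3, 3, 3, 5, 6, 6, 6, 8, 6, 6, 6, 8,
                  6, 6, 6, 8, 6, 6, 6, 8, 6, 6, 6]


def cycles_by_alignment(cycles, min_rel=0):
    """Group measured cycle counts by destination address modulo 4."""
    groups = {}
    for rel, count in enumerate(cycles):
        if rel < min_rel:
            continue
        destination = SJMP_ADDRESS + 2 + rel  # rel is relative to the next instruction
        groups.setdefault(destination % 4, set()).add(count)
    return groups
```

        For destinations with rel of 6 or more, the grouping shows 6 cycles for addresses with modulo 0, 1, or 2, and 8 cycles for addresses with modulo 3.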


        Another test was performed, this time with the SJMP instruction at address 0x00FE:


        SJMP Rel  Cycles  Destination Address  Address Mod 4
        0         4       0x0100               0
        1         4       0x0101               1
        2         4       0x0102               2
        3         5       0x0103               3
        4         6       0x0104               0
        5         6       0x0105               1
        6         6       0x0106               2
        7         8       0x0107               3
        8         7       0x0108               0
        9         7       0x0109               1
        10        9       0x010A               2
        11        7       0x010B               3
        12        7       0x010C               0
        13        7       0x010D               1
        14        7       0x010E               2
        15        9       0x010F               3
        16        7       0x0110               0
        17        7       0x0111               1
        18        7       0x0112               2
        19        9       0x0113               3
        20        7       0x0114               0
        21        7       0x0115               1
        22        7       0x0116               2
        23        9       0x0117               3
        24        7       0x0118               0

         Table 3. SJMP instruction at 0x00FE


        Again, the SJMP instruction execution time varies, this time with roughly one additional clock cycle in most cases. The fastest execution, again, was when the destination was close to the SJMP instruction address, while destinations whose addresses end in 0x3, 0x7, 0xB, or 0xF generally performed worse.


        Minimizing execution times:


        Several observations can be made that can be used to minimize instruction execution times for branch instructions, in cases where highly optimized branches are needed.


        1. If possible, put the destination address very close to the jump instruction. 
        2. Place the destination address on an even flash boundary. If FLRT=1, place the destination so that its address modulo 2 equals zero. If FLRT=2, place the destination so that its address modulo 4 equals zero.
        3. Place the jump instruction on a similar boundary, as in #2. If FLRT=1, place the jump instruction so that its address modulo 2 equals zero. If FLRT=2, place the jump instruction so that its address modulo 4 equals zero.
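
        When laying out timing-critical code by hand, the alignment rule from points 2 and 3 can be checked with a small helper. The Python sketch below computes how many padding bytes (for example, NOPs) would be needed to move an address to the next fetch boundary; the function names are hypothetical and the fetch widths are the ones stated earlier in this article.

```python
# Bytes fetched at once for each FLRT setting, as described in this article
FETCH_BYTES = {0: 1, 1: 2, 2: 4}


def padding_to_align(address: int, flrt: int) -> int:
    """Return the number of padding bytes needed to reach the next boundary."""
    return (-address) % FETCH_BYTES[flrt]


def aligned_address(address: int, flrt: int) -> int:
    """Return the first address at or after address that sits on a boundary."""
    return address + padding_to_align(address, flrt)
```

        With FLRT=2, a destination at 0x0107 would need one padding byte to land on 0x0108, whose address modulo 4 equals zero.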


        Attached to this topic is a program called SJMP_TEST.asm, which can be used to measure the instruction execution times of jump instructions. This is performed by monitoring the value of TIMER3 (clocked by SYSCLK) once the jump instruction is executed, and the core reaches the destination.


        This thread may also be a good resource for more information on this topic: http://community.silabs.com/t5/8-bit-MCU/Random-Latency-with-Laser-Bee-Branch-Instructions/m-p/177927#U177927