The following page content corresponds to the products marketed in Japan.

Software Development Tips for Boosting System Performance


To help minimize the size of the code needed for applications and maximize how fast they run, it's important to choose the right compiler Build options.

The microcomputers in the RX product line deliver a step up in processing performance compared to other mid-range devices. A key way that they achieve high code efficiency is by using shorter encodings for frequently used instructions, in conjunction with the high-performance RX C/C++ compiler. To get the most performance out of RX chips, it is essential to select that compiler's Build options properly. This article explains three key Build options we recommend for application development. Choosing them correctly is particularly effective for developing applications that run quickly and require less memory.

Providing both major processing speed boosts and code size reductions

To help ensure that applications require the least amount of code and run as fast as possible, it is important to correctly use three key Build options provided by the Renesas C/C++ compiler for our RX microcomputers. This tutorial describes those options and offers recommendations for their use. Specifically, we will explain the following Build options:
[1] Optimization for external variables
[2] Base address setting
[3] Optimization method

Applying Build options: [1] Optimization for variables with wide scope
Select [Inter-module] to optimize external variables across all files.

The first Build option examined here optimizes wide-scope variables, i.e., variables defined outside functions so that they can be accessed from all functions. If this optimization isn't used, the code generated when accessing wide-scope variables from inside a function contains a separate address for each variable. As a result, processing performance is lower than it could be. Code efficiency is adversely affected, too, because access in the absolute addressing mode of RX microcomputers requires 32-bit address data.

Our recommended action is to enable [Optimization for Access to External Variables] (see Screen 1). This option automatically combines multiple wide-scope variables of the same size into a single structure. One variable in the structure is placed at the base address, and every other variable is then accessed by relative addressing from that base, which helps reduce code size.
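The grouping described above can be pictured with a hand-written sketch in plain C. The compiler performs this transformation automatically, so the code below is only an illustration; the variable and function names (ext_a, grouped_vars, and so on) are hypothetical and do not come from the article's figures.

```c
#include <assert.h>
#include <stdint.h>

/* Before: three separate wide-scope variables (hypothetical names).
   Each access may need that variable's own 32-bit absolute address. */
int32_t ext_a, ext_b, ext_c;

int32_t sum_separate(void)
{
    ext_a = 1;
    ext_b = 2;
    ext_c = 3;
    return ext_a + ext_b + ext_c;
}

/* After (conceptually): same-size variables packed into one structure.
   One base address is loaded once; the other members are reached
   through small offsets relative to that base. */
struct grouped_vars {
    int32_t a;
    int32_t b;
    int32_t c;
};

struct grouped_vars grouped;

int32_t sum_grouped(void)
{
    struct grouped_vars *base = &grouped;  /* single address load */
    base->a = 1;                           /* offset 0 from base */
    base->b = 2;                           /* small positive offset */
    base->c = 3;                           /* small positive offset */
    return base->a + base->b + base->c;
}

/* Self-check: both forms compute the same result. */
void check_grouping_demo(void)
{
    assert(sum_separate() == sum_grouped());
}
```

The behavior of the two functions is identical; only the addressing pattern that a compiler can generate for them differs.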

Two settings, [Inter-module] and [Inner-module], are available for this optimization. Selecting [Inner-module] uses the structure format only for variables defined in the same file (see Figure 1, left side). In contrast, the [Inter-module] setting uses the structure format for all wide-scope variables, regardless of where they are defined.

The [Inter-module] setting results in a longer Build time because wide-scope variables from different files cannot be combined as a structure until after linking; therefore, the compiler and linker must run twice. Nevertheless, this option is likely to be particularly effective (see Figure 1, right side). Accordingly, we recommend that users select [Inter-module] for [Optimization for access to external variables].

Screen 1: Procedure for selecting the [Inter-module] setting for optimization for wide-scope variables. From the [Build] menu, select [RX Standard Toolchain] to open the menu screen (upper screen). Next, select the [Compiler] tab page ([1]) and select [Optimize] from the [Category] drop-down menu ([2]). Next, select [Inter-module] from the [Optimization for Access to External Variables] drop-down menu ([3]). This causes a [WARNING] message window to open. Click the [OK] button in the window.

Figure 1: Comparison of [Inner-module] optimization and [Inter-module] optimization. The code on the left shows an example of [Inner-module] optimization for wide-scope variables, while the code on the right shows the result of [Inter-module] optimization. If you select [Inner-module], of the six wide-scope variables contained in the example, the structure format is only used for the two variables (a and b) that have the same size and are defined in the same file. However, if you choose [Inter-module], the structure format is also used for variables x, y, and z, which are defined in different files. As a result, two data structures are generated: one for variables a, b, x, and y and the other for variables c and z. As you can see from these examples, this reduces the code size from 53 to 29 bytes and cuts the number of cycles from 16 to 13.
Applying Build options: [2] Base address setting
To reduce program size, use the base address setting for frequently accessed I/O areas and variables.

The next Build option we will cover is the base address setting. As explained in the previous section, using 32-bit address data reduces code efficiency, and the same caveat applies to access to I/O areas. The purpose of the base address setting, then, is to give absolute address accesses (in other words, accesses to I/O areas) the same benefits that [Optimization for variables with wide scope] provides for variables. The base address of the area to be accessed is set in one of the RX CPU's general-purpose registers (R8 to R13) and is then used as the basis for relative accesses.

Screen 2 shows that the decision of whether or not to use a base address can be set independently for the ROM, RAM, and I/O areas. Writing these addresses to the general-purpose registers is performed at the start of the PowerON_Reset function generated by the compiler.

Figure 2 illustrates an application example consisting of code that initializes each TPU register. Code efficiency is lower when the base address is not set (left side), since in this case 16-bit relative addresses are used. On the other hand, code efficiency is improved by the use of 8-bit relative addresses when the top address of the TPU register area is set as the base address (right side).

Accordingly, by eliminating address-load instructions, this function can increase speed and reduce program size. We recommend using it when accessing I/O areas and other data areas for which an absolute address has been specified by a pointer, macro definition, or "#pragma address" directive.
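As a rough illustration of base-relative access, the sketch below initializes five registers through one base pointer plus small offsets. It is portable C rather than RX-specific code: the array fake_io_area stands in for a real peripheral area, and the register names and values are hypothetical, not the TPU registers shown in Figure 2. With the Build option enabled, the compiler keeps the base in a general-purpose register itself; no source change like this is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a peripheral register area. On real hardware this
   would be a fixed I/O address, not an array in RAM. */
static uint8_t fake_io_area[8];

/* Hypothetical register offsets from the start of the area. */
enum {
    REG_CTRL   = 0,
    REG_MODE   = 1,
    REG_INTEN  = 2,
    REG_STATUS = 3,
    REG_COUNT  = 4
};

/* All five writes are expressed relative to one base address,
   so each access needs only a small non-negative offset. */
void init_regs(volatile uint8_t *base)
{
    base[REG_CTRL]   = 0x01;
    base[REG_MODE]   = 0x03;
    base[REG_INTEN]  = 0x00;
    base[REG_STATUS] = 0xC0;
    base[REG_COUNT]  = 0x00;
}

/* Self-check using the simulated register area. */
void check_base_demo(void)
{
    init_regs(fake_io_area);
    assert(fake_io_area[REG_CTRL] == 0x01);
    assert(fake_io_area[REG_STATUS] == 0xC0);
}
```

Choosing the lowest address of the area as the base keeps every offset non-negative, which matters because the caption of Figure 2 notes that negative offsets from the base register are not supported.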

Screen 2: Procedure for setting base addresses. Select the [CPU] tab page ([1]) and then use the [Base Registers] drop-down menus to select which general-purpose registers to use for the ROM, RAM, and peripheral I/O base addresses respectively ([2]). The [Address] field specifies the peripheral I/O address ([3]).

Figure 2: Comparison of results of compilation with and without base address settings. The example on the left, in which no base address has been set, contains five 16-bit relative accesses. In the example on the right, the base address has been set, so each of these five 16-bit relative accesses has been replaced by an 8-bit relative access. Note that an absolute address access appears in the first line because relative accesses that involve a negative offset from the address set in the base register are not supported.
Applying Build options: [3] Optimization method
The correct choice of "Optimize for speed" or "Optimize for size" depends on the application.

The third Build option discussed here specifies the criterion by which the compiler performs optimization. Giving priority to program speed and giving priority to program size tend to be conflicting objectives, and system performance will differ depending on which criterion is selected. Accordingly, the RX compiler offers a choice between [Optimize for speed] and [Optimize for size] and also lets users specify the optimization level (see Screen 3). The chart in Figure 3 lists the settings implemented when either of these options is selected.

There are three available optimization levels: "1", "2", and "MAX". For example, when [Optimize for speed] is selected, loop expansion (see Figure 4) reduces the number of loop iterations to approximately half if optimization level 2 is chosen, or to approximately 1/32 if "MAX" is chosen.
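Loop expansion by a factor of two can be sketched by hand as follows. This is an illustration in plain C99 of what the compiler does automatically at optimization level 2, not its actual output; the buffer and function names are hypothetical, and the iteration counters exist only to make the halving visible.

```c
#include <assert.h>
#include <stdint.h>

#define N 100

static int32_t buf_rolled[N];
static int32_t buf_unrolled[N];

/* Original loop: one element per iteration, N iterations. */
int fill_rolled(int32_t *buf)
{
    int iterations = 0;
    for (int i = 0; i < N; i++) {
        buf[i] = i;
        iterations++;
    }
    return iterations;
}

/* Expanded loop: two elements per iteration, N / 2 iterations.
   Fewer conditional branches run, but the loop body is larger. */
int fill_unrolled2(int32_t *buf)
{
    int iterations = 0;
    for (int i = 0; i < N; i += 2) {
        buf[i] = i;
        buf[i + 1] = i + 1;
        iterations++;
    }
    return iterations;
}

/* Returns 1 when both buffers hold identical contents. */
int buffers_match(void)
{
    for (int i = 0; i < N; i++) {
        if (buf_rolled[i] != buf_unrolled[i]) {
            return 0;
        }
    }
    return 1;
}

/* Self-check: half the iterations, same result. */
void check_unroll_demo(void)
{
    assert(fill_rolled(buf_rolled) == 2 * fill_unrolled2(buf_unrolled));
    assert(buffers_match());
}
```

The hand-expanded version relies on N being even; a compiler would add a remainder step when the trip count does not divide evenly.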

Similarly, when optimizing for speed, inline expansion is used in place of function calls to speed up execution. If optimization level 2 is specified, expansion is performed until the calling function reaches twice its original size. By contrast, when the MAX optimization level is specified, expansion will continue until the function reaches approximately 655 times its original size. We recommend choosing optimization level 2.
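The effect of inline expansion can be pictured with a small hand-written sketch. The compiler decides by itself which calls to expand, so the code below only shows the before and after shapes; the helper scale_value and the other names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Small helper that a compiler optimizing for speed might expand inline. */
static int32_t scale_value(int32_t x)
{
    return x * 3 + 1;
}

/* Before: every element costs a function call and return. */
int32_t sum_with_calls(const int32_t *v, int n)
{
    int32_t total = 0;
    for (int i = 0; i < n; i++) {
        total += scale_value(v[i]);
    }
    return total;
}

/* After (conceptually): the helper's body is pasted into the caller.
   The call overhead disappears, but the caller grows in size. */
int32_t sum_inlined(const int32_t *v, int n)
{
    int32_t total = 0;
    for (int i = 0; i < n; i++) {
        total += v[i] * 3 + 1;
    }
    return total;
}

static const int32_t sample_values[4] = { 1, 2, 3, 4 };

/* Self-check: both forms compute the same sum. */
void check_inline_demo(void)
{
    assert(sum_with_calls(sample_values, 4) == sum_inlined(sample_values, 4));
}
```

Each expanded call site adds a copy of the helper's body to the caller, which is why the size limit on the calling function differs so sharply between level 2 and MAX.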

When using optimization level 1 or higher, users must always specify the volatile modifier for wide-scope variables that are used in both standard and interrupt functions. The reason is that, to reduce the number of memory accesses, the compiler copies the values of wide-scope variables to registers inside a function. Arithmetic and other operations then work on the register copies rather than on the variables themselves, and the register values are written back to the wide-scope variables when the operations are finished. This creates problems in certain cases, for example when one of these variables is accessed from an interrupt function while an operation is in progress.
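A minimal sketch of the volatile rule in standard C is shown below. It makes one loud assumption: a signal handler stands in for an interrupt function so that the example runs on a desktop system; on an RX device the handler would be a real interrupt function instead. The names tick_flag, fake_isr, and wait_for_tick are hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* Shared between normal code and the (simulated) interrupt function.
   volatile forces every read in the loop to go to memory instead of
   reusing a copy held in a register. */
static volatile sig_atomic_t tick_flag = 0;

/* Stand-in for an interrupt function (assumption: a signal handler). */
static void fake_isr(int signum)
{
    (void)signum;
    tick_flag = 1;
}

/* Polls the flag until the simulated interrupt sets it.
   Returns the number of polls performed. */
int wait_for_tick(void)
{
    int polls = 0;

    tick_flag = 0;
    signal(SIGINT, fake_isr);

    while (!tick_flag) {
        polls++;
        if (polls == 3) {
            raise(SIGINT);  /* simulate the interrupt on the third poll */
        }
    }

    signal(SIGINT, SIG_DFL);
    return polls;
}

/* Self-check: the loop ends once the handler has set the flag. */
void check_volatile_demo(void)
{
    assert(wait_for_tick() == 3);
    assert(tick_flag == 1);
}
```

In this simulation the interrupt arrives through an explicit raise() call, so the loop would probably terminate even without volatile. On real hardware the interrupt is asynchronous and nothing in the loop tells the compiler that the variable can change, which is where the register copy described above causes trouble.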

Screen 3: Procedure for selecting optimization method. Select the [Compiler] tab page ([1]) and then select [Optimize] from the [Category] drop-down menu ([2]). Next, in the [Speed or size] drop-down menu, select either [Optimize for speed] or [Optimize for size] ([3]). The default setting is [Optimize for size].

Figure 4: Example of loop expansion when optimizing for speed. This code shows how a 100-iteration for loop is optimized at optimization level 2. The left side shows the code before optimization; the right side shows it after optimization. As you can see, the optimization process has reduced the number of loop iterations by half, to 50 iterations. Fewer conditional branch instructions are executed, so execution speed increases. On the other hand, the larger loop body means the program size increases.

