# FII RISC-V3.01 on FII-PRX100-S (ARTIX-7, XC7A100T) XILINX FPGA Board Evaluation

V1.1

FRASER INNOVATION INC



# **Version Control**

| version | Date       | Description                                    |
|---------|------------|------------------------------------------------|
| 1.0     | 09/29/2020 | Initial Release                                |
| 1.1     | 10/07/2020 | Add Description of PRX100 and Comparison Plots |



### © 2020 Fraser Innovation Inc ALL RIGHTS RESERVED

Without written permission of Fraser Innovation Inc, no unit or individual may extract or modify part of or all the contents of this manual. Offenders will be held liable for their legal responsibility.

Thank you for purchasing the FPGA development board. Please read the manual carefully before using the product and make sure that you know how to use the product correctly. Improper operation may damage the development board. This manual is updated regularly, so we recommend downloading the latest version before use.

# **Official Shopping Website:**

https://fpgamarketing.com/FII-PRX100-D-ARTIX-100T-XC7A100T-RISC-V-FPGA-Board-PRX100-D-1.htm



(https://fpgamarketing.com/FII-PRX100-S-ARTIX-100T-XC7A100T-Xilinx-RISC-V-FPGA-Board-FII-PRX100-S-1.htm).

Coremark has been EEMBC's CPU benchmark standard since 2009. EEMBC (Embedded Microprocessor Benchmark Consortium) is a non-profit organization whose members include Huawei, Intel, ARM and Analog Devices. Its benchmarks are an important standard for evaluating embedded processors and compilers [1].

Coremark mainly exercises the ALU (Arithmetic Logic Unit), memory references, the pipeline and branch operations. It is designed so that the results cannot be computed in advance, at compile time, which keeps the test fair. During the timed portion of the test, Coremark does not call any third-party library, so the result depends entirely on the compiler's optimization and the CPU's execution time. Because Coremark is intended to evaluate the CPU architecture, the final result is normalized to remove the advantage of a faster manufacturing process: the score is divided by the system clock frequency, giving a figure in Coremark/MHz. Coremark's main code is written in C and includes list processing (find and sort), matrix manipulation (common matrix operations), a state machine (determining whether an input stream contains valid numbers), and CRC (cyclic redundancy check) [2].
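To give a feel for the kind of work one of these kernels performs, here is a minimal Python sketch of a bitwise CRC-16. It is not Coremark's source code (which is written in C). The polynomial (the common CRC-16/ARC variant, reflected form 0xA001), the zero initial value and the function name `crc16_arc` are assumptions chosen only for illustration.

```python
def crc16_arc(data: bytes, crc: int = 0x0000) -> int:
    """Bitwise CRC-16/ARC (reflected polynomial 0xA001); illustrative only."""
    for byte in data:
        crc ^= byte                      # fold the next byte into the low bits
        for _ in range(8):               # process the byte one bit at a time
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc & 0xFFFF


if __name__ == "__main__":
    # "123456789" is the conventional check string used in CRC catalogues.
    print(hex(crc16_arc(b"123456789")))  # 0xbb3d
```

The inner loop is dominated by shifts, XORs and a data-dependent branch, which is why a CRC kernel is a useful probe of ALU and branch behaviour.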

The FII RISC-V3.01 system clock is 50 MHz, and the Coremark score shown in Figure 1 is 169, which is 3.38 Coremark/MHz (169/50).



Figure 1 FII RISC-V Coremark
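The normalization described above is a single division. The short Python sketch below reproduces it from the numbers quoted in the text (a raw score of 169 at a 50 MHz clock). The function name `coremark_per_mhz` is a hypothetical helper introduced here for illustration.

```python
def coremark_per_mhz(coremark_score: float, clock_mhz: float) -> float:
    """Normalize a raw Coremark score by the system clock frequency in MHz."""
    if clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    return coremark_score / clock_mhz


if __name__ == "__main__":
    # Values quoted in the text: raw score 169 at a 50 MHz system clock.
    print(f"{coremark_per_mhz(169, 50):.2f} Coremark/MHz")  # 3.38 Coremark/MHz
```

Dividing by the clock frequency is what allows a 50 MHz soft core on an FPGA to be compared with parts that run at several hundred MHz.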

Figure 2 is a screenshot of Coremark results for some other CPUs, as listed on the EEMBC website.



|               | Processor                              | Cert.        | Compiler                | Execution Memory      | MHz  | Cores | CoreMark  | CoreMark / MHz    | Threads | Date       |
|---------------|----------------------------------------|--------------|-------------------------|-----------------------|------|-------|-----------|-------------------|---------|------------|
|               | STMicroelectronics STM32L476           | ~            | IAR ANSI C/C++ Compiler | Internal Flash        | 80   | 1     | 265.61    | 3.32              | 1       | 2015-02-04 |
|               | Renesas RZ/T1                          | 1            | IAR ANSI C/C++ Compiler | 512K TCM for code,    | 600  | 1     | 1904.17   | 3.17              | 1       | 2015-01-28 |
|               | Renesas RX71M                          | $\checkmark$ | Renesas CC-RX V.2.03    | Code in Flash (no wa  | 240  | 1     | 1044.60   | 4.35              | 1       | 2015-01-16 |
|               | Renesas RX64M                          | ~            | Renesas CC-RX V.2.03    | Code in Flash (no wa  | 120  | 1     | 546.24    | 4.55              | 1       | 2015-01-16 |
|               | STMicroelectronics STM32L053           | 1            | IAR ANSI C/C++ Compiler | Internal Flash        | 16   | 1     | 39.91     | 2.49              | 1       | 2015-01-12 |
|               | STMicroelectronics STM32L053           | ~            | IAR ANSI C/C++ Compiler | Internal Flash        | 32   | 1     | 75.18     | 2.35              | 1       | 2015-01-12 |
|               | STMicroelectronics STM32L152           | ~            | IAR ANSI C/C++ Compiler | Internal Flash        | 16   | 1     | 53.36     | 3.33              | 1       | 2015-01-12 |
|               | STMicroelectronics STM32L152           | 1            | IAR ANSI C/C++ Compiler | Internal Flash        | 32   | 1     | 92.36     | 2.89              | 1       | 2015-01-12 |
|               | Atmel SMART SAMV71Q21                  | 1            | IAR-EWARM-7.30          | Code ITCM; Data DT    | 300  | 1     | 1503.00   | 5.01              | 1       | 2015-01-05 |
|               | Imagination P5600                      | 1            | Sourcery CodeBench 2014 | DDR2                  | 20   | 1     | 112.10    | 5.61              | 1       | 2014-12-02 |
|               | Altera Arria V SoC                     | ~            | Linaro GCC 2013.02 (GCC | 1 GB DDR3 SDRAM       | 1050 | 2     | 5654.00   | 5.38              | 2       | 2014-10-06 |
|               | Microchip Technology PIC32MZ2048E      | $\checkmark$ | Microchip MPLAB XC32 v  | Code in Flash, Data i | 200  | 1     | 636.97    | 3.19              | 1       | 2014-09-24 |
|               | STMicroelectronics STM32F756NGH6       | 1            | IAR ANSI C/C++ Compiler | Internal Flash        | 200  | 1     | 1001.79   | 5.01              | 1       | 2014-09-24 |
|               | Microchip PIC18F46K22                  | 1            | Microchip MPLAB XC8 v1  | Code in Flash, Data i | 64   | 1     | 7.23      | 0.11              | 1       | 2014-06-12 |
|               | Renesas RX64M                          | $\checkmark$ | IAR EWRX V2.50.1        | Code in Flash (no wa  | 120  | 1     | 510.20    | 4.25              | 1       | 2014-03-12 |
|               | Microchip dsPIC33EP512MU810            | 1            | Microchip MPLAB XC16v1  | Code in Flash, Data i | 70   | 1     | 132.39    | 1.89              | 1       | 2014-02-20 |
|               | Microchip PIC32MZ2048ECH100            | 1            | Microchip MPLAB XC32v1  | Code in internal Flas | 200  | 1     | 654.36    | 3.27              | 1       | 2014-01-13 |
|               | Renesas RZ/A1H                         | ~            | IAR ANSI C/C++ Compiler | SRAM 133MHz           | 400  | 1     | 1660.00   | 4.15              | 1       | 2013-11-20 |
|               | Tilera TILE-Gx8072                     | ~            | gcc 4.4.6               | DDR3 1333MT/s He      | 1200 | 71    | 277578.70 | 231.32            | 71      | 2013-11-11 |
|               | Texas MSP430F5529                      | 1            | IAR EW430 V.5.52.1      | Data in SRAM (stack   | 25   | 1     | 27.70     | 1.11              | 1       | 2013-10-15 |
|               | Imagination Technologies interAptiv si | 1            | gcc 4.9.0               | DDR231MHz             | 62.5 | 1     | 221.10    | 3.54              | 2       | 2013-06-17 |
|               | Imagination Technologies microAptiv    | 1            | gcc 4.9.0               | DDR 40MHz             | 40   | 1     | 137.48    | 3.44              | 1       | 2013-06-16 |
|               | Imagination Technologies proAptiv sing | ~            | gcc 4.9.0               | DDR231MHz             | 62.5 | 1     | 319.06    | 5.11              | 1       | 2013-06-16 |
|               | ARM Cortex-A15                         | $\checkmark$ | armcc 5.03-24           | DDR3 800MHz           | 1700 | 2     | 15908.00  | 9.36              | 2       | 2013-04-15 |
|               | Renesas RX111                          | 1            | IAR EWRX V2.41.3        | Code in FLASH (no     | 32   | 1     | 98.52     | 3.08              | 1       | 2013-03-21 |

Figure 2 Part of the EEMBC CPU Coremark results



Figure 3 CPU Coremark Comparison

FII RISC-V3.01 is a single-core CPU with a mixed 2-stage and 3-stage pipeline. Figure 3 lists the EEMBC-certified Coremark scores of some other single-core CPUs, with FII RISC-V3.01 outlined in red. It can be seen that FII



blue). According to the official STMicroelectronics manuals, the STM32H72x/73x rev Z and the STM32H7B3 rev Z both use the Cortex-M7, which has a 6-stage superscalar pipeline. The Renesas Electronics RX66T uses the RXv3 core, which has an improved 5-stage pipeline. Because a deeper pipeline generally gives higher CPU performance, a fairer comparison is with the Texas Stellaris Cortex-M3 (highlighted in blue), which, like FII RISC-V3.01, has a 3-stage pipeline. The Coremark score of FII RISC-V3.01 is much higher, by more than a factor of two. Compared with another three-stage pipeline processor, the Microchip ATSAML21J18B (highlighted in blue), the Coremark score of FII RISC-V3.01 is still much higher. In conclusion, among processors with the same number of pipeline stages, FII RISC-V3.01 performs very well.

Before Coremark appeared, Dhrystone was also a widely recognized benchmark for evaluating CPU performance. It was developed by Reinhold Weicker in 1984. Together with Livermore, Whetstone and Linpack, it is one of the so-called "classic benchmarks" that were popular in the 1970s and 1980s. Each of these four benchmarks has its own focus: Dhrystone tests integer operations for UNIX systems, Whetstone tests floating-point operations for minicomputers, Linpack tests floating-point operations for workstations, and Livermore tests numeric operations for supercomputers [3].

There are two main versions of Dhrystone, 1.1 and 2.1. Dhrystone 2.1 improves on Dhrystone 1.1 so that parts of the code are not eliminated by compiler optimization, which would distort the result. The main code of Dhrystone is written in C, and its result is expressed in terms of the CPU's Millions of Instructions Per Second (MIPS). The raw result is normalized by dividing it by 1757, because the Digital VAX 11/780 of 1977, historically regarded as the first machine able to execute one million instructions per second, scored 1757 Dhrystone/s (that is, it ran the Dhrystone benchmark 1757 times per second). The normalized figure is called Dhrystone MIPS (DMIPS). A limitation of Dhrystone is that it calls library functions inside its main loop, and compilers usually optimize these calls or replace them with assembly code, so Dhrystone in fact also measures how well the compiler optimizes the C library functions for a specific CPU.

As shown in Figure 4, FII RISC-V3.01 scores 98360 Dhrystone/s on Dhrystone 2.1, which is 55.98 DMIPS (98360/1757), or 1.12 DMIPS/MHz (55.98/50 MHz).



Dhrystone Benchmark, Version 2.1 (Language: C)

Program compiled without 'register' attribute

Execution starts, 6000000 runs through Dhrystone

Execution ends

Seconds passed: 61

number of runs: 6000000

Microseconds for one run through Dhrystone: 10.1

Dhrystones per Second: 98360

Figure 4 Dhrystone test results
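The DMIPS arithmetic can be checked directly from the run count and elapsed time reported in Figure 4. The Python sketch below is only an illustration of that calculation. The function and key names are hypothetical, and the 50 MHz clock is the system clock stated earlier in this document.

```python
VAX_11_780_DHRYSTONES_PER_SECOND = 1757  # reference score used for normalization


def dhrystone_summary(runs: int, seconds: float, clock_mhz: float) -> dict:
    """Derive Dhrystone/s, DMIPS and DMIPS/MHz from a timed Dhrystone run."""
    dhrystones_per_second = runs / seconds
    dmips = dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND
    return {
        "dhrystones_per_second": dhrystones_per_second,
        "dmips": dmips,
        "dmips_per_mhz": dmips / clock_mhz,
    }


if __name__ == "__main__":
    # Figure 4: 6000000 runs in 61 seconds, on a 50 MHz system clock.
    result = dhrystone_summary(runs=6_000_000, seconds=61, clock_mhz=50)
    print(int(result["dhrystones_per_second"]))   # 98360
    print(f"{result['dmips']:.2f} DMIPS")          # 55.98 DMIPS
    print(f"{result['dmips_per_mhz']:.2f} DMIPS/MHz")  # 1.12 DMIPS/MHz
```

The benchmark reports whole seconds, so the derived figures carry a small timing uncertainty. A long run (here 6000000 iterations) keeps that error small.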



Figure 5 CPU Dhrystone plot

Figure 5 shows Dhrystone results for some CPUs, as provided on the official ARM website, with FII RISC-V3.01 highlighted in red. It can be seen that the CORTEX-M7 (marked blue) has the highest Dhrystone score among the 10 CPUs. According to the ARM website, the CORTEX-M7, CORTEX-R4 (marked blue) and CORTEX-A5 (marked blue) all have pipelines with more than 3 stages, and the CORTEX-R4 and CORTEX-A5 have more than one core. As before, to make the comparison fairer, FII RISC-V3.01 is compared with the CORTEX-M3 (marked blue), which is also single-core and has a 3-stage



In addition to Coremark and Dhrystone, there are many other open-source benchmarks, such as AIM Multiuser Benchmark, Embench and HINT, as well as non-free industry-standard benchmarks from organizations such as SPEC (Standard Performance Evaluation Corporation) and BAPCo (Business Applications Performance Corporation), which provide benchmarks that can be audited and verified [6].





## References

[1] "EEMBC | Wikiwand", *Wikiwand*, 2020. [Online]. Available: https://www.wikiwand.com/en/EEMBC. [Accessed: 29-Sep-2020].

[2] "EEMBC", *Eembc.org*, 2020. [Online]. Available: https://www.eembc.org/coremark/index.php. [Accessed: 29-Sep-2020].

[3] "Roy Longbottom's PC benchmark Collection - Classic Benchmarks", *Roylongbottom.org.uk*, 2020. [Online]. Available: http://www.roylongbottom.org.uk/classic.htm. [Accessed: 28-Sep-2020].

[4] "Dhrystone howto - CDOT Wiki", *Wiki.cdot.senecacollege.ca*, 2020. [Online]. Available: https://wiki.cdot.senecacollege.ca/wiki/Dhrystone\_howto#What\_Dhrystone\_really\_does. [Accessed: 28-Sep-2020].

[5] T. Riemersma, "The Dhrystone benchmark, the LPC2106 and GNU GCC", *Compuphase.com*, 2020. [Online]. Available: https://www.compuphase.com/dhrystone.htm. [Accessed: 28-Sep-2020].

[6] "Benchmark (computing) | Wikiwand", *Wikiwand*, 2020. [Online]. Available: https://www.wikiwand.com/en/Benchmark\_(computing). [Accessed: 29-Sep-2020].