Hardware Used
Four platforms were used to test various Cortex-A series CPUs. They are described in the table below:
HPL Results
The HPL results shown below are all given in GFLOPS. I included the results at 1 GHz for all boards and then at each board's maximum frequency. Not every board could be set to exactly 1 GHz, so I used the closest available frequency; the difference was almost negligible. Immediately we notice that the Jetson Tegra K1 scores approximately 4 GFLOPS more than the Odroid XU+E. This is expected, as the ratio of the two scores is approximately the same as the ratio of the maximum clock frequencies, 2300 MHz to 1600 MHz.
HPL Results for four ARM boards
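The clock-ratio argument above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes HPL performance scales purely linearly with clock frequency, and the function names are my own. It computes the 2300:1600 ratio and the pair of scores for which that ratio would produce a gap of 4 GFLOPS. These implied scores are a consistency check, not measured results.

```python
def clock_ratio(high_mhz: float, low_mhz: float) -> float:
    """Ratio of two clock frequencies."""
    return high_mhz / low_mhz


def implied_baseline_gflops(gap_gflops: float, ratio: float) -> float:
    """Lower score for which a pure clock-ratio speedup yields the given gap.

    Solves ratio * baseline - baseline = gap for baseline.
    """
    return gap_gflops / (ratio - 1.0)


# Maximum clocks quoted in the text: Tegra K1 at 2300 MHz, Odroid XU+E at 1600 MHz.
ratio = clock_ratio(2300, 1600)

# Gap of roughly 4 GFLOPS quoted in the text; linear scaling is an assumption.
baseline = implied_baseline_gflops(4.0, ratio)

print(f"clock ratio:          {ratio:.4f}")
print(f"implied lower score:  {baseline:.2f} GFLOPS")
print(f"implied higher score: {baseline * ratio:.2f} GFLOPS")
```

If the measured scores sit close to the implied pair, the gap is explained by clock frequency alone; a larger gap would point to architectural or process differences.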
HPL Efficiency
As in previous posts, I measured the power drawn by each board at each frequency and recorded the HPL performance. This gives a nice profile of HPL performance per watt as a function of clock frequency. The A7 performs fairly poorly, but it is only a dual core. There are not many CPU frequencies available on the Wandboard (A9), so we are stuck with just 3 data points, but even so the pattern is visible. The A15-p2 (Odroid, green) clearly shows the transition between the power-saving quad A7 and the higher-powered A15, which occurs at approximately 600 MHz. The Tegra K1 has much better power efficiency (over 2 GFLOPS/Watt at low frequencies). This is impressive but impractical, since one would never run these devices below 300 MHz for processing data. What is more impressive is that even at 2 GHz the efficiency is still over 1 GFLOPS/Watt.
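The efficiency profile is simply the HPL score divided by the measured power at each frequency. A minimal sketch of that calculation follows. The `Sample` type, the function name and all of the numbers are placeholders chosen for illustration; they are not the measurements from these boards.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    mhz: int        # CPU clock frequency during the run
    gflops: float   # HPL result at that frequency
    watts: float    # board power draw during the run


def gflops_per_watt(sample: Sample) -> float:
    """HPL efficiency for one frequency step."""
    return sample.gflops / sample.watts


# Placeholder values only, NOT measured data.
samples = [
    Sample(mhz=500, gflops=2.0, watts=1.0),
    Sample(mhz=1000, gflops=4.0, watts=2.5),
    Sample(mhz=2000, gflops=8.0, watts=8.0),
]

for s in samples:
    print(f"{s.mhz:5d} MHz: {gflops_per_watt(s):.2f} GFLOPS/Watt")
```

With placeholder numbers like these, efficiency falls as the clock rises, because power grows faster than performance.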
HPL Squared per Watt
As mentioned in my previous posts, the efficiency alone is not that useful. A more interesting thing to look for is the operating frequency that gives the best trade-off between performance and power consumption, which is what HPL squared per watt (GFLOPS²/Watt) captures. This really does give us a nice profile of the boards. It also clearly shows the improvement of the Tegra K1 over the Odroid. The main reason for this is still a little unclear, but what we do know is this: the Tegra K1 uses a later revision of the Cortex A15. What changed between the r3p2 and r3p3 revisions is not clear from the ARM website: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438g/ch01s08s10.html as all they mention are the register values that changed and "Various engineering errata fixes". I think a more significant reason is the process used to manufacture the chips. The Odroid's chip was made with the 28 nm HKMG process and the Tegra K1 was made with the 28HPM process. According to TSMC, 28HPM provides higher speed while maintaining the same leakage power as the 28LP process.
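HPL squared per watt is the efficiency (GFLOPS/Watt) multiplied by the performance (GFLOPS), so a frequency step only scores well if it is both efficient and fast. The sketch below shows one way to pick the best operating frequency from a set of measurements. The function names and all of the numbers are placeholders for illustration, not data from these boards.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    mhz: int        # CPU clock frequency during the run
    gflops: float   # HPL result at that frequency
    watts: float    # board power draw during the run


def gflops_squared_per_watt(sample: Sample) -> float:
    """Efficiency (GFLOPS/Watt) weighted by performance (GFLOPS)."""
    return sample.gflops ** 2 / sample.watts


def best_operating_point(samples: list[Sample]) -> Sample:
    """Frequency step with the highest GFLOPS squared per watt."""
    return max(samples, key=gflops_squared_per_watt)


# Placeholder values only, NOT measured data.
samples = [
    Sample(mhz=500, gflops=2.0, watts=1.0),
    Sample(mhz=1000, gflops=4.0, watts=2.0),
    Sample(mhz=1500, gflops=6.0, watts=4.0),
    Sample(mhz=2000, gflops=8.0, watts=8.0),
]

best = best_operating_point(samples)
print(f"best operating point: {best.mhz} MHz "
      f"({gflops_squared_per_watt(best):.1f} GFLOPS^2/Watt)")
```

Unlike plain GFLOPS/Watt, which in this placeholder data is highest at the lowest clock, the squared metric peaks at an intermediate frequency.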