Tuesday 11 March 2014

NAS Benchmarks on ARM

The NAS Parallel Benchmarks (link) are a comprehensive suite of benchmarks for testing supercomputers, maintained by NASA. They were originally based on computational fluid dynamics (in 1994) and have expanded over time to cover many problem types and problem sizes, from very small problems that run in a few seconds for testing purposes to large problems that can take hours on a supercomputer.

Since these benchmarks cover a range of problems, including (most interestingly) a dedicated Embarrassingly Parallel (EP) benchmark, it is worth testing their performance on ARM. Luckily, building the benchmark suite on ARM is straightforward, so I will document the process here for those who are interested. I will write about performance tweaks and compiler flags in a later post, once I have had more time to experiment.


Installation (Single Processor Test)

  • Download a copy of the source code from the web site linked above. Unzip the source into a directory on your ARM system.
  • You should already have a full suite of compilers (gcc, including gfortran, since most of the benchmarks are written in Fortran) installed on your system, as well as MPICH or another MPI library.
  • Navigate into the NPB3.3-MPI directory. Please read the README.install text document for some details. There is a short document in each benchmark directory with some details about that specific benchmark.
  • Navigate into the 'config' directory.
  • Run this command to use the template for the build: cp make.def.template make.def
  • Then run this command to use the template for the suites: cp suite.def.template suite.def
  • You now need to customize the make.def file to your system. Your modifications should be the same as mine if you are running Linux (Linaro) on ARM. Scroll through the file and adjust the lines as below:
MPIF77 = mpif77
FFLAGS = -O3
MPICC = mpicc
Un-comment the line: include ../config/make.dummy

  • Note that we un-commented the make.dummy include line. This means that the real MPI library will not be used: the benchmarks are built against a dummy MPI library and will only run on a single processor, as a simple test.
  • The template suite.def file is fine for this proof-of-concept.
  • Return to the NPB3.3-MPI root directory with: cd ..
  • Type make suite and wait for the build to complete. The binaries are placed in the bin directory. If something goes wrong, the most likely cause is a missing dependency, such as a compiler or the MPI library.
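
The make.def edits above can also be applied with a small script rather than by hand. This is only a minimal sketch: the function name patch_make_def is my own invention, and it assumes the template contains lines beginning with MPIF77, FFLAGS and MPICC, plus a commented-out include of ../config/make.dummy. Check the result against your own copy of the file before building.

```shell
#!/bin/sh
# Hypothetical helper (not part of NPB): apply the make.def edits listed above.
# Usage: patch_make_def path/to/make.def
patch_make_def() {
    def_file=$1
    # Rewrite the three variable lines and strip the leading '#' from the
    # make.dummy include. A temporary file is used instead of 'sed -i' so the
    # same command works with both GNU and BSD sed.
    sed \
        -e 's|^MPIF77[[:space:]]*=.*|MPIF77 = mpif77|' \
        -e 's|^FFLAGS[[:space:]]*=.*|FFLAGS = -O3|' \
        -e 's|^MPICC[[:space:]]*=.*|MPICC = mpicc|' \
        -e 's|^#[[:space:]]*\(include[[:space:]][[:space:]]*\.\./config/make\.dummy\)|\1|' \
        "$def_file" > "$def_file.tmp" && mv "$def_file.tmp" "$def_file"
}

# Example, run from the NPB3.3-MPI directory:
#   cp config/make.def.template config/make.def
#   cp config/suite.def.template config/suite.def
#   patch_make_def config/make.def
#   make suite
```

For the true MPI build you would drop the last sed expression so that the make.dummy line stays commented out.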

Installation (Multi-Processor MPI)

To install a true MPI version, follow the steps above, but leave the make.dummy include line commented out. You should also modify the suite.def file so that each benchmark is built for the number of processors (processes) you would like to run.
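
As a rough sketch of the suite.def change, the helper below writes one entry per benchmark. The function name write_suite is hypothetical, and the tab-separated layout (benchmark name, class, number of processes) is what I understand from the comments in suite.def.template, so check it against your copy. Some benchmarks only accept certain process counts; the README files have the details.

```shell
#!/bin/sh
# Hypothetical helper (not part of NPB): print suite.def entries for one
# problem class and one process count.
# Usage: write_suite CLASS NPROCS BENCHMARK...
write_suite() {
    class=$1
    nprocs=$2
    shift 2
    for bench in "$@"; do
        # One benchmark per line; fields are separated by tabs.
        printf '%s\t%s\t%s\n' "$bench" "$class" "$nprocs"
    done
}

# Example, run from the NPB3.3-MPI directory:
#   write_suite S 4 ep cg mg ft > config/suite.def
#   make suite
```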

To run a multi-processor version, type:
mpirun -np 4 ./bin/ep.S.4
This runs the 4-process build of EP at problem size (class) S. The benchmark must have been compiled for the same number of processes that you pass to mpirun, so adjust the command to match the binary you built.

You can also compile a single benchmark at a time rather than the whole suite. Please see the README.install file for the details. It's really quite simple.
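
For example, building and running a single benchmark might look like the sketch below. The function name build_and_run is hypothetical, and the make arguments (benchmark name, CLASS, NPROCS) follow the form I recall from README.install, so treat them as an assumption and confirm against your copy. The binary name follows the bin/ep.S.4 pattern used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of NPB): build one benchmark, then run it
# with a matching process count.
# Usage: build_and_run BENCHMARK CLASS NPROCS   (run from the NPB3.3-MPI directory)
build_and_run() {
    bench=$1
    class=$2
    nprocs=$3
    # Build only the requested benchmark for this class and process count.
    make "$bench" CLASS="$class" NPROCS="$nprocs" || return 1
    # The binary name encodes benchmark, class and process count, e.g. bin/ep.S.4.
    mpirun -np "$nprocs" "./bin/$bench.$class.$nprocs"
}

# Example:
#   build_and_run ep S 4
```

Keeping the build and the run in one function makes it harder to launch a binary with a process count it was not compiled for.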
