If you do not have HPL set up on your board and would like a walk-through, please see my previous post: Installing HPL on Cubieboard2
Before we start, this is the setup I am using: two Cubieboard2 boards running Ubuntu 13.10. Each board has one CPU with 2 cores and 1GB of DDR3 RAM, so in total we have 4 cores and 2GB of RAM. I have called the boards cubiedev1 and cubiedev2 (their host names). OK, let's get started.
Host names on Master Node
On the master node (generally the node from which you will launch the tests and store the results), edit the hosts file and add the corresponding machines with their designated IPs.
nano /etc/hosts

127.0.0.1    localhost
192.168.1.1  cubiedev1
192.168.1.2  cubiedev2
Note that you must not have the master node specified as localhost, i.e. you must not have a line like 127.0.0.1 cubiedev1. Even though that is technically true on this board, it will cause the other nodes to try to connect to localhost whenever they connect to cubiedev1.
Using NFS for Ease of Testing
NFS allows you to share a directory over the network. This is extremely useful for us, since to run a program such as HPL the exact same version must be installed on all of the nodes. So instead of copying the program to every node, we can share the directory, do all our editing once, and not have to worry about distributing the program around.
To install run:
sudo apt-get install nfs-kernel-server
Now we need to share the folder we will work in. The SD card that holds the Cubieboard's OS is only 8GB. I have an external HDD mounted at /mnt/cub1/; if you want to share a folder on your SD card instead, that is not a problem, but the read/write speeds are generally not great and you are limited by its size. So I created a directory called mpiuser on /mnt/cub1/ and I will run all my tests from this folder.
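For reference, this is roughly how the external drive and work folder are set up on my board (the device name /dev/sda1 is an assumption here; check yours with lsblk):

sudo mkdir -p /mnt/cub1
sudo mount /dev/sda1 /mnt/cub1
sudo mkdir /mnt/cub1/mpiuser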
So now we have the directory /mnt/cub1/mpiuser. We must add it to the exports file and then restart the NFS service.
nano /etc/exports

/mnt/cub1/mpiuser *(rw,sync)

sudo service nfs-kernel-server restart
The folder mpiuser has now been shared, but we need to mount it on the other nodes and point it back at the master node. We can do this manually from the terminal with the mount command each time we boot, or we can edit the fstab file so it mounts at boot.
nano /etc/fstab

cubiedev1:/mnt/cub1/mpiuser /mnt/cub1/mpiuser nfs

sudo mount -a

Repeat this on each node.
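A note in passing: some images are fussy about fstab entries that omit the options field, so if the entry above is not picked up by mount -a, try the fuller form (defaults is just the stock option set, nothing specific to this setup):

cubiedev1:/mnt/cub1/mpiuser /mnt/cub1/mpiuser nfs defaults 0 0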
Creating the user for all MPI programs
Creating one user with the same name and password on each board will allow us to easily access each node over ssh. We need to create the user and set the home directory to our shared folder mpiuser. We then also need to change the ownership of the folder to this user.
sudo adduser mpiuser --home /mnt/cub1/mpiuser
sudo chown mpiuser /mnt/cub1/mpiuser
Make sure that the password is the same on all boards.
Configure SSH to use keys and not passwords
Change to our new user:
su - mpiuser
Create the key using
ssh-keygen -t rsa
Use the default location; since the home directory is on the shared folder, the key will automatically be available on all nodes.
Now we need to add this key to the authorized keys:
cd .ssh
cat id_rsa.pub >> authorized_keys
If you can ssh into the other nodes using their host names then you have set it up correctly. Test using:
ssh cubiedev2
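If it still prompts for a password, sshd is most likely rejecting the key because the permissions on the shared .ssh directory are too open (this is just the usual suspect; it depends on your sshd configuration). Tightening them normally sorts it out:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys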
MPI software
I have already installed MPICH2 as my MPI implementation, since I did this in the previous post mentioned above. You can use OpenMPI instead; it's up to you.
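If you have not installed it yet, the MPI implementation comes straight from the Ubuntu repositories (the package name is my assumption for 13.10; newer releases ship it as mpich instead):

sudo apt-get install mpich2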
We need to set up a machines file. This file is passed to the mpirun command with the -f flag. It is a list of host names with the number of processes you want to run on each. An example of the machines file that I have is:
cubiedev1:2  # The :2 represents the number of cores
cubiedev2:2
To test if this works, we will use a simple test program, which can be found on this blog. Save the content below to a file called mpi_hello.c:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
    int myrank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    printf("Hello from processor %d of %d\n", myrank, nprocs);

    MPI_Finalize();
    return 0;
}
Compile it with
mpicc mpi_hello.c -o mpi_hello
Now run it with the correct number of processes specified (one for each core):
mpirun -np 4 -f machines ./mpi_hello
The output I get is:
Hello from processor 0 of 4
Hello from processor 1 of 4
Hello from processor 2 of 4
Hello from processor 3 of 4
Cool... Now we know that all the processors are being "seen".
Set up the HPL files
Copy the HPL files that you have been using into the mpiuser directory on the shared HDD. Make sure the owner is set correctly via chown (the user comes first, then the directory, i.e. chown mpiuser hpl). If you are unsure of how to set up HPL, please see Installing HPL on Cubieboard2
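For example, assuming your HPL build currently lives in ~/hpl (the path is an assumption; use wherever you built it):

cp -r ~/hpl /mnt/cub1/mpiuser/
sudo chown -R mpiuser /mnt/cub1/mpiuser/hpl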
Set up the HPL.dat file so that the product P x Q = 4 (since we are running it on both Cubieboards), and also make sure your problem size is large enough.
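As a rough sketch of the relevant lines in HPL.dat (starting values only, not tuned results, and the other lines stay as in the stock file): a common rule of thumb is N ≈ sqrt(0.8 x total RAM in bytes / 8), which for 2GB gives roughly 14000 as an upper limit, so something a little under that is a sensible start.

13000        Ns
128          NBs
2            Ps
2            Qs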
Now run HPL using:
mpirun -np 4 -f machines ./xhpl