>> BUILDING A RASPBERRY PI 3 CLUSTER (PART 2)
"master.. master.. just call my name, 'cause I'll hear you scream" (Metallica - master of puppets)
In my previous entry I covered the design of setting up a cluster of
Raspberry Pi devices to build a cheap supercomputer. Hopefully you have
got a number of devices configured with the latest Raspbian image, each
completely up to date. We will start with the master node of the
cluster - tweaking the stock image to serve its purpose.
The master node is responsible for a number of functions, such as
providing access to the internet, acting as a gateway for the slave nodes
and sharing a disk drive across all the nodes. It is effectively the
brain of the system and will draw more power than the slaves, so it is
recommended to connect it to a power supply with a higher current rating
to make sure it remains stable.
STEP 1: INTERNET ACCESS OVER WiFi
The Raspberry Pi 3 has two default network interfaces, namely eth0
and wlan0 - for older devices, a USB WiFi adapter would be
required. The WiFi network will connect to the internet, while our
ethernet network will create a private network allowing the slaves to
communicate with each other.
Configuring WiFi requires editing the following file:
$ sudo vi /etc/wpa_supplicant/wpa_supplicant.conf
network={
ssid="ssid"
psk="pass"
proto=RSN
key_mgmt=WPA-PSK
pairwise=CCMP
auth_alg=OPEN
}
Of course, the settings specific to your WiFi network should be defined
here. Be careful when following older guides online: the jessie release
of Raspbian introduced a number of changes that break the older
instructions, which has fueled frustration in the community.
STEP 2: GATEWAY CONFIGURATION
Our design dictated a private network between the master and slave nodes
to allow for communication, yet isolate them from external tampering. In
effect, we are building a firewall to protect the nodes. To nerd out
the configuration, we will use the 3.141.59.x (π) IP address range. Note
that this is a public range rather than a reserved private one, so the
real hosts at those addresses will be unreachable from the cluster.
In summary, these will be the associated IP addresses:
master: wlan0: 192.168.1.xxx
        eth0:  3.141.59.1
slave1: eth0:  3.141.59.2
slave2: eth0:  3.141.59.3
slave3: eth0:  3.141.59.4
Our WiFi network should already be set up - your IP address may differ
depending on your WiFi network configuration. Now we need to assign a
fixed IP address to the eth0 network adapter. To do this, we need to
modify the DHCP client settings by editing the following file:
$ sudo vi /etc/dhcpcd.conf
interface eth0
static ip_address=3.141.59.1/24
static domain_name_servers=8.8.8.8
We do not want to define a router in this case, as we want the default
route to stay on the wlan0 interface. In order for the device to act as a
gateway we also need to enable packet forwarding and define a NAT rule so
that traffic from the eth0 network goes out through the wlan0 network.
This takes a few simple modifications:
$ sudo vi /etc/sysctl.conf
net.ipv4.ip_forward=1
$ sudo vi /etc/rc.local
# add this above the final 'exit 0' line so that it actually runs
/sbin/iptables --table nat -A POSTROUTING -o wlan0 -j MASQUERADE
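Both files are only read at boot, so it can help to apply and check the
settings straight away. The following is a minimal sketch;
forwarding_state is a hypothetical helper name (not part of the stock
image) and the commands that need root are left as comments:

```shell
#!/bin/sh
# forwarding_state is a hypothetical helper: it reports whether the kernel
# flag for IPv4 packet forwarding is set. The optional argument lets us
# point it at another file, which is handy for testing.
forwarding_state() {
    flag_file=${1:-/proc/sys/net/ipv4/ip_forward}
    if [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]; then
        echo enabled
    else
        echo disabled
    fi
}

forwarding_state

# To apply the settings without a reboot (needs root, so commented out):
#   sudo sysctl -p                       # reload /etc/sysctl.conf
#   sudo iptables --table nat -A POSTROUTING -o wlan0 -j MASQUERADE
#   sudo iptables --table nat -L POSTROUTING -n -v   # rule should be listed
```

If the helper prints "disabled" after a reboot, the sysctl.conf line is
the first thing to double check.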
While we are at it, we might as well create some aliases for the nodes:
$ sudo vi /etc/hosts
3.141.59.1 rPi01
3.141.59.2 rPi02
3.141.59.3 rPi03
3.141.59.4 rPi04
Make sure there are no duplicate host entries in this file - you will
need to remove the 127.0.0.1 entry for the current host.
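For a larger cluster, typing these entries by hand gets tedious. As a
rough sketch, the lines could be generated instead; make_hosts is a
hypothetical helper that simply follows the address and naming pattern
used above:

```shell
#!/bin/sh
# make_hosts is a hypothetical helper: it prints one /etc/hosts line per
# node, numbering the addresses and the rPiNN aliases from 1 upwards.
make_hosts() {
    prefix=$1   # first three octets, e.g. 3.141.59
    count=$2    # number of nodes, master included
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s.%d rPi%02d\n' "$prefix" "$i" "$i"
        i=$((i + 1))
    done
}

# Prints the four entries used in this article; review the output before
# pasting it into /etc/hosts.
make_hosts 3.141.59 4
```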
We will come back to configuring the slave nodes later on,
but we should have everything we need configured for networking
at this point.
STEP 3: SHARED USB DRIVE
To make the supercomputer behave more like a standard computer we need a
shared disk drive mapped across all the nodes. This is useful for loading
resources for processing and for storing results where every node can
access them. We will utilize NFS to share a USB drive across the nodes.
We should format the USB drive to use the native ext4
file system:
$ sudo mkfs.ext4 -L shared /dev/sda1
We need a mount point for the USB drive - we will use /mnt/usb
and define it as follows:
$ sudo mkdir /mnt/usb
$ sudo mount /dev/sda1 /mnt/usb
$ sudo chown -R pi:pi /mnt/usb
We change the ownership of the mount point to pi:pi so the
default user can read and modify files. To make sure the USB drive mounts
on boot, we need an fstab entry defined:
$ sudo vi /etc/fstab
/dev/sda1 /mnt/usb ext4 defaults,user,exec 0 2
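Rather than rebooting to find out whether the entry works, we can ask
the system to mount everything in fstab and then look for the mount
point. This is only a sketch; is_mounted is a hypothetical helper that
reads /proc/mounts by default:

```shell
#!/bin/sh
# is_mounted is a hypothetical helper: it succeeds when the given mount
# point appears in the second field of the mounts table.
is_mounted() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    [ -r "$mounts_file" ] || return 1
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' \
        "$mounts_file"
}

# sudo mount -a    # mounts everything listed in fstab (needs root)
if is_mounted /mnt/usb; then
    echo "/mnt/usb is mounted"
else
    echo "/mnt/usb is NOT mounted"
fi
```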
We will now need to install an NFS server so it can be available for other
nodes:
$ sudo apt-get install nfs-kernel-server
To grant access to the slave nodes, we need to define some exports:
$ sudo vi /etc/exports
/mnt/usb rpi02(rw,sync)
/mnt/usb rpi03(rw,sync)
/mnt/usb rpi04(rw,sync)
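The export lines follow one pattern per slave, so as the cluster grows
they could be generated rather than typed. A small sketch, where
make_exports is a hypothetical helper and the rw,sync options are simply
the ones used above:

```shell
#!/bin/sh
# make_exports is a hypothetical helper: it prints one /etc/exports line
# per host for the given shared directory.
make_exports() {
    share=$1
    shift
    for host in "$@"; do
        printf '%s %s(rw,sync)\n' "$share" "$host"
    done
}

make_exports /mnt/usb rpi02 rpi03 rpi04

# After editing /etc/exports, the running server can re-read it
# (needs root, so commented out):
#   sudo exportfs -ra
#   sudo exportfs -v     # lists what is currently exported
```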
The next step is to ensure the services start automatically on boot;
after that we will define a symlink for consistency.
$ sudo update-rc.d rpcbind enable
$ sudo update-rc.d nfs-common enable
Unfortunately, there is a race condition between the NFS server and
RPC at boot, so we need to force a restart of nfs-kernel-server
on device boot to make sure nfsd is running.
$ sudo vi /etc/rc.local
# need to give nfs-server a little nudge to start right
service nfs-kernel-server restart
We should make an alias for the mount point so that it looks the same as
it will on a slave node.
$ sudo ln -s /mnt/usb /mnt/nfs
This will allow us to run applications on the master node using the same
binaries as on the slaves, without conditional compilation rules for
accessing the shared mounted resource.
STEP 4: MISCELLANEOUS SETTINGS
We will also want to generate some ssh keys so that we have
a mechanism to log in and out of the slave nodes without having to
provide a password every time. This will be vital when we look at
utilizing software in the future to run parallel applications across the
nodes smoothly.
$ mkdir ~/.ssh
$ cd ~/.ssh/
$ ssh-keygen
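Run like this, ssh-keygen asks where to store the key and for a
passphrase; for password-less logins the passphrase is left empty. The
same thing can be done without prompts, as in the sketch below.
make_keypair is a hypothetical helper, and keep in mind that with an
empty passphrase anyone who can read the private key can log in to the
slaves:

```shell
#!/bin/sh
# make_keypair is a hypothetical helper: it creates an RSA key pair with
# an empty passphrase in the given directory and refuses to overwrite an
# existing key.
make_keypair() {
    key_dir=$1
    key_file=$key_dir/id_rsa
    if [ -e "$key_file" ]; then
        echo "$key_file already exists, leaving it alone" >&2
        return 1
    fi
    mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1
    # -N "" sets an empty passphrase, -q suppresses the progress output
    ssh-keygen -q -t rsa -N "" -f "$key_file"
}

# Example (commented out so that nothing is written by accident):
#   make_keypair "$HOME/.ssh"
```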
We will come back to deploying the id_rsa.pub file across
the slave nodes when we set them up. A final task is to provide an easy
way to shut down all the slave nodes from the master:
$ vi shutdown.sh
#!/bin/sh
# shut down the slaves first, then the master itself
HOSTS="rpi04 rpi03 rpi02"
for HOST in $HOSTS; do
    echo "executing 'sudo shutdown -hP now' on $HOST"
    ssh "$(whoami)@$HOST" sudo shutdown -hP now
done
echo "executing 'sudo shutdown -hP now' on localhost"
sudo shutdown -hP now
We have learnt in the past that failure to power down a Raspberry Pi device
can end up with a corrupt file system on the memory card. It would not
make sense to manually log into each slave to shut it down, so this
script will come in handy - it can be easily modified to reboot the slaves.
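One way that modification might look is sketched below. cluster_power
and the DRY_RUN variable are hypothetical names; the dry run is on by
default so that trying the function out does not take the cluster down:

```shell
#!/bin/sh
# cluster_power is a hypothetical helper: it runs a halt or a reboot on
# every host given. With DRY_RUN=1 (the default here) it only prints what
# it would do.
cluster_power() {
    action=$1
    shift
    case "$action" in
        halt)   cmd="sudo shutdown -hP now" ;;
        reboot) cmd="sudo shutdown -r now" ;;
        *)
            echo "usage: cluster_power halt|reboot host..." >&2
            return 1
            ;;
    esac
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run '$cmd' on $host"
        else
            # -n stops ssh from reading our standard input
            ssh -n "$(whoami)@$host" "$cmd"
        fi
    done
}

cluster_power reboot rpi04 rpi03 rpi02
```

Setting DRY_RUN=0 in the environment makes it run the commands for real
over ssh.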
STEP 5: VERIFY SETTINGS
We can do a few sanity checks to make sure everything has been done right:
# verify the IP addresses on the network interfaces
$ ifconfig
eth0 Link encap:Ethernet HWaddr b8:27:eb:2d:a1:0f
inet addr:3.141.59.1 Bcast:3.141.59.255 Mask:255.255.255.0
..
wlan0 Link encap:Ethernet HWaddr b8:27:eb:78:f4:5a
inet addr:192.168.1.xxx Bcast:192.168.1.255 Mask:255.255.255.0
# verify the mount point has been defined right
$ ls -al /mnt
total 12
drwxr-xr-x 3 root root 4096 Nov 26 12:48 .
drwxr-xr-x 22 root root 4096 Nov 26 13:43 ..
lrwxrwxrwx 1 root root 3 Nov 26 12:48 nfs -> usb
drwxr-xr-x 3 pi pi 4096 Nov 26 12:49 usb
# verify nfsd is running
$ ps -aux | grep nfsd
root 776 0.0 0.0 0 0 ? S< 19:00 0:00 [nfsd4_callbacks]
root 779 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 780 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 781 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 782 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 783 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 784 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 785 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
root 786 0.0 0.0 0 0 ? S 19:00 0:00 [nfsd]
pi 968 0.0 0.1 4276 1840 pts/0 S+ 19:04 0:00 grep nfsd
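If a single number is easier to check than the full listing, the nfsd
threads can be counted while leaving out the grep process itself and
the nfsd4_callbacks thread. A minimal sketch, where count_nfsd is a
hypothetical helper that reads ps output on standard input:

```shell
#!/bin/sh
# count_nfsd is a hypothetical helper: it counts the lines that contain
# the literal text [nfsd]. The "|| true" keeps the exit status at zero
# when the count is 0.
count_nfsd() {
    grep -c '\[nfsd\]' || true
}

# Prints the number of nfsd kernel threads on this machine
ps ax 2>/dev/null | count_nfsd
```

For the listing above this would report 8; a result of 0 means nfsd is
not running.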
In the next entry, we will configure the slave nodes and verify that
everything is working!