Learning the Hadoop Install

After discussing with Chad Day the possible ways to set up both a Hadoop cluster and a Eucalyptus cluster on the same set of machines, we concluded that we should first attempt to run the services side by side. We plan to use CentOS 6.4 as the operating system on the machines and then install the services each node needs. The nine machines currently have operating systems and packages left over from the 401 class, which focused on working with Eucalyptus. Because of this, Chad and I will install the latest version of CentOS, 6.4, on the machines and then install the necessary packages for the servers.

Before attempting to install the Hadoop software on the CS machines, I plan to set up a small environment of Hadoop servers on my personal machine using VirtualBox. This gives me more flexibility to experiment with packages and will be easier to access while I learn the install process and how the head node and data nodes work together. Once I am confident with the install process on the virtual machines, which will also run CentOS 6.4, I will install the packages on the machines that will host both Hadoop and Eucalyptus.

Chad and I will install the base operating system, CentOS, on the nine machines this week, and we hope to have the base system on all of them by the end of this upcoming week. My plan is to have a couple of pieces of install media, either several CDs or USB drives with the CentOS image on them, so the installs go a little quicker. After all the machines are set up and I have experimented in the virtual environment, I will be ready to install the cluster that will be used in the computer science department.