MacOS – Installing Apache Hadoop on Mac Mavericks

apache macos

I am having a really hard time installing Apache Hadoop (2.4.1) on my Mac (OS X 10.9). Is there a step-by-step guide that fully explains the process and, more importantly, gets Hadoop running on my machine? I followed a tutorial for the installation, and these are my current issues:

  1. I am not sure whether Hadoop is actually "correctly" installed. Typing hadoop version shows 2.4.1, but running start-all.sh prints a long list of warnings.
  2. I thought about checking whether Hadoop is correctly installed by running a sample program (WordCount.java), as provided everywhere on the net. I have Eclipse Luna installed, but the guide I followed on "How to integrate Eclipse with Hadoop" tells me to import all the jars from '../libexec'. For Hadoop 2.4.1 there are no jars there, at least none that I could find.
  3. Currently /usr/local/ contains three directories: hadoop-2.4.1, hadoop (a symlink to it, I suppose), and Cellar. All three have a subdirectory named Hadoop and many other subdirectories such as lib and libexec. How do I know which one serves which purpose? Every tutorial refers to a different directory, and the one that got me Hadoop installed never mentions how to test a sample Hadoop MapReduce application.

I have even tried the HortonWorks Sandbox for Apache Hadoop, but my machine's 4 GB of RAM seems too small for that mammoth application, and my system hung. I need to get this working for my project, so I am looking forward to any sincere help.

Best Answer

  1. In my case, start-all.sh reports that it is deprecated and that I should use start-dfs.sh and start-yarn.sh instead. Both run without errors or warnings (for both local and local-cluster HDFS).

  2. In Hadoop 2.4.1, the jars with the shared libraries are located under /libexec/share/hadoop/ and its subdirectories. To run simple MapReduce apps, it is enough to add mapreduce/hadoop-mapreduce-client-core-2.4.1.jar and common/hadoop-common-2.4.1.jar.

  3. I use the brew version, so mine is located under /usr/local/Cellar/hadoop/2.4.1/. To make things easier, I use an environment variable that points to that directory.
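
To tie points 2 and 3 together, here is a rough sketch of what such an environment variable and the resulting classpath might look like. The variable names HADOOP_PREFIX and HADOOP_VERSION and the three helper functions are made up for illustration, and the sketch assumes the /libexec/share/hadoop/ path from point 2 sits inside the brew directory from point 3. Adjust it if your layout differs.

```shell
# Assumed Homebrew location from the answer; override to match your install.
HADOOP_PREFIX="${HADOOP_PREFIX:-/usr/local/Cellar/hadoop/2.4.1}"
HADOOP_VERSION="${HADOOP_VERSION:-2.4.1}"

# Hypothetical helper: print the two jars named in the answer, one per line.
hadoop_min_jars() {
  share_dir="$HADOOP_PREFIX/libexec/share/hadoop"
  echo "$share_dir/common/hadoop-common-$HADOOP_VERSION.jar"
  echo "$share_dir/mapreduce/hadoop-mapreduce-client-core-$HADOOP_VERSION.jar"
}

# Hypothetical helper: join those jars into a colon-separated classpath.
hadoop_min_classpath() {
  hadoop_min_jars | paste -s -d ':' -
}

# Hypothetical helper: list any of the jars that are not present on disk.
hadoop_missing_jars() {
  hadoop_min_jars | while IFS= read -r jar; do
    [ -f "$jar" ] || echo "$jar"
  done
}
```

If hadoop_missing_jars prints nothing, both jars were found, and the output of hadoop_min_classpath could be passed to something like javac -cp "$(hadoop_min_classpath)" WordCount.java or pasted into the Eclipse build path. Depending on what the sample program imports, more jars from the same share/hadoop subdirectories may be needed.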

I'm not sure, but 4 GiB of RAM should be enough to test the environment with some apps. It should not hang.