Apache Hadoop is a framework that allows distributed processing of large datasets.
The Hadoop SmartMachine includes the following components:
To learn more about Hadoop, see the Hadoop Documentation.
Release notes for this image:
Because Hadoop runs on Java, provision your Hadoop SmartMachine with plenty of memory. For a stand-alone machine, 4 GB should be enough. If your machine is part of a cluster, use 8 GB or more.
The Hadoop SmartMachine is configured with the two standard accounts: root and admin. You can use SmartLogin to log in to either account with the SSH keys stored in your my.joyentcloud.com account. Both accounts also have generated passwords, which you can find in the Credentials section of the machine's detail page.
Log in to your Hadoop SmartMachine the same way you log in to a standard SmartMachine:
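A minimal sketch of an SSH login, assuming your SSH key is already registered in your account. The helper name `sm_login` and the address `192.0.2.10` used below are placeholders invented for illustration; they are not part of the image.

```shell
# sm_login is a hypothetical convenience wrapper, not something shipped with the image.
# Usage: sm_login <machine-address> [account]   (account defaults to root)
sm_login() {
    ssh "${2:-root}@$1"
}
```

With this defined, `sm_login 192.0.2.10` would connect as root and `sm_login 192.0.2.10 admin` as admin, substituting your machine's real address for the placeholder.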
When you log in to your Hadoop SmartMachine for the first time, it is a good idea to bring the pkgsrc repository up to date and to upgrade the installed packages.
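One way to do this is with pkgin, the pkgsrc binary package manager. The sketch below assumes pkgin is installed and that you are logged in as root; the `PKGIN` variable and the `refresh_packages` function are names introduced here for illustration only.

```shell
# PKGIN lets you point at a different pkgin binary; it defaults to the one on PATH.
PKGIN="${PKGIN:-pkgin}"

# refresh_packages is a hypothetical helper name, not part of pkgsrc.
refresh_packages() {
    # Refresh the package database, then upgrade every installed package.
    # -y answers "yes" to confirmation prompts.
    "$PKGIN" -y update && "$PKGIN" -y full-upgrade
}

if command -v "$PKGIN" >/dev/null 2>&1; then
    refresh_packages
else
    echo "$PKGIN not found; run this on the SmartMachine" >&2
fi
```

Drop the `-y` flag if you would rather review the list of packages before they are upgraded.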
|executables (hadoop, hbase, pig, etc.)||/opt/local/bin|
|shell scripts (start-all.sh, hadoop-create-user.sh, etc.)||/opt/local/sbin|
|configuration files||/opt/local/etc/hadoop|
Some of the Hadoop tools require the JAVA_HOME environment variable to be set. It is set automatically when you run java, and it is also set in /opt/local/etc/hadoop/hadoop-env.sh.
If you need to set it yourself, you can do so like this:
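As one possible approach, the sketch below picks up the value from the hadoop-env.sh file mentioned above rather than hard-coding a Java path. It assumes that file assigns JAVA_HOME; the `HADOOP_ENV` variable and the `load_java_home` function are names made up for this example.

```shell
# Path taken from this page; set HADOOP_ENV beforehand to point elsewhere.
HADOOP_ENV="${HADOOP_ENV:-/opt/local/etc/hadoop/hadoop-env.sh}"

# load_java_home is a hypothetical helper name, not part of Hadoop.
load_java_home() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    # Source the file so its JAVA_HOME assignment takes effect in this shell,
    # then export it so child processes see it too.
    . "$1"
    export JAVA_HOME
}

if load_java_home "$HADOOP_ENV"; then
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

Sourcing the file also applies any other variables it sets to your current shell, so run it in a subshell if you only want to inspect the value.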
The HADOOP_HOME variable is set by /opt/local/etc/hadoop/hadoop-env.sh, which derives it from the location of the startup scripts.
The Hadoop SmartMachine is based on the SmartMachine Base 13.1.0 image, using the http://pkgsrc.joyent.com/packages/SmartOS/2013Q1/x86_64/All repository.
For a detailed list of every package installed with this image, click here.