diff --git a/README.md b/README.md
index 6547a84..b20a1ac 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,17 @@
+[![Gitter chat](https://badges.gitter.im/gitterHQ/gitter.png)](https://gitter.im/big-data-europe/Lobby)
+
 # Changes
 
 Version 2.0.0 introduces uses wait_for_it script for the cluster startup
 
 # Hadoop Docker
 
+## Supported Hadoop Versions
+* 2.7.1 with OpenJDK 7
+* 2.7.1 with OpenJDK 8
+
+## Quick Start
+
 To deploy an example HDFS cluster, run:
 ```
   docker-compose up
@@ -14,6 +22,18 @@ Or deploy in swarm:
   docker stack deploy -c docker-compose-v3.yml hadoop
 ```
 
+`docker-compose` creates a docker network that can be found by running `docker network list`, e.g. `dockerhadoop_default`.
+
+Run `docker network inspect` on the network (e.g. `dockerhadoop_default`) to find the IP the hadoop interfaces are published on. Access these interfaces with the following URLs:
+
+* Namenode: http://:50070/dfshealth.html#tab-overview
+* History server: http://:8188/applicationhistory
+* Datanode: http://:50075/
+* Nodemanager: http://:8042/node
+* Resource manager: http://:8088/
+
+## Configure Environment Variables
+
 The configuration parameters can be specified in the hadoop.env file or as environmental variables for specific services (e.g. namenode, datanode etc.):
 ```
   CORE_CONF_fs_defaultFS=hdfs://namenode:8020
@@ -23,7 +43,7 @@ CORE_CONF corresponds to core-site.xml. fs_defaultFS=hdfs://namenode:8020 will b
 ```
   <property><name>fs.defaultFS</name><value>hdfs://namenode:8020</value></property>
 ```
-To define dash inside a configuration parameter, use double underscore, such as YARN_CONF_yarn_log___aggregation___enable=true (yarn-site.xml):
+To define dash inside a configuration parameter, use triple underscore, such as YARN_CONF_yarn_log___aggregation___enable=true (yarn-site.xml):
 ```
   <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
 ```
@@ -37,10 +57,3 @@ The available configurations are:
 * /etc/hadoop/mapred-site.xml MAPRED_CONF
 
 If you need to extend some other configuration file, refer to base/entrypoint.sh bash script.
-
-After starting the example Hadoop cluster, you should be able to access interfaces of all the components (substitute domain names by IP addresses from ```network inspect dockerhadoop_default``` command):
-* Namenode: http://namenode:50070/dfshealth.html#tab-overview
-* History server: http://historyserver:8188/applicationhistory
-* Datanode: http://datanode:50075/
-* Nodemanager: http://nodemanager:8042/node
-* Resource manager: http://resourcemanager:8088/
diff --git a/base/entrypoint.sh b/base/entrypoint.sh
index e37bf20..3b16fed 100644
--- a/base/entrypoint.sh
+++ b/base/entrypoint.sh
@@ -23,7 +23,7 @@ function configure() {
 
     echo "Configuring $module"
     for c in `printenv | perl -sne 'print "$1 " if m/^${envPrefix}_(.+?)=.*/' -- -envPrefix=$envPrefix`; do
-        name=`echo ${c} | perl -pe 's/___/-/g; s/__/_/g; s/_/./g'`
+        name=`echo ${c} | perl -pe 's/___/-/g; s/__/@/g; s/_/./g; s/@/_/g;'`
         var="${envPrefix}_${c}"
         value=${!var}
         echo " - Setting $name=$value"
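
Note on the entrypoint.sh change: the old substitution chain turned `__` into `_` and then immediately rewrote that `_` to a dot, so a literal underscore could never survive in a property name. The new chain parks `__` as a temporary `@` until the dots are placed. The sketch below restates both chains with `sed` instead of `perl` so they can be compared side by side. The helper names `to_property_name` and `to_property_name_old` are hypothetical, and the `mapreduce__shuffle` input is an illustrative variable name that does not appear in this diff.

```shell
#!/bin/sh
# Sketch only: mirrors the substitutions in base/entrypoint.sh using sed.
# Both function names are hypothetical helpers, not part of the repository.

# New chain (after this patch):
#   1. ___ -> -   dash
#   2. __  -> @   temporary placeholder for a literal underscore
#   3. _   -> .   dot
#   4. @   -> _   restore the literal underscore
to_property_name() {
  printf '%s\n' "$1" | sed -e 's/___/-/g' -e 's/__/@/g' -e 's/_/./g' -e 's/@/_/g'
}

# Old chain (before this patch): __ becomes _ and is then rewritten to a dot.
to_property_name_old() {
  printf '%s\n' "$1" | sed -e 's/___/-/g' -e 's/__/_/g' -e 's/_/./g'
}

to_property_name 'fs_defaultFS'
to_property_name 'yarn_log___aggregation___enable'
# Illustrative input with a double underscore (assumption, not from the diff):
to_property_name 'yarn_nodemanager_aux___services_mapreduce__shuffle_class'
to_property_name_old 'yarn_nodemanager_aux___services_mapreduce__shuffle_class'
```

With the new chain the third call keeps the underscore in `mapreduce_shuffle`, while the old chain turns it into `mapreduce.shuffle`. One caveat of the placeholder approach is that a variable name already containing `@` would have it rewritten to `_`.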