first version of hadoop-base hadoop-namenode and hadoop-datanode

Giannis Mouchakis 2016-03-09 17:37:43 +02:00
parent dcff074a8d
commit b97e590f6f
6 changed files with 131 additions and 1 deletions


@@ -1 +1,29 @@
this will be the repo for docker-hadoop
This is a Docker container for Hadoop.
By default it uses a data replication factor of 2. To change it, edit the hdfs-site.xml file.
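The edit to hdfs-site.xml can also be scripted. The following is a minimal sketch, not part of this repository: the function name `set_replication` is hypothetical, and it assumes the `<value>` element sits on the line directly after `<name>dfs.replication</name>`, as it does in hadoop-base/hdfs-site.xml.

```shell
# Hypothetical helper: rewrite dfs.replication in a Hadoop site file.
# Assumes <value> is on the line immediately after <name>dfs.replication</name>
# and that the new value is a plain number.
set_replication() {
  file="$1"
  value="$2"
  # Find the <name> line, move to the next line (n), and replace the value there.
  sed "/<name>dfs.replication<\/name>/{
n
s|<value>[^<]*</value>|<value>${value}</value>|
}" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, `set_replication hadoop-base/hdfs-site.xml 3` would set the factor to 3; the images then need to be rebuilt for the change to take effect.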
To start the namenode run
docker run --name namenode -h namenode bde2020/hadoop-namenode
To start two datanodes on the same host run
docker run --name datanode1 --link namenode:namenode bde2020/hadoop-datanode
docker run --name datanode2 --link namenode:namenode bde2020/hadoop-datanode
More info is coming soon on how to run the Hadoop containers using docker network and docker swarm.
All data are stored in /hdfs-data, so to store data in a host directory run the datanodes as shown below. Give each datanode its own host directory, since two datanodes cannot share one data directory.
docker run --name datanode1 --link namenode:namenode -v /path/to/host1:/hdfs-data bde2020/hadoop-datanode
docker run --name datanode2 --link namenode:namenode -v /path/to/host2:/hdfs-data bde2020/hadoop-datanode
By default the namenode formats the namenode directory only if it does not already exist (hdfs namenode -format -nonInteractive).
If you want to mount an external directory that already contains a namenode directory and have it formatted, you must first delete that namenode directory manually.
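Before mounting a host directory, it may help to check whether it already holds namenode metadata. The sketch below is an illustration only: the function name `has_namenode_metadata` is hypothetical, `/path/to/host` is a placeholder, and the check assumes a formatted name directory contains a `current/VERSION` file, which only approximates the test Hadoop itself performs.

```shell
# Hypothetical helper: report whether a host directory already holds
# formatted namenode metadata. Assumes the directory will be mounted at
# /hdfs-data, so the metadata would live under <host_dir>/namenode.
has_namenode_metadata() {
  host_dir="$1"
  [ -f "$host_dir/namenode/current/VERSION" ]
}

# Example check before mounting; /path/to/host is a placeholder.
if has_namenode_metadata /path/to/host; then
  echo "existing namenode directory found: it will not be reformatted"
else
  echo "no namenode metadata found: the container will format on first start"
fi
```

If the check reports an existing directory and a fresh filesystem is wanted, remove the namenode subdirectory on the host before starting the container.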
Hadoop namenode listens on
hdfs://namenode:8020
To access the namenode from another container, link it using "--link namenode:namenode" and then use the aforementioned URL.
More info on how to access it using docker network coming soon.

hadoop-base/Dockerfile Normal file

@@ -0,0 +1,28 @@
FROM java:8-jre
MAINTAINER Yiannis Mouchakis <gmouchakis@iit.demokritos.gr>
# define hadoop version
ENV HADOOP_VERSION 2.7.1
# Hadoop env variables
ENV HADOOP_PREFIX /opt/hadoop
ENV HADOOP_CONF_DIR $HADOOP_PREFIX/conf
ENV PATH $PATH:$HADOOP_PREFIX/bin
ENV PATH $PATH:$HADOOP_PREFIX/sbin
RUN apt-get update && apt-get install -y \
wget \
tar \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# deploy hadoop
# download, unpack and move in a single layer so the tarball does not stay in the image
RUN wget http://archive.apache.org/dist/hadoop/core/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz \
&& tar -zxf /hadoop-$HADOOP_VERSION.tar.gz \
&& rm /hadoop-$HADOOP_VERSION.tar.gz \
&& mv hadoop-$HADOOP_VERSION $HADOOP_PREFIX
# add configuration files
ADD core-site.xml $HADOOP_CONF_DIR/core-site.xml
ADD hdfs-site.xml $HADOOP_CONF_DIR/hdfs-site.xml

hadoop-base/core-site.xml Normal file

@@ -0,0 +1,24 @@
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:8020</value>
</property>
</configuration>

hadoop-base/hdfs-site.xml Normal file

@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hdfs-data/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/hdfs-data/namenode</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
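To review which properties a site file sets, a short shell sketch can print them as name=value pairs. The function name `list_properties` is hypothetical, and the sketch assumes each `<name>` and `<value>` element sits on its own line, as in the files of this commit; it is not a general XML parser.

```shell
# Hypothetical helper: print name=value pairs from a Hadoop site file.
# Assumes every <name> line is followed by its <value> line.
list_properties() {
  # Extract the text of each <name> and <value> element, one per line,
  # then join consecutive lines in pairs with '='.
  sed -n \
    -e 's|.*<name>\(.*\)</name>.*|\1|p' \
    -e 's|.*<value>\(.*\)</value>.*|\1|p' "$1" | paste -d= - -
}
```

Running `list_properties hadoop-base/hdfs-site.xml` should list the five properties defined above, starting with dfs.replication.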


@@ -0,0 +1,5 @@
FROM bde2020/hadoop-base
MAINTAINER Yiannis Mouchakis <gmouchakis@iit.demokritos.gr>
CMD hdfs datanode


@@ -0,0 +1,5 @@
FROM bde2020/hadoop-base
MAINTAINER Yiannis Mouchakis <gmouchakis@iit.demokritos.gr>
# format the name directory only if it does not exist yet, then start the namenode
CMD hdfs namenode -format -nonInteractive ; hdfs namenode
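
The namenode CMD chains a format step and a start step, so the choice of shell separator matters: `&` puts the first command in the background and runs both concurrently, `;` runs them one after the other, and `&&` runs the second only if the first succeeded. The sketch below compares `;` and `&&` using stand-in functions (`format_step` and `start_step` are hypothetical names). It assumes the format step exits non-zero when the name directory already exists.

```shell
# Stand-in for "hdfs namenode -format -nonInteractive" on an already
# formatted directory; assumed here to exit non-zero in that case.
format_step() { return 1; }
# Stand-in for "hdfs namenode".
start_step() { echo "namenode started"; }

# ';' runs the second command after the first finishes, whatever its status.
seq_out=$(format_step; start_step)
# '&&' runs the second command only if the first succeeded.
and_out=$(format_step && start_step)

echo "with ';':  ${seq_out:-<nothing>}"
echo "with '&&': ${and_out:-<nothing>}"
```

Under that assumption, `;` is the separator that lets the namenode start on both a fresh and an already formatted directory.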