Added versions in section and note for resource manager container.

K 2025-04-14 17:58:17 +05:30
parent 3c27c86c81
commit fd4489977d
Signed by: notkshitij
GPG Key ID: C5B8BC7530F8F43F

@@ -1,22 +1,36 @@
 # Hadoop Docker
-## Supported Hadoop Versions
-See repository branches for supported hadoop versions
+This repository provides setup for Hadoop cluster using Docker containers. Spin up NameNode, DataNode, ResourceManager, NodeManager, and HistoryServer—each in its own container.
+---
+## Versions
+- Hadoop: 3.4.1
+- Java: 17
+---
+> [!IMPORTANT]
+> You may have to restart the `hadoop-resourcemanager` few times before it actually starts running.
 ## Quick Start
 To deploy an example HDFS cluster, run:
-```
-docker-compose up
+```shell
+docker-compose up -d
 ```
 Run example wordcount job:
-```
-make wordcount
+```shell
+make wordcount
 ```
 Or deploy in swarm:
-```
+```shell
 docker stack deploy -c docker-compose-v3.yml hadoop
 ```
@@ -24,35 +38,39 @@ docker stack deploy -c docker-compose-v3.yml hadoop
 Run `docker network inspect` on the network (e.g. `dockerhadoop_default`) to find the IP the hadoop interfaces are published on. Access these interfaces with the following URLs:
-* Namenode: http://<dockerhadoop_IP_address>:9870/dfshealth.html#tab-overview
-* History server: http://<dockerhadoop_IP_address>:8188/applicationhistory
-* Datanode: http://<dockerhadoop_IP_address>:9864/
-* Nodemanager: http://<dockerhadoop_IP_address>:8042/node
-* Resource manager: http://<dockerhadoop_IP_address>:8088/
+- Namenode: http://<dockerhadoop_IP_address>:9870/dfshealth.html#tab-overview
+- History server: http://<dockerhadoop_IP_address>:8188/applicationhistory
+- Datanode: http://<dockerhadoop_IP_address>:9864/
+- Nodemanager: http://<dockerhadoop_IP_address>:8042/node
+- Resource manager: http://<dockerhadoop_IP_address>:8088/
 ## Configure Environment Variables
 The configuration parameters can be specified in the hadoop.env file or as environmental variables for specific services (e.g. namenode, datanode etc.):
 ```
 CORE_CONF_fs_defaultFS=hdfs://namenode:8020
 ```
 CORE_CONF corresponds to core-site.xml. fs_defaultFS=hdfs://namenode:8020 will be transformed into:
 ```
 <property><name>fs.defaultFS</name><value>hdfs://namenode:8020</value></property>
 ```
 To define dash inside a configuration parameter, use triple underscore, such as YARN_CONF_yarn_log___aggregation___enable=true (yarn-site.xml):
 ```
 <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
 ```
 The available configurations are:
-* /etc/hadoop/core-site.xml CORE_CONF
-* /etc/hadoop/hdfs-site.xml HDFS_CONF
-* /etc/hadoop/yarn-site.xml YARN_CONF
-* /etc/hadoop/httpfs-site.xml HTTPFS_CONF
-* /etc/hadoop/kms-site.xml KMS_CONF
-* /etc/hadoop/mapred-site.xml MAPRED_CONF
+- /etc/hadoop/core-site.xml CORE_CONF
+- /etc/hadoop/hdfs-site.xml HDFS_CONF
+- /etc/hadoop/yarn-site.xml YARN_CONF
+- /etc/hadoop/httpfs-site.xml HTTPFS_CONF
+- /etc/hadoop/kms-site.xml KMS_CONF
+- /etc/hadoop/mapred-site.xml MAPRED_CONF
 If you need to extend some other configuration file, refer to base/entrypoint.sh bash script.