Configuring Hadoop Using Ansible
Hello Guys..!!!
Here I am going to configure Hadoop and start a cluster using Ansible playbooks.
Requirements: target nodes (EC2 instances), a control node (RHEL 8) where Ansible runs, the Hadoop software file, and the JDK file.
Steps:
1. Configure Ansible on the control node.
2. Write two Ansible playbooks: one that automates a target node to start the namenode, and another for the datanode.
3. These playbooks include the following tasks:
   - copying the Hadoop software and JDK files
   - installing Hadoop and Java
   - creating the namenode and datanode directories
   - copying the core-site.xml and hdfs-site.xml files
   - starting the namenode and datanode
   - checking jps
4. Run both Ansible playbooks.
Implementation:
First, install Ansible using the command "yum install ansible".
Now set up the inventory file and add the IPs of the target nodes. Also configure the ansible.cfg file.
Ansible is now configured.
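For reference, here is a minimal sketch of what such an inventory could look like, written in Ansible's YAML inventory format. The group names namenode and datanode are taken from the hosts: lines of the playbooks in this post. Everything else is an assumption for illustration: the IP addresses are placeholders, and the login user and key path are hypothetical values that you would replace with your own.

```yaml
# Hypothetical inventory sketch, saved with a .yml extension.
# Group names match the "hosts:" values used by the two playbooks.
all:
  children:
    namenode:
      hosts:
        192.0.2.10:          # placeholder IP of the namenode instance
    datanode:
      hosts:
        192.0.2.11:          # placeholder IP of the datanode instance
  vars:
    ansible_user: root                                  # assumed login user
    ansible_ssh_private_key_file: /root/mykey.pem       # hypothetical key path
```

The inventory setting in the [defaults] section of ansible.cfg can then point at this file, so that ansible-playbook picks it up without an extra -i option.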
Now we can write a playbook that automates starting the namenode and checks the running Java processes with jps.
- hosts: namenode
  tasks:
    - name: "copy__hadoop software file"
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root"

    - name: "copy__jdk file"
      copy:
        src: "/root/jdk-8u171-linux-x64.rpm"
        dest: "/root"

    # Full paths are used so the install does not depend on the working directory.
    - name: "Installing hadoop"
      shell: "rpm -ivh /root/hadoop-1.2.1-1.x86_64.rpm --force"
      register: Hadoop
      ignore_errors: yes

    - name: "Installing java"
      shell: "rpm -ivh /root/jdk-8u171-linux-x64.rpm"
      register: java
      ignore_errors: yes

    # This directory should match dfs.name.dir in hdfs-site.xml.
    - name: "Creating_Directory"
      file:
        state: directory
        path: "nn1"

    - name: "Copy_coresite file"
      copy:
        src: "core-site.xml"
        dest: "/etc/hadoop/core-site.xml"

    - name: "Copy_hdfs-site.xml"
      copy:
        src: "hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"

    - name: "Format_namenode"
      shell: "echo Y | hadoop namenode -format"
      register: format

    - name: "Start_Namenode"
      shell: "hadoop-daemon.sh start namenode"
      ignore_errors: yes
      register: namenode_starts

    - name: "check_JPS"
      shell: "jps"
      register: jps
Now run this playbook using the command
"ansible-playbook namenode.yml"
The output of the playbook is
The playbook ran successfully, so the namenode started.
Let us check whether the namenode started.
As shown in the image above, Hadoop and the JDK are installed and the namenode has started.
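The playbook registers the jps output but never prints it, and ignore_errors: yes can hide a failed start. One possible way to see the result directly in the playbook run is to append tasks like the following to the end of the tasks: list. This is only a sketch and not part of the original playbook: the task names are made up, and it assumes jps lists the process as NameNode.

```yaml
# Hypothetical extra tasks, appended after the "check_JPS" task
# (indent them to match the other entries under "tasks:").
- name: "Show running Java processes"
  debug:
    var: jps.stdout_lines      # uses the variable registered by check_JPS

- name: "Fail if the namenode is not running"
  assert:
    that:
      - "'NameNode' in jps.stdout"
    fail_msg: "NameNode was not found in the jps output"
```

With these in place, the run stops with a clear message if the daemon did not come up, instead of reporting success.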
Now write the datanode playbook as follows:
- hosts: datanode
  tasks:
    - name: "copy__hadoop software file"
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root"

    - name: "copy__jdk file"
      copy:
        src: "/root/jdk-8u171-linux-x64.rpm"
        dest: "/root"

    # Full paths are used so the install does not depend on the working directory.
    - name: "Installing hadoop"
      shell: "rpm -ivh /root/hadoop-1.2.1-1.x86_64.rpm --force"
      register: Hadoop
      ignore_errors: yes

    - name: "Installing java"
      shell: "rpm -ivh /root/jdk-8u171-linux-x64.rpm"
      register: java
      ignore_errors: yes

    # This directory should match dfs.data.dir in hdfs-site.xml.
    - name: "Creating_Directory"
      file:
        state: directory
        path: "dn1"

    - name: "Copy_coresite file"
      copy:
        src: "core-site.xml"
        dest: "/etc/hadoop/core-site.xml"

    - name: "Copy_hdfs-site.xml"
      copy:
        src: "hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"

    - name: "Start_Datanode"
      shell: "hadoop-daemon.sh start datanode"
      ignore_errors: yes
      register: datanode_starts

    - name: "check_JPS"
      shell: "jps"
      register: jps
Run this playbook using the command
"ansible-playbook datanode.yml"
The output of the playbook is
The datanode playbook also ran successfully and the datanode started.
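A started datanode process does not by itself prove that it joined the cluster. As a rough sketch, a small extra play like the one below could ask the namenode for its cluster report. This play is not part of the original setup: the task names and the dfs_report variable are made up for illustration, and it assumes the hadoop command from the installed RPM is on the PATH of the namenode host.

```yaml
# Hypothetical verification play, e.g. saved as a separate report.yml
- hosts: namenode
  tasks:
    - name: "Ask the namenode for the cluster report"
      shell: "hadoop dfsadmin -report"
      register: dfs_report
      changed_when: false      # read-only command, never reports a change

    - name: "Show the cluster report"
      debug:
        var: dfs_report.stdout_lines
```

If the datanode connected correctly, it should appear in the list of datanodes printed by this report.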
In this way, the Hadoop cluster configuration is automated using Ansible playbooks.
Thank you…!!!