Reference: https://www.edureka.co/blog/setting-up-a-multi-node-cluster-in-hadoop-2-x/
vmhost1: Name Node, Cloudera Manager, and several other roles
vmhost2: Data Node
vmhost3: Data Node
The replication factor is 3.
Size the following for each node before installation:
OS
CPU
memory
network bandwidth
storage
Pre-Steps Before the Installation
1. Set the SELinux policy to disabled by modifying the following parameter in the /etc/selinux/config file.
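For example, the relevant line in /etc/selinux/config should read:
SELINUX=disabled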
2. Disable firewall.
chkconfig iptables off
3. Set swappiness to 0 in the /etc/sysctl.conf file.
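For example, add this line to /etc/sysctl.conf and apply it with sysctl -p:
vm.swappiness = 0
sysctl -p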
4. Disable IPv6 in the /etc/sysctl.conf file.
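The usual entries for this (the same ones used in the GCP section later) are:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1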
5. Configure passwordless SSH for the root user (see the sketch below).
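A minimal sketch of this step, assuming the vmhost1-3 names from above: generate a key on the name node and copy it to each other host:
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ssh-copy-id root@vmhost2
ssh-copy-id root@vmhost3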
1. Download and run the Cloudera Manager Server installer (fetched with wget).
Once it finishes, open a browser window and point it to http://localhost:7180/.
2. Log-on screen
3. Choose the version
4. Specify hostnames for the CDH cluster installation
5. Select the repository
6. Java installation
7. SSH login credentials
8. Cluster installation -> detect version -> finish screen
9. Role assignment -> HDFS, Hive
10. Database setup -> review -> complete -> finish
#### Cloudera Installation on Google Cloud Platform (GCP) ####
You can use any domain name; san.com is used here. (It helps to copy the steps below into a notepad first.)
Create 4 machines with the same configuration (there is a quota restriction), as below.
Select the same zone for all systems (otherwise the systems won't communicate over their private IPs).
1. Go to "Create an instance" in GCP (Google Cloud Platform) and follow the steps below (NRMBFN: Name, Region, Machine, Boot disk, Firewall, Networking):
Name: any name
Region/Zone: any region, but the same for all nodes (e.g., us-west4-a, Las Vegas)
Machine configuration: E2, 8 GB
Boot disk: CentOS 7, 50 GB
Firewall: allow HTTP and HTTPS
Networking (hostname): cdh1.san.com -------> Create
2. Edit the config file /etc/ssh/sshd_config as the root user (all 4 nodes).
Make sure the parameters below are set:
cdh1 ~]$ sudo su - root
[root@cdh1 ~]# vi /etc/ssh/sshd_config +38
PasswordAuthentication yes (line number 38) :38
PermitRootLogin yes (line number 65) :65
Restart sshd:
[root@cdh1 ~]# systemctl restart sshd
Reset the root password (san123); this lets you log in as root from PuTTY:
[root@cdh1 ~]# passwd
3. Create the rc.local file and set the hostname (all 4 nodes):
touch /etc/rc.d/rc.local
chmod u+x /etc/rc.d/rc.local
systemctl enable rc-local
Set the hostname on each of the nodes:
hostnamectl set-hostname cdh1.san.com
hostnamectl set-hostname cdh2.san.com
hostnamectl set-hostname cdh3.san.com
hostnamectl set-hostname cdh4.san.com
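GCP can reset the hostname from DHCP on reboot; to make the name stick, you can also append the hostnamectl command to the rc.local file created above (shown here for cdh1; use the matching name on each node):
echo "hostnamectl set-hostname cdh1.san.com" >> /etc/rc.d/rc.local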
4. Change the runlevel on all 4 nodes to multi-user.target:
systemctl set-default multi-user.target
5. Disable SELinux and the firewall on all 4 nodes.
[root@cdh1 ~]# vi /etc/selinux/config +7
Change SELINUX to disabled:
SELINUX=disabled
[root@cdh1 ~]# systemctl disable firewalld
6. Log in to the first node, cdh1, and install the MySQL metastore (used for the Cloudera Manager Server (SCM) database, Hive, Oozie, ...). Install the MySQL community release RPM and the MySQL community server, and start the MySQL service (copy these into notepad first):
yum localinstall \
[root@cdh1 ~]# yum install mysql-community-server -y
Use the settings below in /etc/my.cnf (in vi, :1,$d removes all existing contents); then paste the new contents:
[root@cdh1 ~]#vi /etc/my.cnf
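The notes don't include the my.cnf contents; the sketch below is a minimal example based on Cloudera's commonly published MySQL recommendations (an assumption, not the author's exact file; adjust buffer sizes to your VM's 8 GB of memory):
[mysqld]
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
symbolic-links = 0
max_connections = 550
max_allowed_packet = 32M
log_bin = /var/lib/mysql/mysql_binary_log
server_id = 1
binlog_format = mixed
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 1G
innodb_flush_method = O_DIRECT
[mysqld_safe]
log-error = /var/log/mysqld.log
pid-file = /var/run/mysqld/mysqld.pid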
[root@cdh1 ~]#systemctl enable mysqld.service
[root@cdh1 ~]#systemctl start mysqld.service
Check the running status using:
[root@cdh1 ~]#systemctl status mysqld.service
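MySQL community server 5.7 generates a temporary root password at first start; it can usually be found in the log with:
[root@cdh1 ~]# grep 'temporary password' /var/log/mysqld.log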
7. If the root password is not present in the log file, use the commands below to reset it:
==============================================================
[root@cdh1 ~]# sudo systemctl stop mysqld
[root@cdh1 ~]# sudo systemctl set-environment MYSQLD_OPTS="--skip-grant-tables"
[root@cdh1 ~]# sudo systemctl start mysqld
[root@cdh1 ~]# mysql -u root
mysql> update mysql.user set plugin='mysql_native_password';
mysql> UPDATE mysql.user SET authentication_string = PASSWORD('Bigdata123!') WHERE User = 'root';
mysql> FLUSH PRIVILEGES;
mysql> quit
Restart MySQL without --skip-grant-tables and log in with the new password:
[root@cdh1 ~]# sudo systemctl stop mysqld
[root@cdh1 ~]# sudo systemctl unset-environment MYSQLD_OPTS
[root@cdh1 ~]# sudo systemctl start mysqld
Log in to MySQL:
[root@cdh1 ~]# mysql -u root -p
mysql> quit
=====================================================================
8. Set up /etc/hosts with the entries below (for communication between the systems); change the IPs:
[root@cdh1 ~]# vi /etc/hosts (replace the IPs below with your private/internal IPs from GCP)
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.182.0.4 cdh1.san.com cdh1
10.182.0.5 cdh2.san.com cdh2
10.182.0.6 cdh3.san.com cdh3
10.182.0.7 cdh4.san.com cdh4
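Cross-check name resolution from cdh1 before rebooting, e.g.:
[root@cdh1 ~]# ping -c 2 cdh2.san.com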
Reboot all 4 nodes:
[root@cdh1 ~]# reboot
Reset the root password to "san123" on all the nodes.
Configure passwordless SSH between the master and slave nodes.
Log in to the cdh1 node (generate the SSH key and cross-check it):
[root@cdh1 ~]# yum install sshpass -y (for non-interactive password authentication)
rm -rf ~/.ssh/id_rsa*
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ls -ltr ~/.ssh
Now, from cdh1, run ssh cdh1, ssh cdh2, ssh cdh3, and ssh cdh4 once each (to accept and store the host keys).
To copy the RSA public key to all the nodes in the cluster, use the command below (grep 10 matches the internal IPs; san123 is the password):
[root@cdh1 ~]# for i in `cat /etc/hosts | grep 10 | awk '{print $1}'`; do sshpass -p san123 ssh-copy-id $i; done
Now logging in to any node from cdh1 will not ask for a password.
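A quick cross-check that the passwordless setup works:
[root@cdh1 ~]# ssh cdh2 hostname (should print cdh2.san.com without asking for a password)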
9. Install and configure clustershell on the cdh1 node only:-
[root@cdh1 ~]# yum install epel* -y
[root@cdh1 ]# yum install clustershell -y
Create the config file below (for the NN, SNN, and DN groupings):
[root@cdh1 ~]# vi /etc/clustershell/groups.d/local.cfg (delete the existing contents and use the IPs of cdh1-4)
nn: 10.182.0.4
snn: 10.182.0.5
dn: 10.182.0.5 10.182.0.6 10.182.0.7
edge: 10.182.0.7
all: 10.182.0.4 10.182.0.5 10.182.0.6 10.182.0.7
[root@cdh1 ~]# clush -g all -b "date" (it should show the time from all 4 machines; if it does, the clustershell config is good)
10. Install and configure NTP:- this makes all the machines get their time from an NTP server.
[root@cdh1 ~]# clush -g all -b "yum install ntp -y"
[root@cdh1 ~]# clush -g all -b "systemctl enable ntpd"
[root@cdh1 ~]# clush -g all -b "systemctl restart ntpd"
[root@cdh1 ~]# cat /etc/ntp.conf (GCP adds an NTP server here by default)
Check the time on all the nodes and make sure it is in sync:
[root@cdh1 ~]# clush -g all -b "date"
Copy /etc/hosts to all the nodes in the cluster:- (for communication between systems)
[root@cdh1 ~]# clush -g all --copy /etc/hosts (cross-check that it was copied)
Disable the firewall running on all the nodes:- (for easy communication)
[root@cdh1 ~]# clush -g all -b "systemctl disable firewalld"
[root@cdh1 ~]# clush -g all -b "systemctl stop firewalld"
11. Install Java on all 4 nodes:- (a JDK is required for the Hadoop framework)
[root@cdh1 ~]# clush -g all -b "yum install java-1.8.0-openjdk.x86_64 -y"
Make sure Java is installed on all the nodes by executing the command below:-
[root@cdh1 ~]# clush -g all -b "java -version"
Install the MySQL Java connector on all 4 nodes:
[root@cdh1 ~]# clush -g all -b "yum -y install mysql-connector-java"
After installation, check whether mysql-connector-java.jar is present in /usr/share/java/.
[root@cdh1 ~]# clush -g all -b "ls -al /usr/share/java/mysql-connector-java.jar"
Disable IPv6 on all the nodes (add the entries below):-
[root@cdh1 ~]#sudo vi /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.enp0s3.disable_ipv6 = 1 (replace enp0s3 with your node's interface name if it differs)
vm.swappiness = 10
Copy the same file to all other nodes:-
[root@cdh1 ~]# clush -g all --copy /etc/sysctl.conf
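Apply the new sysctl settings on all the nodes without a reboot:
[root@cdh1 ~]# clush -g all -b "sysctl -p"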
Execute the commands below on all 4 nodes:- (prerequisites)
-----------------------------------------------------------------------------------------------
cd /etc/yum.repos.d
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag (tee is needed: with sudo echo ... > the redirect would run unprivileged)
sudo yum -y install wget
sudo yum -y install createrepo
sudo yum -y install yum-utils createrepo
sudo yum -y install MySQL-python*
sudo yum -y install python*
sudo yum -y install httpd
sudo yum -y install telnet
sudo yum -y install bind*
sudo yum -y install openssh*
sudo yum -y install rpmdevtools
sudo yum -y install ntp*
sudo yum -y install redhat-lsb*
sudo yum -y install cyrus*
sudo yum -y install mod_ssl*
sudo yum -y install portmap*
sudo yum -y install openssl*
sudo yum -y install mlocate*
sudo yum -y install sshpass*
sudo yum -y remove snappy
sudo yum -y install gcc
sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
sudo yum -y install apache-maven
sudo updatedb
sudo systemctl disable firewalld.service
sudo systemctl enable httpd
sudo systemctl enable ntpd
sudo systemctl enable ntpdate
sudo systemctl status firewalld.service
sudo systemctl status httpd
sudo systemctl status ntpd
----------------------------------------------------------------------------------------------
12. Download all RPMs for Java and Cloudera Manager and keep them in /var/www/html/cm on cdh1.
[root@cdh1 ~]#yum install httpd -y
Enable and start the httpd service
[root@cdh1 ~]#systemctl enable httpd
[root@cdh1 ~]#systemctl start httpd
Download all the required RPMs for Cloudera Manager from the URLs below:- (prerequisites)
[root@cdh1 ~]#mkdir /var/www/html/cm
[root@cdh1 ~]#cd /var/www/html/cm (copy in notepad and execute below cmds)
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/cloudera-manager-daemons-5.16.2-1.cm5162.p0.7.el7.x86_64.rpm
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/cloudera-manager-agent-5.16.2-1.cm5162.p0.7.el7.x86_64.rpm
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/cloudera-manager-server-5.16.2-1.cm5162.p0.7.el7.x86_64.rpm
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/cloudera-manager-server-db-2-5.16.2-1.cm5162.p0.7.el7.x86_64.rpm
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/enterprise-debuginfo-5.16.2-1.cm5162.p0.7.el7.x86_64.rpm
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/jdk-6u31-linux-amd64.rpm
[root@cdh1 ~]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.16.2/RPMS/x86_64/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[root@cdh1 ]#wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/RPM-GPG-KEY-cloudera
Download the Cloudera parcels from the URLs below:-
[root@cdh1 html]# mkdir -p /var/www/html/CDH5.16.2/parcels
[root@cdh1 html]# cd /var/www/html/CDH5.16.2/parcels
[root@cdh1 parcels]# wget https://archive.cloudera.com/cdh5/parcels/5.16.2/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel
[root@cdh1 parcels]# wget https://archive.cloudera.com/cdh5/parcels/5.16.2/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha1
[root@cdh1 parcels]# mv CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha1 CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha
[root@cdh1 parcels]# wget http://archive.cloudera.com/cdh5/parcels/5.16.2/manifest.json
Create the yum repo using the below command:-
[root@cdh1 html]# createrepo /var/www/html/cm
[root@cdh1 parcels]# createrepo /var/www/html/CDH5.16.2/parcels
Create the Cloudera Manager repository (replace the baseurl IP below with cdh1's internal IP):
[root@cdh1 cm]# vi /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
name = Cloudera Manager Version 5.16.2
baseurl = http://10.128.0.17/cm/
gpgcheck = 1
gpgkey = http://10.128.0.17/cm/RPM-GPG-KEY-cloudera
(gpgkey points to the RPM-GPG-KEY-cloudera file downloaded above; it is needed because gpgcheck = 1)
[root@cdh1 parcels]# vi /etc/yum.repos.d/cloudera-cdh.repo
[cloudera-cdh5]
name=Clouderaparcels-5.16.2_LocalRepo
baseurl=http://10.128.0.17/CDH5.16.2/parcels
enabled=1
gpgcheck=0
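Cross-check that yum can see the new local repositories (the repo names match the files above):
[root@cdh1 ~]# yum clean all
[root@cdh1 ~]# yum repolist | grep -i cloudera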
Make a directory:-
[root@cdh1 parcels]# mkdir -p /opt/cloudera/parcels
Copy the parcels to the /opt/cloudera/parcels directory:-
[root@cdh1 parcels]# cp /var/www/html/CDH5.16.2/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha /opt/cloudera/parcels/
[root@cdh1 parcels]# cp /var/www/html/CDH5.16.2/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel /opt/cloudera/parcels/
Copy the repository files to all the nodes in the cluster:-
[root@cdh1 cm]# clush -g all --copy /etc/yum.repos.d/cloudera-cdh.repo
[root@cdh1 cm]# clush -g all --copy /etc/yum.repos.d/cloudera-manager.repo
Install the packages below on the first node, cdh1, only:
[root@cdh1 cm]# yum install cloudera-manager-agent cloudera-manager-daemons cloudera-manager-server -y
13. Prepare the Cloudera Manager database:-
Log in to the MySQL database and create the databases below:-
[root@cdh1 cm]# mysql -u root -p
Enter password:
mysql> show databases;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.
mysql> SET PASSWORD = PASSWORD('Bigdata123!'); (or run /usr/bin/mysql_secure_installation instead)
mysql> grant all on *.* TO 'root'@'%' IDENTIFIED BY 'Bigdata123!' WITH GRANT OPTION;
mysql> uninstall plugin validate_password; (to allow simple passwords)
-----------------------------------------------------------------------------------------
create database amon DEFAULT CHARACTER SET utf8;
grant all on amon.* to 'amon'@'%' identified by 'root@123#';
create database scm DEFAULT CHARACTER SET utf8;
grant all on scm.* to 'scm'@'%' identified by 'root@123#';
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* to 'rman'@'%' identified by 'root@123#';
create database metastore DEFAULT CHARACTER SET utf8;
grant all on metastore.* to 'hive'@'%' identified by 'root@123#';
create database sentry DEFAULT CHARACTER SET utf8;
grant all on sentry.* to 'sentry'@'%' identified by 'root@123#';
create database nav DEFAULT CHARACTER SET utf8;
grant all on nav.* to 'nav'@'%' identified by 'root@123#';
create database navms DEFAULT CHARACTER SET utf8;
grant all on navms.* to 'navms'@'%' identified by 'root@123#';
create database hue DEFAULT CHARACTER SET utf8;
grant all on hue.* to 'hue'@'%' identified by 'root@123#';
create database oozie DEFAULT CHARACTER SET utf8;
grant all on oozie.* to 'oozie'@'%' identified by 'root@123#';
create database hive DEFAULT CHARACTER SET utf8;
grant all on hive.* to 'hive'@'%' identified by 'root@123#';
flush privileges;
quit
-----------------------------------------------------------------------------------------
Create the schema for the SCM database (Cloudera uses the SCM DB to store metadata).
Log in to the cdh1 node, where MySQL is installed.
[root@cdh1 cm]# /usr/share/cmf/schema/scm_prepare_database.sh mysql scm scm
Enter SCM password: root@123#
The last line of the output: All done, your SCM database is configured correctly
[root@cdh1 cm]# sudo cat /etc/cloudera-scm-server/db.properties (the credentials the SCM server loads)
The last line of the output: com.cloudera.cmf.db.password=root@123#
14. After this, enable and restart cloudera-scm-server on the cdh1 node:
[root@cdh1 cm]#systemctl enable cloudera-scm-server
[root@cdh1 cm]#systemctl restart cloudera-scm-server
[root@cdh1 cm]#systemctl status cloudera-scm-server
Now check Cloudera Manager by watching the log (in the background it creates the schema and tables; about 5 minutes):
[root@cdh1 cm]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log (the SCM server log)
The first line of the output: MetricSchemaManager: Registering cross entity aggregates...
Wait about five minutes; the Cloudera Manager daemon will then listen on port 7180:
telnet cdh1.san.com 7180
15. Access Cloudera Manager using your external IP.
Open the firewall for port 7180 in GCP: go to GCP and create a firewall rule.
Create a new GCP firewall rule (you must redo this after every restart of the GCP machines):
Name, target (all instances in your network), source IP range (your public IP address) ---> Create
Log in to the Cloudera portal using the cdh1 public IP (username: admin, password: admin).
Select Cloudera Enterprise -> Cloudera Enterprise Trial (60 days) -> Continue -> Continue
[root@cdh1 ~]# cat /etc/hosts (get the internal IPs and paste them into the wizard window)
10.182.0.4,10.182.0.5,10.182.0.6,10.182.0.7 (search for these IPs) -> Continue
[root@cdh1 yum.repos.d]# cat cloudera-manager.repo (to get the local repository URL for the wizard)
Continue -> Continue
[root@cdh1 ~]# clush -g all -b "echo never > /sys/kernel/mm/transparent_hugepage/defrag"
[root@cdh1 ~]# clush -g all -b "echo never > /sys/kernel/mm/transparent_hugepage/enabled"
Role assignment in the wizard:
Name Node: cdh4
Secondary Name Node: cdh3
Balancer: cdh4
NFS Gateway: on the edge node, i.e., cdh1
Data Nodes: select nodes 1, 2, 3
Cross-check whether the cluster is running.
Now shut down the GCP machines:
[root@cdh1 ~]# clush -g all -b "poweroff"
When you start the GCP machines again, the external IPs will change,
so you have to create the firewall rule again in order to access Cloudera Manager.
You can execute a sample MapReduce job using the command below:-
[root@cdh1 cm]# hadoop jar /opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/jars/hadoop-examples.jar wordcount /user/balaji/1015mbfile /user/balaji/out3
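To inspect the wordcount result (assuming the same example paths; part-r-00000 is the standard reducer output file name):
[root@cdh1 ~]# hadoop fs -ls /user/balaji/out3
[root@cdh1 ~]# hadoop fs -cat /user/balaji/out3/part-r-00000 | head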
-----------------------------------------------------------------------------------------------------------------------------
Errors
Start the GCP nodes and do the steps below after every restart of the GCP systems:
Create a new GCP firewall rule (required after every restart):
Name, target (all instances in your network), source IP range (your public IP address) ---> Create
Log in to the Cloudera portal using the cdh1 public IP (username: admin, password: admin).