Installing Hadoop in a Few Quick Steps


I translated this post from the dataottam.com blog. It shows a scripted way to install Hadoop, a tool for processing big data.

What you need

  1. Ubuntu 14.04
  2. The install script (included below)

How to use it


  1. Create a script named 3clicks.sh, then copy and paste the code below into the new file

#!/bin/bash
# sed -i -e 's/\r$//' scriptname.sh
# sudo chmod 777 scriptname.sh
# ./scriptname.sh
sudo apt-get update \
&& sudo apt-get -y install openssh-server \
&& sudo apt-get -y install openjdk-7-jdk \
&& sudo wget http://mirror.olnevhost.net/pub/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz \
&& sudo tar -zxvf hadoop-1.2.1-bin.tar.gz \
&& sudo mv hadoop-1.2.1 /home/ubuntu/hadoop \
&& sudo chown -R ubuntu /home/ubuntu/hadoop \
&& sudo echo "export HADOOP_HOME=/home/ubuntu/hadoop" >> /home/ubuntu/.bashrc \
&& sudo echo "export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64/" >> /home/ubuntu/.bashrc \
&& echo "export PATH=\$PATH:\$HADOOP_HOME/bin" >> /home/ubuntu/.bashrc \
&& echo "export PATH=\$PATH:\$JAVA_HOME/bin" >> /home/ubuntu/.bashrc \
&& sudo mkdir /home/ubuntu/hadoop/tmp \
&& sudo chown root /home/ubuntu/hadoop/tmp \
&& sudo chmod 777 /home/ubuntu/hadoop \
&& sudo chmod 777 /home/ubuntu/hadoop/tmp \
&& sudo sed -i 's/# export JAVA_HOME=\/usr\/lib\/j2sdk1.5-sun/export JAVA_HOME=\/usr\/lib\/jvm\/java-1.7.0-openjdk-amd64/' /home/ubuntu/hadoop/conf/hadoop-env.sh \
&& sudo sed -i 's/# export HADOOP_OPTS=-server/export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true/' /home/ubuntu/hadoop/conf/hadoop-env.sh \
&& sudo sed -i "7d" /home/ubuntu/hadoop/conf/core-site.xml \
&& sudo sed -i "7i<property>\n<name>fs.default.name</name>\n<value>hdfs://localhost:9000</value>\n</property>\n<property>\n<name>hadoop.tmp.dir</name>\n<value>/home/ubuntu/hadoop/tmp</value>\n</property>" /home/ubuntu/hadoop/conf/core-site.xml \
&& sudo sed -i "7d" /home/ubuntu/hadoop/conf/mapred-site.xml \
&& sudo sed -i "7i<property>\n<name>mapred.job.tracker</name>\n<value>localhost:9001</value>\n</property>" /home/ubuntu/hadoop/conf/mapred-site.xml \
&& sudo sed -i "7d" /home/ubuntu/hadoop/conf/hdfs-site.xml \
&& sudo sed -i "7i<property>\n<name>dfs.replication</name>\n<value>1</value>\n</property>" /home/ubuntu/hadoop/conf/hdfs-site.xml \
&& ssh-keygen -b 2048 -t rsa -f /home/ubuntu/.ssh/id_rsa -q -N "" \
&& cat /home/ubuntu/.ssh/id_rsa.pub >> /home/ubuntu/.ssh/authorized_keys \
&& ssh-keyscan localhost >> /home/ubuntu/.ssh/known_hosts
  2. Strip any Windows line endings from the script and give it execute permission
sed -i -e 's/\r$//' 3clicks.sh
sudo chmod 777 3clicks.sh
  3. Run the script with ./3clicks.sh
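
Step 2 matters mostly when the script was pasted or saved on Windows: a carriage return left after `#!/bin/bash` makes the system look for an interpreter literally named `bash\r`. Here is a minimal sketch of what the two commands do, run against a throwaway file. The name demo.sh and the temporary directory are illustrative and not part of the original post, and `\r` in the sed pattern assumes GNU sed, which is what Ubuntu ships.

```shell
#!/bin/bash
# Hypothetical demo: demo.sh is a throwaway file, not the real 3clicks.sh.
workdir=$(mktemp -d)
demo="$workdir/demo.sh"

# Write a two-line script with Windows (CRLF) line endings.
printf '#!/bin/bash\r\necho hello\r\n' > "$demo"

# A literal carriage return, built portably with printf.
cr=$(printf '\r')

# Count the lines that still contain a carriage return.
count_cr() {
    grep -c "$cr" "$1"
}

before=$(count_cr "$demo")

# Same substitution as step 2: drop one CR at the end of each line.
sed -i -e 's/\r$//' "$demo"

after=$(count_cr "$demo")

# Owner-execute is enough to run the script; the post uses the broader 777.
chmod u+x "$demo"

# Run through bash explicitly so the demo does not depend on whether
# the temporary directory allows direct execution.
output=$(bash "$demo")

echo "CR lines before: $before, after: $after, script prints: $output"
```

On a single-user test machine `chmod 777` works, but `chmod u+x 3clicks.sh` grants the one permission the step actually needs.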


To make it clearer how the script works, here is the original author's explanation of each step:
  1. Update the package lists of Ubuntu 14.04:
sudo apt-get update \
  2. Install openssh-server so the machine accepts SSH connections on port 22:
&& sudo apt-get -y install openssh-server \
  3. Install OpenJDK, which Hadoop requires:
&& sudo apt-get -y install openjdk-7-jdk \
  4. Download hadoop-1.2.1-bin.tar.gz:
&& sudo wget http://mirror.olnevhost.net/pub/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz \
  5. Extract the Hadoop tar file:
&& sudo tar -zxvf hadoop-1.2.1-bin.tar.gz \
  6. Move the extracted hadoop-1.2.1 directory to /home/ubuntu/hadoop:
&& sudo mv hadoop-1.2.1 /home/ubuntu/hadoop \
  7. Give ownership of the Hadoop directory to the ubuntu user:
&& sudo chown -R ubuntu /home/ubuntu/hadoop \
  8. Set HADOOP_HOME in .bashrc:
&& sudo echo "export HADOOP_HOME=/home/ubuntu/hadoop" >> /home/ubuntu/.bashrc \
  9. Set JAVA_HOME in .bashrc:
&& sudo echo "export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64/" >> /home/ubuntu/.bashrc \
  10. Add Hadoop's bin directory to PATH in .bashrc:
&& echo "export PATH=\$PATH:\$HADOOP_HOME/bin" >> /home/ubuntu/.bashrc \
  11. Add Java's bin directory to PATH in .bashrc:
&& echo "export PATH=\$PATH:\$JAVA_HOME/bin" >> /home/ubuntu/.bashrc \
  12. Create a tmp directory inside hadoop, which serves as the base for Hadoop's other directories:
&& sudo mkdir /home/ubuntu/hadoop/tmp \
  13. Make root the owner of tmp:
&& sudo chown root /home/ubuntu/hadoop/tmp \
  14. Give everyone read, write and execute permission on the hadoop directory:
&& sudo chmod 777 /home/ubuntu/hadoop \
  15. Give everyone read, write and execute permission on tmp:
&& sudo chmod 777 /home/ubuntu/hadoop/tmp \
  16. Set the Java path in hadoop-env.sh:
&& sudo sed -i 's/# export JAVA_HOME=\/usr\/lib\/j2sdk1.5-sun/export JAVA_HOME=\/usr\/lib\/jvm\/java-1.7.0-openjdk-amd64/' /home/ubuntu/hadoop/conf/hadoop-env.sh \
  17. Set HADOOP_OPTS in hadoop-env.sh so that Java prefers the IPv4 stack:
&& sudo sed -i 's/# export HADOOP_OPTS=-server/export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true/' /home/ubuntu/hadoop/conf/hadoop-env.sh \
  18. Delete line 7 of core-site.xml:
&& sudo sed -i "7d" /home/ubuntu/hadoop/conf/core-site.xml \
  19. Insert the core-site.xml configuration at line 7:
&& sudo sed -i "7i<property>\n<name>fs.default.name</name>\n<value>hdfs://localhost:9000</value>\n</property>\n<property>\n<name>hadoop.tmp.dir</name>\n<value>/home/ubuntu/hadoop/tmp</value>\n</property>" /home/ubuntu/hadoop/conf/core-site.xml \
  20. Delete line 7 of mapred-site.xml:
&& sudo sed -i "7d" /home/ubuntu/hadoop/conf/mapred-site.xml \
  21. Insert the mapred-site.xml configuration at line 7:
&& sudo sed -i "7i<property>\n<name>mapred.job.tracker</name>\n<value>localhost:9001</value>\n</property>" /home/ubuntu/hadoop/conf/mapred-site.xml \
  22. Delete line 7 of hdfs-site.xml:
&& sudo sed -i "7d" /home/ubuntu/hadoop/conf/hdfs-site.xml \
  23. Insert the hdfs-site.xml configuration at line 7:
&& sudo sed -i "7i<property>\n<name>dfs.replication</name>\n<value>1</value>\n</property>" /home/ubuntu/hadoop/conf/hdfs-site.xml \
  24. Generate an RSA key pair with an empty passphrase:
&& ssh-keygen -b 2048 -t rsa -f /home/ubuntu/.ssh/id_rsa -q -N "" \
  25. Append the public key to authorized_keys:
&& cat /home/ubuntu/.ssh/id_rsa.pub >> /home/ubuntu/.ssh/authorized_keys \
  26. Add the host keys of localhost, as reported by ssh-keyscan, to known_hosts:
&& ssh-keyscan localhost >> /home/ubuntu/.ssh/known_hosts
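
The steps that write HADOOP_HOME, JAVA_HOME and PATH to .bashrc use `>>`, so running the script a second time would leave duplicate export lines. One possible way to make that part safe to rerun is a small helper that appends a line only when an identical line is not already present. This is a sketch, not the author's method: `append_once` is a hypothetical name that the original script does not define, and the demo writes to a temporary file rather than a real .bashrc.

```shell
#!/bin/bash
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# Hypothetical helper, not part of the original 3clicks.sh.
append_once() {
    local file=$1 line=$2
    # -x: match whole lines only, -F: treat the text literally, -q: no output.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Stand-in for /home/ubuntu/.bashrc so the demo touches nothing real.
rcfile=$(mktemp)

# Simulate running the installer twice.
for run in 1 2; do
    append_once "$rcfile" 'export HADOOP_HOME=/home/ubuntu/hadoop'
    append_once "$rcfile" 'export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64/'
    append_once "$rcfile" 'export PATH=$PATH:$HADOOP_HOME/bin'
    append_once "$rcfile" 'export PATH=$PATH:$JAVA_HOME/bin'
done

# Each export appears once even though the loop ran twice.
cat "$rcfile"
```

The single quotes keep `$PATH` and `$HADOOP_HOME` unexpanded, which has the same effect as the `\$` escapes in the original script.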

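The three `7d` / `7i` pairs depend on the config files having a blank line 7 between `<configuration>` and `</configuration>`. The sketch below replays the core-site.xml edit on a temporary file to show the result. The sample file content is an assumption about the stock Hadoop 1.2.1 layout, and the one-line `7i` form with `\n` relies on GNU sed processing escape sequences in the inserted text.

```shell
#!/bin/bash
# Temporary stand-in for /home/ubuntu/hadoop/conf/core-site.xml.
# The content below is assumed to match the stock Hadoop 1.2.1 file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>
EOF

# Delete line 7, the blank line inside <configuration>.
sed -i "7d" "$conf"

# Insert the properties before the new line 7, which is now </configuration>.
sed -i "7i<property>\n<name>fs.default.name</name>\n<value>hdfs://localhost:9000</value>\n</property>\n<property>\n<name>hadoop.tmp.dir</name>\n<value>/home/ubuntu/hadoop/tmp</value>\n</property>" "$conf"

# Show the edited file.
cat "$conf"
```

Because the edit is purely positional, it is worth checking what line 7 holds before rerunning the script on a config file that has already been modified.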