2022. 6. 20. 01:02ㆍBigData/Hadoop
1. Install Linux (CentOS)
$ docker search centos
$ docker pull centos
$ docker run -it --name hadoop centos /bin/bash
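Once you exit this shell the container stops, and running `docker run --name hadoop` a second time fails because the name is already in use. One way to get back in is a small helper like the sketch below. `hadoop_shell` is a hypothetical name, not a Docker command, and it assumes the container is called "hadoop" as above.

```shell
# Hypothetical helper: reopen a shell in the container created above.
# Assumes the container is named "hadoop" unless another name is passed.
hadoop_shell() {
  local name="${1:-hadoop}"
  local running
  running="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)"
  if [ "$running" = "true" ]; then
    # Container is still running: open an additional shell in it.
    docker exec -it "$name" /bin/bash
  else
    # Container has exited: restart it and attach to its original /bin/bash.
    docker start -ai "$name"
  fi
}
```

Run `hadoop_shell` on the host whenever you want to continue where you left off.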
2. Install wget and vim (inside the container)
$ yum install wget -y && yum install vim -y
The following error may occur at this step:
Error: Failed to download metadata for repo 'appstream': Cannot prepare internal mirrorlist: No URLs in mirrorlist
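This error is typical of the `centos` image since CentOS Linux 8 reached end of life and its packages moved from the mirror network to vault.centos.org. A commonly used workaround, sketched below under that assumption, rewrites the repo files to point at the vault. `use_centos_vault` is a hypothetical helper name, and the repo file layout is assumed to match the stock CentOS 8 image.

```shell
# Hypothetical helper: point CentOS 8 repo files at vault.centos.org.
# The directory argument defaults to /etc/yum.repos.d.
use_centos_vault() {
  local dir="${1:-/etc/yum.repos.d}"
  local f
  for f in "$dir"/CentOS-*.repo; do
    [ -f "$f" ] || continue
    # Comment out mirrorlist lines and enable baseurl lines against the vault.
    sed -i.bak \
      -e 's|^mirrorlist=|#mirrorlist=|' \
      -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|' \
      "$f" && rm -f "$f.bak"
  done
}

# Inside the container:
#   use_centos_vault && yum install wget -y && yum install vim -y
```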
3. Install Java
$ yum install java-1.8.0-openjdk-devel.x86_64 -y
$ java -version
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
4. Set the Java environment variables
$ which java
/usr/bin/java
$ readlink -f $(which java)
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64/jre/bin/java
$ vi ~/.bashrc
# java
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_OPTS="-Dfile.encoding=UTF8"
export CLASSPATH="."
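Note that JAVA_HOME is the path printed by readlink with the trailing /jre/bin/java removed. If you would rather not copy the versioned path by hand, the value can be derived from the resolved binary path. The sketch below assumes the OpenJDK 8 layout shown above; `java_home_from_bin` is a hypothetical helper name.

```shell
# Hypothetical helper: turn the resolved java binary path into a JAVA_HOME value.
java_home_from_bin() {
  local bin="$1"
  bin="${bin%/bin/java}"   # drop the trailing /bin/java
  bin="${bin%/jre}"        # drop /jre if the binary lives in the bundled JRE
  printf '%s\n' "$bin"
}

# Usage, assuming java is on PATH:
#   export JAVA_HOME="$(java_home_from_bin "$(readlink -f "$(which java)")")"
```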
5. Install Hadoop (standalone mode)
1) Create a directory
$ mkdir /hadoop_home
$ cd /hadoop_home
2) Download
$ wget https://mirrors.sonic.net/apache/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
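Mirrors usually carry only recent releases, so the URL above may stop serving 3.3.1 at some point (an assumption on my part, not something checked here). Older releases are kept on archive.apache.org, and the sketch below builds that URL from a version number. `hadoop_archive_url` is a hypothetical helper name.

```shell
# Hypothetical helper: build the archive.apache.org URL for a Hadoop release.
hadoop_archive_url() {
  local version="$1"
  printf 'https://archive.apache.org/dist/hadoop/common/hadoop-%s/hadoop-%s.tar.gz\n' \
    "$version" "$version"
}

# Usage: wget "$(hadoop_archive_url 3.3.1)"
```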
3) Extract the archive
$ tar xvzf hadoop-3.3.1.tar.gz
4) Set the environment variables
$ vi ~/.bashrc
# hadoop
export HADOOP_HOME=/hadoop_home/hadoop-3.3.1
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Reload the file so the new variables take effect in the current shell:
$ source ~/.bashrc
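If you would rather script this step than edit the file in vi, one option is to append each line only when it is missing, so rerunning the script does not duplicate entries. This is a minimal sketch; `append_once` is a hypothetical helper name.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
  local file="$1" line="$2"
  touch "$file"
  # -x matches whole lines, -F treats the line as a fixed string, not a regex.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage with the values from this step (single quotes keep $VARS unexpanded):
#   append_once ~/.bashrc 'export HADOOP_HOME=/hadoop_home/hadoop-3.3.1'
#   append_once ~/.bashrc 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'
#   source ~/.bashrc
```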
5) Verify the installation
$ hadoop version
Hadoop 3.3.1
Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2
Compiled by ubuntu on 2021-06-15T05:13Z
Compiled with protoc 3.7.1
From source with checksum 88a4ddb2299aca054416d6b7f81ca55
This command was run using /hadoop_home/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar
6. Run MapReduce
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount $HADOOP_HOME/LICENSE.txt wordcount_output
-> This runs the wordcount example from hadoop-mapreduce-examples-3.3.1.jar, counts the words in LICENSE.txt, and writes the result to the wordcount_output directory.
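To get a feel for what the job computes, here is a rough local equivalent in plain shell. It is only a sketch: it assumes case-sensitive, whitespace-based tokenization as in the stock WordCount example, and `local_wordcount` is a hypothetical name, not part of Hadoop.

```shell
# Rough local equivalent of the wordcount example: split on whitespace,
# count each distinct token, and print "word<TAB>count" sorted by word.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

# Usage: local_wordcount "$HADOOP_HOME/LICENSE.txt" | head
```

On a small file this is handy for sanity-checking the MapReduce output without running a job.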
$ ls -ltr
drwxr-xr-x 2 root root 4096 Jun 21 12:48 wordcount_output
$ cd wordcount_output
$ ls -ltr
-rw-r--r-- 1 root root 9894 Jun 21 12:46 part-r-00000
-rw-r--r-- 1 root root 0 Jun 21 12:46 _SUCCESS
-> The results are in part-r-00000.
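Each line of part-r-00000 is a word and its count separated by a tab, so ordinary shell tools are enough to explore it. The sketch below lists the most frequent words; `top_words` is a hypothetical helper name and it assumes that tab-separated layout.

```shell
# Hypothetical helper: print the N most frequent words from a wordcount output file.
# Assumes each line is "word<TAB>count", as written to part-r-00000.
top_words() {
  local file="$1" n="${2:-10}"
  # Sort numerically on the second (count) column, highest first.
  sort -t "$(printf '\t')" -k2,2nr "$file" | head -n "$n"
}

# Usage: top_words wordcount_output/part-r-00000 5
```

If you rerun the job, remove the wordcount_output directory first, since Hadoop refuses to write into an output directory that already exists.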
7. For more details, see here