2024 World Vocational Skills Competition: Building a Big Data Platform with Hadoop (Container Environment)

2024/11/16 20:05:48 Tags: big data, hadoop, distributed systems, jdk

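Task A: Big Data Platform Setup (Container Environment) (15 points)

Environment notes:

The server login address is given in the server instructions for each task.
Additional note: the host machine can be reached over SSH with the Asbru tool or any SSH client;
The installation packages are in the /opt directory on the host. Install the ones that apply and ignore the rest;
All commands used in the tasks must use absolute paths;
To enter the Master node:
docker exec -it master /bin/bash
To enter the Slave1 node:
docker exec -it slave1 /bin/bash
To enter the Slave2 node:
docker exec -it slave2 /bin/bash
The root password on all three container nodes is 123456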

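Subtask 1: Hadoop Fully Distributed Installation and Configuration

Scoring criteria
Key knowledge and skills                                   Points
JDK extraction and installation                            1
JDK environment variable configuration                     1
Hosts configuration and distribution to the three nodes    1
Hadoop extraction, installation, and initialization        2
Hadoop cluster startup and verification                    2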

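This task must be completed as the root user. Installing Hadoop requires setting up its prerequisite environment first. Commands must use absolute paths. The specific requirements are as follows:

Task 1

1. Copy the files hadoop-3.1.3.tar.gz and jdk-8u212-linux-x64.tar.gz from the host's /opt directory to /opt/software in the Master container (create the path if it does not exist). Extract the JDK package on the Master node into /opt/module (create the path if it does not exist). Copy the JDK extraction command and paste it under the corresponding task number in [Release\任务A提交结果.docx] (the Task A submission document) on the client desktop;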

Connect to the host machine from a terminal
Check whether the containers are running

docker ps -a

Connect to master

docker exec -it master /bin/bash

Check whether the /opt/software path exists
[root@master ~]# ls /opt/
module  software
Copy the packages from the host into the container (run these on the host)
[root@Bigdata ~]# docker cp /opt/jdk-8u212-linux-x64.tar.gz master:/opt/software

Successfully copied 195MB to master:/opt/software

[root@Bigdata ~]# docker cp /opt/hadoop-3.1.3.tar.gz master:/opt/software

Successfully copied 338MB to master:/opt/software

Extract the JDK package on the Master node into /opt/module
tar zxvf /opt/software/jdk-8u212-linux-x64.tar.gz -C /opt/module/
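The task asks for /opt/software and /opt/module to be created if they are missing. In this walkthrough both already existed, so that step never appears. The following is a minimal sketch of the create-then-extract pattern, assuming a temporary directory as a stand-in for the container's /opt and a tiny hypothetical archive in place of the real JDK package, so it can run anywhere.

```shell
#!/bin/bash
# Sketch only: demo_root stands in for the container's filesystem root, and
# jdk-demo.tar.gz is a hypothetical stand-in for jdk-8u212-linux-x64.tar.gz.
demo_root="$(mktemp -d)"
software_dir="$demo_root/opt/software"
module_dir="$demo_root/opt/module"

# mkdir -p creates missing parent directories and succeeds if they already exist.
mkdir -p "$software_dir" "$module_dir"

# Build a small archive with the same top-level directory name as the real JDK.
mkdir -p "$demo_root/src/jdk1.8.0_212/bin"
tar -czf "$software_dir/jdk-demo.tar.gz" -C "$demo_root/src" jdk1.8.0_212

# Same shape as the real command: -C selects the directory to extract into.
tar -zxf "$software_dir/jdk-demo.tar.gz" -C "$module_dir/"
ls "$module_dir"
```

In the container, the equivalent preparation would be a single mkdir -p /opt/software /opt/module before running docker cp and tar.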
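Task 2

2. Edit /etc/profile in the container to set the JDK environment variables and make them take effect. Then run "java -version" and "javac" on the Master node, and paste a screenshot of each command's output under the corresponding task number in [Release\任务A提交结果.docx] on the client desktop;

Rename the JDK directory
mv /opt/module/jdk1.8.0_212 /opt/module/java
Append the environment variables to the end of /etc/profile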
#JAVA_HOME
export JAVA_HOME=/opt/module/java

#PATH
export PATH=$PATH:$JAVA_HOME/bin
Make the environment variables take effect
source /etc/profile
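Editing /etc/profile by hand works, but running a setup step twice can leave duplicate export lines. The sketch below shows one way to append a line only when it is missing. It is an illustration, not part of the original steps: append_once is a hypothetical helper name, and a temporary file stands in for /etc/profile.

```shell
#!/bin/bash
# Sketch only: profile_file is a temporary stand-in for /etc/profile.
profile_file="$(mktemp)"

# append_once (hypothetical helper): add a line only if the exact line
# is not already present in the file.
append_once() {
    local line="$1" file="$2"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Single quotes keep $PATH and $JAVA_HOME literal, as they should be in the file.
append_once 'export JAVA_HOME=/opt/module/java' "$profile_file"
append_once 'export PATH=$PATH:$JAVA_HOME/bin' "$profile_file"
# Repeating a call leaves the file unchanged.
append_once 'export JAVA_HOME=/opt/module/java' "$profile_file"

cat "$profile_file"
```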
Run java -version:
java version "1.8.0_212"
Java(TM) SE Runtime Environment (build 1.8.0_212-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.212-b10, mixed mode)
Run javac:
Usage: javac <options> <source files>
where possible options include:
  -g                         Generate all debugging info
  -g:none                    Generate no debugging info
  -g:{lines,vars,source}     Generate only some debugging info
  -nowarn                    Generate no warnings
  -verbose                   Output messages about what the compiler is doing
  -deprecation               Output source locations where deprecated APIs are used
  -classpath <path>          Specify where to find user class files and annotation processors
  -cp <path>                 Specify where to find user class files and annotation processors
  -sourcepath <path>         Specify where to find input source files
  -bootclasspath <path>      Override location of bootstrap class files
  -extdirs <dirs>            Override location of installed extensions
  -endorseddirs <dirs>       Override location of endorsed standards path
  -proc:{none,only}          Control whether annotation processing and/or compilation is done.
  -processor <class1>[,<class2>,<class3>...] Names of the annotation processors to run; bypasses default discovery process
  -processorpath <path>      Specify where to find annotation processors
  -parameters                Generate metadata for reflection on method parameters
  -d <directory>             Specify where to place generated class files
  -s <directory>             Specify where to place generated source files
  -h <directory>             Specify where to place generated native header files
  -implicit:{none,class}     Specify whether or not to generate class files for implicitly referenced files
  -encoding <encoding>       Specify character encoding used by source files
  -source <release>          Provide source compatibility with specified release
  -target <release>          Generate class files for specific VM version
  -profile <profile>         Check that API used is available in the specified profile
  -version                   Version information
  -help                      Print a synopsis of standard options
  -Akey[=value]              Options to pass to annotation processors
  -X                         Print a synopsis of nonstandard options
  -J<flag>                   Pass <flag> directly to the runtime system
  -Werror                    Terminate compilation if warnings occur
  @<filename>                Read options and filenames from file
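Task 3

3. Complete the host configuration: name the three nodes master, slave1, and slave2, and set up passwordless SSH login. Use scp with absolute paths to copy the extracted JDK from Master to slave1 and slave2 (create the path if it does not exist), and configure the environment variables on slave1 and slave2. Copy all of the scp commands used to copy the JDK and paste them under the corresponding task number in [Release\任务A提交结果.docx] on the client desktop;

Enter each of the three containers and check its IP address

Run ifconfig: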

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.100.102  netmask 255.255.255.0  broadcast 192.168.100.255
        ether 00:50:56:80:3e:d7  txqueuelen 0  (Ethernet)
        RX packets 6934  bytes 575269 (561.7 KiB)
        RX errors 0  dropped 334  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
Edit the /etc/hosts file: vi /etc/hosts

Append the following at the end:

192.168.100.102 master
192.168.100.103 slave1
192.168.100.104 slave2
Configure passwordless login

Run ssh-keygen and press Enter at every prompt:

Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:a5KqXjGa6r1CO1pe9cG9bR3Pp2om6BstpOB9l6SW24E root@master
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|                 |
|       . .       |
|    + . S o   .  |
| . + * = O + . + |
|. = + = E.* o . +|
| B.o . =.*.oo  ..|
|=o*+o  .+..+...  |
+----[SHA256]-----+
Copy the public key to each node
ssh-copy-id master

/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'master (192.168.100.102)' can't be established.
RSA key fingerprint is SHA256:nuf/qVhd2k6k0u5t7GvhylqRi+4xMC3MNGmJKJNXipo.
RSA key fingerprint is MD5:e6:fd:96:10:3f:2c:9a:68:40:cc:d7:7c:e2:ee:6e:67.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@master's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'master'"
and check to make sure that only the key(s) you wanted were added.

ssh-copy-id slave1

/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'slave1 (192.168.100.103)' can't be established.
RSA key fingerprint is SHA256:nuf/qVhd2k6k0u5t7GvhylqRi+4xMC3MNGmJKJNXipo.
RSA key fingerprint is MD5:e6:fd:96:10:3f:2c:9a:68:40:cc:d7:7c:e2:ee:6e:67.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave1's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'slave1'"
and check to make sure that only the key(s) you wanted were added.

ssh-copy-id slave2

/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'slave2 (192.168.100.104)' can't be established.
RSA key fingerprint is SHA256:nuf/qVhd2k6k0u5t7GvhylqRi+4xMC3MNGmJKJNXipo.
RSA key fingerprint is MD5:e6:fd:96:10:3f:2c:9a:68:40:cc:d7:7c:e2:ee:6e:67.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave2's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'slave2'"
and check to make sure that only the key(s) you wanted were added.

Copy the hosts file to the slave nodes
scp /etc/hosts slave1:/etc/

hosts 100% 219 218.0KB/s 00:00

scp /etc/hosts slave2:/etc/

hosts 100% 219 218.0KB/s 00:00

Set up passwordless login on both slave nodes as well (run these on slave1 and on slave2)
ssh-keygen
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
Copy the JDK to slave1 and slave2
scp -rq /opt/module/java slave1:/opt/module
scp -rq /opt/module/java slave2:/opt/module
Copy the environment variables
scp /etc/profile slave1:/etc/profile
scp /etc/profile slave2:/etc/profile
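Because the same copy is repeated for every slave, and the task asks for all of the scp commands to be pasted into the submission document, a loop that generates the command list can help. This is a dry-run sketch: plan_jdk_copy is a hypothetical helper that only prints commands, and the ssh ... mkdir -p line is an added assumption covering the "create the path if it does not exist" requirement.

```shell
#!/bin/bash
# Sketch only: prints the commands instead of running them, so it works
# without a cluster. The paths are the ones used in this walkthrough.
jdk_dir=/opt/module/java
target_dir=/opt/module

# plan_jdk_copy (hypothetical helper): emit the commands for each node name.
plan_jdk_copy() {
    local node
    for node in "$@"; do
        # Create the target directory on the remote node if it is missing.
        echo "ssh ${node} mkdir -p ${target_dir}"
        # Recursive (-r), quiet (-q) copy with absolute paths on both sides.
        echo "scp -rq ${jdk_dir} ${node}:${target_dir}"
    done
}

plan_jdk_copy slave1 slave2
```

To run the commands on master rather than list them, the output could be piped to a shell after reviewing it.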
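Task 4

4. On Master, extract Hadoop into /opt/module (create the path if it does not exist) and distribute the extracted directory to slave1 and slave2. The master, slave1, and slave2 nodes all act as DataNodes. Configure the environment, then initialize (format) the Hadoop NameNode. Paste the initialization command and a screenshot of the result (the last 20 lines of the log are enough) under the corresponding task number in [Release\任务A提交结果.docx] on the client desktop;

Extract Hadoop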
[root@master ~]# tar zxvf /opt/software/hadoop-3.1.3.tar.gz -C /opt/module/
Rename the directory
[root@master ~]# mv /opt/module/hadoop-3.1.3 /opt/module/hadoop
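Configure the environment: vi /etc/profile
Add the following lines (this PATH line already includes $JAVA_HOME/bin, so it can replace the PATH line added in Task 2)
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop

#PATH
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Make the environment take effect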
source /etc/profile
Configure the Hadoop files
core-site.xml
vi /opt/module/hadoop/etc/hadoop/core-site.xml
Add the following properties inside the <configuration> element:
    <!-- NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>

    <!-- Hadoop data storage directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop/data</value>
    </property>
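The properties above are fragments and only take effect inside the file's <configuration> root element. As an illustration, the complete file could be written in one step with a heredoc. This sketch writes to a temporary file, not the real core-site.xml, and adds a simple tag-balance check that is not part of the original steps.

```shell
#!/bin/bash
# Sketch only: conf_file is a temporary stand-in for
# /opt/module/hadoop/etc/hadoop/core-site.xml.
conf_file="$(mktemp)"

# The quoted delimiter ('EOF') stops the shell from expanding anything inside.
cat > "$conf_file" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <!-- NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>
    <!-- Hadoop data storage directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop/data</value>
    </property>
</configuration>
EOF

# Quick sanity check: opening and closing <property> tags should balance.
open_count="$(grep -c '<property>' "$conf_file")"
close_count="$(grep -c '</property>' "$conf_file")"
echo "property tags: ${open_count} open, ${close_count} close"
```

The same pattern applies to hdfs-site.xml, yarn-site.xml, and mapred-site.xml with their respective properties.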

hdfs-site.xml
vi /opt/module/hadoop/etc/hadoop/hdfs-site.xml
    <!-- NameNode web UI address -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>master:9870</value>
    </property>
    <!-- SecondaryNameNode web UI address -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>slave2:9868</value>
    </property>

yarn-site.xml
vi /opt/module/hadoop/etc/hadoop/yarn-site.xml
    <!-- Use the MapReduce shuffle auxiliary service -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- ResourceManager address -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>

    <!-- Environment variables inherited by containers -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>

mapred-site.xml
vi /opt/module/hadoop/etc/hadoop/mapred-site.xml
    <!-- Run MapReduce jobs on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

hadoop-env.sh
vi /opt/module/hadoop/etc/hadoop/hadoop-env.sh

Append the following lines to the end of this file


export JAVA_HOME=/opt/module/java
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
workers
vi /opt/module/hadoop/etc/hadoop/workers

Delete the localhost entry and add the following lines instead

master
slave1
slave2
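Every name in the workers file has to resolve on the node that starts the cluster, which here means it needs an entry in /etc/hosts. The sketch below shows one possible cross-check between the two files. It is an illustrative addition: temporary files filled with this walkthrough's values stand in for the real /etc/hosts and workers files.

```shell
#!/bin/bash
# Sketch only: temporary files stand in for /etc/hosts and
# /opt/module/hadoop/etc/hadoop/workers.
hosts_file="$(mktemp)"
workers_file="$(mktemp)"

cat > "$hosts_file" <<'EOF'
192.168.100.102 master
192.168.100.103 slave1
192.168.100.104 slave2
EOF
printf '%s\n' master slave1 slave2 > "$workers_file"

missing=0
while IFS= read -r worker; do
    # Skip empty lines.
    [ -z "$worker" ] && continue
    # Look for the worker name as a whole field after the IP address column.
    if awk -v h="$worker" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                           END { exit !found }' "$hosts_file"; then
        echo "ok: $worker"
    else
        echo "missing from hosts: $worker"
        missing=1
    fi
done < "$workers_file"
```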
Distribute to the cluster
Distribute the environment variables
[root@master ~]# scp /etc/profile slave1:/etc/
[root@master ~]# scp /etc/profile slave2:/etc/
Distribute Hadoop
[root@master ~]# scp -rq /opt/module/hadoop slave1:/opt/module/
[root@master ~]# scp -rq /opt/module/hadoop slave2:/opt/module/
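Initialize (format) the Hadoop NameNode. With the PATH set above, hdfs resolves to /opt/module/hadoop/bin/hdfs. Use that full path if the submission requires absolute paths.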
[root@master ~]# hdfs namenode -format
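The task only asks for the last 20 lines of the format log. One way to capture them is to pipe the command through tail. In this sketch, last_lines is a hypothetical helper and seq stands in for the real command so the example runs without Hadoop. Merging stderr into stdout is a precaution, on the assumption that the log is written to stderr.

```shell
#!/bin/bash
# last_lines (hypothetical helper): run the given command, merge stderr into
# stdout, and keep only the final 20 lines of output.
last_lines() {
    "$@" 2>&1 | tail -n 20
}

# Stand-in for the real call, which on master would be:
#   last_lines /opt/module/hadoop/bin/hdfs namenode -format
last_lines seq 1 100
```

If the NameNode has already been formatted once, the format command asks for confirmation before overwriting, and a pipe would hide that prompt. The sketch therefore suits a first-time format only.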

[Screenshot: NameNode format result]

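Task 5

5. Start the Hadoop cluster (both HDFS and YARN), use the jps command to view the Java processes on the Master and slave1 nodes, and paste a screenshot of the jps command and its output under the corresponding task number in [Release\任务A提交结果.docx] on the client desktop.

Start the Hadoop cluster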
[root@master ~]# start-all.sh 

or, with the absolute path:

[root@master ~]#  /opt/module/hadoop/sbin/start-all.sh 
Use jps to view the Java processes on the Master and slave1 nodes
[root@master ~]# jps

[Screenshot: jps output on master]

[root@slave1 ~]# jps
12161 Jps
11112 NodeManager
10846 DataNode
[root@slave2 ~]# jps
10451 DataNode
10821 NodeManager
10582 SecondaryNameNode
11944 Jps
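With the configuration used here, master should be running NameNode, DataNode, ResourceManager, and NodeManager. slave1 should run DataNode and NodeManager, and slave2 should additionally run SecondaryNameNode. A small check like the one below can confirm that from jps output. check_daemons is a hypothetical helper, and the sample input is the slave1 output shown above.

```shell
#!/bin/bash
# check_daemons (hypothetical helper): reads jps output on stdin and reports
# whether each expected daemon name (passed as arguments) appears in it.
check_daemons() {
    local output status=0 name
    output="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <name>", so compare against the second field.
        if printf '%s\n' "$output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "running: $name"
        else
            echo "MISSING: $name"
            status=1
        fi
    done
    return "$status"
}

# slave1 output copied from this walkthrough.
slave1_jps='12161 Jps
11112 NodeManager
10846 DataNode'

printf '%s\n' "$slave1_jps" | check_daemons DataNode NodeManager
```

On a live node, the input would come from the command itself, for example jps | check_daemons DataNode NodeManager.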

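Check that the web UIs load correctly (the NameNode UI is at master:9870, as set in hdfs-site.xml)
[Screenshots: web UI test and DataNode list]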

