Hadoop: Windows Installation
Prerequisite
- Java JDK 1.8.0
To confirm that the Java Development Kit is installed on your machine, run the command javac -version.
Hadoop supports only Java versions 8 and 11.
Setup Hadoop
Download Hadoop
Download the Hadoop 3.3.0 binary release (hadoop-3.3.0.tar.gz) and extract it to the C:\ folder with WinRAR (Windows) or Keka (Mac).
Setup Environment Variables
Open the System Properties window from Control Panel and select the Environment Variables button.
- User Variables
Variable | Value |
---|---|
HADOOP_HOME | C:\hadoop-3.3.0 |
- System Variables
Variable | Value |
---|---|
PATH | C:\hadoop-3.3.0\bin |
Configuration Modification
Edit each of the following files, paste in the XML snippet shown below its name, and save it.
- hadoop-3.3.0/etc/hadoop/core-site.xml
<configuration>
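As a minimal sketch of what core-site.xml might contain for a single-node setup (not necessarily the exact file used here), the only setting usually needed is the default file system. The hdfs://localhost:9000 address is an assumption; any free port works as long as clients use the same URI.

```xml
<configuration>
  <!-- Default file system URI; host and port are assumed values -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```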
- hadoop-3.3.0/etc/hadoop/mapred-site.xml
<configuration>
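A minimal sketch of mapred-site.xml, assuming MapReduce jobs should run on YARN (which matches the YARN web UI on port 8088 used later in this guide). Treat it as an illustration rather than the exact file used here.

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local job runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```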
- hadoop-3.3.0/etc/hadoop/hdfs-site.xml
<configuration>
Remember to replace /c:/software/hadoop-3.3.0 with your Hadoop root directory.
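A minimal sketch of hdfs-site.xml for a single-node setup, built around the /c:/software/hadoop-3.3.0 path mentioned above. The data/datanode folder matches the one referred to in the troubleshooting note later on; the data/namenode folder and the replication factor of 1 are assumptions.

```xml
<configuration>
  <!-- One replica is enough on a single machine (assumed value) -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- NameNode metadata directory (assumed folder name) -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/c:/software/hadoop-3.3.0/data/namenode</value>
  </property>
  <!-- DataNode block storage directory -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/c:/software/hadoop-3.3.0/data/datanode</value>
  </property>
</configuration>
```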
- hadoop-3.3.0/etc/hadoop/yarn-site.xml
<configuration>
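A minimal sketch of yarn-site.xml, assuming the only requirement is to enable the shuffle service that MapReduce needs on each NodeManager. The exact file used here may set additional properties.

```xml
<configuration>
  <!-- Auxiliary shuffle service required by MapReduce on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```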
- hadoop-3.3.0/etc/hadoop/hadoop-env.cmd
set JAVA_HOME=C:\Java
Remember to replace C:\Java with your Java root directory.
Update bin folder
Delete the bin folder at C:\hadoop-3.3.0\bin and replace it with the bin folder from one of the downloads below.
Download bin from Github
Download bin from Drive
Run Hadoop
- Enter hdfs namenode -format in the bin directory. You should see the following result:
- Enter start-all.cmd in the sbin directory; multiple cmd windows will be created. Then enter jps, and you should see the following results:
DataNode creation sometimes fails. If it does, delete the data/datanode folder in the Hadoop root directory and run start-all.cmd again.
- Make sure all 4 cmd windows are running.
- Enter localhost:9870 in the browser (Hadoop 3.x moved the default NameNode web UI port from 50070 to 9870) and you should see the following webpage:
- Enter localhost:8088 in the browser and you should see the following webpage:
- Enter stop-all.cmd in the sbin directory to stop all Hadoop daemons.
WordCount & MapReduce
If you want to experiment with Hadoop's MapReduce, the usual next step is the WordCount.java example.