Prerequisite

Java JDK 1.8.0

You can confirm that the Java Development Kit is installed on your machine by running the command javac -version.

Hadoop supports only Java versions 8 and 11.
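The version requirement above can also be checked from Java itself. The following is a minimal sketch, not part of the original setup: the class name JavaVersionCheck is made up for illustration, and it relies on the standard java.specification.version system property, which is "1.8" on Java 8 and "11" on Java 11.

```java
// Sketch: report whether the running JVM is one of the versions this guide targets.
// The class name is hypothetical; only standard JDK APIs are used.
class JavaVersionCheck {
    // "java.specification.version" is "1.8" on Java 8 and "11" on Java 11.
    static boolean isSupported(String specVersion) {
        return "1.8".equals(specVersion) || "11".equals(specVersion);
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("Java specification version: " + spec);
        System.out.println(isSupported(spec)
                ? "OK: this is Java 8 or 11."
                : "Warning: this guide assumes Java 8 or 11.");
    }
}
```

Compile it with javac JavaVersionCheck.java and run it with java JavaVersionCheck. If javac itself is not found, the JDK is not installed or not on PATH.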

Setup Hadoop

Download Hadoop

Download Hadoop 3.3.0 (binary download). Once hadoop-3.3.0.tar.gz has been downloaded, extract it with WinRAR (Windows) or Keka (Mac) into the C:\ folder, so that the Hadoop root directory is C:\hadoop-3.3.0.

Download Hadoop from Apache

Setup Environment Variables

Open the System Properties window from Control Panel and select the Environment Variables button.

User Variables

Variable	Value
HADOOP_HOME	C:\hadoop-3.3.0

System Variables

Variable	Value
PATH	C:\hadoop-3.3.0\bin
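Environment variable changes only reach cmd windows opened after the change, so it is worth checking them from a fresh window. The sketch below is one possible check, not part of the original guide: the class name PathCheck is hypothetical, and it assumes that hdfs.cmd (the script behind the hdfs command used later) sits in the Hadoop bin folder.

```java
import java.io.File;

// Sketch: print HADOOP_HOME and look for hdfs.cmd in the PATH directories.
// The class name is hypothetical; only standard JDK APIs are used.
class PathCheck {
    // Returns the first file named fileName found in a PATH entry, or null if none exists.
    static File findOnPath(String fileName, String pathValue) {
        if (pathValue == null) {
            return null;
        }
        for (String entry : pathValue.split(File.pathSeparator)) {
            String dir = entry.trim();
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("HADOOP_HOME = " + System.getenv("HADOOP_HOME"));
        File hdfs = findOnPath("hdfs.cmd", System.getenv("PATH"));
        System.out.println(hdfs == null
                ? "hdfs.cmd was not found on PATH"
                : "hdfs.cmd found at " + hdfs.getAbsolutePath());
    }
}
```

If hdfs.cmd is not found, reopen the cmd window and check the PATH entry for typos.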

Configuration Modification

Edit each of the files below: paste in the XML block shown under the file name, then save the file.

hadoop-3.3.0/etc/hadoop/core-site.xml

<configuration>
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value>
    </property>
</configuration>

hadoop-3.3.0/etc/hadoop/mapred-site.xml

<configuration>
   <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
   </property>
</configuration>

hadoop-3.3.0/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>/c:/software/hadoop-3.3.0/data/namenode</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>/c:/software/hadoop-3.3.0/data/datanode</value>
   </property>
</configuration>

Remember to replace /c:/software/hadoop-3.3.0 with your Hadoop root directory.

hadoop-3.3.0/etc/hadoop/yarn-site.xml

<configuration>
   <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
</configuration>
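A stray character pasted into one of these XML files usually only shows up later as a startup error. As a rough check, the sketch below parses a configuration file with the JDK's built-in XML parser and prints each name/value pair. It is not a Hadoop tool: the class name SiteXmlCheck is invented for illustration, and it only assumes the configuration/property/name/value layout shown above.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: list the <property> entries of a Hadoop *-site.xml file.
// Parsing fails with an exception if the file is not well-formed XML.
class SiteXmlCheck {
    static Map<String, String> readProperties(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            props.put(childText(property, "name"), childText(property, "value"));
        }
        return props;
    }

    // Text of the first child element with the given tag, or "" if it is absent.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Each argument is the path of one configuration file to check.
        for (String file : args) {
            System.out.println(file);
            try (InputStream in = new FileInputStream(file)) {
                for (Map.Entry<String, String> e : readProperties(in).entrySet()) {
                    System.out.println("  " + e.getKey() + " = " + e.getValue());
                }
            }
        }
    }
}
```

Pass the paths of the edited files as arguments, for example the four *-site.xml files under etc/hadoop. A file that is not well-formed makes the program stop with a parser exception that names the offending line.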

hadoop-3.3.0/etc/hadoop/hadoop-env.cmd

set JAVA_HOME=C:\Java

Remember to replace C:\Java with your Java root directory.

Update bin folder

Delete the bin folder at C:\hadoop-3.3.0\bin and replace it with the bin folder from the download below.

Download bin from GitHub
Download bin from Drive

Run Hadoop

Run hdfs namenode -format in the bin directory. The output should include a message saying that the storage directory has been successfully formatted.

Run start-all.cmd in the sbin directory; several cmd windows will open. Then run jps, which should list the NameNode, DataNode, ResourceManager, and NodeManager processes.
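The jps check can also be scripted. The sketch below runs jps and reports which of the four daemons are missing. It is an illustration only: the class name DaemonCheck is made up, and it assumes that jps is on PATH and prints one "pid name" pair per line.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: run jps and report which Hadoop daemons are not running.
class DaemonCheck {
    static final List<String> EXPECTED =
            Arrays.asList("NameNode", "DataNode", "ResourceManager", "NodeManager");

    // Returns the expected daemon names that do not appear in the jps output.
    static List<String> missingDaemons(String jpsOutput) {
        List<String> missing = new ArrayList<>(EXPECTED);
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                missing.remove(parts[1]); // parts[0] is the pid, parts[1] the process name
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        Process jps = new ProcessBuilder("jps").redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(jps.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        jps.waitFor();
        List<String> missing = missingDaemons(output.toString());
        System.out.println(missing.isEmpty()
                ? "All four daemons are running."
                : "Not running: " + missing);
    }
}
```

The daemons can take a few seconds to come up, so a daemon reported as missing right after start-all.cmd may simply need another moment.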

The DataNode sometimes fails to start. If that happens, delete the data/datanode folder in the Hadoop root directory and run start-all.cmd again.
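Deleting the folder by hand in Explorer works fine. For anyone who prefers to script the cleanup, here is a minimal sketch. The class name DataNodeReset is hypothetical, the Hadoop root directory is passed as the first argument, and the program assumes Hadoop is stopped while it runs. It permanently removes everything under data/datanode, so double-check the path first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: remove the DataNode storage directory so it is recreated on the next start.
class DataNodeReset {
    // Deletes dir and everything under it; does nothing if dir does not exist.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: the Hadoop root directory (the one that contains bin and sbin).
        Path dataNodeDir = Paths.get(args[0], "data", "datanode");
        deleteRecursively(dataNodeDir);
        System.out.println("Removed " + dataNodeDir);
    }
}
```

After it finishes, run start-all.cmd again as described above.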

Make sure all 4 cmd windows are still running.

Open localhost:9870 in a browser; you should see the NameNode web page. In Hadoop 3.x the NameNode web UI listens on port 9870 (it was 50070 in Hadoop 2.x).
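If a web page in this section does not load, the daemon behind it may not be listening yet. The sketch below checks whether anything accepts TCP connections on the given local ports. It is only an illustration: the class name PortCheck is hypothetical, and the ports are taken from the command line rather than hard-coded.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: test whether something is accepting TCP connections on a port.
class PortCheck {
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing reachable on this port.
            return false;
        }
    }

    public static void main(String[] args) {
        // Each argument is one port number to probe on localhost.
        for (String arg : args) {
            int port = Integer.parseInt(arg);
            System.out.println("localhost:" + port + " "
                    + (isListening("localhost", port, 2000)
                            ? "is accepting connections"
                            : "is not reachable"));
        }
    }
}
```

Run it with the web UI port numbers as arguments. A port that is not reachable usually means the corresponding cmd window has closed or is still starting up.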

Open localhost:8088 in a browser; you should see the YARN ResourceManager web page.

When you are done, run stop-all.cmd in the sbin directory to stop Hadoop.