
Hadoop - Configuring and Starting Hadoop on Mac OS X, and Fixing Common Errors

0. Install the JDK

Follow any of the online tutorials to install the JDK on OS X.

1. Download and Install Hadoop

a) Download address:

http://hadoop.apache.org

b) Configure the SSH environment

In the terminal, run: ssh localhost

If an error message appears, the current user does not have permission to log in over SSH. This is usually just the system default, set for security reasons.

Change the setting as follows: open System Preferences --> Sharing --> check Remote Login, and set "Allow access for" to "All users".

Run "ssh localhost" again; after entering your password and confirming, you should see the SSH login succeed.
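If you prefer staying in the terminal, Remote Login can also be checked and enabled with the macOS systemsetup tool (requires sudo); a quick sketch:

sudo systemsetup -getremotelogin

sudo systemsetup -setremotelogin on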

c) Configure passwordless SSH login

On the command line, run:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa  

ssh-keygen generates the key pair; -t sets the key type; -P supplies the passphrase; -f specifies the file the key is written to.

This command creates two files under ~/.ssh/, id_dsa and id_dsa.pub, which are an SSH private/public key pair.

接下來,将公鑰追加到授權的key中去,輸入:

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

********************************************************************************

Passwordless login to localhost (an alternative method, using an RSA key)

1. ssh-keygen -t rsa   (just press Enter at each prompt)

2. cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3. chmod og-wx ~/.ssh/authorized_keys

Test:

ssh localhost

If you are still prompted for a password, edit

vim ~/.ssh/config

and add the following:

Host localhost
   AddKeysToAgent yes
   UseKeychain yes
   IdentityFile ~/.ssh/id_rsa      

Now ssh localhost should log in without prompting for a password.

d) Set environment variables

Before actually starting Hadoop, a few configuration files need to be edited (steps e) through h) below).

But first, we need to add the following entries to ~/.bash_profile.

On the command line, run:

open ~/.bash_profile;

# hadoop

export HADOOP_HOME=/Users/YourUserName/Documents/Dev/hadoop-2.7.3

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
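After saving, reload the profile so the new variables take effect in the current shell, and verify the setup (assuming the path above matches where you actually unpacked Hadoop):

source ~/.bash_profile

echo $HADOOP_HOME

hadoop version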

e) Configure hadoop-env.sh

In the ${HADOOP_HOME}/etc/hadoop directory, open hadoop-env.sh and confirm that the following settings are in place:

export JAVA_HOME=${JAVA_HOME}

export HADOOP_HEAPSIZE=2000   (uncomment this line)

export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"   (uncomment this line)
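On OS X, JAVA_HOME is often not exported in the shell by default. One common way to set it in hadoop-env.sh (assuming a JDK registered with the system) is to resolve it with the built-in java_home helper:

export JAVA_HOME=$(/usr/libexec/java_home)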

f) Configure core-site.xml, which specifies the NameNode host name and port

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->  
  
<configuration>  
    <property>  
        <name>hadoop.tmp.dir</name>  
        <value>/Users/YourUserName/Documents/Dev/hadoop-2.7.3/hadoop-${user.name}</value>  
        <description>A base for other temporary directories.</description>  
    </property>  
    <property>  
        <name>fs.default.name</name>  
        <value>hdfs://localhost:8020</value>  
    </property>  
</configuration>         
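Note that fs.default.name is the legacy (Hadoop 1.x) property name; it still works in 2.7.3 but is deprecated in favor of fs.defaultFS. An equivalent, more modern form of the same setting would be:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
    </property>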

g) Configure hdfs-site.xml, which sets the HDFS replication factor; since Hadoop runs on only one node here, the replication factor is 1

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>  
    <property>  
        <name>dfs.replication</name>  
        <value>1</value>  
    </property>  
</configuration>         

h) Configure mapred-site.xml, which specifies the JobTracker host name and port

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>  
    <property>  
        <name>mapred.job.tracker</name>  
        <value>hdfs://localhost:9001</value>  
    </property>  
    <property>  
        <name>mapred.tasktracker.map.tasks.maximum</name>  
        <value>2</value>  
    </property>  
    <property>  
        <name>mapred.tasktracker.reduce.tasks.maximum</name>  
        <value>2</value>  
    </property>  
</configuration>        
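Two notes on this file: in a stock Hadoop 2.7.3 download it usually exists only as mapred-site.xml.template and has to be copied to mapred-site.xml first, and the properties above use the Hadoop 1.x (JobTracker/TaskTracker) names. Since we start YARN below, a sketch of the setting more commonly used on Hadoop 2.x would be:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>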

i) Format HDFS

With the configuration above in place, HDFS can now be formatted:

cd $HADOOP_HOME/bin

hadoop namenode -format 
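Note: on Hadoop 2.x the hadoop namenode form still works but is reported as deprecated; the current equivalent is:

hdfs namenode -format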

If the command completes without errors, HDFS has been formatted successfully.


j) Start Hadoop

cd ${HADOOP_HOME}/sbin

start-dfs.sh

start-yarn.sh
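The matching scripts to shut everything down later live in the same directory:

stop-yarn.sh

stop-dfs.sh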

k) Verify Hadoop

If no errors occurred during startup, then once everything is up, run on the command line: jps

The output should look something like this:

3761 DataNode

4100 Jps

3878 SecondaryNameNode

3673 NameNode

4074 NodeManager

3323 ResourceManager

If all of these processes are listed, congratulations: you have successfully installed and started Hadoop!
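As an additional sanity check, you can talk to HDFS directly (a minimal sketch; the directory name is just an example):

hdfs dfs -mkdir -p /user/$(whoami)

hdfs dfs -ls /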

Finally, we can verify over HTTP. Open the following URLs in a browser:

http://localhost:8088/ (the YARN ResourceManager web UI)

http://localhost:50070/ (the HDFS NameNode web UI)

2. Fixing Common Errors

The Hadoop NameNode does not start:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-javoft/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

The cause lies in core-site.xml: you must override hadoop.tmp.dir so that it points to a hadoop directory of your own, for example:

...

<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/javoft/Documents/hadoop/hadoop-${user.name}</value>
</property>
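After changing hadoop.tmp.dir, re-run the format step so the new storage directory gets initialized (note that this wipes any existing HDFS metadata):

hdfs namenode -format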
