
(Repost) YARN cluster deployment: a summary of problems encountered


Versions: Hadoop 2.3.0, Hive 0.11.0

1. Application Master inaccessible

    Clicking the application master link returns an HTTP 500 error (java.net.ConnectException). The cause is that the web UI was configured with 0.0.0.0 as the IP address for port 50030, so the application master link cannot resolve the host.

Fix: in yarn-site.xml:

    <property>
        <description>The address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>xxxxxxxxxx:50030</value>
    </property>

    This is a bug (1811) in 2.3.0; it has been fixed in 2.4.0.

2. History UI inaccessible and containers cannot be opened

    Clicking Tracking URL: History fails because the history server is not running.

Fix: configure the following (choosing xxxxxxxxxx as the history server):

    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>xxxxxxxxxx:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>xxxxxxxxxx:19888</value>
    </property>

Then start the history server:

    sbin/mr-jobhistory-daemon.sh start historyserver


3. YARN platform tuning

Set the number of virtual CPU cores:

    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>23</value>
    </property>

Set the memory available to the NodeManager:

    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>61440</value>
        <description>the amount of memory on the NodeManager in MB</description>
    </property>

Set the maximum memory a single task may request:

    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>49152</value>
        <source>yarn-default.xml</source>
    </property>

4. Running a job reports: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected

Fix: update the pom and run mvn install again:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.3.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.3.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.mrunit</groupId>
        <artifactId>mrunit</artifactId>
        <version>1.0.0</version>
        <classifier>hadoop2</classifier>
        <scope>test</scope>
    </dependency>

Also switch the JDK to 1.7.

5. Job fails with a shuffle OutOfMemoryError: Java heap space

2014-05-14 16:44:22,010 FATAL [IPC Server handler 4 on 44508] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1400048775904_0006_r_000004_0 - exited : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
    at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:297)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:287)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:411)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)

Source: <:19888/jobhistory/logs/ST-L09-05-back-tj-yarn15:8034/container_1400048775904_0006_01_000001/job_1400048775904_0006/hadoop/syslog/?start=0>

Fix: lower mapreduce.reduce.shuffle.memory.limit.percent. The default is 0.25; I lowered it to 0.10.
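The change above can be applied in mapred-site.xml; a minimal sketch (the 0.10 value is the one chosen above):

```xml
<!-- Maximum fraction of the in-memory shuffle buffer that a single map output may use; lowered from the 0.25 default -->
<property>
    <name>mapreduce.reduce.shuffle.memory.limit.percent</name>
    <value>0.10</value>
</property>
```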

Reference:

http://www.sqlparty.com/yarn%E5%9C%A8shuffle%E9%98%B6%E6%AE%B5%E5%86%85%E5%AD%98%E4%B8%8D%E8%B6%B3%E9%97%AE%E9%A2%98error-in-shuffle-in-fetcher/

6. Found in a reduce task log:

2014-05-14 17:51:21,835 WARN [Readahead Thread #2] org.apache.hadoop.io.ReadaheadPool: Failed readahead on ifile
EINVAL: Invalid argument
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:263)
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:142)
    at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Source: <:8042/node/containerlogs/container_1400060792764_0001_01_000726/hadoop/syslog/?start=-4096>

PS: the error has not recurred; no fix yet.

7. Hive jobs report:

java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();

Reference: https://issues.apache.org/jira/browse/HIVE-4222

8. Hive runs out of memory when it automatically converts a join into a mapjoin. Fix: disable the automatic conversion. In versions before 0.11 the default is false; in later versions it is true. Add to the job script:

set hive.auto.convert.join=false;

or set it to false in hive-site.xml.
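The hive-site.xml form of the same switch, as a minimal sketch:

```xml
<!-- Stop hive from automatically converting common joins into mapjoins -->
<property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
</property>
```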

Error log:

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2014-05-15 02:40:58     Starting to launch local task to process map join;  maximum memory = 1011351552
2014-05-15 02:41:00     Processing rows:   200000  Hashtable size: 199999   Memory usage: 110092544  rate: 0.109
2014-05-15 02:41:01     Processing rows:   300000  Hashtable size: 299999   Memory usage: 229345424  rate: 0.227
                                           400000  Hashtable size: 399999   Memory usage: 170296368  rate: 0.168
                                           500000  Hashtable size: 499999   Memory usage: 285961568  rate: 0.283
2014-05-15 02:41:02     Processing rows:   600000  Hashtable size: 599999   Memory usage: 408727616  rate: 0.404
                                           700000  Hashtable size: 699999   Memory usage: 333867920  rate: 0.33
                                           800000  Hashtable size: 799999   Memory usage: 459541208  rate: 0.454
2014-05-15 02:41:03     Processing rows:   900000  Hashtable size: 899999   Memory usage: 391524456  rate: 0.387
                                           1000000 Hashtable size: 999999   Memory usage: 514140152  rate: 0.508
                                           1029052 Hashtable size: 1029052  Memory usage: 546126888  rate: 0.54
2014-05-15 02:41:03     Dump the hashtable into file: file:/tmp/hadoop/hive_2014-05-15_14-40-53_413_3806680380261480764/-local-10002/HashTable-Stage-4/MapJoin-mapfile01--.hashtable
2014-05-15 02:41:06     Upload 1 File to:
File size: 68300588
2014-05-15 02:41:06     End of local task; Time Taken: 8.301 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 2 out of 2

Task error log:

2014-05-15 13:52:54,007 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3465)
    at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at java.util.HashMap.readObject(HashMap.java:1183)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.initilizePersistentHash(HashMapWrapper.java:128)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:194)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1377)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)

Source: <:19888/jobhistory/logs/ST-L09-10-back-tj-yarn21:8034/container_1400064445468_0013_01_000002/attempt_1400064445468_0013_m_000000_0/hadoop/syslog/?start=0>

9. Hive reports: failed to evaluate: <unbound>=Class.new(); — upgrade to 0.13.0.

Reference log:

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
OK
Time taken: 2.28 seconds
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...

The upgrade should fix this, but for some reason after I upgraded to 0.12.0 and 0.13.0 every run failed with FileNotFound HIVE_PLANxxxxxxxxx (see problem 11). It is probably a misconfiguration on my side; no fix yet.

10. When creating a table or database, hive reports: Couldnt obtain a new sequence (unique id) : You have an error in your SQL syntax

Fix: the hive metastore database was named yarn-hive. An unquoted hyphen is not valid in a SQL identifier, so the generated SQL is invalid. Renaming the database without the hyphen resolved the problem.

Error log:

FAILED: Error in metadata: MetaException(message:javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘-hive.`SEQUENCE_TABLE` WHERE `SEQUENCE_NAME`=‘org.apache.hadoop.hive.metastore.m‘ at line 1
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
    at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
    at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
    at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:643)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
    at com.sun.proxy.$Proxy14.createTable(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1070)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1103)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
    at com.sun.proxy.$Proxy15.create_table_with_environment_context(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:466)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:455)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
    at com.sun.proxy.$Proxy16.createTable(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:597)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3777)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:256)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1362)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1146)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
    at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:338)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at shark.SharkCliDriver$.main(SharkCliDriver.scala:235)
    at shark.SharkCliDriver.main(SharkCliDriver.scala)
NestedThrowablesStackTrace:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘-hive.`SEQUENCE_TABLE` WHERE
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:406)
    at com.mysql.jdbc.Util.getInstance(Util.java:381)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1030)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3558)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3490)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2109)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2648)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2077)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2228)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381)
    at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:504)
    at org.datanucleus.store.rdbms.valuegenerator.SequenceTable.getNextVal(SequenceTable.java:197)
    at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.reserveBlock(TableGenerator.java:190)
    at org.datanucleus.store.valuegenerator.AbstractGenerator.reserveBlock(AbstractGenerator.java:305)
    at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:170)
    at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197)
    at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2019)
    at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1385)
    at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3727)
    at org.datanucleus.state.JDOStateManager.setIdentity(JDOStateManager.java:2574)
    at org.datanucleus.state.JDOStateManager.initialiseForPersistentNew(JDOStateManager.java:526)
    at org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:202)
    at org.datanucleus.ExecutionContextImpl.newObjectProviderForPersistentNew(ExecutionContextImpl.java:1326)
    at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2123)
    at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1972)
    at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1820)
    at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
    at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)

11. After installing Hive 0.12 and 0.13, running a job fails with: FileNotFoundException: HIVE_PLAN

Fix: possibly a hive bug, possibly a misconfiguration somewhere; unresolved.

Error log:

2014-05-16 10:27:07,896 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split:

Paths:/user/hive/warehouse/game_predata.db/game_login_log/dt=0000-00-00/000000_0:201326592+60792998,/user/hive/warehouse/game_predata.db/game_login_log/dt=0000-00-00/000001_0_copy_1:201326592+58503492,/user/hive/warehouse/game_predata.db/game_login_log/dt=0000-00-00/000001_0_copy_2:67108864+67108864,/user/hive/warehouse/game_predata.db/game_login_log/dt=0000-00-00/000001_0_copy_2:134217728+67108864,/user/hive/warehouse/game_predata.db/game_login_log/dt=0000-00-00/000002_0_copy_1:67108864+67108864InputFormatClass:

org.apache.hadoop.mapred.TextInputFormat

2014-05-16 10:27:07,954 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.io.FileNotFoundException: HIVE_PLAN14c8af69-0156-4633-9273-6a812eb91a4c (No such file or directory)
    at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:230)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:381)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:374)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:540)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
Caused by: java.io.FileNotFoundException: HIVE_PLAN14c8af69-0156-4633-9273-6a812eb91a4c (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at java.io.FileInputStream.<init>(FileInputStream.java:101)
    at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:221)
    ... 12 more
2014-05-16 10:27:07,957 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Source: <:19888/jobhistory/logs/ST-L10-10-back-tj-yarn10:8034/container_1400136017046_0026_01_000030/attempt_1400136017046_0026_m_000000_0/hadoop>

12. java.lang.OutOfMemoryError: GC overhead limit exceeded

Analysis: this error type was added in JDK 6. It is a protective mechanism that fires when the GC is spending a large amount of time reclaiming very little space. The workaround is to disable the check by adding a JVM flag, -XX:-UseGCOverheadLimit, to the tasks' launch options: in mapred-site.xml, add the flag to a new mapred.child.java.opts entry.
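A minimal sketch of that mapred-site.xml entry (note that mapred.child.java.opts applies to both map and reduce task JVMs, and any existing JVM options should be kept alongside the new flag):

```xml
<!-- Disable the GC-overhead-limit check in task JVMs -->
<property>
    <name>mapred.child.java.opts</name>
    <value>-XX:-UseGCOverheadLimit</value>
</property>
```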


See problem 14.

13. Hive 0.10.0, for efficiency, does not launch a map/reduce job for simple queries (a plain select with no count, sum, or group by); it reads the HDFS files directly and filters them. The benefit is that no new MR job is started, so execution is much faster; the downside is that the user experience suffers: with a large data set you may wait a long time with no feedback at all.

Changing this is simple. hive-site.xml has a parameter, hive.fetch.task.conversion. Set it to more and simple queries skip map/reduce; set it to minimal and even simple selects go through map/reduce.
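A minimal sketch of the hive-site.xml entry described above:

```xml
<!-- "more": plain selects are served by a direct fetch task instead of a map/reduce job -->
<property>
    <name>hive.fetch.task.conversion</name>
    <value>more</value>
</property>
```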

See problem 14.

14. Running an MR job reports:

Container [pid=30486,containerID=container_1400229396615_0011_01_000012] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 1.7 GB of 2.1 GB virtual memory used. Killing container. Dump of the process-tree for container_1400229396615_0011_01_000012 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 30501 30486 30486 30486 (java) 3924 322 1720471552 262096 /opt/jdk1.7.0_55/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -XX:-UseGCOverheadLimit -Djava.io.tmpdir=/home/nodemanager/local/usercache/hadoop/appcache/application_1400229396615_0011/container_1400229396615_0011_01_000012/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/hadoop/logs/nodemanager/logs/application_1400229396615_0011/container_1400229396615_0011_01_000012 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 30.30.30.39 47925 attempt_1400229396615_0011_m_000000_0 12 |- 30486 12812 30486 30486 (bash) 0 0 108642304 302 /bin/bash -c /opt/jdk1.7.0_55/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -XX:-UseGCOverheadLimit -Djava.io.tmpdir=/home/nodemanager/local/usercache/hadoop/appcache/application_1400229396615_0011/container_1400229396615_0011_01_000012/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/hadoop/logs/nodemanager/logs/application_1400229396615_0011/container_1400229396615_0011_01_000012 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 30.30.30.39 47925 attempt_1400229396615_0011_m_000000_0 12 1>/home/hadoop/logs/nodemanager/logs/application_1400229396615_0011/container_1400229396615_0011_01_000012/stdout 2>/home/hadoop/logs/nodemanager/logs/application_1400229396615_0011/container_1400229396615_0011_01_000012/stderr 
Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143  

The following parameters control the memory for mapreduce tasks at run time. Jobs with special needs can override them individually; here they are configured globally. If containers are being killed, raise these values:

mapreduce.map.memory.mb — maximum memory for a map task
mapreduce.map.java.opts -Xmx1024M — JVM options for map tasks
mapreduce.reduce.memory.mb — maximum memory for a reduce task
mapreduce.reduce.java.opts -Xmx2560M — JVM options for reduce tasks
mapreduce.task.io.sort.mb 512 — higher memory-limit while sorting data for efficiency


Disabling the memory checks: I could never work out why some tasks used only 200-odd MB of physical memory while their virtual memory soared to 2.7 GB; the memory checker seems suspect, and some of my jobs genuinely need a lot of memory, so for the sake of progress I simply turned the checks off, which solved all the memory problems at once:

yarn.nodemanager.pmem-check-enabled false
yarn.nodemanager.vmem-check-enabled false
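In yarn-site.xml form, the two switches above look like this (a sketch; as noted, disabling them removes the memory safety net):

```xml
<!-- Turn off NodeManager physical- and virtual-memory enforcement -->
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
```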

15. YARN web UI adjustments:

1. On the cluster page, the applications' StartTime and FinishTime are in UTC; change them to UTC+8, i.e. China Standard Time.

Edit webapps/static/yarn.dt.plugins.js inside ./share/hadoop/yarn/hadoop-yarn-common-2.3.0.jar, or, in the source tree, /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/yarn.dt.plugins.js.

Add this code:

Date.prototype.Format = function (fmt) { // author: meizz
    var o = {
        "M+": this.getMonth() + 1,                   // month
        "d+": this.getDate(),                        // day
        "h+": this.getHours(),                       // hour
        "m+": this.getMinutes(),                     // minute
        "s+": this.getSeconds(),                     // second
        "q+": Math.floor((this.getMonth() + 3) / 3), // quarter
        "S": this.getMilliseconds()                  // millisecond
    };
    if (/(y+)/.test(fmt)) fmt = fmt.replace(RegExp.$1, (this.getFullYear() + "").substr(4 - RegExp.$1.length));
    for (var k in o)
        if (new RegExp("(" + k + ")").test(fmt)) fmt = fmt.replace(RegExp.$1, (RegExp.$1.length == 1) ? (o[k]) : (("00" + o[k]).substr(("" + o[k]).length)));
    return fmt;
};

Then modify renderHadoopDate as follows:

function renderHadoopDate(data, type, full) {
    if (type === 'display' || type === 'filter') {
        if (data === '0') { return "N/A"; }
        return new Date(parseInt(data)).Format("yyyy-MM-dd hh:mm:ss");
    }
    return data;
}

16. MR1 jobs that use DistributedCache fail after migrating to MR2. My code used the file paths to tell the cache files apart, but after MR2 distributes the files only the base file name is kept, e.g.:

application_xxxxxxx/container_14xxxx/part-m-00000
application_xxxxxxx/container_14xxxx/part-m-00001
application_xxxxxxx/container_14xxxx/00000_0

Fix: add a symlink for each cache file, named parent-directory name + "_" + file name:

DistributedCache.addCacheFile(new URI(path.toString() + "#" + path.getParent().getName() + "_" + path.getName()), configuration);

This yields local cache files whose names carry the distinguishing parent directory.