
[Spark] Spark Pitfalls We Have Hit Over the Years

Copyright notice: This is an original article by the blogger and may not be reproduced without permission. https://blog.csdn.net/SunnyYoona/article/details/72922155

1. java.lang.NoClassDefFoundError: org/apache/spark/Logging

1.1 Problem

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.streaming.twitter.TwitterUtils$.createStream(TwitterUtils.scala:44)
    at TwitterStreamingApp$.main(TwitterStreamingApp.scala:42)
    at TwitterStreamingApp.main(TwitterStreamingApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 23 more           

1.2 Solution

First, let's look at our Maven dependencies:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.1.0</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka_2.11</artifactId>
    <version>1.6.3</version>
</dependency>           

As the dependencies show, our Spark version is 2.1.0, but org/apache/spark/Logging was removed after Spark 1.5.2. Since spark-streaming-kafka 1.6.3 still uses Logging, we replace that dependency with the one built for Spark 2.x:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
    <version>2.1.0</version>
</dependency>           
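More generally, every Spark artifact on the classpath should share a single Spark version and a single Scala binary version, or this class of NoClassDefFoundError tends to recur. A sketch of keeping them aligned with Maven properties (the property names are illustrative, not from the original POM):

```xml
<properties>
    <spark.version>2.1.0</spark.version>
    <scala.binary.version>2.11</scala.binary.version>
</properties>

<dependencies>
    <!-- Both artifacts resolve to the same Spark and Scala versions -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-8_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
</dependencies>
```

With this setup, upgrading Spark later means changing one property instead of hunting down each version tag.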

2. Task not serializable

2.1 Problem

Exception in thread "main" org.apache.spark.SparkException: Task not serializable
	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
	at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$map$1.apply(DStream.scala:546)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$map$1.apply(DStream.scala:546)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
	at org.apache.spark.streaming.StreamingContext.withScope(StreamingContext.scala:264)
	at org.apache.spark.streaming.dstream.DStream.map(DStream.scala:545)
	at org.apache.spark.streaming.api.java.JavaDStreamLike$class.map(JavaDStreamLike.scala:157)
	at org.apache.spark.streaming.api.java.AbstractJavaDStreamLike.map(JavaDStreamLike.scala:42)
	at com.sjf.open.spark.stream.SparkStreamTest.receiveKafkaData(SparkStreamTest.java:76)
	at com.sjf.open.spark.stream.SparkStreamTest.startStream(SparkStreamTest.java:71)
	at com.sjf.open.spark.stream.SparkStreamTest.main(SparkStreamTest.java:48)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.io.NotSerializableException: com.sjf.open.spark.stream.SparkStreamTest
Serialization stack:
	- object not serializable (class: com.sjf.open.spark.stream.SparkStreamTest, value: com.sjf.open.spark.stream.SparkStreamTest@1e6cc850)
	- element of array (index: 0)
	- array (class [Ljava.lang.Object;, size 1)
	- field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, type: class [Ljava.lang.Object;)
	- object (class java.lang.invoke.SerializedLambda, SerializedLambda[capturingClass=class com.sjf.open.spark.stream.SparkStreamTest, functionalInterfaceMethod=org/apache/spark/api/java/function/Function.call:(Ljava/lang/Object;)Ljava/lang/Object;, implementation=invokeSpecial com/sjf/open/spark/stream/SparkStreamTest.handleMessage:(Lscala/Tuple2;)Ljava/lang/String;, instantiatedMethodType=(Lscala/Tuple2;)Ljava/lang/String;, numCaptured=1])
	- writeReplace data (class: java.lang.invoke.SerializedLambda)
	- object (class com.sjf.open.spark.stream.SparkStreamTest$$Lambda$8/1868987089, com.sjf.open.spark.stream.SparkStreamTest$$Lambda$8/1868987089@3c0036b)
	- field (class: org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, name: fun$1, type: interface org.apache.spark.api.java.function.Function)
	- object (class org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, <function1>)
	at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
	at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
	at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
	... 20 more           

2.2 Solution

The serialization stack shows that our SparkStreamTest class is not serializable:

public class SparkStreamTest{
    ...
}           

Change it to:

public class SparkStreamTest implements Serializable{
    ...
}           
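An alternative to making the whole driver class Serializable is to move the function into a static nested class (or a static method), so the closure no longer captures the outer `this`. The sketch below uses illustrative class names, not the original code, and round-trips the function through Java serialization, which is essentially the check Spark's ClosureCleaner performs before shipping a closure to executors:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableDemo {

    // Static nested class: serializing an instance does not drag in any
    // enclosing object, so Spark can ship it to executors on its own.
    static class HandleMessage implements Serializable {
        public String call(String key, String value) {
            return key + ":" + value;
        }
    }

    // Java-serialize an object, as Spark's ClosureCleaner does when it
    // verifies that a task closure is serializable.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Succeeds because HandleMessage captures no outer instance.
        byte[] bytes = serialize(new HandleMessage());
        System.out.println("serialized " + bytes.length + " bytes");
    }
}
```

If the class genuinely must hold a non-serializable field (a connection, a client), marking that field `transient` and recreating it on the executor side is the usual pattern.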

3. java.lang.NoSuchMethodError: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z

3.1 Problem

Exception in thread "main" java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:200)
	at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:194)
	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
	at org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:194)
	at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:102)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:522)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
	at com.sjf.open.spark.JavaWordCount.test1(JavaWordCount.java:37)
	at com.sjf.open.spark.JavaWordCount.main(JavaWordCount.java:96)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.NoSuchMethodError: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
	at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:537)
	at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:448)
	at org.apache.spark.metrics.sink.MetricsServlet.<init>(MetricsServlet.scala:48)
	... 26 more           

3.2 Solution

The Elasticsearch packages we pull in also contain com.fasterxml.jackson.core, at version 2.8.1. With mismatched Jackson versions on the classpath, the method cannot be resolved at runtime, so we need to explicitly specify the com.fasterxml.jackson.core version:

<properties>
    <jackson.version>2.6.5</jackson.version>
</properties>

<!-- jackson: Spark SQL conflicts with ES, so specify the version manually -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>${jackson.version}</version>
</dependency>
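Pinning only jackson-databind still leaves jackson-core and jackson-annotations free to drift apart, and mixing, say, jackson-core 2.8.x with jackson-databind 2.6.x produces exactly this kind of NoSuchMethodError. A sketch of forcing all three modules to one version via dependencyManagement (assuming the `jackson.version` property defined above):

```xml
<dependencyManagement>
    <dependencies>
        <!-- All Jackson modules resolve to the same version, overriding
             whatever transitive versions Spark or Elasticsearch request -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-annotations</artifactId>
            <version>${jackson.version}</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

Running `mvn dependency:tree` afterwards is a quick way to confirm that no module still resolves to a conflicting Jackson version.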