2016-08-23
1

I can't run my Spark jobs locally using sbt, but they work in IntelliJ. I wrote a few simple Spark jobs and some tests, and so far I've been doing everything in IntelliJ. Now I want to make sure my code also builds with sbt. Compilation works fine, but I get strange errors when running and testing.

I'm using Scala version 2.11.8 and sbt version 0.13.8. My build.sbt file looks like this:

name := "test" 

version := "1.0" 

scalaVersion := "2.11.7" 

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" 
libraryDependencies += "javax.mail" % "javax.mail-api" % "1.5.6" 
libraryDependencies += "com.sun.mail" % "javax.mail" % "1.5.6" 
libraryDependencies += "commons-cli" % "commons-cli" % "1.3.1" 
libraryDependencies += "org.scalatest" % "scalatest_2.11" % "3.0.0" % "test" 
libraryDependencies += "com.holdenkarau" % "spark-testing-base_2.11" % "2.0.0_0.4.4" % "test" intransitive() 

When I try to run my code with sbt "run-main com.test.email.processor.bin.Runner", this is the output:

[info] Loading project definition from /Users/max/workplace/test/project 
[info] Set current project to test (in build file:/Users/max/workplace/test/) 
[info] Running com.test.email.processor.bin.Runner -j recipientCount -e /Users/max/workplace/data/test/enron_with_categories/*/*.txt 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
16/08/23 18:46:55 INFO SparkContext: Running Spark version 2.0.0 
16/08/23 18:46:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/08/23 18:46:55 INFO SecurityManager: Changing view acls to: max 
16/08/23 18:46:55 INFO SecurityManager: Changing modify acls to: max 
16/08/23 18:46:55 INFO SecurityManager: Changing view acls groups to: 
16/08/23 18:46:55 INFO SecurityManager: Changing modify acls groups to: 
16/08/23 18:46:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(max); groups with view permissions: Set(); users with modify permissions: Set(max); groups with modify permissions: Set() 
16/08/23 18:46:56 INFO Utils: Successfully started service 'sparkDriver' on port 61759. 
16/08/23 18:46:56 INFO SparkEnv: Registering MapOutputTracker 
16/08/23 18:46:56 INFO SparkEnv: Registering BlockManagerMaster 
16/08/23 18:46:56 INFO DiskBlockManager: Created local directory at /private/var/folders/75/4dydy_6110v0gjv7bg265_g40000gn/T/blockmgr-9eb526c0-b7e5-444a-b186-d7f248c5dc62 
16/08/23 18:46:56 INFO MemoryStore: MemoryStore started with capacity 408.9 MB 
16/08/23 18:46:56 INFO SparkEnv: Registering OutputCommitCoordinator 
16/08/23 18:46:56 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/08/23 18:46:56 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.11:4040 
16/08/23 18:46:56 INFO Executor: Starting executor ID driver on host localhost 
16/08/23 18:46:57 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61760. 
16/08/23 18:46:57 INFO NettyBlockTransferService: Server created on 192.168.1.11:61760 
16/08/23 18:46:57 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.11, 61760) 
16/08/23 18:46:57 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.11:61760 with 408.9 MB RAM, BlockManagerId(driver, 192.168.1.11, 61760) 
16/08/23 18:46:57 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.11, 61760) 
16/08/23 18:46:57 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 128.0 KB, free 408.8 MB) 
16/08/23 18:46:57 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.6 KB, free 408.8 MB) 
16/08/23 18:46:57 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.11:61760 (size: 14.6 KB, free: 408.9 MB) 
16/08/23 18:46:57 INFO SparkContext: Created broadcast 0 from wholeTextFiles at RecipientCountJob.scala:22 
16/08/23 18:46:58 WARN ClosureCleaner: Expected a closure; got com.test.email.processor.util.cleanEmail$ 
16/08/23 18:46:58 INFO FileInputFormat: Total input paths to process : 1702 
16/08/23 18:46:58 INFO FileInputFormat: Total input paths to process : 1702 
16/08/23 18:46:58 INFO CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0 
16/08/23 18:46:58 INFO SparkContext: Starting job: take at RecipientCountJob.scala:35 
16/08/23 18:46:58 WARN DAGScheduler: Creating new stage failed due to exception - job: 0 
java.lang.ClassNotFoundException: scala.Function0 
    at sbt.classpath.ClasspathFilter.loadClass(ClassLoaders.scala:63) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
    at java.lang.Class.forName0(Native Method) 
    at java.lang.Class.forName(Class.java:348) 
    at com.twitter.chill.KryoBase$$anonfun$1.apply(KryoBase.scala:41) 
    at com.twitter.chill.KryoBase$$anonfun$1.apply(KryoBase.scala:41) 
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) 
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) 
    at scala.collection.immutable.Range.foreach(Range.scala:166) 
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) 
    at scala.collection.AbstractTraversable.map(Traversable.scala:104) 
    at com.twitter.chill.KryoBase.<init>(KryoBase.scala:41) 
    at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:57) 
    at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:86) 
    at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:274) 
    at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:259) 
    at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:175) 
    at org.apache.spark.serializer.KryoSerializer.supportsRelocationOfSerializedObjects$lzycompute(KryoSerializer.scala:182) 
    at org.apache.spark.serializer.KryoSerializer.supportsRelocationOfSerializedObjects(KryoSerializer.scala:178) 
    at org.apache.spark.shuffle.sort.SortShuffleManager$.canUseSerializedShuffle(SortShuffleManager.scala:187) 
    at org.apache.spark.shuffle.sort.SortShuffleManager.registerShuffle(SortShuffleManager.scala:99) 
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:90) 
    at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:91) 
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:235) 
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:233) 
    at scala.Option.getOrElse(Option.scala:121) 
    at org.apache.spark.rdd.RDD.dependencies(RDD.scala:233) 
    at org.apache.spark.scheduler.DAGScheduler.visit$2(DAGScheduler.scala:418) 
    at org.apache.spark.scheduler.DAGScheduler.getAncestorShuffleDependencies(DAGScheduler.scala:433) 
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getShuffleMapStage(DAGScheduler.scala:288) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:394) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:391) 
    at scala.collection.immutable.List.foreach(List.scala:381) 
    at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:391) 
    at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:403) 
    at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:304) 
    at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:339) 
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:849) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1626) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607) 
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
16/08/23 18:46:58 INFO DAGScheduler: Job 0 failed: take at RecipientCountJob.scala:35, took 0.076653 s 
[error] (run-main-0) java.lang.ClassNotFoundException: scala.Function0 
java.lang.ClassNotFoundException: scala.Function0 
[trace] Stack trace suppressed: run last compile:runMain for the full output. 
16/08/23 18:46:58 ERROR ContextCleaner: Error in cleaning thread 
java.lang.InterruptedException 
    at java.lang.Object.wait(Native Method) 
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) 
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:175) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229) 
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:172) 
    at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:67) 
16/08/23 18:46:58 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext 
java.lang.InterruptedException 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) 
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:67) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66) 
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:65) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:64) 
java.lang.RuntimeException: Nonzero exit code: 1 
+0

Do you have Scala 2.11 installed? –

+0

I did install it, but how do I tell sbt where it is? – Max

+0

As long as SCALA_HOME is set, you should be good. –

Answers

0

Apparently, Spark jobs can't be run through sbt. I ended up packaging the whole job into a jar with the assembly plugin and running it with java.
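
For reference, a minimal sketch of what that assembly setup could look like; the sbt-assembly version, output jar name, and java invocation below are assumptions, not details from the original answer:

// project/plugins.sbt -- assumed sbt-assembly version, adjust as needed
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// then build the fat jar and run it on a plain JVM instead of through sbt:
//   sbt assembly
//   java -cp target/scala-2.11/test-assembly-1.0.jar com.test.email.processor.bin.Runner <args>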

0

scala.Function0 is part of the standard Scala library, so it looks like the Scala library itself cannot be found.

However, it seems that scala-library is not being added to your run classpath. You could try adding it explicitly, with an appropriate scope:

libraryDependencies += "org.scala-lang" % "scala-library" % scalaVersion.value

You can also add the following so that the same classpath used for compilation is used when running your code through sbt:

fullClasspath in run := (fullClasspath in Compile).value
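
Taken together, the suggested additions to the build.sbt from the question would look roughly like this; treat it as a sketch under sbt 0.13 syntax rather than a guaranteed fix for the ClasspathFilter error, and the inspection command at the end is only a way to check what sbt actually puts on the classpath:

// additions to the build.sbt shown in the question
libraryDependencies += "org.scala-lang" % "scala-library" % scalaVersion.value

// make `sbt run` / `sbt run-main` use the compile classpath
fullClasspath in run := (fullClasspath in Compile).value

// to see what ends up on the runtime classpath:
//   sbt "show runtime:fullClasspath"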

+0

It's unlikely that scala-library is missing from the classpath. Also, adding both of those lines did nothing. Could it be that the Spark task isn't picking up the right classpath? – Max
