Apache Pig COUNT not working, DUMP fails
I have a class assignment to find the number of entries that match a certain filter condition.
My dataset has the following schema:
data1 = LOAD '/answers.csv' USING PigStorage(',') AS (qid:long,qt:long,tag:chararray,at:long);
qid = question ID, qt = question start time (epoch time), tag = question tags, at = answer end time (epoch time)
Sample dataset:
SN  QID     QT          TAG                             AT
1   563355  1235000081  php,error,gd,image-processing   1235000501
2   563355  1235000081  php,error,gd,image-processing   1235000551
3   563356  1235000140  lisp,scheme,subjective,clojure  1235000177
4   563356  1235000140  lisp,scheme,subjective,clojure  1235001545
5   563356  1235000140  lisp,scheme,subjective,clojure  1235002457
6   563356  1235000140  lisp,scheme,subjective,clojure  1235002809
7   563356  1235000140  lisp,scheme,subjective,clojure  1235003266
8   563356  1235000140  lisp,scheme,subjective,clojure  1235007817
9   563356  1235000140  lisp,scheme,subjective,clojure  1235007913
10  563356  1235000140  lisp,scheme,subjective,clojure  1235020626
11  563356  1235000140  lisp,scheme,subjective,clojure  1235040652
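I need to find the number of questions that were answered within 1 hour.
Approach (Pig version 0.15.0):
To find the difference in hours between qt and at: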
A = FOREACH data1 GENERATE HoursBetween(ToDate(qt),ToDate(at)) AS diffhours;
B = FOREACH (FILTER A BY diffhours < 1) GENERATE diffhours;
C = GROUP B ALL;
D = FOREACH C GENERATE COUNT(B.diffhours) ;
However, when I DUMP D, the job fails with the messages below:
2016-04-06 01:13:17,736 [LocalJobRunner Map Task Executor #0] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [112, 114, 111, 103, 114, 97, 109, 109, 105, 110, 103] in field being converted to int, caught NumberFormatException <For input string: "programming"> field discarded
2016-04-06 01:13:17,736 [LocalJobRunner Map Task Executor #0] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [115, 117, 98, 106, 101, 99, 116, 105, 118, 101, 34] in field being converted to int, caught NumberFormatException <For input string: "subjective""> field discarded
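A possible reading of these warnings: the rejected values ("programming", and "subjective" with a trailing double quote) look like pieces of a quoted, comma-separated tag list. If the tag column in the file is quoted and contains commas, PigStorage(',') would split it into several fields, so a tag fragment lands where a numeric field is expected. The following is only a sketch under that assumption; the piggybank.jar path is hypothetical, and whether the file really is quoted CSV has not been confirmed.

```pig
-- Assumption: the tag column is a double-quoted string containing commas,
-- e.g. 563355,1235000081,"php,error,gd,image-processing",1235000501
-- Hypothetical path; point this at wherever piggybank.jar lives on your system.
REGISTER /path/to/piggybank.jar;

-- CSVExcelStorage treats a quoted field as a single value, commas included.
data1 = LOAD '/answers.csv'
        USING org.apache.pig.piggybank.storage.CSVExcelStorage(',')
        AS (qid:long, qt:long, tag:chararray, at:long);

-- Quick sanity check: inspect a few parsed rows before running the full job.
sample_rows = LIMIT data1 5;
DUMP sample_rows;
```

If the rows printed by the check show the full tag list in the third field and a number in the fourth, the type conversion warnings should no longer appear.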
I also get these:
Pig Stack Trace
---------------
ERROR 1200: <line 6, column 0> Syntax error, unexpected symbol at or near 'D'
Failed to parse: <line 6, column 0> Syntax error, unexpected symbol at or near 'D'
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:244)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:182)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1082)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:565)
at org.apache.pig.Main.main(Main.java:177)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
================================================================================
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias D
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias D
at org.apache.pig.PigServer.openIterator(PigServer.java:935)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:754)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:565)
at org.apache.pig.Main.main(Main.java:177)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
at org.apache.pig.PigServer.openIterator(PigServer.java:927)
... 13 more
I am unable to understand this problem.
What does your data look like? Please add some sample input data to your question. – ninja123
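For comparison, here is a minimal sketch of the same count using plain arithmetic on the epoch values instead of date conversion. It assumes qt and at are epoch seconds (as far as I know, ToDate with a single long argument expects milliseconds), that all four fields parse cleanly, and that "questions answered within 1 hour" means distinct question IDs with at least one such answer. The alias names are illustrative only.

```pig
-- Assumption: every row parses into the four declared fields.
data1 = LOAD '/answers.csv' USING PigStorage(',')
        AS (qid:long, qt:long, tag:chararray, at:long);

-- Seconds elapsed between the question and each answer.
diffs = FOREACH data1 GENERATE qid, (at - qt) AS diffsecs;

-- Keep answers that arrived less than one hour (3600 seconds) after the question.
within_hour = FILTER diffs BY diffsecs >= 0L AND diffsecs < 3600L;

-- Count each question once, even if it has several fast answers.
qids = FOREACH within_hour GENERATE qid;
uniq_qids = DISTINCT qids;
grouped = GROUP uniq_qids ALL;
result = FOREACH grouped GENERATE COUNT(uniq_qids);

DUMP result;
```

Dropping the DISTINCT step would count answers rather than questions, which may or may not be what the assignment asks for.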