"Not enough space" error during data flow from Kafka to HDFS

I am struggling with a Flume-managed data flow from Kafka to HDFS. Because of the exception described below, the data is not transferred completely to HDFS. The error looks misleading to us: there is plenty of space in both the channel's data directory and in HDFS. We suspect it might be a problem with the channel configuration, but we have a similar configuration for other sources and it works correctly for them. If anyone has had to deal with this problem, I would be grateful for hints.
17 Aug 2017 14:15:24,335 ERROR [Log-BackgroundWorker-channel1] (org.apache.flume.channel.file.Log$BackgroundWorker.run:1204) - Error doing checkpoint
java.io.IOException: Usable space exhausted, only 0 bytes remaining, required 524288000 bytes
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1003)
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:986)
at org.apache.flume.channel.file.Log.access$200(Log.java:75)
at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:1201)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17 Aug 2017 14:15:27,552 ERROR [PollableSourceRunner-KafkaSource-kafkaSource] (org.apache.flume.source.kafka.KafkaSource.doProcess:305) - KafkaSource EXCEPTION, {}
org.apache.flume.ChannelException: Commit failed due to IO error [channel=channel1]
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:639)
at org.apache.flume.channel.BasicTransactionSemantics.rollback(BasicTransactionSemantics.java:168)
at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
at org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:286)
at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:58)
at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Usable space exhausted, only 0 bytes remaining, required 524288026 bytes
at org.apache.flume.channel.file.Log.rollback(Log.java:722)
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:637)
... 6 more
Flume configuration:
agent2.sources = kafkaSource
#sources defined
agent2.sources.kafkaSource.type = org.apache.flume.source.kafka.KafkaSource
agent2.sources.kafkaSource.kafka.bootstrap.servers = …
agent2.sources.kafkaSource.kafka.topics = pega-campaign-response
agent2.sources.kafkaSource.channels = channel1
# channels defined
agent2.channels = channel1
agent2.channels.channel1.type = file
agent2.channels.channel1.checkpointDir = /data/cloudera/.flume/filechannel/checkpointdirs/pega
agent2.channels.channel1.dataDirs = /data/cloudera/.flume/filechannel/datadirs/pega
agent2.channels.channel1.capacity = 10000
agent2.channels.channel1.transactionCapacity = 10000
#hdfs sinks
agent2.sinks = sink
agent2.sinks.sink.type = hdfs
agent2.sinks.sink.hdfs.fileType = DataStream
agent2.sinks.sink.hdfs.path = hdfs://bigdata-cls:8020/stage/data/pega/campaign-response/%d%m%Y
agent2.sinks.sink.hdfs.batchSize = 1000
agent2.sinks.sink.hdfs.rollCount = 0
agent2.sinks.sink.hdfs.rollSize = 0
agent2.sinks.sink.hdfs.rollInterval = 120
agent2.sinks.sink.hdfs.useLocalTimeStamp = true
agent2.sinks.sink.hdfs.filePrefix = pega-
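One detail worth noting about the config above: the 524288000 bytes mentioned in the stack trace is the file channel's default `minimumRequiredSpace`, the free-space headroom FileChannel insists on before it will accept writes or checkpoint. If the headroom itself were the problem (it isn't here, given `df`), it could be lowered; this is only an illustrative sketch of that knob, not a claimed fix:

```properties
# Illustrative only: FileChannel's free-space headroom defaults to
# 524288000 bytes (~500 MB). Flume rejects values below 1048576 (1 MB).
agent2.channels.channel1.minimumRequiredSpace = 104857600
```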
Output of df -h:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel-root 26G 6.8G 18G 28% /
devtmpfs 126G 0 126G 0% /dev
tmpfs 126G 6.3M 126G 1% /dev/shm
tmpfs 126G 2.9G 123G 3% /run
tmpfs 126G 0 126G 0% /sys/fs/cgroup
/dev/sda1 477M 133M 315M 30% /boot
tmpfs 26G 0 26G 0% /run/user/0
cm_processes 126G 1.9G 124G 2% /run/cloudera-scm-agent/process
/dev/scinib 2.0T 53G 1.9T 3% /data
tmpfs 26G 20K 26G 1% /run/user/2000
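Since `df` shows 1.9T free on /data, one plausible explanation for a reported 0 usable bytes is that the user running the Flume agent cannot write to (or the JVM cannot stat) the checkpoint/data directories, in which case `File.getUsableSpace()` returns 0. A self-contained sketch of the check (throwaway directory; the real paths from the config above would be substituted in, run as the agent's user):

```shell
#!/bin/sh
# Sketch: reproduce the kind of usable-space / writability check FileChannel
# relies on, against a throwaway directory instead of the real channel dirs.
DIR=$(mktemp -d)
mkdir -p "$DIR/checkpoint" "$DIR/data"

# df on the directory shows roughly what the JVM's getUsableSpace() sees:
df -k "$DIR" | tail -1

# A directory the agent user cannot write to (or that does not exist)
# effectively yields 0 usable bytes for that user; verify writability:
test -w "$DIR/data" && echo "data dir writable"

rm -rf "$DIR"
```

Against the real deployment this would be something like `sudo -u flume test -w /data/cloudera/.flume/filechannel/datadirs/pega` (assuming the agent runs as `flume`), and checking the ownership of every component of the path.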
What happens if you use a Kafka channel or a memory channel instead? –
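For reference, the memory-channel variant the comment suggests would sidestep the local-disk check entirely, at the cost of losing any buffered events if the agent crashes or restarts; a minimal sketch of the swap, keeping the capacities from the original config:

```properties
# Illustrative alternative: memory channel instead of file channel.
# No checkpoint/data dirs needed, but buffered events are not durable.
agent2.channels = channel1
agent2.channels.channel1.type = memory
agent2.channels.channel1.capacity = 10000
agent2.channels.channel1.transactionCapacity = 10000
```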