I am running HBase on a CDH 5.7.0 cluster (one master server and four region servers). After several months of running without problems, the hbase service went down and now it is impossible to start the HBase Master.

At some point, whenever I try to start it, the machine hangs, and the last thing I can see in the master log is the output below. I am afraid something in the WALProcedureStore is corrupted:
2016-10-24 12:17:15,150 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005528.log
2016-10-24 12:17:15,152 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005528.log after 2ms
2016-10-24 12:17:15,177 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005529.log
2016-10-24 12:17:15,179 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005529.log after 2ms
2016-10-24 12:17:15,394 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005530.log
2016-10-24 12:17:15,397 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005530.log after 3ms
2016-10-24 12:17:15,405 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005531.log
2016-10-24 12:17:15,409 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005531.log after 3ms
2016-10-24 12:17:15,414 WARN org.apache.hadoop.hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.net.SocketException: No buffer space available
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:465)
at sun.nio.ch.Net.connect(Net.java:457)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3499)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:838)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:753)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:374)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:942)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:742)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:232)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
at org.apache.hadoop.hbase.protobuf.generated.ProcedureProtos$ProcedureWALHeader.parseDelimitedFrom(ProcedureProtos.java:3870)
at org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormat.readHeader(ProcedureWALFormat.java:138)
at org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFile.open(ProcedureWALFile.java:76)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.initOldLog(WALProcedureStore.java:1006)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.initOldLogs(WALProcedureStore.java:969)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:300)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:509)
at org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1175)
at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1097)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:681)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:187)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1756)
at java.lang.Thread.run(Thread.java:745)
2016-10-24 12:17:15,427 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /xxx.xxx.xxx.xxx:50010 for block, add to deadNodes and continue. java.net.SocketException: No buffer space available
java.net.SocketException: No buffer space available
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:465)
at sun.nio.ch.Net.connect(Net.java:457)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3499)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:838)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:753)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:374)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:942)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:742)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:232)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
at org.apache.hadoop.hbase.protobuf.generated.ProcedureProtos$ProcedureWALHeader.parseDelimitedFrom(ProcedureProtos.java:3870)
at org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormat.readHeader(ProcedureWALFormat.java:138)
at org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFile.open(ProcedureWALFile.java:76)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.initOldLog(WALProcedureStore.java:1006)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.initOldLogs(WALProcedureStore.java:969)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:300)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:509)
at org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1175)
at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1097)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:681)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:187)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1756)
at java.lang.Thread.run(Thread.java:745)
2016-10-24 12:17:15,436 INFO org.apache.hadoop.hdfs.DFSClient: Successfully connected to /xxx.xxx.xxx.xxx:50010 for BP-813663273-xxx.xxx.xxx.xxx-1460963038761:blk_1079056868_5316127
2016-10-24 12:17:15,442 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005532.log
2016-10-24 12:17:15,444 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005532.log after 2ms
2016-10-24 12:17:15,669 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005533.log
2016-10-24 12:17:15,672 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://namenode:8020/hbase/MasterProcWALs/state-00000000000000005533.log after 2ms
I don't know how to keep digging to find the underlying problem. Any clues? Is it possible to start the master fresh, without it trying to load the previously corrupted state?
EDIT:
I just saw this bug, which I think is the same problem that is happening to me. Can I safely delete everything under /hbase/MasterProcWALs without losing the old data stored in HBase?
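In case it helps anyone hitting the same issue: rather than deleting the procedure WALs outright, a safer sketch is to move them aside inside HDFS so they can be restored if something goes wrong. The backup path below is an assumption; adjust paths and service commands to your cluster.

```shell
# Hedged sketch: park the Master procedure WALs instead of deleting them.
# Assumes the HBase Master is stopped first (e.g. via Cloudera Manager).

# Create a backup location inside HDFS...
hdfs dfs -mkdir -p /hbase/MasterProcWALs.bak

# ...and move the state-*.log files there (quote the glob so HDFS expands it).
hdfs dfs -mv '/hbase/MasterProcWALs/*' /hbase/MasterProcWALs.bak/

# Then restart the Master. The procedure WALs only record the state of
# in-flight master procedures (table create/delete/enable, etc.), not table
# contents, so region data under /hbase/data is not touched.
```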
Thanks
You can try the solution described here: http://stackoverflow.com/questions/6068423/java-net-socketexception-no-buffer-space-available-maximum-connections-reached – BruceWayne
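Following up on that link: `java.net.SocketException: No buffer space available` typically means the OS ran out of socket resources (often ephemeral ports) on the master host. A rough way to check on Linux, with the caveat that exact sysctl defaults and tooling vary by distribution:

```shell
# Hedged sketch: diagnose socket exhaustion on the HBase Master host.

# Summary of open sockets, including TIME_WAIT counts:
ss -s

# Current ephemeral port range:
sysctl net.ipv4.ip_local_port_range

# Maximum socket buffer sizes:
sysctl net.core.rmem_max net.core.wmem_max

# If the ephemeral port range is exhausted, widening it is a common
# temporary mitigation (persist it in /etc/sysctl.conf if it helps):
# sysctl -w net.ipv4.ip_local_port_range="10240 65000"
```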