Wednesday, June 19, 2013

Hadoop - Hive/HBase Integration - Zookeeper Session Closes Immediately


We have an 8-node cluster running CDH4.2.1-1.cdh4.2.1.p0.5, configured using Cloudera Manager. Three dedicated master nodes run ZooKeeper. When I configure Hive to run Hadoop in local mode, executed from the master node, I have no problem retrieving data from HBase. When I run a distributed map/reduce job via Hive, however, I get the following error when the slave nodes connect to ZooKeeper.

2013-06-19 21:05:11,553 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2013-06-19 21:05:11,554 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
 at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) 
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

From the error message, we can see that the map/reduce job submitted by Hive is trying to connect to ZooKeeper at 'localhost', regardless of how the ZooKeeper quorum is set up in the config file. To fix this, you have to define "hbase.zookeeper.quorum" and its port in the hive-site.xml file, even though "hive.zookeeper.quorum" has already been defined.
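
As a rough sketch, the additions to hive-site.xml might look like the following. The host names zk1.example.com, zk2.example.com and zk3.example.com are placeholders for the three ZooKeeper nodes, not the real ones from our cluster. I am assuming "hbase.zookeeper.property.clientPort" is the property to use for the port, and 2181 is the port shown in the log above.

```xml
<!-- Add inside the existing <configuration> element of hive-site.xml. -->
<!-- Host names below are placeholders; substitute your own ZooKeeper nodes. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <!-- Assumed property name for the client port; 2181 matches the log above. -->
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```

After the change, the "Opening socket connection to server" line in the task logs should name one of the quorum hosts instead of localhost.localdomain.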
