Saturday, 1 February 2020

Error - CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss


20/01/31 09:58:42 ERROR ConnectionState: Connection timed out for connection string (zookeeper-prelive:2181) and timeout (15000) / elapsed (15086)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
        at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
        at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
        at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:477)
        at org.apache.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:172)
        at org.apache.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:161)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
        at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:158)
        at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:148)
        at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:36)
        at org.apache.spark.deploy.SparkCuratorUtil$.mkdir(SparkCuratorUtil.scala:48)
        at org.apache.spark.deploy.master.ZooKeeperPersistenceEngine.<init>(ZooKeeperPersistenceEngine.scala:41)
        at org.apache.spark.deploy.master.ZooKeeperRecoveryModeFactory.createPersistenceEngine(RecoveryModeFactory.scala:71)
        at org.apache.spark.deploy.master.Master.onStart(Master.scala:174)
        at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:122)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


When you get the above error in the Spark master, first check whether ZooKeeper is running. If ZooKeeper appears to be working fine, the cause is most likely a temporary memory issue in ZooKeeper. In that case, restart ZooKeeper and then restart the Spark master.

If the error keeps coming back, you might need to consider increasing the RAM on this system.
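
To check whether ZooKeeper is actually answering on the host and port from the log, one option is to send it the "ruok" four-letter command and look for the reply "imok". The snippet below is only a rough sketch of that check: the function name zookeeper_is_ok is made up for this post, and it assumes the "ruok" command is enabled on your server (newer ZooKeeper versions require it to be listed in 4lw.commands.whitelist). A False result only means the check did not get "imok" back; it does not tell you why.

```python
import socket


def zookeeper_is_ok(host, port=2181, timeout=5.0):
    """Return True if the server answers ZooKeeper's 'ruok' command with 'imok'.

    Hypothetical helper for illustration. Assumes the 'ruok' four-letter
    command is enabled on the target ZooKeeper server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The expected reply is exactly four bytes: b"imok".
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break  # server closed the connection early
                reply += chunk
            return reply == b"imok"
    except OSError:
        # Covers connection refused, DNS failure, and socket timeouts.
        return False


# Example call, using the host and port from the log above:
# zookeeper_is_ok("zookeeper-prelive", 2181)
```

If this returns False while the ZooKeeper process is up, that fits the situation described above, where a restart of ZooKeeper and then the Spark master is the next step.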