In this post, we will discuss the error message "java.io.IOException: Failed to connect to". This error keeps coming up when we try to execute a Hive query from spark-shell using Spark SQL, and it occurs when Spark executes a task in local mode (pseudo-distributed mode). It is caused by a connection exception: the error escalates from one layer to the next and prints the complete stack trace. Once the task fails, Spark retries it, and each retry fails in the same way, which is how the overall execution time increases significantly.
This error occurs because Spark tries to send the transformation that runs in the local driver over the network, using the machine's LAN IP address, as if a Spark cluster were available. In local mode there is no Spark cluster, which is why the connection keeps failing several times during the execution. The error surfaces when we invoke an action such as show().
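As a quick illustration, a minimal spark-shell session that can trigger this error might look like the following. The table name employees is hypothetical; any Hive table queried through Spark SQL behaves the same way.

spark-shell

scala> spark.sql("SELECT * FROM employees").show()
// show() is an action, so Spark launches a task; in local mode the task
// fails with java.io.IOException: Failed to connect to /<LAN-IP>:<port>
// and is retried until the retries are exhausted.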
java.io.IOException: Failed to connect to – Local Spark mode
A summary of the error message looks like this:
ERROR Utils: Aborting task (0 + 1) / 1]
java.io.IOException: Failed to connect to /192.168.1.7:59554
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:288)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230).......................
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: /192.168.1.7:59554...................
Caused by: java.net.ConnectException: Operation timed out
How to fix – java.io.IOException in local Spark mode
To fix java.io.IOException in local Spark mode, run the following commands before starting spark-shell:
export SPARK_LOCAL_IP="127.0.0.1"
spark-shell
Above, we set SPARK_LOCAL_IP to 127.0.0.1 so that Spark binds to the loopback address instead of the machine's LAN IP. This tells Spark that there is no cluster installed and that the shell is running in local mode only. Once we export this variable and then start spark-shell, the error is fixed.
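If setting an environment variable is inconvenient, the same effect can usually be achieved through Spark configuration. The sketch below assumes Spark 2.1 or later, where the spark.driver.bindAddress property is available:

spark-shell --conf spark.driver.bindAddress=127.0.0.1

# To make this permanent for all local sessions, the property can also be
# added to conf/spark-defaults.conf:
# spark.driver.bindAddress  127.0.0.1

After restarting spark-shell with either approach, re-running the query, for example spark.sql("SELECT * FROM employees").show() from the earlier sketch, should complete without the connection failure.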
For reference, we have copy-pasted the full error message below:
ERROR Utils: Aborting task (0 + 1) / 1]
java.io.IOException: Failed to connect to /192.168.1.7:59554
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:288)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:399)
at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:367)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1496)
at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:366)
at org.apache.spark.repl.ExecutorClassLoader.getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:135)
at org.apache.spark.repl.ExecutorClassLoader.$anonfun$fetchFn$1(ExecutorClassLoader.scala:66)
at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:176)
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:113)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:576)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:398)
at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:89)
at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317)
at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:8618)
at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:6771)
at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:6620)
at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6599)
at org.codehaus.janino.UnitCompiler.access$14300(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$22$1.visitReferenceType(UnitCompiler.java:6502)
at org.codehaus.janino.UnitCompiler$22$1.visitReferenceType(UnitCompiler.java:6497)
at org.codehaus.janino.Java$ReferenceType.accept(Java.java:4134)
at org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6497)
at org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6490)
at org.codehaus.janino.Java$ReferenceType.accept(Java.java:4133)
at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6490)
at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6895)
at org.codehaus.janino.UnitCompiler.access$14100(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$22$1.visitArrayType(UnitCompiler.java:6500)
at org.codehaus.janino.UnitCompiler$22$1.visitArrayType(UnitCompiler.java:6497)
at org.codehaus.janino.Java$ArrayType.accept(Java.java:4215)
at org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6497)
at org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6490)
at org.codehaus.janino.Java$ArrayType.accept(Java.java:4214)
at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6490)
at org.codehaus.janino.UnitCompiler.access$1300(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$36.getParameterTypes2(UnitCompiler.java:10451)
at org.codehaus.janino.IClass$IInvocable.getParameterTypes(IClass.java:959)
at org.codehaus.janino.IClass$IMethod.getDescriptor2(IClass.java:1224)
at org.codehaus.janino.IClass$IInvocable.getDescriptor(IClass.java:982)
at org.codehaus.janino.IClass.getIMethods(IClass.java:248)
at org.codehaus.janino.IClass.getIMethods(IClass.java:237)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:470)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:410)
at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:389)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:384)
at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1594)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:384)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:362)
at org.codehaus.janino.UnitCompiler.access$000(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:336)
at org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:333)
at org.codehaus.janino.Java$CompilationUnit.accept(Java.java:363)
at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:333)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:235)
at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:464)
at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:314)
at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:237)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:205)
at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1489)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1586)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1583)
at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1436)
at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:378)
at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.generate(GenerateUnsafeProjection.scala:327)
at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.createCodeGeneratedObject(Projection.scala:123)
at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.createCodeGeneratedObject(Projection.scala:119)
at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:52)
at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.create(Projection.scala:150)
at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.create(Projection.scala:143)
at org.apache.spark.sql.catalyst.expressions.UnsafeProjection$.create(Projection.scala:135)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.$anonfun$doExecute$3(HiveTableScanExec.scala:218)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.$anonfun$doExecute$3$adapted(HiveTableScanExec.scala:217)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:885)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:885)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: /192.168.1.7:59554
Caused by: java.net.ConnectException: Operation timed out
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Thanks for reading. Please share your inputs in the comments section.