Why does spark-shell fail with a NullPointerException?

Question

I am trying to run spark-shell on Windows 10, but I keep getting this error every time I run it.

I tried both the latest version and spark-1.5.0-bin-hadoop2.4.

15/09/22 18:46:24 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 18:46:24 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 18:46:27 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/09/22 18:46:27 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/09/22 18:46:27 WARN : Your hostname, DESKTOP-8JS2RD5 resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:c0a8:103%net1, but we couldn't find any external IP address!
java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:163)
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:168)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
    at $iwC$$iwC.<init>(<console>:9)
    at $iwC.<init>(<console>:18)
    at <init>(<console>:20)
    at .<init>(<console>:24)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
    at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:132)
    at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
    at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
    at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
    at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
    at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
    at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
    at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
    at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(Unknown Source)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
    at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
    at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:559)
    at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:534)
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:599)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
    ... 56 more

<console>:10: error: not found: value sqlContext
       import sqlContext.implicits._
              ^
<console>:10: error: not found: value sqlContext
       import sqlContext.sql
              ^

max · Accepted Answer

I used Spark 1.5.2 with Hadoop 2.6 and had similar problems. I solved them with the following steps:

Download winutils.exe from the repository to a local folder, e.g. C:\hadoop\bin.
Set HADOOP_HOME to C:\hadoop.
Create the directory C:\tmp\hive (using Windows Explorer or any other tool).
Open a command prompt with administrator rights.
Run C:\hadoop\bin\winutils.exe chmod 777 /tmp/hive

With that, I still get some warnings, but no ERRORs, and I can run Spark applications just fine.
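
The steps above can be scripted. The following is only a minimal sketch, assuming winutils.exe has already been downloaded to C:\hadoop\bin and that the script is started from an elevated prompt; the function names are hypothetical, and it passes the Windows-style path C:\tmp\hive rather than /tmp/hive.

```python
import os
import subprocess
import sys
from pathlib import PureWindowsPath


def winutils_chmod_command(hadoop_home, target_dir, mode="777"):
    """Build the winutils.exe chmod command as an argument list."""
    winutils = PureWindowsPath(hadoop_home) / "bin" / "winutils.exe"
    return [str(winutils), "chmod", mode, str(PureWindowsPath(target_dir))]


def prepare_hive_scratch_dir(hadoop_home=r"C:\hadoop", target_dir=r"C:\tmp\hive"):
    """Create the scratch directory and open its permissions (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("winutils.exe is only needed on Windows")
    # Setting the variable here affects this process and its children only.
    os.environ["HADOOP_HOME"] = hadoop_home
    os.makedirs(target_dir, exist_ok=True)
    subprocess.run(winutils_chmod_command(hadoop_home, target_dir), check=True)


if __name__ == "__main__" and sys.platform == "win32":
    prepare_hive_scratch_dir()
```

Because HADOOP_HOME is set only for the script's own process, it still has to be defined in the system or user environment before spark-shell is launched from another prompt.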

user6567143 · Answer

I faced a similar problem and resolved it by putting winutils.exe in a bin folder. HADOOP_HOME should be set to C:\Winutils, and winutils.exe should be placed in C:\Winutils\bin.

winutils builds for 64-bit Windows 10 are available at https://github.com/steveloughran/winutils/tree/master/hadoop-2.6.0/bin

Also make sure the command prompt has administrator access.

See https://wiki.apache.org/hadoop/WindowsProblems
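
To confirm that the layout described here is in place before starting spark-shell, a small check along these lines may help. It is a sketch only; locate_winutils is a hypothetical helper name, not part of Spark or Hadoop.

```python
import os
from pathlib import Path


def locate_winutils(environ=None):
    """Return the path to winutils.exe in the bin folder of HADOOP_HOME, or None."""
    environ = os.environ if environ is None else environ
    hadoop_home = environ.get("HADOOP_HOME")
    if not hadoop_home:
        return None  # HADOOP_HOME is unset or empty
    candidate = Path(hadoop_home) / "bin" / "winutils.exe"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = locate_winutils()
    print(found if found else "winutils.exe not found under HADOOP_HOME/bin")
```

A result of None means either that HADOOP_HOME is not visible to the current process or that the executable is not inside its bin subfolder.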

Michael Pigg · Answer

My guess is that you are running into https://issues.apache.org/jira/browse/SPARK-10528. I was seeing the same issue on Windows 7. Initially I got the NullPointerException, as you did. When I put winutils into the bin directory and set HADOOP_HOME to point to the Spark directory, I got the error described in the JIRA issue.

kepung · Answer

Or perhaps the link below will be easier to follow:

https://wiki.apache.org/hadoop/WindowsProblems

Basically, download winutils.exe and copy it into your spark\bin folder, then rerun spark-shell.

If you have not yet made your /tmp/hive directory writable, please do so.

Sachin Sukumaran · Answer

For Python: create a SparkSession in your Python code (this configuration section applies to Windows only)

from pyspark.sql import SparkSession
spark = SparkSession.builder.config("spark.sql.warehouse.dir", "C:/temp").appName("SparkSQL").getOrCreate()

Copy winutils.exe into C:\winutils\bin and run the command below

C:\Windows\system32>C:\winutils\bin\winutils.exe chmod 777 C:/temp

Run the command prompt in admin mode (Run as administrator)
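
Several answers in this thread ask for an elevated prompt. A Python script can check this itself before calling winutils.exe. This is a sketch under the assumption that the Windows shell32 function IsUserAnAdmin is an acceptable test for elevation; is_admin is a hypothetical helper name.

```python
import ctypes
import os
import sys


def is_admin():
    """Return True if the current process has administrator (or root) rights."""
    if sys.platform == "win32":
        # IsUserAnAdmin returns a non-zero value for an elevated process.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    # Fallback for POSIX systems, where winutils.exe is not needed anyway.
    return os.geteuid() == 0


if __name__ == "__main__":
    if not is_admin():
        print("Not elevated: reopen the prompt with 'Run as administrator'.")
```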

Nishu Tayal · Answer

You need to grant permissions on the /tmp/hive directory to resolve this exception.

I assume you already have winutils.exe and have set the HADOOP_HOME environment variable. Then open a command prompt as administrator and run the following command:

If winutils.exe is located in D:\winutils\bin and \tmp\hive is also on drive D:

D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive

For more details, you can refer to the following links:

Frequent Issues occurred during Spark Development
How to run Apache Spark on Windows 7 in standalone mode

JKMburu · Answer

My problem was having other .exe and .jar files in the winutils/bin folder. I deleted all the others and kept only winutils.exe. I was using Spark 2.1.1.
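
To see which files share the folder with winutils.exe before deleting anything, a listing like the following can be used. It is a minimal sketch; unexpected_files is a hypothetical helper, and it only reports files without removing them.

```python
from pathlib import Path


def unexpected_files(bin_dir, expected=("winutils.exe",)):
    """List file names in bin_dir other than the expected ones (case-insensitive)."""
    allowed = {name.lower() for name in expected}
    return sorted(
        entry.name
        for entry in Path(bin_dir).iterdir()
        if entry.is_file() and entry.name.lower() not in allowed
    )
```

For example, unexpected_files(r"C:\winutils\bin") returns an empty list when winutils.exe is the only file present.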

Deepak m · Answer

You can resolve this issue by placing the mysqlconnector JAR in the spark-1.6.0/libs folder and then restarting it. It works.

The important thing is that, instead of running plain spark-shell, you should run

spark-shell --driver-class-path /home/username/spark-1.6.0-libs-mysqlconnector.jar

I hope that works.

user2590261 · Answer

On Windows, you need to clone "winutils":

git clone https://github.com/steveloughran/winutils.git

And

set HADOOP_HOME to DIR_CLONED\hadoop-{version}

Remember to choose the version that matches your Hadoop.
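
Choosing the matching hadoop-{version} folder can be sketched as follows, assuming the Hadoop version is read from the Spark build name (such as spark-1.5.0-bin-hadoop2.4 from the question) and that the cloned repository contains folders named hadoop-<version>. Both function names are hypothetical.

```python
import re


def hadoop_version_from_spark_build(build_name):
    """Extract the Hadoop version from a name such as 'spark-1.5.0-bin-hadoop2.4'."""
    match = re.search(r"hadoop(\d+(?:\.\d+)*)", build_name)
    return match.group(1) if match else None


def matching_winutils_dirs(hadoop_version, available_dirs):
    """Return the 'hadoop-<version>' names whose version starts with hadoop_version."""
    prefix = "hadoop-" + hadoop_version
    return [
        name for name in available_dirs
        if name == prefix or name.startswith(prefix + ".")
    ]
```

Given the version string "2.6", a folder named hadoop-2.6.0 would be selected and HADOOP_HOME would then point to DIR_CLONED\hadoop-2.6.0. If no folder matches, the list is empty and a nearby version has to be picked by hand.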

ashwin kumar · Answer

The issue was resolved after installing the correct Java version (Java 8 in my case) and setting the environment variables. Make sure you run winutils.exe to set up the temporary directory, as shown below.

c:\winutils\bin\winutils.exe chmod 777 \tmp\hive

The command above should not return any error. Use java -version to check which Java version you are using before invoking spark-shell.
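
The java -version check can also be done from a script. This is a sketch that assumes the usual banner format, where the first line contains a quoted version string such as "1.8.0_66"; the helper names are hypothetical.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first = int(match.group(1))
    # Releases up to Java 8 report themselves as 1.x (for example "1.8.0_66").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` and parse its output; None if java is not on PATH."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # The JVM prints its version banner to stderr, not stdout.
    return parse_java_major(result.stderr + result.stdout)
```

If installed_java_major() does not return the version you expect (8 in this answer), the java executable found first on PATH is probably not the one you installed.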