“value $ is not a member of StringContext” - Missing Scala plugin?

Question

I'm using Maven with the Scala archetype. I get this error:

“value $ is not a member of StringContext”

I have already tried adding several things to pom.xml, but none of them fixed it.

My code:

import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}

// To see less warnings
import org.apache.log4j._
Logger.getLogger("org").setLevel(Level.ERROR)

// Start a simple Spark Session
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().getOrCreate()

// Prepare training and test data.
val data = spark.read.option("header","true").option("inferSchema","true").format("csv").load("USA_Housing.csv")

// Check out the Data
data.printSchema()

// See an example of what the data looks like
// by printing out a Row
val colnames = data.columns
val firstrow = data.head(1)(0)
println("\n")
println("Example Data Row")
for(ind <- Range(1,colnames.length)){
  println(colnames(ind))
  println(firstrow(ind))
  println("\n")
}

////////////////////////////////////////////////////
//// Setting Up DataFrame for Machine Learning ////
//////////////////////////////////////////////////

// A few things we need to do before Spark can accept the data!
// It needs to be in the form of two columns
// ("label","features")

// This will allow us to join multiple feature columns
// into a single column of an array of feature values
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vectors

// Rename Price to label column for naming convention.
// Grab only numerical columns from the data
val df = data.select(data("Price").as("label"),$"Avg Area Income",$"Avg Area House Age",$"Avg Area Number of Rooms",$"Area Population")

// An assembler converts the input values to a vector
// A vector is what the ML algorithm reads to train a model

// Set the input columns from which we are supposed to read the values
// Set the name of the column where the vector will be stored
val assembler = new VectorAssembler().setInputCols(Array("Avg Area Income","Avg Area House Age","Avg Area Number of Rooms","Area Population")).setOutputCol("features")

// Use the assembler to transform our DataFrame to the two columns
val output = assembler.transform(df).select($"label",$"features")

// Create a Linear Regression Model object
val lr = new LinearRegression()

// Fit the model to the data
// Note: Later we will see why we should split
// the data first, but for now we will fit to all the data.
val lrModel = lr.fit(output)

// Print the coefficients and intercept for linear regression
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

// Summarize the model over the training set and print out some metrics!
// Explore this in the spark-shell for more methods to call
val trainingSummary = lrModel.summary

println(s"numIterations: ${trainingSummary.totalIterations}")
println(s"objectiveHistory: ${trainingSummary.objectiveHistory.toList}")

trainingSummary.residuals.show()

println(s"RMSE: ${trainingSummary.rootMeanSquaredError}")
println(s"MSE: ${trainingSummary.meanSquaredError}")
println(s"r2: ${trainingSummary.r2}")

and my pom.xml is:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>test</groupId>
  <artifactId>outrotest</artifactId>
  <version>1.0-SNAPSHOT</version>
  <name>${project.artifactId}</name>
  <description>My wonderfull scala app</description>
  <inceptionYear>2015</inceptionYear>
  <licenses>
    <license>
      <name>My License</name>
      <url>http://....</url>
      <distribution>repo</distribution>
    </license>
  </licenses>

  <properties>
    <maven.compiler.source>1.6</maven.compiler.source>
    <maven.compiler.target>1.6</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.11.5</scala.version>
    <scala.compat.version>2.11</scala.compat.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.11</artifactId>
      <version>2.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>2.0.2</version>
    </dependency>
    <dependency>
      <groupId>com.databricks</groupId>
      <artifactId>spark-csv_2.11</artifactId>
      <version>1.5.0</version>
    </dependency>

    <!-- Test -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.specs2</groupId>
      <artifactId>specs2-junit_${scala.compat.version}</artifactId>
      <version>2.4.16</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.specs2</groupId>
      <artifactId>specs2-core_${scala.compat.version}</artifactId>
      <version>2.4.16</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest_${scala.compat.version}</artifactId>
      <version>2.2.4</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <!-- see http://davidb.github.com/scala-maven-plugin -->
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.0</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
            <configuration>
              <args>
                <!--<arg>-make:transitive</arg>-->
                <arg>-dependencyfile</arg>
                <arg>${project.build.directory}/.scala_dependencies</arg>
              </args>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.18.1</version>
        <configuration>
          <useFile>false</useFile>
          <disableXmlReport>true</disableXmlReport>
          <!-- If you have classpath issue like NoDefClassError,... -->
          <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
          <includes>
            <include>**/*Test.*</include>
            <include>**/*Suite.*</include>
          </includes>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

I have no idea how to fix this. Does anyone have an idea?

Apurva Singh · Accepted Answer

Add this and it will work:

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._ // << add this
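
For anyone wondering why a single import makes the error go away: $"..." is not built-in syntax. The compiler rewrites it into a call to a method named $ on StringContext, and that method only exists once an implicit class providing it is in scope. Here is a minimal sketch of the mechanism without Spark. ColumnRef, ColumnSyntax and InterpolatorDemo are made-up names for illustration and are not Spark classes; as far as I know, spark.implicits._ supplies an equivalent implicit class that returns a Spark column.

```scala
// Hypothetical stand-in for a column type; not a Spark class.
final case class ColumnRef(name: String)

// Hypothetical analogue of what an implicits import brings into scope.
object ColumnSyntax {
  // Wrapping StringContext in an implicit class adds a `$` method to it,
  // so the compiler can resolve $"name" as a method call on StringContext.
  implicit class StringToColumnRef(val sc: StringContext) {
    def $(args: Any*): ColumnRef = ColumnRef(sc.s(args: _*))
  }
}

object InterpolatorDemo {
  // Without this import, Scala 2 reports:
  // value $ is not a member of StringContext
  import ColumnSyntax._

  val income: ColumnRef = $"Avg Area Income"

  def main(args: Array[String]): Unit =
    println(income)
}
```

So the error in the question is just the compiler saying that nothing in scope adds $ to StringContext yet, which is why the import has to come after the SparkSession value is created.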

Haroun Mohammedi · Answer

You can use the col function instead. Just import it like this:

import org.apache.spark.sql.functions.col

And then change $"column" to col("column").

Hope this helps.
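
Applied to the select from the question, that could look roughly like the sketch below. The local master, the app name, the helper name prepare and the one-row placeholder DataFrame (standing in for USA_Housing.csv, with only two of the feature columns) are my assumptions for the sake of a self-contained example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColSelectExample {
  // Hypothetical helper: same select as in the question, with col("...")
  // in place of $"...", so no implicits import is needed.
  def prepare(data: DataFrame): DataFrame =
    data.select(
      col("Price").as("label"),
      col("Avg Area Income"),
      col("Avg Area House Age")
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local master so the sketch runs without a cluster.
    val spark = SparkSession.builder().master("local[*]").appName("col-select").getOrCreate()

    // Assumption: one placeholder row instead of the real CSV file.
    val data = spark
      .createDataFrame(Seq((1.0, 2.0, 3.0)))
      .toDF("Avg Area Income", "Avg Area House Age", "Price")

    prepare(data).printSchema()
    spark.stop()
  }
}
```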

y2k-shubham · Answer

@Apurva's answer initially worked for me, in that the error disappeared from IntelliJ.
But it then resulted in "Could not find implicit value for spark" during the sbt compile phase.

I found a strange workaround: importing implicits._ from the SparkSession referenced by a DataFrame instead of the one obtained from getOrCreate.

import df.sparkSession.implicits._

where df is a DataFrame.

This may be because my code was placed in a case class that received an implicit val spark: SparkSession parameter, but I'm not really sure why this fix worked for me.