In this post, we will learn how to create a jar in IntelliJ IDEA for a Maven-based Scala + Spark project. We will use the Maven build tool to create the jar file from a sample Scala project. Maven is a project management tool that manages the project lifecycle, and it can also build an executable jar from a Java or Scala project.
Create a jar file from a Maven-based Scala project using IntelliJ IDEA
A Scala application must be packaged as a jar file before it can be executed on a Hadoop cluster. Once the jar file is created, we can copy it to the edge node or the master node and run it with the spark-submit command. To create a jar in IntelliJ IDEA for a Maven-based Scala + Spark project, we need to follow these steps:
- Create a sample Maven-based Scala project in IntelliJ IDEA
- Install Maven plugins in IntelliJ
- Modify the pom.xml file
- Package the application using the mvn command in the terminal
- Test the jar file by executing it
Let’s discuss each of the above-mentioned points in detail below:
1. Create a sample Maven-based Scala project in IntelliJ IDEA
To set up a Maven-based hello-world project, we need to follow these steps:
- Firstly, download and install IntelliJ IDEA CE
- Secondly, install the Scala language plugin in IntelliJ
- Thirdly, create a sample hello-world Maven project using the archetype “scala-archetype-simple”
- Finally, build and run the project
In short, create a Maven project using the archetype “scala-archetype-simple” and update the App.scala file with the below code:
package com.sqlrelease.demo
/**
 * Hello world!
 */
object App {
  def main(args: Array[String]): Unit = {
    println("Hello World!")
  }
}
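The project skeleton from this step can also be generated from a terminal instead of the IDE wizard. The sketch below assumes the archetype is published under the coordinates net.alchim31.maven:scala-archetype-simple, and that the group id com.sqlrelease.demo and artifact id maven-hello-world-scala match the package and jar names used in this post. The function name is only illustrative.

```shell
# Generate a Scala project skeleton from the command line.
# Assumption: archetype coordinates net.alchim31.maven:scala-archetype-simple.
# No archetype version is pinned here, so Maven picks the latest one it can resolve.
generate_scala_project() {
  local group_id="$1"      # e.g. com.sqlrelease.demo
  local artifact_id="$2"   # e.g. maven-hello-world-scala
  mvn archetype:generate \
    -DarchetypeGroupId=net.alchim31.maven \
    -DarchetypeArtifactId=scala-archetype-simple \
    -DgroupId="$group_id" \
    -DartifactId="$artifact_id" \
    -Dversion=1.0-SNAPSHOT \
    -DinteractiveMode=false
}
```

Calling generate_scala_project com.sqlrelease.demo maven-hello-world-scala should create a maven-hello-world-scala folder that can then be opened in IntelliJ IDEA as a Maven project.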
2. Install Maven plugins in IntelliJ
Once we have created the hello-world sample Scala application, we need to install the Maven Helper plugin in IntelliJ IDEA. To do so:
On macOS, choose IntelliJ IDEA -> Preferences -> Plugins. On Windows, choose File -> Settings -> Plugins to open the plugin installation screen.
Next, go to the Marketplace tab and search for the “Maven Helper” plugin. Click the Install button to install it.
Once it is installed, we may need to restart the IDE.
3. Modify the pom.xml file
Now, we need to update the pom.xml file to tell Maven how to build and package our Scala application.
If we want to add Spark code, we need to add the Spark dependencies to the pom.xml file and set the Scala version there as well. We can update the <properties> tag of the pom file with the below code.
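<properties>
  <scala.version>2.12.10</scala.version>
  <!-- Added to specify the Scala binary version and the Spark version -->
  <scala.binary.version>2.12</scala.binary.version>
  <spark.version>3.2.0</spark.version>
  <!-- Scope used by the Spark dependencies: "compile" bundles Spark into the jar, "provided" leaves it to the cluster -->
  <spark.dependency.scope>compile</spark.dependency.scope>
</properties>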
Next, we need to add the Spark dependencies inside the <dependencies> tag of the pom.xml file.
<!-- Added Spark dependencies -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.binary.version}</artifactId>
  <version>${spark.version}</version>
  <scope>${spark.dependency.scope}</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.binary.version}</artifactId>
  <version>${spark.version}</version>
  <scope>${spark.dependency.scope}</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hive_${scala.binary.version}</artifactId>
  <version>${spark.version}</version>
  <scope>${spark.dependency.scope}</scope>
</dependency>
Finally, we need to tell Maven how the .jar file should be packaged. To do so, we add the below plugin inside the <plugins> tag of the <build> tag in the pom.xml file.
<!-- Added to enable jar creation using the mvn command -->
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>3.1.1</version>
  <configuration>
    <archive>
      <manifest>
        <!-- Replace with the fully qualified main class, e.g. com.sqlrelease.demo.App -->
        <mainClass>fully.qualified.MainClass</mainClass>
      </manifest>
    </archive>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <!-- bind to the packaging phase -->
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
4. Package the application using the mvn command in the terminal
To build and package the application into a jar file, we can execute the below mvn command in the IntelliJ terminal window.
mvn clean package
The above command removes the output of any previous build and then creates a new packaged .jar file that can be deployed and executed on a Hadoop cluster.
After the command finishes, a new folder named target is created in the project directory, and it contains the packaged jar files.
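With the assembly plugin configured as in step 3, the build normally leaves two jars in the target folder: the plain project jar and a second one whose name ends in -jar-with-dependencies. A small sketch for locating and inspecting the second one, assuming the artifact id maven-hello-world-scala and version 1.0-SNAPSHOT implied by the jar name used in the next step:

```shell
# Assumed coordinates; adjust to the <artifactId> and <version> in your pom.xml.
ARTIFACT_ID="maven-hello-world-scala"
VERSION="1.0-SNAPSHOT"

# The assembly plugin appends the descriptor name to the default jar name.
FAT_JAR="target/${ARTIFACT_ID}-${VERSION}-jar-with-dependencies.jar"

if [ -f "$FAT_JAR" ]; then
  # Show the jar size and confirm that the compiled main class was packaged.
  ls -lh "$FAT_JAR"
  jar tf "$FAT_JAR" | grep "com/sqlrelease/demo/App"
else
  echo "Jar not found at $FAT_JAR; run 'mvn clean package' first."
fi
```

If the grep line prints nothing, the main class was not compiled into the jar, which usually points to a problem in the pom.xml build section.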
5. Test the jar file by executing it
Finally, we can run the created jar with the below java command, specifying the fully qualified main class name, and then verify that it prints Hello World!.
java -cp target/maven-hello-world-scala-1.0-SNAPSHOT-jar-with-dependencies.jar com.sqlrelease.demo.App
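On a cluster, the same jar would normally be launched through spark-submit rather than plain java, as mentioned at the start of this post. The following is only a rough sketch: the yarn master and the cluster deploy mode are assumptions about the target environment, and the function name is illustrative.

```shell
# Submit the assembled jar to a cluster.
# Assumptions: spark-submit is on the PATH of the edge or master node,
# and the cluster is managed by YARN.
submit_jar() {
  local jar_path="$1"     # path to the jar-with-dependencies file
  local main_class="$2"   # e.g. com.sqlrelease.demo.App
  spark-submit \
    --class "$main_class" \
    --master yarn \
    --deploy-mode cluster \
    "$jar_path"
}
```

For example, submit_jar target/maven-hello-world-scala-1.0-SNAPSHOT-jar-with-dependencies.jar com.sqlrelease.demo.App. In cluster deploy mode the println output goes to the driver's container logs rather than to the terminal.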
Thanks for reading. Please share your feedback in the comments section.