Build and install Spark on a Linux platform

Here is a short guide to building, installing, and configuring Apache Spark on a Linux platform.

You can decide which Spark release to install in your environment. I started using Spark from version 2.0, and rather than using the pre-compiled releases, I compiled and configured it myself.

Besides that, I preferred to use the version from the GitHub master development branch, but you can use any branch from GitHub.

So, choose the main path where you want your SPARK_HOME to live and clone your preferred release there.
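For example, assuming you want Spark under /opt (a hypothetical location; adjust it to your environment), the clone step might look like this:

```shell
# Hypothetical install location; adjust to your environment
cd /opt
# Clone the official Apache Spark repository (master branch by default)
git clone https://github.com/apache/spark.git
cd spark
# Optionally check out a specific release branch, e.g.:
# git checkout branch-3.5
```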

All the information needed to build Spark from scratch is available at this link.

Once you have cloned your Spark release, go into your Spark directory and use the following line to compile and install it.
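The exact command depends on your Spark version; a typical Maven-based build, using the build/mvn wrapper bundled in the Spark repository, is a sketch like:

```shell
# Build Spark with the bundled Maven wrapper, skipping tests to save time
./build/mvn -DskipTests clean package
```

The build can take a while the first time, since the wrapper downloads Maven and all dependencies.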

First of all, you can configure your environment variables in your .bashrc so that they point to the Java, Scala, and Python APIs of Spark.
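For instance, a minimal .bashrc fragment might look like this (the paths are hypothetical; adapt them to where you cloned Spark and installed Java):

```shell
# Hypothetical paths; adapt to your installation
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export SPARK_HOME=/opt/spark
export PATH=$SPARK_HOME/bin:$PATH
# Make PySpark importable from a plain Python interpreter
export PYTHONPATH=$SPARK_HOME/python:$PYTHONPATH
```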

Specific configuration for Spark environment

All the Spark configuration can be set up under the spark/conf directory of your installation.

You can adapt these two files, spark-defaults.conf and spark-env.sh, starting from the .template versions you will find in that directory.

For example, you can put in spark-defaults.conf some of the configuration options you will use in your Spark session.
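As an illustration, a few common spark-defaults.conf entries might be (the values here are hypothetical; tune them to your machine):

```
spark.master              local[*]
spark.driver.memory       4g
spark.serializer          org.apache.spark.serializer.KryoSerializer
```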

In spark-env.sh, instead, you can set up all the environment variables you need for your specific installation and parameters.

If needed, do not forget to source this file.
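A minimal spark-env.sh sketch might look like this (the values are hypothetical; adjust them to your hardware):

```shell
# Hypothetical values; adjust to your hardware
export SPARK_WORKER_CORES=4
export SPARK_WORKER_MEMORY=8g
# Python interpreter used by PySpark
export PYSPARK_PYTHON=python3
```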

If you want to try my XGBoost and Spark code, refer to the xgboost installation guide.

Happy sparking!
