scala-xgboost-spark

Installing the XGBoost JVM package for use with Apache Spark

For a complete guide and documentation, please refer to the official XGBoost documentation.

Here, I’m just giving a quick start for installing XGBoost on Linux so that it can run in the Spark environment described in my previous post.

First of all, clone the XGBoost repository:

git clone --recursive https://github.com/dmlc/xgboost

You need a recent gcc compiler to build XGBoost. I installed the Red Hat development tools, which gave me gcc 5.3.1.
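Before building, it can help to check which gcc is on your PATH. The snippet below is only a minimal sketch: `version_at_least` is a hypothetical helper, and the `5.0` threshold is my own assumption based on the gcc 5.3.1 I used, not an official minimum (check the XGBoost build documentation for the real requirement). It relies on GNU `sort -V`, which is available on most Linux distributions.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Uses GNU sort -V (version sort) to find the smaller of the two versions.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold for illustration only, not an official minimum.
required="5.0"

if command -v gcc >/dev/null 2>&1; then
  # gcc -dumpversion prints the compiler version (newer gcc may print only the major number).
  found="$(gcc -dumpversion)"
  if version_at_least "$found" "$required"; then
    echo "gcc $found looks recent enough"
  else
    echo "gcc $found is older than $required; consider upgrading"
  fi
else
  echo "gcc not found on PATH"
fi
```

If the check reports an older compiler, install a newer toolchain (for example through your distribution's development tools) before running the build.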

Now, move to your xgboost directory

cd xgboost

and compile the project

make -j4

If you also want the Python package, install it with:

cd python-package && python setup.py install

Install the JVM package

To install the JVM package, move to the jvm-packages directory, then compile and install the package with

mvn -DskipTests clean package install

Done!

You can now access the Scala, Python and Java XGBoost APIs.

Now you can try running the code from this post:
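To double-check that the Maven install step put the JVM artifacts where Spark builds can find them, you can look into the local Maven repository. This is a rough sketch under a few assumptions: the repository lives in the default `~/.m2/repository` location, the artifacts are published under the `ml.dmlc` group with names starting with `xgboost4j`, and `M2_REPO` is a hypothetical override variable I introduce here only for convenience. `find_xgboost_jars` is likewise a helper of my own.

```shell
# Assumed default local Maven repository; M2_REPO is a hypothetical override,
# useful if your settings.xml points somewhere else.
repo="${M2_REPO:-$HOME/.m2/repository}"

# Hypothetical helper: list every xgboost4j*.jar found under the given directory.
# Errors (for example a missing directory) are silenced, so the output is just empty.
find_xgboost_jars() {
  find "$1" -type f -name 'xgboost4j*.jar' 2>/dev/null | sort
}

# Assumption: the JVM packages are installed under the ml.dmlc group.
jars="$(find_xgboost_jars "$repo/ml/dmlc")"

if [ -n "$jars" ]; then
  echo "Installed XGBoost JVM artifacts:"
  echo "$jars"
else
  echo "no xgboost4j jars found under $repo/ml/dmlc"
fi
```

If nothing is listed, rerun the Maven command from the jvm-packages directory and look at its output for build errors.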

Spark and Xgboost

Happy sparking!