Running Apache Zeppelin with Kotlin interpreter on Apache Spark cluster

Last Updated 3 December 2019
This tutorial shows how to build Apache Zeppelin with Kotlin support from sources and run it on a Spark Cluster.

Currently, the latest release of Zeppelin (0.8.2) doesn't come with a bundled Kotlin interpreter. However, the interpreter is already available in the master branch of Zeppelin, so to add Kotlin support you need to build your own version from the sources.

This tutorial explains how to run Zeppelin with Kotlin support on an Apache Spark cluster. Instructions for running Zeppelin locally are available here.

Prerequisites

To build a custom version of Zeppelin, you will need:

Spark version

The Kotlin interpreter supports only Spark 2.4 and above. Zeppelin builds with Spark 2.2 support by default, so don't forget to specify a suitable profile for the build (for example, -Pspark-2.4).
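
The version requirement can be checked before starting a build. The following is a minimal sketch, not part of Zeppelin or Spark: the function name version_at_least is hypothetical, and it assumes GNU sort with the -V (version sort) option is available. The version string is hard-coded here for illustration; on a real cluster it would come from, for example, the output of spark-submit --version.

```shell
# Succeeds (exit status 0) when version $1 is at least version $2.
# Hypothetical helper; relies on GNU sort -V to order version strings.
version_at_least() {
    # If the required minimum sorts first (or both are equal), the check passes.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example value only; replace it with the Spark version of your cluster.
spark_version="2.4.4"
if version_at_least "$spark_version" "2.4"; then
    echo "Spark $spark_version meets the 2.4 minimum"
else
    echo "Spark $spark_version is below the 2.4 minimum"
fi
```

Comparing with sort -V rather than plain string comparison keeps versions such as 2.10 ordered after 2.4.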

Building and running Zeppelin on a Spark Cluster

The instructions below explain how to build and run Apache Zeppelin on a Spark cluster in Amazon EMR.

Once you have access to cluster creation in the AWS Console, do the following:

  1. Create a new cluster.

  2. Specify the cluster options. Don't forget that the Spark version should be 2.4 or above.

  3. Connect to the newly created cluster via SSH.

  4. Install prerequisites:
    sudo yum install -y maven git npm fontconfig freetype freetype-devel fontconfig-devel libstdc++ R
    
  5. Clone the Zeppelin repository. We recommend cloning to /mnt since it generally has more free space.

    git clone --depth=1 git@github.com:apache/zeppelin.git
    

    or

    git clone --depth=1 https://github.com/apache/zeppelin.git
    
  6. Build Zeppelin using the following command:

    mvn clean package -DskipTests -Pspark-2.4 -Pscala-2.11
    
  7. The newly built Zeppelin distribution will appear in zeppelin-distribution/target. Unpack it into /mnt/distr using tar.

  8. Remove conf and local-repo directories (if any of them exist) from the unpacked folder.

  9. Copy the contents of this dir into /usr/lib/zeppelin:
    cp -r /mnt/distr/zeppelin[??] /usr/lib/zeppelin
    
  10. Reboot the cluster:
    sudo reboot
    
  11. After the reboot, log in to the cluster again and run the following:

    sudo stop zeppelin
    sudo /usr/lib/zeppelin/bin/zeppelin-daemon.sh start

Now you can access Zeppelin at https://<your machine public address>:8080.
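
Steps 7 and 8 above (unpacking the distribution and removing the conf and local-repo directories) can be sketched as a small shell function. This is an illustration under stated assumptions, not part of the Zeppelin tooling: the function name unpack_zeppelin is hypothetical, the archive is assumed to be a tar archive that GNU tar can read, and it is assumed to contain a single top-level directory. The archive path is passed as an argument because the exact file name depends on the version you built.

```shell
# Unpack a Zeppelin distribution archive and remove the conf and local-repo
# directories from the unpacked folder (hypothetical helper).
#   $1: path to the archive found in zeppelin-distribution/target
#   $2: destination directory, for example /mnt/distr
unpack_zeppelin() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # GNU tar detects the compression format automatically when extracting.
    tar -xf "$archive" -C "$dest" || return 1
    # Assumption: the archive unpacks into one or more top-level directories.
    for dir in "$dest"/*/; do
        # rm -rf does not fail when a directory is absent.
        rm -rf "${dir}conf" "${dir}local-repo"
    done
}
```

It would be called with the path to the archive from step 7 as the first argument and /mnt/distr as the second.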