Spark Kernel

A Jupyter kernel written in Scala for use with Apache Spark.

Overview of the Spark Kernel Client Library

In this third and final part of the Spark Kernel series (part 1, part 2), we focus on the client library, a Scala-based library for interfacing with the Spark Kernel. The library lets Scala applications communicate with a Spark Kernel without needing to understand ZeroMQ or the IPython message protocol. Using the client library, Scala applications can also treat the Spark Kernel as a remote service: they can run separately from a Spark cluster and use the kernel as a remote connection into the cluster.

Spark Kernel Architecture

In the first part of the Spark Kernel series, we stepped through the problem of enabling interactive applications against Apache Spark and how the Spark Kernel solves it. This week, we focus on the Spark Kernel’s architecture: how we achieve fault tolerance and scalability using Akka, why we chose ZeroMQ with the IPython/Jupyter message protocol, what layers of functionality the kernel contains (see figure 1 below), and how the Comm API, an interactive API from IPython, fits in.
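
For readers unfamiliar with the IPython/Jupyter message protocol mentioned above, here is a rough sketch of what a single code-execution request looks like. It is written in Python only for brevity and is not taken from the Spark Kernel source: the helper names `make_execute_request` and `to_frames` are hypothetical, the field names follow version 5.0 of the Jupyter messaging specification, and the sketch leaves out the ZeroMQ routing identities, the `<IDS|MSG>` delimiter, and the HMAC signature that precede these frames on the wire.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id, username="demo-user"):
    """Build the four dictionaries that make up an `execute_request` message.

    Hypothetical helper for illustration; not part of any library.
    """
    header = {
        "msg_id": str(uuid.uuid4()),       # unique per message
        "session": session_id,             # unique per client session
        "username": username,
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.0",                  # protocol version assumed here
    }
    content = {
        "code": code,                      # source code the kernel should run
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
    }
    return {
        "header": header,
        "parent_header": {},               # empty: this message starts a new exchange
        "metadata": {},
        "content": content,
    }


def to_frames(message):
    """Serialize the four dictionaries to JSON byte frames in wire order."""
    order = ("header", "parent_header", "metadata", "content")
    return [json.dumps(message[part]).encode("utf-8") for part in order]


if __name__ == "__main__":
    request = make_execute_request("sc.parallelize(1 to 10).sum()", str(uuid.uuid4()))
    for frame in to_frames(request):
        print(frame.decode("utf-8"))
```

A kernel that speaks this protocol replies with messages whose `parent_header` echoes the request's `header`, which is how a client matches results to the code it submitted.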

How to enable interactive applications against Apache Spark

Last December, IBM open-sourced a project called the Spark Kernel, an application focused on interactive usage of Apache Spark. The project addresses a problem we encountered when trying to migrate a Storm-based application to Apache Spark: “How do we enable interactive applications against Apache Spark?”

The Spark Kernel - Meetup Talk
