Summary

  • Install Docker
  • Note: If anything fails at any point, power down Gimel with quickstart/stop-gimel and start over
  • Clone Gimel Repo
  • Download Gimel Jar
  • Run the bootstrap module
  • Once the Spark session is ready: play with the Gimel Data API / GSQL

Gimel Standalone

Overview

The Gimel Standalone feature provides the following capabilities for developers and users alike:

  • Try Gimel locally on a laptop without requiring all the ecosystems of a Hadoop cluster.
  • Standalone comprises Docker containers spawned for each storage type the user would like to explore, e.g. Kafka and Elasticsearch.
  • Standalone bootstraps these containers (storage types) with sample flights data.
  • Once the containers are spawned and the data is bootstrapped, the user can refer to the connector docs and try the Gimel Data API / Gimel SQL on a local laptop.
  • In the future, the standalone feature could also be used to automate regression tests and to run standalone Spark JVMs for container-based solutions.

Install Docker

  • Install Docker on your machine
  • Mac - Docker Installation
  • Start the Docker service
  • Increase the memory allocated to Docker by navigating to Preferences > Advanced > Memory
  • (Optional) Clear existing containers and images (a one-shot version follows this list)
    • List existing Docker containers - docker ps -aq
    • Kill existing Docker containers (if any) - docker kill $(docker ps -aq)
    • Remove existing Docker containers (if any) - docker rm $(docker ps -aq)
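
The optional cleanup as a single command (note: like the commands above, this kills and removes all containers on the machine, not just Gimel's):

docker kill $(docker ps -aq); docker rm $(docker ps -aq)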

Download the Gimel Jar

  • Clone the repo Gimel
  • Download the gimel jar from Here
  • Navigate to the folder gimel
cd gimel
  • Navigate to the folder gimel-dataapi/gimel-standalone/ - cd gimel-dataapi/gimel-standalone/
  • Create a lib folder in gimel-standalone - mkdir lib
  • Copy the downloaded jar into lib (a sketch of this step follows)
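
A sketch of the copy step, assuming the jar was downloaded to ~/Downloads under the name the spark-shell command below expects:

cp ~/Downloads/gimel-sql-2.0.0-SNAPSHOT-uber.jar lib/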

Run Gimel Quickstart Script

  • Navigate back to GIMEL_HOME (if $GIMEL_HOME is not set in your shell, see the sketch after this list)
cd $GIMEL_HOME
  • To start all the Docker containers and bootstrap the storages, execute the following command
quickstart/start-gimel {STORAGE_SYSTEM}
  • STORAGE_SYSTEM can be either all or a comma-separated list, as follows
quickstart/start-gimel kafka,elasticsearch,hbase-master,hbase-regionserver
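
If $GIMEL_HOME is not set in your shell (referenced in the first step above), set it to the root of the cloned repo; the path below is a placeholder for wherever you cloned it:

export GIMEL_HOME=/path/to/gimel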

Note: This script will do the following:
  • Start Docker containers for each storage
  • Bootstrap the physical storages (create Kafka topics and HBase tables)
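
As noted in the Summary, if anything fails at any point, power down the containers and start over:

quickstart/stop-gimel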

  • To start the Spark shell, run the following command
docker exec -it spark-master bash -c \
'export USER=an; export SPARK_HOME=/spark/; export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin; \
/spark/bin/spark-shell --jars /root/gimel-sql-2.0.0-SNAPSHOT-uber.jar'

Note: You can view the Spark UI here


Common Imports and Initializations

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.hive.HiveContext
import com.paypal.gimel.sql.GimelQueryProcessor

val gsql = GimelQueryProcessor.executeBatch(_: String, spark)
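
With gsql defined, GSQL statements can be issued as plain SQL strings. A minimal smoke test follows; pcatalog.flights_kafka is a placeholder dataset name (take actual names from the connector docs), and this assumes executeBatch returns a DataFrame, as the import above suggests:

// Placeholder dataset name - substitute one bootstrapped by start-gimel
val df = gsql("select * from pcatalog.flights_kafka")
df.show(10)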