This guide walks through the steps of adding a new connector, using an example.

Anatomy of a Connector's module

Before we start adding a new connector, let's look at the various components of a connector.

Let's take restapi as a sample connector. A connector typically has the following packages and classes -

  • com.paypal.gimel.restapi.reader - This MANDATORY package holds the reader facade; the full implementation does not belong in the same file.
  • com.paypal.gimel.restapi.writer - This MANDATORY package holds the writer facade; the full implementation does not belong in the same file.
  • com.paypal.gimel.restapi.conf - This MANDATORY package must contain exactly 3 classes, described below.
  • com.paypal.gimel.restapi.utilities - This optional package is where the heavy lifting is implemented. Developers are free to organize the code base as desired, provided it follows general Java/Scala coding practices.
  • com.paypal.gimel.restapi.conf.RestApiClientConfiguration - All references to members inside the connector are initialized, defaulted or resolved here.
  • com.paypal.gimel.restapi.conf.RestApiConfigs - All the parameters that are exposed to the client are listed here.
  • com.paypal.gimel.restapi.conf.RestApiConstants - All the non-parameter constants are defined here. Example
  • com.paypal.gimel.restapi.reader.RestApiConsumer - The connector's API that is exposed to the DataSet class, method read. Example
  • com.paypal.gimel.restapi.writer.RestApiProducer - The connector's API that is exposed to the DataSet class, method write. Example
  • com.paypal.gimel.restapi.utilities.* - If the connector has a heavy implementation, place all the logic in this package. Example
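
The division of responsibilities described above can be sketched roughly as follows. This is a minimal illustration, not the actual restapi connector code: the property keys, the default timeout, the `RestApiUtilities` object and the string-returning `read` method are all assumptions made for the example, and the real classes work with Spark and Gimel types.

```scala
// Hypothetical sketch of the conf, utilities and reader pieces.
// Key names, defaults and method signatures are illustrative assumptions.

// Parameters that are exposed to the client.
object RestApiConfigs {
  val urlKey: String = "gimel.restapi.url"              // assumed key name
  val timeoutMsKey: String = "gimel.restapi.timeout.ms" // assumed key name
}

// Non-parameter constants; these are never set by the client.
object RestApiConstants {
  val defaultTimeoutMs: Int = 5000 // assumed default
}

// Every value the connector uses is initialized, defaulted or resolved here.
class RestApiClientConfiguration(props: Map[String, String]) {
  // Required parameter: fail fast when it is missing.
  val url: String = props.getOrElse(
    RestApiConfigs.urlKey,
    throw new IllegalArgumentException(s"Missing property: ${RestApiConfigs.urlKey}")
  )

  // Optional parameter: fall back to the constant when it is not supplied.
  val timeoutMs: Int = props
    .get(RestApiConfigs.timeoutMsKey)
    .map(_.toInt)
    .getOrElse(RestApiConstants.defaultTimeoutMs)
}

// Heavy lifting lives in the utilities package. A real connector would
// issue the HTTP call here; this stand-in only describes the request.
object RestApiUtilities {
  def describeRequest(conf: RestApiClientConfiguration): String =
    s"GET ${conf.url} (timeout ${conf.timeoutMs} ms)"
}

// Reader facade: resolve the configuration, then delegate.
object RestApiConsumer {
  def read(props: Map[String, String]): String =
    RestApiUtilities.describeRequest(new RestApiClientConfiguration(props))
}
```

With these definitions, `RestApiConsumer.read(Map("gimel.restapi.url" -> "http://localhost:8080/items"))` returns `"GET http://localhost:8080/items (timeout 5000 ms)"`. The facade itself stays a few lines long because defaults and validation are resolved in the configuration class.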

Adding a new connector

Create New Module

  • Gimel Connectors are all inside the module gimel-dataapi/gimel-connectors.

  • Under module gimel-dataapi/gimel-connectors - add a new Maven module.
  • In the new module, add a reference to the parent POM of module gimel-dataapi.
  • Refer to this POM file for an example of adding the restapi connector. Example

Include the new module gimel-dataapi/gimel-connectors/new_connector as part of the parent module gimel-dataapi

  • In the POM of the parent module gimel-dataapi - add a reference to the new connector module.
  • Refer to this POM to see how a new connector is referenced in the parent POM, so that it becomes part of the build. Example
  • Build Gimel to ensure the new connector builds as part of the Gimel project. See Building Gimel.

Add the new module gimel-dataapi/gimel-connectors/new_connector as a dependency for module gimel-dataapi/gimel-core

Include the connector in core api - DataSet

  • Add a new value to the DataSet.Type enum. Example
  • Add a new reference to the NEW_STORAGE string, say RESTAPI in the case of the restapi connector. Example
  • Add a call to the core DataSet API in the DataSet Factory - new com.paypal.gimel.restapi.DataSet(sparkSession). Example
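
The three changes above can be pictured with the simplified sketch below. It is an illustration under assumptions, not Gimel's actual core code: `ConnectorDataSet`, `KafkaDataSet`, `RestApiDataSet`, `DataSetType` and `DataSetFactory` are stand-in names, and the real connector DataSet classes take a SparkSession in their constructor.

```scala
// Hypothetical stand-ins for the core types; names and shapes are assumptions.
trait ConnectorDataSet {
  def name: String
}

class KafkaDataSet extends ConnectorDataSet { val name = "kafka" }     // an existing connector
class RestApiDataSet extends ConnectorDataSet { val name = "restapi" } // the new connector

// Step 1: add a new value to the type enumeration.
object DataSetType extends Enumeration {
  val KAFKA, RESTAPI = Value
}

object DataSetFactory {
  // Step 2: map the storage string (for example "RESTAPI") to the new enum value.
  def typeOf(storage: String): DataSetType.Value = storage.toUpperCase match {
    case "KAFKA"   => DataSetType.KAFKA
    case "RESTAPI" => DataSetType.RESTAPI
    case other     => throw new IllegalArgumentException(s"Unsupported storage: $other")
  }

  // Step 3: instantiate the connector's DataSet for the new type.
  def getDataSet(dataSetType: DataSetType.Value): ConnectorDataSet = dataSetType match {
    case DataSetType.KAFKA   => new KafkaDataSet
    case DataSetType.RESTAPI => new RestApiDataSet
  }
}
```

In this sketch, `DataSetFactory.getDataSet(DataSetFactory.typeOf("restapi")).name` evaluates to `"restapi"`. A storage string with no matching case fails with an IllegalArgumentException, which is roughly what a connector that was never registered in the factory would look like to a caller.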

Documenting the new connector

  • Place any referenced images (must be .png) under the directory docs/images. Images must be pulled from official sites.
  • Add a new markdown file to docs/gimel-connectors. Example
  • Add a reference to the above markdown file in docs/ Example

Adding Standalone Support

  • To support local testing, you may also add Docker container support for the new storage.
  • Refer to this YAML file to see examples of Docker support for several storages. Docker Example
  • This is highly RECOMMENDED, as it lets you test the connector's entire feature set locally.

Testing the new connector

General Note

  • Ensure that the following components are tested, and that the results are captured in the pull request.
  • DataSet.write
  • com.paypal.gimel.sql.GimelQueryProcessor.executeBatch()
  • Options in both the read and write APIs must work in the following modes -
  • sql mode - set key=value
  • api mode - options passed as an argument of the read or write call: ("dataset_name", options)
  • CatalogProvider must be tested in the following modes -
  • gimel.catalog.provider=HIVE
  • gimel.catalog.provider=USER

  • The test results should be shared in a way that makes it clear how reviewers can replicate the testing locally.

Testing locally

Refer to the quickstart for Gimel in local mode, so that you can test the entire API on your laptop.

Final Step - Raise a pull request

  • Once you have completed the steps listed above, raise a PR.
  • When you raise a new PR, you will see the guidelines for a PR.
  • Here is the example PR for the restapi connector. Example