drone-fly's Introduction

A service which allows Hive metastore (HMS) MetaStoreEventListener implementations to be deployed in a separate context to the metastore's own.

Overview

Drone Fly is a distributed Hive metastore events forwarder service that allows users to deploy metastore listeners outside the Hive metastore service.

With the advent of event-driven systems, the number of listeners that a user needs to install in the metastore is ever increasing. These listeners can be either internal or provided by third-party tools for integration purposes. More and more processing is being added to these listeners to address various business use cases.

Adding these listeners directly to the classpath of your Hive metastore couples them to it and can lead to performance degradation or, in the worst case, take down the entire metastore (e.g. by running out of memory or through thread starvation). Drone Fly decouples your HMS from the event listeners by providing a virtual Hive context. The event listeners are placed on Drone Fly's classpath, and Drone Fly forwards the events it receives from the Kafka metastore listener on to the respective listeners.

Start using

A Terraform module for Kubernetes deployment is available here.

Docker images can be found in Expedia Group's dockerhub.

System architecture

The diagram below shows a typical Hive metastore setup without using Drone Fly. In this example, there are several HiveMetastoreListeners installed which send Hive events to other systems like Apache Atlas, AWS SNS, Apache Kafka and other custom implementations.

Hive Metastore setup without Drone Fly.

With Drone Fly, the setup changes as shown in the diagram below. The only listener installed in the Hive metastore context is the Apiary Kafka Listener, which forwards the Hive events on to Kafka, from which Drone Fly retrieves them. The other listeners are moved out into separate contexts. Drone Fly forwards the messages to them as if they were Hive events, so the listener code doesn't need to change at all.

Hive Metastore setup with Drone Fly.

For even further decoupling, Drone Fly can be set up to run in Docker containers, with each instance started with a single listener.

Usage

Using with Docker

To install a new HMS listener within the Drone Fly context, it is recommended that you build your Docker image using the Drone Fly base Docker image.

A sample image to install the Apiary-SNS-Listener would be as follows:

FROM expediagroup/drone-fly-app:0.0.1

ENV APIARY_EXTENSIONS_VERSION=6.0.1
ENV AWS_REGION=us-east-1

# Download the Apiary listener jar into /app/libs.
# The URL is quoted so the shell does not treat "?" as a glob character.
RUN cd /app/libs && \
    wget -q "https://search.maven.org/remotecontent?filepath=com/expediagroup/apiary/apiary-metastore-listener/${APIARY_EXTENSIONS_VERSION}/apiary-metastore-listener-${APIARY_EXTENSIONS_VERSION}-all.jar" \
         -O "apiary-metastore-listener-${APIARY_EXTENSIONS_VERSION}-all.jar"

Running Drone Fly Docker image

docker run --env APIARY_BOOTSTRAP_SERVERS="localhost:9092" \
           --env APIARY_LISTENER_LIST="com.expediagroup.sampleListener1,com.expediagroup.sampleListener2" \
           --env APIARY_KAFKA_TOPIC_NAME="dronefly" \
           expediagroup/drone-fly-app:<image-version>

The Drone Fly Terraform module can then be used to deploy your Docker image to Kubernetes.

Using Uber Jar

The Drone Fly build also produces an uber jar, so it can be started as a stand-alone Java service.

Running Drone Fly Jar

java -Dloader.path=lib/ -jar drone-fly-app-<version>-exec.jar \
    --apiary.bootstrap.servers=localhost:9092 \
    --apiary.kafka.topic.name=apiary \
    --apiary.listener.list="com.expediagroup.sampleListener1,com.expediagroup.sampleListener2"

The properties instance.name, apiary.bootstrap.servers, apiary.kafka.topic.name and apiary.listener.list can also be provided in a Spring properties file.

java -Dloader.path=lib/ -jar drone-fly-app-<version>-exec.jar --spring.config.location=file:///dronefly.properties

The parameter -Dloader.path is the path where Drone Fly will search for the configured HMS listeners.
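
As a rough illustration of the properties-file option, the sketch below writes a sample dronefly.properties. The property names come from the configuration reference in this README. The values are only the illustrative ones used in the commands above (the listener class names are placeholders), and the CONFIG_FILE variable is an assumption of this sketch, not something Drone Fly reads.

```shell
#!/usr/bin/env bash
# Sketch only: values mirror the examples in this README and are not a
# tested production configuration. CONFIG_FILE is a helper variable for
# this script, not a Drone Fly setting.
CONFIG_FILE="${CONFIG_FILE:-dronefly.properties}"

cat > "$CONFIG_FILE" <<'EOF'
# Kafka bootstrap servers that receive Hive metastore events (required)
apiary.bootstrap.servers=localhost:9092
# Kafka topic that receives Hive metastore events (required)
apiary.kafka.topic.name=apiary
# Comma-separated listener classes (placeholder names)
apiary.listener.list=com.expediagroup.sampleListener1,com.expediagroup.sampleListener2
# Also used to derive the Kafka consumer group; keep it unique per instance
instance.name=drone-fly
# Port on which the Spring Boot app starts
endpoint.port=8008
EOF

echo "Wrote $CONFIG_FILE"
```

The resulting file can then be passed to Drone Fly by pointing --spring.config.location at its location.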

Configuring Drone Fly

Drone Fly configuration reference

The table below describes all the available configuration values for Drone Fly.

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| apiary.bootstrap.servers | Kafka bootstrap servers that receive Hive metastore events. | string | n/a | yes |
| apiary.kafka.topic.name | Kafka topic name that receives Hive metastore events. | string | n/a | yes |
| apiary.listener.list | Comma-separated list of Hive metastore listeners to load from the classpath, e.g. com.expedia.HMSListener1,com.expedia.HMSListener2 | string | "com.expediagroup.dataplatform.dronefly.app.service.listener.LoggingMetastoreListener" | no |
| instance.name | Instance name for a Drone Fly instance. instance.name is also used to derive the Kafka consumer group. Therefore, in a multi-instance deployment, each Drone Fly instance needs a unique instance.name to avoid all instances ending up in the same Kafka consumer group. | string | drone-fly | no |
| endpoint.port | Port on which the Drone Fly Spring Boot app will start. | string | 8008 | no |
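
To make the instance.name requirement concrete, here is a minimal dry-run sketch for a one-listener-per-instance deployment. It only prints the launch commands and starts nothing. The function name print_instance_commands and the "drone-fly-<SimpleClassName>" naming scheme are assumptions made for this example. Passing --instance.name on the command line relies on the usual Spring Boot handling of command-line properties rather than on anything Drone Fly documents specifically.

```shell
#!/usr/bin/env bash
# Dry run: prints one launch command per listener instead of starting anything.
# The "drone-fly-<SimpleClassName>" naming scheme is an assumption of this sketch.
print_instance_commands() {
  local listener_list="$1" bootstrap="$2" topic="$3"
  local -a listeners
  local listener simple_name
  # Split the comma-separated list into an array.
  IFS=',' read -r -a listeners <<< "$listener_list"
  for listener in "${listeners[@]}"; do
    # Strip the package prefix to get a short suffix for the instance name.
    simple_name="${listener##*.}"
    echo "java -Dloader.path=lib/ -jar drone-fly-app-<version>-exec.jar" \
      "--apiary.bootstrap.servers=${bootstrap}" \
      "--apiary.kafka.topic.name=${topic}" \
      "--apiary.listener.list=${listener}" \
      "--instance.name=drone-fly-${simple_name}"
  done
}

print_instance_commands \
  "com.expediagroup.sampleListener1,com.expediagroup.sampleListener2" \
  "localhost:9092" "apiary"
```

Because each instance gets its own name, each one lands in its own Kafka consumer group and therefore receives every event. Note that two listeners sharing a simple class name in different packages would collide under this naming scheme.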

Metrics

Drone Fly exposes standard JVM and Kafka metrics in Prometheus format on the Spring Boot Actuator endpoint /actuator/prometheus.

Some useful metrics to track are:

system_cpu_usage
kafka_consumer_records_consumed_total_records_total
jvm_memory_committed_bytes
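
One quick way to spot-check these metrics is to filter the endpoint's output. The sketch below defines a small helper, filter_drone_fly_metrics (a hypothetical name), that keeps only the sample lines for the three metrics listed above. The commented curl example assumes an instance running on localhost and that the Actuator endpoint is served on the default endpoint.port of 8008, which this README does not state explicitly.

```shell
#!/usr/bin/env bash
# Reads Prometheus text-format metrics on stdin and keeps only the sample
# lines for the three metrics named above. Comment lines (# HELP, # TYPE)
# and other metrics are dropped.
filter_drone_fly_metrics() {
  grep -E '^(system_cpu_usage|kafka_consumer_records_consumed_total_records_total|jvm_memory_committed_bytes)([{ ]|$)'
}

# Example (assumes a local instance and the default port):
#   curl -s http://localhost:8008/actuator/prometheus | filter_drone_fly_metrics
```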

Legal

This project is available under the Apache 2.0 License.

Copyright 2020 Expedia, Inc.

drone-fly's People

Contributors

abhimanyugupta07, dependabot[bot], eg-oss-ci, hamzajugon, jaygreeeen, massdosage, patduin

drone-fly's Issues

Add an example listener

As a user of Drone Fly
I want an example listener that I can easily set up and configure without having to create my own
So that I can try out Drone Fly and verify that it works

Acceptance Criteria:

  • A LoggingListener created in drone-fly-app that uses slf4j to log the parameters passed in every method call.
  • README updated to describe this listener and how it can be used (how to configure, where to look for logs etc.) to test a new Drone Fly installation, making clear that this is just for debugging and verifying that it works etc.

Notes:
I'm 99% sure that if we don't provide a logging config file then the default behavior is to log to the console but we should double check.

Improve README file

As a developer of DroneFly,
I want to improve the README file
so that the documentation is clear and unambiguous.

Acceptance Criteria:

  1. Ensure all command line arguments and any other configuration options are clearly documented.
  2. Add sample configuration.
