
hdp-demo-bootstrap's People

Contributors

zacblanco

hdp-demo-bootstrap's Issues

Restructure as Ambari Service

To make a demo work on multiple platforms (sandbox and cluster), it would be ideal to structure the bootstrap as an Ambari service.

This provides the following benefits:

  • Easy multi-platform support
  • User configuration
  • Use of the Ambari UI
  • Easy start/stop of the service

Allow JSON strings for data generator

It would be useful to read a JSON string directly for the data generator, rather than always reading from a file.

I'm adding a way to simply create a JSON object from a string if a file isn't found.
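The fallback described above could look roughly like the sketch below. The function name `load_json` is hypothetical and not taken from the project; the only assumption is that the same argument may hold either a file path or a raw JSON document.

```python
import json
import os


def load_json(source):
    """Return parsed JSON from a file path, or from the string itself if no such file exists."""
    if os.path.isfile(source):
        with open(source) as handle:
            return json.load(handle)
    # Not a file on disk: treat the argument as a raw JSON document.
    return json.loads(source)
```

A string that is neither an existing file nor valid JSON raises `ValueError`, so a mistyped path fails loudly instead of silently producing an empty config.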

Adding config reader

We need a Python function to read a config file.

  • Going to use Python's ConfigParser module
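A minimal sketch of such a reader, assuming an INI-style file. The name `read_config` and the dict-of-dicts return shape are illustrative choices, not the project's actual interface. The module is called `ConfigParser` on Python 2 and `configparser` on Python 3, so the import tries both.

```python
try:
    import configparser  # Python 3 name
except ImportError:
    import ConfigParser as configparser  # Python 2 name


def read_config(path):
    """Read an INI-style file into a dict of {section: {option: value}}."""
    parser = configparser.ConfigParser()
    files_read = parser.read(path)  # returns the list of files parsed successfully
    if not files_read:
        raise IOError("Could not read config file: %s" % path)
    return {section: dict(parser.items(section)) for section in parser.sections()}
```

All values come back as strings, so callers that need a port number or a boolean have to convert it themselves.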

hdp-select package install

On clusters we may need to install the 'hdp-select' package provided by Hortonworks.

We'll need to find a way to install this package on CentOS/RHEL systems.

Possibly in the future we can look at Ubuntu systems.

Add Docs

It would be useful to document the features that are already implemented, so that others (if they end up writing custom scripts) can use them to their fullest extent.

Documentation would also help the project continue and make it easier for others to contribute.

Create a module for executing shell commands

It is possible to execute shell commands via something like

import subprocess
proc = subprocess.Popen('command')

We can also capture the output from the command with proc = subprocess.Popen('command', stdout=subprocess.PIPE)

To retrieve the output, simply call communicate() on the process object (not on the module):

output, _ = proc.communicate()

http://stackoverflow.com/questions/4256107/running-bash-commands-in-python
https://docs.python.org/2/library/subprocess.html

The purpose of this module is to provide a wrapper around subprocess that makes it simpler to call and wraps up the necessary functionality in just a few functions.

The Python 2 docs recommend subprocess32 (a backport of the Python 3 subprocess module).

  • Going to refrain from using this for now.
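One possible shape for the wrapper, using only the standard `subprocess` module. The function name `run` and its return tuple are hypothetical, not the project's API.

```python
import shlex
import subprocess


def run(command):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    # Accept either a single string ("ls -l /tmp") or a ready-made argument list.
    args = shlex.split(command) if isinstance(command, str) else list(command)
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()  # waits for the process to exit
    return proc.returncode, out.decode("utf-8"), err.decode("utf-8")
```

Splitting the string with `shlex` instead of passing `shell=True` avoids shell injection, at the cost of not supporting pipes or redirects inside the command string.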

Install NiFi to Sandbox or Cluster

Issue to track progress on this feature.

Based on the given config, we want to be able to:

  • Install NiFi as an Ambari service
  • Know the location of the install directory

Possibly integrate Ali's NiFi service (or create my own)

Generate code for queries based on data generator config

It's pretty simple, but it would be really handy to be able to build Hive and Spark statements from the data generator's config.

This makes it easy to create Spark and Hive tables instantly. The user just needs to copy/paste and run the queries.
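As a rough illustration of the idea, the sketch below builds a Hive `CREATE TABLE` statement from a list of field definitions. The config shape (dicts with `fieldName` and `type` keys) and the type mapping are assumptions made for this example. The real data generator config may look different.

```python
# Hypothetical mapping from generator field types to Hive column types.
HIVE_TYPES = {"string": "STRING", "int": "INT", "decimal": "DOUBLE", "boolean": "BOOLEAN"}


def hive_create_table(table_name, fields):
    """Build a CREATE TABLE statement from a list of field definitions."""
    columns = []
    for field in fields:
        hive_type = HIVE_TYPES.get(field["type"], "STRING")  # unknown types fall back to STRING
        columns.append("  `%s` %s" % (field["fieldName"], hive_type))
    return "CREATE TABLE IF NOT EXISTS `%s` (\n%s\n)" % (table_name, ",\n".join(columns))
```

Calling `hive_create_table("demo", [{"fieldName": "id", "type": "int"}])` returns a statement that can be pasted into a Hive shell as-is.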

Import Zeppelin Notebook

Given an existing Zeppelin install on a cluster:

  • Import any pre-made JSON notebooks from a given directory
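A sketch of how the import might work, assuming the Zeppelin version in use exposes the `/api/notebook/import` REST endpoint and needs no authentication. Both function names are hypothetical, and the base URL must be supplied by the caller.

```python
import glob
import os

try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2


def find_notebooks(directory):
    """Return the sorted paths of all .json files directly inside directory."""
    return sorted(glob.glob(os.path.join(directory, "*.json")))


def import_notebook(base_url, path):
    """POST one exported notebook to Zeppelin's import endpoint; return the HTTP status."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = Request(base_url.rstrip("/") + "/api/notebook/import", data=body,
                      headers={"Content-Type": "application/json"})
    return urlopen(request).getcode()  # supplying data makes this a POST
```

Importing a whole directory is then a loop over `find_notebooks(directory)` that calls `import_notebook` for each path.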

Be able to Start and Stop Ambari services

To mimic what vakshorton's demos are able to do, it would be best to add a way to stop/start/restart services in Ambari.

The aim is to eventually python-ize the functions in his install.sh and startDemoServices.sh scripts.
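Ambari's REST API changes a service's state with a `PUT` to the service resource, where `STARTED` means running and `INSTALLED` means stopped. The sketch below only builds the request. The function name `service_request` is hypothetical, and the cluster name, service name, and credentials are left to the caller.

```python
import json

# Ambari models "stopped" as INSTALLED and "running" as STARTED.
STATES = {"start": "STARTED", "stop": "INSTALLED"}


def service_request(base_url, cluster, service, action):
    """Return (url, headers, body) for a PUT that starts or stops a service."""
    url = "%s/api/v1/clusters/%s/services/%s" % (base_url.rstrip("/"), cluster, service)
    headers = {"X-Requested-By": "ambari"}  # Ambari rejects modifying calls without this header
    body = json.dumps({
        "RequestInfo": {"context": "%s %s via REST" % (action.capitalize(), service)},
        "Body": {"ServiceInfo": {"state": STATES[action]}},
    })
    return url, headers, body
```

The tuple can be sent with any HTTP client using basic auth. A restart is a stop followed by a start, once the stop request has completed.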

Import NiFi templates

Given a directory for NiFi templates:

  • Upload templates to the cluster using any means (possibly SCP?)

Improve Module documentation

I want the docs for demo_utils to be more maintainable. It would also be nice if I could document the functions inside the demo-files as well.

I think Sphinx does a good job at this. I just need to work out the details. A proof of concept is working locally.

Hopefully this will make the package somewhat more maintainable.

The ability to search the docs is also pretty valuable.

Improve Code Coverage

Need to clean up some of the coverage on the modules that have been modified. (Ambari and Logs)

Add Logging

Logging should be essential to the install process. Need to find a good way to add it in and make it configurable.
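One way to make logging configurable is a small helper on top of the standard `logging` module, with the level and an optional log file passed in (for example from the config file). The name `get_logger` and its parameters are illustrative only.

```python
import logging


def get_logger(name, level="INFO", log_file=None):
    """Return a logger writing to log_file (or stderr) at the given level."""
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level.upper()))
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(log_file) if log_file else logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

Each module can then call `get_logger(__name__, level)` once and log install steps with `logger.info(...)`.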
