
docker-hive's People

Contributors

fredrikhgrelland, zhenik

docker-hive's Issues

Roadmap contexts

Contexts

1. Test & Doc

2. Consistency & Resilience

  • Canary deployments & Rolling upgrades #35
  • Volumes (nomad context) #37
  • Backup and recovery on container failure (#38)
  • Deploy with an attached (existing) volume (part of #37)
  • ... (you name it)

3. Security

  • Encryption in transit, service-mesh communication management (covered by side-car proxies and intentions) #44
  • Delegate the admin role to Vault to generate dynamic credentials with a short TTL (secrets engine) (#42 #40 #41 #39)
  • minio kv #16
  • postgres kv #12
  • Certificate generation via Vault PKI #337
  • Encryption at rest #46
  • Policies (Admin/Producer/Consumer) #397
  • ... (you name it)

4. Scaling

  • Deployment strategy (master-slave, horizontal, etc...)
  • Dynamic/static scaling
  • ... (you name it)

5. Observability. How/Where to expose. Additional nodes

  • Metrics (Prometheus, ...) and pre-built dashboards (Grafana examples)
  • Logs (Splunk, LogStash, ...)
  • Traces (Zipkin, OpenCensus, ...)
  • ... (you name it)

6. Cover needs of other teams

  • Add new module (kafka, nifi, etc...)
  • ... (you name it)

7. Enterprise features

  • Namespaces #43
  • consul namespaces #346
  • vault namespaces #353
  • Sentinel policies #45
  • ... (you name it)

[Epic][Enterprise]: Namespaces

Description

Namespaces are an Enterprise feature implemented in Consul, Nomad and Vault.

The main idea is isolation in a shared cluster (teams, deployments, services, policies, etc.).

Acceptance Criteria

    • Provide isolation in consul (resource isolation) #346
    • Provide isolation in nomad (with different users, teams and different deployments)
    • Provide isolation in vault (resource isolation) WIP #357

References

Feature idea

Main idea consul

Consul namespaces allow global operators to create isolated environments in a shared cluster and apply any required service access restrictions for authenticated users.

Main idea nomad

Namespaces enhance the usability of a shared cluster by isolating teams from the jobs of others, by providing fine grain access control to jobs when coupled with ACLs, and by preventing bad actors from negatively impacting the whole cluster.

Main idea vault

Many organizations implement Vault as a "service", providing centralized management for teams within an organization while ensuring that those teams operate within isolated environments known as tenants.

Backup and recovery

  • Learn feature and usage (understand how it differs from canary deployments)
  • Add feature to repository
  • Tests
    • Test that backup works
    • Test that recovery works
    • Simulate failure (what happens if it dies unexpectedly)
  • Documentation

Namespaces

Requires changing prebuild switches in vagrant-hashistack.

TODO

  • Learn feature and usage. Understand the differences between namespaces in the consul, nomad and vault contexts
  • Run enterprise versions of Consul, Nomad & Vault
  • Add this feature to nomad-job in the terraform-nomad module (probably need to build another example in the module(s))
    • create 2 namespaces
    • deploy "things" in different contexts
    • verify no collisions (in service names, for example) and resource isolation (service 1 in namespace A cannot access service 1 in namespace B)
  • Documentation

References

CD/CI for vagrant

  • conditional ansible provisioning
  • test verification: consul healthcheck
  • github actions (separate issue)

Missing dependency `org.openx.data.jsonserde.JsonSerDe`. JSON support

Steps to reproduce

Create a table

CREATE EXTERNAL TABLE my_table (
  description STRING,
  foo STRUCT<bar: STRING, quux: STRING, level1: STRUCT<l2string: STRING, l2struct: STRUCT<level3: STRING>>>,
  wibble STRING,
  wobble ARRAY<STRUCT<entry: INT, EntryDetails: STRUCT<details1: STRING, details2: INT>>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3a://hive/warehouse/json/';

Data sample

{
  "description": "my doc",
  "foo": {
    "bar": "baz",
    "quux": "revlos",
    "level1": {
      "l2string": "l2val",
      "l2struct": {
        "level3": "l3val"
      }
    }
  },
  "wibble": "123",
  "wobble": [
    {
      "entry": 1,
      "EntryDetails": {
        "details1": "lazybones",
        "details2": 414
      }
    }
  ]
}
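
The sample above is pretty-printed across several lines. Hive text input is normally read line by line, so a JSON SerDe would usually expect each record on a single line; assuming that holds for this setup, the following minimal sketch (the helper name to_json_lines is hypothetical) shows one way to compact the sample before uploading it to the table location.

```python
import json


def to_json_lines(documents):
    """Serialize each document as compact JSON, one document per line."""
    return "\n".join(
        json.dumps(doc, separators=(",", ":")) for doc in documents
    )


# The data sample from this issue, as a Python dictionary.
sample = {
    "description": "my doc",
    "foo": {
        "bar": "baz",
        "quux": "revlos",
        "level1": {"l2string": "l2val", "l2struct": {"level3": "l3val"}},
    },
    "wibble": "123",
    "wobble": [
        {"entry": 1, "EntryDetails": {"details1": "lazybones", "details2": 414}}
    ],
}

# Prints the whole record on a single line.
print(to_json_lines([sample]))
```

This only changes the layout of the data; it does not address the missing SerDe class reported in the log.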

Log

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.openx.data.jsonserde.JsonSerDe
hive> 

UserStory[Security]: Generate password for minio from Vault

Description

As of today we use user-provided secrets to access minio; going forward, we would like to generate these secrets from Vault. The KV Secrets Engine could be used to achieve this.

Acceptance Criteria

  1. Remove the hardcoded minio_access_key and minio_secret_key
  2. Generate an on-demand password with Vault
  3. Add tests
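
The first two criteria could be approached roughly as follows. This is a minimal sketch, not the module's implementation: it assumes a KV version 2 secrets engine mounted at secret/, a hypothetical secret path minio, hypothetical field names access_key and secret_key, and hypothetical helper names. It only builds the HTTP write request and does not send it.

```python
import json
import secrets
import urllib.request


def generate_minio_credentials():
    """Create a random access key / secret key pair."""
    return {
        "access_key": secrets.token_hex(10),      # 20 hex characters
        "secret_key": secrets.token_urlsafe(30),  # 40 URL-safe characters
    }


def build_kv_write_request(vault_addr, token, credentials,
                           mount="secret", path="minio"):
    """Build (but do not send) a KV version 2 write request.

    The mount and path defaults are assumptions made for this example.
    """
    body = json.dumps({"data": credentials}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr}/v1/{mount}/data/{path}",
        data=body,
        method="POST",
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
    )


credentials = generate_minio_credentials()
request = build_kv_write_request("http://127.0.0.1:8200", "example-token", credentials)
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen(request) would store the generated pair in Vault, from where the job could read it instead of relying on hardcoded values.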

Docker image tag for local testing

Problem description

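docker build fails with "invalid reference format" when the branch name is used as the image tag and the branch name contains a "/" (for example feature/refactor-to-r.0.2.2), because "/" is not allowed in a Docker tag.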

~/Makefile

branch = $(shell git rev-parse --abbrev-ref HEAD)

build: custom_ca
	docker build . -t local/hive:$(branch)
	docker tag  local/hive:$(branch) local/hive:latest

Log

 ~/src/github.com/zhenik/docker-hive make build                                                                                                                  
docker build . -t local/hive:feature/refactor-to-r.0.2.2
invalid argument "local/hive:feature/refactor-to-r.0.2.2" for "-t, --tag" flag: invalid reference format
See 'docker build --help'.
make: *** [build] Error 125

Suggestion

I would suggest using the last commit's hash instead:

~/Makefile

branch = $(shell git rev-parse --verify HEAD)
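
If keeping the branch name in the tag is still preferred, another option is to sanitize it. The following minimal sketch (the helper name docker_tag_from_ref is hypothetical) assumes the documented Docker tag rules: letters, digits, underscores, periods and hyphens only, no leading period or hyphen, and at most 128 characters.

```python
import re


def docker_tag_from_ref(ref):
    """Map a git ref name such as 'feature/x' to a string usable as a Docker tag."""
    # Replace every character outside [A-Za-z0-9_.-] (for example '/') with '-'.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", ref)
    # A tag must not start with '.' or '-', so strip those from the front.
    tag = tag.lstrip(".-")
    # Tags are limited to 128 characters; fall back to 'latest' if nothing is left.
    return tag[:128] or "latest"


print(docker_tag_from_ref("feature/refactor-to-r.0.2.2"))
# prints feature-refactor-to-r.0.2.2
```

Unlike a commit hash, a sanitized branch name does not identify a single build, so two different branches (for example feature/x and feature-x) can map to the same tag.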

Canary deployments

Canary deployments & Rolling upgrades

Link to official documentation

  • Learn feature and usage
  • Add this feature to nomad-job in the terraform-nomad module (probably need to build another example in the module(s))
  • Test (could be tricky)
    • simulate failures (for example, stop a container with docker inside vagrant-hashistack, or use a tool such as chaos-monkey)
    • test that blue/green deployments work
    • simulate update of service (optional)
    • test that the new version was deployed
  • Documentation

Volumes (nomad context)

Prerequisites

Nomad supports several task drivers to deploy "things".
We are focusing on the docker driver.
TL;DR: nomad has control of the docker host, and the docker host supports volumes.
This gives an opportunity to manage docker volumes via a nomad-job.

TODO

Link to official documentation

  • Learn feature and usage
  • Add this feature to a nomad-job in a terraform-nomad module that requires stored state (probably need to build another example in the module(s))
  • Test (an idea)
    • Run nomad-job (module)
    • Update storage state (create additional users, add records to the database, store some new data)
    • Stop the instance: nomad job stop -purge <nomad-job>
    • Start a new instance with the existing volume and verify that the data (state) previously stored in the volume is still in place (exists, hasn't changed, ...)
    • Do more tests ... More ideas will probably come up while reading the documentation

NB: docker volumes are stored in the docker host file system, that is, inside the vagrant-hashistack box.
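
The "verify that the data is still in place" step of the test idea could be sketched as a digest comparison over the volume directory, taken before the job is stopped and again after the new instance has started. The helper name directory_digest is hypothetical, and the temporary directory below only stands in for the real volume path on the docker host. A byte-for-byte comparison may be too strict for services that rewrite their files on startup, so treat this as a starting point.

```python
import hashlib
import tempfile
from pathlib import Path


def directory_digest(root):
    """Return a SHA-256 digest over the relative paths and contents of all files under root."""
    digest = hashlib.sha256()
    root = Path(root)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


# Stand-in for the volume directory; the real path depends on the setup.
with tempfile.TemporaryDirectory() as volume:
    (Path(volume) / "users.txt").write_text("alice\n")
    before = directory_digest(volume)
    # ... stop the job and start a new instance against the same volume here ...
    after = directory_digest(volume)
    print("state preserved:", before == after)
```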
