cm-druid's People

Contributors

knoguchi

cm-druid's Issues

Add configuration validator

Some configuration keys depend on other keys, and some are mutually exclusive.

  • Research how other CSDs validate them.
  • Create a matrix of configuration key dependencies
  • Prioritize them
  • Implement
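
As a starting point for discussion, a validator could take the dependency matrix as plain data and report every violated rule. The sketch below is only an illustration: the `Rules` structure, the `validate` function, and the key names used in the usage note are hypothetical and are not taken from the CSD.

```python
from dataclasses import dataclass, field


@dataclass
class Rules:
    # key -> keys that must also be set whenever `key` is set
    requires: dict = field(default_factory=dict)
    # groups of keys of which at most one may be set
    exclusive: list = field(default_factory=list)


def validate(config, rules):
    """Return a list of human-readable violations (empty list means valid)."""
    errors = []
    # Treat None and the empty string as "not set".
    present = {k for k, v in config.items() if v not in (None, "")}
    for key, needed in rules.requires.items():
        if key in present:
            for dep in needed:
                if dep not in present:
                    errors.append(f"{key} requires {dep}")
    for group in rules.exclusive:
        chosen = sorted(present & set(group))
        if len(chosen) > 1:
            errors.append("mutually exclusive: " + ", ".join(chosen))
    return errors
```

With hypothetical keys, `validate({"a": "1"}, Rules(requires={"a": ["b"]}))` reports that `a` requires `b`. Keeping the rules as data means the matrix from the research step can be loaded directly instead of being hard-coded.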

add OS and versions

Currently the only supported OS is Red Hat Enterprise Linux 6 ("el6"), which is defined in config.mk.
Add the other OSes and versions that CDH supports.

Possible distro suffixes

  • el5: Red Hat Enterprise Linux 5 and clones (CentOS, Scientific Linux, etc.)
  • el6: Red Hat Enterprise Linux 6 and clones (CentOS, Scientific Linux, etc.)
  • el7: Red Hat Enterprise Linux 7 and clones (CentOS, Scientific Linux, etc.)
  • sles11: SuSE Linux Enterprise Server 11.x
  • lucid: Ubuntu Linux 10.04 LTS (No CDH 5.x parcel provided)
  • precise: Ubuntu Linux 12.04 LTS
  • trusty: Ubuntu Linux 14.04 LTS (Newly supported in CM 5.2. No CDH 4.x parcel provided)
  • squeeze: Debian 6.x (No CDH 5.x parcel provided)
  • wheezy: Debian 7.x (Newly supported in CM 5.0. No CDH 4.x parcel provided)

PostgreSQL support

I have not had a chance to use your parcel yet, but from the code it appears that only MySQL is supported.

Could you also add support for PostgreSQL? Both Druid and Cloudera can work with PostgreSQL.

add rolling restart

Druid roles must be restarted in a rolling fashion after a configuration change or a Druid upgrade.

  • Research which changes require a restart, which roles are affected, and the restart order
  • Create restart graph?
  • Execute the restart graph?

Example:

  • A realtime indexing node with 2 replicas (2 nodes in total): the two must not be restarted at the same time
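
One simple way to honor the constraint in the example is to restart exactly one instance at a time and to wait for it to become healthy before moving on. The sketch below assumes this strategy; the function names, the `(role, host)` tuples, and the role order passed in are all hypothetical, and the real restart order still needs the research listed above.

```python
def plan_rolling_restart(instances, role_order):
    """Order (role, host) pairs so that roles are handled in `role_order`.

    Each plan step restarts a single instance, so two replicas of the
    same role are never down at the same time.
    """
    plan = []
    for role in role_order:
        plan.extend((r, h) for r, h in instances if r == role)
    return plan


def run_plan(plan, restart, is_healthy):
    """Execute the plan step by step; stop at the first unhealthy instance."""
    done = []
    for role, host in plan:
        restart(role, host)
        if not is_healthy(role, host):
            raise RuntimeError(f"{role} on {host} did not come back; aborting")
        done.append((role, host))
    return done
```

`restart` and `is_healthy` are callbacks supplied by the caller, so the same loop can be exercised in a test with stubs before it is wired to a real restart mechanism.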

define versioning scheme

Define a versioning scheme for the following cases.

  • CDH upgrade itself
  • Druid upgrade
  • Parcel bug fix
  • CSD bug fix

Define release-candidate tags and a production tag.

Parcels
There doesn't seem to be a standard, but most Cloudera parcels follow this pattern:

<parcel name>-<original version>.cdh<cdh version>.<patch version>.<build version>

Example: the parcel version for Druid 0.9.2 on CDH 5.8.0 should be
DRUID-0.9.2.cdh5.8.0.p0.1
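
To keep the scheme consistent across builds, the version string could be generated rather than typed by hand. The sketch below follows the pattern above literally and renders the patch version as `p<N>`, as in the example. The function names and the default patch and build numbers are hypothetical. `parcel_filename` appends a distro suffix in the `<version>-<distro>.parcel` form that Cloudera parcel files use.

```python
def parcel_version(name, upstream, cdh, patch=0, build=1):
    """<parcel name>-<original version>.cdh<cdh version>.<patch version>.<build version>"""
    return f"{name}-{upstream}.cdh{cdh}.p{patch}.{build}"


def parcel_filename(version, distro):
    """Append a distro suffix such as el6 or trusty to a parcel version."""
    return f"{version}-{distro}.parcel"


version = parcel_version("DRUID", "0.9.2", "5.8.0")
print(version)                          # DRUID-0.9.2.cdh5.8.0.p0.1
print(parcel_filename(version, "el6"))  # DRUID-0.9.2.cdh5.8.0.p0.1-el6.parcel
```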

CSD
The examples I have collected follow this pattern:
<service type>-<cdh version>

Example:
DRUID-5.8.0

But I'm not sure this is expressive enough. I guess one service type can support only one version within a CDH release, which is why only the CDH version appears. Needs research.

Force IPv4 only

Unfortunately, CDH 5 supports IPv4 only; the CDH installation manual states this explicitly.
However, we do not always have control over the IP stack configuration.

Many CDH services are started with -Djava.net.preferIPv4Stack=true, regardless of whether the service itself supports IPv6. As a result, Druid fails to connect to ZooKeeper when IPv6 is enabled.

Add the same config to Druid.
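
A minimal sketch of how the flag could be added to a role's JVM options without duplicating it or leaving a conflicting value in place. The helper name `with_ipv4_only` is hypothetical, and how the options string reaches the Druid start script is not shown here.

```python
IPV4_FLAG = "-Djava.net.preferIPv4Stack=true"


def with_ipv4_only(jvm_opts):
    """Return jvm_opts with exactly one preferIPv4Stack setting, forced to true."""
    # Drop any existing setting of the property, whatever its value.
    opts = [o for o in jvm_opts.split()
            if not o.startswith("-Djava.net.preferIPv4Stack=")]
    opts.append(IPV4_FLAG)
    return " ".join(opts)


print(with_ipv4_only("-Xmx1g"))  # -Xmx1g -Djava.net.preferIPv4Stack=true
```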

add monitoring metrics and alerts

Monitoring is essential for operations and for optimal resource allocation.
I would like to use Druid emitters to push the metrics to CDH.

Metrics

http://druid.io/docs/latest/operations/metrics.html

Alerts

http://druid.io/docs/latest/operations/alerts.html

add pull-deps command for plugins

For now, please include the necessary extensions under the extensions directory when you build the parcel. Currently the extensions are loaded from the /opt/parcels/DRUID directory. Changing the extension load list does not automatically pull the extensions and their dependencies.

I don't think it's feasible to create a giant parcel that includes all the extensions. Moreover, I have a custom extension that is not part of the Druid distribution, so I need a way to "add on" extensions. pull-deps can pull extensions from Maven repositories.

http://druid.io/docs/latest/operations/pull-deps.html
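
For reference, a sketch of how the CSD scripts might assemble the pull-deps invocation from an extension list. It only builds the argument list and does not run anything. The helper name, the lib directory in the usage note, and the main class default (`io.druid.cli.Main`, as used by Druid releases of the 0.9.x era) are assumptions to verify against the Druid version in the parcel. `-c` is the pull-deps option for a Maven coordinate.

```python
def pull_deps_command(lib_dir, coordinates, main_class="io.druid.cli.Main"):
    """Build the argv for `tools pull-deps`, one -c option per extension."""
    # The trailing /* is a Java classpath wildcard, expanded by the JVM itself.
    cmd = ["java", "-classpath", f"{lib_dir}/*", main_class, "tools", "pull-deps"]
    for coordinate in coordinates:
        cmd += ["-c", coordinate]
    return cmd
```

For example, `pull_deps_command("/opt/parcels/DRUID/lib", ["io.druid.extensions:mysql-metadata-storage:0.9.2"])` yields a list that can be passed to `subprocess.run`. Where the pulled extensions should be written is the open question discussed next.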

Where to save the extensions?

  • /opt/parcels/DRUID/extensions is a good choice because all the roles can share the extensions, but I'm not sure the cloudera-scm user can write to that directory.
  • Creating a separate parcel might technically be the right way, but it's a maintenance nightmare.
  • A per-process directory would work, but pulling extensions at every config change is a real pain, e.g. /var/run/cloudera-scm-agent/processes/XXX-druid-overlord/..

Suggest optimal configuration and safeguards

When you allocate roles to the servers, the wizard should recommend values.
Whenever the configuration is updated, it should check that the new values do not break operation.

  • List the critical configuration keys and their formulas.
  • Prioritize them
  • Implement

Example:

  • The total number of threads of all the roles on a host must be less than the number of cores on that host.
  • Max direct memory + heap must be greater than XXX * number of threads, where XXX differs per role. The role fails to start otherwise.
  • The total number of peons * (max heap + direct memory) must be less than the total RAM.
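
The three example rules can be sketched as a per-host check. Everything named here is hypothetical: the `Role` fields, the `check_host` function, and in particular `per_thread`, which stands in for the role-specific XXX factor that still has to be determined. The peon rule is read as peons * (heap + direct memory) per peon, which is one interpretation of the example above.

```python
from dataclasses import dataclass


@dataclass
class Role:
    name: str
    threads: int
    heap: int          # max heap, bytes
    direct: int        # max direct memory, bytes
    per_thread: int    # placeholder for the role-specific "XXX" factor, bytes
    peons: int = 0     # number of peons started by this role, if any
    peon_heap: int = 0
    peon_direct: int = 0


def check_host(cores, ram, roles):
    """Return a list of problems for one host (empty list means OK)."""
    problems = []
    total_threads = sum(r.threads for r in roles)
    if total_threads >= cores:
        problems.append(f"{total_threads} threads on {cores} cores")
    for r in roles:
        if r.heap + r.direct <= r.per_thread * r.threads:
            problems.append(f"{r.name}: heap + direct memory too small")
        if r.peons and r.peons * (r.peon_heap + r.peon_direct) >= ram:
            problems.append(f"{r.name}: peons need more memory than total RAM")
    return problems
```

Returning all problems at once, rather than failing on the first, lets both the wizard and a configuration-change check show the complete list to the operator.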
