cdh-ansible is an Ansible-based deployer for Cloudera's Distribution of Apache Hadoop (CDH) on Red Hat/CentOS nodes.
- Easy to configure (basic configuration only requires editing the nodes.yml file)
- Operating System level tuning
- Active Directory / LDAP integration and SSSD
- Customizable CDH version installer
- Cloudera Manager installer on Service Nodes
You can also:
- Launch cloud infrastructure using Terraform
- Configure deployment to multiple environments
cdh-ansible uses a number of open source projects to work properly:
- Ansible - Ansible is a radically simple IT automation system.
- Terraform - Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently.
The files specified below should be configured:
File | Description |
---|---|
/group_vars/environment/nodes.yml | Configure your cluster's topology |
This file describes your environment, in particular the details of each node in the cluster. Its syntax is:

    nodes:
      <NODE_NAME>:
        disks:
          - {dev: <DEVICE_NAME_IN_OS>, fs: <CREATE_FS_ON_DEV>, mount: <MOUNT_POINT>, size: <SIZE_IN_GB>, type: <DEV_TYPE>, volume: <REQUEST_VOLUME>}
        flavor: <CLOUD_VENDOR_NODE_FLAVOR>
        fqdn: <NODE_FQDN>
        role: <NODE_ROLE>
      <NODE_NAME>:
        ...

Variable | Description | Allowed values |
---|---|---|
NODE_NAME | Node name | Example: master01 |
DEVICE_NAME_IN_OS | Device name | Example: vda,vdb |
CREATE_FS_ON_DEV | Whether to create a filesystem on the device | true,false |
MOUNT_POINT | Where to mount the device at OS-level | Example: /data/device1 |
REQUEST_VOLUME | (Optional) Whether Terraform should create a volume for the node | true,false (Default: false) |
SIZE_IN_GB | (Optional) Volume size in GB, used by Terraform when requesting a new volume | Example: 50, 250 |
DEV_TYPE | (Optional) Volume type, used by Terraform when requesting a new volume | Example: ssd,hdd |
CLOUD_VENDOR_NODE_FLAVOR | (Optional) Instance flavor used by Terraform to launch the node at the cloud vendor | Example: m4.xlarge (for AWS) |
NODE_FQDN | Node's FQDN, used by Ansible to auto-generate the /etc/hosts file | Example: master01.mycluster.int |
NODE_ROLE | Node's role in the cluster | service,master,worker,edge |
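
As an illustration of the syntax above, a filled-in nodes.yml might look like the sketch below. It reuses the example values from the table where they exist. The second node name (`worker01`), its FQDN, and the choice of `vdb` as the data disk are hypothetical. The second node leaves out the fields the table marks as optional, on the assumption that they are only needed when Terraform provisions the infrastructure.

```yaml
# Hypothetical two-node topology; adjust names, devices and sizes to your environment.
nodes:
  master01:
    disks:
      # Data disk: create a filesystem, mount it, and ask Terraform for a 250 GB SSD volume.
      - {dev: vdb, fs: true, mount: /data/device1, size: 250, type: ssd, volume: true}
    flavor: m4.xlarge              # only used by Terraform (AWS flavor from the table)
    fqdn: master01.mycluster.int   # used to generate /etc/hosts
    role: master
  worker01:
    disks:
      # Optional Terraform fields (size, type, volume) omitted; volume defaults to false.
      - {dev: vdb, fs: true, mount: /data/device1}
    fqdn: worker01.mycluster.int
    role: worker
```
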
cdh-ansible requires Ansible v2.0.2+ to run.
$ ansible-playbook run-setup.yml -e env=development
For production environments:
$ ansible-playbook -vvvv run-setup.yml -e env=production
(Optional) If you have configured Terraform to create the infrastructure:
$ ansible-playbook -vvvv run-create-infrastructure.yml -e env=production
- Simplify configuration
MIT