tendrl / gluster-integration


Extracts all data from a Gluster cluster for consumption by Tendrl

License: GNU Lesser General Public License v2.1

Python 99.67% Makefile 0.33%

gluster glusterfs management tendrl

gluster-integration's Introduction

Tendrl API

Table of Contents

Build Status
Installation from Source on CentOS 7
Development Environment
Test Environment
Running on Port 80

Installation from Source on CentOS 7

Note	All the commands are run as a regular user that has `sudo` privileges. The commands are all assumed to be run from a single directory, which by default could be the user’s home directory. If different, the required current directory is indicated in `[]` before the shell prompt `$`.

Ensure that etcd is running on a node in the network and is reachable from the node you’re about to install tendrl-api on. Note its address and port. In most development setups, both etcd and tendrl-api would reside on the same host.

System Setup

Install the build toolchain.

$ sudo yum groupinstall 'Development Tools'

Install Ruby 2.0.0p598.

$ sudo yum install ruby ruby-devel rubygem-bundler

Install tendrl-api

Clone tendrl-api.

$ git clone https://github.com/Tendrl/tendrl-api.git

Install the gem dependencies, either..

$ cd tendrl-api

everything,

[tendrl-api] $ bundle install --path vendor/bundle --binstubs vendor/bin

OR development setup only,

[tendrl-api] $ bundle install --path vendor/bundle --binstubs vendor/bin \
               --without production

OR production setup only.

[tendrl-api] $ bundle install --path vendor/bundle --binstubs vendor/bin \
               --without development test documentation

Note	Using binstubs allows any of the executables to be executed directly from `vendor/bin`, instead of via `bundle exec`.

Configuration

To configure the etcd connection information, copy the sample configuration file to the appropriate location and make the necessary changes based on your etcd configuration, as discussed in the Deployment Requirements section.

[tendrl-api] $ cp config/etcd.sample.yml config/etcd.yml

Development Environment

Note	All the commands below are assumed to be run from inside the git checkout directory.

Tendrl Definitions:

The API needs the proper Tendrl definitions yaml file to generate the attributes and actions. You can either download it or use the one from the fixtures to explore the API.
```
[tendrl-api] $ cp spec/fixtures/sds/tendrl_definitions_gluster-3.8.3.yaml \
               config/sds/tendrl_definitions_gluster-3.8.3.yaml
```
Seed the etcd instance (optional):

The script will seed the etcd instance with mock cluster data and print a cluster uuid which can be used to make API requests.
```
[tendrl-api] $ vendor/bin/rake etcd:seed # Seed the local store with cluster
```
Start the development server:

This server will reload itself when any of the source files are updated.
```
[tendrl-api] $ vendor/bin/shotgun
```
Note
This makes the development server queryable on localhost:9393 by default. Run vendor/bin/shotgun --help to see how to change the ip:port binding.

Test Environment

The test environment does not need the local etcd instance to run the tests.

[tendrl-api] $ vendor/bin/rspec

Running on Port 80

Binding to port 80 requires root permissions. However, tendrl-api runs as a normal user. In order to make the application available on port 80, apache needs to be installed and configured.

Install apache
```
$ sudo yum install httpd
```

Copy over the sample configuration file and validate its syntax.

Important

Update the file for your specific host details. The file is commented to point out the suggested changes. The file is configured to connect to the tendrl-api application server on port 9292.

Important

Running behind apache makes the API available at http://<hostname>:80/api/. The configuration of client applications (including the tendrl frontend) needs to be updated to send all API queries to this endpoint.

[tendrl-api] $ sudo cp config/apache.vhost.sample \
               /etc/httpd/conf.d/tendrl.conf
$ sudo apachectl configtest

Update the SELinux configuration to allow apache to make connections.
```
$ sudo setsebool -P httpd_can_network_connect 1
```

Run the application via the production server puma, daemonised, listening on port 9292.

[tendrl-api] $ vendor/bin/puma -e development -d

Note	It is possible to run both the development and the production servers at the same time, with the production server behind apache. The production server `puma` runs on port 9292 by default, while the development server `shotgun` listens on port 9393.

Start apache.
```
$ sudo systemctl start httpd.service
```

gluster-integration's Issues

Config file is missing

Unfortunately, it looks like the gluster_bridge config file was never added to the repository. We have to fix this issue.

Add all volume related fields from glusterd-state to etcd

The current output from gluster get-state command also contains volume options details.
We should update the code to add the volume option details to the central store.

Fix package version issue

The package version should come from tendrl/gluster_integration/__init__.py, and it should be updated automatically from GitHub when we run make gitversion.

install instructions documentation

There should be a document which describes:

what software should be installed and how it should be configured before installing Tendrl
the Tendrl installation process
Tendrl post-installation steps such as configuration, starting services and so on

[was originally at https://tendrl.atlassian.net/browse/TEN-36]

KeyError: 'parameters'

When the tendrl-gluster-integration command is launched, it continuously prints:

ERROR - Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl/gluster_integration/manager/rpc.py", line 161, in _run
    self._server.run()
  File "/usr/lib/python2.7/site-packages/tendrl/gluster_integration/manager/rpc.py", line 83, in run
    self._acceptor()
  File "/usr/lib/python2.7/site-packages/tendrl/gluster_integration/manager/rpc.py", line 70, in _acceptor
    if "etcd_client" in raw_job['parameters']:
KeyError: 'parameters'
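
One way to avoid this crash, sketched under the assumption that `raw_job` is a plain dict decoded from the job queue, is to look the key up with `dict.get` so that a job without `parameters` is handled instead of raising. The helper name `job_has_etcd_client` and the sample jobs are hypothetical and not part of the project.

```python
def job_has_etcd_client(raw_job):
    """Return True if the job carries an "etcd_client" parameter.

    Assumes raw_job is a dict. A missing or null "parameters" entry is
    treated as "no parameters" instead of raising KeyError.
    """
    parameters = raw_job.get("parameters") or {}
    return "etcd_client" in parameters


# Hypothetical jobs: one with parameters, one without.
print(job_has_etcd_client({"parameters": {"etcd_client": "..."}}))
print(job_has_etcd_client({"status": "new"}))
```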

[rpm packaging] configuration files are not marked as configuration

Package tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm from https://copr-be.cloud.fedoraproject.org/results/tendrl/tendrl/epel-7-x86_64/ copr doesn't mark configuration files as configuration.

This breaks Fedora packaging policy and will have impact on user changes in
affected configuration files during package upgrades.

Reproducer

# rpm -ql tendrl-gluster-integration | grep ^/etc
/etc/tendrl/gluster_integration
/etc/tendrl/gluster_integration_logging.yaml
# rpm -qc tendrl-gluster-integration | grep ^/etc
#

While expected result would be:

# rpm -ql tendrl-gluster-integration | grep ^/etc
/etc/tendrl/gluster_integration
/etc/tendrl/gluster_integration_logging.yaml
# rpm -qc tendrl-gluster-integration | grep ^/etc
/etc/tendrl/gluster_integration_logging.yaml
#

for more information see: Tendrl/commons#72

check_commit_msg.py displays a misleading error message

The check_commit_msg.py script complains that the commit message should be of the form:

'tendrl-bug-id:<tendrl_repo>/issue_id'

It should probably say :

tendrl-bug-id:Tendrl/<tendrl_repo_name>#<issue_id>

Fix service start issue

The service failed to start because an environment variable is used in one of the tendrl configuration files. The service configuration file cannot expand environment variables such as $HOME or $PWD in a path.
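
One possible approach, offered as an assumption rather than the fix the project adopted, is to expand such variables in the application after the value has been read from the configuration file. The sketch below uses the standard library's `os.path.expandvars`; the helper name and the sample path are hypothetical.

```python
import os


def expand_config_path(raw_value):
    """Expand $VAR / ${VAR} references and a leading ~ in a config path.

    Variables that are not set are left unchanged by os.path.expandvars.
    """
    return os.path.expanduser(os.path.expandvars(raw_value))


# Hypothetical value as it might appear in a configuration file.
print(expand_config_path("$HOME/.tendrl/logging.yaml"))
```

Expanding inside the application keeps the behaviour the same whether the process is started from a shell or by a service manager.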

Re-factor gluster integration code to extend/use common module code

The common module provides all the base classes for flows, atoms, manager, job queue handler, persister and contains all the utilities which are generic in nature like exceptions, command executor, service manager and package installer. The gluster_integration modules should extend/use these common module changes.

Writing glusterd-state in /tmp should be changed

When tendrl-gluster-integration runs as a service, it is not able to read the data written to /tmp/glusterd-state by the glusterd service.
This happens because each service gets its own private /tmp, and glusterd and tendrl-gluster-integration are two different services.
The path could be changed to /var/run/glusterd-state, as it is a runtime file.

[systemd] name of systemd service should be "tendrl-gluster-integration"

Current name of systemd service file is tendrl-glusterd.service, while the name of package is tendrl-gluster-integration and the name of binary which implements the service is /usr/bin/tendrl-gluster-integration.

For this reason, the systemd service file should be renamed to tendrl-gluster-integration.service, so that this file name matches component name.

Introduce flows tied to Tendrl objects

All the actions that are specific to an object should be defined as flows under that particular object. Currently such flows are defined at global level. For example, the start volume, stop volume, delete volume flows should be under volume object.
Specification PR: Tendrl/specifications#10
Specification issue: Tendrl/specifications#34

tendrl.conf attribute

etc/tendrl/tendrl.conf.sample has an attribute called log_path in the Common section. It should probably be called log_cfg_path.

Delete the volume entry from etcd as well while volume delete

Currently the volume is marked as deleted=True as a stopgap arrangement. Change this to delete the etcd entry after the volume deletion flow completes.

Implement the pre and post runs for volume operations (create, delete, start, stop)

The pre and post runs are dummy placeholders at the moment for the following volume operations:

create
delete
start
stop

Implement the actual logic and test cases for the same as part of this fix.

etc directory is incompatible with documentation

According to the installation documentation [1], some configuration files from the gluster-integration etc directory should be copied to specific directories, but the etc directory structure differs from what the documentation specifies.
According to the documentation, logging.yaml.timedrotation.sample should be located at etc/samples/logging.yaml.timedrotation.sample, but it is located at etc/tendrl/gluster-integration/logging.yaml.timedrotation.sample.
According to the documentation, there should be an etc/tendrl/tendrl.conf.sample file, but there is no such file.
The documentation and the etc directory should be brought in line with each other.

[1] https://github.com/Tendrl/gluster-integration/blob/7ab5e79dc28476e777887c434c4d3485c441d195/doc/source/installation.rst

package errors

I've done package check from https://copr.fedorainfracloud.org/coprs/tendrl/tendrl/build/480512/.
All errors should be fixed, and it would be great to also fix the 3 warnings.

$ rpmlint tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm 
tendrl-gluster-integration.noarch: W: incoherent-version-in-changelog 0.0.1-1 ['0.0.1-1.el7.centos', '0.0.1-1.centos']
tendrl-gluster-integration.noarch: E: script-without-shebang /usr/share/tendrl/gluster_integration/tendrl.conf.sample
tendrl-gluster-integration.noarch: W: non-conffile-in-etc /etc/tendrl/gluster_integration_logging.yaml
tendrl-gluster-integration.noarch: W: no-manual-page-for-binary tendrl-gluster-integration
tendrl-gluster-integration.noarch: E: unknown-key RSA#e879b811 (MD5
1 packages and 0 specfiles checked; 2 errors, 3 warnings.

Store configuration details in etcd

Instead of writing configuration details to a file, store everything in etcd.

create package for Centos 7

For CI and integration testing it is important to install all Tendrl components from packages.

Introduce flows tied to objects, if the action is specific to object.

Segregate all the flows specific to an object under that object instead of keeping them in the global namespace. Note that such flows should be independent of other objects.

Specification: Tendrl/specifications#34

[question] Does tendrl-gluster-integration require tendrl-node-agent?

Based on my understanding, tendrl-gluster-integration requires a tendrl-node-agent, but this is not directly mentioned in installation.rst file. Is my understanding correct (so that we need to add this into the docs)?

Enhance unit tests for better coverage

As part of the re-factoring, unit test coverage has decreased. More unit tests should be written for better coverage.

Maintain node_id as attribute for manager

This is required while processing a job from the job queue. If the current node's id falls within the list of node_ids passed as parameters, the job would be picked and processed by that node. Only one node would pick and process the job from the list of node_ids passed in parameter for job.
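
The membership test described above can be sketched as follows. It assumes the job is a dict whose `parameters` entry may carry a `node_ids` list; both key names and the sample job are illustrative. Guaranteeing that only one node picks the job would need an atomic claim in the central store, which this sketch does not attempt.

```python
def should_process(job, node_id):
    """Return True if node_id is one of the job's target nodes.

    Assumes job is a dict; a job without "parameters" or "node_ids"
    targets no node.
    """
    node_ids = job.get("parameters", {}).get("node_ids", [])
    return node_id in node_ids


# Hypothetical job targeting two nodes.
job = {"parameters": {"node_ids": ["node-a", "node-b"]}}
print(should_process(job, "node-a"))
print(should_process(job, "node-c"))
```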

Lots of ceph related terms can be seen in gluster Bridge

There are references to many Ceph-related terms in gluster bridge, in files such as:
manager/eventer.py
manager/user_request.py
manager/manager.py
manager/request_factory.py
These have to be removed.

Travis job should check git commit message for github bug id and an optional tendrl spec name

Going forward, all git commits must contain, on a new line, a mandatory
"tendrl-bug-id: <tendrl_repo>/" entry and may contain an optional "tendrl-spec: <tendrl_spec_name>" entry in the commit message.

eg:
"""
Packaging python-etcd

Added Makefile and spec file to create python-etcd package
Now one can use "make rpm" to pull the latest source from github
and build the source rpm
and using the Makefile one can also build rpm using "make rpm"
which will use mock to build the rpm using srpm.

tendrl-bug-id: gluster_integration/18

tendrl-spec: refactor_gluster_integration_get_state_dump
"""
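
A check of this kind could be sketched as below. It assumes each entry sits on its own line in `key: value` form, as in the example above; the function names are hypothetical and this is not the project's `check_commit_msg.py`.

```python
def parse_trailers(commit_msg):
    """Collect the "key: value" lines for the two tendrl-* keys."""
    wanted = ("tendrl-bug-id", "tendrl-spec")
    found = {}
    for line in commit_msg.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            found[key.strip()] = value.strip()
    return found


def check_commit_msg(commit_msg):
    """Return a list of problems; an empty list means the message passes."""
    trailers = parse_trailers(commit_msg)
    problems = []
    # tendrl-bug-id is mandatory; tendrl-spec is optional and not checked.
    if not trailers.get("tendrl-bug-id"):
        problems.append("missing mandatory tendrl-bug-id line")
    return problems
```

A CI job could fail the build whenever `check_commit_msg` returns a non-empty list.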

For testing Tox required bridge_common as site-package

To test gluster_bridge, tox requires bridge_common as a site-package, so we have to list bridge_common in test_requirement as a pip-installable dependency.

refs : https://gist.github.com/GowthamShanmugam/bf66174d7d6062fab936706c713d3d0c

Add preferred tags to flows and change sample job format

Add preferred tags to the flow definition file, which can be used to generate jobs that are targeted towards specific type of nodes. Also change the sample job format to include this.

Fix pep8 errors

We need to fix pep8 errors from latest builds

How to detect pep8 errors:

After cloning gluster_bridge
run "tox -epep8" from the root of the repository and fix the failures.

Current build: https://travis-ci.org/Tendrl/gluster_bridge

Use logging framework for job status updates

Use new logging framework for job status updates in flows and atoms

tendrl-gluster-integration can't be shipped with default configuration

It's not directly possible to ship tendrl-gluster-integration component with configuration file(s) which would contain all default values known in advance (eg. before installation).

Eg. we know that we need to set this piece of configuration into /etc/tendrl/tendrl.conf (see installation.rst):

[gluster_integration]
log_cfg_path = /etc/tendrl/gluster_integration_logging.yaml

But it's not possible to ship this in a default configuration file in binary package, because tendrl.conf configuration file is owned by tendrl-node-agent component.

Reading jobs from etcd doesn't adhere to the api job structure

The new job structure is as per https://github.com/Tendrl/bridge_common/blob/master/etc/samples/tendrl_api_job.sample.json.

Based on this, rpc.py, which reads and processes the jobs from the etcd job queue, should be modified to figure out the actual flow name and invoke that flow.

Repository name changes

Repository names need to be updated to follow lowercase, hyphenated convention.

version in setup.py is not aligned with git tags

Version specified in setup.py file is 0.0.1, while the latest git tag (and releases page provided for these tags by github) is v1.1.

Details

$ grep version setup.py
    version="0.0.1",
$ git tag
v1.0
v1.1

Related Upstream Guidelines

https://packaging.python.org/distributing/#choosing-a-versioning-scheme
https://packaging.python.org/single_source_version/#single-sourcing-the-version
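
One of the single-sourcing techniques from the guidelines above is to keep the version in the package and have setup.py read it from there. The sketch below assumes the package's `__init__.py` contains a `__version__ = "..."` assignment, which is an assumption about this project; `read_version` is a hypothetical helper.

```python
import re

# Matches a line such as: __version__ = "1.1"
VERSION_RE = re.compile(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", re.MULTILINE)


def read_version(init_source):
    """Extract the __version__ string from the text of an __init__.py."""
    match = VERSION_RE.search(init_source)
    if match is None:
        raise RuntimeError("no __version__ assignment found")
    return match.group(1)


# In setup.py this text would come from reading the package's __init__.py.
sample = '__version__ = "1.1"\n'
print(read_version(sample))  # prints 1.1
```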

Refactor code to use commons module config changes

Fix test case issues in gluster integration

Some of the test cases in gluster integration are failing. We have to fix those issues.

gluster_integration service should read cluster-id from config file

gluster-integration service should read the cluster-id from the conf file instead of expecting a command line argument.

[packaging] move list of python project from requirements.txt into install_requires in setup.py

This issue suggests removing the requirements.txt file and moving the items it contains into the install_requires setuptools keyword in setup.py.

Reasoning

Based on explanation of difference between install_requires keyword and requirements.txt file from upstream Python Packaging User Guide, I understand that:

python project should provide minimal abstract list of projects which are needed to run it correctly in install_requires
requirements.txt files are used to define the requirements for a complete python environment

Since this project is a single Tendrl component, which is expected to be installed together with other Tendrl components, using a requirements file seems inappropriate. Listing the dependencies of a single reusable python library/tool/daemon, which is our case here, is not among the 4 common uses of Requirements files listed in the description of the feature in the packaging guide. Moreover, it creates additional issues:

using a requirements.txt file makes us install the project in a non-production way, which means that we find problems with listed requirements later than we should; see e.g. issue https://github.com/Tendrl/performance_monitoring/issues/17. This is a very blatant issue, but it was not noticed by any developer, unit test or CI, since all of those use an installation method which doesn't resemble what an admin using this project would do
keeping the requirements.txt file while specifying install_requires in setup.py requires maintaining a function in setup.py that reads requirements.txt, which is an unnecessary maintenance burden
the feature set of requirements.txt is larger and includes features which should not be used in a production setup

Volume status is not updating properly at etcd server when start and stop actions performed

The volume status is not updated properly on the etcd server when start and stop actions are performed.

Fix repetition of the attribute name in sds_atoms_gluster

The attribute name is repeated in the sds_atoms_gluster. One of them has to be replaced with description.

[rpm packaging] Dependency on tendrl-common

tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm has a dependency on tendrl-common. tendrl-common was recently renamed to tendrl-commons, so the package should reflect this in its dependency list.

Currently:

# rpm -qp https://copr-be.cloud.fedoraproject.org/results/tendrl/tendrl/epel-7-x86_64/00498961-tendrl-gluster-integration/tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm --requires
/bin/sh
.
.
systemd
tendrl-common
rpmlib(PayloadIsXz) <= 5.2-1
#

Should be:

# rpm -qp https://copr-be.cloud.fedoraproject.org/results/tendrl/tendrl/epel-7-x86_64/00498961-tendrl-gluster-integration/tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm --requires
/bin/sh
.
.
systemd
tendrl-commons
rpmlib(PayloadIsXz) <= 5.2-1
#

Improve code coverage up to 90%

We need to collectively start improving the unit test code coverage for this project.

Current coverage: https://coveralls.io/github/Tendrl/gluster_bridge

Unit tests framework: http://doc.pytest.org/en/latest/

How to write unit tests: http://docs.openstack.org/developer/horizon/topics/testing.html

Contribute patches here: https://github.com/Tendrl/gluster_bridge/tree/master/gluster_bridge/tests

Let's work on this issue together to get code coverage up to 90%. Please share doubts and inputs on this issue itself.

Some bugs identified from pytest

1. In servers.py, the name template takes two values (name = 'clusters/gluster/%s/peers/%s'), but every render function supplies only one value (self.name % self.peer_uuid).

2. In rpc.py, the _acceptor method runs a while loop with no exit condition. Control never returns from the function, so it cannot be tested.

3. In ini2json.py, the dget method is never called. The StrictConfigParser object is created, initialized and destroyed inside the ini_to_dict method, so calling dget from another method would be useless.
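
The first item can be reproduced in isolation: a template with two `%s` placeholders raises `TypeError` when it is given a single value. In the sketch below the function `render_name`, its `cluster_id` argument and the sample values are hypothetical; only the template string comes from the report.

```python
NAME = "clusters/gluster/%s/peers/%s"


def render_name(cluster_id, peer_uuid):
    """Fill both placeholders; the tuple must match the number of %s."""
    return NAME % (cluster_id, peer_uuid)


try:
    NAME % "peer-1"  # one value for two placeholders
except TypeError as exc:
    print("TypeError:", exc)

print(render_name("cluster-1", "peer-1"))
```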

Rpm installation: logging conf is copied to /etc/tendrl

Copy the logging config to /etc/tendrl/gluster-integration
Also update the 'log_cfg_path' in etc/tendrl/gluster-integration/gluster-integration.conf.yaml.sample to reflect the same

Exception shown in the logs when gluster_bridge can't find job queue in etcd

When one configures gluster_bridge following the install docs and runs tendrl-gluster-bridge, the following exception appears in the logs:

[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[rpc.py:96 -                 _run() ] EtcdThread run...
[client.py:521 -                 read() ] Issuing read for key /api_job_queue with args {}
[connectionpool.py:383 -        _make_request() ] "GET /v2/keys/api_job_queue HTTP/1.1" 404 79
[rpc.py:99 -                 _run() ] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 97, in _run
    self._server.run()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 50, in run
    self._acceptor()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 28, in _acceptor
    jobs = self.client.read("/api_job_queue")
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 536, in read
    timeout=timeout)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 848, in wrapper
    return self._handle_server_response(response)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 928, in _handle_server_response
    etcd.EtcdError.handle(r)
  File "/usr/lib/python2.7/site-packages/etcd/__init__.py", line 304, in handle
    raise exc(msg, payload)
EtcdKeyNotFound: Key not found : /api_job_queue

[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[rpc.py:96 -                 _run() ] EtcdThread run...
[client.py:521 -                 read() ] Issuing read for key /api_job_queue with args {}
[connectionpool.py:383 -        _make_request() ] "GET /v2/keys/api_job_queue HTTP/1.1" 404 79
[rpc.py:99 -                 _run() ] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 97, in _run
    self._server.run()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 50, in run
    self._acceptor()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 28, in _acceptor
    jobs = self.client.read("/api_job_queue")
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 536, in read
    timeout=timeout)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 848, in wrapper
    return self._handle_server_response(response)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 928, in _handle_server_response
    etcd.EtcdError.handle(r)
  File "/usr/lib/python2.7/site-packages/etcd/__init__.py", line 304, in handle
    raise exc(msg, payload)
EtcdKeyNotFound: Key not found : /api_job_queue

[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[manager.py:195 -             shutdown() ] Signal handler: stopping

Expected behavior

Instead of the exception, gluster_bridge should validate the data it's getting from etcd and log the issue properly without raising the exception.
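
A minimal sketch of that behaviour follows. To stay runnable without an etcd server it uses a stand-in exception and a fake client, both hypothetical; in the real code the exception would be `etcd.EtcdKeyNotFound`, as shown in the traceback, and `read` would return a result object rather than a plain list.

```python
import logging

LOG = logging.getLogger(__name__)


class KeyNotFound(Exception):
    """Stand-in for etcd.EtcdKeyNotFound so the sketch runs without etcd."""


class FakeClient:
    """Hypothetical client whose read() raises when the key is absent."""

    def __init__(self, data):
        self._data = data

    def read(self, key):
        try:
            return self._data[key]
        except KeyError:
            raise KeyNotFound("Key not found : %s" % key)


def read_job_queue(client, key="/api_job_queue"):
    """Return the queued jobs, or an empty list if the queue key is missing."""
    try:
        return client.read(key)
    except KeyNotFound:
        # Log the condition and carry on instead of letting the thread die.
        LOG.warning("job queue %s not found in etcd; nothing to process", key)
        return []
```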

Flows to get the cluster details

Currently, to import a cluster, the user is expected to provide all the nodes in the cluster as input. This is inconvenient when the cluster is large. Instead, the use-case flow should be modified as follows:

select one of the nodes in the cluster
use the selected node to get the details of the cluster
submit the request to all the nodes to import the cluster

This issue is to create a flow to get the cluster details.

Fix the volume options syncing issue

There are some failures while updating the volume options. Correct them.

Remove salt package from requirements file

Salt package is not a requirement for gluster bridge. So it has to be removed from the requirements file.

Import Error

For a unit test I imported a module called Eventer (from gluster_bridge.manager.eventer import Eventer). When I try to run the test file, it fails with an import error (ImportError: No module named tendrl.gluster_bridge.common.db.event). I don't see any module called Common in gluster_bridge.

Parse the get-state output into a hierarchical dictionary and then process

Currently the output of the gluster get-state command is read from a file. This file is in ini format and we just read it into JSON. Each section in the file becomes one dictionary, with its entries as keys. It would be easier to parse and process if we built a proper hierarchical dictionary, as below, out of this data and then updated the details in the central store.

{
  'Global': {
    'MYUUID': '3780fd01-8047-402c-81df-e361a1935bb8',
    'op-version': '30900'
  },
  'Peers': [
   .....
  ],
  'Volumes': [
    {
      'name': 'vol1',
      'id': 'ff0bceae-9901-46b0-9464-7fa4836d2535',
      ............
      'Bricks': {
        'Brick1': {
          'hostname': '<FQDN>',
          'path': '<FQDN>:<mount path>',
          ..........
        },
        ..........
      },
      'Options': {
        'option1': 'val1',
        'option2': 'val2',
        .........
      },
      ...............
    },
    ........
  ],  
  'Services': [
    {
      'name': 'glustershd',
      'online_status': "Offline",
    },
    ...............
  ],
  'Misc': {
    'Base Port': 49152,
    'Last allocated port': 49153
  }
}

This would make looping through the details and updating the central store more predictable, and errors could be handled more effectively: for example, a KeyError for one volume would not cause updates of other volumes to fail, and the error could be logged for that specific volume only.
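
The idea can be sketched as follows, assuming that the entries of a section arrive as a flat dict with dotted keys such as `Volume1.Brick1.path`. That layout, the helper names and the sample data are assumptions made for illustration. The sketch also assumes no key is both a value and a prefix of another key, and option names that themselves contain dots would need separate handling.

```python
import logging

LOG = logging.getLogger(__name__)


def nest(flat):
    """Turn {"Volume1.Brick1.path": "x"} into nested dictionaries."""
    tree = {}
    for dotted_key, value in flat.items():
        node = tree
        parts = dotted_key.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return tree


def sync_volumes(volumes, store):
    """Copy each volume into store, logging (not raising) per-volume errors."""
    for key, volume in volumes.items():
        try:
            store[volume["id"]] = {"name": volume["name"]}
        except KeyError as exc:
            # A broken volume is skipped; the others are still updated.
            LOG.error("skipping %s: missing field %s", key, exc)


flat = {
    "Volume1.name": "vol1",
    "Volume1.id": "id-1",
    "Volume1.Brick1.path": "host:/bricks/b1",
    "Volume2.name": "vol2",  # no id: logged and skipped
}
store = {}
sync_volumes(nest(flat), store)
print(store)
```

Here a plain dict stands in for the central store, so the per-volume error handling can be exercised without etcd.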