tendrl / gluster-integration

Extracts all data from a Gluster cluster for consumption by Tendrl

License: GNU Lesser General Public License v2.1

Python 99.67% Makefile 0.33%
gluster glusterfs management tendrl

gluster-integration's Introduction

Tendrl API

Note
All the commands are run as a regular user that has sudo privileges. The commands are all assumed to be run from a single directory, which by default could be the user’s home directory. If different, the required current directory is indicated in [] before the shell prompt $.

Ensure that etcd is running on a node in the network and is reachable from the node you’re about to install tendrl-api on. Note its address and port. In most development setups, both etcd and tendrl-api would reside on the same host.
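
A quick way to confirm reachability before installing is a plain TCP connection test. The following is a minimal sketch, not part of tendrl-api: the helper name is hypothetical, and it only checks that something accepts connections on the given address, not that etcd itself is healthy.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and tries to connect
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and resolution failures
        return False
```

For example, `is_reachable("127.0.0.1", 2379)` would probe etcd's default client port on the local host; substitute the address and port you noted for your own setup.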

  1. Install the build toolchain.

    $ sudo yum groupinstall 'Development Tools'
  2. Install Ruby 2.0.0p598.

    $ sudo yum install ruby ruby-devel rubygem-bundler
  1. Clone tendrl-api.

    $ git clone https://github.com/Tendrl/tendrl-api.git
  2. Install the gem dependencies, either:

    $ cd tendrl-api
    1. everything,

      [tendrl-api] $ bundle install --path vendor/bundle --binstubs vendor/bin
    2. OR development setup only,

      [tendrl-api] $ bundle install --path vendor/bundle --binstubs vendor/bin \
                     --without production
    3. OR production setup only.

      [tendrl-api] $ bundle install --path vendor/bundle --binstubs vendor/bin \
                     --without development test documentation
Note
Using binstubs allows any of the executables to be executed directly from vendor/bin, instead of via bundle exec.

To configure the etcd connection information, copy the sample configuration file to the appropriate location and make the necessary changes based on your etcd configuration, as discussed in the Deployment Requirements section.

[tendrl-api] $ cp config/etcd.sample.yml config/etcd.yml
Note
All the commands below are assumed to be run from inside the git checkout directory.
  1. Tendrl Definitions:

    The API needs the proper Tendrl definitions yaml file to generate the attributes and actions. You can either download it or use the one from the fixtures to explore the API.

    [tendrl-api] $ cp spec/fixtures/sds/tendrl_definitions_gluster-3.8.3.yaml \
                   config/sds/tendrl_definitions_gluster-3.8.3.yaml
  2. Seed the etcd instance (optional):

    The script will seed the etcd instance with mock cluster data and print a cluster uuid which can be used to make API requests.

    [tendrl-api] $ vendor/bin/rake etcd:seed # Seed the local store with cluster
  3. Start the development server:

    This server will reload itself when any of the source files are updated.

    [tendrl-api] $ vendor/bin/shotgun
    Note
    This makes the development server queryable on localhost:9393 by default. See vendor/bin/shotgun --help to change the ip:port binding.

The test environment does not need the local etcd instance to run the tests.

[tendrl-api] $ vendor/bin/rspec

Binding to port 80 requires root permissions. However, tendrl-api runs as a normal user. In order to make the application available on port 80, apache needs to be installed and configured.

  1. Install apache

    $ sudo yum install httpd
  2. Copy over the sample configuration file and validate its syntax.

    Important
    Update the file for your specific host details. The file is commented to point out the suggested changes. The file is configured to connect to the tendrl-api application server on port 9292.
    Important
    Running behind apache makes the API available at http://<hostname>:80/api/. The configuration of client applications (including the tendrl frontend) needs to be updated to send all API queries to this endpoint.
    [tendrl-api] $ sudo cp config/apache.vhost.sample \
                   /etc/httpd/conf.d/tendrl.conf
    $ sudo apachectl configtest
  3. Update the SELinux configuration to allow apache to make connections.

    $ sudo setsebool -P httpd_can_network_connect 1
  4. Run the application via the production server puma, daemonised, listening on port 9292.

    [tendrl-api] $ vendor/bin/puma -e development -d
    Note
    It is possible to run both the development and the production servers at the same time, with the production server behind apache. The production server puma runs on port 9292 by default, while the development server shotgun listens on port 9393.
  5. Start apache.

    $ sudo systemctl start httpd.service

gluster-integration's People

Contributors

anmolbabu, anmolsachan, brainfunked, fbalak, gowthamshanmugam, ktdreyer, mbukatov, nndarshan, nthomas-redhat, r0h4n, rishubhjain, sankarshanmukhopadhyay, shtripat, timothyasirjeyasing, umangachapagain

gluster-integration's Issues

Config file is missing

Unfortunately, it looks like gluster_bridge is missing a config file that was never added. We have to fix this issue.

Fix package version issue

The package version should come from tendrl/gluster_integration/__init__.py, and it should be updated automatically from GitHub when we run make gitversion.
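
One common way to keep a single source for the version is to have setup.py read it out of the package's `__init__.py`. The sketch below assumes the version is stored as a `__version__ = '...'` assignment, which the issue does not confirm; the helper name is hypothetical, and how `make gitversion` would update the file is not shown.

```python
import re


def read_version(init_path):
    """Extract the __version__ string from a package __init__.py file."""
    with open(init_path) as handle:
        source = handle.read()
    # Match a line such as: __version__ = '0.0.1'
    match = re.search(
        r'^__version__\s*=\s*[\'"]([^\'"]+)[\'"]', source, re.MULTILINE
    )
    if match is None:
        raise RuntimeError("no __version__ assignment found in %s" % init_path)
    return match.group(1)
```

setup.py could then pass `read_version("tendrl/gluster_integration/__init__.py")` as its `version` argument, so the number is never duplicated.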

install instructions documentation

There should be a document which describes:

  • what software should be installed, and how it should be configured, before installing Tendrl
  • the Tendrl installation process
  • Tendrl post-installation steps such as configuration, starting services and so on

[was originally at https://tendrl.atlassian.net/browse/TEN-36]

KeyError: 'parameters'

When the tendrl-gluster-integration command is launched, it continuously prints:

ERROR - Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl/gluster_integration/manager/rpc.py", line 161, in _run
    self._server.run()
  File "/usr/lib/python2.7/site-packages/tendrl/gluster_integration/manager/rpc.py", line 83, in run
    self._acceptor()
  File "/usr/lib/python2.7/site-packages/tendrl/gluster_integration/manager/rpc.py", line 70, in _acceptor
    if "etcd_client" in raw_job['parameters']:
KeyError: 'parameters'
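
The traceback shows `raw_job['parameters']` being indexed without checking that the key exists. A minimal sketch of a defensive lookup follows; it assumes `raw_job` is a plain dictionary, and the function name is hypothetical rather than taken from rpc.py.

```python
def wants_etcd_client(raw_job):
    """Return True if the job's parameters mention "etcd_client".

    A job without a "parameters" key (or with an empty value) is treated
    as having no parameters instead of raising KeyError.
    """
    parameters = raw_job.get("parameters") or {}
    return "etcd_client" in parameters
```

Whether a job without parameters should be skipped, rejected, or processed with defaults is a design decision this sketch leaves open.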

[rpm packaging] configuration files are not marked as configuration

The package tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm from the https://copr-be.cloud.fedoraproject.org/results/tendrl/tendrl/epel-7-x86_64/ copr doesn't mark its configuration files as configuration.

This breaks Fedora packaging policy and will affect user changes to the affected configuration files during package upgrades.

Reproducer

# rpm -ql tendrl-gluster-integration | grep ^/etc
/etc/tendrl/gluster_integration
/etc/tendrl/gluster_integration_logging.yaml
# rpm -qc tendrl-gluster-integration | grep ^/etc
#

The expected result would be:

# rpm -ql tendrl-gluster-integration | grep ^/etc
/etc/tendrl/gluster_integration
/etc/tendrl/gluster_integration_logging.yaml
# rpm -qc tendrl-gluster-integration | grep ^/etc
/etc/tendrl/gluster_integration_logging.yaml
#

for more information see: Tendrl/commons#72

Fix service start issue

The service failed to start because an environment variable is used in one of the Tendrl configuration files. The service configuration file cannot expand environment variables such as $HOME or $PWD in a path.
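
One possible remedy is to expand such variables in the application when the configuration value is loaded, rather than relying on the service manager. This is only a sketch with a hypothetical helper name. Since variables like $HOME may be unset or different when running as a service, absolute paths in the shipped configuration are likely the more robust fix.

```python
import os


def expand_config_path(value):
    """Expand $VAR / ${VAR} references and a leading ~ in a configured path."""
    # expandvars leaves unknown variables untouched instead of failing
    return os.path.expanduser(os.path.expandvars(value))
```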

Re-factor gluster integration code to extend/use common module code

The common module provides all the base classes for flows, atoms, manager, job queue handler, persister and contains all the utilities which are generic in nature like exceptions, command executor, service manager and package installer. The gluster_integration modules should extend/use these common module changes.

Writing glusterd-state in /tmp should be changed

When tendrl-gluster-integration runs as a service, it is not able to read the data that the glusterd service writes to /tmp/glusterd-state.
This happens because each service gets its own private /tmp, and glusterd and tendrl-gluster-integration are two different services.
The path could be changed to /var/run/glusterd-state, since it is a runtime file.

[systemd] name of systemd service should be "tendrl-gluster-integration"

The current name of the systemd service file is tendrl-glusterd.service, while the name of the package is tendrl-gluster-integration and the name of the binary which implements the service is /usr/bin/tendrl-gluster-integration.

For this reason, the systemd service file should be renamed to tendrl-gluster-integration.service, so that the file name matches the component name.

tendrl.conf attribute

etc/tendrl/tendrl.conf.sample has an attribute called log_path in the Common section. It should probably be called log_cfg_path.

etc directory is incompatible with documentation

According to the installation documentation [1], some configuration files from the gluster-integration etc directory should be copied to specific directories, but the etc directory structure differs from what the documentation specifies.
According to the documentation, logging.yaml.timedrotation.sample should be located at etc/samples/logging.yaml.timedrotation.sample, but it is located at etc/tendrl/gluster-integration/logging.yaml.timedrotation.sample.
According to the documentation, there should be an etc/tendrl/tendrl.conf.sample, but there is no such file.
The documentation and the etc directory should be made consistent.

[1] https://github.com/Tendrl/gluster-integration/blob/7ab5e79dc28476e777887c434c4d3485c441d195/doc/source/installation.rst

package errors

I ran a package check on the build from https://copr.fedorainfracloud.org/coprs/tendrl/tendrl/build/480512/.
All errors should be fixed, and it would be great to fix the 3 warnings as well.

$ rpmlint tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm 
tendrl-gluster-integration.noarch: W: incoherent-version-in-changelog 0.0.1-1 ['0.0.1-1.el7.centos', '0.0.1-1.centos']
tendrl-gluster-integration.noarch: E: script-without-shebang /usr/share/tendrl/gluster_integration/tendrl.conf.sample
tendrl-gluster-integration.noarch: W: non-conffile-in-etc /etc/tendrl/gluster_integration_logging.yaml
tendrl-gluster-integration.noarch: W: no-manual-page-for-binary tendrl-gluster-integration
tendrl-gluster-integration.noarch: E: unknown-key RSA#e879b811 (MD5
1 packages and 0 specfiles checked; 2 errors, 3 warnings.

Maintain node_id as attribute for manager

This is required when processing a job from the job queue. If the current node's id is in the list of node_ids passed as job parameters, the job would be picked up and processed by that node. Only one node from that list would pick up and process the job.
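
The membership check described above could look like the sketch below. The job layout (a `parameters` dictionary holding a `node_ids` list) and the function name are assumptions for illustration. Ensuring that only one of the listed nodes claims the job would additionally need an atomic claim in the central store, which is not shown here.

```python
def job_is_for_node(job, node_id):
    """Return True if node_id appears in the job's node_ids parameter."""
    # Hypothetical layout: {"parameters": {"node_ids": ["id-1", "id-2"]}}
    node_ids = (job.get("parameters") or {}).get("node_ids") or []
    return node_id in node_ids
```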

Travis job should check git commit message for github bug id and an optional tendrl spec name

From now on, every git commit message should contain, on a new line, a mandatory "tendrl-bug-id: <tendrl_repo>/" entry and an optional "tendrl-spec: <tendrl_spec_name>" entry.

eg:
"""
Packaging python-etcd

Added Makefile and spec file to create python-etcd package
Now one can use "make rpm" to pull the latest source from github
and build the source rpm
and using the Makefile one can also build rpm using "make rpm"
which will use mock to build the rpm using srpm.

tendrl-bug-id: gluster_integration/18

tendrl-spec: refactor_gluster_integration_get_state_dump
"""

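A CI job could enforce this convention with a small script along the following lines. This is a sketch under assumptions: the `<repo>/<number>` shape of the bug id is inferred from the example above ("gluster_integration/18"), and the function name and messages are hypothetical.

```python
import re

# Assumed shape, based on the example "tendrl-bug-id: gluster_integration/18"
BUG_ID_RE = re.compile(r"^tendrl-bug-id: [\w.-]+/\d+$", re.MULTILINE)
SPEC_RE = re.compile(r"^tendrl-spec: \S+$")


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if OK)."""
    problems = []
    # The bug id line is mandatory
    if not BUG_ID_RE.search(message):
        problems.append("missing 'tendrl-bug-id: <repo>/<number>' line")
    # The spec line is optional, but must be well formed when present
    for line in message.splitlines():
        if line.startswith("tendrl-spec:") and not SPEC_RE.match(line):
            problems.append("malformed tendrl-spec line: %r" % line)
    return problems
```

In a Travis job the message could be obtained with `git log -1 --format=%B` and the build failed when the returned list is non-empty.
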
tendrl-gluster-integration can't be shipped with default configuration

It's not directly possible to ship the tendrl-gluster-integration component with configuration file(s) containing all default values known in advance (e.g. before installation).

For example, we know that we need to put this piece of configuration into /etc/tendrl/tendrl.conf (see installation.rst):

[gluster_integration]
log_cfg_path = /etc/tendrl/gluster_integration_logging.yaml

But it's not possible to ship this in a default configuration file in the binary package, because the tendrl.conf configuration file is owned by the tendrl-node-agent component.

[packaging] move list of python project from requirements.txt into install_requires in setup.py

This issue suggests removing the requirements.txt file and moving the items it contains into the install_requires setuptools keyword in setup.py.

Reasoning

Based on the explanation of the difference between the install_requires keyword and the requirements.txt file in the upstream Python Packaging User Guide, I understand that:

  • a python project should provide, in install_requires, a minimal abstract list of the projects it needs to run correctly
  • requirements.txt files are used to define the requirements for a complete python environment

Since this project is a single Tendrl component, which is expected to be installed together with other Tendrl components, using a requirements file seems inappropriate. Listing the dependencies of a single reusable python library/tool/daemon, which is our case here, is not mentioned among the 4 common uses of requirements files listed in the description of the feature in the packaging guide. Moreover, it creates additional issues:

  • using a requirements.txt file makes us install the project in a non-production way, which means that we find problems with the listed requirements later than we should; see e.g. issue https://github.com/Tendrl/performance_monitoring/issues/17 - this is a very blatant issue, but it was not noticed by any developer, unit test or CI, since all of those use an installation method which doesn't resemble what an admin using this project would do
  • keeping a requirements.txt file while specifying install_requires in setup.py requires maintaining a function in setup.py to read requirements.txt, which is an unnecessary maintenance burden
  • the feature set of requirements.txt is larger and includes features which should not be used in a production setup
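
As a rough illustration of the proposal, the dependency list would live directly in setup.py as abstract, unpinned names. The entries below are placeholders chosen for illustration only; the project's actual dependencies are whatever its requirements.txt lists.

```python
# Illustrative dependency names only (assumption); not the project's real list.
INSTALL_REQUIRES = [
    "python-etcd",
    "PyYAML",
]

# Keyword arguments as they would be handed to setuptools.
SETUP_KWARGS = {
    "name": "tendrl-gluster-integration",
    "install_requires": INSTALL_REQUIRES,
}

# In setup.py this would be used as: setuptools.setup(**SETUP_KWARGS)
```

With this layout no helper for parsing requirements.txt is needed, and an install from the package exercises the same dependency list that an admin's install would.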

[rpm packaging] Dependency on tendrl-common

tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm has a dependency on tendrl-common. tendrl-common was recently renamed to tendrl-commons, so the package should reflect that in its dependency list.

Currently:

# rpm -qp https://copr-be.cloud.fedoraproject.org/results/tendrl/tendrl/epel-7-x86_64/00498961-tendrl-gluster-integration/tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm --requires
/bin/sh
.
.
systemd
tendrl-common
rpmlib(PayloadIsXz) <= 5.2-1
#

Should be:

# rpm -qp https://copr-be.cloud.fedoraproject.org/results/tendrl/tendrl/epel-7-x86_64/00498961-tendrl-gluster-integration/tendrl-gluster-integration-0.0.1-1.el7.centos.noarch.rpm --requires
/bin/sh
.
.
systemd
tendrl-commons
rpmlib(PayloadIsXz) <= 5.2-1
#

Improve code coverage up to 90%

We need to collectively start improving the unit test code coverage for this project.

Current coverage: https://coveralls.io/github/Tendrl/gluster_bridge

Unit tests framework: http://doc.pytest.org/en/latest/

How to write unit tests: http://docs.openstack.org/developer/horizon/topics/testing.html

Contribute patches here: https://github.com/Tendrl/gluster_bridge/tree/master/gluster_bridge/tests

Let's work on this issue together to get code coverage up to 90%. Please share doubts/inputs on this issue itself.

Some bugs identified from pytest

  1. In servers.py, the name template has two placeholders (name = 'clusters/gluster/%s/peers/%s'), but every render function supplies only one value (self.name % self.peer_uuid).

  2. In rpc.py, the _acceptor method runs a while loop without any exit condition. Control never leaves the function, so I can't test it.

  3. In ini2json.py, the dget method is never called. The StrictConfigParser object is created, initialized and destroyed inside ini_to_dict, so dget cannot usefully be called from any other method.
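
The first two points can be illustrated with a short sketch. It is not the project's code: `cluster_id` is a guess at what the first placeholder is meant to hold, and the `Acceptor` class with its callbacks is a hypothetical stand-in for the loop in rpc.py, showing one way to give such a loop an exit condition so that a test can stop it.

```python
import threading


def render_peer_key(cluster_id, peer_uuid):
    """Fill both %s placeholders; supplying a single value raises TypeError."""
    return "clusters/gluster/%s/peers/%s" % (cluster_id, peer_uuid)


class Acceptor(object):
    """Polling loop that can be stopped from outside, which makes it testable."""

    def __init__(self, fetch_jobs, handle_job, poll_interval=1.0):
        self._fetch_jobs = fetch_jobs
        self._handle_job = handle_job
        self._poll_interval = poll_interval
        self._stop = threading.Event()

    def stop(self):
        """Ask the loop to finish after the current iteration."""
        self._stop.set()

    def run(self):
        while not self._stop.is_set():
            for job in self._fetch_jobs():
                self._handle_job(job)
            # wait() returns early once stop() has been called
            self._stop.wait(self._poll_interval)
```

A test can pass in stub callbacks and call `stop()` from the handler or from another thread, so `run()` returns instead of blocking forever.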

Exception shown in the logs when gluster_bridge can't find job queue in etcd

When one configures gluster_bridge following the install docs and runs tendrl-gluster-bridge, the following exception appears in the logs:

[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[rpc.py:96 -                 _run() ] EtcdThread run...
[client.py:521 -                 read() ] Issuing read for key /api_job_queue with args {}
[connectionpool.py:383 -        _make_request() ] "GET /v2/keys/api_job_queue HTTP/1.1" 404 79
[rpc.py:99 -                 _run() ] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 97, in _run
    self._server.run()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 50, in run
    self._acceptor()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 28, in _acceptor
    jobs = self.client.read("/api_job_queue")
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 536, in read
    timeout=timeout)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 848, in wrapper
    return self._handle_server_response(response)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 928, in _handle_server_response
    etcd.EtcdError.handle(r)
  File "/usr/lib/python2.7/site-packages/etcd/__init__.py", line 304, in handle
    raise exc(msg, payload)
EtcdKeyNotFound: Key not found : /api_job_queue

[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[rpc.py:96 -                 _run() ] EtcdThread run...
[client.py:521 -                 read() ] Issuing read for key /api_job_queue with args {}
[connectionpool.py:383 -        _make_request() ] "GET /v2/keys/api_job_queue HTTP/1.1" 404 79
[rpc.py:99 -                 _run() ] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 97, in _run
    self._server.run()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 50, in run
    self._acceptor()
  File "/usr/lib/python2.7/site-packages/tendrl_gluster_bridge-0.1-py2.7.egg/tendrl/gluster_bridge/manager/rpc.py", line 28, in _acceptor
    jobs = self.client.read("/api_job_queue")
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 536, in read
    timeout=timeout)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 848, in wrapper
    return self._handle_server_response(response)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 928, in _handle_server_response
    etcd.EtcdError.handle(r)
  File "/usr/lib/python2.7/site-packages/etcd/__init__.py", line 304, in handle
    raise exc(msg, payload)
EtcdKeyNotFound: Key not found : /api_job_queue

[manager.py:48 -                 _run() ] [Errno 2] No such file or directory
[manager.py:195 -             shutdown() ] Signal handler: stopping

Expected behavior

Instead of the exception, gluster_bridge should validate the data it's getting from etcd and log the issue properly without raising the exception.
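
A minimal sketch of the expected behavior follows. To stay self-contained it takes the client and the "key not found" exception class as arguments; with python-etcd, as seen in the traceback, that class would be `etcd.EtcdKeyNotFound`. The function name and log wording are hypothetical.

```python
import logging

LOG = logging.getLogger(__name__)


def read_job_queue(client, not_found_error, key="/api_job_queue"):
    """Read the job queue key, returning None (and logging) if it is absent.

    `client` is any object with a read(key) method; `not_found_error` is the
    exception class that client raises for a missing key.
    """
    try:
        return client.read(key)
    except not_found_error:
        # Log the condition once instead of letting the traceback escape
        LOG.warning("job queue key %s not found in etcd; nothing to process", key)
        return None
```

The caller can then treat `None` as "no jobs yet" and retry later, rather than having the thread die with a traceback.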

Flows to get the cluster details

Currently, to import a cluster, the user is expected to provide all the nodes in the cluster as input. This is inconvenient when the cluster is large. Instead, the use-case flow should be changed to:

  1. Select one of the nodes in the cluster.
  2. Use the selected node to get the details of the cluster.
  3. Submit the request to all the nodes to import the cluster.

This issue is to create a flow to get the cluster details.

Import Error

For a unit test I imported the Eventer class (from gluster_bridge.manager.eventer import Eventer). When I try to run the test file, it fails with an import error (ImportError: No module named tendrl.gluster_bridge.common.db.event). I don't see any module called common in gluster_bridge.

Parse the get-state output into a hierarchical dictionary and then process

Currently the output of the gluster get-state command is read from a file. The file is in INI format, and we simply read it into JSON: each section becomes one dictionary, with the section's entries as keys. The data would be easier to parse and process if we built a proper hierarchical dictionary from it, as below, and then updated the details in the central store.

{
  'Global': {
    'MYUUID': '3780fd01-8047-402c-81df-e361a1935bb8',
    'op-version': '30900'
  },
  'Peers': [
   .....
  ],
  'Volumes': [
    {
      'name': 'vol1',
      'id': 'ff0bceae-9901-46b0-9464-7fa4836d2535',
      ............
      'Bricks': {
        'Brick1': {
          'hostname': '<FQDN>',
          'path': '<FQDN>:<mount path>',
          ..........
        },
        ..........
      },
      'Options': {
        'option1': 'val1',
        'option2': 'val2',
        .........
      },
      ...............
    },
    ........
  ],
  'Services': [
    {
      'name': 'glustershd',
      'online_status': 'Offline',
    },
    ...............
  ],
  'Misc': {
    'Base Port': 49152,
    'Last allocated port': 49153
  }
}

This would make looping through the details and updating the central store more predictable, and errors could be handled more effectively: a KeyError for one volume would not cause updates for other volumes to fail, and the error could be logged for that specific volume only.
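
The idea can be sketched as two small steps: nest the flat entries of a section, then process each volume independently. The sketch assumes the section's keys are dotted names such as `Volume1.Brick1.path`, which should be checked against real get-state output; both function names are hypothetical, and a key used both as a value and as a prefix is not handled.

```python
def nest_dotted_keys(flat):
    """Turn {'Volume1.Brick1.path': 'x'} into {'Volume1': {'Brick1': {'path': 'x'}}}."""
    nested = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = nested
        # Walk (or create) one dictionary level per key component
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


def collect_volumes(nested, required=("name", "id")):
    """Return (volumes, errors); a bad volume is reported without stopping the rest."""
    volumes, errors = [], []
    for label in sorted(nested):
        entry = nested[label]
        if not isinstance(entry, dict):
            continue  # plain values in the section are not volumes
        try:
            volumes.append({field: entry[field] for field in required})
        except KeyError as exc:
            # Record the failure for this volume only and carry on
            errors.append("%s: missing %s" % (label, exc))
    return volumes, errors
```

Each entry in `errors` names the volume it belongs to, so it can be logged for that volume while the remaining volumes are still written to the central store.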
