This project forked from waggle-sensor/beehive-server


Waggle cloud software for aggregation, storage and analysis of sensor data from Waggle nodes



Beehive Overview

Installation

Please see installation notes for deploying a beehive server.

Beehive1

Base OS: CentOS Linux release 7.2.1511 (Core)
Kernel: Linux beehive1.mcs.anl.gov 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC
        2016 x86_64 x86_64 x86_64 GNU/Linux
Public IP: 140.221.47.67 (67.47.221.140.in-addr.arpa name = beehive1.mcs.anl.gov.)
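The PTR name in the host facts is just the address's octets reversed under the in-addr.arpa zone; the reverse record 67.47.221.140.in-addr.arpa therefore corresponds to the forward address 140.221.47.67 (Argonne's 140.221/16 range). A quick sanity check:

```python
def reverse_ptr(ip: str) -> str:
    """in-addr.arpa PTR name for an IPv4 address: octets reversed."""
    return ".".join(reversed(ip.split("."))) + ".in-addr.arpa"

# The reverse record shown in the host facts:
print(reverse_ptr("140.221.47.67"))  # 67.47.221.140.in-addr.arpa
```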

FS of Beehive

/dev/mapper/centos_beehive1-root on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
/dev/vda1 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

All the data is in /mnt, which is not a separately mounted partition but part of root. Open question: is the root filesystem of Beehive backed up? If so, how often?

All the Beehive processes run either in Docker containers or as jobs directly on the base CentOS system.

FS Usage (27 Feb 2017): Top Users
55G    /

    23G    /var
        17G    /var/lib
            17G    /var/lib/docker    
                16G    /var/lib/docker/devicemapper
                    16G    /var/lib/docker/devicemapper/devicemapper
        5.4G    /var/log

    21G    /mnt
        12G    /mnt/rabbitmq
            12G    /mnt/rabbitmq/data
                12G    /mnt/rabbitmq/data/mnesia
                    12G    /mnt/rabbitmq/data/mnesia/rabbitmq
                    12G    /mnt/rabbitmq/data/mnesia/rabbitmq/queues
                        5.4G    /mnt/rabbitmq/data/mnesia/rabbitmq/queues/3KKHZI9SX6T8Q78UNERI1H2W1
                        2.0G    /mnt/rabbitmq/data/mnesia/rabbitmq/queues/CLRGSG3BEMCB4E5GRS63XRK0V
                        1.6G    /mnt/rabbitmq/data/mnesia/rabbitmq/queues/ACZI56741375WMFLVFPQSXZ67
                        1.3G    /mnt/rabbitmq/data/mnesia/rabbitmq/queues/1JE88H3ZUXZ2PM8VCY1Z7TVDP
                        1.3G    /mnt/rabbitmq/data/mnesia/rabbitmq/queues/1M6BCTH8SNGEKZL68LZBPMWQL

        7.4G    /mnt/cassandra
            7.1G    /mnt/cassandra/data
                7.1G    /mnt/cassandra/data/waggle
                    4.5G    /mnt/cassandra/data/waggle/sensor_data_raw-6a36efb090be11e68f941fe22eacf844
                    2.6G    /mnt/cassandra/data/waggle/sensor_data-9abd35e0c44f11e59521091830ac5256


        1.3G    /mnt/beehive
            1.3G    /mnt/beehive/node-logs


    7.4G    /homes
        7.1G    /homes/moose
            6.8G    /homes/moose/beehive-server
                6.8G    /homes/moose/beehive-server/data-exporter
                    6.8G    /homes/moose/beehive-server/data-exporter/datasets
                        6.8G    /homes/moose/beehive-server/data-exporter/datasets/2

    2.3G    /usr

Docker

Version

Client:
 Version:      1.10.2
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   c3959b1
 Built:        Mon Feb 22 16:16:33 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.2
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   c3959b1
 Built:        Mon Feb 22 16:16:33 2016
 OS/Arch:      linux/amd64

Containers

1. waggle/beehive-worker-coresense
2. waggle/beehive-flask
3. waggle/beehive-logger
4. waggle/beehive-nginx
5. waggle/beehive-plenario-sender
6. waggle/beehive-loader-raw
7. waggle/beehive-sshd
8. waggle/beehive-cert
9. mysql:5.7.10
10. cassandra:3.2
11. waggle/beehive-rabbitmq

Where are the docker images created?

Base_Dir is root of the beehive-server repo.

Some images are built from the Dockerfile in their respective directories using make build and make deploy:

[Base_Dir]/beehive-loader-decoded/Dockerfile
[Base_Dir]/beehive-sshd/Dockerfile
[Base_Dir]/beehive-nginx/Dockerfile
[Base_Dir]/beehive-cert/Dockerfile
[Base_Dir]/beehive-worker-alphasense/Dockerfile
[Base_Dir]/beehive-loader/Dockerfile
[Base_Dir]/beehive-flask/Dockerfile
[Base_Dir]/beehive-loader-raw/Dockerfile
[Base_Dir]/beehive-plenario-sender/Dockerfile
[Base_Dir]/beehive-worker-gps/Dockerfile
[Base_Dir]/beehive-log-saver/Dockerfile
[Base_Dir]/beehive-worker-coresense/Dockerfile
[Base_Dir]/beehive-queue-to-mysql/Dockerfile
[Base_Dir]/beehive-rabbitmq/Dockerfile

How are containers built and deployed?

Service directories all contain a Makefile with commands:

  • make build - build Docker image from directory
  • make deploy - deploy latest Docker built image

For example, the following builds the currently checked-out beehive-loader-raw image and deploys it:

$ cd beehive-server/beehive-loader-raw
$ make build && make deploy
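A hypothetical sketch of what such a Makefile reduces to (the image and container names are illustrative; the real targets live in each service directory):

```makefile
IMAGE = waggle/beehive-loader-raw
NAME  = beehive-loader-raw

build:
	docker build -t $(IMAGE) .

deploy:
	-docker rm -f $(NAME)
	docker run -d --name $(NAME) $(IMAGE)
```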

Cassandra

Deployment

Docker Container: beehive-cassandra

Docker Image: cassandra:3.2

  • This image appears to be pulled from the public Docker image registry. (Need to confirm.)
  • The container is configured by the Makefile in that directory. (There are two Makefiles there, Makefile and Makefile.beehive1; we need to understand why.)

Admin User: waggle

Admin Password: waggle

Default Keyspace: waggle

Access

Use beehive-server/bin/beehive-cqlsh to quickly connect to the waggle keyspace in the database.

What do we have in Cassandra on the FS?
[Tue Feb 27 10:43:50 root@beehive1:/mnt/cassandra/data/waggle ] $ du -sc * | sort -k 1 -n

0	abc-624f05e090c211e68f941fe22eacf844
0	admin_messages-8ab478e07c2b11e68f941fe22eacf844
0	admin_messages-940dafb07c5811e68f941fe22eacf844
0	admin_messages-dfbc3a807c5811e68f941fe22eacf844
0	admin_messages-f605b35090c311e68f941fe22eacf844
0	node_datasets-5e301d80933911e782cc6b087c4d7187
0	node_datasets-a1f7d420933611e782cc6b087c4d7187
0	node_last_update-264f3610963411e68f941fe22eacf844
0	node_last_update-75690590961711e68f941fe22eacf844
0	node_last_update-7a1298d0960911e68f941fe22eacf844
0	node_last_update-8cb93120961711e68f941fe22eacf844
0	node_last_update-91ee3db0961811e68f941fe22eacf844
0	node_last_update-c9c2a650961711e68f941fe22eacf844
0	raw_sensor_log-77a602f0008c11e890843184f0338630
0	registration_log-9a777a50c44f11e59521091830ac5256
0	sensor_data_decoded-8642c6e07c2b11e68f941fe22eacf844
0	sensor_data_decoded-93ebcfd07c5811e68f941fe22eacf844
0	sensor_data_decoded-dfac34f07c5811e68f941fe22eacf844
0	sensor_data_decoded-e2381ed090c311e68f941fe22eacf844
0	sensor_data_raw-7ec841b07c2b11e68f941fe22eacf844
0	sensor_data_raw-93d7d2a07c5811e68f941fe22eacf844
0	sensor_data_raw-df99be607c5811e68f941fe22eacf844
0	sensor_data_ttl-9029c1f0c54111e59521091830ac5256
0	sensor_data_ttl-9afd9a40c44f11e59521091830ac5256
0	sensor_data_ttl-d8932620c53c11e59521091830ac5256
0	test_decoded-365b9b40cc9711e692721fe22eacf844
4	raw_sensor_log-ff23f1c0008b11e890843184f0338630
4	sensor_data_raw_log-97a24b50008b11e890843184f0338630
40	message_data-f7fe65d099be11e791a46b087c4d7187
40	nodes_last_log-92f5ed10f09611e6839f3b370c0fef00
40	nodes_last_ssh-930de1e0f09611e6839f3b370c0fef00
76	nodes_last_data-91be5770f09611e6839f3b370c0fef00
112	nodes-9a216b10c44f11e59521091830ac5256
112	nodes_last_update-40143230963411e68f941fe22eacf844
180	metric_data-433e5450999311e791a46b087c4d7187
64180	sensor_data_raw-478d571090be11e68f941fe22eacf844
2632020	sensor_data-9abd35e0c44f11e59521091830ac5256
4710712	sensor_data_raw-6a36efb090be11e68f941fe22eacf844

What tables do we have in the Cassandra DB?

Command: /bin/beehive-cqlsh

#!/bin/sh
docker exec -ti beehive-cassandra cqlsh -k waggle

First, open a prompt using /bin/beehive-cqlsh:

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.2.1 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh:waggle> DESCRIBE tables;                         

1. sensor_data
2. sensor_data_raw  
3. nodes  

Table Details:

1. sensor_data: (This is where the old ASCII sensor data from the old nodes went; it should span early 2015 to mid-2017.)

cqlsh:waggle> DESCRIBE TABLE sensor_data ;

CREATE TABLE waggle.sensor_data (
    node_id ascii,
    date ascii,
    plugin_id ascii,
    plugin_version int,
    plugin_instance ascii,
    timestamp timestamp,
    sensor ascii,
    data list<ascii>,
    sensor_meta ascii,
    PRIMARY KEY ((node_id, date), plugin_id, plugin_version, plugin_instance, timestamp, sensor)
) WITH CLUSTERING ORDER BY (plugin_id ASC, plugin_version ASC, plugin_instance ASC, timestamp ASC, sensor ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
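Because the partition key is (node_id, date), an efficient read must supply both components. A hedged sketch of composing such a query in Python (the YYYY-MM-DD format for the date column is an assumption; confirm against real rows before relying on it):

```python
from datetime import datetime, timezone

def sensor_data_query(node_id: str, day: str) -> str:
    """Compose a CQL SELECT hitting exactly one (node_id, date)
    partition of waggle.sensor_data."""
    return (
        "SELECT timestamp, sensor, data FROM waggle.sensor_data "
        f"WHERE node_id = '{node_id}' AND date = '{day}';"
    )

# Assumed day format: the UTC day as YYYY-MM-DD.
day = datetime(2016, 7, 1, tzinfo=timezone.utc).strftime("%Y-%m-%d")
print(sensor_data_query("0000001e0610bc10", day))
```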

2. sensor_data_raw: (This is where the current raw data from the nodes goes. The data here is stored in raw form, using the old V2 protocol.)

cqlsh:waggle> DESCRIBE TABLE sensor_data_raw ;

CREATE TABLE waggle.sensor_data_raw (
    node_id ascii,
    date ascii,
    plugin_name ascii,
    plugin_version ascii,
    plugin_instance ascii,
    timestamp timestamp,
    parameter ascii,
    data ascii,
    ingest_id int,
    PRIMARY KEY ((node_id, date), plugin_name, plugin_version, plugin_instance, timestamp, parameter)
) WITH CLUSTERING ORDER BY (plugin_name ASC, plugin_version ASC, plugin_instance ASC, timestamp ASC, parameter ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

3. nodes: (We do not know what this table does or how it impacts the rest of the system, so we are keeping it alive for now.)

cqlsh:waggle> DESCRIBE TABLE nodes  ;

CREATE TABLE waggle.nodes (
    node_id ascii PRIMARY KEY,
    children list<ascii>,
    name ascii,
    parent ascii,
    plugins_all set<frozen<plugin>>,
    plugins_currently set<frozen<plugin>>,
    queue ascii,
    reverse_port int,
    timestamp timestamp
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

MySQL

Stores node management and metadata.

Deployment

Docker container: beehive-mysql

Docker image: mysql:5.7.10

Listening: 127.0.0.1:3306

Admin User: waggle

Admin Password: waggle

Default Database: waggle

Access

Use beehive-server/bin/beehive-mysql to connect to the waggle database inside the container.

Tables

Currently, there are many tables (mostly unused):

MySQL [waggle]> show tables;
+----------------------+
| Tables_in_waggle     |
+----------------------+
| calibration          |
| hardware             |
| node_config          |
| node_management      |
| node_management0     |
| node_meta            |
| node_notes           |
| node_offline         |
| nodes                |
| nodes0               |
| nodes1               |
| nodes2               |
| nodesApril7          |
| nodes_2017_06_21     |
| nodes_2017_08_10     |
| nodes_May16_2017     |
| nodes_May3           |
| role                 |
| roles_users          |
| software             |
| ssh_status           |
| testing_groups       |
| testing_nodes        |
| testing_nodes_groups |
| user                 |
+----------------------+

AFAIK, nodes is the only table in active use. It stores management data about each node. (It does not store SSH keys or certificates! That will be discussed in a section to be added.)

MySQL [waggle]> describe nodes;
+------------------+--------------+------+-----+----------+----------------+
| Field            | Type         | Null | Key | Default  | Extra          |
+------------------+--------------+------+-----+----------+----------------+
| id               | int(11)      | NO   | PRI | NULL     | auto_increment |
| node_id          | varchar(16)  | YES  |     | NULL     |                |
| project          | int(11)      | YES  | MUL | NULL     |                |
| description      | varchar(255) | YES  |     | NULL     |                |
| reverse_ssh_port | mediumint(9) | YES  |     | NULL     |                |
| hostname         | varchar(64)  | YES  |     | NULL     |                |
| hardware         | json         | YES  |     | NULL     |                |
| name             | varchar(64)  | YES  |     | NULL     |                |
| location         | varchar(255) | YES  |     | NULL     |                |
| last_updated     | timestamp    | YES  |     | NULL     |                |
| opmode           | varchar(64)  | YES  |     | testing. |                |
| groups           | varchar(128) | YES  |     |          |                |
+------------------+--------------+------+-----+----------+----------------+
12 rows in set (0.11 sec)

Some example rows are:

MySQL [waggle]> select * from nodes where description like '%aot chicago (s)%' limit 10;
+-----+------------------+---------+---------------------+------------------+----------+----------+------+------------------------------------------------+--------------+------------+----------+
| id  | node_id          | project | description         | reverse_ssh_port | hostname | hardware | name | location                                       | last_updated | opmode     | groups   |
+-----+------------------+---------+---------------------+------------------+----------+----------+------+------------------------------------------------+--------------+------------+----------+
| 132 | 0000001E0610BC10 |    NULL | AoT Chicago (S) [C] |            50057 | NULL     | NULL     | 01F  | State St - 87th (02/22/2018)                   | NULL         | production | v2 surya |
| 134 | 0000001E0610BA8B |    NULL | AoT Chicago (S) [C] |            50059 | NULL     | NULL     | 018  | CDOT,01/05/2018                                | NULL         | shipped    | v2 surya |
| 135 | 0000001E0610BA18 |    NULL | AoT Chicago (S)     |            50060 | NULL     | NULL     | 01D  | Damen Ave - Cermak [SEC] (12/15/2017)          | NULL         | testing    | v2 surya |
| 137 | 0000001E0610BA81 |    NULL | AoT Chicago (S)     |            50062 | NULL     | NULL     | 040  | Lake Shore Drive - 85th St (11/29/2017)        | NULL         | testing    | v2 tfx   |
| 140 | 0000001E0610BA16 |    NULL | AoT Chicago (S) [C] |            50065 | NULL     | NULL     | 010  | Ohio St - Grand Ave [NEX] (12/01/2017)         | NULL         | testing    | v2 surya |
| 141 | 0000001E0610BBF9 |    NULL | AoT Chicago (S) [C] |            50066 | NULL     | NULL     | 020  | Western Ave - 69th St [SEC] (02/13/2018)       | NULL         | production | v2 surya |
| 143 | 0000001E0610BA8F |    NULL | AoT Chicago (S)     |            50068 | NULL     | NULL     | 00D  | Cornell - 47th St                              | NULL         | production | v2 surya |
| 144 | 0000001E0610BA3B |    NULL | AoT Chicago (S)     |            50069 | NULL     | NULL     | 006  | 18th St - Lake Shore Dr                        | NULL         | production | v2 surya |
| 145 | 0000001E0610BBFF |    NULL | AoT Chicago (S)     |            50070 | NULL     | NULL     | 025  | Western Ave - 18th St [SEC] (12/15/2017)       | NULL         | testing    | v2 surya |
| 146 | 0000001E0610BBE5 |    NULL | AoT Chicago (S) [C] |            50071 | NULL     | NULL     | 02C  | Martin Luther King Dr. - 87th St. (02/16/2018) | NULL         | production |          |
+-----+------------------+---------+---------------------+------------------+----------+----------+------+------------------------------------------------+--------------+------------+----------+
10 rows in set (0.00 sec)

Cert

Manages node registration and node credentials, and generates the authorized_keys file (/mnt/waggle/SSL/nodes/authorized_keys) used by beehive-sshd.

Deployment

Docker container: beehive-cert

Docker image: beehive-cert

Listening: 127.0.0.1:24181 (but only used via beehive-sshd proxy command)

Summary

Runs an HTTP server responsible for:

  • Generating the authorized_keys file (/mnt/waggle/SSL/nodes/authorized_keys) used by beehive-sshd from MySQL. Adds an entry for the registration proxy command. (!!)
  • Getting node entry from MySQL + certs / keys from /mnt/waggle/SSL.
  • Creating node entry in MySQL + certs / keys in /mnt/waggle/SSL.
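OpenSSH authorized_keys entries commonly carry per-key restriction options. A hypothetical sketch of building one entry per node from its MySQL row; the option set and key-comment naming here are illustrative only, not the actual format beehive-cert writes:

```python
def authorized_keys_line(node_id: str, pubkey: str) -> str:
    """One authorized_keys entry for a node's reverse-tunnel key.
    The restriction options are illustrative, not beehive-cert's."""
    opts = "no-X11-forwarding,no-agent-forwarding,no-pty"
    return f"{opts} {pubkey} node_{node_id}"

# Hypothetical node row values:
line = authorized_keys_line("0000001E0610BC10", "ssh-rsa AAAAB3Nza")
print(line)
```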

Some of this is done using scripts in beehive-cert/SSL:

  • Generating CA cert / key.
  • Generating server cert / key signed by CA.
  • Generating client cert / key signed by CA.
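The exact flags are in the beehive-cert/SSL scripts; a minimal sketch of the same CA / server / client flow using plain openssl (subject names and validity are placeholders):

```shell
# Illustrative only: the real commands live in beehive-cert/SSL.
set -e
# CA key and self-signed CA cert
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.pem \
    -subj "/CN=beehive-demo-ca" -days 1
# Client key and CSR, then sign the CSR with the CA
openssl req -newkey rsa:2048 -nodes -keyout client.key -out client.csr \
    -subj "/CN=node-demo"
openssl x509 -req -in client.csr -CA ca.pem -CAkey ca.key \
    -CAcreateserial -out client.pem -days 1
# Verify the resulting chain
openssl verify -CAfile ca.pem client.pem
```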

The rest of the functionality is implemented by the API calls.

Beehive2

SSH client configuration for reaching the Beehive2 hosts through the mcs jump host:

Host beehive-prod
  ProxyCommand ssh -q mcs nc -q0 beehive01.cels.anl.gov 22
  User moose

Host beehive-dev
  ProxyCommand ssh -q mcs nc -q0 beehive01-dev.cels.anl.gov 22
  User moose

Host beehive-test
  ProxyCommand ssh -q mcs nc -q0 beehive01-test.cels.anl.gov 22
  User moose
