
Integrated Docker Stack for the RADAR mHealth Streaming Platform Components

Home Page: https://hub.docker.com/u/radarbase/dashboard/

License: Apache License 2.0


radar-docker's Introduction


RADAR-Docker 2.2.0


The dockerized RADAR stack for deploying the RADAR-base platform. Component repositories can be found at RADAR-base DockerHub org.

โ— Important Notice
We have made many improvements to the RADAR-base platform over the past years. A key improvement is the migration to a Kubernetes-based deployment, which allows automated application deployment, scaling, and management. Please note that the RADAR-Docker stack is no longer actively maintained and will be deprecated by the end of 2021. We therefore strongly recommend setting up the Kubernetes-based installation of the platform instead. You can find the installation guidelines in the RADAR-Kubernetes repository.

We are still working on improving our documentation. If you would like to contribute by improving the documentation or providing feedback, please contact the RADAR-base community via Slack.

Installation instructions

To install the RADAR-base stack, do the following:

  1. Install Docker Engine

  2. Install docker-compose using the installation guide or by following our wiki.

  3. Verify the Docker installation by running on the command-line:

    docker --version
    docker-compose --version

    This should show Docker version 1.12 or later and docker-compose version 1.9.0 or later.

  4. Install git for your platform.

    1. For Ubuntu

      sudo apt-get install git
  5. Clone RADAR-Docker repository from GitHub.

    git clone https://github.com/RADAR-base/RADAR-Docker.git
  6. Install required component stack following the instructions below.
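The minimum versions from step 3 can also be checked programmatically. A sketch, assuming GNU `sort -V` is available (the `version_ge` helper is illustrative, not part of the repository):

```shell
# version_ge ACTUAL MINIMUM: succeed if ACTUAL >= MINIMUM.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Extract the first dotted version number from each tool's --version output.
docker_v=$(docker --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n1)
compose_v=$(docker-compose --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n1)

version_ge "$docker_v" 1.12   || echo "need Docker >= 1.12, found: ${docker_v:-none}"
version_ge "$compose_v" 1.9.0 || echo "need docker-compose >= 1.9.0, found: ${compose_v:-none}"
```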

Usage

RADAR-Docker currently offers two component stacks:

  1. A Docker Compose stack with components from the Confluent Kafka Platform (community edition).
  2. A Docker Compose stack with components from the RADAR-base platform.

Note: on macOS, remove sudo from all docker and docker-compose commands in the usage instructions below.

Confluent Kafka platform

The Confluent Kafka platform integrates the basic streaming components: Zookeeper, Kafka brokers, the Schema Registry and the REST Proxy.

Run this stack in a single-node setup on the command-line:

cd RADAR-Docker/dcompose-stack/radar-cp-stack/
sudo docker-compose up -d

To stop this stack, run:

sudo docker-compose down

RADAR-base platform

In addition to the Confluent Kafka platform components, the RADAR-base platform offers:

  • RADAR-HDFS-Connector - cold storage of selected streams in Hadoop data storage,
  • RADAR-MongoDB-Connector - hot storage of selected streams in MongoDB,
  • RADAR-Dashboard,
  • RADAR-Streams - real-time aggregated streams,
  • RADAR-Monitor - status monitors,
  • RADAR-HotStorage via MongoDB,
  • RADAR-REST API,
  • a Hadoop cluster,
  • an email server,
  • Management Portal - a web portal to manage patient monitoring studies,
  • RADAR-Gateway - a validating gateway that admits only valid and authentic data to the platform, and
  • Catalog server - a service to share the source types configured in the platform.

To run the RADAR-base stack in a single-node setup:
  1. Navigate to radar-cp-hadoop-stack:

    cd RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/
  2. Follow the README instructions there for correct configuration.

Logging

Set up a logging service by going to the dcompose-stack/logging directory and following the README there.

Work in progress

The following two stacks will not work with only Docker and docker-compose. For the Kerberos stack, the Kerberos image is not public. The multi-host setup additionally requires docker-swarm and beta versions of Docker.

Kerberized stack

In this setup, Kerberos is used to secure the connections between the Kafka brokers, Zookeeper and the Kafka REST API. Unfortunately, the Kerberos container from Confluent is not publicly available, so an alternative has to be found here.

$ cd wip/radar-cp-sasl-stack/
$ docker-compose up

Multi-host setup

In the end, we aim to deploy the platform in a multi-host environment. We are currently aiming for a deployment with Docker Swarm. This setup uses features that are not yet released in the stable Docker Engine. Once they are, this stack may become the main Docker stack. See the wip/radar-swarm-cp-stack/ directory for more information.

radar-docker's People

Contributors

afolarin, baixiac, blootsvoets, dennyverbeeck, fnobilia, mpgxvii, nivemaham, sboettcher, stsnel, yatharthranjan


radar-docker's Issues

SSL certificate troubles

I updated my repos and docker platform today and am now getting

RestSender: Failed to make heartbeat request to RestClient{timeout=20, config=https://10.231.51.9/kafka/}: javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found.

from the app when trying to connect. I'm pretty sure I just forgot something regarding the self-signed certificate, but I can't figure it out... Any idea what I should try?
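One quick check when chasing this class of error is to inspect the certificate chain the server actually presents; a sketch with `openssl` (the host below is the IP from the log above; the command only prints what is served, it does not change anything):

```shell
# Show the certificate chain nginx presents; if the chain does not include
# the self-signed CA, a client that has not imported the CA certificate
# will fail with a trust-anchor error like the one above.
openssl s_client -connect 10.231.51.9:443 -showcerts </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer
```

If subject and issuer are identical, the certificate is self-signed and the client must explicitly trust it (or the CA that issued it).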

[DIR]                               [BRANCH]             [ORIGIN]               [REV]
./RADAR-AndroidApplication          On branch dev        github.com:sboettcher  1862a66
./RADAR-Android-Application-Status  On branch dev        github.com/RADAR-CNS   b9a4d8b
./RADAR-Android-Audio               On branch master     github.com/RADAR-CNS   9e6418d
./RADAR-Android-Biovotion           On branch dev        github.com:sboettcher  3867554
./RADAR-Android-Empatica            On branch dev        github.com/RADAR-CNS   2ac4ecd
./RADAR-Android-Pebble              On branch master     github.com/RADAR-CNS   f01809b
./RADAR-Android-Phone               On branch dev        github.com/RADAR-CNS   9b9eb4e
./RADAR-Backend                     On branch dev        github.com/RADAR-CNS   89fa075
./RADAR-Commons                     On branch dev        github.com/RADAR-CNS   338b45f
./RADAR-Commons-Android             On branch dev        github.com/RADAR-CNS   df94904
./RADAR-Dashboard                   On branch master     github.com/RADAR-CNS   90a3351
./RADAR-Docker                      On branch dev        github.com/RADAR-CNS   9470b84
./RADAR-RestApi                     On branch dev        github.com/RADAR-CNS   254acef
./RADAR-Schemas                     On branch biovotion  github.com:sboettcher  fc0aad6
./Restructure-HDFS-topic            On branch master     github.com/RADAR-CNS   514b69e

install script breaks nginx.conf

current (9a5487b) install-radar-stack.sh breaks the nginx.conf server name; it can, for example, be fixed by changing

-inline_variable 'server_name[[:space:]]*' $SERVER_NAME etc/nginx.conf
+inline_variable 'server_name[[:space:]]*' "$SERVER_NAME;" etc/nginx.conf

edit: breaks is a strong word, it produces a warning...

nginx: [warn] server name "/var/log/nginx/access.log" has suspicious symbols in /etc/nginx/nginx.conf:19

Write self-signed certificate authority to file

When a self-signed certificate gets created, the certificate of the self-made certificate authority should be written to the local file system. That way clients can choose to trust the certificate authority and use the self-signed certificate.
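A minimal sketch of the workflow with `openssl` (all file names and subject names below are illustrative): create the CA, write its certificate to disk, sign a server certificate with it, and confirm that a client trusting the written CA file accepts the server certificate.

```shell
set -e
tmp=$(mktemp -d)

# Create a self-made CA and write its certificate to the filesystem,
# which is exactly the artifact clients need in order to opt in to trust.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=RADAR test CA" \
  -keyout "$tmp/ca.key" -out "$tmp/ca.crt" -days 365 2>/dev/null

# Issue a server certificate signed by that CA.
openssl req -newkey rsa:2048 -nodes -subj "/CN=radar.example.org" \
  -keyout "$tmp/server.key" -out "$tmp/server.csr" 2>/dev/null
openssl x509 -req -in "$tmp/server.csr" -CA "$tmp/ca.crt" -CAkey "$tmp/ca.key" \
  -CAcreateserial -out "$tmp/server.crt" -days 365 2>/dev/null

# A client that trusts ca.crt can now validate the server certificate.
openssl verify -CAfile "$tmp/ca.crt" "$tmp/server.crt"
```

A client (for instance an Android app) can then be given ca.crt to trust, instead of disabling certificate validation altogether.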

Monitoring

We need to monitor the Docker cluster. Different solutions are currently available to monitor Docker containers and the applications running inside them. The most used are:

All of them are supported by Docker out of the box (Docker doc).

The first is an all-in-one solution: it provides ingestion, visualisation, a REST API and alerts. The other two have to be coupled with a storage component to persist the information (e.g. Elasticsearch) and a visualisation tool (e.g. Kibana).

For more details check

To monitor the hardware, the best candidate is Google cAdvisor. It can be combined with InfluxDB and Grafana to persist and visualise data (example). Another solution is to use cAdvisor with Prometheus, a metrics system with dashboards and alerting capabilities (ex1, ex2).

All these solutions are open-source.

Explicitly move WIP stacks to a WIP directory

The radar-cp-stack and radar-cp-hadoop-stack are now tested on Travis in the travis branch. For each stack, Travis tests whether its docker-compose files successfully start all containers. The sasl and swarm stacks are not tested, because they are expected to fail. Can we move those to an experimental or WIP directory? Once Docker 1.13 is released, the radar-cp-swarm-stack should hopefully be able to run on Travis.

0 length avro files in HDFS

As mentioned here RADAR-base/radar-output-restructure#3.
For seemingly random topics I have encountered HDFS files with 0 B length, mostly at the beginning of a partition offset range. It looks like the data for that offset range is just missing.

hdfs fsck returns:

Status: HEALTHY
 Total size:	63129575 B (Total open files size: 805 B)
 Total dirs:	142
 Total files:	7252
 Total symlinks:		0 (Files currently being written: 23)
 Total blocks (validated):	7233 (avg. block size 8727 B) (Total open file blocks (not validated): 23)
 Minimally replicated blocks:	7233 (100.0 %)
 Over-replicated blocks:	0 (0.0 %)
 Under-replicated blocks:	7233 (100.0 %)
 Mis-replicated blocks:		0 (0.0 %)
 Default replication factor:	3
 Average block replication:	1.0
 Corrupt blocks:		0
 Missing replicas:		14466 (66.666664 %)
 Number of data-nodes:		2
 Number of racks:		1
FSCK ended at Wed Jul 19 13:49:03 UTC 2017 in 266 milliseconds


The filesystem under path '/' is HEALTHY

Example of what the files look like

root@21d4b7f7b9a8:/# hdfs dfs -ls /topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/
Found 21 items
-rw-r--r--   3 root supergroup       1419 2017-07-18 09:08 /topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/android_empatica_e4_inter_beat_interval+1+0000000000+0000000000.avro
-rw-r--r--   3 root supergroup          0 2017-07-18 12:48 /topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/android_empatica_e4_inter_beat_interval+1+0000000001+0000000099.avro
-rw-r--r--   3 root supergroup       1419 2017-07-18 11:41 /topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/android_empatica_e4_inter_beat_interval+1+0000000100+0000000100.avro
-rw-r--r--   3 root supergroup       7977 2017-07-18 11:41 /topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/android_empatica_e4_inter_beat_interval+1+0000000101+0000000250.avro
-rw-r--r--   3 root supergroup       7977 2017-07-18 11:41 /topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/android_empatica_e4_inter_beat_interval+1+0000000251+0000000400.avro

Example of HDFS restructure script log output

boettche@nz1200:~/radar-docker$ cat /data/radar/storage/restructured/restructure_log_18-07-2017_17-52-35 | grep zero
2017-07-18 15:51:56 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_eeg_sync_pulse/partition=1/android_eeg_sync_pulse+1+0000000301+0000000450.avro has zero length, skipping.
2017-07-18 15:51:56 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_eeg_sync_pulse/partition=1/android_eeg_sync_pulse+1+0000000451+0000000600.avro has zero length, skipping.
2017-07-18 15:51:56 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_eeg_sync_pulse/partition=1/android_eeg_sync_pulse+1+0000000601+0000000750.avro has zero length, skipping.
2017-07-18 15:51:56 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_eeg_sync_pulse/partition=1/android_eeg_sync_pulse+1+0000000751+0000000900.avro has zero length, skipping.
2017-07-18 15:51:56 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_eeg_sync_pulse/partition=1/android_eeg_sync_pulse+1+0000000901+0000001050.avro has zero length, skipping.
2017-07-18 15:52:08 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_heart_rate_variability/partition=2/android_biovotion_heart_rate_variability+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:08 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_battery_state/partition=2/android_biovotion_battery_state+2+0000000001+0000000002.avro has zero length, skipping.
2017-07-18 15:52:08 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_spo2/partition=2/android_biovotion_spo2+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:08 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_blood_pulse_wave/partition=2/android_biovotion_blood_pulse_wave+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:23 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_galvanic_skin_response/partition=2/android_biovotion_galvanic_skin_response+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:23 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_energy/partition=2/android_biovotion_energy+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:24 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_phone_battery_level/partition=1/android_phone_battery_level+1+0000000001+0000000001.avro has zero length, skipping.
2017-07-18 15:52:24 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_respiration_rate/partition=2/android_biovotion_respiration_rate+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:33 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_empatica_e4_battery_level/partition=1/android_empatica_e4_battery_level+1+0000000001+0000000002.avro has zero length, skipping.
2017-07-18 15:52:33 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_biovotion_heart_rate/partition=2/android_biovotion_heart_rate+2+0000000001+0000000084.avro has zero length, skipping.
2017-07-18 15:52:33 WARN  RestructureAvroRecords:213 - File hdfs://hdfs-namenode:8020/topicAndroidNew/android_empatica_e4_inter_beat_interval/partition=1/android_empatica_e4_inter_beat_interval+1+0000000001+0000000099.avro has zero length, skipping.
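The affected files can be listed directly rather than grepping the restructure log: field 5 of `hdfs dfs -ls` output is the file size, so a small `awk` filter over a recursive listing finds them. A sketch (the directory below matches the stack's default `topics.dir`; adjust as needed):

```shell
# Print the paths of all zero-byte .avro files under the topic directory.
hdfs dfs -ls -R /topicAndroidNew | awk '$5 == 0 && $NF ~ /\.avro$/ {print $NF}'
```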

Handle incoming requests with OPTIONS methods for MP

POST and GET requests are in most instances preceded by a pre-flight OPTIONS request to the server. The aRMT app was thus not able to access the token endpoint of the Management Portal, because the OPTIONS method is forbidden with the following error message -

polyfills.js:3 OPTIONS https://radar-cns-platform.rosalind.kcl.ac.uk//managementportal/oauth/token 403 (Forbidden)
(index):1 Failed to load https://radar-cns-platform.rosalind.kcl.ac.uk//managementportal/oauth/token: Response for preflight has invalid HTTP status code 403

This could probably be implemented as a conditional block in the nginx.conf file

Gateway doesn't allow OPTIONS request

We get a 401 Unauthorized when making a pre-flight OPTIONS request. Right now, I have added a conditional block to the nginx config on the production server to handle this. It's something like

          if ($request_method = OPTIONS){
                  add_header Access-Control-Allow-Origin "*";
                  add_header Access-Control-Allow-Methods "GET, POST, HEAD, OPTIONS";
                  add_header Access-Control-Allow-Headers "Authorization, Content-Type";
                  add_header Access-Control-Allow-Credentials "true";
                  add_header Content-Length 0;
                  add_header Content-Type text/plain;
                  return 200;
          }

Should we handle this in nginx, or should the gateway handle this CORS issue?

Schema compilation still failing for schema version > 0.2.3

Schema compilation is failing with the following log output -

sudo docker-compose run --rm kafka-init
Starting radarcphadoopstack_zookeeper-1_1 ... done
Starting radarcphadoopstack_kafka-1_1 ... done
Starting radarcphadoopstack_kafka-2_1 ... done
Starting radarcphadoopstack_kafka-3_1 ... done
Starting radarcphadoopstack_schema-registry-1_1 ... done
Compiling schemas...
Input files to compile:
  merged/commons/stream/aggregator/phone_usage_aggregate.avsc
  merged/commons/stream/aggregator/aggregate_list.avsc
  merged/commons/stream/aggregator/numeric_aggregate.avsc
  merged/commons/monitor/application/application_uptime.avsc
  merged/commons/monitor/application/application_record_counts.avsc
  merged/commons/monitor/application/application_server_status.avsc
  merged/commons/monitor/application/application_external_time.avsc
  merged/commons/active/questionnaire/questionnaire.avsc
  merged/commons/active/thincit/thinc_it_pdq.avsc
  merged/commons/active/thincit/thinc_it_input_type.avsc
  merged/commons/active/thincit/thinc_it_trails.avsc
  merged/commons/active/thincit/thinc_it_symbol_check.avsc
  merged/commons/active/thincit/thinc_it_spotter.avsc
  merged/commons/active/thincit/thinc_it_code_breaker.avsc
  merged/commons/kafka/aggregate_key.avsc
  merged/commons/kafka/observation_key.avsc
  merged/commons/catalogue/processing_state.avsc
  merged/commons/catalogue/radar_widget.avsc
  merged/commons/catalogue/time_window.avsc
  merged/commons/catalogue/unit.avsc
  merged/commons/passive/weather/local_weather.avsc
  merged/commons/passive/pebble/pebble2_battery_level.avsc
  merged/commons/passive/pebble/pebble2_acceleration.avsc
  merged/commons/passive/pebble/pebble2_heart_rate.avsc
  merged/commons/passive/pebble/pebble2_heart_rate_filtered.avsc
  merged/commons/passive/phone/phone_step_count.avsc
  merged/commons/passive/phone/phone_acceleration.avsc
  merged/commons/passive/phone/phone_sms.avsc
  merged/commons/passive/phone/phone_call.avsc
  merged/commons/passive/phone/phone_battery_level.avsc
  merged/commons/passive/phone/phone_bluetooth_devices.avsc
  merged/commons/passive/phone/phone_relative_location.avsc
  merged/commons/passive/phone/phone_user_interaction.avsc
  merged/commons/passive/phone/phone_light.avsc
  merged/commons/passive/phone/phone_gyroscope.avsc
  merged/commons/passive/phone/phone_usage_event.avsc
  merged/commons/passive/phone/phone_magnetic_field.avsc
  merged/commons/passive/phone/phone_sms_unread.avsc
  merged/commons/passive/phone/phone_contact_list.avsc
  merged/commons/passive/empatica/empatica_e4_battery_level.avsc
  merged/commons/passive/empatica/empatica_e4_electrodermal_activity.avsc
  merged/commons/passive/empatica/empatica_e4_tag.avsc
  merged/commons/passive/empatica/empatica_e4_acceleration.avsc
  merged/commons/passive/empatica/empatica_e4_blood_volume_pulse.avsc
  merged/commons/passive/empatica/empatica_e4_inter_beat_interval.avsc
  merged/commons/passive/empatica/empatica_e4_sensor_status.avsc
  merged/commons/passive/empatica/empatica_e4_temperature.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_ppg_raw.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_led_current.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_galvanic_skin_response.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_temperature.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_blood_pulse_wave.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_battery_level.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_heart_rate_variability.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_energy.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_oxygen_saturation.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_acceleration.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_respiration_rate.avsc
  merged/commons/passive/biovotion/biovotion_vsm1_heart_rate.avsc
log4j:WARN No appenders could be found for logger (AvroVelocityLogChute).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.apache.avro.SchemaParseException: Undefined name: "NumericAggregate"
	at org.apache.avro.Schema.parse(Schema.java:1228)
	at org.apache.avro.Schema.parse(Schema.java:1306)
	at org.apache.avro.Schema.parse(Schema.java:1269)
	at org.apache.avro.Schema$Parser.parse(Schema.java:1032)
	at org.apache.avro.Schema$Parser.parse(Schema.java:997)
	at org.apache.avro.tool.SpecificCompilerTool.run(SpecificCompilerTool.java:92)
	at org.apache.avro.tool.Main.run(Main.java:87)
	at org.apache.avro.tool.Main.main(Main.java:76)

Interestingly, this only fails for schema versions 0.3 or greater. I guess that is when NumericAggregate was added.
Also, this is only failing on the production server, even after the dependent-enum fix in kafka-radarinit/init.sh. It works fine on the Amazon test instance.
Any thoughts @blootsvoets , @nivemaham ?

Docker bug breaks kafka-init Dockerfile

OS version: Linux moon 4.4.97-1-MANJARO #1 SMP PREEMPT Wed Nov 8 10:27:12 UTC 2017 x86_64 GNU/Linux
docker version: Docker version 17.10.0-ce, build f4ffd2511c
docker-compose version: docker-compose version 1.13.0, build 1719ceb

https://github.com/RADAR-CNS/RADAR-Docker/blob/dev/dcompose-stack/radar-cp-hadoop-stack/kafka-radarinit/Dockerfile#L18
and
https://github.com/RADAR-CNS/RADAR-Docker/blob/dev/dcompose-stack/radar-cp-hadoop-stack/kafka-radarinit/Dockerfile#L21
fail for me, since the behaviour of ADD seems to be inconsistent with what it should be (https://docs.docker.com/engine/reference/builder/#add). The downloaded archives are extracted and deleted, not just downloaded. There are several bug reports about this; apparently it was fixed in 17.06.1, but I am using 17.10.0, which still (again?) has the bug.

Fix proposal:

diff --git a/dcompose-stack/radar-cp-hadoop-stack/kafka-radarinit/Dockerfile b/dcompose-stack/radar-cp-hadoop-stack/kafka-radarinit/Dockerfile
index 7de69d4..317663e 100644
--- a/dcompose-stack/radar-cp-hadoop-stack/kafka-radarinit/Dockerfile
+++ b/dcompose-stack/radar-cp-hadoop-stack/kafka-radarinit/Dockerfile
@@ -14,10 +14,10 @@ RUN mkdir -p /schema/merged
 
 WORKDIR /schema
 
-ADD https://github.com/RADAR-CNS/RADAR-Schemas/archive/v0.2.tar.gz /schema/
-RUN tar xzf v0.2.tar.gz && mv RADAR-Schemas-0.2 original && rm v0.2.tar.gz
+ADD https://github.com/RADAR-CNS/RADAR-Schemas/archive/v0.2.tar.gz /schema/RADAR-Schemas-0.2.tar.gz
+RUN tar xzf RADAR-Schemas-0.2.tar.gz && mv RADAR-Schemas-0.2 original && rm RADAR-Schemas-0.2.tar.gz
 
-ADD https://github.com/RADAR-CNS/RADAR-Schemas/releases/download/v0.2/radar-schemas-tools-0.2.tar.gz /schema/
+ADD https://github.com/RADAR-CNS/RADAR-Schemas/releases/download/v0.2/radar-schemas-tools-0.2.tar.gz /schema/radar-schemas-tools-0.2.tar.gz
 RUN tar xzf radar-schemas-tools-0.2.tar.gz --strip-components=1 -C /usr && rm radar-schemas-tools-0.2.tar.gz
 
 VOLUME /schema/conf

Update Wiki Documentation

  • list of current features implemented
  • list of features waiting to be implemented
  • basic setup example on EC2, single-host deployment

Check-health.sh does not work with crontab

The check-health.sh file right now does not work with crontab, because it uses relative paths for the util.sh, .env and docker-compose.yml files. We should update these to absolute paths so that it works with crontab.
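A common fix, assuming check-health.sh is a bash script, is to resolve the script's own directory once at the top and prefix every path with it. A sketch (the referenced files are the ones named above):

```shell
#!/bin/bash
# Resolve the absolute directory containing this script, so the script also
# works when cron invokes it with a different working directory.
DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "$DIR"

# Then reference everything relative to $DIR, e.g.:
#   . "$DIR/util.sh"
#   sudo docker-compose -f "$DIR/docker-compose.yml" ps
```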

Load Testing RADAR Platform

We need an evaluation of the throughput capabilities of the full single-node deployment, so we can estimate resource requirements for the pilots and future work.

Install script add topic names multiple times

In the connector properties, the install script adds topic names multiple times if the values already exist. For example, take a look at the sink-hdfs.properties file -

name=radar-hdfs-sink-android-15000
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=4
topics=android_biovotion_vsm1_acceleration,android_biovotion_vsm1_battery_level,android_biovotion_vsm1_blood_volume_pulse,android_biovotion_vsm1_energy,android_biovotion_vsm1_galvanic_skin_response,android_biovotion_vsm1_heartrate,android_biovotion_vsm1_heartrate_variability,android_biovotion_vsm1_led_current,android_biovotion_vsm1_oxygen_saturation,android_biovotion_vsm1_ppg_raw,android_biovotion_vsm1_respiration_rate,android_biovotion_vsm1_temperature,android_bittium_faros_acceleration,android_bittium_faros_acceleration,android_bittium_faros_battery_level,android_bittium_faros_ecg,android_bittium_faros_inter_beat_interval,android_empatica_e4_acceleration,android_empatica_e4_acceleration,android_empatica_e4_battery_level,android_empatica_e4_battery_level,android_empatica_e4_blood_volume_pulse,android_empatica_e4_blood_volume_pulse,android_empatica_e4_electrodermal_activity,android_empatica_e4_electrodermal_activity,android_empatica_e4_inter_beat_interval,android_empatica_e4_inter_beat_interval,android_empatica_e4_temperature,android_empatica_e4_temperature,android_local_weather,android_pebble_2_acceleration,android_pebble_2_battery_level,android_pebble_2_heartrate,android_pebble_2_heartrate_filtered,android_phone_acceleration,android_phone_acceleration,android_phone_battery_level,android_phone_battery_level,android_phone_bluetooth_devices,android_phone_bluetooth_devices,android_phone_call,android_phone_call,android_phone_contacts,android_phone_contacts,android_phone_gyroscope,android_phone_gyroscope,android_phone_light,android_phone_light,android_phone_magnetic_field,android_phone_magnetic_field,android_phone_relative_location,android_phone_relative_location,android_phone_sms,android_phone_sms,android_phone_sms_unread,android_phone_sms_unread,android_phone_step_count,android_phone_step_count,android_phone_usage_event,android_phone_usage_event,android_phone_user_interaction,android_phone_user_interaction,application_external_time,application_external_time,application_record_counts,application_record_counts,application_server_status,application_server_status,application_time_zone,application_uptime,application_uptime,notification_thinc_it,questionnaire_audio,questionnaire_esm,questionnaire_esm,questionnaire_esm,questionnaire_phq8,questionnaire_phq8,questionnaire_phq8,questionnaire_rses,questionnaire_rses,questionnaire_rses,task_2MW_test,task_romberg_test,task_tandem_walking_test,thincit_code_breaker,thincit_code_breaker,thincit_pdq5,thincit_pdq5,thincit_spotter,thincit_spotter,thincit_symbol_check,thincit_symbol_check,thincit_trails,thincit_trails,
flush.size=80000
rotate.interval.ms=900000
hdfs.url=hdfs://hdfs-namenode:8020
format.class=org.radarcns.sink.hdfs.AvroFormatRadar
topics.dir=topicAndroidNew
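The duplicates can be stripped before the value is written back. A sketch of the idea on a small sample list (the real list comes from sink-hdfs.properties; `awk '!seen[$0]++'` keeps only the first occurrence of each topic, preserving the original order):

```shell
# A comma-separated topic list with duplicates, as produced by the install script.
topics="android_phone_sms,android_phone_sms,android_phone_light,android_phone_sms_unread,android_phone_light"

# Split on commas, drop repeated entries, and join back into one line.
deduped=$(printf '%s' "$topics" | tr ',' '\n' | awk '!seen[$0]++' | paste -sd, -)
echo "$deduped"
# android_phone_sms,android_phone_light,android_phone_sms_unread
```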

Testing stack stability/behaviour on several major serverside events for production deployment.

In the recent past I have had some problems with the Docker stack here at the clinic, mainly related to rebooting and/or OS-updating the server we are currently using. The server is part of the clinic infrastructure (so I won't be able to avoid updates etc.) and will also be used for the production deployment. One problem, for example, was that the MongoDB connector threw an exception after the server was rebooted while the docker-compose stack was just stopped and not downed. Maybe some error happened during the reboot while data was being streamed.

Several things should be tested to see if e.g. data corruption can happen.

  • rebooting the server while the stack is streaming/running/stopped/downed
  • stop/down stack while streaming (from multiple sources)
  • ...

This is also e.g. to find out a best practice for rebooting/updating a production server.

Handle incoming requests with OPTIONS method for MP

POST and GET requests are in most instances preceded by a pre-flight OPTIONS request to the server. The aRMT app was thus not able to access the token endpoint of the Management Portal, because the OPTIONS method is forbidden with the following error message -

polyfills.js:3 OPTIONS https://radar-cns-platform.rosalind.kcl.ac.uk//managementportal/oauth/token 403 (Forbidden)
(index):1 Failed to load https://radar-cns-platform.rosalind.kcl.ac.uk//managementportal/oauth/token: Response for preflight has invalid HTTP status code 403

This could probably be implemented as a conditional block in the nginx.conf file

Update main README.md

The main README is only a stub; it needs more comprehensive instructions:

  • single-host deployment instructions (for the moment)
  • caveats, e.g. autoscaling not being available at present
  • Zookeeper issues
  • add other

Test with multi-host networking solution

Lots of options here; it depends on how we want to go.

Multi-host networking:

  • overlay (not sure about performance)
  • Weave.io
  • Flannel

Full clustered option:

  • Kubernetes
  • docker-swarm

Additional RADAR components

Compose containers to be added:

Requirements for Pilot Project No1:

  • HDFS (Jan 2017)
  • Nodejs
  • MongoDB
  • Tomcat REST-API
  • Backend: Kafka-HDFS connector
  • Backend: Kafka-MongoDB connector
  • Backend: E4 streams

Security Components:

  • HTTP proxy
  • SSO component

A local time server (for pilot studies) is easier to deploy as a system package, as it needs to modify the system time.

Get a replacement Kerberos container

Although it is listed in the examples folder of the SASL Confluent Platform dockerized stack, the Kerberos container used doesn't exist or is private (more likely part of some enterprise stack Confluent is holding back):

i.e. https://github.com/confluentinc/cp-docker-images/tree/master/examples/kafka-cluster-sasl

 kerberos:
    image: confluentinc/cp-kerberos:latest
    network_mode: host
    environment:
      BOOTSTRAP: 0
    volumes:
    - ${KAFKA_SASL_SECRETS_DIR}:/tmp/keytab
    - /dev/urandom:/dev/random

If we want to use this for AuthN, we should try to create a new one.

How to manage secret parameters within docker

The easiest way is to use a combination of environment variables and an env-file. That way, parameters are provided at run time: there is no need to hardcode them inside the image. This approach does not work with certificates; in that case, a volume may help.

An advanced alternative that covers all possible scenarios, and is for this reason often used in production, is HashiCorp Vault. Katacoda tutorial
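A sketch of the env-file approach (the file name and variable below are illustrative; docker-compose reads the same KEY=value format through its `env_file:` option, and `docker run` through `--env-file`):

```shell
# Write secrets to an env-file that is kept out of version control.
cat > smtp.env <<'EOF'
SMTP_PASSWORD=changeme
EOF

# Shell equivalent of what docker-compose does with env_file:
# export every variable defined in the file, then use it at run time.
set -a
. ./smtp.env
set +a
echo "$SMTP_PASSWORD"
# With docker directly: docker run --rm --env-file smtp.env my-image
```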

Investigate the HDFS data retention

Investigate HDFS data retention, since HDFS is an intermediate data store.
Data governance tools like Apache Falcon might be overkill for our use case.

Retention could also be implemented in the HDFS-Restructure app, since it is aware of the processed offsets.
Should the data retention be based on the Kafka log retention policy, or on something else?

UnknownTopicOrPartitionException for _schemas

Upon start-up, at least one broker sometimes raises this error:
ERROR [ReplicaFetcherThread-0-1], Error for partition [_schemas,0] to broker 1:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread)

Modifying the containers' dependencies has mitigated the issue: now it appears only sometimes, whereas before it was always raised.
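Beyond compose dependency ordering, a start-up wait loop is another common mitigation; a sketch (the host name, port and timeout are illustrative, and `/dev/tcp` is a bash feature):

```shell
#!/bin/bash
# wait_for HOST PORT TIMEOUT: poll once per second until a TCP connection
# to HOST:PORT succeeds, or give up after TIMEOUT attempts.
wait_for() {
  local host=$1 port=$2 timeout=$3
  for _ in $(seq "$timeout"); do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# e.g. delay broker start until Zookeeper answers:
wait_for zookeeper-1 2181 5 || echo "Zookeeper not reachable, starting anyway"
```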

List of basic testing steps

/cc @fnobilia

Connect via SSH to the machine where you have installed the Confluent Platform, then run the following commands:

# Start Zookeeper
$ ./bin/zookeeper-server-start config/zookeeper.properties

# Start Kafka
$ ./bin/kafka-server-start config/server.properties

# Start Schema Registry
$ ./bin/schema-registry-start config/schema-registry.properties

# Start Rest Proxy
$ ./bin/kafka-rest-start ./etc/kafka-rest/kafka-rest.properties &

# Create acceleration topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_acceleration

# Create battery topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_battery_level

# Create blood volume topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_blood_volume_pulse

# Create electrodermal topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_electrodermal_activity

# Create inter_beat topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_inter_beat_interval

# Create temperature topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_temperature

# Create sensor_status topic
$ ./bin/kafka-topics.sh --create --zookeeper <ZOOKEEPER_PATH> --replication-factor 1 --partitions 3 --topic android_empatica_e4_sensor_status

# In order to check whether you are receiving Empatica's data, run an Avro consumer for each topic
$ ./bin/kafka-avro-console-consumer --topic <TOPIC_NAME> --zookeeper <ZOOKEEPER_PATH> --from-beginning
# For instance
$ ./bin/kafka-avro-console-consumer --topic android_empatica_e4_acceleration --zookeeper <ZOOKEEPER_PATH> --from-beginning 
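The seven topic-creation commands above can be collapsed into one loop; a sketch, assuming `kafka-topics.sh` is reachable from the working directory and defaulting to the out-of-the-box ZooKeeper address. `DRY_RUN=echo` makes the script print the commands instead of running them; unset it to actually create the topics:

```shell
#!/bin/sh
# Create every Empatica E4 topic in one pass instead of seven
# near-identical commands. DRY_RUN=echo prints commands only.
ZOOKEEPER_PATH="${ZOOKEEPER_PATH:-localhost:2181}"
DRY_RUN="${DRY_RUN:-echo}"

TOPICS="android_empatica_e4_acceleration
android_empatica_e4_battery_level
android_empatica_e4_blood_volume_pulse
android_empatica_e4_electrodermal_activity
android_empatica_e4_inter_beat_interval
android_empatica_e4_temperature
android_empatica_e4_sensor_status"

created=0
for topic in $TOPICS; do
    $DRY_RUN ./bin/kafka-topics.sh --create --zookeeper "$ZOOKEEPER_PATH" \
        --replication-factor 1 --partitions 3 --topic "$topic"
    created=$((created + 1))
done
```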

Before installing the Android application, update the following files and then recompile it.

in app/src/main/resources/values/server.xml: 

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="kafka_rest_proxy_url">YOUR_REST_PROXY_PATH</string>
    <string name="schema_registry_url">YOUR_SCHEMA_REGISTRY_PATH</string>
</resources>


in app/src/main/resources/values/device.xml: 

<string name="group_id">YOUR_USER_ID</string>
Note that:

 - each of the 12 commands above has to be run in its own shell

 - if you have installed the Confluent Platform via apt, the base path will be /usr/bin/ instead of bin/

 - if you have installed the Confluent Platform via sudo, all commands have to be run by a user in sudoers

 - if you have not modified the out-of-the-box configuration files of Confluent the ZOOKEEPER_PATH is localhost:2181

 - an example of YOUR_REST_PROXY_PATH is http://YOUR_IP:8082

 - an example of YOUR_SCHEMA_REGISTRY_PATH is http://YOUR_IP:8081

 - an example of YOUR_USER_ID is radarTestKCLEmpaticaDevice0

 - the topic name list can be found inside org.radarcns.empaticaE4.E4Topics 

Requesting SSL certificate does not work if too many certs already issued

If too many certificates have already been issued, we get the following error:

There were too many requests of a given type :: Error creating new cert :: too many certificates already issued for exact set of domains: radar-backend.ddns.net
Please see the logfiles in /var/log/letsencrypt for more details.
touch: /etc/openssl/live/radar-backend.ddns.net/.letsencrypt: No such file or directory

Instead, there should be a way to retrieve the previously issued certificate when it is not found locally.
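A minimal sketch of that check: reuse a certificate that already exists locally and only fall through to a new request when none is found. The live-directory path is taken from the error message above; the filename `fullchain.pem` is an assumption based on Let's Encrypt conventions:

```shell
#!/bin/sh
# Reuse an existing certificate if present, to stay under the
# Let's Encrypt per-domain rate limit.
DOMAIN="radar-backend.ddns.net"
LIVE_DIR="/etc/openssl/live/$DOMAIN"

if [ -f "$LIVE_DIR/fullchain.pem" ]; then
    cert_action="reuse"
    echo "Certificate for $DOMAIN already present in $LIVE_DIR; skipping request"
else
    cert_action="request"
    echo "No local certificate for $DOMAIN; a new one would be requested here"
    # e.g. invoke certbot/letsencrypt only in this branch
fi
```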

Register custom topics via schemas-tools in kafka-init

It would be nice to have a way of easily registering custom topics via the kafka-init mechanism. Part of that is already there, with the etc/schema/ volume that gets merged; that way the topic can be created. However, the schemas-tools are then missing the respective class for registering the topic with the schema registry (see log).
What I did was build my own schemas-tools that include the class, and load those in the Dockerfile instead of the GitHub release. Maybe this can be done in a nicer way, unless I missed something that already implements this?

kafka-init_1                   | 2017-11-13T20:07:16.014342553Z [main] ERROR org.radarcns.schema.util.Utils - Failed to apply function, returning empty.
kafka-init_1                   | 2017-11-13T20:07:16.014729136Z java.lang.IllegalStateException: Topic android_eeg_sync_pulse schema cannot be instantiated
kafka-init_1                   | 2017-11-13T20:07:16.014874882Z 	at org.radarcns.config.AvroTopicConfig.parseAvroTopic(AvroTopicConfig.java:46)
kafka-init_1                   | 2017-11-13T20:07:16.014894100Z 	at org.radarcns.schema.specification.DataTopic.getTopics(DataTopic.java:61)
kafka-init_1                   | 2017-11-13T20:07:16.014904177Z 	at org.radarcns.schema.util.Utils.lambda$applyOrEmpty$0(Utils.java:112)
kafka-init_1                   | 2017-11-13T20:07:16.014927819Z 	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
kafka-init_1                   | 2017-11-13T20:07:16.014939647Z 	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
kafka-init_1                   | 2017-11-13T20:07:16.014955428Z 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
kafka-init_1                   | 2017-11-13T20:07:16.014979859Z 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
kafka-init_1                   | 2017-11-13T20:07:16.014997257Z 	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
kafka-init_1                   | 2017-11-13T20:07:16.015008452Z 	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
kafka-init_1                   | 2017-11-13T20:07:16.015019391Z 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
kafka-init_1                   | 2017-11-13T20:07:16.015033303Z 	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
kafka-init_1                   | 2017-11-13T20:07:16.015059088Z 	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
kafka-init_1                   | 2017-11-13T20:07:16.015070639Z 	at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1620)
kafka-init_1                   | 2017-11-13T20:07:16.015086843Z 	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
kafka-init_1                   | 2017-11-13T20:07:16.015101537Z 	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
kafka-init_1                   | 2017-11-13T20:07:16.015115079Z 	at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958)
kafka-init_1                   | 2017-11-13T20:07:16.015155521Z 	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
kafka-init_1                   | 2017-11-13T20:07:16.015190318Z 	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
kafka-init_1                   | 2017-11-13T20:07:16.015196351Z 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
kafka-init_1                   | 2017-11-13T20:07:16.015205266Z 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
kafka-init_1                   | 2017-11-13T20:07:16.015215939Z 	at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230)
kafka-init_1                   | 2017-11-13T20:07:16.015242588Z 	at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196)
kafka-init_1                   | 2017-11-13T20:07:16.015255832Z 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
kafka-init_1                   | 2017-11-13T20:07:16.015270802Z 	at java.util.stream.ReferencePipeline.allMatch(ReferencePipeline.java:454)
kafka-init_1                   | 2017-11-13T20:07:16.015285701Z 	at org.radarcns.schema.registration.SchemaRegistry.registerSchemas(SchemaRegistry.java:74)
kafka-init_1                   | 2017-11-13T20:07:16.015301602Z 	at org.radarcns.schema.registration.SchemaRegistry$RegisterCommand.execute(SchemaRegistry.java:167)
kafka-init_1                   | 2017-11-13T20:07:16.015320640Z 	at org.radarcns.schema.CommandLineApp.main(CommandLineApp.java:191)
kafka-init_1                   | 2017-11-13T20:07:16.015454724Z Caused by: java.lang.IllegalArgumentException: Topic android_eeg_sync_pulse schema cannot be instantiated
kafka-init_1                   | 2017-11-13T20:07:16.015478518Z 	at org.radarcns.topic.AvroTopic.parse(AvroTopic.java:135)
kafka-init_1                   | 2017-11-13T20:07:16.015490273Z 	at org.radarcns.config.AvroTopicConfig.parseAvroTopic(AvroTopicConfig.java:44)
kafka-init_1                   | 2017-11-13T20:07:16.015513243Z 	... 26 more
kafka-init_1                   | 2017-11-13T20:07:16.015623872Z Caused by: java.lang.ClassNotFoundException: org.radarcns.passive.eegsync.EegSyncPulse
kafka-init_1                   | 2017-11-13T20:07:16.015651514Z 	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
kafka-init_1                   | 2017-11-13T20:07:16.015673188Z 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
kafka-init_1                   | 2017-11-13T20:07:16.015698230Z 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
kafka-init_1                   | 2017-11-13T20:07:16.015730638Z 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
kafka-init_1                   | 2017-11-13T20:07:16.015744674Z 	at java.lang.Class.forName0(Native Method)
kafka-init_1                   | 2017-11-13T20:07:16.015760521Z 	at java.lang.Class.forName(Class.java:264)
kafka-init_1                   | 2017-11-13T20:07:16.015770986Z 	at org.radarcns.topic.AvroTopic.parse(AvroTopic.java:124)
kafka-init_1                   | 2017-11-13T20:07:16.015779249Z 	... 27 more

kafka-init cannot see brokers

Running from the current dev branch, I get the following error from kafka-init:

kafka-init_1               | curl: (7) Failed to connect to rest-proxy-1 port 8082: Connection refused
kafka-init_1               | Expected 3 brokers but found only 0. Waiting 32 second before retrying ...
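For reference, the retry behaviour visible in that log (the doubling wait) can be sketched as a poll-with-backoff loop. The REST proxy call is replaced by a stub here; the real hostname and port (`rest-proxy-1:8082`) come from the log above:

```shell
#!/bin/sh
# Poll until the expected broker count appears, doubling the wait each
# retry, as kafka-init appears to do.
expected=3
wait_s=2
max_wait=64

count_brokers() {
    # Stub standing in for something like:
    #   curl -s http://rest-proxy-1:8082/brokers
    echo 3
}

while [ "$(count_brokers)" -lt "$expected" ]; do
    echo "Expected $expected brokers but found fewer. Waiting $wait_s seconds before retrying ..."
    sleep "$wait_s"
    wait_s=$((wait_s * 2))
    if [ "$wait_s" -gt "$max_wait" ]; then
        echo "Giving up"
        break
    fi
done
found=$(count_brokers)
```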

redcap-config-listener broken on mac

On macOS, redcap-config-listener gives an error. Starting it as a background process is also quite brittle (it breaks when logging out of an ssh session); implementing it as a system service would be preferable. Alternatively, a restart could be triggered manually after editing configuration files, which would be more transparent.
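On Linux hosts, the system-service idea could look roughly like the following systemd unit; the unit name, executable path, and dependency on Docker are all hypothetical:

```
# /etc/systemd/system/redcap-config-listener.service (sketch)
[Unit]
Description=RADAR REDCap configuration listener
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/local/bin/redcap-config-listener.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

A systemd unit survives ssh logout and restarts on failure, which addresses the brittleness of the background-process approach (macOS would need a launchd equivalent instead).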

Update various components in the stack and new Release

New versions have been released for major components like radar-backend, mongodb-connector, hdfs-connector, etc. These should be updated in the stack.
Also include the latest version of the hdfs-restructure jar.
Finally, make a release of RADAR platform 2.0.0 after these are updated and the new Confluent version is in use -- refer to #120.
