clairberlin / managair

Multi-tenant web application with REST API to manage sensor nodes, their installation environment and corresponding measurement data. Part of the Clair Stack.

License: BSD 3-Clause "New" or "Revised" License

Dockerfile 0.27% Shell 0.16% Python 92.21% HTML 7.10% CSS 0.26%
iot-platform sensor-network co2-measurements

managair's Introduction

Managair - The Clair Management Server

Managair is part of the Clair Platform [1], a system to collect measurements from networked CO2 sensors for indoor air-quality monitoring. It is developed and run by the Clair Berlin Initiative, a non-profit, open-source initiative to help operators of public spaces lower the risk of SARS-CoV-2 transmission among their patrons.

Technically speaking, Managair is a service in the Clair Stack, which is the infrastructure-as-code implementation of the Clair Platform.

Managair can also be run as ingestair, an application that ingests samples received via an HTTP API. Both the management functionality and the ingest functionality reside in the same code base because they share a common database.

Functionality

Managair is the key administrative service of the Clair Stack. It is a Django web application that manages users, their inventory of sensor nodes, and the measurement samples recorded by these nodes.

  • For samples that are received and decoded by specific protocol handlers in the Clair Stack, Managair offers an ingestion API endpoint that takes a sample and persists it in the underlying PostgreSQL database.
  • A RESTful API offers resources for CRUD operations on all inventory entities, like nodes, rooms, or sensor installations. This API furthermore provides resources to retrieve time series of measured samples for display and analysis.
  • Administrative access to all entities is possible via the Django Admin-UI.
  • Managair enforces an access control policy whereby resources and samples are visible to users that are members of a given organization only, but can be made public explicitly.
  • An independent fidelity service checks if data is received regularly from each node.
  • Publishers forward samples to other IoT data platforms if the corresponding sensor installation is marked as public.

The Managair Data Model

At the core of the Managair is a multitenant data model. A graph of the implemented entity-relationship model is available here

  • The key entity is the organization, which stands for a legal entity like a bar, a restaurant, a retail store, or a dental practice. An organization can also stand for a large institution, like a retail chain or a university.
  • A user is a digital identity for a natural person. Each user is identified by a username and authenticated via a password.
  • A user can be a member of one or more organizations. For each organization, the membership can take on one of two roles:
    • A user with the OWNER role has full control over all resources that belong to the organization.
    • The INSPECTOR has read access only.
  • An organization owns one or more sensor nodes. Each node reports measurement samples over time, where each sample is time-stamped and contains at least a CO2 measurement. Depending on the node model and the node protocol, a node might report additional measurement quantities.
  • Each organization can command one or more sites. A site models a physical location with an address and geo-coordinates. Examples for a site might be a restaurant, a pharmacy, a school, or a department store.
  • Each site can consist of one or more rooms. Each room, in turn, can be characterized by its size, height, and maximum occupancy.
  • The organization's sensor nodes are associated with a certain room by means of an installation. Each installation is valid for a given duration with start and end date and time. This way, a single node can be installed successively at various locations in one or more rooms.
  • All entities listed above are by default private. Only authenticated members of the organization can access the resources of the organization via the Managair REST API. However, each installation can be declared public. In this case, the room, site, and organization that contains this installation become public as well and can be accessed via the Managair API without prior authentication.
  • A time series is a sequence of samples recorded by one node. Time series can be accessed on a per-node basis and on a per-installation basis. A node-time-series covers the entire lifetime of the node, whereas an installation-time-series covers the duration of the installation only. Installation time-series are publicly accessible if the installation itself is marked as public. Node time-series are accessible to the organization only.
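The difference between a node time series and an installation time series can be sketched in plain Python. This is an illustration only, not Managair's actual models: the class and field names are hypothetical, and treating the end of an installation as exclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Sample:
    node_id: str
    timestamp: datetime
    co2_ppm: int


@dataclass
class Installation:
    node_id: str
    room: str
    start: datetime
    end: Optional[datetime]  # None means the installation is still active
    is_public: bool = False


def node_time_series(node_id: str, samples: List[Sample]) -> List[Sample]:
    """All samples of one node, over its entire lifetime."""
    return [s for s in samples if s.node_id == node_id]


def installation_time_series(
    installation: Installation, samples: List[Sample]
) -> List[Sample]:
    """Only the node's samples recorded while the installation was valid.

    Assumption: the interval is half-open, [start, end).
    """
    return [
        s
        for s in node_time_series(installation.node_id, samples)
        if s.timestamp >= installation.start
        and (installation.end is None or s.timestamp < installation.end)
    ]
```

Because the installation slice is derived from the installation's own validity period, marking an installation public exposes exactly that slice and nothing recorded before or after it.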

Architecture

Managair is a Django web application, written in Python 3. The primary REST API adheres to the JSON:API specification. To implement this API, we use the Django REST Framework (DRF) and the extension Django REST Framework JSON API (DJA). Authentication is handled by DRF's session authentication and by token-based authentication provided by DJ-Rest-Auth. Our task queue for the fidelity check services is provided by Django-Q.

REST API

Documentation of the Managair ReST API is available

If you make changes to the API, you need to re-generate the corresponding OpenAPI description file. To do so, execute python3 manage.py spectacular --file schema.yaml or, if you run the Clair Stack atop docker swarm, manage-py.sh <env> spectacular --file schema.yaml.

The schema.yaml should end up in the project's root folder, from where docker build will correctly package it.

Deployment

The Managair is designed to be deployed as part of the Clair Stack, a docker swarm setup that comprises the configuration of all services necessary to ingest node data and serve it via the Managair API to frontend applications.

Even though docker swarm automates most deployment tasks for the entire Clair Stack, there are several tasks that pertain to the Managair service proper:

Static Files

HTML templates, CSS, and media for the admin-UI and the browsable API are part of the Managair service. In a typical web application, it would be the job of a webserver to serve these files - as described in the Django documentation. To simplify configuration, and to align development with production setups, we take a somewhat different route and use the White Noise module instead. With the White Noise middleware installed, Managair can serve its own static files without the help of an additional webserver. This is not quite as performant but requires much less configuration and fewer manual steps during deployment.

Upon a fresh deployment, or whenever static files have changed, you can force Django to collect all static files from all registered Django apps into a common folder by running python manage.py collectstatic. The entrypoint.sh script of the Managair docker container automatically performs this task if the environment variable COLLECT_STATIC_FILES is set to true.

Translations

The templates of the accounts application contain translatable strings. Translations are currently provided for German (de). The initial setup was:

  • cd accounts
  • mkdir locale
  • django-admin makemessages -l de
  • add translations to locale/de/LC_MESSAGES/django.po
  • django-admin compilemessages to create the .mo file which is used for rendering

When changing the templates, run:

  • django-admin makemessages -a
  • update translations in locale/de/LC_MESSAGES/django.po
  • django-admin compilemessages

NOTE: The default language is determined in settings.py via LANGUAGE_CODE.

Secrets

Managair requires several secrets:

  • SECRET_KEY: The standard Django secret key that is used for digital signatures and to derive password salts.
  • SQL_PASSWORD: Password to connect to the main database.
  • EMAIL_HOST_PASSWORD: Password for the SMTP server used for sending mails.
  • SENTRY_URL: When using Sentry.io as a remote-monitoring service, Sentry's ingestion URL is a (weak) secret. Sentry remote monitoring is activated by setting the environment variable SENTRY=True.

Several more secrets might be necessary for integration with other IoT data platforms. For example, the integration with Stadtpuls Berlin (see the section on integrations below) requires the following additional secrets:

  • SP_API_KEY
  • SP_AUTH_TOKEN
  • SP_LOGIN_PWD

Secrets can be passed in either of two ways:

  • As value of an environment variable: For each of the secrets listed above, define an environment variable of the same name; e.g., SECRET_KEY.
  • As file content: Managair reads the secret value from a file where the filename must be provided via an environment variable <secret_name>_FILE; e.g., SECRET_KEY_FILE. This mechanism can be used in combination with Docker secrets, where Docker provides secrets to services as files inside an encrypted RAM-disk mounted at /run/secrets/<secret> inside the container.
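The two lookup paths can be sketched as a small helper. This is a minimal illustration rather than Managair's actual settings code: the function name read_secret is hypothetical, and giving the plain environment variable precedence over the _FILE variant is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_secret(name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the secret NAME from the environment or from a secrets file.

    Assumption: a directly set variable wins over <NAME>_FILE.
    """
    value = env.get(name)
    if value is not None:
        return value
    file_path = env.get(f"{name}_FILE")
    if file_path:
        # Docker secrets are plain files; strip the trailing newline.
        return Path(file_path).read_text().strip()
    return None
```

With Docker secrets, a service would set SECRET_KEY_FILE=/run/secrets/<secret> and call read_secret("SECRET_KEY") in its settings module.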

Environment

The following environment variables are available to influence Managair setup. Shown are their default settings:

  • SECRET_KEY or SECRET_KEY_FILE. Secret as explained above.
  • DEBUG=0. Set to 1 to use Django's debug mode with hot reload. Never use in production!
  • DEBUG_TOOLBAR=0. Set to 1 to activate the Django Debug Toolbar. Never use in production.
  • SENTRY=0. Set to 1 to activate remote monitoring via Sentry.io.
  • SENTRY_URL. Secret ingest URL, as explained above.
  • IOTDP_INTEGRATION=0. Boolean flag to enable forwarding of ingested samples to other IoT data platforms. Disabled by default. Set to 1 to enable integrations implemented via the mechanism explained in the Integrations section below.
  • NODE_FIDELITY=0. Set to 1 to activate regular monitoring of node traffic. The status of all nodes can be queried via the API at /api/v1/fidelity.
  • DJANGO_ALLOWED_HOSTS. Hosts allowed to connect. See the Django documentation for details.
  • EMAIL_HOST. Host name of the SMTP server used to send emails. See the Django email engine documentation for details.
  • EMAIL_PORT=587. Port of the SMTP server used to send emails.
  • EMAIL_HOST_USER. User name to connect to the SMTP server.
  • EMAIL_HOST_PASSWORD or EMAIL_HOST_PASSWORD_FILE. Secret as explained above.
  • EMAIL_USE_TLS=True.
  • DEFAULT_FROM_EMAIL. Default sender (From) address for sent mails.
  • SQL_ENGINE=django.db.backends.sqlite3. Default SQL database engine. Do not use SQLite in production. Consult the documentation on Django database backends.
  • SQL_DATABASE=db.sqlite3. Name of the main database to connect to.
  • SQL_USER=user. Default database user.
  • SQL_PASSWORD or SQL_PASSWORD_FILE. The database user's password, as explained above.
  • SQL_HOST=localhost. Host name on which the DBMS is running.
  • SQL_PORT=5432. Port to connect to the DBMS.
  • LOG_LEVEL=INFO. Log level for the Managair application. Only messages with log level of the given severity or higher will be logged. Must be one of DEBUG, INFO, WARNING, ERROR, or CRITICAL. See the Django logging documentation for details.
  • DJANGO_DB_LOG_LEVEL=WARNING. Log level for DBMS messages only.
  • DJANGO_LOG_LEVEL=WARNING. Log level for Django-internal messages.

The following environment variables determine how Managair is launched inside its Docker container, via the entrypoint.sh script:

  • DB_MIGRATE=False. Execute new database migrations upon launch, if available.
  • COLLECT_STATIC_FILES=False. Collect static files in a central location to be served via WhiteNoise.

User Registration and Authentication

The Managair uses dj-rest-auth for user authentication, in combination with the registration functionality from django-allauth. Authentication and registration are available at the /auth/ endpoint; individual resources follow the dj-rest-auth documentation.

As with the operational API, authentication and registration resources must be JSON:API documents with Content-Type application/vnd.api+json. For example, the body of a login request must look as follows:

{
  "data": {
    "type": "LoginView",
    "attributes": {
      "username": "maxMustermann",
      "password": "mustermann"
    }
  }
}
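As a sketch of how a client might assemble this request with the Python standard library: the helper name build_login_request and the base URL are hypothetical, and the path /auth/login/ is an assumption based on dj-rest-auth's default URL layout. Check the deployed API before relying on it.

```python
import json
import urllib.request


def build_login_request(
    base_url: str, username: str, password: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON:API login request."""
    body = {
        "data": {
            "type": "LoginView",
            "attributes": {"username": username, "password": password},
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/auth/login/",  # assumed path, see lead-in
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.api+json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request; that call is left out here so the sketch runs without a server.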

Node Status Fidelity Check

The Managair contains a background service that periodically checks for all registered nodes if a message has been received recently, within the last two hours (configurable). If so, Managair marks the node's fidelity as ALIVE. If the most recent sample is not older than twice this period (four hours), the node is marked MISSING. A node that has been quiet for longer is declared DEAD, while a node from which no messages have ever been received is UNKNOWN.
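The classification rule can be written down as a small pure function. This is a sketch of the rule as described above, not the code of core.tasks.check_node_fidelity; the function name and the handling of the exact boundary values are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify_fidelity(
    last_seen: Optional[datetime],
    now: datetime,
    period: timedelta = timedelta(hours=2),
) -> str:
    """Map the age of a node's most recent sample to a fidelity status."""
    if last_seen is None:
        return "UNKNOWN"  # no message has ever been received
    age = now - last_seen
    if age <= period:
        return "ALIVE"
    if age <= 2 * period:
        return "MISSING"
    return "DEAD"
```

With the default period of two hours, a node last heard from three hours ago is MISSING, and one silent for five hours is DEAD.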

The periodic fidelity check is performed by means of the background task scheduler Django-Q. It is active if the environment variable NODE_FIDELITY is set (=1).

Once the entire application stack has booted, you currently need to start its job queue by hand via the command python3 manage.py qcluster or, on the Clair Stack, manage-py.sh <env> qcluster.

Then, open up the admin-UI and schedule a Live-Node Check at an interval of your choice. The function to call is core.tasks.check_node_fidelity.

Results of the fidelity check are available at the API resource api/v1/fidelity, or via the admin UI.

Integrations

Managair in its sample-ingest configuration provides for a means to forward incoming samples to other IoT data platforms (IOTDP). For each incoming sample, the ingester determines if the sample corresponds to an active installation and if this installation has the flag is_public set to true. If so, the ingester publishes a Django signal that can be picked up by a custom integration application for use. In this way, it is possible to develop Django applications that subscribe to this signal. How each application performs the actual integration may differ.
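The forwarding decision can be illustrated without Django. In the sketch below, a plain list of callbacks stands in for the Django signal, and all class, method, and field names other than is_public are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class Installation:
    node_id: str
    start: datetime
    end: Optional[datetime]  # None means the installation is still active
    is_public: bool = False


class Ingester:
    """Notifies subscribers about samples that belong to a public installation."""

    def __init__(self, installations: List[Installation]) -> None:
        self.installations = list(installations)
        self.subscribers: List[Callable] = []  # stand-in for signal receivers

    def subscribe(self, callback: Callable) -> None:
        self.subscribers.append(callback)

    def active_installation(
        self, node_id: str, timestamp: datetime
    ) -> Optional[Installation]:
        for inst in self.installations:
            if (
                inst.node_id == node_id
                and inst.start <= timestamp
                and (inst.end is None or timestamp < inst.end)
            ):
                return inst
        return None

    def ingest(self, node_id: str, timestamp: datetime, co2_ppm: int) -> bool:
        """Return True if the sample was published to the subscribers."""
        inst = self.active_installation(node_id, timestamp)
        if inst is None or not inst.is_public:
            return False
        for callback in self.subscribers:
            callback(node_id, timestamp, co2_ppm, inst)
        return True
```

Keeping the ingester unaware of what subscribers do with a sample mirrors the decoupling that the signal provides: each integration decides on its own how to forward the data.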

As a first example, Managair comes with an integration for Stadtpuls Berlin, implemented as Django application stadtpuls_integration. This application has its own data model in the Django DB that stores which installations have already been registered with Stadtpuls. For each signal trigger, the Stadtpuls integration checks if the installation that corresponds to the incoming sample has already been registered with Stadtpuls. If yes, it maps the internal installation ID to the corresponding Stadtpuls sensor ID, converts the sample data format, and forwards it to the Stadtpuls ingest API. If not, it first registers a new Stadtpuls sensor and then proceeds as above.
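The register-on-first-use flow might look roughly like the following sketch. It is not the stadtpuls_integration code: the class name is hypothetical, an in-memory dict replaces the Django model, and the actual Stadtpuls API calls are abstracted into two injected callables.

```python
from typing import Callable, Dict


class SensorForwarder:
    """Maps installation IDs to remote sensor IDs, registering on first use."""

    def __init__(
        self,
        register_sensor: Callable[[int], str],  # installation ID -> remote sensor ID
        send_sample: Callable[[str, dict], None],  # remote sensor ID, payload
    ) -> None:
        self._register_sensor = register_sensor
        self._send_sample = send_sample
        self._sensor_ids: Dict[int, str] = {}  # stand-in for the DB table

    def forward(self, installation_id: int, co2_ppm: int) -> str:
        sensor_id = self._sensor_ids.get(installation_id)
        if sensor_id is None:
            # First sample of this installation: register a remote sensor.
            sensor_id = self._register_sensor(installation_id)
            self._sensor_ids[installation_id] = sensor_id
        # The payload format is a placeholder for the converted sample.
        self._send_sample(sensor_id, {"co2_ppm": co2_ppm})
        return sensor_id
```

Injecting the two callables keeps the mapping logic testable without network access.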

Development Setup

Managair is a Django web application atop a PostgreSQL DBMS. It is meant to be run as part of the Clair backend stack. To start up your development environment, consult the stack's Readme-file.

Debugging

When run in DEBUG mode, the managair_server has the Python Tools for Visual Studio Debug Server (PTVSD) included. It allows you to attach a Python debugger from within Visual Studio Code to the application running inside the container. To get started, copy dev_utils/launch.json into your project-local .vscode folder. To attach the debugger to the Managair application running inside a container, select the Debug Django run-configuration on the VS-Code debug pane. If the status bar turns from blue into orange, the debugger is attached. Set a breakpoint and fire a request to get going. Details on setup and usage can be found in this blog post.

Django Debug Toolbar

When the debug mode is activated via the environment variable DEBUG=1, the Django Debug Toolbar becomes visible on all HTML views; e.g., the admin UI or the browsable ReST API.

Data Fixtures

To start development work right away, it is convenient to have important data preloaded into the DB. This is what Django fixtures are for. Fixture files are JSON files that contain data in a format that can be directly imported into the DB. They are available for the individual applications in their fixture folders. To set up the application for development, load the fixtures as follows:

  • python3 manage.py loaddata user_manager/fixtures/user-fixtures.json
  • python3 manage.py loaddata core/fixtures/inventory-fixtures.json
  • python3 manage.py loaddata core/fixtures/data-fixtures.json

Make sure to respect the order because of foreign-key constraints. When Managair is executed in a docker container, the above commands must be executed inside the container; e.g., via docker exec. If you run Clair Stack, which is built atop docker swarm, you can use the manage-py.sh shell script to execute your Django management command inside the correct container.

Tools and Conventions

Testing

Django comes with extensive testing support, from unit tests, tests of the DB interaction to full-blown integration tests. As Django installs a separate testing DB, most of the tests could even be run on a production system without interfering with its operation. Therefore, we currently do not have a separate testing configuration of the Clair Stack - simply run the tests on your local development stack.

To execute the tests, use the following Django management command

$> python3 manage.py test <app>

where <app> is the Django application for which to execute test cases. For the Managair, it will most likely be core. For more detailed control over which tests to run, consult the testing documentation.

The tests can be executed perfectly well on a running Clair Stack on docker swarm. Use the management tool as follows:

$> ./tools/manage-py.sh environments/dev.env test core

Debugging While Testing

When doing test-driven development, or simply while developing tests, it is quite common that things don't work out as intended. In such a case, it is very helpful to set a breakpoint and launch into a debugging session right where the problem occurs. When running the Managair application as part of the Clair Stack, the debugger must be attached remotely.

Unfortunately, the standard attach procedure outlined above does not work - the debugger there waits for the development server to reload, which does not happen during testing. Instead, there is a separate debug configuration for use in Visual Studio Code: Select the Debug Django Tests run configuration and set a breakpoint in the test case you are working on. Then, use the Python script debugtests.py to start a test session with debug support. On docker swarm, execute it via

docker exec -it $(docker ps -q -f name=managair_server) python3 debugtests.py <test-args>

Once the script prompts Waiting for external debugger..., switch back to VS-Code and execute the Debug Django Tests run configuration (click on the green arrow).

Footnotes

  1. The Clair Platform and the Clair-Berlin initiative are now part of the CO2-Monitoring (COMo) project, funded by a grant from the Senate Chancellery of the Governing Mayor of Berlin.

managair's People

Contributors

jawebada, rtzll, ulischuster


managair's Issues

Add free-text comment field to the Node model.

As a user of a node, I want to attach additional information to a specific node.

As clair operators during the trial phase, we want to attach the owner of a test node to the node (e.g., CityLAB Berlin).

Test all available means of authentication.

Currently in the test cases, we use session-based authentication only. The goal is to test other authentication means as well - in particular, token-based authentication.

3/5 = 0.6 Remove unused measurement status from sample model.

The sample model has a field measurement_status, which was intended to tag erroneous measurements or interpolations - along the lines of electricity measurement data. However, we do not use the field currently.

To reduce complexity and eliminate potential bugs, let's simply get rid of the field altogether.

Add email alert to node fidelity job

As the operator of the Clair platform, or as an organization owner, I want to receive email notifications if no data is coming in from one or more sensor nodes.

Proposed solution: Use the already existing fidelity check and add an email alert to the node fidelity job. Send an email to the Clair platform operators if all nodes seem to be affected, otherwise send an email to members of the organization that have the role OWNER.

Add a status field to the installation model that indicates if fidelity notifications should be sent to OWNERs or not.

Complexity: 5
Value: 8
WSJF: 1.6

Make selected time series publicly visible.

  • As an OWNER of an organization, I want to make time series of specific node installations publicly visible, so that they can be inspected via the API.
  • As a member of the general public, I want to view time series reported by nodes installed in a specific site or room.

This is one of the intended key features of the entire Clair system - making CO2 time series available to the public.

Architecture idea:

  • The private/public distinction is stored for each node installation and pertains to the time slice for which this installation is valid.
  • If a node installation is marked public, basic data of the room, site and organization that host the installation automatically become public as well, so that the node can be searched for.
  • All resources that are publicly available are read-only.
  • The public API provides related resources per node installation, because these time series are automatically time-sliced correctly.

API: Implement role-based authorization

As authenticated user of the API, I want to get access to my resources to perform all actions I am entitled to.
As owner of a certain resource, I want to manage which other users can access the resource, and which actions they are allowed to perform.

The main concepts in our role-based access control (RBAC) scheme are the following:

  • The key entity is the organization. An organization owns sensor nodes, sites, and rooms. The sensor nodes, in turn, record samples. These resources belong to exactly one organization only; they cannot be transferred between entities.
  • Users are members of one or more organizations.
  • Each resource comes with permissions to view, create/add, update/change, and delete the resource (CRUD-permissions).
  • Within each organization, a user has exactly one membership role, where each role aggregates a set of individual permissions.
    • The OWNER has full control over all resources of the organization, including adding resources, adding or removing organization members, and deleting resources or the entire organization. An OWNER can change the membership role of other users.
    • The ASSISTANT can add and update nodes and rooms, add users, and change their role from INSPECTOR to ASSISTANT and back.
    • The INSPECTOR has read-access only to the resources of an organization.
  • Access to resources of a given organization is - by default - restricted to members of this organization.
  • In addition to authenticated users, there are unauthenticated visitors. By default, visitors do not have access to any resource.
  • The OWNER of an organization can choose to mark certain rooms or sites as public. A public mark pertains to the tree of resources below the thus-marked entity, all the way to nodes and their samples. Public resources are read-accessible to visitors.
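One way to express the proposed scheme is a table that maps each role to a set of (action, resource) permissions. The sketch below only illustrates the concept from this issue: the resource names, the function name is_allowed, and the assumption that an ASSISTANT may view everything an INSPECTOR can view are not taken from the Managair code.

```python
from typing import Optional

ACTIONS = {"view", "add", "change", "delete"}
RESOURCES = {"node", "room", "site", "membership", "organization"}

# Each role aggregates a set of individual (action, resource) permissions.
ROLE_PERMISSIONS = {
    "OWNER": {(a, r) for a in ACTIONS for r in RESOURCES},
    "ASSISTANT": {("view", r) for r in RESOURCES}
    | {
        ("add", "node"),
        ("change", "node"),
        ("add", "room"),
        ("change", "room"),
        ("add", "membership"),
        ("change", "membership"),
    },
    "INSPECTOR": {("view", r) for r in RESOURCES},
}


def is_allowed(
    role: Optional[str], action: str, resource: str, is_public: bool = False
) -> bool:
    """Check a request against the role table.

    role is the user's membership role in the owning organization,
    or None for non-members and unauthenticated visitors.
    """
    if role is None:
        # Visitors get read access to public resources only.
        return is_public and action == "view"
    return (action, resource) in ROLE_PERMISSIONS.get(role, set())
```

A table like this keeps the policy in one place, so adding a role or tightening a permission does not require touching the individual API views.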

3/5 = 0.6 Add validation on room-node-installation to prevent overlapping installations.

The M:N relation between a room and the installed node has its own relation model that allows for time-sliced attribution. Because the Django ORM does not support temporal SQL extensions, there is no DB integrity check that would prevent a node being attributed to more than one room at a time. Therefore, we need to introduce a validation higher up in the stack.
This validation could happen in

  • the serialiser validation; this would be the earliest possible validation, and would need to be applied for all serialisers that alter node-to-room attributions;
  • in custom managers on the models involved. It is not quite clear to me if we would need validations on both sides of the relation (the Node and Room models), or just on the RoomNodeInstallation relationship model.
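Wherever the validation ends up, its core is an interval-overlap test. A minimal sketch follows, assuming half-open validity periods [start, end) and an open end represented by None; the function names are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple

Period = Tuple[datetime, Optional[datetime]]  # (start, end); end None = open


def periods_overlap(a: Period, b: Period) -> bool:
    """True if two half-open periods share at least one instant."""
    (start_a, end_a), (start_b, end_b) = a, b
    a_ends_before_b = end_a is not None and end_a <= start_b
    b_ends_before_a = end_b is not None and end_b <= start_a
    return not (a_ends_before_b or b_ends_before_a)


def validate_new_installation(new: Period, existing: Iterable[Period]) -> None:
    """Raise ValueError if the new period collides with one of the same node."""
    for period in existing:
        if periods_overlap(new, period):
            raise ValueError("Node is already installed during this period.")
```

In a serializer or model validation, existing would be the periods of all other installations of the same node.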

Email confirmation link flagged as unsafe in MS Outlook

The email confirmation link uses an HTTP URL, which is flagged as unsafe by Microsoft Outlook's safe-link feature.
The problem probably arises because Managair uses HTTP URLs only. Rewriting to HTTPS happens in the Traefik reverse proxy. For the link in the email to use HTTPS, we probably need to adjust all Managair URLs.

1/20 = 0.1 Set up production webserver

In development mode, Django uses its builtin development server. However, for DEBUG=0, a production-grade web server must be used.

This task is not about a web server for static files, but about a WSGI server to run the Managair code. Official Django documentation is available here: https://docs.djangoproject.com/en/3.1/howto/deployment/

Requirements

  • The setup should be easy to integrate with the rest of our clair-stack
  • Ideally, we can also use it during development, to minimize the gap between development and production.
  • Should be easy to bring up in our dockerized environment.

5/50 = 0.1 Opt-in for being added to an organization

As a user, I want to be asked for confirmation when someone adds me to an organization.

Use Case:

  • The OWNER of an organization attempts to add a user as a member of the organization via the /memberships/ endpoint.
  • The membership status is set to PENDING and the user is informed to review the request.
  • The user reviews the request and either declines or confirms the membership.

Open issues:

  • how to handle notifications

Send alarm when no samples are received within X minutes

Goal: Monitor each node's ingestion and time series to figure out if

  • A node no longer seems to be active
  • The data ingested from the node cannot be turned into time series samples.

The monitoring system should raise alerts that can be picked up and processed by some notification mechanism for different outlets; e.g., in logs, on the admin web page, via email, or the like.

Extract docker system setup into a separate repository, with links to the individual applications.

We decided to move to a setup with an ingestair application for data ingestion and a managair/inspectair application to view data and inspect it. All these applications should operate on a common PostgreSQL DB.

Basically, this is the setup currently in use on the managair repository. To include the ingestair application, the docker system setup must be extracted from managair, stored in a separate repository and linked for ease of development and deployment.

Revise Managair API for consistency and to capture basic use cases.

The current ReST API is a direct mapping of our DB relations and not very well suited for use in various applications.

Goal:

  • Revise the API for consistency and ease of use
  • The API should be in line with HTTP specification and follow standard design practices for ReST APIs.

Integrate email send functionality as notification path

As Managair user, I want to receive information about important events; in particular to confirm my account, reset my password, and to confirm membership in an organization.

Emails are a still widely used means of asynchronously informing a user. Django provides email integration. Therefore, we want to use this feature for the above-listed information use cases. The goal here is to get email integration up and running, make it testable, and integrate it with the sign-up use case first.

1/5 = 0.2 Extend the Sample entity to persist a sequence number.

  • As a user of time series data, I want to know if some samples are missing so that I can take remedial actions, like interpolation.
  • As a DevOps engineer tasked to debug problems with sample ingestion, with sensor nodes or with the transport network, I want to know if samples are missing to have a starting point for further investigations.

Currently, the Sample entity records a timestamp plus the measured quantities. From this information, it is not possible to conclude if samples are missing in a time series.

However, some sample nodes or sample networks do record sequence numbers for all recorded samples. When available, we should make this information part of each Sample entity and persist it.

Goal of the present issue is to extend the Sample entity and related API code so that a sequence number can be provided upon ingestion and retrieved via the Timeseries and Sample API.
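Once sequence numbers are persisted, gaps in a time series can be found with a few lines of Python. This sketch assumes monotonically increasing integer sequence numbers without counter wrap-around; the function name is hypothetical.

```python
from typing import Iterable, List


def missing_sequence_numbers(sequence_numbers: Iterable[int]) -> List[int]:
    """Return the sequence numbers absent between the lowest and highest seen."""
    seen = sorted(set(sequence_numbers))
    missing: List[int] = []
    for previous, current in zip(seen, seen[1:]):
        # Everything strictly between two neighbours was never received.
        missing.extend(range(previous + 1, current))
    return missing
```

For example, the received numbers 1, 2, 5, 6, 8 yield the missing numbers 3, 4, and 7.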

Enhancing the TTN client is subject of a separate issue.

Installation privacy cannot be toggled

Toggling an installation's privacy should be accomplished via a PATCH on a given installation resource. However, a PATCH of the is_private flag fails because of a validation that checks for the related node and owner.

@reststate/vuex expects the name of top-level list endpoints and their types to be identical

This is relevant only for resources that are not requested explicitly but are included in responses. I had to match the address endpoint to Address and organizations to Organization because @reststate/vuex creates vuex modules per type and expects the name of the resource list endpoints and their type to be identical. The JSON:API specification is agnostic about this but features an example where type is in plural form: https://jsonapi.org/format/#document-resource-objects

Note: This spec is agnostic about inflection rules, so the value of type can be either plural or singular. However, the same value should be used consistently throughout an implementation.

This is not a blocker for now but might suggest that other JSON:API libraries expect the same.
