maxiv-kitscontrols / web-maxiv-hdbppviewer

A web-based viewer for HDB++/Cassandra data. Project hosted on the MAX IV internal GitLab.

License: GNU General Public License v3.0



Introduction

This is a web-based viewer for HDB++ archive data, currently supporting only the Cassandra backend.

It is currently in a "beta" stage, with basic functionality in place but very limited testing. Bug reports are welcome!

Features

Basic functionality

  • Searching for stored attributes
  • Selecting which attributes to add
  • Free scrolling/zooming the time scale in the plot
  • Two separate Y axes (no hard restriction, but needs UI)
  • Y axes autoscale
  • Encodes current view in URL (e.g. for saving as a bookmark)
  • Display min/max etc. on mouseover
  • Linear and logarithmic Y axes
  • Cache database queries in memory

Missing functionality

  • Configure color, Y axis, etc. for each line
  • Periodical updates
  • Display attribute configuration
  • Display errors
  • "Special" datatypes: String, Boolean, State, Spectrum, ...
  • Cassandra authentication (?)
  • General robustness
  • Allow downloading "raw" data
  • Displaying data as a table
  • Manual scaling of Y axes
  • Rescale the UI when the window size changes
  • Handling different keyspaces

Improvements needed

  • Optimize data readout and processing
  • UI is pretty basic
  • Mouseover stuff is a mess
  • Server configuration
  • Not sure about encoding the view state as JSON in the URL hash...

Ideas

  • Use websocket to send data incrementally?
  • Use canvas for plotting
  • The view is currently re-loaded each time anything changes; maybe it's possible to be smarter here?
  • Would it be useful (or just confusing) to allow more than two Y-axes?
  • Other ways of browsing for attributes; e.g. a tree?
  • Mobile optimized view? The plot actually works pretty well on a mobile screen, but the rest is unusable as it is.

Requirements

Note: the repo includes a Dockerfile that can be used to build a Docker container image for easy deployment together with all dependencies. Have a look in the file for instructions; you will probably need to modify some things to suit your needs.

Python (for running)

  • python >= 3.5
  • aiohttp
  • aiohttp_cors
  • aiohttp_utils
  • cassandra-driver >= 3.6 (needs to be built with numpy support!)
  • datashader

Datashader has a bunch of scientific Python dependencies; the easiest way to get it is probably through Anaconda (https://www.continuum.io/downloads).

Javascript (for frontend development)

  • node.js
  • npm

Cassandra

You also, obviously, need to have access to a Cassandra installation somewhere, containing HDB++ formatted data.

Building

The frontend is written using Babel, React and Redux, and managed with webpack. To build it, the following steps should work:

$ npm install
$ webpack

Docker Image build

Before building the Docker image, please version your release by editing the VERSION variable in the Makefile. Then you can build by simply executing:

$ make build

and publish your image:

$ make publish

Configuration

By default, the server will load the config file "hdbppviewer.conf". It contains some example configuration and comments. You can create your own configuration file and put it wherever you like, and point the server to it using the "-c" argument.

Running

$ python server.py

Then point a web browser at http://localhost:5005/index.html

Local installation

To disable the database, comment out the HDBPlusPlusConnection in server.py and put 'hdbpp = None' instead. The project (front-end only) can then be compiled and served locally. Currently we need to run two processes/commands simultaneously: one re-generates bundle.js and one serves it. The commands are:

  1. npm run dev
  2. npm run watch

You can run these commands in separate terminals (until we come up with one command that does both parts).

Querying

It's also possible to access data from the server in JSON or CSV format, e.g. using httpie from the command line. Searching for attributes matching a given string (which may be a regex):

$ http --json POST localhost:5005/search target=mag cs='my.control.system:10000'

Get data for a given period of time for one or more attributes, resampled to 5m intervals, in CSV format:

$ http --json POST localhost:5005/query targets:='[{"target": "r3/mag/dia-01/current", "cs": "my.control.system:10000"}]' range:='{"from": "2017-06-16T15:00:00", "to": "2017-06-16T15:30:00"}' interval=5m Accept:text/csv

This API is intentionally compatible with Grafana, making it easy to create a Grafana plugin for HDB++ data (in progress).
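For scripting, the same /query payload can also be built and sent from Python. A minimal sketch (the attribute name, control system address and server URL are just the illustrative values from the examples above; the actual request is commented out since it requires a running server):

```python
import json

# Build the same /query payload as the httpie example above.
# The target and cs values are the illustrative ones from this README.
payload = {
    "targets": [
        {"target": "r3/mag/dia-01/current", "cs": "my.control.system:10000"}
    ],
    "range": {
        "from": "2017-06-16T15:00:00",
        "to": "2017-06-16T15:30:00",
    },
    "interval": "5m",
}

# To actually send it (requires a running server), something like:
#   import requests
#   r = requests.post("http://localhost:5005/query", json=payload,
#                     headers={"Accept": "text/csv"})
#   print(r.text)

print(json.dumps(payload, indent=2))
```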

web-maxiv-hdbppviewer's People

Contributors

13bscsaamjad, antoinedupre, beenje, hardion, henquist, johanfforsberg, maxlvmt, meguiraun, muhammad-saad-maxiv, saad17com


web-maxiv-hdbppviewer's Issues

Plot configuration

The user should be able to configure some things about the plot, most importantly:

  • Line colors (widths?)
  • Y axis limits
  • It should be possible to hide/show individual lines without having to add/remove them from the plot

Zoom out crash?

If the user zooms out too quickly, way too many requests are made and everything becomes too slow. We should limit the rate of scroll events somehow.

Display "events"

The plot should show important archiving "events", such as when an attribute was started/stopped, added, removed, paused etc. This should help distinguish between errors and manual actions.

Range input values

Initialize the input values with the ranges that come from the current plots.

Markers for curves

Allow different markers (square, dot, ...) for a curve, selectable by the user.

Goal: better differentiate each plot

Display errors

The plot needs to somehow indicate where there were errors recorded.

More y-axes

A request from the users is to be able to let each attribute have its own Y axis, so that they can all be scaled independently. This is very useful for seeing how lots of different things correlate in time. The EPICS "StripTool" https://epics.anl.gov/EpicsDocumentation/ExtensionsManuals/StripTool/StripTool.html#GraphWindow that was used at the old MaxLab has been brought up as an example.


Today the viewer has only two independent Y scales, right and left, and each attribute is displayed on one of them. Technically there should be no problem supporting more Y axes; it's mostly a matter of UI and how to present it. Displaying many axes at the same time doesn't seem like a good idea, since it becomes a mess visually. StripTool apparently displays only one Y axis at a time, depending on which attribute is selected.

I think this would require some changes in the UI but it would be a powerful feature.

Mouseover information

The mouse cursor should reveal more specific information about the closest line, such as the name of the attribute, value (min/max/avg?) at the point, etc.
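A sketch of the kind of nearest-line lookup this would need (illustrative only; the function name and sample data are made up):

```python
def closest_series(series, x, y):
    """Return the name of the series whose sample nearest to x is
    closest in value to y -- the lookup a mouseover handler needs."""
    best_name, best_dist = None, float("inf")
    for name, points in series.items():
        # points: list of (x, value) samples
        px, pv = min(points, key=lambda p: abs(p[0] - x))
        dist = abs(pv - y)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name

# Two made-up series with samples at x = 0, 1, 2.
series = {
    "a": [(0, 1.0), (1, 1.1), (2, 1.2)],
    "b": [(0, 5.0), (1, 5.5), (2, 6.0)],
}
print(closest_series(series, 1, 5.4))  # -> b
```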

Logging

Add some kind of logging mechanism, writing to a file (or to Kibana).

Download image seems broken

When clicking the button I see the following in the console:

Uncaught TypeError: can't access property 0, _this3.plot.svg._groups is undefined onClick webpack:///./js/plotwrapper.js?:139

Color selection improvements

  • Show the currently selected color next to 'pick color'
  • An 'OK' button for closing the color picker panel
  • Be able to change the color of already existing plots; be aware that the user may want to re-color several plots, so keep the changes on hold and apply them via a new button
  • Reset the color selection when loading the attribute list
  • When the user does not select a color, also display what the default color will be

The user can break the server

It's pretty easy for a user to basically break the server by e.g. zooming out a lot. I think the main reason is that there's no limitation on the number of pending queries to the DB, which means that they will queue up and the server will have too much to do for a long time, possibly also running out of memory.

Not sure what's the best way to deal with this. Perhaps simply keeping track of each user session to prevent queuing up new queries until the previous one completed. This is not quite trivial since it's important that the user at least eventually gets the latest data requested. Maybe keeping track of the "current" and the "latest" request per user session, and whenever the "current" is complete, if no "latest" request exists we return the result, otherwise discard it and process the "latest". If a new request comes in before the "current" request is completed, we just replace the "latest" with the new. As far as I can tell, it's not really possible to manually abort a Cassandra query that is already running in the cluster.
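The "current"/"latest" scheme described above could look roughly like this (a minimal asyncio illustration, not the project's actual server code; all names are made up):

```python
import asyncio

class LatestRequestQueue:
    """Per-session coalescing: while one query runs, newer requests
    replace each other, and only the most recent one is executed."""

    def __init__(self):
        self._latest = None   # the most recently submitted request
        self._running = False

    async def submit(self, coro_factory):
        self._latest = coro_factory
        if self._running:
            return None       # the running submit() will pick it up
        self._running = True
        try:
            result = None
            while self._latest is not None:
                factory, self._latest = self._latest, None
                result = await factory()
            return result
        finally:
            self._running = False

async def demo():
    queue = LatestRequestQueue()

    async def query(n):
        await asyncio.sleep(0.01)   # stand-in for a slow Cassandra query
        return n

    # Fire three "requests" in quick succession; the middle one is dropped.
    tasks = [asyncio.create_task(queue.submit(lambda n=n: query(n)))
             for n in range(3)]
    return await asyncio.gather(*tasks)

results = asyncio.run(demo())
print(results)  # -> [2, None, None]
```

Only the first caller actually drives queries; later callers just replace the pending request, matching the "discard it and process the latest" idea.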

Attribute list saving

Like the old archiver, the user should be able to save the currently displayed attributes and recover them later. Everyone sees everyone else's lists.

Handle dates with timezone

The backend currently gives an error if a query specifies a time with timezone.
'from':"2021-02-16T13:55:53Z" does not work, but 'from':"2021-02-16T13:55:53" does.
Specifying a timezone results in an exception from Pandas about comparing dates with and without timezone.
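Since the error comes from Pandas, one possible fix is to normalize incoming timestamps to naive UTC before comparing. A sketch (the function name is made up):

```python
import pandas as pd

def normalize_timestamp(ts_string):
    """Accept naive and timezone-aware ISO timestamps alike, returning
    a naive UTC timestamp so comparisons with naive archive timestamps
    don't raise."""
    ts = pd.Timestamp(ts_string)
    if ts.tzinfo is not None:
        # Convert to UTC, then drop the timezone information.
        ts = ts.tz_convert("UTC").tz_localize(None)
    return ts

print(normalize_timestamp("2021-02-16T13:55:53Z"))
print(normalize_timestamp("2021-02-16T13:55:53"))
```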

Plot drawing method

I'd like to discuss the pros and cons of the current method of drawing, now that the application has been in use for a while. Feel free to add your observations and opinions to this issue. If the drawbacks are too large, we should start thinking about a new solution.

The plots are currently completely drawn on the server, as an image in whatever size the client requests, and sent to the client as a single base64 encoded PNG. To do this we use datashader (http://datashader.org/).

Advantages

  • Datashader is able to plot huge datasets without downsampling. By this I mean that all points are always drawn no matter the total number of points. For example, a single pressure spike lasting less than a second will be visible even if you plot an entire year of otherwise even pressure. Downsampling data is a very tricky subject, so I think this is a huge practical advantage and it's the main reason I went with this method.
  • A very important factor is how many points we can expect to plot. As a start, let's consider an attribute that is stored once per second. This means ~2.5 million points in a month. I think this is not a very unusual case, e.g. when looking at a long-term trend. So we should at least aim to handle tens of millions of points routinely. Datashader is advertised as handling datasets of hundreds of millions of points or more, and so far I think it has handled things very well. It should even be possible to distribute datashader in a computation cluster (using "dask") if performance is not good enough.

Disadvantages

  • Drawing everything on the server means that changing simple properties of the plot (e.g. line color) requires redrawing the entire image. This is probably solvable without changing the whole architecture (see #17) but it will complicate things a bit.
  • We can't use any of the existing third party JS plotting libraries since they depend on getting point data (we do use D3, but only for axis drawing and such).
  • The plots don't look great; no antialiasing on lines, and no line styles. This can be a real problem for people with color blindness.
  • There's no straightforward way to cache data in the client, since images need to be redrawn when the Y axis changes.

Other notes

  • The images may be fairly large, but not huge; according to some quick measurements, realistic data at HD resolution will typically transfer ~50-100 kb per attribute plotted, with compression. Not sure it would be that much more efficient to send raw data points though. Also, the size of the images is essentially independent of the size of the raw data, so plotting a million points can take the same bandwidth as a thousand. I.e. the bandwidth usage is limited.

Alternative solutions

I haven't checked the options during the last year so I may be missing some important ones, but to me there are two main options:

  • Bokeh https://bokeh.pydata.org/en/latest/ is a library that basically solves the same problem of having large datasets on the server and plotting them in the browser. I started the first prototype of the HDB++ viewer using Bokeh, but I ran into some problems with updating data that made me switch; I don't remember exactly why. However, Bokeh has developed a lot since then and is definitely worth looking into again. If it worked, it could simplify both server and client code.
  • Implementing our own way of downsampling/compressing data, sending it to the client and drawing it with a third party javascript plotting library. Probably tricky, but not impossible.

Dropdown list with attr categories

Provide a predefined list of patterns such as 'VAC', 'CRY', 'FLOW', 'TEMP', ... so that the attribute list is populated only with the relevant parameters. This is for ease of access; the same can be done via a pattern search.

Clear plot

Add a button that clears everything, instead of going one by one and clicking remove.

Downloaded image extras

Currently the downloaded image does not contain any legend/labels, so it is impossible to know which curve is which.

TODO: add a legend on the plot
TODO: rename the downloaded file; provide a default name (something + date), but let the user change it
TODO: check axis background colour

Display write values

The plot should also contain write values for R/W attributes.

This should be pretty easy, since the data and the plotting mechanism are already there. However, it's not obvious visually how to do this without it becoming confusing. Maybe the line for the write values should be shown in the same color but "dotted", or with 0.5 alpha or something, to make it clear which is which.

Axis decimals

Fix the decimals; it can happen that several ticks show the same number (due to rounding). Be smart about this.
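One simple way to be "smart" here (an illustrative sketch, not existing code): pick the smallest number of decimals at which all tick labels become distinct:

```python
def tick_decimals(ticks):
    """Smallest number of decimals that makes all tick labels distinct."""
    for nd in range(10):
        labels = [f"{t:.{nd}f}" for t in ticks]
        if len(set(labels)) == len(labels):
            return nd
    return 10

print(tick_decimals([0.001, 0.002, 0.003]))  # -> 3
print(tick_decimals([1, 2, 3]))              # -> 0
```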

Display attribute configuration

The database contains the attribute settings, such as label, unit, etc. This information should be made available in the user interface somehow.

Auto Update the view

Add a selectable option to auto-update the view in real time (same as in the old archiver).

Split plot image into one per attribute

This is an internal change that would make it easier to solve some practical problems (related issues #17, #28, #29).

The idea is to stop rendering all attributes into one image on the server, and instead make one separate image per attribute. This way, the client can choose how to draw each attribute (by some clever canvas operations), and e.g. changing the color of one won't require fetching data from the server. It would also be possible to do other tricks like drawing the line several times, to suggest increased line thickness.

Note: this may seem like a waste of bandwidth, but due to the way PNG compression works, it makes a surprisingly small difference whether it's one image with N colors or N images with one color each. At least some initial measurements suggest that the difference is not significant.

Display server status

Notify the user about the status of the server: 'BUSY', 'PROCESSING_REQUEST', 'DOWN'...
