elan-ev / tobira

Video portal for Opencast

Home Page: https://elan-ev.github.io/tobira/

License: Apache License 2.0

Rust 42.86% JavaScript 0.33% TypeScript 48.13% HTML 0.07% CSS 0.72% PLpgSQL 6.20% Shell 1.48% Dockerfile 0.20%
opencast video-portal

tobira's Introduction

Tobira: an Opencast Video Portal


Tobira is a video portal for Opencast. It provides a hierarchical page structure, with each page consisting of simple configurable content blocks (e.g. text, videos or series). Opencast content (series or single events) can be shown on these pages. Users can upload, edit (via external editor) and share their videos.

The current version of our main branch is deployed at https://tobira.opencast.org. This is just a test deployment and all data is wiped whenever it is re-deployed. The test data was kindly provided by the ETH only for the purpose of this test deployment.

Documentation

Tobira's documentation.

Name

Tobira (扉) is Japanese for "door", "hinged door" or "front page" (of a book). A video portal is a kind of door, so we chose that name. It is also short and somewhat pronounceable for English speaking people.

tobira's People

Contributors

dependabot[bot], geichelberger, gregorydlogan, juliankniephoff, lkiesow, lukaskalbertodt, mtneug, owi92, peculiarprince, wsmirnow, ziegenberg


tobira's Issues

Add some simple DB caching mechanism on the Rust side

We require this as the realm API pre-processes the database information into an actual tree. We really don't want to do that for every single request. That said, caching -- as it's a performance optimization -- should really not be our concern right now. We should start with a very very simple cache and always lean towards less caching ("correctness over performance").
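To illustrate what a "very very simple cache" could look like, here is a sketch (in TypeScript for brevity; the actual backend is Rust, and all names here are made up): a time-based cache that recomputes the realm tree after a fixed TTL and never tries to be clever.

```typescript
// Deliberately dumb TTL cache: correctness over performance.
// `compute` would be "load realms from the DB and build the tree".
class SimpleCache<T> {
    private cached: { value: T; at: number } | null = null;

    constructor(private ttlMs: number, private compute: () => T) {}

    get(now: number = Date.now()): T {
        // Recompute if nothing is cached yet or the entry is too old.
        if (this.cached === null || now - this.cached.at > this.ttlMs) {
            this.cached = { value: this.compute(), at: now };
        }
        return this.cached.value;
    }
}
```

Leaning towards less caching, a short TTL (a few seconds) already avoids rebuilding the tree on every single request without risking long-lived stale data.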

Add error-handling helper functions and types

We probably want these things:

  • Bug exception type to indicate buggy code.
    • Maybe also an Unreachable exception type
  • unreachable(reason: string): never: can be used in places that should never be reached and throws an exception
  • panic(reason: string): never: maybe we want to add this? But I guess it is more idiomatic to throw Bug(reason)?
  • assertNever(x: never): never: helper to assert that a type is actually never. Useful for exhaustive switches, for example.
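A minimal sketch of these helpers (signatures taken from the list above; the Bug class body and messages are assumptions):

```typescript
// Sketch of the proposed error-handling helpers.

/** Thrown to signal a bug in our own code (rather than a user/env error). */
class Bug extends Error {
    constructor(msg: string) {
        super(`${msg} (this is a bug in this application)`);
        this.name = "Bug";
    }
}

/** For code paths that should never be reached. */
const unreachable = (reason: string = "reached unreachable code"): never => {
    throw new Bug(reason);
};

/**
 * Statically asserts that `x` has type `never`, e.g. in the `default` arm
 * of a switch over a union type. Reaching it at runtime is a bug.
 */
const assertNever = (x: never): never => {
    throw new Bug(`assertNever called with: ${JSON.stringify(x)}`);
};

// Example: `assertNever` makes this switch exhaustive. Adding a third
// variant to `Shape` becomes a compile error until `area` handles it.
type Shape =
    | { kind: "circle"; r: number }
    | { kind: "square"; side: number };

const area = (s: Shape): number => {
    switch (s.kind) {
        case "circle": return Math.PI * s.r * s.r;
        case "square": return s.side * s.side;
        default: return assertNever(s);
    }
};
```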

Update the transitive `node-fetch` dependency

GitHub complained to us about GHSA-w7rc-rwvf-8q5r, which is not relevant to us right now, and probably never will be. However, we should of course still update once we can. The problem is: We can't (easily) do it right now.

node-fetch is a transitive dependency of ours via some of the Relay dependencies, and the patched version does not fit the specific version ranges used in those chains. So we just have to wait for some or all of the packages in the chain to update their dependencies in a way that lets us get the update.

For the record: The chains in question are

relay-compiler@^10.0.1 > fbjs@^1.0.0 > isomorphic-fetch@^2.1.1 > node-fetch@^1.0.1
relay-compiler@^10.0.1 > [email protected] > fbjs@^1.0.0 > isomorphic-fetch@^2.1.1 > node-fetch@^1.0.1
react-relay@^10.0.1 > fbjs@^1.0.0 > isomorphic-fetch@^2.1.1 > node-fetch@^1.0.1
react-relay@^10.0.1 > [email protected] > fbjs@^1.0.0 > isomorphic-fetch@^2.1.1 > node-fetch@^1.0.1

The easiest thing would be for isomorphic-fetch to update, which they did, but unfortunately in a major update (3.0.0), which we can't get with the constraints above.

The next best thing would be for fbjs to do it. Let's see how they handle it.

How does configuration work?

We should prepare Tobira to be deployed in a dynamic environment, which means there may be multiple instances, etc. This might have an effect on how Tobira is configured. Another question is whether we want to be able to modify certain aspects of the configuration from the front-end (admin area; configuration panel). This doesn't matter for automated deployments, but is highly requested when doing a manual installation.

Questions:

  • Do we keep everything in a configuration file?
  • Will there be two configuration files (backend and frontend)?
  • Is the front-end configuration always public (like with Opencast Studio)?
  • Do we keep/transfer configuration to a central place (e.g. the database)?

Thoughts:

  • Using a database allows us to configure via the frontend
  • Using a database potentially allows for a configuration wizard
  • There should be a way for automated deployments
    • Maybe deactivate front-end configuration for what is set in a configuration file?

Sending API data on initial request inside HTML

This is an idea we had in the back of our minds for some time. One notable disadvantage of single page apps (the server just sends index.html, JS code and other assets) is that the initial request takes longer until the user sees something useful. That's because of this dependency chain (from the browser's point of view):

  • Download HTML document
  • Download assets, including JS (this is usually cached by the browser)
  • Parse/load JS
  • JS sends API request to backend.

Until the API request has returned, the app basically can't render anything meaningful.

So the idea is that the backend could send some useful data inside the initial HTML document (maybe just a <script> tag with a constant JS object or something like that). That way, we can significantly reduce the time to "first meaningful render". Of course, there is also "server side rendering", but that would require a full blown JS runtime in the backend, which has several disadvantages.

There are a few possibilities:

  • Do not do this at all. Just send the same static index.html.
  • Send route independent data. This includes "login status" of the user and the navigation tree (just URLs and names). The idea is that with this, the frontend can already completely render the header and sidebar and only the <main> part needs another API request.
  • Also send "easy" route-dependent data: This additionally includes information for <main> (like the content boxes for that tree node, or video details in case of a video page). This is more tricky, as we could easily run into situations where we have to duplicate logic from the frontend in the backend. That's obviously not good. But this improves the user-perceived load times again, as the main content is what the user is actually interested in.
  • Send all required data: Unfortunately, I guess this will be impossible without a lot of logic duplication. I am not yet sure what this includes in addition to the previous point.

Some other notes:

  • Logic duplication could potentially be avoided by using WASM. We could write Rust code that is used in both the backend and the frontend. However, I'm not sure how well this would work (regarding code structure).
  • When sending data inside the HTML, the backend needs to acquire that data, of course. That could slow down the initial request, which partially negates the whole point. So this could be an argument to send less or only quickly available data.
    • The backend could also already start preparing the required data, but not sending it in the HTML. That way, once the frontend does the API request, everything is ready in the backend.

Of course, this all is mostly about speed and is not critically important, especially not in the beginning. But this is related to some other things (see this for example) and taking care of this in the beginning simplifies a lot of stuff later.
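To make the second option more concrete, here is a hypothetical sketch of both halves: the backend splicing route-independent data into index.html, and the frontend picking it up before firing any API request. All names here (INITIAL_DATA, NavNode, ...) are invented for illustration.

```typescript
// Hypothetical shape of the route-independent data.
type NavNode = { name: string; path: string; children: NavNode[] };
type InitialData = { loggedIn: boolean; navTree: NavNode };

// Backend side: serialize the data into the HTML shell. Escaping "<"
// avoids breaking out of the <script> tag via strings like "</script>".
const renderIndexHtml = (template: string, data: InitialData): string => {
    const json = JSON.stringify(data).replace(/</g, "\\u003c");
    return template.replace(
        "</head>",
        `<script>window.INITIAL_DATA = ${json};</script></head>`,
    );
};

// Frontend side: use the embedded data if present, otherwise fall back
// to a normal API request.
const initialData = (win: { INITIAL_DATA?: InitialData }): InitialData | null =>
    win.INITIAL_DATA ?? null;
```

With this, the header and sidebar can render immediately; only <main> still needs a round trip.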

Add series & events to database

The Tobira database needs some kind of representation of series and events from the Opencast side. We need to store all information to make Tobira work in our database.

Thoughts on database management/migrations

#69 made me think about how we want to handle database management in general. I researched and thought a bit about this topic. These are my thoughts/conclusions.


Some notes (hopefully there is no disagreement on those?):

  • Releasing a new version of Tobira should include some easy way to migrate the old database to the new schema. Otherwise updating is terribly annoying for sys admins.
  • I want commands for DB operations, e.g. tobira db setup, tobira db reset and tobira db run script.sql.
  • For convenience, just starting tobira should do some setup/migrations automatically, as long as it's clear what to do. In case of ambiguity, Tobira shouldn't guess.
  • It will regularly happen that we (devs) switch between versions requiring different DB schemas (e.g. git checkout to review a PR, ...). It shouldn't be too annoying to get the database into the correct state. Notably, for development, the actual data in the database isn't important most of the time.
  • I would prefer the scripts to be written in actual .sql files and not Rust strings. We can still embed those files into the application (include_str!).

Suggestion: down.sql

In addition to the obviously needed "forwards" migration (up.sql), I think it would be beneficial to also include scripts to roll back one migration.

Suggestion: store migrations in DB

We would add a table migrations (or so) which would contain all migrations leading to the current state of the database. All migrations should have some unique ID. Running a migration would add a row to this table, undoing a migration would remove the row.

Furthermore, I think it would be useful for development to also store the raw up.sql and down.sql in a text column:

Saving down.sql allows us to roll back a migration even if the script is not currently available in the repository. That is useful if you switch from one feature branch (foo) to a different one (bar). To get the database into the state expected by bar, we have to roll back the migration of foo and then run the migration of bar. But since foo's down.sql is not in the working directory anymore, we need to store it in the DB. Of course, you can always manually undo foo's migration before switching to the new branch, but we will likely forget that most of the time and get annoyed by how inconvenient it is.

Saving up.sql allows us to compare the up.sql in the working directory/file system with the one in the database. If they differ (which will happen a lot during development), Tobira could easily undo the old migration (with the down.sql in the database) and then run the new up.sql from the file system.
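The reconciliation described in the last two paragraphs could roughly work like this (a language-agnostic sketch written in TypeScript; types and names are invented): revert DB-recorded migrations that are missing from disk or whose up.sql changed, using the down.sql stored in the database, then apply the scripts from disk.

```typescript
type Migration = { id: string; up: string; down: string };
type Action =
    | { kind: "apply"; migration: Migration }
    | { kind: "revert"; migration: Migration };

const plan = (inDb: Migration[], onDisk: Migration[]): Action[] => {
    const actions: Action[] = [];
    const diskById = new Map(onDisk.map(m => [m.id, m]));

    // Revert (newest first) DB migrations that are gone from disk or whose
    // up.sql changed, using the down.sql stored in the database.
    for (const m of [...inDb].reverse()) {
        const disk = diskById.get(m.id);
        if (!disk || disk.up !== m.up) {
            actions.push({ kind: "revert", migration: m });
        }
    }

    // Apply disk migrations that are not (or no longer correctly) in the DB.
    const dbById = new Map(inDb.map(m => [m.id, m]));
    for (const m of onDisk) {
        const db = dbById.get(m.id);
        if (!db || db.up !== m.up) {
            actions.push({ kind: "apply", migration: m });
        }
    }
    return actions;
};
```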

How many migration scripts?

Solution A: one migration per version

One migration script per version (e.g. v1.0.sql, v1.1.sql and next.sql). We would only modify the latest (next.sql) during development. Releasing a new version means mv next.sql v1.2.sql && touch next.sql. The ID for the migrations in the migration table would simply be the version. That way, Tobira knows which migrations to run when updating to a new version.

  • Advantage: works great for production users as they can easily upgrade to a new version
  • Advantage: having all schema changes between two versions in one file helps in some situations
  • Advantage: migration potentially faster as there are fewer scripts
  • Disadvantage: for developers, switching between branches means completely undoing all database changes since the last released version of Tobira and then redoing them.

Solution B: one migration per "feature" (commit/PR)

Instead of continuously changing one script until release, we could also have a separate migration script per "feature". That roughly corresponds to one migration script per PR that requires modifying the database schema. This of course results in a lot more migration scripts, but probably not ungodly many (adjusting the DB schema is not too common). As ID for migrations, we would probably want to use either the current timestamp or some random short string. This is mainly to ensure that two feature branches developed in parallel will have different IDs for their migrations.

  • Still works fine for production users, although it might be a bit slower than solution A, simply because there are more migrations to run.
  • Advantage: more atomic migrations could make error handling/debugging easier
  • Advantage: for development, switching between branches would only require undoing a small migration instead of basically resetting the whole database.
  • Disadvantage: in solution A we could merge changes easily, whereas here, we might end up with migrations that undo parts of previous migrations. Maybe.

I at least know of one library which encourages this style of migration management.

Think about URLs

We want admins to assign "nice" URLs to specific parts of the site (i.e. specific nodes of the tree). For example, /lectures and /lectures/math should both be assignable. At the same time, we need a couple of internal URLs (e.g. for settings, ...). How do we assign URLs to those routes? I see a few possibilities:

  • (a) Like GitHub: reserve a list of special words, and no top-level tree entry can have those words as its URL. On GitHub, no user can be called settings, as github.com/settings is a special path.
    • Pro: Nice URLs for both user content and internal routes
    • Contra: Requires foresight to reserve a sufficient set of words for special routes. Reserving new strings after we release is a breaking change. And even with a major version bump, we would potentially disrupt users. In general, we are pretty constrained once we decide on this solution.
  • (b) Prefix internal routes: either (b1) with a path segment like /tobira (resulting in /tobira/settings and /tobira/about) or (b2) with a single character like @ (resulting in /@settings and /@about).
    • Pro: Nice URLs for user content
    • Pro: We can add arbitrary internal routes later
    • Pro: user routes are less constrained compared to (a)
    • Contra: Not so nice internal paths
  • (c) Prefix for user content: like (b), but the other way around. Same pros and cons, but additionally:
    • Note: links to user content are sent around more often than links to internal paths
    • Pro: least constraining for future development of this software
    • Contra: users are not constrained at all
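For illustration, option (b2) could be dispatched roughly like this (a sketch; none of these names exist in Tobira): a leading "@" segment marks internal routes, and everything else resolves against the realm tree.

```typescript
type Route =
    | { kind: "internal"; name: string }
    | { kind: "realm"; path: string[] };

const resolve = (path: string): Route => {
    const segments = path.split("/").filter(s => s.length > 0);
    // A first segment starting with "@" selects an internal route;
    // anything else is looked up in the realm tree.
    if (segments.length > 0 && segments[0].startsWith("@")) {
        return { kind: "internal", name: segments[0].slice(1) };
    }
    return { kind: "realm", path: segments };
};
```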

(How) Will we use private fields in TypeScript?

TypeScript has a visibility system similar to Java and other OOPLs. Recently, JavaScript/ECMAScript also gained the ability to hide object properties using #privateFields.

There are some differences between the two approaches, and maybe there will be a situation where we want both, but we should at least pick a sane "default". Unfortunately I don't know enough about either to just make a judgement call here.

If it's going to be the ES mechanism, though, we should get rid of or configure the explicit-member-accessibility rule.
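For reference, a minimal example of the difference (a sketch, not a recommendation either way): TypeScript's private is erased at compile time, so the property still exists on the object at runtime, while ES #fields are actually inaccessible from outside the class.

```typescript
class TsPrivate {
    private secret = 42;            // compile-time check only
    reveal(): number { return this.secret; }
}

class EsPrivate {
    #secret = 42;                   // enforced at runtime
    reveal(): number { return this.#secret; }
}

// (new TsPrivate() as any).secret is still 42 at runtime.
// There is no way to reach #secret from outside EsPrivate at all.
```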

How do we want to handle loading?

There will often be situations where the frontend has to wait for things like API requests. We should probably come up with some kind of general guidelines about how to handle this in the code, in the design, etc.

Initial design ideas and sketches

I looked at a few other "video websites" (including YouTube, Vimeo and the ARD Mediathek) to get some inspiration and know what users are used to. We don't want to do something completely radical, but rather be close to stuff users already know. That way, it's easier to use.

Sorry for the rather ugly sketches; I should not have used paper with such a bold grid! Unfortunately, I was unable to find a free and good program for this purpose.


General considerations/facts

  • We have some kind of structure (probably a tree) that the user needs to navigate. E.g. "lectures" and "conferences" as root-level nodes, "biology" and "physics" as children of "lectures", "astrophysics" and "quantum mechanics" as children of "physics". Administrators are probably able to make this tree arbitrarily deep.
    • Admins of nodes should have some freedom in choosing what content to show
  • The website should work well on desktop and mobile (surprise, duh).
  • The front page should:
    • Be able to highlight some videos that the organization administrator wants to highlight.
    • For new visitors: look interesting and invite to browse and watch some videos.
    • For regular users: provide quick access to what that user probably wants to do.
  • Assumption: we want to provide the user with some kind of own "library" or "bookmarks". This might not be true, but we will see.

Tracking issue: backend speed, performance, latency, ...

We are using extremely fast technologies in the backend, so it is very unlikely speed will ever be a problem. However, I'm always interested in making things more efficient and faster. So consider this a "pet project" issue of mine, mostly for note taking :P

Measuring

At some point, we could add some scripts to generate a realistic load and to measure performance data (latency, req/s, ...). We probably want to use wrk, but should make sure not to repeatedly request just a single API endpoint; again, we want some "close to realistic" mix of requests.

Improvements

  • Caching
  • Using a better allocator: changing the global Rust allocator is super simple. By default, the system allocator is used, which is not particularly fast, especially when it comes to lots of small allocations. jemalloc might be a good alternative. But changing this should only be done if an improvement can be measured.
  • Tweaking build parameters (lto = "fat", codegen-units = 1, ...)

Improve backend configuration

The current configuration system is a bit of a mess. We decided that we want the backend to be configurable via TOML file and via environment variables. I already have plans how to improve the current situation. This issue is just a reminder, so to say.

Improve logging

Just a random collection of ideas:

  • Generate a random "request ID" for each HTTP request and emit it with every log message produced while (indirectly) handling that request. This will make understanding production logs a lot easier. We probably don't want to pass this request ID manually to all functions. Instead, we can probably use a "task local variable".
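As an illustration of the "task local variable" idea, here is a sketch using Node's AsyncLocalStorage (in the Rust backend this would map to something like a task-local; all names here are invented):

```typescript
import { AsyncLocalStorage } from "async_hooks";
import { randomBytes } from "crypto";

// Holds the request ID for the currently handled request, without
// threading it through every function manually.
const requestId = new AsyncLocalStorage<string>();

// Every log line picks the ID up implicitly.
const log = (msg: string): string =>
    `[${requestId.getStore() ?? "no-request"}] ${msg}`;

// The HTTP entry point generates a fresh ID per request and runs the
// handler inside that context.
const handleRequest = <T>(handler: () => T): T =>
    requestId.run(randomBytes(4).toString("hex"), handler);
```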

Add video page

The frontend needs a page/route to watch a video. This mainly embeds some player.

Blocked by #87. Probably also by #86.

Get rid of the "Babel + browserslist defaults config hack"

See aa24989.

We also might want to consider reverting .babelrc.js back to a JSON file if there isn't any other reason for it being a script, since that should decrease build times.

We could also move the browserslist config to its own file then, although the docs actually recommend having it in the package.json file. What I don't like about that is that you can't add comments there.

Reevaluate using Babel over TypeScript

We currently use Babel to translate our TypeScript to JavaScript. This is cool because of multiple reasons (see below), but it is also inconsistent with some of the choices we made. We use the TypeScript ESLint plugin for linting, for example, which uses the TypeScript compiler internally, and we also use the TypeScript compiler directly ourselves to do type checking. This could potentially lead to weird behavior where these tools understand different features.

We should be fine if we just use TypeScript, because presumably all TS tooling should understand that. And if we agree on this, the current tooling setup might actually be a good choice because of the combination of the advantages of both approaches (see below), although we might want to think about enforcing the "only standard TypeScript" rule somehow, for example using Babel configuration and/or linter rules.

If we choose to unify the tooling (use TS compiler over babel OR use Babel based linting and ??? for type checking), here is a rough overview over that trade off:

Babel > TypeScript:

  • Potentially more cool modern ES (or not-quite-yet-ES, aka proposed ES) features!
  • We could even use crazier stuff that is not just for ES backwards compatibility, like babel-plugin-macros
  • More control over the output
    • TS lets you specify a target ES version; Babel lets you turn individual features on/off, we have browserslist integration, ...

TypeScript > Babel

  • Better integration of type checking and the build process
    • In the "Babel only case" I don't know how to get type checking
    • In our current setup, type checking is an independent step; if the code is syntactically correct but doesn't type check, we (can) still get a build
      • This might actually be good sometimes for prototyping, but overall I think it is kind of defeating the purpose of a type checker
  • The TypeScript based linting can incorporate types in its rules

TL;DR

We should traverse the following decision tree:

  1. Unify our transpiler tooling to avoid weird compatibility traps?
    1. Base everything on Babel?
      1. How to do type checking?!
    2. Base everything on TypeScript?
  2. Stick with our current setup, and potentially find different ways to do so?
    1. Find ways to restrict our code to things that both Babel and TypeScript understand?
    2. Actually we could also consider switching just the linter from TS to Babel, or even use both, but I guess this decision can still be deferred to when we find good reasons.

Render content-blocks

Blocked by #83. After that is merged, the frontend needs to request content blocks and render them accordingly.

Make sure error and panic handling works correctly

When the HTTP/API handler in the backend returns an error or panics, the server should not crash. Instead, a proper error response should be returned or, as a last resort, just a 5xx status. It might already work, but we need to make sure it works as expected. Preferably, an automated test should check that, too.
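The desired behavior, as a sketch (TypeScript with invented types; in the real Rust backend this means catching the error/panic inside the HTTP handler):

```typescript
type Response = { status: number; body: string };

// Wrap a route handler so that whatever it does, the server answers
// with a response instead of crashing.
const safeHandler = (handler: () => Response) => (): Response => {
    try {
        return handler();
    } catch {
        // Last resort: never let the error escape; answer with 500.
        return { status: 500, body: "Internal server error" };
    }
};
```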

Prometheus integration

It would be cool to have Prometheus integration for several reasons. There is this library which seems pretty functional, well written and well maintained. This is certainly not urgent right now, but we can already collect useful metrics. (If you are allowed to, feel free to edit this comment to add metrics.)

Useful metrics:

  • Opencast API latency
  • ...

`floof` fails on clean builds

Specifically, running the backend results in the following error:

Error: failed to start HTTP server

Caused by:
    'index.html' is missing from the assets

Add "content-blocks" to database and expose via API

Each realm has a number of "content blocks" associated with it. This is what's shown on the realm page. There are a couple different ones:

  • Single video (to highlight one video)
  • Video list
    • Different kinds of sources: an OC series, "most recent", ...
    • Different kinds of visualizations: horizontal list, vertical list, carousel, ...
  • Text block (potentially with image)

To close this issue, not all variants have to be implemented, but the basic structure should be added to the database and the API for realms should be extended to expose content blocks.
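One possible way to model these variants, sketched as a tagged union (all names are invented; the actual DB/API representation is exactly what this issue is about deciding):

```typescript
type VideoListSource =
    | { type: "series"; seriesId: string }
    | { type: "mostRecent" };

type ContentBlock =
    | { type: "video"; eventId: string }                        // single highlighted video
    | { type: "videoList";
        source: VideoListSource;
        layout: "horizontal" | "vertical" | "carousel" }
    | { type: "text"; content: string };                        // text, maybe with image later

// A realm's page is just an ordered list of blocks.
type RealmPage = { realmPath: string; blocks: ContentBlock[] };

const page: RealmPage = {
    realmPath: "/lectures/math",
    blocks: [
        { type: "text", content: "Welcome!" },
        { type: "videoList", source: { type: "mostRecent" }, layout: "horizontal" },
    ],
};
```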

High-availability & Scalability

…and all the other buzzwords

A single application does not scale infinitely. It is much easier to scale by running multiple instances. We also don't want the portal to go down just because a single machine fails. If possible, we want to avoid all single points of failure. Here are a few thoughts on what we want/need:

  • We want to run multiple Tobira nodes
    • helps when a node goes down (failover)
    • easy way to scale up
    • key concept of modern deployment mechanisms (e.g. Kubernetes)
    • we need to make sure to use transactions (no intermediate database states)
  • We want to run different jobs as different applications
    • Opencast importer
    • data provider
    • (maybe) Opencast data updater
  • We want Tobira to work when Opencast is (mostly) down
    • cache video metadata (location, title, …)
    • (maybe) provide our own player
    • visually disable data updates
  • We don't need to think about these since they have high-availability mechanisms of their own:
    • databases
    • video delivery (streaming and download)
    • load balancers
  • To run multiple Tobira nodes we can use
    • Kubernetes magic stuff
    • HTTP/TCP load balancers (nginx, HAProxy)
    • DNS load balancing

Add good caching headers for assets (non-dynamic files)

Tobira uses (and will use) completely static files that should be cached by the browser to keep page loads as short as possible. In the backend, we know best how long some things can/should be cached and we should therefore add corresponding HTTP headers there IMO. Passing the responsibility to the server admin and forcing an nginx reverse proxy seems like a bad idea to me.

Some thoughts:

  • The initial HTML request (index.html basically) should never be cached. Luckily that one is only loaded once per session.
  • The GraphQL API should also never be cached.
  • Assets should be cached. I think a common strategy is to include a short hash of an asset's contents in its filename and to set the max age to a high value. That way the assets can be cached for a very long time (think: months), and if an asset doesn't change, the user never has to download it again. Whenever an asset changes, the hash and thus the filename changes (the filename is referenced by the non-cached index.html), and consequently the browser has to download the new version.
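A sketch of that strategy (using Node's built-in crypto module; the concrete header values are typical choices, not a decision — and the filename is assumed to have an extension):

```typescript
import { createHash } from "crypto";

// Embed a short content hash in the asset's filename, before the extension.
const hashedName = (name: string, contents: string): string => {
    const hash = createHash("sha256").update(contents).digest("hex").slice(0, 8);
    const dot = name.lastIndexOf(".");
    return `${name.slice(0, dot)}.${hash}${name.slice(dot)}`;
};

// Immutable assets can be cached basically forever; the entry point and
// API responses must not be cached at all.
const cacheControl = (path: string): string =>
    path === "/index.html" || path.startsWith("/graphql")
        ? "no-store"
        : "public, max-age=31536000, immutable";
```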

Should we commit the generated GraphQL schema?

The cons are obviously that it's generated and we can always recreate it. And keeping it in version control can lead to inconsistencies.

I don't know whether there are any pros, but it's not impossible there are some, like for package-lock.json. We might want to think about that.

Maybe include debug symbols in release builds?

Rust by default removes all debug symbols in release builds. But at some point, we probably have to debug a problem that occurs in production. And with debug symbols, we at least get file locations with our stack traces. (debug = 1 should be sufficient for that.) Given that the only disadvantage is a larger binary (which is not really important for the use cases), we should probably enable debug symbols in release builds.

I will get some specific numbers on the file size in the near future.

Meta issue: Testing

This is not something to focus on in our initial development, but we decided that we want really good automatic tests for Tobira. This is a big topic though and we will need a couple of different kinds of tests. I open this issue now already so that we can write down notes about this topic.

  • Backend
    • Unit tests -> easy, we already have some.
    • Integration tests: #856
    • Maybe tests making sure that everything works (reasonably fast) even when handling thousands of requests.
  • Frontend
    • Unit tests -> should be straightforward, something something jest?
    • #855

From my personal experience, I would say that writing good test utilities is important, so that writing actual tests is as easy as possible and the tests contain as little boilerplate code as possible.

(Feel free to edit this top comment or add your own thoughts/notes)

Think about dependabot

A few thoughts:

  • Security notifications from dependabot are very useful and we want them. I would also not mind dependabot opening a PR to fix those.
  • I don't like how dependabot adds so much noise to the PR list. I think I would even prefer not using dependabot for that reason. I probably prefer a PR that includes all updates instead of one PR per dependency.
  • The large number of PRs probably also leads to us not testing every single one.
  • However, I do see that we somehow want to be reminded to regularly update our dependencies. Maybe there are bots that can just regularly ping us on an issue?

GraphQL schema validation

We should run a validator (with reasonable settings) during CI. We might also want to provide this as a script.

Speed up ESLint (using incremental TypeScript stuff?)

Of the three parallel tasks floof currently runs during development, ESLint actually turned out to be the slowest. (Thanks, @LukasKalbertodt for the measurements.)

I have an inkling that the TypeScript rules might not make use of the incremental build stuff introduced by #44. We might want to look into this.

Maybe take a look into using swc instead of webpack

swc is a super-fast TypeScript/JavaScript compiler written in Rust that could serve as a (much faster) replacement for parts of our webpack pipeline: https://github.com/swc-project/swc

I don't know if we could use it: we currently use a lot of webpack plugins that may or may not work with, or exist for, swc. But making rebuilds way faster might make it worth looking into.

Integrate `relay-compiler` into Webpack

In #29 I tried to integrate the Relay codegen into the Webpack build, but unfortunately I wasn't very successful. I can't do anything about it, I think, and am basically waiting for a fix from the appropriate Webpack plugin project. If that ever lands, I would still like to do this.

Here is some additional info quoted from #29:

  • Integrating Relay into the build was a bit awkward
    [...]
    • If we want to use the Webpack plugin, we need to match its version of the graphql dependency because of relay-tools/relay-compiler-webpack-plugin#56
      • I even had to npm dedupe to make sure that there is only one physical version of the package in the node_modules-tree
      • We could of course run this build step outside of Webpack, especially once we have a proper task runner

Automatically import `React` and `jsx` in each file

Because we use JSX and we will use the emotion CSS library, two symbols have to be available in every file using JSX syntax: React and jsx. This is easily achieved by importing them:

import { jsx } from "@emotion/core";
import React from "react";

However, adding those two lines to basically every file is annoying and adds noise. It would thus be nice to somehow avoid this manual import. This is unfortunately made a bit more complicated because we need to do that twice: once for webpack and once for tsc.

  • For webpack:
    • jsx: is already automatically imported by the @emotion/babel-preset-css-prop preset.
    • React: not currently auto-imported (I think), but this babel plugin seems to do exactly that.
  • For tsc: for both symbols, it's still unclear how we would add an auto-import.

JS/TS linter configuration

We probably want some standard style and then fight over individual things if they are important enough.

Specifically we need to decide on the issue of ' vs. ", since this is already all over the place. x)

Determine the target platform and configure Babel accordingly

While working on #29, I ran into Babel complaining about missing transformations a few times, for example when using async/await and when using class properties. Instead of blindly adding any, we should probably think about what exact browsers we target with our bundle, first. Maybe we can even remove some. 🤔

Run clippy & rustfmt on CI?

Clippy is the official Rust linter. I never actually used it (shame on me), but I heard it's good and it might be useful. Note however, that linters aren't as important for Rust as they are for languages like JS, as Rust already has a proper compilation step that performs tons of checks and also emits warnings for "probably bad" code (warnings are not allowed during CI).

Rustfmt is the official Rust formatter. We might want to use it to enforce standard style and maybe avoid internal style discussions. I am not a huge fan of the formatter as it's pretty rigid. It is configurable though.

This issue is just a reminder.

University/organization-specific adjustments (design or otherwise)

We expect that organizations (mostly universities) will need to adjust the design of their Tobira instance to match their corporate identity. This issue tracks this general feature and collects ideas and notes about the topic.

What things need to be adjustable?

  • Logo & favicon
  • HTML title & some <meta> information
  • Main colors
  • Font

Potential additional requirements

  • Custom CSS?
  • Two logos, e.g. the main university logo and a logo for the "digital education" section of the university. (Example)

Some general thoughts

We were tasked with making Tobira rather customizable. However, every additional way to customize Tobira adds complexity and makes everything less robust, potentially even slower. Tobira is not supposed to be a full-blown CMS or even a website builder.

Implementation ideas

Customizing Tobira should not require recompiling/bundling anything. Users should be able to grab a precompiled version, run it and change configuration files to adjust the design.

  • Logo and favicon: the config file can specify paths for these images (the logo also has to exist in both a small and a large version). Tobira shouldn't embed the dummy logos in its executable file for production builds. Every user is going to replace those! So I'd rather make the config value mandatory.

  • HTML title and metadata: this should also be easy to configure via config file. It's just strings. Probably makes sense to make these values mandatory to configure as well.

  • Main colors: This is more tricky. For one, the Tobira design should use some kind of palette with a few main/accent colors. We should test the design with a couple of combinations to make sure it works with different palettes. A technical question is how to configure these colors and where to define a default. We could use actual CSS variables: our CSS code just refers to those instead of fixed colors. The backend then inserts a <style> tag with the correct definitions into index.html, and then hopefully everything should just work. The defaults would be defined in the backend. And I think having a default is good in this case, as not every institution will change the default style.

  • Fonts: I probably would only use one font for all of Tobira: Open Sans. Changing the font is not trivial, for a couple of reasons:

    • How to refer to the font in the frontend code? Also use CSS variables? Maybe. Can a CSS variable hold a list, as usually specified for font-family? E.g. font-family: 'Open Sans', 'Arial', sans-serif;
    • How to include the font? Currently we have a generated CSS file which includes parts of the font for different charsets. This CSS code is included in index.html. We could let the user specify a file path in the config file to a CSS file doing the same.
    • If the font files are served by Tobira, how do we include those files? Probably again a path in the config file, pointing to a folder that is served under assets/fonts/.
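The CSS-variable approach sketched above could look like this (variable names and values are made up; note that a custom property can indeed hold a full font-family list):

```css
/* Injected by the backend into index.html as a <style> tag,
   with values taken from the config file (hypothetical defaults shown). */
:root {
  --color-primary: #215d99;
  --color-accent: #e8c252;
  --main-font: "Open Sans", "Arial", sans-serif;
}

/* Frontend styles only ever refer to the variables. */
h1 {
  color: var(--color-primary);
  font-family: var(--main-font);
}
```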

The other "potential" requirements seem rather difficult to implement. Custom CSS is tricky since we use CSS-in-JS: I don't know how raw included CSS could reliably override values set by JS. Two logos, or a notably different page layout, are also pretty tricky: I wouldn't know how to support that.

Setup test server

We want to have a test server for a couple of different purposes:

  • An Opencast instance with test data. We could of course use develop.opencast.org or stable.opencast.org, but those are wiped every 24h. Instead, I think it would be useful to have a server where we can have a nice set of test videos and series that is not reset. Additionally, we can roll out custom patches on that server without having to wait for a PR to be merged into Opencast's develop.

  • A "showcase deployment" of Tobira. This would be a deployment of master (or maybe a special branch if we want to deploy there deliberately). This deployment should work most of the time and should be available via a nice-ish URL.

  • PR deployments: we want to automatically deploy PRs to make it very easy to test them. As Tobira requires only very little memory, it should be possible to run many instances in parallel on one server. We can expose them via different ports, or only use different ports internally and have an nginx forward server/some-path-to-pr to a specific port.
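The per-PR routing could be a handful of generated nginx location blocks; the path and port scheme below are hypothetical:

```nginx
# One block per open PR, e.g. generated by the deployment script.
location /pr-1234/ {
    proxy_pass http://127.0.0.1:9234/;
}
```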

Add automatic content-encoding/compression

Right now, the backend sends all data uncompressed, which is a huge waste. Of course, if we use an nginx in front of our backend, we can do the compression there. But for several reasons, it would be nice to do the compression in the backend itself. Hyper does not include features for compression (I agree it's outside that library's scope). There is async-compression, which might help, but doesn't do everything: we still need to inspect the Accept-Encoding header and set the Content-Encoding header, for example.
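The header inspection part is straightforward to do ourselves. A minimal sketch of choosing a response encoding from Accept-Encoding (it ignores q-values for brevity, which a real implementation should honor):

```rust
/// Pick a response encoding based on the client's Accept-Encoding header.
/// Returns `None` if neither brotli nor gzip is acceptable.
fn choose_encoding(accept_encoding: &str) -> Option<&'static str> {
    // Split "gzip;q=0.8, br" into the bare encoding names.
    let offered: Vec<&str> = accept_encoding
        .split(',')
        .map(|part| part.split(';').next().unwrap().trim())
        .collect();

    // Prefer brotli over gzip if the client accepts it.
    for &candidate in &["br", "gzip"] {
        if offered.iter().any(|&enc| enc == candidate || enc == "*") {
            return Some(candidate);
        }
    }
    None
}
```

The chosen value would then go into the response's Content-Encoding header, with the body wrapped in a matching compressor.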

Data and Data Structures

Ideas

  • Page structure is a graph
    • Enforce tree
      • Let's start with this
      • Julian may create showcase later
      • Or maybe a DAG? But then there are multiple URLs for the same content/realm
    • Page can have elements
    • Page can have sub-pages
  • Elements (content-boxes?) have a type
    • Store data as JSON?
    • Use a document database?
  • Video
    • Video metadata
    • Parent series
  • Series
    • Series metadata
  • Source of video data is Opencast
  • Source of video metadata is Opencast
  • Push notifications with fallback
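The tree variant above could be modeled roughly like this in Rust (all names are hypothetical): realms (pages) form a tree, and each realm holds a list of typed content blocks, which could be serialized to JSON for storage instead of using a document database.

```rust
// Hypothetical model: a realm (page) with typed content blocks and sub-realms.
struct Realm {
    name: String,
    blocks: Vec<Block>,
    children: Vec<Realm>,
}

// "Elements (content-boxes) have a type" as a plain enum.
enum Block {
    Text { content: String },
    Video { event_id: String },
    Series { series_id: String },
}

// Walk the tree, e.g. to count all realms below (and including) `realm`.
fn count_realms(realm: &Realm) -> usize {
    1 + realm.children.iter().map(count_realms).sum::<usize>()
}
```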

Communication

  • Opencast SHOULD push new updates to Tobira
  • Opencast MUST keep track of changes, and when they happened, to:
    • Published videos
    • Video metadata
    • Access control lists
  • The set of pushed and persisted information is not necessarily the same
    • Updated metadata needs to be persisted and pushed
    • Updated workflow operation may not be that important
  • Tobira MUST keep track of its last full update

API

Opencast

  • Create a Tobira API service in Opencast
  • Get information from necessary services
  • API to get updates since X
    • Can be used to get all updates
  • API to synchronously update metadata
    • Returns the updated data
    • Used by video and series blocks
  • API to ingest new video
    • With metadata
    • Returns workflow summary
  • API to synchronously create/update new series
    • Returns the updated data
  • API to delete series/event

Tobira

  • API to receive update messages
  • API to get structure
  • API to create a new series page?
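The "receive update messages" endpoint could reduce to applying a small set of message types to Tobira's local copy of the data. A hypothetical sketch with invented names, using a plain map as a stand-in for the real database:

```rust
use std::collections::HashMap;

// Hypothetical update messages pushed by Opencast.
enum UpdateMessage {
    EventUpserted { id: String, title: String },
    EventDeleted { id: String },
}

// Apply one message to a local id → title store.
fn apply(events: &mut HashMap<String, String>, msg: UpdateMessage) {
    match msg {
        UpdateMessage::EventUpserted { id, title } => {
            events.insert(id, title);
        }
        UpdateMessage::EventDeleted { id } => {
            events.remove(&id);
        }
    }
}
```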

Random Thoughts

  • Use Apache Kafka?
    • It guarantees that messages reach the destination.
    • It supports replay.
    • Probably rather not
  • Database type
