mkoppmann / eselsohr

Eselsohr is a self-hostable bookmark manager for storing web articles.

License: European Union Public License 1.2

Dockerfile 0.70% Haskell 97.77% Shell 0.27% Nix 0.78% CSS 0.49%
haskell self-hosted ocap object-capabilities hacktoberfest bookmark three-layer-cake webkey principle-of-least-privilege principle-of-least-authority

eselsohr's Introduction

Eselsohr

Eselsohr is a self-hostable bookmark manager for storing web articles. Read them later or share access to your collection. It’s still in an early stage of development and not ready for production.

Build Eselsohr

Pre-built binary releases are provided as CI artifacts.

To build the project manually, you’ll need GHC and the Cabal build tool. Download this repository and change your working directory into it. You can then install the executable with:

cabal install --install-method=copy --overwrite-policy=always

By default, the resulting binary gets stored in ~/.cabal/bin/eselsohr-exe.

Nix support

If you have Nix installed with Flakes support you can enter the development environment by running nix develop.

Deploy Eselsohr

Eselsohr is distributed as a single binary and does not have any other dependencies. It can be configured with environment variables or with a configuration file (eselsohr --config-file /path/to/file). The folder with the static resources is also required. By default, Eselsohr looks for an .env file and a static/ directory in the current working directory.

The following values can be set:

  • DATA_FOLDER: File path where data is persisted. Defaults to XdgData.
  • BASE_URL: Base URL to generate HTML links. Defaults to http://localhost.
  • LOG_LEVEL: Level for the built-in logger. Defaults to Error.
  • PORT: Port number on which the web server will listen. Defaults to 6979.
  • LISTEN_ADDR: Address where the web server will listen. Defaults to 127.0.0.1.
  • HTTPS: Send HSTS HTTP header. Automatically enabled when X-Forwarded-Proto HTTP header is set to https. Defaults to False.
  • DISABLE_HSTS: Do not send the HSTS HTTP header when HTTPS is set. Defaults to False.
  • CERT_FILE: File path to the TLS certificate file. Not set by default.
  • KEY_FILE: File path to the TLS key file. Not set by default.
  • DEPLOYMENT_MODE: The mode the application is running in. Can be Prod, Test, or Dev. Defaults to Prod.
  • PUBLIC_COLLECTION_CREATION: Whether the creation of collections should be public. Defaults to False.
  • STATIC_FOLDER_PATH: The path to the folder with static resources. Defaults to static/.
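
As an illustration of how optional settings with defaults can be resolved, here is a minimal Haskell sketch using only the base library. It is not Eselsohr's actual configuration code: the names Settings, resolveSettings, and loadSettings are hypothetical, and only three of the variables listed above are covered.

```haskell
import Control.Monad (join)
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- Hypothetical subset of the settings listed above.
data Settings = Settings
  { port       :: Int
  , listenAddr :: String
  , baseUrl    :: String
  } deriving (Show, Eq)

-- Pure core: resolve each setting through a lookup function and fall back
-- to the documented default when the variable is missing.
-- Assumption of this sketch: an unparsable PORT also falls back to 6979.
resolveSettings :: (String -> Maybe String) -> Settings
resolveSettings get = Settings
  { port       = fromMaybe 6979 (get "PORT" >>= readMaybe)
  , listenAddr = fromMaybe "127.0.0.1" (get "LISTEN_ADDR")
  , baseUrl    = fromMaybe "http://localhost" (get "BASE_URL")
  }

-- Impure shell: read the process environment once, then reuse the pure core.
loadSettings :: IO Settings
loadSettings = do
  let keys = ["PORT", "LISTEN_ADDR", "BASE_URL"]
  values <- mapM lookupEnv keys
  let env = zip keys values
  pure (resolveSettings (\key -> join (lookup key env)))
```

Keeping the resolution logic pure makes the defaults easy to check without touching the real environment.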

Currently, all configuration parameters are optional. Starting Eselsohr can be as simple as executing the Eselsohr binary in a directory along with the static resources. If you don’t allow the public creation of collections, you can generate access tokens for collections by running eselsohr-exe collection new.

The dist directory in this repository provides deployment-relevant files, such as an example rc file for FreeBSD and a service file for systemd-based Linux distributions.

Docker-based

Alternatively, a Docker image is provided. You can build and run it like so:

sudo docker build -t eselsohr .
sudo docker run -p 6979:6979 -v eselsohr-data:/data eselsohr

Architecture

The app is based on the Three Layer Cake architecture, which is similar to the Hexagonal or Clean Architecture. The initial code was based on the three-layer project.

Eselsohr is used as a research playground for capability-based security in the context of web applications.

Common web applications use authentication with cookies or HTTP headers to enable identity-based authorization. This has certain disadvantages, some of which are listed in the description of the Waterken Server.

In Eselsohr, authorization works with capabilities. A capability is a shareable, unforgeable token that references a piece of data, including the associated set of access rights. In our case, authorized requests work with access tokens: Base32-encoded binary data transmitted either in HTTP query strings or in an HTML body. An access token points to a Capability, which points to a data structure called ObjectReference. Capabilities can also have additional, optional properties like a petname or an expiration date. An object reference gives access to a collection or a single resource and has the associated permissions encoded within.
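
The chain from access token to capability to object reference can be sketched roughly as follows. This is an assumption-laden illustration, not Eselsohr's real data model: every type, field, and function name here is hypothetical, timestamps are simplified to plain integers, and the Base32 encoding step is left out entirely.

```haskell
import Data.List (find)

-- All names in this sketch are hypothetical.
newtype CollectionId = CollectionId Int deriving (Show, Eq)
newtype CapabilityId = CapabilityId Int deriving (Show, Eq)
newtype ArticleId    = ArticleId Int    deriving (Show, Eq)

-- What a decoded access token carries: the resource and the capability in it.
data AccessToken = AccessToken
  { tokenCollection :: CollectionId
  , tokenCapability :: CapabilityId
  } deriving (Show, Eq)

data Permissions = Permissions
  { canView :: Bool
  , canEdit :: Bool
  } deriving (Show, Eq)

-- Gives access to a whole collection or to a single resource.
data ObjectReference
  = CollectionRef Permissions
  | ArticleRef ArticleId Permissions
  deriving (Show, Eq)

data Capability = Capability
  { capId        :: CapabilityId
  , capPetname   :: Maybe String
  , capExpiresAt :: Maybe Int -- simplified timestamp; Nothing means no expiry
  , capObjectRef :: ObjectReference
  } deriving (Show, Eq)

-- Each collection is stored separately, together with its own capabilities.
type Store = [(CollectionId, [Capability])]

-- Follow the token to its capability and return the object reference,
-- unless the collection or capability is unknown or the capability expired.
resolve :: Int -> Store -> AccessToken -> Maybe ObjectReference
resolve now store token = do
  caps <- lookup (tokenCollection token) store
  cap  <- find ((== tokenCapability token) . capId) caps
  if maybe True (> now) (capExpiresAt cap)
    then Just (capObjectRef cap)
    else Nothing
```

Because the lookup starts from the collection named in the token, a token can never resolve to a capability stored in a different collection.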

Object references are required for accessing the global state, like fetching an article with a specific ID. This is enforced by authorized actions: a data type which corresponds to user actions like creating a new article, or changing an article’s title. To obtain such an authorized action token, one has to pass an object reference, and in some cases the ID on which the action will be performed, to functions which evaluate whether the required permissions are set in the object reference. This forces us to perform authorization checks, so we can’t forget them.
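
The idea of authorized actions can be illustrated with a small sketch. It assumes a single action (changing a title) and an in-memory article list; the names ChangeTitleAction, authorizeChangeTitle, and changeTitle are hypothetical and not taken from the Eselsohr code base.

```haskell
newtype ArticleId = ArticleId Int deriving (Show, Eq)

-- Hypothetical permission set carried by an object reference.
data Permissions = Permissions
  { canCreateArticles :: Bool
  , canChangeTitles   :: Bool
  } deriving (Show, Eq)

newtype ObjectReference = ObjectReference Permissions deriving (Show, Eq)

data AuthError = NotAuthorized deriving (Show, Eq)

-- Proof that the permission check has passed. In a real module the
-- constructor would not be exported, so authorizeChangeTitle would be
-- the only way to obtain a value of this type.
newtype ChangeTitleAction = ChangeTitleAction ArticleId deriving (Show, Eq)

-- The permission check: turns an object reference plus an ID into an
-- authorized action, or fails.
authorizeChangeTitle :: ObjectReference -> ArticleId -> Either AuthError ChangeTitleAction
authorizeChangeTitle (ObjectReference perms) articleId
  | canChangeTitles perms = Right (ChangeTitleAction articleId)
  | otherwise             = Left NotAuthorized

type Articles = [(ArticleId, String)]

-- The state-changing function accepts only the authorized action, never a
-- raw ID, so it cannot be called without going through the check above.
changeTitle :: ChangeTitleAction -> String -> Articles -> Articles
changeTitle (ChangeTitleAction target) newTitle =
  map (\(aid, title) -> if aid == target then (aid, newTitle) else (aid, title))
```

With this shape, skipping the authorization check becomes a compile-time error rather than a runtime bug, because changeTitle has no other way to receive its first argument.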

The application tries to incorporate the Principle of Least Privilege wherever it can. Instead of using one single data storage for everything, each article collection is stored as a separate resource in the system. In theory, if Eselsohr had a vulnerability like an SQL injection, an attacker could only access their own data, because they do not have a reference to the other resources. Because access tokens are encoded binary data that point to the ID of a resource and the ID of a capability, they work across multiple server instances, as long as the server has access to that resource.

Eselsohr is written in the programming language Haskell. Although it’s never explicitly called a capability-based language, Haskell is a great language for implementing capability-based techniques: functions are pure, data has to be passed explicitly as arguments to other functions, and global state is rarely used. Side effects in Haskell are explicit and are normally done within the IO data type. But IO alone means that any side effect can happen. Eselsohr uses type classes in a way that represents access to certain IO actions. For example, a function that wants to scrape a website needs to have the MonadScraper constraint in its signature. All side effects are therefore explicit, and only effects that are wrapped this way can be used.
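
To make the type class approach concrete, here is a small self-contained sketch. Only the class name MonadScraper comes from the text above; its method scrapeTitle, the function articleTitle, and the FakeWeb test monad are assumptions made for illustration and may differ from the real code.

```haskell
import Data.Maybe (fromMaybe)

-- The class name is from the text; the method is a hypothetical example.
class Monad m => MonadScraper m where
  scrapeTitle :: String -> m (Maybe String)

-- This function may scrape, and nothing else: the constraint grants access
-- to scrapeTitle but not to arbitrary IO.
articleTitle :: MonadScraper m => String -> m String
articleTitle url = fromMaybe url <$> scrapeTitle url

-- A pure stand-in for the web, useful in tests: a table of URL/title pairs.
newtype FakeWeb a = FakeWeb { runFakeWeb :: [(String, String)] -> a }

instance Functor FakeWeb where
  fmap f (FakeWeb g) = FakeWeb (f . g)

instance Applicative FakeWeb where
  pure x = FakeWeb (const x)
  FakeWeb f <*> FakeWeb g = FakeWeb (\pages -> f pages (g pages))

instance Monad FakeWeb where
  FakeWeb g >>= k = FakeWeb (\pages -> runFakeWeb (k (g pages)) pages)

-- "Scraping" in the fake monad is just a table lookup.
instance MonadScraper FakeWeb where
  scrapeTitle url = FakeWeb (lookup url)
```

A production instance would perform the real HTTP request in an IO-based monad, while code written against the constraint stays unchanged.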

eselsohr's Issues

Export article list as RSS/Atom/JSON feed

Thanks to our web keys it should be no problem to add authorized feeds to RSS aggregators.
If tags are supported (#37) then feeds could be customized to only display read/unread articles or articles with specific tags.

Add load function to repositories

The current implementation does not need it. It would still be useful for the test suite so it’s easier to verify state changing actions.

Provide recovery option with email

It should be possible to add a recovery email address in case one forgets their collection unlock link.
The address could be stored as a hash.

Replace Clay With Plain CSS

Generating CSS with Clay doesn’t give us clear benefits, but it has some drawbacks:

  • Its current version does not support GHC 9.2+ although its master branch already supports 9.4
  • It’s not really easier to read or write than plain CSS
  • CSS generated by Clay always ends with a footer referring to the Clay website

Catch exceptions for IO functions

Currently, some IO functions, like reading or writing a file or scraping a website, throw exceptions that are not handled. At least the common ones should be caught and converted into app-specific errors.
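
One possible shape for this, sketched with Control.Exception from base. The AppError type and the readFileSafe function are hypothetical and only cover file reading:

```haskell
import Control.Exception (IOException, evaluate, try)

-- Hypothetical app-specific error carrying the path and the original message.
data AppError = FileReadError FilePath String deriving (Show, Eq)

-- Read a file and convert IO exceptions into an AppError value.
-- readFile is lazy, so the contents are forced inside try; otherwise a
-- read error could surface later, outside the handler.
readFileSafe :: FilePath -> IO (Either AppError String)
readFileSafe path = do
  result <- try (readFile path >>= \s -> evaluate (length s) >> pure s)
  pure $ case result of
    Left e         -> Left (FileReadError path (show (e :: IOException)))
    Right contents -> Right contents
```

Only IOException is caught here, so unrelated exceptions still propagate.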

Display timestamps with local timezone

Currently, timestamps are displayed in UTC. Two possible ways to fix this:

  1. use JavaScript (built-in support)
  2. add support for server-side storage of user settings. Timezone would then be one of the settings.

Solution 1 would be the easiest and the original timestamp would be displayed if JavaScript is disabled in the user’s browser.

Add import/export feature

It should be possible to migrate between Eselsohr instances. An import/export feature is required for this.

Support migrations

Eselsohr currently does not support migrations for the persisted data. This is absolutely necessary.

PWA support

When a web application is recognized by browsers as a progressive web app it can be installed and used similar to a native application.

These features are currently missing according to the PWA criteria list:

  • Register a service worker with a fetch handler
  • Reference a web app manifest

This would unlock additional features. For example, new articles can be added with the native sharing functionality of the operating system. (see Web Share API)

Upgrade to GHC 9.2

This allows us to use the new dot notation. This should be done first before the module structure is refactored again. With the removal of Clay (see #157) all remaining dependencies should have support for 9.2.

Remove “Unlock collection” page?

Having a place where people can paste in their collection unlock access token could teach them that it’s OK to do so.
This then would make it easier for attackers to try some phishing attacks.
Maybe it’s better to say right away that the token and the URL should stay together.

When we generate a new unlock access token from the command line the correct URL should also be displayed.
For this we would need our BASE_URL config option back.

Provide JSON API

In addition to the HTML-only frontend, a separate JSON API would be handy so that other clients could get developed.
This should also showcase the chosen architecture, as commands can be reused while queries are optimized for their specific output format. The main challenge will be the transportation of access tokens. HATEOAS should be the solution for this.

Provide bookmarklet

It should be possible to add the current page to an article list via a bookmarklet.

Provide Linux binaries in CI or Releases

Haskell programs can be distributed as binaries. We should provide one, at least with every release. Ideally, they would be completely statically linked, so they have no dependencies on the system, but I’m not sure if this is currently possible with GitHub Actions.

Delete expired capabilities

Currently, expired capabilities are bit-rotting in the persisted state. Some cleanup job should take care of this.
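
The core of such a cleanup job could be a pure filter over the stored capabilities. The sketch below assumes a simplified, hypothetical Capability record with an optional integer expiration timestamp; the real persisted format may differ.

```haskell
-- Hypothetical, simplified capability record.
data Capability = Capability
  { capName      :: String
  , capExpiresAt :: Maybe Int -- simplified timestamp; Nothing means no expiry
  } deriving (Show, Eq)

-- Keep capabilities that never expire or that expire after the given time.
pruneExpired :: Int -> [Capability] -> [Capability]
pruneExpired now = filter (maybe True (> now) . capExpiresAt)
```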

Provide CI with GitHub Actions

This project was once part of a private monorepo with its own CI. Now, with the release on GitHub, we should provide a configuration file for GitHub Actions. A simple CI that just builds the project and runs the test suite would be a good start.

Add new articles via email

It would be very nice if one could add an article by sending it via email.
Similar to Evernote or Instapaper.

Add versioning

Correct versioning is necessary for GitHub Releases, using Cachix (#4), and adding Eselsohr to Docker Hub (#3).

Requires #26.

Improve error handling

Don’t use WithError inside the infrastructure layer. Let the errors return to the controller, where appropriate error pages can be loaded.

Add Eselsohr image to Docker Hub

We already have a Dockerfile. For easy deployments, Eselsohr should also be available on Docker Hub for people who prefer containers.
