subdavis / workspaces-io

A simple FastAPI server to manage workspaces and multi-user sharing within MinIO.

Home Page: https://workspacesio.subdavis.com

License: MIT License

Languages: Python 88.47%, Vue 6.50%, TypeScript 2.76%, Shell 1.02%, Dockerfile 0.51%, JavaScript 0.29%, CSS 0.23%, HTML 0.22%

Topics: minio, fastapi, s3, python-click

workspaces-io's Introduction

Better than your NAS, probably

A dead-simple FastAPI service to manage multi-user permissions and indexing for S3 and MinIO.

Documentation

Features

  • Non-invasive import and indexing of your existing data. We leave your data just as it is on-disk.
  • Simple permissions management. Use wio create to make new workspaces and wio share to share them with others.
  • Low-friction access to your data. WorkspacesIO grants STS credentials so that users can connect directly to MinIO for listing, upload, and download operations. You can even continue to use minio/mc or boto3, with some caveats.
  • Permissions-aware indexing and aggregation across the system.
  • Hub-and-spoke architecture. Run a MinIO node wherever you have data, and it will be available through Workspaces. Even regular users can introduce new nodes into the system and retain full control of their data.
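
The direct-access point above can be sketched with boto3. This is a minimal sketch, assuming you already hold temporary credentials as a dict with access_key, secret_key, and session_token fields. That dict layout, the helper names, and the endpoint URL are hypothetical and are not the WorkspacesIO API; only the boto3 client keyword arguments are real.

```python
from typing import Dict


def s3_client_kwargs(creds: Dict[str, str], endpoint_url: str) -> Dict[str, str]:
    """Map temporary (STS) credentials onto boto3 client keyword arguments.

    The `creds` layout is an assumption for illustration; adapt it to
    whatever your credential source actually returns.
    """
    return {
        "endpoint_url": endpoint_url,  # the MinIO node, e.g. http://hostname:9000
        "aws_access_key_id": creds["access_key"],
        "aws_secret_access_key": creds["secret_key"],
        "aws_session_token": creds["session_token"],  # needed for temporary credentials
    }


def make_s3_client(creds: Dict[str, str], endpoint_url: str):
    """Build a boto3 S3 client that talks directly to a MinIO node."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    return boto3.client("s3", **s3_client_kwargs(creds, endpoint_url))
```

With a client built this way, standard calls such as list_objects_v2 go straight to the MinIO node. How workspaces map onto buckets and prefixes is not covered here.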

Screenshots

Images in a directory

directory of directories

# List user workspaces
~$ wio workspace ls
[2020-10-17T21:07:18.996866] [email protected]/meva/ (unmanaged)
[2020-10-17T21:06:11.048850] [email protected]/second/ (private)
[2020-10-17T21:06:11.470730] [email protected]/third/ (private)
[2020-10-17T21:07:18.188769] [email protected]/kobodls/ (unmanaged)
[2020-10-17T21:07:18.720848] [email protected]/metabolomics/ (unmanaged)
[2020-10-17T21:06:10.623132] [email protected]/first/ (public)
[2020-10-17T21:07:17.911934] [email protected]/_samples/ (unmanaged)
[2020-10-17T21:07:19.575340] [email protected]/viame-web/ (unmanaged)

# List instances of MinIO
~$ wio node ls
[2020-10-17T21:06:08.480801] ddb915fb-d911-4a8a-8971-b2fccd4e4ea8 http://hostname:9000 default
[2020-10-17T21:06:09.803527] 6a4cf079-6c2f-4f09-abb0-abe39379e168 http://hostname:9100 secondary

# List contents of a workspace using the MinIO client
~$ wio mc ls viame-web/NOAAWorkshop2020/
[2020-10-18 17:47:33 EDT]      0B Aerial Footage/
[2020-10-18 17:47:33 EDT]      0B Completed Pipelines/
[2020-10-18 17:47:33 EDT]      0B Fish Test Set/
[2020-10-18 17:47:33 EDT]      0B Fish Training/
[2020-10-18 17:47:33 EDT]      0B Scallop Test Set/
[2020-10-18 17:47:33 EDT]      0B Sea Lion Test Set/
[2020-10-18 17:47:33 EDT]      0B Sea Lion Training/

Philosophy

Data management should come to users and the places they already have data. If your team wants to use powerful industry-standard tools like MinIO and Elasticsearch, but needs permissions management, WorkspacesIO might be an option.

Caveats

For whatever reason, you can't explicitly revoke STS credentials. That's how AWS does it, so MinIO won't implement it either. This means that share revocation has a big asterisk: anyone with outstanding credentials can continue to modify data in S3 until that share expires.
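
To make the asterisk concrete, here is a minimal sketch of the exposure window left open after a revocation. It assumes the window closes when the outstanding credentials expire; the function name and the example timestamps are hypothetical and are not part of WorkspacesIO.

```python
from datetime import datetime, timedelta, timezone


def exposure_after_revocation(revoked_at: datetime, credential_expiry: datetime) -> timedelta:
    """Return how long already-issued credentials stay usable after revocation.

    Temporary credentials cannot be revoked early, so access lasts until
    they expire. A zero result means they had already expired.
    """
    return max(credential_expiry - revoked_at, timedelta(0))


# Hypothetical timestamps: the share is revoked 45 minutes before expiry.
revoked = datetime(2020, 10, 17, 21, 0, tzinfo=timezone.utc)
expires = datetime(2020, 10, 17, 21, 45, tzinfo=timezone.utc)
print(exposure_after_revocation(revoked, expires))  # 0:45:00
```

Shorter credential lifetimes shrink this window, at the cost of clients refreshing their credentials more often.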

FAQ

What's a workspace?

It's just a folder. Users manage the hierarchy within. Permissions are managed at the workspace level. You can search for workspace contents and share individual objects, but these are features of Elasticsearch and MinIO, respectively.

Who is this for?

WorkspacesIO is for organizations that need to manage large quantities of slow-moving data of the sort that laboratories and research teams accumulate. Think Samba, CIFS, FTP, SSHFS. WorkspacesIO can map your existing data while you transition, or run side-by-side forever.

What if I want to share a single file?

There are always pre-signed URLs that you can email to a colleague. But you shouldn't think of this like Google Drive; WorkspacesIO isn't for slide decks.

Why not just MinIO?

MinIO's multi-user management is great when all users a) need their own space or b) can share everything, but it's cumbersome for dynamic permissions management. WorkspacesIO gives you role-based access control.

Can I just try it? What if I hate it?

Because it doesn't modify the structure of your data on disk, WorkspacesIO is easy to try and just as easy to walk away from.

Multiple nodes? Isn't that just MinIO Distributed Mode?

Not at all. MinIO's Distributed Mode solves an operational and deeply technical problem. It allows for data redundancy and high availability. WorkspacesIO solves a bureaucratic problem. You've got data on different servers and workstations in different locations. You can't reasonably migrate it all into the same storage cluster, but you want to provide read/write/search to certain authenticated users across your org.

Development setup

Server

Install the ldc tool.

# run production services
ldc up

# swap in the development container
ldc dev workspaces

Now you're running a FastAPI service in development mode inside a Docker container. Local directories are mounted in.

Client

virtualenv -p python3 venv/
source venv/bin/activate

pip3 install -e .
pip3 install -r dev.requirements.txt

wio --help

Referring to workspaces

You can refer to workspaces either

  • directly by workspacename/
  • through their owner by owner/workspacename/
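
The two reference forms above can be told apart with a small parser. This is a minimal sketch, not how the wio client resolves names: WorkspaceRef and parse_workspace_ref are hypothetical, and the sketch only handles bare workspace references, not paths inside a workspace, because workspacename/folder/ and owner/workspacename/ look alike without asking the server.

```python
from typing import NamedTuple, Optional


class WorkspaceRef(NamedTuple):
    owner: Optional[str]  # None when the reference does not name an owner
    name: str


def parse_workspace_ref(ref: str) -> WorkspaceRef:
    """Parse 'workspacename/' or 'owner/workspacename/' into its parts."""
    # Drop empty segments so leading or trailing slashes do not matter.
    parts = [p for p in ref.strip().split("/") if p]
    if len(parts) == 1:
        return WorkspaceRef(owner=None, name=parts[0])
    if len(parts) == 2:
        return WorkspaceRef(owner=parts[0], name=parts[1])
    raise ValueError(f"not a bare workspace reference: {ref!r}")
```

For example, parse_workspace_ref("first/") yields a reference with no owner, while parse_workspace_ref("alice/first/") names the owner alice explicitly.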

Credit

Credit to Filestash.app for the frontend file browser. Integration into WorkspacesIO is ongoing and can be found at subdavis/filestash.

workspaces-io's People

Contributors

danlamanna, kotfic, subdavis

workspaces-io's Issues

Setup MkDocs

Build docs/ into a MkDocs site on GitHub Pages or something.

Get S3 working, write docs.

S3 is harder than MinIO to configure because the IAM role you use needs an attached base policy with a superset of the permissions that the WorkspacesIO service might request.

Produce documentation for this.

Write vs. docs

Explain how this tool is different from common data storage solutions and when you might want to choose something else.

  • Seafile
  • Dropbox / Google Drive / Other SaaS
  • NextCloud
  • Raw MinIO + Elasticsearch + Filestash (often a really good option)
  • Pydio Cells
  • Invenio
  • Seaweed FS
  • Filerun

This is data management, not data sync.
