equinor / data-modelling-storage-service

Storage service for Data Modelling Tool

Home Page: https://pypi.org/project/dmss-api/

License: MIT License

Dockerfile 0.08% Python 75.86% Shell 0.24% Gherkin 23.82%
python

data-modelling-storage-service's People

Contributors

cbuv, dependabot[bot], eoaksnes, ingeridhellen, kristiankjerstad, lassebje, netr0m, rikkebl, sindre-nistad, snyk-bot, soofstad, timothy-edward-kendon

data-modelling-storage-service's Issues

Improve BDD tests, should be faster and easier to write tests

Given there exists a package "TestData"
    And "TestData" contains blueprint "ItemType"
    And "ItemType" has attribute "list" of type "string"
    And "ItemType" is array
Scenario: Inherited properties should be added and overwrite parent properties
Given there exists a blueprint "Role" with attributes "id, name, description"
And there exists a blueprint "Employee" with attributes "id, name, roles"
And there exists a blueprint "Programmer" with attributes "id, name, skills"
When the domain expert extends the blueprint "Programmer" with the "Employee" blueprint
Then attributes of the "Employee" blueprint are added and overwrite attributes of the "Programmer" blueprint
And so attributes of the blueprint "Programmer" become "id, name, skills, roles:Role"

View an extended blueprint

Scenario: View a blueprint that inherits from another blueprint
Given there exists a blueprint "Role" with properties "id, name, description"
And there exists a blueprint "Employee" with properties "id, name, roles:Role"
And there exists a blueprint "Programmer" with properties "id, name, skills"
And the "Programmer" blueprint extends "Employee" blueprint
When the domain expert views the "Programmer" blueprint
Then the attribute values shown are "id, name, skills" # It should not show the attributes of the Employee blueprint
And extends information is shown, that "Programmer" extends "Employee"

View an extended entity

Scenario: View an instance with a blueprint type that inherits from another blueprint
Given there exists a blueprint "Role" with properties "id, name, description"
And there exists a blueprint "Employee" with properties "id, name, roles:Role"
And there exists a blueprint "Programmer" with properties "id, name, skills"
And the "Programmer" blueprint extends "Employee" blueprint
And there exists an instance of type "Programmer" with name "Dennis Ritchie"
When the domain expert views the instance with name "Dennis Ritchie"
Then the attribute values shown are "id, name, skills, roles: Role"

Cleanup backend project structure

  • Cleanup project structure to match the clean architecture, e.g. remove core folder
  • See if it makes sense to divide things under data, domain, and presentation @eoaksnes
  • Remove files not needed
  • Use dependency injection everywhere. This does not seem to be the case right now: things are instantiated in the wrong layers. For example, GetDocumentUseCase seems to instantiate DocumentService itself instead of having it passed in.
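
The dependency injection point above could look roughly like the sketch below. Only the names GetDocumentUseCase and DocumentService come from this issue; the Protocol, the in-memory service, and the method names (get_document, execute) are assumptions made for illustration and are not taken from the DMSS code base.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class DocumentServiceProtocol(Protocol):
    """Assumed interface; the real DocumentService may look different."""

    def get_document(self, document_id: str) -> dict: ...


class InMemoryDocumentService:
    """Hypothetical stand-in for DocumentService, backed by a plain dict."""

    def __init__(self, documents: Dict[str, dict]):
        self._documents = documents

    def get_document(self, document_id: str) -> dict:
        return self._documents[document_id]


@dataclass
class GetDocumentUseCase:
    # The service is handed in by the caller instead of being created here,
    # so the use case does not know which concrete service it talks to.
    document_service: DocumentServiceProtocol

    def execute(self, document_id: str) -> dict:
        return self.document_service.get_document(document_id)


# Wiring happens at the outer layer (for example where the API is set up).
service = InMemoryDocumentService({"1": {"name": "TestData"}})
use_case = GetDocumentUseCase(document_service=service)
result = use_case.execute("1")
```

Because the use case only depends on the interface, a test can pass in a fake service like the one above without touching a database.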

Blobs 2.0

Uploading (updating) and deleting blobs does not work well at the moment.

We should come up with a better design for this.

This needs to function properly if we're going to use it for STasks.

investigate: access control

Authentication in DMT/DMSS

Identity Providers

Option 1: One per deployment

  • Config in WebApp and API (plugin? Or only oauth2?)

Option 2: One per dataSource

  • No config in webApp
  • Login to dataSource (each datasource has own oauth2 config)
  • Store tokens per dataSource client side

Access Control

Requirements

  • Read/Write access level per root package (ideally per file)
  • Different institutions can host their own data sources and DMSS instance
  • DMSS instances can share their data sources, and "hook up" to data sources owned by other institutions

Option 1: Repositories handle JWT

  • Can be fleshed out to fully support the OAuth2 on-behalf-of flow

+ Can use the user's credentials on the repository (Openworks etc.)
+ No need to keep storage credentials in DMSS
+ Arguably safer. Less privilege, and uses well-known-auth-schemes

- On-behalf-of will be unusably slow without a token cache
- Complicated, many moving parts
- Heavily dependent on repository support, or we will need to create wrappers
- Many different technologies (read only mongo vs azure blobs)
- Entities split over multiple repositories
- Hard to maintain and set up (lots of manual work)
- Hard to implement

Option 2: Data source handles auth per file in lookup table

  • Set a default "umask" for each datasource
  • Unix-like (owner, groups, other --> ReadWrite)
  • Possible to support root-package level access control for additional cost (rather inherit when created)

+ Minimal performance cost (we already have to lookup each file in the lookup table)
+ Confined and managed within the DMSS system
+ Not reliant on other technologies

- Repository credentials are shared across users (possible to set up separate DS credentials per user)
- Repository credentials stored in DMSS

Access Level    Value
Write           2
Read            1
None            0
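
A minimal sketch of how Option 2 might be modelled, assuming the access levels above map onto an integer enum and that each entry in the lookup table carries a Unix-like ACL (owner, groups, others). The class and function names are hypothetical, and the rule that a non-owner gets the highest level among matching groups and "others" is an assumption, not something this issue specifies.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, Set


class AccessLevel(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2


@dataclass
class ACL:
    """Hypothetical per-file ACL stored next to the lookup table entry."""

    owner: str
    owner_level: AccessLevel = AccessLevel.WRITE
    groups: Dict[str, AccessLevel] = field(default_factory=dict)
    others: AccessLevel = AccessLevel.NONE


def effective_access(acl: ACL, user: str, user_groups: Set[str]) -> AccessLevel:
    """Return the access level the user ends up with for this ACL."""
    if user == acl.owner:
        return acl.owner_level
    # Assumption: take the highest level among matching groups and "others".
    levels = [level for group, level in acl.groups.items() if group in user_groups]
    return max(levels + [acl.others])


def has_access(acl: ACL, user: str, user_groups: Set[str], required: AccessLevel) -> bool:
    # IntEnum makes the levels comparable, so WRITE implies READ.
    return effective_access(acl, user, user_groups) >= required


acl = ACL(owner="alice", groups={"geologists": AccessLevel.READ})
can_bob_read = has_access(acl, "bob", {"geologists"}, AccessLevel.READ)
can_bob_write = has_access(acl, "bob", {"geologists"}, AccessLevel.WRITE)
```

A data source's default "umask" could then simply be the ACL that new files receive when nothing else is specified.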

bug: blobs does not get deleted

"gridfs.errors.FileExists: file with _id '4ec0d646-907c-4eba-9e65-24106236d61b' already exists"

To reproduce:
Import an entity with a blob that has a set _id.
Delete the package and import the same entity again.

Add search

Move logic from DMT repository to here for search.

Feature: No _id should be visible to user

This is a meta issue, encapsulating several issues.
The goal is to have all pointers and identifiers to documents/entities use a path-like string.
This is necessary to make the API friendlier to work with, to have functional cross-data-source references, and to be able to dump/import a large package with both contained and uncontained entities with ease. It also improves the general human-friendliness of the system.

These should be completed in the given order:

  • Remove the "package.content" abstraction where we hide the "content" object
  • TreeNode should be able to generate a path to itself
  • DMT Index should set object paths
  • DMSS API should work with (also handle) paths
  • Entity references should be a path

Switch to FastAPI

  • Change to FastAPI with minimal changes
  • Request validation
  • OpenAPI refactoring and cleanup

Remove "description" as a required value?

Does it make sense to have "description" as a core attribute in our code? I think not. It should be removed from any place where it is hard-coded and kept only as a normal attribute (often inherited from NamedEntity).

Pre-commit hooks

Clean up and check that all pre-commit hooks work as expected after the move from the old repository.

Improve api documentation

The API documentation in its current state gives an overview of all endpoints and code snippets showing how to communicate with them, which is pretty neat.

However, actual usage of the API requires special knowledge of most of the endpoints.

We should add:

  • a description of what the endpoint does
  • what is valid input?
  • what is the output? This one is less important, but "returns an entity of a blueprint" would be sufficient in some cases.

Add Inheritance

As a domain expert,
I want to be able to specify that a blueprint should inherit the attributes of another blueprint
so that I don't need to repeat attributes unnecessarily.

Add extends attribute

As a domain expert,
I want to be able to specify that a blueprint should inherit the attributes of another blueprint
so that I don't need to repeat attributes unnecessarily.

  • Add extends to core Blueprint.json schema
  • Add extends to the Blueprint class. self.extended_attributes: List[BlueprintAttribute] =
  • Update the get_blueprint_cached function to add extended attributes (recursive function) to the returned blueprint. Alternatively, add a parent link recursively until the parent is found: self.parent : Blueprint = get_blueprint_cached(self.extends)
  • Update get_none_primitive_types, get_primitive_types inside Blueprint class to return extended attributes
Scenario: Inherited properties should be added and overwrite parent properties
Given there exists a blueprint "Role" with attributes "id, name, description"
And there exists a blueprint "Employee" with attributes "id, name, roles"
And there exists a blueprint "Programmer" with attributes "id, name, skills"
When the domain expert extends the blueprint "Programmer" with the "Employee" blueprint
Then attributes of the "Employee" blueprint are added and overwrite attributes of the "Programmer" blueprint
And so attributes of the blueprint "Programmer" become "id, name, skills, roles:Role"
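
The recursive lookup described in this issue could be sketched as follows. The Blueprint and BlueprintAttribute classes here are simplified, hypothetical versions, and the REGISTRY dict with get_blueprint is only a stand-in for get_blueprint_cached. The issue is ambiguous about which side wins when attribute names clash; this sketch assumes the extending blueprint's own definition is kept. It does not guard against circular extends chains.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BlueprintAttribute:
    name: str
    attribute_type: str = "string"


@dataclass
class Blueprint:
    name: str
    attributes: List[BlueprintAttribute]
    extends: Optional[str] = None


# Hypothetical in-memory registry standing in for get_blueprint_cached.
REGISTRY: Dict[str, Blueprint] = {}


def get_blueprint(name: str) -> Blueprint:
    return REGISTRY[name]


def resolve_attributes(blueprint: Blueprint) -> List[BlueprintAttribute]:
    """Return own attributes followed by inherited ones not already defined."""
    merged = {attr.name: attr for attr in blueprint.attributes}
    if blueprint.extends is not None:
        # Recurse so that attributes from grandparents are included as well.
        for attr in resolve_attributes(get_blueprint(blueprint.extends)):
            # Assumption: the extending blueprint wins on a name clash.
            merged.setdefault(attr.name, attr)
    return list(merged.values())


def attr(name: str, attribute_type: str = "string") -> BlueprintAttribute:
    return BlueprintAttribute(name, attribute_type)


REGISTRY["Employee"] = Blueprint(
    "Employee", [attr("id"), attr("name"), attr("roles", "Role")]
)
REGISTRY["Programmer"] = Blueprint(
    "Programmer", [attr("id"), attr("name"), attr("skills")], extends="Employee"
)

resolved = resolve_attributes(REGISTRY["Programmer"])
resolved_names = [a.name for a in resolved]  # ["id", "name", "skills", "roles"]
```

Swapping setdefault for a plain assignment would make the parent's definition win instead, which is the other possible reading of the scenario.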
