scai-bio / index

Intelligent data steward toolbox using Large Language Model embeddings for automated Data-Harmonization

Home Page: https://index.bio.scai.fraunhofer.de

License: Apache License 2.0

Python 29.47% Dockerfile 0.84% HTML 17.94% TypeScript 50.18% SCSS 0.99% CSS 0.58%

data-harmonization data-stewardship embeddings large-language-models semantic-mapping

index's Introduction

INDEX – the Intelligent Data Steward Toolbox

INDEX is an intelligent data steward toolbox that uses Large Language Model embeddings for automated data harmonization.

Installation

Local Development Server

Starting the backend

cd api
pip install -r requirements.txt
uvicorn routes:app --reload --port 5000

Navigate to localhost:5000 to access the backend.

Starting the frontend

cd client
ng serve

Navigate to localhost:4200 to access the frontend.

Docker

You can start both frontend and API using docker-compose:

docker-compose -f docker-compose.local.yaml up

Configuration

TODO: Add configuration instructions

index's Issues

Create (temporary) PyPI release

intex
datasteward
datastew

Add frontend tests to CI

@mehmetcanay could you add the Angular tests to the test CI, as in PDataviewer?

Find a way to dynamically load only a pre-configured list of models during the build process

Bug: Duplicate entries crash DB

curl -X PUT "https://index.bio.scai.fraunhofer.de/concepts/id001/mappings?terminology_id=test_ab3&concept_name=cough&text=erkaeltung" -H "accept: application/json"
{"detail":"Failed to create or update concept: (sqlite3.IntegrityError) UNIQUE constraint failed: concept.id\n[SQL: INSERT INTO concept (id, name, terminology_id) VALUES (?, ?, ?)]\n[parameters: ('id001', 'cough', 'test_ab3')]\n(Background on this error at: https://sqlalche.me/e/20/gkpj)"}

{"detail":"Failed to create or update terminology: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (sqlite3.IntegrityError) UNIQUE constraint failed: concept.id\n[SQL: INSERT INTO concept (id, name, terminology_id) VALUES (?, ?, ?)]\n[parameters: ('id001', 'cough', 'test_ab3')]\n(Background on this error at: https://sqlalche.me/e/20/gkpj) (Background on this error at: https://sqlalche.me/e/20/7s2a)"}
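The two responses point to two separate problems: the repeated PUT triggers a plain INSERT on an existing concept.id, and the failed transaction is apparently never rolled back, so the next request fails as well. One possible fix is an upsert combined with an explicit rollback on error. The following is a minimal sketch using the standard-library sqlite3 module rather than the project's SQLAlchemy session; the helper name upsert_concept and the table definition are assumptions inferred from the error message, and ON CONFLICT ... DO UPDATE requires SQLite 3.24 or newer.

```python
import sqlite3


def upsert_concept(conn, concept_id, name, terminology_id):
    """Insert a concept, or update it in place if the id already exists.

    Hypothetical helper; the real backend goes through SQLAlchemy.
    """
    try:
        conn.execute(
            """
            INSERT INTO concept (id, name, terminology_id)
            VALUES (?, ?, ?)
            ON CONFLICT(id) DO UPDATE SET
                name = excluded.name,
                terminology_id = excluded.terminology_id
            """,
            (concept_id, name, terminology_id),
        )
        conn.commit()
    except sqlite3.Error:
        # Leave the connection usable for the next request.
        conn.rollback()
        raise


# Table layout assumed from the INSERT statement in the error message.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept (id TEXT PRIMARY KEY, name TEXT, terminology_id TEXT)"
)

upsert_concept(conn, "id001", "cough", "test_ab3")
upsert_concept(conn, "id001", "cough", "test_ab3")  # repeated PUT no longer raises

rows = conn.execute("SELECT id, name, terminology_id FROM concept").fetchall()
```

With SQLAlchemy, the same idea would probably translate to checking for an existing row (or merging) before adding, and calling Session.rollback() in the exception handler, as the second error message suggests.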

Add description to Excel-to-Excel map

Also add a parameter that sets the number of closest matches (default = 1), and add those matches as columns to the resulting dataframe.
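The requested parameter could behave roughly as sketched below. This is an illustration only, not the project's mapping code: the function names, the n_matches parameter, the match_1, match_2, ... column names, and the toy two-dimensional vectors are all assumptions, and plain dictionaries stand in for the dataframe rows.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_matches(query, candidates, n_matches=1):
    """Return the n_matches candidate names most similar to query, best first."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine_similarity(query, candidates[name]),
        reverse=True,
    )
    return ranked[:n_matches]


def add_match_columns(rows, source_vectors, target_vectors, n_matches=1):
    """Copy each row and add one match_<rank> column per closest match."""
    result = []
    for row in rows:
        query = source_vectors[row["variable"]]
        new_row = dict(row)
        for rank, name in enumerate(
            closest_matches(query, target_vectors, n_matches), start=1
        ):
            new_row[f"match_{rank}"] = name
        result.append(new_row)
    return result


# Toy embeddings, made up for the example.
source_vectors = {"cough": [1.0, 0.0]}
target_vectors = {
    "coughing": [0.9, 0.1],
    "common cold": [0.6, 0.4],
    "fever": [0.0, 1.0],
}

mapped = add_match_columns(
    [{"variable": "cough"}], source_vectors, target_vectors, n_matches=2
)
```

With the default n_matches=1 only a match_1 column is added, which keeps the current single-match behaviour.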

Move db file to own directory

Keeping the db file in the same directory as a Python package causes issues when that directory is mounted in a data container or as a PVC.
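One way to decouple the database location from the package is to read the directory from the environment. The sketch below assumes a hypothetical INDEX_DB_DIR variable, a default directory named db, and a file name index.db; none of these names come from the project.

```python
import os
from pathlib import Path


def resolve_db_path(env=None):
    """Return the database file path inside a dedicated, mountable directory.

    INDEX_DB_DIR, the "db" default and the "index.db" file name are
    hypothetical and only serve the example.
    """
    env = os.environ if env is None else env
    db_dir = Path(env.get("INDEX_DB_DIR", "db"))
    # Create the directory so a fresh volume mount works out of the box.
    db_dir.mkdir(parents=True, exist_ok=True)
    return db_dir / "index.db"
```

A container or PVC can then be mounted at the configured directory without shadowing the package's Python files.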

Add pagination to GET endpoints
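A common approach is offset/limit pagination. The helper below is a framework-independent sketch; the function name, the parameter names, the default limit of 100, and the response shape are assumptions rather than the project's API.

```python
def paginate(items, offset=0, limit=100):
    """Return one page of items plus the metadata a client needs to page on."""
    if offset < 0 or limit < 1:
        raise ValueError("offset must be >= 0 and limit must be >= 1")
    return {
        "total": len(items),
        "offset": offset,
        "limit": limit,
        "items": items[offset:offset + limit],
    }


# Example data, made up for the illustration.
concepts = [f"concept_{i}" for i in range(5)]
page = paginate(concepts, offset=2, limit=2)
```

In an endpoint, offset and limit would typically arrive as query parameters, and the slicing would be pushed down into the database query instead of being applied to an in-memory list.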

Integrate scatter plot visualization into REST API

Update api version in routes.py during release

During the container build, an action should update the API version in routes.py to the current version tag. This ensures that the resulting Docker image reports the current release version as its API version.
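The substitution step could look like the sketch below. It assumes that routes.py passes the version as a double-quoted version="..." keyword argument (for example to the FastAPI constructor) and that release tags may carry a leading "v"; both are guesses about the project layout, and set_api_version is a hypothetical helper.

```python
import re

# Matches the first double-quoted version="..." assignment (assumed layout).
VERSION_PATTERN = re.compile(r'(\bversion\s*=\s*)"[^"]*"')


def set_api_version(source, tag):
    """Return source with its first version="..." replaced by the tag's version."""
    version = tag[1:] if tag.startswith("v") else tag
    updated, count = VERSION_PATTERN.subn(
        lambda m: f'{m.group(1)}"{version}"', source, count=1
    )
    if count == 0:
        raise ValueError('no version="..." assignment found')
    return updated


# Made-up file content for the illustration.
example = 'app = FastAPI(title="INDEX", version="0.0.0")\n'
patched = set_api_version(example, "v1.2.3")
```

A release workflow would read routes.py, pass the current tag to the helper, and write the result back before the image is built. Raising an error when nothing matches keeps a silent no-op from shipping a stale version.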
