molmedb / molmedb

MolMeDB is an open chemistry database about interactions of molecules with membranes.

Home Page: https://molmedb.upol.cz

PHP 14.78% HTML 4.45% CSS 6.74% JavaScript 74.02% Dockerfile 0.01%
chemistry lipids membranes cheminformatics database small-molecule permeability

molmedb's Introduction

MolMeDB is an open chemistry database about interactions of molecules with membranes.

"Open" means that you can send your scientific data to MolMeDB and that others may freely use it. We collect information on how chemicals interact with individual membranes either from experiment or from simulations.

How do compounds interact with membranes? They are attracted to membranes, they partition into membranes, they reside in specific positions in membranes, they penetrate through membranes, they change the membranes, and more.

MolMeDB tries to collect and display such data in order to help understand these phenomena.

MolMeDB is available at https://molmedb.upol.cz/

molmedb's People

Contributors

dominikmartinat, jhupol, jurja00, karelberka

molmedb's Issues

Enhance visualization of MolMeDB data

Current workflows in MolMeDB mainly lead users to download data tables and analyse them locally. An online visualization dashboard might enhance the usage of the data.

Uploader improvements

  • Rounding of values to two decimal places
  • Move "note" column to the end of table
  • Distinguish different interactions for different notes
  • Add uploader columns - chembl_id, chebi_id
  • Add possibility to link new dataset to the already existing dataset
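
The first two items in the list above could be sketched roughly as follows. This is only an illustration: the column names (`logK`, `logPerm`, `note`) and the helper names are assumptions, not the actual uploader code.

```python
from decimal import Decimal, ROUND_HALF_UP


def round2(value):
    """Round a numeric string to two decimal places; blanks pass through."""
    if value is None or str(value).strip() == "":
        return ""
    rounded = Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return str(rounded)


def normalize_row(row, numeric_columns, last_column="note"):
    """Return a new dict with rounded numerics and `last_column` moved to the end."""
    out = {}
    for key, value in row.items():
        if key == last_column:
            continue  # re-added at the end below
        out[key] = round2(value) if key in numeric_columns else value
    if last_column in row:
        out[last_column] = row[last_column]
    return out


# Hypothetical uploaded row; the column names are illustrative only.
row = {"name": "estrone", "note": "pH 7.4", "logK": "3.14159", "logPerm": "-5.005"}
print(normalize_row(row, numeric_columns={"logK", "logPerm"}))
```

`Decimal` is used instead of `round()` so that ties are rounded the same way regardless of binary floating-point representation.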

Octanol/water partitioning logP/D table

Individual sites show different logP/logD values - we should therefore cite the source of logP first and later on add multiple logP values from various resources.

Solution: add octanol as a solvent and add logKm values according to source

Compute MW with RDKit from SMILES

MW is currently pulled from PubChem, right? (I am guessing this because MW can be found for named substances, but not for generic MM ones.)

RDKit should, however, be able to compute it for every entry where we have SMILES.

Dataset VISIBLE/INVISIBLE

Can't switch between the VISIBLE/INVISIBLE options in the Dataset editor as administrator, and therefore can't add new datasets even though they are uploaded successfully. (Error: Wrong dataset instance)

Improvement of RDkit service

  • Automatic canonicalization of SMILES for newly uploaded substances
  • Canonicalization of the SMILES currently in the DB + checking for duplicates
  • Automatic generation of new 3D structure files
  • Automatic generation of MW from SMILES

REST API

  • Log REST API requests - public/private requests, source IP address, requested URL, response status - log errors #80
  • Implement processing OPTIONS requests
  • Design a hierarchy of queries - for each (sub)query, implement OPTIONS request describing the main usage and alternatives
  • Implement a system for autocompleting REST API documentation.
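
The idea of a query hierarchy in which every (sub)query describes itself through OPTIONS could look roughly like this. It is a minimal sketch: the route names are invented for illustration and are not taken from the MolMeDB REST API.

```python
# Hypothetical route tree; none of these paths come from the real API.
ROUTES = {
    "description": "API root",
    "methods": ["GET", "OPTIONS"],
    "children": {
        "compounds": {
            "description": "List or search compounds",
            "methods": ["GET", "OPTIONS"],
            "children": {
                "interactions": {
                    "description": "Interactions of one compound",
                    "methods": ["GET", "OPTIONS"],
                    "children": {},
                },
            },
        },
    },
}


def describe(path):
    """Return an OPTIONS payload for `path`, or None if the route is unknown."""
    node = ROUTES
    for segment in [s for s in path.split("/") if s]:
        node = node["children"].get(segment)
        if node is None:
            return None
    return {
        "allow": ", ".join(node["methods"]),      # value for the Allow header
        "description": node["description"],       # main usage
        "subqueries": sorted(node["children"]),   # alternatives one level down
    }


print(describe("/compounds"))
```

Because the descriptions live in the same structure that defines the routes, the same tree could also feed automatically generated documentation.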

New "stats" and "browse" sections

  • Make new browse section from the stats section.

  • Make new stats section with just numeric statistics + charts

(More info in comments below)

Dataset information

While browsing individual sets, the user should see short information about the selected dataset, e.g. membrane and method used, temperature, pH ...

SQL to RDF

RDF data make it relatively easy to link multiple data sources using SPARQL.
Since many major bio/cheminformatics databases also provide RDF access to their data, it would be beneficial to add this functionality to MolMeDB as well.
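
As a rough illustration of what an SQL-to-RDF export involves, the sketch below turns rows of a table into N-Triples lines. The table layout, the `https://example.org/molmedb/` namespace and the SMILES predicate are placeholders chosen for the example, not the real schema or vocabulary.

```python
import sqlite3

BASE = "https://example.org/molmedb/"          # placeholder namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
HAS_SMILES = BASE + "vocab/smiles"             # hypothetical predicate


def literal(text):
    """Quote a string as an N-Triples literal (SMILES may contain backslashes)."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(conn):
    """Yield one N-Triples line per non-empty column of each substance row."""
    query = "SELECT identifier, name, smiles FROM substances ORDER BY identifier"
    for identifier, name, smiles in conn.execute(query):
        subject = f"<{BASE}substance/{identifier}>"
        yield f"{subject} <{RDFS_LABEL}> {literal(name)} ."
        if smiles:
            yield f"{subject} <{HAS_SMILES}> {literal(smiles)} ."


# Made-up demo row in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE substances (identifier TEXT PRIMARY KEY, name TEXT, smiles TEXT)")
conn.execute("INSERT INTO substances VALUES ('MM00001', 'ethanol', 'CCO')")
for line in rows_to_ntriples(conn):
    print(line)
```

A production export would more likely reuse established chemistry vocabularies and a dedicated RDF library; the point here is only the row-to-triple mapping.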

Download dataset issues #2

When downloading from Comparator - Compare data, the second dataset is missing from the .csv table (only null appears) if the compared dataset contains more than 200 interactions.

E/C/S or E/C/MD

For creating icons for the data transfer to PDBe-KB, Noe and I are discussing icons not only for the values
logP_erm
logK_m
but also an icon showing what kind of data we have. So far our methods are divided into E/C/MD, but I am considering E/C/S:
Experimental
Computed
Molecular Dynamics / Simulated

What do you think?

MDCG low coverage

During a check, Noe found that the MDCG substances (9791) have only 66% logPerm coverage, which is odd, because this is the MARTINI dataset, which was only about permeability.

Nonsensical CCM18/DOPC values - to be deleted

Hi,
would it be possible to efficiently delete the logK values of the CCM18/DOPC data for the following substances in MolMeDB? They are nonsensically large.

MM00913
MM00880
MM00852
MM00937
MM00581
MM00790
MM01010
MM00340
MM00577
MM00088
MM00300
MM00254
MM00554
MM00585
MM00416
MM00338
MM00504
MM00578
MM00883
MM00857
MM00674
MM00903
MM00580
MM00489
MM00457
MM00287
MM00472
MM01008
MM00503
MM00280
MM00385
MM01028
MM00751
MM01027
MM00478
MM01014
MM00474
MM00484
MM00853
MM00480
MM00310
MM00552
MM00003
MM00510
MM00483
MM00589
MM01019
MM00353
MM00931
MM00453
MM00561
MM00997
MM00894
MM00454
MM00746
MM00475
MM00588
MM00527
MM00278
MM00863
MM00019
MM00090
MM00488
MM00507
MM00644
MM00557
MM00797
MM00656
MM00587
MM00560
MM00520
MM00311
MM00519
MM00455
MM00456
MM00045
MM00002
MM00583
MM00895
MM00576
MM00584
MM00521
MM00860
MM00092
MM00357
MM00429
MM00481
MM00085
MM00774
MM00476
MM00861
MM00672
MM00960
MM00992
MM00275
MM00586
MM00095
MM00027
MM00005
MM00946
MM00591
MM00477
MM00197
MM00518
MM00006
MM00114
MM00559
MM00791
MM00289
MM00954
MM00339
MM00526
MM00007
MM01013
MM00985
MM00662
MM00430
MM00010
MM00279
MM00881
MM00590
MM00008
MM00517
MM00350
MM00367
MM00367
MM00367
MM01053
MM01031
MM00893
MM00912
MM00023
MM00028
MM00011
MM00897
MM00732
MM00957
MM00053
MM00793
MM00013
MM00479
MM00384
MM00762
MM00359
MM00831
MM00529
MM00956
MM00562
MM00004
MM00925
MM00025
MM00661
MM01021
MM00764
MM00525
MM00854
MM00434
MM00140
MM00830
MM00418
MM00285
MM00987
MM00312
MM00556
MM01011
MM00984
MM00021
MM00882
MM00660
MM00487
MM00733
MM00673
MM00731
MM00486
MM00219
MM00047
MM00986
MM00855
MM00933
MM00582
MM00858
MM00391
MM00859
MM00201
MM01283
MM00936
MM00524
MM00768
MM00844
MM00026
MM00935
MM00284
MM00794
MM00924
MM00355
MM00024
MM00225
MM00020
MM00669
MM00012
MM00415
MM00361
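
One way such a bulk clean-up could be done is a single parameterized UPDATE that clears logK for the listed identifiers. The sketch below assumes that CCM18 is the method and DOPC the membrane, and it uses a made-up `interactions` table with made-up values; the real schema is likely different.

```python
import sqlite3


def clear_logk(conn, identifiers, method="CCM18", membrane="DOPC"):
    """Set logK to NULL for the given substances; return the number of rows changed."""
    ids = sorted(set(identifiers))  # the list above contains repeated identifiers
    if not ids:
        return 0
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "UPDATE interactions SET logK = NULL "
        "WHERE method = ? AND membrane = ? AND logK IS NOT NULL "
        f"AND identifier IN ({placeholders})"
    )
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(sql, [method, membrane, *ids])
    return cursor.rowcount


# Hypothetical schema and values, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interactions (identifier TEXT, method TEXT, membrane TEXT, logK REAL)")
conn.executemany(
    "INSERT INTO interactions VALUES (?, ?, ?, ?)",
    [
        ("MM00913", "CCM18", "DOPC", 42.0),
        ("MM00913", "OTHER", "DOPC", 2.0),
        ("MM00367", "CCM18", "DOPC", 37.5),
    ],
)
print(clear_logk(conn, ["MM00913", "MM00367", "MM00367", "MM00367"]))
```

Clearing the value rather than deleting the row keeps any other quantities stored for the same interaction; whether that is the desired behaviour is a decision for the maintainers.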

Change logo :)

Use a similar font throughout the logo, and provide it in two color variants: one for a dark background (homepage) and one for a white background (contact page).

Merging of EDC and EFDC methods

Franck Diffusion cell and Diffusion cell with reference to Franck cell seem to be duplicates to me.

I suggest keeping EDC as the final method abbreviation, since there is also EFCS, which is a fluorescence measurement.

Cron service

A service that runs automatically at a given time and periodically performs the given functions.

  • Implement cron service
  • Add sending notifications from server to admins
  • Check for data validity (substances, transporters)
  • Check for duplications
  • Adding missing 3D structures
  • DB autobackup

Validator improvement

Validate datasets:
Kuba:

  • Check whether some datasets should be merged together - those with the same membrane, method and secondary reference.
  • Delete empty datasets.
  • After successful validation, open the validated molecule in a new window and keep the original window on the list of molecules to validate.
  • Add pagination to the validator records (currently only the first 50 or 100 rows are shown).
  • Add an option to mark a possible duplicate as NOT a duplicate, so that both variants are kept.
  • Also show INVISIBLE datasets (mark them).
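
The first two checks in this list can be sketched as a simple grouping step. The field names (`membrane`, `method`, `secondary_reference`, `n_interactions`) are assumptions made for the example.

```python
from collections import defaultdict


def merge_candidates(datasets):
    """Group dataset ids that share membrane, method and secondary reference."""
    groups = defaultdict(list)
    for ds in datasets:
        key = (ds["membrane"], ds["method"], ds["secondary_reference"])
        groups[key].append(ds["id"])
    # Only groups with more than one dataset are candidates for merging.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def empty_datasets(datasets):
    """Return ids of datasets that hold no interactions."""
    return [ds["id"] for ds in datasets if ds["n_interactions"] == 0]


# Made-up records for illustration.
datasets = [
    {"id": 1, "membrane": "DOPC", "method": "EDC", "secondary_reference": "ref A", "n_interactions": 10},
    {"id": 2, "membrane": "DOPC", "method": "EDC", "secondary_reference": "ref A", "n_interactions": 0},
    {"id": 3, "membrane": "POPC", "method": "EDC", "secondary_reference": "ref A", "n_interactions": 5},
]
print(merge_candidates(datasets))
print(empty_datasets(datasets))
```

The result is a list of suggestions for a curator to review, not an automatic merge.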

Karel:

  • Compounds with MMxxxxx and without interaction/transporter data (other than logP from PubChem calculated from SMILES) can be deleted immediately.
  • Reserpine vs isoreserpine (MM-something) - same canonical SMILES - upon completion the links to DBs got mixed: PubChem points to reserpine, ChEBI+ChEMBL to isoreserpine -> a manual edit brings an SQL error - Error: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '' for key 'UNIQUE_subst_drugbank'
  • Show states according to missing values - SMILES, links to PubChem et al., or dataset values - visible, invisible.
  • Allow marking that a molecule should not be validated - e.g. http://molmedb.upol.cz:12300/mol/MM01492 - we will not get SMILES for that one... :)
  • Proxicam http://molmedb.upol.cz:12300/mol/MM15151 is a typo - it should be Piroxicam http://molmedb.upol.cz:12300/mol/MM00682 - how can I merge those two? Add SMILES to Proxicam and let it be found in the next run? Or is there another way?
  • Y-39983 is marked as not validated - upon looking into the paper I found the ChEMBL identifier CHEMBL571948, but when I tried to add it - http://molmedb.upol.cz:12300/edit/Compound/15181/Y-39983 - SQL error: Error: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '' for key 'UNIQUE_subst_drugbank'
  • The report CSV should not contain "Substance:" but only the MM value. "Substance" can be used instead of "Detail" in the header.
  • Missing links to other DBs could be a new category.
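
The "Duplicate entry '' for key 'UNIQUE_subst_drugbank'" error reported twice in this list is consistent with an empty string being written to a unique column for more than one row. If that is indeed the cause (an assumption, not something confirmed here), one possible remedy is to store missing identifiers as NULL, because unique indexes in both MySQL and SQLite accept any number of NULLs. The sketch uses SQLite and a made-up table to show the difference.

```python
import sqlite3


def blank_to_none(value):
    """Map None, '' and whitespace-only strings to None (stored as SQL NULL)."""
    if value is None:
        return None
    value = value.strip()
    return value or None


# Hypothetical table; only the UNIQUE column matters for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE substances (id INTEGER PRIMARY KEY, name TEXT, drugbank TEXT UNIQUE)")


def insert(name, drugbank):
    """Insert a substance, normalizing a blank DrugBank id to NULL first."""
    with conn:
        conn.execute(
            "INSERT INTO substances (name, drugbank) VALUES (?, ?)",
            (name, blank_to_none(drugbank)),
        )


def raw_insert_collides():
    """Show that two raw empty strings violate the unique constraint."""
    try:
        with conn:  # rolled back as a whole when the second insert fails
            conn.execute("INSERT INTO substances (name, drugbank) VALUES ('a', '')")
            conn.execute("INSERT INTO substances (name, drugbank) VALUES ('b', '')")
    except sqlite3.IntegrityError:
        return True
    return False


insert("reserpine", "")
insert("Y-39983", "  ")  # a second blank is fine: both are stored as NULL
print(raw_insert_collides())
```

Existing rows that already hold an empty string would need a one-off update to NULL before such a change takes effect.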

Connect to EuropePMC API to get more data on articles

API from EuropePMC https://europepmc.org/RestfulWebService#/ allows to get more stuff about individual articles.

The fields interesting for us might be:
Authors
Title
Journal + Year + page numbers
DOI, PMID
Nr of citations

and we should also provide from our database
Nr of compounds in MolMeDB
Nr of interactions in MolMeDB
These two should link to the list of compounds or to the interactions table, which can be downloaded or added to Comparator.
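
A lookup of this kind might be sketched as below. The search endpoint and the field names (`authorString`, `journalTitle`, `pubYear`, `pageInfo`, `citedByCount`) follow my understanding of the Europe PMC "lite" result format and should be checked against the current documentation; the two MolMeDB counts are passed in by the caller, and the DOI and sample record are invented.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(doi):
    """Build a Europe PMC search URL for one DOI, asking for JSON output."""
    params = {"query": f'DOI:"{doi}"', "format": "json", "resultType": "lite"}
    return SEARCH_URL + "?" + urlencode(params)


def summarize(result, n_compounds, n_interactions):
    """Pick the fields of interest from one search result and add MolMeDB counts."""
    return {
        "authors": result.get("authorString"),
        "title": result.get("title"),
        "journal": result.get("journalTitle"),
        "year": result.get("pubYear"),
        "pages": result.get("pageInfo"),
        "doi": result.get("doi"),
        "pmid": result.get("pmid"),
        "citations": result.get("citedByCount"),
        "molmedb_compounds": n_compounds,        # from the MolMeDB database
        "molmedb_interactions": n_interactions,  # from the MolMeDB database
    }


# Invented record standing in for one entry of the response's result list.
sample = {"title": "Example article", "authorString": "Doe J, Roe R.", "pubYear": "2019", "citedByCount": 3}
print(build_search_url("10.1000/example"))
print(summarize(sample, n_compounds=12, n_interactions=40))
```

Fields missing from a record come back as `None`, so the page can simply hide them instead of failing.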

Navigator page for methods

Prepare a Navigator page for methods, similar to the one for membranes -
Major division:

  • Experimental
  • Computed
  • Simulated

Enhance literature search

The current selection by literature is quite limited; it could be enhanced with the paper metadata provided by EuropePMC.

Primary and secondary citations

Distinguish between primary and secondary citations
primary = original measurements
secondary = collections of data reported in review articles or databases

Add an option to download the database

In order to work on the data (ML, visualization, ...), it would be very advantageous to be able to download the raw data (depending on how it is stored in the backend as a csv, sqlite db, mysql db, ...).

Searching in membranes/methods

While searching in the membrane/method section, the user should get a description of the selected membrane/method as well as a list of compounds.
