post-processed Data? about m2or (CLOSED)

HeJunhong1107 commented on June 11, 2024
post-processed Data?

from m2or.

Comments (6)

Samarak99 commented on June 11, 2024

Hello,

Thank you for your comment. I would need more information about the issue in order to fix it. Which link did you try to use, and what was the exact error?

Thank you again for your interest in our database.
M2OR Team

HeJunhong1107 commented on June 11, 2024

Thank you for your response. I have identified several issues:

  1. The download link at https://m2or.chemsensim.fr/experiments appears to be non-functional.

[image: M2OR]

  2. The data in the 'M2OR_20230428.csv' file seems to overlook the chirality information of the compounds, leading to some discrepancies. For instance:

[image: M2OR_issue]

The compound pointed to by CAS-ID 80-54-6 is Lilial instead of 3-(4-tert-butylphenyl)butanal.

Samarak99 commented on June 11, 2024

Okay, well received. We will check them and get back to you!

MatejHl commented on June 11, 2024

Hello @HeJunhong1107,

I'm happy that M2OR is helping a fellow scientist!

We checked the molecule you mentioned, and the name and InChI Key are actually correct. The molecule used in the paper is lilyall (3-(4-tert-butylphenyl)butanal), which is a different compound from lilial. However, the two are commonly confused (e.g. ScenTree treats them as synonyms), so the CAS of lilial was parsed. We will update the information in the next release of the data.

Generally speaking, the InChI Key is the main identifier of molecules and it is curated. Other identifiers such as CAS are non-unique and may not even exist, so they are included only as additional information, and I highly recommend avoiding them in any bulk data processing.
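For readers doing bulk processing, here is a minimal sketch of keying records on the InChI Key rather than on CAS. It is not M2OR's own tooling. The column name `inchi_key` follows the search form's field name, while the `cas` column name, the helper names, and the sample rows (including the placeholder keys and the dummy CAS `00-00-0`) are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def is_inchikey(value):
    """Return True if the string has the standard InChIKey shape."""
    return bool(INCHIKEY_RE.match(value.strip()))


def group_by_inchikey(rows, key_field="inchi_key"):
    """Group row dicts by InChI Key; rows with a malformed key are set aside."""
    groups = defaultdict(list)
    rejected = []
    for row in rows:
        key = (row.get(key_field) or "").strip()
        if is_inchikey(key):
            groups[key].append(row)
        else:
            rejected.append(row)
    return dict(groups), rejected


def cas_conflicts(groups, cas_field="cas"):
    """Return InChI Keys whose rows carry more than one distinct CAS value."""
    conflicts = {}
    for key, rows in groups.items():
        values = sorted({r[cas_field] for r in rows if r.get(cas_field)})
        if len(values) > 1:
            conflicts[key] = values
    return conflicts


# Placeholder keys (not real compounds), built to match the InChIKey shape.
KEY_1 = "A" * 14 + "-" + "B" * 8 + "SA-N"
KEY_2 = "C" * 14 + "-" + "D" * 8 + "SA-N"

rows = [
    {"inchi_key": KEY_1, "cas": "80-54-6"},
    {"inchi_key": KEY_1, "cas": "00-00-0"},  # dummy CAS, illustration only
    {"inchi_key": KEY_2, "cas": ""},          # CAS may simply be missing
    {"inchi_key": "not-a-key", "cas": "80-54-6"},
]

groups, rejected = group_by_inchikey(rows)
conflicts = cas_conflicts(groups)
```

Grouping on the InChI Key keeps rows with a missing CAS and surfaces rows where the secondary identifier disagrees, instead of silently merging or dropping them.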

As for the download of the preprocessed data, we are working on the issue. It turned out to be more complicated to fix than it looks, so we will let you know when it is finished. In the meantime, you can access the preprocessed data here (it takes a second or two to load):

https://m2or.chemsensim.fr/advanced-search?_token=335PpzJkmTPZ7p9xULn400OWYwM4qMOCYXQCYhom&conditions%5B%5D=and&tables%5B%5D=inchi_key&values%5B%5D=&conditions%5B%5D=and&tables%5B%5D=uniprot_id&values%5B%5D=&conditions%5B%5D=and&tables%5B%5D=co_transfection&values%5B%5D=&conditions%5B%5D=and&tables%5B%5D=sequence&values%5B%5D=

Best,
Matej

HeJunhong1107 commented on June 11, 2024

Dear Matej,

Thank you once again for your kind response. Over the past few days, I have made efforts to supplement and modify the relevant data in 'M2OR_20230428.csv.' This primarily involved:

  1. Retrieving compound-specific standard information, including CAS-ID, based on the InChI Key and the PubChem REST API.
  2. Retrieving protein sequence-related information based on the UniProt ID.
  3. Standardizing gene names.
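The first two steps above could be sketched roughly as follows. This is not the script used to produce the revised file. It assumes the PubChem PUG REST synonyms endpoint and the UniProt REST FASTA endpoint, treats any synonym that passes the CAS check-digit test as a CAS candidate (PubChem does not guarantee such a synonym is authoritative), and uses hypothetical helper names. The sample payload and FASTA text are made up so the parsing can be tried offline. Gene-name standardization is not covered.

```python
import re
import urllib.parse
import urllib.request

PUBCHEM = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
UNIPROT = "https://rest.uniprot.org/uniprotkb"
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def pubchem_synonyms_url(inchikey):
    """URL listing PubChem synonyms for the compound with this InChI Key."""
    return f"{PUBCHEM}/compound/inchikey/{urllib.parse.quote(inchikey)}/synonyms/JSON"


def uniprot_fasta_url(accession):
    """URL returning the FASTA record for a UniProt accession."""
    return f"{UNIPROT}/{urllib.parse.quote(accession)}.fasta"


def is_valid_cas(text):
    """Check the CAS layout and its check digit (weighted sum modulo 10)."""
    m = CAS_RE.match(text)
    if not m:
        return False
    digits = m.group(1) + m.group(2)
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(m.group(3))


def cas_from_synonyms(payload):
    """Collect CAS-like synonyms from a parsed synonyms response (a dict)."""
    found = []
    for info in payload.get("InformationList", {}).get("Information", []):
        for syn in info.get("Synonym", []):
            if is_valid_cas(syn) and syn not in found:
                found.append(syn)
    return found


def parse_fasta(text):
    """Return (header, sequence) for the first record of a FASTA string."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    seq = []
    for ln in lines[1:]:
        if ln.startswith(">"):
            break
        seq.append(ln)
    return lines[0][1:], "".join(seq)


def fetch(url, timeout=30):
    """Download a URL as text (network access required; not called below)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Offline illustration with made-up data shaped like the two responses.
sample_payload = {
    "InformationList": {
        "Information": [{"CID": 0, "Synonym": ["Lilial", "80-54-6", "80-54-7"]}]
    }
}
sample_fasta = ">sp|P00000|EXAMPLE placeholder entry\nMKT\nAAA\n"

cas_candidates = cas_from_synonyms(sample_payload)
header, sequence = parse_fasta(sample_fasta)
```

In real use, `json.loads(fetch(pubchem_synonyms_url(key)))` would supply the payload and `fetch(uniprot_fasta_url(accession))` the FASTA text. Any CAS found this way should be treated as a hint to verify, not as a curated identifier.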

Here are the results after my personal processing, and I hope they prove helpful to your team or other researchers with future needs.

M2OR_240201_revised.xlsx

MatejHl commented on June 11, 2024

As the website and the preprocessed data are up and running, I'm closing the issue.

Thank you, HeJunhong1107, and if you have other suggestions, please don't hesitate to let us know.
