snurr-group / mofid

A system for rapid identification and analysis of metal-organic frameworks

License: GNU General Public License v2.0

mofs cheminformatics metal-organic-frameworks

mofid's Introduction

MOFid

A system for rapid identification and analysis of metal-organic frameworks.

Please cite DOI: 10.1021/acs.cgd.9b01050 if you use MOFid in your work.

Objective

Supplement the current MOF naming conventions with a canonical, machine-readable identifier to facilitate data mining and searches. Accomplish this goal by representing MOFs according to their nodes + linkers + topology.

Usage and Installation Instructions

There are three main ways in which you can use MOFid:

  1. From your browser.
  2. By compiling the MOFid source code and running it locally.
  3. By using Singularity or Docker to run a pre-built image of the MOFid code locally.

Browser-Based MOFid

Visit https://snurr-group.github.io/web-mofid to quickly and easily run MOFid in your browser! No programming skills are required.

Compiling from Source

See compiling.md for how to compile and run MOFid from source.

Containerized MOFid

See singularity.md for how to run MOFid via a Singularity container.

Background and Troubleshooting

Please read the page here for a detailed background and for important tips/tricks to help troubleshoot any problematic scenarios.

Credits

This work is supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences through the Nanoporous Materials Genome Center under award DE-FG02-17ER16362.

The MOFid command line and web tools are built on top of other open-source software projects:

mofid's People

Contributors

andrew-s-rosen, bbucior

mofid's Issues

`extract_topology`: Specify nets that are non-crystallographic

Nets such as nts are non-crystallographic and will lead to an error message in the current version of Systre:

#!!! ERROR (STRUCTURE) - - Structure has collisions between next-nearest neighbors. Systre does not currently support such structures..

The message could be improved to explain the cause of the error in terms of the MOF topology.
NU-100_SN_simplified_topology_with_two_conn_Zr_N
NU-110: SingleNode gives nts topology (confirmed using Simplified Topology with Two Conn.cif)
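
One way such a friendlier message could be produced is sketched below. This is a minimal illustration, not the actual `extract_topology` code: the function name `explain_systre_error` is hypothetical, and the only thing it relies on is the error text quoted above.

```python
from typing import Optional

# Substring taken from the Systre error quoted in this issue.
COLLISION_MARKER = "collisions between next-nearest neighbors"


def explain_systre_error(systre_output: str) -> Optional[str]:
    """Return a topology-oriented hint if the collision error is present.

    Hypothetical helper: it only scans text and does not call Systre.
    """
    for line in systre_output.splitlines():
        if "ERROR" in line and COLLISION_MARKER in line:
            return (
                "Systre reported next-nearest-neighbor collisions. The "
                "underlying net may be non-crystallographic (e.g. nts), "
                "which Systre does not currently support."
            )
    return None


sample = (
    "#!!! ERROR (STRUCTURE) - - Structure has collisions between "
    "next-nearest neighbors. Systre does not currently support such structures.."
)
print(explain_systre_error(sample))
```

Returning `None` for any other output leaves unrelated Systre errors to the existing handling.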

Write more formats

bin/sbu.exe needs to export a few more pieces of data:

  • InChIKeys of unique linkers (in the NoSBU subdirectory?). Possibly as a new Deconstructor method called GetJSON
  • Unique SBUs would be useful for analysis
  • Properly handling these files in extract_folder.sh

Update Open Babel dependency & Continuous Integration

To support future enhancements to MOFid, it would be helpful to set up a workflow for updating Open Babel and running unit tests. Currently, MOFid uses Open Babel 2.4.0, whereas the latest version is 3.0.0, one major version ahead.

It looks like GitHub can automatically build and test through Actions / Continuous Integration. Setting up the relevant workflow could be helpful for future improvements to the code.

Inconsistent output between local mofid and webmofid

I am getting different results using MOFid locally vs. the web interface. When I use the web MOFid I am getting the following result:

MOFid:
[Co].[O-]C(=O)c1ccncc1 MOFid-v1.dia.cat0;CEYPUT02
MOFkey:
Co.TWBYWOBDOCUKOW.MOFkey-v1.dia

However when I run locally I get:

{'mofid': '[Co].[O-]C(=O)c1ccncc1 MOFid-v1.ERROR.cat0;CEYPUT02',
 'mofkey': 'Co.TWBYWOBDOCUKOW.MOFkey-v1.ERROR',
 'smiles_nodes': ['[Co]'],
 'smiles_linkers': ['[O-]C(=O)c1ccncc1'],
 'smiles': '[Co].[O-]C(=O)c1ccncc1',
 'topology': 'ERROR',
 'cat': '0',
 'cifname': 'CEYPUT02'}

Any ideas why this might be happening? I didn't see any parameters I can change to get different results but I might be wrong. Can they be using different versions? I compiled mofid from the source today by following the instructions. Please let me know if I can provide more information. The cif file is below. Thanks!

data_image0
_chemical_formula_structural       CoO4N2C3HCHCHCHC3HCHCHCHCoO2NC3HCHCHCHO2NC3HCHCHCH
_chemical_formula_sum              "Co2 O8 N4 C24 H16"
_cell_length_a       6.3043
_cell_length_b       12.6051
_cell_length_c       10.4868
_cell_angle_alpha    90
_cell_angle_beta     91.365
_cell_angle_gamma    90

_space_group_name_H-M_alt    "P 1"
_space_group_IT_number       1

loop_
  _space_group_symop_operation_xyz
  'x, y, z'

loop_
  _atom_site_type_symbol
  _atom_site_label
  _atom_site_symmetry_multiplicity
  _atom_site_fract_x
  _atom_site_fract_y
  _atom_site_fract_z
  _atom_site_occupancy
  Co  Co1       1.0  0.48890  0.24615  0.83518  1.0000
  O   O1        1.0  0.92040  0.64330  0.49570  1.0000
  O   O2        1.0  0.22590  0.65280  0.40080  1.0000
  O   O3        1.0  0.76020  0.86520  0.37980  1.0000
  O   O4        1.0  0.07220  0.88580  0.47240  1.0000
  N   N1        1.0  0.34430  0.35560  0.71280  1.0000
  N   N2        1.0  0.63270  0.15620  0.69500  1.0000
  C   C1        1.0  0.10460  0.61330  0.48440  1.0000
  C   C2        1.0  0.19160  0.52320  0.56560  1.0000
  C   C3        1.0  0.38970  0.48180  0.54590  1.0000
  H   H1        1.0  0.47504  0.50969  0.48302  1.0000
  C   C4        1.0  0.46100  0.39760  0.62090  1.0000
  H   H2        1.0  0.59477  0.36946  0.60669  1.0000
  C   C5        1.0  0.15170  0.39670  0.73290  1.0000
  H   H3        1.0  0.07023  0.36800  0.79720  1.0000
  C   C6        1.0  0.06940  0.48030  0.66150  1.0000
  H   H4        1.0  0.93517  0.50706  0.67735  1.0000
  C   C7        1.0  0.88360  0.91240  0.45810  1.0000
  C   C8        1.0  0.79740  0.00190  0.53640  1.0000
  C   C9        1.0  0.91870  0.04480  0.63680  1.0000
  H   H5        1.0  0.05699  0.02171  0.65248  1.0000
  C   C10       1.0  0.83080  0.12170  0.71240  1.0000
  H   H6        1.0  0.91324  0.15081  0.77824  1.0000
  C   C11       1.0  0.51850  0.11760  0.59540  1.0000
  H   H7        1.0  0.38285  0.14426  0.57955  1.0000
  C   C12       1.0  0.59410  0.04030  0.51620  1.0000
  H   H8        1.0  0.50934  0.01401  0.44958  1.0000
  Co  Co2       1.0  0.98890  0.75385  0.33518  1.0000
  O   O5        1.0  0.42040  0.35670  0.99570  1.0000
  O   O6        1.0  0.72590  0.34720  0.90080  1.0000
  N   N3        1.0  0.84430  0.64440  0.21280  1.0000
  C   C13       1.0  0.60460  0.38670  0.98440  1.0000
  C   C14       1.0  0.69160  0.47680  0.06560  1.0000
  C   C15       1.0  0.88970  0.51820  0.04590  1.0000
  H   H9        1.0  0.97504  0.49030  0.98303  1.0000
  C   C16       1.0  0.96100  0.60240  0.12090  1.0000
  H   H10       1.0  0.09477  0.63054  0.10669  1.0000
  C   C17       1.0  0.65170  0.60330  0.23290  1.0000
  H   H11       1.0  0.57023  0.63200  0.29720  1.0000
  C   C18       1.0  0.56940  0.51970  0.16150  1.0000
  H   H12       1.0  0.43517  0.49294  0.17735  1.0000
  O   O7        1.0  0.26020  0.13480  0.87980  1.0000
  O   O8        1.0  0.57220  0.11420  0.97240  1.0000
  N   N4        1.0  0.13270  0.84380  0.19500  1.0000
  C   C19       1.0  0.38360  0.08760  0.95810  1.0000
  C   C20       1.0  0.29740  0.99810  0.03640  1.0000
  C   C21       1.0  0.41870  0.95520  0.13680  1.0000
  H   H13       1.0  0.55699  0.97829  0.15248  1.0000
  C   C22       1.0  0.33080  0.87830  0.21240  1.0000
  H   H14       1.0  0.41324  0.84919  0.27824  1.0000
  C   C23       1.0  0.01850  0.88240  0.09540  1.0000
  H   H15       1.0  0.88285  0.85574  0.07955  1.0000
  C   C24       1.0  0.09410  0.95970  0.01620  1.0000
  H   H16       1.0  0.00934  0.98599  0.94958  1.0000
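
To pinpoint which part of the identifier differs between the web and local results quoted above, the two MOFid strings can be split into fields and compared. The sketch below is not part of the mofid package: `parse_mofid` and `diff_fields` are hypothetical helpers, and the layout `<SMILES> MOFid-v1.<topology>.cat<N>;<name>` is inferred only from the examples in this issue.

```python
def parse_mofid(mofid: str) -> dict:
    """Split a MOFid string into fields (layout inferred from the examples)."""
    smiles, rest = mofid.split(" ", 1)         # SMILES contain no spaces
    meta, cifname = rest.split(";", 1)         # name follows the first ';'
    version, topology, cat = meta.split(".")   # e.g. MOFid-v1 / dia / cat0
    if cat.startswith("cat"):
        cat = cat[len("cat"):]
    return {
        "smiles": smiles,
        "version": version,
        "topology": topology,
        "cat": cat,
        "cifname": cifname,
    }


def diff_fields(a: dict, b: dict) -> dict:
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


web = parse_mofid("[Co].[O-]C(=O)c1ccncc1 MOFid-v1.dia.cat0;CEYPUT02")
local = parse_mofid("[Co].[O-]C(=O)c1ccncc1 MOFid-v1.ERROR.cat0;CEYPUT02")
print(diff_fields(web, local))  # {'topology': ('dia', 'ERROR')}
```

For the two strings in this report, only the topology field differs, so the SMILES part of the deconstruction agrees and the discrepancy lies in the topology step.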

Issue running `pip install .` for mofid in Google Colab

Hey, I got an error that I can't debug when running the pip install. Make init and path setup went without issues.

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing /content/gdrive/MyDrive/Project_MTF-C/mofid
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I was running this on colab. All required setup packages are updated as follows.
Requirement already satisfied: pip in /usr/local/lib/python3.10/dist-packages (23.1.2)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (68.0.0)
Requirement already satisfied: ez_setup in /usr/local/lib/python3.10/dist-packages (0.9)

Rare AssertionError in Python interface

The Python interface crashes for the following CIF: mof.txt. The web interface works fine. Need to fix assertion statement.

Traceback:

==============================
*** Open Babel Error  in WriteSystre
  Found two neighboring 2-c sites.  Flagging the cgd output to get an error instead of the incorrect topology.
Note: Rerun ExportSystre() with simplify_two_conn=false if an unsimplified net is useful.
==============================
*** Open Babel Error  in WriteSystre
  Found two neighboring 2-c sites.  Flagging the cgd output to get an error instead of the incorrect topology.
Note: Rerun ExportSystre() with simplify_two_conn=false if an unsimplified net is useful.
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    mofid = cif2mofid(cif_path)
  File "C:\Users\asros\Anaconda3\lib\site-packages\mofid\run_mofid.py", line 24, in cif2mofid
    'SingleNode','topology.cgd'))
  File "C:\Users\asros\Anaconda3\lib\site-packages\mofid\id_constructor.py", line 134, in extract_topology
    assert line[-2] == str(current_component)
AssertionError

Rename NoSBU algorithm

I'm planning to give this algorithm a proper name in the MOFid paper, possibly the "fragment" algorithm or "inorganic". (Something short and descriptive like the "all node," "single node," or "standard" algorithms I reference in the paper). Let's modify the code after the group comes to a consensus on the notation.

Complete MOFid with extraneous error flag

For this MOF, the generated MOFid is [Co][O]([Co])([Co])[Co].[O-]C(=O)N=NC(=O)[O-].[O-]C(=O)[N][N]C(=O)[O-] MOFid-v1.bcu,ERROR.cat0;199. All the components of the MOFid are present and seemingly valid, but with a ",ERROR" that should not be there.
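
A quick way to spot identifiers like this in bulk is to split the topology field on commas and separate net names from status flags. This is a minimal sketch under assumptions: `split_topology` is a hypothetical helper, and the flag set contains only the values that appear in these issues (ERROR, UNKNOWN, NA).

```python
# Status values seen in the issues on this page; assumed not to be net names.
FLAGS = {"ERROR", "UNKNOWN", "NA"}


def split_topology(field: str) -> dict:
    """Separate net names from status flags in a comma-separated topology field."""
    entries = [entry for entry in field.split(",") if entry]
    return {
        "nets": [entry for entry in entries if entry not in FLAGS],
        "flags": [entry for entry in entries if entry in FLAGS],
    }


print(split_topology("bcu,ERROR"))  # {'nets': ['bcu'], 'flags': ['ERROR']}
```

An entry with a non-empty `nets` list and a non-empty `flags` list is the mixed case described here.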

'Unknown' and 'NA' topologies on CoRE MOF dataset

When running the CoRE MOF dataset through mofid, I get an issue with about a third of them being labeled with 'UNKNOWN' or 'NA' topology. Specifically:

  • 27.9% are of 'UNKNOWN' topology
  • 6.9% are of 'NA' topology

I have tried to do some digging to resolve the issue, but I have yet to find a lead. For reference, here is a JSON file of the Python dictionaries generated for each MOF in the dataset. Any ideas what could be causing this?
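
For anyone reproducing these numbers, the tally can be computed as sketched below. It assumes the records are a list of dictionaries with a 'topology' key, as in the Python interface output shown in another issue; the sample records are invented for illustration and are not taken from the CoRE MOF dataset.

```python
from collections import Counter


def topology_percentages(records):
    """Return {topology: percentage of records}, most common first."""
    counts = Counter(record.get("topology", "MISSING") for record in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.most_common()}


# Invented sample records, for illustration only.
records = [
    {"cifname": "a", "topology": "dia"},
    {"cifname": "b", "topology": "dia"},
    {"cifname": "c", "topology": "UNKNOWN"},
    {"cifname": "d", "topology": "NA"},
]
for name, pct in topology_percentages(records).items():
    print(f"{name}: {pct:.1f}%")
```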

Clarification on Handling Allnode Topology in Web vs. Local Run in mofid

Hello, I'm currently exploring the use of mofid and I have a question about its behavior on different platforms (web and local runs). I observed that when running locally, mofid processes the allnode topology and generates identifiers in the form sn,an. However, this doesn't seem to be the case in the web version.
When comparing whether two MOFs are identical, should the an (allnode) part be included to ensure an accurate comparison?
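
Whichever answer applies, the choice can be made explicit in the comparison code. The sketch below is not an official recommendation: `identity_key` and `same_mof` are hypothetical helpers, the records follow the dictionary layout of the Python interface output, and the first comma-separated topology entry is assumed to be the single-node result (following the sn,an order described above).

```python
def identity_key(record: dict, include_allnode: bool = False) -> tuple:
    """Build a comparison key from the SMILES and the topology field."""
    nets = record["topology"].split(",")
    # Assumption: the first entry is the single-node (sn) topology.
    topology = ",".join(nets) if include_allnode else nets[0]
    return (record["smiles"], topology)


def same_mof(a: dict, b: dict, include_allnode: bool = False) -> bool:
    """Compare two records with or without the all-node part."""
    return identity_key(a, include_allnode) == identity_key(b, include_allnode)


# Invented records: a web-style result (sn only) and a local-style result (sn,an).
web = {"smiles": "[Co].[O-]C(=O)c1ccncc1", "topology": "dia"}
local = {"smiles": "[Co].[O-]C(=O)c1ccncc1", "topology": "dia,dia"}

print(same_mof(web, local))                        # True
print(same_mof(web, local, include_allnode=True))  # False
```

Comparing on the single-node entry alone lets web and local results be matched against each other; including the all-node entry only makes sense when both records were generated the same way.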

pip install can be slow

The installation step pip install . can run slowly depending on the project environment. The root cause is in pip (pypa/pip#2195). Any source files under the parent directory (including the .git folder) are copied as part of the installation process. Therefore, if subdirectories like Data, Notebooks, Output, etc., are large, then pip will have to copy many files.

A workaround recommended in the GitHub issue is to use the -e (--editable) flag. The resulting symlinks work on my laptop but not with the .travis.yml file for some reason. See the mofid pip branch for some failed attempts at resolving this issue.
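
To see which directories are likely to make the copy slow, their sizes can be measured before installing. This is a minimal diagnostic sketch using only the standard library; `top_level_sizes` is a hypothetical helper and does not interact with pip.

```python
import os


def top_level_sizes(root: str = ".") -> dict:
    """Return {entry name: size in bytes} for each top-level entry, largest first."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    try:
                        if not os.path.islink(path):
                            total += os.path.getsize(path)
                    except OSError:
                        pass  # skip files that vanish or cannot be read
            sizes[entry.name] = total
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat().st_size
    return dict(sorted(sizes.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    for name, size in top_level_sizes(".").items():
        print(f"{size / 1e6:10.2f} MB  {name}")
```

Run from the repository root, the largest entries (for example .git, Data, or Output) indicate what pip would spend most of its time copying.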

Update Open Babel

Generating 2D line drawings in the MOFid web interface is currently buggy due to a regression in the upstream Open Babel project. openbabel/openbabel#1902 patched this bug and needs to be incorporated into the MOFid version of Open Babel. It would also be nice to re-sync the MOFid version with the upstream project once the PR for periodic boundary conditions is merged.

moffles --> mofid

We should rename function calls and files that have "moffles" in the name to use "mofid" where appropriate, for clarity. This issue will also cover changing names_to_tables.py to something more intuitive.

Cleaning up folders

The scripts folder is a bit messy, and I'm not really sure what's going on to be honest. For instance, what is submit.job? It's a PBS script but seems kind of out of place. Also, import_emscripten.sh is hard-coded to a bunch of your paths, and so on. I've taken care of the files in the HPC folder. I now have separate directories for Moab and Slurm schedulers. In case you're curious, there is no need for the analogue to the Moab -v directive in Slurm. By default, Slurm passes all environment variables to the running job, so you just have to export each variable before job submission.

You also may want to clean up the Resources folder (i.e. adding some stuff to your .gitignore). For instance, I don't think we need a .docx and .pdf about Materials Studio or a zip file of kekule. But you make the judgment call for what to remove from the repo.

Clean up stderr

Now that Open Babel writes to InChI and InChIKey for the MOFkey and linker stats, stderr is getting flooded by warnings from the inchi format. The quantity of warnings is making it hard to find other warnings and errors from elsewhere in the MOFid code and greatly inflates the log file.

We should consider writing them to a separate file and optionally print out a single summary warning with a more interpretable message.
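
The idea could be prototyped as a small filter over captured stderr lines, as sketched below. It is an illustration only: `split_warnings` is a hypothetical helper, matching on the substring "inchi" is an assumption about how the warnings can be recognized, and the sample lines are invented placeholders rather than real Open Babel output.

```python
def split_warnings(lines, marker="inchi"):
    """Divert lines containing `marker` and append one summary line to the rest."""
    kept, diverted = [], []
    for line in lines:
        if marker in line.lower():
            diverted.append(line)
        else:
            kept.append(line)
    if diverted:
        kept.append(
            f"Note: {len(diverted)} {marker}-related message(s) "
            "were written to a separate log."
        )
    return kept, diverted


# Invented placeholder lines, not real Open Babel output.
stderr_lines = [
    "warning from the inchi format: placeholder message 1",
    "error elsewhere in the code: placeholder message",
    "warning from the InChI format: placeholder message 2",
]
kept, diverted = split_warnings(stderr_lines)
print("\n".join(kept))
```

The `diverted` list can then be written to its own file, leaving the main log short enough to scan for other warnings and errors.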

Add a verbosity flag to Python interface

Currently, there's a lot of Open Babel output that clogs up the stdout. It would be smart to add a verbosity flag to allow the user to suppress these messages.
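
One possible shape for such a flag is sketched below. It assumes the noisy messages come from a child process and arrive on its stderr, which may not match how the Python interface actually invokes the MOFid binaries; `run_tool` and its `verbose` parameter are hypothetical names.

```python
import subprocess
import sys


def run_tool(cmd, verbose=False):
    """Run a command and return its stdout; hide its stderr unless verbose."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        # None lets the child's stderr pass through to the terminal.
        stderr=None if verbose else subprocess.DEVNULL,
        text=True,
        check=True,
    )
    return result.stdout


# Stand-in for a chatty tool: prints a result to stdout and noise to stderr.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('result'); print('noise', file=sys.stderr)",
]
print(run_tool(demo_cmd).strip())  # result
```

Defaulting `verbose` to False keeps the output quiet for most users while still allowing the full diagnostics to be shown when debugging.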

Inconsistent aromaticity of N-heterocycles

Certain heterocyclic compounds can yield unexpected SMILES in the upstream Open Babel library. For example, the code does not correctly assign bond orders to L_13 in the ToBaCCo MOFs. This effect occurs inconsistently, sometimes leading to multiple copies of the same linker with slightly different SMILES due to different bond orders or the introduction of radical notation.

Even if the bug is fixed upstream, I suspect the presence of charged organic molecules may exacerbate the issue. Currently, framework.cpp assigns formal charges to the carboxylate and certain rings after the bond orders are detected. Aromaticity detection may be improved if the overall partial charge is assigned before running OBMol::PerceiveBondOrders, for example by using the number/location of coordinated metals.

These are some potentially relevant issues on the upstream project for reference:

Document the round-tripping errors

Many of the "errors" reported in the .json output from check_mof_linkers.py are non-intuitive, e.g. unk_pillar2 is referring to the ambiguity in pyridine vs. carboxylate termination for hMOF linkers (and is not actually an error). Some of the other validator codes in smiles_diff.py:DIFF_LEVELS also need to be documented more clearly. Going through these errors will make it easier to discuss the validation results.

Remove set_paths.py requirement

It's a completely minor, trivial (mostly cosmetic) issue, but I'd like to remove the requirement to call python set_paths.py prior to install.
