This project forked from ncats/molvec

A feeble attempt at molecular recognition (in the literal sense)

Home Page: https://molvec.ncats.io

License: GNU Lesser General Public License v2.1

MolVec

MolVec is the NCATS (chemical) OCR engine. It vectorizes chemical images into chemical objects, preserving the 2D layout as much as possible. The code is still very raw in terms of utility. Check out https://molvec.ncats.io for a demonstration of MolVec.

Molvec is on Maven Central

The easiest way to start using Molvec is to include it as a dependency in your build tool of choice. For Maven:

<dependency>
  <groupId>gov.nih.ncats</groupId>
  <artifactId>molvec</artifactId>
  <version>0.9.8</version>
</dependency>

Example Usage: convert image into mol file format

    File image = ...
    String mol = Molvec.ocr(image);

Async Support

MolVec supports asynchronous calls:

    CompletableFuture<String> future = Molvec.ocrAsync(image);
    String mol = future.get(5, TimeUnit.SECONDS);
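If recognition does not finish in time, `future.get(5, TimeUnit.SECONDS)` throws a `TimeoutException`. The following is a minimal sketch of one way to handle that case using only the standard `CompletableFuture` API. The class `AsyncOcrSketch`, the method `getOrFallback` and the fallback value are hypothetical names for illustration; in real use the future would come from `Molvec.ocrAsync(image)`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AsyncOcrSketch {

    /**
     * Waits up to timeoutMillis for the future to complete.
     * Returns the fallback value on timeout, failure or interruption.
     */
    public static String getOrFallback(CompletableFuture<String> future,
                                       long timeoutMillis,
                                       String fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled so later stages do not run.
            // CompletableFuture.cancel does not interrupt the running task.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // The asynchronous computation itself threw an exception.
            return fallback;
        } catch (InterruptedException e) {
            // Restore the interrupt flag for callers further up the stack.
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for Molvec.ocrAsync(image): a future that is already complete.
        CompletableFuture<String> done = CompletableFuture.completedFuture("mol text");
        System.out.println(getOrFallback(done, 5000, ""));
    }
}
```

Returning a fallback keeps the caller simple; a batch job might prefer to log the failed image and continue instead.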

Support for producing molfile and SDfiles

Since version 0.9.8, Molvec has a more robust API that allows for creating both molfiles and SDfiles and adding properties to the SDfiles.

    MolvecOptions options = new MolvecOptions()
                                        .setName(name)
                                        .center(true)
                                        .averageBondLength(2);

    MolvecResult result = Molvec.ocr(f, options);

    // write out an SDfile without any properties
    String sdfile = result.getSDfile().get();

    // create a map of key-value pairs and include them as properties in an SDfile
    Map<String, String> props = new HashMap<>();
    props.put("File Name", f.getName());

    String sdfileWithProperties = result.getSDfile(props).get();
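To show what adding properties means at the file-format level, here is a minimal sketch of how an SD record is laid out: a molfile block, followed by one data item per property, followed by the `$$$$` terminator. This illustrates the SDfile format only and is not Molvec's implementation; the class `SdRecordSketch` and the method `toSdRecord` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SdRecordSketch {

    /** Appends SD data items and the record terminator to a molfile block. */
    public static String toSdRecord(String molBlock, Map<String, String> props) {
        StringBuilder sb = new StringBuilder(molBlock);
        if (!molBlock.endsWith("\n")) {
            sb.append('\n');
        }
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("> <").append(e.getKey()).append(">\n"); // data header line
            sb.append(e.getValue()).append('\n');              // data value line
            sb.append('\n');                                   // blank line ends the item
        }
        sb.append("$$$$\n"); // record terminator
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("File Name", "image.png");
        // "M  END" stands in for a complete molfile block here.
        System.out.print(toSdRecord("M  END", props));
    }
}
```

Several records written one after another form a multi-structure SDfile, which is the layout the `-outSdf` option describes.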

Commandline interface

The Molvec jar has a runnable Main class with the following options

usage: molvec ([-gui],[[(-f <path> [-o <path>]) | (-dir <path> [[-outDir <path> | -outSdf <path>]],[-parallel <count>])]],[-scale
              <value>],[-h])
Image to Chemical Structure Extractor Analyzes the given image and tries to find the chemical structure drawn and
convert it into a Mol format.

options:
     -dir <path>         path to a directory of image files to process. Supported formats include png, jpeg, tiff and
                         svg. Each image file found will be attempted to be processed. If -out or -outDir is not
                         specified then each processed mol will be put in the same directory and named
                         $filename.mol. This option or -f is required if not using -gui

     -f,--file <path>    path of image file to process. Supported formats include png, jpeg, tiff.  This option or -dir
                         is required if not using -gui

     -gui                Run Molvec in GUI mode. file and scale option may be set to preload file

     -h,--help           print helptext

     -o,--out <path>     path of output processed mol. Only valid when not using gui mode. If not specified output is
                         sent to STDOUT

     -outDir <path>      path to output directory to put processed mol files. If this path does not exist it will be
                         created

     -outSdf <path>      Write output to a single sdf formatted file instead of individual mol files

     -parallel <count>   Number of images to process simultaneously, if not specified defaults to 1

     -scale <value>      scale of image to show in viewer (only valid if gui mode AND file are specified)

Examples:

      $molvec -f /path/to/image.file

   parse the given image file and print out the structure mol to STDOUT

      $molvec -dir /path/to/directory

   serially parse all the image files inside the given directory and write out a new mol file for each image named
   $image.file.mol. The new files will be put in the input directory

      $molvec -dir /path/to/directory -outDir /path/to/outputDir

   serially parse all the image files inside the given directory and write out a new mol file for each image named
   $image.file.mol. The new files will be put in the directory specified by outDir

      $molvec -dir /path/to/directory -outSdf /path/to/output.sdf

   serially parse all the image files inside the given directory and write out a new sdf file to the given path that
   contains all the structures from the input image directory

      $molvec -dir /path/to/directory -outSdf /path/to/output.sdf -parallel 4

   parse in 4 concurrent threads all the image files inside the given directory and write out a new sdf file to
   the given path that contains all the structures from the input image directory

      $molvec -dir /path/to/directory -parallel 4

   parse in 4 concurrent threads all the image files inside the given directory and write out a new mol file for
   each image named $image.file.mol. The new files will be put in the input directory

      $molvec -gui

   open the Molvec Graphical User interface without any image preloaded

      $molvec -gui -f /path/to/image.file

   open the Molvec Graphical User interface with the given image file preloaded

      $molvec -gui -f /path/to/image.file -scale 2.0

   open the Molvec Graphical User interface with the given image file preloaded zoomed in/out to the given scale
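The `-dir`, `-outDir` and `-parallel` behaviour described above can be approximated from Java code. The following is a rough sketch under stated assumptions, not the code behind the command-line tool: `BatchSketch`, `molPathFor` and `processDirectory` are hypothetical names, and the `recognizer` function is a stand-in for a call such as `Molvec.ocr(path.toFile())`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BatchSketch {

    /** Output path for an image: the full file name plus ".mol", placed in outDir. */
    public static Path molPathFor(Path image, Path outDir) {
        return outDir.resolve(image.getFileName().toString() + ".mol");
    }

    /** Processes every regular file in inputDir using a pool of `parallel` threads. */
    public static List<Path> processDirectory(Path inputDir, Path outDir, int parallel,
                                              Function<Path, String> recognizer)
            throws IOException, InterruptedException, ExecutionException {
        Files.createDirectories(outDir); // create the output directory if missing

        // List the inputs up front so files written later are not picked up.
        List<Path> images;
        try (Stream<Path> files = Files.list(inputDir)) {
            images = files.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        ExecutorService pool = Executors.newFixedThreadPool(parallel);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path image : images) {
                pending.add(pool.submit(() -> {
                    Path out = molPathFor(image, outDir);
                    String mol = recognizer.apply(image);
                    Files.write(out, mol.getBytes(StandardCharsets.UTF_8));
                    return out;
                }));
            }
            List<Path> written = new ArrayList<>();
            for (Future<Path> f : pending) {
                written.add(f.get()); // rethrows any failure from the worker
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path in = Files.createTempDirectory("images");
        Path out = Files.createTempDirectory("mols");
        Files.write(in.resolve("a.png"), new byte[0]);
        // Placeholder recognizer; real use would call Molvec here.
        System.out.println(processDirectory(in, out, 2, p -> "placeholder for " + p.getFileName()));
    }
}
```

Passing the input directory as `outDir` mimics the default of writing each mol file next to its image. This sketch does not filter by image format, so a second run over the same directory would also pick up the previously written mol files.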

Developed by NIH/NCATS

GUI

Molvec comes with a Swing viewer that lets you step through each stage of the structure recognition process.

Primitives

Publications that Mention Molvec

How to Report Issues

You can report issues or request features either by creating issue tickets on our GitHub page or by sending questions and problems to [email protected]
