License: Apache License 2.0

Java 100.00%

Compression of 3D coordinates of macromolecular structures

This repository contains the compression methods described in the paper "Towards an efficient compression of 3D coordinates of macromolecular structures", PLOS ONE 12(3): e0174846, doi: 10.1371/journal.pone.0174846. These methods provide the foundation for a novel standard to represent macromolecular coordinates in the MacroMolecular Transmission Format (MMTF) for 3D structures (http://mmtf.rcsb.org). This format allows a compact representation and interactive visualization of the largest macromolecular complexes currently in the Protein Data Bank.

Install from git repository

You can get the latest source code using git. Then you can execute the install goal with Maven to build the project.

$ git clone https://github.com/rcsb/coordinates-compression.git
$ cd coordinates-compression
$ mvn install

The install goal will compile, test, and package the project’s code and then copy it into the local dependency repository which Maven maintains on your local machine.

If you are using Maven for the first time, these links can be useful:
Where to download Maven
How to install Maven

How to run the analysis

The Maven exec plugin lets you run the main method of a Java class in the project, with the project dependencies automatically included in the classpath.

The proposed approaches are implemented as two types of strategies: (i) intramolecular compression that operates on the sequence of atoms within a polymer chain; and (ii) intermolecular compression designed for the compression of special cases of multiple chains with identical atoms, such as NMR models and structures with repeated identical subunits.
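The two strategies can be illustrated with a minimal sketch. This is not the repository's implementation: the class and method names are hypothetical, the coordinates are assumed to be stored as fixed-point integers, and plain differencing stands in for the methods described in the paper. It only shows why both strategies help: consecutive atoms within a chain are close together, and chains with identical atoms differ only slightly, so the stored values become small.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and encoding choices are assumptions,
// not taken from the coordinates-compression source code.
class DeltaSketch {

    // Intramolecular idea: keep the first value, then store the difference
    // between each atom and the previous atom in the same chain.
    static int[] deltaEncode(int[] values) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (i == 0) ? values[i] : values[i] - values[i - 1];
        }
        return out;
    }

    // Inverse of deltaEncode: a running sum restores the original values.
    static int[] deltaDecode(int[] deltas) {
        int[] out = new int[deltas.length];
        for (int i = 0; i < deltas.length; i++) {
            out[i] = (i == 0) ? deltas[i] : out[i - 1] + deltas[i];
        }
        return out;
    }

    // Intermolecular idea: for chains with identical atoms (for example NMR
    // models), store one reference chain and per-atom offsets for the others.
    static int[] diffFromReference(int[] reference, int[] chain) {
        if (reference.length != chain.length) {
            throw new IllegalArgumentException("chains must have the same number of atoms");
        }
        int[] out = new int[chain.length];
        for (int i = 0; i < chain.length; i++) {
            out[i] = chain[i] - reference[i];
        }
        return out;
    }

    // Inverse of diffFromReference.
    static int[] restoreFromReference(int[] reference, int[] diffs) {
        int[] out = new int[diffs.length];
        for (int i = 0; i < diffs.length; i++) {
            out[i] = reference[i] + diffs[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // x coordinates of four atoms, assumed to be in units of 0.001 angstrom
        int[] x = {12345, 12498, 12610, 12577};
        int[] deltas = deltaEncode(x);
        System.out.println(Arrays.toString(deltas)); // [12345, 153, 112, -33]
        System.out.println(Arrays.equals(x, deltaDecode(deltas))); // true
    }
}
```

The small residual values produced by either step are what a subsequent integer or entropy coder would benefit from; which coder to use is outside the scope of this sketch.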

Run intramolecular compression analysis

mvn exec:java -Dexec.mainClass="org.rcsb.mmtf.analysis.RunTotalAnalysis" -Dexec.args="arg1 arg2"

     arg1: path to a Hadoop sequence file with the PDB structures in MMTF format

     arg2: path to the existing folder to write the results

Run intermolecular compression analysis

mvn exec:java -Dexec.mainClass="org.rcsb.mmtf.analysis.RunEnsamblesAnalysis" -Dexec.args="arg1 arg2"

     arg1: path to a Hadoop sequence file with the PDB structures in MMTF format

     arg2: path to the existing folder to write the results
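Both analysis commands take the same two arguments, and the results folder must already exist. A small pre-flight check along the following lines could catch a wrong path before a long run starts. The class AnalysisArgsCheck and its messages are hypothetical and not part of the repository; the sketch uses only the standard java.nio.file API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the repository: checks the two arguments
// before they are passed to one of the analysis classes.
class AnalysisArgsCheck {

    // Returns null when both paths look usable, otherwise a short message.
    static String check(String sequenceFile, String resultsFolder) {
        Path input = Paths.get(sequenceFile);
        Path output = Paths.get(resultsFolder);
        // Only existence is checked for arg1, because an unpacked Hadoop
        // sequence file may be a folder rather than a single regular file.
        if (!Files.exists(input)) {
            return "arg1 does not exist: " + input;
        }
        if (!Files.isDirectory(output)) {
            return "arg2 is not an existing folder: " + output;
        }
        return null;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: AnalysisArgsCheck <sequence file> <results folder>");
            return;
        }
        String problem = check(args[0], args[1]);
        System.out.println(problem == null ? "arguments look fine" : problem);
    }
}
```

Creating the results folder up front (for example with mkdir) and then running this check is one way to avoid a failure at the point where results are written.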

Note: You may need to increase the memory allocation pool of the Java Virtual Machine. Use the -Xms option to set the initial Java heap size to 8G when running the analysis (the related -Xmx option sets the maximum heap size).

export MAVEN_OPTS="-Xms8G"
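To confirm that the heap setting reached the JVM, a short check like the one below can help. It is a sketch with a hypothetical class name, HeapCheck, using only the standard Runtime API. Because exec:java runs the class inside the Maven JVM, the values it prints reflect MAVEN_OPTS when the class is launched that way.

```java
// Hypothetical helper, not part of the repository: prints the heap sizes
// seen by the running JVM.
class HeapCheck {

    // Converts a byte count to GiB.
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory: heap currently reserved; maxMemory: upper limit the JVM will try to use
        System.out.printf("current heap: %.2f GiB%n", toGiB(rt.totalMemory()));
        System.out.printf("maximum heap: %.2f GiB%n", toGiB(rt.maxMemory()));
    }
}
```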

How to get 3D structures in MMTF format

You can download a Hadoop sequence file with the PDB structures in MMTF format.

$ wget http://mmtf.rcsb.org/v1.0/hadoopfiles/full.tar
$ tar -xvf full.tar

This downloads the Hadoop sequence file and unpacks its content into a folder named full.

Contributors

josemduarte, pwrose, valasatava
