RCSB mmCIF Dictionary Suite

Introduction

This repository/package contains the source for the mmCIF dictionary suite. Included are a number of tools for validation of dictionaries and data.

Installation

Download the source code

  • As a distribution with a filename like mmcif-dict-suite-vX.XXX-prod-src.tar.gz
    gzip -d -c mmcif-dict-suite-vX.XXX-prod-src.tar.gz | tar xf -
  • From GitHub
    git clone --recurse-submodules https://github.com/rcsb/cpp-dict-pack.git

Building

To build the dictionary suite, you need the following in your PATH or development environment:

  • CMake
  • python (version 2/3)
  • C++ compiler
  • bash
  • csh
  • flex
  • bison
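
Before configuring, it can help to confirm that the tools listed above are reachable from the shell. The following is a minimal sketch, not part of the suite: missing_tools is a hypothetical helper, and the command names in the example call (python3 and c++ in particular) are assumptions that may differ on your system.

```shell
# Hypothetical helper (not part of the suite): print each command name
# that cannot be found in PATH, one per line. Returns non-zero if any
# command is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example call; the exact command names are assumptions:
#   missing_tools cmake python3 c++ bash csh flex bison
```

An empty result means every named command was found; otherwise install the listed tools before running cmake.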

Typically, one creates a build tree, uses cmake to configure it, and then builds.

mkdir build
cd build
cmake .. -D<options>
make

This builds the tools and compiles the dictionaries.

Configuration options

Because the build uses CMake, the following options can be set on the cmake command line:

  • MINIMAL_DICTS: Download and build only a subset of the dictionaries. (Use -DMINIMAL_DICTS=ON)

  • ALWAYS_DOWNLOAD_DICTS: A pre-packaged tar distribution provides only a subset of the dictionaries. Use this option to force download of the dictionaries from GitHub. (Use -DALWAYS_DOWNLOAD_DICTS=ON)

  • PYTHON_EXECUTABLE: Provides the path to a Python executable when CMake fails to find one. (Use -DPYTHON_EXECUTABLE:FILEPATH=/usr/bin/python3)
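
Several options can be passed in a single cmake invocation. As a sketch, the hypothetical function below (configure_minimal is not part of the suite) configures a reduced dictionary set with an explicit Python interpreter. It assumes it is run from inside the build directory, and the default interpreter path is only an example.

```shell
# Hypothetical wrapper (not part of the suite): configure the build tree
# with a reduced dictionary set and an explicit Python interpreter.
# Run from inside the build directory.
# $1: path to the Python executable (default is an example path only).
configure_minimal() {
    python_path="${1:-/usr/bin/python3}"
    cmake .. -DMINIMAL_DICTS=ON "-DPYTHON_EXECUTABLE:FILEPATH=${python_path}"
}
```

After configure_minimal returns successfully, run make as usual.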

Using the dictionary suite

Retrieving the latest dictionary

You can retrieve the latest version of the mmcif-pdbx dictionary and convert it to an SDB file by running the following from within the build directory:

mkdir PDBx
cd PDBx
wget http://mmcif.pdb.org/dictionaries/ascii/mmcif_pdbx_v50.dic
../bin/DictToSdb -ddlFile ../dicts/dict-mmcif_ddl/mmcif_ddl.dic \
      -dictFile mmcif_pdbx_v50.dic -dictSdbFile mmcif_pdbx_v50.sdb

If errors are found, parsing errors are written to mmcif_pdbx_v50.dic-parser.log, and errors from validation against the DDL are written to mmcif_pdbx_v50.dic-diag.log.

Validating an mmCIF/PDBx file against the PDBx dictionary

This example validates the PDBx file for entry 3q45 against the PDBx dictionary. Run it from the build directory, since the paths to CifCheck and the .sdb file are relative to it.

wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/mmCIF/q4/3q45.cif.gz
gunzip 3q45.cif.gz
./bin/CifCheck -f 3q45.cif -dictSdb PDBx/mmcif_pdbx_v50.sdb

If errors are found, parsing errors are written to 3q45.cif-parser.log, and errors from validation against the dictionary are written to 3q45.cif-diag.log.
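
Both DictToSdb and CifCheck follow the same log naming, so a small helper can report which logs need attention. This is a sketch under stated assumptions: report_logs is a hypothetical function, it relies on the <input>-parser.log and <input>-diag.log names described above, and it assumes that a missing or empty log means no errors of that kind were reported.

```shell
# Hypothetical helper (not part of the suite): print the non-empty log
# files written next to an input file. Returns non-zero if any is found.
# Assumes the <input>-parser.log / <input>-diag.log naming, and that a
# missing or empty log means no errors of that kind were reported.
report_logs() {
    input="$1"
    found=0
    for suffix in parser diag; do
        log="${input}-${suffix}.log"
        if [ -s "$log" ]; then
            echo "$log"
            found=1
        fi
    done
    return "$found"
}

# Example calls:
#   report_logs 3q45.cif
#   report_logs mmcif_pdbx_v50.dic
```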

Notes for developers

Some compromises were made in building the suite and keeping the dictionaries consistent:

  • Normally, when you run cmake, dictionaries are downloaded from GitHub unless they are already present in the dicts subdirectory. If you remove a dictionary and run cmake again, it is retrieved again as needed.

  • If you are doing dictionary development, it is sufficient to keep a symlink to your development dictionary in the dicts directory. Dependencies will be updated properly.

  • To create a distribution, use make dist.

  • Passing everything to make (make everything) builds the sdb/odb/xml files. It is protected so that the -j option to make works without collisions.

cpp-dict-pack's Issues

CifCheck does not recognise "stop_"

According to the STAR format definition, "stop_" is a reserved word and cannot be used as an unquoted data value. If I run CifCheck on the following snippet of mmCIF:

_entry.id C103531_G3_I1
_struct.entry_id C103531_G3_I1
_struct.pdbx_model_details stop_in_the_name_of_love
_struct.pdbx_structure_determination_methodology computational
_struct.title 'Model'

CifCheck does not complain about _struct.pdbx_model_details stop_in_the_name_of_love but it should. When reading the above mmCIF input using the RCSB Python mmcif package, I do get

mmcif.io.PdbxExceptions.PdbxSyntaxError: [Line: 4] Unexpected reserved word: stop

So I guess CifCheck should also detect the reserved words in Star format.

Thanks,

Bienchen
