
Geometry unit conversion factor (qcschema, open issue)

molssi commented on June 19, 2024
Geometry unit conversion factor

from qcschema.

Comments (10)

tovrstra commented on June 19, 2024

@dgasmith @wadejong @matt-chan @cryos Default units and conversion factors cannot work, for reasons partly explained in earlier comments. I'll try to summarize the problem:

Different programs work in different units internally, and they usually already have conversion factors to transform results to other units before printing. These aspects of existing software will not change. If you settle on standard units and conversion factors, one of the following two things is going to happen, and neither is great:

  1. Such programs use the conversion factors of the JSON spec to write results in an agreed unit, which may be inconsistent with the usual output of that program.
  2. Such programs may ignore the standard conversion factors and become inconsistent with the spec.

A cleaner solution would be to let every program write results to a JSON file in its internal units, and to let it specify what these units mean. The receiver of the JSON data is then free to handle the units in whichever way they like. If conversion is needed, the most reasonable choice would be to take the unit definitions from the NIST website (which are refined occasionally as more literature becomes available). The disadvantage is that the spec becomes more complicated.

P.S. Most QC programs work in atomic units, which may not cause too much trouble. As soon as you want to exchange data with MM programs, all sorts of units are being used.
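A minimal sketch of the self-describing approach proposed above, assuming a hypothetical `unit_in_si` field next to the `val` and `units` keys; the field name and the choice of SI as the reference are illustrative and not part of any spec:

```python
import json

# Hypothetical record: the writing program states its own definition of the
# unit it used, as a factor to SI (here: metres per angstrom).
record = json.loads("""
{
  "geometry": {
    "val": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    "units": "angstrom",
    "unit_in_si": 1e-10
  }
}
""")


def to_si(quantity):
    """Scale a (possibly nested) value by the factor the writer supplied."""
    factor = quantity["unit_in_si"]

    def scale(x):
        if isinstance(x, list):
            return [scale(item) for item in x]
        return x * factor

    return scale(quantity["val"])


# The receiver never needs its own angstrom definition.
geometry_m = to_si(record["geometry"])
```

The receiver only applies the factor shipped with the data, so the writer's internal conventions are preserved regardless of which table of constants the receiver uses.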


chrisjsewell commented on June 19, 2024

Hey guys, big fan of your aspirations here. I wish people would put as much thought into their output formats as they do into the rest of the program! I actually wrote the jsonextended package to help me parse and manipulate the data I'm working with from Gaussian, CRYSTAL, LAMMPS, etc., in the same kind of format you envisage. In particular, I thought you might be interested in how I am handling unit standardisation, with a "combine-apply-split" methodology utilising the pint package. Here's a quick demo:

  1. read in your front page example output:
import json
from jsonextended import edict
# json.load expects an open file object, not a path
with open('test.json') as f:
    test = json.load(f)
edict.pprint(test,depth=1)
driver:     energy
error:      
method:     {...}
molecule:   {...}
provenance: {...}
raw_output: Output storing was not requested.
return_value: {...}
success:    True
variables:  {...}
  2. Combine all ('val','units') leaf nodes into pint.Quantity objects:
from jsonextended import units as eunits
withunits = eunits.combine_quantities(test,'units','val')
edict.pprint(withunits,depth=2)
driver:       energy
error:        
method:       
  basis:      sto-3g
  expression: SCF
molecule:     
  atoms:    [He, He]
  geometry: [[0 0 0] [0 0 1]] Å
provenance:   
  creator: QM Program
  routine: program.run_json
  version: 1.1rc1
raw_output:   Output storing was not requested.
return_value: -5.433191881443323 E_h
success:      True
variables:    
  NUCLEAR REPULSION ENERGY: 2.11670883436 E_h
  ONE-ELECTRON ENERGY:      -11.67399006298957 E_h
  SCF DIPOLE X:             0.0 E_h
  SCF DIPOLE Y:             0.0 E_h
  SCF DIPOLE Z:             0.0 E_h
  SCF N ITERS:              2.0
  SCF TOTAL ENERGY:         -5.433191881443323 E_h
  SCF TWO-ELECTRON ENERGY:  4.124089347186247 E_h
  3. Apply a unit schema to the data, to convert specified fields to the required units:
newunits = eunits.apply_unitschema(withunits,{'geometry':'nm',
                                              'return_value':'kcal',
                                              'variables':{'SCF*':'eV'}},
                                   use_wildcards=True)
edict.pprint(newunits,depth=2)
driver:       energy
error:        
method:       
  basis:      sto-3g
  expression: SCF
molecule:     
  atoms:    [He, He]
  geometry: [[ 0. 0. 0. ] [ 0. 0. 0.1]] nm
provenance:   
  creator: QM Program
  routine: program.run_json
  version: 1.1rc1
raw_output:   Output storing was not requested.
return_value: -5.661406639574504e-21 kcal
success:      True
variables:    
  NUCLEAR REPULSION ENERGY: 2.11670883436 E_h
  ONE-ELECTRON ENERGY:      -11.67399006298957 E_h
  SCF DIPOLE X:             0.0 eV
  SCF DIPOLE Y:             0.0 eV
  SCF DIPOLE Z:             0.0 eV
  SCF N ITERS:              2.0 eV
  SCF TOTAL ENERGY:         -147.84466590569593 eV
  SCF TWO-ELECTRON ENERGY:  112.22217528934715 eV
  4. Split the pint.Quantity objects back into their ('val','units') pairs:
removeunits = eunits.split_quantities(newunits,'units','val')
edict.pprint(removeunits,depth=3)
driver:     energy
error:      
method:     
  basis:      sto-3g
  expression: SCF
molecule:   
  atoms: [He, He]
  geometry: 
    units: nanometer
    val:   [[ 0. 0. 0. ] [ 0. 0. 0.1]]
provenance: 
  creator: QM Program
  routine: program.run_json
  version: 1.1rc1
raw_output: Output storing was not requested.
return_value: 
  units: kilocalorie
  val:   -5.661406639574504e-21
success:    True
variables:  
  NUCLEAR REPULSION ENERGY: 
    units: hartree
    val:   2.11670883436
  ONE-ELECTRON ENERGY: 
    units: hartree
    val:   -11.67399006298957
  SCF DIPOLE X: 
    units: electron_volt
    val:   0.0
  SCF DIPOLE Y: 
    units: electron_volt
    val:   0.0
  SCF DIPOLE Z: 
    units: electron_volt
    val:   0.0
  SCF N ITERS: 
    units: electron_volt
    val:   2.0
  SCF TOTAL ENERGY: 
    units: electron_volt
    val:   -147.84466590569593
  SCF TWO-ELECTRON ENERGY: 
    units: electron_volt
    val:   112.22217528934715

Ta,
Chris


tovrstra commented on June 19, 2024

jsonextended and pint are very impressive but I guess, for the sake of defining a JSON schema, they may add too much complexity? It would be nice though to design the schema such that it plays nice with these packages.

jsonextended and pint do not seem to solve the original problem mentioned by @loriab, namely that different QC codes have different definitions of unit conversion factors, e.g. they use (slightly) different numbers to convert from Bohr to Angstrom. Is there a way to get around this?
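To make the size of the problem concrete, here is a small sketch comparing two Bohr-to-Angstrom factors. The numbers are the CODATA 2014 and CODATA 2018 recommended Bohr radii as best recalled here; they should be checked against the NIST tables before being relied on, and the variable names are illustrative only:

```python
# Two conversion factors that two codes might plausibly have hard-coded.
# Assumed values: CODATA 2014 and CODATA 2018 Bohr radius, in angstrom.
BOHR_TO_ANGSTROM_2014 = 0.52917721067
BOHR_TO_ANGSTROM_2018 = 0.529177210903

distance_bohr = 10.0

# Program A writes the distance in angstrom using the 2014 factor ...
written_angstrom = distance_bohr * BOHR_TO_ANGSTROM_2014

# ... and program B reads it back into bohr using the 2018 factor.
read_back_bohr = written_angstrom / BOHR_TO_ANGSTROM_2018

# Relative disagreement between the two definitions of the same unit.
relative_difference = (
    abs(BOHR_TO_ANGSTROM_2018 - BOHR_TO_ANGSTROM_2014) / BOHR_TO_ANGSTROM_2018
)
round_trip_error_bohr = read_back_bohr - distance_bohr

print(f"relative difference: {relative_difference:.2e}")
print(f"round-trip error:    {round_trip_error_bohr:.2e} bohr")
```

The mismatch is small, but it is far above double-precision round-off, so a geometry passed between two codes does not survive the round trip bit for bit.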


dgasmith commented on June 19, 2024

@tovrstra Agreed, I think we can recommend tools. However, the spec itself is tool independent.

Using slightly different conversion factors is tricky. We could take the following steps:

  • Request that all input/output values to QM programs be in Hartree
  • MolSSI could build a repository that had the updated values for everyone to use.


tovrstra commented on June 19, 2024

@dgasmith So you suggest dropping any support for different units and requiring all numbers to use atomic units?


wadejong commented on June 19, 2024


andysim commented on June 19, 2024

This is a very tricky problem, with many different codes using different conversion factors and units in their output. In a JSON context, one possible approach would be to have an extra field that specifies the conversion factor for each quantity (length, energy, etc.) used by the program of interest to some specific convention, e.g. atomic units. This would allow a.u. input to be converted internally by any code, using their native conventions, as usual. It would also provide a mechanism for converting output received to a 'standard' form (a.u. in the example I provided).
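A rough sketch of what such an extra field might look like, assuming a hypothetical `units_per_au` block and a hypothetical `kind` tag on each quantity; neither name comes from the schema, and the length factor is only an example of a program's native value:

```python
# Hypothetical output: the program reports, per kind of quantity, how many of
# its output units correspond to one atomic unit (its own native factor).
output = {
    "units_per_au": {"length": 0.52917721067, "energy": 1.0},
    "geometry": {"kind": "length", "val": [0.0, 0.0, 1.0]},
    "return_value": {"kind": "energy", "val": -5.433191881443323},
}


def to_au(quantity, units_per_au):
    """Convert a quantity to atomic units with the writer's own factor."""
    factor = units_per_au[quantity["kind"]]
    val = quantity["val"]
    if isinstance(val, list):
        return [v / factor for v in val]
    return val / factor


geometry_au = to_au(output["geometry"], output["units_per_au"])
energy_au = to_au(output["return_value"], output["units_per_au"])
```

Because the factor travels with the data, converting back with the same number recovers the program's original values up to round-off, whichever constants the receiver prefers.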


matt-chan commented on June 19, 2024

Instead of accepting a variety of units, it would be nice to work with one set. That way, a simple project implementing the spec wouldn't be required to include code to convert from a plethora of possible units.

As others have suggested, we would need an agreed standard (MolSSI or IUPAC) for conversion.

We could include test cases which would help codes that don't natively work with those units to minimize bugs. (Even if we decide to accept multiple unit systems in the spec, it'd still be a good idea to have the tests)
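Such test cases could look roughly like the sketch below. The reference table, the tolerance, and the `convert(value, src, dst)` signature are all assumptions made for illustration; the reference factor is the CODATA 2018 Bohr radius in angstrom as recalled here and would need to be replaced by whatever standard is agreed:

```python
import math

# Hypothetical agreed reference factors, keyed by (source unit, target unit).
REFERENCE_FACTORS = {("bohr", "angstrom"): 0.529177210903}


def check_conversion(convert, rel_tol=1e-9):
    """Return the reference cases that a code's converter fails to reproduce."""
    failures = []
    for (src, dst), expected in REFERENCE_FACTORS.items():
        got = convert(1.0, src, dst)
        if not math.isclose(got, expected, rel_tol=rel_tol):
            failures.append((src, dst, got, expected))
    return failures


def example_convert(value, src, dst):
    """Stand-in for a code that hard-codes a slightly older factor."""
    table = {("bohr", "angstrom"): 0.52917721067}
    return value * table[(src, dst)]


# Passes at a loose tolerance, fails at a tight one.
loose_failures = check_conversion(example_convert, rel_tol=1e-9)
strict_failures = check_conversion(example_convert, rel_tol=1e-12)
```

The tolerance is the real design decision here: it states how much disagreement between a code's native constants and the agreed ones the spec is willing to accept.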


cryos commented on June 19, 2024

Agreed, I strongly recommend one set of units. Support others, but have a recommended set of units for the format. Agreed conversion factors would then be available to apply to the rest.
