Geometry unit conversion factor about qcschema HOT 10 OPEN

molssi commented on June 23, 2024

Geometry unit conversion factor

from qcschema.

Comments (10)

tovrstra commented on June 23, 2024

@dgasmith @wadejong @matt-chan @cryos Mandating default units and conversion factors cannot work, for reasons partly explained in earlier comments. I'll try to summarize the problem:

Different programs work in different units internally, and they usually already have their own conversion factors to transform results to other units before printing. These aspects of existing software will not change. If you settle on standard units and conversion factors, one of the following two things is going to happen, and neither is great:

Such programs use the conversion factors of the JSON spec to write results in an agreed unit, which may be inconsistent with the usual output of that program.
Such programs may ignore the standard conversion factors and become inconsistent with the spec.

A cleaner solution would be to let every program write its results to the JSON file in its internal units, and to let it specify what those units are. The receiver of the JSON data is then free to handle the units however they like. If conversion is needed, the most reasonable choice would be to take the conversion factors from the NIST website (which are refined occasionally as more literature becomes available). The disadvantage is that the spec becomes more complicated.

P.S. Most QC programs work in atomic units, which may not cause too much trouble. As soon as you want to exchange data with MM programs, all sorts of units are being used.
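
To make the "write internal units and say what they are" idea concrete, here is a minimal sketch. The payload layout (`val`/`units` pairs), the unit labels, and the helper names are assumptions for illustration, not part of any agreed spec. The Bohr radius is the CODATA 2018 value; a receiver could just as well pick a different source.

```python
import json

# Hypothetical payload: each quantity carries its value and a unit label
# chosen by the producing program.
payload = json.loads("""
{
  "geometry": {"val": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]], "units": "angstrom"},
  "return_value": {"val": -5.433191881443323, "units": "hartree"}
}
""")

# Receiver-side choice of conversion factors (assumption: CODATA 2018
# Bohr radius expressed in angstrom).
BOHR_IN_ANGSTROM = 0.529177210903
TO_ATOMIC_UNITS = {
    "angstrom": 1.0 / BOHR_IN_ANGSTROM,
    "bohr": 1.0,
    "hartree": 1.0,
}


def scale(value, factor):
    """Multiply a number, or an arbitrarily nested list of numbers, by factor."""
    if isinstance(value, list):
        return [scale(item, factor) for item in value]
    return value * factor


def to_atomic_units(quantity):
    """Convert one {'val': ..., 'units': ...} entry using the receiver's table."""
    return scale(quantity["val"], TO_ATOMIC_UNITS[quantity["units"]])


geometry_bohr = to_atomic_units(payload["geometry"])
energy_hartree = to_atomic_units(payload["return_value"])
```

The point of the design is that the conversion table lives with the receiver, so the producing program never has to adopt factors that disagree with its own printed output.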


chrisjsewell commented on June 23, 2024

Hey guys, big fan of your aspirations here, I wish people would put as much thought into their output formats as they do the rest of the program! I actually wrote the jsonextended package to help me parse and manipulate the data I'm working with from Gaussian, CRYSTAL, LAMMPS, etc., in the same kind of format you envisage. In particular, I thought you might be interested in how I am handling unit standardisation, with a "combine-apply-split" methodology that uses the pint package. Here's a quick demo:

Read in your front page example output:

import json
from jsonextended import edict
with open('test.json') as f:  # json.load needs a file object, not a path
    test = json.load(f)
edict.pprint(test,depth=1)

driver:     energy
error:      
method:     {...}
molecule:   {...}
provenance: {...}
raw_output: Output storing was not requested.
return_value: {...}
success:    True
variables:  {...}

Combine all ('val','units') leaf nodes into pint.Quantity objects:

from jsonextended import units as eunits
withunits = eunits.combine_quantities(test,'units','val')
edict.pprint(withunits,depth=2)

driver:       energy
error:        
method:       
  basis:      sto-3g
  expression: SCF
molecule:     
  atoms:    [He, He]
  geometry: [[0 0 0] [0 0 1]] Å
provenance:   
  creator: QM Program
  routine: program.run_json
  version: 1.1rc1
raw_output:   Output storing was not requested.
return_value: -5.433191881443323 E_h
success:      True
variables:    
  NUCLEAR REPULSION ENERGY: 2.11670883436 E_h
  ONE-ELECTRON ENERGY:      -11.67399006298957 E_h
  SCF DIPOLE X:             0.0 E_h
  SCF DIPOLE Y:             0.0 E_h
  SCF DIPOLE Z:             0.0 E_h
  SCF N ITERS:              2.0
  SCF TOTAL ENERGY:         -5.433191881443323 E_h
  SCF TWO-ELECTRON ENERGY:  4.124089347186247 E_h

Apply a unit schema to the data, to convert specified fields to the required units.

newunits = eunits.apply_unitschema(withunits,{'geometry':'nm',
                                              'return_value':'kcal',
                                              'variables':{'SCF*':'eV'}},
                                   use_wildcards=True)
edict.pprint(newunits,depth=2)

driver:       energy
error:        
method:       
  basis:      sto-3g
  expression: SCF
molecule:     
  atoms:    [He, He]
  geometry: [[ 0. 0. 0. ] [ 0. 0. 0.1]] nm
provenance:   
  creator: QM Program
  routine: program.run_json
  version: 1.1rc1
raw_output:   Output storing was not requested.
return_value: -5.661406639574504e-21 kcal
success:      True
variables:    
  NUCLEAR REPULSION ENERGY: 2.11670883436 E_h
  ONE-ELECTRON ENERGY:      -11.67399006298957 E_h
  SCF DIPOLE X:             0.0 eV
  SCF DIPOLE Y:             0.0 eV
  SCF DIPOLE Z:             0.0 eV
  SCF N ITERS:              2.0 eV
  SCF TOTAL ENERGY:         -147.84466590569593 eV
  SCF TWO-ELECTRON ENERGY:  112.22217528934715 eV

Split the pint.Quantity objects back into their ('val','units') pairs:

removeunits = eunits.split_quantities(newunits,'units','val')
edict.pprint(removeunits,depth=3)

driver:     energy
error:      
method:     
  basis:      sto-3g
  expression: SCF
molecule:   
  atoms: [He, He]
  geometry: 
    units: nanometer
    val:   [[ 0. 0. 0. ] [ 0. 0. 0.1]]
provenance: 
  creator: QM Program
  routine: program.run_json
  version: 1.1rc1
raw_output: Output storing was not requested.
return_value: 
  units: kilocalorie
  val:   -5.661406639574504e-21
success:    True
variables:  
  NUCLEAR REPULSION ENERGY: 
    units: hartree
    val:   2.11670883436
  ONE-ELECTRON ENERGY: 
    units: hartree
    val:   -11.67399006298957
  SCF DIPOLE X: 
    units: electron_volt
    val:   0.0
  SCF DIPOLE Y: 
    units: electron_volt
    val:   0.0
  SCF DIPOLE Z: 
    units: electron_volt
    val:   0.0
  SCF N ITERS: 
    units: electron_volt
    val:   2.0
  SCF TOTAL ENERGY: 
    units: electron_volt
    val:   -147.84466590569593
  SCF TWO-ELECTRON ENERGY: 
    units: electron_volt
    val:   112.22217528934715

Ta,
Chris


tovrstra commented on June 23, 2024

jsonextended and pint are very impressive but I guess, for the sake of defining a JSON schema, they may add too much complexity? It would be nice though to design the schema such that it plays nice with these packages.

jsonextended and pint do not seem to solve the original problem mentioned by @loriab, namely that different QC codes define their unit conversion factors differently, e.g. they use (slightly) different numbers to convert from Bohr to Angstrom. Is there a way to get around this?
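
To give a sense of scale, here is a small sketch comparing Bohr-to-Angstrom factors from three CODATA releases. The numbers are the recommended Bohr radius values as I recall them and should be checked against NIST before being relied on; which release a given QC code actually uses is not something this sketch claims to know.

```python
# Bohr radius in angstrom by CODATA release (assumption: quoted from the
# published recommended values; verify against NIST before use).
BOHR_TO_ANGSTROM = {
    "CODATA 2010": 0.52917721092,
    "CODATA 2014": 0.52917721067,
    "CODATA 2018": 0.529177210903,
}


def spread(length_bohr):
    """Largest disagreement, in angstrom, when converting one length."""
    converted = [length_bohr * factor for factor in BOHR_TO_ANGSTROM.values()]
    return max(converted) - min(converted)


for name, factor in BOHR_TO_ANGSTROM.items():
    print(f"{name}: 10 bohr = {10.0 * factor:.10f} angstrom")
print(f"spread over 10 bohr: {spread(10.0):.2e} angstrom")
```

The disagreement between these releases is tiny for a single bond length, but it is enough to break bitwise comparisons between programs, and older hard-coded factors can differ by more.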


dgasmith commented on June 23, 2024

@tovrstra Agreed, I think we can recommend tools. However, the spec itself is tool independent.

Using slightly different conversion factors is tricky. We could take the following steps:

Request that all input/output values to QM programs be in Hartree
MolSSI could build a repository that has the up-to-date values for everyone to use.


tovrstra commented on June 23, 2024

@dgasmith So you suggest dropping any support for different units and requiring all numbers to be in atomic units?


wadejong commented on June 23, 2024

There is a repository of values for units, the Gold Book from IUPAC. It would be good to have a standard repository that all QM programs can use to set their values and conversion factors, enabling uniformity.



wadejong commented on June 23, 2024

That would require ingesters to do all the conversions. And those conversions might not agree with the ones in the text-based output files that give users properties in the units they need.



andysim commented on June 23, 2024

This is a very tricky problem, with many different codes using different conversion factors and units in their output. In a JSON context, one possible approach would be to have an extra field that specifies the conversion factor for each quantity (length, energy, etc.) used by the program of interest to some specific convention, e.g. atomic units. This would allow a.u. input to be converted internally by any code, using their native conventions, as usual. It would also provide a mechanism for converting output received to a 'standard' form (a.u. in the example I provided).
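
A rough sketch of what such an extra field could look like follows. The field names (`factors_to_au`, `kind`, `val`), the program name, and the choice of Angstrom and eV as the program's native units are all hypothetical; the factors shown are CODATA 2014 values and stand in for whatever the program uses internally.

```python
# Hypothetical output record: the program declares the factors it uses
# itself to get from its native units to atomic units.
record = {
    "program": "example-qc-code",
    "factors_to_au": {
        "length": 1.0 / 0.52917721067,  # native angstrom -> bohr
        "energy": 1.0 / 27.21138602,    # native eV -> hartree
    },
    "geometry": {"kind": "length", "val": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]},
    "return_value": {"kind": "energy", "val": -147.84},
}


def scale(value, factor):
    """Multiply a number, or an arbitrarily nested list of numbers, by factor."""
    if isinstance(value, list):
        return [scale(item, factor) for item in value]
    return value * factor


def to_au(rec, key):
    """Convert one entry to atomic units with the program's own declared factor."""
    entry = rec[key]
    return scale(entry["val"], rec["factors_to_au"][entry["kind"]])


geometry_au = to_au(record, "geometry")
energy_au = to_au(record, "return_value")
```

Because the receiver applies the program's own factors, the a.u. values it recovers match what the program holds internally, even if another code would have used slightly different constants.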


matt-chan commented on June 23, 2024

Instead of accepting a variety of units, it would be nice to work with one set. That way, a simple project implementing the spec wouldn't be required to include code to convert from a plethora of possible units.

As others have suggested, we would need an agreed standard (MolSSI or IUPAC) for conversion.

We could include test cases which would help codes that don't natively work with those units to minimize bugs. (Even if we decide to accept multiple unit systems in the spec, it'd still be a good idea to have the tests)
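
As a sketch of what such test cases might look like: a shared table of reference conversions plus a checker that any code can run against its own converter. The table contents, the unit labels, the function names, and the relative tolerance are all assumptions for illustration. The expected values are rounded from CODATA 2018, and the example converter deliberately uses 2010 values to show that a modest tolerance absorbs differences between releases.

```python
import math

# Hypothetical shared reference cases: (value, unit, expected value in a.u.).
REFERENCE_CASES = [
    (1.0, "angstrom", 1.8897261246),
    (1.0, "eV", 0.0367493221757),
    (2.5, "bohr", 2.5),
]


def check_converter(to_au, rel_tol=1e-7):
    """Run a code's converter over the reference cases; return the failures."""
    failures = []
    for value, unit, expected in REFERENCE_CASES:
        got = to_au(value, unit)
        if not math.isclose(got, expected, rel_tol=rel_tol):
            failures.append((value, unit, expected, got))
    return failures


def example_to_au(value, unit):
    """Stand-in for a program's converter (assumption: CODATA 2010 factors)."""
    factors = {
        "angstrom": 1.0 / 0.52917721092,
        "eV": 1.0 / 27.21138505,
        "bohr": 1.0,
    }
    return value * factors[unit]


failures = check_converter(example_to_au)
print("all reference cases pass" if not failures else failures)
```

Choosing the tolerance is itself a policy decision: too tight and codes with older constants fail, too loose and genuine unit bugs such as a missing factor slip through.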


cryos commented on June 23, 2024

Agreed, strongly recommend one variety of units. Support others, but have a recommended set of units for the format. Agreed conversion factors to apply would then be available.

