
mzjb / deeph-pack

Deep neural networks for density functional theory Hamiltonian.

License: GNU Lesser General Public License v3.0

Python 84.01% Julia 15.99%
Topics: deeph, hamiltonian, dft, julia, pytorch, equivariant-network, density-functional-theory, first-principles-calculations, physics, ab-initio-simulations

deeph-pack's People

Contributors

aaaashanghai, bsplu, fadelis98, mzjb, ting-bao, yxpw, zhouxy-pku

deeph-pack's Issues

in to raise ValueError(f"Invalid format: `{str(fmt)}`")

Hi there,
When I test your example file (gen_example.py) with all the proper execution paths specified, I hit a bug:

  File "/public/home/ZT/software/DeepH-pack-main/gen_example.py", line 118, in <module>
    stru_shift_pert.to('poscar', f'../example/work_dir/dataset/raw/{shift_index}/POSCAR')
  File "/public/home/ZT/.conda/envs/ZT-py39/lib/python3.9/site-packages/pymatgen/core/structure.py", line 2561, in to
    raise ValueError(f"Invalid format: {str(fmt)}")
ValueError: Invalid format: `../example/work_dir/dataset/raw/0/poscar`

It seems the call in gen_example.py is not recognized by pymatgen; my pymatgen version is pymatgen-2023.1.30. If you can test it or give me some advice, I would greatly appreciate your help.

Best regards,
ZT
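A likely cause, hedged since it depends on the installed pymatgen version: recent pymatgen releases changed the positional order of Structure.to() to (filename, fmt), so a call like stru_shift_pert.to('poscar', path) passes the output path as fmt and raises the "Invalid format" error above. A minimal sketch of the keyword-argument form, which works with both the old and the new signature (the structure below is only a stand-in for stru_shift_pert in gen_example.py):

    from pymatgen.core import Lattice, Structure

    # Stand-in structure; in gen_example.py this would be stru_shift_pert.
    structure = Structure(Lattice.cubic(3.57), ["C", "C"],
                          [[0.0, 0.0, 0.0], [0.25, 0.25, 0.25]])

    # Keyword arguments avoid relying on the positional order of to(),
    # which differs between older and newer pymatgen releases.
    structure.to(filename="POSCAR_test", fmt="poscar")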

Failed to preprocess

To the respected developer,
When I run step 2 (preprocess) for TBB, there is an error: "Failed to preprocess". While searching for the cause in the source code, I found this piece of code at line 89 of preprocess.py:
    if capture_output.returncode != 0:
        with open(os.path.join(os.path.abspath(relpath), 'error.log'), 'w') as f:
            f.write(f'[stdout of cmd "{cmd}"]:\n\n{capture_output.stdout}\n\n\n'
                    f'[stderr of cmd "{cmd}"]:\n\n{capture_output.stderr}')
        print(f'\nFailed to preprocess: {abspath}, '
              f'log file was saved to {os.path.join(os.path.abspath(relpath), "error.log")}')
My preprocess.ini is:
[basic]
raw_dir = /public/wcd/twisted/example/work_dir/dataset/raw
processed_dir = /public/wcd/twisted/example/work_dir/dataset/processed
target = hamiltonian
interface = openmx
multiprocessing = 0
local_coordinate = True
get_S = False
[interpreter]
python_interpreter = /public/apps/miniconda3/bin/python3.9
julia_interpreter = /public/apps/julia-1.5.4/bin/juila

[graph]
radius = -1.0
create_from_DFT = True
I want to ask how I can solve this problem.
Best regards.
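One way to see the actual cause, based on the preprocess.py snippet quoted above (which writes the Julia stdout/stderr of each failed structure to error.log): read those log files from the processed directory. A minimal sketch, assuming the processed_dir from the preprocess.ini above:

    import os

    # Path copied from the preprocess.ini quoted above (adjust as needed).
    processed_dir = "/public/wcd/twisted/example/work_dir/dataset/processed"

    # Print every error.log that the preprocess step wrote, one per structure.
    for sub in sorted(os.listdir(processed_dir)):
        log_path = os.path.join(processed_dir, sub, "error.log")
        if os.path.isfile(log_path):
            with open(log_path) as f:
                print(f"=== {log_path} ===\n{f.read()}")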

question about van der Waals heterojunctions

Hi,
I found that the double-layer structures in the datasets of your article are all composed of the same two layers. I would like to know whether DeepH can be applied to van der Waals heterojunctions.

Best regards.

dataset

I tried to install OpenMX, but it is too hard. Is it possible to get the raw training data directly?

Error while training TBG_dataset

Dear developer, I tried to train the TBG_dataset but failed for many reasons.
I took 10 of the structures and tested on two devices.

CPU cluster :


/home/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/graph.py:664: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at  ../torch/csrc/utils/tensor_new.cpp:210.)
  read_terms[key] = torch.tensor(v, dtype=default_dtype_torch)
100%|##########| 10/10 [42:54<00:00, 257.49s/it]
Finish processing 10 structures, have cost 2575 seconds
Finish saving 10 structures to ./graph/HGraph-h5-test-5l-FromDFT.pkl, have cost 2577 seconds
Atomic types: [6]
Finish loading the processed 10 structures (spinful: False, the number of atomic types: 1), cost 0 seconds
number of train set: 6
number of val set: 2
number of test set: 2
{'normalizer': False, 'boxcox': False}
Output features length of single edge: 169
The model you built has: 493210 parameters
VAL loss each out:
[7.6e+00, 2.1e+00, 6.2e+00, 3.7e-01, 7.8e-02, 5.2e+00, 5.7e-01, 5.8e-01, 4.4e+00, 8.7e-01, 4.0e+00, 4.0e-01, 1.9e-01, 1.8e+00, 2.2e+00, 2.7e+00, 1.7e-01, 2.0e+00, 9.1e-01, 6.4e-01, 2.0e-02, 2.3e-01, 8.2e-02, 6.0e-01, 7.3e-01, 1.4e+00, 4.3e+00, 1.6e+00, 4.6e+00, 1.1e+00, 2.6e+00, 2.9e+00, 2.2e-01, 6.4e-01, 1.4e+00, 5.9e-01, 1.3e-02, 1.1e+00, 1.2e+00, 2.1e-01, 4.2e-01, 4.4e-01, 2.0e+00, 4.3e-01, 7.2e-02, 1.7e+00, 3.1e-01, 1.7e+00, 8.4e-02, 1.0e+00, 2.4e-02, 3.3e+00, 1.3e-02, 4.4e-01, 6.0e-01, 1.1e-01, 2.9e+00, 2.2e+00, 1.3e-01, 2.0e+00, 2.1e+00, 2.1e-01, 1.6e-01, 9.3e-01, 4.1e-02, 3.2e+00, 5.7e-01, 2.7e+00, 1.2e-01, 2.9e+00, 3.3e+00, 8.2e-02, 6.5e-02, 3.7e-01, 1.5e-01, 5.7e-01, 4.3e-01, 8.4e-02, 5.3e-03, 1.4e-01, 1.0e+00, 1.7e+00, 2.1e-02, 2.2e-01, 2.5e+00, 2.5e-01, 1.3e-01, 1.8e-01, 1.2e+00, 2.0e-01, 3.9e-02, 1.4e-02, 8.6e-01, 1.3e+00, 5.2e-01, 3.4e+00, 2.2e-01, 7.3e+00, 1.9e+00, 1.5e-01, 1.5e-01, 2.3e-01, 1.2e+00, 4.6e-01, 8.9e-01, 3.2e-01, 3.4e+00, 2.7e+00, 1.5e+00, 6.0e-01, 3.8e+00, 6.4e+00, 4.0e+00, 6.3e-01, 1.5e+00, 1.2e-01, 4.2e+00, 8.9e+00, 6.0e-01, 2.7e+00, 1.6e-01, 1.4e+00, 2.7e+00, 6.3e-01, 1.2e+00, 3.8e-01, 5.3e+00, 2.9e-01, 4.3e-02, 9.2e-02, 4.7e-01, 5.3e-01, 4.4e-01, 1.2e+00, 2.0e+00, 7.8e-01, 8.1e-01, 2.6e-01, 6.5e-02, 5.9e-02, 7.5e+00, 8.3e-01, 6.6e-02, 2.0e-01, 3.0e+00, 4.6e-01, 4.8e-01, 9.3e-01, 1.8e+00, 7.8e-01, 4.1e-01, 3.1e-02, 1.8e-01, 1.4e-01, 5.9e+00, 4.5e+00, 2.2e+00, 4.5e-01, 5.4e-02, 3.3e+00, 4.0e-01, 8.1e-01, 4.2e-01, 3.4e-01, 5.1e-01, 3.3e-01, 1.4e+00, 6.2e-02, 6.5e+00]
max orbital: 8.9e+00 (0-based index: 117)
Traceback (most recent call last):
  File "/home/k0171/k017113/miniconda3/envs/deeph/bin/deeph-train", line 8, in 
    sys.exit(main())
  File "/home/k0171/k017113/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/scripts/train.py", line 20, in main
    kernel.train(train_loader, val_loader, test_loader)
  File "/home/k0171/k017113/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/kernel.py", line 565, in train
    save_model({
  File "/home/k0171/k017113/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/utils.py", line 133, in save_model
    with package.PackageExporter(model_dir, verbose=False) as exp:
TypeError: __init__() got an unexpected keyword argument 'verbose'

It looks like CPU-only PyTorch runs slowly for some reason, and the training stops after only one step.
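For the TypeError above: as far as I know, the verbose argument was removed from torch.package.PackageExporter in PyTorch releases after 1.9, which is consistent with the traceback. A hedged compatibility sketch (not the official fix) that only passes verbose when the installed PyTorch still accepts it:

    import inspect
    from torch.package import PackageExporter

    # Build the keyword arguments dynamically so the same code runs on both
    # old PyTorch (which accepted verbose=...) and newer releases (which do not).
    exporter_kwargs = {}
    if "verbose" in inspect.signature(PackageExporter.__init__).parameters:
        exporter_kwargs["verbose"] = False

    # deeph/utils.py could then call: PackageExporter(model_dir, **exporter_kwargs)
    print(exporter_kwargs)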

WSL2 with GPU:

There is no error message on this device; however, the process gets stuck here:


Graph data file: HGraph-h5-test-5l-FromDFT.pkl
Process new data file......
Found 10 structures, have cost 0 seconds
Use multiprocessing
 10%|#         | 1/10 [00:54<08:08, 54.24s/it]

After processing one structure, the process stops working completely, and meanwhile the CPU is not used at all.
The training is fine if I only take 1 structure.

Do you know how to solve these problems?

Upgrade to new version of pyG and pytorch

Hello, because of some hardware limits, I'm trying to use newer versions of PyTorch and PyG with this package. Do you already know which API changes cause the incompatibility of this package with PyTorch > 1.9 and PyTorch Geometric >= 2.0?
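Two changes I am aware of, stated as assumptions rather than a complete list: the verbose argument removed from torch.package.PackageExporter (see the training traceback above), and DataLoader moving from torch_geometric.data to torch_geometric.loader in PyTorch Geometric 2.0. A small import shim for the latter:

    # Hedged sketch: keep old DeepH-style imports working on both PyG versions.
    try:
        from torch_geometric.loader import DataLoader  # PyG >= 2.0
    except ImportError:
        from torch_geometric.data import DataLoader    # PyG 1.x

    print(DataLoader)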

In train.ini, what does 'orbital basis number' mean in the basic <orbital> option?

Hi,
When I try to reproduce the TBG result from your paper 'Deep-Learning Density Functional Theory Hamiltonian for Efficient ab initio Electronic-Structure Calculation', I'm confused about what 'orbital basis number' means. Does it mean 'valence electrons', or something else?
Regards,
TzuChing

LoadError when predicting Hamiltonian

ERROR: LoadError: SystemError: memory mapping failed: No such device
Stacktrace:
[1] systemerror(::String, ::Int32; extrainfo::Nothing) at ./error.jl:168
[2] #systemerror#48 at ./error.jl:167 [inlined]
[3] systemerror at ./error.jl:167 [inlined]
[4] mmap(::IOStream, ::Type{Array{UInt8,1}}, ::Tuple{Int64}, ::Int64; grow::Bool, shared::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Mmap/src/Mmap.jl:212
[5] #mmap#9 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Mmap/src/Mmap.jl:247 [inlined]
[6] mmap at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Mmap/src/Mmap.jl:247 [inlined] (repeats 2 times)
[7] #4 at /home/peiyuan/.julia/packages/JSON/NeJ9k/src/Parser.jl:510 [inlined]
[8] open(::JSON.Parser.var"#4#5"{DataType,DataType,Nothing,Bool,Bool,Int64}, ::String; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:325
[9] open at ./io.jl:323 [inlined]
[10] #parsefile#3 at /home/peiyuan/.julia/packages/JSON/NeJ9k/src/Parser.jl:509 [inlined]
[11] parsefile(::String) at /home/peiyuan/.julia/packages/JSON/NeJ9k/src/Parser.jl:508
[12] top-level scope at /home/peiyuan/anaconda3/envs/deep8/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:64
[13] include(::Function, ::Module, ::String) at ./Base.jl:380
[14] include(::Module, ::String) at ./Base.jl:368
[15] exec_options(::Base.JLOptions) at ./client.jl:296
[16] _start() at ./client.jl:506
in expression starting at /home/peiyuan/anaconda3/envs/deep8/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:64
Traceback (most recent call last):
  File "/home/peiyuan/anaconda3/envs/deep8/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/home/peiyuan/anaconda3/envs/deep8/lib/python3.9/site-packages/deeph/scripts/inference.py", line 129, in main
    assert capture_output.returncode == 0
AssertionError

Necessary files could not be found in OLP_dir

Hello.
When I use OpenMX to calculate the overlap matrix of the material to be predicted, the openmx.out file is not generated, although other output files are.

Then, in the inference step, the following error occurs:

assert os.path.exists(os.path.join(OLP_dir, 'openmx.out')), "Necessary files could not be found in OLP_dir"
AssertionError: Necessary files could not be found in OLP_dir.

How can I solve it?
Thank you.

question about step4_overlap

Dear developer, may I ask why it is necessary to calculate the overlap before using the model? If I could directly obtain the real-space Hamiltonian through the model, wouldn't it be possible to obtain any band through diagonalization?
Can the overlap help improve efficiency, or does it have some other effect?

OutOfMemoryError

Hi, when I try to run step 5 with task = [5] and calculate only one k-point (the 1st), these are my inference parameters:

############################################
[basic]
OLP_dir = /home/zjlin/tbg_Deep_ex/work_dir/olp/TBG_1.05/
work_dir = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/
structure_file_name = POSCAR
interface = openmx
task = [5]
sparse_calc_config = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/band_1.json
trained_model_dir = /home/zjlin/tbg_Deep_ex/work_dir/trained_model/2022-12-21_21-30-31
restore_blocks_py = False

[interpreter]
julia_interpreter = /home/zjlin/julia-1.5.4/bin/julia

[graph]
radius = 9.0
create_from_DFT = True
#############################################

and band.json:

#############################################

{
"calc_job": "band",
"which_k": 1,
"fermi_level": -3.8624886706842148,
"lowest_band": -0.3,
"max_iter": 300,
"num_band": 125,
"k_data": ["10 0.000 0.000 0.000 0.500 0.000 0.000 G M ","10 0.500 0.000 0.000 0.333333333 0.333333333 0.000 M K","10 0.333333333 0.333333333 0.000 0.000 0.000 0.000 K G"]
}

##############################################

But I got an error:

##############################################

[ Info: read h5
[ Info: construct Hamiltonian and overlap matrix in the real space
ERROR: LoadError: OutOfMemoryError()
Stacktrace:
[1] Array at ./boot.jl:424 [inlined]
[2] Array at ./boot.jl:432 [inlined]
[3] zeros at ./array.jl:525 [inlined]
[4] zeros(::Type{Complex{Float64}}, ::Int64, ::Int64) at ./array.jl:521
[5] top-level scope at /home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:121
[6] include(::Function, ::Module, ::String) at ./Base.jl:380
[7] include(::Module, ::String) at ./Base.jl:368
[8] exec_options(::Base.JLOptions) at ./client.jl:296
[9] _start() at ./client.jl:506
in expression starting at /home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:105
Traceback (most recent call last):
File "/home/zjlin/anaconda3/envs/pytorch/bin/deeph-inference", line 8, in
sys.exit(main())
File "/home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/scripts/inference.py", line 131, in main
assert capture_output.returncode == 0
AssertionError

###############################################
Here's my question:
When I calculate TBG at a twist angle of 1.05°, there are 11908 atoms; how much memory do we need to prepare?
Best regards
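A rough, hedged estimate of the dense-solver memory for this structure, assuming 13 numerical orbitals per carbon atom (consistent with the 13 × 13 = 169 output features per edge reported in the training log earlier on this page) and ComplexF64 storage as in the dense_calc.jl stack trace above:

    # Back-of-the-envelope memory estimate for one dense Hamiltonian/overlap matrix.
    n_atoms = 11908            # atoms in the 1.05 degree TBG cell (from the issue)
    orb_per_atom = 13          # assumption: s2p2d1-like carbon basis, 13 orbitals
    n_orb = n_atoms * orb_per_atom
    bytes_per_matrix = n_orb ** 2 * 16   # ComplexF64 = 16 bytes per entry

    print(n_orb)                          # 154804 orbitals
    print(bytes_per_matrix / 2 ** 30)     # roughly 357 GiB per dense matrix

H and S together, plus eigensolver workspace, multiply this further, so a dense calculation of this size is far beyond typical nodes; the sparse eigensolver mentioned elsewhere on this page is the usual route for structures this large.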

failure when i run the process

It produced this error.log:
[stdout of cmd "/data/home/scv7f2x/run/julia-1.6.6/bin/julia /data/home/scv7f2x/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/raw/0 --output_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/processed/0 --save_overlap false"]:

[stderr of cmd "/data/home/scv7f2x/run/julia-1.6.6/bin/julia /data/home/scv7f2x/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/raw/0 --output_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/processed/0 --save_overlap false"]:

ERROR: LoadError: AssertionError: DeepH-pack only supports OpenMX v3.9. Please check your OpenMX version
Stacktrace:
[1] parse_openmx(filepath::String; return_DM::Bool)
@ Main ~/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:160
[2] get_data(filepath_scfout::String, Rcut::Float64; if_DM::Bool)
@ Main ~/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:321
[3] top-level scope
@ ~/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:424
in expression starting at /data/home/scv7f2x/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:424

I'm sure that I installed OpenMX 3.9, so I don't know how to solve it.

The raw directory contains: output, openmx.cif, openmx.out, openmx.scfout, openmx.std, openmx.xyz, openmx_in.dat, poscar.

How can I solve it? Thanks.

question about 'phiVdphi'

Thank you for publishing the wonderful code.

I would like to ask about the meaning of the variable 'phiVdphi'.
I believe this physical quantity is not even mentioned in the DeepH paper.
If this is something that can be output from OpenMX, I would appreciate knowing how to generate it.

demo with ABACUS v3.1

Hello.
I'm trying to follow the demo_abacus.

In the inference step, running
deeph-inference --config inference.ini
throws the following error:

 Output subdirectories: OUT.ABACUS
Traceback (most recent call last):
  File "/home/whal1235/miniconda3/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/inference.py", line 78, in main
    abacus_parse(OLP_dir, work_dir, data_name=f'OUT.{abacus_suffix}', only_S=True)
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 82, in abacus_parse
    assert "WELCOME TO ABACUS" in line
AssertionError

And my ABACUS log file is here:



                              ABACUS v3.1

               Atomic-orbital Based Ab-initio Computation at UStc

                     Website: http://abacus.ustc.edu.cn/
               Documentation: https://abacus.deepmodeling.com/
                  Repository: https://github.com/abacusmodeling/abacus-develop
                              https://github.com/deepmodeling/abacus-develop

    Start Time is Wed Mar  8 13:45:05 2023

 ------------------------------------------------------------------------------------

 READING GENERAL INFORMATION
                           global_out_dir = OUT.ABACUS/
                           global_in_card = INPUT
                               pseudo_dir =
                              orbital_dir =
                                    DRANK = 1
                                    DSIZE = 1
                                   DCOLOR = 1
                                    GRANK = 1
                                    GSIZE = 1
 The esolver type has been set to : ksdft_lcao




 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 |                                                                    |
 | Reading atom information in unitcell:                              |
 | From the input file and the structure file we know the number of   |
 | different elments in this unitcell, then we list the detail        |
 | information for each element, especially the zeta and polar atomic |
 | orbital number for each element. The total atom number is counted. |
 | We calculate the nearest atom distance for each atom and show the  |
 | Cartesian and Direct coordinates for each atom. We list the file   |
 | address for atomic orbitals. The volume and the lattice vectors    |
 | in real and reciprocal space is also shown.                        |
 |                                                                    |
 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<




 READING UNITCELL INFORMATION
                                    ntype = 1
                 atom label for species 1 = C
                  lattice constant (Bohr) = 1.88972
              lattice constant (Angstrom) = 0.999996

 READING ATOM TYPE 1
                               atom label = C
                      L=0, number of zeta = 2
                      L=1, number of zeta = 2
                      L=2, number of zeta = 1
             number of atom for this type = 100
                      start magnetization = FALSE
                      start magnetization = FALSE
                      start magnetization = FALSE
                      start magnetization = FALSE

I cannot find any "WELCOME TO ABACUS" line.

Does this occur only with ABACUS v3, or have I done something wrong?

Can I remove this assert line from the Python code?

Thank you.
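A hedged sketch rather than an official fix: instead of deleting the assert, the banner check could be relaxed so that both the ABACUS 2.x "WELCOME TO ABACUS" header and the 3.x "ABACUS v3.1" header shown above are accepted. Note that other issues on this page suggest further parsing differences in ABACUS 3.x (e.g. the NSPIN and H(R) lines), so this alone may not be enough:

    def looks_like_abacus_log(log_text: str) -> bool:
        """Accept both ABACUS 2.x and 3.x log headers (heuristic, not official)."""
        return "WELCOME TO ABACUS" in log_text or "ABACUS v" in log_text

    print(looks_like_abacus_log("                              ABACUS v3.1"))  # True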

How to prepare other training dataset

As the author shows, we can use a dataset of small structures to predict large-scale material systems, but I am confused about what an "appropriate dataset of small structures" is.

  1. How many small structures should be in the dataset?
  2. The author gave us a Bi36 dataset to predict Bi244; why use Bi36 instead of Bi16 or Bi4?
  3. What does "have a close chemical bonding environment" mean, the same space group?

Question about the training parameters

To the developer,

When I use the examples of the Bi and graphene systems, I notice some parameters (revert_then_decay, revert_decay_epoch, revert_decay_gamma) in the train.ini file which are not explained on the manual website. Therefore, if we start a calculation on a new system, I want to ask whether these parameters need to be considered.

Best regards.

Error in 1.parse_Overlap during the inference part

Hi, dear developer,
when I am doing the inference part, the following error occurred.

Begin 1.parse_Overlap
Traceback (most recent call last):
  File "/fs1/home/qijingshan/miniconda3/envs/copy/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/deeph/scripts/inference.py", line 63, in main
    openmx_parse_overlap(OLP_dir, work_dir, os.path.join(OLP_dir, structure_file_name))
  File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/deeph/preprocess/openmx_parse.py", line 76, in openmx_parse_overlap
    structure = Structure.from_file(stru_dir)
  File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/pymatgen/core/structure.py", line 2676, in from_file
    s = cls.from_str(contents, fmt="poscar", primitive=primitive, sort=sort, merge_tol=merge_tol, **kwargs)
  File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/pymatgen/core/structure.py", line 2594, in from_str
    s = Poscar.from_string(input_string, False, read_velocities=False, **kwargs).structure
  File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/pymatgen/io/vasp/inputs.py", line 390, in from_string
    toks = lines[ipos + 1 + i].split()
IndexError: tuple index out of range

It looks a bit strange. What should I do to avoid this error?

Here is my train.ini
[basic]
OLP_dir = /fs1/home/qijingshan/wcd/TBB/example/work_dir/olp/5_4
work_dir = /fs1/home/qijingshan/wcd/TBB/example/work_dir/inference/5_4
structure_file_name = POSCAR
task = [1, 2, 3, 4, 5]
sparse_calc_config = /fs1/home/qijingshan/wcd/TBB/example/work_dir/inference/5_4/band.json
trained_model_dir = /fs1/home/qijingshan/wcd/TBB/example/work_dir/trained_model
restore_blocks_py = True

[interpreter]
julia_interpreter = /fs1/home/qijingshan/julia-1.6.6/bin/juila

[graph]
radius = 9.0
create_from_DFT = True
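The traceback above ends inside pymatgen's POSCAR parser, so one hedged check is to read the same structure file directly, outside DeepH; if the standalone call below reproduces the IndexError, the POSCAR in OLP_dir is itself malformed, independent of DeepH (the path is taken from the config above):

    from pymatgen.core import Structure

    # Same file that openmx_parse_overlap() tries to read, per the config above.
    poscar_path = "/fs1/home/qijingshan/wcd/TBB/example/work_dir/olp/5_4/POSCAR"

    structure = Structure.from_file(poscar_path)
    print(structure.composition, len(structure))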

How to restart training model

Hi,

When I ran the example using ABACUS, everything went well except that the training was canceled by the Slurm system due to the time limit.

Because the maximum time for one submission using the GPU accelerator card is limited on our cluster, I wonder whether I can restart the training in the next submission.

Inference 5.sparse_calc error in julia

I'm trying to do the inference step in the "demo_abacus".

python calc_OLP_of_CNT.py
deeph-inference --config inference.ini

In the inference step, I met an error, and I cannot guess why it occurs.

Here is my error:

####### Begin 5.sparse_calc
./abacus_CNT.json
[ Info: read h5
Time for reading h5: 2.841439962387085s
[ Info: construct Hamiltonian and overlap matrix in the real space
Time for constructing Hamiltonian and overlap matrix in the real space: 0.7667179107666016 s
[ Info: calculate bands
ERROR: LoadError: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
 [1] chkposdef
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lapack.jl:50 [inlined]
 [2] sygvd!(itype::Int64, jobz::Char, uplo::Char, A::Matrix{ComplexF64}, B::Matrix{ComplexF64})
   @ LinearAlgebra.LAPACK /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lapack.jl:5337
 [3] #eigen!#102
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/symmetric.jl:832 [inlined]
 [4] eigen!
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/symmetric.jl:832 [inlined]
 [5] eigen(A::Hermitian{ComplexF64, Matrix{ComplexF64}}, B::Hermitian{ComplexF64, Matrix{ComplexF64}}; kws::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:501
 [6] eigen(A::Hermitian{ComplexF64, Matrix{ComplexF64}}, B::Hermitian{ComplexF64, Matrix{ComplexF64}})
   @ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:500
 [7] top-level scope
   @ ~/miniconda3/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:162
in expression starting at /home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:136
Traceback (most recent call last):
  File "/home/whal1235/miniconda3/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/inference.py", line 131, in main
    assert capture_output.returncode == 0
AssertionError

Note that I did not follow the other steps before this.

I used the graphene dataset of the paper (https://zenodo.org/record/6555484#.ZAseK3ZByUk)

and trained a model using "graphene.ini" from the git repository.

I guess the trained model is not the problem, because inference steps 1-4 finished without any error.

Do you think I should follow the data generation and training steps in "demo_abacus"?

Or does the error arise from some other reason?

Thank you.

graphene_dataset can't be trained

It can't be trained because graph.py raises a NotImplementedError. This exception is raised when the configuration parameter "interface" is not in ['h5', 'h5_rc_only'] and the parameter "create_from_DFT" is True. But if I change the "interface" parameter to "h5", the dataset can't find any structures, because the rc and rh files end with "npz".

Whether DeepH supports parallel computing (for example, using mpirun)

To developer,

I am a beginner with DeepH. I am wondering whether DeepH supports parallel computing, for example using mpirun. Because most users run it on multi-core systems, jobs are executed in a parallel style. When I run DeepH, especially in the training step, I would like to execute the work on multiple cores (CPUs or GPUs). I have allocated the job with multiple cores in the Slurm job submission system, but the job still seems to run on one CPU or GPU. So I hope, and suggest, that you could write more details on how to execute the program on multiple cores. This would make it more helpful and faster for people to use. Many thanks for your help.

Best regards,
Tao

failure while preprocess in demo-abacus

Hello Dr. Li, I am a beginner in DeepH.
When I tried to replicate the "demo: train the DeepH model using the ABACUS interface", I encountered a failure when I ran step 2 (preprocess); the command line displays:
User config name: ['preprocess.ini']
Found 3 directories to preprocess
Preprocessing No. 1/3 [08:00:00<?]...Output subdirectories: OUT.ABACUS
Traceback (most recent call last):
  File "/public3/home/scg8978/.conda/envs/python39/bin/deeph-preprocess", line 33, in <module>
    sys.exit(load_entry_point('deeph==0.2.2', 'console_scripts', 'deeph-preprocess')())
  File "/public3/home/scg8978/.conda/envs/python39/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 147, in main
    worker(index)
  File "/public3/home/scg8978/.conda/envs/python39/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 118, in worker
    abacus_parse(abspath, os.path.abspath(relpath), 'OUT.' + abacus_suffix)
  File "/public3/home/scg8978/.conda/envs/python39/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 160, in abacus_parse
    assert line is not None, 'Cannot find "NSPIN" in log file'
AssertionError: Cannot find "NSPIN" in log file
My preprocess.ini is
[basic]
raw_dir =./dataset_3_graphene
processed_dir =./preprocessed_3_graphene
target = hamiltonian
interface = abacus

[interpreter]
julia_interpreter = julia

[graph]
radius = -1
create_from_DFT = True
Although I have successfully learned the DeepH workflow by running demo_abacus on the Bohrium platform, I am still a beginner and cannot manage to run the demo on our supercomputer. I want to ask how I can solve this problem; I am looking forward to your reply.
Best wishes.

I have a problem when I run Hamiltonian inference

Hello, I got this error during inference, but when I run the previously calculated material it is fine, and the settings are all the same. What could be the cause of this?

=> Atomic types: [32], spinful: True, the number of atomic types: 1.
Save processed graph to /mnt/raid1/work/zhang/Ge/4.27/work_dir/inference/7.34/graph.pkl, cost 38.801493644714355 seconds
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [02:56<00:00, 176.66s/it]
Traceback (most recent call last):
  File "/mnt/raid1/work/zhang/miniconda3/envs/deeph/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/mnt/raid1/work/zhang/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/scripts/inference.py", line 124, in main
    predict(input_dir=work_dir, output_dir=work_dir, disable_cuda=disable_cuda, device=device,
  File "/mnt/raid1/work/zhang/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/inference/pred_ham.py", line 167, in predict
    assert np.all(np.isnan(hamiltonian) == False)
AssertionError

Here are the settings
OLP_dir = /mnt/raid1/work/zhang/Ge/4.27/work_dir/olp/5_4/7.34
work_dir = /mnt/raid1/work/zhang/Ge/4.27/work_dir/inference/7.34
structure_file_name = POSCAR
task = [1, 2, 3, 4, 5]
sparse_calc_config = /mnt/raid1/work/zhang/Ge/4.27/work_dir/inference/5_4/band.json
trained_model_dir = /mnt/raid1/work/zhang/Ge/4.27/work_dir/Ge_4.27
restore_blocks_py = True
eigen_solver = sparse_jl

[interpreter]
julia_interpreter = /mnt/raid1/work/zhang/julia-1.8.3/bin/julia

[graph]
radius = -1.0
create_from_DFT = True

The problem during the inference stage

  1. During the inference stage, is it possible to execute only task 5 and save tasks 1-4 as files for reading? Are there any parameters provided to achieve this?

  2. In the band.json file, which_k = 0 means using one thread to calculate all k points. Does which_k = [0,1,2,3,4,5] mean that one thread calculates the first 0-5 k points? Will which_k = [0] calculate all k points or only the first k point?

Thank you for your response.
tzuching

how to plot the result of the dft

Hello, I have finished the prediction with DeepH-E3, and now I want to plot a figure like the one in the paper, but I don't know how to plot the DFT result. Can you tell me how, or give me a pointer to it? Thanks.

an Assertion error in step 3 get_pred_Hamiltonian of Inference part

Hi there,

After successfully obtaining the trained model and the olp (overlap) matrix, I ran the inference part and met an error like this:

=> load best checkpoint (epoch 5969)
=> Atomic types: [52, 74], spinful: True, the number of atomic types: 2.
Load processed graph from /share/home/zhangtao/work/xxxx/xxxx/work_dir/inference/graph.pkl
Traceback (most recent call last):
  File "/share/home/zhangtao/anaconda3/envs/ZT-py39/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/share/home/zhangtao/anaconda3/envs/ZT-py39/lib/python3.9/site-packages/deeph/scripts/inference.py", line 105, in main
    predict(input_dir=work_dir, output_dir=work_dir, disable_cuda=disable_cuda, device=device,
  File "/share/home/zhangtao/anaconda3/envs/ZT-py39/lib/python3.9/site-packages/deeph/inference/pred_ham.py", line 167, in predict
    assert np.all(np.isnan(hamiltonian) == False)
AssertionError
Here I also list the inference.ini settings:
[basic]
OLP_dir = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/olp
work_dir = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/inference
interface = openmx
structure_file_name = POSCAR
task = [1, 2, 3, 4, 5]
sparse_calc_config = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/inference/band.json
trained_model_dir = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/trained_model
restore_blocks_py = True
dense_calc = True
disable_cuda = False
device = cuda:0
huge_structure = True

[interpreter]
julia_interpreter = /share/home/zhangtao/software/julia-1.6.6/bin/julia

[graph]
radius = 9.0
create_from_DFT = True

band setting:
{
"calc_job": "band",
"which_k": 0,
"fermi_level": 0,
"lowest_band": -10.3,
"max_iter": 300,
"num_band": 100,
"k_data": ["20 0.5000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 X Γ", "20 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.5000000000 0.0000000000 Γ Y", "20 0.0000000000 0.5000000000 0.0000000000 0.5000000000 0.5000000000 0.0000000000 Y M","20 0.5000000000 0.5000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 M Γ"]
}
I have tried to find out the reason, but failed. I would greatly appreciate your kind help if you could give me some advice on this error.

Best regards,
Tao
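A hedged debugging sketch for the assert in deeph/inference/pred_ham.py quoted above: reporting where NaNs occur before the assertion fires can narrow the problem down. Here hamiltonian is only a stand-in for the array assembled in predict(); why the model outputs NaN (e.g. a mismatch between the inference structure and the trained model's cutoff or element types) still has to be tracked down separately:

    import numpy as np

    def report_nan(hamiltonian: np.ndarray) -> None:
        """Print how many entries are NaN and the first few of their indices."""
        nan_idx = np.argwhere(np.isnan(hamiltonian))
        print(f"{len(nan_idx)} NaN entries out of {hamiltonian.size}")
        print(nan_idx[:10])

    report_nan(np.array([[0.0, np.nan], [1.0, 2.0]]))  # example usage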

Merge DeepH-E3 and xDeepH into the current repository

Merge two methods, DeepH-E3 and xDeepH, into DeepH-pack. This will allow us to better integrate the DeepH series methods and maintain them more effectively.

Here are the details of the two methods:

  1. DeepH-E3
  2. xDeepH

a user warning during inference step

Hi there,
when I am doing the inference part, all steps before the inference step finished normally, but the program gives me the warning below. Is this a warning we can ignore or not? If not, how can we avoid (remove) it?

/share/home/zhangtao/anaconda3/envs/ZT-py39/lib/python3.9/site-packages/deeph/kernel.py:53: UserWarning: Unable to copy scripts
warnings.warn("Unable to copy scripts")

Here I also list my inference setting:

[basic]
work_dir = /work/deeph-test/workdir/inference3
OLP_dir = /work/deeph-test/workdir/olp
interface = openmx
structure_file_name = POSCAR
trained_model_dir = /work/deeph-test/workdir/trained_model/2023-04-19_11-29-45
task = [1, 2, 3, 4, 5]
sparse_calc_config = /work/deeph-test/workdir/inference3/band.json
dense_calc = True
disable_cuda = False
device = cuda:0
huge_structure = True

gen_rc_idx = False
gen_rc_by_idx =
with_grad = False

[interpreter]
julia_interpreter = ***/software/julia-1.6.6/bin/julia

[graph]
radius = -1.0
create_from_DFT = True

The band.json setting is:
{
"calc_job": "band",
"which_k": 0,
"fermi_level": 0,
"lowest_band": -10.3,
"max_iter": 300,
"num_band": 100,
"k_data": ["46 0.3333333333333333 0.6666666666666667 0 0 0 0 K Γ", "28 0 0 0 0.5 0.5 0 Γ M", "54 0.5 0.5 0 0.6666666666666667 0.3333333333333333 0 M K'"]
}

One more question:

the program seems to be stuck at 3.get_pred_Hamiltonian, because the output files in the working directory are not being updated. The latest update time was 12:47, 26/04/2023; after that the files never changed, but the program is still running now (17:47, 26/04/2023).

Much appreciation for your kind help.

Best regards,

Error while preprocessing in TBB

Dear developer, I encountered the following error while executing the second preprocessing.
Found 20 directories to preprocess
Preprocessing No. 1/20...
/Bin/sh:/public/apps/julia-1.54/bin/jula:
Traceback (most recent call last):
  File "/public/apps/miniconda3/bin/deeph-preprocess", line 8, in <module>
    sys.exit(main())
  File "/public/apps/miniconda3/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 64, in main
    assert capture_output.returncode == 0
AssertionError
My preprocess.ini is as follows:
[basic]
raw_dir = /public/wcd/twisted/example/work_dir/dataset/raw
processed_dir = /public/wcd/twisted/example/work_dir/dataset/processed
target = hamiltonian
interface = openmx
multiprocessing = 48
local_coordinate = Ture
get_S = False

[interpreter]
python_interpreter = /public/apps/miniconda3/bin/python3
julia_interpreter = /public/apps/julia-1.5.4/bin/juila
[graph]
radius = 9.0
create_from_DFT = True
After trying to read the source code, I couldn't figure out why capture_output.returncode is not 0. Could you help me with this problem?
Thank you for your response.
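The /bin/sh line quoted above suggests the shell could not execute the configured Julia interpreter. A minimal, hedged check, using the julia_interpreter path copied from the preprocess.ini above:

    import os

    # Path copied verbatim from the preprocess.ini quoted above.
    julia_interpreter = "/public/apps/julia-1.5.4/bin/juila"

    # Both should print True for a usable interpreter; False would explain the
    # non-zero returncode checked in deeph/scripts/preprocess.py.
    print(os.path.isfile(julia_interpreter))
    print(os.access(julia_interpreter, os.X_OK))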

preprocess assert error SpinP_switch >> 2 == 3

Hi, I used gen_example.py to reproduce Fig. 6c in the paper. The first step, the DFT calculation, works well, but the second step, preprocessing, fails with the assertion error SpinP_switch >> 2 == 3. I didn't change any settings after generation. I tried to run the DFT calculation again, but it doesn't help.

Stacktrace:
 [1] parse_openmx(::String; return_DM::Bool) at /deeph/preprocess/openmx_get_data.jl:156
 [2] get_data(::String, ::Float64; if_DM::Bool) at /deeph/preprocess/openmx_get_data.jl:317
 [3] top-level scope at /deeph/preprocess/openmx_get_data.jl:420
 [4] include(::Function, ::Module, ::String) at ./Base.jl:380
 [5] include(::Module, ::String) at ./Base.jl:368
 [6] exec_options(::Base.JLOptions) at ./client.jl:296
 [7] _start() at ./client.jl:506

error while preprocess in demo-abacus

Hello Dr. Li, I am a beginner in DeepH.
When I tried to replicate the "demo: train the DeepH model using the ABACUS interface" on the Bohrium platform, I encountered an error when I ran step 2 (preprocess); the command line displays:

preprocess.sh: 1: deeph-preprocess: not found

My preprocess.ini is
[basic]
raw_dir = /data/demo_abacus/2_preprocess_dataset/3_dataset_graphene
processed_dir = ./preprocessed_3_graphene
target = hamiltonian
interface = abacus

[interpreter]
julia_interpreter = /home/lihe/apps/julia-1.5.4/bin/julia

[graph]
radius = -1
create_from_DFT = True

Because I am a beginner, I strictly followed the steps of the DeepH tutorial on Bilibili. I want to ask how I can solve this problem; I am looking forward to your reply.
Best wishes.

I get a problem with loss=0 in the 3_train process

I have completed both the step-1 DFT calculation and the step-2 preprocessing, and now I am encountering loss = 0 during step-3 training (3_train). I am sure that the logs for the first two steps are finished and contain data. Since the first step takes too long, I would like to run a small-batch test for 3_train, but whether I train on 10 or on 100 folders, the result is the same: loss = 0. Do I have to process all the data in folders 0-575 before I can proceed with 3_train?

how to get fermi energy

I saw your discussion in another issue about how to get the Fermi energy:

"The most direct way to obtain the Fermi energy is to use the sparse_calc.jl script in DeepH-pack. Starting from the lowest band, one can use sparse_calc.jl to calculate a certain number of bands, equal to the number of valence electrons, thus obtaining the Fermi energy. This approach does not require using other software or relying on guessed values."

But I don't understand what "lowest band" means: is the Fermi level set to this value when setting up band.json, and how is this value obtained? At the same time, according to your algorithm, the sparse calculation computes the bands closest to the given Fermi energy; if the number of bands equals the number of valence electrons, the computed bands should lie both above and below the given value, so which number is then the Fermi energy?

How to calculate the eigenvalues of different k points in parallel in the inference step?

Hello, I wonder how to calculate the eigenvalues of different k points in parallel in the inference step?

[which_k : Define which point in k-path to calculate, start counting from 1. You can set it ‘0’ for all k points, or ‘-1’ for no point. It is recommended to calculate the eigenvalues of different k points in parallel through it. (Invalid for dense matrix calculation)]
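One hedged way to parallelize over k, following the which_k semantics quoted above (0 means all k points; a positive index selects a single k point, counting from 1): write one band_<k>.json per k point, then point each parallel job's sparse_calc_config at its own file, e.g. with task = [5] so steps 1-4 are not repeated. The k-point count below is an assumption for illustration:

    import copy
    import json

    base = json.load(open("band.json"))   # an existing band.json as template
    n_k = 31                              # assumed total number of k points on the path

    for k in range(1, n_k + 1):
        cfg = copy.deepcopy(base)
        cfg["which_k"] = k                # this job computes only the k-th point
        with open(f"band_{k}.json", "w") as f:
            json.dump(cfg, f, indent=2)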

demo_abacus with ABACUS 3.1

Hello.

I'm trying to follow the "demo_abacus" with ABACUS 3.1.

I've made raw data using ABACUS 3.1.

After that, I tried the deeph-preprocess.

deeph-preprocess --config preprocess.ini

And it gives:

User config name: ['preprocess.ini']
Found 400 directories to preprocess
Preprocessing No. 1/400 [09:00:00<?]...Output subdirectories: OUT.ABACUS
Traceback (most recent call last):
  File "/home/whal1235/miniconda3/bin/deeph-preprocess", line 8, in <module>
    sys.exit(main())
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 147, in main
    worker(index)
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 118, in worker
    abacus_parse(abspath, os.path.abspath(relpath), 'OUT.' + abacus_suffix)
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 246, in abacus_parse
    hamiltonian_dict, tmp = parse_matrix(
  File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 204, in parse_matrix
    num_element = int(line1[3])
ValueError: invalid literal for int() with base 10: 'H(R):'

I think it comes from an output file format difference between ABACUS 2.xx and ABACUS 3.1.

This is an example of an ABACUS 3.1 output file:

OUT.ABACUS.zip

Can you fix it, or do you recommend that I install ABACUS 2.xx?

Thank you.

Process killed when inferencing MATBG

[network]
atom_fea_len=64
edge_fea_len=128
gauss_stop=6.0
num_l=5
aggr=add
distance_expansion=GaussianBasis
if_exp=True
if_multiplelinear=False
if_edge_update=True
if_lcmp=True
normalization=LayerNorm
atom_update_net=CGConv
trainable_gaussians=False
type_affine=False

=> load best checkpoint (epoch 3333)
=> Atomic types: [6], spinful: False, the number of atomic types: 1.
Killed
