mzjb / deeph-pack
Deep neural networks for density functional theory Hamiltonian.
License: GNU Lesser General Public License v3.0
Hi there,
When I tested your example file (gen_example.py) with all the execution paths properly specified, I met a bug:
File "/public/home/ZT/software/DeepH-pack-main/gen_example.py", line 118, in <module>
stru_shift_pert.to('poscar', f'../example/work_dir/dataset/raw/{shift_index}/POSCAR')
File "/public/home/ZT/.conda/envs/ZT-py39/lib/python3.9/site-packages/pymatgen/core/structure.py", line 2561, in to
raise ValueError(f"Invalid format: {str(fmt)}")
ValueError: Invalid format: ../example/work_dir/dataset/raw/0/poscar
It seems the call in gen_example.py is no longer recognized by pymatgen; my pymatgen version is pymatgen-2023.1.30. If you could test it or give me some advice, I would greatly appreciate your help.
best regards,
ZT
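For what it's worth, a likely cause (an assumption on my part, consistent with the traceback but not confirmed by the developers) is that recent pymatgen releases changed the parameter order of Structure.to so that filename comes before fmt, which makes the path land in the fmt slot. A minimal sketch of the pitfall using stand-in functions, not pymatgen itself:

```python
# Stand-ins for the two (hypothetical) signatures of Structure.to:
# old order: to(fmt=None, filename=None); new order: to(filename=None, fmt=None).
def to_old(fmt=None, filename=None):
    return fmt  # which value ends up being treated as the format?

def to_new(filename=None, fmt=None):
    return fmt

# Positional call as in gen_example.py: fine with the old order...
assert to_old("poscar", "raw/0/POSCAR") == "poscar"
# ...but with the new order the path is taken as the format -> "Invalid format"
assert to_new("poscar", "raw/0/POSCAR") == "raw/0/POSCAR"
# Keyword arguments are unambiguous under either signature:
assert to_old(fmt="poscar", filename="raw/0/POSCAR") == "poscar"
assert to_new(fmt="poscar", filename="raw/0/POSCAR") == "poscar"
print("keyword arguments are safe under both signatures")
```

If this assumption about the signature change holds, calling `stru_shift_pert.to(filename=..., fmt='poscar')` with explicit keywords should work on both old and new pymatgen.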
Dear developer,
When I run step 2 (preprocess) on TBB, I get the error "Failed to preprocess". While searching for the cause in the source code, I found this piece of code at line 89 of preprocess.py:
if capture_output.returncode != 0:
    with open(os.path.join(os.path.abspath(relpath), 'error.log'), 'w') as f:
        f.write(f'[stdout of cmd "{cmd}"]:\n\n{capture_output.stdout}\n\n\n'
                f'[stderr of cmd "{cmd}"]:\n\n{capture_output.stderr}')
    print(f'\nFailed to preprocess: {abspath}, '
          f'log file was saved to {os.path.join(os.path.abspath(relpath), "error.log")}')
My preprocess.ini is:
[basic]
raw_dir = /public/wcd/twisted/example/work_dir/dataset/raw
processed_dir = /public/wcd/twisted/example/work_dir/dataset/processed
target = hamiltonian
interface = openmx
multiprocessing = 0
local_coordinate = True
get_S = False
[interpreter]
python_interpreter = /public/apps/miniconda3/bin/python3.9
julia_interpreter = /public/apps/julia-1.5.4/bin/juila
[graph]
radius = -1.0
create_from_DFT = True
I want to ask how I can solve this problem.
Best regards.
Hi,
I found that the bilayer structures in the datasets of your article all consist of the same two layers. I would like to know whether DeepH can be applied to van der Waals heterojunctions?
Best regards.
I tried to install OpenMX, but it is too hard. Is it possible to get the raw training data directly?
Dear developer, I tried to train on the TBG dataset but failed for several reasons.
I took 10 of the structures and tested them on two devices.
CPU cluster:
/home/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/graph.py:664: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:210.)
read_terms[key] = torch.tensor(v, dtype=default_dtype_torch)
100%|##########| 10/10 [42:54<00:00, 257.49s/it]
Finish processing 10 structures, have cost 2575 seconds
Finish saving 10 structures to ./graph/HGraph-h5-test-5l-FromDFT.pkl, have cost 2577 seconds
Atomic types: [6]
Finish loading the processed 10 structures (spinful: False, the number of atomic types: 1), cost 0 seconds
number of train set: 6
number of val set: 2
number of test set: 2
{'normalizer': False, 'boxcox': False}
Output features length of single edge: 169
The model you built has: 493210 parameters
VAL loss each out:
[7.6e+00, 2.1e+00, 6.2e+00, 3.7e-01, 7.8e-02, 5.2e+00, 5.7e-01, 5.8e-01, 4.4e+00, 8.7e-01, 4.0e+00, 4.0e-01, 1.9e-01, 1.8e+00, 2.2e+00, 2.7e+00, 1.7e-01, 2.0e+00, 9.1e-01, 6.4e-01, 2.0e-02, 2.3e-01, 8.2e-02, 6.0e-01, 7.3e-01, 1.4e+00, 4.3e+00, 1.6e+00, 4.6e+00, 1.1e+00, 2.6e+00, 2.9e+00, 2.2e-01, 6.4e-01, 1.4e+00, 5.9e-01, 1.3e-02, 1.1e+00, 1.2e+00, 2.1e-01, 4.2e-01, 4.4e-01, 2.0e+00, 4.3e-01, 7.2e-02, 1.7e+00, 3.1e-01, 1.7e+00, 8.4e-02, 1.0e+00, 2.4e-02, 3.3e+00, 1.3e-02, 4.4e-01, 6.0e-01, 1.1e-01, 2.9e+00, 2.2e+00, 1.3e-01, 2.0e+00, 2.1e+00, 2.1e-01, 1.6e-01, 9.3e-01, 4.1e-02, 3.2e+00, 5.7e-01, 2.7e+00, 1.2e-01, 2.9e+00, 3.3e+00, 8.2e-02, 6.5e-02, 3.7e-01, 1.5e-01, 5.7e-01, 4.3e-01, 8.4e-02, 5.3e-03, 1.4e-01, 1.0e+00, 1.7e+00, 2.1e-02, 2.2e-01, 2.5e+00, 2.5e-01, 1.3e-01, 1.8e-01, 1.2e+00, 2.0e-01, 3.9e-02, 1.4e-02, 8.6e-01, 1.3e+00, 5.2e-01, 3.4e+00, 2.2e-01, 7.3e+00, 1.9e+00, 1.5e-01, 1.5e-01, 2.3e-01, 1.2e+00, 4.6e-01, 8.9e-01, 3.2e-01, 3.4e+00, 2.7e+00, 1.5e+00, 6.0e-01, 3.8e+00, 6.4e+00, 4.0e+00, 6.3e-01, 1.5e+00, 1.2e-01, 4.2e+00, 8.9e+00, 6.0e-01, 2.7e+00, 1.6e-01, 1.4e+00, 2.7e+00, 6.3e-01, 1.2e+00, 3.8e-01, 5.3e+00, 2.9e-01, 4.3e-02, 9.2e-02, 4.7e-01, 5.3e-01, 4.4e-01, 1.2e+00, 2.0e+00, 7.8e-01, 8.1e-01, 2.6e-01, 6.5e-02, 5.9e-02, 7.5e+00, 8.3e-01, 6.6e-02, 2.0e-01, 3.0e+00, 4.6e-01, 4.8e-01, 9.3e-01, 1.8e+00, 7.8e-01, 4.1e-01, 3.1e-02, 1.8e-01, 1.4e-01, 5.9e+00, 4.5e+00, 2.2e+00, 4.5e-01, 5.4e-02, 3.3e+00, 4.0e-01, 8.1e-01, 4.2e-01, 3.4e-01, 5.1e-01, 3.3e-01, 1.4e+00, 6.2e-02, 6.5e+00]
max orbital: 8.9e+00 (0-based index: 117)
Traceback (most recent call last):
File "/home/k0171/k017113/miniconda3/envs/deeph/bin/deeph-train", line 8, in <module>
sys.exit(main())
File "/home/k0171/k017113/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/scripts/train.py", line 20, in main
kernel.train(train_loader, val_loader, test_loader)
File "/home/k0171/k017113/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/kernel.py", line 565, in train
save_model({
File "/home/k0171/k017113/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/utils.py", line 133, in save_model
with package.PackageExporter(model_dir, verbose=False) as exp:
TypeError: __init__() got an unexpected keyword argument 'verbose'
It looks like CPU-only PyTorch is running slowly for some reason, and the training stops after only one step.
WSL2 with GPU:
There is no error message on this device; however, the process is stuck here:
Graph data file: HGraph-h5-test-5l-FromDFT.pkl
Process new data file......
Found 10 structures, have cost 0 seconds
Use multiprocessing
10%|# | 1/10 [00:54<08:08, 54.24s/it]
After processing one structure, the process stops working completely, and the CPU is not used at all.
The training is fine if I only take 1 structure.
Do you know how to solve these problems?
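Regarding the slow CPU run: the UserWarning at the top of that log points to one concrete speedup, stacking the list of ndarrays into a single array before building the tensor. A generic sketch (not the actual DeepH-pack code):

```python
import numpy as np
import torch

arrays = [np.random.rand(4) for _ in range(1000)]

# torch.tensor(arrays) would work but triggers the slow per-element path
# (and the UserWarning); stacking into one ndarray first avoids both.
t = torch.tensor(np.array(arrays), dtype=torch.float64)
print(tuple(t.shape))  # (1000, 4)
```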
Hello, because of some hardware limits, I'm trying to use newer versions of PyTorch and PyG with this package. Do you already know which API changes caused the incompatibility of this package with PyTorch > 1.9 and PyTorch Geometric >= 2.0?
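One concrete API change of this kind shows up in the earlier traceback: torch.package.PackageExporter dropped its `verbose` keyword in newer PyTorch, which breaks the `save_model` call in deeph/utils.py. A hedged sketch of a compatibility shim, using stand-in classes rather than torch itself:

```python
import inspect

class OldExporter:  # stand-in for torch.package.PackageExporter on older torch
    def __init__(self, path, verbose=False):
        self.path = path

class NewExporter:  # stand-in for the newer API, which has no `verbose`
    def __init__(self, path):
        self.path = path

def make_exporter(cls, path):
    """Instantiate cls, passing verbose=False only if the constructor accepts it."""
    kwargs = {}
    if "verbose" in inspect.signature(cls.__init__).parameters:
        kwargs["verbose"] = False
    return cls(path, **kwargs)

# Both APIs now work through the same call site:
print(type(make_exporter(OldExporter, "model_dir")).__name__)  # OldExporter
print(type(make_exporter(NewExporter, "model_dir")).__name__)  # NewExporter
```

The same `inspect.signature` trick could be applied at the real call site in utils.py, though I have not checked whether other call sites break too.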
Hi,
When I try to reproduce the result of TBG in your paper 'Deep-Learning Density Functional Theory Hamiltonian for Efficient ab initio Electronic-Structure Calculation', I'm confused about what 'orbital basis number' means. Does it mean 'valence electrons', or something else?
Regards ,
TzuChing
ERROR: LoadError: SystemError: memory mapping failed: No such device
Stacktrace:
[1] systemerror(::String, ::Int32; extrainfo::Nothing) at ./error.jl:168
[2] #systemerror#48 at ./error.jl:167 [inlined]
[3] systemerror at ./error.jl:167 [inlined]
[4] mmap(::IOStream, ::Type{Array{UInt8,1}}, ::Tuple{Int64}, ::Int64; grow::Bool, shared::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Mmap/src/Mmap.jl:212
[5] #mmap#9 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Mmap/src/Mmap.jl:247 [inlined]
[6] mmap at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Mmap/src/Mmap.jl:247 [inlined] (repeats 2 times)
[7] #4 at /home/peiyuan/.julia/packages/JSON/NeJ9k/src/Parser.jl:510 [inlined]
[8] open(::JSON.Parser.var"#4#5"{DataType,DataType,Nothing,Bool,Bool,Int64}, ::String; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:325
[9] open at ./io.jl:323 [inlined]
[10] #parsefile#3 at /home/peiyuan/.julia/packages/JSON/NeJ9k/src/Parser.jl:509 [inlined]
[11] parsefile(::String) at /home/peiyuan/.julia/packages/JSON/NeJ9k/src/Parser.jl:508
[12] top-level scope at /home/peiyuan/anaconda3/envs/deep8/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:64
[13] include(::Function, ::Module, ::String) at ./Base.jl:380
[14] include(::Module, ::String) at ./Base.jl:368
[15] exec_options(::Base.JLOptions) at ./client.jl:296
[16] _start() at ./client.jl:506
in expression starting at /home/peiyuan/anaconda3/envs/deep8/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:64
Traceback (most recent call last):
File "/home/peiyuan/anaconda3/envs/deep8/bin/deeph-inference", line 8, in <module>
sys.exit(main())
File "/home/peiyuan/anaconda3/envs/deep8/lib/python3.9/site-packages/deeph/scripts/inference.py", line 129, in main
assert capture_output.returncode == 0
AssertionError
I'm running on a supercomputer, but it is very slow.
I want to know how much time we need.
Thanks.
Hello.
When I use OpenMX to calculate the overlap matrix of the predicted material, the openmx.out file is not generated, but there are other output files.
In the inference step, the following error occurs:
assert os.path.exists(os.path.join(OLP_dir, 'openmx.out')), "Necessary files could not be found in OLP_dir"
AssertionError: Necessary files could not be found in OLP_dir.
How can I solve it?
Thank you.
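If it helps, the assertion simply checks for a file literally named openmx.out in OLP_dir; one common convention (an assumption, please verify for your setup) is to redirect OpenMX's standard output into that file, e.g. `openmx openmx_in.dat > openmx.out`. A small sketch that reproduces the check before launching inference:

```python
import os
import tempfile

def olp_dir_ready(olp_dir):
    """Mirror the deeph-inference assertion: openmx.out must exist in OLP_dir."""
    return os.path.exists(os.path.join(olp_dir, "openmx.out"))

# demonstration with a throwaway directory standing in for OLP_dir
with tempfile.TemporaryDirectory() as d:
    print(olp_dir_ready(d))  # False: openmx.out is missing
    open(os.path.join(d, "openmx.out"), "w").close()
    print(olp_dir_ready(d))  # True
```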
Dear developer, may I ask why it is necessary to calculate the overlap matrix before using the model? If I could directly obtain the real-space Hamiltonian from the model, wouldn't it be possible to obtain any band through diagonalization?
Does the overlap matrix help improve efficiency, or does it have some other effect?
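For context (standard electronic-structure reasoning, not an official answer from the developers): in a non-orthogonal atomic-orbital basis the bands come from a generalized eigenvalue problem, so the overlap matrix is needed for the diagonalization itself, not just for efficiency:

```latex
% Bands in a non-orthogonal basis: generalized eigenvalue problem
H(\mathbf{k})\, c_{n\mathbf{k}} = \varepsilon_{n\mathbf{k}}\, S(\mathbf{k})\, c_{n\mathbf{k}},
\qquad S_{ij}(\mathbf{k}) \neq \delta_{ij} \ \text{for atomic orbitals.}
```

With $S = I$ (an orthogonal basis) this would reduce to ordinary diagonalization, but the numerical atomic orbitals used by OpenMX and ABACUS overlap between neighboring atoms.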
Hi, when I try to run step 5 with task = [5] and calculate only one k-point (the 1st), here are my inference parameters:
############################################
[basic]
OLP_dir = /home/zjlin/tbg_Deep_ex/work_dir/olp/TBG_1.05/
work_dir = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/
structure_file_name = POSCAR
interface = openmx
task = [5]
sparse_calc_config = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/band_1.json
trained_model_dir = /home/zjlin/tbg_Deep_ex/work_dir/trained_model/2022-12-21_21-30-31
restore_blocks_py = False
[interpreter]
julia_interpreter = /home/zjlin/julia-1.5.4/bin/julia
[graph]
radius = 9.0
create_from_DFT = True
#############################################
and band.json:
#############################################
{
"calc_job": "band",
"which_k": 1,
"fermi_level": -3.8624886706842148,
"lowest_band": -0.3,
"max_iter": 300,
"num_band": 125,
"k_data": ["10 0.000 0.000 0.000 0.500 0.000 0.000 G M ","10 0.500 0.000 0.000 0.333333333 0.333333333 0.000 M K","10 0.333333333 0.333333333 0.000 0.000 0.000 0.000 K G"]
}
##############################################
But I got an error:
##############################################
[ Info: read h5
[ Info: construct Hamiltonian and overlap matrix in the real space
ERROR: LoadError: OutOfMemoryError()
Stacktrace:
[1] Array at ./boot.jl:424 [inlined]
[2] Array at ./boot.jl:432 [inlined]
[3] zeros at ./array.jl:525 [inlined]
[4] zeros(::Type{Complex{Float64}}, ::Int64, ::Int64) at ./array.jl:521
[5] top-level scope at /home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:121
[6] include(::Function, ::Module, ::String) at ./Base.jl:380
[7] include(::Module, ::String) at ./Base.jl:368
[8] exec_options(::Base.JLOptions) at ./client.jl:296
[9] _start() at ./client.jl:506
in expression starting at /home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:105
Traceback (most recent call last):
File "/home/zjlin/anaconda3/envs/pytorch/bin/deeph-inference", line 8, in
sys.exit(main())
File "/home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/scripts/inference.py", line 131, in main
assert capture_output.returncode == 0
AssertionError
###############################################
Here's my question:
When I calculate TBG at the 1.05° twist angle, there are 11908 atoms. How much memory do we need to prepare?
Best regards
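A rough back-of-the-envelope estimate (assuming 13 numerical orbitals per carbon atom, as for an s2p2d1 OpenMX basis, and complex double precision) shows why the dense solver hits OutOfMemoryError at this size:

```python
# Dense H or S matrix size for 1.05-degree TBG, assuming 13 orbitals per C atom
# (s2p2d1 basis: 2*1 + 2*3 + 1*5 = 13) and 16 bytes per complex128 entry.
n_atoms = 11908
n_orb = n_atoms * 13                 # ~155k basis functions
bytes_per_matrix = n_orb ** 2 * 16
print(f"{bytes_per_matrix / 1024**3:.0f} GiB per dense matrix")  # ~357 GiB
```

Since the dense path needs at least H and S at that size, a sparse eigensolver restricted to a few k-points is usually the only practical route for structures this large.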
It produced this error.log:
[stdout of cmd "/data/home/scv7f2x/run/julia-1.6.6/bin/julia /data/home/scv7f2x/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/raw/0 --output_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/processed/0 --save_overlap false"]:
[stderr of cmd "/data/home/scv7f2x/run/julia-1.6.6/bin/julia /data/home/scv7f2x/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/raw/0 --output_dir /data/run01/scv7f2x/software-scv7f2x/DeepH-pack/example/work_dir/dataset/processed/0 --save_overlap false"]:
ERROR: LoadError: AssertionError: DeepH-pack only supports OpenMX v3.9. Please check your OpenMX version
Stacktrace:
[1] parse_openmx(filepath::String; return_DM::Bool)
@ Main ~/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:160
[2] get_data(filepath_scfout::String, Rcut::Float64; if_DM::Bool)
@ Main ~/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:321
[3] top-level scope
@ ~/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:424
in expression starting at /data/home/scv7f2x/.conda/envs/dl/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl:424
I'm sure that I installed OpenMX 3.9, so I don't know how to solve it.
The raw directory contains: output, openmx.cif, openmx.out, openmx.scfout, openmx.std, openmx.xyz, openmx_in.dat, poscar.
How can I solve this? Thanks.
Thank you for publishing the wonderful code.
I would like to ask about the meaning of variable 'phiVdphi'.
I believe this physical quantity was not even mentioned in the DeepH paper.
If this is something that can be output from OpenMX, I would appreciate knowing how to generate it.
Hello.
I'm trying to follow the demo_abacus.
In the inference step
deeph-inference --config inference.ini
the following error occurs:
Output subdirectories: OUT.ABACUS
Traceback (most recent call last):
File "/home/whal1235/miniconda3/bin/deeph-inference", line 8, in <module>
sys.exit(main())
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/inference.py", line 78, in main
abacus_parse(OLP_dir, work_dir, data_name=f'OUT.{abacus_suffix}', only_S=True)
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 82, in abacus_parse
assert "WELCOME TO ABACUS" in line
AssertionError
And here is my ABACUS log file:
ABACUS v3.1
Atomic-orbital Based Ab-initio Computation at UStc
Website: http://abacus.ustc.edu.cn/
Documentation: https://abacus.deepmodeling.com/
Repository: https://github.com/abacusmodeling/abacus-develop
https://github.com/deepmodeling/abacus-develop
Start Time is Wed Mar 8 13:45:05 2023
------------------------------------------------------------------------------------
READING GENERAL INFORMATION
global_out_dir = OUT.ABACUS/
global_in_card = INPUT
pseudo_dir =
orbital_dir =
DRANK = 1
DSIZE = 1
DCOLOR = 1
GRANK = 1
GSIZE = 1
The esolver type has been set to : ksdft_lcao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
| |
| Reading atom information in unitcell: |
| From the input file and the structure file we know the number of |
| different elments in this unitcell, then we list the detail |
| information for each element, especially the zeta and polar atomic |
| orbital number for each element. The total atom number is counted. |
| We calculate the nearest atom distance for each atom and show the |
| Cartesian and Direct coordinates for each atom. We list the file |
| address for atomic orbitals. The volume and the lattice vectors |
| in real and reciprocal space is also shown. |
| |
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
READING UNITCELL INFORMATION
ntype = 1
atom label for species 1 = C
lattice constant (Bohr) = 1.88972
lattice constant (Angstrom) = 0.999996
READING ATOM TYPE 1
atom label = C
L=0, number of zeta = 2
L=1, number of zeta = 2
L=2, number of zeta = 1
number of atom for this type = 100
start magnetization = FALSE
start magnetization = FALSE
start magnetization = FALSE
start magnetization = FALSE
I cannot find any "WELCOME TO ABACUS" line.
Does this occur only with ABACUS v3? Or have I done something wrong?
Can I just remove this assert line from the Python code?
Thank you.
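Until the interface is updated, a hedged sketch of a relaxed banner check (the newer log shown above starts with an "ABACUS vX.Y" header instead of the old "WELCOME TO ABACUS" line; the function name here is illustrative, not the actual DeepH-pack patch):

```python
def looks_like_abacus_log(line):
    """Accept both the old banner and the newer version header."""
    return "WELCOME TO ABACUS" in line or line.strip().startswith("ABACUS v")

print(looks_like_abacus_log(" WELCOME TO ABACUS "))          # True  (older logs)
print(looks_like_abacus_log("ABACUS v3.1"))                  # True  (newer logs)
print(looks_like_abacus_log("READING GENERAL INFORMATION"))  # False
```

Simply deleting the assert would also get past this line, but a relaxed check keeps the protection against parsing a non-ABACUS file.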
As the author shows, we can use a dataset of small structures to predict large-scale material systems. But I am confused about what counts as an "appropriate dataset of small structures".
To developer,
When I use the examples of the Bi and graphene systems, I notice some parameters (revert_then_decay, revert_decay_epoch, revert_decay_gamma) in the train.ini file which are not explained on the manual website. Therefore, if we start a calculation on a new system, I want to ask whether these parameters need to be considered.
Best regards.
An error occurred when using the deeph-inference function to obtain overlaps.h5 from ABACUS's get_S calculation.
The error message is included in the deeph-inference.log file.
To save space, I emptied the SR.csr file.
get_S_process.zip
Hi, dear developer.
When I am doing the inference part, the following error occurred:
Begin 1.parse_Overlap
Traceback (most recent call last):
File "/fs1/home/qijingshan/miniconda3/envs/copy/bin/deeph-inference", line 8, in <module>
sys.exit(main())
File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/deeph/scripts/inference.py", line 63, in main
openmx_parse_overlap(OLP_dir, work_dir, os.path.join(OLP_dir, structure_file_name))
File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/deeph/preprocess/openmx_parse.py", line 76, in openmx_parse_overlap
structure = Structure.from_file(stru_dir)
File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/pymatgen/core/structure.py", line 2676, in from_file
s = cls.from_str(contents, fmt="poscar", primitive=primitive, sort=sort, merge_tol=merge_tol, **kwargs)
File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/pymatgen/core/structure.py", line 2594, in from_str
s = Poscar.from_string(input_string, False, read_velocities=False, **kwargs).structure
File "/fs1/home/qijingshan/miniconda3/envs/copy/lib/python3.9/site-packages/pymatgen/io/vasp/inputs.py", line 390, in from_string
toks = lines[ipos + 1 + i].split()
IndexError: tuple index out of range
It looks a bit strange. What should I do to avoid this error?
Here is my train.ini
[basic]
OLP_dir = /fs1/home/qijingshan/wcd/TBB/example/work_dir/olp/5_4
work_dir = /fs1/home/qijingshan/wcd/TBB/example/work_dir/inference/5_4
structure_file_name = POSCAR
task = [1, 2, 3, 4, 5]
sparse_calc_config = /fs1/home/qijingshan/wcd/TBB/example/work_dir/inference/5_4/band.json
trained_model_dir = /fs1/home/qijingshan/wcd/TBB/example/work_dir/trained_model
restore_blocks_py = True
[interpreter]
julia_interpreter = /fs1/home/qijingshan/julia-1.6.6/bin/juila
[graph]
radius = 9.0
create_from_DFT = True
Hi,
When I run the example using ABACUS, everything goes well except that the training is canceled by the Slurm system due to the time limit.
Because the maximum time for using the GPU accelerator card is limited per submission on our cluster, I wonder if I can restart the training in the next submission?
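I have not verified what restart mechanism DeepH-pack itself offers, but the generic PyTorch pattern for surviving a Slurm time limit is to checkpoint the model and optimizer state and reload them in the next submission:

```python
import io
import torch

model = torch.nn.Linear(4, 2)
opt = torch.optim.Adam(model.parameters())

# save a checkpoint (to an in-memory buffer here; use a file on the cluster)
buf = io.BytesIO()
torch.save({"epoch": 10, "model": model.state_dict(), "opt": opt.state_dict()}, buf)

# next submission: restore the state and continue from the saved epoch
buf.seek(0)
ckpt = torch.load(buf)
model.load_state_dict(ckpt["model"])
opt.load_state_dict(ckpt["opt"])
print(ckpt["epoch"])  # 10
```

Whether DeepH-pack's train script exposes a flag to load such a checkpoint is something the developers would have to confirm.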
I'm trying to do the inference step in the "demo_abacus".
python calc_OLP_of_CNT.py
deeph-inference --config inference.ini
In the inference step, I met an error whose cause I cannot guess.
Here is my error:
####### Begin 5.sparse_calc
./abacus_CNT.json
[ Info: read h5
Time for reading h5: 2.841439962387085s
[ Info: construct Hamiltonian and overlap matrix in the real space
Time for constructing Hamiltonian and overlap matrix in the real space: 0.7667179107666016 s
[ Info: calculate bands
ERROR: LoadError: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
[1] chkposdef
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lapack.jl:50 [inlined]
[2] sygvd!(itype::Int64, jobz::Char, uplo::Char, A::Matrix{ComplexF64}, B::Matrix{ComplexF64})
@ LinearAlgebra.LAPACK /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lapack.jl:5337
[3] #eigen!#102
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/symmetric.jl:832 [inlined]
[4] eigen!
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/symmetric.jl:832 [inlined]
[5] eigen(A::Hermitian{ComplexF64, Matrix{ComplexF64}}, B::Hermitian{ComplexF64, Matrix{ComplexF64}}; kws::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:501
[6] eigen(A::Hermitian{ComplexF64, Matrix{ComplexF64}}, B::Hermitian{ComplexF64, Matrix{ComplexF64}})
@ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:500
[7] top-level scope
@ ~/miniconda3/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:162
in expression starting at /home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:136
Traceback (most recent call last):
File "/home/whal1235/miniconda3/bin/deeph-inference", line 8, in <module>
sys.exit(main())
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/inference.py", line 131, in main
assert capture_output.returncode == 0
AssertionError
Note that I did not follow the other steps before.
I used the graphene dataset from the paper (https://zenodo.org/record/6555484#.ZAseK3ZByUk)
and trained a model using "graphene.ini" from the git repository.
I guess the trained model is not the problem, because inference steps 1-4 completed without any error.
Do you think I should follow the data generation and training steps in "demo_abacus"?
Or does the error come from another reason?
Thank you.
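One way to narrow this down (my own debugging suggestion, not an official one): PosDefException from eigen(H, S) means the overlap matrix handed to the solver is not positive definite, which typically points at a mismatched or corrupted overlaps.h5 rather than at the trained model. A toy numpy check of the property that Cholesky needs:

```python
import numpy as np

def is_positive_definite(S):
    """Cholesky (and hence eigen(H, S)) requires every eigenvalue of S > 0."""
    return bool(np.linalg.eigvalsh(S).min() > 0)

S_good = np.array([[1.0, 0.3], [0.3, 1.0]])  # healthy 2x2 overlap block
S_bad = np.array([[1.0, 1.2], [1.2, 1.0]])   # |overlap| > 1: unphysical
print(is_positive_definite(S_good))  # True
print(is_positive_definite(S_bad))   # False -> would raise PosDefException
```

Running the same eigenvalue check on the assembled S(k) from the h5 files would tell you whether the overlap data, rather than the model, is the culprit.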
It can't be trained because graph.py raises a NotImplementedError. This exception occurs when the configuration parameter "interface" is not in ['h5', 'h5_rc_only'] and the parameter "create_from_DFT" is True. But if I change "interface" to "h5", the dataset can't find any structures, because the rc and rh files end with "npz".
To developer,
I am a beginner with DeepH. I am wondering whether DeepH supports parallel computing, for example using mpirun. Most users run it on multi-core systems, where jobs are executed in parallel. When I run DeepH, especially in the training step, I wish to execute the work on multiple cores (CPUs or GPUs). I have allocated multiple cores to the job via the Slurm job submission system, but the job still seems to run on one CPU or GPU. So I hope and suggest that you write more details on how to execute the program on multiple cores. This would make it more helpful and faster for people to use. Many thanks for your help.
best regards,
Tao
Hello Dr. Li, I am a beginner with DeepH.
When I tried to replicate the "demo: train the DeepH model using the ABACUS interface", I encountered a failure when I ran step 2 (preprocess); the command line displays:
User config name: ['preprocess.ini']
Found 3 directories to preprocess
Preprocessing No. 1/3 [08:00:00<?]...Output subdirectories: OUT.ABACUS
Traceback (most recent call last):
File "/public3/home/scg8978/.conda/envs/python39/bin/deeph-preprocess", line 33, in <module>
sys.exit(load_entry_point('deeph==0.2.2', 'console_scripts', 'deeph-preprocess')())
File "/public3/home/scg8978/.conda/envs/python39/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 147, in main
worker(index)
File "/public3/home/scg8978/.conda/envs/python39/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 118, in worker
abacus_parse(abspath, os.path.abspath(relpath), 'OUT.' + abacus_suffix)
File "/public3/home/scg8978/.conda/envs/python39/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 160, in abacus_parse
assert line is not None, 'Cannot find "NSPIN" in log file'
AssertionError: Cannot find "NSPIN" in log file
My preprocess.ini is
[basic]
raw_dir =./dataset_3_graphene
processed_dir =./preprocessed_3_graphene
target = hamiltonian
interface = abacus
[interpreter]
julia_interpreter = julia
[graph]
radius = -1
create_from_DFT = True
Although I have successfully worked through the DeepH process using demo_abacus on the Bohrium platform, I am still a beginner and cannot run the demo on our supercomputer. I want to ask how I can solve this problem. Looking forward to your reply.
Best wishes.
Hello, I got this error during prediction, but the previously calculated material runs fine and the settings are all the same. What could be the cause of this?
=> Atomic types: [32], spinful: True, the number of atomic types: 1.
Save processed graph to /mnt/raid1/work/zhang/Ge/4.27/work_dir/inference/7.34/graph.pkl, cost 38.801493644714355 seconds
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [02:56<00:00, 176.66s/it]
Traceback (most recent call last):
File "/mnt/raid1/work/zhang/miniconda3/envs/deeph/bin/deeph-inference", line 8, in <module>
sys.exit(main())
File "/mnt/raid1/work/zhang/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/scripts/inference.py", line 124, in main
predict(input_dir=work_dir, output_dir=work_dir, disable_cuda=disable_cuda, device=device,
File "/mnt/raid1/work/zhang/miniconda3/envs/deeph/lib/python3.9/site-packages/deeph/inference/pred_ham.py", line 167, in predict
assert np.all(np.isnan(hamiltonian) == False)
AssertionError
Here are the settings
OLP_dir = /mnt/raid1/work/zhang/Ge/4.27/work_dir/olp/5_4/7.34
work_dir = /mnt/raid1/work/zhang/Ge/4.27/work_dir/inference/7.34
structure_file_name = POSCAR
task = [1, 2, 3, 4, 5]
sparse_calc_config = /mnt/raid1/work/zhang/Ge/4.27/work_dir/inference/5_4/band.json
trained_model_dir = /mnt/raid1/work/zhang/Ge/4.27/work_dir/Ge_4.27
restore_blocks_py = True
eigen_solver = sparse_jl
[interpreter]
julia_interpreter = /mnt/raid1/work/zhang/julia-1.8.3/bin/julia
[graph]
radius = -1.0
create_from_DFT = True
During the inference stage, is it possible to execute only task 5 and save tasks 1-4 as files to be read? Are there any parameters provided to achieve this?
When setting the band.json file: which_k = 0 means using one thread to calculate all k-points. Does which_k = [0,1,2,3,4,5] mean using one thread to calculate the first 0-5 k-points? Will which_k = [0] calculate all k-points, or only the first k-point?
Thank you for your response.
tzuching
Hello, I have finished the prediction with DeepH-E3, and now I want to plot a figure like the one in the paper, but I don't know how to plot the DFT result. Can you tell me how, and give me your contact address? Thanks.
I met a problem where I trained on 100 data points (575 in total) and the training loss is 0. Have you encountered this problem?
Will it be in the output of DeepH?
I want to run the code on a Windows system, and I have installed PyCharm etc. Could you tell me how to run it in PyCharm? Thanks.
Hi there,
When I had successfully obtained the trained model and the OLP matrix, I ran the inference part and met an error like this:
=> load best checkpoint (epoch 5969)
=> Atomic types: [52, 74], spinful: True, the number of atomic types: 2.
Load processed graph from /share/home/zhangtao/work/xxxx/xxxx/work_dir/inference/graph.pkl
Traceback (most recent call last):
File "/share/home/zhangtao/anaconda3/envs/ZT-py39/bin/deeph-inference", line 8, in <module>
sys.exit(main())
File "/share/home/zhangtao/anaconda3/envs/ZT-py39/lib/python3.9/site-packages/deeph/scripts/inference.py", line 105, in main
predict(input_dir=work_dir, output_dir=work_dir, disable_cuda=disable_cuda, device=device,
File "/share/home/zhangtao/anaconda3/envs/ZT-py39/lib/python3.9/site-packages/deeph/inference/pred_ham.py", line 167, in predict
assert np.all(np.isnan(hamiltonian) == False)
AssertionError
here I also list the inference.ini setting:
[basic]
OLP_dir = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/olp
work_dir = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/inference
interface = openmx
structure_file_name = POSCAR
task = [1, 2, 3, 4, 5]
sparse_calc_config = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/inference/band.json
trained_model_dir = /share/home/zhangtao/work/WTe2/train/data/WTe2/work_dir/trained_model
restore_blocks_py = True
dense_calc = True
disable_cuda = False
device = cuda:0
huge_structure = True
[interpreter]
julia_interpreter = /share/home/zhangtao/software/julia-1.6.6/bin/julia
[graph]
radius = 9.0
create_from_DFT = True
band setting:
{
"calc_job": "band",
"which_k": 0,
"fermi_level": 0,
"lowest_band": -10.3,
"max_iter": 300,
"num_band": 100,
"k_data": ["20 0.5000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 X Γ", "20 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.5000000000 0.0000000000 Γ Y", "20 0.0000000000 0.5000000000 0.0000000000 0.5000000000 0.5000000000 0.0000000000 Y M","20 0.5000000000 0.5000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 M Γ"]
}
I have tried to find out the reason, but I failed. So sad! I would greatly appreciate your kind help if you could give me some advice on this error.
Best regards,
Tao
Merge two methods, DeepH-E3 and xDeepH, into DeepH-pack. This will allow us to better integrate the DeepH series methods and maintain them more effectively.
Here are the details of the two methods:
The Supplementary Data 2 contains multiple cif files, which seem to be generated by OpenMX. But how do we get the input file for the OpenMX calculation? It seems we need to provide OLP_dir for inference. Thanks.
Hi there,
When I am doing the inference part, all steps before the inference step finish normally, but the program gives me the warning below. Can this warning be ignored or not? If not, how can we avoid (remove) it?
/share/home/zhangtao/anaconda3/envs/ZT-py39/lib/python3.9/site-packages/deeph/kernel.py:53: UserWarning: Unable to copy scripts
warnings.warn("Unable to copy scripts")
Here I also list my inference setting:
[basic]
work_dir =/work/deeph-test/workdir/inference3
OLP_dir = //work/deeph-test/workdir/olp
interface = openmx
structure_file_name = POSCAR
trained_model_dir = /work/deeph-test/workdir/trained_model/2023-04-19_11-29-45
task = [1, 2, 3, 4, 5]
sparse_calc_config =/work/deeph-test/workdir/inference3/band.json
dense_calc = True
disable_cuda = False
device = cuda:0
huge_structure = True
gen_rc_idx = False
gen_rc_by_idx =
with_grad = False
[interpreter]
julia_interpreter = ***/software/julia-1.6.6/bin/julia
[graph]
radius = -1.0
create_from_DFT = True
the band.json setting is :
{
"calc_job": "band",
"which_k": 0,
"fermi_level": 0,
"lowest_band": -10.3,
"max_iter": 300,
"num_band": 100,
"k_data": ["46 0.3333333333333333 0.6666666666666667 0 0 0 0 K Γ", "28 0 0 0 0.5 0.5 0 Γ M", "54 0.5 0.5 0 0.6666666666666667 0.3333333333333333 0 M K'"]
}
One more question:
the program seems to be stuck at 3.get_pred_Hamiltonian, because the output files are not updated in the working directory. The latest update time was 12:47, 26/04/2023; after that time the files never changed, but the program is still running now (17:47, 26/04/2023).
Much appreciation for your kind help.
Best regards,
Dear developer, I encountered the following error while executing the second step, preprocessing.
Found 20 directories to preprocess
Preprocessing No. 1/20...
/Bin/sh:/public/apps/julia-1.54/bin/jula:
Traceback (most recent call last):
File "/public/apps/miniconda3/bin/deeph-preprocess", line 8, in <module>
sys.exit(main())
File "/public/apps/miniconda3/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 64, in main
assert capture_output.returncode == 0
AssertionError
My preprocess.ini is as follows:
[basic]
raw_dir = /public/wcd/twisted/example/work_dir/dataset/raw
processed_dir = /public/wcd/twisted/example/work_dir/dataset/processed
target = hamiltonian
interface = openmx
multiprocessing = 48
local_coordinate = Ture
get_S = False
[interpreter]
python_interpreter = /public/apps/miniconda3/bin/python3
julia_interpreter = /public/apps/julia-1.5.4/bin/juila
[graph]
radius = 9.0
create_from_DFT = True
After trying to read the source code, I couldn't figure out why capture_output.returncode is not 0. Could you help me with this problem?
Thank you for your response.
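For what it's worth, `capture_output.returncode` is simply the exit status of the preprocessing command launched with `subprocess.run`. A misspelled interpreter path (note `juila` vs. `julia` in the config above) makes the shell report "not found" and exit with a nonzero status, which is enough to trip the assertion. A minimal sketch:

```python
import subprocess

# Running a command whose executable does not exist: the shell
# reports "not found" and exits with a nonzero status (typically 127).
result = subprocess.run(
    "/no/such/path/juila script.jl",
    shell=True, capture_output=True, text=True)

print(result.returncode)      # nonzero
print(result.stderr.strip())  # the shell's "not found" message
```

So the first thing to check is that the `julia_interpreter` path in preprocess.ini actually exists and is spelled correctly.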
Hi, I used gen_example.py
to reproduce Fig. 6c in the paper. The first step, the DFT calculation, works well, but the second step, preprocessing, fails with an assertion error: SpinP_switch >> 2 == 3. I didn't change any settings after generation. I tried to run the DFT calculation again, but it doesn't work.
Stacktrace:
[1] parse_openmx(::String; return_DM::Bool) at /deeph/preprocess/openmx_get_data.jl:156
[2] get_data(::String, ::Float64; if_DM::Bool) at /deeph/preprocess/openmx_get_data.jl:317
[3] top-level scope at /deeph/preprocess/openmx_get_data.jl:420
[4] include(::Function, ::Module, ::String) at ./Base.jl:380
[5] include(::Module, ::String) at ./Base.jl:368
[6] exec_options(::Base.JLOptions) at ./client.jl:296
[7] _start() at ./client.jl:506
Hello Dr. Li, I am a beginner with DeepH.
When I tried to replicate the "demo: train the DeepH model using the Abacus interface" on the Bohrium platform, I encountered an error when running step 2, preprocessing. The command line shows:
preprocess.sh: 1: deeph-preprocess: not found
My preprocess.ini is
[basic]
raw_dir = /data/demo_abacus/2_preprocess_dataset/3_dataset_graphene
processed_dir = ./preprocessed_3_graphene
target = hamiltonian
interface = abacus
[interpreter]
julia_interpreter = /home/lihe/apps/julia-1.5.4/bin/julia
[graph]
radius = -1
create_from_DFT = True
Because I am a beginner, I strictly followed the steps of the DeepH tutorial on BiliBili. I want to ask how I can solve this problem; looking forward to your reply.
Best wishes.
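A `deeph-preprocess: not found` message usually just means the console script is not on the shell's PATH, e.g. because DeepH-pack was pip-installed into a conda environment that is not activated inside the script that runs preprocess.sh. A quick way to check from Python, sketched here:

```python
import shutil
import sys

# shutil.which returns the full path of an executable found on PATH,
# or None when the command is not visible to the current shell.
exe = shutil.which("deeph-preprocess")
if exe is None:
    # Not on PATH: activate the environment DeepH-pack was installed
    # into, or invoke the script via its absolute path.
    print("deeph-preprocess not found on PATH; interpreter:", sys.executable)
else:
    print("found:", exe)
```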
Hello, I wonder if it is possible to predict magnetic systems?
As the author shows, we can use a dataset of small structures to predict large-scale material systems, but I am confused about what an "appropriate dataset of small structures" means.
I have completed both step 1 (DFT_calculation) and step 2 (preprocess), and now I am encountering loss = 0 during step 3 (train). I am sure the logs for the first two steps both report finished and contain data. Since the first step takes a long time, I would like to run a small-batch test of 3_train. Whether I train on 10 or on 100 folders, the result is the same: loss = 0. Do I have to process all the data in folders 0-575 before I can proceed with 3_train?
I saw your discussion in another issue about how to obtain the Fermi energy:
"The most direct way to obtain the Fermi energy is to use the sparse_calc.jl script in DeepH-pack. Starting from the lowest band, one can calculate a number of bands equal to the number of valence electrons, thus obtaining the Fermi energy. This approach does not require using other software or relying on guessed values."
But I don't understand what "lowest band" means. Is the Fermi level set to this value when setting up band.json, and how is this value obtained? Also, as I understand your algorithm, the sparse calculation finds the bands closest to the set Fermi energy; if the number of bands equals the number of valence electrons, bands can lie both above and below the set Fermi energy, so which value is the Fermi energy?
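My reading of the scheme, as a sketch under the assumption of a gapped, non-spin-polarized system where each band holds two electrons: compute eigenvalues at each k-point starting from the lowest band, then take the maximum of the highest occupied band over all k-points as the Fermi energy. The helper below is illustrative, not part of sparse_calc.jl:

```python
import numpy as np

def fermi_from_bands(eigenvalues, n_valence_electrons):
    """Estimate the Fermi energy as the valence-band maximum.

    eigenvalues: array of shape (n_kpoints, n_bands).
    Assumes a gapped, non-spin-polarized system: 2 electrons per band.
    """
    n_occ = n_valence_electrons // 2
    eigenvalues = np.sort(eigenvalues, axis=1)
    # Highest occupied band at each k-point, then its maximum over k.
    return eigenvalues[:, n_occ - 1].max()

# Toy two-band example on three k-points (hypothetical energies in eV):
bands = np.array([[-5.0, 1.0],
                  [-4.2, 1.5],
                  [-4.8, 0.9]])
print(fermi_from_bands(bands, 2))  # -4.2: top of the lowest band
```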
Hello, I wonder how to calculate the eigenvalues of different k points in parallel in the inference step?
[which_k: Define which point in the k-path to calculate, counting from 1. You can set it to '0' for all k points, or '-1' for no point. It is recommended to calculate the eigenvalues of different k points in parallel through it. (Invalid for dense matrix calculation)]
Hello.
I'm trying to follow the "demo_abacus" with ABACUS 3.1.
I've made raw data using ABACUS 3.1.
After that, I tried the deeph-preprocess.
deeph-preprocess --config preprocess.ini
And it produces:
User config name: ['preprocess.ini']
Found 400 directories to preprocess
Preprocessing No. 1/400 [09:00:00<?]...Output subdirectories: OUT.ABACUS
Traceback (most recent call last):
File "/home/whal1235/miniconda3/bin/deeph-preprocess", line 8, in <module>
sys.exit(main())
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 147, in main
worker(index)
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/scripts/preprocess.py", line 118, in worker
abacus_parse(abspath, os.path.abspath(relpath), 'OUT.' + abacus_suffix)
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 246, in abacus_parse
hamiltonian_dict, tmp = parse_matrix(
File "/home/whal1235/miniconda3/lib/python3.9/site-packages/deeph/preprocess/abacus_get_data.py", line 204, in parse_matrix
num_element = int(line1[3])
ValueError: invalid literal for int() with base 10: 'H(R):'
I think the error comes from an output file format difference between ABACUS 2.x and ABACUS 3.1.
This is an example of an ABACUS 3.1 output file.
Can you fix it? Or do you recommend that I install ABACUS 2.x?
Thank you.
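From the traceback alone, the token at index 3 of the header line is now the label `H(R):` rather than the element count. If that is the only difference, a tolerant version of the failing line could skip non-integer tokens instead of indexing a fixed position. The header strings below are illustrative stand-ins, not the actual ABACUS output format:

```python
def first_int_after(tokens, start=0):
    """Return the first token from position `start` onward that parses as
    an integer, skipping decorations such as an 'H(R):' label."""
    for tok in tokens[start:]:
        try:
            return int(tok)
        except ValueError:
            continue
    raise ValueError(f"no integer token in {tokens!r}")

# Hypothetical 2.x-style header vs. a 3.1-style header with an extra label:
old_header = "Matrix Dimension of H 13".split()
new_header = "Matrix Dimension of H(R): 13".split()
print(first_int_after(old_header, 3), first_int_after(new_header, 3))  # 13 13
```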
[network]
atom_fea_len=64
edge_fea_len=128
gauss_stop=6.0
num_l=5
aggr=add
distance_expansion=GaussianBasis
if_exp=True
if_multiplelinear=False
if_edge_update=True
if_lcmp=True
normalization=LayerNorm
atom_update_net=CGConv
trainable_gaussians=False
type_affine=False
=> load best checkpoint (epoch 3333)
=> Atomic types: [6], spinful: False, the number of atomic types: 1.
Killed