
aiida-catmat's Introduction


aiida-catmat

A collection of AiiDA WorkChains developed within the CATMAT project at the University of Oxford and the University of Bath.


Compatible with:

aiida-core, aiida-vasp, and VASP

Installation, usage, and available workchains

Please check the documentation.

Sample provenance graph

Below is a sample provenance graph from a calculation performed using the aiida-catmat package.

graph

Citation

If you use this package in your research, please consider citing the associated DOI.

Contact

Please contact Pezhman Zarabadi-Poor if you have any questions regarding this package.

Acknowledgment

I thank the Faraday Institution (FI) CATMAT project (EP/S003053/1, FIRG016) for financial support. I would also like to thank Dr. Benjamin Morgan, Dr. Alex Squires, and Hollie Richards for their input during the development.


aiida-catmat's Issues

Better handling of DFT+U calculations

context:

When we start a workchain with a structure that has magnetic ordering, the StructureData contains species with different kinds to reflect this. For example:

Li Fe1 Fe2 P S
2 1 1 2 8

which would require LDAUU of:

0 5.3 5.3 0 0

Currently, if the user requests a DFT+U calculation, the LDAU section of the INCAR is constructed in setup_protocols for all stages. This causes an issue that makes the workchain fail. Once a stage finishes with a structure as output, and considering that we do not attach the converged spins to the structure, the resulting structure will have the species:

Li Fe P S
2 2 2 8

which would require LDAUU of:

0 5.3 0 0

which would not match the LDAUU constructed earlier. This will result in parsing errors.

Solution:

  • Move the construction of the LDAU section from setup_protocols to get_stage_incar. This way we ensure that the section is always constructed for the current structure.
  • Attach the converged spin information to the relaxed structure when parsing it:
  • This needs to be done in the parser.
  • We need to use the site-projected magnetization for this purpose. However, it can have negligible but non-zero values that would make StructureData treat the species as different kinds. Therefore, we use a threshold (0.1) and set the magnetization to zero wherever its absolute value is below the threshold. So converged_magmoms will be modified:
    converged_magmoms will be modified:
magmoms = [0 if abs(mag)<0.1 else mag for mag in magmoms]

resulting in:

[0, 0, -3.682, 3.682, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Then,

from pymatgen.io.vasp.outputs import Vasprun

vrun = Vasprun('vasprun.xml')
structure = vrun.final_structure
structure.add_spin_by_site(magmoms)

[Feature] Moving to aiida-core BaseRestartWorkChain

Currently, we are using the BaseRestartWorkChain from the aiida-vasp plugin. This implementation was first introduced in aiida-quantumespresso and then started being used in every plugin. As it proved quite useful, the AiiDA team moved it to aiida-core and improved it with a nice error-handling mechanism. Therefore, it is a good idea for us to start using it in our workchain as well (see the sketch after this list). It requires:

  • Carefully reading the plugin to avoid introducing bugs that break the current API
  • Implementing it and adding at least one error handler as a starting point
  • Testing it and making it ready to be merged into master.
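A minimal sketch of what the migration could look like; it assumes aiida-core >= 1.1 (where BaseRestartWorkChain lives in aiida.engine) and the vasp.vasp entry point from aiida-vasp, and the exit code and catch-all handler are illustrative placeholders rather than the final design:

from aiida.engine import BaseRestartWorkChain, ProcessHandlerReport, process_handler, while_
from aiida.plugins import CalculationFactory

VaspCalculation = CalculationFactory('vasp.vasp')


class VaspBaseWorkChain(BaseRestartWorkChain):
    """Restart wrapper around VaspCalculation built on the aiida-core machinery."""

    _process_class = VaspCalculation

    @classmethod
    def define(cls, spec):
        super().define(spec)
        spec.expose_inputs(VaspCalculation, namespace='vasp')
        spec.expose_outputs(VaspCalculation)
        spec.exit_code(300, 'ERROR_UNRECOVERABLE_FAILURE', message='The calculation failed with an unrecoverable error.')
        spec.outline(
            cls.setup,
            while_(cls.should_run_process)(
                cls.run_process,
                cls.inspect_process,
            ),
            cls.results,
        )

    def setup(self):
        """Collect the exposed inputs; BaseRestartWorkChain submits from self.ctx.inputs."""
        super().setup()
        self.ctx.inputs = self.exposed_inputs(VaspCalculation, 'vasp')

    @process_handler(priority=500)
    def handle_unrecoverable_failure(self, node):
        """Abort for failures that no dedicated handler knows how to fix."""
        if node.is_failed:
            self.report(f'VaspCalculation<{node.pk}> failed with unhandled exit status {node.exit_status}')
            return ProcessHandlerReport(True, self.exit_codes.ERROR_UNRECOVERABLE_FAILURE)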

[Feature] Improve handling of convergence failure

context

Exploring failed workchains shows that the majority suffer from convergence failure, either in the relaxation or the static stage. This can partly be solved by #56 with proper restarting of non-converged calculations. However, that is not the only case: it can happen that we need to change settings (like ALGO or NELM) to help the convergence. This needs to be addressed urgently.

solution

We can implement the handlers in two ways:

  1. Having them in the base like the other handlers. Then, to distinguish between relaxation and static runs, we can turn the handler on/off when submitting the calcjob from the main workchain (see the sketch after this list).
  2. Keeping them in the main workchain. Then we need a calcfunction to track the changes in the INCAR. This is not ideal, as I would rather delegate all error handling to the base; that way we know that once a calculation comes back from base to main, it is ready to go.
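For option 1, aiida-core's BaseRestartWorkChain already provides a handler_overrides input that toggles individual handlers per submission. A sketch, where the handler name is hypothetical and VaspBaseWorkChain is assumed to subclass the aiida-core BaseRestartWorkChain (see the sketch in the issue above):

from aiida import orm
from aiida.engine import submit

builder = VaspBaseWorkChain.get_builder()
# Disable the (hypothetical) relaxation-specific handler for this static run.
builder.handler_overrides = orm.Dict(dict={'handle_relaxation_convergence': False})
submit(builder)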

[Feature] Partial deintercalation in VaspCatMatWorkChain

context

In the current version, we only consider full deintercalation of the ions, which is not the case for most studies. Therefore, the workchain should also support partial removal of ions.

solution

We can implement this feature relatively easily via a calcfunction and the to_context submission scheme. The following steps should be taken in the workchain:

  1. Have a switch for partial or full deintercalation.
  2. If partial is switched to True, then after optimization of the fully lithiated structure, the structure update would be done using bsym. Here, we also need another input for the requested number of Li ions to remove. There should be a calcfunction that takes the optimized structure, replaces the Li with X, and removes X; it will return a dictionary of StructureData objects (see the sketch after this list).
  3. In run_charged, if the above dictionary exists (for example, self.ctx.partial_deli_structures), loop over its values and submit them all in parallel.
  4. When processing the results, select the structure with the lowest energy and calculate the cathode properties.
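A minimal sketch of the calcfunction from step 2; the function name is illustrative, and the brute-force enumeration over all Li subsets stands in for the bsym-based selection of symmetry-inequivalent configurations:

from itertools import combinations

from aiida.engine import calcfunction
from aiida.orm import StructureData

@calcfunction
def get_partially_delithiated_structures(structure, num_li_to_remove):
    """Return a dictionary of structures with num_li_to_remove Li sites removed."""
    pmg = structure.get_pymatgen_structure()
    li_indices = [i for i, site in enumerate(pmg) if site.specie.symbol == 'Li']
    results = {}
    # Brute force over all combinations; bsym would prune symmetry-equivalent ones.
    for n, combo in enumerate(combinations(li_indices, num_li_to_remove.value)):
        candidate = pmg.copy()
        candidate.remove_sites(list(combo))
        results[f'structure_{n:03d}'] = StructureData(pymatgen=candidate)
    return results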

Add more examples

I need to add more examples to show different use cases:

  • how to use kpoints
  • how to use KSPACING
  • an example that triggers the walltime error

Better handling of ZBRENT: fatal error in bracketing

I noticed a case while performing a calculation for the Li2FeP2S6 project where the job failed, but VaspMultiStageWorkChain did not capture the issue properly and also did not report a relevant log message to make debugging easier.
The workchain failed in the inspect_relax step with the following message:

Excepted <  File "<string>", line None
             xml.etree.ElementTree.ParseError: no element found: line 15612, column 0
             >

I traced it back and found that the OUTCAR is incomplete (so the job crashed); the actual error appears at the end of _scheduler-stdout.txt as:

curvature:   0.00 expect dE= 0.107-305 dE for cont linesearch  0.104-305
 ZBRENT: fatal error in bracketing
     please rerun with smaller EDIFF, or copy CONTCAR
     to POSCAR and continue

According to the VASP mailing list, it can be resolved by changing the IBRION tag to 1 (i.e. RMM-DIIS), setting ADDGRID=True, and increasing ENCUT. We already have the latter two, so playing with IBRION seems to be the way to go.

So, I am going to do the following to confirm the solution and later implement the fix and the handler (a handler sketch follows the list):

  • Resubmit the calculation using IBRION=1

  • Investigate the results

  • Implement the error capture and a fixer in the workchain

[Feature] add a few more protocols + notebooks

It would be good to have a few more protocols, giving a relatively complete list for different purposes:

  • only static calculation for testing purposes
  • only static calculation with high-accuracy settings
  • relax+static
  • only conventional relax

Update inputs in Base when iterating to solve a calculation issue

Context

Currently, if VaspBaseWorkChain gets triggered to solve an issue and updates the INCAR, it goes to a second iteration. However, if the issue is not solved, it takes the same action again without knowing that it has already been taken, and it continues until it hits the maximum number of iterations.
For example, see the log below of a workchain trying to solve the ZBRENT issue:

2020-09-01 11:15:01 [14823 | REPORT]: [24367|VaspMultiStageWorkChain|run_stage]: Submitted VaspBaseWorkchain <pk>:25457 for stage_1_relaxation
2020-09-01 11:15:01 [14824 | REPORT]:   [25457|VaspBaseWorkChain|run_process]: launching VaspCalculation<25458> iteration #1
2020-09-01 13:46:46 [15063 | REPORT]:   [25457|VaspBaseWorkChain|report_error_handled]: VaspCalculation<25458> failed with exit status 0: None
2020-09-01 13:46:46 [15064 | REPORT]:   [25457|VaspBaseWorkChain|report_error_handled]: Action taken: ERROR_ZBRENT: EDIFF is decreased by 10% to 1e-07
2020-09-01 13:46:47 [15065 | REPORT]:   [25457|VaspBaseWorkChain|apply_modifications]: Applied all modifications for VaspCalculation<25458>
2020-09-01 13:46:47 [15066 | REPORT]:   [25457|VaspBaseWorkChain|inspect_process]: VaspCalculation<25458> finished successfully but a handler was triggered, restarting
2020-09-01 13:46:48 [15067 | REPORT]:   [25457|VaspBaseWorkChain|run_process]: launching VaspCalculation<25907> iteration #2
2020-09-01 15:50:47 [15200 | REPORT]:   [25457|VaspBaseWorkChain|report_error_handled]: VaspCalculation<25907> failed with exit status 0: None
2020-09-01 15:50:47 [15201 | REPORT]:   [25457|VaspBaseWorkChain|report_error_handled]: Action taken: ERROR_ZBRENT: EDIFF is decreased by 10% to 1e-07
2020-09-01 15:50:48 [15202 | REPORT]:   [25457|VaspBaseWorkChain|apply_modifications]: Applied all modifications for VaspCalculation<25907>
2020-09-01 15:50:48 [15203 | REPORT]:   [25457|VaspBaseWorkChain|inspect_process]: VaspCalculation<25907> finished successfully but a handler was triggered, restarting

It reduces EDIFF by an order of magnitude for the second iteration, and then does the same over and over. The reason is that cls.setup sits outside the running loop in the outline:

spec.outline(
            cls.setup,
            while_(cls.should_run_process)(
                cls.run_process,
                cls.inspect_process,
            ),
            cls.results,
        )

How to solve it?

There can be several solutions which need to be tested:

  • We can have a switch that checks whether we are in the second or a later iteration and then re-read the INCAR.
  • We can have a step in the outline that makes running it conditional.
  • We can keep a dictionary whose keys are the errors and whose values count the actions taken (see the sketch below).
    NOTE This is a particularly good way to go, and not only for this issue: we may not want to repeatedly decrease EDIFF or other parameters without improvement, and there can be errors whose fix we want to try only once, not again and again.
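A sketch of the third option applied to the ZBRENT case; the counter dictionary and the exit code are illustrative names:

from aiida.engine import ProcessHandlerReport, process_handler

@process_handler(priority=460)
def handle_zbrent_once(self, calculation):
    """Apply the ZBRENT fix at most once instead of looping to the iteration limit."""
    counts = self.ctx.setdefault('error_action_counts', {})
    if counts.get('ERROR_ZBRENT', 0) >= 1:
        # The same action was already taken and did not help; stop cleanly.
        return ProcessHandlerReport(True, self.exit_codes.ERROR_UNRECOVERABLE_FAILURE)
    counts['ERROR_ZBRENT'] = counts.get('ERROR_ZBRENT', 0) + 1
    ediff = self.ctx.parameters.get_dict().get('EDIFF', 1e-6) * 0.1
    self.ctx.modifications.update({'EDIFF': ediff})
    return ProcessHandlerReport(do_break=True)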

[Feature] Avoid setting LREAL to Auto/True in the first place

We have an error handler that changes LREAL when VASP warns that the cell is small and that LREAL = False would be better. However, it would be better to look into the VASP source code and check under which circumstances it complains about this. Then we can have a decider/corrector function that checks it in advance, before submitting the calculation. This would save computational time by avoiding unnecessary calculations.

[Feature] rearrange calcfunction metadata

Currently, I pass the metadata whenever I call a calcfunction. It would be nicer to have it in the function definition; that makes the code a bit shorter (when called multiple times) and more readable.

[Feature] Smart/automatic choosing of resource parameters

I think we can have a nice implementation for guessing the proper number of CPUs and the required memory specifically for each job. It requires some benchmarking on different machines to gather information on the correlation between the number of atoms, the number of k-points, plane waves, and other relevant parameters.
Once we have that information, we can make a smart choice of resources within the workchain, changing them between stages rather than fixing them.
It could also extend to selecting proper parallelization parameters such as NPAR, NCORE, and KPAR.

[BUG] change strings in output of VaspConvergeWorkChain

Describe the bug

Currently, the ENCUT and KSPACING values are reported as strings in the output dictionaries. In the case of KSPACING, I multiply the values by 1000 within the workchain (only for key assignment) to avoid decimal points in the keys. This needs to be converted back to the actual values. Current example:

{
    "converged_kspacing": "400",
    "converged_kspacing_conservative": "350",
    "energy_difference": 0.0,
    "final_energy": {
        "200": -53.24188224
    }
}

Steps to reproduce

Running VaspConvergeWorkChain with the provided example will reproduce it.

Expected

I need to change this to have the outputs as:

{
    "converged_kspacing": 0.400,
    "converged_kspacing_conservative": 0.350,
    "energy_difference": 0.0,
    "final_energy": {
        "0.200": -53.24188224
    }
}
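A sketch of the intended fix: keep KSPACING as a float in the values, and format the dictionary keys (which must remain strings in JSON) from the real number instead of using the x1000 trick:

kspacing = 400 / 1000  # previously reported as the string '400'
outputs = {
    'converged_kspacing': kspacing,                     # 0.4 as a float
    'final_energy': {f'{kspacing:.3f}': -53.24188224},  # key '0.400'
}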

[Feature] Adding more workchains

Now that VaspMultiStageWorkChain has reached production quality with good error-handling features, it is time to add higher-level workchains that act as wrappers and do a specific job. Each wrapper prepares and submits VaspMultiStageWorkChain for a specific purpose; we can expose its inputs and outputs and implement the logic within the spec.outline (see the sketch after this list).
Currently, I have the following in mind:
Currently, I have the followings in mind:

  • VaspConvergenceWorkChain: It can be used to study ENCUT and k-points mesh convergence. This will use the S protocol to do the job. I will use to_context instead of ToContext to be able to submit all workchains in parallel. We will need two inputs as switches to trigger which parameter to converge. Post-processing of the results can output the suggested values in a pythonic way.
  • VaspCatMatWorkChain: A workchain that takes a fully lithiated or sodiated structure, optimizes it, removes all the ions from the system, optimizes the structure again, and at the end calculates the open circuit voltage.
  • VaspDdecWorkChain: This takes a structure, runs a single-point calculation that generates the potential files, then submits a DdecCalculation, and outputs the structure in CIF format with DDEC charges and spin moments.
  • VaspMagOrderWorkChain: This would take a structure, enumerate magnetic orderings, run static or relaxation+static calculations on them, and identify the magnetic ground state.
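A sketch of the wrapper pattern shared by these workchains; the catmat.vasp_multistage entry point and the step names are assumptions for illustration:

from aiida.engine import ToContext, WorkChain
from aiida.plugins import WorkflowFactory

VaspMultiStageWorkChain = WorkflowFactory('catmat.vasp_multistage')  # entry point assumed


class VaspCatMatWorkChain(WorkChain):
    """Wrapper that prepares and submits VaspMultiStageWorkChain."""

    @classmethod
    def define(cls, spec):
        super().define(spec)
        spec.expose_inputs(VaspMultiStageWorkChain)
        spec.expose_outputs(VaspMultiStageWorkChain)
        spec.outline(
            cls.run_discharged,
            cls.inspect_discharged,
        )

    def run_discharged(self):
        inputs = self.exposed_inputs(VaspMultiStageWorkChain)
        running = self.submit(VaspMultiStageWorkChain, **inputs)
        return ToContext(discharged=running)

    def inspect_discharged(self):
        # Ion removal, the charged run, and the OCV calculation would follow here.
        self.out_many(self.exposed_outputs(self.ctx.discharged, VaspMultiStageWorkChain))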

[Feature] smart action on convergence handling

context

I am using some minimal actions to deal with convergence issues; they work, but they are neither solid nor elegant. This needs to become much better and smarter.

solution

We have tools by Peter Larsson which can extract energies and forces from OUTCAR. This Python script can come in handy here: we can get the information from OUTCAR and check whether the energies and forces are trending downward. This way we can spot fluctuations or slow convergence and let the handler decide accordingly (see the sketch after this list). Besides, we have the band gap information too, so it can also be put in play for choosing a proper smearing method.

  • Convert the script to Python 3 and make it an AiiDA-compatible function; it should go in parsers or utils.
  • Write a function that uses the data and tells us how the convergence is going.
  • Pass those outcomes to the handler so it can take proper action.
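A sketch of the second bullet, judging the trend from the pymatgen-parsed ionic steps rather than the raw OUTCAR (the window and threshold are illustrative):

from pymatgen.io.vasp.outputs import Vasprun

def relaxation_trend(vasprun_path, window=5, tol=1e-4):
    """Classify the last ionic steps as 'converging', 'stalled' or 'oscillating'."""
    vrun = Vasprun(vasprun_path, parse_potcar_file=False)
    energies = [step['e_fr_energy'] for step in vrun.ionic_steps]
    # Energy differences between consecutive steps over the last `window` steps.
    diffs = [b - a for a, b in zip(energies[-window - 1:-1], energies[-window:])]
    if all(d < 0 for d in diffs):
        return 'converging'
    if all(abs(d) < tol for d in diffs):
        return 'stalled'
    return 'oscillating'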

[BUG] Restart when structure relaxation is not converged

Context

There is a critical bug in VaspMultiStageWorkChain which prevents a calculation from being restarted from the previous one when the relaxation is not converged. Since we are in the relaxation stage, it makes sense to restart the calculation for another iteration using the previous CONTCAR and other available information. However, this currently fails: it restarts, but it actually repeats the previous calculation.

solution

We need to update the structure in the relaxation stage. Currently, we only update self.ctx.current_structure if the relaxation has converged. A sketch of the fix follows.
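A minimal sketch (the context attribute and output names are illustrative):

# In inspect_relax: always carry the latest output structure forward, so a
# restart continues from the last CONTCAR instead of repeating the same run.
if 'structure' in workchain.outputs:
    self.ctx.current_structure = workchain.outputs.structure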

Change of ALGO to Normal in protocols?

So far, I have quite often observed ALGO = Fast being changed to Normal during workchain runs.
I need to build a database showing the statistics of this change. If they are strongly in favor of Normal, then we may make it the default in the protocols.

[BUG] Convergence handling in static hybrid runs

In hybrid calculations, we use ALGO = All. If the calculation fails to converge, the current mechanism to handle the issue and re-submit fails.
https://github.com/morgan-group-bath/aiida-bjm/blob/038e22380724b18f9e56c3a15171e78981ddb51b/aiida_bjm/workchains/vasp_multistage.py#L523-L532

It can be fixed by modifying the elif as:

elif self.ctx.vasp_base.vasp.parameters['ALGO'] in ['Normal', 'All']:
    self.ctx.modifications.update({'ALGO': 'All', 'NELM': nelm, 'ISTART': 0, 'ICHARG': 2})

[Documentation] Start writing the documentation

As we get closer to a properly stable version of the workchain, with more to be added, I need to start writing the documentation. I will do it using Sphinx, so that once we are at the stage of making the code public, we can simply deploy the docs to Read the Docs.

  • Proper docstrings in rst for the API documentation
  • Pages describing technical considerations
  • Tutorials for preparing structures and protocols and for running workchains
  • Tutorials for querying

[BUG] Symmetry issue is treated as convergence failure

Recently, using VASP 6.2.0 and ISYM=-1, I had a case where VASP had an issue with the symmetry as follows and failed:

-----------------------------------------------------------------------------
|                                                                             |
|     EEEEEEE  RRRRRR   RRRRRR   OOOOOOO  RRRRRR      ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     EEEEE    RRRRRR   RRRRRR   O     O  RRRRRR       #       #       #      |
|     E        R   R    R   R    O     O  R   R                               |
|     E        R    R   R    R   O     O  R    R      ###     ###     ###     |
|     EEEEEEE  R     R  R     R  OOOOOOO  R     R     ###     ###     ###     |
|                                                                             |
|     Inconsistent Bravais lattice types found for crystalline and            |
|     reciprocal lattice:                                                     |
|                                                                             |
|        Crystalline: base-centered monoclinic                                |
|        Reciprocal : triclinic                                               |
|                    (instead of base-centered monoclinic)                    |
|                                                                             |
|     In most cases this is due to inaccuracies in the specification of       |
|     the crytalline lattice vectors.                                         |
|                                                                             |
|     Suggested SOLUTIONS:                                                    |
|      ) Refine the lattice parameters of your structure,                     |
|      ) and/or try changing SYMPREC.                                         |
|                                                                             |
|       ---->  I REFUSE TO CONTINUE WITH THIS SICK JOB ... BYE!!! <----       |
|                                                                             |
 -----------------------------------------------------------------------------

This is being treated as a convergence failure, and the job is resubmitted with an increased NELM.

This needs to be addressed in error handlers.

[Feature] ionic steps in output parameters

Is your feature request related to a problem? Please describe

I think having these data in the output_parameters dictionary is helpful when we want a quick visualization of the energy changes over the ionic steps. Currently, anyone who wants to do this needs to re-parse the retrieved files in the repository, which requires tools like aiida-vasp-viz with access to the repo. If we parse them in the workchain, the user can simply export the database portion of the node(s) and visualize it on another machine.

Describe the solution you'd like

Direct the ionic steps to output_parameters under an ionic_steps key for each stage:

{
    'stage_1_relaxation': {
        'ionic_steps': [....]
    }
}
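A sketch of how the data could be collected with the pymatgen parser we already rely on:

from pymatgen.io.vasp.outputs import Vasprun

vrun = Vasprun('vasprun.xml', parse_potcar_file=False)
energies = [step['e_fr_energy'] for step in vrun.ionic_steps]
output_parameters = {'stage_1_relaxation': {'ionic_steps': energies}}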


[BUG] Not specifying kpoints in input

context

Consider a case where we only want to perform a static calculation with the KSPACING given in the protocol. We should then not need to provide any kpoints data as input to the workchain. However, that is not the case now, because we evaluate the presence of kpoints data before constructing the protocol dictionary in the workchain. If it is not provided, we get the 803 error, which is associated with missing kpoints data in the inputs.

solution

It should probably be solved by rearranging the code a bit: first construct the protocol dictionary, then add another elif branch to the kpoints evaluation block to check for the presence of KSPACING in the protocol.

[Feature] have kspacing in protocol

I should add a few lines to the workchain to recognize the KSPACING and KGAMMA tags in the protocol:

if 'KSPACING' in self.ctx.parameters:
    # construct KpointsData based on it

This is a necessary improvement because it lets the user alter KSPACING between stages.
A common use case is wanting the final static run on a finer mesh.
We can then provide a constant KSPACING as input in the submit script and also have one in the protocol: if a stage has none in the protocol, we use the one provided as input; if the protocol has one, we use it instead. A sketch of the KpointsData construction follows.
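A sketch of the construction, assuming the mesh formula N_i = max(1, ceil(|b_i| / KSPACING)) with reciprocal lattice vectors that include the 2*pi factor, following the VASP documentation:

import math

from aiida.orm import KpointsData

def kpoints_from_kspacing(structure_data, kspacing):
    """Build an explicit k-points mesh equivalent to VASP's KSPACING tag."""
    pmg = structure_data.get_pymatgen_structure()
    recip = pmg.lattice.reciprocal_lattice  # includes the 2*pi factor
    mesh = [max(1, math.ceil(norm / kspacing)) for norm in recip.abc]
    kpoints = KpointsData()
    kpoints.set_cell_from_structure(structure_data)
    kpoints.set_kpoints_mesh(mesh)
    return kpoints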

Heavy parsing of OUTCAR

Right now we are completely dependent on the pymatgen parsers. This is quite okay for parsing vasprun.xml but causes serious issues with OUTCAR.
There is an open issue in AiiDA that discusses the problem in detail.
aiidateam/aiida-core#3973

Long story short, parsing OUTCAR can take a tremendous amount of time. In a few cases that I followed, it took 10-12 minutes to finish parsing, during which the GIL is not released. This causes RabbitMQ to miss two consecutive heartbeats (60+60 seconds), so it assumes the process is lost/dead and hands the task to another worker. The same thing happens with the second worker. Meanwhile, the first worker finishes parsing and even registers the resulting nodes in the database. Right after that, the second worker does the same; AiiDA stops it because a sealed node already exists in the database, and the workchain finally fails.

There can be several solutions to this issue:

  • It can be solved in AiiDA by locking the process in these particular cases.
  • It can be handled by increasing the heartbeat threshold.
  • In our case, it can be solved by changing the OUTCAR parser.

Steps to be done:

  • Change the parser, keeping the results in the same format as before, so that already finished calculations are not lost.
  • Disable parsing of POTCAR and eigenvalues in vasprun.xml.

[Feature] handling convergence issues.

Is your feature request related to a problem? Please describe

Currently, we check for electronic and ionic convergence within the workchain.
For well-behaved systems this is quite OK. However, for systems where convergence becomes an issue and parameters need altering to improve it, keeping the check in VaspMultiStageWorkChain is not a good idea and will result in complications and interference between handlers.

Describe the solution you'd like

The ideal solution is to have specific handlers with high priorities placed in VaspBaseWorkChain. These would check the convergence in misc and, if the calculation is not converged, try to identify the source of the issue. We might face different cases:

  • The calculation is actually converging, but convergence is (very) slow and VASP hits the maximum number of iterations, so we might resubmit with increased limits or possibly change the algorithm.
  • The calculation is fluctuating around some value, so we might change the algorithm and resubmit.

It would be something like:

    @process_handler(priority=550, enabled=True)
    def handle_convergence(self, calculation):
        """Handle convergence issues."""
        converged_elec = calculation.outputs.misc['converged_electronically']
        converged_ionic = calculation.outputs.misc['converged_ionically']
        nsw = self.ctx.parameters.get_dict().get('NSW', 0)
        nelm = self.ctx.parameters.get_dict().get('NELM', 60) + 50
        if nsw == 0:
            # Static run: only electronic convergence is relevant.
            if not converged_elec:
                self.ctx.modifications.update({'NELM': nelm})
                action = f'NELM increased by 50 to {nelm}'
                self.report_error_handled(calculation, action)
        else:
            # Relaxation: allow more ionic steps if not ionically converged.
            if not converged_ionic:
                self.ctx.modifications.update({'NSW': nsw + 50})
                action = f'NSW increased by 50 to {nsw + 50}'
                self.report_error_handled(calculation, action)
        return ProcessHandlerReport(False)

Additional context

We may benefit from Peter Larsson's scripts for the identification stages:
https://www.nsc.liu.se/~pla/vasptools/

[BUG] handling of RSPHER: internal ERROR

Describe the bug

From the context point of view, it is similar to #25: VASP terminated the calculation, vasprun.xml is incomplete, and we hit the parsing issue.

Steps to reproduce

(I will link the inputs that result in this issue.)

Expected behavior

Running them with the same settings and resources should cause VASP to behave similarly, but VASP is unpredictable. So, who really knows.

Your environment

  • Operating system: Centos (Salomon)
  • Resources: 4 nodes (96 cpus)
  • aiida-core version: 1.3.0
  • aiida-vasp version: 1.0.1
  • aiida-bjm version: 0.3.0

Possible solution

Once we have the mechanism described in #25, this should not happen (at least not by throwing a parsing exception).
As for solving the issue in VASP itself, it seems to be a pretty rare error with little information out there.
However, so far I have found that the following solutions can remedy the issue:

  • reducing the number of cores used for the calculation
  • changing LREAL to False

[Feature] adding possibility of running SOC calculations

To run SOC calculations using VaspMultiStageWorkChain, I need to improve/modify a couple of sections:

  • MAGMOM tag: This needs to be constructed differently compared to collinear calculations; here we have to explicitly provide three values for each ion in the system.
  • SAXIS tag: It defines the quantisation axis for non-collinear spins. It has to become a user-defined input with a default.
  • Verification that the user provided the non-collinear version of the code (i.e., vasp_ncl).
  • Although using vasp_ncl as the code makes VASP internally set LNONCOLLINEAR = .True., it is good to have this tag explicitly set in the generated INCAR.
  • Parser and output dictionary: this needs to be addressed once I have a proper set of results in hand.

Error in parsing vasprun.xml

In VaspMultiStageWorkChain, we rely on parsing vasprun.xml using the pymatgen Vasprun parser. I found out during production runs that it can fail in cases where, for example, LDAUJ/L is printed as 2***** or -1****. pymatgen has a backup mechanism for this: when it happens, the corresponding entry is read from the INCAR. Unfortunately, that cannot happen within our workchain, as the INCAR does not exist in the retrieved folder by default. I think the easiest solution is to enable INCAR retrieval by default within the workchain, so the file exists when this happens.
It can be enabled by setting ADDITIONAL_RETRIEVE_LIST in the inputs:

inputs.settings = Dict(dict={'ADDITIONAL_RETRIEVE_LIST': ['INCAR']})

[Feature] Error handling

Once we are done with addressing #6, we can benefit from the nice error handlers implemented in custodian to build similar mechanisms here.
It can be done either by:
It can be done either by:

  • adding custodian as a dependency to our package,
  • or adopting their code and solutions (with proper citation and acknowledgment, of course).

SCF convergence handling

In the section of VaspMultiStageWorkChain below (https://github.com/morgan-group-bath/aiida-bjm/blob/9335c95034d7569ba83b7bfd648f171124e298ac/aiida_bjm/workchains/vasp_multistage.py#L523-L532), if we face an issue and change ALGO to All, and the same issue then happens again, none of the if/elif conditions match, and therefore the process gets excepted with:

algo = self.ctx.modifications['ALGO']
KeyError: 'ALGO'

The solution is to add an else or elif branch that captures this case and ends the process peacefully with a proper exit code.

[Feature] Convergence status in output dict

While analyzing results, I realized that it is convenient to leave a trace of the convergence status in the output_parameters Dict.
It comes in handy later when we quickly want to check it while querying the results. It can be like:

{
    'electronic_convergence_initial_static': True/False,
    'electronic_convergence_final_static': True/False,
    'electronic_convergence_relax': True/False,
}

[Feature] add `label` and `call_link_label` to `calcfunction` calls

Similar to what we do for CalcJob when setting label and call_link_label, we can do the same for calcfunction calls with a slightly different approach. It is beneficial for querying the results later. The code snippet would be:

from aiida.engine import calcfunction
from aiida.orm import Int

@calcfunction
def some_calcfunction(a, b):
    return a + b

some_calcfunction(Int(1), Int(2), metadata={'label': 'a calcfunction', 'call_link_label': 'some_calcfunction'})

[BUG] change aiida-vasp dependency

context

The current version of setup.json defines the aiida-vasp dependency as aiida-vasp==1.0.1, which installs it from the PyPI server. However, that is not what we want when using aiida-bjm, because I have slightly modified the plugin:

ezpzbz/aiida-vasp@d1ab61e

to copy only CHGCAR and WAVECAR on restart. The upstream implementation copies everything except a few files. Besides the unnecessary copying and disk usage, it has another downside on Michael: the SGE scheduler appends stderr and stdout to the existing files, so whatever we do in error handling will be neglected, as the previous error flags still exist.

solution

Simply change the dependency to my fork at that specific commit.

Order of completed `CalcJob`s in `VaspBaseWorkChain`

context

Simply appending completed CalcJob nodes in a VaspBaseWorkChain does not guarantee that the last item in the list is actually the last completed calculation. Therefore, the attached last INCAR would not necessarily be from the last converged run.

solution

Get the CalcJob with the highest pk from the list of completed CalcJob nodes, as sketched below.
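A one-liner sketch, where self.ctx.calculations stands for however the completed CalcJob nodes are collected:

# The highest pk is the most recently created node, i.e. the last completed run.
last_calc = max(self.ctx.calculations, key=lambda node: node.pk)
last_incar = last_calc.inputs.parameters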

Output results of initial static to the output parameters

Currently, if the initial static calculation does not converge, its data are missing from the VaspMultiStageWorkChain output_parameters, and we have to query for that particular VaspWorkChain instead. It would be better to have the data there for easier access once the workchain is finished.
The convergence status keys mentioned in #2 can then be placed alongside the corresponding misc dictionary for each stage.

Enable precommit and GitHub Actions

Now that the workchain is in good shape, it is time to enable pre-commit hooks and run them via GitHub Actions.
Later, this will be made more complete by adding tests!

Incomplete `vasprun.xml` of a failed calculation

context

Among 200 submitted workchains, three failed with more or less the same issue, reporting:

xml.etree.ElementTree.ParseError: no element found: line 19864, column 0

which points to a problem in parsing vasprun.xml, in turn caused by an incomplete file with improperly closed tags.
The actual issue is that the calculation was aborted by VASP due to a failure in the optimization steps, with _scheduler-stdout.txt stating LAPACK: Routine ZPOTRF failed!. Although we have a handler in place to take care of this issue, it is only triggered if the VaspCalculation finishes and is actually passed back to VaspBaseWorkChain. In this case, the VaspCalculation is in the Excepted state, so the workchain could not handle the failure.

proposed solution:

The solution that currently comes to mind is better handling of the exception in:
https://github.com/morgan-group-bath/aiida-bjm/blob/11a542a7658ce4a006bb7db31a7a5b55de6a9a00/aiida_bjm/parsers/__init__.py#L105-L112
If we do so, vrun will be set to None and output_parameters will carry the error, which is then passed to the base to trigger the error-handling mechanism (see the sketch below).
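A sketch of the proposed guard, catching the same exception class that appears in the traceback above (the error key is illustrative):

from xml.etree.ElementTree import ParseError

from pymatgen.io.vasp.outputs import Vasprun

try:
    vrun = Vasprun('vasprun.xml', parse_potcar_file=False)
except ParseError:
    # Incomplete vasprun.xml: record the error instead of excepting, so the
    # base workchain can trigger its error-handling machinery.
    vrun = None
    output_parameters = {'errors': ['incomplete_vasprun']}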

[Feature] Improve OCV

context

I am currently facing cases where we have already calculated the lithiated phases and then choose a few of them to calculate the OCV for. Although AiiDA caching can help avoid running the same calculation twice, it might not always be enabled. Therefore, it would be nice to improve the workchain to handle these cases.

solution

We define an input specific to VaspOcvWorkChain, relaxed_discharged_structure. If it exists in the inputs, the workchain skips run_discharged, as sketched below.
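A sketch of the condition, written as an outline predicate for an if_ block (the method name is illustrative):

def should_run_discharged(self):
    """Skip the discharged relaxation when a relaxed structure is supplied."""
    return 'relaxed_discharged_structure' not in self.inputs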

[Feature] Enable possibility of using KSPACING

My initial inspection and tests reveal that this is not currently possible at the plugin level. Although there is a function, _need_kp, which checks whether kpoints need to be parsed, the base part of the calculation plugin makes no exception and always parses and writes KPOINTS.
I need to investigate in more detail whether there are workarounds that avoid modifying the plugin (which seems inevitable to me).
However, it is already anticipated in our workchain, and fixing this issue should not be a big deal.

[BUG] issue in structure sort when spin sign changes during the workchain

context

We generate the INCAR in each stage so that it has updated information about the structure after the previous stage. However, one thing I am not currently checking is the sign of the spin. When we generate the structure, we sort it so that the species are ordered and we have no problem constructing the Hubbard parameters section of the INCAR. However, I just realized that we may face structures where the sign of the spin changes during the calculations and alters the sorted structure. This in turn results in an inconsistent order of Hubbard parameters and therefore a crash of the parser.

solution

We need to run a sanity check when forming the LDAU tags. The solution I have come up with so far is:

  1. Before constructing the LDAU tag, compare the spins of the provided structure and its sorted version.
  2. If they are identical, pass! If they differ, take the sorted structure, update the current structure, and construct the Hubbard tag based on the updated structure.

To do the check:

structure_pmg = structure.get_pymatgen_structure(add_spin=True)
structure_pmg_sorted = structure.get_pymatgen_structure(add_spin=True)
structure_pmg_sorted.sort()

# Collect the per-site spins of the original and the sorted structure.
spin = [getattr(s, 'spin', 0) for s in structure_pmg.species]
spin_sorted = [getattr(s, 'spin', 0) for s in structure_pmg_sorted.species]

if spin == spin_sorted:
    print('The lists are identical')
else:
    print('The lists are not identical')

[FIX] Restrict AiiDA core version

AiiDA core versions 1.4.0 and 1.4.1 have a bug that results in files being stored in the repository that should not be. In the case of VASP, these can be POTCAR files, which must not be there. To avoid possible issues in the future, I need to exclude these two versions from installation in setup.json:

"aiida-core[atomic_tools] >= 1.2.1,!=1.4.0,!=1.4.1"

NSW for static calculations

I need to make sure that NSW is set to 0 for static calculations.
A possible case is a user providing NSW as an INCAR parameter to overwrite the workchain defaults. In that case, although we have IBRION = -1 for the static calculation, the runtime INCAR will contain an NSW value greater than 0. A sketch of the guard follows.
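A minimal sketch, where incar stands for the stage's INCAR dictionary inside the workchain:

# Force NSW = 0 for static stages (IBRION = -1), even if the user supplied NSW.
if incar.get('IBRION', -1) == -1:
    incar['NSW'] = 0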

Unconverged calculations

In the current implementation, if the calculation is not converged, an exit code is triggered and the workchain does not make another attempt:
https://github.com/morgan-group-bath/aiida-bjm/blob/ec68aa97d24fdf1ea43bcbfb4f419362abef6fe2/aiida_bjm/workchains/vasp_multistage.py#L411-L415

This needs to be changed so that the exit code is only triggered once max_stage_iteration is reached.

I also discovered a related bug in the aiida-vasp plugin. If the workchain requests a restart, the plugin copies all files from the calculation folder to the new calculation folder (except INCAR, KPOINTS, POTCAR, and a few others). Therefore, we also get the vasprun.xml from the previous calculation. If the calculation fails for any reason, the plugin retrieves the files and triggers parsing; it sees the vasprun.xml but has no clue that it is from another run.
It needs to be fixed in the plugin by copying only WAVECAR and CHGCAR; that is all we need.

[BUG] popping non existent dictionary key

context

If the structure needs to be sorted due to a change of sign in the magnetic ordering, we remove the restart folder from the inputs to restart the calculation from scratch in:
https://github.com/morgan-group-bath/aiida-bjm/blob/89120e5f348d45ca75cb61d7dde02291b21ecdce/aiida_bjm/workchains/vasp_multistage.py#L401

This was not causing an error before, as we updated the current structure only after getting a converged structure from a production relaxation run. Now this has changed slightly, and we update the structure even after the short initial relaxation (which has not hit any error). Therefore, we may reach should_sort_structure before assigning any restart folder. Consequently, a KeyError is raised, as there is no restart_folder there.

solution

It has to be changed to:

self.ctx['vasp_base']['vasp'].pop('restart_folder', None)
