My VASP TaskDocuments are often filled with lots of [...]

Feature Suggestion: Add a kwarg to the VASP TaskDocument that can drop null entries (atomate2, closed)

materialsproject commented on June 12, 2024

Feature Suggestion: Add a kwarg to the VASP TaskDocument that can drop null entries

Comments (7)

janosh commented on June 12, 2024

Yes, I've wanted that in atomate1 too!

utf commented on June 12, 2024

> @utf I think pydantic does have a concept of optional vs required fields.

Yep, but I believe "optional field" just means the value can be None or an empty dict or list etc. Technically, all fields in the atomate2 task documents are already optional.

utf commented on June 12, 2024

Interesting - the main point of having a schema is so that you can enforce a structure for task documents. I.e., you can guarantee each document will have every field. This was actually an issue for MP, i.e. some calculations parsed using older versions of pymatgen didn't have all the correct fields in the database, which meant you have to add a lot of checks in the builders.
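
To make the builder concern concrete, here is a small sketch that uses plain dictionaries as stand-ins for stored task documents. The field names (`output`, `energy`, `bandgap`) and both helper functions are illustrative assumptions, not the atomate2 schema or any builder code.

```python
# Two stand-ins for stored task documents (field names are hypothetical).
full_doc = {"output": {"energy": -10.5, "bandgap": None}}  # every key present
sparse_doc = {"output": {"energy": -10.5}}                 # null keys dropped


def bandgap_strict(doc):
    """Works only if the schema guarantees that every key exists."""
    return doc["output"]["bandgap"]


def bandgap_defensive(doc):
    """Needs a fallback at each level when keys may be missing."""
    return doc.get("output", {}).get("bandgap")


print(bandgap_strict(full_doc))       # None
print(bandgap_defensive(sparse_doc))  # None

try:
    bandgap_strict(sparse_doc)
except KeyError as err:
    print("missing key:", err)        # missing key: 'bandgap'
```

With a guaranteed schema the strict accessor is enough; once keys can be absent, every consumer has to switch to the defensive form.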

I'm not sure dropping empty keys is actually possible with pydantic models as their primary goal is to enforce a schema.

Can I ask why this is an issue for you?

janosh commented on June 12, 2024

@utf I think pydantic does have a concept of optional vs required fields.

@arosen93 When you say MongoDB, do you mean the Compass app? If so, a feature I have found somewhat helpful in this regard is to define projections that include only the fields you care about and then save them as favorite queries, so you don't need to retype them every time.
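
For readers who have not used projections, the sketch below emulates in plain Python what an inclusion projection returns. It does not talk to MongoDB: `project` is a hypothetical helper written for this illustration, and the field names are made up. In Compass or a driver, only the projection document itself (here `{"task_label": 1, "output.energy": 1, "_id": 0}`) would be saved with the query.

```python
def project(doc, projection):
    """Emulate an inclusion projection: keep only the dotted paths marked 1.

    Simplification: real MongoDB returns _id unless it is excluded; this
    helper only returns paths that are listed with a truthy value.
    """
    result = {}
    for path, include in projection.items():
        if not include:
            continue  # e.g. "_id": 0 means "leave this field out"
        keys = path.split(".")
        # Walk down to the requested value; skip the path if a key is missing.
        value = doc
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break
            value = value[key]
        else:
            # Rebuild the nested structure for this path in the result.
            target = result
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return result


doc = {
    "_id": 1,
    "task_label": "relax",
    "custodian": None,
    "output": {"energy": -10.5, "bandgap": None, "forces": None},
}
projection = {"task_label": 1, "output.energy": 1, "_id": 0}

print(project(doc, projection))
# {'task_label': 'relax', 'output': {'energy': -10.5}}
```

The stored documents keep every key; only the view is trimmed, so the schema guarantee is not affected.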

utf commented on June 12, 2024

Actually, maybe this is what you want: https://pydantic-docs.helpmanual.io/usage/exporting_models/#modeldict

There are a couple of options to Model.dict():

- exclude_unset: whether fields which were not explicitly set when creating the model should be excluded from the returned dictionary; default False. Prior to v1.0, exclude_unset was known as skip_defaults; use of skip_defaults is now deprecated.
- exclude_defaults: whether fields which are equal to their default values (whether set or otherwise) should be excluded from the returned dictionary; default False.

I think Model.dict(exclude_defaults=True) should do the trick. I guess this could be added somewhere in jobflow rather than atomate2?
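
A minimal sketch of how the two options differ, assuming pydantic is installed. `ToyTaskDoc` is a hypothetical three-field model, not the atomate2 TaskDocument, and `to_dict` is a small helper introduced here so the same code runs on pydantic v1 (`.dict()`) and v2 (`.model_dump()`).

```python
from typing import Optional

from pydantic import BaseModel


class ToyTaskDoc(BaseModel):
    """Hypothetical stand-in for a task document; not the atomate2 schema."""

    task_label: Optional[str] = None
    energy: Optional[float] = None
    bandgap: Optional[float] = None


def to_dict(model, **kwargs):
    """Call model_dump() on pydantic v2, or dict() on pydantic v1."""
    if hasattr(model, "model_dump"):
        return model.model_dump(**kwargs)
    return model.dict(**kwargs)


# bandgap is passed explicitly as None; energy is never passed at all.
doc = ToyTaskDoc(task_label="relax", bandgap=None)

print(to_dict(doc))
# {'task_label': 'relax', 'energy': None, 'bandgap': None}
print(to_dict(doc, exclude_unset=True))
# {'task_label': 'relax', 'bandgap': None}
print(to_dict(doc, exclude_defaults=True))
# {'task_label': 'relax'}
```

Note that `exclude_unset` keeps `bandgap` because it was set explicitly, even though its value is None, while `exclude_defaults` drops it because None equals the default.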

Andrew-S-Rosen commented on June 12, 2024

Thank you for this very useful background and discussion!

> I'm not sure dropping empty keys is actually possible with pydantic models as their primary goal is to enforce a schema.

I suppose one could use something like delattr, but I definitely get what you mean.

> Can I ask why this is an issue for you?

It's not so much an issue as an inconvenience. When I pull up a dataset in Studio3T, which is my program of choice, each deposited set of calculation results is full of null entries. Visually, it makes it a little bothersome to scroll through. I'm a very visual person and can't remember how schemas are structured for the life of me, so I'm always referring to Studio3T for how my dataset is structured. I thought it'd be nice to have it be "cleaner", albeit at the expense of not having the same keys for all documents. I can see how that might be an issue downstream for some (I haven't run into a scenario like that yet, but maybe one day I will).

> There are a couple of options to Model.dict():

This is fantastic! Yes, I think this is the best solution. Actually, my original plan was to suggest adding a function like clean_dict() (rather than modifying the pydantic model directly) that operates the same way as .dict() but returns a cleaner result. Seems like there's already a route for that. Thanks for the find.
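
For reference, a helper along those lines could look roughly like the sketch below. Only the name `clean_dict` comes from this comment; the implementation, including the choice to leave None items inside lists untouched so that list positions are preserved, is an assumption. It works on the plain dictionary that `.dict()` returns and needs nothing beyond the standard library.

```python
def clean_dict(data):
    """Recursively drop dictionary keys whose value is None.

    Lists are traversed so that nested dictionaries are cleaned too, but
    None items inside a list are kept so that positions do not shift.
    """
    if isinstance(data, dict):
        return {key: clean_dict(value) for key, value in data.items() if value is not None}
    if isinstance(data, list):
        return [clean_dict(item) for item in data]
    return data


# Field names below are made up for illustration.
raw = {
    "task_label": "relax",
    "custodian": None,
    "output": {"energy": -10.5, "bandgap": None},
    "steps": [{"e": -10.1, "stress": None}, None],
}

print(clean_dict(raw))
# {'task_label': 'relax', 'output': {'energy': -10.5}, 'steps': [{'e': -10.1}, None]}
```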

> I think Model.dict(exclude_defaults=True) should do the trick. I guess this could be added somewhere in jobflow rather than atomate2?

exclude_defaults seems slightly different. For instance, if a field's default value were True, it wouldn't be stored in the returned dict unless it's set to False, whereas my main concern is the null entries. Here, it would generally have the desired effect anyway, because the defaults are generally (always?) None.
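
The difference can be seen with a toy model that has one non-None default. This is a sketch assuming pydantic is installed; `ToyCalcDoc` and `to_dict` are hypothetical names, not part of atomate2 or jobflow. It also uses `exclude_none`, another documented option of pydantic's `.dict()` / `.model_dump()` that was not mentioned above and that targets null values specifically.

```python
from typing import Optional

from pydantic import BaseModel


class ToyCalcDoc(BaseModel):
    """Hypothetical model with one default that is not None."""

    converged: bool = True
    energy: Optional[float] = None
    bandgap: Optional[float] = None


def to_dict(model, **kwargs):
    """Call model_dump() on pydantic v2, or dict() on pydantic v1."""
    if hasattr(model, "model_dump"):
        return model.model_dump(**kwargs)
    return model.dict(**kwargs)


doc = ToyCalcDoc(energy=-10.5)

# exclude_defaults drops converged=True together with the null entry.
print(to_dict(doc, exclude_defaults=True))
# {'energy': -10.5}

# exclude_none drops only the entry whose value is None.
print(to_dict(doc, exclude_none=True))
# {'converged': True, 'energy': -10.5}
```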

Anyway, given the points here, I think it makes sense to not modify this in Atomate2 directly. It could be modified with a kwarg in Jobflow, as you suggested.

Andrew-S-Rosen commented on June 12, 2024

I'm closing this issue because I agree it's better suited for Jobflow.
