Comments (10)
I agree, I was just struggling with this issue. I opened a PR (#104) to update the default to allow
from forbid
from dbt-artifacts-parser.
I understand the philosophy. I wnat to clarify two things about extra fields.
- Should we allow to add a new fields in anywhere at any level?
- I feel this can be controlled by the JSON schemas of dbt artifacts.
First, I want to know the philosophy is applied to any fields at any level? For instance, it sounds natural to allow extra fields in
meta
properties. But, I am personally upset to allow extra fields at any level.Second, we can control to allow additional fields by the JSON schema of dbt artifacts. For instance, the
columns
property in a node has theadditionalProperties: true
property. So, the generated class has extra fields. I feel it would be good to clearly define where we can use additional fields in dbt artifacts. We can technically customize the generatec classes from the JSON schema of dbt artifacts. But, it would be a bit difficult to settle what we can customize in the package. I want to keep generated classes from the schemas as original as possible, because it is a bit hassle to manage the additional changes in generating code.I totally understand that it is quite difficult to change the JSON schema of dbt artifacts of the older versions in the dbt-core repository. But, I want to change the JSON schemas, not generatec class, in the repository.
Agree that the best is to handle extra fields policy in JSON Schema directly but currently it's not the case.
IMO for now as a temporary fix to make your package relevant for most users, we should allow at any-level to make sure parsing is handled.
You could think about an extra parameter for the function
likeallow_extra_fields
default to True
from dbt-artifacts-parser.
I understand the philosophy. I wnat to clarify two things about extra fields.
- Should we allow to add a new fields in anywhere at any level?
- I feel this can be controlled by the JSON schemas of dbt artifacts.
First, I want to know the philosophy is applied to any fields at any level? For instance, it sounds natural to allow extra fields in meta
properties. But, I am personally upset to allow extra fields at any level.
Second, we can control to allow additional fields by the JSON schema of dbt artifacts. For instance, the columns
property in a node has the additionalProperties: true
property. So, the generated class has extra fields. I feel it would be good to clearly define where we can use additional fields in dbt artifacts. We can technically customize the generatec classes from the JSON schema of dbt artifacts. But, it would be a bit difficult to settle what we can customize in the package. I want to keep generated classes from the schemas as original as possible, because it is a bit hassle to manage the additional changes in generating code.
I totally understand that it is quite difficult to change the JSON schema of dbt artifacts of the older versions in the dbt-core repository. But, I want to change the JSON schemas, not generatec class, in the repository.
from dbt-artifacts-parser.
but currently it's not the case.
Can you elaborate on it? I am still upset to apply a temporary solution due to the two reasons:
- it is hard to manage manual changes to generated files, because we have to apply them every time when we regenerate classes.
- The official schemas explicitely declare
"additionalProperties": false
. I want to admire the official declarations as much as possible. We use the datamodel-code-generator package to generatge classes from the JSON schemas. It doesn't specify the model configuration about extra fields, if the "additionalProperties" properties aren't set.
I totally understand some users want the temporary changes for extra fields. Meanwhile, we have to beare in mind other users to validate their inputs with the strict properties.
from dbt-artifacts-parser.
@aballiet BTW, can you tell me what kind of extra fields you want to add? I want to enchance the resolution about what some users want to do with the package. It would be good to give feedback not only to the repository, but also the dbt community, as we can also change the official JSON schemas in the dbt-core repository. That might be helpful for other dbt users as well.
I don't want to add any extra fields, just wanted to make your current code working on dbt cloud manifest haha
You can check doing your usual commands:
curl https://schemas.getdbt.com/dbt/manifest/v12.json -o json_schema_v12.json
datamodel-codegen --input json_schema_v12.json --input-file-type jsonschema --output model.py
from dbt-artifacts-parser.
+1 from the dbt team. This permissiveness would be better aligned with our philosophy that adding a new field (with a default) is a non-breaking change, and therefore does not require a major-version bump to artifacts: https://docs.getdbt.com/blog/latest-dbt-stability#stability-for-metadata-artifacts
From the dbt-core artifacts README
Non-breaking changes
Freely make incremental, non-breaking changes in-place to the latest major version of any artifact (minor or patch bumps). The only changes that are fully forward and backward compatible are:
- Adding a new field with a default
- Deleting a field with a default
...
Note: Although jsonschema validation using the schemas in schemas.getdbt.com is not encouraged or formally supported, jsonschema validation should still continue to work once the schemas are updated because they are forward-compatible and can therefore be used to validate previous minor versions of the schema.
from dbt-artifacts-parser.
but currently it's not the case.
Can you elaborate on it? I am still upset to apply a temporary solution due to the two reasons:
- it is hard to manage manual changes to generated files, because we have to apply them every time when we regenerate classes.
- The official schemas explicitely declare
"additionalProperties": false
. I want to admire the official declarations as much as possible. We use the datamodel-code-generator package to generatge classes from the JSON schema. It doesn't specify the model configuration about extra fields, if the "additionalProperties" properties are set.I totally understand some users want to the temporary changes for extra fields. Meanwhile, we have to beare in mind other users to validate their inputs with the strict properties.
Totally agree with you and I am not from dbt labs.
Maybe @jtcohen6 could answer this one on why dbt-labs does not write their jsonschema with this in mind as agree with you it's the purpose of schemas.
Will also ask their dbt cloud support...
from dbt-artifacts-parser.
@aballiet BTW, can you tell me what kind of extra fields you want to add? I want to enchance the resolution about what some users want to do with the package. It would be good to give feedback not only to the repository, but also the dbt community, as we can also change the official JSON schemas in the dbt-core repository. That might be helpful for other dbt users as well.
from dbt-artifacts-parser.
In that case, how about validating your artifacts with the official JSON schemas using something like check-jsonschema. By doing this, we can break down the issue. If you see any issues just on the validation, I would be great to give feedback to the dbt community and dbt Labs.
https://github.com/python-jsonschema/check-jsonschema
from dbt-artifacts-parser.
NVM they fixed on DBT Cloud side, now manifest parser works as expected 👍. Closing the issue
from dbt-artifacts-parser.
Related Issues (15)
- Support dbt manifest.json version 6 HOT 1
- [Feature request] Support for ManifestV7 HOT 2
- Base Classes HOT 5
- Support v8 manifest
- Show a list of contributors in README
- Validation errors for dbt > 1.4.1 HOT 7
- add support for `manifest_v9.json` HOT 2
- Security Policy violation SECURITY.md HOT 148
- ManifestV10 parsing errors on full 1.6 release HOT 1
- Support pydantic 2 HOT 7
- resource_type not defined correctly for manifest v11 HOT 4
- none is not an allowed value: sources -> source.default.xyz -> metadata -> type HOT 1
- Using local copies of artifact schemas for validation causes breakage HOT 3
- Add support for semantic manifest files
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from dbt-artifacts-parser.