Comments (1)
@rodrigo-fss thanks for opening! I bet it was very tricky to figure out what was going on!
Does this sound like an accurate re-statement of the problem?
Seed models will have duplicate rows if:
- `dbt seed` is run a second time without `--full-refresh`, and
- the user is missing a particular permission (or has a duplicative permission)
Your hunch that the `TRUNCATE` statement fails is a good one, but I cannot find info on BQ permissions related to truncate. It'd be great if we knew for sure what was breaking and why.
One thing you might be able to do is look at the `logs/dbt.log` file to see the exact statements that are run and potentially fail. However, dbt-bigquery relies heavily on API calls via the BQ Python SDK, so we might not be able to see the issue there.
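As a rough way to do that log check, here is a minimal sketch that scans a log file for lines mentioning `TRUNCATE`. The helper name `find_matching_lines` and the search pattern are my own assumptions for illustration, not anything dbt provides, and the statement may never show up in the log if it is issued through an API call.

```python
import re
from pathlib import Path

# Assumption: the keyword of interest is "truncate"; change the pattern to
# search for other statements (for example "insert").
PATTERN = re.compile(r"\btruncate\b", re.IGNORECASE)


def find_matching_lines(lines, pattern=PATTERN):
    """Return (line_number, text) pairs for every line that matches pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Default dbt log location relative to the project root.
    log_path = Path("logs/dbt.log")
    if log_path.exists():
        with log_path.open(encoding="utf-8", errors="replace") as handle:
            for number, text in find_matching_lines(handle):
                print(f"{number}: {text}")
    else:
        print(f"{log_path} not found; run this from the dbt project root")
```

If nothing is printed, that does not prove the statement was never attempted; it may simply not be logged as SQL.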
Even if we knew exactly which permission set causes the issue, I'm not sure what could be done within dbt-bigquery to prevent it. Perhaps a `dbt debug` check to make sure the user has the right permissions? Regardless, this is something we should clearly document so that no one else runs into the same issue.
from dbt-bigquery.