Comments (6)
Sounds great, will have a look at `delete+insert`.

With regards to the last point: yes, the `when matched` clause is very slow with Snowflake, and insert-only is MUCH faster, at least from our testing on 50M and 800M rows.
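For reference, a minimal sketch of opting into that strategy; the model, column, and key names here are hypothetical:

```sql
-- switch the incremental strategy from the default merge to delete+insert
-- (model, column, and key names are hypothetical)
{{
    config(
        materialized='incremental',
        incremental_strategy='delete+insert',
        unique_key='id'
    )
}}

select id, updated_at, payload
from {{ ref('stg_events') }}

{% if is_incremental() %}
-- only process rows that arrived since the last run
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```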
Hey @joshuataylor, thanks for opening! Have you tried defining an incremental model without a `unique_key`? That has the effect of running `insert into`, i.e. "append-only".

On dbt-spark, we have an explicit incremental strategy named `append` for this behavior. On most other adapters, this behavior is implicitly supported via one of the existing strategies when `unique_key` is not defined. Might make a lot of sense to rationalize this behavior across adapters! (cc @VersusFacit)
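A minimal sketch of that pattern, assuming hypothetical model and column names: with no `unique_key` configured, dbt-snowflake renders a plain `insert into` on incremental runs instead of a `merge`.

```sql
-- an "append-only" incremental model: no unique_key means no merge,
-- just INSERT INTO on incremental runs (names are hypothetical)
{{
    config(
        materialized='incremental'
    )
}}

select id, updated_at, payload
from {{ ref('stg_events') }}

{% if is_incremental() %}
-- only pick up rows that arrived since the last run
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```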
Removing the `unique_key` sounds like a good quick fix, but then we lose the `when not matched` guard: if a record accidentally shows up in the result set again, it will be inserted as a duplicate instead of being skipped.
Ah! Sorry, I misunderstood the original issue — you're totally right.
> Just using the merge, but this can take a while.
It sounds like performance is your main motivation here. A couple of options to try:

- You can try using the `delete+insert` strategy instead. Anecdotally, some users have found this to be faster.
- There's an open PR that proposes to speed up the `merge` strategy by using a view instead of a temp table for the intermediate results: #93
- Your proposal in this issue: we could add logic here saying that, if the user has configured `merge_update_columns = []` (docs), dbt should skip templating the `when matched then update set` statement entirely. In the meantime, you could try this with just `merge_update_columns = ['id']` (or whatever your `unique_key` column is named) to see if that speeds things up. My hunch is that the row-by-row `unique_key` match is the more expensive part of the operation, relative to the column-by-column update, but I could be wrong, and even so, every little bit may help. See the sketch after this list.
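To make that concrete, here is a hedged sketch of the two MERGE shapes involved; the table and column names are hypothetical, and the identifiers dbt-snowflake actually generates will differ:

```sql
-- today, with merge_update_columns = ['id']: a no-op update that Snowflake
-- still evaluates row by row for every matched record
merge into analytics.events as dest
using analytics.events__dbt_tmp as src
    on src.id = dest.id
when matched then update set
    id = src.id
when not matched then insert (id, updated_at, payload)
    values (src.id, src.updated_at, src.payload);

-- proposed, with merge_update_columns = []: the "when matched" clause is
-- skipped entirely, leaving insert-only semantics
merge into analytics.events as dest
using analytics.events__dbt_tmp as src
    on src.id = dest.id
when not matched then insert (id, updated_at, payload)
    values (src.id, src.updated_at, src.payload);
```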
I have a similar requirement: I want to do a merge that inserts records if they do not exist yet, without updating them.

The problem with adding the unique key to `merge_update_columns` is that you are then needlessly updating records, which is bad for performance.
I think that making sure no `when matched then update set` statement is added when `merge_update_columns = []` is the best solution.
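Under that proposal, a model config might look like this sketch; the key and model names are hypothetical, and note this describes the proposed behavior, not current dbt-snowflake behavior:

```sql
-- proposed: an empty merge_update_columns list would suppress the
-- "when matched then update set" clause, giving an insert-only merge
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='id',
        merge_update_columns=[]
    )
}}

select id, updated_at, payload
from {{ ref('stg_events') }}
```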
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
Related Issues (20)
- [ADAP-1065] [Feature] Allow users to set snowflake debug logging in profile
- [ADAP-1066] 1.7.1 - Security Fix Release not available as Container Image
- [ADAP-1076] [Feature] [Dynamic Tables] Add new parameter properties (REFRESH_MODE, INITIALIZE)
- [ADAP-1078] [Feature] Add Materialized View as a Materialization to dbt-snowflake
- [ADAP-1080] [Bug] Break in Dev build due to https://github.com/dbt-labs/dbt-core/pull/8906/files
- [ADAP-1081] [Bug] `connections._split_queries()` breaks stored procedure statements (LANGUAGE SQL)
- [ADAP-1084] [Feature] Support Column Comment
- [ADAP-1088] Run details - rows_affected always return 1
- [ADAP-1091] [Feature] Include Snowflake session_id in run results response
- [ADAP-1092] [Feature] Migrate base adapter references to match dbt-core
- [ADAP-1094] [Feature] Improving the addition of table comment/description in only one query
- [Feature] Implement model contracts for dynamic tables
- [ADAP-1096] [Bug] Misleading message when trying to update dynamic table
- [ADAP-1097] Migrate to dbt-common and dbt-adapters package
- don't require double quotes around numbers in unit testing yml
- [Bug] Contract enforcement results in an error when `data_type: GEOGRAPHY`
- DBT creating transient table in snowflake even when we set +transient: false for model in dbt_project.yml
- [Unit Testing] Support primitive types + objects and arrays in dbt-bigquery
- [Bug] Source metadata freshness is using the wrong field to calculate last updated
- [Bug] snowflake__create_table_as contract enforcing incorrect sql code