dbt-labs / dbt-snowflake

dbt-snowflake contains all of the code enabling dbt to work with Snowflake

Home Page: https://getdbt.com

License: Apache License 2.0

Python 95.27% PLpgSQL 3.15% Shell 0.48% Makefile 0.41% Dockerfile 0.69%

dbt-snowflake's Introduction


dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-snowflake

The dbt-snowflake package contains all of the code enabling dbt to work with Snowflake. For more information on using dbt with Snowflake, consult the docs.

Getting started

Join the dbt Community

Reporting bugs and contributing code

Code of Conduct

Everyone interacting in the dbt project's codebases, issue trackers, chat rooms, and mailing lists is expected to follow the dbt Code of Conduct.

dbt-snowflake's People

Contributors

aranke, bastienboutonnet, beckjake, brunomurino, carolus-holman, chenyulinx, christiankohlberg, cmcarthur, colin-rogers-dbt, dbeatty10, dependabot[bot], drewbanin, drmctaco, emmyoop, fishtownbuildbot, github-actions[bot], gshank, iknox-fa, jtcohen6, leahwicz, matt-winkler, mcknight-42, michelleark, mikealfare, mr-nobody99, nssalian, peterallenwebb, prratek, sfc-gh-aseitz, versusfacit


dbt-snowflake's Issues

[CT-1619] [Bug] Could not connect to Snowflake backend after 0 attempt(s)

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

After upgrading to dbt 1.0.0, the dbt CLI no longer connects to Snowflake.

Expected Behavior

dbt connects to Snowflake after running dbt debug.

Steps To Reproduce

Upgrade to dbt 1.0.0
execute dbt debug

Relevant log output

20:30:54  Running with dbt=1.0.0
dbt version: 1.0.0
python version: 3.9.9
python path: /usr/local/Cellar/dbt-snowflake/1.0.0/libexec/bin/python3.9
os info: macOS-12.0.1-x86_64-i386-64bit
Using profiles.yml file at /Users/me/.dbt/profiles.yml
Using dbt_project.yml file at /Users/me/Documents/Intellij Projects/gl/dbt/dbt_project.yml

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  account: ********
  user: ********
  database: ********
  schema: ********
  warehouse: ********
  role: ********
  client_session_keep_alive: False
  Connection test: [ERROR]

1 check failed:
dbt was unable to connect to the specified database.
The database returned the following error:

  >Database Error
  250001: Could not connect to Snowflake backend after 0 attempt(s).Aborting

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile

Environment

- OS:MacOS Monterey
- Python: 3.9.9
- dbt: 1.0.0

What database are you using dbt with?

snowflake

Additional Context

I have upgraded via homebrew to dbt 1.0.0 and just like that all connections to snowflake stopped working.

[CT-84] [Bug] Models are not correctly quoted when materialized as views in dbt 1.0

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Models with quoted columns (e.g. containing spaces) fail to run when materialized as a view.

It looks like dbt 1.0.1 is explicitly referencing column names in the create or replace view statement:

e.g.

create or replace  view MODEL_DEV.dbt_XXXX.finance_concepts    
(
     App
   , 
     Account Name
)
  as (
WITH finance_concepts AS (

where 0.21.1 did not:

  create or replace  view MODEL_DEV.dbt_XXXX.finance_concepts  as (
WITH finance_concepts AS (

Materializing as a table in 1.0.1 does not reference columns and works correctly:

      create or replace transient table MODEL_DEV.dbt_XXXX.finance_concepts  as
      (
WITH finance_concepts AS (

Expected Behavior

Either correctly quote columns or stop explicitly referencing columns in create or replace view
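For illustration, a correctly quoted version of the generated DDL might look like the following (a sketch reusing the column names from the excerpt above):

create or replace  view MODEL_DEV.dbt_XXXX.finance_concepts
(
     "App"
   ,
     "Account Name"
)
  as (
WITH finance_concepts AS (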

Steps To Reproduce

  1. in dbt 1.0.1
  2. create model with spaces in column name
  3. materialize as view
  4. dbt run model will fail
  5. materialize as table
  6. dbt run model will succeed

Relevant log output

No response

Environment

- OS: dbt Cloud
- dbt: 1.0.1

What database are you using dbt with?

snowflake

Additional Context

No response

Support persist_docs for view columns in Snowflake

Describe the feature

Currently persist_docs is available for relations in Snowflake, but view column documentation is not supported. This would implement that support.

Additional context

This has been proposed in issues #2334 and #1574. Both of these issues have been closed, but implementation at the view column level wasn't completed.

Who will this benefit?

This will be useful for Snowflake users who prefer to store column-level metadata for their views in the database.

Are you interested in contributing this feature?

I'm more than happy to contribute this feature 😄

[CT-331] Views with persist_docs on columns: check to see if any comments first

From #100 (comment):

I agree that a better version of this behavior would first check to see if there are any column comments at all, and exclude the column list if there are none. That change would need to happen here:

{% if config.persist_column_docs() -%}
{% set model_columns = model.columns %}
{% set query_columns = get_columns_in_query(sql) %}
{{ get_persist_docs_column_list(model_columns, query_columns) }}
{%- endif %}
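A minimal sketch of what that check could look like, assuming the surrounding macro names stay as above and that each entry in model.columns carries a description field:

{% if config.persist_column_docs() -%}
{% set model_columns = model.columns %}
{# only emit the column list when at least one column actually has a description #}
{% set commented_columns = model_columns.values() | selectattr("description") | list %}
{% if commented_columns | length > 0 %}
{% set query_columns = get_columns_in_query(sql) %}
{{ get_persist_docs_column_list(model_columns, query_columns) }}
{% endif %}
{%- endif %}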

[Feature] Helpful messages when generate_database/schema_name has custom logic and relation not found

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

I think it's a rather common-ish occurrence that someone does a dbt run only to find an error saying their database / schema was not found / doesn't exist (e.g. Object does not exist, or operation cannot be performed.), only to then discover/recall that they have a custom generate_database_name() or generate_schema_name() implementation.

It would be useful if the error printed additional help text that points that out to them (e.g. Object does not exist, or operation cannot be performed. A custom generate_relation_name() macro was also found in the macros folder that may have caused this.).

Describe alternatives you've considered

No response

Who will this benefit?

Anyone who runs across this error and has custom relation name generation logic. Low priority.

Are you interested in contributing this feature?

Yes

Anything else?

Just the most recent example: https://getdbt.slack.com/archives/CBSQTAPLG/p1634507168413000

I feel like there have been more in the past but can't find them.

dbt seed throwing an error

I am new to dbt, so I was trying to seed a CSV file to Snowflake using the dbt seed command, and I am ending up with this error.

[screenshot of the error attached]

[CT-333] Macro for warehouse selection

Describe the feature

It would be nice to have a macro similar to generate_schema_name for selecting the Snowflake warehouse to use, for example to change the warehouse depending on the environment.
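For illustration, a hypothetical macro (the macro name and warehouse names below are invented) analogous to generate_schema_name might look like:

{% macro generate_warehouse_name(default_warehouse) -%}
    {#- pick a warehouse based on the target environment -#}
    {%- if target.name == 'prod' -%}
        TRANSFORM_WH_PROD
    {%- else -%}
        {{ default_warehouse }}
    {%- endif -%}
{%- endmacro %}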

Describe alternatives you've considered

I tried a pre-hook with a use warehouse query, but that doesn't give the flexibility needed; for example, if you use a wildcard in the name, all the queries that don't use the pre-hook will fail.

Additional context

Please include any other relevant context here.

Who will this benefit?

Everyone who wants more control and flexibility over the warehouse selected.

Are you interested in contributing this feature?

Yes, I have some idea of how to do it; I've performed some local tests and it's working. With some help, as it's my first contribution, I will be able to raise a PR.

dbt profiles.yml client_session_keep_alive is timing out after an hour

Describe the bug

We are using the dbt CLI to create and run models on Snowflake, with external authentication via Okta. The profiles.yml snippet looks like:

role: DEV_TRANSFORM
warehouse: DEV_TRANSFORM_WH
database: BDWDEV_ETL
schema: wrk
threads: 2
client_session_keep_alive: we tried both true and false as values
authenticator: externalbrowser

While running models that do bulk data transformations from VS Code, the auth token expires after an hour and a new session is opened; this session does not get the commit message back from Snowflake for query completion, so no data gets processed. Is there an option in dbt to override the Okta auth token expiry?

System information

Which database are you using dbt with?
Snowflake

The output of dbt --version:

installed version: 0.18.1
   latest version: 0.19.1

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - bigquery: 0.18.1
  - postgres: 0.18.1
  - redshift: 0.18.1
  - snowflake: 0.18.1

The operating system you're using:
Windows

dbt persist_docs - please persist view and view column descriptions to Snowflake when a model is a view AND persist_docs = True

Describe the feature

When a model is materialized as a view in snowflake and persist_docs for that model is true, descriptions for that model and that model's columns are persisted in Snowflake as view and column comments.

Describe alternatives you've considered

None

Additional context

I believe there was a time when Views in Snowflake did not support commenting (or their columns didn't?), but a dbt Cloud client mentioned to me that Snowflake does now support this and asked that we look into supporting this functionality. Also I think it's a good idea.

Who will this benefit?

Snowflake users that leverage persist_docs and have views in their projects.

Are you interested in contributing this feature?

Yes!!! I'm guessing we have a check somewhere in the persist_docs handler that just doesn't attempt to alter the view to avoid errors and we can just remove that check for Snowflake.

[CT-315] CREATE VIEW statement column list includes invalid CAST()

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?
Models that cast a column only in the final CTE result in the CREATE VIEW statement performing the cast in the view column list. This applies to both hard-coded string literals and columns selected from another object.

Steps To Reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.

Model code:

--test_model.sql
with test_unioned as (
        select '1' as column1
        union
        select '2' as column1
    )
    
    , final as (
        select cast(column1 as varchar)
        from test_unioned
    )
    
    select * from final

Resulting dbt run sql:

create
or replace view MY_DB.PUBLIC.test_model (
    CAST(COLUMN1 AS VARCHAR)
) as (
    with test_unioned as (
        select
            '1' as column1
        union
        select
            '2' as column1
    ),
    final as (
        select
            cast(column1 as varchar)
        from
            test_unioned
    )
    select
        *
    from
        final
);

To work around the issue, change the model to cast columns before the final CTE, as below:

with test_unioned as (
        select cast('1' as varchar) as column1
        union
        select cast('2' as varchar) as column1
    )
    
    , final as (
        select column1
        from test_unioned
    )
    
    select * from final

Expected behavior

A clear and concise description of what you expected to happen.
For Snowflake, CAST() in the column list of a CREATE VIEW statement is not valid syntax. dbt run should not build CREATE VIEW statements that include this function in the column list.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

The output of dbt --version:

dbt --version
installed version: 1.0.3
   latest version: 1.0.3

Up to date!

Plugins:
  - snowflake: 1.0.0 - Up to date!

The operating system you're using:
Windows 10

The output of python --version:
Python 3.9.10

Additional context

It would appear that computed columns with simple expressions are supported by Snowflake, but CAST() isn't included in this list of functions. (ref: https://community.snowflake.com/s/question/0D50Z00008ixGQKSA2/does-snowflake-supports-computed-columns-in-a-table-while-creating-table)

Snowflake queueing causes keypair auth'd connections to fail

Describe the feature

Snowflake keypair connections have 10 seconds to execute before the embedded JWT token expires. If many concurrent queries are opened against Snowflake when using keypair auth, Snowflake can fail to open connections with this error:

Failed to connect to DB: db.snowflakecomputing.com:443. JWT token is invalid.

I don't think there's too much we can do on dbt's side of things to prevent this, but we can consider doing the following things:

  • catch this error and retry opening the connection after some timeout
  • provide better tooling for users to understand how queueing affects dbt invocations
    • This could be as simple as a handy Snowflake query you can run to analyze concurrently running dbt queries over time
    • It would be awesome to include queued time in the logs or run_results.json file if Snowflake makes it available to us

Steps To Reproduce

  1. Configure a Snowflake connection with keypair auth
  2. Create a sample project with 64 models in it
  3. Run dbt run --threads 64

Expected behavior

TBD. We can address this issue with:

  • Better docs and tooling
  • Retries
  • More information in dbt log output

System information

Which database are you using dbt with?

  • snowflake

The output of dbt --version:
Any

The operating system you're using:
Any

The output of python --version:
Any

[CT-185] Remove use of temporary table for merge incremental strategy in Snowflake

Describe the feature

The MERGE command in Snowflake supports the use of arbitrary SELECT queries in the USING clause. If we adopted this approach for the merge incremental strategy, it would remove a database write step that we currently perform by creating a temporary table to store the data to be merged.
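A rough sketch of what the generated statement could look like without the temporary table (table and column names here are made up):

merge into analytics.my_incremental as dbt_dest
using (
    -- the compiled model SQL would be inlined here instead of a temp table
    select id, updated_at, payload
    from analytics.stg_events
) as dbt_src
on dbt_dest.id = dbt_src.id
when matched then update set
    dbt_dest.updated_at = dbt_src.updated_at,
    dbt_dest.payload = dbt_src.payload
when not matched then insert (id, updated_at, payload)
    values (dbt_src.id, dbt_src.updated_at, dbt_src.payload);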

Describe alternatives you've considered

The alternative is keeping the status quo. The proposed change would represent a performance improvement for the adapter when using the merge strategy.

Additional context

Please include any other relevant context here.

Who will this benefit?

Any customer working with incremental tables that have very large incremental loads stands to see a performance improvement from this.

Are you interested in contributing this feature?

Yes, to an extent. I can get some of the way, but would probably need assistance getting this to a merge-able state.

[snowflake] Commit with wrong session when merge

Describe the bug

When running an incremental update with dbt, the transaction started for the merge is not committed afterwards. A commit command does run afterwards, but with a different session_id.

Steps To Reproduce

Not 100% sure how to reproduce, but with a long enough merge query and a couple of post-hooks, it ends up committing with the wrong session_id.

Expected behavior

Commit with the same session_id and release the lock.

Screenshots and log output

[two screenshots attached]

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • [x ] snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.17.0
   latest version: 0.18.0

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - bigquery: 0.17.0
  - snowflake: 0.17.0
  - redshift: 0.17.0
  - postgres: 0.17.0

The operating system you're using:
macos 10.15.6

The output of python --version:
Python 3.6.8 :: Anaconda, Inc.

Snowflake Account URL

Describe the feature

Make the account value fully configurable to allow for URLs that do not necessarily conform to the snowflakecomputing.com convention.

_TOKEN_REQUEST_URL = 'https://{}.snowflakecomputing.com/oauth/token-request'

Could become something like:

_TOKEN_REQUEST_URL = 'https://{}/oauth/token-request'

Describe alternatives you've considered

Rewriting DNS records, changing security configuration of network

Additional context

If an organisation derives its own account value that doesn't conform to the standard hard-coded in the library, it becomes impossible to properly connect from local clients to the Snowflake account using dbt.

Who will this benefit?

Snowflake users, and more mature organisations that maintain a strong security posture.

Are you interested in contributing this feature?

Yes

Snowflake create schema with managed access

Describe the feature

Snowflake managed schemas provide different behaviour for managing permissions on objects within the schema, for example future grants without the need for account-level managed grants.

https://docs.snowflake.com/en/sql-reference/sql/create-schema.html
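For reference, the Snowflake DDL is a one-liner (schema name assumed here):

create schema if not exists analytics.reporting with managed access;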

Additional context

snowflake specific

Who will this benefit?

This can simplify management of role grants on objects within the schema, in particular future grants, or cases where schema child objects were created with a different role than the one dbt is running with.

[CT-339] Persist dbt tags to Snowflake using object tagging feature

Describe the feature

When tags are defined for various objects (tables, columns, etc), allow the option to persist these tags down to Snowflake using Snowflake's object tagging feature.
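A sketch of the Snowflake statements the adapter could emit (tag and object names below are invented):

create tag if not exists governance.tags.pii_level;
alter table analytics.customers set tag governance.tags.pii_level = 'low';
alter table analytics.customers modify column email set tag governance.tags.pii_level = 'high';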

Describe alternatives you've considered

I've considered writing a macro to do this, but given the development time I would prefer something more built-in.

Additional context

Tags are used in Snowflake to apply security and masking. Having to maintain tags in two different areas is cumbersome and prone to error.

Who will this benefit?

Tags are useful in dbt for a number of reasons already known. Persisting them into Snowflake would allow users to utilize Snowflake's security methods and prevent having to define tags in two different areas.

Are you interested in contributing this feature?

I can certainly contribute expertise and technical assistance.

[CT-203] A single show terse call for each database

Describe the feature

Currently, dbt makes a call to Snowflake for every schema. This requires opening a connection and running show terse objects in schema <schema>. For projects with many custom schemas, though, this can add a lot of overhead. I think it would be faster to instead pull all the objects for a database and then cache the ones for the schemas we need.
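In other words, a single call along these lines instead of one per schema (database name assumed):

show terse objects in database analytics;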

Describe alternatives you've considered

Not sure if there are other options to speed this up.

Additional context

I have a working example here: DevotedHealth@f0bc4aa

Who will this benefit?

Projects with many custom schemas.

Are you interested in contributing this feature?

Yes!

[CT-443] Potentially sensitive data can be written to logfiles on incremental model failure

Describe the bug

When a merge statement fails on Snowflake with a duplicate row, Snowflake will return the data from the row that failed in the format Duplicate row detected during DML action Row Values: [12345, "col_a_value", "col_b_value", "col_c_value", "etc"...]

Since dbt reports those errors back to the user via stdout and logs them in dbt.log when such an error occurs, data from the warehouse is written to disk, potentially multiple times if dbt is run from a process capturing stdout (like many orchestrators). If the incremental model processes sensitive data, this is a way for said sensitive data to find its way into logfiles, which is a pattern discouraged by OWASP

This error reporting behavior can be suppressed in Snowflake via the ERROR_ON_NONDETERMINISTIC_MERGE parameter but I don't really want to modify the behavior of the merge, just modify the logging behavior.

Logging this in dbt-snowflake instead of dbt-core because I think it's more of a quirk of Snowflake than it is a core dbt problem. dbt is just reporting the error returned from the DB, and I'm not sure if other DBs supported by dbt behave this way on a failed merge. The error message from Snowflake is helpful, but I think it's being too helpful here.

Steps To Reproduce

  1. Create an incremental model:
-- fail_merge.sql
{{ config(
    materialized='incremental',
    unique_key='id'
) }}

with data as (
    SELECT $1 id, $2 name FROM (
        VALUES (1, 'one'), (2, 'two'), (3, 'three')
    )
)
select * from data
  2. Run the incremental model once.
โฏ dbt run -m fail_merge
Running with dbt=0.21.0
Found 231 models, 126 tests, 20 snapshots, 0 analyses, 188 macros, 0 operations, 0 seed files, 108 sources, 0 exposures

10:20:26 | Concurrency: 6 threads (target='dev_snowflake')
10:20:26 | 
10:20:26 | 1 of 1 START incremental model analytics.fail_merge.................. [RUN]
10:20:28 | 1 of 1 OK created incremental model analytics.fail_merge............. [SUCCESS 1 in 1.69s]
10:20:28 | 
10:20:28 | Finished running 1 incremental model in 9.08s.

Completed successfully
  3. Modify your incremental model to return a duplicate row
-- fail_merge.sql after edit
{{ config(
    materialized='incremental',
    unique_key='id'
) }}

with data as (
    SELECT $1 id, $2 name FROM (
        VALUES (1, 'one'), (2, 'two'), (3, 'three'), (1, 'one')
    )
)
select * from data
  4. Run your incremental model again and note the presence of the data that failed the merge in stdout and dbt.log
โฏ dbt run -m fail_merge
Running with dbt=0.21.0
Found 231 models, 126 tests, 20 snapshots, 0 analyses, 188 macros, 0 operations, 0 seed files, 108 sources, 0 exposures

10:26:09 | Concurrency: 6 threads (target='dev_snowflake')
10:26:09 | 
10:26:09 | 1 of 1 START incremental model analytics.fail_merge.................. [RUN]
10:26:12 | 1 of 1 ERROR creating incremental model analytics.fail_merge......... [ERROR in 3.22s]
10:26:12 | 
10:26:12 | Finished running 1 incremental model in 10.86s.

Completed with 1 error and 0 warnings:

Database Error in model fail_merge (models/analytics/fail_merge.sql)
  100090 (42P18): Duplicate row detected during DML action
  Row Values: [1, "one"]
  compiled SQL at target/run/launchdarkly/models/analytics/fail_merge.sql

Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
โฏ tail -n 8 logs/dbt.log
2022-03-31 17:26:13.020221 (MainThread): Database Error in model fail_merge (models/analytics/fail_merge.sql)
2022-03-31 17:26:13.020394 (MainThread):   100090 (42P18): Duplicate row detected during DML action
2022-03-31 17:26:13.020558 (MainThread):   Row Values: [1, "one"]
2022-03-31 17:26:13.020717 (MainThread):   compiled SQL at target/run/launchdarkly/models/analytics/fail_merge.sql
2022-03-31 17:26:13.020883 (MainThread): 
Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
2022-03-31 17:26:13.021132 (MainThread): Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11377be20>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11377bfa0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11377b460>]}
2022-03-31 17:26:13.021412 (MainThread): Flushing usage events

Expected behavior

Contents of Row Values: [...] are not written to disk and/or logged to stdout. While the example above didn't include sensitive data, if the model processed sensitive data it would, and ideally that data in the warehouse should not be logged.

Screenshots and log output

See "Steps To Reproduce"

System information

The output of dbt --version:

โฏ dbt --version
installed version: 0.21.0
   latest version: 1.0.0

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - snowflake: 0.21.0

The operating system you're using:
macOS Monterey 12.3

The output of python --version:

โฏ python --version
Python 3.8.12

Additional context

Slack thread

[Bug] Unable to preview varchar that contains binary due to Snowflake python adapter error

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Unable to preview a column that includes binary but is a varchar in Snowflake with the following error:

252005: Failed to convert current row, cause: 'utf-8' codec can't decode byte 0X87 in position 0: invalid start byte

From this documentation it appears that this is an error raised by the Snowflake python adapter. The select statement works as expected both in Snowflake directly and when doing a dbt run, but this error returns when trying to preview the query.

Also from that documentation, it appears that this can be avoided by wrapping this in the following:
a.encode('utf-8').strip(), which I believe needs to be done python-side, rather than snowflake-side.


Expected Behavior

Can preview a string that contains binary, or at least get a more understandable error message.

Steps To Reproduce

No response

Relevant log output

No response

Environment

- dbt Cloud
- dbt: 0.20.2

What database are you using dbt with?

snowflake

Additional Context

No response

dbt snowflake connections through proxy no longer work

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?

dbt debug, run, test, and any other commands that initiate a connection with the snowflake database are no longer able to utilize a proxy to make the connection.

Steps To Reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.

Need: a Snowflake server that only accepts whitelisted IPs and a proxy server whose IP is whitelisted for the Snowflake server; environment variables set for HTTP_PROXY, HTTPS_PROXY; a dbt profile for the snowflake connection to the database; snowflake-connector-python>=2.4.6

run

dbt debug

Expected behavior

A clear and concise description of what you expected to happen.

dbt debug runs, detects the proxy environment variables, sets up the Snowflake connection accordingly, and outputs:

...
Connection test: OK connection ok
...

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

dbt debug output:

Running with dbt=0.20.2
dbt version: 0.20.2
python version: 3.7.11
python path: /opt/conda/envs/dbt/bin/python
os info: Linux-5.10.47-linuxkit-x86_64-with-debian-bullseye-sid
Using profiles.yml file at /path/to/profiles.yml
Using dbt_project.yml file at /path/to/dbt_project.yml

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  account: *****.*******.***
  user: username
  database: DATABASENAME
  schema: schemaname
  warehouse: warehousename
  role: rolename
  client_session_keep_alive: False
  Connection test: ERROR

dbt was unable to connect to the specified database.
The database returned the following error:

  >Database Error
  250003: Failed to get the response. Hanging? method: post, url: https://*****.*******.***.snowflakecomputing.com:443/session/v1/login-request?request_id=...

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.20.2
   latest version: 0.20.2

Up to date!

Plugins:
  - bigquery: 0.20.2
  - redshift: 0.20.2
  - snowflake: 0.20.2
  - postgres: 0.20.2

The operating system you're using:

Ubuntu 20.04
Linux-5.10.47-linuxkit-x86_64-with-debian-bullseye-sid

The output of python --version:
Python 3.7.11

Additional context

Add any other context about the problem here.

There is a bug with snowflake-connector-python>=2.4.6 where it is no longer able to use the environment variables HTTP_PROXY and HTTPS_PROXY, meaning dbt connections that require a proxy no longer work. The SnowflakeConnection class has parameters for passing proxy settings into it. Using these parameters, dbt profiles of the snowflake type could include parameters to define the proxy and side-step this issue.

[CT-427] When cloning an entire database, I'd like to recreate all views to redirect references from self to the newly cloned self

Describe the feature

When cloning an entire database, any views being cloned will retain their original SQL statements. In cases where these views contain references to the current DB (parent), I want these references to be redirected to the cloned DB (child). Additionally, if I am cloning multiple DBs, it would be great to be able to repoint any references via a mapping – e.g. provide a Python dictionary (or YAML equivalent) and have any references overwritten or redirected.

Describe alternatives you've considered

  • DBT solution: I can recreate all models which are views after cloning using a specific target. But this method is error prone and too complex a solution for my taste.
  • Snowflake solution: Use materialized views instead of regular views (which are indeed cloned in contrast to regular views). Note, this functionality was released pretty recently - Feb 2022.

Additional context

Currently, cloning occurs as a SQL command to Snowflake – dbt doesn't have a built-in way of being aware of these changes. The only way to transmit these changes is to switch targets and naively assume that everything magically appeared (which is fine for the most part, but can lead to issues in cases like these).

Related issue: dbt-labs/dbt-core#4959

Who will this benefit?

Anyone using dbt with Snowflake who has considered or is already using cloning. (As an aside, Snowflake's zero-copy cloning is one of the features that sets it apart from its competitors – it would be incredibly valuable to have a more dbt-native solution available.)

Are you interested in contributing this feature?

Certainly I am, but would need to have someone else own the project management of this task. (I can do this "on the side" so to say.)

dbt run hangs after getting list of object from database

Describe the bug

Running the dbt CLI from a Windows server, we noticed that the run sometimes hangs indefinitely without executing any model.
Looking at the Snowflake query history, we see a "show terse" command for each schema, a begin/commit, and then nothing else (see attached screenshot).
This happened with different selections, so it seems unrelated to specific models.
We tried adding dbt.exe to our antivirus whitelist, but without any effect.
We saved a dbt.log of one of these strange executions, but we didn't notice any error message (see attached log).

Steps To Reproduce

This behaviour happens randomly, we've not been able to systematically reproduce it.

Expected behavior

dbt should go ahead and run the selected models.

Screenshots and log output

dbt_stg_2021-07-20_14-10-15.17.log

snowflake_history

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.19.0
   latest version: 0.20.0

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - bigquery: 0.19.0
  - postgres: 0.19.0
  - redshift: 0.19.0
  - snowflake: 0.19.0

The operating system you're using:
Windows Server 2016 Datacenter

The output of python --version:
Python 3.7.9

Allow query_tag configuration in snapshot model for Snowflake

Describe the feature

When using Snowflake, allow the snapshot config to accept and honor the query_tag parameter.

Additional context

This enhancement is specific to Snowflake implementations.

Who will this benefit?

Snapshots can end up being compute intensive; this enhancement would provide a simpler way for database administrators to collect the data required to monitor and manage cost.

Example

{% snapshot snapshot_name %}

{{
config(
target_database='db',
target_schema='snapshots',
unique_key='id',
strategy='check',
check_cols=['col_a'],
query_tag='SNAPSHOT'
)
}}

select * from {{ source('db', 'table') }}

{% endsnapshot %}

Snowflake Persist Docs (Materialization == Views, column level descriptions)

Describe the feature

Given the database adapter is Snowflake, the materialization type is view, and persist_docs is set to true, create the view using the following Snowflake syntax so that column descriptions from dbt schema.yml files are included:

CREATE OR REPLACE VIEW VIEW_NAME (COL1 COMMENT 'COMMENT', COLN COMMENT 'COMMENT') AS SELECT COL1, COLN FROM TABLE;

Describe alternatives you've considered

I am looking into submitting a PR myself but would benefit from oversight. We have also considered contacting Snowflake to include alter capabilities for columns within a view but believe this to be another alternative.

Additional context

Snowflake

Who will this benefit?

All users of Snowflake who heavily rely on views and would like to use persist docs to capture doc details within the database itself for external dependencies

Are you interested in contributing this feature?

Yes - more than happy to help - just require a push in the right direction

[CT-460] Add `set_query_tag` to the dbt macro namespace

Describe the feature

set_query_tag can only be overridden in the root project. By adding it to the dbt macro namespace, we can then leverage the dispatch functionality and override it with custom query tags in dbt packages.
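For example, a package could then ship an override along these lines (a rough sketch, assuming set_query_tag becomes dispatchable; the tag contents are illustrative):

{% macro snowflake__set_query_tag() -%}
    {#- hypothetical package-level override, only possible once set_query_tag is dispatched -#}
    {%- set new_tag = tojson({
        "invocation_id": invocation_id,
        "model": model.unique_id,
        "materialized": model.config.materialized
    }) -%}
    {%- do run_query("alter session set query_tag = '" ~ new_tag ~ "'") -%}
{%- endmacro %}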

Describe alternatives you've considered

The only way to achieve this is to add it to the dbt macro namespace.

Additional context

https://getdbt.slack.com/archives/C37J8BQEL/p1633519489312300?thread_ts=1633377998.228400&cid=C37J8BQEL

Who will this benefit?

All dbt-snowflake users who have a central dbt package and would like to leverage custom query tags to be generated and stored at runtime. For example:

{
    "invocation_id": "{{ invocation_id }}",
    "run_started_at": "{{ run_started_at }}",
    "dbt_version": "{{ dbt_version }}",
    "project_name": "{{ project_name }}",
    "model":
    {
        "owners": "{{ tojson(model.meta.owners|default([])) }}",
        "database": "{{ this.database }}",
        "schema": "{{ this.schema }}",
        "name": "{{ this.name }}",
        "identifier": "{{ this.identifier }}",
        "path": "{{ model.path}}",
        "unique_id": "{{ model.unique_id }}",
        "materialized": "{{ model.config.materialized }}"
    }
}

Are you interested in contributing this feature?

Yes. #132 is currently in draft pending this issue creation.

Failure to resolve Snowflake OCSP server creates issues with dbt

Describe the bug

When dbt is unable to resolve the Snowflake OCSP server for any reason the environment becomes unusable. Queries hang for long periods waiting for the OCSP server resolution step to time out and dbt cannot run (because every attempt to verify the security of the Snowflake environment fails).

Steps To Reproduce

Follow the steps listed in the SnowCD user-guide to find the OCSP server information. Block access to the OCSP server so that dbt cannot access it. Try to run any dbt command.

Expected behavior

When trying to access Snowflake with a blocked OCSP server, dbt should hang on every query execution step waiting for the timeout before failing and moving on to the next query. When dbt can access Snowflake OCSP servers, it should operate normally.

Screenshots and log output

The output of dbt -d test during the error state:

{'eventType': 'RevocationCheckFailure', 'eventSubType': 'OCSPResponseFailedToConnectCacheServer|OCSPResponseFetchException', 'sfcPeerHost': '<account>.privatelink.snowflakecomputing.com', 'certId': '<cert>', 'ocspRequestBase64': '<request>', 'ocspResponderURL': 'http://ocsp.rootca1.amazontrust.com', 'errorMessage': "254003: Could not fetch OCSP Response from server. Considerchecking your whitelists : Exception - HTTPConnectionPool(host='ocsp.<account>.privatelink.snowflakecomputing.com', port=80): Read timed out. (read timeout=10)", 'insecureMode': False, 'failOpen': True, 'cacheEnabled': True, 'cacheHit': False, 'exceptionMessage': "254003: Could not fetch OCSP Response from server. Considerchecking your whitelists : Exception - HTTPConnectionPool(host='ocsp.<account>.privatelink.snowflakecomputing.com', port=80): Read timed out. (read timeout=10)", 'exceptionStackTrace': 'Traceback (most recent call last):\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\connectionpool.py", line 445, in _make_request\n    six.raise_from(e, None)\n  File "<string>", line 3, in raise_from\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\connectionpool.py", line 440, in _make_request\n    httplib_response = conn.getresponse()\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\http\\client.py", line 1349, in getresponse\n    response.begin()\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\http\\client.py", line 316, in begin\n    version, status, reason = self._read_status()\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\http\\client.py", line 277, in _read_status\n    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\socket.py", line 704, in readinto\n    return self._sock.recv_into(b)\nsocket.timeout: timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\requests\\adapters.py", line 439, in send\n    resp = conn.urlopen(\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\connectionpool.py", line 755, in urlopen\n    retries = retries.increment(\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\util\\retry.py", line 532, in increment\n    raise six.reraise(type(error), error, _stacktrace)\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\packages\\six.py", line 770, in reraise\n    raise value\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\connectionpool.py", line 699, in urlopen\n    httplib_response = self._make_request(\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\connectionpool.py", line 447, in _make_request\n    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\urllib3\\connectionpool.py", line 336, in _raise_timeout\n    raise ReadTimeoutError(\nurllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host=\'ocsp.<account>.privatelink.snowflakecomputing.com\', port=80): Read timed out. 
(read timeout=10)\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\snowflake\\connector\\ocsp_snowflake.py", line 1563, in _fetch_ocsp_response\n    response = session.request(\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\requests\\sessions.py", line 542, in request\n    resp = self.send(prep, **send_kwargs)\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\requests\\sessions.py", line 655, in send\n    r = adapter.send(request, **kwargs)\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\requests\\adapters.py", line 529, in send\n    raise ReadTimeout(e, request=request)\nrequests.exceptions.ReadTimeout: HTTPConnectionPool(host=\'ocsp.<account>.privatelink.snowflakecomputing.com\', port=80): Read timed out. (read timeout=10)\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\snowflake\\connector\\ocsp_snowflake.py", line 1222, in validate_by_direct_connection\n    ocsp_response = self._fetch_ocsp_response(\n  File "d:\\users\\<user>\\appdata\\local\\programs\\python\\python39\\lib\\site-packages\\snowflake\\connector\\ocsp_snowflake.py", line 1598, in _fetch_ocsp_response\n    raise RevocationCheckError(\nsnowflake.connector.errors.RevocationCheckError: 254003: Could not fetch OCSP Response from server. Considerchecking your whitelists : Exception - HTTPConnectionPool(host=\'ocsp.<account>.privatelink.snowflakecomputing.com\', port=80): Read timed out. (read timeout=10)\n'}
2021-10-20 19:58:54.823774 (ThreadPoolExecutor-1_1): WARNING!!! Using fail-open to connect. Driver is connecting to an HTTPS endpoint without OCSP based Certificate Revocation checking as it could not obtain a valid OCSP Response to use from the CA OCSP responder. Details:

System information

The output of dbt --version:

installed version: 0.21.0
   latest version: 0.21.0

Up to date!

Plugins:
  - bigquery: 0.21.0
  - postgres: 0.21.0
  - redshift: 0.21.0
  - snowflake: 0.21.0

The operating system you're using:
MacOS 11.6
The output of python --version:
Python 3.9.6

Additional context

Add any other context about the problem here.

[CT-368] Race conditions in dbt seed.

Describe the bug

When running dbt seed in multiple jobs, a race condition can and does occur; the following happens:

  • Seed session 1: Truncate table
  • Seed session 2: Truncate table
  • Seed session 1: insert seed
  • Seed session 2 insert seed

As a result, duplicate records.
The core issue is that the truncate and insert happen in different transactions.

Steps To Reproduce

Run multiple dbt seed commands for the same table at the same time. A race condition is bound to occur.

Expected behavior

If truncate and insert were in one transaction, as they should be, this issue would and could not occur. The expected behaviour is no race conditions when seeding the same table from multiple dbt seed commands running at the same time.

Screenshots and log output

I have attached the following log output from our Snowflake instance; table names and session ids have been anonymized:

SESSION_ID QUERY_TYPE QUERY_TEXT START_TIME END_TIME
2 COMMIT COMMIT 2022-03-15 23:06:01.609000 -07:00 2022-03-15 23:06:02.085000 -07:00
1 COMMIT COMMIT 2022-03-15 23:06:01.496000 -07:00 2022-03-15 23:06:01.792000 -07:00
2 INSERT insert into table_seed_a... 2022-03-15 23:06:00.939000 -07:00 2022-03-15 23:06:01.533000 -07:00
2 BEGIN_TRANSACTION BEGIN 2022-03-15 23:06:00.783000 -07:00 2022-03-15 23:06:00.856000 -07:00
1 INSERT insert into table_seed_a... 2022-03-15 23:06:00.435000 -07:00 2022-03-15 23:06:01.414000 -07:00
2 COMMIT commit; 2022-03-15 23:06:00.289000 -07:00 2022-03-15 23:06:00.688000 -07:00
1 BEGIN_TRANSACTION BEGIN 2022-03-15 23:06:00.243000 -07:00 2022-03-15 23:06:00.348000 -07:00
2 TRUNCATE_TABLE truncate table_seed_a... 2022-03-15 23:06:00.034000 -07:00 2022-03-15 23:06:00.213000 -07:00
2 BEGIN_TRANSACTION begin; 2022-03-15 23:05:59.888000 -07:00 2022-03-15 23:05:59.956000 -07:00
1 COMMIT commit; 2022-03-15 23:05:59.698000 -07:00 2022-03-15 23:06:00.125000 -07:00
1 TRUNCATE_TABLE truncate table_seed_a... 2022-03-15 23:05:59.389000 -07:00 2022-03-15 23:05:59.618000 -07:00
1 BEGIN_TRANSACTION begin; 2022-03-15 23:05:59.251000 -07:00 2022-03-15 23:05:59.306000 -07:00

System information

The output of dbt --version:
dbt 1.0.3

The operating system you're using:
dbt cloud

MFA Token not being cached

Describe the bug

I'm trying to get Snowflake's DUO-based MFA working when running dbt in a linux container running on my laptop. Technically it works, but it's not caching the token and I have to repeatedly re-authorize. I have tried:

  • Setting ALLOW_CLIENT_MFA_CACHING to true for the account
  • Setting ALLOW_ID_TOKEN to true for the account
  • Setting authenticator: username_password_mfa in profiles.yml

All that, and yet, every time I run dbt, I get 4 MFA auth requests right away, and then 1 for each model that it runs. Obviously, this is impractical with 10-100's of models. I'm wondering what else I might be missing, if anything. I swear there used to be a note in Snowflake's docs that some aspect of this only worked on Mac and Windows environments, but I can't find that anymore and was hopeful that it had been resolved for linux. Perhaps this is a dbt defect that isn't sending the right auth method to the python lib?

System information

The output of dbt --version:

root@bbafbba2e1dc:/dbt-runner/dbt# dbt --version
installed version: 0.21.0
   latest version: 0.21.1

Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:
  - snowflake: 0.21.0

The operating system you're using: Using a linux container built based on this dockerfile (with some minor modifications)

root@bbafbba2e1dc:/dbt-runner/dbt# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"


root@bbafbba2e1dc:/dbt-runner/dbt# uname -r
5.10.47-linuxkit

The output of python --version:

root@bbafbba2e1dc:/dbt-runner/dbt# python --version
Python 3.8.12

Version of snowflake connector: snowflake-connector-python[secure-local-storage]==2.5.1

SSO opens multiple authentication browser pages and requires forced process kill

Describe the bug

When using an SSO connection, dbt opens multiple (10+) browser windows to authenticate. Once authenticated, dbt doesn't seem to recognise that, so I have to CTRL+C to kill the process and run dbt run again, which then succeeds.

Steps To Reproduce

Configure an SSO Snowflake profile, mine looks like:

project:
  outputs:
    dev:
      type: snowflake
      account: <account_name>
      authenticator: externalbrowser
      user: <email_address>
      warehouse: <warehouse>
      database: <database>
      schema: <name>
      threads: 8
  target: dev

Run dbt run

Expected behavior

  1. A single browser page opens where I authenticate with SSO
  2. Once authenticated, the browser window closes
  3. The dbt command continues as normal

Screenshots and log output

[screenshot attached]

dbt run
Running with dbt=1.0.0-b1
[WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 2 unused configuration paths:
- seeds
- snapshots

Found 23 models, 43 tests, 0 snapshots, 0 analyses, 176 macros, 0 operations, 0 seed files, 12 sources, 0 exposures

Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...

System information

The output of dbt --version:

dbt --version
installed version: 1.0.0-b1
   latest version: 0.21.0

Your version of dbt is ahead of the latest release!

Plugins:
  - snowflake: 1.0.0b1
  - postgres: 1.0.0b1

The operating system you're using:
Mac OS 11.4
The output of python --version:

Python 3.8.9

Non-deterministic tests to check query tags

Unfortunately, these tests introduced by #48 appear to fail non-deterministically:

tests/integration/simple_seed_test/test_seed_with_query_tag.py::TestSeedWithQueryTag::test_snowflake_big_batched_seed
tests/integration/simple_snapshot_test/test_snapshot_query_tag.py::TestSnapshotWithQueryTag::test__snowflake__snapshot_with_query_tag

See:

These tests check to confirm that a previously run query is tagged with an appropriate query_tag. They pass whenever I run them locally. Could it be an issue with tests running concurrently?

[CT-45] [Feature] Expose Snowflake query id in case of dbt test failure

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

When we get a dbt test failure, I often look through the debug logs to find the exact query that was run. Having the dbt_internal_test query with failures, should_warn, and should_error is an amazing first step, but the issue is that when I rerun the query, it often passes. If the debug logs (or the friendly logs) contained the sfqid, then I could easily modify the query to use Snowflake's time travel feature so I can recreate the failing test and debug it further.

Right now, I see the following info is logged:

022-01-14 00:14:37.419997 (Thread-39): 00:14:37  On test.[redacted].c630bf19e2: /* {"app": "dbt", "dbt_version": "1.0.1", "profile_name": "user", "target_name": "default", "node_id": "[redacted]"} */

Related issue: dbt-labs/dbt-core#2358

Describe alternatives you've considered

Right now, I look through the snowflake query history to find the sfqid, which works but is a bit tedious.

Who will this benefit?

Analytics engineers running dbt tests being able to track down issues faster. Speed is critical because the standard edition of snowflake only allows for 1 day of timetravel, so you need to debug it quickly.

The easier it is to track down intermittent test issues the better, so that people are encouraged to write more.

Are you interested in contributing this feature?

Sure! I have a bit of experience with python (mostly ruby/javascript), but could figure it out if I know:

  • people think this is a good idea (so it's not a waste of everyone's time)
  • a point in the right direction

Anything else?

I'm loving dbt! Thanks for all of the hard work.

[CT-453] Add support for WHEN NOT MATCHED, INSERT incremental update option for Snowflake

Describe the feature

With Snowflake, we have a table that should be append only, so we don't need to merge old results, only do this:

https://docs.snowflake.com/en/sql-reference/sql/merge.html#deterministic-results-for-insert

WHEN NOT MATCHED ... THEN INSERT

This means we won't get duplicates for the unique_key set in dbt.
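A sketch of the append-only merge this would generate (table and column names assumed):

merge into analytics.events as dbt_dest
using analytics.events__dbt_tmp as dbt_src
on dbt_dest.id = dbt_src.id
when not matched then insert (id, event_type, payload)
    values (dbt_src.id, dbt_src.event_type, dbt_src.payload);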

Describe alternatives you've considered

Just using the merge, but this can take a while.

Additional context

This is great for event tables that have extra processing done to them, kind of like a materialized view.

Who will this benefit?

Users who have event tables that don't change, or have metadata tables that don't change.

Are you interested in contributing this feature?

Yes, and happy to champion this feature and document it.

Use lowercase / case-insensitive comparison when persisting column-level comments for views

See #72 for context

{% for column_name in query_columns %}
{% if (column_name|upper in model_columns) or (column_name in model_columns) %}
{{ get_column_comment_sql(column_name, model_columns) }}
{% else %}

This check should use |lower instead of |upper, since Snowflake columns are almost always uppercase.
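That is, a minimal sketch of the corrected check:

{% for column_name in query_columns %}
{% if (column_name|lower in model_columns) or (column_name in model_columns) %}
{{ get_column_comment_sql(column_name, model_columns) }}
{% else %}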

model_columns:  {'id': {'name': 'id', 'description': 'this is a cool column', 'meta': {}, 'tags': []}}
query_columns:  ['ID']

That will enable users to persist descriptions for columns defined in lowercase:

version: 2

models:
  - name: my_cool_view_model
    description: this is a cool model
    config:
      materialized: view
      persist_docs:
        relation: true
        columns: true
    columns:
      - name: id # instead of ID
        description: this is a cool column

Snowflake model configs for tests, too

picks up from dbt-labs/dbt-core#2981

Describe the feature

Many model configs are model only, but some are also relevant for tests! There are two Snowflake examples that come immediately to mind:

  1. snowflake_warehouse: Run this test with a different warehouse (= compute resource) from the target.warehouse configured in the profile
  2. query_tag: Slap a query tag on this test that differs from the default query tag

I like these because each represents a different type of change to achieve parity:

  1. The snowflake_warehouse config is executed as a python hook before and after calling the materialization. (I'm not entirely sure why). I think we'd just need to add these lines to the test task:

https://github.com/fishtown-analytics/dbt/blob/7ec5c122e17ab7e1a5dd1e1cab36527159564a0f/core/dbt/task/run.py#L246-L250

(In all likelihood, we should rename those hooks to something more generic, and add them to all the other task types, too.)

  2. Model materializations call the set_query_tag() and unset_query_tag() macros. The Snowflake test materialization could handle this with ease:

https://github.com/fishtown-analytics/dbt/blob/7ec5c122e17ab7e1a5dd1e1cab36527159564a0f/plugins/snowflake/dbt/include/snowflake/macros/materializations/table.sql#L3
https://github.com/fishtown-analytics/dbt/blob/7ec5c122e17ab7e1a5dd1e1cab36527159564a0f/plugins/snowflake/dbt/include/snowflake/macros/materializations/table.sql#L38

Of course, in order for this to work, tests first need to be able to "carry" these configs. Then, in all the ways described by dbt-labs/dbt-core#3253, those configs can be set in-file config() blocks, inherited, set in dbt_project.yml, etc.
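
Once tests can carry these configs, usage might look like the in-file config below on a singular test (the file name, warehouse, tag, and ref here are made up for illustration); the same configs could then also be set under tests: in dbt_project.yml.

-- tests/assert_orders_have_ids.sql
{{ config(snowflake_warehouse='TESTING_WH', query_tag='dbt_tests') }}

select *
from {{ ref('orders') }}
where order_id is null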

[CT-1601] Support string private_key

Describe the feature

Allow accepting a private_key as a string, not just a path.

Describe alternatives you've considered

In some contexts, providing a path to the key is difficult, and allowing it to be passed as a string provides another mechanism to consume that key.

As a concrete example, while using airflow-dbt, we would like to read credentials from secretsmanager and use them for authenticating into Snowflake. Normally, we would just create a temporary file with a context manager, write the private key out to the file, and use it. However, this proves awkward when using Airflow.

Additional context

The airflow-dbt project allows vars to be templated, which means we could provide the private key contents dynamically (and idiomatically). However, there's no clean toehold to do this using a context manager, and the ideas of 1) pre/post-hooks to clean up a temporary file for a private key, or 2) letting the private key linger in a temporary directory are pretty unappealing.

Who will this benefit?

Predominantly dbt users needing to use private key auth without touching the filesystem.

Are you interested in contributing this feature?

Definitely! I've got an MR nearly ready to go, which I will submit after filing this issue.

Include the snowflake query id in the run results output

Describe the feature

As an analytics engineer, I want a quick way to get the query id of a model run (or even a test/seed/snapshot) so I can quickly find the query in the Snowflake UI.

Describe alternatives you've considered

We've implemented a macro that runs in the post-hook, gets the full URL to the query, and logs it to the console. This works great, except that we have to call SELECT last_query_id() on every model run and, more importantly, the hook does not execute if the query fails.
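
For reference, the workaround macro is roughly along these lines (a simplified sketch; the real version also builds the full query URL for our account) and is wired up as a post-hook on each model:

{% macro log_query_url() %}
    {% if execute %}
        {# last_query_id() returns the id of the previous query in this session, i.e. the model's query when called from a post-hook #}
        {% set result = run_query("select last_query_id() as query_id") %}
        {% set query_id = result.columns[0].values()[0] %}
        {% do log("Snowflake query id: " ~ query_id, info=true) %}
    {% endif %}
    {# emit a harmless statement so the rendered hook is not empty #}
    select 1
{% endmacro %}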

I acknowledge that the compiled SQL is available as an alternative, but we run Airflow on Kubernetes, so the artifacts are not readily available. We do persist the artifacts in S3, but I think it would be an improved experience to go from the Airflow log to the Snowflake UI in one click.

Additional context

I can see that the Snowflake query id is logged for failed queries if you enable debug mode, but that output is far too verbose for this use case. I would want just the query URL/id logged on every run so that a user can go from log to Snowflake UI with the fewest steps possible.

Who will this benefit?

I think this would benefit all dbt-snowflake users that use the snowflake ui for debugging/performance monitoring.

Are you interested in contributing this feature?

I wouldn't mind contributing but I would need some guidance on what option is most amenable to dbt's design. I'm not sure if the plugins are meant to interact with the run results functionality in dbt core.

[Snowflake] `dbt compile` throws `memory exception for ffi.callback()` on M1 Macbook Air

Describe the bug

This is basically dbt-labs/dbt-core#3162, but I'm still running into the problem even with send_anonymous_usage_stats: False and after trying the 0.21.0-b1 prerelease, where that anonymous-usage issue is patched. dbt compile and dbt run both throw MemoryError: Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks.

Steps To Reproduce

On a 2020 MacBook Air running Big Sur 11.5, I installed dbt as follows:

brew install python
python3 -m venv dbt-env  
source dbt-env/bin/activate  
pip install dbt

dbt installs fine with either version, and dbt --version works fine as well. However, when I run dbt compile or dbt run I hit the ffi.callback() error. I got this error with both 0.20.0 and 0.21.0-b1.

I know there was an attempted fix, as described in this comment, but I think it only addresses the case where this affects anonymous tracking; the problem here seems to be with the primary snowflake-connector-python connection to Snowflake (see trace below).

There is also an open issue on the snowflake-connector-python repo, but no progress towards a fix yet.

Expected behavior

dbt compile and dbt run run without error.

Screenshots and log output

Full trace.

    show terse objects in user_scratch.jlynch_dbt
2021-08-11 02:38:15.323454 (ThreadPoolExecutor-0_0): Rolling back transaction.
2021-08-11 02:38:15.323576 (ThreadPoolExecutor-0_1): Rolling back transaction.
2021-08-11 02:38:15.323678 (ThreadPoolExecutor-0_0): Opening a new connection, currently in state init
2021-08-11 02:38:15.323817 (ThreadPoolExecutor-0_1): Opening a new connection, currently in state init
2021-08-11 02:38:15.423470 (ThreadPoolExecutor-0_1): Error running SQL: macro list_relations_without_caching
2021-08-11 02:38:15.423695 (ThreadPoolExecutor-0_1): Rolling back transaction.
2021-08-11 02:38:15.424342 (ThreadPoolExecutor-0_1): Opening a new connection, currently in state init
2021-08-11 02:38:15.424882 (ThreadPoolExecutor-0_0): Error running SQL: macro list_relations_without_caching
2021-08-11 02:38:15.426007 (ThreadPoolExecutor-0_0): Rolling back transaction.
2021-08-11 02:38:15.426822 (ThreadPoolExecutor-0_0): Opening a new connection, currently in state init
2021-08-11 02:38:15.535091 (MainThread): Connection 'master' was properly closed.
2021-08-11 02:38:15.535311 (MainThread): Connection 'list_analytics_snapshot' was properly closed.
2021-08-11 02:38:15.535435 (MainThread): Connection 'list_user_scratch_jlynch_dbt' was properly closed.
2021-08-11 02:38:15.535643 (MainThread): Flushing usage events
2021-08-11 02:38:15.535784 (MainThread): Encountered an error:
2021-08-11 02:38:15.535869 (MainThread): Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks
2021-08-11 02:38:15.538470 (MainThread): Traceback (most recent call last):
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/dbt/adapters/snowflake/connections.py", line 179, in exception_handler
    yield
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/dbt/adapters/sql/connections.py", line 79, in add_query
    cursor = connection.handle.cursor()
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/dbt/contracts/connection.py", line 81, in handle
    self._handle.resolve(self)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/dbt/contracts/connection.py", line 107, in resolve
    return self.opener(connection)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/dbt/adapters/snowflake/connections.py", line 220, in open
    handle = snowflake.connector.connect(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/__init__.py", line 50, in Connect
    return SnowflakeConnection(**kwargs)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/connection.py", line 273, in __init__
    self.connect(**kwargs)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/connection.py", line 484, in connect
    self.__open_connection()
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/connection.py", line 723, in __open_connection
    self._authenticate(auth_instance)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/connection.py", line 984, in _authenticate
    self.__authenticate(self.__preprocess_auth_instance(auth_instance))
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/connection.py", line 1003, in __authenticate
    auth.authenticate(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/auth.py", line 241, in authenticate
    ret = self._rest._post_request(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/network.py", line 629, in _post_request
    ret = self.fetch(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/network.py", line 719, in fetch
    ret = self._request_exec_wrapper(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/network.py", line 841, in _request_exec_wrapper
    raise e
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/network.py", line 762, in _request_exec_wrapper
    return_object = self._request_exec(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/network.py", line 1049, in _request_exec
    raise err
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/network.py", line 926, in _request_exec
    raw_ret = session.request(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/urllib3/connection.py", line 392, in connect
    self.ssl_context = create_urllib3_context(
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/urllib3/util/ssl_.py", line 339, in create_urllib3_context
    context.verify_mode = cert_reqs
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/snowflake/connector/vendored/urllib3/contrib/pyopenssl.py", line 444, in verify_mode
    self._ctx.set_verify(_stdlib_to_openssl_verify[value], _verify_callback)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/OpenSSL/SSL.py", line 1028, in set_verify
    self._verify_helper = _VerifyHelper(callback)
  File "/Users/john.lynch/dbt-env/lib/python3.9/site-packages/OpenSSL/SSL.py", line 331, in __init__
    self.callback = _ffi.callback(
MemoryError: Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.20.0
   latest version: 0.20.0

Up to date!

Plugins:
  - snowflake: 0.20.0

The operating system you're using:
MacOS Big Sur 11.5

The output of python --version:
Python 3.9.6

Use cluster by keys in merge for faster incremental updates

Description

dbt-labs/dbt-core#1689 brought the ability for users to leverage Snowflake's clustering in dbt.
We recently realised that merge operations on clustered tables could be made faster by including the cluster keys provided on the model in the merge predicate. After testing it via a custom materialisation in our macros, we saw a huge improvement!

The main advantage is that when you include the keys used to cluster the table in the merge predicate, Snowflake avoids a full table scan and can directly target the cluster/partition where rows should be merged. Since the table is clustered, homing in on the right partition is a breeze.

Additional context

Snowflake specific

Who will this benefit?

Anyone on snowflake with large tables who leverages clustering in dbt.

I'd be more than happy to contribute this functionality back to dbt as it's fairly simple.
It would basically consist of checking whether cluster_by_keys are provided and adding them to the merge predicate, kind of like the following snippet.

        {% if cluster_by_keys is not none %}
            {% for key in cluster_by_keys %}
                and DBT_INTERNAL_SOURCE.{{key}} = DBT_INTERNAL_DEST.{{key}}
            {% endfor %}
        {% endif %}
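
Rendered for a model with unique_key id and a cluster key of event_date, the merge would then look something like this (table and column names are illustrative):

merge into analytics.events as DBT_INTERNAL_DEST
using analytics.events__dbt_tmp as DBT_INTERNAL_SOURCE
on DBT_INTERNAL_SOURCE.id = DBT_INTERNAL_DEST.id
    and DBT_INTERNAL_SOURCE.event_date = DBT_INTERNAL_DEST.event_date  -- cluster key predicate lets Snowflake prune micro-partitions
when matched then update set payload = DBT_INTERNAL_SOURCE.payload
when not matched then insert (id, event_date, payload)
    values (DBT_INTERNAL_SOURCE.id, DBT_INTERNAL_SOURCE.event_date, DBT_INTERNAL_SOURCE.payload);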

If this is something you would be interested in, I can put up a PR fairly quickly. I would love to hear your thoughts first and discuss potential caveats.

Snowflake database error no active warehouse on docs generate and tests

Describe the bug

Running the command dbt docs generate results in an error:

Building catalog
Encountered an error while generating catalog: Database Error
  000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.

Running the command dbt build or dbt test results in schema test failures:

1 of 31 START test source_not_null_cora_accounts_id.................. [RUN]
16:31:37 | 1 of 31 ERROR source_not_null_cora_accounts_id....................... [ERROR in 0.77s]

Database Error in test source_not_null_cora_accounts_id (models\sources\cora\sources.yml)
  000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.

Prior to generating docs or running tests, dbt compile and dbt run were successful.

(.venv-dbt) PS C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt> dbt run
Running with dbt=0.21.0
Found 8 models, 31 tests, 0 snapshots, 0 analyses, 363 macros, 0 operations, 0 seed files, 37 sources, 0 exposures

15:43:05 | Concurrency: 1 threads (target='local')
15:43:05 | 
15:43:05 | 1 of 7 START view model zz_warners_cora.accounts_src................. [RUN]
15:43:06 | 1 of 7 OK created view model zz_warners_cora.accounts_src............ [SUCCESS 1 in 1.47s]
15:43:06 | 2 of 7 START view model zz_warners_orca.current_workflow_properties_src [RUN]
15:43:08 | 2 of 7 OK created view model zz_warners_orca.current_workflow_properties_src [SUCCESS 1 in 1.97s]
15:43:08 | 3 of 7 START view model zz_warners_orca.id_clones_src................ [RUN]
15:43:10 | 3 of 7 OK created view model zz_warners_orca.id_clones_src........... [SUCCESS 1 in 1.18s]
15:43:10 | 4 of 7 START view model zz_warners_orca.id_map_src................... [RUN]
15:43:11 | 4 of 7 OK created view model zz_warners_orca.id_map_src.............. [SUCCESS 1 in 1.54s]
15:43:11 | 5 of 7 START view model zz_warners_salesforce.account_src............ [RUN]
15:43:12 | 5 of 7 OK created view model zz_warners_salesforce.account_src....... [SUCCESS 1 in 1.11s]
15:43:12 | 6 of 7 START view model zz_warners_veeva.account_src................. [RUN]
15:43:14 | 6 of 7 OK created view model zz_warners_veeva.account_src............ [SUCCESS 1 in 1.37s]
15:43:14 | 7 of 7 START view model zz_warners_edm.account_dim................... [RUN]
15:43:15 | 7 of 7 OK created view model zz_warners_edm.account_dim.............. [SUCCESS 1 in 1.08s]
15:43:15 |
15:43:15 | Finished running 7 view models in 19.79s.

Steps To Reproduce

%USERPROFILE%/.dbt/profiles.yml

config:
  send_anonymous_usage_stats: False
  use_colors: True
dwm-core-snowflake:
  target: local
  outputs:
    local:
      type: snowflake
      account: someaccount
      user: someuser
      authenticator: externalbrowser
      role: snowflake_developer
      warehouse: "DEVELOPER_WH"
      #attempted with numerous variations of warehouse: quoted/not quoted lower/upper case
      # user has a default role defined in snowflake with access to warehouse
      database: raw_dev
      schema: "zz_{{ env_var('USERNAME') }}"
      threads: 1
      query_tag: "{{ env_var('USERNAME') }}-dbt"

dbt_project.yml

...
#attempted with and without snowflake_warehouse
models:
  +snowflake_warehouse: "DEVELOPER_WH"
  dwm_core:
    dims:
      +schema: edm
      +materialized: view

    sources:
      +database: raw_dev
      +materialized: view

      somesource:
        +schema: someschema

Expected behavior

Expected docs to be generated without an error.
Expected the get_catalog SQL to be run on a Snowflake connection using the designated warehouse from profiles.yml.

Screenshots and log output

dbt.log for dbt docs generate

2021-10-11 23:36:13.483636 (MainThread): Connection 'master' was properly closed.
2021-10-11 23:36:13.484632 (MainThread): Connection 'list_raw_dev_zz_warners_salesforce' was properly closed.
2021-10-11 23:36:13.484632 (MainThread): Connection 'test.dwm_core.unique_account_dim_unique_key.90cb7ab562' was properly closed.
2021-10-11 23:36:13.496631 (MainThread): 16:36:13 | Done.
2021-10-11 23:36:13.500634 (MainThread): Acquiring new snowflake connection "generate_catalog".
2021-10-11 23:36:13.501635 (MainThread): 16:36:13 | Building catalog
2021-10-11 23:36:13.518635 (ThreadPoolExecutor-1_0): Acquiring new snowflake connection "raw_dev.information_schema".
2021-10-11 23:36:13.534231 (ThreadPoolExecutor-1_0): Using snowflake connection "raw_dev.information_schema".
2021-10-11 23:36:13.534231 (ThreadPoolExecutor-1_0): On raw_dev.information_schema: /* {"app": "dbt", "dbt_version": "0.21.0", "profile_name": "dwm-core-snowflake", "target_name": "local", "connection_name": "raw_dev.information_schema"} */

    
      with tables as (

          select
              table_catalog as "table_database",
              table_schema as "table_schema",
              table_name as "table_name",
              table_type as "table_type",
              comment as "table_comment",

              -- note: this is the _role_ that owns the table
              table_owner as "table_owner",

              'Clustering Key' as "stats:clustering_key:label",
              clustering_key as "stats:clustering_key:value",
              'The key used to cluster this table' as "stats:clustering_key:description",
              (clustering_key is not null) as "stats:clustering_key:include",

              'Row Count' as "stats:row_count:label",
              row_count as "stats:row_count:value",
              'An approximate count of rows in this table' as "stats:row_count:description",
              (row_count is not null) as "stats:row_count:include",

              'Approximate Size' as "stats:bytes:label",
              bytes as "stats:bytes:value",
              'Approximate size of the table as reported by Snowflake' as "stats:bytes:description",
              (bytes is not null) as "stats:bytes:include",

              'Last Modified' as "stats:last_modified:label",
              to_varchar(convert_timezone('UTC', last_altered), 'yyyy-mm-dd HH24:MI'||'UTC') as "stats:last_modified:value",
              'The timestamp for last update/change' as "stats:last_modified:description",
              (last_altered is not null and table_type='BASE TABLE') as "stats:last_modified:include"

          from raw_dev.INFORMATION_SCHEMA.tables

      ),

      columns as (

          select
              table_catalog as "table_database",
              table_schema as "table_schema",
              table_name as "table_name",

              column_name as "column_name",
              ordinal_position as "column_index",
              data_type as "column_type",
              comment as "column_comment"

          from raw_dev.INFORMATION_SCHEMA.columns
      )

      select *
      from tables
      join columns using ("table_database", "table_schema", "table_name")
      where (upper("table_schema") = upper('zz_warners_edm') or upper("table_schema") = upper('zz_warners_stage') or upper("table_schema") = upper('salesforce') or upper("table_schema") = upper('orca') or upper("table_schema") = upper('veeva') or upper("table_schema") = upper('zz_warners_cora') or upper("table_schema") = upper('zz_warners_veeva') or upper("table_schema") = upper('zz_warners_salesforce') or upper("table_schema") = upper('zz_warners_orca') or upper("table_schema") = upper('cora') or upper("table_schema") = upper('zz_warners_dbt_test__audit'))
      order by "column_index"
2021-10-11 23:36:13.535231 (ThreadPoolExecutor-1_0): Opening a new connection, currently in state init
2021-10-11 23:36:14.481791 (ThreadPoolExecutor-1_0): Snowflake query id: 019f8be8-0000-8191-0000-8949001d781a
2021-10-11 23:36:14.481791 (ThreadPoolExecutor-1_0): Snowflake error: 000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.

2021-10-11 23:36:14.482792 (ThreadPoolExecutor-1_0): Error running SQL: macro get_catalog
2021-10-11 23:36:14.482792 (ThreadPoolExecutor-1_0): Rolling back transaction.
2021-10-11 23:36:14.483791 (ThreadPoolExecutor-1_0): On raw_dev.information_schema: Close
2021-10-11 23:36:14.650833 (MainThread): Encountered an error while generating catalog: Database Error
  000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.
  
2021-10-11 23:36:14.683789 (MainThread): dbt encountered 1 failure while writing the catalog
2021-10-11 23:36:14.684785 (MainThread): 16:36:14 | Catalog written to C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\.dbt\.output\catalog.json
2021-10-11 23:36:14.689805 (MainThread): Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001AE68C09A30>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001AE72675A60>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001AE72675EE0>]}
2021-10-11 23:36:14.691847 (MainThread): Flushing usage events
2021-10-11 23:36:15.238732 (MainThread): Connection 'generate_catalog' was properly closed.
2021-10-11 23:36:15.238732 (MainThread): Connection 'raw_dev.information_schema' was properly closed.

dbt.log for running dbt test --select source:veeva

2021-10-11 23:41:59.778136 (Thread-1): Began running node test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d
2021-10-11 23:41:59.778136 (Thread-1): 16:41:59 | 2 of 2 START test source_not_null_veeva_account_id................... [RUN]
2021-10-11 23:41:59.780137 (Thread-1): Acquiring new snowflake connection "test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d".
2021-10-11 23:41:59.781162 (Thread-1): Compiling test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d
2021-10-11 23:41:59.790138 (Thread-1): Writing injected SQL for node "test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d"
2021-10-11 23:41:59.795134 (Thread-1): finished collecting timing info
2021-10-11 23:41:59.800177 (Thread-1): Writing runtime SQL for node "test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d"
2021-10-11 23:41:59.805134 (Thread-1): Using snowflake connection "test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d".
2021-10-11 23:41:59.806146 (Thread-1): On test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d: /* {"app": "dbt", "dbt_version": "0.21.0", "profile_name": "dwm-core-snowflake", "target_name": "local", "node_id": "test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d"} */
select
      count(*) as failures,
      count(*) != 0 as should_warn,
      count(*) != 0 as should_error
    from (
      
    
    

select *
from raw_dev.veeva.account
where id is null



      
    ) dbt_internal_test
2021-10-11 23:41:59.806146 (Thread-1): Opening a new connection, currently in state closed
2021-10-11 23:42:00.335136 (Thread-1): Snowflake query id: 019f8bee-0000-7eca-0000-8949001d6b2a
2021-10-11 23:42:00.336157 (Thread-1): Snowflake error: 000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.

2021-10-11 23:42:00.336157 (Thread-1): finished collecting timing info
2021-10-11 23:42:00.337143 (Thread-1): On test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d: Close
2021-10-11 23:42:00.497138 (Thread-1): Database Error in test source_not_null_veeva_account_id (models\sources\veeva\sources.yml)
  000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.
  
  compiled SQL at ../.dbt/.output\run\dwm_core\models\sources\veeva\sources.yml\schema_test\source_not_null_veeva_account_id.sql
Traceback (most recent call last):
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\snowflake\connections.py", line 183, in exception_handler
    yield
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\sql\connections.py", line 80, in add_query
    cursor.execute(sql, bindings)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\snowflake\connector\cursor.py", line 721, in execute
    Error.errorhandler_wrapper(
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\snowflake\connector\errors.py", line 258, in errorhandler_wrapper
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\snowflake\connector\errors.py", line 188, in default_errorhandler
    raise error_class(
snowflake.connector.errors.ProgrammingError: 000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\task\base.py", line 348, in safe_run
    result = self.compile_and_execute(manifest, ctx)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\task\base.py", line 291, in compile_and_execute
    result = self.run(ctx.node, manifest)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\task\base.py", line 393, in run
    return self.execute(compiled_node, manifest)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\task\test.py", line 112, in execute
    result = self.execute_test(test, manifest)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\task\test.py", line 82, in execute_test
    macro_func()
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\clients\jinja.py", line 333, in __call__
    return self.call_macro(*args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\clients\jinja.py", line 260, in call_macro
    return macro(*args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\runtime.py", line 675, in __call__
    return self._invoke(arguments, autoescape)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\runtime.py", line 679, in _invoke
    rv = self._func(*arguments)
  File "<template>", line 132, in macro
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\sandbox.py", line 462, in call
    return __context.call(__obj, *args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\runtime.py", line 290, in call
    return __obj(*args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\clients\jinja.py", line 333, in __call__
    return self.call_macro(*args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\clients\jinja.py", line 260, in call_macro
    return macro(*args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\runtime.py", line 675, in __call__
    return self._invoke(arguments, autoescape)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\runtime.py", line 679, in _invoke
    rv = self._func(*arguments)
  File "<template>", line 41, in macro
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\sandbox.py", line 462, in call
    return __context.call(__obj, *args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\jinja2\runtime.py", line 290, in call
    return __obj(*args, **kwargs)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\base\impl.py", line 226, in execute
    return self.connections.execute(
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\sql\connections.py", line 131, in execute
    _, cursor = self.add_query(sql, auto_begin)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\snowflake\connections.py", line 354, in add_query
    connection, cursor = super().add_query(
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\sql\connections.py", line 87, in add_query
    return connection, cursor
  File "C:\Program Files\Python38\lib\contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt\lib\site-packages\dbt\adapters\snowflake\connections.py", line 200, in exception_handler
    raise DatabaseException(msg)
dbt.exceptions.DatabaseException: Database Error in test source_not_null_veeva_account_id (models\sources\veeva\sources.yml)
  000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.
  
  compiled SQL at ../.dbt/.output\run\dwm_core\models\sources\veeva\sources.yml\schema_test\source_not_null_veeva_account_id.sql
2021-10-11 23:42:00.499143 (Thread-1): 16:42:00 | 2 of 2 ERROR source_not_null_veeva_account_id........................ [ERROR in 0.72s]
2021-10-11 23:42:00.500138 (Thread-1): Finished running node test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d
2021-10-11 23:42:00.502136 (MainThread): Acquiring new snowflake connection "master".
2021-10-11 23:42:00.503140 (MainThread): 16:42:00 | 
2021-10-11 23:42:00.504138 (MainThread): 16:42:00 | Finished running 2 tests in 5.55s.
2021-10-11 23:42:00.505136 (MainThread): Connection 'master' was properly closed.
2021-10-11 23:42:00.505136 (MainThread): Connection 'list_raw_dev_zz_warners_veeva' was properly closed.
2021-10-11 23:42:00.506138 (MainThread): Connection 'test.dwm_core.source_not_null_veeva_account_id.41a7bbc89d' was properly closed.
2021-10-11 23:42:00.550138 (MainThread): 
2021-10-11 23:42:00.551139 (MainThread): Completed with 2 errors and 0 warnings:
2021-10-11 23:42:00.552157 (MainThread): 
2021-10-11 23:42:00.553139 (MainThread): Database Error in test source_not_null_veeva_ab_site__c_id (models\sources\veeva\sources.yml)
2021-10-11 23:42:00.554137 (MainThread):   000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.
2021-10-11 23:42:00.555137 (MainThread):   
2021-10-11 23:42:00.556147 (MainThread):   compiled SQL at ../.dbt/.output\run\dwm_core\models\sources\veeva\sources.yml\schema_test\source_not_null_veeva_ab_site__c_id.sql
2021-10-11 23:42:00.558145 (MainThread): 
2021-10-11 23:42:00.560153 (MainThread): Database Error in test source_not_null_veeva_account_id (models\sources\veeva\sources.yml)
2021-10-11 23:42:00.565167 (MainThread):   000606 (57P03): No active warehouse selected in the current session.  Select an active warehouse with the 'use warehouse' command.
2021-10-11 23:42:00.566136 (MainThread):   
2021-10-11 23:42:00.576140 (MainThread):   compiled SQL at ../.dbt/.output\run\dwm_core\models\sources\veeva\sources.yml\schema_test\source_not_null_veeva_account_id.sql
2021-10-11 23:42:00.584148 (MainThread): 
Done. PASS=0 WARN=0 ERROR=2 SKIP=0 TOTAL=2
2021-10-11 23:42:00.586156 (MainThread): Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001E3C7DB2F40>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001E3C7C9DE20>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001E3C837FD00>]}
2021-10-11 23:42:00.588207 (MainThread): Flushing usage events

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

(.venv-dbt) PS C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt> dbt --version
installed version: 0.21.0
   latest version: 0.21.0

Up to date!

Plugins:
  - snowflake: 0.21.0

The operating system you're using:
Windows 10

The output of python --version:
Python 3.8.10

Configurations attempted:

  1. (using jaffle_shop models)
Running with dbt=0.21.0
dbt version: 0.21.0
python version: 3.8.10
python path: C:\Users\warners\clients\adaptive\repos\vscode\dwm\dbt-debug\.venv\Scripts\python.exe
os info: Windows-10-10.0.19042-SP0
Using profiles.yml file at C:\Users\warners\.dbt\profiles.yml
Using dbt_project.yml file at C:\Users\warners\clients\adaptive\repos\vscode\dwm\dbt-debug\dwm_core\dbt_project.yml

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  account: adaptivebiotech-corp
  user: ELT_DEV
  database: raw_dev
  schema: zz_warners
  warehouse: DEVELOPER_WH
  role: SNOWFLAKE_DEVELOPER
  client_session_keep_alive: False
  Connection test: [OK connection ok]

All checks passed!
  2. dbt 0.18.2 with snowflake-connector-python 2.2.10
(.venv-dbt-0182) PS C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt> dbt debug
Running with dbt=0.18.2
dbt version: 0.18.2
python version: 3.8.10
python path: C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt-0182\Scripts\python.exe
os info: Windows-10-10.0.19042-SP0
Using profiles.yml file at C:\Users\warners\.dbt\profiles.yml
Using dbt_project.yml file at C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\dbt_project.yml 

Configuration:
  profiles.yml file [OK found and valid]   
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  account: adaptivebiotech-corp
  user: ELT_DEV
  database: raw_dev
  schema: zz_warners
  warehouse: DEVELOPER_WH
  role: SNOWFLAKE_DEVELOPER
  client_session_keep_alive: False
  Connection test: OK connection ok
  3. dbt 0.19.2 with snowflake-connector-python 2.5.1
(.venv-dbt-0192-v2) PS C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt> dbt debug
Running with dbt=0.19.2
dbt version: 0.19.2
python version: 3.8.10
python path: C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt-0192-v2\Scripts\python.exe
os info: Windows-10-10.0.19042-SP0
Using profiles.yml file at C:\Users\warners\.dbt\profiles.yml
Using dbt_project.yml file at C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\dbt_project.yml    

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  account: adaptivebiotech-corp
  user: ELT_DEV
  database: raw_dev
  schema: zz_warners
  warehouse: DEVELOPER_WH
  role: SNOWFLAKE_DEVELOPER
  client_session_keep_alive: False
  Connection test: OK connection ok

  4. dbt 1.0.0-b1 and snowflake-connector-python 2.5.1
(.venv-dbt-1b1) PS C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt> dbt debug
Running with dbt=1.0.0-b1
dbt version: 1.0.0-b1
python version: 3.8.10
python path: C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\.venv-dbt-1b1\Scripts\python.exe
os info: Windows-10-10.0.19042-SP0
Using profiles.yml file at C:\Users\warners\.dbt\profiles.yml
Using dbt_project.yml file at C:\Users\warners\clients\adaptive\repos\vscode\dwm\web-tools\dbt\dbt_project.yml

Configuration:
  profiles.yml file [OK found and valid]   
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  account: adaptivebiotech-corp
  user: ELT_DEV
  database: raw_dev
  schema: zz_warners
  warehouse: DEVELOPER_WH
  role: SNOWFLAKE_DEVELOPER
  client_session_keep_alive: False
  Connection test: [OK connection ok]
