Support require partition filter (dbt-ga4, open)

velir commented on August 17, 2024
Support require partition filter

from dbt-ga4.

Comments (5)

adamribaudo-velir commented on August 17, 2024

This is a noble goal, but every model in the package would need to respect this. We have a lot of work ahead just to get models like dim_ga4__sessions to use partitioning. My preference would be to close this issue and open separate issues for the individual tasks that need to be done, e.g. "Update dim_ga4__sessions to allow incremental build and use partitions", "Update dim_ga4__user_ids to allow incremental build and use partitions", etc.

Remember that any model that scans all partitions, by definition, does not use a partition filter. We have lots of those models. So I don't think it's as easy as setting a config variable.

dgitis commented on August 17, 2024

Manually adding this to every model is not what I intended by opening this issue. What we need to do is move our partition header and where clauses into macros.

Some recent updates to our data marts only support dynamic partitioning, while our base partitions support both dynamic and static partitions. This is an oversight that would be fixed by having partition macros applied across all models, and it would make adding partition filters trivial.

Also, I'm working on multisite right now, which makes the advantages of templating things like partitioning seem more practical.

dgitis commented on August 17, 2024

The issue that I've come across is that moving the incremental headers into a macro puts partitions_to_replace outside the scope of the parent model, so static where clauses end up with no condition:

where event_date_dt = ""

The fix, as I see it, is to define all of the partition macros in the same file. One sets partitions_to_replace. The other two call it and output the header and the where clause code.
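A minimal Python sketch of that three-part layout, written as an analogue of the Jinja macros rather than the package's actual code. Only `partitions_to_replace` and `event_date_dt` come from the thread; the function names `partition_header` and `partition_where`, the `static_incremental_days` parameter, the `partitions` config key, and the generated date expressions are assumptions for illustration.

```python
def partitions_to_replace(static_incremental_days):
    """Build the list of partition expressions once, in one place."""
    # Assumed shape: today plus the N preceding days, as BigQuery date expressions.
    parts = ["current_date"]
    for i in range(static_incremental_days):
        parts.append(f"date_sub(current_date, interval {i + 1} day)")
    return parts


def partition_header(static_incremental_days):
    """Config values a model header would need (hypothetical key name)."""
    return {"partitions": partitions_to_replace(static_incremental_days)}


def partition_where(static_incremental_days, column="event_date_dt"):
    """Where clause built from the same list, so it cannot come out empty."""
    parts = partitions_to_replace(static_incremental_days)
    return f"where {column} in ({', '.join(parts)})"


header = partition_header(2)
where_clause = partition_where(2)
print(where_clause)
```

Because both the header and the where clause call the same function instead of reading a variable set in the model, neither depends on the caller's scope.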

dgitis commented on August 17, 2024

Adding a require partition filter to this structure should be trivial.

adamribaudo-velir commented on August 17, 2024

Sorry, I'm not concerned about the mechanics of adding macros for where clauses and configs. I'm concerned about the logical implications of requiring partition filters.

Take, for example, stg_ga4__sessions_traffic_sources.sql. If you impose a partition filter, you'll split any multi-day sessions, and the first_value function will look for the first value within the 2nd day rather than the 1st day.
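To make that concrete, here is a small Python sketch that mimics a first_value over a session's events, with and without a date filter. The event rows, field names, and the `first_source_per_session` helper are made up for illustration and are not the model's actual SQL.

```python
from datetime import date

# One session that starts on Aug 16 and continues into Aug 17 (hypothetical data).
events = [
    {"session_key": "s1", "event_date": date(2024, 8, 16), "ts": 1, "source": "google"},
    {"session_key": "s1", "event_date": date(2024, 8, 17), "ts": 2, "source": "(direct)"},
]


def first_source_per_session(rows):
    """Rough equivalent of first_value(source) per session, ordered by timestamp."""
    first = {}
    for row in sorted(rows, key=lambda r: r["ts"]):
        # Keep only the earliest source seen for each session.
        first.setdefault(row["session_key"], row["source"])
    return first


# Scanning every partition finds the true first source.
full_scan = first_source_per_session(events)

# Filtering to the second day's partition hides the session's first event.
day_two_only = [e for e in events if e["event_date"] == date(2024, 8, 17)]
filtered = first_source_per_session(day_two_only)

print(full_scan, filtered)
```

The same session is attributed to a different source depending only on which partitions the query was allowed to read.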

Or take stg_ga4__user_id_mapping. If it's made incremental, then we unintentionally start introducing many-to-many relationships between user_pseudo_ids and user_ids.
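A rough Python illustration of how that could happen, assuming the model keeps the most recent user_id per user_pseudo_id and assuming a naive incremental build that simply adds each day's result to the table. Both assumptions, the sample rows, and the `latest_mapping` helper are hypothetical.

```python
# Hypothetical events: p1 logs in as u1 on day 1 and as u2 on day 2.
events = [
    {"day": 1, "ts": 1, "user_pseudo_id": "p1", "user_id": "u1"},
    {"day": 1, "ts": 2, "user_pseudo_id": "p2", "user_id": "u1"},
    {"day": 2, "ts": 3, "user_pseudo_id": "p1", "user_id": "u2"},
]


def latest_mapping(rows):
    """Most recent user_id seen for each user_pseudo_id in the given rows."""
    mapping = {}
    for row in sorted(rows, key=lambda r: r["ts"]):
        mapping[row["user_pseudo_id"]] = row["user_id"]
    return mapping


# Full scan: exactly one user_id per user_pseudo_id.
full_scan = latest_mapping(events)

# Naive incremental: each run only sees its own day and adds rows to the table.
incremental = set()
for day in (1, 2):
    window = [e for e in events if e["day"] == day]
    incremental.update(latest_mapping(window).items())

user_ids_for_p1 = {u for p, u in incremental if p == "p1"}
pseudo_ids_for_u1 = {p for p, u in incremental if u == "u1"}
print(full_scan, sorted(incremental))
```

In the incremental table p1 maps to two user_ids while u1 maps to two user_pseudo_ids, which is the many-to-many shape the full scan avoids.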

I definitely want to work on each of these issues, but we have to be careful of the logical implications of adding where filters and incremental materializations. It could change the data in unexpected ways.
