Giter Site home page Giter Site logo

dbt-mindsdb's Introduction

MindsDB DBT adapter

The dbt-mindsdb package allows dbt to connect to MindsDB.

Installation

pip install dbt-mindsdb

Configurations

Basic profile.yml for connecting to MindsDB:

mindsdb:
  outputs:
    dev:
      host: '127.0.0.1'
      password: ''
      port: 47335
      database: 'mindsdb'
      project: 'my_project'
      # schema: 'my_project'
      type: mindsdb
      username: 'mindsdb'
  target: dev
Key Required Description Example
type ✔️ The specific adapter to use mindsdb
host ✔️ The MindsDB (hostname) to connect to cloud.mindsdb.com
port ✔️ The port to use 3306 or 47335
schema ✔️ Specify the schema (project) to build models into The MindsDB datasource
username ✔️ The username to use to connect to the server mindsdb or mindsdb cloud user
password ✔️ The password to use for authenticating to the server `pass

Usage

  • Create dbt project, choose mindsdb database and set up connection
    dbt init <project_name>
  • create a integration with "integration" materialization:
    {{
    config(
      materialized='integration',
      engine='trino',
      parameters={
        "user": env_var('TRINO_USER'),
        "auth": "basic",
        "http_scheme": "https",
        "port": 443,
        "password": env_var('TRINO_PASSWORD'),
        "host": "trino.company.com",
        "catalog": "hive",
        "schema": "photorep_schema",
        "with": "with (transactional = true)"
      }
    )
    }}
  • To create predictor add dbt model with "predictor" materialization: Name of the model is used as name of predictor. Parameters:
    • integration - name of used integration to get data from and save result to. In has to be created in mindsdb beforehand
    • predict - field for prediction
    • predict_alias [optional] - alias for predicted field
    • using [optional] - options for configure trained model
    {{
        config(
            materialized='predictor',
            integration='photorep',
            predict='name',
            predict_alias='name',
            using={
                'encoders.location.module': 'CategoricalAutoEncoder',
                'encoders.rental_price.module': 'NumericEncoder'
            }
        )
    }}
      select * from stores
  • Other paramaters for time-series predictor:

    • order_by - column that the time series will be order by.
    • group_by - rows that make a partition
    • window - the number [int] of rows to "look back" into when making a prediction
    • horizon - keyword specifies the number of future predictions, default value is 1
  • To apply predictor add dbt model with "table" materialization. It creates or replaces table in selected integration with results of predictor. Name of the model is used as name of the table to store prediction results. If you need to specify schema you can do it with dot separator: schema_name.table_name.sql
    Parameters:

    • integration - name of used integration to get data from and save result to. In has to be created in mindsdb beforehand
    {{ config(materialized='table', integration='int1') }}
        select a, bc from ddd JOIN TEST_PREDICTOR_NAME where name > LATEST

Notes: predictor_name has been removed. Instead it must be set explicitly in JOIN part of the model

Testing

  • Install dev requirements
  pip install -r dev_requirements.txt
  • Run pytest
  python -m pytest tests/

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.