hasura / sqlite-dataconnector-agent

SQLite Data Connector Agent for Hasura GQL Engine. Please note that this repository is a mirror. We will still accept PRs, but will have to mirror them to our upstream repo.


Data Connector Agent for SQLite

This directory contains an SQLite implementation of a data connector agent. It can use local SQLite database files as referenced by the "db" config field.

Capabilities

The SQLite agent currently supports the following capabilities:

  • GraphQL Schema
  • GraphQL Queries
  • Relationships
  • Aggregations
  • Prometheus Metrics
  • Exposing Foreign-Key Information
  • Mutations
  • Subscriptions
  • Streaming Subscriptions

Note: Detailed metadata about the agent's capabilities can be obtained by performing a GET request against the /capabilities endpoint of the running agent.
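For example, with the agent running locally on its default port:

curl http://localhost:8100/capabilities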

Requirements

Build & Run

npm install
npm run build
npm run start

Or a simple dev-loop via entr:

echo src/**/*.ts | xargs -n1 echo | DB_READONLY=y entr -r npm run start

Docker Build & Run

> docker build . -t dc-sqlite-agent:latest
> docker run -it --rm -p 8100:8100 dc-sqlite-agent:latest

You will want to mount a volume with your database(s) so that they can be referenced in configuration.
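For example, a setup that keeps SQLite files under ./db on the host could mount that directory into the container (the paths here are illustrative, not required by the agent):

> docker run -it --rm -p 8100:8100 -v "$PWD/db":/db dc-sqlite-agent:latest

Databases mounted this way would then be referenced in configuration by their in-container paths, e.g. /db/your-database.sqlite.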

Options / Environment Variables

Note: Boolean flags {FLAG} can be provided as 1, true, t, yes, y, or omitted and default to false.

| ENV Variable Name | Format | Default | Info |
| ----------------- | ------ | ------- | ---- |
| PORT | INT | 8100 | Port for the agent to listen on. |
| PERMISSIVE_CORS | {FLAG} | false | Allows all requests - useful for testing with SwaggerUI. Turn off in production. |
| DB_CREATE | {FLAG} | false | Allows new databases to be created. |
| DB_READONLY | {FLAG} | false | Makes databases readonly. |
| DB_ALLOW_LIST | DB1[,DB2]* | Any Allowed | Restricts which databases can be connected to. |
| DB_PRIVATECACHE | {FLAG} | Shared | Keeps caches between connections private. |
| DEBUGGING_TAGS | {FLAG} | false | Outputs XML-style tags in query comments for debugging purposes. |
| PRETTY_PRINT_LOGS | {FLAG} | false | Uses pino-pretty to pretty-print request logs. |
| LOG_LEVEL | fatal \| error \| info \| debug \| trace \| silent | info | The minimum log level to output. |
| METRICS | {FLAG} | false | Enables a Prometheus metrics endpoint at /metrics. |
| QUERY_LENGTH_LIMIT | INT | Infinity | Puts a limit on the length of generated SQL before execution. |
| DATASETS | {FLAG} | false | Enables dataset operations. |
| DATASET_DELETE | {FLAG} | false | Enables DELETE /datasets/:name. |
| DATASET_TEMPLATES | DIRECTORY | ./dataset_templates | Directory to clone datasets from. |
| DATASET_CLONES | DIRECTORY | ./dataset_clones | Directory to clone datasets to. |
| MUTATIONS | {FLAG} | false | Enables mutation support. |
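As an illustration, a local run that turns on metrics, mutations, and read-only databases (flag values chosen purely for the example) could look like:

METRICS=y MUTATIONS=y DB_READONLY=y LOG_LEVEL=debug npm run start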

Agent usage

The agent is configured as per the configuration schema. The valid configuration properties are:

| Property | Type | Default |
| -------- | ---- | ------- |
| db | string | |
| tables | string[] | null |
| include_sqlite_meta_tables | boolean | false |
| explicit_main_schema | boolean | false |

The only required property is db, which specifies the local SQLite database file to use.

The schema is exposed via introspection, but you can limit which tables are referenced by

  • Explicitly enumerating them via the tables property, or
  • Toggling the include_sqlite_meta_tables option to include or exclude SQLite meta tables.

The explicit_main_schema field can be set to opt into exposing tables by their fully qualified names (i.e. ["main", "MyTable"] instead of just ["MyTable"]).
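As a sketch, a configuration value for this agent might look like the following, where the database path and table names are purely illustrative:

{
  "db": "./db/Chinook.sqlite",
  "tables": ["Artist", "Album"],
  "include_sqlite_meta_tables": false,
  "explicit_main_schema": false
}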

Dataset

The dataset used for testing the reference agent is sourced from:

Datasets

Dataset support is enabled via the following ENV variables:

  • DATASETS
  • DATASET_DELETE
  • DATASET_TEMPLATES
  • DATASET_CLONES

Templates will be looked up at ${DATASET_TEMPLATES}/${template_name}.sqlite or ${DATASET_TEMPLATES}/${template_name}.sql. The .sqlite templates are just SQLite database files that will be copied as a clone. The .sql templates are SQL script files that will be run against a blank SQLite database in order to create a clone.

Clones will be copied to ${DATASET_CLONES}/${clone_name}.sqlite.
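For instance, a .sql template could be produced from an existing database with the sqlite3 CLI (the file names here are illustrative only):

sqlite3 my-source.sqlite .dump > dataset_templates/my-template.sql

Cloning that template would then write the clone to ${DATASET_CLONES}/${clone_name}.sqlite as described above.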

Testing Changes to the Agent

Ensure you run the agent with DATASETS=1 DATASET_DELETE=1 MUTATIONS=1 in order to enable testing of mutations.
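For example, using the npm start script from above:

DATASETS=1 DATASET_DELETE=1 MUTATIONS=1 npm run start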

Then, from the HGE repo, run:

cabal run dc-api:test:tests-dc-api -- test --agent-base-url http://localhost:8100 sandwich --tui

Known Issues

  • Using "returning" in update mutations where you join across relationships that are affected by the update mutation itself may return inconsistent results. This is because of this issue with SQLite: https://sqlite.org/forum/forumpost/9470611066

TODO

  • Prometheus metrics hosted at /metrics
  • Pull reference types from a package rather than checked-in files
  • Health Check
  • DB Specific Health Checks
  • Schema
  • Capabilities
  • Query
  • Array Relationships
  • Object Relationships
  • Ensure everything is escaped correctly - https://sequelize.org/api/v6/class/src/sequelize.js~sequelize#instance-method-escape
  • Or... Use parameterized queries if possible - https://sequelize.org/docs/v6/core-concepts/raw-queries/#bind-parameter
  • Run test-suite from SDK
  • Remove old queries module
  • Relationships / Joins
  • Rename resultTT and other badly named types in the schema.ts module
  • Add ENV Variable for restriction on what databases can be used
  • Update to the latest types
  • Port back to hge codebase as an official reference agent
  • Make escapeSQL global to the query module
  • Make CORS permissions configurable
  • Optional DB Allowlist
  • Fix SDK Test suite to be more flexible about descriptions
  • READONLY option
  • CREATE option
  • Don't create DB option
  • Aggregate queries
  • Verbosity settings
  • Cache settings
  • Missing WHERE clause from object relationships
  • Reuse find_table_relationship in more scenarios
  • ORDER clause in aggregates breaks SQLite parser for some reason
  • Check that looped exist check doesn't cause name conflicts
  • NOT EXISTS IS NULL != EXISTS IS NOT NULL
  • Mutation support

Contributors

daniel-chambers, danieljharvey, dmoverton, i-am-tom, m4ttheweric, nicuveo, solomon-b, sordina


sqlite-dataconnector-agent's Issues

Sample code for supporting queries on related rows

Here's a slightly messy but working implementation of query.ts (only for BinaryComparisonOperator, as a proof of concept).

Changes that were made were:

  • Thread the relationship and table name data through the call chain
  • Write a recursive function that generates EXISTS(SELECT ... queries as it walks down the path: [] array
  • Check whether column.path is non-empty when generating WHERE. If it is, we need to generate the EXISTS subquery first, then continue normal processing.


/*<query>*/ SELECT /*<array_relationship>*/ (
          SELECT JSON_GROUP_ARRAY(j)
          FROM (
            SELECT /*<json_object>*/ JSON_OBJECT('ArtistId', "ArtistId", 'Name', "Name") /*</json_object>*/ AS j
            FROM "Artist"
            /*<where>*/ WHERE (((EXISTS (
        SELECT 1
        FROM Album
        WHERE "Album"."ArtistId" = "Artist"."ArtistId"
        AND
        EXISTS (
        SELECT 1
        FROM Track
        WHERE "Track"."AlbumId" = "Album"."AlbumId"
        AND
        /*<bop_col>*/ "Name" /*</bop_col>*/ /*<bop_op>*/ = /*</bop_op>*/ /*<bop_val>*/ 'Restless and Wild' /*</bop_val>*/
      )
      )))) /*</where>*/
            /*<limit>*/ LIMIT 2 /*</limit>*/

          )) /*</array_relationship>*/ as data /*</query>*/
import { Config } from "./config"
import { connect } from "./db"
import {
    BinaryArrayComparisonOperator,
    BinaryComparisonOperator,
    ComparisonColumn,
    ComparisonValue,
    Expression,
    Field,
    OrderBy,
    QueryRequest,
    QueryResponse,
    Relationship,
    RelationshipField,
    TableRelationships,
} from "./types"
import { coerceUndefinedOrNullToEmptyArray, coerceUndefinedToNull, omap } from "./util"

const SqlString = require("sqlstring-sqlite")

/** Helper type for convenience. Uses the sqlstring-sqlite library, but should ideally use the function in sequelize.
 */
type Fields = Record<string, Field>

function escapeString(x: any): string {
    return SqlString.escape(x)
}

/**
 *
 * @param identifier: Unescaped name. E.g. 'Alb"um'
 * @returns Escaped name. E.g. '"Alb\"um"'
 */
function escapeIdentifier(identifier: string): string {
    // TODO: Review this function since the current implementation is off the cuff.
    const result = identifier.replace(/\\/g, "\\\\").replace(/"/g, '\\"')
    return `"${result}"`
}

function json_object(ts: Array<TableRelationships>, fields: Fields, table: string): string {
    const result = omap(fields, (k, v) => {
        switch (v.type) {
            case "column":
                return [`${escapeString(k)}, ${escapeIdentifier(v.column)}`]
            case "relationship":
                const result = ts.flatMap((x) => {
                    if (x.source_table === table) {
                        const rel = x.relationships[v.relationship]
                        if (rel) {
                            return [`'${k}', ${relationship(ts, rel, v, table)}`]
                        }
                    }
                    return []
                })
                if (result.length < 1) {
                    console.log("Couldn't find relationship for field", k, v, ts)
                }
                return result
        }
    }).join(", ")

    return tag("json_object", `JSON_OBJECT(${result})`)
}

function relationshipChainToExistsSubquery(
    w: Expression,
    rels: Array<Relationship & { source_table: string }>
): string {
    const rel = rels.shift()
    if (rel == null) {
        return ""
    }
    const { source_table, target_table, column_mapping } = rel
    // prettier-ignore
    // ts-ignore
    return `EXISTS (
        SELECT 1
        FROM ${target_table}
        WHERE ${Object.entries(column_mapping)
            .map(([k, v]) => `${escapeIdentifier(target_table)}.${escapeIdentifier(k)} = ${escapeIdentifier(source_table)}.${escapeIdentifier(v)}`)
            .join(" AND ")}
        AND 
        ${rels.length > 0
          ? relationshipChainToExistsSubquery(w, rels)
          : relationship_where(target_table, [])(clearExpressionColumnPath(w))}
      )`
}

function clearExpressionColumnPath(e: Expression): Expression {
    switch (e.type) {
        case "and":
        case "or":
        case "not":
            throw new Error("Only comparison expressions should be passed to this function")
        default:
            return {
                ...e,
                column: {
                    ...e.column,
                    path: [],
                },
            }
    }
}

const relationship_where =
    (table: string, ts: Array<TableRelationships>) =>
    (w: Expression | null): Array<string> => {
        if (w == null) {
            return []
        } else {
            switch (w.type) {
                case "not":
                    const aNot = relationship_where(table, ts)(w.expression)
                    if (aNot.length > 0) {
                        return [`(NOT ${aNot})`]
                    }
                    break
                case "and":
                    const aAnd = w.expressions.flatMap(relationship_where(table, ts))
                    if (aAnd.length > 0) {
                        return [`(${aAnd.join(" AND ")})`]
                    }
                    break
                case "or":
                    const aOr = w.expressions.flatMap(relationship_where(table, ts))
                    if (aOr.length > 0) {
                        return [`(${aOr.join(" OR ")})`]
                    }
                    break
                case "unary_op":
                    switch (w.operator) {
                        case "is_null":
                            return [`(${bop_col(w.column)} IS NULL)`] // TODO: Could escape using bop_col if escape is threaded through.
                    }
                case "binary_op":
                    if (w.column.path.length > 0) {
                        let currentTable = table
                        const relPath = w.column.path.map((relName) => {
                            const rel = ts.find((x) => x.relationships[relName] != null)
                            if (rel == null) {
                                throw new Error(`Could not find relationship ${relName} in table ${table}`)
                            }
                            currentTable = rel.source_table
                            const result = { ...rel.relationships[relName], source_table: currentTable }
                            return result
                        })
                        console.log({ relPath })
                        return [relationshipChainToExistsSubquery(w, relPath)]
                    }
                    const bop = bop_op(w.operator)
                    return [`${bop_col(w.column)} ${bop} ${bop_val(w.value)}`]
                case "binary_arr_op":
                    const bopA = bop_array(w.operator)
                    return [`(${bop_col(w.column)} ${bopA} (${w.values.map((v) => escapeString(v)).join(", ")}))`]
            }
            return []
        }
    }

function array_relationship(
    ts: Array<TableRelationships>,
    table: string,
    wJoin: Array<string>,
    fields: Fields,
    wWhere: Expression | null,
    wLimit: number | null,
    wOffset: number | null,
    wOrder: Array<OrderBy>
): string {
    // NOTE: The order of table prefixes is currently assumed to be from "parent" to "child".
    // NOTE: The reuse of the 'j' identifier should be safe due to scoping. This is confirmed in testing.
    if (wOrder.length < 1) {
        return tag(
            "array_relationship",
            `(
          SELECT JSON_GROUP_ARRAY(j)
          FROM (
            SELECT ${json_object(ts, fields, table)} AS j
            FROM ${escapeIdentifier(table)}
            ${where(table, ts)(wWhere, wJoin)}
            ${limit(wLimit)}
            ${offset(wOffset)}
          ))`
        )
    } else {
        // NOTE: Rationale for subselect in FROM clause:
        //
        // There seems to be a bug in SQLite where an ORDER clause in this position causes ARRAY_RELATIONSHIP
        // to return rows as JSON strings instead of JSON objects. This is worked around by using a subselect.
        return tag(
            "array_relationship",
            `(
          SELECT JSON_GROUP_ARRAY(j)
          FROM (
            SELECT ${json_object(ts, fields, table)} AS j
            FROM (
              SELECT *
              FROM ${escapeIdentifier(table)}
              ${where(table, ts)(wWhere, wJoin)}
              ${order(wOrder)}
              ${limit(wLimit)}
              ${offset(wOffset)}
            ) AS ${table}
          ))`
        )
    }
}

function object_relationship(
    ts: Array<TableRelationships>,
    table: string,
    wJoin: Array<string>,
    fields: Fields
): string {
    // NOTE: The order of table prefixes is currently assumed to be from "parent" to "child".
    return tag(
        "object_relationship",
        `(
        SELECT ${json_object(ts, fields, table)} AS j
        FROM ${table}
        ${where(table, ts)(null, wJoin)}
      )`
    )
}

function relationship(ts: Array<TableRelationships>, r: Relationship, field: RelationshipField, table: string): string {
    const wJoin = omap(
        r.column_mapping,
        (k, v) =>
            `${escapeIdentifier(table)}.${escapeIdentifier(k)} = ${escapeIdentifier(r.target_table)}.${escapeIdentifier(
                v
            )}`
    )

    switch (r.relationship_type) {
        case "object":
            return tag("relationship", object_relationship(ts, r.target_table, wJoin, field.query.fields))

        case "array":
            return tag(
                "relationship",
                array_relationship(
                    ts,
                    r.target_table,
                    wJoin,
                    field.query.fields,
                    coerceUndefinedToNull(field.query.where),
                    coerceUndefinedToNull(field.query.limit),
                    coerceUndefinedToNull(field.query.offset),
                    coerceUndefinedOrNullToEmptyArray(field.query.order_by)
                )
            )
    }
}

function bop_col(c: ComparisonColumn): string {
    if (c.path.length < 1) {
        return tag("bop_col", escapeIdentifier(c.name))
    } else {
        return tag("bop_col", c.path.map(escapeIdentifier).join(".") + "." + escapeIdentifier(c.name))
    }
}

function bop_array(o: BinaryArrayComparisonOperator): string {
    switch (o) {
        case "in":
            return tag("bop_array", "IN")
    }
}

function bop_op(o: BinaryComparisonOperator): string {
    let result
    switch (o) {
        case "equal":
            result = "="
            break
        case "greater_than":
            result = ">"
            break
        case "greater_than_or_equal":
            result = ">="
            break
        case "less_than":
            result = "<"
            break
        case "less_than_or_equal":
            result = "<="
            break
    }
    return tag("bop_op", result)
}

function bop_val(v: ComparisonValue): string {
    switch (v.type) {
        case "column":
            return tag("bop_val", bop_col(v.column))
        case "scalar":
            return tag("bop_val", escapeString(v.value))
    }
}

function order(o: Array<OrderBy>): string {
    if (o.length < 1) {
        return ""
    }
    const result = o.map((e) => `${e.column} ${e.ordering}`).join(", ")
    return tag("order", `ORDER BY ${result}`)
}

/**
 * @param whereArray Expressions used in the associated where clause
 * @param joinArray Join clauses
 * @returns string representing the combined where clause
 */
const where =
    (table: string, ts: Array<TableRelationships>) =>
    (whereArray: Expression | null, joinArray: Array<string>): string => {
        const clauses = [...relationship_where(table, ts)(whereArray), ...joinArray]
        if (clauses.length < 1) {
            return ""
        } else {
            return tag("where", `WHERE ${clauses.join(" AND ")}`)
        }
    }

function limit(l: number | null): string {
    if (l === null) {
        return ""
    } else {
        return tag("limit", `LIMIT ${l}`)
    }
}

function offset(o: number | null): string {
    if (o == null) {
        return ""
    } else {
        return tag("offset", `OFFSET ${o}`)
    }
}

/** Top-Level Query Function.
 */
function query(request: QueryRequest): string {
    const result = array_relationship(
        request.table_relationships,
        request.table,
        [],
        request.query.fields,
        coerceUndefinedToNull(request.query.where),
        coerceUndefinedToNull(request.query.limit),
        coerceUndefinedToNull(request.query.offset),
        coerceUndefinedOrNullToEmptyArray(request.query.order_by)
    )
    return tag("query", `SELECT ${result} as data`)
}

/** Format the DB response into a /query response.
 *
 * Note: There should always be one result since 0 rows still generates an empty JSON array.
 */
function output(rows: any): QueryResponse {
    return JSON.parse(rows[0].data)
}

/** Function to add SQL comments to the generated SQL to tag which procedures generated what text.
 *
 * tag('a','b') => '/*\<a>\*\/ b /*\</a>\*\/'
 */
function tag(t: string, s: string): string {
    return `/*<${t}>*/ ${s} /*</${t}>*/`
}

/** Performs a query and returns results
 * 
 * Limitations:
 * 
 * - Binary Array Operations not currently supported.
 * 
 * The current algorithm is to first create a query, then execute it, returning results.
 * 
 * Method for adding relationship fields:
 * 
 * - JSON aggregation similar to Postgres' approach.
 *     - 4.13. The json_group_array() and json_group_object() aggregate SQL functions
 *     - https://www.sqlite.org/json1.html#jgrouparray
 *
 * Example of a test query:
 * 
 * ```
 * query MyQuery {
 *   Artist(limit: 5, order_by: {ArtistId: asc}, where: {Name: {_neq: "Accept"}, _and: {Name: {_is_null: false}}}, offset: 3) {
 *     ArtistId
 *     Name
 *     Albums(where: {Title: {_is_null: false, _gt: "A", _nin: "foo"}}, limit: 2) {
 *       AlbumId
 *       Title
 *       ArtistId
 *       Tracks(limit: 1) {
 *         Name
 *         TrackId
 *       }
 *       Artist {
 *         ArtistId
 *       }
 *     }
 *   }
 *   Track(limit: 3) {
 *     Name
 *     Album {
 *       Title
 *     }
 *   }
 * }
 * ```
 * 
 */
export async function queryData(config: Config, queryRequest: QueryRequest): Promise<QueryResponse> {
    const db = connect(config) // TODO: Should this be cached?
    const q = query(queryRequest)
    const [result, metadata] = await db.query(q)

    return output(result)
}
