hasura / sqlite-dataconnector-agent

SQLite Data Connector Agent for Hasura GQL Engine. Please note that this repository is a mirror. We will still accept PRs, but will have to mirror them to our upstream repo.


Data Connector Agent for SQLite

This directory contains an SQLite implementation of a data connector agent. It can use local SQLite database files as referenced by the "db" config field.

Capabilities

The SQLite agent currently supports the following capabilities:

  • GraphQL Schema
  • GraphQL Queries
  • Relationships
  • Aggregations
  • Prometheus Metrics
  • Exposing Foreign-Key Information
  • Mutations
  • Subscriptions
  • Streaming Subscriptions

Note: Detailed metadata about the agent's capabilities can be obtained by performing a GET request against the /capabilities endpoint of the running agent.
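For example, with the agent running locally on its default port:

curl http://localhost:8100/capabilities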

Requirements

Build & Run

npm install
npm run build
npm run start

Or a simple dev-loop via entr:

echo src/**/*.ts | xargs -n1 echo | DB_READONLY=y entr -r npm run start

Docker Build & Run

> docker build . -t dc-sqlite-agent:latest
> docker run -it --rm -p 8100:8100 dc-sqlite-agent:latest

You will want to mount a volume with your database(s) so that they can be referenced in configuration.
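For example, a setup that keeps SQLite files under ./db on the host could mount that directory into the container (the paths here are illustrative, not required by the agent):

> docker run -it --rm -p 8100:8100 -v "$PWD/db":/db dc-sqlite-agent:latest

Databases mounted this way would then be referenced in configuration by their in-container paths, e.g. /db/your-database.sqlite.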

Options / Environment Variables

Note: Boolean flags {FLAG} can be provided as 1, true, t, yes, y, or omitted and default to false.

| ENV Variable Name | Format | Default | Info |
| ----------------- | ------ | ------- | ---- |
| PORT | INT | 8100 | Port for the agent to listen on. |
| PERMISSIVE_CORS | {FLAG} | false | Allows all requests - useful for testing with SwaggerUI. Turn off in production. |
| DB_CREATE | {FLAG} | false | Allows new databases to be created. |
| DB_READONLY | {FLAG} | false | Makes databases readonly. |
| DB_ALLOW_LIST | DB1[,DB2]* | Any Allowed | Restricts which databases can be connected to. |
| DB_PRIVATECACHE | {FLAG} | Shared | Keeps caches between connections private. |
| DEBUGGING_TAGS | {FLAG} | false | Outputs XML-style tags in query comments for debugging purposes. |
| PRETTY_PRINT_LOGS | {FLAG} | false | Uses pino-pretty to pretty-print request logs. |
| LOG_LEVEL | fatal \| error \| info \| debug \| trace \| silent | info | The minimum log level to output. |
| METRICS | {FLAG} | false | Enables a Prometheus metrics endpoint at /metrics. |
| QUERY_LENGTH_LIMIT | INT | Infinity | Puts a limit on the length of generated SQL before execution. |
| DATASETS | {FLAG} | false | Enables dataset operations. |
| DATASET_DELETE | {FLAG} | false | Enables DELETE /datasets/:name. |
| DATASET_TEMPLATES | DIRECTORY | ./dataset_templates | Directory to clone datasets from. |
| DATASET_CLONES | DIRECTORY | ./dataset_clones | Directory to clone datasets to. |
| MUTATIONS | {FLAG} | false | Enables mutation support. |
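As an illustration, a local run that turns on metrics, mutations, and read-only databases (flag values chosen purely for the example) could look like:

METRICS=y MUTATIONS=y DB_READONLY=y LOG_LEVEL=debug npm run start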

Agent usage

The agent is configured as per the configuration schema. The valid configuration properties are:

| Property | Type | Default |
| -------- | ---- | ------- |
| db | string | |
| tables | string[] | null |
| include_sqlite_meta_tables | boolean | false |
| explicit_main_schema | boolean | false |

The only required property is db, which specifies the local SQLite database file to use.

The schema is exposed via introspection, but you can limit which tables are referenced by

  • Explicitly enumerating them via the tables property, or
  • Toggling the include_sqlite_meta_tables option to include or exclude SQLite meta tables.

The explicit_main_schema field can be set to opt into exposing tables by their fully qualified names (i.e. ["main", "MyTable"] instead of just ["MyTable"]).
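As a sketch, a configuration value for this agent might look like the following, where the database path and table names are purely illustrative:

{
  "db": "./db/Chinook.sqlite",
  "tables": ["Artist", "Album"],
  "include_sqlite_meta_tables": false,
  "explicit_main_schema": false
}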

Dataset

The dataset used for testing the reference agent is sourced from:

Datasets

Dataset support is enabled via the following ENV variables:

  • DATASETS
  • DATASET_DELETE
  • DATASET_TEMPLATES
  • DATASET_CLONES

Templates will be looked up at ${DATASET_TEMPLATES}/${template_name}.sqlite or ${DATASET_TEMPLATES}/${template_name}.sql. The .sqlite templates are just SQLite database files that will be copied as a clone. The .sql templates are SQL script files that will be run against a blank SQLite database in order to create a clone.

Clones will be copied to ${DATASET_CLONES}/${clone_name}.sqlite.
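For instance, a .sql template could be produced from an existing database with the sqlite3 CLI (the file names here are illustrative only):

sqlite3 my-source.sqlite .dump > dataset_templates/my-template.sql

Cloning that template would then write the clone to ${DATASET_CLONES}/${clone_name}.sqlite as described above.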

Testing Changes to the Agent

Ensure you run the agent with DATASETS=1 DATASET_DELETE=1 MUTATIONS=1 in order to enable testing of mutations.
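For example, using the npm start script from above:

DATASETS=1 DATASET_DELETE=1 MUTATIONS=1 npm run start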

Then, from the HGE repo, run:

cabal run dc-api:test:tests-dc-api -- test --agent-base-url http://localhost:8100 sandwich --tui

Known Issues

  • Using "returning" in update mutations where you join across relationships that are affected by the update mutation itself may return inconsistent results. This is because of this issue with SQLite: https://sqlite.org/forum/forumpost/9470611066

TODO

  • Prometheus metrics hosted at /metrics
  • Pull reference types from a package rather than checked-in files
  • Health Check
  • DB Specific Health Checks
  • Schema
  • Capabilities
  • Query
  • Array Relationships
  • Object Relationships
  • Ensure everything is escaped correctly - https://sequelize.org/api/v6/class/src/sequelize.js~sequelize#instance-method-escape
  • Or... Use parameterized queries if possible - https://sequelize.org/docs/v6/core-concepts/raw-queries/#bind-parameter
  • Run test-suite from SDK
  • Remove old queries module
  • Relationships / Joins
  • Rename resultTT and other badly named types in the schema.ts module
  • Add ENV Variable for restriction on what databases can be used
  • Update to the latest types
  • Port back to hge codebase as an official reference agent
  • Make escapeSQL global to the query module
  • Make CORS permissions configurable
  • Optional DB Allowlist
  • Fix SDK Test suite to be more flexible about descriptions
  • READONLY option
  • CREATE option
  • Don't create DB option
  • Aggregate queries
  • Verbosity settings
  • Cache settings
  • Missing WHERE clause from object relationships
  • Reuse find_table_relationship in more scenarios
  • ORDER clause in aggregates breaks SQLite parser for some reason
  • Check that looped exist check doesn't cause name conflicts
  • NOT EXISTS IS NULL != EXISTS IS NOT NULL
  • Mutation support

Contributors

daniel-chambers, danieljharvey, dmoverton, i-am-tom, m4ttheweric, nicuveo, solomon-b, sordina


sqlite-dataconnector-agent's Issues

Sample code for supporting queries on related rows

Here's a slightly messy but working implementation of query.ts (only for BinaryComparisonOperator, as a proof of concept).

Changes that were made were:

  • Thread the relationship and table name data through the call chain
  • Write a recursive function that generates EXISTS(SELECT ... queries as it walks down the path: [] array
  • Check whether column.path is non-empty when generating WHERE. If it is, we need to generate the EXISTS subquery first, then continue normal processing.


/*<query>*/ SELECT /*<array_relationship>*/ (
          SELECT JSON_GROUP_ARRAY(j)
          FROM (
            SELECT /*<json_object>*/ JSON_OBJECT('ArtistId', "ArtistId", 'Name', "Name") /*</json_object>*/ AS j
            FROM "Artist"
            /*<where>*/ WHERE (((EXISTS (
        SELECT 1
        FROM Album
        WHERE "Album"."ArtistId" = "Artist"."ArtistId"
        AND
        EXISTS (
        SELECT 1
        FROM Track
        WHERE "Track"."AlbumId" = "Album"."AlbumId"
        AND
        /*<bop_col>*/ "Name" /*</bop_col>*/ /*<bop_op>*/ = /*</bop_op>*/ /*<bop_val>*/ 'Restless and Wild' /*</bop_val>*/
      )
      )))) /*</where>*/
            /*<limit>*/ LIMIT 2 /*</limit>*/

          )) /*</array_relationship>*/ as data /*</query>*/
import { Config } from "./config"
import { connect } from "./db"
import {
    BinaryArrayComparisonOperator,
    BinaryComparisonOperator,
    ComparisonColumn,
    ComparisonValue,
    Expression,
    Field,
    OrderBy,
    QueryRequest,
    QueryResponse,
    Relationship,
    RelationshipField,
    TableRelationships,
} from "./types"
import { coerceUndefinedOrNullToEmptyArray, coerceUndefinedToNull, omap } from "./util"

const SqlString = require("sqlstring-sqlite")

/** Helper type for convenience. Uses the sqlstring-sqlite library, but should ideally use the function in sequelize.
 */
type Fields = Record<string, Field>

function escapeString(x: any): string {
    return SqlString.escape(x)
}

/**
 *
 * @param identifier: Unescaped name. E.g. 'Alb"um'
 * @returns Escaped name. E.g. '"Alb\"um"'
 */
function escapeIdentifier(identifier: string): string {
    // TODO: Review this function since the current implementation is off the cuff.
    const result = identifier.replace(/\\/g, "\\\\").replace(/"/g, '\\"')
    return `"${result}"`
}

function json_object(ts: Array<TableRelationships>, fields: Fields, table: string): string {
    const result = omap(fields, (k, v) => {
        switch (v.type) {
            case "column":
                return [`${escapeString(k)}, ${escapeIdentifier(v.column)}`]
            case "relationship":
                const result = ts.flatMap((x) => {
                    if (x.source_table === table) {
                        const rel = x.relationships[v.relationship]
                        if (rel) {
                            return [`'${k}', ${relationship(ts, rel, v, table)}`]
                        }
                    }
                    return []
                })
                if (result.length < 1) {
                    console.log("Couldn't find relationship for field", k, v, ts)
                }
                return result
        }
    }).join(", ")

    return tag("json_object", `JSON_OBJECT(${result})`)
}

function relationshipChainToExistsSubquery(
    w: Expression,
    rels: Array<Relationship & { source_table: string }>
): string {
    const rel = rels.shift()
    if (rel == null) {
        return ""
    }
    const { source_table, target_table, column_mapping } = rel
    // prettier-ignore
    // ts-ignore
    return `EXISTS (
        SELECT 1
        FROM ${target_table}
        WHERE ${Object.entries(column_mapping)
            .map(([k, v]) => `${escapeIdentifier(target_table)}.${escapeIdentifier(k)} = ${escapeIdentifier(source_table)}.${escapeIdentifier(v)}`)
            .join(" AND ")}
        AND 
        ${rels.length > 0
          ? relationshipChainToExistsSubquery(w, rels)
          : relationship_where(target_table, [])(clearExpressionColumnPath(w))}
      )`
}

function clearExpressionColumnPath(e: Expression): Expression {
    switch (e.type) {
        case "and":
        case "or":
        case "not":
            throw new Error("Only comparison expressions should be passed to this function")
        default:
            return {
                ...e,
                column: {
                    ...e.column,
                    path: [],
                },
            }
    }
}

const relationship_where =
    (table: string, ts: Array<TableRelationships>) =>
    (w: Expression | null): Array<string> => {
        if (w == null) {
            return []
        } else {
            switch (w.type) {
                case "not":
                    const aNot = relationship_where(table, ts)(w.expression)
                    if (aNot.length > 0) {
                        return [`(NOT ${aNot})`]
                    }
                    break
                case "and":
                    const aAnd = w.expressions.flatMap(relationship_where(table, ts))
                    if (aAnd.length > 0) {
                        return [`(${aAnd.join(" AND ")})`]
                    }
                    break
                case "or":
                    const aOr = w.expressions.flatMap(relationship_where(table, ts))
                    if (aOr.length > 0) {
                        return [`(${aOr.join(" OR ")})`]
                    }
                    break
                case "unary_op":
                    switch (w.operator) {
                        case "is_null":
                            return [`(${bop_col(w.column)} IS NULL)`] // TODO: Could escape using bop_col if escape is threaded through.
                    }
                case "binary_op":
                    if (w.column.path.length > 0) {
                        let currentTable = table
                        const relPath = w.column.path.map((relName) => {
                            const rel = ts.find((x) => x.relationships[relName] != null)
                            if (rel == null) {
                                throw new Error(`Could not find relationship ${relName} in table ${table}`)
                            }
                            currentTable = rel.source_table
                            const result = { ...rel.relationships[relName], source_table: currentTable }
                            return result
                        })
                        console.log({ relPath })
                        return [relationshipChainToExistsSubquery(w, relPath)]
                    }
                    const bop = bop_op(w.operator)
                    return [`${bop_col(w.column)} ${bop} ${bop_val(w.value)}`]
                case "binary_arr_op":
                    const bopA = bop_array(w.operator)
                    return [`(${bop_col(w.column)} ${bopA} (${w.values.map((v) => escapeString(v)).join(", ")}))`]
            }
            return []
        }
    }

function array_relationship(
    ts: Array<TableRelationships>,
    table: string,
    wJoin: Array<string>,
    fields: Fields,
    wWhere: Expression | null,
    wLimit: number | null,
    wOffset: number | null,
    wOrder: Array<OrderBy>
): string {
    // NOTE: The order of table prefixes is currently assumed to be from "parent" to "child".
    // NOTE: The reuse of the 'j' identifier should be safe due to scoping. This is confirmed in testing.
    if (wOrder.length < 1) {
        return tag(
            "array_relationship",
            `(
          SELECT JSON_GROUP_ARRAY(j)
          FROM (
            SELECT ${json_object(ts, fields, table)} AS j
            FROM ${escapeIdentifier(table)}
            ${where(table, ts)(wWhere, wJoin)}
            ${limit(wLimit)}
            ${offset(wOffset)}
          ))`
        )
    } else {
        // NOTE: Rationale for subselect in FROM clause:
        //
        // There seems to be a bug in SQLite where an ORDER clause in this position causes ARRAY_RELATIONSHIP
        // to return rows as JSON strings instead of JSON objects. This is worked around by using a subselect.
        return tag(
            "array_relationship",
            `(
          SELECT JSON_GROUP_ARRAY(j)
          FROM (
            SELECT ${json_object(ts, fields, table)} AS j
            FROM (
              SELECT *
              FROM ${escapeIdentifier(table)}
              ${where(table, ts)(wWhere, wJoin)}
              ${order(wOrder)}
              ${limit(wLimit)}
              ${offset(wOffset)}
            ) AS ${table}
          ))`
        )
    }
}

function object_relationship(
    ts: Array<TableRelationships>,
    table: string,
    wJoin: Array<string>,
    fields: Fields
): string {
    // NOTE: The order of table prefixes is currently assumed to be from "parent" to "child".
    return tag(
        "object_relationship",
        `(
        SELECT ${json_object(ts, fields, table)} AS j
        FROM ${table}
        ${where(table, ts)(null, wJoin)}
      )`
    )
}

function relationship(ts: Array<TableRelationships>, r: Relationship, field: RelationshipField, table: string): string {
    const wJoin = omap(
        r.column_mapping,
        (k, v) =>
            `${escapeIdentifier(table)}.${escapeIdentifier(k)} = ${escapeIdentifier(r.target_table)}.${escapeIdentifier(
                v
            )}`
    )

    switch (r.relationship_type) {
        case "object":
            return tag("relationship", object_relationship(ts, r.target_table, wJoin, field.query.fields))

        case "array":
            return tag(
                "relationship",
                array_relationship(
                    ts,
                    r.target_table,
                    wJoin,
                    field.query.fields,
                    coerceUndefinedToNull(field.query.where),
                    coerceUndefinedToNull(field.query.limit),
                    coerceUndefinedToNull(field.query.offset),
                    coerceUndefinedOrNullToEmptyArray(field.query.order_by)
                )
            )
    }
}

function bop_col(c: ComparisonColumn): string {
    if (c.path.length < 1) {
        return tag("bop_col", escapeIdentifier(c.name))
    } else {
        return tag("bop_col", c.path.map(escapeIdentifier).join(".") + "." + escapeIdentifier(c.name))
    }
}

function bop_array(o: BinaryArrayComparisonOperator): string {
    switch (o) {
        case "in":
            return tag("bop_array", "IN")
    }
}

function bop_op(o: BinaryComparisonOperator): string {
    let result
    switch (o) {
        case "equal":
            result = "="
            break
        case "greater_than":
            result = ">"
            break
        case "greater_than_or_equal":
            result = ">="
            break
        case "less_than":
            result = "<"
            break
        case "less_than_or_equal":
            result = "<="
            break
    }
    return tag("bop_op", result)
}

function bop_val(v: ComparisonValue): string {
    switch (v.type) {
        case "column":
            return tag("bop_val", bop_col(v.column))
        case "scalar":
            return tag("bop_val", escapeString(v.value))
    }
}

function order(o: Array<OrderBy>): string {
    if (o.length < 1) {
        return ""
    }
    const result = o.map((e) => `${e.column} ${e.ordering}`).join(", ")
    return tag("order", `ORDER BY ${result}`)
}

/**
 * @param whereArray Expressions used in the associated where clause
 * @param joinArray Join clauses
 * @returns string representing the combined where clause
 */
const where =
    (table: string, ts: Array<TableRelationships>) =>
    (whereArray: Expression | null, joinArray: Array<string>): string => {
        const clauses = [...relationship_where(table, ts)(whereArray), ...joinArray]
        if (clauses.length < 1) {
            return ""
        } else {
            return tag("where", `WHERE ${clauses.join(" AND ")}`)
        }
    }

function limit(l: number | null): string {
    if (l === null) {
        return ""
    } else {
        return tag("limit", `LIMIT ${l}`)
    }
}

function offset(o: number | null): string {
    if (o == null) {
        return ""
    } else {
        return tag("offset", `OFFSET ${o}`)
    }
}

/** Top-Level Query Function.
 */
function query(request: QueryRequest): string {
    const result = array_relationship(
        request.table_relationships,
        request.table,
        [],
        request.query.fields,
        coerceUndefinedToNull(request.query.where),
        coerceUndefinedToNull(request.query.limit),
        coerceUndefinedToNull(request.query.offset),
        coerceUndefinedOrNullToEmptyArray(request.query.order_by)
    )
    return tag("query", `SELECT ${result} as data`)
}

/** Format the DB response into a /query response.
 *
 * Note: There should always be one result since 0 rows still generates an empty JSON array.
 */
function output(rows: any): QueryResponse {
    return JSON.parse(rows[0].data)
}

/** Function to add SQL comments to the generated SQL to tag which procedures generated what text.
 *
 * tag('a','b') => '/*\<a>\*\/ b /*\</a>\*\/'
 */
function tag(t: string, s: string): string {
    return `/*<${t}>*/ ${s} /*</${t}>*/`
}

/** Performs a query and returns results
 * 
 * Limitations:
 * 
 * - Binary Array Operations not currently supported.
 * 
 * The current algorithm is to first create a query, then execute it, returning results.
 * 
 * Method for adding relationship fields:
 * 
 * - JSON aggregation similar to Postgres' approach.
 *     - 4.13. The json_group_array() and json_group_object() aggregate SQL functions
 *     - https://www.sqlite.org/json1.html#jgrouparray
 *
 * Example of a test query:
 * 
 * ```
 * query MyQuery {
 *   Artist(limit: 5, order_by: {ArtistId: asc}, where: {Name: {_neq: "Accept"}, _and: {Name: {_is_null: false}}}, offset: 3) {
 *     ArtistId
 *     Name
 *     Albums(where: {Title: {_is_null: false, _gt: "A", _nin: "foo"}}, limit: 2) {
 *       AlbumId
 *       Title
 *       ArtistId
 *       Tracks(limit: 1) {
 *         Name
 *         TrackId
 *       }
 *       Artist {
 *         ArtistId
 *       }
 *     }
 *   }
 *   Track(limit: 3) {
 *     Name
 *     Album {
 *       Title
 *     }
 *   }
 * }
 * ```
 * 
 */
export async function queryData(config: Config, queryRequest: QueryRequest): Promise<QueryResponse> {
    const db = connect(config) // TODO: Should this be cached?
    const q = query(queryRequest)
    const [result, metadata] = await db.query(q)

    return output(result)
}
