Dexter

Dexter is a set of mature LLM tools used in production at Dexa, with a focus on real-world RAG (Retrieval Augmented Generation).

If you're a TypeScript AI engineer, check it out! 😊

Features

  • production-quality RAG
  • extremely fast and minimal
  • handles caching, throttling, and batching for ingesting large datasets
  • optional hybrid search w/ SPLADE embeddings
  • minimal TS package w/ full typing
  • uses fetch everywhere
  • supports Node.js 18+, Deno, Cloudflare Workers, Vercel edge functions, etc
  • full docs

Install

npm install @dexaai/dexter

This package requires Node.js >= 18 or an environment with fetch support.

This package exports ESM. If your project uses CommonJS, consider switching to ESM or using the dynamic import() function.
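
For CommonJS projects, the dynamic import() route can be sketched as below. The helper name importEsm is hypothetical and not part of Dexter. One caveat to check in your own setup: with "module": "commonjs", TypeScript rewrites import() into require(), which older Node.js versions cannot use to load ESM-only packages, whereas "module": "node16" or "nodenext" keeps the native import().

```typescript
// Hypothetical helper (not part of Dexter): load an ESM module at runtime
// from code that is otherwise CommonJS.
async function importEsm<T>(specifier: string): Promise<T> {
  // A non-literal specifier keeps this a runtime import rather than a
  // statically resolved one.
  const mod: unknown = await import(specifier);
  return mod as T;
}

// Usage inside an async function (subpath taken from the Usage section):
//   const { EmbeddingModel } =
//     await importEsm<typeof import('@dexaai/dexter/model')>('@dexaai/dexter/model');
```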

Usage

This is a basic example using OpenAI's text-embedding-ada-002 embedding model and a Pinecone datastore to index and query a set of documents.

import 'dotenv/config';
import { EmbeddingModel } from '@dexaai/dexter/model';
import { PineconeDatastore } from '@dexaai/dexter/datastore/pinecone';

async function example() {
  const embeddingModel = new EmbeddingModel({
    params: { model: 'text-embedding-ada-002' },
  });

  const store = new PineconeDatastore({
    contentKey: 'content',
    embeddingModel,
  });

  await store.upsert([
    { id: '1', metadata: { content: 'cat' } },
    { id: '2', metadata: { content: 'dog' } },
    { id: '3', metadata: { content: 'whale' } },
    { id: '4', metadata: { content: 'shark' } },
    { id: '5', metadata: { content: 'computer' } },
    { id: '6', metadata: { content: 'laptop' } },
    { id: '7', metadata: { content: 'phone' } },
    { id: '8', metadata: { content: 'tablet' } },
  ]);

  const result = await store.query({ query: 'dolphin' });
  console.log(result);
}

Docs

See the docs for a full usage guide and API reference.

Examples

To run the included examples, clone this repo, run pnpm install, set up your .env file, and then run an example file using tsx.

Environment variables required to run the examples:

  • OPENAI_API_KEY - OpenAI API key
  • PINECONE_API_KEY - Pinecone API key
  • PINECONE_BASE_URL - Pinecone index's base URL
    • You should be able to use a free-tier "starter" index for most of the examples, but you'll need to upgrade to a paid index to run any of the hybrid search examples
    • Note that Pinecone's free starter index doesn't support namespaces, deleteAll, or hybrid search :sigh:
  • SPLADE_SERVICE_URL - optional; only used for the chatbot hybrid search example
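
Before running an example, it can help to fail fast on missing configuration. The following is a minimal sketch, not code from this repo: findMissingEnvVars is a hypothetical helper, and the variable names are the ones listed above (SPLADE_SERVICE_URL is left out because it is optional).

```typescript
// Required variables from the list above; the optional one is omitted.
const REQUIRED_ENV_VARS = [
  'OPENAI_API_KEY',
  'PINECONE_API_KEY',
  'PINECONE_BASE_URL',
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables that are
// unset or contain only whitespace.
function findMissingEnvVars(
  env: Env,
  required: readonly string[] = REQUIRED_ENV_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Usage in Node.js:
//   const missing = findMissingEnvVars(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(', ')}`);
```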

Basic

npx tsx examples/basic.ts

source

Caching

npx tsx examples/caching.ts

source

Redis Caching

This example requires a valid REDIS_URL env var.

npx tsx examples/caching-redis.ts

source

AI Function

This example shows how to use createAIFunction to handle function and tool_calls with the OpenAI chat completions API and Zod.

npx tsx examples/ai-function.ts

source

AI Runner

This example shows how to use createAIRunner to easily invoke a chain of OpenAI chat completion calls, resolving tool / function calls, retrying when necessary, and optionally validating the resulting output via Zod.

Note that createAIRunner takes in a functions array of AIFunction objects created by createAIFunction, as the two utility functions are meant to be used together.

npx tsx examples/ai-runner.ts

source

Chatbot

This is a more involved example of a chatbot using RAG. It indexes 100 transcript chunks from the Huberman Lab Podcast into a hybrid Pinecone datastore using OpenAI ada-002 embeddings for the dense vectors and a HuggingFace SPLADE model for the sparse embeddings.

You'll need the following environment variables to run this example:

  • OPENAI_API_KEY
  • PINECONE_API_KEY
  • PINECONE_BASE_URL
    • Note: Pinecone's free starter indexes don't seem to support namespaces or hybrid search, so unfortunately you'll need to upgrade to a paid plan to run this example. See Pinecone's hybrid docs for details on setting up a hybrid index, and make sure it is using the dotproduct metric.
  • SPLADE_SERVICE_URL
    • Here is an example of how to run a SPLADE REST API, which can be deployed to Modal or any other GPU-enabled hosting provider.

npx tsx examples/chatbot/ingest.ts
npx tsx examples/chatbot/cli.ts

source

License

MIT © Dexa

dexter's Issues

Poor visibility into errors within an AIRunner

When an AIRunner encounters an error, it swallows the error and retries until it hits the max iterations. It then returns an error stating only that the max iterations were hit, without ever exposing the underlying errors.
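
One way to address this, sketched here as a generic retry loop rather than Dexter's actual AIRunner code, is to collect each failure and attach the list to the final error. The names MaxIterationsError and runWithRetries are hypothetical.

```typescript
// Hypothetical error type that carries every failure seen while retrying.
class MaxIterationsError extends Error {
  readonly errors: unknown[];

  constructor(maxIterations: number, errors: unknown[]) {
    const details = errors
      .map((err, i) => {
        const text = err instanceof Error ? err.message : String(err);
        return `  attempt ${i + 1}: ${text}`;
      })
      .join('\n');
    super(`Max iterations (${maxIterations}) reached. Underlying errors:\n${details}`);
    this.name = 'MaxIterationsError';
    this.errors = errors;
  }
}

// Generic retry loop (an illustration, not Dexter's implementation):
// each failure is recorded instead of being discarded.
async function runWithRetries<T>(
  attempt: (iteration: number) => Promise<T>,
  maxIterations: number
): Promise<T> {
  const errors: unknown[] = [];
  for (let i = 0; i < maxIterations; i++) {
    try {
      return await attempt(i);
    } catch (err) {
      errors.push(err);
    }
  }
  throw new MaxIterationsError(maxIterations, errors);
}
```

With this shape, a caller that catches the final error can inspect its errors array or simply log the message, which lists every attempt's failure.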

Arguments of tool is always empty

I am using an OpenAI project in which I apply a tool call in two variations:

  • retrieval for extracting data from a file
  • function_call for invoking image generation

In both cases, the same problem arises: the toolCall object does not contain arguments in the function. Consequently, when calling file access, an error occurs when attempting to parse json. And when calling image generation, the arguments should include a prompt, which is also not delivered.

Reproduction steps:

Create a function to retrieve information from a file and access it. As a result, an error occurs:

JSONError: Unexpected end of JSON input while parsing empty string

This is related to the fact that the function.arguments object is empty.

toolCall: {
  index: 0,
  id: 'call_UEn7arSct776zKLJVyMY4o7Q',
  type: 'function',
  function: { name: 'retrieval', arguments: '' }
}

Create a function for image generation and access it. As a result, an error occurs:

JSONError: Unexpected end of JSON input while parsing empty string

Because the same object with arguments will be empty.

After conducting an investigation, I discovered that dexter is not returning the arguments for function calls. Is it possible that something has changed on OpenAI's end in recent weeks, making this functionality unavailable now?

What should I do? Where should I look?
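
While the root cause is open, a defensive parse at least turns the JSONError into a readable diagnostic. This is a minimal sketch, not a fix for the missing arguments: parseToolArguments is a hypothetical helper, and it assumes the arguments field is a JSON-encoded string as in the toolCall shown above.

```typescript
type ParsedArguments =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; reason: string };

// Hypothetical helper: parse a tool call's `function.arguments` string
// without throwing on empty or malformed input.
function parseToolArguments(raw: string | undefined): ParsedArguments {
  if (raw === undefined || raw.trim() === '') {
    return { ok: false, reason: 'arguments string is empty' };
  }
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== 'object' || value === null || Array.isArray(value)) {
      return { ok: false, reason: 'arguments are not a JSON object' };
    }
    return { ok: true, value: value as Record<string, unknown> };
  } catch (err) {
    const text = err instanceof Error ? err.message : String(err);
    return { ok: false, reason: `invalid JSON: ${text}` };
  }
}
```

Called on the empty string from the report, this returns { ok: false, reason: 'arguments string is empty' } instead of throwing, which makes it easier to log the offending toolCall.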

Dexter sub-path imports don't seem to work without tsconfig "moduleResolution": "Bundler"

Using "moduleResolution": "node" in the consuming project, which is very common, seems to break dexter's typing exports as tsserver can't find them.

Example:

import { type Prompt } from '@dexaai/dexter/prompt'; // this gives an error

This will probably surprise anyone who tries to use this package, since before 2 weeks ago, I had never used "moduleResolution": "node".

It would be nice to use tsc only, but I wonder if we'd be better off using something like tsup for output compatibility?

Improve support for OpenAI-compatible LLM providers

Trying to use dexter with non-OpenAI LLMs hosted by OpenRouter and seeing a few issues.

Cost warnings

[Screenshot: CleanShot 2024-03-31 at 16 19 49@2x]

This logs for every LLM call. Since there's no real standard here aside from OpenAI, it'd be nice to support this total_cost format as an optional parameter that we fallback to internally and skip the console.warn.
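
A rough sketch of the requested fallback follows. Everything here is an assumption made for illustration: the screenshot is not available, so the placement of total_cost on the usage object, the price table shape, and the resolveCost name are all hypothetical rather than Dexter's or OpenRouter's actual API.

```typescript
// Assumed shapes; not taken from Dexter or any provider's documentation.
type UsageLike = {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_cost?: number; // assumed location of the provider-reported cost
};

type PricePerToken = { prompt: number; completion: number };

// Hypothetical helper: prefer internal pricing for known models, fall back
// to a provider-reported total_cost, and return undefined (rather than
// warning) when neither is available.
function resolveCost(
  model: string,
  usage: UsageLike,
  priceTable: Map<string, PricePerToken>
): number | undefined {
  const price = priceTable.get(model);
  if (price !== undefined) {
    return (
      (usage.prompt_tokens ?? 0) * price.prompt +
      (usage.completion_tokens ?? 0) * price.completion
    );
  }
  if (typeof usage.total_cost === 'number') {
    return usage.total_cost;
  }
  return undefined;
}
```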


Will add more cases to this issue as I come across them.

[Feature Request] Support alternative vector DB's other than Pinecone

Hi there,

Congrats on the launch! The library looks very clean, and I like the interfaces. I would love to see some alternative stores implemented, and would be happy to contribute some myself.

Is there anything I/we (other contributors) should be aware of when thinking about submitting a PR like this?

Thanks again!

Make event args readonly

This continues the work in #17 to make Dexter's core more immutable and easier to reason about.

Specifically, any of the objects passed to event handlers should be readonly to guarantee that event handlers don't have unintended side effects for future event handlers or internal functionality.

We may want to consider making other key objects like AbstractModel.params deep readonly as well.
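
To make the idea concrete, here is a minimal sketch of a deep readonly type paired with a runtime freeze. DeepReadonly and deepFreeze are illustrative names, not existing Dexter exports, and Object.freeze does not make Map or Set contents immutable.

```typescript
// Recursively marks every property readonly; functions are left as-is.
type DeepReadonly<T> = T extends (...args: any[]) => unknown
  ? T
  : T extends object
    ? { readonly [K in keyof T]: DeepReadonly<T[K]> }
    : T;

// Runtime counterpart: freezes the object and everything reachable from it.
// Freezing before recursing means cyclic references terminate.
function deepFreeze<T>(value: T): DeepReadonly<T> {
  if (typeof value === 'object' && value !== null && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const key of Object.keys(value)) {
      deepFreeze((value as Record<string, unknown>)[key]);
    }
  }
  return value as unknown as DeepReadonly<T>;
}

// Usage: an event handler receiving this value cannot reassign nested fields.
//   const args = deepFreeze({ params: { model: 'text-embedding-ada-002' } });
//   args.params.model = 'other'; // compile-time error
```

The type alone catches mutations at compile time; the freeze is an optional extra guard for handlers written in plain JavaScript.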

Support for other stores

Hey do you plan to provide support for other data stores? A free one that could be self hosted would be nice.

Retry mechanism broken for tool calling

When a function call's arguments fail the Zod validation, we add a user message with the validation error and retry. This doesn't work with tool calling and throws:

An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'.
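
One possible direction, sketched under assumptions rather than taken from Dexter's code, is to report the validation error in a tool message for each tool_call_id instead of a user message, so that every tool call in the assistant message has a response. The message types are simplified from the OpenAI chat completions format, and buildValidationRetryMessages is a hypothetical helper.

```typescript
// Simplified message shapes modeled on the OpenAI chat completions format.
type ToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
};

type Message =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string | null; tool_calls?: ToolCall[] }
  | { role: 'tool'; tool_call_id: string; content: string };

// Hypothetical helper: returns the assistant message followed by one tool
// message per tool call. `validationErrors` maps tool_call_id to an error.
function buildValidationRetryMessages(
  assistantMessage: Extract<Message, { role: 'assistant' }>,
  validationErrors: Record<string, string>
): Message[] {
  const toolCalls = assistantMessage.tool_calls ?? [];
  const replies = toolCalls.map((call): Message => {
    const error = validationErrors[call.id];
    return {
      role: 'tool',
      tool_call_id: call.id,
      content:
        error === undefined
          ? 'Not executed: another tool call in this turn failed validation.'
          : `Invalid arguments: ${error}. Please call the tool again with corrected arguments.`,
    };
  });
  return [assistantMessage, ...replies];
}
```

Appending these messages to the conversation before retrying keeps the history in the order the API error describes: an assistant message with tool_calls, then a tool message for each tool_call_id.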

Improve install size

Currently sitting at ~17MB with 14MB coming from tiktoken: https://pkg-size.dev/@dexaai%2Fdexter

See also dqbd/tiktoken#68

For comparison, here's langchain at ~36MB: https://pkg-size.dev/langchain but we should be a lot slimmer than this. Langchain's not even loading the full tiktoken WASM lib; they're using the 6.6MB js-tiktoken.

This issue may end up just being resolved by improving tiktoken's WASM bundle size upstream, but I wanted to track it while it's top of mind for gptlint.
