aidapter

Simple adapter for many language models - remote (Hugging Face, OpenAI, AnthropicAI, CohereAI) and local (transformers library).

Facilitates loading of many new models (Guanaco, Falcon, Vicuna, etc.) in 16/8/4-bit modes.

It also supports embedding models (OpenAI, CohereAI, Sentence Transformers).

Installation

🚧 This is experimental software. Anything can change without notice.

pip install git+https://github.com/mobarski/aidapter.git

Note: each vendor API requires manual installation of its dependencies.
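
For example, to use the OpenAI or Sentence Transformers backends, their client packages must be installed first (package names assumed to be the vendors' standard ones):

pip install openai
pip install sentence-transformers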

Features

  • simple, unified API to many models (remote and local)
  • batching
  • parallel calls
  • caching
  • usage tracking
  • automatic retries
  • response priming

Usage examples

completion:

>>> import aidapter
>>> model = aidapter.model('openai:gpt-3.5-turbo') # uses OPENAI_API_KEY env variable
>>> model.complete('2+2=')
'4'
>>> model.complete(['2+2=','7*6=']) # parallel calls
['4', '42']

embeddings:

>>> model = aidapter.model('sentence-transformers:multi-qa-mpnet-base-dot-v1')
>>> vector = model.embed('mighty indeed')
>>> vector[:5]
[-0.07946087, -0.2150347, -0.33358946, 0.18340564, 0.16403404]
>>> vectors = model.embed(['this is the way', 'so say we all']) # parallel / batch processing
>>> [x[:5] for x in vectors]
[[0.037638217, -0.30608281, -0.3064257, -0.46715638, -0.2608084],
 [-0.063842215, -0.16669855, -0.22363697, -0.2893797, 0.060464755]]

multiple models:

>>> m1 = aidapter.model('transformers:ehartford/Wizard-Vicuna-13B-Uncensored:4bit') # 4 bit mode
>>> m2 = aidapter.model('anthropic:claude-instant-v1') # uses ANTHROPIC_API_KEY env variable

persistent cache and usage tracking:

>>> import shelve
>>> model.cache = shelve.open('/tmp/aidapter.cache') # persistent disk cache
>>> model.usage = shelve.open('/tmp/aidapter.usage') # persistent usage tracking

or, using the diskcache package:

>>> import diskcache as dc
>>> model.cache = dc.Cache('/tmp/aidapter.cache') # persistent disk cache
>>> model.usage = dc.Cache('/tmp/aidapter.usage') # persistent usage tracking
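
Usage counters accumulate in model.usage and can be read back like any mapping; the 'time' key is mentioned in the 0.4.1 changelog entry, other keys are implementation details:

>>> model.usage['time'] # cumulative time spent in model calls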

function calling interface*:

>>> def get_weather(city):
...     "get weather info for a city; the city name must be ALL CAPS, prefixed with the ISO country code and a ':' separator (e.g. FR:PARIS)"
...     ...
>>> model = aidapter.model('openai:gpt-3.5-turbo-0613')
>>> model.complete("What's the weather in the capital of Poland?", functions=[get_weather])
{'function_name': 'get_weather', 'arguments': {'city': 'PL:WARSAW'}}

* currently, it works only with selected OpenAI models
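
aidapter never executes the function itself; dispatching the returned call specification is left to the caller. A minimal sketch:

>>> call = model.complete("What's the weather in the capital of Poland?", functions=[get_weather])
>>> dispatch = {f.__name__: f for f in [get_weather]} # map names back to callables
>>> dispatch[call['function_name']](**call['arguments'])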

use last_hidden_state from any transformer as an embedding*:

>>> model = aidapter.model('transformers:RWKV/rwkv-raven-1b5')
>>> model.raw_embed_one('mighty indeed')[:5]
[0.14850381016731262, -0.021324729546904564, 0.09214707463979721, 0.34308338165283203, -0.11288302391767502]

* requires additional normalization over a corpus; the API will change
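
The required normalization is left unspecified; one plausible scheme is per-dimension standardization over a reference corpus (an illustration only, not the library's method):

>>> import numpy as np
>>> corpus = ['this is the way', 'so say we all', 'mighty indeed'] # reference texts (hypothetical)
>>> V = np.array([model.raw_embed_one(t) for t in corpus])
>>> mu, sd = V.mean(axis=0), V.std(axis=0) + 1e-9 # per-dimension statistics
>>> z = (np.array(model.raw_embed_one('winter is coming')) - mu) / sd # standardized embedding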

API

aidapter.model(model_id, **api_kwargs) -> model

  • model_id - model identifier in the following format <vendor_name>:<model_name>
  • api_kwargs - default API arguments
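
For example, setting defaults that every subsequent call inherits (a sketch; it is assumed these kwargs use the same names as the model.complete arguments below):

>>> model = aidapter.model('openai:gpt-3.5-turbo', temperature=0.7, limit=200)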

model.complete(prompt, system='', start='', stop=[], limit=100, temperature=0, functions=[], cache='use', debug=False) -> str | list | dict

  • prompt - main prompt or list of prompts

  • system - system prompt

  • start - text appended to the end of the prompt and used as the beginning of the response (aka response priming)

  • stop - list of strings upon which to stop generating

  • limit - maximum number of tokens to generate before stopping (aka max_new_tokens, max_tokens_to_sample)

  • temperature - amount of sampling randomness (higher values give more varied output)

  • functions - list of functions available to the model (none of them will be executed - only the signatures are used)

  • cache - cache usage:

    • use - use the cache if the temperature is 0 (default)
    • skip - don't use the cache
    • force - use the cache even if the temperature is not 0
  • debug - if True, the function will return a dictionary (or a list of dictionaries) containing internal objects / values

The full prompt sent to the model is assembled as: FULL_PROMPT = system + prompt + start
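
A sketch combining several of these options (output elided; completions will vary):

>>> model.complete('Name three prime numbers.',
...     system='You are a terse assistant.',
...     start='1.',        # response priming
...     stop=['\n\n'],     # stop at the first blank line
...     limit=50,          # cap generated tokens
...     temperature=0.7,
...     cache='skip')      # non-deterministic call, skip the cache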

model.embed(input, limit=None) -> list | list[list]

  • input - text or list of texts
  • limit - truncate each vector to its first n dimensions (default = None = no limit)
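
For example, truncating embeddings to their first three dimensions:

>>> vec = model.embed('mighty indeed', limit=3)
>>> len(vec)
3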

model configuration:

  • model.workers - number of concurrent workers for parallel completion (default=4)

  • model.show_progress - show progress bar when performing parallel completion (default=False)

  • model.retry_tries - maximum number of retry attempts (default=5)

  • model.retry_delay - initial delay between retry attempts (default=0.1)

  • model.retry_backoff - multiplier applied to the delay between retry attempts (default=3)
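
A sketch tuning these knobs before a large parallel batch:

>>> model = aidapter.model('openai:gpt-3.5-turbo')
>>> model.workers = 8           # 8 concurrent requests instead of the default 4
>>> model.show_progress = True  # display a progress bar
>>> model.retry_delay = 0.5     # start with a longer delay between retries
>>> answers = model.complete([f'{i}*{i}=' for i in range(100)])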

Supported models

OpenAI

  • openai:gpt-4

  • openai:gpt-4-32k

  • openai:gpt-3.5-turbo

  • openai:text-davinci-003

  • openai:code-davinci-002

  • ...

API key env. variable: OPENAI_API_KEY

Anthropic

  • anthropic:claude-v1

  • anthropic:claude-instant-v1

  • anthropic:claude-v1-100k

  • anthropic:claude-instant-v1-100k

  • ...

API key env. variable: ANTHROPIC_API_KEY

Cohere

  • cohere:command

  • cohere:command-light

  • ...

API key env. variable: CO_API_KEY

Transformers

  • transformers:TheBloke/guanaco-7B-HF

  • transformers:tiiuae/falcon-7b

  • transformers:RWKV/rwkv-raven-3b

  • transformers:ehartford/Wizard-Vicuna-13B-Uncensored

  • transformers:roneneldan/TinyStories-33M

  • ...
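
Local transformers models additionally accept a quantization suffix (per the 0.4 changelog entry below):

>>> m16 = aidapter.model('transformers:tiiuae/falcon-7b:16bit') # float16
>>> m8 = aidapter.model('transformers:tiiuae/falcon-7b:8bit')   # load_in_8bit
>>> m4 = aidapter.model('transformers:tiiuae/falcon-7b:4bit')   # load_in_4bit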

Change log

0.6.4

  • initial support for HF API models
  • removed old HF implementation

0.6.3

  • OpenAI's embeddings now use BaseModelV2

0.6.2

  • as_iter option in BaseModelV2.transform
  • removed BaseModelV2.register_progress

0.6.1

  • handle cache=False in BaseModelV2.transform_many
  • hf2 brand renamed to huggingface

0.6

  • initial support for HF API embeddings
  • BaseModelV2
    • cleaner code
    • diskcache support
    • batch + threads support
    • retry configuration
    • progress update

0.5.4

  • initial support for the functions argument (works only with selected OpenAI models)

0.5.3

  • initial support for raw_embed_one in transformers (for creating embeddings from ANY transformer model)

0.5.2

  • fix: kw handling in get_cache_key

0.5.1

  • limit option for embedding models

0.5

  • initial support for embedding models (requires more work with batch / parallel processing):
    • OpenAI
    • Cohere
    • Sentence Transformers

0.4.4

  • response priming (start option)

0.4.3

  • stop option for transformers

0.4.2

  • anthropic usage: tokens, characters

  • transformers usage: tokens, characters

0.4.1

  • remove prompt from transformers output
  • removed kvdb
  • usage['time']
  • fixed pad_token_id
  • fixed limit in transformer models

0.4

  • initial support for local transformers models

    • float16 (add ":16bit" to the model name)

    • load_in_8bit (add ":8bit" to the model name)

    • load_in_4bit (add ":4bit" to the model name)

  • cache = use | skip | force

  • shelve based persistence (for cache and usage)

0.3.2

  • kvdb import fix

0.3

  • Cohere models
  • disk cache

0.2

  • OpenAI instruct models
  • Anthropic models (ANTHROPIC_API_KEY env variable)
  • complete: debug option
  • BaseModel.RENAME_KWARGS
  • pip install
  • limit handling

0.1

  • parallel calls / cache / usage tracking / retries
  • OpenAI chat models

Next

  • HF API text generation
  • llama.cpp models (GGML!)
  • strangle BaseModel with BaseModelV2 (strangler-fig pattern)

