Comments (10)

alexanderchang1 avatar alexanderchang1 commented on July 28, 2024 4

Hi @whitead, I got it to work with the following method.

llamafile servers are currently incompatible and result in a File Not Found error due to endpoint differences.

Instead, a user has to install the latest llama-cpp-python bindings with web-server support (https://github.com/abetlen/llama-cpp-python#web-server).
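If I'm reading the linked README right, the server extra is installed with:

pip install 'llama-cpp-python[server]'

Then the server can be run locally: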

python -m llama_cpp.server --model ./models/llama-2-7b.Q5_K_M.gguf --n_gpu_layers 35 --port 8010

Then you can run it locally via

from paperqa import Docs, OpenAILLMModel, print_callback
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8010/v1",
    api_key="sk-no-key-required",
)


docs = Docs(client=client,
            embedding="sentence-transformers",
            llm_result_callback=print_callback,
            llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)))
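
From there, adding and querying documents worked as usual for me. A minimal sketch (the file name is made up, and this assumes the Docs.add/Docs.query API from the README):

docs.add("my_paper.pdf")  # hypothetical local PDF
answer = docs.query("What is the main finding?")
print(answer.formatted_answer)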

The remainder of the code is the same; however, performance drops a lot using sentence transformers. Is there an updated example of how to use LlamaEmbeddingModel instead? Or any other model; my hope is to eventually use Mixtral 8x7B.

GordonMcGregor avatar GordonMcGregor commented on July 28, 2024 3

This code (also in the README)

from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start the llama.cpp server first

local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(client=local_client,
            embedding_model=LlamaEmbeddingModel(),
            llm_model=OpenAILLMModel(config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)))

still generates this error:

pydantic_core._pydantic_core.ValidationError: 1 validation error for LlamaEmbeddingModel
name

GordonMcGregor avatar GordonMcGregor commented on July 28, 2024 2

I think the problem in both cases is that LlamaEmbeddingModel() requires a name argument - it's that instantiation that's complaining, not Docs().

whitead avatar whitead commented on July 28, 2024 1

Sorry about that - yes, embedding is for a string.

Use this syntax instead for passing a model (I updated the README - thanks):

from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start the llama.cpp server first

local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(client=local_client,
            embedding_model=LlamaEmbeddingModel(),
            llm_model=OpenAILLMModel(config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)))

For @philippbayer - your example would be using Langchain (I assume!):

from paperqa import Docs
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

docs = Docs(llm="langchain",
            client=ChatGoogleGenerativeAI(model="gemini-pro"),
            embedding="langchain",
            embedding_client=GoogleGenerativeAIEmbeddings(model="models/embedding-001"))
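
(This assumes the langchain-google-genai package is installed, e.g. pip install langchain-google-genai.)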

GordonMcGregor avatar GordonMcGregor commented on July 28, 2024 1

Also, embedding_model isn't a valid argument for Docs:

pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
embedding_model
  Extra inputs are not permitted [type=extra_forbidden, input_value=LlamaEmbeddingModel(name=...h_size=4, concurrency=1), input_type=LlamaEmbeddingModel]

GordonMcGregor avatar GordonMcGregor commented on July 28, 2024 1

Perhaps embedding_client instead?

from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start the llama.cpp server first

local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(client=local_client,
            embedding_client=LlamaEmbeddingModel(name='llama'),
            llm_model=OpenAILLMModel(config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)))

alexanderchang1 avatar alexanderchang1 commented on July 28, 2024 1

That fixed the pydantic issue, but revealed another one.

The OpenAI API serves embeddings via a POST to the '/embeddings' endpoint (https://platform.openai.com/docs/api-reference/embeddings).

But the locally hosted llamafile server serves embeddings at '/embedding' (https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/README.md#api-endpoints).

This results in a File Not Found API error when trying to add documents and embed text on a locally hosted LLM.
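
A quick way to see the mismatch is to probe both paths. A rough sketch, assuming a llamafile server on localhost:8080 and the request bodies from the two linked docs:

import requests

# "/v1/embeddings" is the OpenAI-style path; llamafile 0.6.2 only serves "/embedding".
for path in ("/v1/embeddings", "/embedding"):
    resp = requests.post(
        f"http://localhost:8080{path}",
        # "input" is the OpenAI field, "content" is the llama.cpp field
        json={"input": "hello", "content": "hello"},
    )
    print(path, resp.status_code)  # expect 404 on the path the server doesn't implement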

Llamafile is currently on version 0.6.2, which was last synchronized with llama.cpp on 1-27-2024. However, the llama.cpp commit that fixes this problem (ggerganov/llama.cpp@9461329) wasn't merged until 1-29-2024. So until the next llamafile release syncs to a llama.cpp version after that date, I believe paperqa is not compatible with llamafile.

alexanderchang1 avatar alexanderchang1 commented on July 28, 2024 1

Hi @whitead,

Still getting the File Not Found error due to the /embeddings vs. /embedding endpoint mismatch in llamafile.

philippbayer avatar philippbayer commented on July 28, 2024

I seem to have the same issue with Gemini: it looks like the pydantic validation wants a 'name' field in both LlamaEmbeddingModel and GoogleGenerativeAIEmbeddings(). Fuller error below.

My Docs():

docs = Docs(llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
            embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"))

This is what data looks like before the error:

{'llm_model': ChatGoogleGenerativeAI(name='gemini-pro', model='gemini-pro', client=genai.GenerativeModel(
    model_name='models/gemini-pro',
    generation_config={},
    safety_settings={}
)), 'embedding': GoogleGenerativeAIEmbeddings(model='models/embedding-001', task_type=None, google_api_key=None,
client_options=None, transport=None)}

And the error:

Traceback (most recent call last):

  File ".//python3.10/site-packages/paperqa/docs.py", line 129, in __init__
    super().__init__(**data)
  File ".//python3.10/site-packages/pydantic/main.py", line 171, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
name
  Input should be a valid string [type=string_type, input_value=GoogleGenerativeAIEmbeddi...ns=None, transport=None), input_type=GoogleGenerativeAIEmbeddings]

So neither GoogleGenerativeAIEmbeddings nor (I assume) LlamaEmbeddingModel has a name field; ChatGoogleGenerativeAI has one ('gemini-pro'). The pydantic rules in docs.py say that 'name' should be a string, but in our case it's not a string, it's the entire model. I don't know enough about pydantic to fix this right now.

Edit:
OK, here's the issue:

if "embedding" in data and data["embedding"] != "default":

data['embedding'] is assumed to be a string, not any model.
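
A more permissive check might look something like this; purely a sketch of a possible fix, not the actual paperqa code:

# Hypothetical: accept either a string alias or an embedding-model instance.
if "embedding" in data and data["embedding"] != "default":
    if isinstance(data["embedding"], str):
        ...  # resolve the string alias as docs.py does today
    else:
        ...  # treat it as an already-constructed embedding model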

This code kind of works:

docs = Docs(llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
            embedding='sentence_transformers')

This will use the local SentenceTransformerEmbeddingModel() instead of Gemini. Then I run into the next error, as it assumes that llm_model is wrapped within LLMModel; that's for later today.

whitead avatar whitead commented on July 28, 2024

Hi @alexanderchang1 - maybe try using SentenceTransformer?

from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(client=client,
            embedding="sentence-transformers",
            llm_result_callback=print_callback,
            llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)))
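
(You may need the sentence-transformers package installed first: pip install sentence-transformers.)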

Or I added a new embedding that is just keyword-based:

from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(client=client,
            embedding="sparse",
            llm_result_callback=print_callback,
            llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)))
