Comments (10)
Hi @whitead, I got it to work with the following method.
llamafile servers are currently incompatible and result in a File Not Found error due to endpoint differences.
Instead, a user has to install the latest llama-cpp-python bindings with the web server extra (https://github.com/abetlen/llama-cpp-python#web-server), and then run the server locally:
python -m llama_cpp.server --model ./models/llama-2-7b.Q5_K_M.gguf --n_gpu_layers 35 --port 8010
Then you can connect to it locally via
from paperqa import Docs, OpenAILLMModel, print_callback
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8010/v1",
    api_key="sk-no-key-required",
)
docs = Docs(
    client=client,
    embedding="sentence-transformers",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(
        config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)
    ),
)
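For reference, here is why this server works where llamafile does not: llama-cpp-python's web server exposes OpenAI-compatible routes under /v1, so an OpenAI client pointed at the base_url above will hit these URLs (a sketch only; no server needs to be running):

```python
# Sketch: the OpenAI-compatible URLs derived from the base_url used above.
base_url = "http://localhost:8010/v1"

chat_url = f"{base_url}/chat/completions"   # chat completions endpoint
embeddings_url = f"{base_url}/embeddings"   # embeddings endpoint (note the plural)

print(chat_url)        # http://localhost:8010/v1/chat/completions
print(embeddings_url)  # http://localhost:8010/v1/embeddings
```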
The remainder of the code is the same; however, performance drops a lot using sentence transformers. Is there an updated example of how to use LlamaEmbeddingModel instead? Or any other model; my hope is to eventually use Mixtral 8x7B.
from paper-qa.
This code (also in the README)
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)
docs = Docs(
    client=local_client,
    embedding_model=LlamaEmbeddingModel(),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
still generates this error:
pydantic_core._pydantic_core.ValidationError: 1 validation error for LlamaEmbeddingModel
name
I think the problem in both cases is that LlamaEmbeddingModel() requires a name argument; it's that instantiation that's complaining, not Docs().
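A minimal stdlib analogue of that failure mode (not paper-qa's actual class, which is a pydantic model raising ValidationError rather than TypeError): a required `name` field with no default makes bare instantiation fail.

```python
from dataclasses import dataclass

# Analogue only: like LlamaEmbeddingModel, this class has a required `name`
# field with no default, so calling it with no arguments fails.
@dataclass
class EmbeddingModelSketch:
    name: str  # required, no default

try:
    EmbeddingModelSketch()  # fails, as LlamaEmbeddingModel() does
except TypeError as err:
    print("bare instantiation failed:", err)

ok = EmbeddingModelSketch(name="llama")  # passing a name succeeds
print(ok.name)  # llama
```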
Sorry about that; yes, embedding is for a string. Use this syntax instead for passing a model (I updated the README, thanks):
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)
docs = Docs(
    client=local_client,
    embedding_model=LlamaEmbeddingModel(),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
For @philippbayer - your example would be using Langchain (I assume!):
docs = Docs(
    llm="langchain",
    client=ChatGoogleGenerativeAI(model="gemini-pro"),
    embedding="langchain",
    embedding_client=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
Also, embedding_model isn't a valid argument for Docs:
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
embedding_model
Extra inputs are not permitted [type=extra_forbidden, input_value=LlamaEmbeddingModel(name=...h_size=4, concurrency=1), input_type=LlamaEmbeddingModel]
Perhaps embedding_client instead?
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)
docs = Docs(
    client=local_client,
    embedding_client=LlamaEmbeddingModel(name='llama'),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
That fixed the pydantic issue, but revealed another one.
The OpenAI API gets embeddings via a POST to the '/embeddings' endpoint (https://platform.openai.com/docs/api-reference/embeddings).
But the locally hosted llamafile server serves embeddings at '/embedding' (https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/README.md#api-endpoints).
This results in a File Not Found API error when trying to add documents and embed text with a locally hosted LLM.
Llamafile is currently on version 0.6.2, which was last synchronized with llama.cpp on 1-27-2024. However, the llama.cpp commit that fixes this problem wasn't made until 1-29-2024 (ggerganov/llama.cpp@9461329). So until a llamafile release syncs with a llama.cpp commit after that date, paper-qa is not compatible with llamafile, I believe.
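To make the mismatch concrete, here is a tiny illustrative helper (hypothetical; not part of paper-qa or llamafile) that maps the OpenAI-style path to the endpoint llamafile 0.6.2 actually serves. A proxy or patched client would need a rewrite along these lines:

```python
# Hypothetical sketch: OpenAI clients POST embeddings to /v1/embeddings, while
# llamafile <= 0.6.2 serves them at /embedding; paper-qa does no such rewriting.
def rewrite_for_llamafile(path: str) -> str:
    """Map an OpenAI-style embeddings path to llamafile's endpoint."""
    if path.rstrip("/").endswith("/embeddings"):
        return "/embedding"
    return path

print(rewrite_for_llamafile("/v1/embeddings"))        # /embedding
print(rewrite_for_llamafile("/v1/chat/completions"))  # unchanged
```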
Hi @whitead,
Still getting the File Not Found error due to the /embeddings vs /embedding endpoint conflict in llamafile.
I seem to have the same issue with Gemini: it looks like the pydantic validation wants there to be a 'name' field in both LlamaEmbeddingModel and GoogleGenerativeAIEmbeddings(). Fuller error:
My Docs():
docs = Docs(
    llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
This is what data looks like before the error:
{'llm_model': ChatGoogleGenerativeAI(name='gemini-pro', model='gemini-pro', client=genai.GenerativeModel(
    model_name='models/gemini-pro',
    generation_config={},
    safety_settings={}
)), 'embedding': GoogleGenerativeAIEmbeddings(model='models/embedding-001', task_type=None, google_api_key=None, client_options=None, transport=None)}
And the error:
Traceback (most recent call last):
File ".//python3.10/site-packages/paperqa/docs.py", line 129, in __init__
super().__init__(**data)
File ".//python3.10/site-packages/pydantic/main.py", line 171, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
name
Input should be a valid string [type=string_type, input_value=GoogleGenerativeAIEmbeddi...ns=None, transport=None), input_type=GoogleGenerativeAIEmbeddings]
So neither GoogleGenerativeAIEmbeddings nor (I assume) LlamaEmbeddingModel has a name field; ChatGoogleGenerativeAI has one ('gemini-pro'). Not sure right now how to add that. The pydantic rules in docs.py say that 'name' should be a string, but in our case it's not a string, it's the entire model. Don't know enough about pydantic to fix this right now.
Edit:
OK, here's the issue:
Line 159 in 350225c
data['embedding'] is assumed to be a string, not any model.
This code kind of works:
docs = Docs(
    llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding='sentence_transformers',
)
This will use the local SentenceTransformerEmbeddingModel() instead of Gemini. Then I run into the next error, as it assumes that llm_model is wrapped within LLMModel; that's for later today.
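The string-keyed selection described above could look roughly like this; a hedged sketch only, with illustrative placeholder class names (the real logic lives in paper-qa's docs.py):

```python
# Hypothetical sketch of string-keyed embedding selection, as docs.py appears
# to do with data['embedding']; the class names are placeholders, not paper-qa's.
def pick_embedding(spec):
    if not isinstance(spec, str):
        raise TypeError("embedding must be a string key, not a model instance")
    mapping = {
        "sentence-transformers": "SentenceTransformerEmbeddingModel",
        "sparse": "SparseEmbeddingModel",  # placeholder for the keyword-based embedding
    }
    return mapping.get(spec, "OpenAIEmbeddingModel")  # default is a guess

print(pick_embedding("sentence-transformers"))  # SentenceTransformerEmbeddingModel
```

Passing a model instance instead of a string would fail at the isinstance check, matching the validation error above.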
Hi @alexanderchang1 - maybe try using SentenceTransformer?
from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(
    client=client,
    embedding="sentence-transformers",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(
        config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)
    ),
)
Or I added a new embedding that is just keyword-based:
from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(
    client=client,
    embedding="sparse",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(
        config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)
    ),
)