katanaml / llm-mistral-invoice-cpu
Data extraction with LLM on CPU
License: Apache License 2.0
How can I run it offline, without accessing HuggingFace.co?
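A possible approach, assuming the model and embedding files are already cached or copied to disk: the Hugging Face client libraries honor offline environment variables, so no network call is attempted. A minimal sketch (set these before any HF-related import):

```python
# Force Hugging Face libraries into offline mode; they will then only read
# files already present in the local cache or at explicit local paths.
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: skip network lookups
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: use local cache only
```

You still need to download or copy the model files once before going offline.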
In ingest.py there is an option whose original value is 'cpu'.
When I change it to "cuda" to use my local GPU (an RTX 3070), it doesn't work.
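A common cause is requesting "cuda" when PyTorch was installed without CUDA support. A minimal sketch of a safe device pick, assuming the embeddings are loaded through something like HuggingFaceEmbeddings with model_kwargs={"device": device}:

```python
# Hedged sketch: only ask for "cuda" when PyTorch can actually see the GPU;
# otherwise fall back to "cpu" so ingest.py keeps working.
def embedding_device(prefer: str = "cuda") -> str:
    try:
        import torch
        if prefer == "cuda" and torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```

If this returns "cpu" on an RTX 3070 machine, reinstall PyTorch with a CUDA-enabled build.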
Hey,
thanks so much for this project. Really amazing to see what's possible with Mistral.
I happen to have an M1 Pro MacBook and was wondering if you could point me toward how to use the GPU or "neural engine" to speed up the process?
Thanks so much
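A hedged note on Apple Silicon: PyTorch exposes the GPU through the "mps" backend; the Neural Engine itself is not directly reachable from PyTorch. Assuming the embedding step runs on PyTorch, a sketch of a safe device pick:

```python
# Hedged sketch: use Apple's Metal Performance Shaders backend when available,
# otherwise stay on CPU.
def mac_device() -> str:
    try:
        import torch
        if torch.backends.mps.is_available():
            return "mps"
    except (ImportError, AttributeError):
        pass
    return "cpu"
```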
The mistral-7b-instruct-v0.1.Q5_K_M.gguf repo is not available on Hugging Face.
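At the time of writing, that quantized file is mirrored in TheBloke/Mistral-7B-Instruct-v0.1-GGUF on Hugging Face (the repo id is an assumption and may move). GGUF files are plain files on the Hub, so the download URL can be built directly; a minimal sketch:

```python
# Hedged sketch: build the direct-download URL for a GGUF file on the Hub.
# The repo id is an assumption; verify it still exists before relying on it.
REPO = "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
FILE = "mistral-7b-instruct-v0.1.Q5_K_M.gguf"

def resolve_url(repo: str, filename: str) -> str:
    # Hub files resolve at /<repo>/resolve/<revision>/<path>
    return f"https://huggingface.co/{repo}/resolve/main/{filename}"
```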
First, thank you so much for this repo.
I want to use my own embeddings and a local LLM model.
Which parts of the code should I change? Can you help me?
Hi, please describe the necessary changes for using the GPU on Windows. Also, if I want to use the phi2-7 model, what should I put in model_type in the yml file? Thanks
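On the model_type question, a hedged sketch of what the YAML might look like, assuming "phi2-7" refers to Microsoft's Phi-2 and that the installed ctransformers version supports a "phi" model type (field names are assumptions; check the repo's own config file and the ctransformers docs for your version):

```yaml
# Hypothetical config fragment: field names and the "phi" model_type are
# assumptions, not confirmed by the repo.
model_path: models/phi-2.Q5_K_M.gguf   # assumed local GGUF file
model_type: phi                        # only valid if your ctransformers build supports Phi
gpu_layers: 50                         # >0 offloads layers to the GPU (CUDA build required)
```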
Hi,
I'm curious to know whether fine-tuning is possible with this setup?
Thanks!
Your app works well; I would like to know how one may speed it up.
I asked Bard for the same question, here is its response - https://g.co/gemini/share/caba9f27a56b
I asked Claude 3 and got this response:
To improve the performance of CTransformers in Langchain, you can consider the following strategies:
Use Quantization: CTransformers supports quantization, which can significantly reduce the model size and improve inference speed. You can enable quantization when loading the model using the quantize parameter in CTransformerModel.
Optimize CTransformer Configuration: CTransformer provides various configuration options to optimize performance, such as setting the number of threads, using CPU or GPU, and enabling optimizations like Operator Fusion. You can adjust these settings in the CTransformerModel initialization.
Batching: If you're processing multiple inputs at once, you can leverage batching to improve performance. CTransformers supports batched inference, which can be more efficient than processing inputs individually.
Caching: If you're performing repetitive computations on the same input, you can cache the results to avoid redundant computations. Langchain provides caching utilities that can be used with CTransformers.
Model Pruning: If the model size is too large, you can consider pruning techniques to reduce the number of parameters while maintaining reasonable performance. Langchain doesn't directly support model pruning, but you can use external libraries like Hugging Face Transformers for this purpose.
Hardware Acceleration: If you have access to hardware accelerators like GPUs or specialized AI accelerators, you can leverage them to further improve performance. CTransformers supports GPU inference, which can provide significant speedups for certain models and tasks.
It seems that GPU is an easy solution: https://g.co/gemini/share/fc9472590c4f
What would you recommend?
Thanks a lot
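Of the strategies above, thread count and GPU offload are the easiest to try. Assuming the app loads the model through LangChain's CTransformers wrapper, the tuning knobs travel in its config dict; a minimal stdlib sketch of building that dict (key names follow ctransformers' documented config, but verify against your installed version):

```python
# Hedged sketch: assemble a ctransformers-style performance config.
def perf_config(cores: int, gpu: bool) -> dict:
    return {
        "threads": cores,                # match physical CPU cores
        "batch_size": 256,               # larger prompt-processing batches
        "gpu_layers": 50 if gpu else 0,  # >0 offloads layers to the GPU
    }
```

The resulting dict would be passed as config= when constructing the LLM.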
Your project config needs vectorstore/faiss, but it isn't included in the repo.
How can I obtain it? I downloaded a FAISS store and added it to my environment, but it still won't run.
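For context: the FAISS store is usually a generated artifact, not a shipped one; running the repo's ingest script (ingest.py is assumed to be the entry point) builds it from your documents and writes it to disk. A small stdlib helper to check whether the store exists yet:

```python
# Hedged sketch: report whether a FAISS store directory has been generated.
from pathlib import Path

def faiss_store_ready(root: str = "vectorstore") -> bool:
    p = Path(root)
    return p.is_dir() and any(p.iterdir())
```

If this returns False, run ingest.py first; if it returns True and queries still fail, the store was likely built with a different embeddings model than the one used at query time.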