katanaml / llm-mistral-invoice-cpu
Data extraction with LLM on CPU
License: Apache License 2.0
How can I run it offline, without accessing HuggingFace.co?
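A possible approach, assuming the model and embedding files are already cached or copied to disk: the Hugging Face client libraries honor offline environment variables, so no network call is attempted. A minimal sketch (set these before any HF-related import):

```python
# Force Hugging Face libraries into offline mode; they will then only read
# files already present in the local cache or at explicit local paths.
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: skip network lookups
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: use local cache only
```

You still need to download or copy the model files once before going offline.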
In ingest.py there is an option whose original value is 'cpu'.
When I change it to "cuda" to use my local GPU (an RTX 3070), it doesn't work.
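A common cause is requesting "cuda" when PyTorch was installed without CUDA support. A minimal sketch of a safe device pick, assuming the embeddings are loaded through something like HuggingFaceEmbeddings with model_kwargs={"device": device}:

```python
# Hedged sketch: only ask for "cuda" when PyTorch can actually see the GPU;
# otherwise fall back to "cpu" so ingest.py keeps working.
def embedding_device(prefer: str = "cuda") -> str:
    try:
        import torch
        if prefer == "cuda" and torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```

If this returns "cpu" on an RTX 3070 machine, reinstall PyTorch with a CUDA-enabled build.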
Hey,
thanks so much for this project. Really amazing to see what's possible with Mistral.
I happen to have an M1 Pro MacBook and was wondering if you could point me toward how to use the GPU or "neural engine" to speed up the process?
Thanks so much
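A hedged note on Apple Silicon: PyTorch exposes the GPU through the "mps" backend; the Neural Engine itself is not directly reachable from PyTorch. Assuming the embedding step runs on PyTorch, a sketch of a safe device pick:

```python
# Hedged sketch: use Apple's Metal Performance Shaders backend when available,
# otherwise stay on CPU.
def mac_device() -> str:
    try:
        import torch
        if torch.backends.mps.is_available():
            return "mps"
    except (ImportError, AttributeError):
        pass
    return "cpu"
```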
The mistral-7b-instruct-v0.1.Q5_K_M.gguf repo is not available on Hugging Face.
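At the time of writing, that quantized file is mirrored in TheBloke/Mistral-7B-Instruct-v0.1-GGUF on Hugging Face (the repo id is an assumption and may move). GGUF files are plain files on the Hub, so the download URL can be built directly; a minimal sketch:

```python
# Hedged sketch: build the direct-download URL for a GGUF file on the Hub.
# The repo id is an assumption; verify it still exists before relying on it.
REPO = "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
FILE = "mistral-7b-instruct-v0.1.Q5_K_M.gguf"

def resolve_url(repo: str, filename: str) -> str:
    # Hub files resolve at /<repo>/resolve/<revision>/<path>
    return f"https://huggingface.co/{repo}/resolve/main/{filename}"
```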
First, thank you so much for this repo.
I want to use my own embeddings and a local LLM model.
Which parts of the code should I change? Can you help me?
Hi, please describe the necessary changes for using the GPU on Windows. Also, if I want to use the phi2-7 model, what should I put in model_type in the yml file? Thanks
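On the model_type question, a hedged sketch of what the YAML might look like, assuming "phi2-7" refers to Microsoft's Phi-2 and that the installed ctransformers version supports a "phi" model type (field names are assumptions; check the repo's own config file and the ctransformers docs for your version):

```yaml
# Hypothetical config fragment: field names and the "phi" model_type are
# assumptions, not confirmed by the repo.
model_path: models/phi-2.Q5_K_M.gguf   # assumed local GGUF file
model_type: phi                        # only valid if your ctransformers build supports Phi
gpu_layers: 50                         # >0 offloads layers to the GPU (CUDA build required)
```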
Hi,
I'm curious to know whether fine-tuning is possible with this setup?
Thanks!
Your app works well; I would like to know how one may speed it up.
I asked Bard for the same question, here is its response - https://g.co/gemini/share/caba9f27a56b
I asked Claude 3 and got this response:
To improve the performance of CTransformers in Langchain, you can consider the following strategies:
Use Quantization: CTransformers supports quantization, which can significantly reduce the model size and improve inference speed. You can enable quantization when loading the model using the quantize parameter in CTransformerModel.
Optimize CTransformer Configuration: CTransformer provides various configuration options to optimize performance, such as setting the number of threads, using CPU or GPU, and enabling optimizations like Operator Fusion. You can adjust these settings in the CTransformerModel initialization.
Batching: If you're processing multiple inputs at once, you can leverage batching to improve performance. CTransformers supports batched inference, which can be more efficient than processing inputs individually.
Caching: If you're performing repetitive computations on the same input, you can cache the results to avoid redundant computations. Langchain provides caching utilities that can be used with CTransformers.
Model Pruning: If the model size is too large, you can consider pruning techniques to reduce the number of parameters while maintaining reasonable performance. Langchain doesn't directly support model pruning, but you can use external libraries like Hugging Face Transformers for this purpose.
Hardware Acceleration: If you have access to hardware accelerators like GPUs or specialized AI accelerators, you can leverage them to further improve performance. CTransformers supports GPU inference, which can provide significant speedups for certain models and tasks.
It seems that GPU is an easy solution: https://g.co/gemini/share/fc9472590c4f
What would you recommend?
Thanks a lot
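Of the strategies above, thread count and GPU offload are the easiest to try. Assuming the app loads the model through LangChain's CTransformers wrapper, the tuning knobs travel in its config dict; a minimal stdlib sketch of building that dict (key names follow ctransformers' documented config, but verify against your installed version):

```python
# Hedged sketch: assemble a ctransformers-style performance config.
def perf_config(cores: int, gpu: bool) -> dict:
    return {
        "threads": cores,                # match physical CPU cores
        "batch_size": 256,               # larger prompt-processing batches
        "gpu_layers": 50 if gpu else 0,  # >0 offloads layers to the GPU
    }
```

The resulting dict would be passed as config= when constructing the LLM.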
Your project config needs vectorstore/faiss, but it isn't included in the repo.
How can I obtain it? I downloaded a FAISS store and added it to my environment, but it still won't run.
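For context: the FAISS store is usually a generated artifact, not a shipped one; running the repo's ingest script (ingest.py is assumed to be the entry point) builds it from your documents and writes it to disk. A small stdlib helper to check whether the store exists yet:

```python
# Hedged sketch: report whether a FAISS store directory has been generated.
from pathlib import Path

def faiss_store_ready(root: str = "vectorstore") -> bool:
    p = Path(root)
    return p.is_dir() and any(p.iterdir())
```

If this returns False, run ingest.py first; if it returns True and queries still fail, the store was likely built with a different embeddings model than the one used at query time.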