Comments (9)

PotatoSpudowski commented on May 25, 2024

Super nice to hear that you are using the repo. It seems like you are working on some cool project; please do share more about it if and when possible!

PotatoSpudowski commented on May 25, 2024

Hi,
You can try increasing the "n_ctx" param in

import fastLlama

model = fastLlama.Model(
        id="ALPACA-LORA-30B",
        path=str(MODEL_PATH.resolve()),  # path to the model file (MODEL_PATH is a pathlib.Path)
        num_threads=16,  # number of threads to use
        n_ctx=512,  # context size of the model -- increase this for longer prompts
        last_n_size=64,  # size of last n tokens (used for repetition penalty) (Optional)
        seed=0  # seed for the random number generator (Optional)
    )

Although I suspect the quality might be a bit poorer for long context lengths than with the latest llama.cpp repo.

I am in the middle of updating the ggml library!

robin-coac commented on May 25, 2024

I tried larger context sizes, up to 2048, and still hit the same problem, even with 10 GB of RAM and swap available.
However, loading the model on each iteration of the loop solves the problem (roughly as sketched below), but it's super slow, as you might expect.
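
What that slow workaround looks like, as a minimal sketch (the instruction list and generation parameters are only illustrative, and the calls assume the ingest/generate API shown in the fastLlama README):

import fastLlama

instructions = ["instruction 1", "instruction 2"]  # illustrative prompts

for instruction in instructions:
    # Reload the weights on every iteration: this avoids the crash but pays
    # the full model-loading cost each time.
    model = fastLlama.Model(
        id="ALPACA-LORA-30B",
        path=str(MODEL_PATH.resolve()),
        num_threads=16,
        n_ctx=2048,
        last_n_size=64,
        seed=0
    )
    model.ingest(instruction)
    model.generate(
        num_tokens=256,
        top_p=0.95,
        temp=0.8,
        repeat_penalty=1.0,
        streaming_fn=lambda token: print(token, end="", flush=True)
    )
    del model  # drop the weights before the next reload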

I believe problems like these are solved, or will likely be solved, in the original llama.cpp repo thanks to the huge number of contributors there.

Do you plan to keep updating this repo alongside the original llama.cpp library? I want to help any way I can as well, although, regretfully, C/C++ is not something I can help with :(. Only Python for the time being.

PotatoSpudowski commented on May 25, 2024

Yes, we will continue to update it if it makes sense. However, we want to make sure we don't just copy files from one repo to another; otherwise we would just be playing catch-up with them, which isn't that fun!

Regarding your issue, I will have a look at it and fix it soon!

robin-coac commented on May 25, 2024

Appreciate this! I find this a really notorious problem, because there's not much difference between giving instructions interactively vs. providing them through a for loop.

PotatoSpudowski commented on May 25, 2024

Interesting: @amitsingh19975 has fixed this in the feature/refactor branch. We are going to merge it into main soon!

Closing this for now. Please feel free to reopen if needed.

robin-coac commented on May 25, 2024

No, it's not solved. I tried with that branch. Basically, loading the model once and then running inference in a loop still causes a segmentation fault, roughly like the sketch below.
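
For reference, the pattern that crashes is roughly the following (a minimal sketch; prompts and parameter values are illustrative, and the calls again assume the README's ingest/generate API):

import fastLlama

# Load the model once up front.
model = fastLlama.Model(
    id="ALPACA-LORA-30B",
    path=str(MODEL_PATH.resolve()),
    num_threads=16,
    n_ctx=2048
)

for instruction in instructions:  # illustrative list of instruction prompts
    model.ingest(instruction)
    model.generate(
        num_tokens=256,
        streaming_fn=lambda token: print(token, end="", flush=True)
    )
    # After a few iterations this segfaults instead of finishing the loop.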

amitsingh19975 commented on May 25, 2024

Could you share the model parameters and which model you are using? The main branch uses mmap to reduce memory consumption, which we will introduce in the future as a low-memory mode with a few changes; I hope that fixes your problem if it's related to memory.
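
For context, the general idea behind an mmap-backed loader (this is not the actual fastLlama code, just a rough Python illustration) is that the weights file is mapped instead of read into RAM up front, so the OS pages it in lazily and can evict pages under memory pressure:

import mmap

# Placeholder filename; any large read-only file works the same way.
with open("ggml-model-q4_0.bin", "rb") as f:
    weights = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    magic = weights[:4]  # only the pages actually touched become resident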

PotatoSpudowski commented on May 25, 2024

@robin-coac Is this still the case?
