
Issues with Splash Attention (levanter, closed)

versae commented on August 15, 2024
Issues with Splash Attention

from levanter.

Comments (9)

dlwh commented on August 15, 2024

Ugh, yeah, I'll disable it or turn it into opt-in until we figure this out. Thanks for the report. I should have tested it more carefully. The loss curve was approximately the same in my quick test, but apparently it's no good.


dlwh commented on August 15, 2024

I bet I messed up MQA (multi-query attention)


dlwh commented on August 15, 2024

pretty sure this will be fixed in #596 (though I'll make it default on after i test it a bit more)


dlwh commented on August 15, 2024

@versae can you try latest main with:

--model.attn_backend splash

and (the default):

--model.attn_backend jax_flash

and lmk if things seem good? pretty sure it's correct now


versae commented on August 15, 2024

It seems fixed! Haven't run extensive tests, but initial losses match now.
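For anyone repeating this check, a rough sketch of comparing the first few logged losses from two backends is below. The helper name `losses_match`, the tolerances, and the numbers are all assumptions for illustration; the values are placeholders, not measurements from these runs.

```python
import math


def losses_match(run_a, run_b, rel_tol=1e-2, abs_tol=1e-3):
    """Return True if two loss sequences agree step by step within tolerance.

    The tolerances are arbitrary defaults (an assumption); pick values that
    reflect the numerical noise expected between attention kernels.
    """
    if len(run_a) != len(run_b):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(run_a, run_b)
    )


# Placeholder values for illustration only, not taken from this thread.
splash_losses = [10.82, 10.41, 9.97]
jax_flash_losses = [10.83, 10.40, 9.98]
print(losses_match(splash_losses, jax_flash_losses))
```

Matching initial losses is only a quick sanity check; it does not replace a longer run or a direct numerical comparison of the attention outputs.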

Also, the Enum keys for the AttentionBackend() class seem to be all upper case, so I had to use SPLASH and JAX_FLASH.
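To illustrate the case mismatch, here is a minimal sketch of resolving an enum member by name regardless of case. The `AttentionBackend` class below is a hypothetical stand-in with only the two members named in this thread, and `parse_backend` is an invented helper; neither is levanter's or draccus's actual code.

```python
from enum import Enum


class AttentionBackend(Enum):
    # Hypothetical stand-in: the member values are assumptions,
    # not copied from levanter.
    SPLASH = "splash"
    JAX_FLASH = "jax_flash"


def parse_backend(text: str) -> AttentionBackend:
    """Resolve a command-line string to an enum member, ignoring case."""
    key = text.strip().upper()
    try:
        # Enum lookup by name is case-sensitive, so normalize first.
        return AttentionBackend[key]
    except KeyError:
        valid = ", ".join(member.name.lower() for member in AttentionBackend)
        raise ValueError(
            f"unknown attention backend {text!r}; expected one of: {valid}"
        ) from None


print(parse_backend("splash"))
print(parse_backend("JAX_FLASH"))
```

With a lookup like this, both `splash` and `SPLASH` would resolve to the same member, which is the behavior the lowercase flags in the earlier comment assume.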


versae commented on August 15, 2024

Nice catch BTW

https://github.com/stanford-crfm/levanter/pull/596/files#diff-fbe2e2b3c420db2356f984aa3ecff7877ca98a9391bd8c9d2139bb9f3c61e100R763


dlwh commented on August 15, 2024

thanks. I should have caught it before, but glad it's fixed now


dlwh commented on August 15, 2024

the lowercase thing can be fixed by upgrading the draccus dependency btw


dlwh commented on August 15, 2024

declaring it fixed in #598

