Phi-3 support (ctranslate2 issue, closed)

Theodotus1243 commented on June 3, 2024
Phi-3 support

from ctranslate2.

Comments (9)

vince62s commented on June 3, 2024

No worries, it will be done. It's quite easy for the mini-4k, since it reuses the llama2 arch entirely.
fyi: https://forum.opennmt.net/t/phi-3-3-8b-llama2-7b-ensemble-just-for-fun/5729

minhthuc2502 commented on June 3, 2024

PR #1680 to add the converter for Phi3

BBC-Esq commented on June 3, 2024

I second this. The current phi loader is broken, apparently because of some changes that Microsoft made to the model after it was initially released. At any rate, adapting the phi loader to the new phi3 should be easier than starting from scratch.

jncraton commented on June 3, 2024

For anyone else researching this, phi3 support has been added to the convert_hf_to_gguf.py script in llama.cpp. Perhaps something can be gleaned from there to simplify the implementation of the ct2 converter.

BBC-Esq commented on June 3, 2024

Is it done yet? I've been waiting patiently for approximately two hours now. ;-)

minhthuc2502 commented on June 3, 2024

Hello, I am working on it. Some unexpected problems have appeared.

BBC-Esq commented on June 3, 2024

I'm not skilled enough to help directly by implementing the code...but if you want me to do any grunt work or research, let me know dude...anything to help speed up the process. Thanks!

BBC-Esq commented on June 3, 2024

I'd like to start learning so I can eventually help...Question...how do I get the actual model architecture to start with? It's my understanding that getting the model's structure, what activation functions are used, etc., and basically starting to understand that structure, is key to making additional converters down the road. For example, here's a link:

https://bbycroft.net/llm

Here are some other links that I've been gathering with the goal of eventually contributing a converter...based on first trying to understand the structure of LLMs...

https://github.com/mert-kurttutan/torchview

https://github.com/lutzroeder/netron

Huggingface sometimes (but not always) has information like this...

[image not preserved]

Basically, any good starting point for me that you'd recommend dude? Thanks!
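
One possible starting point for the question above: Hugging Face model repositories usually ship a config.json that lists the architecture class, the activation function, and the layer sizes. The sketch below reads such a file with the standard library only. The JSON text is a made-up excerpt with placeholder values (not the real Phi-3 settings), and `summarize_config` is a hypothetical helper, not part of ctranslate2 or transformers.

```python
import json

# Hypothetical excerpt of a Hugging Face-style config.json.
# The values are illustrative placeholders, NOT the real Phi-3 settings.
CONFIG_TEXT = """
{
  "architectures": ["ExampleForCausalLM"],
  "hidden_act": "silu",
  "hidden_size": 64,
  "num_hidden_layers": 2,
  "num_attention_heads": 4
}
"""


def summarize_config(text):
    """Pick out the fields a converter author would likely check first."""
    config = json.loads(text)
    return {
        "architecture": config["architectures"][0],
        "activation": config["hidden_act"],
        "layers": config["num_hidden_layers"],
        "hidden_size": config["hidden_size"],
        # Assumption: heads split the hidden size evenly, as in Llama-style models.
        "head_dim": config["hidden_size"] // config["num_attention_heads"],
    }


summary = summarize_config(CONFIG_TEXT)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Running the same function on a real model's config.json should give a quick first picture of what a converter has to handle, though the exact set of keys varies between model families.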

BBC-Esq commented on June 3, 2024

Remember, you're dealing with an idiot who doesn't do this for a profession and has never taken the LLM 101 class in college let alone have a doctoral degree. ;-) I don't even know what "mlp.down" or "layernorm.weight" means, for example, but am willing to learn.
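
For readers in the same position: names like "mlp.down" and "layernorm.weight" are dotted paths into the model's module hierarchy. Roughly, the "mlp" part is the feed-forward block inside each layer, its "down" projection maps the wider intermediate vector back to the hidden size, and a layer norm's "weight" is its learned scale vector. A minimal sketch that turns such names into a tree, assuming hypothetical parameter names written in the style of Llama-like checkpoints (they are not read from an actual Phi-3 file):

```python
# Hypothetical parameter names in the style of Llama-like checkpoints.
# They are illustrative only and were not read from a real model file.
PARAM_NAMES = [
    "model.embed_tokens.weight",
    "model.layers.0.input_layernorm.weight",
    "model.layers.0.self_attn.o_proj.weight",
    "model.layers.0.mlp.down_proj.weight",
    "model.norm.weight",
]


def build_tree(names):
    """Turn dotted parameter names into a nested dict of modules."""
    tree = {}
    for name in names:
        node = tree
        for part in name.split("."):
            node = node.setdefault(part, {})
    return tree


def render(tree, indent=0):
    """Return one indented line per module or parameter in the tree."""
    lines = []
    for key, child in tree.items():
        lines.append("  " * indent + key)
        lines.extend(render(child, indent + 1))
    return lines


tree = build_tree(PARAM_NAMES)
print("\n".join(render(tree)))
```

Read this way, a converter is largely a mapping from each path in one framework's tree to the matching slot in the target format.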
