aisingapore / sealion
South-East Asia Large Language Models
License: MIT License
Hello,
While trying to train a LoRA for this model, I ran into this error:
TypeError: MptForCausalLM.forward() got an unexpected keyword argument 'token_type_ids'
A minimal reproducible example is available here, and you can view the error logs in the train.log file.
I was able to work around this by cloning transformers locally and patching the MPT class. I can provide my changes if this would be helpful.
Would it be possible to get LoRA finetuning supported officially with SEA-LION?
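For readers hitting the same error, one possible workaround that avoids patching transformers is to keep token_type_ids from reaching the model at all. This is a minimal sketch, not the patch mentioned above (which is not shown in this issue). The helper name drop_token_type_ids is hypothetical, and the sketch assumes each batch is a plain dict of tokenizer outputs.

```python
from typing import Any, Dict, Mapping


def drop_token_type_ids(batch: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the batch without the 'token_type_ids' entry.

    Hypothetical helper: assumes the batch is a mapping of tokenizer
    outputs such as input_ids and attention_mask.
    """
    return {key: value for key, value in batch.items() if key != "token_type_ids"}


if __name__ == "__main__":
    # Toy batch standing in for real tokenizer output.
    example = {
        "input_ids": [[1, 2, 3]],
        "attention_mask": [[1, 1, 1]],
        "token_type_ids": [[0, 0, 0]],
    }
    print(sorted(drop_token_type_ids(example)))
```

Hugging Face tokenizers also accept return_token_type_ids=False when called, which may remove the key at the source. Whether that is enough for a given training script depends on where tokenization happens, so treat both options as things to try rather than a confirmed fix.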
Hi, first of all, it is really nice to see the SEA-LION model. However, I run into problems when I try to load the model by following the README file.
I am not sure whether there is a version mismatch for transformers, as I cannot find this version. Also, I have already downloaded the tokenization_SEA_BPE.py file, but it still shows the warning: "Could not locate the tokenization_SEA_BPE.py inside aisingapore/sealion7b."
Hope to hear from you soon. Thanks again!
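When reporting a suspected version mismatch like this, it helps to state exactly which transformers version is installed. The sketch below uses only the Python standard library. The helper name installed_version is hypothetical, and the version the README expects is not stated in this issue, so nothing here checks against a specific number.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Include this line's output when describing the problem.
    print("transformers:", installed_version("transformers"))
```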
Good evening, team. So far I know that the prompt format is ### USER: {human_prompt} ### RESPONSE:
If I want to translate from one language into another, is there a prompt format I should follow, or do I need to fine-tune the model to achieve this? Thank you!
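As an illustration of the question, a translation request can be wrapped in the prompt format quoted above. This is only a sketch: the function names are hypothetical, the line breaks between the markers are an assumption (the issue shows the format on a single line), and the wording of the translation instruction is made up for the example. It does not establish that the model translates well without fine-tuning.

```python
def build_prompt(human_prompt: str) -> str:
    """Wrap a user message in the '### USER: ... ### RESPONSE:' format.

    The newlines between the markers are an assumption, not confirmed here.
    """
    return f"### USER:\n{human_prompt}\n\n### RESPONSE:\n"


def build_translation_prompt(text: str, source_lang: str, target_lang: str) -> str:
    """Build a translation request; the instruction wording is illustrative only."""
    instruction = (
        f"Translate the following text from {source_lang} to {target_lang}:\n{text}"
    )
    return build_prompt(instruction)


if __name__ == "__main__":
    print(build_translation_prompt("Selamat pagi", "Malay", "English"))
```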
Hi, we are the Foundation Models Lab from Hanoi University of Science and Technology, Vietnam (bkai.ai). Is it possible to contribute the Vietnamese instruct version to this project? Thank you.