bigcode-project / starcoder2-self-align Goto Github PK
View Code? Open in Web Editor NEWStarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation
License: Apache License 2.0
StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation
License: Apache License 2.0
I am trying to recreate the StarCoder2-Instruct-v0.1
model; however, the model produced by the provided command in the README (copied below) does not match the evaluation of the StarCoder2-Instruct-v0.1
model on HF.
I actually see quite a bit of discrepancy between the two models' evaluations: humaneval
on your HF version is 7 points higher than on my reproduced model (both models were evaluated locally by me in the same environment).
MODEL_KEY=bigcode/starcoder2-15b
LR=1e-5
EPOCH=4
SEQ_LEN=1280
WARMUP_RATIO=0.05
OUTPUT_DIR=/path/to/output_model
DATASET_FILE=/path/to/50k-dataset.jsonl
accelerate launch -m star_align.train \
--model_key $MODEL_KEY \
--model_name_or_path $MODEL_KEY \
--use_flash_attention True \
--datafile_paths $DATASET_FILE \
--output_dir $OUTPUT_DIR \
--bf16 True \
--num_train_epochs $EPOCH \
--max_training_seq_length $SEQ_LEN \
--pad_to_max_length False \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 64 \
--group_by_length False \
--ddp_find_unused_parameters False \
--logging_steps 1 \
--log_level info \
--optim adafactor \
--max_grad_norm -1 \
--warmup_ratio $WARMUP_RATIO \
--learning_rate $LR \
--lr_scheduler_type linear
Are the parameters in the README correct for the released model? Are you adding anything in your accelerate
config? i.e. any model wrappers or something else?
For the data, I just ran:
>>> from datasets import load_dataset
>>> load_dataset("bigcode/self-oss-instruct-sc2-exec-filter-50k", split="train").to_json("/path/to/50k-dataset.jsonl", lines=True)
Do you have any ideas on how I can reproduce your model? Thanks!
We want to cherry-pick python seed snippets that cover a diverse range of coding concepts and programming features as our few-shot examples
Args:
datasets.load_dataset
to loadHi - can you release on HF the concepts for each of the snippets?
Thank you!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.