Comments (6)

RonanKMcGovern commented on May 22, 2024

I'm reasonably confident this is because SFTTrainer seems to have changed how it expects args to be passed.

SFTConfig must be used.

Also, it seems that peft_config cannot be passed within SFTConfig. See here.

It took me a while to track this down but it appeared in an error message.

Kind of annoying if it is this, because a lot of my SFT scripts have now broken (and I didn't have frozen versioning because I was using the dev version of SFT in quite a few cases).
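As a rough illustration of the split described above, here is a minimal sketch of a migration helper. It assumes, based only on this thread, that dataset-related options such as dataset_text_field, packing and max_seq_length belong on the config object, while peft_config stays a direct SFTTrainer argument. The helper name split_sft_kwargs and both key sets are hypothetical, and the exact split may differ between TRL versions.

```python
# Hypothetical migration helper. The key sets below are assumptions drawn
# from this thread, not an official TRL list; check your installed version.
CONFIG_KEYS = {"dataset_text_field", "packing", "max_seq_length"}
TRAINER_KEYS = {"model", "train_dataset", "eval_dataset", "tokenizer", "peft_config"}


def split_sft_kwargs(legacy_kwargs, training_kwargs):
    """Split old-style SFTTrainer kwargs into (config_kwargs, trainer_kwargs)."""
    # Start from the plain training arguments (output_dir, learning_rate, ...).
    config_kwargs = dict(training_kwargs)
    trainer_kwargs = {}
    for key, value in legacy_kwargs.items():
        if key in CONFIG_KEYS:
            config_kwargs[key] = value
        elif key in TRAINER_KEYS:
            trainer_kwargs[key] = value
        else:
            # Fail loudly instead of silently dropping an argument.
            raise KeyError(f"unrecognized SFTTrainer argument: {key!r}")
    return config_kwargs, trainer_kwargs


config_kwargs, trainer_kwargs = split_sft_kwargs(
    {"peft_config": "lora-config-placeholder", "dataset_text_field": "text", "packing": False},
    {"output_dir": "./results", "gradient_checkpointing": True},
)
# With trl installed, the two dicts would then be used roughly as:
#   args = SFTConfig(**config_kwargs)
#   trainer = SFTTrainer(args=args, **trainer_kwargs)
```

Raising on unknown keys is a deliberate choice here: a script that silently drops an argument is harder to debug than one that fails at startup.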

from transformers.

amyeroberts commented on May 22, 2024

Hi @KaifAhmad1, in order to help we need a full reproducer, i.e. something we can copy, paste and run to get the same error. Without the model and dataset (or public equivalents) and the peft config, there's not much we can do here.

cc @younesbelkada @pacman100

KaifAhmad1 commented on May 22, 2024

Hey @amyeroberts, I cannot share the full code here. Here is a link to the Colab notebook for reference:
https://github.com/KaifAhmad1/code-test/blob/main/Phi_3_Fine_Tuned_on_Indic_Lanuage.ipynb

younesbelkada commented on May 22, 2024

Hi @KaifAhmad1
Can you try to force-set gradient_checkpointing:

# Training Arguments
training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=0,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=True,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="cosine",
    report_to="tensorboard",
    gradient_checkpointing=True,  # added: force-enable gradient checkpointing
)
     

# SFTTrainer Arguments
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field='text',
    args=training_arguments,
    tokenizer=tokenizer,
    packing=False,
    max_seq_length=None
)
   

KaifAhmad1 commented on May 22, 2024

Thanks, @younesbelkada for fixing it!

younesbelkada commented on May 22, 2024

Thanks @RonanKMcGovern for the feedback. Indeed, the SFTConfig change might have broken things. I will look into that. Can you help me by providing more details on the silent bugs you faced? In our CI, where we extensively test many things, everything seemed green for us, so this will help improve our testing infra.
Can you elaborate on why peft_config cannot be passed within SFTConfig?
