@@raoufkamal5748 Thanks a million for your continuous support.
@kawtartouizi4366
26 days ago
Very Interesting 👏👏
@APCMasteryPath
26 days ago
@@kawtartouizi4366 I always love your comments 🤗😍
@cagataydemirbas7259
26 күн бұрын
Please share fine tuning by using DPO trainer
@APCMasteryPath
21 days ago
The only change to the code is in the training parameters and the dpo_trainer.train() step, as shown below:

dpo_trainer = DPOTrainer(
    model = model,
    ref_model = None,
    args = TrainingArguments(
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 8,
        warmup_ratio = 0.1,
        num_train_epochs = 3,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        seed = 42,
        output_dir = "outputs",
    ),
    beta = 0.1,
    train_dataset = YOUR_DATASET_HERE,
    # eval_dataset = YOUR_DATASET_HERE,
    tokenizer = tokenizer,
    max_length = 1024,
    max_prompt_length = 512,
)
dpo_trainer.train()

The rest of the code stays the same. You can check the source code on the Unsloth GitHub main page: github.com/unslothai/unsloth. Also, for the original Llama 3.1 conversational chat template Google Colaboratory notebook, you can check the following link: colab.research.google.com/drive/15OyFkGoCImV9dSsewU1wa2JuKB4-mDE_?usp=sharing Hope this helps and many thanks for your comment 😊
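One more note on the train_dataset placeholder above: DPO is trained on preference pairs, and the TRL DPOTrainer expects each row to carry a prompt, a preferred answer ("chosen"), and a dispreferred answer ("rejected"). Below is a minimal sketch of that row shape; the example rows themselves are my own illustration, not from the video:

```python
# Illustrative preference rows (hypothetical content) in the
# prompt / chosen / rejected column layout that the DPO trainer consumes.
# In practice you would wrap these with datasets.Dataset.from_list(...)
# before passing them as train_dataset.
preference_rows = [
    {
        "prompt": "Explain overfitting in one sentence.",
        "chosen": "Overfitting means the model fits noise in the training "
                  "set and generalizes poorly to unseen data.",
        "rejected": "Overfitting is when training finishes really fast.",
    },
]

# Every row must carry exactly these three fields:
required_columns = {"prompt", "chosen", "rejected"}
assert all(required_columns <= set(row) for row in preference_rows)
```

The beta = 0.1 argument in the trainer then controls how strongly the model is pulled toward the "chosen" answers relative to the reference model.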
@SalekinRupak-i5m
21 days ago
Sir, can we get the code? 🌸
@APCMasteryPath
21 days ago
Please find below the link to the original Google Colaboratory notebook developed by the Unsloth team: colab.research.google.com/drive/15OyFkGoCImV9dSsewU1wa2JuKB4-mDE_?usp=sharing Hope this helps and many thanks for your comment 😊
Comments: 9