1 year ago

Building The Next Large Model: trlX: A Framework for Open-Source RLHF

1,660 views

Weights & Biases

From Fully Connected 2023
Over the past couple of months, CarperAI has built trlX, one of the first open-source reinforcement learning from human feedback (RLHF) implementations capable of fine-tuning large language models at scale. They have tested offline reinforcement learning algorithms to reduce compute requirements and explored the practicality of synthetic preference data, finding that the two can be combined to significantly reduce the cost of RLHF.

Comments: 3

@JBoy340a
10 months ago
Great work!
@arjungoalset8442
1 year ago
Can’t wait to see this 🙏
@Quazgaa
1 year ago
closed-source ie worthless