Reinforcement learning with human feedback (Q2177): Difference between revisions
(Created a new Item)
(Created claim: depends on (P1): Machine learning (Q2167))
Property / depends on
Property / depends on: Machine learning / rank
Normal rank
Latest revision as of 12:51, 13 October 2025
Training a model using human preferences
- RLHF
Language | Label | Description | Also known as |
---|---|---|---|
English | Reinforcement learning with human feedback | Training a model using human preferences | RLHF |