AI / Machine Learning
manitcor
•
1y ago
•
100%
Direct Preference Optimization - Your Language Model is Secretly a Reward Model
cross-posted from: https://lemmy.intai.tech/post/17988
Comments 0
cross-posted from: https://lemmy.intai.tech/post/17988