RLHF Trojan Competition
Datasets and models used for the trojan detection competition co-located at SaTML 2024: https://github.com/ethz-spylab/rlhf_trojan_competition
This reward model was used to align this generation model for the trojan detection competition co-located at SaTML 2024. For more information, visit the official competition website.