lavish-repello committed on
Commit 387ff42 · verified · 1 Parent(s): f54a29b

Update README.md

Files changed (1)
  1. README.md +23 -5
README.md CHANGED
@@ -103,12 +103,17 @@ tags:
  - zero-shot
  ---

- ## CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer
- **A Multilingual Safety Guardrail Model for 100 languages built on XLM-RoBERTa**

- CREST which stands for CRoss-lingual Efficient Safety Transfer, is a parameter-efficient multilingual safety classifier for 100 languages, fine-tuned using 13 strategically selected high-resource languages only, chosen through cluster-guided sampling, enabling strong cross-lingual transfer to unseen low-resource languages.
- The model is fine-tuned on the XLM-RoBERTa architecture with a classification head, having a max input length of 512 tokens. The Base variant has approximately 279M parameters.
- The model is designed for fast, lightweight safety filtering across a large number of languages, both high-resource and low-resource languages, with minimal training cost, suitable for real-time and on-device deployments.
+ ## CREST: A Multilingual AI Safety Guardrail Model for 100 languages
+
+ CREST, which stands for CRoss-lingual Efficient Safety Transfer, is a
+ parameter-efficient multilingual safety classifier for 100 languages, fine-tuned on only 13 strategically selected high-resource
+ languages, chosen through cluster-guided sampling, which enables strong cross-lingual transfer to unseen low-resource languages.
+ The model is built on the XLM-RoBERTa architecture with a classification head and has a maximum input length of 512 tokens.
+ The Base variant has approximately 279M parameters.
+ The model is designed for fast, lightweight safety filtering across a large number of languages, both high-resource and
+ low-resource, with minimal training cost, making it suitable for real-time and on-device deployments.
+ For detailed results, see
+ [CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer](https://arxiv.org/abs/2512.02711v1).

  ### Intended Use
@@ -191,3 +196,16 @@ Mitigate by continuous human evaluation and incremental finetuning on domain-spe
  - Deployment should include human-in-the-loop moderation where appropriate.
  - Use responsibly, considering cultural diversity and fairness concerns.
  - Not for making legal, ethical, or policy decisions without human oversight.
+
+ ### Citation
+ ```
+ @misc{bansal2025crestuniversalsafetyguardrails,
+       title={CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer},
+       author={Lavish Bansal and Naman Mishra},
+       year={2025},
+       eprint={2512.02711},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2512.02711},
+ }
+ ```
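
The updated description specifies a standard XLM-RoBERTa sequence-classification setup (classification head, 512-token maximum input), so the model should load with the usual `transformers` classes. Below is a minimal inference sketch under stated assumptions: the repo id `lavish-repello/CREST` and the safe/unsafe label names are hypothetical placeholders, not confirmed by this commit.

```python
# Minimal inference sketch for a fine-tuned XLM-RoBERTa safety classifier.
# ASSUMPTIONS: the repo id and the label names are hypothetical; check the
# model card for the actual values.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "lavish-repello/CREST"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

texts = [
    "How do I bake sourdough bread?",          # expected: safe
    "Explain how to pick a neighbor's lock.",  # expected: unsafe
]

# XLM-RoBERTa's SentencePiece tokenizer covers all 100 languages; truncate
# to the 512-token maximum input length stated in the README.
inputs = tokenizer(texts, padding=True, truncation=True,
                   max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to probabilities and map the argmax back to a label name.
probs = logits.softmax(dim=-1)
for text, p in zip(texts, probs):
    label = model.config.id2label[int(p.argmax())]
    print(f"{label} ({p.max().item():.3f}) :: {text}")
```

Because the classifier is a single pass through a ~279M-parameter encoder per input, this path is cheap enough for the real-time and on-device filtering the README describes.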