repelloai/CREST-Base

Text Classification

safety-guardrails


lavish-repello committed 10 days ago

Commit 387ff42 (verified) · 1 parent: f54a29b

Update README.md

Files changed (1)

README.md +23 -5

@@ -103,12 +103,17 @@ tags:
 - zero-shot
 ---
-## CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer
-**A Multilingual Safety Guardrail Model for 100 languages built on XLM-RoBERTa**
-CREST which stands for CRoss-lingual Efficient Safety Transfer, is a parameter-efficient multilingual safety classifier for 100 languages, fine-tuned using 13 strategically selected high-resource languages only, chosen through cluster-guided sampling, enabling strong cross-lingual transfer to unseen low-resource languages.
-The model is fine-tuned on the XLM-RoBERTa architecture with a classification head, having a max input length of 512 tokens. The Base variant has approximately 279M parameters.
-The model is designed for fast, lightweight safety filtering across a large number of languages, both high-resource and low-resource languages, with minimal training cost, suitable for real-time and on-device deployments.
+## CREST: A Multilingual AI Safety Guardrail Model for 100 languages
+CREST which stands for CRoss-lingual Efficient Safety Transfer is a
+parameter-efficient multilingual safety classifier for 100 languages, fine-tuned using 13 strategically selected high-resource
+languages only, chosen through cluster-guided sampling, enabling strong cross-lingual transfer to unseen low-resource languages.
+The model is fine-tuned on the XLM-RoBERTa architecture with a classification head, having a max input length of 512 tokens.
+The Base variant has approximately 279M parameters.
+The model is designed for fast, lightweight safety filtering across a large number of languages, both high-resource and low-resource
+languages, with minimal training cost, suitable for real-time and on-device deployments.
+For detailed results, see
+[CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer](https://arxiv.org/abs/2512.02711v1).
 ### Intended Use
@@ -191,3 +196,16 @@ Mitigate by continuous human evaluation and incremental finetuning on domain-spe
 - Deployment should include human-in-the-loop moderation where appropriate.
 - Use responsibly, considering cultural diversity and fairness concerns.
 - Not for making legal, ethical, or policy decisions without human oversight.
+### Citation
+```
+@misc{bansal2025crestuniversalsafetyguardrails,
+      title={CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer},
+      author={Lavish Bansal and Naman Mishra},
+      year={2025},
+      eprint={2512.02711},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2512.02711},
+}
+```
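
The README text in this diff describes an XLM-RoBERTa model with a classification head and a 512-token input limit. A minimal sketch of how such a checkpoint might be queried follows. It assumes the repository `repelloai/CREST-Base` loads through the `transformers` `AutoTokenizer` and `AutoModelForSequenceClassification` classes, which the diff does not confirm. The helper names `softmax`, `top_label`, and `classify` are hypothetical, and label names are read from the checkpoint's own config instead of being assumed.

```python
import math
from typing import Dict, List, Sequence

MODEL_ID = "repelloai/CREST-Base"  # repository name as shown on this page
MAX_TOKENS = 512  # max input length stated in the README


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: Sequence[float], id2label: Dict[int, str]) -> Dict[str, object]:
    """Return the highest-scoring label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


def classify(text: str) -> Dict[str, object]:
    """Download the checkpoint and classify one string (needs network access)."""
    # Imported lazily so the helpers above work without transformers/torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: the checkpoint is compatible with the Auto* loaders.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    # Truncate to the 512-token limit mentioned in the README.
    inputs = tokenizer(text, truncation=True, max_length=MAX_TOKENS, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names come from the checkpoint's config; none are assumed here.
    return top_label(logits, model.config.id2label)
```

Calling `classify("some user prompt")` would return a dictionary with a `label` and a `score`. In line with the README's guidance on human-in-the-loop moderation, the score is better treated as a signal for review than as a final decision.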