Yiddish Whisper Training Collection Yiddish-based Whisper post-training - Crowd-Sourced Open Data • 10 items • Updated Oct 24 • 4
Qwen 3 VL - CATMuS Collection A collection of finetunes of Qwen 3 VL. These models were finetuned on the CATMuS dataset via TRL SFT. • 3 items • Updated Oct 24 • 2
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 70
Whisper Zulu ASR Models Collection This is a collection of Whisper models for transcribing audio/video in the Zulu language. • 4 items • Updated Aug 20, 2024 • 1
TrOCR Medieval HTR Collection This is a collection of models trained to recognize medieval scripts. • 10 items • Updated Jul 8, 2024 • 5
Medieval NER Collection This is a collection of Medieval NER datasets and models. • 7 items • Updated Jul 4, 2024 • 2
Historic Newspaper Datasets Collection Historic Newspaper Datasets on the Hub • 16 items • Updated May 8 • 6