Rasmus Rasmussen
theprint

AI & ML interests
Small model experiments and homespun datasets.

Recent Activity
updated a model about 9 hours ago: theprint/Llama3.2-3B-Math-gsm8k-AutoSFT
published a model about 11 hours ago: theprint/Llama3.2-3B-Math-gsm8k-AutoSFT
updated a collection 3 days ago: Mixture of Experts (MoE)

Mixture of Experts (MoE)
Sometimes I fine-tune models specifically to take on expert roles in a MoE configuration; sometimes I find interesting models that others have fine-tuned.
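For readers unfamiliar with the idea behind this collection: in a mixture-of-experts setup, a learned router dispatches each token to a small subset of specialized expert networks. The sketch below is a minimal illustration of that routing pattern in PyTorch, not theprint's actual pipeline; the expert architecture, sizes, expert count, and top-k choice are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Minimal sparse mixture-of-experts layer: a learned router sends
    each token to its top-k experts and mixes their outputs.
    All hyperparameters here are illustrative assumptions."""

    def __init__(self, d_model=512, n_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        logits = self.router(x)                        # (B, S, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1) # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)           # renormalize over the top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

# Quick smoke test
layer = SparseMoE()
tokens = torch.randn(2, 16, 512)  # (batch, seq, d_model)
print(layer(tokens).shape)        # torch.Size([2, 16, 512])

Routing each token to only top_k of the experts keeps per-token compute roughly constant as experts are added, which is the usual appeal of the MoE configurations this collection gathers.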