
Nice gpt-oss distill

#43
by ChuckMcSneed - opened

At least have the decency to distill from good models, what you made is pure trash, waste of compute.

> At least have the decency to distill from good models, what you made is pure trash, waste of compute.

A distill that's 110B parameters larger?

> A distill that's 110B parameters larger?

Yes, they are this stupid. They distilled a 120B model into a 229B one.

MiniMax org

Thanks for the comment, but just to correct the misinformation:
If MiniMax M2 were truly “pure trash,” you’d see it reflected in the benchmarks, and you don’t.
We welcome tough feedback, but it needs to be factual if it’s going to be useful. If you have specific technical points, we’re always happy to dive deep.
We open-sourced M2 so that everyone can use it freely and evaluate it transparently.
And honestly, if M2 doesn’t work for your needs, you’re absolutely free to use any other model. 😊

sriting changed discussion status to closed
