PSA: HF transformers implementation open sourced (with Trainer support)
Hi everyone,
First I'd like to thank the Sesame team for this amazing model.
I want to share with the community something I've been working on: a re-implementation of the model for HuggingFace transformers, fully compatible with Trainer.
It supports decoder training amortization as presented in the CSM blog post, and it also supports generation.
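For readers unfamiliar with the amortization scheme: the CSM blog post describes training the audio decoder on only a random 1/16 subset of the audio frames, while the zeroth codebook is still trained on every frame. Below is a minimal sketch of just the frame-sampling idea in plain Python. The function name `sample_decoder_frames` and its parameters are hypothetical, chosen for illustration, and are not taken from the csm-hf code.

```python
import random

def sample_decoder_frames(num_frames, fraction=1 / 16, rng=None):
    """Pick the frame indices on which the decoder loss would be computed.

    Hypothetical helper: illustrates random-subset amortization only.
    """
    rng = rng or random.Random()
    # Keep at least one frame, but never more than the sequence has.
    k = min(num_frames, max(1, int(num_frames * fraction)))
    return sorted(rng.sample(range(num_frames), k))

# Example: a 160-frame clip with the 1/16 fraction from the blog post.
frames = sample_decoder_frames(num_frames=160, rng=random.Random(0))
print(len(frames))  # 10 of the 160 frames are kept for the decoder loss
```

In a real training step, the decoder loss would be evaluated only at the sampled indices, which reduces decoder compute and memory roughly in proportion to the sampling fraction.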
It is Apache 2.0 licensed, and the code can be found here: https://github.com/thomasgauthier/csm-hf
The converted pretrained model weights are hosted at https://huggingface.co/thomasgauthier/csm-1b-hf
Looking forward to seeing what the community will do with this.
This is great work. This is a true implementation of an HF model, @thomasgauthier! Will be recommending and using this. Thanks!
Hope that more people build on top of your framework :)