Fix runtime buffers after load

#37
by err805 - opened
moondream org

On newer versions of transformers, loading the model can leave the non-persistent runtime buffers in an invalid state, which causes detect() to return incorrect results.

This rebuilds attn_mask and text.freqs_cis after from_pretrained() so the model uses the expected runtime buffers on both older and newer transformers versions.
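For readers unfamiliar with the pattern, here is a minimal, framework-free sketch of rebuilding derived runtime buffers once loading has finished. It is not the code from this PR. The helper names (`build_causal_mask`, `build_freqs_cis`, `rebuild_runtime_buffers`), the config fields (`max_context`, `rot_dim`), the plain causal mask, and the standard rotary-embedding formula are all assumptions made for illustration; the model's real mask layout and parameters may differ.

```python
import cmath
from types import SimpleNamespace


def build_causal_mask(seq_len):
    """Lower-triangular boolean mask: position i may attend to positions <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def build_freqs_cis(dim, max_len, theta=10000.0):
    """Rotary table: entry [t][k] = exp(i * t * theta ** (-2k / dim))."""
    inv_freq = [theta ** (-2.0 * k / dim) for k in range(dim // 2)]
    return [[cmath.exp(1j * t * f) for f in inv_freq] for t in range(max_len)]


def rebuild_runtime_buffers(model):
    """Recompute derived buffers from config, overwriting whatever the loader left."""
    cfg = model.config  # hypothetical config object, assumed for this sketch
    model.attn_mask = build_causal_mask(cfg.max_context)
    model.freqs_cis = build_freqs_cis(cfg.rot_dim, cfg.max_context)
    return model


# Toy stand-in for a freshly loaded model whose runtime buffers are invalid.
config = SimpleNamespace(max_context=4, rot_dim=8)
model = SimpleNamespace(config=config, attn_mask=None, freqs_cis=None)
rebuild_runtime_buffers(model)
```

Because both buffers are derived purely from configuration values, recomputing them unconditionally after loading is cheap and does not depend on how a given transformers version initializes non-persistent state.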

vikhyatk changed pull request status to merged
