wubingheng
AI & ML interests
I like to fine-tune the small models of the Doge series.
Organizations
Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models
upvoted a paper 4 months ago
upvoted a paper 12 months ago