Noice!
First of all, thanks for this release! Good mid-sized releases have been rare this year. I can't say I tested the base ByteDance model, as its instruction format was a real pain to get working on my own backend, but this one is definitely interesting. Thank you for using a less arcane format.
Due to RAM limitations I could only run it in IQ4_XS (16K context), but even at this low quant level, it's surprisingly good. It's decently uncensored; it might need some prompt nudging, but overall refusals are rare even for obviously "wrong" questions. It did well on my personal test bench (web queries based on user prompts, summarization, Q&A, structured output, haystack, decision trees, menu navigation, and finding the correct info in a confusing 16K prompt). I have yet to test the function calling stuff, but so far so good.
The CoT is occasionally a bit weird. It works just fine for academic/work/Q&A tasks, but I've noticed that in more creative areas, it occasionally responds "as the persona" in the thinking tag (as if it's speaking to me) and then reformulates the same thing in the final response. Not a big deal, and it didn't impede the model, but AFAIK it's the first time I've ever noticed this behavior in any model, so it's worth sharing.
Cheers.