Besides updates to our 14B and 70B models, we have a new LFM2-based 1.2B, a Llama 3.2-based 3B, and a Qwen 3-based 8B, all with class-leading Japanese language capabilities.
As usual, there are lots of details in the Model Cards for those interested.
---
app_port: 8080 # or any integer besides 7860 that's greater than 2 ** 10
startup_duration_timeout: 350m
---

I'll just add that I'm sure it's spam now; that space is attached to another one of my models as well (and is obviously not running either of them). Also, the user's other space links straight out to something shady: https://huggingface.co/spaces/elseodelasgalletas/detector-de-ia (I can't report it as I'm rate limited).
I mean, it's obviously not running my model (it's a brand-new JA/EN ablation), so I'm not sure why it'd be attached...
Also, I tested the new https://huggingface.co/DataPilot/ArrowPro-7B-KUJIRA model and it appears to be the real deal, with very impressive performance. It was trained by a 15-year-old (!), @Holy-fox. Note that using the sampler settings I detailed improved its score as well (otherwise it suffered from looping errors too).
I'll be aiming to beat that with the Llama 3 8B, and to beat Command R Plus with the 70B, in the coming days.
I'll just add a note on the sampler parameters I used for testing, which I found improved performance for virtually every model I tested: temperature 0.2, min_p 0.1, frequency_penalty 0.5. (A frequency/repetition penalty is required to minimize the looping errors that otherwise creep into most of these models.)
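For reference, here is a minimal sketch of what the two less familiar knobs do, assuming the standard min_p rule (keep only tokens whose probability is at least min_p times the top token's probability) and an OpenAI-style frequency penalty. The real samplers live inside the inference engine; this is purely illustrative:

```python
from collections import Counter

def min_p_filter(probs, min_p=0.1):
    # Standard min_p rule: drop any token whose probability is below
    # min_p * (probability of the most likely token), then renormalize.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

def apply_frequency_penalty(logits, generated_ids, penalty=0.5):
    # OpenAI-style frequency penalty: subtract penalty * (number of times
    # the token has already been generated) from that token's logit.
    counts = Counter(generated_ids)
    return [logit - penalty * counts.get(tok, 0) for tok, logit in enumerate(logits)]
```

The frequency penalty is what breaks loops: each repetition of a token lowers its logit further, so a model stuck emitting the same phrase gets pushed off it, while min_p (unlike a fixed top_p) adapts the candidate pool to how peaked the distribution is.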
gpt-3.5-turbo-0125's JA performance, which is worth noting, and it is tuned *exclusively* with the old shisa-v1 dataset (so its chart position will be very short-lived).

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("augmxnt/shisa-7b-v1")

# A short multi-turn conversation to render with the model's chat template.
messages = [
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]

# Print the raw Jinja chat template stored with the tokenizer...
print()
print('Chat Template:')
print(tokenizer.chat_template)
print()
print('---')
print()
# ...and the conversation rendered through it (untokenized, for inspection).
print(tokenizer.apply_chat_template(messages, tokenize=False))