fix broken link
README.md
CHANGED
@@ -24,7 +24,7 @@ Since our initial [Shisa 7B](https://huggingface.co/augmxnt/shisa-7b-v1) release
 
 
 ## Shisa V2 405B
-**Llama 3.1 Shisa V2 405B**<sup>1</sup> is a slightly special version of Shisa V2. Obviously, it is the largest, using [Llama 3.1 405B Instruct](meta-llama/Llama-3.1-405B-Instruct) as the base model and required >50x the compute for SFT+DPO compared to the 70B version. While it uses the same Japanese data mix as the other Shisa V2 models, it also has some contributed KO and ZH-TW language data mixed in as well.
+**Llama 3.1 Shisa V2 405B**<sup>1</sup> is a slightly special version of Shisa V2. Obviously, it is the largest, using [Llama 3.1 405B Instruct](https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct) as the base model and required >50x the compute for SFT+DPO compared to the 70B version. While it uses the same Japanese data mix as the other Shisa V2 models, it also has some contributed KO and ZH-TW language data mixed in as well.
 
 Most notably, Shisa V2 405B not only outperforms Shisa V2 70B on our battery of evals, but also GPT-4 (0603) and GPT-4 Turbo (2024-04-09). Shisa V2 405B also goes toe-to-toe with GPT-4o (2024-11-20) and DeepSeek-V3 (0324) on Japanese MT-Bench. Based on the evaluation results, we believe that Shisa V2 405B is the highest performing LLM ever trained in Japan.
 