leonardlin committed on Commit 8a5d66d · verified · 1 Parent(s): f3f46f2

fix broken link

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -24,7 +24,7 @@ Since our initial [Shisa 7B](https://huggingface.co/augmxnt/shisa-7b-v1) release
  ![Shisa V2 405B 日本語上手!](nihongojouzu.jpg)
 
  ## Shisa V2 405B
- **Llama 3.1 Shisa V2 405B**<sup>1</sup> is a slightly special version of Shisa V2. Obviously, it is the largest, using [Llama 3.1 405B Instruct](meta-llama/Llama-3.1-405B-Instruct) as the base model and required >50x the compute for SFT+DPO compared to the 70B version. While it uses the same Japanese data mix as the other Shisa V2 models, it also has some contributed KO and ZH-TW language data mixed in as well.
+ **Llama 3.1 Shisa V2 405B**<sup>1</sup> is a slightly special version of Shisa V2. Obviously, it is the largest, using [Llama 3.1 405B Instruct](https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct) as the base model and required >50x the compute for SFT+DPO compared to the 70B version. While it uses the same Japanese data mix as the other Shisa V2 models, it also has some contributed KO and ZH-TW language data mixed in as well.
 
  Most notably, Shisa V2 405B not only outperforms Shisa V2 70B on our battery of evals, but also GPT-4 (0603) and GPT-4 Turbo (2024-04-09). Shisa V2 405B also goes toe-to-toe with GPT-4o (2024-11-20) and DeepSeek-V3 (0324) on Japanese MT-Bench. Based on the evaluation results, we believe that Shisa V2 405B is the highest performing LLM ever trained in Japan.
 