Triangle104 committed on
Commit f47cab4 · verified · 1 Parent(s): 92cdfca

Adding Evaluation Results (#2)

- Adding Evaluation Results (4dd2d0dd6cee56f45c9e58f35fc475e15698832d)

Files changed (1)
  1. README.md +21 -14
README.md CHANGED
@@ -1,4 +1,5 @@
  ---
  library_name: transformers
  tags:
  - mergekit
@@ -22,8 +23,7 @@ model-index:
  value: 34.45
  name: strict accuracy
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -38,8 +38,7 @@ model-index:
  value: 20.78
  name: normalized accuracy
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -54,8 +53,7 @@ model-index:
  value: 29.68
  name: exact match
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -70,8 +68,7 @@ model-index:
  value: 8.5
  name: acc_norm
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -86,8 +83,7 @@ model-index:
  value: 8.18
  name: acc_norm
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -104,10 +100,8 @@ model-index:
  value: 21.01
  name: accuracy
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
- license: apache-2.0
  ---
  # Merge
  
@@ -150,4 +144,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
  |MATH Lvl 5 (4-Shot)|29.68|
  |GPQA (0-shot) | 8.50|
  |MuSR (0-shot) | 8.18|
- |MMLU-PRO (5-shot) |21.01|
 
  ---
+ license: apache-2.0
  library_name: transformers
  tags:
  - mergekit
 
  value: 34.45
  name: strict accuracy
  source:
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 20.78
  name: normalized accuracy
  source:
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 29.68
  name: exact match
  source:
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 8.5
  name: acc_norm
  source:
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 8.18
  name: acc_norm
  source:
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 21.01
  name: accuracy
  source:
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Triangle104/DS-R1-Distill-Q2.5-7B-RP
  name: Open LLM Leaderboard
  ---
  # Merge
  
 
  |MATH Lvl 5 (4-Shot)|29.68|
  |GPQA (0-shot) | 8.50|
  |MuSR (0-shot) | 8.18|
+ |MMLU-PRO (5-shot) |21.01|
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Triangle104__DS-R1-Distill-Q2.5-7B-RP-details)
+
+ | Metric |Value|
+ |-------------------|----:|
+ |Avg. |20.43|
+ |IFEval (0-Shot) |34.45|
+ |BBH (3-Shot) |20.78|
+ |MATH Lvl 5 (4-Shot)|29.68|
+ |GPQA (0-shot) | 8.50|
+ |MuSR (0-shot) | 8.18|
+ |MMLU-PRO (5-shot) |21.01|
+
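
The `Avg.` row in the table added by this commit is consistent with an unweighted mean of the six benchmark scores. The following is a minimal Python sketch of that check, not the leaderboard's own code: it assumes the average is a plain arithmetic mean rounded to two decimals, and the names `scores` and `leaderboard_average` are hypothetical, introduced only for illustration.

```python
# Scores copied from the results table in the diff above.
scores = {
    "IFEval (0-Shot)": 34.45,
    "BBH (3-Shot)": 20.78,
    "MATH Lvl 5 (4-Shot)": 29.68,
    "GPQA (0-shot)": 8.50,
    "MuSR (0-shot)": 8.18,
    "MMLU-PRO (5-shot)": 21.01,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean, rounded to two decimals.

    Assumption: the reported "Avg." is a plain mean of the six metrics.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 20.43"
```

The six scores sum to 122.60, and 122.60 / 6 ≈ 20.43, which matches the reported value.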