danish-foundation-models/dfm-decoder-open-v0-7b-pt

@@ -43,7 +43,9 @@ and two models from the Pleias family ([Pleias-350M-Preview](https://huggingface
 All comparison models were trained exclusively on open data, either in the public domain or under a permissive license.
 The following tables show the performance on each dataset.
-For each, we report the respective main metric from EuroEval and the confidence interval.
 | Model                               | scala-da (MCC)| dala (MCC)    | angry-tweets (MCC) | dansk (Micro F1, No Misc) | danske-talemaader (MCC) | danish-citizen-tests (MCC) | multi-wiki-qa-da (F1) | hellaswag-da (MCC) | nordjylland-news (BERTScore) | average  |
 | ----------------------------------- | ------------- | ------------- | ------------------ | ------------------------- | ----------------------- | -------------------------- | --------------------- | ------------------ | ---------------------------- | -------  |

 All comparison models were trained exclusively on open data, either in the public domain or under a permissive license.
 The following tables show the performance on each dataset.
+For each, we report the respective main metric from EuroEval and the confidence interval.
+The latter is a 95% confidence interval under a normal approximation, calculated as the mean of the metric scores across all evaluation runs ± 1.96 times the standard error of the mean (SEM), where n is the number of evaluation runs and x_i is the score of run i:
+$$\hat{\mu} \pm 1.96 \times \mathrm{SEM} \quad \textrm{where} \quad \mathrm{SEM} = \frac{s}{\sqrt{n}} \quad \textrm{and} \quad s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \hat{\mu})^2}{n-1}} \quad \textrm{and} \quad \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
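As a rough illustration of the formula above, the following sketch computes the mean and the ± 1.96 × SEM half-width from a list of per-run scores. The function name `mean_with_ci` and the example scores are hypothetical and only serve the illustration; this is not EuroEval's own implementation, which may differ in detail.

```python
import math
from statistics import mean, stdev


def mean_with_ci(scores, z=1.96):
    """Return (mean, half_width) so that the interval is mean ± half_width.

    Hypothetical helper: follows the formula in the text, using the sample
    standard deviation (n - 1 in the denominator) and SEM = s / sqrt(n).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two evaluation runs are needed for the SEM")
    mu = mean(scores)                 # sample mean over all runs
    s = stdev(scores)                 # sample standard deviation (n - 1)
    sem = s / math.sqrt(n)            # standard error of the mean
    return mu, z * sem


# Made-up scores from five runs, for illustration only.
example_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, half_width = mean_with_ci(example_scores)
print(f"{mu:.2f} ± {half_width:.2f}")
```

The factor 1.96 is the normal-approximation quantile for a 95% interval; with only a handful of runs, a Student's t quantile would give a somewhat wider interval, but the sketch keeps the constant used in the text.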
 | Model                               | scala-da (MCC)| dala (MCC)    | angry-tweets (MCC) | dansk (Micro F1, No Misc) | danske-talemaader (MCC) | danish-citizen-tests (MCC) | multi-wiki-qa-da (F1) | hellaswag-da (MCC) | nordjylland-news (BERTScore) | average  |
 | ----------------------------------- | ------------- | ------------- | ------------------ | ------------------------- | ----------------------- | -------------------------- | --------------------- | ------------------ | ---------------------------- | -------  |