<!-- ----------  Global Styles  ---------- -->
<style>
  /* 1. Center content and limit max width for readability */
  /* .wrapper{
    width:100%;
    max-width:1111px;
    margin:0 auto;
    padding:0 1rem;
  } */

  /* 2. Logo bar (top row) */
  .logo-bar{
    display:flex;
    align-items:center;
    justify-content:space-between;
    height:50px;
    margin-bottom:25px;
  }
  .logo-bar img{
    height:100%;
    max-width:100%;
    object-fit:contain;
  }

  /* 3. Generic paragraph spacing */
  p{line-height:1.6;}

  /* 4. Re-usable image section */
  .section-img{
    display:flex;
    justify-content:center;
    align-items:center;
    margin:25px 0;        /* vertical breathing room */
  }
  .section-img img{
    max-width:80%;
    height:auto;
    object-fit:contain;   /* avoid distortion */
  }

  /* 5. Make long BibTeX lines wrap instead of widening page */
  pre code{
    white-space:pre-wrap;
    word-break:break-word;
  }
</style>

<!-- ----------  Page Content  ---------- -->
<div class="wrapper">

<!-- Top logos ------------------------------------------------------------>
<div class="logo-bar">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/1bNk6xHD90mlVaUOJ3kT6.png" alt="HMS" />
<img src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/ZVx7ahuV1mVuIeygYwirc.png" alt="MGB" />
<img src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/TkKKjmq98Wv_p5shxJTMY.png" alt="Broad" />
<img src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/UcM8kmTaVkAM1qf3v09K8.png" alt="YLab" />
</div>

<h1>BRIDGE-OPEN Leaderboard</h1>

<!-- Updates -------------------------------------------------------------->
<h2>📢 Updates</h2>
<ul>
    <li>🗓️ 2025/12/07: Updated leaderboard with 1 model (99 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-12-07">View the full list of added models</a></li>
    <li>🗓️ 2025/11/04: Updated leaderboard with 2 models (98 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-11-04">View the full list of added models</a></li>
    <li>🗓️ 2025/11/01: Updated leaderboard with 3 models (96 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-11-01">View the full list of added models</a></li>
    <li>🗓️ 2025/09/04: Updated leaderboard with 8 models (93 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-09-04">View the full list of added models</a></li>
    <li>🗓️ 2025/07/22: Updated leaderboard with 10 models (85 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-07-22">View the full list of added models</a></li>
    <li>🗓️ 2025/06/03: Updated leaderboard with 21 models (75 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-06-03">View the full list of added models</a></li>
    <li>🗓️ 2025/04/28: BRIDGE Leaderboard V1.0.0 is now live (54 models in total)! <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-OPEN-Leaderboard/blob/main/models-log.md#%F0%9F%93%85-update-2025-04-28">View the full list of models</a></li>
    <li>🗓️ 2025/04/28: Our paper <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a> is now available on arXiv!</li>
</ul>

<h2>🎯 Purpose</h2>
<p><strong>BRIDGE-OPEN</strong> is a curated subset of the comprehensive <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-Medical-Leaderboard">BRIDGE Medical Leaderboard</a> that focuses exclusively on <strong>open-source clinical datasets</strong>. The datasets are accessible via the <a href="https://huggingface.co/datasets/YLab-Open/BRIDGE-Open">BRIDGE-Open dataset</a> on Hugging Face. While the full BRIDGE benchmark contains 87 clinical text tasks, BRIDGE-OPEN includes only the datasets that are freely available without access restrictions.</p>

<p>For more information about BRIDGE and the construction of this LLM benchmark, please visit the <a href="https://huggingface.co/spaces/YLab-Open/BRIDGE-Medical-Leaderboard">original BRIDGE Leaderboard Space</a>.</p>

<p>This leaderboard enables researchers and practitioners to:</p>
<ul>
    <li><strong>Evaluate LLMs on clinical tasks</strong> using publicly available data</li>
    <li><strong>Reproduce and verify results</strong> without data access barriers</li>
    <li><strong>Benchmark models fairly</strong> on the same open clinical datasets</li>
    <li><strong>Advance medical AI research</strong> through transparent evaluation</li>
</ul>

<h2>📊 What's Included</h2>
<p><strong>BRIDGE-OPEN</strong> contains <strong>50+ open-access clinical datasets</strong> spanning:</p>
<ul>
    <li><strong>9 Languages</strong>: English, Chinese, Spanish, Japanese, German, Russian, French, Norwegian, Portuguese</li>
    <li><strong>8 Task Types</strong>: Text classification, semantic similarity, normalization/coding, NER, NLI, event extraction, QA, summarization</li>
    <li><strong>14 Clinical Specialties</strong>: General medicine, cardiology, oncology, pharmacology, radiology, and more</li>
    <li><strong>6 Clinical Stages</strong>: From triage to discharge and administration</li>
</ul>

<h2>🏆 Three Evaluation Modes</h2>
<p>Each model is evaluated using three different inference strategies:</p>
<ol>
    <li><strong>Zero-Shot</strong>: Direct task completion without examples</li>
    <li><strong>Chain-of-Thought (CoT)</strong>: Step-by-step reasoning before the final answer</li>
    <li><strong>Few-Shot</strong>: Five demonstrations provided for in-context learning</li>
</ol>
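
<p>As a rough illustration of how the three modes differ at the prompt level, here is a minimal sketch. The function <code>build_prompt</code>, the constant <code>COT_SUFFIX</code>, and the prompt wording are all hypothetical and chosen for illustration only; they are not the official BRIDGE prompts, which are described in the BRIDGE paper.</p>

```python
# Hypothetical prompt builder. The wording is illustrative and is NOT the
# official BRIDGE prompt; only the three mode names come from this page.
from typing import Sequence, Tuple

COT_SUFFIX = "Think step by step, then give the final answer."


def build_prompt(instruction: str, text: str, mode: str,
                 demos: Sequence[Tuple[str, str]] = ()) -> str:
    """Assemble one prompt for 'zero-shot', 'cot', or 'few-shot'."""
    query = f"Input: {text}\nOutput:"
    if mode == "zero-shot":
        # Task instruction and the query only, no examples.
        parts = [instruction, query]
    elif mode == "cot":
        # Same as zero-shot, plus a request to reason before answering.
        parts = [instruction, COT_SUFFIX, query]
    elif mode == "few-shot":
        # Five solved (input, output) demonstrations precede the query.
        if len(demos) != 5:
            raise ValueError("few-shot mode expects exactly 5 demonstrations")
        shots = [f"Input: {x}\nOutput: {y}" for x, y in demos]
        parts = [instruction, *shots, query]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return "\n\n".join(parts)
```

<p>Keeping the instruction and query identical across modes means any score difference can be attributed to the inference strategy rather than to prompt wording.</p>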

<h2>🚀 How to Evaluate Your Model</h2>
<h3>Option 1: Run Inference Locally</h3>
<ol>
    <li>Download the <a href="https://huggingface.co/datasets/YLab-Open/BRIDGE-Open">BRIDGE-Open dataset</a></li>
    <li>Run inference on your model</li>
    <li>Save predictions in the "pred" field for each sample</li>
    <li>Submit results via <a href="https://forms.gle/gU3GjSn9SqJRvs3b9">Google Form</a></li>
</ol>
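
<p>Steps 2 and 3 might look roughly like the sketch below. It assumes each task is available as a list of JSON-like records; the <code>"input"</code> field name, the helper names, and the output file layout are assumptions for illustration, so check the dataset card for the actual schema. Only the <code>"pred"</code> field name comes from the instructions above.</p>

```python
import json
from typing import Callable, Dict, List


def add_predictions(samples: List[Dict],
                    predict: Callable[[Dict], str]) -> List[Dict]:
    """Return copies of the samples with the model output stored under 'pred'."""
    # Copy each record so the loaded data is left untouched.
    return [{**sample, "pred": predict(sample)} for sample in samples]


def save_predictions(samples: List[Dict], path: str) -> None:
    """Write the records as UTF-8 JSON (assumed layout: one list per file)."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-English clinical text readable.
        json.dump(samples, f, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Stand-in records and a stand-in model: replace both with the real
    # dataset samples and your own inference call.
    demo = [{"id": 1, "input": "example note"}]
    results = add_predictions(demo, lambda s: "placeholder answer")
    save_predictions(results, "predictions.json")
```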

<h3>Option 2: Request Evaluation</h3>
<p>Submit your model details via <a href="https://forms.gle/gU3GjSn9SqJRvs3b9">Google Form</a> and we'll evaluate it for you.</p>
<p><strong>Note</strong>: Due to computational constraints, there may be delays in processing submissions.</p>

<h2>🔍 Key Differences from Full BRIDGE</h2>
<table border="1" style="border-collapse: collapse; width: 100%;">
    <tr>
        <th>Feature</th>
        <th>BRIDGE (Full)</th>
        <th>BRIDGE-OPEN</th>
    </tr>
    <tr>
        <td><strong>Datasets</strong></td>
        <td>87 tasks</td>
        <td>50+ open-access tasks</td>
    </tr>
    <tr>
        <td><strong>Data Access</strong></td>
        <td>Mixed (open + regulated)</td>
        <td>100% open access</td>
    </tr>
    <tr>
        <td><strong>Reproducibility</strong></td>
        <td>Limited by data access</td>
        <td>Fully reproducible</td>
    </tr>
    <tr>
        <td><strong>Use Case</strong></td>
        <td>Comprehensive evaluation</td>
        <td>Open research &amp; development</td>
    </tr>
</table>

<h2>🤝 Contributing</h2>
<p>Have an open clinical dataset to add? Submit it through our <a href="https://forms.gle/gU3GjSn9SqJRvs3b9">Google Form</a>!</p>

<h2>📬 Contact</h2>
<p>If you have any questions about BRIDGE or the leaderboard, feel free to contact us!</p>
<ul>
    <li><strong>Leaderboard Managers</strong>:
        <ul>
            <li>Jiageng Wu (<a href="mailto:[email protected]">[email protected]</a>)</li>
            <li>Kevin Xie (<a href="mailto:[email protected]">[email protected]</a>)</li>
            <li>Bowen Gu (<a href="mailto:[email protected]">[email protected]</a>)</li>
        </ul>
    </li>
    <li><strong>Benchmark Managers</strong>: Jiageng Wu, Bowen Gu</li>
    <li><strong>Project Lead</strong>: Prof. Jie Yang (<a href="mailto:[email protected]">[email protected]</a>)</li>
</ul>

<h2>📚 Citation</h2>
<p>If you find this leaderboard useful for your research and applications, please cite the following papers:</p>
<pre><code>@article{BRIDGE-benchmark,
    title={BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text},
    author={Wu, Jiageng and Gu, Bowen and Zhou, Ren and Xie, Kevin and Snyder, Doug and Jiang, Yixing and Carducci, Valentina and Wyss, Richard and Desai, Rishi J and Alsentzer, Emily and Celi, Leo Anthony and Rodman, Adam and Schneeweiss, Sebastian and Chen, Jonathan H. and Romero-Brufau, Santiago and Lin, Kueiyu Joshua and Yang, Jie},
    year={2025},
    journal={arXiv preprint arXiv:2504.19467},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2504.19467}
}

@article{clinical-text-review,
    title={Clinical text datasets for medical artificial intelligence and large language models---a systematic review},
    author={Wu, Jiageng and Liu, Xiaocong and Li, Minghui and Li, Wanxin and Su, Zichang and Lin, Shixu and Garay, Lucas and Zhang, Zhiyun and Zhang, Yujie and Zeng, Qingcheng and Shen, Jie and Yuan, Changzheng and Yang, Jie},
    journal={NEJM AI},
    volume={1},
    number={6},
    pages={AIra2400012},
    year={2024},
    publisher={Massachusetts Medical Society}
}
</code></pre>

<p>If you use the datasets in BRIDGE, please also cite the original papers for those datasets, which are listed in our BRIDGE paper.</p>

<hr>

<p><em>BRIDGE-OPEN is maintained by the <a href="https://ylab.top/">Y-Lab</a> team at Harvard Medical School and Brigham and Women's Hospital.</em></p>

</div>
<!-- ----------  End of Page Content  ---------- -->