# 🚀 Leaderboard Generation Scripts

This directory contains scripts to automatically generate leaderboard JSON files from Excel data.

## ⚡ Quick Start

### 1. Install Dependencies

```bash
pip install -r scripts/requirements.txt
```

Required packages:
- `pandas>=2.0.0` - For reading Excel files and data manipulation
- `openpyxl>=3.1.0` - For reading .xlsx Excel files

### 2. Configure Excel Path

Open `scripts/config.py` and update the `EXCEL_PATH` variable:

```python
EXCEL_PATH = BASE_DIR / "Clinical Benchmark and LLM.xlsx"  # Place in project root
# OR
EXCEL_PATH = Path("/Users/yourname/Desktop/benchmark.xlsx")  # Use absolute path
```

### 3. Run the Script

From the **project root** directory:

```bash
python scripts/main.py
```

Or with Python 3 explicitly:

```bash
python3 scripts/main.py
```

That's it! The script will automatically:
- ✅ Read the Excel file
- ✅ Generate all three leaderboards (Zero-Shot, Few-Shot, CoT)
- ✅ Update task information
- ✅ Calculate and update rankings
- ✅ Save everything to the `leaderboards/` directory

## ⚙️ Configuration

All settings are in `scripts/config.py`:

### Required Configuration

**`EXCEL_PATH`** - Path to your Excel file containing model and task data
```python
# In project root (recommended)
EXCEL_PATH = BASE_DIR / "Clinical Benchmark and LLM.xlsx"

# Absolute path
EXCEL_PATH = Path("/Users/yourname/Desktop/Clinical Benchmark and LLM.xlsx")

# In a subdirectory
EXCEL_PATH = BASE_DIR / "data" / "benchmark.xlsx"
```

### Optional Configuration

**`INVALID_MODELS`** - Models to exclude from leaderboards
```python
INVALID_MODELS = [
    "gemma-3-27b-pt",
    "Hulu-Med-7B",
    # Add model names that should not appear
]
```
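
As a minimal sketch of how such an exclusion list might be applied (the `drop_invalid` helper is hypothetical, not the script's actual function), note that matching is exact and case-sensitive, so a differently cased name is not excluded:

```python
# Hypothetical illustration; the real filtering lives in the helper modules.
INVALID_MODELS = ["gemma-3-27b-pt", "Hulu-Med-7B"]


def drop_invalid(model_names, invalid=INVALID_MODELS):
    """Return the names not listed in `invalid` (exact, case-sensitive match)."""
    blocked = set(invalid)
    return [name for name in model_names if name not in blocked]


kept = drop_invalid(["gpt-4o", "Hulu-Med-7B", "hulu-med-7b"])
print(kept)  # ['gpt-4o', 'hulu-med-7b']
```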

**Output paths** (usually don't need to change):
- `ZERO_SHOT_OUTPUT` - Zero-Shot leaderboard path
- `FEW_SHOT_OUTPUT` - Few-Shot leaderboard path  
- `COT_OUTPUT` - Chain-of-Thought leaderboard path
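
For orientation, here is a sketch of how these settings could all derive from one project root. The `build_paths` function is hypothetical and the real `config.py` may be laid out differently; the variable names and output file names are the ones used in this README.

```python
from pathlib import Path


def build_paths(base_dir: Path) -> dict:
    """Derive every configured path from a single project root."""
    leaderboards = base_dir / "leaderboards"
    return {
        "EXCEL_PATH": base_dir / "Clinical Benchmark and LLM.xlsx",
        "ZERO_SHOT_OUTPUT": leaderboards / "Zero-Shot_leaderboard.json",
        "FEW_SHOT_OUTPUT": leaderboards / "Few-Shot_leaderboard.json",
        "COT_OUTPUT": leaderboards / "CoT_leaderboard.json",
    }


# Assumption: the project root is the current working directory.
paths = build_paths(Path.cwd())
for name, path in paths.items():
    print(f"{name}: {path}")
```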

## 📋 Complete Update Workflow

1. **Get your Excel file**
   - Download/obtain the latest "Clinical Benchmark and LLM.xlsx"
   - Place it in the project root or note its location

2. **Update configuration**
   ```bash
   # Edit scripts/config.py
   # Set EXCEL_PATH to your file location
   # Add any models to INVALID_MODELS if needed
   ```

3. **Run the generation script**
   ```bash
   python scripts/main.py
   ```

4. **Verify the output**
   - Check `leaderboards/Zero-Shot_leaderboard.json`
   - Check `leaderboards/Few-Shot_leaderboard.json`
   - Check `leaderboards/CoT_leaderboard.json`
   - Check `task_information.json`

5. **Test locally**
   ```bash
   python app.py
   # Open browser to test the leaderboard interface
   ```

6. **Deploy**
   - Commit and push to GitHub
   - Deploy to Hugging Face Spaces
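
The verification in step 4 can be scripted. The sketch below is not part of the repository; it assumes the four files sit at the paths listed in step 4, relative to the project root, and only checks that each one exists and parses as JSON.

```python
import json
from pathlib import Path

# Assumption: paths are relative to the project root, as listed in step 4.
EXPECTED_OUTPUTS = [
    "leaderboards/Zero-Shot_leaderboard.json",
    "leaderboards/Few-Shot_leaderboard.json",
    "leaderboards/CoT_leaderboard.json",
    "task_information.json",
]


def check_outputs(root, expected=EXPECTED_OUTPUTS):
    """Map each expected file to 'ok', 'missing', or 'invalid JSON'."""
    report = {}
    for relative in expected:
        path = Path(root) / relative
        if not path.is_file():
            report[relative] = "missing"
            continue
        try:
            with open(path, encoding="utf-8") as handle:
                json.load(handle)
            report[relative] = "ok"
        except json.JSONDecodeError:
            report[relative] = "invalid JSON"
    return report


if __name__ == "__main__":
    for name, status in check_outputs(".").items():
        print(f"{status:>12}  {name}")
```

Run it from the project root after `python scripts/main.py`; any line not marked `ok` points at a file to inspect.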

## πŸ“ Files Overview

- **`config.py`** - Central configuration file ⚠️ **EDIT THIS FILE**
- **`main.py`** - Main script that orchestrates leaderboard generation
- **`requirements.txt`** - Python dependencies for the scripts
- **`README.md`** - This file
- **`helpers/`** - Helper modules for processing Excel data
  - `excel_processor.py` - Processes Excel files and creates leaderboards
  - `reorganize_indices.py` - Reorganizes model indices by size
  - `CONSTANTS.py` - Constants for data mapping (task names, domain mappings, etc.)
  - `leaderboards.py` - Placeholder for future leaderboard operations
  - `__init__.py` - Makes helpers a Python package

## 🤝 Sharing This Code

When sharing this code with others:
1. They only need to update `scripts/config.py` with their Excel file path
2. All other files will automatically use the configured paths
3. No need to search through multiple files to update paths
4. The script validates the Excel file exists before running

## πŸ› Troubleshooting

### "Excel file not found" error
```
❌ ERROR: Excel file not found!
```
**Solution**: 
- Check that `EXCEL_PATH` in `scripts/config.py` points to a valid file
- Verify the file exists at that location
- Use absolute paths if relative paths don't work
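
To see how a given path value resolves, a small diagnostic along these lines can help. `describe_excel_path` is a hypothetical helper, not the script's own validation; it prints the absolute path that a relative value resolves to from the current working directory.

```python
from pathlib import Path


def describe_excel_path(path):
    """Return a short diagnosis for a candidate EXCEL_PATH value."""
    candidate = Path(path).expanduser()
    if not candidate.exists():
        # Relative paths resolve against the current working directory.
        return f"not found: {candidate.resolve()}"
    if not candidate.is_file():
        return f"not a file: {candidate.resolve()}"
    if candidate.suffix.lower() != ".xlsx":
        return f"unexpected extension {candidate.suffix!r}: {candidate.resolve()}"
    return f"ok: {candidate.resolve()}"


print(describe_excel_path("Clinical Benchmark and LLM.xlsx"))
```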

### "Missing models" or unexpected output
**Solution**:
- Verify that model names in `INVALID_MODELS` match exactly (case-sensitive)
- Check that the Excel file has the required sheets:
  - "Models (Simplified)" - contains model information
  - "B-CLF", "B-EXT", "B-GEN" - for Zero-Shot
  - "B-CLF-5shot", "B-EXT-5shot", "B-GEN-5shot" - for Few-Shot
  - "B-CLF-CoT", "B-EXT-CoT", "B-GEN-CoT" - for CoT
  - "Task-all" - for task information
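
A quick way to find out which of these sheets a workbook lacks is sketched below. Both helper names are hypothetical. `sheets_in_workbook` relies on `pandas.ExcelFile(...).sheet_names` and therefore needs pandas and openpyxl installed; `missing_sheets` is plain Python.

```python
REQUIRED_SHEETS = [
    "Models (Simplified)",
    "B-CLF", "B-EXT", "B-GEN",
    "B-CLF-5shot", "B-EXT-5shot", "B-GEN-5shot",
    "B-CLF-CoT", "B-EXT-CoT", "B-GEN-CoT",
    "Task-all",
]


def missing_sheets(available, required=REQUIRED_SHEETS):
    """Return the required sheet names that are absent, in the listed order."""
    present = set(available)
    return [name for name in required if name not in present]


def sheets_in_workbook(path):
    """List sheet names in an .xlsx file (needs pandas and openpyxl)."""
    import pandas as pd  # imported lazily so missing_sheets works without it

    with pd.ExcelFile(path) as workbook:
        return list(workbook.sheet_names)


# Example with a workbook that only has the Zero-Shot sheets:
example = ["Models (Simplified)", "B-CLF", "B-EXT", "B-GEN", "Task-all"]
print(missing_sheets(example))
```

With a real file, `missing_sheets(sheets_in_workbook(EXCEL_PATH))` should return an empty list. Sheet names are compared exactly, so stray spaces or different capitalization count as missing.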

### Import errors
```
ModuleNotFoundError: No module named 'pandas'
```
**Solution**: 
- Install the required packages: `pip install -r scripts/requirements.txt`
- Make sure you're using the correct Python environment

### Running from wrong directory
```
ModuleNotFoundError: No module named 'helpers'
```
**Solution**: 
- Always run from the **project root**: `python scripts/main.py`
- Not from inside the scripts directory: ❌ `cd scripts && python main.py`

## 💡 Excel File Requirements

Your Excel file must contain:

### Required Sheets:
1. **Models (Simplified)** - Model metadata
   - Columns: Name, Domain, License, Size (B)
   
2. **Task Sheets** (for each leaderboard type):
   - Zero-Shot: B-CLF, B-EXT, B-GEN
   - Few-Shot: B-CLF-5shot, B-EXT-5shot, B-GEN-5shot
   - CoT: B-CLF-CoT, B-EXT-CoT, B-GEN-CoT
   
3. **Task-all** - Task metadata
   - Columns: Task name, Language, Task Type, Clinical context, Data Access, etc.

### Model Name Handling:
The script automatically handles some model name variations:
- `gpt-35-turbo-0125` → `gpt-35-turbo`
- `gpt-4o-0806` → `gpt-4o`
- `gemini-2.0-flash-001` → `gemini-2.0-flash`
- And more (see `excel_processor.py` for full list)
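
One simple way to implement this kind of normalization is a lookup table, sketched here with only the three mappings listed above. `NAME_ALIASES` and `normalize_model_name` are hypothetical names, and stripping surrounding whitespace is an added assumption; the authoritative list and logic are in `excel_processor.py`.

```python
# Only the three mappings listed above; the full list is in excel_processor.py.
NAME_ALIASES = {
    "gpt-35-turbo-0125": "gpt-35-turbo",
    "gpt-4o-0806": "gpt-4o",
    "gemini-2.0-flash-001": "gemini-2.0-flash",
}


def normalize_model_name(name):
    """Map a versioned model name to its canonical form, if one is known."""
    cleaned = name.strip()  # assumption: ignore stray spaces from Excel cells
    return NAME_ALIASES.get(cleaned, cleaned)


print(normalize_model_name("gpt-4o-0806"))   # gpt-4o
print(normalize_model_name("some-model"))    # some-model (unchanged)
```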

## 🎯 What the Script Does

1. **Validates** the Excel file exists
2. **Loads** model information from "Models (Simplified)" sheet
3. **Processes** each leaderboard type (Zero-Shot, Few-Shot, CoT):
   - Extracts performance data from task sheets
   - Calculates average performance
   - Generates JSON with model info and scores
4. **Reorganizes** model indices by size (smallest to largest)
5. **Updates** rankings based on average performance
6. **Creates** task_information.json with metadata
7. **Saves** all output files to the `leaderboards/` directory
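
Steps 3 to 5 can be pictured with a small sketch. The field names (`scores`, `size_b`, `average`, `index`) are hypothetical, `T` stands for the rankings column, and skipping missing scores when averaging is an assumption; the actual logic lives in `excel_processor.py` and `reorganize_indices.py` and may differ.

```python
def rank_and_order(models):
    """Average scores, rank by average (1 = best), then order rows by size."""
    rows = []
    for model in models:
        scores = [s for s in model["scores"] if s is not None]
        average = sum(scores) / len(scores) if scores else None
        rows.append({**model, "average": average, "T": None})

    # Rank only models that have at least one score.
    scored = [row for row in rows if row["average"] is not None]
    scored.sort(key=lambda row: row["average"], reverse=True)
    for position, row in enumerate(scored, start=1):
        row["T"] = position

    # Re-index from smallest to largest model.
    rows.sort(key=lambda row: row["size_b"])
    for index, row in enumerate(rows):
        row["index"] = index
    return rows


demo = rank_and_order([
    {"name": "model-a", "size_b": 70, "scores": [80.0, 90.0]},
    {"name": "model-b", "size_b": 7, "scores": [60.0, None]},
    {"name": "model-c", "size_b": 13, "scores": [90.0, 95.0]},
])
for row in demo:
    print(row["index"], row["name"], row["average"], row["T"])
```

In this toy input the rows come out ordered by size (`model-b`, `model-c`, `model-a`) while the rank follows the average, so `model-c` holds rank 1 even though it is second in the list.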

## πŸ“ Notes

- The script preserves model order by size within each leaderboard
- Rankings (T column) are updated based on average performance
- Invalid models are excluded before processing
- All JSON files are formatted with 4-space indentation
- The script uses UTF-8 encoding to support non-ASCII characters
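
The last two notes correspond to a standard pattern with Python's `json` module, sketched below. `save_leaderboard` is a hypothetical helper, and passing `ensure_ascii=False` (which writes non-ASCII characters as-is instead of `\uXXXX` escapes) is an assumption about how the script is configured.

```python
import json
import tempfile
from pathlib import Path


def save_leaderboard(data, path):
    """Write `data` as UTF-8 JSON with 4-space indentation."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=4, ensure_ascii=False)


# Demo in a temporary directory so nothing in the project is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "leaderboards" / "demo.json"
    save_leaderboard({"Model": "démo", "T": 1}, target)
    text = target.read_text(encoding="utf-8")

print(text)
```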