BRIDGE-OPEN-Leaderboard

EXCEL_PATH = BASE_DIR / "Clinical Benchmark and LLM.xlsx"  # Place in project root
# OR
EXCEL_PATH = Path("/Users/yourname/Desktop/benchmark.xlsx")  # Use absolute path

3. Run the Script

From the project root directory:

python scripts/main.py

Or with Python 3 explicitly:

python3 scripts/main.py

That's it! The script will automatically:

✅ Read the Excel file
✅ Generate all three leaderboards (Zero-Shot, Few-Shot, CoT)
✅ Update task information
✅ Calculate and update rankings
✅ Save everything to the leaderboards/ directory

⚙️ Configuration

All settings are in scripts/config.py:

Required Configuration

EXCEL_PATH - Path to your Excel file containing model and task data

# In project root (recommended)
EXCEL_PATH = BASE_DIR / "Clinical Benchmark and LLM.xlsx"

# Absolute path
EXCEL_PATH = Path("/Users/yourname/Desktop/Clinical Benchmark and LLM.xlsx")

# In a subdirectory
EXCEL_PATH = BASE_DIR / "data" / "benchmark.xlsx"
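
The three variants above differ only in how the path is composed. As a rough sketch of that composition with pathlib: the helper name `resolve_excel_path` is hypothetical, and `BASE_DIR` is set to the current directory purely for illustration (the real value comes from scripts/config.py).

```python
from pathlib import Path

def resolve_excel_path(base_dir, configured):
    """Return `configured` unchanged if absolute, else joined onto `base_dir`."""
    configured = Path(configured)
    if configured.is_absolute():
        return configured
    return Path(base_dir) / configured

# Hypothetical project root, for illustration only.
BASE_DIR = Path.cwd()
in_root = resolve_excel_path(BASE_DIR, "Clinical Benchmark and LLM.xlsx")
in_subdir = resolve_excel_path(BASE_DIR, Path("data") / "benchmark.xlsx")
```

An absolute path passes through untouched, which is why the absolute-path form works regardless of where the project lives.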

Optional Configuration

INVALID_MODELS - Models to exclude from leaderboards

INVALID_MODELS = [
    "gemma-3-27b-pt",
    "Hulu-Med-7B",
    # Add model names that should not appear
]
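
A minimal sketch of how such an exclusion list could be applied. The function name `filter_models` and the record layout (a dict with a "Name" key) are assumptions for illustration; the actual filtering lives in the helper modules.

```python
INVALID_MODELS = [
    "gemma-3-27b-pt",
    "Hulu-Med-7B",
]

def filter_models(records, invalid=INVALID_MODELS):
    """Drop records whose "Name" exactly matches an excluded model name."""
    excluded = set(invalid)  # exact, case-sensitive comparison
    return [r for r in records if r["Name"] not in excluded]

models = [{"Name": "Hulu-Med-7B"}, {"Name": "hulu-med-7b"}, {"Name": "gpt-4o"}]
kept = filter_models(models)
```

Note that "hulu-med-7b" survives the filter because the comparison is case-sensitive, so names in INVALID_MODELS need to match the Excel file exactly.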

Output paths (these usually do not need to change):

ZERO_SHOT_OUTPUT - Zero-Shot leaderboard path
FEW_SHOT_OUTPUT - Few-Shot leaderboard path
COT_OUTPUT - Chain-of-Thought leaderboard path

📋 Complete Update Workflow

1. Get your Excel file
- Download/obtain the latest "Clinical Benchmark and LLM.xlsx"
- Place it in the project root or note its location

2. Update configuration

# Edit scripts/config.py
# Set EXCEL_PATH to your file location
# Add any models to INVALID_MODELS if needed

3. Run the generation script
```
python scripts/main.py
```

4. Verify the output
- Check leaderboards/Zero-Shot_leaderboard.json
- Check leaderboards/Few-Shot_leaderboard.json
- Check leaderboards/CoT_leaderboard.json
- Check task_information.json

5. Test locally

python app.py
# Open browser to test the leaderboard interface

6. Deploy
- Commit and push to GitHub
- Deploy to Hugging Face Spaces
📁 Files Overview

config.py - Central configuration file ⚠️ EDIT THIS FILE
main.py - Main script that orchestrates leaderboard generation
requirements.txt - Python dependencies for the scripts
README.md - This file
helpers/ - Helper modules for processing Excel data
- excel_processor.py - Processes Excel files and creates leaderboards
- reorganize_indices.py - Reorganizes model indices by size
- CONSTANTS.py - Constants for data mapping (task names, domain mappings, etc.)
- leaderboards.py - Placeholder for future leaderboard operations
- __init__.py - Makes helpers a Python package

🤝 Sharing This Code

When sharing this code with others:

- They only need to update scripts/config.py with their Excel file path
- All other files automatically use the configured paths
- There is no need to search through multiple files to update paths
- The script validates that the Excel file exists before running

🐛 Troubleshooting

"Excel file not found" error

❌ ERROR: Excel file not found!

Solution:

Check that EXCEL_PATH in scripts/config.py points to a valid file
Verify the file exists at that location
Use absolute paths if relative paths don't work
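
When the error persists, it helps to see what the configured path actually resolves to. This is a small diagnostic sketch, assuming nothing about the script itself; `describe_path` is a hypothetical helper.

```python
from pathlib import Path

def describe_path(excel_path):
    """Summarize where a configured path points and whether a file is there."""
    path = Path(excel_path).expanduser()
    resolved = path.resolve()  # relative paths resolve against the current directory
    return {
        "configured": str(excel_path),
        "resolved": str(resolved),
        "exists": resolved.is_file(),
        "is_xlsx": resolved.suffix.lower() == ".xlsx",
    }
```

If "resolved" is not the location you expected, the relative path is being interpreted from a different working directory, which is the usual reason an absolute path fixes the problem.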

"Missing models" or unexpected output

Solution:

Verify that model names in INVALID_MODELS match exactly (case-sensitive)
Check that the Excel file has the required sheets:
- "Models (Simplified)" - contains model information
- "B-CLF", "B-EXT", "B-GEN" - for Zero-Shot
- "B-CLF-5shot", "B-EXT-5shot", "B-GEN-5shot" - for Few-Shot
- "B-CLF-CoT", "B-EXT-CoT", "B-GEN-CoT" - for CoT
- "Task-all" - for task information
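
The sheet checklist above can be turned into a quick check. The sketch below is pure Python and works on any list of sheet names; `missing_sheets` is a hypothetical helper, not something the scripts are known to provide.

```python
REQUIRED_SHEETS = [
    "Models (Simplified)",
    "B-CLF", "B-EXT", "B-GEN",
    "B-CLF-5shot", "B-EXT-5shot", "B-GEN-5shot",
    "B-CLF-CoT", "B-EXT-CoT", "B-GEN-CoT",
    "Task-all",
]

def missing_sheets(sheet_names):
    """Return the required sheets absent from sheet_names, in checklist order."""
    present = set(sheet_names)
    return [name for name in REQUIRED_SHEETS if name not in present]

# With pandas installed, the workbook's sheet names can be listed with
# pandas.ExcelFile(EXCEL_PATH).sheet_names and passed to missing_sheets().
```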

Import errors

ModuleNotFoundError: No module named 'pandas'

Solution:

Install the required packages: pip install -r scripts/requirements.txt
Make sure you're using the correct Python environment

Running from wrong directory

ModuleNotFoundError: No module named 'helpers'

Solution:

Always run from the project root: python scripts/main.py
Do not run it from inside the scripts directory: ❌ cd scripts && python main.py

💡 Excel File Requirements

Your Excel file must contain:

Required Sheets:

Models (Simplified) - Model metadata
- Columns: Name, Domain, License, Size (B)
Task Sheets (for each leaderboard type):
- Zero-Shot: B-CLF, B-EXT, B-GEN
- Few-Shot: B-CLF-5shot, B-EXT-5shot, B-GEN-5shot
- CoT: B-CLF-CoT, B-EXT-CoT, B-GEN-CoT
Task-all - Task metadata
- Columns: Task name, Language, Task Type, Clinical context, Data Access, etc.
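
Column headers can be checked in the same spirit. This sketch covers only the columns named above (the Task-all list ends in "etc.", so it is incomplete by design); `missing_columns` is a hypothetical helper, and stripping whitespace from headers is an assumption about a common source of mismatches rather than documented script behavior.

```python
REQUIRED_COLUMNS = {
    "Models (Simplified)": ["Name", "Domain", "License", "Size (B)"],
    # Documented subset only; the full Task-all column list is not given here.
    "Task-all": ["Task name", "Language", "Task Type", "Clinical context", "Data Access"],
}

def missing_columns(sheet, columns):
    """Return documented columns for `sheet` that are absent from `columns`."""
    present = {str(c).strip() for c in columns}  # tolerate stray spaces in headers
    return [c for c in REQUIRED_COLUMNS.get(sheet, []) if c not in present]
```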

Model Name Handling:

The script automatically handles some model name variations:

gpt-35-turbo-0125 → gpt-35-turbo
gpt-4o-0806 → gpt-4o
gemini-2.0-flash-001 → gemini-2.0-flash
And more (see excel_processor.py for full list)
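
One simple way to implement this kind of normalization is a lookup table. The sketch below includes only the three mappings listed above; the names `NAME_ALIASES` and `normalize_model_name` are illustrative, and the real logic in excel_processor.py may differ.

```python
# Only the three documented mappings; the full list lives in excel_processor.py.
NAME_ALIASES = {
    "gpt-35-turbo-0125": "gpt-35-turbo",
    "gpt-4o-0806": "gpt-4o",
    "gemini-2.0-flash-001": "gemini-2.0-flash",
}

def normalize_model_name(name):
    """Map a versioned model name to its canonical form; unknown names pass through."""
    cleaned = name.strip()
    return NAME_ALIASES.get(cleaned, cleaned)
```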

🎯 What the Script Does

1. Validates that the Excel file exists
2. Loads model information from the "Models (Simplified)" sheet
3. Processes each leaderboard type (Zero-Shot, Few-Shot, CoT):
- Extracts performance data from task sheets
- Calculates average performance
- Generates JSON with model info and scores
4. Reorganizes model indices by size (smallest to largest)
5. Updates rankings based on average performance
6. Creates task_information.json with metadata
7. Saves all output files to the leaderboards/ directory
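
The averaging, ranking, and size-ordering steps above can be illustrated with a small pure-Python sketch. The record layout, the treatment of missing scores (skipped when averaging), and the assumption that every model has a numeric size are choices made for this example, not confirmed behavior of the scripts.

```python
def build_leaderboard(models):
    """models: dicts with "Name", "Size (B)" and a "scores" list (None = missing)."""
    rows = []
    for m in models:
        valid = [s for s in m["scores"] if s is not None]
        average = sum(valid) / len(valid) if valid else None
        rows.append({"Name": m["Name"], "Size (B)": m["Size (B)"], "Average": average})

    # Rank 1 goes to the highest average; models without any score stay unranked.
    scored = sorted((r for r in rows if r["Average"] is not None),
                    key=lambda r: r["Average"], reverse=True)
    for rank, row in enumerate(scored, start=1):
        row["T"] = rank
    for row in rows:
        row.setdefault("T", None)

    # Final order is by model size, smallest first, independent of rank.
    return sorted(rows, key=lambda r: r["Size (B)"])
```

The point of the sketch is that row order and rank are separate: a large model can hold rank 1 while still appearing last in the file.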

📝 Notes

The script preserves model order by size within each leaderboard
Rankings (T column) are updated based on average performance
Invalid models are excluded before processing
All JSON files are formatted with 4-space indentation
The script uses UTF-8 encoding to support non-ASCII characters
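
For reference, the last two notes correspond to standard-library options on `json.dump` and `open`. A minimal sketch follows; `save_json` is a hypothetical helper, and `ensure_ascii=False` is an assumption about how non-ASCII characters are kept readable in the output.

```python
import json
from pathlib import Path

def save_json(data, output_path):
    """Write data as UTF-8 JSON with 4-space indentation."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False (an assumption here) writes non-ASCII text as-is
        # instead of \uXXXX escapes.
        json.dump(data, fh, indent=4, ensure_ascii=False)
```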