# MedGemma-3-ROCO-FineTuned
This model is a fine-tuned version of Google's Gemma-3-4B-IT, optimized for radiology image captioning. It was developed for the MedGemma Impact Challenge competition using a reduced ROCO-v2 dataset.
## Evaluation Results
We evaluated the model using both rule-based technical metrics and a GenAI-as-a-judge setup (Gemini 3 Flash).
| Metric Name | Value |
|---|---|
| BERTScore F1 | 0.8468 |
| BLEU | 0.0185 |
| ROUGE-L | 0.1427 |
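For intuition on the table above: ROUGE-L scores a candidate caption against a reference by the length of their longest common subsequence (LCS) of tokens. A minimal, dependency-free sketch of the ROUGE-L F1 computation (whitespace tokenization assumed; production evaluations typically use the `rouge_score` or `evaluate` packages, which also apply stemming):

```python
def lcs_len(a, b):
    # Dynamic-programming longest common subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l_f1(candidate, reference):
    # F1 over LCS-based precision (vs. candidate length) and recall (vs. reference length).
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# Hypothetical caption pair, for illustration only:
score = rouge_l_f1("small nodule in the left lung", "nodule seen in left lung")
```

BLEU, by contrast, rewards exact n-gram overlap, which is why free-form generated reports score low on it even when semantically close to the reference (as the high BERTScore suggests).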
## Model Description
- Developed by: Aldabergenov Makhambet, Zhalgasbayev Arman
- Language: English
- Model Type: Multimodal Vision-Language Model (VLM)
- Finetuned from: google/gemma-3-4b-it
- Dataset: ROCO (10k balanced radiology images)
## Usage Example
To use this model, you need the `transformers` and `accelerate` libraries.
```python
from transformers import Gemma3ForConditionalGeneration, AutoProcessor
from PIL import Image
import torch

model_id = "Aldabergenov1/medgemma_roco_fine_tuned"

model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Example inference: caption a radiology image (replace with your own file)
image = Image.open("scan.png")
messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Radiology report:"},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```