Affiliation Parsing LoRA

This model is a fine-tuned version of Qwen/Qwen3-4B, trained with Group Relative Policy Optimization (GRPO) to extract and parse author affiliations from academic paper content.

Model Description

  • Base Model: Qwen3-4B (4.0B parameters)
  • Training Method: Group Relative Policy Optimization (GRPO) with LoRA
  • Task: Author affiliation extraction and parsing from academic paper content
  • Training Data: arXiv author affiliations dataset with PDF content and corresponding author/affiliation annotations

Training Details

Training Configuration

  • Training Algorithm: GRPO Done Right (dr_grpo)
  • Learning Rate: 1e-5 with cosine scheduler and 3% warmup ratio
  • Training Epochs: 0.36 (training stopped partway through the first epoch)
  • Batch Size: 1 per device, 8 gradient accumulation steps
  • LoRA Configuration (see the configuration sketch after this list):
    • Rank (r): 8
    • Alpha: 16
    • Dropout: 0.01
    • Target modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
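
The training script is not published with this card. As a minimal sketch, the hyperparameters above map onto peft's LoraConfig and TRL's GRPOConfig as follows; the use of TRL, the output_dir, and the reading of "GRPO Done Right" as TRL's dr_grpo loss are assumptions:

from peft import LoraConfig
from trl import GRPOConfig

# LoRA adapter settings listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.01,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# GRPO settings listed above and under Training Data Processing
training_args = GRPOConfig(
    output_dir="affiliation-parsing-lora",  # hypothetical path
    loss_type="dr_grpo",                    # assumed mapping for "GRPO Done Right"
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    max_prompt_length=7000,
    max_completion_length=2000,
    bf16=True,
)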

Training Metrics

  • Total Training Steps: 890
  • Total Tokens Processed: 62,074,442
  • Final Training Loss: 0.075
  • Answer Reward: 2.21 ± 0.65
  • Format Reward: 0.925 ± 0.16

Hardware

  • GPUs: 8x NVIDIA H100 80GB HBM3
  • Training Time: ~23.9 hours (86,125 seconds)
  • Precision: bfloat16

Reward Functions

The model was trained with two reward functions (the format check is sketched after the list):

  1. Format Reward: Evaluates whether the generated output follows the expected structured format for author and affiliation data (standardized 0-1 scale)
  2. Answer Reward: Assesses the accuracy of extracted author names and affiliations compared to ground truth annotations
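
The reward implementations themselves are not published. Purely as an illustration, a 0-1 format reward of the kind described could check that a completion parses into the schema shown under System Prompt below (function and variable names here are hypothetical):

import json

def format_reward(completion: str) -> float:
    """Return 1.0 if the completion is a JSON list of
    {"name": str, "affiliations": [str, ...]} objects, else 0.0."""
    try:
        data = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(data, list):
        return 0.0
    for entry in data:
        if not isinstance(entry, dict) or not isinstance(entry.get("name"), str):
            return 0.0
        affiliations = entry.get("affiliations")
        if not isinstance(affiliations, list) or not all(
            isinstance(a, str) for a in affiliations
        ):
            return 0.0
    return 1.0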

Usage

The model processes academic paper content (up to ~6,000 tokens) and returns structured author and affiliation information. Inference is driven by the system prompt reproduced below, which instructs the model to parse author details from PDF content.

Expected Input Format

The model expects PDF content from academic papers as input, truncated to approximately 6,000 tokens for processing efficiency.
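
The card ships a LoRA adapter, so a minimal inference sketch loads the base model and attaches the adapter with peft. SYSTEM_PROMPT stands for the full prompt reproduced in the System Prompt section, and pdf_text is a placeholder for your (truncated) paper text:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-4B"
adapter_id = "cometadata/affiliation-parsing-lora-Qwen3-4B"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},  # prompt from the System Prompt section
    {"role": "user", "content": pdf_text},         # paper content, truncated to ~6,000 tokens
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=2000)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))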

Training Data Processing

  • Max Prompt Length: 7,000 tokens
  • Max Completion Length: 2,000 tokens
  • Input Truncation: PDF content truncated to 6,000 tokens during preprocessing (see the sketch below)
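
The preprocessing code is likewise unpublished; one straightforward way to enforce the 6,000-token budget with the base tokenizer is sketched below (not the authors' actual pipeline):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")

def truncate_to_tokens(text: str, max_tokens: int = 6000) -> str:
    """Keep only the first max_tokens tokens of the input text."""
    ids = tokenizer(text, truncation=True, max_length=max_tokens)["input_ids"]
    return tokenizer.decode(ids, skip_special_tokens=True)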

System Prompt

You are an expert at reading academic articles and parsing information about their affiliations. The user will show you an academic article and your job is to extract the authors and their affiliations in a structured format.

### JSON Schema

{
  "$defs": {
    "Author": {
      "description": "A single author entry with one or more institutional affiliations.",
      "properties": {
        "name": {
          "description": "Author's full name, exactly as it should appear in the publication (e.g., 'Naser Damer').",
          "title": "Name",
          "type": "string"
        },
        "affiliations": {
          "description": "Ordered list of the author\u2019s institutional affiliations. Each item should be a human-readable string such as 'Fraunhofer Institute for Computer Graphics Research IGD, Darmstadt, Germany'.",
          "items": {
            "type": "string"
          },
          "title": "Affiliations",
          "type": "array"
        }
      },
      "required": [
        "name",
        "affiliations"
      ],
      "title": "Author",
      "type": "object"
    }
  },
  "description": "List of authors in the exact order they appear on the paper.",
  "items": {
    "$ref": "#/$defs/Author"
  },
  "title": "Affiliations",
  "type": "array"
}

### Example Output

[
  {
    "affiliations": [
      "School of Mathematical and Computational Sciences North Haugh, St Andrews, Fife KY16 9SS, UK"
    ],
    "name": "M.D. Atkinson"
  },
  {
    "affiliations": [
      "Department of Mathematics University College, Galway, Eire"
    ],
    "name": "G. Pfeiffer"
  }
]

### Summary

Read the article carefully, paying attention to the authors and their affiliations. Then respond with a JSON object in the format specified above that contains the authors and their affiliations.

Performance

The model achieved strong formatting compliance:

  • Format compliance: average format reward of 0.925 ± 0.16, meaning roughly 92.5% of outputs follow the expected structured format
  • Content extraction: mean answer reward of 2.21 ± 0.65 on the unnormalized answer scale (see Limitations)
  • Consistent output: low variance in the format reward indicates reliable structured output generation

Training Infrastructure

  • Cluster: SLURM-managed HPC environment
  • Node: Single node with 8 H100 GPUs
  • Memory: 2.1TB total system memory
  • CUDA Version: 12.8

Limitations

  • Trained specifically on academic paper content for affiliation extraction
  • Input limited to ~6,000 tokens due to truncation during training
  • Performance may vary on paper formats significantly different from arXiv content
  • Reward metrics are not standardized between 0 and 1 (except format reward), making absolute performance assessment challenging

Model Output

The model generates structured author and affiliation data extracted from academic paper content, following the format patterns learned during GRPO training with the specified reward functions.
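
The JSON Schema in the system prompt has the shape that Pydantic generates, so completions can be validated with a matching model; Author and parse_output below are illustrative names, not part of this repository:

import json
from pydantic import BaseModel

class Author(BaseModel):
    name: str                # author's full name
    affiliations: list[str]  # ordered institutional affiliations

def parse_output(completion: str) -> list[Author]:
    """Validate a completion against the schema from the system prompt."""
    return [Author.model_validate(item) for item in json.loads(completion)]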
