EOT Detector - SmolLM2 135M

A fine-tuned model for End-of-Turn (EOT) detection in conversations, based on SmolLM2-135M.

Model Description

This model predicts whether a user has finished speaking in a conversation (end-of-turn) or is still continuing. It's designed for voice AI applications where accurate turn-taking is critical to avoid interrupting users.

Key Features

  • Base Model: SmolLM2-135M (135M parameters)
  • Fine-tuning Method: LoRA (r=4, alpha=8)
  • Task: Binary classification (complete vs incomplete turn)
  • Inference Speed: ~10ms on CPU

Training Details

Parameter          Value
Base Model         HuggingFaceTB/SmolLM2-135M
LoRA Rank          4
LoRA Alpha         8
Learning Rate      2e-4
Epochs             3
Training Samples   50
Hardware           T4 GPU
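The hyperparameters above map onto a peft LoraConfig roughly as follows. This is a sketch, not the published training script; in particular, target_modules is an assumption, since the card does not say which layers were adapted.

```python
from peft import LoraConfig, TaskType

# Hedged sketch of the adapter configuration implied by the table above.
# target_modules is an assumption: the actual training script is not published.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=4,              # LoRA rank from the table
    lora_alpha=8,     # LoRA alpha from the table
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
)
```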

Evaluation Results

Evaluated on Vurtnec/eot-detection-testset (30 samples):

Metric      Value
Accuracy    76.67%
Precision   100%
Recall      53.33%
F1 Score    69.57%

Classification Report

              precision    recall  f1-score   support

  Incomplete       0.68      1.00      0.81        15
    Complete       1.00      0.53      0.70        15

    accuracy                           0.77        30
   macro avg       0.84      0.77      0.75        30
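As a sanity check, the headline metrics follow directly from the confusion matrix implied by this report: for the "Complete" class, 8 true positives, 7 false negatives, 0 false positives, and 15 true negatives.

```python
# Confusion-matrix counts for the "Complete" class, read off the report above:
tp, fp, fn, tn = 8, 0, 7, 15

accuracy = (tp + tn) / (tp + fp + fn + tn)          # 23 / 30
precision = tp / (tp + fp)                          # 8 / 8
recall = tp / (tp + fn)                             # 8 / 15
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"accuracy={accuracy:.2%} precision={precision:.0%} "
      f"recall={recall:.2%} f1={f1:.2%}")
# → accuracy=76.67% precision=100% recall=53.33% f1=69.57%
```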

Analysis

  • High Precision (100%): when the model predicts "complete", it is correct on this test set
  • Lower Recall (53%): the model is conservative and misses some completed turns
  • This tradeoff suits voice AI: it is better to wait slightly longer than to interrupt the user
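That preference can be encoded as a simple turn-taking policy. The following is a hedged sketch: should_respond, silence_ms, and the 1500 ms timeout are illustrative choices, not part of the model or this card.

```python
def should_respond(label: str, silence_ms: int, timeout_ms: int = 1500) -> bool:
    """Conservative policy sketch: act on a confident "complete" right away;
    otherwise fall back to a silence timeout so missed turns still get a reply."""
    return label == "complete" or silence_ms >= timeout_ms

print(should_respond("complete", 0))       # respond immediately
print(should_respond("incomplete", 200))   # keep listening
print(should_respond("incomplete", 2000))  # silence timeout covers missed turns
```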

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and apply the LoRA adapter
base_model = "HuggingFaceTB/SmolLM2-135M"
adapter_model = "Vurtnec/eot-detector-smollm2"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()

# Format the conversation in ChatML style, ending with an open "label" turn
# that the model completes with its prediction
def format_conversation(messages):
    text = ""
    for msg in messages:
        text += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    text += "<|im_start|>label\n"
    return text

# Example
messages = [
    {"role": "user", "content": "Hi, I need help"},
    {"role": "assistant", "content": "Sure, what do you need?"},
    {"role": "user", "content": "Well, um..."}
]

input_text = format_conversation(messages)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)

# Decode only the newly generated tokens, not the prompt
result = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])

# Check for <|eot|> (complete) or <|continue|> (incomplete)
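That final check can be turned into a boolean with a small helper. This is a minimal sketch; is_end_of_turn is a hypothetical name, assuming the model emits the <|eot|>/<|continue|> labels described above.

```python
def is_end_of_turn(generated: str) -> bool:
    # Treat an explicit <|eot|> as a completed turn; anything else
    # (including <|continue|> or an unexpected label) counts as incomplete,
    # which matches the model's conservative bias toward waiting.
    return "<|eot|>" in generated

print(is_end_of_turn("<|eot|>"))       # True
print(is_end_of_turn("<|continue|>"))  # False
```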

Limitations

  • Trained on limited English data (50 samples)
  • May not generalize well to domain-specific conversations
  • Conservative prediction style (prefers "incomplete" when uncertain)

License

Apache 2.0
