# EOT Detector - SmolLM2 135M
A fine-tuned model for End-of-Turn (EOT) detection in conversations, based on SmolLM2-135M.
## Model Description
This model predicts whether a user has finished speaking in a conversation (end-of-turn) or is still continuing. It's designed for voice AI applications where accurate turn-taking is critical to avoid interrupting users.
## Key Features
- Base Model: SmolLM2-135M (135M parameters)
- Fine-tuning Method: LoRA (r=4, alpha=8)
- Task: Binary classification (complete vs incomplete turn)
- Inference Speed: ~10ms on CPU
## Training Details
| Parameter | Value |
|---|---|
| Base Model | HuggingFaceTB/SmolLM2-135M |
| LoRA Rank | 4 |
| LoRA Alpha | 8 |
| Learning Rate | 2e-4 |
| Epochs | 3 |
| Training Samples | 50 |
| Hardware | T4 GPU |
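For a sense of scale, LoRA at rank 4 adds only two small matrices, A (r×d) and B (d×r), per adapted weight. A rough parameter-count sketch follows; the hidden size of 576, 30 layers, and adapting two projections per layer are assumptions about SmolLM2-135M's architecture, not values stated in this card:

```python
# Rough LoRA trainable-parameter count. hidden_size=576, num_layers=30,
# and adapting two projections per layer are assumptions, not card values.
hidden_size = 576
num_layers = 30
rank = 4               # LoRA rank from the training table
adapted_per_layer = 2  # e.g. query and value projections

# Each adapted d x d weight gains A (rank x d) and B (d x rank)
params_per_weight = 2 * rank * hidden_size
total = params_per_weight * adapted_per_layer * num_layers
print(total)  # 276480 - roughly 0.2% of the 135M base parameters
```

Under these assumptions, the adapter trains well under 1% of the base model's weights, which is why 50 samples and 3 epochs finish quickly on a T4.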
## Evaluation Results
Evaluated on Vurtnec/eot-detection-testset (30 samples):
| Metric | Value |
|---|---|
| Accuracy | 76.67% |
| Precision | 100% |
| Recall | 53.33% |
| F1 Score | 69.57% |
### Classification Report

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Incomplete | 0.68 | 1.00 | 0.81 | 15 |
| Complete | 1.00 | 0.53 | 0.70 | 15 |
| Accuracy | | | 0.77 | 30 |
| Macro avg | 0.84 | 0.77 | 0.75 | 30 |
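The headline metrics follow directly from the confusion counts implied by the report (all 15 incomplete turns caught; 8 of 15 complete turns caught, with "complete" as the positive class). A quick sanity check:

```python
# Confusion counts implied by the classification report,
# treating "complete" as the positive class.
tp, fn = 8, 7    # complete turns predicted complete / incomplete
tn, fp = 15, 0   # incomplete turns predicted incomplete / complete

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
# 0.7666... 1.0 0.5333... 0.6956... -> matches the 76.67% / 100% / 53.33% / 69.57% table
```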
## Analysis
- High Precision (100%): every turn the model flagged as "complete" on this test set actually was complete
- Lower Recall (53%): the model is conservative and misses roughly half of the completed turns
- This trade-off is preferable for voice AI: it is better to wait slightly longer than to interrupt the user
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and LoRA adapter
base_model = "HuggingFaceTB/SmolLM2-135M"
adapter_model = "Vurtnec/eot-detector-smollm2"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)

# Format the conversation into the model's expected prompt
def format_conversation(messages):
    text = ""
    for msg in messages:
        text += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    text += "<|im_start|>label\n"
    return text

# Example
messages = [
    {"role": "user", "content": "Hi, I need help"},
    {"role": "assistant", "content": "Sure, what do you need?"},
    {"role": "user", "content": "Well, um..."}
]

input_text = format_conversation(messages)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
result = tokenizer.decode(outputs[0])
# Check the generated text for <|eot|> (complete) or <|continue|> (incomplete)
```
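The decoded output can be reduced to a binary label. A minimal helper sketch, assuming the adapter emits the `<|eot|>` / `<|continue|>` markers described above after the `<|im_start|>label` prompt:

```python
def classify_turn(decoded: str) -> str:
    """Map the model's generated text to a turn label.

    Assumes the fine-tuned model emits <|eot|> for a finished turn
    and <|continue|> for an unfinished one, as described above.
    """
    # Only inspect text generated after the label prompt
    tail = decoded.split("<|im_start|>label")[-1]
    if "<|eot|>" in tail:
        return "complete"
    if "<|continue|>" in tail:
        return "incomplete"
    # Conservative default, matching the model's own bias toward waiting
    return "incomplete"

print(classify_turn("...<|im_start|>label\n<|continue|>"))  # incomplete
```

Defaulting to "incomplete" on an unrecognized output keeps the same failure mode the Analysis section describes: the system waits rather than interrupts.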
## Datasets
- Training: Vurtnec/eot-detection-dataset (50 samples)
- Testing: Vurtnec/eot-detection-testset (30 samples)
## Limitations
- Trained on limited English data (50 samples)
- May not generalize well to domain-specific conversations
- Conservative prediction style (prefers "incomplete" when uncertain)
## License
Apache 2.0