---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
datasets:
- b-mc2/sql-create-context
---

# QLoRA Adapter for Phi-3-mini: A Technical SQL Assistant

This repository contains a QLoRA (quantized low-rank adaptation) adapter for the `microsoft/Phi-3-mini-4k-instruct` model. The adapter was fine-tuned to act as a technical assistant that generates SQL queries from natural language questions, based on a provided database schema.

This project was developed as an educational exercise to demonstrate a complete, end-to-end fine-tuning pipeline, from data preparation to model evaluation.

## Model Description

- **Base Model:** `microsoft/Phi-3-mini-4k-instruct`
- **Fine-tuning Method:** QLoRA (4-bit NormalFloat quantization)
- **Task:** Text-to-SQL Generation

The model is designed to receive a database schema (as a `CREATE TABLE` statement) and a user's question in natural language, and to generate the appropriate SQL query to answer that question.

## How to Use

First, ensure you have the necessary libraries installed:

```bash
pip install -U transformers peft accelerate bitsandbytes torch
```

Next, use the following Python code. The `AutoPeftModelForCausalLM` class will automatically load the base model (Phi-3-mini) and apply this fine-tuned adapter on top.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
import torch

# Your adapter's ID on the Hugging Face Hub
adapter_id = "manuelaschrittwieser/phi-3-mini-sql-assistant-adapter"

print("Loading model and adapter...")

# Load the fine-tuned model and adapter in one step
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# The tokenizer is also loaded from the adapter's repository
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)

print("Model loaded successfully!")

# --- Define your schema and question ---
context = "CREATE TABLE employees (name VARCHAR, department VARCHAR, salary INTEGER)"
question = "What are the names of employees in the 'Engineering' department with a salary over 80000?"

# --- Format the prompt using the Phi-3 chat template ---
prompt = f"""<|user|>
Given the database schema:
{context}

Generate the SQL query for the following request:
{question}<|end|>
<|assistant|>
"""

# --- Generate the response ---
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=False)

# --- Extract the generated SQL ---
# Decode only the newly generated tokens. Splitting the decoded text on
# "<|assistant|>" is unreliable here, because skip_special_tokens=True strips
# that token from the output; slicing off the prompt tokens avoids the issue.
generated_sql = tokenizer.decode(
    outputs[0][input_ids.shape[-1]:], skip_special_tokens=True
).strip()

print(f"\nGenerated SQL: {generated_sql}")
# Expected output: SELECT name FROM employees WHERE department = 'Engineering' AND salary > 80000
```

## Training Procedure

### Dataset

The model was fine-tuned on a 10,000-sample subset of the `b-mc2/sql-create-context` dataset. This dataset provides pairs of database schemas (`context`), natural language questions (`question`), and their corresponding SQL queries (`answer`). The subset was split into 9,000 examples for training and 1,000 for testing.

### Fine-tuning Configuration (QLoRA)

* Quantization: 4-bit NormalFloat (NF4) with bfloat16 compute dtype
* LoRA Rank (`r`): 8
* LoRA Alpha (`lora_alpha`): 16
* Target Modules: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]`

### Training Hyperparameters

* Learning Rate: 2e-4
* Epochs: 1
* Batch Size: 2 (with 4 gradient accumulation steps for an effective batch size of 8)
* Optimizer: Paged AdamW (32-bit)
* LR Scheduler: Cosine
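For reference, the configuration above maps onto the standard `transformers`/`peft` objects roughly as follows. This is a sketch rather than the exact training script used for this adapter; in particular, the trainer wiring (e.g., `trl`'s `SFTTrainer`) and the `output_dir` name are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model, with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter applied to all attention and MLP projection matrices
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Effective batch size of 8 = 2 per device x 4 gradient accumulation steps
training_args = TrainingArguments(
    output_dir="phi-3-mini-sql-assistant-adapter",  # assumed name, not from the card
    learning_rate=2e-4,
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
)
```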
## Evaluation

The model's primary goal was to generate syntactically correct and logically sound SQL queries. A qualitative evaluation on the held-out test set showed that the model successfully learned this task, often producing the exact ground-truth SQL.

However, after only one epoch of training, the model sometimes fails to adhere perfectly to the chat template format (e.g., omitting the `<|assistant|>` token), even when the generated SQL itself is correct. Further training would likely improve this formatting consistency.

## Limitations and Bias

* This model is a proof-of-concept and was trained on a limited dataset for a short duration. It may not generalize well to complex, unseen database schemas or highly intricate SQL queries.
* The model is not designed to be a chatbot and has no knowledge outside of the Text-to-SQL domain it was trained on.
* The training data may contain biases, and the model's outputs should be reviewed before execution in a production environment, especially to prevent potential SQL injection vulnerabilities (see the sanity-check sketch below).
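As a lightweight first line of defense against the execution risk noted above, a generated query can be parse-checked against the schema before it touches real data. The sketch below is illustrative and not part of this repository; it assumes a SQLite-compatible schema and uses Python's built-in `sqlite3` module, where `EXPLAIN` prepares a statement without executing it. The `is_valid_sql` helper is a hypothetical name.

```python
import sqlite3

def is_valid_sql(schema: str, query: str) -> bool:
    """Parse-check a generated query against the schema in a throwaway in-memory DB."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)        # build the tables from the CREATE TABLE context
        conn.execute(f"EXPLAIN {query}")  # parses and plans the query without running it
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

context = "CREATE TABLE employees (name VARCHAR, department VARCHAR, salary INTEGER)"
print(is_valid_sql(context, "SELECT name FROM employees WHERE salary > 80000"))  # True
print(is_valid_sql(context, "SELECT nme FROM employees"))                        # False (unknown column)
```

A parse check of this kind catches syntax errors and references to nonexistent tables or columns, but not logically wrong queries; human review remains advisable before production use.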