Lab 3 — LoRA fine-tune of a small open model¶
Goal. Fine-tune Mistral-7B or Llama-3.1-8B with LoRA on a domain dataset. Compare zero-shot vs fine-tuned on a 50-example task-specific eval.
What you ship. Notebook with LoRA adapter weights, training loss curve, and a side-by-side eval table on a held-out 50-example test set.
Setup¶
Install the dependencies (one-time).
In [ ]:
# !pip install torch transformers peft accelerate bitsandbytes datasets
In [ ]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model, TaskType
from datasets import load_dataset
from trl import SFTTrainer
torch.manual_seed(42)
Pick a domain dataset¶
The default below is a small medical-Q&A dataset; substitute your own if you have one.
In [ ]:
DATASET = 'medalpaca/medical_meadow_medqa'
BASE_MODEL = 'mistralai/Mistral-7B-Instruct-v0.2'
ds = load_dataset(DATASET, split='train').select(range(1000))
test = load_dataset(DATASET, split='train').select(range(1000, 1050))
print('train:', len(ds), 'test:', len(test))
Exercise 1 — Zero-shot baseline¶
In [ ]:
# YOUR TURN
# Load BASE_MODEL in 4-bit. Run zero-shot on the 50-example test set.
# Record exact-match accuracy.
Exercise 2 — LoRA fine-tune¶
In [ ]:
# YOUR TURN
# Configure LoraConfig(r=16, lora_alpha=32, target_modules=['q_proj','v_proj']).
# Train with SFTTrainer for 200-500 steps on the 1000-example train set.
Exercise 3 — Side-by-side eval¶
In [ ]:
# YOUR TURN
# Re-run on the same 50-example test set with the fine-tuned model.
# Print a table: base accuracy vs fine-tuned accuracy.
Done?¶
Submit per the cohort schedule. Peer review pairing announced the following Monday.