Projects
Things I've built and research I've worked on.
-
Database Reporting Agent: a multi-agent text-to-SQL pipeline
Translates natural-language questions into SQL over an enterprise database of 15+ tables, with schema resolution, query generation, guardrails, validation, caching, and an evaluation suite.
-
Importance-weighted fine-tuning for relation extraction
Implementing ATLANTIS (Liu et al., ACL 2025), an importance-weighted weak-to-strong fine-tuning method, from scratch and applying it to sentence-level relation extraction with encoder–decoder (Flan-T5) and decoder-only (Qwen2) models on SemEval-2010 Task 8 and CoNLL2004.
-
Data extraction after exact unlearning
Reproducing and extending Wu et al.'s Reversed Model Guidance (RMG) attack against exact unlearning. On WMDP and a synthetic medical dataset, RMG reliably outperforms unguided pre-unlearning generation, lifting A-ESR by up to ~63%. It also reveals a "sweet spot" in forget-set ratio and an inverse relationship between memorization and the optimal guidance scale.
-
Scalable oversight via adversarial deception in resume screening
Applying the Engels et al. (2025) scalable oversight framework to resume screening. We model the task as an adversarial Houdini–Guard game and measure how well a weaker Guard detects a stronger Houdini's deceptive selections, fitting domain Elo curves from 200 games per model pair across 8 models.
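For readers unfamiliar with Elo ratings, here is a minimal sketch of the standard Elo win-probability formula and rating update, applied to a Guard and a Houdini. It assumes the conventional base-10, 400-point Elo scale. The function names and the K-factor are hypothetical, and the project's actual curve-fitting procedure may differ.

```python
def elo_win_probability(guard_elo: float, houdini_elo: float) -> float:
    """Standard Elo expected score for the Guard against the Houdini.

    Uses the conventional base-10 logistic with a 400-point scale
    (an assumption here, not necessarily the project's choice).
    """
    return 1.0 / (1.0 + 10.0 ** ((houdini_elo - guard_elo) / 400.0))


def elo_update(guard_elo: float, houdini_elo: float, guard_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """One zero-sum Elo update after a single game.

    k is an illustrative K-factor; it controls how far ratings move per game.
    """
    expected = elo_win_probability(guard_elo, houdini_elo)
    delta = k * ((1.0 if guard_won else 0.0) - expected)
    return guard_elo + delta, houdini_elo - delta
```

Under this formula, equal ratings give the Guard a win probability of 0.5, and a 400-point advantage gives about 0.91.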
-
Steering chain-of-thought length — and what it does to faithfulness
Reproducing ThinkEdit's interpretable weight edits to mitigate overly short chain-of-thought reasoning, then extending the analysis with ChainScope's IPHR faithfulness evaluation across the Qwen3 family (0.6B–8B).