Resume TLM
High-fidelity resume parsing without the LLM tax. A modular PyTorch pipeline: boundary detection → section classification → entity extraction.
“Why use 175B parameters to find a phone number when 66M parameters and a CRF head can do it with 99% accuracy in 1/10th the time?”
Efficiency as a feature.
The “LLM Tax”: the unnecessary cost and latency of using a massive model for a structured extraction task.
| Metric | GPT-4o / Claude 3.5 | Resume TLM ✓ |
|---|---|---|
| Parameters | ∼175B+ | 66M |
| Latency per resume | 3–8 s (API round-trip) | ~78 ms (CPU) |
| Cost per 1K resumes | $15–40 (API) | ~$0.002 (compute) |
| Schema Adherence | ~92% (hallucination risk) | 99.8% (deterministic CRF) |
| Runs offline / on-device | No | Yes (TorchScript / ONNX) |
| Token-level confidence | No | Yes (per-entity score) |
High-quality parsing needs high-quality ground truth.
Instead of downloading a dataset, I built the machinery to create one — a custom Human-in-the-Loop Labeling Workbench.
LLM Pre-annotation
Gemma 4 generates "silver standard" labels for every token before a human sees the resume. Batch runs process 12 resumes per Ollama cooldown window with smart skip logic for already-labeled docs.
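The batching and skip logic can be sketched as follows. This is a minimal illustration, not the project's actual code: `pre_annotate`, its arguments, and the stubbed annotator are hypothetical names, and the cooldown wait is left as a comment.

```python
BATCH_SIZE = 12  # resumes processed per Ollama cooldown window


def pre_annotate(resume_ids, already_labeled, annotate_fn):
    """Generate silver-standard labels in batches, skipping labeled docs."""
    # Smart skip logic: never re-annotate a document a human already touched.
    pending = [rid for rid in resume_ids if rid not in already_labeled]
    labels = {}
    for start in range(0, len(pending), BATCH_SIZE):
        for rid in pending[start:start + BATCH_SIZE]:
            labels[rid] = annotate_fn(rid)  # e.g. a Gemma call via the Ollama API
        # ...sleep out the Ollama cooldown window here before the next batch
    return labels
```

With a stub annotator, only the unlabeled documents flow through the batcher.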
Visual Labeling UI
Custom Next.js app renders token bounding boxes, section clusters, and BIO tag assignments in a 3-stage interface (Live → Heuristic → AI) across Skills, Experience, Education, and Projects sections.
Data Integrity
The UI enforces 8D/24D spatial feature constraints during labeling. BIO violation audit scripts catch illegal tag transitions before training. Per-document loss scores surface high-loss outliers for re-review.
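The core of a BIO violation audit is a single-pass check that every `I-` tag continues a same-type span. A minimal sketch (function name is illustrative):

```python
def bio_violations(tags):
    """Return indices where an I- tag does not continue a same-type B-/I- tag."""
    violations = []
    prev = "O"
    for i, tag in enumerate(tags):
        if tag.startswith("I-"):
            entity = tag[2:]
            # Legal predecessors for I-X are exactly B-X and I-X.
            if prev not in (f"B-{entity}", f"I-{entity}"):
                violations.append(i)
        prev = tag
    return violations
```

Running this over every labeled document before training surfaces illegal transitions such as `O → I-COMP` at the exact token index.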
Architecture deep dive.
GLU Spatial Fusion
Resumes are spatial documents — a header's position is as informative as its text. A Gated Linear Unit selectively blends token semantics with layout features at inference time, without hardcoding layout rules:
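A gated blend of that kind can be sketched as a small PyTorch module. Dimensions, names, and the exact gating formulation here are assumptions for illustration, not the project's actual code:

```python
import torch
import torch.nn as nn


class GLUSpatialFusion(nn.Module):
    """Gate token semantics against projected layout features (sketch)."""

    def __init__(self, hidden_dim: int, spatial_dim: int):
        super().__init__()
        # Project the 8D/24D layout vector into the encoder's hidden space.
        self.spatial_proj = nn.Linear(spatial_dim, hidden_dim)
        # Gate is computed from both views, so the blend is learned per token.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, token_emb: torch.Tensor, spatial_feats: torch.Tensor) -> torch.Tensor:
        spatial = self.spatial_proj(spatial_feats)                        # (B, T, H)
        g = torch.sigmoid(self.gate(torch.cat([token_emb, spatial], dim=-1)))
        return g * token_emb + (1 - g) * spatial                          # learned blend
```

Because the gate is a function of both inputs, the model can lean on layout for headers and on semantics for body text without any hand-written rules.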
Training Configuration
4-Stage Inference Pipeline
1. `boundary`: Detects all section-heading boundaries across the full document. Heavy O-class imbalance is handled with sqrt-inverse-frequency class weights.
2. `section_chunk`: Classifies heading+body blocks as EXPERIENCE / SKILLS / EDUCATION / etc. 25% header-stripping augmentation for robustness to headless sections. A virtual PERSONAL chunk captures pre-heading personal info.
3. `personal`: 24D spatial features + CRF head + Focal Loss (γ=2.0). Faker-based entity-swapping augmentation for location data. Enforces legal BIO tag transitions.
4. `exp_boundary` + `exp_label`: ExpBoundaryDataset uses confirmed experienceEntryHeads as ground truth and skips docs with no confirmed heads. Entity labels: ROLE, COMP, COMP_LOC, SDATE, EDATE, DESC.
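The sqrt-inverse-frequency weighting used in the boundary stage can be sketched in a few lines (function name and the mean-normalization step are illustrative choices):

```python
import math
from collections import Counter


def sqrt_inv_freq_weights(labels):
    """Class weights proportional to 1/sqrt(count), normalized to mean 1.0.

    Softer than plain inverse frequency: rare heading classes get boosted
    without letting the dominant O class collapse to a near-zero weight.
    """
    counts = Counter(labels)
    raw = {cls: 1.0 / math.sqrt(n) for cls, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {cls: w / mean for cls, w in raw.items()}
```

For a 100:4 imbalance, the rare class ends up with 5x the weight of the common one (sqrt(100/4) = 5), rather than the 25x that plain inverse frequency would give.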
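Focal loss with γ=2.0, as used in the personal stage, down-weights tokens the model already classifies confidently so the rare entity classes dominate the gradient. A minimal emission-level sketch (in the actual pipeline the CRF head sits on top of this):

```python
import torch
import torch.nn.functional as F


def focal_loss(logits, targets, gamma: float = 2.0):
    """Focal loss: (1 - p_t)^gamma scales per-token cross-entropy."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # per-token CE
    pt = torch.exp(-ce)                                      # prob. of the true class
    return ((1.0 - pt) ** gamma * ce).mean()
```

Since `(1 - p_t)^gamma < 1` whenever the model assigns any probability to the true class, the focal value is always below the plain cross-entropy, with the gap largest on easy tokens.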
Show, don't tell.
Simulated extraction output showing the structured JSON the model produces — with per-field confidence scores.
```
Priya Sharma
priya@example.com | +91 98765 43210
github.com/priya | Mumbai, India

EXPERIENCE
Software Engineer — Acme Corp (2022–2024)
Built microservices in Go, reduced p99 latency by 40%

SKILLS
Go, Python, Kubernetes, PostgreSQL, gRPC

EDUCATION
B.Tech Computer Science — IIT Bombay (2018–2022)
```
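One plausible shape for the structured output on the sample above. The field names and confidence values here are invented for illustration; the model's actual schema may differ:

```json
{
  "personal": {
    "name":  {"value": "Priya Sharma", "confidence": 0.998},
    "email": {"value": "priya@example.com", "confidence": 0.996},
    "phone": {"value": "+91 98765 43210", "confidence": 0.994},
    "location": {"value": "Mumbai, India", "confidence": 0.981}
  },
  "experience": [
    {
      "role":       {"value": "Software Engineer", "confidence": 0.991},
      "company":    {"value": "Acme Corp", "confidence": 0.987},
      "start_date": {"value": "2022", "confidence": 0.979},
      "end_date":   {"value": "2024", "confidence": 0.982},
      "description": {"value": "Built microservices in Go, reduced p99 latency by 40%", "confidence": 0.953}
    }
  ],
  "skills": ["Go", "Python", "Kubernetes", "PostgreSQL", "gRPC"],
  "education": [
    {"degree": "B.Tech Computer Science", "institution": "IIT Bombay", "dates": "2018–2022"}
  ]
}
```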
Full stack, production-ready.
Where it fails — and how it's being fixed.
Honesty as an engineering signal. Real edge cases, their root causes, and the current mitigation status.
Build log.
Interested in the research?
The training engine is actively in development. Happy to talk architecture, labeling strategy, or production NLP challenges.