Deep Dive: Teaching Arcee Trinity Mini to Read Medical Research with RLVR and GRPO
Bojan Jakimovski, an ML engineer, took Arcee AI's open-source Trinity Mini model and turned it into a biomedical specialist — extracting drug-protein relationships from scientific papers. No massive team. No million-dollar budget. Just open weights, a clever training technique called RLVR, and a weekend of GPU time.
More content on Substack → https://www.airealist.ai
In this video, I break down exactly how it works: the Mixture of Experts architecture behind Trinity Mini, why Reinforcement Learning with Verifiable Rewards (RLVR) beats traditional fine-tuning for domain specialization, how the GRPO algorithm (the same one behind DeepSeek R1) trains a model to reason step by step, and how LoRA makes it possible to specialize a 26B-parameter model for under $50.
Whether you're an ML engineer, a researcher, or just curious about where open-source AI is headed, this is a practical, no-hype walkthrough of a pattern you can replicate in your own domain.
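To make the RLVR and GRPO ideas above concrete, here is a minimal sketch in plain Python. It is not taken from the training repo. The `<answer>` tag format, the function names, and the use of a population standard deviation are all assumptions made for illustration, and the relation labels in the sample completions are placeholders. It shows the two core pieces: a reward that can be checked automatically against a gold label, and the group-relative advantage that GRPO computes by normalizing each sampled completion's reward against the other completions for the same prompt.

```python
import re
from statistics import mean, pstdev


def verifiable_reward(completion: str, gold_label: str) -> float:
    """Return 1.0 if the tagged final answer matches the gold label, else 0.0.

    Assumes (hypothetically) that the model wraps its final answer in
    <answer>...</answer> after its reasoning.
    """
    match = re.search(
        r"<answer>\s*(.*?)\s*</answer>", completion, flags=re.DOTALL | re.IGNORECASE
    )
    if match is None:
        return 0.0  # no parseable answer means no reward
    predicted = match.group(1).strip().lower()
    return 1.0 if predicted == gold_label.strip().lower() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so a
    completion is only rewarded for beating its siblings.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four illustrative completions sampled for one prompt (made-up text).
    group = [
        "<think>The drug blocks the enzyme.</think><answer>INHIBITOR</answer>",
        "<think>It seems to bind.</think><answer>SUBSTRATE</answer>",
        "I am not sure.",
        "<think>Activity drops.</think><answer>inhibitor</answer>",
    ]
    rewards = [verifiable_reward(c, "INHIBITOR") for c in group]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Because the advantage is relative to the group, a prompt where every sample is right (or every sample is wrong) yields advantages of zero and contributes no gradient signal. No separate learned value model is needed, which is one reason the approach is cheap to run.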
Bojan Jakimovski's blog → https://shekswess.github.io
Bojan's LinkedIn → https://linkedin.com/in/bojan-jakimovski
*** MODELS
Trinity-Mini-DrugProt-Think (LoRA adapter) → https://huggingface.co/lokahq/Trinity-Mini-DrugProt-Think
Arcee Trinity Mini (base model) → https://huggingface.co/arcee-ai/Trinity-Mini
Arcee Trinity Mini Base (pre-SFT) → https://huggingface.co/arcee-ai/Trinity-Mini-Base
Trinity Mini on OpenRouter (free tier) → https://openrouter.ai/arcee-ai/trinity-mini:free
Trinity Mini on OpenRouter (paid API) → https://openrouter.ai/arcee-ai/trinity-mini
*** CODE & CONFIGS
Full training repo (configs, metrics, deployment) → https://github.com/LokaHQ/Trinity-Mini-DrugProt-Think
12 experiment TOML configs → https://github.com/LokaHQ/Trinity-Mini-DrugProt-Think/tree/main/experiments/configs/rl
Training metrics CSVs → https://github.com/LokaHQ/Trinity-Mini-DrugProt-Think/tree/main/data
Deploying on Amazon SageMaker (Loka blog) → https://medium.com/loka-engineering/deploying-trinity-mini-drugprot-think