🥷 Project Overview
Finetuning Ninja is a hands-on project designed to build deep intuition and practical skills in training Large Language Models (LLMs). From setting up the lab environment to deploying reasoning models, this repository guides you through every critical step.
It covers the complete lifecycle:
| Focus | Description |
|---|---|
| Foundations | Understanding the finetuning landscape and supervised finetuning (SFT). |
| Depth | Understanding the internal mechanisms of models. |
| Efficiency | Leveraging Parameter-Efficient Finetuning (PEFT) with LoRA and QLoRA. |
| Alignment | Aligning models with human preferences using RLHF (Reinforcement Learning from Human Feedback). |
| Reasoning | Exploring advanced techniques like GRPO (Group Relative Policy Optimization). |
| Multimodal | Extending capabilities to vision and audio. |
| Production | Best practices for LLM deployment. |
The project emphasizes not only how to run the code but also why it works, providing the intuition and theory a practitioner needs.
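As a small taste of the "why it works" angle, the Efficiency row can be made concrete with a bit of arithmetic. The sketch below is not code from this repository; it only counts parameters for a single linear layer, assuming the standard LoRA formulation in which a frozen `d_out x d_in` weight is adapted by two trainable low-rank factors. The layer size (4096) and rank (8) are illustrative assumptions, and the function names are hypothetical.

```python
# Illustrative sketch: why LoRA is "parameter-efficient".
# Assumes one linear layer with a frozen weight W (d_out x d_in) and a
# LoRA update B @ A, where A is (rank x d_in) and B is (d_out x rank).


def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is finetuned."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in the two low-rank factors A and B."""
    return rank * d_in + d_out * rank


# Hypothetical sizes chosen only for illustration.
d_in = d_out = 4096
rank = 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
fraction = lora / full

print(f"full finetuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"fraction trained: {fraction:.4%}")  # 1/256 of the layer, about 0.39%
```

Under these assumed sizes, the adapter trains 65,536 parameters instead of 16,777,216 for the layer, which is the kind of saving that makes finetuning feasible on modest hardware.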