Deep (Learning) Focus

Decoder-Only Transformers: The Workhorse of Generative LLMs
Building the world's most influential neural network architecture from scratch...
Mar 4, 2024 • Cameron R. Wolfe, Ph.D.
Demystifying Reasoning Models
Understanding reasoning models and their relation to standard LLMs...
Feb 18, 2025 • Cameron R. Wolfe, Ph.D.
Understanding and Using Supervised Fine-Tuning (SFT) for Language Models
Understanding how SFT works from the idea to a working implementation...
Sep 11, 2023 • Cameron R. Wolfe, Ph.D.
AI Agents from First Principles
Understanding AI agents by building upon the most basic concepts of LLMs...
Jun 9, 2025 • Cameron R. Wolfe, Ph.D.
Basics of Reinforcement Learning for LLMs
Understanding the problem formulation and basic algorithms for RL...
Sep 25, 2023 • Cameron R. Wolfe, Ph.D.
Mixture-of-Experts (MoE) LLMs
Understanding models like DeepSeek, Grok, and Mixtral from the ground up...
Jan 27, 2025 • Cameron R. Wolfe, Ph.D.
PPO for LLMs: A Guide for Normal People
Understanding the complex RL algorithm that gave us modern LLMs...
Oct 27, 2025 • Cameron R. Wolfe, Ph.D.
GRPO++: Tricks for Making RL Actually Work
How to go from the vanilla GRPO algorithm to functional RL training at scale...
Jan 5 • Cameron R. Wolfe, Ph.D.
Continual Learning with RL for LLMs
Exploring the impressive continual learning capabilities of RL training...
Jan 26 • Cameron R. Wolfe, Ph.D.
nanoMoE: Mixture-of-Experts (MoE) LLMs from Scratch in PyTorch
An introductory, simple, and functional implementation of MoE LLM pretraining...
Mar 10, 2025 • Cameron R. Wolfe, Ph.D.
Rubric-Based Rewards for RL
Extending the benefits of large-scale RL training to non-verifiable domains...
Feb 16 • Cameron R. Wolfe, Ph.D.
Group Relative Policy Optimization (GRPO)
How the algorithm that teaches LLMs to reason actually works...
Nov 24, 2025 • Cameron R. Wolfe, Ph.D.
© 2026 Cameron R. Wolfe