Deep (Learning) Focus
Decoder-Only Transformers: The Workhorse of Generative LLMs
Building the world's most influential neural network architecture from scratch...
Mar 4, 2024 • Cameron R. Wolfe, Ph.D.
Demystifying Reasoning Models
Understanding reasoning models and their relation to standard LLMs...
Feb 18, 2025 • Cameron R. Wolfe, Ph.D.
Understanding and Using Supervised Fine-Tuning (SFT) for Language Models
Understanding how SFT works from the idea to a working implementation...
Sep 11, 2023 • Cameron R. Wolfe, Ph.D.
AI Agents from First Principles
Understanding AI agents by building upon the most basic concepts of LLMs...
Jun 9, 2025 • Cameron R. Wolfe, Ph.D.
Basics of Reinforcement Learning for LLMs
Understanding the problem formulation and basic algorithms for RL...
Sep 25, 2023 • Cameron R. Wolfe, Ph.D.
Mixture-of-Experts (MoE) LLMs
Understanding models like DeepSeek, Grok, and Mixtral from the ground up...
Jan 27, 2025 • Cameron R. Wolfe, Ph.D.
PPO for LLMs: A Guide for Normal People
Understanding the complex RL algorithm that gave us modern LLMs...
Oct 27, 2025 • Cameron R. Wolfe, Ph.D.
GRPO++: Tricks for Making RL Actually Work
How to go from the vanilla GRPO algorithm to functional RL training at scale...
Jan 5 • Cameron R. Wolfe, Ph.D.
Continual Learning with RL for LLMs
Exploring the impressive continual learning capabilities of RL training...
Jan 26 • Cameron R. Wolfe, Ph.D.
nanoMoE: Mixture-of-Experts (MoE) LLMs from Scratch in PyTorch
An introductory, simple, and functional implementation of MoE LLM pretraining...
Mar 10, 2025 • Cameron R. Wolfe, Ph.D.
Rubric-Based Rewards for RL
Extending the benefits of large-scale RL training to non-verifiable domains...
Feb 16
Group Relative Policy Optimization (GRPO)
How the algorithm that teaches LLMs to reason actually works...
Nov 24, 2025 • Cameron R. Wolfe, Ph.D.