Deep (Learning) Focus
The Anatomy of an LLM Benchmark
Common patterns used to create the most effective LLM evaluation datasets...
Mar 30 • Cameron R. Wolfe, Ph.D.
Applying Statistics to LLM Evaluations
Most LLM evaluations are conducted without a deep consideration of statistics.
Mar 9 • Cameron R. Wolfe, Ph.D.
February 2026
Rubric-Based Rewards for RL
Extending the benefits of large-scale RL training to non-verifiable domains...
Feb 16
January 2026
Continual Learning with RL for LLMs
Exploring the impressive continual learning capabilities of RL training...
Jan 26 • Cameron R. Wolfe, Ph.D.
GRPO++: Tricks for Making RL Actually Work
How to go from the vanilla GRPO algorithm to functional RL training at scale...
Jan 5 • Cameron R. Wolfe, Ph.D.
December 2025
Olmo 3 and the Open LLM Renaissance
Fully-open artifacts with the potential to make LLM research a reality for anyone...
Dec 15, 2025 • Cameron R. Wolfe, Ph.D.
November 2025
Group Relative Policy Optimization (GRPO)
How the algorithm that teaches LLMs to reason actually works...
Nov 24, 2025 • Cameron R. Wolfe, Ph.D.
October 2025
PPO for LLMs: A Guide for Normal People
Understanding the complex RL algorithm that gave us modern LLMs…
Oct 27, 2025 • Cameron R. Wolfe, Ph.D.
September 2025
REINFORCE: Easy Online RL for LLMs
How to get the benefits of online RL without the complexity of PPO...
Sep 29, 2025 • Cameron R. Wolfe, Ph.D.
Online versus Offline RL for LLMs
A deep dive into the online-offline performance gap in LLM alignment...
Sep 8, 2025 • Cameron R. Wolfe, Ph.D.
August 2025
GPT-oss from the Ground Up
Everything you should know about OpenAI's new open-weight language models...
Aug 18, 2025 • Cameron R. Wolfe, Ph.D.
July 2025
Direct Preference Optimization (DPO)
How to align LLMs with limited hardware and minimal complexity...
Jul 28, 2025 • Cameron R. Wolfe, Ph.D.