5 Comments
Ryan Galliher

Man, this is great! So much better than browsing papers myself. I'll be digging into some of your earlier posts so that I can better understand this one!

Cameron R. Wolfe, Ph.D.

Recommend the prior GRPO post + the LLM-as-a-Judge post linked in the LLM-as-a-Judge section of this writeup! Those should provide good background knowledge.

Aditya Sharan

This is a very helpful post. Thank you for writing this. It covers the topic very thoroughly. Much appreciated. (PS: In my personal view, this is one of your better-written posts.)

Cameron R. Wolfe, Ph.D.

Thanks so much, and I'm glad it was helpful! Spent probably more time than I should've editing this weekend, but I was really happy with how it came out :)

Tedd Hadley

Rubric, what does it mean? According to Wikipedia (https://en.wikipedia.org/wiki/Rubric):

"A rubric is a word or section of text that is traditionally written or printed in red ink for emphasis. The word derives from the Latin rubrica, meaning red ochre or red chalk, and originates in medieval illuminated manuscripts from the 13th century or earlier."

What does this have to do with machine learning? Nothing yet.

The next step in the evolution of the term "rubric" was writing assessment: evaluating a writer's performance or potential through a writing task (https://en.wikipedia.org/wiki/Writing_assessment). Bob Broad calls 1961's "Factors in Judgments of Writing Ability" (Diederich et al.) the "Birth of Rubrics" (https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=1139&context=usupress_pubs) and notes the scoring system:

> Ideas: relevance, clarity, quantity, development, persuasiveness
> Form: organization and analysis
> Flavor: style, interest, sincerity
> Mechanics: specific errors in punctuation, grammar, etc.
> Wording: choice and arrangement of words
>
> And thus was born what became the standard, traditional, five-point rubric, by some version of which nearly every large-scale assessment of writing since 1961 has been strictly guided.

Since then, this alternate meaning has persisted and expanded in academia: a rubric is a scoring guide used to evaluate the quality of students' constructed responses in all forms of work (https://en.wikipedia.org/wiki/Rubric_(academic)).

So when did this term get applied to Reinforcement Learning reward scoring? Cameron's masterful and comprehensive article completes the story.