13 Comments
Dr. Ashish Bamania

Great deep dive and very helpful. Thanks for writing this!

Cameron R. Wolfe, Ph.D.

Thank you so much, and of course!

DesignREM support

Amazing article, a true gem. I would be interested to learn more about how DeepSeek used reinforcement learning to such great effect.

Cameron R. Wolfe, Ph.D.

Working on that article now!

Tejas Parnerkar

Great in-depth article!

Cameron R. Wolfe, Ph.D.

Thanks for the kind words!

Rahul Saini

Great in-depth article on MoE LLMs.

Michael

I agree with Dr. Bamania: this was a very rewarding read.

Cameron R. Wolfe, Ph.D.

Thank you! I'm glad you found it helpful!

Just a Placebo

Do you expect this leapfrogging between labs to persist much longer? The collective response to this tug of war seems overwhelmingly simplified and "gotcha!" meme-based. Public discourse tends to treat every new release as an exponential leap, with little reservation about the unknowns or about how the work actually compares across companies and established methods.

JP

Circling back to this because it connects directly to something playing out right now. Qwen 3.5's small models shipped, and the headline is "4B beats 80B." That sounds wild until you realize it's a 4B dense model versus ~3B active parameters in the MoE. The architecture lesson in this post is exactly what people need before taking that claim at face value. I broke down the whole thing here: https://reading.sh/your-laptop-is-an-ai-server-now-370bad238461?sk=1cf7a4391e614720ecbd6e9bc3f076a2
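
For anyone who wants to sanity-check the arithmetic, here's a minimal back-of-the-envelope sketch. The hyperparameters below are illustrative placeholders, not any model's published config; they're just chosen to show how an MoE can total roughly 80B parameters while only a few billion are active per token:

```python
# Back-of-the-envelope comparison of total vs. active parameters in an
# MoE transformer. All hyperparameters here are illustrative placeholders,
# not any real model's config.

def moe_ffn_params(n_layers, d_model, d_ff_expert, n_experts, top_k):
    """Approximate total and per-token-active parameters in the MoE FFN blocks.

    Assumes a SwiGLU-style expert with three weight matrices (gate, up,
    down), so each expert holds ~3 * d_model * d_ff_expert weights. Only
    the top_k routed experts run per token, so active parameters scale
    with top_k rather than with n_experts.
    """
    per_expert = 3 * d_model * d_ff_expert
    total = n_layers * n_experts * per_expert
    active = n_layers * top_k * per_expert
    return total, active

# Placeholder config chosen to land near an "80B total, ~3B active" shape.
total, active = moe_ffn_params(
    n_layers=48, d_model=2048, d_ff_expert=512, n_experts=512, top_k=10
)
print(f"total MoE FFN params:  {total / 1e9:.1f}B")   # ~77.3B
print(f"active MoE FFN params: {active / 1e9:.1f}B")  # ~1.5B
```

Attention, embeddings, and any shared experts add a few billion more to both counts, which is how the total can land near 80B while the active count stays around 3B. So a 4B dense model is being compared against roughly similar per-token compute, not 20x more.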

Srini Vijay, PhD

Great article. I couldn't finish it in one go, though; I've bookmarked it and will come back to it later.