As part of the graphical models course I taught last spring, I developed a “cheatsheet” for exponential families. Its basic purpose is to explain the standard moment-matching condition of maximum likelihood, and to show clearly how this property generalizes to maximum likelihood in conditional exponential families, with hidden variables, or both. It’s available as an image below, or as a PDF here. Please let me know about any errors!
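
For readers who want the condition in symbols before opening the sheet, here is a rough sketch. The notation is generic and may not match the sheet: the sufficient statistics T, log-partition function A, base measure h, parameters θ, data x_1, ..., x_N, outputs y, and hidden variables z are all my own choices for this summary. It uses the standard fact that the gradient of A is the expected sufficient statistics.

```latex
% Plain exponential family
p(x;\theta) = h(x)\exp\!\big(\theta^\top T(x) - A(\theta)\big),
\qquad
\nabla_\theta A(\theta) = \mathbb{E}_{\theta}[T(x)]

% Setting the gradient of the average log likelihood to zero:
\frac{1}{N}\sum_{i=1}^{N} T(x_i) = \mathbb{E}_{\theta}[T(x)]

% Conditional family p(y \mid x;\theta) = h(x,y)\exp\!\big(\theta^\top T(x,y) - A(x,\theta)\big):
\frac{1}{N}\sum_{i=1}^{N} T(x_i, y_i)
  = \frac{1}{N}\sum_{i=1}^{N} \mathbb{E}_{\theta}\big[T(x_i, y) \mid x_i\big]

% Hidden variables z, with only x observed and p(x,z;\theta) in the exponential family:
\frac{1}{N}\sum_{i=1}^{N} \mathbb{E}_{\theta}\big[T(x_i, z) \mid x_i\big]
  = \mathbb{E}_{\theta}[T(x, z)]
```

In each case the left side is what the data (plus, where needed, the model's conditional expectations) says the statistics should be, and the right side is what the model predicts. With hidden variables the objective is generally not concave, so the last line is only a stationarity condition rather than a guarantee of a global maximum.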

I follow the (surprisingly controversial) convention of setting random variables in a sans-serif font, rather than in capital letters. I’m convinced this is the least-bad option for the machine learning literature, where many readers seem to find capital-letter random variables distracting. It also allows you to distinguish matrix-valued random variables, though that isn’t used here.

“likelihood” -> “log likelihood”

Thanks, fixed! I decided to go with “learning objective” since if we *really* want to be careful, the right column should arguably be “conditional log marginal likelihood”.