Three opinions on theorems

1. Think of theorem statements like an API.

Some people feel intimidated by the prospect of putting a “theorem” into their papers. They feel that their results aren’t “deep” enough to justify this. Instead, they give the derivation and result inline as part of the normal text.

Sometimes that’s best. However, the purpose of a theorem is not to shout to the world that you’ve discovered something incredible. Rather, theorems are best thought of as an “API for ideas”. There are two basic purposes:

  • To separate what you are claiming from your argument for that claim.
  • To provide modularity to make it easier to understand or re-use your ideas.

To decide if you should create a theorem, ask if these goals will be advanced by doing so.

A thoughtful API makes software easier to use: The goal is to allow the user as much functionality as possible with as simple an interface as possible, while isolating implementation details. If you have a long chain of mathematical argument, you should choose what parts to write as theorems/lemmas in much the same way.

Many papers intermingle definitions, assumptions, proof arguments, and the final results. Have pity on the reader, and tell them in a single place what you are claiming, and under what assumptions. The “proof” section separates your evidence for your claim from the claim itself. Most readers want to understand your result before looking at the proof. Let them. Don’t make them hunt to figure out what your final result is.

Perhaps controversially, I suggest you should use the above reasoning even if you aren’t being fully mathematically rigorous. It’s still kinder to the reader to state your assumptions informally.
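Here is a minimal sketch of what that separation might look like in LaTeX source. It assumes the amsmath and amsthm packages, and borrows the small identity from the second opinion below purely as a stand-in result.

```latex
\documentclass{article}
\usepackage{amsmath,amsthm}
\newtheorem{lemma}{Lemma}

\begin{document}

% The claim and its assumptions, stated in one place.
\begin{lemma}
For all $0 < x < 1$,
\[
  \frac{x - x^2}{\sqrt{x^2 - 2x + 1}} = x .
\]
\end{lemma}

% The evidence, kept separate from the claim.
\begin{proof}
Write the numerator as $x(1-x)$ and the denominator as
$\sqrt{(x-1)^2} = 1 - x$, which is valid because $1 - x > 0$.
\end{proof}

\end{document}
```

A reader who only wants the result can stop after the lemma; one who doubts it can read on.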

To see why it helps to state your results explicitly, here’s an example from a seminal paper, so I’m sure the authors don’t mind.

[Image: excerpt of the proof]

This proof is well written. The problem is the many small uncertainties that accumulate as you read it. If you try to understand exactly:

  • What result is being stated?
  • Under what assumptions does that result hold?

You will find that the proof “bleeds in” to the result itself. The convergence rate in 2.13 involves Q(\theta) defined in 2.10, which itself involves other assumptions tracing backwards in the paper.

Of course, not every single claim needs to be written as a theorem/lemma/claim. If your result is simple to state and will only be used in a “one-off” manner, it may be clearer just to leave it in the text. That’s analogous to “inlining” a small function instead of creating another one.

2. Don’t fear the giant equation block.

I sometimes see a proof like this (for 0<x<1):

Take the quantity

\frac{x-x^2}{\sqrt{x^2-2x+1}}.

Pulling out x this becomes

x \frac{1-x}{\sqrt{x^2-2x+1}}.

Factoring the denominator, this is

x \frac{1-x}{\sqrt{(x-1)(x-1)}}.

Etc.

For some proofs, the text between each line just isn’t that helpful. To a large degree it makes things more confusing: without an equality sign between the lines, you need to read the words to understand how each formula is supposed to be related to the previous one. Consider this alternative version of the proof:

\begin{aligned} \frac{x-x^2}{\sqrt{x^2-2x+1}}  = & x \frac{1-x}{\sqrt{x^2-2x+1}} \\ = & x \frac{1-x}{\sqrt{(x-1)(x-1)}} \\ = & x \frac{1-x}{\sqrt{(x-1)^2}} \\ = & x \frac{1-x}{1-x} \\ = & x \\ \end{aligned}

In some cases, this reveals the overall structure of the proof better than a bunch of lines with interspersed text. If explanation is needed, it can be better to put it at the end, e.g. as “where line 2 follows from [blah] and line 3 follows from [blah]”.
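As a rough sketch of the “explanation at the end” style, one could label the lines and refer back to them afterwards. This assumes the amsmath package; the label names are made up for illustration.

```latex
% requires \usepackage{amsmath}
\begin{align}
\frac{x-x^2}{\sqrt{x^2-2x+1}}
  &= x \frac{1-x}{\sqrt{x^2-2x+1}} \label{eq:pull}   \\
  &= x \frac{1-x}{\sqrt{(x-1)^2}}  \label{eq:factor} \\
  &= x \frac{1-x}{1-x}             \label{eq:sqrt}   \\
  &= x, \notag
\end{align}
where \eqref{eq:pull} pulls $x$ out of the numerator, \eqref{eq:factor}
factors the denominator, and \eqref{eq:sqrt} uses $1-x>0$ for $0<x<1$.
```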

It can also be helpful to put these explanations inline, i.e. to use a proof like

\begin{aligned} \frac{x-x^2}{\sqrt{x^2-2x+1}} = & x \frac{1-x}{\sqrt{x^2-2x+1}} & \text{ pull out } x \\ = & x \frac{1-x}{\sqrt{(x-1)(x-1)}} & \text{ factoring denominator} \\ = & x \frac{1-x}{\sqrt{(x-1)^2}} & \\ = & x \frac{1-x}{1-x} & \text{ since } 1-x > 0 \\ = & x & \\ \end{aligned}

Again, this is not the best solution for all (or even most) cases, but I think it should be used more often than it is.

3. Use equivalence of inequalities when possible.

Many proofs involve manipulating chains of inequalities. When doing so, it should be obvious at what steps extra looseness may have been introduced. Suppose you have some positive constants a and c with a^2>c and you need to choose b \geq 0 so as to ensure that b^2+c \leq a^2.

People will often prove a result like the following:

Lemma: If b \leq \sqrt{a^2-c}, then b^2+c \leq a^2.

Proof: Under the stated condition, we have that

\begin{aligned} b^2 + c & \leq & (\sqrt{a^2-c})^2+c \\ & = & a^2-c+c \\ & = & a^2 \end{aligned}

That’s all correct, but doesn’t something feel slightly “magical” about the proof?

There are two problems: First, the proof reveals nothing about how you came up with the final answer. Second, the result leaves it ambiguous whether you have introduced additional looseness. Given the starting assumption, is it possible to prove a stronger bound?

I think the following lemma and proof are much better:

Lemma: b^2+c \leq a^2 if and only if b \leq \sqrt{a^2-c}.

Proof: The following conditions are all equivalent:

\begin{aligned} b^2+c & \leq & a^2 \\ b^2 & \leq & a^2-c \\ b & \leq & \sqrt{a^2-c}. \\ \end{aligned}

The proof shows exactly how you arrived at the final result, and shows that there is no extra looseness. It’s better not to “pull a rabbit out of a hat” in a proof if not necessary.

This is arguably one of the most basic possible proof techniques, but is bizarrely underused. I think there are two reasons why:

  1. Whatever need motivated the lemma is probably met by the first one above. The benefit of the second is mostly in providing more insight.
  2. Mathematical notation doesn’t encourage it. The sentence at the beginning of the proof is essential. If you see this merely as a series of inequalities, each implied by the one before, then it will not give the “only if” part of the result. You could conceivably try to write something like a < b \Leftrightarrow \exp a < \exp b, but this is awkward with multiple lines.
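
For what it’s worth, here is one way the multi-line version might be typeset, with an explicit \iff at the start of each line so the equivalence doesn’t depend on the opening sentence. This is only a sketch of a possible layout, not established notation; it assumes the amsmath package, and the last step assumes b \geq 0.

```latex
% requires \usepackage{amsmath}
% Columns: [connective] [spacing] [left side] [relation and right side]
\begin{alignat*}{2}
      & \quad & b^2 + c &\leq a^2 \\
 \iff & \quad & b^2     &\leq a^2 - c \\
 \iff & \quad & b       &\leq \sqrt{a^2 - c}  % uses b \geq 0
\end{alignat*}
```

Whether this reads better than a sentence saying “the following are equivalent” is a matter of taste.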
