Regular single-variate implicit differentiation is taught in calc 101.

# Single variate

Consider a relationship between $x$ and $y$ implicitly defined by

$$f(x, y) = 0.$$

Now, consider some (infinitesimally) small perturbation $dx$ to $x$, inducing a corresponding perturbation $dy$ in $y$. To first order,

$$f(x + dx, y + dy) \approx f + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = 0.$$

(Here I am using the convention that if the arguments to $f$ are unspecified, it is to be evaluated at $(x, y)$.) Recall that $f(x, y) = 0$, then solve to get

$$\frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = 0,$$

or

$$\frac{dy}{dx} = -\left(\frac{\partial f}{\partial y}\right)^{-1} \frac{\partial f}{\partial x}.$$

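As a quick numerical sanity check (my own example, not part of the derivation), take the unit circle $f(x, y) = x^2 + y^2 - 1 = 0$ and compare the implicit-differentiation formula against a finite difference of the explicit branch $y(x) = \sqrt{1 - x^2}$:

```python
import numpy as np

# Implicit curve: f(x, y) = x^2 + y^2 - 1 = 0 (the unit circle).
def f_x(x, y):  # partial f / partial x
    return 2.0 * x

def f_y(x, y):  # partial f / partial y
    return 2.0 * y

# A point on the curve (upper half, so locally y(x) = sqrt(1 - x^2)).
x, y = 0.6, 0.8

# Implicit-differentiation formula: dy/dx = -(df/dx) / (df/dy).
dydx_implicit = -f_x(x, y) / f_y(x, y)

# Check against a central finite difference of the explicit solution y(x).
eps = 1e-6
dydx_numeric = (np.sqrt(1 - (x + eps) ** 2) - np.sqrt(1 - (x - eps) ** 2)) / (2 * eps)

print(round(dydx_implicit, 6))  # -0.75
print(round(dydx_numeric, 6))   # -0.75
```

The formula never needed the explicit solution for $y$; that is the whole point when no closed form exists.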
# Multiple variate

We again have an implicit relationship $f(x, y) = 0$, but now everything is a vector: $x \in \mathbb{R}^m$, $y \in \mathbb{R}^n$, and $f$ maps into $\mathbb{R}^n$, so that $y$ is (locally) determined by $x$.

Consider perturbations $dx$ and $dy$. To first order,

$$\frac{\partial f}{\partial x^T} dx + \frac{\partial f}{\partial y^T} dy = 0,$$

or

$$dy = -\left(\frac{\partial f}{\partial y^T}\right)^{-1} \frac{\partial f}{\partial x^T} dx.$$

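Here is a small sketch of the vector case. The particular relation $f(x, y) = y + \tanh(W y) - x = 0$ and the fixed-point solver are assumptions of mine, chosen so that $y(x)$ is cheap to compute; the point is that one linear solve reproduces the change in $y$ that you would get by re-solving the implicit equation from scratch:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
W = rng.normal(size=(n, n))
W *= 0.5 / np.linalg.norm(W, 2)   # spectral norm 0.5, so the map below contracts

# Hypothetical implicit relation: f(x, y) = y + tanh(W y) - x = 0.
def solve_y(x):
    # Fixed-point iteration y <- x - tanh(W y); converges since ||W|| < 1.
    y = np.zeros(n)
    for _ in range(200):
        y = x - np.tanh(W @ y)
    return y

x = rng.normal(size=n)
y = solve_y(x)

# Jacobians at (x, y):
#   df/dy^T = I + diag(1 - tanh(W y)^2) W,   df/dx^T = -I.
J_y = np.eye(n) + (1 - np.tanh(W @ y) ** 2)[:, None] * W
J_x = -np.eye(n)

# Implicit differentiation: dy = -(df/dy^T)^{-1} (df/dx^T) dx,
# done as a linear solve rather than forming the inverse.
dx = 1e-5 * rng.normal(size=n)
dy = -np.linalg.solve(J_y, J_x @ dx)

# Compare with actually re-solving the implicit equation at x + dx.
dy_numeric = solve_y(x + dx) - y
print(np.max(np.abs(dy - dy_numeric)))  # small: O(||dx||^2)
```

Solving the linear system costs far less than re-running the inner solver, which is what makes this trick useful downstream.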
# Use in optimization

Suppose some variables $y$ are implicitly defined by a minimization of some function $g$, which is also a function of $x$:

$$y(x) = \arg\min_y g(x, y).$$

Suppose, furthermore, that there is some criterion $L$ that we would like to minimize:

$$L(y(x)).$$

So we want to set $x$ so that the resulting $y(x)$ gives the best value for $L$.

By setting the gradient of the inner objective to zero, we obtain an implicit characterization of $y(x)$:

$$\frac{\partial g}{\partial y}(x, y) = 0.$$

(We assume here that $g$ is convex over $y$ for all $x$, so this condition picks out the minimizer.)

Making the substitution $f = \partial g / \partial y$ in the last equation from the previous section, we get

$$dy = -\left(\frac{\partial^2 g}{\partial y \partial y^T}\right)^{-1} \frac{\partial^2 g}{\partial y \partial x^T} dx.$$

Now, by putting the pieces together, we can see how a small change in $x$ affects the objective function:

$$dL = \frac{\partial L}{\partial y^T} dy = -\frac{\partial L}{\partial y^T} \left(\frac{\partial^2 g}{\partial y \partial y^T}\right)^{-1} \frac{\partial^2 g}{\partial y \partial x^T} dx,$$

or, using the symmetry of the Hessian,

$$\frac{dL}{dx} = -\frac{\partial^2 g}{\partial x \partial y^T} \left(\frac{\partial^2 g}{\partial y \partial y^T}\right)^{-1} \frac{\partial L}{\partial y}.$$

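To illustrate the final formula, here is a check on a hypothetical quadratic inner problem of my choosing (so that the $\arg\min$ has a closed form): $g(x, y) = \tfrac{1}{2} y^T A y - x^T y$ with $A$ positive definite, and $L(y) = \tfrac{1}{2}\|y - y^\*\|^2$. The implicit gradient is compared against finite differences of $x \mapsto L(y(x))$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Hypothetical inner problem: g(x, y) = 0.5 y^T A y - x^T y, A positive definite,
# so y(x) = argmin_y g(x, y) = inv(A) x in closed form.
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)
y_star = rng.normal(size=n)

def y_of_x(x):
    return np.linalg.solve(A, x)

def L(y):
    return 0.5 * np.sum((y - y_star) ** 2)

x = rng.normal(size=n)
y = y_of_x(x)

# Pieces of the implicit-gradient formula at (x, y):
#   d^2 g / dy dy^T = A,   d^2 g / dx dy^T = -I,   dL/dy = y - y_star.
H_yy = A
H_xy = -np.eye(n)
dL_dy = y - y_star

# dL/dx = -(d^2 g / dx dy^T) (d^2 g / dy dy^T)^{-1} (dL/dy)
grad = -H_xy @ np.linalg.solve(H_yy, dL_dy)

# Finite-difference check of d/dx L(y(x)).
eps = 1e-6
grad_fd = np.array([
    (L(y_of_x(x + eps * e)) - L(y_of_x(x - eps * e))) / (2 * eps)
    for e in np.eye(n)
])
print(np.max(np.abs(grad - grad_fd)))  # small
```

Note that the gradient needs only one Hessian solve, not the full Jacobian $dy/dx^T$, which is what makes this practical when $y$ is high-dimensional.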
I’ve used results like this in this paper, and there is a bunch of work using this kind of thing for training neural networks; see this entry.