# Gradient rules

A cheat sheet for differentiating expressions with the $$\nabla$$ operator, that is, for computing the gradients of various functions.

### Usual operations

Sum rule, product rule, quotient rule and scalar multiple rule (with $$f$$ and $$g$$ both scalar functions, $$f: \mathbb R^n \rightarrow \mathbb R$$ and $$g: \mathbb R^n \rightarrow \mathbb R$$, and $$\alpha \in \mathbb R$$ a constant):

$$\begin{array}{lcl} \nabla [ f + g ] & = &\nabla f + \nabla g \\ \nabla [ f . g ] & = & \nabla f . g + f . \nabla g \\ \nabla \left [ \frac{f}{g} \right ] & = &\frac{\nabla f . g - f . \nabla g}{g^2} \\ \nabla [ \alpha . f ] & = & \alpha . \nabla f \end{array}$$
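The product and quotient rules above can be sanity-checked numerically. The following is a minimal sketch, not part of the original cheat sheet: the functions `f` and `g`, the test point `x` and the finite-difference helper `num_grad` are all arbitrary choices made for illustration.

```python
import math

def num_grad(func, x, h=1e-6):
    """Central finite-difference gradient of a scalar function at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

# Hypothetical test functions R^2 -> R with hand-written gradients.
def f(x):
    return x[0] ** 2 + 3.0 * x[1]

def grad_f(x):
    return [2.0 * x[0], 3.0]

def g(x):
    return math.sin(x[0]) + x[1] ** 2 + 2.0  # always >= 1, so f / g is defined

def grad_g(x):
    return [math.cos(x[0]), 2.0 * x[1]]

x = [0.7, -1.2]

# Right-hand sides of the product and quotient rules.
product_rule = [gf * g(x) + f(x) * gg for gf, gg in zip(grad_f(x), grad_g(x))]
quotient_rule = [(gf * g(x) - f(x) * gg) / g(x) ** 2
                 for gf, gg in zip(grad_f(x), grad_g(x))]

# Left-hand sides, approximated by finite differences.
product_numeric = num_grad(lambda p: f(p) * g(p), x)
quotient_numeric = num_grad(lambda p: f(p) / g(p), x)

print(product_rule, product_numeric)
print(quotient_rule, quotient_numeric)
```

The two printed pairs should agree up to the finite-difference error.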

### Gradient of the norm

$$\nabla [ \| \vec {x} \| ] = \frac{ \vec {x}}{ \| \vec {x} \|}$$ $$\nabla [ \| \vec {x} \|^2 ] = 2 . \vec {x}$$
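Both formulas can be checked with finite differences. This is a small sketch under assumed values: the point `x` and the helper `num_grad` are illustrative only. Note that the first formula only makes sense for a nonzero $$\vec x$$, since it divides by the norm.

```python
import math

def num_grad(func, x, h=1e-6):
    """Central finite-difference gradient of a scalar function at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in x))

x = [3.0, -4.0, 12.0]  # arbitrary nonzero test point

grad_norm = [c / norm(x) for c in x]   # x / ||x||
grad_norm_sq = [2.0 * c for c in x]    # 2 x

grad_norm_numeric = num_grad(norm, x)
grad_norm_sq_numeric = num_grad(lambda p: norm(p) ** 2, x)

print(grad_norm, grad_norm_numeric)
print(grad_norm_sq, grad_norm_sq_numeric)
```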

### Gradient of a matrix

With $$M$$ an $$n \times n$$ matrix (the map $$\vec x \mapsto M \vec x$$ is vector valued, so $$\nabla$$ is understood here as its Jacobian):

$$\nabla [ M \vec {x} ] = M$$

Likewise for rigid transformations (rotations $$M \in \mathbb R^{3 \times 3}$$ and translations $$\vec t \in \mathbb R^3$$):

$$\nabla [ M \vec {x} + \vec t ] = M$$
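As an illustration, the sketch below builds a rigid transformation and compares its finite-difference Jacobian with $$M$$. The rotation angle, the translation and the helper `num_jacobian` are assumptions made for this example only.

```python
import math

def num_jacobian(func, x, h=1e-6):
    """Central finite-difference Jacobian: entry [i][j] approximates d func_i / d x_j."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(n_out):
            jac[i][j] = (fp[i] - fm[i]) / (2 * h)
    return jac

theta = 0.3  # arbitrary rotation angle around the z axis
M = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta),  math.cos(theta), 0.0],
     [0.0,              0.0,             1.0]]
t = [1.0, -2.0, 0.5]  # arbitrary translation

def rigid(x):
    """Rigid transformation x -> M x + t."""
    return [sum(M[i][k] * x[k] for k in range(3)) + t[i] for i in range(3)]

J = num_jacobian(rigid, [0.2, 0.4, -1.0])
max_error = max(abs(J[i][j] - M[i][j]) for i in range(3) for j in range(3))
print(max_error)
```

The translation $$\vec t$$ drops out because it does not depend on $$\vec x$$, so the result is the same at every evaluation point.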

### Chain rules

With $$s: \mathbb R \rightarrow \mathbb R$$ univariate and $$f: \mathbb R^n \rightarrow \mathbb R$$ multivariate and real valued, the operation boils down to a uniform scaling of the gradient:

$$\nabla \left [ s( f(\vec {x}) ) \right ] = s'( f(\vec {x}) ) \nabla f(\vec {x})$$
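As a short worked example (added here for illustration), this rule recovers the gradient of the norm from the gradient of the squared norm. Take $$s(u) = \sqrt{u}$$ and $$f(\vec x) = \| \vec x \|^2$$, so that $$s'(u) = \frac{1}{2 \sqrt{u}}$$ and, for $$\vec x \neq \vec 0$$:

$$\begin{array}{lcl} \nabla [ \| \vec {x} \| ] & = & \nabla \left [ s( f(\vec {x}) ) \right ] \\ & = & s'( f(\vec {x}) ) \nabla f(\vec {x}) \\ & = & \frac{1}{2 \| \vec {x} \|} . 2 . \vec {x} \\ & = & \frac{ \vec {x}}{ \| \vec {x} \|} \end{array}$$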

Similarly, consider $$s: \mathbb R^n \rightarrow \mathbb R$$ and plug a scalar function $$f_i: \mathbb R^n \rightarrow \mathbb R$$ into each of its parameters $$x_i$$. The gradient is now a weighted sum of the $$\nabla f_i$$:

$$\begin{array}{lcl} \nabla \left [ s \left( f_1(\vec {p}), \cdots, f_n(\vec {p}) \right) \right ] & = & \frac{ \partial s }{\partial x_1} \left( f_1(\vec {p}), \cdots, f_n(\vec {p}) \right) . \nabla f_1(\vec {p}) + \cdots + \frac{ \partial s }{\partial x_n} \left( f_1(\vec {p}), \cdots, f_n(\vec {p}) \right) . \nabla f_n(\vec {p}) \\ \text{(or expressed as a matrix product)} & = & \left [ \nabla f_1(\vec {p}), \cdots, \nabla f_n(\vec {p}) \right ] . \nabla s( f_1(\vec p), \cdots, f_n(\vec p)) \end{array}$$
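The weighted sum can be checked on a small case with $$n = 2$$. The functions `s`, `f1`, `f2` and the point `p` below are hypothetical and chosen only to keep the hand-written derivatives short.

```python
import math

def num_grad(func, x, h=1e-6):
    """Central finite-difference gradient of a scalar function at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

# Inner scalar functions R^2 -> R and their gradients.
def f1(p):
    return p[0] ** 2 + p[1]

def grad_f1(p):
    return [2.0 * p[0], 1.0]

def f2(p):
    return math.sin(p[0]) * p[1]

def grad_f2(p):
    return [math.cos(p[0]) * p[1], math.sin(p[0])]

# Outer function s(x_1, x_2) and its gradient with respect to its own parameters.
def s(u):
    return u[0] * u[1] + u[0] ** 2

def grad_s(u):
    return [u[1] + 2.0 * u[0], u[0]]

p = [0.5, 1.5]
ds = grad_s([f1(p), f2(p)])  # partial derivatives of s, evaluated at (f1(p), f2(p))

# Weighted sum of the inner gradients.
chain = [ds[0] * a + ds[1] * b for a, b in zip(grad_f1(p), grad_f2(p))]
numeric = num_grad(lambda q: s([f1(q), f2(q)]), p)
print(chain, numeric)
```

The weights are the partial derivatives of $$s$$ evaluated at the inner values, not at $$\vec p$$ itself.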

With $$m: \mathbb R^n \rightarrow \mathbb R^n$$ a deformation map and $$f: \mathbb R^n \rightarrow \mathbb R$$ a multivariate scalar function, the operation boils down to transforming the gradient with a matrix:

$$\nabla \left [ f( m(\vec {x}) ) \right ] = \mathbf{J}\left [ m(\vec {x}) \right ]^\mathsf{T} \nabla f(m(\vec {x}))$$

Here $$\mathbf{J}\left [ m(\vec {x}) \right ]^\mathsf{T}$$ denotes the transpose of the $$n \times n$$ Jacobian matrix of $$m$$ evaluated at $$\vec x$$.
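The transposed-Jacobian rule can be verified on a toy deformation. In this sketch the map `m`, the scalar function `f` and the point `x` are assumptions made for the example, with the Jacobian written out by hand.

```python
import math

def num_grad(func, x, h=1e-6):
    """Central finite-difference gradient of a scalar function at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

def m(x):
    """Hypothetical smooth deformation map R^2 -> R^2."""
    return [x[0] + 0.1 * x[1] ** 2, x[1] * math.cos(x[0])]

def jacobian_m(x):
    """Analytic Jacobian of m: entry [i][j] is d m_i / d x_j."""
    return [[1.0, 0.2 * x[1]],
            [-x[1] * math.sin(x[0]), math.cos(x[0])]]

def f(y):
    return y[0] ** 2 + 3.0 * y[1]

def grad_f(y):
    return [2.0 * y[0], 3.0]

x = [0.4, -0.9]
J = jacobian_m(x)
gf = grad_f(m(x))  # gradient of f evaluated at the deformed point m(x)

# J^T times grad f: component j sums over the rows i of J.
chain = [sum(J[i][j] * gf[i] for i in range(2)) for j in range(2)]
numeric = num_grad(lambda p: f(m(p)), x)
print(chain, numeric)
```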

### Related Jacobian rules

The Jacobian of a linear function with scalar slope $$a$$ and constant offset $$c$$ is the identity matrix $$I$$ times $$a$$:

$$J[a \vec {x} + c] = a.I$$

### Related chain rule

A univariate differentiation can also involve $$\nabla$$: take $$f: \mathbb R^n \rightarrow \mathbb R$$ a multivariate scalar function and $$\vec g: \mathbb R \rightarrow \mathbb R^n$$ a parametric function with components $$\vec g(x) = \{g_1(x), \cdots, g_n(x) \}^T$$:

$$\begin{array}{lcl} \left [ f(\vec g(x)) \right ] ' & = & \nabla f( \vec g(x))^T . \vec {g}'(x) \\ \text{(otherwise said)} & = & \frac{ \partial f }{\partial x_1}(\vec {g}(x)) . g_1'(x) + \cdots + \frac{ \partial f }{\partial x_n}(\vec {g}(x)) . g_n'(x) \end{array}$$

In short, we take the dot product between the gradient of $$f$$ and the velocity of $$\vec g$$.
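A minimal sketch of this rule, assuming a hypothetical curve `g` (a helix) and a hypothetical scalar function `f`. For this particular choice the derivative also has the closed form $$\cos(2x) + 2x$$, which gives a second check.

```python
import math

def g(x):
    """Parametric curve R -> R^3 (a helix)."""
    return [math.cos(x), math.sin(x), x]

def g_prime(x):
    """Velocity of the curve: component-wise derivative of g."""
    return [-math.sin(x), math.cos(x), 1.0]

def f(y):
    return y[0] * y[1] + y[2] ** 2

def grad_f(y):
    return [y[1], y[0], 2.0 * y[2]]

x = 0.8
h = 1e-6

# Dot product between the gradient of f (taken at g(x)) and the velocity of g.
chain = sum(a * b for a, b in zip(grad_f(g(x)), g_prime(x)))

# Central finite difference of the composed univariate function.
numeric = (f(g(x + h)) - f(g(x - h))) / (2 * h)

# Closed form: f(g(x)) = cos(x) sin(x) + x^2, whose derivative is cos(2x) + 2x.
closed_form = math.cos(2 * x) + 2 * x
print(chain, numeric, closed_form)
```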
