Multivariate Taylor polynomial
Preamble: it is quite hard to grasp the formula if you don't know what multi-index notation is (which I explain below) and how to apply it. Multi-index notation changes the usual way you interpret the summation symbol $\sum$, but also derivatives, factorials, power notation etc. So don't expect to understand the formula below right away without studying the rest of the article. You should be familiar with the single-variable version of the Taylor series as well.
Taylor's theorem for multivariate functions
Let $f : \mathbb R^n \rightarrow \mathbb R$ be a $k$-times continuously differentiable function of $n$ variables at the point $\boldsymbol a \in \mathbb R^n$. Let also $\mathbf \alpha \in \mathbb N_0^n$ be a multi-index (i.e. a vector of non-negative integers, see below) of dimension $n$ (the same as the number of variables of $f$). Then there exist functions $h_\alpha : \mathbb R^n \rightarrow \mathbb R$, where $|\alpha|=k$, such that:
$$ \begin{align*} & f(\boldsymbol{x}) = \sum_{|\alpha|\leq k} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha + \sum_{|\alpha|=k} h_\alpha(\boldsymbol{x})(\boldsymbol{x}-\boldsymbol{a})^\alpha, \\ & \mbox{and}\quad \lim_{\boldsymbol{x}\to \boldsymbol{a}}h_\alpha(\boldsymbol{x})=0. \end{align*} $$

n-dimensional multi-index notation
Below is an explanation but Wikipedia does a good job at describing multi-indices as well. You can also find videos on the topic (ex1, ex2)
$\mathbf \alpha \in \mathbb N_0^n$ is a vector of dimension $n$ of non-negative integers, $\mathbf \alpha = \{\alpha_1, \alpha_2, \cdots, \alpha_n \}$. The length of this vector is simply the sum of all the indices:
$$ |\mathbf \alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n = \sum_{i=1}^n \alpha_i $$

It is useful for setting the coefficients of a multivariable polynomial. So if you have a vector $ \mathbf x \in \mathbb R^n$ with elements $(x_1, x_2, \cdots, x_n)$, you can express the products of the vector's elements as:
$$ \mathbf x^{\mathbf \alpha} = x_1^{\alpha_1} \cdot x_2^{\alpha_2} \ \cdots \ x_n^{\alpha_n} = \prod_{i=1}^n x_i^{\alpha_i} $$

The factorial is expressed as:
$$ \mathbf \alpha! = \alpha_1! \ \alpha_2! \cdots \alpha_n! = \prod_{i=1}^n ({\alpha_i}!) $$

Component-wise addition and subtraction:
$$ \mathbf \alpha + \mathbf \beta = \{(\alpha_1 + \beta_1), (\alpha_2 + \beta_2), \cdots, (\alpha_n + \beta_n) \} $$ $$ \mathbf \alpha - \mathbf \beta = \{ (\alpha_1 - \beta_1), (\alpha_2 - \beta_2), \cdots, (\alpha_n - \beta_n) \} $$

Express a product of binomial coefficients:
$$ {\mathbf \alpha \choose \mathbf \beta} = \frac{\mathbf \alpha!}{\mathbf \beta!(\mathbf \alpha-\mathbf \beta)!} = {\alpha_1 \choose \beta_1}{\alpha_2 \choose \beta_2} \cdots{\alpha_n \choose \beta_n} \\= \frac{\alpha_1! \alpha_2! \cdots \alpha_n!}{ \beta_1! \beta_2! \cdots \beta_n!(\alpha_1 - \beta_1)!(\alpha_2 - \beta_2)!\cdots(\alpha_n - \beta_n)! } $$

Partial derivatives:
$$ D^\alpha f = \frac{\partial^{|\alpha|}f}{\partial x_1^{\alpha_1}\cdots \partial x_n^{\alpha_n}} $$

Example with $\mathbf \alpha = (1, 2, 1)$:
$$ D^\alpha f = \frac{\partial^{4}f}{\partial x_1 \partial x_2^2 \partial x_3} $$

The summation symbol allows us to enumerate all possible combinations of multi-indices. So, $\sum_{|\mathbf \alpha| = k}$ is the sum over all possible combinations of $\mathbf \alpha = \{\alpha_1, \ \alpha_2,\ \cdots, \alpha_n\}$ that satisfy ${ \alpha_1 + \alpha_2 + \cdots + \alpha_n = k}$, for instance:
$$ \sum_{ |\mathbf \alpha| \ {\color{green}=} \ 2} \{\alpha_1, \alpha_2\} = \sum_{\underbrace{\alpha_1 + \alpha_2 = 2}_{\Large \text{equation}}} \{\alpha_1, \alpha_2\} = \overbrace{\underbrace{\{2,0\}}_{\text{solution 1}}}^{2+0=2} + \overbrace{\underbrace{\{0,2\}}_{\text{solution 2}}}^{0+2=2} + \overbrace{\underbrace{\{1,1\}}_{\text{solution 3}}}^{1+1=2} $$It's best to see this article for more details and examples. Warning: notice we use the equal sign $=$ in the above. Now, there is an extension of the summation notation using inequalities:
$$ \sum_{|\mathbf \alpha| \ {\color{red} \leq} \ k} = \sum_{|\mathbf \alpha| = 0} + \sum_{|\mathbf \alpha| = 1} + \cdots + \sum_{|\mathbf \alpha| = k} $$

1st order development
To understand a formula it is always crucial to apply it to a concrete example, so it's time to practice:
$$ \begin{aligned} f(\boldsymbol{x}) &= \sum_{|\alpha|\leq 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ &= \sum_{|\alpha| = 0} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha + \sum_{|\alpha| = 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ &= \sum_{\alpha_1 + \cdots + \alpha_n = 0} \frac{ f_{0 \cdots 0}(\boldsymbol{a})}{0! \cdots 0!} (x_1 - a_1)^0 \cdots (x_n - a_n)^0 + \sum_{|\alpha| = 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ \end{aligned} $$

The sum $\sum_{|\alpha| = 0} = \sum_{\alpha_1 + \cdots + \alpha_n = 0}$ only represents a single term, for which every index is set to zero, $\alpha = \{ \alpha_1 = 0, \cdots, \alpha_n = 0\}$, so that $D^{0 \cdots 0} f$ means that we never differentiate $f$ with respect to any of its variables:
$$ \begin{aligned} &= \frac{ f_{0 \cdots 0}(\boldsymbol{a})}{0! \cdots 0!} (x_1 - a_1)^0 \cdots (x_n - a_n)^0 + \sum_{|\alpha| = 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ &= f(\boldsymbol{a}) + \sum_{|\alpha| = 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ &= f(\boldsymbol{a}) + \sum_{ \alpha_1 + \cdots + \alpha_n = 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ &= f(\boldsymbol{a}) + \\ & \frac{1}{1! \, 0! \cdots 0!} \frac{ \partial^{1} f(\boldsymbol{a})}{\partial x_1} (x_1 - a_1)^1 (x_2 - a_2)^0 \cdots (x_n - a_n)^0 + \\ & \frac{1}{0! \, 1! \cdots 0!} \frac{ \partial^{1} f(\boldsymbol{a})}{\partial x_2} (x_1 - a_1)^0 (x_2 - a_2)^1 \cdots (x_n - a_n)^0 \\ & + \cdots \\ &= f(\boldsymbol{a}) + \sum_{i=1}^n \frac{ \partial f(\boldsymbol{a})}{\partial x_i} (x_i - a_i) \\ &= f(\boldsymbol{a}) + \nabla f(\mathbf a)^T . (\mathbf x - \mathbf a) \end{aligned} $$
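As a numerical sanity check of the final gradient form, here is a small sketch on a concrete function with a hand-written gradient (the function choice is arbitrary, just for illustration):

```python
# First-order Taylor approximation: f(x) ≈ f(a) + ∇f(a)·(x - a),
# checked on f(x, y) = x^2 + 3y with the gradient written by hand.
def f(x, y):
    return x * x + 3.0 * y

def grad_f(x, y):
    # ∇f = (∂f/∂x, ∂f/∂y) = (2x, 3)
    return (2.0 * x, 3.0)

a = (1.0, 2.0)
x = (1.01, 2.02)  # a point close to a

gx, gy = grad_f(*a)
linear = f(*a) + gx * (x[0] - a[0]) + gy * (x[1] - a[1])

print(f(*x))    # exact value, ≈ 7.0801
print(linear)   # first-order approximation, ≈ 7.08
# The mismatch is O(|x - a|^2), here about 1e-4.
```

Shrinking $|\mathbf x - \mathbf a|$ by a factor of 10 shrinks the mismatch by a factor of about 100, as the quadratic remainder predicts.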
2nd order development
$$ \begin{aligned} f(\boldsymbol{x}) &= \sum_{|\alpha|\leq 2} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \\ &= \sum_{|\alpha| = 0} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha + \sum_{|\alpha| = 1} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha + \sum_{|\alpha| = 2} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha \end{aligned} $$
Let's focus exclusively on: $\sum_{|\alpha| = 2} = \sum_{\alpha_1 + \cdots + \alpha_n = 2}$ since we already know the answer for the first two terms $\sum_{|\alpha| = 0}$ and $\sum_{|\alpha| = 1}$
$$ \sum_{|\alpha| = 2} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha = \sum_{\alpha_1 + \cdots + \alpha_n = 2} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x}-\boldsymbol{a})^\alpha $$

Let's list the various solutions for $\mathbf \alpha$:
| α_1 | α_2 | α_3 | α_4 | ... | α_n |
*-----------------------------------*
|  2  |  0  |  0  |  0  |     |  0  |
|  0  |  2  |  0  |  0  |     |  0  |
|  0  |  0  |  2  |  0  |     |  0  |
|  0  |  0  |  0  |  2  |     |  0  |
|  0  |  0  |  0  |  0  |     |  2  |

Every pair:

|  1  |  1  |  0  |  0  |     |  0  |
|  1  |  0  |  1  |  0  |     |  0  |
|  1  |  0  |  0  |  1  |     |  0  |
|  1  |  0  |  0  |  0  |     |  1  |
|  0  |  1  |  1  |  0  |     |  0  |
|  0  |  1  |  0  |  1  |     |  0  |
|  0  |  1  |  0  |  0  |     |  1  |

etc.
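These solution tables can be generated mechanically: enumerating every multi-index with $|\mathbf \alpha| = k$ is a short recursion. A sketch in plain Python (the function name is my own, not a standard one):

```python
def multi_indices(n, k):
    # All vectors (alpha_1, ..., alpha_n) of non-negative integers
    # such that alpha_1 + ... + alpha_n == k.
    if n == 1:
        return [(k,)]
    result = []
    for first in range(k, -1, -1):           # pick alpha_1 from k down to 0
        for rest in multi_indices(n - 1, k - first):
            result.append((first,) + rest)   # prepend it to every tail
    return result

# Two variables, |alpha| = 2:
print(multi_indices(2, 2))        # [(2, 0), (1, 1), (0, 2)]
# Four variables, |alpha| = 2: 4 "pure" rows + 6 pair rows = 10 solutions
print(len(multi_indices(4, 2)))   # 10
```

In general there are $\binom{n + k - 1}{k}$ such multi-indices, which is why the tables grow quickly with $n$.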
If this is still unclear, as mentioned earlier, this article about summation and multi-indices provides more examples
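To tie the notation together before the concrete examples, here is a minimal plain-Python sketch of the basic multi-index operations, $|\mathbf \alpha|$, $\mathbf \alpha!$ and $\mathbf x^{\mathbf \alpha}$ (the helper names are my own):

```python
from math import factorial

def multi_index_length(alpha):
    # |alpha| = alpha_1 + alpha_2 + ... + alpha_n
    return sum(alpha)

def multi_index_factorial(alpha):
    # alpha! = alpha_1! * alpha_2! * ... * alpha_n!
    result = 1
    for a in alpha:
        result *= factorial(a)
    return result

def multi_index_power(x, alpha):
    # x^alpha = x_1^alpha_1 * x_2^alpha_2 * ... * x_n^alpha_n
    result = 1.0
    for xi, ai in zip(x, alpha):
        result *= xi ** ai
    return result

alpha = (1, 2, 1)
print(multi_index_length(alpha))                   # 4
print(multi_index_factorial(alpha))                # 1! * 2! * 1! = 2
print(multi_index_power((2.0, 3.0, 5.0), alpha))   # 2 * 9 * 5 = 90.0
```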
Now here is the 2D example with $\mathbf a = \{x_a, y_a\}$ and $\mathbf x = \{x, y\}$:
| α_1 | α_2 |
*-----------*
|  2  |  0  |   α! = 2
|  0  |  2  |   α! = 2
|  1  |  1  |   α! = 1

$$ \begin{aligned} (x - x_a)^2 & f_{xx}(x_a, y_a) /2 + \\ (y - y_a)^2 & f_{yy}(x_a, y_a) /2 + \\ (x - x_a)(y-y_a) & f_{xy}(x_a, y_a) \end{aligned} $$
With the partial derivatives:
$$ f_{xx} = \frac{\partial^2 f}{\partial x^2}, \quad f_{yy} = \frac{\partial^2 f}{\partial y^2}, \quad f_{xy} = \frac{\partial^2 f}{\partial x \partial y} $$

Now, because in the long run we would like to re-write this in matrix form, we are going to use the following trick:
$$ (x - x_a)(y-y_a) f_{xy}(x_a, y_a) = \frac{(x - x_a)(y-y_a) f_{xy}(x_a, y_a) + (y - y_a)(x-x_a) f_{yx}(x_a, y_a)}{2} $$

We used the fact that if $f$ is $C^2$ (i.e. twice continuously differentiable) then $f_{xy} = f_{yx}$. We can now re-write the second order term of the Taylor expansion as follows:
$$ \begin{aligned} (x - x_a)^2 & f_{xx}(x_a, y_a) /2 + \\ (y - y_a)^2 & f_{yy}(x_a, y_a) /2 + \\ (x - x_a)(y-y_a) & f_{xy}(x_a, y_a) /2 + \\ (y - y_a)(x-x_a) & f_{yx}(x_a, y_a) /2 \end{aligned} $$

3D example with $\mathbf a = \{x_a, y_a, z_a\}$ and $\mathbf x = \{x, y, z\}$; here I applied the same trick, remarking that $f_{xy} = (f_{xy} + f_{yx})/2$, $f_{xz} = (f_{xz} + f_{zx})/2$ etc.:
$$ \begin{aligned} (x - x_a)^2 & f_{xx}(x_a, y_a, z_a) /2 + \\ (y - y_a)^2 & f_{yy}(x_a, y_a, z_a) /2 + \\ (z - z_a)^2 & f_{zz}(x_a, y_a, z_a) /2 + \\ \\ (x - x_a)(y-y_a) & f_{xy}(x_a, y_a, z_a) /2 + \\ (y - y_a)(x-x_a) & f_{yx}(x_a, y_a, z_a) /2 + \\ \\ (x - x_a)(z-z_a) & f_{xz}(x_a, y_a, z_a) /2 + \\ (z - z_a)(x-x_a) & f_{zx}(x_a, y_a, z_a) /2 + \\ \\ (y - y_a)(z-z_a) & f_{yz}(x_a, y_a, z_a) /2 + \\ (z - z_a)(y-y_a) & f_{zy}(x_a, y_a, z_a) /2 \end{aligned} $$

And the general expression in nD:
$$ t(\mathbf x) = f(\mathbf a) + \nabla f(\mathbf a)^T . (\mathbf x - \mathbf a) + \frac{1}{2} (\mathbf x - \mathbf a)^T . H[f(\mathbf a)] . (\mathbf x - \mathbf a) $$

Here is an example in video of the second order development. Note: strictly speaking, they demonstrate the Maclaurin series, which is a simple alternate formulation of the Taylor series (the expansion around $\mathbf a = \mathbf 0$).
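The $t(\mathbf x)$ formula above is straightforward to evaluate with NumPy. A hedged sketch using a hand-derived gradient and Hessian for $f(x, y) = x^2 y$ (the function choice and all names are my own, for illustration only):

```python
import numpy as np

# Second-order Taylor approximation:
#   t(x) = f(a) + ∇f(a)·(x - a) + (1/2) (x - a)^T H(a) (x - a)
def f(p):
    x, y = p
    return x * x * y

def grad(p):
    x, y = p
    return np.array([2.0 * x * y, x * x])      # [f_x, f_y]

def hessian(p):
    x, y = p
    return np.array([[2.0 * y, 2.0 * x],       # [[f_xx, f_xy],
                     [2.0 * x, 0.0]])          #  [f_yx, f_yy]]

a = np.array([1.0, 1.0])
x = np.array([1.1, 0.9])
d = x - a

t = f(a) + grad(a) @ d + 0.5 * d @ hessian(a) @ d
print(f(x))   # exact value, ≈ 1.089
print(t)      # second-order approximation, ≈ 1.09; error is O(|d|^3)
```

Note that the Hessian is symmetric ($f_{xy} = f_{yx}$), which is exactly the property the averaging trick above relied on.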
3rd order development
The 3rd-order term involves a rank-3 tensor of third partial derivatives (a 2x2x2 tensor for a function of two variables), as described in https://math.stackexchange.com/a/4886960/895334
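Rather than writing that tensor explicitly, the multi-index formula handles any order uniformly. As an illustration, here is a sketch (assuming SymPy is available for the symbolic $D^\alpha$ work; the function name is my own) that evaluates the degree-$k$ Taylor polynomial directly from the formula at the top of the article:

```python
import itertools
from math import factorial

import sympy as sp

def taylor_poly(f, vars_, a, k):
    # Multi-index Taylor polynomial of f around a, up to total degree k:
    #   sum over |alpha| <= k of D^alpha f(a) / alpha! * (x - a)^alpha
    n = len(vars_)
    poly = sp.Integer(0)
    # Enumerate all alpha with 0 <= alpha_i <= k and |alpha| <= k.
    for alpha in itertools.product(range(k + 1), repeat=n):
        if sum(alpha) > k:
            continue
        deriv = f
        for v, ai in zip(vars_, alpha):      # D^alpha f
            deriv = sp.diff(deriv, v, ai)
        coeff = deriv.subs(dict(zip(vars_, a)))
        fact = 1
        for ai in alpha:                      # alpha!
            fact *= factorial(ai)
        term = coeff / fact
        for v, av, ai in zip(vars_, a, alpha):  # (x - a)^alpha
            term *= (v - av) ** ai
        poly += term
    return sp.expand(poly)

x, y = sp.symbols("x y")
# f has total degree 3, so its 3rd-order expansion around any
# point must reproduce f exactly.
print(taylor_poly(x**3 + x * y**2, (x, y), (1, 2), 3))
```

Since the test polynomial already has total degree 3, the degree-3 expansion returns it unchanged, which is a handy self-check for the implementation.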