MatrixCalculus

Vector and matrix differentiation

fourier.eng.hmc.edu: algebra.

A vector differentiation operator is defined as

$$ \frac{d}{d{\bf x}}\stackrel{\triangle}{=} \left[ \frac{\partial}{\partial x_1},\cdots, \frac{\partial}{\partial x_n} \right]^T $$

which can be applied to any scalar function $f({\bf x})$ to find its derivative with respect to ${\bf x}$:

$$ \frac{d}{d{\bf x}} f({\bf x}) = \left[ \frac{\partial f}{\partial x_1},\cdots, \frac{\partial f}{\partial x_n} \right]^T $$

Vector differentiation has the following properties, where ${\bf b}$ is a constant $n\times 1$ vector and ${\bf A}$ is a constant $n\times n$ matrix:

$$ \frac{d}{d{\bf x}}({\bf b}^T{\bf x})=\frac{d}{d{\bf x}}({\bf x}^T{\bf b})={\bf b} $$

$$ \frac{d}{d{\bf x}}({\bf x}^T{\bf x})=2{\bf x} $$

$$ \frac{d}{d{\bf x}}({\bf x}^T{\bf A}{\bf x})=({\bf A}^T+{\bf A}){\bf x} $$

To prove the third one, consider the $k$th element of the vector:

$$ \frac{\partial}{\partial x_k} ({\bf x}^{T}{\bf A}{\bf x}) = \frac{\partial}{\partial x_k} \sum_{i=1}^n \sum_{j=1}^n a_{ij}x_ix_j = \sum_{i=1}^n a_{ik}x_i + \sum_{j=1}^n a_{kj}x_j \;\;\;\;\;\;\;\;(k=1, \cdots, n) $$

The first sum is the $k$th element of ${\bf A}^T{\bf x}$ and the second is the $k$th element of ${\bf A}{\bf x}$, so putting all $n$ elements in vector form gives $({\bf A}^T+{\bf A}){\bf x}$. If ${\bf A}$ is symmetric (${\bf A}^T={\bf A}$), then we have

$$ \frac{d}{d{\bf x}}({\bf x}^T{\bf A}{\bf x})=2{\bf A}{\bf x} $$

In particular, when ${\bf A}={\bf I}$, we recover the second property:

$$ \frac{d}{d{\bf x}}({\bf x}^T{\bf x})=2{\bf x} $$

You can compare these results with the familiar derivative in the scalar case:

$$ \frac{d}{dx}(ax^2)=2ax $$

A matrix differentiation operator is defined as

$$ \frac{d}{d{\bf A}}\stackrel{\triangle}{=}\left[ \begin{array}{ccc} \frac{\partial}{\partial a_{11}} & \cdots & \frac{\partial}{\partial a_{1n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial}{\partial a_{m1}} & \cdots & \frac{\partial}{\partial a_{mn}} \end{array} \right] $$

which can be applied to any scalar function $f({\bf A})$:

$$ \frac{d}{d{\bf A}}f({\bf A})=\left[ \begin{array}{ccc} \frac{\partial f({\bf A})}{\partial a_{11}} & \cdots & \frac{\partial f({\bf A})}{\partial a_{1n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f({\bf A})}{\partial a_{m1}} & \cdots & \frac{\partial f({\bf A})}{\partial a_{mn}} \end{array} \right] $$

Specifically, consider $f({\bf A})={\bf u}^T {\bf A} {\bf v}$, where ${\bf u}$ and ${\bf v}$ are $m\times 1$ and $n\times 1$ constant vectors, respectively, and ${\bf A}$ is an $m\times n$ matrix.
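Then, since ${\bf u}^T{\bf A}{\bf v}=\sum_{i=1}^m\sum_{j=1}^n u_i a_{ij} v_j$, the $(i,j)$th element of the derivative is $\partial f/\partial a_{ij}=u_i v_j$, and we have: $$ \frac{d}{d{\bf A}} ({\bf u}^T {\bf A} {\bf v}) = {\bf u} {\bf v}^T $$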
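
The identities above can be sanity-checked numerically. The following is a minimal sketch in pure Python that compares each closed-form derivative against a central finite-difference approximation. All helper names (`grad_vec`, `grad_mat`, `matvec`, `dot`, `transpose`, `close`), the step size `h`, the tolerance, and the test values are assumptions chosen for illustration; they are not part of the original notes.

```python
# Numerical check of the vector and matrix derivative identities using
# central finite differences. Pure Python, no third-party libraries.
# All helper names and test values here are illustrative assumptions.

def dot(u, v):
    """Inner product u^T v of two lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Matrix-vector product A x, with A stored as a list of rows."""
    return [dot(row, x) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def grad_vec(f, x, h=1e-6):
    """Approximate d f / d x for scalar f(x), perturbing one element at a time."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def grad_mat(f, A, h=1e-6):
    """Approximate d f / d A for scalar f(A), perturbing one entry at a time."""
    G = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            Ap = [row[:] for row in A]
            Am = [row[:] for row in A]
            Ap[i][j] += h
            Am[i][j] -= h
            G[i][j] = (f(Ap) - f(Am)) / (2 * h)
    return G

def close(p, q, tol=1e-5):
    """True if two equal-length lists agree element-wise within tol."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

# Test data: A is deliberately NOT symmetric, so (A^T + A) x differs from 2 A x.
A = [[1.0, 2.0, 0.0], [-1.0, 3.0, 4.0], [5.0, 0.0, 2.0]]
x = [0.5, -1.0, 2.0]
b = [1.0, 2.0, 3.0]

# d/dx (b^T x) = b
linear_ok = close(grad_vec(lambda z: dot(b, z), x), b)

# d/dx (x^T A x) = (A^T + A) x
quad_grad = grad_vec(lambda z: dot(z, matvec(A, z)), x)
quad_expected = [p + q for p, q in zip(matvec(transpose(A), x), matvec(A, x))]
quad_ok = close(quad_grad, quad_expected)

# d/dA (u^T A v) = u v^T, with A of size m x n (here 2 x 3)
u = [2.0, -1.0]
v = [1.0, 0.5, -3.0]
M = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
bilinear_grad = grad_mat(lambda B: dot(u, matvec(B, v)), M)
outer_uv = [[ui * vj for vj in v] for ui in u]
bilinear_ok = all(close(g, e) for g, e in zip(bilinear_grad, outer_uv))

print(linear_ok, quad_ok, bilinear_ok)
```

Because the test matrix `A` is not symmetric, the check distinguishes the general result $({\bf A}^T+{\bf A}){\bf x}$ from the symmetric special case $2{\bf A}{\bf x}$. Finite differences only approximate the derivative, so agreement within the tolerance is evidence for the formulas, not a proof.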