All notes

Probability Theory

Set Theory

One of the main objectives of a statistician is to draw conclusions about a




Suppose a variate $X$ has a distribution $P(x)$ with population mean $\mu$ and population variance $\text{var}(X)$ (also written as $\sigma^2$). $$ \sigma^2 \equiv E[(X-\mu)^2] = \int P(x)(x-\mu)^2 \, dx $$ Here $E(\cdot)$ denotes the expectation value, so $\mu = E(X)$. The variance is therefore equal to the second central moment $\mu_2$.

Sample variance (with $\bar{x}$ the sample mean of $x_1,\dots,x_N$): $$ s_N^2 \equiv \frac{1}{N} \sum_{i=1}^N (x_i - \bar{x})^2 $$ Unbiased sample variance: $$ s_{N-1}^2 \equiv \frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2 $$
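As a quick check of the two definitions, here is a minimal sketch in Python. The helper name `sample_variance`, its `ddof` argument, and the data values are illustrative assumptions, not part of the notes; the standard-library functions `statistics.pvariance` (divides by $N$) and `statistics.variance` (divides by $N-1$) are used only for comparison.

```python
import statistics

def sample_variance(xs, ddof=0):
    """Sum of squared deviations from the sample mean, divided by (N - ddof).

    ddof=0 gives s_N^2, ddof=1 gives s_{N-1}^2.
    """
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - ddof)

# Illustrative data only (hypothetical values).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s2_n = sample_variance(data, ddof=0)   # divides by N
s2_n1 = sample_variance(data, ddof=1)  # divides by N - 1

print(s2_n, statistics.pvariance(data))
print(s2_n1, statistics.variance(data))
```

The two estimates differ only by the factor $N/(N-1)$, so the gap shrinks as $N$ grows.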

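The reason that $s_N^2$ is a biased estimator of the population variance is that two free parameters, $\mu$ and $\sigma^2$, are being estimated from the same data. The deviations are measured from the sample mean $\bar{x}$ rather than from the true mean $\mu$, which makes them systematically too small: $E(s_N^2) = \frac{N-1}{N}\sigma^2$. Dividing by $N-1$ instead of $N$ removes this bias.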
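The bias can be seen numerically. The following is a rough simulation sketch, assuming standard normal data ($\mu = 0$, $\sigma^2 = 1$); the function name `mean_sample_variance`, the sample size, the number of trials, and the seed are all arbitrary choices made for illustration.

```python
import random

def mean_sample_variance(n, trials, ddof, seed=0):
    """Average the sample variance (divisor n - ddof) over many samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
        m = sum(xs) / n
        total += sum((x - m) ** 2 for x in xs) / (n - ddof)
    return total / trials

N = 5
avg_biased = mean_sample_variance(N, 20000, ddof=0)    # expected near (N-1)/N = 0.8
avg_unbiased = mean_sample_variance(N, 20000, ddof=1)  # expected near 1.0

print(avg_biased, avg_unbiased)
```

With small $N$ the divide-by-$N$ average sits visibly below the true variance, while the divide-by-$(N-1)$ average stays close to it.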

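When $\sigma^2$ is unknown and has to be estimated from the sample, Student's t-distribution is, loosely speaking, the "best" that can be done: for normally distributed data, $(\bar{x}-\mu)/(s_{N-1}/\sqrt{N})$ follows a t-distribution with $N-1$ degrees of freedom.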

For normally distributed data, the quantity $Ns_N^2/\sigma^2$ has a chi-squared distribution with $N-1$ degrees of freedom.

Distribution comparison

Q-Q plot: "Q" stands for quantile. It is a graphical tool for comparing two distributions by plotting their quantiles against each other. If the points fall close to a straight line, the two distributions have the same shape up to location and scale; departures from the line show where they differ.
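The points of a Q-Q plot can be computed without any plotting library. The sketch below uses `statistics.quantiles` from the Python standard library; the helper `qq_pairs`, the number of quantiles, and the synthetic samples are assumptions made for illustration. The second sample is a linear transform of the first, so the quantile pairs should lie on the line $y = 2x + 3$.

```python
import random
import statistics

def qq_pairs(sample_a, sample_b, n_quantiles=20):
    """Pair up matching quantiles of two samples (the points of a Q-Q plot)."""
    qa = statistics.quantiles(sample_a, n=n_quantiles)
    qb = statistics.quantiles(sample_b, n=n_quantiles)
    return list(zip(qa, qb))

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]  # synthetic sample
y = [2.0 * v + 3.0 for v in x]                 # same shape, shifted and scaled

pairs = qq_pairs(x, y)
for qx, qy in pairs[:3]:
    print(qx, qy)
```

Plotting `pairs` as a scatter plot gives the Q-Q plot itself; samples from differently shaped distributions would bend away from a straight line.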

Hidden Markov model



Notational conventions
$T$ = length of the sequence of observations (training set)
$N$ = number of states (we either know or guess this number)
$M$ = number of possible observations (from the training set)
$\Omega_X = \{q_1,\dots,q_N\}$ (finite set of possible states)
$\Omega_O = \{v_1,\dots,v_M\}$ (finite set of possible observations)
$X_t$ = random variable denoting the state at time $t$ (state variable)
$O_t$ = random variable denoting the observation at time $t$ (output variable)
$\sigma = o_1,...,o_T$ (sequence of actual observations)

Distributional parameters
$A = \{a_{ij}\}$ s.t. $a_{ij} = \Pr(X_{t+1} = q_j \mid X_t = q_i)$ (transition probabilities)
$B = \{b_i\}$ s.t. $b_i(k) = \Pr(O_t = v_k \mid X_t = q_i)$ (observation probabilities)
$\pi = \{\pi_i\}$ s.t. $\pi_i = \Pr(X_0 = q_i)$ (initial state distribution)
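To make the three parameter sets concrete, here is a small sketch of a toy model with $N = 2$ states and $M = 3$ observation symbols. All numbers are made up for illustration, and the helpers `is_stochastic` and `sample_hmm` are hypothetical names. The sketch also assumes that time is indexed from 0 and that every state, including the initial one, emits exactly one observation.

```python
import random

# Toy parameters (illustrative values only).
A = [[0.9, 0.1],        # A[i][j] = Pr(next state is q_j | current state is q_i)
     [0.2, 0.8]]
B = [[0.6, 0.3, 0.1],   # B[i][k] = Pr(observation is v_k | state is q_i)
     [0.1, 0.2, 0.7]]
pi = [0.5, 0.5]         # pi[i] = Pr(initial state is q_i)

def is_stochastic(rows, tol=1e-9):
    """Check that every row is a probability distribution."""
    return all(min(r) >= 0.0 and abs(sum(r) - 1.0) < tol for r in rows)

def sample_hmm(A, B, pi, T, seed=0):
    """Draw T hidden states and T observations (as indices) from the model."""
    rng = random.Random(seed)
    n_states, n_symbols = len(pi), len(B[0])
    hidden, observed = [], []
    state = rng.choices(range(n_states), weights=pi)[0]
    for _ in range(T):
        hidden.append(state)
        observed.append(rng.choices(range(n_symbols), weights=B[state])[0])
        state = rng.choices(range(n_states), weights=A[state])[0]
    return hidden, observed

assert is_stochastic(A) and is_stochastic(B) and is_stochastic([pi])
hidden, observed = sample_hmm(A, B, pi, T=10)
print(hidden)
print(observed)
```

Each row of $A$ and $B$, and the vector $\pi$, must sum to 1; in practice only `observed` is available, and the state sequence stays hidden.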