$\newcommand{\br}{\\}$ $\newcommand{\R}{\mathbb{R}}$ $\newcommand{\Q}{\mathbb{Q}}$ $\newcommand{\Z}{\mathbb{Z}}$ $\newcommand{\N}{\mathbb{N}}$ $\newcommand{\C}{\mathbb{C}}$ $\newcommand{\P}{\mathbb{P}}$ $\newcommand{\F}{\mathbb{F}}$ $\newcommand{\L}{\mathcal{L}}$ $\newcommand{\spa}[1]{\text{span}(#1)}$ $\newcommand{\dist}[1]{\text{dist}(#1)}$ $\newcommand{\max}[1]{\text{max}(#1)}$ $\newcommand{\min}[1]{\text{min}(#1)}$ $\newcommand{\supr}[1]{\text{sup}(#1)}$ $\newcommand{\infi}[1]{\text{inf}(#1)}$ $\newcommand{\ite}[1]{\text{int}(#1)}$ $\newcommand{\ext}[1]{\text{ext}(#1)}$ $\newcommand{\bdry}[1]{\partial #1}$ $\newcommand{\argmax}[1]{\underset{#1}{\text{argmax }}}$ $\newcommand{\argmin}[1]{\underset{#1}{\text{argmin }}}$ $\newcommand{\set}[1]{\left\{#1\right\}}$ $\newcommand{\emptyset}{\varnothing}$ $\newcommand{\otherwise}{\text{ otherwise }}$ $\newcommand{\if}{\text{ if }}$ $\newcommand{\proj}{\text{proj}}$ $\newcommand{\union}{\cup}$ $\newcommand{\intercept}{\cap}$ $\newcommand{\abs}[1]{\left| #1 \right|}$ $\newcommand{\norm}[1]{\left\lVert#1\right\rVert}$ $\newcommand{\pare}[1]{\left(#1\right)}$ $\newcommand{\brac}[1]{\left[#1\right]}$ $\newcommand{\t}[1]{\text{ #1 }}$ $\newcommand{\head}{\text H}$ $\newcommand{\tail}{\text T}$ $\newcommand{\d}{\text d}$ $\newcommand{\limu}[2]{\underset{#1 \to #2}\lim}$ $\newcommand{\der}[2]{\frac{\d #1}{\d #2}}$ $\newcommand{\derw}[2]{\frac{\d #1^2}{\d^2 #2}}$ $\newcommand{\pder}[2]{\frac{\partial #1}{\partial #2}}$ $\newcommand{\pderw}[2]{\frac{\partial^2 #1}{\partial #2^2}}$ $\newcommand{\pderws}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}}$ $\newcommand{\inv}[1]{{#1}^{-1}}$ $\newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $\newcommand{\nullity}[1]{\text{nullity}(#1)}$ $\newcommand{\rank}[1]{\text{rank }#1}$ $\newcommand{\nullspace}[1]{\mathcal{N}\pare{#1}}$ $\newcommand{\range}[1]{\mathcal{R}\pare{#1}}$ $\newcommand{\var}[1]{\text{var}(#1)}$ $\newcommand{\cov}[1]{\text{cov}(#1)}$ $\newcommand{\tr}[1]{\text{tr}(#1)}$ 
$\newcommand{\oto}{\text{ one-to-one }}$ $\newcommand{\ot}{\text{ onto }}$ $\newcommand{\ceil}[1]{\lceil#1\rceil}$ $\newcommand{\floor}[1]{\lfloor#1\rfloor}$ $\newcommand{\Re}[1]{\text{Re}(#1)}$ $\newcommand{\Im}[1]{\text{Im}(#1)}$ $\newcommand{\dom}[1]{\text{dom}(#1)}$ $\newcommand{\fnext}[1]{\overset{\sim}{#1}}$ $\newcommand{\transpose}[1]{{#1}^{\text{T}}}$ $\newcommand{\b}[1]{\boldsymbol{#1}}$ $\newcommand{\None}[1]{}$ $\newcommand{\Vcw}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Vce}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Vcr}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Vct}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ $\newcommand{\Vcy}[6]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{bmatrix}}$ $\newcommand{\Vcu}[7]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{bmatrix}}$ $\newcommand{\vcw}[2]{\begin{matrix} #1 \br #2 \end{matrix}}$ $\newcommand{\vce}[3]{\begin{matrix} #1 \br #2 \br #3 \end{matrix}}$ $\newcommand{\vcr}[4]{\begin{matrix} #1 \br #2 \br #3 \br #4 \end{matrix}}$ $\newcommand{\vct}[5]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \end{matrix}}$ $\newcommand{\vcy}[6]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{matrix}}$ $\newcommand{\vcu}[7]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{matrix}}$ $\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ $\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Mqr}[4]{\begin{bmatrix} #1 & #2 & #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqt}[5]{\begin{bmatrix} #1 & #2 & #3 & #4 & #5 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mrq}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Mtq}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ 
$\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Mww}[4]{\begin{bmatrix} #1 & #2 \br #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mwe}[6]{\begin{bmatrix} #1 & #2 & #3\br #4 & #5 & #6 \end{bmatrix}}$ $\newcommand{\Mew}[6]{\begin{bmatrix} #1 & #2 \br #3 & #4 \br #5 & #6 \end{bmatrix}}$ $\newcommand{\Mee}[9]{\begin{bmatrix} #1 & #2 & #3 \br #4 & #5 & #6 \br #7 & #8 & #9 \end{bmatrix}}$
Definition: Covariance

Let $X$ and $Y$ be two random variables.

The covariance of $X$ and $Y$ is defined by

$$\begin{align*} \cov{X,Y} &= E\brac{(X - E\brac{X}) (Y - E\brac{Y})} \br &= E\brac{XY} - E\brac{X} E\brac{Y} \end{align*}$$

When $\cov{X, Y} = 0$ we say that $X$ and $Y$ are uncorrelated.
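
As a quick sanity check, the two expressions in the definition can be evaluated on a small finite joint distribution. The sketch below is only an illustration: the joint pmf `joint_pmf` and the helper names `expectation` and `covariance` are assumptions made for this example and do not come from the notes.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y): keys are (x, y) pairs, values are probabilities.
joint_pmf = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 3): Fraction(1, 4),
}


def expectation(g, pmf):
    """E[g(X, Y)] for a finite joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


def covariance(pmf):
    """Return cov(X, Y) computed with the centered and the shortcut formula."""
    mean_x = expectation(lambda x, y: x, pmf)
    mean_y = expectation(lambda x, y: y, pmf)
    centered = expectation(lambda x, y: (x - mean_x) * (y - mean_y), pmf)
    shortcut = expectation(lambda x, y: x * y, pmf) - mean_x * mean_y
    return centered, shortcut


centered, shortcut = covariance(joint_pmf)
print(centered, shortcut)  # both formulas give 13/16 for this pmf
```

Exact rational arithmetic (`Fraction`) is used so that the two formulas agree exactly rather than up to rounding.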

Properties: Properties of Covariance
  1. (Symmetry) $\cov{X, Y} = \cov {Y, X} $
  2. $\cov{X, X} = \var{X} $
  3. $\forall a, b \in \R$, $\cov{X, aY + b} = a \cov{X, Y}$
  4. $\cov{X, Y+Z} = \cov{X, Y} + \cov{X, Z} $
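
The four properties above can be checked on a toy example. This is a minimal sketch, assuming that $(X, Y, Z)$ is uniform on a small hypothetical set of outcomes; the names `outcomes`, `mean` and `cov` are introduced here for illustration only.

```python
from fractions import Fraction

# Assumed toy model: (X, Y, Z) is uniform on these hypothetical outcomes.
outcomes = [(0, 1, 2), (1, 3, 0), (2, 2, 5), (4, 0, 1)]
prob = Fraction(1, len(outcomes))


def mean(f):
    """E[f(X, Y, Z)] under the uniform distribution on `outcomes`."""
    return sum(prob * f(*w) for w in outcomes)


def cov(f, g):
    """cov(f, g) = E[fg] - E[f]E[g]."""
    return mean(lambda *w: f(*w) * g(*w)) - mean(f) * mean(g)


def X(x, y, z):
    return x


def Y(x, y, z):
    return y


def Z(x, y, z):
    return z


a, b = 3, 7  # arbitrary real constants for property 3

symmetry = cov(X, Y) == cov(Y, X)
variance = cov(X, X) == mean(lambda *w: (X(*w) - mean(X)) ** 2)
affine = cov(X, lambda *w: a * Y(*w) + b) == a * cov(X, Y)
additive = cov(X, lambda *w: Y(*w) + Z(*w)) == cov(X, Y) + cov(X, Z)
print(symmetry, variance, affine, additive)
```

A numerical check like this does not prove the properties; each follows from linearity of expectation.
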
Definition: Correlation Coefficient

The correlation coefficient $\rho\pare{ X, Y }$ of two random variables $X, Y$ with $\var{X} \neq 0$ and $\var{Y} \neq 0$ is defined as

$$\rho (X, Y) = \frac{ \cov{X, Y}}{ \sqrt{ \var{X} \var{Y}}} = \frac{ \cov{X, Y}}{ \sigma_X \sigma_Y }$$

The correlation coefficient satisfies $$-1 \leq \rho (X, Y) \leq 1$$.

If $\rho > 0$, then the values of $X - E\brac{X}$ and $Y - E\brac{Y}$ “tend” to have the same sign.

If $\rho < 0$, then the values of $X - E\brac{X}$ and $Y - E\brac{Y}$ “tend” to have opposite signs.

The size of $\abs{\rho}$ provides a normalized measure of the extent to which this is true.

Properties

For random variables $X$ and $Y$ with nonzero variances:

$$\begin{align*} \rho = 1 &\iff \exists c > 0, Y - E[Y] = c(X - E[X]) \br \rho = -1 &\iff \exists c < 0, Y - E[Y] = c(X - E[X]) \end{align*}$$
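
The extreme cases $\rho = \pm 1$ can be illustrated numerically. The sketch below assumes $(X, Y)$ is uniform on a finite list of pairs; the helper `correlation` and the sample points are hypothetical choices for this example.

```python
import math
from fractions import Fraction


def correlation(pairs):
    """rho(X, Y) when (X, Y) is uniform on the given list of integer (x, y) pairs."""
    n = len(pairs)
    mean_x = Fraction(sum(x for x, _ in pairs), n)
    mean_y = Fraction(sum(y for _, y in pairs), n)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return float(cov_xy) / math.sqrt(float(var_x * var_y))


xs = [0, 1, 2, 3]
rho_increasing = correlation([(x, 2 * x + 1) for x in xs])   # Y = 2X + 1, slope > 0
rho_decreasing = correlation([(x, -3 * x + 4) for x in xs])  # Y = -3X + 4, slope < 0
rho_mixed = correlation([(0, 1), (1, 0), (2, 3), (3, 2)])    # no exact linear relation
print(rho_increasing, rho_decreasing, rho_mixed)
```

The first two cases are exact linear relations and sit at the ends of the interval $[-1, 1]$. The third is positively related but not linear, so its coefficient lies strictly between $0$ and $1$.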

Example

Let $X$ be a discrete random variable with probability mass function $p_X(x)$ and let $Y$ be a continuous random variable with density function $f_Y(y)$. Suppose that $X$ and $Y$ are independent. What is the distribution of $X+Y$?

Proof

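$F_{X+Y}(t) = \b{P}(X +Y \leq t) = \sum_{ x \in \R } \b{P}(X = x, X+Y \leq t) = \sum_{ x \in \R } \b{P}(X = x, Y \leq t - x) $

Here only the countably many $x$ with $p_X(x) > 0$ contribute to the sums. By independence of $X$ and $Y$, this equals

$\sum_{ x \in \R } \b{P}(X = x)\b{P}(Y \leq t -x) = \sum_{ x \in \R } p_X(x) F_Y(t - x) $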

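Assuming the sum may be differentiated term by term (this holds, for example, when $X$ takes finitely many values and $f_Y$ is continuous), $F_{X+Y}(t) $ is differentiable. Furthermore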

$\der{ }{ t } F_{X+Y}(t) = \der{ }{ t } (\sum_{ x \in \R } p_X(x) F_Y(t - x)) = \sum_{ x \in \R } \der{ }{ t } \pare{p_X(x) F_Y(t- x)} = \sum_{ x \in \R } p_X(x) f_Y(t- x)$
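
The mixture formula $f_{X+Y}(t) = \sum_x p_X(x) f_Y(t - x)$ can be compared with a numerical derivative of the CDF. The following is a sketch under assumed inputs: $X$ takes the values $0$ and $2$ with probabilities $0.3$ and $0.7$, and $Y$ is standard normal. None of these choices come from the example itself.

```python
import math

# Assumed pmf of X (hypothetical values); Y is assumed standard normal, independent of X.
p_X = {0: 0.3, 2: 0.7}


def f_Y(y):
    """Standard normal density."""
    return math.exp(-y * y / 2) / math.sqrt(2 * math.pi)


def F_Y(y):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))


def cdf_sum(t):
    """F_{X+Y}(t) = sum over x of p_X(x) * F_Y(t - x)."""
    return sum(p * F_Y(t - x) for x, p in p_X.items())


def density_sum(t):
    """f_{X+Y}(t) = sum over x of p_X(x) * f_Y(t - x)."""
    return sum(p * f_Y(t - x) for x, p in p_X.items())


t, h = 1.0, 1e-5
# Central finite difference approximating the derivative of the CDF at t.
numerical_derivative = (cdf_sum(t + h) - cdf_sum(t - h)) / (2 * h)
print(numerical_derivative, density_sum(t))
```

Because $X$ has finitely many values here, differentiating the sum term by term is unproblematic, and the two printed numbers should agree up to the finite-difference error.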

Theorem: Cauchy-Schwarz Inequality

Let $X $ and $Y $ be random variables. Then,

$$-1 \leq \rho (X, Y) \leq 1$$

Equivalently, $\abs{\rho(X, Y)} \leq 1 \iff \cov{ X, Y }^2 \leq \var{ X } \var{ Y } $, that is,

$$\pare{E\brac{(X - E\brac{ X }) (Y - E\brac{ Y })}}^2 \leq E\brac{ (X - E\brac{ X })^2 } E\brac{ (Y - E\brac{ Y })^2 } $$
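
One standard way to see why the inequality holds is sketched below. It assumes $\var{Y} > 0$; the symbols $X_c$, $Y_c$ and $\lambda$ are notation introduced only for this sketch. Write $X_c = X - E\brac{X}$ and $Y_c = Y - E\brac{Y}$. For every $\lambda \in \R$ the expectation of a square is nonnegative, so

$$\begin{align*} 0 &\leq E\brac{(X_c - \lambda Y_c)^2} \br &= \var{X} - 2 \lambda \cov{X, Y} + \lambda^2 \var{Y} \end{align*}$$

Choosing $\lambda = \cov{X, Y} / \var{Y}$ gives

$$0 \leq \var{X} - \frac{\cov{X, Y}^2}{\var{Y}}$$

and multiplying through by $\var{Y}$ yields $\cov{X, Y}^2 \leq \var{X} \var{Y}$.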

Corollary

$\abs{ \rho\pare{ X, Y}} = 1 \iff X = \alpha Y + \beta $ with probability $1$ for some $\alpha, \beta \in \R $ with $\alpha \neq 0$.

$\rho\pare{ X, Y } = 1 \iff X = \alpha Y + \beta $ for some $\alpha > 0 $ and $\beta \in \R$.

$\rho\pare{ X, Y } = -1 \iff X = \alpha Y + \beta$ for some $\alpha < 0 $ and $\beta \in \R$.

Proof

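Let $X' = X - E\brac{ X }$, $Y' = Y - E\brac{ Y }$ and $\alpha = \cov{ X, Y } / \var{ Y }$, so that $\alpha$ has the same sign as $\rho\pare{ X, Y }$. Expanding the square gives $E\brac{(X' - \alpha Y')^2 } = \var{ X } \pare{1 - \rho\pare{ X, Y }^2}$, hence

$\abs{ \rho\pare{ X, Y }} = 1 \iff E\brac{(X' - \alpha Y')^2 } = 0 $

$\iff X' - \alpha Y' = 0 $ with probability $1$, that is,

$(X - E\brac{ X }) - \alpha (Y - E\brac{ Y }) = 0 $

$X = \alpha Y + E\brac{ X } - \alpha E\brac{ Y } = \alpha Y + \beta $

where $\beta = E\brac{ X } - \alpha E\brac{ Y } $.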

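Conversely, if $X = \alpha Y + \beta$ with $\alpha \neq 0$, then $\cov{ X, Y } = \cov{ \alpha Y + \beta, Y } = \alpha \cov{ Y, Y } = \alpha \var{ Y }$

$\var{ X } = \var{ \alpha Y + \beta } = \alpha^2 \var{ Y }$

Hence $\rho\pare{ X, Y } = \frac{\alpha \var{ Y }}{\sqrt{\alpha^2 \var{ Y }^2}} = \frac{\alpha}{\abs{\alpha}}$, which equals $1$ if $\alpha > 0$ and $-1$ if $\alpha < 0$.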