$\newcommand{\br}{\\}$ $\newcommand{\R}{\mathbb{R}}$ $\newcommand{\Q}{\mathbb{Q}}$ $\newcommand{\Z}{\mathbb{Z}}$ $\newcommand{\N}{\mathbb{N}}$ $\newcommand{\C}{\mathbb{C}}$ $\newcommand{\P}{\mathbb{P}}$ $\newcommand{\F}{\mathbb{F}}$ $\newcommand{\L}{\mathcal{L}}$ $\newcommand{\spa}[1]{\text{span}(#1)}$ $\newcommand{\dist}[1]{\text{dist}(#1)}$ $\newcommand{\max}[1]{\text{max}(#1)}$ $\newcommand{\min}[1]{\text{min}(#1)}$ $\newcommand{\supr}[1]{\text{sup}(#1)}$ $\newcommand{\infi}[1]{\text{inf}(#1)}$ $\newcommand{\ite}[1]{\text{int}(#1)}$ $\newcommand{\ext}[1]{\text{ext}(#1)}$ $\newcommand{\bdry}[1]{\partial #1}$ $\newcommand{\argmax}[1]{\underset{#1}{\text{argmax }}}$ $\newcommand{\argmin}[1]{\underset{#1}{\text{argmin }}}$ $\newcommand{\set}[1]{\left\{#1\right\}}$ $\newcommand{\emptyset}{\varnothing}$ $\newcommand{\tilde}{\text{~}}$ $\newcommand{\otherwise}{\text{ otherwise }}$ $\newcommand{\if}{\text{ if }}$ $\newcommand{\proj}{\text{proj}}$ $\newcommand{\union}{\cup}$ $\newcommand{\intercept}{\cap}$ $\newcommand{\abs}[1]{\left| #1 \right|}$ $\newcommand{\norm}[1]{\left\lVert#1\right\rVert}$ $\newcommand{\pare}[1]{\left(#1\right)}$ $\newcommand{\brac}[1]{\left[#1\right]}$ $\newcommand{\t}[1]{\text{ #1 }}$ $\newcommand{\head}{\text H}$ $\newcommand{\tail}{\text T}$ $\newcommand{\d}{\text d}$ $\newcommand{\limu}[2]{\underset{#1 \to #2}\lim}$ $\newcommand{\der}[2]{\frac{\d #1}{\d #2}}$ $\newcommand{\derw}[2]{\frac{\d #1^2}{\d^2 #2}}$ $\newcommand{\pder}[2]{\frac{\partial #1}{\partial #2}}$ $\newcommand{\pderw}[2]{\frac{\partial^2 #1}{\partial #2^2}}$ $\newcommand{\pderws}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}}$ $\newcommand{\inv}[1]{{#1}^{-1}}$ $\newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $\newcommand{\nullity}[1]{\text{nullity}(#1)}$ $\newcommand{\rank}[1]{\text{rank }#1}$ $\newcommand{\nullspace}[1]{\mathcal{N}\pare{#1}}$ $\newcommand{\range}[1]{\mathcal{R}\pare{#1}}$ $\newcommand{\var}[1]{\text{var}\pare{#1}}$ $\newcommand{\cov}[1]{\text{cov}(#1)}$ $\newcommand{\cov}[2]{\text{cov}\pare{#1, #2}}$ $\newcommand{\tr}[1]{\text{tr}(#1)}$ $\newcommand{\oto}{\text{ one-to-one }}$ $\newcommand{\ot}{\text{ onto }}$ $\newcommand{\ceil}[1]{\lceil#1\rceil}$ $\newcommand{\floor}[1]{\lfloor#1\rfloor}$ $\newcommand{\Re}[1]{\text{Re}(#1)}$ $\newcommand{\Im}[1]{\text{Im}(#1)}$ $\newcommand{\dom}[1]{\text{dom}(#1)}$ $\newcommand{\fnext}[1]{\overset{\sim}{#1}}$ $\newcommand{\transpose}[1]{{#1}^{\text{T}}}$ $\newcommand{\b}[1]{\boldsymbol{#1}}$ $\newcommand{\None}[1]{}$ $\newcommand{\Vcw}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Vce}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Vcr}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Vct}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ $\newcommand{\Vcy}[6]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{bmatrix}}$ $\newcommand{\Vcu}[7]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{bmatrix}}$ $\newcommand{\vcw}[2]{\begin{matrix} #1 \br #2 \end{matrix}}$ $\newcommand{\vce}[3]{\begin{matrix} #1 \br #2 \br #3 \end{matrix}}$ $\newcommand{\vcr}[4]{\begin{matrix} #1 \br #2 \br #3 \br #4 \end{matrix}}$ $\newcommand{\vct}[5]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \end{matrix}}$ $\newcommand{\vcy}[6]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{matrix}}$ $\newcommand{\vcu}[7]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{matrix}}$ $\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ 
$\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Mqr}[4]{\begin{bmatrix} #1 & #2 & #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqt}[5]{\begin{bmatrix} #1 & #2 & #3 & #4 & #5 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mrq}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Mtq}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ $\newcommand{\Mww}[4]{\begin{bmatrix} #1 & #2 \br #3 & #4 \end{bmatrix}}$ $\newcommand{\Mwe}[6]{\begin{bmatrix} #1 & #2 & #3\br #4 & #5 & #6 \end{bmatrix}}$ $\newcommand{\Mew}[6]{\begin{bmatrix} #1 & #2 \br #3 & #4 \br #5 & #6 \end{bmatrix}}$ $\newcommand{\Mee}[9]{\begin{bmatrix} #1 & #2 & #3 \br #4 & #5 & #6 \br #7 & #8 & #9 \end{bmatrix}}$
Definition: Distribution Function

Let $X$ be an (arbitrary) random variable. The distribution function $F_X: \R \to [0, 1]$ of $X$ is defined by $F_X(x) = \b{P}(X \leq x)$, for $x \in \R$.

The random variables $X$ and $Y$ are said to be identically distributed if they have the same distribution function, i.e. if $F_X = F_Y$.
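
For concreteness, here is a minimal sketch (my own example, not from the notes) that evaluates the distribution function of a fair six-sided die by direct enumeration:

```python
from fractions import Fraction

def die_cdf(x: float) -> Fraction:
    """F_X(x) = P(X <= x) for a fair six-sided die."""
    return Fraction(sum(1 for face in range(1, 7) if face <= x), 6)

print(die_cdf(0.5))  # 0
print(die_cdf(3))    # 1/2
print(die_cdf(3.7))  # 1/2 -- F_X is a step function between the faces
print(die_cdf(6))    # 1
```

Any other random variable taking the values $1, \dots, 6$ with equal probability has this same $F_X$, and so is identically distributed with the die.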

Definition: Density Function

A function $f: \R \to [0, \infty)$ is called a density function if it satisfies the following conditions.

  1. For any $-\infty \leq a < b < \infty $, the integral $\int_{ a }^{ b } f(x) \d x $ exists.
  2. $\int_{ -\infty }^{ +\infty } f(x) \d x = 1$.

Note that density functions are nonnegative.
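
As a concrete illustration (my own example, assuming scipy is available), the triangular function below satisfies both conditions and is therefore a density:

```python
from scipy.integrate import quad

def f(x):
    # Triangular density on [0, 2]: 1 - |x - 1| there, 0 elsewhere
    return max(0.0, 1.0 - abs(x - 1.0))

total, _ = quad(f, 0.0, 2.0)  # f vanishes outside [0, 2]
print(total)  # ~1.0, so condition 2 holds
```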

Definition: Probability Density Function (PDF) of Continuous Random Variables

A random variable $X$ is said to have a continuous distribution (we also say that $X$ is continuous) if there exists a density function $f_X$ such that

$$\b{P}(X \in B) = \int_B f_X(x)\d x$$

for every (Borel) subset $B$ of the real line.

We call $f_X$ the probability density function of $X$, or PDF for short.

In particular the probability that the value of $X$ falls within an interval is

$$\b{P}(a \leq X \leq b) = \int_{ a }^{ b } f_X(x) \d x $$

and can be interpreted as the area under the graph of the PDF.
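
A numeric check of this interpretation (a minimal sketch, my own addition, using the standard normal density from scipy):

```python
from scipy.integrate import quad
from scipy.stats import norm

a, b = -1.0, 2.0
area, _ = quad(norm.pdf, a, b)    # area under the PDF over [a, b]
print(area)                       # ~0.8186
print(norm.cdf(b) - norm.cdf(a))  # the same probability via F_X
```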

Note

For any continuous random variable $X$ and any single value $a$, we have

$$\b{P}(X = a) = \int_{ a }^{ a } f_X(x) \d x = 0$$

Hence, including or excluding the endpoint of an interval has no effect on its probability. That is

$$\b{P}(a \leq X \leq b) = \b{P}(a < X < b) = \b{P}(a \leq X < b) = \b{P}(a < X \leq b)$$

Note

Suppose that $X$ is a continuous random variable with density function $f_X$. The Fundamental Theorem of Calculus implies that the distribution function

$$F_X(x) = \b{P}(X \leq x) = \int_{ -\infty }^{ x } f_X(t) \d t $$

is continuous at any $x \in \R$.

Note

If $f_X$ is continuous at a point $x \in \R$, then $F_X$ is differentiable at $x$, and $F'_X(x) = f_X(x)$.
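
A minimal numeric check of this relationship (my own addition, with the standard normal):

```python
from scipy.stats import norm

x, h = 0.7, 1e-6
# Central-difference approximation of F'_X(x)
num_deriv = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)
print(num_deriv)    # ~0.3123
print(norm.pdf(x))  # f_X(x), which matches F'_X(x)
```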

Definition: Expected Value (Continuous Case)

Let $X$ be a continuous random variable, and $h: \R \to \R $ a function. Then the expected value of $h(X) $ is defined by

$$E[h(X)] = \int_{ - \infty }^{ + \infty } h(x) f_X(x) \d x$$

In particular, taking $h(x) = x$ gives $E[X] = \int_{ - \infty }^{ + \infty } x f_X(x) \d x$.
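
For example, with $h(x) = x^2$ and $X$ standard normal, the formula gives $E[X^2] = 1$ (a minimal sketch, my own addition, assuming scipy):

```python
from scipy.integrate import quad
from scipy.stats import norm

# E[h(X)] = integral of h(x) f_X(x) dx with h(x) = x^2
val, _ = quad(lambda x: x**2 * norm.pdf(x), -float("inf"), float("inf"))
print(val)  # ~1.0
```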

Definition: Expected Value (General)

For an arbitrary random variable $X$, the expected value is defined as the following Lebesgue integral

$$E[X] = \int_{ \Omega } X(\omega) \d \b{P}(\omega)$$
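
This abstract integral is exactly what a Monte Carlo average approximates: draw $\omega$ according to $\b{P}$ and average the values $X(\omega)$. A minimal sketch (my own addition, with $X$ exponential):

```python
import random

random.seed(0)
# Sample X(omega) directly; here X ~ Exponential(lambda = 2), so E[X] = 1/2.
samples = [random.expovariate(2.0) for _ in range(10**6)]
print(sum(samples) / len(samples))  # ~0.5
```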

Properties: Linearity of Expectation

Let $X$ be an arbitrary random variable on the sample space $\Omega $ and $a, b \in \R$ be constants. Then

$$E[aX+b] = a E[X] + b$$
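
A quick Monte Carlo check (a minimal sketch, my own addition, with $X$ exponential so that $E[X] = 1$):

```python
import random

random.seed(1)
a, b = 3.0, -2.0
xs = [random.expovariate(1.0) for _ in range(10**6)]  # E[X] = 1

lhs = sum(a * x + b for x in xs) / len(xs)  # estimate of E[aX + b]
rhs = a * (sum(xs) / len(xs)) + b           # a E[X] + b
print(lhs, rhs)                             # both ~1.0
```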

Theorem: Expected Value of the Product of Independent Random Variables

If random variables $X_1, X_2, \dots, X_n $ are independent, then

$$E[ X_1 X_2 \cdots X_n ] = E[X_1] E[X_2] \cdots E[X_n]$$
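
A Monte Carlo sanity check for two independent exponentials (a minimal sketch, my own addition):

```python
import random

random.seed(2)
n = 10**6
xs = [random.expovariate(2.0) for _ in range(n)]  # E[X] = 1/2
ys = [random.expovariate(4.0) for _ in range(n)]  # E[Y] = 1/4, independent of xs

print(sum(x * y for x, y in zip(xs, ys)) / n)  # estimate of E[XY], ~1/8
print((sum(xs) / n) * (sum(ys) / n))           # E[X] E[Y], also ~1/8
```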

Theorem: Tail Sum Calculation of a Discrete Random Variable's Expected Value

If a discrete random variable $X$ on the sample space $\Omega $ is non-negative, that is, $X \geq 0$, and integer valued, then

$$E\brac{ X } = \sum_{ k=1 }^{ \infty } \b{P}(X \geq k)$$
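
For example, a geometric random variable on $\set{1, 2, \dots}$ has $\b{P}(X \geq k) = (1-p)^{k-1}$, and the tail sum recovers $E[X] = 1/p$ (a minimal sketch, my own addition):

```python
p = 0.3  # my own choice of parameter
# Truncate the tail sum of P(X >= k) = (1 - p)^(k - 1); the remainder is negligible.
tail_sum = sum((1 - p) ** (k - 1) for k in range(1, 10_000))
print(tail_sum, 1 / p)  # both ~3.3333
```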

Theorem: Tail Sum Calculation of a Continuous Random Variable's Expected Value

Let $X \geq 0$, that is, $X : \Omega \to [0, \infty)$, and let $h$ be a continuously differentiable function with $h(0) = 0 $. Then

$$E\brac{h(X)} = \int_{ 0 }^{ \infty } h'(x) \b{P}(X > x) \d x$$

In particular, taking $h(x) = x$ gives $E[X] = \int_{ 0 }^{ \infty } \b{P}(X > x) \d x$, the continuous analogue of the tail sum above.
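
Checking the case $h(x) = x^2$ for an exponential random variable, where $\b{P}(X > x) = e^{-\lambda x}$ and $E[X^2] = 2/\lambda^2$ (a minimal sketch, my own addition):

```python
import math
from scipy.integrate import quad

lam = 2.0
# h(x) = x^2 gives h'(x) = 2x; integrate h'(x) * P(X > x) over [0, inf)
val, _ = quad(lambda x: 2 * x * math.exp(-lam * x), 0, float("inf"))
print(val, 2 / lam**2)  # both 0.5
```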

Definition: Moments

Let $X$ be an arbitrary random variable. The expectation $E\brac{ X^n } $, denoted by $\mu_n $, is called the $n$-th moment (around 0).
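
For instance, the exponential random variable defined later in this section has $\mu_n = n!/\lambda^n$; a numeric spot check (a minimal sketch, my own addition):

```python
import math
from scipy.integrate import quad

lam = 1.5  # my own choice of parameter
for n in range(1, 5):
    # mu_n = E[X^n] = integral of x^n * lam * e^(-lam x) over [0, inf)
    mu_n, _ = quad(lambda x: x**n * lam * math.exp(-lam * x), 0, float("inf"))
    print(n, mu_n, math.factorial(n) / lam**n)  # numeric vs closed form
```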

Definition: Variance of Continuous Random Variables

Let $X$ be a continuous random variable. The variance of $X $ is defined the same way as before, i.e.

$$\var{X} = E\brac{(X - E[X])^2}$$

Theorem

Let $g: \R \to \R$. Then $E[g(X)] = \int_{ - \infty }^{ + \infty } g(x) f_X(x) \d x$.

Properties
  1. $\var{aX+b} = a^2 \var{X}$
  2. $\var{X} = E[X^2] - (E[X])^2$
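
Both identities can be checked numerically, e.g. for $X \sim N(3, 2^2)$ (a minimal sketch, my own addition, using scipy's frozen-distribution API):

```python
from scipy.stats import norm

X = norm(loc=3, scale=2)       # var(X) = 4
m2 = X.expect(lambda x: x**2)  # E[X^2] = 13
print(m2 - X.mean() ** 2)      # E[X^2] - (E[X])^2 = 4 = var(X)

a, b = 5.0, -1.0
Y = norm(loc=a * 3 + b, scale=a * 2)  # distribution of aX + b
print(Y.var(), a**2 * X.var())        # both 100.0
```
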
Definition: Exponential Random Variable

Let $\lambda > 0$ and define $f_X(x) = \begin{cases} \lambda e^{- \lambda x} & \text{ for } x> 0 \br 0 & \otherwise \end{cases}$

$X $ is called an exponential random variable with parameter $\lambda $. Below we verify that $f_X$ is a valid PDF and compute its CDF and expectation.

Proof

First we check that this PDF integrates to 1.

$\begin{align*} \int_{ - \infty }^{ + \infty } f_X(x) \d x &= \int_{ 0 }^{ + \infty } \lambda e^{- \lambda x} \d x \end{align*}$

Let $u = \lambda x, \d u = \lambda \d x$,

$\begin{align*} &= \int_{ 0 }^{ + \infty } \lambda e^{-u} \frac{ 1 }{ \lambda } \d u \br &= \int_{ 0 }^{ + \infty } e ^{-u} \d u \br &= \brac{ -e^{ -u } }_0^{ \infty } \br &= 1 \end{align*}$

Next we compute the CDF of the exponential random variable. For $a \geq 0 $,

$\begin{align*} F_X(a) &= \b{P}(X \leq a) \br &= 1- \b{P}(X > a) \br &= 1 - \int_{ a }^{ + \infty } f_X(x) \d x \br &= 1 - \int_{ a }^{ \infty } \lambda e^{- \lambda x} \d x \br &= 1 - \int_{ \lambda a }^{ \infty } e^{-u} \d u \br &= 1 - e^{- \lambda a} \end{align*}$

using the substitution $u = \lambda x$ again. For $a < 0 $, $F_X(a) = 0$.

For the expectation, integrate by parts with $u = x$ and $\d v = \lambda e^{- \lambda x} \d x$, so $v = -e^{- \lambda x}$:

$\begin{align*} E[X] &= \int_{ 0 }^{ \infty } x f_X(x) \d x \br &= \brac{ -xe^{- \lambda x} }_0^{ \infty } + \int_{ 0 }^{ \infty } e^{- \lambda x} \d x \br &= 0 + \brac{ \frac{ -e^{- \lambda x} }{ \lambda } }_0^\infty \br &= \frac{ 1 }{ \lambda } \end{align*}$
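
The three computations above can be cross-checked numerically (a minimal sketch, my own addition; note that scipy parametrizes the exponential by scale $= 1/\lambda$):

```python
import math
from scipy.integrate import quad
from scipy.stats import expon

lam = 2.0                 # my own choice of parameter
X = expon(scale=1 / lam)  # scipy's exponential uses scale = 1/lambda

total, _ = quad(lambda x: lam * math.exp(-lam * x), 0, float("inf"))
print(total)                                 # ~1.0: the density integrates to 1
print(X.cdf(0.5), 1 - math.exp(-lam * 0.5))  # F_X(a) = 1 - e^{-lambda a}
print(X.mean(), 1 / lam)                     # E[X] = 1/lambda
```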

Example: The Exponential Random Variable is Memoryless

Time $T $ until a new light bulb burns out is an exponential random variable with parameter $\lambda $. Ariadne turns the light on, leaves the room, and when she returns, $t $ time units later, finds that the light bulb is still on, which corresponds to the event $A = \set{T > t} $. Let $X $ be the additional time until the light bulb burns out. What is the conditional CDF of $X $, given the event $A $?

We have, for $x \geq 0 $,

$\b{P}(X > x \mid A) = \b{P}(T > t+x \mid T > t) = \frac{ \b{P}(T > t+x\text{ and }T > t)}{ \b{P}(T > t)} = \frac{ \b{P}(T > t+x)}{ \b{P}(T > t)} = \frac{ e^{ - \lambda(t + x)}}{ e^{ - \lambda t }} = e^{ - \lambda x }$

Thus the conditional distribution of $X$ is exponential with parameter $\lambda$, regardless of the time $t$ that elapsed between the lighting of the bulb and Ariadne's arrival. This is known as the memorylessness property of the exponential. Generally, if we model the time to complete a certain operation by an exponential random variable $X$, this property implies that as long as the operation has not been completed, the remaining time up to completion has the same exponential CDF, no matter when the operation started.
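
A Monte Carlo illustration of memorylessness (a minimal sketch, my own addition; $\lambda$, $t$, and $x$ are arbitrary choices): among simulated bulbs still burning at time $t$, the fraction that last at least $x$ longer matches $e^{-\lambda x}$.

```python
import math
import random

random.seed(3)
lam, t, x = 1.0, 2.0, 0.7
lifetimes = [random.expovariate(lam) for _ in range(10**6)]

alive_at_t = [s for s in lifetimes if s > t]  # condition on A = {T > t}
frac = sum(1 for s in alive_at_t if s - t > x) / len(alive_at_t)
print(frac)                # empirical P(X > x | A)
print(math.exp(-lam * x))  # e^{-lambda x} ~ 0.4966
```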

Definition: Geometric Random Variable

Let $Y $ be a geometric random variable with parameter $p$; that is, for $k = 1, 2, 3, \dots$,

$$\b{P}(Y = k) = (1 - p)^{k - 1} p$$

so,

$$\begin{align*} F_Y(n) = \b{P}(Y \leq n) &= \sum_{ k=1 }^{ n } (1 - p)^{k-1}p \br &= p \sum_{ k=1 }^{ n } (1 - p)^{k - 1} \br &= p \pare{ \frac{ 1- (1 - p)^n }{ 1- (1 - p) } } \br &= 1- (1-p)^n, \text{ for } n = 1, 2, \dots \end{align*}$$

So the CDF of the geometric random variable is

$$F_Y(x) = \begin{cases} 1 - (1 - p)^n & \if x \in [n, n + 1), \, n \geq 1 \br 0 & \if x < 1 \end{cases}$$
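
The derived CDF can be checked against scipy.stats.geom, which uses the same support $\set{1, 2, \dots}$ (a minimal sketch, my own addition; here $n = \floor{x}$):

```python
import math
from scipy.stats import geom

p = 0.3  # my own choice of parameter
for x in [0.5, 1.0, 2.7, 5.0]:
    n = math.floor(x)
    ours = 0.0 if x < 1 else 1 - (1 - p) ** n  # CDF derived above
    print(x, ours, geom.cdf(x, p))             # the two columns agree
```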