$\newcommand{\br}{\\}$ $\newcommand{\R}{\mathbb{R}}$ $\newcommand{\Q}{\mathbb{Q}}$ $\newcommand{\Z}{\mathbb{Z}}$ $\newcommand{\N}{\mathbb{N}}$ $\newcommand{\C}{\mathbb{C}}$ $\newcommand{\P}{\mathbb{P}}$ $\newcommand{\F}{\mathbb{F}}$ $\newcommand{\L}{\mathcal{L}}$ $\newcommand{\spa}[1]{\text{span}(#1)}$ $\newcommand{\dist}[1]{\text{dist}(#1)}$ $\newcommand{\max}[1]{\text{max}(#1)}$ $\newcommand{\min}[1]{\text{min}(#1)}$ $\newcommand{\supr}[1]{\text{sup}(#1)}$ $\newcommand{\infi}[1]{\text{inf}(#1)}$ $\newcommand{\ite}[1]{\text{int}(#1)}$ $\newcommand{\ext}[1]{\text{ext}(#1)}$ $\newcommand{\bdry}[1]{\partial #1}$ $\newcommand{\argmax}[1]{\underset{#1}{\text{argmax }}}$ $\newcommand{\argmin}[1]{\underset{#1}{\text{argmin }}}$ $\newcommand{\set}[1]{\left\{#1\right\}}$ $\newcommand{\emptyset}{\varnothing}$ $\newcommand{\tilde}{\text{~}}$ $\newcommand{\otherwise}{\text{ otherwise }}$ $\newcommand{\if}{\text{ if }}$ $\newcommand{\proj}{\text{proj}}$ $\newcommand{\union}{\cup}$ $\newcommand{\intercept}{\cap}$ $\newcommand{\abs}[1]{\left| #1 \right|}$ $\newcommand{\norm}[1]{\left\lVert#1\right\rVert}$ $\newcommand{\pare}[1]{\left(#1\right)}$ $\newcommand{\brac}[1]{\left[#1\right]}$ $\newcommand{\t}[1]{\text{ #1 }}$ $\newcommand{\head}{\text H}$ $\newcommand{\tail}{\text T}$ $\newcommand{\d}{\text d}$ $\newcommand{\limu}[2]{\underset{#1 \to #2}\lim}$ $\newcommand{\der}[2]{\frac{\d #1}{\d #2}}$ $\newcommand{\derw}[2]{\frac{\d #1^2}{\d^2 #2}}$ $\newcommand{\pder}[2]{\frac{\partial #1}{\partial #2}}$ $\newcommand{\pderw}[2]{\frac{\partial^2 #1}{\partial #2^2}}$ $\newcommand{\pderws}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}}$ $\newcommand{\inv}[1]{{#1}^{-1}}$ $\newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $\newcommand{\nullity}[1]{\text{nullity}(#1)}$ $\newcommand{\rank}[1]{\text{rank }#1}$ $\newcommand{\nullspace}[1]{\mathcal{N}\pare{#1}}$ $\newcommand{\range}[1]{\mathcal{R}\pare{#1}}$ $\newcommand{\var}[1]{\text{var}\pare{#1}}$ $\newcommand{\cov}[2]{\text{cov}(#1, 
#2)}$ $\newcommand{\tr}[1]{\text{tr}(#1)}$ $\newcommand{\oto}{\text{ one-to-one }}$ $\newcommand{\ot}{\text{ onto }}$ $\newcommand{\ceil}[1]{\lceil#1\rceil}$ $\newcommand{\floor}[1]{\lfloor#1\rfloor}$ $\newcommand{\Re}[1]{\text{Re}(#1)}$ $\newcommand{\Im}[1]{\text{Im}(#1)}$ $\newcommand{\dom}[1]{\text{dom}(#1)}$ $\newcommand{\fnext}[1]{\overset{\sim}{#1}}$ $\newcommand{\transpose}[1]{{#1}^{\text{T}}}$ $\newcommand{\b}[1]{\boldsymbol{#1}}$ $\newcommand{\None}[1]{}$ $\newcommand{\Vcw}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Vce}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Vcr}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Vct}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ $\newcommand{\Vcy}[6]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{bmatrix}}$ $\newcommand{\Vcu}[7]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{bmatrix}}$ $\newcommand{\vcw}[2]{\begin{matrix} #1 \br #2 \end{matrix}}$ $\newcommand{\vce}[3]{\begin{matrix} #1 \br #2 \br #3 \end{matrix}}$ $\newcommand{\vcr}[4]{\begin{matrix} #1 \br #2 \br #3 \br #4 \end{matrix}}$ $\newcommand{\vct}[5]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \end{matrix}}$ $\newcommand{\vcy}[6]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{matrix}}$ $\newcommand{\vcu}[7]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{matrix}}$ $\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ $\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Mqr}[4]{\begin{bmatrix} #1 & #2 & #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqt}[5]{\begin{bmatrix} #1 & #2 & #3 & #4 & #5 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mrq}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Mtq}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 
\end{bmatrix}}$ $\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Mww}[4]{\begin{bmatrix} #1 & #2 \br #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mwe}[6]{\begin{bmatrix} #1 & #2 & #3\br #4 & #5 & #6 \end{bmatrix}}$ $\newcommand{\Mew}[6]{\begin{bmatrix} #1 & #2 \br #3 & #4 \br #5 & #6 \end{bmatrix}}$ $\newcommand{\Mee}[9]{\begin{bmatrix} #1 & #2 & #3 \br #4 & #5 & #6 \br #7 & #8 & #9 \end{bmatrix}}$
Definition: Conditioning a Random Variable on an Event

The conditional PDF of a continuous random variable $X $, given an event $A $ with $\b{P}(A) > 0 $, is defined as a nonnegative function $f_{X|A} $ that satisfies

$$\b{P}(X \in B | A) = \int_{ B } f_{X|A}(x) \d x $$

for any subset $B $ of the real line. Taking $B $ to be the entire real line shows that $f_{X|A} $ is normalized:

$$\int_{ -\infty }^{ +\infty } f_{X|A}(x) \d x = 1$$

In particular, when we condition on an event of the form $\set{ X \in A } $, with $\b{P}(X \in A) > 0 $, the definition of conditional probabilities yields

$$\b{P}(X \in B | X \in A) = \frac{ \b{P}(X \in B, X \in A)}{ \b{P}(X \in A)} = \frac{ \int_{ A \cap B } f_X(x) \d x }{ \b{P}(X \in A)} $$

We conclude that

$$ f_{X | \set{ X \in A }}(x) = \begin{cases} \frac{ f_X(x)}{ \b{P}(X \in A)}, \if x \in A \br 0, \otherwise \end{cases} $$
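The rescaling in this formula can be checked numerically. The following is a minimal sketch, assuming an Exponential(1) density and the interval $A = [1, 3] $ purely for illustration; the helper names `exp_pdf`, `integrate`, and `conditional_pdf` are hypothetical and not part of the notes.

```python
import math

def exp_pdf(x, lam=1.0):
    """PDF of an Exponential(lam) random variable (illustrative choice)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def conditional_pdf(f, a, b):
    """Return f_{X | a <= X <= b}: f divided by P(a <= X <= b) on [a, b], 0 elsewhere."""
    p_a = integrate(f, a, b)  # P(X in A), assumed to be positive
    def g(x):
        return f(x) / p_a if a <= x <= b else 0.0
    return g

f_cond = conditional_pdf(exp_pdf, 1.0, 3.0)
total_mass = integrate(f_cond, 1.0, 3.0)  # should be close to 1
print(total_mass, f_cond(0.5), f_cond(2.0))
```

Inside $A $ the conditional PDF has the same shape as $f_X $ but is scaled up by $1 / \b{P}(X \in A) $, so that it integrates to one; outside $A $ it is zero.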

Let $A_1, A_2, …, A_{ n } $ be disjoint events that form a partition of the sample space, and assume that $\b{P}(A_i) > 0 $ for all $i $. Then, by the total probability theorem,

$$f_X(x) = \sum_{ i=1 }^{ n } \b{P}(A_i)f_{X|A_i}(x)$$
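As a small worked illustration (the setup is hypothetical and chosen only for the arithmetic): suppose $\b{P}(A_1) = \frac{1}{3} $ and, given $A_1 $, $X $ is uniform on $[0, 1] $, while $\b{P}(A_2) = \frac{2}{3} $ and, given $A_2 $, $X $ is uniform on $[0, 2] $. The formula then gives

$$ f_X(x) = \begin{cases} \frac{1}{3} \cdot 1 + \frac{2}{3} \cdot \frac{1}{2} = \frac{2}{3}, \if 0 \leq x \leq 1 \br \frac{2}{3} \cdot \frac{1}{2} = \frac{1}{3}, \if 1 < x \leq 2 \br 0, \otherwise \end{cases} $$

This integrates to $\frac{2}{3} \cdot 1 + \frac{1}{3} \cdot 1 = 1 $, as a PDF must.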

Definition: Conditional PDF Given a Random Variable

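Let $X $ and $Y $ be jointly continuous random variables with joint PDF $f_{X,Y} $. The conditional PDF of $X $ given that $Y = y $ is defined by

$$f_{X|Y}(x|y) = \frac{ f_{X,Y}(x, y) }{ f_Y(y) } $$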

The joint, marginal, and conditional PDFs are related to each other by the formulas:

$$\begin{align*} f_{X, Y}(x, y) &= f_Y(y)f_{X|Y}(x|y) \br f_X(x) &= \int_{ -\infty }^{ +\infty } f_Y(y) f_{X|Y}(x | y) \d y \end{align*}$$

The conditional PDF $f_{X|Y}(x|y)$ is defined only for those $y $ for which $f_Y(y) > 0 $.

We have

$$\b{P}(X \in A | Y = y) = \int_{ A } f_{X|Y}(x|y) \d x$$
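A short worked example may help; the joint density here is a hypothetical choice, not one taken from the notes. Suppose $(X, Y) $ is uniform on the triangle $0 \leq x \leq y \leq 1 $, so that $f_{X,Y}(x, y) = 2 $ on the triangle and $0 $ elsewhere. For $0 < y \leq 1 $,

$$\begin{align*} f_Y(y) &= \int_{ 0 }^{ y } 2 \d x = 2y \br f_{X|Y}(x|y) &= \frac{ f_{X,Y}(x, y) }{ f_Y(y) } = \frac{ 2 }{ 2y } = \frac{ 1 }{ y }, \quad 0 \leq x \leq y \end{align*}$$

So, given $Y = y $, $X $ is uniform on $[0, y] $, and for instance $\b{P}(X \leq y/2 | Y = y) = \int_{ 0 }^{ y/2 } \frac{ 1 }{ y } \d x = \frac{ 1 }{ 2 } $.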

Definition: Conditional Expectations

Let $X $ and $Y $ be jointly continuous random variables, and let $A $ be an event with $\b{P}(A) > 0 $.

The conditional expectation of $X $ given the event $A $ is defined by

$$E[X|A] = \int_{ -\infty }^{ +\infty } x f_{X|A}(x) \d x $$

The conditional expectation of $X $ given that $Y = y$ is defined by

$$E[X|Y = y] = \int_{ -\infty }^{ +\infty } x f_{X|Y} (x | y) \d x$$
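The second definition can be evaluated numerically once a joint PDF is fixed. The sketch below assumes, purely for illustration, that $(X, Y) $ is uniform on the triangle $0 \leq x \leq y \leq 1 $ (joint PDF equal to $2 $ there); the function names are hypothetical. For this density the exact answer is $E[X|Y = y] = y/2 $.

```python
def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def joint_pdf(x, y):
    """Hypothetical joint PDF: uniform on the triangle 0 <= x <= y <= 1."""
    return 2.0 if 0.0 <= x <= y <= 1.0 else 0.0

def cond_expectation_x_given_y(y):
    """E[X | Y = y] = integral of x * f_{X,Y}(x, y) / f_Y(y) over x."""
    f_y = integrate(lambda x: joint_pdf(x, y), 0.0, 1.0)  # marginal f_Y(y)
    if f_y <= 0.0:
        raise ValueError("conditional PDF is undefined where f_Y(y) = 0")
    return integrate(lambda x: x * joint_pdf(x, y), 0.0, 1.0) / f_y

e = cond_expectation_x_given_y(0.6)
print(e)  # close to 0.6 / 2
```

The guard on `f_y` reflects the requirement that the conditional PDF is only defined where $f_Y(y) > 0 $.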

Theorem: The Expected Value Rule

Let $X $ and $Y $ be jointly continuous random variables, and let $A $ be an event with $\b{P}(A) > 0 $. For a function $g(X)$, the conditional expectation of $g(X)$ given the event $A $ is

$$E[g(X)|A] = \int_{ -\infty }^{ +\infty } g(x) f_{X|A}(x) \d x $$

The conditional expectation of $g(X)$ given that $Y = y$ is

$$E[g(X)|Y = y] = \int_{ -\infty }^{ +\infty } g(x) f_{X|Y} (x | y) \d x$$
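As a worked illustration with hypothetical choices: let $X $ be uniform on $[0, 1] $, let $A = \set{ X > \frac{1}{2} } $, and let $g(x) = x^2 $. Then $f_{X|A}(x) = 2 $ for $\frac{1}{2} < x \leq 1 $ and $0 $ elsewhere, so

$$E[X^2 | A] = \int_{ 1/2 }^{ 1 } x^2 \cdot 2 \d x = \frac{ 2 }{ 3 } \pare{ 1 - \frac{ 1 }{ 8 } } = \frac{ 7 }{ 12 } $$

Note that this differs from $\pare{E[X | A]}^2 = \pare{\frac{3}{4}}^2 = \frac{9}{16} $: in general $E[g(X)|A] $ is not $g(E[X|A]) $, which is why the rule integrates $g $ against the conditional PDF.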

Note: Tail Sum Formula

Discrete case: if $X $ takes nonnegative integer values, then

$$E[X] = \sum_{ k = 1 }^{ \infty } \b{P}(X \geq k) = \sum_{ k = 0 }^{ \infty } \b{P}(X > k) $$

Continuous case: if $X \geq 0 $, then

$$E[X] = \int_{ 0 }^{ \infty } \b{P}(X > x) \d x = \int_{ 0 }^{ \infty } \pare{1 - F_X(x)} \d x$$

Proof

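For the discrete case, write $E[X] = \sum_{ k = 1 }^{ \infty } k \b{P}(X = k) $ as a triangular array in which $\b{P}(X = k) $ appears $k $ times:

$$\begin{align*} \b{P}(X = 1) + \b{P}(X = 2) + \b{P}(X = 3) + \b{P}(X = 4) + &\dots \br +\b{P}(X = 2) + \b{P}(X = 3) + \b{P}(X = 4) + &\dots \br +\b{P}(X = 3) + \b{P}(X = 4) + &\dots \br +\b{P}(X = 4) + &\dots \br + &\dots \end{align*}$$

Row $k $ of the array sums to $\b{P}(X \geq k) $, so adding the rows gives $E[X] = \sum_{ k = 1 }^{ \infty } \b{P}(X \geq k) $.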
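Both forms of the tail sum formula can be checked numerically. The sketch below assumes a geometric distribution with $p = 0.3 $ for the discrete case and an exponential distribution with rate $2 $ for the continuous case; both are illustrative choices, and the infinite sum and integral are truncated at hypothetical cutoffs large enough for the neglected tails to be negligible.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def geometric_pmf(k, p):
    """P(X = k) for a geometric random variable on k = 1, 2, ..."""
    return (1.0 - p) ** (k - 1) * p

p = 0.3
K = 200  # truncation point; (1 - p)**K is negligible here

# Direct definition: sum of k * P(X = k).
mean_direct = sum(k * geometric_pmf(k, p) for k in range(1, K + 1))

# Tail sum: accumulate P(X >= k) from the top down, then add the tails.
tail = 0.0
mean_tail = 0.0
for k in range(K, 0, -1):
    tail += geometric_pmf(k, p)  # tail is now P(k <= X <= K)
    mean_tail += tail

# Continuous case: P(X > x) = exp(-lam * x) for an Exponential(lam) variable.
lam = 2.0
mean_cont = integrate(lambda x: math.exp(-lam * x), 0.0, 40.0)

print(mean_direct, mean_tail, mean_cont)
```

The two discrete sums should agree with each other and with $1/p $, and the integral of the tail probability should be close to $1/\lambda $.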
Theorem: Total Expectation Theorem

Let $A_1, A_2, …, A_{ n } $ be disjoint events that form a partition of the sample space, and assume that $\b{P}(A_i) > 0$ for all $i$. Then,

$$E[X] = \sum_{ i=1 }^{ n } \b{P}(A_i) E[X | A_i]$$

Similarly,

$$E[X] = \int_{ -\infty }^{ +\infty } E[X | Y = y] f_Y(y) \d y $$
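Both versions of the theorem can be verified numerically on simple examples. This is a minimal sketch under stated assumptions: for the partition version, $X $ is taken to be Exponential(1) with $A_1 = \set{ X \leq 1 } $ and $A_2 = \set{ X > 1 } $; for the integral version, $(X, Y) $ is taken to be uniform on the triangle $0 \leq x \leq y \leq 1 $, for which $E[X|Y = y] = y/2 $ and $f_Y(y) = 2y $. All names and cutoffs are hypothetical.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def exp_pdf(x):
    """PDF of an Exponential(1) random variable (illustrative choice)."""
    return math.exp(-x) if x >= 0 else 0.0

# Partition: A_1 = {X <= 1}, A_2 = {X > 1}; the upper limit 50 stands in for infinity.
pieces = [(0.0, 1.0), (1.0, 50.0)]

total = 0.0
cond_means = []
for a, b in pieces:
    p_i = integrate(exp_pdf, a, b)                            # P(A_i)
    mean_i = integrate(lambda x: x * exp_pdf(x), a, b) / p_i  # E[X | A_i]
    cond_means.append(mean_i)
    total += p_i * mean_i                                     # P(A_i) * E[X | A_i]

# Integral version on the triangle: E[X | Y = y] * f_Y(y) = (y / 2) * (2 * y).
total_via_y = integrate(lambda y: (y / 2.0) * (2.0 * y), 0.0, 1.0)

print(total, cond_means, total_via_y)
```

The weighted sum should recover $E[X] = 1 $ for the exponential example, and the integral should recover $E[X] = \frac{1}{3} $ for the triangle example. The conditional mean on $A_2 $ comes out close to $2 $, consistent with the memoryless property of the exponential distribution.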