Abstract: Random variables, expectation, and some of their characteristics and properties.

0. Introduction

In this article, we discuss some numerical characteristics that arise in probability models of experiments with finitely many outcomes.

Revision history:

  • First update and correction made on June 10, 2022.

1. Random Variables

(\Omega, \mathscr {A}, \mathbf {P}) 是某個具有有限個結局試驗的機率模型. 其中, \mathscr {A} = \left \{ A : A \subseteq \Omega \right \}. 之前所討論的關於事件 A \in \mathscr {A}, 其機率的計算以及基本事件空間 \Omega 的自然本性並不重要, 重要的是某種數字特徵, 它依賴於基本事件. 我們想要知道的是, 在一系列 n 次試驗中, 出現 k 次 (k \leq n) 成功的機率如何等這一類問題.

Definition 1. Any real-valued function \xi = \xi(\omega) defined on a finite sample space \Omega is called a random variable.

Example 1. For the model of two successive coin tosses, the sample space is \displaystyle {\Omega = \left \{ \mathrm {HH}, \mathrm {HT}, \mathrm {TH}, \mathrm {TT} \right \}}. We define a random variable \xi = \xi(\omega) by the following table:

\omega \mathrm {HH} \mathrm {HT} \mathrm {TH} \mathrm {TT}
\xi(\omega) 2 1 1 0
where \xi(\omega) is the number of heads obtained in the outcome \omega.

Example 1 readily suggests another simple example of a random variable \xi: the indicator (characteristic) function of a set A \subseteq \Omega, \displaystyle {\xi = \mu_{A}(\omega)}.

When an experimenter deals with random variables describing certain records or readings, the basic question is: with what probabilities does the random variable take its various values? From this point of view, the experimenter is interested not in the probability distribution \mathbf {P} on (\Omega, \mathscr {A}), but in the distribution of probability over the set of possible values of the random variable — that is, the probabilities with which \xi takes the various values of a set C = \left \{ c_{1}, c_{2}, ..., c_{n} \right \}. Since in the case under study \Omega consists of finitely many points, the range R_{\xi} of the random variable \xi is also finite. Write R_{\xi} = \left \{ r_{1}, r_{2}, ..., r_{m} \right \}, where r_{1}, r_{2}, ..., r_{m} are all the possible values of \xi.

\mathscr {R} 是值域 R_{\xi} 上的一切子集的全體, 並且設 B \in \mathscr {R}. 當 R_{\xi} 是隨機變數 \xi 的值域時, 集合 B 也可以視為某個事件.

(R_{\xi}, \mathscr {R}) 上考慮由隨機變數 \xi\displaystyle {P_{\xi}(B) = \mathop {\mathbf {P}} \left \{ \omega : \xi(\omega) \in B \right \}, B \in \mathscr {R}} 產生的機率 P_{\xi}(\cdot). 顯然, 這些機率的值完全取決於 \displaystyle {P_{\xi}(r_{i}) = \mathop {\mathbf {P}} \left \{ \omega : \xi(\omega) = r_{i} \right \}, r_{i} \in R_{\xi}}. 機率組 \left \{ P_{\xi}(r_{1}), P_{\xi}(r_{2}), ..., P_{\xi}(r_{m}) \right \} 稱作隨機變數 \xi機率分佈 (probability distribution), 簡稱為分佈 (distribution).

Example 2-1. Suppose a random variable \xi takes the two values 1 and 0 with probabilities p and q respectively, where p is called the probability of success and q the probability of failure. Then \xi is called a Bernoulli random variable, and we say that \xi follows the Bernoulli distribution. Clearly, for this \xi, \displaystyle {P_{\xi}(x) = p^{x}q^{1 - x}, x \in \left \{ 0, 1 \right \}}.

Example 2-2. Suppose a random variable \xi takes the n + 1 values 0, 1, ..., n with probabilities \displaystyle {P_{\xi}(x) = \binom {n}{x}p^{x}q^{n - x}}, where x = 0, 1, 2, ..., n. Then \xi is called a binomial random variable, and we say that \xi follows the binomial distribution.
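To make Examples 2-1 and 2-2 concrete, here is a minimal Python sketch of the binomial probabilities P_{\xi}(x) = \binom {n}{x}p^{x}q^{n - x}; the function name `binom_pmf` and the parameter values n = 5, p = 0.3 are illustrative choices, not part of the text.

```python
from math import comb

def binom_pmf(n: int, p: float) -> dict:
    """P{xi = x} = C(n, x) * p^x * q^(n - x) for x = 0, 1, ..., n."""
    q = 1.0 - p
    return {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

pmf = binom_pmf(5, 0.3)
# the probabilities of a distribution sum to 1,
# and the x = 0 atom is q^n, as in the formula
assert abs(sum(pmf.values()) - 1.0) < 1e-12
assert abs(pmf[0] - 0.7 ** 5) < 1e-12
```

Taking n = 1 reduces the same formula to the Bernoulli case of Example 2-1.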

Notice that in Examples 2-1 and 2-2 we no longer care about the structure of the underlying probability space (\Omega, \mathscr {A}, \mathbf {P}); we care only about the values of the random variable and their probability distribution.

2. Distribution Functions

定義 2.x \in R, 函數 \displaystyle {F_{\xi}(x) = \mathop {\mathbf {P}} \left \{ \omega : \xi(\omega) \leq x \right \}} 稱作隨機變數 \xi分佈函數 (distribution function).

From the definition of the distribution function, we readily obtain \displaystyle {F_{\xi}(x) = \sum \limits_{\left \{ i : x_{i} \leq x \right \}}P_{\xi}(x_{i})}. Moreover, \displaystyle {P_{\xi}(x_{i}) = F_{\xi}(x_{i}) - F_{\xi}(x_{i}^{-})}, where F_{\xi}(x_{i}^{-}) = \lim \limits_{x \to x_{i}^{-}}F_{\xi}(x). In particular, if x_{1} < x_{2} < ... < x_{m} and x_{0} < x_{1} is a point with F_{\xi}(x_{0}) = 0, then \displaystyle {P_{\xi}(x_{i}) = F_{\xi}(x_{i}) - F_{\xi}(x_{i - 1}), i = 1, 2, ..., m}.

Directly from Definition 2, we obtain three properties of the distribution function:

  1. \lim \limits_{x \to -\infty}F_{\xi}(x) = 0;
  2. \lim \limits_{x \to +\infty}F_{\xi}(x) = 1;
  3. F_{\xi}(x) is right-continuous, i.e. \lim \limits_{x \to x_{0}^{+}}F_{\xi}(x) = F_{\xi}(x_{0}); for a random variable with finitely many values, F_{\xi}(x) is a step function.
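These relations can be checked numerically. The Python sketch below (the name `make_cdf` is an illustrative choice) builds F_{\xi} from a finite pmf and verifies the boundary values and the jump relation P_{\xi}(x_{i}) = F_{\xi}(x_{i}) - F_{\xi}(x_{i}^{-}), approximating the left limit by evaluating just below the atom.

```python
from bisect import bisect_right

def make_cdf(pmf):
    """Build F_xi(x) = P{xi <= x} from a finite pmf {value: probability}."""
    xs = sorted(pmf)
    cum, total = [], 0.0
    for x in xs:
        total += pmf[x]
        cum.append(total)
    def F(x):
        i = bisect_right(xs, x)          # number of atoms <= x
        return cum[i - 1] if i else 0.0
    return F

pmf = {0: 0.25, 1: 0.5, 2: 0.25}         # two fair coin tosses (Example 1)
F = make_cdf(pmf)
assert F(-1.0) == 0.0 and F(2.0) == 1.0  # limits at -inf / +inf
# the jump at x = 1 recovers P_xi(1) = 0.5 (left limit approximated)
assert abs((F(1.0) - F(0.999999)) - 0.5) < 1e-9
```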

Besides random variables, we often also study random vectors \xi = (\xi_{1}, \xi_{2}, ..., \xi_{r}), each of whose components is a random variable. For the multinomial distribution, the object of study is the random vector \upsilon = (\upsilon_{1}, \upsilon_{2}, ..., \upsilon_{r}), where \upsilon_{i} = \upsilon_{i}(\omega) is the number of components of the sequence \omega = (a_{1}, a_{2}, ..., a_{n}) equal to b_{i} (i = 1, 2, ..., r).

For x_{i} \in X_{i} (where X_{i} is the set of all possible values of \xi_{i}, i = 1, 2, ..., r), the collection of probabilities \displaystyle {P_{\xi}(x_{1}, x_{2}, ..., x_{r}) = \mathop {\mathbf {P}} \left \{ \omega : \xi_{1}(\omega) = x_{1}, \xi_{2}(\omega) = x_{2}, ..., \xi_{r}(\omega) = x_{r} \right \}} is called the probability distribution of the random vector \xi = (\xi_{1}, \xi_{2}, ..., \xi_{r}), and the function \displaystyle {F_{\xi}(x_{1}, x_{2}, ..., x_{r}) = \mathop {\mathbf {P}} \left \{ \omega : \xi_{1}(\omega) \leq x_{1}, \xi_{2}(\omega) \leq x_{2}, ..., \xi_{r}(\omega) \leq x_{r} \right \}}, where x_{i} \in R\ (i = 1, 2, ..., r), is called its distribution function. For the multinomial vector \upsilon = (\upsilon_{1}, \upsilon_{2}, ..., \upsilon_{r}), we have \displaystyle {P_{\upsilon}(n_{1}, n_{2}, ..., n_{r}) = \binom {n}{n_{1}, n_{2}, ..., n_{r}}p_{1}^{n_{1}}p_{2}^{n_{2}}...p_{r}^{n_{r}}}.
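As a sanity check of the multinomial formula, the following Python sketch (the helper name `multinomial_pmf` and the chosen n and probabilities are illustrative) evaluates P_{\upsilon}(n_{1}, ..., n_{r}) and confirms that the probabilities sum to 1 over all count vectors with n_{1} + ... + n_{r} = n.

```python
from math import factorial

def multinomial_pmf(n, probs, counts):
    """P{nu = (n_1, ..., n_r)} = n!/(n_1!...n_r!) * p_1^n_1 * ... * p_r^n_r."""
    assert sum(counts) == n
    coef = factorial(n)
    prob = 1.0
    for n_i, p_i in zip(counts, probs):
        coef //= factorial(n_i)          # stays integer at each step
        prob *= p_i ** n_i
    return coef * prob

# the pmf sums to 1 over all outcomes of n = 4 draws with r = 3 labels
n, probs = 4, (0.5, 0.3, 0.2)
total = sum(multinomial_pmf(n, probs, (a, b, n - a - b))
            for a in range(n + 1) for b in range(n + 1 - a))
assert abs(total - 1.0) < 1e-9
```

With r = 2 this reduces to the binomial pmf of Example 2-2.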

定義 3.\xi_{1}, \xi_{2}, ..., \xi_{r} 是一組在 R 中的有限集合 X 上取值的隨機變數. 記 \mathscr {X}X 中所有子集的代數. 若對於任意 x_{1}, x_{2}, ..., x_{r} \in X, 有 \displaystyle {\mathop {\mathbf {P}} \left \{ \xi_{1} = x_{1}, \xi_{2} = x_{2}, ..., \xi_{r} = x_{r} \right \} = \mathop {\mathbf {P}} \left \{ \xi_{1} = x_{1} \right \}\mathop {\mathbf {P}} \left \{ \xi_{2} = x_{2} \right \}...\mathop {\mathbf {P}} \left \{ \xi_{r} = x_{r} \right \}}; 或等價地, 對於任意 B_{1}, B_{2}, ..., B_{r} \in \mathscr {X}, 有 \displaystyle {\mathop {\mathbf {P}} \left \{ \xi \in B_{1}, \xi_{2} \in B_{2}, ..., \xi_{r} \in B_{r} \right \} = \mathop {\mathbf {P}} \left \{ \xi_{1} \in B_{1} \right \}\mathop {\mathbf {P}} \left \{\xi_{2} \in B_{2} \right \}...\mathop {\mathbf {P}} \left \{ \xi_{r} \in B_{r} \right \}}, 則稱隨機變數 \xi_{1}, \xi_{2}, ..., \xi_{r}全體獨立 (mutually independent) 的.

The Bernoulli scheme discussed earlier provides an example of independent random variables. Specifically, let \displaystyle {\begin {aligned} &\Omega = \left \{ \omega : \omega = (a_{1}, a_{2}, ..., a_{n}), a_{i} = 0, 1 \ (i = 1, 2, ..., n) \right \}, \\ &\mathscr {A} = \left \{ A : A \subseteq \Omega \right \}, \mathop {\mathbf {P}}(\left \{ \omega \right \}) = p(\omega) = p^{\sum \limits_{i}a_{i}}(1 - p)^{n - \sum \limits_{i}a_{i}}, \end {aligned}} and, for \omega = (a_{1}, a_{2}, ..., a_{n}), set \xi_{i}(\omega) = a_{i}\ (i = 1, 2, ..., n). Since the events \displaystyle {A_{1} = \left \{ \omega : a_{1} = 1 \right \}, A_{2} = \left \{ \omega : a_{2} = 1 \right \}, ..., A_{n} = \left \{ \omega : a_{n} = 1 \right \}} are independent, the random variables \xi_{1}, \xi_{2}, ..., \xi_{n} are independent.

If a random variable \xi has range R_{\xi} = \left \{ x_{1}, x_{2}, ..., x_{k} \right \} and a random variable \eta has range R_{\eta} = \left \{ y_{1}, y_{2}, ..., y_{l} \right \}, then the random variable \zeta = \xi + \eta has range \displaystyle {R_{\zeta} = \left \{ z : z = x_{i} + y_{j}, i = 1, 2, ..., k, j = 1, 2, ..., l \right \}}, and clearly \displaystyle {P_{\zeta}(z) = \mathop {\mathbf {P}} \left \{ \zeta = z \right \} = \mathop {\mathbf {P}} \left \{ \xi + \eta = z \right \} = \sum \limits_{\left \{ (i, j) : x_{i} + y_{j} = z \right \}}\mathop {\mathbf {P}} \left \{ \xi = x_{i}, \eta = y_{j} \right \}}.
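The sum formula simplifies when \xi and \eta are independent, since then \mathop {\mathbf {P}} \left \{ \xi = x_{i}, \eta = y_{j} \right \} = \mathop {\mathbf {P}} \left \{ \xi = x_{i} \right \}\mathop {\mathbf {P}} \left \{ \eta = y_{j} \right \}. A short Python sketch under that independence assumption (the helper name `sum_pmf` is illustrative):

```python
def sum_pmf(pmf_xi, pmf_eta):
    """Distribution of zeta = xi + eta for INDEPENDENT xi and eta,
    so that P{xi = x, eta = y} = P{xi = x} * P{eta = y}."""
    out = {}
    for x, px in pmf_xi.items():
        for y, py in pmf_eta.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

# two independent Bernoulli(p) variables sum to a Binomial(2, p) variable
p = 0.3
bern = {0: 1 - p, 1: p}
zeta = sum_pmf(bern, bern)
assert abs(zeta[1] - 2 * p * (1 - p)) < 1e-12   # C(2,1) p q
assert abs(sum(zeta.values()) - 1.0) < 1e-12
```

This computation previews Example 3 below for the case n = 2.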

Example 3. Let \xi_{1}, \xi_{2}, ..., \xi_{n} be independent Bernoulli random variables with \displaystyle {\mathop {\mathbf {P}} \left \{ \xi_{i} = 1 \right \} = p, \mathop {\mathbf {P}} \left \{ \xi_{i} = 0 \right \} = q, i = 1, 2, ..., n}. Show that the random variable \zeta = \xi_{1} + \xi_{2} + ... + \xi_{n} follows a binomial distribution.

Proof:

We argue by induction.

n = 2 時, R_{\zeta} = \left \{ 0, 1, 2 \right \}. 那麼有 \displaystyle {\begin {aligned} &P_{\zeta}(0) = P_{\xi_{1}}(0)P_{\xi_{2}}(0) = q^{2} = \binom {0}{2}p^{0}q^{2}, \\ &P_{\zeta}(1) = P_{\xi_{1}}(0)P_{\xi_{2}}(1) + P_{\xi_{1}}(1)P_{\xi_{2}}(0) = 2pq = \binom {1}{2}p^{1}q^{1}, \\ &P_{\zeta}(2) = P_{\xi_{1}}(1)P_{\xi_{2}}(1) = p^{2} = \binom {2}{2}p^{2}q^{0}. \end {aligned}} 故當 n = 2 時, 隨機變數 \zeta = \xi_{1} + \xi_{2} 服從二項分佈.

Assume the claim holds for n = l - 1, i.e. \zeta_{l - 1} = \xi_{1} + \xi_{2} + ... + \xi_{l - 1} satisfies \displaystyle {\mathop {\mathbf {P}} \left \{ \zeta_{l - 1} = k \right \} = \binom {l - 1}{k}p^{k}q^{l - 1 - k}}, k = 0, 1, ..., l - 1.

For n = l, write \zeta_{l} = \zeta_{l - 1} + \xi_{l}, and note that \zeta_{l - 1} and \xi_{l} are independent. By the formula for the distribution of a sum, \displaystyle {\begin {aligned} \mathop {\mathbf {P}} \left \{ \zeta_{l} = k \right \} &= \mathop {\mathbf {P}} \left \{ \zeta_{l - 1} = k \right \}\mathop {\mathbf {P}} \left \{ \xi_{l} = 0 \right \} + \mathop {\mathbf {P}} \left \{ \zeta_{l - 1} = k - 1 \right \}\mathop {\mathbf {P}} \left \{ \xi_{l} = 1 \right \} \\ &= \binom {l - 1}{k}p^{k}q^{l - 1 - k} \cdot q + \binom {l - 1}{k - 1}p^{k - 1}q^{l - k} \cdot p \\ &= \left ( \binom {l - 1}{k} + \binom {l - 1}{k - 1} \right )p^{k}q^{l - k} = \binom {l}{k}p^{k}q^{l - k}, \end {aligned}} where the last step uses Pascal's rule. Hence, for n = l, the random variable \zeta = \xi_{1} + \xi_{2} + ... + \xi_{l} still follows a binomial distribution.

In conclusion, the random variable \zeta = \xi_{1} + \xi_{2} + ... + \xi_{n} follows a binomial distribution.

\blacksquare

3. Mathematical Expectation

(\Omega, \mathscr {A}, \mathbf {P}) 是有限機率空間, 而 \xi = \xi(\omega) 是某一隨機變數, 其值域為 \displaystyle {R_{\xi} = \left \{ x_{1}, x_{2}, ..., x_{k} \right \}}. 如果設 A_{i} = \left \{ \omega : \xi(\omega) = x_{i} \right \}, 則顯然 \xi 可以表示為 \displaystyle {\xi = \xi(\omega) = \sum \limits_{i = 1}^{k}x_{i}\mu_{A_{i}}(\omega)}. 其中, A_{1}, A_{2}, ..., A_{k} 構成集合 \Omega 的分割, i = 1, 2, ..., k, \mu_{A_{i}}(\omega) 是集合 A_{i} 的特徵函數.

Remark:

Let us explain why \xi(\omega) = \sum \limits_{i = 1}^{k}x_{i}\mu_{A_{i}}(\omega) holds. Fix an outcome \omega with \xi(\omega) = x_{i}; then \omega \in A_{i}, so \mu_{A_{i}}(\omega) = 1, while for every other A_{j}\ (j \neq i) we have \omega \notin A_{j}, and hence \mu_{A_{j}}(\omega) = 0. Expanding \xi(\omega), \displaystyle {\begin {aligned} \xi(\omega) &= x_{1}\mu_{A_{1}}(\omega) + x_{2}\mu_{A_{2}}(\omega) + ... + \\ &\ \ \ \ x_{i - 1}\mu_{A_{i - 1}}(\omega) + x_{i}\mu_{A_{i}}(\omega) + x_{i + 1}\mu_{A_{i + 1}}(\omega) + ... + \\ &\ \ \ \ x_{k}\mu_{A_{k}}(\omega). \end {aligned}} Since \mu_{A_{1}}(\omega), \mu_{A_{2}}(\omega), ..., \mu_{A_{i - 1}}(\omega), \mu_{A_{i + 1}}(\omega), ..., \mu_{A_{k}}(\omega) all vanish, every term other than x_{i}\mu_{A_{i}}(\omega) vanishes regardless of the values x_{1}, x_{2}, ..., x_{i - 1}, x_{i + 1}, ..., x_{k}. Therefore \displaystyle {\sum \limits_{i = 1}^{k}x_{i}\mu_{A_{i}}(\omega) = x_{i}\mu_{A_{i}}(\omega) = x_{i} = \xi(\omega)}.

\blacksquare

p_{i} = \mathop {\mathbf {P}} \left \{ \xi = x_{i} \right \}, i = 1, 2, ..., k. 直觀上, 如果在 n 次獨立重複試驗中觀測隨機變數 \xi 的取值, 則取 x_{i} 的值大致上應該出現 np_{i} 次. 考慮投擲 n 次均勻硬幣, 則 p_{0} = p_{1} = \frac {1}{2}. 其中, p_{0} 代表反面向上的機率, 而 p_{1} 表示正面向上的機率. 直觀上, 投擲 n 次硬幣, 正面應該大致出現 \frac {n}{2} 次, 即 np_{1} 次. 因此, 根據 n 次試驗的結果, 計算該隨機變數的平均值大致為 \displaystyle {\frac {1}{n}(np_{1}x_{1} + np_{2}x_{2} + ... + np_{k}x_{k}) = \sum \limits_{i = 1}^{k}p_{i}x_{i}}.

Definition 4. The real number \displaystyle {\mathop {\mathrm {E}}(\xi) = \sum \limits_{i = 1}^{k}x_{i}\mathop {\mathbf {P}}(A_{i})} is called the mathematical expectation or mean value of the random variable \xi = \sum \limits_{i = 1}^{k}x_{i}\mu_{A_{i}}(\omega), abbreviated to expectation or mean, where A_{i} = \left \{ \omega : \xi(\omega) = x_{i} \right \} (i = 1, 2, ..., k).

Note that P_{\xi}(x_{i}) = \mathop {\mathbf {P}}(A_{i}), so by Definition 4 we also have \displaystyle {\mathop {\mathrm {E}}(\xi) = \sum \limits_{i = 1}^{k}x_{i}P_{\xi}(x_{i})}. Furthermore, writing \Delta F_{\xi}(x) = F_{\xi}(x) - F_{\xi}(x^{-}), Definition 2 gives P_{\xi}(x_{i}) = \Delta F_{\xi}(x_{i}), and hence \displaystyle {\mathop {\mathrm {E}}(\xi) = \sum \limits_{i = 1}^{k}x_{i}\Delta F_{\xi}(x_{i})}.
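Definition 4 in code: a minimal Python sketch of \mathop {\mathrm {E}}(\xi) = \sum_{i} x_{i}P_{\xi}(x_{i}), applied to the two-coin-toss variable of Example 1 (the function name `expectation` is an illustrative choice).

```python
def expectation(pmf):
    """E(xi) = sum_i x_i * P_xi(x_i), with pmf a dict {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Example 1: number of heads in two fair coin tosses
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
assert expectation(pmf) == 1.0   # 0*1/4 + 1*1/2 + 2*1/4
```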

Sometimes a representation of the random variable \xi contains repeated values, i.e. x_{i} = x_{j} with i \neq j. In that case we write \displaystyle {\xi(\omega) = \sum \limits_{j = 1}^{l}x_{j}'\mu_{B_{j}}(\omega)}, where B_{1} + B_{2} + ... + B_{l} = \Omega and the values x_{j}' (j = 1, 2, ..., l) need not be distinct. (The range of a random variable, being a set, contains no repeated elements, but a representation of \xi may.) We can still compute the expectation from this representation, without first converting it to the representation \displaystyle {\xi(\omega) = \sum \limits_{i = 1}^{k}x_{i}\mu_{A_{i}}(\omega)} with pairwise distinct values x_{i} (i = 1, 2, ..., k).

Indeed, from \displaystyle {\sum \limits_{\left \{ j : x_{j}' = x_{i} \right \}}x_{j}'\mathop {\mathbf {P}}(B_{j}) = x_{i}\sum \limits_{\left \{ j : x_{j}' = x_{i} \right \}}\mathop {\mathbf {P}}(B_{j}) = x_{i}\mathop {\mathbf {P}}(A_{i})}, it follows that \displaystyle {\sum \limits_{j = 1}^{l}x_{j}'\mathop {\mathbf {P}}(B_{j}) = \sum \limits_{i = 1}^{k}x_{i}\mathop {\mathbf {P}}(A_{i})}.

\xi\eta 為隨機變數, 則期望有以下基本性質 :

  1. \xi \geq 0, 則 \mathop {\mathrm {E}}(\xi) \geq 0;
  2. \mathop {\mathrm {E}}(a\xi + b\eta) = a\mathop {\mathrm {E}}(\xi) + b\mathop {\mathrm {E}}(\eta). 其中, ab 是常數;
  3. \xi \geq \eta, 則 \mathop {\mathrm {E}}(\xi) \geq \mathop {\mathrm {E}}(\eta);
  4. \left | \mathop {\mathrm {E}}(\xi) \right | \leq \mathop {\mathrm {E}}(\left | \xi \right |);
  5. \xi\eta 獨立, 則 \mathop {\mathrm {E}}(\xi\eta) = \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta);
  6. \mathop {\mathrm {E}}^{2}(\left | \xi\eta \right |) \leq \mathop {\mathrm {E}}(\xi^{2})\mathop {\mathrm {E}}(\eta^{2}), 這是 Cauchy-Schwarz 不等式的機率形式;
  7. \xi = \mu_{A}(\omega), 則 \mathop {\mathrm {E}}(\xi) = \mathop {\mathbf {P}}(A);
  8. c 為常數, 則 \mathop {\mathrm {E}}(c) = c.
證明 :

By the definition of expectation, \mathop {\mathrm {E}}(\xi) = \sum \limits_{i = 1}^{k}x_{i}\mathop {\mathbf {P}}(A_{i}). If \xi \geq 0, then every value x_{i} \geq 0 (i = 1, 2, ..., k), and since \mathop {\mathbf {P}}(A_{i}) \geq 0, it follows that \mathop {\mathrm {E}}(\xi) \geq 0.

(1) \square

\xi = \sum \limits_{i}x_{i}\mu_{A_{i}}(\omega)\eta = \sum \limits_{j}y_{j}\mu_{A_{j}}(\omega), 則 \displaystyle {a\xi + b\eta = a\sum \limits_{i}x_{i}\mu_{A_{i}}(\omega) + b\sum \limits_{j}y_{j}\mu_{A_{j}}(\omega)}. 由於 A_{i} = A_{i} \cup \Omega = A_{i} \bigcup \limits_{j} B_{j}B_{j} = B_{j} \bigcup \limits_{i} A_{i}, 故有 \displaystyle {\begin {aligned} a\xi + b\eta &= a\sum \limits_{i, j}x_{i}\mu_{A_{i}B_{j}}(\omega) + b\sum \limits_{i, j}y_{j}\mu_{A_{i}B_{j}}(\omega) \\ &= \sum \limits_{i, j}ax_{i}\mu_{A_{i}B_{j}}(\omega) + \sum \limits_{i, j}by_{j}\mu_{A_{i}B_{j}}(\omega) \\ &= \sum \limits_{i, j}(ax_{i} + by_{j})\mu_{A_{i}B_{j}}(\omega). \end {aligned}} 因此, 根據定義 4\displaystyle {\begin {aligned} \mathop {\mathrm {E}}(a\xi + b\eta) &= \sum \limits_{i, j}(ax_{i} + by_{j})\mathop {\mathbf {P}}(A_{i}B_{j}) \\ &= \sum \limits_{i}ax_{i}\mathop {\mathbf {P}}(A_{i}) + \sum \limits_{j}by_{j}\mathop {\mathbf {P}}(B_{j}) \\ &= a\sum \limits_{i}x_{i}\mathop {\mathbf {P}}(A_{i}) + b\sum \limits_{j}y_{j}\mathop {\mathbf {P}}(B_{j}) \\ &= a\mathop {\mathrm {E}}(\xi) + b\mathop {\mathrm {E}}(\eta). \end {aligned}}

(2) \square

性質 2 可知, \displaystyle {\mathop {\mathrm {E}}(\xi) - \mathop {\mathrm {E}}(\eta) = \mathop {\mathrm {E}}(\xi - \eta)}. 由於 \xi \geq \eta, 結合性質 1, 於是有 \displaystyle {\mathop {\mathrm {E}}(\xi) \geq \mathop {\mathrm {E}}(\eta)}.

(3) \square

By the triangle inequality (Theorem 3 of 《【數學分析】實數——實數的四則運算》), \displaystyle {\left | \mathop {\mathrm {E}}(\xi) \right | = \left |\sum \limits_{i}x_{i}\mathop {\mathbf {P}}(A_{i}) \right | \leq \sum \limits_{i}\left | x_{i} \right |\mathop {\mathbf {P}}(A_{i}) = \mathop {\mathrm {E}}(\left | \xi \right |)}.

(4) \square

\xi = \sum \limits_{i}x_{i}\mu_{A_{i}}(\omega)\eta = \sum \limits_{j}y_{j}\mu_{A_{j}}(\omega), 則 \displaystyle {\begin {aligned} \mathop {\mathrm {E}}(\xi\eta) &= \mathop {\mathrm {E}} \left (\sum \limits_{i}x_{i}\mu_{A_{i}}(\omega)\sum \limits_{j}y_{j}\mu_{A_{j}}(\omega) \right ) \\ &= \mathop {\mathrm {E}} \left (\sum \limits_{i}\sum \limits_{j}x_{i}y_{j}\mu_{A_{i}B_{j}}(\omega) \right ) \\ &= \sum \limits_{i, j}x_{i}y_{j}\mathop {\mathbf {P}}(A_{i}B_{j}). \end {aligned}} 由於隨機變數 \xi\eta 獨立, 故事件 A_{i} = \left \{ \omega : \xi(\omega) = x_{i} \right \}B_{j} = \left \{ \omega : \eta(\omega) = y_{j} \right \} 相互獨立, 於是 \displaystyle {\begin {aligned} \mathop {\mathrm {E}}(\xi\eta) &= \sum \limits_{i, j}x_{i}y_{j}\mathop {\mathbf {P}}(A_{i})\mathop {\mathbf {P}}(B_{j}) \\ &= \sum \limits_{i}x_{i}\mathop {\mathbf {P}}(A_{i})\sum \limits_{j}y_{j}\mathop {\mathbf {P}}(B_{j}) \\ &= \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta). \end {aligned}}

(5) \square

For \xi = \sum \limits_{i}x_{i}\mu_{A_{i}}(\omega) and \eta = \sum \limits_{j}y_{j}\mu_{B_{j}}(\omega), note that \displaystyle {\xi^{2} = \sum \limits_{i}x_{i}^{2}\mu_{A_{i}}(\omega), \eta^{2} = \sum \limits_{j}y_{j}^{2}\mu_{B_{j}}(\omega)}. Hence, by the definition of expectation, \displaystyle {\mathop {\mathrm {E}}(\xi^{2}) = \sum \limits_{i}x_{i}^{2}\mathop {\mathbf {P}}(A_{i}), \mathop {\mathrm {E}}(\eta^{2}) = \sum \limits_{j}y_{j}^{2}\mathop {\mathbf {P}}(B_{j})}. If \mathop {\mathrm {E}}(\xi^{2}) > 0 and \mathop {\mathrm {E}}(\eta^{2}) > 0, set \displaystyle {\widetilde {\xi} = \frac {\xi}{\sqrt {\mathop {\mathrm {E}}(\xi^{2})}}, \widetilde {\eta} = \frac {\eta}{\sqrt {\mathop {\mathrm {E}}(\eta^{2})}}}, so that \displaystyle {{\widetilde {\xi}}^{2} + {\widetilde {\eta}}^{2} = \frac {\xi^{2}}{\mathop {\mathrm {E}}(\xi^{2})} + \frac {\eta^{2}}{\mathop {\mathrm {E}}(\eta^{2})}, \quad 2 \left |\widetilde {\xi}\widetilde {\eta} \right | = \frac {2|\xi\eta|}{\sqrt {\mathop {\mathrm {E}}(\xi^{2})\mathop {\mathrm {E}}(\eta^{2})}}}. By the AM-GM inequality, 2|\widetilde {\xi}\widetilde {\eta}| \leq {\widetilde {\xi}}^{2} + {\widetilde {\eta}}^{2}, so taking expectations gives \displaystyle {2\mathop {\mathrm {E}} \left ( \left | \widetilde {\xi}\widetilde {\eta} \right | \right ) \leq \mathop {\mathrm {E}}({\widetilde {\xi}}^{2}) + \mathop {\mathrm {E}}({\widetilde {\eta}}^{2}) = 2}. Therefore \mathop {\mathrm {E}} \left ( \left |\widetilde {\xi}\widetilde {\eta} \right | \right ) \leq 1, i.e. \mathop {\mathrm {E}}^{2}(\left | \xi\eta \right |) \leq \mathop {\mathrm {E}}(\xi^{2})\mathop {\mathrm {E}}(\eta^{2}). If \mathop {\mathrm {E}}(\xi^{2}) = 0, then \displaystyle {\sum \limits_{i}x_{i}^{2}\mathop {\mathbf {P}}(A_{i}) = 0}, so \xi takes the value 0 with probability 1, i.e. \displaystyle {\mathop {\mathbf {P}} \left \{ \omega : \xi(\omega) = 0 \right \} = 1}. Thus, if \mathop {\mathrm {E}}(\xi^{2}) = 0 or \mathop {\mathrm {E}}(\eta^{2}) = 0, we clearly have \mathop {\mathrm {E}}(\left | \xi\eta \right |) = 0, and the inequality \displaystyle {\mathop {\mathrm {E}}^{2}(\left | \xi\eta \right |) \leq \mathop {\mathrm {E}}(\xi^{2})\mathop {\mathrm {E}}(\eta^{2})} holds trivially.

In either case, \mathop {\mathrm {E}}^{2}(\left | \xi\eta \right |) \leq \mathop {\mathrm {E}}(\xi^{2})\mathop {\mathrm {E}}(\eta^{2}).

(6) \square

If \xi = \mu_{A}(\omega), then \xi can be written as \displaystyle {\xi = 1 \cdot \mu_{A}(\omega) + 0 \cdot \mu_{\overline {A}}(\omega)}, so by the definition of expectation \displaystyle {\mathop {\mathrm {E}}(\xi) = 1 \cdot \mathop {\mathbf {P}}(A) + 0 \cdot \mathop {\mathbf {P}}(\overline {A}) = \mathop {\mathbf {P}}(A)}.

(7) \square

A constant c is the random variable with \xi(\omega) = c for every \omega \in \Omega, i.e. \xi = c\mu_{\Omega}(\omega). By the definition of expectation, \displaystyle {\mathop {\mathrm {E}}(c) = c\mathop {\mathbf {P}}(\Omega) = c}.

(8) \square

\blacksquare

Corollary 1. If the random variables \xi_{1}, \xi_{2}, ..., \xi_{r} are independent, then \displaystyle {\mathop {\mathrm {E}}(\xi_{1}\xi_{2}...\xi_{r}) = \mathop {\mathrm {E}}(\xi_{1})\mathop {\mathrm {E}}(\xi_{2})...\mathop {\mathrm {E}}(\xi_{r})}.

Proof:

We argue by induction.

r = 1 時, 顯然有 \mathop {\mathrm {E}}(\xi_{1}) = \mathop {\mathrm {E}}(\xi_{1}); 當 r = 2 時, 由期望的性質可知, \mathop {\mathrm {E}}(\xi_{1}\xi_{2}) = \mathop {\mathrm {E}}(\xi_{1})\mathop {\mathrm {E}}(\xi_{2}).

Assume that \mathop {\mathrm {E}}(\xi_{1}\xi_{2}...\xi_{r}) = \mathop {\mathrm {E}}(\xi_{1})\mathop {\mathrm {E}}(\xi_{2})...\mathop {\mathrm {E}}(\xi_{r}) holds for all r < k.

r = k 時, \displaystyle {\begin {aligned} \mathop {\mathrm {E}}(\xi_{1}\xi_{2}...\xi_{k - 1}\xi_{k}) &= \mathop {\mathrm {E}}((\xi_{1}\xi_{2}...\xi_{k - 1})\xi_{k}) \\ &= \mathop {\mathrm {E}}(\xi_{1}\xi_{2}...\xi_{k - 1})\mathop {\mathrm {E}}(\xi_{k}) \\ &= \mathop {\mathrm {E}}(\xi_{1})\mathop {\mathrm {E}}(\xi_{2})...\mathop {\mathrm {E}}(\xi_{k - 1})\mathop {\mathrm {E}}(\xi_{k}). \end {aligned}} 因此, 當 r = k 時, \mathop {\mathrm {E}}(\xi_{1}\xi_{2}...\xi_{r}) = \mathop {\mathrm {E}}(\xi_{1})\mathop {\mathrm {E}}(\xi_{2})...\mathop {\mathrm {E}}(\xi_{r}) 仍然成立.

In conclusion, if the random variables \xi_{1}, \xi_{2}, ..., \xi_{r} are independent, then \displaystyle {\mathop {\mathrm {E}}(\xi_{1}\xi_{2}...\xi_{r}) = \mathop {\mathrm {E}}(\xi_{1})\mathop {\mathrm {E}}(\xi_{2})...\mathop {\mathrm {E}}(\xi_{r})}.

\blacksquare

例題 4.\xi 是 Bernoulli 隨機變數, 以機率 pq01, 則 \displaystyle {\mathop {\mathrm {E}}(\xi) = 1 \times \mathop {\mathbf {P}} \left \{ \xi = 1 \right \} + 0 \times \mathop {\mathbf {P}} \left \{ \xi = 0 \right \} = \mathop {\mathbf {P}} \left \{ \xi = 1 \right \} = p}.

例題 5.\xi_{1}, \xi_{2}, ..., \xi_{n}n 個 Bernoulli 隨機變數, 以機率 \mathop {\mathbf {P}} \left \{ \xi_{i} = 1 \right \} = p\mathop {\mathbf {P}} \left \{ \xi_{i} = 0 \right \} = qp + q = 110 為值. 其中, i = 1, 2, ..., n. 那麼對於 S_{n} = \xi_{1} + \xi_{2} + ... + \xi_{n} 的期望為 \displaystyle {\begin {aligned} \mathop {\mathrm {E}}(S_{n}) &= \mathop {\mathrm {E}}(\xi_{1} + \xi_{2} + ... \xi_{n}) = \mathop {\mathrm {E}}(\xi_{1}) + \mathop {\mathrm {E}}(\xi_{2}) + ... + \mathop {\mathrm {E}}(\xi_{n}) \\ &= \underbrace {(1 \times p + 0 \times q) + (1 \times p + 0 \times q) + ... + (1 \times p + 0 \times q)}_{n \text { 個}} \\ &= np. \end {aligned}}

例題 5'.\xi_{1}, \xi_{2}, ..., \xi_{n}n 個 Bernoulli 隨機變數, 以機率 \mathop {\mathbf {P}} \left \{ \xi_{i} = 1 \right \} = p\mathop {\mathbf {P}} \left \{ \xi_{i} = 0 \right \} = qp + q = 110 為值. 其中, i = 1, 2, ..., n. 求 S_{n} = \xi_{1} + \xi_{2} + ... + \xi_{n}.

:

\xi_{1}, \xi_{2}, ..., \xi_{n}n 個獨立的 Bernoulli 隨機變數, 則 \mathop {\mathrm {E}}(S_{n}) 仍保持不變. 此時, 由二項分佈可知 \displaystyle {\mathop {\mathbf {P}} \left \{ S_{n} = k \right \} = \binom {k}{n}p^{k}q^{n - k}}. 其中, k = 0, 1, 2, …, n. 於是, \displaystyle {\begin {aligned} \mathop {\mathrm {E}}(S_{n}) &= \sum \limits_{k = 0}^{n}k\mathop {\mathbf {P}} \left \{ S_{n} = k \right \} \\ &= \sum \limits_{k = 0}^{n}k\binom {k}{n}p^{k}q^{n - k} \\ &= \sum \limits_{k = 0}^{n}k \cdot \frac {n!}{k!(n - k)!}p^{k}q^{n - k} \\ &= \frac {0 \times n!}{0!(n - 0)!}p^{0}q^{n} + \sum \limits_{k = 1}^{n}k \frac {n!}{k!(n - k)!}p^{k}q^{n - k} \\ &= \sum \limits_{k = 1}^{n}k \frac {n!}{k!(n - k)!}p^{k}q^{n - k} \\ &= \sum \limits_{k = 1}^{n}k\frac {n(n - 1)!}{k!(n - k + 1 - 1)!}p \cdot p^{k - 1}q^{n - k + 1 -1} \\ &= \sum \limits_{k = 1}^{n}\frac {n(n - 1)!}{(k - 1)!((n - 1) - (k - 1))!}p \cdot p^{k - 1}q^{(n - 1) - (k - 1)} \\ &= np\sum \limits_{k = 1}^{n}\frac {(n - 1)!}{(k - 1)!((n - 1) - (k - 1))!}p^{k - 1}q^{(n - 1) - (k - 1)} \\ &= np\sum \limits_{l = 1}^{n - 1}\frac {(n - 1)!}{l!((n - 1) - l)!}p^{l}q^{(n - 1) - l} \ (\text {令 } l = k - 1) \\ &= np. \end {aligned}}

\blacksquare

\xi = \sum \limits_{i}x_{i}\mu_{A_{i}}(\omega), 其中, A_{i} = \left \{ \omega : \xi(\omega) = x_{i} \right \}, 從而 \varphi = \varphi(\xi(\omega))\xi(\omega) 的某一函數. 如果 B_{j} = \left \{ \omega : \varphi(\xi(\omega)) = y_{j} \right \}, 則 \displaystyle {\varphi(\xi(\omega)) = \sum \limits_{j}y_{j}\mu_{B_{j}}(\omega)}. 從而有 \displaystyle {\mathop {\mathrm {E}}(\varphi) = \sum \limits_{j}y_{j}\mathop {\mathbf {P}}(B_{j}) = \sum \limits_{j}y_{j}P_{\varphi}(y_{j})}. 顯然, \displaystyle {\varphi(\xi(\omega)) = \sum \limits_{i}\varphi(x_{i})\mu_{A_{i}}(\omega)}. 於是, 為了求 \varphi = \varphi(\xi(\omega)) 的期望值, 既可以使用 \mathop {\mathrm {E}}(\varphi) = \sum \limits_{j}y_{j}\mathop {\mathbf {P}}(B_{j}) = \sum \limits_{j}y_{j}P_{\varphi}(y_{j}), 也可以使用 \displaystyle {\mathop {\mathrm {E}}(\varphi(\xi(\omega))) = \sum \limits_{i}\varphi(x_{i})P_{\xi}(x_{i})}. 我們稱 \mathop {\mathrm {E}}(\varphi(\xi(\omega))) 為隨機變數函數的數學期望.

定義 5.\displaystyle {\mathop {\mathrm {Var}}(\xi) = \mathop {\mathrm {E}} \left ((\xi - \mathop {\mathrm {E}}(\xi))^{2} \right )} 為隨機變數 \xi方差 (variance).

定義 6.\displaystyle {\sigma = \sqrt {\mathop {\mathrm {Var}}(\xi)}} 為隨機變數 \xi標準差 (standard deviation).

The variance and standard deviation of a random variable \xi measure how spread out its values are.

Since \displaystyle {\begin {aligned} \mathop {\mathrm {E}} \left ((\xi - \mathop {\mathrm {E}}(\xi))^{2} \right) &= \mathop {\mathrm {E}} \left (\xi^{2} - 2\xi \mathop {\mathrm {E}}(\xi) + \mathop {\mathrm {E}}^{2}(\xi) \right ) \\ &= \mathop {\mathrm {E}}(\xi^{2}) - \mathop {\mathrm {E}}(2\xi \mathop {\mathrm {E}}(\xi)) + \mathop {\mathrm {E}} \left (\mathop {\mathrm {E}}^{2}(\xi) \right ) \\ &= \mathop {\mathrm {E}}(\xi^{2}) - 2\mathop {\mathrm {E}}(\xi) \cdot \mathop {\mathrm {E}}(\xi) + \mathop {\mathrm {E}}^{2}(\xi) \\ &= \mathop {\mathrm {E}}(\xi^{2}) - 2\mathop {\mathrm {E}}^{2}(\xi) + \mathop {\mathrm {E}}^{2}(\xi) \\ &= \mathop {\mathrm {E}}(\xi^{2}) - \mathop {\mathrm {E}}^{2}(\xi), \end {aligned}} we see that \mathop {\mathrm {Var}}(\xi) = \mathop {\mathrm {E}}(\xi^{2}) - \mathop {\mathrm {E}}^{2}(\xi). In the derivation above, since \mathop {\mathrm {E}}(\xi) is a constant, Property 8 of the expectation gives \mathop {\mathrm {E}}(\mathop {\mathrm {E}}(\xi)) = \mathop {\mathrm {E}}(\xi).
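The identity \mathop {\mathrm {Var}}(\xi) = \mathop {\mathrm {E}}(\xi^{2}) - \mathop {\mathrm {E}}^{2}(\xi) is easy to confirm numerically. Below is a small Python sketch computing the variance both from Definition 5 and from the identity; the function name and the example pmf are illustrative choices.

```python
def moments(pmf):
    """First and second moments of a finite pmf {value: probability}."""
    m1 = sum(x * p for x, p in pmf.items())
    m2 = sum(x * x * p for x, p in pmf.items())
    return m1, m2

pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # two fair coin tosses (Example 1)
m1, m2 = moments(pmf)
var_def = sum((x - m1) ** 2 * p for x, p in pmf.items())  # E((xi - E xi)^2)
var_alt = m2 - m1 ** 2                                    # E(xi^2) - E^2(xi)
assert abs(var_def - var_alt) < 1e-12   # both give 0.5 for this pmf
```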

\xi 為隨機變數, ab 均為常數, 可導出方差如下性質 :

  1. \mathop {\mathrm {Var}}(\xi) \geq 0;
  2. \mathop {\mathrm {Var}}(a + b\xi) = b^{2}\mathop {\mathrm {Var}}(\xi);
  3. \mathop {\mathrm {Var}}(a) = 0;
  4. \mathop {\mathrm {Var}}(b\xi) = b^{2}\mathop {\mathrm {Var}}(\xi).
證明 :

定義 5 可知, \mathop {\mathrm {Var}}(\xi) = \mathop {\mathrm {E}} \left ((\xi - \mathop {\mathrm {E}}(\xi))^{2} \right ). 而 (\xi - \mathop {\mathrm {E}}(\xi))^{2} \geq 0, 由期望的性質可知, \mathop {\mathrm {Var}}(\xi) \geq 0.

(1) \square

Expanding according to Definition 5: \displaystyle {\begin {aligned} \mathop {\mathrm {Var}}(a + b\xi) &= \mathop {\mathrm {E}} \left ((a + b\xi)^{2} \right ) - \mathop {\mathrm {E}}^{2}(a + b\xi) \\ &= \mathop {\mathrm {E}}(a^{2} + 2ab\xi + b^{2}\xi^{2}) - \mathop {\mathrm {E}}(a + b\xi)\mathop {\mathrm {E}}(a + b\xi) \\ &= \mathop {\mathrm {E}}(a^{2}) + \mathop {\mathrm {E}}(2ab\xi) + \mathop {\mathrm {E}}(b^{2}\xi^{2}) - \left ( \mathop {\mathrm {E}}^{2}(a) + \mathop {\mathrm {E}}^{2}(b\xi) + 2\mathop {\mathrm {E}}(a)\mathop {\mathrm {E}}(b\xi) \right ) \\ &= a^{2} + 2ab\mathop {\mathrm {E}}(\xi) + b^{2}\mathop {\mathrm {E}}(\xi^{2}) - a^{2} - b^{2}\mathop {\mathrm {E}}^{2}(\xi) - 2ab\mathop {\mathrm {E}}(\xi) \\ &= b^{2}\mathop {\mathrm {E}}(\xi^{2}) - b^{2}\mathop {\mathrm {E}}^{2}(\xi) \\ &= b^{2}\left ( \mathop {\mathrm {E}}(\xi^{2}) - \mathop {\mathrm {E}}^{2}(\xi) \right ) \\ &= b^{2}\mathop {\mathrm {Var}}(\xi). \end {aligned}}

(2) \square

Expanding according to Definition 5: \displaystyle {\mathop {\mathrm {Var}}(a) = \mathop {\mathrm {E}}(a^{2}) - \mathop {\mathrm {E}}^{2}(a) = a^{2} - a^{2} = 0}.

(3) \square

Expanding according to Definition 5: \displaystyle {\begin {aligned} \mathop {\mathrm {Var}}(b\xi) &= \mathop {\mathrm {E}}(b^{2}\xi^{2}) - \mathop {\mathrm {E}}^{2}(b\xi) \\ &= b^{2}\mathop {\mathrm {E}}(\xi^{2}) - b^{2}\mathop {\mathrm {E}}^{2}(\xi) \\ &= b^{2} \left ( \mathop {\mathrm {E}}(\xi^{2}) - \mathop {\mathrm {E}}^{2}(\xi) \right ) \\ &= b^{2}\mathop {\mathrm {Var}}(\xi). \end {aligned}}

(4) \square

\blacksquare

For two random variables \xi and \eta, \displaystyle {\begin {aligned} \mathop {\mathrm {Var}}(\xi + \eta) &= \mathop {\mathrm {E}} \left ((\xi + \eta)^{2} \right ) - \mathop {\mathrm {E}}^{2}(\xi + \eta) \\ &= \mathop {\mathrm {E}}(\xi^{2} + 2\xi\eta + \eta^{2}) - (\mathop {\mathrm {E}}(\xi) + \mathop {\mathrm {E}}(\eta))^{2} \\ &= \mathop {\mathrm {E}}(\xi^{2}) + 2\mathop {\mathrm {E}}(\xi\eta) + \mathop {\mathrm {E}}(\eta^{2}) - \mathop {\mathrm {E}}^{2}(\xi) - 2\mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta) - \mathop {\mathrm {E}}^{2}(\eta) \\ &= \mathop {\mathrm {Var}}(\xi) + \mathop {\mathrm {Var}}(\eta) + 2\mathop {\mathrm {E}}(\xi\eta) - 2\mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta). \end {aligned}} On the other hand, \displaystyle {\begin {aligned} \mathop {\mathrm {E}} \left ( \big ((\xi - \mathop {\mathrm {E}}(\xi)) + (\eta - \mathop {\mathrm {E}}(\eta)) \big )^{2} \right ) &= \mathop {\mathrm {E}} \left ((\xi - \mathop {\mathrm {E}}(\xi))^{2} \right ) + \mathop {\mathrm {E}} \left ((\eta - \mathop {\mathrm {E}}(\eta))^{2} \right ) + \\ &\ \ \ \ \ 2\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big ) \\ &= \mathop {\mathrm {Var}}(\xi) + \mathop {\mathrm {Var}}(\eta) + 2\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big ), \end {aligned}} and \displaystyle {\begin {aligned} 2\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big ) &= 2\mathop {\mathrm {E}} \big ( \xi\eta - \xi \mathop {\mathrm {E}}(\eta) - \eta \mathop {\mathrm {E}}(\xi) + \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta) \big ) \\ &= 2\big ( \mathop {\mathrm {E}}(\xi\eta) - \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta) - \mathop {\mathrm {E}}(\eta)\mathop {\mathrm {E}}(\xi) + \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta) \big ) \\ &= 2\mathop {\mathrm {E}}(\xi\eta) - 2\mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta). \end {aligned}} Therefore, \displaystyle {\mathop {\mathrm {Var}}(\xi + \eta) = \mathop {\mathrm {E}} \left ( \big ( (\xi - \mathop {\mathrm {E}}(\xi)) + (\eta - \mathop {\mathrm {E}}(\eta)) \big )^{2} \right ) = \mathop {\mathrm {Var}}(\xi) + \mathop {\mathrm {Var}}(\eta) + 2\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big )}.

Definition 7. We call \displaystyle {\mathop {\mathrm {Cov}}(\xi, \eta) = \mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big )} the covariance of the random variables \xi and \eta.

定義 8.\mathop {\mathrm {Var}}(\xi) \geq 0, \mathop {\mathrm {Var}}(\eta) \geq 0, 我們稱 \displaystyle {\rho(\xi, \eta) = \frac {\mathop {\mathrm {Cov}}(\xi, \eta)}{\sqrt {\mathop {\mathrm {Var}}(\xi)\mathop {\mathrm {Var}}(\eta)}}} 為隨機變數 \xi\eta相關係數 (correlation coefficient).

定理 1.\rho(\xi, \eta) = \pm 1, 則隨機變數 \xi\eta 線性相關, 即 \displaystyle {\eta = a\xi + b}. 其中, 當 \rho(\xi, \eta) = 1 時, a > 0; 當 \rho(\xi, \eta) = -1 時, a < 0, ab 都為常數.

Proof:

\rho(\xi, \eta) = 1 時, 有 \displaystyle {\frac {\mathop {\mathrm {Cov}}(\xi, \eta)}{\sqrt {\mathop {\mathrm {Var}}(\xi)\mathop {\mathrm {Var}}(\eta)}} = \frac {\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big )}{\sqrt {\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))^{2} \big )\mathop {\mathrm {E}} \big ( (\eta - \mathop {\mathrm {E}}(\eta))^{2} \big )}} = 1}. 記隨機變數 \varphi = \xi - \mathop {\mathrm {E}}(\xi)\psi = \eta - \mathop {\mathrm {E}}(\eta), 由上式可知 \displaystyle {\mathop {\mathrm {E}}(\varphi\psi) = \sqrt {\mathop {\mathrm {E}}(\varphi^{2})\mathop {\mathrm {E}}(\psi^{2})}}. 兩邊平方可得 \displaystyle {\mathop {\mathrm {E}}^{2}(\varphi\psi) = \mathop {\mathrm {E}}(\varphi^{2})\mathop {\mathrm {E}}(\psi^{2})}. 根據期望的性質 6, 有 \displaystyle {\mathop {\mathrm {E}}^{2}(\varphi\psi) \leq \mathop {\mathrm {E}}(\varphi^{2})\mathop {\mathrm {E}}(\psi^{2})}. 顯然, 要使得等號成立, 若且唯若隨機變數 \varphi\psi 線性相關, 即存在不為零的 k_{1}k_{2} 使得 \displaystyle {k_{1}\varphi + k_{2}\psi = 0}.k = -\frac {k_{1}}{k_{2}}, 於是有 \varphi = k\psi. 那麼, 我們可以得到 \displaystyle {\mathop {\mathrm {E}}^{2}(\varphi\psi) = \mathop {\mathrm {E}}^{2}(\varphi \cdot k\varphi) = k^{2}\mathop {\mathrm {E}}^{2}(\varphi^{2})}\displaystyle {\mathop {\mathrm {E}}(\varphi^{2})\mathop {\mathrm {E}}(\psi^{2}) = \mathop {\mathrm {E}}(\varphi^{2})\mathop {\mathrm {E}}(k^{2}\varphi^{2}) = k^{2}\mathop {\mathrm {E}}^{2}(\varphi^{2})}. 因此, \mathop {\mathrm {E}}^{2}(\varphi\psi) = \mathop {\mathrm {E}}(\varphi)\mathop {\mathrm {E}}(\psi). 此時, \displaystyle {\varphi = \xi - \mathop {\mathrm {E}}(\xi) = k\psi = k(\eta - \mathop {\mathrm {E}}(\eta))}. 變換可得 \displaystyle {\eta = \frac {1}{k}\xi + \frac {k\mathop {\mathrm {E}}(\eta) + \mathop {\mathrm {E}}(\xi)}{k}}.a = \frac {1}{k}, b = \frac {k\mathop {\mathrm {E}}(\eta) + \mathop {\mathrm {E}}(\xi)}{k}, 則 \displaystyle {\eta = a\xi + b}.\rho(\xi, \eta) = -1 時, 同樣有 \eta = a\xi + b.

Next we show that a > 0 when \rho(\xi, \eta) = 1 and a < 0 when \rho(\xi, \eta) = -1. Given \eta = a\xi + b, we compute \displaystyle {\begin {aligned} \rho(\xi, \eta) &= \rho(\xi, a\xi + b) = \frac {\mathop {\mathrm {Cov}}(\xi, a\xi + b)}{\sqrt {\mathop {\mathrm {Var}}(\xi)\mathop {\mathrm {Var}}(a\xi + b)}} \\ &= \frac {\mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi)) \cdot a(\xi - \mathop {\mathrm {E}}(\xi)) \big )}{|a|\mathop {\mathrm {Var}}(\xi)} \\ &= \frac {a\mathop {\mathrm {Var}}(\xi)}{|a|\mathop {\mathrm {Var}}(\xi)} = \frac {a}{|a|}. \end {aligned}} Hence \rho(\xi, \eta) = 1 exactly when a > 0, and \rho(\xi, \eta) = -1 exactly when a < 0.

In summary, if \rho(\xi, \eta) = \pm 1, then the random variables \xi and \eta are linearly related, \displaystyle {\eta = a\xi + b}, where a and b are constants, with a > 0 when \rho(\xi, \eta) = 1 and a < 0 when \rho(\xi, \eta) = -1.

\blacksquare

We can immediately observe: if the random variables \xi and \eta are independent, then \xi - \mathop {\mathrm {E}}(\xi) and \eta - \mathop {\mathrm {E}}(\eta) are independent, so \displaystyle {\mathop {\mathrm {Cov}}(\xi, \eta) = \mathop {\mathrm {E}} \big ( (\xi - \mathop {\mathrm {E}}(\xi))(\eta - \mathop {\mathrm {E}}(\eta)) \big ) = \mathop {\mathrm {E}}(\xi\eta) - \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta) = 0}, and, by Definition 7, \displaystyle {\mathop {\mathrm {Var}}(\xi + \eta) = \mathop {\mathrm {Var}}(\xi) + \mathop {\mathrm {Var}}(\eta) + 2\mathop {\mathrm {Cov}}(\xi, \eta) = \mathop {\mathrm {Var}}(\xi) + \mathop {\mathrm {Var}}(\eta)}. The condition "\xi and \eta are uncorrelated" is strictly weaker than "\xi and \eta are independent": in general, uncorrelated random variables \xi and \eta need not be independent.

4. Estimators

Example 6. Suppose the random variable \alpha takes each of the three values 0, \frac {\pi}{2}, \pi with probability \frac {1}{3}. Show that the random variables \xi = \sin {\alpha} and \eta = \cos {\alpha} are uncorrelated but not independent with respect to the probability \mathop {\mathbf {P}}, and find a functional relation between \xi and \eta.

Proof :

Clearly, the ranges of \xi and \eta are R_{\xi} = \left \{ 0, 1 \right \} and R_{\eta} = \left \{ 0, 1, -1 \right \}, and \displaystyle {\begin {aligned} &\mathop {\mathbf {P}} \left \{ \xi = 0 \right \} = \frac {2}{3}, \mathop {\mathbf {P}} \left \{ \xi = 1 \right \} = \frac {1}{3}, \\ &\mathop {\mathbf {P}} \left \{ \eta = 0 \right \} = \mathop {\mathbf {P}} \left \{ \eta = 1 \right \} = \mathop {\mathbf {P}} \left \{ \eta = -1 \right \} = \frac {1}{3}, \\ &\mathop {\mathbf {P}} \left \{ \xi = 0, \eta = 0 \right \} = 0, \mathop {\mathbf {P}} \left \{ \xi = 1, \eta = 0 \right \} = \frac {1}{3}, \\ &\mathop {\mathbf {P}} \left \{ \xi = 0, \eta = 1 \right \} = \frac {1}{3}, \mathop {\mathbf {P}} \left \{ \xi = 1, \eta = 1 \right \} = 0, \\ &\mathop {\mathbf {P}} \left \{ \xi = 0, \eta = -1 \right \} = \frac {1}{3}, \mathop {\mathbf {P}} \left \{ \xi = 1, \eta = -1 \right \} = 0. \end {aligned}} Consequently, \displaystyle {\mathop {\mathrm {E}}(\xi\eta) = (1 \cdot 0) \times \frac {1}{3} + (0 \cdot 1) \times \frac {1}{3} + (0 \cdot (-1)) \times \frac {1}{3} = 0} and \displaystyle {\mathop {\mathrm {E}}(\xi) = \frac {2}{3} \times 0 + \frac {1}{3} \times 1 = \frac {1}{3}, \mathop {\mathrm {E}}(\eta) = \frac {1}{3} \times 0 + \frac {1}{3} \times 1 + \frac {1}{3} \times (-1) = 0}. Thus \displaystyle {\mathop {\mathrm {Cov}}(\xi, \eta) = \mathop {\mathrm {E}}(\xi\eta) - \mathop {\mathrm {E}}(\xi)\mathop {\mathrm {E}}(\eta) = 0}, so the random variables \xi and \eta are uncorrelated. However, for every i \in R_{\xi} and j \in R_{\eta} we have \displaystyle {\mathop {\mathbf {P}} \left \{ \xi = i, \eta = j \right \} \neq \mathop {\mathbf {P}} \left \{ \xi = i \right \}\mathop {\mathbf {P}} \left \{ \eta = j \right \}}, so \xi and \eta are not independent with respect to \mathop {\mathbf {P}}. Finally, \sin^{2} {\alpha} + \cos^{2} {\alpha} = 1 gives the functional relation \displaystyle {\xi^{2} + \eta^{2} = 1}.

\blacksquare
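The computations of Example 6 are easy to replay exactly with rational arithmetic. The sketch below enumerates the three outcomes (\alpha = 0, \frac {\pi}{2}, \pi give the pairs (\xi, \eta) = (0, 1), (1, 0), (0, -1)) and confirms zero covariance, non-independence, and \xi^{2} + \eta^{2} = 1:

```python
from fractions import Fraction

third = Fraction(1, 3)
# exact (xi, eta) outcomes: alpha = 0 -> (0, 1); pi/2 -> (1, 0); pi -> (0, -1)
outcomes = [((0, 1), third), ((1, 0), third), ((0, -1), third)]

E_xi    = sum(p * x for (x, y), p in outcomes)
E_eta   = sum(p * y for (x, y), p in outcomes)
E_xieta = sum(p * x * y for (x, y), p in outcomes)

cov = E_xieta - E_xi * E_eta
print(cov)  # 0 -> uncorrelated

# ... yet not independent: P{xi=0, eta=0} = 0 while P{xi=0} P{eta=0} = 2/9
p_joint = sum(p for (x, y), p in outcomes if x == 0 and y == 0)
p_prod  = (sum(p for (x, y), p in outcomes if x == 0)
           * sum(p for (x, y), p in outcomes if y == 0))
print(p_joint, p_prod)  # 0 2/9

# and xi^2 + eta^2 = 1 on every outcome
print(all(x * x + y * y == 1 for (x, y), _ in outcomes))  # True
```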

In particular, if the random variables \xi_{1}, \xi_{2}, ..., \xi_{n} are pairwise uncorrelated (not necessarily pairwise independent), then \displaystyle {\mathop {\mathrm {Var}} \left (\sum \limits_{i = 1}^{n}\xi_{i} \right ) = \sum \limits_{i = 1}^{n}\mathop {\mathrm {Var}}(\xi_{i})}. Now consider two random variables \xi and \eta, and suppose that only \xi is observed. If \xi and \eta are correlated, we may expect that knowing the value of \xi allows some judgment about the value of the unobserved random variable \eta.

We call any function f = f(\xi) of the random variable \xi an estimator of the random variable \eta. If \displaystyle {\mathop {\mathrm {E}} \left ( (\eta - f^{*}(\xi))^{2} \right ) = \inf \limits_{f}{\mathop {\mathrm {E}} \left ( (\eta - f(\xi))^{2} \right )}}, then f^{*} = f^{*}(\xi) is called the optimal estimator in the mean-square sense.

We now discuss how to find the best estimate within the class of linear estimates \lambda(\xi) = a + b\xi. Consider the function \displaystyle {g(a, b) = \mathop {\mathrm {E}} \left ( (\eta - (a + b\xi))^{2} \right )}. Taking partial derivatives of g(a, b) with respect to a and b gives \displaystyle {\frac {\partial g}{\partial a} = -2\mathop {\mathrm {E}} \big ( \eta - (a + b\xi) \big ) \text { and } \frac {\partial g}{\partial b} = -2\mathop {\mathrm {E}} \big ( (\eta - (a + b\xi))\xi \big )}. Setting \frac {\partial g}{\partial a} = 0 and \frac {\partial g}{\partial b} = 0 yields the optimal mean-square linear estimate \lambda^{*} = a^{*} + b^{*}\xi, where a^{*} = \mathop {\mathrm {E}}(\eta) - b^{*}\mathop {\mathrm {E}}(\xi) and b^{*} = \frac {\mathop {\mathrm {Cov}}(\xi, \eta)}{\mathop {\mathrm {Var}}(\xi)}; that is, \displaystyle {\lambda^{*}(\xi) = \mathop {\mathrm {E}}(\eta) + \frac {\mathop {\mathrm {Cov}}(\xi, \eta)}{\mathop {\mathrm {Var}}(\xi)}(\xi - \mathop {\mathrm {E}}(\xi))}. Calling \Delta^{*} = \mathop {\mathrm {E}} \big ( (\eta - \lambda^{*}(\xi))^{2} \big ) the mean-square error of estimation, we have \displaystyle {\Delta^{*} = \mathop {\mathrm {E}} \big ( (\eta - \lambda^{*}(\xi))^{2} \big ) = \mathop {\mathrm {Var}}(\eta) - \frac {\mathop {\mathrm {Cov}}^{2} \left (\xi, \eta \right )}{\mathop {\mathrm {Var}}(\xi)} = \mathop {\mathrm {Var}}(\eta) \left ( 1 - \rho^{2}(\xi, \eta) \right )}. Thus the larger \left | \rho(\xi, \eta) \right | is, the smaller the mean-square error \Delta^{*}; in particular, if |\rho(\xi, \eta)| = 1, then \Delta^{*} = 0. If the random variables \xi and \eta are uncorrelated, i.e. \rho(\xi, \eta) = 0, then \lambda^{*}(\xi) = \mathop {\mathrm {E}}(\eta); in this case the best linear estimate of \eta based on \xi is the constant \mathop {\mathrm {E}}(\eta).
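A minimal sketch of the formulas above, assuming a hypothetical joint distribution of (\xi, \eta) (the probabilities are made-up numbers): it computes a^{*} and b^{*}, and verifies that the directly computed mean-square error equals \mathop {\mathrm {Var}}(\eta)(1 - \rho^{2}(\xi, \eta)):

```python
joint = {  # (x, y): P{xi = x, eta = y} -- made-up numbers for illustration
    (0, 1): 0.2, (0, 2): 0.1,
    (1, 1): 0.3, (1, 3): 0.4,
}

def E(f):
    """Expectation of f(xi, eta) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

Ex, Ey = E(lambda x, y: x), E(lambda x, y: y)
var_x  = E(lambda x, y: (x - Ex) ** 2)
var_y  = E(lambda x, y: (y - Ey) ** 2)
cov_xy = E(lambda x, y: (x - Ex) * (y - Ey))

b_star = cov_xy / var_x
a_star = Ey - b_star * Ex

def lam(x):
    return a_star + b_star * x  # lambda*(xi)

# mean-square error, computed two ways
delta_direct  = E(lambda x, y: (y - lam(x)) ** 2)
rho2          = cov_xy ** 2 / (var_x * var_y)
delta_formula = var_y * (1 - rho2)

print(abs(delta_direct - delta_formula) < 1e-12)  # True
```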

5. Exercises

Exercise 1. k balls are thrown independently into n boxes, each ball landing in any given box with probability \frac {1}{n}. Find the expectation of the number of non-empty boxes.

:

Define random variables \xi_{1}, \xi_{2}, ..., \xi_{n} by \xi_{i} = 1 if the i-th box contains at least one ball and \xi_{i} = 0 otherwise, where i = 1, 2, ..., n. Writing \eta = \xi_{1} + \xi_{2} + ... + \xi_{n}, the random variable \eta is the number of non-empty boxes.

Each ball lands in the i-th box with probability \frac {1}{n}, so it misses that box with probability 1 - \frac {1}{n}. Since the k throws are independent, the probability that all k balls miss the i-th box is \displaystyle {\left ( 1 - \frac {1}{n} \right )^{k}}, and hence the probability that the i-th box receives at least one ball is 1 - \left ( 1 - \frac {1}{n} \right )^{k}. Therefore, \displaystyle {\mathop {\mathbf {P}} \left \{ \xi_{i} = 0 \right \} = \left ( 1 - \frac {1}{n} \right )^{k}, \mathop {\mathbf {P}} \left \{ \xi_{i} = 1 \right \} = 1 - \left ( 1 - \frac {1}{n} \right )^{k}}. By Definition 4, \displaystyle {\mathop {\mathrm {E}}(\xi_{i}) = 0 \times \mathop {\mathbf {P}} \left \{ \xi_{i} = 0 \right \} + 1 \times \mathop {\mathbf {P}} \left \{ \xi_{i} = 1 \right \} = 1 - \left ( 1 - \frac {1}{n} \right )^{k}}. It follows that \displaystyle {\begin {aligned} \mathop {\mathrm {E}}(\eta) &= \mathop {\mathrm {E}}(\xi_{1} + \xi_{2} + ... + \xi_{n}) \\ &= \mathop {\mathrm {E}}(\xi_{1}) + \mathop {\mathrm {E}}(\xi_{2}) + ... + \mathop {\mathrm {E}}(\xi_{n}) \\ &= n \left (1 - \left ( 1 - \frac {1}{n} \right )^{k} \right ). \end {aligned}}

\blacksquare
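The answer n \left ( 1 - \left ( 1 - \frac {1}{n} \right )^{k} \right ) can be checked against a Monte Carlo simulation; a sketch (the function names are ad hoc):

```python
import random

def expected_nonempty(n, k):
    # closed-form answer from Exercise 1
    return n * (1 - (1 - 1 / n) ** k)

def simulate(n, k, trials=50_000, seed=0):
    # throw k balls into n boxes uniformly at random, count distinct boxes hit
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len({rng.randrange(n) for _ in range(k)})
    return total / trials

n, k = 10, 7
print(round(expected_nonempty(n, k), 4))  # 5.217
print(abs(simulate(n, k) - expected_nonempty(n, k)) < 0.05)  # True
```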

Problem 1. Let \xi_{1}, \xi_{2}, ..., \xi_{n} be independent Bernoulli random variables with \displaystyle {\mathop {\mathbf {P}} \left \{ \xi_{i} = 0 \right \} = 1 - \lambda_{i}\Delta, \mathop {\mathbf {P}} \left \{ \xi_{i} = 1 \right \} = \lambda_{i}\Delta}, where \Delta > 0, \lambda_{i} > 0, i = 1, 2, ..., n, and \Delta is small. Here o(\cdot) denotes a term of higher order of smallness than its argument. Prove :

  1. \mathop {\mathbf {P}} \left \{ \xi_{1} + \xi_{2} + ... + \xi_{n} = 1 \right \} = \Delta \sum \limits_{i = 1}^{n}\lambda_{i} + o(\Delta);
  2. \mathop {\mathbf {P}} \left \{ \xi_{1} + \xi_{2} + ... + \xi_{n} > 1 \right \} = o(\Delta).
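A numeric look at the remainder terms in Problem 1, under hypothetical rates \lambda_{i}: the sketch below computes the exact probabilities by enumerating outcomes and shows that both P\{S = 1\} - \Delta \sum \lambda_{i} and P\{S > 1\}, divided by \Delta, shrink toward 0 as \Delta decreases:

```python
from itertools import product

lam = [0.5, 1.0, 2.0]  # hypothetical rates lambda_i

def probs(d):
    """Return (P{S = 1}, P{S > 1}) exactly, for a given small d > 0."""
    p_eq1 = p_gt1 = 0.0
    for bits in product((0, 1), repeat=len(lam)):  # all success patterns
        p = 1.0
        for b, l in zip(bits, lam):
            p *= l * d if b else 1 - l * d
        s = sum(bits)
        if s == 1:
            p_eq1 += p
        elif s > 1:
            p_gt1 += p
    return p_eq1, p_gt1

for d in (1e-2, 1e-3, 1e-4):
    p1, pg = probs(d)
    # both normalized remainders are of order d, hence vanish as d -> 0
    print((p1 - d * sum(lam)) / d, pg / d)
```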

Problem 2. Prove that \mathop {\mathrm {E}} \left ((\xi - a)^{2} \right ) attains its greatest lower bound \displaystyle {\inf \limits_{-\infty < a < +\infty}{\mathop {\mathrm {E}} \left ((\xi - a)^{2} \right )}} at a = \mathop {\mathrm {E}}(\xi), and that \inf \limits_{-\infty < a < +\infty}{\mathop {\mathrm {E}} \left ((\xi - a)^{2} \right )} = \mathop {\mathrm {Var}}(\xi).
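Problem 2's claim is easy to probe numerically before proving it (the proof itself rests on the identity \mathop {\mathrm {E}} \left ((\xi - a)^{2} \right ) = \mathop {\mathrm {Var}}(\xi) + (a - \mathop {\mathrm {E}}(\xi))^{2}). The sketch below scans a grid of candidate values of a for a made-up distribution and checks that the minimizer sits at \mathop {\mathrm {E}}(\xi), with minimal value \mathop {\mathrm {Var}}(\xi):

```python
dist = {1: 0.2, 2: 0.5, 4: 0.3}  # hypothetical distribution of xi

E_xi = sum(x * p for x, p in dist.items())
var  = sum((x - E_xi) ** 2 * p for x, p in dist.items())

def msq(a):
    # the quantity to be minimized: E((xi - a)^2)
    return sum((x - a) ** 2 * p for x, p in dist.items())

# scan a grid of candidate a's (step 0.01) covering the range of xi
grid = [i / 100 for i in range(0, 601)]
best = min(grid, key=msq)

print(abs(best - E_xi) < 0.005, abs(msq(E_xi) - var) < 1e-12)  # True True
```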

Problem 3. Let F_{\xi}(x) be the distribution function of the random variable \xi, and let m_{e} be a median of F_{\xi}(x), i.e. a point satisfying \displaystyle {F_{\xi}(m_{e}^{-}) \leq \frac {1}{2} \leq F_{\xi}(m_{e})}. Prove that \inf \limits_{-\infty < a < +\infty} \mathop {\mathrm {E}}( \left | \xi - a \right |) = \mathop {\mathrm {E}}(\left | \xi - m_{e} \right |).

Problem 4. Let P_{\xi}(x) = \mathop {\mathbf {P}} \left \{ \xi = x \right \} and F_{\xi}(x) = \mathop {\mathbf {P}} \left \{ \xi \leq x \right \}. Prove :

  1. For a > 0 and -\infty < b < +\infty, both \displaystyle {P_{a\xi + b}(x) = P_{\xi} \left ( \frac {x - b}{a} \right )} and \displaystyle {F_{a\xi + b}(x) = F_{\xi} \left ( \frac {x - b}{a} \right )} hold;
  2. If y \geq 0, then \displaystyle {F_{\xi^{2}}(y) = F_{\xi}(\sqrt {y}) - F_{\xi}(-\sqrt {y}) + P_{\xi}(-\sqrt {y})};
  3. \xi^{+} = \max \left \{ \xi, 0 \right \}, 則 \displaystyle {F_{\xi^{+}}(x) = \begin {cases} 0 & {x < 0} \\ F_{\xi}(x) & {x = 0} \\ F_{\xi}(x) & {x > 0}. \end {cases}}

Problem 5. Suppose the random variables \xi and \eta satisfy \mathop {\mathrm {E}}(\xi) = \mathop {\mathrm {E}}(\eta) = 0, \mathop {\mathrm {Var}}(\xi) = \mathop {\mathrm {Var}}(\eta) = 1, and let \rho = \rho(\xi, \eta) be their correlation coefficient. Prove that \displaystyle {\mathop {\mathrm {E}} \left (\max \left \{ \xi^{2}, \eta^{2} \right \} \right ) \leq 1 + \sqrt {1 - \rho^{2}}}.

Problem 6. Let \xi_{1}, \xi_{2}, ..., \xi_{n} be independent random variables, and let \varphi_{1} = \varphi_{1}(\xi_{1}, \xi_{2}, ..., \xi_{k}) and \varphi_{2} = \varphi_{2}(\xi_{k + 1}, \xi_{k + 2}, ..., \xi_{n}) be functions of (\xi_{1}, \xi_{2}, ..., \xi_{k}) and (\xi_{k + 1}, \xi_{k + 2}, ..., \xi_{n}), respectively. Prove that \varphi_{1} and \varphi_{2} are independent.

Problem 7. Prove that the random variables \xi_{1}, \xi_{2}, ..., \xi_{n} are independent if and only if for all x_{1}, x_{2}, ..., x_{n}, \displaystyle {F_{\xi_{1}, \xi_{2}, ..., \xi_{n}}(x_{1}, x_{2}, ..., x_{n}) = F_{\xi_{1}}(x_{1})F_{\xi_{2}}(x_{2})...F_{\xi_{n}}(x_{n})}, where F_{\xi_{1}, \xi_{2}, ..., \xi_{n}}(x_{1}, x_{2}, ..., x_{n}) = \mathop {\mathbf {P}} \left \{ \xi_{1} \leq x_{1}, \xi_{2} \leq x_{2}, ..., \xi_{n} \leq x_{n} \right \}.

Problem 8. Prove that the random variable \xi is independent of itself, i.e. \xi and \xi are independent, if and only if \xi is constant.

Problem 9. Under what conditions on the random variable \xi are \xi and \sin {\xi} independent?

Problem 10. Let \xi and \eta be independent random variables with \eta \neq 0. Express the probabilities \displaystyle {\mathop {\mathbf {P}} \left \{ \xi\eta \leq z \right \} \text { and } \mathop {\mathbf {P}} \left \{ \frac {\xi}{\eta} \leq z \right \}} in terms of the probabilities P_{\xi}(x) and P_{\eta}(y).

Problem 11. Let the random variables \xi, \eta, \zeta satisfy \left | \xi \right | \leq 1, \left | \eta \right | \leq 1, \left | \zeta \right | \leq 1. Prove that the Bell inequality (J. S. Bell) \displaystyle {\left | \mathop {\mathrm {E}}(\xi\zeta) - \mathop {\mathrm {E}}(\eta\zeta) \right | \leq 1 - \mathop {\mathrm {E}}(\xi\eta)} holds.
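The Bell inequality in its standard form \left | \mathop {\mathrm {E}}(\xi\zeta) - \mathop {\mathrm {E}}(\eta\zeta) \right | \leq 1 - \mathop {\mathrm {E}}(\xi\eta) can be stress-tested numerically. The sketch below (names are ad hoc) draws many random triples (\xi, \eta, \zeta) with values in [-1, 1] on a finite equiprobable sample space and checks the inequality each time:

```python
import random

rng = random.Random(42)

def random_model(m=6):
    """m equally likely outcomes, each mapping to a triple in [-1, 1]."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
            for _ in range(m)]

def check(model):
    # expectation under the uniform measure on the m outcomes
    E = lambda f: sum(f(*w) for w in model) / len(model)
    lhs = abs(E(lambda x, y, z: x * z) - E(lambda x, y, z: y * z))
    rhs = 1 - E(lambda x, y, z: x * y)
    return lhs <= rhs + 1e-12  # tolerance for rounding

print(all(check(random_model()) for _ in range(10_000)))  # True
```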