What is a Random Variable?

Connecting the intuitive idea to the mathematical definition

Consider an experiment E which can produce multiple outcomes, with uncertainty as to which one will be produced at a given time. Let the set of outcomes be denoted by Ω. [Note that E is not a function; there is no way to have uncertainty in the output of a function.]

Let Ω carry a σ-algebra of measurable subsets (the events) and a probability measure P defined on it (the probability of an event). With this structure in place, we can define:

A map

    X : Ω → ℝ

is called a random variable if the set X⁻¹( (−∞, λ) ) is measurable for every real number λ. This implies that the inverse images of all the Borel sets (formed from countable unions and intersections of such intervals) are also measurable, so P can assign them probabilities.

So calling it a "variable" is technically incorrect, but it is nonetheless a good way to think about it: the name is suggestive of a number obtained as the outcome of an experiment. For example, let E be rolling a standard 6-sided die, and let X be the value obtained. An elementary probability book might have something like
P( X = 5 ) = 1/6

which would be rewritten in more mathematical formalism as

P( X⁻¹( {5} ) ) = 1/6
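The elementary statement P(X = 5) = 1/6 can also be checked empirically. A minimal Python sketch (the sample size and fixed seed are arbitrary choices here):

```python
import random

# Simulate rolling a fair 6-sided die many times and estimate P(X = 5).
# The "experiment" E is the roll; X is the value shown.
random.seed(0)  # fixed seed so the run is reproducible

n_rolls = 100_000
rolls = [random.randint(1, 6) for _ in range(n_rolls)]
p_hat = rolls.count(5) / n_rolls

print(p_hat)  # empirical frequency, close to 1/6 ≈ 0.1667
```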

Mathematically, there is no concept of "performing the experiment". We only deal with static relationships between sets:

Define a cumulative distribution function (cdf)

    F_X : ℝ → [0, 1]
        t ↦ P( X⁻¹( (−∞, t) ) )
We can then differentiate F to get f(x) = F′(x); for "nice" problems, f is defined at every point x in the image X(Ω), except possibly on a set of measure 0.

f is the probability density function or pdf (it is a "weight function", i.e. a weighting for integrating against). It should really be notated f_X, but that can be cumbersome. It exists for all "nice" problems, and from my point of view (and that of all elementary textbooks) it is the easiest thing to work with, conceptually.

F(x) = ∫_{−∞}^{x} f(t) dt
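The relation F(x) = ∫ f can be verified numerically for a distribution with a known closed-form cdf. A sketch using the exponential distribution with rate 1 (my choice of example; the step count is arbitrary):

```python
import math

# Recover the cdf F from the pdf f by numerical integration, using the
# exponential distribution with rate 1 as a concrete example:
#   f(t) = e^{-t} for t >= 0, and F(x) = 1 - e^{-x} in closed form.

def f(t):
    return math.exp(-t) if t >= 0 else 0.0

def F(x, steps=10_000):
    # Left Riemann sum of f over [0, x]; crude but enough to see F = ∫ f
    if x <= 0:
        return 0.0
    dt = x / steps
    return sum(f(i * dt) * dt for i in range(steps))

print(F(1.0))               # numerical integral
print(1 - math.exp(-1.0))   # closed form, ≈ 0.632
```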

Expectation
   
E(x) = t · f(t) dt
  -∞  

This is really a statement about f. I think of f as the "distribution", for example, imagine the Gaussian bell shaped curve. That is really a picture of f for a Gaussian (Normal) distribution.

E(X) is called the expected value, but this term is misleading. It is not necessarily the value with the maximum probability weight; consider the exponential distribution. It is the "most probable value" only in certain important special cases, such as when the pdf is a symmetric bump function like the Normal distribution. A better term would be the "center" or "balance point" of the pdf (well, technically of the real line, with points weighted by the pdf). The torque about the point μ must equal 0:

∫ (x − μ) f(x) dx = 0    ⟺    ∫ x f(x) dx = μ ∫ f(x) dx = μ · 1 = μ
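The balance-point identity can be checked numerically. A sketch, again using the exponential distribution with rate 1 as an arbitrary test case (its pdf is f(x) = e^{−x} and its mean is μ = 1):

```python
import math

# Check the "balance point" identity for the exponential distribution
# with rate 1: the torque about μ = 1, ∫ (x - μ) f(x) dx, should vanish,
# and ∫ x f(x) dx should come out to μ.

def f(x):
    return math.exp(-x)

mu = 1.0
dx = 0.001
xs = [i * dx for i in range(int(50 / dx))]  # [0, 50) holds essentially all the mass

torque = sum((x - mu) * f(x) * dx for x in xs)
mean = sum(x * f(x) * dx for x in xs)

print(torque)  # ≈ 0
print(mean)    # ≈ 1
```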

   
Var(X) = ∫_{−∞}^{∞} (t − μ)² · f(t) dt

The variance Var(X) corresponds to the moment of inertia in mechanics, just as E[X] corresponds to the balance point or center of mass of the domain, weighted by the pdf.
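The moment-of-inertia analogy can be made concrete: integrating (t − c)² against the pdf for several pivot points c, the "inertia" is smallest at the balance point c = μ, mirroring the parallel-axis theorem in mechanics. A sketch under the same arbitrary test case (exponential, rate 1, so μ = 1 and Var(X) = 1):

```python
import math

# "Inertia" of the pdf about a pivot point c: ∫ (t - c)^2 f(t) dt.
# At c = μ this is Var(X); at any other c it is strictly larger.
# Test case: exponential distribution with rate 1 (μ = 1, Var(X) = 1).

def f(t):
    return math.exp(-t)

dt = 0.001
ts = [i * dt for i in range(int(50 / dt))]

def inertia(c):
    return sum((t - c) ** 2 * f(t) * dt for t in ts)

print(inertia(1.0))  # about μ = 1: Var(X) ≈ 1
print(inertia(0.0))  # about 0: Var(X) + μ² ≈ 2
print(inertia(2.0))  # about 2: Var(X) + (2 - μ)² ≈ 2
```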

E(X) and Var(X) are coarse measurements of a distribution! But they can tell us a lot of useful information.

Chebyshev Inequality

Let X be an arbitrary random variable with

    E(X) = μ
    Var(X) = σ²

Then

    P( X ∉ (μ − t, μ + t) ) ≤ σ² / t²     for any real t > 0.
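The inequality holds for any distribution with finite variance; here is an empirical check against one arbitrary test case of mine, the exponential distribution with rate 1 (μ = 1, σ² = 1):

```python
import random

# Empirical check of Chebyshev's inequality for the exponential
# distribution with rate 1 (μ = 1, σ² = 1). For each t, the observed
# tail mass P( X ∉ (μ - t, μ + t) ) must not exceed σ² / t².
random.seed(1)  # fixed seed for reproducibility

mu, var = 1.0, 1.0
samples = [random.expovariate(1.0) for _ in range(100_000)]

results = []
for t in (1.5, 2.0, 3.0):
    tail = sum(abs(x - mu) >= t for x in samples) / len(samples)
    bound = var / t ** 2
    results.append((t, tail, bound))
    print(t, tail, bound)  # tail stays below the Chebyshev bound
```

The bound is loose here (the exponential tails are much lighter than σ²/t²), which is the price of an inequality that holds for every distribution with finite variance.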

What does it mean for RVs to be independent?

Law(s) of Large Numbers

A central result of probability and statistics is the following convergence:

    (1/n) Σ_{i=1}^{n} X_i  →  μ    (in some sense)

Weak law: the convergence is in probability

Strong law: the convergence is almost sure
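The convergence is easy to watch in simulation. A sketch using die rolls (μ = 3.5; the sample sizes and seed are arbitrary choices of mine):

```python
import random

# Law of Large Numbers sketch: averages of n fair-die rolls drift
# toward μ = E(X) = 3.5 as n grows.
random.seed(2)  # fixed seed for reproducibility

mu = 3.5
for n in (100, 10_000, 1_000_000):
    avg = sum(random.randint(1, 6) for _ in range(n)) / n
    print(n, avg)  # the averages approach 3.5
```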

Central Limit Theorem(s)
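One way to see the CLT at work: sums of many independent die rolls, once standardized, behave like a standard normal. A minimal simulation sketch (the choices of 100 rolls per sum, 20,000 trials, and the fixed seed are all arbitrary):

```python
import math
import random

# CLT sketch: standardize sums of n fair-die rolls and check that roughly
# 68% of them land within one standard deviation of 0, as a standard
# normal would.
random.seed(3)  # fixed seed for reproducibility

mu, sigma = 3.5, math.sqrt(35 / 12)  # mean and sd of a single die roll
n = 100                              # rolls per sum
trials = 20_000

zs = []
for _ in range(trials):
    s = sum(random.randint(1, 6) for _ in range(n))
    zs.append((s - n * mu) / (sigma * math.sqrt(n)))

within_1sd = sum(abs(z) <= 1 for z in zs) / trials
print(within_1sd)  # close to 0.68, the standard normal mass in [-1, 1]
```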

The important properties of covariance are:
