Probability

Classical definition of Probability

The classical definition of probability, also known as the Laplace definition, states that if an experiment has a finite number of equally likely outcomes, the probability of an event occurring is given by:

$$P(E) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$$
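
For example, the probability of rolling an even number on a fair six-sided die:

$$P(\text{even}) = \frac{3}{6} = \frac{1}{2}$$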

Combined events

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
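
For example, drawing one card from a standard 52-card deck, the probability that it is a heart or a king:

$$P(\text{heart} \cup \text{king}) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$$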

Independence Relation

If events $A$ and $B$ are independent, then

$$P(A \cap B) = P(A)\,P(B)$$
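
For example, two fair coin flips are independent, so

$$P(\text{two heads}) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$$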

Conditional Probability

$P(B|A)$ is the probability of $B$ occurring given that $A$ has already occurred. We see that

$$P(A)\,P(B|A) = P(A \cap B)$$

Thus

$$P(B|A) = \frac{P(A \cap B)}{P(A)}$$

Or

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
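
For example, drawing two cards without replacement, with $A$ = "first card is an ace" and $B$ = "second card is an ace":

$$P(A \cap B) = P(A)\,P(B|A) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}$$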

Bayes' Theorem

Since $P(A \cap B) = P(B|A)\,P(A)$, substituting into the formula for $P(A|B)$ gives

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$

Since $P(B)$ can be rewritten as $P(B \cap A) + P(B \cap A')$, and each intersection expands via the conditional probability statement as $P(B|A)\,P(A)$ and $P(B|A')\,P(A')$, the denominator can be expanded:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B|A)\,P(A) + P(B|A')\,P(A')}$$
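
A minimal numeric sketch of this expanded form; the prevalence and test rates below are made-up illustration values:

```python
# Bayes' theorem with the denominator expanded by total probability.
# A = "has the condition", B = "test is positive" (illustrative numbers).
p_a = 0.01              # P(A): prior probability
p_b_given_a = 0.95      # P(B|A): true positive rate
p_b_given_not_a = 0.05  # P(B|A'): false positive rate

p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)  # P(B) by total probability
p_a_given_b = p_b_given_a * p_a / p_b                  # Bayes' theorem

print(p_a_given_b)  # ~0.161: most positives come from the much larger A' group
```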

Random Variables

Probability Mass Function (PMF)

It denotes the probability that a discrete random variable $X$ takes the value $x$.

$$P(X = x)$$
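
A minimal sketch of a PMF in Python, using a fair six-sided die:

```python
from fractions import Fraction

# PMF of a fair six-sided die: P(X = x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

print(pmf[3])             # P(X = 3) = 1/6
print(sum(pmf.values()))  # the probabilities sum to 1
```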

Probability Density Function (PDF)

For a continuous random variable, $P(X = x)$ is not meaningful as it would just be 0. For example, suppose a computer can randomly generate real numbers. It is impossible that it will output exactly $x$, because there are uncountably infinitely many other possibilities. However, its output can come ridiculously close to $x$, and how likely it is to land near $x$ is described by the probability density at $x$.

$$f_X(x)$$

where $f_X$ is the probability density function, which measures how likely $X$ is to fall near $x$.

To find the probability that $a \le X \le b$:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx$$
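
A minimal sketch of this integral in Python, taking $f_X$ to be the standard normal density (scipy's quad does the numerical integration):

```python
import math
from scipy.integrate import quad

def f(x):
    # Standard normal PDF
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

# P(-1 <= X <= 1) for X ~ N(0, 1)
prob, _ = quad(f, -1, 1)
print(prob)  # ~0.6827
```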

Expectation

The expectation function reveals the long-run average value of $X$: the sum, over all $x$, of $x$ weighted by the probability of $x$ occurring. #Expectation

Discrete:

$$E(X) = \sum_x x\,P(X = x)$$

Continuous:

$$E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$$
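
A quick check of the discrete formula on the fair-die PMF:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum of x * P(X = x) over all x
expectation = sum(x * p for x, p in pmf.items())
print(expectation)  # 7/2, i.e. 3.5
```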

Properties of Expectation

Constant

$$E(c) = c$$

Scalar Multiplicativity

$$E(aX) = aE(X)$$

Linearity of Expectation

$$E(X + Y) = E(X) + E(Y)$$

Product of Two independent Variables

If $X$ and $Y$ are independent, then

$$E(XY) = E(X)\,E(Y)$$
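
A Monte Carlo sketch of the last two properties, using two independent fair dice (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.integers(1, 7, size=n)  # rolls of X
y = rng.integers(1, 7, size=n)  # rolls of Y, independent of X

print((x + y).mean())  # ~7.0   = E(X) + E(Y) = 3.5 + 3.5
print((x * y).mean())  # ~12.25 = E(X) * E(Y), since X and Y are independent
```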

Variance

Formula

Variance measures the spread of the random variable $X$ about its mean $E(X)$. #Variance

$$\mathrm{Var}(X) = E\big[(X - E(X))^2\big]$$

We can also rewrite it as

$$\begin{aligned}
\mathrm{Var}(X) &= E\big[(X - E(X))^2\big] \\
&= E\big[X^2 - 2E(X)X + E(X)^2\big] \\
&= E(X^2) - 2E(X)^2 + E(X)^2 \\
&= E(X^2) - E(X)^2
\end{aligned}$$

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = \sigma^2$$

where $\sigma$ is the standard deviation of $X$.
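
A quick check that both forms agree on the fair die:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())

var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - E(X))^2]
var_short = sum(x**2 * p for x, p in pmf.items()) - mean**2  # E(X^2) - E(X)^2

print(var_def, var_short)  # both 35/12
```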

Properties of Variance

Constant

$$\mathrm{Var}(c) = 0$$

Scalar Multiplicativity

$$\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$$

Translation Invariance

$$\mathrm{Var}(X + c) = \mathrm{Var}(X)$$

Additivity

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + 2\,\mathrm{Cov}(X, Y) + \mathrm{Var}(Y)$$

If $X$ and $Y$ are independent, their covariance is 0, so

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
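
A Monte Carlo sketch of additivity with the same independent dice:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.integers(1, 7, size=n)
y = rng.integers(1, 7, size=n)  # independent of x

# Var(X) = Var(Y) = 35/12 ~ 2.917, so Var(X + Y) should be ~5.833
print(np.var(x), np.var(y), np.var(x + y))
```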

Cumulative Distribution Function

The cumulative distribution function is the running total of the probability mass function (discrete case) or the integral of the probability density function (continuous case). For many common distributions its graph is an S-shaped curve.

Discrete

$$F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$$

Continuous

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$$
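
A sketch of the discrete case, accumulating the die PMF into its CDF:

```python
from fractions import Fraction
from itertools import accumulate

pmf = {x: Fraction(1, 6) for x in range(1, 7)}
# F(x) is the running sum of P(X = t) for t <= x
cdf = dict(zip(pmf, accumulate(pmf.values())))

print(cdf[4])  # P(X <= 4) = 2/3
print(cdf[6])  # 1, since the running sum ends at the total probability
```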

Look at Distributions

Binomial Distributions

For $X \sim B(n, p)$:

$$E(X) = np \qquad \mathrm{Var}(X) = np(1 - p)$$
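
A quick numeric check with scipy.stats.binom ($n$ and $p$ here are arbitrary illustration values):

```python
from scipy.stats import binom

n, p = 10, 0.3
mean, var = binom.stats(n, p, moments='mv')

print(mean, var)               # 3.0 2.1
print(n * p, n * p * (1 - p))  # matches np and np(1 - p)
```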

Modern Axiomatic Definition

It is based on the Kolmogorov probability axioms, which extend the definition to outcomes with unequal probabilities and to experiments with infinitely many outcomes.

Kolmogorov Probability Axioms

Let $(\Omega, \mathcal{F}, P)$ be a measure space where
- $\Omega$ is the sample space (the set of all outcomes)
- $\mathcal{F}$ is the event space (the set of all events), $\mathcal{F} = \{E_1, E_2, \dots\}$
- $P$ is the probability measure, which assigns a probability to each event $E$

1. $P(E) \in \mathbb{R}$ and $P(E) \ge 0$ for all $E \in \mathcal{F}$
2. $P(\Omega) = 1$
3. $P\left(\bigcup_{i=0}^{\infty} E_i\right) = \sum_{i=0}^{\infty} P(E_i)$ if the $E_i$ are pairwise disjoint
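
A finite sanity check of the axioms in Python: a minimal sketch taking $\Omega$ to be a die's outcomes and $P$ to sum the PMF over an event (only a few sample events are checked, not the full event space):

```python
from fractions import Fraction

omega = set(range(1, 7))
pmf = {x: Fraction(1, 6) for x in omega}

def P(event):
    # Probability measure: sum the PMF over the outcomes in the event
    return sum(pmf[x] for x in event)

evens, odds = {2, 4, 6}, {1, 3, 5}
print(P(evens) >= 0)                          # axiom 1: non-negative
print(P(omega) == 1)                          # axiom 2: total mass is 1
print(P(evens | odds) == P(evens) + P(odds))  # axiom 3 for disjoint events
```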