# Probability & Statistics Cheatsheet

Essential formulas and concepts for probability theory and statistics.

## Probability Basics

### Set Operations

#### Fundamental Rules
- Complement: \(P(A^c) = 1 - P(A)\)
- Union: \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
- Intersection (independent): \(P(A \cap B) = P(A) \cdot P(B)\)
- Conditional Probability: \(P(A \mid B) = \frac{P(A \cap B)}{P(B)}\)
### Bayes’ Theorem

\[P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}\]

```mermaid
graph LR
    S["🎲 Sample Space"] --> A["A — P(A)"]
    S --> Ac["A' — P(A')"]
    A --> AB["B | A — P(B|A)"]
    A --> ABc["B' | A — P(B'|A)"]
    Ac --> AcB["B | A' — P(B|A')"]
    Ac --> AcBc["B' | A' — P(B'|A')"]
    style S fill:#2563eb,color:#fff,stroke:none
    style A fill:#16a34a,color:#fff,stroke:none
    style Ac fill:#dc2626,color:#fff,stroke:none
    style AB fill:#22c55e,color:#1f2328,stroke:none
    style ABc fill:#22c55e,color:#1f2328,stroke:none
    style AcB fill:#f87171,color:#1f2328,stroke:none
    style AcBc fill:#f87171,color:#1f2328,stroke:none
```
To find \(P(A \mid B)\): follow the branch through A to B, then divide by total probability of B across all branches.
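The branch-and-divide computation can be sketched numerically; the prior and likelihoods below are made-up illustration values, not data from the text:

```python
# Bayes' theorem on the probability tree: P(A|B) = P(B|A) * P(A) / P(B),
# where P(B) is summed over the A and A' branches (law of total probability).
# The prior and likelihoods here are hypothetical illustration values.
p_a = 0.3           # prior P(A)
p_b_given_a = 0.9   # P(B | A)
p_b_given_ac = 0.2  # P(B | A')

# Total probability of B across both branches:
# P(B) = P(B|A) P(A) + P(B|A') P(A')
p_b = p_b_given_a * p_a + p_b_given_ac * (1 - p_a)

# Posterior: the A-then-B branch divided by the total probability of B
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))  # 0.27 / 0.41
```

Note how a modest prior (0.3) combined with a strong likelihood ratio (0.9 vs. 0.2) pulls the posterior up to about 0.66.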
Generalized form:
\[P(A_i \mid B) = \frac{P(B \mid A_i) \cdot P(A_i)}{\sum_{j} P(B \mid A_j) \cdot P(A_j)}\]

### Law of Total Probability

\[P(B) = \sum_{i} P(B \mid A_i) \cdot P(A_i)\]

### Counting
- Permutations (order matters): \(P(n, r) = \frac{n!}{(n-r)!}\)
- Combinations (order doesn’t matter): \(\binom{n}{r} = \frac{n!}{r!(n-r)!}\)
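Python's standard library implements both formulas directly (Python 3.8+), which is handy for spot-checking hand calculations:

```python
import math

# Permutations P(n, r) = n! / (n-r)!  -- order matters
# Combinations C(n, r) = n! / (r! (n-r)!)  -- order does not matter
n, r = 5, 3
perms = math.perm(n, r)   # 5 * 4 * 3 = 60
combs = math.comb(n, r)   # 60 / 3! = 10
print(perms, combs)
```

The two are related by \(P(n, r) = \binom{n}{r} \cdot r!\): each unordered selection can be arranged in \(r!\) orders.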
## Descriptive Statistics

### Measures of Central Tendency
- Mean: \(\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\)
- Median: Middle value when data is sorted
- Mode: Most frequently occurring value
### Measures of Spread
- Variance (population): \(\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2\)
- Variance (sample): \(s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2\)
- Standard Deviation: \(\sigma = \sqrt{\sigma^2}\)
- Interquartile Range: \(\text{IQR} = Q_3 - Q_1\)
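The `statistics` module in Python's standard library computes all of these; note the population vs. sample variance distinction from the formulas above (divide by \(N\) vs. \(n-1\)). The data values here are arbitrary illustration numbers:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

mean = statistics.mean(data)          # 40 / 8 = 5
median = statistics.median(data)      # average of middle values 4 and 5 = 4.5
mode = statistics.mode(data)          # 4 (appears three times)
pop_var = statistics.pvariance(data)  # divides by N
samp_var = statistics.variance(data)  # divides by n - 1 (Bessel's correction)

print(mean, median, mode, pop_var, samp_var)
```

The sample variance is always slightly larger than the population variance on the same data, since it divides by \(n-1\) rather than \(n\).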
### Covariance & Correlation
- Covariance: \(\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]\)
- Pearson Correlation: \(r = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}\) where \(-1 \leq r \leq 1\)
```mermaid
graph LR
    N1["-1<br/>Perfect negative"] ~~~ N5["-0.5<br/>Moderate negative"] ~~~ Z["0<br/>No correlation"] ~~~ P5["+0.5<br/>Moderate positive"] ~~~ P1["+1<br/>Perfect positive"]
    style N1 fill:#dc2626,color:#fff,stroke:none
    style N5 fill:#f87171,color:#1f2328,stroke:none
    style Z fill:#6b7280,color:#fff,stroke:none
    style P5 fill:#4ade80,color:#1f2328,stroke:none
    style P1 fill:#16a34a,color:#fff,stroke:none
```
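A minimal sketch of the Pearson formula (the function name `pearson_r` is our own; libraries such as NumPy provide the same computation):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear, r close to +1
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly anti-linear, r close to -1
```

Because both numerator and denominator use the same (population) normalization, the \(1/n\) factors cancel, and \(r\) is the same whether you use population or sample conventions consistently.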
## Common Distributions

### Discrete Distributions
| Distribution | PMF | Mean | Variance |
|---|---|---|---|
| Bernoulli | \(P(X=k) = p^k(1-p)^{1-k}\) | \(p\) | \(p(1-p)\) |
| Binomial | \(P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}\) | \(np\) | \(np(1-p)\) |
| Poisson | \(P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}\) | \(\lambda\) | \(\lambda\) |
| Geometric | \(P(X=k) = (1-p)^{k-1}p\) | \(\frac{1}{p}\) | \(\frac{1-p}{p^2}\) |
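The table's PMFs translate directly into code, and the means can be verified by summing \(k \cdot P(X=k)\). The helper names `binomial_pmf` and `poisson_pmf` are our own:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) = lam^k * e^(-lam) / k!."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Sanity check against the table: the mean of Binomial(n, p) is n*p
n, p = 10, 0.3
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(round(mean, 6))  # n * p = 3.0
```

The same check works for the Poisson mean \(\lambda\), truncating the infinite sum once the terms become negligible.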
### Continuous Distributions

| Distribution | PDF | Mean | Variance |
|---|---|---|---|
| Uniform | \(f(x) = \frac{1}{b-a}\) on \([a, b]\) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) |
| Normal | \(f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) | \(\mu\) | \(\sigma^2\) |
| Exponential | \(f(x) = \lambda e^{-\lambda x}\) for \(x \geq 0\) | \(\frac{1}{\lambda}\) | \(\frac{1}{\lambda^2}\) |
### Standard Normal Distribution

\[Z = \frac{X - \mu}{\sigma}\]

The 68-95-99.7 rule (empirical rule):
```mermaid
block-beta
    columns 7
    L3["-3σ"] L2["-2σ"] L1["-1σ"] M["μ"] R1["+1σ"] R2["+2σ"] R3["+3σ"]
    style L3 fill:#6b7280,color:#fff,stroke:none
    style L2 fill:#93c5fd,color:#1f2328,stroke:none
    style L1 fill:#3b82f6,color:#fff,stroke:none
    style M fill:#1d4ed8,color:#fff,stroke:none
    style R1 fill:#3b82f6,color:#fff,stroke:none
    style R2 fill:#93c5fd,color:#1f2328,stroke:none
    style R3 fill:#6b7280,color:#fff,stroke:none
```
| Range | Coverage |
|---|---|
| \(\mu \pm 1\sigma\) | 68% of data |
| \(\mu \pm 2\sigma\) | 95% of data |
| \(\mu \pm 3\sigma\) | 99.7% of data |
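The table's percentages are rounded; the exact coverages follow from the standard normal CDF, which can be written in terms of the error function \(\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\):

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Coverage of mu +/- k*sigma is Phi(k) - Phi(-k)
for k in (1, 2, 3):
    coverage = normal_cdf(k) - normal_cdf(-k)
    print(k, round(coverage, 4))  # 0.6827, 0.9545, 0.9973
```

So "95%" is really about 95.45% at exactly \(\pm 2\sigma\); the conventional 95% interval uses \(\pm 1.96\sigma\) instead.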
## Expected Value & Moments
- Expected Value: \(E[X] = \sum_{i} x_i P(x_i)\) (discrete), \(E[X] = \int_{-\infty}^{\infty} x f(x) \,dx\) (continuous)
- Linearity: \(E[aX + bY] = aE[X] + bE[Y]\)
- Variance via Expectation: \(\text{Var}(X) = E[X^2] - (E[X])^2\)
- Variance of Sum (independent): \(\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)\)
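The identity \(\text{Var}(X) = E[X^2] - (E[X])^2\) is easy to verify on a small discrete example, here a fair six-sided die:

```python
# Verify Var(X) = E[X^2] - (E[X])^2 for a fair six-sided die
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # each outcome equally likely

e_x = sum(x * p for x in outcomes)        # E[X] = 21/6 = 3.5
e_x2 = sum(x * x * p for x in outcomes)   # E[X^2] = 91/6
var = e_x2 - e_x ** 2                     # 91/6 - 3.5^2 = 35/12
print(round(e_x, 4), round(var, 4))
```

Computing \(E[X^2]\) first is usually less error-prone by hand than summing squared deviations from the mean.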
## Inference

### Confidence Intervals
- For population mean (known \(\sigma\)): \(\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
- For population mean (unknown \(\sigma\)): \(\bar{x} \pm t_{\alpha/2, \, n-1} \cdot \frac{s}{\sqrt{n}}\)
| Confidence Level | \(z_{\alpha/2}\) |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
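Putting the known-\(\sigma\) formula and the z table together; the sample mean, \(\sigma\), and \(n\) below are hypothetical numbers chosen for illustration:

```python
import math

# 95% CI for a mean with known sigma: x_bar +/- z * sigma / sqrt(n)
# The sample values here are hypothetical.
x_bar, sigma, n = 52.1, 8.0, 64
z = 1.960  # z_{alpha/2} for 95% confidence

margin = z * sigma / math.sqrt(n)  # 1.960 * 8 / 8 = 1.96
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))  # 50.14 54.06
```

With unknown \(\sigma\), swap in the sample standard deviation \(s\) and the \(t_{\alpha/2,\,n-1}\) critical value, which is slightly larger than \(z_{\alpha/2}\) for small \(n\).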
### Hypothesis Testing
1. State the null \(H_0\) and alternative \(H_a\)
2. Choose a significance level \(\alpha\) (commonly 0.05)
3. Compute the test statistic
4. Find the p-value or compare to the critical value
5. Reject \(H_0\) if the p-value is \(< \alpha\)
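The steps above can be sketched as a one-sample, two-sided z-test; the population parameters and sample values are hypothetical:

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# H0: mu = 100 vs Ha: mu != 100, with known sigma = 15.
# A hypothetical sample of n = 36 has mean 105.
x_bar, mu0, sigma, n = 105.0, 100.0, 15.0, 36

# Step 3: test statistic z = (x_bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / math.sqrt(n))  # 5 / 2.5 = 2.0

# Step 4: two-sided p-value = P(|Z| >= |z|)
p_value = 2 * (1 - normal_cdf(abs(z)))

# Step 5: compare to alpha
alpha = 0.05
print(round(z, 3), round(p_value, 4), p_value < alpha)
```

Here \(z = 2.0\) gives a p-value of about 0.0455, just under \(\alpha = 0.05\), so \(H_0\) is (narrowly) rejected.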
### Choosing a Test
```mermaid
flowchart TD
    Q["What are you testing?"] --> Means["Comparing means"]
    Q --> Prop["Comparing proportions"]
    Q --> Fit["Goodness of fit /<br/>independence"]
    Means --> KnownSig{"σ known?"}
    KnownSig -->|Yes| Z["Z-test"]
    KnownSig -->|No| Samples{"How many<br/>samples?"}
    Samples -->|1 or 2| T["T-test"]
    Samples -->|3+| ANOVA["ANOVA (F-test)"]
    Prop --> ZProp["Z-test for<br/>proportions"]
    Fit --> Chi["Chi-squared test"]
    style Q fill:#2563eb,color:#fff,stroke:none
    style Means fill:#7c3aed,color:#fff,stroke:none
    style Prop fill:#7c3aed,color:#fff,stroke:none
    style Fit fill:#7c3aed,color:#fff,stroke:none
    style KnownSig fill:#6b7280,color:#fff,stroke:none
    style Samples fill:#6b7280,color:#fff,stroke:none
    style Z fill:#16a34a,color:#fff,stroke:none
    style T fill:#16a34a,color:#fff,stroke:none
    style ANOVA fill:#16a34a,color:#fff,stroke:none
    style ZProp fill:#16a34a,color:#fff,stroke:none
    style Chi fill:#16a34a,color:#fff,stroke:none
```
### Common Tests
- Z-test: \(z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}\) — known population variance
- T-test: \(t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}\) — unknown population variance, small sample
- Chi-squared test: \(\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}\) — goodness of fit, independence
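A goodness-of-fit example for the chi-squared statistic; the observed counts are hypothetical rolls of a die:

```python
# Chi-squared goodness-of-fit statistic for a die rolled 60 times.
# Observed counts are hypothetical; a fair die expects 10 per face.
observed = [8, 9, 12, 11, 6, 14]
expected = [10] * 6

# chi^2 = sum over categories of (O_i - E_i)^2 / E_i
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # 42 / 10 = 4.2
```

With \(6 - 1 = 5\) degrees of freedom, the \(\alpha = 0.05\) critical value is about 11.07, so 4.2 gives no evidence against fairness.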
### Type I & Type II Errors

| | \(H_0\) True | \(H_0\) False |
|---|---|---|
| Reject \(H_0\) | Type I Error (\(\alpha\)) | Correct |
| Fail to reject \(H_0\) | Correct | Type II Error (\(\beta\)) |
- Power: \(1 - \beta\) — probability of correctly rejecting a false \(H_0\)
## Central Limit Theorem
For a sample of size \(n\) drawn from a population with mean \(\mu\) and standard deviation \(\sigma\):
\[\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \text{ as } n \to \infty\]

The sampling distribution of \(\bar{X}\) approaches normal regardless of the population distribution, provided \(n\) is sufficiently large (typically \(n \geq 30\)).
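A quick simulation illustrates the theorem: sample means from a (decidedly non-normal-looking) uniform population cluster around \(\mu\) with variance \(\sigma^2 / n\). The seed and sample sizes are arbitrary choices for the sketch:

```python
import random
import statistics

random.seed(42)  # arbitrary seed for reproducibility

# Population: Uniform(0, 1), which has mean 0.5 and variance 1/12
n, trials = 30, 2000
means = [
    statistics.mean(random.random() for _ in range(n))
    for _ in range(trials)
]

# By the CLT, these means should cluster near mu = 0.5
# with variance roughly sigma^2 / n = (1/12) / 30
print(round(statistics.mean(means), 3))     # close to 0.5
print(round(statistics.variance(means), 4)) # close to 0.0028
```

Plotting a histogram of `means` would show the familiar bell shape even though the underlying population is flat.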
This cheatsheet covers probability fundamentals, descriptive statistics, common distributions, expected values, confidence intervals, hypothesis testing, and the central limit theorem.