Stochastic Processes

From Static Distributions to Dynamic Random Phenomena

probability
foundations
Author

Universe Office

Published

April 4, 2026

Introduction

A stock price jumps 2% on Monday, dips 1% on Tuesday, then drifts sideways for a week. Customers arrive at a bank branch at unpredictable intervals throughout the day. A company’s credit rating migrates from A to BBB over three years, then to BB, then defaults.

Each of these is a random phenomenon that evolves over time. The tools from the previous five articles — probability spaces, random variables, expectations, conditional probability, limit theorems — all deal with static snapshots: a single roll, a fixed sample, a one-time measurement. To model phenomena that unfold dynamically, you need a family of random variables indexed by time: a stochastic process.

Think of a stochastic process as a movie rather than a photograph. Each frame is a random variable at a specific time. The full movie — the entire path from start to finish — is one realization of the process.

The previous article showed that sample averages converge to the true mean (LLN) and that their fluctuations are asymptotically Normal (CLT). This article moves from static collections to dynamic evolution.

This article provides an overview of the main types of stochastic processes (Ross 2014; Casella and Berger 2002). For a deeper treatment — especially the continuous-time theory needed in derivatives pricing (Shreve 2004) — see the financial engineering series at 300-financial-engineering/300-stochastic-processes/.

Definition and Classification

Definition (Stochastic Process; Ross, 2014)

A stochastic process is a collection of random variables \(\{X(t) : t \in T\}\) defined on a common probability space \((\Omega, \mathcal{F}, P)\), where \(T\) is an index set representing time.

For a fixed \(\omega \in \Omega\), the function \(t \mapsto X(t, \omega)\) is called a sample path (or realization). A stochastic process can therefore be viewed in two ways:

  • For fixed \(t\): a random variable \(X(t)\)
  • For fixed \(\omega\): a deterministic function of time (a path)

Classification

Stochastic processes are classified along two axes — the nature of the time index and the nature of the state space:

|                 | Discrete state                           | Continuous state               |
|-----------------|------------------------------------------|--------------------------------|
| Discrete time   | Markov chain (credit rating transitions) | AR(1) process (quarterly GDP)  |
| Continuous time | Poisson process (default arrivals)       | Brownian motion (stock prices) |

Figure 1 illustrates this taxonomy with examples from finance and science.

Figure 1: Classification of stochastic processes
Code
import numpy as np

rng = np.random.default_rng(seed=12345)

# Discrete time, discrete state: simple random walk
steps = rng.choice([-1, 1], size=20)
walk = np.cumsum(steps)
print("Random walk (discrete time, discrete state):")
print(f"  Steps: {steps[:10]}...")
print(f"  Path:  {walk[:10]}...")

# Continuous time, continuous state: Brownian motion (approximation)
dt = 0.01
t = np.arange(0, 1, dt)
dW = rng.normal(0, np.sqrt(dt), size=len(t))
W = np.cumsum(dW)
print(f"\nBrownian motion (continuous time, continuous state):")
print(f"  W(0.5) = {W[50]:.4f}, W(1.0) = {W[-1]:.4f}")
Random walk (discrete time, discrete state):
  Steps: [ 1 -1  1 -1 -1  1  1  1  1 -1]...
  Path:  [ 1  0  1  0 -1  0  1  2  3  2]...

Brownian motion (continuous time, continuous state):
  W(0.5) = 0.6672, W(1.0) = -0.1430

Random Walk

The random walk is the simplest stochastic process and the natural starting point: it accumulates independent random steps over time.

A simple random walk starts at \(S_0 = 0\) and moves up or down by 1 at each step:

\[ S_n = \sum_{i=1}^{n} X_i, \quad X_i = \begin{cases} +1 & \text{with probability } p \\ -1 & \text{with probability } 1 - p \end{cases} \]

When \(p = 0.5\), the walk is symmetric. Its key properties are:

  • \(E[S_n] = n(2p - 1)\): zero drift when symmetric, positive drift when \(p > 0.5\)
  • \(\text{Var}(S_n) = 4np(1-p)\): linear growth in variance
  • The symmetric random walk returns to the origin infinitely often (recurrence in 1D)
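The recurrence claim can be checked by simulation — a minimal sketch (the path count, horizon, and seed are illustrative choices, not from the article): simulate many symmetric walks and count how many revisit the origin at least once.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_paths, n_steps = 1_000, 5_000

# Simulate symmetric walks and check whether each path revisits 0.
# Theory: the probability of no return by step 2n decays like 1/sqrt(pi*n),
# so almost all paths should have returned within 5,000 steps.
steps = rng.choice([-1, 1], size=(n_paths, n_steps))
paths = np.cumsum(steps, axis=1)
returned = np.any(paths == 0, axis=1)

print(f"Fraction of paths returning to 0 within {n_steps} steps: "
      f"{returned.mean():.3f}")
```

The fraction is close to 1 and creeps higher as the horizon grows, consistent with recurrence; no finite simulation can confirm "infinitely often" directly.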

Connection to Finance

The random walk is the simplest model for an asset price. If you let \(S_n\) represent the log-price after \(n\) periods, each increment \(X_i\) is a single-period return. The efficient market hypothesis, in its weak form, says that prices should look like a random walk — past movements carry no information about future direction.
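Mapping the walk to a price path is a one-line transformation: exponentiate the log-price. A minimal sketch, where the 1% step size, 250 periods, and starting price are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Log-price follows a symmetric random walk with 1% steps (illustrative scale)
log_returns = 0.01 * rng.choice([-1, 1], size=250)
log_price = np.cumsum(log_returns)

# Price path: S_n = S_0 * exp(log-price), starting at S_0 = 100
S0 = 100.0
prices = S0 * np.exp(log_price)

print(f"Final price after 250 periods: {prices[-1]:.2f}")
print(f"Total log-return: {log_price[-1]:+.4f}")
```

Working in log-prices keeps prices positive by construction and makes returns additive across periods.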

Figure 2 shows ten sample paths each for symmetric (\(p = 0.5\)) and asymmetric (\(p = 0.55\)) walks.

Figure 2: Random walk: symmetric (left) and asymmetric (right)
Code
import numpy as np

rng = np.random.default_rng(seed=12345)
n_steps = 1000
n_paths = 10_000

# Symmetric walk: E[S_n] = 0, Var(S_n) = n
steps = rng.choice([-1, 1], size=(n_paths, n_steps))
S = np.cumsum(steps, axis=1)
print("Symmetric random walk (p=0.5), n=1000:")
print(f"  E[S_n] = {S[:, -1].mean():.2f} (theory: 0)")
print(f"  Var(S_n) = {S[:, -1].var():.1f} (theory: {n_steps})")

# Asymmetric walk: E[S_n] = n(2p-1), Var(S_n) = 4np(1-p)
p = 0.55
steps_asym = rng.choice([-1, 1], size=(n_paths, n_steps), p=[1-p, p])
S_asym = np.cumsum(steps_asym, axis=1)
print(f"\nAsymmetric random walk (p={p}), n=1000:")
print(f"  E[S_n] = {S_asym[:, -1].mean():.2f} (theory: {n_steps*(2*p-1):.0f})")
print(f"  Var(S_n) = {S_asym[:, -1].var():.1f} (theory: {4*n_steps*p*(1-p):.0f})")
Symmetric random walk (p=0.5), n=1000:
  E[S_n] = -0.35 (theory: 0)
  Var(S_n) = 998.5 (theory: 1000)

Asymmetric random walk (p=0.55), n=1000:
  E[S_n] = 100.44 (theory: 100)
  Var(S_n) = 958.1 (theory: 990)

Markov Chains

A Markov chain is a process where the future depends on the present but not the past. This “memoryless” property makes computation tractable.

A stochastic process \(\{X_n\}\) is a Markov chain if:

\[ P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i) \]

This is the Markov property (memorylessness). All relevant information about the future is captured by the current state.

Application: credit rating transitions. Rating agencies model the movement of a company’s credit rating (AAA, AA, A, BBB, …, default) as a Markov chain. The transition matrix encodes the probability of moving from any rating to any other rating in one year.
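As a sketch of the credit application, consider a hypothetical three-state chain (A, B, Default) with an absorbing default state — the transition probabilities below are made-up illustrative numbers, not real agency figures:

```python
import numpy as np

# Hypothetical one-year rating transition matrix (illustrative numbers).
# States: A, B, D (default). Default is absorbing: P(D -> D) = 1.
P = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.80, 0.10],
    [0.00, 0.00, 1.00],
])

# n-year transition probabilities are the entries of P^n
P5 = np.linalg.matrix_power(P, 5)
print(f"5-year default probability starting from A: {P5[0, 2]:.4f}")
print(f"5-year default probability starting from B: {P5[1, 2]:.4f}")
```

Because default is absorbing, the cumulative default probability is monotonically increasing in the horizon, and a B-rated issuer defaults more often than an A-rated one under this matrix.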

Transition Matrix

For a finite state space \(\{1, 2, \ldots, k\}\), the one-step transition probabilities form a transition matrix \(\mathbf{P}\):

\[ P_{ij} = P(X_{n+1} = j \mid X_n = i), \quad \sum_{j=1}^{k} P_{ij} = 1 \]

The \(n\)-step transition matrix is simply \(\mathbf{P}^n\). To find the probability of being in state \(j\) after \(n\) steps starting from state \(i\), read entry \((i, j)\) of \(\mathbf{P}^n\).

Stationary Distribution

A probability vector \(\boldsymbol{\pi}\) is a stationary distribution if:

\[ \boldsymbol{\pi} = \boldsymbol{\pi} \mathbf{P} \]

Under mild conditions (irreducibility and aperiodicity), the chain converges to \(\boldsymbol{\pi}\) regardless of the starting state. The stationary distribution represents the long-run proportion of time spent in each state.

Example: Weather Model

Consider a three-state weather model (Sunny, Cloudy, Rainy) with transition matrix:

\[ \mathbf{P} = \begin{pmatrix} 0.7 & 0.2 & 0.1 \\ 0.3 & 0.4 & 0.3 \\ 0.2 & 0.3 & 0.5 \end{pmatrix} \]

Figure 3 shows the transition diagram and how the state distribution converges to the stationary distribution over time.

Figure 3: Markov chain: weather model and convergence to stationary distribution
Code
import numpy as np

P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
])
states = ["Sunny", "Cloudy", "Rainy"]

# Stationary distribution: solve pi @ P = pi, sum(pi) = 1
# Equivalent to (P.T - I) @ pi = 0 with constraint sum = 1
A = P.T - np.eye(3)
A[-1] = 1  # replace last equation with sum constraint
b = np.zeros(3)
b[-1] = 1
pi = np.linalg.solve(A, b)

print("Transition matrix P:")
for i, s in enumerate(states):
    print(f"  {s:6s} -> {dict(zip(states, P[i].round(2)))}")

print(f"\nStationary distribution:")
for s, p in zip(states, pi):
    print(f"  {s:6s}: {p:.4f}")

# Verify: starting from any state, P^n converges
Pn = np.linalg.matrix_power(P, 50)
print(f"\nP^50 (rows should be identical = pi):")
for row in Pn:
    print(f"  [{', '.join(f'{x:.4f}' for x in row)}]")
Transition matrix P:
  Sunny  -> {'Sunny': np.float64(0.7), 'Cloudy': np.float64(0.2), 'Rainy': np.float64(0.1)}
  Cloudy -> {'Sunny': np.float64(0.3), 'Cloudy': np.float64(0.4), 'Rainy': np.float64(0.3)}
  Rainy  -> {'Sunny': np.float64(0.2), 'Cloudy': np.float64(0.3), 'Rainy': np.float64(0.5)}

Stationary distribution:
  Sunny : 0.4565
  Cloudy: 0.2826
  Rainy : 0.2609

P^50 (rows should be identical = pi):
  [0.4565, 0.2826, 0.2609]
  [0.4565, 0.2826, 0.2609]
  [0.4565, 0.2826, 0.2609]

Irreducibility and Recurrence

A Markov chain is irreducible if every state can be reached from every other state. A state is recurrent if the chain returns to it with probability 1. For finite, irreducible chains, all states are recurrent and a unique stationary distribution exists.
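Irreducibility of a finite chain can be checked numerically — a sketch using the weather matrix from the example above: for a chain with \(k\) states, \((\mathbf{I} + \mathbf{P})^{k-1}\) has all entries positive if and only if every state can reach every other state.

```python
import numpy as np

P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
])
k = P.shape[0]

# Irreducibility check: (I + P)^(k-1) > 0 entrywise iff every state
# can reach every other state in at most k-1 steps
reach = np.linalg.matrix_power(np.eye(k) + P, k - 1)
print("Irreducible:", bool(np.all(reach > 0)))
```

Here the check is trivially satisfied since every entry of \(\mathbf{P}\) is already positive, which also rules out periodicity.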

Poisson Process

The Poisson process counts the number of random events occurring over time. It is the go-to model whenever events arrive independently at a constant average rate.

Applications: customer arrivals at a service counter, default events in a credit portfolio, insurance claims, radioactive decay.

The process \(\{N(t) : t \ge 0\}\) counts the number of events up to time \(t\) and is defined by three properties:

  1. \(N(0) = 0\)
  2. Independent increments: the numbers of events in non-overlapping intervals are independent
  3. Stationary increments: \(N(t+s) - N(t) \sim \text{Poisson}(\lambda s)\) for rate \(\lambda > 0\)

An equivalent characterization: the inter-arrival times \(T_1, T_2, \ldots\) are iid \(\text{Exponential}(\lambda)\).

Properties

  • \(E[N(t)] = \lambda t\)
  • \(\text{Var}(N(t)) = \lambda t\)
  • The waiting time until the \(n\)-th event follows a Gamma distribution: \(S_n = T_1 + \cdots + T_n \sim \text{Gamma}(n, \lambda)\)
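The Gamma waiting-time property can be verified by simulation — a minimal sketch (rate, event index, and sample size are illustrative): sum \(n\) exponential gaps and compare the sample mean and variance with the Gamma values \(n/\lambda\) and \(n/\lambda^2\).

```python
import numpy as np

rng = np.random.default_rng(seed=42)
lam, n = 3.0, 5
n_sims = 100_000

# Waiting time to the n-th event: sum of n iid Exponential(lam) gaps
S_n = rng.exponential(1/lam, size=(n_sims, n)).sum(axis=1)

print(f"Waiting time to event {n} (lambda={lam}):")
print(f"  mean = {S_n.mean():.4f} (Gamma theory: n/lambda   = {n/lam:.4f})")
print(f"  var  = {S_n.var():.4f} (Gamma theory: n/lambda^2 = {n/lam**2:.4f})")
```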

Figure 4 shows simulated event arrival times and the corresponding count process.

Figure 4: Poisson process: inter-arrival times and count process
Code
import numpy as np

rng = np.random.default_rng(seed=12345)
lam = 3.0  # rate: 3 events per unit time
n_events = 50

# Simulate inter-arrival times
inter_arrivals = rng.exponential(1/lam, size=n_events)
arrival_times = np.cumsum(inter_arrivals)

# Count process: N(t) at selected times
t_grid = np.linspace(0, arrival_times[-1], 100)
N_t = np.searchsorted(arrival_times, t_grid)

print(f"Poisson process (lambda={lam}):")
print(f"  First 5 inter-arrival times: {inter_arrivals[:5].round(3)}")
print(f"  First 5 arrival times: {arrival_times[:5].round(3)}")
print(f"  N(5) = {np.searchsorted(arrival_times, 5)} (E[N(5)] = {lam*5:.0f})")
print(f"  N(10) = {np.searchsorted(arrival_times, 10)} (E[N(10)] = {lam*10:.0f})")
Poisson process (lambda=3.0):
  First 5 inter-arrival times: [0.061 0.215 1.563 0.14  0.17 ]
  First 5 arrival times: [0.061 0.276 1.84  1.979 2.15 ]
  N(5) = 11 (E[N(5)] = 15)
  N(10) = 31 (E[N(10)] = 30)

Brownian Motion

Brownian motion (or the Wiener process) is the continuous-time analogue of the random walk. It is the mathematical backbone of modern financial modeling: the Black–Scholes formula, interest rate models, and credit risk intensity models all build on it.

\(\{W(t) : t \ge 0\}\) is defined by four properties:

  1. \(W(0) = 0\)
  2. Independent increments: \(W(t) - W(s)\) is independent of \(\{W(u) : u \le s\}\) for \(s < t\)
  3. Normal increments: \(W(t) - W(s) \sim N(0, t - s)\)
  4. Continuous paths: \(t \mapsto W(t)\) is continuous with probability 1

Brownian motion arises naturally as the limit of a scaled random walk. If \(S_n\) is a symmetric random walk with step size \(\Delta x = 1/\sqrt{n}\) at time intervals \(\Delta t = 1/n\), then as \(n \to \infty\), the process converges to Brownian motion (Donsker’s theorem).
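Donsker's scaling can be illustrated at a single time point — a sketch (sample sizes and seed are illustrative): the scaled endpoint \(S_n/\sqrt{n}\) of a symmetric walk should be approximately \(N(0, 1)\), matching \(W(1)\).

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n, n_paths = 1_000, 100_000

# Endpoint of a symmetric +/-1 walk: S_n = 2*Binomial(n, 0.5) - n
S = 2 * rng.binomial(n, 0.5, size=n_paths) - n
scaled = S / np.sqrt(n)

print(f"Scaled walk endpoint S_n/sqrt(n), n={n}:")
print(f"  mean = {scaled.mean():.4f} (theory: 0)")
print(f"  var  = {scaled.var():.4f} (theory: 1)")
print(f"  P(|Z| <= 1.96) = {(np.abs(scaled) <= 1.96).mean():.4f} (Normal: 0.95)")
```

The binomial identity avoids storing the full step matrix; only the endpoint distribution is needed for this check.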

Properties

  • \(E[W(t)] = 0\)
  • \(\text{Var}(W(t)) = t\)
  • \(\text{Cov}(W(s), W(t)) = \min(s, t)\)
  • Paths are continuous everywhere but differentiable nowhere

Figure 5 shows five sample paths of Brownian motion.

Figure 5: Brownian motion: sample paths

Code
import numpy as np

rng = np.random.default_rng(seed=12345)
T = 1.0
n_steps = 1000
dt = T / n_steps
n_paths = 5

# Simulate Brownian motion
dW = rng.normal(0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)
W = np.hstack([np.zeros((n_paths, 1)), W])
t = np.linspace(0, T, n_steps + 1)

print("Brownian motion properties (5 paths, T=1):")
print(f"  E[W(1)] = {W[:, -1].mean():.4f} (theory: 0)")
print(f"  Var[W(1)] = {W[:, -1].var():.4f} (theory: 1)")
print(f"  W(0.5) covariance check: Cov(W(0.5), W(1)) = "
      f"{np.cov(W[:, n_steps//2], W[:, -1])[0,1]:.4f} (theory: 0.5)")
Brownian motion properties (5 paths, T=1):
  E[W(1)] = 0.0100 (theory: 0)
  Var[W(1)] = 0.5427 (theory: 1)
  W(0.5) covariance check: Cov(W(0.5), W(1)) = 0.6211 (theory: 0.5)

For the rigorous construction, properties of Brownian motion as a martingale, and its role in the Black–Scholes framework, see the financial engineering series.

Summary: The Probability Foundations Series

This article surveyed four fundamental stochastic processes:

| Process         | Time       | State      | Key property                             | Application                   |
|-----------------|------------|------------|------------------------------------------|-------------------------------|
| Random walk     | Discrete   | Discrete   | Cumulative independent steps             | Asset price modeling          |
| Markov chain    | Discrete   | Discrete   | Memorylessness; stationary distribution  | Credit rating transitions     |
| Poisson process | Continuous | Discrete   | Independent, stationary increments       | Default arrival times         |
| Brownian motion | Continuous | Continuous | Limit of random walk; continuous paths   | Stock prices, interest rates  |

Looking Back: The Full Journey

This article concludes the probability foundations series. Here is how the six articles connect:

  1. Probability Spaces gave us the axiomatic foundation \((\Omega, \mathcal{F}, P)\) — the rulebook for any probabilistic model
  2. Random Variables introduced the bridge from abstract outcomes to numbers, enabling computation
  3. Expectation and Variance provided the summary statistics that compress distributions into actionable quantities
  4. Conditional Probability showed how to update beliefs with new information — the mechanism behind Bayesian reasoning and every filtering algorithm
  5. LLN and CLT explained why sampling works and why the Normal distribution is everywhere
  6. Stochastic Processes (this article) extended everything to the time domain, opening the door to dynamic modeling

Each article built on the previous one: you cannot define a random variable without a probability space, cannot compute an expectation without a random variable, cannot do Bayesian updating without conditional probability, and cannot justify Monte Carlo without the LLN. Stochastic processes tie it all together by adding the dimension of time.

Looking Ahead

  • Next series: Estimation introduces the tools for learning from data — point estimation, confidence intervals, and hypothesis testing
  • Financial engineering: 300-financial-engineering/300-stochastic-processes/ develops the continuous-time theory (Ito calculus, geometric Brownian motion, stochastic differential equations) needed for derivatives pricing
  • Applications in credit risk: Markov chains model credit rating transitions, Poisson processes model default arrival times, and Brownian motion underlies the structural models of default. These connections are explored in credit-risk-parameters/

References

Casella, George, and Roger L. Berger. 2002. Statistical Inference. 2nd ed. Cengage Learning.
Ross, Sheldon M. 2014. Introduction to Probability Models. 11th ed. Academic Press.
Shreve, Steven E. 2004. Stochastic Calculus for Finance I & II. Springer. https://doi.org/10.1007/978-0-387-22527-2.