Numerical Methods

Lecture 6: Monte Carlo simulations

by: Tomasz Romańczukiewicz


Outline

  • Random number generators
  • Monte Carlo integration
  • Stochastic processes
  • Stochastic differential equations (SDE)
In [28]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  
plt.rcParams['figure.dpi'] = 120
plt.rcParams['figure.figsize'] = [8,4]
plt.rcParams['axes.grid'] = True
#plt.rcParams.keys()

Monte Carlo simulations:

  • Method of calculation - convergence is usually slow, $N^{-1/2}$, but acceptable for multidimensional problems
    • integration
    • minimization (annealing, differential evolution, etc.)
    • quantum calculations (path integrals)
  • Simulations of real-life processes
    • Brownian motion,
    • finance

Hello world


Task: find the area of a quarter circle: $r = 1$, $x>0$, $y>0$.

Analytically:

$$\int_0^1 \sqrt{1-x^2}\,dx=\frac{\pi}{4}$$
In [2]:
from scipy.integrate import quad
result = quad(lambda x: np.sqrt(1-x**2), 0, 1, full_output=1) 
print("Area  : ", result[0])
print("error :", result[0]-np.pi/4)
print("number of evaluations: ", result[2]['neval'])
#quad_explain()
Area  :  0.7853981633974481
error : -2.220446049250313e-16
number of evaluations:  231

Monte Carlo approach:

  1. randomly choose a point inside the square $0<x<1$, $0<y<1$
  2. check if the point is inside the circle: $x^2+y^2<1$ (don't calculate the square root)
  3. repeat, counting the number of points inside the circle
  4. the area is approximately equal to the ratio of the number of points inside the circle to the total number of points
In [3]:
Nsamples, k = 10000, 0
for n in range(Nsamples):
    x = np.random.random(2) # two random numbers
    if np.sum(x**2)<1: k+=1

my_result = k/Nsamples
print("My result:",  my_result)
print("My error :",  my_result-np.pi/4)
My result: 0.7907
My error : 0.0053018366025516794

Exercise:

  • plot the error as a function of Nsamples
  • optimize using numpy vectorization (choose all numbers at once etc.) - a minimal sketch follows below
  • calculate the volume of a unit ball in 100 dimensions (how many evaluations would quad need when performing the integration over all dimensions?)
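
One possible vectorized version of the quarter-circle estimate (a minimal sketch; the sample count is arbitrary):

# vectorized MC estimate of π/4: draw all the points at once, no Python loop
Nsamples = 100000
pts = np.random.random((Nsamples, 2))    # Nsamples points in the unit square
inside = np.sum(pts**2, axis=1) < 1      # mask: point inside the quarter circle
estimate = np.count_nonzero(inside)/Nsamples
print(estimate, estimate - np.pi/4)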

Pseudo-random number generators

  • A computer (not a quantum one) is a deterministic machine, giving predictable results
  • Random numbers are unpredictable by definition
  • There are, however, deterministic systems which behave in a quite unpredictable way (chaotic systems)
  • But there is usually a large correlation $y_{n+1}=F(y_n, y_{n-1},\ldots, t), y_n\in \mathbb{R}^N$
  • Sometimes hidden variables are used
  • Good algorithms are very difficult to find
  • How to test? Is the sequence 1, 2, 3, 4, 5, 6, 7, 8 random?
  • Histograms (bins) + $\chi^2$ test (a quick test example follows the generator plots below)
  • Correlations $x_{n}=F(x_{n-1}, x_{n-2},\ldots)$
  • Lehmer random number generator ${\displaystyle X_{n+1}=a\cdot X_{n}{\bmod{m}}}$ (use with caution)
    • RANDU: $a=65539$, $m=2^{31}$, period $2^{29}$ (famously bad)
    • Park and Miller’s “minimal standard” MINSTD: $a = 16807$ and $m = 2^{31} - 1 = 2147483647$.
  • Linear congruential generator (LCG) ${\displaystyle X_{n+1}=\left(aX_{n}+c\right){\bmod {m}}}$
  • L’Ecuyer's combined multiple recursive generator, $z_n = (x_n - y_n) \bmod m_1$, where

    $$\begin{align}x_n & = (a_1 x_{n-1} + a_2 x_{n-2} + a_3 x_{n-3}) \mod m_1 \\y_n & = (b_1 y_{n-1} + b_2 y_{n-2} + b_3 y_{n-3}) \mod m_2\end{align}$$

  • Bit generators (maximally equidistributed combined Tausworthe generator) $x_n = s^1_n \oplus s^2_n \oplus s^3_n$, where $\oplus$ denotes binary XOR (a direct transcription into Python follows this list)
$$\begin{align} s^1_{n+1} &= (((s^1_n \& 4294967294)\ll 12) \oplus (((s^1_n\ll 13) \oplus s^1_n)\gg 19)) \\s^2_{n+1} &= (((s^2_n \& 4294967288)\ll 4) \oplus (((s^2_n\ll 2) \oplus s^2_n)\gg 25)) \\s^3_{n+1} &= (((s^3_n \& 4294967280)\ll 17) \oplus (((s^3_n\ll 3) \oplus s^3_n)\gg 11)) \end{align}$$
  • lagged Fibonacci $S_n \equiv S_{n-j} \oplus S_{n-k} \pmod{m}, 0 < j < k$
  • kernel-built generator /dev/random
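
A direct transcription of the three Tausworthe recurrences above into Python (a minimal sketch; the masks emulate 32-bit unsigned arithmetic and the seed values are arbitrary admissible ones):

def taus88(s1=12345, s2=67890, s3=13579, n=5):
    """Combined Tausworthe generator transcribed from the recurrences above."""
    M = 0xFFFFFFFF                           # emulate 32-bit unsigned arithmetic
    for _ in range(n):
        b  = (((s1 << 13) & M) ^ s1) >> 19
        s1 = (((s1 & 4294967294) << 12) & M) ^ b
        b  = (((s2 <<  2) & M) ^ s2) >> 25
        s2 = (((s2 & 4294967288) <<  4) & M) ^ b
        b  = (((s3 <<  3) & M) ^ s3) >> 11
        s3 = (((s3 & 4294967280) << 17) & M) ^ b
        yield (s1 ^ s2 ^ s3) / 4294967296.0  # scale the 32-bit word to [0, 1)

print(list(taus88()))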
In [4]:
def show_generators():
    N = 10000
    x1 = np.empty(N+1)
    x2 = np.empty(N+1)
    
    a, m1, x1[0] = 65539, 1<<25, 126536  # bad LCG
    for n in range(N):
        x1[n+1] = (a*x1[n])%m1

    m2, x2[0] = 1<<31, 126536    # OK LCG
    for n in range(N):
        x2[n+1] = (a*x2[n])%m2
    x = np.random.random(N+1)    # standard generator    
    plt.figure(figsize=(10, 3))    
    plt.subplot(1, 3, 1);    plt.xlabel('$x_{n}$');   plt.ylabel('$x_{n+1}$')
    plt.title('Bad LCG')
    plt.plot(x1[0:-1]/m1, x1[1:]/m1, 'r.', markersize=0.5)    
    plt.subplot(1, 3, 2); plt.xlabel('$x_{n}$')
    plt.title('OK LCG')
    plt.plot(x2[0:-1]/m2, x2[1:]/m2, 'g.', markersize=0.5)
    plt.subplot(1, 3, 3); plt.xlabel('$x_{n}$')
    plt.title('Default generator')
    plt.plot(x[0:-1], x[1:], 'b.', markersize=0.5)
    plt.show()
In [29]:
show_generators()
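
A quick version of the histogram + $\chi^2$ test mentioned above, applied to numpy's default generator (a sketch; the bin count and sample size are arbitrary choices):

from scipy.stats import chisquare
samples = np.random.random(100000)
counts, _ = np.histogram(samples, bins=20, range=(0, 1))  # observed bin counts
stat, p = chisquare(counts)                  # H0: every bin is equally likely
print(f"chi^2 = {stat:.1f}, p-value = {p:.3f}")  # a small p-value would signal non-uniformity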

Nonuniform distributions

Transformation method

Let $x$ be a random number with a uniform distribution on $[0, 1)$.
We want a random number $y$ with a distribution density given by a function $p(y)$.
That means that

$$p(y)dy = dx\,, \;\;\Rightarrow \frac{dx}{dy}=p(y)$$

$$y(x)=F^{-1}(x)\,, \;\;\textrm{where}\;\;F(y)=\int_{0}^yp(y')\,dy'$$

Seems difficult, doesn't it?

Example: generate exponential distribution

$$p(y)=e^{-y}$$

$$F(y)=\int_{0}^yp(y')\,dy'=1-e^{-y}=x(y)$$

so the transformation is

$$y(x) = -\log(1-x)$$
In [6]:
def distribution_example(N=5000):
    x = np.random.random(N)
    y = -np.log(1-x)
    plt.hist(y, bins=np.linspace(0, 6, 50), density=True, label='generated dist.')
    z = np.linspace(0, 6, 1000)
    plt.plot(z, np.exp(-z), 'black', label='desired dist.')
    plt.xlabel('$y$', fontsize=16)
    plt.ylabel('$p(y)$', fontsize=16)
    plt.legend()
    plt.show()
In [30]:
distribution_example()

Normal distribution: the Box-Muller method

$$p(y)dy=\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy$$

We generate two numbers $(x_1, x_2) \in(0,1)^2$ and make the transform

$$y_1=\sqrt{-2\log x_1}\cos(2\pi x_2)$$

$$y_2=\sqrt{-2\log x_1}\sin(2\pi x_2)$$

then

$$p(y_1,y_2)dy_1dy_2=p(x_1, x_2)dx_1dx_2$$

But

$$dx_1dx_2=\left|\frac{\partial(x_1, x_2)}{\partial(y_1, y_2)}\right|dy_1dy_2$$

The Jacobian is

$$Jac = -\left(\frac{1}{\sqrt{2\pi}}e^{-y_1^2/2}\right)\left(\frac{1}{\sqrt{2\pi}}e^{-y_2^2/2}\right)$$

Therefore $y_1$ and $y_2$ are independent and have normal distributions.
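
A minimal implementation of the transformation above (a sketch; the sample size and histogram range are arbitrary):

def box_muller(N=5000):
    x1 = 1.0 - np.random.random(N)   # shift to (0, 1] so the logarithm stays finite
    x2 = np.random.random(N)
    r = np.sqrt(-2*np.log(x1))       # radial part
    return r*np.cos(2*np.pi*x2), r*np.sin(2*np.pi*x2)

y1, y2 = box_muller()
plt.hist(y1, bins=50, density=True, label='generated dist.')
z = np.linspace(-4, 4, 500)
plt.plot(z, np.exp(-z**2/2)/np.sqrt(2*np.pi), 'black', label='desired dist.')
plt.legend()
plt.show()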

In [32]:
def rejection_plot():
    x = np.linspace(0,1, 1000)
    y = x**4*np.sin(np.pi*x) 
    #plt.figure(figsize=(5, 3))     
    fig, ax = plt.subplots()
    ax.fill_between(x, 0, y, facecolor='green', alpha=0.2)
    ax.plot(x, y, 'black', label='$p(y)$')
    plt.ylabel('$x$', fontsize=16)
    plt.xlabel('$y$', fontsize=16)
    X = [0.1, 0.3, 0.6, 0.8]
    Y = [0.2, 0.12, 0.1, 0.15]
    ax.plot(X[0:2], Y[0:2], 'ro', fillstyle='none', label='rejected')
    ax.plot(X[2:], Y[2:], 'bo', label='accepted')
    plt.legend()
    plt.show()

Rejection method

  • generate a pair $y, x$ from uniform distributions: $y$ over the support of $p$, $x$ over $[0, \max p)$
  • reject if $x>p(y)$
  • accept if $x<p(y)$ (a sampling sketch follows the illustration below)
In [33]:
rejection_plot()
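
A minimal rejection sampler for the exponential density $p(y)=e^{-y}$ truncated to $[0, 6)$ (a sketch under these assumptions; since $\max p = 1$ on this interval, the heights $x$ can be drawn from $[0, 1)$ directly):

def rejection_exponential(N=5000, ymax=6.0):
    ys = ymax*np.random.random(10*N)   # candidate positions y; about 17% survive on average
    xs = np.random.random(10*N)        # uniform heights x in [0, max p)
    return ys[xs < np.exp(-ys)][:N]    # accept points lying under the curve

y = rejection_exponential()
plt.hist(y, bins=50, density=True, label='generated dist.')
z = np.linspace(0, 6, 500)
plt.plot(z, np.exp(-z)/(1 - np.exp(-6)), 'black', label='desired dist.')  # truncated-density normalization
plt.legend()
plt.show()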

Some distributions provided by numpy.random

  • beta
  • binomial
  • chi-square
  • exponential
  • logistic
  • multinomial
  • hypergeometric
  • normal (Gaussian)
  • uniform

The scipy.stats module or the standard-library random module are also often used.

Brownian motion

Let us denote the probability that a particle moves from $x$ to $x-y$ in time $\tau$ by $\varphi(y)dy$, with the normalization $\int_{-\infty}^\infty\varphi(y)\,dy=1$. Then the whole distribution $\rho(x, t)$ is governed by the convolution

$$\rho(x, t+\tau) = \int_{-\infty}^\infty\rho(x-y,t)\,\varphi(y)dy$$

for sufficiently short times $\tau\ll1$ and if $\varphi(-y)=\varphi(y)$:

$$\rho(x, t+\tau) = \rho(x, t) + \frac{\partial\rho}{\partial t}\tau = \rho(x, t) + \frac{\partial^2\rho}{\partial x^2}\int_{-\infty}^\infty\frac{y^2\varphi(y)}{2}dy + \text{higher moments}$$

We end up with diffusion equation:

$$\frac{\partial\rho}{\partial t} = D \frac{\partial^2\rho}{\partial x^2}$$

where

$$D = \int_{-\infty}^\infty\frac{y^2\varphi(y)}{2 \tau}dy=\text{const}$$

if higher moments are kept:

$$\frac{\partial\rho}{\partial t} = D_2 \frac{\partial^2\rho}{\partial x^2}+D_4 \frac{\partial^4\rho}{\partial x^4}+\cdots$$

Fundamental solution of the diffusion equation ($D_2=D$, $D_{n>2}=0$), assuming $\rho(x, 0) = \delta(x)$:

$$\rho(x,t)={\frac {1}{\sqrt {4\pi Dt}}}\exp \left(-{\frac {x^{2}}{4Dt}}\right)$$

or, more generally,

$$\rho(x,t)={\frac {1}{\sqrt {4\pi Dt}}}\int_{-\infty}^{\infty}\exp \left(-{\frac {(x-s)^{2}}{4Dt}}\right)\rho(s,0)\,ds.$$

Note: diffusion can only be evolved forward in time, $t>0$

Central limit theorem

Let ${X_1, X_2,\ldots,X_n}$ be a random sample of independent and identically distributed random variables such that $\operatorname {E} [X_i]=\mu$ and $\operatorname {var}[X_i]=\sigma^2$. Then the sequence

$$S_n=\frac{X_1+X_2+\cdots+X_n}{n}$$

tends to a normal distribution with $\operatorname {E} [S_n]=\mu$ and $\displaystyle \operatorname {var}[S_n]=\frac{\sigma^2}{n}$

If $x, y$ both have the distribution $\rho_1$, then the distribution of the sum $s=x+y$ is a convolution of two identical distributions:

$$\rho_2(s) = \int\rho_1(s-y)\rho_1(y)\,dy=(\rho_1\circ\rho_1)(s)$$

For simplicity let us assume that $\operatorname {E} [x]=\operatorname {E} [y]=0$. We can construct a series of $\rho_{n+1}=\rho_n\circ\rho_n$

In [10]:
def show_accumulative_distro(x, ρ):
    N = x.size
    ρ /= np.sum(ρ)*(x[-1]-x[0])/(ρ.size-1)
    plt.figure(figsize=(12, 8), dpi=100)
    plt.subplot(3, 3, 1); plt.ylabel(r'$\rho_n(x)$',  usetex=True, fontsize=16)
    plt.plot(x, ρ, label='initial')
    plt.legend()
    for n in range (8):       
        ρ = np.convolve(ρ,ρ) * (x[-1]-x[0]) / (ρ.size-1) # normalization 
        ρ = ρ[0::2] 
        x = np.linspace(2*x[0], 2*x[-1], ρ.size)
        plt.subplot(3, 3, n+2)          
        if n%3==2: plt.ylabel(r'$\rho_n(x)$',  usetex=True, fontsize=16)
        if n>=5: plt.xlabel('$x$',  usetex=True, fontsize=16)
        plt.plot(x, ρ, 'C'+str(n+1), label=f'n={n+1}')
        plt.legend()
    plt.show()
In [34]:
x = np.linspace(-1, 1, 1000)
#ρ = np.float64(np.abs(x-0.8)<0.1) + np.float64(np.abs(x+0.8)<0.1) 
ρ = x**4
show_accumulative_distro(x, ρ)    
In [12]:
from scipy.stats import norm
def brownian_motion(σ=2, T=10, N=1000):
    t = np.linspace(0, T, N)
    dt = t[1]-t[0]

    r = norm.rvs(size=t.shape, scale=σ*np.sqrt(dt))  # increment std is σ√dt, i.e. variance σ²dt
    x = np.cumsum(r)  # used cumulative sum instead of loop
    plt.plot(t, x)
    plt.xlabel('$t$',  usetex=True, fontsize=16)
    plt.ylabel('$x$',  usetex=True, fontsize=16)
    plt.show()
In [35]:
brownian_motion()
In [14]:
def brownian(x0, n, dt, delta, out=None):
    """    Example from https://scipy-cookbook.readthedocs.io/items/BrownianMotion.html     """ 
    x0 = np.asarray(x0)

    r = norm.rvs(size=x0.shape + (n,), scale=delta*np.sqrt(dt))
    if out is None:
        out = np.empty(r.shape)
    np.cumsum(r, axis=-1, out=out)
    out += np.expand_dims(x0, axis=-1)

    return out
In [15]:
def example_multi_brownian(σ=2, T=10, N=1000, m=20, plotting=False):
    dt = T/N
    x = np.empty((m,N+1))
    x[:, 0] = 0

    brownian(x[:,0], N, dt, σ, out=x[:,1:])
    if plotting:
        t = np.linspace(0.0, N*dt, N+1)
        for k in range(m):
            plt.plot(t, x[k])
        plt.xlabel('$t$',  usetex=True, fontsize=16)
        plt.ylabel('$x$',  usetex=True, fontsize=16)
        plt.show()
    return x
In [36]:
X = example_multi_brownian(σ=2, T=10, N=1000, m=20, plotting=True)
In [17]:
def compare_diff_brownian(σ=2, T=1, N=101, m=1000):
    X = example_multi_brownian(σ, T, N, m, plotting=False)
    plt.figure(figsize=(12, 4), dpi=100)
    for n in range(6):
        plt.subplot(2,3,n+1)
        x = np.linspace(-5,5,1000)
        k = 15*(n+1)
        t = k*T/(N-1)
        plt.hist(X[:, k], bins=np.linspace(-5, 5, 50), density=True, color='C'+str(2*n))
        Dt = 2*t   # D·t with D = σ²/2 = 2 for σ = 2
        g = np.exp(-x**2/(4*Dt))/np.sqrt(4*np.pi*Dt)
        plt.plot(x, g, 'black', label=f"t={t}")
        plt.legend()
        if n>=3: plt.xlabel('$x$',  usetex=True, fontsize=16)
        if n%3==0: plt.ylabel(r'$\rho(x)$',  usetex=True, fontsize=16)

    plt.show()
In [37]:
compare_diff_brownian()

Example of real life distributions

In [19]:
import pandas as pd
import numpy as np
from functools import reduce
import pandas_datareader.data as web
import datetime
In [20]:
start, end = datetime.datetime(2009, 12, 30), datetime.datetime(2019, 11, 20)

tickers = ["^DJI", "^IXIC", "^GSPC", "^STOXX50E", "^N225", "^GDAXI"]

asset_universe = pd.DataFrame([web.DataReader(ticker, 'yahoo', start, 
                     end).loc[:, 'Adj Close'] for ticker in tickers],
                     index=tickers).T.fillna(method='ffill')

asset_universe = asset_universe/asset_universe.iloc[0, :]
In [25]:
def show_assets(n):       
    data = asset_universe[tickers[n]]
    df = np.diff(data)
    var = np.var(df)
    mean = np.mean(df)
    print(f'analysis for {tickers[n]}: {mean:.6f} ± {np.sqrt(var):.6f}') 
    
    plt.figure(figsize=(10, 6), dpi=100)
    plt.subplot(2, 2, (1,2))
    plt.plot(data, 'r')    
    plt.subplot(2, 2, 3)
    plt.plot(range(df.size), df, 'g')
    plt.subplot(2, 2, 4)
    
    plt.hist(df, bins=np.linspace(-0.05, 0.05, 50), density=True, color='C1')
    t = np.linspace(-0.05, 0.05, 500)
    y = 1/np.sqrt(2*np.pi*var)*np.exp(-(t-mean)**2/(2*var))
    plt.plot(t, y, 'black')
    plt.show()
In [38]:
show_assets(0)
analysis for ^DJI: 0.000636 ± 0.014427
In [39]:
show_assets(1)
analysis for ^IXIC: 0.001057 ± 0.022163
In [40]:
show_assets(3)
analysis for ^STOXX50E: 0.000094 ± 0.011880

Conclusion:

  • The distributions are very rarely Gaussian.
  • Why?
    • different time scales (the Central Limit Theorem requires independent, identically distributed variables)
    • higher moments are still present
    • dependence on other assets
    • nonlinear dependence

Exercise:

Generate Brownian motion with one of the distributions obtained from real data.

Stochastic differential equations

Many processes can be modeled with an equation
(physical notation): $$dy = a(y,t)dt + b(y,t)dW$$

Financial notation

$$dX_t = a(t, X_t)dt + b(t, X_t)dW_t$$

where $dW$ describes a Wiener process (Brownian motion)

The solution can be formally written as

$$X_t=X_0 + \int_0^ta(s,X_s)\,ds+\int_0^tb(s,X_s)\,dW_s$$

For example, the Langevin equation (drift with noise)

$$dy = -\frac{(y-\mu)}{\tau} dt + \sigma \sqrt{\frac{2}{\tau}} dW$$

or the Black-Scholes model

$$dS_t=S_t\mu\,dt + S_t\sigma\, dW_t$$

or, more generally, a mean-reverting model (the Cox-Ingersoll-Ross model):

$$dX_t=a(b-X_t)\,dt +c\sqrt{X_t}\,dW_t$$

An ordinary integral can be interpreted as a sum

$$\int_0^Ta_t\,dt\approx \sum_{n=0}^{N-1} a_{t_n}(t_{n+1}-t_{n})$$

The stochastic one:

$$\int_0^Tb_t\,dW_t\approx \sum_{n=0}^{N-1} b_{t_n}(W_{t_{n+1}}-W_{t_{n}})$$

Euler-Maruyama scheme

$$X_{n+1}=X_n+a(t_n, X_n)\delta t +b(t_n, X_n)\delta W_n$$
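
A minimal Euler-Maruyama integrator for the Langevin equation above (a sketch; the parameter values are illustrative):

def euler_maruyama_langevin(y0=0.0, μ=1.0, τ=2.0, σ=0.5, T=10.0, N=1000):
    dt = T/N
    t = np.linspace(0, T, N+1)
    y = np.empty(N+1)
    y[0] = y0
    dW = np.random.normal(0.0, np.sqrt(dt), N)   # Wiener increments δW_n ~ N(0, δt)
    for n in range(N):
        y[n+1] = y[n] - (y[n] - μ)/τ*dt + σ*np.sqrt(2/τ)*dW[n]
    return t, y

t, y = euler_maruyama_langevin()
plt.plot(t, y)
plt.xlabel('$t$'); plt.ylabel('$y$')
plt.show()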

The method is (strongly) convergent if

$$\lim_{\delta t \to 0}\operatorname {E}\left(\left|X_T-X^{\delta t}_T\right|\right)=0.$$

The method is weakly convergent if

$$\lim_{\delta t \to 0}\left|\operatorname {E}Q(X_T)-\operatorname {E}Q(X_T^{\delta t})\right|=0$$

for every polynomial $Q$

The numerical scheme is strongly convergent with order $\gamma$ if

$$\operatorname{E}\left(\left|X_T-X^{\delta t}_T\right|\right)\leq c_T(\delta t)^\gamma$$

and weakly convergent with order $\gamma$ if

$$\left|\operatorname {E}Q(X_T)-\operatorname {E}Q(X_T^{\delta t})\right|\leq c_T(\delta t)^\gamma$$

The Euler-Maruyama scheme is strongly convergent with order $1/2$ and weakly convergent with order $1$ for sufficiently smooth functions (continuous $a^{(4)}$ and $b^{(4)}$)

Weak convergence matters when only the distribution at a specific time is needed.
Strong convergence matters for the path as a whole.

Milstein scheme (both orders 1)

$$X_{n+1}=X_n+a(X_n)\delta t+b(X_n)\delta W_n+\frac{1}{2}b'(X_n)b(X_n)\left(\delta W_n^2-\delta t \right)$$
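
A sketch comparing the two schemes on the Black-Scholes model, where $b(X)=\sigma X$ gives $b'(X)b(X)=\sigma^2 X$ and the exact endpoint $S_T=S_0\exp\left((\mu-\sigma^2/2)T+\sigma W_T\right)$ is known (parameter values are illustrative):

def strong_errors(S0=1.0, μ=0.5, σ=0.8, T=1.0, N=200, paths=2000):
    dt = T/N
    dW = np.random.normal(0.0, np.sqrt(dt), (paths, N))
    Se = np.full(paths, S0)   # Euler-Maruyama
    Sm = np.full(paths, S0)   # Milstein
    for n in range(N):
        Se = Se + μ*Se*dt + σ*Se*dW[:, n]
        Sm = Sm + μ*Sm*dt + σ*Sm*dW[:, n] + 0.5*σ**2*Sm*(dW[:, n]**2 - dt)
    exact = S0*np.exp((μ - σ**2/2)*T + σ*np.sum(dW, axis=1))   # exact endpoints on the same Brownian paths
    return np.mean(np.abs(exact - Se)), np.mean(np.abs(exact - Sm))

print(strong_errors())   # the Milstein error should be noticeably smaller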

Other methods:

  • Stratonovich Heun algorithm
  • order 1.0 strong Stochastic Runge-Kutta algorithm SRI2
  • the Kloeden and Platen two-step implicit order 1.0 strong algorithm

Tasks:

  1. Implement the MC method for calculating $\pi/4$ - plot the error as a function of Nsamples
    • optimize using numpy vectorization (choose all numbers at once etc.)
    • calculate the volume of a unit ball in 100 dimensions (how many evaluations would quad need when performing the integration over all dimensions?)
  2. Implement Box-Muller method and generate the gaussian distribution
  3. Implement rejection method for gaussian or exponential distribution and compare with another method
  4. Generate a Brownian motion with a drift
  5. Generate Brownian motion with a distribution obtained from the real data
  6. Solve the equation $$dx = -(a + x b^2)(1 - x^2)dt + b(1 - x^2)dW$$ where $a=1.0, b=0.8$ are parameters.
     Check the convergence order.