import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.dpi'] = 120
plt.rcParams['figure.figsize'] = [8,4]
plt.rcParams['axes.grid'] = True
#plt.rcParams.keys()
Task: find an area of a qarter circe $r = 1$, and $x>0, y>0$.
Analytically:
$$\int_0^1 \sqrt{1-x^2}\,dx=\frac{\pi}{4}$$from scipy.integrate import *
result = quad(lambda x: np.sqrt(1-x**2), 0, 1, full_output=1)
print("Area : ", result[0])
print("error :", result[0]-np.pi/4)
print("number of evaluations: ", result[2]['neval'])
#quad_explain()
Area : 0.7853981633974481 error : -2.220446049250313e-16 number of evaluations: 231
Monte Carlo approach:
Nsamples, k = 10000, 0
for n in range(Nsamples):
x = np.random.random(2) # two random numbers
if np.sum(x**2)<1: k+=1
my_result = k/Nsamples
print("My result:", my_result)
print("My error :", my_result-np.pi/4)
My result: 0.7907 My error : 0.0053018366025516794
L’Ecuyer's combined multiple recursive generator by $z_n = (x_n - y_n) \mod m_1$ where
$$\begin{align}x_n & = (a_1 x_{n-1} + a_2 x_{n-2} + a_3 x_{n-3}) \mod m_1 \\y_n & = (b_1 y_{n-1} + b_2 y_{n-2} + b_3 y_{n-3}) \mod m_2\end{align}$$
def show_generators():
N = 10000
x1 = np.empty(N+1)
x2 = np.empty(N+1)
a, m1, x1[0] = 65539, 1<<25, 126536 # bad LCG
for n in range(N):
x1[n+1] = (a*x1[n])%m1
m2, x2[0] = 1<<31, 126536 # OK LCG
for n in range(N):
x2[n+1] = (a*x2[n])%m2
x = np.random.random(N+1) # standard generator
plt.figure(figsize=(10, 3))
plt.subplot(1, 3, 1); plt.xlabel('$x_{n}$'); plt.ylabel('$x_{n+1}$')
plt.title('Bad LCG')
plt.plot(x1[0:-1]/m1, x1[1:]/m1, 'r.', markersize=0.5)
plt.subplot(1, 3, 2); plt.xlabel('$x_{n}$')
plt.title('OK LCG')
plt.plot(x2[0:-1]/m2, x2[1:]/m2, 'g.', markersize=0.5)
plt.subplot(1, 3, 3); plt.xlabel('$x_{n}$')
plt.title('Default generator')
plt.plot(x[0:-1], x[1:], 'b.', markersize=0.5)
plt.show()
show_generators()
Let $x$ be a random number with uniform distribution $[0, 1)$.
We want to have a random number $y$ with a distribution density given by a function $p(y)$
That means that
Seems difficult, doesn't it?
so the transformation is
$$y(x) = -\log(1-x)$$def distribution_example(N=5000):
x = np.random.random(N)
y = -np.log(1-x)
plt.hist(y, bins=np.linspace(0, 6, 50), density='True', label='generated dist.')
z = np.linspace(0, 6, 1000)
plt.plot(z, np.exp(-z), 'black', label='desired dist.')
plt.xlabel('$y$', fontsize=16)
plt.ylabel('$p(y)$', fontsize=16)
plt.legend()
plt.show()
distribution_example()
We generate two numbers $x_1, x_2 \in(0,1)^2$and make the transform
$$y_1=\sqrt{-2\log x_1}\cos(2\pi x_2)$$$$y_2=\sqrt{-2\log x_1}\sin(2\pi x_2)$$then
$$p(y_1,y_2)dy_1dy_2=p(x_1, x_2)dx_1dx_2$$But
$$dx_1dx_2=\left|\frac{\partial(x_1, x_2)}{\partial(y_1, y_2)}\right|dy_1dy_2$$The Jacobian is
$$Jac = -\left(\frac{1}{\sqrt{2\pi}}e^{-y_1^2/2}\right)\left(\frac{1}{\sqrt{2\pi}}e^{-y_2^2/2}\right)$$Therefore $y_1$ ad $y_2$ are independent and have normal distributions.
def rejection_plot():
x = np.linspace(0,1, 1000)
y = x**4*np.sin(np.pi*x)
#plt.figure(figsize=(5, 3))
fig, ax = plt.subplots()
ax.fill_between(x, 0, y, facecolor='green', alpha=0.2)
ax.plot(x, y, 'black', label='$p(y)$')
plt.ylabel('$x$', fontsize=16)
plt.xlabel('$y$', fontsize=16)
X = [0.1, 0.3, 0.6, 0.8]
Y = [0.2, 0.12, 0.1, 0.15]
ax.plot(X[0:2], Y[0:2], 'ro', fillstyle='none', label='rejected')
ax.plot(X[2:], Y[2:], 'bo', label='accepted')
plt.legend()
plt.show()
rejection_plot()
Some distributions provided by numpy.random
Often used also scipy.stats or random.random
Let us denote a probability that a particle would move from $x$ to $x-y$ in time $\tau$ by $\varphi(y)dy$ Then the whole distribution $\rho(x, t)$ is governed by a convolution
$$\rho(x, t+\tau) = \rho(x, t) + \int_{-\infty}^\infty\rho(x-y,t)\,\varphi(y)dy$$for sufficiently short times $\tau\ll1$ and if $\varphi(-y)=\varphi(y)$:
$$\rho(x, t+\tau) = \rho(x, t) + \frac{\partial\rho}{\partial t}\tau = \rho(x, t) + \frac{\partial^2\rho}{\partial x^2}\int_{-\infty}^\infty\frac{y^2\varphi(y)}{2}dy + \text{higher moments}$$We end up with diffusion equation:
$$\frac{\partial\rho}{\partial t} = D \frac{\partial^2\rho}{\partial x^2}$$where
$$D = \int_{-\infty}^\infty\frac{y^2\varphi(y)}{2 \tau}dy=\text{const}$$if higgher moments are used:
$$\frac{\partial\rho}{\partial t} = D_2 \frac{\partial^2\rho}{\partial x^2}+D_4 \frac{\partial^4\rho}{\partial x^4}+\cdots$$Fundamental solution of the diffusion equation($D_2=D, D_{n>2}=0$):
Assuming $\rho(x, 0) = \delta(x)$
or in more general
$$\rho(x,t)={\frac {1}{\sqrt {4\pi Dt}}}\int_{-\infty}^{\infty}\exp \left(-{\frac {(x-s)^{2}}{4Dt}}\right)\rho(s,0)\,ds.$$Note: diffusion can only evolve forward $t>0$
Let ${X_1, X_2,\ldots,X_n}$ be a random sample of independent and identically distributed random variables shuch that $\operatorname {E} [X_i]=\mu$ and $\operatorname {var}[X_i]=\sigma^2$ than the sequence
$$S_n=\frac{X_1+X_2+\cdots+X_n}{n}$$would tend to a normal distribution with $\operatorname {E} [S_i]=\mu$ and $\displaystyle \operatorname {var}[S_i]=\frac{\sigma^2}{n}$
If $x, y$ have a distribution $\rho_1(x)$ than the distrubution of sum $s=x+y$ would be a convolution of identical distributions.
$$\rho_2(s) = \int\rho(s-y)\rho(y)\,dy=(\rho_1\circ\rho_1)(s)$$For simplicity let us assume that $\operatorname {E} [x]=\operatorname {E} [y]=0$. We can construct a series of $\rho_{n+1}=\rho_n\circ\rho_n$
def show_accumulative_distro(x, ρ):
N = x.size
ρ /= np.sum(ρ)*(x[-1]-x[0])/(ρ.size-1)
plt.figure(figsize=(12, 8), dpi=100)
plt.subplot(3, 3, 1); plt.ylabel(r'$\rho_n(x)$', usetex=True, fontsize=16)
plt.plot(x, ρ, label='initial')
plt.legend()
for n in range (8):
ρ = np.convolve(ρ,ρ) * (x[-1]-x[0]) / (ρ.size-1) # normalization
ρ = ρ[0::2]
x = np.linspace(2*x[0], 2*x[-1], ρ.size)
plt.subplot(3, 3, n+2)
if n%3==2: plt.ylabel(r'$\rho_n(x)$', usetex=True, fontsize=16)
if n>=5: plt.xlabel('$x$', usetex=True, fontsize=16)
plt.plot(x, ρ, 'C'+str(n+1), label=f'n={n+1}')
plt.legend()
plt.show()
x = np.linspace(-1, 1, 1000)
#ρ = np.float64(np.abs(x-0.8)<0.1) + np.float64(np.abs(x+0.8)<0.1)
ρ = x**4
show_accumulative_distro(x, ρ)
from scipy.stats import norm
def brownian_motion(σ=2, T=10, N=1000):
t = np.linspace(0, T, N)
dt = t[1]-t[0]
r = norm.rvs(size=t.shape, scale=σ**2*dt)
x = np.cumsum(r) # used cumulative sum instead of loop
plt.plot(t, x)
plt.xlabel('$t$', usetex=True, fontsize=16)
plt.ylabel('$x$', usetex=True, fontsize=16)
plt.show()
brownian_motion()
def brownian(x0, n, dt, delta, out=None):
""" Example from https://scipy-cookbook.readthedocs.io/items/BrownianMotion.html """
x0 = np.asarray(x0)
r = norm.rvs(size=x0.shape + (n,), scale=delta*np.sqrt(dt))
if out is None:
out = np.empty(r.shape)
np.cumsum(r, axis=-1, out=out)
out += np.expand_dims(x0, axis=-1)
return out
def example_multi_brownian(σ=2, T=10, N=1000, m=20, plotting=False):
dt = T/N
x = np.empty((m,N+1))
x[:, 0] = 0
brownian(x[:,0], N, dt, σ, out=x[:,1:])
if plotting==True:
t = np.linspace(0.0, N*dt, N+1)
for k in range(m):
plt.plot(t, x[k])
plt.xlabel('$t$', usetex=True, fontsize=16)
plt.ylabel('$x$', usetex=True, fontsize=16)
plt.show()
return x
X = example_multi_brownian(σ=2, T=10, N=1000, m=20, plotting=True)
def compare_diff_brownian(σ=2, T=1, N=101, m=1000):
X = example_multi_brownian(σ, T, N, m, plotting=False)
plt.figure(figsize=(12, 4), dpi=100)
for n in range(6):
plt.subplot(2,3,n+1)
x = np.linspace(-5,5,1000)
k = 15*(n+1)
t = k*T/(N-1)
plt.hist(X[:, k], bins=np.linspace(-5, 5, 50), density='True', color='C'+str(2*n))
Dt = 2*t
g = np.exp(-x**2/(4*Dt))/np.sqrt(4*np.pi*Dt)
plt.plot(x, g, 'black', label=f"t={t}")
plt.legend()
if n>=3: plt.xlabel('$x$', usetex=True, fontsize=16)
if n%3==0: plt.ylabel(r'$\rho(x)$', usetex=True, fontsize=16)
plt.show()
compare_diff_brownian()
import pandas as pd
import numpy as np
from functools import reduce
import pandas_datareader.data as web
import datetime
start, end = datetime.datetime(2009, 12, 30), datetime.datetime(2019, 11, 20)
tickers = ["^DJI", "^IXIC", "^GSPC", "^STOXX50E", "^N225", "^GDAXI"]
asset_universe = pd.DataFrame([web.DataReader(ticker, 'yahoo', start,
end).loc[:, 'Adj Close'] for ticker in tickers],
index=tickers).T.fillna(method='ffill')
asset_universe = asset_universe/asset_universe.iloc[0, :]
def show_assets(n):
data = asset_universe[tickers[n]]
df = np.diff(data)
var = np.var(df)
mean = np.mean(df)
print(f'analysis for {tickers[n]}: {mean:.6f} ± {np.sqrt(var):.6f}')
plt.figure(figsize=(10, 6), dpi=100)
plt.subplot(2, 2, (1,2))
plt.plot(data, 'r')
plt.subplot(2, 2, 3)
plt.plot(range(df.size), df, 'g')
plt.subplot(2, 2, 4)
plt.hist(df, bins=np.linspace(-0.05, 0.05, 50), density='True', color='C1')
t = np.linspace(-0.05, 0.05, 500)
y = 1/np.sqrt(2*np.pi*var)*np.exp(-(t-mean)**2/(2*var))
plt.plot(t, y, 'black')
plt.show()
show_assets(0)
analysis for ^DJI: 0.000636 ± 0.014427
show_assets(1)
analysis for ^IXIC: 0.001057 ± 0.022163
show_assets(3)
analysis for ^STOXX50E: 0.000094 ± 0.011880
generate brownian motion with one of the distrubutions obtained from real data.
Many processes can be modeled with an equation
(physical notation):
$$dy = a(y,t)dt + b(y,t)dW$$
Financial notation
$$dX_t = a(t, X_t)dt + b(t, X_t)dW_t$$where $dW$ describes a Wienier process (brownian motion)
The solution can be formally written as
$$X_t=X_0 + \int_0^ta(s,X_s)\,ds+\int_0^tb(s,X_s)\,dW_s$$For example Langevin equation (noisy drift)
$$dy = -\frac{(y-\mu)}{\tau} dt + \sigma \sqrt{\frac{2}{\tau}} dW$$or Black-Scholes model
$$dS_t=S_t\mu\,dt + S_t\sigma\, dW_t$$More general:
$$dX_t=a(b-X_t)\,dt +c\sqrt{X_t}\,dW_t$$Usual integral can be interpreted as a sum
$$\int_0^Ta_t\,dt\approx \sum_{n=0}^N a_{t_n}(t_{n+1}-t_{n})$$The stochastic one:
$$\int_0^Tb_t\,dW_t\approx \sum_{n=0}^N b_{t_n}(W_{t_{n+1}}-W_{t_{n}})$$The method is convergent if
$$\lim_{\delta \to 0}\operatorname {E}\left(\left|X_T-X^{\delta t}_T\right|\right)=0.$$The method is weakly convergent if
$$\lim_{\delta \to 0}\left|\operatorname {E}Q(X_T)-\operatorname {E}Q(X_T^{\delta t})\right|=0$$for every polynomial $Q$
The numerical scheme is strongly convergent with order $\gamma$ if
$$\operatorname{E}\left(\left|X_T-X^{\delta t}_T\right|\right)\leq c_T(\delta t)^\gamma$$and weakly convergent with order $\gamma$ if
$$\left|\operatorname {E}Q(X_T)-\operatorname {E}Q(X_T^{\delta t})\right|\leq c_T(\delta t)^\gamma$$Euler-Maruyama scheme is strongly convergent with order $1/2$ and weakly convergent with order $1$ for suffieciently smooth functions (continuous $a^{(4)}$ and $b^{(4)}$)
Weak convergence is important at specific time
Strong convergence is important for a path as a whole.
Other methods: