Numerical Methods¶

Lecture 10b: Multidimensional root and minimum finding¶

Outline¶

Root vs minimum
Methods
Constraints and bounds
Moder portfolio theory, Sharoe ration and portfolio optimization

In [1]:

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  
plt.rcParams['figure.dpi'] = 100
plt.rcParams['figure.figsize'] = [10,5]
plt.rcParams['axes.grid'] = True
import pandas as pd
import pandas_datareader.data as web
import datetime

import pprint as pp
import scipy as sc
import numpy as np
import scipy.linalg as lg  # SciPy Linear Algebra Library
import scipy.sparse as sp 
import scipy.sparse.linalg as sl
from IPython.display import HTML
from matplotlib import animation, rc

Multidimensional root finding¶

Root vs minimum¶

Root of continuous $F$

$$\vec F(\vec x) = 0\,,\qquad \vec F\in \mathbb{R}^N , \vec x\in \mathbb{R}^N$$

Minimum

$$G(\vec x)=\min$$

If $G$ is continuous and differentiable than local minima can be found by solving

$$\vec F = \vec \nabla G(\vec x)=0$$

Conversely if $\vec F=0$ we can take $G=\vec F^T\vec F=\min$ (not an optimal choice, but the simplest)

In [2]:

def plot_accuracy_comparision():
    plt.figure(figsize=(10,3), dpi=150)
    x = np.linspace(0, 6, 1000)
    y1 = np.float16(1 + 0.001*(x-2)**2-2e-4*(x-2.5)**3)
    y2 = np.float16(0.002*(x-2)-6e-4*(x-2.5)**2)
    plt.subplot(1, 2, 1)
    plt.grid(True)
    
    plt.title('Minimum')
    plt.plot(x, y1)
    plt.legend(['f(x)'])
    plt.subplot(1, 2, 2)
    plt.grid(True)
    plt.title('zero')
    plt.plot(x, y2, 'red')
    plt.legend(["f'(x)"])
    plt.show()

Comparision of accuracy between minimization and root search¶

Expected:

In [3]:

ε = 2**(-11)  # 11 bits out of 16 used for mantissa
print("Accuracy of minimum: ", np.sqrt(ε*1/0.002))
plot_accuracy_comparision()

Accuracy of minimum:  0.49410588440130926

But finding minimum is more stable:

$G$ controls if we are getting nearer or not
Gradients have special properties (integrability conditions) such as $\nabla\times\nabla G=0$
For finite functions on a closed domain minimum always exits (it can be in the boundary)
For roots we need more conditions (difficult to construct a multidim bracketing method)
Numerical round-off errors can cancel the roots

In [4]:

def plot_accuracy_comparision():

    t = np.linspace(-np.pi, np.pi, 1000)
    x1, y1 = np.cos(t),  np.sin(t)
    x2, y2 = np.cosh(t*0.4),  np.sinh(t*0.4)   
    plt.plot(x1, y1, 'black')
    plt.plot(x2, y2, 'red')
    plt.plot(-x2, y2, 'red')
    plt.plot(x2+0.1, y2, 'blue')
    plt.plot(-x2-0.1, y2, 'blue')
    plt.plot(x2-0.1, y2, 'blue')
    plt.plot(-x2+0.1, y2, 'blue')
    plt.fill_betweenx(y2, -x2+0.1, -x2-0.1, facecolor='yellow', alpha=0.25)
    plt.fill_betweenx(y2, x2+0.1, x2-0.1, facecolor='yellow', alpha=0.25)
    plt.ylim(-1.1,1.1)
    plt.xlim(-2,2)
    plt.show()

In [5]:

plot_accuracy_comparision()

$F^2=\min$ has a larger conditional number ($\text{cond} (F^2=\min)=\text{cond} (F=0)^2$ )
If $F(x)=Ax-b$ with $A$ - symmetric and positively defined it is better to use $G=\frac{1}{2}x^TAx-x^Tb$ rather than $F^2$
$F=\nabla G=0$ also at
- local maximum: $G=-F^2=\min$
- saddle points: $\tilde G = F^{\color{red}{4}}=\min$

Methods¶

Newton-Raphson method:¶

In 1-d we linearized the equation $f(x)=0$

$$f(x_{n+1})\approx f(x_n)+f'(x_n)(x_{n+1}-x_n)=0$$

In many dimensions we do similarly

$$F(x_{n+1})\approx F(x_n)+\frac{DF(x_n)}{Dx}(x_{n+1}-x_n)=0$$

So for each step we have to solve the linear equation using jacobian matrix $J=DF/Dx$

$$x_{n+1} = x_n-J^{-1}(x_n)F(x_n)$$

Note: that this is only formal notation, we almost never calculate $J^{-1}$
Better: solve

$$J\delta x=-F(x_n)$$

and then $$x_{n+1}=x_n+\delta x$$

If we cannot calculate the jacobian we can use approximation, but

usually this is far more time consuming
we need to calculate $N$ times $N$ elements of $F$
finite difference methods produce errors which can grow fast as $h^{-1}$ when close to the solution
the usual chaotic/cyclic behaviour of the Newton method
sometimes damped Newton-Raphson method is used $\delta x\to\alpha\delta x$ to ensure stability and avoid cycles.

Example: nonlinear two-point boundary problem¶

$$u''-2u(2u^2-1)=0\,,u(0)=0, u(10)=1$$

Using the simplest discretizations $x_n=hn, h=\frac{10}{N-1}$ we obtain

$$F_n=\frac{u_{n-1}-2u_n+u_{n+1}}{h^2}-2u_n(u_n^2-1)\,,\qquad 0<n<N-1$$$$F_0=u_n\,,\qquad F_{N-1}=u_{N-1}-1$$

and jacobian for $0<n<N-1$

$$J_{nm} = \frac{\partial F_n}{\partial u_m} = \frac{\delta_{n-1,m}+\delta_{n+1,m}}{h^2} - 2\left[3u_n^2-1+\frac{1}{h^2}\right]\delta_{n,m}$$

In [6]:

N = 100
x = np.linspace(0, 10, N)
h = 10/(N-1)

In [7]:

def newton_raphson_step(u, alpha=1):
    F, v = np.zeros_like(u), u[1:-1]    
    F[1:-1] = np.diff(u, 2)/(h*h) - 2*v*(v**2-1)
    sides = np.ones_like(u)/(h*h)
    ab = np.array([sides, -2*(3*u**2-1+1/(h*h)), sides])
    """ boundary condition implementation """
    ab[1, 0] = 1   # a00=1; 
    ab[0, 1] = 0   # a01=0; 
    ab[1, -1] = 1  # a[N-1,N-1] = 1
    ab[2, -2] = 0  # a[N-1,N-2] = 0
    # print(ab)
    b = F
    delta_u = lg.solve_banded((1, 1), ab, b);
    return u - alpha*delta_u

In [8]:

u = x/10
nsteps = 8
plt.plot(x,u)
for n in range(nsteps):
    u = newton_raphson_step(u)
    plt.plot(x,u)

plt.show()

In [9]:

u = x/10
nsteps = 8
plt.plot(x,u)
for n in range(nsteps):
    u = newton_raphson_step(u, 0.01)
    plt.plot(x,u)
plt.show()

In [10]:

u = 1-np.exp(-x)
u /= u[-1]
nsteps = 8
plt.plot(x,u)
for n in range(nsteps):
    u = newton_raphson_step(u)
    plt.plot(x,u)
plt.show()

More stable and sometime simplier to minimize some functions.

Solution to our proble is equavalent to minimize the energy functional (taken from lagrangian)

$$E=\int_0^{10}dx\frac12u'^2+\frac12(u^2-1)^2$$

which can be approximated with the discretized version

$$G = \sum_{n=1}^{N-2}\frac12\left(\frac{u_{n-1}-u_{n+1}}{h}\right)^2 + \frac12\left(u_n^2-1\right)^2$$

Check that $F_n=\partial G/\partial u_n$

Some methods:

Using gradients
- Minimization along the gradient direction
- Fastest descent
- Conjugate and bi-conjugate gradients
- Newton-Raphson method for $\nabla G$ (uses also hessian)

Using only values:
- Minimization along coordinates
- Simplex method (amoeba) - bracketing method
- Stochastic methods (annealing, evolution differences, inverted Metroopolis-Hastings algorithm)

Some examples:

Minimization along the gradient direction
- we define a 1-d function
  $$f(\alpha) = F\left(x_n + \alpha\nabla F(x_n)\right) $$
  - find $f(\alpha_{\min})=\min$ using 1-d methods
  - take $x_{n+1}=x_n + \alpha_{\min}\nabla F(x_n)$
  - repeat along a gradient at $x_{n+1}$ until convergence

Fastest descent:
- We make step in the direction of $-\nabla G(x_n)$ $$x_{n+1}=x_n-\gamma \nabla G(x_n)$$ where $\gamma$ is a certain step size.
- constant $\gamma$ allow slow, diffusion-like relaxation towards the minimum
- but near the true minimum the effective steps are very small because $\gamma G\to 0$

- when already near the minimum better to take
  
  $$\gamma _{n}={\frac {\left|\left({x} _{n}- {x} _{n-1}\right)^{T}\left[\nabla G({x} _{n})-\nabla G({x} _{n-1})\right]\right|}{\left\|\nabla G({x} _{n})-\nabla G({x} _{n-1})\right\|^{2}}}$$

In [11]:

def constant_step_example(nsteps = 1000, every=100, gamma=0.49*h**2):
    u = x/10
    results = [u]

    for k in range(nsteps):
        v = u[1:-1]    
        u[1:-1] += gamma*(np.diff(u, 2)/(h*h) - 2*v*(v**2-1))
        if k%every==0: results.append(np.copy(u))
    return np.array(results).reshape(-1, x.size).T

In [12]:

r = constant_step_example(every=20)
a = plt.plot(x, r)

Modern portfolio theory (Markovitz, Sharpe)¶

Portfolio: Combination of assets with wights $w_i$ such that $\displaystyle \sum_iw_i=1$.

Expected return: $$\mathrm E(R_p)=\sum_i\mathrm E(R_i) w_i$$
Portfolio return variance: $$\mathrm{var}(R_p) = \sum_i w_i^2 \mathrm{var}(R_i) + \sum_i\sum_{j\neq j}w_iw_j \mathrm{cov}(R_i,R_j)$$ or simply $$\sigma_p^2=w^T\Sigma w$$

Standard portfolio optimization problem: Minimize the risk $$w^T\Sigma w=\min$$ for given assumed returns $$R^Tw=\mu$$

Conditional minimization (lagrange multiplieres)

$$G(w, \lambda_1, \lambda_2) = w^T\Sigma w-\lambda_1\left(R^Tw-\mu\right)-\lambda_2\left(u^Tw-1\right)=\min$$

where $u$ is a vector of ones

$$u^T=(1,1,\ldots,1)$$

Which leads to a system: $${\displaystyle {\begin{pmatrix}2\Sigma &-R& -u\\ R^{T}&0&0\\ u^{T}&0&0 \end{pmatrix}}{\begin{pmatrix}w\\\lambda_1\\\lambda_2 \end{pmatrix}}={\begin{pmatrix}0\\\mu\\1 \end{pmatrix}}}$$ Note the time scaling (in the simplest linear model): $$R(t)=tR(1),\qquad\Sigma(t)=t\Sigma(1)$$

Sidenote:

$$\mathrm{corr}(R_i,R_j) = \frac{\mathrm{cov}(R_i,R_j)}{\sigma_i\sigma_j}$$

$\mathrm{corr}(R_i,R_i)=1$ (no units)
$\mathrm{cov}(R_i,R_i)=\sigma_i^2$ (units$^2$)

Example: Two assets problem and efficient frontier¶

Lets take a look what happens when we have two assets described by variance/covariance matrix

$$\Sigma = \begin{pmatrix} 0.16&& 0.01\\ 0.01&& 0.25 \end{pmatrix} \qquad\text{and returns}\quad R=\begin{pmatrix} 0.1\\ 0.5 \end{pmatrix} $$

Let us calculate the returns and volatility of the portfolio for weights $w=(s, 1-s)$ and $0\leq s\leq1$

In [13]:

def two_portfolio_show(R, Σ, color='blue',   marker_color=2, 
                       names=['pure asset 1', 'pure asset 2'], 
                       labels=['suboptimal portfolios', 'combined portfolios']):
    N = 100
    Rp = np.zeros(N)
    Vp = np.zeros(N)
    
    for n, s in enumerate(np.linspace(-0.20, 1.2, N)): 
        w = np.matrix([[s], [1-s]])  # vector of weights
        Vp[n] = np.sqrt(w.T*Σ*w)     # volatility
        Rp[n] = R.T*w                # returns
    
    plt.plot(Vp, Rp, 'black',  label='', alpha=0.2)
    
    for n, s in enumerate(np.linspace(0, 1, N)): 
        w = np.matrix([[s], [1-s]])  # vector of weights
        Vp[n] = np.sqrt(w.T*Σ*w)     # volatility
        Rp[n] = R.T*w                # returns        
    
    idx = np.argmin(Vp)
    plt.plot(Vp[idx:], Rp[idx:], 'black',  label=labels[1], linestyle='dashed')
    plt.plot(Vp[:idx], Rp[:idx], color,  label=labels[0])
    plt.scatter(Vp[0], Rp[0], marker = '*', s=100, color = f'C{marker_color}', label=names[1])
    plt.scatter(Vp[-1], Rp[-1], marker = '*', s=100, color = f'C{marker_color+1}', label=names[0])
    plt.xlabel('portfolio valitility')
    plt.ylabel('portfolio returns')
    plt.xlim(xmin=0)
    plt.legend()
    return Vp, Rp

In [14]:

R1 = np.matrix([[0.1], [0.5]])
Σ1 = np.matrix([ [0.16, 0.01],[0.01, 0.25]  ])
a = two_portfolio_show(R1, Σ1)

In [15]:

R2 = np.matrix([[0.1], [0.5]])
Σ2 = np.matrix([ [0.16, -0.198],[-0.198, 0.25]  ])
a = two_portfolio_show(R1, Σ1, labels=['positive correlation', ''], names=['', ''])
a = two_portfolio_show(R2, Σ2, 'cyan', labels=['negative correlation', ''])

Negative correlation can hedge out the volitality very efficienlty (security+some derivative)

Efficient frontier¶

$w\in[0,1]^N$ parameters such that $\Sigma_iw_i=1$
for two assets $w=[s, 1-s]^T$
portfolio returns $\mu = R^Tw$ - linear form in $w$ and linear function in s
portfolio variance $\sigma_p^2=w^T\Sigma w$ - semi positively defined quadratic form in $w$ and quadratic or linear in $s$

Conclusion:¶

Portfolio variance $\sigma^2_p(\mu)$ is a quadratic or linear (if one of the component variances is zero) function of the returns $\mu$.

In [16]:

R3 = np.matrix([[0.2], [0.5]])
Σ3 = np.matrix([ [0.1, 0],[0, 0.16]  ])
a = two_portfolio_show(R3, Σ3, names=['asset 1', ''], labels=['risky portfolio', ''])
Risk_free = 0.02
R4 = np.matrix([[Risk_free], [0.5]])
Σ4 = np.matrix([ [0, 0],[0, 0.16]  ])
a = two_portfolio_show(R4, Σ4, 'orange', marker_color=4, names=['risk free asset', 'asset 2'], labels=['partial risk free', ''])

Risk-free asset reduces the risk, but also reduces the returns
Critical Market Line (CML) = partial risk-free line

In [17]:

V, R = two_portfolio_show(R3, Σ3, names=['asset 1', 'asset 2'], labels=['risky portfolio', ''])
idx = np.argmax(R/V)
R4 = np.matrix([[Risk_free], [R[idx]]])
Σ4 = np.matrix([ [0, 0],[0, V[idx]**2]  ])
a = two_portfolio_show(R4, Σ4, 'green', marker_color=4, names=['risk free asset', 'optimal portfolio'], labels=['partial risk free', ''])

Any fully risky portfolio is either more risky or have lower returns.

(William F.) Sharpe ratio:¶

$$S_r = \frac{R_p-R_f}{\sigma_p}$$

where

$R_p$ - portfolio return
$R_f$ - risk free return
$\sigma_p$ - portfolio variance

Meaning: above risk-free returns devided by a risk factor - best when maximized.
Optimal portfolio: $$S_r=\left.\frac{d\mu}{d\sigma}\right|_{\textrm{opt}}$$

In [18]:

import sympy as smp
s00, s01, s11, w0, w1, mu = smp.symbols('\sigma_{00} \sigma_{01} \sigma_{11} w_0 w_1 \mu')
r0, r1, R = smp.symbols('r_0 r_1 R')
Σ = smp.Matrix([[s00, s01], [s01, s11]])
print("Covariation matrix:"); Σ

Covariation matrix:

Out[18]:

$\displaystyle \left[\begin{matrix}\sigma_{00} & \sigma_{01}\\\sigma_{01} & \sigma_{11}\end{matrix}\right]$

In [19]:

assumed_returns = smp.Eq(mu, r0*w0+r1*(1-w0)); assumed_returns

Out[19]:

$\displaystyle \mu = r_{0} w_{0} + r_{1} \left(1 - w_{0}\right)$

In [20]:

w = smp.solve(assumed_returns, w0)[0]; w

Out[20]:

$\displaystyle \frac{\mu - r_{1}}{r_{0} - r_{1}}$

In [21]:

weights = smp.simplify(smp.Matrix( [[w], [1-w] ] )); weights

Out[21]:

$\displaystyle \left[\begin{matrix}\frac{\mu - r_{1}}{r_{0} - r_{1}}\\\frac{- \mu + r_{0}}{r_{0} - r_{1}}\end{matrix}\right]$

In [22]:

portfolio_variation = (weights.T*Σ*weights)[0]
p = smp.simplify(smp.expand(portfolio_variation*(r0-r1)**2))
smp.collect(p, mu)

Out[22]:

$\displaystyle \mu^{2} \left(\sigma_{00} - 2 \sigma_{01} + \sigma_{11}\right) + \mu \left(- 2 \sigma_{00} r_{1} + 2 \sigma_{01} r_{0} + 2 \sigma_{01} r_{1} - 2 \sigma_{11} r_{0}\right) + \sigma_{00} r_{1}^{2} - 2 \sigma_{01} r_{0} r_{1} + \sigma_{11} r_{0}^{2}$

Portfolio variation is a quadratic function of expected returs (this is also true for more assets)

$$\sigma_p^2=a\mu^2+b\mu+c$$

This function determines the efficient frontier (usually drown on a $(\sigma_p, \mu)$ plane

In [24]:

final_frontier_shape(0.5, -1, 4)

Example portfolio¶

In [25]:

start, end = datetime.datetime(2018, 12, 30), datetime.datetime(2019, 12, 30)

tickers = ["GLD", "GOOG", "FB", "BP", "TSLA", "AAPL"]
#tickers = ["GLD", "GOOG", "FB", "BP", "TSLA", "AAPL", 'TM', 'KO', 'PEP']

assets = pd.DataFrame([web.DataReader(ticker, 'yahoo', start, 
                     end).loc[:, 'Adj Close'] for ticker in tickers],
                     index=tickers).T.fillna(method='ffill')

In [26]:

assets.describe()

Out[26]:

	GLD	GOOG	FB	BP	TSLA	AAPL
count	252.000000	252.000000	252.000000	252.000000	252.000000	252.000000
mean	131.431032	1187.196986	181.343214	38.996592	273.190793	206.467936
std	8.616101	81.787116	16.295223	1.955013	52.369413	34.911673
min	119.940002	1016.059998	131.089996	35.487289	178.970001	140.085220
25%	122.452499	1121.137482	171.754997	37.511118	232.219997	184.105156
50%	132.769997	1184.539978	184.409996	38.735779	260.295013	201.530914
75%	140.059998	1239.447540	193.007500	40.514652	311.909996	222.546852
max	146.660004	1361.170044	208.100006	43.157745	430.940002	291.519989

In [27]:

# for better presentations we normalize assets so that asset_i(0) = 1
normalized_assets = assets/assets.iloc[0, :]
a = normalized_assets.plot()

In [28]:

returns = assets.pct_change()
log_returns = np.log(1+returns)
a = log_returns.plot()

In [29]:

print("Mean returns")
returns_mean = returns.mean(); returns_mean

Mean returns

Out[29]:

GLD     0.000674
GOOG    0.001131
FB      0.001924
BP      0.000264
TSLA    0.001361
AAPL    0.002646
dtype: float64

In [30]:

print("Correlation matrix of the returns, ones on diagonal")
returns_corr = returns.corr(); returns_corr

Correlation matrix of the returns, ones on diagonal

Out[30]:

	GLD	GOOG	FB	BP	TSLA	AAPL
GLD	1.000000	-0.200118	-0.105207	-0.165788	-0.063133	-0.239379
GOOG	-0.200118	1.000000	0.565121	0.284538	0.238081	0.557873
FB	-0.105207	0.565121	1.000000	0.265990	0.218072	0.471653
BP	-0.165788	0.284538	0.265990	1.000000	0.184325	0.345533
TSLA	-0.063133	0.238081	0.218072	0.184325	1.000000	0.330026
AAPL	-0.239379	0.557873	0.471653	0.345533	0.330026	1.000000

In [31]:

print("Covariance matrix of the returns")
returns_cov = returns.cov(); returns_cov

Covariance matrix of the returns

Out[31]:

	GLD	GOOG	FB	BP	TSLA	AAPL
GLD	0.000054	-0.000022	-0.000014	-0.000013	-0.000014	-0.000029
GOOG	-0.000022	0.000232	0.000152	0.000045	0.000113	0.000140
FB	-0.000014	0.000152	0.000310	0.000048	0.000119	0.000137
BP	-0.000013	0.000045	0.000048	0.000107	0.000059	0.000059
TSLA	-0.000014	0.000113	0.000119	0.000059	0.000969	0.000169
AAPL	-0.000029	0.000140	0.000137	0.000059	0.000169	0.000272

In [32]:

plt.grid(False)
N = returns_corr.shape[0]
plt.xticks(np.arange(N)+0.5, labels=tickers)
plt.yticks(np.arange(N)+0.5, labels=tickers)
plt.imshow(returns_corr, vmin=-1, extent=(0, returns_corr.shape[0], returns_corr.shape[0], 0), cmap='RdBu')
plt.title("Correlation matrix of returns")
a = plt.colorbar()

In [33]:

a = pd.plotting.scatter_matrix(returns, diagonal='kde', alpha=0.2,figsize=(7,7))

Efficient frontier¶

We need to solve the system $${\displaystyle {\begin{pmatrix}2\Sigma &-R& -u\\ R^{T}&0&0\\ u^{T}&0&0 \end{pmatrix}}{\begin{pmatrix}w\\\lambda_1\\\lambda_2 \end{pmatrix}}={\begin{pmatrix}0\\\mu\\1 \end{pmatrix}}}$$ with $$R(t)=tR(1),\qquad\Sigma(t)=t\Sigma(1)$$

In [34]:

def find_min_volatility(mean_returns, cov_matrix, 𝜇, years=1):
    t, N = years * 252, cov_matrix.shape[0]
    R, Σ = mean_returns * t, cov_matrix * t    
    """Matrix construction"""
    M = np.zeros([N+2,N+2])
    M[0:N, 0:N] = 2*Σ
    M[0:N, N] = -R
    M[0:N, N+1] = -np.ones(N)
    M[N, 0:N] = R
    M[N+1, 0:N] = np.ones(N)
    """Solving equation"""
    b = np.matrix( [0]*N + [𝜇] + [1]).T
    w = np.linalg.solve(M, b)[0:N]
    """Producing output"""
    df = pd.DataFrame(data=w, index=cov_matrix.columns, columns=['weights'])
    volatility = np.sqrt(np.dot(w.T, np.dot(Σ, w)))
    rets = np.dot(R.T, w)
    return df, volatility[0,0]

In [68]:

min_volatility_weights, port_vol = find_min_volatility(returns_mean, returns_cov, 𝜇=0.1)
print(f"Minimal volatility {port_vol:.6f}")
min_volatility_weights.T

Minimal volatility 0.087997

Out[68]:

	GLD	GOOG	FB	BP	TSLA	AAPL
weights	0.594917	0.142677	-0.039084	0.369925	0.005446	-0.073882

In [36]:

ax = normalized_assets.plot(alpha=0.4)
data = np.sum(np.array(min_volatility_weights).T*np.array(assets), axis=1)
min_volatility_portfolio = pd.DataFrame(data/data[0],index=assets.index,columns=['min volatility'])
a = min_volatility_portfolio.plot(ax=ax, color='black')

In [37]:

ax = normalized_assets.pct_change().plot(alpha=0.3)
a = min_volatility_portfolio.pct_change().plot(ax=ax, color='black')

Finding max Sharpe ration on the frontier¶

For each $\mu$ we find minimal volatility and choose the one maximizing the Sharpe ratio $$S_r = \frac{R_p-R_f}{\sigma_p}$$ which is a nonlilnear function of the wieghts.

In [38]:

N = 500     # number of scans
risk_free_rate = 0.0178  # 1.78% in the US and 3.1% in Poland 

expected_return = np.linspace(0.01, 1, N)
sigma = np.empty(N)
weights = []
for n in range(N):
    w, sigma[n] = find_min_volatility(returns_mean, returns_cov, 𝜇=expected_return[n])
    weights.append(w)

In [39]:

sharpe_ratio = (expected_return - risk_free_rate)/sigma
idx = np.argmax(sharpe_ratio)
if 1.5*idx < 500:     maxidx = np.int32(1.5*idx)
else:  maxidx = 500
expected_return = expected_return[0:maxidx]
sigma = sigma[0:maxidx]
sharpe_ratio = sharpe_ratio[0:maxidx]
def sharpe_ratio_plot():
    plt.figure(figsize=(10,3))
    plt.subplot(1, 2, 1)
    plt.plot(expected_return, sharpe_ratio)
    plt.xlabel('Expecto return')
    plt.ylabel('Sharpe ratio')
    plt.scatter(expected_return[idx], sharpe_ratio[idx], marker='*', color='red', s=100)
    plt.legend(['Sharpe rat on the frontier', 'Max Sharpe rat.'])
    
    plt.subplot(1, 2, 2)
    plt.plot(sigma, expected_return, 'C2',  label='efficient frontier')
    plt.xlabel('Volatility')
    plt.ylabel('Expecto returns')
    plt.scatter(sigma[idx], expected_return[idx], color='red', label='Max Sharpe rat.', marker='*', s=100)
    a = plt.legend()

In [40]:

sharpe_ratio_plot()

In [41]:

def color_negative_red(val):
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color

In [42]:

print("optimal portfolio:")
print("Max Sharpe ratio   : ", sharpe_ratio[idx])
print("Expected return    : ", expected_return[idx])
print("Standard deviation : ", sigma[idx])
weights[idx].T.style.applymap(color_negative_red)

optimal portfolio:
Max Sharpe ratio   :  3.27284683528921
Expected return    :  0.4325851703406814
Standard deviation :  0.12673528313891544

Out[42]:

	GLD	GOOG	FB	BP	TSLA	AAPL
weights	0.63943	-0.0860582	0.127325	-0.11424	-0.0161773	0.44972

We used a constrain $$ \sum_i w_i=1 $$ But some weights turend out to be negative. Therefore some can be even larger that $1$.

$w_i>1$ means that we need to take credit for it
$w_i<0$ means losses, so generally it is a good idea to get rid of these assets

Better approach: we need to add additional bound

$$\forall i: 0<w_i<1 $$

Example: find a minimum of the function:

$$G(x,y) = -x^2-y^2$$

for $x\geq1$. We have to find minimum on the boundary

$$\mathcal{L}(x,y, \lambda) = -x^2-y^2-\lambda(x-1)$$

and for $\lambda=0$ (if it satisfies the bound)

$$\mathcal{L}(x,y, \lambda)=\min$$

for $$y=0, x=1, \lambda=2$$

and $G(x,y)=\min$ for $x=y=0$ which lies outside the domain, so we choose the first solution.

However numerically it can be a bit challenging, especially if bounds and constraits are also nonlinear.
For large portfolios Monte Carlo simulations might be an option

There can be many other constrains and bounds:

we cannot get read of the whole asset at once $w_i\leq u<1$
we want true diversification of assets $0<l\leq w_i$
minimum holding size
minimum transaction size
integer constraints
factor models provide additional constraints

There can be other risk measures to minimize instead of standard deviation
- variance
- semivariance $E\left[\min\left(R-E[R], 0\right)^2\right]$
- $\textrm{VaR}_{\alpha}(R)$
Other types of estimating variance/covariance/returns
- exponantial weights (pandas: rolling method)
- dynamical factor models
- size of the window can variate
Other types of prediction future statistical properties

Random portfolios¶

In [43]:

def portfolio_evolution(weights, mean_returns, cov_matrix, risk_free_rate = 0.0178, years=1):
    t = years * 252
    returns = np.sum(mean_returns*weights ) * t
    std = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights))) * np.sqrt(t)
    Sharpe_ratio = (returns-risk_free_rate*years)/std
    return std, returns, Sharpe_ratio

In [44]:

def random_portfolios(mean_returns, cov_matrix, risk_free_rate=0.0178, num_portfolios=1000):    
    results = np.zeros((num_portfolios, 3))
    N = len(mean_returns)
    weights_record = []
    for n in range(num_portfolios):
        weights = np.random.random(N)
        weights /= np.sum(weights)
        weights_record.append(weights)
        results[n, :] = portfolio_evolution(weights, mean_returns, cov_matrix, risk_free_rate)
    return results, weights_record

In [45]:

results, weights = random_portfolios(returns_mean, returns_cov, num_portfolios=10000)

In [46]:

def show_portfolio_simulations(results, weights):       
    idx = np.argmax(results[:, 2])
    sdp, rp = results[idx, 0], results[idx, 1]
    min_vol_idx = np.argmin(results[:, 0])
    sdp_min, rp_min = results[min_vol_idx, 0], results[min_vol_idx, 1]
    
    opt_weights = pd.DataFrame(data=weights[idx], index=assets.columns, columns=['weights']) 
    
    plt.scatter(results[:, 0],results[:, 1],c=results[:, 2], cmap='Spectral', marker='o', s=5)
    plt.colorbar()
    plt.scatter(sdp,rp, marker='*',color='r',s=100, label='Maximum Sharpe ratio')
    plt.scatter(sdp_min,rp_min, marker='*',color='g', s=100, label='Minimum volatility')
    plt.title('Simulated Portfolio')
    plt.xlabel('annualised volatility')
    plt.ylabel('annualised returns')
    plt.plot(sigma, expected_return, 'red', label='efficient frontier')
    plt.legend(labelspacing=0.8)
    
    """ single assets """    
    stds = np.sqrt(np.diag(np.matrix(returns_cov))*252)
    plt.scatter(stds, returns_mean*252, c='black', marker='+', s=100)
    
    
    for n in range(len(stds)):
        label = assets.columns[n]        
        plt.annotate(label, (stds[n], returns_mean[n]*252), 
                     textcoords="offset points", xytext=(0,10), ha='center')
    
    return idx, min_vol_idx, opt_weights

In [47]:

idx, min_vol_idx, mc_sharpe_weights = show_portfolio_simulations(results, weights)

In [48]:

print("optimal portfolio from MC:")
print("Max Sharpe ratio   : ", results[idx,2])
print("Expected return    : ", results[idx,1])
print("Standard deviation : ", results[idx,0])
mc_sharpe_weights.T

optimal portfolio from MC:
Max Sharpe ratio   :  2.964383720057463
Expected return    :  0.3663013162793188
Standard deviation :  0.11756282222213907

Out[48]:

	GLD	GOOG	FB	BP	TSLA	AAPL
weights	0.473499	0.015712	0.188216	0.020544	0.03911	0.26292

In [49]:

def two_portfolios_show(returns_mean, returns_cov, years=1):
    t, num_assets = 252 * years, len(returns_mean)
    P = np.matrix(returns_mean) * t 
    S = np.matrix(returns_cov) * t
    print(P.shape)
    N = 100
    Rp = np.zeros(N)
    Vp = np.zeros(N)
    for k in range(num_assets-1):
        for m in range(k+1, num_assets):
            Σ = np.matrix([  [S[k,k], S[k,m]], 
                             [S[m,k], S[m,m]]])
            R = np.matrix([[P[0,k]], [P[0,m]]])

            for n, s in enumerate(np.linspace(0, 1, N)):
                w = np.matrix([[s], [1-s]])  # vector of weights
                Vp[n] = np.sqrt(w.T*Σ*w)     # volatility
                Rp[n] = R.T*w                # returns
            plt.plot(Vp, Rp, 'black')

In [50]:

two_portfolios_show(returns_mean, returns_cov)
idx, min_vol_idx, mc_sharpe_weights = show_portfolio_simulations(results, weights)

(1, 6)

Simulating annealing¶

Algorithm (details may vary):

randomly choose initial point $x$
randomly choose neighbour $y$ ($|x-y|$ can be proportional to temperature $T$, in physics gaussian distro)
if $f(y)<f(x)$ we accept $y$ as a new solution
if $f(y)>f(x)$ we accept $y$ as a new solution with probability $P(f(y)-f(x), T)$ in physics $P$ has a Boltzman distribution

$$P(\Delta E, T) = e^{-\beta \Delta E/T}$$
decrease the temperature example $T_n = (1-n/n_{max})T_0$

In [51]:

def portfolio_annealing(mean_returns, cov_matrix, risk_free_rate=0.0178, years=1, T0=0.1, steps=1000, boltzmann_constant=0.001):
   
    num_assets = len(mean_returns)
    max_tries = 100
     
    """ Choosing initial point """
    w = np.random.random(num_assets)
    w /= np.sum(w)

    result = portfolio_evolution(w, mean_returns, cov_matrix, risk_free_rate, years)
    results = [np.array(result)]
    sharpe_ratio = result[2]
    weights = [w] 
    
    for n in range(steps):
        """ Finding a test point inside the domain """
        T = T0*(1-n/steps)
        tries = 0
        while True:
            wi = w + np.random.normal(0, T, num_assets)    
            tries += 1
            if np.prod(wi>=0) == 1 or tries>max_tries: break
        wi = wi/np.sum(wi) # normalize so that sum w = 1
        new_result = portfolio_evolution(wi, mean_returns, cov_matrix, risk_free_rate, years)
        new_sharpe_ratio = new_result[2]
        """ Accepting or rejecting """        
        if new_sharpe_ratio > sharpe_ratio: 
            accept = True
        elif np.exp((new_sharpe_ratio-sharpe_ratio)/(boltzmann_constant*T))>np.random.random(): 
            accept = True
        else: 
            accept = False
        
        if accept==True:        
            sharpe_ratio = new_sharpe_ratio
            w = wi 
            weights.append(w)
            results.append(np.array(new_result))

    results = np.array(results)
    idx = np.argmax(results[:, 2])
    return results, idx,  pd.DataFrame(data=weights[idx], index=cov_matrix.columns, columns=['weights'])

In [52]:

path, idx, max_sharpe_weights = portfolio_annealing(returns_mean, returns_cov, T0=0.1, steps = 300)
print("optimal portfolio:")
print("Max Sharpe ratio   : ", path[idx,2])
print("Expected return    : ", path[idx,1])
print("Standard deviation : ", path[idx,0])
max_sharpe_weights.T

optimal portfolio:
Max Sharpe ratio   :  3.195297039456286
Expected return    :  0.36525868505582554
Standard deviation :  0.10874065251690947

Out[52]:

	GLD	GOOG	FB	BP	TSLA	AAPL
weights	0.57051	0.007544	0.077267	0.00181	0.000116	0.342754

In [53]:

plt.plot(path[:, 0], path[:, 1], label='path of annealing')
plt.xlabel('annualised volatility')
plt.ylabel('annualised returns')
plt.plot(sigma, expected_return, 'red', label='efficient frontier')
a = plt.legend(labelspacing=0.8)
two_portfolios_show(returns_mean, returns_cov)

(1, 6)

In [54]:

ax = normalized_assets.plot(alpha=0.3)
min_volatility_portfolio.plot(ax=ax, color='black', alpha=0.5, linewidth=3)
data = np.sum(np.array(max_sharpe_weights).T*np.array(assets), axis=1)

max_sharpe_portfolio = pd.DataFrame(data/data[0],index=assets.index,columns=['MC Sharpe'])
a = max_sharpe_portfolio.plot(ax=ax, color='blue', linewidth=3)

scipy.optimize¶

Scipy has a method optimize alowing constraints and bounds.

We choose a method Sequential Least SQuares Programming (SLSQP)

approximate lagrangian, constraints and bounds as quadratic functions
use current gradient to define search direction
find a minimum along the direction using parabola approximation subject to the constraints and bounds
repeat the procedure from the minimum It is likely to find only a local minimum rather than global.

It is much faster and direct method (directional minimizations)

In [55]:

import scipy.optimize as sco

def find_max_Sharpe_ratio_portfolio(mean_returns, cov_matrix, risk_free_rate=0.0178, years=1):
    num_assets = len(mean_returns)  
    path = []
    def neg_sharpe_ratio(w):
        p = portfolio_evolution(w, mean_returns, cov_matrix, risk_free_rate)
        path.append(np.array(p))
        return -p[2]
        
    w_init = np.random.random(num_assets)
    w_init /= np.sum(w_init)

    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bounds = tuple( (0,1) for asset in range(num_assets))
    opts = sco.minimize(neg_sharpe_ratio, w_init, 
                        method='SLSQP', bounds=bounds, constraints=constraints)
    #print(opts)
    w = opts['x']
 
    res =  pd.DataFrame(data=w, index=cov_matrix.columns, columns=['weights'])
    return np.array(path), portfolio_evolution(w, mean_returns, cov_matrix, risk_free_rate), res

In [56]:

path_scipy, best_sharpe_res, best_sharpe_portfolio = find_max_Sharpe_ratio_portfolio(returns_mean, returns_cov)

print("optimal portfolio:")
print("Max Sharpe ratio   : ", best_sharpe_res[2])
print("Expected return    : ", best_sharpe_res[1])
print("Standard deviation : ", best_sharpe_res[0])

best_sharpe_portfolio.T

optimal portfolio:
Max Sharpe ratio   :  3.2031223117021956
Expected return    :  0.3646126545177254
Standard deviation :  0.10827330984230292

Out[56]:

	GLD	GOOG	FB	BP	TSLA	AAPL
weights	0.580339	2.172566e-16	0.075982	0.0	2.593955e-16	0.343679

In [57]:

plt.plot(path[:, 0], path[:, 1], label='path of annealing')
plt.plot(path_scipy[:, 0], path_scipy[:, 1], label='path of SLSQP')
plt.xlabel('annualised volatility')
plt.ylabel('annualised returns')
plt.plot(sigma, expected_return, 'black', label='efficient frontier')
a = plt.legend(labelspacing=0.8)

In [58]:

best_sharpe_weights = best_sharpe_portfolio[0:assets.shape[1]]
w = np.array(best_sharpe_weights)[:,0]
data = np.sum(w*np.array(assets), axis=1)
final_portfolio =  pd.DataFrame(data/data[0], index=assets.index,columns=['best Sharpe'])

In [59]:

ax = normalized_assets.plot(alpha=0.3)
min_volatility_portfolio.plot(ax=ax, color='black', alpha=0.7)
final_portfolio.plot(ax=ax, color='red')
a = max_sharpe_portfolio.plot(ax=ax, color='blue', alpha=0.3, linewidth=4)

In [60]:

fp = final_portfolio.pct_change()
Data = np.array([ [best_sharpe_res[1], np.float64(fp.mean())*252], 
                  [best_sharpe_res[0], np.float64(fp.std())*np.sqrt(252)] ] )
comparison = pd.DataFrame( data=Data, 
                           index=['Expected return', 'Standard deviation'], 
                           columns=["Theoretically optimized", 'Optimized portfolio'])
#comparison.insert()
comparison.insert(loc=2, value=np.abs(Data[:,0]-Data[:,1]), column='Abs Error')
comparison.insert(loc=3, value=100*comparison.iloc[:,2]/comparison.iloc[:,0], column='Rel Error (%)')

In [61]:

comparison

Out[61]:

	Theoretically optimized	Optimized portfolio	Abs Error	Rel Error (%)
Expected return	0.364613	0.404361	0.039748	10.901467
Standard deviation	0.108273	0.123383	0.015110	13.955382

In [62]:

print(f"Estimated returns and covariance matrix\nare affected by error: {100/np.sqrt(252):.2f}%")

Estimated returns and covariance matrix
are affected by error: 6.30%

In [63]:

#start, end = datetime.datetime(2016, 12, 30), datetime.datetime(2019, 12, 30)
#assets = pd.DataFrame([web.DataReader(ticker, 'yahoo', start, 
#                     end).loc[:, 'Adj Close'] for ticker in tickers],
#                     index=tickers).T.fillna(method='ffill')

In [64]:

normalized_assets = assets/assets.iloc[0, :]
a = normalized_assets.plot()

In [65]:

best_sharpe_weights = best_sharpe_portfolio[0:assets.shape[1]]
w = np.array(best_sharpe_weights)[:,0]
data = np.sum(w*np.array(assets), axis=1)
final_portfolio =  pd.DataFrame(data/data[0], index=assets.index,columns=['best Sharpe'])

In [66]:

ax = normalized_assets.plot(alpha=0.3)
final_portfolio.plot(ax=ax, color='red')

Out[66]:

<matplotlib.axes._subplots.AxesSubplot at 0x7ff640b5cac8>