Numerical Methods

Lecture 10b: Multidimensional root and minimum finding

by: Tomasz Romańczukiewicz


Outline

  • Root vs minimum
  • Methods
  • Constraints and bounds
  • Moder portfolio theory, Sharoe ration and portfolio optimization
In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  
plt.rcParams['figure.dpi'] = 100
plt.rcParams['figure.figsize'] = [10,5]
plt.rcParams['axes.grid'] = True
import pandas as pd
import pandas_datareader.data as web
import datetime

import pprint as pp
import scipy as sc
import numpy as np
import scipy.linalg as lg  # SciPy Linear Algebra Library
import scipy.sparse as sp 
import scipy.sparse.linalg as sl
from IPython.display import HTML
from matplotlib import animation, rc

Multidimensional root finding

Root vs minimum

  • Root of continuous $F$
$$\vec F(\vec x) = 0\,,\qquad \vec F\in \mathbb{R}^N , \vec x\in \mathbb{R}^N$$
  • Minimum
$$G(\vec x)=\min$$
  • If $G$ is continuous and differentiable than local minima can be found by solving
$$\vec F = \vec \nabla G(\vec x)=0$$
  • Conversely if $\vec F=0$ we can take $G=\vec F^T\vec F=\min$ (not an optimal choice, but the simplest)
In [2]:
def plot_accuracy_comparision():
    plt.figure(figsize=(10,3), dpi=150)
    x = np.linspace(0, 6, 1000)
    y1 = np.float16(1 + 0.001*(x-2)**2-2e-4*(x-2.5)**3)
    y2 = np.float16(0.002*(x-2)-6e-4*(x-2.5)**2)
    plt.subplot(1, 2, 1)
    plt.grid(True)
    
    plt.title('Minimum')
    plt.plot(x, y1)
    plt.legend(['f(x)'])
    plt.subplot(1, 2, 2)
    plt.grid(True)
    plt.title('zero')
    plt.plot(x, y2, 'red')
    plt.legend(["f'(x)"])
    plt.show()

Expected:

In [3]:
ε = 2**(-11)  # 11 bits out of 16 used for mantissa
print("Accuracy of minimum: ", np.sqrt(ε*1/0.002))
plot_accuracy_comparision()
Accuracy of minimum:  0.49410588440130926

But finding minimum is more stable:

  • $G$ controls if we are getting nearer or not
  • Gradients have special properties (integrability conditions) such as $\nabla\times\nabla G=0$
  • For finite functions on a closed domain minimum always exits (it can be in the boundary)
  • For roots we need more conditions (difficult to construct a multidim bracketing method)
  • Numerical round-off errors can cancel the roots
In [4]:
def plot_accuracy_comparision():

    t = np.linspace(-np.pi, np.pi, 1000)
    x1, y1 = np.cos(t),  np.sin(t)
    x2, y2 = np.cosh(t*0.4),  np.sinh(t*0.4)   
    plt.plot(x1, y1, 'black')
    plt.plot(x2, y2, 'red')
    plt.plot(-x2, y2, 'red')
    plt.plot(x2+0.1, y2, 'blue')
    plt.plot(-x2-0.1, y2, 'blue')
    plt.plot(x2-0.1, y2, 'blue')
    plt.plot(-x2+0.1, y2, 'blue')
    plt.fill_betweenx(y2, -x2+0.1, -x2-0.1, facecolor='yellow', alpha=0.25)
    plt.fill_betweenx(y2, x2+0.1, x2-0.1, facecolor='yellow', alpha=0.25)
    plt.ylim(-1.1,1.1)
    plt.xlim(-2,2)
    plt.show()
In [5]:
plot_accuracy_comparision()
  • $F^2=\min$ has a larger conditional number ($\text{cond} (F^2=\min)=\text{cond} (F=0)^2$ )
  • If $F(x)=Ax-b$ with $A$ - symmetric and positively defined it is better to use $G=\frac{1}{2}x^TAx-x^Tb$ rather than $F^2$

  • $F=\nabla G=0$ also at

    • local maximum: $G=-F^2=\min$
    • saddle points: $\tilde G = F^{\color{red}{4}}=\min$

Methods

Newton-Raphson method:

  • In 1-d we linearized the equation $f(x)=0$
$$f(x_{n+1})\approx f(x_n)+f'(x_n)(x_{n+1}-x_n)=0$$
  • In many dimensions we do similarly
$$F(x_{n+1})\approx F(x_n)+\frac{DF(x_n)}{Dx}(x_{n+1}-x_n)=0$$

So for each step we have to solve the linear equation using jacobian matrix $J=DF/Dx$

$$x_{n+1} = x_n-J^{-1}(x_n)F(x_n)$$

Note: that this is only formal notation, we almost never calculate $J^{-1}$
Better: solve

$$J\delta x=-F(x_n)$$

and then $$x_{n+1}=x_n+\delta x$$

If we cannot calculate the jacobian we can use approximation, but

  • usually this is far more time consuming
    we need to calculate $N$ times $N$ elements of $F$
  • finite difference methods produce errors which can grow fast as $h^{-1}$ when close to the solution
  • the usual chaotic/cyclic behaviour of the Newton method
  • sometimes damped Newton-Raphson method is used $\delta x\to\alpha\delta x$ to ensure stability and avoid cycles.

Example: nonlinear two-point boundary problem

$$u''-2u(2u^2-1)=0\,,u(0)=0, u(10)=1$$

Using the simplest discretizations $x_n=hn, h=\frac{10}{N-1}$ we obtain

$$F_n=\frac{u_{n-1}-2u_n+u_{n+1}}{h^2}-2u_n(u_n^2-1)\,,\qquad 0<n<N-1$$$$F_0=u_n\,,\qquad F_{N-1}=u_{N-1}-1$$

and jacobian for $0<n<N-1$

$$J_{nm} = \frac{\partial F_n}{\partial u_m} = \frac{\delta_{n-1,m}+\delta_{n+1,m}}{h^2} - 2\left[3u_n^2-1+\frac{1}{h^2}\right]\delta_{n,m}$$
In [6]:
N = 100
x = np.linspace(0, 10, N)
h = 10/(N-1)
In [7]:
def newton_raphson_step(u, alpha=1):
    F, v = np.zeros_like(u), u[1:-1]    
    F[1:-1] = np.diff(u, 2)/(h*h) - 2*v*(v**2-1)
    sides = np.ones_like(u)/(h*h)
    ab = np.array([sides, -2*(3*u**2-1+1/(h*h)), sides])
    """ boundary condition implementation """
    ab[1, 0] = 1   # a00=1; 
    ab[0, 1] = 0   # a01=0; 
    ab[1, -1] = 1  # a[N-1,N-1] = 1
    ab[2, -2] = 0  # a[N-1,N-2] = 0
    # print(ab)
    b = F
    delta_u = lg.solve_banded((1, 1), ab, b);
    return u - alpha*delta_u 
In [8]:
u = x/10
nsteps = 8
plt.plot(x,u)
for n in range(nsteps):
    u = newton_raphson_step(u)
    plt.plot(x,u)

plt.show()
In [9]:
u = x/10
nsteps = 8
plt.plot(x,u)
for n in range(nsteps):
    u = newton_raphson_step(u, 0.01)
    plt.plot(x,u)
plt.show()
In [10]:
u = 1-np.exp(-x)
u /= u[-1]
nsteps = 8
plt.plot(x,u)
for n in range(nsteps):
    u = newton_raphson_step(u)
    plt.plot(x,u)
plt.show()

More stable and sometime simplier to minimize some functions.

Solution to our proble is equavalent to minimize the energy functional (taken from lagrangian)

$$E=\int_0^{10}dx\frac12u'^2+\frac12(u^2-1)^2$$

which can be approximated with the discretized version

$$G = \sum_{n=1}^{N-2}\frac12\left(\frac{u_{n-1}-u_{n+1}}{h}\right)^2 + \frac12\left(u_n^2-1\right)^2$$

Check that $F_n=\partial G/\partial u_n$

Some methods:

  • Using gradients
    • Minimization along the gradient direction
    • Fastest descent
    • Conjugate and bi-conjugate gradients
    • Newton-Raphson method for $\nabla G$ (uses also hessian)
  • Using only values:
    • Minimization along coordinates
    • Simplex method (amoeba) - bracketing method
    • Stochastic methods (annealing, evolution differences, inverted Metroopolis-Hastings algorithm)

Some examples:

  • Minimization along the gradient direction
    • we define a 1-d function
      $$f(\alpha) = F\left(x_n + \alpha\nabla F(x_n)\right) $$
      • find $f(\alpha_{\min})=\min$ using 1-d methods
      • take $x_{n+1}=x_n + \alpha_{\min}\nabla F(x_n)$
      • repeat along a gradient at $x_{n+1}$ until convergence
  • Fastest descent:
    • We make step in the direction of $-\nabla G(x_n)$ $$x_{n+1}=x_n-\gamma \nabla G(x_n)$$ where $\gamma$ is a certain step size.
    • constant $\gamma$ allow slow, diffusion-like relaxation towards the minimum
    • but near the true minimum the effective steps are very small because $\gamma G\to 0$
    • when already near the minimum better to take

      $$\gamma _{n}={\frac {\left|\left({x} _{n}- {x} _{n-1}\right)^{T}\left[\nabla G({x} _{n})-\nabla G({x} _{n-1})\right]\right|}{\left\|\nabla G({x} _{n})-\nabla G({x} _{n-1})\right\|^{2}}}$$

In [11]:
def constant_step_example(nsteps = 1000, every=100, gamma=0.49*h**2):
    u = x/10
    results = [u]

    for k in range(nsteps):
        v = u[1:-1]    
        u[1:-1] += gamma*(np.diff(u, 2)/(h*h) - 2*v*(v**2-1))
        if k%every==0: results.append(np.copy(u))
    return np.array(results).reshape(-1, x.size).T
In [12]:
r = constant_step_example(every=20)
a = plt.plot(x, r)

Modern portfolio theory (Markovitz, Sharpe)


Portfolio: Combination of assets with wights $w_i$ such that $\displaystyle \sum_iw_i=1$.

  • Expected return: $$\mathrm E(R_p)=\sum_i\mathrm E(R_i) w_i$$
  • Portfolio return variance: $$\mathrm{var}(R_p) = \sum_i w_i^2 \mathrm{var}(R_i) + \sum_i\sum_{j\neq j}w_iw_j \mathrm{cov}(R_i,R_j)$$ or simply $$\sigma_p^2=w^T\Sigma w$$

Standard portfolio optimization problem: Minimize the risk $$w^T\Sigma w=\min$$ for given assumed returns $$R^Tw=\mu$$

Conditional minimization (lagrange multiplieres)

$$G(w, \lambda_1, \lambda_2) = w^T\Sigma w-\lambda_1\left(R^Tw-\mu\right)-\lambda_2\left(u^Tw-1\right)=\min$$

where $u$ is a vector of ones

$$u^T=(1,1,\ldots,1)$$

Which leads to a system: $${\displaystyle {\begin{pmatrix}2\Sigma &-R& -u\\ R^{T}&0&0\\ u^{T}&0&0 \end{pmatrix}}{\begin{pmatrix}w\\\lambda_1\\\lambda_2 \end{pmatrix}}={\begin{pmatrix}0\\\mu\\1 \end{pmatrix}}}$$ Note the time scaling (in the simplest linear model): $$R(t)=tR(1),\qquad\Sigma(t)=t\Sigma(1)$$

Sidenote:

$$\mathrm{corr}(R_i,R_j) = \frac{\mathrm{cov}(R_i,R_j)}{\sigma_i\sigma_j}$$
  • $\mathrm{corr}(R_i,R_i)=1$ (no units)
  • $\mathrm{cov}(R_i,R_i)=\sigma_i^2$ (units$^2$)

Example: Two assets problem and efficient frontier

Lets take a look what happens when we have two assets described by variance/covariance matrix

$$\Sigma = \begin{pmatrix} 0.16&& 0.01\\ 0.01&& 0.25 \end{pmatrix} \qquad\text{and returns}\quad R=\begin{pmatrix} 0.1\\ 0.5 \end{pmatrix} $$

Let us calculate the returns and volatility of the portfolio for weights $w=(s, 1-s)$ and $0\leq s\leq1$

In [13]:
def two_portfolio_show(R, Σ, color='blue',   marker_color=2, 
                       names=['pure asset 1', 'pure asset 2'], 
                       labels=['suboptimal portfolios', 'combined portfolios']):
    N = 100
    Rp = np.zeros(N)
    Vp = np.zeros(N)
    
    for n, s in enumerate(np.linspace(-0.20, 1.2, N)): 
        w = np.matrix([[s], [1-s]])  # vector of weights
        Vp[n] = np.sqrt(w.T*Σ*w)     # volatility
        Rp[n] = R.T*w                # returns
    
    plt.plot(Vp, Rp, 'black',  label='', alpha=0.2)
    
    for n, s in enumerate(np.linspace(0, 1, N)): 
        w = np.matrix([[s], [1-s]])  # vector of weights
        Vp[n] = np.sqrt(w.T*Σ*w)     # volatility
        Rp[n] = R.T*w                # returns        
    
    idx = np.argmin(Vp)
    plt.plot(Vp[idx:], Rp[idx:], 'black',  label=labels[1], linestyle='dashed')
    plt.plot(Vp[:idx], Rp[:idx], color,  label=labels[0])
    plt.scatter(Vp[0], Rp[0], marker = '*', s=100, color = f'C{marker_color}', label=names[1])
    plt.scatter(Vp[-1], Rp[-1], marker = '*', s=100, color = f'C{marker_color+1}', label=names[0])
    plt.xlabel('portfolio valitility')
    plt.ylabel('portfolio returns')
    plt.xlim(xmin=0)
    plt.legend()
    return Vp, Rp
In [14]:
R1 = np.matrix([[0.1], [0.5]])
Σ1 = np.matrix([ [0.16, 0.01],[0.01, 0.25]  ])
a = two_portfolio_show(R1, Σ1)
In [15]:
R2 = np.matrix([[0.1], [0.5]])
Σ2 = np.matrix([ [0.16, -0.198],[-0.198, 0.25]  ])
a = two_portfolio_show(R1, Σ1, labels=['positive correlation', ''], names=['', ''])
a = two_portfolio_show(R2, Σ2, 'cyan', labels=['negative correlation', ''])

Negative correlation can hedge out the volitality very efficienlty (security+some derivative)

Efficient frontier


  • $w\in[0,1]^N$ parameters such that $\Sigma_iw_i=1$
  • for two assets $w=[s, 1-s]^T$
  • portfolio returns $\mu = R^Tw$ - linear form in $w$ and linear function in s
  • portfolio variance $\sigma_p^2=w^T\Sigma w$ - semi positively defined quadratic form in $w$ and quadratic or linear in $s$

Conclusion:

Portfolio variance $\sigma^2_p(\mu)$ is a quadratic or linear (if one of the component variances is zero) function of the returns $\mu$.

In [16]:
R3 = np.matrix([[0.2], [0.5]])
Σ3 = np.matrix([ [0.1, 0],[0, 0.16]  ])
a = two_portfolio_show(R3, Σ3, names=['asset 1', ''], labels=['risky portfolio', ''])
Risk_free = 0.02
R4 = np.matrix([[Risk_free], [0.5]])
Σ4 = np.matrix([ [0, 0],[0, 0.16]  ])
a = two_portfolio_show(R4, Σ4, 'orange', marker_color=4, names=['risk free asset', 'asset 2'], labels=['partial risk free', ''])

Risk-free asset reduces the risk, but also reduces the returns
Critical Market Line (CML) = partial risk-free line

In [17]:
V, R = two_portfolio_show(R3, Σ3, names=['asset 1', 'asset 2'], labels=['risky portfolio', ''])
idx = np.argmax(R/V)
R4 = np.matrix([[Risk_free], [R[idx]]])
Σ4 = np.matrix([ [0, 0],[0, V[idx]**2]  ])
a = two_portfolio_show(R4, Σ4, 'green', marker_color=4, names=['risk free asset', 'optimal portfolio'], labels=['partial risk free', ''])

Any fully risky portfolio is either more risky or have lower returns.

(William F.) Sharpe ratio:

$$S_r = \frac{R_p-R_f}{\sigma_p}$$

where

  • $R_p$ - portfolio return
  • $R_f$ - risk free return
  • $\sigma_p$ - portfolio variance

Meaning: above risk-free returns devided by a risk factor - best when maximized.
Optimal portfolio: $$S_r=\left.\frac{d\mu}{d\sigma}\right|_{\textrm{opt}}$$

In [18]:
import sympy as smp
s00, s01, s11, w0, w1, mu = smp.symbols('\sigma_{00} \sigma_{01} \sigma_{11} w_0 w_1 \mu')
r0, r1, R = smp.symbols('r_0 r_1 R')
Σ = smp.Matrix([[s00, s01], [s01, s11]])
print("Covariation matrix:"); Σ
Covariation matrix:
Out[18]:
$\displaystyle \left[\begin{matrix}\sigma_{00} & \sigma_{01}\\\sigma_{01} & \sigma_{11}\end{matrix}\right]$
In [19]:
assumed_returns = smp.Eq(mu, r0*w0+r1*(1-w0)); assumed_returns
Out[19]:
$\displaystyle \mu = r_{0} w_{0} + r_{1} \left(1 - w_{0}\right)$
In [20]:
w = smp.solve(assumed_returns, w0)[0]; w
Out[20]:
$\displaystyle \frac{\mu - r_{1}}{r_{0} - r_{1}}$
In [21]:
weights = smp.simplify(smp.Matrix( [[w], [1-w] ] )); weights
Out[21]:
$\displaystyle \left[\begin{matrix}\frac{\mu - r_{1}}{r_{0} - r_{1}}\\\frac{- \mu + r_{0}}{r_{0} - r_{1}}\end{matrix}\right]$
In [22]:
portfolio_variation = (weights.T*Σ*weights)[0]
p = smp.simplify(smp.expand(portfolio_variation*(r0-r1)**2))
smp.collect(p, mu)
Out[22]:
$\displaystyle \mu^{2} \left(\sigma_{00} - 2 \sigma_{01} + \sigma_{11}\right) + \mu \left(- 2 \sigma_{00} r_{1} + 2 \sigma_{01} r_{0} + 2 \sigma_{01} r_{1} - 2 \sigma_{11} r_{0}\right) + \sigma_{00} r_{1}^{2} - 2 \sigma_{01} r_{0} r_{1} + \sigma_{11} r_{0}^{2}$
In [23]:
def final_frontier_shape(a, b, c):
    mu = np.linspace(0, 3, 1000)    
    sigma = np.sqrt(a*mu**2+b*mu+c)
    plt.figure(figsize=(8,4))
    plt.plot(sigma, mu)
    plt.title("Efficient frontier")
    plt.xlabel("Voaltility")
    plt.ylabel("Expected returns")
    plt.show()

Portfolio variation is a quadratic function of expected returs (this is also true for more assets)

$$\sigma_p^2=a\mu^2+b\mu+c$$

This function determines the efficient frontier (usually drown on a $(\sigma_p, \mu)$ plane

In [24]:
final_frontier_shape(0.5, -1, 4)

Example portfolio

In [25]:
start, end = datetime.datetime(2018, 12, 30), datetime.datetime(2019, 12, 30)

tickers = ["GLD", "GOOG", "FB", "BP", "TSLA", "AAPL"]
#tickers = ["GLD", "GOOG", "FB", "BP", "TSLA", "AAPL", 'TM', 'KO', 'PEP']

assets = pd.DataFrame([web.DataReader(ticker, 'yahoo', start, 
                     end).loc[:, 'Adj Close'] for ticker in tickers],
                     index=tickers).T.fillna(method='ffill')
In [26]:
assets.describe()
Out[26]:
GLD GOOG FB BP TSLA AAPL
count 252.000000 252.000000 252.000000 252.000000 252.000000 252.000000
mean 131.431032 1187.196986 181.343214 38.996592 273.190793 206.467936
std 8.616101 81.787116 16.295223 1.955013 52.369413 34.911673
min 119.940002 1016.059998 131.089996 35.487289 178.970001 140.085220
25% 122.452499 1121.137482 171.754997 37.511118 232.219997 184.105156
50% 132.769997 1184.539978 184.409996 38.735779 260.295013 201.530914
75% 140.059998 1239.447540 193.007500 40.514652 311.909996 222.546852
max 146.660004 1361.170044 208.100006 43.157745 430.940002 291.519989
In [27]:
# for better presentations we normalize assets so that asset_i(0) = 1
normalized_assets = assets/assets.iloc[0, :]
a = normalized_assets.plot()
In [28]:
returns = assets.pct_change()
log_returns = np.log(1+returns)
a = log_returns.plot()
In [29]:
print("Mean returns")
returns_mean = returns.mean(); returns_mean
Mean returns
Out[29]:
GLD     0.000674
GOOG    0.001131
FB      0.001924
BP      0.000264
TSLA    0.001361
AAPL    0.002646
dtype: float64
In [30]:
print("Correlation matrix of the returns, ones on diagonal")
returns_corr = returns.corr(); returns_corr
Correlation matrix of the returns, ones on diagonal
Out[30]:
GLD GOOG FB BP TSLA AAPL
GLD 1.000000 -0.200118 -0.105207 -0.165788 -0.063133 -0.239379
GOOG -0.200118 1.000000 0.565121 0.284538 0.238081 0.557873
FB -0.105207 0.565121 1.000000 0.265990 0.218072 0.471653
BP -0.165788 0.284538 0.265990 1.000000 0.184325 0.345533
TSLA -0.063133 0.238081 0.218072 0.184325 1.000000 0.330026
AAPL -0.239379 0.557873 0.471653 0.345533 0.330026 1.000000
In [31]:
print("Covariance matrix of the returns")
returns_cov = returns.cov(); returns_cov
Covariance matrix of the returns
Out[31]:
GLD GOOG FB BP TSLA AAPL
GLD 0.000054 -0.000022 -0.000014 -0.000013 -0.000014 -0.000029
GOOG -0.000022 0.000232 0.000152 0.000045 0.000113 0.000140
FB -0.000014 0.000152 0.000310 0.000048 0.000119 0.000137
BP -0.000013 0.000045 0.000048 0.000107 0.000059 0.000059
TSLA -0.000014 0.000113 0.000119 0.000059 0.000969 0.000169
AAPL -0.000029 0.000140 0.000137 0.000059 0.000169 0.000272
In [32]:
plt.grid(False)
N = returns_corr.shape[0]
plt.xticks(np.arange(N)+0.5, labels=tickers)
plt.yticks(np.arange(N)+0.5, labels=tickers)
plt.imshow(returns_corr, vmin=-1, extent=(0, returns_corr.shape[0], returns_corr.shape[0], 0), cmap='RdBu')
plt.title("Correlation matrix of returns")
a = plt.colorbar()
In [33]:
a = pd.plotting.scatter_matrix(returns, diagonal='kde', alpha=0.2,figsize=(7,7))

Efficient frontier

We need to solve the system $${\displaystyle {\begin{pmatrix}2\Sigma &-R& -u\\ R^{T}&0&0\\ u^{T}&0&0 \end{pmatrix}}{\begin{pmatrix}w\\\lambda_1\\\lambda_2 \end{pmatrix}}={\begin{pmatrix}0\\\mu\\1 \end{pmatrix}}}$$ with $$R(t)=tR(1),\qquad\Sigma(t)=t\Sigma(1)$$

In [34]:
def find_min_volatility(mean_returns, cov_matrix, 𝜇, years=1):
    t, N = years * 252, cov_matrix.shape[0]
    R, Σ = mean_returns * t, cov_matrix * t    
    """Matrix construction"""
    M = np.zeros([N+2,N+2])
    M[0:N, 0:N] = 2*Σ
    M[0:N, N] = -R
    M[0:N, N+1] = -np.ones(N)
    M[N, 0:N] = R
    M[N+1, 0:N] = np.ones(N)
    """Solving equation"""
    b = np.matrix( [0]*N + [𝜇] + [1]).T
    w = np.linalg.solve(M, b)[0:N]
    """Producing output"""
    df = pd.DataFrame(data=w, index=cov_matrix.columns, columns=['weights'])
    volatility = np.sqrt(np.dot(w.T, np.dot(Σ, w)))
    rets = np.dot(R.T, w)
    return df, volatility[0,0]
In [68]:
min_volatility_weights, port_vol = find_min_volatility(returns_mean, returns_cov, 𝜇=0.1)
print(f"Minimal volatility {port_vol:.6f}")
min_volatility_weights.T
Minimal volatility 0.087997
Out[68]:
GLD GOOG FB BP TSLA AAPL
weights 0.594917 0.142677 -0.039084 0.369925 0.005446 -0.073882
In [36]:
ax = normalized_assets.plot(alpha=0.4)
data = np.sum(np.array(min_volatility_weights).T*np.array(assets), axis=1)
min_volatility_portfolio = pd.DataFrame(data/data[0],index=assets.index,columns=['min volatility'])
a = min_volatility_portfolio.plot(ax=ax, color='black')
In [37]:
ax = normalized_assets.pct_change().plot(alpha=0.3)
a = min_volatility_portfolio.pct_change().plot(ax=ax, color='black')

Finding max Sharpe ration on the frontier

For each $\mu$ we find minimal volatility and choose the one maximizing the Sharpe ratio $$S_r = \frac{R_p-R_f}{\sigma_p}$$ which is a nonlilnear function of the wieghts.

In [38]:
N = 500     # number of scans
risk_free_rate = 0.0178  # 1.78% in the US and 3.1% in Poland 

expected_return = np.linspace(0.01, 1, N)
sigma = np.empty(N)
weights = []
for n in range(N):
    w, sigma[n] = find_min_volatility(returns_mean, returns_cov, 𝜇=expected_return[n])
    weights.append(w)    
In [39]:
sharpe_ratio = (expected_return - risk_free_rate)/sigma
idx = np.argmax(sharpe_ratio)
if 1.5*idx < 500:     maxidx = np.int32(1.5*idx)
else:  maxidx = 500
expected_return = expected_return[0:maxidx]
sigma = sigma[0:maxidx]
sharpe_ratio = sharpe_ratio[0:maxidx]
def sharpe_ratio_plot():
    plt.figure(figsize=(10,3))
    plt.subplot(1, 2, 1)
    plt.plot(expected_return, sharpe_ratio)
    plt.xlabel('Expecto return')
    plt.ylabel('Sharpe ratio')
    plt.scatter(expected_return[idx], sharpe_ratio[idx], marker='*', color='red', s=100)
    plt.legend(['Sharpe rat on the frontier', 'Max Sharpe rat.'])
    
    plt.subplot(1, 2, 2)
    plt.plot(sigma, expected_return, 'C2',  label='efficient frontier')
    plt.xlabel('Volatility')
    plt.ylabel('Expecto returns')
    plt.scatter(sigma[idx], expected_return[idx], color='red', label='Max Sharpe rat.', marker='*', s=100)
    a = plt.legend()
    
In [40]:
sharpe_ratio_plot()
In [41]:
def color_negative_red(val):
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color
In [42]:
print("optimal portfolio:")
print("Max Sharpe ratio   : ", sharpe_ratio[idx])
print("Expected return    : ", expected_return[idx])
print("Standard deviation : ", sigma[idx])
weights[idx].T.style.applymap(color_negative_red)
optimal portfolio:
Max Sharpe ratio   :  3.27284683528921
Expected return    :  0.4325851703406814
Standard deviation :  0.12673528313891544
Out[42]:
GLD GOOG FB BP TSLA AAPL
weights 0.63943 -0.0860582 0.127325 -0.11424 -0.0161773 0.44972

We used a constrain $$ \sum_i w_i=1 $$ But some weights turend out to be negative. Therefore some can be even larger that $1$.

  • $w_i>1$ means that we need to take credit for it
  • $w_i<0$ means losses, so generally it is a good idea to get rid of these assets

Better approach: we need to add additional bound

$$\forall i: 0<w_i<1 $$

Example: find a minimum of the function:

$$G(x,y) = -x^2-y^2$$

for $x\geq1$. We have to find minimum on the boundary

$$\mathcal{L}(x,y, \lambda) = -x^2-y^2-\lambda(x-1)$$

and for $\lambda=0$ (if it satisfies the bound)

$$\mathcal{L}(x,y, \lambda)=\min$$

for $$y=0, x=1, \lambda=2$$

and $G(x,y)=\min$ for $x=y=0$ which lies outside the domain, so we choose the first solution.

However numerically it can be a bit challenging, especially if bounds and constraits are also nonlinear.
For large portfolios Monte Carlo simulations might be an option

There can be many other constrains and bounds:

  • we cannot get read of the whole asset at once $w_i\leq u<1$
  • we want true diversification of assets $0<l\leq w_i$
  • minimum holding size
  • minimum transaction size
  • integer constraints
  • factor models provide additional constraints
  • There can be other risk measures to minimize instead of standard deviation
    • variance
    • semivariance $E\left[\min\left(R-E[R], 0\right)^2\right]$
    • $\textrm{VaR}_{\alpha}(R)$
  • Other types of estimating variance/covariance/returns
    • exponantial weights (pandas: rolling method)
    • dynamical factor models
    • size of the window can variate
  • Other types of prediction future statistical properties

Random portfolios

In [43]:
def portfolio_evolution(weights, mean_returns, cov_matrix, risk_free_rate = 0.0178, years=1):
    t = years * 252
    returns = np.sum(mean_returns*weights ) * t
    std = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights))) * np.sqrt(t)
    Sharpe_ratio = (returns-risk_free_rate*years)/std
    return std, returns, Sharpe_ratio
In [44]:
def random_portfolios(mean_returns, cov_matrix, risk_free_rate=0.0178, num_portfolios=1000):    
    results = np.zeros((num_portfolios, 3))
    N = len(mean_returns)
    weights_record = []
    for n in range(num_portfolios):
        weights = np.random.random(N)
        weights /= np.sum(weights)
        weights_record.append(weights)
        results[n, :] = portfolio_evolution(weights, mean_returns, cov_matrix, risk_free_rate)
    return results, weights_record
In [45]:
results, weights = random_portfolios(returns_mean, returns_cov, num_portfolios=10000)
In [46]:
def show_portfolio_simulations(results, weights):       
    idx = np.argmax(results[:, 2])
    sdp, rp = results[idx, 0], results[idx, 1]
    min_vol_idx = np.argmin(results[:, 0])
    sdp_min, rp_min = results[min_vol_idx, 0], results[min_vol_idx, 1]
    
    opt_weights = pd.DataFrame(data=weights[idx], index=assets.columns, columns=['weights']) 
    
    plt.scatter(results[:, 0],results[:, 1],c=results[:, 2], cmap='Spectral', marker='o', s=5)
    plt.colorbar()
    plt.scatter(sdp,rp, marker='*',color='r',s=100, label='Maximum Sharpe ratio')
    plt.scatter(sdp_min,rp_min, marker='*',color='g', s=100, label='Minimum volatility')
    plt.title('Simulated Portfolio')
    plt.xlabel('annualised volatility')
    plt.ylabel('annualised returns')
    plt.plot(sigma, expected_return, 'red', label='efficient frontier')
    plt.legend(labelspacing=0.8)
    
    """ single assets """    
    stds = np.sqrt(np.diag(np.matrix(returns_cov))*252)
    plt.scatter(stds, returns_mean*252, c='black', marker='+', s=100)
    
    
    for n in range(len(stds)):
        label = assets.columns[n]        
        plt.annotate(label, (stds[n], returns_mean[n]*252), 
                     textcoords="offset points", xytext=(0,10), ha='center')
    
    return idx, min_vol_idx, opt_weights
In [47]:
idx, min_vol_idx, mc_sharpe_weights = show_portfolio_simulations(results, weights)
In [48]:
print("optimal portfolio from MC:")
print("Max Sharpe ratio   : ", results[idx,2])
print("Expected return    : ", results[idx,1])
print("Standard deviation : ", results[idx,0])
mc_sharpe_weights.T
optimal portfolio from MC:
Max Sharpe ratio   :  2.964383720057463
Expected return    :  0.3663013162793188
Standard deviation :  0.11756282222213907
Out[48]:
GLD GOOG FB BP TSLA AAPL
weights 0.473499 0.015712 0.188216 0.020544 0.03911 0.26292
In [49]:
def two_portfolios_show(returns_mean, returns_cov, years=1):
    t, num_assets = 252 * years, len(returns_mean)
    P = np.matrix(returns_mean) * t 
    S = np.matrix(returns_cov) * t
    print(P.shape)
    N = 100
    Rp = np.zeros(N)
    Vp = np.zeros(N)
    for k in range(num_assets-1):
        for m in range(k+1, num_assets):
            Σ = np.matrix([  [S[k,k], S[k,m]], 
                             [S[m,k], S[m,m]]])
            R = np.matrix([[P[0,k]], [P[0,m]]])

            for n, s in enumerate(np.linspace(0, 1, N)):
                w = np.matrix([[s], [1-s]])  # vector of weights
                Vp[n] = np.sqrt(w.T*Σ*w)     # volatility
                Rp[n] = R.T*w                # returns
            plt.plot(Vp, Rp, 'black')
In [50]:
two_portfolios_show(returns_mean, returns_cov)
idx, min_vol_idx, mc_sharpe_weights = show_portfolio_simulations(results, weights)
(1, 6)

Simulating annealing

Algorithm (details may vary):

  • randomly choose initial point $x$
  • randomly choose neighbour $y$ ($|x-y|$ can be proportional to temperature $T$, in physics gaussian distro)
  • if $f(y)<f(x)$ we accept $y$ as a new solution
  • if $f(y)>f(x)$ we accept $y$ as a new solution with probability $P(f(y)-f(x), T)$ in physics $P$ has a Boltzman distribution

    $$P(\Delta E, T) = e^{-\beta \Delta E/T}$$

  • decrease the temperature example $T_n = (1-n/n_{max})T_0$
In [51]:
def portfolio_annealing(mean_returns, cov_matrix, risk_free_rate=0.0178, years=1, T0=0.1, steps=1000, boltzmann_constant=0.001):
   
    num_assets = len(mean_returns)
    max_tries = 100
     
    """ Choosing initial point """
    w = np.random.random(num_assets)
    w /= np.sum(w)

    result = portfolio_evolution(w, mean_returns, cov_matrix, risk_free_rate, years)
    results = [np.array(result)]
    sharpe_ratio = result[2]
    weights = [w] 
    
    for n in range(steps):
        """ Finding a test point inside the domain """
        T = T0*(1-n/steps)
        tries = 0
        while True:
            wi = w + np.random.normal(0, T, num_assets)    
            tries += 1
            if np.prod(wi>=0) == 1 or tries>max_tries: break
        wi = wi/np.sum(wi) # normalize so that sum w = 1
        new_result = portfolio_evolution(wi, mean_returns, cov_matrix, risk_free_rate, years)
        new_sharpe_ratio = new_result[2]
        """ Accepting or rejecting """        
        if new_sharpe_ratio > sharpe_ratio: 
            accept = True
        elif np.exp((new_sharpe_ratio-sharpe_ratio)/(boltzmann_constant*T))>np.random.random(): 
            accept = True
        else: 
            accept = False
        
        if accept==True:        
            sharpe_ratio = new_sharpe_ratio
            w = wi 
            weights.append(w)
            results.append(np.array(new_result))

    results = np.array(results)
    idx = np.argmax(results[:, 2])
    return results, idx,  pd.DataFrame(data=weights[idx], index=cov_matrix.columns, columns=['weights'])       
In [52]:
path, idx, max_sharpe_weights = portfolio_annealing(returns_mean, returns_cov, T0=0.1, steps = 300)
print("optimal portfolio:")
print("Max Sharpe ratio   : ", path[idx,2])
print("Expected return    : ", path[idx,1])
print("Standard deviation : ", path[idx,0])
max_sharpe_weights.T
optimal portfolio:
Max Sharpe ratio   :  3.195297039456286
Expected return    :  0.36525868505582554
Standard deviation :  0.10874065251690947
Out[52]:
GLD GOOG FB BP TSLA AAPL
weights 0.57051 0.007544 0.077267 0.00181 0.000116 0.342754
In [53]:
plt.plot(path[:, 0], path[:, 1], label='path of annealing')
plt.xlabel('annualised volatility')
plt.ylabel('annualised returns')
plt.plot(sigma, expected_return, 'red', label='efficient frontier')
a = plt.legend(labelspacing=0.8)
two_portfolios_show(returns_mean, returns_cov)
(1, 6)
In [54]:
ax = normalized_assets.plot(alpha=0.3)
min_volatility_portfolio.plot(ax=ax, color='black', alpha=0.5, linewidth=3)
data = np.sum(np.array(max_sharpe_weights).T*np.array(assets), axis=1)

max_sharpe_portfolio = pd.DataFrame(data/data[0],index=assets.index,columns=['MC Sharpe'])
a = max_sharpe_portfolio.plot(ax=ax, color='blue', linewidth=3)

scipy.optimize

Scipy has a method optimize alowing constraints and bounds.

We choose a method Sequential Least SQuares Programming (SLSQP)

  • approximate lagrangian, constraints and bounds as quadratic functions
  • use current gradient to define search direction
  • find a minimum along the direction using parabola approximation subject to the constraints and bounds
  • repeat the procedure from the minimum It is likely to find only a local minimum rather than global.

It is much faster and direct method (directional minimizations)

In [55]:
import scipy.optimize as sco

def find_max_Sharpe_ratio_portfolio(mean_returns, cov_matrix, risk_free_rate=0.0178, years=1):
    num_assets = len(mean_returns)  
    path = []
    def neg_sharpe_ratio(w):
        p = portfolio_evolution(w, mean_returns, cov_matrix, risk_free_rate)
        path.append(np.array(p))
        return -p[2]
        
    w_init = np.random.random(num_assets)
    w_init /= np.sum(w_init)

    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bounds = tuple( (0,1) for asset in range(num_assets))
    opts = sco.minimize(neg_sharpe_ratio, w_init, 
                        method='SLSQP', bounds=bounds, constraints=constraints)
    #print(opts)
    w = opts['x']
 
    res =  pd.DataFrame(data=w, index=cov_matrix.columns, columns=['weights'])
    return np.array(path), portfolio_evolution(w, mean_returns, cov_matrix, risk_free_rate), res
In [56]:
path_scipy, best_sharpe_res, best_sharpe_portfolio = find_max_Sharpe_ratio_portfolio(returns_mean, returns_cov)

print("optimal portfolio:")
print("Max Sharpe ratio   : ", best_sharpe_res[2])
print("Expected return    : ", best_sharpe_res[1])
print("Standard deviation : ", best_sharpe_res[0])

best_sharpe_portfolio.T
optimal portfolio:
Max Sharpe ratio   :  3.2031223117021956
Expected return    :  0.3646126545177254
Standard deviation :  0.10827330984230292
Out[56]:
GLD GOOG FB BP TSLA AAPL
weights 0.580339 2.172566e-16 0.075982 0.0 2.593955e-16 0.343679
In [57]:
plt.plot(path[:, 0], path[:, 1], label='path of annealing')
plt.plot(path_scipy[:, 0], path_scipy[:, 1], label='path of SLSQP')
plt.xlabel('annualised volatility')
plt.ylabel('annualised returns')
plt.plot(sigma, expected_return, 'black', label='efficient frontier')
a = plt.legend(labelspacing=0.8)
In [58]:
best_sharpe_weights = best_sharpe_portfolio[0:assets.shape[1]]
w = np.array(best_sharpe_weights)[:,0]
data = np.sum(w*np.array(assets), axis=1)
final_portfolio =  pd.DataFrame(data/data[0], index=assets.index,columns=['best Sharpe'])
In [59]:
ax = normalized_assets.plot(alpha=0.3)
min_volatility_portfolio.plot(ax=ax, color='black', alpha=0.7)
final_portfolio.plot(ax=ax, color='red')
a = max_sharpe_portfolio.plot(ax=ax, color='blue', alpha=0.3, linewidth=4)
In [60]:
fp = final_portfolio.pct_change()
Data = np.array([ [best_sharpe_res[1], np.float64(fp.mean())*252], 
                  [best_sharpe_res[0], np.float64(fp.std())*np.sqrt(252)] ] )
comparison = pd.DataFrame( data=Data, 
                           index=['Expected return', 'Standard deviation'], 
                           columns=["Theoretically optimized", 'Optimized portfolio'])
#comparison.insert()
comparison.insert(loc=2, value=np.abs(Data[:,0]-Data[:,1]), column='Abs Error')
comparison.insert(loc=3, value=100*comparison.iloc[:,2]/comparison.iloc[:,0], column='Rel Error (%)')
In [61]:
comparison
Out[61]:
Theoretically optimized Optimized portfolio Abs Error Rel Error (%)
Expected return 0.364613 0.404361 0.039748 10.901467
Standard deviation 0.108273 0.123383 0.015110 13.955382
In [62]:
print(f"Estimated returns and covariance matrix\nare affected by error: {100/np.sqrt(252):.2f}%")
Estimated returns and covariance matrix
are affected by error: 6.30%
In [63]:
#start, end = datetime.datetime(2016, 12, 30), datetime.datetime(2019, 12, 30)
#assets = pd.DataFrame([web.DataReader(ticker, 'yahoo', start, 
#                     end).loc[:, 'Adj Close'] for ticker in tickers],
#                     index=tickers).T.fillna(method='ffill')
In [64]:
normalized_assets = assets/assets.iloc[0, :]
a = normalized_assets.plot()
In [65]:
best_sharpe_weights = best_sharpe_portfolio[0:assets.shape[1]]
w = np.array(best_sharpe_weights)[:,0]
data = np.sum(w*np.array(assets), axis=1)
final_portfolio =  pd.DataFrame(data/data[0], index=assets.index,columns=['best Sharpe'])
In [66]:
ax = normalized_assets.plot(alpha=0.3)
final_portfolio.plot(ax=ax, color='red')
Out[66]:
<matplotlib.axes._subplots.AxesSubplot at 0x7ff640b5cac8>