Numerical Methods

Lecture 9: Time series analysis cont., Linear algebra introduction

by: Tomasz Romańczukiewicz


Outline

  • Statistical analysis (Risk measure, risk at value)
  • Wavelet transform
  • Solving linear algebraic equation
In [6]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  
plt.rcParams['figure.dpi'] = 100
plt.rcParams['figure.figsize'] = [10,5]
plt.rcParams['axes.grid'] = True
#import quandl as qdl
from pandas.plotting import register_matplotlib_converters
#plt.rcParams.keys()
import datetime
import pandas_datareader.data as web
import pandas as pd
In [5]:
start, end = datetime.datetime(1971, 12, 30), datetime.datetime(2019, 12, 5)

#tickers = ["^DJI", "^IXIC", "^GSPC", "^STOXX50E", "^N225", "^GDAXI"]
tickers = ["^IXIC", "^GSPC","^N225"]

asset_universe = pd.DataFrame([web.DataReader(ticker, 'yahoo', start, 
                     end).loc[:, 'Adj Close'] for ticker in tickers],
                     index=tickers).T.fillna(method='ffill')

asset_universe = asset_universe/asset_universe.iloc[0, :]
In [8]:
a = plt.plot(asset_universe)
plt.legend(tickers)
Out[8]:
<matplotlib.legend.Legend at 0x7f6fb61a0978>

Returns: $$r_i=\frac{p_i-p_j}{p_j}\,,\qquad j=i-1$$ Usually the prices are log-normally distributed Log returns: $$1+r_i=\frac{p_i}{p_j}=\exp\left[\log\left(\frac{p_i}{p_j}\right)\right]$$ Accumulation: $$(1+r_1)(1+r_2)(1+r_3)\ldots=\prod_{i=1}(1+r_i)$$ But better: $$\log(1+r_1)+\log(1+r_2)+\log(1+r_3)\ldots=\sum_{i=1}\log(1+r_i)$$

  • Log returns should accumulate to normal distribution (central limit theorem)

  • $\log(1+r_i)\approx r_i$ for $|r_i|\ll1$

  • Better to compare relative values (returns) rather the absolute values (prices)

  • Multiplications of numbers close to 1 is worse numerically than addition of smal numbers

In [9]:
def show_assets(n):       
    data = asset_universe[tickers[n]]
    time = data.index[1:]
    df = np.diff(data)
    log_returns = np.log(1+np.diff(data)/data[0:-1])
    var = np.var(df)
    mean = np.mean(df)
    print(f'analysis for {tickers[n]}: {mean:.6f} ± {np.sqrt(var):.6f}') 
    
    plt.figure(figsize=(10, 6), dpi=100)
    plt.subplot(2, 2, 1)
    plt.plot(time, df, 'b')
    
    plt.subplot(2, 2, 2)    
    plt.hist(df, bins=np.linspace(-0.25, 0.25, 50), density='True', color='C1')
    t = np.linspace(-0.25, 0.25, 500)
    y = 1/np.sqrt(2*np.pi*var)*np.exp(-(t-mean)**2/(2*var))
    plt.plot(t, y, 'black')
        
    plt.subplot(2, 2, 3)
    plt.plot(time, log_returns, 'g')
    
    plt.subplot(2, 2, 4)    
    plt.hist(log_returns, bins=np.linspace(-0.05, 0.05, 50), density='True', color='C3')
    var = np.var(log_returns)
    mean = np.mean(log_returns)
    t = np.linspace(-0.05, 0.05, 500)
    y = 1/np.sqrt(2*np.pi*var)*np.exp(-(t-mean)**2/(2*var))
    plt.plot(t, y, 'black')
    plt.show()
    print(log_returns.describe())
    
In [11]:
show_assets(0)    
analysis for ^IXIC: 0.006010 ± 0.288082
count    12457.000000
mean         0.000348
std          0.012160
min         -0.120432
25%         -0.004232
50%          0.000732
75%          0.005818
max          0.132546
Name: ^IXIC, dtype: float64
In [12]:
def power_spectrum(signal, dt):
    N = len(signal)
    fft = np.fft.rfft(signal)
    power_spect = np.abs(fft)*2/N
    power_spect[0] /= 2
    fft_freqs = np.linspace(0, np.pi/(dt), len(power_spect), endpoint=False)
    return fft_freqs, power_spect
In [14]:
def Fouriers(n):
    plt.subplot(2, 1, 1)
    data = asset_universe[tickers[n]]
    freqs, Y_fft =  power_spectrum(data, 1)  
    plt.semilogy(freqs, Y_fft)
    plt.loglog(freqs, Y_fft)
    plt.subplot(2, 1, 2)
    log_returns = np.log(1+np.diff(data)/data[0:-1])
    freqs, Y_fft =  power_spectrum(log_returns, 1)  
    plt.semilogy(freqs, Y_fft, color='C2')

Fouriers(0)    
In [15]:
def Spectrograms(n):
    plt.subplot(2, 2, 1)
    data = asset_universe[tickers[n]]
    plt.plot(data)
    plt.subplot(2, 2, 3)
    powerSpectrum, freqenciesFound, time, imageAxis = plt.specgram(data, Fs=2*np.pi, noverlap=196)
    log_returns = np.log(1+np.diff(data)/data[0:-1])
    plt.subplot(2, 2, 2)
    plt.plot(log_returns, color='r')
    plt.subplot(2, 2, 4)
    powerSpectrum, freqenciesFound, time, imageAxis = plt.specgram(log_returns, Fs=2*np.pi, noverlap=196) 
In [16]:
 Spectrograms(0)

Wavelet transform

  • Integrate to zero: $$\int_{-\infty}^\infty\psi(t)dt=0$$
  • Normalization $$\int_{-\infty}^\infty\left|\psi(t)\right|^2dt=1$$
  • Scalability $$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\psi\left(\frac{t-b}{a}\right)$$ where $a$ - dilation (scale), $b$ - shift
In [17]:
import pywt
print(pywt.families(short=False))
['Haar', 'Daubechies', 'Symlets', 'Coiflets', 'Biorthogonal', 'Reverse biorthogonal', 'Discrete Meyer (FIR Approximation)', 'Gaussian', 'Mexican hat wavelet', 'Morlet wavelet', 'Complex Gaussian wavelets', 'Shannon wavelets', 'Frequency B-Spline wavelets', 'Complex Morlet wavelets']
In [18]:
wavlist = pywt.wavelist()
print(*wavlist)
bior1.1 bior1.3 bior1.5 bior2.2 bior2.4 bior2.6 bior2.8 bior3.1 bior3.3 bior3.5 bior3.7 bior3.9 bior4.4 bior5.5 bior6.8 cgau1 cgau2 cgau3 cgau4 cgau5 cgau6 cgau7 cgau8 cmor coif1 coif2 coif3 coif4 coif5 coif6 coif7 coif8 coif9 coif10 coif11 coif12 coif13 coif14 coif15 coif16 coif17 db1 db2 db3 db4 db5 db6 db7 db8 db9 db10 db11 db12 db13 db14 db15 db16 db17 db18 db19 db20 db21 db22 db23 db24 db25 db26 db27 db28 db29 db30 db31 db32 db33 db34 db35 db36 db37 db38 dmey fbsp gaus1 gaus2 gaus3 gaus4 gaus5 gaus6 gaus7 gaus8 haar mexh morl rbio1.1 rbio1.3 rbio1.5 rbio2.2 rbio2.4 rbio2.6 rbio2.8 rbio3.1 rbio3.3 rbio3.5 rbio3.7 rbio3.9 rbio4.4 rbio5.5 rbio6.8 shan sym2 sym3 sym4 sym5 sym6 sym7 sym8 sym9 sym10 sym11 sym12 sym13 sym14 sym15 sym16 sym17 sym18 sym19 sym20
In [19]:
def show_wavelets():
    plt.figure(figsize=[12, 4], dpi=90)
    wavlist = pywt.wavelist(kind='continuous')
    for n, wavlet_type in enumerate(wavlist[-7:-1]):
        plt.subplot(2,3, n+1)
        wavelet = pywt.ContinuousWavelet(wavlet_type)
        psi, x = wavelet.wavefun(level=10)
        plt.plot(x,psi, color='C'+str(n))
        plt.legend([wavlet_type])

Example wavelets

Continuous wavelets

  • Mexican hat 'mexh'(Rickon wavelet): $\displaystyle \psi(t) = \frac{2}{\sqrt{3} \sqrt[4]{\pi}} \exp^{-\frac{t^2}{2}}\left( 1 - t^2 \right)$

  • Morlet wavelet 'morl': $\displaystyle \psi(t) = \exp^{-\frac{t^2}{2}} \cos(5t)$ - Fourier transform with gaussian window

  • Gaussian Derivative Wavelets 'gausn' $\displaystyle \psi(t) = \frac{d^n}{dx^n}C_n e^{-t^2}$, where $C_n$ is an appropriete constant
In [20]:
print(*pywt.wavelist(kind='continuous'))
cgau1 cgau2 cgau3 cgau4 cgau5 cgau6 cgau7 cgau8 cmor fbsp gaus1 gaus2 gaus3 gaus4 gaus5 gaus6 gaus7 gaus8 mexh morl shan
In [21]:
show_wavelets()
In [22]:
print(*pywt.wavelist(kind='discrete'))
bior1.1 bior1.3 bior1.5 bior2.2 bior2.4 bior2.6 bior2.8 bior3.1 bior3.3 bior3.5 bior3.7 bior3.9 bior4.4 bior5.5 bior6.8 coif1 coif2 coif3 coif4 coif5 coif6 coif7 coif8 coif9 coif10 coif11 coif12 coif13 coif14 coif15 coif16 coif17 db1 db2 db3 db4 db5 db6 db7 db8 db9 db10 db11 db12 db13 db14 db15 db16 db17 db18 db19 db20 db21 db22 db23 db24 db25 db26 db27 db28 db29 db30 db31 db32 db33 db34 db35 db36 db37 db38 dmey haar rbio1.1 rbio1.3 rbio1.5 rbio2.2 rbio2.4 rbio2.6 rbio2.8 rbio3.1 rbio3.3 rbio3.5 rbio3.7 rbio3.9 rbio4.4 rbio5.5 rbio6.8 sym2 sym3 sym4 sym5 sym6 sym7 sym8 sym9 sym10 sym11 sym12 sym13 sym14 sym15 sym16 sym17 sym18 sym19 sym20
In [23]:
def show_wavelets_discrete():
    plt.figure(figsize=[12, 4], dpi=90)
    wavlist = pywt.wavelist(kind='discrete')
    for n, wavlet_type in enumerate(['db1' , 'db2', 'db6', 'coif1', 'haar', 'sym4']):
        plt.subplot(2,3, n+1)
        wavelet = pywt.Wavelet(wavlet_type)
        phi, psi, x = wavelet.wavefun(level=9)
        plt.plot(x,psi, color='C'+str(2*n))
        plt.plot(x,phi, color='C'+str(2*n+1))
        plt.legend([wavlet_type + ' $\phi$', wavlet_type + ' $\psi$'])
In [24]:
show_wavelets_discrete()

Daubechies wavelets 'db*'

  • maximal number of vanishing moments for some given support.
$${\displaystyle \mu _{n}=\int _{-\infty }^{\infty }(x-c)^{n}\,f(x)\,\mathrm {d} x.}$$
In [25]:
def show_signal():
    x = np.linspace(0, 1, 1024)    
    y = np.empty_like(x)
    y = np.sin(16*x*(1+4*x))
    
    #plt.subplot(211)
    plt.xlim(0, 1)
    plt.plot(x,y)
    #plt.subplot(212)
    plt.ylim(-2.5, 1)
    wavelet = pywt.ContinuousWavelet('mexh')
    psi, x = wavelet.wavefun(level=10)
    plt.plot((x)/16+0.075, psi-1.8, 'C1')
    plt.plot((x)/32+0.36, psi-1.8, 'C2')
    plt.plot((x)/64+0.836, psi-1.8, 'C3')
    plt.xlim(0, 1)  
In [26]:
show_signal()  
  • Continuous wavelet transform (nice plots) $${\displaystyle X(a,b)={\frac {1}{\sqrt{a}}}\int _{-\infty }^{\infty }x(t){\overline {\psi }}\left({\frac {t-b}{a}}\right)\,dt}$$

  • Usually (for discrete wavelet transform) $\Delta t\Delta \omega>\pi$ $$\psi_{m,n}(t) = \frac{1}{\sqrt{2^m}}\psi\left(\frac{t-n2^m}{2^m}\right)$$

$$X_{m,n}=\sum_tx(t)\psi_{m,n}(t)$$

Analysis comparison

In [27]:
import pywt
import numpy as np
import matplotlib.pyplot as plt
nsamples =1024
fig, ax = plt.subplots(1,1)
x = np.linspace(0, 2*np.pi, nsamples, endpoint=False)
y = np.sin(4*x*(1+x)) + np.sin(4*x) + np.random.random(x.size)

nscales=9*nsamples/16
n = np.arange(0, nscales)
scales = 2**(n/nscales * np.log2(nsamples))
coef, freqs=pywt.cwt(y, scales, 'morl', sampling_period=1)
plt.imshow(coef, extent=[0, 2*np.pi, 0, nscales], cmap='jet', aspect='auto')

fr = np.array([1, 4, 10, 20, 50, 100, 200, 500])
n = nscales*np.log2(fr)/np.log2(nsamples)
ax.set_yticks(n)
ax.set_yticklabels(fr)

plt.show() 
In [28]:
def mywavletplot(sig, T, Vmin=None, Vmax=None, labels=False, wavelet_type='mexh'):
    nsamples = len(sig)    
    nscales=9*nsamples/16
    n = np.arange(1, nscales)
    scales = 2**(n/nscales * np.log2(nsamples))
    coef, freqs=pywt.cwt(sig, scales, wavelet_type)
    freqs /= T
#    ax = plt.subplots(2,2)    
    plt.subplot(2,2,1)
    t = np.linspace(0, T, nsamples)
    plt.plot(t, sig)
    #plt.matshow(coef) # doctest: +SKIP
    if labels==True:
        fr = np.array([1, 4, 10, 20, 50, 100, 200, 500])
        n = nscales*np.log2(fr)/np.log2(nsamples)
    
    ax = plt.subplot(2,2,3)
    plt.imshow(coef, extent=[0, T, 1, nscales], cmap='jet', aspect='auto', vmin=Vmin, vmax=Vmax)
    if labels==True:
        ax.set_yticks(n)
        ax.set_yticklabels(fr)
    ax = plt.subplot(2,2,2)
    coef2 = coef**2
    plt.imshow(coef2, extent=[0, T, 1, nscales], cmap='viridis', aspect='auto')
    if labels==True:
        ax.set_yticks(n)
        ax.set_yticklabels(fr)
    
    #plt.colorbar()
    
    ff = np.log(1/scales)# 2*np.pi/freqs/nwidths
    ax = plt.subplot(2,2,4)
    max = np.max(coef2)
    levels = [0.0, 0.1, 0.25, 0.5, 0.8, 1]
    contour = plt.contour(t, ff,  coef2/max, levels, colors='k')
    contour_filled = plt.contourf(t, ff, coef2/max, levels, cmap='jet')
    
    plt.show() 
In [29]:
mywavletplot(y, 2*np.pi, wavelet_type = 'morl', labels=True)
In [30]:
sig2 = asset_universe[tickers[1]][::40]
len(sig2)
mywavletplot(sig2, 1)
In [31]:
data = asset_universe[tickers[1]][::40]
log_returns = np.log(1+np.diff(data)/data[0:-1])
mywavletplot(log_returns, 1)
In [32]:
data = asset_universe[tickers[2]][::40]
log_returns = np.log(1+np.diff(data)/data[0:-1])
mywavletplot(log_returns, 1)
In [33]:
def example_wavelet_smooth(data, smooth_level=7, max_level=9):
    wavelet_type='db6'
    DWTcoeffs = pywt.wavedec(data,wavelet_type,mode='symmetric', level=max_level, axis=-1)

    for wave in DWTcoeffs[-smooth_level:]:
        wave[:] = np.zeros_like(wave)

    filtered_data_dwt=pywt.waverec(DWTcoeffs,wavelet_type,mode='symmetric',axis=-1) 

    plt.subplot(2, 1, 1)
    plt.plot(data,label='original data', color='blue')
    plt.legend(['original'])
    plt.subplot(2, 1, 2)
    plt.plot(data.index, filtered_data_dwt, color='red', label='filtered data (DWT)')
    plt.legend(['filtered (DWT)'])
    #asset_universe[tickers[1]].describe()
In [34]:
data = asset_universe[tickers[0]]
example_wavelet_smooth(data, smooth_level=7)
In [38]:
def example_wavelet_components(data, max_level=9):
    wavelet_type='db6'
    DWTcoeffs = pywt.wavedec(data,wavelet_type,mode='symmetric', level=max_level, axis=-1)
    plt.figure(figsize=[12, 4], dpi=90)
    plt.subplot(2, 3, 1)
    plt.plot(data,label='original data', color='blue')
    plt.legend(['original'])     
    for n, component in enumerate(list(range(4))+[max_level]):        
        plt.subplot(2, 3, 2+n)    
        Newcoeffs = DWTcoeffs.copy()
        for lev in range(max_level+1):
            if(lev!=component):
                Newcoeffs[lev] = np.zeros_like(Newcoeffs[lev])

        filtered_data_dwt=pywt.waverec(Newcoeffs, wavelet_type, mode='symmetric',axis=-1) 
        plt.plot(data.index, filtered_data_dwt, color='C'+str(component+1), label='filtered data (DWT)')
        plt.legend([f'component {component}'])
        print(f" component {component} has a time scale length {DWTcoeffs[component].size} DT = {len(data)/DWTcoeffs[component].size:.1f}")
In [40]:
example_wavelet_components(data, max_level=12)   
 component 0 has a time scale length 14 DT = 889.9
 component 1 has a time scale length 14 DT = 889.9
 component 2 has a time scale length 17 DT = 732.8
 component 3 has a time scale length 23 DT = 541.7
 component 12 has a time scale length 6234 DT = 2.0
  • Filtered data can be used to
    • fit or
    • aproximate the trends (do not extrapolate further then one step foreward)
    • construct a dynamical model
  • DWT can be inverted and extended outside the original domain
  • Noiseless correlation can be found between two or more assets
  • Causality can be found (what causes what)

Tasks:

  1. Use different wavelets to analyze signals, observe how the frequency is changed.
  2. Add appropriate labels to CWT plots of the NasdaQ data
  3. Use different contour levels / palettes to extract information from CWT plots of the NasdaQ data. What kind of informations can be extracted.
  4. Find a wavelet correlation $Corr(t, \omega)=CWT_1(t,\omega)*CWT_2(t,\omega)$ between log returns of "^IXIC" and "^GSPC"

Linear algebra

  • rarely used direclty but
  • solving nonlinear problems (Newton method uses linear algebraic equations (LEA) to step),
  • minimization in multidimentions (often requires LAE),
    • in finance portfolio theory
  • system of ODEs (implicit method require solving (usually nonlinear) AE, but Newton ...),
  • evolution PDEs can be written as system of ODEs,
  • two point boundary problem for ODEs can be rewritten as a huge nonlinear or LAE,
  • multidimensional boundary problems,
  • formally approximation theory is solving LAE
    • Lagrange, Fourier, wavelets etc (although use dedicated algorithms)
    • splines (use LAE almost directly)
  • fitting - overdetermined system (sometimes even used SVD)

Preliminaries

Very often we need to solve the algebraic system of equations: $$A\vec x=\vec b\qquad A\in\mathbb{R}^{M\times N},\; \vec x\in \mathbb{R}^N,\; \vec b\in \mathbb{R}^M\qquad M,\; N\in \mathbb{N}$$ Unless stated otherwise we will focus on square matrices $M=N$.

If $A$ is invertible the system always has a unique solution which can be found via Gauss method.
Unfortunatelly Gauss ellimination method requires lots of subtractions which can lead to catastrophic cancellations.
Fortunatelly there exists a standard method which, in principle, should always give a solution if $A$ is invertible. The method is called $LU$ decomposition, wich in some sense is similar to Gauss ellimination method but numeically more stable.

Basic idea:

We rewrite the matrix $A=LU$, where $L$ and $U$ are lower and upper triangular matrices.

$LU$ decomposition is quite time $\color{red}{\mathcal{O}(N^3)}$ and memory $\mathcal{O}(N^2)$ consuming.
Therefore for certain problems (special symmetries of $A$ or other properties) it is always crucial to use other methods:

  • Cholesky decomposition $A=LL^T$ for symmetric (hermitian) positively defined matrices
    twice as fast as $LU$, half memory usage but still time complexity is $\color{red}{\mathcal{O}(N^3)}$
  • Vandermonde matrix $$A=\left({\begin{matrix}1&x_{0}&x_{0}^{2}&\cdots &x_{0}^{n-1}\\ 1&x_{1}&x_{1}^{2}&\cdots &x_{1}^{n-1}\\ \vdots &\vdots &\vdots &\ddots &\vdots \\ 1&x_{n-1}&x_{n-1}^{2}&\cdots &x_{n-1}^{n-1} \end{matrix}}\right),\qquad A_{ij}a_j=y_i $$ used for example in polynomial approximation $$y(x_i)=a_0+a_1x_i+a_2x_i^2+a_3x_i^3+\cdots+a_{n-1}x_i^{n-1}\qquad i=0,\ldots,n-1$$ can be solve in time $\color{red}{\mathcal{O}(N^2)}$
  • Special case of Vandermonde matrix for $x_k=e^{2\pi i/n}$ represents the Fourier Transform, which usin Fast Fourier Transform (FTT) can be solved within $\color{red}{\mathcal{O}(N\log N)}$.
    • Two point value problem $y''+L(x)y=f(x)$ with conditions $y(x_0)=a, y(x_n)=b$ can be rewriten as linear algebraic problem using differentiation matrix which can be tri- or five-diagonal. Such problem can be solved within $\color{red}{\mathcal{O}(N)}$.
    • Sometimes the size of the matrix can be huge $N\sim10^6$ or more especially in multidimensional PDE (Laplace eq.) $N=N_1\times N_2\times N_3$, each of $N_i$ can be of order $1000$. Full $LU$ would not stand a chance.
    • Iterative methods can also be used (give approximated solutions, but much simpler and sometimes faster)
    • For non-square, ill-conditioned, nearliy singular or singular matrices $LU$ is useless and more complicated methods using Singular Value Decomposition have to be used.

LU decomposition

We rewrite $A=LU$ (or more precisely $A=PUL$ or $A=LUP$, where $P$ is a perturbation matrix), where where $L$ and $U$ are lower and upper triangular matrices. Why?
$$Ax=b\;\Rightarrow\;LUx=b\;\Rightarrow\;\begin{cases} Ly&=b\\ Ux&=y \end{cases}$$ Instead of a single equation we have two equations, but equations wich are easier to solve.

Forward substitution $$\begin{bmatrix}l_{11}&0&0\\l_{21}&l_{22}&0\\l_{31}&l_{32}&l_{33}\end{bmatrix} \begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}= \begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} $$

$$y_1=\frac{b_1}{l_{11}},\qquad y_2=\frac{b_2-l_{21}y_1}{l_{22}},\qquad y_3=\frac{b_3-l_{31}y_2-l_{32}y_2}{l_{33}}$$

Backward substitution $$ \begin{bmatrix}u_{11}&u_{12}&u_{13}\\0&u_{22}&u_{23}\\0&0&u_{33}\end{bmatrix} \begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}= \begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix} $$

$$x_3=\frac{y_3}{u_{33}},\qquad x_2=\frac{y_2-u_{23}x_3}{u_{22}},\qquad x_1=\frac{y_1-u_{12}x_2-u_{13}x_3}{u_{11}}$$

Note that diagnal elements $u_{ii}$ $l_{ii}$ cannot be equal to $0$.
The solution of $Ly=b$ and $Ux=y$ costs $\mathcal{O}(N^2)$ calculations.

How to decompose $A$ into a product of two matrices?
Is it unique? No, there are two basic algorythms

  • Doolittle algorithm ($l_{ii}=1$) $(A=PLU)$
  • Crout algorithm ($u_{ii}=1$) $(A=LUP)$

$$\begin{bmatrix} a_{11}&a_{12}&\cdots &a_{1n}\\a_{21}&a_{22}&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{n1}&a_{n2}&\cdots &a_{nn}\end{bmatrix}= \begin{bmatrix}1&0&\cdots &0\\l_{21}&1&\cdots &0\\\vdots &\vdots &\ddots &0\\l_{n1}&l_{n2}&\cdots &1\end{bmatrix}\cdot \begin{bmatrix}u_{11}&u_{12}&\cdots &u_{1n}\\0&u_{22}&\cdots &u_{2n}\\\vdots &\vdots &\ddots &\vdots \\0&0&\cdots &u_{nn}\end{bmatrix}$$

  • $u_{11}=a_{11}$

  • $\displaystyle l_{21}u_{11}=a_{21}\;\Rightarrow l_{21}=\frac{a_{21}}{\color{red}{u_{11}}}$

  • $u_{12}=a_{12}$
  • $\displaystyle l_{21}u_{12}+u_{22}=a_{22}\;\Rightarrow u_{22}=a_{22}-l_{21}u_{12}$

  • $\displaystyle l_{31}u_{11}=a_{31}\;\Rightarrow l_{31}=\frac{a_{31}}{\color{red}{u_{11}}}$
  • $\displaystyle l_{31}u_{12} + l_{32}u_{22}=a_{32}\;\Rightarrow l_{32}=\frac{a_{32}-l_{31}u_{12}}{\color{red}{u_{22}}}$

In general $$ u_{ij}=a_{ij}-\sum _{k=1}^{i-1}l_{ik}u_{kj}\qquad i\leq j\leq n$$

$$ l_{ji}={\frac {1}{u_{ii}}}\left(a_{ji}-\sum _{k=1}^{i-1}l_{jk}u_{ki}\right) \qquad i< j\leq n$$

We have to find $N^2$ elements.
Calculation of those elements require sums of $N/2$ elements in average, hence the complexity $\mathcal{O}(N^3)$.

Problem: we have no guarantee that $u_{ii}\neq 0$. The algorithm has to be modified (pivoting partial (sufficient) or full): we exchange the current row with the row which has in the current column the largest modulus.
Row replecements can be represented as multiploication by a perturbation matrix $P$ which has $N$ elements equal to 1 and $N(N-1)$ equal to zero (never store the full matrix nor use full multiplcation).

  • Pivoting is crucial for stability.
  • $LU$ can be done in place.
  • Matrices $L$ and $U$ can be used to solve more equations $A\vec x_k=\vec b_k$.
  • Never find the full $A^{-1}$ unless it is somehow justified.
  • Sometimes (especially for banded matrices, such as a tridiagonal matrix) both $L$ and $U$ can have only small number of non-zero elements. In such a case a full decomposition would be a waste of computational time and storing full matrices a waste of memory (and time as well).
  • The method cannot be fully vectorized (each step uses previously found elements only sums can be done as vector multiplications) so implementing in Python would be very inefficient.
In [41]:
import pprint as pp
import scipy as sc
import numpy as np
import scipy.linalg as lg  # SciPy Linear Algebra Library
def lin_alg_example():
    A = np.matrix(np.random.rand(3,3))
    b = np.matrix(np.random.rand(3,1))
    P, L, U = lg.lu(A)
    print ("A:");  pp.pprint(A)
    print ("b:");  pp.pprint(b)
    print ("P:");  pp.pprint(P)
    print ("L:");  pp.pprint(L)
    print ("U:");  pp.pprint(U)
    return A, b
In [42]:
A, b = lin_alg_example()
A:
matrix([[0.24622229, 0.66019264, 0.50104887],
        [0.74238384, 0.85870772, 0.90909758],
        [0.34239606, 0.43959473, 0.51293085]])
b:
matrix([[0.13232172],
        [0.73894646],
        [0.80489429]])
P:
array([[0., 1., 0.],
       [1., 0., 0.],
       [0., 0., 1.]])
L:
array([[1.        , 0.        , 0.        ],
       [0.3316644 , 1.        , 0.        ],
       [0.46121163, 0.11600936, 1.        ]])
U:
array([[0.74238384, 0.85870772, 0.90909758],
       [0.        , 0.37538985, 0.19953356],
       [0.        , 0.        , 0.07049671]])
In [43]:
x = lg.solve(A, b)
print ("x:") ; pp.pprint(x)
print ("test:") ; pp.pprint(A*x)
x:
array([[-2.7842889 ],
       [-3.89814929],
       [ 6.76861162]])
test:
matrix([[0.13232172],
        [0.73894646],
        [0.80489429]])
In [75]:
def tridiagExample(N, printout=True):
    A=np.matrix(np.zeros([N,N]))
    for n in range(N):
        A[n,n] = -2
    for n in range(N-1):
        A[n+1,n] = A[n,n+1] = 1
    if printout==True:
        print(A)
        P, L, U = lg.lu(A)
        print ("A:");  pp.pprint(A)
        # print ("P:");  pp.pprint(P)
        print ("nonzero elements of L:");  pp.pprint(np.int8((L!=0)))
        print ("nonzero elements of U:");  pp.pprint(np.int8((U!=0)))
    return A
In [76]:
tridiagExample(5)
A=tridiagExample(100,False)
[[-2.  1.  0.  0.  0.]
 [ 1. -2.  1.  0.  0.]
 [ 0.  1. -2.  1.  0.]
 [ 0.  0.  1. -2.  1.]
 [ 0.  0.  0.  1. -2.]]
A:
matrix([[-2.,  1.,  0.,  0.,  0.],
        [ 1., -2.,  1.,  0.,  0.],
        [ 0.,  1., -2.,  1.,  0.],
        [ 0.,  0.,  1., -2.,  1.],
        [ 0.,  0.,  0.,  1., -2.]])
nonzero elements of L:
array([[1, 0, 0, 0, 0],
       [1, 1, 0, 0, 0],
       [0, 1, 1, 0, 0],
       [0, 0, 1, 1, 0],
       [0, 0, 0, 1, 1]], dtype=int8)
nonzero elements of U:
array([[1, 1, 0, 0, 0],
       [0, 1, 1, 0, 0],
       [0, 0, 1, 1, 0],
       [0, 0, 0, 1, 1],
       [0, 0, 0, 0, 1]], dtype=int8)

Example: two pint value linear problem

$$y''=6x\,,\qquad y(-1)=0\,,\qquad y(1)=2$$

Discretization gives $$y_n''\approx \frac{y_{n-1}-2y_n+y_{n+1}}{h^2}\,,\qquad x_n =-1+hn\,,\qquad h = \frac{2}{N-1}=\text{const}$$

Tridiagoanal matrix with equations elementsfor $0<n<N-1$

$$D_{n*}=\left(0, 0, \ldots, \frac{1}{h^2}, -\frac{2}{h^2}, \frac{1}{h^2}, 0, \ldots, 0, 0\right)\,,\quad b_n=6x_n$$

plus two more (boundary) equations

$$D_{0, *}=(1, 0, 0,\ldots)\,,\qquad b_0=0$$$$D_{N-1, *}=(\ldots, 0, 0, 1)\,,\qquad b_{N-1}=2$$
In [136]:
import scipy.sparse as sp  # provides sparse matrices
import scipy.sparse.linalg as sl

def init_vectors(N, y1, y2):
    x = np.linspace(-1, 1, N);  # grid points
    b = 6.*x;  # example function 
    h = x[1]-x[0]               # grid spaceing
    u = np.ones(N)/(h*h)  
    data = np.array([u, -2*u, u])  # array Nx3 to fill the matrix    

    data[1,0] =  data[1,-1] = 1                   # elements for boundary conditions        
    data[2,1] =  data[0, -2] = 0
    b[0] = y1
    b[-1] = y2
    diags = np.array([-1, 0, 1])           # which diagonals to fill 
    D_sparse = sp.spdiags(data, diags, N, N, format='csc')      # sparse 5-diagonal matrix 

    return h, x, b, D_sparse, data
In [137]:
import time
N=3000
h, x, b, D2, data = init_vectors(N, -1, 2)
A = D2.todense()
t = np.zeros(4)
t[0] = time.time() ; z1 = np.linalg.solve(A, b) ;  
t[1] = time.time() ; z2 = sl.spsolve(D2, b);
t[2] = time.time() ; z3 = lg.solve_banded((1, 1), data[-1::-1], b);  t[3] = time.time()
In [178]:
T = np.diff(t)
index = ['full solve', 'sparse', 'banded']
columns = ['time [ms]', 'relative time']
time_data = np.array([1000*T, T/np.min(T)]).transpose()
pandata = pd.DataFrame(time_data, index=index, columns=columns)

styles = [dict(selector="tr", props=[("font-size", "200%")]) ]
pandata.style.set_table_styles(styles).format({'time [ms]': "{: 10.3f}", 'relative time': '{: 10.2f}'})
Out[178]:
time [ms] relative time
full solve 211.785 459.07
sparse 1.885 4.09
banded 0.461 1.00
In [135]:
plt.plot(x,z2)
plt.plot(x,z3)

plt.show()
In [88]:
N=20
h, x, b, D2, data = init_vectors(N, 2, -1)
A = D2.todense()
z2 = sl.spsolve(D2, b)
print(z2)
print(D2*z2-b)
p = (b[0]+b[-1])/2
s = p-1-b[0]
z3 = x**3+s*x+p
print(D2*z3-b)
print(A[-1, :])
[ 2.          2.02055693  1.98162998  1.89021723  1.75331681  1.57792681
  1.37104534  1.13967051  0.89080041  0.63143315  0.36856685  0.10919959
 -0.13967051 -0.37104534 -0.57792681 -0.75331681 -0.89021723 -0.98162998
 -1.02055693 -1.        ]
[-3.77475828e-15 -1.06581410e-14  7.10542736e-15 -2.66453526e-15
 -1.33226763e-14  4.44089210e-15 -5.77315973e-15 -1.99840144e-15
  8.88178420e-15 -1.55431223e-15  2.22044605e-15 -1.77635684e-15
  2.66453526e-15  0.00000000e+00 -3.99680289e-15 -8.88178420e-16
  3.55271368e-15 -6.21724894e-15 -3.55271368e-15  0.00000000e+00]
[ 0.00000000e+00 -1.06581410e-14  7.10542736e-15 -2.66453526e-15
  1.50990331e-14 -2.39808173e-14  8.43769499e-15  1.22124533e-14
 -1.24344979e-14 -1.55431223e-15 -1.73194792e-14  4.26325641e-14
  2.66453526e-15 -4.97379915e-14  3.86357613e-14 -8.88178420e-16
  3.55271368e-15 -6.21724894e-15  1.06581410e-14  0.00000000e+00]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]

Tasks

  1. Solve the two-point value problem
    $$y''=6x\,,\qquad y'(-1)=0\,,\qquad y(1)=2$$
  2. Improve the example from solution by using $\mathcal{O}(h^4)$

  3. Solve the two-point boundary problem for $L\in \{1, \pi, 2\pi\}$:

    $$y''+ y = \cos(2t),\qquad y(0)=1\, \quad y(L)=2$$
  4. Solve the two-point boundary problem
    $$y_1''+ y_1 + y_2 = \cos(2t),\qquad y_1(0)=1\, \quad y_1(5)=0$$
    $$y_2''+ \frac{1}{2}y_1 - \cos(3t)y_2 = 0,\qquad y_2(0)=1\, \quad y_2(5)=0$$

Larger project examples:

  1. Using the real-world data compare three different approaches to Black-Scholes equation:
    • analytical solution
    • Monte Carlo simulation
    • diffusion equation solve the BS system for different (non-analytic) options using two methods and compare.
  2. Using (many 20-100) real-world data analyse the tripple crossover method
    • what choice of time scales gives best results
    • create a portfolio and test some joint strategies
  3. Using different smoothing technics for part of the historic data construct a dynamical model (not yet covered) and analyse its performance based on the rest of the data.
  4. More to come
  • However, students are encouraged to find their own problems, best if using as many technics as possible.
  • Larger, more ambitious projects can be made by groups up to three students
  • Last week students will have oportunity to present their prohects and discuss with the group (this will not be the last chance to get credits)
  • Alternatively, students can make a series of smaller problems related to the whole material throughout the semester
In [ ]: