fHMM

👉 Fitting (hierarchical) hidden Markov models (H)HMMs to financial data.

💬 Found a bug? Request a feature? Please tell us!

Getting started
Data
Specifying controls
Parameter structures
Outputs
Debugging
Examples

Getting started

Specify the model's controls and execute fit_hmm(controls). See below for examples.

Data

Download daily prices of your preferred stock from https://finance.yahoo.com/ via

download_data(name,symbol,from,to,path)

where

name is your personal identifier for the stock,
symbol is the stock's symbol,
from and to define the time interval (in format "YYYY-MM-DD"),
path is the path where the data gets saved.

Historical events can be highlighted in the visualization of the decoded, empirical time series by passing a named list events with elements dates (a vector of dates) and names (a vector of names for the events) to fit_hmm.

If you do not specify the source parameter in the model's controls, data is simulated. Specify the model coefficients by passing the list sim_par in thetaList format to fit_hmm; otherwise, the parameters are drawn randomly from the interval from -1 to 1. Setting scale_par = c(x,y) in controls scales these values by x on the coarse scale and by y on the fine scale.
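
To make concrete what simulating from a HMM involves, here is a minimal base R sketch that draws a 2-state Markov chain and gamma-distributed observations. It is not the package's simulation routine: the helper simulate_hmm, its arguments, and the parameterisation of the gamma distribution by mean and standard deviation are assumptions made for illustration.

```r
# Minimal sketch (not the package's code): simulate a HMM with gamma
# state-dependent distributions. `simulate_hmm` is a hypothetical helper.
simulate_hmm <- function(n_obs, Gamma, mus, sigmas, seed = 1) {
  set.seed(seed)
  N <- nrow(Gamma)
  # Stationary distribution: solves delta %*% Gamma = delta with sum(delta) = 1
  delta <- solve(t(diag(N) - Gamma + 1), rep(1, N))
  states <- numeric(n_obs)
  states[1] <- sample(N, 1, prob = delta)
  for (t in seq_len(n_obs)[-1]) {
    states[t] <- sample(N, 1, prob = Gamma[states[t - 1], ])
  }
  # Gamma distribution with mean mu and standard deviation sigma
  shape <- mus^2 / sigmas^2
  rate  <- mus / sigmas^2
  obs <- rgamma(n_obs, shape = shape[states], rate = rate[states])
  list(states = states, obs = obs, delta = delta)
}

# Illustrative parameter values (assumptions, not package defaults)
Gamma <- matrix(c(0.95, 0.05,
                  0.10, 0.90), nrow = 2, byrow = TRUE)
sim <- simulate_hmm(n_obs = 5000, Gamma = Gamma,
                    mus = c(0.5, 2), sigmas = c(0.2, 0.8))
table(sim$states)   # how often each state was visited
```

In this sketch the chain starts in its stationary distribution, which is one common choice; the package may initialise differently.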

Specifying controls

Specify your model by setting the following parameters of the named list controls and passing it to fit_hmm:

path: A character, setting the path of the data and the model results.
id: A character, identifying the model.
states: A numeric vector of length two, determining the model type and the number of states:
- If states = c(x,0), a HMM with x states is estimated.
- If states = c(x,y), a HHMM with x coarse-scale and y fine-scale states is estimated.
sdds: A character vector of length two, specifying the state-dependent distributions for both scales:
- "t", the t-distribution,
- "t(x)", the t-distribution with x fixed degrees of freedom (x = Inf is allowed),
- "gamma", the gamma distribution.
horizon: A vector of length 2, determining the length of the time horizon(s). The first entry is numeric and mandatory if data is simulated. The second entry is mandatory if the model is a HHMM and can be either numeric or one of
- "w" for weekly fine-scale chunks,
- "m" for monthly fine-scale chunks,
- "q" for quarterly fine-scale chunks,
- "y" for yearly fine-scale chunks.
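
The calendar-based horizon options can be pictured as grouping daily dates into chunks. The following sketch shows one way to do that in base R; chunk_ids is a hypothetical helper and not part of the package, which may define the chunks differently (for example around weekends and year boundaries).

```r
# Sketch: assign each daily date to a fine-scale chunk by calendar period.
# `chunk_ids` is a hypothetical helper, not a package function.
chunk_ids <- function(dates, horizon = c("w", "m", "q", "y")) {
  horizon <- match.arg(horizon)
  dates <- as.Date(dates)
  switch(horizon,
    w = format(dates, "%Y-%U"),                       # year and week of the year
    m = format(dates, "%Y-%m"),                       # year and month
    q = paste(format(dates, "%Y"), quarters(dates)),  # year and quarter
    y = format(dates, "%Y")                           # year
  )
}

dates <- seq(as.Date("2020-01-01"), as.Date("2020-03-31"), by = "day")
ids <- chunk_ids(dates, "m")
table(ids)   # number of days in each monthly chunk
```

Note that with "%U" a week that spans a year boundary is split into two chunks, which may or may not be what you want.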

The following parameters are optional and set to default values if you do not specify them:

data: A list, containing
- source: A character vector of length 2, containing the file names of the empirical data:
  - If source = c(NA,NA), data is simulated.
  - If source = c("x",NA), data "x.csv" in folder "path/data" is modeled by a HMM.
  - If source = c("x","y"), data "x.csv" on the coarse scale (type determined by cs_type) and data "y.csv" on the fine scale, both in folder "path/data", are modeled by a HHMM.
- col: A character vector of length 2, containing the names of the desired column of source for both scales.
- truncate: A vector of length 2, containing lower and upper date limits (each in format "YYYY-MM-DD") to select a subset of the empirical data (neither, one or both limits can be specified).
- cs_type: A character, determining the type of empirical coarse-scale data in HHMMs, one of
  - "mean": means of the fine-scale data,
  - "mean_abs": means of the fine-scale data in absolute value,
  - "sum_abs": sums of fine-scale data in absolute value.
fit: A list, containing
- runs: A numeric value, setting the number of optimization runs.
- at_true: A boolean, determining whether the optimization is initialised at the true parameter values. Only for simulated data, sets runs = 1 and accept = "all".
- seed: A numeric value, setting a seed for the simulation and the optimization.
- accept: Either a numeric vector (containing acceptable exit codes of the nlm optimization) or the character "all" (accepting all codes).
- print.level: Passed on to nlm.
- gradtol: Passed on to nlm.
- stepmax: Passed on to nlm.
- steptol: Passed on to nlm.
- iterlim: Passed on to nlm.
- scale_par: A positive numeric vector of length two, scaling the model parameters in a simulation on the coarse and fine scale, respectively.
results: A list, containing
- overwrite: A boolean, determining whether overwriting of existing results (on the same id) is allowed. Set to TRUE if id = "test".
- ci_level: A numeric value between 0 and 1, setting the confidence interval level.
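
The three cs_type options describe how coarse-scale observations are derived from chunks of fine-scale data. A minimal sketch of that aggregation follows; aggregate_cs and the example values are assumptions for illustration, and the package may compute these quantities differently.

```r
# Sketch: derive one coarse-scale observation per fine-scale chunk.
# `aggregate_cs` is a hypothetical helper, not a package function.
aggregate_cs <- function(fs_data, chunk,
                         cs_type = c("mean", "mean_abs", "sum_abs")) {
  cs_type <- match.arg(cs_type)
  fun <- switch(cs_type,
    mean     = function(x) mean(x),
    mean_abs = function(x) mean(abs(x)),
    sum_abs  = function(x) sum(abs(x))
  )
  tapply(fs_data, chunk, fun)
}

# Illustrative fine-scale values and their chunk labels
fs_data <- c(0.01, -0.02, 0.03, -0.01, 0.02, -0.04)
chunk   <- c("2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-02")
aggregate_cs(fs_data, chunk, "mean_abs")
```

The two absolute-value variants yield non-negative coarse-scale data, which is what a gamma state-dependent distribution requires.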

Default values

accept = c(1,2)
at_true = FALSE
ci_level = 0.95
col = c(NA,NA)
cs_type = NA
gradtol = 1e-3
iterlim = 200
overwrite = FALSE
print.level = 0
runs = 100
scale_par = c(1,1)
seed is not set
source = c(NA,NA)
stepmax = 1
steptol = 1e-3
truncate = c(NA,NA)
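
As an illustration of how user-supplied controls and default values can be combined, the sketch below merges a partial controls list into a list holding a subset of the defaults above, using base R's modifyList. The helper set_defaults is hypothetical; the package performs its own checks when fit_hmm is called.

```r
# Sketch: fill unspecified optional controls with default values.
# `set_defaults` is hypothetical and covers only a subset of the defaults.
defaults <- list(
  data    = list(source = c(NA, NA), col = c(NA, NA),
                 truncate = c(NA, NA), cs_type = NA),
  fit     = list(runs = 100, at_true = FALSE, accept = c(1, 2),
                 gradtol = 1e-3, stepmax = 1, steptol = 1e-3,
                 iterlim = 200, scale_par = c(1, 1)),
  results = list(overwrite = FALSE, ci_level = 0.95)
)

# modifyList works recursively on nested lists, so a partial `fit` list
# only overrides the entries it names.
set_defaults <- function(controls) modifyList(defaults, controls)

controls <- set_defaults(list(
  id     = "HMM_2_sim_gamma",
  states = c(2, 0),
  fit    = list(runs = 50, seed = 1)
))
controls$fit$runs      # user-supplied value
controls$fit$iterlim   # default value
```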

Parameter structures

Four types of model parameters are estimated:

non-diagonal elements (column-wise) gammas of transition probability matrices Gamma,
expected values mus,
standard deviations sigmas,
degrees of freedom dfs.

All of these parameters have to fulfill constraints. Constrained parameters get the suffix Con, unconstrained parameters the suffix Uncon. Fine-scale parameters additionally get the suffix _star. Internally, collections of model parameters are processed using the following structures:

thetaFull: A named list of all unconstrained model parameters.
thetaUncon: A vector of all unconstrained model parameters to be estimated (in the above order).
thetaCon: Constrained elements of thetaUncon.
thetaUnconSplit: thetaUncon split by fine-scale models.
thetaListOrdered: thetaList in ordered form with respect to estimated expected values.

The code provides functions to transform these structures. Their names follow the logic x2y, where x and y are two structures.
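
To show what a transformation between constrained and unconstrained parameters can look like, here is a sketch using two widely used links: the logarithm for standard deviations and a multinomial logit for the off-diagonal transition probabilities. The function names follow the x2y logic but are hypothetical, and the choice of links is an assumption; the package's actual transformations may differ.

```r
# Sketch of constraint transformations (assumed, not taken from the package).

# Standard deviations must be positive: log link and its inverse.
sigmaCon2sigmaUncon <- function(sigmaCon) log(sigmaCon)
sigmaUncon2sigmaCon <- function(sigmaUncon) exp(sigmaUncon)

# Off-diagonal transition probabilities (column-wise): multinomial logit link.
gammasUncon2Gamma <- function(gammasUncon, N) {
  Gamma <- diag(N)
  Gamma[!Gamma] <- exp(gammasUncon)   # fill off-diagonal entries column-wise
  Gamma / rowSums(Gamma)              # normalise so that each row sums to one
}
Gamma2gammasUncon <- function(Gamma) {
  G <- Gamma / diag(Gamma)            # divide each row by its diagonal entry
  log(G[!diag(nrow(Gamma))])          # off-diagonal entries, column-wise
}

# Round trip for a 2-state model
gammasUncon <- c(-2, -1.5)
Gamma <- gammasUncon2Gamma(gammasUncon, N = 2)
rowSums(Gamma)
Gamma2gammasUncon(Gamma)
```

Working on the unconstrained scale lets an unconstrained optimizer such as nlm search freely while the back-transformed values always satisfy the constraints.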

Outputs

The following model results are saved in the folder path/models/id (path and id specified in controls):

estimates.txt: Containing the model's likelihood value, AIC and BIC values, exit code, number of iterations, estimated and true parameters (only for simulated data), relative bias (only for simulated data) and confidence intervals.
protocol.txt: Containing a protocol of the estimation.
states.txt: Containing frequencies of the decoded states and (only for simulated data) a comparison between the true states and the predicted states.
log_likelihoods.pdf: A visualization of the log-likelihood values in the different estimation runs.
pseudo_residuals.pdf: A visualization of the pseudo-residuals along with a Jarque–Bera test result on their normality.
state_dependent_distributions.pdf: A visualization of the estimated state-dependent distributions along with (in case of simulated data) the true state-dependent distributions.
decoded_time_series.pdf: A visualization of the decoded time series with (in case of empirical data) markings for the entries in events.
controls.rds, data.rds, decoding.rds, events.rds, fit.rds and pseudos.rds: Restore the fitting steps.
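
The saved .rds files can be read back with base R's readRDS. The sketch below collects whichever of them exist into a named list; load_results is a hypothetical helper, and the folder layout path/models/id is taken from the description above.

```r
# Sketch: load the saved .rds objects of a fitted model into a named list.
# `load_results` is a hypothetical helper, not a package function.
load_results <- function(path, id) {
  dir <- file.path(path, "models", id)
  objects <- c("controls", "data", "decoding", "events", "fit", "pseudos")
  files <- file.path(dir, paste0(objects, ".rds"))
  names(files) <- objects
  files <- files[file.exists(files)]   # skip objects that were not saved
  lapply(files, readRDS)
}

# Usage (assumes a model with this id has been fitted in the current folder):
# res <- load_results(".", "HMM_2_sim_gamma")
# str(res$fit)
```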

Debugging

Some error or warning messages provide exception codes. Calling exception(code) yields suggestions for debugging.

Examples

Fitting a 2-state HMM to simulated data using gamma-distributions

### Initialize code
source("load_code.R")

### Set and check controls
controls = list(
  path    = ".",
  id      = "HMM_2_sim_gamma",
  sdds    = c("gamma",NA),
  states  = c(2,0),
  horizon = c(5000,NA),
  fit     = list("seed" = 1)
)

### Fit (H)HMM
fit_hmm(controls)

Fitting a 3-state HMM to the DAX closing prices from 2000 to 2020 using t-distributions

### initialize code
source("load_code.R")

### download data (optional)
download_data("dax","^GDAXI",path=".")

### set and check controls
controls = list(
  path    = ".",
  id      = "HMM_3_DAX",
  states  = c(3,0),
  sdds    = c("t",NA),
  data    = list("source" = c("dax",NA), "col" = c("Close",NA), "truncate" = c("2000-01-03","2020-12-30"))
)

### define events (optional)
events = list(
  dates = c("2001-09-11","2008-09-15","2020-01-27"),
  names = c("9/11 terrorist attack","Bankruptcy of Lehman Brothers","First COVID-19 case in Germany")
)

### fit (H)HMM
fit_hmm(controls,events)

Fitting a (2,2)-state HHMM jointly to the DAX and the VW stock

### initialize code
source("load_code.R")

### download data (optional)
download_data("dax","^GDAXI",path=".")
download_data("vw","VOW3.DE",path=".")

### set and check controls
controls = list(
  path    = ".",
  id      = "HHMM_2_2_DAX_VW_gamma_t",
  states  = c(2,2),
  sdds    = c("gamma","t"),
  horizon = c(NA,"m"),
  data    = list("source" = c("dax","vw"), "col" = c("Close","Close"), "cs_type" = "mean_abs")
)

### fit (H)HMM
fit_hmm(controls)
