fHMM

👉 Fitting (hierarchical) hidden Markov models (H)HMMs to financial data.

💬 Found a bug? Request a feature? Please tell us!

Table of contents

  1. Getting started
  2. Data
  3. Specifying controls
  4. Parameter structures
  5. Outputs
  6. Debugging
  7. Examples

Getting started

Specify the model's controls and execute fit_hmm(controls). See below for examples.

Data

Download daily prices of your preferred stock from https://finance.yahoo.com/ via

download_data(name,symbol,from,to,path)

where

  • name is your personal identifier for the stock,
  • symbol is the stock's symbol,
  • from and to define the time interval (in format "YYYY-MM-DD"),
  • path is the path where the data gets saved.
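
A minimal sketch of such a call with an explicit time interval, assuming the code has been loaded via source("load_code.R") as in the examples below. The identifier "dax", the symbol "^GDAXI" and the dates are taken from the examples in this document; the exists() guard is an addition that lets the snippet run even when download_data is not loaded.

```r
# Time interval in the required "YYYY-MM-DD" format
from <- "2000-01-03"
to   <- "2020-12-30"

# as.Date() returns NA for strings that do not match the given format
from_date <- as.Date(from, format = "%Y-%m-%d")
to_date   <- as.Date(to, format = "%Y-%m-%d")
stopifnot(!is.na(from_date), !is.na(to_date), from_date < to_date)

# Only call download_data() if it has been loaded
if (exists("download_data")) {
  download_data(name = "dax", symbol = "^GDAXI", from = from, to = to, path = ".")
}
```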

Historical events can be highlighted in the visualization of the decoded, empirical time series by passing a named list events with elements dates (a vector of dates) and names (a vector of names for the events) to fit_hmm.

If you do not specify the source parameter in the model's controls, data is simulated. To set the model coefficients yourself, pass the list sim_par in thetaList format to fit_hmm; otherwise, the parameters are drawn randomly from -1 to 1. Setting scale_par = c(x,y) in controls scales these values by x on the coarse scale and by y on the fine scale.

Specifying controls

Specify your model by setting the following parameters of the named list controls and passing it to fit_hmm:

  • path: A character, setting the path of the data and the model results.
  • id: A character, identifying the model.
  • states: A numeric vector of length two, determining the model type and the number of states:
    • If states = c(x,0), a HMM with x states is estimated.
    • If states = c(x,y), a HHMM with x coarse-scale and y fine-scale states is estimated.
  • sdds: A character vector of length two, specifying the state-dependent distributions for both scales:
    • "t", the t-distribution,
    • "t(x)", the t-distribution with x fixed degrees of freedom (x = Inf is allowed),
    • "gamma", the gamma distribution.
  • horizon: A vector of length 2, determining the length of the time horizon(s). The first entry is numeric and mandatory if data is simulated. The second entry is mandatory if the model is a HHMM and can be either numeric or one of
    • "w" for weekly fine-scale chunks,
    • "m" for monthly fine-scale chunks,
    • "q" for quarterly fine-scale chunks,
    • "y" for yearly fine-scale chunks.
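
As an illustration of how these mandatory parameters combine, the following sketch sets up controls for a HHMM on simulated data. The id and the horizon values are hypothetical, and reading the numeric second horizon entry as the fine-scale chunk length is an assumption based on the description above.

```r
# Controls for a HHMM with 2 coarse-scale and 2 fine-scale states,
# fitted to simulated data (no data source is specified)
controls <- list(
  path    = ".",
  id      = "HHMM_2_2_sim",    # hypothetical identifier
  states  = c(2, 2),           # c(x,y) with y > 0 selects a HHMM
  sdds    = c("t", "t(5)"),    # t-distribution; t-distribution with 5 fixed degrees of freedom
  horizon = c(200, 30)         # numeric entries for both scales (hypothetical values)
)

# With the code loaded, the model would then be fitted via
# fit_hmm(controls)
```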

The following parameters are optional and set to default values if you do not specify them:

  • data: A list, containing
    • source: A character vector of length 2, containing the file names of the empirical data:
      • If source = c(NA,NA), data is simulated.
      • If source = c("x",NA), data "x.csv" in folder "path/data" is modeled by a HMM.
      • If source = c("x","y"), data "x.csv" on the coarse scale (type determined by cs_type) and data "y.csv" on the fine scale, both in folder "path/data", are modeled by a HHMM.
    • col: A character vector of length 2, containing the names of the desired column of source for both scales.
    • truncate: A vector of length 2, containing lower and upper date limits (each in format "YYYY-MM-DD") to select a subset of the empirical data (neither, one or both limits can be specified).
    • cs_type: A character, determining the type of empirical coarse-scale data in HHMMs, one of
      • "mean": means of the fine-scale data,
      • "mean_abs": means of the fine-scale data in absolute value,
      • "sum_abs": sums of fine-scale data in absolute value.
  • fit: A list, containing
    • runs: A numeric value, setting the number of optimization runs.
    • at_true: A boolean, determining whether the optimization is initialised at the true parameter values. Only available for simulated data; if TRUE, runs = 1 and accept = "all" are set.
    • seed: A numeric value, setting a seed for the simulation and the optimization.
    • accept: Either a numeric vector (containing acceptable exit codes of the nlm optimization) or the character "all" (accepting all codes).
    • print.level: Passed on to nlm.
    • gradtol: Passed on to nlm.
    • stepmax: Passed on to nlm.
    • steptol: Passed on to nlm.
    • iterlim: Passed on to nlm.
    • scale_par: A positive numeric vector of length two, scaling the model parameters in a simulation on the coarse and fine scale, respectively.
  • results: A list, containing
    • overwrite: A boolean, determining whether overwriting of existing results (on the same id) is allowed. Set to TRUE if id = "test".
    • ci_level: A numeric value between 0 and 1, setting the confidence interval level.
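
The three cs_type options listed above can be illustrated with a small sketch that aggregates fine-scale chunks into coarse-scale observations. The helper aggregate_chunks and the data are hypothetical and not part of the code base. The sketch reads "in absolute value" as taking abs() of each fine-scale observation before averaging or summing, which is an interpretation of the description.

```r
# Hypothetical fine-scale data: 3 chunks with 4 observations each
fine  <- c(0.5, -1.0, 0.2, 0.3,
           -0.4, 0.1, 0.6, -0.3,
           1.2, -0.8, 0.0, 0.4)
chunk <- rep(1:3, each = 4)

# Hypothetical helper: one coarse-scale value per fine-scale chunk
aggregate_chunks <- function(x, chunk, cs_type) {
  f <- switch(cs_type,
    mean     = function(v) mean(v),
    mean_abs = function(v) mean(abs(v)),
    sum_abs  = function(v) sum(abs(v)),
    stop("unknown cs_type: ", cs_type)
  )
  # tapply() applies f to each chunk; as.vector() drops the chunk labels
  as.vector(tapply(x, chunk, f))
}

coarse_mean     <- aggregate_chunks(fine, chunk, "mean")
coarse_mean_abs <- aggregate_chunks(fine, chunk, "mean_abs")
coarse_sum_abs  <- aggregate_chunks(fine, chunk, "sum_abs")
```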

Default values

  • accept = c(1,2)
  • at_true = FALSE
  • ci_level = 0.95
  • col = c(NA,NA)
  • cs_type = NA
  • gradtol = 1e-3
  • iterlim = 200
  • overwrite = FALSE
  • print.level = 0
  • runs = 100
  • scale_par = c(1,1)
  • seed is not set
  • source = c(NA,NA)
  • stepmax = 1
  • steptol = 1e-3
  • truncate = c(NA,NA)
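
Any of these defaults can be overridden by adding the corresponding entry to the sublists data, fit or results of controls. The sketch below is one possible configuration; the id and the chosen values are hypothetical.

```r
# Controls for a 2-state HMM on simulated data with some defaults overridden
controls <- list(
  path    = ".",
  id      = "HMM_2_sim_t",        # hypothetical identifier
  states  = c(2, 0),
  sdds    = c("t", NA),
  horizon = c(1000, NA),
  fit     = list(
    runs    = 50,                 # default is 100
    seed    = 1,                  # not set by default
    accept  = c(1, 2, 3),         # default is c(1,2)
    iterlim = 500                 # default is 200, passed on to nlm
  ),
  results = list(
    ci_level = 0.9                # default is 0.95
  )
)

# With the code loaded, the model would then be fitted via
# fit_hmm(controls)
```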

Parameter structures

Four types of model parameters are estimated:

  1. off-diagonal elements (column-wise) gammas of transition probability matrices Gamma,
  2. expected values mus,
  3. standard deviations sigmas,
  4. degrees of freedom dfs.

All of these parameters have to fulfill constraints. Constrained parameters get the suffix Con, unconstrained parameters the suffix Uncon. Fine-scale parameters additionally get the suffix _star. Internally, collections of model parameters are processed using the following structures:

  • thetaFull: A named list of all unconstrained model parameters.
  • thetaUncon: A vector of all unconstrained model parameters to be estimated (in the above order).
  • thetaCon: Constrained elements of thetaUncon.
  • thetaUnconSplit: thetaUncon split by fine-scale models.
  • thetaListOrdered: thetaList in ordered form with respect to estimated expected values.

The code provides functions to transform these structures. Their names follow the logic x2y, where x and y are two structures.
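
To illustrate the idea behind constrained and unconstrained parameters, the following sketch shows one common way to map unconstrained values to valid parameters: exp() for standard deviations and a row-wise normalisation for transition probabilities. The function names are hypothetical and only mimic the x2y naming logic; the transformations actually used in the code may differ.

```r
# Hypothetical helper: unconstrained standard deviations -> positive values
sigmasUncon2sigmasCon_sketch <- function(sigmasUncon) {
  exp(sigmasUncon)
}

# Hypothetical helper: unconstrained off-diagonal elements (column-wise)
# -> transition probability matrix with positive entries and rows summing to 1
gammasUncon2Gamma_sketch <- function(gammasUncon, n_states) {
  stopifnot(length(gammasUncon) == n_states * (n_states - 1))
  Gamma <- diag(n_states)
  # Fill the off-diagonal positions column-wise with positive values
  Gamma[!Gamma] <- exp(gammasUncon)
  # Divide each row by its sum
  Gamma / rowSums(Gamma)
}

Gamma  <- gammasUncon2Gamma_sketch(c(-2, -1), n_states = 2)
sigmas <- sigmasUncon2sigmasCon_sketch(c(-0.5, 0.3))
```

Optimising over the unconstrained values lets an unconstrained optimiser such as nlm be used while the resulting parameters still satisfy their constraints.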

Outputs

The following model results are saved in the folder path/models/id (path and id specified in controls):

  • estimates.txt: Containing the model's likelihood value, AIC and BIC values, exit code, number of iterations, estimated and true parameters (only for simulated data), relative bias (only for simulated data) and confidence intervals.
  • protocol.txt: Containing a protocol of the estimation.
  • states.txt: Containing frequencies of the decoded states and (only for simulated data) a comparison between the true states and the predicted states.
  • log_likelihoods.pdf: A visualization of the log-likelihood values in the different estimation runs.
  • pseudo_residuals.pdf: A visualization of the pseudo-residuals along with a Jarque–Bera test result on their normality.
  • state_dependent_distributions.pdf: A visualization of the estimated state-dependent distributions along with (in case of simulated data) the true state-dependent distributions.
  • decoded_time_series.pdf: A visualization of the decoded time series with (in case of empirical data) markings for the entries in events.
  • controls.rds, data.rds, decoding.rds, events.rds, fit.rds and pseudos.rds: Restore the fitting steps.
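
The .rds files can be read back into R with readRDS. The sketch below builds the path from the folder layout described above, using path and id from the DAX example in this document; the structure of the stored object is not documented here, so it is only inspected with str.

```r
# path and id as specified in controls
path <- "."
id   <- "HMM_3_DAX"

# Results are saved in the folder path/models/id
model_dir <- file.path(path, "models", id)
fit_file  <- file.path(model_dir, "fit.rds")

# Read the stored fit only if the model has been estimated before
if (file.exists(fit_file)) {
  fit <- readRDS(fit_file)
  str(fit, max.level = 1)
}
```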

Debugging

Some error or warning messages provide exception codes. Calling exception(code) yields suggestions for debugging.

Examples

Fitting a 2-state HMM to simulated data using gamma-distributions

### Initialize code
source("load_code.R")

### Set and check controls
controls = list(
  path    = ".",
  id      = "HMM_2_sim_gamma",
  sdds    = c("gamma",NA),
  states  = c(2,0),
  horizon = c(5000,NA),
  fit     = list("seed" = 1)
)

### Fit (H)HMM
fit_hmm(controls)

Fitting a 3-state HMM to the DAX closing prices from 2000 to 2020 using t-distributions

### initialize code
source("load_code.R")

### download data (optional)
download_data("dax","^GDAXI",path=".")

### set and check controls
controls = list(
  path    = ".",
  id      = "HMM_3_DAX",
  states  = c(3,0),
  sdds    = c("t",NA),
  data    = list("source" = c("dax",NA), "col" = c("Close",NA), "truncate" = c("2000-01-03","2020-12-30"))
)

### define events (optional)
events = list(
  dates = c("2001-09-11","2008-09-15","2020-01-27"),
  names = c("9/11 terrorist attack","Bankruptcy of Lehman Brothers","First COVID-19 case in Germany")
)

### fit (H)HMM
fit_hmm(controls,events)

Fitting a (2,2)-state HHMM jointly to the DAX and the VW stock

### initialize code
source("load_code.R")

### download data (optional)
download_data("dax","^GDAXI",path=".")
download_data("vw","VOW3.DE",path=".")

### set and check controls
controls = list(
  path    = ".",
  id      = "HHMM_2_2_DAX_VW_gamma_t",
  states  = c(2,2),
  sdds    = c("gamma","t"),
  horizon = c(NA,"m"),
  data    = list("source" = c("dax","vw"), "col" = c("Close","Close"), "cs_type" = "mean_abs")
)

### fit (H)HMM
fit_hmm(controls)
