mpiktas / midasr Goto Github PK

View Code? Open in Web Editor NEW

69.0 69.0 33.0 1.16 MB

R package for mixed frequency time series data analysis.

Home Page: http://mpiktas.github.io/midasr/

License: Other

R 100.00%

midasr's People

Contributors

Stargazers

Watchers

midasr's Issues

Normalized beta fuction

Maybe I have understood the normalized beta function that is being modeled in the package but I cannot seem to understand why it is using three paramters. The beta weighttning function used by Ghysels, Santa-Clara and Valkanov (2004, 2005 and 2006) only use two parameters. This is the ordinary normalized beta function built from standard gamma function. Is this specification also included in the package and am I perhaps missing something?

Thanks for a great package!

Weekly Vars to forecast Monthly var

Hi, Lets say if I have a monthly variable to nowcast/forecast using 4 weekly variables.
I want to estimate the monthly value for Aug 2017. I have the weekly values of all the 4 weekly vars till Sep 1.

Could you tell me how to set up the midas eqn.
Lets say, if the monthly variable depends on its first own lag and past 6 weeks . Assuming I have the value till end of August for the weekly variables, does the following eqn make sense:
beta0 <- midas_r(y ~ mls(y, 1, 1) + mls(x, 0:6, 4, nbeta),

start = list(xx = c(1.7, 1, 5)))
I have few other doubts, could you tell me how to get the forecasted values for y for the 4 weeks of
August. Also there are times, where there will be 5 weekly values in a month. How to tackle this.
Also if I would like to automate this, how do I pass the starting lag (0 here ) as a variable.

Sorry for the long post.
Thanks

Make model selection work with AR* terms

Since AR* changes other terms, it would be necessary to reevaluate the model frame for each lag and weight combination.

Make intelligent selection of default kmin

Now kmin is taken to be zero, or the minimum lag supplied by mls. Increase that minimum to the number of parameters of weight function plus one.

Base functions icwtab and iclagtab on function icwlagtab.

Pick the number of lags or weights from the formula and then pass it to the icwlagtab function.
Remove unnecessary printing functions.

Allow make_ic_table to accept the output of itself as an input

Rename ghysels lag to multiplicative MIDAS

Make expand_weights_lags work with weigths "*" and ""

Only need to adjust starting values

Implement support for factors in formula and a*b type formulas

Need a way to guess the final formula from terms, to know exactly how many variables appear after expansion. Currently I know only of ugly hack of matching text, which is not very elegant or robust.

Add support for negative lags in mls function

Would be helpful to form forecasting models of the type

mls(y,-1,1)~y+fmls(x,11,12,nealmon)

Expand Stochastic Optimization Methods

Simulated Annealing or Metropolis derived methods shouldn't be too hard to implement. I'm aware you can use Ofunction = "optim", method = "SANN" but in my testing that yields a radically different result every time I run it due a lack of customization options. Would you consider expanding SO in the future?

which is the parameters of nealmon in midas_r ?

Hello, I'm sorry to that I'm not good at english. Thank you your work for the midas_r package,it's very usefull for me. But there are some problem I can't understand need your hlep.
First, I don't know which is the parameters of nealmon in the function midas_r(). like the code in the user_guide:
eq.r <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z,0:16, 12, nealmon), start = list(x = c(1, -0.5), z = c(2,0.5, -0.1)))
In this code ,which is the parameters of function nealmon use?
Second, aslo in the previous code, the lag K is 7 and 16, but I don't understand why the results of summary(eq.r) only 3 variables as follow:

Formula y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, nealmon)

Parameters:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 1.988196 0.115299 17.24 < 2e-16 ***

trend 0.099883 0.000777 128.57 < 2e-16 ***

x1 1.353343 0.151220 8.95 < 2e-16 ***

x2 -0.507566 0.096670 -5.25 3.3e-07 ***

z1 2.263473 0.172815 13.10 < 2e-16 ***

z2 0.409653 0.155685 2.63 0.00905 **

z3 -0.072979 0.020392 -3.58 0.00042 ***

---## Signif. codes: 0 '**' 0.001 '__' 0.01 '' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.932 on 242 degrees of freedom

Add support for both restricted and unrestricted model fitting

At the current moment if embedlf is in the formula, it must have a restriction function. Is there a need for fitting model where we fit both unrestricted and unrestricted lags together?

Gradient parameter

The example of hAhr.test should be updated in some way. Seems that it should be user.gradient = FALSE instead of gradient = "default" in line 686 according to the comment in line 688. However, hAhr.test(mr) in line 684 does not use gradient function either.

How to deal with Monthly data with Daily regressors?

Hi, There.

Unlike weekly or monthly data usually have a fixed 'm' as 7, or 12, the daily data can be influenced by workday and holidays. I've now been facing with a problem that I have to make monthly data such as PMIs as the dependent variate, and, a daily based data such as trading price as the independent variates. How do I resample the daily variables in order to use the MIDAS regression function?

Passing starting values to AR* term

Now it is not clear how to pass the starting values to AR* term. It is possible to pass to AR. The general rule that starting values should have the same names as the terms names, with exclusion of MIDAS terms with restrictions.

Make table in make_ic_table to have additional element

Having the functions expand_lag_weights and expand_ghysels to make tables for make_ic_table it makes much more sense to generate starting values also.

Pass m to weight function

It seems that having m in the function is a bonus. You can always ignore it if you do not need it.

Forecast dates with no new data incorrect?

Hello,

The following is a Midas regression that has no new data as per p.27 of the midasr user guide. This example code is largely from the example in the forecast function.

To the best of my understanding, the forecasted time references are incorrect.

I've set the lags back 3 horizon's worth of 12 higher frequency lags (3*12) with the understanding that together with the forecast function I am forecasting doing one step ahead forecasts three times using data up to Yt-3 to forecast until Yt. Essentially, forecasting Yt-2, Yt-1, and Yt. Note that the Midas user guide only does 1 horizon with no new data.

In the terms of this code, Yt is 2011.

Because of my forecast setup, I expected the dates of the forecasted values to be 2010, 2011, and 2012. They're not. They're 2012, 2013, 2014. From checking the algebraic equation on p. 27 in the Midas User Guide and their subsequent example I can't find any reason why the dates would be 2012, 2013, 2014.

By setting my high-frequency lags back by the suggested (frq*h) amount and receiving 2012-20134 forecasts, wouldn't this infer I'm using Yt-3 to forecast Yt+1, Yt+2, and Yt+3 rather than Yt-2, Yt-1, and Yt?

It's very possible I'm misunderstanding something. Ultimately I need to know how to forecast a Q4 and a Q1 variable with data up to Q3 and an AR element (dynamic).

    set.seed(1234)
    data("USrealgdp")
    data("USunempr")

    y <- diff(log(USrealgdp))
    x <- window(diff(USunempr), start = 1949)
    trend <- 1:length(y)

    #High Frequency Variable's Frequency
    frq<-12

    ##Forecast horizon
    h <- 3

    ##Declining unemployment
    xn <- rep(-0.1, frq*h)

    ##New trend values
    trendn <- length(y) + 1:h

    ##Dynamic AR* model
    mr.dyn <- midas_r(y ~ trend + mls(y, h+1:2, 1, "*")
                      + mls(x, (h*frq)+0:11, frq, nealmon),
                      start = list(x = rep(0, 3)))
    summary(mr.dyn)

    forecast(mr.dyn, list(trend = trendn, x = rep(NA,frq*h)), method = "dynamic")

Gives the forecast output of

    #Point Forecast
    2012     0.03654885
    2013     0.01834644
    2014     0.01803171

Investigate issues with saving

Saving lots of midasr objects sometimes can be troublesome, since they all contain environments. This can make the .RData file too large.

Get the number of all available points in function prepenv

Now the the number of available points depends on lags in the formula. Pass formula lhs~1 to model.frame to get the full number of observations.

Fix a typo in nealmon man page

theta_h prieš \theta_h

Model selection and forecasting

Hi,

I am new to MIDAS and R. I am trying to understand how select_and_forecast works. From my understanding the command selects the best lag length for the high frequency variable by IC minimization within a class of restrictions and then goes on to test pseudo out of sample for the best selected restrictions. Am I right?

In my case I would like to estimate the following model:

Y (low frequency) would be a monthly variable (with 63 total observations available) and X (high frequency) would be a weekly variable that I am assuming is observed 4 times each time Y is observed.

I read the manual, but I couldn't figure out how incorporate the dependent variable in the model selection process without including the contemporaneous term and take into account the fact that I would need to include the dynamic pseudo out of sample forecasts when computing the accuracy measures.

On a side note, is it possible to calculate truly out of sample forecast averages like in EViews (see here). In my particular case, I need to calculate one step and two step forecasts (observations 64 and 65). Is that possible with average_forecast?

Add data parameter to shorten formulas

In lm you can enter y ~ x + b with all variable being listed in a data.frame which you pass on to lm via the data parameter.
I would like to prepare a list of "mls" matrices and do the same as in lm.

E.g.:
eq <- midas_r(
midasdata$rGDPg2 ~ midasdata$MonMulti,
start=list(MonMulti=c(1,-0.1))
)

where midasdata contains mls matrices produced with:
midasdata[[i]] <- fmls(MonMulti,ratio-1, ratio, nealmon)

eq <- midas_r(
rGDPg2 ~ MonMulti,
start=list(MonMulti=c(1,-0.1)),
data=midasdata
)

Make data argument a list

Instead of having ldata, hdata arguments, let there be one data argument, which is a named list. The list elements should have names, if the element is the data.frame, its name is ignored. In the end all the required named variables will be put in the environment.

Built in function to plot the almon lag polynomial?

I've estimated 2 hyper parameters for an exponential almon lag polynomial, is there any easy way to plot the shape of the function? Thank you.

Write tests for making sure that correct environments are always used

Write up the exact troublesome use cases and write tests for them

Give examples for predict and forecast methods

One dynamic, one static model. Dynamic and static forecasts.

Do not normalize index in nealmon function

Now nealmon has normalized index. Revert back to non-normalized to conform to literature. Fix demos accordingly, or simply define the old function in the demos.

Cannot run midas_r_simple without additional low frequency variables

The documentation suggests this is possible (z=NULL), but I cannot get it to run. I always get the following error:
Error in matrix(z, ncol = 1) :
'data' must be of a vector type, was 'NULL'
Or am I not understanding the documentation correctly?

Write forecasting function for the MIDAS regresion

There is already a function predict. But it does not take into account the lags. For forecasting only new data must be passed. This means that we need to keep the actual data to form necessary lag structures.

Also for AR models we need to iterate.

QAICc

Look at the package AICcmodavg and implement QAIC for midas_r objects.

Quantile regression with MIDAS

Hi authors,
Thank you for your amazing package.
I am just wondering whether you have intentions to develop the MIDAS package with quantile regress application similar to the MIDAS toolbox now? It would be a big push to the package.
Best,
Trung

Write model selection function for Ghysels schema

Ghysels schema changes weight function for each lag. Create dummy functions for each lag, assign them to the environment and call make_ic_table.

Calculation of R squared

When I run a summary of the regression, I get the residual standard errors back, for example, 0.2704. Does this mean the R squared is (1-0.2704) = 0.7296?

Thank you

Make function for choosing starting values

Follow the example of MIDAS toolbox for Matlab.

Fix nbeta function. Add normalisation at the last step.

Understanding the forecasts calculated by `select_and_forecast`

In the user's guide on the pages 23-24 there is a demonstration of the function select_and_forecast.
This function calculates, among other things, forecasts according to supplied specifications. The given example calculates one-step-, two-step- and three-step-ahead out-of-sample forecasts 50 times.

In order to check my understanding, I tried to calculate the forecasts "manually" using the suggested models (every time the first of the suggested models for each horizon).

I manage to get the first values of the forecasts for each horizon:

Preparation of the data for the first forecasts:

yy<-y[1:200]
ttrend<- trend[1:200]
xx<-x[1:800]
zz<-z[1:2400]

Calculate one-step-ahead forecast per hand with cbfc$bestlist[[1]][[1]]:

m<-midas_r(yy ~ ttrend + mls(xx, 4:18, 4, nealmon) + mls(zz, 12:25, 12, nealmon),start=list(xx=rep(1,3),zz=rep(1,3)))
round(forecast(m, newdata = list(xx = rep(NA, 4), zz = rep(NA, 12),ttrend = 201)),8)==round(cbfc$forecasts[[1]]$forecast[1,1],8)
TRUE

Calculate two-step-ahead forecast per hand with cbfc$bestlist[[2]][[1]]:

mm<-midas_r(yy ~ ttrend + mls(xx, 8:21, 4, nealmon) + mls(zz, 24:38, 12, nealmon),start=list(xx=rep(1,3),zz=rep(1,3)))
round(forecast(mm, newdata = list(xx = rep(NA, 4), zz = rep(NA, 12),ttrend = 201)),8)==round(cbfc$forecasts[[2]]$forecast[1,1],8)
TRUE

Calculate three-step-ahead forecast per hand with cbfc$bestlist[[3]][[1]]:

mmm<-midas_r(yy ~ ttrend + mls(xx, 12:25, 4, nealmon) + mls(zz, 36:46, 12, nealmon),start=list(xx=rep(1,3),zz=rep(1,3)))
round(forecast(mmm, newdata = list(xx = rep(NA, 4), zz = rep(NA, 12),ttrend = 201)),8)==round(cbfc$forecasts[[3]]$forecast[1,1],8)
TRUE

Expand the data by one low-frequency period in order to calculate the next forecasts:

yye<-y[1:201]
ttrende<- trend[1:201]
xxe<-x[1:804]
zze<-z[1:2412]

Let's try to calculate the second three-sep-ahead forecast using the expanded data:

mmme<-midas_r(yye ~ ttrende + mls(xxe, 12:25, 4, nealmon) + mls(zze, 36:46, 12, nealmon),start=list(xxe=rep(1,3),zze=rep(1,3)))
forecast(mmme, newdata = list(xxe = rep(NA, 4), zze = rep(NA, 12),ttrende = 202))
21.94055

As can be seen, the result of the last commmand ist: 21.94055 but the function select_and_forecast gives: cbfc$forecasts[[3]]$forecast[2,1] : 21.96188

What am I doing wrong here?

One more question:

In the case of three-step-ahead forecasts, does the function select_and_forecast calculate forecasts for the periods 201-250, or for the periods 203:252, because the first forecast is calculated using the first 200 values of the observations, so that the first three-step-ahead forecast is for the period 203? If the first is true, the forecasting procedure must start by 198, or?

Thank you in advance!

why nbeta works for 4 or more inputs?

Hi. I've read the closed issue about fact that nbeta has 3 parameters. But as I tried for example, nbeta(c(0.3,0.1,0.2,0.4),2), or using midas_r with a starting value of length 4 or more, it still gives me something. I wonder what it means.

Trying some other polynomial

Hallo,
first, thank you so much for your great efforts and for making this package available.

I have a little problem here, I wrote the code for the two parameter beta function used by Eric in several publications.
see pp 11 (https://www.federalreserve.gov/pubs/feds/2006/200610/200610pap.pdf)

But I get this message each time i try to run my midas_r command.
Error in base::chol2inv(x, ...) :
element (4, 4) is zero, so the inverse cannot be computed

What does this mean and how should i handle it. thank you

Do gradients with madness

https://mran.revolutionanalytics.com/web/packages/madness/vignettes/introducing_madness.pdf

User supplied gradient

Add the feature of user supplied gradient. For each restriction gradients must be supplied separately.

Thanks for your reply!

Thanks for your reply!
As you say, i.e eq.r<-midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) , start = list(x = c(1, -0.5,)))
in this midas_r(), is the nealmon of x equal to nealmon(q=c(1,-0.5),2) ?
Any more, you say "The summary method for midas_r returns the coefficients of MIDAS restriction. The starting values for x and for z tells midas_r to use exponential polynomials of order 1 and 2 respectively, hence you get the corresponding coefficients (the first coefficient is the multiplier)".
I know the first coefficient is the multiplier of the "eq.r$midas_coefficients", but what the other coefficients mean?
Also in the result of coef(eq.r), the first coefficient is the multiplier of the "eq.r$midas_coefficients",but I don't understand what the other coefficients represent respectively.

 data("USrealgdp")
 y <- diff(log(USrealgdp))
 x <- window(diff(USunempr),start=1949)
 trend <- 1:length(y)
 midas_r(y~trend+fmls(x,11,12,nealmon)+mls(y,1,1,"*"),start=list(x=rep(0,3)))
 Erreur dans z[1] : objet de type 'symbol' non indiçable

Can target variable be high-frequency?

Hi, I was wondering if it was possible (maybe via some workaround) to use a target variable that would be of higher frequency than one of the explanatory variables. An example:

y.monthly ~ x1.quarterly + x2.monthly

Maybe I'm missing something and this can already be done -- please let me know if so!

Thanks.

Make lag selection work with AR* model.

Add more lags for response variable, again model_frame will be reevaluated multiple times.

The restriction function stuff

Require that the first argument of the restriction function is the number of lags.
The second argument should be the parameters, maybe they should be named a la nls?
The rest are additional parameters needed for function.
The number of lags is supplied by embedlf. Pass this to restriction function.