Time series forecast

September 27, 2014

This code fits a BATS model to the training data, forecasts over the testing period, and counts how many values from the testing dataset fall within the forecast's 95% prediction interval.

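library(forecast)
library(lubridate)  # for year()

dat <- read.csv("~/Desktop/gaData.csv")
training <- dat[year(dat$date) < 2012, ]
testing <- dat[year(dat$date) > 2011, ]
tstrain <- ts(training$visitsTumblr)

fit <- bats(tstrain)
fc <- forecast(fit, h = 235)  # forecast horizon; should match nrow(testing)

# fc$lower and fc$upper each hold one column per level (80% and 95% by default),
# so the 95% interval is column 2 of fc$lower up to column 2 of fc$upper.
lo95 <- fc$lower[, 2]
hi95 <- fc$upper[, 2]

inside <- 0  # counter; named "inside" so it does not mask base::sum()
for (i in seq_along(hi95)) {
    actual <- testing$visitsTumblr[i]
    print(paste(actual, lo95[i], hi95[i]))
    if (actual > lo95[i] && actual < hi95[i]) {
        inside <- inside + 1
    }
}
print(inside)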
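
The counting loop can also be written as a single vectorized comparison. This is a minimal sketch, not the code I ran: the vectors `actual`, `lo95` and `hi95` are made-up numbers standing in for `testing$visitsTumblr` and the 95% bounds, so that the snippet runs without the CSV file.

```r
# Hypothetical values standing in for the test observations and 95% bounds
actual <- c(120, 800, 310, -350, 240)
lo95 <- c(-299.8, -333.3, -296.1, -297.1, -298.1)
hi95 <- c(714.6, 727.9, 767.1, 768.2, 769.2)

# TRUE where the observation lies strictly inside the interval
inside <- actual > lo95 & actual < hi95

n_inside <- sum(inside)   # number of observations inside the interval
coverage <- mean(inside)  # the same count as a share of all observations

print(n_inside)
print(coverage)
```

With real data, a coverage well below 0.95 would suggest the intervals are too narrow for this series.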

Printing the forecast object shows the bounds I use, the Lo 95 and Hi 95 columns:

> fc
    Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
366       207.4397 -124.2019 539.0813 -299.7624 714.6418
367       197.2773 -149.6631 544.2177 -333.3223 727.8769
368       235.5405 -112.0582 583.1392 -296.0658 767.1468
369       235.5405 -112.7152 583.7962 -297.0707 768.1516
370       235.5405 -113.3710 584.4520 -298.0736 769.1546
371       235.5405 -114.0256 585.1065 -299.0747 770.1556
372       235.5405 -114.6789 585.7599 -300.0739 771.1548
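
The Lo 95 and Hi 95 columns in the printout are stored in the forecast object's `lower` and `upper` matrices, with the levels listed in `fc$level`. As a rough sketch of pulling them out by level rather than by column position, here is the same idea on the built-in `WWWusage` series (an assumption for illustration, since my own data is in a local CSV file):

```r
library(forecast)

# Built-in example series used in place of the Tumblr visits data
fit <- bats(WWWusage)
fc <- forecast(fit, h = 10, level = c(80, 95))

# Find which column holds the 95% level instead of hard-coding column 2
idx <- which(fc$level == 95)

bounds <- data.frame(
    point = as.numeric(fc$mean),          # point forecast
    lo95 = as.numeric(fc$lower[, idx]),   # lower 95% bound
    hi95 = as.numeric(fc$upper[, idx])    # upper 95% bound
)
print(head(bounds))
```

Looking the column up through `fc$level` keeps the code correct if the `level` argument is ever changed.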

Filed under Machine Learning, R
