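# Load caret and the package that ships the AlzheimerDisease data
library(caret)
library(AppliedPredictiveModeling)
# Fix the seed so the train/test split is reproducible
set.seed(3433)
data(AlzheimerDisease)
# Combine the outcome (diagnosis) with the predictors
adData = data.frame(diagnosis,predictors)
# Keep 75% of the rows, stratified by diagnosis, for training; the rest is the test set
inTrain = createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training = adData[ inTrain,]
testing = adData[-inTrain,]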
I recently studied predictive analytics techniques as part of a course, where I was given the code shown above. From it I built the following two predictive models, one using the IL predictors directly and one using their principal components, to compare their accuracy. This may be easy for experts, but I found it tricky, so I am posting the code here for my own reference.
Non-PCA
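# Keep the diagnosis column and every predictor whose name starts with "IL"
training1 <- training[, grepl("^IL|^diagnosis", names(training))]
test1 <- testing[, grepl("^IL|^diagnosis", names(testing))]
# Fit a glm (logistic regression) on the IL predictors
modelFit <- train(diagnosis ~ ., method = "glm", data = training1)
# Note: confusionMatrix() expects predictions first and the reference second; here the observed classes come first, so the table's row/column labels are swapped
confusionMatrix(test1$diagnosis, predict(modelFit, test1))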
Confusion Matrix and Statistics

          Reference
Prediction Impaired Control
  Impaired        2      20
  Control         9      51

               Accuracy : 0.6463
                 95% CI : (0.533, 0.7488)
    No Information Rate : 0.8659
    P-Value [Acc > NIR] : 1.00000

                  Kappa : -0.0702

 Mcnemar's Test P-Value : 0.06332

            Sensitivity : 0.18182
            Specificity : 0.71831
         Pos Pred Value : 0.09091
         Neg Pred Value : 0.85000
             Prevalence : 0.13415
         Detection Rate : 0.02439
   Detection Prevalence : 0.26829
      Balanced Accuracy : 0.45006

       'Positive' Class : Impaired
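To see where these figures come from, here is a minimal base R sketch that recomputes several of them from the four counts in the table above. The variable names (`cm`, `tp`, `fn`, and so on) are my own, and the formulas are the usual textbook definitions with "Impaired" as the positive class; caret's internal code may be organised differently.

```r
# Counts copied from the Non-PCA table above, laid out exactly as printed.
cm <- matrix(c(2, 9, 20, 51), nrow = 2,
             dimnames = list(Prediction = c("Impaired", "Control"),
                             Reference  = c("Impaired", "Control")))

# Cells of the 2x2 table, treating "Impaired" as the positive class.
tp <- cm["Impaired", "Impaired"]  # row Impaired, column Impaired
fp <- cm["Impaired", "Control"]   # row Impaired, column Control
fn <- cm["Control",  "Impaired"]  # row Control,  column Impaired
tn <- cm["Control",  "Control"]   # row Control,  column Control
n  <- sum(cm)

accuracy    <- (tp + tn) / n             # share of cells on the diagonal
sensitivity <- tp / (tp + fn)            # within the Impaired column
specificity <- tn / (tn + fp)            # within the Control column
ppv         <- tp / (tp + fp)            # within the Impaired row
npv         <- tn / (tn + fn)            # within the Control row
prevalence  <- (tp + fn) / n             # share of the Impaired column
nir         <- max(colSums(cm)) / n      # largest column share (No Information Rate)
balanced    <- (sensitivity + specificity) / 2

print(round(c(Accuracy = accuracy, Sensitivity = sensitivity,
              Specificity = specificity, PPV = ppv, NPV = npv,
              Prevalence = prevalence, NIR = nir, Balanced = balanced), 5))
```

One thing this makes visible: accuracy only uses the diagonal, so it does not depend on which argument was passed first to confusionMatrix(). The column-based figures (sensitivity, specificity, prevalence, No Information Rate) do depend on it, because the call above passes the observed classes in the position that caret labels "Prediction".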
PCA
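# Predictors only: every column whose name starts with "IL"
training2 <- training[, grepl("^IL", names(training))]
test2 <- testing[, grepl("^IL", names(testing))]
# Learn the PCA transformation on the training set, keeping enough components to explain 80% of the variance
preProc <- preProcess(training2, method = "pca", thresh = 0.8)
# Project both sets onto the same components
trainpca <- predict(preProc, training2)
testpca <- predict(preProc, test2)
# Fit glm on the components; the outcome comes from training1 because trainpca has no diagnosis column
modelFitpca <- train(training1$diagnosis ~ ., method = "glm", data = trainpca)
# Note: confusionMatrix() expects predictions first; here the observed classes come first, so the table's row/column labels are swapped
confusionMatrix(test1$diagnosis, predict(modelFitpca, testpca))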
Confusion Matrix and Statistics

          Reference
Prediction Impaired Control
  Impaired        3      19
  Control         4      56

               Accuracy : 0.7195
                 95% CI : (0.6094, 0.8132)
    No Information Rate : 0.9146
    P-Value [Acc > NIR] : 1.000000

                  Kappa : 0.0889

 Mcnemar's Test P-Value : 0.003509

            Sensitivity : 0.42857
            Specificity : 0.74667
         Pos Pred Value : 0.13636
         Neg Pred Value : 0.93333
             Prevalence : 0.08537
         Detection Rate : 0.03659
   Detection Prevalence : 0.26829
      Balanced Accuracy : 0.58762

       'Positive' Class : Impaired
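The thresh=0.8 argument in the PCA model asks preProcess to keep just enough principal components to explain 80% of the variance. Below is a rough sketch of that idea using base R's prcomp. It runs on the built-in USArrests data purely for illustration (not the Alzheimer predictors), the variable names are my own, and caret's exact selection rule may differ in its details.

```r
# Illustration only: USArrests ships with base R and stands in for the IL predictors.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of total variance carried by each component, then the running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative variance reaches the threshold.
thresh <- 0.8
n_comp <- which(cum_var >= thresh)[1]

# Scores on the retained components; these would play the role of trainpca.
scores <- pca$x[, seq_len(n_comp), drop = FALSE]

print(round(cum_var, 3))
print(n_comp)
```

The point of the reduction is that the model is then fitted on n_comp uncorrelated columns instead of the full set of correlated predictors, which is what the PCA model above does with the IL variables.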