Lasso fit
September 26, 2014
The code I was given
set.seed(3523)
library(AppliedPredictiveModeling)
library(caret)  # needed for createDataPartition()
data(concrete)
inTrain = createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training = concrete[ inTrain,]
testing = concrete[-inTrain,]
This is the data
head(as.matrix(training))
    Cement BlastFurnaceSlag FlyAsh Water Superplasticizer CoarseAggregate
47   349.0              0.0      0 192.0              0.0          1047.0
55   139.6            209.4      0 192.0              0.0          1047.0
56   198.6            132.4      0 192.0              0.0           978.4
58   198.6            132.4      0 192.0              0.0           978.4
63   310.0              0.0      0 192.0              0.0           971.0
115  362.6            189.0      0 164.9             11.6           944.7
    FineAggregate Age CompressiveStrength
47          806.9   3               15.05
55          806.9   7               14.59
56          825.5   7               14.64
58          825.5   3                9.13
63          850.6   3                9.87
115         755.8   7               22.90
Lasso fit and plot
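library(lars)
# Drop column 9 (CompressiveStrength) so only the eight predictors remain
predictors <- as.matrix(training)[,-9]
lasso.fit <- lars(predictors,training$CompressiveStrength,type="lasso",trace=TRUE)
# Legend labels are the predictor names, in column order
headings <- colnames(predictors)
plot(lasso.fit, breaks=FALSE)
legend("topleft", headings,pch=8, lty=1:length(headings),col=1:length(headings))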
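According to this plot, Cement is the last coefficient to be set to zero as the penalty increases: reading the paths from right to left (towards a heavier penalty), Cement's is the last one to reach zero. I think this is correct, but I may revise it.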
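The reading from the plot can also be cross-checked against the fitted path itself. The following is only a sketch, assuming the same seed and split as above and that coef() on a lars fit returns one row of coefficients per step, with the first row being the all-zero model. The names path, first_nonzero and survivor are my own, and the 1e-10 tolerance is an arbitrary choice.

```r
library(AppliedPredictiveModeling)
library(caret)
library(lars)

# Rebuild the training set exactly as in the code above
set.seed(3523)
data(concrete)
inTrain  <- createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training <- concrete[inTrain, ]

# Predictor matrix without the response column, then the lasso path
predictors <- as.matrix(training[, names(training) != "CompressiveStrength"])
lasso.fit  <- lars(predictors, training$CompressiveStrength, type = "lasso")

# One row per step of the path, one column per predictor
path <- coef(lasso.fit)

# First step at which each coefficient is nonzero (NA if it never enters)
first_nonzero <- apply(abs(path) > 1e-10, 2, function(z) which(z)[1])
print(sort(first_nonzero))

# The only predictor active right after the all-zero model is the one
# that survives the heaviest penalty
survivor <- colnames(path)[abs(path[2, ]) > 1e-10]
print(survivor)
```

If survivor matches what the plot suggests, the two readings agree. The order in first_nonzero is only a rough guide for the remaining variables, because a lasso path can drop a variable and pick it up again later.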
