We will apply a wrapper feature selection method, using forward search with decision trees as the model, to the breast cancer Wisconsin dataset. First, we split the dataset into training and validation sets.

# read the breast cancer Wisconsin data (file path left as a placeholder)
data <- read.table('', na.strings = "?", sep=",")
data <- data[,-1]   # drop the sample code number (ID) column
# standard attribute names of the breast cancer Wisconsin (original) dataset
names(data) <- c("ClumpThickness", "UniformityCellSize", "UniformityCellShape",
                 "MarginalAdhesion", "SingleEpithelialCellSize", "BareNuclei",
                 "BlandChromatin", "NormalNucleoli", "Mitoses", "Class")
data$Class <- factor(data$Class, levels=c(2,4), labels=c("benign", "malignant"))
ind <- sample(2, nrow(data), replace=TRUE, prob=c(0.7, 0.3))
trainData <- data[ind==1,]
validationData <- data[ind==2,]
# remove cases with missing data
trainData <- trainData[complete.cases(trainData),]
validationData <- validationData[complete.cases(validationData),]
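
The Naive Bayes comparison below relies on a formula f over the attributes chosen by the forward search with decision trees. As a minimal sketch of such a wrapper, assuming the rpart package, greedy selection, and 5-fold cross-validated accuracy on the training data as the search criterion (the exact search code may differ), the search could look like this:

library(rpart)

# 5-fold cross-validated accuracy of a decision tree for a given formula
cv_acc <- function(f, data, k = 5) {
  folds <- sample(rep(1:k, length.out = nrow(data)))
  mean(sapply(1:k, function(i) {
    fit <- rpart(f, data = data[folds != i, ])
    mean(predict(fit, data[folds == i, ], type = "class") == data$Class[folds == i])
  }))
}

attributes <- setdiff(names(trainData), "Class")
selected <- c()   # attributes picked so far
best_acc <- 0     # best cross-validated accuracy reached so far

repeat {
  remaining <- setdiff(attributes, selected)
  if (length(remaining) == 0) break
  # cross-validated accuracy of each candidate extension of the current subset
  accs <- sapply(remaining, function(a) {
    cv_acc(as.formula(paste("Class ~", paste(c(selected, a), collapse = " + "))),
           trainData)
  })
  if (max(accs) <= best_acc) break   # stop when no candidate improves accuracy
  selected <- c(selected, names(which.max(accs)))
  best_acc <- max(accs)
}

# formula over the selected attributes, reused for the simpler model below
f <- as.formula(paste("Class ~", paste(selected, collapse = " + ")))

The loop stops as soon as no remaining attribute improves the cross-validated accuracy, and the resulting formula f is the one passed to naiveBayes below.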

Naive Bayes

We can use the Naive Bayes algorithm to evaluate the attribute subset found by the forward selection algorithm, measuring accuracy on both the training and the validation datasets.

library(e1071)      # provides naiveBayes()
library(MLmetrics)  # provides Accuracy(), assuming the MLmetrics helper is used below
model <- naiveBayes(Class ~ ., data=trainData, laplace = 1)
simpler_model <- naiveBayes(f, data=trainData, laplace = 1)  # f: formula from the forward search

pred <- predict(model, validationData)
simpler_pred <- predict(simpler_model, validationData)

train_pred <- predict(model, trainData)
train_simpler_pred <- predict(simpler_model, trainData)
paste("Accuracy in training all attributes", 
      Accuracy(train_pred, trainData$Class), sep=" - ")
## [1] "Accuracy in training all attributes - 0.957805907172996"
paste("Accuracy in training forward search attributes", 
      Accuracy(train_simpler_pred, trainData$Class), sep=" - ")
## [1] "Accuracy in training forward search attributes - 0.953586497890295"
paste("Accuracy in validation all attributes", 
      Accuracy(pred, validationData$Class), sep=" - ")
## [1] "Accuracy in validation all attributes - 0.976076555023923"
paste("Accuracy in validation forward search attributes", 
      Accuracy(simpler_pred, validationData$Class), sep=" - ")
## [1] "Accuracy in validation forward search attributes - 0.971291866028708"

In the breast cancer Wisconsin dataset, the feature selection algorithm did not outperform the use of all attributes. A likely cause is that the nine attributes were handpicked by domain experts and, taken together, all carry predictive power, so removing some of them does not produce better results.