Classification Example with Keras Deep Learning API in R

   Keras is a high-level neural networks API for building deep learning models. In this tutorial, we'll learn how to build a Keras deep learning classification model in R. TensorFlow is the backend engine of the Keras R interface. For more information about the library, please refer to this link. To install the 'keras' library, we need to run the commands below in RStudio.

> devtools::install_github("rstudio/keras")
> library(keras)
> install_keras()

First, we'll generate a sample dataset for this tutorial and split it into train and test parts.

library(keras)
set.seed(123)
n <- 2000  # number of samples
a <- sample(1:20, n, replace = TRUE)
b <- sample(1:50, n, replace = TRUE)
c <- sample(1:100, n, replace = TRUE)
flag <- ifelse(a > 15 & b > 30 & c > 60, "red",
               ifelse(a <= 9 & b < 25 & c <= 35, "yellow", "green"))
df <- data.frame(a = a,
                 b = b,
                 c = c,
                 flag = as.factor(flag))
> tail(df,15)
      a  b  c   flag
1986  3 50 91  green
1987  9 12 56  green
1988 10 21 14  green
1989 13  6 22  green
1990  6 14  9 yellow
1991 10 27 86  green
1992  4 16  6 yellow
1993 18 31 33  green
1994  4 50 51  green
1995  2 31 34  green
1996 18  8 88  green
1997  7 36 89  green
1998 16 34 91    red
1999  9 17 80  green
2000  9 22 91  green

indexes <- sample(1:nrow(df), size = 0.95 * nrow(df))

train <- df[indexes, ]
test <- df[-indexes, ]

Next, we'll convert the X input data into matrix type and the Y output labels into one-hot encoded categories.
 
train.x <- as.matrix(train[, 1:3])
train.y <- to_categorical(as.numeric(train[, 4]) - 1)

test.x <- as.matrix(test[, 1:3])
test.y <- to_categorical(as.numeric(test[, 4]) - 1)


Building a model
 
Here, input_shape is 3 (the a, b, c feature count), the output layer's units is 3 (the red, green, and yellow label count), and its activation is 'softmax' (for multi-class classification). The hidden layer uses 64 units with 'relu' activation.

model <- keras_model_sequential()

model %>%
  layer_dense(units = 64, activation = "relu", input_shape = c(3)) %>%
  layer_dense(units = 3, activation = "softmax")

model %>% compile(optimizer = "rmsprop",
                  loss = "categorical_crossentropy",
                  metrics = c("accuracy"))
 
> print(model)
Model
_________________________________________________________________________
Layer (type)                    Output Shape                  Param #    
=========================================================================
dense_356 (Dense)               (None, 64)                    256        
_________________________________________________________________________
dense_357 (Dense)               (None, 3)                     195        
=========================================================================
Total params: 451
Trainable params: 451
Non-trainable params: 0
_________________________________________________________________________


We'll fit the model with the train data and then predict the test data with the fitted model.

model %>% fit(train.x, train.y,
               epochs = 50, 
               batch_size = 50)

pred <- model %>% predict(test.x) 

To make the results readable, I'll change the format of the output.

pred <- round(pred, 2)
result <- data.frame("green" = pred[, 1], "red" = pred[, 2], "yellow" = pred[, 3],
          "predicted" = ifelse(max.col(pred[, 1:3]) == 1, "green",
                        ifelse(max.col(pred[, 1:3]) == 2, "red", "yellow")),
          original = test[, 4])

>  head(result,20)
   green  red yellow predicted original
1   1.00 0.00   0.00     green    green
2   1.00 0.00   0.00     green    green
3   1.00 0.00   0.00     green    green
4   1.00 0.00   0.00     green    green
5   0.45 0.55   0.00       red      red
6   1.00 0.00   0.00     green    green
7   0.93 0.00   0.07     green    green
8   0.52 0.36   0.12     green    green
9   0.96 0.04   0.00     green    green
10  1.00 0.00   0.00     green    green
11  0.28 0.04   0.68    yellow   yellow
12  1.00 0.00   0.00     green    green
13  1.00 0.00   0.00     green    green
14  1.00 0.00   0.00     green    green
15  1.00 0.00   0.00     green    green
16  1.00 0.00   0.00     green    green
17  0.73 0.27   0.00     green    green
18  1.00 0.00   0.00     green    green
19  0.52 0.38   0.10     green    green
20  0.34 0.00   0.66    yellow   yellow

Evaluating the model accuracy and loss.

scores <- model %>% evaluate(test.x, test.y)
> print(scores)
$loss
[1] 0.08444449

$acc
[1] 0.99

Finally, we'll check the confusion matrix with the caret package.

> cfm <- caret::confusionMatrix(as.factor(result$predicted), result$original)
> print(cfm)
Confusion Matrix and Statistics

          Reference
Prediction green red yellow
    green     89   0      0
    red        1   2      0
    yellow     0   0      8

Overall Statistics
                                          
               Accuracy : 0.99            
                 95% CI : (0.9455, 0.9997)
    No Information Rate : 0.9             
    P-Value [Acc > NIR] : 0.0003217       
                                          
                  Kappa : 0.9479          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: green Class: red Class: yellow
Sensitivity                0.9889     1.0000          1.00
Specificity                1.0000     0.9898          1.00
Pos Pred Value             1.0000     0.6667          1.00
Neg Pred Value             0.9091     1.0000          1.00
Prevalence                 0.9000     0.0200          0.08
Detection Rate             0.8900     0.0200          0.08
Detection Prevalence       0.8900     0.0300          0.08
Balanced Accuracy          0.9944     0.9949          1.00

The full source code is listed below.
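Assembled from the snippets above (the caret package is required for the confusion matrix check):

```r
library(keras)

# generate sample data
set.seed(123)
n <- 2000
a <- sample(1:20, n, replace = TRUE)
b <- sample(1:50, n, replace = TRUE)
c <- sample(1:100, n, replace = TRUE)
flag <- ifelse(a > 15 & b > 30 & c > 60, "red",
               ifelse(a <= 9 & b < 25 & c <= 35, "yellow", "green"))
df <- data.frame(a = a, b = b, c = c, flag = as.factor(flag))

# split into train and test parts
indexes <- sample(1:nrow(df), size = 0.95 * nrow(df))
train <- df[indexes, ]
test <- df[-indexes, ]

# convert X to matrix type, Y to one-hot categories
train.x <- as.matrix(train[, 1:3])
train.y <- to_categorical(as.numeric(train[, 4]) - 1)
test.x <- as.matrix(test[, 1:3])
test.y <- to_categorical(as.numeric(test[, 4]) - 1)

# build and compile the model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 64, activation = "relu", input_shape = c(3)) %>%
  layer_dense(units = 3, activation = "softmax")
model %>% compile(optimizer = "rmsprop",
                  loss = "categorical_crossentropy",
                  metrics = c("accuracy"))

# fit the model and predict the test data
model %>% fit(train.x, train.y, epochs = 50, batch_size = 50)
pred <- model %>% predict(test.x)

# collect the results in a readable data frame
pred <- round(pred, 2)
result <- data.frame("green" = pred[, 1], "red" = pred[, 2], "yellow" = pred[, 3],
          "predicted" = ifelse(max.col(pred[, 1:3]) == 1, "green",
                        ifelse(max.col(pred[, 1:3]) == 2, "red", "yellow")),
          original = test[, 4])

# evaluate the model and check the confusion matrix
scores <- model %>% evaluate(test.x, test.y)
print(scores)
cfm <- caret::confusionMatrix(as.factor(result$predicted), result$original)
print(cfm)
```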

If you have any comments about the post, please leave them below. Thank you for reading!

2 comments:

  1. Hi,
    I do not understand this part of your code.
    "# collecting everything in data frame to read it easily
    result <- data.frame("green" = pred[,1],
    "red" = pred[,2],
    "yellow" = pred[,3],
    "predicted" = ifelse(max.col(pred[ ,1:3]) == 1, "green",
    ifelse(max.col(pred[ ,1:3]) == "2", "red", "yellow")),
    original = test[ ,4])"

    My database includes 60 columns, of which the last column is the label. Also, I have 11 classes in my label column. Could you please help me with this issue?

    1. It is just there to print the original and predicted values, with the probabilities and the decided label, side by side. In the "predicted" column we convert the probability values to labels: the column with the highest value is selected as the final output.

      In your case, you will have 11 probability columns in your prediction. Your job is to pick the column with the highest predicted value as the final result. You can use the same method above or apply some other method to convert probability values to labels. Hope this helps!
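      To make that concrete, here is a small base-R sketch; the 11-class probability matrix and the label names are made up here, standing in for your model's predict() output and your factor levels:

```r
set.seed(1)
# toy stand-in for a model's prediction output: 5 test rows x 11 class probabilities
pred <- matrix(runif(5 * 11), nrow = 5)
pred <- pred / rowSums(pred)           # normalize each row to sum to 1
class.labels <- paste0("class", 1:11)  # hypothetical names, in factor-level order
# max.col() returns the index of the largest value in each row,
# which indexes directly into the label vector
predicted <- class.labels[max.col(pred)]
print(predicted)
```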
