This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM [4], Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM), on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset [22], measuring their classification test accuracy, sensitivity, and specificity. The dataset consists of features computed from digitized images of fine needle aspirate (FNA) tests on a breast mass [22]. For the implementation of the ML algorithms, the dataset was partitioned as follows: 70% for the training phase and 30% for the testing phase. The hyper-parameters for all the classifiers were assigned manually. Results show that all the presented ML algorithms performed well on the classification task, each exceeding 90% test accuracy. The MLP algorithm stands out among the implemented algorithms with a test accuracy of ~99.04%. Lastly, the results are comparable with the findings of related studies [18, 23].
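As a rough illustration of the 70%/30% partition, the sketch below shuffles sample indices and splits them using only the Python standard library. The figure of 569 samples is the size of the WDBC dataset; the helper name `split_indices` and the seed are hypothetical, and the repository's actual splitting code may differ.

```python
import random


def split_indices(n_samples, test_fraction=0.30, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumption)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# WDBC contains 569 samples; a 30% test split leaves 171 of them for testing.
train_idx, test_idx = split_indices(569)
print(len(train_idx), len(test_idx))  # 398 171
```

The 171 test samples obtained this way agree with the number of data points listed for the nearest neighbor classifiers in Table 1.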


  title={On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset},
  author={Agarap, Abien Fred},
  journal={arXiv preprint arXiv:1711.07831},


All experiments in this study were conducted on a laptop computer with Intel Core(TM) i5-6300HQ CPU @ 2.30GHz x 4, 16GB of DDR3 RAM, and NVIDIA GeForce GTX 960M 4GB DDR5 GPU.

Figure 1. Training accuracy of the machine learning algorithms on breast cancer detection using WDBC.

Figure 1 shows the training accuracy of the ML algorithms: (1) GRU-SVM finished its training in 2 minutes and 54 seconds with an average training accuracy of 90.6857639%, (2) Linear Regression finished its training in 35 seconds with an average training accuracy of 92.8906257%, (3) MLP finished its training in 28 seconds with an average training accuracy of 96.9286785%, (4) Softmax Regression finished its training in 25 seconds with an average training accuracy of 97.366573%, and (5) L2-SVM finished its training in 14 seconds with an average training accuracy of 97.734375%. There was no recorded training accuracy for Nearest Neighbor search since it does not require any training, as the norm equations (L1 and L2) are directly applied on the dataset to determine the “nearest neighbor” of a given data point p_{i} ∈ p.
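To make the "no training" point concrete, here is a minimal sketch of nearest neighbor search with the L1 and L2 norms in plain Python. The toy points, labels, and function names are illustrative assumptions, not the repository's TensorFlow implementation.

```python
def l1_distance(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nearest_neighbor_label(query, points, labels, distance):
    """Return the label of the stored point closest to `query`."""
    best = min(range(len(points)), key=lambda i: distance(query, points[i]))
    return labels[best]


# Toy data (hypothetical): two clusters with labels 0 and 1.
points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 4.5)]
labels = [0, 0, 1, 1]

print(nearest_neighbor_label((1.2, 1.1), points, labels, l1_distance))  # 0
print(nearest_neighbor_label((5.5, 5.0), points, labels, l2_distance))  # 1
```

There is no fitting step: each prediction is a direct distance computation against the stored data, which is why no training accuracy is reported for this method.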

Table 1. Summary of experiment results on the machine learning algorithms.

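| Parameter   | GRU-SVM | Linear Regression | MLP     | L1-NN   | L2-NN   | Softmax Regression | L2-SVM  |
|-------------|---------|-------------------|---------|---------|---------|--------------------|---------|
| Accuracy    | 93.75%  | ~96.1%            | ~99.04% | ~93.57% | ~94.74% | ~97.66%            | ~96.09% |
| Data points | 384000  | 384000            | 512896  | 171     | 171     | 384000             | 384000  |
| Epochs      | 3000    | 3000              | 3000    | 1       | 1       | 3000               | 3000    |
| FPR         | ~16.67% | ~10.20%           | ~1.27%  | 6.25%   | ~9.38%  | ~5.77%             | ~6.38%  |
| FNR         | 0       | 0                 | ~0.79%  | ~6.54%  | ~2.80%  | 0                  | ~2.47%  |
| TPR         | 100%    | 100%              | ~99.21% | ~93.46% | ~97.2%  | 100%               | ~97.53% |
| TNR         | ~83.33% | ~89.8%            | ~98.73% | 93.75%  | ~90.63% | ~94.23%            | ~93.62% |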

Table 1 summarizes the results of the experiment on the ML algorithms. The parameters recorded were test accuracy, number of data points (epochs * dataset_size), epochs, false positive rate (FPR), false negative rate (FNR), true positive rate (TPR), and true negative rate (TNR). All code implementations of the algorithms were written in Python, with TensorFlow as the machine intelligence library.
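The rates in Table 1 follow from the confusion matrix of actual versus predicted labels. The sketch below shows one way to compute them in plain Python; the function names, the toy labels, and the choice of 1 as the positive class are assumptions for illustration, not the repository's code.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Derive accuracy and the four rates; assumes both classes are present."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "TPR": tp / (tp + fn),  # sensitivity
        "TNR": tn / (tn + fp),  # specificity
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }


# Toy labels (hypothetical): 4 positive and 6 negative samples.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(actual, predicted)
print(counts)  # (3, 1, 5, 1)
print(rates(*counts))
```

By construction, TPR + FNR = 1 and TNR + FPR = 1, which is consistent with the paired values in each column of Table 1.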


Copyright 2017 Abien Fred Agarap

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.




Gated Recurrent Unit (GRU) is a recently developed variation of the long short-term memory (LSTM) unit, both of which are variants of the recurrent neural network (RNN). Empirical evidence has shown both models to be effective in a wide variety of machine learning tasks such as natural language processing (Wen et al., 2015), speech recognition (Chorowski et al., 2015), and text classification (Yang et al., 2016). Conventionally, like most neural networks, both of these RNN variants employ the Softmax function as their final output layer for prediction, and the cross-entropy function for computing their loss. In this paper, we present an amendment to this norm by introducing a linear support vector machine (SVM) as the replacement for Softmax in the final output layer of a GRU model. Furthermore, the cross-entropy function is replaced with a margin-based function. While there have been similar studies (Alalshekmubarak & Smith, 2013; Tang, 2013), this proposal is primarily intended for binary classification on intrusion detection using the 2013 network traffic data from the honeypot systems of Kyoto University. Results show that the GRU-SVM model performs better than the conventional GRU-Softmax model. The proposed model reached a training accuracy of ~81.54% and a testing accuracy of ~84.15%, while the latter reached a training accuracy of ~63.07% and a testing accuracy of ~70.75%. In addition, the juxtaposition of these two final output layers indicates that the SVM would outperform Softmax in prediction time, a theoretical implication that was supported by the actual training and testing times in the study.
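To illustrate the difference between the two output layers, the sketch below contrasts the Softmax cross-entropy loss with a margin-based (hinge) loss on a single example, in plain Python. The abstract does not state which margin-based variant is used, so both the standard and squared hinge forms are shown as assumptions; the function names are hypothetical, and regularization terms are omitted.

```python
import math


def softmax_cross_entropy(logits, true_class):
    """Cross-entropy of softmax(logits) against the index of the true class."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[true_class]


def hinge_loss(score, label, squared=False):
    """Margin-based loss for a raw score and a label in {-1, +1}."""
    margin = max(0.0, 1.0 - label * score)
    return margin ** 2 if squared else margin


# A confidently correct score incurs no margin loss at all.
print(hinge_loss(2.0, +1))                # 0.0
# A correct score inside the margin is still penalized.
print(hinge_loss(0.5, +1))                # 0.5
print(hinge_loss(0.5, +1, squared=True))  # 0.25
# A wrong-side score is penalized more heavily.
print(hinge_loss(0.5, -1))                # 1.5
# Cross-entropy, in contrast, is never exactly zero for finite logits.
print(softmax_cross_entropy([2.0, 0.0], 0) > 0.0)  # True
```

With an SVM output layer, prediction for the binary case reduces to taking the sign of the score, with no exponentials or normalization step as in Softmax.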


  title={A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data},
  author={Agarap, Abien Fred},
  journal={arXiv preprint arXiv:1709.03082},


First, clone this repository:

git clone

Then, install the required libraries:

sudo pip install -r requirements.txt

The following are the parameters for the module ( implementing the GRU-SVM class found in gru-svm/models/gru_svm/

usage: [-h] -o OPERATION [-t TRAIN_DATASET] -v
                       [-m MODEL_NAME] -r RESULT_PATH

GRU+SVM for Intrusion Detection

optional arguments:
  -h, --help            show this help message and exit

  -o OPERATION, --operation OPERATION
                        the operation to perform: "train" or "test"
  -t TRAIN_DATASET, --train_dataset TRAIN_DATASET
                        the NumPy array training dataset (*.npy) to be used
  -v VALIDATION_DATASET, --validation_dataset VALIDATION_DATASET
                        the NumPy array validation dataset (*.npy) to be used
  --checkpoint_path CHECKPOINT_PATH
                        path where to save the trained model
  -l LOG_PATH, --log_path LOG_PATH
                        path where to save the TensorBoard logs
  -m MODEL_NAME, --model_name MODEL_NAME
                        filename for the trained model
  -r RESULT_PATH, --result_path RESULT_PATH
                        path where to save the actual and predicted labels

Then, use the sample data in gru-svm/dataset/train/train_data.npy for training the proposed GRU-SVM:

cd gru-svm
python3 --operation "train" \
--train_dataset dataset/train/train_data.npy \
--validation_dataset dataset/test/test_data.npy \
--checkpoint_path models/checkpoint/gru_svm \
--model_name gru_svm.ckpt \
--log_path models/logs/gru_svm \
--result_path results/gru_svm

After training, the model can be used as follows:

python3 --operation "test" \
--validation_dataset dataset/test/test_data.npy \
--checkpoint_path models/checkpoint/gru_svm \
--result_path results/gru_svm

Or simply use the prepared script files:

# Makes the script files executable
sudo chmod +x
sudo chmod +x

# Installs the pre-requisite software and libraries

# Runs the GRU-SVM for intrusion detection


The results of the study may be found in gru-svm/results.


A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and
Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data
Copyright (C) 2017  Abien Fred Agarap

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published
by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <>.