
Fed-IIoT: A Robust Federated Malware Detection Architecture in Industrial IoT

Rahim Taheri, Mohammad Shojafar, Senior Member, IEEE, Mamoun Alazab, Senior Member, IEEE, and Rahim Tafazolli, Senior Member, IEEE

Abstract—The sheer volume of industrial Internet of Things (IIoT) malware is one of the most serious security threats in today's interconnected world, with new types of advanced persistent threats and advanced forms of obfuscation. This article presents a robust federated-learning-based architecture, called Fed-IIoT, for detecting Android malware applications in IIoT. Fed-IIoT consists of two parts: first, the participant side, where the data are triggered by two dynamic poisoning attacks based on a generative adversarial network (GAN) and a federated GAN; and second, the server side, which aims to monitor the global model and shape a robust collaborative training model by avoiding anomalies in aggregation through a GAN network (A3GAN) and by adjusting two GAN-based countermeasure algorithms. One of the main advantages of Fed-IIoT is that devices can safely participate in the IIoT and efficiently communicate with each other, with no privacy issues. We evaluate our solutions through experiments on various features using three IoT datasets. The results confirm the high accuracy rates of our attack and defense algorithms and show that the A3GAN defensive approach preserves the robustness of data privacy for Android mobile users, with about 8% higher accuracy than existing state-of-the-art solutions.

Index Terms—Federated learning (FL), generative adversarial network (GAN), Internet of Things (IoT), malware.

Manuscript received September 27, 2020; revised November 22, 2020; accepted December 3, 2020. Date of publication December 9, 2020; date of current version August 20, 2021. Paper no. TII-20-4501. (Corresponding author: Mamoun Alazab.)
Rahim Taheri is with the Computer Engineering and Information Technology Department, Shiraz University of Technology, Shiraz 715555, Iran (e-mail: [email protected]).
Mohammad Shojafar and Rahim Tafazolli are with the Institute for Communication Systems, 6G Innovation Centre, University of Surrey, GU27XH Guildford, U.K. (e-mail: [email protected]; [email protected]).
Mamoun Alazab is with the College of Engineering, IT and Environment, Charles Darwin University, Casuarina, NT 0810, Australia (e-mail: [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TII.2020.3043458.
Digital Object Identifier 10.1109/TII.2020.3043458

Fig. 1. FL-based architecture applied to a mobile Android device. FA := federated aggregation; BR := binary representation; Ps := the sth participant, which generates a BR.

I. INTRODUCTION

INDUSTRIAL Internet of Things (IIoT) consists of heterogeneous devices that connect and communicate via the Internet. In recent years, most of these devices have used the Android operating system (OS), the most popular and well-known mobile OS for processing and communication. An Android system can easily be installed on IoT-based systems and improves accessibility to a wide range of applications [1], [2]. This popularity has made the Android OS an attractive target for malware writers and malicious Android applications, and attackers have written several complex malware models to invade the Android OS. Several solutions have applied traditional machine learning (ML) algorithms to distinguish malware from benign programs and to deal with this problem. These algorithms have achieved good results by collecting data and constructing models based on the identification of malware features. The majority of such ML algorithms are centralized methods, meaning that they first gather data from different users for use as a training dataset, which is placed on the ML server, and then build a model to classify new data samples by applying ML algorithms to this training dataset.

However, access to these datasets in centralized ML methods raises concerns about data privacy for users. Since traditional ML techniques classify only on the basis of the training dataset, it is easy for attackers to access the data during the learning process. These approaches therefore face significant problems with data privacy and leakage. Collaborative ML (CML) was designed to cope with this problem and, at the same time, to make better use of ML methods [3], [4]. CML is a kind of decentralized learning that analyzes data from small, mobile devices, such as those connected to the IoT. Based on the CML framework, federated learning (FL) was designed to protect data privacy. In FL, each participant uses a global training model, without needing to upload their private data to a third-party server. Fig. 1 illustrates an FL-based architecture applied to an Android malware system. In the figure, each participant (Pi, ∀i ∈ {1, . . . , S}, where S is the total number of participants) is located on the participant side and influences a global model [5]. This global model is predefined and trained by each participant


to generate local model parameters during round t (see the model graph in the upper part of Fig. 1). Then, on the server side (right-hand rectangular box in Fig. 1), we use a federated aggregation algorithm to aggregate the trained parameters for each participant and update the global model (see the model graph in the lower part of Fig. 1).

In FL, individual computing machines may show abnormal actions, for example, due to faulty software, hardware invasions, unreliable communication channels, and malicious samples deliberately crafting the model [6]. To mitigate these challenges, we require robust policies to control the learning phases in FL. It is therefore necessary to develop provably robust FL algorithms that can deal with Byzantine failures. Recently developed robust FL defense mechanisms mainly depend on the type of attacks launched against the system. As an example, Blanchard et al. [7] introduced a Byzantine detection algorithm for backdoor attacks in CML. This method depends on the distribution of the training data and is not robust, especially for the various data distributions that arise in FL settings. Other categories of solutions, such as those in [8]–[11], deal with controlling the noise injected into the training dataset to trigger the distribution of the model and increase the weight clipping. For instance, Sun et al. [10] designed a fast-converging defense algorithm to handle backdoor attacks on FL tasks using model weight clipping and noise injection. However, this scheme was limited, as it was unable to manage untargeted attacks, such as those in [12] and [13]. Compared with conventional ML, FL can preserve data security, especially in terms of participant data during the learning process. FL can also help in updating server-side data for the global model, and the participant is not required to provide their data to the server. Nevertheless, FL is vulnerable to several security threats. For example, since the participant cannot see or access the server-side data, an attacker can access the participants' training and inject poisoned data into the training model, meaning that the global model will be contaminated with false data. This is a well-known attack in ML, called a poisoning attack [14]. There are several significant reasons for the vulnerability of FL to poisoning attacks: each participant trains the local model, and the server cannot determine whether the parameters uploaded by the participant are benign or malicious; and there is no mechanism for participant authentication in FL, meaning that an adversary can pretend to be a benign participant. Motivated by this, we address the above-mentioned issues by designing an FL-based Android malware detection defense algorithm to protect the privacy of the users' data. In particular, we design two algorithms that launch poisoning attacks on the participants' training model (see the colored training model adopted by an ML algorithm in Fig. 1), and apply two countermeasure solutions, namely Byzantine Median (BM) and Byzantine Krum (BK), to preserve the robustness of the network under these types of attacks.

Contributions: The main contributions of this article are as follows.
1) We present an FL-based architecture, Fed-IIoT, imposing an Android malware detection algorithm, including various independent and identically distributed learning models.
2) We propose two poisoning attacks, based on a latent random variable and adopting a generative adversarial network (GAN), that float malware among the benign data samples using FL, namely the GAN and federated GAN (FedGAN) attacks.
3) We propose the avoiding anomaly in aggregation by a GAN network (A3GAN) defense algorithm, which is formed by combining the FL and GAN algorithms to detect the adversaries in the server-side component.
4) We modify and adapt the Byzantine defense algorithm for Krum and Median, apply them against these forms of attacks, and verify their effectiveness.
5) Finally, we conduct an exhaustive set of experiments to validate the attack and defense mechanisms on three IoT datasets using different features.

Roadmap: The remainder of this article is structured as follows. Section II gives a short summary of related FL solutions that have been designed to tackle anomalies and malware in the network. Section III discusses the representation of the FL data. Section IV presents our proposed FL architecture, attacks, and solutions: we first describe our FL model for the Android OS, then describe various attack scenarios, and finally explain our adjusted defense mechanisms for mitigating these attacks. In Section V, a performance analysis of the proposed attacks and adapted defense algorithms is presented. Finally, Section VI concludes this article.

TABLE I: Comparison between different FL solutions (a tick indicates that the method supports the property; a cross indicates that it does not).

II. RELATED WORK

In this section, we review the most recent related works in the field of ML approaches to malware detection (see Section II-A) and the robustness of FL-based malware detection approaches (see Section II-B). A comparison of the techniques found in the literature is presented in Table I.

A. ML Approaches to Android Malware Detection

ML algorithms are widely used to improve the performance of Android mobile apps.


In one of the earliest and most well-known works [15], Biggio et al. added crafted poisoning attack algorithms applied to Android malware clustering. Han et al. [16] then designed a feature transformation-based Android malware detection scheme that considered the major features of Android malware detection and transformed them irreversibly into a new feature domain to validate the robustness of the ML model. The work in [17] introduced SEDMDroid, a stacking ensemble framework for identifying Android malware. SEDMDroid promotes diversity among the features and applies random feature subspace and bootstrap sampling techniques. The study in [18] presented a permission-based malware detection approach named SIGPID to deal with the growth in the number of malicious Android applications. The SIGPID algorithm applies three levels of pruning to the dataset to discover the most important permission features that help distinguish between benign and malicious applications. Most recently, the work in [19] introduced a malware detection framework to identify malware attacks on the IIoT, called MD-IIoT. The authors of MD-IIoT proposed a methodology based on color image visualization and used a deep convolutional neural network to identify benign and malicious samples. Although the methods described above are promising ML solutions, none of them deals with global training models applied to each mobile app (i.e., each possible participant). Unlike these schemes, Fed-IIoT considers this aspect.
Fig. 2. Data representation of IIoT samples as a sparse matrix. Dashed lines refer to an injection attack from an adversary in the IIoT system.

B. Adversarial FL (AFL) Approaches
Some researchers have adopted distributed ML (DML) to monitor data gathered from IoT devices [20], [21], [28]. The DML technique is the preliminary deployed solution that can support FL [23], [29], [30]. These approaches commanded some bandwidth and communication indications to mainly concentrate on analyzing the system performance and preserving the reliability of the federated nodes. FL is also vulnerable to poisoned data that can fool the local and global ML models. To cope with this issue, the work in [22] presented a rejection algorithm based on the error rate and loss function to deny suspicious local updates by testing their impact on the global training model using a validation set. The main problem with this FL solution is that validation testing for large Android mobile applications is computationally expensive and cannot be applied in real-time apps. In another study, McMahan et al. [23] designed an FL-based algorithm to distribute the training process of a deep neural network. Their approach allows mobile users to keep their data on their devices while a service provider aggregates and distributes the locally trained model across the users. This helps to minimize the amount of data collected by third parties on mobile users.

AFL settings are another issue that must be considered. One prominent AFL technique relates to Byzantine settings, where a subset of client data can behave stochastically. We therefore need to design robust aggregation rules to mitigate this issue. Exhaustive research has been carried out on Byzantine settings [24]–[27], [31]. For example, the authors of [24] and [31] focused on gradient similarities, whereas the work in [25] applied geometric median aggregation, the study in [26] examined redundant communication, and, finally, the work in [27] utilized adaptive model quality estimation to deal with the anomalous behavior of the samples in AFL. While these approaches can provide appropriate convergence guarantees in Byzantine cases, they are computationally expensive and need to be manually modified during the federated communication. Unlike the above AFL methods, our proposed Fed-IIoT method adopts a GAN to mimic the environment of the poisoned sample. We also adapt Byzantine defense mechanisms using Median and Krum and add a GAN to deal with the proposed attack scenarios.

III. DATA REPRESENTATION IN FL

In this section, we give a detailed description of the data representation for the Android malware dataset in FL (see Section III-A) and explain the proposed threat model and assumptions (see Section III-B).

A. Considered Data Representation Model

Fig. 2 shows the information gathered from various IIoT applications using the Android OS to shape our sparse matrix representation. Note that the IIoT devices are under threat from an adversary who wants to modify and corrupt the data (see the dashed lines in Fig. 2). This matrix includes important information on the features of an Android app, such as system features.

We assume that the local model of each IIoT device consists of a set of benign samples, denoted by B, and a set of malware samples, denoted by M.


Fig. 3. Proposed FL-based architecture imposing Android malware detection in the presence of a poisoning attack on the participant side. BR:=
binary representation; P-BR:= poisoned BR. The red dashed arrow represents the output of the attacker discriminator function of the GAN output,
the continuous arrow represents the entity link, and the dashed arrow represents the GAN link.

Fig. 4. CNN architecture for a GAN.
Fig. 5. CNN architecture for a discriminative GAN.

Then, we set up our settings containing the labeled examples (i.e., S samples) and the B elements for each sample, as shown in the following equation:

D = \{(a_i, b_i) \mid \forall i = 1, \ldots, S\}.   (1)

Here, b_i ∈ {0, 1} is the binary label of the ith sample, a_i denotes the binary representation (BR) of the ith malware sample, with each component representing a selected feature, and a_{if} is the binary value of the f th feature in the ith sample, where ∀f = 1, . . . , F. If a_i has the f th feature, then we set a_{if} = 1, and otherwise 0. We also set S as the total number of samples.
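To make this representation concrete, the following is a minimal Python sketch of the binary encoding behind (1); it is an illustration under stated assumptions, not the authors' code: the feature vocabulary and the per-app feature sets are hypothetical placeholders, and SciPy's CSR format stands in for the sparse matrix.

```python
# Minimal sketch of the binary data representation in (1).
# The feature vocabulary and per-app feature sets below are
# hypothetical placeholders, not taken from the paper's datasets.
import numpy as np
from scipy.sparse import csr_matrix

# F static features (permissions, intents, API calls) in a fixed order.
features = ["SEND_SMS", "ACCESS_FINE_LOCATION", "INTERNET", "READ_CONTACTS"]
f_index = {f: i for i, f in enumerate(features)}

# S labeled samples: (extracted feature set, label b_i; 1 = malware).
samples = [
    ({"SEND_SMS", "INTERNET"}, 1),
    ({"INTERNET"}, 0),
    ({"ACCESS_FINE_LOCATION", "READ_CONTACTS", "INTERNET"}, 1),
]

rows, cols = [], []
labels = np.zeros(len(samples), dtype=np.int8)
for i, (feats, b) in enumerate(samples):
    labels[i] = b
    for f in feats:
        rows.append(i)
        cols.append(f_index[f])  # a_if = 1 when feature f is present

# Sparse S x F matrix A with a_if in {0, 1}; absent entries are 0.
A = csr_matrix((np.ones(len(rows), dtype=np.int8), (rows, cols)),
               shape=(len(samples), len(features)))
print(A.toarray(), labels)
```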
mitigate attacks using the GAN method inspired by Byzantine
B. Threat Model and Assumptions algorithm (see Section IV-C).
We consider some important hypotheses that are list here.
First, our proposed attack and defense algorithms apply on static A. Proposed Architecture
features of IoT devices. It is because the speed of executing Fig. 3 presents the architecture of the proposed Fed-IIoT
operations is greater than the dynamic features. Second, the scheme, which consists of two parts: a participant side and a
proposed methods include the number of adversaries, which we server side. The participant side, as shown by the red rectangle
have expanded to five in order to explore the impact of increasing on the left-hand side of Fig. 3, contains different participants
the number of adversaries. We do not consider collaborations (i.e., Pi , i ∈ {1, . . . , S}). This part represents an adversary that
between adversaries and we assume that the adversaries are can train the model locally (the adversary poison data generation,
independent of each other. Third, this work uses a special GAN P-BR, as shown in Fig. 3). Each Android application is accom-
structure, which is designed on the basis of the convolutional panied by a participant and generates a sample as an input to the
neural networks described in Figs. 4 and 5. Finally, we assume binary vector representing a feature. In each step t + 1, one of


In each step t + 1, one of the participants reuses the model learned in the previous step t, Mt. This model is subverted and modified by the adversary. The adversary performs the attack by adding poisoned samples (the P-BR vector on the participant side, as shown in Fig. 3) as new samples in the training phase of one of the participants. We explain how to create these poisoned updates in Section IV-B.

B. Proposed GAN-Based Trivial Attack Algorithm

On the participant side (red rectangle in Fig. 3), one or more adversaries enter the system as ordinary participants and try to change the process by modifying the features of the input samples so that a generated malware sample appears to be a benign sample. This reduces the accuracy of the detection system and opens the door to more malware samples entering the system. This part of the figure shows the adversary using a GAN mechanism to generate adversarial samples, in which the trained model (i.e., Mt) is used as the discriminator function. A generator network is created based on the latent random variable, and this network is used to generate new samples. The GAN is used to produce new samples that are very similar to the real samples, which the adversary uses to add updates to the model, causing the model to be trained so that it cannot detect malware samples.

Modified GAN for the Participant Attack: In this case, we intend to create GANs that enhance the training structure provided by Goodfellow et al. [32]. The first step involves gathering data by sampling from a dataset.

An interesting feature of a GAN is that it does not need labeled information. Its learning methods can be classified into generative and discriminative models. A generative model is trained to obtain the joint probability of the input data and output class labels, i.e., p(a, b). This can be used to derive a conditional distribution, i.e., p(b|a), using the Bayes rule. We can also use this learned joint probability for other purposes, such as generating new samples (a, b).

The main idea of a GAN is to use the discriminative framework against the generative framework. In this way, the two neural network components of the GAN act as adversaries and are trained on real samples to produce nonidentifiable samples. The discriminator model is adopted here as a binary-label classifier. For the classifier, the input point is b, and the output is an F-dimensional vector of logits (the inverse of the sigmoidal logistic function). The output vector is as follows:

b_1, b_2, \ldots, b_F.   (2)

The softmax function helps to compute class probabilities as follows:

P_{model}(b = i \mid a) = \frac{\exp(b_i)}{\sum_{j=1}^{m} \exp(b_j)}.   (3)

The softmax function is a kind of normalized function that is applied as an activation function for the convolutional neural network (CNN) in the GAN. If we can increase the accuracy of the normalization, we can achieve higher classification accuracy for both the attack and defense algorithms. To train the model, we minimize the negative log-likelihood between P_{model}(b|a) and the observed labels b. We add some fake examples generated by the generator G to the available dataset (see the dashed components on the participant side of Fig. 3).

Algorithm 1: GAN-Based Trivial Attack Algorithm.
Input: Xtr, Ytr, Bs, Nd, epochs
Output: Fout
 1: G ← Generator
 2: Dm ← Discriminator
 3: for each ep in epochs do
 4:   for each i in (size(Xtr)/Bs) do
 5:     Btchs ← Xtr(Bs ∗ i)
 6:     Btchl ← Ytr(Bs ∗ i)
 7:     Noise ← rand(Bs, Nd)
 8:     Gs ← G(Noise)
 9:     Ro ← Dm(Btchs)
10:     Fo ← Dm(Gs)
11:     Gl ← G(Fo).loss
12:     Dl ← Dm(Ro, Fo, Btchl).loss
13:   end for
14:   Gg ← Recompute gradients of G(Gl, Gvar)
15:   Dg ← Recompute gradients of D(Dl, Dvar)
16:   Apply Gopt(Gg, Gvar)
17:   Apply Dopt(Dg, Dvar)
18: end for
19: return Fo

Algorithm 1 presents the pseudocode for the GAN-based trivial attack algorithm. In lines 1 and 2 of this algorithm, we first define the generator and discriminator functions that are used to generate adversarial samples. Then, in the two nested for loops of lines 3–17, a batch of training data is first separated; we then randomly generate a noise vector of the batch size from a normal distribution. In the next step, the algorithm gives this noise vector to the generator function G, and its output is sent to the discriminator function D to compute the similarity of the generated sample to the training dataset. In lines 11 and 12, we calculate the losses of the two functions G and D, and this is repeated. For each epoch, by calculating gradients, the optimizer function is used to optimize the solutions. It should be noted that in the proposed trivial GAN method, all collected data from the IoT devices are considered as a single dataset and the training is performed on the server side; as stated in the introduction, privacy remains an important concern in this type of approach.
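To make the training loop concrete, the following is a minimal sketch of one step of Algorithm 1; this is a sketch under stated assumptions, not the authors' implementation: it is written against TensorFlow 2 (the paper used TensorFlow 1.12/Keras), and toy dense models stand in for the CNNs of Figs. 4 and 5. Variable names mirror the pseudocode (Btchs, Noise, Gs, Ro, Fo, Gl, Dl).

```python
# Sketch of Algorithm 1 (GAN-based trivial attack), assuming TensorFlow 2
# and toy dense models in place of the paper's CNNs (Figs. 4 and 5).
import tensorflow as tf

F, Nd, Bs = 300, 100, 32           # feature count, noise dim, batch size
G = tf.keras.Sequential([          # generator: noise -> fake binary sample
    tf.keras.layers.Dense(128, activation="relu", input_shape=(Nd,)),
    tf.keras.layers.Dense(F, activation="sigmoid")])
Dm = tf.keras.Sequential([         # discriminator: sample -> real/fake logit
    tf.keras.layers.Dense(128, activation="relu", input_shape=(F,)),
    tf.keras.layers.Dense(1)])
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(Btchs):             # one pass over lines 5-17 of Algorithm 1
    Noise = tf.random.normal([Bs, Nd])
    with tf.GradientTape() as gt, tf.GradientTape() as dt:
        Gs = G(Noise, training=True)         # generated samples
        Ro = Dm(Btchs, training=True)        # D output on real batch
        Fo = Dm(Gs, training=True)           # D output on fake batch
        Gl = bce(tf.ones_like(Fo), Fo)       # generator loss
        Dl = bce(tf.ones_like(Ro), Ro) + bce(tf.zeros_like(Fo), Fo)
    g_opt.apply_gradients(zip(gt.gradient(Gl, G.trainable_variables),
                              G.trainable_variables))
    d_opt.apply_gradients(zip(dt.gradient(Dl, Dm.trainable_variables),
                              Dm.trainable_variables))
    return Fo                       # fakes that fool Dm become poison
```

Generated samples for which the discriminator can no longer tell real from fake are the candidates that the adversary injects as poisoned updates.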


C. Proposed FedGAN Attack Algorithm

Algorithm 2 presents a method, named FedGAN, inspired by the concept of FL in combination with a GAN, that can maintain the privacy of the data on each IoT device while producing adversarial samples. This algorithm assumes that the IoT devices do not share their datasets, but only update the model. We use this policy to preserve the data privacy of the devices. Note that among the set of participants, there may be one or more adversaries that update the model with parameters derived from training on adversarial samples. In this proposed algorithm, we adopt the Fed-IIoT architecture, and the adversary utilizes the GAN to generate adversarial samples. In lines 1–4, for each participant, we consider a separate dataset and define the corresponding model as a discriminator. In lines 5–7, while defining the server-side model, the corresponding weights are stored in Wtmp. Then, in the nested for loops of lines 8–27, per round and for each participant, we first use Wtmp as the weights of the corresponding model; we even set these weights for the adversary model. Then, in nested for loops, as in Algorithm 1, we generate adversarial samples using the GAN architecture. Lines 25 and 26 calculate the average of the model weights and place them in Wtmp. We repeat the same process in the next round.

Algorithm 2: FedGAN Attack Algorithm.
Input: Xtr, Ytr, Bs, Nd, GANep, Clients
Output: Fout
 1: Models ← discriminator
 2: Dm ← discriminator
 3: for each i in Clients
 4:   Update Clientd[i]
 5:   Update Models[i]
 6: end for
 7: Wt ← Models.w
 8: for r in Round
 9:   for c in Clients
10:     Models[c].w ← Wt
11:     Dm.Set(Models[c].w)
12:     for ep in GANep
13:       for each i in (size(Xtr)/Bs)
14:         Btchs ← Xtr(Bs ∗ i)
15:         Btchl ← Ytr(Bs ∗ i)
16:         Noise ← rand(Bs, Nd)
17:         Gs ← G(Noise)
18:         Ro ← Dm(Btchs)
19:         Fo ← Dm(Gs)
20:         Gl ← G(Fo).loss
21:         Dl ← Dm(Ro, Fo, Btchl).loss
22:       end for
23:     end for
24:   end for
25:   Wtmp ← average of the weights per Clients
26:   Models.Set(Wt)
27: end for
28: return Fo

D. Proposed Defense Algorithm: A3GAN

Algorithm 3 presents our defense method, based on the detection of abnormal client behavior, named avoiding anomaly in aggregation by a GAN network (A3GAN). A3GAN defines a behavioral threshold, and any client that does not meet this threshold is treated as an adversary. A3GAN creates a measure for calculating a score in an FL system, finds the adversary, and eliminates the corresponding aggregated data from the system. The basic idea of our diagnostic approach is to calculate an anomaly value for each client in the federated learning system, and not to include a client in the aggregation model if this value is higher than the specified threshold.

In this way, we first divide the validation data into N separate sections and create a corresponding model for each. We then use the weights of these models to train a GAN network. The resulting network acts as an anomaly detector.

Algorithm 3: A3GAN Defense Algorithm.
Input: MG, V.Data, Clients, Round
Output: MG
 1: for each i in N
 2:   Li ← Model(V.Data/N)
 3:   MGAN.Train(Li.params)
 4: end for
 5: for r in Round
 6:   for c in Clients
 7:     W̃i ← MGAN(c.params)
 8:     Compute Ai using (5)
 9:   end for
10:   Compute τ using (6)
11:   if (Ai > τ) then
12:     Update MG using (4)
13:   end if
14: end for
15: return MG

Without loss of generality, suppose K clients participate in FL (K ≤ S) and each client has n_k training points. Let W_{t+1}^{k} be the weight of the kth client in round (t + 1) of the global model. We were inspired by the data aggregation algorithm used in FedAvg [33], as given in the following equation:

W_{t+1} = \sum_{k=1}^{S} \frac{n_k}{n} W_{t+1}^{k}.   (4)

Considering the trained anomaly detector in FL, we want to shape the aggregation model in such a way that clients with a high anomaly value are not used in calculating the aggregation. We compute the anomaly value of client k based on a mean-square-error relation in the following equation:

A_{t+1}^{k} = \big\| W_{t+1}^{k} - \widetilde{W}_{t+1}^{k} \big\|^{2}   (5)

where W_{t+1}^{k} is the weight of client k in round t + 1 and \widetilde{W}_{t+1}^{k} is the weight calculated by the GAN for this client. After calculating A_{t+1}^{k} for all clients, we calculate the threshold τ (the anomaly value averaged over all clients) in the following equation:

\tau = \overline{A_{t+1}^{k}}, \quad \forall k = 1, \ldots, S.   (6)

Hence, in calculating W_{t+1}, we do not consider those clients for which A_{t+1}^{k} > τ, to avoid adding the adversary's aggregated data to the training data of FL.
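As a concrete illustration of (4)–(6), here is a NumPy sketch of the server-side filtering and aggregation step; it is a sketch under stated assumptions, not the authors' implementation: gan_predict is a hypothetical stand-in for the trained anomaly-detector GAN of Algorithm 3, and client weights are assumed to arrive as flattened vectors.

```python
# Sketch of A3GAN's server-side filtering and aggregation, per (4)-(6).
# `gan_predict` is a hypothetical stand-in for the trained anomaly-detector
# GAN, assumed to output the expected weights W~k for a client's update.
import numpy as np

def a3gan_aggregate(client_weights, client_sizes, gan_predict):
    W = np.stack(client_weights)                 # shape: (K, num_params)
    W_tilde = np.stack([gan_predict(w) for w in W])
    A = np.sum((W - W_tilde) ** 2, axis=1)       # anomaly value, Eq. (5)
    tau = A.mean()                               # threshold, Eq. (6)
    keep = A <= tau                              # drop clients with A_k > tau
    n = np.asarray(client_sizes, dtype=float)[keep]
    # FedAvg-style weighted average over the surviving clients, Eq. (4)
    return np.average(W[keep], axis=0, weights=n / n.sum())
```

Clients whose updates sit far from what the anomaly-detector GAN predicts are thus excluded before the FedAvg-style average is formed.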
E. Time Complexity of Proposed Attacks

Here, we calculate the computational complexity of the proposed attack algorithms. Note that, in both algorithms, we dedicate part of the time to the training of the generator and discriminator functions. We consider a fixed rate for both attack algorithms.


Hence, the computational complexities of the attack methods are as follows.
1) GAN-Based Trivial Attack: In the GAN algorithm, we use two nested for loops, in which the outer loop is executed only 300 times (epochs = 300), but the internal for loop depends on the number of samples and has time complexity O(n). Inside these two loops, commands are used that each require the training data, with time complexity O(n) each; so, considering the two nested for loops, the total computational complexity is O(300 · n · n) = O(n²).
2) FedGAN Attack: In FedGAN, we use four nested for loops, of which the three external loops execute a fixed number of times (in this article, Round = 300, Clients = 10, epochs = 300). The internal for loop is the same as in the GAN-based trivial algorithm, so the computational complexity of FedGAN is O(n²). Due to the execution of the loops, the execution time is much longer than for the GAN-based trivial method, but it remains of the order O(n²).
3) A3GAN Defense: In A3GAN, lines 1–4 contain a for loop with time complexity O(n). In lines 5–14, we use two nested for loops, which form the main part of the algorithm, consume more time, and are of the order O(n). Thus, the total time complexity of this algorithm is O(n²).

V. PERFORMANCE EVALUATION

In this section, we report an experimental evaluation of the proposed attack and countermeasure algorithms.

A. Simulation Setup

We extracted the static features of the datasets and created a sparse matrix that maps each feature to a binary value (1 if the feature is present, and 0 otherwise). In the following, we describe the datasets and system settings.

Datasets: The tested IIoT datasets are as follows.
1) Drebin Dataset [34]: This dataset contained 131 611 Android samples representing benign and malware/malicious apps. A total of 96 150 of these samples were gathered from the Google Play store, 19 545 from the Chinese market, 2810 from the Russian market, and 13 106 from other Internet sources.
2) Genome Dataset [35]: This dataset contained 1200 Android malware samples, classified by installation method, activation mechanism, and malicious payload.
3) Contagio Dataset [36]: This dataset contained 16 800 benign and 11 960 malicious IoT samples.

Mobile Application Static Features: The considered IoT datasets consist of various features: permissions, intents, and API calls. Both malicious and benign samples contain these features. The permission feature indicates the information the app requires the Android OS to grant it. The intents feature consists of various calls to APIs, such as sending an SMS or accessing a user's location. Finally, the API calls feature represents a communication link among the various applications in the Android OS.

Parameter and System Setting: The FL model operates in three learning phases, namely training, validation, and testing, to which 60%, 20%, and 20% of the samples of each dataset are allocated, respectively. We conducted our experiments for IIoT devices on a 64-bit Win10 OS server equipped with an eight-core 4-GHz Intel Core i7 and 16 GB of RAM, using Python 3.6.4. We implemented our method using TensorFlow (version 1.12.0) and Keras (version 2.2.4) and built our GAN models in Fed-IIoT.

Methodologies: We present the adopted generator and discriminator functions in Figs. 4 and 5, respectively. The input to the generator network consists of examples of the Android files gathered from IoT devices. The generator model learns the data distribution and generates similar examples to deceive the discriminator network. The generated sample, created by the generator network, is fed as an input to the discriminator network, which tries to detect adversarial samples. If the sample is detected as adversarial, it is returned to the generator network. Also, if a sample cannot be detected by the discriminator network, it is added to the data as a poisoned sample. In this article, we use a CNN architecture for the generator and discriminator functions. Specifically, as shown in Fig. 4, we use a sequential CNN to design the generator: the first (Dense) layer is followed by three Conv2DTranspose blocks with BatchNormalization, ReLU, and Reshape layers. Similarly, as shown in Fig. 5, we use a sequential CNN that has two Conv2D layers, with LeakyReLU and Dropout between them; we also adopt Flatten and Dense in the final layers.
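For illustration, a hedged Keras sketch of the two networks described above follows; the layer widths, strides, and the 20×20×1 "image" shaping of the inputs are assumptions made for the sketch, not the paper's exact configuration (which is given in Figs. 4 and 5).

```python
# Illustrative Keras versions of the generator (Fig. 4) and discriminator
# (Fig. 5). Layer widths and the 20x20x1 reshape of the feature vector are
# assumptions for this sketch, not the paper's exact settings.
from tensorflow.keras import Sequential, layers

def build_generator(noise_dim=100):
    return Sequential([
        layers.Dense(5 * 5 * 64, use_bias=False, input_shape=(noise_dim,)),
        layers.Reshape((5, 5, 64)),
        layers.Conv2DTranspose(64, 3, strides=1, padding="same", use_bias=False),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(32, 3, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(1, 3, strides=2, padding="same",
                               activation="sigmoid"),   # 20x20x1 output
    ])

def build_discriminator(shape=(20, 20, 1)):
    return Sequential([
        layers.Conv2D(32, 3, strides=2, padding="same", input_shape=shape),
        layers.LeakyReLU(0.2), layers.Dropout(0.3),
        layers.Conv2D(64, 3, strides=2, padding="same"),
        layers.LeakyReLU(0.2), layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(1),          # real/fake logit
    ])
```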
Defense Methods: We adopt and adjust the two scenarios of Byzantine methods reported in [7] and [37], apply them with some modifications to Krum and Median, and utilize them as defense mechanisms against the proposed attacks.

Feature Selection and Metric: We rank the features using the RandomForestRegressor algorithm and select the 300 with the highest ranks. We use accuracy as our main metric for the experiments. Accuracy (A) is the ratio between the number of correct predictions and the total number of tested samples. Hence, we can define it as follows:

A = \frac{\zeta + \pi}{\zeta + \pi + \nu + \mu}   (7)

where ζ is the ratio of correctly classified benign samples, π is the ratio of correctly classified malware samples, ν is the ratio of wrongly classified benign samples, and μ is the ratio of wrongly classified malware samples. The Python implementation of Fed-IIoT is available in [38].
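A short sketch of the feature-ranking step and the metric in (7) follows; X and y are placeholders for a prepared binary feature matrix and its labels, and counts are used in place of ratios, which leaves the quotient in (7) unchanged.

```python
# Sketch of the RandomForestRegressor-based feature ranking and the
# accuracy metric of Eq. (7). X (binary feature matrix) and y (labels)
# are placeholders for one of the prepared datasets.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def top_k_features(X, y, k=300):
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(X, y)
    return np.argsort(rf.feature_importances_)[::-1][:k]  # highest ranks

def accuracy(y_true, y_pred):
    # zeta/pi: correctly classified benign/malware samples;
    # nu/mu: benign/malware samples that were misclassified.
    zeta = np.sum((y_true == 0) & (y_pred == 0))
    pi = np.sum((y_true == 1) & (y_pred == 1))
    nu = np.sum((y_true == 0) & (y_pred == 1))
    mu = np.sum((y_true == 1) & (y_pred == 0))
    return (zeta + pi) / (zeta + pi + nu + mu)
```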
B. Experimental Results

In this section, we test the proposed attack and defense mechanisms on the datasets and features described above.

Attack Algorithm Results: In the first experiment, we studied the GAN and FedGAN attack algorithms, which were applied to the traffic data gathered from ten Android devices connected to the global training model for the federated attack. We assumed that the GAN model on an IIoT device could keep only 300 binary examples for adversarial training. We tested our attack algorithms on all three feature types on the three datasets.


Fig. 6. Accuracy results of GAN and FedGAN attack approaches for different features on various datasets. No:= without attack; I:= intent; A:=
API; P:= permission. (a) Drebin dataset. (b) Contagio dataset. (c) Genome dataset.

We generate the adversarial examples as transfer attacks. We presume that we can produce the initial training model while the server-side federated model is unable to retrain the model and the adversary is also unable to gain access to the updated model, Mt+1. We set the number of epochs to 300, which is aligned with the binary example rates for adversarial training. We use each epoch to retrain all the collected example pairs on a device. After finalizing each epoch, the IIoT devices transfer the updated gradient information to their corresponding server to perform aggregation. Here, we present the prediction accuracy for the Drebin, Contagio, and Genome datasets using the proposed GAN and FedGAN federated attack algorithms. Fig. 6 shows the results of the implementation of the two proposed attack algorithms. In each of the subfigures, we present the results for the Drebin, Contagio, and Genome datasets using the API, permission, and intent features. On the x-axis of each subfigure, we indicate the number of epochs; on the y-axis, we present the accuracy as calculated in (7). The illustrated plots display three modes: no-attack, expressing that no attack was injected, and our two proposed attacks (i.e., the GAN attack and the FedGAN attack). From these figures, we can see that when the number of epochs increases, which is actually associated with the use of optimizers, the accuracy always increases for all methods. Fig. 6 shows that in no-attack mode, the accuracy is always higher than 98% for all datasets and all file properties, given a sufficient number of epochs. However, under the two types of attacks, GAN attack and FedGAN attack, the accuracy is drastically reduced; in some cases, it reaches less than 70%. This level of reduction is particularly noticeable in the case of malware data and binary features. By comparing the accuracy plots of the GAN attack and the FedGAN attack, we can see that in most cases the FedGAN attack is more damaging and can reduce the accuracy more than the GAN attack. The FedGAN attack could also cause a wider breach in the data privacy of each compromised participant. Focusing on the per-feature results in Fig. 6, it can be seen that the accuracies of all methods on API features are smaller than on permission and intent features, because of the smaller number of API samples. The last point to be made about Fig. 6 is that the results presented are for ten participants, only one of whom is an adversary. Obviously, increasing the number of adversaries while keeping the number of participants constant will cause the accuracy to decrease, while increasing the number of participants while keeping the number of adversaries constant will cause the accuracy to increase. As can be seen from the figures, after 300 epochs, the results have reached steady state, and it seems that increasing the number of epochs would not change the results significantly.
1) Comparing Algorithms Based on Accuracy in Different Rounds: Fig. 7 shows the accuracy of the proposed algorithms for different numbers of running rounds on the various datasets. In this figure, with approximately 250 rounds of running, the accuracy reaches steady state. This result is almost identical for all three datasets and all three types of API, permission, and intent files. In particular, Fig. 7(a) shows the accuracy associated with running the proposed attack methods on the Drebin dataset and confirms that while an attack has not yet taken place, even with 50 rounds, the accuracy for the different features is more than 93%. Using the GAN-based and FedGAN approaches, the accuracy is significantly reduced, to between 70% and 86%, respectively. As a result, with an increasing number of rounds, the accuracy initially increases, but within a maximum of 300 rounds of running, we achieve an almost constant accuracy.
2) Comparing Algorithms Based on Accuracy for Different Numbers of Clients: Fig. 8 compares the proposed attack methods for different numbers of clients. In this figure, it is assumed that only one of the clients is an adversary. When the number of clients is small, the adversary can poison a higher percentage of the data, and in this case, the accuracy will be lower. On the other hand, as the number of clients increases, each client, and therefore the adversary, uses a smaller percentage of the data to train its local model, resulting in less impact on accuracy. It should be noted that in the GAN-based attack algorithm, in which the model is created directly from the whole data, we have actually changed the percentage of poisoned data. Focusing on the no-adversary cases (the dashed lines), we assume that the training process is distributed, and we have a high accuracy rate. This figure also shows that when the number of clients increases, the accuracy decreases. This confirms that the model is triggered and affected by the larger amount of aggregated data, which influences the classification for FedGAN.


Fig. 7. Accuracy results of GAN and FedGAN attack approaches for different features on various datasets over various rounds. No:= without
attack; I:= intent; A:= API; P:= permission. (a) Drebin dataset. (b) Contagio dataset. (c) Genome dataset.

Fig. 8. Accuracy results of GAN and FedGAN attack approaches for different features on various datasets over various clients. No:= without
attack; I:= intent; A:= API; P:= permission. (a) Drebin dataset. (b) Contagio dataset. (c) Genome dataset.

Focusing on the attack algorithms, when an adversary is present among five clients, the accuracy is reduced to very low values for all datasets for both attack algorithms, and is lower for the GAN method.
3) Comparing Algorithms Based on Accuracy for Different Numbers of Adversaries: In Table II, we present the accuracy results of the GAN and FedGAN attack approaches with 1–5 adversaries (Ad1, . . . , Ad5) for different features on the various datasets. It is observed that increasing the number of adversaries in the FedGAN algorithm, or increasing the percentage of data poisoning in the GAN-based attack, decreases the accuracy.

TABLE II: Accuracy results of GAN and FedGAN attack approaches based on 1–5 adversaries (Ad1, . . . , Ad5) for different features on various datasets. No-Adv:= no adversaries, using the FL algorithm for ten clients; I:= intent; A:= API; P:= permission.

Defense Algorithm Results: In the next experiment, we compare the adjusted defense algorithms and verify their efficiency for the various features and datasets. Specifically, Fig. 9 shows the results of using the Byzantine defense algorithm against the two proposed attacks. We use two scenarios, namely Krum and Median, to inject the Byzantine algorithm. The results presented in Fig. 9 confirm the accuracy of data classification after the use of the defense algorithms that we designed, based on the federated algorithm reported inside the green rectangle illustrated in Fig. 3 as the server side. Formally speaking, on the participant side (see the red rectangle illustrated in Fig. 3 as the participant side), the adversary aims to execute the proposed attack mechanisms, build poisoned data, feed it into the local models during training, and send them to the server. On the server, the learning models receive the poisoned data from the local models, simultaneously check the data samples, immediately apply the two types of Byzantine algorithms, Byzantine Median (BM) and Byzantine Krum (BK), discover and remove the poisoned data, and generate the aggregated model by jointly using FL and GAN. Fig. 9(a)–(c) shows the accuracy results for the Drebin, Contagio, and Genome datasets, respectively. In these figures, the accuracy results are calculated based on the API, permission, and intent features. From these figures, we can see several achievements.

Fig. 9. Accuracy results of A3GAN and of the GAN/FedGAN attacks adjusted with the Byzantine Median (BM) and Byzantine Krum (BK) defense approaches for different features on various datasets. P:= permission; A:= API; I:= intent. (a) Drebin dataset. (b) Contagio dataset. (c) Genome dataset.

First, they present the accuracy enhancement of the defense algorithms, confirming the protection of the Byzantine solutions against the GAN and FedGAN attack algorithms. Second, they confirm that the accuracy ratio of the GAN-based countermeasures is higher than that of the FedGAN defense algorithms, and that the GAN techniques can increase the classification accuracy of the training model more quickly and more precisely than FedGAN, which requires two-level learning [see the bar plots of GAN-BK and GAN-BM in Fig. 9(a)–(c)]. Third, the results presented in Fig. 9 are achieved by running the defense methods for 300 epochs. GAN-BK is more robust than GAN-BM, because Krum can apply the discriminator algorithm much faster and more easily to detect the poisoned data. Fourth, our method (A3GAN) yields interesting accuracy results for all datasets. For example, the A3GAN accuracy ratio for intent features is around 96% for Genome, whereas the two Byzantine FedGAN models (FedGAN-BM and FedGAN-BK) only achieve accuracies of 89.51% and 93.24%, respectively. In other words, the robustness of our federated defense method against new attacks is better than that of other local adversarial training methods.

Discussion on the Results: As can be seen from Fig. 6, after applying the proposed attack methods, the accuracy values decrease by 20–30%. Using the two defensive methods (see Fig. 9), the accuracy then increases again by 10–15%. Among the API, permission, and intent features, it can be seen that, due to the small number of samples with API features, after using the defense method the accuracy increases less for API features than for the other two features. Comparing the two attack methods, GAN-based and FedGAN, it can be seen that the defense methods have almost always been successful against the GAN-based attack method; in other words, FedGAN is the more robust attack method. A comparison of the defense methods also shows that the Krum-based Byzantine method is more successful than the Median-based method; it results in a higher accuracy.

VI. CONCLUSION

In this article, we proposed a robust FL Android architecture for malware detection called Fed-IIoT. Our scheme consists of two components: a participant side and a server side. The attacker uses two GAN-based algorithms to generate adversarial examples and injects them into the dataset of each IIoT application. On the server side, we propose one defense algorithm and adjust two Android malware detection schemes that use GAN and FL algorithms to accurately detect a malicious model and delete the poisoned samples. The results of a comprehensive set of experiments confirm that our methods outperform existing defense-based schemes in terms of accuracy. In future work, we will explore the use of robust ensemble learning based on a GAN model and analyze the anomalous behavior of IIoT samples, especially for heterogeneous streaming Android applications. We will also consider robust data aggregation techniques, such as information fusion, to enhance the GAN and federating models in IIoT applications.

REFERENCES

[1] L. D. Xu, W. He, and S. Li, "Internet of Things in industries: A survey," IEEE Trans. Ind. Informat., vol. 10, no. 4, pp. 2233–2243, Nov. 2014.
[2] M. Alazab et al., "A hybrid wrapper-filter approach for malware detection," J. Netw., vol. 9, no. 11, pp. 2878–2891, 2011.
[3] L. Zhao et al., "Shielding collaborative learning: Mitigating poisoning attacks through client-side detection," IEEE Trans. Dependable Secure Comput., pp. 1–13, 2020, doi: 10.1109/TDSC.2020.2986205.
[4] M. Alazab, R. Layton, R. Broadhurst, and B. Bouhours, "Malicious spam emails developments and authorship attribution," in Proc. 4th Cybercrime Trustworthy Comput. Workshop, 2013, pp. 58–68.
[5] Q. Yang, Y. Liu, T. Chen, and Y. Tong, "Federated machine learning: Concept and applications," ACM Trans. Intell. Syst. Technol., vol. 10, no. 2, pp. 1–19, 2019.
[6] S. K. Lo, Q. Lu, C. Wang, H. Paik, and L. Zhu, "A systematic literature review on federated machine learning: From a software engineering perspective," 2020, arXiv:2007.11354.
[7] P. Blanchard et al., "Machine learning with adversaries: Byzantine tolerant gradient descent," in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 119–129.
[8] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov, "How to backdoor federated learning," in Proc. Int. Conf. Artif. Intell. Statist., 2020, pp. 2938–2948.
[9] C. Zhang, S. Li, J. Xia, W. Wang, F. Yan, and Y. Liu, "BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning," in Proc. USENIX Annu. Tech. Conf., 2020, pp. 493–506.
[10] Z. Sun, P. Kairouz, A. T. Suresh, and H. B. McMahan, "Can you really backdoor federated learning?," 2019, arXiv:1911.07963.
[11] S. Mishra and S. Jain, "Ontologies as a semantic model in IoT," Int. J. Comput. Appl., vol. 42, no. 3, pp. 233–243, 2020.
[12] S. Li, Y. Cheng, W. Wang, Y. Liu, and T. Chen, "Learning to detect malicious clients for robust federated learning," 2020, arXiv:2002.00211.
[13] S. Fu, C. Xie, B. Li, and Q. Chen, "Attack-resistant federated learning with residual-based reweighting," 2019, arXiv:1912.11464.
[14] J. Zhang and C. Li, "Adversarial examples: Opportunities and challenges," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 7, pp. 2578–2593, Jul. 2020.
[15] B. Biggio et al., "Poisoning behavioral malware clustering," in Proc. Workshop Artif. Intell. Secur., 2014, pp. 27–36.
[16] Q. Han, V. Subrahmanian, and Y. Xiong, "Android malware detection via (somewhat) robust irreversible feature transformations," IEEE Trans. Inf. Forensics Secur., vol. 15, pp. 3511–3525, 2020.


[17] H. Zhu, Y. Li, R. Li, J. Li, Z.-H. You, and H. Song, "SEDMDroid: An enhanced stacking ensemble of deep learning framework for Android malware detection," IEEE Trans. Netw. Sci. Eng., early access, 2020, doi: 10.1109/TNSE.2020.2996379.
[18] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-An, and H. Ye, "Significant permission identification for machine-learning-based Android malware detection," IEEE Trans. Ind. Informat., vol. 14, no. 7, pp. 3216–3225, Jul. 2018.
[19] H. Naeem et al., "Malware detection in industrial Internet of Things based on hybrid image visualization and deep learning model," Ad Hoc Netw., vol. 105, 2020, Art. no. 102154.
[20] J. Dean et al., "Large scale distributed deep networks," in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1223–1231.
[21] Y. Song, T. Liu, T. Wei, X. Wang, Z. Tao, and M. Chen, "FDA3: Federated defense against adversarial attacks for cloud-based IIoT applications," IEEE Trans. Ind. Informat., pp. 1–9, 2020, doi: 10.1109/TII.2020.3005969.
[22] M. Fang, X. Cao, J. Jia, and N. Gong, "Local model poisoning attacks to Byzantine-robust federated learning," in Proc. 29th USENIX Secur. Symp., 2020, pp. 1605–1622.
[23] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-efficient learning of deep networks from decentralized data," in Proc. Artif. Intell. Statist., 2017, pp. 1273–1282.
[24] F. Sattler, S. Wiedemann, K.-R. Müller, and W. Samek, "Robust and communication-efficient federated learning from non-i.i.d. data," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 9, pp. 3400–3413, Sep. 2020.
[25] Y. Chen, L. Su, and J. Xu, "Distributed statistical machine learning in adversarial settings: Byzantine gradient descent," Proc. ACM Meas. Anal. Comput. Syst., vol. 1, no. 2, pp. 1–25, 2017.
[26] L. Chen, H. Wang, Z. Charles, and D. Papailiopoulos, "DRACO: Byzantine-resilient distributed training via redundant gradients," in Proc. 35th Int. Conf. Mach. Learn., vol. 80, 2018, pp. 903–912.
[27] L. Muñoz-González, K. T. Co, and E. C. Lupu, "Byzantine-robust federated machine learning through adaptive model averaging," 2019, arXiv:1909.05125.
[28] W. Zhang et al., "Dynamic fusion based federated learning for COVID-19 detection," 2020, arXiv:2009.10401.
[29] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, "Federated learning: Strategies for improving communication efficiency," 2016, arXiv:1610.05492.
[30] M. Alazab and R. Broadhurst, "An analysis of the nature of spam as cybercrime," in Cyber-Physical Security. New York, NY, USA: Springer, 2017, pp. 251–266.
[31] W. Zhang et al., "Blockchain-based federated learning for device failure detection in industrial IoT," IEEE Internet Things J., early access, doi: 10.1109/JIOT.2020.3032544.
[32] I. Goodfellow et al., "Generative adversarial nets," in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 2672–2680.
[33] S. Li, Y. Cheng, Y. Liu, W. Wang, and T. Chen, "Abnormal client behavior detection in federated learning," 2019, arXiv:1910.09933.
[34] D. Arp, M. Spreitzenbarth, H. Gascon, K. Rieck, and C. Siemens, "DREBIN: Effective and explainable detection of Android malware in your pocket," in Proc. Netw. Distrib. Syst. Secur. Symp., 2014, pp. 23–26.
[35] X. Jiang and Y. Zhou, "Dissecting Android malware: Characterization and evolution," in Proc. IEEE Symp. Secur. Privacy, 2012, pp. 95–109.
[36] "Contagio dataset," 2018. [Online]. Available: http://contagiominidump.blogspot.com/. Accessed: Dec. 4, 2020.
[37] M. B. Cohen, Y. T. Lee, G. Miller, J. Pachocki, and A. Sidford, "Geometric median in nearly linear time," in Proc. 48th Annu. ACM Symp. Theory Comput., 2016, pp. 9–21.
[38] "Fed-IIoT source code," 2020. [Online]. Available: https://github.com/mshojafar/sourcecodes/raw/master/FeD-IIoT_sourcecode.zip

Rahim Taheri received the B.Sc. degree in computer engineering from the Bahonar Technical and Engineering College of Shiraz, Shiraz, Iran, in 2007, and the M.Sc. degree in computer networks and the Ph.D. degree in information technology and computer networks from the Shiraz University of Technology, Shiraz, Iran, in 2015 and 2020, respectively. In 2018, he was a visiting Ph.D. student with the SPRITZ Security and Privacy Research Group, University of Padua, Padua, Italy. His main research interests include adversarial machine learning, network security and differential privacy, security of cloud storage, and software-defined networks.

Mohammad Shojafar (Senior Member, IEEE) received the Ph.D. degree in information communication and telecommunication (ICT) (with an "Excellent" degree) from the Sapienza University of Rome, Rome, Italy, in 2016. He is currently a Senior Lecturer (Associate Professor) in network security, an Intel Innovator, and a Marie Curie Alumnus, working in the 6G Innovation Centre (6GIC) with the University of Surrey, Guildford, U.K. Before joining 6GIC, he was a Senior Researcher and a Marie Curie Fellow with the SPRITZ Security and Privacy Research Group, University of Padua, Padua, Italy. He was also a CNIT Senior Researcher with the University of Rome Tor Vergata and contributed to the 5G PPP European H2020 "SUPERFLUIDITY" project. He was a PI of the PRISENODE project, a 275k Euro Horizon 2020 Marie Curie global fellowship project in the areas of fog/cloud security, collaborating with the University of Padua. He was also a PI on an Italian SDN security and privacy project (60k Euro) supported by the University of Padua in 2018 and a Co-PI on an Ecuadorian-British project on IoT and Industry 4.0 resource allocation (20k dollars) in 2020. He has contributed to several Italian telecommunications projects, such as GAUChO, SAMMClouds, and SC2. Dr. Shojafar is an Associate Editor for the IEEE TRANSACTIONS ON CONSUMER ELECTRONICS and IET Communications.

Mamoun Alazab (Senior Member, IEEE) received the Ph.D. degree in computer science from the School of Science, Information Technology and Engineering, Federation University Australia, Mount Helen, VIC, Australia, in 2012. He is currently an Associate Professor with the College of Engineering, IT and Environment, Charles Darwin University, Casuarina, NT, Australia. He is a cyber-security researcher and practitioner with industry and academic experience. He has authored or coauthored more than 150 research papers in many international journals and conferences. His research is multidisciplinary and focuses on cyber-security and digital forensics of computer systems, with an emphasis on cybercrime detection and prevention. Dr. Alazab is the Founding Chair of the IEEE Northern Territory Subsection.

Rahim Tafazolli (Senior Member, IEEE) is currently a Professor and the Director of the Institute for Communication Systems and the 6G Innovation Centre (6GIC), University of Surrey, Guildford, U.K. He has more than 30 years of experience in digital communications research and teaching. He is the editor of two books, Technologies for Wireless Future (Wiley, Vol. 1 in 2004 and Vol. 2 in 2006). He is a coinventor on more than 30 granted patents, all in the field of digital communications. He has authored or coauthored more than 500 research papers in refereed journals and international conferences, and has been an invited speaker. Prof. Tafazolli was appointed a Fellow of the Wireless World Research Forum in April 2011, in recognition of his personal contribution to the wireless world. He heads one of Europe's leading research groups. He is regularly invited by governments to advise on network and 5G technologies and was an advisor to the Mayor of London with regard to the London Infrastructure Investment 2050 Plan during May and June 2014.
