Tutorials

Tutorial 1

Title: Development of a Production Ready Data Analytics Pipeline for Real-time Threat Detection
Speaker: Jeff Schwartzentruber (2Keys Corporation, Canada)

Abstract: The increase in digitization and security threats has raised the demand for systems that can handle large amounts of streaming data with advanced analytics capabilities and low latency. Participants will first be given an introduction to the current threat landscape and modern approaches to detecting cyber-attacks. This tutorial will give viewers an understanding of the system requirements and an overview of a predominant modelling technique (anomaly detection) used in the cyber-security space. The goal of this tutorial is to provide an in-depth understanding of how to develop and implement a distributed monitoring solution using open source software, and of the theory behind anomaly detection using Bayesian methods.

Dr. Jeff Schwartzentruber holds the position of Principal Data Scientist and Research Lead at 2Keys Corporation. Dr. Schwartzentruber received his PhD in Mechanical Engineering from Ryerson University with a focus on analytical process modelling and is a fellow of the Ontario Centre of Excellence. In his role at 2Keys, Dr. Schwartzentruber is responsible for the continued development, innovation and leadership of machine learning and data science capabilities at the intersection of identity and access management, advanced threat analytics and response, and managed security services. Jeff's research interests include machine learning (particularly deep learning and boosted trees), real-time anomaly detection, and analytical/semi-empirical model development for security and business applications.

Tutorial 2

Title: Blockchains for Industrial IoT - solutions, use-cases, and network management issues
Speakers: Pal Varga (Budapest University of Technology and Economics, Hungary) and Ferenc Nandor Janky (Budapest University of Technology and Economics, Hungary)

Abstract: Utilizing Blockchains within the Internet of Things (IoT) concept is quite a recent idea. There are already a number of use cases and supporting frameworks available, which show its potential benefits for many domains. There are interesting, business-driven target areas within the Industrial IoT (IIoT) domain, including sectors such as supply chain (including manufacturing, transportation and logistics), maintenance, energy trading, grids, and even healthcare. When compared to consumer IoT, these systems have special requirements: a certain level of real-time operation, security, engineering complexity, multi-stakeholder visibility, fast transactions and asset traceability. While Distributed Ledger Technology (DLT) already addresses some of these areas (such as multi-stakeholder visibility or asset traceability), Blockchain Technology (BCT) provides additional value for security, building trust, and reducing cost while accelerating transactions of service agreements. This tutorial aims to reveal the opportunities and challenges, and to present real-life examples together with network management aspects. First, it provides an overview and definitions of the BCT universe – from Assets and Blocks through Consensus Mechanisms and Distributed Ledgers to Wallets. Next, it describes some special requirements of the Industrial IoT domain together with ideas for utilizing BCT to cover these needs. While discussing benefits, the tutorial reveals some drawbacks as well. These help us answer the questions: when is it beneficial to use BCT, when is it questionable, and when is it avoidable? Furthermore, the tutorial provides insights into various use-cases of employing BCT and smart service contracts in healthcare, electricity trading, production, asset tracking and proactive maintenance. Aside from being interesting simply because they are becoming core technologies of near-future systems, IIoT and Blockchains have a network management viewpoint as well.
The IIoT end-devices need on-boarding, their data needs to be secured, authenticity needs to be checked, and trust needs to be built – all tasks for which BCT can be utilized effectively. Moreover, as part of configuration management, reliable and secure firmware distribution and upgrade can be supported inherently. Regarding implementations, instead of the well-known Blockchains that are used as cryptocurrencies (e.g., Bitcoin, Ethereum), this tutorial presents other realizations, such as IoTcoin, IOTA, or HDAC, which target IIoT applications. The practical part of the tutorial will include the following parts: (i) implementing a simple smart-contract based distributed application to reinforce the concepts learned on Blockchains and to introduce a selected distributed application framework with its programmer's interface; (ii) creating a more complex distributed smart-contract based solution for modelling the product life-cycle in an IIoT setting using the framework introduced; (iii) performing a simulation by adding IIoT actors to the system and measuring throughput, convergence time, latency, computational requirements on end devices, etc.; and (iv) analyzing the measurement results, implementing potential system tweaks for the IoT use case, and verifying them with a subsequent measurement.

Pal Varga currently holds an associate professor position at Budapest University of Technology and Economics (BME), where he teaches various subjects, including "IoT frameworks and industrial applications", which partially covers the topic of the current tutorial. Besides being active in the network and service management research community, he works in the Industrial IoT field as well. His research covers IoT frameworks, interoperability and integrability issues, heterogeneous IoT systems, protocol translation, service oriented architectures, Industrie 4.0 use-cases, IoT security, IoT lifecycle management, smart service contracts, and Blockchains for IIoT. He is currently the Editor-in-Chief of the Infocommunications Journal, published by the Scientific Association for Infocommunications, Hungary (HTE), a Sister Society of IEEE.

Ferenc Nandor Janky is currently a PhD student at Budapest University of Technology and Economics (BME), where his thesis research focuses on process and life-cycle modelling in Industrial IoT frameworks. He graduated with a Master's in Electrical Engineering from BME with a specialization in Infocommunication Systems in 2013. He has several years of industrial experience gained at various telecommunications companies such as Vodafone, AITIA International Inc., and Ericsson. Besides his PhD studies, he is currently working in the financial industry developing low-latency trading applications.

Tutorial 3

Title: Decentralized and Privacy-preserving Training of Machine Learning Models
Speaker: Ali Vahdat (Huawei Noah’s Ark Lab, Canada)

Abstract: Traditional supervised machine learning (ML) methods rely heavily on (labeled) data for training a model with decent predictive performance. More recent deep learning (DL) methods tend to have an even stronger dependency on huge volumes of data. State-of-the-art deep computer vision or natural language processing methods, for example, have millions of parameters to learn, and tuning this huge number of parameters requires access to even more training data: millions or even billions of data points. While creating opportunities to improve the performance of ML models, this amount of training data introduces several issues of its own. Storing huge amounts of data requires large data centers, which comes at a cost. Storing data in a data center is only an option when the data to be stored is not privacy sensitive. When user data is private, storing it on a server is no longer an option. Consider the scenario where user data is generated and stored by their mobile phones. Examples of this scenario include photos users take, videos they record, words they type, locations they visit, and even interactions with their mobile phones -- such as clicking an OS or app notification, searching for an OS feature or app, etc. All these events generate and store privacy-sensitive data on the user's device. In settings where having access to user data is not an option, other techniques are required to train the model without logging user data on a data center server. This is increasingly important as both the White House and the European Parliament have initiatives to safeguard the privacy of consumer data. In 2016, researchers at Google proposed a simple yet smart idea to mitigate the user data privacy issue [1, 2, 3]. They advocate a simple distributed optimization technique rather than the traditional centralized (i.e., server-side) training of the global model. Their approach starts by training a simple global model on a server. The model is downloaded to all local devices.
Each local device receives a copy of the global model and uses its own data to train and optimize the local model -- usually using some variation of stochastic gradient descent. Local devices then upload their trained models (or an update reflecting their change since the last update) back to the server. Having access to improved models from different local devices, the server then aggregates the client models and updates the global model. The global model update can be as simple as averaging the local models. They call their method Federated Learning (FL) since the task is solved by a loose federation of participating clients [1]. Federated learning decentralizes model training so that the main training is performed on local clients using local client data, with no need for centralized storage of data on a server. Effectively, federated learning decouples training of the model from direct access to data.
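
The training loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the method from [1]: it assumes a one-dimensional linear model, synthetic client data, a global model initialised to zero rather than pre-trained on the server, and an average weighted by each client's number of samples. All function names (`local_train`, `federated_average`, `federated_round`, `make_client`) are hypothetical.

```python
import random

def local_train(weights, data, lr=0.05, epochs=5):
    """Client step: run plain SGD on y = w*x + b using only local data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local dataset size
    (the weighting is an assumption; a plain mean would also fit the text)."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def federated_round(global_weights, clients):
    """One round: send the global model out, train locally, aggregate."""
    updates = [local_train(global_weights, data) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updates, sizes)

def make_client(n, rng):
    """Synthetic private data drawn from y = 2x + 1 plus small noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client(n, rng) for n in (20, 30, 50)]

# Assumption: the global model starts at zero instead of being pre-trained.
global_weights = (0.0, 0.0)
for _ in range(30):
    global_weights = federated_round(global_weights, clients)

print(global_weights)  # should be close to the true parameters (2, 1)
```

Only model parameters travel between clients and server in this sketch; the raw `(x, y)` pairs never leave the client lists, which is the decoupling of training from data access that the abstract describes.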

Ali Vahdat is a Machine Learning Research Scientist at Huawei Noah’s Ark Lab in Montreal. He holds a Ph.D. in Computer Science from Dalhousie University, Canada. In his Ph.D. thesis he developed an algorithm for subspace clustering using genetic programming, a branch of evolutionary computation. Ali has been involved in machine learning and artificial intelligence conferences over the last decade. Before joining Huawei, he served as an Assistant Professor at Sobey School of Business at Saint Mary's University in Halifax between 2014 and 2017. Ali has more than 20 papers in various machine learning and computer networks conferences and journals, and has served on the technical program committees of multiple conferences and as a reviewer for multiple journals. Ali’s main research topics include Machine Learning, Deep Learning, Unsupervised Learning, and Reinforcement Learning, to name a few.

Tutorial 4

Title: Flow Based Network Traffic Analysis
Speaker: Ali Safari Khatouni (Dalhousie University, Canada)

Abstract: Analyzing and understanding network traffic is a vital requirement for different network and security monitoring/planning tools. The evolution of Internet services and protocols has caused traditional traffic classification approaches to be ineffective in certain cases. Key causes of the inaccuracy include: (i) the increase in encrypted traffic; (ii) the rise in the usage of dynamic port numbers for different applications; and (iii) multiple services and applications running over HTTP or HTTPS. Traditional solutions for traffic analysis, classification, and measurement fall short in providing visibility into users' activities - a key requirement for network and security monitoring tools. In this tutorial, we present a classifier for encrypted traffic (e.g., social media, video, and audio) that does not rely on particular L7 header fields that can be easily modified. We leverage Machine Learning (ML) algorithms for classification, which can be tuned based on the needs of the network manager. We present the impact of the initial feature set that can be obtained from four popular off-the-shelf network flow exporters. Then, we demonstrate the effectiveness of the proposed approach. In this tutorial, the participants will learn how the choice of the initial feature set from an off-the-shelf traffic analyzer can affect the performance of the classifier. They will also learn how to use the proposed solution to model and understand different types of encrypted traffic behaviors to identify encrypted applications, and how to use ML-based approaches to analyze the traffic and explore the most representative features.
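
To make the notion of a flow-level feature set concrete, the sketch below groups packet records into unidirectional flows by their 5-tuple and derives a few simple statistics per flow. It is only an illustration under stated assumptions: the packet record layout, the `export_flows` function, and the chosen features are hypothetical and do not reproduce the output of any of the flow exporters discussed in the tutorial.

```python
from collections import defaultdict

def export_flows(packets):
    """Group packets by 5-tuple and compute per-flow statistics.

    Assumption: each packet is a dict with the keys ts, src, dst, sport,
    dport, proto and size; flows are treated as unidirectional.
    """
    groups = defaultdict(list)
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        groups[key].append(pkt)

    flows = {}
    for key, pkts in groups.items():
        times = [p["ts"] for p in pkts]
        sizes = [p["size"] for p in pkts]
        flows[key] = {
            "packets": len(pkts),                  # packet count
            "bytes": sum(sizes),                   # total volume
            "duration": max(times) - min(times),   # seconds between first and last packet
            "mean_size": sum(sizes) / len(pkts),   # average packet size
        }
    return flows

# Hypothetical packet records; no payload or L7 header fields are used.
packets = [
    {"ts": 0.0, "src": "10.0.0.1", "dst": "10.0.0.2", "sport": 50000,
     "dport": 443, "proto": "TCP", "size": 100},
    {"ts": 0.5, "src": "10.0.0.1", "dst": "10.0.0.2", "sport": 50000,
     "dport": 443, "proto": "TCP", "size": 300},
    {"ts": 0.2, "src": "10.0.0.3", "dst": "10.0.0.2", "sport": 50001,
     "dport": 443, "proto": "TCP", "size": 60},
]

flows = export_flows(packets)
for key, features in flows.items():
    print(key, features)
```

Feature vectors of this kind, which depend on packet sizes and timing rather than on payload content, are what a classifier for encrypted traffic would typically be trained on.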

Dr. Ali Safari Khatouni received his B.S. degree in Software Engineering from Urmia University, Iran, and his M.S. and Ph.D. degrees from the Department of Electrical and Computer Engineering at Politecnico di Torino, Italy. He is currently a Postdoctoral Fellow at the Faculty of Computer Science at Dalhousie University, in Prof. Nur Zincir-Heywood’s research group. His research interests lie in the areas of network traffic analysis, machine learning, and mobile broadband networks. He has teaching experience at the graduate and undergraduate levels. He has been an instructor for the “Mobile Computing”, “Introduction to database systems”, and “Introduction to Data Mining and Data Warehousing” courses at Dalhousie University. He also has teaching experience in the graduate-level “Network measurement laboratory” at Politecnico di Torino. Moreover, he has gained valuable experience in several European research projects (Mplane, MONROE).