A Machine Learning Based Method to Detect System Failure in Resource Constrained Environment
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 87366
A Machine Learning Based Method to Detect System Failure in Resource Constrained Environment

Authors: Payel Datta, Abhishek Das, Abhishek Roychoudhury, Dhiman Chattopadhyay, Tanushyam Chattopadhyay

Abstract:

Machine learning (ML) and deep learning (DL) is most predominantly used in image/video processing, natural language processing (NLP), audio and speech recognition but not that much used in system performance evaluation. In this paper, authors are going to describe the architecture of an abstraction layer constructed using ML/DL to detect the system failure. This proposed system is used to detect the system failure by evaluating the performance metrics of an IoT service deployment under constrained infrastructure environment. This system has been tested on the manually annotated data set containing different metrics of the system, like number of threads, throughput, average response time, CPU usage, memory usage, network input/output captured in different hardware environments like edge (atom based gateway) and cloud (AWS EC2). The main challenge of developing such system is that the accuracy of classification should be 100% as the error in the system has an impact on the degradation of the service performance and thus consequently affect the reliability and high availability which is mandatory for an IoT system. Proposed ML/DL classifiers work with 100% accuracy for the data set of nearly 4,000 samples captured within the organization.

Keywords: machine learning, system performance, performance metrics, IoT, edge

Procedia PDF Downloads 195