Search results for: encrypted traffic classification
2446 A Computer-Aided System for Detection and Classification of Liver Cirrhosis
Authors: Abdel Hadi N. Ebraheim, Eman Azomi, Nefisa A. Fahmy
Abstract:
This paper designs and implements a computer-aided system (CAS) to help detect and diagnose liver cirrhosis in patients with chronic hepatitis C. Our system reduces the tests a patient is asked to undergo to a minimal, most informative subset while keeping diagnostic accuracy above 99%, saving both time and cost. As classifiers, we use a Support Vector Machine (SVM) with cross-validation, a Multilayer Perceptron Neural Network (MLP), and a Generalized Regression Neural Network (GRNN), which uses radial basis functions for function approximation. Our system is tested on 199 subjects, 99 of whom have chronic hepatitis C. The subjects were selected from the outpatient clinic of the National Hepatology and Tropical Medicine Research Institute (NHTMRI).
Keywords: liver cirrhosis, artificial neural network, support vector machine, multi-layer perceptron, classification, accuracy
Procedia PDF Downloads 462
2445 Applying Unmanned Aerial Vehicle on Agricultural Damage: A Case Study of the Meteorological Disaster on Taiwan Paddy Rice
Authors: Chiling Chen, Chiaoying Chou, Siyang Wu
Abstract:
Taiwan is located in the western Pacific Ocean, where continental and marine climates meet. Typhoons frequently strike Taiwan and bring meteorological disasters such as heavy flooding, landslides, and loss of life and property. Global climate change brings more extreme meteorological disasters. Techniques for disaster prevention and mitigation are therefore needed, and improving rescue and rehabilitation processes is important as well. In this study, UAVs (Unmanned Aerial Vehicles) are used to take immediate images to improve disaster investigation and rescue processes. The study area is paddy rice fields in central Taiwan, which were hit by heavy rain during the monsoon season in June 2016. UAV images provide high ground resolution (3.5 cm) with 3D point clouds, which are used to develop image discrimination techniques and a digital surface model (DSM) of rice lodging. First, supervised image classification with the Maximum Likelihood Method (MLD) is used to delineate the area of rice lodging. Second, 3D point clouds generated by Pix4D Mapper are used to develop a DSM for classifying the lodging levels of paddy rice. The discrimination accuracy for rice lodging is 85% with supervised image classification, and the classification accuracy for lodging level is 87% with the DSM. UAVs therefore not only provide immediate images of agricultural damage after a meteorological disaster, but image discrimination of rice lodging also reaches acceptable accuracy (>85%). In the future, UAV and image discrimination technologies will be applied to other crop fields. The image discrimination results will be overlaid with the administrative boundaries of paddy rice fields to establish a GIS-based assistance system for agricultural damage discrimination, which would greatly reduce the time and labor needed for damage detection and monitoring.
Keywords: Monsoon, supervised classification, Pix4D, 3D point clouds, discriminate accuracy
Procedia PDF Downloads 302
2444 The Role of Arousal in Time Perception: Implications for Emotional Driving
Authors: Ewa Siedlecka
Abstract:
Emotional stress is an important risk factor in the rate and severity of traffic accidents. Moreover, incorrect time perception is implicated in the increase of traffic violations, such as running red lights or collisions. While the role of emotional arousal in perceived time is well established, the role of physiological arousal in time perception remains unexamined. Specific emotions can, however, be associated with distinct physiological responses. In the current research, two studies examined the role of physiological arousal in time perception. In the first experiment, 41 participants engaged in a cold pressor task and had their time perception measured throughout the experiment. In the second study, 138 participants engaged in either isometric or deep breathing exercises. These activities were designed to stimulate the sympathetic and parasympathetic nervous systems, respectively. In both studies, participants completed a bisection task to measure time perception, and their physiological response was recorded via electrocardiography (ECG). The results show that activation of the parasympathetic nervous system is associated with greater time perception. These findings are discussed with reference to models of time perception, as well as implications for emotional driving and misperceptions of speed. It is important to consider the role of physiology in the misperception of time, as these factors can lead to increases in driving accidents.
Keywords: emotions, nervous system, physiology, time perception
Procedia PDF Downloads 326
2443 Detection of Phoneme [S] Mispronunciation for Sigmatism Diagnosis in Adults
Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura
Abstract:
The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. This study proposes a methodology for classifying incorrectly pronounced instances of the phoneme [s]. The recordings come from adults and were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech was collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with their corresponding amplitudes are determined for the entire segment. To aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed feature aggregation reduces the size of the feature vector used in classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, the other of normative ones. The proposed feature vector yields classification sensitivity and specificity above 90% for individual logo phones. Adding fricative formant-based information improves the MFCC-only classification results by an average of 5 percentage points.
The study shows that using parameters specific to the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.
Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing
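The aggregation step described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the list-based input layout, and the choice of the "inclusive" quantile method are assumptions, and the segment-level formant features are simply appended unchanged.

```python
from statistics import mean, quantiles


def percentile_75(values):
    # quantiles(n=4) returns the three quartile cut points;
    # index 2 is the 75th percentile. Needs at least two values.
    return quantiles(values, n=4, method="inclusive")[2]


def aggregate_segment(mfcc_frames, rms_frames, formant_features=()):
    """Collapse per-frame features of one [s] segment into one vector.

    mfcc_frames: list of frames, each a list of MFCC values (hypothetical layout)
    rms_frames: list of per-frame RMS values
    formant_features: segment-level values, appended as they are
    """
    n_coeffs = len(mfcc_frames[0])
    # Each MFCC coefficient is averaged over the frames of the segment.
    mfcc_means = [mean(frame[k] for frame in mfcc_frames) for k in range(n_coeffs)]
    # Other per-frame features (here RMS) are summarized by their 75th percentile.
    return mfcc_means + [percentile_75(rms_frames)] + list(formant_features)
```

With 13 MFCCs, one RMS value, and three formants with their amplitudes, the vector has a fixed length regardless of how many frames the segment spans, which is what keeps the classifier input small.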
Procedia PDF Downloads 285
2442 Transformation of Positron Emission Tomography Raw Data into Images for Classification Using Convolutional Neural Network
Authors: Paweł Konieczka, Lech Raczyński, Wojciech Wiślicki, Oleksandr Fedoruk, Konrad Klimaszewski, Przemysław Kopka, Wojciech Krzemień, Roman Shopa, Jakub Baran, Aurélien Coussat, Neha Chug, Catalina Curceanu, Eryk Czerwiński, Meysam Dadgar, Kamil Dulski, Aleksander Gajos, Beatrix C. Hiesmayr, Krzysztof Kacprzak, Łukasz Kapłon, Grzegorz Korcyl, Tomasz Kozik, Deepak Kumar, Szymon Niedźwiecki, Dominik Panek, Szymon Parzych, Elena Pérez Del Río, Sushil Sharma, Shivani Shivani, Magdalena Skurzok, Ewa Łucja Stępień, Faranak Tayefi, Paweł Moskal
Abstract:
This paper develops the transformation of non-image data into 2-dimensional matrices as a preparation stage for classification based on convolutional neural networks (CNNs). In positron emission tomography (PET) studies, a CNN may be applied directly, as a pattern recognition tool, to the reconstructed distribution of radioactive tracers injected into the patient's body. Nonetheless, much PET data still exists in non-image format, which raises the question of whether it can be used for training a CNN. The main focus of this paper is the problem of processing vectors with a small number of features in comparison to the number of pixels in the output images. The proposed methodology was applied to the classification of PET coincidence events.
Keywords: convolutional neural network, kernel principal component analysis, medical imaging, positron emission tomography
Procedia PDF Downloads 147
2441 Investigating the Characteristics of Correlated Parking-Charging Behaviors for Electric Vehicles: A Data-Driven Approach
Authors: Xizhen Zhou, Yanjie Ji
Abstract:
To advance the management of integrated electric vehicle (EV) parking-charging behaviors, this study uses Changshu City in Suzhou as a case study to establish a data association mechanism for parking-charging platforms and to develop a database of EV parking-charging behaviors. Key indicators, such as charging start time, initial state of charge, final state of charge, and parking-charging time difference, are considered. Using the K-S test, the paper examines the heterogeneity of parking-charging behavior preferences between pure EV and non-pure EV users. K-means clustering is employed to analyze the characteristics of parking-charging behaviors for both user groups, thereby enhancing the overall understanding of these behaviors. The findings reveal that, using a classification model, the parking-charging behaviors of pure EVs can be classified into five distinct groups, while those of non-pure EVs can be separated into four groups. Among them, both types of EV users exhibit groups with low range anxiety for complete charging with special journeys, complete charging at destination, and partial charging. Additionally, both types have a group with high range anxiety: pure EV users in this group prefer complete charging with specific journeys, while non-pure EV users prefer complete charging. Notably, pure EV users also display a significant group engaging in nocturnal complete charging. The findings of this study can provide technical support for the scientific and rational layout and management of integrated parking and charging facilities for EVs.
Keywords: traffic engineering, potential preferences, cluster analysis, EV, parking-charging behavior
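The abstract compares two user groups with the K-S test. As a hedged illustration of what that test measures, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical distribution functions. The function name and the sample values are hypothetical, and the p-value computation that a full test needs is left out.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max distance between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # those points is enough to find the maximum gap.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical charging start hours for two user groups.
pure_ev_start_hours = [1, 2, 3, 4]
non_pure_ev_start_hours = [3, 4, 5, 6]
gap = ks_statistic(pure_ev_start_hours, non_pure_ev_start_hours)
```

A statistic near 0 means the two groups have similar distributions for that indicator; a value near 1 means they barely overlap.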
Procedia PDF Downloads 81
2440 Using Probabilistic Neural Network (PNN) for Extracting Acoustic Microwaves (Bulk Acoustic Waves) in Piezoelectric Material
Authors: Hafdaoui Hichem, Mehadjebia Cherifa, Benatia Djamel
Abstract:
In this paper, we propose a new method for detecting bulk waves in an acoustic microwave signal during the propagation of acoustic microwaves in a piezoelectric substrate (lithium niobate, LiNbO3). We use classification by a probabilistic neural network (PNN) as a means of numerical analysis: we classify all values of the real and imaginary parts of the attenuation coefficient together with the acoustic velocity in order to build a model in which the bulk waves are easy to identify. These singularities indicate the presence of bulk waves in piezoelectric materials. From them we obtain accurate values of the attenuation coefficient and acoustic velocity for bulk waves. This study will be useful for the modeling and realization of acoustic microwave devices (ultrasound) based on the propagation of acoustic microwaves.
Keywords: piezoelectric material, probabilistic neural network (PNN), classification, acoustic microwaves, bulk waves, the attenuation coefficient
Procedia PDF Downloads 436
2439 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection
Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew
Abstract:
Detecting email spam is an important task in the digital era, which needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable, and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique provides interpretable explanations for specific classifications of emails to help users understand the model's decision-making process. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing the opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve users' understanding of categorization decisions. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.
Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.
Procedia PDF Downloads 50
2438 Early Stage Suicide Ideation Detection Using Supervised Machine Learning and Neural Network Classifier
Authors: Devendra Kr Tayal, Vrinda Gupta, Aastha Bansal, Khushi Singh, Sristi Sharma, Hunny Gaur
Abstract:
In today's world, suicide is a serious problem. In order to save lives, early detection and prevention of suicide attempts should be addressed. A good number of at-risk people use social media platforms to talk about their issues or to find information on related matters. Twitter and Reddit are two of the most common platforms used for expressing oneself. Extensive research has already been done in this field. Using supervised classification techniques such as Naive Bayes, Bernoulli Naive Bayes, and Multilayer Perceptron on a Reddit dataset, we demonstrate the early recognition of suicidal ideation. We also performed a comparative analysis of these approaches using accuracy, recall, F1 score, and precision.
Keywords: machine learning, suicide ideation detection, supervised classification, natural language processing
Procedia PDF Downloads 92
2437 Validation of Visibility Data from Road Weather Information Systems by Comparing Three Data Resources: Case Study in Ohio
Authors: Fan Ye
Abstract:
Adverse weather conditions, particularly those with low visibility, are critical to driving tasks. However, the direct relationship between visibility distance and traffic flow or roadway safety is uncertain because of the limited availability of visibility data. The recent growth in the deployment of Road Weather Information Systems (RWIS) makes segment-specific visibility information available, which can be integrated with other Intelligent Transportation Systems, such as automated warning systems and variable speed limits, to improve mobility and safety. Before RWIS visibility measurements are applied in traffic studies and operations, it is critical to validate the data. This paper therefore examines the validity and viability of RWIS visibility data by comparing visibility measurements among RWIS, airport weather stations, and weather information recorded by police in crash reports, based on Ohio data. The results indicate that RWIS visibility measurements were significantly different from airport visibility data in Ohio, but no conclusion regarding the reliability of RWIS visibility could be drawn because the comparisons had no verified ground truth. More objective methods are needed to validate RWIS visibility measurements, such as continuous in-field measurements associated with various weather events using calibrated visibility sensors.
Keywords: RWIS, visibility distance, low visibility, adverse weather
Procedia PDF Downloads 253
2436 Knowledge Discovery and Data Mining Techniques in Textile Industry
Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler
Abstract:
This paper addresses issues in the textile industry using data mining techniques. Data mining was applied to data on the stitching of garment products obtained from a textile company. The data were analyzed with the CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks. Classification-based analyses were used for the data mining, and a decision model for the production per person and the variables affecting production was obtained by this method. The results show that as the daily working time increases, the production per person decreases. In addition, among the relationships examined, the one between total daily working time and production per person is negative and the strongest.
Keywords: data mining, textile production, decision trees, classification
Procedia PDF Downloads 356
2435 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling
Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal
Abstract:
Datasets or collections are becoming important assets in themselves and can now be accepted as a primary intellectual output of research. The quality and usage of a dataset depend mainly on the context in which it has been collected, processed, analyzed, validated, and interpreted. This paper presents a collection of program educational objectives mapped to student outcomes, collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection was produced are described. The collection has been cleansed and preprocessed, some features have been selected, and a preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method showed the worst performance.
To recap, the benchmark achieved promising results by utilizing the preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.
Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining
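Several of the measurements named in this abstract are easy to state in code. The sketch below computes Hamming loss and micro- and macro-averaged F1 for multi-label predictions encoded as 0/1 lists. It is an illustration under assumed inputs, not the evaluation code behind the reported benchmark; the function names and the toy label matrices are hypothetical.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def _f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) over all labels."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1 scores, so rare labels weigh as much as common ones.
    macro = sum(_f1(*c) for c in counts) / n_labels
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    micro = _f1(sum(c[0] for c in counts), sum(c[1] for c in counts), sum(c[2] for c in counts))
    return micro, macro


# Toy example: 2 objectives, 3 candidate outcomes each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
loss = hamming_loss(y_true, y_pred)
micro, macro = micro_macro_f1(y_true, y_pred)
```

Reporting both averages is useful on a collection like this one because outcome labels are unlikely to be equally frequent.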
Procedia PDF Downloads 173
2434 Early Diagnosis of Myocardial Ischemia Based on Support Vector Machine and Gaussian Mixture Model by Using Features of ECG Recordings
Authors: Merve Begum Terzi, Orhan Arikan, Adnan Abaci, Mustafa Candemir
Abstract:
Acute myocardial infarction is a major cause of death in the world. Therefore, its fast and reliable diagnosis is a major clinical need. ECG is the most important diagnostic methodology used to make decisions about the management of cardiovascular diseases. In patients with acute myocardial ischemia, temporary chest pains together with changes in the ST segment and T wave of the ECG occur shortly before the start of myocardial infarction. In this study, a technique that detects changes in the ST/T sections of the ECG is developed for the early diagnosis of acute myocardial ischemia. For this purpose, a database of real ECG recordings is constituted, containing records from 75 patients who presented symptoms of chest pain and underwent elective percutaneous coronary intervention (PCI). 12-lead ECGs of the patients were recorded before and during the PCI procedure. Two ECG epochs are analyzed for each patient: the pre-inflation ECG, acquired before any catheter insertion, and the occlusion ECG, acquired during balloon inflation. By using pre-inflation and occlusion recordings, ECG features that are critical in the detection of acute myocardial ischemia are identified and the most discriminative features for the detection of acute myocardial ischemia are extracted. A classification technique is developed based on the support vector machine (SVM) approach, operating with linear and radial basis function (RBF) kernels, to detect ischemic events by using ST-T derived joint features from the non-ischemic and ischemic states of the patients. The dataset is randomly divided into training and testing sets, and the training set is used to optimize SVM hyperparameters by using the grid-search method and 10-fold cross-validation. SVMs are designed specifically for each patient by tuning the kernel parameters in order to obtain the optimal classification performance.
Applying the developed classification technique to real ECG recordings shows that it provides highly reliable detection of anomalies in ECG signals. Furthermore, to develop a detection technique that can be used in the absence of an ECG recording obtained during the healthy stage, the detection of acute myocardial ischemia based on ECG recordings of the patients obtained during ischemia is also investigated. For this purpose, a Gaussian mixture model (GMM) is used to represent the joint pdf of the most discriminating ECG features of myocardial ischemia. Then, a Neyman-Pearson type of approach is developed to detect outliers that would correspond to acute myocardial ischemia. The Neyman-Pearson decision strategy is applied by computing the average log-likelihood values of ECG segments and comparing them with a range of different threshold values. For different discrimination threshold values and numbers of ECG segments, the probability of detection and probability of false alarm are computed, and the corresponding ROC curves are obtained. The results indicate that increasing the number of ECG segments provides higher performance for GMM-based classification. Moreover, the comparison between SVM and GMM-based classification shows that SVM provides higher classification performance on the ECG recordings of a considerable number of patients.
Keywords: ECG classification, Gaussian mixture model, Neyman–Pearson approach, support vector machine
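The thresholding step in this abstract can be sketched in a few lines. The example below scores a segment by its average log-likelihood under a one-dimensional Gaussian mixture and sweeps a threshold to obtain ROC points. Several things here are assumptions made for illustration and do not come from the abstract: the mixture is univariate with given parameters rather than a fitted joint model, the mixture is taken to describe reference (non-outlier) features so that a low likelihood flags an outlier, and all names are hypothetical.

```python
import math


def gmm_log_pdf(x, weights, means, stds):
    """Log density of a 1-D Gaussian mixture at x."""
    density = sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )
    # Floor the density so that far-away points do not cause log(0).
    return math.log(max(density, 1e-300))


def average_log_likelihood(segment, weights, means, stds):
    """Mean log-likelihood of all feature values in one ECG segment."""
    return sum(gmm_log_pdf(x, weights, means, stds) for x in segment) / len(segment)


def roc_points(reference_scores, ischemic_scores, thresholds):
    """For each threshold, flag scores below it and return (threshold, P_d, P_fa)."""
    points = []
    for th in thresholds:
        p_fa = sum(s < th for s in reference_scores) / len(reference_scores)
        p_d = sum(s < th for s in ischemic_scores) / len(ischemic_scores)
        points.append((th, p_d, p_fa))
    return points
```

Averaging the log-likelihood over more segments reduces the variance of the score, which is consistent with the abstract's observation that more ECG segments give better GMM-based detection.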
Procedia PDF Downloads 163
2433 Leveraging Automated and Connected Vehicles with Deep Learning for Smart Transportation Network Optimization
Authors: Taha Benarbia
Abstract:
The advent of automated and connected vehicles has revolutionized the transportation industry, presenting new opportunities for enhancing the efficiency, safety, and sustainability of our transportation networks. This paper explores the integration of automated and connected vehicles into a smart transportation framework, leveraging the power of deep learning techniques to optimize the overall network performance. The first aspect addressed in this paper is the deployment of automated vehicles (AVs) within the transportation system. AVs offer numerous advantages, such as reduced congestion, improved fuel efficiency, and increased safety through advanced sensing and decision-making capabilities. The paper delves into the technical aspects of AVs, including their perception, planning, and control systems, highlighting the role of deep learning algorithms in enabling intelligent and reliable AV operations. Furthermore, the paper investigates the potential of connected vehicles (CVs) in creating a seamless communication network between vehicles, infrastructure, and traffic management systems. By harnessing real-time data exchange, CVs enable proactive traffic management, adaptive signal control, and effective route planning. Deep learning techniques play a pivotal role in extracting meaningful insights from the vast amount of data generated by CVs, empowering transportation authorities to make informed decisions for optimizing network performance. The integration of deep learning with automated and connected vehicles paves the way for advanced transportation network optimization. Deep learning algorithms can analyze complex transportation data, including traffic patterns, demand forecasting, and dynamic congestion scenarios, to optimize routing, reduce travel times, and enhance overall system efficiency.
The paper presents case studies and simulations demonstrating the effectiveness of deep learning-based approaches in achieving significant improvements in network performance metrics.
Keywords: automated vehicles, connected vehicles, deep learning, smart transportation network
Procedia PDF Downloads 83
2432 Modular Robotics and Terrain Detection Using Inertial Measurement Unit Sensor
Authors: Shubhakar Gupta, Dhruv Prakash, Apoorv Mehta
Abstract:
In this project, we design a modular robot capable of using and switching between multiple methods of propulsion and of classifying terrain based on input from an Inertial Measurement Unit (IMU). We wanted to make a robot that is not only intelligent in its functioning but also versatile in its physical design. The advantage of a modular robot is that it can be designed to hold several movement apparatuses, such as wheels, legs for a hexapod or quadruped setup, propellers for underwater locomotion, and any other solution that may be needed. The robot takes roughness input from a gyroscope and an accelerometer in the IMU and, based on the terrain classification from an artificial neural network, decides which method of propulsion would best optimize its movement. This gives the robot adaptability over a set of terrains: it can optimize its locomotion on a terrain based on the terrain's roughness. A feature like this would be a great asset in autonomous exploration or research drones.
Keywords: modular robotics, terrain detection, terrain classification, neural network
Procedia PDF Downloads 149
2431 ICanny: CNN Modulation Recognition Algorithm
Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng
Abstract:
To address the low recognition rate of composite signal modulation at low signal-to-noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. First, the radar signal is transformed into a time-frequency image by the Choi-Williams Distribution (CWD). Second, we propose an image processing algorithm using the Guided Filter and a threshold selection method, combined with hole filling and a mask operation. Finally, a shallow convolutional neural network (CNN) is combined with the ideas of depth-wise convolution (Dw Conv) and point-wise convolution (Pw Conv). The proposed CNN is designed to perform image classification and thereby realize modulation recognition of the radar signal. The simulation results show that the proposed algorithm reaches a recognition rate of 90.83% at 0 dB and 71.52% at -8 dB. The proposed algorithm therefore has good classification and anti-noise performance in radar signal modulation recognition and other fields.
Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm
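The abstract combines depth-wise and point-wise convolutions in a shallow CNN. One commonly cited reason for that pairing is the smaller number of weights compared with a standard convolution, which the sketch below works out. The layer sizes are hypothetical and are not taken from the paper, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    # Every output channel has its own kernel x kernel filter over all input channels.
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    # Depth-wise step: one kernel x kernel filter per input channel.
    depthwise = kernel * kernel * c_in
    # Point-wise step: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
full = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
ratio = separable / full  # equals 1/c_out + 1/kernel**2
```

For this layer the separable form needs 2336 weights instead of 18432, roughly one eighth, which helps keep a shallow network small.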
Procedia PDF Downloads 193
2430 Efficient Manageability and Intelligent Classification of Web Browsing History Using Machine Learning
Authors: Suraj Gururaj, Sumantha Udupa U.
Abstract:
Browsing the Web has emerged as the de facto activity performed on the Internet. Although browsing is tracked, the manageability of Web browsing history is very poor. In this paper, we present a workable solution, implemented with machine learning and natural language processing techniques, for efficiently managing a user's browsing history. Adding such a capability to a Web browser ensures efficient and quick information retrieval from browsing history, which is currently very challenging. Our solution guarantees that important websites visited in the past remain easily accessible thanks to intelligent, automatic classification. In a nutshell, the paper provides an implementation as a browser extension that automatically classifies the browsing history into the most relevant category without any user intervention. This guarantees that no information is lost and increases productivity by saving time spent revisiting websites that were of much importance.
Keywords: adhoc retrieval, Chrome extension, supervised learning, tile, Web personalization
Procedia PDF Downloads 382
2429 Review of Cyber Security in Oil and Gas Industry with Cloud Computing Perspective: Taxonomy, Issues and Future Direction
Authors: Irfan Mohiuddin, Ahmad Al Mogren
Abstract:
In recent years, cloud computing has earned substantial attention in the oil and gas industry and provides services in all phases of the industry lifecycle. Oil and gas supply infrastructure, in particular, is more vulnerable to accidental, natural, and intentional threats because of its widespread distribution. Numerous surveys have been conducted on cloud security and privacy. However, to the best of our knowledge, hardly any survey has been carried out that reviews cyber security in all phases from a cloud computing perspective. Moreover, a distinctive classification of all the cloud-based cyber security measures is performed, based on the cloud component in use. This classification approach will enable researchers to identify the technique required to enhance security in specific cloud components. Also, the limitations of each component will allow researchers to design optimal algorithms. Lastly, future directions are given to point out the imminent challenges, which can pave the way for researchers to further enhance resilience to cyber security threats in the oil and gas industry.
Keywords: cyber security, cloud computing, safety and security, oil and gas industry, security threats, oil and gas pipelines
Procedia PDF Downloads 145
2428 Data Refinement Enhances The Accuracy of Short-Term Traffic Latency Prediction
Authors: Man Fung Ho, Lap So, Jiaqi Zhang, Yuheng Zhao, Huiyang Lu, Tat Shing Choi, K. Y. Michael Wong
Abstract:
Nowadays, a tremendous amount of data is available in the transportation system, enabling the development of various machine learning approaches to make short-term latency predictions. A natural question is then the choice of relevant information to enable accurate predictions. Using traffic data collected from the Taiwan Freeway System, we consider the prediction of short-term latency of a freeway segment with a length of 17 km covering 5 measurement points, each collecting vehicle-by-vehicle data through the electronic toll collection system. The processed data include the past latencies of the freeway segment with different time lags, the traffic conditions of the individual segments (the accumulations, the traffic fluxes, the entrance and exit rates), the total accumulations, and the weekday latency profiles obtained by Gaussian process regression of past data. We arrive at several important conclusions about how data should be refined to obtain accurate predictions, which have implications for future system-wide latency predictions. (1) We find that the prediction of median latency is much more accurate and meaningful than the prediction of average latency, as the latter is plagued by outliers. This is verified by machine-learning prediction using XGBoost that yields a 35% improvement in the mean square error of the 5-minute averaged latencies. (2) We find that the median latency of the segment 15 minutes ago is a very good baseline for performance comparison, and we have evidence that further improvement is achieved by machine learning approaches such as XGBoost and Long Short-Term Memory (LSTM). (3) By analyzing the feature importance score in XGBoost and calculating the mutual information between the inputs and the latencies to be predicted, we identify a sequence of inputs ranked in importance. 
This ranking confirms that the past latencies are most informative of the predicted latencies, followed by the total accumulation, whereas inputs such as the entrance and exit rates are uninformative. It also confirms that the inputs are much less informative of the average latencies than of the median latencies. (4) For predicting the latencies of segments composed of two or three sub-segments, summing the predicted latencies of each sub-segment is more accurate than a one-step prediction of the whole segment, especially when the latency prediction of the downstream sub-segments is trained to anticipate latencies several minutes ahead. The duration of the anticipation time is an increasing function of the traveling time of the upstream segment. The above findings have important implications for predicting the full set of latencies among the various locations in the freeway system.
Keywords: data refinement, machine learning, mutual information, short-term latency prediction
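Findings (1) and (2) of the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the window contents, and the choice of three 5-minute windows to represent a 15-minute lag are assumptions made for illustration only.

```python
from statistics import mean, median


def summarize_window(travel_times_s):
    """Return (mean, median) of per-vehicle travel times in one time window."""
    return mean(travel_times_s), median(travel_times_s)


def lagged_baseline(median_series, lag_windows):
    """Predict each window's median latency as the median seen lag_windows earlier.

    Windows without enough history get None. (Hypothetical helper.)
    """
    if lag_windows <= 0:
        return list(median_series)
    padded = [None] * lag_windows + list(median_series)
    return padded[:len(median_series)]


# Made-up travel times in seconds: one slow vehicle acts as an outlier.
window = [600, 610, 620, 630, 3000]
avg, med = summarize_window(window)  # the mean is pulled up, the median is not

# Made-up series of 5-minute medians; lag of 3 windows stands in for 15 minutes.
baseline = lagged_baseline([620, 640, 700, 680], lag_windows=3)
```

A single outlier moves the mean far from the typical travel time while leaving the median unchanged, which is consistent with the abstract's remark that average latency is plagued by outliers.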
Procedia PDF Downloads 171
2427 Image Encryption Using Eureqa to Generate an Automated Mathematical Key
Authors: Halima Adel Halim Shnishah, David Mulvaney
Abstract:
Applying traditional symmetric cryptography algorithms to encryption and decryption protects secret keys against a range of attacks. One popular technique for generating secret keys automatically is evolutionary computing with the Eureqa API tool, which attracted attention in 2013. In this paper, we generate automated secret keys for image encryption and decryption using the Eureqa API. The Eureqa API models pseudo-random input data obtained from a suitable source to generate secret keys. The generated secret keys are validated with several statistical tests (histogram, chi-square, correlation of two adjacent pixels, correlation between original and encrypted images, entropy, and key sensitivity). Experimental results from histogram analysis, correlation coefficient, entropy, and key sensitivity show that the proposed image encryption algorithms are secure and reliable, with the potential to be adapted for secure image communication applications.
Keywords: image encryption algorithms, Eureqa, statistical measurements, automated key generation
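Two of the statistical tests named in the abstract above, entropy and the correlation of adjacent pixels, can be sketched in plain Python. This is a generic illustration, not the authors' code and not the Eureqa API: it assumes an 8-bit grayscale image flattened into a list of integers, and both function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits per pixel; 8.0 is the maximum for 8-bit data."""
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def adjacent_correlation(pixels):
    """Pearson correlation between each pixel and its right-hand neighbor."""
    x, y = pixels[:-1], pixels[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# A smooth ramp stands in for a natural image row: neighbors are highly correlated.
ramp = list(range(256))
ramp_entropy = shannon_entropy(ramp)
ramp_corr = adjacent_correlation(ramp)
```

For a well-encrypted image one would generally expect entropy close to 8 bits and adjacent-pixel correlation close to 0, whereas plain images tend to show correlation close to 1.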
Procedia PDF Downloads 487
2426 Internet Optimization by Negotiating Traffic Times
Authors: Carlos Gonzalez
Abstract:
This paper describes a system to optimize internet use by clients who need to download videos at peak hours. The system consists of a web server belonging to a provider of video content, a provider of internet communications, and a software application running on the client’s computer. Using the application, the client sends the video provider a list of the client’s future video demands. The video provider calculates which videos will be most in demand for download in the immediate future and asks the internet provider for the optimal hours for downloading. The download times are sent to the application software, which uses the pre-established hours negotiated between the video provider and the internet provider to download those videos. The videos are saved in a special protected section of the user’s hard disk, which can be accessed only by the application software on the client’s computer. When the client is ready to watch a video, the application searches the list of videos currently stored in that area of the hard disk; if the video exists, the application uses it directly without the need for internet access. We found that the best way to optimize video download traffic is negotiation between the internet communication provider and the video content provider.
Keywords: internet optimization, video download, future demands, secure storage
Procedia PDF Downloads 138
2425 Risky Driving Behavior among Bus Driver in Jakarta
Authors: Ratri A. Benedictus, Felicia M. Yolanda
Abstract:
Public transport is a crucial issue for capital cities in developing countries, such as Jakarta. The inadequate number and low quality of public transport services make personal vehicles the main option. As a result, traffic jams are getting worse in Jakarta. The low quality of public transport, particularly buses, is compounded by the risky behavior of drivers. Traffic accidents involving public buses occur often in Jakarta, and some result in fatalities. The purpose of this study is to describe risky behavior among public bus drivers in Jakarta. A total of 132 bus drivers responded to this study. The Risky Driving Behavior scale of Dorn was used. Data were analyzed using descriptive statistics. Of the respondents, 51.5% felt that they often showed risky behavior while driving. The most common type of risky driving behavior was continuing to use an unsafe bus (62%), followed by trespassing on the bus line (30%), speeding (21%), violating road signs (15%), and driving in an unhealthy physical condition (4%). The results suggest that the bus drivers’ high awareness of their risky behaviors has not led to safe driving behavior. Therefore, together with technical engineering and instrumentation interventions, psychological aspects also need to be considered, such as risk perception, safety attitude, safety culture, locus of control, and fatalism.
Keywords: bus driver, psychological factors, public transportation, risky driving behavior
Procedia PDF Downloads 362
2424 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters
Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu
Abstract:
Accurate prediction of TBM (Tunnel Boring Machine) performance, which is needed for reliable estimation of the construction period and cost in the preconstruction stage, is very difficult. The aim of this study is therefore to analyze the evaluation process of various TBM performance prediction models published since 2000 and to select the optimal input parameters for the prediction model. A classification system for TBM performance prediction models and the applied methodology are proposed in this research. The input and output parameters applied in the prediction models are also presented. Based on these results, a statistical analysis is performed using data collected from a shield TBM tunnel in South Korea. By performing a simple regression and residual analysis with the statistical program R, the optimal input parameters are selected. These results are expected to be used for the development of a TBM performance prediction model.
Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters
Procedia PDF Downloads 310
2423 lncRNA Gene Expression Profiling Analysis by TCGA RNA-Seq Data of Breast Cancer
Authors: Xiaoping Su, Gabriel G. Malouf
Abstract:
Introduction: Breast cancer is a heterogeneous disease that can be classified into four subgroups using transcriptional profiling. The role of lncRNA expression in human breast cancer biology, prognosis, and molecular classification remains unknown. Methods and results: Using an integrative, comprehensive analysis of lncRNA, mRNA, and DNA methylation in 900 breast cancer patients from The Cancer Genome Atlas (TCGA) project, we unraveled the molecular portraits of 1,700 expressed lncRNAs. Some of those lncRNAs (e.g., HOTAIR) have been reported previously, and others are novel (e.g., HOTAIRM1, MAPT-AS1). The lncRNA classification correlated well with the PAM50 classification for the basal-like, Her-2 enriched, and luminal B subgroups, in contrast to the luminal A subgroup, which behaved differently. Importantly, estrogen receptor (ESR1) expression was associated with distinct lncRNA networks in lncRNA clusters III and IV. Gene set enrichment analysis for cis- and trans-acting lncRNAs showed enrichment for breast cancer signatures driven by breast cancer master regulators. Almost two thirds of those lncRNAs were marked by enhancer chromatin modifications (i.e., H3K27ac), suggesting that lncRNA expression may result in increased activity of neighboring genes. Differential analysis of gene expression profiling data showed that lncRNA HOTAIRM1 was significantly down-regulated in the basal-like subtype, and DNA methylation profiling data showed that lncRNA HOTAIRM1 was highly methylated in the basal-like subtype. Thus, our integrative analysis of gene expression and DNA methylation strongly suggested that lncRNA HOTAIRM1 acts as a tumor suppressor in the basal-like subtype. Conclusion and significance: Our study depicts the first lncRNA molecular portrait of breast cancer and shows that lncRNA HOTAIRM1 might be a novel tumor suppressor.
Keywords: lncRNA profiling, breast cancer, HOTAIRM1, tumor suppressor
Procedia PDF Downloads 106
2422 National Assessment for Schools in Saudi Arabia: Score Reliability and Plausible Values
Authors: Dimiter M. Dimitrov, Abdullah Sadaawi
Abstract:
The National Assessment for Schools (NAFS) in Saudi Arabia consists of standardized tests in Mathematics, Reading, and Science for school grade levels 3, 6, and 9. One main goal is to classify students into four categories of NAFS performance (minimal, basic, proficient, and advanced) by schools and the entire national sample. The NAFS scoring and equating is performed on a bounded scale (D-scale: ranging from 0 to 1) in the framework of the recently developed “D-scoring method of measurement.” The specificity of the NAFS measurement framework and data complexity presented both challenges and opportunities for (a) estimating score reliability for schools, (b) setting cut-scores for the classification of students into categories of performance, and (c) generating plausible values for distributions of student performance on the D-scale. The estimation of score reliability at the school level was performed in the framework of generalizability theory (GT), with students “nested” within schools and test items “nested” within test forms. The GT design was executed via a multilevel modeling syntax code in R. Cut-scores (on the D-scale) for the classification of students into performance categories were derived via a recently developed method of standard setting, referred to as the “Response Vector for Mastery” (RVM) method. For each school, the classification of students into categories of NAFS performance was based on distributions of plausible values for the students’ scores on NAFS tests by grade level (3, 6, and 9) and subject (Mathematics, Reading, and Science). Plausible values (on the D-scale) for each individual student were generated via random selection from a statistical logit-normal distribution with parameters derived from the student’s D-score and its conditional standard error, SE(D).
All procedures related to D-scoring, equating, generating plausible values, and classification of students into performance levels were executed via a computer program in R developed for the purpose of NAFS data analysis.
Keywords: large-scale assessment, reliability, generalizability theory, plausible values
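The plausible-value step described above can be sketched as follows. The abstract does not say how the logit-normal parameters are derived from D and SE(D), and the authors' program is written in R, so this Python sketch rests on an assumption: a delta-method approximation with location logit(D) and scale SE(D) / (D (1 - D)). The function name and the example score are hypothetical.

```python
import math
import random


def plausible_values(d_score, se_d, n_draws=5, rng=None):
    """Draw plausible values on the (0, 1) D-scale from a logit-normal distribution.

    Assumption (not from the source): mu = logit(D), sigma = SE(D) / (D * (1 - D)).
    """
    if not 0.0 < d_score < 1.0:
        raise ValueError("d_score must lie strictly between 0 and 1")
    rng = rng or random.Random()
    mu = math.log(d_score / (1.0 - d_score))
    sigma = se_d / (d_score * (1.0 - d_score))
    # Sample on the logit scale, then map back to (0, 1) with the logistic function.
    return [1.0 / (1.0 + math.exp(-rng.gauss(mu, sigma))) for _ in range(n_draws)]


# Made-up student: D = 0.62 with conditional standard error 0.04.
draws = plausible_values(0.62, 0.04, n_draws=5, rng=random.Random(0))
```

Sampling on the logit scale keeps every draw inside the bounded D-scale, which a plain normal distribution around D would not guarantee.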
Procedia PDF Downloads 21
2421 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining
Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser
Abstract:
Coronary Artery Disease (CAD) is a major cause of disability in adults and a main cause of death in developed countries. In this study, data mining techniques including Decision Trees, Artificial Neural Networks (ANNs), and Support Vector Machine (SVM) are used to analyze CAD data. Data from 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability of being diagnosed with CAD. The SVM algorithm is the most useful method for evaluating and predicting CAD patients as compared to non-CAD ones. Applying data mining techniques to coronary artery disease data is a good method for investigating the existing relationships between variables.
Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract
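The three comparison metrics named in the abstract above follow directly from a binary confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical and the counts are invented for illustration, not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true CAD cases that are detected
        "specificity": tn / (tn + fp),  # share of non-CAD cases correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Made-up counts for one classifier on a held-out test set.
metrics = classification_metrics(tp=80, fn=20, tn=90, fp=10)
```

Reporting all three matters because accuracy alone can look high on imbalanced data even when sensitivity is poor.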
Procedia PDF Downloads 660
2420 Astronomical Object Classification
Authors: Alina Muradyan, Lina Babayan, Arsen Nanyan, Gohar Galstyan, Vigen Khachatryan
Abstract:
We present a photometric method for identifying stars, galaxies and quasars in multi-color surveys, which uses a library of ≳ 65000 color templates for comparison with observed objects. The method aims to extract the information content of object colors in a statistically correct way, and performs a classification as well as a redshift estimation for galaxies and quasars in a unified approach based on the same probability density functions. For the redshift estimation, we employ an advanced version of the Minimum Error Variance estimator, which determines the redshift error from the redshift-dependent probability density function itself. The method was originally developed for the Calar Alto Deep Imaging Survey (CADIS), but is now used in a wide variety of survey projects. We checked its performance by spectroscopy of CADIS objects, where the method provides high reliability (6 errors among 151 objects with R < 24), especially for the quasar selection, and redshifts accurate within σz ≈ 0.03 for galaxies and σz ≈ 0.1 for quasars. To optimize future survey efforts, a few model surveys are compared, which are designed to use the same total amount of telescope time but different sets of broad-band and medium-band filters. Their performance is investigated by Monte-Carlo simulations as well as by analytic evaluation in terms of classification and redshift estimation. If photon noise were the only error source, broad-band surveys and medium-band surveys should perform equally well, as long as they provide the same spectral coverage. In practice, medium-band surveys show superior performance due to their higher tolerance for calibration errors and cosmic variance. Finally, we discuss the relevance of color calibration and derive important conclusions for the issues of library design and choice of filters.
The calibration accuracy poses strong constraints on an accurate classification, which are most critical for surveys with few, broad and deeply exposed filters, but less severe for surveys with many, narrow and less deep filters.
Keywords: VO, ArVO, DFBS, FITS, image processing, data analysis
Procedia PDF Downloads 82
2419 Analyses of Adverse Drug Reactions Reported of Hospital in Taiwan
Authors: Yu-Hong Lin
Abstract:
Background: An adverse drug reaction (ADR) is an injury caused by taking a medicine. The severity of a reported ADR may be minor, but it can also be life-threatening. To provide healthcare professionals with a better reference for clinical practice, we collected and analyzed ADR reports from our hospital. Methods: This was a retrospective study of ADRs reported from 2014 to 2015 in our hospital in Taiwan. We collected assessment items from the ADR reports, including gender and age, source of occurrence, Anatomical Therapeutic Chemical (ATC) classification of suspected drugs, type of adverse reaction, and Naranjo score calculated with the Naranjo Adverse Drug Reaction Probability Scale. Results: The investigation included 207 ADR reports. Most reported ADRs occurred in the outpatient department (92%). The average age of patients with reported ADRs was 65.3 years. Patients younger than 65 years were in the majority (54%). Most reported ADRs involved males (51%). According to the ATC classification system, the major classes of suspected drugs were cardiovascular system (19%) and antiinfectives for systemic use (18%). Among the adverse reactions, dermatologic effects (35%) were the major type. The Naranjo scores of most reported ADRs ranged from 1 to 4 points (91%), which represents a possible correlation between the reported ADRs and the suspected drugs. Conclusions: ADR reports remain extremely important information for healthcare professionals. For that reason, we enter all ADR report information into our hospital’s computer system, which will improve the safety of medication use. The hospital’s computer system can remind prescribers of a patient’s reported ADRs. No drugs are administered without risk.
Therefore, all healthcare professionals have a responsibility to their patients, who themselves are becoming more aware of problems associated with drug therapy.
Keywords: adverse drug reaction, Taiwan, healthcare professionals, safe use of medicines
Procedia PDF Downloads 233
2418 A Two-Week and Six-Month Stability of Cancer Health Literacy Classification Using the CHLT-6
Authors: Levent Dumenci, Laura A. Siminoff
Abstract:
Health literacy has been shown to predict a variety of health outcomes. Reliable identification of persons with limited cancer health literacy (LCHL) has proved questionable with existing instruments that use an arbitrary cut point along a continuum. The CHLT-6, however, uses a latent mixture modeling approach to identify persons with LCHL. The purpose of this study was to estimate the two-week and six-month stability of identifying persons with LCHL using the CHLT-6 with a discrete latent variable approach as the underlying measurement structure. Using a test-retest design, the CHLT-6 was administered to cancer patients at two-week (N=98) and six-month (N=51) intervals. The two-week and six-month latent test-retest agreements were 89% and 88%, respectively. The chance-corrected latent agreements estimated from Dumenci’s latent kappa were 0.62 (95% CI: 0.41 – 0.82) and 0.47 (95% CI: 0.14 – 0.80) for the two-week and six-month intervals, respectively. High levels of latent test-retest agreement between the limited and adequate categories of the cancer health literacy construct, coupled with moderate to good levels of chance-corrected latent agreement, indicated that the CHLT-6 classification of limited versus adequate cancer health literacy is relatively stable over time. In conclusion, the measurement structure underlying the instrument allows for estimating classification errors, circumventing limitations due to the arbitrary approaches adopted by all other instruments. The CHLT-6 can be used to identify persons with LCHL in oncology clinics and intervention studies to accurately estimate treatment effectiveness.
Keywords: limited cancer health literacy, the CHLT-6, discrete latent variable modeling, latent agreement
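The idea of chance-corrected agreement in the abstract above can be illustrated with ordinary Cohen's kappa on a 2x2 test-retest table. This is a deliberately simpler stand-in: Dumenci's latent kappa is estimated from a latent class model and is not reproduced here. The function name is hypothetical and the counts are invented, not the study's data.

```python
def cohens_kappa_2x2(both_limited, limited_then_adequate,
                     adequate_then_limited, both_adequate):
    """Observed agreement and Cohen's kappa for a 2x2 test-retest table."""
    n = (both_limited + limited_then_adequate
         + adequate_then_limited + both_adequate)
    observed = (both_limited + both_adequate) / n
    # Marginal proportions of "limited" at test and at retest.
    p_test = (both_limited + limited_then_adequate) / n
    p_retest = (both_limited + adequate_then_limited) / n
    # Agreement expected if the two classifications were independent.
    expected = p_test * p_retest + (1 - p_test) * (1 - p_retest)
    return observed, (observed - expected) / (1 - expected)


# Made-up counts: 80 of 100 people receive the same class on both occasions.
agreement, kappa = cohens_kappa_2x2(40, 10, 10, 40)
```

Kappa is lower than raw agreement because part of the raw agreement would occur by chance alone, which is why the abstract reports both figures.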
Procedia PDF Downloads 180
2417 Improvements and Implementation Solutions to Reduce the Computational Load for Traffic Situational Awareness with Alerts (TSAA)
Authors: Salvatore Luongo, Carlo Luongo
Abstract:
This paper discusses implementation solutions to reduce the computational load of the Traffic Situational Awareness with Alerts (TSAA) application, which is based on Automatic Dependent Surveillance-Broadcast (ADS-B) technology. In 2008, there were 23 mid-air collisions involving general aviation fixed-wing aircraft, 6 of which were fatal, leading to 21 fatalities. These collisions occurred during visual meteorological conditions, indicating the limitations of the see-and-avoid concept for mid-air collision avoidance as defined by the Federal Aviation Administration (FAA). Commercial aviation aircraft are already equipped with a collision avoidance system called TCAS, which is based on classic transponder technology. This system dramatically reduced the number of mid-air collisions involving air transport aircraft. In general aviation, the same reduction in mid-air collisions has not occurred, so this reduction is the main objective of the TSAA application. The major difference between the original conflict detection application and the TSAA application is that conflict detection focuses on preventing loss of separation in en-route environments, whereas TSAA is devoted to reducing the probability of mid-air collision in all phases of flight. The TSAA application increases the flight crew’s traffic situation awareness by providing alerts for traffic detected to be in conflict with ownship, in support of the see-and-avoid responsibility. Significant effort has been spent on the design process and code generation to maximize efficiency and performance in terms of reduced computational load and memory consumption. The TSAA architecture is divided into two high-level systems: the “Threats database” and the “Conflict detector”. The first receives the traffic data from the ADS-B device and stores each target’s data history.
The conflict detector module estimates ownship and target trajectories in order to detect possible future loss of separation between ownship and each target. Finally, the alerts are verified by additional conflict verification logic, in order to prevent possible undesirable behaviors of the alert flag. To reduce the computational load, a pre-check evaluation module is used. This pre-check is only a computational optimization, so the performance of the conflict detector system is not modified in terms of the number of alerts detected. The pre-check module uses analytical trajectory propagation for both target and ownship. This gives greater accuracy and avoids step-by-step propagation, which requires a greater computational load. Furthermore, the pre-check excludes targets that are certainly not threats, using an efficient analytical and geometrical approach, in order to decrease the computational load for the following modules. This software improvement is not suggested by FAA documents, and so it is the main innovation of this work. The efficiency and efficacy of this enhancement are verified using fast-time and real-time simulations and by execution on a real device in several FAA scenarios. The final implementation also permits FAA software certification in compliance with the DO-178B standard. The computational load reduction allows the TSAA application to be installed also on devices with multiple applications and/or low capacity in terms of available memory and computational capabilities.
Keywords: traffic situation awareness, general aviation, aircraft conflict detection, computational load reduction, implementation solutions, software certification
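An analytical pre-check of the kind described above can be sketched with a closed-form closest-point-of-approach (CPA) computation. The abstract does not give the actual geometry or thresholds, so this sketch assumes straight-line, constant-velocity motion in a 2D plane; the function names, the look-ahead horizon, and the protection radius are all hypothetical.

```python
import math


def closest_point_of_approach(rel_pos, rel_vel, horizon_s):
    """Return (time, distance) of closest approach within [0, horizon_s].

    rel_pos and rel_vel are the target's position (m) and velocity (m/s)
    relative to ownship. Assumes constant relative velocity.
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t_cpa = 0.0  # no relative motion: the distance never changes
    else:
        # Minimize |r + v t| analytically, then clamp to the look-ahead window.
        t_cpa = -(rx * vx + ry * vy) / speed_sq
        t_cpa = min(max(t_cpa, 0.0), horizon_s)
    d_cpa = math.hypot(rx + vx * t_cpa, ry + vy * t_cpa)
    return t_cpa, d_cpa


def needs_full_check(rel_pos, rel_vel, horizon_s, protection_radius_m):
    """Keep a target for detailed conflict detection only if it can get close."""
    _, d_cpa = closest_point_of_approach(rel_pos, rel_vel, horizon_s)
    return d_cpa <= protection_radius_m


# Head-on target 10 km away closing at 100 m/s: kept for the full check.
head_on = needs_full_check((10000.0, 0.0), (-100.0, 0.0), 120.0, 1000.0)
# Target 5 km abeam and moving away: discarded by the pre-check.
abeam = needs_full_check((0.0, 5000.0), (100.0, 0.0), 120.0, 1000.0)
```

The closed form costs a handful of arithmetic operations per target, compared with stepping both trajectories forward in time, which is the kind of saving the abstract attributes to analytical propagation.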
Procedia PDF Downloads 287