Search results for: imbalance dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1324

1294 The Operating Behaviour of Unbalanced Unpaced Merging Assembly Lines

Authors: S. Shaaban, T. McNamara, S. Hudson

Abstract:

This paper reports on the performance of deliberately unbalanced, reliable, non-automated merging assembly lines whose workstations differ in their mean operation times. Simulations are carried out on 5- and 8-station lines with buffer capacities of 1, 2 and 4 units, degrees of line imbalance of 2%, 5% and 12%, and 24 different patterns of means imbalance. Data on two performance measures, namely throughput and average buffer level, were gathered, statistically analysed and compared to those of a balanced merging line counterpart. It was found that the best configurations are a balanced line arrangement and a monotone decreasing order for each of the parallel merging lines, with the first generally resulting in a lower throughput and the second leading to a lower average buffer level than those of a balanced line.

Keywords: average buffer level, merging lines, simulation, throughput, unbalanced

Procedia PDF Downloads 281
1293 Enhancing Fault Detection in Rotating Machinery Using Wiener-CNN Method

Authors: Mohamad R. Moshtagh, Ahmad Bagheri

Abstract:

Accurate fault detection in rotating machinery is of utmost importance to ensure optimal performance and prevent costly downtime in industrial applications. This study presents a robust fault detection system based on vibration data collected from rotating gears under various operating conditions. The considered scenarios include: (1) both gears being healthy, (2) one healthy gear and one faulty gear, and (3) introducing an imbalanced condition to a healthy gear. Vibration data was acquired using a Hantek 1008 device and stored in a CSV file. Python code implemented in the Spyder environment was used for data preprocessing and analysis. Wiener features were extracted using the Wiener feature selection method. These features were then employed in multiple machine learning algorithms, including Convolutional Neural Networks (CNN), Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), and Random Forest, to evaluate their performance in detecting and classifying faults in both the training and validation datasets. The comparative analysis of the methods revealed the superior performance of the Wiener-CNN approach. The Wiener-CNN method achieved a remarkable accuracy of 100% for both the two-class (healthy gear and faulty gear) and three-class (healthy gear, faulty gear, and imbalanced) scenarios in the training and validation datasets. In contrast, the other methods exhibited varying levels of accuracy. The Wiener-MLP method attained 100% accuracy for the two-class training dataset and 100% for the validation dataset. For the three-class scenario, the Wiener-MLP method demonstrated 100% accuracy in the training dataset and 95.3% accuracy in the validation dataset. The Wiener-KNN method yielded 96.3% accuracy for the two-class training dataset and 94.5% for the validation dataset. In the three-class scenario, it achieved 85.3% accuracy in the training dataset and 77.2% in the validation dataset. 
The Wiener-Random Forest method achieved 100% accuracy for the two-class training dataset and 85% for the validation dataset, while in the three-class training dataset, it attained 100% accuracy and 90.8% accuracy for the validation dataset. The exceptional accuracy demonstrated by the Wiener-CNN method underscores its effectiveness in accurately identifying and classifying fault conditions in rotating machinery. The proposed fault detection system utilizes vibration data analysis and advanced machine learning techniques to improve operational reliability and productivity. By adopting the Wiener-CNN method, industrial systems can benefit from enhanced fault detection capabilities, facilitating proactive maintenance and reducing equipment downtime.

Keywords: fault detection, gearbox, machine learning, wiener method

Procedia PDF Downloads 49
1292 Evaluating Models Through Feature Selection Methods Using Data Driven Approach

Authors: Shital Patil, Surendra Bhosale

Abstract:

Cardiac diseases are the leading causes of mortality and morbidity in the world; over the past few decades they have accounted for a large number of deaths and have emerged as the most life-threatening disorders globally. Machine learning and artificial intelligence have been playing a key role in predicting heart disease. A relevant set of features can be very helpful in predicting the disease accurately. In this study, we present a comparative analysis of four feature selection methods and evaluate their performance on both the raw (unbalanced) and the sampled (balanced) dataset. The publicly available Z-Alizadeh Sani dataset has been used for this study. Four feature selection methods are used: data analysis, minimum Redundancy Maximum Relevance (mRMR), Recursive Feature Elimination (RFE), and chi-squared. These methods are tested with eight different classification models to get the best accuracy possible. On both the balanced and the unbalanced dataset, the study shows promising results across various performance metrics in accurately predicting heart disease. With the raw data, the proposed method obtains a maximum AUC of 100%, a maximum F1-score of 94%, a maximum recall of 98%, and a maximum precision of 93%. With the balanced dataset, it obtains a maximum AUC of 100%, an F1-score of 95%, a maximum recall of 95%, and a maximum precision of 97%.

Keywords: cardio vascular diseases, machine learning, feature selection, SMOTE

Procedia PDF Downloads 83
1291 Deep Feature Augmentation with Generative Adversarial Networks for Class Imbalance Learning in Medical Images

Authors: Rongbo Shen, Jianhua Yao, Kezhou Yan, Kuan Tian, Cheng Jiang, Ke Zhou

Abstract:

This study proposes a generative adversarial network (GAN) framework to perform synthetic sampling in feature space, i.e., feature augmentation, to address the class imbalance problem in medical image analysis. A feature extraction network is first trained to convert images into feature space. Then the GAN framework incorporates adversarial learning to train a feature generator for the minority class by playing a minimax game with a discriminator. The feature generator then generates features for the minority class from arbitrary latent distributions to balance the data between the majority class and the minority class. Additionally, a data cleaning technique, i.e., Tomek link, is employed to clean up undesirable conflicting features introduced by the feature augmentation and thus establish well-defined class clusters for training. The experiment section evaluates the proposed method on two medical image analysis tasks, i.e., mass classification on mammograms and cancer metastasis classification on histopathological images. Experimental results suggest that the proposed method obtains superior or comparable performance relative to the state-of-the-art counterparts. Compared to all counterparts, the proposed method improves accuracy by more than 1.5 percentage points.

Keywords: class imbalance, synthetic sampling, feature augmentation, generative adversarial networks, data cleaning

Procedia PDF Downloads 103
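
The Tomek-link cleaning step named in the abstract above can be illustrated with a minimal sketch. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour. The function name `tomek_links`, the Euclidean distance, and the toy points are assumptions made for illustration only; they are not taken from the paper, which operates on learned feature vectors.

```python
import math

def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Assumes at least two points and Euclidean distance (an illustrative choice).
    """
    def nearest(i):
        # Index of the closest other point to points[i].
        best, best_d = None, math.inf
        for j, q in enumerate(points):
            if j == i:
                continue
            d = math.dist(points[i], q)
            if d < best_d:
                best, best_d = j, d
        return best

    nn = [nearest(i) for i in range(len(points))]
    # A Tomek link: mutual nearest neighbours that carry different labels.
    return sorted(
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    )

# Toy feature vectors: points 2 and 3 sit on the class boundary.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels = [0, 0, 0, 1, 1, 1]
links = tomek_links(points, labels)
```

In a cleaning pass, one or both members of each returned pair would typically be dropped before training; which member to drop is a design choice the abstract does not specify.
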
1290 The Virtual Container Yard: Identifying the Persuasive Factors in Container Interchange

Authors: L. Edirisinghe, Zhihong Jin, A. W. Wijeratne, R. Mudunkotuwa

Abstract:

The virtual container yard is an effective solution to the container inventory imbalance problem, which is a global issue. The imbalance causes substantial cost to carriers, which inadvertently adds to the prices of consumer goods. The virtual container yard is rooted in the fundamentals of container interchange between carriers. If carriers opt to interchange their excess containers with those who are in deficit, a substantial part of the empty repositioning cost could be eliminated. Unlike in other types of ships, cargo cannot be directly loaded onto a container ship. Slots and containers are supplementary components; thus, a carrier cannot ship cargo if containers are not available, and vice versa. A few decades ago, carriers recognized slot (the unit of space in a container ship) interchange as a viable solution for the imbalance of shipping space. Carriers interchange slots among themselves, which also increases the economies of scale in container shipping. Some of these service agreements between mega carriers have provisions to interchange containers too. However, the interchange mechanism is still not popular among carriers for containers. This is the paradox that prevails in the liner shipping industry. At present, carriers reposition their excess empty containers to areas where they are in demand. This research applied the statistical method of factor analysis. The paper reveals that five major components may influence the virtual container yard, namely organisation, practice and culture, legal and environment, international nature, and marketing. There are 12 variables that may impact the virtual container yard, and these are explained in the paper.

Keywords: virtual container yard, shipping, imbalance, management, inventory

Procedia PDF Downloads 162
1289 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk for hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor called Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset which consists of more than 70,000 patients with 50 attributes. We applied data preprocessing using different techniques in order to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieved classification accuracy of 80% compared to other models that only use k Nearest Neighbor.

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 156
1288 Static and Dynamic Hand Gesture Recognition Using Convolutional Neural Network Models

Authors: Keyi Wang

Abstract:

Similar to the touchscreen, hand-gesture-based human-computer interaction (HCI) is a technology that could allow people to perform a variety of tasks faster and more conveniently. This paper proposes a method for training an image-based system that recognizes hand gestures in images and video clips using a CNN (Convolutional Neural Network). A dataset containing 6 hand gesture images is used to train a 2D CNN model, achieving ~98% accuracy. Furthermore, a 3D CNN model is trained on a dataset containing 4 hand gesture video clips, resulting in ~83% accuracy. It is demonstrated that a Cozmo robot loaded with the pre-trained models is able to recognize static and dynamic hand gestures.

Keywords: deep learning, hand gesture recognition, computer vision, image processing

Procedia PDF Downloads 109
1287 Data Mining Approach: Classification Model Evaluation

Authors: Lubabatu Sada Sodangi

Abstract:

The rapid growth in the exchange and accessibility of information via the internet leads many organisations to acquire data on their own operations. The aim of data mining is to analyse the different behaviours of a dataset using observation. However, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. There is a range of techniques used in data mining to determine the hidden or unknown information in datasets. In this paper, the performance of two algorithms, Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP), is compared using the Adult dataset to find the percentage of adults who earn > 50k and those who earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS Statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those who are married (husband), hold a Bachelor, Masters, Doctorate or Prof-school qualification, and are aged over 41 and under 57 earn > 50k. The multilayer perceptron, in turn, identifies marital status and capital gain as the most important predictors of income. It also shows that individuals whose capital gain is less than 6,849 and who are single, separated or widowed earn <= 50k, whereas individuals whose capital gain is > 6,849, who work > 35 hrs/wk, and who are older than 27 years earn > 50k. Comparing the two algorithms, it is observed that both are reliable, but CHAID shows stronger reliability, clearly indicating that relationship and education contribute to the prediction, as displayed in the data visualisation.

Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset

Procedia PDF Downloads 355
1286 The Effect of Emotional Stimuli Related to Body Imbalance in Postural Control and the Phenomenological Experience of Young Healthy Adults

Authors: David Martinez-Pernia, Alvaro Rivera-Rei, Alejandro Troncoso, Gonzalo Forno, Andrea Slachevsky, David Huepe, Victoria Silva-Mack, Jorge Calderon, Mayte Vergara, Valentina Carrera

Abstract:

Background: Recent theories in the field of emotions have taken the relevance of motor control beyond a system related to personal autonomy (walking, running, grooming) and integrated it into the emotional dimension. However, to the best of our knowledge, there are no studies that specifically investigate how emotional stimuli related to motor control modify emotional states in terms of postural control and phenomenological experience. Objective: The main aim of this work is to investigate the emotions produced by stimuli of bodily imbalance (neutral, pleasant and unpleasant) in the postural control and the phenomenological experience of young, healthy adults. Methodology: 46 healthy young people are shown emotional videos (neutral, pleasant, motor unpleasant, and non-motor unpleasant) related to body imbalance. During the stimulation period of each video (60 seconds), the participant stands on a force platform that collects temporal and spatial data on postural control. In addition, the electrophysiological activity of the heart and electrodermal activity are recorded. For the two unpleasant conditions (motor versus non-motor), a phenomenological interview is carried out to collect the subjective experience of emotion and body perception. Results: Pleasant and unpleasant emotional videos produce significant changes with respect to the neutral condition in terms of greater area, higher mean velocity, and greater mean frequency power on the anterior-posterior axis. For the electrodermal response, the pleasant and unpleasant conditions produced a significant increase in the phasic component with respect to the neutral condition. Regarding the electrophysiology of the heart, no significant change was found in any condition. Phenomenological experiences in the two unpleasant conditions differ in body perception and the emotional meaning of the experience. 
Conclusion: Emotional stimuli related to bodily imbalance produce changes in postural control, electrodermal activity, and phenomenological experience. This experimental setting could be relevant for people with motor disorders (Parkinson's disease, stroke, TBI) to understand how emotions affect motor control.

Keywords: body imbalance stimuli, emotion, phenomenological experience, postural control

Procedia PDF Downloads 142
1285 Video Object Segmentation for Automatic Image Annotation of Ethernet Connectors with Environment Mapping and 3D Projection

Authors: Marrone Silverio Melo Dantas, Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner, Djamel Fawzi Hadj Sadok

Abstract:

The creation of a dataset is time-consuming and often discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted for the automation of this process. Both optimize valuable user time and resources and support video object segmentation with object tracking and 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate distinct annotated datasets. We evaluated the precision of the annotations by comparing them with a manually annotated dataset, as well as their efficiency in the context of detection and classification problems. For detection support, we used YOLO and obtained, for the projection dataset, F1-Score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. For the tracking dataset, we achieved an F1-Score of 0.861 and an accuracy of 0.932, whereas mAP reached 0.894. To evaluate the quality of the annotated images used for classification problems, we employed deep learning architectures. We adopted the accuracy and F1-Score metrics for VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others for both the projection and tracking datasets. For the projection dataset, it reached an accuracy and F1-Score of 0.997 and 0.993, respectively. Similarly, for the tracking dataset, it achieved an accuracy of 0.991 and an F1-Score of 0.981.

Keywords: RJ45, automatic annotation, object tracking, 3D projection

Procedia PDF Downloads 133
1284 The Traffic Congestion in Biskra in Algeria

Authors: Selatnia Khaled, Grine Ikram

Abstract:

The city of Biskra, like other Algerian cities, suffers from urban traffic congestion. The concentration of investments in the wilaya, especially in the secondary and tertiary sectors, has attracted a large rural population. The latter, combined with the high rate of natural growth, has favored the imbalance of the spatial frame of the wilaya system and consequently the traffic congestion of the primate city (Biskra). This urban disease is explained by a two-tier development: the capital of the wilaya is growing faster than its other centers and is taking on a disproportionate size relative to the whole. The consequences can only be negative. Pressure on the roads, growth of the vehicle fleet, and overloading of equipment and activities have become the characteristics of the city of Biskra, which can no longer meet the needs of its inhabitants. This research attempts to show the relationship between urban congestion of the primate city and the imbalance of the spatial structure of the micro-regional urban system.

Keywords: traffic congestion, spatial structure, pressure on the roads, equipment and activities

Procedia PDF Downloads 647
1283 Engagement Analysis Using DAiSEE Dataset

Authors: Naman Solanki, Souraj Mondal

Abstract:

With the world moving towards online communication, the volume of stored video has exploded in the past few years. Consequently, it has become crucial to analyse participants’ engagement levels in online communication videos. Engagement prediction of people in videos can be useful in many domains, like education, client meetings, dating, etc. Video-level or frame-level prediction of engagement for a user involves the development of robust models that can capture facial micro-emotions efficiently. For the development of an engagement prediction model, it is necessary to have a widely accepted standard dataset for engagement analysis. DAiSEE is one such dataset: it consists of in-the-wild data and has a gold standard annotation for engagement prediction. Earlier research using the DAiSEE dataset involved training and testing standard models such as CNN-based models, but the results were not satisfactory by industry standards. In this paper, a multi-level classification approach is introduced to create a more robust model for engagement analysis using the DAiSEE dataset. This approach has recorded testing accuracies of 0.638, 0.7728, 0.8195, and 0.866 for predicting boredom level, engagement level, confusion level, and frustration level, respectively.

Keywords: computer vision, engagement prediction, deep learning, multi-level classification

Procedia PDF Downloads 91
1282 Imbalance on the Croatian Housing Market in the Aftermath of an Economic Crisis

Authors: Tamara Slišković, Tomislav Sekur

Abstract:

This manuscript examines factors that affect demand and supply in the housing market in Croatia. The period from the beginning of this century until 2008 was characterized by a strong expansion of the construction, housing and real estate markets in general. Demand for residential units was expanding, and this was supported by favorable bank lending conditions. Indicators on the supply side, such as the number of newly built houses and the construction volume index, were also increasing. Rapid growth of demand, along with somewhat slower supply growth, led to a situation in which new apartments were sold before the completion of residential buildings. This resulted in a rise in housing prices, which indicated a clear link between housing prices and supply and demand in the housing market. However, after 2008 general economic conditions in Croatia worsened and demand for housing fell dramatically, while supply declined at a much slower pace. Given that there is a gap between supply and demand, it can be concluded that the housing market in Croatia is in imbalance. Such a trend is accompanied by a relatively small decrease in housing prices. The final result of such movements is a large number of unsold housing units at relatively high price levels. For this reason, it can be argued that housing prices are sticky and that, consequently, the price level in the aftermath of a crisis does not correspond to the discrepancy between supply and demand on the Croatian housing market. The degree of rigidity of the housing price can be determined by including the housing price as an explanatory variable in the housing demand function. Other independent variables are a demographic variable (e.g. the number of households), the interest rate on housing loans, households' disposable income and rent. 
The equilibrium price is reached when the demand for housing equals its supply, and the speed of adjustment of actual prices to equilibrium prices reveals the extent to which the prices are rigid. The latter requires inclusion of the housing prices with time lag as an independent variable in estimating demand function. We also observe the supply side of the housing market, in order to explain to what extent housing prices explain the movement of new construction activity, and other variables that describe the supply. In this context, we test whether new construction on the Croatian market is dependent on current prices or prices with a time lag. Number of dwellings is used to approximate new construction (flow variable), while the housing prices (current or lagged), quantity of dwellings in the previous period (stock variable) and a series of costs related to new construction are independent variables. We conclude that the key reason for the imbalance in the Croatian housing market should be sought in the relative relationship of price elasticities of supply and demand.

Keywords: Croatian housing market, economic crisis, housing prices, supply imbalance, demand imbalance

Procedia PDF Downloads 242
1281 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed for miRNA prediction. Three problems commonly associated with previous approaches are alleviated. These problems arise from imposing assumptions on the secondary structure of pre-miRNA, from the imbalance between the numbers of laboratory-checked miRNAs and pseudo-hairpins, and from using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers, namely Triplet-SVM, Virgo and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The results of the proposed approach outperform the three base classifiers. Improved metric values of 88.88% f-score, 92.73% accuracy, 90.64% precision, 96.64% specificity, 87.2% sensitivity, and an area under the ROC curve of 0.91 are achieved.

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 308
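
The metrics reported in the abstract above (f-score, accuracy, precision, specificity, sensitivity) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)      # recall, or true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-score: harmonic mean of precision and sensitivity.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=8, fp=2, tn=18, fn=2)
```

On imbalanced data, accuracy alone can look high while sensitivity is poor, which is why abstracts like this one report several of these values side by side.
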
1280 Face Recognition Using Body-Worn Camera: Dataset and Baseline Algorithms

Authors: Ali Almadan, Anoop Krishnan, Ajita Rattani

Abstract:

Facial recognition is a widely adopted technology in surveillance, border control, healthcare, banking services, and lately, in mobile user authentication, with Apple introducing the “Face ID” moniker with the iPhone X. A lot of research has been conducted in the area of face recognition on datasets captured by surveillance cameras, DSLR, and mobile devices. Recently, face recognition technology has also been deployed on body-worn cameras to keep officers safe, enabling situational awareness and providing evidence for trial. However, limited academic research has been conducted on this topic so far, and no publicly available dataset with a sufficient sample size exists. This paper aims to advance research in the area of face recognition using body-worn cameras. To this end, the contribution of this work is two-fold: (1) collection of a dataset consisting of a total of 136,939 facial images of 102 subjects captured using body-worn cameras in indoor and daylight conditions and (2) evaluation of various deep-learning architectures for face identification on the collected dataset. Experimental results suggest a maximum True Positive Rate (TPR) of 99.86% at a False Positive Rate (FPR) of 0.000, obtained by a SphereFace-based deep learning architecture in daylight conditions. The collected dataset and the baseline algorithms will promote further research and development. A download link for the dataset and the algorithms is available by contacting the authors.

Keywords: face recognition, body-worn cameras, deep learning, person identification

Procedia PDF Downloads 137
1279 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS), based on data mining techniques, to assess the difficulty of the questions. The following steps are implemented. First, a dataset from an online international learning system (slepemapy.cz) is used; the dataset contains over 1,300,000 records with 9 features covering student, question and answer information with feedback evaluation. Next, normalization is applied as a preprocessing step. Then the FCM clustering algorithm is used to adapt the difficulty of the questions. The result is data labeled with three clusters (easy, intermediate, difficult) according to the highest weight. The FCM algorithm assigns a label to each question one by one. Then a Random Forest (RF) classifier model is constructed on the clustered dataset, using 70% of the dataset for training and 30% for testing; the model achieves a 99.9% accuracy rate. This approach improves the adaptive e-learning system because it depends on student behavior and gives more accurate results in the evaluation process than an evaluation system that depends on feedback only.

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 164
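
The labeling step described in the abstract above, where each question receives the cluster with the highest weight, corresponds to taking the largest fuzzy C-means membership. The sketch below shows the standard FCM membership formula for fixed cluster centers. The function names, the fuzzifier `m = 2`, the one-dimensional feature, and the mapping of centers to difficulty names are assumptions for illustration, not details from the paper.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0 for d in dists):
        # The point coincides with a center: give it full membership there.
        hits = [d == 0 for d in dists]
        return [1 / sum(hits) if h else 0.0 for h in hits]
    exponent = 2 / (m - 1)
    # u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1))
    return [1 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]

def hard_label(point, centers, names, m=2.0):
    """Pick the cluster name with the highest membership weight."""
    u = fcm_memberships(point, centers, m)
    return names[u.index(max(u))]

# Hypothetical 1-D "difficulty score" centers and label names.
centers = [(0.0,), (3.0,), (6.0,)]
names = ["easy", "intermediate", "difficult"]
u = fcm_memberships((1.0,), [(0.0,), (3.0,)])
label = hard_label((5.5,), centers, names)
```

A full FCM run alternates this membership update with a center update until convergence; only the membership half is shown here.
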
1278 Using Satellite Images Datasets for Road Intersection Detection in Route Planning

Authors: Fatma El-Zahraa El-Taher, Ayman Taha, Jane Courtney, Susan Mckeever

Abstract:

Understanding road networks plays an important role in navigation applications such as self-driving vehicles and route planning for individual journeys. Intersections of roads are essential components of road networks. Understanding the features of an intersection, from a simple T-junction to larger multi-road junctions, is critical to decisions such as crossing roads or selecting the safest routes. The identification and profiling of intersections from satellite images is a challenging task. While deep learning approaches offer the state of the art in image classification and detection, the availability of training datasets is a bottleneck in this approach. In this paper, a labelled satellite image dataset for the intersection recognition problem is presented. It consists of 14,692 satellite images of Washington DC, USA. To support other users of the dataset, an automated download and labelling script is provided for dataset replication. The challenges of construction and fine-grained feature labelling of a satellite image dataset are examined, including the issue of how to address features that are spread across multiple images. Finally, the accuracy of the detection of intersections in satellite images is evaluated.

Keywords: satellite images, remote sensing images, data acquisition, autonomous vehicles

Procedia PDF Downloads 111
1277 The Validation and Reliability of the Arabic Effort-Reward Imbalance Model Questionnaire: A Cross-Sectional Study among University Students in Jordan

Authors: Mahmoud M. AbuAlSamen, Tamam El-Elimat

Abstract:

Amid the economic crisis in Jordan, the Jordanian government has opted for a knowledge economy where education is promoted as a means of economic development. University education usually comes at the expense of study-related stress that may adversely impact the health of students. Since stress is a latent variable that is difficult to measure, a valid tool should be used in doing so. The effort-reward imbalance (ERI) is a model used as a measurement tool for occupational stress. The model was built on the notion of reciprocity, which relates ‘effort’ to ‘reward’ through the mediating ‘over-commitment’. Reciprocity assumes equilibrium between both effort and reward, where ‘high’ effort is adequately compensated with ‘high’ reward. When this equilibrium is violated (i.e., high effort with low reward), this may elicit negative emotions and stress, which have been correlated to adverse health conditions. The theory of ERI was established in many different parts of the world, and associations with chronic diseases and the health of workers were explored at length. While much of the effort-reward imbalance was investigated in work conditions, there has been a growing interest in understanding the validity of the ERI model when applied to other social settings such as schools and universities. The ERI questionnaire was developed in Arabic recently to measure ERI among high school teachers. However, little information is available on the validity of the ERI questionnaire in university students. A cross-sectional study was conducted on 833 students in Jordan to measure the validity and reliability of the ERI questionnaire in Arabic among university students. Reliability, as measured by Cronbach’s alpha of the effort, reward, and overcommitment scales, was 0.73, 0.76, and 0.69, respectively, suggesting satisfactory reliability. The factorial structure was explored using principal axis factoring. 
The results fitted a five-factor solution in which the effort and overcommitment scales were each unidimensional, while the reward scale was three-dimensional, with the factors ‘support’, ‘esteem’, and ‘security’. The solution explained 56% of the variance in the data. The established ERI theory was replicated with excellent validity in this study. The effort-reward ratio in university students was 1.19, which suggests a slight degree of failed reciprocity. The study also investigated the association of effort, reward, overcommitment, and ERI with participants’ demographic factors and self-reported health. ERI was found to be significantly associated with absenteeism (p < 0.0001), past history of failed courses (p = 0.03), and poor academic performance (p < 0.001). Moreover, ERI was found to be associated with poor self-reported health among university students (p = 0.01). In conclusion, the Arabic ERI questionnaire is reliable and valid for use in measuring effort-reward imbalance in university students in Jordan. The results of this research are important in informing higher education policy in Jordan.

Keywords: effort-reward imbalance, factor analysis, validity, self-reported health

Procedia PDF Downloads 91
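As a rough illustration of the reliability statistic reported in entry 1277 above, Cronbach's alpha for a scale of k items can be computed from the item variances and the variance of the summed score. The sketch below assumes complete responses (no missing values); the function name and the toy scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of k lists, one per questionnaire item, each holding the
    scores of the same n respondents in the same order (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("every item needs one score per respondent")
    # Total score of each respondent across all items of the scale.
    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_var = sum(variance(item) for item in items)
    total_var = variance(totals)
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Toy data: three items answered by five respondents (illustrative only).
toy_scale = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
print(round(cronbach_alpha(toy_scale), 2))
```

Values around 0.7, such as those reported in the abstract, are conventionally read as acceptable internal consistency.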
1276 Adaptive Swarm Balancing Algorithms for Rare-Event Prediction in Imbalanced Healthcare Data

Authors: Jinyan Li, Simon Fong, Raymond Wong, Mohammed Sabah, Fiaidhi Jinan

Abstract:

Clinical data analysis and forecasting have made great contributions to disease control, prevention and detection. However, such data usually suffer from highly unbalanced samples in class distributions. In this paper, we target binary imbalanced datasets, where the positive samples make up only a minority. We investigate two different meta-heuristic algorithms, particle swarm optimization and the bat-inspired algorithm, and combine both of them with the synthetic minority over-sampling technique (SMOTE) for processing the datasets. One approach is to process the full dataset as a whole. The other is to split up the dataset and adaptively process it one segment at a time. The experimental results reveal that while the performance improvements obtained by the former method are not scalable to larger data scales, the latter one, which we call Adaptive Swarm Balancing Algorithms, leads to significant efficiency and effectiveness improvements on large datasets. We also find it more consistent with the practice of typical large imbalanced medical datasets. We further use the meta-heuristic algorithms to optimize two key parameters of SMOTE, leading to more credible classifier performance and a shorter running time compared with the brute-force method.

Keywords: Imbalanced dataset, meta-heuristic algorithm, SMOTE, big data

Procedia PDF Downloads 413
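For readers unfamiliar with SMOTE, mentioned in entry 1276 above, the core idea is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step only; it is not the authors' adaptive swarm-based method, and the function name, parameters `n_new` and `k`, and the toy points are assumptions made for illustration.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Assumes at least two minority samples, all of the same dimension.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # The k nearest minority neighbours of x (excluding x itself).
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(xi + gap * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic


# Toy minority class: four corners of the unit square (illustrative only).
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(toy_minority, n_new=6, k=2)
print(len(new_points))  # 6
```

Because every synthetic point lies on a segment between two existing minority points, the result depends on how many neighbours are considered and how many points are generated, which is why such parameters are natural candidates for tuning.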
1275 Unreliable Production Lines with Simultaneously Unbalanced Operation Time Means, Breakdown, and Repair Rates

Authors: Sabry Shaaban, Tom McNamara, Sarah Hudson

Abstract:

This paper investigates the benefits of deliberately unbalancing both operation time means (MTs) and unreliability (failure and repair rates) for non-automated production lines. The lines were simulated with various line lengths, buffer capacities, degrees of imbalance and patterns of MT and unreliability imbalance. Data on two performance measures, namely throughput (TR) and average buffer level (ABL), were gathered, analyzed and compared to a balanced line counterpart. A number of conclusions were made with respect to the ranking of configurations, as well as to the relationships among the independent design parameters and the dependent variables. It was found that the best configurations are a balanced line arrangement and a monotone decreasing MT order, coupled with either a decreasing or a bowl unreliability configuration, with the first generally resulting in a reduced TR and the second leading to a lower ABL than those of a balanced line.

Keywords: unreliable production lines, unequal mean operation times, unbalanced failure and repair rates, throughput, average buffer level

Procedia PDF Downloads 462
1274 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security plays a key role in the ICT environment because the number of malicious users is continually growing in education, business, and other ICT-related fields. Network security violations are typically described and examined centrally by a security event management system. Firewalls, Intrusion Detection Systems (IDS), and Intrusion Prevention Systems are becoming essential to monitor or prevent potential violations, attack incidents, and imminent threats. In this system, the firewall rules are set only where the system policies require them. The datasets deployed in this system are derived from the testbed environment. DoS and PortScan traffic is applied in the testbed with the firewall and IDS implementation. The network traffic is classified as normal or attack in the existing testbed environment using six machine learning classification methods applied in the system. Tests are required to obtain the datasets, which are then applied to DoS and PortScan traffic. The dataset is based on CICIDS2017, and some features have been added. This system tested 26 features from the applied dataset. The system aims to reduce false positive rates and to improve accuracy in the implemented testbed design. The system also demonstrates good performance by selecting important features and comparing against an existing dataset using machine learning classifiers.

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 96
1273 Drone Classification Using Classification Methods Using Conventional Model With Embedded Audio-Visual Features

Authors: Hrishi Rakshit, Pooneh Bagheri Zadeh

Abstract:

This paper investigates the performance of drone classification methods using conventional DCNNs with different hyperparameters when additional drone audio data is embedded in the dataset for training and further classification. First, a custom dataset is created using different images of drones from University of Southern California (USC) datasets and Leeds Beckett University datasets, with embedded drone audio signals. Three well-known DCNN architectures, namely Resnet50, Darknet53 and Shufflenet, are employed on the created dataset, tuning their hyperparameters, such as learning rate, maximum epochs, and mini-batch size, with different optimizers. Precision-recall curves and F1 score-threshold curves are used to evaluate the performance of the named classification algorithms. Experimental results show that Resnet50 has the highest efficiency compared to the other DCNN methods.

Keywords: drone classifications, deep convolutional neural network, hyperparameters, drone audio signal

Procedia PDF Downloads 56
1272 Association of MMP-2,-9 Overexpression and Imbalance PGR-A/PGR-B Ratio in Endometriosis

Authors: P. Afsharian, S. Mousazadeh, M. Shahhoseini, R. Aflatoonian

Abstract:

Introduction: Matrix MetalloProteinases (MMPs) degrade extracellular matrix components to provide normal remodeling and contribute to pathological tissue destruction and cell migration in endometriosis. It is accepted that MMPs are resistant to suppression by progesterone in endometriotic tissues. The physiological effects of progesterone are mediated by its two progesterone receptor (PGR) isoforms, namely PGR-A and PGR-B. The capacity of progesterone to affect gene expression depends on the PGR-A/PGR-B ratio. An imbalanced ratio in endometriotic tissue may be an important mechanism that results in progesterone resistance, modifies progesterone action via differential regulation of specific progesterone response genes, and advances endometriosis. Material and methods: RNA was extracted from twenty ectopic (endometriotic) and eutopic (endometrial) tissue samples of women undergoing laparoscopy for endometriosis and 20 healthy fertile women at Royan Institute, Tehran, Iran. Analysis of PGR-A, PGR-B, MMP-2 and MMP-9 mRNA expression was performed using real-time PCR in ectopic and eutopic tissues. Statistical analysis was then performed according to the 2^-ΔΔCT equation for all samples. Results: Quantitative RT–PCR analyses of PGR-A and PGR-B mRNA revealed differences in the mRNA expression of both PGR isoforms between ectopic and control eutopic tissues. We were able to demonstrate low expression levels of the PGR-B isoform in ectopic tissues. However, PGR-A expression was significantly higher in the same ectopic samples compared to controls. This method also permitted us to demonstrate significant overexpression of MMP-2 and MMP-9 in ectopic samples compared to control endometrial tissues. Conclusions: Our data suggest that low expression levels of PGR-B and overexpression of PGR-A can alter the PGR-A/PGR-B ratio in endometriotic ectopic tissues. An imbalanced ratio of PGRs in endometriotic tissue may result in MMP-2 and MMP-9 overexpression, which can be important in the pathogenesis and treatment of the disease.

Keywords: endometriosis, matrix metalloproteinases, progesterone receptor -A and -B, PGR-A/PGR-B ratio

Procedia PDF Downloads 290
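The 2^-ΔΔCT calculation named in entry 1272 above can be sketched in a few lines. The sketch assumes the usual form of the method, in which each target gene's Ct is first normalized to a reference (housekeeping) gene and then compared with the control group; the abstract does not name the reference gene, so the function name, argument names, and Ct values below are purely hypothetical.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^-ddCt method.

    All arguments are cycle-threshold (Ct) values; the reference gene is an
    assumed housekeeping gene measured in the same samples.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the ectopic sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the control tissue
    delta_delta = delta_sample - delta_control           # ddCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the control, relative to the reference gene.
print(fold_change(24.0, 18.0, 26.0, 18.0))  # 4.0
```

A result above 1 indicates higher expression in the sample than in the control, and a result below 1 indicates lower expression.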
1271 2D Fingerprint Performance for PubChem Chemical Database

Authors: Fatimah Zawani Abdullah, Shereena Mohd Arif, Nurul Malim

Abstract:

The study of molecular similarity search in chemical databases is increasingly widespread, especially in the area of drug discovery. Similarity search is an application in the field of chemoinformatics that measures the similarity between a molecular structure, known as the query, and the structures of the chemical compounds in a database. Similarity search is also one of the approaches in virtual screening, which involves computational techniques and scoring the probabilities of activity. The main objective of this work is to determine the best of the six fingerprints selected in this study using the PubChem chemical dataset. This paper discusses the similarity searching process conducted using 6 types of descriptors, namely ECFP4, ECFC4, FCFP4, FCFC4, SRECFC4 and SRFCFC4, on 15 activity classes of the PubChem dataset, using the Tanimoto coefficient to calculate the similarity between the query structures and each database structure. The results suggest that ECFP4 performs best when used with the Tanimoto coefficient on the PubChem dataset.

Keywords: 2D fingerprints, Tanimoto, PubChem, similarity searching, chemoinformatics

Procedia PDF Downloads 261
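The Tanimoto-based ranking described in entry 1271 above can be illustrated with a small sketch. It assumes binary fingerprints represented as sets of "on" bit positions, for which the coefficient is c / (a + b - c), where a and b are the numbers of bits set in each fingerprint and c is the number they share; count-based fingerprints would need the generalized form. The compound names and bit sets are invented for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(fp_a & fp_b)
    return common / (len(fp_a) + len(fp_b) - common)


def rank_database(query_fp, database):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints (bit positions are arbitrary).
query = {1, 2, 3}
toy_database = {
    "compound_A": {2, 3, 4},
    "compound_B": {1, 2, 3, 9},
    "compound_C": {7, 8},
}
print(tanimoto(query, toy_database["compound_A"]))  # 0.5
print(rank_database(query, toy_database)[0][0])     # compound_B
```

In a similarity search, the top-ranked structures for each query are then checked against the known activity class to compare fingerprints.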
1270 A Large Dataset Imputation Approach Applied to Country Conflict Prediction Data

Authors: Benjamin Leiby, Darryl Ahner

Abstract:

This study demonstrates an alternative stochastic imputation approach for large datasets when preferred commercial packages struggle to iterate due to numerical problems. A large country conflict dataset motivates the search to impute missing values well over a common threshold of 20% missingness. The methodology capitalizes on correlation while using model residuals to provide the uncertainty in estimating unknown values. Examination of the methodology provides insight toward choosing linear or nonlinear modeling terms. Static tolerances common in most packages are replaced with tailorable tolerances that exploit residuals to fit each data element. The methodology evaluation includes observing computation time, model fit, and the comparison of known values to replaced values created through imputation. Overall, the country conflict dataset illustrates promise with modeling first-order interactions while presenting a need for further refinement that mimics predictive mean matching.

Keywords: correlation, country conflict, imputation, stochastic regression

Procedia PDF Downloads 90
1269 Fine Grained Action Recognition of Skateboarding Tricks

Authors: Frederik Calsius, Mirela Popa, Alexia Briassouli

Abstract:

In the field of machine learning, it is common practice to use benchmark datasets to prove the working of a method. The domain of action recognition in videos often uses datasets like Kinetics, Something-Something, UCF-101 and HMDB-51 to report results. Considering the properties of the datasets, there are no datasets that focus solely on very short clips (2 to 3 seconds), and on highly-similar fine-grained actions within one specific domain. This paper researches how current state-of-the-art action recognition methods perform on a dataset that consists of highly similar, fine-grained actions. To do so, a dataset of skateboarding tricks was created. The performed analysis highlights both benefits and limitations of state-of-the-art methods, while proposing future research directions in the activity recognition domain. The conducted research shows that the best results are obtained by fusing RGB data with OpenPose data for the Temporal Shift Module.

Keywords: activity recognition, fused deep representations, fine-grained dataset, temporal modeling

Procedia PDF Downloads 200
1268 Calycosin Ameliorates Osteoarthritis by Regulating the Imbalance Between Chondrocyte Synthesis and Catabolism

Authors: Hong Su, Qiuju Yan, Wei Du, En Hu, Zhaoyu Yang, Wei Zhang, Yusheng Li, Tao Tang, Wang yang, Shushan Zhao

Abstract:

Osteoarthritis (OA) is a severe chronic inflammatory disease. As the main active component of Astragalus mongholicus Bunge, a classic traditional ethnic herb, calycosin exhibits anti-inflammatory action, but its mechanism and exact targets in OA have yet to be determined. In this study, we established an anterior cruciate ligament transection (ACLT) mouse model. Mice were randomized to sham, OA, and calycosin groups. The cartilage synthesis markers type II collagen (Col-2) and SRY-Box Transcription Factor 9 (Sox-9) increased significantly after calycosin gavage, while expression of the cartilage matrix degradation markers cyclooxygenase-2 (COX-2), phospho-epidermal growth factor receptor (p-EGFR), and matrix metalloproteinase-9 (MMP9) decreased. With the help of network pharmacology and molecular docking, these results were confirmed in chondrocyte ATDC5 cells. Our results indicate that calycosin treatment significantly improved cartilage damage, probably by reversing the imbalance between chondrocyte synthesis and catabolism.

Keywords: calycosin, osteoarthritis, network pharmacology, molecular docking, inflammatory, cyclooxygenase 2

Procedia PDF Downloads 62
1267 Mapping of Urban Green Spaces Towards a Balanced Planning in a Coastal Landscape

Authors: Rania Ajmi, Faiza Allouche Khebour, Aude Nuscia Taibi, Sirine Essasi

Abstract:

Urban green spaces (UGS) can be a significant contributor to sustainable development. A spatial method was employed to assess and map the spatial distribution of UGS in five districts in Sousse, Tunisia. Ecological management of UGS is an essential factor for the sustainable development of the city; hence the municipality of Sousse has decided to support the districts according to their different green space characteristics. To implement this policy, (1) a new GIS web application was developed, (2) the various green spaces were then recorded in it, (3) a spatial mapping of UGS was produced using Quantum GIS, and (4) finally, data processing and statistical analysis were carried out in RStudio. The intersection of the results of the spatial and statistical analyses highlighted an imbalance in the spatial distribution of UGS in the study area. The coast and the city's green spaces are disconnected because they were not designed in a spirit of network and connection, hence the lack of a greenway that connects these spaces to the city. Finally, this GIS support will be used by decision-makers to assess and monitor green spaces in the city of Sousse and will contribute to improving the well-being of the local population.

Keywords: distributions, GIS, green space, imbalance, spatial analysis

Procedia PDF Downloads 166
1266 Audit of TPS photon beam dataset for small field output factors using OSLDs against RPC standard dataset

Authors: Asad Yousuf

Abstract:

Purpose: The aim of the present study was to audit a treatment planning system (TPS) beam dataset for small field output factors against the standard dataset produced by the Radiological Physics Center (RPC) from a multicenter study. Such data are crucial for the validity of special techniques, i.e., IMRT or stereotactic radiosurgery. Materials/Method: In this study, multiple small field size output factor datasets were measured and calculated for 6 to 18 MV x-ray beams using the RPC-recommended methods. These beam datasets were measured at 10 cm depth for 10 × 10 cm² to 2 × 2 cm² field sizes, defined by collimator jaws at 100 cm. The measurements were made with Landauer nanoDot OSLDs, whose volume is small enough to gather a full ionization reading even for the 1 × 1 cm² field size. At our institute, the beam data including output factors have been commissioned at 5 cm depth with an SAD setup. For comparison with the RPC data, the output factors were converted to an SSD setup using tissue phantom ratios. The SSD setup also enables coverage of the ion chamber in the 2 × 2 cm² field size. The measured output factors were also compared with those calculated by Eclipse™ treatment planning software. Results: The measured and calculated output factors agree with the RPC dataset within 1% and 4%, respectively. The large discrepancies in the TPS reflect the increased challenge of converting measured data into a commissioned beam model for very small fields. Conclusion: OSLDs are a simple, durable, and accurate tool to verify doses delivered using small photon beam fields down to a 1 × 1 cm² field size. The study emphasizes that the treatment planning system should always be evaluated for small field output factors to ensure accurate dose delivery in a clinical setting.

Keywords: small field dosimetry, optically stimulated luminescence, audit treatment, radiological physics center

Procedia PDF Downloads 299
1265 Analysis of Real Time Seismic Signal Dataset Using Machine Learning

Authors: Sujata Kulkarni, Udhav Bhosle, Vijaykumar T.

Abstract:

Due to the closeness between seismic signals and non-seismic signals, it is vital to detect earthquakes using conventional methods. In order to distinguish between seismic events and non-seismic events depending on their amplitude, our study processes the data that come from seismic sensors. The authors suggest a robust noise suppression technique that makes use of a bandpass filter, an IIR Wiener filter, recursive short-term average/long-term average (STA/LTA), and Carl short-term average/long-term average (Carl STA/LTA) for event identification. The trigger ratio used in the proposed study to differentiate between seismic and non-seismic activity is determined. The proposed work focuses on significant feature extraction for machine learning-based seismic event detection. This serves as motivation for compiling a dataset of all features for the identification and forecasting of seismic signals. We place a focus on feature vector dimension reduction techniques due to the temporal complexity. The proposed notable features were experimentally tested using a machine learning model, and the results on unseen data are optimal. Finally, a presentation using a hybrid dataset (captured by different sensors) demonstrates how this model may also be employed in a real-time setting while lowering false alarm rates. The planned study is based on the examination of seismic signals obtained from both individual sensors and sensor networks (SN). Wideband seismic signals from the BSVK and CUKG station sensors, located respectively near Basavakalyan, Karnataka, and at the Central University of Karnataka, make up the experimental dataset.

Keywords: Carl STA/LTA, features extraction, real time, dataset, machine learning, seismic detection

Procedia PDF Downloads 66
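To make the STA/LTA trigger ratio from entry 1265 above concrete, the sketch below computes the classic (non-recursive) form: the mean signal energy over a short trailing window divided by the mean energy over a longer trailing window. It is not the recursive or Carl variant named in the abstract, and it omits the bandpass and Wiener filtering; the window lengths, threshold, and synthetic trace are assumptions for illustration.

```python
def classic_sta_lta(signal, n_sta, n_lta):
    """Classic STA/LTA ratio per sample, using trailing windows of squared amplitude.

    Samples before the first full long window get a ratio of 0.0.
    """
    if not 0 < n_sta <= n_lta:
        raise ValueError("expected 0 < n_sta <= n_lta")
    energy = [x * x for x in signal]
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = sum(energy[i - n_sta + 1 : i + 1]) / n_sta
        lta = sum(energy[i - n_lta + 1 : i + 1]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


# Synthetic trace: low-level noise with a short high-amplitude burst.
trace = [0.1] * 50 + [2.0] * 5 + [0.1] * 20
ratios = classic_sta_lta(trace, n_sta=5, n_lta=30)

threshold = 3.0  # hypothetical trigger ratio
triggered = [i for i, r in enumerate(ratios) if r > threshold]
print(triggered[0], triggered[-1])
```

During quiet stretches the two averages are equal and the ratio stays near 1, so only samples around the burst exceed the threshold.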