Search results for: data normalization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24231

24201 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving bioassay predictions. The first step is instance selection, in which the dataset is divided into training, testing, and validation sets. The second step is discretization, which partitions the data in consideration of accuracy vs. precision. The third step is normalization, where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection, where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning tools including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combinations of preprocessing steps and machine learning algorithms in producing more consistent and accurate predictions.
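
The first and third steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the 60/20/20 split ratios, and the fixed random seed are hypothetical choices that the abstract does not specify.

```python
import random

def split_dataset(rows, train=0.6, test=0.2, seed=0):
    """Shuffle rows and split them into training, testing, and validation sets.

    The 60/20/20 ratios are an assumption; whatever remains after the
    training and testing portions becomes the validation set.
    """
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_test = int(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In practice the minimum and maximum are often taken from the training set only and reused for the other sets, so that no information leaks from the test data.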

Keywords: bioassay, machine learning, preprocessing, virtual screen

Procedia PDF Downloads 250
24200 Jordan Water District Interactive Billing and Accounting Information System

Authors: Adrian J. Forca, Simeon J. Cainday III

Abstract:

The Jordan Water District Interactive Billing and Accounting Information System is designed to improve the efficiency and effectiveness of the services Jordan Water District provides to its customers. It computes water bills quickly and accurately by automating the manual process, and it ensures that correct rates and fees are applied. In addition to the billing process, a mobile app will be integrated to support rapid and accurate water bill generation. An interactive feature will be incorporated to support electronic billing for customers who wish to receive water bills by electronic mail. The system will also organize accounting processes and reduce data inaccuracy, because data will be stored in a database whose logical design is achieved through normalization. Furthermore, strict programming constraints will be enforced to validate account access privileges based on job function and on the data being stored and retrieved, to ensure data security, reliability, and accuracy. The system will be able to handle the billing and accounting services of Jordan Water District, replacing the manual process and adapting to modern technological innovations.

Keywords: accounting, bill, information system, interactive

Procedia PDF Downloads 224
24199 Enhancement Method of Network Traffic Anomaly Detection Model Based on Adversarial Training With Category Tags

Authors: Zhang Shuqi, Liu Dan

Abstract:

Intelligent network traffic anomaly detection models suffer from problems such as low detection accuracy caused by a lack of training samples and poor detection of attacks with few samples. To address these problems, a classification model enhancement method, F-ACGAN (Flow Auxiliary Classifier Generative Adversarial Network), which introduces a generative adversarial network and adversarial training, is proposed. Generating adversarial data with category labels can enhance the training effect and improve classification accuracy and model robustness. F-ACGAN consists of three steps. First, feature preprocessing is performed, which includes data type conversion, dimensionality reduction, and normalization. Second, a generative adversarial network model with feature learning ability is designed, and the sample generation effect of the model is improved through adversarial iterations between generator and discriminator. Third, an adversarial disturbance factor in the gradient direction of the classification model is added to improve the diversity and antagonism of the generated data and to encourage the model to learn from adversarial classification features. An experiment constructing a classification model with the UNSW-NB15 dataset shows that enhancing the basic model with F-ACGAN improves classification accuracy by 8.09% and the F1 score by 6.94%.

Keywords: data imbalance, GAN, ACGAN, anomaly detection, adversarial training, data augmentation

Procedia PDF Downloads 75
24198 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from face images is a challenging task. Several related methods have not utilized dynamic facial features effectively for high performance. This paper proposes a high-performance method for emotion recognition using dynamic facial features. Initially, local features are captured in each frame by Gabor filters with different scales and orientations to find the position and scale of face parts against different backgrounds. The Gabor features are sent to an ensemble classifier for detecting Gabor facial features. The regions of dynamic features, which represent the dynamic variations of facial appearance, are captured from the Gabor facial features in consecutive frames. Each region of dynamic features is normalized using the Z-score normalization method and is then encoded into binary pattern features with the help of threshold values. The binary features are passed to a multi-class AdaBoost classifier, trained on a database containing happiness, sadness, surprise, fear, anger, disgust, and neutral expressions, to classify the discriminative dynamic features for emotion recognition. The developed method is evaluated on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and shows significant performance improvement over existing methods, owing to its dynamic features.
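
The normalization and encoding step can be illustrated with a short Python sketch. It assumes a region is given as a flat list of feature values and that a single threshold of 0 is used for the binary encoding; both are hypothetical simplifications, since the abstract does not state the threshold values or the data layout.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every z-score is 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def binarize(scores, threshold=0.0):
    """Encode each score as 1 if it exceeds the threshold, else 0.

    The threshold of 0.0 is an assumed default, not a value from the paper.
    """
    return [1 if s > threshold else 0 for s in scores]
```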

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 247
24197 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS), based on data mining techniques, to assess the difficulty of questions. The following steps are implemented. First, a dataset is taken from an online international learning system (slepemapy.cz); it contains over 1,300,000 records with 9 features covering student, question, and answer information with feedback evaluation. Next, a normalization process is applied as a preprocessing step. Then the FCM clustering algorithm is used to adapt the difficulty of the questions. The result is data labeled with three clusters (easy, intermediate, difficult) according to the highest weight. The FCM algorithm gives a label to all the questions one by one. Then a Random Forest (RF) classifier model is constructed on the clustered dataset, using 70% of the dataset for training and 30% for testing; the model achieves a 99.9% accuracy rate. This approach improves the adaptive e-learning system because it depends on student behavior and gives more accurate evaluation results than an evaluation system that depends on feedback only.

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 164
24196 Towards Long-Range Pixels Connection for Context-Aware Semantic Segmentation

Authors: Muhammad Zubair Khan, Yugyung Lee

Abstract:

Deep learning has recently attracted enormous attention in semantic image segmentation. Previously developed U-Net-inspired architectures operate with continuous stride and pooling operations, leading to spatial data loss. These methods also fail to establish long-range pixel connections that preserve context knowledge and reduce spatial loss in prediction. This article develops an encoder-decoder architecture with bi-directional LSTM embedded in long skip-connections and densely connected convolution blocks. The network non-linearly combines the feature maps across encoder-decoder paths to find dependency and correlation between image pixels. Additionally, the densely connected convolutional blocks are kept in the final encoding layer to reuse features and prevent redundant data sharing. The method applies batch normalization to reduce internal covariate shift in data distributions. The empirical evidence shows promising results for our method compared with other semantic segmentation techniques.
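
For readers unfamiliar with batch normalization, the training-time forward computation for a single feature can be sketched as follows. This is a generic textbook form, not the network described in the abstract; `gamma`, `beta`, and `eps` are the usual learnable scale, learnable shift, and numerical-stability constant, with assumed default values.

```python
from statistics import mean, pvariance

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift.

    Each value is centered on the batch mean and divided by the batch
    standard deviation (with eps added to the variance for stability).
    """
    mu = mean(batch)
    var = pvariance(batch, mu)
    return [gamma * (x - mu) / (var + eps) ** 0.5 + beta for x in batch]
```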

Keywords: deep learning, semantic segmentation, image analysis, pixels connection, convolution neural network

Procedia PDF Downloads 76
24195 Improve Student Performance Prediction Using Majority Vote Ensemble Model for Higher Education

Authors: Wade Ghribi, Abdelmoty M. Ahmed, Ahmed Said Badawy, Belgacem Bouallegue

Abstract:

In higher education institutions, the most pressing priority is to improve student performance and retention. Large volumes of student data are used in Educational Data Mining techniques to find new hidden information from students' learning behavior, particularly to uncover the early signs of at-risk students. On the other hand, data with noise, outliers, and irrelevant information may lead to incorrect conclusions. By identifying features of students' data that have the potential to improve performance prediction results, comparing and identifying the most appropriate ensemble learning technique after preprocessing the data, and optimizing the hyperparameters, this paper aims to develop a reliable students' performance prediction model for Higher Education Institutions. Data were gathered from two different systems: a student information system and an e-learning system for undergraduate students in the College of Computer Science of a Saudi Arabian State University. The cases of 4413 students were used in this article. The process includes data collection, data integration, data preprocessing (such as cleaning, normalization, and transformation), feature selection, pattern extraction, and, finally, model optimization and assessment. Random Forest, Bagging, Stacking, Majority Vote, and two types of Boosting techniques, AdaBoost and XGBoost, are the ensemble learning approaches used, whereas Decision Tree, Support Vector Machine, and Artificial Neural Network are the supervised learning techniques used. Hyperparameters for the ensemble learning systems will be fine-tuned to provide enhanced performance and optimal output. The findings imply that combining features of students' behavior from e-learning and students' information systems using Majority Vote produced better outcomes than the other ensemble techniques.
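
The Majority Vote idea can be shown with a minimal Python sketch. It assumes each base classifier is a plain callable that maps a sample to a label; the names and the tie-breaking rule are illustrative assumptions, not details from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the base classifiers' predictions.

    On a tie, Counter.most_common keeps the label that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])
```

A usage example with three hypothetical rule-based classifiers: `ensemble_predict([lambda s: "pass", lambda s: "fail", lambda s: "pass"], sample)` returns `"pass"`.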

Keywords: educational data mining, student performance prediction, e-learning, classification, ensemble learning, higher education

Procedia PDF Downloads 75
24194 Heart Attack Prediction Using Several Machine Learning Methods

Authors: Suzan Anwar, Utkarsh Goyal

Abstract:

Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardiovascular and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines the relationship between an individual's various heart health inputs, such as age, sex, cp, trestbps, thalach, and oldpeak, and the likelihood of developing heart disease. Machine learning techniques such as logistic regression and decision tree are used, implemented in Python. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having heart disease with variable accuracy. Logistic regression yielded an accuracy of 80.48% without data handling. With data handling (normalization, StandardScaler), logistic regression reached an improved accuracy of 87.80%, while decision tree, random forest, and SVM each reached 100%.

Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest

Procedia PDF Downloads 113
24193 Text Data Preprocessing Library: Bilingual Approach

Authors: Kabil Boukhari

Abstract:

In the context of information retrieval, the selection of the most relevant words is a very important step. Text cleaning keeps only the most representative words for better use. In this paper, we propose a library for text preprocessing, within an implemented application, to facilitate this task. This study has two purposes. The first is to present the related work on the various steps involved in text preprocessing, covering the segmentation, stemming, and lemmatization algorithms that could be efficient in the rest of the study. The second is to implement a tool for text preprocessing in French and English. This library accepts unstructured text as input and provides the preprocessed text as output, based on a set of rules and on a base of stop words for both languages. The proposed library has been tested on different corpora and gave interesting results.
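
A bilingual preprocessing step of this kind can be sketched as below. This is not the proposed library: the function name, the tiny stop-word sets, and the letter-based tokenization rule are all hypothetical, and stemming and lemmatization are left out.

```python
import re

# Tiny illustrative stop-word lists; a real base of stop words would be far larger.
STOP_WORDS = {
    "en": {"the", "a", "of", "is", "and"},
    "fr": {"le", "la", "les", "de", "et", "est"},
}

def preprocess(text, lang):
    """Lowercase the text, split it into alphabetic tokens, and drop stop words."""
    # [^\W\d_]+ matches runs of Unicode letters, so accented French words stay whole.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS[lang]]
```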

Keywords: text preprocessing, segmentation, knowledge extraction, normalization, text generation, information retrieval

Procedia PDF Downloads 60
24192 High-Throughput, Purification-Free, Multiplexed Profiling of Circulating miRNA for Discovery, Validation, and Diagnostics

Authors: J. Hidalgo de Quintana, I. Stoner, M. Tackett, G. Doran, C. Rafferty, A. Windemuth, J. Tytell, D. Pregibon

Abstract:

We have developed the Multiplexed Circulating microRNA assay that allows the detection of up to 68 microRNA targets per sample. The assay combines particle-based multiplexing, using patented Firefly hydrogel particles, with single-step RT-PCR signal. Thus, the Circulating microRNA assay leverages PCR sensitivity while eliminating the need for separate reverse transcription reactions and mitigating amplification biases introduced by target-specific qPCR. Furthermore, the ability to multiplex targets in each well eliminates the need to split valuable samples into multiple reactions. Results from the Circulating microRNA assay are interpreted using Firefly Analysis Workbench, which allows visualization, normalization, and export of experimental data. To aid discovery and validation of biomarkers, we have generated fixed panels for Oncology, Cardiology, Neurology, Immunology, and Liver Toxicology. Here we present the data from several studies investigating circulating and tumor microRNA, showcasing the ability of the technology to sensitively and specifically detect microRNA biomarker signatures from fluid specimens.

Keywords: biomarkers, biofluids, miRNA, photolithography, flowcytometry

Procedia PDF Downloads 328
24191 Madness in Susanna Kaysen’s Girl, Interrupted: A Focouldian Reading

Authors: Somaye Sabetnia

Abstract:

This paper probes Susanna Kaysen’s memoir Girl, Interrupted in the light of Michel Foucault’s theory of madness, comprehensively set forth in his History of Madness (1961). It is an endeavor to analyze this memoir based on Foucault’s idea of madness. In his archeological study of madness, Foucault introduces a way to perceive madness and its association with dominant discourses. He argues that the concept of madness is constructed within the social context, and that different institutions affect its definition. Furthermore, he takes into consideration how each era treats madness, and affirms that in modern times, people considered mad are exiled from cities, confined in madhouses, and later in clinics where they are treated with drugs. Set after World War II, the memoir under observation highlights the condition of women, who could either become housewives or follow their own desires; in fact, choosing the latter resulted in being labeled mad. The protagonist is labeled 'mad' and is hence impelled to go to an asylum, where so-called patients are under the vigilant surveillance of the authorities and go through the process of 'normalization.' To discern how she comes to be considered 'mad,' this article probes the dominant discourse of the time in which the story takes place, to provide a better understanding of madness under the impact of social, cultural, and political conditions. It examines how a so-called mad person is considered 'Other' and is treated after being confined by the disciplinary system of the asylum in a panoptic world. In addition, it describes how the aim of treatment is to punish and control a patient, not to cure. This article aims to show that Susanna Kaysen depicts what is defined as women’s madness as the result of the patriarchal society of post-war America, and that mental illness has nothing to do with blood; it is rather the result of the social inequality of the age.

Keywords: clinical treatment, disciplining and punishment, dominant discourse, normalization, other, panoptic world, reason vs. unreason

Procedia PDF Downloads 291
24190 Task Scheduling and Resource Allocation in Cloud-based on AHP Method

Authors: Zahra Ahmadi, Fazlollah Adibnia

Abstract:

Task scheduling and optimal resource allocation in the cloud depend on the dynamic nature of tasks and the heterogeneity of resources. Applications based on scientific workflows are among the most widely used applications in this field and require high processing power and storage capacity. To increase their efficiency, it is necessary to plan the tasks properly and select the best virtual machine in the cloud. The goals of the system are effective factors in task scheduling and resource selection, and they depend on various criteria such as time, cost, current workload, and processing power. Multi-criteria decision-making methods are a good choice in this field. In this research, a new method of task planning and resource allocation in a heterogeneous environment, based on a modified AHP algorithm, is proposed. In this method, input tasks are scheduled based on two criteria: execution time and size. Resource allocation combines the AHP algorithm with the first-come, first-served method. Resources are prioritized using the criteria of main memory size, processor speed, and bandwidth. To modify the AHP algorithm, normalization methods are considered; the Linear Max-Min and Linear Max normalization methods are the best choice for the algorithm and have a great impact on the ranking. The simulation results show a decrease in the average response time, return time, and execution time of input tasks in the proposed method compared to similar (basic) methods.
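
The two normalization methods named above are commonly defined in the multi-criteria decision-making literature as shown in the sketch below. The abstract does not give its exact formulas, so these standard forms are an assumption, as are the function names and the `benefit` flag (True when larger values are better, False for cost criteria). Values are assumed to be positive.

```python
def linear_max(values, benefit=True):
    """Linear Max normalization: divide by the maximum (or take 1 minus that for costs)."""
    m = max(values)
    return [v / m if benefit else 1 - v / m for v in values]

def linear_max_min(values, benefit=True):
    """Linear Max-Min normalization: rescale to [0, 1] using the range of the criterion."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All alternatives score the same on this criterion.
        return [0.0 for _ in values]
    if benefit:
        return [(v - lo) / span for v in values]
    return [(hi - v) / span for v in values]
```

For example, processor speeds of 2, 4, and 8 (a benefit criterion) become 0.25, 0.5, and 1.0 under Linear Max, while an execution-time criterion would be normalized with `benefit=False` so that shorter times rank higher.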

Keywords: hierarchical analytical process, work prioritization, normalization, heterogeneous resource allocation, scientific workflow

Procedia PDF Downloads 120
24189 A Visualization Classification Method for Identifying the Decayed Citrus Fruit Infected by Fungi Based on Hyperspectral Imaging

Authors: Jiangbo Li, Wenqian Huang

Abstract:

Early detection of fungal infection in citrus fruit is one of the major problems in the postharvest commercialization process. The automatic and nondestructive detection of infected fruits is still a challenge for the citrus industry. At present, the visual inspection of rotten citrus fruits is commonly performed by workers in citrus packinghouses, either through ultraviolet-induced fluorescence technology or through manual sorting, to remove fruit with fungal infection. However, the former entails a number of problems because exposing people to this kind of lighting is potentially hazardous to human health, and the latter is very inefficient. This study focuses on this problem, using orange as the research object, and proposes an effective method based on Vis-NIR hyperspectral imaging in the wavelength range of 400-1000 nm with a spectral resolution of 2.8 nm. In this work, three normalization approaches were applied prior to analysis to reduce the effect of sample curvature on spectral profiles, and mean normalization was found to be the most effective pretreatment for decreasing spectral variability due to curvature. Then, principal component analysis (PCA) was applied to a dataset composed of average spectra from decayed and normal tissue to reduce the dimensionality of the data and to observe the ability of Vis-NIR hyperspectra to discriminate between the two classes. It was observed that normal and decayed spectra were separable along the resultant first principal component (PC1) axis. Subsequently, five wavelengths (bands) centered at 577, 702, 751, 808, and 923 nm were selected as the characteristic wavelengths by analyzing the loadings of PC1. A multispectral combination image was generated from the five selected characteristic wavelength images.
Based on the obtained multispectral combination image, the intensity slicing pseudocolor image processing method was used to generate a 2-D visual classification image that enhances the contrast between normal and decayed tissue. Finally, an image segmentation algorithm for the detection of decayed fruit was developed based on the pseudocolor image coupled with a simple thresholding method. For the 238 investigated independent-set samples, comprising fruits infected by Penicillium digitatum and normal fruits, the success rates were 100% and 97.5%, respectively. The proposed algorithm was also used to identify oranges infected by Penicillium italicum, with 100% identification accuracy, indicating that the proposed multispectral algorithm is an effective method with the potential to be applied in the citrus industry.
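
Two of the simpler operations mentioned above, mean normalization of a spectrum and simple thresholding of an image, can be sketched as follows. The sketch assumes that mean normalization means dividing each spectrum by its own mean, which is one common form; the abstract does not define it, and the function names and threshold value are hypothetical.

```python
def mean_normalize(spectrum):
    """Divide each value by the mean of the whole spectrum.

    A uniform brightness change (such as one caused by surface curvature)
    scales every band equally, so it cancels out after this step.
    """
    avg = sum(spectrum) / len(spectrum)
    return [v / avg for v in spectrum]

def threshold_mask(image, threshold):
    """Mark pixels above the threshold with 1 and all others with 0."""
    return [[1 if px > threshold else 0 for px in row] for row in image]
```

Note that a spectrum and a copy of it scaled by a constant factor normalize to the same values, which is the property that makes this pretreatment useful against curvature effects.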

Keywords: citrus fruit, early rotten, fungal infection, hyperspectral imaging

Procedia PDF Downloads 270
24188 Investigating Changes in Hip and Knee Joints Position in Girls with Patellofemoral Syndrome

Authors: Taraneh Ashrafi Motlagh, Abdolrasoul Daneshjoo

Abstract:

Background and Aim: Increased fatigue causes injuries; the purpose of this article was to investigate the angular displacement of the hip and knee joints in girls with patellofemoral syndrome. Materials and Methods: Thirty girls (age 28.73±1.83, height 168.49±5.59, weight 63.73±12.73) participated in this study in two groups of 15, experimental and control. The jet evaluation test was taken of the subjects' knee and thigh angles, and these tests were then repeated on different inclines of the treadmill; the tests were examined in a neutral position and on positive and negative slopes of 5 degrees. The mean and standard deviation were used to describe the data, the Shapiro-Wilk test was used to check the normality of the data, and the variables in the two research groups were compared using an independent t-test and repeated-measures analysis of variance at a significance level of 0.05. Conclusion: In general, according to the current studies of people with patellofemoral syndrome, running on steep inclines, as well as running on a treadmill with the incline angle set within the limit of minus 5% to plus 5%, does not affect the improvement of this condition, and it is not recommended. And according to the research, girls with patellofemoral syndrome should be placed on the treadmill at an inclined angle to run.

Keywords: patellofemoral syndrome, angular displacement of the knee, angular displacement of the thigh

Procedia PDF Downloads 32
24187 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial component of maintaining a customer-oriented business, as in the telecom industry, is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this highly competitive market, especially the needs of those who are looking to switch service providers. Churn prediction is therefore now a mandatory requirement for retaining those customers, and machine learning can be utilized to accomplish it. Churn prediction has become a very important machine learning classification topic in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. To this end, the study makes use of feature selection, normalization, and feature engineering. It then compares the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Performance was evaluated using the F1 score and ROC-AUC. A comparison with existing models shows that this study produces better results. Gradient Boosting with the feature selection technique performed best, achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.
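
As a reminder of how the F1 score used in the evaluation is computed, here is a small self-contained Python sketch. It treats churn as the positive class (label 1), which is an assumed convention, and returns 0 when there are no true positives.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision and recall are both 0 (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```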

Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score

Procedia PDF Downloads 106
24186 COVID-19 Detection from Computed Tomography Images Using UNet Segmentation, Region Extraction, and Classification Pipeline

Authors: Kenan Morani, Esra Kaya Ayana

Abstract:

This study aimed to develop a novel pipeline for COVID-19 detection using a large and rigorously annotated database of computed tomography (CT) images. The pipeline consists of UNet-based segmentation, lung extraction, and a classification part, with optional slice removal techniques following the segmentation part. In this work, batch normalization was added to the original UNet model to produce a lighter model with better localization, which is then used to build a full pipeline for COVID-19 diagnosis. To evaluate the effectiveness of the proposed pipeline, various segmentation methods were compared in terms of their performance and complexity. The proposed segmentation method with batch normalization outperformed traditional methods and other alternatives, resulting in a higher Dice score on a publicly available dataset. Moreover, at the slice level, the proposed pipeline demonstrated high validation accuracy, indicating its efficiency in predicting 2D slices. At the patient level, the full approach exhibited higher validation accuracy and macro F1 score than other alternatives, surpassing the baseline. The classification component of the proposed pipeline uses a convolutional neural network (CNN) to make final diagnosis decisions. The COV19-CT-DB dataset, which contains a large number of CT scans with various types of slices and is rigorously annotated for COVID-19 detection, was used for classification. The proposed pipeline outperformed many other alternatives on the dataset.
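
The Dice score used to compare segmentation methods can be computed as in the sketch below. It assumes the predicted and reference masks are flattened into equal-length lists of 0s and 1s, and it returns 1.0 when both masks are empty, which is one common convention rather than a detail from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient: 2 * |overlap| / (|pred| + |target|) for binary masks."""
    overlap = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty, so they agree perfectly.
        return 1.0
    return 2 * overlap / total
```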

Keywords: classification, computed tomography, lung extraction, macro F1 score, UNet segmentation

Procedia PDF Downloads 100
24185 Breast Cancer Metastasis Detection and Localization through Transfer-Learning Convolutional Neural Network Classification Based on Convolutional Denoising Autoencoder Stack

Authors: Varun Agarwal

Abstract:

Introduction: With the advent of personalized medicine, histopathological review of whole slide images (WSIs) for cancer diagnosis presents an exceedingly time-consuming, complex task. Specifically, detecting metastatic regions in WSIs of sentinel lymph node biopsies necessitates a fully scanned, holistic evaluation of the image. Thus, digital pathology, low-level image manipulation algorithms, and machine learning provide significant advancements in improving the efficiency and accuracy of WSI analysis. Using Camelyon16 data, this paper proposes a deep learning pipeline to automate and improve breast cancer metastasis localization and WSI classification. Methodology: The model broadly follows five stages: region of interest detection, WSI partitioning into image tiles, convolutional neural network (CNN) image-segment classifications, probabilistic mapping of tumor localizations, and further processing for whole WSI classification. Transfer learning is applied to the task, with the implementation of Inception-ResNetV2, an effective CNN classifier that uses residual connections to enhance feature representation, adding convolved outputs in the inception unit to the proceeding input data. Moreover, in order to augment the performance of the transfer learning CNN, a stack of convolutional denoising autoencoders (CDAE) is applied to produce embeddings that enrich image representation. Through a saliency-detection algorithm, visual training segments are generated, which are then processed through a denoising autoencoder (primarily consisting of convolutional, leaky rectified linear unit, and batch normalization layers) and subsequently a contrast-normalization function. A spatial pyramid pooling algorithm extracts the key features from the processed image, creating a viable feature map for the CNN that minimizes spatial resolution and noise.
Results and Conclusion: The simplified and effective architecture of the fine-tuned transfer learning Inception-ResNetV2 network enhanced with the CDAE stack yields state-of-the-art performance in WSI classification and tumor localization, achieving AUC scores of 0.947 and 0.753, respectively. The convolutional feature retention and compilation with the residual connections to inception units, combined with the input denoising algorithm, enable the pipeline to serve as an effective, efficient tool in the histopathological review of WSIs.

Keywords: breast cancer, convolutional neural networks, metastasis mapping, whole slide images

Procedia PDF Downloads 106
24184 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant

Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani

Abstract:

Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes. Their ability to learn from large amounts of data makes them particularly effective models. The primary obstacle in improving the performance of these models is the careful choice of suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using hybrid Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) to obtain an optimized deep neural network (DNN) model. The plant uses an Upflow Anaerobic Sludge Blanket (UASB) digester that treats industrial wastewater from soft drinks and breweries. The digester has a working volume of 1574 m³ and a total volume of 1914 m³. Its internal diameter and height are 19 m and 7.14 m, respectively. The data preprocessing was conducted with meticulous attention to preserving data quality while avoiding data reduction. Three normalization techniques (MinMaxScaler, RobustScaler, and StandardScaler) were applied to the pre-processed data and compared with the non-normalized data. The RobustScaler approach showed strong predictive ability for estimating the volume of biogas produced. The highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.
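
To show why a robust scaler can behave differently from min-max or standard scaling when outliers are present, the following standard-library sketch centers on the median and divides by the interquartile range. It illustrates the general idea behind RobustScaler rather than reproducing the study's code or the library's implementation; the quartile method is an assumed choice.

```python
from statistics import median, quantiles

def robust_scale(values):
    """Center on the median and scale by the interquartile range (IQR).

    Because the median and IQR ignore extreme values, one outlier does not
    compress the rest of the data the way min-max scaling would.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    if iqr == 0:
        # No spread in the middle half of the data; only center it.
        return [v - med for v in values]
    return [(v - med) / iqr for v in values]
```

With the values 1, 2, 3, 4, and 100, the first four results stay between -1 and 0.5 while the outlier maps to 48.5, so the bulk of the data keeps a usable spread.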

Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning

Procedia PDF Downloads 12
24183 Information Retrieval for Kafficho Language

Authors: Mareye Zeleke Mekonen

Abstract:

The Kafficho language poses distinct challenges for information retrieval because of its limited resources and lack of standardized methods. In this work, with the cooperation and support of linguists and native speakers, we investigate the creation of an information retrieval system designed specifically for the Kafficho language. The system allows Kafficho speakers to access information efficiently and effectively. Our objective is to conduct an information retrieval experiment using 220 Kafficho text files and fifteen sample questions. Tokenization, normalization, stop word removal, stemming, and other data pre-processing tasks, together with term weighting, were prerequisites for the vector space model to represent each page and a particular query. The three well-known metrics we used to evaluate our work were Precision, Recall, and F-measure, with values of 87%, 28%, and 35%, respectively. This demonstrates how well the Kafficho information retrieval system performed using the vector space model.
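
The vector space model described above represents each document and the query as weighted term vectors and ranks documents by cosine similarity. The sketch below illustrates the idea under stated assumptions: whitespace tokenization stands in for the paper's tokenization, stop word removal and stemming steps, the TF-IDF weighting is one common variant rather than the authors' confirmed scheme, and the English toy documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (placeholder for real preprocessing)."""
    return text.lower().split()


def build_idf(docs_tokens):
    """Inverse document frequency: rarer terms receive larger weights."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the collection are ignored."""
    tf = Counter(tokens)
    return {t: freq * idf[t] for t, freq in tf.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vectors = [tfidf_vector(tokens, idf) for tokens in docs_tokens]
    query_vector = tfidf_vector(tokenize(query), idf)
    scores = [(cosine(query_vector, v), i) for i, v in enumerate(doc_vectors)]
    return sorted(scores, reverse=True)


# Hypothetical toy collection (English, for illustration only).
docs = [
    "coffee grows in kaffa",
    "rain falls in the highlands",
    "coffee trade and coffee markets",
]
print(rank("coffee markets", docs))
```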

Keywords: Kafficho, information retrieval, stemming, vector space

Procedia PDF Downloads 19
24182 Wind Speed Forecasting Based on Historical Data Using Modern Prediction Methods in Selected Sites of Geba Catchment, Ethiopia

Authors: Halefom Kidane

Abstract:

This study aims to assess the wind resource potential and characterize the urban area wind patterns in Hawassa City, Ethiopia. The estimation and characterization of wind resources are crucial for sustainable urban planning, renewable energy development, and climate change mitigation strategies. A secondary data collection method was used to carry out the study. The data collected at 2 meters were analyzed statistically and extrapolated to the standard heights of 10 meters and 30 meters using the power law equation. The standard deviation method was used to calculate the scale and shape factors. From the analysis presented, the maximum and minimum mean daily wind speeds at 2 meters were 1.33 m/s and 0.05 m/s in 2016, 1.67 m/s and 0.14 m/s in 2017, and 1.61 m/s and 0.07 m/s in 2018, respectively. The maximum monthly average wind speed of Hawassa City at 2 meters was observed in December in 2016 (around 0.78 m/s), in January in 2017 (0.80 m/s), and in June in 2018 (0.76 m/s). On the other hand, October was the month with the minimum mean wind speed in all years, with values of 0.47 m/s in 2016, 0.47 m/s in 2017, and 0.34 m/s in 2018. The annual mean wind speed at a height of 2 meters was 0.61 m/s in 2016, 0.64 m/s in 2017, and 0.57 m/s in 2018. From extrapolation, the annual mean wind speeds for 2016, 2017, and 2018 were 1.17 m/s, 1.22 m/s, and 1.11 m/s at a height of 10 meters, and 3.34 m/s, 3.78 m/s, and 3.01 m/s at a height of 30 meters, respectively. Thus, the site falls primarily within wind speed class I, even at the extrapolated heights.
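
The two calculations mentioned in this abstract, power law extrapolation and the standard deviation method for the Weibull shape and scale factors, can be sketched as follows. This is an illustration under assumptions, not the authors' computation: the abstract does not state the shear exponent, so `alpha=1/7` is only a commonly quoted default, and the Weibull relations shown are the widely used empirical form of the standard deviation method.

```python
import math


def extrapolate_wind_speed(v_ref, h_ref, h_target, alpha=1 / 7):
    """Power law: v_target = v_ref * (h_target / h_ref) ** alpha.

    alpha is the wind shear exponent; 1/7 is an assumed default,
    not a value taken from the study.
    """
    return v_ref * (h_target / h_ref) ** alpha


def weibull_params(mean_speed, std_speed):
    """Empirical standard deviation method for Weibull parameters.

    shape k = (std / mean) ** -1.086
    scale c = mean / Gamma(1 + 1 / k)
    """
    k = (std_speed / mean_speed) ** -1.086
    c = mean_speed / math.gamma(1.0 + 1.0 / k)
    return k, c


# Hypothetical inputs chosen only to exercise the functions.
print(extrapolate_wind_speed(0.61, 2.0, 10.0))
print(weibull_params(5.0, 2.5))
```

In practice the shear exponent depends on terrain roughness and atmospheric stability, so it is usually fitted from measurements at two or more heights rather than assumed.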

Keywords: artificial neural networks, forecasting, min-max normalization, wind speed

Procedia PDF Downloads 37
24181 Using Business Intelligence Capabilities to Improve the Quality of Decision-Making: A Case Study of Mellat Bank

Authors: Jalal Haghighat Monfared, Zahra Akbari

Abstract:

Today, business executives need useful information to make better decisions. Banks have also been using information tools to direct the decision-making process toward their desired goals by rapidly extracting information from sources with the help of business intelligence. This research investigates whether there is a relationship between the quality of decision making and the business intelligence capabilities of Mellat Bank. Each of the factors studied is divided into several components, and these and their relationships are measured by a questionnaire. The statistical population of this study consists of all managers and experts of Mellat Bank's General Departments (190 people) who use business intelligence reports. A sample of 123 people was selected at random using a statistical method. Statistical inference was used for data analysis and hypothesis testing. In the first stage, the normality of the data was investigated using the Kolmogorov-Smirnov test, and in the next stage, the construct validity of both variables and their resulting indexes was verified using confirmatory factor analysis. Finally, the research hypotheses were tested using structural equation modeling and Pearson's correlation coefficient. The results confirmed the existence of a positive relationship between decision quality and business intelligence capabilities in Mellat Bank. Among the various capabilities, including data quality, correlation with other systems, user access, flexibility and risk management support, the flexibility of the business intelligence system was the most correlated with the dependent variable of the present research. This shows that Mellat Bank needs to pay more attention to choosing business intelligence systems with high flexibility in terms of the ability to submit custom formatted reports.
Subsequently, the quality of data in business intelligence systems showed the strongest relationship with quality of decision making. Therefore, improving the quality of data, including whether the data are sourced internally or externally, whether the data are quantitative or qualitative, the credibility of the data, and the perceptions of those who use the business intelligence system, improves the quality of decision making in Mellat Bank.

Keywords: business intelligence, business intelligence capability, decision making, decision quality

Procedia PDF Downloads 88
24180 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increases, it is possible to represent more complex relationships with automatically extracted features. Nowadays, Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as classification, object detection, segmentation, and image editing. In this work, the Facial Emotion Recognition task is performed by a proposed Convolutional Neural Network (CNN)-based DNN architecture using the FER2013 dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated, and ablation study results for the Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 229
24179 Combining ASTER Thermal Data and Spatial-Based Insolation Model for Identification of Geothermal Active Areas

Authors: Khalid Hussein, Waleed Abdalati, Pakorn Petchprayoon, Khaula Alkaabi

Abstract:

In this study, we integrated ASTER thermal data with an area-based spatial insolation model to identify and delineate geothermally active areas in Yellowstone National Park (YNP). Two pairs of L1B ASTER day- and nighttime scenes were used to calculate land surface temperature. We employed the Emissivity Normalization Algorithm, which separates temperature from emissivity, to calculate surface temperature. We calculated the incoming solar radiation for the area covered by each of the four ASTER scenes using an insolation model and used this information to compute temperature due to solar radiation. We then identified the statistical thermal anomalies using land surface temperature and the residuals calculated from modeled temperatures and ASTER-derived surface temperatures. Areas that had temperatures or temperature residuals greater than 2σ and between 1σ and 2σ were considered ASTER-modeled thermal anomalies. The areas identified as thermal anomalies were in strong agreement with the thermal areas obtained from the YNP GIS database. In addition, the YNP hot springs and geysers were located within areas identified as anomalous thermal areas. The consistency between our results and known geothermally active areas indicates that thermal remote sensing data, integrated with a spatial-based insolation model, provide an effective means for identifying and locating areas of geothermal activity over large areas and rough terrain.
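
The sigma-based anomaly rule described above can be illustrated with a short sketch. It assumes that thresholds are measured as deviations above the scene mean in units of the scene standard deviation; the abstract does not spell out the reference level, and the class labels and sample residuals here are hypothetical.

```python
import statistics


def classify_anomalies(residuals):
    """Label each value by how far it lies above the mean, in sigma units.

    "strong"     : more than 2 sigma above the mean
    "moderate"   : between 1 and 2 sigma above the mean
    "background" : everything else
    """
    mean = statistics.fmean(residuals)
    sigma = statistics.pstdev(residuals)
    labels = []
    for r in residuals:
        z = (r - mean) / sigma if sigma else 0.0
        if z > 2.0:
            labels.append("strong")
        elif z > 1.0:
            labels.append("moderate")
        else:
            labels.append("background")
    return labels


# Hypothetical temperature residuals in kelvin (not ASTER data).
residuals = [0.0] * 8 + [3.0, 6.0]
print(classify_anomalies(residuals))
```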

Keywords: thermal remote sensing, insolation model, land surface temperature, geothermal anomalies

Procedia PDF Downloads 339
24178 The Fibonacci Network: A Simple Alternative for Positional Encoding

Authors: Yair Bleiberg, Michael Werman

Abstract:

Coordinate-based Multi-Layer Perceptrons (MLPs) are known to have difficulty reconstructing high frequencies of the training data. A common solution to this problem is Positional Encoding (PE), which has become quite popular. However, PE has drawbacks. It has high-frequency artifacts and adds another hyperparameter, just like batch normalization and dropout do. We believe that under certain circumstances, PE is not necessary, and a smarter construction of the network architecture together with a smart training method is sufficient to achieve similar results. In this paper, we show that very simple MLPs can quite easily output a frequency when given input of the half-frequency and quarter-frequency. Using this, we design a network architecture in blocks, where the input to each block is the output of the two previous blocks along with the original input. We call this a Fibonacci Network. By training each block on the corresponding frequencies of the signal, we show that Fibonacci Networks can reconstruct arbitrarily high frequencies.
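
For context, the positional encoding that this abstract argues against is often implemented as a bank of sines and cosines at doubling frequencies. The sketch below shows one common formulation for a scalar coordinate; it is an assumption that this matches the variant the authors compare with, and `num_frequencies` is the extra hyperparameter the abstract refers to.

```python
import math


def positional_encoding(x, num_frequencies):
    """Encode scalar x as [sin(2^k * pi * x), cos(2^k * pi * x)] for each k."""
    features = []
    for k in range(num_frequencies):
        angle = (2.0 ** k) * math.pi * x
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# Each extra frequency band adds two input features to the MLP.
print(positional_encoding(0.25, 3))
```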

Keywords: neural networks, positional encoding, high frequency interpolation, fully connected

Procedia PDF Downloads 57
24177 Implementation of Algorithm K-Means for Grouping District/City in Central Java Based on Macro Economic Indicators

Authors: Nur Aziza Luxfiati

Abstract:

Clustering is the partitioning of data sets into subsets or groups in such a way that elements share certain properties, with a high level of similarity within one group and a low level of similarity between groups. The K-Means algorithm is one of the clustering algorithms most widely used as a grouping tool in scientific and industrial applications because the basic idea of the k-means algorithm is very simple. This research applies clustering with the k-means algorithm as a method of addressing the problem of national development imbalances between regions in Central Java Province based on macroeconomic indicators. The data sample used is secondary data obtained from the Central Java Provincial Statistics Agency regarding macroeconomic indicators, which is part of the publication of the 2019 National Socio-Economic Survey (Susenas) data. Outliers are checked using the z-score, and the number of clusters (k) is determined using the elbow method. After the clustering process is carried out, the result is validated using the Between-Class Variation (BCV) and Within-Class Variation (WCV) methods. The results showed that outlier detection using z-score normalization found no outliers. In addition, the clustering test yielded a ratio value that was not high, namely 0.011%. There are two district/city clusters in Central Java Province which have economic similarities based on the variables used, namely the first cluster with a high economic level consisting of 13 districts/cities and the second cluster with a low economic level consisting of 22 districts/cities. Within the second cluster, that is, among the low economies, the authors grouped districts/cities based on similarities in macroeconomic indicators: 20 districts for Gross Regional Domestic Product, 19 districts for the Poverty Depth Index, 5 districts for Human Development, and 10 districts for the Open Unemployment Rate.
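
The workflow described above (z-score screening, k-means clustering, and the elbow method) can be sketched in a few functions. This is a minimal illustration under assumptions, not the authors' implementation: the random initialization, the fixed seed, and the toy `points` are hypothetical choices, and the within-cluster sum of squares is used here as the quantity plotted in an elbow curve.

```python
import math
import random
import statistics


def z_scores(values):
    """Standardize a feature; large absolute values suggest possible outliers."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(statistics.fmean(dim) for dim in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster loses all its members.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Assignments are stable, so the algorithm has converged.
        centroids = new_centroids
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Within-cluster sum of squared distances, the elbow-curve quantity."""
    return sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))


# Hypothetical two-indicator data with two obvious groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
for k in (1, 2, 3):
    labels, centroids = kmeans(points, k)
    print(k, within_cluster_ss(points, labels, centroids))
```

The elbow is the value of k after which the within-cluster sum of squares stops dropping sharply; for data like the toy example, that occurs at k = 2.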

Keywords: clustering, K-Means algorithm, macroeconomic indicators, inequality, national development

Procedia PDF Downloads 131
24176 A Posteriori Trading-Inspired Model-Free Time Series Segmentation

Authors: Plessen Mogens Graf

Abstract:

Within the context of multivariate time series segmentation, this paper proposes a method inspired by a posteriori optimal trading. After a normalization step, time series are treated channelwise as surrogate stock prices that can be traded optimally a posteriori in a virtual portfolio holding either stock or cash. Linear transaction costs are interpreted as hyperparameters for noise filtering. Trading signals, as well as trading signals obtained on the reversed time series, are used for unsupervised channelwise labeling before a consensus over all channels is reached that determines the final segmentation time instants. The method is model-free in that no model prescriptions for segments are made. Benefits of the proposed approach include simplicity, computational efficiency, and adaptability to a wide range of different shapes of time series. Performance is demonstrated on synthetic and real-world data, including a large-scale dataset comprising a multivariate time series of dimension 1000 and length 2709. The proposed method is compared to a popular model-based bottom-up approach fitting piecewise affine models and to a recent model-based top-down approach fitting Gaussian models, and is found to be consistently faster while producing more intuitive results in the sense of segmenting time series at peaks and valleys.

Keywords: time series segmentation, model-free, trading-inspired, multivariate data

Procedia PDF Downloads 107
24175 Optical and Double Folding Model Analysis for Alpha Particles Elastically Scattered from 9Be and 11B Nuclei at Different Energies

Authors: Ahmed H. Amer, A. Amar, Sh. Hamada, I. I. Bondouk, F. A. El-Hussiny

Abstract:

Elastic scattering of α-particles from 9Be and 11B nuclei at different alpha energies has been analyzed. Optical model parameters (OMPs) for α-particle elastic scattering by these nuclei at different energies have been obtained. In the present calculations, the real part of the optical potential is derived by folding the nucleon-nucleon (NN) interaction into the nuclear matter density distributions of the projectile and target nuclei using the computer code FRESCO. A density-dependent version of the M3Y interaction (CDM3Y6), which is based on the G-matrix elements of the Paris NN potential, has been used. Volume integrals of the real and imaginary potential depths (JR, JW) have been calculated and found to be energy dependent. Good agreement between the experimental data and the theoretical predictions is obtained over the whole angular range. In double folding (DF) calculations, the obtained normalization coefficient Nr is in the range 0.70–1.32.

Keywords: elastic scattering, optical model, double folding model, density distribution

Procedia PDF Downloads 272
24174 Numerical Modeling of Flow in USBR II Stilling Basin with End Adverse Slope

Authors: Hamidreza Babaali, Alireza Mojtahedi, Nasim Soori, Saba Soori

Abstract:

Hydraulic jump is one of the effective means of energy dissipation in stilling basins, where much of the flow energy is dissipated in the jump. An adverse slope at the end of the stilling basin increases energy dissipation and the stability of the hydraulic jump. In this study, an adverse slope was added to the end of a United States Bureau of Reclamation (USBR) II stilling basin in the 1:40 scale hydraulic model of Nazloochay dam, and the flow in the stilling basin was simulated using Flow-3D software. The numerical model was verified against experimental data of water depth in the stilling basin. Then, the water level profile, Froude number, pressure, air entrainment and turbulent dissipation were investigated for a discharge of 300 m³/s using the k-ε and Re-Normalization Group (RNG) turbulence models. The results showed good agreement between the numerical and experimental models, indicating that the numerical model can be used to optimize stilling basins.

Keywords: experimental and numerical modelling, end adverse slope, flow parameters, USBR II stilling basin

Procedia PDF Downloads 133
24173 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly nowadays. Researchers combine different algorithms in different ways to obtain the best results. Six basic emotions are studied in this area. We tried to recognize facial expressions using object detection algorithms instead of traditional algorithms. Two object detection algorithms were chosen: Faster R-CNN and YOLO. For pre-processing, we used image rotation and batch normalization. The dataset chosen for the experiments is Static Facial Expressions in the Wild (SFEW). Our approach worked well, but there is still considerable room for improvement, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 157
24172 Disordered Eating Behaviors Among Sorority Women

Authors: Andrea J. Kirk-Jenkins

Abstract:

Women in late adolescence and young adulthood are particularly vulnerable to disordered eating, and prior research indicates that those within the college and sorority communities may be especially susceptible. Research has primarily involved comparing eating disorder symptoms between sorority women and non-sorority members using formal eating disorder assessments. This phenomenological study examined sorority members’ (N = 10) perceptions of and lived experiences with various disordered eating behaviors within the sorority culture. Data from individual interviews and photographs indicated two structural themes and 11 textural themes related to factors associated with disordered eating behaviors. These findings point to the existence of both positive and negative aspects of sorority culture, normalization of disordered eating behaviors, and pressure to attain or maintain an ideal body image. Implications for university stakeholders, including college counselors, health center staff, and extracurricular program leaders, are discussed. Further research on the identified textural themes as well as a longitudinal study exploring how perceptions change from rush to alumnae status is suggested.

Keywords: eating disorders, disordered eating behaviors, sorority women, sorority culture, college women

Procedia PDF Downloads 94