Search results for: pre-processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 142

82 Measuring Text-Based Semantics Relatedness Using WordNet

Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed

Abstract:

Measuring semantic similarity between texts means calculating the semantic relatedness between them using various techniques. Our web application (Measuring Relatedness of Concepts, MRC) allows a user to input two text corpora and obtain a semantic similarity percentage between them using WordNet. The application goes through five stages to compute semantic relatedness: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into parts of speech), Synonyms Extraction (retrieves synonyms for each keyword), Measuring Similarity (similarity is measured using keywords and synonyms) and Visualization (graphical representation of the similarity measure). Hence the user can also measure similarity on the basis of features. The end result is a percentage score and the word(s) that form the basis of similarity between both texts, obtained with different tools on the same platform. In future work, we look forward to a Web-as-a-live-corpus application that provides a simpler and user-friendly tool to compare documents and extract useful information.
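
The keyword, synonym and similarity stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `SYNONYMS` table is a hypothetical stand-in for WordNet lookups, the stopword list is invented, and the Jaccard-style percentage is an assumed scoring rule, since the abstract does not say how the percentage is computed.

```python
# Hypothetical mini-thesaurus standing in for WordNet synonym lookups (assumption).
SYNONYMS = {
    "car": {"automobile", "auto"},
    "fast": {"quick", "rapid"},
}
# Invented stopword list for the keyword-extraction step (assumption).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def extract_keywords(text):
    """Lowercase the text, strip punctuation, and drop stopwords."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def expand(keywords):
    """Add the known synonyms of every keyword to the keyword set."""
    expanded = set(keywords)
    for word in keywords:
        expanded |= SYNONYMS.get(word, set())
    return expanded


def similarity_percent(text_a, text_b):
    """Return (percentage score, shared words) using set overlap."""
    a = expand(extract_keywords(text_a))
    b = expand(extract_keywords(text_b))
    if not a or not b:
        return 0.0, set()
    shared = a & b
    return 100.0 * len(shared) / len(a | b), shared
```

Calling `similarity_percent("The car is fast", "A quick automobile")` returns both a score and the shared words, which mirrors the "percentage score and the word(s)" output the abstract mentions.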

Keywords: Graphviz representation, semantic relatedness, similarity measurement, WordNet similarity

81 Sparse Coding Based Classification of Electrocardiography Signals Using Data-Driven Complete Dictionary Learning

Authors: Fuad Noman, Sh-Hussain Salleh, Chee-Ming Ting, Hadri Hussain, Syed Rasul

Abstract:

In this paper, a data-driven dictionary approach is proposed for the automatic detection and classification of cardiovascular abnormalities. The electrocardiography (ECG) signal is represented by trained complete dictionaries that contain prototypes or atoms, which avoids the limitations of pre-defined dictionaries. The data-driven trained dictionaries simply take the ECG signal as input rather than extracting features to study the set of parameters that yield the most descriptive dictionary. The approach inherently learns the complicated morphological changes in the ECG waveform, which are then used to improve the classification. The classification performance was evaluated with ECG data under two different preprocessing environments. In the first category, the QT database is baseline-drift corrected and notch filtered to remove the 60 Hz power line noise. In the second category, the data are further filtered using a fast moving average smoother. The experimental results on the QT database confirm that the proposed algorithm achieves a classification accuracy of 92%.
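
As a rough illustration of the second preprocessing category, a centered moving average smoother might look like the sketch below. The window length and the edge handling (shrinking the window at the signal borders) are assumptions; the abstract does not specify either.

```python
def moving_average(signal, window=5):
    """Smooth a 1-D signal with a centered moving average.

    The window shrinks near the edges so the output has the same length
    as the input (an assumed edge policy).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        segment = signal[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```

A single spike such as `[0, 0, 3, 0, 0]` is spread over its neighbours with `window=3`, which is the kind of high-frequency attenuation a smoother provides before classification.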

Keywords: electrocardiogram, dictionary learning, sparse coding, classification

80 Analyzing On-Line Process Data for Industrial Production Quality Control

Authors: Hyun-Woo Cho

Abstract:

The monitoring of industrial production quality has to be implemented to give early warning of unusual operating conditions. Furthermore, identification of their assignable causes is necessary for quality control purposes. For such tasks, many multivariate statistical techniques have been applied and shown to be quite effective tools. This work presents a process data-based monitoring scheme for production processes. For more reliable results, some additional steps of noise filtering and preprocessing are considered. These steps may lead to enhanced performance by eliminating unwanted variation in the data. The performance evaluation is executed using data sets from test processes. The proposed method is shown to provide reliable quality control results, and thus is more effective in quality monitoring in the example. For practical implementation of the method, an on-line data system must be available to gather historical and on-line data. Large amounts of data are now collected on-line in most processes, so implementation of the current scheme is feasible and does not place additional burdens on users.
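
To make the idea of alarming on unusual operating conditions concrete, here is a deliberately simplified, single-variable sketch. It is not the multivariate scheme of the paper: the three-sigma control limits and the function names are assumptions chosen only to show how historical data can define limits that on-line samples are checked against.

```python
from statistics import mean, stdev


def control_limits(reference, k=3.0):
    """Lower and upper limits from historical (in-control) data: mean +/- k * std."""
    center = mean(reference)
    spread = stdev(reference)
    return center - k * spread, center + k * spread


def flag_alarms(samples, limits):
    """Return the indices of on-line samples that fall outside the limits."""
    lower, upper = limits
    return [i for i, x in enumerate(samples) if x < lower or x > upper]
```

In use, `control_limits` would be fitted once on historical data and `flag_alarms` applied to each new batch of on-line measurements.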

Keywords: detection, filtering, monitoring, process data

79 Detection and Classification of Rubber Tree Leaf Diseases Using Machine Learning

Authors: Kavyadevi N., Kaviya G., Gowsalya P., Janani M., Mohanraj S.

Abstract:

Hevea brasiliensis, also known as the rubber tree, is one of the foremost crops in the world. One of the most significant advantages of the rubber plant in terms of air oxygenation is its capacity to reduce the likelihood of an individual developing respiratory allergies like asthma. To construct a system that can properly identify crop diseases and pests and then create a database of insecticides for each pest and disease, we must first provide treatment for the disease that has been detected. In this article, we primarily examine three major leaf diseases because they are economically damaging: bird's eye spot, algal spot and powdery mildew. The proposed work focuses on disease identification on rubber tree leaves, which will be accomplished by employing one of the superior algorithms. The processing technique follows the stages of Input, Preprocessing, Image Segmentation, Feature Extraction, and Classification. We will use time-consuming procedures that they use to detect the sickness. As a consequence, the main ailments, underlying causes, and signs and symptoms of diseases that harm the rubber tree are covered in this study.

Keywords: image processing, python, convolution neural network (CNN), machine learning

78 A Clustering-Based Approach for Weblog Data Cleaning

Authors: Amine Ganibardi, Cherif Arab Ali

Abstract:

This paper addresses the data cleaning issue as a part of web usage data preprocessing within the scope of Web Usage Mining. Weblog data recorded by web servers within log files reflect usage activity, i.e., end-users’ clicks and underlying user-agents’ hits. As Web Usage Mining is interested in end-users’ behavior, user-agents’ hits are regarded as noise to be cleaned off before mining. Filtering hits from clicks is not trivial for two reasons: a server records requests interlaced in sequential order regardless of their source or type, and website resources may be set up as requestable interchangeably by end-users and user-agents. The current methods are content-centric, based on filtering heuristics of relevant/irrelevant items in terms of some cleaning attributes, i.e., website’s resources filetype extensions, website’s resources pointed to by hyperlinks/URIs, HTTP methods, user-agents, etc. These methods need exhaustive extra-weblog data and prior knowledge of the relevant and/or irrelevant items to be assumed as clicks or hits within the filtering heuristics. Such methods are not appropriate for the dynamic/responsive Web for three reasons: resources may be set up as clickable by end-users regardless of their type, website’s resources are indexed by frame names without filetype extensions, and web contents are generated and cancelled differently from one end-user to another. In order to overcome these constraints, a clustering-based cleaning method centered on the logging structure is proposed. This method focuses on the statistical properties of the logging structure at the requested and referring resources attributes levels. It is insensitive to logging content and does not need extra-weblog data. The statistical property used relies on the structure of the logging feature generated by webpage requests in terms of clicks and hits. 
Since a webpage consists of its single URI and several components, this feature results in a single-click-to-multiple-hits ratio in terms of the requested and referring resources. Thus, the clustering-based method is meant to identify two clusters based on the application of the appropriate distance to the frequency matrix of the requested and referring resources levels. As the ratio of clicks to hits is single to multiple, the clicks’ cluster is the smallest one in number of requests. Hierarchical Agglomerative Clustering based on a pairwise distance (Gower) and average linkage has been applied to four logfiles of dynamic/responsive websites whose click-to-hit ratios range from 1/2 to 1/15. The optimal clustering set on the basis of average linkage and maximum inter-cluster inertia always results in two clusters. The evaluation of the smallest cluster, referred to as the clicks cluster, in terms of confusion matrix indicators results in a true positive rate of 97%. The content-centric cleaning methods, i.e., conventional and advanced cleaning, resulted in a lower rate of 91%. Thus, the proposed clustering-based cleaning outperforms the content-centric methods within dynamic and responsive web design without the need for any extra-weblog data. Such an improvement in cleaning quality is likely to refine dependent analysis.
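
The frequency structure that the method relies on can be illustrated with a toy log. The sketch below builds, for each resource, how often it is requested and how often it appears as a referrer, then splits requests with a crude rule. That rule (a request is a click if its resource refers other requests) is only an assumed stand-in for the Gower-distance, average-linkage clustering used in the paper, and the log format is hypothetical.

```python
from collections import Counter


def frequency_table(log):
    """Map each resource to (times requested, times seen as referrer).

    `log` is a list of (requested_uri, referrer_uri_or_None) pairs,
    a hypothetical simplification of a weblog record.
    """
    requested = Counter(req for req, _ in log)
    referring = Counter(ref for _, ref in log if ref is not None)
    resources = set(requested) | set(referring)
    return {r: (requested[r], referring[r]) for r in resources}


def split_clicks_hits(log):
    """Crude stand-in for clustering: pages that refer other requests are clicks."""
    table = frequency_table(log)
    clicks = [req for req, _ in log if table[req][1] > 0]
    hits = [req for req, _ in log if table[req][1] == 0]
    return clicks, hits
```

On a page that loads three embedded components, the single page request ends up in the smaller group, matching the single-to-multiple ratio described in the abstract.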

Keywords: clustering approach, data cleaning, data preprocessing, weblog data, web usage data

77 An Inviscid Compressible Flow Solver Based on Unstructured OpenFOAM Mesh Format

Authors: Utkan Caliskan

Abstract:

Two types of numerical codes based on the finite volume method are developed to solve the compressible Euler equations and simulate the flow through a forward-facing step channel. Both algorithms use the AUSM+-up (Advection Upstream Splitting Method) scheme for flux splitting and a two-stage Runge-Kutta scheme for time stepping. In this study, the flux calculations differ between the algorithm based on the OpenFOAM mesh format, called the 'face-based' algorithm, and the basic algorithm, called the 'element-based' algorithm. The face-based algorithm avoids redundant flux computations and is also more flexible with hybrid grids. Moreover, some of OpenFOAM’s preprocessing utilities can be used on the mesh. Parallelization of the face-based algorithm, for which atomic operations are needed due to the shared memory model, is also presented. For several mesh sizes, a 2.13x speed-up is obtained with the face-based approach over the element-based approach.
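
The reason a face-based loop avoids redundant work can be shown with a small sketch. In the OpenFOAM mesh format each interior face stores an owner cell and a neighbour cell, so a flux can be computed once per face and scattered to both cells. The code below assumes the face fluxes are already available (the AUSM+-up evaluation is not reproduced) and ignores boundary faces; the function name is illustrative.

```python
def face_based_residuals(n_cells, owner, neighbour, face_flux):
    """Accumulate cell residuals with one pass over the interior faces.

    owner[f] and neighbour[f] are the two cells sharing face f; a positive
    flux is taken to leave the owner and enter the neighbour.
    """
    residual = [0.0] * n_cells
    for f in range(len(neighbour)):
        flux = face_flux[f]
        residual[owner[f]] -= flux      # outflow from the owner cell
        residual[neighbour[f]] += flux  # the same flux enters the neighbour
    return residual
```

Because both cells are updated inside the same loop iteration, two threads handling different faces may write to the same cell, which is why a shared-memory parallel version needs atomic updates as the abstract notes.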

Keywords: cell centered finite volume method, compressible Euler equations, OpenFOAM mesh format, OpenMP

76 Wideband Performance Analysis of C-FDTD Based Algorithms in the Discretization Impoverishment of a Curved Surface

Authors: Lucas L. L. Fortes, Sandro T. M. Gonçalves

Abstract:

This work analyzes the wideband performance, under mesh discretization impoverishment, of the Conformal Finite Difference Time-Domain (C-FDTD) approaches developed by Raj Mittra, Supriyo Dey and Wenhua Yu for the Finite Difference Time-Domain (FDTD) method. These approaches are a simple and efficient way to optimize the scattering simulation of curved surfaces for Dielectric and Perfect Electric Conducting (PEC) structures in the FDTD method, since curved surfaces require dense meshes to reduce the error introduced by surface staircasing. Referred to in this work as D-FDTD-Diel and D-FDTD-PEC, these approaches are well known in the literature, but the improvement obtained from their application has not been broadly quantified for wide frequency bands and poorly discretized meshes. Both approaches improve the accuracy of the simulation without requiring dense meshes, which also makes it possible to explore poorly discretized meshes that reduce simulation time and computational expense while retaining a desired accuracy. However, their applications present limitations regarding the mesh impoverishment and the frequency range desired. Therefore, the goal of this work is to explore the approaches regarding both the wideband and mesh impoverishment performance to bring a wider insight into these aspects in FDTD applications. The D-FDTD-Diel approach consists of modifying the electric field update in the cells intersected by the dielectric surface, taking into account the amount of dielectric material within the mesh cell edges. By taking into account the intersections, the D-FDTD-Diel provides an accuracy improvement at the cost of computational preprocessing, which is a fair trade-off, since the update modification is quite simple. 
Likewise, the D-FDTD-PEC approach consists of modifying the magnetic field update, taking into account the PEC curved surface intersections within the mesh cells and, considering a PEC structure in vacuum, the air portion that fills the intersected cells when updating the magnetic field values. As with D-FDTD-Diel, the D-FDTD-PEC provides better accuracy at the cost of computational preprocessing, although with the drawback of having to meet stability criterion requirements. The algorithms are formulated and applied to a PEC and a dielectric spherical scattering surface with meshes presenting different levels of discretization, with Polytetrafluoroethylene (PTFE) as the dielectric, since it is a very common material in coaxial cables and connectors for radiofrequency (RF) and wideband applications. The accuracy of the algorithms is quantified, showing how the approaches' wideband performance drops along with the mesh impoverishment. The benefits in computational efficiency, simulation time and accuracy are also shown and discussed according to the frequency range desired, showing that poorly discretized mesh FDTD simulations can be exploited more efficiently while retaining the desired accuracy. The results obtained provide a broader insight into the limitations in the application of the C-FDTD approaches in poorly discretized and wide frequency band simulations for Dielectric and PEC curved surfaces, which are not clearly defined or detailed in the literature and are, therefore, a novelty. These approaches are also expected to be applied in the modeling of curved RF components for wideband and high-speed communication devices in future works.
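
For orientation on the stability criterion mentioned above, the sketch below computes the standard Courant limit of the ordinary (non-conformal) Yee-grid FDTD scheme in three dimensions. How much further the time step must be reduced for the conformal PEC update is not stated in the abstract, so no such factor is included here; the function name is illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def courant_limit(dx, dy, dz, c=C0):
    """Largest stable time step of the standard 3-D Yee FDTD scheme.

    dt <= 1 / (c * sqrt(1/dx^2 + 1/dy^2 + 1/dz^2))
    """
    return 1.0 / (c * math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2))
```

The bound grows with the cell size, which is one reason coarser (impoverished) meshes reduce simulation time: fewer cells and fewer time steps are needed for the same simulated duration.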

Keywords: accuracy, computational efficiency, finite difference time-domain, mesh impoverishment

75 Enhancement of X-Rays Images Intensity Using Pixel Values Adjustments Technique

Authors: Yousif Mohamed Y. Abdallah, Razan Manofely, Rajab M. Ben Yousef

Abstract:

X-ray images are very popular as a first tool for diagnosis. Automating the analysis of such images is important in order to support physicians' procedures. In this practice, teeth segmentation from the radiographic images and feature extraction are essential steps. The main objective of this study was to examine correction preprocessing of X-ray images using local adaptive filters, in order to evaluate the contrast enhancement pattern in different X-ray images, such as grey-color images, and to evaluate the use of a new nonlinear approach for contrast enhancement of soft tissues in X-ray images. The data were analyzed using a MatLab program with respect to contrast enhancement within the soft tissues, the gray levels in both enhanced and unenhanced images, and the noise variance. The main enhancement techniques used in this study were contrast enhancement filtering and image deblurring using the blind deconvolution algorithm. In this paper, the prominent constraints are, firstly, preservation of the image's overall look; secondly, preservation of the diagnostic content in the image; and thirdly, detection of small low-contrast details in the diagnostic content of the image.
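
As a minimal illustration of adjusting pixel values to enhance contrast, the sketch below applies a global linear stretch to a grayscale image stored as a list of rows. This is an assumed, much simpler technique than the local adaptive filtering and blind deconvolution used in the study (which was carried out in MatLab), and the function name is hypothetical.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly map the image's intensity range onto [out_min, out_max]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch.
        return [[out_min for _ in row] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in image]
```

An image whose gray levels occupy only part of the range is spread over the full output range, which makes low-contrast details easier to see.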

Keywords: enhancement, x-rays, pixel intensity values, MatLab

74 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images

Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat

Abstract:

2D image segmentation is a significant process for finding a suitable region in medical images such as MRI, PET, CT, etc. In this study, we have focused on 2D MRI images for the image segmentation process. We have designed a GUI (graphical user interface) written in MATLAB™ for 2D MRI images. The program has two different interfaces: data pre-processing and image clustering or segmentation. The data pre-processing section provides a median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter designed by the user in MATLAB). For image clustering, there are seven different image segmentation algorithms for 2D MR images: PSO (particle swarm optimization), GA (genetic algorithm), Lloyd's algorithm, k-means, the combination of Lloyd's and k-means, mean shift clustering, and finally BBO (Biogeography Based Optimization). To find a suitable cluster number in 2D MRI, we have designed a histogram-based cluster estimation method and then applied these numbers to the image segmentation algorithms to cluster an image automatically. Also, we have selected the best hybrid method for each 2D MR image thanks to this GUI software.
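
Of the listed algorithms, k-means is the easiest to sketch. The code below clusters scalar pixel intensities with Lloyd-style iterations; the deterministic initialisation from the sorted intensities is an assumption made so the example is reproducible, and the cluster count is passed in rather than estimated from a histogram as in the paper.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar intensities into k groups; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster received no values.
        new_centers = [sum(g) / len(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
    return centers, labels
```

For an image, `values` would be the flattened pixel intensities, and the returned labels can be reshaped back to the image grid to obtain the segmentation map.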

Keywords: image segmentation, clustering, GUI, 2D MRI

73 Supporting 'vulnerable' Students to Complete Their Studies During the Economic Crisis in Greece: The Umbrella Program of International Hellenic University

Authors: Rigas Kotsakis, Nikolaos Tsigilis, Vasilis Grammatikopoulos, Evridiki Zachopoulou

Abstract:

During the last decade, Greece has faced an unprecedented financial crisis, affecting various aspects and functionalities of Higher Education. Besides the restricted funding of academic institutions, students and their families encountered economic difficulties that undoubtedly influenced the effective completion of their studies. In this context, a fairly large number of students at the Alexander campus of International Hellenic University (IHU) delay, interrupt, or even abandon their studies, especially when they come from low-income families, belong to sensitive social or special needs groups, have different cultural origins, etc. For this reason, a European project named “Umbrella” was initiated, aiming to provide the necessary psychological support and counseling, especially to disadvantaged students, towards the completion of their studies. To this end, a network of various academic members (academic staff and students) from IHU, namely iMentor, was involved in different roles. Specifically, experienced academic staff trained students to serve as intermediate links for the integration and educational support of students who fall into the aforementioned sensitive social groups and face problems in completing their studies. The main idea of the project rests on its person-centered character, which facilitates direct student-to-student communication without the intervention of the teaching staff. The backbone of the iMentor network consists of senior students who face no problems in their academic life and volunteered for this project. It should be noted that the Umbrella structure provides for substantial and ethical rewards for their engagement. In this context, a well-defined, stringent methodology was implemented to evaluate the extent of the problem in IHU and to detect the profile of the “candidate” disadvantaged students. 
The first phase included two steps: (a) data collection and (b) data cleansing/preprocessing. The first step involved collecting data from the Secretary Services of all Schools in IHU, from 1980 to 2019, which resulted in 96.418 records. The data set included the School name, the semester of studies, the student enrollment criteria, the nationality, and the graduation year or the current, up-to-date academic state (still studying, delayed, dropped off, etc.). The second step of the employed methodology involved data cleansing/preprocessing because of the existence of “noisy” data, missing and erroneous values, etc. Furthermore, several assumptions and grouping actions were imposed to achieve data homogeneity and an easy-to-interpret subsequent statistical analysis. Specifically, the 40-year recording period was limited to the last 15 years (2004-2019). In 2004 the Greek Technological Institutions evolved into Higher Education Universities, leading to a stable and unified frame of graduate studies. In addition, the data concerning active students were excluded from the analysis, since the initial processing effort was focused on the detection of factors/variables that differentiated graduated and deleted students. The final working dataset included 21.432 records with only two categories of students: those who have a degree and those who abandoned their studies. Findings of the first phase are presented across faculties and further discussed.
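
The cleansing rules described above (drop incomplete records, keep only 2004-2019, keep only graduated or dropped students) can be sketched as a simple filter. The field names `year` and `status`, the status labels, and the assumption that the year window applies to a single recorded year per student are all hypothetical, since the abstract does not describe the record layout.

```python
def clean_records(records, first_year=2004, last_year=2019):
    """Keep complete records inside the year window with a final outcome."""
    final_states = {"graduated", "dropped"}  # hypothetical status labels
    kept = []
    for rec in records:
        year = rec.get("year")
        status = rec.get("status")
        if year is None or status is None:
            continue  # missing values are treated as noisy data
        if not first_year <= year <= last_year:
            continue  # outside the unified 2004-2019 frame
        if status not in final_states:
            continue  # active students are excluded from the analysis
        kept.append(rec)
    return kept
```

Each condition corresponds to one of the grouping or exclusion decisions in the abstract, so the order of the checks does not change the result.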

Keywords: higher education, students support, economic crisis, mentoring

72 Hyperspectral Mapping Methods for Differentiating Mangrove Species along Karachi Coast

Authors: Sher Muhammad, Mirza Muhammad Waqar

Abstract:

It is necessary to monitor and identify mangrove types and their spatial extent near coastal areas because mangroves play an important role in the coastal ecosystem and environmental protection. This research aims at identifying and mapping mangrove types along the Karachi coast, ranging from 24.79 to 24.85 degrees in latitude and 66.91 to 66.97 degrees in longitude, using hyperspectral remote sensing data and techniques. An image acquired in February 2012 by the Hyperion sensor has been used for this research. Image preprocessing includes geometric and radiometric correction followed by Minimum Noise Fraction (MNF) and Pixel Purity Index (PPI). The output of MNF and PPI has been analyzed by visualizing it in n dimensions for end-member extraction. Well-distributed clusters on the n-dimensional scatter plot have been selected with the region of interest (ROI) tool as end members. These end members have been used as input for the classification techniques applied to identify and map mangrove species, including Spectral Angle Mapper (SAM), Spectral Feature Fitting (SFF), and Spectral Information Divergence (SID). Only two types of mangroves, namely Avicennia Marina (white mangroves) and Avicennia Germinans (black mangroves), have been observed throughout the study area.
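
The Spectral Angle Mapper step can be illustrated directly: each pixel spectrum is compared with every end-member spectrum by the angle between the two vectors, and the pixel takes the label of the end member with the smallest angle. The end-member names and values used with this sketch are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = (math.sqrt(sum(p * p for p in pixel))
            * math.sqrt(sum(r * r for r in reference)))
    if norm == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def classify(pixel, endmembers):
    """Return the name of the end member with the smallest spectral angle."""
    return min(endmembers, key=lambda name: spectral_angle(pixel, endmembers[name]))
```

Because the angle ignores vector length, a spectrum and a brighter or darker scaled copy of it have an angle of essentially zero, which makes the measure comparatively insensitive to illumination differences.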

Keywords: mangrove, hyperspectral, hyperion, SAM, SFF, SID

71 Online Yoga Asana Trainer Using Deep Learning

Authors: Venkata Narayana Chejarla, Nafisa Parvez Shaik, Gopi Vara Prasad Marabathula, Deva Kumar Bejjam

Abstract:

Yoga is an advanced, well-recognized method with roots in Indian philosophy. Yoga benefits both the body and the psyche. Yoga is a regular exercise that helps people relax and sleep better while also enhancing their balance, endurance, and concentration. Yoga can be learned in a variety of settings, including at home with the aid of books and the internet as well as in yoga studios with the guidance of an instructor. Self-learning does not teach the proper yoga poses, and doing them without the right instruction could result in significant injuries. We developed "Online Yoga Asana Trainer using Deep Learning" so that people could practice yoga without a teacher. Our project is developed using TensorFlow, MoveNet, and Keras models. The system makes use of data from Kaggle that includes 25 different yoga poses. The first part of the process involves applying the MoveNet model to extract the 17 key points of the body from the dataset, and the next part involves preprocessing, which includes building a pose classification model using neural networks. The system achieves a 98.3% accuracy rate. The system is developed to work with live videos.

Keywords: yoga, deep learning, movenet, tensorflow, keras, CNN

70 A Study of Mode Choice Model Improvement Considering Age Grouping

Authors: Young-Hyun Seo, Hyunwoo Park, Dong-Kyu Kim, Seung-Young Kho

Abstract:

The purpose of this study is to provide an improved mode choice model that considers parameters including age grouping into prime-aged and old-age groups. In this study, 2010 Household Travel Survey data were used, and improper samples were removed through the analysis. The chosen alternative, date of birth, mode, origin code, destination code, departure time, and arrival time are taken from the Household Travel Survey. By preprocessing the data, travel time, travel cost, mode, and the ratio of people aged 45 to 55 years, 55 to 65 years and over 65 years were calculated. After this manipulation, the mode choice model was constructed using LIMDEP by maximum likelihood estimation. A significance test was conducted for nine parameters: three age groups for three modes. Then the test was conducted again for the mode choice model with the significant parameters, the travel cost variable and the travel time variable. The model estimation shows that, as age increases, the preference for the car decreases and the preference for the bus increases. This study is meaningful in that individual and household characteristics are applied to the aggregate model.
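
The keywords name a multinomial logit model, whose choice probabilities follow from the utilities of the alternatives. The sketch below shows that calculation; the linear utility form with an age-share term and every coefficient value used with it are illustrative assumptions, not estimates from the study.

```python
import math


def utility(asc, b_time, b_cost, time, cost, b_age=0.0, age_share=0.0):
    """Linear utility: constant + time term + cost term + optional age-share term."""
    return asc + b_time * time + b_cost * cost + b_age * age_share


def choice_probabilities(utilities):
    """Multinomial logit: P(m) = exp(V_m) / sum over all modes of exp(V_j)."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {mode: math.exp(v - top) for mode, v in utilities.items()}
    total = sum(exp_u.values())
    return {mode: e / total for mode, e in exp_u.items()}
```

With a negative age coefficient in the car utility and a positive one in the bus utility, raising `age_share` shifts probability from car to bus, which is the direction of the effect the abstract reports.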

Keywords: age grouping, aging, mode choice model, multinomial logit model

69 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Five implementations of three data mining classification techniques were tested for extracting important insights from tourism data. The aim was to find the best-performing algorithm among those compared for tourism knowledge discovery. The knowledge discovery process from data was used as a process model. The 10-fold cross-validation method was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before applying information gain based attribute selection, and J48 (C4.5) (75%) after selection of the top attributes relevant to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. The Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented; they reveal intricate, non-trivial knowledge that would otherwise not be discovered by simple statistical analysis, although the accuracy of the classification algorithms was mediocre.
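
The 10-fold cross-validation protocol mentioned above partitions the dataset into ten folds and uses each fold once for testing while training on the rest. A minimal index-splitting sketch follows; it does no shuffling or stratification, which a real experiment would normally add.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

The reported accuracy of a classifier under this protocol is typically the average of its accuracy over the ten test folds.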

Keywords: classification algorithms, data mining, knowledge discovery, tourism

68 An Automated System for the Detection of Citrus Greening Disease Based on Visual Descriptors

Authors: Sidra Naeem, Ayesha Naeem, Sahar Rahim, Nadia Nawaz Qadri

Abstract:

Citrus greening is a bacterial disease that causes considerable damage to citrus fruits worldwide. An efficient method for detecting this disease is needed to minimize production loss. This paper presents a pattern recognition system that comprises three stages for the detection of citrus greening from orange leaves: segmentation, feature extraction and classification. Image segmentation is accomplished by adaptive thresholding. The feature extraction stage comprises three visual descriptors, i.e., shape, color and texture. For shape we used the asymmetry index, for color the histogram of the Cb component in the YCbCr domain, and for texture the local binary pattern. Classification was done using support vector machines and k-nearest neighbors. The best performance of the system, Accuracy = 88.02% and AUROC = 90.1%, was achieved with automatically segmented images. Our experiments validate that (1) segmentation is an imperative preprocessing step for computer-assisted diagnosis of citrus greening, and (2) the combination of shape, color and texture features forms a complementary set for the identification of citrus greening disease.
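
The texture descriptor can be illustrated with the basic 3x3 local binary pattern: each of the eight neighbours is compared with the centre pixel and the results are packed into one byte. The bit ordering (clockwise from the top-left neighbour) is an assumed convention, and the paper may use a different variant.

```python
def lbp_code(patch):
    """Basic local binary pattern code of a 3x3 grayscale patch (0..255)."""
    center = patch[1][1]
    # Neighbours listed clockwise, starting at the top-left corner.
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code
```

A texture feature vector is then typically the histogram of these codes over all pixels of the segmented leaf region.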

Keywords: citrus greening, pattern recognition, feature extraction, classification

67 Leukocyte Detection Using Image Stitching and Color Overlapping Windows

Authors: Lina, Arlends Chris, Bagus Mulyawan, Agus B. Dharmawan

Abstract:

Blood cell analysis plays a significant role in the diagnosis of human health. As an alternative to the traditional technique conducted by laboratory technicians, this paper presents an automatic white blood cell (leukocyte) detection system using Image Stitching and Color Overlapping Windows. The advantage of this method is that it detects white blood cells robustly despite imperfect cell shapes and varying image quality. The input for this application is a set of images from a microscope-slide translation video. The preprocessing stage is performed by stitching the input images. First, the overlapping parts of the images are determined, then the stitching and blending of two input images are performed. Next, Color Overlapping Windows is applied for white blood cell detection, which consists of color filtering, window candidate checking, window marking, window overlap finding, and window cropping. Experimental results show that this method achieves an average detection accuracy of 82.12% on the leukocyte images.
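
The window overlap step needs some measure of how much two candidate windows overlap. One common choice, assumed here because the abstract does not give the criterion, is intersection over union of the two rectangles.

```python
def overlap_ratio(a, b):
    """Intersection over union of two windows given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0
```

Windows whose ratio exceeds a chosen threshold could then be merged or one of them discarded, so that each leukocyte is cropped only once.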

Keywords: color overlapping windows, image stitching, leukocyte detection, white blood cell detection

66 Deep Learning Based, End-to-End Metaphor Detection in Greek with Recurrent and Convolutional Neural Networks

Authors: Konstantinos Perifanos, Eirini Florou, Dionysis Goutsos

Abstract:

This paper presents and benchmarks a number of end-to-end Deep Learning based models for metaphor detection in Greek. We combine Convolutional Neural Networks and Recurrent Neural Networks with representation learning to bear on the metaphor detection problem for the Greek language. The models presented achieve exceptional accuracy scores, significantly improving the previous state-of-the-art results, which had already achieved accuracy 0.82. Furthermore, no special preprocessing, feature engineering or linguistic knowledge is used in this work. The methods presented achieve accuracy of 0.92 and F-score 0.92 with Convolutional Neural Networks (CNNs) and bidirectional Long Short Term Memory networks (LSTMs). Comparable results of 0.91 accuracy and 0.91 F-score are also achieved with bidirectional Gated Recurrent Units (GRUs) and Convolutional Recurrent Neural Nets (CRNNs). The models are trained and evaluated only on the basis of training tuples, the related sentences and their labels. The outcome is a state-of-the-art collection of metaphor detection models, trained on limited labelled resources, which can be extended to other languages and similar tasks.

Keywords: metaphor detection, deep learning, representation learning, embeddings

65 Methaheuristic Bat Algorithm in Training of Feed-Forward Neural Network for Stock Price Prediction

Authors: Marjan Golmaryami, Marzieh Behzadi

Abstract:

Recent developments in stock exchanges highlight the need for an efficient and accurate method that helps stockholders make better decisions. Since stock markets fluctuate considerably over time and are affected by many different parameters, it is difficult to make good decisions. The purpose of this study is to employ an artificial neural network (ANN), which can deal with time series data and nonlinear relations among variables, to forecast the next-day stock price. Unlike other evolutionary algorithms used in stock exchange prediction, we trained our proposed neural network with the metaheuristic bat algorithm, which has fast and powerful convergence, and applied it to stock price prediction for the first time. To demonstrate the performance of the proposed method, this research selected a 7-year dataset of Parsian Bank stocks and, after data preprocessing, used 3 types of ANN (back propagation-ANN, particle swarm optimization-ANN and bat-ANN) to predict the closing price of stocks. Afterwards, this study used MATLAB to simulate the 3 types of ANN, with mean absolute percentage error (MAPE) as the scoring target. The results may be adapted to other companies' stocks too.
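
The scoring target named above, mean absolute percentage error, averages the absolute relative errors of the predictions and expresses the result as a percentage. A short sketch, with illustrative prices in the usage note rather than data from the study:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

For example, predicting 110 and 180 for actual closing prices of 100 and 200 gives relative errors of 10% each, so the MAPE is 10%. Lower values indicate a better forecaster when comparing the three ANN variants.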

Keywords: artificial neural network (ANN), bat algorithm, particle swarm optimization algorithm (PSO), stock exchange

Procedia PDF Downloads 520
64 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS) based on data mining techniques to assess the difficulty of the questions. The following steps are implemented. First, a dataset from an online international learning system (slepemapy.cz) is used; it contains over 1,300,000 records with 9 features covering student, question and answer information with feedback evaluation. Next, a normalization process is applied as a preprocessing step. Then the FCM clustering algorithm is used to adapt the difficulty of the questions. The result is data labeled with three clusters (easy, intermediate, difficult) according to the highest weight. The FCM algorithm gives a label to all the questions one by one. Then a Random Forest (RF) classifier model is constructed on the clustered dataset, using 70% of the dataset for training and 30% for testing; the model achieves a 99.9% accuracy rate. This approach improves the adaptive e-learning system because it depends on student behavior and gives more accurate results in the evaluation process than an evaluation system that depends on feedback only.
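
The normalization and clustering steps described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' FCMAS implementation: min-max scaling is assumed for the normalization step, the fuzzifier `m=2.0` and the two feature columns are hypothetical, and `fuzzy_c_means` is a plain textbook version of the algorithm.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each column to [0, 1]; constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span


def fuzzy_c_means(X, n_clusters=3, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Hypothetical per-question features (e.g. error rate, mean response time).
raw = np.array([[0.10, 2.0], [0.10, 2.0], [0.50, 6.0],
                [0.50, 6.0], [0.90, 12.0], [0.90, 12.0]])
X = min_max_normalize(raw)
centers, U = fuzzy_c_means(X)
labels = U.argmax(axis=1)  # hard label = cluster with the highest membership
```

The hard labels obtained this way could then serve as training targets for a downstream classifier, such as the Random Forest with a 70/30 split that the abstract mentions.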

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 158
63 Automatic Music Score Recognition System Using Digital Image Processing

Authors: Yuan-Hsiang Chang, Zhong-Xian Peng, Li-Der Jeng

Abstract:

Music has always been an integral part of people's daily lives. But for most people, reading a musical score and turning it into melody is not easy. This study aims to develop an automatic music score recognition system using digital image processing, which can be used to read and analyze musical score images automatically. The technical approaches included: (1) staff region segmentation; (2) image preprocessing; (3) note recognition; and (4) accidental and rest recognition. Digital image processing techniques (e.g., horizontal/vertical projections, connected component labeling, morphological processing, template matching, etc.) were applied according to musical notes, accidentals, and rests in staff notations. Preliminary results showed that our system could achieve detection and recognition rates of 96.3% and 91.7%, respectively. In conclusion, we presented an effective automated musical score recognition system that could be integrated in a system with a media player to play music/songs given input images of musical score. Ultimately, this system could also be incorporated in applications for mobile devices as a learning tool, such that a music player could learn to play music/songs.
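
Of the techniques listed above, the horizontal projection is the simplest to illustrate. Staff lines span nearly the full image width, so they show up as peaks in the per-row ink count. The sketch below assumes a binarized image (1 = ink) and a hypothetical fill threshold `min_fill`; it is not taken from the paper.

```python
import numpy as np


def horizontal_projection(binary_image):
    """Count foreground (ink) pixels in every row."""
    return np.asarray(binary_image).sum(axis=1)


def find_staff_line_rows(binary_image, min_fill=0.8):
    """Return indices of rows whose ink covers at least min_fill of the width."""
    img = np.asarray(binary_image)
    profile = horizontal_projection(img)
    return np.flatnonzero(profile >= min_fill * img.shape[1]).tolist()


# Synthetic 12x20 image: 1 = ink, 0 = background.
image = np.zeros((12, 20), dtype=int)
for row in (2, 4, 6, 8, 10):
    image[row, :] = 1      # five full-width staff lines
image[3:6, 7:9] = 1        # a small blob standing in for a note head
staff_rows = find_staff_line_rows(image)
print(staff_rows)
```

The note-head blob adds only two pixels to its rows, so it stays well below the threshold and does not register as a staff line.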

Keywords: connected component labeling, image processing, morphological processing, optical musical recognition

Procedia PDF Downloads 381
62 Compilation of Load Spectrum of Loader Drive Axle

Authors: Wei Yongxiang, Zhu Haoyue, Tang Heng, Yuan Qunwei

Abstract:

In order to study the preparation method of the gear fatigue load spectrum for loaders, the load signals of four typical working conditions of a loader are collected. The signal that reflects the law of load change is obtained by preprocessing the original signal. The torque of the drive axle is calculated by using the rain-flow counting method. According to the operating time ratio of each working condition, the two-dimensional load spectrum based on the real working conditions of the loader drive axle is established by the cycle extrapolation and synthesis method. The two-dimensional load spectrum is converted into a one-dimensional load spectrum by means of the mean of torque equal damage method. Torque amplification includes the maximum load torque of the main reduction gear. Based on the theory of equal damage, the accelerated cycles are calculated. In this way, the load spectrum of the drive axle is prepared to reflect the loading condition of the loader. The load spectrum can provide a reference for fatigue life testing and life prediction of the loader drive axle.

Keywords: load spectrum, axle, torque, rain-flow counting method, extrapolation

Procedia PDF Downloads 332
61 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNNs) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The fused image is then analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 76
60 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability, using the Diabetes Health Indicators Dataset from Kaggle as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
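
Four of the five metrics named above follow directly from the confusion-matrix counts. The sketch below uses the standard definitions with hypothetical labels (1 = diabetic, 0 = not diabetic); AUC-ROC is left out because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall is often watched most closely, since a false negative corresponds to a missed case.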

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 34
59 Morphology Operation and Discrete Wavelet Transform for Blood Vessels Segmentation in Retina Fundus

Authors: Rita Magdalena, N. K. Caecar Pratiwi, Yunendah Nur Fuadah, Sofia Saidah, Bima Sakti

Abstract:

Vessel segmentation of the retinal fundus is important for biomedical sciences in diagnosing ailments related to the eye. Segmentation can help medical experts diagnose the state of a retinal fundus image. Therefore, in this study, we designed software using MATLAB that enables the segmentation of the retinal blood vessels in retinal fundus images. There are two main steps in the process of segmentation. The first step is image preprocessing, which aims to improve the quality of the image so that it can be segmented optimally. The second step is image segmentation, which extracts the retina’s blood vessels from the eye fundus image. The image segmentation methods analyzed in this study are Morphology Operation, Discrete Wavelet Transform and a combination of both. The data used in this project comprise 40 retinal images and 40 manually segmented images. After several testing scenarios, the average accuracy for the Morphology Operation method is 88.46%, while for Discrete Wavelet Transform it is 89.28%. By combining the two methods, the average accuracy increased to 89.53%. The result of this study is an image processing system that can segment the blood vessels in the retinal fundus with high accuracy and low computation time.

Keywords: discrete wavelet transform, fundus retina, morphology operation, segmentation, vessel

Procedia PDF Downloads 165
58 Brain Tumor Detection and Classification Using Pre-Trained Deep Learning Models

Authors: Aditya Karade, Sharada Falane, Dhananjay Deshmukh, Vijaykumar Mantri

Abstract:

Brain tumors pose a significant challenge in healthcare due to their complex nature and impact on patient outcomes. The application of deep learning (DL) algorithms in medical imaging has shown promise in accurate and efficient brain tumour detection. This paper explores the performance of various pre-trained DL models (ResNet50, Xception, InceptionV3, EfficientNetB0, DenseNet121, NASNetMobile, VGG19, VGG16, and MobileNet) on a brain tumour dataset sourced from Figshare. The dataset consists of MRI scans categorizing different types of brain tumours, including meningioma, pituitary, glioma, and no tumour. The study involves a comprehensive evaluation of these models’ accuracy and effectiveness in classifying brain tumour images. Data preprocessing, augmentation, and fine-tuning techniques are employed to optimize model performance. Among the evaluated deep learning models for brain tumour detection, ResNet50 emerges as the top performer with an accuracy of 98.86%. Following closely is Xception, exhibiting a strong accuracy of 97.33%. These models showcase robust capabilities in accurately classifying brain tumour images. On the other end of the spectrum, VGG16 trails with the lowest accuracy at 89.02%.

Keywords: brain tumour, MRI image, detecting and classifying tumour, pre-trained models, transfer learning, image segmentation, data augmentation

Procedia PDF Downloads 29
57 PsyVBot: Chatbot for Accurate Depression Diagnosis using Long Short-Term Memory and NLP

Authors: Thaveesha Dheerasekera, Dileeka Sandamali Alwis

Abstract:

The escalating prevalence of mental health issues, such as depression and suicidal ideation, is a matter of significant global concern. It is plausible that a variety of factors, such as life events, social isolation, and preexisting physiological or psychological health conditions, could instigate or exacerbate these conditions. Traditional approaches to diagnosing depression entail a considerable amount of time and necessitate the involvement of adept practitioners. This underscores the necessity for automated systems capable of promptly detecting and diagnosing symptoms of depression. The PsyVBot system employs sophisticated natural language processing and machine learning methodologies, including the use of the NLTK toolkit for dataset preprocessing and the utilization of a Long Short-Term Memory (LSTM) model. The PsyVBot exhibits a remarkable ability to diagnose depression with a 94% accuracy rate through the analysis of user input. Consequently, this resource proves to be efficacious for individuals, particularly those enrolled in academic institutions, who may encounter challenges pertaining to their psychological well-being. The PsyVBot employs a Long Short-Term Memory (LSTM) model that comprises a total of three layers, namely an embedding layer, an LSTM layer, and a dense layer. The stratification of these layers facilitates a precise examination of linguistic patterns that are associated with the condition of depression. The PsyVBot has the capability to accurately assess an individual's level of depression through the identification of linguistic and contextual cues. The task is achieved via a rigorous training regimen, which is executed by utilizing a dataset comprising information sourced from the subreddit r/SuicideWatch. The diverse data present in the dataset ensures precise and delicate identification of symptoms linked with depression, thereby guaranteeing accuracy. 
PsyVBot not only possesses diagnostic capabilities but also enhances the user experience through the utilization of audio outputs. This feature enables users to engage in more captivating and interactive interactions. The PsyVBot platform offers individuals the opportunity to conveniently diagnose mental health challenges through a confidential and user-friendly interface. Regarding the advancement of PsyVBot, maintaining user confidentiality and upholding ethical principles are of paramount significance. It is imperative to note that diligent efforts are undertaken to adhere to ethical standards, thereby safeguarding the confidentiality of user information and ensuring its security. Moreover, the chatbot fosters a conducive atmosphere that is supportive and compassionate, thereby promoting psychological welfare. In brief, PsyVBot is an automated conversational agent that utilizes an LSTM model to assess the level of depression in accordance with the input provided by the user. The demonstrated accuracy rate of 94% serves as a promising indication of the potential efficacy of employing natural language processing and machine learning techniques in tackling challenges associated with mental health. The reliability of PsyVBot is further improved by the fact that it makes use of the Reddit dataset and incorporates Natural Language Toolkit (NLTK) for preprocessing. PsyVBot represents a pioneering and user-centric solution that furnishes an easily accessible and confidential medium for seeking assistance. The present platform is offered as a modality to tackle the pervasive issue of depression and the contemplation of suicide.

Keywords: chatbot, depression diagnosis, LSTM model, natural language processing

Procedia PDF Downloads 29
56 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and stroke characteristic variations. Malayalam is a complex South Indian language spoken by about 35 million people, especially in Kerala and the Lakshadweep islands. In this paper, we consider significant feature extraction for the similar stroke styles of Malayalam. The extracted feature set is also suitable for the recognition of other handwritten South Indian languages such as Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy in classification and recognition of online Malayalam handwritten characters. SVM classifiers are well suited to real-world applications. The contribution of various features towards the recognition accuracy is analysed. Performance for different SVM kernels is also studied. A graphical user interface has been developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after the preprocessing of input data samples. The highest recognition accuracy of 97% is obtained experimentally at the best feature combination with a polynomial kernel in SVM.

Keywords: SVM, MATLAB, Malayalam, South Indian scripts, online handwritten character recognition

Procedia PDF Downloads 544
55 Autonomous Vehicle Detection and Classification in High Resolution Satellite Imagery

Authors: Ali J. Ghandour, Houssam A. Krayem, Abedelkarim A. Jezzini

Abstract:

High-resolution satellite images and remote sensing can provide global information quickly compared to traditional methods of data collection. Under such high resolution, a road is not a thin line anymore. Objects such as cars and trees are easily identifiable. Automatic vehicle enumeration can be considered one of the most important applications in traffic management. In this paper, an autonomous vehicle detection and classification approach for highway environments is proposed. This approach consists mainly of three stages: (i) first, a set of preprocessing operations is applied, including soil, vegetation, and water suppression. (ii) Then, road network detection and delineation is implemented using a built-up area index, followed by several morphological operations. This step plays an important role in increasing the overall detection accuracy since vehicle candidates are objects contained within the road networks only. (iii) Multi-level Otsu segmentation is implemented in the last stage, resulting in vehicle detection and classification, where detected vehicles are classified into cars and trucks. Accuracy assessment analysis is conducted over different study areas to show the great efficiency of the proposed method, especially in highway environments.
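
Stage (iii) relies on multi-level Otsu segmentation. As a hedged illustration, the sketch below implements the two-threshold (three-class) case by exhaustive search over an 8-bit histogram, choosing the pair of thresholds that maximizes the between-class variance. The function name, the 8-bit assumption and the toy image are illustrative and are not the authors' pipeline.

```python
import numpy as np


def two_threshold_otsu(image, bins=256):
    """Exhaustive two-threshold Otsu for integer images with values in [0, bins)."""
    hist = np.bincount(np.asarray(image).ravel(), minlength=bins).astype(float)
    p = hist / hist.sum()
    levels = np.arange(bins)
    # Prefix sums: P[k] = probability mass of levels < k, S[k] = its first moment.
    P = np.concatenate(([0.0], np.cumsum(p))).tolist()
    S = np.concatenate(([0.0], np.cumsum(p * levels))).tolist()
    best_score, best_pair = -1.0, (1, 2)
    for t1 in range(1, bins - 1):
        for t2 in range(t1 + 1, bins):
            score = 0.0
            # Classes are [0, t1), [t1, t2) and [t2, bins).
            for lo, hi in ((0, t1), (t1, t2), (t2, bins)):
                w = P[hi] - P[lo]
                if w > 0:
                    mu = (S[hi] - S[lo]) / w
                    # Maximizing sum(w * mu^2) is equivalent to maximizing
                    # the between-class variance (the global mean is constant).
                    score += w * mu * mu
            if score > best_score:
                best_score, best_pair = score, (t1, t2)
    return best_pair


# Toy image with three intensity groups (dark, medium, bright).
image = np.array([[10, 10, 100, 100],
                  [100, 200, 200, 10]], dtype=np.uint8)
t1, t2 = two_threshold_otsu(image)
classes = np.digitize(image, [t1, t2])  # 0, 1 or 2 for each pixel
```

On real imagery, the resulting classes would only be candidate regions; separating cars from trucks would still need size or shape criteria such as those the abstract alludes to.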

Keywords: remote sensing, object identification, vehicle and road extraction, vehicle and road features-based classification

Procedia PDF Downloads 195
54 Evaluation of the Internal Quality for Pineapple Based on the Spectroscopy Approach and Neural Network

Authors: Nonlapun Meenil, Pisitpong Intarapong, Thitima Wongsheree, Pranchalee Samanpiboon

Abstract:

In Thailand, once pineapples are harvested, they must be classified into two classes based on their sweetness: sweet and unsweet. This paper studies and develops the assessment of the internal quality of pineapples using a low-cost compact spectroscopy sensor, based on the spectroscopy approach and a Neural Network (NN). During the experiments, Batavia pineapples were utilized, generating 100 samples. The extracted pineapple juice of each sample was used to determine the Soluble Solid Content (SSC) for labeling into sweet and unsweet classes. In terms of experimental equipment, the sensor cover was specifically designed to install the sensor and light source to read the reflectance at a 5 mm depth from the pineapple flesh. By using a spectroscopy sensor, data on visible and near-infrared reflectance (Vis-NIR) were collected. The NN was used to classify the pineapple classes. Before the classification step, the preprocessing methods, namely class balancing, data shuffling, and standardization, were applied. The 510 nm and 900 nm reflectance values of the middle parts of pineapples were used as features of the NN. With the Sequential model and ReLU activation function, 100% accuracy on the training set and 76.67% accuracy on the test set were achieved. According to the abovementioned information, using a low-cost compact spectroscopy sensor has achieved favorable results in classifying the sweetness of the two classes of pineapples.
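
The three preprocessing steps named above can be sketched as a single NumPy function. The abstract does not say how class balancing was performed, so random undersampling of the majority class is an assumption here, and the two-column reflectance values are hypothetical.

```python
import numpy as np


def balance_shuffle_standardize(X, y, seed=0):
    """Undersample to equal class sizes, shuffle, then z-score each feature."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_per_class = counts.min()
    # Class balancing (assumed: random undersampling of larger classes).
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_per_class, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)  # data shuffling
    Xb, yb = X[keep], y[keep]
    # Standardization: zero mean and unit variance per feature column.
    mean, std = Xb.mean(axis=0), Xb.std(axis=0)
    std[std == 0] = 1.0
    return (Xb - mean) / std, yb


# Hypothetical reflectance features (e.g. 510 nm and 900 nm); 0 = unsweet, 1 = sweet.
X = np.array([[0.21, 0.55], [0.25, 0.61], [0.30, 0.58],
              [0.28, 0.66], [0.41, 0.72], [0.45, 0.79]])
y = np.array([0, 0, 0, 0, 1, 1])
X_std, y_bal = balance_shuffle_standardize(X, y)
```

When a separate test set is held out, the mean and standard deviation would normally be computed on the training portion only and reused for the test data, so that no test information leaks into training.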

Keywords: neural network, pineapple, soluble solid content, spectroscopy

Procedia PDF Downloads 36
53 Automatic Motion Trajectory Analysis for Dual Human Interaction Using Video Sequences

Authors: Yuan-Hsiang Chang, Pin-Chi Lin, Li-Der Jeng

Abstract:

Advances in image and video processing techniques have enabled the development of intelligent video surveillance systems. This study aimed to automatically detect moving human objects and to analyze events of dual human interaction in a surveillance scene. Our system was developed in four major steps: image preprocessing, human object detection, human object tracking, and motion trajectory analysis. The adaptive background subtraction and image processing techniques were used to detect and track moving human objects. To solve the occlusion problem during the interaction, the Kalman filter was used to retain a complete trajectory for each human object. Finally, the motion trajectory analysis was developed to distinguish between the interaction and non-interaction events based on derivatives of trajectories related to the speed of the moving objects. Using a database of 60 video sequences, our system achieved a classification accuracy of 80% for interaction events and 95% for non-interaction events. In summary, we have explored a system for the automatic classification of interaction and non-interaction events using surveillance cameras. Ultimately, this system could be incorporated in an intelligent surveillance system for the detection and/or classification of abnormal or criminal events (e.g., theft, snatching, fighting, etc.).
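
To illustrate how a Kalman filter can keep a trajectory complete through occlusion, the sketch below tracks a 2-D position with a constant-velocity model and simply skips the update step whenever a detection is missing (`None`). The motion model, the noise variances and the sample measurements are assumptions made for illustration; the abstract does not specify the authors' filter design.

```python
import numpy as np


def track_with_kalman(measurements, dt=1.0, process_var=1e-2, meas_var=1.0):
    """Constant-velocity Kalman filter; None marks an occluded frame."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)   # state transition
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)   # only position is observed
    Q = process_var * np.eye(4)
    R = meas_var * np.eye(2)
    # State is [x, y, vx, vy]; the first frame must contain a detection.
    x = np.array([measurements[0][0], measurements[0][1], 0.0, 0.0])
    P = 10.0 * np.eye(4)
    trajectory = [x[:2].copy()]
    for z in measurements[1:]:
        # Predict step: always executed.
        x = F @ x
        P = F @ P @ F.T + Q
        if z is not None:
            # Update step: only when the object was actually detected.
            innovation = np.asarray(z, dtype=float) - H @ x
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ innovation
            P = (np.eye(4) - K @ H) @ P
        trajectory.append(x[:2].copy())
    return np.array(trajectory)


# Hypothetical detections of a person walking right; frames 5 and 6 are occluded.
detections = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), None, None, (7, 0), (8, 0)]
trajectory = track_with_kalman(detections)
speeds = np.linalg.norm(np.diff(trajectory, axis=0), axis=1)  # per-frame speed
```

During the occluded frames the filter extrapolates with its current velocity estimate, so the trajectory has no gaps, and the per-frame speeds derived from it could feed the kind of derivative-based interaction analysis described in the abstract.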

Keywords: motion detection, motion tracking, trajectory analysis, video surveillance

Procedia PDF Downloads 503