Search results for: affective teaching and learning
144 Effective Traffic Lights Recognition Method for Real Time Driving Assistance System in the Daytime
Authors: Hyun-Koo Kim, Ju H. Park, Ho-Youl Jung
Abstract:
This paper presents an effective method for recognizing traffic lights in the daytime. First, the Potential Traffic Lights Detector (PTLD) uses the full color information of the YCbCr image to produce a separate binary image for green and for red traffic lights. After the PTLD step, a Shape Filter (SF) is used to remove noise such as traffic signs, street trees, vehicles, and buildings. The noise-removal criteria are based on properties of the blobs in the binary image: length, area, area of the bounding box, etc. Finally, after an intermediate association step whose goal is to define relevant candidate regions from the previously detected traffic lights, the Adaptive Multi-class Classifier (AMC) is executed. The classification method uses Haar-like features and the AdaBoost algorithm. The method was implemented on an Intel Core CPU at 2.80 GHz with 4 GB of RAM and tested on urban and rural roads. In these tests, we compared our method with standard object-recognition learning processes and showed that it reaches a detection rate of up to 94%, which is better than the results achieved with cascade classifiers. The computation time of the proposed method is 15 ms.
Keywords: Traffic Light Detection, Multi-class Classification, Driving Assistance System, Haar-like Feature, Color Segmentation Method, Shape Filter
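The color-segmentation step described in this abstract can be illustrated with a minimal sketch. It assumes the full-range ITU-R BT.601 RGB-to-YCbCr conversion; the threshold constants and the function names `rgb_to_ycbcr` and `binary_masks` are hypothetical and are not taken from the paper, which does not state its values here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr using ITU-R BT.601 coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Hypothetical thresholds chosen for illustration only.
RED_CR_MIN = 170
GREEN_CR_MAX = 100
GREEN_CB_MAX = 120


def binary_masks(image):
    """Return (red_mask, green_mask) as nested 0/1 lists for rows of RGB tuples."""
    red_mask, green_mask = [], []
    for row in image:
        red_row, green_row = [], []
        for r, g, b in row:
            _, cb, cr = rgb_to_ycbcr(r, g, b)
            # A strongly red pixel has a high Cr value.
            red_row.append(1 if cr >= RED_CR_MIN else 0)
            # A strongly green pixel has low Cr and low Cb values.
            green_row.append(1 if cr <= GREEN_CR_MAX and cb <= GREEN_CB_MAX else 0)
        red_mask.append(red_row)
        green_mask.append(green_row)
    return red_mask, green_mask


# One-row test image: pure red, pure green, mid gray.
image = [[(255, 0, 0), (0, 255, 0), (128, 128, 128)]]
red_mask, green_mask = binary_masks(image)
```

In a pipeline like the one described, masks of this kind would be followed by blob extraction and shape filtering; those later stages are not sketched here.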
PDF Downloads: 2780
143 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering
Authors: Mohamed A. Mahfouz, M. A. Ismail
Abstract:
This paper introduces new algorithms for fuzzy clustering of relational data: a fuzzy relative of the CLARANS algorithm (FCLARANS) and fuzzy c-medoids based on randomized search (FCMRANS). Unlike the existing fuzzy c-medoids algorithm (FCMdd), in which the within-cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given the current memberships, FCLARANS minimizes the same objective function as FCMdd by changing the current medoids in such a way that the sum of the within-cluster dissimilarities is minimized. Computing new medoids may be affected by noise, because outliers may take part in the computation of the medoids, whereas the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of the points inside a cluster and is therefore less sensitive to the presence of outliers. In FCMRANS, the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that adds randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuters-21578 and Newsgroups (20NG) datasets and with generated noisy datasets show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient, and their outputs are almost the same as those of RFCMdd in terms of classification rate.
Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.
PDF Downloads: 2402
142 Isolation and Classification of Red Blood Cells in Anemic Microscopic Images
Authors: Jameela Ali Alkrimi, Loay E. George, Azizah Suliman, Abdul Rahim Ahmad, Karim Al-Jashamy
Abstract:
Red blood cells (RBCs) are among the most commonly and intensively studied types of blood cells in cell biology. Anemia, a lack of RBCs, is characterized by the hemoglobin level relative to the normal level. In this study, an image-processing-based system was developed to localize and extract RBCs from microscopic images. A machine learning approach was then adopted to classify the localized anemic RBC images. Several textural and geometrical features were calculated for each extracted RBC. The training set of features was analyzed using principal component analysis (PCA). With the proposed method, RBCs were isolated in 4.3 seconds from an image containing 18 to 27 cells. PCA was chosen for its low computational complexity and its suitability for finding the most discriminating features, which can lead to accurate classification decisions. Our classifier yielded accuracy rates of 100%, 99.99%, and 96.50% for the K-nearest neighbor (K-NN) algorithm, the support vector machine (SVM), and the RBFNN neural network, respectively. Classification was evaluated in terms of sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained within a short time, and the results improved when PCA was used.
Keywords: Red blood cells, pre-processing image algorithms, classification algorithms, principal component analysis PCA, confusion matrix, kappa statistical parameters, ROC.
PDF Downloads: 3199
141 Traffic Forecasting for Open Radio Access Networks Virtualized Network Functions in 5G Networks
Authors: Khalid Ali, Manar Jammal
Abstract:
In order to meet the stringent latency and reliability requirements of the upcoming 5G networks, Open Radio Access Networks (O-RAN) have been proposed. The virtualization of O-RAN has allowed it to be treated as a Network Function Virtualization (NFV) architecture, while its components are considered Virtualized Network Functions (VNFs). Hence, intelligent Machine Learning (ML) based solutions can be utilized to apply different resource management and allocation techniques to O-RAN. However, intelligently allocating resources for O-RAN VNFs can prove challenging due to the dynamicity of traffic in mobile networks. Network providers need to dynamically scale the allocated resources in response to the incoming traffic. Elastically allocating resources can provide a higher level of flexibility in the network, in addition to reducing the OPerational EXpenditure (OPEX) and increasing resource utilization. Most of the existing elastic solutions are reactive in nature, despite the fact that proactive approaches are more agile since they scale instances ahead of time by predicting the incoming traffic. In this work, we propose and evaluate traffic forecasting models based on ML algorithms. The algorithms aim at predicting future O-RAN traffic by using previous traffic data. A detailed analysis of the traffic data was carried out to validate the quality and applicability of the traffic dataset. Two ML models were then proposed and evaluated based on their prediction capabilities.
Keywords: O-RAN, traffic forecasting, NFV, ARIMA, LSTM, elasticity.
PDF Downloads: 540
140 Surrogate based Evolutionary Algorithm for Design Optimization
Authors: Maumita Bhattacharya
Abstract:
Optimization is often a critical issue for most system design problems. Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, finding optimal solutions to complex, high-dimensional, multimodal problems often requires highly computationally expensive function evaluations and hence is practically prohibitive. The Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model presented in our earlier work [14] reduced computation time by controlled use of meta-models to partially replace the actual function evaluation with approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the meta-model are generated from a single uniform model. Situations such as model formation involving variable input dimensions and noisy data cannot be covered by this assumption. In this paper we present an enhanced version of DAFHEA that incorporates a multiple-model based learning approach for the SVM approximator. DAFHEA-II (the enhanced version of the DAFHEA framework) also overcomes the high computational expense involved in the additional clustering requirements of the original DAFHEA framework. The proposed framework has been tested on several benchmark functions, and the empirical results illustrate the advantages of the proposed technique.
Keywords: Evolutionary algorithm, Fitness function, Optimization, Meta-model, Stochastic method.
PDF Downloads: 1576
139 A Review on Medical Image Registration Techniques
Authors: Shadrack Mambo, Karim Djouani, Yskandar Hamam, Barend van Wyk, Patrick Siarry
Abstract:
This paper discusses the current trends in medical image registration techniques and addresses the need to provide a solid theoretical foundation for research endeavours. A methodological analysis and synthesis of quality literature was carried out, providing a platform for developing a good foundation for research in this field, which is crucial to understanding the existing levels of knowledge. Research on medical image registration techniques assists clinical and medical practitioners in the diagnosis of tumours and lesions in anatomical organs, thereby enabling fast and accurate curative treatment of patients. The literature review aims to provide a solid theoretical foundation for research endeavours in image registration techniques. Developing a solid foundation for a research study is possible through a methodological analysis and synthesis of existing contributions. Out of these considerations, the aim of this paper is to enhance the scientific community’s understanding of the current status of research in medical image registration techniques and to communicate the contribution of this research to the field of image processing. The gaps identified in current techniques can be closed by the use of artificial neural networks, which form learning systems designed to minimise an error function. The paper also suggests several areas of future research in image registration.
Keywords: Image registration techniques, medical images, neural networks, optimisation, transformation.
PDF Downloads: 1812
138 General Regression Neural Network and Back Propagation Neural Network Modeling for Predicting Radial Overcut in EDM: A Comparative Study
Authors: Raja Das, M. K. Pradhan
Abstract:
This paper presents a comparative study of two neural network models, namely the General Regression Neural Network (GRNN) and the Back Propagation Neural Network (BPNN), used to estimate the radial overcut produced during Electrical Discharge Machining (EDM). Four input parameters are employed: discharge current (Ip), pulse on time (Ton), duty fraction (Tau), and discharge voltage (V). Artificial intelligence techniques have recently emerged as an effective tool that can replace time-consuming procedures in various scientific and engineering applications, particularly in the prediction and estimation of complex, nonlinear processes. Both networks are trained, and their predictions are tested against the unseen validation set of the experiment and analysed. The predictions of both networks are found to be in good agreement with the experiment, with an average percentage error below 11%, and the correlation coefficient obtained on the validation data set is above 91% for both GRNN and BPNN. However, a GRNN is much faster to train than a BPNN and is often more accurate. GRNN requires more memory to store the model, but it features fast learning that does not require an iterative procedure, as well as a highly parallel structure. GRNN networks are slower than multilayer perceptron networks at classifying new cases.
Keywords: Electrical-discharge machining, General Regression Neural Network, Back-propagation Neural Network, Radial Overcut.
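The GRNN properties mentioned in this abstract (no iterative training, higher memory use) follow from how a GRNN predicts: it stores every training sample and returns a Gaussian-weighted average of the stored targets. The sketch below assumes the standard formulation with a single smoothing parameter `sigma`; the function name `grnn_predict` and the toy data are hypothetical and are not EDM measurements from the paper.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=1.0):
    """Predict y for input x as a Gaussian-weighted average of stored targets."""
    weights = []
    for xi in train_x:
        # Squared Euclidean distance between the query and a stored sample.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: the query is too far away for this sigma.
        raise ValueError("x is too far from all training samples for this sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical one-dimensional toy data; real inputs would be (Ip, Ton, Tau, V) tuples.
train_x = [(0.0,), (1.0,), (2.0,)]
train_y = [0.0, 1.0, 4.0]
prediction = grnn_predict((1.0,), train_x, train_y, sigma=0.1)
```

"Training" here amounts to keeping `train_x` and `train_y`, which is why the model is quick to build but grows with the data set and must visit every stored sample at prediction time.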
PDF Downloads: 3115
137 Twitter Sentiment Analysis during the Lockdown on New Zealand
Authors: Smah Doeban Almotiri
Abstract:
One of the most common fields of natural language processing (NLP) is sentiment analysis. The feeling implied in a text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data source for sentiment analysis studies, since people have been using social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. Processing such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to examine how sentiment differed in a single geographic region during lockdowns at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed, collected using keyword hashtags (lockdown, COVID-19). The first sample of tweets was from March 23, 2020, until April 23, 2020, and the second sample, for the following year, was from March 1, 2021, until April 4, 2021. Natural language processing (NLP), which is a form of artificial intelligence, was used in this research to calculate the sentiment value of all of the tweets using the AFINN lexicon sentiment analysis method. The findings revealed that sentiment was positive at both times during the region's lockdown in the samples of this study, which are specific to the geographical area of New Zealand. This research suggests applying a machine learning sentiment method such as Crystal Feel and extending the size of the tweet sample by collecting tweets over a longer period of time.
Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS
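Lexicon-based scoring of the kind this abstract attributes to AFINN can be sketched in a few lines. The AFINN lexicon assigns words an integer valence between -5 and +5, and a text's score is the sum over its words. The tiny `TOY_LEXICON` below is a hypothetical stand-in with illustrative scores, not the real lexicon, and the function names are assumptions; the study itself reports using NodeJS, whereas this sketch uses Python.

```python
import re

# Illustrative entries only; the real AFINN lexicon covers a few thousand words.
TOY_LEXICON = {
    "good": 3, "happy": 3, "safe": 1,
    "bad": -3, "worried": -3, "lonely": -2,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words contribute 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def sentiment_label(score):
    """Map a numeric score to a coarse sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tweet = "Feeling safe and happy at home #lockdown"
score = sentiment_score(tweet)
label = sentiment_label(score)
```

A plain sum of this kind ignores negation and sarcasm, which is one reason a study might look to learned models as a follow-up.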
PDF Downloads: 584
136 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting
Authors: Kemal Polat
Abstract:
In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) is used for the classification of a diabetes disease dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the classification accuracy of the classifier algorithm by transforming the data from a non-linearly separable dataset into a linearly separable one. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most widely used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering is used to find the centers of the attributes in the Pima Indians diabetes dataset, and the dataset is then weighted according to the ratio of the mean of each attribute to its center. Secondly, after the weighting process, classifier algorithms including the support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers are used to classify the weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) obtains very promising results in the classification of the Pima Indians diabetes dataset.
Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.
PDF Downloads: 1763
135 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm
Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn
Abstract:
Feature selection and attribute reduction are crucial problems, and widely used techniques, in the fields of machine learning, data mining, and pattern recognition for overcoming the well-known Curse of Dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRFDD) of the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a modified version of a binary shuffled frog leaping algorithm (B-SFLA) is proposed. The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRFDD. Nine classifiers have been employed to compare the proposed approach with several existing methods over twenty-two datasets, including nine high-dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.
Keywords: Binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct.
PDF Downloads: 731
134 Relevance Feedback within CBIR Systems
Authors: Mawloud Mosbah, Bachir Boucheham
Abstract:
We present here the results of a comparative study of some techniques, available in the literature, related to the relevance feedback mechanism in the case of short-term learning. Only one of the methods considered here, the K-nearest neighbors algorithm (KNN), belongs to the data mining field, while the rest belong purely to the information retrieval field and fall under the following three major axes: query shifting, feature weighting, and optimization of the parameters of the similarity metric. As a contribution, in addition to the comparison, we propose a new version of the KNN algorithm, referred to as incremental KNN, which differs from the original version in that, besides the influence of the seeds, the rating of the current target image is also influenced by the images already rated. The results presented here were obtained from experiments conducted on the Wang database for one iteration, using color moments in the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency needed in interactive systems. The results obtained allow us to claim that the proposed algorithm yields good results; it even outperforms a wide range of techniques available in the literature.
Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.
PDF Downloads: 2342
133 Development of Genetic-based Machine Learning for Network Intrusion Detection (GBML-NID)
Authors: Wafa' S. Al-Sharafat, Reyadh Naoum
Abstract:
Society has grown to rely on Internet services, and the number of Internet users increases every day. As more and more users become connected to the network, the window of opportunity for malicious users to do damage becomes very large and lucrative. The objective of this paper is to incorporate different techniques into a classifier system to detect intrusions and distinguish them from normal network packets. Among several techniques, the Steady State Genetic-based Machine Learning Algorithm (SSGBML) is used to detect intrusions. The Steady State Genetic Algorithm (SSGA), the Simple Genetic Algorithm (SGA), the Modified Genetic Algorithm, and the Zeroth Level Classifier System are investigated in this research. SSGA is used as the discovery mechanism instead of SGA. SGA replaces all old rules with newly produced rules, preventing good old rules from participating in the next rule generation. The Zeroth Level Classifier System plays the role of detector by matching incoming environment messages with classifiers to determine whether the current message is normal or an intrusion, and by receiving feedback from the environment. Finally, in order to attain the best results, the Modified SSGA enhances the discovery engine by using fuzzy logic to optimize the crossover and mutation probabilities. The experiments and evaluations of the proposed method were performed with the KDD 99 intrusion detection dataset.
Keywords: MSSGBML, Network Intrusion Detection, SGA, SSGA.
PDF Downloads: 1672
132 Selection of Best Band Combination for Soil Salinity Studies using ETM+ Satellite Images (A Case study: Nyshaboor Region, Iran)
Authors: S. H. Sanaeinejad, A. Astaraei, P. Mirhoseini.Mousavi, M. Ghaemi
Abstract:
One of the main environmental problems affecting extensive areas of the world is soil salinity. Traditional data collection methods are neither sufficient for addressing this important environmental problem nor accurate enough for soil studies. Remote sensing data could overcome most of these problems. Although satellite images are commonly used for these studies, there is still a need to find the best calibration between the data and the real situation in each specific area. The Neyshaboor area, in the north east of Iran, was selected as the field study area for this research. Landsat satellite images of this area were used to prepare suitable learning samples for processing and classifying the images. 300 locations were selected randomly in the area to collect soil samples, and 273 of these locations were finally selected for further laboratory work and image processing analysis. The electrical conductivity of all samples was measured. Six reflective bands of ETM+ satellite images taken of the study area in 2002 were used for soil salinity classification. The classification was carried out using common algorithms based on the best band composition. The results showed that reflective bands 7, 3, 4, and 1 are the best band composition for preparing the color composite images. We also found that hybrid classification is a suitable method for identifying and delineating different salinity classes in the area.
Keywords: Soil salinity, Remote sensing, Image processing, ETM+, Nyshaboor
PDF Downloads: 2021
131 A Cross-Disciplinary Educational Model in Biomanufacturing to Sustain a Competitive Workforce Ecosystem
Authors: Rosa Buxeda, Lorenzo Saliceti-Piazza, Rodolfo J. Romañach, Luis Ríos, Sandra L. Maldonado-Ramírez
Abstract:
Biopharmaceuticals manufacturing is one of the major economic activities worldwide. Ninety-three percent of the workforce in a biomanufacturing environment concentrates in production-related areas. As a result, strategic collaborations between industry and academia are crucial to ensure the availability of knowledgeable workforce needed in an economic region to become competitive in biomanufacturing. In the past decade, our institution has been a key strategic partner with multinational biotechnology companies in supplying science and engineering graduates in the field of industrial biotechnology. Initiatives addressing all levels of the educational pipeline, from K-12 to college to continued education for company employees have been established along a ten-year span. The Amgen BioTalents Program was designed to provide undergraduate science and engineering students with training in biomanufacturing. The areas targeted by this educational program enhance their academic development, since these topics are not part of their traditional science and engineering curricula. The educational curriculum involved the process of producing a biomolecule from the genetic engineering of cells to the production of an especially targeted polypeptide, protein expression and purification, to quality control, and validation. This paper will report and describe the implementation details and outcomes of the first sessions of the program.
Keywords: Biomanufacturing curriculum, interdisciplinary learning, workforce development, industry-academia partnering.
PDF Downloads: 1975
130 Low-Cost Mechatronic Design of an Omnidirectional Mobile Robot
Authors: S. Cobos-Guzman
Abstract:
This paper presents the results of a mechatronic design based on a 4-wheel omnidirectional mobile robot that can be used in indoor logistic applications. The low-level control is implemented with two open-source hardware platforms (Raspberry Pi 3 Model B+ and Arduino Mega 2560) that control four industrial motors, four ultrasound sensors, four optical encoders, a vision system with two cameras, and a Hokuyo URG-04LX-UG01 laser scanner. Moreover, the system is powered by a lithium battery that can supply 24 V DC and has a maximum capacity of 20 Ah. The Robot Operating System (ROS) has been implemented on the Raspberry Pi, and its performance is evaluated with the selected sensors and hardware. The mechatronic system is evaluated, and safe modes of power distribution for controlling all the electronic devices are proposed based on different tests. Based on the performance results, some recommendations are given for using the Raspberry Pi and Arduino in terms of power, communication, and distribution of control among the devices. According to these recommendations, the sensors are distributed between the two real-time controllers (Arduino and Raspberry Pi). In addition, the camera drivers have been implemented in Linux, and a Python program has been written to access the cameras. These cameras will be used to implement a deep learning algorithm for recognizing people and objects. In this way, the level of intelligence can be increased in combination with the maps that can be obtained from the laser scanner.
Keywords: Autonomous, indoor robot, mechatronic, omnidirectional robot.
PDF Downloads: 586
129 Oracle JDE Enterprise One ERP Implementation: A Case Study
Authors: Abhimanyu Pati, Krishna Kumar Veluri
Abstract:
The paper intends to bring out a real life experience encountered during actual implementation of a large scale Tier-1 Enterprise Resource Planning (ERP) system in a multi-location, discrete manufacturing organization in India, involved in manufacturing of auto components and aggregates. The business complexities, prior to the implementation of ERP, include multi-product with hierarchical product structures, geographically distributed multiple plant locations with disparate business practices, lack of inter-plant broadband connectivity, existence of disparate legacy applications for different business functions, and non-standardized codifications of products, machines, employees, and accounts apart from others. On the other hand, the manufacturing environment consisted of processes like Assemble-to-Order (ATO), Make-to-Stock (MTS), and Engineer-to-Order (ETO) with a mix of discrete and process operations. The paper has highlighted various business plan areas and concerns, prior to the implementation, with specific focus on strategic issues and objectives. Subsequently, it has dealt with the complete process of ERP implementation, starting from strategic planning, project planning, resource mobilization, and finally, the program execution. The step-by-step process provides a very good learning opportunity about the implementation methodology. At the end, various organizational challenges and lessons emerged, which will act as guidelines and checklist for organizations to successfully align and implement ERP and achieve their business objectives.
Keywords: ERP, ATO, MTS, ETO, discrete manufacturing, strategic planning.
PDF Downloads: 1800
128 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems
Authors: Rodolfo Lorbieski, Silvia Modesto Nassar
Abstract:
Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition, and medical diagnostics. The objective of this article is to analyze an adapted Stacking method for multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training procedure similar to Stacking was used, but with three levels, where the final decision-maker (level 2) is trained by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the level 2 meta-classifier. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time, for three factors: (a) datasets, (b) experiments, and (c) levels. To compare the factors, a three-way ANOVA test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only for time. The accuracy and the area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.
Keywords: Stacking, multi-layers, ensemble, multi-class.
PDF Downloads: 1093
127 A Software Framework for Predicting Oil-Palm Yield from Climate Data
Authors: Mohd. Noor Md. Sap, A. Majid Awan
Abstract:
Intelligent systems based on machine learning techniques, such as classification and clustering, are gaining widespread popularity in real-world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data to find spatio-temporal patterns in climate data using kernel methods, which are well suited to complex data. This work is inspired by the notion that a non-linear transformation of the data into some high-dimensional feature space increases the possibility of linear separability of the patterns in the transformed space and therefore simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high-dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers, and auto-correlation in the spatial data, enabling effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing the various factors affecting the yield.
Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield
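The kernel substitution this abstract describes can be made concrete with a small sketch of the assignment step of plain kernel k-means. The squared feature-space distance from a point to a cluster mean is expanded into inner products, and each inner product is replaced by a kernel evaluation. This is the unweighted textbook form only: the weights and spatial constraints of the paper's algorithm are not reproduced, and the RBF kernel, `gamma`, and all function names are assumptions.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel, a positive definite function of two points."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def feature_space_sq_distance(x, cluster, kernel=rbf_kernel):
    """Squared distance from phi(x) to the mean of phi(c) over the cluster.

    Expands ||phi(x) - mean||^2 into kernel evaluations, so the mapping phi
    is never computed explicitly.
    """
    n = len(cluster)
    self_term = kernel(x, x)
    cross_term = sum(kernel(x, c) for c in cluster) / n
    cluster_term = sum(kernel(a, b) for a in cluster for b in cluster) / (n * n)
    return self_term - 2.0 * cross_term + cluster_term


def assign(points, clusters, kernel=rbf_kernel):
    """Assign each point to the index of the nearest cluster in feature space."""
    return [
        min(range(len(clusters)),
            key=lambda k: feature_space_sq_distance(p, clusters[k], kernel))
        for p in points
    ]


# Hypothetical two-dimensional toy clusters.
clusters = [[(0.0, 0.0), (0.1, 0.0)], [(5.0, 5.0), (5.1, 5.0)]]
labels = assign([(0.05, 0.0), (5.0, 5.1)], clusters)
```

A full clustering loop would alternate this assignment step with regrouping the points until the labels stop changing.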
PDF Downloads: 1979
126 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text
Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni
Abstract:
The problem of Entity relation discovery in structured data, a well covered topic in the literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because it is more common for related words to be close to each other than to be at opposite ends of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight that depends on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition, applying the technique to a data set consisting of the text of the Bible, split into verses.
Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.
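The weighted-distance sliding window described in this abstract can be sketched as follows. Each pair of entities found within the window adds a weight that shrinks with their token distance. The inverse-distance weight `1 / d`, the window size, and the function name `weighted_cooccurrences` are assumptions for illustration; the abstract does not specify the weighting function.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Accumulate a distance-weighted cooccurrence score for each entity pair."""
    graph = defaultdict(float)
    for i, left in enumerate(tokens):
        if left not in entities:
            continue
        # Look only at tokens up to `window` positions to the right of i.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            right = tokens[j]
            if right in entities and right != left:
                # Sort the pair so that (a, b) and (b, a) share one edge.
                pair = tuple(sorted((left, right)))
                graph[pair] += weight(j - i)
    return dict(graph)


verse = "in the beginning god created the heaven and the earth".split()
graph = weighted_cooccurrences(verse, {"god", "heaven", "earth"}, window=6)
```

Passing `weight=lambda d: 1.0` recovers the plain sliding-window count that the weighted variant generalises.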
Downloads 684
125 Technological Innovation Capabilities and Firm Performance
Authors: Richard C.M. Yam, William Lo, Esther P.Y. Tang, Antonio K.W. Lau
Abstract:
Technological innovation capability (TIC) is defined as a comprehensive set of characteristics of a firm that facilitates and supports its technological innovation strategies. An audit to evaluate the TICs of a firm may trigger improvement in its future practices. Such an audit can be used by the firm for self-assessment or third-party independent assessment to identify problems in its capability status. This paper attempts to develop an auditing framework that can help to determine the subtle links between innovation capabilities and business performance, and to enable the auditor to determine whether good practice is in place. The seven TICs in this study are learning, R&D, resource allocation, manufacturing, marketing, organization and strategic planning capabilities. Empirical data were acquired through a survey of 200 manufacturing firms in the Hong Kong/Pearl River Delta (HK/PRD) region. Structural equation modelling was employed to examine the relationships among TICs and various performance indicators: sales performance, innovation performance, product performance, and sales growth. The results revealed that different TICs have different impacts on different performance measures. Organization capability was found to have the strongest impact. Hong Kong manufacturers are now facing the challenge of high-mix-low-volume customer orders. To cope with this change, good capability in organizing different activities among various departments is critical to the success of a company.
Keywords: Hong Kong/Pearl River Delta, Innovation audit, Manufacturing, Technological innovation capability
Downloads 3403
124 Least Square-SVM Detector for Wireless BPSK in Multi-Environmental Noise
Authors: J. P. Dubois, Omar M. Abdul-Latif
Abstract:
Support Vector Machine (SVM) is a statistical learning tool built on the concept of structural risk minimization (SRM). In this paper, SVM is applied to signal detection in communication systems in the presence of channel noise in various environments, in the form of Rayleigh fading, additive white Gaussian background noise (AWGN), and interference noise generalized as additive colored Gaussian noise (ACGN). The structure and performance of SVM in terms of the bit error rate (BER) metric are derived and simulated for these advanced stochastic noise models, and the computational complexity of the implementation, in terms of average computational time per bit, is also presented. The performance of SVM is then compared to that of the conventional optimal model-based detector for binary signaling with binary phase shift keying (BPSK) modulation. We show that the SVM performance is superior to that of conventional matched filter-, innovation filter-, and Wiener filter-driven detectors, even in the presence of random Doppler carrier deviation, especially for low SNR (signal-to-noise ratio) ranges. For large SNR, the performance of the SVM was similar to that of the classical detectors. However, the convergence between SVM and maximum likelihood detection occurred at a higher SNR as the noise environment became more hostile.
Keywords: Colour noise, Doppler shift, innovation filter, least square-support vector machine, matched filter, Rayleigh fading, Wiener filter.
Downloads 1813
123 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes
Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani
Abstract:
Development of a method to estimate gene functions is an important task in bioinformatics. One approach to annotation is to identify the metabolic pathway in which a gene is involved. Since gene expression data reflect various intracellular phenomena, those data are considered to be related to genes' functions. However, it has been difficult to estimate gene function with high accuracy. The low accuracy of the estimation is thought to be caused by the difficulty of measuring gene expression accurately: even when measured under the same conditions, gene expression values usually vary. In this study, we propose a feature extraction method that focuses on the variability of gene expression in order to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs using KL divergence, which quantifies how much one distribution differs from another. Finally, we used the similarity vectors as feature vectors and trained a multiclass SVM to identify the genes' metabolic pathway. To evaluate the method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. As a result, the accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method combined with KL divergence is useful for identifying the genes' metabolic pathway.
Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.
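The feature extraction step in this abstract can be sketched under a strong simplifying assumption: each gene's replicate measurements are modelled as a univariate Gaussian, for which the KL divergence has a closed form. The abstract does not say which distribution family the authors used, and the gene names and values below are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_gaussian(replicates):
    """Estimate (mean, standard deviation) from replicate measurements."""
    return mean(replicates), stdev(replicates)


def gaussian_kl(p, q):
    """KL(p || q) for two univariate Gaussians given as (mean, std) pairs."""
    m1, s1 = p
    m2, s2 = q
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5


# Hypothetical replicate expression values for three genes.
expr = {
    "geneA": [1.0, 1.2, 0.8],
    "geneB": [1.1, 1.3, 0.9],
    "geneC": [3.0, 3.5, 2.5],
}
params = {gene: fit_gaussian(values) for gene, values in expr.items()}

# One feature vector per gene: its divergence to every gene, in a fixed order.
features = {
    gene: [gaussian_kl(params[gene], params[other]) for other in expr]
    for gene in expr
}
for gene, vector in features.items():
    print(gene, [round(v, 3) for v in vector])
```

Note that KL divergence is asymmetric, so `gaussian_kl(p, q)` and `gaussian_kl(q, p)` generally differ; vectors like these could then be passed to a multiclass classifier such as an SVM.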
Downloads 2336
122 Visualization and Indexing of Spectral Databases
Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi
Abstract:
On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from a spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra using nearest-neighbor algorithms and distance-based search methods. Search for nearest neighbors in the spectral space is an NP-hard problem; the computational complexity increases with the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time, some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two-dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of the spectral database not only supports the application of distance- and similarity-based techniques but also enables the use of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means that the prediction need not use the high-dimensional space but can be based on the mapped space instead. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance-based learning algorithms.
Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.
Downloads 1769
121 The Challenges and Solutions for Developing Mobile Apps in a Small University
Authors: Greg Turner, Bin Lu, Cheer-Sun Yang
Abstract:
As computing technology advances, smartphone applications can assist student learning in a pervasive way. For example, using a mobile app for the PA Common Trees, Pests, Pathogens as a reference tool in the field allows middle school students to learn about trees and associated pests/pathogens without bringing a textbook. While working on the development of three heterogeneous mobile apps, we ran into numerous challenges. Both the traditional waterfall model and the more modern agile methodologies failed in practice. The waterfall model emphasizes planning the duration of each phase. When the duration of each phase is not consistent with the availability of developers, the waterfall model cannot be employed. When applying agile methodologies, we could not maintain the high frequency of the iterative development review process, known as 'sprints'. In this paper, we discuss the challenges and solutions. We propose a hybrid model known as the Relay Race Methodology (RRM) to reflect the concept of racing and relaying during the process of software development in practice. Based on the development project, we observe that the relay race transition between any two phases arises naturally. Thus, we claim that the RRM model can provide a de facto rather than a de jure basis for the core concept in the software development model. In this paper, the background of the project is introduced first. Then, the challenges are pointed out, followed by our solutions. Finally, the lessons learned and future work are presented.
Keywords: Agile methods, mobile apps, software process model, waterfall model.
Downloads 1603
120 Computing Entropy for Ortholog Detection
Authors: Hsing-Kuo Pao, John Case
Abstract:
Biological sequences from different species are called orthologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences - in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In practice one can approximate it by computable compression methods. However, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new approach to overcome the problem that compression approximations may not work well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empirical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and negative (non-ortholog) data - better than with good, previously known alternatives (which do not employ some means to handle short sequences well). Also empirically compared are the new entropy-based attribute set and a number of other, more standard similarity attribute sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclusion: the new, entropy-based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.
Keywords: compression, decision tree, entropy, ortholog, ROC.
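The compression-based approximation mentioned in this abstract, and its weakness on short inputs, can be illustrated with the standard normalized compression distance. This is the well-known baseline technique, not the authors' conditional entropy attributes; the choice of zlib, the synthetic sequences and the function names are assumptions.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable stand-in for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


# Synthetic sequences (assumed): one highly repetitive, one pseudo-random.
rng = random.Random(0)
seq_a = b"ACGTTGCA" * 40
seq_b = bytes(rng.choice(b"ACGT") for _ in range(320))

print("NCD(a, a):", round(ncd(seq_a, seq_a), 3))
print("NCD(a, b):", round(ncd(seq_a, seq_b), 3))

# For a very short sequence the compressor's fixed overhead dominates,
# so the compressed length says little about the sequence itself.
short = b"ACGT"
print(len(short), "bytes compress to", compressed_size(short), "bytes")
```

The last lines show the effect the abstract refers to: a four-byte input becomes longer after compression, which is why plain compression estimates are unreliable for short sequences.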
Downloads 1827
119 Evolution of Web Development Techniques in Modern Technology
Authors: Abdul Basit Kiani, Maryam Kiani
Abstract:
The art of web development in new technologies is a dynamic journey, shaped by the constant evolution of tools and platforms. With the emergence of JavaScript frameworks and APIs, web developers are empowered to craft web applications that are not only robust but also highly interactive. The aim is to provide an overview of the developments in the field. The integration of artificial intelligence (AI) and machine learning (ML) has opened new horizons in web development. Chatbots, intelligent recommendation systems, and personalization algorithms have become integral components of modern websites. These AI-powered features enhance user engagement, provide personalized experiences, and streamline customer support processes, revolutionizing the way businesses interact with their audiences. Lastly, the emphasis on web security and privacy has been a pivotal area of progress. With the increasing incidents of cyber threats, web developers have implemented robust security measures to safeguard user data and ensure secure transactions. Innovations such as HTTPS protocol, two-factor authentication, and advanced encryption techniques have bolstered the overall security of web applications, fostering trust and confidence among users. Hence, recent progress in web development has propelled the industry forward, enabling developers to craft innovative and immersive digital experiences. From responsive design to AI integration and enhanced security, the landscape of web development continues to evolve, promising a future filled with endless possibilities.
Keywords: Web development, software testing, progressive web apps, web and mobile native application.
Downloads 381
118 Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction
Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota
Abstract:
Understanding the causes of road accidents and predicting their occurrence is key to preventing deaths and serious injuries from road accident events. Traditional statistical methods such as Poisson and logistic regression have been used to find the association between traffic environmental factors and accident occurrence; recently, the artificial neural network (ANN), a computational technique that learns from historical data to make more accurate predictions, has emerged. Despite its ability to make accurate predictions, the ANN has difficulty dealing with a highly unbalanced distribution of attribute patterns in the training dataset; in such circumstances, the ANN treats the minority group as noise. However, in real-world data, the minority group is often the group of interest; e.g., in road traffic accident data, the accident events are the group of interest. This study proposes combining k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make predictions on a dataset with a highly unbalanced distribution of attribute patterns; however, on a dataset with an evenly distributed set of attribute patterns, the proposed method performs almost like a standard neural network.
Keywords: Accident risks estimation, artificial neural network, deep learning, K-mean, road safety.
Downloads 974
117 Mining User-Generated Contents to Detect Service Failures with Topic Model
Authors: Kyung Bae Park, Sung Ho Ha
Abstract:
Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and the pressing need to handle the overwhelming volume of varied UGC is one of the paramount issues for management. However, current approaches (e.g., sentiment analysis) are often ineffective at leveraging textual information to detect the problems or issues that a given management faces. In this paper, we apply text mining with Latent Dirichlet Allocation (LDA) to a popular online review site dedicated to user complaints. We find that LDA efficiently detects customer complaints, and that further inspection with a visualization technique is effective for categorizing the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given limited resources. The findings provide managerial insights into how analytics on social media can help maintain and improve reputation management. Our interdisciplinary approach also highlights several insights from applying machine learning techniques in the marketing research domain. On a broader technical note, this paper illustrates in detail how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since the procedure is still largely undocumented. In this regard, it will help lower the barrier for interdisciplinary researchers to conduct related research.
Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.
Downloads 1216
116 The Estimation of Bird Diversity Loss and Gain as an Impact of Oil Palm Plantation: Study Case in KJNP Estate Riau Province
Authors: Yanto Santosa, Catharina Yudea
Abstract:
The rapid growth of oil palm industry in Indonesia raised many negative accusations from various parties, who said that oil palm plantation is damaging the environment and biodiversity, including birds. Since research on oil palm plantation impacts on bird diversity is still limited, this study needs to be developed in order to gain further learning and understanding. Data on bird diversity were collected in March 2018 in KJNP Estate, Riau Province using strip transect method on five different land cover types (young, intermediate, and old growth of oil palm plantation, high conservation value area, and crops field or the baseline). The observations were conducted simultaneously, with three repetitions. The result shows that the baseline has 19 species of birds and land cover after the oil palm plantation has 39 species. HCV (high conservation value) area has the highest increase in diversity value. Oil palm plantation has changed the composition of bird species. The highest similarity index is shown by young growth oil palm land cover with total score 0.65, meanwhile the lowest similarity index with total score 0.43 is shown by HCV area. Overall, the existence of oil palm plantation made a positive impact by increasing bird species diversity, with total 23 species gained and 3 species lost.
Keywords: Bird diversity, crops field, impact of oil palm plantation, KJNP estate.
Downloads 795
115 Outsourcing the Front End of Innovation
Abstract:
The paper presents a new method for efficient innovation process management. Even though innovation management methods, tools and knowledge are well established and documented in the literature, most companies still do not manage innovation efficiently. Especially in SMEs, the front end of innovation - problem identification, idea creation and selection - is often not optimally performed. Our eMIPS methodology represents a sort of "umbrella methodology" - a well-defined set of procedures, which can be dynamically adapted to the concrete case in a company. In daily practice, various methods (e.g. for problem identification and idea creation) can be applied, depending on the company's needs. It is based on the proactive involvement of the company's employees, supported by the appropriate methodology and external experts. The presented phases are performed via a mixture of face-to-face activities (workshops) and online (eLearning) activities taking place in the Moodle eLearning environment and using other e-communication channels. One part of the outcomes is an identified set of opportunities and concrete solutions ready for implementation. The other, also very important, result is the innovation competences that the participating employees acquire, related to concrete tools and methods for idea management. In addition, the employees gain strong experience in dynamic, efficient and solution-oriented management of the invention process. The eMIPS also represents a way of establishing or improving the innovation culture in the organization. The first application in a pilot company showed excellent results regarding both the motivation of participants and the outcomes achieved.
Keywords: Creativity, distance learning, front end, innovation, problem.
Downloads 2208