Search results for: learning advanced programming
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3100

Search results for: learning advanced programming

160 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.

Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2401
159 Isolation and Classification of Red Blood Cells in Anemic Microscopic Images

Authors: Jameela Ali Alkrimi, Loay E. George, Azizah Suliman, Abdul Rahim Ahmad, Karim Al-Jashamy

Abstract:

Red blood cells (RBCs) are among the most commonly and intensively studied type of blood cells in cell biology. Anemia is a lack of RBCs is characterized by its level compared to the normal hemoglobin level. In this study, a system based image processing methodology was developed to localize and extract RBCs from microscopic images. Also, the machine learning approach is adopted to classify the localized anemic RBCs images. Several textural and geometrical features are calculated for each extracted RBCs. The training set of features was analyzed using principal component analysis (PCA). With the proposed method, RBCs were isolated in 4.3secondsfrom an image containing 18 to 27 cells. The reasons behind using PCA are its low computation complexity and suitability to find the most discriminating features which can lead to accurate classification decisions. Our classifier algorithm yielded accuracy rates of 100%, 99.99%, and 96.50% for K-nearest neighbor (K-NN) algorithm, support vector machine (SVM), and neural network RBFNN, respectively. Classification was evaluated in highly sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained within short time period, and the results became better when PCA was used.

Keywords: Red blood cells, pre-processing image algorithms, classification algorithms, principal component analysis PCA, confusion matrix, kappa statistical parameters, ROC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3198
158 Traffic Forecasting for Open Radio Access Networks Virtualized Network Functions in 5G Networks

Authors: Khalid Ali, Manar Jammal

Abstract:

In order to meet the stringent latency and reliability requirements of the upcoming 5G networks, Open Radio Access Networks (O-RAN) have been proposed. The virtualization of O-RAN has allowed it to be treated as a Network Function Virtualization (NFV) architecture, while its components are considered Virtualized Network Functions (VNFs). Hence, intelligent Machine Learning (ML) based solutions can be utilized to apply different resource management and allocation techniques on O-RAN. However, intelligently allocating resources for O-RAN VNFs can prove challenging due to the dynamicity of traffic in mobile networks. Network providers need to dynamically scale the allocated resources in response to the incoming traffic. Elastically allocating resources can provide a higher level of flexibility in the network in addition to reducing the OPerational EXpenditure (OPEX) and increasing the resources utilization. Most of the existing elastic solutions are reactive in nature, despite the fact that proactive approaches are more agile since they scale instances ahead of time by predicting the incoming traffic. In this work, we propose and evaluate traffic forecasting models based on the ML algorithm. The algorithms aim at predicting future O-RAN traffic by using previous traffic data. Detailed analysis of the traffic data was carried out to validate the quality and applicability of the traffic dataset. Hence, two ML models were proposed and evaluated based on their prediction capabilities.

Keywords: O-RAN, traffic forecasting, NFV, ARIMA, LSTM, elasticity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 537
157 Surrogate based Evolutionary Algorithm for Design Optimization

Authors: Maumita Bhattacharya

Abstract:

Optimization is often a critical issue for most system design problems. Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, finding optimal solution to complex high dimensional, multimodal problems often require highly computationally expensive function evaluations and hence are practically prohibitive. The Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model presented in our earlier work [14] reduced computation time by controlled use of meta-models to partially replace the actual function evaluation by approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the meta-model are generated from a single uniform model. Situations like model formation involving variable input dimensions and noisy data certainly can not be covered by this assumption. In this paper we present an enhanced version of DAFHEA that incorporates a multiple-model based learning approach for the SVM approximator. DAFHEA-II (the enhanced version of the DAFHEA framework) also overcomes the high computational expense involved with additional clustering requirements of the original DAFHEA framework. The proposed framework has been tested on several benchmark functions and the empirical results illustrate the advantages of the proposed technique.

Keywords: Evolutionary algorithm, Fitness function, Optimization, Meta-model, Stochastic method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1575
156 A Review on Medical Image Registration Techniques

Authors: Shadrack Mambo, Karim Djouani, Yskandar Hamam, Barend van Wyk, Patrick Siarry

Abstract:

This paper discusses the current trends in medical image registration techniques and addresses the need to provide a solid theoretical foundation for research endeavours. Methodological analysis and synthesis of quality literature was done, providing a platform for developing a good foundation for research study in this field which is crucial in understanding the existing levels of knowledge. Research on medical image registration techniques assists clinical and medical practitioners in diagnosis of tumours and lesion in anatomical organs, thereby enhancing fast and accurate curative treatment of patients. Literature review aims to provide a solid theoretical foundation for research endeavours in image registration techniques. Developing a solid foundation for a research study is possible through a methodological analysis and synthesis of existing contributions. Out of these considerations, the aim of this paper is to enhance the scientific community’s understanding of the current status of research in medical image registration techniques and also communicate to them, the contribution of this research in the field of image processing. The gaps identified in current techniques can be closed by use of artificial neural networks that form learning systems designed to minimise error function. The paper also suggests several areas of future research in the image registration.

Keywords: Image registration techniques, medical images, neural networks, optimisation, transformation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1811
155 General Regression Neural Network and Back Propagation Neural Network Modeling for Predicting Radial Overcut in EDM: A Comparative Study

Authors: Raja Das, M. K. Pradhan

Abstract:

This paper presents a comparative study between two neural network models namely General Regression Neural Network (GRNN) and Back Propagation Neural Network (BPNN) are used to estimate radial overcut produced during Electrical Discharge Machining (EDM). Four input parameters have been employed: discharge current (Ip), pulse on time (Ton), Duty fraction (Tau) and discharge voltage (V). Recently, artificial intelligence techniques, as it is emerged as an effective tool that could be used to replace time consuming procedures in various scientific or engineering applications, explicitly in prediction and estimation of the complex and nonlinear process. The both networks are trained, and the prediction results are tested with the unseen validation set of the experiment and analysed. It is found that the performance of both the networks are found to be in good agreement with average percentage error less than 11% and the correlation coefficient obtained for the validation data set for GRNN and BPNN is more than 91%. However, it is much faster to train GRNN network than a BPNN and GRNN is often more accurate than BPNN. GRNN requires more memory space to store the model, GRNN features fast learning that does not require an iterative procedure, and highly parallel structure. GRNN networks are slower than multilayer perceptron networks at classifying new cases.

Keywords: Electrical-discharge machining, General Regression Neural Network, Back-propagation Neural Network, Radial Overcut.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3114
154 Twitter Sentiment Analysis during the Lockdown on New Zealand

Authors: Smah Doeban Almotiri

Abstract:

One of the most common fields of natural language processing (NLP) is sentimental analysis. The inferred feeling in the text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data point for sentimental analytics studies since people are using social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentimental states differed in a single geographic region during the lockdown at two different times.1162 tweets were analyzed related to the COVID-19 pandemic lockdown using keywords hashtags (lockdown, COVID-19) for the first sample tweets were from March 23, 2020, until April 23, 2020, and the second sample for the following year was from March 1, 2021, until April 4, 2021. Natural language processing (NLP), which is a form of Artificial intelligent was used for this research to calculate the sentiment value of all of the tweets by using AFINN Lexicon sentiment analysis method. The findings revealed that the sentimental condition in both different times during the region's lockdown was positive in the samples of this study, which are unique to the specific geographical area of New Zealand. This research suggests applied machine learning sentimental method such as Crystal Feel and extended the size of the sample tweet by using multiple tweets over a longer period of time.

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 583
153 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1761
152 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn

Abstract:

Feature selection and attribute reduction are crucial problems, and widely used techniques in the field of machine learning, data mining and pattern recognition to overcome the well-known phenomenon of the Curse of Dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRFDD) of the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a modified version of a binary shuffled frog leaping algorithm is proposed (B-SFLA). The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers have been employed to compare the proposed approach with several existing methods over twenty two datasets, including nine high dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.

Keywords: Binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 730
151 Relevance Feedback within CBIR Systems

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

We present here the results for a comparative study of some techniques, available in the literature, related to the relevance feedback mechanism in the case of a short-term learning. Only one method among those considered here is belonging to the data mining field which is the K-nearest neighbors algorithm (KNN) while the rest of the methods is related purely to the information retrieval field and they fall under the purview of the following three major axes: Shifting query, Feature Weighting and the optimization of the parameters of similarity metric. As a contribution, and in addition to the comparative purpose, we propose a new version of the KNN algorithm referred to as an incremental KNN which is distinct from the original version in the sense that besides the influence of the seeds, the rate of the actual target image is influenced also by the images already rated. The results presented here have been obtained after experiments conducted on the Wang database for one iteration and utilizing color moments on the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency purposes needed in the case of interactive systems. The results obtained allow us to claim that the proposed algorithm proves good results; it even outperforms a wide range of techniques available in the literature.

Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2342
150 Development of Genetic-based Machine Learning for Network Intrusion Detection (GBML-NID)

Authors: Wafa' S.Al-Sharafat, Reyadh Naoum

Abstract:

Society has grown to rely on Internet services, and the number of Internet users increases every day. As more and more users become connected to the network, the window of opportunity for malicious users to do their damage becomes very great and lucrative. The objective of this paper is to incorporate different techniques into classier system to detect and classify intrusion from normal network packet. Among several techniques, Steady State Genetic-based Machine Leaning Algorithm (SSGBML) will be used to detect intrusions. Where Steady State Genetic Algorithm (SSGA), Simple Genetic Algorithm (SGA), Modified Genetic Algorithm and Zeroth Level Classifier system are investigated in this research. SSGA is used as a discovery mechanism instead of SGA. SGA replaces all old rules with new produced rule preventing old good rules from participating in the next rule generation. Zeroth Level Classifier System is used to play the role of detector by matching incoming environment message with classifiers to determine whether the current message is normal or intrusion and receiving feedback from environment. Finally, in order to attain the best results, Modified SSGA will enhance our discovery engine by using Fuzzy Logic to optimize crossover and mutation probability. The experiments and evaluations of the proposed method were performed with the KDD 99 intrusion detection dataset.

Keywords: MSSGBML, Network Intrusion Detection, SGA, SSGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1669
149 Exploring Perceptions and Practices About Information and Communication Technologies in Business English Teaching in Pakistan

Authors: M. Athar Hussain, N.B. Jumani, Munazza Sultana., M. Zafar Iqbal

Abstract:

Language Reforms and potential use of ICTs has been a focal area of Higher Education Commission of Pakistan. Efforts are being accelerated to incorporate fast expanding ICTs to bring qualitative improvement in language instruction in higher education. This paper explores how university teachers are benefitting from ICTs to make their English class effective and what type of problems they face in practicing ICTs during their lectures. An in-depth qualitative study was employed to understand why language teachers tend to use ICTs in their instruction and how they are practicing it. A sample of twenty teachers from five universities located in Islamabad, three from public sector and two from private sector, was selected on non-random (Snowball) sampling basis. An interview with 15 semi-structured items was used as research instruments to collect data. The findings reveal that business English teaching is facilitated and improved through the use of ICTs. The language teachers need special training regarding the practices and implementation of ICTs. It is recommended that initiatives might be taken to equip university language teachers with modern methodology incorporating ICTs as focal area and efforts might be made to remove barriers regarding the training of language teachers and proper usage of ICTs.

Keywords: Information and communication technologies, internet assisted learning, teaching business English, online instructional content.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1946
148 Selection of Best Band Combination for Soil Salinity Studies using ETM+ Satellite Images (A Case study: Nyshaboor Region,Iran)

Authors: Sanaeinejad, S. H.; A. Astaraei, . P. Mirhoseini.Mousavi, M. Ghaemi,

Abstract:

One of the main environmental problems which affect extensive areas in the world is soil salinity. Traditional data collection methods are neither enough for considering this important environmental problem nor accurate for soil studies. Remote sensing data could overcome most of these problems. Although satellite images are commonly used for these studies, however there are still needs to find the best calibration between the data and real situations in each specified area. Neyshaboor area, North East of Iran was selected as a field study of this research. Landsat satellite images for this area were used in order to prepare suitable learning samples for processing and classifying the images. 300 locations were selected randomly in the area to collect soil samples and finally 273 locations were reselected for further laboratory works and image processing analysis. Electrical conductivity of all samples was measured. Six reflective bands of ETM+ satellite images taken from the study area in 2002 were used for soil salinity classification. The classification was carried out using common algorithms based on the best composition bands. The results showed that the reflective bands 7, 3, 4 and 1 are the best band composition for preparing the color composite images. We also found out, that hybrid classification is a suitable method for identifying and delineation of different salinity classes in the area.

Keywords: Soil salinity, Remote sensing, Image processing, ETM+, Nyshaboor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2020
147 Assisted Approach as a Tool for Increasing Attention When Using the iPad in a Special Elementary School: Action Research

Authors: Vojtěch Gybas, Libor Klubal, Kateřina Kostolányová

Abstract:

Nowadays, mobile touch technologies, such as tablets, are an integral part of teaching and learning in many special elementary schools. Many special education teachers tend to choose an iPad tablet with iOS. The reason is simple; the iPad has a function for pupils with special educational needs. If we decide to use tablets in teaching, in general, first we should try to stimulate the cognitive abilities of the pupil at the highest level, while holding the pupil’s attention on the task, when working with the device. This paper will describe how student attention can be increased by eliminating the working environment of selected applications, while using iPads with pupils in a special elementary school. Assisted function approach is highly effective at eliminating unwanted touching by a pupil when working on the desktop iPad, thus actively increasing the pupil´s attention while working on specific educational applications. During the various stages of the action, the research was conducted via data collection and interpretation. After a phase of gaining results and ideas for practice and actions, we carried out the check measurement, this time using the tool-assisted approach. In both cases, the pupils worked in the Math Board application and the resulting differences were evident.

Keywords: Special elementary school, mobile touch device, iPad, attention, math board.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1285
146 A Cross-Disciplinary Educational Model in Biomanufacturing to Sustain a Competitive Workforce Ecosystem

Authors: Rosa Buxeda, Lorenzo Saliceti-Piazza, Rodolfo J. Romañach, Luis Ríos, Sandra L. Maldonado-Ramírez

Abstract:

Biopharmaceuticals manufacturing is one of the major economic activities worldwide. Ninety-three percent of the workforce in a biomanufacturing environment concentrates in production-related areas. As a result, strategic collaborations between industry and academia are crucial to ensure the availability of knowledgeable workforce needed in an economic region to become competitive in biomanufacturing. In the past decade, our institution has been a key strategic partner with multinational biotechnology companies in supplying science and engineering graduates in the field of industrial biotechnology. Initiatives addressing all levels of the educational pipeline, from K-12 to college to continued education for company employees have been established along a ten-year span. The Amgen BioTalents Program was designed to provide undergraduate science and engineering students with training in biomanufacturing. The areas targeted by this educational program enhance their academic development, since these topics are not part of their traditional science and engineering curricula. The educational curriculum involved the process of producing a biomolecule from the genetic engineering of cells to the production of an especially targeted polypeptide, protein expression and purification, to quality control, and validation. This paper will report and describe the implementation details and outcomes of the first sessions of the program.

Keywords: Biomanufacturing curriculum, interdisciplinary learning, workforce development, industry-academia partnering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1974
145 Low-Cost Mechatronic Design of an Omnidirectional Mobile Robot

Authors: S. Cobos-Guzman

Abstract:

This paper presents the results of a mechatronic design based on a 4-wheel omnidirectional mobile robot that can be used in indoor logistic applications. The low-level control has been selected using two open-source hardware (Raspberry Pi 3 Model B+ and Arduino Mega 2560) that control four industrial motors, four ultrasound sensors, four optical encoders, a vision system of two cameras, and a Hokuyo URG-04LX-UG01 laser scanner. Moreover, the system is powered with a lithium battery that can supply 24 V DC and a maximum current-hour of 20Ah.The Robot Operating System (ROS) has been implemented in the Raspberry Pi and the performance is evaluated with the selection of the sensors and hardware selected. The mechatronic system is evaluated and proposed safe modes of power distribution for controlling all the electronic devices based on different tests. Therefore, based on different performance results, some recommendations are indicated for using the Raspberry Pi and Arduino in terms of power, communication, and distribution of control for different devices. According to these recommendations, the selection of sensors is distributed in both real-time controllers (Arduino and Raspberry Pi). On the other hand, the drivers of the cameras have been implemented in Linux and a python program has been implemented to access the cameras. These cameras will be used for implementing a deep learning algorithm to recognize people and objects. In this way, the level of intelligence can be increased in combination with the maps that can be obtained from the laser scanner.

Keywords: Autonomous, indoor robot, mechatronic, omnidirectional robot.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 585
144 Oracle JDE Enterprise One ERP Implementation: A Case Study

Authors: Abhimanyu Pati, Krishna Kumar Veluri

Abstract:

The paper intends to bring out a real life experience encountered during actual implementation of a large scale Tier-1 Enterprise Resource Planning (ERP) system in a multi-location, discrete manufacturing organization in India, involved in manufacturing of auto components and aggregates. The business complexities, prior to the implementation of ERP, include multi-product with hierarchical product structures, geographically distributed multiple plant locations with disparate business practices, lack of inter-plant broadband connectivity, existence of disparate legacy applications for different business functions, and non-standardized codifications of products, machines, employees, and accounts apart from others. On the other hand, the manufacturing environment consisted of processes like Assemble-to-Order (ATO), Make-to-Stock (MTS), and Engineer-to-Order (ETO) with a mix of discrete and process operations. The paper has highlighted various business plan areas and concerns, prior to the implementation, with specific focus on strategic issues and objectives. Subsequently, it has dealt with the complete process of ERP implementation, starting from strategic planning, project planning, resource mobilization, and finally, the program execution. The step-by-step process provides a very good learning opportunity about the implementation methodology. At the end, various organizational challenges and lessons emerged, which will act as guidelines and checklist for organizations to successfully align and implement ERP and achieve their business objectives.

Keywords: ERP, ATO, MTS, ETO, discrete manufacturing, strategic planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1799
143 A Web and Cloud-Based Measurement System Analysis Tool for the Automotive Industry

Authors: C. A. Barros, Ana P. Barroso

Abstract:

Any industrial company needs to determine the amount of variation that exists within its measurement process and guarantee the reliability of their data, studying the performance of their measurement system, in terms of linearity, bias, repeatability and reproducibility and stability. This issue is critical for automotive industry suppliers, who are required to be certified by the 16949:2016 standard (replaces the ISO/TS 16949) of International Automotive Task Force, defining the requirements of a quality management system for companies in the automotive industry. Measurement System Analysis (MSA) is one of the mandatory tools. Frequently, the measurement system in companies is not connected to the equipment and do not incorporate the methods proposed by the Automotive Industry Action Group (AIAG). To address these constraints, an R&D project is in progress, whose objective is to develop a web and cloud-based MSA tool. This MSA tool incorporates Industry 4.0 concepts, such as, Internet of Things (IoT) protocols to assure the connection with the measuring equipment, cloud computing, artificial intelligence, statistical tools, and advanced mathematical algorithms. This paper presents the preliminary findings of the project. The web and cloud-based MSA tool is innovative because it implements all statistical tests proposed in the MSA-4 reference manual from AIAG as well as other emerging methods and techniques. As it is integrated with the measuring devices, it reduces the manual input of data and therefore the errors. The tool ensures traceability of all performed tests and can be used in quality laboratories and in the production lines. Besides, it monitors MSAs over time, allowing both the analysis of deviations from the variation of the measurements performed and the management of measurement equipment and calibrations. To develop the MSA tool a ten-step approach was implemented. Firstly, it was performed a benchmarking analysis of the current competitors and commercial solutions linked to MSA, concerning Industry 4.0 paradigm. Next, an analysis of the size of the target market for the MSA tool was done. Afterwards, data flow and traceability requirements were analysed in order to implement an IoT data network that interconnects with the equipment, preferably via wireless. The MSA web solution was designed under UI/UX principles and an API in python language was developed to perform the algorithms and the statistical analysis. Continuous validation of the tool by companies is being performed to assure real time management of the ‘big data’. The main results of this R&D project are: MSA Tool, web and cloud-based; Python API; New Algorithms to the market; and Style Guide of UI/UX of the tool. The MSA tool proposed adds value to the state of the art as it ensures an effective response to the new challenges of measurement systems, which are increasingly critical in production processes. Although the automotive industry has triggered the development of this innovative MSA tool, other industries would also benefit from it. Currently, companies from molds and plastics, chemical and food industry are already validating it.

Keywords: Automotive industry, Industry 4.0, internet of things, IATF 16949:2016, measurement system analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 991
142 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, ANOVA three-way test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: Stacking, multi-layers, ensemble, multi-class.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1092
141 A Software Framework for Predicting Oil-Palm Yield from Climate Data

Authors: Mohd. Noor Md. Sap, A. Majid Awan

Abstract:

Intelligent systems based on machine learning techniques, such as classification, clustering, are gaining wide spread popularity in real world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data for finding spatio-temporal patterns in climate data using kernel methods which offer strength to deal with complex data. This work gets inspiration from the notion that a non-linear data transformation into some high dimensional feature space increases the possibility of linear separability of the patterns in the transformed space. Therefore, it simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, for effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing various factors affecting the yield.

Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1978
140 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 683
139 Technological Innovation Capabilities and Firm Performance

Authors: Richard C.M. Yam, William Lo, Esther P.Y. Tang, Antonio, K.W. Lau

Abstract:

Technological innovation capability (TIC) is defined as a comprehensive set of characteristics of a firm that facilities and supports its technological innovation strategies. An audit to evaluate the TICs of a firm may trigger improvement in its future practices. Such an audit can be used by the firm for self assessment or third-party independent assessment to identify problems of its capability status. This paper attempts to develop such an auditing framework that can help to determine the subtle links between innovation capabilities and business performance; and to enable the auditor to determine whether good practice is in place. The seven TICs in this study include learning, R&D, resources allocation, manufacturing, marketing, organization and strategic planning capabilities. Empirical data was acquired through a survey study of 200 manufacturing firms in the Hong Kong/Pearl River Delta (HK/PRD) region. Structural equation modelling was employed to examine the relationships among TICs and various performance indicators: sales performance, innovation performance, product performance, and sales growth. The results revealed that different TICs have different impacts on different performance measures. Organization capability was found to have the most influential impact. Hong Kong manufacturers are now facing the challenge of high-mix-low-volume customer orders. In order to cope with this change, good capability in organizing different activities among various departments is critical to the success of a company.

Keywords: Hong Kong/Pearl River Delta, Innovationaudit, Manufacturing, Technological innovation capability

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3402
138 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani

Abstract:

Development of a method to estimate gene functions is an important task in bioinformatics. One of the approaches for the annotation is the identification of the metabolic pathway that genes are involved in. Since gene expression data reflect various intracellular phenomena, those data are considered to be related with genes’ functions. However, it has been difficult to estimate the gene function with high accuracy. It is considered that the low accuracy of the estimation is caused by the difficulty of accurately measuring a gene expression. Even though they are measured under the same condition, the gene expressions will vary usually. In this study, we proposed a feature extraction method focusing on the variability of gene expressions to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which is a method for calculating the similarity between distributions. Finally, we utilized the similarity vectors as feature vectors and trained the multiclass SVM for identifying the genes' metabolic pathway. To evaluate our developed method, we applied the method to budding yeast and trained the multiclass SVM for identifying the seven metabolic pathways. As a result, the accuracy that calculated by our developed method was higher than the one that calculated from the raw gene expression data. Thus, our developed method combined with KL divergence is useful for identifying the genes' metabolic pathway.

Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2335
137 The Challenges and Solutions for Developing Mobile Apps in a Small University

Authors: Greg Turner, Bin Lu, Cheer-Sun Yang

Abstract:

As computing technology advances, smartphone applications can assist student learning in a pervasive way. For example, the idea of using mobile apps for the PA Common Trees, Pests, Pathogens, in the field as a reference tool allows middle school students to learn about trees and associated pests/pathogens without bringing a textbook. While working on the development of three heterogeneous mobile apps, we ran into numerous challenges. Both the traditional waterfall model and the more modern agile methodologies failed in practice. The waterfall model emphasizes the planning of the duration for each phase. When the duration of each phase is not consistent with the availability of developers, the waterfall model cannot be employed. When applying Agile Methodologies, we cannot maintain the high frequency of the iterative development review process, known as ‘sprints’. In this paper, we discuss the challenges and solutions. We propose a hybrid model known as the Relay Race Methodology to reflect the concept of racing and relaying during the process of software development in practice. Based on the development project, we observe that the modeling of the relay race transition between any two phases is manifested naturally. Thus, we claim that the RRM model can provide a de fecto rather than a de jure basis for the core concept in the software development model. In this paper, the background of the project is introduced first. Then, the challenges are pointed out followed by our solutions. Finally, the experiences learned and the future works are presented.

Keywords: Agile methods, mobile apps, software process model, waterfall model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1600
136 Integration of Virtual Learning of Induction Machines for Undergraduates

Authors: Rajesh Kumar, Puneet Aggarwal

Abstract:

In context of understanding problems faced by undergraduate students while carrying out laboratory experiments dealing with high voltages, it was found that most of the students are hesitant to work directly on machine. The reason is that error in the circuitry might lead to deterioration of machine and laboratory instruments. So, it has become inevitable to include modern pedagogic techniques for undergraduate students, which would help them to first carry out experiment in virtual system and then to work on live circuit. Further advantages include that students can try out their intuitive ideas and perform in virtual environment, hence leading to new research and innovations. In this paper, virtual environment used is of MATLAB/Simulink for three-phase induction machines. The performance analysis of three-phase induction machine is carried out using virtual environment which includes Direct Current (DC) Test, No-Load Test, and Block Rotor Test along with speed torque characteristics for different rotor resistances and input voltage, respectively. Further, this paper carries out computer aided teaching of basic Voltage Source Inverter (VSI) drive circuitry. Hence, this paper gave undergraduates a clearer view of experiments performed on virtual machine (No-Load test, Block Rotor test and DC test, respectively). After successful implementation of basic tests, VSI circuitry is implemented, and related harmonic distortion (THD) and Fast Fourier Transform (FFT) of current and voltage waveform are studied.

Keywords: Block rotor test, DC test, no-load test, virtual environment, VSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 884
135 Computing Entropy for Ortholog Detection

Authors: Hsing-Kuo Pao, John Case

Abstract:

Biological sequences from different species are called or-thologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences -in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In prac-tice one can approximate it by computable compression methods. How-ever, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new ap-proach to overcome the problem that compression approximations may notwork well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empir-ical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and nega-tive (non-ortholog) data - better than with good, previously known alter-natives (which do not employ some means to handle short sequences well).Also empirically compared are the new entropy based attribute set and a number of other, more standard similarity attributes sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclu-sion: the new, entropy based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.

Keywords: compression, decision tree, entropy, ortholog, ROC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1825
134 Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction

Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota

Abstract:

Understanding the causes of a road accident and predicting their occurrence is key to prevent deaths and serious injuries from road accident events. Traditional statistical methods such as the Poisson and the Logistics regressions have been used to find the association of the traffic environmental factors with the accident occurred; recently, an artificial neural network, ANN, a computational technique that learns from historical data to make a more accurate prediction, has emerged. Although the ability to make accurate predictions, the ANN has difficulty dealing with highly unbalanced attribute patterns distribution in the training dataset; in such circumstances, the ANN treats the minority group as noise. However, in the real world data, the minority group is often the group of interest; e.g., in the road traffic accident data, the events of the accident are the group of interest. This study proposes a combination of the k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of the attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make a prediction on a highly unbalanced distributed attribute patterns dataset; however, on an even distributed attribute patterns dataset, the proposed method performs almost like a standard neural network. 

Keywords: Accident risks estimation, artificial neural network, deep learning, K-mean, road safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 968
133 Perception of Secondary Schools’ Students on Computer Education in Federal Capital Territory (FCT-Abuja), Nigeria

Authors: Salako Emmanuel Adekunle

Abstract:

Computer education is referred to as the knowledge and ability to use computers and related technology efficiently, with a range of skills covering levels from basic use to advance. Computer continues to make an ever-increasing impact on all aspect of human endeavours such as education. With numerous benefits of computer education, what are the insights of students on computer education? This study investigated the perception of senior secondary school students on computer education in Federal Capital Territory (FCT), Abuja, Nigeria. A sample of 7500 senior secondary schools students was involved in the study, one hundred (100) private and fifty (50) public schools within FCT. They were selected by using simple random sampling technique. A questionnaire [PSSSCEQ] was developed and validated through expert judgement and reliability coefficient of 0.84 was obtained. It was used to gather relevant data on computer education. Findings confirmed that the students in the FCT had positive perception on computer education. Some factors were identified that affect students’ perception on computer education. The null hypotheses were tested using t-test and ANOVA statistical analyses at 0.05 level of significance. Based on these findings, some recommendations were made which include competent teachers should be employed into all secondary schools. This will help students to acquire relevant knowledge in computer education, technological supports should be provided to all secondary schools; this will help the users (students) to solve specific problems in computer education and financial supports should be provided to procure computer facilities that will enhance the teaching and the learning of computer education.

Keywords: Computer education, perception, secondary school, students.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4061
132 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.

Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1215
131 The Estimation of Bird Diversity Loss and Gain as an Impact of Oil Palm Plantation: Study Case in KJNP Estate Riau Province

Authors: Yanto Santosa, Catharina Yudea

Abstract:

The rapid growth of oil palm industry in Indonesia raised many negative accusations from various parties, who said that oil palm plantation is damaging the environment and biodiversity, including birds. Since research on oil palm plantation impacts on bird diversity is still limited, this study needs to be developed in order to gain further learning and understanding. Data on bird diversity were collected in March 2018 in KJNP Estate, Riau Province using strip transect method on five different land cover types (young, intermediate, and old growth of oil palm plantation, high conservation value area, and crops field or the baseline). The observations were conducted simultaneously, with three repetitions. The result shows that the baseline has 19 species of birds and land cover after the oil palm plantation has 39 species. HCV (high conservation value) area has the highest increase in diversity value. Oil palm plantation has changed the composition of bird species. The highest similarity index is shown by young growth oil palm land cover with total score 0.65, meanwhile the lowest similarity index with total score 0.43 is shown by HCV area. Overall, the existence of oil palm plantation made a positive impact by increasing bird species diversity, with total 23 species gained and 3 species lost.

Keywords: Bird diversity, crops field, impact of oil palm plantation, KJNP estate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 793