Search results for: test data
8601 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering
Authors: Yogita, Durga Toshniwal
Abstract:
Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and new concepts may keep evolving. Irrelevant attributes can be termed noisy attributes, and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. The scheme is based on clustering, because clustering is an unsupervised data mining task that does not require labeled data; both density-based and partitioning clustering are combined for outlier detection. In this scheme, partitioning clustering is also used to assign weights to attributes according to their relevance, and the weights are adaptive. Weighted attributes help reduce or remove the effect of noisy attributes. In view of the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms an existing approach (CORM) in terms of outlier detection rate and false alarm rate, including under increasing percentages of outliers.
Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.
Downloads 2643
8600 Application of Artificial Intelligence to Schedule Operability of Waterfront Facilities in Macro Tide Dominated Wide Estuarine Harbour
Authors: A. Basu, A. A. Purohit, M. M. Vaidya, M. D. Kudale
Abstract:
Mumbai has traditionally been the epicenter of India's trade and commerce, and the existing major ports situated in the Thane estuary, Mumbai Port and Jawaharlal Nehru (JN) Port, are developing their waterfront facilities. Developments in this region over the decades have changed the tidal flux entering and leaving the estuary. The intake at Pir-Pau faces a shortage of water because of the advancement of the shoreline, while the jetty near Ulwe faces a ship-scheduling problem because of shallower depths between JN Port and Ulwe Bunder. Solving these problems requires information about tide levels over a long duration, normally obtained by field measurements. However, field measurement is tedious and costly, so artificial intelligence was applied to predict water levels by training a network on the tide data measured over one lunar tidal cycle. A two-layered feed-forward Artificial Neural Network (ANN) with back-propagation training algorithms, namely Gradient Descent (GD) and Levenberg-Marquardt (LM), was used to predict the yearly tide levels at the waterfront structures at Ulwe Bunder and Pir-Pau. The tide data collected at Apollo Bunder, Ulwe, and Vashi over a lunar tidal cycle (2013) was used to train, validate and test the neural networks. These trained networks, which have high correlation coefficients (R = 0.998), were used to predict the tide at Ulwe and Vashi for verification against the measured tide for the years 2000 and 2013. The results indicate that the tide levels predicted by the ANN give a reasonably accurate estimation of the tide. Hence, the trained network is used to predict the yearly tide data (2015) for Ulwe. Subsequently, the yearly tide data (2015) at Pir-Pau was predicted using the neural network trained with the measured tide data (2000) of Apollo and Pir-Pau.
The analysis of the measured data and the study reveal the following. The measured tidal data at Pir-Pau, Vashi and Ulwe indicate a maximum amplification of the tide of about 10-20 cm, with a phase lag of 10-20 minutes, with reference to the tide at Apollo Bunder (Mumbai). The LM training algorithm is faster than GD, and the performance of the network increases with the number of neurons in the hidden layer. The tide levels predicted by the ANN at Pir-Pau and Ulwe provide valuable information about the occurrence of high and low water levels, which can be used to plan pumping operations at Pir-Pau and improve ship scheduling at Ulwe.
Keywords: Artificial neural network, back-propagation, tide data, training algorithm.
Downloads 1720
8599 The Effect of Measurement Distribution on System Identification and Detection of Behavior of Nonlinearities of Data
Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri
Abstract:
In this paper, we apply parametric modeling to experimental data from dynamical systems. We investigate different distributions of the output measurements from several dynamical systems. By processing the variance of the experimental data, we obtain the region of nonlinearity in the data, and identification of the output is then applied under different conditions and data distributions. Finally, the effect of the spread of the measurements, such as their variance, on identification is explained, along with the limitations of this approach.
Keywords: Gaussian process, Nonlinearity distribution, Particle filter.
Downloads 1728
8598 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis
Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel
Abstract:
Because only a small number of artificial immune system (AIS) approaches can handle nonlinear problems, nonlinear AIS approaches need to be developed alongside the well-known solution techniques. The Gaussian function is commonly used for similarity estimation in classification and pattern recognition problems. In this study, diagnosis of breast cancer, the second most widespread cancer in women, was performed with three distance calculation functions (Euclidean, Gaussian, and a Gaussian-Euclidean hybrid) in the clonal selection model of classical AIS on the Wisconsin Breast Cancer Dataset (WBCD), taken from the University of California, Irvine Machine Learning Repository. We used 3-fold cross-validation to train and test on the dataset. The maximum test classification accuracy, 97.35%, was obtained with the Gaussian-Euclidean hybrid function on fold 3. The mean test classification accuracies were 94.78%, 94.45% and 95.31% for the Euclidean, Gaussian and Gaussian-Euclidean functions, respectively. With these results, the Gaussian-Euclidean hybrid function appears to be a promising distance calculation method and may be considered as an alternative for hard nonlinear classification problems.
Keywords: Artificial Immune System, Breast Cancer Diagnosis, Euclidean Function, Gaussian Function.
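The 3-fold cross-validation mentioned in the abstract can be sketched as an index split. This is only an illustration, not the authors' code: the function name `k_fold_indices`, the contiguous unshuffled folds, and the sample count in the usage note are assumptions, and the AIS classifier itself is omitted.

```python
def k_fold_indices(n_samples, k=3):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper: folds are contiguous and unshuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracies gives one mean figure per distance function, as reported in the abstract. For example, `list(k_fold_indices(10, 3))` produces three splits whose test folds together cover all ten indices.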
Downloads 2126
8597 A Case Study of Key-Dependent Permutations in Feistel Ciphers
Authors: Hani Almimi, Ola Osabi, Azman Samsudin
Abstract:
Many attempts have been made to strengthen Feistel-based block ciphers. Among the successful proposals is the key-dependent S-box, which was implemented in some high-profile ciphers. In this paper, a key-dependent permutation box is proposed and implemented on DES as a case study. The modified DES, MDES, was evaluated with the Diehard tests, an avalanche test, and a performance test. The results showed that, in general, MDES is more resistant to attacks than DES, with negligible overhead. It is therefore believed that the proposed key-dependent permutation should be considered a valuable primitive that can help strengthen the security of the Substitution-Permutation Network, which is a core design in many Feistel-based block ciphers.
Keywords: Block Cipher, Feistel Structure, DES, Diehard Tests, Avalanche Effect.
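An avalanche test of the kind named in the abstract flips one input bit at a time and counts how many output bits change. The sketch below is a generic illustration under stated assumptions: SHA-256 from the Python standard library stands in for the cipher (the standard library has no DES, and MDES is not reproduced here), and the names `bit_diff`, `avalanche_ratio` and `sha256_digest` are hypothetical.

```python
import hashlib


def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_ratio(func, data: bytes) -> float:
    """Average fraction of output bits that change when one input bit is flipped.

    func is any bytes -> bytes mapping with a fixed output length.
    """
    base = func(data)
    out_bits = len(base) * 8
    n_flips = len(data) * 8
    changed = 0
    for i in range(n_flips):
        flipped = bytearray(data)
        flipped[i // 8] ^= 1 << (i % 8)  # flip exactly one input bit
        changed += bit_diff(base, func(bytes(flipped)))
    return changed / (n_flips * out_bits)


def sha256_digest(data: bytes) -> bytes:
    """Stand-in primitive (assumption): SHA-256 instead of a block cipher."""
    return hashlib.sha256(data).digest()
```

A ratio close to 0.5 indicates a strong avalanche effect, while a function that merely copies its input scores far lower. Calling `avalanche_ratio(sha256_digest, b"testdata")` gives a value near one half.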
Downloads 2022
8596 Exponentially Weighted Simultaneous Estimation of Several Quantiles
Authors: Valeriy Naumov, Olli Martikainen
Abstract:
In this paper, we propose a new method for simultaneously generating multiple quantiles corresponding to given probability levels from data streams and massive data sets. This method provides a basis for the development of single-pass, low-storage quantile estimation algorithms that differ in complexity, storage requirement and accuracy. We demonstrate that such algorithms may perform well even for heavy-tailed data.
Keywords: Quantile estimation, data stream, heavy-tailed distribution, tail index.
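To illustrate what a single-pass, low-storage quantile estimator looks like, the sketch below uses a generic constant-step stochastic-approximation update, in which older observations are forgotten at an exponential rate. It is not the authors' algorithm, whose details the abstract does not give; the function names, the step size and the uniform test stream are assumptions.

```python
import random


def update_quantiles(estimates, probs, x, step):
    """Apply one constant-step update to every quantile estimate for observation x.

    Each estimate moves up by step * p when x lies above it and down by
    step * (1 - p) otherwise, so it settles where P(X <= q) is close to p.
    """
    return [q + step * (p - (1.0 if x <= q else 0.0)) for q, p in zip(estimates, probs)]


def stream_quantiles(stream, probs, step=0.01, init=0.0):
    """Track several quantiles in one pass, storing one number per quantile."""
    estimates = [init] * len(probs)
    for x in stream:
        estimates = update_quantiles(estimates, probs, x, step)
    return estimates


rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.random() for _ in range(20000)]  # uniform(0, 1) stream (assumption)
median_est, p90_est = stream_quantiles(data, [0.5, 0.9], step=0.005, init=0.5)
print(median_est, p90_est)  # estimates should land near 0.5 and 0.9
```

A smaller step gives smoother estimates but slower adaptation; this simple scheme does not enforce that the estimates stay ordered, which a production estimator would need to handle.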
Downloads 1539
8595 Septic B-spline Collocation Method for Solving One-dimensional Hyperbolic Telegraph Equation
Authors: Marzieh Dosti, Alireza Nazemi
Abstract:
Recently, it has been found that the telegraph equation is more suitable than the ordinary diffusion equation for modelling reaction-diffusion in several branches of science. In this paper, a numerical solution of the one-dimensional hyperbolic telegraph equation is proposed, using a collocation method with septic splines. The scheme works in a similar fashion to finite difference methods. Two test problems are used to validate the scheme and demonstrate its accuracy by calculating the L2-norm and L∞-norm of the error. The numerical results are found to be in good agreement with the exact solutions.
Keywords: B-spline, collocation method, second-order hyperbolic telegraph equation, difference schemes.
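The error measures named in the abstract can be computed as follows. This is a minimal sketch assuming a uniform grid with spacing `h` and the common discrete definition of the L2-norm, sqrt(h * sum(e_i^2)); the paper may normalize differently, and the function name `error_norms` is hypothetical.

```python
import math


def error_norms(numerical, exact, h):
    """Return (L2, Linf) error norms between numerical and exact grid values.

    Assumes both sequences are sampled on the same uniform grid of spacing h.
    """
    if len(numerical) != len(exact) or not numerical:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(u - v) for u, v in zip(numerical, exact)]
    l2 = math.sqrt(h * sum(e * e for e in errors))  # discrete L2-norm
    linf = max(errors)                              # largest pointwise error
    return l2, linf
```

For instance, `error_norms([1, 2, 3], [1, 2, 5], 1.0)` returns 2 for both norms, since the only error is a single pointwise difference of 2.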
Downloads 1808
8594 Induction Motor Design with Limited Harmonic Currents Using Particle Swarm Optimization
Authors: C. Thanga Raj, S. P. Srivastava, Pramod Agarwal
Abstract:
This paper presents an optimal design of a poly-phase induction motor using Quadratic Interpolation based Particle Swarm Optimization (QI-PSO). The optimization algorithm takes efficiency, starting torque and temperature rise as objective functions (each considered separately) and ten performance-related items, including harmonic current, as constraints. The QI-PSO algorithm was implemented on a test motor, and the results are compared with the Simulated Annealing (SA) technique, Standard Particle Swarm Optimization (SPSO), and the normal design. Some benchmark problems are used to validate QI-PSO. In the test results, QI-PSO gave better results and proved more suitable for motor design optimization. C++ code is used to implement all the algorithms.
Keywords: Design, harmonics, induction motor, particle swarm optimization
Downloads 1799
8593 The Relationship between Competency-Based Learning and Learning Efficiency of Media Communication Students at Suan Sunandha Rajabhat University
Authors: Somtop Keawchuer
Abstract:
This research aims to study (1) the relationship between competency-based learning and the learning efficiency of new media communication students at Suan Sunandha Rajabhat University and (2) the effect of demographic factors on the learning efficiency of students at Suan Sunandha Rajabhat University. The research is quantitative; data was collected by questionnaires distributed to new media communication students in the management science faculty of Suan Sunandha Rajabhat University, with a sample of 1340 selected by purposive sampling. Data was analyzed using descriptive statistics, including percentage, mean and standard deviation, and inferential statistics, including the t-test, ANOVA and Pearson correlation, for hypothesis testing. The results showed that competency-based learning, in terms of ability to communicate, ability to think and solve problems, life skills and ability to use technology, has a significant relationship with learning efficiency in terms of the cognitive, psychomotor and affective domains at the 0.05 level, which is consistent with the research hypotheses.
Keywords: Competency-based learning, learning efficiency, new media communication students, Suan Sunandha Rajabhat University.
Downloads 1185
8592 Multi-Agent System for Irrigation Using Fuzzy Logic Algorithm and Open Platform Communication Data Access
Authors: T. Wanyama, B. Far
Abstract:
Automatic irrigation systems are a convenient way to protect landscape investment. While conventional irrigation systems are known to be inefficient, automated ones have the potential to optimize water usage. In fact, there is a new generation of irrigation systems that are smart in the sense that they monitor the weather, soil conditions, evaporation and plant water use, and automatically adjust the irrigation schedule. In this paper, we present an agent-based smart irrigation system. The agents are built using a mix of commercial off-the-shelf software, including MATLAB, Microsoft Excel and the KEPServer Ex5 OPC server, and custom-written code. The Irrigation Scheduler Agent uses fuzzy logic to integrate the information that affects the irrigation schedule. In addition, the multi-agent system uses Open Platform Connectivity (OPC) technology to share data. OPC technology enables the Irrigation Scheduler Agent to communicate over the Internet, making the system scalable to a municipal or regional agent-based water monitoring, management, and optimization system. Finally, this paper presents simulation and pilot installation test results that show the operational effectiveness of our system.
Keywords: Community water usage, fuzzy logic, irrigation, multi-agent system.
Downloads 1346
8591 Dichotomous Logistic Regression with Leave-One-Out Validation
Authors: Sin Yin Teh, Abdul Rahman Othman, Michael Boon Chong Khoo
Abstract:
In this paper, the concepts of dichotomous logistic regression (DLR) with leave-one-out (L-O-O) validation are discussed. To illustrate them, L-O-O was run to determine the importance of the simulation conditions for robust test-of-spread procedures with good Type I error rates. The resultant model was then evaluated. The discussion covers 1) assessment of the accuracy of the model and 2) the parameter estimates. These are presented and illustrated by modeling the relationship between the dichotomous dependent variable (Type I error rates) and a set of independent variables (the simulation conditions). The base SAS software, containing PROC LOGISTIC and DATA step functions, can be used to carry out the DLR analysis.
Keywords: Dichotomous logistic regression, leave-one-out, test of spread.
Downloads 2081
8590 Enhanced Data Access Control of Cooperative Environment used for DMU Based Design
Authors: Wei Lifan, Zhang Huaiyu, Yang Yunbin, Li Jia
Abstract:
An analysis of the digital design process based on the digital mockup (DMU) indicates that a distributed cooperative supporting environment is the foundation for adopting a DMU-based design approach. Data access authorization is the primary concern because of the value and sensitivity of the data to the enterprise. Access control for administrators is often weaker than that for business users. The authors therefore established an enhanced system that prevents administrators from accessing the engineering data through indirect means and without authorization. Data security is thus improved.
Keywords: access control, DMU, PLM, virtual prototype.
Downloads 1471
8589 Component Comparison of Polyaluminum Chloride Produced from Various Methods
Authors: Wen Po Cheng, Chia Yun Chung, Ruey Fang Yu, Chao Feng Chen
Abstract:
The main objective of this research was to study the differences of aluminum hydrolytic products between two PACl preparation methods. These two methods were the acidification process of freshly formed amorphous Al(OH)3 and the conventional alkalization process of aluminum chloride solution. According to Ferron test and 27Al NMR analysis of those two PACl preparation procedures, the reaction rate constant (k) values and Al13 percentage of acid addition process at high basicity value were both lower than those values of the alkaline addition process. The results showed that the molecular structure and size distribution of the aluminum species in both preparing methods were suspected to be significantly different at high basicity value.
Keywords: Polyaluminum chloride, Al13, amorphous aluminum hydroxide, Ferron test.
Downloads 1514
8588 Speed Characteristics of Mixed Traffic Flow on Urban Arterials
Authors: Ashish Dhamaniya, Satish Chandra
Abstract:
Speed and traffic volume data were collected on different sections of four-lane and six-lane roads in three metropolitan cities in India. The speed data are analyzed to fit statistical distributions to the speed data of individual vehicle types and of all vehicles combined. It is noted that the speed data of an individual vehicle type generally follow a normal distribution, but the speed data of all vehicles combined at a section of urban road may or may not follow the normal distribution, depending upon the composition of the traffic stream. This paper introduces a new term, the Speed Spread Ratio (SSR), which is the ratio of the difference between the 85th and 50th percentile speeds to the difference between the 50th and 15th percentile speeds. If SSR is unity, the speed data are truly normally distributed. It is noted that on six-lane urban roads, speed data follow a normal distribution only when SSR is in the range 0.86 to 1.11. This range of SSR is also validated on four-lane roads.
Keywords: Normal distribution, percentile speed, speed spread ratio, traffic volume.
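The Speed Spread Ratio defined in the abstract, SSR = (V85 - V50) / (V50 - V15), can be computed directly from spot speeds. The sketch below is an illustration only: the linear-interpolation percentile rule and the function names are assumptions, while the 0.86 to 1.11 range is the one the abstract reports.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (p / 100.0) * (len(ordered) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (rank - lo) * (ordered[hi] - ordered[lo])


def speed_spread_ratio(speeds):
    """SSR = (V85 - V50) / (V50 - V15), following the definition in the abstract."""
    v15, v50, v85 = (percentile(speeds, p) for p in (15, 50, 85))
    if v50 == v15:
        raise ValueError("15th and 50th percentile speeds coincide; SSR is undefined")
    return (v85 - v50) / (v50 - v15)


def within_reported_normal_range(ssr, low=0.86, high=1.11):
    """True when SSR falls in the range the abstract associates with normality."""
    return low <= ssr <= high
```

A symmetric sample such as the speeds 40, 41, ..., 60 km/h gives an SSR of 1, whereas a sample with a long tail of fast vehicles gives an SSR well above 1.11.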
Downloads 4253
8587 An Accurate Method for Phylogeny Tree Reconstruction Based on a Modified Wild Dog Algorithm
Authors: Essam Al Daoud
Abstract:
This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.
Keywords: Least squares, neighbor joining, phylogenetic tree, wild dog pack.
Downloads 1400
8586 Horizontal Directivity of Pipa Radiation
Authors: Xin Wang, Yuanzhong Wang
Abstract:
The Pipa is one of the most important traditional Chinese plucked instruments, but its directivity has never been measured systematically. In the West, the directivity of loudness of Western instruments has been researched in depth through analysis of sound pressure level, whereas the directivity of timbre is seldom studied. In this paper, a new method for measuring the directivity of timbre is proposed, and horizontal directivity patterns of the loudness and timbre of the Pipa are measured. The directivity of Pipa radiation was measured in an anechoic room. The sound of a Pipa played by a musician was recorded simultaneously by 32 microphones with the Pipa in the center. The measurement results were examined through a listening test. Based on the measured directivity of Pipa radiation, we propose the best position for the Pipa in the traditional Chinese orchestra and the optimal recording region.
Keywords: Directivity, Pipa, Roughness, Listening test.
Downloads 1771
8585 Comparative Study of Transformed and Concealed Data in Experimental Designs and Analyses
Authors: K. Chinda, P. Luangpaiboon
Abstract:
This paper presents a comparative study of coded data methods to assess the benefit of concealing natural data that is a trade secret. The influence of the number of replicates (rep), the treatment effects (τ) and the standard deviation (σ) on the efficiency of each transformation method is investigated. The experimental data are generated via computer simulations under specified process conditions with a completely randomized design (CRD). Three data transformations are considered: the Box-Cox, arcsine and logit methods. The differences in the F statistic between coded data and natural data (Fc-Fn) and the hypothesis testing results were determined. The experimental results indicate that the Box-Cox results are significantly different from those for natural data at smaller numbers of replicates and seem to be improper when a negative lambda is assigned. On the other hand, the arcsine and logit transformations are more robust and provide more precise numerical results. In addition, alternative ways of selecting lambda in the power transformation are offered to achieve more appropriate outcomes.
Keywords: Experimental Designs, Box-Cox, Arcsine, Logit Transformations.
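The three transformations compared in the abstract have standard textbook forms, sketched below. The function names are hypothetical, and the simulation design, the lambda selection and the F-statistic comparison from the paper are not reproduced.

```python
import math


def box_cox(x, lam):
    """Box-Cox power transformation: (x**lam - 1) / lam, or ln(x) when lam is 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def arcsine_transform(p):
    """Arcsine square-root transformation for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def logit(p):
    """Logit transformation ln(p / (1 - p)) for a proportion p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))
```

Box-Cox applies to any positive response, whereas the arcsine and logit forms expect proportions, so data on other scales would first need to be rescaled before those two are used.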
Downloads 1627
8584 Power Transformer Noise, Noise Tests, and Example Test Results
Authors: E. Doğan, B. Kekezoğlu
Abstract:
The voltage level must be raised in order to deliver the generated energy to the consumption zones with lower losses and at lower cost. Power transformers, which are used to raise or lower voltage, are important parts of the energy transmission system. As cities expand, power transformers used in switchgear and power generation plants are increasingly located in densely inhabited zones. Accordingly, the noise levels produced by power transformers have become more and more important and have established themselves as a research field. In this research, transformer noise has been investigated, its causes have been examined, and noise measurement techniques have been introduced. Examples of transformer noise test results are presented, and precautions for reducing the noise produced by transformers are discussed.
Keywords: Power transformer, noise measurement, core noise, load noise, fan-pump noise.
Downloads 5703
8583 Design of a Low Cost Motion Data Acquisition Setup for Mechatronic Systems
Authors: Barış Can Yalçın
Abstract:
Motion sensors are commonly used as a valuable component in mechatronic systems. However, many mechatronic designs and applications that need motion sensors are very expensive, especially high-tech systems. Designing software for the communication protocol between the data acquisition card and the motion sensor is another issue that has to be solved. This study presents how to design a low-cost motion data acquisition setup consisting of an MPU 6050 motion sensor (gyro and accelerometer in 3 axes) and an Arduino Mega2560 microcontroller. The design parameters are calibration of the sensor, identification and communication between the sensor and the data acquisition card, and interpretation of the data collected by the sensor.
Keywords: Calibration of sensors, data acquisition.
Downloads 4343
8582 Estimation Model for Concrete Slump Recovery by Using Superplasticizer
Authors: Chaiyakrit Raoupatham, Ram Hari Dhakal, Chalermchai Wanichlamlert
Abstract:
This paper introduces to practice a solution for concrete slump recovery using a type-F chemical admixture (naphthalene-based superplasticizer), in order to solve the problem of concrete that becomes unusable because it has lost its slump, especially in tropical countries, where the slump loss rate is faster. On the other hand, adding superplasticizer arbitrarily can cause the concrete to segregate. Therefore, this paper also develops an estimation model for calculating the amount of the second dose of superplasticizer needed for slump recovery. The fresh properties of ordinary Portland cement concrete with a volumetric ratio of paste to voids between aggregate (paste content) of 1.1-1.3, a water-cement ratio of 0.30 to 0.67, and an initial superplasticizer (naphthalene-based) dose of 0.25%-1.6% were tested for initial slump and for slump loss every 30 minutes over one and a half hours using the slump cone test. Concretes with slump loss ranging from 10% to 90% were re-dosed and successfully recovered to their initial slump. Slump after re-dosing was measured by the slump cone test. From the results, it was concluded that slump loss was slower for mixes with a high initial dose of superplasticizer, because the added superplasticizer disturbs cement hydration. The required second dose of superplasticizer was affected by two major parameters, the water-cement ratio and the paste content: a lower water-cement ratio and a lower paste content increase the required second dose. The amount of the second dose is higher when the solid content within the system increases; the solids can be either cement particles or aggregate. The data were analyzed to form an equation for estimating the second dose of superplasticizer required to recover the slump to its original value.
Keywords: Estimation model, second superplasticizer dosage, slump loss, slump recovery.
Downloads 1925
8581 Conceptual Multidimensional Model
Authors: Manpreet Singh, Parvinder Singh, Suman
Abstract:
Data is available in abundance in any business organization. It includes records for finance, maintenance, inventory, progress reports, etc. As time progresses, the data keeps accumulating, and the challenge is to extract information from this data bank. Knowledge discovery from these large and complex databases is the key problem of this era. Data mining and machine learning techniques are needed that can scale to the size of the problems and can be customized to the business application. To develop accurate and relevant information for a particular problem, business analysts need to develop multidimensional models that give reliable information, so that they can take the right decision for that problem. If the multidimensional model does not possess advanced features, accuracy cannot be expected. The present work involves the development of a multidimensional data model incorporating advanced features. The computation criterion is based on data precision and includes a slowly changing time dimension. The final results are displayed in graphical form.
Keywords: Multidimensional, data precision.
Downloads 1465
8580 Real Time Approach for Data Placement in Wireless Sensor Networks
Authors: Sanjeev Gupta, Mayank Dave
Abstract:
The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.
Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.
Downloads 1819
8579 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine
Authors: Djamila Benhaddouche, Abdelkader Benyettou
Abstract:
In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated by statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. For this purpose, we used two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.
Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.
Downloads 1874
8578 A Software Framework for Predicting Oil-Palm Yield from Climate Data
Authors: Mohd. Noor Md. Sap, A. Majid Awan
Abstract:
Intelligent systems based on machine learning techniques, such as classification and clustering, are gaining widespread popularity in real-world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data to find spatio-temporal patterns in climate data using kernel methods, which are well suited to complex data. This work is inspired by the notion that a non-linear transformation of the data into a high-dimensional feature space increases the possibility of linear separability of the patterns in the transformed space, and therefore simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high-dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper, we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, enabling effective and efficient data analysis by exploring patterns and structures in the data, and can thus be used for predicting oil-palm yield by analyzing various factors affecting the yield.
Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield
Downloads 1984
8577 CSR of top Portuguese Companies: Relation between Social Performance and Economic Performance
Authors: Afonso, S. C., Fernandes, P. O., Monte, A. P.
Abstract:
Modern times call on organizations to play an active role in the social arena through Corporate Social Responsibility (CSR). The objective of this research was to test the hypothesis that there is a positive relation between social performance and economic performance, and whether there is a positive correlation between social performance and financial-economic performance. To test these hypotheses, a measure of social performance based on the Green Book of the Commission of the European Community was applied to a group of nineteen top Portuguese companies listed on the PSI 20 index, over a period of five years, from 2005 to 2009. A cluster analysis was applied to group companies by their social performance and to compare and correlate their economic performance. The results indicate that the companies with better social performance are not the ones with better economic performance, and suggest that the middle path might provide a good relation between CSR and economic performance, as a basis for sustainable development.
Keywords: Corporate Social Responsibility, Economic Performance, Win-Win relationship
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24258576 Morphemic Analysis Awareness: A Boon or Bane on ESL Students’ Vocabulary Learning Strategy
Authors: Chandrakala Varatharajoo, Adelina Binti Asmawi, Nabeel Abdallah Mohammad Abedalaziz
Abstract:
This study investigated the impact of inflectional and derivational morphemic analysis awareness on ESL secondary school students’ vocabulary learning strategy. The quasi-experimental study was conducted with 106 low-proficiency secondary school students in two experimental groups (inflectional and derivational) and one control group. The students’ vocabulary acquisition was assessed through two measures, the Morphemic Analysis Test and the Vocabulary-Morphemic Test, administered as a pretest and posttest before and after an intervention programme. Results of ANCOVA revealed that both experimental groups achieved significant scores in the Morphemic Analysis Test and the Vocabulary-Morphemic Test. However, the inflectional group obtained a somewhat higher score than the derivational group. Thus, the results indicated that low-proficiency ESL secondary school students performed better on inflectional morphemic awareness than on derivational morphemic awareness. The results also showed that awareness of inflectional morphology contributed more to vocabulary acquisition. Importantly, learning inflectional morphology can help low-proficiency ESL secondary school students to develop both morphemic awareness and vocabulary gain. Theoretically, these findings show that not all morphemes are equally useful to students for their language development. Practically, these findings indicate that morphological instruction should at least be included in remediation and instructional efforts with struggling learners across all grade levels, allowing them to focus on meaning within the word before they attempt the text at large for better comprehension. Methodologically, by conducting individualized intervention and assessment, this study provided fresh empirical evidence to support the existing literature on morphemic analysis awareness and vocabulary learning strategy.
Thus, a major pedagogical implication of the study is that morphemic analysis awareness strategy is a definite boon for ESL secondary school students in learning English vocabulary.
Keywords: ESL, instruction, morphemic analysis, vocabulary.
Downloads: 2918
8575 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map
Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo
Abstract:
Recently, technologies based on three-dimensional (3D) space information have been developed, and quality of life is improving as a result. Research on the real-time digital map (RDM) is now being conducted to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. RDM is expected to be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.
Keywords: RDM, multi-source data, big data, U-City.
Downloads: 814
8574 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-
Authors: Nieto Bernal Wilson, Carmona Suarez Edgar
Abstract:
Organizations hold structured and unstructured information in different formats, sources, and systems. Part of this information comes from ERP systems under OLTP processing that support the information system. At the OLAP processing level, however, these organizations show some deficiencies. Part of the problem lies in a lack of interest in extracting knowledge from their data sources, as well as in the absence of the operational capabilities needed to tackle this kind of project. Data warehouses and their applications are considered non-proprietary tools of great interest to business intelligence, since they are the base repositories for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. This paper presents a simple, structured methodology inspired by agile development models such as Scrum, XP and AUP, as well as by object-relational models, spatial data models, and the baseline of data modeling under UML and big data. In this way it seeks to deliver an agile methodology for developing data warehouses that is simple and easy to apply. The methodology naturally takes into account the application of processes for the respective information analysis, visualization and data mining, particularly for pattern generation and for models derived from the structured fact objects.
Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.
Downloads: 1483
8573 An Effort at Improving Reliability of Laboratory Data in Titrimetric Analysis for Zinc Sulphate Tablets Using Validated Spreadsheet Calculators
Authors: M. A. Okezue, K. L. Clase, S. R. Byrn
Abstract:
Maintaining data integrity in laboratory operations is critical for regulatory compliance. Automation of procedures reduces the incidence of human errors. Quality control laboratories located in low-income economies may face some barriers in attempts to automate their processes. Since data from quality control tests on pharmaceutical products are used in making regulatory decisions, it is important that laboratory reports are accurate and reliable. Zinc Sulphate (ZnSO4) tablets are used in the treatment of diarrhea in the pediatric population, and as an adjunct therapy in COVID-19 regimens. Unfortunately, zinc content in these formulations is determined titrimetrically, a manual analytical procedure. The assay for ZnSO4 tablets involves time-consuming steps that contain mathematical formulae prone to calculation errors. To achieve consistency, save costs, and improve data integrity, validated spreadsheets were developed to simplify the two critical steps in the analysis of ZnSO4 tablets: standardization of 0.1 M Sodium Edetate (EDTA) solution, and the complexometric titration assay procedure. The assay method in the United States Pharmacopoeia was used to create a process flow for ZnSO4 tablets. For each step in the process, different formulae were input into two spreadsheets to automate calculations. Further checks were created within the automated system to ensure validity of replicate analysis in titrimetric procedures. Validations were conducted using five data sets of manually computed assay results. The acceptance criteria set for the protocol were met. Significant p-values (p < 0.05, α = 0.05, at 95% Confidence Interval) were obtained from Student’s t-test evaluation of the mean values for manually calculated and spreadsheet results at all levels of the analysis flow. Right-first-time analysis and principles of data integrity were enhanced by use of the validated spreadsheet calculators in titrimetric evaluations of ZnSO4 tablets.
Human errors were minimized in calculations when procedures were automated in quality control laboratories. The assay procedure for the formulation was achieved in a time-efficient manner with greater level of accuracy. This project is expected to promote cost savings for laboratory business models.
Keywords: Data integrity, spreadsheets, titrimetry, validation, zinc sulphate tablets.
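The two calculation steps named in this abstract, standardization of the EDTA titrant and the complexometric assay, can be sketched as follows. This is an illustrative stoichiometric calculation, not the authors' spreadsheet and not the pharmacopoeial procedure: it assumes a 1:1 metal-to-EDTA reaction, reports results as anhydrous ZnSO4 (molar mass taken as 161.44 g/mol), and uses a hypothetical 2% relative standard deviation limit for the replicate check, since the abstract does not state the actual acceptance criteria. All function names are invented for the example.

```python
from statistics import mean, stdev

# Assumed reporting basis: anhydrous zinc sulphate, g/mol.
ZNSO4_MOLAR_MASS = 161.44


def edta_molarity(standard_mass_mg, standard_molar_mass, titre_ml):
    """Molarity of the EDTA titrant from a primary standard, assuming 1:1 complexation.

    mg / (g/mol) gives mmol of standard; mmol / mL gives mol/L.
    """
    return standard_mass_mg / (standard_molar_mass * titre_ml)


def replicates_agree(values, max_rsd_percent=2.0):
    """Hypothetical validity check: relative standard deviation within a limit."""
    return 100.0 * stdev(values) / mean(values) <= max_rsd_percent


def znso4_mg(titre_ml, blank_ml, molarity):
    """Mass of ZnSO4 (mg) in the titrated sample, assuming 1:1 Zn:EDTA."""
    return (titre_ml - blank_ml) * molarity * ZNSO4_MOLAR_MASS


def percent_of_label_claim(found_mg, label_claim_mg):
    """Assay result expressed as a percentage of the labelled amount."""
    return 100.0 * found_mg / label_claim_mg


if __name__ == "__main__":
    # Made-up replicate titres (mL) for illustration only.
    titres = [10.00, 10.05, 9.95]
    if replicates_agree(titres):
        found = znso4_mg(mean(titres), blank_ml=0.0, molarity=0.1)
        print(round(found, 2), "mg ZnSO4 in the titrated sample")
```

Encoding each formula once as a function mirrors the purpose the abstract gives for the spreadsheets: the arithmetic is fixed and checked in one place, so analysts enter only the measured masses and titres.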
Downloads: 528
8572 Distributed Data-Mining by Probability-Based Patterns
Authors: M. Kargar, F. Gharbalchi
Abstract:
In this paper, a new method is suggested for distributed data-mining using probability-based patterns. These patterns use decision trees and decision graphs. Care is taken that the patterns are valid, novel, useful, and understandable. By considering a set of functions, the system arrives at a good pattern or at better objectives. Using the suggested method, useful information can be extracted from massive and multi-relational databases.
Keywords: Data-mining, Decision tree, Decision graph, Pattern, Relationship.
Downloads: 1562