Search results for: Big data
6583 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques
Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian
Abstract:
Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12216582 Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models
Authors: Yoonsuh Jung
Abstract:
As DNA microarray data contain relatively small sample size compared to the number of genes, high dimensional models are often employed. In high dimensional models, the selection of tuning parameter (or, penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for the tuning parameter selection, which selects a parameter value with the smallest cross-validated score. However, selecting a single value as an ‘optimal’ value for the parameter can be very unstable due to the sampling variation since the sample sizes of microarray data are often small. Our approach is to choose multiple candidates of tuning parameter first, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates rarely increase the computational cost, while it can considerably improve the traditional cross-validation. We show that the selected value from the suggested methods often lead to stable parameter selection as well as improved detection of significant genetic variables compared to the tradition cross-validation via real data and simulated data sets.Keywords: Cross Validation, Parameter Averaging, Parameter Selection, Regularization Parameter Search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15746581 Using ALOHA Code to Evaluate CO2 Concentration for Maanshan Nuclear Power Plant
Authors: W. S. Hsu, S. W. Chen, Y. T. Ku, Y. Chiang, J. R. Wang , J. H. Yang, C. Shih
Abstract:
ALOHA code was used to calculate the concentration under the CO2 storage burst condition for Maanshan nuclear power plant (NPP) in this study. Five main data are input into ALOHA code including location, building, chemical, atmospheric, and source data. The data from Final Safety Analysis Report (FSAR) and some reports were used in this study. The ALOHA results are compared with the failure criteria of R.G. 1.78 to confirm the habitability of control room. The result of comparison presents that the ALOHA result is below the R.G. 1.78 criteria. This implies that the habitability of control room can be maintained in this case. The sensitivity study for atmospheric parameters was performed in this study. The results show that the wind speed has the larger effect in the concentration calculation.
Keywords: PWR, ALOHA, habitability, Maanshan.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7476580 Review and Comparison of Associative Classification Data Mining Approaches
Authors: Suzan Wedyan
Abstract:
Associative classification (AC) is a data mining approach that combines association rule and classification to build classification models (classifiers). AC has attracted a significant attention from several researchers mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed such as Classification based Association (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predicted Association Rule (CPAR). This paper surveys major AC algorithms and compares the steps and methods performed in each algorithm including: rule learning, rule sorting, rule pruning, classifier building, and class prediction.
Keywords: Associative Classification, Classification, Data Mining, Learning, Rule Ranking, Rule Pruning, Prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 66376579 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search
Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li
Abstract:
As a structure for processing string problem, suffix array is certainly widely-known and extensively-studied. But if the string access pattern follows the “90/10" rule, suffix array can not take advantage of the fact that we often find something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10" rule, it requires many structures and an excessive amount of pointer manipulations for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay tree and suffix arrays into a new approach which is suitable to implementation on both conventional and clustered computers.Keywords: suffix arrays, splay tree, string search, distributedalgorithm
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17806578 Dimension Free Rigid Point Set Registration in Linear Time
Authors: Jianqin Qu
Abstract:
This paper proposes a rigid point set matching algorithm in arbitrary dimensions based on the idea of symmetric covariant function. A group of functions of the points in the set are formulated using rigid invariants. Each of these functions computes a pair of correspondence from the given point set. Then the computed correspondences are used to recover the unknown rigid transform parameters. Each computed point can be geometrically interpreted as the weighted mean center of the point set. The algorithm is compact, fast, and dimension free without any optimization process. It either computes the desired transform for noiseless data in linear time, or fails quickly in exceptional cases. Experimental results for synthetic data and 2D/3D real data are provided, which demonstrate potential applications of the algorithm to a wide range of problems.Keywords: Covariant point, point matching, dimension free, rigid registration.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6856577 Functional and Efficient Query Interpreters: Principle, Application and Performances’ Comparison
Authors: Laurent Thiry, Michel Hassenforder
Abstract:
This paper presents a general approach to implement efficient queries’ interpreters in a functional programming language. Indeed, most of the standard tools actually available use an imperative and/or object-oriented language for the implementation (e.g. Java for Jena-Fuseki) but other paradigms are possible with, maybe, better performances. To proceed, the paper first explains how to model data structures and queries in a functional point of view. Then, it proposes a general methodology to get performances (i.e. number of computation steps to answer a query) then it explains how to integrate some optimization techniques (short-cut fusion and, more important, data transformations). It then compares the functional server proposed to a standard tool (Fuseki) demonstrating that the first one can be twice to ten times faster to answer queries.Keywords: Data transformation, functional programming, information server, optimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7556576 Evidence Theory Enabled Quickest Change Detection Using Big Time-Series Data from Internet of Things
Authors: Hossein Jafari, Xiangfang Li, Lijun Qian, Alexander Aved, Timothy Kroecker
Abstract:
Traditionally in sensor networks and recently in the Internet of Things, numerous heterogeneous sensors are deployed in distributed manner to monitor a phenomenon that often can be model by an underlying stochastic process. The big time-series data collected by the sensors must be analyzed to detect change in the stochastic process as quickly as possible with tolerable false alarm rate. However, sensors may have different accuracy and sensitivity range, and they decay along time. As a result, the big time-series data collected by the sensors will contain uncertainties and sometimes they are conflicting. In this study, we present a framework to take advantage of Evidence Theory (a.k.a. Dempster-Shafer and Dezert-Smarandache Theories) capabilities of representing and managing uncertainty and conflict to fast change detection and effectively deal with complementary hypotheses. Specifically, Kullback-Leibler divergence is used as the similarity metric to calculate the distances between the estimated current distribution with the pre- and post-change distributions. Then mass functions are calculated and related combination rules are applied to combine the mass values among all sensors. Furthermore, we applied the method to estimate the minimum number of sensors needed to combine, so computational efficiency could be improved. Cumulative sum test is then applied on the ratio of pignistic probability to detect and declare the change for decision making purpose. Simulation results using both synthetic data and real data from experimental setup demonstrate the effectiveness of the presented schemes.Keywords: CUSUM, evidence theory, KL divergence, quickest change detection, time series data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9956575 A Secure Proxy Signature Scheme with Fault Tolerance Based on RSA System
Authors: H. El-Kamchouchi, Heba Gaber, Fatma Ahmed, Dalia H. El-Kamchouchi
Abstract:
Due to the rapid growth in modern communication systems, fault tolerance and data security are two important issues in a secure transaction. During the transmission of data between the sender and receiver, errors may occur frequently. Therefore, the sender must re-transmit the data to the receiver in order to correct these errors, which makes the system very feeble. To improve the scalability of the scheme, we present a secure proxy signature scheme with fault tolerance over an efficient and secure authenticated key agreement protocol based on RSA system. Authenticated key agreement protocols have an important role in building a secure communications network between the two parties.
Keywords: Proxy signature, fault tolerance, RSA, key agreement protocol.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14896574 Hybrid Intelligent Intrusion Detection System
Authors: Norbik Bashah, Idris Bharanidharan Shanmugam, Abdul Manan Ahmed
Abstract:
Intrusion Detection Systems are increasingly a key part of systems defense. Various approaches to Intrusion Detection are currently being used, but they are relatively ineffective. Artificial Intelligence plays a driving role in security services. This paper proposes a dynamic model Intelligent Intrusion Detection System, based on specific AI approach for intrusion detection. The techniques that are being investigated includes neural networks and fuzzy logic with network profiling, that uses simple data mining techniques to process the network data. The proposed system is a hybrid system that combines anomaly, misuse and host based detection. Simple Fuzzy rules allow us to construct if-then rules that reflect common ways of describing security attacks. For host based intrusion detection we use neural-networks along with self organizing maps. Suspicious intrusions can be traced back to its original source path and any traffic from that particular source will be redirected back to them in future. Both network traffic and system audit data are used as inputs for both.Keywords: Intrusion Detection, Network Security, Data mining, Fuzzy Logic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21356573 Investigation of Learning Challenges in Building Measurement Unit
Authors: Argaw T. Gurmu, Muhammad N. Mahmood
Abstract:
The objective of this research is to identify the architecture and construction management students’ learning challenges of the building measurement. This research used the survey data obtained collected from the students who completed the building measurement unit. NVivo qualitative data analysis software was used to identify relevant themes. The analysis of the qualitative data revealed the major learning difficulties such as inadequacy of practice questions for the examination, inability to work as a team, lack of detailed understanding of the prerequisite units, insufficiency of the time allocated for tutorials and incompatibility of lecture and tutorial schedules. The output of this research can be used as a basis for improving the teaching and learning activities in construction measurement units.
Keywords: Building measurement, construction management, learning challenges, evaluate survey.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11136572 An Efficient and Generic Hybrid Framework for High Dimensional Data Clustering
Authors: Dharmveer Singh Rajput , P. K. Singh, Mahua Bhattacharya
Abstract:
Clustering in high dimensional space is a difficult problem which is recurrent in many fields of science and engineering, e.g., bioinformatics, image processing, pattern reorganization and data mining. In high dimensional space some of the dimensions are likely to be irrelevant, thus hiding the possible clustering. In very high dimensions it is common for all the objects in a dataset to be nearly equidistant from each other, completely masking the clusters. Hence, performance of the clustering algorithm decreases. In this paper, we propose an algorithmic framework which combines the (reduct) concept of rough set theory with the k-means algorithm to remove the irrelevant dimensions in a high dimensional space and obtain appropriate clusters. Our experiment on test data shows that this framework increases efficiency of the clustering process and accuracy of the results.Keywords: High dimensional clustering, sub-space, k-means, rough set, discernibility matrix.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19536571 A Rule-based Approach for Anomaly Detection in Subscriber Usage Pattern
Authors: Rupesh K. Gopal, Saroj K. Meher
Abstract:
In this report we present a rule-based approach to detect anomalous telephone calls. The method described here uses subscriber usage CDR (call detail record) data sampled over two observation periods: study period and test period. The study period contains call records of customers- non-anomalous behaviour. Customers are first grouped according to their similar usage behaviour (like, average number of local calls per week, etc). For customers in each group, we develop a probabilistic model to describe their usage. Next, we use maximum likelihood estimation (MLE) to estimate the parameters of the calling behaviour. Then we determine thresholds by calculating acceptable change within a group. MLE is used on the data in the test period to estimate the parameters of the calling behaviour. These parameters are compared against thresholds. Any deviation beyond the threshold is used to raise an alarm. This method has the advantage of identifying local anomalies as compared to techniques which identify global anomalies. The method is tested for 90 days of study data and 10 days of test data of telecom customers. For medium to large deviations in the data in test window, the method is able to identify 90% of anomalous usage with less than 1% false alarm rate.Keywords: Subscription fraud, fraud detection, anomalydetection, maximum likelihood estimation, rule based systems.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28146570 The Effect of the Hourly Compensation on the Unemployment Rate: Comparative Analysis of United States, Canada and the United Kingdom Using Panel Data Regression Analysis
Authors: Ashiquer Rahman, Hares Mohammad, Ummey Salma
Abstract:
A country’s hourly compensation and unemployment rates are two of its most crucial components. They are not merely statistics but they have profound effects on individual, families, country, and the economy. They are inversely related to one another. The increased hourly compensation in the manufacturing sector can have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, in order to determine the effect of hourly compensation on unemployment rate, we use the panel data regression models and evaluate the expected link between hourly compensation and unemployment rate. We estimate the fixed effects model (FEM), evaluate the error components model (ECM), and determine which model (the FEM or ECM) is better through pooling all 60 observations. We then analyze and review the data by comparing countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of this extensive research on how the hourly compensation affects unemployment rate. Additionally, this paper offers relevant and useful guideline for the government and academic community to use an econometrics and social approach for the hourly compensation on unemployment rate to eliminate the problem.
Keywords: Hourly compensation, unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 786569 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes
Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani
Abstract:
Development of a method to estimate gene functions is an important task in bioinformatics. One of the approaches for the annotation is the identification of the metabolic pathway that genes are involved in. Since gene expression data reflect various intracellular phenomena, those data are considered to be related with genes’ functions. However, it has been difficult to estimate the gene function with high accuracy. It is considered that the low accuracy of the estimation is caused by the difficulty of accurately measuring a gene expression. Even though they are measured under the same condition, the gene expressions will vary usually. In this study, we proposed a feature extraction method focusing on the variability of gene expressions to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which is a method for calculating the similarity between distributions. Finally, we utilized the similarity vectors as feature vectors and trained the multiclass SVM for identifying the genes' metabolic pathway. To evaluate our developed method, we applied the method to budding yeast and trained the multiclass SVM for identifying the seven metabolic pathways. As a result, the accuracy that calculated by our developed method was higher than the one that calculated from the raw gene expression data. Thus, our developed method combined with KL divergence is useful for identifying the genes' metabolic pathway.
Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23386568 An Analysis of Compression Methods and Implementation of Medical Images in Wireless Network
Authors: C. Rajan, K. Geetha, S. Geetha
Abstract:
The motivation of image compression technique is to reduce the irrelevance and redundancy of the image data in order to store or pass data in an efficient way from one place to another place. There are several types of compression methods available. Without the help of compression technique, the file size is knowingly larger, usually several megabytes, but by doing the compression technique, it is possible to reduce file size up to 10% as of the original without noticeable loss in quality. Image compression can be lossless or lossy. The compression technique can be applied to images, audio, video and text data. This research work mainly concentrates on methods of encoding, DCT, compression methods, security, etc. Different methodologies and network simulations have been analyzed here. Various methods of compression methodologies and its performance metrics has been investigated and presented in a table manner.Keywords: Image compression techniques, encoding, DCT, lossy compression, lossless compression, JPEG.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11916567 A Meta-Analytic Path Analysis of e-Learning Acceptance Model
Authors: David W.S. Tai, Ren-Cheng Zhang, Sheng-Hung Chang, Chin-Pin Chen, Jia-Ling Chen
Abstract:
This study reports results of a meta-analytic path analysis e-learning Acceptance Model with k = 27 studies, Databases searched included Information Sciences Institute (ISI) website. Variables recorded included perceived usefulness, perceived ease of use, attitude toward behavior, and behavioral intention to use e-learning. A correlation matrix of these variables was derived from meta-analytic data and then analyzed by using structural path analysis to test the fitness of the e-learning acceptance model to the observed aggregated data. Results showed the revised hypothesized model to be a reasonable, good fit to aggregated data. Furthermore, discussions and implications are given in this article.
Keywords: E-learning, Meta Analytic Path Analysis, Technology Acceptance Model
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24516566 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining
Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrła Ronaldo G. Morato
Abstract:
Environmental changes and major natural disasters are most prevalent in the world due to the damage that humanity has caused to nature and these damages directly affect the lives of animals. Thus, the study of animal behavior and their interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine the patterns of animal behavior and with technological advances the ability of animals to be tracked and, consequently, behavioral studies have been expanded. There is a lot of research on animal movement and behavior, but we note that a proposal that combines resources and allows for exploratory analysis of animal movement and provide statistical measures on individual animal behavior and its interaction with the environment is missing. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in responding questions about the animal individual behavior and their interactions with other animals over time and space. We evaluated the framework through the use of monitored jaguar data in the city of Miranda Pantanal, Brazil, in order to verify if the use of AniMoveMineR allows to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of jaguars and about which jaguars have the highest or lowest correlation.Keywords: Data mining, data science, trajectory, animal behavior.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9276565 The Integration of Patient Health Record Generated from Wearable and Internet of Things Devices into Health Information Exchanges
Authors: Dalvin D. Hill, Hector M. Castro Garcia
Abstract:
A growing number of individuals utilize wearable devices on a daily basis. The usage and functionality of these wearable devices vary from user to user. One popular usage of said devices is to track health-related activities that are typically stored on a device’s memory or uploaded to an account in the cloud; based on the current trend, the data accumulated from the wearable device are stored in a standalone location. In many of these cases, this health related datum is not a factor when considering the holistic view of a user’s health lifestyle or record. This health-related data generated from wearable and Internet of Things (IoT) devices can serve as empirical information to a medical provider, as the standalone data can add value to the holistic health record of a patient. This paper proposes a solution to incorporate the data gathered from these wearable and IoT devices, with that a patient’s Personal Health Record (PHR) stored within the confines of a Health Information Exchange (HIE).
Keywords: Electronic health record, health information exchanges, Internet of Things, personal health records, wearable devices, wearables.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17666564 Cooperative Data Caching in WSN
Authors: Narottam Chand
Abstract:
Wireless sensor networks (WSNs) have gained tremendous attention in recent years due to their numerous applications. Due to the limited energy resource, energy efficient operation of sensor nodes is a key issue in wireless sensor networks. Cooperative caching which ensures sharing of data among various nodes reduces the number of communications over the wireless channels and thus enhances the overall lifetime of a wireless sensor network. In this paper, we propose a cooperative caching scheme called ZCS (Zone Cooperation at Sensors) for wireless sensor networks. In ZCS scheme, one-hop neighbors of a sensor node form a cooperative cache zone and share the cached data with each other. Simulation experiments show that the ZCS caching scheme achieves significant improvements in byte hit ratio and average query latency in comparison with other caching strategies.Keywords: Admission control, cache replacement, cooperative caching, WSN, zone cooperation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27626563 Exploiting Kinetic and Kinematic Data to Plot Cyclograms for Managing the Rehabilitation Process of BKAs by Applying Neural Networks
Authors: L. Parisi
Abstract:
Kinematic data wisely correlate vector quantities in space to scalar parameters in time to assess the degree of symmetry between the intact limb and the amputated limb with respect to a normal model derived from the gait of control group participants. Furthermore, these particular data allow a doctor to preliminarily evaluate the usefulness of a certain rehabilitation therapy. Kinetic curves allow the analysis of ground reaction forces (GRFs) to assess the appropriateness of human motion. Electromyography (EMG) allows the analysis of the fundamental lower limb force contributions to quantify the level of gait asymmetry. However, the use of this technological tool is expensive and requires patient’s hospitalization. This research work suggests overcoming the above limitations by applying artificial neural networks.
Keywords: Kinetics, kinematics, cyclograms, neural networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20916562 The Automated Selective Acquisition System
Authors: Atisthan Wuttimanop, Suchada Rianmora
Abstract:
To support design process for launching the product on time, reverse engineering (RE) process has been introduced for quickly generating 3D CAD model from its physical object. The accuracy of the 3D CAD model depends upon the data acquisition technique selected, contact or non-contact methods. In order to reduce times used for acquiring surface and eliminating noises, the automated selective acquisition system has been developed and presented in this research as the alternative channel for non-contact acquisition technique where the data is selectively and locally scanned contour by contour without performing data reduction process. The results present as the organized contour points which are directly used to generate 3D virtual model. The comparison between the proposed technique and another non-contact scanning technique has been presented and discussed.
Keywords: Automated selective acquisition system, Non-contact acquisition, Reverse engineering, 3D scanners.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16116561 Node Insertion in Coalescence Hidden-Variable Fractal Interpolation Surface
Authors: Srijanani Anurag Prasad
Abstract:
The Coalescence Hidden-variable Fractal Interpolation Surface (CHFIS) was built by combining interpolation data from the Iterated Function System (IFS). The interpolation data in a CHFIS comprise a row and/or column of uncertain values when a single point is entered. Alternatively, a row and/or column of additional points are placed in the given interpolation data to demonstrate the node added CHFIS. There are three techniques for inserting new points that correspond to the row and/or column of nodes inserted, and each method is further classified into four types based on the values of the inserted nodes. As a result, numerous forms of node insertion can be found in a CHFIS.
Keywords: Fractal, interpolation, iterated function system, coalescence, node insertion, knot insertion.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3456560 Comparative Study on Swarm Intelligence Techniques for Biclustering of Microarray Gene Expression Data
Authors: R. Balamurugan, A. M. Natarajan, K. Premalatha
Abstract:
Microarray gene expression data play a vital in biological processes, gene regulation and disease mechanism. Biclustering in gene expression data is a subset of the genes indicating consistent patterns under the subset of the conditions. Finding a biclustering is an optimization problem. In recent years, swarm intelligence techniques are popular due to the fact that many real-world problems are increasingly large, complex and dynamic. By reasons of the size and complexity of the problems, it is necessary to find an optimization technique whose efficiency is measured by finding the near optimal solution within a reasonable amount of time. In this paper, the algorithmic concepts of the Particle Swarm Optimization (PSO), Shuffled Frog Leaping (SFL) and Cuckoo Search (CS) algorithms have been analyzed for the four benchmark gene expression dataset. The experiment results show that CS outperforms PSO and SFL for 3 datasets and SFL give better performance in one dataset. Also this work determines the biological relevance of the biclusters with Gene Ontology in terms of function, process and component.
Keywords: Particle swarm optimization, Shuffled frog leaping, Cuckoo search, biclustering, gene expression data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26666559 Using Business Intelligence Capabilities to Improve the Quality of Decision-Making: A Case Study of Mellat Bank
Authors: Jalal Haghighat Monfared, Zahra Akbari
Abstract:
Today, business executives need to have useful information to make better decisions. Banks have also been using information tools so that they can direct the decision-making process in order to achieve their desired goals by rapidly extracting information from sources with the help of business intelligence. The research seeks to investigate whether there is a relationship between the quality of decision making and the business intelligence capabilities of Mellat Bank. Each of the factors studied is divided into several components, and these and their relationships are measured by a questionnaire. The statistical population of this study consists of all managers and experts of Mellat Bank's General Departments (including 190 people) who use commercial intelligence reports. The sample size of this study was 123 randomly determined by statistical method. In this research, relevant statistical inference has been used for data analysis and hypothesis testing. In the first stage, using the Kolmogorov-Smirnov test, the normalization of the data was investigated and in the next stage, the construct validity of both variables and their resulting indexes were verified using confirmatory factor analysis. Finally, using the structural equation modeling and Pearson's correlation coefficient, the research hypotheses were tested. The results confirmed the existence of a positive relationship between decision quality and business intelligence capabilities in Mellat Bank. Among the various capabilities, including data quality, correlation with other systems, user access, flexibility and risk management support, the flexibility of the business intelligence system was the most correlated with the dependent variable of the present research. This shows that it is necessary for Mellat Bank to pay more attention to choose the required business intelligence systems with high flexibility in terms of the ability to submit custom formatted reports. Subsequently, the quality of data on business intelligence systems showed the strongest relationship with quality of decision making. Therefore, improving the quality of data, including the source of data internally or externally, the type of data in quantitative or qualitative terms, the credibility of the data and perceptions of who uses the business intelligence system, improves the quality of decision making in Mellat Bank.
Keywords: Business intelligence, business intelligence capability, decision making, decision quality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13896558 Producing Outdoor Design Conditions Based on the Dependency between Meteorological Elements: Copula Approach
Authors: Zhichao Jiao, Craig Farnham, Jihui Yuan, Kazuo Emura
Abstract:
It is common to use the outdoor design weather data to select the air-conditioning capacity in the building design stage. The meteorological elements of outdoor design weather data are usually selected based on their excess frequency separately while the dependency between the elements is not well considered. It means that the simultaneous occurrence probability of these elements is smaller than the original excess frequency which may cause an overestimation of selecting air-conditioning capacity. Therefore, the copula approach which can capture the dependency between multivariate data was used to model the joint distributions of the meteorological elements, like air temperature and global solar radiation. We suggest a method based on the specific simultaneous occurrence probability of these two elements of selecting more credible outdoor design conditions. The hourly weather data at 12 noon from 2001 to 2010 in Tokyo, Japan are used to analyze the dependency structure and joint distribution, the Gaussian copula represents the dependence of data best. According to calculating the air temperature and global solar radiation in specific simultaneous occurrence probability and the common exceeding, the results show that both the air temperature and global solar radiation based on simultaneous occurrence probability are lower than these based on the conventional method in the same probability.
Keywords: Copula approach, Design weather database, energy conservation, HVAC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3676557 Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences
Authors: Chien-Hua Wang, Chin-Tzong Pang
Abstract:
In data mining, the association rules are used to search for the relations of items of the transactions database. Following the data is collected and stored, it can find rules of value through association rules, and assist manager to proceed marketing strategy and plan market framework. In this paper, we attempt fuzzy partition methods and decide membership function of quantitative values of each transaction item. Also, by managers we can reflect the importance of items as linguistic terms, which are transformed as fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the process of data mining. The method above is expected to improve Apriori algorithm for its better efficiency of the whole association rules. An example is given to clearly illustrate the proposed approach.Keywords: Association Rule, Fuzzy Partition Methods, FWFP-Growth, Apiroir algorithm
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16556556 Change Point Analysis in Average Ozone Layer Temperature Using Exponential Lomax Distribution
Authors: Amjad Abdullah, Amjad Yahya, Bushra Aljohani, Amani S. Alghamdi
Abstract:
Change point detection is an important part of data analysis. The presence of a change point refers to a significant change in the behavior of a time series. In this article, we examine the detection of multiple change points of parameters of the exponential Lomax distribution, which is broad and flexible compared with other distributions while fitting data. We used the Schwarz information criterion and binary segmentation to detect multiple change points in publicly available data on the average temperature in the ozone layer. The change points were successfully located.
Keywords: Binary segmentation, change point, exponential Lomax distribution, information criterion.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3496555 Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse
Authors: Jiratta Phuboon-ob, Raweewan Auepanwiriyakul
Abstract:
A data warehouse (DW) is a system which has value and role for decision-making by querying. Queries to DW are critical regarding to their complexity and length. They often access millions of tuples, and involve joins between relations and aggregations. Materialized views are able to provide the better performance for DW queries. However, these views have maintenance cost, so materialization of all views is not possible. An important challenge of DW environment is materialized view selection because we have to realize the trade-off between performance and view maintenance. Therefore, in this paper, we introduce a new approach aimed to solve this challenge based on Two-Phase Optimization (2PO), which is a combination of Simulated Annealing (SA) and Iterative Improvement (II), with the use of Multiple View Processing Plan (MVPP). Our experiments show that 2PO outperform the original algorithms in terms of query processing cost and view maintenance cost.Keywords: Data warehouse, materialized views, view selectionproblem, two-phase optimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17086554 Stakeholder Analysis of Agricultural Drone Policy: A Case Study of the Agricultural Drone Ecosystem of Thailand
Authors: Thanomsin Chakreeves, Atichat Preittigun, Ajchara Phu-ang
Abstract:
This paper presents a stakeholder analysis of agricultural drone policies that meet the government's goal of building an agricultural drone ecosystem in Thailand. Firstly, case studies from other countries are reviewed. The stakeholder analysis method and qualitative data from the interviews are then presented including data from the Institute of Innovation and Management, the Office of National Higher Education Science Research and Innovation Policy Council, agricultural entrepreneurs and farmers. Study and interview data are then employed to describe the current ecosystem and to guide the implementation of agricultural drone policies that are suitable for the ecosystem of Thailand. Finally, policy recommendations are then made that the Thai government should adopt in the future.
Keywords: Drone public policy, drone ecosystem, policy development, agricultural drone.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 817