Search results for: Data Warehousing tools
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8199

Search results for: Data Warehousing tools

7269 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, spatial analysis, etc. Laser scanning system generates irregularly spaced three-dimensional cloud of points. Raw ALS data are mainly ground points (that represent the bare earth) and non-ground points (that represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a difficult and challenging task as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter utilizes a weight function to allocate weights for each point of the data. Furthermore, unlike most of the methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the generated DTMs from the filtered data are accurate (when compared against reference terrain data) and the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.

Keywords: Airborne laser scanning, digital terrain models, filtering, forested areas.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 699
7268 Blockchain for IoT Security and Privacy in Healthcare Sector

Authors: Umair Shafique, Hafiz Usman Zia, Fiaz Majeed, Samina Naz, Javeria Ahmed, Maleeha Zainab

Abstract:

The Internet of Things (IoT) has become a hot topic for the last couple of years. This innovative technology has shown promising progress in various areas and the world has witnessed exponential growth in multiple application domains. Researchers are working to investigate its aptitudes to get the best from it by harnessing its true potential. But at the same time, IoT networks open up a new aspect of vulnerability and physical threats to data integrity, privacy, and confidentiality. It is due to centralized control, data silos approach for handling information, and a lack of standardization in the IoT networks. As we know, blockchain is a new technology that involves creating secure distributed ledgers to store and communicate data. Some of the benefits include resiliency, integrity, anonymity, decentralization, and autonomous control. The potential for blockchain technology to provide the key to managing and controlling IoT has created a new wave of excitement around the idea of putting that data back into the hands of the end-users. In this manuscript, we have proposed a model that combines blockchain and IoT networks to address potential security and privacy issues in the healthcare domain and how various stakeholders will interact with the system.

Keywords: Internet of Things, IoT, blockchain, data integrity, authentication, data privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 374
7267 A New Algorithm for Enhanced Robustness of Copyright Mark

Authors: Harsh Vikram Singh, S. P. Singh, Anand Mohan

Abstract:

This paper discusses a new heavy tailed distribution based data hiding into discrete cosine transform (DCT) coefficients of image, which provides statistical security as well as robustness against steganalysis attacks. Unlike other data hiding algorithms, the proposed technique does not introduce much effect in the stegoimage-s DCT coefficient probability plots, thus making the presence of hidden data statistically undetectable. In addition the proposed method does not compromise on hiding capacity. When compared to the generic block DCT based data-hiding scheme, our method found more robust against a variety of image manipulating attacks such as filtering, blurring, JPEG compression etc.

Keywords: Information Security, Robust Steganography, Steganalysis, Pareto Probability Distribution function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784
7266 Liveability of Kuala Lumpur City Centre: An Evaluation of the Happiness Level of the Streets- Activities

Authors: Shuhana Shamsuddin, Nur Rasyiqah Abu Hassan, Ahmad Bashri Sulaiman

Abstract:

Liveable city is referred to as the quality of life in an area that contributes towards a safe, healthy and enjoyable place. This paper discusses the role of the streets- activities in making Kuala Lumpur a liveable city and the happiness level of the residents towards the city-s street activities. The study was conducted using the residents of Kuala Lumpur. A mixed method technique is used with the quantitative data as a main data and supported by the qualitative data. Data were collected using questionnaires, observation and also an interview session with a sample of residents of Kuala Lumpur. The sampling technique is based on multistage cluster data sampling. The findings revealed that, there is still no significant relationship between the length of stay of the resident in Kuala Lumpur with the happiness level towards the street activities that occurred in the city.

Keywords: Liveable city, activities, urban design quality, quality of life, happiness level.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2869
7265 Cloud Computing Support for Diagnosing Researches

Authors: A. Amirov, O. Gerget, V. Kochegurov

Abstract:

One of the main biomedical problem lies in detecting dependencies in semi structured data. Solution includes biomedical portal and algorithms (integral rating health criteria, multidimensional data visualization methods). Biomedical portal allows to process diagnostic and research data in parallel mode using Microsoft System Center 2012, Windows HPC Server cloud technologies. Service does not allow user to see internal calculations instead it provides practical interface. When data is sent for processing user may track status of task and will achieve results as soon as computation is completed. Service includes own algorithms and allows diagnosing and predicating medical cases. Approved methods are based on complex system entropy methods, algorithms for determining the energy patterns of development and trajectory models of biological systems and logical–probabilistic approach with the blurring of images.

Keywords: Biomedical portal, cloud computing, diagnostic and prognostic research, mathematical data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1612
7264 Non-Invasive Data Extraction from Machine Display Units Using Video Analytics

Authors: Ravneet Kaur, Joydeep Acharya, Sudhanshu Gaur

Abstract:

Artificial Intelligence (AI) has the potential to transform manufacturing by improving shop floor processes such as production, maintenance and quality. However, industrial datasets are notoriously difficult to extract in a real-time, streaming fashion thus, negating potential AI benefits. The main example is some specialized industrial controllers that are operated by custom software which complicates the process of connecting them to an Information Technology (IT) based data acquisition network. Security concerns may also limit direct physical access to these controllers for data acquisition. To connect the Operational Technology (OT) data stored in these controllers to an AI application in a secure, reliable and available way, we propose a novel Industrial IoT (IIoT) solution in this paper. In this solution, we demonstrate how video cameras can be installed in a factory shop floor to continuously obtain images of the controller HMIs. We propose image pre-processing to segment the HMI into regions of streaming data and regions of fixed meta-data. We then evaluate the performance of multiple Optical Character Recognition (OCR) technologies such as Tesseract and Google vision to recognize the streaming data and test it for typical factory HMIs and realistic lighting conditions. Finally, we use the meta-data to match the OCR output with the temporal, domain-dependent context of the data to improve the accuracy of the output. Our IIoT solution enables reliable and efficient data extraction which will improve the performance of subsequent AI applications.

Keywords: Human machine interface, industrial internet of things, internet of things, optical character recognition, video analytic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 716
7263 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: Bipartite graph, clustering, one-mode projection, web proxy detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 730
7262 Estimating the Flow Velocity Using Flow Generated Sound

Authors: Saeed Hosseini, Ali Reza Tahavvor

Abstract:

Sound processing is one the subjects that newly attracts a lot of researchers. It is efficient and usually less expensive than other methods. In this paper the flow generated sound is used to estimate the flow speed of free flows. Many sound samples are gathered. After analyzing the data, a parameter named wave power is chosen. For all samples the wave power is calculated and averaged for each flow speed. A curve is fitted to the averaged data and a correlation between the wave power and flow speed is found. Test data are used to validate the method and errors for all test data were under 10 percent. The speed of the flow can be estimated by calculating the wave power of the flow generated sound and using the proposed correlation.

Keywords: Flow generated sound, sound processing, speed, wave power.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2355
7261 Exponential Particle Swarm Optimization Approach for Improving Data Clustering

Authors: Neveen I. Ghali, Nahed El-Dessouki, Mervat A. N., Lamiaa Bakrawi

Abstract:

In this paper we use exponential particle swarm optimization (EPSO) to cluster data. Then we compare between (EPSO) clustering algorithm which depends on exponential variation for the inertia weight and particle swarm optimization (PSO) clustering algorithm which depends on linear inertia weight. This comparison is evaluated on five data sets. The experimental results show that EPSO clustering algorithm increases the possibility to find the optimal positions as it decrease the number of failure. Also show that (EPSO) clustering algorithm has a smaller quantization error than (PSO) clustering algorithm, i.e. (EPSO) clustering algorithm more accurate than (PSO) clustering algorithm.

Keywords: Particle swarm optimization, data clustering, exponential PSO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1669
7260 Cloning and Functional Characterization of Promoter Elements of the D Hordein Gene from the Barley (Hordeum vulgare L.) by Bioinformatic Tools

Authors: Kobra Nalbandi, Bahram Baghban Kohnehrouz, Khalil Alami Saeed

Abstract:

The low level of foreign genes expression in transgenic plants is a key factor that limits plant genetic engineering. Because of the critical regulatory activity of the promoters on gene transcription, they are studied extensively to improve the efficiency of the plant transgenic system. The strong constitutive promoters, such as CaMV 35S promoter and Ubiqutin 1 maize are usually used in plant biotechnology research. However the expression level of the foreign genes in all tissues is often undesirable. But using a strong seed-specific promoter to limit gene expression in the seed solves such problems. The purpose of this study is to isolate one of the seed specific promoters of Hordeum vulgare. So one of the common varieties of Hordeum vulgare in Iran was selected and their genomes extracted then the D-Hordein promoter amplified using the specific designed primers. Then the amplified fragment of the insert cloned in an appropriate vector and then transformed to E. coli. At last for the final admission of accuracy the cloned fragments sent for sequencing. Sequencing analysis showed that the cloned fragment DHPcontained motifs; like TATA box, CAAT-box, CCGTCC-box, AMYBOX1 and E-box etc., which constituted the seed-specific promoter activity. The results were compared with sequences existing in data banks. D-Hordein promoters of Alger has 99% similarity at 100 % coverage. The results also showed that D-Hordein promoter of barley and HMW promoter of wheat are too similar.

Keywords: Barley, Seed specific promoter, Hordein.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2608
7259 Biosignal Measurement using Personal Area Network based on Human Body Communication

Authors: Yong-Gyu Lee, Jin-Hee Park, Gilwon Yoon

Abstract:

In this study, we introduced a communication system where human body was used as medium through which data were transferred. Multiple biosignal sensing units were attached to a subject and wireless personal area network was formed. Data of the sensing units were shared among them. We used wideband pulse communication that was simple, low-power consuming and high data rated. Each unit functioned as independent communication device or node. A method of channel search and communication among the modes was developed. A protocol of carrier sense multiple access/collision detect was implemented in order to avoid data collision or interferences. Biosignal sensing units should be located at different locations due to the nature of biosignal origin. Our research provided a flexibility of collecting data without using electrical wires. More non-constrained measurement was accomplished which was more suitable for u-Health monitoring.

Keywords: Human body communication, wideband pulse communication, personal area network, biosignal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2163
7258 Development and Evaluation of a Portable Ammonia Gas Detector

Authors: Jaheon Gu, Wooyong Chung, Mijung Koo, Seonbok Lee, Gyoutae Park, Sangguk Ahn, Hiesik Kim, Jungil Park

Abstract:

In this paper, we present a portable ammonia gas detector for performing the gas safety management efficiently. The display of the detector is separated from its body. The display module is received the data measured from the detector using ZigBee. The detector has a rechargeable li-ion battery which can be use for 11~12 hours, and a Bluetooth module for sending the data to the PC or the smart devices. The data are sent to the server and can access using the web browser or mobile application. The range of the detection concentration is 0~100ppm.

Keywords: Ammonia, detector, gas safety, portable.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1521
7257 ANN based Multi Classifier System for Prediction of High Energy Shower Primary Energy and Core Location

Authors: Gitanjali Devi, Kandarpa Kumar Sarma, Pranayee Datta, Anjana Kakoti Mahanta

Abstract:

Cosmic showers, during the transit through space, produce sub - products as a result of interactions with the intergalactic or interstellar medium which after entering earth generate secondary particles called Extensive Air Shower (EAS). Detection and analysis of High Energy Particle Showers involve a plethora of theoretical and experimental works with a host of constraints resulting in inaccuracies in measurements. Therefore, there exist a necessity to develop a readily available system based on soft-computational approaches which can be used for EAS analysis. This is due to the fact that soft computational tools such as Artificial Neural Network (ANN)s can be trained as classifiers to adapt and learn the surrounding variations. But single classifiers fail to reach optimality of decision making in many situations for which Multiple Classifier System (MCS) are preferred to enhance the ability of the system to make decisions adjusting to finer variations. This work describes the formation of an MCS using Multi Layer Perceptron (MLP), Recurrent Neural Network (RNN) and Probabilistic Neural Network (PNN) with data inputs from correlation mapping Self Organizing Map (SOM) blocks and the output optimized by another SOM. The results show that the setup can be adopted for real time practical applications for prediction of primary energy and location of EAS from density values captured using detectors in a circular grid.

Keywords: EAS, Shower, Core, ANN, Location.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1281
7256 LAYMOD; A Layered and Modular Platform for CAx Collaboration Management and Supporting Product data Integration based on STEP Standard

Authors: Omid F. Valilai, Mahmoud Houshmand

Abstract:

Nowadays companies strive to survive in a competitive global environment. To speed up product development/modifications, it is suggested to adopt a collaborative product development approach. However, despite the advantages of new IT improvements still many CAx systems work separately and locally. Collaborative design and manufacture requires a product information model that supports related CAx product data models. To solve this problem many solutions are proposed, which the most successful one is adopting the STEP standard as a product data model to develop a collaborative CAx platform. However, the improvement of the STEP-s Application Protocols (APs) over the time, huge number of STEP AP-s and cc-s, the high costs of implementation, costly process for conversion of older CAx software files to the STEP neutral file format; and lack of STEP knowledge, that usually slows down the implementation of the STEP standard in collaborative data exchange, management and integration should be considered. In this paper the requirements for a successful collaborative CAx system is discussed. The STEP standard capability for product data integration and its shortcomings as well as the dominant platforms for supporting CAx collaboration management and product data integration are reviewed. Finally a platform named LAYMOD to fulfil the requirements of CAx collaborative environment and integrating the product data is proposed. The platform is a layered platform to enable global collaboration among different CAx software packages/developers. It also adopts the STEP modular architecture and the XML data structures to enable collaboration between CAx software packages as well as overcoming the STEP standard limitations. The architecture and procedures of LAYMOD platform to manage collaboration and avoid contradicts in product data integration are introduced.

Keywords: CAx, Collaboration management, STEP applicationmodules, STEP standard, XML data structures

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2203
7255 Grocery Customer Behavior Analysis using RFID-based Shopping Paths Data

Authors: In-Chul Jung, Young S. Kwon

Abstract:

Knowing about the customer behavior in a grocery has been a long-standing issue in the retailing industry. The advent of RFID has made it easier to collect moving data for an individual shopper's behavior. Most of the previous studies used the traditional statistical clustering technique to find the major characteristics of customer behavior, especially shopping path. However, in using the clustering technique, due to various spatial constraints in the store, standard clustering methods are not feasible because moving data such as the shopping path should be adjusted in advance of the analysis, which is time-consuming and causes data distortion. To alleviate this problem, we propose a new approach to spatial pattern clustering based on the longest common subsequence. Experimental results using real data obtained from a grocery confirm the good performance of the proposed method in finding the hot spot, dead spot and major path patterns of customer movements.

Keywords: customer path, shopping behavior, exploratoryanalysis, LCS, RFID

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3128
7254 Soft-Sensor for Estimation of Gasoline Octane Number in Platforming Processes with Adaptive Neuro-Fuzzy Inference Systems (ANFIS)

Authors: Hamed.Vezvaei, Sepideh.Ordibeheshti, Mehdi.Ardjmand

Abstract:

Gasoline Octane Number is the standard measure of the anti-knock properties of a motor in platforming processes, that is one of the important unit operations for oil refineries and can be determined with online measurement or use CFR (Cooperative Fuel Research) engines. Online measurements of the Octane number can be done using direct octane number analyzers, that it is too expensive, so we have to find feasible analyzer, like ANFIS estimators. ANFIS is the systems that neural network incorporated in fuzzy systems, using data automatically by learning algorithms of NNs. ANFIS constructs an input-output mapping based both on human knowledge and on generated input-output data pairs. In this research, 31 industrial data sets are used (21 data for training and the rest of the data used for generalization). Results show that, according to this simulation, hybrid method training algorithm in ANFIS has good agreements between industrial data and simulated results.

Keywords: Adaptive Neuro-Fuzzy Inference Systems, GasolineOctane Number, Soft-sensor, Catalytic Naphtha Reforming

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2172
7253 Tool for Metadata Extraction and Content Packaging as Endorsed in OAIS Framework

Authors: Payal Abichandani, Rishi Prakash, Paras Nath Barwal, B. K. Murthy

Abstract:

Information generated from various computerization processes is a potential rich source of knowledge for its designated community. To pass this information from generation to generation without modifying the meaning is a challenging activity. To preserve and archive the data for future generations it’s very essential to prove the authenticity of the data. It can be achieved by extracting the metadata from the data which can prove the authenticity and create trust on the archived data. Subsequent challenge is the technology obsolescence. Metadata extraction and standardization can be effectively used to resolve and tackle this problem. Metadata can be categorized at two levels i.e. Technical and Domain level broadly. Technical metadata will provide the information that can be used to understand and interpret the data record, but only this level of metadata isn’t sufficient to create trustworthiness. We have developed a tool which will extract and standardize the technical as well as domain level metadata. This paper is about the different features of the tool and how we have developed this.  

Keywords: Digital Preservation, Metadata, OAIS, PDI, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805
7252 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: T. Niknam, M. Nayeripour, B.Bahmani Firouzi

Abstract:

Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points, which are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: dependency on the initial state and convergence to local optima and global solutions of large problems cannot found with reasonable amount of computation effort. In order to overcome local optima problem lots of studies done in clustering. This paper is presented an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N object into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handing data clustering.

Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2184
7251 Socio-Economic Insight of the Secondary Housing Market in Colombo Suburbs: Seller’s Point of Views

Authors: R. G. Ariyawansa, M. A. N. R. M. Perera

Abstract:

“House” is a powerful symbol of socio-economic background of individuals and families. In fact, housing provides all types of needs/wants from basic needs to self-actualization needs. This phenomenon can be realized only having analyzed hidden motives of buyers and sellers of the housing market. Hence, the aim of this study is to examine the socio-economic insight of the secondary housing market in Colombo suburbs. This broader aim was achieved via analyzing the general pattern of the secondary housing market, identifying socio-economic motives of sellers of the secondary housing market, and reviewing sellers’ experience of buyer behavior. A purposive sample of 50 sellers from popular residential areas in Colombo such as Maharagama, Kottawa, Piliyandala, Punnipitiya, and Nugegoda was used to collect primary data instead of relevant secondary data from published and unpublished reports. The sample was limited to selling price ranging from Rs15 million to Rs25 million, which apparently falls into middle and upper-middle income houses in the context. Participatory observation and semi-structured interviews were adopted as key data collection tools. Data were descriptively analyzed. This study found that the market is mainly handled by informal agents who are unqualified and unorganized. People such as taxi/tree-wheel drivers, boutique venders, security personals etc. are engaged in housing brokerage as a part time career. Few fulltime and formally organized agents were found but they were also not professionally qualified. As far as housing quality is concerned, it was observed that 90% of houses was poorly maintained and illegally modified. They are situated in poorly maintained neighborhoods as well. Among the observed houses, 2% was moderately maintained and 8% was well maintained and modified. Major socio-economic motives of sellers were “migrating foreign countries for education and employment” (80% and 10% respectively), “family problems” (4%), and “social status” (3%). Other motives were “health” and “environmental/neighborhood problems” (3%). This study further noted that the secondary middle income housing market in the area directly related with the migrants who motivated for education in foreign countries, mainly Australia, UK and USA. As per the literature, families motivated for education tend to migrate Colombo suburbs from remote areas of the country. They are seeking temporary accommodation in lower middle income housing. However, the secondary middle income housing market relates with the migration from Colombo to major global cities. Therefore, final transaction price of this market may depend on migration related dates such as university deadlines, visa and other agreements. Hence, it creates a buyers’ market lowering the selling price. Also it was revealed that the buyers tend to trust more on this market as far as the quality of construction of houses is concerned than brand new houses which are built for selling purpose.

Keywords: Informal housing market, hidden motives of buyers and sellers, secondary housing market, socio-economic insight.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 674
7250 Analysis of DNA Microarray Data using Association Rules: A Selective Study

Authors: M. Anandhavalli Gauthaman

Abstract:

DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. It is very much important to extract biologically meaningful information from this huge amount of expression data to know the current state of the cell because most cellular processes are regulated by changes in gene expression. Association rule mining techniques are helpful to find association relationship between genes. Numerous association rule mining algorithms have been developed to analyze and associate this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.

Keywords: DNA microarray, gene expression, association rule mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2128
7249 The Countabilities of Soft Topological Spaces

Authors: Weijian Rong

Abstract:

Soft topological spaces are considered as mathematical tools for dealing with uncertainties, and a fuzzy topological space is a special case of the soft topological space. The purpose of this paper is to study soft topological spaces. We introduce some new concepts in soft topological spaces such as soft first-countable spaces, soft second-countable spaces and soft separable spaces, and some basic properties of these concepts are explored.

Keywords: soft sets, soft first-countable spaces, soft second countable spaces, soft separable spaces, soft Lindelöf.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2394
7248 Prospects, Problems of Marketing Research and Data Mining in Turkey

Authors: Sema Kurtuluş, Kemal Kurtuluş

Abstract:

The objective of this paper is to review and assess the methodological issues and problems in marketing research, data and knowledge mining in Turkey. As a summary, academic marketing research publications in Turkey have significant problems. The most vital problem seems to be related with modeling. Most of the publications had major weaknesses in modeling. There were also, serious problems regarding measurement and scaling, sampling and analyses. Analyses myopia seems to be the most important problem for young academia in Turkey. Another very important finding is the lack of publications on data and knowledge mining in the academic world.

Keywords: Marketing research, data mining, knowledge mining, research modeling, analyses.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1952
7247 Analysis and Comparison of Image Encryption Algorithms

Authors: İsmet Öztürk, İbrahim Soğukpınar

Abstract:

With the fast progression of data exchange in electronic way, information security is becoming more important in data storage and transmission. Because of widely using images in industrial process, it is important to protect the confidential image data from unauthorized access. In this paper, we analyzed current image encryption algorithms and compression is added for two of them (Mirror-like image encryption and Visual Cryptography). Implementations of these two algorithms have been realized for experimental purposes. The results of analysis are given in this paper.

Keywords: image encryption, image cryptosystem, security, transmission

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4939
7246 Risk Classification of SMEs by Early Warning Model Based on Data Mining

Authors: Nermin Ozgulbas, Ali Serhan Koyuncugil

Abstract:

One of the biggest problems of SMEs is their tendencies to financial distress because of insufficient finance background. In this study, an Early Warning System (EWS) model based on data mining for financial risk detection is presented. CHAID algorithm has been used for development of the EWS. Developed EWS can be served like a tailor made financial advisor in decision making process of the firms with its automated nature to the ones who have inadequate financial background. Besides, an application of the model implemented which covered 7,853 SMEs based on Turkish Central Bank (TCB) 2007 data. By using EWS model, 31 risk profiles, 15 risk indicators, 2 early warning signals, and 4 financial road maps has been determined for financial risk mitigation.

Keywords: Early Warning Systems, Data Mining, Financial Risk, SMEs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3366
7245 Using Data from Foursquare Web Service to Represent the Commercial Activity of a City

Authors: Taras Agryzkov, Almudena Nolasco-Cirugeda, Jos´e L. Oliver, Leticia Serrano-Estrada, Leandro Tortosa, Jos´e F. Vicent

Abstract:

This paper aims to represent the commercial activity of a city taking as source data the social network Foursquare. The city of Murcia is selected as case study, and the location-based social network Foursquare is the main source of information. After carrying out a reorganisation of the user-generated data extracted from Foursquare, it is possible to graphically display on a map the various city spaces and venues especially those related to commercial, food and entertainment sector businesses. The obtained visualisation provides information about activity patterns in the city of Murcia according to the people‘s interests and preferences and, moreover, interesting facts about certain characteristics of the town itself.

Keywords: Social networks, Foursquare, spatial analysis, data visualization, geocomputation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2663
7244 Long-Range Dependence of Financial Time Series Data

Authors: Chatchai Pesee

Abstract:

This paper examines long-range dependence or longmemory of financial time series on the exchange rate data by the fractional Brownian motion (fBm). The principle of spectral density function in Section 2 is used to find the range of Hurst parameter (H) of the fBm. If 0< H <1/2, then it has a short-range dependence (SRD). It simulates long-memory or long-range dependence (LRD) if 1/2< H <1. The curve of exchange rate data is fBm because of the specific appearance of the Hurst parameter (H). Furthermore, some of the definitions of the fBm, long-range dependence and selfsimilarity are reviewed in Section II as well. Our results indicate that there exists a long-memory or a long-range dependence (LRD) for the exchange rate data in section III. Long-range dependence of the exchange rate data and estimation of the Hurst parameter (H) are discussed in Section IV, while a conclusion is discussed in Section V.

Keywords: Fractional Brownian motion, long-rangedependence, memory, short-range dependence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1869
7243 Meta Random Forests

Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti

Abstract:

Leo Breimans Random Forests (RF) is a recent development in tree based classifiers and quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved results of classifications on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques to the random forests. We experiment the working of the ensembles of random forests on the standard data sets available in UCI data sets. We compare the original random forest algorithm with their ensemble counterparts and discuss the results.

Keywords: Random Forests [RF], ensembles, UCI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2683
7242 Renewable Energy Trends Analysis: A Patents Study

Authors: Sepulveda Juan

Abstract:

This article explains the elements and considerations taken into account when implementing and applying patent evaluation and scientometric study in the identifications of technology trends, and the tools that led to the implementation of a software application for patent revision. Univariate analysis helped recognize the technological leaders in the field of energy, and steered the way for a multivariate analysis of this sample, which allowed for a graphical description of the techniques of mature technologies, as well as the detection of emerging technologies. This article ends with a validation of the methodology as applied to the case of fuel cells.

Keywords: Energy, technology mapping, patents.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2165
7241 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of time series data with normal distribution from the inflow into wastewater treatment plant data, composed of several groups differing by mean value. Two simple algorithms, K-mean and EM, were chosen as a clustering method. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was performed for each subgroups. The final model was a sum of the subgroups models. The quality of the obtained model was compared with the regression model made using the same explanatory variables, but with no clustering of data. Results were compared using determination coefficient (R2), measure of prediction accuracy- mean absolute percentage error (MAPE) and comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.

Keywords: Clustering, Data analysis, Data mining, Predictive models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1935
7240 Studies on Determination of the Optimum Distance Between the Tmotes for Optimum Data Transfer in a Network with WLL Capability

Authors: N C Santhosh Kumar, N K Kishore

Abstract:

Using mini modules of Tmotes, it is possible to automate a small personal area network. This idea can be extended to large networks too by implementing multi-hop routing. Linking the various Tmotes using Programming languages like Nesc, Java and having transmitter and receiver sections, a network can be monitored. It is foreseen that, depending on the application, a long range at a low data transfer rate or average throughput may be an acceptable trade-off. To reduce the overall costs involved, an optimum number of Tmotes to be used under various conditions (Indoor/Outdoor) is to be deduced. By analyzing the data rates or throughputs at various locations of Tmotes, it is possible to deduce an optimal number of Tmotes for a specific network. This paper deals with the determination of optimum distances to reduce the cost and increase the reliability of the entire sensor network with Wireless Local Loop (WLL) capability.

Keywords: Average throughput, data rate, multi-hop routing, optimum data transfer, throughput, Tmotes, wireless local loop.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1344