Search results for: Multiple data sources
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9061

Search results for: Multiple data sources

7561 Prediction on Housing Price Based on Deep Learning

Authors: Li Yu, Chenlu Jiao, Hongrun Xin, Yan Wang, Kaiyang Wang

Abstract:

In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning to determine the existing data of the real estate in order to more accurately predict the housing price or its changing trend in the future. Considering that the factors which affect the housing price vary widely, the proposed prediction models include two categories. The first one is based on multiple characteristic factors of the real estate. We built Convolution Neural Network (CNN) prediction model and Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and logical regression model was implemented to make a comparison between these three models. Another prediction model is time series model. Based on deep learning, we proposed an LSTM-1 model purely regard to time series, then implementing and comparing the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, comprehensive study of the second-hand housing price in Beijing has been conducted from three aspects: crawling and analyzing, housing price predicting, and the result comparing. Ultimately the best model program was produced, which is of great significance to evaluation and prediction of the housing price in the real estate industry.

Keywords: Deep learning, convolutional neural network, LSTM, housing prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4962
7560 A Model for Managing Intellectual Property, Commercialisation and Technology Transfer within a Collaborative Research Environment

Authors: J. F. Arthur, R. M. Hodge

Abstract:

The Defence Materials Technology Centre has evolved from the Australian Cooperative Research Centres Program. The Centre receives funding from Government, industry and research sources to fund collaborative research within its participant organisations. The research centre is structured as a company with a small administrative staff and plays the role of the “honest broker” within the collaboration. A corporate culture has been established that is pervasive into the research projects are undertaken. The model is an effective mechanism to deliver outcomes to each of the participant stakeholders.

Keywords: Collaboration, Research Centre.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665
7559 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

Keywords: Independent topic analysis, topic extraction, topic naming, web search engine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 477
7558 An Anomaly Detection Approach to Detect Unexpected Faults in Recordings from Test Drives

Authors: Andreas Theissler, Ian Dear

Abstract:

In the automotive industry test drives are being conducted during the development of new vehicle models or as a part of quality assurance of series-production vehicles. The communication on the in-vehicle network, data from external sensors, or internal data from the electronic control units is recorded by automotive data loggers during the test drives. The recordings are used for fault analysis. Since the resulting data volume is tremendous, manually analysing each recording in great detail is not feasible. This paper proposes to use machine learning to support domainexperts by preventing them from contemplating irrelevant data and rather pointing them to the relevant parts in the recordings. The underlying idea is to learn the normal behaviour from available recordings, i.e. a training set, and then to autonomously detect unexpected deviations and report them as anomalies. The one-class support vector machine “support vector data description” is utilised to calculate distances of feature vectors. SVDDSUBSEQ is proposed as a novel approach, allowing to classify subsequences in multivariate time series data. The approach allows to detect unexpected faults without modelling effort as is shown with experimental results on recordings from test drives.

Keywords: Anomaly detection, fault detection, test drive analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2459
7557 Decision Tree for Competing Risks Survival Probability in Breast Cancer Study

Authors: N. A. Ibrahim, A. Kudus, I. Daud, M. R. Abu Bakar

Abstract:

Competing risks survival data that comprises of more than one type of event has been used in many applications, and one of these is in clinical study (e.g. in breast cancer study). The decision tree method can be extended to competing risks survival data by modifying the split function so as to accommodate two or more risks which might be dependent on each other. Recently, researchers have constructed some decision trees for recurrent survival time data using frailty and marginal modelling. We further extended the method for the case of competing risks. In this paper, we developed the decision tree method for competing risks survival time data based on proportional hazards for subdistribution of competing risks. In particular, we grow a tree by using deviance statistic. The application of breast cancer data is presented. Finally, to investigate the performance of the proposed method, simulation studies on identification of true group of observations were executed.

Keywords: Competing risks, Decision tree, Simulation, Subdistribution Proportional Hazard.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2362
7556 Spatially Random Sampling for Retail Food Risk Factors Study

Authors: Guilan Huang

Abstract:

In 2013 and 2014, the U.S. Food and Drug Administration (FDA) collected data from selected fast food restaurants and full service restaurants for tracking changes in the occurrence of foodborne illness risk factors. This paper discussed how we customized spatial random sampling method by considering financial position and availability of FDA resources, and how we enriched restaurants data with location. Location information of restaurants provides opportunity for quantitatively determining random sampling within non-government units (e.g.: 240 kilometers around each data-collector). Spatial analysis also could optimize data-collectors’ work plans and resource allocation. Spatial analytic and processing platform helped us handling the spatial random sampling challenges. Our method fits in FDA’s ability to pinpoint features of foodservice establishments, and reduced both time and expense on data collection.

Keywords: Geospatial technology, restaurant, retail food risk factors study, spatial random sampling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1442
7555 Anisotropic Total Fractional Order Variation Model in Seismic Data Denoising

Authors: Jianwei Ma, Diriba Gemechu

Abstract:

In seismic data processing, attenuation of random noise is the basic step to improve quality of data for further application of seismic data in exploration and development in different gas and oil industries. The signal-to-noise ratio of the data also highly determines quality of seismic data. This factor affects the reliability as well as the accuracy of seismic signal during interpretation for different purposes in different companies. To use seismic data for further application and interpretation, we need to improve the signal-to-noise ration while attenuating random noise effectively. To improve the signal-to-noise ration and attenuating seismic random noise by preserving important features and information about seismic signals, we introduce the concept of anisotropic total fractional order denoising algorithm. The anisotropic total fractional order variation model defined in fractional order bounded variation is proposed as a regularization in seismic denoising. The split Bregman algorithm is employed to solve the minimization problem of the anisotropic total fractional order variation model and the corresponding denoising algorithm for the proposed method is derived. We test the effectiveness of theproposed method for synthetic and real seismic data sets and the denoised result is compared with F-X deconvolution and non-local means denoising algorithm.

Keywords: Anisotropic total fractional order variation, fractional order bounded variation, seismic random noise attenuation, Split Bregman Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 993
7554 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, spatial analysis, etc. Laser scanning system generates irregularly spaced three-dimensional cloud of points. Raw ALS data are mainly ground points (that represent the bare earth) and non-ground points (that represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a difficult and challenging task as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter utilizes a weight function to allocate weights for each point of the data. Furthermore, unlike most of the methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the generated DTMs from the filtered data are accurate (when compared against reference terrain data) and the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.

Keywords: Airborne laser scanning, digital terrain models, filtering, forested areas.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 699
7553 Multidimensional Performance Management

Authors: David Wiese

Abstract:

In order to maximize efficiency of an information management platform and to assist in decision making, the collection, storage and analysis of performance-relevant data has become of fundamental importance. This paper addresses the merits and drawbacks provided by the OLAP paradigm for efficiently navigating large volumes of performance measurement data hierarchically. The system managers or database administrators navigate through adequately (re)structured measurement data aiming to detect performance bottlenecks, identify causes for performance problems or assessing the impact of configuration changes on the system and its representative metrics. Of particular importance is finding the root cause of an imminent problem, threatening availability and performance of an information system. Leveraging OLAP techniques, in contrast to traditional static reporting, this is supposed to be accomplished within moderate amount of time and little processing complexity. It is shown how OLAP techniques can help improve understandability and manageability of measurement data and, hence, improve the whole Performance Analysis process.

Keywords: Data Warehousing, OLAP, Multidimensional Navigation, Performance Diagnosis, Performance Management, Performance Tuning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2117
7552 A New Algorithm for Enhanced Robustness of Copyright Mark

Authors: Harsh Vikram Singh, S. P. Singh, Anand Mohan

Abstract:

This paper discusses a new heavy tailed distribution based data hiding into discrete cosine transform (DCT) coefficients of image, which provides statistical security as well as robustness against steganalysis attacks. Unlike other data hiding algorithms, the proposed technique does not introduce much effect in the stegoimage-s DCT coefficient probability plots, thus making the presence of hidden data statistically undetectable. In addition the proposed method does not compromise on hiding capacity. When compared to the generic block DCT based data-hiding scheme, our method found more robust against a variety of image manipulating attacks such as filtering, blurring, JPEG compression etc.

Keywords: Information Security, Robust Steganography, Steganalysis, Pareto Probability Distribution function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784
7551 Liveability of Kuala Lumpur City Centre: An Evaluation of the Happiness Level of the Streets- Activities

Authors: Shuhana Shamsuddin, Nur Rasyiqah Abu Hassan, Ahmad Bashri Sulaiman

Abstract:

Liveable city is referred to as the quality of life in an area that contributes towards a safe, healthy and enjoyable place. This paper discusses the role of the streets- activities in making Kuala Lumpur a liveable city and the happiness level of the residents towards the city-s street activities. The study was conducted using the residents of Kuala Lumpur. A mixed method technique is used with the quantitative data as a main data and supported by the qualitative data. Data were collected using questionnaires, observation and also an interview session with a sample of residents of Kuala Lumpur. The sampling technique is based on multistage cluster data sampling. The findings revealed that, there is still no significant relationship between the length of stay of the resident in Kuala Lumpur with the happiness level towards the street activities that occurred in the city.

Keywords: Liveable city, activities, urban design quality, quality of life, happiness level.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2869
7550 Surrogate based Evolutionary Algorithm for Design Optimization

Authors: Maumita Bhattacharya

Abstract:

Optimization is often a critical issue for most system design problems. Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, finding optimal solution to complex high dimensional, multimodal problems often require highly computationally expensive function evaluations and hence are practically prohibitive. The Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model presented in our earlier work [14] reduced computation time by controlled use of meta-models to partially replace the actual function evaluation by approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the meta-model are generated from a single uniform model. Situations like model formation involving variable input dimensions and noisy data certainly can not be covered by this assumption. In this paper we present an enhanced version of DAFHEA that incorporates a multiple-model based learning approach for the SVM approximator. DAFHEA-II (the enhanced version of the DAFHEA framework) also overcomes the high computational expense involved with additional clustering requirements of the original DAFHEA framework. The proposed framework has been tested on several benchmark functions and the empirical results illustrate the advantages of the proposed technique.

Keywords: Evolutionary algorithm, Fitness function, Optimization, Meta-model, Stochastic method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1564
7549 Catalytic Gasification of Olive Mill Wastewater as a Biomass Source under Supercritical Conditions

Authors: Ekin Kıpçak, Mesut Akgün

Abstract:

Recently, a growing interest has emerged on the development of new and efficient energy sources, due to the inevitable extinction of the nonrenewable energy reserves. One of these alternative sources which have a great potential and sustainability to meet up the energy demand is biomass energy. This significant energy source can be utilized with various energy conversion technologies, one of which is biomass gasification in supercritical water.

Water, being the most important solvent in nature, has very important characteristics as a reaction solvent under supercritical circumstances. At temperatures above its critical point (374.8oC and 22.1MPa), water becomes more acidic and its diffusivity increases. Working with water at high temperatures increases the thermal reaction rate, which in consequence leads to a better dissolving of the organic matters and a fast reaction with oxygen. Hence, supercritical water offers a control mechanism depending on solubility, excellent transport properties based on its high diffusion ability and new reaction possibilities for hydrolysis or oxidation.

In this study the gasification of a real biomass, namely olive mill wastewater (OMW), in supercritical water conditions is investigated with the use of Ru/Al2O3 catalyst. OMW is a by-product obtained during olive oil production, which has a complex nature characterized by a high content of organic compounds and polyphenols. These properties impose OMW a significant pollution potential, but at the same time, the high content of organics makes OMW a desirable biomass candidate for energy production.

The catalytic gasification experiments were made with five different reaction temperatures (400, 450, 500, 550 and 600°C) and five reaction times (30, 60, 90, 120 and 150s), under a constant pressure of 25MPa. Through these experiments, the effects of reaction temperature and time on the gasification yield, gaseous product composition and OMW treatment efficiency were investigated.

Keywords: Catalyst, Gasification, Olive mill wastewater, Ru/Al2O3, Supercritical water.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2261
7548 Cloud Computing Support for Diagnosing Researches

Authors: A. Amirov, O. Gerget, V. Kochegurov

Abstract:

One of the main biomedical problem lies in detecting dependencies in semi structured data. Solution includes biomedical portal and algorithms (integral rating health criteria, multidimensional data visualization methods). Biomedical portal allows to process diagnostic and research data in parallel mode using Microsoft System Center 2012, Windows HPC Server cloud technologies. Service does not allow user to see internal calculations instead it provides practical interface. When data is sent for processing user may track status of task and will achieve results as soon as computation is completed. Service includes own algorithms and allows diagnosing and predicating medical cases. Approved methods are based on complex system entropy methods, algorithms for determining the energy patterns of development and trajectory models of biological systems and logical–probabilistic approach with the blurring of images.

Keywords: Biomedical portal, cloud computing, diagnostic and prognostic research, mathematical data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1612
7547 Robot Operating System-Based SLAM for a Gazebo-Simulated Turtlebot2 in 2d Indoor Environment with Cartographer Algorithm

Authors: Wilayat Ali, Li Sheng, Waleed Ahmed

Abstract:

The ability of the robot to make simultaneously map of the environment and localize itself with respect to that environment is the most important element of mobile robots. To solve SLAM many algorithms could be utilized to build up the SLAM process and SLAM is a developing area in Robotics research. Robot Operating System (ROS) is one of the frameworks which provide multiple algorithm nodes to work with and provide a transmission layer to robots. Manyof these algorithms extensively in use are Hector SLAM, Gmapping and Cartographer SLAM. This paper describes a ROS-based Simultaneous localization and mapping (SLAM) library Google Cartographer mapping, which is open-source algorithm. The algorithm was applied to create a map using laser and pose data from 2d Lidar that was placed on a mobile robot. The model robot uses the gazebo package and simulated in Rviz. Our research work's primary goal is to obtain mapping through Cartographer SLAM algorithm in a static indoor environment. From our research, it is shown that for indoor environments cartographer is an applicable algorithm to generate 2d maps with LIDAR placed on mobile robot because it uses both odometry and poses estimation. The algorithm has been evaluated and maps are constructed against the SLAM algorithms presented by Turtlebot2 in the static indoor environment.

Keywords: SLAM, ROS, navigation, localization and mapping, Gazebo, Rviz, Turtlebot2, SLAM algorithms, 2d Indoor environment, Cartographer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1195
7546 Estimating the Flow Velocity Using Flow Generated Sound

Authors: Saeed Hosseini, Ali Reza Tahavvor

Abstract:

Sound processing is one the subjects that newly attracts a lot of researchers. It is efficient and usually less expensive than other methods. In this paper the flow generated sound is used to estimate the flow speed of free flows. Many sound samples are gathered. After analyzing the data, a parameter named wave power is chosen. For all samples the wave power is calculated and averaged for each flow speed. A curve is fitted to the averaged data and a correlation between the wave power and flow speed is found. Test data are used to validate the method and errors for all test data were under 10 percent. The speed of the flow can be estimated by calculating the wave power of the flow generated sound and using the proposed correlation.

Keywords: Flow generated sound, sound processing, speed, wave power.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2355
7545 Exponential Particle Swarm Optimization Approach for Improving Data Clustering

Authors: Neveen I. Ghali, Nahed El-Dessouki, Mervat A. N., Lamiaa Bakrawi

Abstract:

In this paper we use exponential particle swarm optimization (EPSO) to cluster data. Then we compare between (EPSO) clustering algorithm which depends on exponential variation for the inertia weight and particle swarm optimization (PSO) clustering algorithm which depends on linear inertia weight. This comparison is evaluated on five data sets. The experimental results show that EPSO clustering algorithm increases the possibility to find the optimal positions as it decrease the number of failure. Also show that (EPSO) clustering algorithm has a smaller quantization error than (PSO) clustering algorithm, i.e. (EPSO) clustering algorithm more accurate than (PSO) clustering algorithm.

Keywords: Particle swarm optimization, data clustering, exponential PSO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1669
7544 Development and Evaluation of a Portable Ammonia Gas Detector

Authors: Jaheon Gu, Wooyong Chung, Mijung Koo, Seonbok Lee, Gyoutae Park, Sangguk Ahn, Hiesik Kim, Jungil Park

Abstract:

In this paper, we present a portable ammonia gas detector for performing the gas safety management efficiently. The display of the detector is separated from its body. The display module is received the data measured from the detector using ZigBee. The detector has a rechargeable li-ion battery which can be use for 11~12 hours, and a Bluetooth module for sending the data to the PC or the smart devices. The data are sent to the server and can access using the web browser or mobile application. The range of the detection concentration is 0~100ppm.

Keywords: Ammonia, detector, gas safety, portable.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1521
7543 LAYMOD; A Layered and Modular Platform for CAx Collaboration Management and Supporting Product data Integration based on STEP Standard

Authors: Omid F. Valilai, Mahmoud Houshmand

Abstract:

Nowadays companies strive to survive in a competitive global environment. To speed up product development/modifications, it is suggested to adopt a collaborative product development approach. However, despite the advantages of new IT improvements still many CAx systems work separately and locally. Collaborative design and manufacture requires a product information model that supports related CAx product data models. To solve this problem many solutions are proposed, which the most successful one is adopting the STEP standard as a product data model to develop a collaborative CAx platform. However, the improvement of the STEP-s Application Protocols (APs) over the time, huge number of STEP AP-s and cc-s, the high costs of implementation, costly process for conversion of older CAx software files to the STEP neutral file format; and lack of STEP knowledge, that usually slows down the implementation of the STEP standard in collaborative data exchange, management and integration should be considered. In this paper the requirements for a successful collaborative CAx system is discussed. The STEP standard capability for product data integration and its shortcomings as well as the dominant platforms for supporting CAx collaboration management and product data integration are reviewed. Finally a platform named LAYMOD to fulfil the requirements of CAx collaborative environment and integrating the product data is proposed. The platform is a layered platform to enable global collaboration among different CAx software packages/developers. It also adopts the STEP modular architecture and the XML data structures to enable collaboration between CAx software packages as well as overcoming the STEP standard limitations. The architecture and procedures of LAYMOD platform to manage collaboration and avoid contradicts in product data integration are introduced.

Keywords: CAx, Collaboration management, STEP applicationmodules, STEP standard, XML data structures

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2203
7542 Grocery Customer Behavior Analysis using RFID-based Shopping Paths Data

Authors: In-Chul Jung, Young S. Kwon

Abstract:

Knowing about the customer behavior in a grocery has been a long-standing issue in the retailing industry. The advent of RFID has made it easier to collect moving data for an individual shopper's behavior. Most of the previous studies used the traditional statistical clustering technique to find the major characteristics of customer behavior, especially shopping path. However, in using the clustering technique, due to various spatial constraints in the store, standard clustering methods are not feasible because moving data such as the shopping path should be adjusted in advance of the analysis, which is time-consuming and causes data distortion. To alleviate this problem, we propose a new approach to spatial pattern clustering based on the longest common subsequence. Experimental results using real data obtained from a grocery confirm the good performance of the proposed method in finding the hot spot, dead spot and major path patterns of customer movements.

Keywords: customer path, shopping behavior, exploratoryanalysis, LCS, RFID

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3128
7541 Soft-Sensor for Estimation of Gasoline Octane Number in Platforming Processes with Adaptive Neuro-Fuzzy Inference Systems (ANFIS)

Authors: Hamed.Vezvaei, Sepideh.Ordibeheshti, Mehdi.Ardjmand

Abstract:

Gasoline Octane Number is the standard measure of the anti-knock properties of a motor in platforming processes, that is one of the important unit operations for oil refineries and can be determined with online measurement or use CFR (Cooperative Fuel Research) engines. Online measurements of the Octane number can be done using direct octane number analyzers, that it is too expensive, so we have to find feasible analyzer, like ANFIS estimators. ANFIS is the systems that neural network incorporated in fuzzy systems, using data automatically by learning algorithms of NNs. ANFIS constructs an input-output mapping based both on human knowledge and on generated input-output data pairs. In this research, 31 industrial data sets are used (21 data for training and the rest of the data used for generalization). Results show that, according to this simulation, hybrid method training algorithm in ANFIS has good agreements between industrial data and simulated results.

Keywords: Adaptive Neuro-Fuzzy Inference Systems, GasolineOctane Number, Soft-sensor, Catalytic Naphtha Reforming

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2172
7540 Tool for Metadata Extraction and Content Packaging as Endorsed in OAIS Framework

Authors: Payal Abichandani, Rishi Prakash, Paras Nath Barwal, B. K. Murthy

Abstract:

Information generated from various computerization processes is a potential rich source of knowledge for its designated community. To pass this information from generation to generation without modifying the meaning is a challenging activity. To preserve and archive the data for future generations it’s very essential to prove the authenticity of the data. It can be achieved by extracting the metadata from the data which can prove the authenticity and create trust on the archived data. Subsequent challenge is the technology obsolescence. Metadata extraction and standardization can be effectively used to resolve and tackle this problem. Metadata can be categorized at two levels i.e. Technical and Domain level broadly. Technical metadata will provide the information that can be used to understand and interpret the data record, but only this level of metadata isn’t sufficient to create trustworthiness. We have developed a tool which will extract and standardize the technical as well as domain level metadata. This paper is about the different features of the tool and how we have developed this.  

Keywords: Digital Preservation, Metadata, OAIS, PDI, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805
7539 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: T. Niknam, M. Nayeripour, B.Bahmani Firouzi

Abstract:

Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points, which are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: dependency on the initial state and convergence to local optima and global solutions of large problems cannot found with reasonable amount of computation effort. In order to overcome local optima problem lots of studies done in clustering. This paper is presented an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N object into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handing data clustering.

Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2184
7538 Analysis of DNA Microarray Data using Association Rules: A Selective Study

Authors: M. Anandhavalli Gauthaman

Abstract:

DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. It is very much important to extract biologically meaningful information from this huge amount of expression data to know the current state of the cell because most cellular processes are regulated by changes in gene expression. Association rule mining techniques are helpful to find association relationship between genes. Numerous association rule mining algorithms have been developed to analyze and associate this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.

Keywords: DNA microarray, gene expression, association rule mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2128
7537 Prospects, Problems of Marketing Research and Data Mining in Turkey

Authors: Sema Kurtuluş, Kemal Kurtuluş

Abstract:

The objective of this paper is to review and assess the methodological issues and problems in marketing research, data and knowledge mining in Turkey. As a summary, academic marketing research publications in Turkey have significant problems. The most vital problem seems to be related with modeling. Most of the publications had major weaknesses in modeling. There were also, serious problems regarding measurement and scaling, sampling and analyses. Analyses myopia seems to be the most important problem for young academia in Turkey. Another very important finding is the lack of publications on data and knowledge mining in the academic world.

Keywords: Marketing research, data mining, knowledge mining, research modeling, analyses.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1952
7536 Fuzzy Logic Based Improved Range Free Localization for Wireless Sensor Networks

Authors: Ashok Kumar, Vinod Kumar

Abstract:

Wireless Sensor Networks (WSNs) are used to monitor/observe vast inaccessible regions through deployment of large number of sensor nodes in the sensing area. For majority of WSN applications, the collected data needs to be combined with geographic information of its origin to make it useful for the user; information received from remote Sensor Nodes (SNs) that are several hops away from base station/sink is meaningless without knowledge of its source. In addition to this, location information of SNs can also be used to propose/develop new network protocols for WSNs to improve their energy efficiency and lifetime. In this paper, range free localization protocols for WSNs have been proposed. The proposed protocols are based on weighted centroid localization technique, where the edge weights of SNs are decided by utilizing fuzzy logic inference for received signal strength and link quality between the nodes. The fuzzification is carried out using (i) Mamdani, (ii) Sugeno, and (iii) Combined Mamdani Sugeno fuzzy logic inference. Simulation results demonstrate that proposed protocols provide better accuracy in node localization compared to conventional centroid based localization protocols despite presence of unintentional radio frequency interference from radio frequency (RF) sources operating in same frequency band.

Keywords: localization, range free, received signal strength, link quality indicator, Mamdani fuzzy logic inference, Sugeno fuzzy logic inference.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2617
7535 Analysis and Comparison of Image Encryption Algorithms

Authors: İsmet Öztürk, İbrahim Soğukpınar

Abstract:

With the fast progression of data exchange in electronic way, information security is becoming more important in data storage and transmission. Because of widely using images in industrial process, it is important to protect the confidential image data from unauthorized access. In this paper, we analyzed current image encryption algorithms and compression is added for two of them (Mirror-like image encryption and Visual Cryptography). Implementations of these two algorithms have been realized for experimental purposes. The results of analysis are given in this paper.

Keywords: image encryption, image cryptosystem, security, transmission

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4939
7534 Risk Classification of SMEs by Early Warning Model Based on Data Mining

Authors: Nermin Ozgulbas, Ali Serhan Koyuncugil

Abstract:

One of the biggest problems of SMEs is their tendencies to financial distress because of insufficient finance background. In this study, an Early Warning System (EWS) model based on data mining for financial risk detection is presented. CHAID algorithm has been used for development of the EWS. Developed EWS can be served like a tailor made financial advisor in decision making process of the firms with its automated nature to the ones who have inadequate financial background. Besides, an application of the model implemented which covered 7,853 SMEs based on Turkish Central Bank (TCB) 2007 data. By using EWS model, 31 risk profiles, 15 risk indicators, 2 early warning signals, and 4 financial road maps has been determined for financial risk mitigation.

Keywords: Early Warning Systems, Data Mining, Financial Risk, SMEs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3366
7533 Using Data from Foursquare Web Service to Represent the Commercial Activity of a City

Authors: Taras Agryzkov, Almudena Nolasco-Cirugeda, Jos´e L. Oliver, Leticia Serrano-Estrada, Leandro Tortosa, Jos´e F. Vicent

Abstract:

This paper aims to represent the commercial activity of a city taking as source data the social network Foursquare. The city of Murcia is selected as case study, and the location-based social network Foursquare is the main source of information. After carrying out a reorganisation of the user-generated data extracted from Foursquare, it is possible to graphically display on a map the various city spaces and venues especially those related to commercial, food and entertainment sector businesses. The obtained visualisation provides information about activity patterns in the city of Murcia according to the people‘s interests and preferences and, moreover, interesting facts about certain characteristics of the town itself.

Keywords: Social networks, Foursquare, spatial analysis, data visualization, geocomputation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2663
7532 Long-Range Dependence of Financial Time Series Data

Authors: Chatchai Pesee

Abstract:

This paper examines long-range dependence or longmemory of financial time series on the exchange rate data by the fractional Brownian motion (fBm). The principle of spectral density function in Section 2 is used to find the range of Hurst parameter (H) of the fBm. If 0< H <1/2, then it has a short-range dependence (SRD). It simulates long-memory or long-range dependence (LRD) if 1/2< H <1. The curve of exchange rate data is fBm because of the specific appearance of the Hurst parameter (H). Furthermore, some of the definitions of the fBm, long-range dependence and selfsimilarity are reviewed in Section II as well. Our results indicate that there exists a long-memory or a long-range dependence (LRD) for the exchange rate data in section III. Long-range dependence of the exchange rate data and estimation of the Hurst parameter (H) are discussed in Section IV, while a conclusion is discussed in Section V.

Keywords: Fractional Brownian motion, long-rangedependence, memory, short-range dependence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1869