Search results for: Spectral data
6724 CSOLAP (Continuous Spatial On-Line Analytical Processing)
Authors: Taher Omran Ahmed, Abdullatif Mihdi Buras
Abstract:
Decision support systems are usually based on multidimensional structures which use the concept of a hypercube. Dimensions are the axes on which facts are analyzed and form a space where a fact is located by a set of coordinates at the intersections of members of dimensions. Conventional multidimensional structures deal with discrete facts linked to discrete dimensions. However, when dealing with natural continuous phenomena the discrete representation is not adequate. There is a need to integrate spatiotemporal continuity within multidimensional structures to enable analysis and exploration of continuous field data. Research issues that lead to the integration of spatiotemporal continuity in multidimensional structures are numerous. In this paper, we discuss research issues related to the integration of continuity in multidimensional structures and briefly present a multidimensional model for continuous field data. We also define new aggregation operations. The model and the associated operations and measures are validated by a prototype.
Keywords: Continuous Data, Data warehousing, Decision Support, SOLAP
Downloads: 1595
6723 A Study of Behavioral Phenomena Using ANN
Authors: Yudhajit Datta
Abstract:
Behavioral aspects of experience such as will power are rarely subjected to quantitative study owing to the numerous complexities involved. Will is a phenomenon that has puzzled humanity for a long time. It is widely believed that the will power of an individual affects the success they achieve in life. It is also thought that a person endowed with great will power can overcome even the most crippling setbacks in life, while a person with a weak will cannot make the most of even the greatest assets. This study is an attempt to subject the phenomenon of will to the test of an artificial neural network through a computational model. The claim being tested is that the will power of an individual largely determines the success achieved in life. It is proposed that data pertaining to the success of individuals be obtained from an experiment and that the phenomenon of will be incorporated into the model through data generated recursively using a relation between will and success characteristic of the model. An artificial neural network trained using part of the data could subsequently be used to make predictions regarding data points in the rest of the model. The procedure would be tried for different models, and the model where the network's predictions are found to be in greatest agreement with the data would be selected and used for studying the relation between success and will.
Keywords: Will Power, Success, ANN, Time Series Prediction, Sliding Window, Computational Model, Behavioral Phenomena.
Downloads: 1930
6722 Standard Languages for Creating a Database to Display Financial Statements on a Web Application
Authors: Vladimir Simovic, Matija Varga, Predrag Oreski
Abstract:
XHTML and XBRL are the standard languages for creating a database for the purpose of displaying financial statements on web applications. Today, XBRL is one of the most popular languages for business reporting. A large number of countries in the world recognize the role of the XBRL language in financial reporting and the benefits that the reporting format provides in the collection, analysis, preparation, publication and exchange of data (information), which is the positive side of this language. Here we present the advantages and opportunities that a company may have by using the XBRL format for business reporting. This paper also presents XBRL and other languages that are used for creating the database, such as XML, XHTML, etc. The role of the AJAX complex model and technology during the exchange of financial data between the web client and web server is explained in detail. The basic layers of the network for data exchange via the web are also mentioned.
Keywords: XHTML, XBRL, XML, JavaScript, AJAX technology, data exchange.
Downloads: 1070
6721 Survey on Image Mining Using Genetic Algorithm
Authors: Jyoti Dua
Abstract:
One image is worth more than a thousand words. Images, if analyzed, can reveal useful information. Low-level image processing deals with the extraction of specific features from a single image. Now the question arises: what technique should be used to extract patterns from a very large and detailed image database? The answer is "Image Mining". Image Mining deals with the extraction of image data relationships, implicit knowledge, and other patterns from a collection of images or an image database. It is an extension of Data Mining. In this paper, we not only scrutinize the current techniques of image mining but also present a new technique for mining images using a Genetic Algorithm.
Keywords: Image Mining, Data Mining, Genetic Algorithm.
Downloads: 2445
6720 Landscape Visual Classification Using Land use and Contour Data for Tourism and Planning Decision Making in Cameron Highlands District
Authors: Hosni, N., Shinozaki, M.
Abstract:
Cameron Highlands (CH) is known as an upland tourism area with vast natural wealth and a mountainous landscape endowed with rich, diverse species as well as people's traditions and cultures. With these various resources, CH possesses interesting visuals and panoramas that can be offered to tourists. However, this benefit may not be utilized without an understanding of the existing landscape structure and visuals. Given limited data, this paper attempts to classify the landscape visuals of Cameron Highlands using land use and contour data. Visual points of view were determined from the tourist attraction points given in the CH Local Plan 2003-2015. The result shows the landscape visual and structure categories offered in the study area. The result can be used for further analysis to determine the best alternative tourist trails for tourism planning and decision making using readily available data.
Keywords: Visibility, landscape visual, urban planning, GIS
Downloads: 2373
6719 Sampling of Variables in Discrete-Event Simulation using the Example of Inventory Evolutions in Job-Shop-Systems Based on Deterministic and Non-Deterministic Data
Authors: Bernd Scholz-Reiter, Christian Toonen, Jan Topi Tervo, Dennis Lappe
Abstract:
Time series analysis often requires data that represents the evolution of an observed variable in equidistant time steps. In order to collect this data, sampling is applied. While continuous signals may be sampled, analyzed and reconstructed by applying Shannon's sampling theorem, time-discrete signals have to be dealt with differently. In this article we consider the discrete-event simulation (DES) of job-shop-systems and study the effects of different sampling rates on data quality regarding completeness and accuracy of reconstructed inventory evolutions. In doing so, we discuss deterministic as well as non-deterministic behavior of system variables. Error curves are deployed to illustrate and discuss the sampling rate's impact and to derive recommendations for its well-founded choice.
Keywords: discrete-event simulation, job-shop-system, sampling rate.
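As a rough illustration of the sampling question raised in this abstract, the sketch below samples a piecewise-constant inventory trace at equidistant times using a zero-order hold. The event list format, the function name `sample_inventory`, and the example numbers are assumptions made for illustration; they are not taken from the paper.

```python
from bisect import bisect_right


def sample_inventory(events, start, stop, step, initial=0):
    """Sample a piecewise-constant inventory level at equidistant times.

    events: (time, level) pairs sorted by time; each level holds until the
    next event (zero-order hold). This input format is hypothetical.
    """
    times = [t for t, _ in events]
    samples = []
    k = 0
    t = start
    while t <= stop:
        idx = bisect_right(times, t)  # number of events at or before t
        level = events[idx - 1][1] if idx else initial
        samples.append((t, level))
        k += 1
        t = start + k * step  # avoid accumulating floating-point drift
    return samples


# Hypothetical event trace: the level 7 only exists between t=1.2 and t=1.7.
events = [(0.0, 5), (1.2, 7), (1.7, 4), (3.5, 6)]
coarse = sample_inventory(events, 0.0, 4.0, 2.0)  # misses the level 7
fine = sample_inventory(events, 0.0, 4.0, 0.5)    # captures it at t=1.5
```

With the coarse step the short-lived inventory level never appears in the sampled series, which is the kind of completeness loss the abstract refers to.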
Downloads: 1827
6718 School Design and Energy Efficiency
Authors: B. Su
Abstract:
Auckland has a temperate climate with comfortable warm, dry summers and mild, wet winters. An Auckland school normally does not need air conditioning for cooling during the summer and only needs heating during the winter. Space heating energy is the major portion of winter school energy consumption, and winter energy consumption is the major portion of annual school energy consumption. School building thermal design should focus on winter thermal performance to reduce space heating energy. Design data and energy consumption data from a number of Auckland schools are used for this study. This pilot study investigates the relationships between their energy consumption data and school building design data to improve future school design for energy efficiency.
Keywords: Building energy efficiency, building thermal performance, school building design, school energy consumption
Downloads: 1883
6717 A Thought on Exotic Statistical Distributions
Authors: R K Sinha
Abstract:
Statistical distributions are modeled to explain the nature of various types of data sets. Although these distributions are mostly uni-modal, it is quite common to see multiple modes in the observed distribution of the underlying variables, which makes precise modeling unrealistic. The lack of smoothness in observed data is not necessarily due to randomness, but could also be due to non-randomness resulting in zigzag curves, oscillations, humps, etc. The present paper argues that trigonometric functions, which have not been used in probability functions of distributions so far, have the potential to take care of this if incorporated in the distribution appropriately. A simple distribution involving trigonometric functions (named the Sinoform Distribution) is illustrated in the paper with a data set. The paper demonstrates the importance of trigonometric functions, which have the characteristics to make statistical distributions exotic. It is possible to have multiple modes, oscillations and zigzag curves in the density, which could be suitable to explain the underlying nature of select data sets.
Keywords: Exotic Statistical Distributions, Kurtosis, Mixture Distributions, Multi-modal
Downloads: 1626
6716 Data and Spatial Analysis for Economy and Education of 28 E.U. Member-States for 2014
Authors: Alexiou Dimitra, Fragkaki Maria
Abstract:
The objective of the paper is the study of geographic, economic and educational variables and their contribution to determining the position of each member-state among the EU-28 countries, based on the values of seven variables as given by Eurostat. The Data Analysis methods of Multiple Factorial Correspondence Analysis (MFCA), Principal Component Analysis and Factor Analysis have been used. The cross tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are manipulated using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison, the Factor procedure of the statistical package IBM SPSS 20 has been applied to the same data. The numerical and graphical results, presented with tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.
Keywords: Multiple factorial correspondence analysis, principal component analysis, factor analysis, E.U.-28 countries, statistical package IBM SPSS 20, CHIC Analysis V 1.1 Software, Eurostat.eu statistics.
Downloads: 1085
6715 Comparison between Higher-Order SVD and Third-order Orthogonal Tensor Product Expansion
Authors: Chiharu Okuma, Jun Murakami, Naoki Yamamoto
Abstract:
In digital signal processing it is important to approximate multi-dimensional data by the method called rank reduction, in which we reduce the rank of multi-dimensional data from higher to lower. For 2-dimensional data, singular value decomposition (SVD) is one of the best-known rank reduction techniques. Additionally, outer product expansion extended from SVD was proposed and implemented for multi-dimensional data, and has been widely applied to image processing and pattern recognition. However, the multi-dimensional outer product expansion has high computational complexity and lacks orthogonality between the expansion terms. Therefore we have proposed an alternative method, Third-order Orthogonal Tensor Product Expansion (3-OTPE for short). 3-OTPE uses the power method instead of a nonlinear optimization method to decrease computing time. At the same time, the group of B. D. Lathauwer proposed Higher-Order SVD (HOSVD), which is also developed with SVD extensions for multi-dimensional data. 3-OTPE and HOSVD are similar in the rank reduction of multi-dimensional data. Using these two methods we can obtain computation results, some of which are the same while others are slightly different. In this paper, we compare 3-OTPE to HOSVD in accuracy of calculation and computing time of resolution, and clarify the difference between these two methods.
Keywords: Singular value decomposition (SVD), higher-order SVD (HOSVD), higher-order tensor, outer product expansion, power method.
Downloads: 1562
6714 Analyzing the Changing Pattern of Nigerian Vegetation Zones and Its Ecological and Socio-Economic Implications Using Spot-Vegetation Sensor
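The power method mentioned in this abstract can be illustrated on an ordinary 2-dimensional matrix. The sketch below is not 3-OTPE or HOSVD; it only shows, under that simplification, how power iteration on A^T A yields the dominant singular triplet and hence a rank-1 approximation. All function names are hypothetical, and the all-ones start vector is assumed not to be orthogonal to the dominant right singular vector.

```python
import math


def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def dominant_singular_triplet(a, iters=200):
    """Power iteration on A^T A; returns (sigma, u, v) with A v = sigma u."""
    at = transpose(a)
    v = [1.0] * len(a[0])  # assumed start vector, see lead-in
    for _ in range(iters):
        w = matvec(at, matvec(a, v))
        n = norm(w)
        v = [x / n for x in w]
    av = matvec(a, v)
    sigma = norm(av)
    u = [x / sigma for x in av]
    return sigma, u, v


def rank_one(a, iters=200):
    """Best rank-1 approximation sigma * u * v^T from the dominant triplet."""
    sigma, u, v = dominant_singular_triplet(a, iters)
    return [[sigma * ui * vj for vj in v] for ui in u]


sigma, _, _ = dominant_singular_triplet([[3.0, 0.0], [0.0, 1.0]])
approx = rank_one([[1.0, 2.0], [2.0, 4.0]])  # input already has rank 1
```

For higher-order data the same idea is applied along each mode in turn, which is where the methods compared in the paper differ from this matrix-only sketch.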
Authors: B. L. Gadiga
Abstract:
This study assesses the major ecological zones in Nigeria with a view to understanding the spatial pattern of vegetation zones and the implications for conservation within a period of sixteen (16) years. Satellite images used for this study were acquired from SPOT-VEGETATION between 1998 and 2013. The annual NDVI images selected for this study were derived from the SPOT-4 sensor and were acquired within the same season (November) in order to reduce differences in spectral reflectance due to seasonal variations. The images were sliced into five classes based on the literature and knowledge of the area (i.e. <0.16 Non-Vegetated areas; 0.16-0.22 Sahel Savannah; 0.22-0.40 Sudan Savannah; 0.40-0.47 Guinea Savannah; and >0.47 Forest Zone). Classification of the 1998 and 2013 images into forested and non-forested areas showed that the forested area decreased from 511,691 km² in 1998 to 478,360 km² in 2013. The differencing change detection method was performed on the 1998 and 2013 NDVI images to identify areas of ecological concern. The result shows that areas undergoing vegetation degradation cover 73,062 km² while areas witnessing some form of restoration cover 86,315 km². The result also shows that there is a weak correlation between rainfall and the vegetation zones. The non-vegetated areas have a correlation coefficient (r) of 0.0088, the Sahel Savannah belt 0.1988, the Sudan Savannah belt -0.3343, the Guinea Savannah belt 0.0328 and the Forest belt 0.2635. The low correlation can be associated with the encroachment of the Sudan Savannah belt into the forest belt of the south-eastern part of the country, as revealed by the image analysis. The degradation of the forest vegetation is therefore responsible for the serious erosion problems witnessed in the South-east. The study recommends constant monitoring of vegetation and strict enforcement of environmental laws in the country.
Keywords: Vegetation, NDVI, SPOT-vegetation, ecology, degradation.
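The NDVI slicing described in this abstract can be written as a small lookup. In the sketch below the thresholds come from the abstract, but the handling of values exactly on a boundary (lower bound inclusive, 0.47 counted as Guinea Savannah) is an assumption, as are the function and variable names. The area figures are the ones quoted in the abstract.

```python
def classify_ndvi(ndvi):
    """Map an NDVI value to the vegetation zone named in the abstract.

    Boundary handling is an assumption: each class includes its lower
    bound, and Forest Zone starts strictly above 0.47.
    """
    if ndvi < 0.16:
        return "Non-Vegetated"
    if ndvi < 0.22:
        return "Sahel Savannah"
    if ndvi < 0.40:
        return "Sudan Savannah"
    if ndvi <= 0.47:
        return "Guinea Savannah"
    return "Forest Zone"


# Forested area in km^2 as reported for 1998 and 2013.
forest_1998 = 511_691
forest_2013 = 478_360
forest_loss = forest_1998 - forest_2013            # 33,331 km^2
forest_loss_pct = 100.0 * forest_loss / forest_1998  # roughly 6.5 %

zones = [classify_ndvi(v) for v in (0.05, 0.18, 0.30, 0.45, 0.60)]
```

The reported figures therefore correspond to a loss of about 6.5% of the 1998 forested area over the study period.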
Downloads: 838
6713 Electron-Impact Excitation of Kr 5s, 5p Levels
Authors: Alla A. Mityureva
Abstract:
The available data on the cross sections of electron-impact excitation of krypton 5s and 5p configuration levels out of the ground state are represented in convenient and compact form. The results are obtained by regression through all known published data related to this process.
Keywords: Cross section, electron excitation, krypton, regression
Downloads: 1087
6712 Seamless Flow of Voluminous Data in High Speed Network without Congestion Using Feedback Mechanism
Abstract:
Continuously growing needs for Internet applications that transmit massive amounts of data have led to the emergence of high-speed networks. Data transfer must take place without any congestion, and hence feedback parameters must be transferred from the receiver end to the sender end so as to restrict the sending rate in order to avoid congestion. Even though TCP tries to avoid congestion by restricting the sending rate and window size, it never informs the sender about the capacity of the data to be sent, and it reduces the window size by half at the time of congestion, resulting in decreased throughput, low utilization of the bandwidth and maximum delay. In this paper, the XCP protocol is used and feedback parameters are calculated based on arrival rate, service rate, traffic rate and queue size. The receiver thus informs the sender about the throughput, the capacity of the data to be sent and the window size adjustment, resulting in no drastic decrease in window size and a better increase in sending rate, so that there is a continuous flow of data without congestion. As a result, there is a maximum increase in throughput, high utilization of the bandwidth and minimum delay. The result of the proposed work is presented as a graph based on throughput, delay and window size. Thus in this paper, the XCP protocol is well illustrated and the various parameters are thoroughly analyzed and adequately presented.
Keywords: Bandwidth-Delay Product, Congestion Control, Congestion Window, TCP/IP
Downloads: 1488
6711 Eco-Connectivity: Sustainable Practices in Telecom Networks Using Big Data
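The window halving that this abstract attributes to TCP can be seen in a toy additive-increase, multiplicative-decrease loop. The sketch below is only a simplified model of that TCP behaviour, not the XCP feedback scheme used in the paper; the function name, the fixed `capacity` threshold and the one-unit increase per round are all assumptions for illustration.

```python
def aimd_trace(capacity, rounds, start=1.0, increase=1.0):
    """Record the congestion window over a number of rounds.

    The window grows by `increase` each round and is halved whenever it
    exceeds `capacity` (a stand-in for a congestion signal).
    """
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window > capacity:
            window = window / 2.0  # multiplicative decrease on congestion
        else:
            window += increase     # additive increase otherwise
    return trace


trace = aimd_trace(capacity=4, rounds=8)
```

The resulting sawtooth, where the window repeatedly drops to half and climbs back, is the throughput loss that explicit feedback is meant to avoid.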
Authors: Tharunika Sridhar
Abstract:
This paper addresses sustainable eco-connectivity within the telecommunications sector, studying its importance in tackling contemporary challenges and data regulation issues. The paper also investigates the role of Big Data and its integration in this context, specific to the telecom industry. One of the major focus areas of this paper is examining the pathways explored, namely state-of-the-art ecological infrastructure solutions and sector-led measures derived from expert analyses and reviews. Additionally, the paper analyses critical factors involving cost-effective route planning and the development of green telecommunications infrastructure, which add qualitative reasoning to the research idea. Furthermore, the study discusses in detail a potential green roadmap towards sustainability by exploring green routing software, eco-friendly infrastructure and other eco-focused initiatives. The paper is also directed at the special linguistic needs of the telecommunications sector by focusing on a targeted, select range of telecom environments.
Keywords: Big Data, telecom, sustainable telecom sector, telecom networks.
Downloads: 84
6710 Comparison of Irradiance Decomposition and Energy Production Methods in a Solar Photovoltaic System
Authors: Tisciane Perpetuo e Oliveira, Dante Inga Narvaez, Marcelo Gradella Villalva
Abstract:
Installations of solar photovoltaic systems have increased considerably in the last decade. Therefore, it has been noticed that monitoring of meteorological data (solar irradiance, air temperature, wind velocity, etc.) is important to predict the potential of a given geographical area in solar energy production. In this sense, the present work compares two computational tools that are capable of estimating the energy generation of a photovoltaic system through correlation analyses of solar radiation data: PVsyst software and an algorithm based on the PVlib package implemented in MATLAB. In order to achieve the objective, it was necessary to obtain solar radiation data (measured and from a solarimetric database), analyze the decomposition of global solar irradiance into direct normal and horizontal diffuse components, and analyze the modeling of the devices of a photovoltaic system (solar modules and inverters) for energy production calculations. Simulated results were compared with experimental data in order to evaluate the performance of the studied methods. Errors in estimation of energy production were less than 30% for the MATLAB algorithm and less than 20% for the PVsyst software.
Keywords: Energy production, meteorological data, irradiance decomposition, solar photovoltaic system.
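The components named in this abstract are tied together by the standard closure relation GHI = DNI * cos(zenith) + DHI. The sketch below only rearranges that relation to recover the direct normal component from global and diffuse horizontal values; it does not reproduce the decomposition models compared in the paper, and the function name and example numbers are assumptions.

```python
import math


def direct_normal(ghi, dhi, zenith_deg):
    """Direct normal irradiance (W/m^2) from the closure relation.

    ghi: global horizontal irradiance, dhi: diffuse horizontal irradiance,
    zenith_deg: solar zenith angle in degrees. Returns 0 when the sun is
    at or below the horizon.
    """
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0.0:
        return 0.0
    return max(ghi - dhi, 0.0) / cos_z


dni_example = direct_normal(ghi=800.0, dhi=200.0, zenith_deg=60.0)  # about 1200
```

Close to the horizon the cosine becomes very small and the quotient is numerically unreliable, so practical tools usually cap or filter such values.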
Downloads: 766
6709 Cloud Computing Cryptography "State-of-the-Art"
Authors: Omer K. Jasim, Safia Abbas, El-Sayed M. El-Horbaty, Abdel-Badeeh M. Salem
Abstract:
Cloud computing technology is very useful in present day-to-day life; it uses the internet and central remote servers to provide and maintain data as well as applications. Such applications in turn can be used by end users via cloud communications without any installation. Moreover, the end users' data files can be accessed and manipulated from any other computer using internet services. Despite the flexibility of data and application access and usage that cloud computing environments provide, many questions remain on how to obtain a trusted environment that protects data and applications in clouds from hackers and intruders. This paper surveys the "keys generation and management" mechanism and the encryption/decryption algorithms used in cloud computing environments. We propose a new security architecture for cloud computing environments that considers the various security gaps as much as possible. A new cryptographic environment that implements quantum mechanics in order to achieve more trusted cloud communications with less computation is given.
Keywords: Cloud Computing, Cloud Encryption Model, Quantum Key Distribution.
Downloads: 4094
6708 Deep iCrawl: An Intelligent Vision-Based Deep Web Crawler
Authors: R.Anita, V.Ganga Bharani, N.Nityanandam, Pradeep Kumar Sahoo
Abstract:
The explosive growth of the World Wide Web has posed a challenging problem in extracting relevant data. Traditional web crawlers focus only on the surface web while the deep web keeps expanding behind the scenes. Deep web pages are created dynamically as a result of queries posed to specific web databases. The structure of the deep web pages makes it impossible for traditional web crawlers to access deep web contents. This paper, Deep iCrawl, gives a novel and vision-based approach for extracting data from the deep web. Deep iCrawl splits the process into two phases. The first phase includes Query analysis and Query translation and the second covers vision-based extraction of data from the dynamically created deep web pages. There are several established approaches for the extraction of deep web pages but the proposed method aims at overcoming the inherent limitations of the former. This paper also aims at comparing the data items and presenting them in the required order.
Keywords: Crawler, Deep web, Web Database
Downloads: 2156
6707 Searchable Encryption in Cloud Storage
Authors: Ren-Junn Hwang, Chung-Chien Lu, Jain-Shing Wu
Abstract:
Cloud outsourced storage is one of the important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving exactly the target file from the encrypted files is difficult for a cloud server. This study proposes a protocol for performing multi-keyword searches over encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning the search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong letters does not exceed a given threshold, the user can still retrieve the target files from the cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.
Keywords: Fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption.
Downloads: 3081
6706 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction
Authors: Qais M. Yousef, Yasmeen A. Alshaer
Abstract:
Over the last few years, the amount of data available around the globe has increased rapidly. This came with the emergence of recent concepts, such as big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to its large variety of types and distribution. Locating the required file, particularly on the first attempt, has therefore become a difficult task, due to the large similarities of names for different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method using Electroencephalography signals to locate files based on their contents. Given the concept of natural mind wave processing, this work analyses the mind wave signals of different people, extracts their most appropriate features using a multi-objective metaheuristic algorithm, and then classifies them using an artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find files based on their contents using human thoughts only. Implementing this approach and testing it on real people proved its ability to find the desired files accurately within a noticeably shorter time and retrieve them as a first choice for the user.
Keywords: Artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization.
Downloads: 920
6705 A Modified AES Based Algorithm for Image Encryption
Authors: M. Zeghid, M. Machhout, L. Khriji, A. Baganne, R. Tourki
Abstract:
With the fast evolution of digital data exchange, information security becomes very important in data storage and transmission. Due to the increasing use of images in industrial processes, it is essential to protect confidential image data from unauthorized access. In this paper, we analyze the Advanced Encryption Standard (AES), and we add a key stream generator (A5/1, W7) to AES to improve the encryption performance, mainly for images characterised by reduced entropy. The implementation of both techniques has been realized for experimental purposes. Detailed results in terms of security analysis and implementation are given. A comparative study with traditional encryption algorithms shows the superiority of the modified algorithm.
Keywords: Cryptography, Encryption, Advanced Encryption Standard (AES), ECB mode, statistical analysis, key stream generator.
Downloads: 5058
6704 Incremental Mining of Shocking Association Patterns
Authors: Eiad Yafi, Ahmed Sultan Al-Hegami, M. A. Alam, Ranjit Biswas
Abstract:
Association rule mining is an important problem in data mining. The massively increasing volume of data in real-life databases has motivated researchers to design novel and incremental algorithms for association rule mining. In this paper, we propose an incremental association rule mining algorithm that integrates a shocking interestingness criterion during the process of building the model. A new interestingness measure called the shocking measure is introduced. One of the main features of the proposed approach is to capture the user's background knowledge, which is monotonically augmented. The incremental model that reflects the changing data and the user's beliefs is attractive in order to make the overall KDD process more effective and efficient. We implemented the proposed approach, experimented with it on some public datasets, and found the results quite promising.
Keywords: Knowledge discovery in databases (KDD), Data mining, Incremental Association rules, Domain knowledge, Interestingness, Shocking rules (SHR).
Downloads: 1867
6703 Zero Inflated Strict Arcsine Regression Model
Authors: Y. N. Phang, E. F. Loh
Abstract:
The zero inflated strict arcsine model is a newly developed model which is found to be appropriate for modeling overdispersed count data. In this study, we extend the zero inflated strict arcsine model to a zero inflated strict arcsine regression model by taking into consideration the extra variability caused by extra zeros and covariates in count data. The maximum likelihood estimation method is used to estimate the parameters of this zero inflated strict arcsine regression model.
Keywords: Overdispersed count data, maximum likelihood estimation, simulated annealing.
Downloads: 1755
6702 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network
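The general zero-inflation construction behind this model mixes a point mass at zero with a base count distribution: P(Y=0) = pi + (1 - pi) f(0) and P(Y=k) = (1 - pi) f(k) for k > 0. The sketch below uses a Poisson base purely as a stand-in, because the strict arcsine probability function is not given in the abstract; the function names and parameter values are assumptions.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of count k (stand-in base distribution)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zero_inflated_pmf(k, pi, base_pmf):
    """Mix a point mass at zero (weight pi) with a base count pmf."""
    p = (1.0 - pi) * base_pmf(k)
    return pi + p if k == 0 else p


def base(k):
    # Hypothetical base distribution: Poisson with mean 2.
    return poisson_pmf(k, 2.0)


p_zero = zero_inflated_pmf(0, 0.3, base)
total = sum(zero_inflated_pmf(k, 0.3, base) for k in range(60))
```

Swapping `base` for another count distribution keeps the mixture structure unchanged, which is why the same construction carries over to a regression setting where pi and the base parameters depend on covariates.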
Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
We consider a database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data models that can predict future traffic speed would be beneficial for applications such as car navigation systems, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as the predictive model is that it does not require any explicit model building. However, k-NN takes a long time to make a prediction because it needs to search for the k nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched without a significant sacrifice of prediction accuracy. The rationale behind this is that we had better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features, which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.
Keywords: Big data, k-NN, machine learning, traffic speed prediction.
Downloads: 1376
6701 GA Based Optimal Feature Extraction Method for Functional Data Classification
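The idea of restricting the k-NN search to recent data can be sketched in a few lines. In the example below each record pairs a feature vector of past speeds with the speed observed next; the record layout, the `recent` parameter, the function name and the numbers are all assumptions for illustration and do not reproduce the feature sets or data used in the paper.

```python
import math


def knn_predict(history, query, k=3, recent=None):
    """Average the targets of the k records closest to the query.

    history: (features, next_speed) pairs in chronological order.
    recent: if given, only the last `recent` records are searched,
    which shortens the search at prediction time.
    """
    pool = history[-recent:] if recent else history
    nearest = sorted(pool, key=lambda rec: math.dist(rec[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical records: (speed now, speed 5 min ago) -> speed 5 min later.
history = [
    ((50.0, 52.0), 51.0),
    ((30.0, 28.0), 29.0),
    ((49.0, 50.0), 48.0),
    ((31.0, 30.0), 32.0),
    ((52.0, 51.0), 53.0),
]
full = knn_predict(history, (50.0, 51.0), k=2)               # 49.5
windowed = knn_predict(history, (50.0, 51.0), k=2, recent=3)  # 50.5
```

Searching fewer records makes each prediction cheaper, at the risk of discarding an older record that happens to be a closer match, which is the trade-off the experiments in the paper measure.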
Authors: Jun Wan, Zehua Chen, Yingwu Chen, Zhidong Bai
Abstract:
Classification is an interesting problem in functional data analysis (FDA), because many science and application problems end up as classification problems, such as recognition, prediction, control, decision making, management, etc. Owing to the high dimension and high correlation in functional data (FD), a key problem is to extract features from FD while keeping its global characteristics, which strongly affects classification efficiency and precision. In this paper, a novel automatic method which combines a Genetic Algorithm (GA) and a classification algorithm to extract classification features is proposed. In this method, the optimal features and classification model are approached step by step via evolutionary learning. Theoretical analysis and experimental tests show that this method has advantages in improving classification efficiency, precision and robustness while using fewer features, and that the dimension of the extracted classification features can be controlled.
Keywords: Classification, functional data, feature extraction, genetic algorithm, wavelet.
Downloads: 1555
6700 An Experimental Study of a Self-Supervised Classifier Ensemble
Authors: Neamat El Gayar
Abstract:
Learning using labeled and unlabeled data has received a considerable amount of attention in the machine learning community due to its potential for reducing the need for expensive labeled data. In this work we present a new method for combining labeled and unlabeled data based on classifier ensembles. The model we propose assumes that each classifier in the ensemble observes the input using a different set of features. Classifiers are initially trained using some labeled samples. The trained classifiers learn further by labeling the unknown patterns using a teaching signal that is generated from the decision of the classifier ensemble, i.e. the classifiers self-supervise each other. Experiments on a set of object images are presented. Our experiments investigate different classifier models, different fusing techniques, different training sizes and different input features. Experimental results reveal that the proposed self-supervised ensemble learning approach reduces classification error compared with the single classifier and the traditional ensemble classifier approaches.
Keywords: Multiple Classifier Systems, classifier ensembles, learning using labeled and unlabelled data, K-nearest neighbor classifier, Bayes classifier.
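The self-supervision loop described above can be sketched roughly as below. This is an assumption-laden illustration, not the paper's method: the nearest-centroid base classifier, the majority-vote fusion, and the names `CentroidClassifier`, `ensemble_vote` and `self_supervise` are all hypothetical stand-ins (the keywords indicate the study itself used k-nearest neighbor and Bayes classifiers and compared several fusing techniques).

```python
from collections import Counter


class CentroidClassifier:
    """Nearest-centroid classifier that sees only one subset of the features."""

    def __init__(self, feature_idx):
        self.feature_idx = feature_idx
        self.centroids = {}

    def _view(self, x):
        return [x[i] for i in self.feature_idx]

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(self._view(x))
        # One centroid per class: the column-wise mean of that class's rows.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }

    def predict(self, x):
        v = self._view(x)
        return min(
            self.centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(v, self.centroids[c])),
        )


def ensemble_vote(classifiers, x):
    """Fuse the individual decisions by simple majority vote (an assumption)."""
    votes = Counter(clf.predict(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


def self_supervise(classifiers, X_lab, y_lab, X_unlab, rounds=2):
    """Train on labeled data, then retrain using ensemble-generated labels."""
    for clf in classifiers:
        clf.fit(X_lab, y_lab)
    for _ in range(rounds):
        # The ensemble decision acts as the teaching signal for unlabeled data.
        pseudo = [ensemble_vote(classifiers, x) for x in X_unlab]
        for clf in classifiers:
            clf.fit(X_lab + X_unlab, y_lab + pseudo)
    return classifiers


X_lab = [[0, 0, 0], [1, 1, 1], [9, 9, 9], [10, 10, 10]]
y_lab = ["a", "a", "b", "b"]
X_unlab = [[2, 1, 0], [8, 9, 10]]
ensemble = [CentroidClassifier(idx) for idx in ([0, 1], [1, 2], [0, 2])]
self_supervise(ensemble, X_lab, y_lab, X_unlab)
```

Each classifier is given a different pair of feature indices, mirroring the assumption that ensemble members observe the input through different feature sets.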
Downloads 1644
6699 Delay Analysis of Sampled-Data Systems in Hard RTOS
Authors: A. M. Azad, M. Alam, C. M. Hussain
Abstract:
In this paper, we present the effect of varying time delays on performance and stability in a single-channel multirate sampled-data system in a hard real-time (RT-Linux) environment. The sampling task requires a response time that might exceed the capacity of RT-Linux. A straightforward implementation with RT-Linux is therefore not feasible because of system latency, and hence the sampling period should be shorter to handle this task. The sampling rate chosen for the sampled-data system is the slowest rate that meets all performance requirements. RT-Linux is consistent with its specifications, and a real-time resolution of 0.01 seconds is used to achieve an efficient result. The test results of our laboratory experiment show that the multirate control technique in a hard real-time operating system (RTOS) can mitigate the stability problem caused by random access delays and asynchronization.
Keywords: Multi-rate, PID, RT-Linux, Sampled-data, Servo.
Downloads 1444
6698 A Study of the Adaptive Reuse for School Land Use Strategy: An Application of the Analytic Network Process and Big Data
Authors: Wann-Ming Wey
Abstract:
With today's widespread and advancing information technology, big data sets and their analysis are no longer a major conundrum. We can now use the relevant big data not only to analyze and emulate the possible status of urban development in the near future, but also to provide government units and decision-makers with a more comprehensive and reasonable basis for policy implementation through these analysis and emulation results. In this research, we take Taipei City as the research scope and use relevant big data variables (e.g., population, facility utilization and related social policy ratings) together with the Analytic Network Process (ANP) approach to study and discuss in depth the possible reduction of land use in the primary and secondary schools of Taipei City. In addition to enhancing urban activity through the utilization of urban public facilities, the final results of this research could help improve the efficiency of urban land use in the future. Furthermore, the assessment model and research framework established in this research provide a good reference for future land use and adaptive reuse strategies for schools and other public facilities.
Keywords: Adaptive reuse, analytic network process, big data, land use strategy.
Downloads 921
6697 A Review and Comparative Analysis on Cluster Ensemble Methods
Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha
Abstract:
In data mining, clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra-cluster similarity is maximized and inter-cluster similarity is minimized. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new and challenging technique known as the cluster ensemble approach has emerged to address this problem, and it has proved a successful approach to cluster analysis. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves precision while also improving the quality of individual data clusterings. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the community of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.
Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.
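As an illustration of what a consensus function can look like, the sketch below combines several base clusterings through a co-association matrix, one commonly used scheme. It is not taken from the paper: the function names `co_association` and `consensus`, the 0.5 threshold, and the single-link merging step are assumptions chosen to keep the example short.

```python
def co_association(labelings):
    """Fraction of base clusterings that put objects i and j in the same cluster."""
    n = len(labelings[0])
    m = len(labelings)
    return [
        [sum(lab[i] == lab[j] for lab in labelings) / m for j in range(n)]
        for i in range(n)
    ]


def consensus(labelings, threshold=0.5):
    """Merge objects whose co-association exceeds `threshold` (assumed value)."""
    n = len(labelings[0])
    co = co_association(labelings)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(j)] = find(i)

    # Relabel the components as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three base clusterings of five objects; label names differ between runs.
base = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
final_labels = consensus(base)
```

Because the matrix only records whether two objects share a cluster, the arbitrary label names used by each base clustering do not need to be aligned first.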
Downloads 822
6696 Simultaneous Clustering and Feature Selection Method for Gene Expression Data
Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar
Abstract:
Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analyze gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this work, the K-Means algorithm has been applied for clustering gene expression data. Further, the rough set based Quick reduct algorithm has been applied to each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters, and classification is used to evaluate the proposed method. The method could identify compact clusters, with the feature selection method used to select the genes.
Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.
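For readers unfamiliar with the rough set step, the greedy Quick Reduct procedure can be sketched as below. This is a generic textbook-style version under stated assumptions: the data are already discretized, a decision value is available for every row, and the names `dependency` and `quick_reduct` are hypothetical. How the paper defines the decision attribute within each K-Means cluster is not stated in the abstract.

```python
def dependency(rows, decisions, attrs):
    """Rough set dependency degree of the decision on the attribute subset.

    It is the fraction of objects whose equivalence class under `attrs`
    contains a single decision value (the positive region).
    """
    classes = {}
    for row, d in zip(rows, decisions):
        classes.setdefault(tuple(row[a] for a in attrs), []).append(d)
    consistent = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, decisions):
    """Greedily add the attribute that raises the dependency degree the most."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, decisions, all_attrs)
    reduct = []
    while dependency(rows, decisions, reduct) < target:
        best = max(
            (a for a in all_attrs if a not in reduct),
            key=lambda a: dependency(rows, decisions, reduct + [a]),
        )
        reduct.append(best)
    return reduct


# Toy discretized expression table: rows are samples, columns are genes.
rows = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
decisions = ["up", "up", "down", "down"]
selected = quick_reduct(rows, decisions)
```

In this toy table the second column alone separates the two decision values, so the greedy search stops after selecting it.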
Downloads 1967
6695 Segmentation Free Nastalique Urdu OCR
Authors: Sobia T. Javed, Sarmad Hussain, Ameera Maqbool, Samia Asloob, Sehrish Jamil, Huma Moin
Abstract:
Electronically available Urdu data is in image form, which is very difficult to process. Printed Urdu data is the root cause of the problem. For the rapid progress of the Urdu language, we therefore need an OCR system that can help make Urdu data available to the common person. Research has been carried out for years to automate the recognition of Arabic and Urdu script. But the biggest hurdle in the development of Urdu OCR is the challenge of recognizing Nastalique script, which is taken as the standard for writing the Urdu language. Nastalique script is written diagonally with no fixed baseline, which makes the script somewhat complex. Overlap is present not only in characters but in ligatures as well. This paper proposes a method which allows successful recognition of Nastalique script.
Keywords: HMM, Image processing, Optical Character Recognition, Urdu OCR.
Downloads 2159