Search results for: Mining Big Data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7581

Search results for: Mining Big Data

6351 Daily and Seasonal Changes of Air Pollution in Kuwait

Authors: H. Ettouney, A. AL-Haddad, S. Saqer

Abstract:

This paper focuses on assessment of air pollution in Umm-Alhyman, Kuwait, which is located south to oil refineries, power station, oil field, and highways. The measurements were made over a period of four days in March and July in 2001, 2004, and 2008. The measured pollutants included methanated and nonmethanated hydrocarbons (MHC, NMHC), CO, CO2, SO2, NOX, O3, and PM10. Also, meteorological parameters were measured, which includes temperature, wind speed and direction, and solar radiation. Over the study period, data analysis showed increase in measured SO2, NOX and CO by factors of 1.2, 5.5 and 2, respectively. This is explained in terms of increase in industrial activities, motor vehicle density, and power generation. Predictions of the measured data were made by the ISC-AERMOD software package and by using the ISCST3 model option. Finally, comparison was made between measured data against international standards.

Keywords: Air pollution, Emission inventory, ISCST3 model, Modeling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2421
6350 Regular Data Broadcasting Plan with Grouping in Wireless Mobile Environment

Authors: John T. Tsiligaridis

Abstract:

The broadcast problem including the plan design is considered. The data are inserted and numbered at predefined order into customized size relations. The server ability to create a full, regular Broadcast Plan (RBP) with single and multiple channels after some data transformations is examined. The Regular Geometric Algorithm (RGA) prepares a RBP and enables the users to catch their items avoiding energy waste of their devices. Moreover, the Grouping Dimensioning Algorithm (GDA) based on integrated relations can guarantee the discrimination of services with a minimum number of channels. This last property among the selfmonitoring, self-organizing, can be offered by servers today providing also channel availability and less energy consumption by using smaller number of channels. Simulation results are provided.

Keywords: Broadcast, broadcast plan, mobile computing, wireless networks, scheduling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1453
6349 Survival Model for Partly Interval-Censored Data with Application to Anti D in Rhesus D Negative Studies

Authors: F. A. M. Elfaki, Amar Abobakar, M. Azram, M. Usman

Abstract:

This paper discusses regression analysis of partly interval-censored failure time data, which is occur in many fields including demographical, epidemiological, financial, medical and sociological studies. For the problem, we focus on the situation where the survival time of interest can be described by the additive hazards model in the present of partly interval-censored. A major advantage of the approach is its simplicity and it can be easily implemented by using R software. Simulation studies are conducted which indicate that the approach performs well for practical situations and comparable to the existing methods. The methodology is applied to a set of partly interval-censored failure time data arising from anti D in Rhesus D negative studies.

Keywords: Anti D in Rhesus D negative, Cox’s model, EM algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1693
6348 Metabolic Predictive Model for PMV Control Based on Deep Learning

Authors: Eunji Choi, Borang Park, Youngjae Choi, Jinwoo Moon

Abstract:

In this study, a predictive model for estimating the metabolism (MET) of human body was developed for the optimal control of indoor thermal environment. Human body images for indoor activities and human body joint coordinated values were collected as data sets, which are used in predictive model. A deep learning algorithm was used in an initial model, and its number of hidden layers and hidden neurons were optimized. Lastly, the model prediction performance was analyzed after the model being trained through collected data. In conclusion, the possibility of MET prediction was confirmed, and the direction of the future study was proposed as developing various data and the predictive model.

Keywords: Deep learning, indoor quality, metabolism, predictive model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1194
6347 Optimising Data Transmission in Heterogeneous Sensor Networks

Authors: M. Hammerton, J. Trevathan, T. Myers, W. Read

Abstract:

The transfer rate of messages in distributed sensor network applications is a critical factor in a system's performance. The Sensor Abstraction Layer (SAL) is one such system. SAL is a middleware integration platform for abstracting sensor specific technology in order to integrate heterogeneous types of sensors in a network. SAL uses Java Remote Method Invocation (RMI) as its connection method, which has unsatisfying transfer rates, especially for streaming data.  This paper analyses different connection methods to optimize data transmission in SAL by replacing RMI.  Our results show that the most promising Java-based connections were frameworks for Java New Input/Output (NIO) including Apache MINA, JBoss Netty, and xSocket. A test environment was implemented to evaluate each respective framework based on transfer rate, resource usage, and scalability. Test results showed the most suitable connection method to improve data transmission in SAL JBoss Netty as it provides a performance enhancement of 68%.

Keywords: Wireless sensor networks, remote method invocation, transmission time.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2037
6346 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of the current analyzing algorithms. Thus, developing new cost-effective algorithms in terms of complexity, scalability, and accuracy raised significant interests. In this paper, a modified effective k-means based algorithm is developed and experimented. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee the convergence. Conducted experiments on a real data set show its high performance when compared with the original k-means version.

Keywords: Pattern recognition, partitional clustering, K-means clustering, Manhattan distance, terrorism data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1359
6345 HelpMeBreathe: A Web-Based System for Asthma Management

Authors: Alia Al Rayssi, Mahra Al Marar, Alyazia Alkhaili, Reem Al Dhaheri, Shayma Alkobaisi, Hoda Amer

Abstract:

We present in this paper a web-based system called “HelpMeBreathe” for managing asthma. The proposed system provides analytical tools, which allow better understanding of environmental triggers of asthma, hence better support of data-driven decision making. The developed system provides warning messages to a specific asthma patient if the weather in his/her area might cause any difficulty in breathing or could trigger an asthma attack. HelpMeBreathe collects, stores, and analyzes individuals’ moving trajectories and health conditions as well as environmental data. It then processes and displays the patients’ data through an analytical tool that leads to an effective decision making by physicians and other decision makers.

Keywords: Asthma, environmental triggers, map interface, peak flow, web-based system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 869
6344 Application of Neural Networks for 24-Hour-Ahead Load Forecasting

Authors: Fatemeh Mosalman Yazdi

Abstract:

One of the most important requirements for the operation and planning activities of an electrical utility is the prediction of load for the next hour to several days out, known as short term load forecasting. This paper presents the development of an artificial neural network based short-term load forecasting model. The model can forecast daily load profiles with a load time of one day for next 24 hours. In this method can divide days of year with using average temperature. Groups make according linearity rate of curve. Ultimate forecast for each group obtain with considering weekday and weekend. This paper investigates effects of temperature and humidity on consuming curve. For forecasting load curve of holidays at first forecast pick and valley and then the neural network forecast is re-shaped with the new data. The ANN-based load models are trained using hourly historical. Load data and daily historical max/min temperature and humidity data. The results of testing the system on data from Yazd utility are reported.

Keywords: Artificial neural network, Holiday forecasting, pickand valley load forecasting, Short-term load-forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2192
6343 On the Joint Optimization of Performance and Power Consumption in Data Centers

Authors: Samee Ullah Khan, C. Ardil

Abstract:

We model the process of a data center as a multi- objective problem of mapping independent tasks onto a set of data center machines that simultaneously minimizes the energy consump¬tion and response time (makespan) subject to the constraints of deadlines and architectural requirements. A simple technique based on multi-objective goal programming is proposed that guarantees Pareto optimal solution with excellence in convergence process. The proposed technique also is compared with other traditional approach. The simulation results show that the proposed technique achieves superior performance compared to the min-min heuristics, and com¬petitive performance relative to the optimal solution implemented in UNDO for small-scale problems.

Keywords: Energy-efficient computing, distributed systems, multi-objective optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1691
6342 Preparing Data for Calibration of Mechanistic-Empirical Pavement Design Guide in Central Saudi Arabia

Authors: Abdulraaof H. Alqaili, Hamad A. Alsoliman

Abstract:

Through progress in pavement design developments, a pavement design method was developed, which is titled the Mechanistic Empirical Pavement Design Guide (MEPDG). Nowadays, the evolution in roads network and highways is observed in Saudi Arabia as a result of increasing in traffic volume. Therefore, the MEPDG currently is implemented for flexible pavement design by the Saudi Ministry of Transportation. Implementation of MEPDG for local pavement design requires the calibration of distress models under the local conditions (traffic, climate, and materials). This paper aims to prepare data for calibration of MEPDG in Central Saudi Arabia. Thus, the first goal is data collection for the design of flexible pavement from the local conditions of the Riyadh region. Since, the modifying of collected data to input data is needed; the main goal of this paper is the analysis of collected data. The data analysis in this paper includes processing each: Trucks Classification, Traffic Growth Factor, Annual Average Daily Truck Traffic (AADTT), Monthly Adjustment Factors (MAFi), Vehicle Class Distribution (VCD), Truck Hourly Distribution Factors, Axle Load Distribution Factors (ALDF), Number of axle types (single, tandem, and tridem) per truck class, cloud cover percent, and road sections selected for the local calibration. Detailed descriptions of input parameters are explained in this paper, which leads to providing of an approach for successful implementation of MEPDG. Local calibration of MEPDG to the conditions of Riyadh region can be performed based on the findings in this paper.

Keywords: Mechanistic-empirical pavement design guide, traffic characteristics, materials properties, climate, Riyadh.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1223
6341 Economized Sensor Data Processing with Vehicle Platooning

Authors: Henry Hexmoor, Kailash Yelasani

Abstract:

We present vehicular platooning as a special case of crowd-sensing framework where sharing sensory information among a crowd is used for their collective benefit. After offering an abstract policy that governs processes involving a vehicular platoon, we review several common scenarios and components surrounding vehicular platooning. We then present a simulated prototype that illustrates efficiency of road usage and vehicle travel time derived from platooning. We have argued that one of the paramount benefits of platooning that is overlooked elsewhere, is the substantial computational savings (i.e., economizing benefits) in acquisition and processing of sensory data among vehicles sharing the road. The most capable vehicle can share data gathered from its sensors with nearby vehicles grouped into a platoon.

Keywords: Cloud network, collaboration, Internet of Things, social network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 711
6340 Machine Learning Development Audit Framework: Assessment and Inspection of Risk and Quality of Data, Model and Development Process

Authors: Jan Stodt, Christoph Reich

Abstract:

The usage of machine learning models for prediction is growing rapidly and proof that the intended requirements are met is essential. Audits are a proven method to determine whether requirements or guidelines are met. However, machine learning models have intrinsic characteristics, such as the quality of training data, that make it difficult to demonstrate the required behavior and make audits more challenging. This paper describes an ML audit framework that evaluates and reviews the risks of machine learning applications, the quality of the training data, and the machine learning model. We evaluate and demonstrate the functionality of the proposed framework by auditing an steel plate fault prediction model.

Keywords: Audit, machine learning, assessment, metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1029
6339 Information Seeking through Assimilation Process in Thai Organization

Authors: Pornprom Chomngam

Abstract:

The purpose of this study is to examine employee assessments of the usefulness/value of different types of information available to those employees during the process of organizational assimilation. Participants in the study were 247 “new" employees at Bangkok Bank. Bangkok Bank considers employees whose length of stay with the bank has been less than 18 months as new employees. Questionnaires were administered to all of the Bank-s new employees to obtain the data for this study. Repeated measures analysis was used to analyze the data. The data were summed and coded by using Statistical Package for Social Science. Newcomers indicate that social information is the most useful information, followed by job (technical, referent, and appraisal information), political, normative, and organizational information. Essentially, social, job, and political information are evaluated by newcomers as highly useful, while normative and organizational information are rated as moderately useful.

Keywords: Information seeking, organization assimilation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1661
6338 Image Steganography Using Least Significant Bit Technique

Authors: Preeti Kumari, Ridhi Kapoor

Abstract:

 In any communication, security is the most important issue in today’s world. In this paper, steganography is the process of hiding the important data into other data, such as text, audio, video, and image. The interest in this topic is to provide availability, confidentiality, integrity, and authenticity of data. The steganographic technique that embeds hides content with unremarkable cover media so as not to provoke eavesdropper’s suspicion or third party and hackers. In which many applications of compression, encryption, decryption, and embedding methods are used for digital image steganography. Due to compression, the nose produces in the image. To sustain noise in the image, the LSB insertion technique is used. The performance of the proposed embedding system with respect to providing security to secret message and robustness is discussed. We also demonstrate the maximum steganography capacity and visual distortion.

Keywords: Steganography, LSB, encoding, information hiding, color image.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1092
6337 Retail Strategy to Reduce Waste Keeping High Profit Utilizing Taylor's Law in Point-of-Sales Data

Authors: Gen Sakoda, Hideki Takayasu, Misako Takayasu

Abstract:

Waste reduction is a fundamental problem for sustainability. Methods for waste reduction with point-of-sales (POS) data are proposed, utilizing the knowledge of a recent econophysics study on a statistical property of POS data. Concretely, the non-stationary time series analysis method based on the Particle Filter is developed, which considers abnormal fluctuation scaling known as Taylor's law. This method is extended for handling incomplete sales data because of stock-outs by introducing maximum likelihood estimation for censored data. The way for optimal stock determination with pricing the cost of waste reduction is also proposed. This study focuses on the examination of the methods for large sales numbers where Taylor's law is obvious. Numerical analysis using aggregated POS data shows the effectiveness of the methods to reduce food waste maintaining a high profit for large sales numbers. Moreover, the way of pricing the cost of waste reduction reveals that a small profit loss realizes substantial waste reduction, especially in the case that the proportionality constant  of Taylor’s law is small. Specifically, around 1% profit loss realizes half disposal at =0.12, which is the actual  value of processed food items used in this research. The methods provide practical and effective solutions for waste reduction keeping a high profit, especially with large sales numbers.

Keywords: Food waste reduction, particle filter, point of sales, sustainable development goals, Taylor's Law, time series analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 871
6336 Experimental teaching, Perceived usefulness, Ease of use, Learning Interest and Science Achievement of Taiwan 8th Graders in TIMSS 2007 Database

Authors: Pei Wen Liao, Tsung Hau Jen

Abstract:

the data of Taiwanese 8th grader in the 4th cycle of Trends in International Mathematics and Science Study (TIMSS) are analyzed to examine the influence of the science teachers- preference in experimental teaching on the relationships between the affective variables ( the perceived usefulness of science, ease of using science and science learning interest) and the academic achievement in science. After dealing with the missing data, 3711 students and 145 science teacher-s data were analyzed through a Hierarchical Linear Modeling technique. The major objective of this study was to determine the role of the experimental teaching moderates the relationship between perceived usefulness and achievement.

Keywords: TIMSS database, Science achievement, Experimental teaching, Perceived Usefulness, Perceived Ease of Use

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657
6335 Pattern Classification of Back-Propagation Algorithm Using Exclusive Connecting Network

Authors: Insung Jung, Gi-Nam Wang

Abstract:

The objective of this paper is to a design of pattern classification model based on the back-propagation (BP) algorithm for decision support system. Standard BP model has done full connection of each node in the layers from input to output layers. Therefore, it takes a lot of computing time and iteration computing for good performance and less accepted error rate when we are doing some pattern generation or training the network. However, this model is using exclusive connection in between hidden layer nodes and output nodes. The advantage of this model is less number of iteration and better performance compare with standard back-propagation model. We simulated some cases of classification data and different setting of network factors (e.g. hidden layer number and nodes, number of classification and iteration). During our simulation, we found that most of simulations cases were satisfied by BP based using exclusive connection network model compared to standard BP. We expect that this algorithm can be available to identification of user face, analysis of data, mapping data in between environment data and information.

Keywords: Neural network, Back-propagation, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1656
6334 Quick Sequential Search Algorithm Used to Decode High-Frequency Matrices

Authors: Mohammed M. Siddeq, Mohammed H. Rasheed, Omar M. Salih, Marcos A. Rodrigues

Abstract:

This research proposes a data encoding and decoding method based on the Matrix Minimization algorithm. This algorithm is applied to high-frequency coefficients for compression/encoding. The algorithm starts by converting every three coefficients to a single value; this is accomplished based on three different keys. The decoding/decompression uses a search method called QSS (Quick Sequential Search) Decoding Algorithm presented in this research based on the sequential search to recover the exact coefficients. In the next step, the decoded data are saved in an auxiliary array. The basic idea behind the auxiliary array is to save all possible decoded coefficients; this is because another algorithm, such as conventional sequential search, could retrieve encoded/compressed data independently from the proposed algorithm. The experimental results showed that our proposed decoding algorithm retrieves original data faster than conventional sequential search algorithms.

Keywords: Matrix Minimization Algorithm, Decoding Sequential Search Algorithm, image compression, Discrete Cosine Transform, Discrete Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 247
6333 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively few numbers of attacks make the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for both implementing the IDS and methods in correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating their algorithms.

Keywords: IDS, intrusion detection system, imbalanced datasets, sampling algorithms, big data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125
6332 Disidentification of Historical City Centers: A Comparative Study of the Old and New Settlements of Mardin, Turkey

Authors: Fatma Kürüm Varolgüneş, Fatih Canan

Abstract:

Mardin is one of the unique cities in Turkey with its rich cultural and historical heritage. Mardin’s traditional dwellings have been affected both by natural data such as climate and topography and by cultural data like lifestyle and belief. However, in the new settlements, housing is formed with modern approaches and unsuitable forms clashing with Mardin’s culture and environment. While the city is expanding, traditional textures are ignored. Thus, traditional settlements are losing their identity and are vanishing because of the rapid change and transformation. The main aim of this paper is to determine the physical and social data needed to define the characteristic features of Mardin’s old and new settlements. In this context, based on social and cultural data, old and new settlement formations of Mardin have been investigated from various aspects. During this research, the following methods have been utilized: observations, interviews, public surveys, literature review, as well as site examination via maps, photographs and questionnaire methodology. In conclusion, this paper focuses on how changes in the physical forms of cities affect the typology and the identity of cities, as in the case of Mardin.

Keywords: Urban and local identity, historical city center, traditional settlements, Mardin, Turkey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1042
6331 TELUM Land Use Model: An Investigation of Data Requirements and Calibration Results for Chittenden County MPO, U.S.A.

Authors: Georgia Pozoukidou

Abstract:

TELUM software is a land use model designed specifically to help metropolitan planning organizations (MPOs) prepare their transportation improvement programs and fulfill their numerous planning responsibilities. In this context obtaining, preparing, and validating socioeconomic forecasts are becoming fundamental tasks for an MPO in order to ensure that consistent population and employment data are provided to travel demand models. Chittenden County Metropolitan Planning Organization of Vermont State was used as a case study to test the applicability of TELUM land use model. The technical insights and lessons learned from the land use model application have transferable value for all MPOs faced with land use forecasting development and transportation modeling.

Keywords: Calibration data requirements, land use models, land use planning, Metropolitan Planning Organizations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2100
6330 Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models

Authors: Yoonsuh Jung

Abstract:

As DNA microarray data contain relatively small sample size compared to the number of genes, high dimensional models are often employed. In high dimensional models, the selection of tuning parameter (or, penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for the tuning parameter selection, which selects a parameter value with the smallest cross-validated score. However, selecting a single value as an ‘optimal’ value for the parameter can be very unstable due to the sampling variation since the sample sizes of microarray data are often small. Our approach is to choose multiple candidates of tuning parameter first, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates rarely increase the computational cost, while it can considerably improve the traditional cross-validation. We show that the selected value from the suggested methods often lead to stable parameter selection as well as improved detection of significant genetic variables compared to the tradition cross-validation via real data and simulated data sets.

Keywords: Cross Validation, Parameter Averaging, Parameter Selection, Regularization Parameter Search.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572
6329 Using ALOHA Code to Evaluate CO2 Concentration for Maanshan Nuclear Power Plant

Authors: W. S. Hsu, S. W. Chen, Y. T. Ku, Y. Chiang, J. R. Wang , J. H. Yang, C. Shih

Abstract:

ALOHA code was used to calculate the concentration under the CO2 storage burst condition for Maanshan nuclear power plant (NPP) in this study. Five main data are input into ALOHA code including location, building, chemical, atmospheric, and source data. The data from Final Safety Analysis Report (FSAR) and some reports were used in this study. The ALOHA results are compared with the failure criteria of R.G. 1.78 to confirm the habitability of control room. The result of comparison presents that the ALOHA result is below the R.G. 1.78 criteria. This implies that the habitability of control room can be maintained in this case. The sensitivity study for atmospheric parameters was performed in this study. The results show that the wind speed has the larger effect in the concentration calculation.

Keywords: PWR, ALOHA, habitability, Maanshan.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 742
6328 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li

Abstract:

As a structure for processing string problem, suffix array is certainly widely-known and extensively-studied. But if the string access pattern follows the “90/10" rule, suffix array can not take advantage of the fact that we often find something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10" rule, it requires many structures and an excessive amount of pointer manipulations for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay tree and suffix arrays into a new approach which is suitable to implementation on both conventional and clustered computers.

Keywords: suffix arrays, splay tree, string search, distributedalgorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777
6327 Dimension Free Rigid Point Set Registration in Linear Time

Authors: Jianqin Qu

Abstract:

This paper proposes a rigid point set matching algorithm in arbitrary dimensions based on the idea of symmetric covariant function. A group of functions of the points in the set are formulated using rigid invariants. Each of these functions computes a pair of correspondence from the given point set. Then the computed correspondences are used to recover the unknown rigid transform parameters. Each computed point can be geometrically interpreted as the weighted mean center of the point set. The algorithm is compact, fast, and dimension free without any optimization process. It either computes the desired transform for noiseless data in linear time, or fails quickly in exceptional cases. Experimental results for synthetic data and 2D/3D real data are provided, which demonstrate potential applications of the algorithm to a wide range of problems.

Keywords: Covariant point, point matching, dimension free, rigid registration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 683
6326 Functional and Efficient Query Interpreters: Principle, Application and Performances’ Comparison

Authors: Laurent Thiry, Michel Hassenforder

Abstract:

This paper presents a general approach to implement efficient queries’ interpreters in a functional programming language. Indeed, most of the standard tools actually available use an imperative and/or object-oriented language for the implementation (e.g. Java for Jena-Fuseki) but other paradigms are possible with, maybe, better performances. To proceed, the paper first explains how to model data structures and queries in a functional point of view. Then, it proposes a general methodology to get performances (i.e. number of computation steps to answer a query) then it explains how to integrate some optimization techniques (short-cut fusion and, more important, data transformations). It then compares the functional server proposed to a standard tool (Fuseki) demonstrating that the first one can be twice to ten times faster to answer queries.

Keywords: Data transformation, functional programming, information server, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 753
6325 Evidence Theory Enabled Quickest Change Detection Using Big Time-Series Data from Internet of Things

Authors: Hossein Jafari, Xiangfang Li, Lijun Qian, Alexander Aved, Timothy Kroecker

Abstract:

Traditionally in sensor networks and recently in the Internet of Things, numerous heterogeneous sensors are deployed in distributed manner to monitor a phenomenon that often can be model by an underlying stochastic process. The big time-series data collected by the sensors must be analyzed to detect change in the stochastic process as quickly as possible with tolerable false alarm rate. However, sensors may have different accuracy and sensitivity range, and they decay along time. As a result, the big time-series data collected by the sensors will contain uncertainties and sometimes they are conflicting. In this study, we present a framework to take advantage of Evidence Theory (a.k.a. Dempster-Shafer and Dezert-Smarandache Theories) capabilities of representing and managing uncertainty and conflict to fast change detection and effectively deal with complementary hypotheses. Specifically, Kullback-Leibler divergence is used as the similarity metric to calculate the distances between the estimated current distribution with the pre- and post-change distributions. Then mass functions are calculated and related combination rules are applied to combine the mass values among all sensors. Furthermore, we applied the method to estimate the minimum number of sensors needed to combine, so computational efficiency could be improved. Cumulative sum test is then applied on the ratio of pignistic probability to detect and declare the change for decision making purpose. Simulation results using both synthetic data and real data from experimental setup demonstrate the effectiveness of the presented schemes.

Keywords: CUSUM, evidence theory, KL divergence, quickest change detection, time series data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 994
6324 A Secure Proxy Signature Scheme with Fault Tolerance Based on RSA System

Authors: H. El-Kamchouchi, Heba Gaber, Fatma Ahmed, Dalia H. El-Kamchouchi

Abstract:

Due to the rapid growth in modern communication systems, fault tolerance and data security are two important issues in a secure transaction. During the transmission of data between the sender and receiver, errors may occur frequently. Therefore, the sender must re-transmit the data to the receiver in order to correct these errors, which makes the system very feeble. To improve the scalability of the scheme, we present a secure proxy signature scheme with fault tolerance over an efficient and secure authenticated key agreement protocol based on RSA system. Authenticated key agreement protocols have an important role in building a secure communications network between the two parties.

Keywords: Proxy signature, fault tolerance, RSA, key agreement protocol.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1485
6323 Investigation of Learning Challenges in Building Measurement Unit

Authors: Argaw T. Gurmu, Muhammad N. Mahmood

Abstract:

The objective of this research is to identify the architecture and construction management students’ learning challenges of the building measurement. This research used the survey data obtained collected from the students who completed the building measurement unit. NVivo qualitative data analysis software was used to identify relevant themes. The analysis of the qualitative data revealed the major learning difficulties such as inadequacy of practice questions for the examination, inability to work as a team, lack of detailed understanding of the prerequisite units, insufficiency of the time allocated for tutorials and incompatibility of lecture and tutorial schedules. The output of this research can be used as a basis for improving the teaching and learning activities in construction measurement units.

Keywords: Building measurement, construction management, learning challenges, evaluate survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1104
6322 A Rule-based Approach for Anomaly Detection in Subscriber Usage Pattern

Authors: Rupesh K. Gopal, Saroj K. Meher

Abstract:

In this report we present a rule-based approach to detect anomalous telephone calls. The method described here uses subscriber usage CDR (call detail record) data sampled over two observation periods: study period and test period. The study period contains call records of customers- non-anomalous behaviour. Customers are first grouped according to their similar usage behaviour (like, average number of local calls per week, etc). For customers in each group, we develop a probabilistic model to describe their usage. Next, we use maximum likelihood estimation (MLE) to estimate the parameters of the calling behaviour. Then we determine thresholds by calculating acceptable change within a group. MLE is used on the data in the test period to estimate the parameters of the calling behaviour. These parameters are compared against thresholds. Any deviation beyond the threshold is used to raise an alarm. This method has the advantage of identifying local anomalies as compared to techniques which identify global anomalies. The method is tested for 90 days of study data and 10 days of test data of telecom customers. For medium to large deviations in the data in test window, the method is able to identify 90% of anomalous usage with less than 1% false alarm rate.

Keywords: Subscription fraud, fraud detection, anomalydetection, maximum likelihood estimation, rule based systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2813