Search results for: Data grid
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7809

Search results for: Data grid

7239 Advanced Pulse Width Modulation Techniques for Z Source Multi Level Inverter

Authors: B. M. Manjunatha, D. V. Ashok Kumar, M. Vijay Kumar

Abstract:

This paper proposes five level diode clamped Z source Inverter. The existing PWM techniques used for ZSI are restricted for two level. The two level Z Source Inverter have high harmonic distortions which effects the performance of the grid connected PV system. To improve the performance of the system the number of voltage levels in the output waveform need to be increased. This paper presents comparative analysis of a five level diode clamped Z source Inverter with different carrier based Modified Pulse Width Modulation techniques. The parameters considered for comparison are output voltage, voltage gain, voltage stress across switch and total harmonic distortion when powered by same DC supply. Analytical results are verified using MATLAB.

Keywords: Diode Clamped, Pulse Width Modulation, total harmonic distortion, Z Source Inverter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2357
7238 Mobile Phone as a Tool for Data Collection in Field Research

Authors: Sandro Mourão, Karla Okada

Abstract:

The necessity of accurate and timely field data is shared among organizations engaged in fundamentally different activities, public services or commercial operations. Basically, there are three major components in the process of the qualitative research: data collection, interpretation and organization of data, and analytic process. Representative technological advancements in terms of innovation have been made in mobile devices (mobile phone, PDA-s, tablets, laptops, etc). Resources that can be potentially applied on the data collection activity for field researches in order to improve this process. This paper presents and discuss the main features of a mobile phone based solution for field data collection, composed of basically three modules: a survey editor, a server web application and a client mobile application. The data gathering process begins with the survey creation module, which enables the production of tailored questionnaires. The field workforce receives the questionnaire(s) on their mobile phones to collect the interviews responses and sending them back to a server for immediate analysis.

Keywords: Data Gathering, Field Research, Mobile Phone, Survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2059
7237 Wireless Distributed Load-Shedding Management System for Non-Emergency Cases

Authors: Taha Landolsi, A. R. Al-Ali, Tarik Ozkul, Mohammad A. Al-Rousan

Abstract:

In this paper, we present a cost-effective wireless distributed load shedding system for non-emergency scenarios. In power transformer locations where SCADA system cannot be used, the proposed solution provides a reasonable alternative that combines the use of microcontrollers and existing GSM infrastructure to send early warning SMS messages to users advising them to proactively reduce their power consumption before system capacity is reached and systematic power shutdown takes place. A novel communication protocol and message set have been devised to handle the messaging between the transformer sites, where the microcontrollers are located and where the measurements take place, and the central processing site where the database server is hosted. Moreover, the system sends warning messages to the endusers mobile devices that are used as communication terminals. The system has been implemented and tested via different experimental results.

Keywords: Smart Grid, Load shedding, Demand SideManagement, GSM Wireless Networks, SCADA systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2631
7236 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or an individual patient data (IPD). In practice, studies may be available at both IPD and AD level. In this situation, both the IPD and AD should be utilised in order to maximize the available information. Statistical advantages of combining the studies from different level have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analysis were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD-only, AD-only and the combination of IPD and AD (mixed data, MD), under different study scenario. The percentage relative bias (PRB), root mean-square-error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary level data as they would significantly increased the accuracy of the estimates.On the other hand, if more than 80% of the available data are at IPD level, including the AD does not provide significant differences in terms of accuracy of the estimates. Additionally, combining the IPD and AD has moderating effects on the biasness of the estimates of the treatment effects as the IPD tends to overestimate the treatment effects, while the AD has the tendency to produce underestimated effect estimates. These results may provide some guide in deciding if significant benefit is gained by pooling the two levels of data when conducting meta-analysis.

Keywords: Aggregate data, combined-level data, Individual patient data, meta analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1741
7235 Multivariate Assessment of Mathematics Test Scores of Students in Qatar

Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski

Abstract:

Data on various aspects of education are collected at the institutional and government level regularly. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data collected internationally is via the PISA study which collects data from several countries when students are approximately 15 years of age and enables comparisons in the performance of science, mathematics and English between countries as well as ranking of countries based on performance in these standardised tests. As well as student and school outcomes based on the tests taken as part of the PISA study, there is a wealth of other data collected in the study including parental demographics data and data related to teaching strategies used by educators. Overall, an abundance of educational data is available which has the potential to be used to help improve educational attainment and teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics using data obtained from the PISA study.

Keywords: Cluster analysis, education, mathematics, profiles.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 894
7234 DIVAD: A Dynamic and Interactive Visual Analytical Dashboard for Exploring and Analyzing Transport Data

Authors: Tin Seong Kam, Ketan Barshikar, Shaun Tan

Abstract:

The advances in location-based data collection technologies such as GPS, RFID etc. and the rapid reduction of their costs provide us with a huge and continuously increasing amount of data about movement of vehicles, people and goods in an urban area. This explosive growth of geospatially-referenced data has far outpaced the planner-s ability to utilize and transform the data into insightful information thus creating an adverse impact on the return on the investment made to collect and manage this data. Addressing this pressing need, we designed and developed DIVAD, a dynamic and interactive visual analytics dashboard to allow city planners to explore and analyze city-s transportation data to gain valuable insights about city-s traffic flow and transportation requirements. We demonstrate the potential of DIVAD through the use of interactive choropleth and hexagon binning maps to explore and analyze large taxi-transportation data of Singapore for different geographic and time zones.

Keywords: Geographic Information System (GIS), MovementData, GeoVisual Analytics, Urban Planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2390
7233 Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning

Authors: Chunming Xu

Abstract:

Sparse representation which can represent high dimensional data effectively has been successfully used in computer vision and pattern recognition problems. However, it doesn-t consider the label information of data samples. To overcome this limitation, we develop a novel dimensionality reduction algorithm namely dscriminatively regularized sparse subspace learning(DR-SSL) in this paper. The proposed DR-SSL algorithm can not only make use of the sparse representation to model the data, but also can effective employ the label information to guide the procedure of dimensionality reduction. In addition,the presented algorithm can effectively deal with the out-of-sample problem.The experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.

Keywords: sparse representation, dimensionality reduction, labelinformation, sparse subspace learning, gene-expression data classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1448
7232 Critical Assessment of Scoring Schemes for Protein-Protein Docking Predictions

Authors: Dhananjay C. Joshi, Jung-Hsin Lin

Abstract:

Protein-protein interactions (PPI) play a crucial role in many biological processes such as cell signalling, transcription, translation, replication, signal transduction, and drug targeting, etc. Structural information about protein-protein interaction is essential for understanding the molecular mechanisms of these processes. Structures of protein-protein complexes are still difficult to obtain by biophysical methods such as NMR and X-ray crystallography, and therefore protein-protein docking computation is considered an important approach for understanding protein-protein interactions. However, reliable prediction of the protein-protein complexes is still under way. In the past decades, several grid-based docking algorithms based on the Katchalski-Katzir scoring scheme were developed, e.g., FTDock, ZDOCK, HADDOCK, RosettaDock, HEX, etc. However, the success rate of protein-protein docking prediction is still far from ideal. In this work, we first propose a more practical measure for evaluating the success of protein-protein docking predictions,the rate of first success (RFS), which is similar to the concept of mean first passage time (MFPT). Accordingly, we have assessed the ZDOCK bound and unbound benchmarks 2.0 and 3.0. We also createda new benchmark set for protein-protein docking predictions, in which the complexes have experimentally determined binding affinity data. We performed free energy calculation based on the solution of non-linear Poisson-Boltzmann equation (nlPBE) to improve the binding mode prediction. We used the well-studied thebarnase-barstarsystem to validate the parameters for free energy calculations. Besides,thenlPBE-based free energy calculations were conducted for the badly predicted cases by ZDOCK and ZRANK. We found that direct molecular mechanics energetics cannot be used to discriminate the native binding pose from the decoys.Our results indicate that nlPBE-based calculations appeared to be one of the promising approaches for improving the success rate of binding pose predictions.

Keywords: protein-protein docking, protein-protein interaction, molecular mechanics energetics, Poisson-Boltzmann calculations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
7231 Estimation of PM2.5 Emissions and Source Apportionment Using Receptor and Dispersion Models

Authors: Swetha Priya Darshini Thammadi, Sateesh Kumar Pisini, Sanjay Kumar Shukla

Abstract:

Source apportionment using Dispersion model depends primarily on the quality of Emission Inventory. In the present study, a CMB receptor model has been used to identify the sources of PM2.5, while the AERMOD dispersion model has been used to account for missing sources of PM2.5 in the Emission Inventory. A statistical approach has been developed to quantify the missing sources not considered in the Emission Inventory. The inventory of each grid was improved by adjusting emissions based on road lengths and deficit in measured and modelled concentrations. The results showed that in CMB analyses, fugitive sources - soil and road dust - contribute significantly to ambient PM2.5 pollution. As a result, AERMOD significantly underestimated the ambient air concentration at most locations. The revised Emission Inventory showed a significant improvement in AERMOD performance which is evident through statistical tests.

Keywords: CMB, GIS, AERMOD, PM2.5, fugitive, emission inventory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 903
7230 Street Network in Bandung City, Indonesia: Comparison between City Center and New Commercial Area

Authors: Siska Soesanti, Norihiro Nakai

Abstract:

Bandung city center can be deemed as economic, social and cultural center. However the city center suffers from deterioration. The retail activities tend to shift outward the city center. Numerous idyllic residences changed into business premises in two villages situated in the north part of the city during 1990s, especially after a new highway and flyover opened. According to space syntax theory, the pattern of spatial integration in the urban grid is a prime determinant of movement patterns in the system. The syntactic analysis results show the flyover has insignificant influence on street network in the city center. However the flyover has been generating a major difference in the new commercial area since it has become relatively as strategic as the city center. Besides street network, local government policy, rapid private motorization and particular condition of each site also played important roles in encouraging the current commercial areas to flourish.

Keywords: City center, commercial area, space syntax, street network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1819
7229 Determining Cluster Boundaries Using Particle Swarm Optimization

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1721
7228 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.

Keywords: Predictive analysis, big data, predictive analysis algorithms. CART algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1076
7227 Techno-Economic Analysis of Motor-Generator Pair System and Virtual Synchronous Generator for Providing Inertia of Power System

Authors: Zhou Yingkun, Xu Guorui, Wei Siming, Huang Yongzhang

Abstract:

With the increasing of the penetration of renewable energy in power system, the whole inertia of the power system is declining, which will endanger the frequency stability of the power system. In order to enhance the inertia, virtual synchronous generator (VSG) has been proposed. In addition, the motor-generator pair (MGP) system is proposed to enhance grid inertia. Both of them need additional equipment to provide instantaneous energy, so the economic problem should be considered. In this paper, the basic working principle of MGP system and VSG are introduced firstly. Then, the technical characteristics and economic investment of MGP/VSG are compared by calculation and simulation. The results show that the MGP system can provide same inertia with less cost than VSG.

Keywords: High renewable energy penetration, inertia of power system, virtual synchronous generator, motor-generator pair system, techno-economic analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1262
7226 A Business-to-Business Collaboration System That Promotes Data Utilization While Encrypting Information on the Blockchain

Authors: Hiroaki Nasu, Ryota Miyamoto, Yuta Kodera, Yasuyuki Nogami

Abstract:

To promote Industry 4.0 and Society 5.0 and so on, it is important to connect and share data so that every member can trust it. Blockchain (BC) technology is currently attracting attention as the most advanced tool and has been used in the financial field and so on. However, the data collaboration using BC has not progressed sufficiently among companies on the supply chain of the manufacturing industry that handle sensitive data such as product quality, manufacturing conditions, etc. There are two main reasons why data utilization is not sufficiently advanced in the industrial supply chain. The first reason is that manufacturing information is top secret and a source for companies to generate profits. It is difficult to disclose data even between companies with transactions in the supply chain. Blockchain mechanism such as Bitcoin using Public Key Infrastructure (PKI) requires plaintext to be shared between companies in order to verify the identity of the company that sent the data. Another reason is that the merits (scenarios) of collaboration data between companies are not specifically specified in the industrial supply chain. For these problems, this paper proposes a Business to Business (B2B) collaboration system using homomorphic encryption and BC technique. Using the proposed system, each company on the supply chain can exchange confidential information on encrypted data and utilize the data for their own business. In addition, this paper considers a scenario focusing on quality data, which was difficult to collaborate because it is top-secret. In this scenario, we show an implementation scheme and a benefit of concrete data collaboration by proposing a comparison protocol that can grasp the change in quality while hiding the numerical value of quality data.

Keywords: Business to business data collaboration, industrial supply chain, blockchain, homomorphic encryption.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 823
7225 An Approximation of Daily Rainfall by Using a Pixel Value Data Approach

Authors: Sarisa Pinkham, Kanyarat Bussaban

Abstract:

The research aims to approximate the amount of daily rainfall by using a pixel value data approach. The daily rainfall maps from the Thailand Meteorological Department in period of time from January to December 2013 were the data used in this study. The results showed that this approach can approximate the amount of daily rainfall with RMSE=3.343.

Keywords: Daily rainfall, Image processing, Approximation, Pixel value data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1758
7224 Automatic Generation of Ontology from Data Source Directed by Meta Models

Authors: Widad Jakjoud, Mohamed Bahaj, Jamal Bakkas

Abstract:

Through this paper we present a method for automatic generation of ontological model from any data source using Model Driven Architecture (MDA), this generation is dedicated to the cooperation of the knowledge engineering and software engineering. Indeed, reverse engineering of a data source generates a software model (schema of data) that will undergo transformations to generate the ontological model. This method uses the meta-models to validate software and ontological models.

Keywords: Meta model, model, ontology, data source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1998
7223 Steps towards the Development of National Health Data Standards in Developing Countries: An Exploratory Qualitative Study in Saudi Arabia

Authors: Abdullah I. Alkraiji, Thomas W. Jackson, Ian R. Murray

Abstract:

The proliferation of health data standards today is somewhat overlapping and conflicting, resulting in market confusion and leading to increasing proprietary interests. The government role and support in standardization for health data are thought to be crucial in order to establish credible standards for the next decade, to maximize interoperability across the health sector, and to decrease the risks associated with the implementation of non-standard systems. The normative literature missed out the exploration of the different steps required to be undertaken by the government towards the development of national health data standards. Based on the lessons learned from a qualitative study investigating the different issues to the adoption of health data standards in the major tertiary hospitals in Saudi Arabia and the opinions and feedback from different experts in the areas of data exchange and standards and medical informatics in Saudi Arabia and UK, a list of steps required towards the development of national health data standards was constructed. Main steps are the existence of: a national formal reference for health data standards, an agreed national strategic direction for medical data exchange, a national medical information management plan and a national accreditation body, and more important is the change management at the national and organizational level. The outcome of this study can be used by academics and practitioners to develop the planning of health data standards, and in particular those in developing countries.

Keywords: Interoperability, Case Study, Health Data Standards, Medical Data Exchange, Saudi Arabia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2002
7222 Test Data Compression Using a Hybrid of Bitmask Dictionary and 2n Pattern Runlength Coding Methods

Authors: C. Kalamani, K. Paramasivam

Abstract:

In VLSI, testing plays an important role. Major problem in testing are test data volume and test power. The important solution to reduce test data volume and test time is test data compression. The Proposed technique combines the bit maskdictionary and 2n pattern run length-coding method and provides a substantial improvement in the compression efficiency without introducing any additional decompression penalty. This method has been implemented using Mat lab and HDL Language to reduce test data volume and memory requirements. This method is applied on various benchmark test sets and compared the results with other existing methods. The proposed technique can achieve a compression ratio up to 86%.

Keywords: Bit Mask dictionary, 2n pattern run length code, system-on-chip, SOC, test data compression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1922
7221 A Hybrid Data Mining Method for the Medical Classification of Chest Pain

Authors: Sung Ho Ha, Seong Hyeon Joo

Abstract:

Data mining techniques have been used in medical research for many years and have been known to be effective. In order to solve such problems as long-waiting time, congestion, and delayed patient care, faced by emergency departments, this study concentrates on building a hybrid methodology, combining data mining techniques such as association rules and classification trees. The methodology is applied to real-world emergency data collected from a hospital and is evaluated by comparing with other techniques. The methodology is expected to help physicians to make a faster and more accurate classification of chest pain diseases.

Keywords: Data mining, medical decisions, medical domainknowledge, chest pain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2223
7220 Smart Monitoring and Control of Tap Changer Using Intelligent Electronic Device

Authors: K. N. Dinesh Babu, G. R. Manjunatha, R. Ramaprabha, V. Rajini, M. V. Gopalan

Abstract:

In this paper, monitoring and control of tap changer mechanism of a transformer implementation in an Intelligent Electronic Device (IED) is discussed. It has been a custom for decades to provide a separate panel for on load tap changer control for monitoring the tap position. However, this facility cannot either record or transfer the information to remote control centers. As there is a technology shift towards the smart grid protection and control standards, the need for implementing remote control and monitoring has necessitated the implementation of this feature in numerical relays. This paper deals with the programming, settings and logic implementation which is applicable to both IEC 61850 compatible and non-compatible IEDs thereby eliminating the need for separate tap changer control equipment. The monitoring mechanism has been implemented in a 28MVA, 110 /6.9kV transformer with 16 tap position with GE make T60 IED at Ultratech cement limited Gulbarga, Karnataka and is in successful service.

Keywords: Transformer protection, tap changer control, tap position monitoring, on load tap changer, intelligent electronic device (IED).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4029
7219 Effect of Mesh Size on the Viscous Flow Parameters of an Axisymmetric Nozzle

Authors: Rabah Haoui

Abstract:

The aim of this work is to analyze a viscous flow in the axisymmetric nozzle taken into account the mesh size both in the free stream and into the boundary layer. The resolution of the Navier- Stokes equations is realized by using the finite volume method to determine the supersonic flow parameters at the exit of convergingdiverging nozzle. The numerical technique uses the Flux Vector Splitting method of Van Leer. Here, adequate time stepping parameter, along with CFL coefficient and mesh size level is selected to ensure numerical convergence. The effect of the boundary layer thickness is significant at the exit of the nozzle. The best solution is obtained with using a very fine grid, especially near the wall, where we have a strong variation of velocity, temperature and shear stress. This study enabled us to confirm that the determination of boundary layer thickness can be obtained only if the size of the mesh is lower than a certain value limits given by our calculations.

Keywords: Supersonic flow, viscous flow, finite volume, nozzle

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1920
7218 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses the issues and technique for textile industry using data mining techniques. Data mining has been applied to the stitching of garments products that were obtained from a textile company. Data mining techniques were applied to the data obtained from the CHAID algorithm, CART algorithm, Regression Analysis and, Artificial Neural Networks. Classification technique based analyses were used while data mining and decision model about the production per person and variables affecting about production were found by this method. In the study, the results show that as the daily working time increases, the production per person also decreases. In addition, the relationship between total daily working and production per person shows a negative result and the production per person show the highest and negative relationship.

Keywords: Data mining, textile production, decision trees, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539
7217 Application and Limitation of Parallel Modelingin Multidimensional Sequential Pattern

Authors: Mahdi Esmaeili, Mansour Tarafdar

Abstract:

The goal of data mining algorithms is to discover useful information embedded in large databases. One of the most important data mining problems is discovery of frequently occurring patterns in sequential data. In a multidimensional sequence each event depends on more than one dimension. The search space is quite large and the serial algorithms are not scalable for very large datasets. To address this, it is necessary to study scalable parallel implementations of sequence mining algorithms. In this paper, we present a model for multidimensional sequence and describe a parallel algorithm based on data parallelism. Simulation experiments show good load balancing and scalable and acceptable speedup over different processors and problem sizes and demonstrate that our approach can works efficiently in a real parallel computing environment.

Keywords: Sequential Patterns, Data Mining, ParallelAlgorithm, Multidimensional Sequence Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1479
7216 Generator of Hypotheses an Approach of Data Mining Based on Monotone Systems Theory

Authors: Rein Kuusik, Grete Lind

Abstract:

Generator of hypotheses is a new method for data mining. It makes possible to classify the source data automatically and produces a particular enumeration of patterns. Pattern is an expression (in a certain language) describing facts in a subset of facts. The goal is to describe the source data via patterns and/or IF...THEN rules. Used evaluation criteria are deterministic (not probabilistic). The search results are trees - form that is easy to comprehend and interpret. Generator of hypotheses uses very effective algorithm based on the theory of monotone systems (MS) named MONSA (MONotone System Algorithm).

Keywords: data mining, monotone systems, pattern, rule.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1258
7215 2 – Block 3 - Point Modified Numerov Block Methods for Solving Ordinary Differential Equations

Authors: Abdu Masanawa Sagir

Abstract:

In this paper, linear multistep technique using power series as the basis function is used to develop the block methods which are suitable for generating direct solution of the special second order ordinary differential equations of the form y′′ = f(x,y), a < = x < = b with associated initial or boundary conditions. The continuaous hybrid formulations enable us to differentiate and evaluate at some grids and off – grid points to obtain two different three discrete schemes, each of order (4,4,4)T, which were used in block form for parallel or sequential solutions of the problems. The computational burden and computer time wastage involved in the usual reduction of second order problem into system of first order equations are avoided by this approach. Furthermore, a stability analysis and efficiency of the block method are tested on linear and non-linear ordinary differential equations whose solutions are oscillatory or nearly periodic in nature, and the results obtained compared favourably with the exact solution.

Keywords: Block Method, Hybrid, Linear Multistep Method, Self – starting, Special Second Order.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1951
7214 Categorical Data Modeling: Logistic Regression Software

Authors: Abdellatif Tchantchane

Abstract:

A Matlab based software for logistic regression is developed to enhance the process of teaching quantitative topics and assist researchers with analyzing wide area of applications where categorical data is involved. The software offers an option of performing stepwise logistic regression to select the most significant predictors. The software includes a feature to detect influential observations in data, and investigates the effect of dropping or misclassifying an observation on a predictor variable. The input data may consist either as a set of individual responses (yes/no) with the predictor variables or as grouped records summarizing various categories for each unique set of predictor variables' values. Graphical displays are used to output various statistical results and to assess the goodness of fit of the logistic regression model. The software recognizes possible convergence constraints when present in data, and the user is notified accordingly.

Keywords: Logistic regression, Matlab, Categorical data, Influential observation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1885
7213 Role of Association Rule Mining in Numerical Data Analysis

Authors: Sudhir Jagtap, Kodge B. G., Shinde G. N., Devshette P. M

Abstract:

Numerical analysis naturally finds applications in all fields of engineering and the physical sciences, but in the 21st century, the life sciences and even the arts have adopted elements of scientific computations. The numerical data analysis became key process in research and development of all the fields [6]. In this paper we have made an attempt to analyze the specified numerical patterns with reference to the association rule mining techniques with minimum confidence and minimum support mining criteria. The extracted rules and analyzed results are graphically demonstrated. Association rules are a simple but very useful form of data mining that describe the probabilistic co-occurrence of certain events within a database [7]. They were originally designed to analyze market-basket data, in which the likelihood of items being purchased together within the same transactions are analyzed.

Keywords: Numerical data analysis, Data Mining, Association Rule Mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2862
7212 A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures

Authors: Silvina Caíno-Lores, Jesús Carretero

Abstract:

Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from the access to vastly distributed data. Recent works have suggested that improving data locality is key to move towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers, and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped in four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, the authors include a discussion on future research lines and synergies among the former techniques.

Keywords: Co-scheduling, data-centric, data-intensive, data locality, in-memory storage, large scale.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493
7211 Correction of Infrared Data for Electrical Components on a Board

Authors: Seong-Ho Song, Ki-Seob Kim, Seop-Hyeong Park, Seon-Woo Lee

Abstract:

In this paper, the data correction algorithm is suggested when the environmental air temperature varies. To correct the infrared data in this paper, the initial temperature or the initial infrared image data is used so that a target source system may not be necessary. The temperature data obtained from infrared detector show nonlinear property depending on the surface temperature. In order to handle this nonlinear property, Taylor series approach is adopted. It is shown that the proposed algorithm can reduce the influence of environmental temperature on the components in the board. The main advantage of this algorithm is to use only the initial temperature of the components on the board rather than using other reference device such as black body sources in order to get reference temperatures.

Keywords: Infrared camera, Temperature Data compensation, Environmental Ambient Temperature, Electric Component

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1528
7210 GridNtru: High Performance PKCS

Authors: Narasimham Challa, Jayaram Pradhan

Abstract:

Cryptographic algorithms play a crucial role in the information society by providing protection from unauthorized access to sensitive data. It is clear that information technology will become increasingly pervasive, Hence we can expect the emergence of ubiquitous or pervasive computing, ambient intelligence. These new environments and applications will present new security challenges, and there is no doubt that cryptographic algorithms and protocols will form a part of the solution. The efficiency of a public key cryptosystem is mainly measured in computational overheads, key size and bandwidth. In particular the RSA algorithm is used in many applications for providing the security. Although the security of RSA is beyond doubt, the evolution in computing power has caused a growth in the necessary key length. The fact that most chips on smart cards can-t process key extending 1024 bit shows that there is need for alternative. NTRU is such an alternative and it is a collection of mathematical algorithm based on manipulating lists of very small integers and polynomials. This allows NTRU to high speeds with the use of minimal computing power. NTRU (Nth degree Truncated Polynomial Ring Unit) is the first secure public key cryptosystem not based on factorization or discrete logarithm problem. This means that given sufficient computational resources and time, an adversary, should not be able to break the key. The multi-party communication and requirement of optimal resource utilization necessitated the need for the present day demand of applications that need security enforcement technique .and can be enhanced with high-end computing. This has promoted us to develop high-performance NTRU schemes using approaches such as the use of high-end computing hardware. Peer-to-peer (P2P) or enterprise grids are proven as one of the approaches for developing high-end computing systems. By utilizing them one can improve the performance of NTRU through parallel execution. In this paper we propose and develop an application for NTRU using enterprise grid middleware called Alchemi. An analysis and comparison of its performance for various text files is presented.

Keywords: Alchemi, GridNtru, Ntru, PKCS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1693