Search results for: Data Aggregation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7522

7012 Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

Authors: Jindong Gu, Matthias Schubert, Volker Tresp

Abstract:

In many outlier detection tasks, only training data belonging to one class, i.e., the positive class, is available. The task is then to predict a new data point as belonging either to the positive class or to the negative class, in which case the data point is considered an outlier. For this task, we propose a novel corrupted Generative Adversarial Network (CorGAN). In the adversarial process of training CorGAN, the Generator generates outlier samples for the negative class, and the Discriminator is trained to distinguish the positive training data from the generated negative data. The proposed framework is evaluated using an image dataset and a real-world network intrusion dataset. Our outlier-detection method achieves state-of-the-art performance on both tasks.

Keywords: Outlier detection, generative adversary networks, semi-supervised learning.

Downloads 1077
7011 Methodology of the Turkey’s National Geographic Information System Integration Project

Authors: Buse A. Ataç, Doğan K. Cenan, Arda Çetinkaya, Naz D. Şahin, Köksal Sanlı, Zeynep Koç, Akın Kısa

Abstract:

With their capabilities for spatial data reliability, interpretation, and querying, Geographic Information Systems (GIS) make significant contributions to the work of scientists, planners, and practitioners. GIS have received great attention in today's digital world; they are growing rapidly and being used ever more efficiently. Access to and use of current and accurate geographic data, the most important component of a GIS, has become a necessity for sustainable and economic development. This project aims to enable the sharing of data collected by public institutions and organizations on a web-based platform. Within the scope of the project, the INSPIRE (Infrastructure for Spatial Information in the European Community) data specifications are used as a road map. In this context, Turkey's National Geographic Information System (TUCBS) Integration Project supports the sharing of spatial data among 61 pilot public institutions in compliance with defined national standards. This paper, prepared by members of the TUCBS Integration Project team, explains the technical process with a detailed methodology. The main technical processes of the project are Geographic Data Analysis, Geographic Data Harmonization (Standardization), Web Service Creation (WMS, WFS), and Metadata Creation-Publication. The paper describes the integration process carried out so that the data produced by the 61 institutions can be shared through the National Geographic Data Portal (GEOPORTAL).

Keywords: Data specification, geoportal, GIS, INSPIRE, TUCBS, Turkey’s National Geographic Information System.

Downloads 701
7010 Exploring SSD Suitable Allocation Schemes Incompliance with Workload Patterns

Authors: Jae Young Park, Hwansu Jung, Jong Tae Kim

Abstract:

Whether data is well parallelized is an important factor in Solid-State Drive (SSD) performance. SSD parallelization is affected by the allocation scheme, which is therefore directly connected to SSD performance. The representative allocation schemes are dynamic allocation and static allocation. Dynamic allocation is more adaptive in exploiting write operation parallelism, while static allocation is better at read operation parallelism. It is therefore hard to select the appropriate allocation scheme when the workload mixes read and write operations. We simulated several mixed data patterns and analyzed the results to help choose the right scheme for better performance. The results show that static allocation is more suitable when the data arrival interval is long enough for prior operations to finish and the workload is continuous and read-intensive. Dynamic allocation performs best for write performance and random data patterns.

Keywords: Dynamic allocation, NAND Flash based SSD, SSD parallelism, static allocation.

Downloads 1997
7009 WebAppShield: An Approach Exploiting Machine Learning to Detect SQLi Attacks in an Application Layer in Run-Time

Authors: Ahmed Abdulla Ashlam, Atta Badii, Frederic Stahl

Abstract:

In recent years, SQL injection attacks have been identified as being prevalent against web applications. They affect network security and user data, which leads to a considerable loss of money and data every year. This paper presents the use of machine learning classification algorithms to classify login inputs as "SQLi" or "Non-SQLi", thus increasing the reliability and accuracy of results in deciding whether an operation is an attack or a valid operation. A web application is developed for auto-generated data replication to provide a twin of the targeted data structure. Shielding against SQLi attacks (WebAppShield) that verifies all users and prevents attackers (SQLi attacks) from entering and/or accessing the database, which the machine learning module predicts as "Non-SQLi", has been developed. A special login form has been developed with a special instance of the data validation; this verification process secures the web application from its early stages. The system has been tested and validated, and up to 99% of SQLi attacks have been prevented.

Keywords: SQL injection, attacks, web application, accuracy, database, WebAppShield.

Downloads 450
7008 Adaptive Kernel Principal Analysis for Online Feature Extraction

Authors: Mingtao Ding, Zheng Tian, Haixia Xu

Abstract:

The batch nature limits the standard kernel principal component analysis (KPCA) methods in numerous applications, especially for dynamic or large-scale data. In this paper, an efficient adaptive approach is presented for online extraction of the kernel principal components (KPC). The contribution of this paper may be divided into two parts. First, the kernel covariance matrix is correctly updated to adapt to the changing characteristics of the data. Second, the KPC are recursively formulated to overcome the batch nature of standard KPCA. This formulation is derived from the recursive eigen-decomposition of the kernel covariance matrix and indicates the KPC variation caused by the new data. The proposed method not only alleviates the sub-optimality of the KPCA method for non-stationary data, but also maintains constant update speed and memory usage as the data size increases. Experiments on simulation data and real applications demonstrate that our approach yields improvements in terms of both computational speed and approximation accuracy.

Keywords: adaptive method, kernel principal component analysis, online extraction, recursive algorithm

Downloads 1554
7007 Proposing an Efficient Method for Frequent Pattern Mining

Authors: Vaibhav Kant Singh, Vijay Shah, Yogendra Kumar Jain, Anupam Shukla, A.S. Thoke, Vinay Kumar Singh, Chhaya Dule, Vivek Parganiha

Abstract:

Data mining is the exploration of knowledge from the large sets of data generated as a result of various data processing activities. Frequent pattern mining is a very important task in data mining. Previous approaches to generating frequent sets generally adopt candidate generation and pruning techniques to achieve the desired objective. This paper shows how the different approaches achieve the objective of frequent mining, along with the complexities required to perform the job. It also looks at a hardware approach, cache coherence, to improve the efficiency of this process. Data mining is helpful in building support systems for Management, Bioinformatics, Biotechnology, Medical Science, Statistics, Mathematics, Banking, Networking and other computer-related applications. This paper proposes the use of both the upward and downward closure properties for the extraction of frequent item sets, which reduces the total number of scans required for the generation of candidate sets.

Keywords: Data Mining, Candidate Sets, Frequent Item set, Pruning.
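
The candidate generation and pruning that the abstract attributes to earlier approaches can be sketched as below. This is a minimal illustration of the classical level-wise (Apriori-style) scheme using only the downward closure property; it is not the method proposed in the paper, and the function name `frequent_itemsets` and its parameters are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise frequent itemset mining with downward-closure pruning.

    transactions: iterable of item collections (hypothetical input format).
    min_support: minimum number of transactions that must contain an itemset.
    Returns a dict mapping each frequent itemset (frozenset) to its count.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current:
        # One scan over the data counts every candidate of the current size.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent sets to form candidates that are one item larger.
        size += 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Downward closure: keep a candidate only if all of its
        # (size - 1)-item subsets were frequent at the previous level.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent
```

Each pass of the loop corresponds to one scan of the data, which is the cost the abstract says the proposed method aims to reduce.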

Downloads 1687
7006 Stochastic Simulation of Reaction-Diffusion Systems

Authors: Paola Lecca, Lorenzo Dematte

Abstract:

Reaction-diffusion systems are mathematical models that describe how the concentration of one or more substances distributed in space changes under the influence of local chemical reactions, in which the substances are converted into each other, and diffusion, which causes the substances to spread out in space. The classical representation of a reaction-diffusion system is given by semi-linear parabolic partial differential equations, whose general form is ∂X(x, t)/∂t = DΔX(x, t), where X(x, t) is the state vector, D is the matrix of the diffusion coefficients and Δ is the Laplace operator. If the solutes move in a homogeneous system in thermal equilibrium, the diffusion coefficients are constants that do not depend on the local concentration of solvent and solutes or on the local temperature of the medium. In this paper, a new stochastic reaction-diffusion model is presented in which the diffusion coefficients are functions of the local concentration, viscosity and frictional forces of solvent and solute. Such a model provides a more realistic description of the molecular kinetics in non-homogeneous and highly structured media such as the intra- and inter-cellular spaces. The movement of a molecule A from a region i to a region j of the space is described as a first-order reaction A_i → A_j with rate constant k, where k depends on the diffusion coefficient. Representing the diffusional motion as a chemical reaction makes it possible to treat a reaction-diffusion system as a pure reaction system and to simulate it with Gillespie-inspired stochastic simulation algorithms. The stochastic time evolution of the system is given by the occurrence of diffusion events and chemical reaction events. At each time step an event (reaction or diffusion) is selected from a probability distribution of waiting times determined by the specific speed of reaction and diffusion events. Redi is the software tool developed to implement the model of reaction-diffusion kinetics and dynamics. It is free software that can be downloaded from http://www.cosbi.eu. To demonstrate the validity of the new reaction-diffusion model, the simulation results of chaperone-assisted protein folding in cytoplasm obtained with Redi are reported. This case study is again drawing the attention of the scientific community due to current interest in protein aggregation as a potential cause of neurodegenerative diseases.

Keywords: Reaction-diffusion systems, Fick's law, stochastic simulation algorithm.
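
The idea of treating diffusion as a first-order reaction and simulating it with a Gillespie-style algorithm can be sketched as follows. This is a simplified illustration, not the Redi implementation: it assumes a one-dimensional chain of compartments, a single species, no chemical reactions, and a constant hop rate `k`, whereas the paper lets the rate depend on local conditions. The name `simulate_diffusion` and its parameters are hypothetical.

```python
import math
import random


def simulate_diffusion(counts, k, t_end, rng):
    """Gillespie-style simulation of molecules hopping between neighbours.

    counts: initial molecule count per compartment (1-D chain, assumed).
    k: hop rate constant per molecule and direction (assumed constant).
    t_end: simulated time at which to stop.
    rng: a random.Random instance, so runs are reproducible.
    """
    counts = list(counts)
    t = 0.0
    while True:
        # Each hop i -> j is a first-order reaction with propensity k * n_i.
        events = []
        for i, n in enumerate(counts):
            for j in (i - 1, i + 1):
                if 0 <= j < len(counts) and n > 0:
                    events.append((k * n, i, j))
        total = sum(a for a, _, _ in events)
        if total == 0.0:
            break
        # The waiting time to the next event is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose one event with probability proportional to its propensity.
        r = rng.random() * total
        acc = 0.0
        chosen = events[-1]  # fallback guards against rounding at the edge
        for event in events:
            acc += event[0]
            if r < acc:
                chosen = event
                break
        _, i, j = chosen
        counts[i] -= 1
        counts[j] += 1
    return counts
```

Because every event only moves one molecule from a compartment to its neighbour, the total number of molecules is conserved, which gives a quick sanity check on any run.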

Downloads 1743
7005 Danger Theory and Intelligent Data Processing

Authors: Anjum Iqbal, Mohd Aizaini Maarof

Abstract:

Artificial Immune System (AIS) is a relatively naive paradigm for intelligent computation. The inspiration for AIS is derived from the natural Immune System (IS). Classically, it is believed that the IS strives to discriminate between self and non-self. Most of the existing AIS research is based on this approach. Danger Theory (DT) disputes this approach and proposes that the IS fights against danger-producing elements and tolerates others. We, as computational researchers, are not concerned with the arguments among immunologists but try to extract from them novel abstractions for intelligent computation. This paper aims to follow the DT inspiration for intelligent data processing. The approach may open a new avenue in intelligent processing. The data used are system call data, which are potentially significant in intrusion detection applications.

Keywords: artificial immune system, danger theory, intelligent processing, system calls

Downloads 1887
7004 Using Artificial Neural Network to Forecast Groundwater Depth in Union County Well

Authors: Zahra Ghadampour, Gholamreza Rakhshandehroo

Abstract:

A concern that researchers usually face in applications of Artificial Neural Networks (ANN) is determining the size of the effective domain in a time series. In this paper, a trial-and-error method was applied to a groundwater depth time series from an observation well in Union County, New Jersey, U.S., to determine the size of the effective domain. Domains of 20, 40, 60, 80, 100, and 120 preceding days were examined, and 80 days was taken as the effective domain length. Data sets in the different domains were fed to a Feed Forward Back Propagation ANN with one hidden layer, and the groundwater depths were forecasted. The Root Mean Square Error (RMSE) and the correlation factor (R2) of estimated and observed groundwater depths were determined for all domains. In general, the groundwater depth forecast improved, as evidenced by lower RMSEs and higher R2s, when the domain length increased from 20 to 120. However, 80 days was selected as the effective domain because the improvement was less than 1% beyond that. Forecasted groundwater depths using measured daily data (set #1) and data averaged over the effective domain (set #2) were compared. It was postulated that the more accurate nature of measured daily data was the reason for the better forecast, with lower RMSE (0.1027 m compared to 0.255 m), in set #1. However, the size of the input data in this set was 80 times that of set #2, a factor that may increase the computational effort unpredictably. It was concluded that 80 daily data may be successfully utilized to lower the size of input data sets considerably, while maintaining the effective information in the data set.

Keywords: Neural networks, groundwater depth, forecast.
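
The two bookkeeping steps the abstract relies on, building inputs from a fixed number of preceding days and scoring a forecast by RMSE, can be sketched as below. This is only an illustration of those steps under assumed names (`make_windows`, `rmse`); it does not reproduce the paper's neural network or its data.

```python
import math


def make_windows(series, window):
    """Pair each value with the `window` values that immediately precede it."""
    inputs = [series[i - window:i] for i in range(window, len(series))]
    targets = [series[i] for i in range(window, len(series))]
    return inputs, targets


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))
```

With a window of 80, each training input would hold the 80 preceding daily depths, which is the effective domain the abstract settles on.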

Downloads 2522
7003 Organizational Data Security in Perspective of Ownership of Mobile Devices Used by Employees for Works

Authors: B. Ferdousi, J. Bari

Abstract:

With the advancement of mobile computing, employees are increasingly doing their job-related work using personally owned or organization-owned mobile devices. The Bring Your Own Device (BYOD) model allows employees to use their own mobile devices for job-related work, while the Corporate Owned, Personally Enabled (COPE) model allows both organizations and employees to install applications onto organization-owned mobile devices used for job-related work. While there are many benefits of using mobile computing for job-related work, there are also serious concerns about different levels of threat to organizational data security. Consequently, it is crucial to know the level of threat to organizational data security in the BYOD and COPE models. It is also important to ensure that employees comply with the organizational data security policy. This paper discusses organizational data security issues from the perspective of ownership of the mobile devices used by employees, especially in the BYOD and COPE models. It appears that while the BYOD model has many benefits, it carries relatively more data security risks than the COPE model. The findings also showed that in both BYOD and COPE environments, a more practical approach towards achieving secure mobile computing in an organizational setting is the development of comprehensive cybersecurity policies that balance employees' need for convenience with organizational data security. The study helps to identify the compliance issues and the risks of security breaches in the BYOD and COPE models.

Keywords: Data security, mobile computing, BYOD, COPE, cybersecurity policy, cybersecurity compliance.

Downloads 387
7002 New Security Approach of Confidential Resources in Hybrid Clouds

Authors: Haythem Yahyaoui, Samir Moalla, Mounir Bouden, Skander Ghorbel

Abstract:

Nowadays, cloud environments are becoming a necessity for companies; this technology makes it possible to access data anywhere and at any time. It also provides optimized and secured access to resources and gives more security to the data stored on the platform. However, some companies do not trust cloud providers; they think that providers can access and modify confidential data such as bank accounts. Many works have addressed this issue and conclude that encryption methods implemented by providers ensure confidentiality, but they overlook the fact that cloud providers can decrypt the confidential resources. The best solution here is to apply some operations to the data before sending them to the cloud provider in order to make them unreadable. The principal idea is to let the user protect his data with his own methods. In this paper, we demonstrate our approach and show that it is more efficient in terms of execution time than some existing methods. This work aims at enhancing the quality of service of providers and ensuring the trust of customers.

Keywords: Confidentiality, cryptography, security issues, trust issues.

Downloads 1475
7001 A Novel Web Metric for the Evaluation of Internet Trends

Authors: Radek Malinský, Ivan Jelínek

Abstract:

Web 2.0 (social networking, blogging and online forums) can serve as a data source for social science research because it contains a vast amount of information from many different users. The volume of that information has been growing at a very high rate, forming a network of heterogeneous data; this makes items difficult to find and therefore limits their usefulness. We have proposed a novel theoretical model for gathering and processing data from Web 2.0 that reflects the semantic content of web pages in a better way. This article deals with the analysis part of the model and its use for content analysis of blogs. The introductory part describes the methodology for gathering and processing data from blogs. The next part focuses on the evaluation and content analysis of blogs that write about a specific trend.

Keywords: Blog, Sentiment Analysis, Web 2.0, Webometrics

Downloads 3546
7000 Encoding and Compressing Data for Decreasing Number of Switches in Baseline Networks

Authors: Mohammad Ali Jabraeil Jamali, Ahmad Khademzadeh, Hasan Asil, Amir Asil

Abstract:

This method decreases power usage (expenditure) in networks on chip (NOC). It encodes data for transfer in order to reduce expenditure, and it uses data compression to reduce data size. Expenditure calculation in an NOC occurs inside the NOC, based on grown models and transitive activities in entry ports. The goal of the simulation is to weigh the expenditure for encoding, decoding and compressing in Baseline networks and the reduction of switches in this type of network.

Keywords: Networks on chip, Compression, Encoding, Baseline networks, Banyan networks

Downloads 1985
6999 Sampled-Data Control for Fuel Cell Systems

Authors: H. Y. Jung, Ju H. Park, S. M. Lee

Abstract:

A sampled-data controller is presented for solid oxide fuel cell systems expressed by a sector-bounded nonlinear model. The proposed control law is obtained by solving a convex problem satisfying several linear matrix inequalities. Simulation results are given to show the effectiveness of the proposed design method.

Keywords: Sampled-data control, Sector bound, Solid oxide fuel cell, Time-delay.

Downloads 1724
6998 Automatic Detection and Spatio-temporal Analysis of Commercial Accumulations Using Digital Yellow Page Data

Authors: Yuki Akiyama, Hiroaki Sengoku, Ryosuke Shibasaki

Abstract:

In this study, the locations and areas of commercial accumulations were detected by using digital yellow page data. An original buffering method that can accurately create polygons of commercial accumulations is proposed in this paper; by using this method, the distribution of commercial accumulations can be easily created and monitored over a wide area. The locations, areas, and time-series changes of commercial accumulations in the South Kanto region can be monitored by integrating polygons of commercial accumulations with the time series of digital yellow page data. The circumstances of commercial accumulations were shown to vary according to area, that is, highly urbanized regions such as the city center of Tokyo and prefectural capitals, suburban areas near large cities, and suburban and rural areas.

Keywords: Commercial accumulations, Spatio-temporal analysis, Urban monitoring, Yellow page data

Downloads 1265
6997 EEG Waves Classifier using Wavelet Transform and Fourier Transform

Authors: Maan M. Shaker

Abstract:

The electroencephalograph (EEG) signal is one of the most widely used signals in the bioinformatics field due to its rich information about human tasks. In this work, EEG wave classification is achieved using the Discrete Wavelet Transform (DWT) with the Fast Fourier Transform (FFT) on normalized EEG data. The DWT is used as a classifier of the EEG wave frequencies, while the FFT is implemented to visualize the EEG waves in the multi-resolution levels of the DWT. Several real EEG data sets (for both normal and abnormal persons) have been tested, and the results support the validity of the proposed technique.

Keywords: Bioinformatics, DWT, EEG waves, FFT.

Downloads 5563
6996 Obstacle Classification Method Based On 2D LIDAR Database

Authors: Moohyun Lee, Soojung Hur, Yongwan Park

Abstract:

We propose an obstacle classification method based on a 2D LIDAR database. The existing obstacle classification method based on 2D LIDAR has advantages in accuracy and shorter calculation time. However, it was difficult to classify the type of obstacle, and therefore accurate path planning was not possible. To overcome this problem, a method of classifying obstacle type based on obstacle width data was proposed. However, width data alone was not sufficient to improve accuracy. In this paper, a database was established from width and intensity data: a first classification is performed on the width data and a second on the intensity data, each by comparison with the database, and the classification result is the entry with the highest similarity value. An experiment using an actual autonomous vehicle in a real environment shows that calculation time declined in comparison to 3D LIDAR and that it was possible to classify obstacles using a single 2D LIDAR.

Keywords: Obstacle, Classification, LIDAR, Segmentation, Width, Intensity, Database.

Downloads 3449
6995 An Empirical Mode Decomposition Based Method for Action Potential Detection in Neural Raw Data

Authors: Sajjad Farashi, Mohammadjavad Abolhassani, Mostafa Taghavi Kani

Abstract:

Information in the nervous system is coded as firing patterns of electrical signals called action potentials or spikes, so an essential step in the analysis of neural mechanisms is the detection of action potentials embedded in neural data. Several methods have been proposed in the literature for this purpose. In this paper, a novel method based on empirical mode decomposition (EMD) has been developed. EMD is a decomposition method that extracts oscillations with different frequency ranges from a waveform. The method is adaptive and requires no a priori knowledge about the data or parameter adjustment. The results for simulated data indicate that the proposed method is comparable with wavelet-based methods for spike detection. For neural signals with a signal-to-noise ratio near 3, the proposed method is capable of detecting more than 95% of action potentials accurately.

Keywords: EMD, neural data processing, spike detection, wavelet decomposition.

Downloads 2377
6994 Platform-as-a-Service Sticky Policies for Privacy Classification in the Cloud

Authors: Maha Shamseddine, Amjad Nusayr, Wassim Itani

Abstract:

In this paper, we present a Platform-as-a-Service (PaaS) model for controlling the privacy enforcement mechanisms applied on user data when stored and processed in Cloud data centers. The proposed architecture consists of establishing user configurable ‘sticky’ policies on the Graphical User Interface (GUI) data-bound components during the application development phase to specify the details of privacy enforcement on the contents of these components. Various privacy classification classes on the data components are formally defined to give the user full control on the degree and scope of privacy enforcement including the type of execution containers to process the data in the Cloud. This not only enhances the privacy-awareness of the developed Cloud services, but also results in major savings in performance and energy efficiency due to the fact that the privacy mechanisms are solely applied on sensitive data units and not on all the user content. The proposed design is implemented in a real PaaS cloud computing environment on the Microsoft Azure platform.

Keywords: Privacy enforcement, Platform-as-a-Service privacy awareness, cloud computing privacy.

Downloads 764
6993 DIFFER: A Propositionalization approach for Learning from Structured Data

Authors: Thashmee Karunaratne, Henrik Böstrom

Abstract:

Logic-based methods for learning from structured data are limited w.r.t. handling large search spaces, preventing large-sized substructures from being considered by the resulting classifiers. A novel approach to learning from structured data is introduced that employs a structure transformation method, called finger printing, for addressing these limitations. The method, which generates features corresponding to arbitrarily complex substructures, is implemented in a system called DIFFER. The method is demonstrated to perform comparably to an existing state-of-the-art method on some benchmark data sets without requiring restrictions on the search space. Furthermore, learning from the union of features generated by finger printing and the previous method outperforms learning from each individual set of features on all benchmark data sets, demonstrating the benefit of developing complementary, rather than competing, methods for structure classification.

Keywords: Machine learning, Structure classification, Propositionalization.

Downloads 1226
6992 Improving the Performance of Proxy Server by Using Data Mining Technique

Authors: P. Jomsri

Abstract:

Currently, web usage generates a huge amount of data from many users. In general, a proxy server is a system that supports users' web usage, and its management can be guided by hit rates. This research tries to improve hit rates in a proxy system by applying a data mining technique. The data set was collected from proxy servers in a university, and relationships among several features were investigated. The model is used to predict websites that will be accessed in the future. An association rule technique is applied to find relations among Date, Time, Main Group web, Sub Group web, and Domain name in order to create the model. The results showed that this technique can predict web content for the next day; moreover, the future accesses of websites increased from 38.15% to 85.57%. This model can predict web page access, which tends to increase the efficiency of proxy servers. In addition, the performance of internet access will be improved, helping to reduce network traffic.

Keywords: Association rule, proxy server, data mining.
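
The support and confidence measures that underlie association rules can be sketched as below. The record format (sets of `field=value` strings such as `day=Mon`) and the function name `rule_metrics` are assumptions made for illustration; the abstract does not describe how the proxy log fields were encoded or which thresholds were used.

```python
def rule_metrics(records, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    records: list of item collections, e.g. {"day=Mon", "group=news"}
             (hypothetical encoding of proxy log fields).
    Returns (support, confidence) as fractions between 0 and 1.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    # Records that contain the left-hand side of the rule.
    n_ante = sum(1 for r in records if antecedent <= set(r))
    # Records that contain both sides of the rule.
    n_both = sum(1 for r in records if (antecedent | consequent) <= set(r))
    support = n_both / len(records) if records else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence
```

A rule with high confidence, such as a given day and time implying a given site group, is the kind of relation a proxy could use to decide what to keep cached.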

Downloads 3064
6991 Performance Analysis of the Subgroup Method for Collective I/O

Authors: Kwangho Cha, Hyeyoung Cho, Sungho Kim

Abstract:

As many scientific applications require large data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the considerable features of parallel I/O and enables application programmers to easily handle their large data volume. In this paper we measured and analyzed the performance of original collective I/O and the subgroup method, the way of using collective I/O of MPI effectively. From the experimental results, we found that the subgroup method showed good performance with small data size.

Keywords: Collective I/O, MPI, parallel file system.

Downloads 1579
6990 Statistical Analysis for Overdispersed Medical Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

Many researchers have suggested the use of zero inflated Poisson (ZIP) and zero inflated negative binomial (ZINB) models in modeling overdispersed medical count data with extra variation caused by extra zeros and unobserved heterogeneity. The studies indicate that ZIP and ZINB always provide a better fit than the normal Poisson and negative binomial models in modeling overdispersed medical count data. In this study, we propose the use of Zero Inflated Inverse Trinomial (ZIIT), Zero Inflated Poisson Inverse Gaussian (ZIPIG) and Zero Inflated Strict Arcsine (ZISA) models in modeling overdispersed medical count data. These proposed models are not widely used by researchers, especially in the medical field. The results show that the three suggested models can serve as alternative models for overdispersed medical count data. This is supported by the application of the suggested models to a real-life medical data set. Inverse trinomial, Poisson inverse Gaussian and strict arcsine are discrete distributions whose variance is a cubic function of the mean. Therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tails. They are recommended for modeling overdispersed medical count data when ZIP and ZINB are inadequate.

Keywords: Zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit.
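
For readers unfamiliar with zero inflation, the baseline ZIP model mentioned in the abstract can be sketched as below. It mixes a point mass at zero (weight `pi`) with an ordinary Poisson distribution (rate `lam`); the function name `zip_pmf` is hypothetical, and the ZIIT, ZIPIG and ZISA models proposed in the paper are not shown.

```python
import math


def zip_pmf(k, lam, pi):
    """Probability of count k under a zero inflated Poisson model.

    lam: rate of the Poisson component (lam >= 0).
    pi: probability of a structural (extra) zero, between 0 and 1.
    """
    if k < 0 or lam < 0 or not 0.0 <= pi <= 1.0:
        raise ValueError("need k >= 0, lam >= 0 and 0 <= pi <= 1")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from the extra point mass or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Under this model the mean is (1 - pi) * lam and the variance is (1 - pi) * lam * (1 + pi * lam), so the variance exceeds the mean whenever pi and lam are both positive, which is the overdispersion the abstract refers to.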

Downloads 3319
6989 Student Satisfaction Data for Work Based Learners

Authors: Rosie Borup, Hanifa Shah

Abstract:

This paper aims to describe how student satisfaction is measured for work-based learners. These are non-traditional learners who conduct academic learning in the workplace, whose curricula typically have a high degree of negotiation, and whose motivations are directly related to their employers' needs as well as their own career ambitions. We argue that, while increasing work-based learning (WBL) participation and the use of student satisfaction data (SSD) are both accepted as being of strategic importance to the HE agenda, the use of WBL SSD is rarely examined; lessons can be learned from comparing SSD from a range of WBL programmes, and increased visibility of this type of data will provide insight into ways to improve and develop this type of delivery. The key themes that emerged from the analysis of the interview data were: learners' profiles and needs, employers' drivers, academic staff drivers, organizational approach, tools for collecting data, and visibility of findings. The paper concludes with observations on best practice in the collection, analysis and use of WBL SSD, thus offering recommendations for both academic managers and practitioners.

Keywords: Student satisfaction data, work based learning, employer engagement, NSS.

Downloads 1494
6988 A Consistency Protocol Multi-Layer for Replicas Management in Large Scale Systems

Authors: Ghalem Belalem, Yahya Slimani

Abstract:

Large scale systems such as computational Grids are distributed computing infrastructures that can provide globally available network resources. The evolution of information processing systems in Data Grids is characterized by a strong decentralization of data in several fields, with the objective of ensuring the availability and reliability of the data in order to provide fault tolerance and scalability, which is possible only through the use of replication techniques. Unfortunately, the use of these techniques has a high cost, because consistency must be maintained between the distributed data. Nevertheless, accepting certain imperfections can improve the performance of the system by improving concurrency. In this paper, we propose a multi-layer protocol that combines the pessimistic and optimistic approaches to data consistency maintenance in large scale systems. Our approach is based on a hierarchical representation model with tree layers and has a dual objective: first, to reduce response times compared to a completely pessimistic approach, and second, to improve the quality of service compared to an optimistic approach.

Keywords: Data Grid, replication, consistency, optimistic approach, pessimistic approach.

Downloads 1578
6987 Analysis of a Population of Diabetic Patients Databases with Classifiers

Authors: Murat Koklu, Yavuz Unal

Abstract:

Data mining can be described as a technique for extracting information from data. It is the process of obtaining hidden information and then turning it into qualified knowledge by means of statistical and artificial intelligence techniques. One of its application areas is medicine, where it is used to build decision support systems for diagnosis by deriving meaningful information from given medical data. In this study, a decision support system for the diagnosis of illness was developed that makes use of data mining and three different artificial intelligence classifier algorithms, namely Multilayer Perceptron, Naive Bayes Classifier and J48. The Pima Indian dataset from the UCI Machine Learning Repository was used. This dataset includes urinary and blood test results of 768 patients. These test results consist of 8 different features. The classification results obtained were compared with those of previous studies, and suggestions for future studies are presented.

Keywords: Artificial Intelligence, Classifiers, Data Mining, Diabetic Patients.

Downloads 5432
6986 Discovery of Time Series Event Patterns based on Time Constraints from Textual Data

Authors: Shigeaki Sakurai, Ken Ueno, Ryohei Orihara

Abstract:

This paper proposes a method that discovers time series event patterns from textual data with time information. The patterns are composed of sequences of events and each event is extracted from the textual data, where an event is characteristic content included in the textual data such as a company name, an action, and an impression of a customer. The method introduces 7 types of time constraints based on the analysis of the textual data. The method also evaluates these constraints when the frequency of a time series event pattern is calculated. We can flexibly define the time constraints for interesting combinations of events and can discover valid time series event patterns which satisfy these conditions. The paper applies the method to daily business reports collected by a sales force automation system and verifies its effectiveness through numerical experiments.

Keywords: Text mining, sequential mining, time constraints, daily business reports.
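
How a time constraint can enter the frequency count of an event pattern may be sketched as follows. The abstract above does not specify its seven constraint types, so this example assumes a single hypothetical one: a maximum gap `max_gap` between consecutive matched events. The function names and the data layout (lists of `(time, event)` pairs sorted by time) are likewise assumptions made for illustration, not the authors' method.

```python
def matches(sequence, pattern, max_gap):
    """Return True if `pattern` occurs in order within `sequence` such that
    consecutive matched events are at most `max_gap` time units apart.

    `sequence` is a list of (time, event) pairs sorted by time.
    """
    def search(start, idx, prev_time):
        if idx == len(pattern):
            return True  # every event of the pattern has been matched
        for i in range(start, len(sequence)):
            time, event = sequence[i]
            if prev_time is not None and time - prev_time > max_gap:
                break  # sorted by time, so later events violate the gap too
            if event == pattern[idx] and search(i + 1, idx + 1, time):
                return True
        return False

    return search(0, 0, None)


def frequency(sequences, pattern, max_gap):
    """Number of sequences that contain `pattern` under the gap constraint."""
    return sum(1 for s in sequences if matches(s, pattern, max_gap))
```

Tightening `max_gap` can only lower the frequency of a pattern, which is how a constraint of this kind filters out event combinations that are too far apart in time to be related.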

Downloads 1490
6985 A 3.125Gb/s Clock and Data Recovery Circuit Using 1/4-Rate Technique

Authors: Il-Do Jeong, Hang-Geun Jeong

Abstract:

This paper describes the design and fabrication of a clock and data recovery (CDR) circuit. We propose a new CDR based on a 1/4-rate frequency detector (QRFD). The proposed frequency detector helps reduce the VCO frequency and is thus advantageous for high-speed applications. It can achieve low-jitter operation and extend the pull-in range without using a reference clock. The proposed CDR was implemented using a 1/4-rate bang-bang type phase detector (PD) and a ring voltage controlled oscillator (VCO). The CDR circuit has been fabricated in a standard 0.18 μm CMOS technology. It occupies an active area of 1 × 1 mm² and consumes 90 mW from a single 1.8 V supply.

Keywords: Clock and data recovery, 1/4-rate frequency detector, 1/4-rate phase detector.
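
The benefit of the 1/4-rate technique can be made concrete with a little arithmetic. The sketch below assumes that the VCO of a 1/4-rate architecture runs at exactly one quarter of the bit rate given in the title (3.125 Gb/s); the function names are hypothetical, and the supply current is simply derived from the power and voltage figures quoted in the abstract.

```python
def vco_frequency_hz(bit_rate_bps, rate_divisor=4):
    """VCO frequency of a 1/N-rate CDR, assuming it runs at bit_rate / N."""
    return bit_rate_bps / rate_divisor


def supply_current_a(power_w, supply_v):
    """Average supply current implied by a power figure and a supply voltage."""
    return power_w / supply_v


vco_hz = vco_frequency_hz(3.125e9)         # quarter-rate clock for 3.125 Gb/s
current_a = supply_current_a(90e-3, 1.8)   # 90 mW from a 1.8 V supply
print(f"VCO frequency: {vco_hz / 1e6:.2f} MHz")
print(f"Supply current: {current_a * 1e3:.1f} mA")
```

Under this assumption the oscillator only has to run at 781.25 MHz rather than 3.125 GHz, which is why a fractional-rate detector eases the design of a ring VCO.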

Downloads 2931
6984 Very High Speed Data Driven Dynamic NAND Gate at 22nm High K Metal Gate Strained Silicon Technology Node

Authors: Shobha Sharma, Amita Dev

Abstract:

Data driven dynamic logic is a high speed dynamic circuit style with low area. The clock of the dynamic circuit is removed, and data instead of the clock drives the circuit for precharging. This data driven dynamic NAND gate is given a static forward substrate bias of Vsupply/2, and the substrate bias is also connected to the input data, resulting in a dynamic substrate bias. The dynamic substrate bias gives the shortest propagation delay, with a penalty in power dissipation. Propagation delay is reduced by 77.8% compared to the normal reverse substrate biased data driven dynamic NAND gate. In addition, the propagation delay of the dynamic substrate biased D3nand is reduced by 31.26% compared to the data driven dynamic NAND gate with static forward substrate biasing of Vdd/2. The data driven dynamic NAND gate with dynamic body biasing gives the highest speed with no area penalty and finds its applications where the power penalty is acceptable. A combination of dynamic and static forward body bias can also be used, giving reduced propagation delay compared to a static forward biased circuit with a comparable increase in average power. The simulations were done with the HSPICE simulator using the 22nm High-k metal gate strained Si technology HP models of Arizona State University, USA.

Keywords: Data driven nand gate, dynamic substrate biasing, nand gate, static substrate biasing.
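
The two percentage reductions quoted in the abstract above can be combined to compare the three biasing schemes on one scale. The sketch below is plain arithmetic on the reported figures, assuming both percentages refer to the same gate and load; the derived static-forward figure is implied by those numbers and is not a result reported by the authors. The function name is hypothetical.

```python
def relative_delay(reduction_percent):
    """Delay remaining after a percentage reduction, as a fraction of the baseline."""
    return 1.0 - reduction_percent / 100.0


# Dynamic bias delay relative to the reverse-biased gate (77.8% reduction).
dynamic_vs_reverse = relative_delay(77.8)

# Dynamic bias delay relative to the static forward-biased gate (31.26% reduction).
dynamic_vs_static = relative_delay(31.26)

# Implied delay of the static forward-biased gate relative to the reverse-biased gate.
static_vs_reverse = dynamic_vs_reverse / dynamic_vs_static

print(f"dynamic / reverse: {dynamic_vs_reverse:.3f}")
print(f"static  / reverse: {static_vs_reverse:.3f}")
```

On this reading, static forward biasing alone would already remove roughly two thirds of the reverse-biased delay, with dynamic biasing accounting for the remainder.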

Downloads 1617
6983 Soft Computing based Retrieval System for Medical Applications

Authors: Pardeep Singh, Sanjay Sharma

Abstract:

With increasing data in medical databases, medical data retrieval is growing in popularity. Some of this analysis includes inducing propositional rules from databases using many soft computing techniques and then using these rules in an expert system. Diagnostic rules and information on features are extracted from clinical databases on diseases of congenital anomaly. This paper explains the latest soft computing techniques and some of the adaptive techniques, which encompass an extensive group of methods that have been applied in the medical domain and that are used for the discovery of data dependencies, importance of features, patterns in sample data, and feature space dimensionality reduction. These approaches pave the way for new and interesting avenues of research in medical imaging and represent an important challenge for researchers.

Keywords: CBIR, GA, Rough sets, CBMIR, SVM.

Downloads 1736