Search results for: Data preparation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7731

7071 DIFFER: A Propositionalization approach for Learning from Structured Data

Authors: Thashmee Karunaratne, Henrik Böstrom

Abstract:

Logic-based methods for learning from structured data are limited w.r.t. handling large search spaces, preventing large-sized substructures from being considered by the resulting classifiers. A novel approach to learning from structured data is introduced that employs a structure transformation method, called finger printing, for addressing these limitations. The method, which generates features corresponding to arbitrarily complex substructures, is implemented in a system called DIFFER. The method is demonstrated to perform comparably to an existing state-of-the-art method on some benchmark data sets without requiring restrictions on the search space. Furthermore, learning from the union of features generated by finger printing and the previous method outperforms learning from each individual set of features on all benchmark data sets, demonstrating the benefit of developing complementary, rather than competing, methods for structure classification.

Keywords: Machine learning, Structure classification, Propositionalization.

Downloads: 1222
7070 Improving the Performance of Proxy Server by Using Data Mining Technique

Authors: P. Jomsri

Abstract:

Currently, web usage generates a huge amount of data from a large number of users. In general, a proxy server is a system that supports users' web usage and can be managed using hit rates. This research tries to improve hit rates in a proxy system by applying a data mining technique. The data set was collected from proxy servers in the university, and relationships among several features were investigated. The model is used to predict future website accesses. The association rule technique is applied to find the relations among Date, Time, Main Group web, Sub Group web, and Domain name for the created model. The results showed that this technique can predict web content for the next day; moreover, the future accesses of websites increased from 38.15% to 85.57%. This model can predict web page access, which tends to increase the efficiency of proxy servers as a result. In addition, the performance of internet access will be improved, which helps to reduce network traffic.
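As a rough illustration of the association rule idea mentioned in this abstract, the sketch below computes the support and confidence of one rule over a handful of proxy log records. The records, field values, and function names are invented for illustration; they are not taken from the paper, whose actual mining procedure is not described here.

```python
from typing import Dict, List

# Hypothetical proxy log records. Field names loosely follow the abstract;
# the values are made up for this example.
LOGS: List[Dict[str, str]] = [
    {"Date": "Mon", "Time": "morning", "MainGroup": "news", "Domain": "example.com"},
    {"Date": "Mon", "Time": "morning", "MainGroup": "news", "Domain": "example.com"},
    {"Date": "Mon", "Time": "evening", "MainGroup": "video", "Domain": "example.org"},
    {"Date": "Tue", "Time": "morning", "MainGroup": "news", "Domain": "example.net"},
]


def matches(record: Dict[str, str], items: Dict[str, str]) -> bool:
    """True if the record contains every attribute=value pair in items."""
    return all(record.get(key) == value for key, value in items.items())


def support(logs: List[Dict[str, str]], items: Dict[str, str]) -> float:
    """Fraction of records that contain all the given items."""
    return sum(matches(record, items) for record in logs) / len(logs)


def confidence(logs, antecedent: Dict[str, str], consequent: Dict[str, str]) -> float:
    """Estimate P(consequent | antecedent) from the records."""
    antecedent_support = support(logs, antecedent)
    if antecedent_support == 0:
        return 0.0
    return support(logs, {**antecedent, **consequent}) / antecedent_support


rule_if = {"Date": "Mon", "Time": "morning"}
rule_then = {"Domain": "example.com"}
print(support(LOGS, {**rule_if, **rule_then}))   # 0.5
print(confidence(LOGS, rule_if, rule_then))      # 1.0
```

A rule with high confidence for a given date and time slot is the kind of evidence a proxy could plausibly use to decide which content to keep cached, although how the paper turns rules into caching decisions is not stated in the abstract.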

Keywords: Association rule, proxy server, data mining.

Downloads: 3062
7069 Assessment of Susceptibility of the Poultry Red Mite, Dermanyssus gallinae (Acari: Dermanyssidae) to Some Plant Preparations with Focus on Exposure Time

Authors: Sh. Ranjbar-Bahadori, N. Farhadifar, L. Mohammadyar

Abstract:

Plant preparations from thyme and garlic have been shown to be effective acaricides against the poultry red mite, Dermanyssus gallinae. In a layer house with a history of D. gallinae problems, mites were detected in the monitoring traps for the first time and counted. Then, some rows of the layer house were sprayed twice using a concentration of 0.21 mg/cm² thyme essential oil and 0.07 mg/cm² garlic juice, and a similar row was used as an untreated control group. Red mite traps made of cardboard were used to assess the mite density on days 1 and 7 after treatment and were always removed after 24 h. The collected mites were counted and the efficacy against all mite stages (larvae, nymphs and adults) was calculated. Results showed that on days 1 and 7 after the administration of garlic extract, the efficacy rate was 92.05% and 74.62%, respectively. Moreover, the efficacy rate on days 1 and 7 was 89.4% and 95.37% when treatment was done with thyme essential oil. It is concluded that garlic juice is more effective for controlling D. gallinae in the short term, whereas thyme essential oil has a longer-lasting effect than the garlic preparation.

Keywords: Dermanyssus gallinae, Essential oil, Garlic, Thyme, Efficacy.

Downloads: 3772
7068 Performance Analysis of the Subgroup Method for Collective I/O

Authors: Kwangho Cha, Hyeyoung Cho, Sungho Kim

Abstract:

As many scientific applications require large data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the notable features of parallel I/O and enables application programmers to easily handle large data volumes. In this paper, we measured and analyzed the performance of the original collective I/O and of the subgroup method, a way of using MPI collective I/O effectively. From the experimental results, we found that the subgroup method showed good performance with small data sizes.

Keywords: Collective I/O, MPI, parallel file system.

Downloads: 1575
7067 Statistical Analysis for Overdispersed Medical Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

Many researchers have suggested the use of zero inflated Poisson (ZIP) and zero inflated negative binomial (ZINB) models in modeling overdispersed medical count data with extra variation caused by extra zeros and unobserved heterogeneity. The studies indicate that ZIP and ZINB always provide a better fit than the standard Poisson and negative binomial models in modeling overdispersed medical count data. In this study, we propose the use of Zero Inflated Inverse Trinomial (ZIIT), Zero Inflated Poisson Inverse Gaussian (ZIPIG) and Zero Inflated Strict Arcsine (ZISA) models in modeling overdispersed medical count data. These proposed models are not widely used, especially in the medical field. The results show that these three suggested models can serve as alternative models in modeling overdispersed medical count data. This is supported by the application of these suggested models to a real life medical data set. Inverse trinomial, Poisson inverse Gaussian and strict arcsine are discrete distributions with a cubic variance function of the mean. Therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tails. They are recommended for modeling overdispersed medical count data when ZIP and ZINB are inadequate.
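To make the notion of zero inflation concrete, here is a minimal sketch of the baseline ZIP model only (not the ZIIT, ZIPIG or ZISA models proposed in the paper). It uses the standard ZIP definition, in which an extra point mass `pi` at zero is mixed with a Poisson distribution of rate `lam`; the function names and the parameter values in the example are illustrative assumptions.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(Y = k) for a zero-inflated Poisson with extra-zero probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from the extra point mass or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean_var(lam: float, pi: float):
    """Mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    return mean, var


# Illustrative values, not estimates from any data set.
mean, var = zip_mean_var(lam=2.0, pi=0.3)
print(mean, var)                      # variance exceeds the mean when pi > 0
print(zip_pmf(0, lam=2.0, pi=0.3))    # inflated probability of a zero count
```

Because the variance is the mean multiplied by `1 + pi * lam`, any `pi > 0` yields overdispersion relative to a plain Poisson model, which is the behaviour the abstract takes as its starting point.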

Keywords: Zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit.

Downloads: 3315
7066 Evaluation of Rheological Properties of Apple Mass Based Desserts

Authors: Sigita Boca, Ruta Galoburda, Inta Krasnova, Dalija Seglina, Aivars Aboltins, Imants Skrupskis

Abstract:

The aim of the study was to evaluate the effect of texturizers on the rheological properties of apple mass and desserts made from various raw materials. The apple varieties ‘Antonovka’, ‘Baltais Dzidrais’, and ‘Zarja Alatau’, harvested in Latvia, were used for the experiment. The unpeeled apples were processed in a blender to obtain a homogeneous mass. The apple mass was analyzed fresh and after storage at –18 °C. Both fresh and thawed apple mass samples with added gelatin, xanthan gum, and sodium carboxymethylcellulose were whisked to obtain a dessert. Pectin, pH and soluble dry matter of the product were determined. Apparent viscosity was measured using a rotational viscometer DV–III Ultra. Pectin content in frozen apple mass decreased significantly (p<0.05) compared to the fresh sample. The viscosity of apple desserts immediately after their preparation depends on the physico-chemical properties of the apples and the texturizers used in the production.

Keywords: Apple variety, apparent viscosity, hydrocolloids, pectin, texturizers.

Downloads: 2109
7065 Student Satisfaction Data for Work Based Learners

Authors: Rosie Borup, Hanifa Shah

Abstract:

This paper aims to describe how student satisfaction is measured for work-based learners, as these are non-traditional learners who conduct academic learning in the workplace, whose curricula typically have a high degree of negotiation, and whose motivations are directly related to their employers' needs as well as their own career ambitions. We argue that while increasing WBL participation and the use of SSD are both accepted as being of strategic importance to the HE agenda, the use of WBL SSD is rarely examined; lessons can be learned from the comparison of SSD from a range of WBL programmes, and increased visibility of this type of data will provide insight into ways to improve and develop this type of delivery. The key themes that emerged from the analysis of the interview data were: learners' profiles and needs, employers' drivers, academic staff drivers, organizational approach, tools for collecting data, and visibility of findings. The paper concludes with observations on best practice in the collection, analysis and use of WBL SSD, thus offering recommendations for both academic managers and practitioners.

Keywords: Student satisfaction data, work based learning, employer engagement, NSS.

Downloads: 1493
7064 A Consistency Protocol Multi-Layer for Replicas Management in Large Scale Systems

Authors: Ghalem Belalem, Yahya Slimani

Abstract:

Large scale systems such as computational Grids are distributed computing infrastructures that can provide globally available network resources. The evolution of information processing systems in Data Grids is characterized by a strong decentralization of data in several fields, with the objective of ensuring the availability and reliability of the data in order to provide fault tolerance and scalability, which is possible only with the use of replication techniques. Unfortunately, the use of these techniques has a high cost, because it is necessary to maintain consistency between the distributed data. Nevertheless, agreeing to live with certain imperfections can improve the performance of the system by improving competition. In this paper, we propose a multi-layer protocol combining the pessimistic and optimistic approaches, conceived for data consistency maintenance in large scale systems. Our approach is based on a hierarchical representation model with tree layers and has a twofold objective: first, to reduce response times compared to a completely pessimistic approach, and second, to improve the quality of service compared to an optimistic approach.

Keywords: Data Grid, replication, consistency, optimistic approach, pessimistic approach.

Downloads: 1575
7063 Assessing and Evaluating the Course Outcomes of Electrical Circuit Course for Bachelor of Science in Electrical and Electronic Engineering Program

Authors: Muhibul Haque Bhuyan, Sher Shermin Azmiri Khan

Abstract:

At present, it is an imperative and stimulating task to grow the concepts and skills of undergraduate students in any course. Educators must build up students' higher-order complex and critical thinking abilities, but many of them find it difficult to assess and evaluate these abilities in students who take their courses during undergraduate studies. In this research work, a simple assessment and evaluation process for the electrical circuit course of the undergraduate Electrical and Electronic Engineering (EEE) program is reported using the Outcome-Based Education (OBE) approach. The methodology of the work, course content design, preparation of course outcomes (COs) and their mapping to program outcomes (POs), question setting following Bloom's taxonomy, the assessment strategy for the students, CO and PO evaluation records, statistics, and charts are reported for a student cohort of the electrical circuit course taken in the Spring 2019 semester at the EEE Department of Southeast University (SEU). It is found that the benchmark fixed by the course instructor has been achieved by the students of that course through CO assessment and evaluation. Recommendations of the course teacher for further quality enhancement based on CO achievement are also presented.

Keywords: OBE, COs, POs, assessment and evaluation, electrical circuit course.

Downloads: 629
7062 Analysis of a Population of Diabetic Patients Databases with Classifiers

Authors: Murat Koklu, Yavuz Unal

Abstract:

Data mining can be described as a technique for extracting information from data. It is the process of obtaining hidden information and then turning it into qualified knowledge by statistical and artificial intelligence techniques. One of its application areas is medicine, where it is used to build decision support systems for diagnosis by deriving meaningful information from given medical data. In this study, a decision support system for the diagnosis of illness is presented that makes use of data mining and three different artificial intelligence classifier algorithms, namely Multilayer Perceptron, Naive Bayes Classifier and J.48. The Pima Indian dataset of the UCI Machine Learning Repository was used. This dataset includes urinary and blood test results of 768 patients. These test results consist of 8 different features. The classification results obtained were compared with previous studies, and suggestions for future studies are presented.

Keywords: Artificial Intelligence, Classifiers, Data Mining, Diabetic Patients.

Downloads: 5431
7061 Discovery of Time Series Event Patterns based on Time Constraints from Textual Data

Authors: Shigeaki Sakurai, Ken Ueno, Ryohei Orihara

Abstract:

This paper proposes a method that discovers time series event patterns from textual data with time information. The patterns are composed of sequences of events and each event is extracted from the textual data, where an event is characteristic content included in the textual data such as a company name, an action, and an impression of a customer. The method introduces 7 types of time constraints based on the analysis of the textual data. The method also evaluates these constraints when the frequency of a time series event pattern is calculated. We can flexibly define the time constraints for interesting combinations of events and can discover valid time series event patterns which satisfy these conditions. The paper applies the method to daily business reports collected by a sales force automation system and verifies its effectiveness through numerical experiments.
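The abstract does not list its 7 types of time constraints, so the sketch below assumes just one plausible kind, a maximum gap in days between consecutive events, to show how a constraint can be checked while a pattern's frequency is counted. The event labels, data, and function names are all hypothetical.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, str]  # (day number, event label), sorted by day


def occurs(sequence: List[Event], pattern: List[str], max_gap: int) -> bool:
    """True if the pattern's events appear in order in the sequence, each
    at most max_gap days after the previously matched event."""

    def search(start: int, pat_idx: int, prev_day: Optional[int]) -> bool:
        if pat_idx == len(pattern):
            return True
        for i in range(start, len(sequence)):
            day, label = sequence[i]
            if label != pattern[pat_idx]:
                continue
            if prev_day is not None and day - prev_day > max_gap:
                break  # events are sorted, so later ones are even further away
            if search(i + 1, pat_idx + 1, day):
                return True
        return False

    return search(0, 0, None)


def frequency(sequences: List[List[Event]], pattern: List[str], max_gap: int) -> float:
    """Fraction of sequences that contain the pattern under the gap constraint."""
    return sum(occurs(s, pattern, max_gap) for s in sequences) / len(sequences)


# Made-up event sequences, e.g. one per customer.
reports = [
    [(1, "complaint"), (2, "visit"), (3, "order")],
    [(1, "complaint"), (10, "visit")],
    [(1, "visit"), (2, "complaint")],
]
print(frequency(reports, ["complaint", "visit"], max_gap=3))
print(frequency(reports, ["complaint", "visit"], max_gap=10))
```

Loosening the constraint can only keep or raise the frequency of a pattern, which is why the choice of constraint matters when deciding which patterns count as frequent.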

Keywords: Text mining, sequential mining, time constraints, daily business reports.

Downloads: 1488
7060 A 3.125Gb/s Clock and Data Recovery Circuit Using 1/4-Rate Technique

Authors: Il-Do Jeong, Hang-Geun Jeong

Abstract:

This paper describes the design and fabrication of a clock and data recovery (CDR) circuit. We propose a new CDR circuit based on a 1/4-rate frequency detector (QRFD). The proposed frequency detector helps reduce the VCO frequency and is thus advantageous for high-speed applications. It can achieve low-jitter operation and extend the pull-in range without using a reference clock. The proposed CDR was implemented using a 1/4-rate bang-bang type phase detector (PD) and a ring voltage controlled oscillator (VCO). The CDR circuit has been fabricated in a standard 0.18 CMOS technology. It occupies an active area of 1 x 1 and consumes 90 mW from a single 1.8V supply.

Keywords: Clock and data recovery, 1/4-rate frequency detector, 1/4-rate phase detector.

Downloads: 2927
7059 Very High Speed Data Driven Dynamic NAND Gate at 22nm High K Metal Gate Strained Silicon Technology Node

Authors: Shobha Sharma, Amita Dev

Abstract:

Data driven dynamic logic is a high-speed dynamic circuit style with low area. The clock of the dynamic circuit is removed, and data instead of the clock drives the circuit for precharging. This data driven dynamic NAND gate is given a static forward substrate bias of Vsupply/2, and the substrate bias is also connected to the input data, resulting in a dynamic substrate bias. The dynamic substrate bias gives the shortest propagation delay, with a penalty in power dissipation. Propagation delay is reduced by 77.8% compared to the normal reverse substrate biased data driven dynamic NAND gate. The propagation delay of the dynamic substrate biased D3nand is also reduced by 31.26% compared to the data driven dynamic NAND gate with static forward substrate biasing of Vdd/2. The data driven dynamic NAND gate with dynamic body biasing gives the highest speed with no area penalty and finds applications where the power penalty is acceptable. A combination of dynamic and static forward body bias can also be used, with reduced propagation delay compared to the static forward biased circuit and a comparable increase in average power. The simulations were done on the HSPICE simulator with the 22nm High-k metal gate strained Si technology HP models of Arizona State University, USA.

Keywords: Data driven nand gate, dynamic substrate biasing, nand gate, static substrate biasing.

Downloads: 1616
7058 Soft Computing based Retrieval System for Medical Applications

Authors: Pardeep Singh, Sanjay Sharma

Abstract:

With increasing data in medical databases, medical data retrieval is growing in popularity. Some of this analysis includes inducing propositional rules from databases using many soft techniques, and then using these rules in an expert system. Diagnostic rules and information on features are extracted from clinical databases on diseases of congenital anomaly. This paper explains the latest soft computing techniques and some of the adaptive techniques, which encompass an extensive group of methods that have been applied in the medical domain and that are used for the discovery of data dependencies, importance of features, patterns in sample data, and feature space dimensionality reduction. These approaches pave the way for new and interesting avenues of research in medical imaging and represent an important challenge for researchers.

Keywords: CBIR, GA, Rough sets, CBMIR, SVM.

Downloads: 1732
7057 Fabrication and Characterization of CdS Nanoparticles Annealed by using Different Radiations

Authors: Aneeqa Sabah, Saadat Anwar Siddiqi, Salamat Ali

Abstract:

The systematic manipulation of the shapes and sizes of inorganic compounds greatly benefits various application fields, including optics, magnetics, electronics, catalysis and medicine. However, shape control has been much more difficult to achieve. Hence, the exploration of novel methods for the preparation of differently shaped nanoparticles is a challenging research area. Nanostructures of the II-VI semiconductor cadmium sulphide (CdS) with different morphologies (such as acicular-like, mesoporous and spherical shapes) and crystallite sizes varying from 11 to 16 nm were successfully synthesized by chemical aqueous precipitation of Cd²⁺ ions with homogeneously released S²⁻ ions from the decomposition of cadmium sulphate (CdSO4) and thioacetamide (CH3CSNH2) by annealing at different radiations (microwave, ultrasonic and sunlight) with matter. Systematic research has been done on the various factors affecting the controlled growth rate of CdS nanoparticles. The obtained nanomaterials have been characterized by X-ray Diffraction (XRD), Fourier Transform Infrared Spectroscopy (FTIR), Thermogravimetric (DSC-TGA) analysis and Scanning Electron Microscopy (SEM). The results indicate that particle size increases with increasing reaction time, whereas grain size decreases with increasing molar ratios.

Keywords: CdS nanoparticles, Morphology, Oxidation, Radiations

Downloads: 2984
7056 Secure Power Systems Against Malicious Cyber-Physical Data Attacks: Protection and Identification

Authors: Morteza Talebi, Jianan Wang, Zhihua Qu

Abstract:

The security of power systems against malicious cyber-physical data attacks has become an important issue. The adversary always attempts to manipulate the information structure of the power system and inject malicious data to deviate state variables while evading the existing detection techniques based on the residual test. The solutions proposed in the literature are capable of immunizing the power system against false data injection, but they might be too costly and physically impractical in the expansive distribution network. To this end, we define an algebraic condition for a trustworthy power system to evade malicious data injection. The proposed protection scheme secures the power system by deterministically reconfiguring the information structure and the corresponding residual test. More importantly, it does not require any physical effort at either the microgrid or the network level. An identification scheme for finding the meters being attacked is proposed as well. Finally, the well-known IEEE 30-bus system is adopted to demonstrate the effectiveness of the proposed schemes.

Keywords: Algebraic Criterion, Malicious Cyber-Physical Data Injection, Protection and Identification, Trustworthy Power System.

Downloads: 1993
7055 NSBS: Design of a Network Storage Backup System

Authors: Xinyan Zhang, Zhipeng Tan, Shan Fan

Abstract:

The first layer of defense against data loss is backup data. This paper implements an agent-based network backup system built on a tripartite construction of backup agent, server-storage agent and server-backup agent; snapshots and a hierarchical index are used in the NSBS. It separates control commands from data flow and balances the system load, thereby improving the efficiency of system backup and recovery. The test results show that the agent-based network backup system can effectively improve task-based concurrency, allocate network bandwidth reasonably, incur a smaller backup performance loss, and improve data recovery efficiency by 20%.

Keywords: Agent, network backup system, three architecture model, NSBS.

Downloads: 2232
7054 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis

Authors: Saleem Z. Ramadan

Abstract:

In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.
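One common way to combine a non-parametric estimate with linear regression for Weibull data is median rank regression, sketched below. This is an assumption about the general kind of technique meant, not the paper's actual model: Bernard's median rank approximation gives a non-parametric estimate of the failure probability, and a least-squares line through the linearized Weibull CDF gives the shape and scale. The failure times in the example are invented.

```python
import math
from typing import List, Tuple


def weibull_fit(failure_times: List[float]) -> Tuple[float, float]:
    """Estimate Weibull (shape, scale) from complete failure data using
    median ranks and ordinary least squares."""
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's median rank approximation
        xs.append(math.log(t))
        # Linearized CDF: ln(-ln(1 - F)) = shape * ln(t) - shape * ln(scale)
        ys.append(math.log(-math.log(1.0 - f)))
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    shape = sxy / sxx
    intercept = y_bar - shape * x_bar
    scale = math.exp(-intercept / shape)
    return shape, scale


# Hypothetical failure times in hours for a small sample.
shape, scale = weibull_fit([410.0, 980.0, 1500.0, 2300.0, 3600.0])
print(shape, scale)
```

A fitted shape close to 1 corresponds to a roughly constant failure rate, which is how the useful life region of a bathtub curve is usually characterized.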

Keywords: Masking, Bathtub model, reliability, non-parametric analysis, useful life.

Downloads: 1843
7053 Novel Security Strategy for Real Time Digital Videos

Authors: Prakash Devale, R. S. Prasad, Amol Dhumane, Pritesh Patil

Abstract:

Nowadays, video data embedding is a very challenging and interesting approach to keeping real-time video data secure. We can implement and use this technique with high-level applications. The rate-distortion of any image is not fixed, because the gains provided by accurate image frame segmentation are balanced by the inefficiency of coding objects of arbitrary shape, with many factors, such as losses, that depend on both the coding scheme and the object structure. By using a rate controller in association with the encoder, one can dynamically adjust the target bitrate. This paper discusses how to keep videos secure by mixing signature data into the original video with negligible distortion, and how to keep the steganographic video as close as possible to the quality of the original video. We propose a method for embedding the signature data into separate video frames using the block Discrete Cosine Transform. These frames are then encoded using a real-time H.264 encoding scheme. After processing, recovery of the original video and the signature data at the receiver end is proposed.

Keywords: Data Hiding, Digital Watermarking, video coding H.264, Rate Control, Block DCT.

Downloads: 1561
7052 Evaluation Rabbit Serum of the Immunodominant Proteins of Mycobacterium Avium Paratuberculosis Extracts

Authors: M. Hashemi, R. Madani, N. Razmi

Abstract:

M. paratuberculosis is a slow-growing, mycobactin-dependent mycobacterial species known to be the causative agent of Johne's disease (JD) in all species of domestic ruminants worldwide. JD is characterized by gradual weight loss and decreased milk production. Excretion of the organism may occur for prolonged periods (1 to 2.5 years) before the onset of clinical disease. In recent years, researchers have focused on identifying a specific antigen of MAP for use in diagnostic tests and in the preparation of an effective vaccine. In this paper, to produce polyclonal antibody against proteins of the Mycobacterium avium paratuberculosis cell wall, a rabbit was immunized with antigen over a certain time period. After immunization, the rabbit was bled to produce enriched serum. Antibodies were purified by ion exchange chromatography. For exact measurement of the interaction, a western blotting test was used; this study demonstrated that sharp bands appear on nitrocellulose paper, and the specific bands were at 50 and 150 kD molecular weight. These indicate immunodominant proteins.

Keywords: Paratuberculosis, Immunodominant, Western blotting, Ion exchange chromatography.

Downloads: 1757
7051 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the amount of document data has been increasing with the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract mutually independent topics from large document data such as newspaper data. ITA is a method for extracting independent topics from document data by using Independent Component Analysis. A topic extracted by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence for a topic are as follows: Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". The topic name "SPORTS" has to be attached by the user, because ITA cannot name topics. Therefore, in this research, we propose a method that uses a web search engine to obtain names that are easy for people to understand for the topics given as sets of words by independent topic analysis. In particular, we search for a set of topical words, and the title of the homepage in the search result is taken as the topic name. We also apply the proposed method to some data and verify its effectiveness.

Keywords: Independent topic analysis, topic extraction, topic naming, web search engine.

Downloads: 500
7050 An Anomaly Detection Approach to Detect Unexpected Faults in Recordings from Test Drives

Authors: Andreas Theissler, Ian Dear

Abstract:

In the automotive industry, test drives are conducted during the development of new vehicle models or as a part of quality assurance of series-production vehicles. The communication on the in-vehicle network, data from external sensors, and internal data from the electronic control units are recorded by automotive data loggers during the test drives. The recordings are used for fault analysis. Since the resulting data volume is tremendous, manually analysing each recording in great detail is not feasible. This paper proposes to use machine learning to support domain experts by preventing them from contemplating irrelevant data and instead pointing them to the relevant parts in the recordings. The underlying idea is to learn the normal behaviour from available recordings, i.e. a training set, and then to autonomously detect unexpected deviations and report them as anomalies. The one-class support vector machine “support vector data description” is utilised to calculate distances of feature vectors. SVDDSUBSEQ is proposed as a novel approach that allows subsequences in multivariate time series data to be classified. The approach makes it possible to detect unexpected faults without modelling effort, as is shown with experimental results on recordings from test drives.

Keywords: Anomaly detection, fault detection, test drive analysis, machine learning.

Downloads: 2477
7049 Decision Tree for Competing Risks Survival Probability in Breast Cancer Study

Authors: N. A. Ibrahim, A. Kudus, I. Daud, M. R. Abu Bakar

Abstract:

Competing risks survival data, which comprise more than one type of event, have been used in many applications, one of which is clinical study (e.g. breast cancer study). The decision tree method can be extended to competing risks survival data by modifying the split function so as to accommodate two or more risks which might be dependent on each other. Recently, researchers have constructed some decision trees for recurrent survival time data using frailty and marginal modelling. We further extended the method for the case of competing risks. In this paper, we developed the decision tree method for competing risks survival time data based on proportional hazards for the subdistribution of competing risks. In particular, we grow a tree by using the deviance statistic. An application to breast cancer data is presented. Finally, to investigate the performance of the proposed method, simulation studies on the identification of the true group of observations were executed.

Keywords: Competing risks, Decision tree, Simulation, Subdistribution Proportional Hazard.

Downloads: 2374
7048 Spatially Random Sampling for Retail Food Risk Factors Study

Authors: Guilan Huang

Abstract:

In 2013 and 2014, the U.S. Food and Drug Administration (FDA) collected data from selected fast food restaurants and full service restaurants to track changes in the occurrence of foodborne illness risk factors. This paper discusses how we customized a spatial random sampling method by considering the financial position and availability of FDA resources, and how we enriched restaurant data with location. Location information of restaurants provides an opportunity for quantitatively determining random sampling within non-government units (e.g., 240 kilometers around each data collector). Spatial analysis could also optimize data collectors' work plans and resource allocation. A spatial analytic and processing platform helped us handle the spatial random sampling challenges. Our method fits FDA's ability to pinpoint features of foodservice establishments, and reduced both the time and expense of data collection.

Keywords: Geospatial technology, restaurant, retail food risk factors study, spatial random sampling.

Downloads: 1465
7047 Preparation and Investigation of Photocatalytic Properties of ZnO Nanocrystals: Effect of Operational Parameters and Kinetic Study

Authors: N. Daneshvar, S. Aber, M. S. Seyed Dorraji, A. R. Khataee, M. H. Rasoulifard

Abstract:

ZnO nanocrystals with a mean diameter of 14 nm were prepared by a precipitation method and examined as a photocatalyst for the UV-induced degradation of the insecticide diazinon, used as a representative organic pollutant, in aqueous solution. The effects of various parameters, such as illumination time, amount of photocatalyst, initial pH and initial insecticide concentration, on the photocatalytic degradation of diazinon were investigated to find the desired conditions. The desired parameters were also tested for the treatment of real water containing the insecticide. The photodegradation efficiency of diazinon was compared between commercial and prepared ZnO nanocrystals. The results indicated that the UV/ZnO process using the prepared nanocrystalline ZnO offered better electrical energy efficiency and quantum yield than commercial ZnO. On the basis of the Langmuir-Hinshelwood mechanism, the present study illustrated a pseudo-first-order kinetic model with a surface reaction rate constant of 0.209 mg l⁻¹ min⁻¹ and an adsorption equilibrium constant of 0.124 l mg⁻¹.
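
To make the reported constants concrete, the sketch below evaluates the standard Langmuir-Hinshelwood rate expression, r = k·K·C / (1 + K·C), using the two values quoted in the abstract. The function names are hypothetical. The low-concentration limit used for the apparent first-order constant is a common textbook approximation, not necessarily the fitting procedure the authors used.

```python
import math

K_R = 0.209    # surface reaction rate constant, mg l^-1 min^-1 (from the abstract)
K_ADS = 0.124  # adsorption equilibrium constant, l mg^-1 (from the abstract)


def lh_rate(conc_mg_per_l):
    """Langmuir-Hinshelwood rate r = k_r*K*C / (1 + K*C), in mg l^-1 min^-1."""
    return K_R * K_ADS * conc_mg_per_l / (1.0 + K_ADS * conc_mg_per_l)


def apparent_first_order_constant():
    """Low-concentration limit (K*C << 1), where r ~ (k_r*K)*C; units min^-1."""
    return K_R * K_ADS


def pseudo_first_order_conc(c0_mg_per_l, t_min):
    """C(t) = C0 * exp(-k_app * t) under the pseudo-first-order approximation."""
    return c0_mg_per_l * math.exp(-apparent_first_order_constant() * t_min)
```

With these constants the low-concentration apparent rate constant is k·K ≈ 0.0259 min⁻¹, and the rate saturates toward k = 0.209 mg l⁻¹ min⁻¹ as the concentration grows.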

Keywords: Zinc oxide nanopowder, Electricity consumption, Quantum yield, Nanoparticles, Photodegradation, Kinetic model, Insecticide.

Downloads: 3568
7046 Anisotropic Total Fractional Order Variation Model in Seismic Data Denoising

Authors: Jianwei Ma, Diriba Gemechu

Abstract:

In seismic data processing, attenuation of random noise is a basic step for improving data quality before seismic data are used further in exploration and development in the oil and gas industry. The signal-to-noise ratio also strongly determines the quality of seismic data. This factor affects the reliability as well as the accuracy of the seismic signal during interpretation for different purposes. To use seismic data for further application and interpretation, we need to improve the signal-to-noise ratio while attenuating random noise effectively. To improve the signal-to-noise ratio and attenuate seismic random noise while preserving important features and information about seismic signals, we introduce an anisotropic total fractional order denoising algorithm. The anisotropic total fractional order variation model, defined in the space of fractional order bounded variation, is proposed as a regularization in seismic denoising. The split Bregman algorithm is employed to solve the minimization problem of the anisotropic total fractional order variation model, and the corresponding denoising algorithm for the proposed method is derived. We test the effectiveness of the proposed method on synthetic and real seismic data sets, and the denoised results are compared with F-X deconvolution and the non-local means denoising algorithm.
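
The abstract does not say how the fractional order derivative is discretized. One common choice, assumed here purely for illustration, is the Grünwald-Letnikov difference, whose weights follow a simple recursion. The sketch below (with hypothetical function names) computes those weights and applies them to a 1-D trace. It covers only the difference operator that such a regularizer is built on, not the split Bregman solver.

```python
def gl_coefficients(alpha, n_terms):
    """Grünwald-Letnikov weights w_k = (-1)^k * binom(alpha, k) for k = 0..n_terms-1.

    Computed with the recursion w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k).
    """
    coeffs = [1.0]
    for k in range(1, n_terms):
        coeffs.append(coeffs[-1] * (1.0 - (alpha + 1.0) / k))
    return coeffs


def fractional_difference(signal, alpha):
    """Backward fractional difference (D^alpha u)[i] = sum_k w_k * u[i - k].

    Samples before the start of the signal are treated as zero (an assumption).
    """
    n = len(signal)
    w = gl_coefficients(alpha, n)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(i + 1):
            acc += w[k] * signal[i - k]
        out.append(acc)
    return out
```

For alpha = 1 the weights reduce to (1, -1, 0, ...), so the operator becomes the ordinary first difference used in classical total variation. Non-integer alpha gives a longer, slowly decaying stencil.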

Keywords: Anisotropic total fractional order variation, fractional order bounded variation, seismic random noise attenuation, Split Bregman Algorithm.

Downloads: 1013
7045 Improving Academic Performance Prediction using Voting Technique in Data Mining

Authors: Ikmal Hisyam Mohamad Paris, Lilly Suriani Affendey, Norwati Mustapha

Abstract:

In this paper, we compare the accuracy of data mining methods for classifying students in order to predict a student's class grade. These predictions are useful for identifying weak students and assisting management in taking remedial measures at early stages, so as to produce excellent graduates who will graduate with at least a second class upper. First, we examine the accuracy of single classifiers on our data set and choose the best one; we then combine it with a weak classifier in an ensemble to produce a simple voting method. Our results show that combining different classifiers outperformed the single classifiers in predicting student performance.
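
A simple voting ensemble of the kind mentioned above can be sketched as follows. This is a generic unweighted majority vote over already-computed predictions, not the authors' exact scheme. The function name, the input layout, and the tie-breaking rule (prefer the label of the earliest-listed classifier) are assumptions for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by unweighted majority voting.

    predictions[c][i] is classifier c's predicted label for student i.
    Ties go to the tied label voted for by the earliest-listed classifier,
    so listing the strongest single classifier first lets it break ties.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        votes = [clf_preds[i] for clf_preds in predictions]
        counts = Counter(votes)
        top = max(counts.values())
        # First vote, in classifier order, whose label reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined
```

Placing the best single classifier first means that when only two classifiers are combined, as with one strong and one weak model, disagreements are resolved in favor of the stronger one.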

Keywords: Classification, Data Mining, Prediction, Combination of Multiple Classifiers.

Downloads: 2754
7044 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling and spatial analysis. A laser scanning system generates an irregularly spaced three-dimensional cloud of points. Raw ALS data consist mainly of ground points (which represent the bare earth) and non-ground points (which represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a difficult and challenging task, as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter uses a weight function to allocate a weight to each point of the data. Furthermore, unlike most methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the DTMs generated from the filtered data are accurate (when compared against reference terrain data) and that the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.
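
For readers unfamiliar with the accuracy measure quoted above, the sketch below shows how an RMSE between DTM elevations and reference elevations is typically computed. The function name and the assumption that both inputs are elevations in metres sampled at the same locations are illustrative. The abstract does not describe the evaluation setup in this detail.

```python
import math


def rmse(dtm_elevations, reference_elevations):
    """Root mean square error between two equal-length sequences of elevations."""
    if len(dtm_elevations) != len(reference_elevations) or not dtm_elevations:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(z - z_ref) ** 2 for z, z_ref in zip(dtm_elevations, reference_elevations)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the errors are squared before averaging, a few large elevation errors (for example, canopy returns left in the DTM) raise the RMSE more than many small ones.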

Keywords: Airborne laser scanning, digital terrain models, filtering, forested areas.

Downloads: 718
7043 Blockchain for IoT Security and Privacy in Healthcare Sector

Authors: Umair Shafique, Hafiz Usman Zia, Fiaz Majeed, Samina Naz, Javeria Ahmed, Maleeha Zainab

Abstract:

The Internet of Things (IoT) has been a hot topic for the last couple of years. This innovative technology has shown promising progress in various areas, and the world has witnessed exponential growth in multiple application domains. Researchers are working to investigate its capabilities and harness its true potential. At the same time, however, IoT networks open up a new dimension of vulnerability and physical threats to data integrity, privacy, and confidentiality. This is due to centralized control, a data-silo approach to handling information, and a lack of standardization in IoT networks. Blockchain is a new technology that involves creating secure distributed ledgers to store and communicate data. Its benefits include resiliency, integrity, anonymity, decentralization, and autonomous control. The potential for blockchain technology to provide the key to managing and controlling IoT has created a new wave of excitement around the idea of putting that data back into the hands of the end-users. In this manuscript, we propose a model that combines blockchain and IoT networks to address potential security and privacy issues in the healthcare domain, and describe how various stakeholders will interact with the system.
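
The integrity property attributed to blockchain ledgers can be illustrated with a minimal hash chain. The sketch below is not the proposed model: it has no distribution, consensus, or access control. The block layout, function names, and all-zero genesis hash are assumptions chosen for the example. It only shows why altering a stored record becomes detectable.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block (assumption)


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a payload (e.g. a device reading) linked to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })
    return chain


def is_valid(chain):
    """Recompute every hash and link; any edited payload breaks the chain."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Each block commits to its predecessor's hash, so changing one reading invalidates that block and every later link unless all subsequent hashes are recomputed.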

Keywords: Internet of Things, IoT, blockchain, data integrity, authentication, data privacy.

Downloads: 409
7042 Multidimensional Performance Management

Authors: David Wiese

Abstract:

In order to maximize the efficiency of an information management platform and to assist in decision making, the collection, storage and analysis of performance-relevant data have become fundamentally important. This paper addresses the merits and drawbacks of the OLAP paradigm for efficiently and hierarchically navigating large volumes of performance measurement data. System managers or database administrators navigate through adequately (re)structured measurement data, aiming to detect performance bottlenecks, identify causes of performance problems, or assess the impact of configuration changes on the system and its representative metrics. Of particular importance is finding the root cause of an imminent problem that threatens the availability and performance of an information system. By leveraging OLAP techniques, in contrast to traditional static reporting, this is supposed to be accomplished within a moderate amount of time and with little processing complexity. It is shown how OLAP techniques can help improve the understandability and manageability of measurement data and, hence, improve the whole performance analysis process.
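
Hierarchical navigation of measurement data rests on roll-up and drill-down: aggregating a measure over coarser or finer combinations of dimensions. The sketch below shows that operation on in-memory rows. The function name and the dimension and measure names used with it (`host`, `database`, `wait_ms`) are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def roll_up(facts, dims, measure):
    """Sum a measure over the given dimensions.

    facts: list of dicts (one per measurement); dims: dimension names to keep.
    Fewer dimensions give a coarser roll-up; more dimensions give a drill-down.
    Returns a dict mapping tuples of dimension values to the summed measure.
    """
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)
```

Calling `roll_up(facts, ["host"], "wait_ms")` gives per-host totals, and adding `"database"` to the dimension list drills down into the host that stands out, which is the navigation pattern used to narrow in on a bottleneck.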

Keywords: Data Warehousing, OLAP, Multidimensional Navigation, Performance Diagnosis, Performance Management, Performance Tuning.

Downloads: 2135