Search results for: linked open data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8190

7740 Principal Component Analysis using Singular Value Decomposition of Microarray Data

Authors: Dong Hoon Lim

Abstract:

A series of microarray experiments produces observations of differential expression for thousands of genes across multiple conditions. Principal component analysis (PCA) has been widely used in multivariate data analysis to reduce the dimensionality of the data, which simplifies subsequent analysis and allows the data to be summarized parsimoniously. PCA, which can be implemented via a singular value decomposition (SVD), is useful for the analysis of microarray data. To apply PCA using SVD, we use the DNA microarray data for the small round blue cell tumors (SRBCT) of childhood from Khan et al. (2001). To decide the number of components that account for a sufficient amount of information, we draw a scree plot. The biplot, a graphic display associated with PCA, reveals important features that show the relationships between variables and the relationships of variables with observations.
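As a rough illustration of the PCA-via-SVD step mentioned in this abstract, the following minimal sketch centers a data matrix, takes its SVD, and computes the explained-variance proportions that a scree plot would display. It assumes NumPy is available. The function name `pca_svd` and the synthetic random matrix are illustrative assumptions, not the authors' code or the SRBCT data.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA of a (samples x variables) matrix via SVD (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # proportions plotted in a scree plot
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates on the PCs
    components = Vt[:n_components]               # principal directions (loadings)
    return scores, components, explained

# Synthetic stand-in for an expression matrix: 20 samples, 50 genes (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
scores, components, explained = pca_svd(X, 2)
```

The scores equal the centered data projected onto the leading right singular vectors, which is why the SVD route avoids forming the covariance matrix explicitly.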

Keywords: Principal component analysis, singular value decomposition, microarray data, SRBCT

Downloads: 3250
7739 PeliGRIFF: A Parallel DEM-DLM/FD Method for DNS of Particulate Flows with Collisions

Authors: Anthony Wachs, Guillaume Vinay, Gilles Ferrer, Jacques Kouakou, Calin Dan, Laurence Girolami

Abstract:

An original Direct Numerical Simulation (DNS) method to tackle the problem of particulate flows at moderate to high concentration and finite Reynolds number is presented. Our method is built on the framework established by Glowinski and his coworkers [1] in the sense that we use their Distributed Lagrange Multiplier/Fictitious Domain (DLM/FD) formulation and their operator-splitting idea, but it differs in the treatment of particle collisions. The novelty of our contribution lies in replacing the simple artificial repulsive-force-based collision model usually employed in the literature with an efficient Discrete Element Method (DEM) granular solver. The use of our DEM solver enables us to consider particles of arbitrary shape (at least convex) and to account for actual contacts, in the sense that particles actually touch each other, in contrast with the simple repulsive-force-based collision model. We recently upgraded our serial code, GRIFF 1 [2], to full MPI capabilities. Our new code, PeliGRIFF 2, is developed under the framework of the full MPI open source platform PELICANS [3]. The new MPI capabilities of PeliGRIFF open new perspectives in the study of particulate flows and significantly increase the number of particles that can be considered in a full DNS approach: O(100000) in 2D and O(10000) in 3D. Results on the 2D/3D sedimentation/fluidization of isometric polygonal/polyhedral particles with collisions are presented.

Keywords: Particulate flow, distributed Lagrange multiplier/fictitious domain method, discrete element method, polygonal shape, sedimentation, distributed computing, MPI

Downloads: 2126
7738 Clustering Mixed Data Using Non-normal Regression Tree for Process Monitoring

Authors: Youngji Yoo, Cheong-Sool Park, Jun Seok Kim, Young-Hak Lee, Sung-Shick Kim, Jun-Geol Baek

Abstract:

In the semiconductor manufacturing process, large amounts of data are collected from various sensors of multiple facilities. The collected sensor data have several different characteristics due to variables such as product types, former processes and recipes. In general, Statistical Quality Control (SQC) methods assume the normality of the data to detect out-of-control states of processes. Because the collected data have different characteristics, using them directly as SQC inputs increases the variation of the data, requires wide control limits, and degrades the ability to detect out-of-control states. Therefore, it is necessary to separate similar data groups from mixed data for more accurate process control. In this paper, we propose a regression tree that uses a split algorithm based on the Pearson distribution to handle non-normal distributions with a parametric method. The regression tree finds similar properties of data from different variables. Experiments using real semiconductor manufacturing process data show improved fault detection ability.

Keywords: Semiconductor, non-normal mixed process data, clustering, Statistical Quality Control (SQC), regression tree, Pearson distribution system.

Downloads: 1780
7737 Speech Data Compression using Vector Quantization

Authors: H. B. Kekre, Tanuja K. Sarode

Abstract:

Speech data compression mostly relies on transforms, which are lossy algorithms. Such algorithms are tolerable for speech data compression since the loss in quality is not perceived by the human ear. However, vector quantization (VQ) has the potential to give more data compression while maintaining the same quality. In this paper we propose a speech data compression algorithm using the vector quantization technique. We have used the VQ algorithms LBG, KPE and FCG. The results table shows the computational complexity of these three algorithms. Here we introduce a new performance parameter, Average Fractional Change in Speech Sample (AFCSS). Our FCG algorithm gives far better performance than the others in terms of mean absolute error, AFCSS and complexity.
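To make the vector quantization idea concrete, here is a minimal sketch of encoding and decoding with a fixed codebook, plus the mean absolute error between the original and reconstructed samples. It does not implement the LBG, KPE or FCG codebook-generation algorithms, and it does not compute AFCSS, whose formula is not given in the abstract. The function names, the toy samples and the two-entry codebook are all illustrative assumptions.

```python
def nearest(codebook, vec):
    """Index of the codeword with the smallest squared Euclidean distance to vec."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], vec)),
    )

def vq_encode(samples, codebook, dim):
    """Split samples into consecutive blocks of length dim and map each to an index."""
    blocks = [samples[i:i + dim] for i in range(0, len(samples) - dim + 1, dim)]
    return [nearest(codebook, block) for block in blocks]

def vq_decode(indices, codebook):
    """Rebuild an approximate signal by concatenating the indexed codewords."""
    return [x for k in indices for x in codebook[k]]

def mean_absolute_error(original, decoded):
    return sum(abs(x - y) for x, y in zip(original, decoded)) / len(original)

# Toy "speech" samples and a hypothetical two-codeword codebook (assumptions).
samples = [0, 1, 10, 11, 0, 2, 9, 12]
codebook = [(0, 1), (10, 11)]
indices = vq_encode(samples, codebook, dim=2)
decoded = vq_decode(indices, codebook)
mae = mean_absolute_error(samples, decoded)
```

Only the index list needs to be stored or transmitted, which is where the compression comes from; the codebook design determines how much quality is lost.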

Keywords: Vector Quantization, Data Compression, Encoding, Speech coding.

Downloads: 2404
7736 Ontology and CDSS Based Intelligent Health Data Management in Health Care Server

Authors: Eun-Jung Ko, Hyung-Jik Lee, Jeun-Woo Lee

Abstract:

In a ubiquitous healthcare environment, a user's health data are transferred to the remote healthcare server by the user's wearable system or mobile phone. The collected health data should be managed and analyzed in the healthcare server so that the caregiver or user can monitor the user's physiological state. In this paper, we design and develop an intelligent Healthcare Server that manages the user's health data using CDSS and ontology. Our system can analyze the user's health data semantically using CDSS and ontology, and report the analysis results of the user's raw physiological data to the user and caregiver.

Keywords: u-healthcare, CDSS, healthcare server, health data, ontology.

Downloads: 2237
7735 A Genetic Algorithm for Clustering on Image Data

Authors: Qin Ding, Jim Gasvoda

Abstract:

Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP-hard. Genetic algorithms have been used in a wide variety of fields to perform clustering; however, the technique normally has a long running time in terms of input set size. This paper proposes an efficient genetic algorithm for clustering on very large data sets, especially on image data sets. The genetic algorithm uses the most time-efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of which are of large size. The experimental results show that our algorithm outperforms the k-means algorithm in terms of running time as well as the quality of the clustering.

Keywords: Clustering, data mining, genetic algorithm, image data.

Downloads: 2054
7734 Origanum vulgare as a Possible Modulator of Testicular Endocrine Function in Mice

Authors: Eva Tvrdá, Barbora Babečková, Michal Ďuračka, Róbert Kirchner, Július Árvay

Abstract:

This study was designed to assess the in vitro effects of Origanum vulgare L. (oregano) extract on testicular steroidogenesis. We focused on identifying the major biomolecules present in the oregano extract and on investigating its in vitro impact on the secretion of cholesterol, testosterone, dehydroepiandrosterone and androstenedione by murine testicular fragments. The extract was subjected to high performance liquid chromatography (HPLC), which identified cyranosid, daidzein, thymol, rosmarinic and trans-caffeic acid among the predominant biochemical components of oregano. For the in vitro experiments, testicular fragments from 20 sexually mature Institute of Cancer Research (ICR) mice were incubated in the absence (control group) or presence of the oregano extract at selected concentrations (10, 100 and 1000 μg/mL) for 24 h. Cholesterol levels were quantified using photometry and the hormones were assessed by ELISA (Enzyme-Linked Immunosorbent Assay). Our data revealed that the release of cholesterol and androstenedione (but not dehydroepiandrosterone and testosterone) by the testicular fragments was significantly impacted by the oregano extract in a dose-dependent fashion. Supplementation of the extract resulted in a significant decline of cholesterol (P < 0.05 in case of 100 μg/mL; P < 0.01 with respect to 100 μg/mL extract), as well as androstenedione (P < 0.01 with respect to 100 and 1000 μg/mL extract). Our results suggest that the biomolecules present in Origanum vulgare L. could exhibit a dose-dependent impact on the secretion of male steroids, playing a role in the regulation of testicular steroidogenesis.

Keywords: Mice, Origanum vulgare L., steroidogenesis, testes.

Downloads: 1068
7733 A Holistic Framework for Unifying Data Security and Management in Modern Enterprises

Authors: Ashly Joseph

Abstract:

Modern businesses struggle significantly to secure and manage their data properly as the volume and complexity of their data both expand exponentially. Through the use of a multi-layered defense strategy, a centralized management platform, and cutting-edge technologies like AI, this research paper presents a comprehensive framework to integrate data security and management. The constraints of current data protection and management strategies, technological advancements, and the evolving threat landscape are all examined in this article. It suggests best practices for putting into practice integrated data security and governance models, placing an emphasis on ongoing adaptation. The advantages mentioned include a strengthened security posture, simpler procedures, lower costs, and reduced complexity. Additionally, issues including skill shortages, antiquated systems, and cultural obstacles are examined. Security executives and Chief Information Security Officers are given practical advice on how to evaluate, plan, and put into place strong data-centric security and management capabilities. The goal of the paper is to provide a thorough study of the data security and management landscape and to arm contemporary businesses with the knowledge they need to be proactive in protecting their data assets.

Keywords: Data security, security management, cloud computing, cybersecurity, data governance, security architecture, data management.

Downloads: 275
7732 Post Mining- Discovering Valid Rules from Different Sized Data Sources

Authors: R. Nedunchezhian, K. Anbumani

Abstract:

A big organization may have multiple branches spread across different locations. Processing data from these branches becomes a huge task when innumerable transactions take place. Also, branches may be reluctant to forward their data for centralized processing but are ready to pass on their association rules. Local mining may also generate a large number of rules. Further, it is not practically possible for all local data sources to be of the same size. A model is proposed for discovering valid rules from different sized data sources, where the valid rules are high-weighted rules. These rules can be obtained from the high-frequency rules generated from each of the data sources. A data source selection procedure is considered in order to synthesize rules efficiently. Support Equalization is another proposed method, which focuses on eliminating low-frequency rules at the local sites themselves, thus reducing the number of rules by a significant amount.

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

Downloads: 1404
7731 RFID-ready Master Data Management for Reverse Logistics

Authors: Jincheol Han, Hyunsun Ju, Jonghoon Chun

Abstract:

Sharing consistent and correct master data among disparate applications in a reverse-logistics chain has long been recognized as an intricate problem. Although a master data management (MDM) system can surely assume that responsibility, applications that need to co-operate with it must comply with proprietary query interfaces provided by the specific MDM system. In this paper, we present an RFID-ready MDM system which makes master data readily available for any participating applications in a reverse-logistics chain. We propose an RFID-wrapper as a part of our MDM. It acts as a gateway between any data retrieval request and the query interfaces that process it. With the RFID-wrapper, any participating application in a reverse-logistics chain can easily retrieve master data in a way that is analogous to the retrieval of any other RFID-based logistics transactional data.

Keywords: Reverse Logistics, Master Data Management, RFID.

Downloads: 1974
7730 Dynamic Models versus Frailty Models for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent event data are a special type of multivariate survival data. Dynamic and frailty models are among the approaches that deal with this kind of data. The two models are compared using the empirical standard deviation of the standardized martingale residual processes as a way of assessing their fit, based on the Aalen additive regression model. We found that both approaches took heterogeneity into account and produced residual standard deviations close to each other, both in the simulation study and in the real data set.

Keywords: Dynamic, frailty, misspecification, recurrent events.

Downloads: 2350
7729 Fully Automated Methods for the Detection and Segmentation of Mitochondria in Microscopy Images

Authors: Blessing Ojeme, Frederick Quinn, Russell Karls, Shannon Quinn

Abstract:

The detection and segmentation of mitochondria from fluorescence microscopy is crucial for understanding the complex structure of the nervous system. However, the constant fission and fusion of mitochondria and image distortion in the background make the task of detection and segmentation challenging. Although there exist a number of open-source software tools and artificial intelligence (AI) methods designed for analyzing mitochondrial images, the scarcity of people with the combined expertise in medicine and AI required to use these tools poses a challenge to their full adoption and use in clinical settings. Motivated by the advantages of automated methods in terms of good performance, minimum detection time, ease of implementation, and cross-platform compatibility, this study proposes a fully automated framework for the detection and segmentation of mitochondria using both image shape information and descriptive statistics. Using the low-cost, open-source Python language and OpenCV library, the algorithms are implemented in three stages: pre-processing; image binarization; and coarse-to-fine segmentation. The proposed model is validated using the fluorescence mitochondrial dataset. Ground truth labels generated using Labkit were also used to evaluate the performance of our detection and segmentation model using precision, recall and Rand index. The study produces good detection and segmentation results and reports the challenges encountered during the image analysis of mitochondrial morphology from the fluorescence mitochondrial dataset. A discussion on the methods and future perspectives of fully automated frameworks concludes the paper.

Keywords: 2D, Binarization, CLAHE, detection, fluorescence microscopy, mitochondria, segmentation.

Downloads: 473
7728 Issues and Architecture for Supporting Data Warehouse Queries in Web Portals

Authors: Minsoo Lee, Yoon-kyung Lee, Hyejung Yoon, Soo-kyung Song, Sujeong Cheong

Abstract:

Data Warehousing tools have become very popular and currently many of them have moved to Web-based user interfaces to make it easier to access and use the tools. The next step is to enable these tools to be used within a portal framework. The portal framework consists of pages having several small windows that contain individual data warehouse query results. There are several issues that need to be considered when designing the architecture for a portal enabled data warehouse query tool. Some issues need special techniques that can overcome the limitations that are imposed by the nature of data warehouse queries. Issues such as single sign-on, query result caching and sharing, customization, scheduling and authorization need to be considered. This paper discusses such issues and suggests an architecture to support data warehouse queries within Web portal frameworks.

Keywords: Data Warehousing tools, data warehousing queries, web portal frameworks.

Downloads: 2121
7727 Exploring Employee Experiences of Distributed Leadership in Consultancy SMEs

Authors: Mohamed Haffar, Ramdane Djebarni, Russell Evans

Abstract:

Despite a growth in literature on distributed leadership, the majority of studies are centred on large public organisations, particularly within the health and education sectors. The purpose of this study is to fill the gap in the literature by exploring employee experiences of distributed leadership within two commercial consultancy SME businesses in the UK and USA. The aim of the study informed an exploratory method of research to gather qualitative data drawn from semi-structured interviews involving a sample of employees in each organisation. A series of broad, open questions were used to explore the employees’ experiences; evidence of distributed leadership; and extant barriers and practices in each organisation. Whilst some of our findings aligned with patterns and practices in the existing literature, the study importantly discovered some emergent themes that have not been recognised in previous studies. Our investigation identified that whilst distributed leadership was in evidence in both organisations, the interviewees reported that it was sporadic and inconsistent. Moreover, non-client focused projects were reported to be less important, and distributed leadership in them was found to be inconsistent or non-existent.

Keywords: Consultancy, distributed leadership, owner-manager, SME, entrepreneur.

Downloads: 758
7726 Numerical Analysis of the Influence of Airfoil Asymmetry on VAWT Performance

Authors: Marco Raciti Castelli, Giulia Simioni, Ernesto Benini

Abstract:

This paper presents a model for the evaluation of energy performance and aerodynamic forces acting on a three-bladed small vertical axis Darrieus wind turbine depending on blade chord curvature with respect to rotor axis. The adopted survey methodology is based on an analytical code coupled to a solid modeling software, capable of generating the desired blade geometry depending on the blade design geometric parameters, which is linked to a finite volume CFD code for the calculation of rotor performance. After describing and validating the model with experimental data, the results of numerical simulations are proposed on the basis of two different blade profile architectures, which are respectively characterized by a straight chord and by a curved one, having a chord radius equal to rotor external circumference. A CFD campaign of analysis is completed for three blade-candidate airfoil sections, namely the recently developed DU 06-W-200 cambered blade profile, a classical symmetrical NACA 0021 and its derived cambered airfoil, characterized by a curved chord, having a chord radius equal to rotor external circumference. The effects of blade chord curvature on angle of attack and on blade tangential and normal forces are first investigated, and then the overall rotor torque and power are analyzed as a function of blade azimuthal position, achieving a numerical quantification of the influence of blade camber on overall rotor performance.

Keywords: VAWT, NACA 0021, DU 06-W-200, cambered airfoil

Downloads: 2749
7725 Data Mining Using Learning Automata

Authors: M. R. Aghaebrahimi, S. H. Zahiri, M. Amiri

Abstract:

In this paper, a data miner based on learning automata, called LA-miner, is proposed. The LA-miner extracts classification rules from data sets automatically. The proposed algorithm is based on function optimization using learning automata. The experimental results on three benchmarks indicate that the performance of the proposed LA-miner is comparable with (sometimes better than) the Ant-miner (a data mining algorithm based on the Ant Colony optimization algorithm) and CN2 (a well-known data mining algorithm for classification).

Keywords: Data mining, Learning automata, Classification rules, Knowledge discovery.

Downloads: 1935
7724 Secure and Efficient Transmission of Aggregated Data for Mobile Wireless Sensor Networks

Authors: A. Krishna Veni, R. Geetha

Abstract:

Wireless Sensor Networks (WSNs) are suitable for many real-world scenarios. Data aggregation techniques make the retrieval of data efficient. Many data aggregation techniques have been proposed, but most of the existing schemes are neither energy efficient nor secure. Moreover, the existing techniques use the traditional clustering approach, in which packet transmission is delayed because there is no proper scheduling. The presented system uses the Velocity Energy-efficient and Link-aware Cluster-Tree (VELCT) scheme, in which a Data Collection Tree (DCT) improves the lifetime of the network. The VELCT scheme and the construction of the DCT reduce delay and traffic. The network lifetime can be increased by avoiding frequent changes in cluster topology. Secure and Efficient Transmission of Aggregated data (SETA) improves the security of data transmission by using the trust value of the nodes prior to the aggregation of data. Since SETA considers data only from trustworthy nodes for aggregation, it transmits the data more securely, thereby improving the accuracy of the aggregated data.

Keywords: Aggregation, lifetime, network security, wireless sensor network.

Downloads: 1217
7723 Development of Greenhouse Analysis Tools for Home Agriculture Project

Authors: M. Amir Abas, M. Dahlui

Abstract:

This paper presents the development of analysis tools for the Home Agriculture project. The tools are required for monitoring the condition of a greenhouse and involve two components: measurement hardware and a data analysis engine. The measurement hardware measures environmental parameters such as temperature, humidity, air quality, dust, etc., while the analysis tool is used to analyse and interpret the integrated data against weather conditions, quality of health, irradiance, quality of soil, etc. The current development of the tools is complete for the off-line data recording technique. The data are saved on an MMC and transferred via ZigBee to the Environment Data Manager (EDM) for data analysis. The EDM converts the raw data and plots three combination graphs. It has been applied to monitoring three months of measurement data for irradiance, temperature and humidity of the greenhouse.

Keywords: Monitoring, Environment, Greenhouse, Analysis tools

Downloads: 2019
7722 Factors Influencing the Use of Green Building Practices in the South African Residential Apartment Construction

Authors: Mongezi Nene, Emma Ayesu-Koranteng, Christopher Amoah, Ayo Adeniran

Abstract:

Although its use has been criticised over the years as being unencouraging, the green building concept is quickly overtaking other concepts, particularly in the construction of commercial properties. The goal of the study is to identify the variables influencing the use of green building practices when developing residential structures. A qualitative methodology, using interviews with semi-structured open-ended questions put to 35 property practitioners operating residential apartments in Bloemfontein, South Africa, was used to collect primary data, which were analysed using thematic content analysis. The findings show that while respondents have a good understanding of green building principles, these principles are not being used in the construction of residential buildings in South Africa due to issues with green building approval procedures, the potential for tenant rent increases, the cost of materials, technical issues, contractual issues, and a lack of awareness, among others. This paper recommends, among other things, that stakeholders urgently implement measures to enhance the adoption of green building concepts in the construction of residential buildings and incentivise such construction through lowered property rates.

Keywords: Green building, residential apartments, construction, South Africa.

Downloads: 225
7721 Extremal Properties of Generalized Class of Close-to-convex Functions

Authors: Norlyda Mohamed, Daud Mohamad, Shaharuddin Cik Soh

Abstract:

Let $G_{\alpha,\beta}(\gamma,\delta)$ denote the class of functions $f(z)$, normalized by $f(0) = f'(0) - 1 = 0$, that satisfy $\operatorname{Re}\{e^{i\delta}(\alpha f'(z) + \beta z f''(z))\} > \gamma$ in the open unit disk $D = \{z \in \mathbb{C} : |z| < 1\}$ for some $\alpha$ ($\alpha \neq 0$), $\beta$ and $\gamma$ ($0 \leq \gamma < \alpha$), where $|\delta| \leq \pi$ and $\alpha\cos\delta - \gamma > 0$. In this paper, we determine some extremal properties, including a distortion theorem and the argument of $f'(z)$.

Keywords: Argument of $f'(z)$, Carathéodory Function, Close-to-convex Function, Distortion Theorem, Extremal Properties

Downloads: 1355
7720 A Robust Data Hiding Technique based on LSB Matching

Authors: Emad T. Khalaf, Norrozila Sulaiman

Abstract:

Many researchers are working on information hiding techniques, using different ideas and areas to hide their secret data. This paper introduces a robust technique for hiding secret data in an image based on LSB insertion and the RSA encryption technique. The key step of the proposed technique is to encrypt the secret data. The encrypted data are then converted into a bit stream and divided into a number of segments. The cover image is also divided into the same number of segments. Each segment of data is compared with each segment of the image to find the best-matching segment, in order to create a new random sequence of segments that is then inserted into the cover image. Experimental results show that the proposed technique has a high security level and produces better stego-image quality.
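The basic LSB insertion step that this abstract builds on can be sketched as follows. This is only plain least-significant-bit embedding into a flat sequence of pixel bytes. It deliberately omits the RSA encryption and the segment-matching stage described above, and the function names and the toy cover buffer are illustrative assumptions rather than the authors' implementation.

```python
def embed_lsb(cover, payload):
    """Write payload bits (most significant bit first) into the LSBs of cover bytes."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the payload")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the LSB, then set it to the bit
    return bytes(stego)

def extract_lsb(stego, n_bytes):
    """Read n_bytes of payload back from the LSBs of the stego bytes."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (stego[8 * b + i] & 1)
        out.append(value)
    return bytes(out)

# Toy cover "image": 64 grayscale pixel values (assumption).
cover = bytes(range(64))
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
```

Because only the lowest bit of each pixel can change, every pixel value moves by at most 1, which is why LSB methods tend to preserve visual quality.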

Keywords: steganography; LSB Matching; RSA Encryption; data segments

Downloads: 2220
7719 Evaluation of TRIS-DMA-NVP Hydrogels for Making Silicone-Based Contact Lenses

Authors: N. P. D. Tran, H. Q. D. Nguyen, M. C. Yang

Abstract:

In this study, contact lenses were prepared through the polymerization of tris-(trimethyl-silyl-propyl-methacrylate) (TRIS), N,N-dimethylacrylamide (DMA) and N-vinylpyrrolidone (NVP), cross-linked with ethylene glycol dimethacrylate (EGDMA). The equilibrium water content (EWC), oxygen permeability (Dk), light transmittance, and in vitro cytotoxicity of TRIS-DMA-NVP with various ratios were measured. The results showed that the EWC increased while the Dk decreased with increasing NVP content. For the sample with 25 wt% NVP, the EWC reached 53% whereas the Dk decreased to 46 barrers. All these lenses exhibited light transmittance of over 95%. In addition, none of these lenses inhibited the growth of L292 fibroblasts. Thus, this study showed that TRIS-DMA-NVP is suitable for making contact lenses.

Keywords: DMA, TRIS, NVP, silicone hydrogel, contact lens.

Downloads: 1478
7718 Comprehensive Analysis of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi

Abstract:

Due to fast and flawless technological innovation, a tremendous amount of data is being accumulated all over the world in every domain, such as Pattern Recognition, Machine Learning, Spatial Data Mining, Image Analysis, Fraudulent Analysis, the World Wide Web, etc. This makes it all the more essential to develop tools for data mining functionalities. The major aim of this paper is to analyze various tools that are used to build a resourceful analytical or descriptive model for handling large amounts of information more efficiently and in a more user-friendly way. In this survey, the diverse tools are illustrated with their extensive technical paradigms, outstanding graphical interfaces and inbuilt multipath algorithms, which are very useful for handling significant amounts of data.

Keywords: Classification, Clustering, Data Mining, Machine learning, Visualization.

Downloads: 2439
7717 A Prediction of Attractive Evaluation Objects Based On Complex Sequential Data

Authors: Shigeaki Sakurai, Makino Kyoko, Shigeru Matsumoto

Abstract:

This paper proposes a method that predicts attractive evaluation objects. In the learning phase, the method inductively acquires trend rules from complex sequential data. The data is composed of two types of data. One is numerical sequential data. Each evaluation object has respective numerical sequential data. The other is text sequential data. Each evaluation object is described in texts. The trend rules represent changes of numerical values related to evaluation objects. In the prediction phase, the method applies new text sequential data to the trend rules and evaluates which evaluation objects are attractive. This paper verifies the effect of the proposed method by using stock price sequences and news headline sequences. In these sequences, each stock brand corresponds to an evaluation object. This paper discusses validity of predicted attractive evaluation objects, the process time of each phase, and the possibility of application tasks.

Keywords: Trend rule, frequent pattern, numerical sequential data, text sequential data, evaluation object.

Downloads: 1235
7716 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify observations into groups. The result of the analysis is a pattern which can be used to identify a data set without the need to obtain the input data used to create this pattern. Important requirements in this process are careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. In the case of missing pedigree information, other methods can be used to trace an animal's origin. The genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying an individual.

Keywords: Genetic data, Pinzgau cattle, supervised learning.

Downloads 2318
7715 Indigenous Dayak People’s Perceptions of Wildlife Loss and Gain Related to Oil Palm Development

Authors: A. Sunkar, A. Saraswati, Y. Santosa

Abstract:

Controversies surrounding the impacts of oil palm plantations have resulted in some heated debates, especially concerning biodiversity loss and the well-being of indigenous people. The indigenous Dayak people generally used wildlife to fulfill their daily needs and were thus assumed to have experienced negative impacts due to oil palm developments within and surrounding their settlement areas. This study was conducted to identify the characteristics of the Dayak community settled around an oil palm plantation, to determine their perceptions of wildlife loss or gain as a result of the development of oil palm plantations, and to identify the determinant characteristics of those perceptions. The research was conducted in March 2018 in Nanga Tayap and Tajok Kayong Villages, which are located around the oil palm plantation of NTYE of Ketapang, West Kalimantan, Indonesia. Data were collected through in-depth structured interviews, using closed and semi-open questionnaires and three-scale Likert statements. Interviews were conducted with 74 respondents using accidental sampling, and the respondents were categorized into those who were dependent on oil palm for their livelihoods and those who were not. Data were analyzed using quantitative statistical methods: the Likert Scale, Chi-Square Test, Spearman Test, and Mann-Whitney Test. The research found that the indigenous Dayak people were aware of wildlife species loss and gain since the establishment of the plantation. Nevertheless, wildlife loss did not affect their social, economic, and cultural needs since they could find substitutions. It was found that prior to the plantation’s development, the local Dayak communities were already slowly experiencing some livelihood transitions through local village development. The only determinant characteristic of the community that influenced their perceptions of wildlife loss/gain was level of education.

Keywords: Wildlife, oil palm plantations, indigenous Dayak, biodiversity loss and gain.

Downloads 1370
7714 A Comparative Study of Fine Grained Security Techniques Based on Data Accessibility and Inference

Authors: Azhar Rauf, Sareer Badshah, Shah Khusro

Abstract:

This paper analyzes different techniques for the fine grained security of relational databases with respect to two variables: data accessibility and inference. Data accessibility measures the amount of data available to the users after applying a security technique on a table. Inference is the proportion of information leakage after suppressing a cell containing secret data. A row containing a suppressed secret cell can become a security threat if an intruder generates useful information from the related visible information of the same row. This paper measures data accessibility and inference associated with row, cell, and column level security techniques. Cell level security offers the greatest data accessibility as it suppresses secret data only. On the other hand, there is a high probability of inference in cell level security. Row and column level security techniques have the least data accessibility and inference. This paper introduces the cell plus innocent security technique, which utilizes the cell level security method but also suppresses some innocent data to mislead an intruder, since a suppressed cell may then not necessarily contain secret data. Four variations of the technique, namely cell plus innocent 1/4, cell plus innocent 2/4, cell plus innocent 3/4, and cell plus innocent 4/4, have been introduced to suppress innocent data equal to 1/4, 2/4, 3/4, and 4/4 percent of the true secret data inside the database, respectively. Results show that the new technique offers better control over data accessibility and inference as compared to the state-of-the-art security techniques. This paper further discusses how the techniques can be used in combination. The paper shows that the cell plus innocent 1/4, 2/4, and 3/4 techniques can be used as a replacement for cell level security.
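The accessibility comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that data accessibility is the fraction of table cells left visible, that row and column level security hide every row or column containing a secret cell, and that "cell plus innocent k/4" hides a number of randomly chosen innocent cells equal to k/4 of the secret-cell count. The function names, the random selection, and the sample table are all hypothetical, and the inference measure is not modeled.

```python
import random


def accessibility(visible):
    """Fraction of cells that remain visible (True) after suppression."""
    total = sum(len(row) for row in visible)
    shown = sum(cell for row in visible for cell in row)
    return shown / total


def suppress(secret, level):
    """Build a visibility mask for 'cell', 'row', or 'column' level security.

    secret[i][j] is True when cell (i, j) holds secret data.
    """
    n_rows, n_cols = len(secret), len(secret[0])
    secret_rows = {i for i in range(n_rows) if any(secret[i])}
    secret_cols = {j for j in range(n_cols)
                   if any(secret[i][j] for i in range(n_rows))}
    mask = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            if level == "cell":
                hidden = secret[i][j]
            elif level == "row":
                hidden = i in secret_rows
            elif level == "column":
                hidden = j in secret_cols
            else:
                raise ValueError(f"unknown level: {level}")
            row.append(not hidden)
        mask.append(row)
    return mask


def cell_plus_innocent(secret, quarters, seed=0):
    """Cell level suppression plus extra hidden innocent cells.

    Assumption: the number of extra cells is quarters/4 of the secret-cell
    count, and they are picked uniformly at random (hypothetical choice).
    """
    mask = suppress(secret, "cell")
    innocent = [(i, j) for i, row in enumerate(secret)
                for j, is_secret in enumerate(row) if not is_secret]
    n_secret = sum(is_secret for row in secret for is_secret in row)
    extra = min(len(innocent), round(quarters * n_secret / 4))
    rng = random.Random(seed)
    for i, j in rng.sample(innocent, extra):
        mask[i][j] = False
    return mask


# Hypothetical 4x4 table with four secret cells.
SECRET = [
    [False, True,  False, False],
    [False, False, False, False],
    [False, True,  False, True],
    [True,  False, False, False],
]

if __name__ == "__main__":
    for level in ("cell", "row", "column"):
        print(level, accessibility(suppress(SECRET, level)))
    for k in (1, 2, 3, 4):
        print(f"cell plus innocent {k}/4",
              accessibility(cell_plus_innocent(SECRET, k)))
```

On this sample table, cell level suppression keeps the most cells visible, row and column level keep the fewest, and the cell plus innocent variants fall in between, which matches the ordering the abstract reports for accessibility.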

Keywords: Fine Grained Security, Data Accessibility, Inference, Row, Cell, Column Level Security.

Downloads 1471
7713 Factors Affecting Students’ Performance in Chemistry: Case Study in Zanzibar Secondary Schools

Authors: Ahmed A. Hassan, Hassan I. Ali, Abdallah A. Salum, Asia M. Kassim, Yussuf N. Elmoge, Ali A. Amour

Abstract:

The purpose of this study was to investigate performance in chemistry in Zanzibar secondary schools. It was conducted in all regions of Zanzibar, in public and private secondary schools and among Ministry of Education officials. The objectives of the study included finding out the causes of poor performance in chemistry and gathering the views, opinions, and suggestions of teachers and students on how to improve performance in chemistry. A descriptive survey was adopted for the study. 45 teachers and 200 students were randomly sampled from 15 secondary schools in Zanzibar, and ten Ministry of Education officials were purposively sampled for the study. Questionnaires and open-ended interview schedules were the main instruments used in obtaining relevant data from respondents. Data collected from the field were analyzed both qualitatively and quantitatively. Qualitative analysis involved content analysis of the responses obtained through interviews, and quantitative analysis involved generation of tables, frequencies, and percentages. The results revealed that there were shortages of trained teachers, a lack of proficiency in the language of instruction (English), and a lack of major facilities like laboratories and books. These led to poor delivery of subject matter and consequently to poor performance. Based on the findings, this study recommends the provision of trained, competent, and effective teachers as a vital aspect to be considered. The government, through the Ministry of Education, should make an effort to stock libraries and equip laboratories with modern books and instruments. In addition, the ministry should strengthen teachers’ training, encourage the use of instructional media in class, and create a conducive learning environment for both teachers and students.

Keywords: Zanzibar, secondary schools, chemistry, science, performance and factors.

Downloads 7291
7712 An Application of Differential Subordination to Analytic Functions

Authors: Sukhwinder Singh Billing, Sushma Gupta, Sukhjit Singh Dhaliwal

Abstract:

In the present paper, using the technique of differential subordination, we obtain certain results for analytic functions defined by a multiplier transformation in the open unit disc E = {z : |z| < 1}. We claim that our results extend and generalize the existing results in this particular direction.

Keywords: function, Differential subordination, Multiplier transformation.

Downloads 1331
7711 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural network science and neuroscience on modeling complex time series data and on statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated in simulation, and noise is added to the data to produce different disordered distributions in the time series and to evaluate the algorithm's local prediction of nonlinearity. The performance of the algorithm is then simulated, and the sensitivity of the approach to the data distribution, when the data are widely distributed or when the number of data points is small, is explained together with the influence of the important local validity parameter under different data distributions.
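The role of a locality parameter in local linear modeling can be illustrated with a minimal Python sketch. It is a plain one-dimensional locally weighted linear regression with a Gaussian kernel, not the full LWPR algorithm: it has no incremental updates and no projection directions. The `bandwidth` argument is a hypothetical stand-in for the local validity parameter mentioned in the abstract, and the test signal is invented for illustration.

```python
import math


def lwr_predict(xs, ys, x_query, bandwidth):
    """Predict y at x_query with a Gaussian-weighted local linear fit.

    bandwidth controls how quickly the weight of a sample decays with its
    distance from x_query (smaller means a more local model).
    """
    weights = [math.exp(-0.5 * ((x - x_query) / bandwidth) ** 2) for x in xs]
    w_sum = sum(weights)
    if w_sum == 0.0:
        raise ValueError("all weights underflowed; increase bandwidth")
    # Weighted means of inputs and targets.
    x_mean = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(weights, ys)) / w_sum
    # Weighted least-squares slope of the local line.
    s_xx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    s_xy = sum(w * (x - x_mean) * (y - y_mean)
               for w, x, y in zip(weights, xs, ys))
    slope = s_xy / s_xx if s_xx > 0.0 else 0.0
    return y_mean + slope * (x_query - x_mean)


if __name__ == "__main__":
    # Hypothetical nonlinear test signal sampled on a regular grid.
    xs = [i / 20 for i in range(41)]
    ys = [math.sin(3 * x) for x in xs]
    for bandwidth in (0.1, 0.5, 2.0):
        estimate = lwr_predict(xs, ys, 1.0, bandwidth)
        print(bandwidth, estimate, abs(estimate - math.sin(3.0)))
```

A narrow bandwidth tracks the local curvature more closely but relies on fewer effective samples, so it is more sensitive to noise and sparse data. A wide bandwidth averages over more samples and approaches a single global line.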

Keywords: Local nonlinear estimation, LWPR algorithm, Online training method.

Downloads 1601