Search results for: Biological data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7771

Search results for: Biological data

7141 Improved Predictive Models for the IRMA Network Using Nonlinear Optimisation

Authors: Vishwesh Kulkarni, Nikhil Bellarykar

Abstract:

Cellular complexity stems from the interactions among thousands of different molecular species. Thanks to the emerging fields of systems and synthetic biology, scientists are beginning to unravel these regulatory, signaling, and metabolic interactions and to understand their coordinated action. Reverse engineering of biological networks has has several benefits but a poor quality of data combined with the difficulty in reproducing it limits the applicability of these methods. A few years back, many of the commonly used predictive algorithms were tested on a network constructed in the yeast Saccharomyces cerevisiae (S. cerevisiae) to resolve this issue. The network was a synthetic network of five genes regulating each other for the so-called in vivo reverse-engineering and modeling assessment (IRMA). The network was constructed in S. cereviase since it is a simple and well characterized organism. The synthetic network included a variety of regulatory interactions, thus capturing the behaviour of larger eukaryotic gene networks on a smaller scale. We derive a new set of algorithms by solving a nonlinear optimization problem and show how these algorithms outperform other algorithms on these datasets.

Keywords: Synthetic gene network, network identification, nonlinear modeling, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 779
7140 Zero Inflated Models for Overdispersed Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

The zero inflated models are usually used in modeling count data with excess zeros where the existence of the excess zeros could be structural zeros or zeros which occur by chance. These type of data are commonly found in various disciplines such as finance, insurance, biomedical, econometrical, ecology, and health sciences which involve sex and health dental epidemiology. The most popular zero inflated models used by many researchers are zero inflated Poisson and zero inflated negative binomial models. In addition, zero inflated generalized Poisson and zero inflated double Poisson models are also discussed and found in some literature. Recently zero inflated inverse trinomial model and zero inflated strict arcsine models are advocated and proven to serve as alternative models in modeling overdispersed count data caused by excessive zeros and unobserved heterogeneity. The purpose of this paper is to review some related literature and provide a variety of examples from different disciplines in the application of zero inflated models. Different model selection methods used in model comparison are discussed.

Keywords: Overdispersed count data, model selection methods, likelihood ratio, AIC, BIC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4508
7139 Formalizing a Procedure for Generating Uncertain Resource Availability Assumptions Based On Real Time Logistic Data Capturing with Auto-ID Systems for Reactive Scheduling

Authors: Lars Laußat, Manfred Helmus, Kamil Szczesny, Markus König

Abstract:

As one result of the project “Reactive Construction Project Scheduling using Real Time Construction Logistic Data and Simulation”, a procedure for using data about uncertain resource availability assumptions in reactive scheduling processes has been developed. Prediction data about resource availability is generated in a formalized way using real-time monitoring data e.g. from auto-ID systems on the construction site and in the supply chains. The paper focusses on the formalization of the procedure for monitoring construction logistic processes, for the detection of disturbance and for generating of new and uncertain scheduling assumptions for the reactive resource constrained simulation procedure that is and will be further described in other papers.

Keywords: Auto-ID, Construction Logistic, Fuzzy, Monitoring, RFID, Scheduling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1766
7138 Nuclear Data Evaluation for 217Po

Authors: Sherif S. Nafee, Amir K. Al-Ramady, Salem S. Shaheen

Abstract:

Evaluated nuclear decay data for the 217Po nuclide is presented in the present work. These data include recommended values for the half-life T1/2, α-, β-- and γ-ray emission energies and probabilities. Decay data from 221Rn α and 217Bi β—decays are presented. Q(α) has been updated based on the recent published work of the Atomic Mass Evaluation AME2012. In addition, the logft values were calculated using the Logft program from the ENSDF evaluation package. Moreover, the total internal conversion electrons and the K-shell to L-shell and L-shell to M-shell and to N-shell conversion electrons ratios K/L, L/M and L/N have been calculated using Bricc program. Meanwhile, recommendation values or the multi-polarities have been assigned based on recently measurement yield a better intensity balance at the 254 keV and 264 keV gamma transitions.

Keywords: Atomic Mass Evaluation, Nuclear Data Evaluation, Total Electron Conversion Electrons.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2248
7137 Detection of Keypoint in Press-Fit Curve Based on Convolutional Neural Network

Authors: Shoujia Fang, Guoqing Ding, Xin Chen

Abstract:

The quality of press-fit assembly is closely related to reliability and safety of product. The paper proposed a keypoint detection method based on convolutional neural network to improve the accuracy of keypoint detection in press-fit curve. It would provide an auxiliary basis for judging quality of press-fit assembly. The press-fit curve is a curve of press-fit force and displacement. Both force data and distance data are time-series data. Therefore, one-dimensional convolutional neural network is used to process the press-fit curve. After the obtained press-fit data is filtered, the multi-layer one-dimensional convolutional neural network is used to perform the automatic learning of press-fit curve features, and then sent to the multi-layer perceptron to finally output keypoint of the curve. We used the data of press-fit assembly equipment in the actual production process to train CNN model, and we used different data from the same equipment to evaluate the performance of detection. Compared with the existing research result, the performance of detection was significantly improved. This method can provide a reliable basis for the judgment of press-fit quality.

Keywords: Keypoint detection, curve feature, convolutional neural network, press-fit assembly.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 915
7136 Proposal to Increase the Efficiency, Reliability and Safety of the Centre of Data Collection Management and Their Evaluation Using Cluster Solutions

Authors: Martin Juhas, Bohuslava Juhasova, Igor Halenar, Andrej Elias

Abstract:

This article deals with the possibility of increasing efficiency, reliability and safety of the system for teledosimetric data collection management and their evaluation as a part of complex study for activity “Research of data collection, their measurement and evaluation with mobile and autonomous units” within project “Research of monitoring and evaluation of non-standard conditions in the area of nuclear power plants”. Possible weaknesses in existing system are identified. A study of available cluster solutions with possibility of their deploying to analysed system is presented

Keywords: Teledosimetric data, efficiency, reliability, safety, cluster solution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1546
7135 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: Classification algorithms; data mining; tourism; knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2530
7134 Clustering Categorical Data Using Hierarchies (CLUCDUH)

Authors: Gökhan Silahtaroğlu

Abstract:

Clustering large populations is an important problem when the data contain noise and different shapes. A good clustering algorithm or approach should be efficient enough to detect clusters sensitively. Besides space complexity, time complexity also gains importance as the size grows. Using hierarchies we developed a new algorithm to split attributes according to the values they have and choosing the dimension for splitting so as to divide the database roughly into equal parts as much as possible. At each node we calculate some certain descriptive statistical features of the data which reside and by pruning we generate the natural clusters with a complexity of O(n).

Keywords: Clustering, tree, split, pruning, entropy, gini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538
7133 Analysis of Users’ Behavior on Book Loan Log Based On Association Rule Mining

Authors: Kanyarat Bussaban, Kunyanuth Kularbphettong

Abstract:

This research aims to create a model for analysis of student behavior using Library resources based on data mining technique in case of Suan Sunandha Rajabhat University. The model was created under association rules, Apriori algorithm. The results were found 14 rules and the rules were tested with testing data set and it showed that the ability of classify data was 79.24percent and the MSE was 22.91. The results showed that the user’s behavior model by using association rule technique can use to manage the library resources.

Keywords: Behavior, data mining technique, Apriori algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2289
7132 Data Integrity: Challenges in Health Information Systems in South Africa

Authors: T. Thulare, M. Herselman, A. Botha

Abstract:

Poor system use, including inappropriate design of health information systems, causes difficulties in communication with patients and increased time spent by healthcare professionals in recording the necessary health information for medical records. System features like pop-up reminders, complex menus, and poor user interfaces can make medical records far more time consuming than paper cards as well as affect decision-making processes. Although errors associated with health information and their real and likely effect on the quality of care and patient safety have been documented for many years, more research is needed to measure the occurrence of these errors and determine the causes to implement solutions. Therefore, the purpose of this paper is to identify data integrity challenges in hospital information systems through a scoping review and based on the results provide recommendations on how to manage these. Only 34 papers were found to be most suitable out of 297 publications initially identified in the field. The results indicated that human and computerized systems are the most common challenges associated with data integrity and factors such as policy, environment, health workforce, and lack of awareness attribute to these challenges but if measures are taken the data integrity challenges can be managed.

Keywords: Data integrity, data integrity challenges, hospital information systems, South Africa.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1326
7131 Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work, we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: Transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 114
7130 On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Authors: Xiaohua Liu, Juan F. Beltran, Nishant Mohanchandra, Godfried T. Toussaint

Abstract:

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.

Keywords: Machine learning, data mining, support vector machines, proximity graphs, relative-neighborhood graphs, k-nearestneighbor graphs, random sampling, training data condensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1905
7129 Wavelet and K-L Seperability Based Feature Extraction Method for Functional Data Classification

Authors: Jun Wan, Zehua Chen, Yingwu Chen, Zhidong Bai

Abstract:

This paper proposes a novel feature extraction method, based on Discrete Wavelet Transform (DWT) and K-L Seperability (KLS), for the classification of Functional Data (FD). This method combines the decorrelation and reduction property of DWT and the additive independence property of KLS, which is helpful to extraction classification features of FD. It is an advanced approach of the popular wavelet based shrinkage method for functional data reduction and classification. A theory analysis is given in the paper to prove the consistent convergence property, and a simulation study is also done to compare the proposed method with the former shrinkage ones. The experiment results show that this method has advantages in improving classification efficiency, precision and robustness.

Keywords: classification, functional data, feature extraction, K-Lseperability, wavelet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1448
7128 Dynamical Analysis of Circadian Gene Expression

Authors: Carla Layana Luis Diambra

Abstract:

Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.

Keywords: circadian rhythms, clustering, gene expression, PCA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1579
7127 Modified Data Mining Approach for Defective Diagnosis in Hard Disk Drive Industry

Authors: S. Soommat, S. Patamatamkul, T. Prempridi, M. Sritulyachot, P. Ineure, S. Yimman

Abstract:

Currently, slider process of Hard Disk Drive Industry become more complex, defective diagnosis for yield improvement becomes more complicated and time-consumed. Manufacturing data analysis with data mining approach is widely used for solving that problem. The existing mining approach from combining of the KMean clustering, the machine oriented Kruskal-Wallis test and the multivariate chart were applied for defective diagnosis but it is still be a semiautomatic diagnosis system. This article aims to modify an algorithm to support an automatic decision for the existing approach. Based on the research framework, the new approach can do an automatic diagnosis and help engineer to find out the defective factors faster than the existing approach about 50%.

Keywords: Slider process, Defective diagnosis and Data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1188
7126 Authentication and Data Hiding Using a Reversible ROI-based Watermarking Scheme for DICOM Images

Authors: Osamah M. Al-Qershi, Khoo Bee Ee

Abstract:

In recent years image watermarking has become an important research area in data security, confidentiality and image integrity. Many watermarking techniques were proposed for medical images. However, medical images, unlike most of images, require extreme care when embedding additional data within them because the additional information must not affect the image quality and readability. Also the medical records, electronic or not, are linked to the medical secrecy, for that reason, the records must be confidential. To fulfill those requirements, this paper presents a lossless watermarking scheme for DICOM images. The proposed a fragile scheme combines two reversible techniques based on difference expansion for patient's data hiding and protecting the region of interest (ROI) with tamper detection and recovery capability. Patient's data are embedded into ROI, while recovery data are embedded into region of non-interest (RONI). The experimental results show that the original image can be exactly extracted from the watermarked one in case of no tampering. In case of tampered ROI, tampered area can be localized and recovered with a high quality version of the original area.

Keywords: DICOM, reversible, ROI-based, watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1705
7125 Continual Learning Using Data Generation for Hyperspectral Remote Sensing Scene Classification

Authors: Samiah Alammari, Nassim Ammour

Abstract:

When providing a massive number of tasks successively to a deep learning process, a good performance of the model requires preserving the previous tasks data to retrain the model for each upcoming classification. Otherwise, the model performs poorly due to the catastrophic forgetting phenomenon. To overcome this shortcoming, we developed a successful continual learning deep model for remote sensing hyperspectral image regions classification. The proposed neural network architecture encapsulates two trainable subnetworks. The first module adapts its weights by minimizing the discrimination error between the land-cover classes during the new task learning, and the second module tries to learn how to replicate the data of the previous tasks by discovering the latent data structure of the new task dataset. We conduct experiments on hyperspectral image (HSI) dataset on Indian Pines. The results confirm the capability of the proposed method.

Keywords: Continual learning, data reconstruction, remote sensing, hyperspectral image segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 200
7124 Stego Machine – Video Steganography using Modified LSB Algorithm

Authors: Mritha Ramalingam

Abstract:

Computer technology and the Internet have made a breakthrough in the existence of data communication. This has opened a whole new way of implementing steganography to ensure secure data transfer. Steganography is the fine art of hiding the information. Hiding the message in the carrier file enables the deniability of the existence of any message at all. This paper designs a stego machine to develop a steganographic application to hide data containing text in a computer video file and to retrieve the hidden information. This can be designed by embedding text file in a video file in such away that the video does not loose its functionality using Least Significant Bit (LSB) modification method. This method applies imperceptible modifications. This proposed method strives for high security to an eavesdropper-s inability to detect hidden information.

Keywords: Data hiding, LSB, Stego machine, VideoSteganography

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4248
7123 Data Projects for “Social Good”: Challenges and Opportunities

Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood

Abstract:

One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.

Keywords: Data-Driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1756
7122 Effect of Different Methods to Control the Parasitic Weed Phelipanche ramosa (L.- Pomel) in Tomato Crop

Authors: G. Disciglio, F. Lops, A. Carlucci, G. Gatta, A. Tarantino, E. Tarantino

Abstract:

Phelipanche ramosa is the most damaging obligate flowering parasitic weed on wide species of cultivated plants. The semi-arid regions of the world are considered the main centers of this parasitic plant that causes heavy infestation. This is due to its production of high numbers of seeds (up to 200,000) that remain viable for extended periods (up to 20 years). In this study, 13 treatments for the control of Phelipanche were carried out, which included agronomic, chemical, and biological treatments and the use of resistant plant methods. In 2014, a trial was performed at the Department of Agriculture, Food and Environment, University of Foggia (southern Italy), on processing tomato (cv ‘Docet’) grown in pots filled with soil taken from a field that was heavily infested by P. ramosa). The tomato seedlings were transplanted on May 8, 2014, into a sandy-clay soil (USDA). A randomized block design with 3 replicates (pots) was adopted. During the growing cycle of the tomato, at 70, 75, 81 and 88 days after transplantation, the number of P. ramosa shoots emerged in each pot was determined. The tomato fruit were harvested on August 8, 2014, and the quantitative and qualitative parameters were determined. All of the data were subjected to analysis of variance (ANOVA) using the JMP software (SAS Institute Inc. Cary, NC, USA), and for comparisons of means (Tukey's tests). The data show that each treatment studied did not provide complete control against P. ramosa. However, the virulence of the attacks was mitigated by some of the treatments tried: radicon biostimulant, compost activated with Fusarium, mineral fertilizer nitrogen, sulfur, enzone, and the resistant tomato genotype. It is assumed that these effects can be improved by combining some of these treatments with each other, especially for a gradual and continuing reduction of the “seed bank” of the parasite in the soil.

Keywords: Control methods, Phelipanche ramosa, tomato crop.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3038
7121 A Multi-Feature Deep Learning Algorithm for Urban Traffic Classification with Limited Labeled Data

Authors: Rohan Putatunda, Aryya Gangopadhyay

Abstract:

Acoustic sensors, if embedded in smart street lights, can help in capturing the activities (car honking, sirens, events, traffic, etc.) in cities. Needless to say, the acoustic data from such scenarios are complex due to multiple audio streams originating from different events, and when decomposed to independent signals, the amount of retrieved data volume is small in quantity which is inadequate to train deep neural networks. So, in this paper, we address the two challenges: a) separating the mixed signals, and b) developing an efficient acoustic classifier under data paucity. So, to address these challenges, we propose an architecture with supervised deep learning, where the initial captured mixed acoustics data are analyzed with Fast Fourier Transformation (FFT), followed by filtering the noise from the signal, and then decomposed to independent signals by fast independent component analysis (Fast ICA). To address the challenge of data paucity, we propose a multi feature-based deep neural network with high performance that is reflected in our experiments when compared to the conventional convolutional neural network (CNN) and multi-layer perceptron (MLP).

Keywords: FFT, ICA, vehicle classification, multi-feature DNN, CNN, MLP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 402
7120 An Educational Data Mining System for Advising Higher Education Students

Authors: Heba Mohammed Nagy, Walid Mohamed Aly, Osama Fathy Hegazy

Abstract:

Educational  data mining  is  a  specific  data   mining field applied to data originating from educational environments, it relies on different  approaches to discover hidden knowledge  from  the  available   data. Among these approaches are   machine   learning techniques which are used to build a system that acquires learning from previous data. Machine learning can be applied to solve different regression, classification, clustering and optimization problems.

In  our  research, we propose  a “Student  Advisory  Framework” that  utilizes  classification  and  clustering  to  build  an  intelligent system. This system can be used to provide pieces of consultations to a first year  university  student to  pursue a  certain   education   track   where  he/she  will  likely  succeed  in, aiming  to  decrease   the  high  rate   of  academic  failure   among these  students.  A real case study  in Cairo  Higher  Institute  for Engineering, Computer  Science  and  Management  is  presented using  real  dataset   collected  from  2000−2012.The dataset has two main components: pre-higher education dataset and first year courses results dataset. Results have proved the efficiency of the suggested framework.

Keywords: Classification, Clustering, Educational Data Mining (EDM), Machine Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5198
7119 Auto Classification for Search Intelligence

Authors: Lilac A. E. Al-Safadi

Abstract:

This paper proposes an auto-classification algorithm of Web pages using Data mining techniques. We consider the problem of discovering association rules between terms in a set of Web pages belonging to a category in a search engine database, and present an auto-classification algorithm for solving this problem that are fundamentally based on Apriori algorithm. The proposed technique has two phases. The first phase is a training phase where human experts determines the categories of different Web pages, and the supervised Data mining algorithm will combine these categories with appropriate weighted index terms according to the highest supported rules among the most frequent words. The second phase is the categorization phase where a web crawler will crawl through the World Wide Web to build a database categorized according to the result of the data mining approach. This database contains URLs and their categories.

Keywords: Information Processing on the Web, Data Mining, Document Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
7118 Retrieval of Relevant Visual Data in Selected Machine Vision Tasks: Examples of Hardware-based and Software-based Solutions

Authors: Andrzej Śluzek

Abstract:

To illustrate diversity of methods used to extract relevant (where the concept of relevance can be differently defined for different applications) visual data, the paper discusses three groups of such methods. They have been selected from a range of alternatives to highlight how hardware and software tools can be complementarily used in order to achieve various functionalities in case of different specifications of “relevant data". First, principles of gated imaging are presented (where relevance is determined by the range). The second methodology is intended for intelligent intrusion detection, while the last one is used for content-based image matching and retrieval. All methods have been developed within projects supervised by the author.

Keywords: Relevant visual data, gated imaging, intrusion detection, image matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1379
7117 SOA-Based Mobile Application for Crime Control in Thailand

Authors: Jintana Khemprasit, Vatcharaporn Esichaikul

Abstract:

Crime is a major societal problem for most of the world's nations. Consequently, the police need to develop new methods to improve their efficiency in dealing with these ever increasing crime rates. Two of the common difficulties that the police face in crime control are crime investigation and the provision of crime information to the general public to help them protect themselves. Crime control in police operations involves the use of spatial data, crime data and the related crime data from different organizations (depending on the nature of the analysis to be made). These types of data are collected from several heterogeneous sources in different formats and from different platforms, resulting in a lack of standardization. Moreover, there is no standard framework for crime data collection, integration and dissemination through mobile devices. An investigation into the current situation in crime control was carried out to identify the needs to resolve these issues. This paper proposes and investigates the use of service oriented architecture (SOA) and the mobile spatial information service in crime control. SOA plays an important role in crime control as an appropriate way to support data exchange and model sharing from heterogeneous sources. Crime control also needs to facilitate mobile spatial information services in order to exchange, receive, share and release information based on location to mobile users anytime and anywhere.

Keywords: Crime Control, Geographic Information System (GIS), Mobile GIS, Service Oriented Architecture (SOA).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2514
7116 Multidimensional and Data Mining Analysis for Property Investment Risk Analysis

Authors: Nur Atiqah Rochin Demong, Jie Lu, Farookh Khadeer Hussain

Abstract:

Property investment in the real estate industry has a high risk due to the uncertainty factors that will affect the decisions made and high cost. Analytic hierarchy process has existed for some time in which referred to an expert-s opinion to measure the uncertainty of the risk factors for the risk analysis. Therefore, different level of experts- experiences will create different opinion and lead to the conflict among the experts in the field. The objective of this paper is to propose a new technique to measure the uncertainty of the risk factors based on multidimensional data model and data mining techniques as deterministic approach. The propose technique consist of a basic framework which includes four modules: user, technology, end-user access tools and applications. The property investment risk analysis defines as a micro level analysis as the features of the property will be considered in the analysis in this paper.

Keywords: Uncertainty factors, data mining, multidimensional data model, risk analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2903
7115 Computational Aspects of Regression Analysis of Interval Data

Authors: Michal Cerny

Abstract:

We consider linear regression models where both input data (the values of independent variables) and output data (the observations of the dependent variable) are interval-censored. We introduce a possibilistic generalization of the least squares estimator, so called OLS-set for the interval model. This set captures the impact of the loss of information on the OLS estimator caused by interval censoring and provides a tool for quantification of this effect. We study complexity-theoretic properties of the OLS-set. We also deal with restricted versions of the general interval linear regression model, in particular the crisp input – interval output model. We give an argument that natural descriptions of the OLS-set in the crisp input – interval output cannot be computed in polynomial time. Then we derive easily computable approximations for the OLS-set which can be used instead of the exact description. We illustrate the approach by an example.

Keywords: Linear regression, interval-censored data, computational complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1457
7114 Time-Derivative Estimation of Noisy Movie Data using Adaptive Control Theory

Authors: Soon-Hyun Park, Takami Matsuo

Abstract:

This paper presents an adaptive differentiator of sequential data based on the adaptive control theory. The algorithm is applied to detect moving objects by estimating a temporal gradient of sequential data at a specified pixel. We adopt two nonlinear intensity functions to reduce the influence of noises. The derivatives of the nonlinear intensity functions are estimated by an adaptive observer with σ-modification update law.

Keywords: Adaptive estimation, parameter adjustmentlaw, motion detection, temporal gradient, differential filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1862
7113 Transformation of the Business Model in an Occupational Health Care Company Embedded in an Emerging Personal Data Ecosystem: A Case Study in Finland

Authors: Tero Huhtala, Minna Pikkarainen, Saila Saraniemi

Abstract:

Information technology has long been used as an enabler of exchange for goods and services. Services are evolving from generic to personalized, and the reverse use of customer data has been discussed in both academia and industry for the past few years. This article presents the results of an empirical case study in the area of preventive health care services. The primary data were gathered in workshops, in which future personal data-based services were conceptualized by analyzing future scenarios from a business perspective. The aim of this study is to understand business model transformation in emerging personal data ecosystems. The work was done as a case study in the context of occupational healthcare. The results have implications to theory and practice, indicating that adopting personal data management principles requires transformation of the business model, which, if successfully managed, may provide access to more resources, potential to offer better value, and additional customer channels. These advantages correlate with the broadening of the business ecosystem. Expanding the scope of this study to include more actors would improve the validity of the research. The results draw from existing literature and are based on findings from a case study and the economic properties of the healthcare industry in Finland.

Keywords: Ecosystem, business model, personal data, preventive healthcare.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1126
7112 Regulation of Transfer of 137cs by Polymeric Sorbents for Grow Ecologically Sound Biomass

Authors: A. H. Tadevosyan, S. K. Mayrapetyan, N. B. Tavakalyan, K. I. Pyuskyulyan, A. H. Hovsepyan, S. N. Sergeeva

Abstract:

Soil contamination with radiocesium has a long-term radiological impact due to its long physical half-life (30.1 years for 137Cs and 2 years for 134Cs) and its high biological availability. 137Cs causes the largest concerns because of its deleterious effect on agriculture and stock farming, and, thus, human life for decades. One of the important aspects of the problem of contaminated soils remediation is understand of protective actions aimed at the reduction of biological migration of radionuclides in soil-plant system. The most effective way to bind radionuclides is the use of selective sorbents. The proposed research mainly aims to achieve control on transfer of 137Cs in a system growing media – plant due to counter ions variation in the polymeric sorbents. As research object Japanese basil - Perilla frutescens was chosen. Productivity of plants depending on the presence (control-without presence of polymer) and type of polymer material, as well as content of 137Cs in plant material has been determined. The character of different polymers influences on the 137Cs migration in growing media – plant system as well as accumulation in the plants has been cleared up.

Keywords: Radioceaseum, Japanese basil, polymer, soil-plant system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1833