Search results for: Categorical data sequences
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7552

Search results for: Categorical data sequences

7402 Acceleration-Based Motion Model for Visual SLAM

Authors: Daohong Yang, Xiang Zhang, Wanting Zhou, Lei Li

Abstract:

Visual Simultaneous Localization and Mapping (VSLAM) is a technology that gathers information about the surrounding environment to ascertain its own position and create a map. It is widely used in computer vision, robotics, and various other fields. Many visual SLAM systems, such as OBSLAM3, utilize a constant velocity motion model. The utilization of this model facilitates the determination of the initial pose of the current frame, thereby enhancing the efficiency and precision of feature matching. However, it is often difficult to satisfy the constant velocity motion model in actual situations. This can result in a significant deviation between the obtained initial pose and the true value, leading to errors in nonlinear optimization results. Therefore, this paper proposes a motion model based on acceleration that can be applied to most SLAM systems. To provide a more accurate description of the camera pose acceleration, we separate the pose transformation matrix into its rotation matrix and translation vector components. The rotation matrix is now represented by a rotation vector. We assume that, over a short period, the changes in rotating angular velocity and translation vector remain constant. Based on this assumption, the initial pose of the current frame is estimated. In addition, the error of the constant velocity model is analyzed theoretically. Finally, we apply our proposed approach to the ORBSLAM3 system and evaluate two sets of sequences from the TUM datasets. The results show that our proposed method has a more accurate initial pose estimation, resulting in an improvement of 6.61% and 6.46% in the accuracy of the ORBSLAM3 system on the two test sequences, respectively.

Keywords: Error estimation, constant acceleration motion model, pose estimation, visual SLAM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 137
7401 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4816
7400 Cellular Automata Based Robust Watermarking Architecture towards the VLSI Realization

Authors: V. H. Mankar, T. S. Das, S. K. Sarkar

Abstract:

In this paper, we have proposed a novel blind watermarking architecture towards its hardware implementation in VLSI. In order to facilitate this hardware realization, cellular automata (CA) concept is introduced. The CA has been already accepted as an attractive structure for VLSI implementation because of its modularity, parallelism, high performance and reliability. The hardware realizable multiresolution spread spectrum watermarking techniques are very few in numbers in spite of their best ever resiliency against signal impairments. This is because of the computational cost and complexity associated with their different filter banks and lifting techniques. The concept of cellular automata theory in order to form a new transform domain technique i.e. Cellular Automata Transform (CAT) have been incorporated. Since CA provides spreading sequences having very low cross-correlation properties, the CA based pseudorandom sequence generator is considered in the present work. Considering the watermarking technique as a digital communication process, an error control coding (ECC) must be incorporated in the data hiding schemes. Besides the hardware implementation of entire CA based data hiding technique, the individual blocks of the algorithm using CA provide the best result than that of some other methods irrespective of the hardware and software technique. The Cellular Automata Transform, CA based PN sequence generator, and CA ECC are the requisite blocks that are developed not only to meet the reliable hardware requirements but also for the basic spread spectrum watermarking features. The proposed algorithm shows statistical invisibility and resiliency against various common signal-processing operations. This algorithmic design utilizes the existing allocated bandwidth in the data transmission channel in a more efficient manner.

Keywords: Cellular automata, watermarking, error control coding, PN sequence, VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2025
7399 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2561
7398 Development of Integrated GIS Interface for Characteristics of Regional Daily Flow

Authors: Ju Young Lee, Jung-Seok Yang, Jaeyoung Choi

Abstract:

The purpose of this paper primarily intends to develop GIS interface for estimating sequences of stream-flows at ungauged stations based on known flows at gauged stations. The integrated GIS interface is composed of three major steps. The first, precipitation characteristics using statistical analysis is the procedure for making multiple linear regression equation to get the long term mean daily flow at ungauged stations. The independent variables in regression equation are mean daily flow and drainage area. Traditionally, mean flow data are generated by using Thissen polygon method. However, method for obtaining mean flow data can be selected by user such as Kriging, IDW (Inverse Distance Weighted), Spline methods as well as other traditional methods. At the second, flow duration curve (FDC) is computing at unguaged station by FDCs in gauged stations. Finally, the mean annual daily flow is computed by spatial interpolation algorithm. The third step is to obtain watershed/topographic characteristics. They are the most important factors which govern stream-flows. In summary, the simulated daily flow time series are compared with observed times series. The results using integrated GIS interface are closely similar and are well fitted each other. Also, the relationship between the topographic/watershed characteristics and stream flow time series is highly correlated.

Keywords: Integrated GIS interface, spatial interpolation algorithm, FDC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1465
7397 Tensorial Transformations of Double Gai Sequence Spaces

Authors: N.Subramanian, U.K.Misra

Abstract:

The precise form of tensorial transformations acting on a given collection of infinite matrices into another ; for such classical ideas connected with the summability field of double gai sequence spaces. In this paper the results are impose conditions on the tensor g so that it becomes a tensorial transformations from the metric space χ2 to the metric space C

Keywords: tensorial transformations, double gai sequences , double analytic, dual.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1109
7396 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515
7395 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2419
7394 Optimal Multilayer Perceptron Structure For Classification of HIV Sub-Type Viruses

Authors: Zeyneb Kurt, Oguzhan Yavuz

Abstract:

The feature of HIV genome is in a wide range because of it is highly heterogeneous. Hence, the infection ability of the virus changes related with different chemokine receptors. From this point, R5 and X4 HIV viruses use CCR5 and CXCR5 coreceptors respectively while R5X4 viruses can utilize both coreceptors. Recently, in Bioinformatics, R5X4 viruses have been studied to classify by using the coreceptors of HIV genome. The aim of this study is to develop the optimal Multilayer Perceptron (MLP) for high classification accuracy of HIV sub-type viruses. To accomplish this purpose, the unit number in hidden layer was incremented one by one, from one to a particular number. The statistical data of R5X4, R5 and X4 viruses was preprocessed by the signal processing methods. Accessible residues of these virus sequences were extracted and modeled by Auto-Regressive Model (AR) due to the dimension of residues is large and different from each other. Finally the pre-processed dataset was used to evolve MLP with various number of hidden units to determine R5X4 viruses. Furthermore, ROC analysis was used to figure out the optimal MLP structure.

Keywords: Multilayer Perceptron, Auto-Regressive Model, HIV, ROC Analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1394
7393 Hybrid Coding for Animated Polygonal Meshes

Authors: Jinghua Zhang, Charles B. Owen, Jinsheng Xu

Abstract:

A new hybrid coding method for compressing animated polygonal meshes is presented. This paper assumes the simplistic representation of the geometric data: a temporal sequence of polygonal meshes for each discrete frame of the animated sequence. The method utilizes a delta coding and an octree-based method. In this hybrid method, both the octree approach and the delta coding approach are applied to each single frame in the animation sequence in parallel. The approach that generates the smaller encoded file size is chosen to encode the current frame. Given the same quality requirement, the hybrid coding method can achieve much higher compression ratio than the octree-only method or the delta-only method. The hybrid approach can represent 3D animated sequences with higher compression factors while maintaining reasonable quality. It is easy to implement and have a low cost encoding process and a fast decoding process, which make it a better choice for real time application.

Keywords: animated polygonal meshes, compression, deltacoding, octree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1426
7392 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, there have been a lot of efforts and studies are carried out in growing proficient tools for performing various tasks in big data. Recently big data have gotten a lot of publicity for their good reasons. Due to the large and complex collection of datasets it is difficult to process on traditional data processing applications. This concern turns to be further mandatory for producing various tools in big data. Moreover, the main aim of big data analytics is to utilize the advanced analytic techniques besides very huge, different datasets which contain diverse sizes from terabytes to zettabytes and diverse types such as structured or unstructured and batch or streaming. Big data is useful for data sets where their size or type is away from the capability of traditional relational databases for capturing, managing and processing the data with low-latency. Thus the out coming challenges tend to the occurrence of powerful big data tools. In this survey, a various collection of big data tools are illustrated and also compared with the salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3727
7391 Identification of Conserved Domains and Motifs for GRF Gene Family

Authors: Jafar Ahmadi, Nafiseh Noormohammadi, Sedigheh Fabriki Ourang

Abstract:

GRF, Growth regulating factor, genes encode a novel class of plant-specific transcription factors. The GRF proteins play a role in the regulation of cell numbers in young and growing tissues and may act as transcription activations in growth and development of plants. Identification of GRF genes and their expression are important in plants to performance of the growth and development of various organs. In this study, to better understanding the structural and functional differences of GRFs family, 45 GRF proteins sequences in A. thaliana, Z. mays, O. sativa, B. napus, B. rapa, H. vulgare and S. bicolor, have been collected and analyzed through bioinformatics data mining. As a result, in secondary structure of GRFs, the number of alpha helices was more than beta sheets and in all of them QLQ domains were completely in the biggest alpha helix. In all GRFs, QLQ and WRC domains were completely protected except in AtGRF9. These proteins have no trans-membrane domain and due to have nuclear localization signals act in nuclear and they are component of unstable proteins in the test tube.

Keywords: Domain, Gene Family, GRF, Motif.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2266
7390 Almost Periodic Sequence Solutions of a Discrete Cooperation System with Feedback Controls

Authors: Ziping Li, Yongkun Li

Abstract:

In this paper, we consider the almost periodic solutions of a discrete cooperation system with feedback controls. Assuming that the coefficients in the system are almost periodic sequences, we obtain the existence and uniqueness of the almost periodic solution which is uniformly asymptotically stable.

Keywords: Discrete cooperation model, almost periodic solution, feedback control, Lyapunov function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401
7389 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is efficient approach to organize data. When data is classified to multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data includes every label of the given set of labels within the range of the set. Desirable properties of the orders, data is also expressed by the higher set of labels and different sets of labels express different data, are discussed for the orders.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classificaiton, Orders of Sets of Labels

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256
7388 Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification

Authors: Bharatendra Rai

Abstract:

Sequences of words in text data have long-term dependencies and are known to suffer from vanishing gradient problem when developing deep learning models. Although recurrent networks such as long short-term memory networks help overcome this problem, achieving high text classification performance is a challenging problem. Convolutional recurrent networks that combine advantages of long short-term memory networks and convolutional neural networks, can be useful for text classification performance improvements. However, arriving at suitable hyperparameter values for convolutional recurrent networks is still a challenging task where fitting of a model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning. 

Keywords: Convolutional recurrent networks, hyperparameter tuning, long short-term memory networks, Tukey honest significant differences

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25
7387 New Class of Chaotic Mappings in Symbol Space

Authors: Inese Bula

Abstract:

Symbolic dynamics studies dynamical systems on the basis of the symbol sequences obtained for a suitable partition of the state space. This approach exploits the property that system dynamics reduce to a shift operation in symbol space. This shift operator is a chaotic mapping. In this article we show that in the symbol space exist other chaotic mappings.

Keywords: Infinite symbol space, prefix metric, chaotic mapping, generator function, jump mapping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1472
7386 Development of State Model Theory for External Exclusive NOR Type LFSR Structures

Authors: Afaq Ahmad

Abstract:

Using state space technique and GF(2) theory, a simulation model for external exclusive NOR type LFSR structures is developed. Through this tool a systematic procedure is devised for computing pseudo-random binary sequences from such structures.

Keywords: LFSR, external exclusive NOR type, recursivebinary sequence, initial state - next state, state transition matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550
7385 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of ever-increasing use of distributed data in computer networks is obvious for all. One technique that is performed on the distributed data for increasing of efficiency and reliablity is data rplication. In this paper, after introducing this technique and its advantages, we will examine some dynamic data replication. We will examine their characteristies for some overus scenario and the we will propose some suggestion for their improvement.

Keywords: data replication, data hiding, consistency, dynamicdata replication strategy

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1594
7384 DNA Polymorphism Studies of β-Lactoglobulin Gene in Saudi Goats

Authors: Amr A. El Hanafy, Muhammad Qureshi, Jamal Sabir, Mohamed Mutawakil, Mohamed M. Ahmed, Hassan El Ashmaoui, Hassan Ramadan, Mohamed Abou-Alsoud, Mahmoud Abdel Sadek

Abstract:

Domestic goats (Capra hircus) are extremely diverse species and principal animal genetic resource of the developing world. These facilitate a persistent supply of meat, milk, fibre, and skin and are considered as important revenue generators in small pastoral environments. This study aimed to fingerprint β-LG gene at PCR-RFLP level in native Saudi goat breeds (Ardi, Habsi and Harri) in an attempt to have a preliminary image of β-LG genotypic patterns in Saudi breeds as compared to other foreign breeds such as Indian and Egyptian. Also, the Phylogenetic analysis was done to investigate evolutionary trends and similarities among the caprine β-LG gene with that of the other domestic specie, viz. cow, buffalo and sheep. Blood samples were collected from 300 animals (100 for each breed) and genomic DNA was extracted. A fragment of the β-LG gene (427bp) was amplified using specific primers. Subsequent digestion with Sac II restriction endonuclease revealed two alleles (A and B) and three different banding patterns or genotypes i.e. AA, AB and BB. The statistical analysis showed a general trend that β-LG AA genotype had higher milk yield than β-LG AB and β-LG BB genotypes. Nucleotide sequencing of the selected β-LG fragments was done and submitted to GenBank NCBI (Accession No. KJ544248, KJ588275, KJ588276, KJ783455, KJ783456 and KJ874959). Phylogenetic analysis on the basis of nucleotide sequences of native Saudi goats indicated evolutional similarity with the GenBank reference sequences of goat, Bubalus bubalis and Bos taurus. However, the origin of sheep which is the most closely related from the evolutionary point of view, was located some distance away.

Keywords: β-Lactoglobulin, Saudi goats, PCR-RFLP, Phylogenetic analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6095
7383 Quadratic Pulse Inversion Ultrasonic Imaging(QPI): A Two-Step Procedure for Optimization of Contrast Sensitivity and Specificity

Authors: Mamoun F. Al-Mistarihi

Abstract:

We have previously introduced an ultrasonic imaging approach that combines harmonic-sensitive pulse sequences with a post-beamforming quadratic kernel derived from a second-order Volterra filter (SOVF). This approach is designed to produce images with high sensitivity to nonlinear oscillations from microbubble ultrasound contrast agents (UCA) while maintaining high levels of noise rejection. In this paper, a two-step algorithm for computing the coefficients of the quadratic kernel leading to reduction of tissue component introduced by motion, maximizing the noise rejection and increases the specificity while optimizing the sensitivity to the UCA is presented. In the first step, quadratic kernels from individual singular modes of the PI data matrix are compared in terms of their ability of maximize the contrast to tissue ratio (CTR). In the second step, quadratic kernels resulting in the highest CTR values are convolved. The imaging results indicate that a signal processing approach to this clinical challenge is feasible.

Keywords: Volterra Filter, Pulse Inversion, Ultrasonic Imaging, Contrast Agent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1543
7382 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1958
7381 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1988
7380 Emotional Analysis for Text Search Queries on Internet

Authors: Gemma García López

Abstract:

The goal of this study is to analyze if search queries carried out in search engines such as Google, can offer emotional information about the user that performs them. Knowing the emotional state in which the Internet user is located can be a key to achieve the maximum personalization of content and the detection of worrying behaviors. For this, two studies were carried out using tools with advanced natural language processing techniques. The first study determines if a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.

Keywords: Emotion classification, text search queries, emotional analysis, sentiment analysis in text, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 662
7379 Systematic Identification and Quantification of Substrate Specificity Determinants in Human Protein Kinases

Authors: Manuel A. Alonso-Tarajano, Roberto Mosca, Patrick Aloy

Abstract:

Protein kinases participate in a myriad of cellular processes of major biomedical interest. The in vivo substrate specificity of these enzymes is a process determined by several factors, and despite several years of research on the topic, is still far from being totally understood. In the present work, we have quantified the contributions to the kinase substrate specificity of i) the phosphorylation sites and their surrounding residues in the sequence and of ii) the association of kinases to adaptor or scaffold proteins. We have used position-specific scoring matrices (PSSMs), to represent the stretches of sequences phosphorylated by 93 families of kinases. We have found negative correlations between the number of sequences from which a PSSM is generated and the statistical significance and the performance of that PSSM. Using a subset of 22 statistically significant PSSMs, we have identified specificity determinant residues (SDRs) for 86% of the corresponding kinase families. Our results suggest that different SDRs can function as positive or negative elements of substrate recognition by the different families of kinases. Additionally, we have found that human proteins with known function as adaptors or scaffolds (kAS) tend to interact with a significantly large fraction of the substrates of the kinases to which they associate. Based on this characteristic we have identified a set of 279 potential adaptors/scaffolds (pAS) for human kinases, which is enriched in Pfam domains and functional terms tightly related to the proposed function. Moreover, our results show that for 74.6% of the kinase–pAS association found, the pAS colocalize with the substrates of the kinases they are associated to. Finally, we have found evidence suggesting that the association of kinases to adaptors and scaffolds, may contribute significantly to diminish the in vivo substrate crossed-specificity of protein kinases. In general, our results indicate the relevance of several SDRs for both the positive and negative selection of phosphorylation sites by kinase families and also suggest that the association of kinases to pAS proteins may be an important factor for the localization of the enzymes with their set of substrates.

Keywords: Kinase, phosphorylation, substrate specificity, adaptors, scaffolds, cellular colocalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1490
7378 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

Authors: First G.M. Karthik, Second Ramachandra.V.Pujeri, Dr.

Abstract:

Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.

Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642
7377 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2725
7376 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our Medicine-oriented research is based on a medical data set of real patients. It is a security problem to share patient private data with peoples other than clinician or hospital staff. We have to remove person identification information from medical data. The medical data without private data are available after a de-identification process for any research purposes. In this paper, we introduce an universal automatic rule-based de-identification application to do all this stuff on an heterogeneous medical data. A patient private identification is replaced by an unique identification number, even in burnedin annotation in pixel data. The identical identification is used for all patient medical data, so it keeps relationships in a data. Hospital can take an advantage of a research feedback based on results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589
7375 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyze data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. There are some certain purposes to analyze data. This paper shows which multi-labeled data should be the target to be analyzed for those purposes, and discusses the role of a label against a set of labels by investigating the change when a label is added to the set of labels. These discussions give the methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1166
7374 Scenario Recognition in Modern Building Automation

Authors: Roland Lang, Dietmar Bruckner, Rosemarie Velik, Tobias Deutsch

Abstract:

Modern building automation needs to deal with very different types of demands, depending on the use of a building and the persons acting in it. To meet the requirements of situation awareness in modern building automation, scenario recognition becomes more and more important in order to detect sequences of events and to react to them properly. We present two concepts of scenario recognition and their implementation, one based on predefined templates and the other applying an unsupervised learning algorithm using statistical methods. Implemented applications will be described and their advantages and disadvantages will be outlined.

Keywords: Building automation, ubiquitous computing, scenariorecognition, surveillance system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1596
7373 Robot Cell Planning

Authors: Allan Tubaileh, Ibrahim Hammad, Loay Al Kafafi

Abstract:

A new approach to determine the machine layout in flexible manufacturing cell, and to find the feasible robot configuration of the robot to achieve minimum cycle time is presented in this paper. The location of the input/output location and the optimal robot configuration is obtained for all sequences of work tasks of the robot within a specified period of time. A more realistic approach has been presented to model the problem using the robot joint space. The problem is formulated as a nonlinear optimization problem and solved using Sequential Quadratic Programming algorithm.

Keywords: Robotics, Layout.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986